Distributed Hash Table File System for Linux

Documentation effort magma 0.0.20070911 released magma 0.0.20070829 released magma 0.0.20070806 released on SVN All the news

Valid HTML 4.01 Transitional
Valid CSS!
hosted by gna!

DOCUMENTATION :: How to solve hash key collisions

Magma storage backend, the flare system, is extensively based on SHA1 hashes. Each flare (see names to know more about flares) is actually saved in a 3 element format on disk.

  1. a directory named as the SHA1 of the full flare path.
  2. a contents file, inside previous directory, which stores flare contents where applicable
  3. a metadata companion file, inside same directory, which stores flare metadata like full path, struct stat and hash rappresentation

If two flare paths lead to same hash, both will operate on same file contents. That's even worst you can think. Suppose that a regular file and a directory can be referred by the same hash. What will happen if file is scanned as a directory?

Proposal for a clean solution

The simplest possible approach is to add a middle layer in storage backend, adding a second hash inside main directory. That's the pattern.

Suppose that a file f1 and a directory d2 compute the same hash 6d67fa810e3b294e7204ff5a72f98023f34a238f. To distinguish them, the structure can be changed in:

+ 6d67fa810e3b294e7204ff5a72f98023f34a238f   (h1)
  + 328a2764b4c543d28e3f0a8b73fc9d3e6f824522 (h2)
    + metadata
    + contents
  + 8a1b6c4d09e754fa57b6c28d532648ee243f6a38 (h3)
    + metadata
    + contents

How are calculated h2 and h3 hashes? Second level hashes are the result of sha1(strcat(h1,flare->path)). The probability that two colliding paths generate a second collision when concatenated to collided hash is simply not relevant.

That change should be reflected even inside the b-tree cache. A linear linked list should be enough to manage collisions. Each flare should be provided of a link to next flare with same hash. And cache operations should investigate a little bit more while scanning or modifying the b-tree, not accepting the first position (flare) found but matching also flare path to see if flare is exactly what was requested.

Which code should be changed to reflect this situation?

The short answer is a lot of code, but with minor changes.

struct magma_flare should be provided of a magma_flare_t pointer to build linked list of colliding flares.

magma_search() should double check flare path too before returning a flare. If flare path don't match and colliding list is not empty, all linked elements should be scanned until valid one is found or NULL pointer is reached.

magma_add_to_cache_after() should add a flare to collision list if a flare with same hash exists in b-tree but is not the same flare.

magma_remove_from_cache() should check if flare has non-NULL collision list. If has one, insted of performing normal element removal, flare should be simply substituted by its following colliding flare.

First picture show how cache removal is managed when flare (2) has no collisions. Flare (3) has been choosen for its load, higher than load of flare (4).

Next picture show how cache removal is managed when removing a flare (2) which has a colliding flare (2b) with same hash and pointed to by flare (2) collision list.

magma_save_flare() and magma_load_flare() should also be modified to handle new storage backend schema.

This file last modified Wednesday, 09-Jan-2008 20:02:15 CET