Distributed Hash Table File System for Linux




Flare objects

(part of Flare system)

This is the foundation of Magma: the component that creates, stores, manages and destroys data. Everything is a "flare" under Magma. Flare objects are implemented in magma_flare_internals.h and magma_flare_internals.c, but should be used only through the interface exported by magma_flare.h.

Flare objects have an internal caching mechanism that uses a binary tree on hash buckets of the complete file pathname. (For more information read here.)
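The caching idea described above can be illustrated with a minimal sketch. It is not the code from magma_flare_internals.c: the names cache_get and cache_put, the bucket count and the djb2 hash are all assumptions chosen for brevity. It only shows the general shape of a cache that hashes the complete pathname to pick a bucket and then walks a binary search tree inside that bucket.

```c
#include <stdlib.h>
#include <string.h>

#define CACHE_BUCKETS 256  /* hypothetical bucket count */

/* One cached entry: a node of the per-bucket binary search tree. */
typedef struct cache_node {
    char *path;                       /* complete file pathname (the key) */
    void *flare;                      /* opaque pointer to the cached object */
    struct cache_node *left, *right;
} cache_node;

static cache_node *buckets[CACHE_BUCKETS];

/* djb2 string hash, used here only because it is short (an assumption). */
static unsigned long hash_path(const char *path)
{
    unsigned long h = 5381;
    while (*path)
        h = h * 33 + (unsigned char)*path++;
    return h;
}

/* Return the tree slot where `path` lives, or where it should be inserted. */
static cache_node **cache_slot(const char *path)
{
    cache_node **slot = &buckets[hash_path(path) % CACHE_BUCKETS];
    while (*slot) {
        int cmp = strcmp(path, (*slot)->path);
        if (cmp == 0)
            break;
        slot = cmp < 0 ? &(*slot)->left : &(*slot)->right;
    }
    return slot;
}

/* Look up a cached object; returns NULL when the path is not cached. */
void *cache_get(const char *path)
{
    cache_node *n = *cache_slot(path);
    return n ? n->flare : NULL;
}

/* Insert or overwrite an entry; returns 0 on success, -1 if out of memory. */
int cache_put(const char *path, void *flare)
{
    cache_node **slot = cache_slot(path);
    if (*slot) {
        (*slot)->flare = flare;
        return 0;
    }
    cache_node *n = calloc(1, sizeof *n);
    if (!n)
        return -1;
    n->path = malloc(strlen(path) + 1);
    if (!n->path) {
        free(n);
        return -1;
    }
    strcpy(n->path, path);
    n->flare = flare;
    *slot = n;
    return 0;
}
```

In this sketch the tree compares complete pathnames, so two paths that fall into the same bucket still remain distinct cache entries.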

A fundamental issue I forgot to mention previously is that, so far, flare objects cannot properly detect collisions in hash representations. That is unacceptable for a production system. See the document How to solve collisions for further explanation. That is why the Flare object component is rated at 90%!

Flare protocol

(part of Flare system)

This is the protocol used between mount.magma and magmad, and also between nodes when a request must be forwarded to the owner node. The protocol is defined in protocol_pkt.h as a set of preprocessor macros. Network byte ordering is handled as of the latest release.
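To show why byte ordering matters for a macro-defined wire protocol, here is a small sketch. The opcode macros, the 8-byte header layout and the function names are hypothetical and are not taken from protocol_pkt.h; only htonl() and ntohl() are real library calls. Each integer field is converted to network byte order before it goes on the wire and back to host order on receipt.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operation codes, in the spirit of a macro-defined protocol. */
#define PKT_OP_LSTAT 1
#define PKT_OP_WRITE 2

#define PKT_HEADER_SIZE 8  /* assumed layout: 4-byte opcode + 4-byte length */

/* Serialize a header into buf (at least PKT_HEADER_SIZE bytes),
 * converting both fields to network byte order. */
void pkt_header_pack(unsigned char *buf, uint32_t op, uint32_t length)
{
    uint32_t n_op = htonl(op);
    uint32_t n_len = htonl(length);
    memcpy(buf, &n_op, 4);
    memcpy(buf + 4, &n_len, 4);
}

/* Parse a header from buf, converting both fields back to host byte order. */
void pkt_header_unpack(const unsigned char *buf, uint32_t *op, uint32_t *length)
{
    uint32_t n_op, n_len;
    memcpy(&n_op, buf, 4);
    memcpy(&n_len, buf + 4, 4);
    *op = ntohl(n_op);
    *length = ntohl(n_len);
}
```

With this approach a little-endian client and a big-endian server read the same opcode and length from the same bytes.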

The Flare protocol has been extensively tested and should be considered working.

When both the flare system and the flare protocol are complete, magma will be able to replace NFS as a traditional single-server-hosted network filesystem.

Internode protocol

(part of Flare system)

This is the protocol nodes use to communicate their load, key space and diagnostics to each other. It is also used to exchange keys when a node is overloaded and wants to drop a portion of its key space, and on startup to join an existing network, get the lava ring image and locate itself in the key space. This protocol is presented as a separate feature just to better explain magma development, but it is in fact an internal component of the flare system.

The base types of the Internode protocol are defined in vulcano.h, which describes lava network nodes and the relationships between them.

Load balancing

(part of Flare system)

Currently this is implemented as a separate thread that computes an integer index using a logarithmic function, but it has not yet been tested in the field. Load balancing, too, is an internal component of the flare system.
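The exact formula is not given here, so the following is only a guess at what a logarithmic integer load index could look like. The function name load_index, its input (operations counted in some sampling window) and the choice of base 2 are all assumptions. The thread that would call it periodically is left out.

```c
#include <stdint.h>

/* Integer floor(log2(x)) for x >= 1, computed by shifting. */
static unsigned ilog2(uint64_t x)
{
    unsigned r = 0;
    while (x >>= 1)
        r++;
    return r;
}

/* Hypothetical load index: floor(log2(1 + ops)). An idle node scores 0,
 * and the index grows by one each time the load roughly doubles. */
unsigned load_index(uint64_t ops_in_window)
{
    if (ops_in_window == UINT64_MAX)  /* avoid overflow of ops + 1 */
        return 64;
    return ilog2(ops_in_window + 1);
}
```

A logarithmic scale compresses a wide range of raw counts into a small integer, so two nodes would compare as unequal only when their loads differ by about a factor of two.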

When both the Internode protocol and Load balancing have been added to Magma, it will be able to act properly as a distributed network filesystem that can dynamically grow in size and scale.

Connection ACLs

(part of Flare system)

Quite obvious: not everyone is allowed to connect to a magma network. The first implementation will provide per-IP filtering. A TLS-encrypted authentication mechanism based on symmetric or asymmetric cryptography could be used later, but that will be considered further on, since there is plenty of good VPN software out there that can provide reliable channels. ACLs, too, are an internal component of the flare system.

Basic support has been coded in acl.c! It describes multiple ACLs, one per share, limiting access by IP/netmask and by subpath (like /home and /home/john) to grant r (read-only), w (read/write) or n (no) access to flares.
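As an illustration of the rules just described, here is a minimal sketch of an ACL check for one share. It is not the code in acl.c: the acl_rule structure, the function names and the policy that the longest matching subpath wins (with n as the default) are assumptions. Rule paths are assumed to carry no trailing slash, and only IPv4 is handled.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical rule: who (network/netmask) gets what access below a subpath. */
typedef struct {
    const char *path;     /* subpath the rule covers, e.g. "/home/john" */
    const char *network;  /* dotted-quad network address */
    const char *netmask;  /* dotted-quad netmask */
    char access;          /* 'r' (read-only), 'w' (read/write) or 'n' (no) */
} acl_rule;

/* True when `path` equals `prefix` or lies beneath it. */
static int path_covers(const char *prefix, const char *path)
{
    size_t n = strlen(prefix);
    if (strncmp(prefix, path, n) != 0)
        return 0;
    /* "/" covers everything; otherwise require a component boundary,
     * so that "/home" does not cover "/homework". */
    return n == 1 || path[n] == '\0' || path[n] == '/';
}

/* True when IPv4 address `ip` falls inside network/netmask. */
static int ip_matches(const char *ip, const char *network, const char *netmask)
{
    struct in_addr a, net, mask;
    if (inet_pton(AF_INET, ip, &a) != 1 ||
        inet_pton(AF_INET, network, &net) != 1 ||
        inet_pton(AF_INET, netmask, &mask) != 1)
        return 0;
    return (a.s_addr & mask.s_addr) == (net.s_addr & mask.s_addr);
}

/* Return the access level for a client on a path. The longest matching
 * subpath wins (an assumed policy); with no match the answer is 'n'. */
char acl_check(const acl_rule *rules, size_t count,
               const char *ip, const char *path)
{
    char access = 'n';
    size_t best = 0;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(rules[i].path);
        if (len >= best && path_covers(rules[i].path, path) &&
            ip_matches(ip, rules[i].network, rules[i].netmask)) {
            best = len;
            access = rules[i].access;
        }
    }
    return access;
}
```

For example, with one rule granting r on /home to 192.168.1.0/255.255.255.0 and another granting w on /home/john to the single host 192.168.1.10, that host can write below /home/john while its neighbours can only read.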

Network failover policy

(part of Flare system)

This is one of the most delicate and fragile aspects of distributed architectures. What should be done if a node collapses? What if a node simply disappears for 60 seconds and later reappears? How can the first case be distinguished from the second? How should the network manage a single node failover? And what should a node that has been cut off from the network assume? Failover policy is a component of the flare system. The first step in development has been creating a flare directory called /.dht/, available in every magma filesystem, that holds a complete description of the distributed hash table.


Redundancy

(part of Flare system)

To better manage network instability, nodes are supposed to duplicate keys so that each one holds a copy of the previous node's key space. When a node goes offline, its redundant mirror can be queried (possibly read-only) to recover the missing contents. Once it is certain that the missing node will not come back, the mirror node can replace it by first becoming owner of the missing node's key space and later rebalancing load onto the nearest nodes.

Redundancy will be implemented by the flare system in a transparent way. Code has been added to protocol_flare.c to duplicate write operations (like magma_write(), but unlike magma_lstat()) on the redundant node. The code is young and should be tested!
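The mirroring scheme above can be sketched as follows. This is not the code in protocol_flare.c: the ring structure, the 32-bit keys and the function names are hypothetical. It assumes node i owns all keys up to an inclusive upper bound and that its mirror is the next node on the ring, which is one way of reading "each node holds a copy of the previous node's key space".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring: node i owns the keys up to and including upper[i].
 * Bounds are sorted ascending and the last one is the maximum key. */
typedef struct {
    const uint32_t *upper;  /* inclusive upper key bound of each node */
    size_t count;           /* number of nodes on the ring */
} ring;

/* Index of the node that owns `key`. */
size_t ring_owner(const ring *r, uint32_t key)
{
    for (size_t i = 0; i < r->count; i++)
        if (key <= r->upper[i])
            return i;
    return r->count - 1;  /* not reached when the last bound is UINT32_MAX */
}

/* Each node mirrors the previous node's key space, so the mirror of
 * node `owner` is its successor on the ring. */
size_t ring_mirror(const ring *r, size_t owner)
{
    return (owner + 1) % r->count;
}

typedef enum { OP_LSTAT, OP_WRITE } op_kind;  /* hypothetical operation kinds */

/* Fill targets[] with the nodes that must handle the operation and return
 * how many there are. Only operations that modify data are duplicated. */
size_t ring_targets(const ring *r, uint32_t key, op_kind op, size_t targets[2])
{
    size_t n = 0;
    size_t owner = ring_owner(r, key);
    targets[n++] = owner;
    if (op == OP_WRITE && r->count > 1)
        targets[n++] = ring_mirror(r, owner);
    return n;
}
```

Under these assumptions a write goes to both the owner and its mirror, while a read-only operation such as an lstat goes to the owner alone.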

When Connection ACLs, Network failover and Redundancy have all been coded, Magma will be able to act as an Internet public storage system, expected to be resistant enough to break-in attempts and network unreliability.

Cryptographic layer

(client only)

Cryptography is an expensive job. To be scalable, the cryptographic layer will be included in the client mounter and performed locally, so it will be something that magmad should never even hear about! As a consequence, the cryptographic layer is not part of the flare system.

NFS interface

(server only)

This is just an add-on planned (and partially, but not usably, implemented) to allow even non-Linux clients to connect to a magma network. While NFS is a more widely accepted interface than a custom protocol based on libfuse, note that using NFS makes it impossible to insert a cryptographic layer, for example. The NFS implementation is nfsvers=2 so that as many clients as possible can connect. NFS will be provided by magmad directly and is not part of the flare system.

IPv6 support

(part of Flare system)

A network application without IPv6 support will not live longer than a few years. Since network communication is intrinsic to DHT operations, IPv6 support must be added inside the flare system. Many subcomponents need to be updated: vulcano nodes, ACL support, the flare network loop and more.

This file last modified Wednesday, 09-Jan-2008 20:02:16 CET