Tapestry (DHT)

Tapestry
Written in	C++
Type	peer-to-peer
Website	current.cs.ucsb.edu/projects/chimera/

Tapestry is a peer-to-peer overlay network which provides a distributed hash table, routing, and multicasting infrastructure for distributed applications. The Tapestry peer-to-peer system offers efficient, scalable, self-repairing, location-aware routing to nearby resources.

The first generation of peer-to-peer applications, including Napster, Gnutella, had restricting limitations such as a central directory for Napster and scoped broadcast queries for Gnutella limiting scalability. To address these problems a second generation of P2P applications were developed including Tapestry, Chord, Pastry, and CAN. These overlays implement a basic key-based routing mechanism. This allows for deterministic routing of messages and adaptation to node failures in the overlay network. Of the named networks Pastry is very close to Tapestry as they both adopt the same routing algorithm by Plaxton et al.

Tapestry is an extensible infrastructure that provides decentralized object location and routing focusing on efficiency and minimizing message latency. This is achieved since Tapestry constructs locally optimal routing tables from initialization and maintains them in order to reduce routing stretch. Furthermore, Tapestry allows object distribution determination according to the needs of a given application. Similarly Tapestry allows applications to implement multicasting in the overlay network.

Each node is assigned a unique nodeID uniformly distributed in a large identifier space. Tapestry uses SHA-1 to produce a 160-bit identifier space represented by a 40 digit hex key. Application specific endpoints GUIDs are similarly assigned unique identifiers. NodeIDs and GUIDs are roughly evenly distributed in the overlay network with each node storing several different IDs. From experiments it is shown that Tapestry efficiency increases with network size, so multiple applications sharing the same overlay network increases efficiency. To differentiate between applications a unique application identifier is used. Tapestry uses best-effort to publish and route objects.

Each identifier is mapped to a live node called the root. If a node's nodeID is G then it is the root else use the routing table's nodeIDs and IP addresses to find the nodes neighbors. At each hop a message is progressively routed closer to G by incremental suffix routing. Each neighbor map has multiple levels where each level contains links to nodes matching up to a certain digit position in the ID. The primary i^th entry in the j^th level is the ID and location of the closest node that begins with prefix (N, j-1)+i. This means that level 1 has links to nodes that have nothing in common, level 2 has the first digit in common, etc. Because of this, routing takes approximately $\log _{B}N$ hops in a network of size N and IDs of base B (hex: B=16). If an exact ID can not be found, the routing table will route to the closest matching node. For fault tolerance, nodes keep c secondary links such that the routing table has size $c*B*\log _{B}N$ .

...
Wikipedia