Distributed Hash Table (DHT) and Kademlia
Deep technical documentation of the DHT implementation powering Zentalk’s decentralized infrastructure.
Why DHT? The Problem with Centralized Lookup
In a centralized system, finding resources is simple: ask the server. But this creates critical problems:
| Problem | Centralized System | Decentralized DHT |
|---|---|---|
| Single point of failure | Server down = system down | No single point of failure |
| Censorship | Easy to block/censor | Extremely difficult to censor |
| Privacy | Server sees all queries | Queries distributed across network |
| Scalability | Server bottleneck | Scales with network size |
| Trust | Must trust server operator | Trustless operation |
The core question DHT answers: How can millions of nodes collectively store and retrieve data without any central coordinator?
Key-Value Storage Across Nodes
A DHT distributes a key-value store across all participating nodes:
Traditional Database:
Client → Server → Database[key] → value
DHT:
Client → Node_1 → Node_7 → Node_42 → value
(each hop gets closer to the key's location)
Properties:
- Deterministic: Given a key, any node can find who stores it
- Load-balanced: Data distributed evenly across nodes
- Fault-tolerant: Node failures don’t lose data (replication)
- Scalable: Lookup is O(log n) where n = number of nodes
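These properties can be seen in a toy placement function: every participant computes the same storage nodes for a key, with no coordinator. The sketch below is illustrative (small integer IDs, a made-up responsible_nodes helper) and uses the XOR closeness metric covered in the Kademlia sections that follow:

```python
import hashlib

def responsible_nodes(key: str, node_ids, k=3):
    # Every participant runs the same computation and gets the same
    # answer: the k node IDs closest (by XOR, covered below) to the
    # key's hash. No coordinator needed.
    target = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    return sorted(node_ids, key=lambda n: n ^ target)[:k]

nodes = [1, 42, 7, 99, 1000, 77777, 123456]
a = responsible_nodes("user:alice", nodes)
assert a == responsible_nodes("user:alice", nodes)  # deterministic
assert len(a) == 3                                  # replicated across k nodes
```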
Kademlia Protocol
Kademlia is the DHT protocol used by Zentalk (also used by BitTorrent, IPFS, and Ethereum). It was designed by Petar Maymounkov and David Mazières in 2002.
Core Concepts
| Concept | Description |
|---|---|
| Node ID | 256-bit identifier for each node |
| Key | 256-bit identifier for stored data |
| XOR Distance | Distance metric between IDs |
| k-bucket | List of known nodes at specific distance ranges |
| Routing Table | Collection of k-buckets |
XOR Distance Metric
Why XOR?
Kademlia’s key innovation is using XOR (exclusive or) as a distance metric. This isn’t arbitrary - XOR has mathematical properties that make DHT operations efficient.
XOR Distance Formula:
distance(A, B) = A XOR B
Example:
Node A: 1010 1100
Node B: 1010 0011
─────────
XOR: 0000 1111 = distance 15
Node A: 1010 1100
Node C: 1010 1101
─────────
XOR: 0000 0001 = distance 1
Node C is much closer to A than B
Why XOR Works
XOR satisfies the mathematical requirements for a distance metric:
| Property | Formula | Why It Matters |
|---|---|---|
| Identity | d(x, x) = 0 | A node is zero distance from itself |
| Symmetry | d(x, y) = d(y, x) | If A is close to B, B is close to A |
| Triangle Inequality | d(x, z) ≤ d(x, y) + d(y, z) | No shortcuts - routing always converges |
| Unidirectional | For any x and d, exactly one y where d(x,y) = d | Each distance maps to exactly one node |
The unidirectional property is crucial: for any ID and any target distance, there is exactly ONE other ID at that distance (whether or not a node currently occupies it). This prevents routing ambiguity: lookups for the same key converge along the same path.
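A quick way to check these properties is to compute XOR distances on small 8-bit IDs; the values below match the earlier example (real Zentalk IDs are 256-bit):

```python
def xor_distance(a: int, b: int) -> int:
    return a ^ b

A = 0b10101100   # Node A from the example above
B = 0b10100011   # Node B
C = 0b10101101   # Node C

assert xor_distance(A, B) == 15 and xor_distance(A, C) == 1

# Identity: a node is zero distance from itself
assert xor_distance(A, A) == 0
# Symmetry: d(A, B) == d(B, A)
assert xor_distance(A, B) == xor_distance(B, A)
# Triangle inequality: d(A, C) <= d(A, B) + d(B, C)
assert xor_distance(A, C) <= xor_distance(A, B) + xor_distance(B, C)
# Unidirectional: for fixed A and distance d, A ^ d is the unique ID at d
d = 15
assert xor_distance(A, A ^ d) == d
```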
XOR Distance Interpretation
The XOR distance has a useful interpretation: the position of the highest differing bit.
If A XOR B has its highest set bit at position i (counting from the most significant bit, starting at 0):
- A and B share their first i bits
- They diverge starting at bit i
Example (8-bit view):
A: 1010 1100 0000 0000 ...
B: 1010 1000 0000 0000 ...
         ^
First differing bit at position 5 (from the left)
XOR = 0000 0100 ... = 4 (2^2; bit 5 from the left in the 8-bit view)
This bit-level interpretation enables the binary tree structure.
Binary Tree Structure
Conceptual Model
Kademlia conceptualizes the ID space as a binary tree where:
- Each node is a leaf
- Node ID determines the path from root to leaf
- Each bit in the ID represents a left (0) or right (1) branch
Root
/ \
0 1
/ \ / \
00 01 10 11
/\ /\ /\ /\
... ... ... ...
Node 1010... follows path: right → left → right → left → ...
Subtree Coverage
For any node, the ID space divides into subtrees:
From Node A's perspective (ID: 0...):
Subtree 0: All nodes starting with 1... (half the network)
XOR distance: 1xxx... (very far)
Subtree 1: All nodes starting with 01... (quarter of network)
XOR distance: 01xx... (far)
Subtree 2: All nodes starting with 001... (eighth of network)
XOR distance: 001x... (medium)
...and so on
Key insight: Each subtree is exponentially smaller, but A needs to know at least one node from each subtree to route to any destination.
k-Buckets: The Routing Table
What Are k-Buckets?
A k-bucket is a list of up to k contacts (nodes) for a specific distance range. Zentalk uses k=20.
Each node maintains 256 k-buckets (one for each bit position):
| Bucket | Distance Range | Contains Nodes Where |
|---|---|---|
| 0 | 2^0 to 2^1-1 | First 255 bits match, last bit differs |
| 1 | 2^1 to 2^2-1 | First 254 bits match, second-to-last bit differs |
| 2 | 2^2 to 2^3-1 | First 253 bits match, third-to-last bit differs |
| … | … | … |
| 255 | 2^255 to 2^256-1 | First bit differs |
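The bucket index for a contact is simply the position of the highest set bit of the XOR distance. A minimal sketch on 8-bit IDs (bucket_index is an illustrative helper, not Zentalk's actual API):

```python
def bucket_index(own_id: int, other_id: int) -> int:
    # Bucket i holds contacts at XOR distance in [2^i, 2^(i+1)), so the
    # index is the position of the distance's highest set bit.
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not bucket itself")
    return distance.bit_length() - 1

own = 0b10101100
assert bucket_index(own, 0b10101101) == 0  # last bit differs  → bucket 0
assert bucket_index(own, 0b10100011) == 3  # distance 15       → bucket 3
assert bucket_index(own, 0b00101100) == 7  # first bit differs → bucket 7
```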
Why k=20?
The parameter k balances multiple concerns:
| Factor | Small k | Large k |
|---|---|---|
| Memory usage | Lower | Higher |
| Lookup reliability | Lower | Higher |
| Network overhead | Lower | Higher |
| Resistance to churn | Lower | Higher |
Why 20 specifically:
- Probability of all k nodes failing simultaneously: extremely low
- Even with 50% node churn, lookup succeeds with high probability
- Memory overhead is acceptable (256 buckets * 20 nodes * ~50 bytes = ~256KB)
k-Bucket Maintenance
When a node learns about a new contact:
1. Calculate XOR distance to determine target bucket
2. If bucket has < k entries:
→ Add new contact
3. If bucket is full:
→ Ping least-recently-seen contact
→ If no response: replace with new contact
→ If responds: move to end of list, discard new contact
Why favor old contacts?
- Long-lived nodes are more likely to stay alive (empirically observed in deployed peer-to-peer networks)
- Attackers can’t easily inject new malicious nodes
- “Least-recently-seen” eviction provides Sybil resistance
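The maintenance rules above amount to a least-recently-seen list with ping-before-evict. A sketch, assuming an illustrative KBucket class and a ping callback standing in for the real PING RPC:

```python
from collections import deque

K = 20  # bucket capacity

class KBucket:
    """Sketch of the least-recently-seen eviction policy described above."""

    def __init__(self, ping):
        self.contacts = deque()  # front = least-recently-seen
        self.ping = ping         # stand-in for a network PING RPC

    def update(self, contact):
        if contact in self.contacts:
            self.contacts.remove(contact)
            self.contacts.append(contact)      # move to tail (most recent)
        elif len(self.contacts) < K:
            self.contacts.append(contact)      # room available: add
        else:
            oldest = self.contacts[0]
            if self.ping(oldest):              # old contact still alive:
                self.contacts.remove(oldest)   # keep it, refresh its
                self.contacts.append(oldest)   # position, drop newcomer
            else:
                self.contacts.popleft()        # dead: replace with newcomer
                self.contacts.append(contact)
```

Favoring the least-recently-seen-but-alive contact means stable, long-lived nodes dominate the routing table, which is what gives the table its churn and Sybil resistance.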
Routing Table Visualization
Node ID: 00101... (256 bits)
Bucket 255: [Nodes starting with 1...] ← Far (half the network)
Bucket 254: [Nodes starting with 01...] ← Far-ish
Bucket 253: [Nodes starting with 000...] ← Medium
...
Bucket 2: [Nodes sharing the first 253 bits] ← Close
Bucket 1: [Nodes sharing the first 254 bits] ← Very close
Bucket 0: [Nodes sharing the first 255 bits] ← Closest
Each bucket holds up to k=20 nodes
Lookup Algorithm
FIND_NODE Operation
The fundamental operation: find the k closest nodes to a target ID.
Algorithm:
FIND_NODE(target):
1. Initialize:
- shortlist = k closest nodes from local routing table
- queried = empty set
- closest_seen = infinity
2. Loop:
- Select alpha (3) nodes from shortlist not yet queried
- Send parallel FIND_NODE RPCs to selected nodes
- Mark nodes as queried
3. Process responses:
- Each response contains k closest nodes the responder knows
- Add new nodes to shortlist
- Update closest_seen if closer node found
4. Termination:
- If no node closer than closest_seen found in a round
- Query all remaining nodes in shortlist within k closest
- Return k closest nodes seen
Lookup Convergence: Why It Works
Key property: Each hop at least halves the remaining distance.
Start: Node A (ID: 0000...)
Target: 1111...
Initial distance: 1111... XOR 0000... = 1111... (very large)
Hop 1: A asks bucket 255 → gets node B (ID: 1000...)
New distance: 1111... XOR 1000... = 0111... (halved!)
Hop 2: B asks its bucket → gets node C (ID: 1100...)
New distance: 1111... XOR 1100... = 0011... (halved again!)
Hop 3: C asks its bucket → gets node D (ID: 1110...)
Distance: 0001...
Hop 4: D asks its bucket → gets node E (ID: 1111...)
Distance: 0000... (found!)
Maximum hops: O(log n) where n = network size
For a million-node network: ~20 hops maximum.
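The lookup loop can be sketched against a simulated network whose routing tables follow the bucket rule above. This is a simplified model (all names are illustrative; it queries until the shortlist is exhausted rather than implementing the early-termination round check, and runs the alpha queries sequentially):

```python
import random

ALPHA, K = 3, 20

def build_routing(ids, bucket_k=K):
    # Give each simulated node a Kademlia-style routing table: for every
    # k-bucket, keep up to bucket_k members closest to the node itself.
    tables = {}
    for n in ids:
        buckets = {}
        for m in ids:
            if m == n:
                continue
            i = (n ^ m).bit_length() - 1      # bucket = highest differing bit
            buckets.setdefault(i, []).append(m)
        tables[n] = [m for members in buckets.values()
                     for m in sorted(members, key=lambda c: c ^ n)[:bucket_k]]
    return tables

def find_node(target, seeds, network):
    dist = lambda n: n ^ target
    shortlist = sorted(seeds, key=dist)[:K]   # k closest known so far
    queried = set()
    while True:
        to_query = [n for n in shortlist if n not in queried][:ALPHA]
        if not to_query:
            return shortlist                  # the k closest nodes seen
        for node in to_query:                 # parallel RPCs in a real client
            queried.add(node)
            response = sorted(network[node], key=dist)[:K]  # node's k closest
            shortlist = sorted(set(shortlist) | set(response), key=dist)[:K]

random.seed(7)
ids = random.sample(range(2 ** 16), 200)      # 200 nodes, 16-bit toy ID space
network = build_routing(ids)
target, start = ids[0], ids[-1]
closest = find_node(target, seeds=network[start], network=network)
assert closest[0] == target                   # lookup converged on the target
```

Each response contains at least one node strictly closer to the target (it shares one more prefix bit), which is exactly the halving argument above.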
FIND_VALUE Operation
Similar to FIND_NODE, but for retrieving stored values:
FIND_VALUE(key):
1. Compute target = hash(key)
2. Perform iterative lookup (like FIND_NODE)
3. At each hop:
- If node has value for key → return value immediately
- Otherwise → return k closest nodes (continue lookup)
4. If the value is found:
- Cache it at the closest node seen that did NOT have the value
5. If the lookup completes without finding the value:
- Return "not found"
Iterative vs Recursive Lookup
| Approach | How It Works | Pros | Cons |
|---|---|---|---|
| Iterative | Querying node controls entire lookup | Predictable latency, easier debugging | More round trips |
| Recursive | Each node forwards to next | Fewer round trips | Unpredictable, harder to secure |
Zentalk uses iterative lookups because:
- Better control over timeout/retry
- Easier to reason about security
- Can implement lookup caching at originator
Alpha Parameter: Parallel Queries
The alpha parameter (typically 3) controls lookup parallelism:
Alpha = 1: Sequential lookups
A → B → C → D (slow, but minimal bandwidth)
Alpha = 3: Parallel lookups
A → B ─┐
A → C ─┼→ best response → continue
A → D ─┘
(faster, more bandwidth)
Why alpha=3:
- Tolerates 2 slow/failed nodes per round
- 3x speedup over sequential in practice
- Reasonable bandwidth overhead
| Parameter | Zentalk Value | Purpose |
|---|---|---|
| k (bucket size) | 20 | Redundancy, reliability |
| alpha (parallelism) | 3 | Lookup speed |
| ID size | 256 bits | Security margin, collision avoidance |
Node Operations
JOIN: How a New Node Joins
When a node first starts, it knows nothing about the network except bootstrap nodes:
JOIN(bootstrap_nodes):
1. Generate node ID:
- node_id = SHA256(public_key)
- This binds ID to cryptographic identity
2. Contact bootstrap node:
- Send PING to verify connectivity
- Receive initial contacts
3. Self-lookup:
- Perform FIND_NODE(own_node_id)
- This populates routing table with nearby nodes
4. Bucket refresh:
- For each bucket, FIND_NODE(random_id_in_bucket)
- Discovers nodes at all distance ranges
5. Announce presence:
- Stored data will be replicated to us automatically
- Other lookups will discover us
Why self-lookup works:
- Other nodes learn about us when we query them
- We learn about nodes close to us (most important)
- Routing table fills organically
STORE: How Data Is Stored
STORE(key, value):
1. Compute storage location:
- target = hash(key) // 256-bit ID
2. Find storage nodes:
- nodes = FIND_NODE(target)
- Select k closest nodes to target
3. Store value:
- Send STORE RPC to selected nodes
- Each stores (key, value, timestamp, TTL)
4. Replication:
- Value stored on k nodes for redundancy
- Nodes periodically replicate to neighbors
Storage responsibility:
- Nodes whose IDs are closest to the key are responsible
- On node join/leave, data migrates to maintain this property
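The STORE flow above can be sketched as follows; find_node and send_store are stand-ins for the lookup and RPC layers, with illustrative signatures:

```python
import hashlib

K = 20  # replication factor

def store(key: str, value: bytes, find_node, send_store):
    # find_node(target) -> node IDs sorted by closeness to target;
    # send_store(node, key, value) performs the STORE RPC.
    target = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    closest = find_node(target)[:K]       # the k nodes closest to the key
    for node in closest:
        send_store(node, key, value)      # each stores (key, value, ts, TTL)
    return closest

# toy wiring: 25 candidate nodes, record which ones received the STORE
stored = []
store("keybundle:alice", b"bundle",
      find_node=lambda t: list(range(25)),
      send_store=lambda n, k, v: stored.append(n))
assert len(stored) == K                   # value replicated to k nodes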
Refresh: Keeping Routing Tables Fresh
Routing tables become stale as nodes join/leave:
REFRESH():
Every hour, for each bucket:
1. If no lookup touched this bucket recently:
- random_id = generate ID in bucket's range
- FIND_NODE(random_id)
2. Ping least-recently-seen nodes:
- If no response → remove from bucket
- If response → move to end of bucket
Why refresh matters:
- Detects failed nodes
- Discovers new nodes
- Keeps routing table accurate
Republishing Data
Stored data must be republished to survive node churn:
REPUBLISH():
Every hour:
1. For each locally stored (key, value):
- If originally_stored_by_us OR near_expiration:
- STORE(key, value) to refresh
2. Replicate to new close neighbors:
- If new node joined closer to key:
- Send them the data

| Operation | Frequency | Purpose |
|---|---|---|
| Bucket refresh | 1 hour | Maintain routing accuracy |
| Data republish | 1 hour | Prevent data loss |
| Ping stale contacts | On new contact | Verify liveness |
Security Considerations
Eclipse Attacks
The Attack: An attacker surrounds a victim node with malicious nodes, controlling all its network view.
Normal:
Victim ──► Honest Node A
──► Honest Node B
──► Honest Node C
Eclipsed:
Victim ──► Attacker Node 1 ──► Attacker controls
──► Attacker Node 2 ──► all information
──► Attacker Node 3 ──► victim receives
Impact:
- Attacker controls what data victim sees
- Can partition victim from honest network
- Can censor specific keys
Mitigations in Zentalk:
| Mitigation | How It Works |
|---|---|
| Diverse bootstrap | Multiple independent bootstrap nodes |
| Connection limits | Max connections per IP/subnet |
| Path diversity | Multiple lookup paths for critical data |
| Bucket constraints | Limit nodes from same IP range per bucket |
| Long-term contacts | Prefer established contacts (harder to eclipse) |
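The bucket IP-diversity constraint can be sketched as a per-subnet admission check; the /24 granularity and the limit of 2 are illustrative values, not Zentalk's actual configuration:

```python
from collections import Counter
from ipaddress import ip_network

MAX_PER_SUBNET = 2  # illustrative limit

def allowed_in_bucket(bucket_ips, candidate_ip):
    # Reject a new contact if its /24 subnet already has MAX_PER_SUBNET
    # entries in this bucket, forcing an eclipse attacker to control
    # addresses across many subnets.
    subnet = lambda ip: ip_network(f"{ip}/24", strict=False)
    counts = Counter(subnet(ip) for ip in bucket_ips)
    return counts[subnet(candidate_ip)] < MAX_PER_SUBNET

assert allowed_in_bucket(["1.2.3.4"], "1.2.3.9")                 # under limit
assert not allowed_in_bucket(["1.2.3.4", "1.2.3.5"], "1.2.3.9")  # /24 full
assert allowed_in_bucket(["1.2.3.4", "1.2.3.5"], "9.9.9.9")      # other subnet
```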
Sybil Attacks
The Attack: Attacker creates many fake identities to gain disproportionate network influence.
Honest network: 1000 nodes
Attacker adds: 10000 Sybil nodes
Result: Attacker controls 90% of routing table entries
Attacker likely responsible for most keys
Why Sybil is hard in DHT:
- Can’t choose arbitrary node IDs (see below)
- k-bucket design limits exposure
- Honest nodes seen first are preferred
Mitigations:
| Mitigation | How It Works |
|---|---|
| Proof-of-work | Node ID generation requires computational cost |
| Cryptographic binding | node_id = hash(public_key), can’t choose ID |
| Rate limiting | Limit new contacts per time period |
| Reputation | Track node behavior over time |
| Resource testing | Verify nodes have real resources (bandwidth, storage) |
Node ID Generation
Critical security requirement: Nodes must not be able to choose their ID.
Why this matters:
If attackers could choose IDs:
- Position Sybil nodes near target key
- Become responsible for any key they want
- Intercept/modify stored data
Zentalk’s approach:
node_id = SHA256(ed25519_public_key || creation_timestamp)

| Property | Guarantee |
|---|---|
| Binding to key | ID tied to cryptographic identity |
| Unpredictable | SHA256 output is uniformly random |
| Verifiable | Anyone can verify ID was generated correctly |
| Non-reusable | Timestamp prevents pre-computation |
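Deriving and checking an ID under this scheme can be sketched as follows; the 8-byte big-endian timestamp encoding and the one-hour skew window are assumptions of this sketch, and the Ed25519 signature check is elided:

```python
import hashlib
import time

MAX_SKEW = 3600  # acceptable timestamp window in seconds (assumed value)

def derive_node_id(public_key: bytes, timestamp: int) -> bytes:
    # node_id = SHA256(public_key || timestamp); timestamp encoded as
    # 8 big-endian bytes (an assumption of this sketch)
    return hashlib.sha256(public_key + timestamp.to_bytes(8, "big")).digest()

def verify_node_id(node_id, public_key, timestamp, now=None):
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW:
        return False                      # stale or future-dated: reject
    # a real verifier would also check the Ed25519 signature here
    return node_id == derive_node_id(public_key, timestamp)

pk = bytes(32)                            # placeholder public key
ts = int(time.time())
nid = derive_node_id(pk, ts)
assert verify_node_id(nid, pk, ts)
assert not verify_node_id(nid, pk, ts - 10**6)   # outside the window
```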
Verification on first contact:
1. Node presents: (node_id, public_key, timestamp, signature)
2. Verifier checks:
- node_id == SHA256(public_key || timestamp)
- timestamp within acceptable window
- signature valid
Additional Security Measures
| Threat | Mitigation |
|---|---|
| Data poisoning | Store signatures with data, verify on retrieval |
| Routing table poisoning | Verify node IDs before adding to routing table |
| DDoS via lookups | Rate limit lookup requests |
| Replay attacks | Include timestamps, maintain seen-message cache |
Zentalk-Specific Usage
Relay Node Discovery
Zentalk uses DHT to discover relay nodes for 3-hop relay routing:
1. Client needs relay nodes for circuit
2. Query DHT for key: hash("relay:" || region || timestamp_bucket)
3. DHT returns list of available relays
4. Client selects relays for circuit building
Why DHT for relay discovery:
- No central relay directory
- Load balancing across regions
- Censorship resistant
- Relays can join/leave dynamically
| DHT Key Pattern | Returns |
|---|---|
| relay:guard:eu | Guard relays in Europe |
| relay:middle:* | All middle relays |
| relay:exit:us | Exit relays in US |
Key Bundle Distribution
User public key bundles are stored in DHT:
Store key bundle:
key = hash("keybundle:" || user_wallet_address)
value = encrypted_and_signed_key_bundle
STORE(key, value)
Fetch key bundle:
key = hash("keybundle:" || recipient_wallet_address)
bundle = FIND_VALUE(key)
verify_signature(bundle)
Benefits:
- No central key server
- Key bundles available even if API offline
- Redundant storage across k nodes
Mesh Storage Addressing
Decentralized storage uses DHT for content addressing:
Store encrypted media chunk:
content_hash = SHA256(encrypted_chunk)
storage_key = hash("media:" || content_hash)
STORE(storage_key, encrypted_chunk)
Retrieve:
storage_key = hash("media:" || known_content_hash)
chunk = FIND_VALUE(storage_key)
verify: SHA256(chunk) == content_hash
Mesh storage modules using DHT:
| Module | DHT Key Format | Purpose |
|---|---|---|
| messages | msg:{recipient}:{timestamp_bucket} | Store-and-forward messages |
| media | media:{content_hash} | Chunked encrypted media |
| keys | keybundle:{wallet_address} | Public key distribution |
| profile | profile:{wallet_address} | Encrypted profile sync |
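The content-addressing pattern shared by these modules can be sketched in a few lines; dht_key and the namespace:identifier separator convention are illustrative:

```python
import hashlib

def dht_key(namespace: str, identifier: str) -> bytes:
    # storage_key = hash(namespace || identifier); the "namespace:id"
    # string form is an assumption of this sketch
    return hashlib.sha256(f"{namespace}:{identifier}".encode()).digest()

def verify_chunk(chunk: bytes, expected_content_hash: str) -> bool:
    # self-certifying retrieval: the key commits to the content,
    # so a tampered chunk is detected immediately
    return hashlib.sha256(chunk).hexdigest() == expected_content_hash

chunk = b"encrypted media bytes"
content_hash = hashlib.sha256(chunk).hexdigest()
key = dht_key("media", content_hash)    # media:{content_hash}
assert verify_chunk(chunk, content_hash)
assert not verify_chunk(b"tampered", content_hash)
```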
DHT Parameters in Zentalk
| Parameter | Value | Rationale |
|---|---|---|
| k (bucket size) | 20 | Balance reliability vs overhead |
| alpha (parallel queries) | 3 | Fast lookups, fault tolerance |
| ID size | 256 bits | Match SHA256, security margin |
| Refresh interval | 1 hour | Keep routing tables fresh |
| Republish interval | 1 hour | Maintain data availability |
| Data TTL | 24-72 hours | Configurable per data type |
| Bootstrap nodes | 5+ | Geographic distribution |
Performance Characteristics
Lookup Complexity
| Metric | Value | Notes |
|---|---|---|
| Lookup hops | O(log n) | ~20 hops for 1M nodes |
| Messages per lookup | O(k * log n) | With alpha parallelism |
| Routing table size | O(k * log n) | 256 buckets hold up to 5120 entries; ~400 typically populated for a 1M-node network |
| Storage overhead | O(k) per key | Replication factor |
Latency Expectations
Lookup latency = hops * avg_rtt + processing_time
For 1M node network:
- Hops: ~20
- Avg RTT: 50ms (varies by geography)
- Total: ~1 second typical
With alpha=3 parallelism:
- Effective latency: ~300-500ms typical
Related Documentation
- Architecture - System overview including mesh storage
- Onion Routing - How DHT enables relay discovery
- Cryptography - Hash functions and signatures used
- Run a Node - Operating a Zentalk DHT node