Message Delivery

Technical specification for reliable message delivery in Zentalk’s decentralized network.


Overview

Zentalk implements a store-and-forward message delivery system optimized for privacy-preserving, decentralized communication. Messages traverse the network through 3-hop relay routing, are stored temporarily on the mesh network, and are delivered to recipients with strong reliability guarantees.

Delivery Pipeline

  1. Sender creates message
  2. Message queued in local outbox
  3. E2EE encryption (Double Ratchet)
  4. 3-hop relay routing
  5. Storage on mesh DHT (multiple nodes)
  6. Recipient polls or is notified
  7. Message retrieved and decrypted
  8. ACK sent back through mesh
  9. Sender removes message from outbox

Key Properties

| Property | Guarantee |
| --- | --- |
| Delivery semantics | At-least-once |
| Ordering | Causal ordering within session |
| Durability | Mesh-replicated storage |
| Privacy | End-to-end encrypted, onion routed |
| Offline support | Store-and-forward up to 30 days |

Delivery Guarantees

At-Least-Once Delivery

Zentalk provides at-least-once delivery semantics. This means:

| Scenario | Outcome |
| --- | --- |
| Normal delivery | Message delivered exactly once |
| Network failure during send | Message retried, may deliver multiple times |
| Network failure during ACK | Message retried, may deliver multiple times |
| Recipient offline | Message stored, delivered when online |

Why not exactly-once?

True exactly-once delivery in a distributed system requires either:

  1. Centralized coordination (violates decentralization)
  2. Two-phase commit (requires both parties online simultaneously)
  3. Expensive consensus protocols (high latency, low throughput)

Instead, Zentalk achieves effectively-once delivery through client-side deduplication.

Delivery Semantics Comparison

| Semantic | Guarantee | Use Case | Zentalk |
| --- | --- | --- | --- |
| At-most-once | May lose messages | Fire-and-forget analytics | No |
| At-least-once | May duplicate messages | Critical messaging | Yes |
| Exactly-once | Neither loss nor duplication | Financial transactions | Effectively (via dedup) |

Causal Ordering

Messages within a session maintain causal ordering:

If Alice sends M1 then M2:

  • Bob receives M1 before M2 (or Bob holds M2 until M1 arrives)
  • This is enforced by the Double Ratchet message counters

Cross-session ordering:

  • Not guaranteed between different conversations
  • Not guaranteed across device boundaries

Message Queue Structure

Outbox Queue (Sender Side)

Messages awaiting delivery are stored in a local priority queue.

Outbox Structure:

```
┌─────────────────────────────────────────────────────────────┐
│ Queue Entry                                                 │
├─────────────────────────────────────────────────────────────┤
│ message_id (16 bytes)         │ UUID v4                     │
│ recipient (32 bytes)          │ Recipient wallet address    │
│ encrypted_payload (variable)  │ E2EE ciphertext             │
│ priority (1 byte)             │ 0=low, 1=normal, 2=high     │
│ created_at (8 bytes)          │ Unix timestamp (ms)         │
│ retry_count (2 bytes)         │ Number of send attempts     │
│ next_retry_at (8 bytes)       │ Scheduled retry time        │
│ expires_at (8 bytes)          │ Message TTL deadline        │
│ status (1 byte)               │ pending, sending, sent      │
└─────────────────────────────────────────────────────────────┘
```

Queue Organization

| Property | Implementation |
| --- | --- |
| Order | Priority queue with FIFO within priority |
| Persistence | IndexedDB with WAL |
| Max size | 10,000 messages |
| Memory limit | 50MB encrypted payloads |

Priority Levels

| Priority | Value | Use Case | Max Queue Time |
| --- | --- | --- | --- |
| High | 2 | Real-time messages, read receipts | 1 hour |
| Normal | 1 | Regular messages | 24 hours |
| Low | 0 | Media chunks, backfill | 72 hours |
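The ordering rule above ("priority queue with FIFO within priority") can be sketched as a comparator. This is a minimal sketch: `OutboxEntry` is a reduced form of the Queue Entry, not the full structure.

```typescript
// Reduced sketch of the outbox Queue Entry (not the full wire structure).
interface OutboxEntry {
  messageId: string;
  priority: 0 | 1 | 2;   // 0=low, 1=normal, 2=high
  createdAt: number;     // Unix timestamp (ms)
}

// Higher priority first; FIFO (older createdAt first) within the same priority.
function outboxCompare(a: OutboxEntry, b: OutboxEntry): number {
  if (a.priority !== b.priority) return b.priority - a.priority;
  return a.createdAt - b.createdAt;
}
```

A real implementation would use a heap keyed on this comparison; sorting an array with `outboxCompare` gives the same dequeue order.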

Queue Processing Algorithm

```
process_outbox():
  while true:
    // Get highest priority pending message
    msg = outbox.peek_highest_priority()
    if msg == null:
      sleep(100ms)
      continue

    if msg.next_retry_at > now():
      sleep(min(msg.next_retry_at - now(), 100ms))
      continue

    if msg.expires_at < now():
      outbox.remove(msg.message_id)
      emit_event("message_expired", msg.message_id)
      continue

    // Attempt delivery
    msg.status = "sending"
    msg.retry_count += 1
    result = attempt_delivery(msg)

    if result.success:
      // Keep in outbox until ACK received
      msg.status = "sent"
      msg.sent_at = now()
    else:
      msg.status = "pending"
      msg.next_retry_at = calculate_backoff(msg.retry_count)
      if msg.retry_count >= MAX_RETRIES:
        outbox.move_to_dead_letter(msg)
```

Inbox Queue (Recipient Side)

Inbox Structure:

```
┌─────────────────────────────────────────────────────────────┐
│ Inbox Entry                                                 │
├─────────────────────────────────────────────────────────────┤
│ message_id (16 bytes)         │ Sender-assigned UUID        │
│ sender (32 bytes)             │ Sender wallet address       │
│ decrypted_payload (variable)  │ Plaintext message           │
│ received_at (8 bytes)         │ Local receipt time          │
│ server_timestamp (8 bytes)    │ Mesh storage time           │
│ sequence_number (4 bytes)     │ DR message counter          │
│ session_id (32 bytes)         │ Double Ratchet session      │
│ ack_sent (1 byte)             │ Whether ACK was sent        │
└─────────────────────────────────────────────────────────────┘
```

Delivery Retry Logic

Exponential Backoff Strategy

When message delivery fails, Zentalk uses exponential backoff with jitter to prevent thundering herd problems.

Backoff Formula:

```
base_delay    = 1000ms    (1 second)
max_delay     = 3600000ms (1 hour)
jitter_factor = 0.2       (20%)

delay      = min(base_delay * 2^retry_count, max_delay)
jitter     = delay * random(-jitter_factor, +jitter_factor)
next_retry = now() + delay + jitter
```
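A direct TypeScript rendering of the formula, using the default parameters above. Treating `attempt` as 1-based (so attempt 1 has a 1s base delay) is an interpretation chosen to match the Retry Schedule table; the spec does not say whether `retry_count` starts at 0 or 1.

```typescript
// Defaults from the Retry Configuration table.
const BASE_DELAY_MS = 1_000;     // 1 second
const MAX_DELAY_MS = 3_600_000;  // 1 hour
const JITTER_FACTOR = 0.2;       // ±20%

// Exponential backoff with symmetric jitter; `attempt` is 1-based.
function backoffDelayMs(attempt: number): number {
  const delay = Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
  const jitter = delay * (Math.random() * 2 - 1) * JITTER_FACTOR;
  return delay + jitter;
}

function nextRetryAt(attempt: number, nowMs: number = Date.now()): number {
  return nowMs + backoffDelayMs(attempt);
}
```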

Retry Schedule

| Attempt | Base Delay | With Jitter Range |
| --- | --- | --- |
| 1 | 1s | 0.8s - 1.2s |
| 2 | 2s | 1.6s - 2.4s |
| 3 | 4s | 3.2s - 4.8s |
| 4 | 8s | 6.4s - 9.6s |
| 5 | 16s | 12.8s - 19.2s |
| 6 | 32s | 25.6s - 38.4s |
| 7 | 64s | 51.2s - 76.8s |
| 8 | 128s | 102.4s - 153.6s |
| 9 | 256s | 204.8s - 307.2s |
| 10 | 512s | 409.6s - 614.4s |
| 11+ | 1 hour | 48m - 72m |

Retry Configuration

| Parameter | Default | Range |
| --- | --- | --- |
| Max retries | 15 | 5-50 |
| Base delay | 1s | 100ms-10s |
| Max delay | 1 hour | 1m-24h |
| Jitter factor | 0.2 | 0-0.5 |

Failure Classification

Different failure types trigger different retry behaviors:

| Failure Type | Retry | Backoff | Notes |
| --- | --- | --- | --- |
| Network timeout | Yes | Exponential | Temporary connectivity issue |
| Connection refused | Yes | Exponential | Node may be down |
| TLS handshake failed | Yes | Linear (5min) | Certificate issue |
| DHT lookup failed | Yes | Exponential | Try different nodes |
| Circuit build failed | Yes | Exponential | Rebuild circuit |
| Recipient key not found | Delayed | 1 hour | Recipient may not be registered |
| Invalid message format | No | N/A | Bug in sender, log error |
| Authentication failed | No | N/A | Session corruption, trigger reset |

Retry Algorithm Implementation

```
attempt_delivery(msg):
  try:
    // Build onion circuit
    circuit = build_circuit()
    if circuit.failed:
      return { success: false, error: "circuit_build_failed" }

    // Find storage nodes for recipient
    storage_nodes = dht.find_nodes(
      hash("inbox:" + msg.recipient),
      replication_factor=3
    )
    if storage_nodes.empty:
      return { success: false, error: "dht_lookup_failed" }

    // Store on mesh with replication
    stored_count = 0
    for node in storage_nodes:
      result = circuit.send(node, STORE_MESSAGE, msg.encrypted_payload)
      if result.success:
        stored_count += 1

    // Require majority success
    if stored_count >= 2:
      return { success: true, stored_nodes: stored_count }
    else:
      return { success: false, error: "insufficient_replication" }

  catch NetworkError as e:
    return { success: false, error: "network_error", details: e }
  catch TimeoutError as e:
    return { success: false, error: "timeout", details: e }
```

Dead Letter Handling

Messages that exceed maximum retry attempts are moved to a dead letter queue for manual review or automatic expiration.

Dead Letter Queue Structure

Dead Letter Entry:

```
┌─────────────────────────────────────────────────────────────┐
│ original_message             │ Full message data            │
│ failure_reason (string)      │ Last error message           │
│ total_attempts (2 bytes)     │ Number of delivery tries     │
│ first_attempt_at (8 bytes)   │ Initial send time            │
│ last_attempt_at (8 bytes)    │ Final retry time             │
│ moved_to_dlq_at (8 bytes)    │ When moved to DLQ            │
│ expires_at (8 bytes)         │ Auto-deletion time           │
└─────────────────────────────────────────────────────────────┘
```

Dead Letter Policies

| Policy | Action | Trigger |
| --- | --- | --- |
| Max retries exceeded | Move to DLQ | 15 attempts |
| Permanent failure | Move to DLQ | Auth failure, invalid format |
| User action | Move to DLQ | Manual cancel |
| TTL expired | Delete | Message expired before delivery |

Dead Letter Retention

| Category | Retention | Auto-action |
| --- | --- | --- |
| Retriable failures | 7 days | Auto-retry once daily |
| Permanent failures | 30 days | Manual review required |
| Expired messages | 24 hours | Auto-delete |

Dead Letter Recovery

```
recover_from_dlq(message_id):
  1. Fetch message from DLQ
  2. Reset retry count to 0
  3. Recalculate expiration (if not already expired)
  4. Move back to outbox
  5. Trigger immediate delivery attempt
```

The user can also:

  • Manually delete messages from the DLQ
  • Export failed messages for debugging
  • View failure history

Message Expiration (TTL)

TTL Configuration

Messages have configurable time-to-live values:

| Message Type | Default TTL | Max TTL | Min TTL |
| --- | --- | --- | --- |
| Text message | 30 days | 90 days | 1 hour |
| Media message | 7 days | 30 days | 1 hour |
| Read receipt | 24 hours | 7 days | 1 hour |
| Typing indicator | 30 seconds | 60 seconds | 10 seconds |
| Disappearing message | User-defined | 30 days | 5 seconds |

TTL Enforcement Points

| Point | Action |
| --- | --- |
| Sender outbox | Delete if expired before send |
| Mesh storage | Nodes refuse storage if TTL < 1 hour remaining |
| Mesh retrieval | Nodes delete expired messages on access |
| Recipient inbox | Delete on expiration check |

TTL Wire Format

TTL Header (included in encrypted payload):

```
┌──────────────┬──────────┬────────────────────┐
│ Field        │ Size     │ Description        │
├──────────────┼──────────┼────────────────────┤
│ created_at   │ 8 bytes  │ Message creation   │
│ ttl_seconds  │ 4 bytes  │ Lifetime in secs   │
│ ttl_type     │ 1 byte   │ absolute/relative  │
└──────────────┴──────────┴────────────────────┘
```

TTL Types:

  • 0x00 = absolute: expires at created_at + ttl_seconds
  • 0x01 = relative: expires ttl_seconds after first read
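A sketch encoder for the 13-byte header above (8 + 4 + 1 bytes). The spec only fixes the field sizes; big-endian byte order is an assumption here.

```typescript
const TTL_ABSOLUTE = 0x00;
const TTL_RELATIVE = 0x01;

// Serialize the TTL header: created_at (8B) || ttl_seconds (4B) || ttl_type (1B).
// Big-endian encoding is assumed; the wire format above specifies only sizes.
function encodeTtlHeader(createdAtMs: bigint, ttlSeconds: number, ttlType: number): Uint8Array {
  const buf = new Uint8Array(13);
  const view = new DataView(buf.buffer);
  view.setBigUint64(0, createdAtMs);  // created_at (8 bytes)
  view.setUint32(8, ttlSeconds);      // ttl_seconds (4 bytes)
  buf[12] = ttlType;                  // ttl_type (1 byte)
  return buf;
}
```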

Expiration Algorithm

```
is_expired(message):
  if message.ttl_type == ABSOLUTE:
    return now() > message.created_at + message.ttl_seconds
  if message.ttl_type == RELATIVE:
    if message.first_read_at == null:
      return false  // Not yet read
    return now() > message.first_read_at + message.ttl_seconds
```

Deduplication Mechanism

Since Zentalk provides at-least-once delivery, clients must handle potential duplicates.

Deduplication Strategy

Deduplication is performed at the recipient using:

  1. Message ID (UUID)
  2. Sender address
  3. Session ID
  4. Double Ratchet message counter

Deduplication Cache

Dedup Cache Structure:

```
┌─────────────────────────────────────────────────────────────┐
│ Key: (sender || session_id || message_id)                   │
│ Value: {                                                    │
│   received_at: timestamp,                                   │
│   message_counter: uint32,                                  │
│   hash: SHA256(encrypted_payload)                           │
│ }                                                           │
│ TTL: 7 days                                                 │
└─────────────────────────────────────────────────────────────┘
```
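A sketch of the key derivation, assuming Node's built-in `crypto`. SHA-256 over the simple concatenation is an assumption; the structure above specifies the inputs but not the exact hash construction (the message counter is stored in the value and checked separately).

```typescript
import { createHash } from "node:crypto";

// Derive the dedup cache key from (sender || session_id || message_id).
// SHA-256 is assumed; the spec does not name the hash function.
function dedupKey(sender: string, sessionId: string, messageId: string): string {
  return createHash("sha256")
    .update(sender)
    .update(sessionId)
    .update(messageId)
    .digest("hex");
}
```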

Deduplication Algorithm

```
process_incoming_message(msg):
  // Generate dedup key
  dedup_key = hash(msg.sender || msg.session_id || msg.message_id)

  // Check cache
  existing = dedup_cache.get(dedup_key)
  if existing != null:
    // Duplicate detected
    if existing.hash == hash(msg.encrypted_payload):
      // Exact duplicate - safe to ignore
      log("Duplicate message ignored", msg.message_id)
      return DUPLICATE
    else:
      // Same ID but different content - potential attack
      log_security_event("Message ID collision", msg)
      return COLLISION_ERROR

  // New message - decrypt and process
  decrypted = decrypt(msg)

  // Verify message counter (Double Ratchet provides this)
  if decrypted.counter <= session.last_received_counter:
    // Out of order or replay
    if decrypted.counter in session.skipped_keys:
      // Legitimate out-of-order message
      process_out_of_order(decrypted)
    else:
      // Replay attack
      log_security_event("Replay attack detected", msg)
      return REPLAY_ERROR

  // Valid new message
  dedup_cache.set(dedup_key, {
    received_at: now(),
    message_counter: decrypted.counter,
    hash: hash(msg.encrypted_payload)
  })
  return SUCCESS
```

Cache Size Management

| Parameter | Value |
| --- | --- |
| Max entries | 100,000 |
| Max memory | 10MB |
| Eviction policy | LRU with TTL |
| Persistence | IndexedDB |

Out-of-Order Message Handling

Message Ordering Guarantees

| Scope | Guarantee |
| --- | --- |
| Same session, same sender | Causal order (via DR counters) |
| Same session, bidirectional | No ordering between directions |
| Different sessions | No ordering guarantee |
| Different devices | No ordering guarantee |

Reordering Buffer

When messages arrive out of order, they are buffered until predecessors arrive:

Reorder Buffer:

```
┌─────────────────────────────────────────────────────────────┐
│ Session: (sender, session_id)                               │
│ Expected counter: 42                                        │
│ Buffer: {                                                   │
│   43: { message: M43, received_at: ... },                   │
│   45: { message: M45, received_at: ... },                   │
│   44: { message: M44, received_at: ... }                    │
│ }                                                           │
│ Buffer limit: 1000 messages                                 │
│ Buffer timeout: 5 minutes                                   │
└─────────────────────────────────────────────────────────────┘
```
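A minimal sketch of this buffering behavior: messages with a future counter are held until the gap is filled, then flushed in order. The buffer limit and timeout from the diagram are omitted for brevity, and the past-counter (skipped-key) path is reduced to ignoring the message.

```typescript
class ReorderBuffer<M> {
  private buffer = new Map<number, M>();

  constructor(
    private expected: number,             // next counter we expect to deliver
    private deliver: (m: M) => void       // in-order delivery callback
  ) {}

  handle(counter: number, msg: M): void {
    if (counter === this.expected) {
      this.deliver(msg);
      this.expected++;
      // Flush any buffered successors that are now in order
      while (this.buffer.has(this.expected)) {
        this.deliver(this.buffer.get(this.expected)!);
        this.buffer.delete(this.expected);
        this.expected++;
      }
    } else if (counter > this.expected) {
      this.buffer.set(counter, msg);      // future message: hold for predecessors
    }
    // counter < expected: duplicate/replay, ignored in this sketch
  }
}
```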

Reordering Algorithm

```
handle_message(session, msg):
  counter = msg.message_counter

  if counter == session.expected_counter:
    // In order - deliver immediately
    deliver(msg)
    session.expected_counter += 1

    // Check buffer for subsequent messages
    while session.buffer.has(session.expected_counter):
      buffered = session.buffer.remove(session.expected_counter)
      deliver(buffered)
      session.expected_counter += 1

  elif counter > session.expected_counter:
    // Future message - buffer it
    gap = counter - session.expected_counter
    if gap > MAX_GAP (1000):
      // Too large gap - likely session corruption
      trigger_session_reset(session)
      return
    session.buffer.add(counter, msg)
    // Set timeout for missing messages
    schedule_timeout(session, counter, 5 minutes)

  else:
    // Past message - check skipped keys
    if session.has_skipped_key(counter):
      deliver_with_skipped_key(session, msg, counter)
    else:
      // Already processed or replay
      log("Duplicate or replay", msg.message_id)
```

Gap Timeout Handling

```
on_gap_timeout(session, expected_counter):
  // Missing messages didn't arrive in time
  // Deliver buffered messages anyway to prevent UI stall
  if session.buffer.has(expected_counter + 1):
    log("Gap timeout, delivering subsequent messages")

    // Mark the gap
    session.gaps.add({
      missing_counter: expected_counter,
      detected_at: now()
    })

    // Skip and deliver
    session.expected_counter = find_next_available(session.buffer)
    flush_buffer(session)

  // Request retransmission if possible
  request_message_range(session.peer, expected_counter, expected_counter + 10)
```

Sequence Number vs Timestamp Ordering

| Method | Pros | Cons | Use in Zentalk |
| --- | --- | --- | --- |
| Sequence numbers | Reliable, gap detection | Sender-specific | Primary |
| Timestamps | Cross-sender ordering | Clock skew, spoofing | Display hint |
| Vector clocks | Causal ordering | Complex, overhead | Not used |

Zentalk approach:

  • Use Double Ratchet message counters for ordering (sequence numbers)
  • Use timestamps for UI display order
  • Never trust timestamps for security decisions

Acknowledgment Protocol (ACKs)

ACK Types

| Type | Code | Purpose |
| --- | --- | --- |
| DELIVERY_ACK | 0x01 | Message stored on mesh |
| RECEIPT_ACK | 0x02 | Recipient retrieved message |
| READ_ACK | 0x03 | Recipient read message (optional) |

ACK Message Format

ACK Structure:

```
┌─────────────────────────────────────────────────────────────┐
│ ack_type (1 byte)       │ Type of acknowledgment            │
│ message_id (16 bytes)   │ Original message UUID             │
│ sender (32 bytes)       │ Original sender address           │
│ timestamp (8 bytes)     │ ACK generation time               │
│ signature (64 bytes)    │ Ed25519 signature                 │
└─────────────────────────────────────────────────────────────┘
```

Total: 121 bytes
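A sketch serializer for this layout (1 + 16 + 32 + 8 + 64 = 121 bytes). The field order and sizes come from the structure above; the big-endian timestamp encoding is an assumption, and signing itself is out of scope here.

```typescript
interface Ack {
  ackType: number;         // 0x01-0x03, per the ACK Types table
  messageId: Uint8Array;   // 16 bytes (UUID)
  sender: Uint8Array;      // 32 bytes (wallet address)
  timestamp: bigint;       // ms since epoch
  signature: Uint8Array;   // 64 bytes (Ed25519)
}

// Pack an ACK into its fixed 121-byte wire form.
function encodeAck(ack: Ack): Uint8Array {
  const buf = new Uint8Array(121);
  const view = new DataView(buf.buffer);
  buf[0] = ack.ackType;                 // offset 0:  ack_type
  buf.set(ack.messageId, 1);            // offset 1:  message_id
  buf.set(ack.sender, 17);              // offset 17: sender
  view.setBigUint64(49, ack.timestamp); // offset 49: timestamp (big-endian, assumed)
  buf.set(ack.signature, 57);           // offset 57: signature
  return buf;
}
```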

ACK Flow

```
Sender                     Mesh DHT                   Recipient
  │                           │                           │
  │──── Message ─────────────►│                           │
  │◄─── DELIVERY_ACK ─────────│                           │
  │                           │                           │
  │                           │◄──── Poll/Notify ─────────│
  │                           │───── Message ────────────►│
  │                           │                           │
  │                           │◄──── RECEIPT_ACK ─────────│
  │◄─── RECEIPT_ACK ──────────│                           │
  │                           │                           │
  │                           │   (User reads message)    │
  │                           │◄──── READ_ACK ────────────│
  │◄─── READ_ACK ─────────────│                           │
  │                           │                           │
```

ACK Processing (Sender)

```
on_ack_received(ack):
  msg = outbox.find(ack.message_id)
  if msg == null:
    return  // Already removed or unknown

  switch ack.ack_type:
    case DELIVERY_ACK:
      // Message stored on mesh
      msg.delivery_confirmed = true
      msg.delivered_at = ack.timestamp
      // Keep in outbox until RECEIPT_ACK

    case RECEIPT_ACK:
      // Recipient has the message
      outbox.remove(msg.message_id)
      emit_event("message_delivered", msg.message_id)

    case READ_ACK:
      // Recipient read the message
      emit_event("message_read", msg.message_id, ack.timestamp)
```

ACK Reliability

ACKs themselves may be lost. Handling strategies:

| Lost ACK | Consequence | Mitigation |
| --- | --- | --- |
| DELIVERY_ACK lost | Sender retries unnecessarily | Mesh deduplicates |
| RECEIPT_ACK lost | Sender keeps in outbox longer | Periodic re-poll |
| READ_ACK lost | No read receipt shown | Acceptable (optional feature) |

ACK Aggregation

To reduce overhead, ACKs can be batched:

Batch ACK Structure:

```
┌─────────────────────────────────────────────────────────────┐
│ ack_type (1 byte)        │ Type (same for all)              │
│ count (2 bytes)          │ Number of ACKs                   │
│ For each ACK:                                               │
│   message_id (16 bytes)  │ Original message UUID            │
│ timestamp (8 bytes)      │ Batch timestamp                  │
│ signature (64 bytes)     │ Ed25519 signature                │
└─────────────────────────────────────────────────────────────┘
```

Offline Message Storage on Mesh

Storage Architecture

When recipients are offline, messages are stored on the mesh DHT network.

Storage Location:

```
key = hash("inbox:" || recipient_wallet_address || timestamp_bucket)
```

Timestamp Bucketing:

```
bucket = floor(timestamp / 3600)  // 1-hour buckets
```

This prevents timing analysis of message patterns.
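The bucket computation can be sketched as below; interpreting `timestamp` as Unix seconds is an assumption (the division by 3600 only yields hour buckets for a seconds-based timestamp).

```typescript
// Hour bucket for the inbox storage key; tsSeconds is a Unix timestamp in
// seconds (assumed unit). All timestamps within the same hour map to one bucket.
function timestampBucket(tsSeconds: number): number {
  return Math.floor(tsSeconds / 3600);
}
```

Coarse bucketing means a storage node sees only which hour a message arrived in, not a fine-grained timing pattern.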

Replication Strategy

| Parameter | Value |
| --- | --- |
| Replication factor | 3 nodes |
| Consistency model | Eventual |
| Read quorum | 1 |
| Write quorum | 2 |

Storage Node Selection

```
select_storage_nodes(recipient):
  // Use DHT to find nodes close to recipient's inbox key
  inbox_key = hash("inbox:" || recipient)

  // Find k closest nodes
  nodes = dht.find_closest(inbox_key, k=5)

  // Select top 3 by reliability score
  scored = nodes.map(n => {
    uptime: n.uptime_percentage,
    bandwidth: n.available_bandwidth,
    latency: n.average_latency,
    score: 0.4 * uptime + 0.3 * bandwidth + 0.3 * (1/latency)
  })

  return scored.sort_by(score).take(3)
```

Message Retrieval

```
retrieve_messages(my_address):
  // Calculate recent inbox keys (last 30 days)
  buckets = []
  for day in range(0, 30):
    for hour in range(0, 24):
      bucket = floor((now() - day*86400 - hour*3600) / 3600)
      buckets.append(hash("inbox:" || my_address || bucket))

  // Query DHT for each bucket
  messages = []
  for bucket_key in buckets:
    nodes = dht.find_closest(bucket_key, k=3)
    for node in nodes:
      result = node.get(bucket_key)
      if result.success:
        messages.extend(result.messages)
        break  // Got from one node, move to next bucket

  // Deduplicate and sort
  return deduplicate(messages).sort_by(timestamp)
```

Storage Quotas

| Quota | Limit |
| --- | --- |
| Per-recipient storage | 100MB |
| Max messages per recipient | 10,000 |
| Max message size | 256KB |
| Storage duration | 30 days |

Garbage Collection

Storage nodes periodically clean expired messages:

```
gc_interval: 1 hour

gc_algorithm:
  1. Scan all stored messages
  2. Delete a message if:
     - TTL expired
     - Storage duration > 30 days
     - ACK received (confirmed delivery)
  3. Report storage metrics
```

Message Routing through Onion Network

Circuit Selection for Messages

```
select_circuit(destination):
  // Check for existing circuit to destination region
  existing = circuit_pool.find_for_destination(destination)
  if existing && existing.is_healthy():
    return existing

  // Build new circuit
  guard = select_guard()
  middle = select_middle(exclude: [guard.family])
  exit = select_exit(exclude: [guard.family, middle.family])

  circuit = build_circuit(guard, middle, exit)
  circuit_pool.add(circuit)
  return circuit
```

Message Encapsulation

```
send_message_through_circuit(circuit, msg):
  // Wrap message for exit node
  exit_cell = {
    command: RELAY_DATA,
    destination: msg.storage_node,
    payload: msg.encrypted_payload
  }
  layer3 = encrypt(circuit.exit_key, exit_cell)

  // Wrap for middle node
  middle_cell = {
    command: RELAY,
    next_hop: circuit.exit_node,
    payload: layer3
  }
  layer2 = encrypt(circuit.middle_key, middle_cell)

  // Wrap for guard node
  guard_cell = {
    command: RELAY,
    next_hop: circuit.middle_node,
    payload: layer2
  }
  layer1 = encrypt(circuit.guard_key, guard_cell)

  // Send to guard
  circuit.guard_connection.send(layer1)
```

Circuit Lifecycle for Messaging

| Phase | Duration | Action |
| --- | --- | --- |
| Build | 500ms-2s | Create 3-hop circuit |
| Active | Up to 10min | Send/receive messages |
| Idle timeout | 60s | Destroy if unused |
| Max lifetime | 10min | Force rotation |
| Failure | Immediate | Rebuild and retry |

Performance Characteristics

Latency Breakdown

| Operation | Typical | Worst Case |
| --- | --- | --- |
| Local encryption | 1-5ms | 20ms |
| Circuit build | 500ms | 3s |
| 3-hop relay routing | 100-300ms | 1s |
| DHT lookup | 200-500ms | 2s |
| Mesh storage | 50-100ms | 500ms |
| Total send latency | 850ms-1s | 7s |

Throughput Limits

| Constraint | Limit |
| --- | --- |
| Messages per circuit | 100/minute |
| Circuits per client | 10 concurrent |
| Max message rate | 1000/minute |
| Bandwidth per client | 1MB/s |

Scalability Characteristics

```
DHT Lookup:      O(log n), where n = network size
Storage Lookup:  O(1) with k replicas
Message Routing: O(3) fixed hops

For a 1M-node network:
  - DHT lookup: ~20 hops max
  - Effective lookup time: ~1s
  - Storage availability: 99.9%+ with k=3
```
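A back-of-envelope check of the ~20-hop figure above, assuming the DHT lookup behaves like a Kademlia-style lookup taking roughly log2(n) hops:

```typescript
// Worst-case hop estimate for an O(log n) DHT lookup.
function dhtHops(networkSize: number): number {
  return Math.ceil(Math.log2(networkSize));
}
```

For a 1,000,000-node network, log2(1e6) ≈ 19.93, which rounds up to the ~20 hops quoted above.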

Reliability Metrics

| Metric | Target | Mechanism |
| --- | --- | --- |
| Message delivery rate | 99.9% | Retries, replication |
| Delivery latency P50 | 1s | Circuit pooling |
| Delivery latency P99 | 10s | Aggressive retry |
| Duplicate rate | < 0.1% | Client deduplication |

Error Handling

Error Codes

| Code | Name | Description | Retry |
| --- | --- | --- | --- |
| E001 | NETWORK_UNREACHABLE | No network connectivity | Yes |
| E002 | CIRCUIT_BUILD_FAILED | Cannot establish onion circuit | Yes |
| E003 | DHT_LOOKUP_FAILED | Cannot find storage nodes | Yes |
| E004 | STORAGE_FAILED | Mesh storage rejected message | Yes |
| E005 | RECIPIENT_NOT_FOUND | Recipient not registered | Delayed |
| E006 | MESSAGE_TOO_LARGE | Exceeds size limit | No |
| E007 | RATE_LIMITED | Too many messages | Delayed |
| E008 | ENCRYPTION_FAILED | Cryptographic error | No |
| E009 | SESSION_INVALID | DR session corrupted | Reset session |
| E010 | TTL_EXPIRED | Message expired | No |

Error Recovery Flows

```
on_delivery_error(msg, error):
  switch error.code:
    case E001, E002, E003, E004:
      // Transient error - schedule retry
      schedule_retry(msg)

    case E005:
      // Recipient not registered
      // May be new user, delay retry
      schedule_delayed_retry(msg, 1 hour)
      notify_user("Recipient may not have Zentalk yet")

    case E006:
      // Message too large - cannot recover
      move_to_dlq(msg, "message_too_large")
      notify_user("Message too large to send")

    case E007:
      // Rate limited - back off
      schedule_delayed_retry(msg, error.retry_after)

    case E008, E009:
      // Crypto error - session issue
      trigger_session_reset(msg.recipient)
      schedule_retry(msg)

    case E010:
      // Expired - remove
      outbox.remove(msg)
      notify_user("Message expired before delivery")
```

Implementation Examples

Sending a Message (TypeScript)

```typescript
interface Message {
  id: string;
  recipient: string;
  content: Uint8Array;
  ttl: number;
  priority: 'low' | 'normal' | 'high';
}

async function sendMessage(msg: Message): Promise<void> {
  // 1. Encrypt with Double Ratchet
  const session = await getOrCreateSession(msg.recipient);
  const encrypted = await session.encrypt(msg.content);

  // 2. Add to outbox
  await outbox.add({
    messageId: msg.id,
    recipient: msg.recipient,
    encryptedPayload: encrypted,
    priority: msg.priority,
    createdAt: Date.now(),
    expiresAt: Date.now() + msg.ttl * 1000,
    retryCount: 0,
    status: 'pending'
  });

  // 3. Trigger delivery (non-blocking)
  deliveryWorker.notify();
}
```

Processing Incoming Messages (TypeScript)

```typescript
interface IncomingMessage {
  messageId: string;
  sender: string;
  sessionId: string;
  encryptedPayload: Uint8Array;
  serverTimestamp: number;
}

async function processIncoming(msg: IncomingMessage): Promise<void> {
  // 1. Check deduplication cache
  const dedupKey = hash(msg.sender, msg.sessionId, msg.messageId);
  if (dedupCache.has(dedupKey)) {
    console.log('Duplicate message ignored');
    return;
  }

  // 2. Decrypt
  const session = await getSession(msg.sender, msg.sessionId);
  const decrypted = await session.decrypt(msg.encryptedPayload);

  // 3. Handle ordering
  if (decrypted.counter > session.expectedCounter) {
    // Buffer for reordering
    reorderBuffer.add(session.id, decrypted);
    return;
  }

  // 4. Deliver to UI
  await inbox.add(decrypted);
  eventEmitter.emit('newMessage', decrypted);

  // 5. Update dedup cache
  dedupCache.set(dedupKey, {
    receivedAt: Date.now(),
    counter: decrypted.counter
  });

  // 6. Send receipt ACK
  await sendAck(msg.sender, msg.messageId, 'RECEIPT_ACK');
}
```
