
Performance

Comprehensive performance analysis, benchmarks, and optimization strategies for the Zentalk protocol.


Performance Philosophy

Zentalk prioritizes security and privacy over raw performance, but extensive optimizations ensure the system remains highly responsive for end users.

Design Goals

| Priority | Goal | Tradeoff |
|---|---|---|
| 1 | Security | Never compromise cryptographic operations for speed |
| 2 | Privacy | Metadata protection is worth additional latency |
| 3 | Reliability | Redundancy over minimal resource usage |
| 4 | Responsiveness | User-perceived latency optimization |
| 5 | Efficiency | Resource optimization where possible |

Performance vs Privacy Tradeoffs

| Feature | Performance Impact | Privacy Benefit |
|---|---|---|
| 3-hop relay routing | +150-400ms latency | Hides sender/recipient |
| Traffic padding | +20% bandwidth | Prevents traffic analysis |
| Stealth addresses | Scanning overhead | Unlinkable payments |
| Double Ratchet | Per-message crypto ops | Forward secrecy |
| DHT lookup | O(log n) hops | Decentralized, censorship-resistant |

Latency Characteristics

End-to-End Message Latency

Measured latency from send button press to recipient notification:

| Percentile | Cold Start | Warm (Circuit Ready) | Notes |
|---|---|---|---|
| p50 | 1.2s | 350ms | Typical conditions |
| p75 | 1.8s | 520ms | Moderate network |
| p90 | 2.5s | 850ms | Congested network |
| p95 | 3.2s | 1.1s | Poor conditions |
| p99 | 5.5s | 2.2s | Worst case |

Cold start includes circuit building. Warm assumes established circuit.

Latency Breakdown by Component

| Component | Duration (p50) | Duration (p99) | Notes |
|---|---|---|---|
| **Client-Side Encryption** | | | |
| Double Ratchet key derivation | 0.5ms | 2ms | HKDF operations |
| AES-256-GCM encryption | 0.3ms | 1ms | Per message |
| Ed25519 signature | 0.1ms | 0.3ms | Message authentication |
| **Network Transport** | | | |
| Circuit build (if needed) | 800ms | 2.5s | 3 TLS handshakes |
| 3-hop relay routing | 150ms | 500ms | Sequential relays |
| DHT lookup | 200ms | 800ms | O(log n) hops |
| Mesh storage | 50ms | 200ms | Replication factor 3 |
| **Recipient Side** | | | |
| Message retrieval | 100ms | 400ms | From nearest replica |
| Decryption + verification | 1ms | 5ms | Reverse of the send path |
| **Total (warm)** | 350ms | 2.2s | Typical path |

Factors Affecting Latency

| Factor | Impact | Mitigation |
|---|---|---|
| Geographic distance | +50-200ms per hop | Regional relay selection |
| Network congestion | +100-500ms | Adaptive timeouts, parallel queries |
| DHT network size | O(log n) lookup | Caching, shortcut routing |
| Circuit health | Variable | Proactive circuit rotation |
| Time of day | +20-50% at peak hours | Load balancing |
| Mobile network | +100-300ms | Optimized packet sizes |

Latency Optimization Techniques

Circuit Pooling:

Maintain a pool of ready circuits:

- 3 guard circuits (persistent)
- 5 general circuits (rotating)
- 2 backup circuits (warm standby)

Result: circuit build latency is eliminated for 95%+ of messages.
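A pool along these lines can be sketched as follows. This is an illustrative sketch, not the Zentalk implementation; the class and field names (`Circuit`, `CircuitPool`, `replenish`, `acquire`) are invented for the example.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Circuit:
    """Placeholder for an established 3-hop circuit."""
    kind: str                                      # "guard", "general", or "backup"
    built_at: float = field(default_factory=time.monotonic)

class CircuitPool:
    """Keeps circuits warm so a send can skip the 800ms-2.5s build step."""
    TARGETS = {"guard": 3, "general": 5, "backup": 2}   # pool sizes from the list above

    def __init__(self):
        self.circuits = {kind: [] for kind in self.TARGETS}

    def replenish(self, build):
        # Top each category back up to its target size (run in the background).
        for kind, target in self.TARGETS.items():
            while len(self.circuits[kind]) < target:
                self.circuits[kind].append(build(kind))

    def acquire(self):
        # Prefer rotating general circuits, then warm standbys, then guards.
        for kind in ("general", "backup", "guard"):
            if self.circuits[kind]:
                return self.circuits[kind].pop()
        return None  # caller must build synchronously (cold start)

pool = CircuitPool()
pool.replenish(lambda kind: Circuit(kind))
```

After `replenish`, the pool holds 10 warm circuits and `acquire` never falls through to a cold build until the pool drains.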

Predictive Prefetching:

On app foreground:

1. Refresh the DHT routing table
2. Pre-build circuits to frequent contacts
3. Prefetch unread message pointers

Result: the first message send is always "warm".

Throughput Metrics

Messages Per Second Per Node

| Node Type | Messages/Second | Limiting Factor |
|---|---|---|
| Relay node (guard) | 5,000 | TLS termination |
| Relay node (middle) | 8,000 | Packet forwarding |
| Relay node (exit) | 4,000 | DHT operations |
| Storage node | 2,000 writes | Disk I/O |
| Storage node | 10,000 reads | Memory cache |
| Client (mobile) | 50 | Battery/CPU limits |
| Client (desktop) | 200 | Crypto operations |

Maximum Concurrent Connections

| Component | Connections | Memory Per Connection |
|---|---|---|
| Client WebSocket | 1-3 | ~50KB |
| Relay node | 10,000 | ~2KB |
| Storage node | 5,000 | ~5KB |
| DHT node | 500 peers | ~1KB |

Bandwidth Consumption

| Activity | Bandwidth | Notes |
|---|---|---|
| Idle (connected) | 1-2 KB/s | Keepalive + padding |
| Active chat | 5-20 KB/s | Depends on message rate |
| Media send (photo) | 100-500 KB/s | Chunked upload |
| Voice call | 30-50 KB/s | Opus codec |
| Video call (720p) | 500-1500 KB/s | VP9 codec |
| Background sync | 0.5-1 KB/s | Heartbeat only |

Traffic Padding Overhead:

| Mode | Overhead | Privacy Level |
|---|---|---|
| Minimal | +5% | Basic |
| Standard | +20% | Good |
| Maximum | +50% | High |
| Constant-rate | +100-300% | Maximum |

Cryptographic Performance

All benchmarks measured on reference hardware (Apple M2, Intel i7-12700).

X25519 Key Exchange

| Operation | M2 (ARM64) | i7-12700 (x86) | WebCrypto |
|---|---|---|---|
| Key generation | 25µs | 32µs | 45µs |
| DH computation | 28µs | 35µs | 50µs |
| Throughput | 35,000/s | 28,000/s | 20,000/s |

X3DH Full Exchange (4 DH operations):

| Operation | Duration | Notes |
|---|---|---|
| Fetch key bundle | 200-500ms | Network bound |
| 4x DH computation | 0.1ms | CPU bound |
| HKDF derivation | 0.02ms | Fast |
| Total (network) | 200-500ms | Dominated by the fetch |
| Total (local only) | 0.15ms | If keys are cached |

AES-256-GCM Encryption/Decryption

| Message Size | Encrypt | Decrypt | Throughput |
|---|---|---|---|
| 256 bytes | 0.02ms | 0.02ms | 12 MB/s |
| 1 KB | 0.03ms | 0.03ms | 30 MB/s |
| 4 KB | 0.05ms | 0.05ms | 75 MB/s |
| 64 KB | 0.3ms | 0.3ms | 200 MB/s |
| 1 MB | 4ms | 4ms | 250 MB/s |

Hardware AES-NI enabled. WebCrypto typically 2-3x slower.

Ed25519 Signing and Verification

| Operation | Duration | Throughput |
|---|---|---|
| Key generation | 30µs | 33,000/s |
| Sign (64 bytes) | 35µs | 28,000/s |
| Verify (64 bytes) | 70µs | 14,000/s |
| Batch verify (100) | 4ms | 25,000/s |

Signature size: 64 bytes (constant)

Kyber-768 (Post-Quantum) Operations

| Operation | Duration | Notes |
|---|---|---|
| Key generation | 50µs | Generate keypair |
| Encapsulation | 60µs | Sender operation |
| Decapsulation | 55µs | Recipient operation |
| Hybrid X3DH total | 0.3ms | X25519 + Kyber combined |

Key/Ciphertext Sizes:

| Component | Size |
|---|---|
| Kyber public key | 1,184 bytes |
| Kyber private key | 2,400 bytes |
| Kyber ciphertext | 1,088 bytes |
| Shared secret | 32 bytes |
| Overhead vs X25519-only | +2,272 bytes per session init |

Double Ratchet Operations

| Operation | Duration | Frequency |
|---|---|---|
| Symmetric ratchet (per message) | 0.05ms | Every message |
| DH ratchet (key rotation) | 0.1ms | Every reply |
| Full ratchet state serialization | 0.5ms | On persist |
| State deserialization | 0.3ms | On load |
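The per-message symmetric ratchet step is cheap because it is two HMAC invocations. A minimal sketch in the style of the Double Ratchet specification (the `0x01`/`0x02` input labels follow that spec's convention; this is not Zentalk's actual KDF):

```python
import hmac
import hashlib

def symmetric_ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """One KDF-chain step: derive a one-time message key and the next
    chain key from the current chain key via HMAC-SHA256."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key

# Each send/receive advances the chain; old chain keys are discarded,
# which is what provides forward secrecy.
ck = b"\x00" * 32
ck, mk1 = symmetric_ratchet_step(ck)
ck, mk2 = symmetric_ratchet_step(ck)
```

Because the step is one-way, compromising the current chain key reveals nothing about earlier message keys.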

Memory per Session:

| Component | Size |
|---|---|
| Ratchet state | ~500 bytes |
| Skipped message keys (max) | ~40 KB |
| Total per active session | ~1-50 KB |

Cryptographic Operation Summary

| Operation | Time | Operations/Second |
|---|---|---|
| Send message (warm session) | 0.5ms | 2,000 |
| Receive message | 0.6ms | 1,600 |
| New session (X3DH) | 0.15ms | 6,500 |
| New session (X3DH + Kyber) | 0.3ms | 3,300 |
| Group message (Sender Keys) | 0.3ms | 3,300 |

Scanning Performance

Stealth Address Scanning Rate

Scanning involves one ECDH computation per announcement to check ownership.

| Metric | Value | Notes |
|---|---|---|
| ECDH per announcement | 0.3ms | Core operation |
| Raw scan rate | 3,300/s | Single-threaded |
| With view tag optimization | 99% skip rate | First-byte check |
| Effective scan rate | 330,000/s | After view tag filter |
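The mechanism can be illustrated with a toy simulation: each announcement publishes a one-byte tag derived from the shared secret, so the scanner runs the expensive ownership check only when the cheap first-byte comparison matches (non-matching announcements slip through with probability 1/256). This sketch uses SHA-256 as a stand-in for the real ECDH-derived tag; all names (`view_tag`, `scan`, `full_check`) are illustrative.

```python
import hashlib

def view_tag(shared_secret: bytes) -> int:
    """One-byte view tag: the first byte of H(shared secret)."""
    return hashlib.sha256(shared_secret).digest()[0]

def scan(announcements, derive_secret, full_check):
    """Run the expensive ownership check only on tag matches."""
    full_checks = 0
    mine = []
    for ann in announcements:
        if ann["tag"] != view_tag(derive_secret(ann)):
            continue                       # cheap skip: ~255/256 of foreign traffic
        full_checks += 1                   # survivors need the full (0.3ms) check
        if full_check(ann):
            mine.append(ann)
    return mine, full_checks

# Toy data: one announcement addressed to us among 10,000.
secret_for = lambda ann: ann["nonce"]      # stands in for the ECDH output
anns = [{"nonce": i.to_bytes(4, "big")} for i in range(10_000)]
for ann in anns:
    # Foreign announcements carry effectively random tags.
    ann["tag"] = hashlib.sha256(ann["nonce"] + b"x").digest()[0]
ours = anns[42]
ours["tag"] = view_tag(secret_for(ours))   # our tag matches by construction

mine, full_checks = scan(anns, secret_for, lambda a: a is ours)
```

In expectation only ~40 of the 10,000 announcements (the true match plus ~1/256 false positives) reach the expensive check.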

View Tag Optimization Impact

| Announcements/Day | Without View Tag | With View Tag | Speedup |
|---|---|---|---|
| 10,000 | 3 seconds | 0.03 seconds | 100x |
| 100,000 | 30 seconds | 0.3 seconds | 100x |
| 1,000,000 | 5 minutes | 3 seconds | 100x |
| 10,000,000 | 50 minutes | 30 seconds | 100x |

Bloom Filter Efficiency

| Parameter | Value |
|---|---|
| Expected elements | 1,000,000 |
| False positive rate | 1% |
| Bits per element | 9.6 |
| Total filter size | 1.2 MB |
| Hash functions | 7 |
| Lookup time | 0.001ms |
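These parameters follow from the standard Bloom filter sizing formulas: for n elements and target false-positive rate p, the optimal bits per element are m/n = -ln(p)/ln(2)² and the optimal hash count is k = (m/n)·ln(2). Plugging in the table's values:

```python
import math

def bloom_params(n: int, p: float) -> tuple[float, int]:
    """Optimal Bloom filter sizing for n elements at false-positive rate p."""
    bits_per_elem = -math.log(p) / (math.log(2) ** 2)   # m/n
    k = round(bits_per_elem * math.log(2))              # optimal hash count
    return bits_per_elem, k

bits, k = bloom_params(1_000_000, 0.01)
total_mb = bits * 1_000_000 / 8 / 1e6
# bits ≈ 9.59 per element, k = 7 hashes, total ≈ 1.2 MB — matching the table
```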

Multi-stage Cascade:

| Stage | False Positive Rate | Cumulative |
|---|---|---|
| Stage 1 (coarse) | 10% | 10% |
| Stage 2 (fine) | 1% | 0.1% |
| Stage 3 (exact) | 0% | 0% |

Batch Scanning Benchmarks

| Batch Size | Duration | Announcements/Second |
|---|---|---|
| 1,000 | 0.003s | 333,000 |
| 10,000 | 0.03s | 333,000 |
| 100,000 | 0.3s | 333,000 |
| 1,000,000 | 3s | 333,000 |

With view tag optimization enabled, 4 worker threads.

Parallel Scanning Architecture

| Workers | Throughput | CPU Usage |
|---|---|---|
| 1 | 100,000/s | 25% |
| 2 | 190,000/s | 50% |
| 4 | 350,000/s | 90% |
| 8 | 400,000/s | 95% |

Diminishing returns beyond 4 workers due to memory bandwidth.


DHT Operations

Lookup Latency

| Network Size | Hops (O(log n)) | Latency (p50) | Latency (p99) |
|---|---|---|---|
| 1,000 nodes | 10 | 300ms | 800ms |
| 10,000 nodes | 14 | 400ms | 1.1s |
| 100,000 nodes | 17 | 500ms | 1.4s |
| 1,000,000 nodes | 20 | 600ms | 1.8s |

With alpha=3 parallel queries.
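The hop counts in the table are simply ⌈log₂ n⌉, the expected lookup depth of a Kademlia-style DHT with a full routing table:

```python
import math

def expected_hops(n: int) -> int:
    """Kademlia lookups halve the remaining keyspace distance per hop,
    so a lookup contacts about ceil(log2(n)) nodes end to end."""
    return math.ceil(math.log2(n))

hops = {n: expected_hops(n) for n in (1_000, 10_000, 100_000, 1_000_000)}
```

This reproduces the table's 10/14/17/20 hop figures exactly; real networks with stale routing entries may need a few extra hops.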

Storage Operations

| Operation | Latency (p50) | Latency (p99) | Notes |
|---|---|---|---|
| STORE | 150ms | 500ms | Write to k=3 nodes |
| FIND_VALUE (hit) | 100ms | 400ms | First node with data |
| FIND_VALUE (miss) | 300ms | 1s | Full lookup |
| FIND_NODE | 200ms | 600ms | Routing lookup |

Replication Overhead

| Replication Factor | Write Amplification | Storage Overhead | Availability |
|---|---|---|---|
| k=1 | 1x | 1x | 90% |
| k=3 | 3x | 3x | 99.9% |
| k=5 | 5x | 5x | 99.99% |
| k=7 | 7x | 7x | 99.999% |

Zentalk default: k=3 (optimal tradeoff)
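Under the idealized assumption of independent node failures with 90% per-node availability, data is unavailable only when all k replicas are down at once, giving 1 - (1 - a)^k. This reproduces the table's k=1 and k=3 figures; the table's higher-k values are more conservative, presumably to account for correlated failures.

```python
def availability(per_node: float, k: int) -> float:
    """Probability at least one of k replicas is up,
    assuming independent node failures."""
    return 1 - (1 - per_node) ** k

a1 = availability(0.90, 1)   # 90%
a3 = availability(0.90, 3)   # 99.9% — the k=3 default
```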

DHT Memory Usage

| Component | Per Node | Notes |
|---|---|---|
| Routing table | 256 KB | 256 buckets x 20 contacts x 50 bytes |
| Stored data | Variable | Depends on node role |
| Message cache | 10 MB | Recent announcements |
| Connection state | 1 KB/peer | Active connections |

Group Chat Scaling

Performance vs Group Size

| Group Size | Key Distribution | Message Encrypt | Message Decrypt |
|---|---|---|---|
| 10 | 9 E2EE sends | 0.3ms | 0.3ms |
| 50 | 49 E2EE sends | 0.3ms | 0.3ms |
| 100 | 99 E2EE sends | 0.3ms | 0.3ms |
| 500 | 499 E2EE sends | 0.3ms | 0.3ms |
| 1000 | 999 E2EE sends | 0.3ms | 0.3ms |

Key insight: Message encryption is O(1) regardless of group size due to Sender Keys.

Sender Keys Efficiency

| Metric | Pairwise | Sender Keys | Improvement |
|---|---|---|---|
| Encrypt ops per message (100 members) | 100 | 1 | 100x |
| Key material per member | O(n) | O(n) | Same |
| Message size | O(n) | O(1) | Linear |
| Bandwidth per message | O(n) | O(1) | Linear |
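The asymmetry can be made concrete with a small cost model: pairwise E2EE encrypts a copy per recipient, while Sender Keys pays a one-time O(n) distribution cost and then encrypts each message exactly once. The function and key names here are illustrative, not Zentalk APIs.

```python
def send_costs(n_members: int) -> dict:
    """Per-message and setup work for a group of n members,
    from the sender's perspective."""
    return {
        "pairwise_encrypts_per_msg": n_members - 1,  # one ciphertext per recipient
        "sender_keys_encrypts_per_msg": 1,           # one ciphertext for everyone
        "sender_keys_setup_sends": n_members - 1,    # one-time key distribution
    }

costs = send_costs(100)
```

The one-time distribution cost is why member churn (the rotation table below) is the expensive event, not messaging itself.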

Key Rotation Overhead

| Event | Operations | Latency | Notes |
|---|---|---|---|
| Member join | n key sends | 2-5s | Background |
| Member leave | n key regenerations | 3-10s | All keys rotate |
| Periodic rotation | n key sends | 2-5s | Every 1000 msgs |
| Device compromise | 1 key regeneration | 1s | Affected device only |

Group Chat Memory Footprint

| Group Size | Sender Keys Storage | Message Cache | Total |
|---|---|---|---|
| 10 | 3.2 KB | 100 KB | ~103 KB |
| 50 | 16 KB | 100 KB | ~116 KB |
| 100 | 32 KB | 100 KB | ~132 KB |
| 500 | 160 KB | 100 KB | ~260 KB |
| 1000 | 320 KB | 100 KB | ~420 KB |

Client Resource Usage

Memory Footprint

| Platform | Idle | Active Chat | Peak |
|---|---|---|---|
| iOS | 45 MB | 80 MB | 150 MB |
| Android | 50 MB | 90 MB | 180 MB |
| Desktop (Electron) | 120 MB | 200 MB | 400 MB |
| Web (Chrome) | 60 MB | 100 MB | 200 MB |

Memory Breakdown (Active):

| Component | Mobile | Desktop |
|---|---|---|
| WebAssembly runtime | 15 MB | 20 MB |
| Crypto state | 5 MB | 10 MB |
| Session cache | 10 MB | 30 MB |
| Message buffer | 20 MB | 50 MB |
| UI framework | 30 MB | 90 MB |

CPU Usage Patterns

| Activity | Mobile CPU | Desktop CPU | Duration |
|---|---|---|---|
| Idle | <1% | <1% | Continuous |
| Receiving message | 5-10% | 2-5% | 100ms |
| Sending message | 10-15% | 5-8% | 200ms |
| Voice call | 15-25% | 8-12% | Continuous |
| Video call | 30-50% | 15-25% | Continuous |
| Background sync | 2-5% | 1-3% | Periodic (1s) |
| Scanning (batch) | 50-80% | 30-50% | Depends on backlog |

Battery Impact (Mobile)

| Activity | mAh/hour | Relative to Idle |
|---|---|---|
| Idle (screen off) | 5-10 | 1x |
| Connected (foreground) | 50-80 | 8x |
| Active messaging | 100-150 | 15x |
| Voice call | 200-300 | 30x |
| Video call | 400-600 | 50x |

Optimization Strategies:

| Strategy | Battery Savings | Tradeoff |
|---|---|---|
| Push notifications | 80% idle reduction | Slight delay |
| Batch message fetch | 40% active reduction | Grouping delay |
| Adaptive polling | 30% reduction | Variable latency |
| Codec selection | 20% call reduction | Quality tradeoff |

Storage Requirements

| Data Type | Per Item | Typical Total | Notes |
|---|---|---|---|
| Message (text) | 0.5-2 KB | 50-200 MB | 100K messages |
| Message (with media ref) | 1-3 KB | 100-300 MB | Media stored separately |
| Media (photo) | 50-500 KB | 1-10 GB | Original quality |
| Media (thumbnail) | 5-20 KB | 50-200 MB | Preview cache |
| Session state | 1-50 KB | 1-10 MB | Per contact |
| DHT cache | N/A | 10-50 MB | Routing tables |
| Crypto keys | N/A | 1-5 MB | All key material |

Total Storage (Typical User):

| Usage Pattern | Storage | Notes |
|---|---|---|
| Light (text only) | 50-100 MB | Few contacts |
| Moderate | 200-500 MB | Regular usage |
| Heavy | 1-5 GB | Many groups, media |
| Power user | 5-20 GB | Full history, HD media |

Network Scalability

Nodes vs Throughput

| Network Size | Total Throughput | Per-Node Load | DHT Latency |
|---|---|---|---|
| 100 nodes | 50K msg/s | 500/s | 200ms |
| 1,000 nodes | 500K msg/s | 500/s | 300ms |
| 10,000 nodes | 5M msg/s | 500/s | 400ms |
| 100,000 nodes | 50M msg/s | 500/s | 500ms |
| 1,000,000 nodes | 500M msg/s | 500/s | 600ms |

Key property: Throughput scales linearly with network size.

Geographic Distribution Impact

| Configuration | Latency Impact | Availability |
|---|---|---|
| Single region | Baseline | 99% |
| Multi-region (3) | +20-50ms | 99.9% |
| Global (6+ regions) | +50-150ms | 99.99% |

Regional Relay Selection:

| User Location | Preferred Guard | Latency Benefit |
|---|---|---|
| EU | EU guard | -100ms vs US |
| US-East | US-East guard | -50ms vs US-West |
| Asia | Asia guard | -200ms vs EU |

Bottleneck Analysis

| Component | Bottleneck Type | Limit | Mitigation |
|---|---|---|---|
| DHT lookup | Latency | O(log n) | Caching, shortcuts |
| Relay bandwidth | Throughput | 10 Gbps typical | More relays |
| Storage nodes | IOPS | 10K writes/s | SSD, caching |
| Bootstrap | Connection | 10K concurrent | Multiple bootstrap nodes |
| Key server | Queries | 100K/s | DHT distribution |

Horizontal Scaling Strategy

Scaling approach for each component:

DHT Layer:
- Add nodes → automatic load distribution
- No coordination needed
- Linear scaling

Relay Layer:
- Add relays → more circuit capacity
- Geographic distribution for latency
- Linear scaling

Storage Layer:
- Add storage nodes → more capacity
- Replication handles hot data
- Linear scaling

Result: all layers scale horizontally without a single bottleneck.

Optimization Strategies

Caching Strategies

| Cache Type | Hit Rate | Latency Savings | Memory |
|---|---|---|---|
| DHT routing table | 60% | 200-500ms | 256 KB |
| Key bundle cache | 90% | 200-500ms | 10 MB |
| Circuit cache | 95% | 500-2000ms | 1 MB |
| Message dedup cache | 99.9% | N/A | 10 MB |
| Session cache | 99% | 1-5ms | 5 MB |
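The expected latency of a cached operation is the hit-rate-weighted average of the hit and miss costs. An illustrative calculation using the circuit-cache row (assuming ~0ms on hit and a 1250ms midpoint of the 500-2000ms rebuild range — both assumptions, not measured figures):

```python
def effective_latency(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    """Expected latency of a lookup backed by a cache."""
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

# 95% of sends reuse a circuit; only 5% pay the rebuild cost.
warm = effective_latency(0.95, 0.0, 1250.0)   # 62.5ms expected build cost per send
```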

Cache Invalidation:

| Cache | TTL | Invalidation Trigger |
|---|---|---|
| Key bundle | 1 hour | Safety number change |
| DHT entry | 24 hours | Republish |
| Circuit | 10 minutes | Error or rotation |
| Message | 7 days | LRU eviction |

Batch Processing

| Operation | Individual | Batched | Improvement |
|---|---|---|---|
| DHT store (10 items) | 1.5s | 0.3s | 5x |
| Message fetch (100) | 10s | 1s | 10x |
| Signature verify (100) | 7ms | 4ms | 1.75x |
| Key bundle fetch (10) | 5s | 0.8s | 6x |

Batch Sizes:

| Operation | Optimal Batch | Max Batch |
|---|---|---|
| Message send | 10 | 50 |
| DHT query | 5 | 20 |
| Signature verify | 100 | 1000 |
| Scanning | 10,000 | 100,000 |

Lazy Loading

| Resource | Load Trigger | Memory Savings |
|---|---|---|
| Message history | Scroll to date | 80% |
| Media thumbnails | Viewport entry | 60% |
| Contact avatars | First view | 40% |
| Session state | First message | 70% |
| Group metadata | Group open | 50% |

Connection Pooling

| Pool Type | Size | Reuse Rate | Latency Savings |
|---|---|---|---|
| WebSocket | 3 | 99% | 100-300ms |
| Circuit | 8 | 95% | 500-2000ms |
| DHT peer | 50 | 80% | 50-200ms |
| Storage node | 10 | 90% | 100-300ms |

Pool Management:

Connection pool strategy:

- Minimum connections: 3 (always ready)
- Maximum connections: 20 (prevent resource exhaustion)
- Idle timeout: 60s (balance freshness vs overhead)
- Health check: 10s (detect failures)
- Warm-up: on app start, build the minimum pool

Benchmarking Methodology

Test Environment

Reference Hardware:

| Component | Specification |
|---|---|
| CPU (ARM) | Apple M2, 8 cores |
| CPU (x86) | Intel i7-12700, 12 cores |
| Memory | 16 GB |
| Storage | NVMe SSD |
| Network | 1 Gbps symmetric |

Mobile Reference:

| Device | Specification |
|---|---|
| iOS | iPhone 14, A15 Bionic |
| Android | Pixel 7, Tensor G2 |

Measurement Tools

| Tool | Purpose | Metrics |
|---|---|---|
| perf / Instruments | CPU profiling | Cycles, cache misses |
| Heaptrack | Memory profiling | Allocations, leaks |
| Wireshark | Network analysis | Packets, latency |
| Custom harness | End-to-end timing | User-perceived latency |
| Prometheus | Production metrics | Real-world performance |

Benchmark Categories

| Category | What It Measures | How Often |
|---|---|---|
| Microbenchmarks | Individual operations | Every commit |
| Integration benchmarks | Component interactions | Daily |
| System benchmarks | Full message flow | Weekly |
| Load tests | Scalability limits | Pre-release |
| Chaos tests | Failure resilience | Monthly |

Reproducibility Requirements

| Requirement | Implementation |
|---|---|
| Isolated environment | Docker containers |
| Controlled network | tc (traffic control) |
| Fixed random seeds | Deterministic tests |
| Warm-up period | Discard first 1,000 ops |
| Statistical significance | 10,000+ iterations |
| Multiple runs | 5 runs, report median |
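A minimal harness implementing these requirements (warm-up discard, large iteration counts, median of multiple runs) might look like the following sketch; the defaults mirror the table, and `bench` is an illustrative name, not a tool from the Zentalk repository.

```python
import statistics
import time

def bench(op, *, warmup: int = 1_000, iters: int = 10_000, runs: int = 5) -> float:
    """Microbenchmark `op`: discard warm-up iterations, time a batch,
    repeat for several runs, and report the median per-op cost in µs."""
    per_run = []
    for _ in range(runs):
        for _ in range(warmup):            # discard JIT/cache warm-up effects
            op()
        start = time.perf_counter()
        for _ in range(iters):
            op()
        elapsed = time.perf_counter() - start
        per_run.append(elapsed / iters * 1e6)   # µs per operation
    return statistics.median(per_run)           # median damps outlier runs

# Example: cost of a no-op call (reduced counts so the demo runs quickly).
us_per_op = bench(lambda: None, warmup=10, iters=1_000, runs=3)
```

Reporting the median rather than the mean is what makes the result robust to the GC-pause and frequency-scaling caveats listed below.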

Benchmark Caveats

| Caveat | Impact | Mitigation |
|---|---|---|
| Network variability | ±50% latency | Controlled test network |
| CPU frequency scaling | ±20% throughput | Fixed frequency |
| GC pauses | ±30% p99 | Report percentiles |
| JIT warm-up | First run slower | Warm-up iterations |
| Real-world load | Different patterns | Production monitoring |

Performance Monitoring

Key Metrics to Track

| Metric | Target | Alert Threshold |
|---|---|---|
| Message latency (p50) | <500ms | >1s |
| Message latency (p99) | <3s | >5s |
| Circuit build time | <2s | >5s |
| DHT lookup time | <1s | >3s |
| Encryption throughput | >1000/s | <500/s |
| Memory usage (mobile) | <100MB | >200MB |
| Battery drain (idle) | <5mAh/h | >20mAh/h |

Performance Regression Detection

Regression detection pipeline:

1. Run the benchmark suite on every PR
2. Compare against the baseline (main branch)
3. Flag if:
   - p50 regresses by >10%
   - p99 regresses by >20%
   - Memory increases by >15%
   - Any metric exceeds its absolute threshold
4. Require explicit approval for performance regressions
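The per-metric comparison step can be sketched as a simple threshold check (the metric names and sample numbers are illustrative; the relative thresholds are the ones listed above):

```python
def check_regression(baseline: dict, candidate: dict) -> list[str]:
    """Return the metrics where the candidate exceeds its allowed
    relative regression over the baseline. Lower is better for all
    metrics here."""
    limits = {"p50_ms": 0.10, "p99_ms": 0.20, "memory_mb": 0.15}
    flags = []
    for metric, allowed in limits.items():
        if candidate[metric] > baseline[metric] * (1 + allowed):
            flags.append(metric)
    return flags

flags = check_regression(
    {"p50_ms": 350, "p99_ms": 2200, "memory_mb": 90},   # main branch
    {"p50_ms": 400, "p99_ms": 2300, "memory_mb": 95},   # PR under test
)
# p50 grew 14% (>10% limit) and gets flagged; p99 and memory stay within bounds.
```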

Production Monitoring

| Data Point | Collection Method | Retention |
|---|---|---|
| Client-side latency | In-app telemetry | 30 days |
| Server-side metrics | Prometheus | 90 days |
| Error rates | Sentry | 90 days |
| Network topology | DHT crawl | 7 days |
| Bandwidth usage | Flow logs | 30 days |
