Traffic Analysis Protection
Comprehensive documentation on Zentalk’s defenses against traffic analysis attacks.
Overview
Even when message contents are encrypted, an adversary observing network traffic can extract significant intelligence from metadata: who communicates with whom, when, how often, and how much data is exchanged. Zentalk layers several traffic analysis defenses to resist this kind of metadata correlation.
The Threat Landscape
What Traffic Analysis Reveals
Without protection, network observers can determine:
| Observable | Intelligence Extracted |
|---|---|
| IP addresses | Physical location, ISP, organization |
| Timing | Communication patterns, work hours, time zones |
| Message size | Type of content (text, image, file) |
| Frequency | Relationship strength, urgency |
| Direction | Initiator vs responder |
| Session duration | Conversation length, engagement |
Adversary Capabilities
| Adversary Type | Observation Point | Threat Level |
|---|---|---|
| Local network admin | LAN traffic | High |
| ISP | All user traffic | High |
| Nation-state | Internet backbone | Critical |
| Malicious relay | Single hop | Medium |
| Colluding relays | Multiple hops | High |
Timing Attacks
Timing attacks correlate message sending and receiving times:

    Attack Scenario:
      Observer watches Alice's outbound traffic
      Observer watches Bob's inbound traffic
      If message leaves Alice at T1
      And message arrives at Bob at T1 + network_delay
      Then: Alice → Bob correlation established

| Attack Variant | Description |
|---|---|
| Direct timing | Correlate send/receive with network delay |
| Statistical timing | Aggregate timing patterns over time |
| Burst correlation | Match traffic bursts across endpoints |
| Idle period analysis | Correlate quiet periods |
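
To make the direct-timing variant concrete, the following is a minimal sketch of how an observer might score two traffic logs against each other. The function name, the delay window, and the sample timestamps are illustrative assumptions; they are not taken from Zentalk or from any real analysis tool.

```python
from bisect import bisect_left, bisect_right


def timing_correlation_score(send_times, recv_times, min_delay, max_delay):
    """Fraction of sends that are followed by at least one receive
    inside the window [send + min_delay, send + max_delay]."""
    if not send_times:
        return 0.0
    recv_sorted = sorted(recv_times)
    matched = 0
    for t in send_times:
        lo = bisect_left(recv_sorted, t + min_delay)
        hi = bisect_right(recv_sorted, t + max_delay)
        if hi > lo:  # at least one receive falls inside the window
            matched += 1
    return matched / len(send_times)


# Hypothetical observations, in seconds.
alice_sends = [0.00, 1.30, 2.75, 9.10]
bob_receives = [0.08, 1.39, 2.83, 9.18]    # tracks Alice closely
carol_receives = [0.50, 4.00, 6.20, 7.70]  # unrelated traffic

bob_score = timing_correlation_score(alice_sends, bob_receives, 0.05, 0.15)
carol_score = timing_correlation_score(alice_sends, carol_receives, 0.05, 0.15)
print(bob_score, carol_score)
```

A high score for one candidate and a low score for the others is exactly the signal that constant-rate padding and random delays are meant to wash out.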
Volume Analysis
Message sizes reveal content types:
| Size Range | Likely Content |
|---|---|
| < 256 bytes | Short text message |
| 256 bytes - 1 KB | Long text, metadata |
| 1 KB - 100 KB | Compressed image |
| 100 KB - 10 MB | High-res image, audio |
| > 10 MB | Video, file transfer |
Communication Pattern Analysis
Long-term observation reveals:
| Pattern | Intelligence |
|---|---|
| Consistent daily timing | Work schedule, time zone |
| Weekly patterns | Work vs personal contacts |
| Response latency | Relationship closeness |
| Burst conversations | Real-time chat vs async |
| Silence periods | Sleep schedule, travel |
Traffic Padding
Constant-Rate Dummy Traffic
Zentalk maintains constant traffic flow regardless of actual communication:
| Parameter | Value | Purpose |
|---|---|---|
| Base rate | 2 cells/second | Minimum constant traffic |
| Active rate | 10 cells/second | During active conversation |
| Idle rate | 1 cell/second | When app backgrounded |
| Burst absorption | Queue depth 50 | Smooth traffic spikes |
Padding Frequency

    Traffic Generation Algorithm:
      Every 500ms:
        If real_message_queue not empty:
          Send real_message from queue
        Else:
          Send dummy_message
      Maintains: Constant 2 messages/second minimum

| Mode | Interval | Cells/Second | Bandwidth |
|---|---|---|---|
| Low power | 1000ms | 1 | ~0.5 KB/s |
| Normal | 500ms | 2 | ~1 KB/s |
| Active | 100ms | 10 | ~5 KB/s |
| Maximum privacy | 50ms | 20 | ~10 KB/s |
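
The generation loop can be sketched as a tick-driven sender that emits exactly one cell per interval. This is a simulation under stated assumptions, not Zentalk's implementation: the class and function names are hypothetical, the queue depth of 50 comes from the burst absorption parameter, and rejecting messages when the queue is full is an assumed policy the source does not specify.

```python
from collections import deque


class ConstantRateSender:
    """Emits one cell per tick: a queued real message if any, else a dummy."""

    def __init__(self, interval_ms=500, max_queue=50):
        self.interval_ms = interval_ms  # 500 ms corresponds to "Normal" mode
        self.max_queue = max_queue      # burst absorption depth
        self.queue = deque()

    def enqueue(self, message):
        """Queue a real message; returns False if the queue is full (assumed policy)."""
        if len(self.queue) >= self.max_queue:
            return False
        self.queue.append(message)
        return True

    def tick(self):
        """Called once per interval. The 'real'/'dummy' label is local
        bookkeeping only; on the wire both are encrypted fixed-size cells."""
        if self.queue:
            return ("real", self.queue.popleft())
        return ("dummy", None)


def simulate(sender, arrivals, ticks):
    """arrivals maps a tick index to the messages queued just before that tick."""
    sent = []
    for i in range(ticks):
        for message in arrivals.get(i, []):
            sender.enqueue(message)
        sent.append(sender.tick())
    return sent


sender = ConstantRateSender()
timeline = simulate(sender, {1: ["hi", "there"]}, ticks=4)
print([kind for kind, _ in timeline])
```

Two messages queued together leave on consecutive ticks rather than as a burst, and the idle ticks on either side are filled with dummies, so the observer sees one cell per interval throughout.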
Dummy Message Format
Dummy messages are cryptographically indistinguishable from real messages:
| Property | Real Message | Dummy Message |
|---|---|---|
| Size | 542 bytes | 542 bytes |
| Encryption | AES-256-GCM | AES-256-GCM |
| Header format | Identical | Identical |
| Nonce | Random 12 bytes | Random 12 bytes |
| Auth tag | Valid 16 bytes | Valid 16 bytes |

    Dummy Message Construction:
      1. Generate random payload (509 bytes)
      2. Set cell command to PADDING (0x07)
      3. Encrypt with current circuit keys
      4. Add valid authentication tag
      Result: Indistinguishable from RELAY cell to observer

Bandwidth Overhead
| Privacy Level | Real Traffic | Padding Overhead | Total |
|---|---|---|---|
| Minimum | Variable | +50% | 1.5x |
| Normal | Variable | +100% | 2x |
| High | Variable | +200% | 3x |
| Maximum | Variable | +400% | 5x |
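
The table gives overhead per privacy level without stating how it is derived. One plausible reading, assumed here rather than taken from the Zentalk source, is that overhead is the ratio of dummy cells to real cells on a constant-rate link. The function below is a hypothetical helper that works through that arithmetic.

```python
def padding_overhead(cell_rate, real_rate):
    """Return (overhead, total_multiplier) for a link that always sends
    cell_rate cells/second, of which real_rate carry real data.

    overhead is dummy cells per real cell, so 1.0 means +100%.
    """
    if real_rate <= 0:
        raise ValueError("overhead is undefined without real traffic")
    if real_rate > cell_rate:
        raise ValueError("real traffic exceeds the constant rate")
    dummy_rate = cell_rate - real_rate
    overhead = dummy_rate / real_rate
    return overhead, 1 + overhead


print(padding_overhead(2, 1))   # one real cell out of every two sent
print(padding_overhead(10, 2))  # two real cells out of every ten sent
```

Under this reading, the same constant rate costs more relative overhead the less the user actually sends, which is why the real traffic column is listed as variable.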
Message Size Padding
Fixed-Size Buckets
All messages are padded to predetermined sizes:
| Bucket | Size | Typical Content |
|---|---|---|
| Tiny | 256 bytes | Text ≤ 200 chars |
| Small | 1 KB | Text ≤ 900 chars |
| Medium | 4 KB | Formatted text, small data |
| Large | 16 KB | Images (thumbnail) |
| XLarge | 64 KB | Images (preview) |
| Jumbo | 256 KB | Full images |
| Stream | 1 MB chunks | Video, large files |
Bucket Selection Algorithm

    Select_Bucket(message_size):
      buckets = [256, 1024, 4096, 16384, 65536, 262144, 1048576]
      For each bucket in buckets:
        If message_size ≤ bucket:
          Return bucket
      Return STREAM_MODE (chunked)

Padding Implementation
| Component | Implementation |
|---|---|
| Padding bytes | Cryptographically random |
| Length field | Encrypted, authenticated |
| Padding position | End of message |
| Verification | HMAC over original length |

    Pad_Message(plaintext):
      // Reserve the 4-byte length field when choosing the bucket,
      // so padding_needed can never be negative
      target_size = Select_Bucket(length(plaintext) + 4)
      padding_needed = target_size - length(plaintext) - 4
      padding = SecureRandom(padding_needed)
      length_field = Encode_U32(length(plaintext))
      Return plaintext || padding || length_field

Size Correlation Prevention
| Attack | Without Padding | With Padding |
|---|---|---|
| Message type inference | High accuracy | Bucket-level only |
| Conversation matching | Possible | Infeasible |
| User identification | By message patterns | Indistinguishable |
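
The bucket selection and padding routines in this section can be sketched in Python as follows. This is a minimal illustration, not Zentalk's code: the big-endian encoding of the length field, the `unpad_message` helper, and the choice to size the bucket for plaintext plus the 4-byte length field are assumptions. In the described design the padded buffer would then be encrypted, so the length field is never visible on the wire.

```python
import secrets
import struct

BUCKETS = [256, 1024, 4096, 16384, 65536, 262144, 1048576]
LENGTH_FIELD_SIZE = 4  # trailing unsigned 32-bit length


def select_bucket(size):
    """Smallest bucket that fits `size` bytes, or None for stream mode."""
    for bucket in BUCKETS:
        if size <= bucket:
            return bucket
    return None


def pad_message(plaintext: bytes) -> bytes:
    """Pad to a bucket size: plaintext || random padding || length field."""
    needed = len(plaintext) + LENGTH_FIELD_SIZE
    target = select_bucket(needed)
    if target is None:
        raise ValueError("too large for a single bucket; use chunked stream mode")
    padding = secrets.token_bytes(target - needed)
    # Big-endian byte order is an assumption; the source only says Encode_U32.
    return plaintext + padding + struct.pack(">I", len(plaintext))


def unpad_message(padded: bytes) -> bytes:
    """Recover the plaintext using the trailing length field."""
    if len(padded) not in BUCKETS:
        raise ValueError("padded message is not a valid bucket size")
    (length,) = struct.unpack(">I", padded[-LENGTH_FIELD_SIZE:])
    if length > len(padded) - LENGTH_FIELD_SIZE:
        raise ValueError("corrupt length field")
    return padded[:length]


padded = pad_message(b"hello")
print(len(padded), unpad_message(padded))
```

Sizing the bucket for the plaintext plus the length field matters at the boundaries: a 253-byte message no longer fits the 256-byte bucket once the length field is counted, so it moves up to 1 KB.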
Timing Obfuscation
Random Delay Injection
Messages experience intentional random delays:
| Parameter | Value | Purpose |
|---|---|---|
| Minimum delay | 0 ms | Preserve usability |
| Maximum delay | 500 ms | Bound latency |
| Distribution | Exponential | Natural traffic pattern |
| Mean delay | 100 ms | Balance privacy/latency |
Delay Distribution Analysis
| Distribution | Pros | Cons | Used For |
|---|---|---|---|
| Uniform | Simple | Detectable pattern | Not used |
| Exponential | Natural, unbounded tail | Variable latency | Default |
| Gaussian | Concentrated around the mean, predictable | Truncation artifacts | High-priority |
| Laplacian | Privacy-optimal | Implementation complexity | Research |
Exponential Delay Implementation

    Generate_Delay():
      lambda = 1 / mean_delay           // mean_delay = 100ms
      u = SecureRandom_Float(0, 1)      // must exclude 0, since ln(0) is undefined
      delay = -ln(u) / lambda
      Return min(delay, max_delay)      // Cap at 500ms

Latency vs Privacy Tradeoff
| Setting | Mean Delay | Max Delay | Privacy | UX Impact |
|---|---|---|---|---|
| Realtime | 10 ms | 50 ms | Low | None |
| Balanced | 100 ms | 500 ms | Medium | Minimal |
| Private | 300 ms | 2000 ms | High | Noticeable |
| Maximum | 1000 ms | 5000 ms | Very High | Significant |
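
The capped exponential delay can be sampled by inverse transform, as sketched below. The preset values come from the table above; the function name, the `PRESETS` dictionary, and the use of `secrets.SystemRandom` as the random source are assumptions made for illustration.

```python
import math
import secrets

# (mean_ms, max_ms) per setting, taken from the tradeoff table.
PRESETS = {
    "realtime": (10, 50),
    "balanced": (100, 500),
    "private": (300, 2000),
    "maximum": (1000, 5000),
}

_rng = secrets.SystemRandom()  # OS-backed CSPRNG


def generate_delay_ms(mean_ms, max_ms, rng=_rng):
    """Sample an exponential delay with the given mean, capped at max_ms."""
    u = rng.random()  # uniform in [0.0, 1.0)
    # Use 1 - u, which lies in (0.0, 1.0], so the logarithm is always finite.
    delay = -mean_ms * math.log(1.0 - u)
    return min(delay, max_ms)


samples = [generate_delay_ms(*PRESETS["balanced"]) for _ in range(20000)]
print(min(samples), max(samples), sum(samples) / len(samples))
```

Because of the cap, the observed mean sits slightly below the nominal mean, and a small fraction of samples land exactly on the maximum.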
Cover Traffic Generation
Cover Traffic Principles
Cover traffic is designed so that observers cannot distinguish real communication from filler:
| Requirement | Implementation |
|---|---|
| Indistinguishability | Same encryption, size, timing |
| Unpredictability | Random destinations |
| Persistence | Continuous, not burst |
| Authenticity | Valid circuit traversal |
When Cover Traffic Is Sent

    Cover Traffic Decision:
      If time_since_last_send ≥ padding_interval:
        If real_message_available:
          Send real_message
        Else:
          Generate cover_traffic
          Send to random_destination

| Trigger | Action |
|---|---|
| Idle timeout | Generate cover message |
| Below rate threshold | Fill with cover traffic |
| App backgrounded | Reduced cover traffic |
| App foregrounded | Resume full rate |
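
The decision rule and the foreground/background triggers can be combined into a single function, sketched here. The function name and return values are hypothetical, and the two intervals are assumptions that reuse the 500 ms normal interval and the 1 cell/second idle rate given earlier in this document.

```python
from collections import deque

FOREGROUND_INTERVAL_MS = 500   # assumed: the "Normal" padding interval
BACKGROUND_INTERVAL_MS = 1000  # assumed: idle rate of 1 cell/second


def next_action(now_ms, last_send_ms, queue, backgrounded=False):
    """Decide what to do at time now_ms.

    Returns ("wait", None), ("send_real", message) or ("send_cover", None).
    Picking the cover destination is left to the caller.
    """
    interval = BACKGROUND_INTERVAL_MS if backgrounded else FOREGROUND_INTERVAL_MS
    if now_ms - last_send_ms < interval:
        return ("wait", None)
    if queue:
        return ("send_real", queue.popleft())
    return ("send_cover", None)


queue = deque(["m1"])
print(next_action(400, 0, queue))     # interval not yet elapsed
print(next_action(500, 0, queue))     # real message takes the slot
print(next_action(1000, 500, queue))  # nothing queued, so cover fills the slot
```

Real messages never trigger a send on their own; they only occupy a slot that would otherwise have carried cover traffic, which keeps the send schedule independent of user activity.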
Distinguishing Real from Cover
From an observer's perspective, real and cover messages cannot be told apart:
| Property | Real Message | Cover Message |
|---|---|---|
| Source | Your device | Your device |
| Destination | Real recipient circuit | Random circuit |
| Size | 542 bytes | 542 bytes |
| Encryption | AES-256-GCM | AES-256-GCM |
| Timing | Within padding window | Within padding window |
| Path | 3-hop circuit | 3-hop circuit |
Cover Traffic Destinations
| Destination Type | Percentage | Purpose |
|---|---|---|
| Known contacts | 30% | Hide active conversations |
| Random nodes | 50% | Prevent destination analysis |
| Loopback circuits | 20% | Prevent volume analysis |
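
The 30/50/20 split in the table can be sampled with a single uniform draw. The sketch below is illustrative only: the function name, the tuple return format, and the fallback when the contact or node list is empty are assumptions not stated in the source.

```python
import secrets

_rng = secrets.SystemRandom()  # OS-backed CSPRNG


def select_cover_destination(contacts, nodes, rng=_rng):
    """Pick a cover destination: 30% known contact, 50% random node,
    20% loopback. Falls through to the next category if a list is empty
    (assumed behaviour)."""
    r = rng.random()
    if r < 0.3 and contacts:
        return ("contact", rng.choice(contacts))
    if r < 0.8 and nodes:
        return ("node", rng.choice(nodes))
    return ("loopback", None)


# Hypothetical identifiers, for demonstration only.
picks = [select_cover_destination(["bob", "carol"], ["n1", "n2", "n3"])[0]
         for _ in range(10000)]
print({kind: picks.count(kind) for kind in ("contact", "node", "loopback")})
```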

    Select_Cover_Destination():
      r = SecureRandom_Float(0, 1)
      If r < 0.3:
        Return random_contact()
      Else If r < 0.8:
        Return random_node()
      Else:
        Return self_loopback()

3-Hop Relay Integration
Per-Hop Padding
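Each hop in the circuit accounts for a fixed 16 bytes of header overhead (48 bytes in total, matching the onion header figure under Overhead Breakdown):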
| Hop | Padding Added | Purpose |
|---|---|---|
| Guard | +16 bytes overhead | Circuit ID, command |
| Middle | +16 bytes overhead | Relay header |
| Exit | +16 bytes overhead | Destination header |
Circuit-Level Timing Protection

    Per-Hop Delay:
      Guard Node:
        delay = Exponential(mean=30ms)
        Apply delay before forwarding
      Middle Node:
        delay = Exponential(mean=30ms)
        Apply delay before forwarding
      Exit Node:
        delay = Exponential(mean=30ms)
        Apply delay before delivery
      Total added latency: ~90ms mean

| Protection Layer | Mechanism |
|---|---|
| Entry timing | Guard adds random delay |
| Transit timing | Middle adds random delay |
| Exit timing | Exit adds random delay |
| Aggregate | 3 independent delay sources |
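
A short simulation shows how the three independent per-hop delays add up. It is a sketch for checking the arithmetic only: the names are hypothetical, and the seeded `random.Random` is used purely to make the run repeatable, whereas a real relay would draw delays from a cryptographically secure source.

```python
import random

HOP_MEAN_MS = 30.0
HOPS = ("guard", "middle", "exit")


def circuit_delay_ms(rng):
    """Total added latency: one independent exponential delay per hop."""
    return sum(rng.expovariate(1.0 / HOP_MEAN_MS) for _ in HOPS)


rng = random.Random(1234)  # seeded for repeatability, not for security
delays = [circuit_delay_ms(rng) for _ in range(50_000)]
mean_delay = sum(delays) / len(delays)
print(f"mean added latency: {mean_delay:.1f} ms")
```

The sum of three independent exponentials with a 30 ms mean follows an Erlang distribution with a mean of 90 ms and a standard deviation of about 52 ms, so individual messages can see noticeably more or less than the average.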
Multi-Path Routing for Large Messages
Large files are split across multiple circuits:
| File Size | Circuits Used | Chunks Per Circuit |
|---|---|---|
| ≤ 256 KB | 1 | All |
| 256 KB - 1 MB | 2 | Interleaved |
| 1 MB - 10 MB | 4 | Round-robin |
| > 10 MB | 8 | Parallel streams |

    Multi-Path Chunking:
      chunks = Split_File(file, chunk_size=64KB)
      circuits = Build_Circuits(count=4)     // 4 circuits: file between 1 MB and 10 MB
      For i, chunk in enumerate(chunks):
        circuit = circuits[i mod length(circuits)]
        Send(chunk, circuit)
      Reassemble at destination using sequence numbers

Benefits of Multi-Path
| Benefit | Description |
|---|---|
| Volume obfuscation | Single circuit doesn’t show total size |
| Timing distribution | Parallel paths prevent timing signature |
| Resilience | Partial delivery on circuit failure |
| Bandwidth | Aggregate throughput of all paths |
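
The chunking scheme can be sketched end to end, including reassembly by sequence number. The function names are hypothetical, circuits are modelled as plain lists rather than network connections, and the treatment of sizes that fall exactly on a boundary (for example, exactly 1 MB) is an assumption, since the table does not say which side is inclusive.

```python
KB = 1024
MB = 1024 * KB
CHUNK_SIZE = 64 * KB


def circuits_for_size(size):
    """Number of circuits for a file of `size` bytes, per the table."""
    if size <= 256 * KB:
        return 1
    if size <= 1 * MB:
        return 2
    if size <= 10 * MB:
        return 4
    return 8


def split_across_circuits(data: bytes):
    """Round-robin 64 KB chunks over the circuits.

    Returns one list per circuit, each holding (sequence_number, chunk)."""
    count = circuits_for_size(len(data))
    lanes = [[] for _ in range(count)]
    for seq, start in enumerate(range(0, len(data), CHUNK_SIZE)):
        lanes[seq % count].append((seq, data[start:start + CHUNK_SIZE]))
    return lanes


def reassemble(lanes):
    """Merge chunks from all circuits back into order by sequence number."""
    chunks = sorted((item for lane in lanes for item in lane),
                    key=lambda item: item[0])
    return b"".join(payload for _, payload in chunks)


data = bytes(range(256)) * 8192  # 2 MiB of sample data
lanes = split_across_circuits(data)
print(len(lanes), [len(lane) for lane in lanes], reassemble(lanes) == data)
```

Each circuit carries only a quarter of the 32 chunks here, so an observer on any single path sees 512 KB rather than the full 2 MiB.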
Bandwidth Considerations
Overhead Breakdown
| Component | Overhead | Justification |
|---|---|---|
| Onion headers | 48 bytes/message | 16 bytes × 3 hops |
| Size padding | 0% to ~300% for messages over 256 bytes | Bucket rounding (buckets are spaced 4× apart) |
| Traffic padding | 50-400% | Constant rate maintenance |
| Delay buffers | Variable | Timing obfuscation queuing |
Total Bandwidth Multiplier
| Privacy Mode | Multiplier | Effective Bandwidth |
|---|---|---|
| Economy | 1.5x | 67% of raw |
| Standard | 2.5x | 40% of raw |
| Enhanced | 4x | 25% of raw |
| Maximum | 6x | 17% of raw |
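
The effective bandwidth column is simply the reciprocal of the multiplier. The helper names below are hypothetical; the multipliers are the ones in the table.

```python
MULTIPLIERS = {"economy": 1.5, "standard": 2.5, "enhanced": 4.0, "maximum": 6.0}


def effective_fraction(multiplier):
    """Share of the raw link left for real data at a given multiplier."""
    return 1.0 / multiplier


def effective_rate(raw_kbps, mode):
    """Usable throughput in kbps for a raw link speed and privacy mode."""
    return raw_kbps / MULTIPLIERS[mode]


for mode, multiplier in MULTIPLIERS.items():
    print(f"{mode}: {effective_fraction(multiplier):.0%} of raw")
```

For example, a 1000 kbps link in enhanced mode leaves 250 kbps for real data.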
User-Configurable Padding Levels
| Setting | Constant Rate | Size Buckets | Timing Delay |
|---|---|---|---|
| Off | No padding | Minimal | None |
| Low | 1 msg/sec | 4 buckets | 50ms mean |
| Medium | 2 msg/sec | 7 buckets | 100ms mean |
| High | 5 msg/sec | 7 buckets | 200ms mean |
| Paranoid | 10 msg/sec | 7 buckets | 500ms mean |
Low-Bandwidth Mode Tradeoffs
| Feature | Full Mode | Low-Bandwidth Mode |
|---|---|---|
| Cover traffic | 2/second | 0.2/second |
| Size padding | All buckets | 4 buckets only |
| Timing delays | Exponential | Minimal |
| Privacy level | Maximum | Reduced |
| Residual exposure | Only the attacks listed under Attacks Remaining Possible | Additionally timing and volume analysis |

    Low-Bandwidth Recommendations:
      If connection_type == "metered" or bandwidth < 100 kbps:
        Enable low_bandwidth_mode
        Warn user about reduced privacy
      If sensitive_communication:
        Recommend waiting for better connection

Effectiveness Analysis
Attacks Prevented
| Attack | Without Protection | With Protection | Effectiveness |
|---|---|---|---|
| Simple timing correlation | Trivial | Infeasible | 99%+ |
| Message counting | Trivial | Infeasible | 99%+ |
| Size-based classification | Easy | Bucket-level only | 95%+ |
| Burst detection | Easy | Smoothed | 90%+ |
| Long-term pattern analysis | Possible | Significantly harder | 80%+ |
| Active probing | Possible | Detectable | 70%+ |
Attacks Remaining Possible
| Attack | Difficulty | Mitigation |
|---|---|---|
| Long-term statistical analysis | Hard | Rotate circuits frequently |
| Intersection attack | Very Hard | Large anonymity set |
| Confirmation attack | Hard | Requires endpoint compromise |
| Machine learning classification | Medium | Adaptive padding algorithms |
| Side-channel attacks | Hard | Hardware isolation |
Intersection Attack Analysis

    Intersection Attack:
      Observer monitors over time T
      Tracks when Alice is online: Set_A
      Tracks when Bob receives messages: Set_B
      Intersection = Set_A ∩ Set_B
      If Intersection consistently matches:
        Probable link established

    Mitigation:
      - Large user base (anonymity set)
      - Cover traffic even when offline
      - Delayed message delivery

Academic Research References
| Paper | Year | Contribution |
|---|---|---|
| Timing Attacks on Low-Latency Anonymity Systems | 2014 | Quantified timing attack effectiveness |
| Website Fingerprinting Defenses | 2016 | Traffic padding strategies |
| Tamaraw | 2014 | Constant-rate traffic analysis defense |
| WTF-PAD | 2016 | Adaptive padding for Tor |
| TrafficSliver | 2020 | Multi-path traffic analysis resistance |
| DeepCorr | 2018 | ML-based correlation attacks |
Protection Strength Summary
| Adversary | Protection Level | Notes |
|---|---|---|
| Passive local observer | Very Strong | Cannot correlate traffic |
| Passive global observer | Strong | Statistical attacks remain |
| Active local observer | Strong | Probing detectable |
| Active global observer | Medium | Confirmation attacks possible |
| Compromised relay | Strong | Single hop reveals little |
| Multiple compromised relays | Medium | Depends on positions |
Configuration Recommendations
By Use Case
| Use Case | Padding Level | Delay Setting | Bandwidth |
|---|---|---|---|
| Casual privacy | Low | Minimal | ~1 KB/s overhead |
| Journalist/activist | High | Maximum | ~10 KB/s overhead |
| Whistleblower | Paranoid | Maximum | ~50 KB/s overhead |
| General use | Medium | Balanced | ~5 KB/s overhead |
By Network Condition
| Condition | Recommended Setting |
|---|---|
| Unlimited broadband | Maximum privacy |
| Limited broadband | Standard privacy |
| Mobile data | Low-bandwidth mode |
| Metered connection | Economy mode + warnings |
Related Documentation
- Onion Routing - Circuit-based anonymity
- Protocol Specification - Encryption details
- Threat Model - Security analysis
- Wire Protocol - Message format details