Error Handling & Recovery

Comprehensive guide to error handling, recovery mechanisms, and resilience strategies in the Zentalk protocol.

Overview

Zentalk implements a multi-layered error handling system designed to maintain communication continuity while preserving security guarantees. The system distinguishes between recoverable and unrecoverable errors, applying appropriate strategies for each category.

Design Principles

| Principle | Description |
| --- | --- |
| Fail-secure | Errors default to secure behavior, never exposing sensitive data |
| Graceful degradation | Partial functionality maintained when possible |
| Automatic recovery | Self-healing without user intervention where safe |
| Transparency | Users informed of issues affecting their communication |

Error Categories

Zentalk errors fall into four primary categories, each requiring distinct handling strategies.

Category Overview

| Category | Examples | Severity | User Impact |
| --- | --- | --- | --- |
| Network | Timeout, DNS failure, connection reset | Low-High | Message delay |
| Cryptographic | Decryption failure, invalid signature | High | Message loss possible |
| Protocol | State mismatch, version incompatibility | Medium-High | Session reset possible |
| Storage | Quota exceeded, corruption | Medium-High | Data loss possible |

Error Hierarchy

ZentalkError (base)
├── NetworkError
│   ├── TimeoutError
│   ├── ConnectionError
│   ├── DNSError
│   └── TLSError
│
├── CryptoError
│   ├── DecryptionError
│   ├── SignatureError
│   ├── MACError
│   └── KeyError
│
├── ProtocolError
│   ├── StateError
│   ├── VersionError
│   ├── SequenceError
│   └── HandshakeError
│
└── StorageError
    ├── QuotaError
    ├── CorruptionError
    ├── AccessError
    └── TransactionError

Network Errors

Network errors are the most common error category and typically the most recoverable.

Error Types

| Error | Code | Cause | Typical Duration |
| --- | --- | --- | --- |
| Connection Timeout | NET_001 | Server unreachable, high latency | 30s default |
| Connection Reset | NET_002 | TCP RST received, server restart | Immediate |
| DNS Resolution Failed | NET_003 | DNS server unreachable, invalid domain | 5-30s |
| TLS Handshake Failed | NET_004 | Certificate invalid, cipher mismatch | Immediate |
| Connection Refused | NET_005 | Server not listening, firewall block | Immediate |
| Network Unreachable | NET_006 | No internet connectivity | Variable |
| Host Unreachable | NET_007 | Routing failure, host offline | Variable |

Timeout Configuration

Different operations have different timeout requirements based on their expected duration and criticality.

| Operation | Timeout | Rationale |
| --- | --- | --- |
| TCP Connect | 10 seconds | Fast networks should connect quickly |
| TLS Handshake | 15 seconds | Includes certificate validation |
| Key Bundle Fetch | 20 seconds | Server may need database lookup |
| Message Send | 30 seconds | Includes relay path establishment |
| File Upload | 60 seconds | Large payloads need more time |
| Circuit Creation | 45 seconds | 3-hop path establishment |
| DHT Lookup | 30 seconds | Multiple node queries |

Retry Strategy: Exponential Backoff

Zentalk uses exponential backoff with jitter to prevent thundering herd problems during service recovery.

Retry Algorithm:

1. Initialize:
     base_delay   = 1 second
     max_delay    = 30 seconds
     max_attempts = 10
     attempt      = 0

2. On failure:
     if attempt >= max_attempts:
         return PERMANENT_FAILURE
     delay = min(base_delay * (2 ^ attempt), max_delay)
     jitter = random(0, delay * 0.1)
     actual_delay = delay + jitter
     wait(actual_delay)
     attempt = attempt + 1
     retry_operation()

3. On success:
     reset attempt counter
     resume normal operation
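The retry algorithm above can be sketched in Python as follows. This is a minimal illustration, not Zentalk's implementation: the names `backoff_delay`, `run_with_retry`, and `PermanentFailure` are hypothetical, and treating `OSError` as the retryable failure is an assumption made for the example.

```python
import random
import time

BASE_DELAY = 1.0     # seconds
MAX_DELAY = 30.0     # seconds
MAX_ATTEMPTS = 10


class PermanentFailure(Exception):
    """Raised once the retry budget is exhausted (hypothetical name)."""


def backoff_delay(attempt):
    """Delay in seconds before retry number `attempt` (0-based)."""
    delay = min(BASE_DELAY * (2 ** attempt), MAX_DELAY)
    jitter = random.uniform(0, delay * 0.1)  # up to 10% extra
    return delay + jitter


def run_with_retry(operation, sleep=time.sleep):
    """Call `operation` until it succeeds or the attempts run out."""
    attempt = 0
    while True:
        try:
            return operation()
        except OSError as exc:  # assumption: network failures surface as OSError
            if attempt >= MAX_ATTEMPTS:
                raise PermanentFailure("retries exhausted") from exc
            sleep(backoff_delay(attempt))
            attempt += 1
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Because the jitter is only ever added, the delay never drops below the capped exponential value.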

Retry Delay Table

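| Attempt | Base Delay | With Max Cap | Approximate Range (with jitter) |
| --- | --- | --- | --- |
| 1 | 1s | 1s | 1.0 - 1.1s |
| 2 | 2s | 2s | 2.0 - 2.2s |
| 3 | 4s | 4s | 4.0 - 4.4s |
| 4 | 8s | 8s | 8.0 - 8.8s |
| 5 | 16s | 16s | 16.0 - 17.6s |
| 6 | 32s | 30s | 30.0 - 33.0s |
| 7-10 | 64s+ | 30s | 30.0 - 33.0s |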

Offline Queue Management

When network connectivity is lost, messages are queued locally for later transmission.

Offline Queue Structure:

| Queue Entry Field | Contents |
| --- | --- |
| message_id | UUID |
| recipient | Wallet address |
| encrypted_data | Pre-encrypted message blob |
| created_at | Unix timestamp (ms) |
| attempts | Retry count |
| last_attempt | Last retry timestamp |
| priority | HIGH / NORMAL / LOW |
| expires_at | TTL expiration timestamp |

| Queue Parameter | Value | Description |
| --- | --- | --- |
| Max queue size | 1,000 messages | Prevents storage exhaustion |
| Max message age | 7 days | Messages expire if not sent |
| Priority levels | 3 | HIGH (calls), NORMAL (chat), LOW (receipts) |
| Flush batch size | 10 messages | Sent per reconnection cycle |
| Flush interval | 5 seconds | Delay between batches |
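
As an illustration of how the queue entry fields and parameters fit together, here is a small Python sketch of a priority-ordered offline queue. The class name `OfflineQueue` and its methods are hypothetical, and rejecting new entries when the queue is full is an assumption; the source only states the size limit.

```python
import heapq
import itertools

MAX_QUEUE_SIZE = 1000
MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000   # 7 days
FLUSH_BATCH_SIZE = 10
PRIORITY_RANK = {"HIGH": 0, "NORMAL": 1, "LOW": 2}


class OfflineQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority

    def enqueue(self, message_id, recipient, encrypted_data, priority, now_ms):
        if len(self._heap) >= MAX_QUEUE_SIZE:
            # Assumption: a full queue rejects new entries.
            raise OverflowError("offline queue is full")
        entry = {
            "message_id": message_id,
            "recipient": recipient,
            "encrypted_data": encrypted_data,
            "created_at": now_ms,
            "attempts": 0,
            "last_attempt": None,
            "priority": priority,
            "expires_at": now_ms + MAX_AGE_MS,
        }
        heapq.heappush(self._heap, (PRIORITY_RANK[priority], next(self._seq), entry))

    def next_batch(self, now_ms):
        """Pop up to FLUSH_BATCH_SIZE unexpired entries, highest priority first."""
        batch = []
        while self._heap and len(batch) < FLUSH_BATCH_SIZE:
            _, _, entry = heapq.heappop(self._heap)
            if entry["expires_at"] > now_ms:  # expired entries are dropped
                batch.append(entry)
        return batch
```

A caller would invoke `next_batch` once per flush interval after reconnecting, sending each returned entry and re-enqueueing any that fail.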

Reconnection Behavior

Reconnection State Machine:

┌───────────┐     network lost     ┌──────────────┐
│ CONNECTED │─────────────────────→│ DISCONNECTED │
└───────────┘                      └──────────────┘
      ↑                                   │
      │                                   │ start backoff
      │                                   ↓
      │          success           ┌──────────────┐
      └────────────────────────────│ RECONNECTING │←─┐
                                   └──────────────┘  │
                                          │          │
                                          │ failure  │
                                          └──────────┘

| State | Behavior | User Indication |
| --- | --- | --- |
| CONNECTED | Normal operation | Green indicator |
| DISCONNECTED | Queue messages locally | Red indicator |
| RECONNECTING | Attempting connection | Yellow indicator |

Network Error Recovery Actions

| Error | Immediate Action | Retry Strategy | User Notification |
| --- | --- | --- | --- |
| Timeout | Retry with backoff | Up to 10 attempts | After 3 failures |
| Connection Reset | Immediate retry | Up to 5 attempts | After 2 failures |
| DNS Failure | Try alternate DNS | Up to 3 attempts | Immediate |
| TLS Failure | Check certificate | No retry (security) | Immediate |
| Connection Refused | Check server status | Up to 10 attempts | After 5 failures |

Cryptographic Errors

Cryptographic errors indicate potential security issues and require careful handling to maintain protocol security.

Error Types

| Error | Code | Cause | Security Implication |
| --- | --- | --- | --- |
| Decryption Failed | CRYPTO_001 | Wrong key, corrupted ciphertext | Possible attack or desync |
| Invalid Signature | CRYPTO_002 | Key mismatch, tampering | Possible MITM attack |
| MAC Verification Failed | CRYPTO_003 | Message modified | Definite tampering |
| Key Derivation Failed | CRYPTO_004 | Invalid input parameters | Implementation error |
| Invalid Public Key | CRYPTO_005 | Malformed key, wrong curve | Protocol violation |
| Nonce Reuse Detected | CRYPTO_006 | Same nonce used twice | Security critical |
| Key Expired | CRYPTO_007 | SPK or session key too old | Rotation required |

Decryption Failure Handling

When message decryption fails, the system must determine whether the failure is due to desynchronization or an attack.

Decryption Failure Decision Tree:

1. Decryption fails with current receiving chain key
   ├─→ Check skipped message keys
   │     ├─→ Found: Decrypt with skipped key, delete key
   │     └─→ Not found: Continue to step 2

2. Check if message contains new DH ratchet key
   ├─→ Yes: Attempt DH ratchet step
   │     ├─→ Success: Decrypt with new chain
   │     └─→ Failure: Continue to step 3
   └─→ No: Continue to step 3

3. Increment failure counter for this session
   ├─→ Counter < 3: Request message resend
   ├─→ Counter >= 3: Trigger session reset
   └─→ Counter >= 5: Flag as potential attack
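Step 3 of the decision tree lists overlapping thresholds (a counter of 5 is also >= 3). One reading, assumed here and not confirmed by the source, is that the actions accumulate: a reset is triggered from 3 failures on, and the attack flag is added from 5. A short Python sketch of that reading, with the hypothetical function name `actions_for_failure_count`:

```python
RESEND_LIMIT = 3   # below this: request a resend
ATTACK_LIMIT = 5   # at or above this: also flag a potential attack


def actions_for_failure_count(counter):
    """Map a session's decryption failure counter to escalation actions."""
    if counter < RESEND_LIMIT:
        return ["REQUEST_RESEND"]
    actions = ["SESSION_RESET"]
    if counter >= ATTACK_LIMIT:
        # Assumption: flagging happens in addition to the reset.
        actions.append("FLAG_POTENTIAL_ATTACK")
    return actions
```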

When to Request Message Resend

Message resend is appropriate when decryption failure is likely due to network issues rather than cryptographic desynchronization.

| Condition | Request Resend | Rationale |
| --- | --- | --- |
| First decryption failure | Yes | Likely transient error |
| Message number gap detected | Yes | Messages may have been lost |
| Previous messages decrypted OK | Yes | Isolated failure |
| Multiple consecutive failures | No | Likely desync, need reset |
| MAC verification failed | No | Tampering detected |
| Same message fails twice | No | Not a transient error |

When to Trigger Session Reset

Session reset destroys current cryptographic state and re-establishes the session via X3DH.

| Trigger Condition | Action | Data Preserved |
| --- | --- | --- |
| 3+ consecutive decrypt failures | Automatic reset | Message history |
| Invalid ratchet state detected | Automatic reset | Message history |
| User requests reset | Manual reset | Message history |
| Peer sends reset request | Accept reset | Message history |
| Key compromise suspected | Force reset | Message history |

MAC Verification Failure

A MAC (Message Authentication Code) failure means the message was modified after it was authenticated. Zentalk treats every such failure as tampering and handles it with zero tolerance.

MAC Failure Response:

1. IMMEDIATELY discard message
2. Do NOT attempt alternative decryption
3. Do NOT advance ratchet state
4. Log security event:
     {
       type: "MAC_FAILURE",
       peer: <address>,
       timestamp: <now>,
       message_header: <non-sensitive portion>
     }
5. Increment MAC failure counter
6. If counter >= 2 in 1 hour:
     - Flag session as potentially compromised
     - Notify user with security warning
     - Recommend session reset
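The "counter >= 2 in 1 hour" rule in step 6 implies a sliding time window. A minimal Python sketch of such a counter follows; the class name `MacFailureTracker` is hypothetical, and the choice of a sliding window over a fixed hourly bucket is an assumption.

```python
from collections import deque

WINDOW_SECONDS = 3600   # 1 hour
FLAG_THRESHOLD = 2


class MacFailureTracker:
    """Counts MAC failures for one session inside a sliding one-hour window."""

    def __init__(self):
        self._events = deque()

    def record(self, now):
        """Record a failure at time `now` (seconds).

        Returns True when the session should be flagged as potentially compromised.
        """
        self._events.append(now)
        # Drop failures that have fallen out of the window.
        while self._events and now - self._events[0] > WINDOW_SECONDS:
            self._events.popleft()
        return len(self._events) >= FLAG_THRESHOLD
```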

Signature Verification Failures

| Failure Type | Possible Cause | Action |
| --- | --- | --- |
| Identity key signature invalid | Key bundle tampered | Reject, warn user |
| SPK signature invalid | SPK corrupted or forged | Reject, fetch fresh bundle |
| Message signature invalid | Sender key changed | Verify key fingerprint |
| Timestamp signature invalid | Replay attack | Reject message |

Protocol Errors

Protocol errors occur when communication violates expected state or format.

Error Types

| Error | Code | Cause | Recovery Path |
| --- | --- | --- | --- |
| Invalid State Transition | PROTO_001 | Message received in wrong state | Reset state machine |
| Version Mismatch | PROTO_002 | Incompatible protocol versions | Negotiate or fail |
| Unknown Message Type | PROTO_003 | Newer protocol, corrupted data | Ignore or request clarification |
| Sequence Number Invalid | PROTO_004 | Replay or lost messages | Check skipped keys |
| Handshake Failed | PROTO_005 | X3DH computation mismatch | Retry with fresh keys |
| Circuit ID Unknown | PROTO_006 | Circuit destroyed or expired | Create new circuit |
| Stream ID Invalid | PROTO_007 | Stream closed or never opened | Open new stream |

State Machine Recovery

When the session state machine enters an invalid state, recovery depends on the current and expected states.

State Recovery Matrix:

Current           Expected       Recovery Action
─────────────────────────────────────────────────
NO_SESSION        ESTABLISHED    Initiate X3DH
PENDING           ESTABLISHED    Wait or retry X3DH
ESTABLISHED       NO_SESSION     Accept (peer reset)
CLOSED            Any active     Create new session
Invalid/Corrupt   Any            Force reset

Version Mismatch Handling

| Client Version | Server Version | Behavior |
| --- | --- | --- |
| v1.0 | v1.0 | Normal operation |
| v1.0 | v2.0 | Server downgrades if supported |
| v2.0 | v1.0 | Client downgrades if supported |
| v1.0 | v3.0+ | Connection refused, upgrade required |

Version Negotiation:

1. Client sends supported_versions: [1.0, 1.1, 2.0]
2. Server selects highest common version
3. If no common version:
     Server responds:
     {
       error: "VERSION_MISMATCH",
       server_versions: [3.0, 3.1],
       client_versions: [1.0, 1.1, 2.0],
       upgrade_url: "https://zentalk.io/download"
     }
4. Client notifies user: "Update required"
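
The server-side selection in steps 2 and 3 can be sketched in Python as below. The function name `negotiate` and the string representation of versions are assumptions for illustration. Versions are compared numerically per component so that, for example, "1.10" ranks above "1.9".

```python
UPGRADE_URL = "https://zentalk.io/download"


def _version_key(version):
    """Turn "2.0" into (2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def negotiate(client_versions, server_versions):
    """Pick the highest version both sides support, or build the mismatch error."""
    common = set(client_versions) & set(server_versions)
    if common:
        return {"version": max(common, key=_version_key)}
    return {
        "error": "VERSION_MISMATCH",
        "server_versions": list(server_versions),
        "client_versions": list(client_versions),
        "upgrade_url": UPGRADE_URL,
    }
```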

Invalid Message Type Handling

| Message Type | Known | Action |
| --- | --- | --- |
| 0x01 - 0x0F | Yes | Process normally |
| 0x10 - 0xEF | Reserved | Log and ignore |
| 0xF0 - 0xFE | Extension | Check extension support |
| 0xFF | Error | Process error response |
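
The ranges in the table map directly onto a dispatch function. The sketch below uses hypothetical action names. The table does not cover 0x00, so the sketch reports it as unspecified instead of guessing a behavior.

```python
def classify_message_type(type_byte):
    """Return the handling action for a one-byte message type."""
    if not 0 <= type_byte <= 0xFF:
        raise ValueError("message type must fit in one byte")
    if 0x01 <= type_byte <= 0x0F:
        return "PROCESS"
    if 0x10 <= type_byte <= 0xEF:
        return "LOG_AND_IGNORE"
    if 0xF0 <= type_byte <= 0xFE:
        return "CHECK_EXTENSION_SUPPORT"
    if type_byte == 0xFF:
        return "PROCESS_ERROR_RESPONSE"
    return "UNSPECIFIED"  # 0x00 is not listed in the table
```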

Double Ratchet Desynchronization

Desynchronization occurs when sender and receiver ratchet states diverge, preventing message decryption.

How Desync Happens

| Cause | Frequency | Detection Difficulty |
| --- | --- | --- |
| Network packet loss | Common | Easy |
| Device switch mid-conversation | Occasional | Medium |
| App crash during ratchet step | Rare | Medium |
| Storage corruption | Rare | Hard |
| Concurrent message sending | Occasional | Medium |
| Clock skew affecting ordering | Rare | Hard |

Desync Scenarios

Scenario 1: Lost Message

Alice                                    Bob
─────                                    ───
Sends M1 (N=0) ─────────────────→        Receives M1
Sends M2 (N=1) ────────X                 (lost in network)
Sends M3 (N=2) ─────────────────→        Receives M3

Bob expects N=1, receives N=2
Detection: Message number gap
Recovery:  Bob stores skipped key for N=1

Scenario 2: Lost DH Ratchet

Alice                                    Bob
─────                                    ───
DH ratchet, sends M1 ──────X             (lost)
Sends M2 ─────────────────────────→      Receives M2

Bob has old DHr, cannot derive correct chain
Detection: Decryption fails with current state
Recovery:  Attempt ratchet with received DH key

Scenario 3: State Corruption

Alice                                    Bob
─────                                    ───
Stores state to disk                     Receives message
App crashes before flush                 Decrypts successfully
App restarts with old state              Advances ratchet

Alice's state is behind Bob's state
Detection: Multiple decryption failures
Recovery:  Session reset required

Detection Mechanisms

| Mechanism | Detects | False Positive Rate |
| --- | --- | --- |
| Message number gap | Lost messages | Very low |
| Consecutive decrypt failures | Chain desync | Low |
| DH key mismatch | Ratchet desync | Very low |
| Timestamp anomalies | Ordering issues | Medium |
| MAC failures | Corruption or attack | Very low |

Desync Detection Algorithm

On message receipt:

1. Extract header: DH_pub, msg_num, prev_chain_len

2. Check for message number gap:
     if msg_num > Nr:
         gap_size = msg_num - Nr
         if gap_size > MAX_SKIP:
             return REJECT_POSSIBLE_DOS
         store_skipped_keys(Nr, msg_num - 1)

3. Check DH key:
     if DH_pub != current_DHr:
         if DH_pub in recent_DH_keys:
             // Out of order from previous chain
             use_previous_chain()
         else:
             // New DH ratchet
             perform_dh_ratchet(DH_pub)

4. Attempt decryption:
     if success:
         clear_failure_counter()
     else:
         increment_failure_counter()
         if failures >= DESYNC_THRESHOLD:
             initiate_recovery()
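Step 2 of the algorithm, the gap check, is sketched below in Python. The source names `MAX_SKIP` without giving a value, so the 1000 used here is purely an assumption, as is the function name `check_message_number`.

```python
MAX_SKIP = 1000  # assumption: the source names MAX_SKIP but gives no value


def check_message_number(msg_num, nr, max_skip=MAX_SKIP):
    """Return (verdict, skipped) when `nr` is the next expected message number.

    `skipped` lists the message numbers whose keys must be stored,
    i.e. nr through msg_num - 1 inclusive.
    """
    if msg_num <= nr:
        return ("NO_GAP", [])
    gap_size = msg_num - nr
    if gap_size > max_skip:
        return ("REJECT_POSSIBLE_DOS", [])
    return ("STORE_SKIPPED_KEYS", list(range(nr, msg_num)))
```

Applied to the lost-message scenario, where Bob expects N=1 and receives N=2, `check_message_number(2, 1)` asks for the key of message 1 to be stored. Capping the gap bounds the key-derivation work a single forged header can trigger.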

Recovery Protocol

Desync Recovery Flow:

┌─────────────────┐
│ Desync Detected │
└────────┬────────┘
         ▼
┌─────────────────────────┐
│ failures < 3?           │───Yes───→ Request resend
└───────────┬─────────────┘
            │ No
            ▼
┌─────────────────────────┐
│ Try alternative chains  │
│ (skipped keys, prev DH) │
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ Alternative succeeded?  │───Yes───→ Resume normal
└───────────┬─────────────┘
            │ No
            ▼
┌─────────────────────────┐
│ Send RESET_REQUEST      │
│ to peer                 │
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ Await RESET_ACK         │
│ (timeout: 30 seconds)   │
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ Re-establish session    │
│ via X3DH                │
└─────────────────────────┘

Message Loss During Recovery

During session recovery, some messages may be unrecoverable.

| Message State | Recovery Outcome |
| --- | --- |
| Successfully decrypted | Preserved in history |
| Queued for send | Re-encrypted with new session |
| In-flight during reset | Lost (resend notification sent) |
| Received but undecrypted | Lost (request resend from peer) |

Recovery Message Handling:

1. Before reset:
     - Flush all pending decrypted messages to storage
     - Mark queued outgoing messages as NEEDS_REENCRYPT
     - Record message IDs of failed decryptions

2. After new session established:
     - Re-encrypt and resend queued messages
     - Request resend of failed incoming messages:
       {
         type: "RESEND_REQUEST",
         message_ids: [<list of lost message IDs>],
         reason: "session_reset"
       }

3. Peer responds:
     - Re-encrypts requested messages with new session
     - Marks messages as RESENT to prevent duplicates

Message Delivery Failures

Message delivery follows a state machine tracking each message from creation to confirmed delivery.

Delivery States

Message Delivery State Machine:

┌──────────┐   encrypt    ┌─────────┐   send    ┌────────┐
│ CREATING │─────────────→│ PENDING │──────────→│  SENT  │
└──────────┘              └─────────┘           └────────┘
                               │                     │
                         timeout/error           delivered
                               │                     │
                               ▼                     ▼
                          ┌─────────┐          ┌───────────┐
                          │ FAILED  │          │ DELIVERED │
                          └─────────┘          └───────────┘
                               │                     │
                        retry succeeds             read
                               │                     │
                               ▼                     ▼
                          ┌────────┐            ┌────────┐
                          │  SENT  │            │  READ  │
                          └────────┘            └────────┘

| State | Description | Typical Duration |
| --- | --- | --- |
| CREATING | Message being composed/encrypted | Milliseconds |
| PENDING | Encrypted, awaiting network | Until connected |
| SENT | Transmitted to relay network | Until ACK received |
| DELIVERED | Confirmed received by recipient | Until read |
| READ | Read receipt received | Final state |
| FAILED | Delivery failed after retries | Until manual retry |

Retry Behavior Per State

| State | Automatic Retry | Max Attempts | Backoff Strategy |
| --- | --- | --- | --- |
| PENDING | Yes (when online) | Unlimited | Queue order |
| SENT (no ACK) | Yes | 3 | Exponential, max 30s |
| FAILED | No (user action) | User controlled | None |
| DELIVERED (no read) | No | N/A | N/A |

Delivery Failure Causes

| Cause | Detection Method | Retry Appropriate |
| --- | --- | --- |
| Network timeout | No ACK within timeout | Yes |
| Recipient offline | Server queued response | Yes (server queues) |
| Recipient key changed | Key bundle mismatch | Yes (fetch new keys) |
| Recipient blocked sender | 403 response | No |
| Message too large | 413 response | No |
| Rate limited | 429 response | Yes (after cooldown) |
| Server error | 5xx response | Yes |
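
For the causes detected by response status, the retry decision reduces to a small lookup. The Python sketch below covers only the status codes named in the table. The function name and the strategy labels are hypothetical, and statuses the table does not mention are rejected instead of being given an invented policy.

```python
def retry_decision(status):
    """Return (should_retry, strategy) for a delivery response status code."""
    if status in (403, 413):
        return (False, "permanent")        # blocked sender / message too large
    if status == 429:
        return (True, "after_cooldown")    # rate limited
    if 500 <= status <= 599:
        return (True, "backoff")           # server error
    raise ValueError(f"no retry policy documented for status {status}")
```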

User Notification Strategy

| Condition | Notification Type | Timing |
| --- | --- | --- |
| First retry | None | Silent |
| 3rd retry | Subtle indicator | Delayed badge |
| Max retries exhausted | Alert | Immediate |
| Permanent failure | Dialog | Immediate |
| Rate limited | Toast | Immediate |

Notification Decision:

1. Message enters FAILED state

2. Determine failure type:
     - Transient (network): "Message will retry when online"
     - Permanent (blocked): "Message could not be delivered"
     - Recoverable (key change): "Recipient's keys changed. Verify and resend?"

3. Display appropriate UI:
     - Transient: Yellow warning icon on message
     - Permanent: Red error icon, tap for details
     - Recoverable: Action prompt with verify option

Permanent Failure Handling

| Failure Type | User Action | System Action |
| --- | --- | --- |
| Recipient blocked | Inform user | Remove from contacts (optional) |
| Invalid recipient | Prompt correction | Discard message |
| Message expired | Inform user | Archive or delete |
| Key verification failed | Prompt verification | Hold message |

Storage Errors

Storage errors affect local data persistence and can lead to data loss if not handled correctly.

Error Types

| Error | Code | Cause | Severity |
| --- | --- | --- | --- |
| Quota Exceeded | STORE_001 | IndexedDB limit reached | High |
| Database Corruption | STORE_002 | Unexpected shutdown, disk error | Critical |
| Transaction Failed | STORE_003 | Concurrent access conflict | Medium |
| Access Denied | STORE_004 | Browser permissions revoked | High |
| Version Mismatch | STORE_005 | Database schema outdated | Medium |
| Encryption Key Lost | STORE_006 | Key derivation failed | Critical |

IndexedDB Quota Management

Browser storage quotas vary by platform and available disk space.

| Platform | Typical Quota | Zentalk Target Usage |
| --- | --- | --- |
| Chrome Desktop | 60% of disk | Max 500 MB |
| Firefox Desktop | 50% of disk | Max 500 MB |
| Safari Desktop | 1 GB | Max 500 MB |
| Mobile browsers | 50-100 MB | Max 50 MB |

Quota Exceeded Handling

Quota Exceeded Response:

1. Identify storage consumers:
     - Message history: typically largest
     - Media cache: can be cleared
     - Session state: critical, cannot reduce
     - Logs: can be truncated

2. Execute cleanup strategy:
     Priority 1: Clear media cache
     Priority 2: Compress old messages
     Priority 3: Archive old conversations
     Priority 4: Prompt user for action

3. Cleanup thresholds:
     - At 80% quota: Clear media cache
     - At 90% quota: Archive conversations > 6 months
     - At 95% quota: Notify user, suggest export
     - At 100%: Emergency mode, queue to memory only
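The cleanup thresholds in step 3 can be expressed as a function of current usage. This Python sketch assumes the actions are cumulative, so at 95% the two lower-threshold actions also apply; the source lists the thresholds without saying so. The function and action names are hypothetical.

```python
THRESHOLDS = [
    (0.80, "CLEAR_MEDIA_CACHE"),
    (0.90, "ARCHIVE_CONVERSATIONS_OLDER_THAN_6_MONTHS"),
    (0.95, "NOTIFY_USER_SUGGEST_EXPORT"),
    (1.00, "EMERGENCY_MODE_MEMORY_ONLY"),
]


def cleanup_actions(used_bytes, quota_bytes):
    """List the cleanup actions triggered at the current usage level."""
    if quota_bytes <= 0:
        raise ValueError("quota must be positive")
    usage = used_bytes / quota_bytes
    # Assumption: every threshold at or below the usage level fires.
    return [action for limit, action in THRESHOLDS if usage >= limit]
```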

Database Corruption Detection

| Detection Method | Checks For | Frequency |
| --- | --- | --- |
| Checksum validation | Data integrity | Every read |
| Schema verification | Structure validity | On app launch |
| Foreign key check | Referential integrity | On app launch |
| Index verification | Index corruption | Weekly |
| Transaction log | Incomplete writes | On app launch |

Corruption Detection Algorithm:

1. On database open:
     - Verify schema version matches expected
     - Check critical tables exist
     - Validate index structures

2. On read operation:
     if stored_checksum != computed_checksum:
         mark_record_corrupted(record_id)
         attempt_recovery(record_id)

3. Periodic integrity check (weekly):
     for each table:
         for each record:
             validate_structure(record)
             validate_references(record)
             validate_checksum(record)
     report_corruption_rate()
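
The per-read checksum comparison in step 2 might look like the Python sketch below. The source does not say which checksum Zentalk uses; SHA-256 over a canonical JSON encoding is an assumption chosen for the example, and all function names are hypothetical.

```python
import hashlib
import json


def compute_checksum(payload):
    """Hash a canonical JSON encoding so key order does not affect the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_record(record_id, payload):
    """Store the checksum alongside the payload at write time."""
    return {"id": record_id, "payload": payload, "checksum": compute_checksum(payload)}


def read_record(record, corrupted_ids):
    """Return the payload, or None after marking the record as corrupted."""
    if record["checksum"] != compute_checksum(record["payload"]):
        corrupted_ids.add(record["id"])  # recovery would be attempted from here
        return None
    return record["payload"]
```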

Automatic Repair Strategies

| Corruption Type | Repair Strategy | Success Rate |
| --- | --- | --- |
| Single record | Restore from backup | High |
| Index corruption | Rebuild index | Very high |
| Table corruption | Restore table from backup | Medium |
| Schema corruption | Reset schema, migrate data | Medium |
| Full DB corruption | Restore from encrypted backup | Depends on backup |

Repair Decision Tree:

┌─────────────────────────┐
│ Corruption Detected     │
└───────────┬─────────────┘
            ▼
┌─────────────────────────┐
│ Scope Assessment        │
│ (single record/table/DB)│
└───────────┬─────────────┘
   ┌────────┼────────┐
   │        │        │
   ▼        ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐
│Record│ │Table │ │ Full │
└──┬───┘ └──┬───┘ └──┬───┘
   │        │        │
   ▼        ▼        ▼
Restore  Rebuild  Restore
 from     from     from
 cache   indices  backup

Manual Recovery Options

When automatic recovery fails, users have several manual options.

| Option | Data Preserved | Complexity |
| --- | --- | --- |
| Export and reimport | All exportable data | Medium |
| Restore from backup | Up to backup point | Low |
| Clear and resync | Contacts only | Low |
| Full reset | None (fresh start) | Very low |

Manual Recovery UI Flow:

1. User accesses Settings → Data Recovery

2. Options presented:

   ┌─────────────────────────────────────────┐
   │ Recovery Options                        │
   ├─────────────────────────────────────────┤
   │ [Attempt Auto-Repair]                   │
   │ Try to fix corruption automatically     │
   │                                         │
   │ [Restore from Backup]                   │
   │ Last backup: 2 hours ago                │
   │                                         │
   │ [Export Available Data]                 │
   │ Save what can be recovered              │
   │                                         │
   │ [Clear All Data]                        │
   │ Start fresh (contacts re-sync)          │
   └─────────────────────────────────────────┘

3. After selection:
     - Confirm destructive actions
     - Show progress indicator
     - Report success/failure
     - Guide next steps

Error Codes Reference

Network Error Codes (NET_xxx)

| Code | Name | Description | Resolution |
| --- | --- | --- | --- |
| NET_001 | TIMEOUT | Operation exceeded time limit | Retry with backoff |
| NET_002 | CONN_RESET | Connection forcibly closed | Reconnect |
| NET_003 | DNS_FAILED | Domain name resolution failed | Check network, try alternate DNS |
| NET_004 | TLS_FAILED | TLS handshake failed | Verify certificates, check time |
| NET_005 | CONN_REFUSED | Server refused connection | Check server status |
| NET_006 | NET_UNREACHABLE | No route to network | Check internet connection |
| NET_007 | HOST_UNREACHABLE | Cannot reach specific host | Check host status |
| NET_008 | CERT_EXPIRED | Server certificate expired | Contact server operator |
| NET_009 | CERT_INVALID | Certificate validation failed | Security alert, do not proceed |
| NET_010 | RATE_LIMITED | Too many requests | Wait and retry |

Cryptographic Error Codes (CRYPTO_xxx)

| Code | Name | Description | Resolution |
| --- | --- | --- | --- |
| CRYPTO_001 | DECRYPT_FAILED | Decryption produced invalid data | Check keys, request resend |
| CRYPTO_002 | SIG_INVALID | Signature verification failed | Verify sender identity |
| CRYPTO_003 | MAC_FAILED | Authentication tag mismatch | Discard message, alert user |
| CRYPTO_004 | KDF_FAILED | Key derivation error | Check inputs, retry |
| CRYPTO_005 | KEY_INVALID | Public key validation failed | Request new key bundle |
| CRYPTO_006 | NONCE_REUSE | Same nonce used twice | Critical security error |
| CRYPTO_007 | KEY_EXPIRED | Key past validity period | Trigger key rotation |
| CRYPTO_008 | ALGO_UNSUPPORTED | Unknown algorithm requested | Check protocol version |
| CRYPTO_009 | RANDOM_FAILED | RNG failure | Critical system error |
| CRYPTO_010 | KEY_MISMATCH | Public/private key mismatch | Regenerate keypair |

Protocol Error Codes (PROTO_xxx)

| Code | Name | Description | Resolution |
| --- | --- | --- | --- |
| PROTO_001 | INVALID_STATE | Unexpected state transition | Reset state machine |
| PROTO_002 | VERSION_MISMATCH | Incompatible protocol version | Upgrade client |
| PROTO_003 | UNKNOWN_MSG_TYPE | Unrecognized message type | Ignore or upgrade |
| PROTO_004 | SEQ_INVALID | Invalid sequence number | Check for replay |
| PROTO_005 | HANDSHAKE_FAILED | X3DH computation failed | Retry with fresh keys |
| PROTO_006 | CIRCUIT_UNKNOWN | Circuit ID not found | Create new circuit |
| PROTO_007 | STREAM_INVALID | Stream ID not valid | Open new stream |
| PROTO_008 | MSG_TOO_LARGE | Message exceeds size limit | Split or compress |
| PROTO_009 | REPLAY_DETECTED | Duplicate message received | Discard message |
| PROTO_010 | DESYNC_DETECTED | Ratchet desynchronization | Initiate recovery |

Storage Error Codes (STORE_xxx)

| Code | Name | Description | Resolution |
| --- | --- | --- | --- |
| STORE_001 | QUOTA_EXCEEDED | Storage limit reached | Clear cache, archive old data |
| STORE_002 | CORRUPTION | Data integrity check failed | Attempt repair or restore |
| STORE_003 | TXN_FAILED | Database transaction failed | Retry operation |
| STORE_004 | ACCESS_DENIED | Permission denied | Request permissions |
| STORE_005 | SCHEMA_MISMATCH | Database schema outdated | Run migration |
| STORE_006 | KEY_LOST | Encryption key unavailable | Restore from backup |
| STORE_007 | NOT_FOUND | Requested record missing | Check ID, may be deleted |
| STORE_008 | LOCKED | Database locked by another process | Wait and retry |
| STORE_009 | FULL | Storage completely full | Emergency cleanup |
| STORE_010 | INIT_FAILED | Database initialization failed | Clear and reinitialize |

User-Facing vs Internal Errors

| Error Type | User-Facing | Technical Details Shown |
| --- | --- | --- |
| Network timeout | Yes | No (just "Connection problem") |
| Decryption failed | Yes | No (just "Message unavailable") |
| MAC failure | Yes | Partial ("Security warning") |
| Storage quota | Yes | Yes (space remaining) |
| Protocol version | Yes | Yes (version numbers) |
| Internal errors | Yes | No (generic message) |
| Rate limiting | Yes | Yes (retry time) |
| Key expiration | No | N/A (auto-handled) |

Error Message Templates

| Code | User Message | Technical Log |
| --- | --- | --- |
| NET_001 | "Connection timed out. Retrying…" | "NET_001: Timeout after 30000ms to relay.zentalk.io:9001" |
| CRYPTO_003 | "Message could not be verified. It may have been tampered with." | "CRYPTO_003: MAC verification failed for msg_id=abc123, session=def456" |
| STORE_001 | "Storage full. Please free up space or archive old messages." | "STORE_001: QuotaExceededError, used=490MB, limit=500MB" |
| PROTO_002 | "Please update Zentalk to continue." | "PROTO_002: Version mismatch, local=1.2.0, remote=2.0.0" |

Recovery Best Practices

Error Recovery Priority

| Priority | Error Category | Rationale |
| --- | --- | --- |
| 1 (Highest) | Security errors | Prevent data exposure |
| 2 | Cryptographic sync | Restore communication |
| 3 | Storage integrity | Prevent data loss |
| 4 | Network connectivity | Restore service |
| 5 (Lowest) | UI/UX errors | User convenience |

Logging and Diagnostics

| Log Level | Error Types | Retention |
| --- | --- | --- |
| ERROR | All failures | 7 days |
| WARN | Retryable issues | 3 days |
| INFO | Recovery actions | 1 day |
| DEBUG | Detailed traces | Session only |

Log Entry Structure:

{
  timestamp: "2024-01-15T10:30:00.000Z",
  level: "ERROR",
  code: "CRYPTO_001",
  message: "Decryption failed",
  context: {
    session_id: "abc123",
    peer: "0x1234...5678",
    message_num: 42,
    attempt: 2
  },
  stack: "<stack trace for DEBUG only>"
}

Circuit Breaker Pattern

For operations that repeatedly fail, Zentalk implements a circuit breaker to prevent resource exhaustion.

| State | Behavior | Transition |
| --- | --- | --- |
| CLOSED | Normal operation | Opens after 5 failures |
| OPEN | Fail fast, no attempts | Half-opens after 60s |
| HALF-OPEN | Allow single test request | Closes on success, opens on failure |

Circuit Breaker Logic:

state = CLOSED
failure_count = 0
last_failure_time = null

on_operation():
    if state == OPEN:
        if now() - last_failure_time > 60 seconds:
            state = HALF_OPEN
        else:
            return FAIL_FAST

    result = attempt_operation()

    if result == SUCCESS:
        state = CLOSED
        failure_count = 0
    else:
        failure_count += 1
        last_failure_time = now()
        if failure_count >= 5:
            state = OPEN
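
A runnable Python rendering of the circuit breaker logic is sketched below. The class and exception names are hypothetical, and signalling failure through exceptions (where the pseudocode uses a result code) is an adaptation for Python. The clock is injectable so the 60-second window can be exercised without waiting.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker fails fast (hypothetical name)."""


class CircuitBreaker:
    FAILURE_THRESHOLD = 5
    OPEN_SECONDS = 60

    def __init__(self, clock=time.monotonic):
        self.state = "CLOSED"
        self.failure_count = 0
        self.last_failure_time = None
        self._clock = clock

    def call(self, operation):
        if self.state == "OPEN":
            if self._clock() - self.last_failure_time > self.OPEN_SECONDS:
                self.state = "HALF_OPEN"   # allow a single test request
            else:
                raise CircuitOpenError("failing fast while circuit is open")
        try:
            result = operation()
        except Exception:
            self.failure_count += 1
            self.last_failure_time = self._clock()
            # A failed test request reopens the circuit immediately.
            if self.state == "HALF_OPEN" or self.failure_count >= self.FAILURE_THRESHOLD:
                self.state = "OPEN"
            raise
        self.state = "CLOSED"
        self.failure_count = 0
        return result
```

The explicit `HALF_OPEN` check makes the "opens on failure" transition from the table visible. In the pseudocode the same outcome follows implicitly because `failure_count` is still at or above 5 when the test request fails.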