decryptWithSessions blocks event loop for tens of seconds with many stale sessions #18

@bobrenze-bot

Problem

SessionCipher.decryptWithSessions() iterates through all known sessions, attempting to decrypt with each one, without ever yielding to the event loop. Signal Protocol crypto (doDecryptWhisperMessage) is CPU-intensive JS. When there are many stale sessions and an incoming message fails to decrypt (e.g. MessageCounterError: Key used already or never filled), the loop tries every session before throwing, blocking the Node.js event loop for 10 to 52 seconds in production.
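
To illustrate why an `await` inside the loop is not enough on its own, here is a minimal sketch. It assumes each attempt does its CPU work synchronously inside an async function; `fakeDecryptAttempt`, `tryAllWithoutYield`, and `measureTimerLateness` are hypothetical stand-ins, not part of the library. Awaiting a promise that settles without real I/O only yields to the microtask queue, so timers have to wait until the whole loop is done:

```javascript
// Hypothetical stand-in for one failed decrypt attempt:
// synchronous CPU work wrapped in an async function.
async function fakeDecryptAttempt(busyMs) {
    const end = Date.now() + busyMs;
    while (Date.now() < end) { /* spin to simulate crypto work */ }
    throw new Error("simulated decrypt failure");
}

// Same shape as the unpatched loop: await each attempt, swallow the error, continue.
async function tryAllWithoutYield(attempts, busyMs) {
    for (let i = 0; i < attempts; i++) {
        try {
            await fakeDecryptAttempt(busyMs);
        } catch (e) {
            // collect and continue, as decryptWithSessions does
        }
    }
}

// Schedules a 10 ms timer, runs the loop, and reports how late the timer fired.
async function measureTimerLateness(attempts = 20, busyMs = 10) {
    const delayMs = 10;
    const scheduledAt = Date.now();
    let firedAt = null;
    const timer = new Promise(resolve => setTimeout(() => {
        firedAt = Date.now();
        resolve();
    }, delayMs));
    await tryAllWithoutYield(attempts, busyMs);
    const loopDoneAt = Date.now();
    await timer;
    return {
        firedAfterLoop: firedAt >= loopDoneAt,
        latenessMs: firedAt - scheduledAt - delayMs,
    };
}
```

In this sketch the timer can only fire after every attempt has run, which is the same mechanism that starves keepalive timers when the session list is long.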

Impact

This causes:

  • Downstream keepalive timers not firing on time → WebSocket 408 disconnects
  • All I/O stalled during decryption → cascading latency for everything else on the event loop
  • After each 408 reconnect, more decryption failures → more blocking → another 408 → infinite cycle

Measured: eventLoopDelayP99Ms=52244ms (52 seconds) with many stale sessions.
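
The issue does not say how eventLoopDelayP99Ms was collected. One way to obtain a comparable number in Node.js is the built-in `monitorEventLoopDelay` histogram from `perf_hooks`; the wrapper name `startEventLoopDelayMonitor` below is an assumption for illustration only:

```javascript
const { monitorEventLoopDelay } = require("node:perf_hooks");

// Hypothetical helper: samples event loop delay and exposes it in milliseconds.
function startEventLoopDelayMonitor(resolutionMs = 10) {
    const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
    histogram.enable();
    return {
        // The histogram reports nanoseconds; convert to milliseconds.
        p99Ms: () => histogram.percentile(99) / 1e6,
        maxMs: () => histogram.max / 1e6,
        stop: () => histogram.disable(),
    };
}
```

Calling `p99Ms()` periodically (for example from a metrics reporter) before and after the patch would give a like-for-like comparison, assuming the same load.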

Solution

Add a setImmediate yield between session attempts. This lets other event loop tasks (timers, I/O callbacks) run between attempts without changing the decryption logic:

async decryptWithSessions(data, sessions) {
    if (!sessions.length) {
        throw new errors.SessionError("No sessions available");
    }
    const errs = [];
    for (const session of sessions) {
        // Yield to the event loop between attempts so keepalive timers can fire.
        // Without this, many sessions × crypto work blocks the event loop for tens of seconds.
        await new Promise(resolve => setImmediate(resolve));
        let plaintext;
        try {
            plaintext = await this.doDecryptWhisperMessage(data, session);
            session.indexInfo.used = Date.now();
            return { session, plaintext };
        } catch(e) {
            errs.push(e);
        }
    }
    console.error("Failed to decrypt message with any known session...");
    for (const e of errs) {
        console.error("Session error:" + e, e.stack);
    }
    throw new errors.SessionError("No matching sessions found for message");
}
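
The effect of the yield can be checked in isolation with a small sketch. Everything here is a hypothetical stand-in (`yieldToEventLoop`, `simulatedAttempt`, `tryAllWithYield`, `timerFiresDuringLoop`), assuming each attempt is synchronous CPU work that fails:

```javascript
// Resolves on the next check phase, letting timers and I/O callbacks run first.
const yieldToEventLoop = () => new Promise(resolve => setImmediate(resolve));

// Hypothetical stand-in for one failed decrypt attempt (synchronous CPU work).
async function simulatedAttempt(busyMs) {
    const end = Date.now() + busyMs;
    while (Date.now() < end) { /* spin to simulate crypto work */ }
    throw new Error("simulated decrypt failure");
}

// Same loop shape as the patch: yield, then try, then continue on failure.
async function tryAllWithYield(attempts, busyMs) {
    for (let i = 0; i < attempts; i++) {
        await yieldToEventLoop();
        try {
            await simulatedAttempt(busyMs);
        } catch (e) {
            // collect and continue
        }
    }
}

// Reports whether a 10 ms timer managed to fire before the loop finished.
async function timerFiresDuringLoop(attempts = 20, busyMs = 10) {
    let firedAt = null;
    setTimeout(() => { firedAt = Date.now(); }, 10);
    await tryAllWithYield(attempts, busyMs);
    const loopDoneAt = Date.now();
    return firedAt !== null && firedAt < loopDoneAt;
}
```

With the yield in place, the worst-case delay for a timer is roughly the cost of a single attempt rather than the cost of all attempts combined. The total decryption time is not reduced; it is only split into smaller slices.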

Verification

After applying this patch:

  • eventLoopDelayP99Ms dropped from 52244ms to 134ms
  • WhatsApp 408 reconnect frequency reduced significantly
  • Decryption behavior is unchanged; the only change is the added yield to the event loop between attempts
