Skip to content

fix(poll-loop): suppress duplicate text when send_message fires mid-turn#2531

Open
cfis wants to merge 1 commit into
nanocoai:mainfrom
cfis:fix/suppress-duplicate-send
Open

fix(poll-loop): suppress duplicate text when send_message fires mid-turn#2531
cfis wants to merge 1 commit into
nanocoai:mainfrom
cfis:fix/suppress-duplicate-send

Conversation

@cfis
Copy link
Copy Markdown
Contributor

@cfis cfis commented May 18, 2026

Type of Change

  • Feature skill - adds a channel or integration (source code changes + SKILL.md)
  • Utility skill - adds a standalone tool (code files in `.claude/skills//`, no source changes)
  • Operational/container skill - adds a workflow or agent skill (SKILL.md only, no source changes)
  • Fix - bug fix or security fix to source code
  • Simplification - reduces or simplifies source code
  • Documentation - docs, README, or CONTRIBUTING changes only

Description

What. When an agent calls `send_message` or `send_file` as an MCP tool mid-turn, the Claude SDK still emits a closing-text result event afterward. nanoclaw was dispatching that closing text as a second chat row, so the user saw the tool's message followed by a near-duplicate summary from the model wrapping up. Now suppressed.

Why. Two delivery paths for the same intent (the tool wrote it, and then the SDK's closing text described it) felt like a bug from the user's side, especially on chat channels with low information density (Signal, WhatsApp, CLI) where the duplicate is jarring.

How it works.

  • Track `turn_send_invoked` in `session_state` (SQLite). Set by `send_message` and `send_file` MCP handlers; `add_reaction` deliberately doesn't set it (reaction + closing summary is a valid reply pattern).
  • At the `result` event in `processQuery`, suppress `dispatchResultText` if the flag is set. The agent's closing narration goes to scratchpad instead of outbound.
  • Cleared at four points: top of `runPollLoop()` (SIGKILL recovery), in the stale-session-invalid catch (so a retry path doesn't inherit suppression), after each result event (so follow-up messages pushed into the same open query stream don't carry suppression from a prior result), and in the outer `finally` (defense in depth at turn boundary).
  • Why SQLite, not a module variable: the nanoclaw MCP server (which owns `send_message` and `send_file`) runs in a separate process from the poll-loop. `index.ts` spawns the MCP server via `bun run mcp-tools/index.ts` and the SDK connects over stdio. SQLite is the only IPC channel between those processes; a module-level flag is invisible across the boundary, leaving suppression dead code. The cold-start clear at the top of `runPollLoop()` handles the SIGKILL-mid-turn case that SQLite persistence would otherwise leave hanging.

How it was tested. Live against the real Claude Agent SDK on a running install:

  1. Without the fix: prompt the agent with "Use `send_message` to deliver 'interim update', then briefly say what you did." The session's `outbound.db` ends up with two `chat` rows for the turn — the tool's row plus the SDK's closing-text duplicate.
  2. With the fix: same prompt, container log shows `[poll-loop] Suppressing result text (send already fired this turn): ...`, and `outbound.db` has exactly one `chat` row (the tool's write).

The mock-based unit tests in `integration.test.ts` deliberately don't cover this — the bug is in the seam between the real SDK and the real cross-process MCP server, which mock providers can't reproduce. The 89 existing tests all pass.

Usage. Transparent. Existing `send_message` / `send_file` callers (agents) need no change. The improvement is visible to users as cleaner replies on send-mid-turn flows.

For Skills

  • SKILL.md contains instructions, not inline code (code goes in separate files)
  • SKILL.md is under 500 lines
  • I tested this skill on a fresh clone

@cfis cfis requested review from gabi-simons and gavrielc as code owners May 18, 2026 02:57
@github-actions github-actions Bot added the PR: Fix Bug fix label May 18, 2026
@taslim
Copy link
Copy Markdown
Contributor

taslim commented May 19, 2026

cherrypicking this. thanks!

@taslim
Copy link
Copy Markdown
Contributor

taslim commented May 19, 2026

one issue i quickly ran into: with the boolean flag, it drops subsequent messages even if they are different. e.g. agent said "on it - looking at your calendar" mid-turn, the final response was the availability from the calendar but it never delivered that because turn_send_invoked was set to true by the mid-turn message.

i'm mitigating this by comparing the contents of the mid-turn message to the subsequent responses... and only skipping when they are verbatim. there may be a more comprehensive solution that compares and drops based on similarities as well (not just verbatim).

@cfis
Copy link
Copy Markdown
Contributor Author

cfis commented May 19, 2026

@taslim great catch — you're right, the boolean approach drops legitimate distinct content. I just pushed a revised version that does per-payload verbatim matching instead.

What changed:

  • turn_send_invoked (boolean) → turn_sent_payloads (string array in session_state).
  • send_message / send_file handlers append their text to the array.
  • dispatchResultText filters parsed <message to="...">body</message> blocks against the array; a block is dropped only when its body is a verbatim match for an already-sent payload.

Your calendar scenario now flows correctly: send_message("on it — looking at your calendar") records that payload, then a result of <message to="cli">Tomorrow at 2pm you're free</message> doesn't match, so the answer is delivered.

Verified live on a running install:

  • Verbatim match: send_message("X") + result with <message to="cli">X</message>[poll-loop] Suppressing duplicate <message> block (verbatim of an already-sent payload) fires; outbound has one row (the tool's write).
  • Distinct content: send_message("Looking it up") + result with <message to="cli">The answer is 42</message> → no suppression log; block flows through normally.

Known limitation: paraphrased duplicates aren't caught (e.g. tool sends "X"; SDK closing text says "I've sent X to the channel"). I considered similarity scoring but it's fragile and the failure mode (suppressing real content) is much worse than the success mode (catching a paraphrased duplicate). Verbatim-only is the narrower correct gate; better to ship the occasional paraphrased duplicate than to drop a real answer. Open to revisiting if you have a less fragile approach in mind.

@cfis cfis force-pushed the fix/suppress-duplicate-send branch from ae46adc to f391776 Compare May 19, 2026 16:39
@taslim
Copy link
Copy Markdown
Contributor

taslim commented May 19, 2026

@cfis i agree that similarity comparison is fragile. tried and dropped it as well. verbatim-only makes sense.

small normalization tweak you might want: s.trim().replace(/\s+/g, ' ') instead of just s.trim(). catches "hello world" vs "hello\nworld" type mismatches. probably rare but cheap.

also - your note about not unit testing this because of the SDK seam tracked when it was the boolean version. with per-payload comparison, dispatchResultText is just string matching - no SDK needed. wrote some tests on my fork - happy to push to your branch if you want.

When an agent calls send_message or send_file as an MCP tool mid-turn,
the Claude SDK still emits a closing-text result event afterward. If
that result's <message to="..."> block body is a verbatim repeat of
what the tool already shipped, nanoclaw was delivering it as a second
chat row — the user saw the tool's message followed by a duplicate.

Fix: record each text payload that send_message / send_file delivers
into turn_sent_payloads (a JSON-encoded array in session_state).
dispatchResultText filters parsed <message to="name">body</message>
blocks against that set; bodies that are a verbatim match for an
already-sent payload are skipped with a log line. Distinct content
the agent emits as part of the same final result (e.g. a progress
update via send_message followed by the actual answer in the result)
flows through normally.

Per-payload comparison is deliberate. An earlier iteration used a
boolean "did send_message fire this turn?" flag and suppressed the
entire result text on any send — but that conflates "the tool was
called" with "everything after it is a duplicate." It silently
dropped legitimate distinct content: send_message("looking it up") +
result with the calendar availability → user never saw the answer.
The verbatim-match check is the narrower correct gate. Paraphrased
duplicates aren't caught; that's the deliberate tradeoff. Better to
ship an occasional paraphrased duplicate than to suppress a real
answer.

add_reaction does not record a payload (reaction + closing summary is
a valid reply pattern).

turn_sent_payloads lives in SQLite session_state, not a module
variable. The nanoclaw MCP server (which owns send_message and
send_file) runs in a separate process from the poll-loop: index.ts
spawns it via `bun run mcp-tools/index.ts` and the SDK connects over
stdio. SQLite is the only IPC channel between the two processes;
module-level state is invisible across the boundary, leaving
suppression dead code.

Cleared at four boundaries: top of runPollLoop() (handles SIGKILL
mid-turn — stale payloads would otherwise survive into the next
container's first turn and suppress legitimate matches), in the
stale-session-invalid catch (so retry paths don't inherit the failed
attempt's payloads), in the outer finally (defense in depth at turn
boundary), and after each result event in processQuery (so follow-up
messages pushed into the same open query stream don't carry payloads
from a prior result).

Verified live against the real Claude Agent SDK:
  - Verbatim match: send_message("X"), result contains
    <message to="cli">X</message> → suppression log fires,
    outbound has one row (the tool's write).
  - Distinct content: send_message("Looking it up"), result contains
    <message to="cli">The answer is 42</message> → no suppression
    log; the block flows through dispatchResultText normally.

Thanks @taslim for the false-positive catch on the boolean version.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cfis cfis force-pushed the fix/suppress-duplicate-send branch from f391776 to 6af88f8 Compare May 20, 2026 02:31
@cfis
Copy link
Copy Markdown
Contributor Author

cfis commented May 20, 2026

@taslim thanks - applied both your ideas in 6af88f8.

Normalization tweak- s.trim().replace(/\s+/g, ' ') on both sides of the comparison, encapsulated in a small isVerbatimDuplicate(body, sentPayloads) helper that's exported for direct testing.

Then added 13 tests (container/agent-runner/src/integration.test.ts):

  • 11 unit tests against isVerbatimDuplicate directly — identical match, distinct strings (the taslim calendar false-positive case), empty payload list, multi-payload turns, whitespace collapse (newline, multi-space, tab), symmetric normalization, trim, case sensitivity, multiline → flattened, edge cases for empty body and empty payload.
  • 2 end-to-end through runPollLoop covering the cases where clearTurnSentPayloads() at the top of the loop doesn't interfere: empty-payloads turn flows through normally, and the add_reaction pattern (reaction + closing summary) still delivers.

Happy to review any additional ones if I missed test cases you were considering — push directly to fix/suppress-duplicate-send on my fork (cfis/nanoclaw) or paste here, either works.

@taslim
Copy link
Copy Markdown
Contributor

taslim commented May 20, 2026

@cfis your tests are actually much better than the ones i had planned.

tagging @gavrielc or @gabi-simons to review and help merge this please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PR: Fix Bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants