fix(poll-loop): suppress duplicate text when send_message fires mid-turn by cfis · Pull Request #2531 · nanocoai/nanoclaw

cfis · 2026-05-18T02:57:15Z

Type of Change

Feature skill - adds a channel or integration (source code changes + SKILL.md)
Utility skill - adds a standalone tool (code files in `.claude/skills//`, no source changes)
Operational/container skill - adds a workflow or agent skill (SKILL.md only, no source changes)
Fix - bug fix or security fix to source code
Simplification - reduces or simplifies source code
Documentation - docs, README, or CONTRIBUTING changes only

Description

What. When an agent calls `send_message` or `send_file` as an MCP tool mid-turn, the Claude SDK still emits a closing-text result event afterward. nanoclaw was dispatching that closing text as a second chat row, so the user saw the tool's message followed by a near-duplicate summary from the model wrapping up. Now suppressed.

Why. Two delivery paths for the same intent (the tool wrote it, and then the SDK's closing text described it) felt like a bug from the user's side, especially on chat channels with low information density (Signal, WhatsApp, CLI) where the duplicate is jarring.

How it works.

Track `turn_send_invoked` in `session_state` (SQLite). Set by `send_message` and `send_file` MCP handlers; `add_reaction` deliberately doesn't set it (reaction + closing summary is a valid reply pattern).
At the `result` event in `processQuery`, suppress `dispatchResultText` if the flag is set. The agent's closing narration goes to scratchpad instead of outbound.
Cleared at four points: top of `runPollLoop()` (SIGKILL recovery), in the stale-session-invalid catch (so a retry path doesn't inherit suppression), after each result event (so follow-up messages pushed into the same open query stream don't carry suppression from a prior result), and in the outer `finally` (defense in depth at turn boundary).
Why SQLite, not a module variable: the nanoclaw MCP server (which owns `send_message` and `send_file`) runs in a separate process from the poll-loop. `index.ts` spawns the MCP server via `bun run mcp-tools/index.ts` and the SDK connects over stdio. SQLite is the only IPC channel between those processes; a module-level flag is invisible across the boundary, leaving suppression dead code. The cold-start clear at the top of `runPollLoop()` handles the SIGKILL-mid-turn case that SQLite persistence would otherwise leave hanging.

How it was tested. Live against the real Claude Agent SDK on a running install:

Without the fix: prompt the agent with "Use `send_message` to deliver 'interim update', then briefly say what you did." The session's `outbound.db` ends up with two `chat` rows for the turn — the tool's row plus the SDK's closing-text duplicate.
With the fix: same prompt, container log shows `[poll-loop] Suppressing result text (send already fired this turn): ...`, and `outbound.db` has exactly one `chat` row (the tool's write).

The mock-based unit tests in `integration.test.ts` deliberately don't cover this — the bug is in the seam between the real SDK and the real cross-process MCP server, which mock providers can't reproduce. The 89 existing tests all pass.

Usage. Transparent. Existing `send_message` / `send_file` callers (agents) need no change. The improvement is visible to users as cleaner replies on send-mid-turn flows.

For Skills

SKILL.md contains instructions, not inline code (code goes in separate files)
SKILL.md is under 500 lines
I tested this skill on a fresh clone

taslim · 2026-05-19T04:42:57Z

cherrypicking this. thanks!

taslim · 2026-05-19T07:14:54Z

one issue i quickly ran into: with the boolean flag, it drops subsequent messages even if they are different. e.g. agent said "on it - looking at your calendar" mid-turn, the final response was the availability from the calendar but it never delivered that because turn_send_invoked was set to true by the mid-turn message.

i'm mitigating this by comparing the contents of the mid-turn message to the subsequent responses... and only skipping when they are verbatim. there may be a more comprehensive solution that compares and drops based on similarities as well (not just verbatim).

cfis · 2026-05-19T16:39:21Z

@taslim great catch — you're right, the boolean approach drops legitimate distinct content. I just pushed a revised version that does per-payload verbatim matching instead.

What changed:

turn_send_invoked (boolean) → turn_sent_payloads (string array in session_state).
send_message / send_file handlers append their text to the array.
dispatchResultText filters parsed <message to="...">body</message> blocks against the array; a block is dropped only when its body is a verbatim match for an already-sent payload.

Your calendar scenario now flows correctly: send_message("on it — looking at your calendar") records that payload, then a result of <message to="cli">Tomorrow at 2pm you're free</message> doesn't match, so the answer is delivered.

Verified live on a running install:

Verbatim match: send_message("X") + result with <message to="cli">X</message> → [poll-loop] Suppressing duplicate <message> block (verbatim of an already-sent payload) fires; outbound has one row (the tool's write).
Distinct content: send_message("Looking it up") + result with <message to="cli">The answer is 42</message> → no suppression log; block flows through normally.

Known limitation: paraphrased duplicates aren't caught (e.g. tool sends "X"; SDK closing text says "I've sent X to the channel"). I considered similarity scoring but it's fragile and the failure mode (suppressing real content) is much worse than the success mode (catching a paraphrased duplicate). Verbatim-only is the narrower correct gate; better to ship the occasional paraphrased duplicate than to drop a real answer. Open to revisiting if you have a less fragile approach in mind.

taslim · 2026-05-19T17:48:28Z

@cfis i agree that similarity comparison is fragile. tried and dropped it as well. verbatim-only makes sense.

small normalization tweak you might want: s.trim().replace(/\s+/g, ' ') instead of just s.trim(). catches "hello world" vs "hello\nworld" type mismatches. probably rare but cheap.

also - your note about not unit testing this because of the SDK seam tracked when it was the boolean version. with per-payload comparison, dispatchResultText is just string matching - no SDK needed. wrote some tests on my fork - happy to push to your branch if you want.

@taslim

When an agent calls send_message or send_file as an MCP tool mid-turn, the Claude SDK still emits a closing-text result event afterward. If that result's <message to="..."> block body is a verbatim repeat of what the tool already shipped, nanoclaw was delivering it as a second chat row — the user saw the tool's message followed by a duplicate. Fix: record each text payload that send_message / send_file delivers into turn_sent_payloads (a JSON-encoded array in session_state). dispatchResultText filters parsed <message to="name">body</message> blocks against that set; bodies that are a verbatim match for an already-sent payload are skipped with a log line. Distinct content the agent emits as part of the same final result (e.g. a progress update via send_message followed by the actual answer in the result) flows through normally. Per-payload comparison is deliberate. An earlier iteration used a boolean "did send_message fire this turn?" flag and suppressed the entire result text on any send — but that conflates "the tool was called" with "everything after it is a duplicate." It silently dropped legitimate distinct content: send_message("looking it up") + result with the calendar availability → user never saw the answer. The verbatim-match check is the narrower correct gate. Paraphrased duplicates aren't caught; that's the deliberate tradeoff. Better to ship an occasional paraphrased duplicate than to suppress a real answer. add_reaction does not record a payload (reaction + closing summary is a valid reply pattern). turn_sent_payloads lives in SQLite session_state, not a module variable. The nanoclaw MCP server (which owns send_message and send_file) runs in a separate process from the poll-loop: index.ts spawns it via `bun run mcp-tools/index.ts` and the SDK connects over stdio. SQLite is the only IPC channel between the two processes; module-level state is invisible across the boundary, leaving suppression dead code. Cleared at four boundaries: top of runPollLoop() (handles SIGKILL mid-turn — stale payloads would otherwise survive into the next container's first turn and suppress legitimate matches), in the stale-session-invalid catch (so retry paths don't inherit the failed attempt's payloads), in the outer finally (defense in depth at turn boundary), and after each result event in processQuery (so follow-up messages pushed into the same open query stream don't carry payloads from a prior result). Verified live against the real Claude Agent SDK: - Verbatim match: send_message("X"), result contains <message to="cli">X</message> → suppression log fires, outbound has one row (the tool's write). - Distinct content: send_message("Looking it up"), result contains <message to="cli">The answer is 42</message> → no suppression log; the block flows through dispatchResultText normally. Thanks @taslim for the false-positive catch on the boolean version. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cfis · 2026-05-20T02:31:51Z

@taslim thanks - applied both your ideas in 6af88f8.

Normalization tweak- s.trim().replace(/\s+/g, ' ') on both sides of the comparison, encapsulated in a small isVerbatimDuplicate(body, sentPayloads) helper that's exported for direct testing.

Then added 13 tests (container/agent-runner/src/integration.test.ts):

11 unit tests against isVerbatimDuplicate directly — identical match, distinct strings (the taslim calendar false-positive case), empty payload list, multi-payload turns, whitespace collapse (newline, multi-space, tab), symmetric normalization, trim, case sensitivity, multiline → flattened, edge cases for empty body and empty payload.
2 end-to-end through runPollLoop covering the cases where clearTurnSentPayloads() at the top of the loop doesn't interfere: empty-payloads turn flows through normally, and the add_reaction pattern (reaction + closing summary) still delivers.

Happy to review any additional ones if I missed test cases you were considering — push directly to fix/suppress-duplicate-send on my fork (cfis/nanoclaw) or paste here, either works.

taslim · 2026-05-20T04:02:44Z

@cfis your tests are actually much better than the ones i had planned.

tagging @gavrielc or @gabi-simons to review and help merge this please.

cfis requested review from gabi-simons and gavrielc as code owners May 18, 2026 02:57

github-actions Bot added the PR: Fix Bug fix label May 18, 2026

cfis force-pushed the fix/suppress-duplicate-send branch from ae46adc to f391776 Compare May 19, 2026 16:39

cfis force-pushed the fix/suppress-duplicate-send branch from f391776 to 6af88f8 Compare May 20, 2026 02:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(poll-loop): suppress duplicate text when send_message fires mid-turn#2531

fix(poll-loop): suppress duplicate text when send_message fires mid-turn#2531
cfis wants to merge 1 commit into
nanocoai:mainfrom
cfis:fix/suppress-duplicate-send

cfis commented May 18, 2026

Uh oh!

taslim commented May 19, 2026

Uh oh!

taslim commented May 19, 2026

Uh oh!

cfis commented May 19, 2026

Uh oh!

taslim commented May 19, 2026 •

edited

Loading

Uh oh!

cfis commented May 20, 2026 •

edited

Loading

Uh oh!

taslim commented May 20, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cfis commented May 18, 2026

Type of Change

Description

For Skills

Uh oh!

taslim commented May 19, 2026

Uh oh!

taslim commented May 19, 2026

Uh oh!

cfis commented May 19, 2026

Uh oh!

taslim commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cfis commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

taslim commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

taslim commented May 19, 2026 •

edited

Loading

cfis commented May 20, 2026 •

edited

Loading

taslim commented May 20, 2026 •

edited

Loading