
fix(slackbot): avoid corrupting final delivery when the streamed-char offset lands mid-token#342

Closed
axis-guzus wants to merge 1 commit into
paradigmxyz:mainfrom
axis-guzus:fix/slackbot-final-delivery-mid-token-corruption

@axis-guzus


Problem

Agent answers delivered to Slack are intermittently corrupted — characters are dropped or words merge (observed in production: "market cap" → "marketountries", "defillama.com" → "vama.com", "authenticate" → "thenticate"). The content is correct server-side; only the Slack render is mangled.

Root cause

continuationText (services/slackbot/src/centaur/final-delivery.ts) blind-slices the canonical result_text at slackbot_streamed_answer_chars:

return text.slice(offset).trimStart();

That offset counts the source characters streamed live (the per-delta source-char accumulator in agent-session.ts / codex-session.ts), which is a different string-space than result_text:

  • markdown normalization adds boundary characters to the live stream that aren't in the source count;
  • whitespace / paragraph reflow changes lengths;
  • the harness can revise the final answer — the code even logs slack_codex_canonical_answer_correction when the canonical answer length differs from what was streamed.

When the offset is misaligned, text.slice(offset) lands mid-token and drops characters. The offset is accumulated server-side with max(...), so it never decreases: once it overshoots, it stays overshot.
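To make the failure mode concrete, here is a minimal sketch of the blind slice applied with a drifted offset. The sample string, the offsets, and the helper name `blindSlice` are hypothetical and chosen for illustration; only the `text.slice(offset).trimStart()` expression comes from the code quoted above. The `Math.max` loop is an assumed stand-in for the server-side accumulation, not the actual implementation.

```typescript
// Hypothetical canonical answer; not taken from production logs.
const resultText = "The market cap is large.";

// Same expression as the one quoted from final-delivery.ts.
function blindSlice(text: string, offset: number): string {
  return text.slice(offset).trimStart();
}

// Aligned: the live stream stopped right after "The market ".
const alignedOffset = "The market ".length; // 11

// Drifted: the streamed-char count is 2 higher than the matching position
// in resultText (for example because the live stream carried extra characters).
const driftedOffset = alignedOffset + 2; // 13

const alignedSuffix = blindSlice(resultText, alignedOffset); // "cap is large."
const corruptedSuffix = blindSlice(resultText, driftedOffset); // "p is large."

// Assumed shape of a max-based accumulator: a single overshoot is never undone.
let streamedChars = 0;
for (const reported of [11, 13, 11]) {
  streamedChars = Math.max(streamedChars, reported);
}

console.log(alignedSuffix, "|", corruptedSuffix, "|", streamedChars);
```

With the drifted offset, Slack would show the streamed prefix followed by "p is large.", which is the kind of dropped-character corruption described above.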

Fix

continuationText only has the offset count, not the streamed string, so it can't verify the prefix. This change guards the slice: the offset is treated as a split point only when it lands on a whitespace boundary in result_text; when it's misaligned, the full answer is posted instead (re-showing the already-streamed prefix is strictly preferable to emitting corrupted text). The aligned suffix-continuation path (live delivery cut off at a clean boundary) is unchanged.
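The guard described above could look roughly like the following. This is a sketch under assumptions, not the code in this PR: the names `isWhitespaceBoundary` and `guardedContinuation`, the function signature, and the exact definition of "whitespace boundary" (whitespace on either side of the offset, or an offset at or past the end of the text) are all hypothetical.

```typescript
// Assumption: an offset is "on a boundary" when the character just before it
// or the character at it is whitespace, or when it sits at/after the end.
function isWhitespaceBoundary(text: string, offset: number): boolean {
  if (offset >= text.length) return true;
  return /\s/.test(text.charAt(offset - 1)) || /\s/.test(text.charAt(offset));
}

// Hypothetical guarded variant of the slice in continuationText.
function guardedContinuation(text: string, offset: number): string {
  // Nothing was streamed (or the count is unusable): post the full answer.
  if (!Number.isInteger(offset) || offset <= 0) return text;

  // Misaligned offset: refuse to cut mid-token and repost the full answer.
  if (!isWhitespaceBoundary(text, offset)) return text;

  // Aligned offset: post only the missing suffix, as before.
  return text.slice(offset).trimStart();
}

const answer = "The market cap is large.";
console.log(guardedContinuation(answer, 11)); // "cap is large."
console.log(guardedContinuation(answer, 13)); // "The market cap is large."
```

The check is heuristic: a drifted offset that happens to land on whitespace still passes, so this reduces mid-token cuts without proving that the streamed prefix matches.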

Tests

Adds a regression test (offset lands mid-token → the full answer is posted, not a corrupted slice). The existing "posts only the missing suffix when live delivery was cut off" test still passes.

bun test src/centaur/final-delivery.test.ts  →  6 pass, 0 fail

Note

A more complete fix would plumb the streamed prefix text through (or compute the continuation server-side, where both strings are available) so the split point can be verified exactly rather than heuristically. Happy to follow up if you'd prefer that direction — this change is a minimal, safe guard against the user-visible corruption.
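As a rough illustration of that follow-up direction, the sketch below verifies the split point against the streamed prefix instead of trusting a count. Everything here is hypothetical: the `Continuation` type, the `continuationFromPrefix` name, and the assumption that the streamed text is available in the same string space as result_text are not part of this PR.

```typescript
// Hypothetical result type: either only the missing suffix, or the full answer.
type Continuation =
  | { kind: "suffix"; text: string }
  | { kind: "full"; text: string };

// Assumes streamedText has already been mapped into the same string space as
// resultText (no extra markdown boundary characters, same whitespace).
function continuationFromPrefix(resultText: string, streamedText: string): Continuation {
  if (streamedText.length > 0 && resultText.startsWith(streamedText)) {
    // The streamed text is an exact prefix, so the cut point is verified.
    return { kind: "suffix", text: resultText.slice(streamedText.length).trimStart() };
  }
  // The answer was revised or the strings drifted: repost everything.
  return { kind: "full", text: resultText };
}

const verified = continuationFromPrefix("The market cap is large.", "The market ");
const revised = continuationFromPrefix("The market cap is large.", "The markets ");
console.log(verified, revised);
```

Compared with the whitespace guard, this version also catches offsets that land on a clean boundary in a revised answer, at the cost of carrying the streamed string to wherever the continuation is computed.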

… offset lands mid-token

`continuationText` blind-slices the canonical `result_text` at
`slackbot_streamed_answer_chars`. That offset counts the *source* characters
streamed live, which is a different string-space than `result_text` — markdown
normalization, whitespace reflow, and post-hoc answer revisions all change
lengths, so the two drift. When the offset is misaligned the `text.slice(offset)`
lands mid-token and drops characters (observed in production as e.g. "market cap"
rendered as "marketountries", "defillama.com" as "vama.com").

We only have the count here, not the streamed string, so the prefix can't be
verified — but we can refuse to cut mid-token: only treat the offset as a split
point when it lands on a whitespace boundary in `result_text`; otherwise post the
full answer (re-showing the already-streamed prefix is strictly preferable to
emitting corrupted text). The aligned suffix-continuation path is unchanged.

Adds a regression test; existing final-delivery tests still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Zygimantass

Member

We've merged the Rust rewrite in #344, meaning the old API and Slackbot services are deprecated. As such we're closing this PR - if you think this is a mistake, please reopen the PR and ping me.
