Defend against ACP server reporting used > size #5

Merged
timvisher-dd merged 2 commits into main from timvisher-dd/tests/add-shell-usage-regression-tests
Mar 15, 2026
@timvisher-dd timvisher-dd commented Mar 15, 2026

Closes xenodium#364

The ACP server (claude-agent-acp) has a bug where switching models can cause used to exceed size in session/update notifications. For example, switching from Opus 1M to Sonnet 200k drops size to 200000 while used keeps growing past it (observed: 419574/200000 = 209.8%). The result is nonsensical context indicators and percentages, for example (from a real session):

 Context: 420k/200k (209.8%)
  Tokens: 32 in · 11k out · 11m cached (11m total)
    Cost: USD75.96
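
To make the arithmetic concrete, here is a minimal sketch of a naive percentage calculation. It is written in Python for illustration and is not the Emacs Lisp in agent-shell; the function name `naive_percent` is hypothetical. Only the 419574 and 200000 figures come from the session above, and the 1000000 window size is an assumption based on the "Opus 1M" model name.

```python
def naive_percent(used: int, size: int) -> float:
    """Return used/size as a percentage, with no sanity check on the inputs."""
    return round(used / size * 100, 1)


# Figures observed after the Opus 1M -> Sonnet 200k switch: the ratio exceeds 100%.
after_switch = naive_percent(419574, 200000)

# The same usage measured against an assumed 1,000,000-token window stays below 100%.
before_switch = naive_percent(419574, 1000000)

print(after_switch, before_switch)
```

Nothing in the calculation itself is wrong; the percentage only becomes meaningless because size shrank while used did not.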

While I intend to get a fix for this improper reporting into claude-agent-acp, agent-shell should be robust against it. To that end, when used exceeds size, the UI now signals unreliable data instead of showing nonsense:

  • Context indicator shows ? with warning face when used > size
  • Formatted usage shows raw numbers with (?) instead of a bogus percentage
  • A regression test replays the real observed ACP traffic from the model-switch scenario so this class of bug is caught going forward
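
The behavior in the bullets above could be sketched roughly as follows. This is an illustrative Python sketch, not the agent-shell implementation (which is Emacs Lisp); `usage_percent` and `format_context` are hypothetical names, the output format is simplified, and treating a non-positive size as unreliable is an added assumption to avoid division by zero.

```python
from typing import Optional


def usage_percent(used: int, size: int) -> Optional[float]:
    """Return the share of the context window in use, or None if the data is unreliable."""
    # Assumption: a non-positive size is also treated as unreliable.
    if size <= 0 or used > size:
        return None
    return used / size * 100


def format_context(used: int, size: int) -> str:
    """Show raw numbers plus a percentage, or "(?)" when no percentage can be trusted."""
    pct = usage_percent(used, size)
    if pct is None:
        return f"{used}/{size} (?)"
    return f"{used}/{size} ({pct:.1f}%)"


# Observed values from the model-switch scenario: flagged rather than shown as 209.8%.
print(format_context(419574, 200000))
# Illustrative (made-up) values where used <= size: a normal percentage is shown.
print(format_context(100000, 200000))
```

Returning `None` rather than clamping to 100% keeps the unreliable state visible to the caller, which matches the stated intent of signaling bad data instead of hiding it.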

This also adds comprehensive ERT test coverage for agent-shell-usage.el: notification updates, indicator scaling/colors, compaction replay, token saving, and number formatting.

For the claude-agent-acp side of this, see agentclientprotocol/claude-agent-acp#412.

Test plan

  • All 21 ERT tests pass
  • checkdoc clean on both files
  • byte-compile clean on both files
  • Manual baking verification

timvisher-dd and others added 2 commits March 15, 2026 16:00
Add comprehensive ERT tests for agent-shell-usage.el covering
notification updates, context indicator scaling/colors, compaction
replay, token saving, and number formatting.

The ACP server has a bug where model switches cause used to exceed
size in session/update notifications. Rather than clamping, signal
unreliable data: indicator shows ? with warning face, format shows
(?) instead of a bogus percentage. A regression test replays real
observed traffic from the Opus 1M -> Sonnet 200k switch scenario.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@timvisher-dd timvisher-dd marked this pull request as ready for review March 15, 2026 20:04
@timvisher-dd timvisher-dd merged commit 7f85cf4 into main Mar 15, 2026
2 checks passed
@timvisher-dd timvisher-dd deleted the timvisher-dd/tests/add-shell-usage-regression-tests branch March 15, 2026 20:04
timvisher-dd added a commit that referenced this pull request Mar 16, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
timvisher-dd added a commit that referenced this pull request Mar 17, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>