fix(junior): keep resume diagnostics cumulative#385
Closed
omribz156 wants to merge 2 commits into
Closed
Conversation
…es (getsentry#383) ## Summary The tool error handler (`handleToolExecutionError`) sent all non-MCP tool errors to Sentry via `logException`, even when the error was caused by invalid model input — missing files, bad regex, ambiguous edits, path traversal. This created noisy alerts for expected conditions like [JUNIOR-2Q](https://sentry.sentry.io/issues/7495355493/). This introduces `ToolInputError` as the single classification mechanism for expected model/user-caused tool failures. Domain code throws `ToolInputError` at the source where it knows the failure is input-caused, not a system error. The global error handler respects this: no `logException`, no Sentry noise. ## Architecture ``` Domain code throws ToolInputError → handleToolExecutionError → logWarn only → rethrow → Pi tool_error Domain code throws Error (system) → handleToolExecutionError → logException (Sentry) → rethrow MCP tool returns isError → McpToolError → logWarn only → rethrow → Pi tool_error ``` No ad-hoc error-type catches at the executor level. The domain function that creates the error owns the classification. ## Changes **New:** - `tool-input-error.ts` — `ToolInputError` class **Error handler:** - `tool-error-handler.ts` — skip `logException` for `ToolInputError`; replace `getMcpAwareErrorType` with `getToolErrorType()` covering both MCP and input error classification **Domain sources that now throw `ToolInputError`:** - `edit-file.ts` — ENOENT on `fs.readFile` → `ToolInputError("File not found: ...")` - `text-edits.ts` — all validation (not found, ambiguous, overlap, no-op, empty oldText, bad newText) - `file-utils.ts` — workspace path traversal - `list-dir.ts` — not-a-directory - `grep.ts` — invalid regex pattern - `sandbox.ts` — missing required arguments (path, edits, pattern, command) **What stays as system errors (still reports to Sentry):** - Sandbox API failures (`throwSandboxOperationError`) - Slack API errors - Network/connection failures - Non-ENOENT filesystem errors - `unsupported sandbox tool` (config error) ## Tests - **`tool-error-handler.test.ts`** (new) — system errors call `logException`; `ToolInputError` and `McpToolError` do not - **`file-tools.test.ts`** — `ToolInputError` for ambiguous edits, text not found, path traversal, invalid regex, file-as-directory - **`sandbox-executor.test.ts`** — editFile on missing path throws `ToolInputError` ## Verification - `pnpm typecheck` ✅ - 51/51 tests passing ✅ - Lint + prettier ✅ Fixes [JUNIOR-2Q](https://sentry.sentry.io/issues/7495355493/) Requested-By: David Cramer <dcramer> Co-authored-by: Junior <junior[bot]@sentry.io> Co-authored-by: Claude (anthropic/claude-opus-4.6) <[email protected]>
Track cumulative duration and token usage across paused turn checkpoints, then use those totals when rendering timeout-resume Slack footers. Timeout continuation notices also now use the normal Slack footer shape so the conversation ID stays traceable. Co-Authored-By: GPT-5 Codex <[email protected]>
|
@omribz156 is attempting to deploy a commit to the Sentry Team on Vercel. A member of the Team first needs to authorize it. |
Member
|
Superseded by #388, which keeps the timeout diagnostics changes and adds the broader continuation/failure recovery work. |
dcramer
added a commit
that referenced
this pull request
May 21, 2026
Persist safe running checkpoints at Pi-continuable boundaries so interrupted turns retain a previous resumable state without checkpointing mid-tool-call. Treat interrupted sandbox command streams as failed bash results that the agent can inspect or retry, and add eval fault injection to prove the recovery path. Supersedes GH-385 Co-Authored-By: GPT-5 Codex <[email protected]>
dcramer
added a commit
that referenced
this pull request
May 21, 2026
Preserve the timeout diagnostics work from #385 and broaden the continuation recovery strategy for interrupted turns. Junior now writes safe in-progress checkpoints at Pi-continuable boundaries, posts consistent Slack continuation notices, and treats interrupted sandbox command streams as recoverable bash results instead of terminal agent failures. **Safe Continuation Boundaries** Running checkpoints are persisted only when Pi can legally continue from the transcript, such as after the user message or after tool results are recorded. This keeps progress available without checkpointing unsafe mid-tool-call state. **Sandbox Stream Recovery** A sandbox command stream that ends before a command finishes now returns a failed bash result with guidance to inspect or retry. The agent can continue from that durable tool result rather than failing the whole turn. **Fault-Injection Coverage** The eval harness can inject sandbox bash stream interruptions, and the lifecycle eval proves the real agent recovers by retrying and finishing the request. Supersedes #385 --------- Co-authored-by: Omri SirComp <[email protected]> Co-authored-by: GPT-5 Codex <[email protected]> Co-authored-by: GPT-5 Codex <[email protected]>
Member
|
Thanks! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #384
Summary
Verification
pnpm install --frozen-lockfilepnpm --filter @sentry/junior exec vitest run tests/unit/services/turn-checkpoint.test.ts tests/unit/handlers/oauth-resume.test.tspnpm --filter @sentry/junior exec tsc --noEmit --pretty falsepnpm --filter @sentry/junior lintpnpm --filter @sentry/junior buildgit diff --checkImplemented with Codex assistance, then manually reviewed and kept to the pause/resume diagnostics path.