feat(contracts): structured diagnostics for agent connection tests#2294
feat(contracts): structured diagnostics for agent connection tests#2294EthanGuo-coder wants to merge 4 commits into
Conversation
Builds the diagnostics envelope on every agent-mode response and exposes it through a new `od agent test <agentId>` subcommand for CLI/MCP consumers.
| } | ||
| const rest = args.slice(1); | ||
| const flags = parseFlags(rest, { string: AGENT_STRING_FLAGS, boolean: AGENT_BOOLEAN_FLAGS }); | ||
| const agentId = rest.find((a) => !a.startsWith('-')); |
There was a problem hiding this comment.
runAgent() re-scans rest with rest.find((a) => !a.startsWith(-)), but parseFlags() does not remove values for --daemon-url, --model, or --reasoning. In this diff that means od agent test --daemon-url http://127.0.0.1:7456 codex picks the URL as agentId, and od agent test --model gpt-4 sends gpt-4 as the agent id instead of failing for a missing positional. That breaks a common flag ordering and turns a required-argument error into an incorrect request. Please derive positionals with a helper that skips string-flag values (like positionalArgs() later in this file), or require <agentId> in a fixed slot before parsing flags, and add CLI coverage for both flag-before-positional and missing-agent-id cases.
There was a problem hiding this comment.
| ): string | null { | ||
| if (typeof raw !== 'string') return null; | ||
| const stripped = raw.replace(ANSI_ESCAPE_RE, '').replace(/\r/g, ''); | ||
| const redacted = redactSecrets(stripped).trim(); |
There was a problem hiding this comment.
sanitizeStderrExcerpt() now feeds every agent-mode stderrExcerpt, but it only calls redactSecrets(stripped) with no exact secret values. redactSecrets() redacts Bearer/header/query-string patterns, so stderr like Error: invalid api key sk-secret still survives unchanged and then gets returned from /api/test/connection and printed by od agent test. That leaks raw credentials from a failing CLI into API responses and terminal output. Please extend this sanitization path to redact plain-token forms as well (or pass known agent secret values into buildAgentDiagnostics()), and add a regression test that uses a bare sk-... token in stderr.
Promotes positionalArgs to a module-level helper so callers no longer mistake string-flag values (e.g. --daemon-url's URL) for positional arguments. Co-Authored-By: Claude <noreply@anthropic.com>
| ): string | null { | ||
| if (typeof raw !== 'string') return null; | ||
| const stripped = raw.replace(ANSI_ESCAPE_RE, '').replace(/\r/g, ''); | ||
| const redacted = redactSecrets(stripped).trim(); |
There was a problem hiding this comment.
sanitizeStderrExcerpt() still only runs stderr through redactSecrets(), which redacts Bearer/header/query-string forms but not bare tokens. In this diff the new integration test feeds Error: authentication failed (sk-secret) into stderr, yet the assertions only check ANSI stripping, so diagnostics.stderrExcerpt can still return the raw token to /api/test/connection callers and od agent test --json. That is a real secret-exposure path in the feature this PR adds. Please extend this sanitization path to redact plain token forms as well (or pass known exact secret values into buildAgentDiagnostics()), and tighten the new regression to assert that sk-secret is absent from the surfaced excerpt.
| configuredExecutablePath: executableResolution.configuredOverridePath, | ||
| detectedExecutablePath: executableResolution.pathResolvedPath, | ||
| usedExecutablePath: executableResolution.launchPath ?? executableResolution.pathResolvedPath, | ||
| usedExecutablePath: usedPath, |
There was a problem hiding this comment.
In the fallback_failed success branch, usedPath still comes from the primary executableResolution.launchPath ?? executableResolution.pathResolvedPath. When a configured CODEX_BIN fails and the retry without that override succeeds, launchPath still points at the failing configured binary, so this branch can report the wrong usedExecutablePath and propagate the same wrong value into diagnostics.binaryPath. That breaks the core diagnostic this PR is introducing by telling users the successful probe ran against the binary that actually failed. Please source the final path from the fallback attempt (fallbackDiag.binaryPath or a fresh fallback resolution) and add a regression that exercises configured override fails -> PATH fallback succeeds and asserts both usedExecutablePath and diagnostics.binaryPath point at the PATH binary.
Closes #2248
Why
Agent connection failures surfaced no structured diagnostics, leaving users with no actionable path forward when a binary was missing, a token was stale, or a model name was wrong.
What users will see
POST /api/test/connectionresponse now carries a structureddiagnosticsobject on every agent-mode result (success or failure):agentId,agentName,phase,binaryPath,binaryVersion, sanitizedstderrExcerpt(ANSI-stripped, secrets-redacted, ≤500 chars), and an array ofrecoveryHints.od agent test <agentId> [--model M] [--reasoning R] [--json]subcommand drives the same probe the Settings dialog uses. Default output is a human summary (kind, agent name, phase, recovery hints);--jsonemits the full envelope so external agents and pipelines canjq .diagnostics.recoveryHints.ConnectionTestResponse.diagnosticsis optional and provider-mode responses still omit it. Existingkind/detail/usedExecutablePathfields are unchanged.Surface area
apps/weborapps/desktop(including Electron menu bar)odsubcommand or flag, newtools-dev/tools-pack/tools-prflag, or newOD_*env var/api/*endpoint, new SSE event, or changed shape inpackages/contractsskills/,design-systems/,design-templates/, orcraft/, or change to the skills protocolTRANSLATIONS.mdfor the locale workflow)package.json(dependenciesordevDependencies); workspace-packagepackage.jsonfiles are out of scope. Include a paragraph on what we get vs. what bytes we ship (seeCONTRIBUTING.md→ Code style)Screenshots
N/A — this PR ships the daemon contract and CLI surface only. The web render of
diagnostics.recoveryHintsinSettingsDialog.tsxis intentionally deferred to a follow-up so the i18n key matrix lands in lockstep with the UI strings.Bug fix verification
N/A — this is a
type/featurePR (additive contract field plus new CLI subcommand), not a bug fix.Validation
pnpm guardpnpm --filter @open-design/contracts buildpnpm --filter @open-design/daemon typecheckpnpm --filter @open-design/daemon test -- tests/connection-test-diagnostics.test.ts— 17 tests, all green; coverssanitizeStderrExcerptANSI stripping / length cap / secret redaction,recoveryHintsForper-phase keying,buildAgentDiagnosticspurity, and end-to-end via fake-codex / fake-claude binaries (success path, spawn-failure path, auth-required path, model-listing path, unknown-agent path, and the codex fallback path that reattaches the envelope against the finalusedExecutablePath).Follow-up
recoveryHintssection + "Copy diagnostics" button toSettingsDialog.tsxand ship the matching keys across all 18 locales.binaryVersioncapture. Reserved on the contract asstring | null; ships asnullin this PR. The naïve in-line<binary> --versionprobe added ~3 s of serialized latency per test and blew the existingtests/connection-test.test.ts > hard-cancels aborted agent probes…10 s budget; wiring it from the cacheddetectAgentsresult is non-trivial (crosses theapps/daemon↔packages/contractsboundary) and belongs in its own PR.redactSecretsdoes not currently scrub/Users/<name>/.... Adding a home-path scrubber here would risk false-positive redaction of legitimately-surfaced Codex install paths underusedExecutablePath. Deferred until we can scope the redactor properly.od agent test--codex-bin/agentCliEnvforwarding. The new subcommand does not yet forward per-user CLI env overrides (e.g.CODEX_BIN) in the request body, so the CLI cannot fully reproduce a connection test that depends on the Settings dialog's per-CLI env overrides. The daemon contract field is optional, so this is non-blocking for v1; flagged as a small follow-up.