
feat(contracts): structured diagnostics for agent connection tests#2294

Open
EthanGuo-coder wants to merge 4 commits into
nexu-io:mainfrom
EthanGuo-coder:feat/issue-2248

Conversation

@EthanGuo-coder
Contributor

Closes #2248

Why

Agent connection failures surfaced no structured diagnostics, leaving users with no actionable path forward when a binary was missing, a token was stale, or a model name was wrong.

What users will see

  • The daemon's POST /api/test/connection response now carries a structured diagnostics object on every agent-mode result (success or failure): agentId, agentName, phase, binaryPath, binaryVersion, sanitized stderrExcerpt (ANSI-stripped, secrets-redacted, ≤500 chars), and an array of recoveryHints.
  • A new od agent test <agentId> [--model M] [--reasoning R] [--json] subcommand drives the same probe the Settings dialog uses. Default output is a human summary (kind, agent name, phase, recovery hints); --json emits the full envelope so external agents and pipelines can jq .diagnostics.recoveryHints.
  • No behavior change for existing callers — ConnectionTestResponse.diagnostics is optional and provider-mode responses still omit it. Existing kind / detail / usedExecutablePath fields are unchanged.
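For readers consuming the new field, the envelope described above might be modeled roughly as follows. This is a sketch inferred from the field list in this description, not the actual definitions in packages/contracts: the interface names, the `string` type for `phase`, the nullable `binaryPath`, and the plain-string `recoveryHints` elements are all assumptions.

```typescript
// Hypothetical shape inferred from the PR description; the real contract
// types in packages/contracts may differ.
interface AgentDiagnosticsSketch {
  agentId: string;
  agentName: string;
  phase: string;                // assumption: a string identifier for the probe phase
  binaryPath: string | null;    // assumption: null when no binary was resolved
  binaryVersion: string | null; // reserved; ships as null in this PR
  stderrExcerpt: string | null; // ANSI-stripped, secrets-redacted, at most 500 chars
  recoveryHints: string[];      // assumption: hints are plain strings
}

interface ConnectionTestResponseSketch {
  kind: string;
  detail?: string;
  usedExecutablePath?: string | null;
  diagnostics?: AgentDiagnosticsSketch; // optional; provider-mode responses omit it
}

// Existing callers keep working because the field is optional:
// a missing envelope simply yields no hints.
function recoveryHintsOf(res: ConnectionTestResponseSketch): string[] {
  return res.diagnostics?.recoveryHints ?? [];
}
```

Because `diagnostics` is optional, a consumer that reads it through optional chaining handles provider-mode and agent-mode responses with the same code path.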

Surface area

  • UI — new page / dialog / panel / menu item / setting / empty state in apps/web or apps/desktop (including Electron menu bar)
  • Keyboard shortcut — new or changed
  • CLI / env var — new od subcommand or flag, new tools-dev / tools-pack / tools-pr flag, or new OD_* env var
  • API / contract — new /api/* endpoint, new SSE event, or changed shape in packages/contracts
  • Extension point — new entry under skills/, design-systems/, design-templates/, or craft/, or change to the skills protocol
  • i18n keys — added new translation keys (see TRANSLATIONS.md for the locale workflow)
  • New top-level dependency — adding any new entry to the root package.json (dependencies or devDependencies); workspace-package package.json files are out of scope. Include a paragraph on what we get vs. what bytes we ship (see CONTRIBUTING.md → Code style)
  • Default behavior change — changes what existing users experience without opting in (default model, default setting, file/SQLite schema, auto-network on startup, auto-install)
  • None — internal refactor, docs, tests, or translation update only

Screenshots

N/A — this PR ships the daemon contract and CLI surface only. The web render of diagnostics.recoveryHints in SettingsDialog.tsx is intentionally deferred to a follow-up so the i18n key matrix lands in lockstep with the UI strings.

Bug fix verification

N/A — this is a type/feature PR (additive contract field plus new CLI subcommand), not a bug fix.

Validation

  • pnpm guard
  • pnpm --filter @open-design/contracts build
  • pnpm --filter @open-design/daemon typecheck
  • pnpm --filter @open-design/daemon test -- tests/connection-test-diagnostics.test.ts — 17 tests, all green; covers sanitizeStderrExcerpt ANSI stripping / length cap / secret redaction, recoveryHintsFor per-phase keying, buildAgentDiagnostics purity, and end-to-end via fake-codex / fake-claude binaries (success path, spawn-failure path, auth-required path, model-listing path, unknown-agent path, and the codex fallback path that reattaches the envelope against the final usedExecutablePath).

Follow-up

  • Web render. A follow-up PR will add a recoveryHints section + "Copy diagnostics" button to SettingsDialog.tsx and ship the matching keys across all 18 locales.
  • binaryVersion capture. Reserved on the contract as string | null; ships as null in this PR. The naïve in-line <binary> --version probe added ~3 s of serialized latency per test and blew the existing tests/connection-test.test.ts > hard-cancels aborted agent probes… 10 s budget; wiring it from the cached detectAgents result is non-trivial (crosses the apps/daemon ↔ packages/contracts boundary) and belongs in its own PR.
  • Home-directory path scrubbing. The issue body asks for "remove any path segments that could reveal sensitive home directory structure"; the existing redactSecrets does not currently scrub /Users/<name>/.... Adding a home-path scrubber here would risk false-positive redaction of legitimately-surfaced Codex install paths under usedExecutablePath. Deferred until we can scope the redactor properly.
  • od agent test --codex-bin / agentCliEnv forwarding. The new subcommand does not yet forward per-user CLI env overrides (e.g. CODEX_BIN) in the request body, so the CLI cannot fully reproduce a connection test that depends on the Settings dialog's per-CLI env overrides. The daemon contract field is optional, so this is non-blocking for v1; flagged as a small follow-up.

Builds the diagnostics envelope on every agent-mode response and exposes it through a new `od agent test <agentId>` subcommand for CLI/MCP consumers.
@lefarcen requested a review from nettee on May 19, 2026 17:35
@lefarcen added the size/L (PR changes 300-700 lines), risk/high (High risk: apps/desktop, daemon, auth, migration, workflows, package deps), and type/feature (New feature) labels on May 19, 2026
@lefarcen requested a review from qiongyu1999 on May 19, 2026 17:37
Contributor

@nettee nettee left a comment

Found two blocking issues in the new od agent test / diagnostics path.

🔁 Powered by Looper · runner=reviewer · agent=opencode · An autonomous AI dev team for your GitHub repos.

Comment thread apps/daemon/src/cli.ts Outdated
}
const rest = args.slice(1);
const flags = parseFlags(rest, { string: AGENT_STRING_FLAGS, boolean: AGENT_BOOLEAN_FLAGS });
const agentId = rest.find((a) => !a.startsWith('-'));
Contributor

runAgent() re-scans rest with rest.find((a) => !a.startsWith('-')), but parseFlags() does not remove values for --daemon-url, --model, or --reasoning. In this diff that means od agent test --daemon-url http://127.0.0.1:7456 codex picks the URL as agentId, and od agent test --model gpt-4 sends gpt-4 as the agent id instead of failing for a missing positional. That breaks a common flag ordering and turns a required-argument error into an incorrect request. Please derive positionals with a helper that skips string-flag values (like positionalArgs() later in this file), or require <agentId> in a fixed slot before parsing flags, and add CLI coverage for both flag-before-positional and missing-agent-id cases.

Contributor Author

@nettee Promoted positionalArgs to a module-level helper and switched runAgent plus the ten sibling subcommand call-sites that shared the same rest.find(...) pattern over to it, with CLI coverage for both the flag-before-positional and missing-agent-id paths. b6400555
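The behavior discussed in this thread can be illustrated with a minimal sketch. It is not the repository's positionalArgs(); the function name, signature, and the handling of `--flag=value` and short flags are assumptions made for illustration. Only the flag names (`daemon-url`, `model`, `reasoning`) come from the review comment.

```typescript
// Hypothetical helper: collects positional arguments while skipping the
// value that follows each string-valued flag.
function positionalArgsSketch(
  args: readonly string[],
  stringFlags: ReadonlySet<string>,
): string[] {
  const out: string[] = [];
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === undefined) continue;
    if (arg.startsWith('--')) {
      // "--flag=value" carries its value inline; "--flag value" consumes
      // the next argument, so step over it.
      const name = arg.slice(2);
      if (!arg.includes('=') && stringFlags.has(name)) i++;
      continue;
    }
    if (arg.startsWith('-')) continue; // assumption: short flags take no value
    out.push(arg);
  }
  return out;
}

const AGENT_STRING_FLAGS_SKETCH = new Set(['daemon-url', 'model', 'reasoning']);
```

With this approach, `--daemon-url http://127.0.0.1:7456 codex` yields `codex` as the only positional, and `--model gpt-4` alone yields none, so the caller can report a missing `<agentId>` instead of sending the model name as the agent id.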

): string | null {
if (typeof raw !== 'string') return null;
const stripped = raw.replace(ANSI_ESCAPE_RE, '').replace(/\r/g, '');
const redacted = redactSecrets(stripped).trim();
Contributor

sanitizeStderrExcerpt() now feeds every agent-mode stderrExcerpt, but it only calls redactSecrets(stripped) with no exact secret values. redactSecrets() redacts Bearer/header/query-string patterns, so stderr like Error: invalid api key sk-secret still survives unchanged and then gets returned from /api/test/connection and printed by od agent test. That leaks raw credentials from a failing CLI into API responses and terminal output. Please extend this sanitization path to redact plain-token forms as well (or pass known agent secret values into buildAgentDiagnostics()), and add a regression test that uses a bare sk-... token in stderr.
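One way to picture the pattern-based option requested here is the sketch below. It is an assumption-laden illustration, not the repository's code: `ANSI_RE`, `BARE_TOKEN_RE`, the `[REDACTED]` placeholder, and the function name are hypothetical, and the real ANSI_ESCAPE_RE and redactSecrets() may behave differently. The 500-character cap comes from the PR description.

```typescript
// Hypothetical patterns, not the repository's ANSI_ESCAPE_RE or redactSecrets().
// CSI sequences: ESC [ parameter bytes, intermediate bytes, one final byte.
const ANSI_RE = /\x1b\[[0-?]*[ -\/]*[@-~]/g;
// Assumption: bare tokens look like "sk-" followed by at least 4 token characters.
const BARE_TOKEN_RE = /\bsk-[A-Za-z0-9_-]{4,}/g;
const MAX_EXCERPT_LEN = 500;

function sanitizeExcerptSketch(raw: unknown): string | null {
  if (typeof raw !== 'string') return null;
  // Strip terminal escapes and carriage returns first, so that redaction
  // sees the same text a user would read.
  const stripped = raw.replace(ANSI_RE, '').replace(/\r/g, '');
  const redacted = stripped.replace(BARE_TOKEN_RE, '[REDACTED]').trim();
  // Cap after redaction so a token cannot be cut in half and slip past the pattern.
  return redacted.slice(0, MAX_EXCERPT_LEN);
}
```

A regression test along the lines the review asks for would feed a bare `sk-...` token through this path and assert that the token is absent from the result.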

Promotes positionalArgs to a module-level helper so callers no longer mistake string-flag values (e.g. --daemon-url's URL) for positional arguments.

Co-Authored-By: Claude <noreply@anthropic.com>
Contributor

@nettee nettee left a comment

Two blocking issues remain in the new diagnostics path.

): string | null {
if (typeof raw !== 'string') return null;
const stripped = raw.replace(ANSI_ESCAPE_RE, '').replace(/\r/g, '');
const redacted = redactSecrets(stripped).trim();
Contributor

sanitizeStderrExcerpt() still only runs stderr through redactSecrets(), which redacts Bearer/header/query-string forms but not bare tokens. In this diff the new integration test feeds Error: authentication failed (sk-secret) into stderr, yet the assertions only check ANSI stripping, so diagnostics.stderrExcerpt can still return the raw token to /api/test/connection callers and od agent test --json. That is a real secret-exposure path in the feature this PR adds. Please extend this sanitization path to redact plain token forms as well (or pass known exact secret values into buildAgentDiagnostics()), and tighten the new regression to assert that sk-secret is absent from the surfaced excerpt.
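The alternative mentioned in this comment, passing known exact secret values into the sanitizer, might look like the following sketch. The function name, the `[REDACTED]` placeholder, and the idea of passing secrets as a string array are assumptions; how buildAgentDiagnostics() would actually receive them is not shown in this PR.

```typescript
// Hypothetical exact-value redactor: replaces every occurrence of each
// known secret, regardless of the surrounding format.
function redactExactValuesSketch(text: string, secrets: readonly string[]): string {
  // Longest first, so a secret that contains a shorter one is replaced whole.
  const ordered = [...secrets].sort((a, b) => b.length - a.length);
  let out = text;
  for (const secret of ordered) {
    if (secret.length === 0) continue; // splitting on '' would corrupt the text
    out = out.split(secret).join('[REDACTED]');
  }
  return out;
}

// The tightened regression would assert absence of the raw token:
const surfaced = redactExactValuesSketch(
  'Error: authentication failed (sk-secret)',
  ['sk-secret'],
);
```

Exact-value redaction does not depend on the token having a recognizable prefix, but it only covers secrets the daemon already knows about, so it complements pattern-based redaction rather than replacing it.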

  configuredExecutablePath: executableResolution.configuredOverridePath,
  detectedExecutablePath: executableResolution.pathResolvedPath,
- usedExecutablePath: executableResolution.launchPath ?? executableResolution.pathResolvedPath,
+ usedExecutablePath: usedPath,
Contributor

In the fallback_failed success branch, usedPath still comes from the primary executableResolution.launchPath ?? executableResolution.pathResolvedPath. When a configured CODEX_BIN fails and the retry without that override succeeds, launchPath still points at the failing configured binary, so this branch can report the wrong usedExecutablePath and propagate the same wrong value into diagnostics.binaryPath. That breaks the core diagnostic this PR is introducing by telling users the successful probe ran against the binary that actually failed. Please source the final path from the fallback attempt (fallbackDiag.binaryPath or a fresh fallback resolution) and add a regression that exercises configured override fails -> PATH fallback succeeds and asserts both usedExecutablePath and diagnostics.binaryPath point at the PATH binary.
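The selection rule requested here can be sketched as a small pure function. The `ProbeAttemptSketch` type, the function name, and the example paths are hypothetical; the daemon's actual resolution objects (executableResolution, fallbackDiag) have shapes not shown in this thread.

```typescript
// Hypothetical record of one probe attempt.
interface ProbeAttemptSketch {
  binaryPath: string | null; // the path this attempt actually launched
  ok: boolean;               // whether the probe succeeded
}

// Report the path of the attempt that succeeded. When nothing succeeded,
// keep the primary path so the failure still names the binary that was tried.
function finalUsedPathSketch(
  primary: ProbeAttemptSketch,
  fallback?: ProbeAttemptSketch,
): string | null {
  if (primary.ok) return primary.binaryPath;
  if (fallback?.ok) return fallback.binaryPath;
  return primary.binaryPath;
}

// Configured override fails, PATH fallback succeeds (paths are made up):
const reported = finalUsedPathSketch(
  { binaryPath: '/opt/custom/codex', ok: false },
  { binaryPath: '/usr/local/bin/codex', ok: true },
);
```

Deriving both usedExecutablePath and diagnostics.binaryPath from one such value would keep the two fields from disagreeing in the fallback branch.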

Labels

risk/high (High risk: apps/desktop, daemon, auth, migration, workflows, package deps)
size/L (PR changes 300-700 lines)
type/feature (New feature)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve local agent connection test diagnostics

3 participants