Add MCP OAuth authentication flow #411

Open
timvisher-dd wants to merge 1 commit into agentclientprotocol:main from timvisher-dd:support-mcp-oauth-flow

@timvisher-dd timvisher-dd commented Mar 12, 2026

When ACP clients provide HTTP/SSE MCP servers without Authorization headers, the agent now detects servers that need authentication and triggers the Claude Code SDK's built-in OAuth flow automatically. Previously, MCP servers requiring OAuth would silently fail to connect.

Problem

MCP servers that require OAuth (e.g., Datadog, Atlassian) return 401 on connection. The SDK detects this and marks them as needs-auth, but claude-agent-acp never checked this status or triggered authentication. ACP clients had to work around this by performing OAuth themselves and injecting static Authorization headers at session creation time.

For example, we built agent-shell-mcp-oauth --- a full MCP OAuth 2.1 implementation in Emacs Lisp (RFC 9728 discovery, dynamic client registration, PKCE, keychain-backed token storage) --- as an add-on for the agent-shell ACP client, specifically to paper over this gap on the client side. Every ACP client that wants OAuth MCP servers currently has to implement something similar.

I think pushing OAuth onto every client is inconsistent with the ACP spec's intent: the client specifies MCP servers, and the agent handles connecting to them --- including auth. The spec could be clearer here, but the headers field on McpServerHttp reads as a convenience for static credentials, not the intended mechanism for OAuth. With this change, ACP clients can be truly agnostic about MCP auth --- they just provide the server URL and the agent handles the rest.

Solution

After query().initializationResult(), check mcpServerStatus() for servers with status === 'needs-auth'. For each, send the SDK's mcp_authenticate control message, which triggers:

  1. RFC 9728 Protected Resource Metadata discovery
  2. RFC 8414 Authorization Server Metadata discovery
  3. RFC 7591 Dynamic Client Registration
  4. RFC 7636 PKCE S256 authorization flow
  5. Localhost callback server for the OAuth redirect
  6. Token exchange and keychain storage
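
Step 4 happens inside the CLI, not in this patch, but for reviewers less familiar with PKCE, the S256 method from RFC 7636 can be sketched as below. This is a minimal illustration using Node's built-in crypto module; the function names are made up for this example and are not the CLI's (whose implementation is minified).

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical helper names, for illustration only.

/** Random code_verifier: 32 random bytes, base64url-encoded (43 characters, no padding). */
function makeCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

/** S256 code_challenge: BASE64URL(SHA256(ASCII(code_verifier))), per RFC 7636 section 4.2. */
function makeCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}
```

The client sends the challenge with the authorization request and the verifier with the token exchange, so an intercepted authorization code is useless without the verifier.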

The agent opens the user's browser for OAuth consent and polls mcpServerStatus() until the server transitions to connected or the 60-second per-server timeout elapses.
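
The detect, authenticate, and poll loop described in this section can be sketched roughly as follows. This is not the patch itself: QueryLike is a hand-written stand-in for the SDK's Query object, the serverName payload field and the one-second poll interval are assumptions, and only mcpServerStatus(), request(), mcp_authenticate, authUrl, requiresUserAction and the two status strings are taken from this description.

```typescript
// Stand-in types for this sketch; not the SDK's real type definitions.
type McpStatus = { name: string; status: string };

interface QueryLike {
  mcpServerStatus(): Promise<McpStatus[]>;
  // `serverName` is an assumed field name; the control message is undocumented.
  request(msg: { subtype: "mcp_authenticate"; serverName: string }): Promise<{
    authUrl?: string;
    requiresUserAction?: boolean;
  }>;
}

interface AuthOptions {
  openBrowser: (url: string) => Promise<void>;
  timeoutMs?: number; // defaults to the 60s per-server limit noted under Caveats
  pollIntervalMs?: number; // assumed value
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

/** Returns the names of servers that reached 'connected'. */
async function authenticateMcpServers(q: QueryLike, opts: AuthOptions): Promise<string[]> {
  const timeoutMs = opts.timeoutMs ?? 60_000;
  const pollIntervalMs = opts.pollIntervalMs ?? 1_000;
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const connected: string[] = [];

  const pending = (await q.mcpServerStatus()).filter((s) => s.status === "needs-auth");
  for (const server of pending) {
    const res = await q.request({ subtype: "mcp_authenticate", serverName: server.name });
    if (res.authUrl) {
      try {
        await opts.openBrowser(res.authUrl);
      } catch {
        // Headless or remote environment: log the URL and skip this server.
        console.error(`[MCP OAuth] Could not open browser; auth URL: ${res.authUrl}`);
        continue;
      }
    }
    // Poll until the server connects or the per-server deadline passes.
    const deadline = Date.now() + timeoutMs;
    while (Date.now() < deadline) {
      const current = (await q.mcpServerStatus()).find((s) => s.name === server.name);
      if (current?.status === "connected") {
        connected.push(server.name);
        break;
      }
      await sleep(pollIntervalMs);
    }
  }
  return connected;
}
```

Servers are handled one at a time here, which matches the sequential behavior listed under Future work as something to improve.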

How we got here: empirical investigation

This change is backed by extensive empirical testing against the Claude Code SDK and CLI internals. Here's the investigation trail:

Phase 1: Can the SDK handle MCP OAuth?

Hypothesis: The Claude Agent SDK already implements MCP OAuth. If we pass an MCP server config without headers, the SDK should handle auth.

Test: Built a headless ACP client (x.oauth-probe.mjs) that spawns claude-agent-acp, creates a session with Datadog MCP (no headers), and sends a prompt.

Result (with cached tokens): MCP tools loaded and worked. The SDK found cached OAuth tokens in the macOS keychain under Claude Code-credentials -> mcpOAuth and used them silently.

Result (with cache cleared): MCP tools did NOT load. No [OAUTH] events, no browser prompt. The SDK detected needs-auth but silently gave up.

Conclusion: The SDK uses cached tokens when available but does NOT autonomously trigger OAuth when called via query().

Phase 2: What does the SDK expose?

Test: Used query() directly (x.sdk-direct-probe.mjs) to call mcpServerStatus().

Result:

[mcp] name=datadog-mcp status=needs-auth error=none

The SDK correctly detects the 401 and reports needs-auth. But reconnectMcpServer() rejects with "Server status: needs-auth" --- it only works for servers that have already been authenticated.

Phase 3: Does the CLI have an auth control message?

Investigation: Searched the minified cli.js for control message subtypes. All identifiers below are minified names from the bundled CLI; we infer their purpose from usage context.

Discovery: The CLI handles these undocumented MCP auth control messages:

  • mcp_authenticate --- triggers OAuth discovery + PKCE + localhost callback
  • mcp_oauth_callback_url --- provides the callback URL manually
  • mcp_clear_auth --- clears stored tokens

The CLI has an internal OAuth provider class (minified as S_6, likely something like McpOAuthProvider) with a skipBrowserOpen flag that defaults to false. When false, the provider opens the browser itself via an internal cross-platform open function (minified as $Y, likely openUrl). There are two code paths that invoke this provider:

  1. TUI path (in-process): The React/Ink MCP settings components call the provider directly with skipBrowserOpen: true, get back the auth URL, and render it in the terminal UI for the user to interact with. The TUI never uses the control message protocol.
  2. Control message path (mcp_authenticate): The CLI's control message handler also calls the provider with skipBrowserOpen: true hardcoded, returning the URL as JSON. This is the path available to out-of-process callers like the SDK.

In both paths, skipBrowserOpen is true --- the browser is never opened by the CLI when auth is triggered programmatically. The TUI handles presentation in its React UI; this patch handles it by spawning the platform's URL opener (open on macOS, xdg-open on Linux, rundll32 url,OpenURL on Windows). The internal browser-opening function ($Y) is only used when the provider is called with the default skipBrowserOpen: false, which none of the current code paths actually do for MCP auth.

The SDK wrapper (sdk.mjs) doesn't expose mcp_authenticate in its TypeScript types, but the Query object's request() method can send arbitrary control messages to the CLI subprocess.

Phase 4: End-to-end validation

Test: x.sdk-auth-spike.mjs --- clear keychain, call query(), detect needs-auth, send mcp_authenticate, open browser, poll.

Result:

SDK MCP status: needs-auth
Sending mcp_authenticate...
Opening browser for OAuth consent...
polling: needs-auth
polling: needs-auth
polling: connected
MCP server connected! OAuth flow completed successfully.
tools (20): analyze_datadog_logs, create_datadog_notebook, ...

Full PKCE flow completed in ~12 seconds.

Phase 5: Patched ACP agent validation

Test: x.acp-patched-spike.mjs --- applied this patch, launched via ACP, created session with empty headers.

Result:

[MCP OAuth] Server "atlassian" needs auth, triggering OAuth flow...
[MCP OAuth] Opening browser for "atlassian"...
[MCP OAuth] Server "atlassian" connected!
[MCP OAuth] Server "datadog-mcp" needs auth, triggering OAuth flow...
[MCP OAuth] Opening browser for "datadog-mcp"...
[MCP OAuth] Server "datadog-mcp" connected!
Session created: ed593bc7-...

Both Datadog and Atlassian authenticated automatically. MCP tools worked in the subsequent prompt.

Architecture

ACP Client (Emacs/Zed)
  ↕ ACP JSON-RPC (session/new with mcpServers, empty headers)
claude-agent-acp  ← THIS PATCH
  ↕ SDK query() API
  │  mcpServerStatus()  →  detects 'needs-auth'
  │  request({subtype: 'mcp_authenticate'})  →  {authUrl, requiresUserAction}
  │  open(authUrl)  →  browser consent
  │  poll mcpServerStatus()  →  'connected'
  ↕ stream-json control messages
Claude Code CLI subprocess
  ↕ MCP OAuth (RFC 9728 discovery, PKCE, localhost callback)
  ↕ HTTP
MCP Server (Datadog, Atlassian, etc.)

Caveats

  • mcp_authenticate is undocumented: This control message is not in the SDK's public TypeScript types. It's used internally by the Claude Code CLI. Both the TUI and the control message handler call the CLI's internal OAuth provider with skipBrowserOpen: true, so the caller is always responsible for opening the browser or presenting the URL. See Phase 3 above for details. This message could change without notice; we use @ts-ignore and should add feature detection / fallback in a follow-up.
  • Local GUI vs headless/remote: Since mcp_authenticate returns the auth URL without opening a browser, we handle browser opening ourselves, mirroring the CLI's internal openUrl function (minified as $Y): respects the $BROWSER env var, uses rundll32 url,OpenURL on Windows, open on macOS, and xdg-open on Linux. If opening fails (headless environments, containers, SSH, CI), the auth URL is logged as an error and the server is skipped --- the client falls back to providing static Authorization headers at session/new. Supporting headless agents properly would require ACP protocol extensions (ExtRequest/ExtNotification) to surface auth URLs to the client, plus a mechanism for mid-session credential injection. That's orthogonal to this change.
  • Blocking session creation: OAuth consent blocks session/new for up to 60 seconds per server. This is acceptable for first-time auth (subsequent sessions use cached tokens) but could be improved with async notification to the ACP client.
  • Token refresh is not implemented for MCP servers: Investigation of the minified CLI code (cli.js) reveals that while refreshToken and expiresAt are stored in the keychain alongside accessToken, the CLI does not use grant_type=refresh_token for MCP server tokens. The CLI does have 401→refresh→retry logic for its own API calls (e.g., /api/oauth/claude_cli/client_data), but this code path is not wired up for MCP HTTP requests. When a cached MCP token expires, the server returns 401, and the CLI simply marks it needs-auth --- no refresh attempt, no retry. The ACP layer has no visibility into token state (only the binary needs-auth vs connected status), so it cannot paper over this. In practice, this means users will see a full browser re-auth flow every time their MCP tokens expire (typically ~1h15m for Datadog/Atlassian), rather than a silent background refresh. Compare with agent-shell-mcp-oauth, which proactively refreshes tokens 60 minutes before expiry and never requires re-login for cached servers. This needs an upstream SDK fix: the CLI should attempt grant_type=refresh_token before falling back to needs-auth.
  • Backward-compatible with client-side auth: ACP clients that already handle OAuth themselves and inject Authorization headers at session/new are unaffected. When valid headers are provided, the MCP server connects normally and mcpServerStatus() never returns needs-auth, so none of the new auth code is triggered.
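
The platform mapping from the "Local GUI vs headless/remote" caveat can be sketched as a pure function. The function and type names are hypothetical, and treating $BROWSER as taking precedence on every platform is an assumption; the description only says it is respected.

```typescript
// Hypothetical names; only the commands themselves come from the description above.
type OpenCommand = { command: string; args: string[] };

function browserCommand(
  url: string,
  platform: string,
  env: Record<string, string | undefined>,
): OpenCommand {
  // Assumption: a non-empty $BROWSER overrides the platform default everywhere.
  const browser = env["BROWSER"];
  if (browser) return { command: browser, args: [url] };

  switch (platform) {
    case "win32":
      return { command: "rundll32", args: ["url,OpenURL", url] };
    case "darwin":
      return { command: "open", args: [url] };
    default:
      // Linux and other Unix-likes; spawning fails where xdg-open is absent.
      return { command: "xdg-open", args: [url] };
  }
}
```

In the agent, the result would be passed to something like child_process.execFile(command, args) with process.platform and process.env as inputs. Keeping the URL as a separate argument avoids shell interpolation, and a failed spawn is the signal to log the URL and skip the server.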

Testing

Validated end-to-end, both with the cache cleared (fresh OAuth) and with cached tokens (automatic reconnection), against:

  • Datadog MCP (https://mcp.datadoghq.com/api/unstable/mcp-server/mcp)
  • Atlassian MCP (https://mcp.atlassian.com/v1/mcp)

Manual test driving is still in progress. I'll be doing this over the next few days, but I wanted to get an overall reaction to the changes as I go.

Future work

  • Async auth notification: Surface auth URL to ACP client via ExtNotification instead of blocking session creation
  • Feature detection: Gracefully handle missing mcp_authenticate with fallback to client-provided headers
  • Timeout configuration: Make the 60s auth timeout configurable
  • Parallel auth: Authenticate multiple servers concurrently instead of sequentially
  • Upstream: MCP token refresh: File an issue on the Claude Agent SDK requesting that the CLI use grant_type=refresh_token for MCP OAuth tokens before falling back to needs-auth. The refresh plumbing already exists for the CLI's own API calls; it just needs to be wired up for MCP server connections. Without this, users face a full browser re-auth every ~1h15m.
  • codex-acp: The codex-acp adapter likely has the same gap --- earlier analysis of its source showed it passes MCP headers through without checking auth status, but we haven't validated this end-to-end. If confirmed, it would need a similar implementation using the codex-rs SDK's equivalent APIs.
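
As a rough idea of what the "Timeout configuration" and "Parallel auth" items could look like together, here is a sketch. Nothing in it exists in the patch: authOne stands in for whatever per-server routine the agent ends up with, and whether the CLI tolerates several concurrent mcp_authenticate flows (each with its own localhost callback) is untested.

```typescript
// Hypothetical sketch; none of these names are part of the patch or the SDK.
type AuthResult = { server: string; ok: boolean; reason?: string };

/** Reject if `p` has not settled within `ms` milliseconds. */
function withTimeout<T>(p: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label}: timed out after ${ms}ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

/** Authenticate all servers concurrently; one failure does not abort the others. */
async function authenticateAll(
  servers: string[],
  authOne: (server: string) => Promise<void>,
  timeoutMs = 60_000, // current hard-coded limit, made configurable here
): Promise<AuthResult[]> {
  return Promise.all(
    servers.map(async (server): Promise<AuthResult> => {
      try {
        await withTimeout(authOne(server), timeoutMs, server);
        return { server, ok: true };
      } catch (e) {
        return { server, ok: false, reason: e instanceof Error ? e.message : String(e) };
      }
    }),
  );
}
```

With this shape the worst-case wait on session/new is one timeout rather than one per server, at the cost of opening several browser tabs at once.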

@cla-bot cla-bot bot added the cla-signed label Mar 12, 2026
@timvisher-dd timvisher-dd force-pushed the support-mcp-oauth-flow branch 3 times, most recently from d8e9b90 to a086305 March 12, 2026 17:39
@timvisher-dd timvisher-dd marked this pull request as ready for review March 12, 2026 17:46
@MrSubidubi MrSubidubi changed the title from "feat: Add MCP OAuth authentication flow" to "Add MCP OAuth authentication flow" Mar 12, 2026
@timvisher-dd timvisher-dd changed the title from "Add MCP OAuth authentication flow" to "feat: Add MCP OAuth authentication flow" Mar 12, 2026
@timvisher-dd timvisher-dd changed the title from "feat: Add MCP OAuth authentication flow" to "Add MCP OAuth authentication flow" Mar 12, 2026
After query() initialization, check mcpServerStatus() for HTTP/SSE
MCP servers in 'needs-auth' state and trigger the Claude Code CLI's
built-in OAuth flow via the mcp_authenticate control message.

The CLI handles the full PKCE flow: RFC 9728 discovery, dynamic client
registration, localhost callback server, token exchange, and keychain
storage.  The agent opens the user's browser for OAuth consent and
polls until the server transitions to 'connected'.

Browser opening mirrors the CLI's internal approach: respects $BROWSER,
uses rundll32 on Windows, open on macOS, xdg-open on Linux.  In
headless environments where opening fails, the auth URL is logged as
an error and the server is skipped gracefully.

Previously, MCP servers requiring OAuth would silently fail to connect
unless the ACP client pre-injected static Authorization headers.

Validated end-to-end with Datadog and Atlassian MCP servers.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>