feat(store/mcp): #7 — mcp_servers proxy substrate (mcp_register + mcp-query, leashed, creds-by-reference) by hartsock · Pull Request #58 · hartsock/modulex-mcp

hartsock · 2026-06-13T01:42:50Z

Refs #7 (PR D of the agent-state-store epic — the mcp_servers proxy piece only).

Presents registered downstream MCP servers as a single modulex surface, so a hotseat agent points its harness at modulex and never touches enterprise credentials.

What this PR does

Store (modulex-core/src/store.rs)

McpServer struct + mcp_register / mcp_servers / mcp_server / mcp_unregister over the already-reserved mcp_servers table (name PK, command, args_json, note, created_gen — generation-stamped, never wall-clock). Upsert by name.
Wired into StoreDump export/import (sovereignty). The export carries the invocation shape only — never a credential.

Tool (modulex-mcp/src/tools.rs)

mcp_register MCP tool (action = add | list | remove). It is a mutation, so it is a legitimate tool (per modulex's rule: tools for mutations, step types for capability).
Lands in a new opt-in mcp facet — NOT on the default tools/list, so DEFAULT_TOOL_BUDGET stays 12 (CI-pinned). It is discoverable + callable via the tool_search / tool_describe / tool_invoke trio, exactly like the existing board facet.

Step (modulex-core/src/steps/mcp_query.rs)

mcp-query step: spawns a registered downstream as a stdio MCP client (initialize → notifications/initialized → tools/call, newline-delimited JSON-RPC 2.0) over the existing single-shot Spawner seam — no new dependencies, lean (--no-default-features) build unaffected. Embeds only the tool result as a report section.

Security / trust model (this is the credential-proxy surface)

Leashed spawn. The downstream command runs through ExecGate::spawn, which calls agent-bridle's check_exec before any process exists (mcp_query.rs query() → cx.exec.spawn, denial arm at steps/mcp_query.rs:301-306). An mcp-query to a server whose command is not in the run's exec grant is denied, not run (state: "denied"). The store cannot silently widen the leash — a registered server's command must be granted: either a config mcp-query step declares command inline (so it joins the declared-default grant via required_programs, steps/mcp_query.rs:167-170 + :222-225) or it is listed in [caveats] exec.
Credentials by reference, never by value. The downstream's secrets are supplied through the step's env = { NAME = {env|file|cmd} } references, resolved at spawn time into Secrets (resolve_step_env, steps/mcp_query.rs:274) injected only into the child environment (ExecRequest.env). Secret is unserializable by construction, so it cannot enter the report data. Secrets never reach the store, an export, the report markdown/data, or error text.
Result only; stderr withheld. Only the parsed JSON-RPC result is embedded (steps/mcp_query.rs:320-331). stderr is never surfaced (it could echo a credential); only parsed JSON-RPC error messages are reported. The exec gate also scrubs any injected secret from captured stdout/stderr as defense-in-depth before this code sees the output (exec.rs:258-264).
Out of our control / out of scope: a downstream server that echoes a value it was passed in arguments back inside its own tool-result content. Those arguments come from the calling step config, not from stored credentials, so no modulex-held secret is involved.

Test plan

Coverage measured with cargo llvm-cov (just cov-ci), library code (binary/pyo3 entrypoints excluded): 95.03% lines total; on the new modules:

module	lines
`steps/mcp_query.rs` (the step)	96.46%
`store.rs` (incl. mcp_servers accessors)	96.73%
`tools.rs` (incl. `h_mcp_register`)	93.79%
`server.rs` (new integration tests)	96.89%

Tests across modulex's three tiers:

Mocked logic / pure: build_stdin, parse_call_response (result / JSON-RPC error / missing / chatty stdout), render_result; happy path, ungranted-command denial (asserts never spawned), timeout, missing store, missing credential (names the var, not the value), downstream error, and a credential-leak regression asserting a resolved {cmd} secret never appears in the serialized step result.
Real-seam: rides ExecGate::spawn, whose live leash + secret-scrub behavior is proven end-to-end in tests/gated_exec.rs.
Golden schema pinned for mcp-query (tests/golden/mcp-query.json); the data-contract conformance harness validates executed-step data against it.
Server-level round-trip via tool_invoke (register → list → remove) + an unlisted-but-discoverable assertion for the mcp facet.

Gates: just check (fmt + clippy -D warnings + tests) green; lean --no-default-features build green. Adds just cov / cov-ci and a CI coverage job enforcing an 80% line floor (mirrors the test gate; the pre-push hook note records that coverage is CI-only — full instrumentation is too slow per push). Example config gains a commented mcp-query block with the leash/credentials note.

Disclosure tiers: capability = mcp-query STEP (zero tool surface) + mcp_register MUTATION tool in the opt-in mcp facet (discovered, not listed; budget stays 12).

Out of scope

The rest of issue #7: reminders, countdowns, watches, iCal feeds, and the url-watch / ical-search steps (PRs A–C). This PR is only the mcp_servers proxy substrate (PR D). Per-generation result caching for deterministic replays (an open question on #7) is not implemented.

🤖 Generated with Claude Code

…-query) The credential-proxy piece of the agent-state-store epic (#7, PR D): present registered downstream MCP servers as a single modulex surface so a hotseat agent points its harness at modulex and never touches enterprise credentials. Store (modulex-core/store.rs): - McpServer struct + mcp_register/mcp_servers/mcp_server/mcp_unregister over the already-reserved mcp_servers table (name PK, command, args_json, note, created_gen — generation-stamped, never wall-clock). Upsert by name. - Wired into StoreDump export/import (sovereignty); the export carries the invocation shape only — never a credential. Tool (modulex-mcp/tools.rs): - mcp_register MCP tool (action add|list|remove) — a mutation, so a real tool. Lands in a new opt-in `mcp` facet: NOT on the default ≤12 tools/list budget (DEFAULT_TOOL_BUDGET stays 12, CI-pinned), discoverable + callable via the tool_search/tool_describe/tool_invoke trio, exactly like the board facet. Step (modulex-core/steps/mcp_query.rs): - mcp-query step: spawn a registered downstream as a stdio MCP client (initialize → notifications/initialized → tools/call over the existing single-shot Spawner seam — no new deps, lean build unaffected), embed ONLY the tool result as a report section. Security (this is the credential-proxy surface): - Leashed spawn: the downstream command goes through ExecGate::spawn → check_exec BEFORE any process; an ungranted command is DENIED, not run. The store cannot silently widen the leash — a registered server's command must be granted (inline `command` joins the declared-default grant via required_programs, or via [caveats] exec). - Credentials by reference: the downstream's secrets are resolved at spawn time into unserializable Secrets injected only into the child env; they never reach the store, an export, the report data/markdown, or error text. stderr is never surfaced (only parsed JSON-RPC error messages are). - Regression test asserts a resolved {cmd} credential value never appears in the serialized step result. Coverage: cargo llvm-cov on the new modules — mcp_query.rs 96.46% lines, store.rs 96.73%, tools.rs 93.79%, server.rs 96.89%; library total 95.03% lines (binary/pyo3 entrypoints excluded). Adds `just cov`/`cov-ci` and a CI `coverage` job enforcing an 80% floor, mirroring the test gate (hook parity note updated — coverage is CI-only, too slow pre-push). 3-tier tests: pure wire-shape (build_stdin/parse_call_response/render), mocked-spawner logic (happy path, denial, timeout, missing store/credential, downstream error, credential-leak regression), and server-level round-trip via tool_invoke. Golden schema pinned for the new step. Disclosure tiers: capability = mcp-query STEP (zero tool surface) + mcp_register MUTATION tool in the opt-in `mcp` facet (discovered, not listed). Refs #7. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… stderr/stdout body Adversarial review (DO-NOT-MERGE-AS-IS) found a real credential leak: a {cmd} credential helper (gh auth token, vault, pass, custom wrappers) that exits non-zero routinely echoes the very token it tried — on stdout AND stderr. credentials.rs surfaced the trimmed stderr verbatim into CredentialError::CommandFailed, which mcp_query embeds into the agent-visible step result. The exec-gate scrub can't catch it: during a credential's own resolution the secret is the command's OUTPUT, not in the child env, so the scrub loop is empty. Fix is in the SHARED resolver, so it closes the leak for all seven call sites at once (project/gitlab/script/python/github/board_ingest + the new mcp-query). Surface only the exit status, never the stream body. Adds the adversarial regression (failing gh-auth helper echoing a token → token absent from the error, exit status still reported). Everything else in #58 held under review: the exec leash (argv exec, no shell — injection-proof), store-can't-widen-the-grant, the result-echo scrub, Secret unserializability, the ≤12 tool budget, generation stamps. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

hartsock · 2026-06-13T02:02:20Z

Adversarial security review — verdict: DO-NOT-MERGE-AS-IS → one fix applied, now MERGE.

The review attacked credential-leak and exec-leash-bypass across every path. Everything held except one real leak, now fixed in commit d2bb58c:

The leak (Medium-High): a failing {cmd} secret helper (gh auth token, vault, pass) routinely echoes the token it tried on its own stderr/stdout; credentials.rs surfaced that body verbatim into CredentialError::CommandFailed, which mcp-query embeds into the agent-visible step result. The exec-gate scrub can't reach it (during resolution the secret is the command's output, not in the child env). Fix: surface only the exit status, never the stream body — in the shared resolver, so it closes the same latent leak for all six other steps too. Added the adversarial regression (failing helper echoing a token → token absent from the error).

Held under attack (evidence in the review): exec leash runs check_exec before any spawn; downstream is Command::new(prog).args() — direct argv, no shell, so args_json can't inject (; $() && pass as literal tokens); the store cannot widen the grant (grant computed from config only, never the store); the result-echo scrub replaces an echoed modulex secret with *** before embedding; Secret is unserializable-by-construction (the compile_fail doctest actually runs); StoreDump export carries invocation shape only; mcp_register is behind the mcp facet so the ≤12 default-tool budget stays pinned; created_gen is a generation counter, no wall-clock.

One pre-existing note (not a #58 blocker): agent-bridle's exec scope matches by basename, so a bare grant gh-mcp authorizes /path/gh-mcp — operators granting bare names should know. Worth a doc line in a follow-up.

Merging on green.

hartsock and others added 2 commits June 12, 2026 21:41

hartsock merged commit 942440a into main Jun 13, 2026
2 checks passed

hartsock deleted the feat/mcp-proxy-substrate branch June 13, 2026 02:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(store/mcp): #7 — mcp_servers proxy substrate (mcp_register + mcp-query, leashed, creds-by-reference)#58

feat(store/mcp): #7 — mcp_servers proxy substrate (mcp_register + mcp-query, leashed, creds-by-reference)#58
hartsock merged 2 commits into
mainfrom
feat/mcp-proxy-substrate

hartsock commented Jun 13, 2026

Uh oh!

hartsock commented Jun 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hartsock commented Jun 13, 2026

What this PR does

Security / trust model (this is the credential-proxy surface)

Test plan

Out of scope

Uh oh!

hartsock commented Jun 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant