Releases: plexara/mcp-test
mcp-test-v1.2.0
mcp-test v1.2.0 — Audit Inspection / Debugging Utility
This release closes #8 and ships the operator-facing inspection / debugging utility on top of the audit_payloads foundation that landed in v1.1.0. Three merged PRs (#10, #11, #12) layered the backend query / replay / streaming primitives, then the portal UI that consumes them.
+6240 / −111 across 45 files since v1.1.1 (which itself was the foundation cleanup release). Twelve commits on main.
Highlights
Click-to-expand audit drawer (4 tabs)
The portal's Audit page is now operator-grade. Click any row to open a side drawer with deep-linkable URL state (?id=<event-id>):
- Overview: timing, identity, request id, session id, source (`mcp` / `portal-tryit` / `portal-replay`), and `replayed_from` linkage with click-through.
- Request: captured `request_params` (sanitized via `audit.redact_keys`) and request headers when `audit.capture_headers: true`. Values of credential-bearing headers (`Authorization`, `Cookie`, `Set-Cookie`, `Proxy-Authorization`, `X-API-Key`) are stored as `[redacted]`; the header names remain visible so an operator can confirm "this request carried an Authorization header" without seeing the bearer.
- Response: full `CallToolResult` content blocks (text / image / audio / structured) plus `response_error` when the call errored. Truncation banners when the response exceeded `audit.max_payload_bytes`.
- Notifications: chronological list of every `notifications/*` (progress, log message) the tool dispatched during the call window. Tab strip count appends `+` when the captured list was truncated.
role="dialog" + aria-modal + auto-focus on close button + return-focus-on-close for assistive tech. Esc and backdrop close.
Side-by-side comparison
/portal/audit/compare?a=<id>&b=<id> renders a JSON-path-aware structural diff of two events. Walks objects and arrays by key/index so reordered keys don't masquerade as changes; one-side-undefined trees show per-key only-A / only-B leaves rather than a single (undefined) → {…} line; deep trees indent linearly via per-<ul> padding (no off-panel bleed). Summary, request_params, response_result, response_error, and notifications all render distinct panels.
Replay endpoint
POST /api/v1/portal/audit/events/{id}/replay re-invokes the captured tool call through an in-process MCP client. New audit row tagged source=portal-replay with replayed_from set; the new event fires with the portal-replay caller's identity, not the original caller's, so the audit row reflects who triggered the replay.
- Per-identity rate limit: 5 burst, one token refilled every 12 seconds (sustained 5/min), scoped by API-key id or OIDC subject. `429` with `Retry-After` when exhausted.
- Tokens consumed only after validation passes; clicks on non-replayable rows return `400` without burning the operator's budget.
- Refuses with `400` on: invalid UUID, missing event, no captured payload, redacted parameter values, unregistered tool.
- HTTP `502` on transport-level `callErr` OR tool-side `IsError`, mirroring `/admin/tryit` semantics.
- `error_category` precedence mirrors `pkg/mcpmw/audit.go` (auth → tool → handler) so `/events?error_category=tool` filters bucket replays alongside native tool calls.
- CSRF-gated via `X-Requested-With`.
The portal Replay button opens a confirmation modal that calls out the side-effect re-run; default focus is on Cancel so a reflexive Enter dismisses rather than fires. The button is disabled with a tooltip when the row is non-replayable (mirrors the server-side validation client-side via the [redacted] marker walk).
SSE live tail
GET /api/v1/portal/audit/stream opens a Server-Sent Events stream of new audit events as they're written.
- New `audit.SubscribingLogger` capability: `AsyncLogger` broadcasts after `inner.Log` succeeds; `MemoryLogger` broadcasts on every `Log`.
- Per-subscriber mutex serializes send-vs-cancel (race-tested).
- Atomic SSE frame write via `bytes.Buffer` (no half-formed frames on partial encode failure).
- Opening `: connected` comment + `: keepalive` every 30 seconds; sets `X-Accel-Buffering: no` for nginx-fronted deployments.
The portal Live tail toggle subscribes via fetch + ReadableStream (rather than EventSource, which can't carry custom headers — would lock out CLI / API-key callers). The client parser strips one optional leading space per W3C SSE, caps the line buffer at 1 MiB so a misbehaving producer can't OOM the tab, drops malformed events at parse time, surfaces 401 to the unauthorized handler without a per-stream banner flash, and reports server-close so the operator can re-enable.
New events land in a fixed-cap most-recent-first list (cap 20) above the table; the table itself stays a historical-filter view to avoid refetch-per-event storms under load.
JSONB path filters
/audit/events and /audit/export accept additional query parameters that compile to EXISTS (SELECT 1 FROM audit_payloads p WHERE p.event_id = audit_events.id AND p.<col> @> $N::jsonb) against the existing jsonb_path_ops GIN indexes:
- `?param.<dotted.path>=v`: request_params containment (dotted path → nested object literal).
- `?response.<dotted.path>=v`: response_result containment.
- `?header.<name>=v`: request_headers containment (header name canonicalized).
- `?has=<column>`: payload column non-empty (allowlisted; closed-switch gate against SQL injection).
Values are type-detected (true/false → bool, integer / float → number, else string; quote to force string). Header values are always strings. Filters are AND-combined with each other and with the indexed-column filters.
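To make the "dotted path → nested object literal" step and the value type detection concrete, here is a small sketch under stated assumptions. `detect` and `containment` are hypothetical names, and the real compiler may handle edge cases (escaping, arrays, empty values) differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// detect applies the rules described above: true/false become booleans,
// numeric text becomes a number, a double-quoted value is forced to a
// string, and anything else stays a string.
func detect(raw string) any {
	if len(raw) >= 2 && raw[0] == '"' && raw[len(raw)-1] == '"' {
		return raw[1 : len(raw)-1]
	}
	switch raw {
	case "true":
		return true
	case "false":
		return false
	}
	if i, err := strconv.ParseInt(raw, 10, 64); err == nil {
		return i
	}
	// ParseFloat also accepts "NaN" and "Inf", which JSON cannot represent.
	if f, err := strconv.ParseFloat(raw, 64); err == nil && !math.IsNaN(f) && !math.IsInf(f, 0) {
		return f
	}
	return raw
}

// containment wraps the detected value in one object per path segment,
// producing the JSON document a query could bind to `@> $N::jsonb`.
func containment(path, raw string) (string, error) {
	parts := strings.Split(path, ".")
	v := detect(raw)
	for i := len(parts) - 1; i >= 0; i-- {
		if parts[i] == "" {
			return "", fmt.Errorf("empty segment in path %q", path)
		}
		v = map[string]any{parts[i]: v}
	}
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	doc, err := containment("user.id", "42")
	fmt.Println(doc, err) // prints {"user":{"id":42}} <nil>
}
```

Because the value travels as a bound parameter rather than being spliced into SQL text, only the column name needs an allow-list, which matches the closed-switch gate mentioned for `?has=`.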
The portal exposes a JSONB filter editor that produces this exact syntax. The editor sources its has-keys list from a new endpoint (/audit/meta) so a server-side schema change doesn't require a UI redeploy.
NDJSON export
GET /api/v1/portal/audit/export?format=jsonl streams the filtered set as newline-delimited summary rows for offline analysis, ad-hoc ETL, or backups.
- Hard cap at 100,000 rows per request; truncates silently at the cap (the doc explicitly calls out the caveat and notes future versions may emit a sentinel; the operator must verify row count against the filter window).
- Per-row context check so client disconnects stop the stream promptly.
- `Cache-Control: no-store` and deferred `WriteHeader` so a backend error before the first row sends a clean 5xx instead of a half-written 200.
- Full JSONB filter contract supported: `?success=false&has=notifications&from=...` scopes a backfill cleanly.
Filter contract endpoint
GET /api/v1/portal/audit/meta returns:
```json
{
  "has_keys": [...],
  "json_sources": ["param", "response", "header"],
  "replay": {"burst": 5, "refill_secs": 12, "sustained_min": 5},
  "export": {"max_rows": 100000}
}
```

Lets a UI build its filter editor against the server's source of truth instead of duplicating allow-lists in client code. `sustained_min` is derived from `60 / refill_secs` (not coincidentally equal to `burst`), so a tuning change to the rate limit propagates automatically.
Inspection walkthrough doc
docs/operations/inspection.md is the operator-facing end-to-end: capture a call → open the drawer → read each tab → replay it → compare to a baseline → filter via JSONB paths → live-tail → export. Cross-referenced against the actual replayBurst / replayRefill / maxExportEvents constants. Header redaction policy and the post-validation token-consumption contract are called out explicitly.
Security & operator follow-ups
These landed alongside the feature work to make the surface safe to ship:
- Header redaction at the source. `auth.WithHeaders` now redacts credential-bearing names before stashing onto ctx, so `audit_payloads.request_headers` shows `[redacted]` rather than verbatim. Pre-existing leak (the comment claimed redaction; the implementation didn't); v1.2.0 was the first to put those bytes in front of UI users so the fix landed here. `pkg/auth.RedactHeaders` is now exported for reuse.
- Try-It payload capture. `recordTryitAudit` previously bypassed the MCP middleware and never built the `audit.Payload` sibling. Try-It rows landed with `payload=null`, and the audit drawer's Response/Notifications tabs correctly reported "No response captured", making it look like capture was disabled. Now mirrors `recordReplayAudit`: builds `*audit.Payload` with `RequestParams`, `ResponseResult`, `ResponseError`, with `errCategory` precedence aligned to the regular middleware.
- Empty-audit-log crash on fresh deployments. Go marshals nil `[]audit.Event` as JSON `null`; the SPA's `recent.map(...)` / `events.map(...)` crashed. Audit-store layer now initializes empty results as `[]T{}` so JSON marshals as `[]`. Belt-and-braces `?? []` on the client. New `TestMemoryLogger_EmptyResultsAreNotNil` enforces the invariant.
- Compare-stash cleared on signOut and on 401. `audit-compare-stash` no longer survives a session on a shared workstation.
- make-dev unblocked. Every `dev-*` Makefile target that touches docker-compose now declares `dev-secrets` as a prereq and sources `.env.dev` inline before invoking compose. (Compose interpolates `${MCPTEST_COOKIE_SECRET:?required}` at parse time on every invocation, and Make subshells lose env state.)
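The empty-audit-log crash comes down to a standard `encoding/json` behavior that is easy to reproduce in isolation. The `event` type below is a stand-in for illustration, not the project's `audit.Event`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is a stand-in for an audit row.
type event struct {
	ID string `json:"id"`
}

// marshal returns the JSON encoding of events as a string.
func marshal(events []event) string {
	b, _ := json.Marshal(events)
	return string(b)
}

func main() {
	var nilEvents []event    // nil slice, e.g. a query that appended nothing
	emptyEvents := []event{} // non-nil slice of length zero

	fmt.Println(marshal(nilEvents))   // prints null
	fmt.Println(marshal(emptyEvents)) // prints []
}
```

A client that calls `.map(...)` on the decoded value works with `[]` but fails on `null`, which is why the fix initializes empty results explicitly.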
Changes by area
Backend — pkg/audit
- New `SubscribingLogger` interface; `AsyncLogger` and `MemoryLogger` implementations with per-subscriber mutex.
- `MaxQueryLimit = 1000` exported as the single source of truth across Postgres + memory backends.
- Postgres `Store.Query` / `TimeSeries` / `Breakdown` and `MemoryLogger.Breakdown` initialize empty results as `[]T{}` (no nil).
- `JSONPathFilter` type compiles to JSONB containment; `IsAllowedHasKey` / `IsAllowedJSONSource` closed-switch gates with `AllowedHasKeysList()` / `All...
mcp-test-v1.1.1
Highlights
This release closes two subtasks from the audit-inspection roadmap (#8): the notification recorder and the PR #5 review-fix cleanup. Together with the bundled hardening they make the full request/response capture pipeline production-ready.
What you get:
- Server-initiated notifications now land in `audit_payloads.notifications`. Every `notifications/progress`, `notifications/message`, and any future `notifications/*` your tools dispatch via `req.Session.NotifyProgress` / `LogMessage` is captured during the call window and stored alongside the request and response. The portal will surface these in the upcoming inspection drawer; today they're queryable directly via `GET /api/v1/portal/audit/events/{id}`.
- Redaction extended to notifications. `audit.redact_keys` now applies to notification params, not just tool params. A tool that emits a token in a progress message no longer bypasses the operator's redact list.
- Notifications are byte-bounded. A `LogMessage` with a large `data` blob can't blow past `audit.max_payload_bytes`; the captured slice is trimmed from the tail with `notifications_truncated` set so operators can tell.
- Content annotations preserved. `audit_payloads.response_result.content[]` round-trips image, audio, embedded resource, and resource-link blocks through their SDK MarshalJSON, so annotations on those blocks survive into the captured payload (previously they were dropped to `{type: unknown}`).
- `error_category` is consistent. The indexed `audit_events.error_category` and the detail row's `response_error.category` now always agree across `auth`, `tool`, and `handler` errors. The portal can filter on the indexed column without string-matching against the message.
- CI tooling pinned in `make verify`. Local lint and security checks now install and use the exact versions CI runs (`golangci-lint v2.11.4`, `gosec v2.25.0`) into `bin/tools/` instead of relying on `$PATH`. A pre-merge `make verify` will no longer green-light something that fails CI minutes later.
Breaking changes
None at the API or wire level. The schema cleanup migration drops three columns (jsonrpc_id, request_method, request_path) that the audit middleware never populated since v1.1.0; if you had ad-hoc tooling reading those, they were always empty.
Schema migration
A new migration 0003_audit_payloads_cleanup runs on first start after upgrade. It:
- `DROP COLUMN IF EXISTS jsonrpc_id`
- `DROP COLUMN IF EXISTS request_method`
- `DROP COLUMN IF EXISTS request_path`
- `ADD COLUMN IF NOT EXISTS notifications_truncated BOOLEAN NOT NULL DEFAULT false`
Migration 0002 was deliberately not edited in place; v1.1.0 operators get the same final shape via 0003, greenfield installs apply 0001 + 0002 (v1.1.0 schema) + 0003 and converge there.
Configuration
No required changes. Two existing knobs got new effective scope:
| Setting | Default | Notes |
|---|---|---|
| `audit.capture_payloads` | `true` (when unset) | Enables the sibling `audit_payloads` row, including the new notifications array. Set to `false` to keep summary-only behavior. |
| `audit.max_payload_bytes` | `65536` | Now also bounds the notifications slice as a unit, in addition to request / response sides. |
| `audit.max_notifications` | `100` | Existing knob; per-call count cap on the recorder. |
| `audit.redact_keys` | `[]` | Substrings (case-insensitive) of param keys to redact. Now applies to notification params too. |
What changed
pkg/mcpmw
- New `mcpmw.Notifications()` sending-side middleware. Wired alongside `mcpmw.Audit(...)` at server boot via `srv.AddSendingMiddleware(...)`. Records every `notifications/*` method dispatched while a tool-call window is open into a per-request recorder seeded by the receiving Audit middleware.
- `notificationRecorder` is concurrency-safe; tools that fan out goroutines and call `NotifyProgress` from each can do so without external synchronization. JSON marshal and redaction run outside the recorder mutex so concurrent calls don't serialize.
- The recorder applies `redactKeys` at append time. Once the tool-call window closes, the snapshot is taken; goroutines that fire later are silently dropped from that snapshot (this is documented and tested).
- Payload assembly trims notifications from the tail with a single linear pass to fit `max_payload_bytes`; `audit_payloads.notifications_truncated` flags the trim.
- `error_category` is assigned once after the call returns, so `ev.ErrorCategory` and `payload.response_error.category` are always equal. Auth-failure category propagates to both fields.
pkg/audit
- `audit.Notification` struct: `{ts, method, params}`.
- `audit.Payload` gains `Notifications []Notification` and `NotificationsTruncated bool`.
- `audit.SanitizeParameters` short-circuits when `redactKeys` is empty (returns the input map without a deep copy). Hot-path optimization that affects every audit row.
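The short-circuit and the case-insensitive substring matching can be sketched like this. `sanitize` and `matches` are hypothetical helpers, the `[redacted]` marker is assumed from the header-redaction notes, and this sketch only descends into nested maps, not arrays.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether key contains any redact key, ignoring case.
func matches(key string, redactKeys []string) bool {
	lower := strings.ToLower(key)
	for _, rk := range redactKeys {
		if rk != "" && strings.Contains(lower, strings.ToLower(rk)) {
			return true
		}
	}
	return false
}

// sanitize returns params itself (no copy) when redactKeys is empty.
// Otherwise it returns a copy in which every value under a matching key
// is replaced by "[redacted]", recursing into nested maps.
func sanitize(params map[string]any, redactKeys []string) map[string]any {
	if len(redactKeys) == 0 {
		return params // hot path: nothing to redact, skip the deep copy
	}
	out := make(map[string]any, len(params))
	for k, v := range params {
		if matches(k, redactKeys) {
			out[k] = "[redacted]"
			continue
		}
		if nested, ok := v.(map[string]any); ok {
			out[k] = sanitize(nested, redactKeys)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	params := map[string]any{
		"API_Token": "s3cr3t",
		"user":      map[string]any{"password": "hunter2", "name": "dev"},
	}
	fmt.Println(sanitize(params, []string{"token", "password"}))
}
```

Returning the input map unchanged is only safe if callers treat the result as read-only, which is the trade-off such a short-circuit accepts.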
pkg/httpsrv
- `GET /api/v1/portal/audit/events/{id}` validates the id as a UUID at the boundary. Anything else returns 400, blocking the gosec G706 (log injection) flow on the soft-fail WARN log if the payload fetch errors.
CI / verify
- `Makefile`: new `tools-install` target installs golangci-lint v2.11.4 and gosec v2.25.0 into `bin/tools/` with a version-stamped sentinel for idempotent reinstalls.
- `lint`, `gosec`, `govulncheck` use `bin/tools/<binary>` instead of whatever `which` finds first.
- `tools-check` shows resolved tool versions instead of just confirming presence.
Tests added
- `pkg/mcpmw/notifications_test.go`: recorder concurrency, count cap, redact-keys propagation, post-snapshot append behavior, sending middleware wiring.
- `pkg/mcpmw/audit_payload_test.go`: end-to-end notification capture; byte cap trimming with `NotificationsTruncated` assertion; redact-keys end-to-end; handler-error category consistency between event and payload; auth-failure category.
- `pkg/audit/postgres/store_test.go`: testcontainer-backed Postgres store roundtrip exercising every captured column including `NotificationsTruncated`; cascade delete of payload row when the parent event is deleted.
- `tests/audit_notifications_test.go`: full HTTP-stack integration test that calls the streaming `progress` tool with a `ProgressToken` and asserts the resulting audit row carries the notifications. Locks the SDK contract that `Server.AddSendingMiddleware` fires on `ServerSession.NotifyProgress`.
Upgrade
```bash
docker pull ghcr.io/plexara/mcp-test:v1.1.1
```

The `0003_audit_payloads_cleanup` migration runs automatically on start. No restart-time downtime expected; the migration only touches `audit_payloads`, not the indexed `audit_events` summary table.
Roadmap
Remaining subtasks from #8 will land in subsequent releases:
- Replay endpoint (`POST /audit/events/{id}/replay`)
- SSE live tail (`GET /audit/stream`)
- JSONB path filter compiler (`?param.<path>=...`, `?has=...`)
- NDJSON export
- Portal UI inspection drawer (Overview / Request / Response / Notifications tabs)
- Side-by-side comparison page
- `docs/operations/inspection.md` walkthrough
Changelog
Features
Bug Fixes
- 0b8452f: fix(audit): address PR #5 critical review (@cjimti)
- 34ee489: fix(audit): address PR #9 critical review (@cjimti)
- c1489ed: fix(audit): address PR #9 second review and CI lint/security failures (@cjimti)
Installation
Container
```bash
docker pull ghcr.io/plexara/mcp-test:v1.1.1
```

Binary (macOS / Linux)

```bash
curl -L -o mcp-test.tar.gz \
  https://github.com/plexara/mcp-test/releases/download/v1.1.1/mcp-test_1.1.1_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz
tar -xzf mcp-test.tar.gz
./mcp-test --version
```

Documentation
Full docs at https://mcp-test.plexara.io.
Open source by Plexara, the commercial MCP server with configurable enrichment built in.
mcp-test-v1.1.0
Minor release. Two themes: full request/response payload capture for the audit pipeline, and a CI / project-tooling overhaul.
No breaking changes. Drop-in upgrade from v1.0.1.
Audit payload capture
The audit log now records the full request and response envelope for every tool call, not just metadata. For a server whose entire purpose is to be a fixture for testing MCP gateways, this is the diff between "I can see calls happened" and "I can see exactly what flowed through": the inputs the gateway forwarded, the outputs the server returned, the headers, the timing, the redaction state.
Capture is on by default and configurable. Operators can turn it off in privacy-sensitive deployments, cap per-call size to bound storage, or scope what's recorded.
Schema
A new audit_payloads sibling table joined 1:1 with audit_events by event_id carries the full envelope. Two-table layout keeps the indexable summary fast for time-range queries while letting operators drill into the full envelope on demand.
| Column | Contents |
|---|---|
| `jsonrpc_method` | The receiving-middleware-dispatched method (typically `tools/call`) |
| `request_params` JSONB | Full sanitized arguments object, GIN-indexed via `jsonb_path_ops` for fast containment queries |
| `request_size_bytes`, `request_truncated` | Size + flag; oversize requests are dropped wholesale and the flag is set |
| `request_headers` JSONB | Redacted HTTP headers, only when `audit.capture_headers: true` |
| `request_remote_addr` | Caller's network address |
| `response_result` JSONB | Full `CallToolResult` (content blocks + `isError` + `structuredContent`), GIN-indexed |
| `response_error` JSONB | `{message, category}` for failed calls |
| `response_size_bytes`, `response_truncated` | Same shape as request |
| `notifications` JSONB | Array of server-initiated notifications |
| `replayed_from` | FK to the original event when this row is a replay |
| `captured_at` | Forensic timestamp; supports queries when the async drain is delayed |
ON DELETE CASCADE from audit_events.id keeps retention atomic: deleting a summary row drops its payload row in the same statement. No second policy.
Configuration
Four new keys under `audit:`:

```yaml
audit:
  enabled: true
  retention_days: 30
  redact_keys: [...]
  capture_payloads: true     # default
  capture_headers: true      # default; flip false in privacy-sensitive deployments
  max_payload_bytes: 65536   # per side (request, response); larger is dropped + flagged
  max_notifications: 100     # cap on recorded notifications per call
```

`capture_payloads` and `capture_headers` are tri-state: an omitted YAML key defaults to true, an explicit false opts out. `mcp-test.example.yaml`, `mcp-test.dev.yaml`, and `mcp-test.live.yaml` all surface the new keys with explanatory comments.
API
GET /api/v1/portal/audit/events/{id} returns the full event with payload joined from audit_payloads:
```bash
curl -H "X-API-Key: $MCPTEST_DEV_KEY" \
  http://localhost:8080/api/v1/portal/audit/events/<event-id>
```

Returns 404 on unknown id. Stores that don't persist payloads (in-memory, noop) return the summary alone with the `payload` key omitted.
audit.QueryFilter gained EventID for primary-key lookups. Both the Postgres store (PK index) and the in-memory logger honor it.
Capture middleware
mcpmw.Audit accepts variadic options:
```go
mcpmw.Audit(chain, logger, redactKeys, toolGroups,
    mcpmw.WithPayloadCapture(maxBytes),
    mcpmw.WithHeaderCapture(),
    mcpmw.WithMaxNotifications(n),
)
```

The server boot wires options from `cfg.Audit` automatically.
Auth-failure inspection
Auth-failure rows now carry a payload row too, with a structured category in response_error.category ("auth") so the portal can filter without string-matching. Useful for triaging gateway misconfiguration: a 401 from the upstream now leaves a row that says exactly which validation step rejected the bearer.
Performance and storage
- Defaults: 64 KiB cap per side, 30-day retention. Worst case (every call fills both sides, so 2 x 64 KiB per call) 1000 calls/day ~= 128 MB/day ~= 3.8 GB/30 days.
- Payload writes go through the same async buffered drain as summary writes; a stalled DB never inflates request latency.
- `request_params` and `response_result` carry GIN `jsonb_path_ops` indexes (smaller and faster than default GIN for `@>` containment).
CI / project tooling overhaul
mcp-test's CI surface is now at parity with the Plexara reference standard.
ci.yml split into 6 parallel jobs
| Job | What it does |
|---|---|
| `lint` | golangci-lint v2.11.4 |
| `test` | `go test -race -count=1 -covermode=atomic`; coverage gate at >= 80%; Codecov upload guarded by `CODECOV_TOKEN` presence |
| `build` | `go build -v ./...` + `go mod verify` |
| `frontend` | `pnpm install` + `tsc --noEmit` + `pnpm build` against `ui/` so TypeScript regressions and broken Vite builds fail CI before they ship to the embedded SPA |
| `security` | gosec is gating (was advisory), govulncheck via the dedicated action, Semgrep with `p/golang` plus a project-local `.semgrep/` ruleset |
| `integration` | `go test -tags=integration ./tests/...` runs the testcontainers-backed end-to-end smoke on every PR |
Per-job permissions: { contents: read } on top of workflow-level permissions: read-all. Per-job timeout-minutes. Concurrency cancellation correctly groups push and pull_request events on the same head SHA. Pure docs/markdown changes skip CI via paths-ignore.
Two new workflows
codeql.yml: CodeQL Go analysis on push/PR plus a weekly Monday cron using the security-and-quality query bundle. Findings post to the repo's Security tab.
scorecard.yml: OpenSSF Scorecard on branch_protection_rule, push to main, and a weekly Saturday cron. SARIF uploaded to code-scanning and as a 5-day artifact.
Supply-chain hardening: every action SHA-pinned
Every uses: in every workflow (ci.yml, codeql.yml, scorecard.yml, docs.yml, release.yml) is pinned to a commit SHA with a trailing version comment:
```yaml
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
```

With write + id-token permissions on the release path, floating tags would have meant a compromised action repo could silently sign and push binaries under the project's name. Pinning to bytes removes that vector.
Project-local Semgrep rules
.semgrep/go-security.yml adds two rules covering the OOM-via-user-input pattern: unbounded-make-slice-capacity and unbounded-make-map-size.
Manual repo settings recommended after upgrade
- `CODECOV_TOKEN` repo secret (Settings -> Secrets and variables -> Actions). Without it the Codecov upload step skips cleanly; with it, every PR gets a coverage diff report.
- Disable default-config CodeQL (Settings -> Code security -> Code scanning -> CodeQL analysis -> "None" or "Custom"). The explicit `codeql.yml` workflow is the source of truth; leaving the default config on means every PR runs two identical scans.
Operational notes
- Audit storage growth. With defaults, expect roughly 3.8 GB / 30 days at 1000 calls/day worst case. Tune `audit.max_payload_bytes` or set `audit.capture_payloads: false` for high-throughput deployments.
- Backwards compatibility. Public Go API is unchanged: `audit.Logger`, `mcpmw.Audit`, `audit.Event` all work as before. The new `audit.Payload` type and the `mcpmw.AuditOption` variadics are purely additive.
- Existing audit_events rows. Pre-v1.1.0 rows still render correctly through the portal API; their `payload` field is simply omitted.
- Database migration. Runs automatically on boot via golang-migrate. The `0002_audit_payloads.up.sql` migration adds the new table and indexes; reverts cleanly via `0002_audit_payloads.down.sql` if needed.
Installation
Container
```bash
docker pull ghcr.io/plexara/mcp-test:v1.1.0
```

The image ships only the binary and LICENSE; mount your own config:

```bash
docker run --rm \
  -v $(pwd)/mcp-test.yaml:/app/configs/mcp-test.yaml:ro \
  -p 8080:8080 \
  ghcr.io/plexara/mcp-test:v1.1.0
```

Binary (macOS / Linux)

```bash
curl -L -o mcp-test.tar.gz \
  https://github.com/plexara/mcp-test/releases/download/v1.1.0/mcp-test_1.1.0_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz
tar -xzf mcp-test.tar.gz
./mcp-test --version
```

Documentation
Full docs at https://mcp-test.plexara.io.
Verification
After upgrading, inspect the new capture surface:
```bash
# Make a tool call
curl -s -X POST http://localhost:8080/api/v1/admin/tryit/echo \
  -H "X-API-Key: $MCPTEST_DEV_KEY" \
  -H "X-Requested-With: XMLHttpRequest" \
  -H "Content-Type: application/json" \
  -d '{"arguments":{"message":"inspect-me"}}'

# Find the resulting event id (URL quoted so the shell does not try to glob the "?")
EVENT_ID=$(curl -s -H "X-API-Key: $MCPTEST_DEV_KEY" \
  "http://localhost:8080/api/v1/portal/audit/events?limit=1" \
  | jq -r '.events[0].id')

# Fetch the full event, including payload
curl -s -H "X-API-Key: $MCPTEST_DEV_KEY" \
  "http://localhost:8080/api/v1/portal/audit/events/$EVENT_ID" | jq '.payload'
```

The response includes `request_params`, `response_result.content`, sizes, and any captured headers.
mcp-test-v1.0.1
Patch release. Closes #2 via #3: the auth chain now logs per-authenticator rejections with full diagnostic context, and the HTTP gateway emits an RFC 6750 §3 WWW-Authenticate challenge so MCP clients see a useful error alongside the 401.
No breaking changes. Drop-in upgrade from v1.0.0.
What changed
Auth chain logging
Before v1.0.1, Chain.Authenticate silently dropped errors from OIDCValidator.ValidateBearer and APIKeyStore.Authenticate and fell through to ErrNotAuthenticated. A 401 from a misconfigured deployment looked identical to a 401 from a malformed token: empty body, no log, no diagnostic surface. Operators had to modify the binary or run packet captures to find out why a token was rejected.
The chain now writes one slog line per rejection from each configured authenticator, with the underlying validator error and full request correlation:
```json
{
  "time": "2026-04-30T07:42:18Z",
  "level": "WARN",
  "msg": "auth: token rejected",
  "method": "oidc",
  "error": "audience mismatch: want \"prod\"",
  "request_id": "req-abc-123",
  "remote_addr": "10.0.0.7"
}
```

`request_id` and `remote_addr` are pulled from the same context the audit middleware already populates, so a WARN line joins cleanly against the matching `audit_events` row.
Token redaction
The chain strips JWT-shaped substrings (xxx.yyy.zzz of base64url segments at least 8 characters each) from the validator error before logging. If a custom validator unwisely embeds the rejected bearer in its error message, the chain replaces the JWT with [redacted-jwt]. Defense in depth on top of the validators-don't-echo-tokens contract.
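The JWT-shaped match described above (three dot-separated base64url segments of at least 8 characters each) maps directly onto a regular expression. The pattern and the `redactJWT` helper below are an illustrative reconstruction from that description, not the chain's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// jwtShaped matches xxx.yyy.zzz where each segment is at least 8
// base64url characters (letters, digits, '-' and '_').
var jwtShaped = regexp.MustCompile(`[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}`)

// redactJWT replaces every JWT-shaped substring of msg with a marker.
func redactJWT(msg string) string {
	return jwtShaped.ReplaceAllString(msg, "[redacted-jwt]")
}

func main() {
	msg := "bad token eyJhbGciOiJI.eyJzdWIiOiIx.c2lnbmF0dXJl rejected"
	fmt.Println(redactJWT(msg)) // prints bad token [redacted-jwt] rejected
}
```

The 8-character minimum keeps short dotted strings such as version numbers or hostnames from being redacted by accident.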
Rate-limited WARN
A scanner hitting the 401 path at line speed would otherwise produce one WARN per request, drowning operator logs and crowding out real signal. The chain rate-limits per remote_addr: the first failure within a 60-second window logs at WARN, subsequent failures in the same window log at DEBUG. Different sources are independent. Failures with no remote_addr (no correlation context available) always surface at WARN since there's no bucket to share.
WWW-Authenticate per RFC 6750 §3
MCPAuthGateway's 401 response now carries:
WWW-Authenticate: Bearer realm="mcp-test",
error="invalid_request",
error_description="missing or unsupported credential; supply X-API-Key or Authorization: Bearer <token>",
resource_metadata="<protected-resource-metadata-url>"
error and error_description quote escaping follows RFC 7235 §2.1 (\ and " escaped per the auth-param ABNF). MCP clients that follow the spec surface error_description to the user. The JSON body mirrors error_description for non-RFC-aware clients.
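The quoted-string escaping mentioned above can be sketched as follows. `quote` and `challenge` are hypothetical helpers that illustrate the escaping rule; the gateway's real function names and parameter order are not shown in these notes.

```go
package main

import (
	"fmt"
	"strings"
)

// quote renders s as an HTTP quoted-string, backslash-escaping the two
// characters that would otherwise end or corrupt it: `\` and `"`.
func quote(s string) string {
	var b strings.Builder
	b.WriteByte('"')
	for i := 0; i < len(s); i++ {
		if s[i] == '\\' || s[i] == '"' {
			b.WriteByte('\\')
		}
		b.WriteByte(s[i])
	}
	b.WriteByte('"')
	return b.String()
}

// challenge assembles a Bearer challenge value from its auth-params.
func challenge(realm, code, desc, metadataURL string) string {
	return fmt.Sprintf("Bearer realm=%s, error=%s, error_description=%s, resource_metadata=%s",
		quote(realm), quote(code), quote(desc), quote(metadataURL))
}

func main() {
	fmt.Println(challenge("mcp-test", "invalid_request",
		`missing or unsupported "credential"`, "https://example.invalid/metadata"))
}
```

The metadata URL in `main` is a placeholder; in the real response it is the protected-resource-metadata URL of the deployment.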
API note
Chain.WithLogger was renamed to SetLogger to match the receiver-mutation semantics. WithLogger is kept as a Deprecated: alias so existing source-level callers compile without changes. Prefer SetLogger going forward.
Operational notes
- Log volume. Expect more INFO/WARN entries during steady-state failure scenarios. The rate limiter caps unique-source bursts, but a deployment behind a single egress IP (or one that doesn't propagate `X-Forwarded-For`) may see all failures attributed to the same `remote_addr` and rate-limited together. If that's noisier than you want, lower slog level filtering on the auth-token-rejected line specifically, or set `audit.enabled` and use the `audit_events` table as the canonical record.
- Backwards compatibility. Public API is unchanged: `auth.NewChain(allowAnonymous, apiKeys, oidc)` works as before. The new `SetLogger` is additive.
- No config flags introduced. Logging behavior is on by default.
Installation
Container
```bash
docker pull ghcr.io/plexara/mcp-test:v1.0.1
```

The image ships only the binary and LICENSE; mount your own config:

```bash
docker run --rm \
  -v $(pwd)/mcp-test.yaml:/app/configs/mcp-test.yaml:ro \
  -p 8080:8080 \
  ghcr.io/plexara/mcp-test:v1.0.1
```

Binary (macOS / Linux)

```bash
curl -L -o mcp-test.tar.gz \
  https://github.com/plexara/mcp-test/releases/download/v1.0.1/mcp-test_1.0.1_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz
tar -xzf mcp-test.tar.gz
./mcp-test --version
```

Documentation
Full docs at https://mcp-test.plexara.io.
Verification
After upgrading, the changes are visible from a single curl:
```bash
curl -i http://localhost:8080/
# HTTP/1.1 401 Unauthorized
# WWW-Authenticate: Bearer realm="mcp-test", error="invalid_request",
#   error_description="missing or unsupported credential...",
#   resource_metadata="..."
# {"error":"unauthorized","error_description":"missing or unsupported credential"}
```

For the chain logs, hit the same path with a bearer the OIDC validator rejects and check the binary's stderr for the `auth: token rejected` JSON line.
mcp-test-v1.0.0
First public release of mcp-test, a Plexara-sponsored OSS Go MCP server built specifically as a controllable fixture for testing MCP gateways end-to-end.
The point isn't what its tools do (they are intentionally boring); the point is that they're predictable, deterministic, and observable, so a gateway sitting in front of them can be asserted on. Same input always produces the same output. Failures happen exactly when asked. Every call lands in a Postgres-backed audit log that the embedded React portal lets you browse, filter, and chart.
Apache 2.0. Source at https://github.com/plexara/mcp-test. Docs at https://mcp-test.plexara.io.
What's in 1.0.0
MCP server (Go)
Streamable HTTP transport via the official github.com/modelcontextprotocol/go-sdk v1.5.0; mounted at /, with browsers redirected to /portal/ and MCP clients passing through.
Twelve test tools across four toolkits, each individually flag-gated:
| Group | Tools | What they verify in your gateway |
|---|---|---|
| identity | `whoami`, `echo`, `headers` | Identity forwarding, argument round-trip, header pass-through |
| data | `fixed_response`, `sized_response`, `lorem` | Deterministic dedup, size-limit handling, seeded reproducibility |
| failure | `error`, `slow`, `flaky` | Error categorization, timeout policy, seeded retry behavior |
| streaming | `progress`, `long_output`, `chatty` | Progress notification pass-through, multi-block content, ordering |
Three auth methods, daisy-chained:
- File API keys (constant-time compare)
- Postgres-backed bcrypt keys (managed via /api/v1/admin/keys)
- External OIDC delegation with JWKS caching, alg pinning (RS256/384/512), exp required, singleflight refresh, stale-while-revalidate
- RFC 9728 protected-resource metadata at /.well-known/oauth-protected-resource so MCP clients can discover the IdP
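For readers unfamiliar with the "constant-time compare" note on file API keys, here is a minimal sketch of the technique using Go's standard crypto/subtle package. `keyMatches` is a hypothetical name, and hashing both sides first is an assumption made here to equalize lengths; the release notes do not say how mcp-test does it.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// keyMatches is a hypothetical helper (not the mcp-test code). It compares a
// presented API key with the configured one without revealing, through
// timing, how many leading bytes matched. Hashing both sides first gives the
// inputs equal length, so the comparison does not leak the key length either.
func keyMatches(presented, configured string) bool {
	p := sha256.Sum256([]byte(presented))
	c := sha256.Sum256([]byte(configured))
	return subtle.ConstantTimeCompare(p[:], c[:]) == 1
}

func main() {
	fmt.Println(keyMatches("s3cret", "s3cret")) // true
	fmt.Println(keyMatches("s3cret", "s3cres")) // false
}
```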
Postgres-backed audit log of every tools/call: sanitized parameters, identity (subject, email, name, auth type), latency, response size, content blocks, source (mcp vs portal-tryit). Async buffered drain so DB latency never gates the request path; audit.enabled: false swaps in a noop logger.
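The "async buffered drain" with a noop fallback can be sketched roughly as below. All type and function names are hypothetical. Dropping events when the buffer is full is an assumption chosen so the request path never blocks; the release notes do not state what mcp-test does on overflow.

```go
package main

import (
	"fmt"
	"sync"
)

// Event and Logger are hypothetical stand-ins for the real audit types.
type Event struct{ Tool string }

type Logger interface {
	Log(Event) bool // reports whether the event was accepted
	Close()
}

// noopLogger models audit.enabled: false. It accepts and discards everything.
type noopLogger struct{}

func (noopLogger) Log(Event) bool { return true }
func (noopLogger) Close()         {}

type asyncLogger struct {
	ch chan Event
	wg sync.WaitGroup
}

func newAsyncLogger(buffer int, sink func(Event)) *asyncLogger {
	l := &asyncLogger{ch: make(chan Event, buffer)}
	l.wg.Add(1)
	go func() { // the single drain goroutine absorbs the slow sink (e.g. a DB)
		defer l.wg.Done()
		for ev := range l.ch {
			sink(ev)
		}
	}()
	return l
}

// Log never blocks. Assumption: when the buffer is full the event is dropped.
func (l *asyncLogger) Log(ev Event) bool {
	select {
	case l.ch <- ev:
		return true
	default:
		return false
	}
}

// Close stops intake and waits until buffered events have been drained.
func (l *asyncLogger) Close() { close(l.ch); l.wg.Wait() }

func newLogger(enabled bool, buffer int, sink func(Event)) Logger {
	if !enabled {
		return noopLogger{}
	}
	return newAsyncLogger(buffer, sink)
}

func main() {
	var stored []Event
	l := newLogger(true, 16, func(ev Event) { stored = append(stored, ev) })
	l.Log(Event{Tool: "echo"})
	l.Log(Event{Tool: "whoami"})
	l.Close()
	fmt.Println(len(stored)) // 2
}
```

The trade-off in this sketch is explicit: request latency stays independent of the sink, at the cost of possibly losing audit rows under sustained overload.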
Built-in MCP instructions: the server's initialize response carries server-level instructions clients surface to the LLM as system context, telling models these tools are test fixtures and not data sources.
Embedded portal (React 19)
Vite + Tailwind 4 + shadcn/ui, baked into the Go binary via go:embed all:dist; mounted at /portal/.
Pages: Login (API key or OIDC), Dashboard (counts, error rate, p50/p95 latency, recent activity), Tools (per-tool detail with Try-It form generated from JSON schema), Audit (filterable timeline with full event inspection), API Keys (create/revoke Postgres-backed bcrypt keys), Config viewer (with secrets redacted), Discovery (well-known viewer), About.
Light/dark theming with persisted preference; CSRF-protected admin endpoints (X-Requested-With required); top-level error boundary; 401 interceptor for clean session expiry handling.
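The X-Requested-With requirement is a common lightweight CSRF defense: browsers do not let cross-site HTML forms set custom headers, so a forged form post arrives without it. A minimal middleware sketch follows. The names are hypothetical, and the 403 status and the choice to check every method are assumptions, not confirmed mcp-test behavior.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireXRequestedWith is a hypothetical middleware (not the mcp-test code).
// It rejects requests that lack the custom header. Assumption: it answers
// 403 and applies to every HTTP method.
func requireXRequestedWith(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Requested-With") == "" {
			http.Error(w, "missing X-Requested-With", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler wraps a trivial admin endpoint that answers 204.
func demoHandler() http.Handler {
	return requireXRequestedWith(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			w.WriteHeader(http.StatusNoContent)
		}))
}

// statusFor sends one in-memory POST and returns the response status code.
func statusFor(h http.Handler, withHeader bool) int {
	req := httptest.NewRequest(http.MethodPost, "/api/v1/admin/keys", nil)
	if withHeader {
		req.Header.Set("X-Requested-With", "XMLHttpRequest")
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println(statusFor(h, false), statusFor(h, true)) // 403 204
}
```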
The Try-It proxy at /api/v1/admin/tryit/{name} invokes tools through an in-process MCP client; audit rows tagged source=portal-tryit so you can filter portal traffic out of gateway-traffic counts.
Infrastructure
- Multi-arch container at ghcr.io/plexara/mcp-test, distroless static, signed via cosign keyless on tag.
- docker-compose.dev.yml brings up Postgres 16 + Keycloak with a pre-seeded realm (mcp-test), client, and dev/dev user.
- GoReleaser pipeline publishes binaries (linux/darwin/windows × amd64/arm64), checksums, and the OCI image to GHCR.
- GitHub Actions: lint + race-tested test suite + 80% coverage gate on every PR; docs deploy to mcp-test.plexara.io on main; full release on tag.
- Bundled .mcp.json so Claude Code can connect to a local instance directly.
Documentation
mcp-test.plexara.io is built from the source tree's docs/ with MkDocs Material, themed to match Plexara's marketing identity (midnight neutrals, copper teal-azure accent, Outfit display + DM Sans body).
Sections: Getting Started, Configuration, Tools, Operations, Reference (HTTP API, MCP protocol, Architecture, Releases). Plus full SEO (per-page descriptions, OG/Twitter cards, JSON-LD), Google Analytics 4, sitemap.xml, robots.txt, and an llms.txt curated index.
Out of scope (deliberately)
- Personas / RBAC. Any authenticated caller is admin. mcp-test is a fixture, not a multi-tenant system.
- stdio transport. HTTP-only.
- Real upstream data backends (Trino, S3, Snowflake). The data toolkit returns synthetic, seeded fixtures.
- Built-in OAuth authorization server. OIDC is delegated to an external IdP. Keycloak is bundled in the dev stack for convenience.
Installation
Container
docker pull ghcr.io/plexara/mcp-test:v1.0.0

The image ships only the binary and LICENSE; mount your own config:
docker run --rm \
-v $(pwd)/mcp-test.yaml:/app/configs/mcp-test.yaml:ro \
-p 8080:8080 \
ghcr.io/plexara/mcp-test:v1.0.0

A starter config lives in the repo at configs/mcp-test.example.yaml.
Binary (macOS / Linux)
curl -L -o mcp-test.tar.gz \
https://github.com/plexara/mcp-test/releases/download/v1.0.0/mcp-test_1.0.0_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz
tar -xzf mcp-test.tar.gz
./mcp-test --version
From source
git clone https://github.com/plexara/mcp-test
cd mcp-test
make dev # postgres + keycloak via docker, binary in foreground

make dev generates random secrets into .env.dev (gitignored) on first run; subsequent runs reuse them so portal sessions persist across restarts.
Quick smoke test
# 401 with a discovery pointer (RFC 9728)
curl -i http://localhost:8080/
# WWW-Authenticate: Bearer resource_metadata="..."
# Call a tool through the bundled file API key
curl -s -X POST http://localhost:8080/api/v1/admin/tryit/echo \
-H "X-API-Key: $MCPTEST_DEV_KEY" \
-H "X-Requested-With: XMLHttpRequest" \
-H "Content-Type: application/json" \
-d '{"arguments":{"message":"hello"}}'

Then open http://localhost:8080/portal/ and watch the audit row appear within a second, tagged source=portal-tryit.
What's next
This is the foundational release. Likely directions for 1.x:
- Streamed Try-It progress notifications surfaced in the portal.
- Schema-driven ToolForm fully replacing the curated form registry.
- Optional CodeQL / SBOM signing in the release pipeline.
- Stateless mode for multi-replica deployments (Streamable HTTP session externalization).
Issues and PRs welcome at https://github.com/plexara/mcp-test/issues.
Open source by Plexara, the commercial MCP server with configurable enrichment built in. mcp-test is what we use to verify Plexara's gateway behavior end-to-end; we ship it as OSS so anyone building MCP integrations can use the same fixture.