fix(mcp): hide internal tool events in response streams #1405

Open
zhoug9127 wants to merge 4 commits into main from fix/hide-internal-mcp-streaming

Conversation

@zhoug9127
Collaborator

@zhoug9127 zhoug9127 commented Apr 28, 2026

Description

Problem

Internal/self-provided MCP servers can be visible to the model during Responses tool loops, but client-facing OpenAI Responses streaming and final outputs must not expose those internal tool details.

Solution

Hide internal non-builtin MCP tool artifacts from OpenAI Responses streaming and final response paths while preserving public tools and builtin-routed MCP outputs. This is PR A1 in the upstream SMG memory/LTM series: it establishes the output-privacy base that later memory provider recall PRs can reuse.

Changes

  • Suppress live streaming tool-call events for internal non-builtin MCP tools.
  • Filter internal mcp_list_tools SSE events from client-visible streaming output.
  • Sanitize synthetic/final Responses output so internal tools, tool choices, and internal MCP output items do not leak.
  • Keep public user function calls visible when they share a name with an internal MCP tool.
  • Update internal MCP server reference docs.
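The visibility rule described in the list above can be sketched as follows. This is a minimal illustration, not the code in this PR: `ToolEntry`, `is_internal_non_builtin`, and `should_hide` are hypothetical names, and the registry is modeled as a plain slice. The only part taken from the review discussion is the shape of the check: hide a tool when it is internal and non-builtin, unless the request also declares a user function with the same name.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one entry in the session's tool registry.
struct ToolEntry {
    name: &'static str,
    /// The owning MCP server is configured with `internal: true`.
    internal: bool,
    /// The tool is routed through a builtin tool type.
    builtin: bool,
}

/// True when the named tool belongs to an internal server and is not builtin-routed.
fn is_internal_non_builtin(tools: &[ToolEntry], name: &str) -> bool {
    tools
        .iter()
        .any(|t| t.name == name && t.internal && !t.builtin)
}

/// Decide whether a client-facing item for `name` should be hidden.
/// A public user function that shares the name keeps the item visible.
fn should_hide(tools: &[ToolEntry], user_functions: &HashSet<String>, name: &str) -> bool {
    is_internal_non_builtin(tools, name) && !user_functions.contains(name)
}
```

With this shape, builtin-routed and public tools are never hidden, and a name collision resolves in favor of the user's own function.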

Test Plan

  • pre-commit run --all-files
  • cargo +nightly fmt --all --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --manifest-path model_gateway/Cargo.toml internal --all-features
  • cargo test --manifest-path model_gateway/Cargo.toml forward_streaming_event_strips_internal_tools_from_response_envelope --all-features
  • Attempted cargo test --manifest-path model_gateway/Cargo.toml --all-features; local run failed only in model_gateway/tests/otel_tracing_test.rs::test_router_with_tracing because the OTLP collector received 0 spans. The same exact test fails on origin/main in this workspace, so this appears pre-existing/local-infra rather than introduced by this PR.

Checklist

  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

  • Bug Fixes

    • Completely hide internal, non-builtin MCP tool details from clients — including intermediate streaming events, final streaming/completed events, live tool-call events, and tool listings.
    • Redact tool-related fields in response envelopes (including tool_choice) so only user-visible tools appear.
  • Documentation

    • Clarified docs: internal MCP tools are hidden across all response types, including streaming.
  • Tests

    • Added tests to verify suppressed emissions and redaction behavior.

Signed-off-by: Daisy Zhou <zhoug9127@gmail.com>
@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Warning

Rate limit exceeded

@zhoug9127 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 24 minutes and 58 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7ccc4826-a57a-4959-8386-a105e2450b56

📥 Commits

Reviewing files that changed from the base of the PR and between c1cc8d2 and 8728cd1.

📒 Files selected for processing (1)
  • crates/mcp/src/core/session.rs
Walkthrough

This PR hides internal, non-builtin MCP tool artifacts from client-facing outputs across both streaming and non-streaming paths. It does so by suppressing SSE emissions during tool execution, filtering and redacting streaming events and response envelopes, and excluding internal servers from streamed tool listings. Documentation and tests were updated accordingly.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation: docs/reference/mcp-internal-servers.md | Clarifies that internal: true hides non-builtin tool details in both streaming and non-streaming outputs (final responses, streaming completion/events, live tool-call events, mcp_list_tools events, and response envelope fields). Removes prior streaming-exclusion statement. |
| Streaming tool execution: model_gateway/src/routers/openai/mcp/tool_loop.rs | Adds emit_tool_events flag to suppress intermediate/completion SSE for internal non-builtin tools; guards SSE calls and adds early-exit checks when channel closed. Adds unit test verifying no SSE events emitted while call remains recorded. |
| Streaming response forwarding: model_gateway/src/routers/openai/responses/streaming.rs | Drops forwarded streaming events for hidden items and internal non-builtin tools (based on resolved tool name); centralizes tool-name derivation; filters mcp_list_tools bindings; redacts internal MCP artifacts inside live response envelopes; passes session to tool-restoration; adds unit test for envelope stripping. |
| Response redaction utilities: model_gateway/src/routers/openai/responses/utils.rs | Introduces strip_internal_mcp_artifacts and centralized strip_internal_mcp_artifacts_with_names to remove internal tool items/tools and conditionally scrub tool_choice (overwriting when hidden). Adjusts tests for structured tool_choice injection. |
| Session visibility predicate: crates/mcp/src/core/session.rs | Replaces should_hide_function_call_like with should_hide_function_output_item_like to align typed (streaming) and JSON (non-streaming) visibility checks for internal non-builtin tools. |
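As a rough illustration of the redaction step summarized for utils.rs, the sketch below strips hidden tools from a simplified response envelope. It is written under stated assumptions: `Envelope`, `ToolChoice`, and `strip_internal` are hypothetical stand-ins (the real helpers operate on the gateway's own response types), and resetting a hidden `tool_choice` to `Auto` is a guess at what "overwriting when hidden" means.

```rust
use std::collections::HashSet;

/// Simplified, hypothetical model of the `tool_choice` field.
#[derive(Debug, Clone, PartialEq)]
enum ToolChoice {
    Auto,
    Named(String),
}

/// Simplified, hypothetical response envelope holding only tool-related fields.
#[derive(Debug, Clone, PartialEq)]
struct Envelope {
    /// Names of tools advertised in the response.
    tools: Vec<String>,
    tool_choice: ToolChoice,
    /// Names of the tools behind each tool-call output item.
    output_tool_calls: Vec<String>,
}

/// Remove every trace of the hidden tool names from the envelope.
fn strip_internal(env: &mut Envelope, hidden: &HashSet<String>) {
    env.tools.retain(|name| !hidden.contains(name));
    env.output_tool_calls.retain(|name| !hidden.contains(name));

    // Assumption: a tool_choice that names a hidden tool is overwritten with Auto.
    let choice_is_hidden =
        matches!(&env.tool_choice, ToolChoice::Named(name) if hidden.contains(name));
    if choice_is_hidden {
        env.tool_choice = ToolChoice::Auto;
    }
}
```

Keeping the hidden-name set as an explicit parameter mirrors the split between a session-aware entry point and a names-based core that the summary describes.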

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant StreamingHandler
    participant ToolExecutor
    participant ResponseFilter
    participant Storage as ToolLoopState

    Client->>StreamingHandler: Submit request (may reference MCP tools)
    StreamingHandler->>ToolExecutor: Start tool execution (streaming)
    ToolExecutor->>ToolExecutor: Resolve tool name / is_internal_non_builtin?
    alt internal non-builtin
        ToolExecutor->>ToolExecutor: emit_tool_events = false
        ToolExecutor->>Storage: Record call item (no SSE emitted)
    else public/builtin
        ToolExecutor->>StreamingHandler: Emit SSE events for client
        ToolExecutor->>Storage: Record call item
    end
    ToolExecutor->>ResponseFilter: Execution result / events
    ResponseFilter->>ResponseFilter: Check session visibility / strip_internal_mcp_artifacts
    alt contains internal artifacts
        ResponseFilter->>ResponseFilter: Remove tools/tool_choice/hidden items
        ResponseFilter->>Client: Send redacted events/envelope
    else
        ResponseFilter->>Client: Send events/envelope as-is
    end
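The internal versus public branch in the diagram can be sketched with a standard-library channel. This is only an approximation of the flow: `ToolCall`, `ToolLoopState`, `run_tool_call`, and the event strings are hypothetical, and a synchronous `std::sync::mpsc` sender stands in for whatever SSE channel the gateway actually uses. Only the `emit_tool_events` flag name comes from the walkthrough.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical tool call requested by the model.
struct ToolCall {
    name: String,
}

/// Hypothetical loop state; calls are recorded whether or not events are emitted.
#[derive(Default)]
struct ToolLoopState {
    recorded_calls: Vec<String>,
}

/// Run one tool call, emitting client-facing events only for visible tools.
fn run_tool_call(
    call: &ToolCall,
    is_internal_non_builtin: bool,
    tx: &Sender<String>,
    state: &mut ToolLoopState,
) {
    let emit_tool_events = !is_internal_non_builtin;

    if emit_tool_events {
        // Send errors are ignored here for brevity; the event name is made up.
        let _ = tx.send(format!("tool_call.in_progress:{}", call.name));
    }

    // The tool itself would execute here.

    // The call is always recorded so the model still sees its result.
    state.recorded_calls.push(call.name.clone());

    if emit_tool_events {
        let _ = tx.send(format!("tool_call.completed:{}", call.name));
    }
}
```

The point of the sketch is that suppression affects only what the client receives, not what the tool loop records for later turns.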

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

mcp, tests

Suggested reviewers

  • CatherineSue
  • key4ng
  • slin1237

Poem

🐰 I burrowed through streams and code so deep,
Hid little tools where clients cannot peep.
SSEs hushed, envelopes swept clean,
No leaks remain — neat, quiet, and keen.
🥕 Quiet paws, quiet keep.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title directly and concisely describes the main change: hiding internal tool events in response streams. This aligns with the primary objective of suppressing internal MCP tool artifacts from client-facing outputs. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@github-actions github-actions Bot added the documentation (Improvements or additions to documentation), model-gateway (Model gateway crate changes), and openai (OpenAI router changes) labels Apr 28, 2026
Comment thread model_gateway/src/routers/openai/responses/streaming.rs Outdated

@claude claude Bot left a comment

Clean, well-structured PR. The internal-tool suppression logic is correct across all three paths (tool-loop events, live streaming events, final response envelopes), and the test coverage is solid. One minor nit posted about a duplicate collect_user_function_names allocation in the streaming hot path.

Summary: 0 🔴 Important · 1 🟡 Nit · 0 🟣 Pre-existing

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request extends the internal: true flag for MCP servers to hide internal tool details from streaming outputs, live events, and response envelopes. It introduces logic to suppress streaming tool events and redact internal artifacts from response data. The review feedback highlights a compilation error in the tests due to a missing helper function in tool_loop.rs and suggests an optimization to avoid redundant function calls in should_suppress_internal_streaming_event.

Comment thread model_gateway/src/routers/openai/mcp/tool_loop.rs
Comment thread model_gateway/src/routers/openai/responses/streaming.rs

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b81b530857

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread model_gateway/src/routers/openai/responses/streaming.rs

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/openai/mcp/tool_loop.rs`:
- Around line 203-205: The code currently sets emit_tool_events =
!session.is_internal_non_builtin_tool(&call.name) and then skips all
tx.send(...) paths for internal tools, which prevents early detection of
disconnected clients; change the logic so that even when emit_tool_events is
false you still perform a lightweight disconnect check before long-running MCP
calls—either by doing a non-blocking tx.try_send() of a minimal heartbeat/event
or by checking the transmit channel state (e.g., tx.is_closed() or equivalent)
and returning early on disconnect; update the same pattern around the other send
sites referenced (the blocks covering the 225-233, 248-250, and 305-313 ranges)
so every branch checks/sends a minimal probe or verifies channel liveness prior
to executing the tool call.

In `@model_gateway/src/routers/openai/responses/streaming.rs`:
- Around line 438-461: In should_suppress_internal_streaming_event, avoid
calling collect_user_function_names twice: call
collect_user_function_names(ctx.original_request) once at the top of the
function (e.g., let user_function_names =
collect_user_function_names(ctx.original_request);) and reuse that variable for
both the session.should_hide_output_item_json(item, &user_function_names) check
and the final session.is_internal_non_builtin_tool(tool_name.as_ref()) &&
!user_function_names.contains(tool_name.as_ref()) check; keep the
streaming_event_tool_name(parsed_data, handler) and session checks unchanged but
remove the second collect_user_function_names call.
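The first finding above (keep detecting disconnected clients even when tool events are suppressed) could look roughly like the sketch below. Everything here is hypothetical: `ClientChannel`, `FakeChannel`, `ToolOutcome`, and `execute_hidden_tool` are illustrative names, and the trait only abstracts the "is the channel closed" question so the example stays standard-library only. In an async gateway the same question might be answered by the sender's own liveness check, as the review suggests.

```rust
/// Hypothetical minimal view of the client-facing channel.
trait ClientChannel {
    /// True once the receiving side (the client) has gone away.
    fn is_closed(&self) -> bool;
}

/// Test double standing in for a real channel sender.
struct FakeChannel {
    closed: bool,
}

impl ClientChannel for FakeChannel {
    fn is_closed(&self) -> bool {
        self.closed
    }
}

#[derive(Debug, PartialEq)]
enum ToolOutcome {
    Executed,
    ClientGone,
}

/// Run a hidden tool call, but bail out first if nobody is listening.
/// No event is sent, so nothing about the internal tool reaches the client.
fn execute_hidden_tool<C: ClientChannel>(tx: &C, run_tool: impl FnOnce()) -> ToolOutcome {
    if tx.is_closed() {
        return ToolOutcome::ClientGone;
    }
    run_tool();
    ToolOutcome::Executed
}
```

Checking liveness rather than sending a probe event avoids putting any extra bytes on the client stream, which matters here because the whole purpose of the change is to emit nothing for internal tools.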
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8642f3a2-0253-42ad-8307-ba885a392626

📥 Commits

Reviewing files that changed from the base of the PR and between a99af9c and b81b530.

📒 Files selected for processing (4)
  • docs/reference/mcp-internal-servers.md
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/utils.rs

Comment thread model_gateway/src/routers/openai/mcp/tool_loop.rs
Comment thread model_gateway/src/routers/openai/responses/streaming.rs
Signed-off-by: Daisy Zhou <zhoug9127@gmail.com>
@zhoug9127 zhoug9127 requested a review from zhaowenzi as a code owner April 28, 2026 15:37
@github-actions github-actions Bot added the mcp (MCP related changes) label Apr 28, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1cc8d2315

Comment thread model_gateway/src/routers/openai/responses/streaming.rs

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/openai/responses/streaming.rs (1)

369-460: 🧹 Nitpick | 🔵 Trivial

Avoid rebuilding user-function names on every streamed chunk.

should_suppress_internal_streaming_event runs on the hot path; hoisting collect_user_function_names(ctx.original_request) out of the helper would avoid repeated HashSet allocations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/openai/responses/streaming.rs` around lines 369 -
460, The helper should_suppress_internal_streaming_event rebuilds the
user-function HashSet each call by calling
collect_user_function_names(ctx.original_request); move that call out of the
hot-path and pass the precomputed set into the helper (or store it once on
StreamingEventContext) so you avoid repeated HashSet allocations. Update callers
of should_suppress_internal_streaming_event (the streaming loop that currently
invokes it) to compute user_function_names =
collect_user_function_names(ctx.original_request) once and change the helper
signature to accept &HashSet<String> (or appropriate type) instead of capturing
it internally; keep the existing checks (item hiding and
streaming_event_tool_name) unchanged.
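The hoisting suggested here can be sketched as follows. The function names `collect_user_function_names` and the idea of passing `&HashSet<String>` come from the review text, but the signatures, the string-slice request model, and `forward_events` are simplifying assumptions made for illustration only.

```rust
use std::collections::HashSet;

/// Build the set of user-declared function names once per request.
/// (Hypothetical signature: the real helper takes the original request.)
fn collect_user_function_names(request_tools: &[&str]) -> HashSet<String> {
    request_tools.iter().map(|name| name.to_string()).collect()
}

/// Per-event check that borrows the precomputed set instead of rebuilding it.
fn should_suppress(
    event_tool_name: &str,
    internal_tools: &HashSet<String>,
    user_function_names: &HashSet<String>,
) -> bool {
    internal_tools.contains(event_tool_name) && !user_function_names.contains(event_tool_name)
}

/// Forward only the events the client is allowed to see.
/// Each event is reduced to its tool name to keep the sketch small.
fn forward_events<'a>(
    events: &[&'a str],
    request_tools: &[&str],
    internal_tools: &HashSet<String>,
) -> Vec<&'a str> {
    // Computed once, outside the per-event loop.
    let user_function_names = collect_user_function_names(request_tools);
    events
        .iter()
        .copied()
        .filter(|name| !should_suppress(name, internal_tools, &user_function_names))
        .collect()
}
```

The allocation now happens once per stream rather than once per chunk, while the suppression rule itself is unchanged.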
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f8af576d-8f38-40ce-b3bb-a76e92065349

📥 Commits

Reviewing files that changed from the base of the PR and between b81b530 and c1cc8d2.

📒 Files selected for processing (3)
  • crates/mcp/src/core/session.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/streaming.rs

Signed-off-by: Daisy Zhou <zhoug9127@gmail.com>
Signed-off-by: Daisy Zhou <zhoug9127@gmail.com>
@github-actions

This pull request has been automatically marked as stale because it has not had any activity within 14 days. It will be automatically closed if no further activity occurs within 16 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions github-actions Bot added the stale (PR has been inactive for 14+ days) label May 13, 2026
@mergify
Contributor

mergify Bot commented May 13, 2026

Hi @zhoug9127, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch:

git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease

@mergify mergify Bot added the needs-rebase (PR has merge conflicts that need to be resolved) label May 13, 2026

Labels

  • documentation: Improvements or additions to documentation
  • mcp: MCP related changes
  • model-gateway: Model gateway crate changes
  • needs-rebase: PR has merge conflicts that need to be resolved
  • openai: OpenAI router changes
  • stale: PR has been inactive for 14+ days

1 participant