
refactor(mcp): extract OpenAI-flavored bridge into router-side module #1429

Open

slin1237 wants to merge 6 commits into main from refactor/extract-openai-bridge

Conversation

@slin1237 (Collaborator) commented May 1, 2026

Summary

The smg-mcp crate had grown into a mix of MCP protocol code and OpenAI-flavored request/response/event transformation, forcing dual edits across the crate boundary on every router-side change. This PR moves all OpenAI-shaped concerns into a new model_gateway::routers::common::openai_bridge module, leaving smg-mcp focused on the MCP protocol surface.

  • New openai_bridge module: response_format, format_registry (DashMap side-map keyed by QualifiedToolName), format_descriptor (per-format dispatch table), tool_descriptors, transformer, overrides.
  • smg-mcp drops its openai-protocol dependency entirely; ResponseFormat, ToolEntry::response_format, in-session tool/inventory JSON builders, and the alias-routing 3-tuple are removed. Aliases now live under a literal ALIAS_SERVER_KEY prefix that the bridge looks up.
  • Adding a new built-in hosted tool no longer requires a parallel patch in smg-mcp — routers consume the bridge directly.
  • Image-generation result compactor is wired into the persistence path, the POST /v1/conversations/{id}/items handler, and the request-input persistence path so multi-MB base64 payloads are stripped before storage on every ingress; build_stored_response consumes the compacted JSON to avoid a second full-tree clone.

Net diff: +1437 / -1778 across 49 files. smg-mcp shrinks by ~3k LOC.
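The compactor's core move — never letting multi-MB base64 blobs reach storage — can be sketched minimally. The function name and threshold below are illustrative only, not the actual API; the real compact_image_generation_outputs_json walks the serialized response JSON in place:

```rust
/// Illustrative threshold; the real compactor targets multi-MB base64 blobs
/// inside image_generation_call result items. (Hypothetical name/constant.)
const MAX_INLINE_B64: usize = 1024;

/// Replace an oversized base64 payload with a short placeholder so it is
/// never persisted. Returns true if the payload was compacted.
fn compact_image_payload(b64: &mut String) -> bool {
    if b64.len() > MAX_INLINE_B64 {
        *b64 = format!("[{} base64 bytes elided before storage]", b64.len());
        true
    } else {
        false
    }
}

fn main() {
    let mut small = "iVBORw0KGgo=".to_string();
    let mut large = "A".repeat(5 * 1024 * 1024); // simulate a multi-MB result
    assert!(!compact_image_payload(&mut small)); // small payloads pass through
    assert!(compact_image_payload(&mut large));
    assert!(large.len() < 64); // what gets stored is tiny
}
```

Compacting at every ingress point (persistence path, items handler, request-input path) rather than a single chokepoint is what lets build_stored_response reuse the already-compacted JSON instead of cloning the tree a second time.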

Design

Full design doc with grounded reasoning at .claude/plans/mcp-refactor.md.

Test plan

  • cargo build -p smg — clean
  • cargo test -p smg --lib openai_bridge:: — 45 pass (5 new format_registry tests, 3 compactor tests)
  • cargo test -p smg --test mcp_test — 23 integration tests pass against session.execute_tool
  • e2e image-generation compactor test exercises persistence with an explicit conversation
  • CI green

Summary by CodeRabbit

  • Chores

    • Reduced legacy response-format plumbing and public surface; simplified session/orchestrator APIs and inventory shapes; added a stable alias server key constant.
  • New Features

    • Added a shared OpenAI bridge with a FormatRegistry and descriptors for centralized per-tool format lookup.
    • Added bridge helpers to build tool descriptors and to transform/compact tool outputs.
  • Tests

    • Updated unit and e2e tests; tightened image-generation persistence checks and compaction.

@github-actions Bot added labels on May 1, 2026: dependencies, grpc, mcp, tests, model-gateway, openai, anthropic, gemini

coderabbitai Bot commented May 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Moves OpenAI/protocol response-format responsibilities out of the MCP crate and into a new model_gateway openai_bridge. MCP no longer stores or transforms response formats; model_gateway now owns FormatRegistry, tool-descriptor builders, transformation/compaction, and threads the registry through routers, persistence, and workflow.

Changes

MCP → Gateway format refactor

  • Data shape / config — crates/mcp/src/core/config.rs, crates/mcp/src/inventory/types.rs: ToolConfig.response_format is now Option<ResponseFormatConfig>; removed ToolEntry.response_format; added pub const ALIAS_SERVER_KEY.
  • MCP core API removal — crates/mcp/src/core/orchestrator.rs, crates/mcp/src/core/mod.rs, crates/mcp/src/core/session.rs: deleted response-format transformation plumbing and helpers (removed ToolCallResult, ResponseTransformer usage, response_format fields, and related session APIs); added set_approval_mode and qualified_name_for_exposed.
  • Crate surface cleanup — crates/mcp/src/lib.rs, crates/mcp/src/responses_bridge.rs, crates/mcp/src/transform/mod.rs: removed the responses_bridge and transform modules and pruned OpenAI-protocol re-exports from the crate root.
  • Gateway: new openai_bridge — model_gateway/src/routers/common/openai_bridge/*: added FormatDescriptor, FormatRegistry, the ResponseFormat model (Copy), tool descriptor builders, hosted-tool overrides, the transformer (transform_tool_output) and JSON compaction helpers, plus re-exports (mod entry).
  • Registry population & descriptor mapping — model_gateway/src/routers/common/openai_bridge/format_registry.rs, .../format_descriptor.rs: added registry storage/lookup, populate_from_server_config rules (alias handling, builtin defaults, explicit Passthrough), and a compile-time descriptor(format) mapping.
  • Gateway wiring / DI — model_gateway/src/app_context.rs, model_gateway/src/routers/*/context.rs, model_gateway/src/routers/*/router.rs: added mcp_format_registry: FormatRegistry to AppContext and router contexts; injected/cloned the registry into router initialization and ResponsesContext construction.
  • MCP connection & utils — model_gateway/src/routers/common/mcp_utils.rs, model_gateway/src/routers/common/mod.rs: connect/collect/ensure functions now accept and populate a FormatRegistry; registered the openai_bridge module in routers/common.
  • Tool descriptors & visibility — model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs, model_gateway/src/routers/openai/responses/utils.rs: moved function/chat/response tool builders and client-visible MCP injection/hiding helpers into openai_bridge; the gateway calls these instead of MCP session methods.
  • Transformation & persistence compaction — model_gateway/src/routers/common/openai_bridge/transformer.rs, model_gateway/src/routers/common/persistence_utils.rs, model_gateway/src/routers/conversations/handlers.rs: added transform_tool_output and compact_image_generation_outputs_json; build_stored_response consumes and compacts JSON in place; conversation parsing compacts non-message items.
  • Tool loop & responses plumbing — model_gateway/src/routers/openai/mcp/tool_loop.rs, model_gateway/src/routers/*/responses/*, model_gateway/src/routers/grpc/*/responses/*: replaced session-owned format lookups/transform calls with openai_bridge APIs; threaded the FormatRegistry through ensure/strip/execute/emit flows; updated many signatures to use a value-based ResponseFormat.
  • Tests, workflow & misc — model_gateway/src/workflow/mcp_registration.rs, model_gateway/src/service_discovery.rs, e2e_test/responses/test_image_generation.py, model_gateway/tests/mcp_test.rs, crates/mcp/Cargo.toml: populate the registry from connected server config; initialize the registry in test contexts; tighten e2e image-compaction assertions; refactor MCP tests to session-based execution; remove the openai-protocol workspace dep from the MCP Cargo.toml.
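Conceptually, the new registry is a side-map from qualified tool names to response formats, with Passthrough as the implicit default for unknown tools. A minimal std-only sketch of that contract (the real module uses a concurrent DashMap keyed by QualifiedToolName; the type names and enum variants below are simplified stand-ins, not the actual API):

```rust
use std::collections::HashMap;

// Simplified stand-in for the bridge's ResponseFormat model (Copy in the PR).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
    ImageGenerationCall,
}

/// Side-map keyed by (server_key, tool_name). The real FormatRegistry is a
/// DashMap keyed by QualifiedToolName so routers can share it concurrently.
#[derive(Default)]
struct FormatRegistry {
    formats: HashMap<(String, String), ResponseFormat>,
}

impl FormatRegistry {
    /// Register a tool's format. Passthrough is never stored, so lookup
    /// misses fall back to it naturally.
    fn register(&mut self, server: &str, tool: &str, format: ResponseFormat) {
        if format == ResponseFormat::Passthrough {
            return;
        }
        self.formats
            .insert((server.to_string(), tool.to_string()), format);
    }

    /// Unknown (server, tool) pairs resolve to Passthrough (plain mcp_call).
    fn lookup_by_names(&self, server: &str, tool: &str) -> ResponseFormat {
        self.formats
            .get(&(server.to_string(), tool.to_string()))
            .copied()
            .unwrap_or(ResponseFormat::Passthrough)
    }
}

fn main() {
    let mut reg = FormatRegistry::default();
    reg.register("builtin", "web_search", ResponseFormat::WebSearchCall);
    reg.register("builtin", "image_generation", ResponseFormat::ImageGenerationCall);
    assert_eq!(
        reg.lookup_by_names("builtin", "web_search"),
        ResponseFormat::WebSearchCall
    );
    // A custom MCP tool with no entry is emitted as a passthrough mcp_call.
    assert_eq!(
        reg.lookup_by_names("user_srv", "my_tool"),
        ResponseFormat::Passthrough
    );
}
```

Defaulting misses to Passthrough keeps the lookup infallible, which is why the review discussion below focuses on when entries are (or are not) removed.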

Sequence Diagram(s)

sequenceDiagram
    participant Router as ModelGateway Router
    participant Bridge as openai_bridge::FormatRegistry
    participant Session as McpToolSession
    participant Orch as McpOrchestrator
    participant Tool as External Tool Server
    participant Store as Persistence

    Router->>Bridge: lookup_tool_format(server_key, tool_name)
    Bridge-->>Router: ResponseFormat
    Router->>Session: prepare ToolExecutionInput (with ResponseFormat)
    Session->>Orch: execute tool call (arguments)
    Orch->>Tool: dispatch hosted call
    Tool-->>Orch: raw JSON result
    Orch-->>Session: ToolExecutionOutput (raw)
    Router->>Bridge: transform_tool_output(output, ResponseFormat)
    Bridge-->>Router: ResponseOutputItem
    Router->>Store: compact_image_generation_outputs_json + persist

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • CatherineSue
  • key4ng
  • zhaowenzi

Poem

🐇 I hopped from core to gateway's ridge,

Formats bundled tidy in a bridge.
Old transforms tucked out of sight,
Routers call, the registry lights.
A rabbit cheers the code's new flight.

🚥 Pre-merge checks: ✅ 5 passed

  • Description check — ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed. The title accurately and concisely summarizes the main change: extracting OpenAI-specific functionality into a new router-side bridge module, the primary architectural change across the 49 modified files.
  • Docstring coverage — ✅ Passed. Docstring coverage is 100.00%; the required threshold is 80.00%.
  • Linked issues check — ✅ Passed. Check skipped: no linked issues were found for this pull request.
  • Out-of-scope changes check — ✅ Passed. Check skipped: no linked issues were found for this pull request.


The smg-mcp crate had grown into a mix of MCP protocol code and
OpenAI-flavored request/response/event transformation, forcing dual edits
across the crate boundary on every router-side change. This PR moves all
OpenAI-shaped concerns (response-format dispatch, tool descriptors,
hosted-tool overrides, mcp_call/web_search_call/code_interpreter_call/
file_search_call/image_generation_call transformation, and the format
side-map keyed off the inventory) into
`model_gateway::routers::common::openai_bridge`, leaving smg-mcp focused
on the MCP protocol surface.

Highlights:
- New `openai_bridge` module: `response_format`, `format_registry`
  (DashMap side-map keyed by `QualifiedToolName`), `format_descriptor`
  (per-format dispatch table), `tool_descriptors`, `transformer`,
  `overrides`.
- smg-mcp drops its `openai-protocol` dependency entirely.
- `ResponseFormat`, `ToolEntry::response_format`, the in-session tool/
  inventory JSON builders, and the alias-routing 3-tuple are removed
  from smg-mcp; aliases now live under a literal `ALIAS_SERVER_KEY`
  prefix that the bridge looks up.
- Routers consume the bridge directly, so adding a new built-in
  hosted tool no longer requires a parallel patch in smg-mcp.
- Image-generation result compactor is wired into the persistence
  path, the POST /v1/conversations/{id}/items handler, and the
  request-input persistence path so multi-MB base64 payloads are
  stripped before storage on every ingress; `build_stored_response`
  now consumes the compacted JSON to avoid a second full-tree clone.
- MCP integration tests rewritten against `session.execute_tool`.
- e2e image-generation compactor test now creates an explicit
  conversation so it actually exercises persistence.

Net diff: +708 / -3645 across 45 files (smg-mcp shrinks by ~3k LOC).

Signed-off-by: Simo Lin <[email protected]>
@slin1237 slin1237 force-pushed the refactor/extract-openai-bridge branch from f2d6026 to a2212fc Compare May 1, 2026 19:11
@lightseekorg lightseekorg deleted a comment from mergify Bot May 1, 2026
@gemini-code-assist Bot (Contributor) left a comment


Code Review

This pull request decouples the smg_mcp crate from the OpenAI protocol by moving response transformation logic, tool descriptors, and format registries into a new openai_bridge module within the model_gateway. It also implements image generation output compaction to optimize storage usage. Feedback suggests marking a test helper function in format_registry.rs with #[cfg(test)] to prevent it from being compiled into production code and optimizing inject_client_visible_mcp_output_items to reduce Vec allocations in a performance-sensitive path.

Comment on lines +103 to +118
```rust
fn server(name: &str) -> McpServerConfig {
    McpServerConfig {
        name: name.to_string(),
        transport: McpTransport::Streamable {
            url: "http://x".to_string(),
            token: None,
            headers: HashMap::new(),
        },
        proxy: None,
        required: false,
        tools: None,
        builtin_type: None,
        builtin_tool_name: None,
        internal: false,
    }
}
```
Severity: medium

The server helper function is used only in tests. It should be marked with #[cfg(test)] to avoid being compiled into production code.

Comment on lines +145 to +156
```rust
let existing = std::mem::take(output);
let servers = session.mcp_servers();
output.reserve(servers.len() + tool_call_items.len() + existing.len());

for binding in servers {
    if !session.is_internal_non_builtin_server_label(&binding.label) {
        output.push(mcp_list_tools_item(
            session,
            &binding.label,
            &binding.server_key,
        ));
    }
```
Severity: medium

The inject_client_visible_mcp_output_items function performs multiple Vec allocations and std::mem::take operations. Given this is a hot path for response processing, consider using a single pre-allocated Vec to improve performance.
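The single-reservation shape the reviewer suggests can be sketched as follows (item types reduced to strings for brevity; the real function builds mcp_list_tools and tool-call output items and filters internal servers):

```rust
/// Prepend list-tools and tool-call items ahead of the existing output with
/// a single reservation, avoiding per-step intermediate Vec allocations.
fn inject_items(output: &mut Vec<String>, list_items: Vec<String>, call_items: Vec<String>) {
    let existing = std::mem::take(output); // leaves `output` empty, no copy
    output.reserve(list_items.len() + call_items.len() + existing.len());
    output.extend(list_items);
    output.extend(call_items);
    output.extend(existing);
}

fn main() {
    let mut out = vec!["assistant_message".to_string()];
    inject_items(
        &mut out,
        vec!["mcp_list_tools".to_string()],
        vec!["mcp_call".to_string()],
    );
    assert_eq!(out, ["mcp_list_tools", "mcp_call", "assistant_message"]);
}
```

std::mem::take itself is cheap (it swaps in an empty Vec without copying the buffer); the win comes from the single reserve covering all three extends.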

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f2d60264d0


```rust
    output: &smg_mcp::ToolExecutionOutput,
    registry: &FormatRegistry,
) -> ResponseOutputItem {
    let response_format = registry.lookup_by_names(&output.server_key, &output.tool_name);
```

P1: Resolve tool format from qualified name before transforming output

transform_tool_output looks up the format with (output.server_key, output.tool_name), but McpToolSession::execute_tool_result rewrites output.tool_name to the invoked/exposed name (crates/mcp/src/core/session.rs, output.tool_name = invoked_name). When exposed names differ from the underlying tool name (for example duplicate names that get disambiguated, like mcp_<server>_<tool>), this lookup misses and falls back to Passthrough, so hosted-tool results are emitted as mcp_call instead of web_search_call/image_generation_call etc. This breaks response typing for those valid multi-server configurations.

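The failure mode described above can be reproduced in miniature: the registry is keyed by the underlying (server_key, tool_name) pair, so a lookup that uses the rewritten exposed name misses and silently degrades to passthrough. All names below are hypothetical and formats are strings for brevity:

```rust
use std::collections::HashMap;

/// Look up a format by (server_key, tool_name); a miss falls back to the
/// passthrough mcp_call shape — exactly the silent degradation at issue.
fn resolve(formats: &HashMap<(String, String), String>, server: &str, name: &str) -> String {
    formats
        .get(&(server.to_string(), name.to_string()))
        .cloned()
        .unwrap_or_else(|| "passthrough (mcp_call)".to_string())
}

/// Registry populated from config under the underlying tool name.
fn demo_registry() -> HashMap<(String, String), String> {
    let mut formats = HashMap::new();
    formats.insert(
        ("search_srv".to_string(), "search".to_string()),
        "web_search_call".to_string(),
    );
    formats
}

fn main() {
    let formats = demo_registry();
    // The session disambiguates duplicate tool names into an exposed name
    // (e.g. mcp_<server>_<tool>) and writes it back into output.tool_name,
    // so a post-execution lookup with that name uses the wrong key:
    assert_eq!(
        resolve(&formats, "search_srv", "mcp_search_srv_search"),
        "passthrough (mcp_call)"
    );
    // Mapping the exposed name back to its qualified name first hits:
    assert_eq!(resolve(&formats, "search_srv", "search"), "web_search_call");
}
```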

Comment on lines +12 to +25
```rust
/// and the registry. Returns `Passthrough` for unknown tools.
///
/// Lives next to `FormatRegistry` because it's a thin lookup helper that
/// composes session's name map with `FormatRegistry::lookup_by_names`,
/// avoiding the `Arc<str>` allocations of `QualifiedToolName::new`.
pub fn lookup_tool_format(
    session: &smg_mcp::McpToolSession<'_>,
    registry: &FormatRegistry,
    exposed_name: &str,
) -> ResponseFormat {
    let Some(qn) = session.qualified_name_for_exposed(exposed_name) else {
        return ResponseFormat::Passthrough;
    };
    registry.lookup_by_names(qn.server_key(), qn.tool_name())
```

🟡 Nit: The doc comment says this avoids the Arc<str> allocations of QualifiedToolName::new, but qualified_name_for_exposed already creates a QualifiedToolName (two Arc<str>), and then lookup_by_names creates a second one internally. You can halve the allocation by calling registry.lookup(&qn) directly:

Suggested change

```diff
-/// and the registry. Returns `Passthrough` for unknown tools.
-///
-/// Lives next to `FormatRegistry` because it's a thin lookup helper that
-/// composes session's name map with `FormatRegistry::lookup_by_names`,
-/// avoiding the `Arc<str>` allocations of `QualifiedToolName::new`.
-pub fn lookup_tool_format(
-    session: &smg_mcp::McpToolSession<'_>,
-    registry: &FormatRegistry,
-    exposed_name: &str,
-) -> ResponseFormat {
-    let Some(qn) = session.qualified_name_for_exposed(exposed_name) else {
-        return ResponseFormat::Passthrough;
-    };
-    registry.lookup_by_names(qn.server_key(), qn.tool_name())
+/// Resolve an exposed tool name's `ResponseFormat` via the session's name map
+/// and the registry. Returns `Passthrough` for unknown tools.
+pub fn lookup_tool_format(
+    session: &smg_mcp::McpToolSession<'_>,
+    registry: &FormatRegistry,
+    exposed_name: &str,
+) -> ResponseFormat {
+    let Some(qn) = session.qualified_name_for_exposed(exposed_name) else {
+        return ResponseFormat::Passthrough;
+    };
+    registry.lookup(&qn)
+}
```

Comment on lines +53 to +54
```rust
/// Populate from a server config: per-tool overrides + builtin defaults.
/// Safe to call repeatedly — entries are overwritten.
```

🟡 Nit: "entries are overwritten" is only true for non-Passthrough formats. Because the loop skips Passthrough (line 68), re-calling this for a server whose tool changed from e.g. WebSearchCall to Passthrough would leave the stale WebSearchCall entry in the map.

In practice all servers are registered once, so this is unlikely to bite, but the doc comment promises a stronger guarantee than the code delivers. Either document the caveat or add a self.formats.remove(...) for the Passthrough case so the "safe to call repeatedly" contract actually holds.

@coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/harmony/streaming.rs (1)

698-707: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Default exposed MCP tools to Passthrough when the registry is absent.

If session is present but format_registry is None, this returns None and the tool gets emitted as a plain function_call instead of mcp_call/builtin events. The safe fallback here is still MCP semantics, i.e. Some(ResponseFormat::Passthrough), or making session and format_registry impossible to desynchronize.

Suggested fix

```diff
-                                let response_format = session.and_then(|s| {
-                                    if s.has_exposed_tool(tool_name) {
-                                        format_registry.map(|reg| {
-                                            openai_bridge::lookup_tool_format(s, reg, tool_name)
-                                        })
-                                    } else {
-                                        None
-                                    }
-                                });
+                                let response_format = session.and_then(|s| {
+                                    if !s.has_exposed_tool(tool_name) {
+                                        return None;
+                                    }
+                                    Some(format_registry.map_or(ResponseFormat::Passthrough, |reg| {
+                                        openai_bridge::lookup_tool_format(s, reg, tool_name)
+                                    }))
+                                });
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 81-85: The current check (has_explicit =
config.tools.as_ref().is_some_and(|tools| tools.contains_key(tool_name))) treats
any per-tool stanza as an override and disables builtin formatting even when the
stanza only sets alias/arg_mapping; fix this by detecting whether the per-tool
entry actually specifies a response_format. Update the logic in
format_registry.rs to look up the per-tool config (e.g., tools.get(tool_name))
and only consider it an explicit format override when its response_format field
is present/Some (or when a new boolean “response_format_present” flag is set
during deserialization), otherwise fall back to applying the builtin default;
ensure you reference the response_format/Passthrough semantics so tools with
only alias/arg_mapping still get the hosted-tool shape instead of falling back
to mcp_call.
- Around line 67-69: populate_from_server_config currently skips inserting when
ResponseFormat::Passthrough but doesn't remove any existing mapping, so a prior
mapping (e.g., WebSearchCall) persists; modify populate_from_server_config so
that when format == ResponseFormat::Passthrough you first remove/clear any
existing entries for that QualifiedToolName (and any alias keys it may have)
from the registry maps before continuing, ensuring downgrades from a concrete
format back to Passthrough actually remove the transformation entries.

In `@model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs`:
- Around line 95-99: The current serialization of entry.tool.annotations (in
build_mcp_tool_infos / tool_descriptors.rs) serializes the whole ToolAnnotations
struct and leaks internal fields; change it to only emit the read_only flag by
mapping annotations to a JSON object containing only {"read_only":
entry.tool.annotations.read_only} (or None if annotations is None) instead of
serde_json::to_value(entry.tool.annotations).ok(), so the wire shape matches the
previous OpenAI Responses API parity.

In `@model_gateway/src/routers/grpc/common/responses/streaming.rs`:
- Around line 359-364: The change to use
crate::routers::common::openai_bridge::build_mcp_tool_infos in
tool_entries_to_json widens the serialized ToolAnnotations (adding
destructive/idempotent/open_world) and breaks the expected wire shape; revert to
or emulate the previous behavior by ensuring tool_entries_to_json only emits the
minimal annotation payload ({"read_only": entry.annotations.read_only}) like the
original smg_mcp::build_mcp_tool_infos did—either call the original
smg_mcp::build_mcp_tool_infos or post-process the values returned by
openai_bridge::build_mcp_tool_infos to strip all annotation fields except
read_only before returning.

In `@model_gateway/src/routers/grpc/harmony/responses/common.rs`:
- Around line 330-344: The current mcp_has_image_generation check treats any MCP
tool whose format_lookup yields ResponseFormat::ImageGenerationCall as the
hosted image_generation, which is too broad; change the predicate in
strip_image_generation_from_request_tools (the mcp_has_image_generation
variable) to require both that format_registry.lookup(&entry.qualified_name) ==
ResponseFormat::ImageGenerationCall AND that entry.qualified_name matches the
session’s actual hosted image_generation exposure name (use the session API that
returns the exposed/hosted tool name or compare against the known hosted
qualified name), so only the real hosted `image_generation` exposure causes
ResponseTool::ImageGeneration to be stripped.

In `@model_gateway/src/routers/openai/mcp/tool_loop.rs`:
- Around line 959-966: prepare_hosted_dispatch_args(...) mutates the local
`arguments`, but the subsequent call to build_transformed_mcp_call_item(...) is
still serializing `call.arguments`, so the client-visible transformed item can
differ from what was dispatched; change the build_transformed_mcp_call_item
invocation(s) to use the post-merge `arguments` (the mutated variable) when
serializing/constructing the transformed item (rather than `call.arguments`),
keeping the same `response_format`, `original_body` inputs (tools/user) and
apply the same change to the other occurrence of build_transformed_mcp_call_item
in this file where hosted dispatch args are prepared.

In `@model_gateway/src/routers/openai/responses/non_streaming.rs`:
- Around line 65-69: The code currently uses
ctx.components.mcp_format_registry().cloned().unwrap_or_default(), which
silently falls back to FormatRegistry::default() and can cause wrong MCP format
resolution; change this to require the registry be present by removing
unwrap_or_default and returning an error or panicking when mcp_format_registry
is None (e.g., use expect or propagate a Result) so that mcp_format_registry is
mandatory; update the call site that sets let mcp_format_registry to use
ctx.components.mcp_format_registry().cloned().expect("mcp_format_registry is
required") or propagate an appropriate error so missing registry fails fast.


ℹ️ Review info — configuration: Organization UI; review profile: ASSERTIVE; plan: Pro; run ID: 8691361e-d6ad-47b3-a08a-8837e7111084

📥 Commits

Reviewing files that changed from the base of the PR and between fbe2497 and a2212fc.

📒 Files selected for processing (49)
  • crates/mcp/Cargo.toml
  • crates/mcp/src/core/config.rs
  • crates/mcp/src/core/mod.rs
  • crates/mcp/src/core/orchestrator.rs
  • crates/mcp/src/core/session.rs
  • crates/mcp/src/inventory/mod.rs
  • crates/mcp/src/inventory/types.rs
  • crates/mcp/src/lib.rs
  • crates/mcp/src/responses_bridge.rs
  • crates/mcp/src/transform/mod.rs
  • e2e_test/responses/test_image_generation.py
  • model_gateway/src/app_context.rs
  • model_gateway/src/routers/anthropic/context.rs
  • model_gateway/src/routers/anthropic/router.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/mod.rs
  • model_gateway/src/routers/common/openai_bridge/format_descriptor.rs
  • model_gateway/src/routers/common/openai_bridge/format_registry.rs
  • model_gateway/src/routers/common/openai_bridge/mod.rs
  • model_gateway/src/routers/common/openai_bridge/overrides.rs
  • model_gateway/src/routers/common/openai_bridge/response_format.rs
  • model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/conversations/handlers.rs
  • model_gateway/src/routers/gemini/context.rs
  • model_gateway/src/routers/gemini/router.rs
  • model_gateway/src/routers/grpc/common/responses/context.rs
  • model_gateway/src/routers/grpc/common/responses/streaming.rs
  • model_gateway/src/routers/grpc/common/responses/utils.rs
  • model_gateway/src/routers/grpc/harmony/responses/common.rs
  • model_gateway/src/routers/grpc/harmony/responses/execution.rs
  • model_gateway/src/routers/grpc/harmony/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/harmony/responses/streaming.rs
  • model_gateway/src/routers/grpc/harmony/streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/common.rs
  • model_gateway/src/routers/grpc/regular/responses/handlers.rs
  • model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/grpc/router.rs
  • model_gateway/src/routers/openai/context.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/utils.rs
  • model_gateway/src/routers/openai/router.rs
  • model_gateway/src/service_discovery.rs
  • model_gateway/src/workflow/mcp_registration.rs
  • model_gateway/tests/mcp_test.rs
💤 Files with no reviewable changes (3)
  • crates/mcp/Cargo.toml
  • crates/mcp/src/responses_bridge.rs
  • crates/mcp/src/transform/mod.rs

Comment on lines +67 to +69
```rust
if format == ResponseFormat::Passthrough {
    continue;
}
```

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

populate_from_server_config is not actually repeat-safe when a format is removed.

The doc says repeated calls are safe, but this branch only skips insertion; it never clears an existing map entry. If a tool was previously registered as WebSearchCall and a later config changes it back to Passthrough, the old QualifiedToolName -> ResponseFormat mapping survives and the registry keeps transforming that tool. Clear the current direct/alias key(s) before continue so format downgrades take effect.

♻️ Minimal direction for the cleanup

```diff
             for (tool_name, tool_config) in tools {
+                self.formats
+                    .remove(&QualifiedToolName::new(&config.name, tool_name));
+                if let Some(alias) = &tool_config.alias {
+                    self.formats
+                        .remove(&QualifiedToolName::new(ALIAS_SERVER_KEY, alias));
+                }
+
                 let format: ResponseFormat = tool_config.response_format.into();
                 if format == ResponseFormat::Passthrough {
                     continue;
                 }
                 if let Some(alias) = &tool_config.alias {
```
                 if let Some(alias) = &tool_config.alias {
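The remove-before-insert pattern the reviewer asks for can be sketched in isolation. This is a minimal model, not the production code: a std HashMap stands in for the DashMap-backed FormatRegistry, and the key/enum types are illustrative assumptions.

```rust
// Repeat-safe repopulation sketch: clear any stale mapping first so a
// downgrade back to Passthrough actually takes effect on re-population.
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
}

fn repopulate(
    map: &mut HashMap<(String, String), ResponseFormat>,
    server: &str,
    tool: &str,
    format: ResponseFormat,
) {
    let key = (server.to_string(), tool.to_string());
    // Remove unconditionally before the Passthrough early-return.
    map.remove(&key);
    if format == ResponseFormat::Passthrough {
        return;
    }
    map.insert(key, format);
}

fn main() {
    let mut map = HashMap::new();
    repopulate(&mut map, "search", "do_search", ResponseFormat::WebSearchCall);
    assert_eq!(map.len(), 1);
    // A later config downgrades the tool back to Passthrough:
    repopulate(&mut map, "search", "do_search", ResponseFormat::Passthrough);
    assert!(map.is_empty());
}
```

Without the `remove`, the second call would be a no-op and the stale WebSearchCall mapping would survive.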
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs` around
lines 67 - 69, populate_from_server_config currently skips inserting when
ResponseFormat::Passthrough but doesn't remove any existing mapping, so a prior
mapping (e.g., WebSearchCall) persists; modify populate_from_server_config so
that when format == ResponseFormat::Passthrough you first remove/clear any
existing entries for that QualifiedToolName (and any alias keys it may have)
from the registry maps before continuing, ensuring downgrades from a concrete
format back to Passthrough actually remove the transformation entries.

Comment thread model_gateway/src/routers/common/openai_bridge/format_registry.rs Outdated
Comment on lines +95 to +99
annotations: entry
    .tool
    .annotations
    .as_ref()
    .and_then(|a| serde_json::to_value(a).ok()),

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep mcp_list_tools[].tools[].annotations narrowed to read_only.

Serializing ToolAnnotations wholesale changes the wire shape by exposing internal fields like destructive, idempotent, and open_world. The previous bridge intentionally emitted only read_only, and approval logic still reads the richer struct directly, so widening the JSON here is a compatibility regression.

Suggested fix
-            annotations: entry
-                .tool
-                .annotations
-                .as_ref()
-                .and_then(|a| serde_json::to_value(a).ok()),
+            annotations: entry.tool.annotations.as_ref().map(|a| {
+                json!({
+                    "read_only": a.read_only,
+                })
+            }),

Based on learnings: build_mcp_tool_infos intentionally serializes annotations as only {"read_only": entry.annotations.read_only} to preserve OpenAI Responses API wire parity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs` around
lines 95 - 99, The current serialization of entry.tool.annotations (in
build_mcp_tool_infos / tool_descriptors.rs) serializes the whole ToolAnnotations
struct and leaks internal fields; change it to only emit the read_only flag by
mapping annotations to a JSON object containing only {"read_only":
entry.tool.annotations.read_only} (or None if annotations is None) instead of
serde_json::to_value(entry.tool.annotations).ok(), so the wire shape matches the
previous OpenAI Responses API parity.

Comment on lines +359 to 364
/// Convert tool entries to JSON values using the shared bridge builder.
fn tool_entries_to_json(
tools: &[mcp::ToolEntry],
) -> Result<Vec<serde_json::Value>, serde_json::Error> {
mcp::build_mcp_tool_infos(tools)
crate::routers::common::openai_bridge::build_mcp_tool_infos(tools)
.into_iter()

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't widen mcp_list_tools[].tools[].annotations on this path.

Switching tool_entries_to_json() to openai_bridge::build_mcp_tool_infos() changes the streaming wire shape because the shared builder currently serializes the full ToolAnnotations object. The previous smg_mcp::build_mcp_tool_infos() intentionally exposed only {"read_only": ...}, so this swap can reintroduce destructive / idempotent / open_world into response.mcp_list_tools.completed.

Based on learnings build_mcp_tool_infos intentionally serializes only {"read_only": entry.annotations.read_only} to match OpenAI Responses API wire parity.
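If the narrowing the reviewer describes is desired, it can be sketched with std-only stand-in types. The real code builds serde_json values and the `ToolAnnotations` shape here is an assumption; the point is only that internal fields never reach the emitted JSON.

```rust
// Narrow a full annotations struct to the single `read_only` field before
// it reaches the wire, mirroring the previous bridge's {"read_only": ...}.
#[derive(Clone, Copy)]
struct ToolAnnotations {
    read_only: bool,
    destructive: bool,
    idempotent: bool,
    open_world: bool,
}

// Emit the narrowed shape as a plain JSON string for illustration.
fn narrowed_annotations_json(a: Option<&ToolAnnotations>) -> Option<String> {
    a.map(|a| format!("{{\"read_only\":{}}}", a.read_only))
}

fn main() {
    let full = ToolAnnotations {
        read_only: true,
        destructive: false,
        idempotent: true,
        open_world: false,
    };
    // The internal fields exist on the struct but never appear in the JSON.
    assert!(!full.destructive && full.idempotent && !full.open_world);
    assert_eq!(
        narrowed_annotations_json(Some(&full)).unwrap(),
        "{\"read_only\":true}"
    );
    assert_eq!(narrowed_annotations_json(None), None);
}
```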

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/common/responses/streaming.rs` around lines
359 - 364, The change to use
crate::routers::common::openai_bridge::build_mcp_tool_infos in
tool_entries_to_json widens the serialized ToolAnnotations (adding
destructive/idempotent/open_world) and breaks the expected wire shape; revert to
or emulate the previous behavior by ensuring tool_entries_to_json only emits the
minimal annotation payload ({"read_only": entry.annotations.read_only}) like the
original smg_mcp::build_mcp_tool_infos did—either call the original
smg_mcp::build_mcp_tool_infos or post-process the values returned by
openai_bridge::build_mcp_tool_infos to strip all annotation fields except
read_only before returning.

Comment thread model_gateway/src/routers/grpc/harmony/responses/common.rs Outdated
Comment thread model_gateway/src/routers/openai/mcp/tool_loop.rs
Comment thread model_gateway/src/routers/openai/responses/non_streaming.rs Outdated
…s, hosted args

Round 1 of PR #1429 review fixes (CodeRabbit, Codex, Claude, Gemini):

- (P1) `transform_tool_output` now takes a pre-resolved `ResponseFormat`
  instead of re-deriving it from `(output.server_key, output.tool_name)`.
  After `McpToolSession::execute_tool_result` the `tool_name` is the
  *invoked/exposed* name, so the registry lookup missed for disambiguated
  names like `mcp_<server>_<tool>` and silently degraded to `Passthrough`.
  All four call sites already had a session-aware `lookup_tool_format`
  result — this drops the duplicate (broken) lookup.

- (Major) Alias/arg_mapping-only stanzas no longer disable a builtin
  tool's hosted format. `ToolConfig.response_format` is now
  `Option<ResponseFormatConfig>`: `None` means "inherit context"
  (builtin → hosted format), `Some(Passthrough)` is the explicit
  opt-out that still suppresses the builtin default. Adds a
  regression test covering the alias-only-on-builtin case.

- (Major) Hosted-tool dispatch arguments mismatch in `tool_loop.rs`:
  `build_transformed_mcp_call_item` now serializes the post-merge
  `effective_arguments`, so the client-visible item describes what the
  router actually dispatched (e.g. image-generation `size`/`quality`
  overrides) rather than the pre-merge string emitted by the model.

- (Major) `strip_image_generation_from_request_tools` now keys on the
  literal exposed name `"image_generation"` routed to
  `ImageGenerationCall` rather than "any MCP tool with that format".
  A custom MCP tool with image-generation-shaped output (e.g.
  `thumbnailer`) no longer drops the real hosted tag.

- (Major) Both `/v1/responses` non-streaming and streaming entrypoints
  now hard-fail with 500 when `mcp_format_registry` is missing, instead
  of silently falling back to `FormatRegistry::default()` and
  mis-routing MCP-backed tools.

- (Nit) `lookup_tool_format` reuses the `QualifiedToolName` returned by
  `qualified_name_for_exposed` instead of rebuilding it via
  `lookup_by_names`, saving two `Arc<str>` allocations per call.

Skipped review items:
- "ToolAnnotations widening" (×2): false positive — `git show fbe2497`
  confirms the previous bridge already used the same
  `serde_json::to_value(annotations)` serialization. No wire-shape change.
- "`server` helper should be `#[cfg(test)]`": false positive — it
  already lives inside `#[cfg(test)] mod tests {}`.
- "Repeat-safe registry on Passthrough downgrade": no production caller
  mutates server config in place; documented the actual behavior in the
  doc comment instead of adding a `remove` path that would never fire.

Signed-off-by: Simo Lin <[email protected]>
Comment on lines +100 to +103
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    self.insert(QualifiedToolName::new(&config.name, tool_name), format);
}

🔴 Important: Builtin default is stored under the wrong key when the tool has an alias, so the production lookup still misses.

collect_visible_mcp_tools (session.rs:565-571) replaces a direct entry with its alias entries. So when do_search has alias web_search, only web_search is exposed. The production lookup path is:

  1. lookup_tool_format(session, registry, "web_search")
  2. session.qualified_name_for_exposed("web_search")QualifiedToolName("alias", "web_search")
  3. registry.lookup(…) checks for ("alias", "web_search")missPassthrough

But the builtin default was inserted as ("search", "do_search"), which is never queried in production for an aliased tool.

The per-tool loop (lines 84-88) already handles this correctly — when a tool has an alias, the format is stored under ("alias", alias_name). The builtin default path should mirror that:

Suggested change
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    self.insert(QualifiedToolName::new(&config.name, tool_name), format);
}
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    // When the tool has an alias, store under the alias key to match
    // the QualifiedToolName shape that McpToolSession exposes.
    let alias = config
        .tools
        .as_ref()
        .and_then(|tools| tools.get(tool_name))
        .and_then(|cfg| cfg.alias.as_ref());
    if let Some(alias_name) = alias {
        self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias_name), format);
    } else {
        self.insert(QualifiedToolName::new(&config.name, tool_name), format);
    }
}

Comment on lines +265 to +269
assert_eq!(
r.lookup_by_names("search", "do_search"),
ResponseFormat::WebSearchCall,
"alias-only stanza must not disable the builtin's hosted format"
);

🔴 Important: This test doesn't exercise the production lookup path and gives a false positive.

lookup_by_names("search", "do_search") queries the (server, tool) pair directly. But in production, the session resolves alias "web_search" to QualifiedToolName("alias", "web_search") — because collect_visible_mcp_tools replaces direct entries with their aliases — and that key was never inserted into the registry.

To catch the bug flagged above, this assertion should additionally (or instead) check the alias key:

assert_eq!(
    r.lookup_by_names("alias", "web_search"),
    ResponseFormat::WebSearchCall,
    "alias-only stanza must not disable the builtin's hosted format (alias lookup)"
);

This assertion will fail with the current populate_from_server_config implementation, confirming the bug.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cafb8ddd35


Comment on lines +100 to +102
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    self.insert(QualifiedToolName::new(&config.name, tool_name), format);

P1 Badge Register builtin default format for aliased builtin tools

When a built-in server config sets tools.<builtin_tool_name>.alias but leaves response_format unset (None), this code inserts the default hosted format only for (<server>, <tool_name>). However, McpToolSession exposes and resolves the callable name through the alias entry (("alias", <alias_name>)), so lookup_tool_format(...) misses and falls back to Passthrough, emitting mcp_call instead of the hosted shape (e.g. web_search_call / image_generation_call). This regresses the documented “alias-only stanza preserves builtin default” behavior for real tool execution paths.

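The lookup mismatch Codex flags can be reproduced in miniature. A std HashMap stands in for the DashMap-backed registry here, and the key shape and `ALIAS_SERVER_KEY` value are illustrative assumptions, not the production types.

```rust
// Sketch: the builtin default is inserted under the canonical
// (server, tool) key, but the session resolves the aliased tool through
// ("alias", alias_name), so the lookup degrades to Passthrough.
use std::collections::HashMap;

const ALIAS_SERVER_KEY: &str = "alias";

#[derive(Clone, Copy, Debug, PartialEq)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
}

type Registry = HashMap<(String, String), ResponseFormat>;

fn lookup(r: &Registry, server: &str, tool: &str) -> ResponseFormat {
    r.get(&(server.to_string(), tool.to_string()))
        .copied()
        .unwrap_or(ResponseFormat::Passthrough)
}

fn main() {
    let mut r = Registry::new();
    // Builtin default inserted under the canonical key only (current code):
    r.insert(
        ("search".into(), "do_search".into()),
        ResponseFormat::WebSearchCall,
    );
    // Production resolves the aliased tool via the alias key and misses:
    assert_eq!(
        lookup(&r, ALIAS_SERVER_KEY, "web_search"),
        ResponseFormat::Passthrough
    );
    // Registering the default under the alias key as well fixes the lookup:
    r.insert(
        (ALIAS_SERVER_KEY.into(), "web_search".into()),
        ResponseFormat::WebSearchCall,
    );
    assert_eq!(
        lookup(&r, ALIAS_SERVER_KEY, "web_search"),
        ResponseFormat::WebSearchCall
    );
}
```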

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/harmony/responses/common.rs (1)

184-189: ⚠️ Potential issue | 🟠 Major

Don't re-expose mcp_list_tools for internal builtin servers.

Switching this path to openai_bridge::inject_client_visible_mcp_output_items(...) introduces a visibility regression. The bridge helper emits mcp_list_tools for any binding that is not is_internal_non_builtin_server_label(...), which allows builtin-routed internal servers to become client-visible again. This breaks the established contract where all internal MCP plumbing (both builtin and non-builtin) stays hidden.

Change the predicate in model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs to use is_internal_server_label instead, so that only external servers emit mcp_list_tools items.
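The requested visibility rule can be sketched as a pair of label predicates. The label prefixes below are invented assumptions for illustration; only the relationship between the two checks matches the review comment.

```rust
// Sketch: the old predicate hides only non-builtin internal servers, so a
// builtin-routed internal server would still emit mcp_list_tools. The
// requested predicate hides every internal server.
fn is_internal_server_label(label: &str) -> bool {
    label.starts_with("__internal_")
}

fn is_internal_non_builtin_server_label(label: &str) -> bool {
    label.starts_with("__internal_") && !label.starts_with("__internal_builtin_")
}

fn emits_mcp_list_tools(label: &str) -> bool {
    // Reviewer's requested predicate: only external servers are visible.
    !is_internal_server_label(label)
}

fn main() {
    // Builtin-routed internal server: hidden under the fixed predicate,
    // but NOT caught by the old non-builtin-only check.
    assert!(!emits_mcp_list_tools("__internal_builtin_web_search"));
    assert!(!is_internal_non_builtin_server_label("__internal_builtin_web_search"));
    // External server stays client-visible.
    assert!(emits_mcp_list_tools("deepwiki"));
}
```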

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/harmony/responses/common.rs` around lines 184
- 189, The bridge helper inject_client_visible_mcp_output_items is currently
allowing builtin-routed internal servers to become client-visible because
tool_descriptors.rs uses is_internal_non_builtin_server_label in its predicate
that decides whether to emit mcp_list_tools; update that predicate to use
is_internal_server_label instead so only external servers emit mcp_list_tools.
Locate the emission logic in
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs (the code
path that decides to emit mcp_list_tools) and replace the
is_internal_non_builtin_server_label check with is_internal_server_label,
preserving surrounding logic and tests to ensure builtins remain hidden.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 84-102: When inserting the response format for a tool that has an
alias, also insert the same format under the canonical key so builtin dispatch
won't fall back to Passthrough; in the block that checks if let Some(alias) =
&tool_config.alias { ... } (and uses
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format)), also
check if tool_config.response_format.is_some() (or the resolved format is
explicit) and then call self.insert(QualifiedToolName::new(&config.name,
tool_name), format) so the canonical (server_name, tool_name) entry is
preserved; keep using QualifiedToolName::new and the same ResponseFormat value.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 815-816: The SSE payload currently emits the ad-hoc raw
`{arguments, output}`/`{error}` object instead of the transformed
ResponseOutputItem; change the code that serializes and emits the streamed
"done" event to use the value returned by openai_bridge::transform_tool_output
(the variable output_item used in state.record_call()) — serialize output_item
and emit that for output_item.done. Also ensure failure shapes for
hosted-builtin families (WebSearchCall, CodeInterpreterCall, FileSearchCall,
ImageGenerationCall) are represented by status: "failed" only (no error field)
while allowing McpCall to include an error field, so the serialized output_item
follows those family-specific rules.

---

Outside diff comments:
In `@model_gateway/src/routers/grpc/harmony/responses/common.rs`:
- Around line 184-189: The bridge helper inject_client_visible_mcp_output_items
is currently allowing builtin-routed internal servers to become client-visible
because tool_descriptors.rs uses is_internal_non_builtin_server_label in its
predicate that decides whether to emit mcp_list_tools; update that predicate to
use is_internal_server_label instead so only external servers emit
mcp_list_tools. Locate the emission logic in
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs (the code
path that decides to emit mcp_list_tools) and replace the
is_internal_non_builtin_server_label check with is_internal_server_label,
preserving surrounding logic and tests to ensure builtins remain hidden.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5f06c2f0-2c36-4062-bd62-26fc7b3df2e5

📥 Commits

Reviewing files that changed from the base of the PR and between a2212fc and cafb8dd.

📒 Files selected for processing (13)
  • crates/mcp/src/core/config.rs
  • crates/mcp/src/core/orchestrator.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/openai_bridge/format_registry.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/grpc/harmony/responses/common.rs
  • model_gateway/src/routers/grpc/harmony/responses/execution.rs
  • model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/tests/mcp_test.rs

Comment on lines +84 to +102
        if let Some(alias) = &tool_config.alias {
            self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format);
        } else {
            self.insert(QualifiedToolName::new(&config.name, tool_name), format);
        }
    }
}

if let (Some(builtin_type), Some(tool_name)) =
    (&config.builtin_type, &config.builtin_tool_name)
{
    let has_explicit_format = config
        .tools
        .as_ref()
        .and_then(|tools| tools.get(tool_name))
        .is_some_and(|cfg| cfg.response_format.is_some());
    if !has_explicit_format {
        let format: ResponseFormat = builtin_type.response_format().into();
        self.insert(QualifiedToolName::new(&config.name, tool_name), format);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the explicit format on the canonical builtin key, too.

If tool_name is also config.builtin_tool_name and the stanza has both alias and a concrete response_format, this code only inserts ("alias", alias) and suppresses the builtin fallback. Builtin dispatch later resolves by (server_name, tool_name), so those calls degrade to Passthrough even though the config explicitly requested a hosted shape.

♻️ Minimal fix
                 if format == ResponseFormat::Passthrough {
                     continue;
                 }
                 if let Some(alias) = &tool_config.alias {
                     self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format);
+                    if config.builtin_tool_name.as_deref() == Some(tool_name.as_str()) {
+                        self.insert(QualifiedToolName::new(&config.name, tool_name), format);
+                    }
                 } else {
                     self.insert(QualifiedToolName::new(&config.name, tool_name), format);
                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs` around
lines 84 - 102, When inserting the response format for a tool that has an alias,
also insert the same format under the canonical key so builtin dispatch won't
fall back to Passthrough; in the block that checks if let Some(alias) =
&tool_config.alias { ... } (and uses
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format)), also
check if tool_config.response_format.is_some() (or the resolved format is
explicit) and then call self.insert(QualifiedToolName::new(&config.name,
tool_name), format) so the canonical (server_name, tool_name) entry is
preserved; keep using QualifiedToolName::new and the same ResponseFormat value.

Comment on lines +815 to +816
let output_item =
    openai_bridge::transform_tool_output(&tool_output, &response_format);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the transformed item for streamed output_item.done.

The result of transform_tool_output() is only fed into state.record_call(). The SSE payload emitted just above still uses the ad-hoc raw {arguments, output} / {error} object, so builtin formats can stream a different shape than the final ResponseOutputItem, and failures on hosted builtin families leak an error field those variants do not support. Serialize the transformed item here and emit that instead.

Based on learnings, hosted-builtin families (WebSearchCall, CodeInterpreterCall, FileSearchCall, ImageGenerationCall) must convey failure via status: "failed" alone; only McpCall may carry an error field.
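The family-specific failure rule stated above can be sketched with stand-in types (the real ResponseOutputItem variants are richer; only the status/error split shown here comes from the review comment):

```rust
// Sketch: hosted builtin families report failure via status "failed" alone;
// only McpCall may carry an error string on the wire.
#[derive(Debug, PartialEq)]
enum ResponseFormat {
    McpCall,
    WebSearchCall,
    CodeInterpreterCall,
    FileSearchCall,
    ImageGenerationCall,
}

struct FailureShape {
    status: &'static str,
    error: Option<String>,
}

fn failure_shape(format: &ResponseFormat, message: &str) -> FailureShape {
    match format {
        ResponseFormat::McpCall => FailureShape {
            status: "failed",
            error: Some(message.to_string()),
        },
        // Hosted families: no error field, status conveys the failure.
        _ => FailureShape {
            status: "failed",
            error: None,
        },
    }
}

fn main() {
    let mcp = failure_shape(&ResponseFormat::McpCall, "timeout");
    assert_eq!(mcp.status, "failed");
    assert_eq!(mcp.error.as_deref(), Some("timeout"));

    let web = failure_shape(&ResponseFormat::WebSearchCall, "timeout");
    assert_eq!(web.status, "failed");
    assert!(web.error.is_none());
}
```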

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines
815 - 816, The SSE payload currently emits the ad-hoc raw `{arguments,
output}`/`{error}` object instead of the transformed ResponseOutputItem; change
the code that serializes and emits the streamed "done" event to use the value
returned by openai_bridge::transform_tool_output (the variable output_item used
in state.record_call()) — serialize output_item and emit that for
output_item.done. Also ensure failure shapes for hosted-builtin families
(WebSearchCall, CodeInterpreterCall, FileSearchCall, ImageGenerationCall) are
represented by status: "failed" only (no error field) while allowing McpCall to
include an error field, so the serialized output_item follows those
family-specific rules.

slin1237 added 2 commits May 1, 2026 14:53
CI runs `cargo clippy --all-targets --all-features -- -D warnings` on this
PR for the first time (prior runs were cancelled), surfacing three pre-existing
lint classes:

- `clippy::absolute_paths`: ~17 sites referenced
  `crate::routers::common::openai_bridge::…` inline. Add a `use` for the
  module in each affected file and shorten the call sites.
- `clippy::trivially_copy_pass_by_ref`: `ResponseFormat` is a 1-byte `Copy`
  enum. Switch ~15 function signatures (and their callers) from
  `&ResponseFormat` to `ResponseFormat`. The same shift applies to the
  internal `ResponseTransformer::transform`, `transform_tool_output`,
  `build_transformed_mcp_call_item`, `non_streaming_tool_item_id_source`,
  `stable_streaming_tool_item_id`, `send_tool_call_*`, the gRPC emitter
  helpers, and the harmony streaming arg-classifier.
- `clippy::clone_on_copy`: drop a `.clone()` on `Option<ResponseFormat>` in
  the harmony tool-call tracking insert, and reformulate three
  `if let Some(ref fmt) = response_format` bindings now that the inner
  format is `Copy` (using `if let Some(fmt) = *response_format` for the
  borrowed-tuple cases).

No behavioral change — these are surface-level type / import edits to
satisfy the workspace lint gate.

Signed-off-by: Simo Lin <[email protected]>
The earlier import-shortening pass left several call sites at 5-line
formatting that fmt collapses back to 1-2 lines. Apply cargo fmt to
match the project style.

Signed-off-by: Simo Lin <[email protected]>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (1)
model_gateway/src/routers/grpc/regular/responses/streaming.rs (1)

748-795: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stream output_item.done from the transformed item.

The stream still emits ad-hoc {arguments, output} / {error} objects here, then builds the real ResponseOutputItem separately below for state.record_call(). For hosted formats that means SSE clients see a different shape than the final response, and builtin failures still leak an error field those variants do not carry. Reuse the transformed item for output_item.done so the streamed and persisted forms stay identical.

Also applies to: 815-816

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines
748 - 795, The stream currently emits ad-hoc JSON objects for successful and
failed tool results then separately builds the final ResponseOutputItem for
state.record_call(), causing mismatched shapes; instead, transform the
tool_output once into the final ResponseOutputItem structure (the same shape
used for state.record_call()) and pass that transformed item to
emitter.emit_output_item_done/output_item.done; update the branches where
item_done is constructed (the mutable json! blocks around
emitter.emit_output_item_done, using tool_output, output_str, response_format,
attach_mcp_server_label) to reuse that transformed item for both
emitter.send_event(&event, &tx)? and for state.record_call(), and remove the
duplicate ad-hoc {arguments, output} / {error} objects so streamed and persisted
forms are identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/transformer.rs`:
- Around line 21-33: transform_tool_output currently discards
ToolExecutionOutput's failure info and always builds a successful
ResponseOutputItem via ResponseTransformer::transform; update
transform_tool_output to detect output.is_error (and use output.error_message)
and propagate that into the resulting ResponseOutputItem (e.g., set status to
"failed" and populate error/message fields) instead of always returning a
completed/no-error item so failed MCP/hosted-tool executions are persisted as
failures. Locate transform_tool_output and the call to
ResponseTransformer::transform and either pass through an explicit error/status
override or post-process the returned ResponseOutputItem to set status and error
using output.is_error and output.error_message.

In `@model_gateway/src/routers/common/persistence_utils.rs`:
- Around line 298-303: item_to_new_conversation_item() currently extracts
non-message input items by reading the "content" field which is absent for
replayed tool-output items, causing them to be persisted as [] and lost later;
after compact_image_generation_outputs_json(...) is called on value, change the
persistence logic in item_to_new_conversation_item() to detect when "content" is
missing or when the item is a structured tool/input (e.g., contains
image_generation_call or hosted-tool keys) and instead serialize the compacted
whole JSON value (the same `value` passed into
openai_bridge::compact_image_generation_outputs_json) verbatim into the
conversation item field used for storage (rather than using content), so
replayed structured inputs persist intact across loads.

In `@model_gateway/src/routers/openai/responses/streaming.rs`:
- Around line 1092-1103: The code currently requires
ctx.components.mcp_format_registry() unconditionally, causing 500s for non‑MCP
streams; instead, only require and clone the MCP format registry when the
request will enter MCP interception (i.e., when original_body.tools.as_deref()
is Some). Move the mcp_format_registry lookup into the branch that checks
original_body.tools (the same branch that calls ensure_request_mcp_client), and
if tools exist but ctx.components.mcp_format_registry() is None return the
internal_error; otherwise let plain streaming paths continue to
handle_simple_streaming_passthrough() without needing the registry.

In `@model_gateway/tests/mcp_test.rs`:
- Around line 337-347: The test currently hardcodes call/server/tool/args when
calling ResponseTransformer::transform; replace that with the production helper
openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) (or
at minimum pass output.call_id, output.server_label, output.tool_name, and
output.arguments_str into ResponseTransformer::transform) so the assertions
exercise the actual ToolExecutionOutput produced by session.execute_tool();
update both the instance around lines ~337 and the similar block at ~422-429 to
use transform_tool_output or the output.* fields.

---

Duplicate comments:
In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 748-795: The stream currently emits ad-hoc JSON objects for
successful and failed tool results then separately builds the final
ResponseOutputItem for state.record_call(), causing mismatched shapes; instead,
transform the tool_output once into the final ResponseOutputItem structure (the
same shape used for state.record_call()) and pass that transformed item to
emitter.emit_output_item_done/output_item.done; update the branches where
item_done is constructed (the mutable json! blocks around
emitter.emit_output_item_done, using tool_output, output_str, response_format,
attach_mcp_server_label) to reuse that transformed item for both
emitter.send_event(&event, &tx)? and for state.record_call(), and remove the
duplicate ad-hoc {arguments, output} / {error} objects so streamed and persisted
forms are identical.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 87979d65-d087-4c6e-a968-b80299ab9481

📥 Commits

Reviewing files that changed from the base of the PR and between cafb8dd and 71e1b23.

📒 Files selected for processing (18)
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/conversations/handlers.rs
  • model_gateway/src/routers/grpc/common/responses/streaming.rs
  • model_gateway/src/routers/grpc/common/responses/utils.rs
  • model_gateway/src/routers/grpc/harmony/responses/common.rs
  • model_gateway/src/routers/grpc/harmony/responses/execution.rs
  • model_gateway/src/routers/grpc/harmony/streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/common.rs
  • model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/openai/context.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/utils.rs
  • model_gateway/src/service_discovery.rs
  • model_gateway/tests/mcp_test.rs

Comment on lines +21 to +33
pub fn transform_tool_output(
    output: &smg_mcp::ToolExecutionOutput,
    response_format: ResponseFormat,
) -> ResponseOutputItem {
    ResponseTransformer::transform(
        &output.output,
        response_format,
        &output.call_id,
        &output.server_label,
        &output.tool_name,
        &output.arguments_str,
    )
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve ToolExecutionOutput failure state here.

transform_tool_output() throws away is_error/error_message and always delegates to builders that stamp successful items (status = "completed" / error = None). The streaming caller still invokes this helper on error paths, so failed MCP or hosted-tool executions can be persisted and replayed as successes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/transformer.rs` around lines
21 - 33, transform_tool_output currently discards ToolExecutionOutput's failure
info and always builds a successful ResponseOutputItem via
ResponseTransformer::transform; update transform_tool_output to detect
output.is_error (and use output.error_message) and propagate that into the
resulting ResponseOutputItem (e.g., set status to "failed" and populate
error/message fields) instead of always returning a completed/no-error item so
failed MCP/hosted-tool executions are persisted as failures. Locate
transform_tool_output and the call to ResponseTransformer::transform and either
pass through an explicit error/status override or post-process the returned
ResponseOutputItem to set status and error using output.is_error and
output.error_message.

Comment on lines +298 to +303
// Strip image_generation_call.result base64 from
// historical replayed items before persistence —
// a no-op for non-image item types.
openai_bridge::compact_image_generation_outputs_json(
std::slice::from_mut(&mut value),
);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist replayed structured input items verbatim.

This now preserves replayed image_generation_call/hosted-tool inputs long enough to compact them, but item_to_new_conversation_item() still stores non-message input items by reading their content field. For replayed tool-output items that field is absent, so the linked conversation item becomes [] and the next history load loses the original item.

💡 Suggested fix
-    let store_whole_item = if is_input {
-        item_type == "function_call" || item_type == "function_call_output"
-    } else {
-        item_type != "message"
-    };
+    let store_whole_item = item_type != "message";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/persistence_utils.rs` around lines 298 -
303, item_to_new_conversation_item() currently extracts non-message input items
by reading the "content" field which is absent for replayed tool-output items,
causing them to be persisted as [] and lost later; after
compact_image_generation_outputs_json(...) is called on value, change the
persistence logic in item_to_new_conversation_item() to detect when "content" is
missing or when the item is a structured tool/input (e.g., contains
image_generation_call or hosted-tool keys) and instead serialize the compacted
whole JSON value (the same `value` passed into
openai_bridge::compact_image_generation_outputs_json) verbatim into the
conversation item field used for storage (rather than using content), so
replayed structured inputs persist intact across loads.

Comment on lines +1092 to +1103
// Same fail-fast contract as the non-streaming path: a missing format
// registry means MCP routing decisions would be silently wrong.
let mcp_format_registry = match ctx.components.mcp_format_registry() {
Some(r) => r.clone(),
None => {
return error::internal_error("internal_error", "MCP format registry required");
}
};

// Check for MCP tools and create request context if needed
let mcp_servers = if let Some(tools) = original_body.tools.as_deref() {
-        ensure_request_mcp_client(&mcp_orchestrator, tools).await
+        ensure_request_mcp_client(&mcp_orchestrator, &mcp_format_registry, tools).await

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only require the format registry for MCP-bound streams.

This fail-fast now runs before we know whether the request will even enter the MCP interception path. In contexts where components.mcp_format_registry() is unset, plain streaming requests that would otherwise use handle_simple_streaming_passthrough() now return 500 even though no MCP routing is needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/openai/responses/streaming.rs` around lines 1092 -
1103, The code currently requires ctx.components.mcp_format_registry()
unconditionally, causing 500s for non‑MCP streams; instead, only require and
clone the MCP format registry when the request will enter MCP interception
(i.e., when original_body.tools.as_deref() is Some). Move the
mcp_format_registry lookup into the branch that checks original_body.tools (the
same branch that calls ensure_request_mcp_client), and if tools exist but
ctx.components.mcp_format_registry() is None return the internal_error;
otherwise let plain streaming paths continue to
handle_simple_streaming_passthrough() without needing the registry.

Comment on lines +337 to +347
// The session returns the raw `output` Value from the MCP call. Re-transform
// with WebSearchCall format to verify serialization (end-to-end source
// extraction is covered by the gateway bridge's own tests).
let transformed = ResponseTransformer::transform(
&output.output,
ResponseFormat::WebSearchCall,
"test-request-openai-search",
"openai_search_server",
"brave_web_search",
"{\"query\":\"rust openai search\"}",
);

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Exercise the production bridge helper in these assertions.

These checks hardcode the call/server/tool/args instead of transforming the actual ToolExecutionOutput returned by session.execute_tool(). They can still pass if session-side rewriting regresses. Prefer openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) here, or at least feed output.call_id, output.server_label, output.tool_name, and output.arguments_str back into the transformer.

Also applies to: 422-429

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/tests/mcp_test.rs` around lines 337 - 347, The test currently
hardcodes call/server/tool/args when calling ResponseTransformer::transform;
replace that with the production helper
openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) (or
at minimum pass output.call_id, output.server_label, output.tool_name, and
output.arguments_str into ResponseTransformer::transform) so the assertions
exercise the actual ToolExecutionOutput produced by session.execute_tool();
update both the instance around lines ~337 and the similar block at ~422-429 to
use transform_tool_output or the output.* fields.

slin1237 added 2 commits May 1, 2026 16:12
…tself

The previous assertion ("base64 not anywhere in the GET payload") was
over-specified. On the gRPC SGLang lane the model (gpt-oss-20b) sees the
base64 in its function_call_output and legitimately echoes it back inside
its assistant reply text ("Base64 PNG data: iVBORw..."). That message
text is then persisted as a normal message item and trips the assertion
even though the compactor is doing exactly what it was wired to do: the
stored image_generation_call item has no `result` field.

Narrow the assertion to the structured field where compactor failure
actually surfaces: the persisted image_generation_call must carry no
top-level or nested `result`, and the base64 must not appear inside the
serialized image_generation_call item. Free-form assistant text is
deliberately not policed here; that is a separate concern (model
behavior on the upstream side) and the gateway has nothing to compact.

Signed-off-by: Simo Lin <[email protected]>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d844414070

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +81 to +82
if format == ResponseFormat::Passthrough {
continue;

P2 | Remove cached format on passthrough reconfiguration

When populate_from_server_config is called again for an already-registered server, Some(Passthrough) is treated as a no-op (continue) and the previous non-passthrough mapping is left in formats. Because ConnectMcpServerStep repopulates this shared registry on each RegisterMcpServer run, changing a tool from web_search_call/etc. back to passthrough will keep transforming outputs as the old hosted type. This creates incorrect response item types after MCP server config updates unless stale keys are explicitly deleted.

Labels

anthropic Anthropic router changes dependencies Dependency updates gemini Gemini router changes grpc gRPC client and router changes mcp MCP related changes model-gateway Model gateway crate changes openai OpenAI router changes tests Test changes
