refactor(mcp): extract OpenAI-flavored bridge into router-side module#1429
refactor(mcp): extract OpenAI-flavored bridge into router-side module#1429slin1237 wants to merge 6 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughMoves OpenAI/protocol response-format responsibilities out of the MCP crate and into a new model_gateway openai_bridge. MCP no longer stores or transforms response formats; model_gateway now owns FormatRegistry, tool-descriptor builders, transformation/compaction, and threads the registry through routers, persistence, and workflow. ChangesMCP → Gateway format refactor
Sequence Diagram(s)sequenceDiagram
participant Router as ModelGateway Router
participant Bridge as openai_bridge::FormatRegistry
participant Session as McpToolSession
participant Orch as McpOrchestrator
participant Tool as External Tool Server
participant Store as Persistence
Router->>Bridge: lookup_tool_format(server_key, tool_name)
Bridge-->>Router: ResponseFormat
Router->>Session: prepare ToolExecutionInput (with ResponseFormat)
Session->>Orch: execute tool call (arguments)
Orch->>Tool: dispatch hosted call
Tool-->>Orch: raw JSON result
Orch-->>Session: ToolExecutionOutput (raw)
Router->>Bridge: transform_tool_output(output, ResponseFormat)
Bridge-->>Router: ResponseOutputItem
Router->>Store: compact_image_generation_outputs_json + persist
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
The smg-mcp crate had grown into a mix of MCP protocol code and
OpenAI-flavored request/response/event transformation, forcing dual edits
across the crate boundary on every router-side change. This PR moves all
OpenAI-shaped concerns (response-format dispatch, tool descriptors,
hosted-tool overrides, mcp_call/web_search_call/code_interpreter_call/
file_search_call/image_generation_call transformation, and the format
side-map keyed off the inventory) into
`model_gateway::routers::common::openai_bridge`, leaving smg-mcp focused
on the MCP protocol surface.
Highlights:
- New `openai_bridge` module: `response_format`, `format_registry`
(DashMap side-map keyed by `QualifiedToolName`), `format_descriptor`
(per-format dispatch table), `tool_descriptors`, `transformer`,
`overrides`.
- smg-mcp drops its `openai-protocol` dependency entirely.
- `ResponseFormat`, `ToolEntry::response_format`, the in-session tool/
inventory JSON builders, and the alias-routing 3-tuple are removed
from smg-mcp; aliases now live under a literal `ALIAS_SERVER_KEY`
prefix that the bridge looks up.
- Routers consume the bridge directly, so adding a new built-in
hosted tool no longer requires a parallel patch in smg-mcp.
- Image-generation result compactor is wired into the persistence
path, the POST /v1/conversations/{id}/items handler, and the
request-input persistence path so multi-MB base64 payloads are
stripped before storage on every ingress; `build_stored_response`
now consumes the compacted JSON to avoid a second full-tree clone.
- MCP integration tests rewritten against `session.execute_tool`.
- e2e image-generation compactor test now creates an explicit
conversation so it actually exercises persistence.
Net diff: +708 / -3645 across 45 files (smg-mcp shrinks by ~3k LOC).
Signed-off-by: Simo Lin <[email protected]>
f2d6026 to
a2212fc
Compare
There was a problem hiding this comment.
Code Review
This pull request decouples the smg_mcp crate from the OpenAI protocol by moving response transformation logic, tool descriptors, and format registries into a new openai_bridge module within the model_gateway. It also implements image generation output compaction to optimize storage usage. Feedback suggests marking a test helper function in format_registry.rs with #[cfg(test)] to prevent it from being compiled into production code and optimizing inject_client_visible_mcp_output_items to reduce Vec allocations in a performance-sensitive path.
| fn server(name: &str) -> McpServerConfig { | ||
| McpServerConfig { | ||
| name: name.to_string(), | ||
| transport: McpTransport::Streamable { | ||
| url: "http://x".to_string(), | ||
| token: None, | ||
| headers: HashMap::new(), | ||
| }, | ||
| proxy: None, | ||
| required: false, | ||
| tools: None, | ||
| builtin_type: None, | ||
| builtin_tool_name: None, | ||
| internal: false, | ||
| } | ||
| } |
| let existing = std::mem::take(output); | ||
| let servers = session.mcp_servers(); | ||
| output.reserve(servers.len() + tool_call_items.len() + existing.len()); | ||
|
|
||
| for binding in servers { | ||
| if !session.is_internal_non_builtin_server_label(&binding.label) { | ||
| output.push(mcp_list_tools_item( | ||
| session, | ||
| &binding.label, | ||
| &binding.server_key, | ||
| )); | ||
| } |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2d60264d0
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| output: &smg_mcp::ToolExecutionOutput, | ||
| registry: &FormatRegistry, | ||
| ) -> ResponseOutputItem { | ||
| let response_format = registry.lookup_by_names(&output.server_key, &output.tool_name); |
There was a problem hiding this comment.
Resolve tool format from qualified name before transforming output
transform_tool_output looks up the format with (output.server_key, output.tool_name), but McpToolSession::execute_tool_result rewrites output.tool_name to the invoked/exposed name (crates/mcp/src/core/session.rs, output.tool_name = invoked_name). When exposed names differ from the underlying tool name (for example duplicate names that get disambiguated, like mcp_<server>_<tool>), this lookup misses and falls back to Passthrough, so hosted-tool results are emitted as mcp_call instead of web_search_call/image_generation_call etc. This breaks response typing for those valid multi-server configurations.
Useful? React with 👍 / 👎.
| /// and the registry. Returns `Passthrough` for unknown tools. | ||
| /// | ||
| /// Lives next to `FormatRegistry` because it's a thin lookup helper that | ||
| /// composes session's name map with `FormatRegistry::lookup_by_names`, | ||
| /// avoiding the `Arc<str>` allocations of `QualifiedToolName::new`. | ||
| pub fn lookup_tool_format( | ||
| session: &smg_mcp::McpToolSession<'_>, | ||
| registry: &FormatRegistry, | ||
| exposed_name: &str, | ||
| ) -> ResponseFormat { | ||
| let Some(qn) = session.qualified_name_for_exposed(exposed_name) else { | ||
| return ResponseFormat::Passthrough; | ||
| }; | ||
| registry.lookup_by_names(qn.server_key(), qn.tool_name()) |
There was a problem hiding this comment.
🟡 Nit: The doc comment says this avoids the Arc<str> allocations of QualifiedToolName::new, but qualified_name_for_exposed already creates a QualifiedToolName (two Arc<str>), and then lookup_by_names creates a second one internally. You can halve the allocation by calling registry.lookup(&qn) directly:
| /// and the registry. Returns `Passthrough` for unknown tools. | |
| /// | |
| /// Lives next to `FormatRegistry` because it's a thin lookup helper that | |
| /// composes session's name map with `FormatRegistry::lookup_by_names`, | |
| /// avoiding the `Arc<str>` allocations of `QualifiedToolName::new`. | |
| pub fn lookup_tool_format( | |
| session: &smg_mcp::McpToolSession<'_>, | |
| registry: &FormatRegistry, | |
| exposed_name: &str, | |
| ) -> ResponseFormat { | |
| let Some(qn) = session.qualified_name_for_exposed(exposed_name) else { | |
| return ResponseFormat::Passthrough; | |
| }; | |
| registry.lookup_by_names(qn.server_key(), qn.tool_name()) | |
| /// Resolve an exposed tool name's `ResponseFormat` via the session's name map | |
| /// and the registry. Returns `Passthrough` for unknown tools. | |
| pub fn lookup_tool_format( | |
| session: &smg_mcp::McpToolSession<'_>, | |
| registry: &FormatRegistry, | |
| exposed_name: &str, | |
| ) -> ResponseFormat { | |
| let Some(qn) = session.qualified_name_for_exposed(exposed_name) else { | |
| return ResponseFormat::Passthrough; | |
| }; | |
| registry.lookup(&qn) | |
| } |
| /// Populate from a server config: per-tool overrides + builtin defaults. | ||
| /// Safe to call repeatedly — entries are overwritten. |
There was a problem hiding this comment.
🟡 Nit: "entries are overwritten" is only true for non-Passthrough formats. Because the loop skips Passthrough (line 68), re-calling this for a server whose tool changed from e.g. WebSearchCall → Passthrough would leave the stale WebSearchCall entry in the map.
In practice all servers are registered once so this is unlikely to bite, but the doc comment creates a stronger guarantee than the code delivers. Either document the caveat or add a self.formats.remove(...) for the Passthrough case so the "safe to call repeatedly" contract actually holds.
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/harmony/streaming.rs (1)
698-707:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDefault exposed MCP tools to
Passthroughwhen the registry is absent.If
sessionis present butformat_registryisNone, this returnsNoneand the tool gets emitted as a plainfunction_callinstead ofmcp_call/builtin events. The safe fallback here is still MCP semantics, i.e.Some(ResponseFormat::Passthrough), or makingsessionandformat_registryimpossible to desynchronize.Suggested fix
- let response_format = session.and_then(|s| { - if s.has_exposed_tool(tool_name) { - format_registry.map(|reg| { - openai_bridge::lookup_tool_format(s, reg, tool_name) - }) - } else { - None - } - }); + let response_format = session.and_then(|s| { + if !s.has_exposed_tool(tool_name) { + return None; + } + Some(format_registry.map_or(ResponseFormat::Passthrough, |reg| { + openai_bridge::lookup_tool_format(s, reg, tool_name) + })) + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/grpc/harmony/streaming.rs` around lines 698 - 707, The code that computes response_format (using session.and_then(...)) currently returns None when format_registry is None, causing exposed MCP tools to be emitted as plain function_call; change the logic in the response_format calculation so when session.is_some() and s.has_exposed_tool(tool_name) but format_registry is None you return Some(ResponseFormat::Passthrough) instead of None; specifically update the closure that calls openai_bridge::lookup_tool_format(...) to fall back to ResponseFormat::Passthrough when format_registry.map(...) is None (or branch on format_registry before calling lookup), referencing response_format, session, format_registry, openai_bridge::lookup_tool_format, has_exposed_tool, and ResponseFormat::Passthrough.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 81-85: The current check (has_explicit =
config.tools.as_ref().is_some_and(|tools| tools.contains_key(tool_name))) treats
any per-tool stanza as an override and disables builtin formatting even when the
stanza only sets alias/arg_mapping; fix this by detecting whether the per-tool
entry actually specifies a response_format. Update the logic in
format_registry.rs to look up the per-tool config (e.g., tools.get(tool_name))
and only consider it an explicit format override when its response_format field
is present/Some (or when a new boolean “response_format_present” flag is set
during deserialization), otherwise fall back to applying the builtin default;
ensure you reference the response_format/Passthrough semantics so tools with
only alias/arg_mapping still get the hosted-tool shape instead of falling back
to mcp_call.
- Around line 67-69: populate_from_server_config currently skips inserting when
ResponseFormat::Passthrough but doesn't remove any existing mapping, so a prior
mapping (e.g., WebSearchCall) persists; modify populate_from_server_config so
that when format == ResponseFormat::Passthrough you first remove/clear any
existing entries for that QualifiedToolName (and any alias keys it may have)
from the registry maps before continuing, ensuring downgrades from a concrete
format back to Passthrough actually remove the transformation entries.
In `@model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs`:
- Around line 95-99: The current serialization of entry.tool.annotations (in
build_mcp_tool_infos / tool_descriptors.rs) serializes the whole ToolAnnotations
struct and leaks internal fields; change it to only emit the read_only flag by
mapping annotations to a JSON object containing only {"read_only":
entry.tool.annotations.read_only} (or None if annotations is None) instead of
serde_json::to_value(entry.tool.annotations).ok(), so the wire shape matches the
previous OpenAI Responses API parity.
In `@model_gateway/src/routers/grpc/common/responses/streaming.rs`:
- Around line 359-364: The change to use
crate::routers::common::openai_bridge::build_mcp_tool_infos in
tool_entries_to_json widens the serialized ToolAnnotations (adding
destructive/idempotent/open_world) and breaks the expected wire shape; revert to
or emulate the previous behavior by ensuring tool_entries_to_json only emits the
minimal annotation payload ({"read_only": entry.annotations.read_only}) like the
original smg_mcp::build_mcp_tool_infos did—either call the original
smg_mcp::build_mcp_tool_infos or post-process the values returned by
openai_bridge::build_mcp_tool_infos to strip all annotation fields except
read_only before returning.
In `@model_gateway/src/routers/grpc/harmony/responses/common.rs`:
- Around line 330-344: The current mcp_has_image_generation check treats any MCP
tool whose format_lookup yields ResponseFormat::ImageGenerationCall as the
hosted image_generation, which is too broad; change the predicate in
strip_image_generation_from_request_tools (the mcp_has_image_generation
variable) to require both that format_registry.lookup(&entry.qualified_name) ==
ResponseFormat::ImageGenerationCall AND that entry.qualified_name matches the
session’s actual hosted image_generation exposure name (use the session API that
returns the exposed/hosted tool name or compare against the known hosted
qualified name), so only the real hosted `image_generation` exposure causes
ResponseTool::ImageGeneration to be stripped.
In `@model_gateway/src/routers/openai/mcp/tool_loop.rs`:
- Around line 959-966: prepare_hosted_dispatch_args(...) mutates the local
`arguments`, but the subsequent call to build_transformed_mcp_call_item(...) is
still serializing `call.arguments`, so the client-visible transformed item can
differ from what was dispatched; change the build_transformed_mcp_call_item
invocation(s) to use the post-merge `arguments` (the mutated variable) when
serializing/constructing the transformed item (rather than `call.arguments`),
keeping the same `response_format`, `original_body` inputs (tools/user) and
apply the same change to the other occurrence of build_transformed_mcp_call_item
in this file where hosted dispatch args are prepared.
In `@model_gateway/src/routers/openai/responses/non_streaming.rs`:
- Around line 65-69: The code currently uses
ctx.components.mcp_format_registry().cloned().unwrap_or_default(), which
silently falls back to FormatRegistry::default() and can cause wrong MCP format
resolution; change this to require the registry be present by removing
unwrap_or_default and returning an error or panicking when mcp_format_registry
is None (e.g., use expect or propagate a Result) so that mcp_format_registry is
mandatory; update the call site that sets let mcp_format_registry to use
ctx.components.mcp_format_registry().cloned().expect("mcp_format_registry is
required") or propagate an appropriate error so missing registry fails fast.
---
Outside diff comments:
In `@model_gateway/src/routers/grpc/harmony/streaming.rs`:
- Around line 698-707: The code that computes response_format (using
session.and_then(...)) currently returns None when format_registry is None,
causing exposed MCP tools to be emitted as plain function_call; change the logic
in the response_format calculation so when session.is_some() and
s.has_exposed_tool(tool_name) but format_registry is None you return
Some(ResponseFormat::Passthrough) instead of None; specifically update the
closure that calls openai_bridge::lookup_tool_format(...) to fall back to
ResponseFormat::Passthrough when format_registry.map(...) is None (or branch on
format_registry before calling lookup), referencing response_format, session,
format_registry, openai_bridge::lookup_tool_format, has_exposed_tool, and
ResponseFormat::Passthrough.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8691361e-d6ad-47b3-a08a-8837e7111084
📒 Files selected for processing (49)
crates/mcp/Cargo.tomlcrates/mcp/src/core/config.rscrates/mcp/src/core/mod.rscrates/mcp/src/core/orchestrator.rscrates/mcp/src/core/session.rscrates/mcp/src/inventory/mod.rscrates/mcp/src/inventory/types.rscrates/mcp/src/lib.rscrates/mcp/src/responses_bridge.rscrates/mcp/src/transform/mod.rse2e_test/responses/test_image_generation.pymodel_gateway/src/app_context.rsmodel_gateway/src/routers/anthropic/context.rsmodel_gateway/src/routers/anthropic/router.rsmodel_gateway/src/routers/common/mcp_utils.rsmodel_gateway/src/routers/common/mod.rsmodel_gateway/src/routers/common/openai_bridge/format_descriptor.rsmodel_gateway/src/routers/common/openai_bridge/format_registry.rsmodel_gateway/src/routers/common/openai_bridge/mod.rsmodel_gateway/src/routers/common/openai_bridge/overrides.rsmodel_gateway/src/routers/common/openai_bridge/response_format.rsmodel_gateway/src/routers/common/openai_bridge/tool_descriptors.rsmodel_gateway/src/routers/common/openai_bridge/transformer.rsmodel_gateway/src/routers/common/persistence_utils.rsmodel_gateway/src/routers/conversations/handlers.rsmodel_gateway/src/routers/gemini/context.rsmodel_gateway/src/routers/gemini/router.rsmodel_gateway/src/routers/grpc/common/responses/context.rsmodel_gateway/src/routers/grpc/common/responses/streaming.rsmodel_gateway/src/routers/grpc/common/responses/utils.rsmodel_gateway/src/routers/grpc/harmony/responses/common.rsmodel_gateway/src/routers/grpc/harmony/responses/execution.rsmodel_gateway/src/routers/grpc/harmony/responses/non_streaming.rsmodel_gateway/src/routers/grpc/harmony/responses/streaming.rsmodel_gateway/src/routers/grpc/harmony/streaming.rsmodel_gateway/src/routers/grpc/regular/responses/common.rsmodel_gateway/src/routers/grpc/regular/responses/handlers.rsmodel_gateway/src/routers/grpc/regular/responses/non_streaming.rsmodel_gateway/src/routers/grpc/regular/responses/streaming.rsmodel_gateway/src/routers/grpc/router.rsmodel_gateway/src/routers/openai/context.rsmodel_gateway/src/routers/openai/mcp/tool_loop.rsmodel_gateway/src/routers/openai/responses/non_streaming.rsmodel_gateway/src/routers/openai/responses/streaming.rsmodel_gateway/src/routers/openai/responses/utils.rsmodel_gateway/src/routers/openai/router.rsmodel_gateway/src/service_discovery.rsmodel_gateway/src/workflow/mcp_registration.rsmodel_gateway/tests/mcp_test.rs
💤 Files with no reviewable changes (3)
- crates/mcp/Cargo.toml
- crates/mcp/src/responses_bridge.rs
- crates/mcp/src/transform/mod.rs
| if format == ResponseFormat::Passthrough { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
populate_from_server_config is not actually repeat-safe when a format is removed.
The doc says repeated calls are safe, but this branch only skips insertion; it never clears an existing map entry. If a tool was previously registered as WebSearchCall and a later config changes it back to Passthrough, the old QualifiedToolName -> ResponseFormat mapping survives and the registry keeps transforming that tool. Clear the current direct/alias key(s) before continue so format downgrades take effect.
♻️ Minimal direction for the cleanup
for (tool_name, tool_config) in tools {
+ self.formats
+ .remove(&QualifiedToolName::new(&config.name, tool_name));
+ if let Some(alias) = &tool_config.alias {
+ self.formats
+ .remove(&QualifiedToolName::new(ALIAS_SERVER_KEY, alias));
+ }
+
let format: ResponseFormat = tool_config.response_format.into();
if format == ResponseFormat::Passthrough {
continue;
}
if let Some(alias) = &tool_config.alias {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs` around
lines 67 - 69, populate_from_server_config currently skips inserting when
ResponseFormat::Passthrough but doesn't remove any existing mapping, so a prior
mapping (e.g., WebSearchCall) persists; modify populate_from_server_config so
that when format == ResponseFormat::Passthrough you first remove/clear any
existing entries for that QualifiedToolName (and any alias keys it may have)
from the registry maps before continuing, ensuring downgrades from a concrete
format back to Passthrough actually remove the transformation entries.
| annotations: entry | ||
| .tool | ||
| .annotations | ||
| .as_ref() | ||
| .and_then(|a| serde_json::to_value(a).ok()), |
There was a problem hiding this comment.
Keep mcp_list_tools[].tools[].annotations narrowed to read_only.
Serializing ToolAnnotations wholesale changes the wire shape by exposing internal fields like destructive, idempotent, and open_world. The previous bridge intentionally emitted only read_only, and approval logic still reads the richer struct directly, so widening the JSON here is a compatibility regression.
Suggested fix
- annotations: entry
- .tool
- .annotations
- .as_ref()
- .and_then(|a| serde_json::to_value(a).ok()),
+ annotations: entry.tool.annotations.as_ref().map(|a| {
+ json!({
+ "read_only": a.read_only,
+ })
+ }),Based on learnings: build_mcp_tool_infos intentionally serializes annotations as only {"read_only": entry.annotations.read_only} to preserve OpenAI Responses API wire parity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs` around
lines 95 - 99, The current serialization of entry.tool.annotations (in
build_mcp_tool_infos / tool_descriptors.rs) serializes the whole ToolAnnotations
struct and leaks internal fields; change it to only emit the read_only flag by
mapping annotations to a JSON object containing only {"read_only":
entry.tool.annotations.read_only} (or None if annotations is None) instead of
serde_json::to_value(entry.tool.annotations).ok(), so the wire shape matches the
previous OpenAI Responses API parity.
| /// Convert tool entries to JSON values using the shared bridge builder. | ||
| fn tool_entries_to_json( | ||
| tools: &[mcp::ToolEntry], | ||
| ) -> Result<Vec<serde_json::Value>, serde_json::Error> { | ||
| mcp::build_mcp_tool_infos(tools) | ||
| crate::routers::common::openai_bridge::build_mcp_tool_infos(tools) | ||
| .into_iter() |
There was a problem hiding this comment.
Don't widen mcp_list_tools[].tools[].annotations on this path.
Switching tool_entries_to_json() to openai_bridge::build_mcp_tool_infos() changes the streaming wire shape because the shared builder currently serializes the full ToolAnnotations object. The previous smg_mcp::build_mcp_tool_infos() intentionally exposed only {"read_only": ...}, so this swap can reintroduce destructive / idempotent / open_world into response.mcp_list_tools.completed.
Based on learnings build_mcp_tool_infos intentionally serializes only {"read_only": entry.annotations.read_only} to match OpenAI Responses API wire parity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/grpc/common/responses/streaming.rs` around lines
359 - 364, The change to use
crate::routers::common::openai_bridge::build_mcp_tool_infos in
tool_entries_to_json widens the serialized ToolAnnotations (adding
destructive/idempotent/open_world) and breaks the expected wire shape; revert to
or emulate the previous behavior by ensuring tool_entries_to_json only emits the
minimal annotation payload ({"read_only": entry.annotations.read_only}) like the
original smg_mcp::build_mcp_tool_infos did—either call the original
smg_mcp::build_mcp_tool_infos or post-process the values returned by
openai_bridge::build_mcp_tool_infos to strip all annotation fields except
read_only before returning.
…s, hosted args Round 1 of PR #1429 review fixes (CodeRabbit, Codex, Claude, Gemini): - (P1) `transform_tool_output` now takes a pre-resolved `ResponseFormat` instead of re-deriving it from `(output.server_key, output.tool_name)`. After `McpToolSession::execute_tool_result` the `tool_name` is the *invoked/exposed* name, so the registry lookup missed for disambiguated names like `mcp_<server>_<tool>` and silently degraded to `Passthrough`. All four call sites already had a session-aware `lookup_tool_format` result — this drops the duplicate (broken) lookup. - (Major) Alias/arg_mapping-only stanzas no longer disable a builtin tool's hosted format. `ToolConfig.response_format` is now `Option<ResponseFormatConfig>`: `None` means "inherit context" (builtin → hosted format), `Some(Passthrough)` is the explicit opt-out that still suppresses the builtin default. Adds a regression test covering the alias-only-on-builtin case. - (Major) Hosted-tool dispatch arguments mismatch in `tool_loop.rs`: `build_transformed_mcp_call_item` now serializes the post-merge `effective_arguments`, so the client-visible item describes what the router actually dispatched (e.g. image-generation `size`/`quality` overrides) rather than the pre-merge string emitted by the model. - (Major) `strip_image_generation_from_request_tools` now keys on the literal exposed name `"image_generation"` routed to `ImageGenerationCall` rather than "any MCP tool with that format". A custom MCP tool with image-generation-shaped output (e.g. `thumbnailer`) no longer drops the real hosted tag. - (Major) Both `/v1/responses` non-streaming and streaming entrypoints now hard-fail with 500 when `mcp_format_registry` is missing, instead of silently falling back to `FormatRegistry::default()` and mis-routing MCP-backed tools. - (Nit) `lookup_tool_format` reuses the `QualifiedToolName` returned by `qualified_name_for_exposed` instead of rebuilding it via `lookup_by_names`, saving two `Arc<str>` allocations per call. Skipped review items: - "ToolAnnotations widening" (×2): false positive — `git show fbe2497` confirms the previous bridge already used the same `serde_json::to_value(annotations)` serialization. No wire-shape change. - "`server` helper should be `#[cfg(test)]`": false positive — it already lives inside `#[cfg(test)] mod tests {}`. - "Repeat-safe registry on Passthrough downgrade": no production caller mutates server config in place; documented the actual behavior in the doc comment instead of adding a `remove` path that would never fire. Signed-off-by: Simo Lin <[email protected]>
| if !has_explicit_format { | ||
| let format: ResponseFormat = builtin_type.response_format().into(); | ||
| self.insert(QualifiedToolName::new(&config.name, tool_name), format); | ||
| } |
There was a problem hiding this comment.
🔴 Important: Builtin default is stored under the wrong key when the tool has an alias, so the production lookup still misses.
collect_visible_mcp_tools (session.rs:565-571) replaces a direct entry with its alias entries. So when do_search has alias web_search, only web_search is exposed. The production lookup path is:
lookup_tool_format(session, registry, "web_search")session.qualified_name_for_exposed("web_search")→QualifiedToolName("alias", "web_search")registry.lookup(…)checks for("alias", "web_search")→ miss →Passthrough
But the builtin default was inserted as ("search", "do_search"), which is never queried in production for an aliased tool.
The per-tool loop (lines 84-88) already handles this correctly — when a tool has an alias, the format is stored under ("alias", alias_name). The builtin default path should mirror that:
| if !has_explicit_format { | |
| let format: ResponseFormat = builtin_type.response_format().into(); | |
| self.insert(QualifiedToolName::new(&config.name, tool_name), format); | |
| } | |
| if !has_explicit_format { | |
| let format: ResponseFormat = builtin_type.response_format().into(); | |
| // When the tool has an alias, store under the alias key to match | |
| // the QualifiedToolName shape that McpToolSession exposes. | |
| let alias = config | |
| .tools | |
| .as_ref() | |
| .and_then(|tools| tools.get(tool_name)) | |
| .and_then(|cfg| cfg.alias.as_ref()); | |
| if let Some(alias_name) = alias { | |
| self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias_name), format); | |
| } else { | |
| self.insert(QualifiedToolName::new(&config.name, tool_name), format); | |
| } | |
| } |
| assert_eq!( | ||
| r.lookup_by_names("search", "do_search"), | ||
| ResponseFormat::WebSearchCall, | ||
| "alias-only stanza must not disable the builtin's hosted format" | ||
| ); |
There was a problem hiding this comment.
🔴 Important: This test doesn't exercise the production lookup path and gives a false-positive.
lookup_by_names("search", "do_search") queries the (server, tool) pair directly. But in production, the session resolves alias "web_search" to QualifiedToolName("alias", "web_search") — because collect_visible_mcp_tools replaces direct entries with their aliases — and that key was never inserted into the registry.
To catch the bug flagged above, this assertion should additionally (or instead) check the alias key:
assert_eq!(
r.lookup_by_names("alias", "web_search"),
ResponseFormat::WebSearchCall,
"alias-only stanza must not disable the builtin's hosted format (alias lookup)"
);This assertion will fail with the current populate_from_server_config implementation, confirming the bug.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cafb8ddd35
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if !has_explicit_format { | ||
| let format: ResponseFormat = builtin_type.response_format().into(); | ||
| self.insert(QualifiedToolName::new(&config.name, tool_name), format); |
There was a problem hiding this comment.
Register builtin default format for aliased builtin tools
When a built-in server config sets tools.<builtin_tool_name>.alias but leaves response_format unset (None), this code inserts the default hosted format only for (<server>, <tool_name>). However, McpToolSession exposes and resolves the callable name through the alias entry (("alias", <alias_name>)), so lookup_tool_format(...) misses and falls back to Passthrough, emitting mcp_call instead of the hosted shape (e.g. web_search_call / image_generation_call). This regresses the documented “alias-only stanza preserves builtin default” behavior for real tool execution paths.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/harmony/responses/common.rs (1)
184-189:⚠️ Potential issue | 🟠 MajorDon't re-expose
mcp_list_toolsfor internal builtin servers.Switching this path to
openai_bridge::inject_client_visible_mcp_output_items(...)introduces a visibility regression. The bridge helper emitsmcp_list_toolsfor any binding that is notis_internal_non_builtin_server_label(...), which allows builtin-routed internal servers to become client-visible again. This breaks the established contract where all internal MCP plumbing (both builtin and non-builtin) stays hidden.Change the predicate in
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rsto useis_internal_server_labelinstead, so that only external servers emitmcp_list_toolsitems.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/grpc/harmony/responses/common.rs` around lines 184 - 189, The bridge helper inject_client_visible_mcp_output_items is currently allowing builtin-routed internal servers to become client-visible because tool_descriptors.rs uses is_internal_non_builtin_server_label in its predicate that decides whether to emit mcp_list_tools; update that predicate to use is_internal_server_label instead so only external servers emit mcp_list_tools. Locate the emission logic in model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs (the code path that decides to emit mcp_list_tools) and replace the is_internal_non_builtin_server_label check with is_internal_server_label, preserving surrounding logic and tests to ensure builtins remain hidden.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 84-102: When inserting the response format for a tool that has an
alias, also insert the same format under the canonical key so builtin dispatch
won't fall back to Passthrough; in the block that checks if let Some(alias) =
&tool_config.alias { ... } (and uses
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format)), also
check if tool_config.response_format.is_some() (or the resolved format is
explicit) and then call self.insert(QualifiedToolName::new(&config.name,
tool_name), format) so the canonical (server_name, tool_name) entry is
preserved; keep using QualifiedToolName::new and the same ResponseFormat value.
In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 815-816: The SSE payload currently emits the ad-hoc raw
`{arguments, output}`/`{error}` object instead of the transformed
ResponseOutputItem; change the code that serializes and emits the streamed
"done" event to use the value returned by openai_bridge::transform_tool_output
(the variable output_item used in state.record_call()) — serialize output_item
and emit that for output_item.done. Also ensure failure shapes for
hosted-builtin families (WebSearchCall, CodeInterpreterCall, FileSearchCall,
ImageGenerationCall) are represented by status: "failed" only (no error field)
while allowing McpCall to include an error field, so the serialized output_item
follows those family-specific rules.
---
Outside diff comments:
In `@model_gateway/src/routers/grpc/harmony/responses/common.rs`:
- Around line 184-189: The bridge helper inject_client_visible_mcp_output_items
is currently allowing builtin-routed internal servers to become client-visible
because tool_descriptors.rs uses is_internal_non_builtin_server_label in its
predicate that decides whether to emit mcp_list_tools; update that predicate to
use is_internal_server_label instead so only external servers emit
mcp_list_tools. Locate the emission logic in
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs (the code
path that decides to emit mcp_list_tools) and replace the
is_internal_non_builtin_server_label check with is_internal_server_label,
preserving surrounding logic and tests to ensure builtins remain hidden.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5f06c2f0-2c36-4062-bd62-26fc7b3df2e5
📒 Files selected for processing (13)
crates/mcp/src/core/config.rscrates/mcp/src/core/orchestrator.rsmodel_gateway/src/routers/common/mcp_utils.rsmodel_gateway/src/routers/common/openai_bridge/format_registry.rsmodel_gateway/src/routers/common/openai_bridge/transformer.rsmodel_gateway/src/routers/grpc/harmony/responses/common.rsmodel_gateway/src/routers/grpc/harmony/responses/execution.rsmodel_gateway/src/routers/grpc/regular/responses/non_streaming.rsmodel_gateway/src/routers/grpc/regular/responses/streaming.rsmodel_gateway/src/routers/openai/mcp/tool_loop.rsmodel_gateway/src/routers/openai/responses/non_streaming.rsmodel_gateway/src/routers/openai/responses/streaming.rsmodel_gateway/tests/mcp_test.rs
| if let Some(alias) = &tool_config.alias { | ||
| self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format); | ||
| } else { | ||
| self.insert(QualifiedToolName::new(&config.name, tool_name), format); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if let (Some(builtin_type), Some(tool_name)) = | ||
| (&config.builtin_type, &config.builtin_tool_name) | ||
| { | ||
| let has_explicit_format = config | ||
| .tools | ||
| .as_ref() | ||
| .and_then(|tools| tools.get(tool_name)) | ||
| .is_some_and(|cfg| cfg.response_format.is_some()); | ||
| if !has_explicit_format { | ||
| let format: ResponseFormat = builtin_type.response_format().into(); | ||
| self.insert(QualifiedToolName::new(&config.name, tool_name), format); |
There was a problem hiding this comment.
Preserve the explicit format on the canonical builtin key, too.
If tool_name is also config.builtin_tool_name and the stanza has both alias and a concrete response_format, this code only inserts ("alias", alias) and suppresses the builtin fallback. Builtin dispatch later resolves by (server_name, tool_name), so those calls degrade to Passthrough even though the config explicitly requested a hosted shape.
♻️ Minimal fix
if format == ResponseFormat::Passthrough {
continue;
}
if let Some(alias) = &tool_config.alias {
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format);
+ if config.builtin_tool_name.as_deref() == Some(tool_name.as_str()) {
+ self.insert(QualifiedToolName::new(&config.name, tool_name), format);
+ }
} else {
self.insert(QualifiedToolName::new(&config.name, tool_name), format);
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs` around
lines 84 - 102, When inserting the response format for a tool that has an alias,
also insert the same format under the canonical key so builtin dispatch won't
fall back to Passthrough; in the block that checks if let Some(alias) =
&tool_config.alias { ... } (and uses
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format)), also
check if tool_config.response_format.is_some() (or the resolved format is
explicit) and then call self.insert(QualifiedToolName::new(&config.name,
tool_name), format) so the canonical (server_name, tool_name) entry is
preserved; keep using QualifiedToolName::new and the same ResponseFormat value.
| let output_item = | ||
| openai_bridge::transform_tool_output(&tool_output, &response_format); |
There was a problem hiding this comment.
Use the transformed item for streamed output_item.done.
transform_tool_output() is only fed into state.record_call(). The SSE payload emitted just above still uses the ad-hoc raw {arguments, output} / {error} object, so builtin formats can stream a different shape than the final ResponseOutputItem, and failures on hosted builtin families leak an error field those variants do not support. Serialize the transformed item here and emit that instead.
Based on learnings, hosted-builtin families (WebSearchCall, CodeInterpreterCall, FileSearchCall, ImageGenerationCall) must convey failure via status: "failed" alone; only McpCall may carry an error field.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines
815 - 816, The SSE payload currently emits the ad-hoc raw `{arguments,
output}`/`{error}` object instead of the transformed ResponseOutputItem; change
the code that serializes and emits the streamed "done" event to use the value
returned by openai_bridge::transform_tool_output (the variable output_item used
in state.record_call()) — serialize output_item and emit that for
output_item.done. Also ensure failure shapes for hosted-builtin families
(WebSearchCall, CodeInterpreterCall, FileSearchCall, ImageGenerationCall) are
represented by status: "failed" only (no error field) while allowing McpCall to
include an error field, so the serialized output_item follows those
family-specific rules.
CI runs `cargo clippy --all-targets --all-features -- -D warnings` on this PR for the first time (prior runs were cancelled), surfacing three pre-existing lint classes: - `clippy::absolute_paths`: ~17 sites referenced `crate::routers::common::openai_bridge::…` inline. Add a `use` for the module in each affected file and shorten the call sites. - `clippy::trivially_copy_pass_by_ref`: `ResponseFormat` is a 1-byte `Copy` enum. Switch ~15 function signatures (and their callers) from `&ResponseFormat` to `ResponseFormat`. The same shift applies to the internal `ResponseTransformer::transform`, `transform_tool_output`, `build_transformed_mcp_call_item`, `non_streaming_tool_item_id_source`, `stable_streaming_tool_item_id`, `send_tool_call_*`, the gRPC emitter helpers, and the harmony streaming arg-classifier. - `clippy::clone_on_copy`: drop a `.clone()` on `Option<ResponseFormat>` in the harmony tool-call tracking insert, and reformulate three `if let Some(ref fmt) = response_format` bindings now that the inner format is `Copy` (using `if let Some(fmt) = *response_format` for the borrowed-tuple cases). No behavioral change — these are surface-level type / import edits to satisfy the workspace lint gate. Signed-off-by: Simo Lin <[email protected]>
The earlier import-shortening pass left several call sites at 5-line formatting that fmt collapses back to 1-2 lines. Apply cargo fmt to match the project style. Signed-off-by: Simo Lin <[email protected]>
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
model_gateway/src/routers/grpc/regular/responses/streaming.rs (1)
748-795:⚠️ Potential issue | 🟠 Major | ⚡ Quick winStream
output_item.donefrom the transformed item.The stream still emits ad-hoc
{arguments, output}/{error}objects here, then builds the realResponseOutputItemseparately below forstate.record_call(). For hosted formats that means SSE clients see a different shape than the final response, and builtin failures still leak anerrorfield those variants do not carry. Reuse the transformed item foroutput_item.doneso the streamed and persisted forms stay identical.Also applies to: 815-816
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines 748 - 795, The stream currently emits ad-hoc JSON objects for successful and failed tool results then separately builds the final ResponseOutputItem for state.record_call(), causing mismatched shapes; instead, transform the tool_output once into the final ResponseOutputItem structure (the same shape used for state.record_call()) and pass that transformed item to emitter.emit_output_item_done/output_item.done; update the branches where item_done is constructed (the mutable json! blocks around emitter.emit_output_item_done, using tool_output, output_str, response_format, attach_mcp_server_label) to reuse that transformed item for both emitter.send_event(&event, &tx)? and for state.record_call(), and remove the duplicate ad-hoc {arguments, output} / {error} objects so streamed and persisted forms are identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/transformer.rs`:
- Around line 21-33: transform_tool_output currently discards
ToolExecutionOutput's failure info and always builds a successful
ResponseOutputItem via ResponseTransformer::transform; update
transform_tool_output to detect output.is_error (and use output.error_message)
and propagate that into the resulting ResponseOutputItem (e.g., set status to
"failed" and populate error/message fields) instead of always returning a
completed/no-error item so failed MCP/hosted-tool executions are persisted as
failures. Locate transform_tool_output and the call to
ResponseTransformer::transform and either pass through an explicit error/status
override or post-process the returned ResponseOutputItem to set status and error
using output.is_error and output.error_message.
In `@model_gateway/src/routers/common/persistence_utils.rs`:
- Around line 298-303: item_to_new_conversation_item() currently extracts
non-message input items by reading the "content" field which is absent for
replayed tool-output items, causing them to be persisted as [] and lost later;
after compact_image_generation_outputs_json(...) is called on value, change the
persistence logic in item_to_new_conversation_item() to detect when "content" is
missing or when the item is a structured tool/input (e.g., contains
image_generation_call or hosted-tool keys) and instead serialize the compacted
whole JSON value (the same `value` passed into
openai_bridge::compact_image_generation_outputs_json) verbatim into the
conversation item field used for storage (rather than using content), so
replayed structured inputs persist intact across loads.
In `@model_gateway/src/routers/openai/responses/streaming.rs`:
- Around line 1092-1103: The code currently requires
ctx.components.mcp_format_registry() unconditionally, causing 500s for non‑MCP
streams; instead, only require and clone the MCP format registry when the
request will enter MCP interception (i.e., when original_body.tools.as_deref()
is Some). Move the mcp_format_registry lookup into the branch that checks
original_body.tools (the same branch that calls ensure_request_mcp_client), and
if tools exist but ctx.components.mcp_format_registry() is None return the
internal_error; otherwise let plain streaming paths continue to
handle_simple_streaming_passthrough() without needing the registry.
In `@model_gateway/tests/mcp_test.rs`:
- Around line 337-347: The test currently hardcodes call/server/tool/args when
calling ResponseTransformer::transform; replace that with the production helper
openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) (or
at minimum pass output.call_id, output.server_label, output.tool_name, and
output.arguments_str into ResponseTransformer::transform) so the assertions
exercise the actual ToolExecutionOutput produced by session.execute_tool();
update both the instance around lines ~337 and the similar block at ~422-429 to
use transform_tool_output or the output.* fields.
---
Duplicate comments:
In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 748-795: The stream currently emits ad-hoc JSON objects for
successful and failed tool results then separately builds the final
ResponseOutputItem for state.record_call(), causing mismatched shapes; instead,
transform the tool_output once into the final ResponseOutputItem structure (the
same shape used for state.record_call()) and pass that transformed item to
emitter.emit_output_item_done/output_item.done; update the branches where
item_done is constructed (the mutable json! blocks around
emitter.emit_output_item_done, using tool_output, output_str, response_format,
attach_mcp_server_label) to reuse that transformed item for both
emitter.send_event(&event, &tx)? and for state.record_call(), and remove the
duplicate ad-hoc {arguments, output} / {error} objects so streamed and persisted
forms are identical.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 87979d65-d087-4c6e-a968-b80299ab9481
📒 Files selected for processing (18)
model_gateway/src/routers/common/openai_bridge/transformer.rsmodel_gateway/src/routers/common/persistence_utils.rsmodel_gateway/src/routers/conversations/handlers.rsmodel_gateway/src/routers/grpc/common/responses/streaming.rsmodel_gateway/src/routers/grpc/common/responses/utils.rsmodel_gateway/src/routers/grpc/harmony/responses/common.rsmodel_gateway/src/routers/grpc/harmony/responses/execution.rsmodel_gateway/src/routers/grpc/harmony/streaming.rsmodel_gateway/src/routers/grpc/regular/responses/common.rsmodel_gateway/src/routers/grpc/regular/responses/non_streaming.rsmodel_gateway/src/routers/grpc/regular/responses/streaming.rsmodel_gateway/src/routers/openai/context.rsmodel_gateway/src/routers/openai/mcp/tool_loop.rsmodel_gateway/src/routers/openai/responses/non_streaming.rsmodel_gateway/src/routers/openai/responses/streaming.rsmodel_gateway/src/routers/openai/responses/utils.rsmodel_gateway/src/service_discovery.rsmodel_gateway/tests/mcp_test.rs
| pub fn transform_tool_output( | ||
| output: &smg_mcp::ToolExecutionOutput, | ||
| response_format: ResponseFormat, | ||
| ) -> ResponseOutputItem { | ||
| ResponseTransformer::transform( | ||
| &output.output, | ||
| response_format, | ||
| &output.call_id, | ||
| &output.server_label, | ||
| &output.tool_name, | ||
| &output.arguments_str, | ||
| ) | ||
| } |
There was a problem hiding this comment.
Preserve ToolExecutionOutput failure state here.
transform_tool_output() throws away is_error/error_message and always delegates to builders that stamp successful items (status = "completed" / error = None). The streaming caller still invokes this helper on error paths, so failed MCP or hosted-tool executions can be persisted and replayed as successes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/common/openai_bridge/transformer.rs` around lines
21 - 33, transform_tool_output currently discards ToolExecutionOutput's failure
info and always builds a successful ResponseOutputItem via
ResponseTransformer::transform; update transform_tool_output to detect
output.is_error (and use output.error_message) and propagate that into the
resulting ResponseOutputItem (e.g., set status to "failed" and populate
error/message fields) instead of always returning a completed/no-error item so
failed MCP/hosted-tool executions are persisted as failures. Locate
transform_tool_output and the call to ResponseTransformer::transform and either
pass through an explicit error/status override or post-process the returned
ResponseOutputItem to set status and error using output.is_error and
output.error_message.
| // Strip image_generation_call.result base64 from | ||
| // historical replayed items before persistence — | ||
| // a no-op for non-image item types. | ||
| openai_bridge::compact_image_generation_outputs_json( | ||
| std::slice::from_mut(&mut value), | ||
| ); |
There was a problem hiding this comment.
Persist replayed structured input items verbatim.
This now preserves replayed image_generation_call/hosted-tool inputs long enough to compact them, but item_to_new_conversation_item() still stores non-message input items by reading content. For replayed tool-output items that field is absent, so the linked conversation item becomes [] and the next history load loses the original item.
💡 Suggested fix
- let store_whole_item = if is_input {
- item_type == "function_call" || item_type == "function_call_output"
- } else {
- item_type != "message"
- };
+ let store_whole_item = item_type != "message";🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/common/persistence_utils.rs` around lines 298 -
303, item_to_new_conversation_item() currently extracts non-message input items
by reading the "content" field which is absent for replayed tool-output items,
causing them to be persisted as [] and lost later; after
compact_image_generation_outputs_json(...) is called on value, change the
persistence logic in item_to_new_conversation_item() to detect when "content" is
missing or when the item is a structured tool/input (e.g., contains
image_generation_call or hosted-tool keys) and instead serialize the compacted
whole JSON value (the same `value` passed into
openai_bridge::compact_image_generation_outputs_json) verbatim into the
conversation item field used for storage (rather than using content), so
replayed structured inputs persist intact across loads.
| // Same fail-fast contract as the non-streaming path: a missing format | ||
| // registry means MCP routing decisions would be silently wrong. | ||
| let mcp_format_registry = match ctx.components.mcp_format_registry() { | ||
| Some(r) => r.clone(), | ||
| None => { | ||
| return error::internal_error("internal_error", "MCP format registry required"); | ||
| } | ||
| }; | ||
|
|
||
| // Check for MCP tools and create request context if needed | ||
| let mcp_servers = if let Some(tools) = original_body.tools.as_deref() { | ||
| ensure_request_mcp_client(&mcp_orchestrator, tools).await | ||
| ensure_request_mcp_client(&mcp_orchestrator, &mcp_format_registry, tools).await |
There was a problem hiding this comment.
Only require the format registry for MCP-bound streams.
This fail-fast now runs before we know whether the request will even enter the MCP interception path. In contexts where components.mcp_format_registry() is unset, plain streaming requests that would otherwise use handle_simple_streaming_passthrough() now return 500 even though no MCP routing is needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/openai/responses/streaming.rs` around lines 1092 -
1103, The code currently requires ctx.components.mcp_format_registry()
unconditionally, causing 500s for non‑MCP streams; instead, only require and
clone the MCP format registry when the request will enter MCP interception
(i.e., when original_body.tools.as_deref() is Some). Move the
mcp_format_registry lookup into the branch that checks original_body.tools (the
same branch that calls ensure_request_mcp_client), and if tools exist but
ctx.components.mcp_format_registry() is None return the internal_error;
otherwise let plain streaming paths continue to
handle_simple_streaming_passthrough() without needing the registry.
| // The session returns the raw `output` Value from the MCP call. Re-transform | ||
| // with WebSearchCall format to verify serialization (end-to-end source | ||
| // extraction is covered by the gateway bridge's own tests). | ||
| let transformed = ResponseTransformer::transform( | ||
| &output.output, | ||
| ResponseFormat::WebSearchCall, | ||
| "test-request-openai-search", | ||
| "openai_search_server", | ||
| "brave_web_search", | ||
| "{\"query\":\"rust openai search\"}", | ||
| ); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Exercise the production bridge helper in these assertions.
These checks hardcode the call/server/tool/args instead of transforming the actual ToolExecutionOutput returned by session.execute_tool(). They can still pass if session-side rewriting regresses. Prefer openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) here, or at least feed output.call_id, output.server_label, output.tool_name, and output.arguments_str back into the transformer.
Also applies to: 422-429
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/tests/mcp_test.rs` around lines 337 - 347, The test currently
hardcodes call/server/tool/args when calling ResponseTransformer::transform;
replace that with the production helper
openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) (or
at minimum pass output.call_id, output.server_label, output.tool_name, and
output.arguments_str into ResponseTransformer::transform) so the assertions
exercise the actual ToolExecutionOutput produced by session.execute_tool();
update both the instance around lines ~337 and the similar block at ~422-429 to
use transform_tool_output or the output.* fields.
…tself The previous assertion (\"base64 not anywhere in the GET payload\") was over-specified. On the gRPC SGLang lane the model (gpt-oss-20b) sees the base64 in its function_call_output and legitimately echoes it back inside its assistant reply text (\"Base64 PNG data: iVBORw...\"). That message text is then persisted as a normal message item and trips the assertion even though the compactor is doing exactly what it was wired to do — the stored image_generation_call item has no \`result\` field. Narrow the assertion to the structured field where compactor failure actually surfaces: the persisted image_generation_call must carry no top-level or nested \`result\`, and the base64 must not appear inside the serialized image_generation_call item. Free-form assistant text is deliberately not policed here — that is a separate concern (model behavior on the upstream side) and the gateway has nothing to compact. Signed-off-by: Simo Lin <[email protected]>
Signed-off-by: Simo Lin <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d844414070
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if format == ResponseFormat::Passthrough { | ||
| continue; |
There was a problem hiding this comment.
Remove cached format on passthrough reconfiguration
When populate_from_server_config is called again for an already-registered server, Some(Passthrough) is treated as a no-op (continue) and the previous non-passthrough mapping is left in formats. Because ConnectMcpServerStep repopulates this shared registry on each RegisterMcpServer run, changing a tool from web_search_call/etc. back to passthrough will keep transforming outputs as the old hosted type. This creates incorrect response item types after MCP server config updates unless stale keys are explicitly deleted.
Useful? React with 👍 / 👎.
…rror state, wire shape) (#1442) Signed-off-by: Simo Lin <[email protected]>
Summary
The
smg-mcpcrate had grown into a mix of MCP protocol code and OpenAI-flavored request/response/event transformation, forcing dual edits across the crate boundary on every router-side change. This PR moves all OpenAI-shaped concerns into a newmodel_gateway::routers::common::openai_bridgemodule, leavingsmg-mcpfocused on the MCP protocol surface.openai_bridgemodule:response_format,format_registry(DashMap side-map keyed byQualifiedToolName),format_descriptor(per-format dispatch table),tool_descriptors,transformer,overrides.smg-mcpdrops itsopenai-protocoldependency entirely;ResponseFormat,ToolEntry::response_format, in-session tool/inventory JSON builders, and the alias-routing 3-tuple are removed. Aliases now live under a literalALIAS_SERVER_KEYprefix that the bridge looks up.smg-mcp— routers consume the bridge directly.POST /v1/conversations/{id}/itemshandler, and the request-input persistence path so multi-MB base64 payloads are stripped before storage on every ingress;build_stored_responseconsumes the compacted JSON to avoid a second full-tree clone.Net diff: +1437 / -1778 across 49 files.
smg-mcpshrinks by ~3k LOC.Design
Full design doc with grounded reasoning at
.claude/plans/mcp-refactor.md.Test plan
cargo build -p smg— cleancargo test -p smg --lib openai_bridge::— 45 pass (5 new format_registry tests, 3 compactor tests)cargo test -p smg --test mcp_test— 23 integration tests pass againstsession.execute_toolSummary by CodeRabbit
Chores
New Features
Tests