
refactor(mcp): extract OpenAI-flavored bridge into router-side module #1429

Open

slin1237 wants to merge 6 commits into main from refactor/extract-openai-bridge

Conversation

@slin1237 (Collaborator) commented May 1, 2026

Summary

The smg-mcp crate had grown into a mix of MCP protocol code and OpenAI-flavored request/response/event transformation, forcing dual edits across the crate boundary on every router-side change. This PR moves all OpenAI-shaped concerns into a new model_gateway::routers::common::openai_bridge module, leaving smg-mcp focused on the MCP protocol surface.

  • New openai_bridge module: response_format, format_registry (DashMap side-map keyed by QualifiedToolName), format_descriptor (per-format dispatch table), tool_descriptors, transformer, overrides.
  • smg-mcp drops its openai-protocol dependency entirely; ResponseFormat, ToolEntry::response_format, in-session tool/inventory JSON builders, and the alias-routing 3-tuple are removed. Aliases now live under a literal ALIAS_SERVER_KEY prefix that the bridge looks up.
  • Adding a new built-in hosted tool no longer requires a parallel patch in smg-mcp — routers consume the bridge directly.
  • Image-generation result compactor is wired into the persistence path, the POST /v1/conversations/{id}/items handler, and the request-input persistence path so multi-MB base64 payloads are stripped before storage on every ingress; build_stored_response consumes the compacted JSON to avoid a second full-tree clone.

Net diff: +1437 / -1778 across 49 files. smg-mcp shrinks by ~3k LOC.
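The compactor's core move — never letting multi-MB base64 blobs reach storage — can be sketched minimally. The function name and threshold below are illustrative only, not the actual API; the real compact_image_generation_outputs_json walks the serialized response JSON in place:

```rust
/// Illustrative threshold; the real compactor targets multi-MB base64 blobs
/// inside image_generation_call result items. (Hypothetical name/constant.)
const MAX_INLINE_B64: usize = 1024;

/// Replace an oversized base64 payload with a short placeholder so it is
/// never persisted. Returns true if the payload was compacted.
fn compact_image_payload(b64: &mut String) -> bool {
    if b64.len() > MAX_INLINE_B64 {
        *b64 = format!("[{} base64 bytes elided before storage]", b64.len());
        true
    } else {
        false
    }
}

fn main() {
    let mut small = "iVBORw0KGgo=".to_string();
    let mut large = "A".repeat(5 * 1024 * 1024); // simulate a multi-MB result
    assert!(!compact_image_payload(&mut small)); // small payloads pass through
    assert!(compact_image_payload(&mut large));
    assert!(large.len() < 64); // what gets stored is tiny
}
```

Compacting at every ingress point (persistence path, items handler, request-input path) rather than a single chokepoint is what lets build_stored_response reuse the already-compacted JSON instead of cloning the tree a second time.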

Design

Full design doc with grounded reasoning at .claude/plans/mcp-refactor.md.

Test plan

  • cargo build -p smg — clean
  • cargo test -p smg --lib openai_bridge:: — 45 pass (5 new format_registry tests, 3 compactor tests)
  • cargo test -p smg --test mcp_test — 23 integration tests pass against session.execute_tool
  • e2e image-generation compactor test exercises persistence with an explicit conversation
  • CI green

Summary by CodeRabbit

  • Chores

    • Reduced legacy response-format plumbing and public surface; simplified session/orchestrator APIs and inventory shapes; added a stable alias server key constant.
  • New Features

    • Added a shared OpenAI bridge with a FormatRegistry and descriptors for centralized per-tool format lookup.
    • Added bridge helpers to build tool descriptors and to transform/compact tool outputs.
  • Tests

    • Updated unit and e2e tests; tightened image-generation persistence checks and compaction.

@github-actions Bot added labels on May 1, 2026: dependencies, grpc, mcp, tests, model-gateway, openai, anthropic, gemini

coderabbitai Bot commented May 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Moves OpenAI/protocol response-format responsibilities out of the MCP crate and into a new model_gateway openai_bridge. MCP no longer stores or transforms response formats; model_gateway now owns FormatRegistry, tool-descriptor builders, transformation/compaction, and threads the registry through routers, persistence, and workflow.

Changes

MCP → Gateway format refactor

  • Data shape / config — crates/mcp/src/core/config.rs, crates/mcp/src/inventory/types.rs: ToolConfig.response_format is now Option<ResponseFormatConfig>; removed ToolEntry.response_format; added pub const ALIAS_SERVER_KEY.
  • MCP core API removal — crates/mcp/src/core/orchestrator.rs, crates/mcp/src/core/mod.rs, crates/mcp/src/core/session.rs: deleted response-format transformation plumbing and helpers (removed ToolCallResult, ResponseTransformer usage, response_format fields, and related session APIs); added set_approval_mode and qualified_name_for_exposed.
  • Crate surface cleanup — crates/mcp/src/lib.rs, crates/mcp/src/responses_bridge.rs, crates/mcp/src/transform/mod.rs: removed the responses_bridge and transform modules and pruned OpenAI-protocol re-exports from the crate root.
  • Gateway: new openai_bridge — model_gateway/src/routers/common/openai_bridge/*: added FormatDescriptor, FormatRegistry, the ResponseFormat model (Copy), tool descriptor builders, hosted-tool overrides, the transformer (transform_tool_output) and JSON compaction helpers, plus re-exports (mod entry).
  • Registry population & descriptor mapping — model_gateway/src/routers/common/openai_bridge/format_registry.rs, .../format_descriptor.rs: added registry storage/lookup, populate_from_server_config rules (alias handling, builtin defaults, explicit Passthrough), and a compile-time descriptor(format) mapping.
  • Gateway wiring / DI — model_gateway/src/app_context.rs, model_gateway/src/routers/*/context.rs, model_gateway/src/routers/*/router.rs: added mcp_format_registry: FormatRegistry to AppContext and router contexts; injected/cloned the registry into router initialization and ResponsesContext construction.
  • MCP connection & utils — model_gateway/src/routers/common/mcp_utils.rs, model_gateway/src/routers/common/mod.rs: connect/collect/ensure functions now accept and populate a FormatRegistry; registered the openai_bridge module in routers/common.
  • Tool descriptors & visibility — model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs, model_gateway/src/routers/openai/responses/utils.rs: moved function/chat/response tool builders and client-visible MCP injection/hiding helpers into openai_bridge; the gateway calls these instead of MCP session methods.
  • Transformation & persistence compaction — model_gateway/src/routers/common/openai_bridge/transformer.rs, model_gateway/src/routers/common/persistence_utils.rs, model_gateway/src/routers/conversations/handlers.rs: added transform_tool_output and compact_image_generation_outputs_json; build_stored_response consumes and compacts JSON in place; conversation parsing compacts non-message items.
  • Tool loop & responses plumbing — model_gateway/src/routers/openai/mcp/tool_loop.rs, model_gateway/src/routers/*/responses/*, model_gateway/src/routers/grpc/*/responses/*: replaced session-owned format lookups/transform calls with openai_bridge APIs; threaded the FormatRegistry through ensure/strip/execute/emit flows; updated many signatures to use a value-based ResponseFormat.
  • Tests, workflow & misc — model_gateway/src/workflow/mcp_registration.rs, model_gateway/src/service_discovery.rs, e2e_test/responses/test_image_generation.py, model_gateway/tests/mcp_test.rs, crates/mcp/Cargo.toml: populate the registry from connected server config; initialize the registry in test contexts; tighten e2e image-compaction assertions; refactor MCP tests to session-based execution; remove the openai-protocol workspace dep from the MCP Cargo.toml.
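Conceptually, the new registry is a side-map from qualified tool names to response formats, with Passthrough as the implicit default for unknown tools. A minimal std-only sketch of that contract (the real module uses a concurrent DashMap keyed by QualifiedToolName; the type names and enum variants below are simplified stand-ins, not the actual API):

```rust
use std::collections::HashMap;

// Simplified stand-in for the bridge's ResponseFormat model (Copy in the PR).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
    ImageGenerationCall,
}

/// Side-map keyed by (server_key, tool_name). The real FormatRegistry is a
/// DashMap keyed by QualifiedToolName so routers can share it concurrently.
#[derive(Default)]
struct FormatRegistry {
    formats: HashMap<(String, String), ResponseFormat>,
}

impl FormatRegistry {
    /// Register a tool's format. Passthrough is never stored, so lookup
    /// misses fall back to it naturally.
    fn register(&mut self, server: &str, tool: &str, format: ResponseFormat) {
        if format == ResponseFormat::Passthrough {
            return;
        }
        self.formats
            .insert((server.to_string(), tool.to_string()), format);
    }

    /// Unknown (server, tool) pairs resolve to Passthrough (plain mcp_call).
    fn lookup_by_names(&self, server: &str, tool: &str) -> ResponseFormat {
        self.formats
            .get(&(server.to_string(), tool.to_string()))
            .copied()
            .unwrap_or(ResponseFormat::Passthrough)
    }
}

fn main() {
    let mut reg = FormatRegistry::default();
    reg.register("builtin", "web_search", ResponseFormat::WebSearchCall);
    reg.register("builtin", "image_generation", ResponseFormat::ImageGenerationCall);
    assert_eq!(
        reg.lookup_by_names("builtin", "web_search"),
        ResponseFormat::WebSearchCall
    );
    // A custom MCP tool with no entry is emitted as a passthrough mcp_call.
    assert_eq!(
        reg.lookup_by_names("user_srv", "my_tool"),
        ResponseFormat::Passthrough
    );
}
```

Defaulting misses to Passthrough keeps the lookup infallible, which is why the review discussion below focuses on when entries are (or are not) removed.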

Sequence Diagram(s)

sequenceDiagram
    participant Router as ModelGateway Router
    participant Bridge as openai_bridge::FormatRegistry
    participant Session as McpToolSession
    participant Orch as McpOrchestrator
    participant Tool as External Tool Server
    participant Store as Persistence

    Router->>Bridge: lookup_tool_format(server_key, tool_name)
    Bridge-->>Router: ResponseFormat
    Router->>Session: prepare ToolExecutionInput (with ResponseFormat)
    Session->>Orch: execute tool call (arguments)
    Orch->>Tool: dispatch hosted call
    Tool-->>Orch: raw JSON result
    Orch-->>Session: ToolExecutionOutput (raw)
    Router->>Bridge: transform_tool_output(output, ResponseFormat)
    Bridge-->>Router: ResponseOutputItem
    Router->>Store: compact_image_generation_outputs_json + persist

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • CatherineSue
  • key4ng
  • zhaowenzi

Poem

🐇 I hopped from core to gateway's ridge,

Formats bundled tidy in a bridge.
Old transforms tucked out of sight,
Routers call, the registry lights.
A rabbit cheers the code's new flight.

🚥 Pre-merge checks: ✅ 5 passed

  • Description check — ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed. The title accurately and concisely summarizes the main change: extracting OpenAI-specific functionality into a new router-side bridge module, the primary architectural change across the 49 modified files.
  • Docstring coverage — ✅ Passed. Docstring coverage is 100.00%; the required threshold is 80.00%.
  • Linked issues check — ✅ Passed. Check skipped: no linked issues were found for this pull request.
  • Out-of-scope changes check — ✅ Passed. Check skipped: no linked issues were found for this pull request.


The smg-mcp crate had grown into a mix of MCP protocol code and
OpenAI-flavored request/response/event transformation, forcing dual edits
across the crate boundary on every router-side change. This PR moves all
OpenAI-shaped concerns (response-format dispatch, tool descriptors,
hosted-tool overrides, mcp_call/web_search_call/code_interpreter_call/
file_search_call/image_generation_call transformation, and the format
side-map keyed off the inventory) into
`model_gateway::routers::common::openai_bridge`, leaving smg-mcp focused
on the MCP protocol surface.

Highlights:
- New `openai_bridge` module: `response_format`, `format_registry`
  (DashMap side-map keyed by `QualifiedToolName`), `format_descriptor`
  (per-format dispatch table), `tool_descriptors`, `transformer`,
  `overrides`.
- smg-mcp drops its `openai-protocol` dependency entirely.
- `ResponseFormat`, `ToolEntry::response_format`, the in-session tool/
  inventory JSON builders, and the alias-routing 3-tuple are removed
  from smg-mcp; aliases now live under a literal `ALIAS_SERVER_KEY`
  prefix that the bridge looks up.
- Routers consume the bridge directly, so adding a new built-in
  hosted tool no longer requires a parallel patch in smg-mcp.
- Image-generation result compactor is wired into the persistence
  path, the POST /v1/conversations/{id}/items handler, and the
  request-input persistence path so multi-MB base64 payloads are
  stripped before storage on every ingress; `build_stored_response`
  now consumes the compacted JSON to avoid a second full-tree clone.
- MCP integration tests rewritten against `session.execute_tool`.
- e2e image-generation compactor test now creates an explicit
  conversation so it actually exercises persistence.

Net diff: +708 / -3645 across 45 files (smg-mcp shrinks by ~3k LOC).

Signed-off-by: Simo Lin <[email protected]>
@slin1237 slin1237 force-pushed the refactor/extract-openai-bridge branch from f2d6026 to a2212fc Compare May 1, 2026 19:11
@lightseekorg lightseekorg deleted a comment from mergify Bot May 1, 2026
@gemini-code-assist Bot (Contributor) left a comment


Code Review

This pull request decouples the smg_mcp crate from the OpenAI protocol by moving response transformation logic, tool descriptors, and format registries into a new openai_bridge module within the model_gateway. It also implements image generation output compaction to optimize storage usage. Feedback suggests marking a test helper function in format_registry.rs with #[cfg(test)] to prevent it from being compiled into production code and optimizing inject_client_visible_mcp_output_items to reduce Vec allocations in a performance-sensitive path.

Comment on lines +103 to +118
```rust
fn server(name: &str) -> McpServerConfig {
    McpServerConfig {
        name: name.to_string(),
        transport: McpTransport::Streamable {
            url: "http://x".to_string(),
            token: None,
            headers: HashMap::new(),
        },
        proxy: None,
        required: false,
        tools: None,
        builtin_type: None,
        builtin_tool_name: None,
        internal: false,
    }
}
```
Severity: medium

The server helper function is used only in tests. It should be marked with #[cfg(test)] to avoid being compiled into production code.

Comment on lines +145 to +156
```rust
let existing = std::mem::take(output);
let servers = session.mcp_servers();
output.reserve(servers.len() + tool_call_items.len() + existing.len());

for binding in servers {
    if !session.is_internal_non_builtin_server_label(&binding.label) {
        output.push(mcp_list_tools_item(
            session,
            &binding.label,
            &binding.server_key,
        ));
    }
```
Severity: medium

The inject_client_visible_mcp_output_items function performs multiple Vec allocations and std::mem::take operations. Given this is a hot path for response processing, consider using a single pre-allocated Vec to improve performance.
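The single-reservation shape the reviewer suggests can be sketched as follows (item types reduced to strings for brevity; the real function builds mcp_list_tools and tool-call output items and filters internal servers):

```rust
/// Prepend list-tools and tool-call items ahead of the existing output with
/// a single reservation, avoiding per-step intermediate Vec allocations.
fn inject_items(output: &mut Vec<String>, list_items: Vec<String>, call_items: Vec<String>) {
    let existing = std::mem::take(output); // leaves `output` empty, no copy
    output.reserve(list_items.len() + call_items.len() + existing.len());
    output.extend(list_items);
    output.extend(call_items);
    output.extend(existing);
}

fn main() {
    let mut out = vec!["assistant_message".to_string()];
    inject_items(
        &mut out,
        vec!["mcp_list_tools".to_string()],
        vec!["mcp_call".to_string()],
    );
    assert_eq!(out, ["mcp_list_tools", "mcp_call", "assistant_message"]);
}
```

std::mem::take itself is cheap (it swaps in an empty Vec without copying the buffer); the win comes from the single reserve covering all three extends.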

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f2d60264d0


```rust
    output: &smg_mcp::ToolExecutionOutput,
    registry: &FormatRegistry,
) -> ResponseOutputItem {
    let response_format = registry.lookup_by_names(&output.server_key, &output.tool_name);
```

P1: Resolve tool format from qualified name before transforming output

transform_tool_output looks up the format with (output.server_key, output.tool_name), but McpToolSession::execute_tool_result rewrites output.tool_name to the invoked/exposed name (crates/mcp/src/core/session.rs, output.tool_name = invoked_name). When exposed names differ from the underlying tool name (for example duplicate names that get disambiguated, like mcp_<server>_<tool>), this lookup misses and falls back to Passthrough, so hosted-tool results are emitted as mcp_call instead of web_search_call/image_generation_call etc. This breaks response typing for those valid multi-server configurations.

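The failure mode described above can be reproduced in miniature: the registry is keyed by the underlying (server_key, tool_name) pair, so a lookup that uses the rewritten exposed name misses and silently degrades to passthrough. All names below are hypothetical and formats are strings for brevity:

```rust
use std::collections::HashMap;

/// Look up a format by (server_key, tool_name); a miss falls back to the
/// passthrough mcp_call shape — exactly the silent degradation at issue.
fn resolve(formats: &HashMap<(String, String), String>, server: &str, name: &str) -> String {
    formats
        .get(&(server.to_string(), name.to_string()))
        .cloned()
        .unwrap_or_else(|| "passthrough (mcp_call)".to_string())
}

/// Registry populated from config under the underlying tool name.
fn demo_registry() -> HashMap<(String, String), String> {
    let mut formats = HashMap::new();
    formats.insert(
        ("search_srv".to_string(), "search".to_string()),
        "web_search_call".to_string(),
    );
    formats
}

fn main() {
    let formats = demo_registry();
    // The session disambiguates duplicate tool names into an exposed name
    // (e.g. mcp_<server>_<tool>) and writes it back into output.tool_name,
    // so a post-execution lookup with that name uses the wrong key:
    assert_eq!(
        resolve(&formats, "search_srv", "mcp_search_srv_search"),
        "passthrough (mcp_call)"
    );
    // Mapping the exposed name back to its qualified name first hits:
    assert_eq!(resolve(&formats, "search_srv", "search"), "web_search_call");
}
```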

Comment on lines +12 to +25
```rust
/// and the registry. Returns `Passthrough` for unknown tools.
///
/// Lives next to `FormatRegistry` because it's a thin lookup helper that
/// composes session's name map with `FormatRegistry::lookup_by_names`,
/// avoiding the `Arc<str>` allocations of `QualifiedToolName::new`.
pub fn lookup_tool_format(
    session: &smg_mcp::McpToolSession<'_>,
    registry: &FormatRegistry,
    exposed_name: &str,
) -> ResponseFormat {
    let Some(qn) = session.qualified_name_for_exposed(exposed_name) else {
        return ResponseFormat::Passthrough;
    };
    registry.lookup_by_names(qn.server_key(), qn.tool_name())
```

🟡 Nit: The doc comment says this avoids the Arc<str> allocations of QualifiedToolName::new, but qualified_name_for_exposed already creates a QualifiedToolName (two Arc<str>), and then lookup_by_names creates a second one internally. You can halve the allocation by calling registry.lookup(&qn) directly:

Suggested change

```diff
-/// and the registry. Returns `Passthrough` for unknown tools.
-///
-/// Lives next to `FormatRegistry` because it's a thin lookup helper that
-/// composes session's name map with `FormatRegistry::lookup_by_names`,
-/// avoiding the `Arc<str>` allocations of `QualifiedToolName::new`.
-pub fn lookup_tool_format(
-    session: &smg_mcp::McpToolSession<'_>,
-    registry: &FormatRegistry,
-    exposed_name: &str,
-) -> ResponseFormat {
-    let Some(qn) = session.qualified_name_for_exposed(exposed_name) else {
-        return ResponseFormat::Passthrough;
-    };
-    registry.lookup_by_names(qn.server_key(), qn.tool_name())
+/// Resolve an exposed tool name's `ResponseFormat` via the session's name map
+/// and the registry. Returns `Passthrough` for unknown tools.
+pub fn lookup_tool_format(
+    session: &smg_mcp::McpToolSession<'_>,
+    registry: &FormatRegistry,
+    exposed_name: &str,
+) -> ResponseFormat {
+    let Some(qn) = session.qualified_name_for_exposed(exposed_name) else {
+        return ResponseFormat::Passthrough;
+    };
+    registry.lookup(&qn)
+}
```

Comment on lines +53 to +54
```rust
/// Populate from a server config: per-tool overrides + builtin defaults.
/// Safe to call repeatedly — entries are overwritten.
```

🟡 Nit: "entries are overwritten" is only true for non-Passthrough formats. Because the loop skips Passthrough (line 68), re-calling this for a server whose tool changed from e.g. WebSearchCall to Passthrough would leave the stale WebSearchCall entry in the map.

In practice all servers are registered once, so this is unlikely to bite, but the doc comment promises a stronger guarantee than the code delivers. Either document the caveat or add a self.formats.remove(...) for the Passthrough case so the "safe to call repeatedly" contract actually holds.

@coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/harmony/streaming.rs (1)

698-707: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Default exposed MCP tools to Passthrough when the registry is absent.

If session is present but format_registry is None, this returns None and the tool gets emitted as a plain function_call instead of mcp_call/builtin events. The safe fallback here is still MCP semantics, i.e. Some(ResponseFormat::Passthrough), or making session and format_registry impossible to desynchronize.

Suggested fix

```diff
-                                let response_format = session.and_then(|s| {
-                                    if s.has_exposed_tool(tool_name) {
-                                        format_registry.map(|reg| {
-                                            openai_bridge::lookup_tool_format(s, reg, tool_name)
-                                        })
-                                    } else {
-                                        None
-                                    }
-                                });
+                                let response_format = session.and_then(|s| {
+                                    if !s.has_exposed_tool(tool_name) {
+                                        return None;
+                                    }
+                                    Some(format_registry.map_or(ResponseFormat::Passthrough, |reg| {
+                                        openai_bridge::lookup_tool_format(s, reg, tool_name)
+                                    }))
+                                });
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 81-85: The current check (has_explicit =
config.tools.as_ref().is_some_and(|tools| tools.contains_key(tool_name))) treats
any per-tool stanza as an override and disables builtin formatting even when the
stanza only sets alias/arg_mapping; fix this by detecting whether the per-tool
entry actually specifies a response_format. Update the logic in
format_registry.rs to look up the per-tool config (e.g., tools.get(tool_name))
and only consider it an explicit format override when its response_format field
is present/Some (or when a new boolean “response_format_present” flag is set
during deserialization), otherwise fall back to applying the builtin default;
ensure you reference the response_format/Passthrough semantics so tools with
only alias/arg_mapping still get the hosted-tool shape instead of falling back
to mcp_call.
- Around line 67-69: populate_from_server_config currently skips inserting when
ResponseFormat::Passthrough but doesn't remove any existing mapping, so a prior
mapping (e.g., WebSearchCall) persists; modify populate_from_server_config so
that when format == ResponseFormat::Passthrough you first remove/clear any
existing entries for that QualifiedToolName (and any alias keys it may have)
from the registry maps before continuing, ensuring downgrades from a concrete
format back to Passthrough actually remove the transformation entries.

In `@model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs`:
- Around line 95-99: The current serialization of entry.tool.annotations (in
build_mcp_tool_infos / tool_descriptors.rs) serializes the whole ToolAnnotations
struct and leaks internal fields; change it to only emit the read_only flag by
mapping annotations to a JSON object containing only {"read_only":
entry.tool.annotations.read_only} (or None if annotations is None) instead of
serde_json::to_value(entry.tool.annotations).ok(), so the wire shape matches the
previous OpenAI Responses API parity.

In `@model_gateway/src/routers/grpc/common/responses/streaming.rs`:
- Around line 359-364: The change to use
crate::routers::common::openai_bridge::build_mcp_tool_infos in
tool_entries_to_json widens the serialized ToolAnnotations (adding
destructive/idempotent/open_world) and breaks the expected wire shape; revert to
or emulate the previous behavior by ensuring tool_entries_to_json only emits the
minimal annotation payload ({"read_only": entry.annotations.read_only}) like the
original smg_mcp::build_mcp_tool_infos did—either call the original
smg_mcp::build_mcp_tool_infos or post-process the values returned by
openai_bridge::build_mcp_tool_infos to strip all annotation fields except
read_only before returning.

In `@model_gateway/src/routers/grpc/harmony/responses/common.rs`:
- Around line 330-344: The current mcp_has_image_generation check treats any MCP
tool whose format_lookup yields ResponseFormat::ImageGenerationCall as the
hosted image_generation, which is too broad; change the predicate in
strip_image_generation_from_request_tools (the mcp_has_image_generation
variable) to require both that format_registry.lookup(&entry.qualified_name) ==
ResponseFormat::ImageGenerationCall AND that entry.qualified_name matches the
session’s actual hosted image_generation exposure name (use the session API that
returns the exposed/hosted tool name or compare against the known hosted
qualified name), so only the real hosted `image_generation` exposure causes
ResponseTool::ImageGeneration to be stripped.

In `@model_gateway/src/routers/openai/mcp/tool_loop.rs`:
- Around line 959-966: prepare_hosted_dispatch_args(...) mutates the local
`arguments`, but the subsequent call to build_transformed_mcp_call_item(...) is
still serializing `call.arguments`, so the client-visible transformed item can
differ from what was dispatched; change the build_transformed_mcp_call_item
invocation(s) to use the post-merge `arguments` (the mutated variable) when
serializing/constructing the transformed item (rather than `call.arguments`),
keeping the same `response_format`, `original_body` inputs (tools/user) and
apply the same change to the other occurrence of build_transformed_mcp_call_item
in this file where hosted dispatch args are prepared.

In `@model_gateway/src/routers/openai/responses/non_streaming.rs`:
- Around line 65-69: The code currently uses
ctx.components.mcp_format_registry().cloned().unwrap_or_default(), which
silently falls back to FormatRegistry::default() and can cause wrong MCP format
resolution; change this to require the registry be present by removing
unwrap_or_default and returning an error or panicking when mcp_format_registry
is None (e.g., use expect or propagate a Result) so that mcp_format_registry is
mandatory; update the call site that sets let mcp_format_registry to use
ctx.components.mcp_format_registry().cloned().expect("mcp_format_registry is
required") or propagate an appropriate error so missing registry fails fast.


ℹ️ Review info — configuration: Organization UI; review profile: ASSERTIVE; plan: Pro; run ID: 8691361e-d6ad-47b3-a08a-8837e7111084

📥 Commits

Reviewing files that changed from the base of the PR and between fbe2497 and a2212fc.

📒 Files selected for processing (49)
  • crates/mcp/Cargo.toml
  • crates/mcp/src/core/config.rs
  • crates/mcp/src/core/mod.rs
  • crates/mcp/src/core/orchestrator.rs
  • crates/mcp/src/core/session.rs
  • crates/mcp/src/inventory/mod.rs
  • crates/mcp/src/inventory/types.rs
  • crates/mcp/src/lib.rs
  • crates/mcp/src/responses_bridge.rs
  • crates/mcp/src/transform/mod.rs
  • e2e_test/responses/test_image_generation.py
  • model_gateway/src/app_context.rs
  • model_gateway/src/routers/anthropic/context.rs
  • model_gateway/src/routers/anthropic/router.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/mod.rs
  • model_gateway/src/routers/common/openai_bridge/format_descriptor.rs
  • model_gateway/src/routers/common/openai_bridge/format_registry.rs
  • model_gateway/src/routers/common/openai_bridge/mod.rs
  • model_gateway/src/routers/common/openai_bridge/overrides.rs
  • model_gateway/src/routers/common/openai_bridge/response_format.rs
  • model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/conversations/handlers.rs
  • model_gateway/src/routers/gemini/context.rs
  • model_gateway/src/routers/gemini/router.rs
  • model_gateway/src/routers/grpc/common/responses/context.rs
  • model_gateway/src/routers/grpc/common/responses/streaming.rs
  • model_gateway/src/routers/grpc/common/responses/utils.rs
  • model_gateway/src/routers/grpc/harmony/responses/common.rs
  • model_gateway/src/routers/grpc/harmony/responses/execution.rs
  • model_gateway/src/routers/grpc/harmony/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/harmony/responses/streaming.rs
  • model_gateway/src/routers/grpc/harmony/streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/common.rs
  • model_gateway/src/routers/grpc/regular/responses/handlers.rs
  • model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/grpc/router.rs
  • model_gateway/src/routers/openai/context.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/utils.rs
  • model_gateway/src/routers/openai/router.rs
  • model_gateway/src/service_discovery.rs
  • model_gateway/src/workflow/mcp_registration.rs
  • model_gateway/tests/mcp_test.rs
💤 Files with no reviewable changes (3)
  • crates/mcp/Cargo.toml
  • crates/mcp/src/responses_bridge.rs
  • crates/mcp/src/transform/mod.rs

Comment on lines +67 to +69
```rust
if format == ResponseFormat::Passthrough {
    continue;
}
```

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

populate_from_server_config is not actually repeat-safe when a format is removed.

The doc says repeated calls are safe, but this branch only skips insertion; it never clears an existing map entry. If a tool was previously registered as WebSearchCall and a later config changes it back to Passthrough, the old QualifiedToolName -> ResponseFormat mapping survives and the registry keeps transforming that tool. Clear the current direct/alias key(s) before continue so format downgrades take effect.

♻️ Minimal direction for the cleanup

```diff
             for (tool_name, tool_config) in tools {
+                self.formats
+                    .remove(&QualifiedToolName::new(&config.name, tool_name));
+                if let Some(alias) = &tool_config.alias {
+                    self.formats
+                        .remove(&QualifiedToolName::new(ALIAS_SERVER_KEY, alias));
+                }
+
                 let format: ResponseFormat = tool_config.response_format.into();
                 if format == ResponseFormat::Passthrough {
                     continue;
                 }
                 if let Some(alias) = &tool_config.alias {
```
                 if let Some(alias) = &tool_config.alias {
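The remove-before-insert pattern the reviewer asks for can be sketched in isolation. This is a minimal model, not the production code: a std HashMap stands in for the DashMap-backed FormatRegistry, and the key/enum types are illustrative assumptions.

```rust
// Repeat-safe repopulation sketch: clear any stale mapping first so a
// downgrade back to Passthrough actually takes effect on re-population.
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
}

fn repopulate(
    map: &mut HashMap<(String, String), ResponseFormat>,
    server: &str,
    tool: &str,
    format: ResponseFormat,
) {
    let key = (server.to_string(), tool.to_string());
    // Remove unconditionally before the Passthrough early-return.
    map.remove(&key);
    if format == ResponseFormat::Passthrough {
        return;
    }
    map.insert(key, format);
}

fn main() {
    let mut map = HashMap::new();
    repopulate(&mut map, "search", "do_search", ResponseFormat::WebSearchCall);
    assert_eq!(map.len(), 1);
    // A later config downgrades the tool back to Passthrough:
    repopulate(&mut map, "search", "do_search", ResponseFormat::Passthrough);
    assert!(map.is_empty());
}
```

Without the `remove`, the second call would be a no-op and the stale WebSearchCall mapping would survive.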
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs` around
lines 67 - 69, populate_from_server_config currently skips inserting when
ResponseFormat::Passthrough but doesn't remove any existing mapping, so a prior
mapping (e.g., WebSearchCall) persists; modify populate_from_server_config so
that when format == ResponseFormat::Passthrough you first remove/clear any
existing entries for that QualifiedToolName (and any alias keys it may have)
from the registry maps before continuing, ensuring downgrades from a concrete
format back to Passthrough actually remove the transformation entries.

Comment thread model_gateway/src/routers/common/openai_bridge/format_registry.rs Outdated
Comment on lines +95 to +99
annotations: entry
    .tool
    .annotations
    .as_ref()
    .and_then(|a| serde_json::to_value(a).ok()),

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep mcp_list_tools[].tools[].annotations narrowed to read_only.

Serializing ToolAnnotations wholesale changes the wire shape by exposing internal fields like destructive, idempotent, and open_world. The previous bridge intentionally emitted only read_only, and approval logic still reads the richer struct directly, so widening the JSON here is a compatibility regression.

Suggested fix
-            annotations: entry
-                .tool
-                .annotations
-                .as_ref()
-                .and_then(|a| serde_json::to_value(a).ok()),
+            annotations: entry.tool.annotations.as_ref().map(|a| {
+                json!({
+                    "read_only": a.read_only,
+                })
+            }),

Based on learnings: build_mcp_tool_infos intentionally serializes annotations as only {"read_only": entry.annotations.read_only} to preserve OpenAI Responses API wire parity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs` around
lines 95 - 99, The current serialization of entry.tool.annotations (in
build_mcp_tool_infos / tool_descriptors.rs) serializes the whole ToolAnnotations
struct and leaks internal fields; change it to only emit the read_only flag by
mapping annotations to a JSON object containing only {"read_only":
entry.tool.annotations.read_only} (or None if annotations is None) instead of
serde_json::to_value(entry.tool.annotations).ok(), so the wire shape matches the
previous OpenAI Responses API parity.

Comment on lines +359 to 364
/// Convert tool entries to JSON values using the shared bridge builder.
fn tool_entries_to_json(
tools: &[mcp::ToolEntry],
) -> Result<Vec<serde_json::Value>, serde_json::Error> {
mcp::build_mcp_tool_infos(tools)
crate::routers::common::openai_bridge::build_mcp_tool_infos(tools)
.into_iter()

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't widen mcp_list_tools[].tools[].annotations on this path.

Switching tool_entries_to_json() to openai_bridge::build_mcp_tool_infos() changes the streaming wire shape because the shared builder currently serializes the full ToolAnnotations object. The previous smg_mcp::build_mcp_tool_infos() intentionally exposed only {"read_only": ...}, so this swap can reintroduce destructive / idempotent / open_world into response.mcp_list_tools.completed.

Based on learnings build_mcp_tool_infos intentionally serializes only {"read_only": entry.annotations.read_only} to match OpenAI Responses API wire parity.
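If the narrowing the reviewer describes is desired, it can be sketched with std-only stand-in types. The real code builds serde_json values and the `ToolAnnotations` shape here is an assumption; the point is only that internal fields never reach the emitted JSON.

```rust
// Narrow a full annotations struct to the single `read_only` field before
// it reaches the wire, mirroring the previous bridge's {"read_only": ...}.
#[derive(Clone, Copy)]
struct ToolAnnotations {
    read_only: bool,
    destructive: bool,
    idempotent: bool,
    open_world: bool,
}

// Emit the narrowed shape as a plain JSON string for illustration.
fn narrowed_annotations_json(a: Option<&ToolAnnotations>) -> Option<String> {
    a.map(|a| format!("{{\"read_only\":{}}}", a.read_only))
}

fn main() {
    let full = ToolAnnotations {
        read_only: true,
        destructive: false,
        idempotent: true,
        open_world: false,
    };
    // The internal fields exist on the struct but never appear in the JSON.
    assert!(!full.destructive && full.idempotent && !full.open_world);
    assert_eq!(
        narrowed_annotations_json(Some(&full)).unwrap(),
        "{\"read_only\":true}"
    );
    assert_eq!(narrowed_annotations_json(None), None);
}
```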

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/common/responses/streaming.rs` around lines
359 - 364, The change to use
crate::routers::common::openai_bridge::build_mcp_tool_infos in
tool_entries_to_json widens the serialized ToolAnnotations (adding
destructive/idempotent/open_world) and breaks the expected wire shape; revert to
or emulate the previous behavior by ensuring tool_entries_to_json only emits the
minimal annotation payload ({"read_only": entry.annotations.read_only}) like the
original smg_mcp::build_mcp_tool_infos did—either call the original
smg_mcp::build_mcp_tool_infos or post-process the values returned by
openai_bridge::build_mcp_tool_infos to strip all annotation fields except
read_only before returning.

Comment thread model_gateway/src/routers/grpc/harmony/responses/common.rs Outdated
Comment thread model_gateway/src/routers/openai/mcp/tool_loop.rs
Comment thread model_gateway/src/routers/openai/responses/non_streaming.rs Outdated
…s, hosted args

Round 1 of PR #1429 review fixes (CodeRabbit, Codex, Claude, Gemini):

- (P1) `transform_tool_output` now takes a pre-resolved `ResponseFormat`
  instead of re-deriving it from `(output.server_key, output.tool_name)`.
  After `McpToolSession::execute_tool_result` the `tool_name` is the
  *invoked/exposed* name, so the registry lookup missed for disambiguated
  names like `mcp_<server>_<tool>` and silently degraded to `Passthrough`.
  All four call sites already had a session-aware `lookup_tool_format`
  result — this drops the duplicate (broken) lookup.

- (Major) Alias/arg_mapping-only stanzas no longer disable a builtin
  tool's hosted format. `ToolConfig.response_format` is now
  `Option<ResponseFormatConfig>`: `None` means "inherit context"
  (builtin → hosted format), `Some(Passthrough)` is the explicit
  opt-out that still suppresses the builtin default. Adds a
  regression test covering the alias-only-on-builtin case.

- (Major) Hosted-tool dispatch arguments mismatch in `tool_loop.rs`:
  `build_transformed_mcp_call_item` now serializes the post-merge
  `effective_arguments`, so the client-visible item describes what the
  router actually dispatched (e.g. image-generation `size`/`quality`
  overrides) rather than the pre-merge string emitted by the model.

- (Major) `strip_image_generation_from_request_tools` now keys on the
  literal exposed name `"image_generation"` routed to
  `ImageGenerationCall` rather than "any MCP tool with that format".
  A custom MCP tool with image-generation-shaped output (e.g.
  `thumbnailer`) no longer drops the real hosted tag.

- (Major) Both `/v1/responses` non-streaming and streaming entrypoints
  now hard-fail with 500 when `mcp_format_registry` is missing, instead
  of silently falling back to `FormatRegistry::default()` and
  mis-routing MCP-backed tools.

- (Nit) `lookup_tool_format` reuses the `QualifiedToolName` returned by
  `qualified_name_for_exposed` instead of rebuilding it via
  `lookup_by_names`, saving two `Arc<str>` allocations per call.

Skipped review items:
- "ToolAnnotations widening" (×2): false positive — `git show fbe2497`
  confirms the previous bridge already used the same
  `serde_json::to_value(annotations)` serialization. No wire-shape change.
- "`server` helper should be `#[cfg(test)]`": false positive — it
  already lives inside `#[cfg(test)] mod tests {}`.
- "Repeat-safe registry on Passthrough downgrade": no production caller
  mutates server config in place; documented the actual behavior in the
  doc comment instead of adding a `remove` path that would never fire.

Signed-off-by: Simo Lin <[email protected]>
Comment on lines +100 to +103
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    self.insert(QualifiedToolName::new(&config.name, tool_name), format);
}

🔴 Important: Builtin default is stored under the wrong key when the tool has an alias, so the production lookup still misses.

collect_visible_mcp_tools (session.rs:565-571) replaces a direct entry with its alias entries. So when do_search has alias web_search, only web_search is exposed. The production lookup path is:

  1. lookup_tool_format(session, registry, "web_search")
  2. session.qualified_name_for_exposed("web_search")QualifiedToolName("alias", "web_search")
  3. registry.lookup(…) checks for ("alias", "web_search")missPassthrough

But the builtin default was inserted as ("search", "do_search"), which is never queried in production for an aliased tool.

The per-tool loop (lines 84-88) already handles this correctly — when a tool has an alias, the format is stored under ("alias", alias_name). The builtin default path should mirror that:

Suggested change
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    self.insert(QualifiedToolName::new(&config.name, tool_name), format);
}
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    // When the tool has an alias, store under the alias key to match
    // the QualifiedToolName shape that McpToolSession exposes.
    let alias = config
        .tools
        .as_ref()
        .and_then(|tools| tools.get(tool_name))
        .and_then(|cfg| cfg.alias.as_ref());
    if let Some(alias_name) = alias {
        self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias_name), format);
    } else {
        self.insert(QualifiedToolName::new(&config.name, tool_name), format);
    }
}

Comment on lines +265 to +269
assert_eq!(
r.lookup_by_names("search", "do_search"),
ResponseFormat::WebSearchCall,
"alias-only stanza must not disable the builtin's hosted format"
);

🔴 Important: This test doesn't exercise the production lookup path and gives a false positive.

lookup_by_names("search", "do_search") queries the (server, tool) pair directly. But in production, the session resolves alias "web_search" to QualifiedToolName("alias", "web_search") — because collect_visible_mcp_tools replaces direct entries with their aliases — and that key was never inserted into the registry.

To catch the bug flagged above, this assertion should additionally (or instead) check the alias key:

assert_eq!(
    r.lookup_by_names("alias", "web_search"),
    ResponseFormat::WebSearchCall,
    "alias-only stanza must not disable the builtin's hosted format (alias lookup)"
);

This assertion will fail with the current populate_from_server_config implementation, confirming the bug.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cafb8ddd35


Comment on lines +100 to +102
if !has_explicit_format {
    let format: ResponseFormat = builtin_type.response_format().into();
    self.insert(QualifiedToolName::new(&config.name, tool_name), format);

P1 Badge Register builtin default format for aliased builtin tools

When a built-in server config sets tools.<builtin_tool_name>.alias but leaves response_format unset (None), this code inserts the default hosted format only for (<server>, <tool_name>). However, McpToolSession exposes and resolves the callable name through the alias entry (("alias", <alias_name>)), so lookup_tool_format(...) misses and falls back to Passthrough, emitting mcp_call instead of the hosted shape (e.g. web_search_call / image_generation_call). This regresses the documented “alias-only stanza preserves builtin default” behavior for real tool execution paths.

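The lookup mismatch Codex flags can be reproduced in miniature. A std HashMap stands in for the DashMap-backed registry here, and the key shape and `ALIAS_SERVER_KEY` value are illustrative assumptions, not the production types.

```rust
// Sketch: the builtin default is inserted under the canonical
// (server, tool) key, but the session resolves the aliased tool through
// ("alias", alias_name), so the lookup degrades to Passthrough.
use std::collections::HashMap;

const ALIAS_SERVER_KEY: &str = "alias";

#[derive(Clone, Copy, Debug, PartialEq)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
}

type Registry = HashMap<(String, String), ResponseFormat>;

fn lookup(r: &Registry, server: &str, tool: &str) -> ResponseFormat {
    r.get(&(server.to_string(), tool.to_string()))
        .copied()
        .unwrap_or(ResponseFormat::Passthrough)
}

fn main() {
    let mut r = Registry::new();
    // Builtin default inserted under the canonical key only (current code):
    r.insert(
        ("search".into(), "do_search".into()),
        ResponseFormat::WebSearchCall,
    );
    // Production resolves the aliased tool via the alias key and misses:
    assert_eq!(
        lookup(&r, ALIAS_SERVER_KEY, "web_search"),
        ResponseFormat::Passthrough
    );
    // Registering the default under the alias key as well fixes the lookup:
    r.insert(
        (ALIAS_SERVER_KEY.into(), "web_search".into()),
        ResponseFormat::WebSearchCall,
    );
    assert_eq!(
        lookup(&r, ALIAS_SERVER_KEY, "web_search"),
        ResponseFormat::WebSearchCall
    );
}
```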

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/harmony/responses/common.rs (1)

184-189: ⚠️ Potential issue | 🟠 Major

Don't re-expose mcp_list_tools for internal builtin servers.

Switching this path to openai_bridge::inject_client_visible_mcp_output_items(...) introduces a visibility regression. The bridge helper emits mcp_list_tools for any binding that is not is_internal_non_builtin_server_label(...), which allows builtin-routed internal servers to become client-visible again. This breaks the established contract where all internal MCP plumbing (both builtin and non-builtin) stays hidden.

Change the predicate in model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs to use is_internal_server_label instead, so that only external servers emit mcp_list_tools items.
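The requested visibility rule can be sketched as a pair of label predicates. The label prefixes below are invented assumptions for illustration; only the relationship between the two checks matches the review comment.

```rust
// Sketch: the old predicate hides only non-builtin internal servers, so a
// builtin-routed internal server would still emit mcp_list_tools. The
// requested predicate hides every internal server.
fn is_internal_server_label(label: &str) -> bool {
    label.starts_with("__internal_")
}

fn is_internal_non_builtin_server_label(label: &str) -> bool {
    label.starts_with("__internal_") && !label.starts_with("__internal_builtin_")
}

fn emits_mcp_list_tools(label: &str) -> bool {
    // Reviewer's requested predicate: only external servers are visible.
    !is_internal_server_label(label)
}

fn main() {
    // Builtin-routed internal server: hidden under the fixed predicate,
    // but NOT caught by the old non-builtin-only check.
    assert!(!emits_mcp_list_tools("__internal_builtin_web_search"));
    assert!(!is_internal_non_builtin_server_label("__internal_builtin_web_search"));
    // External server stays client-visible.
    assert!(emits_mcp_list_tools("deepwiki"));
}
```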

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/harmony/responses/common.rs` around lines 184
- 189, The bridge helper inject_client_visible_mcp_output_items is currently
allowing builtin-routed internal servers to become client-visible because
tool_descriptors.rs uses is_internal_non_builtin_server_label in its predicate
that decides whether to emit mcp_list_tools; update that predicate to use
is_internal_server_label instead so only external servers emit mcp_list_tools.
Locate the emission logic in
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs (the code
path that decides to emit mcp_list_tools) and replace the
is_internal_non_builtin_server_label check with is_internal_server_label,
preserving surrounding logic and tests to ensure builtins remain hidden.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 84-102: When inserting the response format for a tool that has an
alias, also insert the same format under the canonical key so builtin dispatch
won't fall back to Passthrough; in the block that checks if let Some(alias) =
&tool_config.alias { ... } (and uses
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format)), also
check if tool_config.response_format.is_some() (or the resolved format is
explicit) and then call self.insert(QualifiedToolName::new(&config.name,
tool_name), format) so the canonical (server_name, tool_name) entry is
preserved; keep using QualifiedToolName::new and the same ResponseFormat value.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 815-816: The SSE payload currently emits the ad-hoc raw
`{arguments, output}`/`{error}` object instead of the transformed
ResponseOutputItem; change the code that serializes and emits the streamed
"done" event to use the value returned by openai_bridge::transform_tool_output
(the variable output_item used in state.record_call()) — serialize output_item
and emit that for output_item.done. Also ensure failure shapes for
hosted-builtin families (WebSearchCall, CodeInterpreterCall, FileSearchCall,
ImageGenerationCall) are represented by status: "failed" only (no error field)
while allowing McpCall to include an error field, so the serialized output_item
follows those family-specific rules.

---

Outside diff comments:
In `@model_gateway/src/routers/grpc/harmony/responses/common.rs`:
- Around line 184-189: The bridge helper inject_client_visible_mcp_output_items
is currently allowing builtin-routed internal servers to become client-visible
because tool_descriptors.rs uses is_internal_non_builtin_server_label in its
predicate that decides whether to emit mcp_list_tools; update that predicate to
use is_internal_server_label instead so only external servers emit
mcp_list_tools. Locate the emission logic in
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs (the code
path that decides to emit mcp_list_tools) and replace the
is_internal_non_builtin_server_label check with is_internal_server_label,
preserving surrounding logic and tests to ensure builtins remain hidden.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5f06c2f0-2c36-4062-bd62-26fc7b3df2e5

📥 Commits

Reviewing files that changed from the base of the PR and between a2212fc and cafb8dd.

📒 Files selected for processing (13)
  • crates/mcp/src/core/config.rs
  • crates/mcp/src/core/orchestrator.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/openai_bridge/format_registry.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/grpc/harmony/responses/common.rs
  • model_gateway/src/routers/grpc/harmony/responses/execution.rs
  • model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/tests/mcp_test.rs

Comment on lines +84 to +102
        if let Some(alias) = &tool_config.alias {
            self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format);
        } else {
            self.insert(QualifiedToolName::new(&config.name, tool_name), format);
        }
    }
}

if let (Some(builtin_type), Some(tool_name)) =
    (&config.builtin_type, &config.builtin_tool_name)
{
    let has_explicit_format = config
        .tools
        .as_ref()
        .and_then(|tools| tools.get(tool_name))
        .is_some_and(|cfg| cfg.response_format.is_some());
    if !has_explicit_format {
        let format: ResponseFormat = builtin_type.response_format().into();
        self.insert(QualifiedToolName::new(&config.name, tool_name), format);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the explicit format on the canonical builtin key, too.

If tool_name is also config.builtin_tool_name and the stanza has both alias and a concrete response_format, this code only inserts ("alias", alias) and suppresses the builtin fallback. Builtin dispatch later resolves by (server_name, tool_name), so those calls degrade to Passthrough even though the config explicitly requested a hosted shape.

♻️ Minimal fix
                 if format == ResponseFormat::Passthrough {
                     continue;
                 }
                 if let Some(alias) = &tool_config.alias {
                     self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format);
+                    if config.builtin_tool_name.as_deref() == Some(tool_name.as_str()) {
+                        self.insert(QualifiedToolName::new(&config.name, tool_name), format);
+                    }
                 } else {
                     self.insert(QualifiedToolName::new(&config.name, tool_name), format);
                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs` around
lines 84 - 102, When inserting the response format for a tool that has an alias,
also insert the same format under the canonical key so builtin dispatch won't
fall back to Passthrough; in the block that checks if let Some(alias) =
&tool_config.alias { ... } (and uses
self.insert(QualifiedToolName::new(ALIAS_SERVER_KEY, alias), format)), also
check if tool_config.response_format.is_some() (or the resolved format is
explicit) and then call self.insert(QualifiedToolName::new(&config.name,
tool_name), format) so the canonical (server_name, tool_name) entry is
preserved; keep using QualifiedToolName::new and the same ResponseFormat value.

Comment on lines +815 to +816
let output_item =
    openai_bridge::transform_tool_output(&tool_output, &response_format);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the transformed item for streamed output_item.done.

The result of transform_tool_output() is only fed into state.record_call(). The SSE payload emitted just above still uses the ad-hoc raw {arguments, output} / {error} object, so builtin formats can stream a different shape than the final ResponseOutputItem, and failures on hosted builtin families leak an error field those variants do not support. Serialize the transformed item here and emit that instead.

Based on learnings, hosted-builtin families (WebSearchCall, CodeInterpreterCall, FileSearchCall, ImageGenerationCall) must convey failure via status: "failed" alone; only McpCall may carry an error field.
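The family-specific failure rule stated above can be sketched with stand-in types (the real ResponseOutputItem variants are richer; only the status/error split shown here comes from the review comment):

```rust
// Sketch: hosted builtin families report failure via status "failed" alone;
// only McpCall may carry an error string on the wire.
#[derive(Debug, PartialEq)]
enum ResponseFormat {
    McpCall,
    WebSearchCall,
    CodeInterpreterCall,
    FileSearchCall,
    ImageGenerationCall,
}

struct FailureShape {
    status: &'static str,
    error: Option<String>,
}

fn failure_shape(format: &ResponseFormat, message: &str) -> FailureShape {
    match format {
        ResponseFormat::McpCall => FailureShape {
            status: "failed",
            error: Some(message.to_string()),
        },
        // Hosted families: no error field, status conveys the failure.
        _ => FailureShape {
            status: "failed",
            error: None,
        },
    }
}

fn main() {
    let mcp = failure_shape(&ResponseFormat::McpCall, "timeout");
    assert_eq!(mcp.status, "failed");
    assert_eq!(mcp.error.as_deref(), Some("timeout"));

    let web = failure_shape(&ResponseFormat::WebSearchCall, "timeout");
    assert_eq!(web.status, "failed");
    assert!(web.error.is_none());
}
```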

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines
815 - 816, The SSE payload currently emits the ad-hoc raw `{arguments,
output}`/`{error}` object instead of the transformed ResponseOutputItem; change
the code that serializes and emits the streamed "done" event to use the value
returned by openai_bridge::transform_tool_output (the variable output_item used
in state.record_call()) — serialize output_item and emit that for
output_item.done. Also ensure failure shapes for hosted-builtin families
(WebSearchCall, CodeInterpreterCall, FileSearchCall, ImageGenerationCall) are
represented by status: "failed" only (no error field) while allowing McpCall to
include an error field, so the serialized output_item follows those
family-specific rules.

slin1237 added 2 commits May 1, 2026 14:53
CI runs `cargo clippy --all-targets --all-features -- -D warnings` on this
PR for the first time (prior runs were cancelled), surfacing three pre-existing
lint classes:

- `clippy::absolute_paths`: ~17 sites referenced
  `crate::routers::common::openai_bridge::…` inline. Add a `use` for the
  module in each affected file and shorten the call sites.
- `clippy::trivially_copy_pass_by_ref`: `ResponseFormat` is a 1-byte `Copy`
  enum. Switch ~15 function signatures (and their callers) from
  `&ResponseFormat` to `ResponseFormat`. The same shift applies to the
  internal `ResponseTransformer::transform`, `transform_tool_output`,
  `build_transformed_mcp_call_item`, `non_streaming_tool_item_id_source`,
  `stable_streaming_tool_item_id`, `send_tool_call_*`, the gRPC emitter
  helpers, and the harmony streaming arg-classifier.
- `clippy::clone_on_copy`: drop a `.clone()` on `Option<ResponseFormat>` in
  the harmony tool-call tracking insert, and reformulate three
  `if let Some(ref fmt) = response_format` bindings now that the inner
  format is `Copy` (using `if let Some(fmt) = *response_format` for the
  borrowed-tuple cases).

No behavioral change — these are surface-level type / import edits to
satisfy the workspace lint gate.

Signed-off-by: Simo Lin <[email protected]>
The earlier import-shortening pass left several call sites at 5-line
formatting that fmt collapses back to 1-2 lines. Apply cargo fmt to
match the project style.

Signed-off-by: Simo Lin <[email protected]>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (1)
model_gateway/src/routers/grpc/regular/responses/streaming.rs (1)

748-795: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stream output_item.done from the transformed item.

The stream still emits ad-hoc {arguments, output} / {error} objects here, then builds the real ResponseOutputItem separately below for state.record_call(). For hosted formats that means SSE clients see a different shape than the final response, and builtin failures still leak an error field those variants do not carry. Reuse the transformed item for output_item.done so the streamed and persisted forms stay identical.

Also applies to: 815-816

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines
748 - 795, The stream currently emits ad-hoc JSON objects for successful and
failed tool results then separately builds the final ResponseOutputItem for
state.record_call(), causing mismatched shapes; instead, transform the
tool_output once into the final ResponseOutputItem structure (the same shape
used for state.record_call()) and pass that transformed item to
emitter.emit_output_item_done/output_item.done; update the branches where
item_done is constructed (the mutable json! blocks around
emitter.emit_output_item_done, using tool_output, output_str, response_format,
attach_mcp_server_label) to reuse that transformed item for both
emitter.send_event(&event, &tx)? and for state.record_call(), and remove the
duplicate ad-hoc {arguments, output} / {error} objects so streamed and persisted
forms are identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/transformer.rs`:
- Around line 21-33: transform_tool_output currently discards
ToolExecutionOutput's failure info and always builds a successful
ResponseOutputItem via ResponseTransformer::transform; update
transform_tool_output to detect output.is_error (and use output.error_message)
and propagate that into the resulting ResponseOutputItem (e.g., set status to
"failed" and populate error/message fields) instead of always returning a
completed/no-error item so failed MCP/hosted-tool executions are persisted as
failures. Locate transform_tool_output and the call to
ResponseTransformer::transform and either pass through an explicit error/status
override or post-process the returned ResponseOutputItem to set status and error
using output.is_error and output.error_message.

In `@model_gateway/src/routers/common/persistence_utils.rs`:
- Around line 298-303: item_to_new_conversation_item() currently extracts
non-message input items by reading the "content" field which is absent for
replayed tool-output items, causing them to be persisted as [] and lost later;
after compact_image_generation_outputs_json(...) is called on value, change the
persistence logic in item_to_new_conversation_item() to detect when "content" is
missing or when the item is a structured tool/input (e.g., contains
image_generation_call or hosted-tool keys) and instead serialize the compacted
whole JSON value (the same `value` passed into
openai_bridge::compact_image_generation_outputs_json) verbatim into the
conversation item field used for storage (rather than using content), so
replayed structured inputs persist intact across loads.

In `@model_gateway/src/routers/openai/responses/streaming.rs`:
- Around line 1092-1103: The code currently requires
ctx.components.mcp_format_registry() unconditionally, causing 500s for non‑MCP
streams; instead, only require and clone the MCP format registry when the
request will enter MCP interception (i.e., when original_body.tools.as_deref()
is Some). Move the mcp_format_registry lookup into the branch that checks
original_body.tools (the same branch that calls ensure_request_mcp_client), and
if tools exist but ctx.components.mcp_format_registry() is None return the
internal_error; otherwise let plain streaming paths continue to
handle_simple_streaming_passthrough() without needing the registry.

In `@model_gateway/tests/mcp_test.rs`:
- Around line 337-347: The test currently hardcodes call/server/tool/args when
calling ResponseTransformer::transform; replace that with the production helper
openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) (or
at minimum pass output.call_id, output.server_label, output.tool_name, and
output.arguments_str into ResponseTransformer::transform) so the assertions
exercise the actual ToolExecutionOutput produced by session.execute_tool();
update both the instance around lines ~337 and the similar block at ~422-429 to
use transform_tool_output or the output.* fields.

---

Duplicate comments:
In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 748-795: The stream currently emits ad-hoc JSON objects for
successful and failed tool results then separately builds the final
ResponseOutputItem for state.record_call(), causing mismatched shapes; instead,
transform the tool_output once into the final ResponseOutputItem structure (the
same shape used for state.record_call()) and pass that transformed item to
emitter.emit_output_item_done/output_item.done; update the branches where
item_done is constructed (the mutable json! blocks around
emitter.emit_output_item_done, using tool_output, output_str, response_format,
attach_mcp_server_label) to reuse that transformed item for both
emitter.send_event(&event, &tx)? and for state.record_call(), and remove the
duplicate ad-hoc {arguments, output} / {error} objects so streamed and persisted
forms are identical.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 87979d65-d087-4c6e-a968-b80299ab9481

📥 Commits

Reviewing files that changed from the base of the PR and between cafb8dd and 71e1b23.

📒 Files selected for processing (18)
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/conversations/handlers.rs
  • model_gateway/src/routers/grpc/common/responses/streaming.rs
  • model_gateway/src/routers/grpc/common/responses/utils.rs
  • model_gateway/src/routers/grpc/harmony/responses/common.rs
  • model_gateway/src/routers/grpc/harmony/responses/execution.rs
  • model_gateway/src/routers/grpc/harmony/streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/common.rs
  • model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/openai/context.rs
  • model_gateway/src/routers/openai/mcp/tool_loop.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/utils.rs
  • model_gateway/src/service_discovery.rs
  • model_gateway/tests/mcp_test.rs

Comment on lines +21 to +33
pub fn transform_tool_output(
    output: &smg_mcp::ToolExecutionOutput,
    response_format: ResponseFormat,
) -> ResponseOutputItem {
    ResponseTransformer::transform(
        &output.output,
        response_format,
        &output.call_id,
        &output.server_label,
        &output.tool_name,
        &output.arguments_str,
    )
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve ToolExecutionOutput failure state here.

transform_tool_output() throws away is_error/error_message and always delegates to builders that stamp successful items (status = "completed" / error = None). The streaming caller still invokes this helper on error paths, so failed MCP or hosted-tool executions can be persisted and replayed as successes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/openai_bridge/transformer.rs` around lines
21 - 33, transform_tool_output currently discards ToolExecutionOutput's failure
info and always builds a successful ResponseOutputItem via
ResponseTransformer::transform; update transform_tool_output to detect
output.is_error (and use output.error_message) and propagate that into the
resulting ResponseOutputItem (e.g., set status to "failed" and populate
error/message fields) instead of always returning a completed/no-error item so
failed MCP/hosted-tool executions are persisted as failures. Locate
transform_tool_output and the call to ResponseTransformer::transform and either
pass through an explicit error/status override or post-process the returned
ResponseOutputItem to set status and error using output.is_error and
output.error_message.

Comment on lines +298 to +303
// Strip image_generation_call.result base64 from
// historical replayed items before persistence —
// a no-op for non-image item types.
openai_bridge::compact_image_generation_outputs_json(
std::slice::from_mut(&mut value),
);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist replayed structured input items verbatim.

This now preserves replayed image_generation_call/hosted-tool inputs long enough to compact them, but item_to_new_conversation_item() still stores non-message input items by reading their content field. For replayed tool-output items that field is absent, so the linked conversation item becomes [] and the next history load loses the original item.

💡 Suggested fix
-    let store_whole_item = if is_input {
-        item_type == "function_call" || item_type == "function_call_output"
-    } else {
-        item_type != "message"
-    };
+    let store_whole_item = item_type != "message";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/common/persistence_utils.rs` around lines 298 -
303, item_to_new_conversation_item() currently extracts non-message input items
by reading the "content" field which is absent for replayed tool-output items,
causing them to be persisted as [] and lost later; after
compact_image_generation_outputs_json(...) is called on value, change the
persistence logic in item_to_new_conversation_item() to detect when "content" is
missing or when the item is a structured tool/input (e.g., contains
image_generation_call or hosted-tool keys) and instead serialize the compacted
whole JSON value (the same `value` passed into
openai_bridge::compact_image_generation_outputs_json) verbatim into the
conversation item field used for storage (rather than using content), so
replayed structured inputs persist intact across loads.

Comment on lines +1092 to +1103
// Same fail-fast contract as the non-streaming path: a missing format
// registry means MCP routing decisions would be silently wrong.
let mcp_format_registry = match ctx.components.mcp_format_registry() {
Some(r) => r.clone(),
None => {
return error::internal_error("internal_error", "MCP format registry required");
}
};

// Check for MCP tools and create request context if needed
let mcp_servers = if let Some(tools) = original_body.tools.as_deref() {
-        ensure_request_mcp_client(&mcp_orchestrator, tools).await
+        ensure_request_mcp_client(&mcp_orchestrator, &mcp_format_registry, tools).await

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only require the format registry for MCP-bound streams.

This fail-fast now runs before we know whether the request will even enter the MCP interception path. In contexts where components.mcp_format_registry() is unset, plain streaming requests that would otherwise use handle_simple_streaming_passthrough() now return 500 even though no MCP routing is needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/openai/responses/streaming.rs` around lines 1092 -
1103, The code currently requires ctx.components.mcp_format_registry()
unconditionally, causing 500s for non‑MCP streams; instead, only require and
clone the MCP format registry when the request will enter MCP interception
(i.e., when original_body.tools.as_deref() is Some). Move the
mcp_format_registry lookup into the branch that checks original_body.tools (the
same branch that calls ensure_request_mcp_client), and if tools exist but
ctx.components.mcp_format_registry() is None return the internal_error;
otherwise let plain streaming paths continue to
handle_simple_streaming_passthrough() without needing the registry.

Comment on lines +337 to +347
// The session returns the raw `output` Value from the MCP call. Re-transform
// with WebSearchCall format to verify serialization (end-to-end source
// extraction is covered by the gateway bridge's own tests).
let transformed = ResponseTransformer::transform(
&output.output,
ResponseFormat::WebSearchCall,
"test-request-openai-search",
"openai_search_server",
"brave_web_search",
"{\"query\":\"rust openai search\"}",
);

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Exercise the production bridge helper in these assertions.

These checks hardcode the call/server/tool/args instead of transforming the actual ToolExecutionOutput returned by session.execute_tool(). They can still pass if session-side rewriting regresses. Prefer openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) here, or at least feed output.call_id, output.server_label, output.tool_name, and output.arguments_str back into the transformer.

Also applies to: 422-429

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/tests/mcp_test.rs` around lines 337 - 347, The test currently
hardcodes call/server/tool/args when calling ResponseTransformer::transform;
replace that with the production helper
openai_bridge::transform_tool_output(&output, ResponseFormat::WebSearchCall) (or
at minimum pass output.call_id, output.server_label, output.tool_name, and
output.arguments_str into ResponseTransformer::transform) so the assertions
exercise the actual ToolExecutionOutput produced by session.execute_tool();
update both the instance around lines ~337 and the similar block at ~422-429 to
use transform_tool_output or the output.* fields.

slin1237 added 2 commits May 1, 2026 16:12
…tself

The previous assertion ("base64 not anywhere in the GET payload") was
over-specified. On the gRPC SGLang lane the model (gpt-oss-20b) sees the
base64 in its function_call_output and legitimately echoes it back inside
its assistant reply text ("Base64 PNG data: iVBORw..."). That message
text is then persisted as a normal message item and trips the assertion
even though the compactor is doing exactly what it was wired to do: the
stored image_generation_call item has no `result` field.

Narrow the assertion to the structured field where compactor failure
actually surfaces: the persisted image_generation_call must carry no
top-level or nested `result`, and the base64 must not appear inside the
serialized image_generation_call item. Free-form assistant text is
deliberately not policed here; that is a separate concern (model
behavior on the upstream side) and the gateway has nothing to compact.

Signed-off-by: Simo Lin <[email protected]>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d844414070

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +81 to +82
if format == ResponseFormat::Passthrough {
continue;

P2 | Remove cached format on passthrough reconfiguration

When populate_from_server_config is called again for an already-registered server, Some(Passthrough) is treated as a no-op (continue) and the previous non-passthrough mapping is left in formats. Because ConnectMcpServerStep repopulates this shared registry on each RegisterMcpServer run, changing a tool from web_search_call/etc. back to passthrough will keep transforming outputs as the old hosted type. This creates incorrect response item types after MCP server config updates unless stale keys are explicitly deleted.

Labels

anthropic Anthropic router changes dependencies Dependency updates gemini Gemini router changes grpc gRPC client and router changes mcp MCP related changes model-gateway Model gateway crate changes openai OpenAI router changes tests Test changes
