refactor(openai-bridge): finish descriptor pattern, fix FileSearch routing gap (#1450)
…ype_str reverse lookup
The descriptor's id_prefix carried a trailing underscore ("ws_",
"mcp_", …) for legacy reasons. Drop it so the field is the discriminator
itself; the one join site adds the underscore via format!() instead.
Lets the grpc emitter consume the prefix directly without trimming
(see follow-up commit) and matches the convention used by every other
descriptor field (no trailing punctuation).
Adds format_from_type_str(&str) -> Option<ResponseFormat>: the reverse
of descriptor(format).type_str. Routers that received an item-type
string from the upstream wire and need to consult the descriptor were
previously re-implementing the per-ItemType match by hand. Replaces the
five-arm match in routers/openai/responses/streaming.rs with a
descriptor read.
Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
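A minimal sketch of the shape this commit describes — a trimmed-down descriptor table with the bare `id_prefix`, the `format_from_type_str` reverse lookup, and the one join site adding the underscore via `format!()`. The variant names, fields, and wire strings here are illustrative stand-ins, not the repo's full table:

```rust
// Hypothetical, trimmed-down mirror of the descriptor pattern described above.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ResponseFormat {
    WebSearchCall,
    McpCall,
}

struct FormatDescriptor {
    // Bare discriminator: no trailing underscore.
    id_prefix: &'static str,
    // Wire item-type string, e.g. "web_search_call".
    type_str: &'static str,
}

fn descriptor(format: ResponseFormat) -> FormatDescriptor {
    match format {
        ResponseFormat::WebSearchCall => FormatDescriptor { id_prefix: "ws", type_str: "web_search_call" },
        ResponseFormat::McpCall => FormatDescriptor { id_prefix: "mcp", type_str: "mcp_call" },
    }
}

// Reverse lookup: wire item-type string -> format, replacing the hand-written
// per-ItemType matches at routing sites.
fn format_from_type_str(type_str: &str) -> Option<ResponseFormat> {
    [ResponseFormat::WebSearchCall, ResponseFormat::McpCall]
        .into_iter()
        .find(|f| descriptor(*f).type_str == type_str)
}

fn main() {
    // The one join site adds the underscore itself.
    let item_id = format!("{}_{}", descriptor(ResponseFormat::WebSearchCall).id_prefix, "abc123");
    assert_eq!(item_id, "ws_abc123");
    assert_eq!(format_from_type_str("mcp_call"), Some(ResponseFormat::McpCall));
    assert_eq!(format_from_type_str("unknown"), None);
}
```

With the prefix stored bare, a consumer like the grpc emitter can use it directly as a discriminator, and only the formatting site pays for the underscore.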
…iptor.id_prefix
The grpc-router-private OutputItemType enum had two roles bundled
together: a discriminator for the four non-format kinds (Message,
McpListTools, FunctionCall, Reasoning) AND a parallel mirror of the
five ResponseFormat variants. The mirror is dead weight — the prefixes
are already in FormatDescriptor.id_prefix.
Split the enum into OutputItemKind (the four non-format kinds only),
expose two allocator entry points on ResponseStreamEventEmitter:
* allocate_output_index(OutputItemKind) — for the 4 non-format kinds
* allocate_output_index_for_format(Option<ResponseFormat>) — derives
the prefix from descriptor(format).id_prefix; None falls back to
"fc" (function_call), matching the prior output_item_type_for_format
behaviour.
Deletes output_item_type_for_format. New builtin formats now require
zero edits in grpc/common/responses/streaming.rs — adding a row to the
FormatDescriptor table is enough.
Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
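The split and the two allocator entry points can be sketched as follows. The non-format prefixes (`"msg"`, `"mcpl"`, `"rs"`) and the emitter internals are invented for illustration; only the names `OutputItemKind`, `allocate_output_index`, `allocate_output_index_for_format`, and the `"fc"` fallback come from the commit text:

```rust
// Hedged sketch of the enum split described above.
#[derive(Clone, Copy)]
enum OutputItemKind {
    Message,
    McpListTools,
    FunctionCall,
    Reasoning,
}

#[derive(Clone, Copy)]
enum ResponseFormat {
    WebSearchCall,
}

// Stand-in for descriptor(format).id_prefix.
fn id_prefix(format: ResponseFormat) -> &'static str {
    match format {
        ResponseFormat::WebSearchCall => "ws",
    }
}

struct ResponseStreamEventEmitter {
    next_index: usize,
}

impl ResponseStreamEventEmitter {
    // Non-format kinds keep their own fixed prefixes (values assumed here).
    fn allocate_output_index(&mut self, kind: OutputItemKind) -> (usize, &'static str) {
        let prefix = match kind {
            OutputItemKind::Message => "msg",
            OutputItemKind::McpListTools => "mcpl",
            OutputItemKind::FunctionCall => "fc",
            OutputItemKind::Reasoning => "rs",
        };
        let idx = self.next_index;
        self.next_index += 1;
        (idx, prefix)
    }

    // Format-driven path: the prefix comes from the descriptor table;
    // None falls back to "fc", matching the old output_item_type_for_format.
    fn allocate_output_index_for_format(&mut self, format: Option<ResponseFormat>) -> (usize, &'static str) {
        let prefix = format.map(id_prefix).unwrap_or("fc");
        let idx = self.next_index;
        self.next_index += 1;
        (idx, prefix)
    }
}

fn main() {
    let mut emitter = ResponseStreamEventEmitter { next_index: 0 };
    assert_eq!(emitter.allocate_output_index(OutputItemKind::Message), (0, "msg"));
    assert_eq!(emitter.allocate_output_index_for_format(Some(ResponseFormat::WebSearchCall)), (1, "ws"));
    assert_eq!(emitter.allocate_output_index_for_format(None), (2, "fc"));
}
```

The point of the split: a new builtin format only adds a descriptor row; the emitter's format path needs no edits because it never enumerates formats itself.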
…scriptor
Four sites in tool_loop.rs each independently matched on ResponseFormat:
* send_tool_call_intermediate_event — searching event per format
* send_tool_call_completion_events — completed event per item type
* stable_streaming_tool_item_id — id prefix per format
* non_streaming_tool_item_id_source — Passthrough vs hosted decision
Each was a hand-written enumeration of the five ResponseFormat variants
that drifted from FormatDescriptor's existing fields
(searching_event / completed_event / id_prefix). Replace with descriptor
reads:
* descriptor(format).searching_event (None short-circuits Passthrough)
* descriptor(format).completed_event (with format_from_type_str
fallback when the routed item-type re-tagged the call)
* descriptor(format).id_prefix joined to the suffix at the format!()
site
Drops the per-event-name and per-prefix imports the old code needed.
Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
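One of the four descriptor reads can be sketched like this. The field names (`searching_event` / `completed_event` / `id_prefix`) come from the commit text; the event-name strings and the `Passthrough` row are assumptions for illustration:

```rust
// Illustrative only: not the repo's actual descriptor table.
#[derive(Clone, Copy)]
enum ResponseFormat {
    Passthrough,
    WebSearchCall,
}

struct FormatDescriptor {
    searching_event: Option<&'static str>,
    completed_event: Option<&'static str>,
    id_prefix: &'static str,
}

fn descriptor(format: ResponseFormat) -> FormatDescriptor {
    match format {
        // Passthrough has no hosted lifecycle events.
        ResponseFormat::Passthrough => FormatDescriptor {
            searching_event: None,
            completed_event: None,
            id_prefix: "fc",
        },
        ResponseFormat::WebSearchCall => FormatDescriptor {
            searching_event: Some("response.web_search_call.searching"),
            completed_event: Some("response.web_search_call.completed"),
            id_prefix: "ws",
        },
    }
}

// One of the four former match sites, now a descriptor read:
// None short-circuits the Passthrough case instead of a dedicated match arm.
fn send_tool_call_intermediate_event(format: ResponseFormat) -> Option<String> {
    let event = descriptor(format).searching_event?;
    Some(format!("event: {event}"))
}

fn main() {
    assert!(send_tool_call_intermediate_event(ResponseFormat::Passthrough).is_none());
    assert_eq!(
        send_tool_call_intermediate_event(ResponseFormat::WebSearchCall).as_deref(),
        Some("event: response.web_search_call.searching")
    );
}
```

Each of the four sites collapses to the same pattern: read the field, `?`-short-circuit on `None`, and let the table carry the per-format knowledge.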
…ileSearch

Three sites (mcp_utils.rs collect_builtin_routing + extract_builtin_types, grpc/common/responses/utils.rs ensure_mcp_connection has-builtin-tools) each hand-rolled the same ResponseTool::* match against three of the four hosted families — every one of them forgot ResponseTool::FileSearch. That meant a request carrying file_search wasn't recognized as needing MCP routing in any of the three predicates, so the request short-circuited the MCP loop and dispatched the file_search tool as a plain function call. Adding a new builtin would have meant remembering to edit all three call sites.

Add openai_bridge::builtin_type_for_response_tool(&ResponseTool) -> Option<BuiltinToolType> as the single source of truth (covers all four hosted families) and route the three sites through it. The FileSearch gap is fixed as a side effect.

Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
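The single source of truth might look like the following sketch. The enum variants are trimmed stand-ins (the real `ResponseTool` variants carry payloads); the mapping's shape is what matters:

```rust
// Sketch of the centralized mapping; enum definitions trimmed for illustration.
enum ResponseTool {
    WebSearchPreview,
    CodeInterpreter,
    ImageGeneration,
    FileSearch,
    Function, // plain function tools carry no builtin type
}

#[derive(Debug, PartialEq)]
enum BuiltinToolType {
    WebSearchPreview,
    CodeInterpreter,
    ImageGeneration,
    FileSearch,
}

// All four hosted families in one place: the three predicates share this,
// so a forgotten family is a hole here rather than a silent gap at a call site.
fn builtin_type_for_response_tool(tool: &ResponseTool) -> Option<BuiltinToolType> {
    match tool {
        ResponseTool::WebSearchPreview => Some(BuiltinToolType::WebSearchPreview),
        ResponseTool::CodeInterpreter => Some(BuiltinToolType::CodeInterpreter),
        ResponseTool::ImageGeneration => Some(BuiltinToolType::ImageGeneration),
        ResponseTool::FileSearch => Some(BuiltinToolType::FileSearch),
        ResponseTool::Function => None,
    }
}

fn main() {
    // The previously forgotten family is now covered.
    assert_eq!(builtin_type_for_response_tool(&ResponseTool::FileSearch), Some(BuiltinToolType::FileSearch));
    assert_eq!(builtin_type_for_response_tool(&ResponseTool::Function), None);
}
```

A predicate like has-builtin-tools then becomes `tools.iter().any(|t| builtin_type_for_response_tool(t).is_some())`, with no per-site enumeration to drift.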
… dispatch

The two non-streaming tool execution paths (harmony + grpc-regular) called lookup_tool_format twice per tool call — once during input preparation (for prepare_hosted_dispatch_args) and again during output processing (for transform_tool_output). Each lookup allocates two Arc<str>s for the QualifiedToolName, so this was 4 extra allocations per tool call.

session.execute_tools is buffered() and preserves input ordering, so we can resolve the format once per call up-front, store it in a parallel Vec<ResponseFormat>, unzip into (inputs, formats), and zip back through the output processing pass.

Streaming paths already do a single lookup per call, so they're unaffected.

Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
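The resolve-once / unzip / zip shape can be sketched as below, with toy stand-ins for `lookup_tool_format` and `session.execute_tools` (the real types and signatures differ); the length assertion pins the `buffered()` ordering invariant the commit relies on:

```rust
// Sketch of the once-per-call resolution; types are stand-ins for the real
// ToolExecutionInput / ResponseFormat machinery.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ResponseFormat { Passthrough, WebSearchCall }

fn lookup_tool_format(name: &str) -> ResponseFormat {
    // Stand-in for the registry lookup that allocates two Arc<str>s per call.
    if name == "web_search" { ResponseFormat::WebSearchCall } else { ResponseFormat::Passthrough }
}

fn execute_tools(inputs: Vec<String>) -> Vec<String> {
    // Stand-in for session.execute_tools: buffered(), so output order and
    // length match the inputs.
    inputs.into_iter().map(|name| format!("output:{name}")).collect()
}

fn main() {
    let calls = vec!["web_search".to_string(), "my_fn".to_string()];

    // Resolve each format exactly once, up front...
    let (inputs, formats): (Vec<String>, Vec<ResponseFormat>) = calls
        .into_iter()
        .map(|name| { let f = lookup_tool_format(&name); (name, f) })
        .unzip();

    let outputs = execute_tools(inputs);

    // ...and check the ordering invariant instead of trusting a comment:
    // a mismatch here would otherwise let zip silently drop tool results.
    assert_eq!(outputs.len(), formats.len());

    let transformed: Vec<String> = outputs
        .into_iter()
        .zip(formats)
        .map(|(out, fmt)| format!("{out} ({fmt:?})"))
        .collect();

    assert_eq!(transformed[0], "output:web_search (WebSearchCall)");
}
```

The saving is simply that the second lookup during output processing disappears: the format rides along in the parallel `Vec` instead of being re-derived.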
…tool

lookup_tool_format silently returned Passthrough whenever the registry had no entry for a tool — even when the session DID expose the tool. That asymmetric miss is the fingerprint of a registration-path bug (orchestrator added the tool, but populate_from_server_config wasn't called on the same server), and would otherwise dispatch as mcp_call in production with no signal.

Add tracing::debug carrying the qualified name, gated on the asymmetry (session-known + registry-missing). Adds FormatRegistry::contains so the gate distinguishes a registered Some(Passthrough) override from a missing entry.

Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
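The gate can be sketched as below. The real code uses `tracing::debug` and the bridge's session/registry types; both are mocked here, and the function name is invented for illustration:

```rust
// Sketch of the asymmetric-miss gate described above.
use std::collections::HashSet;

struct FormatRegistry {
    entries: HashSet<String>,
}

impl FormatRegistry {
    // New: distinguishes a registered Passthrough override from a missing entry.
    fn contains(&self, qualified_name: &str) -> bool {
        self.entries.contains(qualified_name)
    }
}

/// Returns true when the miss is asymmetric: the session exposes the tool but
/// the registry has no entry — the fingerprint of a registration-path bug.
fn should_log_registry_miss(session_knows_tool: bool, registry: &FormatRegistry, qualified_name: &str) -> bool {
    if session_knows_tool && !registry.contains(qualified_name) {
        // Real code: tracing::debug!(tool = qualified_name, "session-known tool missing from format registry");
        eprintln!("debug: session-known tool missing from format registry: {qualified_name}");
        true
    } else {
        false
    }
}

fn main() {
    let registry = FormatRegistry { entries: HashSet::from(["srv/known_tool".to_string()]) };
    // Registered entry (even a Passthrough override): no log.
    assert!(!should_log_registry_miss(true, &registry, "srv/known_tool"));
    // Session-known but registry-missing: the log fires.
    assert!(should_log_registry_miss(true, &registry, "srv/orphan_tool"));
    // Session-unknown: symmetric miss, no log.
    assert!(!should_log_registry_miss(false, &registry, "srv/orphan_tool"));
}
```

Without `contains`, a `lookup` returning `Passthrough` is ambiguous — it could be a deliberate override or a missing entry — which is exactly why the gate needs the separate membership check.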
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough

Centralizes ResponseTool→BuiltinToolType and ItemType→ResponseFormat mappings in openai_bridge, refactors output-item allocation to use format-driven prefixes and a new OutputItemKind, resolves per-tool ResponseFormat once during tool-call preparation, and normalizes tool-item IDs/event selection across streaming and non-streaming paths.

Changes
- OpenAI Mapping & Registry
- Output-Item Allocation & Per-Call Format Resolution
- OpenAI ID Normalization & Event Selection
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Session
    participant OpenAI_Bridge
    participant FormatRegistry
    participant Emitter
    Client->>Session: request with tool calls
    Session->>OpenAI_Bridge: prepare ToolExecutionInput (resolve ResponseFormat once)
    OpenAI_Bridge->>FormatRegistry: lookup_tool_format / contains
    FormatRegistry-->>OpenAI_Bridge: ResponseFormat or Passthrough
    OpenAI_Bridge->>Session: execute_tools(inputs)
    Session-->>OpenAI_Bridge: tool outputs[]
    OpenAI_Bridge->>OpenAI_Bridge: transform_tool_output(output, paired ResponseFormat)
    OpenAI_Bridge->>Emitter: allocate_output_index_for_format(ResponseFormat) / emit events
    Emitter-->>Client: streamed/non-streamed events with normalized IDs
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: 5 passed ✅
```diff
-fn normalize_tool_item_id_with_prefix(source_id: &str, target_prefix: &str) -> String {
-    if source_id.starts_with(target_prefix) {
+fn normalize_tool_item_id_with_prefix(source_id: &str, prefix: &str) -> String {
```
🔴 Important: The contract of this function changed — it now expects a bare prefix (no trailing _) and appends _ itself on line 596. But the caller at line 623 (approval_request_item_id_source) still passes "mcpr_" with the old convention, producing double-underscore IDs like mcpr__something instead of mcpr_something.
Fix: change line 623 to normalize_tool_item_id_with_prefix(item_id, "mcpr").
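The mismatch is easy to reproduce with an assumed shape for the normalizer (the real tool_loop.rs version handles more source-id forms than this sketch):

```rust
// Repro of the contract mismatch described above; the foreign-prefix
// stripping logic is an assumption for illustration.
fn normalize_tool_item_id_with_prefix(source_id: &str, prefix: &str) -> String {
    let prefix_with_underscore = format!("{prefix}_");
    if source_id.starts_with(&prefix_with_underscore) {
        source_id.to_string()
    } else {
        // Strip a known foreign prefix like "fc_"/"call_" before re-tagging;
        // a raw id is re-tagged as-is.
        let suffix = source_id
            .split_once('_')
            .map(|(_, rest)| rest)
            .unwrap_or(source_id);
        format!("{prefix_with_underscore}{suffix}")
    }
}

fn main() {
    // Old-convention caller still passing a trailing underscore: double underscore.
    assert_eq!(normalize_tool_item_id_with_prefix("abc123", "mcpr_"), "mcpr__abc123");
    // Fixed caller passing the bare discriminator: single underscore.
    assert_eq!(normalize_tool_item_id_with_prefix("abc123", "mcpr"), "mcpr_abc123");
    assert_eq!(normalize_tool_item_id_with_prefix("fc_abc123", "mcpr"), "mcpr_abc123");
    // Already-normalized ids pass through unchanged.
    assert_eq!(normalize_tool_item_id_with_prefix("mcpr_abc123", "mcpr"), "mcpr_abc123");
}
```

The first assertion is the bug: with `"mcpr_"` the starts-with guard checks for `"mcpr__"`, never matches, and the join doubles the underscore.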
Code Review
This pull request centralizes tool type mapping and response format descriptors within the openai_bridge module, replacing redundant match statements with unified lookup functions. It optimizes tool execution loops by resolving response formats once per call to reduce allocations and refactors output item ID prefixing and event resolution to use the centralized descriptors. Telemetry is also improved with debug logging for registry misses. I have no feedback to provide.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 283f27f1a1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```diff
-fn normalize_tool_item_id_with_prefix(source_id: &str, target_prefix: &str) -> String {
-    if source_id.starts_with(target_prefix) {
+fn normalize_tool_item_id_with_prefix(source_id: &str, prefix: &str) -> String {
+    let prefix_with_underscore = format!("{prefix}_");
```
Avoid double-underscore approval item IDs
normalize_tool_item_id_with_prefix now always appends an underscore to the provided prefix, but approval_request_item_id_source still passes "mcpr_", so approval IDs become mcpr__... instead of mcpr_.... This silently changes the wire ID shape for mcp_approval_request items and can break downstream clients that parse or correlate approval IDs using the single-underscore format; pass the bare discriminator ("mcpr") or make normalization tolerate already-suffixed prefixes.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/responses/non_streaming.rs (1)
353-388: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Don't let the unzip/zip refactor silently truncate tool batches here.

This loop also assumes one `ToolExecutionResult` per prepared input. If `session.execute_tools` ever returns fewer or more entries, `zip` drops the remainder and the tail tools never make it into metrics or loop state. Please check the lengths before iterating.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@model_gateway/src/routers/grpc/regular/responses/non_streaming.rs` around lines 353 - 388, The current refactor zips results and formats which can silently drop items if session.execute_tools(inputs) returns a different length; before iterating, compare results.len() and formats.len() (and inputs.len() if relevant) and handle mismatches explicitly: log an error via the existing logger with clear context, include the call_ids or tool names from prepared/inputs, and decide a deterministic handling strategy (e.g., iterate by index up to max length and treat missing entries as failures or create placeholder ToolExecutionResult entries so metrics/loop-state still account for every prepared input). Update the loop that consumes results.into_iter().zip(formats) to use indexed iteration (by index into results and formats) or to iterate over a zipped iterator built from aligned vectors after applying the chosen mismatch handling; reference prepared, inputs, formats, results, and session.execute_tools in your changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@model_gateway/src/routers/common/openai_bridge/format_registry.rs`:
- Around line 34-44: Add a unit test that exercises the asymmetric-miss branch
of lookup_tool_format by constructing a session-exposed ToolQN (use the same
exposed_name/session path used elsewhere), ensuring the registry.lookup returns
ResponseFormat::Passthrough while registry.contains(&qn) is false, and asserting
that lookup_tool_format returns Passthrough and (optionally) that a debug log is
emitted; specifically target the lookup_tool_format helper so the path that
calls registry.lookup, checks ResponseFormat::Passthrough and
!registry.contains(&qn) is executed. Use the existing FormatRegistry test
fixtures or create a minimal registry stub where lookup(&qn) ->
ResponseFormat::Passthrough but contains(&qn) -> false, and call
lookup_tool_format with that registry and a session-exposed tool QN to lock in
the regression guard.
In `@model_gateway/src/routers/grpc/harmony/responses/execution.rs`:
- Around line 113-127: The code assumes outputs.len() == formats.len() before
zipping outputs and formats (after calling session.execute_tools(inputs)), so
add an explicit length check right after let outputs =
session.execute_tools(inputs).await; that compares outputs.len() and
formats.len() and fails loudly if they differ (e.g., return Err / log error and
propagate or panic/assert) to avoid silent truncation when mapping into
ToolResult in the closure that follows; reference the prepared tuple unzip, the
session.execute_tools call, the outputs and formats variables, and the
ToolResult mapping so the check is inserted before the zip/map loop.
In `@model_gateway/src/routers/openai/mcp/tool_loop.rs`:
- Around line 595-605: The normalize_tool_item_id_with_prefix function now adds
the underscore internally, so callers like approval_request_item_id_source
should pass the raw prefix without an underscore (e.g., "mcpr" instead of
"mcpr_"); update all calls that still use the trailing underscore (including the
occurrences around lines 595 and 622-624) to pass "mcpr" (and any similar
prefixes) so IDs remain stable and do not become double-underscored.
📒 Files selected for processing (13)
- model_gateway/src/routers/common/mcp_utils.rs
- model_gateway/src/routers/common/openai_bridge/format_descriptor.rs
- model_gateway/src/routers/common/openai_bridge/format_registry.rs
- model_gateway/src/routers/common/openai_bridge/mod.rs
- model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs
- model_gateway/src/routers/grpc/common/responses/streaming.rs
- model_gateway/src/routers/grpc/common/responses/utils.rs
- model_gateway/src/routers/grpc/harmony/responses/execution.rs
- model_gateway/src/routers/grpc/harmony/streaming.rs
- model_gateway/src/routers/grpc/regular/responses/non_streaming.rs
- model_gateway/src/routers/grpc/regular/responses/streaming.rs
- model_gateway/src/routers/openai/mcp/tool_loop.rs
- model_gateway/src/routers/openai/responses/streaming.rs
Hi @slin1237, the DCO sign-off check has failed. All commits must include a `Signed-off-by` line.

To fix existing commits:

```shell
# Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-lease
```

To sign off future commits automatically:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d53053d84e
```rust
match tool {
    ResponseTool::WebSearchPreview(_) => Some(BuiltinToolType::WebSearchPreview),
    ResponseTool::CodeInterpreter(_) => Some(BuiltinToolType::CodeInterpreter),
    ResponseTool::FileSearch(_) => Some(BuiltinToolType::FileSearch),
```
Avoid forcing MCP for native file_search requests
When a Responses request includes file_search on a deployment without MCP components, this new mapping makes request_uses_mcp_routing() return true; both OpenAI Responses entrypoints then immediately return internal_error while fetching mcp_orchestrator/mcp_format_registry (checked model_gateway/src/routers/openai/responses/non_streaming.rs:64-76 and streaming.rs:1080-1092) before ensure_request_mcp_client() can determine that no MCP file-search server is configured and pass the native tool through. That regresses file-search-only requests that previously flowed to the upstream worker; gate FileSearch on actual MCP routing availability or allow the component-less passthrough path.
`normalize_tool_item_id_with_prefix` was refactored in 2d10a9b to take a bare prefix token and append `_` internally. The `approval_request_item_id_source` call site still passed `"mcpr_"`, producing `mcpr__...` ids on the wire and breaking the approval flow's client-visible id contract.

Pass the bare `"mcpr"` prefix and add a regression test that pins the single-underscore convention across the four input shapes the normalizer accepts (fc_, call_, existing-mcpr_, raw).

Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
The 4d27779 hoist zips per-call `ResponseFormat` against `session.execute_tools` outputs. The order-and-length guarantee comes from `buffered()`, currently captured only in a comment. If a future refactor breaks that invariant, `zip` would silently truncate to the shorter side and drop tool results — a bug class that is hard to notice in production.

Promote the comment to an `assert_eq!` on both call sites so a regression there panics loudly instead of dropping outputs.

Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
`lookup_tool_format` (added in 283f27f) had no direct unit coverage — existing tests only exercised `FormatRegistry::lookup_by_names`, which bypasses the session lookup. Add three tests for the composed helper:

- session unknown → Passthrough (early return path)
- session known + registry empty → Passthrough (the asymmetric-miss branch that emits the registration-bug debug log)
- session known + registry has hosted entry → registry value wins

Tests construct an `McpOrchestrator` and live `McpToolSession` so the session's `qualified_name_for_exposed` is actually exercised.

Signed-off-by: Simo Lin <25425177+slin1237@users.noreply.github.com>
Force-pushed: d53053d to bbc603f
Summary
Follow-up to the PR #1429 / #1442 audit. The bridge extraction landed cleanly (goals 1 & 2 met), but the audit identified that the `FormatDescriptor` pattern was only half-deployed — eight `match`/`matches!` sites against `ResponseFormat` remained outside the bridge, and three sites that mapped `ResponseTool::*` → `BuiltinToolType` had each independently forgotten `FileSearch`. This PR closes those gaps plus the perf and telemetry follow-ups.

After this change, adding a new builtin (e.g. a hypothetical `WebSearchAdvancedCall` with the existing `in_progress → searching → completed` lifecycle) requires:

- `crates/mcp/`: a `BuiltinToolType` variant + `Display` arm + `response_format()` arm + `ResponseFormatConfig` variant. (Pre-#1429 ("refactor(mcp): extract OpenAI-flavored bridge into router-side module") baseline; intentional — these are MCP-side routing config.)
- `routers/common/openai_bridge/`: `response_format.rs` (variant + From/to_builtin), `format_descriptor.rs` (one row in `descriptor()`), `transformer.rs` (one match arm), and possibly `overrides.rs` if the new builtin has hosted-tool overrides.
- `routers/openai/`, `routers/grpc/`, `routers/anthropic/`: no edits. The descriptor table is now the single per-format dispatch surface.

The audit predicted ~12 files across 3 areas would need editing; after this PR it's actually ~5 files in 2 areas (mcp config + bridge directory).
Commits
- f069e886 — Drop the trailing `_` from `descriptor.id_prefix` (`"ws_"` → `"ws"`); add the `format_from_type_str` reverse lookup. Replace the per-`ItemType` event match in `routers/openai/responses/streaming.rs` with `descriptor(format_from_type_str(t)?).in_progress_event`.
- 074d8cd1 — Split `OutputItemType` into the 4 non-format `OutputItemKind` variants only; add `allocate_output_index_for_format(Option<ResponseFormat>)`, which derives the prefix from `descriptor(format).id_prefix`. Delete the dead format mirror + `output_item_type_for_format` map.
- 2d10a9b4 — Move the four per-format matches in `routers/openai/mcp/tool_loop.rs` (intermediate event, completed event, id prefix, prefix-stripping) to descriptor reads.
- d9d362d8 — Add `openai_bridge::builtin_type_for_response_tool(&ResponseTool) → Option<BuiltinToolType>` as the single mapping. Route the three duplicate sites through it (`mcp_utils::collect_builtin_routing` + `extract_builtin_types`, `grpc/common/responses/utils::ensure_mcp_connection`). Fixes the latent `ResponseTool::FileSearch` gap that all three sites had — file_search requests now correctly route through MCP.
- 4d27779f — Hoist `lookup_tool_format` resolution to a single up-front pass in the harmony non-streaming and grpc-regular non-streaming execution paths. Each tool call now allocates 2 `Arc<str>`s instead of 4. Also fixes the stale `to_response_item()` doc comment in `harmony/responses/execution.rs`.
- 283f27f1 — Add a `tracing::debug` in `lookup_tool_format` when the session has the tool exposed but the registry doesn't. Surfaces a registration-path bug that would otherwise dispatch silently as `mcp_call`.
Three independent matches in
mcp_utils.rsandgrpc/common/responses/utils.rscoveredWebSearchPreview / CodeInterpreter / ImageGenerationbut all three forgotFileSearch. This means a request carryingfile_searchwas never recognized as needing MCP routing — the request short-circuited the MCP loop and dispatched the tool as a plain function call. Centralizing the mapping in the bridge fixes all three at once and prevents the same mistake when the next builtin lands.Test plan
- `cargo build -p smg` — clean
- `cargo clippy --all-targets --all-features -- -D warnings` — clean
- `cargo fmt --all -- --check` — clean
- `cargo test -p smg --lib openai_bridge::` — 52 pass
- `cargo test -p smg --lib openai::mcp::tool_loop` — 13 pass (the four `*_completion_events_fire_before_output_item_done` tests cover the descriptor refactor in commit 3)
- `cargo test -p smg --test mcp_test` — 23 pass
- `cargo test -p smg --lib` — 744 pass