feat(proxy): raw-passthrough /v1/chat/completions — fixes Codex / OpenAI tool calls by KillerQueen-Z · Pull Request #7 · BlockRunAI/blockrun-litellm

KillerQueen-Z · 2026-06-12T09:18:37Z

Problem

/v1/chat/completions still went through the SDK's typed chat_completion_stream, which crashes on streamed tool calls:

File ".../blockrun_llm/solana_client.py", _aiter_and_archive
    if choice.delta.content:
AttributeError: 'dict' object has no attribute 'delta'

Root cause (reproduced on Solana with a real model):

1. The SDK's ToolCall schema requires id / function.name / arguments, but streaming tool-call argument-fragment frames (id/name absent, partial args) don't satisfy it.
2. _aiter_sse_chunks therefore falls back to ChatCompletionChunk.model_construct(...), which doesn't parse nested models → choices stay as raw dicts.
3. _aiter_and_archive then does choice.delta.content on a dict → crash.
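
The failure chain above can be sketched without the SDK. This is a minimal stand-in, not the SDK's code: `Chunk`, `validate`, `construct` and `parse` are hypothetical names, and `construct` only imitates the relevant behavior of a non-validating constructor (fields assigned as-is, nested values left unparsed).

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Delta:
    content: Optional[str] = None


@dataclass
class Choice:
    delta: Delta


class Chunk:
    """Hypothetical stand-in for a typed streaming chunk (not the SDK class)."""

    def __init__(self, choices: List[Any]):
        self.choices = choices

    @classmethod
    def validate(cls, raw: dict) -> "Chunk":
        # Strict path: every tool call must carry an id and a function name.
        choices = []
        for c in raw["choices"]:
            for tc in c["delta"].get("tool_calls", []):
                if "id" not in tc or "name" not in tc.get("function", {}):
                    raise ValueError("tool-call fragment fails the strict schema")
            choices.append(Choice(delta=Delta(content=c["delta"].get("content"))))
        return cls(choices)

    @classmethod
    def construct(cls, raw: dict) -> "Chunk":
        # Lenient path: values are stored untouched, so choices stay raw dicts.
        return cls(raw["choices"])


def parse(raw: dict) -> Chunk:
    """Try strict validation first, then fall back to the lenient constructor."""
    try:
        return Chunk.validate(raw)
    except ValueError:
        return Chunk.construct(raw)


text_frame = {"choices": [{"delta": {"content": "Hi"}}]}
fragment_frame = {
    "choices": [
        {"delta": {"tool_calls": [{"index": 0, "function": {"arguments": '{"ci'}}]}}
    ]
}

# Plain text validates, so attribute access works.
assert parse(text_frame).choices[0].delta.content == "Hi"

# An argument fragment takes the fallback path and attribute access fails.
error = None
try:
    parse(fragment_frame).choices[0].delta.content
except AttributeError as exc:
    error = str(exc)
```

Here the text frame goes through the strict path, while the fragment frame comes back with dict choices, so reading `.delta` raises the same `AttributeError` as in the traceback.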

Plain chat works (text frames validate fine); any tool call breaks. This hits every OpenAI client doing tool use — notably Codex with wire_api=chat.

Fix — extend the verbatim passthrough to `/v1/chat/completions`

#6 made /v1/messages a byte-for-byte x402-signed passthrough. This generalizes that helper (_forward_anthropic → _forward_passthrough, taking a headers arg) and routes /v1/chat/completions through it too.

- Body forwarded byte-for-byte, only the x402 signature added → the SDK's streaming/parsing/archiving is no longer in the hot path, so the crash can't be reached and streamed tool_calls survive intact.
- Reuses the merged Anthropic path's semaphore gating + real-upstream-status handling (no more unconditional 200 on upstream errors).
- One sidecar now serves both Claude Code (/v1/messages) and Codex / OpenAI clients (/v1/chat/completions), each as a pure signing passthrough.
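
As a rough illustration of the shape described above (not the actual `_forward_passthrough`), a signing passthrough could look like the sketch below. Everything here is an assumption for illustration: the function signature, the injected `sign` and `send` callables, and the `X-PAYMENT` header name. How the x402 signature is actually negotiated and produced is out of scope.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# Hypothetical transport: (path, headers, body) -> (status, response bytes).
Send = Callable[[str, Dict[str, str], bytes], Awaitable[Tuple[int, bytes]]]


async def forward_passthrough(
    path: str,
    body: bytes,
    headers: Dict[str, str],
    sign: Callable[[bytes], str],
    send: Send,
    gate: asyncio.Semaphore,
) -> Tuple[int, bytes]:
    """Forward the body unchanged, adding only a payment signature header."""
    out_headers = dict(headers)
    out_headers["X-PAYMENT"] = sign(body)  # header name is an assumption
    async with gate:  # bound the number of concurrent upstream requests
        status, payload = await send(path, out_headers, body)
    # Hand back the upstream status as-is instead of an unconditional 200.
    return status, payload


async def _demo():
    seen = {}

    async def fake_send(path, headers, body):
        # Record what the "upstream" received, then simulate an upstream error.
        seen.update(path=path, headers=headers, body=body)
        return 402, b'{"error":"payment required"}'

    raw = b'{"model":"m","stream":true,"tools":[]}'
    result = await forward_passthrough(
        "/v1/chat/completions",
        raw,
        {"Content-Type": "application/json"},
        sign=lambda b: f"sig-{len(b)}",
        send=fake_send,
        gate=asyncio.Semaphore(2),
    )
    return raw, seen, result


raw, seen, result = asyncio.run(_demo())
```

The body is never decoded or re-encoded, so nothing in this path can trip over a partial tool-call frame, and the caller sees the upstream's own status code.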

This is the structural direction: the proxy is a thin x402 signer; the gateway owns the protocol. No edge translation, so SDK streaming bugs can't surface in the proxy.

Verification (real, end-to-end)

On Solana (sol.blockrun.ai) with real paid models:

| Endpoint | Before | After |
| --- | --- | --- |
| `/v1/chat/completions` stream + tools | ❌ `'dict' object has no attribute 'delta'` | ✅ `tool_calls` + full args `{"city":"Madrid"}` + `finish_reason=tool_calls` |
| `/v1/messages` stream + tools | ✅ | ✅ (unchanged) |
| non-stream chat | ✅ | ✅ |
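
For context on the "After" column of the first row: an OpenAI-style client rebuilds each tool call by concatenating the `arguments` fragments that share an `index` across streamed frames. The sketch below shows that reassembly on hand-written frames. The frames, the `call_1` id and the `get_weather` name are made up for illustration and are not captured from the gateway.

```python
import json


def assemble_tool_calls(frames):
    """Merge streamed tool-call deltas into complete calls, keyed by index."""
    calls = {}
    finish_reason = None
    for frame in frames:
        choice = frame["choices"][0]
        for tc in choice["delta"].get("tool_calls", []):
            slot = calls.setdefault(
                tc["index"], {"id": None, "name": None, "arguments": ""}
            )
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function") or {}
            if fn.get("name"):
                slot["name"] = fn["name"]
            # Later frames carry only a partial arguments string.
            slot["arguments"] += fn.get("arguments") or ""
        finish_reason = choice.get("finish_reason") or finish_reason
    return [calls[i] for i in sorted(calls)], finish_reason


frames = [
    # First frame: id and name present, arguments still empty.
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "id": "call_1",
         "function": {"name": "get_weather", "arguments": ""}}]}}]},
    # Fragment frames: no id, no name, partial arguments only.
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"city":'}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"Madrid"}'}}]}}]},
    # Final frame: empty delta plus the finish reason.
    {"choices": [{"delta": {}, "finish_reason": "tool_calls"}]},
]

calls, finish_reason = assemble_tool_calls(frames)
args = json.loads(calls[0]["arguments"])
```

The middle frames are exactly the id-less, name-less fragments that the strict schema rejected, which is why they need to reach the client untouched.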

pytest: 84 passed (1 pre-existing litellm-version canary deselected). Compile clean.

Note on `/v1/responses` (Codex Responses API)

The gateway has no native /v1/responses, so it can't be a verbatim passthrough; the existing bridge stays as-is. Recommended Codex config is wire_api = "chat" → routes through this fixed /v1/chat/completions. A native gateway /v1/responses would be the clean long-term answer for wire_api=responses.

… tool calls)

/v1/chat/completions went through the SDK's typed chat_completion_stream, which crashes on streamed tool calls: the strict ToolCall schema rejects streaming argument-fragment frames, _aiter_sse_chunks falls back to model_construct (leaving choices as raw dicts), and _aiter_and_archive then reads `.delta` on a dict: `'dict' object has no attribute 'delta'`. Any OpenAI client doing tool calls (Codex with wire_api=chat, etc.) hit this.

Generalize the verbatim passthrough already used for /v1/messages (_forward_anthropic -> _forward_passthrough, taking a headers arg) and route /v1/chat/completions through it too. The body is forwarded byte-for-byte with only the x402 signature added, so the SDK's streaming/parsing/archiving is no longer in the hot path and streamed tool_calls survive intact. Keeps the semaphore gating and real-upstream-status handling from the Anthropic path.

Verified on Solana with real paid models: /v1/chat/completions streaming + tools now returns tool_calls with full arguments and finish_reason=tool_calls (no crash); /v1/messages unchanged. 84 tests pass.

Replacing the SDK-based /v1/chat/completions handler with a raw passthrough left _sse_event_stream (and its only-consumer helper _openai_error_event) with no callers. Remove them. The /v1/responses bridge keeps its own _responses_sse_stream + the shared _payment_error_* helpers, which are still used.

KillerQueen-Z added 2 commits June 12, 2026 02:18

KillerQueen-Z wants to merge 2 commits into main from feat/chat-completions-raw-passthrough
