
fix: preserve harmony analysis channel in non-streaming output #695

Open
jaredlockhart wants to merge 1 commit into jundot:main from
jaredlockhart:fix/harmony-non-streaming-reasoning


Summary

Non-streaming /v1/chat/completions for gpt-oss (Harmony) models returned reasoning_content: null because the scheduler only persisted the final channel as visible_text. Streaming worked because the analysis channel was emitted as <think>…</think> in stream_text.

Fix

  • parse_tool_calls_from_tokens now also returns analysis_text from the analysis channel.
  • HarmonyOutputParserSession.finalize() wraps analysis text as <think>…</think> and returns it via a new output_text_prefix field on OutputParserFinalizeResult.
  • Scheduler prepends output_text_prefix to request.output_text on finalize, so the existing extract_thinking(output.text) in the non-streaming server paths recovers reasoning with no server-side changes.
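The flow described in the list above can be sketched roughly as follows. This is an illustrative stand-in, not the code from this PR: `FinalizeResult`, `finalize`, and `scheduler_output_text` are hypothetical simplifications, and the `extract_thinking` shown here (including its `(reasoning, content)` return order) is an assumed minimal version of the server helper of the same name.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FinalizeResult:
    # Hypothetical stand-in for OutputParserFinalizeResult; only the two
    # fields relevant to this fix are modelled.
    visible_text: str
    output_text_prefix: str = ""


def finalize(analysis_text: str, final_text: str) -> FinalizeResult:
    # Wrap the analysis channel in <think> tags so it can travel as a prefix.
    prefix = f"<think>{analysis_text}</think>" if analysis_text else ""
    return FinalizeResult(visible_text=final_text, output_text_prefix=prefix)


def scheduler_output_text(result: FinalizeResult) -> str:
    # The scheduler step: prepend the prefix to the persisted output text.
    return result.output_text_prefix + result.visible_text


_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_thinking(text: str) -> Tuple[Optional[str], str]:
    # Assumed behaviour: split the first <think> block from the rest.
    match = _THINK_RE.search(text)
    if match is None:
        return None, text
    return match.group(1), text[: match.start()] + text[match.end():]


if __name__ == "__main__":
    text = scheduler_output_text(finalize("Check units.", "42 km"))
    reasoning, content = extract_thinking(text)
    print(repr(reasoning), repr(content))
```

Carrying the reasoning as a text prefix means the non-streaming server path sees the same `<think>…</think>` shape that streaming already produces, which is why no server-side change is needed.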

Test

  • New regression test test_harmony_non_streaming_preserves_reasoning in tests/test_output_parser.py drives analysis+final tokens through the session and asserts extract_thinking recovers both channels.
  • Fails on main, passes after the fix.
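For readers without the repository at hand, the shape of such a regression test might look like the sketch below. It is not the test added in this PR: `FakeHarmonySession` and `split_thinking` are hypothetical stand-ins, and the session is fed `(channel, text)` pairs rather than real Harmony token IDs.

```python
import re


class FakeHarmonySession:
    """Hypothetical stand-in for a Harmony parser session."""

    def __init__(self):
        self._channels = {"analysis": [], "final": []}

    def feed(self, channel, piece):
        # Accumulate decoded text per channel.
        self._channels[channel].append(piece)

    def finalize(self):
        # Return (output_text_prefix, visible_text), mirroring the fix.
        analysis = "".join(self._channels["analysis"])
        final = "".join(self._channels["final"])
        prefix = f"<think>{analysis}</think>" if analysis else ""
        return prefix, final


def split_thinking(text):
    # Assumed minimal equivalent of extract_thinking.
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if match is None:
        return None, text
    return match.group(1), text[: match.start()] + text[match.end():]


def test_non_streaming_preserves_reasoning():
    session = FakeHarmonySession()
    for channel, piece in [("analysis", "2+2"), ("analysis", " is 4"), ("final", "4")]:
        session.feed(channel, piece)
    prefix, visible = session.finalize()

    # With the prefix prepended, both channels are recoverable.
    reasoning, content = split_thinking(prefix + visible)
    assert reasoning == "2+2 is 4"
    assert content == "4"

    # Without the prefix (the behaviour described for main), reasoning is lost.
    assert split_thinking(visible) == (None, "4")
```

The final assertion mirrors the "fails on main" condition: if only the final channel is persisted, there is nothing for the extraction step to recover.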

Fixes #569

Non-streaming chat completions for gpt-oss dropped reasoning_content
because the scheduler only persisted the final channel as visible_text.
Have the Harmony parser emit the analysis channel as an output_text
prefix on finalize, so the existing extract_thinking path in the server
picks it up alongside the final answer.

Fixes jundot#569

Development

Successfully merging this pull request may close these issues.

reasoning_content is null in non-streaming chat completions for Harmony models (gpt-oss)
