fix: preserve harmony analysis channel in non-streaming output #695
Open
jaredlockhart wants to merge 1 commit into jundot:main from
Non-streaming chat completions for gpt-oss dropped reasoning_content because the scheduler only persisted the final channel as visible_text. This change has the harmony parser emit the analysis channel as an output_text prefix on finalize, so the existing extract_thinking path in the server picks it up alongside the final answer. Fixes jundot#569
Summary
Non-streaming `/v1/chat/completions` for gpt-oss (Harmony) models returned `reasoning_content: null` because the scheduler only persisted the `final` channel as `visible_text`. Streaming worked because the analysis channel was emitted as `<think>…</think>` in `stream_text`.

Fix
- `parse_tool_calls_from_tokens` now also returns `analysis_text` from the `analysis` channel.
- `HarmonyOutputParserSession.finalize()` wraps analysis text as `<think>…</think>` and returns it via a new `output_text_prefix` field on `OutputParserFinalizeResult`.
- `output_text_prefix` is prepended to `request.output_text` on finalize, so the existing `extract_thinking(output.text)` in the non-streaming server paths recovers reasoning with no server-side changes.

Test
- `test_harmony_non_streaming_preserves_reasoning` in `tests/test_output_parser.py` drives analysis+final tokens through the session and asserts `extract_thinking` recovers both channels.

Fixes #569
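For reviewers who want the data flow at a glance, here is a minimal, self-contained sketch of the round trip described above. Everything in it is a hypothetical stand-in, not the project's code: `FinalizeResultSketch`, `finalize_sketch`, and `extract_thinking_sketch` only borrow names from this description, and the assumed behaviour of `extract_thinking` (splitting a leading `<think>` block from the visible answer and returning both) is a guess at its contract rather than its real signature.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FinalizeResultSketch:
    """Hypothetical stand-in for OutputParserFinalizeResult (only the new field)."""
    output_text_prefix: str = ""


def finalize_sketch(analysis_text: str) -> FinalizeResultSketch:
    """Wrap non-empty analysis text in <think> tags and return it as a prefix."""
    if not analysis_text:
        return FinalizeResultSketch()
    return FinalizeResultSketch(output_text_prefix=f"<think>{analysis_text}</think>")


# Assumption: the server-side helper recognises a leading <think>...</think> block.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_thinking_sketch(text: str) -> Tuple[Optional[str], str]:
    """Assumed contract: return (reasoning, visible answer) from the output text."""
    match = _THINK_RE.match(text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]


# Non-streaming path: only the final channel used to be kept as visible text.
final_text = "The answer is 4."
result = finalize_sketch("2 + 2 = 4")

# The prefix is prepended to the output text on finalize.
output_text = result.output_text_prefix + final_text

reasoning, content = extract_thinking_sketch(output_text)
# reasoning == "2 + 2 = 4"
# content == "The answer is 4."
```

The point of the sketch is that the reasoning travels inside the output text itself, which is why the server-side extraction needs no changes under this design.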