
BUG: fix llm stream response #3115

Merged 14 commits from bug/stream-resp into xorbitsai:main on Mar 31, 2025

Conversation

@amumu96 (Contributor) commented on Mar 24, 2025

  1. Modify xinference/client/tests/test_client.py: for chunks where finish_reason is not None, assert that delta == {"content": ""}.

  2. Modify xinference/model/llm/llama_cpp/core.py: filter out keys in the returned results that do not belong to ChatCompletionChunk.

  3. Modify xinference/model/llm/reasoning_parser.py: fix the issue where both reasoning_content="" and content="" are emitted.

  4. Modify xinference/model/llm/utils.py: ensure that chunks whose finish_reason is not None include values for both content and reasoning_content (see the sketch below).
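
For illustration, a minimal sketch of what points 1, 2, and 4 amount to. The key set, function names, and signatures below are assumptions made for this sketch, not the actual xinference code:

```python
from typing import Any, Dict

# Hypothetical key set: the real field list comes from xinference's
# ChatCompletionChunk type definition, not from this sketch.
CHAT_COMPLETION_CHUNK_KEYS = {"id", "object", "created", "model", "choices"}


def filter_chunk_keys(raw_chunk: Dict[str, Any]) -> Dict[str, Any]:
    # Drop any backend-specific keys that are not part of ChatCompletionChunk (point 2).
    return {k: v for k, v in raw_chunk.items() if k in CHAT_COMPLETION_CHUNK_KEYS}


def finalize_stream_chunk(chunk: Dict[str, Any], has_reasoning: bool) -> Dict[str, Any]:
    # Make the terminating chunk (finish_reason set) carry explicit, possibly
    # empty, content fields so clients can rely on their presence (point 4).
    for choice in chunk.get("choices", []):
        if choice.get("finish_reason") is not None:
            delta = choice.setdefault("delta", {})
            delta.setdefault("content", "")
            if has_reasoning:
                delta.setdefault("reasoning_content", "")
    return chunk


# Mirrors the test expectation from point 1: the final chunk's delta
# should be exactly {"content": ""} when there is no reasoning content.
final = finalize_stream_chunk(
    {"choices": [{"finish_reason": "stop", "delta": {}}]}, has_reasoning=False
)
assert final["choices"][0]["delta"] == {"content": ""}
```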

@XprobeBot added the bug (Something isn't working) label on Mar 24, 2025
@XprobeBot added this to the v1.x milestone on Mar 24, 2025
@amumu96 changed the title from "BUG: fix vllm stream response" to "BUG: fix llm stream response" on Mar 24, 2025
@qinxuye (Contributor) left a comment

LGTM

@qinxuye merged commit a6e99b4 into xorbitsai:main on Mar 31, 2025
12 of 13 checks passed
@qinxuye deleted the bug/stream-resp branch on March 31, 2025 at 10:39