Updating chat completions inference data retrieval logic #1199

kumar-shivam-ranjan · 2025-06-06T12:38:02Z

Description

This PR updates the inference logic used when the user overrides the header to /v1/chat/completions for a model deployed with the llama-cpp container. A model deployment (MD) using the llama-cpp container returns chunked responses in a different format than the vllm/tgi containers.
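To illustrate the kind of handling this implies, here is a minimal sketch of a stream reader that tolerates more than one chunk shape. It is not the code from this PR. The description does not say how the llama-cpp chunks differ, so the three payload shapes below (`delta.content`, `message.content`, `text`), the optional `data:` prefix, the `[DONE]` sentinel, and the helper names `extract_text` and `iter_stream_text` are all assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def extract_text(chunk: dict) -> Optional[str]:
    """Return the text carried by one parsed chunk, or None if it has none.

    The three shapes checked here are assumed, not taken from the PR.
    """
    choices = chunk.get("choices") or []
    if not choices or not isinstance(choices[0], dict):
        return None
    choice = choices[0]
    # Assumed chat streaming shape: {"choices": [{"delta": {"content": "..."}}]}
    delta = choice.get("delta")
    if isinstance(delta, dict) and delta.get("content"):
        return delta["content"]
    # Assumed non-delta chat shape: {"choices": [{"message": {"content": "..."}}]}
    message = choice.get("message")
    if isinstance(message, dict) and message.get("content"):
        return message["content"]
    # Assumed completion shape: {"choices": [{"text": "..."}]}
    return choice.get("text") or None


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an iterable of raw response lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line:
            continue  # skip blank keep-alive lines
        if line.startswith("data:"):
            line = line[len("data:"):].strip()  # drop the optional SSE prefix
        if line == "[DONE]":
            break  # assumed end-of-stream sentinel
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(chunk, dict):
            continue
        text = extract_text(chunk)
        if text:
            yield text
```

A caller could pass any iterable of byte lines, for example the line iterator of a streaming HTTP response, and join the yielded fragments to rebuild the full completion.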

github-actions · 2025-06-06T13:15:14Z

📌 Cov diff with main:

📌 Overall coverage:

Updating chat completions inference data retrieval logic

3b2121a

kumar-shivam-ranjan requested review from darenr, mayoor, mrDzurb, VipulMascarenhas, qiuosier and ahosler as code owners June 6, 2025 12:38

oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) Jun 6, 2025

Merge branch 'main' into ODSC-72334/streaming-inference-handler

6bb01ab

mrDzurb approved these changes Jun 6, 2025

mrDzurb enabled auto-merge (squash) June 6, 2025 15:22

elizjo approved these changes Jun 6, 2025

mrDzurb merged commit b7ba124 into main Jun 6, 2025
23 checks passed
