feat: enable streaming usage metrics for OpenAI-compatible providers #4200
base: main
Conversation
This pull request has merge conflicts that must be resolved before it can be merged. @skamenan7 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
Force-pushed from 0b6843b to 4325345
I think #4127 should probably supersede this, right?
Yes, Charlie. Even with such an overhaul of the telemetry (thanks @iamemilio), the OpenAI provider will not send usage data unless we explicitly ask for it with stream_options={"include_usage": True}, since automatic instrumentation libraries usually do not modify your API payloads to request extra data.
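For illustration, a minimal sketch of what that opt-in looks like against the OpenAI API using the standard openai Python client (the model name is just an example):

```python
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",  # example model
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    # Without this, streamed chunks never carry token usage.
    stream_options={"include_usage": True},
)

usage = None
for chunk in stream:
    # The usage-bearing chunk arrives last, with an empty `choices` list.
    if chunk.usage is not None:
        usage = chunk.usage

print(usage)  # prompt_tokens / completion_tokens / total_tokens
```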
Force-pushed from 4325345 to c81784c
@skamenan7 Sorry to say, my changes are going to make a bit of work for you. I would really suggest working within the bounds of the changes I made and experimenting with automatic instrumentation from OpenTelemetry, because tokens are something it actively captures. That said, you are correct that tokens are not included in the payloads from streaming data unless you set it in the arguments. Please do experiment with my PR and see what has changed; the old telemetry system you are using is going to be removed soon. If llama stack wanted to enable token metrics for all the services it routes inference streaming requests to, that is a clever solution, but we also need to make sure we are respecting the client's preferences and not returning the token-metrics chunk if they did not enable it. I'm happy to help if you need it!
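A rough sketch of the behavior being asked for here: request usage upstream for telemetry, but only forward the usage-only chunk if the caller asked for it. Names like record_token_metrics and provider.stream are placeholders, not llama-stack APIs.

```python
async def stream_with_usage_metrics(provider, params: dict):
    # Did the caller themselves ask for the usage chunk?
    client_wants_usage = bool((params.get("stream_options") or {}).get("include_usage"))

    # Always request usage upstream so telemetry can record token metrics.
    upstream_params = {**params, "stream_options": {"include_usage": True}}

    async for chunk in provider.stream(upstream_params):
        usage_only = getattr(chunk, "usage", None) is not None and not chunk.choices
        if usage_only:
            record_token_metrics(chunk.usage)  # placeholder telemetry hook
            if not client_wants_usage:
                continue  # respect the client's preference: don't forward the extra chunk
        yield chunk
```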
mattf left a comment
@iamemilio this change is focused and addresses a known issue with streaming metric generation. Can you help by having it go in and then aligning it with the new telemetry architecture as part of #4127?
@skamenan7 Bedrock needs updating as well.
I think we are doing this backwards, @mattf. @skamenan7 and I are going to pair on this to position this change as a follow-up PR to #4127. I cannot keep increasing the complexity of what is already an egregiously large pull request; otherwise it will be too difficult to review and test. Having to handle this would be a major time sink and setback for me.
Force-pushed from c81784c to 553d2e5
Thanks @mattf for that catch. I updated Bedrock as well.
Yes, Emilio and I are meeting soon to discuss, but I made the updates so as not to forget.
Inject stream_options for telemetry, add completion streaming metrics, fix params mutation, remove duplicate provider logic. Add unit tests.
Force-pushed from 553d2e5 to 37d588d
What does this PR do?
Injects stream_options={"include_usage": True} for OpenAI-compatible providers when streaming and telemetry is active. This allows token usage metrics to be collected and emitted for streaming responses.

Changes include:
- Inject stream_options in OpenAIMixin (completion & chat) when tracing is enabled
- InferenceRouter updates

Closes #3981
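A rough sketch of the injection described above (the method and the telemetry check are illustrative names, not the exact OpenAIMixin code):

```python
def _maybe_request_usage(self, params: dict, stream: bool) -> dict:
    """Opt in to usage chunks for streaming requests when telemetry is active."""
    if not stream or not self._telemetry_active():  # _telemetry_active is a placeholder
        return params
    # Copy first so the caller's params dict is never mutated.
    updated = dict(params)
    stream_options = dict(updated.get("stream_options") or {})
    stream_options.setdefault("include_usage", True)
    updated["stream_options"] = stream_options
    return updated
```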
Test Plan
Added unit tests in tests/unit/providers/utils/inference/test_openai_mixin.py verifying the stream_options injection behavior.

Ran tests locally:
PYTHONPATH=src pytest tests/unit/providers/utils/inference/test_openai_mixin.py -v
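For a sense of what such tests assert, here is a self-contained example exercising a free-function variant of the injection helper sketched above (not the actual test code from this PR):

```python
def maybe_request_usage(params: dict, stream: bool, telemetry_active: bool) -> dict:
    """Free-function variant of the illustrative injection helper."""
    if not stream or not telemetry_active:
        return params
    updated = dict(params)
    stream_options = dict(updated.get("stream_options") or {})
    stream_options.setdefault("include_usage", True)
    updated["stream_options"] = stream_options
    return updated


def test_injects_include_usage_when_streaming_with_telemetry():
    params = {"model": "test-model", "stream": True}
    result = maybe_request_usage(params, stream=True, telemetry_active=True)
    assert result["stream_options"] == {"include_usage": True}
    assert "stream_options" not in params  # original params were not mutated


def test_params_untouched_when_not_streaming():
    params = {"model": "test-model"}
    assert maybe_request_usage(params, stream=False, telemetry_active=True) is params
```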