
Conversation


@fenilfaldu fenilfaldu commented Nov 14, 2025

Closes #2258
Before
[Screenshot: 2025-11-14 at 6:05:50 AM]

After
[Screenshot: 2025-11-14 at 6:06:55 AM]


Note

Capture LLM model name and detailed token usage in BaseOpenAIChatCompletionClient.create and .create_stream, with streaming usage injection and end-of-stream attribute setting.

  • Instrumentation: OpenAI ChatCompletion (_wrappers.py)
    • LLM token metrics: Extract token usage from CreateResult.usage (prompt, completion, total) with details (prompt: cache_read/audio/cache_input; completion: reasoning/audio) and set via get_llm_token_count_attributes, plus explicit reasoning/audio completion detail attributes (see the first sketch after this note).
    • Model metadata: Add model name via get_llm_model_name_attributes to LLM spans.
    • Streaming support: Ensure include_usage (via extra_create_args.stream_options or the include_usage param); during streaming, accumulate output/tool-call/token attributes and set them after the stream completes (see the second sketch after this note).
    • Outputs/Tools: Continue setting output attributes and extracting tool call attributes from responses.

Written by Cursor Bugbot for commit ab0a81a.
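
For illustration, a minimal sketch of the end-of-request attribute setting described in the token-metrics and model-metadata bullets. The attribute keys follow OpenInference semantic conventions; the helper name and the prompt_tokens/completion_tokens fields (mirroring autogen's RequestUsage) are assumptions, not the PR's actual code:

```python
# Minimal sketch, not the PR's implementation. Assumes a usage object exposing
# prompt_tokens/completion_tokens (as in autogen_core.models.RequestUsage) and
# OpenInference-style span attribute keys.
from typing import Any, Optional

from opentelemetry.trace import Span


def _set_llm_span_attributes(span: Span, model: Optional[str], usage: Any) -> None:
    # Called once per request: after create() returns, or after the final
    # chunk of create_stream() has been consumed.
    if model is not None:
        span.set_attribute("llm.model_name", model)
    if usage is None:
        return
    prompt = getattr(usage, "prompt_tokens", None)
    completion = getattr(usage, "completion_tokens", None)
    if prompt is not None:
        span.set_attribute("llm.token_count.prompt", prompt)
    if completion is not None:
        span.set_attribute("llm.token_count.completion", completion)
    if prompt is not None and completion is not None:
        span.set_attribute("llm.token_count.total", prompt + completion)
```

Per the note above, the detail buckets (cache read, audio, reasoning) are resolved through get_llm_token_count_attributes in the actual wrapper rather than set one by one.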
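
And a sketch of the usage injection from the streaming bullet: OpenAI's chat completions API only reports usage on the final chunk of a stream when stream_options includes include_usage, so the wrapper has to make sure that flag is set before the request goes out. The helper name below is hypothetical:

```python
# Minimal sketch, assuming create_args are the keyword arguments that will be
# forwarded to the OpenAI chat.completions.create(stream=True) call.
from typing import Any, Dict


def _ensure_include_usage(create_args: Dict[str, Any]) -> Dict[str, Any]:
    # Without stream_options={"include_usage": True}, the streamed response
    # never carries a usage payload and token counts cannot be recorded.
    stream_options = dict(create_args.get("stream_options") or {})
    stream_options.setdefault("include_usage", True)
    return {**create_args, "stream_options": stream_options}
```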

@fenilfaldu fenilfaldu requested a review from a team as a code owner November 14, 2025 00:48
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 14, 2025
@fenilfaldu fenilfaldu changed the title Add token metrics support to autogen-agentchat instrumentation with streaming support feat:Add token metrics support to autogen-agentchat instrumentation with streaming support Nov 14, 2025
```python
if details:
    prompt_details = _extract_details_from_object(
        details,
        {"cache_read": "cached_tokens", "audio": "audio_tokens", "cache_input": "text_tokens"},
    )
```

Bug: Token Attribution Fails for Cached Input

The mapping includes "cache_input": "text_tokens", but get_llm_token_count_attributes doesn't handle the cache_input key in prompt_details. If text_tokens exists in the response, it gets extracted into the token usage dict but is never converted to a span attribute, making this extraction ineffective.
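
A possible shape of the fix, under the assumption that the unsupported key should either be dropped or surfaced as an explicit span attribute; the function, field names, and attribute key below are illustrative, not the repository's code:

```python
# Illustrative sketch only. Drops the cache_input mapping (which
# get_llm_token_count_attributes never translates into an attribute) and sets
# the remaining prompt detail buckets directly on the span.
from typing import Any, Optional

from opentelemetry.trace import Span

_PROMPT_DETAIL_FIELDS = {
    "cache_read": "cached_tokens",
    "audio": "audio_tokens",
}


def _set_prompt_detail_attributes(span: Span, details: Any) -> None:
    for detail_key, source_attr in _PROMPT_DETAIL_FIELDS.items():
        value: Optional[int] = getattr(details, source_attr, None)
        if value is not None:
            # OpenInference-style detail key; assumed, not verified against
            # the repo's semantic-conventions constants.
            span.set_attribute(f"llm.token_count.prompt_details.{detail_key}", value)
```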


@caroger caroger self-requested a review December 11, 2025 17:41

Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[bug] cannot get token counts and cost from streaming llm clients in autogen-agentchat

1 participant