feat(llmobs): instrument oai agents sdk #13081
Merged
+3,776
−8
Kyle-Verhoog
approved these changes
Apr 7, 2025
emmettbutler
approved these changes
Apr 7, 2025
Co-authored-by: Emmett Butler <[email protected]>
…nto evan.li/oai-agents
Add span linking between tool and LLM spans for the OpenAI Agents SDK. We use the core dispatch API because span linking requires cross-integration communication when someone selects "chat completions" as the LLM API to use for the Agents SDK.

Signals are dispatched:
- when LLM spans finish (chat completions API) in the oai integration
- when LLM spans finish (responses API) in the agents sdk integration
- when tool calls/handoffs finish in the agents sdk integration

`ToolCallTracker` in `ddtrace.llmobs._utils` contains the functions that handle these signals to add span links.

### Links created

**[LLM output -> tool input]** for the case where an LLM span chooses a tool and that tool is later executed via the agents sdk. We do this by mapping the tool name and arguments to its tool id. When the tool call is triggered, we have access to its name and arguments. From there, we can look up its tool id and the LLM span that generated those arguments. We pop the tool name/arg from the lookup dictionary after it's used.

**[Tool output -> LLM input]** for the case where a tool's output is fed back into a later LLM call, either in the same agent or another agent. We can tell this because the tool_id is present in the LLM's input messages. We then use this tool id to look up the tool span.

So the general lifecycle is:
1. An LLM chooses a tool. We save the tool id, tool name, and tool arguments and correlate them with the LLM span.
2. The tool is run.
    - We look at the arguments and name of the tool and use them to look up the LLM span that chose this tool. We then delete the name/arg from the lookup dict.
    - We save the span/trace id of the tool and correlate it with the tool_id.
3. The tool output is used as input for an LLM span. We have access to the tool id here, and look up the span/trace id of the tool to link it to the LLM span.

#### A note on handoffs

Hand-offs are implemented as tool calls in the agents SDK, so the span linking logic is largely the same.

Two notes:
- There are no arguments for handoffs, so we use a dummy default lookup key for the [LLM output -> tool input] linking step.
- The tool_id representing a handoff may be continually used as input for an LLM call, since the list of messages is kept and added to across agent runs. However, it realistically should only be linked to the first LLM call of the agent being handed off to. Unlike other tool calls, a handoff is only an orchestration step and doesn't provide extra context actually "used" in downstream LLM generations.

There are also two brittle parts of hand-off linking that rely on implementation details internal to the agents sdk:
- We are re-constructing the raw tool name used for hand-offs: `handoff_tool_name = "transfer_to_{}".format("_".join(oai_span.to_agent.split(" ")).lower())`
- We are using `{}` as the placeholder for the hand-off tool call argument. This is what's generated by the LLM when it chooses a handoff.

We can improve on this by inferring these values when an LLM chooses a handoff tool, but this requires a bit more exploring.

## Checklist
- [ ] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [ ] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: lievan <[email protected]>
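For readers who want a concrete picture of the linking lifecycle described in the commit message above, here is a minimal sketch. It is not the real `ToolCallTracker`: the names `SpanRef`, `ToolLinkSketch`, and its methods are hypothetical, and the actual signal handlers in `ddtrace.llmobs._utils` may be structured quite differently. Only the `transfer_to_{}` name reconstruction and the `{}` placeholder are taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SpanRef:
    """Hypothetical stand-in for a span's identity; not a ddtrace type."""
    trace_id: str
    span_id: str


@dataclass
class ToolLinkSketch:
    """Illustrative only; assumed shape, not the real ToolCallTracker."""
    # (tool name, raw arguments) -> (tool_id, LLM span that chose the tool)
    _by_name_args: Dict[Tuple[str, str], Tuple[str, SpanRef]] = field(default_factory=dict)
    # tool_id -> span of the tool execution
    _tool_spans: Dict[str, SpanRef] = field(default_factory=dict)

    def on_llm_tool_choice(self, llm_span: SpanRef, tool_id: str, name: str, arguments: str) -> None:
        # Step 1: remember which LLM span chose this tool call.
        self._by_name_args[(name, arguments)] = (tool_id, llm_span)

    def on_tool_run(self, tool_span: SpanRef, name: str, arguments: str) -> Optional[SpanRef]:
        # Step 2: find (and pop) the choosing LLM span, then record the tool span.
        entry = self._by_name_args.pop((name, arguments), None)
        if entry is None:
            return None
        tool_id, llm_span = entry
        self._tool_spans[tool_id] = tool_span
        return llm_span  # caller would link: LLM output -> tool input

    def on_llm_input(self, tool_ids: List[str]) -> List[SpanRef]:
        # Step 3: tool ids seen in the LLM's input messages map back to tool spans.
        return [self._tool_spans[t] for t in tool_ids if t in self._tool_spans]


def handoff_tool_name(to_agent: str) -> str:
    # Mirrors the reconstruction quoted in the description.
    return "transfer_to_{}".format("_".join(to_agent.split(" ")).lower())


HANDOFF_ARGS_PLACEHOLDER = "{}"  # dummy lookup key, since handoffs carry no arguments

tracker = ToolLinkSketch()
llm = SpanRef("trace-1", "llm-1")
tool = SpanRef("trace-1", "tool-1")
tracker.on_llm_tool_choice(llm, "call_abc", "get_weather", '{"city": "Paris"}')
linked_llm = tracker.on_tool_run(tool, "get_weather", '{"city": "Paris"}')
linked_tools = tracker.on_llm_input(["call_abc", "call_unknown"])
```

Keying the first lookup on name and arguments rather than on the tool id reflects the description's point that only the name and arguments are available when the tool call is triggered. The sketch does not model the "link a handoff only to the first LLM call" rule.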
Re-opening #12846, but with the logic in the trace processor changed to check for `agents._datadog_patch` before doing anything; otherwise all of its functions are no-ops.

Skipping `test_ddtrace_run_patch_on_import` for oai agents due to CI issues that I'm still investigating.
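As a rough illustration of the guard described above, the sketch below shows a processor whose hooks return early unless a `_datadog_patch` attribute is set on the `agents` module. The class name, hook names, and the use of `types.SimpleNamespace` as a stand-in for the `agents` module are all assumptions made for the example; the actual trace processor in this PR may look different.

```python
import types


class GuardedProcessorSketch:
    """Hypothetical processor whose hooks do nothing until the SDK is patched."""

    def __init__(self, agents_module):
        self._agents = agents_module
        self.events = []

    def _patched(self) -> bool:
        # Treat a missing attribute the same as an unpatched module.
        return bool(getattr(self._agents, "_datadog_patch", False))

    def on_span_start(self, span) -> None:
        if not self._patched():
            return  # no-op when the integration is not enabled
        self.events.append(("start", span))

    def on_span_end(self, span) -> None:
        if not self._patched():
            return  # no-op when the integration is not enabled
        self.events.append(("end", span))


# Stand-in for the real `agents` module (assumption for the example).
fake_agents = types.SimpleNamespace()
processor = GuardedProcessorSketch(fake_agents)

processor.on_span_start("span-a")   # ignored: not patched yet
fake_agents._datadog_patch = True
processor.on_span_start("span-b")   # recorded
processor.on_span_end("span-b")     # recorded
```

Checking the flag on every call, rather than once at construction, lets the hooks follow the module's current patched state even if the processor was registered earlier.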