
Commit 534fa86

lievan authored
feat(llmobs): openai agents sdk integration (#12846)
Adds tracing for the OpenAI Agents SDK.

This is different from traditional tracing integrations since we hook into OpenAI's native tracing API to start and stop spans. This means we don't patch any of the Agents SDK functions. Hooking into native tracing gives us access to the fine-grained, inline tracing that OpenAI has already implemented. This gives us visibility into the main orchestration logic in the [`Run` method](https://github.com/openai/openai-agents-python/blob/090e79bdf4c8c6a77fbf5b599b19f6350b359855/src/agents/run.py#L109C1-L118C20).

## How does OpenAI Native tracing work?

OpenAI has implemented its own tracing API, which automatically instruments agent invocations. Users can also create custom traces and spans.

With OpenAI's tracing API, traces and spans are two different concepts/data structures. Each agent run creates one trace, but you can also create traces manually to connect multiple agent runs. Traces don't nest: if a trace is already active, new agent runs won't start another one.

There are several span types, which we map to LLM Obs span kinds:

```
Handoff -> tool
Function -> tool
Guardrail -> task
Agent -> agent
Custom -> task
Response -> llm (sdk uses responses api by default)
Generation -> llm (NOT IMPLEMENTED IN THIS PR)
```

Each span type has a different data format; see, for example, the [response (llm) kind](https://github.com/openai/openai-agents-python/blob/090e79bdf4c8c6a77fbf5b599b19f6350b359855/src/agents/tracing/span_data.py#L110C1-L132C1).

OpenAI lets us set a trace processor, which allows us to hook into spans and traces starting and ending. This integration works by adding an LLM Obs trace processor when we patch.

## LLM Obs Tracing logic

#### Data flow

We start a span inside `on_span/trace_start` in our trace processor. We finish spans inside the `on_span/trace_finish` functions by maintaining a map between OpenAI's tracing IDs and our spans.
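The span-kind mapping and the start/finish bookkeeping described under "Data flow" can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeOaiSpan`, `FakeLLMObsSpan`, `SketchTraceProcessor`, and the lowercase span-type keys are hypothetical names chosen for the example, and the hook names only loosely mirror the `on_span/trace_start` and `on_span/trace_finish` functions mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Span type -> LLM Obs span kind, taken from the table in the description.
# "generation" is listed there as not implemented, so it is left out.
# The lowercase keys are an assumption made for this sketch.
SPAN_KIND_BY_OAI_TYPE: Dict[str, str] = {
    "handoff": "tool",
    "function": "tool",
    "guardrail": "task",
    "agent": "agent",
    "custom": "task",
    "response": "llm",
}


@dataclass
class FakeOaiSpan:
    """Hypothetical stand-in for an OpenAI tracing span."""
    span_id: str
    span_type: str


@dataclass
class FakeLLMObsSpan:
    """Hypothetical stand-in for an LLM Obs span."""
    kind: str
    finished: bool = False


@dataclass
class SketchTraceProcessor:
    # Map from the OpenAI span id to the span we opened for it.
    open_spans: Dict[str, FakeLLMObsSpan] = field(default_factory=dict)
    finished_spans: List[FakeLLMObsSpan] = field(default_factory=list)

    def on_span_start(self, oai_span: FakeOaiSpan) -> Optional[FakeLLMObsSpan]:
        kind = SPAN_KIND_BY_OAI_TYPE.get(oai_span.span_type)
        if kind is None:
            return None  # unmapped span types are skipped
        span = FakeLLMObsSpan(kind=kind)
        self.open_spans[oai_span.span_id] = span
        return span

    def on_span_finish(self, oai_span: FakeOaiSpan) -> Optional[FakeLLMObsSpan]:
        # Look up (and forget) the span we started for this OpenAI span id.
        span = self.open_spans.pop(oai_span.span_id, None)
        if span is None:
            return None
        span.finished = True
        self.finished_spans.append(span)
        return span
```

Keying the map by the OpenAI span id means the finish hook needs no other context to find the span it should close, and popping the entry keeps the map from growing as spans complete.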
Before we finish a span, we call `llmobs_set_tags` to extract all the attributes from OpenAI's span data and translate them onto our LLM Obs span. With 7 different OpenAI span types, each with its own data structure, a lot of brittle translation logic is needed. We use an `OaiSpanAdapter` class that wraps an OpenAI span and provides utility functions to standardize behavior and defaults when accessing fields on OAI spans. This also keeps _all_ the data transformation code that needs to access OAI Agents SDK objects in a single place, making troubleshooting and code navigation a bit easier.

#### Setting IO for traces

OpenAI's trace data structure is very minimal and doesn't contain any I/O data. We do, however, know that

1. The input is always passed in as the input for the first LLM call in the first agent
2. The output is always the output of the last LLM call of the last agent

So for each trace, we keep some trace metadata in memory to track the first and last LLM calls, and use the data on these LLM spans to populate the I/O for the top-level trace.

## Follow ups

1. Span linking
2. I'm currently investigating an issue where test APM spans don't show up when a guardrail throws an error (LLM Obs spans are still there). It's likely an issue with the mock tracer.
3. We don't trace `generation` span types since they are wrappers around OpenAI chat completions, which we already trace. Span linking is not supported in this case. We'll need to investigate how to support span linking here without creating duplicate LLM spans. We'll bump into the same problem if we instrument the OpenAI Responses API.
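The rule described under "Setting IO for traces" (trace input from the first LLM call, trace output from the last) can be sketched as below. This is only an illustration under stated assumptions: `TraceIO`, `record_llm_span`, and the `trace_metadata` dictionary are hypothetical names, and plain strings stand in for whatever I/O payloads the real LLM spans carry.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TraceIO:
    """Hypothetical per-trace metadata kept in memory while a trace is active."""
    first_llm_input: Optional[str] = None
    last_llm_output: Optional[str] = None

    def record_llm_span(self, input_value: str, output_value: str) -> None:
        # Only the first LLM call may set the trace-level input.
        if self.first_llm_input is None:
            self.first_llm_input = input_value
        # Every LLM call overwrites the candidate trace-level output,
        # so the last one to finish wins.
        self.last_llm_output = output_value


# Keyed by the OpenAI trace id (the ids below are made up for the example).
trace_metadata: Dict[str, TraceIO] = {}

meta = trace_metadata.setdefault("trace_abc", TraceIO())
meta.record_llm_span("What is the weather in Paris?", "call get_weather('Paris')")
meta.record_llm_span("tool result: 18C, sunny", "It is 18C and sunny in Paris.")

trace_input, trace_output = meta.first_llm_input, meta.last_llm_output
```

When the trace finishes, `trace_input` and `trace_output` would be copied onto the top-level span and the entry removed from `trace_metadata`.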
## Checklist

- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: lievan <[email protected]>
1 parent 2db08ec commit 534fa86

36 files changed, +3179 -2 lines changed

.riot/requirements/14e7a10.txt

+51
@@ -0,0 +1,51 @@
+#
+# This file is autogenerated by pip-compile with Python 3.9
+# by the following command:
+#
+# pip-compile --no-annotate .riot/requirements/14e7a10.in
+#
+annotated-types==0.7.0
+anyio==4.9.0
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+colorama==0.4.6
+coverage[toml]==7.8.0
+distro==1.9.0
+exceptiongroup==1.2.2
+griffe==1.7.2
+h11==0.14.0
+httpcore==1.0.7
+httpx==0.28.1
+hypothesis==6.45.0
+idna==3.10
+iniconfig==2.1.0
+jiter==0.9.0
+mock==5.2.0
+multidict==6.3.2
+openai==1.70.0
+openai-agents==0.0.8
+opentracing==2.4.0
+packaging==24.2
+pluggy==1.5.0
+propcache==0.3.1
+pydantic==2.11.2
+pydantic-core==2.33.1
+pytest==8.3.5
+pytest-asyncio==0.26.0
+pytest-cov==6.1.0
+pytest-mock==3.14.0
+pyyaml==6.0.2
+requests==2.32.3
+sniffio==1.3.1
+sortedcontainers==2.4.0
+tomli==2.2.1
+tqdm==4.67.1
+types-requests==2.31.0.6
+types-urllib3==1.26.25.14
+typing-extensions==4.13.1
+typing-inspection==0.4.0
+urllib3==1.26.20
+vcrpy==7.0.0
+wrapt==1.17.2
+yarl==1.18.3

.riot/requirements/1538bcb.txt

+56
@@ -0,0 +1,56 @@
+#
+# This file is autogenerated by pip-compile with Python 3.12
+# by the following command:
+#
+# pip-compile --no-annotate .riot/requirements/1538bcb.in
+#
+annotated-types==0.7.0
+anyio==4.9.0
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+click==8.1.8
+colorama==0.4.6
+coverage[toml]==7.8.0
+distro==1.9.0
+griffe==1.7.2
+h11==0.14.0
+httpcore==1.0.7
+httpx==0.28.1
+httpx-sse==0.4.0
+hypothesis==6.45.0
+idna==3.10
+iniconfig==2.1.0
+jiter==0.9.0
+mcp==1.6.0
+mock==5.2.0
+multidict==6.3.2
+openai==1.70.0
+openai-agents==0.0.8
+opentracing==2.4.0
+packaging==24.2
+pluggy==1.5.0
+propcache==0.3.1
+pydantic==2.11.2
+pydantic-core==2.33.1
+pydantic-settings==2.8.1
+pytest==8.3.5
+pytest-asyncio==0.26.0
+pytest-cov==6.1.0
+pytest-mock==3.14.0
+python-dotenv==1.1.0
+pyyaml==6.0.2
+requests==2.32.3
+sniffio==1.3.1
+sortedcontainers==2.4.0
+sse-starlette==2.2.1
+starlette==0.46.1
+tqdm==4.67.1
+types-requests==2.32.0.20250328
+typing-extensions==4.13.1
+typing-inspection==0.4.0
+urllib3==2.3.0
+uvicorn==0.34.0
+vcrpy==7.0.0
+wrapt==1.17.2
+yarl==1.18.3

.riot/requirements/1f24364.txt

+58
@@ -0,0 +1,58 @@
+#
+# This file is autogenerated by pip-compile with Python 3.10
+# by the following command:
+#
+# pip-compile --no-annotate .riot/requirements/1f24364.in
+#
+annotated-types==0.7.0
+anyio==4.9.0
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+click==8.1.8
+colorama==0.4.6
+coverage[toml]==7.8.0
+distro==1.9.0
+exceptiongroup==1.2.2
+griffe==1.7.2
+h11==0.14.0
+httpcore==1.0.7
+httpx==0.28.1
+httpx-sse==0.4.0
+hypothesis==6.45.0
+idna==3.10
+iniconfig==2.1.0
+jiter==0.9.0
+mcp==1.6.0
+mock==5.2.0
+multidict==6.3.2
+openai==1.70.0
+openai-agents==0.0.8
+opentracing==2.4.0
+packaging==24.2
+pluggy==1.5.0
+propcache==0.3.1
+pydantic==2.11.2
+pydantic-core==2.33.1
+pydantic-settings==2.8.1
+pytest==8.3.5
+pytest-asyncio==0.26.0
+pytest-cov==6.1.0
+pytest-mock==3.14.0
+python-dotenv==1.1.0
+pyyaml==6.0.2
+requests==2.32.3
+sniffio==1.3.1
+sortedcontainers==2.4.0
+sse-starlette==2.2.1
+starlette==0.46.1
+tomli==2.2.1
+tqdm==4.67.1
+types-requests==2.32.0.20250328
+typing-extensions==4.13.1
+typing-inspection==0.4.0
+urllib3==2.3.0
+uvicorn==0.34.0
+vcrpy==7.0.0
+wrapt==1.17.2
+yarl==1.18.3

.riot/requirements/213dcfe.txt

+56
@@ -0,0 +1,56 @@
+#
+# This file is autogenerated by pip-compile with Python 3.13
+# by the following command:
+#
+# pip-compile --no-annotate .riot/requirements/213dcfe.in
+#
+annotated-types==0.7.0
+anyio==4.9.0
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+click==8.1.8
+colorama==0.4.6
+coverage[toml]==7.8.0
+distro==1.9.0
+griffe==1.7.2
+h11==0.14.0
+httpcore==1.0.7
+httpx==0.28.1
+httpx-sse==0.4.0
+hypothesis==6.45.0
+idna==3.10
+iniconfig==2.1.0
+jiter==0.9.0
+mcp==1.6.0
+mock==5.2.0
+multidict==6.3.2
+openai==1.70.0
+openai-agents==0.0.8
+opentracing==2.4.0
+packaging==24.2
+pluggy==1.5.0
+propcache==0.3.1
+pydantic==2.11.2
+pydantic-core==2.33.1
+pydantic-settings==2.8.1
+pytest==8.3.5
+pytest-asyncio==0.26.0
+pytest-cov==6.1.0
+pytest-mock==3.14.0
+python-dotenv==1.1.0
+pyyaml==6.0.2
+requests==2.32.3
+sniffio==1.3.1
+sortedcontainers==2.4.0
+sse-starlette==2.2.1
+starlette==0.46.1
+tqdm==4.67.1
+types-requests==2.32.0.20250328
+typing-extensions==4.13.1
+typing-inspection==0.4.0
+urllib3==2.3.0
+uvicorn==0.34.0
+vcrpy==7.0.0
+wrapt==1.17.2
+yarl==1.18.3

.riot/requirements/55abc5e.txt

+56
@@ -0,0 +1,56 @@
+#
+# This file is autogenerated by pip-compile with Python 3.11
+# by the following command:
+#
+# pip-compile --no-annotate .riot/requirements/55abc5e.in
+#
+annotated-types==0.7.0
+anyio==4.9.0
+attrs==25.3.0
+certifi==2025.1.31
+charset-normalizer==3.4.1
+click==8.1.8
+colorama==0.4.6
+coverage[toml]==7.8.0
+distro==1.9.0
+griffe==1.7.2
+h11==0.14.0
+httpcore==1.0.7
+httpx==0.28.1
+httpx-sse==0.4.0
+hypothesis==6.45.0
+idna==3.10
+iniconfig==2.1.0
+jiter==0.9.0
+mcp==1.6.0
+mock==5.2.0
+multidict==6.3.2
+openai==1.70.0
+openai-agents==0.0.8
+opentracing==2.4.0
+packaging==24.2
+pluggy==1.5.0
+propcache==0.3.1
+pydantic==2.11.2
+pydantic-core==2.33.1
+pydantic-settings==2.8.1
+pytest==8.3.5
+pytest-asyncio==0.26.0
+pytest-cov==6.1.0
+pytest-mock==3.14.0
+python-dotenv==1.1.0
+pyyaml==6.0.2
+requests==2.32.3
+sniffio==1.3.1
+sortedcontainers==2.4.0
+sse-starlette==2.2.1
+starlette==0.46.1
+tqdm==4.67.1
+types-requests==2.32.0.20250328
+typing-extensions==4.13.1
+typing-inspection==0.4.0
+urllib3==2.3.0
+uvicorn==0.34.0
+vcrpy==7.0.0
+wrapt==1.17.2
+yarl==1.18.3

ddtrace/_monkey.py

+2
@@ -109,6 +109,7 @@
     "coverage": False,
     "selenium": True,
     "valkey": True,
+    "openai_agents": True,
 }


@@ -158,6 +159,7 @@
         "langgraph",
         "langgraph.graph",
     ),
+    "openai_agents": ("agents",),
 }

 _NOT_PATCHABLE_VIA_ENVVAR = {"ddtrace_api"}

ddtrace/contrib/_openai_agents.py

+43
@@ -0,0 +1,43 @@
+"""
+The OpenAI Agents integration instruments the openai-agents Python library to emit traces for agent workflows.
+
+All traces submitted from the OpenAI Agents integration are tagged by:
+- ``service``, ``env``, ``version``: see the `Unified Service Tagging docs <https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging>`_.
+
+Enabling
+~~~~~~~~
+
+The OpenAI Agents integration is enabled automatically when you use
+:ref:`ddtrace-run<ddtracerun>` or :ref:`import ddtrace.auto<ddtraceauto>`.
+
+Alternatively, use :func:`patch() <ddtrace.patch>` to manually enable the OpenAI Agents integration::
+
+    from ddtrace import patch
+
+    patch(openai_agents=True)
+
+
+Global Configuration
+~~~~~~~~~~~~~~~~~~~~
+
+.. py:data:: ddtrace.config.openai_agents["service"]
+
+   The service name reported by default for OpenAI Agents requests.
+
+   Alternatively, you can set this option with the ``DD_SERVICE`` or ``DD_OPENAI_AGENTS_SERVICE`` environment
+   variables.
+
+   Default: ``DD_SERVICE``
+
+
+Instance Configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+To configure the OpenAI Agents integration on a per-instance basis use the
+``Pin`` API::
+
+    import agents
+    from ddtrace.trace import Pin
+
+    Pin.override(agents, service="my-agents-service")
+"""  # noqa: E501
@@ -0,0 +1,54 @@
+import agents
+from agents.tracing import add_trace_processor
+
+from ddtrace import config
+from ddtrace.contrib.internal.openai_agents.processor import LLMObsTraceProcessor
+from ddtrace.contrib.internal.openai_agents.processor import NoOpTraceProcessor
+from ddtrace.llmobs._integrations.openai_agents import OpenAIAgentsIntegration
+from ddtrace.trace import Pin
+
+
+config._add("openai_agents", {})
+
+_span_processor = None
+
+
+def get_version() -> str:
+    from agents import version
+
+    return getattr(version, "__version__", "")
+
+
+def patch():
+    """
+    Patch the instrumented methods
+    """
+    if getattr(agents, "_datadog_patch", False):
+        return
+
+    agents._datadog_patch = True
+
+    global _span_processor
+
+    Pin().onto(agents)
+
+    _span_processor = LLMObsTraceProcessor(
+        integration=OpenAIAgentsIntegration(integration_config=config.openai_agents),
+    )
+    add_trace_processor(_span_processor)
+
+
+def unpatch():
+    """
+    Remove instrumentation from patched methods
+    """
+    if not getattr(agents, "_datadog_patch", False):
+        return
+
+    # Since there's no public API to remove a trace processor, we set the instance
+    # we added to a no-op instance
+    global _span_processor
+    if _span_processor is not None:
+        _span_processor = NoOpTraceProcessor()
+
+    agents._datadog_patch = False
