fix(tracing): fix orphan spans in OtelTracingMiddleware by reading parent OTel Context from Reactor ContextView #1940

Open
Buktal wants to merge 1 commit into agentscope-ai:main from Buktal:fix/otel-tracing-context-propagation

Conversation

@Buktal Buktal commented Jun 26, 2026

Contributor

AgentScope-Java Version

2.0.0-SNAPSHOT

Description

Background

OtelTracingMiddleware was producing orphan spans: chat <model> and execute_tool <name> spans appeared as disconnected root spans in OTLP backends (e.g. Langfuse) instead of being nested under invoke_agent. Additionally, any SpanProcessor that reads the parent Context in onStart (e.g. to inject business metadata) received an empty context.

Closes #1938

Root cause

All three hooks (onAgent, onModelCall, onActing) used Flux.defer + Context.current() (ThreadLocal) to resolve the parent OTel Context. Inside a Reactor pipeline, the parent span is stored in the Reactor Context (written by ContextPropagationOperator.runWithContext), not in a ThreadLocal. After any thread hop (publishOn/subscribeOn), Context.current() returns the root (empty) context on the new thread, so child spans were created without a parent.
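
To make the failure mode concrete, here is a minimal, library-free sketch of why a ThreadLocal lookup comes back empty after a thread hop. It uses only the JDK; `ThreadLocalHopDemo`, `CURRENT_PARENT` and `parentSeenAfterThreadHop` are hypothetical names standing in for the ThreadLocal-backed `Context.current()` and for a `publishOn`/`subscribeOn` hop, and are not part of the middleware.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalHopDemo {
    // Stand-in for a ThreadLocal-backed "current context" slot (illustrative only).
    static final ThreadLocal<String> CURRENT_PARENT = new ThreadLocal<>();

    /** Sets the parent on the calling thread, then reads it from a worker thread. */
    static String parentSeenAfterThreadHop() {
        CURRENT_PARENT.set("invoke_agent");
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            // The read runs on a different thread, as it would after a scheduler hop.
            Callable<String> readAmbient = CURRENT_PARENT::get;
            return worker.submit(readAmbient).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
            CURRENT_PARENT.remove();
        }
    }
}
```

The worker thread never had a value set, so the lookup yields `null`, which is the plain-Java analogue of a child span that finds no parent.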

Fix

Switch all three hooks from Flux.defer to Flux.deferContextual, and resolve the parent context via ContextPropagationOperator.getOpenTelemetryContextFromContextView(ctxView, Context.current()), which reads from the Reactor Context first and falls back to ThreadLocal. Then call spanBuilder.setParent(parentContext) explicitly and build the child context with span.storeInContext(parentContext).
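
The lookup order described above (pipeline-carried context first, ThreadLocal value as the default) can be modelled without Reactor or OpenTelemetry on the classpath. This is only a sketch of the resolution rule: a `Map` stands in for the Reactor `ContextView`, a `String` stands in for the OTel `Context`, and `ParentContextResolution`, `PARENT_KEY` and `resolveParent` are hypothetical names, not the helper added in this PR.

```java
import java.util.Map;

final class ParentContextResolution {
    // Hypothetical key under which the pipeline carries the parent (illustrative only).
    static final String PARENT_KEY = "otel-parent";

    /**
     * Returns the parent carried by the pipeline context if present,
     * otherwise the supplied fallback (the ThreadLocal value in the real fix).
     */
    static String resolveParent(Map<String, String> pipelineContext, String fallback) {
        String carried = pipelineContext.get(PARENT_KEY);
        return carried != null ? carried : fallback;
    }
}
```

Because the carried value travels with the subscription rather than with the thread, the result does not depend on which thread performs the lookup; the fallback only matters when nothing was written into the pipeline context.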

This is the same pattern used in the now-deprecated TelemetryTracer.callModel.

Additional improvements

  • onActing: replaced LinkedHashSet with ConcurrentHashMap.newKeySet() for callIds (defensive thread safety)
  • onActing: span name now uses <firstTool> (+N more) for batch tool calls to cap cardinality; full tool names are preserved in the gen_ai.tool.name attribute
  • Added resolveOtelContext and buildToolSpanName helpers with clarifying comments
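
For reviewers who want a feel for the naming rule, here is a rough sketch of how a span name in the `<firstTool> (+N more)` style could be built, along with the concurrent set used for `callIds`. The `execute_tool ` prefix, the handling of an empty list, and the signatures of `buildToolSpanName` and `newCallIdSet` are assumptions made for illustration; the actual helper in the PR may differ.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ToolSpanNaming {
    /** Builds a low-cardinality span name: first tool plus a count of the rest. */
    static String buildToolSpanName(List<String> toolNames) {
        if (toolNames.isEmpty()) {
            return "execute_tool"; // assumed fallback when no tool name is known
        }
        String first = toolNames.get(0);
        int extra = toolNames.size() - 1;
        return extra == 0
                ? "execute_tool " + first
                : "execute_tool " + first + " (+" + extra + " more)";
    }

    /** Thread-safe set for call ids; unlike LinkedHashSet it does not keep insertion order. */
    static Set<String> newCallIdSet() {
        return ConcurrentHashMap.newKeySet();
    }
}
```

With this shape, a batch of three tools produces one span name regardless of which tools follow the first, while the full list can still be recorded as an attribute.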

How to test

Configure any OTLP backend (e.g. Langfuse). Add OtelTracingMiddleware to a ReActAgent. Invoke the agent and verify:

  1. chat <model> and execute_tool <name> spans appear nested under invoke_agent
  2. All three spans share the same traceId
  3. Any SpanProcessor that reads parent context in onStart correctly receives business metadata
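
Checks 1 and 2 can also be expressed as assertions over exported span data instead of being eyeballed in a UI. The sketch below assumes a simplified, hypothetical `SpanInfo` view (name, traceId, spanId, parentSpanId) and treats any span starting with `invoke_agent` as the legitimate root; in a real test these fields would come from whatever exporter or in-memory collector the project uses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SpanTreeCheck {
    /** Hypothetical, simplified view of one exported span. */
    static final class SpanInfo {
        final String name;
        final String traceId;
        final String spanId;
        final String parentSpanId; // null when the span has no parent

        SpanInfo(String name, String traceId, String spanId, String parentSpanId) {
            this.name = name;
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentSpanId = parentSpanId;
        }
    }

    /** Names of spans that are neither the agent root nor attached to a known parent. */
    static List<String> orphanNames(List<SpanInfo> spans) {
        Set<String> knownIds = new HashSet<>();
        for (SpanInfo s : spans) {
            knownIds.add(s.spanId);
        }
        List<String> orphans = new ArrayList<>();
        for (SpanInfo s : spans) {
            boolean isAgentRoot = s.parentSpanId == null && s.name.startsWith("invoke_agent");
            boolean hasKnownParent = s.parentSpanId != null && knownIds.contains(s.parentSpanId);
            if (!isAgentRoot && !hasKnownParent) {
                orphans.add(s.name);
            }
        }
        return orphans;
    }

    /** True when every span carries the same traceId. */
    static boolean sharesOneTrace(List<SpanInfo> spans) {
        Set<String> traceIds = new HashSet<>();
        for (SpanInfo s : spans) {
            traceIds.add(s.traceId);
        }
        return traceIds.size() == 1;
    }
}
```

Before the fix, a `chat` span exported as a parentless root in its own trace would show up in `orphanNames` and make `sharesOneTrace` return false; after the fix both checks should pass.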

Checklist

  • Code has been formatted with mvn spotless:apply
  • All tests are passing (mvn test)
  • Javadoc comments are complete and follow project conventions
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

@Buktal Buktal requested a review from a team June 26, 2026 09:48
codecov Bot commented Jun 26, 2026

Codecov Report

❌ Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
...agentscope/core/tracing/OtelTracingMiddleware.java | 90.90% | 1 Missing and 1 partial ⚠️

@itxaiohanglover

Contributor

Good fix! Reading parent OTel Context from Reactor ContextView instead of ThreadLocal is the correct approach for Reactor pipelines. The ContextPropagationOperator stores it there for exactly this reason.

Development

Successfully merging this pull request may close these issues.

[Bug]: OtelTracingMiddleware produces orphan spans — child spans missing parent due to wrong OTel Context lookup
