@aliibii commented Oct 16, 2025

Description
This commit adds plugin execution tracing capability to the OpenTelemetry plugin, allowing users to trace individual plugin phases (rewrite, access, header_filter, body_filter, log) as child spans of the main request trace.

Changes:

  • Added trace_plugins configuration object with the following properties:
    • enabled: boolean to enable/disable plugin execution tracing (default: false)
    • plugin_span_kind: string enum ("internal" or "server") for observability provider compatibility (default: "internal")
    • excluded_plugins: array of plugin names to exclude from tracing (default: ["opentelemetry", "prometheus"])
  • Enhanced plugin execution in plugin.lua with OpenTelemetry span creation and finishing
  • Added proper span hierarchy: plugin phase spans are nested under main request spans
  • Added span context management with stack-based tracking for nested spans
  • Added upstream attributes (upstream.addr, upstream.host, upstream.ip, upstream.port) to the main request span in before_proxy phase
  • Updated test suite to reflect new schema structure and added tests for new features
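As a rough illustration of the configuration described above, the sketch below models how the `trace_plugins` defaults and the exclusion list might combine to decide whether a plugin phase is traced. It is written in Python purely as a model; the actual plugin is Lua, and the helper names `resolve_trace_plugins` and `should_trace` are hypothetical and do not come from this PR.

```python
# Defaults as listed in the PR description for the trace_plugins object.
TRACE_PLUGINS_DEFAULTS = {
    "enabled": False,
    "plugin_span_kind": "internal",
    "excluded_plugins": ["opentelemetry", "prometheus"],
}
ALLOWED_SPAN_KINDS = ("internal", "server")


def resolve_trace_plugins(user_conf=None):
    """Merge user-supplied settings over the defaults (hypothetical helper)."""
    conf = dict(TRACE_PLUGINS_DEFAULTS)
    conf.update(user_conf or {})
    # Copy the list so callers cannot mutate the shared defaults.
    conf["excluded_plugins"] = list(conf["excluded_plugins"])
    if conf["plugin_span_kind"] not in ALLOWED_SPAN_KINDS:
        raise ValueError("plugin_span_kind must be 'internal' or 'server'")
    return conf


def should_trace(conf, plugin_name):
    """A plugin phase is traced only if tracing is on and the plugin is not excluded."""
    return conf["enabled"] and plugin_name not in conf["excluded_plugins"]


# Usage: tracing is opt-in, and excluded plugins are skipped even when enabled.
conf = resolve_trace_plugins({"enabled": True})
traced = [p for p in ("limit-count", "prometheus") if should_trace(conf, p)]
```

With the defaults left untouched, `should_trace` returns false for every plugin, which matches the opt-in behaviour the description states.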

New OpenTelemetry API functions available to plugins via api_ctx.otel:

  • api_ctx.otel.start_span(span_info) - Create custom spans

    • Parameters: span_info table with optional fields: name, kind, attributes, parent
    • Returns: span context object or nil
    • Automatically tracks spans in a stack for proper parent-child relationships
  • api_ctx.otel.stop_span(span_ctx, error_msg) - Finish spans with error handling

    • Parameters: span_ctx (from start_span), optional error_msg string
    • Sets error status on span if error_msg provided
    • Automatically manages span stack
  • api_ctx.otel.current_span() - Get current span context

    • Returns: most recently started span context (top of stack) or nil
    • Useful for adding attributes or creating child spans
  • api_ctx.otel.get_plugin_context(plugin_name, phase) - Get plugin phase span context

    • Parameters: plugin_name (string), phase (string)
    • Returns: span context for the specified plugin phase or nil
    • Useful for plugins that want to reference or extend existing plugin phase spans
  • api_ctx.otel.with_span(span_info, fn) - Create span, execute function, and automatically finish span

    • Parameters: span_info table (same as start_span), fn function to execute
    • The function receives span_ctx as its first parameter, allowing access to the span for setting attributes
    • Automatically handles span creation, execution, error handling, and cleanup
    • Returns function results in error-first pattern (err, ...values)
    • Sets span status to ERROR if the function raises a Lua error or returns an error
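The stack-based behaviour of the functions listed above can be modelled in a few lines. The sketch below is a Python model only, not the Lua implementation from this PR: the class names `SpanContext` and `OtelModel` are hypothetical, and it assumes that the wrapped function signals failure by returning a tuple whose first element is a non-None error, since the description does not spell out that detail.

```python
class SpanContext:
    """Minimal stand-in for a span context (hypothetical, for illustration)."""

    def __init__(self, name, kind, attributes, parent):
        self.name = name
        self.kind = kind
        self.attributes = dict(attributes or {})
        self.parent = parent
        self.status = "UNSET"
        self.error = None
        self.ended = False


class OtelModel:
    """Models start_span / stop_span / current_span / with_span with a stack."""

    def __init__(self):
        self._stack = []

    def current_span(self):
        # Top of the stack is the most recently started, still-open span.
        return self._stack[-1] if self._stack else None

    def start_span(self, span_info=None):
        info = span_info or {}
        # An explicit parent wins; otherwise nest under the current span.
        parent = info.get("parent") or self.current_span()
        span = SpanContext(info.get("name", "unnamed"),
                           info.get("kind", "internal"),
                           info.get("attributes"), parent)
        self._stack.append(span)
        return span

    def stop_span(self, span_ctx, error_msg=None):
        if span_ctx is None:
            return
        if error_msg is not None:
            span_ctx.status = "ERROR"
            span_ctx.error = error_msg
        span_ctx.ended = True
        if span_ctx in self._stack:
            self._stack.remove(span_ctx)

    def with_span(self, span_info, fn):
        span = self.start_span(span_info)
        try:
            result = fn(span)  # assumed to return (err, *values)
        except Exception as exc:
            self.stop_span(span, str(exc))
            return (str(exc),)
        self.stop_span(span, result[0])
        return result


# Usage: the inner span is nested under the outer one and cleaned up afterwards.
otel = OtelModel()
outer = otel.start_span({"name": "outer"})
result = otel.with_span({"name": "inner"}, lambda span: (None, span.parent.name))
otel.stop_span(outer)
```

The point of the model is the lifecycle guarantee: whether the wrapped function returns normally, returns an error, or raises, the span is finished and removed from the stack, so later spans do not pick up a stale parent.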

Features:

  • Plugin Phase Tracing: Creates child spans for each plugin phase execution automatically
  • Span Kind Control: Supports "internal" (default) and "server" span kinds for observability provider compatibility
  • Configurable: Can be enabled/disabled via trace_plugins.enabled configuration
  • Plugin Exclusion: Can exclude specific plugins from tracing (e.g., opentelemetry, prometheus)
  • Proper Hierarchy: Plugin spans are correctly nested under main request spans
  • Upstream Attributes: Upstream information (addr, host, ip, port) is automatically attached to the main request span
  • Error Handling: Proper error status and message propagation for plugin execution errors
  • No-op API: The API is always available; its functions behave as no-ops when tracing is disabled
  • Stack Management: Automatic span stack tracking for nested span hierarchies
  • Convenient Span Management: with_span provides automatic span lifecycle management with error handling

Which issue(s) this PR fixes:

Resolves #12510

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@aliibii force-pushed the feat/opentelemetry-plugin-tracing branch 2 times, most recently from f456f22 to 8393636, October 17, 2025 10:35
@aliibii marked this pull request as ready for review October 18, 2025 09:55
@dosubot (bot) added the size:XL (This PR changes 500-999 lines, ignoring generated files.), enhancement (New feature or request), and plugin labels Oct 18, 2025
@aliibii force-pushed the feat/opentelemetry-plugin-tracing branch from 8393636 to 70731ad, October 22, 2025 19:06
@aliibii force-pushed the feat/opentelemetry-plugin-tracing branch 2 times, most recently from 087aa81 to d478351, November 2, 2025 12:01
This commit adds plugin execution tracing capability to the OpenTelemetry plugin,
allowing users to trace individual plugin phases (rewrite, access, header_filter,
body_filter, log) as child spans of the main request trace.

Changes:
- Added trace_plugins configuration option (default: false, opt-in)
- Added plugin_span_kind configuration for observability provider compatibility
- Enhanced plugin execution with OpenTelemetry span creation and finishing
- Added comprehensive request context attributes to plugin spans
- Updated documentation with examples and usage instructions
- Added comprehensive test suite for the new functionality

Features:
- Plugin Phase Tracing: Creates child spans for each plugin phase execution
- Rich Context: Includes HTTP method, URI, hostname, user agent, route info, and service info
- Configurable: Can be enabled/disabled via trace_plugins configuration
- Span Kind Control: Supports internal (default) and server span kinds for observability provider compatibility
- Proper Hierarchy: Plugin spans are correctly nested under main request spans
- Performance: Minimal overhead when disabled (default behavior)

Configuration:
- trace_plugins: boolean (default: false) - Enable/disable plugin tracing
- plugin_span_kind: string (default: 'internal') - Span kind for plugin spans
  - 'internal': Standard internal operation (may be excluded from metrics)
  - 'server': Server-side operation (typically included in service-level metrics)

This addresses GitHub issue apache#12510 and provides end-to-end tracing visibility
for APISIX plugin execution phases.
@aliibii force-pushed the feat/opentelemetry-plugin-tracing branch from d478351 to 8af172f, November 2, 2025 12:16
@dosubot (bot) added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) label and removed the size:XL (This PR changes 500-999 lines, ignoring generated files.) label Nov 2, 2025
@juzhiyuan
Member

  1. Just approved to run tests.
  2. @bzp2010 @moonming @Baoyuantop could you have a look?

@aliibii force-pushed the feat/opentelemetry-plugin-tracing branch from 2cba5ed to a81fd3e, November 3, 2025 19:54
Labels

enhancement (New feature or request), plugin, size:XXL (This PR changes 1000+ lines, ignoring generated files.)

Development

Successfully merging this pull request may close these issues.

feat: opentelemetry plugin give traces of all other plugin that are added into route