Hello, I'm having a couple of issues with the Java trace API.
When the environment variable DD_TRACE_ENABLED is set to false, the tracing library still attempts to connect to an agent endpoint. If it cannot connect, the process takes up to the full timeout to exit, whereas I'd expect the setting to disable all tracing.
user@host:~$ export DD_TRACE_ENABLED=false
user@host:~$ time <long java start command> -pdd -Ddd.trace.health.metrics.enabled=true -Ddd.agent.host=localhost java/lib/hellojava.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[dd.trace 2024-09-10 20:18:39:649 +0000] [main] DEBUG datadog.trace.agent.tooling.AgentInstaller - Adding filtered classes - instrumentation.class=datadog.trace.instrumentation.akka.concurrent.AkkaForkJoinTaskInstrumentation
[dd.trace 2024-09-10 20:18:39:652 +0000] [main] DEBUG datadog.trace.agent.tooling.AgentInstaller - Adding filtered classes - instrumentation.class=datadog.trace.instrumentation.akka.concurrent.AkkaMailboxInstrumentation
[dd.trace 2024-09-10 20:18:39:686 +0000] [main] DEBUG datadog.trace.agent.tooling.AgentInstaller - Adding filtered classes - instrumentation.class=datadog.trace.instrumentation.googlepubsub.PublisherInstrumentation
[dd.trace 2024-09-10 20:18:39:716 +0000] [main] DEBUG datadog.trace.agent.tooling.AgentInstaller - Adding filtered classes - instrumentation.class=datadog.trace.instrumentation.java.concurrent.JavaForkJoinTaskInstrumentation
...
[dd.trace 2024-09-10 20:19:23:055 +0000] [main] DEBUG datadog.trace.agent.tooling.TracerInstaller - Tracing is disabled, not installing GlobalTracer.
...
[dd.trace 2024-09-10 20:19:23:389 +0000] [main] EXCLUDE_TELEMETRY datadog.communication.ddagent.DDAgentFeaturesDiscovery - Error querying info at http://localhost:8126/
real 0m1.994s
user 0m2.931s
sys 0m0.271s
After stopping the agent running on localhost, when I set agent.host to localhost the connection fails and the process exits in under 3 seconds:
user@host:~$ time <long java start command> -pdd -Ddd.trace.health.metrics.enabled=true -Ddd.agent.host=localhost java/lib/hellojava.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[dd.trace 2024-09-10 20:14:32:022 +0000] [dd-telemetry] WARN datadog.telemetry.TelemetryRouter - Got FAILURE sending telemetry request to http://localhost:8126/telemetry/proxy/api/v2/apmtelemetry.
real 0m2.638s
user 0m4.009s
sys 0m0.334s
However, if I set it to a host that cannot be reached over the Ethernet interface, it attempts to connect and takes over 30 seconds to fail:
user@host:~$ time <long java start command> -pdd -Ddd.trace.health.metrics.enabled=true -Ddd.agent.host=this-host-doesnt-exist.com java/lib/hellojava.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[dd.trace 2024-09-06 19:33:57:476 +0000] [dd-task-scheduler] INFO datadog.trace.agent.core.StatusLogger - DATADOG TRACER CONFIGURATION {"version":"1.35.0~97065ed022","os_name":"Linux","os_version":"6.1.85-ts1-amd64","architecture":"amd64","lang":"jvm","lang_version":"21.0.2-tsjgss","jvm_vendor":"N/A","jvm_version":"21.0.2-tsjgss","java_class_version":"65.0","http_nonProxyHosts":"null","http_proxyHost":"null","enabled":true,"service":"hellojava","agent_url":"http://this-host-doesnt-exist.com:8126","agent_error":true,"debug":false,"trace_propagation_style_extract":["datadog","tracecontext"],"trace_propagation_style_inject":["datadog","tracecontext"],"analytics_enabled":false,"priority_sampling_enabled":true,"logs_correlation_enabled":true,"profiling_enabled":false,"remote_config_enabled":true,"debugger_enabled":false,"debugger_exception_enabled":false,"appsec_enabled":"ENABLED_INACTIVE","telemetry_enabled":true,"telemetry_dependency_collection_enabled":true,"telemetry_log_collection_enabled":false,"dd_version":"","health_checks_enabled":true,"configuration_file":"no config file present","runtime_id":"1f517376-f108-4012-a0ef-ea1940f110f3","logging_settings":{"levelInBrackets":false,"dateTimeFormat":"'[dd.trace 'yyyy-MM-dd HH:mm:ss:SSS Z']'","logFile":"System.err","configurationFile":"simplelogger.properties","showShortLogName":false,"showDateTime":true,"showLogName":true,"showThreadName":true,"defaultLogLevel":"INFO","warnLevelString":"WARN","embedException":false},"cws_enabled":false,"cws_tls_refresh":5000,"datadog_profiler_enabled":false,"datadog_profiler_safe":true,"datadog_profiler_enabled_overridden":false,"data_streams_enabled":false}
[dd.trace 2024-09-06 19:34:01:688 +0000] [main] WARN datadog.telemetry.TelemetrySystem - Telemetry thread join was not completed
real 0m38.012s
user 0m5.508s
sys 0m0.353s
I did a network capture, which shows that localhost sent back an RST immediately:
But to the agent over the network, it went into a TCP retransmission loop (the host I am attempting to connect to lives behind several layers of switches and firewalls, and does not send back an RST):
The DEFAULT_AGENT_TIMEOUT looks to be 10 seconds, so I'm not sure why this is taking 30 seconds to complete, other than the fact that the TCP retransmissions are adding to the delay.
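For reference, the difference between the two cases can be reproduced outside the tracer with a plain socket. The following is a minimal sketch, not the tracer's own connection code: the class name ConnectProbe and its probe method are hypothetical, and the 10000 ms default only mirrors the DEFAULT_AGENT_TIMEOUT value mentioned above. A target that answers with an RST should fail almost immediately, while a target that silently drops the SYN should block until the connect timeout expires.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; not part of dd-trace-java.
final class ConnectProbe {

    /** Outcome of a single TCP connection attempt. */
    static final class Result {
        final boolean connected;
        final long elapsedMillis;
        final String failure; // simple class name of the exception, or null on success

        Result(boolean connected, long elapsedMillis, String failure) {
            this.connected = connected;
            this.elapsedMillis = elapsedMillis;
            this.failure = failure;
        }

        @Override
        public String toString() {
            return "connected=" + connected + " elapsedMillis=" + elapsedMillis + " failure=" + failure;
        }
    }

    /** Attempts one TCP connection and reports how long it took to succeed or fail. */
    static Result probe(String host, int port, int timeoutMillis) {
        long start = System.nanoTime();
        boolean connected = false;
        String failure = null;
        try (Socket socket = new Socket()) {
            // The InetSocketAddress constructor performs the DNS lookup, which is
            // not bounded by timeoutMillis; only the TCP handshake is.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            connected = true;
        } catch (IOException e) {
            // An RST typically surfaces as ConnectException, a dropped SYN as
            // SocketTimeoutException once timeoutMillis has elapsed.
            failure = e.getClass().getSimpleName();
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(connected, elapsedMillis, failure);
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8126;
        // 10000 ms is an assumption mirroring the timeout value cited above.
        System.out.println(probe(host, port, 10_000));
    }
}
```

Running it once against localhost with the agent stopped and once against the remote host should show whether a single connection attempt is bounded by the timeout. If it is, a total of 30+ seconds would point to several attempts (or several components each making their own) rather than one slow connect.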
Can you please advise:
Why the tracer still attempts to initiate a connection to an agent if the environment variable is set to disable it
If we are setting our agent timeout incorrectly
Please let me know if you need any more information. Thank you!