feat(debugger): add agent state check to uploader #14653
base: main
We add an agent state check to the uploader so that it can fall back to an appropriate snapshot collection endpoint that is guaranteed to have redaction support. If no such guarantee can be made, we log a warning and disable snapshot uploads to prevent sensitive data from being captured.
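The fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: the endpoint paths, the `REDACTING_ENDPOINTS` tuple, and `select_snapshot_endpoint` are hypothetical names, and the list of endpoints the agent advertises is assumed to be available to the caller.

```python
import logging
from typing import Iterable, Optional

log = logging.getLogger(__name__)

# Hypothetical endpoint paths, ordered from most to least preferred.
# Every entry is assumed to redact sensitive data on the agent side.
REDACTING_ENDPOINTS = ("/snapshots/v2", "/snapshots/v1-redacting")


def select_snapshot_endpoint(agent_endpoints: Optional[Iterable[str]]) -> Optional[str]:
    """Return the preferred redacting endpoint the agent advertises, or None.

    None means no redaction guarantee can be made, so the caller should
    disable snapshot uploads.
    """
    available = set(agent_endpoints or ())
    for endpoint in REDACTING_ENDPOINTS:
        if endpoint in available:
            return endpoint
    log.warning("No snapshot endpoint with redaction support; snapshot uploads disabled")
    return None
```

Returning `None` rather than a default endpoint keeps the safe behaviour explicit: the uploader has to opt in to sending snapshots, and an unknown or unreachable agent never receives them.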
61d42db to 8bc5007
Co-authored-by: Tyler Finethy <[email protected]>
Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 269 ± 3 ms.
The average import time from base is: 272 ± 3 ms.
The import time difference between this PR and base is: -2.9 ± 0.1 ms.

Import time breakdown

The following import paths have grown:
Performance SLOs

Comparing candidate chore/debugger-agent-check-uploader (be76628) with baseline main (e806495)

📈 Performance Regressions (1 suite)

📈 iastaspects - 118/118

✅ add_aspect: Time: ✅ 0.401µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -1.9%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ add_inplace_aspect: Time: ✅ 0.411µs (SLO: <10.000µs 📉 -95.9%) vs baseline: -0.1%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ add_inplace_noaspect: Time: ✅ 0.314µs (SLO: <10.000µs 📉 -96.9%) vs baseline: -1.0%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ add_noaspect: Time: ✅ 0.278µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +1.1%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ bytearray_aspect: Time: ✅ 1.326µs (SLO: <10.000µs 📉 -86.7%) vs baseline: -0.6%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ bytearray_extend_aspect: Time: ✅ 1.548µs (SLO: <10.000µs 📉 -84.5%) vs baseline: +4.3%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ bytearray_extend_noaspect: Time: ✅ 0.611µs (SLO: <10.000µs 📉 -93.9%) vs baseline: -0.4%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%
✅ bytearray_noaspect: Time: ✅ 0.480µs (SLO: <10.000µs 📉 -95.2%) vs baseline: -0.7%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.6%
✅ bytes_aspect: Time: ✅ 1.541µs (SLO: <10.000µs 📉 -84.6%) vs baseline: 📈 +17.6%; Memory: ✅ 37.749MB (SLO: <39.000MB -3.2%) vs baseline: +5.2%
✅ bytes_noaspect: Time: ✅ 0.494µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.3%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ bytesio_aspect: Time: ✅ 1.364µs (SLO: <10.000µs 📉 -86.4%) vs baseline: +0.6%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%
✅ bytesio_noaspect: Time: ✅ 0.492µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.2%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ capitalize_aspect: Time: ✅ 0.732µs (SLO: <10.000µs 📉 -92.7%) vs baseline: -1.1%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.9%
✅ capitalize_noaspect: Time: ✅ 0.434µs (SLO: <10.000µs 📉 -95.7%) vs baseline: -0.7%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ casefold_aspect: Time: ✅ 0.737µs (SLO: <10.000µs 📉 -92.6%) vs baseline: -0.6%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ casefold_noaspect: Time: ✅ 0.371µs (SLO: <10.000µs 📉 -96.3%) vs baseline: ~same; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ decode_aspect: Time: ✅ 0.725µs (SLO: <10.000µs 📉 -92.8%) vs baseline: ~same; Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.9%
✅ decode_noaspect: Time: ✅ 0.420µs (SLO: <10.000µs 📉 -95.8%) vs baseline: +0.6%; Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%
✅ encode_aspect: Time: ✅ 0.712µs (SLO: <10.000µs 📉 -92.9%) vs baseline: +0.5%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ encode_noaspect: Time: ✅ 0.403µs (SLO: <10.000µs 📉 -96.0%) vs baseline: -0.8%; Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.6%
✅ format_aspect: Time: ✅ 3.419µs (SLO: <10.000µs 📉 -65.8%) vs baseline: -1.0%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ format_map_aspect: Time: ✅ 3.650µs (SLO: <10.000µs 📉 -63.5%) vs baseline: -1.3%; Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.1%
✅ format_map_noaspect: Time: ✅ 0.789µs (SLO: <10.000µs 📉 -92.1%) vs baseline: +1.2%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ format_noaspect: Time: ✅ 0.603µs (SLO: <10.000µs 📉 -94.0%) vs baseline: +1.3%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ index_aspect: Time: ✅ 0.357µs (SLO: <10.000µs 📉 -96.4%) vs baseline: ~same; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%
✅ index_noaspect: Time: ✅ 0.279µs (SLO: <10.000µs 📉 -97.2%) vs baseline: +0.2%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ join_aspect: Time: ✅ 1.378µs (SLO: <10.000µs 📉 -86.2%) vs baseline: +0.3%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ join_noaspect: Time: ✅ 0.490µs (SLO: <10.000µs 📉 -95.1%) vs baseline: -0.9%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%
✅ ljust_aspect: Time: ✅ 2.670µs (SLO: <20.000µs 📉 -86.6%) vs baseline: +1.6%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%
✅ ljust_noaspect: Time: ✅ 0.407µs (SLO: <10.000µs 📉 -95.9%) vs baseline: +0.2%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +4.9%
✅ lower_aspect: Time: ✅ 2.234µs (SLO: <10.000µs 📉 -77.7%) vs baseline: -0.3%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ lower_noaspect: Time: ✅ 0.370µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +1.8%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ lstrip_aspect: Time: ✅ 2.294µs (SLO: <20.000µs 📉 -88.5%) vs baseline: +1.2%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ lstrip_noaspect: Time: ✅ 0.385µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +0.7%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ modulo_aspect: Time: ✅ 0.999µs (SLO: <10.000µs 📉 -90.0%) vs baseline: -0.8%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%
✅ modulo_aspect_for_bytearray_bytearray: Time: ✅ 1.546µs (SLO: <10.000µs 📉 -84.5%) vs baseline: -0.3%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%
✅ modulo_aspect_for_bytes: Time: ✅ 0.981µs (SLO: <10.000µs 📉 -90.2%) vs baseline: -0.6%; Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.1%
✅ modulo_aspect_for_bytes_bytearray: Time: ✅ 1.287µs (SLO: <10.000µs 📉 -87.1%) vs baseline: +4.0%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.1%
✅ modulo_noaspect: Time: ✅ 0.624µs (SLO: <10.000µs 📉 -93.8%) vs baseline: -0.4%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%
✅ replace_aspect: Time: ✅ 4.892µs (SLO: <10.000µs 📉 -51.1%) vs baseline: -0.6%; Memory: ✅ 37.591MB (SLO: <39.000MB -3.6%) vs baseline: +4.6%
✅ replace_noaspect: Time: ✅ 0.465µs (SLO: <10.000µs 📉 -95.3%) vs baseline: +0.4%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +4.8%
✅ repr_aspect: Time: ✅ 0.906µs (SLO: <10.000µs 📉 -90.9%) vs baseline: +0.2%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%
✅ repr_noaspect: Time: ✅ 0.415µs (SLO: <10.000µs 📉 -95.9%) vs baseline: ~same; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ rstrip_aspect: Time: ✅ 1.977µs (SLO: <20.000µs 📉 -90.1%) vs baseline: +1.2%; Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.7%
✅ rstrip_noaspect: Time: ✅ 0.383µs (SLO: <10.000µs 📉 -96.2%) vs baseline: +1.4%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.1%
✅ slice_aspect: Time: ✅ 0.495µs (SLO: <10.000µs 📉 -95.1%) vs baseline: +0.3%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ slice_noaspect: Time: ✅ 0.447µs (SLO: <10.000µs 📉 -95.5%) vs baseline: -0.4%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.1%
✅ stringio_aspect: Time: ✅ 1.563µs (SLO: <10.000µs 📉 -84.4%) vs baseline: -0.5%; Memory: ✅ 37.729MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%
✅ stringio_noaspect: Time: ✅ 0.723µs (SLO: <10.000µs 📉 -92.8%) vs baseline: +0.3%; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.7%
✅ strip_aspect: Time: ✅ 2.253µs (SLO: <20.000µs 📉 -88.7%) vs baseline: -0.6%; Memory: ✅ 37.631MB (SLO: <39.000MB -3.5%) vs baseline: +4.8%
✅ strip_noaspect: Time: ✅ 0.387µs (SLO: <10.000µs 📉 -96.1%) vs baseline: ~same; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +5.0%
✅ swapcase_aspect: Time: ✅ 2.455µs (SLO: <10.000µs 📉 -75.5%) vs baseline: ~same; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ swapcase_noaspect: Time: ✅ 0.541µs (SLO: <10.000µs 📉 -94.6%) vs baseline: +0.7%; Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.6%
✅ title_aspect: Time: ✅ 2.344µs (SLO: <10.000µs 📉 -76.6%) vs baseline: -1.0%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ title_noaspect: Time: ✅ 0.503µs (SLO: <10.000µs 📉 -95.0%) vs baseline: +0.2%; Memory: ✅ 37.650MB (SLO: <39.000MB -3.5%) vs baseline: +4.7%
✅ translate_aspect: Time: ✅ 3.278µs (SLO: <10.000µs 📉 -67.2%) vs baseline: ~same; Memory: ✅ 37.670MB (SLO: <39.000MB -3.4%) vs baseline: +4.9%
✅ translate_noaspect: Time: ✅ 1.037µs (SLO: <10.000µs 📉 -89.6%) vs baseline: -0.7%; Memory: ✅ 37.709MB (SLO: <39.000MB -3.3%) vs baseline: +5.0%
✅ upper_aspect: Time: ✅ 2.243µs (SLO: <10.000µs 📉 -77.6%) vs baseline: -0.7%; Memory: ✅ 37.690MB (SLO: <39.000MB -3.4%) vs baseline: +5.0%
✅ upper_noaspect: Time: ✅ 0.372µs (SLO: <10.000µs 📉 -96.3%) vs baseline: +0.3%; Memory: ✅ 37.611MB (SLO: <39.000MB -3.6%) vs baseline: +4.8%

🟡 Near SLO Breach (3 suites)

🟡 djangosimple - 28/28

✅ appsec: Time: ✅ 20.523ms (SLO: <22.300ms -8.0%) vs baseline: +0.2%; Memory: ✅ 65.349MB (SLO: <67.000MB -2.5%) vs baseline: +5.1%
✅ exception-replay-enabled: Time: ✅ 1.350ms (SLO: <1.450ms -6.9%) vs baseline: +0.6%; Memory: ✅ 64.290MB (SLO: <67.000MB -4.0%) vs baseline: +5.1%
✅ iast: Time: ✅ 20.461ms (SLO: <22.250ms -8.0%) vs baseline: +0.2%; Memory: ✅ 65.191MB (SLO: <67.000MB -2.7%) vs baseline: +4.8%
✅ profiler: Time: ✅ 15.266ms (SLO: <16.550ms -7.8%) vs baseline: ~same; Memory: ✅ 53.340MB (SLO: <54.500MB -2.1%) vs baseline: +4.8%
✅ span-code-origin: Time: ✅ 26.154ms (SLO: <28.200ms -7.3%) vs baseline: ~same; Memory: ✅ 67.290MB (SLO: <69.500MB -3.2%) vs baseline: +4.8%
✅ tracer: Time: ✅ 20.558ms (SLO: <21.750ms -5.5%) vs baseline: +0.5%; Memory: ✅ 65.257MB (SLO: <67.000MB -2.6%) vs baseline: +4.8%
✅ tracer-and-profiler: Time: ✅ 22.167ms (SLO: <23.500ms -5.7%) vs baseline: ~same; Memory: ✅ 66.395MB (SLO: <67.500MB 🟡 -1.6%) vs baseline: +4.8%
✅ tracer-dont-create-db-spans: Time: ✅ 19.284ms (SLO: <21.500ms 📉 -10.3%) vs baseline: -0.3%; Memory: ✅ 65.247MB (SLO: <66.000MB 🟡 -1.1%) vs baseline: +4.9%
✅ tracer-minimal: Time: ✅ 16.629ms (SLO: <17.500ms -5.0%) vs baseline: -0.3%; Memory: ✅ 65.168MB (SLO: <66.000MB 🟡 -1.3%) vs baseline: +5.4%
✅ tracer-native: Time: ✅ 20.548ms (SLO: <21.750ms -5.5%) vs baseline: +0.7%; Memory: ✅ 71.064MB (SLO: <72.500MB 🟡 -2.0%) vs baseline: +4.7%
✅ tracer-no-caches: Time: ✅ 18.422ms (SLO: <19.650ms -6.2%) vs baseline: -0.2%; Memory: ✅ 65.247MB (SLO: <67.000MB -2.6%) vs baseline: +4.9%
✅ tracer-no-databases: Time: ✅ 18.769ms (SLO: <20.100ms -6.6%) vs baseline: +0.4%; Memory: ✅ 64.912MB (SLO: <67.000MB -3.1%) vs baseline: +5.0%
✅ tracer-no-middleware: Time: ✅ 20.231ms (SLO: <21.500ms -5.9%) vs baseline: ~same; Memory: ✅ 65.224MB (SLO: <67.000MB -2.7%) vs baseline: +4.9%
✅ tracer-no-templates: Time: ✅ 20.317ms (SLO: <22.000ms -7.7%) vs baseline: +0.1%; Memory: ✅ 65.214MB (SLO: <67.000MB -2.7%) vs baseline: +4.9%

🟡 errortrackingdjangosimple - 6/6

✅ errortracking-enabled-all: Time: ✅ 18.036ms (SLO: <19.850ms -9.1%) vs baseline: ~same; Memory: ✅ 65.068MB (SLO: <66.500MB -2.2%) vs baseline: +5.0%
✅ errortracking-enabled-user: Time: ✅ 18.134ms (SLO: <19.400ms -6.5%) vs baseline: +0.2%; Memory: ✅ 65.175MB (SLO: <66.500MB 🟡 -2.0%) vs baseline: +5.0%
✅ tracer-enabled: Time: ✅ 18.170ms (SLO: <19.450ms -6.6%) vs baseline: +1.0%; Memory: ✅ 64.979MB (SLO: <66.500MB -2.3%) vs baseline: +5.1%

🟡 otelspan - 22/22

✅ add-event: Time: ✅ 45.065ms (SLO: <47.150ms -4.4%) vs baseline: -0.5%; Memory: ✅ 45.115MB (SLO: <47.000MB -4.0%) vs baseline: +4.9%
✅ add-metrics: Time: ✅ 320.032ms (SLO: <344.800ms -7.2%) vs baseline: -1.3%; Memory: ✅ 552.833MB (SLO: <562.000MB 🟡 -1.6%) vs baseline: +4.7%
✅ add-tags: Time: ✅ 291.708ms (SLO: <314.000ms -7.1%) vs baseline: -0.3%; Memory: ✅ 553.468MB (SLO: <563.500MB 🟡 -1.8%) vs baseline: +4.8%
✅ get-context: Time: ✅ 82.827ms (SLO: <92.350ms 📉 -10.3%) vs baseline: +0.3%; Memory: ✅ 40.194MB (SLO: <46.500MB 📉 -13.6%) vs baseline: +4.9%
✅ is-recording: Time: ✅ 43.102ms (SLO: <44.500ms -3.1%) vs baseline: +0.4%; Memory: ✅ 44.434MB (SLO: <47.500MB -6.5%) vs baseline: +4.7%
✅ record-exception: Time: ✅ 61.720ms (SLO: <67.650ms -8.8%) vs baseline: -0.1%; Memory: ✅ 40.448MB (SLO: <47.000MB 📉 -13.9%) vs baseline: +4.8%
✅ set-status: Time: ✅ 49.051ms (SLO: <50.400ms -2.7%) vs baseline: ~same; Memory: ✅ 44.433MB (SLO: <47.000MB -5.5%) vs baseline: +4.6%
✅ start: Time: ✅ 42.228ms (SLO: <43.450ms -2.8%) vs baseline: -0.2%; Memory: ✅ 44.423MB (SLO: <47.000MB -5.5%) vs baseline: +4.7%
✅ start-finish: Time: ✅ 83.107ms (SLO: <88.000ms -5.6%) vs baseline: +0.5%; Memory: ✅ 34.544MB (SLO: <46.500MB 📉 -25.7%) vs baseline: +4.9%
✅ start-finish-telemetry: Time: ✅ 84.416ms (SLO: <89.000ms -5.2%) vs baseline: +0.2%; Memory: ✅ 34.564MB (SLO: <46.500MB 📉 -25.7%) vs baseline: +4.9%
✅ update-name: Time: ✅ 44.292ms (SLO: <45.150ms 🟡 -1.9%) vs baseline: ~same; Memory: ✅ 44.678MB (SLO: <47.000MB -4.9%) vs baseline: +4.7%
Small tweak to rate-limit the number of warning logs to at most once an hour. At some point we should also do this for:

```
[DEBUG] ddtrace.internal.remoteconfig.worker: Agent is down or Remote Config is not enabled in the Agent
Check your Agent version, you need an Agent running on 7.39.1 version or above.
Check Your Remote Config environment variables on your Agent:
DD_REMOTE_CONFIGURATION_ENABLED=true
```

It is quite noisy as well, but it sits behind a DEBUG flag, so it is less important.
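One way to cap a warning at once an hour is a small wrapper that remembers when it last logged. The sketch below is illustrative only and is not taken from this PR: `RateLimitedWarning` and its parameters are hypothetical names, and it uses only the standard `logging` and `time` modules.

```python
import logging
import time
from typing import Callable, Optional

log = logging.getLogger(__name__)


class RateLimitedWarning:
    """Emit a warning at most once per `interval` seconds (hypothetical helper)."""

    def __init__(self, interval: float = 3600.0, clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._last: Optional[float] = None

    def warn(self, msg: str, *args: object) -> bool:
        """Log `msg` unless a warning was emitted less than `interval` seconds ago.

        Returns True if the message was logged, False if it was suppressed.
        """
        now = self._clock()
        if self._last is not None and now - self._last < self._interval:
            return False
        self._last = now
        log.warning(msg, *args)
        return True
```

A monotonic clock avoids surprises when the wall clock is adjusted. The helper is not thread-safe as written; a lock around the check and update would be needed if several threads can call `warn` concurrently.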
Performing QA on this now:
@P403n1x87 something is going wrong here, what I did was:
This was the last log before I restarted the latest agent, then no hints from the
But heartbeats also permanently stopped, if that helps.
log.debug("Downgrading snapshot endpoint to %s", uploader_track.endpoint)
# Try again immediately. If this fails for the same
# reason we transition to agent check state
self._flush_track(uploader_track)
Won't the track be empty after the call to queue.flush(), since the buffer is emptied?
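The concern can be illustrated with a toy queue whose `flush()` drains its buffer: a retry that flushes again gets nothing, so the payload from the first flush has to be reused. Everything in this sketch is hypothetical (`BufferedQueue`, `flush_with_retry`, and the `send` callables are stand-ins, not the uploader's real classes), and whether the actual code is affected depends on how `_flush_track` obtains its payload.

```python
from typing import Callable, List


class BufferedQueue:
    """Toy stand-in for an uploader queue; flush() empties the buffer."""

    def __init__(self) -> None:
        self._items: List[bytes] = []

    def put(self, item: bytes) -> None:
        self._items.append(item)

    def flush(self) -> List[bytes]:
        # Hand back the buffered items and start a fresh, empty buffer.
        items, self._items = self._items, []
        return items


def flush_with_retry(
    queue: BufferedQueue,
    send: Callable[[List[bytes]], None],
    fallback_send: Callable[[List[bytes]], None],
) -> int:
    """Flush once, then retry on the fallback with the same payload."""
    payload = queue.flush()
    if not payload:
        return 0
    try:
        send(payload)
    except ConnectionError:
        # Calling queue.flush() again here would return [], so the
        # already-flushed payload is passed to the fallback instead.
        fallback_send(payload)
    return len(payload)
```

Holding on to the flushed payload keeps the retry independent of the queue state, at the cost of keeping the data in memory until one of the sends completes.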