-
Notifications
You must be signed in to change notification settings - Fork 457
feat(profiling): add memory allocator testing variants #14659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
00a63c5
to
702aee6
Compare
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 256 ± 6 ms. The average import time from base is: 260 ± 6 ms. The import time difference between this PR and base is: -4.0 ± 0.3 ms. Import time breakdownThe following import paths have shrunk:
|
Performance SLOsComparing candidate feat/memory-allocator-testing (afd4f10) with baseline main (a32abd2) 📈 Performance Regressions (1 suite)📈 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 3.178µs (SLO: <20.000µs 📉 -84.1%) vs baseline: ~same Memory: ✅ 32.106MB (SLO: <34.000MB -5.6%) vs baseline: +4.8% ✅ 1-count-metrics-100-timesTime: ✅ 213.436µs (SLO: <250.000µs 📉 -14.6%) vs baseline: ~same Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.7% ✅ 1-distribution-metric-1-timesTime: ✅ 3.329µs (SLO: <20.000µs 📉 -83.4%) vs baseline: 📈 +12.4% Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +5.0% ✅ 1-distribution-metrics-100-timesTime: ✅ 191.354µs (SLO: <220.000µs 📉 -13.0%) vs baseline: -0.3% Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.7% ✅ 1-gauge-metric-1-timesTime: ✅ 2.120µs (SLO: <20.000µs 📉 -89.4%) vs baseline: -0.3% Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +4.8% ✅ 1-gauge-metrics-100-timesTime: ✅ 124.111µs (SLO: <150.000µs 📉 -17.3%) vs baseline: -2.7% Memory: ✅ 32.204MB (SLO: <34.000MB -5.3%) vs baseline: +5.0% ✅ 1-rate-metric-1-timesTime: ✅ 3.219µs (SLO: <20.000µs 📉 -83.9%) vs baseline: +1.0% Memory: ✅ 32.027MB (SLO: <34.000MB -5.8%) vs baseline: +4.6% ✅ 1-rate-metrics-100-timesTime: ✅ 212.005µs (SLO: <250.000µs 📉 -15.2%) vs baseline: -0.8% Memory: ✅ 32.204MB (SLO: <34.000MB -5.3%) vs baseline: +5.0% ✅ 100-count-metrics-100-timesTime: ✅ 21.204ms (SLO: <23.500ms -9.8%) vs baseline: -1.5% Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +5.1% ✅ 100-distribution-metrics-100-timesTime: ✅ 1.984ms (SLO: <2.250ms 📉 -11.8%) vs baseline: +0.5% Memory: ✅ 32.126MB (SLO: <34.000MB -5.5%) vs baseline: +4.8% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.284ms (SLO: <1.550ms 📉 -17.1%) vs baseline: -0.9% Memory: ✅ 32.165MB (SLO: <34.000MB -5.4%) vs baseline: +5.1% ✅ 100-rate-metrics-100-timesTime: ✅ 2.180ms (SLO: <2.550ms 📉 -14.5%) vs baseline: -0.6% Memory: ✅ 32.145MB (SLO: <34.000MB -5.5%) vs baseline: +5.0% ✅ flush-1-metricTime: ✅ 4.183µs (SLO: <20.000µs 📉 -79.1%) vs baseline: +2.4% Memory: ✅ 32.086MB (SLO: <34.000MB -5.6%) vs baseline: +4.9% ✅ flush-100-metricsTime: ✅ 182.109µs (SLO: <250.000µs 📉 -27.2%) vs baseline: +0.7% Memory: ✅ 32.165MB (SLO: <34.000MB -5.4%) vs baseline: +5.3% ✅ flush-1000-metricsTime: ✅ 2.209ms (SLO: <2.500ms 📉 -11.6%) vs baseline: +0.7% Memory: ✅ 32.853MB (SLO: <34.500MB -4.8%) vs baseline: +4.6% 🟡 Near SLO Breach (4 suites)🟡 djangosimple - 30/30✅ appsecTime: ✅ 20.482ms (SLO: <22.300ms -8.2%) vs baseline: -0.3% Memory: ✅ 65.563MB (SLO: <67.000MB -2.1%) vs baseline: +5.0% ✅ exception-replay-enabledTime: ✅ 1.347ms (SLO: <1.450ms -7.1%) vs baseline: ~same Memory: ✅ 64.484MB (SLO: <67.000MB -3.8%) vs baseline: +4.8% ✅ iastTime: ✅ 20.529ms (SLO: <22.250ms -7.7%) vs baseline: ~same Memory: ✅ 65.519MB (SLO: <67.000MB -2.2%) vs baseline: +5.1% ✅ profilerTime: ✅ 15.311ms (SLO: <16.550ms -7.5%) vs baseline: +0.2% Memory: ✅ 53.704MB (SLO: <54.500MB 🟡 -1.5%) vs baseline: +4.8% ✅ resource-renamingTime: ✅ 20.574ms (SLO: <21.750ms -5.4%) vs baseline: +0.2% Memory: ✅ 65.596MB (SLO: <67.000MB -2.1%) vs baseline: +5.1% ✅ span-code-originTime: ✅ 26.138ms (SLO: <28.200ms -7.3%) vs baseline: -0.3% Memory: ✅ 67.526MB (SLO: <69.500MB -2.8%) vs baseline: +5.1% ✅ tracerTime: ✅ 20.457ms (SLO: <21.750ms -5.9%) vs baseline: -0.2% Memory: ✅ 65.488MB (SLO: <67.000MB -2.3%) vs baseline: +4.7% ✅ tracer-and-profilerTime: ✅ 22.042ms (SLO: <23.500ms -6.2%) vs baseline: -0.4% Memory: ✅ 66.659MB (SLO: <67.500MB 🟡 -1.2%) vs baseline: +4.8% ✅ tracer-dont-create-db-spansTime: ✅ 19.351ms (SLO: <21.500ms -10.0%) vs baseline: ~same Memory: ✅ 65.529MB (SLO: <66.000MB 🟡 -0.7%) vs baseline: +4.9% ✅ tracer-minimalTime: ✅ 16.664ms (SLO: <17.500ms -4.8%) vs baseline: +0.2% Memory: ✅ 65.157MB (SLO: <66.000MB 🟡 -1.3%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 20.505ms (SLO: <21.750ms -5.7%) vs baseline: ~same Memory: ✅ 71.443MB (SLO: <72.500MB 🟡 -1.5%) vs baseline: +4.9% ✅ tracer-no-cachesTime: ✅ 18.431ms (SLO: <19.650ms -6.2%) vs baseline: ~same Memory: ✅ 65.242MB (SLO: <67.000MB -2.6%) vs baseline: +4.7% ✅ tracer-no-databasesTime: ✅ 18.872ms (SLO: <20.100ms -6.1%) vs baseline: +0.4% Memory: ✅ 65.125MB (SLO: <67.000MB -2.8%) vs baseline: +4.8% ✅ tracer-no-middlewareTime: ✅ 20.178ms (SLO: <21.500ms -6.1%) vs baseline: ~same Memory: ✅ 65.504MB (SLO: <67.000MB -2.2%) vs baseline: +4.8% ✅ tracer-no-templatesTime: ✅ 20.360ms (SLO: <22.000ms -7.5%) vs baseline: +0.2% Memory: ✅ 65.525MB (SLO: <67.000MB -2.2%) vs baseline: +4.9% 🟡 errortrackingdjangosimple - 6/6✅ errortracking-enabled-allTime: ✅ 18.215ms (SLO: <19.850ms -8.2%) vs baseline: +1.0% Memory: ✅ 65.352MB (SLO: <66.500MB 🟡 -1.7%) vs baseline: +4.9% ✅ errortracking-enabled-userTime: ✅ 18.017ms (SLO: <19.400ms -7.1%) vs baseline: ~same Memory: ✅ 65.294MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.8% ✅ tracer-enabledTime: ✅ 18.121ms (SLO: <19.450ms -6.8%) vs baseline: +0.2% Memory: ✅ 65.294MB (SLO: <66.500MB 🟡 -1.8%) vs baseline: +4.9% 🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.576ms (SLO: <4.750ms -3.7%) vs baseline: +0.4% Memory: ✅ 61.912MB (SLO: <65.000MB -4.8%) vs baseline: +4.8% ✅ appsec-postTime: ✅ 6.574ms (SLO: <6.750ms -2.6%) vs baseline: +0.2% Memory: ✅ 61.892MB (SLO: <65.000MB -4.8%) vs baseline: +4.9% ✅ appsec-telemetryTime: ✅ 4.578ms (SLO: <4.750ms -3.6%) vs baseline: -0.6% Memory: ✅ 61.951MB (SLO: <65.000MB -4.7%) vs baseline: +4.9% ✅ debuggerTime: ✅ 1.857ms (SLO: <2.000ms -7.1%) vs baseline: ~same Memory: ✅ 45.338MB (SLO: <47.000MB -3.5%) vs baseline: +4.6% ✅ iast-getTime: ✅ 1.864ms (SLO: <2.000ms -6.8%) vs baseline: ~same Memory: ✅ 42.389MB (SLO: <49.000MB 📉 -13.5%) vs baseline: +5.0% ✅ profilerTime: ✅ 1.914ms (SLO: <2.100ms -8.8%) vs baseline: +0.3% Memory: ✅ 46.498MB (SLO: <47.000MB 🟡 -1.1%) vs baseline: +4.9% ✅ resource-renamingTime: ✅ 3.385ms (SLO: <3.650ms -7.3%) vs baseline: ~same Memory: ✅ 52.219MB (SLO: <53.500MB -2.4%) vs baseline: +4.9% ✅ tracerTime: ✅ 3.374ms (SLO: <3.650ms -7.6%) vs baseline: ~same Memory: ✅ 52.199MB (SLO: <53.500MB -2.4%) vs baseline: +4.7% ✅ tracer-nativeTime: ✅ 3.372ms (SLO: <3.650ms -7.6%) vs baseline: +0.3% Memory: ✅ 57.923MB (SLO: <60.000MB -3.5%) vs baseline: +4.6% 🟡 otelspan - 22/22✅ add-eventTime: ✅ 45.295ms (SLO: <47.150ms -3.9%) vs baseline: ~same Memory: ✅ 45.220MB (SLO: <47.000MB -3.8%) vs baseline: +4.8% ✅ add-metricsTime: ✅ 322.141ms (SLO: <344.800ms -6.6%) vs baseline: +0.5% Memory: ✅ 553.402MB (SLO: <562.000MB 🟡 -1.5%) vs baseline: +5.1% ✅ add-tagsTime: ✅ 290.783ms (SLO: <314.000ms -7.4%) vs baseline: ~same Memory: ✅ 555.098MB (SLO: <563.500MB 🟡 -1.5%) vs baseline: +5.1% ✅ get-contextTime: ✅ 82.805ms (SLO: <92.350ms 📉 -10.3%) vs baseline: +0.4% Memory: ✅ 40.266MB (SLO: <46.500MB 📉 -13.4%) vs baseline: +4.9% ✅ is-recordingTime: ✅ 43.998ms (SLO: <44.500ms 🟡 -1.1%) vs baseline: +2.5% Memory: ✅ 44.597MB (SLO: <47.500MB -6.1%) vs baseline: +4.6% ✅ record-exceptionTime: ✅ 61.872ms (SLO: <67.650ms -8.5%) vs baseline: ~same Memory: ✅ 40.616MB (SLO: <47.000MB 📉 -13.6%) vs baseline: +5.0% ✅ set-statusTime: ✅ 48.917ms (SLO: <50.400ms -2.9%) vs baseline: +0.5% Memory: ✅ 44.624MB (SLO: <47.000MB -5.1%) vs baseline: +5.0% ✅ startTime: ✅ 42.190ms (SLO: <43.450ms -2.9%) vs baseline: +0.3% Memory: ✅ 44.613MB (SLO: <47.000MB -5.1%) vs baseline: +4.8% ✅ start-finishTime: ✅ 83.159ms (SLO: <88.000ms -5.5%) vs baseline: +0.6% Memory: ✅ 34.583MB (SLO: <46.500MB 📉 -25.6%) vs baseline: +4.9% ✅ start-finish-telemetryTime: ✅ 84.563ms (SLO: <89.000ms -5.0%) vs baseline: +0.2% Memory: ✅ 34.583MB (SLO: <46.500MB 📉 -25.6%) vs baseline: +4.9% ✅ update-nameTime: ✅ 44.186ms (SLO: <45.150ms -2.1%) vs baseline: +0.2% Memory: ✅ 44.907MB (SLO: <47.000MB -4.5%) vs baseline: +4.7%
|
55efed1
to
6d84ac3
Compare
b18aed2
to
5d2fe92
Compare
5d2fe92
to
85fac4f
Compare
riotfile.py
Outdated
# Library not found or ctypes issue - try next name | ||
continue | ||
|
||
return [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to just specify env var for different allocators and run that as part of the existing venvs? and only for memory allocators?
This would result in running all profiling tests with different allocators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, let me check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@taegyunkim , fixed the code to test different allocators for test_memalloc only
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
bac6863
to
f35963d
Compare
f35963d
to
dbf4702
Compare
03fc03a
to
ae91fe9
Compare
riotfile.py
Outdated
jemalloc_names = [ | ||
"libjemalloc.so.2", # Ubuntu/Debian | ||
"libjemalloc.so.1", # Older versions | ||
"libjemalloc.so", # Generic | ||
"libjemalloc.dylib", # macOS (though less common) | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have jemalloc installed on our test container
scripts/ddtest find / -name "libjemalloc.so*"
this returns empty.
If we want to test with jemalloc for real, we'd need to update https://github.com/DataDog/dd-trace-py/blob/main/docker/Dockerfile
this dockerfile to install jemalloc first. Then update the testrunner images to use the updated one by updating references to testrunner images
https://github.com/search?q=repo%3ADataDog%2Fdd-trace-py+dd-trace-py%2Ftestrunner%3A+language%3AYAML&type=code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right - I have a note about that in the PR summary.
| note: there are perm issues for Docker containers setup affecting testing of jemalloc. Adding a defensive try/catch strategy to determine if it's installed. I will try to figure out Docker perm issues to install the lib in a follow up PR.
We should do this in a followup PR, as this one focuses on the default allocators for proof of concept.
I could potentially remove the part for testing jemalloc
, and reintroduce them in the new PR.
riotfile.py
Outdated
# Library not found or ctypes issue - try next name | ||
continue | ||
|
||
return [ |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
f2537cd
to
18d9fe1
Compare
riotfile.py
Outdated
name="profile-v2-memalloc", | ||
command="python -m pytest {cmdargs} tests/profiling_v2/test_memalloc.py", | ||
pys=select_pys(), | ||
pkgs=_profile_v2_pkgs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to DRY/redefine these here, since you are a child of the main profile-v2
venv, you'll inherit them from your parent.
riotfile.py
Outdated
pkgs=_profile_v2_pkgs, | ||
venvs=[ | ||
Venv( | ||
name="pymalloc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need/want to name these, instead you could just do:
Venv(
name="profile-v2-memalloc",
command="python -m pytest {cmdargs} tests/profiling_v2/test_memalloc.py",
env={
"PYTHONMALLOC": ["pymalloc", "malloc", "malloc_debug", "pymalloc_debug"],
},
),
tests/profiling/suitespec.yml
Outdated
- tests/profiling/suitespec.yml | ||
- tests/profiling_v2/* | ||
pattern: profile-v2$ | ||
pattern: profile-v2.*$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pattern: profile-v2.*$ | |
pattern: profile-v2 |
would be enough
libbz2-dev \ | ||
libenchant-2-dev \ | ||
libffi-dev \ | ||
libjemalloc2 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to update this first before we can run the new tests?
18d9fe1
to
7abee51
Compare
8f73a8b
to
8cec608
Compare
…mg (#14706) [PROF-12490: Running memory allocator tests with different allocators](https://datadoghq.atlassian.net/browse/PROF-12490) ## Description In order to test memalloc with `jemalloc` lib, we need to update the testrunner's Docker image with this lib. [14659](#14659) ## Testing signals green ## Risks n/a ## Additional Notes one more dependency in the image
b5bb4a2
to
4004119
Compare
4004119
to
df532e4
Compare
df532e4
to
afd4f10
Compare
PROF-12490: Running memory allocator tests with different allocators
Description
Some clients might be using non-default allocator types in their Python code. The goal of this PR is to ensure that existing memory-related tests are working for (a) all standard allocator types and (b)
jemalloc
, a powerful implementation optimized to avoid memory fragmentation, and providing scalable concurrency, particularly in multithreaded applications.Allocators tested
note: there are perm issues for Docker containers setup affecting testing of
jemalloc
. Adding a defensive try/catch strategy to determine if it's installed. I will try to figure out Docker perm issues to install the lib in a follow up PR.Testing
test_memory_collector*
testsRisks
None
Additional Notes
Changes