chore: Add memory reservation debug logging and visualization #2521

andygrove · 2025-10-03T15:19:14Z

Which issue does this PR close?

Closes #.

Rationale for this change

Debugging. Sample output from the new memory pool logging:

[Task 486] MemoryPool[ExternalSorter[6]].try_grow(256232960) returning Ok
[Task 486] MemoryPool[ExternalSorter[6]].try_grow(256375168) returning Ok
[Task 486] MemoryPool[ExternalSorter[6]].try_grow(256899456) returning Ok
[Task 486] MemoryPool[ExternalSorter[6]].try_grow(257296128) returning Ok
[Task 486] MemoryPool[ExternalSorter[6]].try_grow(257820416) returning Err
[Task 486] MemoryPool[ExternalSorterMerge[6]].shrink(10485760)
[Task 486] MemoryPool[ExternalSorter[6]].shrink(150464)
[Task 486] MemoryPool[ExternalSorter[6]].shrink(146688)
[Task 486] MemoryPool[ExternalSorter[6]].shrink(137856)
[Task 486] MemoryPool[ExternalSorter[6]].shrink(141952)
[Task 486] MemoryPool[ExternalSorterMerge[6]].try_grow(0) returning Ok
[Task 486] MemoryPool[ExternalSorterMerge[6]].try_grow(0) returning Ok
[Task 486] MemoryPool[ExternalSorter[6]].shrink(524288)
[Task 486] MemoryPool[ExternalSorterMerge[6]].try_grow(0) returning Ok
[Task 486] MemoryPool[ExternalSorterMerge[6]].try_grow(68928) returning Ok

From this, we can make pretty charts to help with comprehension:
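As a rough illustration of the first step toward such charts, the log lines above could be turned into CSV rows along the following lines. This is only a sketch, not the script added in this PR: the regular expression, the function names `parse_line` and `log_to_csv`, and the CSV column names are all assumptions made for this example, based solely on the line format shown in the sample.

```python
import csv
import io
import re

# Matches lines shaped like the sample above, for example:
#   [Task 486] MemoryPool[ExternalSorter[6]].try_grow(256232960) returning Ok
#   [Task 486] MemoryPool[ExternalSorter[6]].shrink(150464)
# The pattern is an assumption derived from the sample, not from the PR code.
LINE_RE = re.compile(
    r"\[Task (?P<task>\d+)\] MemoryPool\[(?P<consumer>.+)\]"
    r"\.(?P<method>\w+)\((?P<size>\d+)\)"
    r"(?: returning (?P<result>\w+))?"
)

FIELDS = ["task", "consumer", "method", "size", "result"]


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LINE_RE.search(line)  # search() tolerates a logger prefix before "[Task"
    if m is None:
        return None
    return {
        "task": int(m.group("task")),
        "consumer": m.group("consumer"),
        "method": m.group("method"),
        "size": int(m.group("size")),
        # shrink() lines carry no "returning ..." suffix, so default to "".
        "result": m.group("result") or "",
    }


def log_to_csv(lines):
    """Convert an iterable of log lines to CSV text, skipping unrelated lines."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            writer.writerow(record)
    return out.getvalue()
```

Using `search` rather than `match` is a deliberate choice here so that lines with a timestamp or log-level prefix are still picked up.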

What changes are included in this PR?

Add new config spark.comet.debug.memory
Add new LoggingPool that is enabled when the new config is set

How are these changes tested?

andygrove · 2025-10-03T17:40:13Z

native/core/src/execution/jni_api.rs

-    debug_native: jboolean,
-    explain_native: jboolean,
-    tracing_enabled: jboolean,


andygrove: Rather than adding yet another flag to this API call, I am now using the Spark config map that is already available in native code.

parthchandra: +1. The config map should be the preferred method.

codecov-commenter · 2025-10-03T17:53:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.93%. Comparing base (f09f8af) to head (2884ed3).
⚠️ Report is 585 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2521      +/-   ##
============================================
+ Coverage     56.12%   58.93%   +2.80%     
- Complexity      976     1449     +473     
============================================
  Files           119      147      +28     
  Lines         11743    13649    +1906     
  Branches       2251     2369     +118     
============================================
+ Hits           6591     8044    +1453     
- Misses         4012     4382     +370     
- Partials       1140     1223      +83


comphead · 2025-10-03T18:09:15Z

native/core/src/execution/memory_pools/logging_pool.rs

+
+impl MemoryPool for LoggingPool {
+    fn grow(&self, reservation: &MemoryReservation, additional: usize) {
+        println!(


comphead: Should this println! be info! or trace!?

andygrove: I guess info! would be ok. I pushed that change. If we use trace!, then we would have to set spark.comet.debug.memory=true and also configure trace logging for this one file, which seems like overkill for a debug feature.

spark/src/main/scala/org/apache/comet/CometExecIterator.scala

andygrove · 2025-10-03T20:53:22Z

moving to draft while I work on the Python scripts

andygrove · 2025-10-03T21:16:30Z

Still experimenting...


andygrove · 2025-10-03T21:54:21Z

Chart now shows when try_grow failed:
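One way to derive the data behind such a chart is to replay the log and keep a running total per consumer, recording each failed `try_grow` separately so it can be marked on the plot. The sketch below is an illustration only, not the script from this PR. It assumes that the number in each log line is the amount requested in that call (a delta), which the source does not state explicitly, and the names `EVENT_RE` and `replay` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the sample log in the PR description.
EVENT_RE = re.compile(
    r"\[Task (?P<task>\d+)\] MemoryPool\[(?P<consumer>.+)\]"
    r"\.(?P<method>\w+)\((?P<size>\d+)\)"
    r"(?: returning (?P<result>\w+))?"
)


def replay(lines):
    """Replay log lines and return (series, failures).

    series:   consumer -> list of (line_index, reserved_bytes) after each event
    failures: list of (line_index, consumer, requested_bytes) for failed grows
    """
    reserved = defaultdict(int)
    series = defaultdict(list)
    failures = []
    for index, line in enumerate(lines):
        m = EVENT_RE.search(line)
        if m is None:
            continue  # not a memory pool log line
        consumer = m.group("consumer")
        method = m.group("method")
        size = int(m.group("size"))
        if method in ("grow", "try_grow"):
            if m.group("result") == "Err":
                # Assumption: a failed try_grow leaves the reservation unchanged.
                failures.append((index, consumer, size))
            else:
                reserved[consumer] += size
        elif method == "shrink":
            # Clamp at zero because a partial log may miss earlier grow calls.
            reserved[consumer] = max(0, reserved[consumer] - size)
        series[consumer].append((index, reserved[consumer]))
    return dict(series), failures
```

The `series` values can be passed to any plotting library as one line per consumer, with `failures` drawn as markers at the corresponding line indices.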

andygrove added 5 commits August 22, 2025 10:10

Access Spark configs from native code

7252605

code cleanup

d084cfa

revert

4837935

debug

ad9c9b8

use df release

f3bb412

andygrove changed the title ~~chore: Add memory pool trace logging [WIP]~~ chore: Add memory pool trace logging [WIP] [skip-ci] Oct 3, 2025

andygrove changed the title ~~chore: Add memory pool trace logging [WIP] [skip-ci]~~ chore: Add memory pool trace logging [WIP] [skip ci] Oct 3, 2025

andygrove added 2 commits October 3, 2025 09:36

cargo update

13f14d3

[skip ci]

78f5b4f

andygrove changed the title ~~chore: Add memory pool trace logging [WIP] [skip ci]~~ chore: Add memory pool trace logging [WIP] Oct 3, 2025

andygrove added 10 commits October 3, 2025 09:40

merge other PR [skip-ci]

5a39d3b

save [skip-ci]

dc11515

[skip ci]

d2a1ab1

save [skip ci]

31cdbc6

Merge remote-tracking branch 'apache/main' into debug-mem

ffb1f71

info logging

322b4c5

log task id [skip ci]

89e10ac

println

3b191fd

revert lock file

7c24836

prep for review

405f5b7

andygrove marked this pull request as ready for review October 3, 2025 17:36

andygrove mentioned this pull request Oct 3, 2025

chore: Access Spark configs from native code #2219

Closed

andygrove changed the title ~~chore: Add memory pool trace logging [WIP]~~ chore: Add memory pool trace logging Oct 3, 2025

andygrove requested review from comphead and parthchandra October 3, 2025 17:37

save

522238d

andygrove commented Oct 3, 2025

comphead reviewed Oct 3, 2025

spark/src/main/scala/org/apache/comet/CometExecIterator.scala (outdated)

andygrove added 3 commits October 3, 2025 14:18

revert

df69875

add Python script to convert log to csv

ad891a0

Python script to generate chart

8756256

andygrove marked this pull request as draft October 3, 2025 20:53

andygrove added 2 commits October 3, 2025 15:00

scripts

7eb1bc1

new script

21bd386

andygrove added 5 commits October 3, 2025 15:24

show err

ec823c2

save

a66fa65

Merge branch 'debug-mem' of github.com:andygrove/datafusion-comet into debug-mem

12db37f

track errors

2fb336e

format

706f5e7

andygrove mentioned this pull request Oct 3, 2025

Add ability to log all interactions with a memory pool, by consumer apache/datafusion#17901

Open

andygrove marked this pull request as ready for review October 3, 2025 21:59

andygrove added 3 commits October 3, 2025 16:13

ASF header

4faf881

add brief docs

d91abda

docs

f6128b5

andygrove changed the title ~~chore: Add memory pool trace logging~~ chore: Add memory reservation debug logging and visualization Oct 4, 2025

andygrove added 6 commits October 5, 2025 13:00

fix

7d40ac2

cargo fmt

c495897

upmerge

06814b7

format

e51751f

upmerge

75e727f

fix regression

e844287

andygrove mentioned this pull request Oct 10, 2025

chore: Pass Comet configs to native createPlan #2543

Merged

andygrove marked this pull request as draft October 10, 2025 09:20

upmerge

2884ed3

andygrove marked this pull request as ready for review October 10, 2025 15:42
