
@andygrove
Member

@andygrove andygrove commented Jun 8, 2025

Which issue does this PR close?

Part of #1254

Closes #1252
Closes #1824

Rationale for this change

The config COMET_SHUFFLE_FALLBACK_TO_COLUMNAR was added as a hack so that we could avoid fixing some failing Spark SQL tests. However, the failing tests are real issues and only show up now because more queries are running natively.

What changes are included in this PR?

This PR removes the hack and updates the diff files to either fix or ignore failing tests. Some of these tests will be enabled later as part of #1254.

  • Remove the hack
  • Update 3.4.3 diffs
  • Update 3.5.4 diffs
  • Update 3.5.5 diffs
  • Update 4.0.0-preview1 diffs

How are these changes tested?

@codecov-commenter

codecov-commenter commented Jun 8, 2025

Codecov Report

Attention: Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Project coverage is 59.38%. Comparing base (f09f8af) to head (97008a4).
Report is 249 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...he/comet/rules/EliminateRedundantTransitions.scala | 0.00% | 1 Missing and 2 partials ⚠️ |
| ...pache/spark/sql/comet/CometColumnarToRowExec.scala | 0.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1865      +/-   ##
============================================
+ Coverage     56.12%   59.38%   +3.26%     
- Complexity      976     1150     +174     
============================================
  Files           119      130      +11     
  Lines         11743    12661     +918     
  Branches       2251     2375     +124     
============================================
+ Hits           6591     7519     +928     
+ Misses         4012     3930      -82     
- Partials       1140     1212      +72     


@andygrove
Member Author

@rluvaton @Kontinuation fyi

@parthchandra
Contributor

For the cases where we were falling back to columnar, do the tests now fail (and get ignored), or are we falling back to Spark?

@andygrove
Member Author

> For the cases where we were falling back to columnar, do the tests now fail (and get ignored), or are we falling back to Spark?

Previously, we were falling back to Spark because we didn't support native shuffle. Now we are falling back to columnar shuffle instead, making more queries run natively, hence uncovering more bugs.
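
To make the before/after behaviour concrete, here is a minimal sketch of the fallback order described above. It is a toy model, not Comet's planner code: `ShuffleMode`, `choose_shuffle`, and the `columnar_fallback` flag are hypothetical names made up for illustration, and the flag is not the removed config option.

```python
from enum import Enum


class ShuffleMode(Enum):
    # Hypothetical labels for the three possible outcomes.
    NATIVE = "native shuffle"
    COLUMNAR = "columnar shuffle"
    SPARK = "Spark shuffle"


def choose_shuffle(native_ok: bool, columnar_ok: bool,
                   columnar_fallback: bool = True) -> ShuffleMode:
    """Pick a shuffle implementation for one exchange.

    native_ok / columnar_ok stand in for whatever support checks the
    real planner performs (assumption). columnar_fallback=False models
    the earlier behaviour of handing the exchange straight back to Spark.
    """
    if native_ok:
        return ShuffleMode.NATIVE
    if columnar_fallback and columnar_ok:
        # Newer behaviour: try columnar shuffle before giving up, so the
        # rest of the query can keep running natively.
        return ShuffleMode.COLUMNAR
    return ShuffleMode.SPARK


if __name__ == "__main__":
    # Same unsupported-native exchange, before and after the change.
    print(choose_shuffle(False, True, columnar_fallback=False).value)
    print(choose_shuffle(False, True).value)
```

In this model the same exchange moves from Spark shuffle to columnar shuffle, which is why more of the surrounding plan ends up running natively and more latent bugs become visible.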

The main remaining bug is related to dynamic partition pruning (DPP) and exchange reuse.

@andygrove
Member Author

Thanks for the reviews @parthchandra and @Kontinuation. I will need to rebase this PR and update the 3.5.6 diff now that #1861 is merged.

@andygrove andygrove merged commit 068b900 into apache:main Jun 9, 2025
85 checks passed
@andygrove andygrove deleted the remove-fallback-to-shuffle-hack branch June 9, 2025 22:01
coderfender pushed a commit to coderfender/datafusion-comet that referenced this pull request Dec 13, 2025