Skip to content

test: enforce distribution is no longer inserting the coalesce #14949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

wiedld
Copy link
Contributor

@wiedld wiedld commented Feb 28, 2025

In a previous reproducer, I showed how the enforce distribution was inserting the coalesce.
See here: influxdata#58 (comment)

On the latest main, we no longer have the coalesce being inserted. This WIP demonstrates that the original bug no longer exists.

@github-actions github-actions bot added the core Core DataFusion crate label Feb 28, 2025
Comment on lines +3316 to +3342
/// Same starting plan as [`repartitions_for_aggregate_after_sorted_union`], but adds a projection
/// between the union and aggregate. This change the outcome:
///
/// * we no longer get repartitioning, and instead get coalescing.
#[test]
fn coalesces_for_aggregate_after_sorted_union_projection() {
let plan = aggregate_over_sorted_union_projection(vec![parquet_exec_with_stats(); 2]);

// Starting: has expected distribution error
let err = "does not satisfy distribution requirements: SinglePartition. Child-0 output partitioning: UnknownPartitioning(2)";
assert_sanity_check_err(&plan, err);

// Outcome:
// * adds an SPM->Repartition before the first aggregation stage.
// * adds a Repartition->Sort before the second aggregation stage.
let expected_after_first_run = &[
"AggregateExec: mode=FinalPartitioned, gby=[a1@0 as a1], aggr=[], ordering_mode=Sorted",
" SortExec: expr=[a1@0 ASC NULLS LAST], preserve_partitioning=[true]",
" RepartitionExec: partitioning=Hash([a1@0], 10), input_partitions=10",
" AggregateExec: mode=Partial, gby=[a@0 as a1], aggr=[], ordering_mode=Sorted",
" RepartitionExec: partitioning=RoundRobinBatch(10), input_partitions=1",
" SortPreservingMergeExec: [a@0 ASC]",
" SortExec: expr=[a@0 ASC], preserve_partitioning=[true]",
" ProjectionExec: expr=[a@0 as a, b@1 as value]",
" UnionExec",
" DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet",
" DataSourceExec: file_groups={1 group: [[x]]}, projection=[a, b, c, d, e], file_type=parquet",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer inserts the coalesce.

@wiedld
Copy link
Contributor Author

wiedld commented Feb 28, 2025

Example WIP only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant