Remove CoalescePartitions insertion from HashJoinExec #15476
Conversation
This backs out commit 4346cb8.
Good catch
great, it works :) thank you @ctsk
Note that this does break for users of HashJoinExec that

Thanks @ctsk, what exactly is broken?
Before this PR, if someone hand-wired a CollectLeft HashJoin where the left child has more than one output partition, the HashJoin would automatically add a CoalescePartitions exec. This behaviour never triggers for plans constructed by DataFusion, because the EnforceDistribution pass makes sure that CoalescePartitions exists. After this PR, if someone hand-wires a CollectLeft HashJoin, does not use the EnforceDistribution pass, and provides a left child that has more than 1 output partition, the resulting plan, when executed, will return a wrong result (because the hash table will only be built from partition 0).

Now that I've written it out, I believe it is strictly better to return an internal_err! if one constructs such a plan and tries to execute it, rather than return a wrong result.
I've amended the PR so that
Thanks @ctsk |
This PR appears to have caused CI failures, so @goldmedal has opened a PR to revert it:
Sorry about that! Thanks for tracking it down @goldmedal. |
It can be reproduced locally with the following steps:
Alright, so what went wrong here is that for CollectLeft joins, the left ExecutionPlan gets executed for every input partition. This was not caught by the tests 2 weeks ago, because most plans can have a partition executed multiple times - but RepartitionExec cannot. I suspect there is no tested case where a CollectLeft join is used with a RepartitionExec under the build side.
As the signature of execute() is not self-consuming, I believe all execute() implementations should be idempotent. If RepartitionExec breaks this, should we create a ticket for it? Even if this case is never generated in tests, we should be prepared for it.

The issue is that CollectLeft should now always receive only 1 partition, enforced by EnforceDistribution, right? Then we can re-apply these changes.
I typo'd there - I meant to say for every output partition.
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?