Bug: applying multiple times EnforceDistribution generates invalid plan
#14150
Comments
Any thoughts? @alamb @andygrove @Dandandan @akurmustafa If you also think it's a bug, I'd be happy to submit a fix.
I agree that this sounds like a bug. Thank you for the report @xudong963 and the offer to fix
This is indeed a bug. The resulting plan should be correct regardless of how many times the rule is applied. Thanks @xudong963 for reporting this.
Thanks for your reply, I will submit a fix next week
There is a fix in #14207, looking forward to your feedback
I added this to the candidate list to fix for the next release:
Describe the bug
For a top-k SQL query:
select * from aggregate_test_100 ORDER BY c13 limit 5;
if EnforceDistribution is applied twice, it generates an invalid plan and produces a wrong result. The root cause is that the fetch of the limit is lost during the second EnforceDistribution pass.
To Reproduce
Here is an example to reproduce
Expected behavior
A valid plan and a correct result should be generated, as the doc says: https://github.com/apache/datafusion/blob/main/datafusion/core/src/physical_optimizer/enforce_distribution.rs#L159-L168
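To illustrate why a lost fetch matters for a top-k query, here is a toy Python model. It is not DataFusion code and does not reproduce the actual invalid plan; the helper names `sort_with_fetch` and `merge_sorted` and the sample data are assumptions made only for this sketch. Each partition is sorted with a local fetch, and the partitions are then merged. If the merge step keeps its fetch, the query returns 5 rows. If the fetch is dropped, every locally kept row leaks into the result.

```python
import heapq

def sort_with_fetch(rows, fetch=None):
    """Sort one partition; keep only the first `fetch` rows if a fetch is set."""
    ordered = sorted(rows)
    return ordered if fetch is None else ordered[:fetch]

def merge_sorted(partitions, fetch=None):
    """Order-preserving merge of sorted partitions, with an optional fetch."""
    merged = list(heapq.merge(*partitions))
    return merged if fetch is None else merged[:fetch]

# Hypothetical data split across three partitions.
partitions = [[9, 1, 5, 7], [4, 8, 2, 6], [3, 0, 11, 10]]

# Per-partition top-k (fetch=5), as a stand-in for a partitioned sort with fetch.
local_topk = [sort_with_fetch(p, fetch=5) for p in partitions]

correct = merge_sorted(local_topk, fetch=5)  # fetch kept on the merge step
broken = merge_sorted(local_topk)            # fetch lost on the merge step

print(correct)      # [0, 1, 2, 3, 4]
print(len(broken))  # 12
```

In this model the rows are still ordered after the fetch is lost, but the limit is no longer enforced, so the output has more rows than `limit 5` allows.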
Additional context
No response