feat(enhancement): Include the e-graph equality saturation framework in the KQIR optimizer. #2832
base: unstable
…KQIR optimizer.
To improve the KQIR optimizer's ability to produce more efficient query plans, this commit introduces an e-graph equality saturation framework. The implementation includes:
- A new e-graph representation for KQIR nodes
- Rewrite rules for query plan transformation
- Equality saturation algorithms for query optimization
- A new optimization pass that integrates with the current PassManager and uses the e-graph framework
Equality saturation can discover equivalent query plans and choose the cheapest one according to a cost model, which gives the optimizer more powerful term rewriting capabilities. By exploring a wider range of equivalent query plans before choosing an execution strategy, this change aims to improve query performance.
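The "choose the cheapest plan based on a cost model" step can be illustrated with a minimal sketch. It is not the code from this commit: the types `PlanNode`, `PlanClass`, and `Choice`, the function `ExtractBest`, the operator names, and all cost numbers are hypothetical, and costs are assumed to be non-negative. Each equivalence class holds alternative plan nodes, and the best cost of a class is the minimum over its nodes of the node's own cost plus the best costs of its child classes.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical, simplified shapes; not the types used in the pull request.
struct PlanNode {
  std::string op;                     // operator name (illustrative only)
  double self_cost;                   // assumed non-negative per-operator cost
  std::vector<std::size_t> children;  // indices of child equivalence classes
};
using PlanClass = std::vector<PlanNode>;  // nodes in one class are equivalent

struct Choice {
  double cost;       // best known total cost for the class
  std::size_t node;  // index of the node achieving that cost
};

// Picks, for every class, the node minimizing self_cost plus the best costs
// of its children. Costs only ever decrease, so the loop reaches a fixpoint.
std::vector<Choice> ExtractBest(const std::vector<PlanClass> &classes) {
  const double kInf = std::numeric_limits<double>::infinity();
  std::vector<Choice> best(classes.size(), Choice{kInf, 0});
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t c = 0; c < classes.size(); ++c) {
      for (std::size_t n = 0; n < classes[c].size(); ++n) {
        double cost = classes[c][n].self_cost;
        for (std::size_t child : classes[c][n].children) {
          assert(child < classes.size());
          cost += best[child].cost;  // stays infinite until the child is costed
        }
        if (cost < best[c].cost) {
          best[c] = Choice{cost, n};
          changed = true;
        }
      }
    }
  }
  return best;
}

// Made-up example: class 0 holds two scans, class 1 holds a filter over
// class 0 and an equivalent scan with the filter pushed into it.
std::vector<PlanClass> DemoClasses() {
  PlanClass scans = {PlanNode{"FullScan", 100.0, {}},
                     PlanNode{"IndexScan", 10.0, {}}};
  PlanClass root = {PlanNode{"Filter", 5.0, {0}},
                    PlanNode{"FilteredIndexScan", 12.0, {}}};
  return {scans, root};
}
```

With these made-up costs, `ExtractBest(DemoClasses())` selects `IndexScan` (cost 10.0) for class 0 and `FilteredIndexScan` (cost 12.0) for class 1, because the filter over the index scan would cost 15.0. A real cost model would need statistics from the query engine rather than constants.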
| @AryanVBW Thanks for your contribution! | 
| 
 Thank you, sir. It’s truly my pleasure to work with such humble people. I always love to contribute. Please let me know if there are any improvements I can make or any changes needed. | 
| @AryanVBW  As I see, clang-lint in CI has some warnings. Could you please run  | 
| Ok | 
Hmm, it seems this is just a skeleton rather than a complete implementation.
It cannot work as is, so I think it will be hard to get it merged. We also need some test cases for it.
| Hey @AryanVBW , this PR looks like a good starting point, but it’s missing some of the key parts needed to be a functional KQIR optimizer. There’s no actual equality saturation algorithm, the rewrite rules don’t do anything yet, the cost model is just a placeholder, and it’s not integrated with Kvrocks' query engine (this can be done later ig). Plus, without tests or benchmarks, we can’t validate whether it actually improves anything. I’d suggest making this a draft PR and continuing work on it, or creating a separate branch where we can properly develop it before merging. To the best of my knowledge, this is going to be much more complex than standard SQL parsing. I’d recommend checking out some existing articles like KQIR: a query engine for Apache Kvrocks to get a better understanding of how query optimization works in Kvrocks. A good next step would be to experiment with combining multiple operations like  A really good resource: https://egraphs-good.github.io/ | 
The search-tests directory contains build artifacts and should not be tracked in version control. This commit removes it to keep the repository clean.
| 
 Yes, sir. I initially started working on it but soon realized that it wasn’t a complete implementation. So, I continued working to complete it properly. Thank you so much, sir, for your review | 
| 
 Thank you, sir. I really appreciate your detailed feedback and guidance. I truly enjoy working on this, and I understand that there’s still a lot to refine. I’ll start by creating a separate branch to continue developing a proper KQIR optimizer, ensuring that key components like equality saturation, rewrite rules, and cost models are implemented correctly. I'll also spend some time reading the recommended materials to learn more about query optimization in Kvrocks. I'm looking forward to gradually improving this. Once again, I appreciate your help! | 
This pull request adds an e-graph data structure and its associated components to the KQIR optimizer. The changes include the implementation of the e-graph itself, equivalence classes, nodes, and rewrite rules, as well as a new equality saturation pass for query optimization.
fix:#2561
Key changes include:
E-Graph Implementation:
- EGraph, EClass, and ENode classes to represent the e-graph, equivalence classes, and nodes respectively. This includes methods for adding nodes, merging classes, and extracting the best query plan based on a cost model. (src/search/passes/egraph.h)
Rewrite Rules:
- Rewrite rules (FilterPushDownRewrite, MergeFilterRewrite, SortPushDownRewrite, FilterMergeRewrite, CommonSubexpressionRewrite) to transform the e-graph and optimize query plans. (src/search/passes/egraph_saturation.h)
Equality Saturation Pass:
- EGraphSaturation class, which applies the rewrite rules to the e-graph until saturation is achieved and extracts the best query plan using a cost model. (src/search/passes/egraph_saturation.h)
These changes collectively enhance the query optimization capabilities of the KQIR optimizer by leveraging e-graph-based equality saturation techniques.
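For readers unfamiliar with the technique, the three pieces listed above (e-graph, rewrite rules, saturation loop) can be sketched in a few dozen lines. This is an illustrative sketch only, not the contents of egraph.h or egraph_saturation.h: `TinyEGraph`, `Node`, `ApplySortIdempotence`, and `Saturate` are hypothetical names, and the single rule shown, Sort(Sort(x)) = Sort(x), assumes every Sort uses the same key. Real rules would usually also add new nodes rather than only merge existing classes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// All names are hypothetical; this is not the code from the pull request.
using Id = std::size_t;

struct Node {
  std::string op;        // operator name, e.g. "Scan" or "Sort"
  std::vector<Id> kids;  // children are e-class ids, not concrete nodes
  bool operator<(const Node &o) const {
    return std::tie(op, kids) < std::tie(o.op, o.kids);
  }
};

struct TinyEGraph {
  std::vector<Id> parent;   // union-find over e-class ids
  std::map<Node, Id> memo;  // canonical node -> e-class (the "hashcons")

  Id Find(Id x) const {
    assert(x < parent.size());
    while (parent[x] != x) x = parent[x];
    return x;
  }

  // Returns the e-class of `n`, creating a fresh class if the node is new.
  Id Add(Node n) {
    for (Id &k : n.kids) k = Find(k);
    auto it = memo.find(n);
    if (it != memo.end()) return Find(it->second);
    Id id = parent.size();
    parent.push_back(id);
    memo.emplace(std::move(n), id);
    return id;
  }

  // Unions two classes; returns false if they were already equal.
  bool Merge(Id a, Id b) {
    a = Find(a);
    b = Find(b);
    if (a == b) return false;
    parent[b] = a;
    return true;
  }

  // Restores congruence: nodes that become identical once their children
  // are canonicalized must share a class. Repeats until nothing changes.
  void Rebuild() {
    bool changed = true;
    while (changed) {
      changed = false;
      std::map<Node, Id> fresh;
      for (const auto &entry : memo) {
        Node n = entry.first;
        for (Id &k : n.kids) k = Find(k);
        auto res = fresh.emplace(std::move(n), Find(entry.second));
        if (!res.second && Merge(res.first->second, entry.second)) changed = true;
      }
      memo = std::move(fresh);
    }
  }
};

// Example rule (assumes one fixed sort key): Sort(Sort(x)) == Sort(x).
// Matches are collected first and applied afterwards, so the graph is not
// mutated while it is being searched.
bool ApplySortIdempotence(TinyEGraph &g) {
  std::vector<std::pair<Id, Id>> matches;
  for (const auto &outer : g.memo) {
    if (outer.first.op != "Sort" || outer.first.kids.size() != 1) continue;
    Id inner_class = g.Find(outer.first.kids[0]);
    for (const auto &inner : g.memo) {
      if (inner.first.op == "Sort" && g.Find(inner.second) == inner_class) {
        matches.emplace_back(outer.second, inner.second);
      }
    }
  }
  bool changed = false;
  for (const auto &m : matches) changed = g.Merge(m.first, m.second) || changed;
  if (changed) g.Rebuild();
  return changed;
}

// Applies the rule until it stops changing the graph or the budget runs out.
// Returns the number of rounds that changed something.
std::size_t Saturate(TinyEGraph &g, std::size_t max_iters) {
  std::size_t rounds = 0;
  while (rounds < max_iters && ApplySortIdempotence(g)) ++rounds;
  return rounds;
}
```

As a usage example, adding `Scan`, then three nested `Sort` nodes on top of it, and calling `Saturate(g, 10)` leaves all three `Sort` nodes in one e-class that is still distinct from the class of `Scan`. The iteration budget matters in practice because some rule sets never reach saturation.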
@git-hulk @aleksraiden
@PragmaTwice