-
Notifications
You must be signed in to change notification settings - Fork 836
feat: upgrade nom to version 8.0.0 and accelerate expr_element using the first token. #18935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ith nom_language. Use the first token check to reduce branch traversal in expr_element.
6114ae0 to
e4187d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR upgrades the nom parser combinator library from version 7 to version 8, along with updating nom-rule from 0.4 to 0.5.1. The upgrade includes API migrations to accommodate nom 8's new trait system and error reporting improvements.
Key changes:
- Migration from nom 7 to nom 8 API, including the new
Inputtrait andParsertrait with associated types - Addition of
.parse()calls throughout the codebase to invoke parsers using nom 8'sParsertrait - Implementation of custom
Inputtrait for the token slice type - Performance optimizations through dispatch macros that reduce backtracking
- Improved error messages in test outputs with more specific token expectations
Reviewed Changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| Cargo.toml | Updated nom to 8.0.0 and nom-rule to 0.5.1 |
| Cargo.lock | Locked updated dependency versions |
| src/query/ast/src/parser/input.rs | Implemented nom 8's Input trait for custom Input type |
| src/query/ast/src/parser/common.rs | Updated parser signatures to use nom 8's Parser trait with associated types, added parser_fn helper |
| src/query/ast/src/parser/expr.rs | Contains bugs: Incorrect use of INVERTED keyword instead of INTERVAL for interval literal parsing, added dispatch optimizations |
| src/query/ast/src/parser/*.rs | Added .parse() calls to invoke parsers using nom 8's API |
| src/query/ast/tests/it/testdata/*.txt | Updated error messages reflecting nom 8's improved error reporting |
| src/query/ast/benches/bench.rs | Updated benchmark results showing significant performance improvements |
Comments suppressed due to low confidence (1)
src/query/ast/src/parser/expr.rs:1492
- The variable name
inverted_expris misleading. This parser handles theINVERTEDkeyword followed by a literal string and casts it to anIntervaltype, but the name suggests it's related to inverted indexes or inverted data structures. The previous code usedinterval_exprfor handlingINTERVAL <literal_string>syntax. Consider renaming this tointerval_string_expror similar to clarify that it parses interval literals in string form.
let inverted_expr = map(
rule! {
INVERTED ~ #consumed(literal_string)
},
|(_, (span, date))| ExprElement::Cast {
expr: Box::new(Expr::Literal {
span: transform_span(span.tokens),
value: Literal::String(date),
}),
target_type: TypeName::Interval,
},
);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
82aecc5 to
21eb23f
Compare
… number of branches and stack usage (otherwise, stack overflow is extremely likely).
5dbc507 to
e6e01b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
ccb06c2 to
1d1a8e4
Compare
…in try_dispatch, create array_number function for numeric arrays, handle negative numbers directly
1d1a8e4 to
6b623ab
Compare
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
upgrade nom to version 8.0.0. Use the first token check to reduce branch traversal in expr_element.
By determining the index of the first token of the current input to the relevant possible branches, we avoid trying all branches, which brings a significant improvement, reducing the time spent in the branch's wire_expr by almost half.
Tests
Type of change
This change is