Documentation: Plan custom expressions #15353
Thank you so much @Jiashu-Hu -- this looks great.
I have some suggestions on how to improve this section and example, but we could also do it as a follow-on PR.
It is great to see the docs being improved.
@@ -1160,6 +1160,89 @@ async fn main() -> Result<()> {
// +---+
```

## Custom Expression Planning

DataFusion provides native support for a limited set of SQL operators by default. For operators not natively defined, developers can extend DataFusion's functionality by implementing custom expression planning. This extensibility is a core feature of DataFusion, allowing it to be customized for particular workloads and requirements.
Here are some wordsmithing suggestions:
DataFusion provides native support for common SQL operators by default, such as `+`, `-`, and `||`. However, it does not provide support for other operators such as `@>`. To override DataFusion's default handling or to support unsupported operators, developers can extend DataFusion by implementing custom expression planning, a core feature of DataFusion.
1. Implement the `ExprPlanner` trait: This allows you to define custom logic for planning expressions that DataFusion doesn't natively recognize. The trait provides the necessary interface to translate logical expressions into physical execution plans.

For a detailed documentation please see: [Trait ExprPlanner](https://docs.rs/datafusion/latest/datafusion/logical_expr/planner/trait.ExprPlanner.html)
Suggested change:
For detailed documentation please see: [Trait ExprPlanner](https://docs.rs/datafusion/latest/datafusion/logical_expr/planner/trait.ExprPlanner.html)
To extend DataFusion with support for custom operators not natively available, you need to:

1. Implement the `ExprPlanner` trait: This allows you to define custom logic for planning expressions that DataFusion doesn't natively recognize. The trait provides the necessary interface to translate logical expressions into physical execution plans.
I think `ExprPlanner` actually converts the SQL AST to `Expr`s:
1. Implement the `ExprPlanner` trait: This allows you to define custom logic for planning expressions that DataFusion doesn't natively recognize. The trait provides the necessary interface to translate SQL AST nodes into logical `Expr`.
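To make the distinction concrete, here is a minimal, dependency-free sketch of what "translating a SQL AST node into a logical expression" can look like. Every name in it (`SqlOp`, `Expr`, `RawBinary`, `Planned`, `Planner`, `ArrowAsConcat`, `DefaultPlanner`, `plan`) is a hypothetical stand-in invented for illustration, not DataFusion's API; it only models the idea that each planner either produces a logical expression or hands the raw node back so the next planner can try.

```rust
// Hypothetical, simplified stand-ins. These are NOT the DataFusion types.

/// Operators as they might appear in a parsed SQL AST.
#[derive(Debug, Clone, PartialEq)]
enum SqlOp {
    Plus,    // `+`
    Arrow,   // `->`
    AtArrow, // `@>`
}

/// A tiny logical expression language.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(String),
    Add(Box<Expr>, Box<Expr>),
    Concat(Box<Expr>, Box<Expr>),
}

/// A binary operation whose operands are planned but whose operator is not.
#[derive(Debug, Clone, PartialEq)]
struct RawBinary {
    op: SqlOp,
    left: Expr,
    right: Expr,
}

/// Outcome of asking one planner to handle a raw node.
#[derive(Debug, PartialEq)]
enum Planned {
    /// The planner produced a logical expression.
    Done(Expr),
    /// The planner declined; the raw node is returned unchanged.
    Original(RawBinary),
}

trait Planner {
    fn plan_binary_op(&self, raw: RawBinary) -> Planned;
}

/// Custom planner: treats `->` as string concatenation.
struct ArrowAsConcat;

impl Planner for ArrowAsConcat {
    fn plan_binary_op(&self, raw: RawBinary) -> Planned {
        match raw.op {
            SqlOp::Arrow => {
                Planned::Done(Expr::Concat(Box::new(raw.left), Box::new(raw.right)))
            }
            _ => Planned::Original(raw),
        }
    }
}

/// Built-in planner: only knows `+`.
struct DefaultPlanner;

impl Planner for DefaultPlanner {
    fn plan_binary_op(&self, raw: RawBinary) -> Planned {
        match raw.op {
            SqlOp::Plus => Planned::Done(Expr::Add(Box::new(raw.left), Box::new(raw.right))),
            _ => Planned::Original(raw),
        }
    }
}

/// Try each registered planner in order; fail if none handles the operator.
fn plan(planners: &[Box<dyn Planner>], raw: RawBinary) -> Result<Expr, String> {
    let mut raw = raw;
    for planner in planners {
        match planner.plan_binary_op(raw) {
            Planned::Done(expr) => return Ok(expr),
            Planned::Original(unhandled) => raw = unhandled,
        }
    }
    Err(format!("no planner handles operator {:?}", raw.op))
}
```

In this sketch, calling `plan` with `ArrowAsConcat` listed before `DefaultPlanner` turns `'foo' -> 'bar'` into a `Concat` expression, while an operator that no planner claims (here `@>`) surfaces as an error. Note that nothing here touches physical execution, which is the point of the wording change suggested above.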
2. Register your custom planner: Integrate your implementation with DataFusion's `SessionContext` to ensure your custom planning logic is invoked during the query optimization and execution planning phase.

For a detailed documentation please see: [fn register_expr_planner](https://docs.rs/datafusion/latest/datafusion/execution/trait.FunctionRegistry.html#method.register_expr_planner)
Suggested change:
For a detailed documentation see: [fn register_expr_planner](https://docs.rs/datafusion/latest/datafusion/execution/trait.FunctionRegistry.html#method.register_expr_planner)
# // Define the custom planner
# struct MyCustomPlanner;

// Implement ExprPlanner for cutom operator logic
Suggested change:
// Implement ExprPlanner to add support for the `->` custom operator
ctx.register_expr_planner(Arc::new(MyCustomPlanner))?;
let results = ctx.sql("select 'foo'->'bar';").await?.collect().await?;

pretty::print_batches(&results)?;
Could you please change this to use `assert_batches_eq!` so the actual output is in the test and it is tested in CI?
Which issue does this PR close?
Rationale for this change
This PR adds documentation for using the `ExprPlanner` API to plan custom expressions, as requested in #15267. Clear documentation with an example will help users extend DataFusion to support custom operators like PostgreSQL's `->`.
What changes are included in this PR?
- Added a `## Custom Expression Planning` section to document what `ExprPlanner` does and why it's useful.
- Added an example that implements `ExprPlanner` to make the `->` operator concatenate strings (e.g., `'foo'->'bar'` outputs `foobar`).
- Added links to the API docs (`ExprPlanner` and `FunctionRegistry`) for further reading.
Are these changes tested?
Yes.
Are there any user-facing changes?
Yes, since this is a documentation change.