Skip to content
This repository has been archived by the owner on Sep 28, 2021. It is now read-only.

Support Dynamic expression evaluation #242

Closed
alex-dukhno opened this issue Aug 22, 2020 · 13 comments · Fixed by #258
Closed

Support Dynamic expression evaluation #242

alex-dukhno opened this issue Aug 22, 2020 · 13 comments · Fixed by #258
Assignees
Labels
C-Query Engine Execution SQL Query Engine P-High High priority T-Enhancement New feature or request

Comments

@alex-dukhno
Copy link
Owner

alex-dukhno commented Aug 22, 2020

Right now ExpressionEvaluation supports only static values evaluation e.g. 4 + 2 will be evaluated into 6
However, the case with column names is not covered.
E.g.

update schema_name.table_name set col1 = col1 + 2;

Also, it requires to fold static expression as much as possible, e.g.

update schema_name.table_name set col1 = (2 + 3) * col1 + 2;

should become equivalent to

update schema_name.table_name set col1 = 5 * col1 + 2;
@alex-dukhno alex-dukhno added C-Query Engine Execution SQL Query Engine P-High High priority T-Enhancement New feature or request labels Aug 22, 2020
@AndrewBregger
Copy link
Contributor

@alex-dukhno can you assign this issue to me? Thanks.

@AndrewBregger
Copy link
Contributor

Is this issue primarily meant for update query and then extend the implementation to other queries later? Or generalized implementation now?

@alex-dukhno
Copy link
Owner Author

I think ExpressionEvaluation should be extended. Currently, InsertCommand and UpdatedCommand use it.
AFAIR, InsertCommand uses it for evaluating expressions similar to insert into table values (3 + 5) and UpdateCommand uses it for update table set col_1 = 4 * 2
So by making ExpressionEvaluation support aliases, column names it could be used in other places like where clauses also in select 1 + col_1 from table

@alex-dukhno
Copy link
Owner Author

I've looked through variants of Expr enum in sqlparser https://github.com/ballista-compute/sqlparser-rs/blob/main/src/ast/mod.rs#L149.
Let's start with evaluating BinaryOp, UnaryOp and Identifier(Ident), Wildcard, QualifiedWildcard(Vec<Ident>), and CompoundIdentifier(Vec<Ident>).

@alex-dukhno
Copy link
Owner Author

@AndrewBregger uh, I notice that ExpressionEvaluation does not evaluate expression like 2 > 3, so it misses functionality with logical and bitwise operations, probably it is better to start with them?
Also I remembered that some PostgreSQL operations should be added to sqlparser tests are ignored for now (e.g. https://github.com/alex-dukhno/database/blob/master/src/sql_engine/src/tests/insert.rs#L410)
Guys, seems like ok to add them to crate from their response apache/datafusion-sqlparser-rs#7 (comment)

@AndrewBregger
Copy link
Contributor

The current limitation of ExpressionEvaluation is that it doesn't know anything about the rest of the query (i.e. the tables referenced). I am thinking that the source tables information can be given to ExpressionEvaluation, the job of this struct would be to evaluate as much of the expression as it can. Using the table information to resolve column accesses to column indices. However, when evaluating the resulting expression during query evaluation a different mechanism will handle it. I have some ideas for how the query can be evaluated which we should discuss at some point.

@alex-dukhno
Copy link
Owner Author

my idea was that whenever ExpressionEvaluation faces any sort of identifier struct "skips" it and continue evaluate what is possible to evaluate at current situation.
I think I wasn't clear what means "Dynamic expression evaluation" - I didn't mean to evaluate column values at this stage, I meant not to fail and handle them as variable values whenever ExpressionEvaluation met one.

@AndrewBregger
Copy link
Contributor

I knew what you meant. I was explaining what I was going to implement and what needed to be added to the current structure to allow for this implementation. Whenever it sees an identifier it will resolve it to a column in a table. If it can't do that then it will fail.

@alex-dukhno
Copy link
Owner Author

I think it is better to introduce another struct to do so it could be either a field of ExpressionEvaluation or uses result of ExpressionEvaluation::eval function.

@AndrewBregger
Copy link
Contributor

AndrewBregger commented Aug 28, 2020

Successful updating with dynamic expressions! Depending on what you think, it could be good enough for this pull request. It is not the final implementation for handling expressions. For the implementation, you can look at my expressions branch.

CREATE SCHEMA
CREATE TABLE
INSERT 0 7
=> select * from test1.foo;
 a | b
---+---
 1 | 2
 2 | 3
 4 | 4
 5 | 4
 6 | 4
 7 | 4
 8 | 4
(7 rows)


=> update test1.foo set a = 2 * a;
UPDATE 7
r=> select * from test1.foo;
 a  | b
----+---
  2 | 2
  4 | 3
  8 | 4
 10 | 4
 12 | 4
 14 | 4
 16 | 4
(7 rows)

@alex-dukhno
Copy link
Owner Author

I am in favour of a PR.

  1. We'll be sure that feature code is integrated with mainline
  2. Whatever my comments on PR could be discussed and addressed in the follow up PRs or create separate issues

It's always good to see some progress 😄

@AndrewBregger
Copy link
Contributor

The current issue is that it isn't doing type checking on the columns. I'll do that this evening and then make a pull request.

@alex-dukhno
Copy link
Owner Author

If it is complex or takes longer than you expect, create PR and left a comment about that, I think issue could be addressed in the following PRs

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C-Query Engine Execution SQL Query Engine P-High High priority T-Enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants