Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: convert sql expression into proto extended expressions #191

Merged

Conversation

davisusanibar
Copy link
Contributor

@davisusanibar davisusanibar commented Oct 24, 2023

The proposal involves receiving a SQL expression (e.g. column_a > 18), validating the SqlNode expression, and then converting it into RexNode. It then traverses RexNode to validate the type of expression, and finally creates a proto Extended Expression message.

These are the activities involved on this PR:

  • The objective of this PoC is to convert a SQL Expression into an Extended Expression by using the RexNode convertExpression(...) method and hardcoding some components of the logic. This draft PR cover that part.

These are the activities pending to solve on this PR:

Please let me know if you have any ideas, improvements, or concerns so that I can evaluate them appropriately. I appreciate your assistance in advance.

To closes: #128

Copy link
Contributor

@vibhatha vibhatha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davisusanibar Nice work, I took an initial look at the code.

@davisusanibar
Copy link
Contributor Author

The PR is ready for review. It therefore depends on #206

@vbarua
Copy link
Member

vbarua commented Dec 20, 2023

Additionally please if you could give some reason of the current error on the CI build

I'm not entirely sure either unfortunately. In some case Jabel doesn't seem to fire correctly and it doesn't transform code that it should be. It happens more than I would like, see #195

I tried making a small change in 3835a4f to retry the build, which sometimes works, but apparently not in this case. If the CI check is failing after we've approved the PR we can figure it out then.

Comment on lines 98 to 99
testImplementation("org.apache.arrow:arrow-dataset:${ARROW_VERSION}")
testImplementation("org.apache.arrow:arrow-memory-netty:${ARROW_VERSION}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great integration test, but I think it would be better if we moved it to the Arrow repo instead of adding Arrow as a dependency to Isthmus. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let me delete that part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just deleted

Comment on lines 116 to 125
SqlToRelConverter createSqlToRelConverter(
SqlValidator validator, CalciteCatalogReader catalogReader) {
return new SqlToRelConverter(
null,
validator,
catalogReader,
relOptCluster,
StandardConvertletTable.INSTANCE,
converterConfig);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:
Do we need this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just asking because it is merely creating an instance of SqlToRelConverter but I feel like the best would be to have something like SqlToRelConverter.of(validator, catalogReader) or inline the function.
Don't have a very strong opinion to do this change in this PR itself, maybe we could do similar tasks in a follow up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved to the inline method. Thank you.

Copy link
Contributor

@vibhatha vibhatha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davisusanibar added a few comments and a few questions for my curiosity.
Nice work!!!

@vibhatha
Copy link
Contributor

Additionally please if you could give some reason of the current error on the CI build

I'm not entirely sure either unfortunately. In some case Jabel doesn't seem to fire correctly and it doesn't transform code that it should be. It happens more than I would like, see #195

I tried making a small change in 3835a4f to retry the build, which sometimes works, but apparently not in this case. If the CI check is failing after we've approved the PR we can figure it out then.

@vbarua would be a best practice to keep the usage of higher SDK features minimal and stick to basics at most cases?

@davisusanibar
Copy link
Contributor Author

Thank you @vbarua / @vibhatha / @danepitkin for your comments/suggestion.

Please if you could help me with a new code review.

PD: PR related to support qualified path on Calcite SQL Expression was created.

@vibhatha
Copy link
Contributor

PD: PR related to support qualified path on Calcite SQL Expression was created.
🎆 👍

Copy link
Contributor

@vibhatha vibhatha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davisusanibar I added a few comments. Overall the PR LGTM!

Comment on lines 116 to 125
SqlToRelConverter createSqlToRelConverter(
SqlValidator validator, CalciteCatalogReader catalogReader) {
return new SqlToRelConverter(
null,
validator,
catalogReader,
relOptCluster,
StandardConvertletTable.INSTANCE,
converterConfig);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just asking because it is merely creating an instance of SqlToRelConverter but I feel like the best would be to have something like SqlToRelConverter.of(validator, catalogReader) or inline the function.
Don't have a very strong opinion to do this change in this PR itself, maybe we could do similar tasks in a follow up PR.

isthmus/src/test/resources/tpch/schema_error.sql Outdated Show resolved Hide resolved
@vibhatha
Copy link
Contributor

@vbarua should we cover more ground on tests? Not necessarily for this PR itself but for the sustainability in the longer term?

Copy link
Contributor

@vibhatha vibhatha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davisusanibar, PR LGTM!

Copy link
Member

@vbarua vbarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead a pushed a commit to remove the parquet related data and config changes as they weren't needed anymore.

Thanks for adding the duplicate column check, and thanks for working on this. Changes look good!

I will merge this after CI passes.

@vbarua vbarua merged commit 750220e into substrait-io:main Jan 18, 2024
8 checks passed
ajegou pushed a commit to ajegou/substrait-java that referenced this pull request Mar 29, 2024
…ssions (substrait-io#191)

Introduces the SqlExpressionToSubstrait class for converting SQL expression to Substrait

---------

Co-authored-by: Dane Pitkin <[email protected]>
Co-authored-by: Victor Barua <[email protected]>
Co-authored-by: Vibhatha Lakmal Abeykoon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Isthmus] Convert SQL expression to Substrait ExtendedExpression
4 participants