-
Notifications
You must be signed in to change notification settings - Fork 93
Isthmus To/From SQL examples & enhanced APIs #625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
d5fc9df to
eb50abf
Compare
I think it's fine having multiple example subprojects that each show how to use each of the 3 modules provided by substrait-java: Core, Isthmus, Spark. It would also show which dependency declarations / Gradle setup one needs to interact with each of the modules. If all of them were in a single examples project then it might be difficult to parse the Gradle setup and understand which dependencies / configuration is needed by which module. |
eb50abf to
c0685ac
Compare
343277e to
c7eb95d
Compare
79933c8 to
7baf2ca
Compare
benbellick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some parts I didn't quite get around to reading, but just dumping some comments I have for now. Thank you for doing this! I think it will be a great help :)
Most of my comments are about code comments / the readme. Feel free to ignore whatever if you think my suggestions are less clear. Thanks!
examples/isthmus-api/src/main/java/io/substrait/examples/ToSql.java
Outdated
Show resolved
Hide resolved
examples/isthmus-api/src/main/java/io/substrait/examples/FromSql.java
Outdated
Show resolved
Hide resolved
examples/isthmus-api/src/main/java/io/substrait/examples/FromSql.java
Outdated
Show resolved
Hide resolved
Starting from two examples of using Isthmus to go from and to SQL a few new methods make the process a lot simpler and symetrical. We found that in use we needed helper wrapper clases, but these could be pushed into the main code. Signed-off-by: MBWhite <[email protected]>
Signed-off-by: MBWhite <[email protected]>
Signed-off-by: MBWhite <[email protected]>
Signed-off-by: MBWhite <[email protected]>
db1e1c9 to
d768adb
Compare
|
Thanks @benbellick - changes made. I've used the batch accepted feature - so hopefully I've not missed any comment - as I agreed with the updates. |
d768adb to
7c1bb7f
Compare
Co-authored-by: Ben Bellick <[email protected]>
Co-authored-by: Ben Bellick <[email protected]>
Co-authored-by: Ben Bellick <[email protected]>
Co-authored-by: Ben Bellick <[email protected]>
Co-authored-by: Ben Bellick <[email protected]>
abadf67 to
c8e010d
Compare
Signed-off-by: MBWhite <[email protected]>
c8e010d to
c9abb2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really helpful sample that makes it much easier for end users to get to grips with using substrait-java (and Isthmus) programmatically.
There is a very minor code fix required to SubstraitSqlStatementParser, and there is now a merge conflict with a recent code change. I have rebased this change and resolved both of these issues in my pr/625 branch if it is easier to pick that up rather than make the changes yourself.
Additionally there are a couple of minor Javadoc fixes needed, and some small suggestions described with inline comments.
| * @return a list of {@link SqlNode}s corresponding to the given statements | ||
| * @throws SqlParseException if there is an error while parsing the SQL statements | ||
| */ | ||
| public static List<SqlNode> parseStatements(String sqlStatements, SqlParser.Config parserConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parserConfig parameter is not used. It should replace PARSER_CONFIG below.
| * Converts one or more SQL statements to a List of {@link RelRoot}, with one {@link RelRoot} per | ||
| * statement. | ||
| * | ||
| * @param sqlStatements a string containing one or more SQL statements | ||
| * @param catalogReader the {@link Prepare.CatalogReader} for finding tables/views referenced in | ||
| * the SQL statements | ||
| * @return a list of {@link RelRoot}s corresponding to the given SQL statements | ||
| * @throws SqlParseException if there is an error while parsing the SQL statements | ||
| */ | ||
| public static List<RelRoot> convertQueries( | ||
| String sqlStatements, | ||
| Prepare.CatalogReader catalogReader, | ||
| final SqlParser.Config sqlParserConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Javadoc is missing the sqlDialect parameter.
| * Converts one or more SQL statements into a Substrait {@link Plan}. | ||
| * | ||
| * @param sqlStatements a string containing one more SQL statements | ||
| * @param catalogReader the {@link Prepare.CatalogReader} for finding tables/views referenced in | ||
| * the SQL statements | ||
| * @return the Substrait {@link Plan} | ||
| * @throws SqlParseException if there is an error while parsing the SQL statements | ||
| */ | ||
| public Plan convert( | ||
| final String sqlStatements, | ||
| final Prepare.CatalogReader catalogReader, | ||
| final SqlDialect sqlDialect) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Javadoc is missing the sqlDialect parameter.
| final List<String> sqlStrings = new ArrayList<>(); | ||
|
|
||
| System.out.println("\n"); | ||
| // and get each root from the Substrait plan | ||
| for (final Root root : substraitPlan.getRoots()) { | ||
| // Substrait -> Calcite | ||
| final RelNode calciteRelNode = converter.convert(root).project(true); | ||
| // Calcite -> SQL | ||
| final SqlNode sqlNode = relToSql.visitRoot(calciteRelNode).asStatement(); | ||
|
|
||
| final String sqlString = sqlNode.toSqlString(sqlDialect).getSql(); | ||
| sqlStrings.add(sqlString); | ||
| } | ||
| sqlStrings.forEach(System.out::println); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a suggestion; your call on whether you want to change this at all...
I am not sure there is any value in collecting the SQL strings into a list and then printing them rather than just printing them directly.
Either way, you might (or might not) find it more readable using streams.
| final List<String> sqlStrings = new ArrayList<>(); | |
| System.out.println("\n"); | |
| // and get each root from the Substrait plan | |
| for (final Root root : substraitPlan.getRoots()) { | |
| // Substrait -> Calcite | |
| final RelNode calciteRelNode = converter.convert(root).project(true); | |
| // Calcite -> SQL | |
| final SqlNode sqlNode = relToSql.visitRoot(calciteRelNode).asStatement(); | |
| final String sqlString = sqlNode.toSqlString(sqlDialect).getSql(); | |
| sqlStrings.add(sqlString); | |
| } | |
| sqlStrings.forEach(System.out::println); | |
| System.out.println("\n"); | |
| // Convert each of the Ssubstrait plan roots to SQL | |
| substraitPlan.getRoots().stream() | |
| // Substrait -> Calcite Rel | |
| .map(root -> converter.convert(root).project(true)) | |
| // Calcite Rel -> Calcite SQL | |
| .map(calciteRelNode -> relToSql.visitRoot(calciteRelNode).asStatement()) | |
| // Calcite SQL -> SQL String | |
| .map(sqlNode -> sqlNode.toSqlString(sqlDialect).getSql()) | |
| .forEachOrdered(System.out::println); |
| To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository. The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here. | ||
|
|
||
| ```bash | ||
| ./gradlew examples:isthmus-api:run --args "FromSql substrait.plan" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest splitting the command and sample output into separate code blocks so only the command can be more easily copied.
| To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository. The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here. | |
| ```bash | |
| ./gradlew examples:isthmus-api:run --args "FromSql substrait.plan" | |
| To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository. | |
| ```bash | |
| ./gradlew examples:isthmus-api:run --args "FromSql substrait.plan" | |
| ``` | |
| The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here. | |
| ``` |
| To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository. The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated. | ||
| ```bash | ||
| ./gradlew examples:isthmus-api:run --args "ToSql substrait.plan" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest splitting the command and sample output into separate code blocks so only the command can be more easily copied.
| To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository. The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated. | |
| ```bash | |
| ./gradlew examples:isthmus-api:run --args "ToSql substrait.plan" | |
| To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository. | |
| ```bash | |
| ./gradlew examples:isthmus-api:run --args "ToSql substrait.plan" | |
| ``` | |
| The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated. | |
| ``` |
Starting from two examples of using Isthmus to go from and to SQL a few new methods make the process a lot simpler and symetrical.
We found that in use we needed helper wrapper clases, but these could be pushed into the main code.
Note to reviewers
I'll leave as draft until I'm sure the build is good and tests are fully working.
I did create a separate
examplesproject; however it does feel a bit unwieldy to have this and the spark one. Maybe some rework to make a general examples project with sub directories per example 'category'... Happy to make that change if you wish.