Skip to content

Conversation

@mbwhite
Copy link
Contributor

@mbwhite mbwhite commented Dec 1, 2025

Starting from two examples of using Isthmus to go from and to SQL a few new methods make the process a lot simpler and symetrical.

We found that in use we needed helper wrapper clases, but these could be pushed into the main code.

Note to reviewers

I'll leave as draft until I'm sure the build is good and tests are fully working.

I did create a separate examples project; however it does feel a bit unwieldy to have this and the spark one. Maybe some rework to make a general examples project with sub directories per example 'category'... Happy to make that change if you wish.

@mbwhite mbwhite force-pushed the isthmus-api-updates branch 3 times, most recently from d5fc9df to eb50abf Compare December 1, 2025 10:58
@nielspardon
Copy link
Member

I did create a separate examples project; however it does feel a bit unwieldy to have this and the spark one. Maybe some rework to make a general examples project with sub directories per example 'category'... Happy to make that change if you wish.

I think it's fine having multiple example subprojects that each show how to use each of the 3 modules provided by substrait-java: Core, Isthmus, Spark. It would also show which dependency declarations / Gradle setup one needs to interact with each of the modules. If all of them were in a single examples project then it might be difficult to parse the Gradle setup and understand which dependencies / configuration is needed by which module.

@mbwhite mbwhite force-pushed the isthmus-api-updates branch from eb50abf to c0685ac Compare December 1, 2025 11:22
@mbwhite mbwhite marked this pull request as ready for review December 1, 2025 11:56
@mbwhite mbwhite force-pushed the isthmus-api-updates branch 3 times, most recently from 343277e to c7eb95d Compare December 1, 2025 12:20
@mbwhite mbwhite force-pushed the isthmus-api-updates branch 2 times, most recently from 79933c8 to 7baf2ca Compare December 2, 2025 14:24
Copy link
Member

@benbellick benbellick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some parts I didn't quite get around to reading, but just dumping some comments I have for now. Thank you for doing this! I think it will be a great help :)

Most of my comments are about code comments / the readme. Feel free to ignore whatever if you think my suggestions are less clear. Thanks!

Starting from two examples of using Isthmus to go from and to
SQL a few new methods make the process a lot simpler and symetrical.

We found that in use we needed helper wrapper clases, but these could
be pushed into the main code.

Signed-off-by: MBWhite <[email protected]>
Signed-off-by: MBWhite <[email protected]>
@mbwhite mbwhite force-pushed the isthmus-api-updates branch from db1e1c9 to d768adb Compare December 5, 2025 10:26
@mbwhite
Copy link
Contributor Author

mbwhite commented Dec 5, 2025

Thanks @benbellick - changes made. I've used the batch accepted feature - so hopefully I've not missed any comment - as I agreed with the updates.

@mbwhite mbwhite requested a review from benbellick December 5, 2025 10:27
@mbwhite mbwhite force-pushed the isthmus-api-updates branch from d768adb to 7c1bb7f Compare December 5, 2025 10:28
@mbwhite mbwhite force-pushed the isthmus-api-updates branch 2 times, most recently from abadf67 to c8e010d Compare December 5, 2025 11:36
Signed-off-by: MBWhite <[email protected]>
@mbwhite mbwhite force-pushed the isthmus-api-updates branch from c8e010d to c9abb2b Compare December 5, 2025 11:51
Copy link
Member

@bestbeforetoday bestbeforetoday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really helpful sample that makes it much easier for end users to get to grips with using substrait-java (and Isthmus) programmatically.

There is a very minor code fix required to SubstraitSqlStatementParser, and there is now a merge conflict with a recent code change. I have rebased this change and resolved both of these issues in my pr/625 branch if it is easier to pick that up rather than make the changes yourself.

Additionally there are a couple of minor Javadoc fixes needed, and some small suggestions described with inline comments.

* @return a list of {@link SqlNode}s corresponding to the given statements
* @throws SqlParseException if there is an error while parsing the SQL statements
*/
public static List<SqlNode> parseStatements(String sqlStatements, SqlParser.Config parserConfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parserConfig parameter is not used. It should replace PARSER_CONFIG below.

Comment on lines +93 to +105
* Converts one or more SQL statements to a List of {@link RelRoot}, with one {@link RelRoot} per
* statement.
*
* @param sqlStatements a string containing one or more SQL statements
* @param catalogReader the {@link Prepare.CatalogReader} for finding tables/views referenced in
* the SQL statements
* @return a list of {@link RelRoot}s corresponding to the given SQL statements
* @throws SqlParseException if there is an error while parsing the SQL statements
*/
public static List<RelRoot> convertQueries(
String sqlStatements,
Prepare.CatalogReader catalogReader,
final SqlParser.Config sqlParserConfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc is missing the sqlDialect parameter.

Comment on lines +66 to +77
* Converts one or more SQL statements into a Substrait {@link Plan}.
*
* @param sqlStatements a string containing one more SQL statements
* @param catalogReader the {@link Prepare.CatalogReader} for finding tables/views referenced in
* the SQL statements
* @return the Substrait {@link Plan}
* @throws SqlParseException if there is an error while parsing the SQL statements
*/
public Plan convert(
final String sqlStatements,
final Prepare.CatalogReader catalogReader,
final SqlDialect sqlDialect)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc is missing the sqlDialect parameter.

Comment on lines +69 to +82
final List<String> sqlStrings = new ArrayList<>();

System.out.println("\n");
// and get each root from the Substrait plan
for (final Root root : substraitPlan.getRoots()) {
// Substrait -> Calcite
final RelNode calciteRelNode = converter.convert(root).project(true);
// Calcite -> SQL
final SqlNode sqlNode = relToSql.visitRoot(calciteRelNode).asStatement();

final String sqlString = sqlNode.toSqlString(sqlDialect).getSql();
sqlStrings.add(sqlString);
}
sqlStrings.forEach(System.out::println);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion; your call on whether you want to change this at all...

I am not sure there is any value in collecting the SQL strings into a list and then printing them rather than just printing them directly.

Either way, you might (or might not) find it more readable using streams.

Suggested change
final List<String> sqlStrings = new ArrayList<>();
System.out.println("\n");
// and get each root from the Substrait plan
for (final Root root : substraitPlan.getRoots()) {
// Substrait -> Calcite
final RelNode calciteRelNode = converter.convert(root).project(true);
// Calcite -> SQL
final SqlNode sqlNode = relToSql.visitRoot(calciteRelNode).asStatement();
final String sqlString = sqlNode.toSqlString(sqlDialect).getSql();
sqlStrings.add(sqlString);
}
sqlStrings.forEach(System.out::println);
System.out.println("\n");
// Convert each of the Ssubstrait plan roots to SQL
substraitPlan.getRoots().stream()
// Substrait -> Calcite Rel
.map(root -> converter.convert(root).project(true))
// Calcite Rel -> Calcite SQL
.map(calciteRelNode -> relToSql.visitRoot(calciteRelNode).asStatement())
// Calcite SQL -> SQL String
.map(sqlNode -> sqlNode.toSqlString(sqlDialect).getSql())
.forEachOrdered(System.out::println);

Comment on lines +32 to +35
To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository. The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here.

```bash
./gradlew examples:isthmus-api:run --args "FromSql substrait.plan"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest splitting the command and sample output into separate code blocks so only the command can be more easily copied.

Suggested change
To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository. The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here.
```bash
./gradlew examples:isthmus-api:run --args "FromSql substrait.plan"
To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository.
```bash
./gradlew examples:isthmus-api:run --args "FromSql substrait.plan"
```
The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here.
```

Comment on lines +84 to +87
To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository. The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated.
```bash
./gradlew examples:isthmus-api:run --args "ToSql substrait.plan"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest splitting the command and sample output into separate code blocks so only the command can be more easily copied.

Suggested change
To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository. The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated.
```bash
./gradlew examples:isthmus-api:run --args "ToSql substrait.plan"
To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository.
```bash
./gradlew examples:isthmus-api:run --args "ToSql substrait.plan"
```
The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated.
```

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants