Isthmus To/From SQL examples & enhanced APIs #625
Merged
bestbeforetoday
merged 11 commits into
substrait-io:main
from
mbwhite:isthmus-api-updates
Dec 12, 2025
+552
−4
Commits
e5051cd
feat: isthmus To/From SQL examples & enhanced APIs
mbwhite f803e8f
fix: removed spark libraries
mbwhite 3fe6214
fix: typos
mbwhite ad1c6d1
fix: review comments on the format
mbwhite e419d0b
fix: update examples/isthmus-api/README.md
mbwhite 93d4e31
fix: update examples/isthmus-api/README.md
mbwhite 5817859
fix: update examples/isthmus-api/README.md
mbwhite 53090be
fix: update examples/isthmus-api/README.md
mbwhite bae26b5
fix: apply suggestions from code review
mbwhite ebd0290
fix: review comments
mbwhite a8a62d3
fix: bestbeforetoday review comments
mbwhite
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| _apps | ||
| _data | ||
| **/*/bin | ||
| build |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,139 @@ | ||
| # Isthmus API Examples | ||
|
|
||
| The Isthmus library converts Substrait plans to and from SQL. Two examples show the conversion in each direction. | ||
|
|
||
| ## How does this work in theory? | ||
|
|
||
| The [Calcite](https://calcite.apache.org/) library is used to parse and generate the SQL string. Calcite has its own relational object model, distinct from Substrait's. Classes within Isthmus convert between Substrait and Calcite's object model. | ||
|
|
||
| The conversion flows work as follows: | ||
|
|
||
| **SQL to Substrait:** | ||
| `SQL ---[Calcite parsing]---> Calcite Object Model ---[Isthmus conversion]---> Substrait` | ||
|
|
||
| **Substrait to SQL:** | ||
| `Substrait ---[Isthmus conversion]---> Calcite Object Model ---[Calcite SQL generation]---> SQL` | ||
|
|
||
| ## Running the examples | ||
|
|
||
| There are 2 example classes: | ||
|
|
||
| - [FromSql](./src/main/java/io/substrait/examples/FromSql.java) that creates a plan starting from SQL | ||
| - [ToSql](./src/main/java/io/substrait/examples/ToSql.java) that reads a plan and creates the SQL | ||
|
|
||
|
|
||
| ### Requirements | ||
|
|
||
| To run these you will need Java 17 or greater, and this repository cloned to your local system. | ||
|
|
||
|
|
||
| ## Creating a Substrait Plan from SQL | ||
|
|
||
| To run [`FromSql.java`](./src/main/java/io/substrait/examples/FromSql.java), execute the command below from the root of this repository. | ||
|
|
||
| ```bash | ||
| ./gradlew examples:isthmus-api:run --args "FromSql substrait.plan" | ||
| ``` | ||
|
|
||
| The example writes a binary plan to `substrait.plan` and outputs the text format of the protobuf to stdout. The output is quite lengthy, so it has been abbreviated here. | ||
|
|
||
| ```bash | ||
| > Task :examples:isthmus-api:run | ||
| extension_uris { | ||
| extension_uri_anchor: 2 | ||
| uri: "/functions_aggregate_generic.yaml" | ||
| } | ||
| extension_uris { | ||
| extension_uri_anchor: 1 | ||
| uri: "/functions_comparison.yaml" | ||
| } | ||
| extensions { | ||
| extension_function { | ||
| extension_uri_reference: 1 | ||
| function_anchor: 1 | ||
| name: "equal:any_any" | ||
| extension_urn_reference: 1 | ||
| } | ||
| } | ||
| extensions { | ||
| extension_function { | ||
| extension_uri_reference: 2 | ||
| function_anchor: 2 | ||
| name: "count:" | ||
| extension_urn_reference: 2 | ||
| } | ||
| } | ||
| relations {....} | ||
| } | ||
| version { | ||
| minor_number: 77 | ||
| producer: "isthmus" | ||
| } | ||
| extension_urns { | ||
| extension_urn_anchor: 1 | ||
| urn: "extension:io.substrait:functions_comparison" | ||
| } | ||
| extension_urns { | ||
| extension_urn_anchor: 2 | ||
| urn: "extension:io.substrait:functions_aggregate_generic" | ||
| } | ||
|
|
||
| File written to substrait.plan | ||
| ``` | ||
|
|
||
| Please see the code comments for details of how the conversion is done. | ||
|
|
||
| ## Creating SQL from a Substrait Plan | ||
|
|
||
| To run [`ToSql.java`](./src/main/java/io/substrait/examples/ToSql.java), execute the command below from the root of this repository. | ||
| ```bash | ||
| ./gradlew examples:isthmus-api:run --args "ToSql substrait.plan" | ||
| ``` | ||
|
|
||
| The example reads from `substrait.plan` (likely the file created by `FromSql`) and outputs SQL. The text format of the protobuf has been abbreviated. | ||
| ```bash | ||
| > Task :examples:isthmus-api:run | ||
| Reading from substrait.plan | ||
| extension_uris { | ||
| extension_uri_anchor: 2 | ||
| uri: "/functions_aggregate_generic.yaml" | ||
| } | ||
| extension_uris { | ||
| extension_uri_anchor: 1 | ||
| uri: "/functions_comparison.yaml" | ||
| } | ||
| extensions { | ||
| extension_function { | ||
| extension_uri_reference: 1 | ||
| function_anchor: 1 | ||
| name: "equal:any_any" | ||
| extension_urn_reference: 1 | ||
| } | ||
| } | ||
| extensions {....} | ||
| relations {....} | ||
| version { | ||
| minor_number: 77 | ||
| producer: "isthmus" | ||
| } | ||
| extension_urns { | ||
| extension_urn_anchor: 1 | ||
| urn: "extension:io.substrait:functions_comparison" | ||
| } | ||
| extension_urns { | ||
| extension_urn_anchor: 2 | ||
| urn: "extension:io.substrait:functions_aggregate_generic" | ||
| } | ||
|
|
||
|
|
||
| SELECT `t2`.`colour0` AS `COLOUR`, `t2`.`$f1` AS `COLOURCOUNT` | ||
| FROM (SELECT `vehicles`.`colour` AS `colour0`, COUNT(*) AS `$f1` | ||
| FROM `vehicles` | ||
| INNER JOIN `tests` ON `vehicles`.`vehicle_id` = `tests`.`vehicle_id` | ||
| WHERE `tests`.`test_result` = 'P' | ||
| GROUP BY `vehicles`.`colour` | ||
| ORDER BY COUNT(*) IS NULL, 2) AS `t2` | ||
|
|
||
| ``` | ||
|
|
||
| The SQL statement is generated in the selected dialect (MySQL is used in the example).
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| plugins { | ||
| // Apply the application plugin to add support for building a CLI application in Java. | ||
| id("application") | ||
| alias(libs.plugins.spotless) | ||
| id("substrait.java-conventions") | ||
| } | ||
|
|
||
| repositories { mavenCentral() } | ||
|
|
||
| dependencies { | ||
| implementation(project(":isthmus")) | ||
| implementation(libs.calcite.core) | ||
| implementation(libs.calcite.server) | ||
| } | ||
|
|
||
| application { mainClass = "io.substrait.examples.IsthmusAppExamples" } | ||
|
|
||
| tasks.named<Test>("test") { useJUnitPlatform() } | ||
|
|
||
| java { toolchain { languageVersion.set(JavaLanguageVersion.of(17)) } } | ||
|
|
||
| tasks.pmdMain { dependsOn(":core:shadowJar") } |
107 changes: 107 additions & 0 deletions
examples/isthmus-api/src/main/java/io/substrait/examples/FromSql.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| package io.substrait.examples; | ||
|
|
||
| import io.substrait.examples.IsthmusAppExamples.Action; | ||
| import io.substrait.isthmus.SqlToSubstrait; | ||
| import io.substrait.isthmus.SubstraitTypeSystem; | ||
| import io.substrait.isthmus.sql.SubstraitCreateStatementParser; | ||
| import io.substrait.plan.Plan; | ||
| import io.substrait.plan.PlanProtoConverter; | ||
| import java.io.IOException; | ||
| import java.nio.file.Files; | ||
| import java.nio.file.Path; | ||
| import java.nio.file.Paths; | ||
| import java.util.List; | ||
| import org.apache.calcite.config.CalciteConnectionConfig; | ||
| import org.apache.calcite.config.CalciteConnectionProperty; | ||
| import org.apache.calcite.jdbc.CalciteSchema; | ||
| import org.apache.calcite.jdbc.JavaTypeFactoryImpl; | ||
| import org.apache.calcite.prepare.CalciteCatalogReader; | ||
| import org.apache.calcite.rel.type.RelDataTypeFactory; | ||
| import org.apache.calcite.sql.SqlDialect; | ||
| import org.apache.calcite.sql.parser.SqlParseException; | ||
|
|
||
| /** | ||
| * Substrait from SQL conversions. | ||
| * | ||
| * <p>The conversion process involves four steps: | ||
| * | ||
| * <p>1. Create a fully typed schema for the inputs. Within a SQL context this represents the CREATE | ||
| * TABLE commands, which need to be converted to a Calcite Schema. | ||
| * | ||
| * <p>2. Parse the SQL query to convert (in the source SQL dialect). | ||
| * | ||
| * <p>3. Convert the SQL query to Calcite Relations. | ||
| * | ||
| * <p>4. Convert the Calcite Relations to Substrait relations. | ||
| * | ||
| * <p>Note that the schema could be created from other means, such as Calcite's reflection-based | ||
| * schema. | ||
| */ | ||
| public class FromSql implements Action { | ||
|
|
||
| @Override | ||
| public void run(final String[] args) { | ||
| try { | ||
| final String createSql = | ||
| """ | ||
| CREATE TABLE "vehicles" ("vehicle_id" varchar(15), "make" varchar(40), "model" varchar(40), | ||
| "colour" varchar(15), "fuel_type" varchar(15), | ||
| "cylinder_capacity" int, "first_use_date" varchar(15)); | ||
|
|
||
| CREATE TABLE "tests" ("test_id" varchar(15), "vehicle_id" varchar(15), | ||
| "test_date" varchar(20), "test_class" varchar(20), "test_type" varchar(20), | ||
| "test_result" varchar(15),"test_mileage" int, "postcode_area" varchar(15)); | ||
|
|
||
| """; | ||
|
|
||
| // Create the Calcite Schema from the CREATE TABLE statements. | ||
| // The Isthmus helper classes assume a standard SQL format for parsing. | ||
| final CalciteSchema calciteSchema = CalciteSchema.createRootSchema(false); | ||
| SubstraitCreateStatementParser.processCreateStatements(createSql) | ||
| .forEach(t -> calciteSchema.add(t.getName(), t)); | ||
|
|
||
| // Type Factory based on Java Types | ||
| final RelDataTypeFactory typeFactory = | ||
| new JavaTypeFactoryImpl(SubstraitTypeSystem.TYPE_SYSTEM); | ||
|
|
||
| // Default configuration for calcite | ||
| final CalciteConnectionConfig calciteDefaultConfig = | ||
| CalciteConnectionConfig.DEFAULT.set( | ||
| CalciteConnectionProperty.CASE_SENSITIVE, Boolean.FALSE.toString()); | ||
|
|
||
| final CalciteCatalogReader catalogReader = | ||
| new CalciteCatalogReader(calciteSchema, List.of(), typeFactory, calciteDefaultConfig); | ||
|
|
||
| // Query that needs to be converted; again this could be in a variety of SQL dialects | ||
| final String apacheDerbyQuery = | ||
| """ | ||
| SELECT vehicles.colour, count(*) as colourcount FROM vehicles INNER JOIN tests | ||
| ON vehicles.vehicle_id=tests.vehicle_id WHERE tests.test_result = 'P' | ||
| GROUP BY vehicles.colour ORDER BY count(*) | ||
| """; | ||
| final SqlToSubstrait sqlToSubstrait = new SqlToSubstrait(); | ||
|
|
||
| // choose Apache Derby as an example dialect | ||
| final SqlDialect dialect = SqlDialect.DatabaseProduct.DERBY.getDialect(); | ||
| final Plan substraitPlan = sqlToSubstrait.convert(apacheDerbyQuery, catalogReader, dialect); | ||
|
|
||
| // Create the proto plan to display to stdout - as it has a better format | ||
| final PlanProtoConverter planToProto = new PlanProtoConverter(); | ||
| final io.substrait.proto.Plan protoPlan = planToProto.toProto(substraitPlan); | ||
| System.out.println(protoPlan); | ||
|
|
||
| // write out to file if given a file name | ||
| // convert to a protobuf byte array and write as binary file | ||
| if (args.length == 1) { | ||
|
|
||
| final byte[] buffer = protoPlan.toByteArray(); | ||
| final Path outputFile = Paths.get(args[0]); | ||
| Files.write(outputFile, buffer); | ||
| System.out.println("File written to " + outputFile); | ||
| } | ||
|
|
||
| } catch (SqlParseException | IOException e) { | ||
| e.printStackTrace(); | ||
| } | ||
| } | ||
| } |
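The file-writing step at the end of `FromSql` can be exercised on its own. The following is a minimal sketch, not part of this PR: `PlanFileRoundTrip` is a hypothetical helper, and the byte array is a placeholder standing in for `protoPlan.toByteArray()`. It only shows that the bytes written with `Files.write` are the bytes a later reader (such as the `ToSql` example) would get back.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper, not part of this PR: writes bytes to a file and reads them back. */
public final class PlanFileRoundTrip {

  private PlanFileRoundTrip() {}

  /** Writes the buffer to the file, reads it back, and reports whether the contents match. */
  public static boolean roundTrip(final Path outputFile, final byte[] buffer) throws IOException {
    Files.write(outputFile, buffer);
    final byte[] readBack = Files.readAllBytes(outputFile);
    return Arrays.equals(buffer, readBack);
  }

  public static void main(final String[] args) throws IOException {
    // Placeholder bytes standing in for protoPlan.toByteArray(); not a real plan.
    final byte[] buffer = {0x0a, 0x02, 0x08, 0x01};
    final Path outputFile = Files.createTempFile("substrait", ".plan");
    try {
      System.out.println("Round trip ok: " + roundTrip(outputFile, buffer));
    } finally {
      // Clean up the temporary file whether or not the check succeeded.
      Files.deleteIfExists(outputFile);
    }
  }
}
```

A check like this could sit in a unit test next to the examples; it uses only `java.nio.file` and makes no assumptions about the Isthmus API.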
53 changes: 53 additions & 0 deletions
examples/isthmus-api/src/main/java/io/substrait/examples/IsthmusAppExamples.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| package io.substrait.examples; | ||
|
|
||
| import java.util.Arrays; | ||
|
|
||
| /** Main class */ | ||
| public final class IsthmusAppExamples { | ||
|
|
||
| /** Implemented by all examples */ | ||
| @FunctionalInterface | ||
| public interface Action { | ||
|
|
||
| /** | ||
| * Run | ||
| * | ||
| * @param args String [] | ||
| */ | ||
| void run(String[] args); | ||
| } | ||
|
|
||
| private IsthmusAppExamples() {} | ||
|
|
||
| /** | ||
| * Traditional main method | ||
| * | ||
| * @param args string[] | ||
| */ | ||
| @SuppressWarnings("unchecked") | ||
| public static void main(final String[] args) { | ||
| try { | ||
|
|
||
| if (args.length == 0) { | ||
| System.err.println( | ||
| "Please provide the base class name of the example to run, e.g. ToSql to run class io.substrait.examples.ToSql"); | ||
| System.exit(-1); | ||
| } | ||
| final String exampleClass = args[0]; | ||
|
|
||
| final Class<Action> clz = | ||
| (Class<Action>) | ||
| Class.forName( | ||
| String.format("%s.%s", IsthmusAppExamples.class.getPackageName(), exampleClass)); | ||
| final Action action = clz.getDeclaredConstructor().newInstance(); | ||
| if (args.length == 1) { | ||
| action.run(new String[] {}); | ||
| } else { | ||
| action.run(Arrays.copyOfRange(args, 1, args.length)); | ||
| } | ||
| } catch (Exception e) { | ||
| e.printStackTrace(); | ||
| System.exit(-1); | ||
| } | ||
| } | ||
| } |
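One possible variation on the name-based dispatch above, offered as a sketch rather than a required change: `Class.asSubclass` checks at load time that the named class really implements `Action`, which removes the need for the unchecked cast and `@SuppressWarnings`. The names `DispatchSketch` and `Echo` are hypothetical stand-ins; nested classes are used only to keep the snippet in one file, which is why the lookup uses the `Outer$Inner` binary name instead of the package name.

```java
import java.util.Arrays;

/** Sketch of name-based dispatch; DispatchSketch and Echo are hypothetical names. */
public final class DispatchSketch {

  /** Same shape as the Action interface in the PR. */
  @FunctionalInterface
  public interface Action {
    void run(String[] args);
  }

  /** Stand-in example that records the arguments it was given. */
  public static final class Echo implements Action {
    public static String lastArgs = "";

    @Override
    public void run(final String[] args) {
      lastArgs = String.join(",", args);
    }
  }

  private DispatchSketch() {}

  /** Loads the nested class by simple name, checks it is an Action, and runs it. */
  public static void dispatch(final String[] args) throws ReflectiveOperationException {
    if (args.length == 0) {
      throw new IllegalArgumentException("Expected the example name as the first argument");
    }
    // Nested classes are looked up by their binary name, Outer$Inner.
    final String className = DispatchSketch.class.getName() + "$" + args[0];
    // asSubclass throws ClassCastException if the class does not implement Action.
    final Class<? extends Action> clz = Class.forName(className).asSubclass(Action.class);
    final Action action = clz.getDeclaredConstructor().newInstance();
    // copyOfRange yields an empty array when only the example name was given.
    action.run(Arrays.copyOfRange(args, 1, args.length));
  }

  public static void main(final String[] args) throws ReflectiveOperationException {
    dispatch(new String[] {"Echo", "substrait.plan"});
    System.out.println(Echo.lastArgs);
  }
}
```

Because `Arrays.copyOfRange(args, 1, args.length)` already returns an empty array when `args.length == 1`, the separate branch for that case in `main` could be dropped.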
39 changes: 39 additions & 0 deletions
examples/isthmus-api/src/main/java/io/substrait/examples/SchemaHelper.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| package io.substrait.examples; | ||
|
|
||
| import io.substrait.isthmus.calcite.SubstraitTable; | ||
| import io.substrait.isthmus.sql.SubstraitCreateStatementParser; | ||
| import java.util.ArrayList; | ||
| import java.util.List; | ||
| import org.apache.calcite.jdbc.CalciteSchema; | ||
| import org.apache.calcite.prepare.CalciteCatalogReader; | ||
| import org.apache.calcite.sql.parser.SqlParseException; | ||
|
|
||
| /** Helper functions for schemas. */ | ||
| public final class SchemaHelper { | ||
|
|
||
| private SchemaHelper() {} | ||
|
|
||
| /** | ||
| * Parses one or more SQL strings containing only CREATE statements into a {@link | ||
| * CalciteSchema}. | ||
| * | ||
| * @param createStatements SQL strings, each containing only CREATE statements | ||
| * @return a root {@link CalciteSchema} containing the tables defined by the CREATE statements | ||
| * @throws SqlParseException if any of the statements cannot be parsed | ||
| */ | ||
| public static CalciteSchema processCreateStatementsToSchema(final List<String> createStatements) | ||
| throws SqlParseException { | ||
|
|
||
| final List<SubstraitTable> tables = new ArrayList<>(); | ||
| for (final String statement : createStatements) { | ||
| tables.addAll(SubstraitCreateStatementParser.processCreateStatements(statement)); | ||
| } | ||
|
|
||
| final CalciteSchema rootSchema = CalciteSchema.createRootSchema(false); | ||
| for (final SubstraitTable table : tables) { | ||
| rootSchema.add(table.getName(), table); | ||
| } | ||
|
|
||
| return rootSchema; | ||
| } | ||
| } |
How is dialect selected?
Maybe something like this is better?