fix(isthmus): handle Subqueries/set predicates with field references outside of the subquery #383
Left some questions
isthmus/src/main/java/io/substrait/isthmus/SubstraitRelNodeConverter.java
eaa231c to 074c9ef
@vbarua questions have been answered; I've updated the test cases to better separate concerns. Found the hidden setting in VS Code to change the formatter back to Spotless!
io.substrait.plan.Plan plan = new ProtoPlanConverter().from(possible);
SubstraitToCalcite substraitToCalcite = new SubstraitToCalcite(extensions, typeFactory);
RelNode relRoot = substraitToCalcite.convert(plan.getRoots().get(0)).project(true);
System.out.println(SubstraitToSql.toSql(relRoot));
minor: avoid printing during tests. I suggest assertNotNull instead.
also changed the other existing tests where this occurred.
isthmus/src/main/java/io/substrait/isthmus/expression/ExpressionRexConverter.java
@@ -487,25 +492,44 @@ public RexNode visit(Expression.Cast expr) throws RuntimeException {
typeConverter.toCalcite(typeFactory, expr.getType()), expr.input().accept(this), safeCast);
}

AtomicInteger correlIdCount = new AtomicInteger(0);
What is this for? It's never used.
if (outerref.isPresent()) {
if (segment instanceof FieldReference.StructField) {
FieldReference.StructField f = (FieldReference.StructField) segment;
var node = referenceRelList.get(outerref.get() - 1).get();
Do we need to track or handle the fact that there might be multiple Filters that can add correlation variables? We only ever add to this list.
How would we know which ones came from which input?
This is one of the issues I had in mind when I mentioned
https://github.com/substrait-io/substrait-java/pull/383/files#r2038429378
Honestly, the Calcite docs don't really help with the scope of these variables. There is some concept of a 'namespace', e.g. from this error:
All correlation variables should resolve to the same namespace. Prev ns=org.apache.calcite.sql.validate.IdentifierNamespace@d36c1c3, new ns=org.apache.calcite.sql.validate.IdentifierNamespace@96abc7
which came from
select
  c1.c_name,
  o1.o_orderstatus,
  o1.o_totalprice
from
  customer c1,
  orders o1
where
  o1.o_custkey = c1.c_custkey
  and o1.o_totalprice > (
    select
      avg(o_totalprice)
    from
      orders o2, customer c2
    where
      o2.o_totalprice < c1.c_acctbal
      and o2.o_totalprice > (
        select
          avg(c3.c_acctbal)
        from
          customer c3
        where
          c3.c_custkey = o2.o_custkey
          and c3.c_address = o1.o_clerk
      )
  );
Change the last predicate, c3.c_address = o1.o_clerk, to c3.c_address = o2.o_clerk and it's OK.
The implication is that each relation and its immediate subexpression share the same namespace.
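To make the scoping question in this thread concrete, here is a minimal sketch of one possible way to track enclosing relations: a stack that is pushed when conversion enters a subquery and popped when it leaves, so a reference that steps out n levels resolves against the n-th enclosing scope and finished subqueries leave nothing behind. This is an illustration under assumptions, not the code in this PR. `ScopeStack`, `enter`, `exit` and `resolve` are hypothetical names, and plain strings stand in for the Calcite relations and correlation variables.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical sketch: a stack of enclosing scopes used to resolve outer references. */
public class ScopeStack {
  // The innermost enclosing scope sits at the head of the deque.
  private final Deque<String> scopes = new ArrayDeque<>();

  /** Called when conversion descends into a subquery owned by the given relation. */
  public void enter(String relationName) {
    scopes.push(relationName);
  }

  /** Called when conversion of that subquery has finished. */
  public void exit() {
    scopes.pop();
  }

  /** Resolves a reference that is stepsOut levels out (1 = nearest enclosing scope). */
  public String resolve(int stepsOut) {
    if (stepsOut < 1 || stepsOut > scopes.size()) {
      throw new IllegalArgumentException("no enclosing scope at stepsOut=" + stepsOut);
    }
    Iterator<String> it = scopes.iterator();
    String scope = it.next();
    for (int i = 1; i < stepsOut; i++) {
      scope = it.next();
    }
    return scope;
  }

  public static void main(String[] args) {
    ScopeStack stack = new ScopeStack();
    stack.enter("outer filter (c1, o1)");
    stack.enter("middle filter (o2, c2)");
    // While converting the innermost subquery over c3:
    System.out.println(stack.resolve(1)); // middle filter (o2, c2)
    System.out.println(stack.resolve(2)); // outer filter (c1, o1)
    stack.exit();
    // Back in the middle subquery, only the outer filter is visible.
    System.out.println(stack.resolve(1)); // outer filter (c1, o1)
  }
}
```

In terms of the query above, c3.c_custkey = o2.o_custkey would be a one-step reference and c3.c_address = o1.o_clerk a two-step reference. Whether Calcite accepts the latter is a separate question, as the namespace error shows.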
Signed-off-by: MBWhite <[email protected]>
This resolves issue #382, found with TPC-H 17 when converting back to SQL from Substrait.
The subquery references fields in an outer scope; the Calcite correlation variables were represented as Substrait 'outer references'.
However, when converting back to Calcite from a plan with these 'outer references', they were ignored.
Notes: