Adapt SQL RAPIDS callers to columnar helpers#15033
Conversation
9792db2 to
fb9aa40
Compare
f061f44 to
6f4f246
Compare
830cf4e to
b6bb8fe
Compare
6f4f246 to
5f9363a
Compare
b6bb8fe to
bba66bf
Compare
5f9363a to
1457c07
Compare
bba66bf to
c7d443e
Compare
1457c07 to
64a238d
Compare
c7d443e to
b8602d4
Compare
64a238d to
1f90438
Compare
b8602d4 to
b0f5b13
Compare
6011c26 to
fea9cb1
Compare
05f51a5 to
cac80aa
Compare
7160eb5 to
3c8769b
Compare
43e499a to
94fc626
Compare
7bbdcc0 to
d35c3a9
Compare
94fc626 to
eb42a47
Compare
d35c3a9 to
cf518aa
Compare
eb42a47 to
916ddb5
Compare
cf518aa to
f714012
Compare
e69ce28 to
bc1ad84
Compare
0434223 to
98459fb
Compare
7b50024 to
3848638
Compare
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
3848638 to
e505feb
Compare
98459fb to
8b9f3f0
Compare
Greptile SummaryThis PR is the SQL-layer step in the "unshim stack" (#15025): it decouples SQL-facing callers from versioned shim imports by replacing direct class/object references with reflective lazy-val dispatch, converting
Confidence Score: 4/5The change is a pure mechanical refactor with no intended behavior change; the vast majority of diffs are import/mixin swaps, case-class-to-class conversions, and local logger migrations. Two call sites in
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[SQL RAPIDS Callers\nthis PR] -->|reflective lazy dispatch| B[TrampolineConnectShims]
A -->|reflective lazy dispatch| C[DFUDFShims]
A -->|reflective lazy dispatch| D[PerfIO]
A -->|RapidsLocalLog mixin| E[SLF4J Logger]
A -->|factory apply| F[RapidsAnalysisException\nobject - probes AnalysisException\nconstructor shape]
A -->|case class to class| G[DFUDF*/JDFUDF*\nPartialFile/CompressedRecord\nBatchState/Converters]
A -->|new shim files| H[ShimPredicate\nspark330-spark411]
B --> I[Lower Execution Layer\nprev PR 15051]
C --> I
D --> I
Reviews (1): Last reviewed commit: "Keep ShimPredicate version-specific" | Re-trigger Greptile |
| private def emptyQueryContextArray: AnyRef = { | ||
| java.lang.reflect.Array.newInstance(Class.forName("org.apache.spark.QueryContext"), 0) | ||
| } |
There was a problem hiding this comment.
emptyQueryContextArray re-creates on every call
emptyQueryContextArray is a plain def that calls Class.forName("org.apache.spark.QueryContext") and java.lang.reflect.Array.newInstance on each invocation. Because rawMessageCtor7, rawMessageCtor8, and rawMessageCtorWithStringParameters each call it afresh (and they themselves rescan classOf[AnalysisException].getConstructors every time), every single RapidsAnalysisException(msg) site performs multiple full reflective constructor scans plus class-lookup + array-allocation per call. The AnalysisException constructor shape is fixed at JVM load time, so the lookup and the resolved constructor should be cached in lazy vals at object initialization. The empty context-array can similarly be a private lazy val.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| * Compute cardinality statistics for both sides of the join. | ||
| * This is used for diagnostic logging when logJoinCardinality is enabled. | ||
| */ | ||
| protected def joinStrategy: JoinStrategy.JoinStrategy = | ||
| joinOptions.strategy.asInstanceOf[JoinStrategy.JoinStrategy] | ||
|
|
There was a problem hiding this comment.
asInstanceOf on Enumeration.Value fields is fragile
joinOptions.strategy.asInstanceOf[JoinStrategy.JoinStrategy] (and the buildSideSelection equivalent) casts an AnyRef-typed field to a Value type alias from a specific Enumeration. If the actual stored value was ever populated from the wrong Enumeration singleton (e.g., due to a cross-layer type mismatch introduced by moving JoinOptions to a lower module), this silently succeeds at the cast but fails later at call sites that distinguish values. Could you clarify why the cast is necessary here rather than simply typing the accessors as joinOptions.strategy directly? If the new JoinOptions class exposes strategy as AnyRef intentionally for cross-module compatibility, a comment explaining this design choice would help future maintainers. Why does joinOptions.strategy need an asInstanceOf[JoinStrategy.JoinStrategy] cast rather than being directly typed as JoinStrategy.JoinStrategy in the new JoinOptions definition?
Related to #14834.
Description
This PR is one reviewable layer in the unshim stack introduced by #15025. It updates SQL RAPIDS callers to use the moved columnar helpers. This follows the lower-level execution caller changes and keeps SQL-facing call-site changes grouped together.
Stack context
Testing and validation notes
Checklists
Documentation
Testing
(Covered by the validation notes in the PR description.)
Performance