Adapt core execution callers to columnar helpers #15051
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Greptile Summary
This is one reviewable layer in the unshim stack (introduced by #15025). It adapts core execution callers in
Confidence Score: 4/5
The PR is a clean structural refactoring with no intended behavior change; the mechanical case-class-to-class conversions are correctly applied across all call sites. Two minor concerns keep this from a perfect score: the reflection-based
Important Files Changed
Sequence Diagram
sequenceDiagram
participant GpuDataWritingCommandExec
participant GpuDataWritingCommand
participant ShimDataWritingCommand
participant ColumnarHelpers as sql-plugin-columnar (Java)
Note over GpuDataWritingCommand,ShimDataWritingCommand: ShimDataWritingCommand new trait in TreeNode.scala
GpuDataWritingCommandExec->>GpuDataWritingCommand: runColumnarFromAny(sparkSession, child)
GpuDataWritingCommand->>ShimDataWritingCommand: cast AnyRef to SparkSession + runColumnar()
ShimDataWritingCommand->>GpuDataWritingCommand: runColumnar(SparkSession, SparkPlan)
GpuDataWritingCommand-->>GpuDataWritingCommandExec: Seq[ColumnarBatch]
Note over ColumnarHelpers: Moved definitions
GpuDataWritingCommandExec->>ColumnarHelpers: new HostAllocResult
GpuDataWritingCommandExec->>ColumnarHelpers: AggregateModeInfo.from
GpuDataWritingCommandExec->>ColumnarHelpers: new AutoCloseableTargetSize
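The `runColumnarFromAny` step in the diagram takes the session as an untyped reference and casts it before delegating to the typed `runColumnar`. A minimal sketch of that bridge pattern follows. `DemoSession`, `DemoPlan`, `DemoBatch`, and `WritingCommandSketch` are hypothetical stand-ins, not the plugin's actual types, and placing the bridge on the trait is an assumption made for illustration.

```scala
// Hypothetical stand-ins for SparkSession, SparkPlan and ColumnarBatch.
final class DemoSession
final class DemoPlan
final class DemoBatch

// Sketch of a command trait exposing both a typed and an untyped entry point.
trait WritingCommandSketch {
  // Typed entry point implemented by concrete commands.
  def runColumnar(session: DemoSession, child: DemoPlan): Seq[DemoBatch]

  // Untyped bridge: checks the runtime type, then delegates to the typed method.
  final def runColumnarFromAny(session: AnyRef, child: DemoPlan): Seq[DemoBatch] =
    session match {
      case s: DemoSession => runColumnar(s, child)
      case other =>
        // Option(...) guards against a null argument when building the message.
        val actual = Option(other).map(_.getClass.getName).getOrElse("null")
        throw new IllegalArgumentException(
          s"Expected a ${classOf[DemoSession].getName} but got $actual")
    }
}

// A trivial command that writes nothing and returns no batches.
object NoOpCommand extends WritingCommandSketch {
  override def runColumnar(session: DemoSession, child: DemoPlan): Seq[DemoBatch] =
    Seq.empty
}
```

Matching on the runtime type instead of calling `asInstanceOf` directly turns a bad argument into a descriptive `IllegalArgumentException` rather than a bare `ClassCastException`.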
Reviews (1): Last reviewed commit: "Update Spark 4.1 tool metadata"
object GpuExec {
  @transient private[this] lazy val sparkPlanSessionMethod =
    classOf[SparkPlan].getMethod("session")

  def sessionFromPlan(plan: SparkPlan): SparkSession =
    sparkPlanSessionMethod.invoke(plan).asInstanceOf[SparkSession]
Bare reflection lookup without NoSuchMethodException handling
classOf[SparkPlan].getMethod("session") is called inside a lazy val, so any NoSuchMethodException is deferred until the first access of sparkSession and surfaces as an opaque failure. The previous shimmed path (SparkSessionUtils.sessionFromPlan) called plan.session with full Scala type-safety; the reflective version gives that up, so nothing fails at compile time if the public method is renamed or its visibility tightens in a future Spark version. Consider wrapping the lookup in a Try with a meaningful error message, or retaining a comment explaining why reflection is intentionally used instead of the shim.
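One way to follow the suggestion above is to wrap the method lookup in a `Try` and rethrow with context. The sketch below is illustrative only: `StubSession`, `StubPlan`, and `ReflectiveSessionLookup` are hypothetical stand-ins so the snippet compiles without Spark on the classpath, and the error wording is an assumption.

```scala
import java.lang.reflect.Method
import scala.util.{Failure, Success, Try}

// Hypothetical stand-ins for SparkSession and SparkPlan.
final class StubSession(val name: String)
class StubPlan {
  def session: StubSession = new StubSession("demo")
}

object ReflectiveSessionLookup {
  // Resolves a public zero-argument method, converting a missing method into
  // an error that names the class and method that were expected.
  def lookup(cls: Class[_], methodName: String): Method =
    Try(cls.getMethod(methodName)) match {
      case Success(m) => m
      case Failure(e: NoSuchMethodException) =>
        throw new IllegalStateException(
          s"Expected public method '$methodName' on ${cls.getName}; " +
            "it may have been renamed or made non-public", e)
      case Failure(e) => throw e
    }

  // Still lazy, so a lookup failure surfaces on first use, but now with context.
  private lazy val sessionMethod: Method = lookup(classOf[StubPlan], "session")

  def sessionFromPlan(plan: StubPlan): StubSession =
    sessionMethod.invoke(plan).asInstanceOf[StubSession]
}
```

Here `ReflectiveSessionLookup.sessionFromPlan(new StubPlan)` returns the stub session, while `lookup(classOf[StubPlan], "missing")` fails with an `IllegalStateException` whose cause is the original `NoSuchMethodException`.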
override def run(sparkSession: SparkSession, child: SparkPlan): Seq[org.apache.spark.sql.Row] = {
  com.nvidia.spark.rapids.Arm.withResource(runColumnar(sparkSession, child)) { batches =>
    assert(batches.isEmpty)
  }
  Seq.empty[org.apache.spark.sql.Row]
}
Silent batch discard when assertions are elided
assert(batches.isEmpty) is Scala's Predef.assert, which is elidable at compile time (for example with -Xdisable-assertions) rather than controlled by the JVM's -da flag. When the assertion is elided, non-empty batches returned by runColumnar are silently closed and discarded without any log message or exception. This same pattern existed in the old GpuDataWritingCommand.run(), but moving it here makes it the default for all ShimDataWritingCommand implementors. A plain if (!batches.isEmpty) throw new IllegalStateException(...) (or at minimum a logError) would make the failure visible in production.
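The explicit check suggested above could look roughly like the following. This is a sketch under assumptions: `StubBatch` and `WriteResultChecks` are hypothetical names, and the `try`/`finally` stands in for the resource handling that `Arm.withResource` performs in the real code.

```scala
// Hypothetical stand-in for a closeable columnar batch.
final class StubBatch extends AutoCloseable {
  var closed: Boolean = false
  override def close(): Unit = {
    closed = true
  }
}

object WriteResultChecks {
  // Always closes the batches, and fails loudly if any were returned.
  // Unlike assert, this check cannot be elided by compiler flags.
  def closeAndRequireEmpty[T <: AutoCloseable](batches: Seq[T]): Unit = {
    try {
      if (batches.nonEmpty) {
        throw new IllegalStateException(
          s"Expected no output batches from a writing command, got ${batches.size}")
      }
    } finally {
      // Release resources whether or not the check passed.
      batches.foreach(_.close())
    }
  }
}
```

Calling `closeAndRequireEmpty(Seq.empty[StubBatch])` is a no-op, while passing a non-empty sequence closes every batch and then throws `IllegalStateException`.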
Greptile encountered an error while reviewing this PR. Please reach out to support@greptile.com for assistance.
Related to #14834.
Description
This PR is one reviewable layer in the unshim stack introduced by #15025. It updates core execution callers to use the moved columnar helpers, keeping the central execution-path adaptations together in a single layer.
Stack context
Testing and validation notes
Checklists
Documentation
Testing
(Covered by the validation notes in the PR description.)
Performance