Adapt rule and plugin metadata callers to Java helpers #15048
gerashegalov wants to merge 2 commits into
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Greptile Summary
This PR adapts GPU override rule and plugin metadata callers to use shared Java helpers as part of the unshim stack refactoring. The dominant pattern is converting Scala
Confidence Score: 4/5
The PR is safe to merge for existing workloads; the new S3PerfReader wiring is the only functional addition and is gated behind a config that defaults to false.
The case-class-to-class and apply-to-new mechanical transformations are correct and compile-verified by the full-stack build. The reflection-based shim dispatch is a deliberate architecture choice, not a latent bug. The open question is whether the RapidsInputFiles.setS3PerfReader calls are intentional new functionality or leaked from an adjacent layer: the PR description says no standalone behavior change, but this wiring enables the S3 performance path that was previously a no-op. The logWarning inconsistency and logger class-name loss in GpuRowBasedUserDefinedFunction are minor quality issues that do not affect correctness.
Plugin.scala: the two new setS3PerfReader call sites should be confirmed as intentional. GpuUserDefinedFunction.scala: logger bound to trait class rather than concrete class.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["GpuOverrides (object)"] -->|reflection| B["shimSingleton(name)\nClass.forName + MODULE$"]
B --> C["invokeShimSingleton(name, method)"]
C --> D["shimExprs / shimExecRules\nshimScanRules / shimPartRules"]
D --> E["expressions / execs / scans / parts (val maps)"]
A -->|lazy val| F["aggregateInPandasExecShimsModule"]
F --> G["aggregateInPandasExecRule"]
J["RapidsMeta (object)"] -->|lazy val| K["sparkShimImplModule\naggregateInPandasExecShimsModule"]
K --> L["isWindowFunctionExec\nisAggregateInPandasExec"]
M["Plugin.scala (Driver+Executor init)"] -->|NEW| N["RapidsInputFiles.setS3PerfReader(PerfIOS3Reader.INSTANCE)"]
N --> O["sql-plugin-fileio S3PerfReader\n(enabled when S3PERF_ENABLED=true)"]
P["case class to class conversions"] -->|Serializable| Q["ParamCheck / RepeatingParamCheck\nInputCheck / ContextChecks\nPartChecksImpl / ExprChecksImpl\nOomInjectionConf / ActiveTaskMetrics"]
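For readers following the reflection path in the flowchart, here is a minimal sketch of the `Class.forName` + `MODULE$` lookup. `DemoShims`, `demoName`, and the map contents are hypothetical stand-ins, and the helper bodies are assumptions about the general technique, not the PR's actual implementation.

```scala
// Hypothetical stand-in for a version-specific shim object.
object DemoShims {
  def exprs: Map[String, String] = Map("Add" -> "GpuAdd")
}

object ShimReflectionSketch {
  // A Scala `object Foo` compiles to a class named `Foo$` whose single
  // instance is stored in the public static field `MODULE$`.
  def shimSingleton(name: String): AnyRef =
    Class.forName(name + "$").getField("MODULE$").get(null)

  // Look up a public zero-argument method on the singleton and invoke it.
  def invokeShimSingleton[T](name: String, method: String): T = {
    val module = shimSingleton(name)
    module.getClass.getMethod(method).invoke(module).asInstanceOf[T]
  }

  // Assumption: real code would pass a string constant naming an object that
  // is not visible at compile time. The name is computed here only so the
  // sketch runs wherever it is pasted.
  val demoName: String = DemoShims.getClass.getName.stripSuffix("$")

  // Resolved reflectively on first access, then cached by the lazy val.
  lazy val shimExprs: Map[String, String] =
    invokeShimSingleton[Map[String, String]](demoName, "exprs")
}
```

The trade-off is the usual one for reflection: the caller no longer needs a compile-time dependency on the shim object, but a misspelled object or method name only fails at runtime.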
Reviews (1): Last reviewed commit: "Adapt plugin metadata callers to Java he..."
override def init(
    sc: SparkContext, pluginContext: PluginContext): java.util.Map[String, String] = {
  val sparkConf = pluginContext.conf
  RapidsInputFiles.setS3PerfReader(PerfIOS3Reader.INSTANCE)
New S3PerfReader wiring appears to be unreferenced new functionality
RapidsInputFiles.setS3PerfReader(PerfIOS3Reader.INSTANCE) is called in both the driver and executor init() paths. This registers a live S3 performance reader that bridges sql-plugin's private PerfIO$ Scala object with the sql-plugin-fileio Java interface. This is not a pure caller adaptation; it introduces new runtime behavior when PerfIOConf.S3PERF_ENABLED is set to true. The PR description states "no standalone behavior change is intended in this layer." Can you confirm this S3PerfReader wiring was previously absent (i.e., the feature was already broken/no-op before this PR) and this PR intentionally enables it as part of the unshim refactor?
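To make the question concrete, here is a rough sketch of a registration hook with a no-op default that is only consulted when a flag is on. Every name below is hypothetical, and the gating logic is an assumption for illustration; it does not claim to reflect how `RapidsInputFiles` or `PerfIOConf.S3PERF_ENABLED` actually behave.

```scala
// All names are hypothetical; they only mirror the shape described above.
trait PerfReaderSketch {
  def name: String
}

object NoOpReader extends PerfReaderSketch {
  val name = "no-op"
}

object LiveReader extends PerfReaderSketch {
  val name = "live"
}

object InputFilesSketch {
  // Until something registers a reader, the registry holds the no-op default.
  @volatile private var reader: PerfReaderSketch = NoOpReader

  def setReader(r: PerfReaderSketch): Unit = { reader = r }

  // Assumed gating: the registered reader is only used when the flag is on,
  // so registration alone changes nothing while the flag keeps its default.
  def activeReader(perfEnabled: Boolean): PerfReaderSketch =
    if (perfEnabled) reader else NoOpReader
}
```

Under this shape, adding the `setReader` call is invisible with the flag off, but it does change what happens once the flag is turned on, which is the behavior change being asked about.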
Related to #14834.
Description
This PR is one reviewable layer in the unshim stack introduced by #15025. It updates GPU override rule and plugin metadata callers to use the moved shared Java helpers. The helper movement is already in lower layers, so this PR is limited to caller adaptation.
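As an illustration of what "caller adaptation" tends to mean when a Scala case class becomes a plain class, here is a small sketch of the apply-to-new change. `ParamCheckSketch` and its fields are hypothetical, not the real `ParamCheck`, and the hand-written `equals`/`hashCode` is an assumption about what a replacement might need.

```scala
// Hypothetical stand-in for a metadata type that used to be a case class.
// Before: case class ParamCheckSketch(name: String, required: Boolean)
// A case class gets a companion apply(), equals/hashCode, and pattern-matching
// support for free. A plain class has none of these unless written by hand.
final class ParamCheckSketch(val name: String, val required: Boolean)
    extends Serializable {
  // Structural equality must be spelled out if callers rely on it.
  override def equals(other: Any): Boolean = other match {
    case that: ParamCheckSketch =>
      name == that.name && required == that.required
    case _ => false
  }
  override def hashCode(): Int = 31 * name.hashCode() + required.hashCode()
}

object CallerSketch {
  // Before: ParamCheckSketch("input", true)  (companion apply)
  // After:  explicit constructor call
  def build(): ParamCheckSketch = new ParamCheckSketch("input", true)
}
```

Call sites that only construct values need nothing beyond the added `new`; call sites that pattern-match on the type or call `copy` would need more than a mechanical rewrite.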
Stack context
Testing and validation notes
Checklists
Documentation
Testing
(Covered by the validation notes in the PR description.)
Performance