Remove old Spark 3.3.1 through 3.4 shim sources #15036
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Greptile Summary
This PR is one layer in a broader "unshim stack" that consolidates version-specific Spark shim sources into shared helpers. It removes duplicate shim sources for Spark 3.3.1–3.4.x that are now provided by new helper modules introduced in the preceding stack layer.
Confidence Score: 4/5
The PR is a structural cleanup with no intended behavioral change; the full-stack build was validated tree-equivalent to the pre-split stack, so merging carries low risk.
The deletions are mechanical and the refactorings (Logging → SLF4J, WriteFilesExecRule delegation, shim provider removal) are straightforward. The OrcProtoWriterShim reflection rewrite is the most complex change: it is logically correct but surfaces an opaque error message when neither protobuf library is resolvable at runtime. The GpuGroupedPythonRunnerFactory API break (removed argNames default) is an internal shim concern that should be covered by the full-stack build.
sql-plugin/src/main/spark340/scala/com/nvidia/spark/rapids/shims/OrcProtoWriterShim.scala deserves a second look for the reflection-based proto dispatch and its error-path diagnostics.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["SparkShimServiceProvider (deleted per version)\nspark331 / 332 / 332db / 333 / 334 / 340 / 341 / 341db / 342 / 343 / 344"]
B["Shared helper module\n(previous stack layer #15053)"]
A -->|"replaced by"| B
C["Spark.internal.Logging (removed)"]
D["Direct SLF4J logger"]
C -->|"replaced by"| D
C --> RapidsShuffleIterator
C --> RapidsCachingReader
C --> HiveFileUtil
C --> GpuWindowGroupLimitingIterator
RapidsShuffleIterator --> D
RapidsCachingReader --> D
HiveFileUtil --> D
GpuWindowGroupLimitingIterator --> D
E["OrcProtoWriterShim (static CodedOutputStream)"]
F["OrcProtoWriterShim (reflection over protoApis)\nSupports org.apache.orc.protobuf + com.google.protobuf"]
E -->|"refactored to"| F
G["WriteFilesExecRule (new shared object)\nWriteFilesExecShims.exec → GpuWriteFilesMeta"]
H["Spark332PlusDBShims.getExecs"]
I["Spark340PlusNonDBShims.getExecs"]
G --> H
G --> I
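The summary above flags that the reflection-based dispatch reports an opaque error when neither protobuf library can be resolved. As a rough illustration of what a more diagnosable lookup could look like, here is a minimal Java sketch. It is not the code in OrcProtoWriterShim.scala: the class `ProtoApiResolver`, the method `resolveFirst`, and the exact candidate class names are assumptions made for this example, built only from the two package names and the `CodedOutputStream` label shown in the flowchart.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the plugin's actual shim code. */
final class ProtoApiResolver {
    // Assumed candidates: the two protobuf packages named in the summary,
    // combined with the CodedOutputStream class named in the flowchart.
    static final List<String> DEFAULT_CANDIDATES = List.of(
        "org.apache.orc.protobuf.CodedOutputStream",
        "com.google.protobuf.CodedOutputStream");

    private ProtoApiResolver() {}

    /**
     * Returns the first candidate class that can be loaded. If none can,
     * throws an exception that lists every name tried and why it failed.
     */
    static Class<?> resolveFirst(List<String> candidates) {
        List<String> failures = new ArrayList<>();
        ClassLoader loader = ProtoApiResolver.class.getClassLoader();
        for (String name : candidates) {
            try {
                // initialize=false: only check that the class is loadable.
                return Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                failures.add(name + " (" + e.getClass().getSimpleName() + ")");
            }
        }
        throw new IllegalStateException(
            "No protobuf CodedOutputStream implementation found; tried: "
                + String.join(", ", failures));
    }
}
```

Calling `ProtoApiResolver.resolveFirst(ProtoApiResolver.DEFAULT_CANDIDATES)` either returns a usable class or fails with a message naming both candidates, which is the kind of error-path diagnostic the review asks for. Whether the real shim should fail eagerly at class-load time or lazily on first write is a separate design choice that this sketch does not address.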
Related to #14834.
Description
This PR is one reviewable layer in the unshim stack introduced by #15025. It removes old Spark 3.3.1 through Spark 3.4 shim sources that are now provided by shared helpers or the new helper modules.
Stack context
Testing and validation notes
Checklists
Documentation
Testing
(Covered by the validation notes in the PR description.)
Performance