Remove old Spark 3.3.0 shim sources #15035
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Greptile Summary
This PR removes Spark 3.3.0-specific shim sources that are now covered by shared helper modules introduced earlier in the unshim stack. It also makes several backward-compatible adjustments to the remaining spark330 files to enable binary-class deduplication with newer shims.
Confidence Score: 4/5
Structured cleanup of spark330-specific shim sources; no new end-user logic is introduced, and the reflection-based ORC dispatch and Logging-to-SLF4J conversions are technically correct.
The reflection-based OrcProtoWriterShim and the GpuGroupedPythonRunnerFactory API surface changes are the two areas that deserve a second look. The reflection dispatch is sound: protoApis is lazily populated once, and the per-instance cache is always flushed before reassignment. The single-slot cache design is still worth documenting. The removal of the case class companion apply and of the argNames default value are intentional within the validated stack, but they are silent source-compatibility breaks for any downstream consumer.
OrcProtoWriterShim.scala (reflection-based protobuf dispatch) and GpuGroupedPythonRunnerFactory.scala (case class to class, default arg removal) warrant the closest review; deleted files and Logging-to-SLF4J conversions are straightforward.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[writeAndFlush called] --> B{apiFor obj}
B -- None --> C[throw IllegalArgumentException]
B -- Some api --> D{proxiedApi == api?}
D -- Yes --> E[reuse cached proxied]
D -- No --> F[invoke api.newInstance with orcOutStream]
F --> G[update proxiedApi + proxied]
G --> E
E --> H[api.writeTo.invoke obj proxied]
H --> I[api.flush.invoke proxied]
I --> J[orcOutStream.flush]
subgraph protoApis [lazy protoApis init]
K[try org.apache.orc.protobuf] --> L{found?}
L -- Yes --> M[add ProtoApi]
L -- No --> N[skip]
O[try com.google.protobuf] --> P{found?}
P -- Yes --> Q[add ProtoApi]
P -- No --> R[skip]
end
B -.->|first call triggers| protoApis
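The dispatch in the flowchart can be sketched roughly as follows. This is a minimal illustration, not the code in OrcProtoWriterShim.scala: `FakeMessage`, `FakeCodedStream`, `ReflectiveWriter`, the `candidates` parameter and the `streamsCreated` counter are hypothetical stand-ins, and the sketch builds the stream through a reflective constructor where the flowchart shows a `newInstance` call. It assumes each protobuf flavor exposes a message class with `writeTo(stream)` and a stream class with `flush()`.

```scala
import java.io.OutputStream
import java.lang.reflect.Method
import scala.util.Try

object OrcDispatchSketch {
  // Hypothetical stand-ins for one protobuf flavor's stream and message types.
  class FakeCodedStream(val out: OutputStream) {
    def flush(): Unit = out.flush()
  }
  class FakeMessage(val payload: Array[Byte]) {
    def writeTo(s: FakeCodedStream): Unit = s.out.write(payload)
  }

  // One resolved API flavor: the message base class plus reflective handles.
  final case class ProtoApi(
      messageClass: Class[_],
      newStream: OutputStream => AnyRef,
      writeTo: Method,
      flush: Method)

  // candidates: (message class name, stream class name) pairs to probe.
  class ReflectiveWriter(orcOutStream: OutputStream, candidates: Seq[(String, String)]) {
    // Resolved once on first use; flavors whose classes are missing are skipped.
    private lazy val protoApis: Seq[ProtoApi] = candidates.flatMap {
      case (msgName, streamName) =>
        Try {
          val msg = Class.forName(msgName)
          val stream = Class.forName(streamName)
          val ctor = stream.getConstructor(classOf[OutputStream])
          ProtoApi(
            msg,
            o => ctor.newInstance(o).asInstanceOf[AnyRef],
            msg.getMethod("writeTo", stream),
            stream.getMethod("flush"))
        }.toOption
    }

    // Single-slot cache: only the most recently used flavor keeps a live stream.
    private var proxiedApi: ProtoApi = null
    private var proxied: AnyRef = null
    var streamsCreated: Int = 0 // illustration only, to observe cache reuse

    def writeAndFlush(obj: AnyRef): Unit = {
      val api = protoApis.find(_.messageClass.isInstance(obj)).getOrElse(
        throw new IllegalArgumentException(s"Unsupported message type: ${obj.getClass}"))
      if (!(api eq proxiedApi)) {
        // Flush the previously cached stream before replacing it.
        if (proxied != null) proxiedApi.flush.invoke(proxied)
        proxied = api.newStream(orcOutStream)
        proxiedApi = api
        streamsCreated += 1
      }
      api.writeTo.invoke(obj, proxied)
      api.flush.invoke(proxied)
      orcOutStream.flush()
    }
  }
}
```

With a single slot, alternating between two flavors on every call would recreate the stream each time, which is one reason the cache design is worth a comment in the real source.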
-case class GpuGroupedPythonRunnerFactory(
+class GpuGroupedPythonRunnerFactory(
     conf: org.apache.spark.sql.internal.SQLConf,
     chainedFunc: Seq[(ChainedPythonFunctions, Long)],
     argOffsets: Array[Array[Int]],
     dedupAttrs: StructType,
     pythonOutputSchema: StructType,
     evalType: Int,
-    argNames: Option[Array[Array[Option[String]]]] = None) {
+    argNames: Option[Array[Array[Option[String]]]]) extends Serializable {
   val sessionLocalTimeZone = conf.sessionLocalTimeZone
Removal of case class companion apply and default argNames
Converting from case class to class removes the auto-generated companion apply factory method. Any call site using GpuGroupedPythonRunnerFactory(args...) without new will no longer compile. Removing = None from argNames also means every call site must now supply the argument explicitly. Both are intentional within the validated stack, but represent silent source-compatibility breaks for any downstream or out-of-tree consumer.
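A small sketch of the two breaks, using hypothetical stand-in types (`OldFactory`, `NewFactory`) with a shortened parameter list instead of the real GpuGroupedPythonRunnerFactory signature. It assumes Scala 2, where a plain class gets no compiler-generated `apply`:

```scala
object FactoryCallSites {
  // Before: case class with a defaulted trailing argument (hypothetical stand-in).
  case class OldFactory(evalType: Int, argNames: Option[Array[String]] = None)

  // After: plain Serializable class, no default value (hypothetical stand-in).
  class NewFactory(val evalType: Int, val argNames: Option[Array[String]])
    extends Serializable

  // Old call site: companion apply and the default are both available.
  val before: OldFactory = OldFactory(200)

  // New call site: `new` is required and argNames must be passed explicitly.
  val after: NewFactory = new NewFactory(200, None)

  // NewFactory(200, None)   // rejected under Scala 2: no companion apply
  // new NewFactory(200)     // rejected: argNames no longer has a default
}
```

Out-of-tree callers would need both adjustments at every construction site; callers that already use `new` and pass argNames explicitly are unaffected.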
Related to #14834.
Description
This PR is one reviewable layer in the unshim stack introduced by #15025. It removes old Spark 3.3.0 shim sources that are now provided by shared helpers or the new helper modules. This is scoped to the OSS Spark 3.3.0 cleanup.
Stack context
Testing and validation notes
Checklists
Documentation
Testing
(Covered by the validation notes in the PR description.)
Performance