Remove old Spark 3.3 DBR shim sources #15053
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Greptile Summary
This PR removes several old Spark 3.3 DBR shim source files from
Confidence Score: 3/5
The changes to modified files are clean, but two deleted files had shim-json coverage extending to other DB versions still active in the tree; those builds reference objects that no longer exist anywhere in the codebase.
The original DatabricksShimServiceProvider.scala was the sole definition of the DatabricksShimServiceProvider object for 332db, 341db, 350db143, and 400db173. All four of those service providers still call DatabricksShimServiceProvider.matchesVersion(...) in the working tree, pointing to a compilation gap if this layer is merged independently. The OriginContextShim deletion poses the same risk for 332db.
The PR author notes that DBR validation was not run locally and frames this as a stack layer only valid when the full stack is applied together, which limits confidence when reviewing this diff in isolation.
The two deleted files, DatabricksShimServiceProvider.scala and OriginContextShim.scala, are the most important to scrutinize; a second look should confirm that a prior or subsequent stack layer supplies their replacements for 332db and other affected DB variants.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph Deleted["Deleted from spark330db/"]
A[DatabricksShimServiceProvider.scala\nshim-json: 330db,332db,341db,350db143,400db173]
B[SparkShimServiceProvider.scala\nshim-json: 330db only]
C[OriginContextShim.scala\nshim-json: 330db, 332db]
D[SparkDateTimeExceptionShims.scala\nshim-json: 330db]
E[SparkUpgradeExceptionShims.scala\nshim-json: 330db]
end
subgraph Added["Added to spark330db/"]
F[CheckOverflowInTableInsertShims.scala\nshim-json: 330db,331,332,...,411]
end
subgraph Modified["Modified in spark330db/"]
G[Spark330PlusDBShims.scala\nDelegates to CheckOverflowInTableInsertShims.exprs]
H[SparkShims.scala\nUses dataWriteCmdFromShim]
I[GpuGroupedPythonRunnerFactory.scala\ncase class → class + Serializable]
J[arithmetic.scala\nRemoves Logging from GpuDecimalRemainder]
end
subgraph StillReferencing["Still reference deleted DatabricksShimServiceProvider"]
K[spark332db/SparkShimServiceProvider]
L[spark341db/SparkShimServiceProvider]
M[spark350db143/SparkShimServiceProvider]
N[spark400db173/SparkShimServiceProvider]
end
A -->|deleted, breaks| K
A -->|deleted, breaks| L
A -->|deleted, breaks| M
A -->|deleted, breaks| N
C -->|deleted, 332db now missing OriginContextShim| K
F --> G
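The coverage concern in the flowchart can be checked mechanically. The following is a minimal sketch, not the project's actual tooling: it hard-codes the deleted file names and shim-json lists exactly as they appear in the flowchart, and it assumes that only the 330db shim is being retired. The function name `orphaned_shims` and the `retired` set are hypothetical, introduced only for illustration.

```python
# Hypothetical model of shim-json coverage. File names and shim lists are
# copied from the flowchart above; nothing here reads the real source tree.
deleted = {
    "DatabricksShimServiceProvider.scala": ["330db", "332db", "341db", "350db143", "400db173"],
    "SparkShimServiceProvider.scala": ["330db"],
    "OriginContextShim.scala": ["330db", "332db"],
    "SparkDateTimeExceptionShims.scala": ["330db"],
    "SparkUpgradeExceptionShims.scala": ["330db"],
}

# Assumption: 330db is the only shim this layer retires.
retired = {"330db"}


def orphaned_shims(deleted_files, retired_shims):
    """Map each deleted file to the still-active shims that listed it."""
    result = {}
    for name, shims in deleted_files.items():
        remaining = [s for s in shims if s not in retired_shims]
        if remaining:
            result[name] = remaining
    return result


at_risk = orphaned_shims(deleted, retired)
for name, shims in at_risk.items():
    print(f"{name}: still needed by {', '.join(shims)}")
```

Under these assumptions the sketch flags the same two files the summary singles out. Whether the gap is real depends on whether another layer in the stack supplies replacements, which this check cannot see.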
Related to #14834.
Description
This PR is one reviewable layer in the unshim stack introduced by #15025. It removes old Spark 3.3 DBR shim sources as a separate cleanup layer. DBR changes are intentionally isolated because they are not locally buildable and tend to be review outliers.
Stack context
Testing and validation notes
Checklists
Documentation
Testing
(Covered by the validation notes in the PR description.)
Performance