Add Spark 4.0.3 shim support #15151

Open
firestarman wants to merge 2 commits into NVIDIA:main from firestarman:spark-403-shim-refresh

@firestarman firestarman commented Jun 26, 2026

Collaborator

Fixes #15065.

Description

  • Add Spark 4.0.3 as a supported Scala 2.13 shim profile so the plugin can build and load against Spark 4.0.3.
  • Add the Spark 4.0.3 shim service provider and generated support metadata so runtime shim discovery and qualification documentation include Spark 4.0.3.
  • Split Spark 4.0.3 from the 4.0.1/4.0.2 SparkShims path so 4.0.3 can carry its Spark-specific behavior without affecting earlier 4.0.x shims.
  • Share ACOSH/ASINH compatibility overrides and boundary tests with Spark 4.1.2 because Spark 4.0.3 has the same CPU behavior for large hyperbolic inputs.
  • Update AST fallback handling for ACOSH/ASINH so Spark 4.0.3 uses the non-AST path where GPU AST semantics would not match Spark CPU results.
  • Validated with SPARK_HOME=/bigdata/work/tools/spark-4.0.3-bin-hadoop3 mvn -B -s /home/liangcail/.m2/settings_art.xml -f scala2.13/pom.xml -Dbuildver=403 -Dcuda.version=cuda13 verify: BUILD SUCCESS; integration tests reported 35271 passed, 2058 skipped, 513 xfailed, 895 xpassed; Scala tests reported 1759 succeeded, 0 failed.
  • Local NDS performance results show no overall performance regression for Spark 4.0.3 relative to Spark 4.0.2.

| Item | Value |
| --- | --- |
| Dataset | /bigdata/tpcds_data/parquet_100f |
| Format | Parquet |
| RAPIDS jar | rapids-4-spark_2.13-26.08.0-SNAPSHOT-cuda13-403-perf.jar |
| Spark 4.0.2 | /bigdata/work/tools/spark-4.0.2-bin-hadoop3 |
| Spark 4.0.3 | /bigdata/work/tools/spark-4.0.3-bin-hadoop3 |
| Runs | 3 per shim |
| Compared runs | Warm runs only: run 2 and run 3 |
| Query status | 618 query JSON records checked, 0 non-Completed |
| Result | No overall performance regression observed for Spark 4.0.3 |

| Metric | Spark 4.0.2 | Spark 4.0.3 | 4.0.3 vs 4.0.2 |
| --- | --- | --- | --- |
| Power run 2 | 450.000s | 454.000s | +0.89% |
| Power run 3 | 464.000s | 455.000s | -1.94% |
| Power run avg | 457.000s | 454.500s | -0.55% |
| Sum of per-query avg times | 455.934s | 453.577s | -0.52% |
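
For readers wondering why large hyperbolic inputs get their own compatibility overrides and boundary tests, here is a minimal sketch. It is not taken from Spark or from this PR; the object name, the method names, and the `log2` constant are all hypothetical. It only illustrates a general floating-point fact: the textbook formulas for asinh and acosh square the input, which overflows for very large doubles, while the asymptotic form ln(x) + ln(2) stays finite. Two implementations that pick different formulas can therefore disagree near those boundaries.

```scala
// Illustrative only: hypothetical names, not the Spark or spark-rapids implementation.
object HyperbolicLargeInputSketch {
  private val log2 = math.log(2.0)

  // Textbook formula. x * x overflows to Infinity for very large x,
  // so the result becomes Infinity even though asinh(x) is finite.
  def naiveAsinh(x: Double): Double = math.log(x + math.sqrt(x * x + 1.0))

  // Asymptotic form for large positive x: asinh(x) ~= ln(2x) = ln(x) + ln(2).
  def largeAsinh(x: Double): Double = math.log(x) + log2

  // Same pattern for acosh (defined for x >= 1).
  def naiveAcosh(x: Double): Double = math.log(x + math.sqrt(x * x - 1.0))
  def largeAcosh(x: Double): Double = math.log(x) + log2

  def main(args: Array[String]): Unit = {
    val big = 1.0e200
    println(s"naive asinh overflows: ${naiveAsinh(big) == Double.PositiveInfinity}")
    println(s"large-input asinh:     ${largeAsinh(big)}")
    println(s"naive acosh overflows: ${naiveAcosh(big) == Double.PositiveInfinity}")
    println(s"large-input acosh:     ${largeAcosh(big)}")
  }
}
```

For moderately large inputs such as 1.0e10 the two forms agree to within rounding, so the difference only shows up at the extremes, which is where boundary tests are most useful.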

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Please provide the names of the existing tests in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman firestarman requested a review from a team as a code owner June 26, 2026 07:38

greptile-apps Bot commented Jun 26, 2026

Contributor

Too many files changed for review. (199 files found, 100 file limit)

@firestarman firestarman requested a review from a team June 26, 2026 07:42
@firestarman

Collaborator Author

build

{"spark": "412"}
spark-rapids-shim-json-lines ***/
package com.nvidia.spark.rapids.shims.spark412
package com.nvidia.spark.rapids.shims

Collaborator

Adjacent to this changed hunk, the private vals below are named Log2 and Epsilon. Could you rename them to lowerCamelCase (log2, epsilon) and update their call sites? The repo style expects member vals to use camelCase.

Collaborator Author

nice catch, updated

Comment on lines 34 to 35

Collaborator

NIT:
Log2 => log2
Epsilon => epsilon

Collaborator Author

good suggestion, updated

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
@firestarman

Collaborator Author

build

@firestarman firestarman requested a review from res-life June 26, 2026 10:17

Development

Successfully merging this pull request may close these issues.

[FEA] Add support for Apache Spark 4.0.3

3 participants