[Spark][Infra] Drop support for Spark 3.5 and formally pin to released Spark 4.0.1 #5616
Conversation
This reverts commit 8fd880f.
In theory I could do the src code shims + test code shims separately if that would help. Let me know if that makes reviews easier (not sure if anyone wants to review the shim code changes closely, or whether passing tests and a successful compile are enough).
kernel-spark/src/main/java/io/delta/kernel/spark/catalog/SparkTable.java (outdated review thread, resolved)
@@ -1,59 +0,0 @@
name: "Delta Spark Master"
I suggest we update the PR title to say we drop support for Spark 3.5 and Spark master compilation?
I mean we haven't actually been compiling with Spark master in a while (we're using a very stale snapshot). But I can make the title clearer.
Hm. Sorry, I'm still confused. Here we are deleting our job to compile against Spark "master", right? (Perhaps it was a stale master.)
But does "Drop support for Spark 3.5 and formally pin to released Spark 4.0.1" reflect that?
That seems like an important highlight, sorry, and I want to make sure my understanding is correct.
I think calling it Spark master before was misleading; in fact, in the previous PR we renamed the Spark version spec to spark40Snapshot instead of master. Saying we are removing Spark master is misleading, considering we were never actually compiling against Spark master. We will be fixing that in future PRs.
It would be more correct to say spark_master_test.yaml was incorrectly named this whole time.
Should we remove this action completely? Eventually we will be adding back Spark master support; at that point, would we have to find it in history and add it back?
The alternative is to just make this a no-op in some way.
Yeah, for the Iceberg action I just made it a no-op.
I'm fine with just removing this one, though, since I'm tracking this work directly and will add it back (we need to update the CI in a few ways for multiple versions anyway).
// Changes in 4.1.0
// TODO: change in type hierarchy due to removal of DeltaThrowableConditionShim
ProblemFilters.exclude[MissingTypesProblem]("io.delta.exceptions.*")
@reviewers this seems safe to me, considering no one should be catching DeltaThrowableConditionShim, but I would like additional opinions.
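For context on what this exclusion does, below is a minimal sketch of how a MiMa binary-compatibility filter like this is typically registered in an sbt build; the project name, artifact coordinates, and previous version are illustrative assumptions, not Delta's actual build definition.

```scala
// build.sbt (illustrative sketch; assumes the sbt-mima-plugin is enabled)
import com.typesafe.tools.mima.core._

lazy val spark = (project in file("spark"))
  .settings(
    // Compare the compiled artifact against the last released version.
    mimaPreviousArtifacts := Set("io.delta" %% "delta-spark" % "4.0.0"),
    // Whitelist the type-hierarchy change caused by removing
    // DeltaThrowableConditionShim from the io.delta.exceptions classes.
    mimaBinaryIssueFilters ++= Seq(
      ProblemFilters.exclude[MissingTypesProblem]("io.delta.exceptions.*")
    )
  )
```

A MissingTypesProblem means a public class lost a parent type; excluding it is safe only if no caller relies on catching or matching on the removed shim type, which matches the reasoning above.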
).configureUnidoc()
/*
TODO: readd delta-iceberg on Spark 4.0+
@lzlfred Hey Fred, we will be releasing on both Spark 4.0 and Spark 4.1 next release, so we will need to update this build to work for that.
Also tracking the TODO at #5326.
).configureUnidoc()
/*
TODO: compilation broken for Spark 4.0
Tracking at #5326.
@linzhou-db @littlegrasscao FYI, can you please look into fixing this once I merge this PR?
val lookupSparkVersion: PartialFunction[(Int, Int), String] = {
  // version 4.0.0-preview1
  case (major, minor) if major >= 4 => "4.0.0-preview1"
// TODO: how to run integration tests for multiple Spark versions
tracking at #5326
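As a side note for readers unfamiliar with this pattern, here is a small self-contained sketch of how such a PartialFunction version lookup behaves; the object name and the 4.0.1 mapping are assumptions based on this PR's title, not the actual build code.

```scala
object SparkVersionLookup {
  // Maps a Delta (major, minor) version pair to the Spark version string the
  // integration-test tooling should use. Values are illustrative only.
  val lookupSparkVersion: PartialFunction[(Int, Int), String] = {
    // After this PR, Delta 4.x pins to the released Spark 4.0.1.
    case (major, _) if major >= 4 => "4.0.1"
  }

  def main(args: Array[String]): Unit = {
    // `lift` turns the partial function into (Int, Int) => Option[String],
    // so unmapped versions return None instead of throwing a MatchError.
    println(lookupSparkVersion.lift((4, 0))) // Some(4.0.1)
    println(lookupSparkVersion.lift((3, 5))) // None: Spark 3.5 support dropped
  }
}
```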
with open("python/README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

# TODO: once we support multiple Spark versions update this to be compatible with both
tracking at #5326
raveeram-db left a comment:
Build changes look good to me overall.
uses: actions/setup-java@v3
with:
  distribution: "zulu"
  java-version: "11"
General question @scottsand-db: why is the kernel unitycatalog a separate GitHub action from kernel?
// Try to write as same file and expect an error
intercept[FileAlreadyExistsException] {
val e = intercept[IOException] {
Why is removing Spark 3.5 causing all these kernel code changes?
I upgraded our Hadoop version to match Spark 4.0. More details at #5616 (comment).
The Kernel changes are only related to this.
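To illustrate why the Hadoop bump touches these tests: Hadoop's FileAlreadyExistsException is a subclass of IOException, so intercepting it is a strictly narrower assertion than before. Below is a minimal ScalaTest-style sketch; the suite name and the writeWithoutOverwrite helper are hypothetical stand-ins for the Kernel code under test.

```scala
import java.io.IOException
import org.apache.hadoop.fs.FileAlreadyExistsException
import org.scalatest.funsuite.AnyFunSuite

class OverwriteBehaviorSuite extends AnyFunSuite {

  // Hypothetical stand-in for the Kernel's Hadoop-backed write path, which
  // now surfaces FileAlreadyExistsException when the target already exists.
  private def writeWithoutOverwrite(path: String): Unit =
    throw new FileAlreadyExistsException(s"$path already exists")

  test("writing the same file twice fails") {
    // Before the Hadoop upgrade the test could only assert on IOException;
    // now it can assert the more specific subclass.
    val e = intercept[FileAlreadyExistsException] {
      writeWithoutOverwrite("/tmp/part-00000.parquet")
    }
    assert(e.isInstanceOf[IOException]) // still an IOException, just narrower
  }
}
```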
Let's merge this quickly. The Java version is adding all sorts of complications in a different place.
Which Delta project/connector is this regarding?
- [x] Spark

Description
PART OF #5326
Contains the following changes:
- Removes Spark 3.5 support
- Adds explicit Spark 4.0 support
- Removes a "master" build for now
- Merges shims from the 3.5 vs 4.0 breaking changes into the src code

In a future PR:
- we will add Spark 4.1.0-SNAPSHOT support (in preparation for the Spark 4.1 release)
- we will add back a "master" build tracking Spark master

(These will require adding new shims, but in different areas.)

How was this patch tested?
Unit tests + ran integration tests locally (python, scala + pip)
Tracking open TODOs at #5326