@turboFei
Owner

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

bjornjorgensen and others added 7 commits March 25, 2025 18:52
…file`

### What changes were proposed in this pull request?
Change `description-file` to `description_file`

### Why are the changes needed?

Running `./dev/make-distribution.sh --name custom-spark --pip -Pkubernetes > output.txt 2>&1` writes the following deprecation warning to `output.txt`:

```
+ echo 'Building python distribution package'
Building python distribution package
+ pushd /home/bjorn/spark/python
+ rm -rf pyspark.egg-info
+ python3 setup.py sdist
/usr/lib/python3.11/site-packages/setuptools/dist.py:745: SetuptoolsDeprecationWarning: Invalid dash-separated options
!!

        ********************************************************************************
        Usage of dash-separated 'description-file' will not be supported in future
        versions. Please use the underscore name 'description_file' instead.

        This deprecation is overdue, please update your project and remove deprecated
        calls to avoid build errors in the future.

        See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
        ********************************************************************************

!!
  opt = self.warn_dash_deprecation(opt, section)
running sdist
running egg_info
```
### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GitHub Actions.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#50369
Closes apache#50372

Closes apache#50371 from bjornjorgensen/bjornjorgensen-description_file3.5].

Authored-by: Bjørn Jørgensen <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
### What changes were proposed in this pull request?

Fix a variable name typo in the documentation.

### Why are the changes needed?

To keep the documentation correct.

### Does this PR introduce _any_ user-facing change?

Yes

### How was this patch tested?

Not tested.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50443 from Mrhs121/typo.

Authored-by: ShengHuang <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
(cherry picked from commit 1dfb046)
Signed-off-by: Kent Yao <[email protected]>
…sloader based on the default session classloader on executor

### What changes were proposed in this pull request?

This PR constructs the session-specific classloader on top of the default session classloader on the executor side in Connect mode. The default session classloader already has the global JARs (e.g., those added by `--jars`) on its classpath.
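
The parent-delegation idea described above can be illustrated with plain JDK classloaders. This is a minimal sketch and not Spark's actual code: `SessionClassLoaderSketch`, `newSessionLoader`, and `fakeGlobalJar` are hypothetical names, and a temporary directory containing a marker file stands in for a global JAR added by `--jars`.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public final class SessionClassLoaderSketch {

    // Hypothetical helper: the session loader delegates to the given parent,
    // so anything the parent can see stays visible to the session.
    static URLClassLoader newSessionLoader(URL[] sessionUrls, ClassLoader parent) {
        return new URLClassLoader(sessionUrls, parent);
    }

    // Creates a temp directory holding one marker resource; it stands in for a global JAR.
    static URL fakeGlobalJar(String resourceName) throws IOException {
        Path dir = Files.createTempDirectory("global-jars");
        Files.write(dir.resolve(resourceName), new byte[] {1});
        // The URI of an existing directory ends with '/', so URLClassLoader treats it as a directory.
        return dir.toUri().toURL();
    }

    public static void main(String[] args) throws IOException {
        URL globalJar = fakeGlobalJar("marker.txt");

        // Stand-in for the executor's default session classloader (assumed to hold the --jars entries).
        ClassLoader defaultLoader = new URLClassLoader(new URL[] {globalJar}, null);

        // A session loader that does not delegate to the default loader.
        ClassLoader isolated = newSessionLoader(new URL[0], null);
        // A session loader built on top of the default loader.
        ClassLoader layered = newSessionLoader(new URL[0], defaultLoader);

        System.out.println("isolated sees marker: " + (isolated.getResource("marker.txt") != null));
        System.out.println("layered sees marker: " + (layered.getResource("marker.txt") != null));
    }
}
```

In this sketch only the layered loader resolves the marker resource, which mirrors why a session classloader that ignores the default one cannot find classes shipped in the global JARs.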

### Why are the changes needed?

In Spark Connect mode, when connecting to a non-local (e.g., standalone) cluster, the executor creates an isolated session state that includes a session-specific classloader for each task. However, this session-specific classloader does not include the global JARs specified by the `--jars` option in its classpath, which can lead to deserialization exceptions. For example:

``` console
Caused by: java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance of org.apache.spark.rdd.MapPartitionsRDD
        at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2096)
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

The newly added test passes.
### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#50475 from wbo4958/classloader-3.5.

Authored-by: Bobby Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
### What changes were proposed in this pull request?

Bump Parquet to 1.15.1.

### Why are the changes needed?

To fix critical CVE: https://www.cve.org/CVERecord?id=CVE-2025-30065

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50528 from wangyum/parquet-branch-3.5.

Lead-authored-by: [email protected] <[email protected]>
Co-authored-by: Fokko <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: panbingkun <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
…k plugins are not reloaded

### What changes were proposed in this pull request?

This PR adds a unit test to verify that Spark plugin JARs specified via `--jars` are not reloaded.

### Why are the changes needed?

This PR is a follow-up to apache#50334 (comment).

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

The added test passes.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#50526 from wbo4958/SPARK-51537-followup.

Authored-by: Bobby Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit 622fa35)
Signed-off-by: Hyukjin Kwon <[email protected]>
@turboFei turboFei force-pushed the branch-3.5_celeborn_patch branch from 49cc5c4 to cf4ee3f Compare April 11, 2025 08:05