Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-50671][BUILD] Error compiling spark-protobuf module using user-defined protoc #49296

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

morvenhuang
Copy link
Contributor

@morvenhuang morvenhuang commented Dec 26, 2024

What changes were proposed in this pull request?

People may use user-defined protoc(as below) for various reasons, for example, if they're on macOS 11, they have to use it since the default version of protoc and protoc-gen-grpc-java plugin do not work on macOS 11.

$ export SPARK_PROTOC_EXEC_PATH=/Users/foobar/dev/protoc-4.29.1/protoc
$ export CONNECT_PLUGIN_EXEC_PATH=/Users/foobar/dev/protoc-gen-grpc-java-1.59.1/protoc-gen-grpc-java
$ ./build/mvn -DskipTests clean package -Puser-defined-protoc -DskipDefaultProtoc

But this results in compilation failure:

[ERROR] Failed to execute goal com.github.os72:protoc-jar-maven-plugin:3.11.4:run (default) on project spark-protobuf_2.13: Execution default of goal com.github.os72:protoc-jar-maven-plugin:3.11.4:run failed: Cannot read the array length because "<local6>" is null

This PR aims to fix this compilation failure, the error is caused by protocol-jar-maven-plugin bug:

Why are the changes needed?

Before:

$ export SPARK_PROTOC_EXEC_PATH=/Users/foobar/dev/protoc-4.29.1/protoc
$ export CONNECT_PLUGIN_EXEC_PATH=/Users/foobar/dev/protoc-gen-grpc-java-1.59.1/protoc-gen-grpc-java
$ ./build/mvn -DskipTests clean package -Puser-defined-protoc -DskipDefaultProtoc
...
[ERROR] Failed to execute goal com.github.os72:protoc-jar-maven-plugin:3.11.4:run (default) on project spark-protobuf_2.13: Execution default of goal com.github.os72:protoc-jar-maven-plugin:3.11.4:run failed: Cannot read the array length because "<local6>" is null

After:

$ export SPARK_PROTOC_EXEC_PATH=/Users/foobar/dev/protoc-4.29.1/protoc
$ export CONNECT_PLUGIN_EXEC_PATH=/Users/foobar/dev/protoc-gen-grpc-java-1.59.1/protoc-gen-grpc-java
$ ./build/mvn -DskipTests clean package -Puser-defined-protoc -DskipDefaultProtoc
...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 4.0.0-SNAPSHOT:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [  7.539 s]
[INFO] Spark Project Tags ................................. SUCCESS [  7.527 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  8.413 s]
[INFO] Spark Project Common Utils ......................... SUCCESS [ 29.923 s]
[INFO] Spark Project Local DB ............................. SUCCESS [ 10.732 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 16.691 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 12.653 s]
[INFO] Spark Project Variant .............................. SUCCESS [  5.262 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 16.246 s]
[INFO] Spark Project Connect Shims ........................ SUCCESS [  4.356 s]
[INFO] Spark Project Launcher ............................. SUCCESS [  8.666 s]
[INFO] Spark Project Core ................................. SUCCESS [02:56 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 30.063 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 33.989 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 51.897 s]
[INFO] Spark Project SQL API .............................. SUCCESS [ 39.508 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [03:15 min]
[INFO] Spark Project SQL .................................. SUCCESS [04:56 min]
[INFO] Spark Project ML Library ........................... SUCCESS [02:30 min]
[INFO] Spark Project Tools ................................ SUCCESS [  4.463 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:33 min]
[INFO] Spark Project Connect Common ....................... SUCCESS [ 55.308 s]
[INFO] Spark Avro ......................................... SUCCESS [ 33.350 s]
[INFO] Spark Protobuf ..................................... SUCCESS [ 38.981 s]
[INFO] Spark Project REPL ................................. SUCCESS [ 21.165 s]
[INFO] Spark Project Connect Server ....................... SUCCESS [ 58.033 s]
[INFO] Spark Project Connect Client ....................... SUCCESS [ 59.078 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  6.703 s]
[INFO] Kafka 0.10+ Token Provider for Streaming ........... SUCCESS [ 20.047 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 26.882 s]
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [ 41.397 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 41.308 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [ 11.300 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Run command:

$ export SPARK_PROTOC_EXEC_PATH=/Users/foobar/dev/protoc-4.29.1/protoc
$ export CONNECT_PLUGIN_EXEC_PATH=/Users/foobar/dev/protoc-gen-grpc-java-1.59.1/protoc-gen-grpc-java
$ ./build/mvn -DskipTests clean package -Puser-defined-protoc -DskipDefaultProtoc

Was this patch authored or co-authored using generative AI tooling?

No.

@morvenhuang
Copy link
Contributor Author

Hi @LuciferYang @zhengruifeng, could you please help to reivew this when you have time? Many thanks.

@LuciferYang
Copy link
Contributor

Although it seems like there should be an issue, executing the command build/mvn -DskipTests clean install -pl connector/protobuf -e -am should not trigger the user-defined-protoc profile.

Is it a mistake in the pr description or does it trigger other issues when specifying SPARK_PROTOC_EXEC_PATH

@morvenhuang
Copy link
Contributor Author

@LuciferYang Thank you very much for pointing out, you're right, I forgot to mention -Puser-defined-protoc -DskipDefaultProtoc, I've corrected the pr description and also added some details. Please help to review when you have time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants