diff --git a/docs/get-started/xgboost-examples/csp/databricks/databricks.md b/docs/get-started/xgboost-examples/csp/databricks/databricks.md index be9bab26..cfbb838c 100644 --- a/docs/get-started/xgboost-examples/csp/databricks/databricks.md +++ b/docs/get-started/xgboost-examples/csp/databricks/databricks.md @@ -21,7 +21,7 @@ Navigate to your home directory in the UI and select **Create** > **File** from create an `init.sh` scripts with contents: ```bash #!/bin/bash - sudo wget -O /databricks/jars/rapids-4-spark_2.12-26.02.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar + sudo wget -O /databricks/jars/rapids-4-spark_2.12-26.06.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar ``` 1. Select the Databricks Runtime Version from one of the supported runtimes specified in the Prerequisites section. @@ -68,7 +68,7 @@ create an `init.sh` scripts with contents: ```bash spark.rapids.sql.python.gpu.enabled true spark.python.daemon.module rapids.daemon_databricks - spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-26.02.0.jar:/databricks/spark/python + spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-26.06.0.jar:/databricks/spark/python ``` Note that since python memory pool require installing the cudf library, so you need to install cudf library in each worker nodes `pip install cudf-cu11 --extra-index-url=https://pypi.nvidia.com` or disable python memory pool diff --git a/docs/get-started/xgboost-examples/csp/databricks/init.sh b/docs/get-started/xgboost-examples/csp/databricks/init.sh index 559c6f56..2ce9c5ea 100644 --- a/docs/get-started/xgboost-examples/csp/databricks/init.sh +++ b/docs/get-started/xgboost-examples/csp/databricks/init.sh @@ -17,7 +17,7 @@ sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-gpu_2.12--ml.dmlc__xgboost4j-gpu_2.12__1.5.2.jar sudo rm -f 
/databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-spark-gpu_2.12--ml.dmlc__xgboost4j-spark-gpu_2.12__1.5.2.jar -sudo wget -O /databricks/jars/rapids-4-spark_2.12-26.02.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar +sudo wget -O /databricks/jars/rapids-4-spark_2.12-26.06.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar sudo wget -O /databricks/jars/xgboost4j-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-gpu_2.12/1.7.1/xgboost4j-gpu_2.12-1.7.1.jar sudo wget -O /databricks/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-spark-gpu_2.12/1.7.1/xgboost4j-spark-gpu_2.12-1.7.1.jar ls -ltr diff --git a/docs/get-started/xgboost-examples/prepare-package-data/preparation-python.md b/docs/get-started/xgboost-examples/prepare-package-data/preparation-python.md index 5dfae489..0e9b55e8 100644 --- a/docs/get-started/xgboost-examples/prepare-package-data/preparation-python.md +++ b/docs/get-started/xgboost-examples/prepare-package-data/preparation-python.md @@ -5,7 +5,7 @@ For simplicity export the location to these jars. 
All examples assume the packag ### Download the jars Download the RAPIDS Accelerator for Apache Spark plugin jar - * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar) + * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar) ### Build XGBoost Python Examples diff --git a/docs/get-started/xgboost-examples/prepare-package-data/preparation-scala.md b/docs/get-started/xgboost-examples/prepare-package-data/preparation-scala.md index 8b5b06e1..d23d6115 100644 --- a/docs/get-started/xgboost-examples/prepare-package-data/preparation-scala.md +++ b/docs/get-started/xgboost-examples/prepare-package-data/preparation-scala.md @@ -5,7 +5,7 @@ For simplicity export the location to these jars. All examples assume the packag ### Download the jars 1. Download the RAPIDS Accelerator for Apache Spark plugin jar - * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar) + * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar) ### Build XGBoost Scala Examples diff --git a/examples/ML+DL-Examples/Optuna-Spark/README.md b/examples/ML+DL-Examples/Optuna-Spark/README.md index e7c8311a..92bd7ee3 100644 --- a/examples/ML+DL-Examples/Optuna-Spark/README.md +++ b/examples/ML+DL-Examples/Optuna-Spark/README.md @@ -147,8 +147,8 @@ We use [RAPIDS](https://docs.rapids.ai/install/#get-rapids) for GPU-accelerated ``` shell sudo apt install libmysqlclient-dev -conda create -n rapids-26.02 -c rapidsai -c conda-forge -c nvidia \ - cudf=26.02 cuml=26.02 python=3.10 'cuda-version>=12.0,<=12.5' +conda create -n rapids-26.06 -c rapidsai -c conda-forge -c nvidia \ + cudf=26.06 cuml=26.06 python=3.10 'cuda-version>=12.0,<=12.5' conda activate optuna-spark pip install mysqlclient pip install optuna joblib joblibspark 
ipywidgets diff --git a/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/init_optuna.sh b/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/init_optuna.sh index 18267c94..69ab5898 100644 --- a/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/init_optuna.sh +++ b/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/init_optuna.sh @@ -55,7 +55,7 @@ fi # rapids import -SPARK_RAPIDS_VERSION=26.02.0 +SPARK_RAPIDS_VERSION=26.06.0 curl -L https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/${SPARK_RAPIDS_VERSION}/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar -o \ /databricks/jars/rapids-4-spark_2.12-${SPARK_RAPIDS_VERSION}.jar diff --git a/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/start_cluster.sh b/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/start_cluster.sh index 4ffb594a..36d12c23 100755 --- a/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/start_cluster.sh +++ b/examples/ML+DL-Examples/Optuna-Spark/optuna-examples/databricks/start_cluster.sh @@ -26,7 +26,7 @@ json_config=$(cat < Here is the bar chart from a recent execution on Google Colab's T4 High RAM instance using -RAPIDS Spark 26.02.0 with Apache Spark 3.5.0 +RAPIDS Spark 26.06.0 with Apache Spark 3.5.0 ![tpcds-speedup](/docs/img/guides/tpcds.png) diff --git a/examples/SQL+DF-Examples/tpcds/notebooks/TPCDS-SF10.ipynb b/examples/SQL+DF-Examples/tpcds/notebooks/TPCDS-SF10.ipynb index f5f12385..11e2332e 100644 --- a/examples/SQL+DF-Examples/tpcds/notebooks/TPCDS-SF10.ipynb +++ b/examples/SQL+DF-Examples/tpcds/notebooks/TPCDS-SF10.ipynb @@ -30,7 +30,7 @@ "outputs": [], "source": [ "spark_version='3.5.5'\n", - "rapids_version='26.02.0'\n", + "rapids_version='26.06.0'\n", "sparkmeasure_version='0.27'" ] }, diff --git a/examples/UDF-Examples/RAPIDS-accelerated-UDFs/README.md b/examples/UDF-Examples/RAPIDS-accelerated-UDFs/README.md index 530cdf1b..5ceb8d94 100644 --- 
a/examples/UDF-Examples/RAPIDS-accelerated-UDFs/README.md +++ b/examples/UDF-Examples/RAPIDS-accelerated-UDFs/README.md @@ -274,7 +274,7 @@ then do the following inside the Docker container. ### Get jars from Maven Central -[rapids-4-spark_2.12-26.02.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar) +[rapids-4-spark_2.12-26.06.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar) ### Launch a local mode Spark diff --git a/examples/UDF-Examples/RAPIDS-accelerated-UDFs/extract-cudf-libs.sh b/examples/UDF-Examples/RAPIDS-accelerated-UDFs/extract-cudf-libs.sh index a00081b3..45ad3b92 100755 --- a/examples/UDF-Examples/RAPIDS-accelerated-UDFs/extract-cudf-libs.sh +++ b/examples/UDF-Examples/RAPIDS-accelerated-UDFs/extract-cudf-libs.sh @@ -28,13 +28,13 @@ # ./extract-cudf-libs.sh # # Environment Variables (optional, will use pom.xml values if not set): -# RAPIDS4SPARK_VERSION - rapids-4-spark version (e.g., 26.02.0 or 26.06.0-SNAPSHOT) +# RAPIDS4SPARK_VERSION - rapids-4-spark version (e.g., 26.06.0 or 26.06.0-SNAPSHOT) # SCALA_VERSION - Scala binary version (e.g., 2.12, 2.13) # CUDA_VERSION - CUDA version (e.g., cuda11, cuda12) -# CUDF_BRANCH - cuDF git branch for headers (e.g., main, branch-26.02) +# CUDF_BRANCH - cuDF git branch for headers (e.g., main, branch-26.06) # # Example with overrides: -# RAPIDS4SPARK_VERSION=26.02.0 CUDA_VERSION=cuda11 ./extract-cudf-libs.sh +# RAPIDS4SPARK_VERSION=26.06.0 CUDA_VERSION=cuda11 ./extract-cudf-libs.sh ############################################################################### set -e diff --git a/examples/UDF-Examples/RAPIDS-accelerated-UDFs/pom.xml b/examples/UDF-Examples/RAPIDS-accelerated-UDFs/pom.xml index 7ac5a7c3..b2c4390d 100644 --- a/examples/UDF-Examples/RAPIDS-accelerated-UDFs/pom.xml +++ b/examples/UDF-Examples/RAPIDS-accelerated-UDFs/pom.xml @@ -37,7 +37,7 @@ cuda12 2.12 - 26.02.0 + 26.06.0 3.1.1 
2.12.15 ${project.build.directory}/cpp-build diff --git a/examples/XGBoost-Examples/mortgage/notebooks/python/MortgageETL.ipynb b/examples/XGBoost-Examples/mortgage/notebooks/python/MortgageETL.ipynb index 45a3fdc9..c50149ab 100644 --- a/examples/XGBoost-Examples/mortgage/notebooks/python/MortgageETL.ipynb +++ b/examples/XGBoost-Examples/mortgage/notebooks/python/MortgageETL.ipynb @@ -9,7 +9,7 @@ "Dataset is derived from Fannie Mae’s [Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) with all rights reserved by Fannie Mae. Refer to these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-24.12/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset.\n", "\n", "### 2. Download needed jars\n", - "* [rapids-4-spark_2.12-26.02.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar)\n", + "* [rapids-4-spark_2.12-26.06.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar)\n", "\n", "\n", "### 3. Start Spark Standalone\n", @@ -17,7 +17,7 @@ "\n", "### 4. Add ENV\n", "```\n", - "$ export SPARK_JARS=rapids-4-spark_2.12-26.02.0.jar\n", + "$ export SPARK_JARS=rapids-4-spark_2.12-26.06.0.jar\n", "$ export PYSPARK_DRIVER_PYTHON=jupyter \n", "$ export PYSPARK_DRIVER_PYTHON_OPTS=notebook\n", "```\n", diff --git a/examples/XGBoost-Examples/mortgage/notebooks/scala/mortgage-ETL.ipynb b/examples/XGBoost-Examples/mortgage/notebooks/scala/mortgage-ETL.ipynb index b7074ed1..c98a9956 100644 --- a/examples/XGBoost-Examples/mortgage/notebooks/scala/mortgage-ETL.ipynb +++ b/examples/XGBoost-Examples/mortgage/notebooks/scala/mortgage-ETL.ipynb @@ -20,14 +20,14 @@ "Refer to these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.12/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset.\n", "\n", "### 2. 
Download needed jars\n", - "* [rapids-4-spark_2.12-26.02.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar)\n", + "* [rapids-4-spark_2.12-26.06.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar)\n", "\n", "### 3. Start Spark Standalone\n", "Before Running the script, please setup Spark standalone mode\n", "\n", "### 4. Add ENV\n", "```\n", - "$ export SPARK_JARS=rapids-4-spark_2.12-26.02.0.jar\n", + "$ export SPARK_JARS=rapids-4-spark_2.12-26.06.0.jar\n", "\n", "```\n", "\n", diff --git a/examples/XGBoost-Examples/taxi/notebooks/python/taxi-ETL.ipynb b/examples/XGBoost-Examples/taxi/notebooks/python/taxi-ETL.ipynb index 781a528e..f4e6b6df 100644 --- a/examples/XGBoost-Examples/taxi/notebooks/python/taxi-ETL.ipynb +++ b/examples/XGBoost-Examples/taxi/notebooks/python/taxi-ETL.ipynb @@ -19,14 +19,14 @@ "All data could be found at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page\n", "\n", "### 2. Download needed jars\n", - "* [rapids-4-spark_2.12-26.02.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar)\n", + "* [rapids-4-spark_2.12-26.06.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar)\n", "\n", "### 3. Start Spark Standalone\n", "Before running the script, please setup Spark standalone mode\n", "\n", "### 4. 
Add ENV\n", "```\n", - "$ export SPARK_JARS=rapids-4-spark_2.12-26.02.0.jar\n", + "$ export SPARK_JARS=rapids-4-spark_2.12-26.06.0.jar\n", "$ export PYSPARK_DRIVER_PYTHON=jupyter \n", "$ export PYSPARK_DRIVER_PYTHON_OPTS=notebook\n", "```\n", diff --git a/examples/XGBoost-Examples/taxi/notebooks/scala/taxi-ETL.ipynb b/examples/XGBoost-Examples/taxi/notebooks/scala/taxi-ETL.ipynb index 6f1dfc04..3091a194 100644 --- a/examples/XGBoost-Examples/taxi/notebooks/scala/taxi-ETL.ipynb +++ b/examples/XGBoost-Examples/taxi/notebooks/scala/taxi-ETL.ipynb @@ -19,14 +19,14 @@ "All data could be found at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page\n", "\n", "### 2. Download needed jar\n", - "* [rapids-4-spark_2.12-26.02.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.02.0/rapids-4-spark_2.12-26.02.0.jar)\n", + "* [rapids-4-spark_2.12-26.06.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/26.06.0/rapids-4-spark_2.12-26.06.0.jar)\n", "\n", "### 3. Start Spark Standalone\n", "Before running the script, please setup Spark standalone mode\n", "\n", "### 4. Add ENV\n", "```\n", - "$ export SPARK_JARS=rapids-4-spark_2.12-26.02.0.jar\n", + "$ export SPARK_JARS=rapids-4-spark_2.12-26.06.0.jar\n", "\n", "```\n", "\n", diff --git a/examples/spark-connect-gpu/server/README.md b/examples/spark-connect-gpu/server/README.md index 893c8437..f1a8d850 100644 --- a/examples/spark-connect-gpu/server/README.md +++ b/examples/spark-connect-gpu/server/README.md @@ -150,7 +150,7 @@ path. Otherwise, we use variables starting with `local_`. 
### Spark Connect Server - **Image**: Custom build based on `apache/spark:4.0.0` with Spark RAPIDS ETL and ML Plugins -- **RAPIDS Version**: 26.02.0 for CUDA 12 +- **RAPIDS Version**: 26.06.0 for CUDA 12 - **Ports**: 15002 (gRPC), 4040 (Driver UI) - **Configuration**: Optimized for GPU acceleration with memory management diff --git a/examples/spark-connect-gpu/server/docker-compose.yaml b/examples/spark-connect-gpu/server/docker-compose.yaml index abcf497d..22f8b7b0 100644 --- a/examples/spark-connect-gpu/server/docker-compose.yaml +++ b/examples/spark-connect-gpu/server/docker-compose.yaml @@ -74,7 +74,7 @@ services: dockerfile: Dockerfile args: - CUDA_VERSION=${CUDA_VERSION:-12} - - RAPIDS_VERSION=${RAPIDS_VERSION:-26.02.0} + - RAPIDS_VERSION=${RAPIDS_VERSION:-26.06.0} - REPO_URL=${REPO_URL:-https://repo1.maven.org/maven2} container_name: spark-connect-server hostname: spark-connect-server diff --git a/tools/databricks/README.md b/tools/databricks/README.md index b6676b3b..54e4bae2 100644 --- a/tools/databricks/README.md +++ b/tools/databricks/README.md @@ -19,4 +19,4 @@ top of the notebook. After that, select *Run all* to execute the tools for the 1. Multiple event logs must be comma-separated. - For example: `/dbfs/path/to/eventlog1,/dbfs/path/to/eventlog2` -**Latest Tools Version Supported** 26.02.0 \ No newline at end of file +**Latest Tools Version Supported** 26.06.0 \ No newline at end of file
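
Reviewer note: this diff bumps `26.02.0` to `26.06.0` across docs, scripts, notebooks, `pom.xml`, and `docker-compose.yaml`, so it may be worth confirming that no stale reference remains after applying it. The following is a minimal sketch of such a check, not part of the patch. The function name `find_stale_versions` and the default extension list are hypothetical and chosen only for illustration.

```python
import os
import re


def find_stale_versions(root, old_version,
                        extensions=(".md", ".sh", ".ipynb", ".xml", ".yaml")):
    """Return (path, line_number, line) for each line under root that mentions old_version.

    Hypothetical helper for reviewing a version bump; the extension list is an
    assumption based on the file types touched by this diff.
    """
    # Reject matches embedded in a longer number, e.g. 126.02.0 or 26.02.01.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_version) + r"(?!\d)")
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata so packed objects are never scanned.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((path, number, line.rstrip("\n")))
    return hits
```

Calling `find_stale_versions(".", "26.02.0")` from the repository root after applying the patch should ideally return an empty list. The short form used in the conda command (`cudf=26.02`) would need a separate call with `"26.02"`.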