Databricks page updates #147

Draft: wants to merge 3 commits into `main`
source/platforms/databricks.md (46 additions & 27 deletions)
# Databricks

To use RAPIDS on Databricks we can create and launch a compute cluster with the RAPIDS libraries.

## Pre-requisites

1. Your Databricks workspace must have [Databricks Container Services enabled](https://docs.databricks.com/administration-guide/clusters/container-services.html).

2. You'll need [Docker](https://docs.docker.com/engine/reference/commandline/cli/) and a container registry such as [DockerHub](https://hub.docker.com/) or [Amazon ECR](https://aws.amazon.com/ecr/) where you can publish images.

## Build custom RAPIDS container

To start we need to build a container image that is compatible with Databricks and has the RAPIDS libraries installed. It is recommended to build from a [Databricks base image](https://hub.docker.com/u/databricksruntime), so we will use a multi-stage build to combine the Databricks container with the RAPIDS container.

```{note}
You can also build your Docker base from scratch if you prefer. The Docker image must meet these [requirements](https://docs.databricks.com/clusters/custom-containers.html#option-2-build-your-own-docker-base).
```

Let's create a new `Dockerfile` with the following contents.

```dockerfile
# First stage will use the RAPIDS image to export the RAPIDS conda environment
FROM rapidsai/rapidsai-core:22.12-cuda11.5-runtime-ubuntu18.04-py3.9 as rapids
RUN conda list -n rapids --explicit > /rapids/rapids-spec.txt

# Second stage will take the Databricks image and install the exported conda environment
FROM databricksruntime/gpu-conda:cuda11

COPY --from=rapids /rapids/rapids-spec.txt /tmp/spec.txt

RUN conda create --name rapids --file /tmp/spec.txt && rm -f /tmp/spec.txt
```

Now we can build the image. Be sure to use the registry/username where you will be publishing your image.

```console
$ docker build --tag <registry>/<username>/rapids_databricks:latest .
```

Then push the image to your registry.

```console
$ docker push <registry>/<username>/rapids_databricks:latest
```

## Configure and create GPU-enabled cluster

Next we can create a compute cluster on Databricks and use our RAPIDS-powered container image (a scripted alternative using the Databricks REST API is sketched after the steps below).

1. Open the [Databricks control panel](https://accounts.cloud.databricks.com).
2. Navigate to Compute > Create compute.
3. Name your cluster.
4. Select `Multi node` or `Single node` depending on the type of cluster you want to launch.
5. Select a Standard Databricks runtime.
   - **Note** Databricks ML Runtime does not support Databricks Container Services.
   - You may also need to uncheck "Use Photon Acceleration".
6. Under **Advanced Options**, in the **Docker** tab select **Use your own Docker container**.
   - In the Docker Image URL field, enter the image that you created above.
   - Select the authentication type.
7. Also under **Advanced Options**, in the **Spark** tab add the following configuration line.
   - `spark.databricks.driverNfs.enabled false`
8. Scroll back up to **Performance** and select a GPU-enabled node type.
   - The selected GPU must be Pascal generation or greater (e.g. `g4dn.xlarge`).
   - You will need to have checked **Use your own Docker container** in the previous step in order for GPU nodes to be available.
9. Create and launch your cluster.
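
The steps above use the Databricks web UI. If you would rather automate this step, the same settings map onto the Databricks Clusters REST API (`POST /api/2.0/clusters/create`). The sketch below is illustrative only: the workspace URL, access token, runtime version string, node type and image URL are placeholders you would replace with values for your own workspace and cloud provider.

```python
# Illustrative sketch: create a RAPIDS-enabled cluster via the Databricks Clusters REST API.
# Every value in angle brackets is a placeholder, not a value taken from this guide.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

cluster_spec = {
    "cluster_name": "rapids-gpu-cluster",
    "spark_version": "<standard-runtime-version>",  # a Standard (non-ML) Databricks runtime
    "node_type_id": "g4dn.xlarge",                  # GPU node type, Pascal generation or greater
    "num_workers": 2,
    "spark_conf": {"spark.databricks.driverNfs.enabled": "false"},
    "docker_image": {
        "url": "<registry>/<username>/rapids_databricks:latest",
        # "basic_auth": {"username": "<user>", "password": "<token>"},  # if your registry needs auth
    },
}

response = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
response.raise_for_status()
print("Created cluster:", response.json()["cluster_id"])
```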

## Test RAPIDS

For more details on integrating Databricks Jobs with MLflow and RAPIDS, check out this [blog post](https://medium.com/rapids-ai/managing-and-deploying-high-performance-machine-learning-models-on-gpus-with-rapids-and-mlflow-753b6fcaf75a).
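
For a quick smoke test of the GPU setup itself, you can attach a notebook to the cluster and run a small cuDF snippet. This is a minimal sketch and assumes the notebook's Python environment can import the RAPIDS libraries installed in the custom container (for example via the `rapids` conda environment created in the Dockerfile above).

```python
# Minimal smoke test: build a small GPU DataFrame and run a groupby with cuDF.
# Assumes the RAPIDS packages from the custom container are importable in this notebook.
import cudf

gdf = cudf.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
print(gdf.groupby("key")["value"].sum())
```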
source/platforms/index.md (10 additions & 0 deletions)

````{grid-item-card}
:link: databricks
:link-type: doc
Databricks
^^^
Run RAPIDS on Databricks.

{bdg}`multi-node`
````
