adjusted files names
michaelkamprath committed Sep 21, 2019
1 parent 362566f commit 37e19e4
Showing 8 changed files with 7 additions and 2 deletions.
@@ -7,10 +7,15 @@ First, edit the following items as needed for your swarm:

1. `configured-sparknode -> spark-conf -> spark-env.sh`: adjust the environment variables as appropriate for your cluster's nodes, most notably `SPARK_WORKER_MEMORY` and `SPARK_WORKER_CORES`. Leave 1-2 cores and at least 10% of RAM for other processes.
2. `configured-sparknode -> spark-conf -> spark-env.sh`: Adjust the memory and core settings for the executors and driver. Each executor should get about 5 cores (if possible), and `spark.executor.cores` should divide `SPARK_WORKER_CORES` evenly; Spark will launch as many executors per worker as `SPARK_WORKER_CORES` divided by `spark.executor.cores`. Reserve about 7-8% of `SPARK_WORKER_MEMORY` for overhead when setting `spark.executor.memory` (see the sizing sketch after this list).
3. `build-images.sh`: Adjust the IP address for your local Docker registry. You can use a domain name if all nodes in your swarm can resolve it. This is needed as it allows all nodes in the swarm to pull the locally built Docker images.
3. `build-images.sh`: Adjust the IP address of your local Docker registry; the registry must be reachable from every node in your cluster. A domain name also works if all nodes in the swarm can resolve it. This is needed so that every node in the swarm can pull the locally built Docker images (a generic tag-and-push sketch follows this list).
4. `spark-deploy.yml`: Adjust all image names for the updated local Docker registry address you used in the prior step. Also, adjust the resource limits for each of the services. Setting a `cpus` limit here that is smaller than the number of cores on your node has the effect of giving your process a fraction of each core's capacity. You might consider doing this if your swarm hosts other services or does not handle long-term 100% CPU load well (e.g., it overheats). Also adjust the `replicas` count for the `spark-worker` service to be equal to the number of nodes in your swarm (or fewer).
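
As a concrete illustration of steps 1 and 2 (a sketch only; the 12-core / 32 GB worker node below is an assumed example, not a value from this project), the sizing works out like this:
```
# spark-env.sh (sketch) -- assumes a worker node with 12 cores and 32 GB RAM
# leave ~2 cores and at least 10% of the RAM for the OS and other processes
SPARK_WORKER_CORES=10
SPARK_WORKER_MEMORY=28g

# Executor sizing (wherever this project sets the spark.executor.* values):
#   spark.executor.cores  = 5                  # divides SPARK_WORKER_CORES evenly
#   executors per worker  = 10 / 5 = 2
#   spark.executor.memory ~ (28g * ~0.92) / 2  # ~13g after the 7-8% overhead reserve
```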

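To illustrate step 3 (a generic sketch of pushing to a local registry, not necessarily what `build-images.sh` actually does; the address `10.1.1.10:5000` and the image name are placeholders), tagging an image with the registry address is what lets every node pull it:
```
REGISTRY=10.1.1.10:5000   # placeholder -- substitute your registry's IP or domain name

# tag a locally built image with the registry address and push it,
# so that every node in the swarm can pull it when the stack is deployed
docker tag configured-sparknode ${REGISTRY}/configured-sparknode
docker push ${REGISTRY}/configured-sparknode
```
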
This setup depends on having a GlusterFS volume mounted at `/mnt/gfs` on all nodes and a directory `/mnt/gfs/jupyter-notbooks` existing on it. Then, to start up the Spark cluster in your Docker swarm, `cd` into this project's directory and:
This setup depends on having a GlusterFS volume mounted at `/mnt/gfs` on all nodes and on the following directories existing on it (a sketch for creating them follows this list):

* `/mnt/gfs/jupyter-notbooks`
* `/mnt/gfs/data`
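
A minimal sketch for creating them, assuming the GlusterFS volume is already mounted at `/mnt/gfs` on the node where you run it:
```
# run once on any node that has the GlusterFS volume mounted;
# GlusterFS makes the directories visible on every other node
mkdir -p /mnt/gfs/jupyter-notbooks /mnt/gfs/data
```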

Then, to start up the Spark cluster in your Docker swarm, `cd` into this project's directory and:
```
./build-images.sh
docker stack deploy -c deploy-spark-swarm.yml spark
```
File renamed without changes.
