
addressed spelling errors in comments
michaelkamprath committed Sep 9, 2019
1 parent 1d2966d commit 0179b01
Showing 2 changed files with 3 additions and 3 deletions.
spark-on-docker-swarm/README.md (4 changes: 2 additions & 2 deletions)
@@ -1,12 +1,12 @@
# Deploy Stand Alone Spark Cluster on Docker Swarm

-This project brings up a simple Apache Spark stand alone cluster in a Docker swarm. It will also launch and make available a Jupyter PySpark notebook that is connected to the Spark cluster. The cluster has [`matplotlib`](https://matplotlib.org) and [`pandas`](https://pandas.pydata.org) preinstalled for you PySpark on Jupyter joys.
+This project brings up a simple Apache Spark stand alone cluster in a Docker swarm. It will also launch and make available a Jupyter PySpark notebook that is connected to the Spark cluster. The cluster has [`matplotlib`](https://matplotlib.org) and [`pandas`](https://pandas.pydata.org) preinstalled for your PySpark on Jupyter joys.

## Usage
First, edit the following items as needed for your swarm:

1. `configured-sparknode -> spark-conf -> spark-env.sh`: adjust the environment variables as appropriate for your cluster's nodes, most notably `SPARK_WORKER_MEMORY` and `SPARK_WORKER_CORES`. Leave 1-2 cores and at least 10% of RAM for other processes.
-2. `configured-sparknode -> spark-conf -> spark-env.sh`: Adjust the memory and core settings for the executors and driver. Each executor should have about 5 cores (if possible), and should be a whole divisor into `SPARK_WORKER_CORES`. Spark will launch as many executors as `SPARK_WORKER_CORES` divided by `spark.executor.cores`. reserver about 7-8% of `SPARK_WORKER_MEMORY` for overhead when setting `spark.executor.memory`.
+2. `configured-sparknode -> spark-conf -> spark-env.sh`: Adjust the memory and core settings for the executors and driver. Each executor should have about 5 cores (if possible), and should be a whole divisor into `SPARK_WORKER_CORES`. Spark will launch as many executors as `SPARK_WORKER_CORES` divided by `spark.executor.cores`. Reserve about 7-8% of `SPARK_WORKER_MEMORY` for overhead when setting `spark.executor.memory`.
3. `build-images.sh`: Adjust the IP address for your local Docker registry. You can use a domain name if all nodes in your swarm can resolve it. This is needed as it allows all nodes in the swarm to pull the locally built Docker images.
4. `spark-deploy.yml`: Adjust all image names for the updated local Docker registry address you used in the prior step. Also, adjust the resource limits for each of the services. Setting a `cpus` limit here that is smaller than the number of cores on your node has the effect of giving your process a fraction of each core's capacity. You might consider doing this if your swarm hosts other services or does not handle long term 100% CPU load well (e.g., overheats). Also adjust the `replicas` count for the `spark-worker` service to be equal to the number of nodes in your swarm (or less).
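
A minimal sizing sketch for items 1-2 above, assuming a hypothetical 16-core, 64 GB worker node; the node size and the resulting numbers are illustrative assumptions, not values from this repository:

```sh
# Hypothetical spark-env.sh sizing for a 16-core, 64 GB node.
# Leave 1 core and a bit over 10% of RAM for other processes:
SPARK_WORKER_CORES=15
SPARK_WORKER_MEMORY=56g

# With spark.executor.cores=5 (a whole divisor of 15), Spark launches
# 15 / 5 = 3 executors per worker. Each executor's share is 56g / 3 ~ 18g;
# reserving about 7-8% for overhead gives spark.executor.memory of roughly 17g.
```

Changing the node size means redoing this arithmetic, since the executor values follow directly from the worker settings.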

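For items 3-4, the general flow is to make the images pullable from every node via a local registry, then deploy the stack. A rough sketch, assuming a registry at the hypothetical address 10.1.1.10:5000 and an image named configured-sparknode; the actual build-images.sh and image names in this repository may differ:

```sh
# Run a registry service on the swarm so every node can pull the images.
docker service create --name registry --publish published=5000,target=5000 registry:2

# Build, tag, and push an image to that registry (name and path are illustrative).
docker build -t 10.1.1.10:5000/configured-sparknode ./configured-sparknode
docker push 10.1.1.10:5000/configured-sparknode

# Deploy the stack once spark-deploy.yml points at the registry-hosted images.
docker stack deploy -c spark-deploy.yml spark
```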
spark-on-docker-swarm/configured-sparknode/spark-conf/spark-env.sh (2 changes: 1 addition & 1 deletion)
@@ -14,5 +14,5 @@ SPARK_WORKER_WEBUI_PORT=8081
# which python the spark cluster should use for pyspark
PYSPARK_PYTHON=python3

-# hash seed so all node has numbers consistently
+# hash seed so all node hash numbers consistently
PYTHONHASHSEED=8675309
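
Pinning `PYTHONHASHSEED` matters because Python 3 randomizes string hashing per process by default, while PySpark's hash partitioning expects every executor to hash the same key to the same value; mismatched seeds across nodes can make shuffles fail or group keys inconsistently. A quick sanity check, using the seed value from this file:

```sh
# Run on two different nodes; both should print the same number.
PYTHONHASHSEED=8675309 python3 -c 'print(hash("spark"))'
```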
