Merge pull request #1 from DIYBigData/spark-qfs-swarm

Spark + QFS Docker Swarm Stack

Showing 18 changed files with 422 additions and 0 deletions.
@@ -0,0 +1,42 @@
# Deploy Standalone Spark Cluster with QFS on Docker Swarm
This project deploys a standalone Spark cluster onto a Docker Swarm and includes the [Quantcast File System](https://github.com/quantcast/qfs) (QFS) as the cluster's distributed file system. Why QFS? Why not. This configuration also launches a Jupyter PySpark notebook server connected to the Spark cluster, with [`matplotlib`](https://matplotlib.org) and [`pandas`](https://pandas.pydata.org) preinstalled for your PySpark-on-Jupyter joys.

## Usage
First, edit the following items as needed for your swarm:

1. `worker-node -> spark-conf -> spark-env.sh`: Adjust the environment variables as appropriate for your cluster's nodes, most notably `SPARK_WORKER_MEMORY` and `SPARK_WORKER_CORES`. Leave 1-2 cores and at least 10% of RAM for other processes.
2. `worker-node -> spark-conf -> spark-defaults.conf`: Adjust the memory and core settings for the executors and driver. Each executor should have about 5 cores (if possible), and `spark.executor.cores` should be a whole divisor of `SPARK_WORKER_CORES`. Spark will launch as many executors as `SPARK_WORKER_CORES` divided by `spark.executor.cores`. Reserve about 7-8% of `SPARK_WORKER_MEMORY` for overhead when setting `spark.executor.memory`.
3. `build-images.sh`: Adjust the IP address to that of a local Docker registry that all nodes in your cluster can access. You can use a domain name if all nodes in your swarm can resolve it. This is needed so that every node in the swarm can pull the locally built Docker images.
4. `deploy-spark-qfs-swarm.yml`: Adjust all image names for the local Docker registry address you used in the prior step. Also adjust the resource limits for each of the services. Setting a `cpus` limit here that is smaller than the number of cores on your node gives your process a fraction of each core's capacity; consider doing this if your swarm hosts other services or does not handle sustained 100% CPU load well (e.g., overheats). Finally, adjust the `replicas` count for the `spark-worker` service to be equal to (or less than) the number of nodes in your swarm.

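The sizing arithmetic in steps 1 and 2 can be sketched as follows. The node size here (16 cores, 64 GiB RAM) is an illustrative assumption, not a value from this project; plug in your own hardware numbers:

```python
# Sketch of the worker/executor sizing rules above, assuming a
# hypothetical 16-core, 64 GiB worker node.
node_cores = 16
node_ram_gib = 64

# Leave 1-2 cores and at least 10% of RAM for other processes.
spark_worker_cores = node_cores - 2                 # SPARK_WORKER_CORES
spark_worker_memory_gib = int(node_ram_gib * 0.9)   # SPARK_WORKER_MEMORY

# Pick executor cores as a whole divisor of SPARK_WORKER_CORES,
# as close to ~5 as the divisors allow (7 here, since 14 / 7 = 2).
executor_cores = 7                                  # spark.executor.cores
executors_per_node = spark_worker_cores // executor_cores

# Reserve ~7-8% of SPARK_WORKER_MEMORY for overhead, split the rest
# across the executors on the node.
executor_memory_gib = int(spark_worker_memory_gib * 0.92 / executors_per_node)

print(spark_worker_cores, executors_per_node, executor_memory_gib)
```

With these assumptions the node runs 2 executors; loosening the "whole divisor" rule wastes cores, since Spark only launches whole executors.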
This setup depends on having a GlusterFS volume mounted at `/mnt/gfs` on all nodes and the following directories existing:

* `/mnt/gfs/jupyter-notebooks` - used to persist the Jupyter notebooks
* `/mnt/data/qfs/logs` - where QFS will store its logs
* `/mnt/data/qfs/chunk` - where the chunk servers of QFS will store the data
* `/mnt/data/qfs/checkpoint` - where the QFS metaserver will store the filesystem checkpoints
* `/mnt/data/spark` - the local working directory for Spark

You can adjust these as you see fit, but be sure to update the mounts specified in `deploy-spark-qfs-swarm.yml`.

Then, to start up the Spark cluster in your Docker swarm, `cd` into this project's directory and:
```
./build-images.sh
docker stack deploy -c deploy-spark-qfs-swarm.yml spark
```

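Once deployed, you can verify that all services came up with standard Docker Swarm commands (the `spark` stack name matches the deploy command above):

```
docker stack services spark
docker stack ps spark
```
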
Point your development computer's browser at `http://swarm-public-ip:7777/` to load the Jupyter notebook.

### Working with QFS
To launch a Docker container that gives you command line access to QFS, use the following command:
```
docker run -it --network="spark_cluster_network" master:5000/qfs-master:latest /bin/bash
```
Note that you must attach to the network that the Docker Spark cluster services are using. From this command prompt, the following commands are preconfigured to connect to the QFS instance:

* `qfs` - enables most Linux-style file operations on the QFS instance
* `cptoqfs` - copies files from the local file system (in the Docker container) to the QFS instance
* `cpfromqfs` - copies files from the QFS instance to the local file system (in the Docker container)
* `qfsshell` - a useful shell-style interface to the QFS instance

You might consider adding a volume mount to the `docker run` command so that the Docker container can access data from your local file system.
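For example, a session inside that container might look like the following. The paths are purely illustrative, and the commands rely on the aliases baked into the `qfs-master` image; check each tool's `-h` output for the exact flags on your QFS build:

```
# list the root of the QFS instance
qfs -ls /

# create a directory and copy a local file into it (illustrative paths)
qfs -mkdir /user/spark/data
cptoqfs -d /tmp/mydata.csv -k /user/spark/data/mydata.csv

# browse the file system interactively
qfsshell
```
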
@@ -0,0 +1,21 @@
#!/bin/bash

set -e

# build images
docker build -t worker-node:latest ./worker-node
docker build -t qfs-master:latest ./qfs-master
docker build -t spark-master:latest ./spark-master
docker build -t jupyter-server:latest ./jupyter-server

# tag images with the local registry
docker tag worker-node:latest master:5000/worker-node:latest
docker tag qfs-master:latest master:5000/qfs-master:latest
docker tag spark-master:latest master:5000/spark-master:latest
docker tag jupyter-server:latest master:5000/jupyter-server:latest

# push the images to the local registry
docker push master:5000/worker-node:latest
docker push master:5000/qfs-master:latest
docker push master:5000/spark-master:latest
docker push master:5000/jupyter-server:latest
@@ -0,0 +1,104 @@
version: '3.4'
services:
  qfs-master:
    image: master:5000/qfs-master:latest
    hostname: qfs-master
    networks:
      - cluster_network
    ports:
      - 20000:20000
      - 30000:30000
      - 20050:20050
    volumes:
      - type: bind
        source: /mnt/data/qfs
        target: /data/qfs
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 2g
      placement:
        constraints:
          - node.role == manager
  spark-master:
    image: master:5000/spark-master:latest
    hostname: spark-master
    environment:
      - SPARK_PUBLIC_DNS=10.1.1.1
      - SPARK_LOG_DIR=/data/spark/logs
    networks:
      - cluster_network
    ports:
      - 6066:6066
      - 7077:7077
      - 8080:8080
    volumes:
      - type: bind
        source: /mnt/data/spark
        target: /data/spark
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 6g
  jupyter-server:
    image: master:5000/jupyter-server:latest
    hostname: jupyter-server
    environment:
      - SPARK_PUBLIC_DNS=10.1.1.1
      - SPARK_LOG_DIR=/data/spark/logs
    depends_on:
      - spark-master
      - qfs-master
      - worker-node
    networks:
      - cluster_network
    ports:
      - 7777:7777
      - 4040:4040
    volumes:
      - type: bind
        source: /mnt/gfs/jupyter-notebooks
        target: /home/jupyter/notebooks
      - type: bind
        source: /mnt/gfs/data
        target: /data
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 6g
  worker-node:
    image: master:5000/worker-node:latest
    hostname: worker
    environment:
      - SPARK_PUBLIC_DNS=10.1.1.1
      - SPARK_LOG_DIR=/data/spark/logs
    depends_on:
      - qfs-master
      - spark-master
    networks:
      - cluster_network
    ports:
      - 8081:8081
    volumes:
      - type: bind
        source: /mnt/data/qfs
        target: /data/qfs
      - type: bind
        source: /mnt/data/spark
        target: /data/spark
    deploy:
      mode: global
      resources:
        limits:
          cpus: "6.0"
          memory: 56g
networks:
  cluster_network:
    attachable: true
    ipam:
      driver: default
      config:
        - subnet: 10.20.30.0/24
@@ -0,0 +1,9 @@
FROM worker-node:latest

# the base image clears the apt lists, so update before installing
RUN apt-get update \
    && apt-get install -y g++ \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN pip3 install jupyter
RUN mkdir -p /home/jupyter/runtime

COPY start-jupyter.sh /

CMD ["/bin/bash", "/start-jupyter.sh"]
@@ -0,0 +1,3 @@
#!/bin/bash

XDG_RUNTIME_DIR=/home/jupyter/runtime PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7777 --notebook-dir=/home/jupyter/notebooks --ip=* --allow-root --NotebookApp.token='' --NotebookApp.password=''" $SPARK_HOME/bin/pyspark --master spark://spark-master:7077
@@ -0,0 +1,28 @@
FROM worker-node:latest

#
# Expected volumes:
#   /data/qfs - this is where QFS will store its data
#
# Instance should run on the swarm's master node so as to persist configuration
#

# need python 2 for the QFS web UI server
RUN apt-get update \
    && apt-get install -y python2.7 less wget \
    && ln -s /usr/bin/python2.7 /usr/bin/python2 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# set configuration
COPY ./qfs-conf/* $QFS_HOME/conf/

# create some useful bash aliases for when at the bash shell prompt of this image
RUN echo 'alias qfs="qfs -fs qfs://qfs-master:20000"' >> ~/.bashrc \
    && echo 'alias cptoqfs="cptoqfs -s qfs-master -p 20000"' >> ~/.bashrc \
    && echo 'alias cpfromqfs="cpfromqfs -s qfs-master -p 20000"' >> ~/.bashrc \
    && echo 'alias qfsshell="qfsshell -s qfs-master -p 20000"' >> ~/.bashrc

COPY start-qfs-master.sh /
CMD ["/bin/bash", "/start-qfs-master.sh"]
@@ -0,0 +1,12 @@
metaServer.clientPort = 20000
metaServer.chunkServerPort = 30000
metaServer.createEmptyFs = 1
metaServer.logDir = /data/qfs/logs
metaServer.cpDir = /data/qfs/checkpoint
metaServer.recoveryInterval = 30
metaServer.clusterKey = qfs-personal-compute-cluster
metaServer.msgLogWriter.logLevel = INFO
chunkServer.msgLogWriter.logLevel = NOTICE
metaServer.rootDirMode = 0777
metaServer.rootDirGroup = 1000
metaServer.rootDirUser = 1000
Empty file.
@@ -0,0 +1,7 @@
[webserver]
webServer.metaserverHost = qfs-master
webServer.metaserverPort = 20000
webServer.port = 20050
webServer.docRoot = $QFS_HOME/webui/files/
webServer.host = 0.0.0.0
webserver.allmachinesfn = /dev/null
@@ -0,0 +1,9 @@
#!/bin/bash

$QFS_HOME/bin/metaserver $QFS_HOME/conf/Metaserver.prp &> $QFS_LOGS_DIR/metaserver.log &

python2 $QFS_HOME/webui/qfsstatus.py $QFS_HOME/conf/webUI.cfg &> $QFS_LOGS_DIR/webui.log &

# now do nothing and do not exit
while true; do sleep 3600; done
@@ -0,0 +1,9 @@
FROM worker-node:latest

#
# Expected volumes:
#   /data/spark - this is the spark working directory
#

COPY start-spark-master.sh /
CMD ["/bin/bash", "/start-spark-master.sh"]
@@ -0,0 +1,7 @@
#!/bin/bash

# start Spark master
$SPARK_HOME/sbin/start-master.sh

# now do nothing and do not exit
while true; do sleep 3600; done
@@ -0,0 +1,88 @@
FROM debian:stretch
MAINTAINER Michael Kamprath "https://github.com/michaelkamprath"
#
# Base image for Apache Spark standalone cluster with QFS
#
# Inspired by https://hub.docker.com/r/gettyimages/spark/dockerfile
#
#
# Expected volumes:
#   /data/qfs - this is where QFS will store its data
#   /data/spark - this is the spark working directory
#
# Expected service names:
#   qfs-master - the service where the QFS metaserver runs
#   spark-master - the service where the spark master runs
#

RUN apt-get update \
    && apt-get install -y locales \
    && dpkg-reconfigure -f noninteractive locales \
    && locale-gen C.UTF-8 \
    && /usr/sbin/update-locale LANG=C.UTF-8 \
    && echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen \
    && locale-gen \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

RUN apt-get update \
    && apt-get install -y curl unzip \
        python3 python3-setuptools \
        libboost-regex-dev \
    && ln -s /usr/bin/python3 /usr/bin/python \
    && easy_install3 pip py4j \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

ENV PYTHONIOENCODING UTF-8
ENV PIP_DISABLE_PIP_VERSION_CHECK 1

# JAVA
RUN apt-get update \
    && apt-get install -y openjdk-8-jre \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# QFS
ENV QFS_VERSION 2.1.2
ENV HADOOP_VERSION 2.7.2
ENV QFS_PACKAGE qfs-debian-9-${QFS_VERSION}-x86_64
ENV QFS_HOME /usr/qfs-${QFS_VERSION}
ENV QFS_LOGS_DIR /data/qfs/logs
ENV LD_LIBRARY_PATH ${QFS_HOME}/lib
RUN curl -sL --retry 3 \
    "https://s3.amazonaws.com/quantcast-qfs/qfs-debian-9-${QFS_VERSION}-x86_64.tgz" \
    | gunzip \
    | tar x -C /usr/ \
    && mv /usr/$QFS_PACKAGE $QFS_HOME \
    && chown -R root:root $QFS_HOME
COPY ./qfs-conf/* $QFS_HOME/conf/
ENV PATH $PATH:${QFS_HOME}/bin:${QFS_HOME}/bin/tools

# SPARK
ENV SPARK_VERSION 2.4.4
ENV SPARK_PACKAGE spark-${SPARK_VERSION}-bin-hadoop2.7
ENV SPARK_HOME /usr/spark-${SPARK_VERSION}
ENV SPARK_DIST_CLASSPATH="$QFS_HOME/lib/hadoop-$HADOOP_VERSION-qfs-$QFS_VERSION.jar:$QFS_HOME/lib/qfs-access-$QFS_VERSION.jar"
ENV HADOOP_CONF_DIR=${SPARK_HOME}/conf/
ENV PATH $PATH:${SPARK_HOME}/bin
RUN curl -sL --retry 3 \
    "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_PACKAGE}.tgz" \
    | gunzip \
    | tar x -C /usr/ \
    && mv /usr/$SPARK_PACKAGE $SPARK_HOME \
    && chown -R root:root $SPARK_HOME
COPY ./spark-conf/* $SPARK_HOME/conf/

# add python libraries useful in PySpark
RUN python3 -mpip install matplotlib \
    && pip3 install pandas

# set up command
WORKDIR /root
COPY start-worker-node.sh /
CMD ["/bin/bash", "/start-worker-node.sh"]
@@ -0,0 +1,10 @@
chunkServer.metaServer.hostname = qfs-master
chunkServer.metaServer.port = 30000
chunkServer.clientPort = 22000
chunkServer.chunkDir = /data/qfs/chunk
chunkServer.clusterKey = qfs-personal-compute-cluster
chunkServer.stdout = /dev/null
chunkServer.stderr = /dev/null
chunkServer.ioBufferPool.partitionBufferCount = 65536
chunkServer.msgLogWriter.logLevel = INFO
chunkServer.diskQueue.threadCount = 4
@@ -0,0 +1,23 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Settings for QFS -->

<configuration>
  <property>
    <name>fs.qfs.impl</name>
    <value>com.quantcast.qfs.hadoop.QuantcastFileSystem</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>qfs://qfs-master:20000</value>
  </property>
  <property>
    <name>fs.qfs.metaServerHost</name>
    <value>qfs-master</value>
  </property>
  <property>
    <name>fs.qfs.metaServerPort</name>
    <value>20000</value>
  </property>
</configuration>