enabled better QFS access from Jupyter terminal
michaelkamprath committed Oct 6, 2019
Parent: f3902ed · Commit: c66d101
Showing 5 changed files with 5 additions and 4 deletions.
2 changes: 1 addition & 1 deletion spark-qfs-swarm/README.md
```diff
@@ -1,5 +1,5 @@
 # Deploy Standalone Spark Cluster with QFS on Docker Swarm
-This project deploys a standalone Spark Cluster onto a Docker Swarm. Includes the [Quantcast File System](https://github.com/quantcast/qfs) (QFS) as the clusters distributed file system. Why QFS? Why not. this configuration will also launch and make available a Jupyter PySpark notebook that is connected to the Spark cluster. The cluster has [`matplotlib`](https://matplotlib.org) and [`pandas`](https://pandas.pydata.org) preinstalled for your PySpark on Jupyter joys.
+This project deploys a standalone Spark Cluster onto a Docker Swarm. Includes the [Quantcast File System](https://github.com/quantcast/qfs) (QFS) as the clusters distributed file system. Why QFS? Why not. This configuration will also launch and make available a Jupyter PySpark notebook that is connected to the Spark cluster. The cluster has [`matplotlib`](https://matplotlib.org) and [`pandas`](https://pandas.pydata.org) preinstalled for your PySpark on Jupyter joys.
 
 ## Usage
 First, edit the following items as needed for your swarm:
```
2 changes: 1 addition & 1 deletion spark-qfs-swarm/jupyter-server/Dockerfile
```diff
@@ -1,4 +1,4 @@
-FROM worker-node:latest
+FROM qfs-master:latest
 
 RUN apt-get install -y g++
 RUN pip3 install jupyter
```
2 changes: 1 addition & 1 deletion spark-qfs-swarm/jupyter-server/start-jupyter.sh
```diff
@@ -1,3 +1,3 @@
 #!/bin/bash
 
-XDG_RUNTIME_DIR=/home/jupyter/runtime PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7777 --notebook-dir=/home/jupyter/notebooks --ip=* --no-browser --allow-root --NotebookApp.token='' --NotebookApp.password=''" $SPARK_HOME/bin/pyspark --master spark://spark-master:7077
+SHELL=/bin/bash XDG_RUNTIME_DIR=/home/jupyter/runtime PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=7777 --notebook-dir=/home/jupyter/notebooks --ip=* --no-browser --allow-root --NotebookApp.token='' --NotebookApp.password=''" $SPARK_HOME/bin/pyspark --master spark://spark-master:7077
```
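The only change here is the `SHELL=/bin/bash` prefix. A plausible reading of why it matters (an assumption, not stated in the commit): Jupyter's built-in terminal spawns the program named by the `SHELL` environment variable, and bash reads `~/.bashrc`, which is where the QFS aliases and `PATH` additions are defined. A minimal sketch of that effect:

```shell
#!/bin/bash
# Sketch (assumption): Jupyter terminals spawn whatever $SHELL names, so
# exporting SHELL=/bin/bash before launching pyspark gives notebook terminals
# a bash shell that reads ~/.bashrc (where the QFS aliases live).
SHELL=/bin/bash
echo "Jupyter terminals would spawn: $SHELL"
```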
3 changes: 2 additions & 1 deletion spark-qfs-swarm/qfs-master/Dockerfile
```diff
@@ -19,7 +19,8 @@ RUN apt-get update \
 COPY ./qfs-conf/* $QFS_HOME/conf/
 
 # create some useful bash aliases for when at bash shell prompt of this image
-RUN echo 'alias qfs="qfs -fs qfs://qfs-master:20000"' >> ~/.bashrc \
+RUN echo 'export PATH=$PATH:$QFS_HOME/bin/:$QFS_HOME/bin/tools/' >> ~/.bashrc \
+    && echo 'alias qfs="qfs -fs qfs://qfs-master:20000"' >> ~/.bashrc \
     && echo 'alias cptoqfs="cptoqfs -s qfs-master -p 20000"' >> ~/.bashrc \
     && echo 'alias cpfromqfs="cpfromqfs -s qfs-master -p 20000"' >> ~/.bashrc \
     && echo 'alias qfsshell="qfsshell -s qfs-master -p 20000"' >> ~/.bashrc
```
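With these aliases baked into `~/.bashrc`, the QFS client tools can be invoked from the Jupyter terminal without repeating the metaserver host and port each time. A small sketch of how the `qfs` alias expands (no running cluster required; the alias below just mirrors the one the Dockerfile writes):

```shell
#!/bin/bash
# Recreate the image's alias locally and print its stored definition;
# `alias NAME` echoes the expansion without contacting any QFS metaserver.
alias qfs='qfs -fs qfs://qfs-master:20000'
alias qfs
# At the container's prompt, `qfs -ls /` therefore runs:
#   qfs -fs qfs://qfs-master:20000 -ls /
```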
Empty file.
