This repository contains a Docker Compose setup that generates an environment to locally test jobflow, jobflow-remote and atomate2.
This Docker Compose setup provides three integrated containers:
- JupyterLab with atomate2 and jobflow-remote preconfigured for usage.
- SLURM with SSH access to simulate an HPC cluster.
- MongoDB database for workflow data storage.
.
├── docker-compose.yml
├── .env
├── slurm/
│ ├── Dockerfile
│ └── slurm_startup.sh
├── jupyter/
│ └── Dockerfile
├── config/
│ └── jfremote_template.yaml
├── notebooks/
│ └── (jupyter notebooks for the hands on sessions)
└── shared/
└── (folder mounted inside the jupyter container)
Before building, the project name can be configured. This can be done by opening the .env file with a text editor and setting the PROJECTNAME value. This is not mandatory, and a default value (test_project) is already set.
In addition, some of the usernames and ports can be configured in the .env file, in case the default values clash with some of your local services.
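For reference, a minimal .env could look like the sketch below. The exact set of variables shipped in the repository is an assumption here; the values shown are simply the defaults mentioned elsewhere in this README.
```
# Illustrative sketch only: variable names assumed from the defaults described in this README
PROJECTNAME=test_project
SLURM_USERNAME=atomate
JUPYTER_PORT=8888
SSH_PORT=2222
MONGODB_PORT=27018
WEB_APP_PORT=5001
```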
To launch the environment, clone the repository, navigate to the project directory and run:
docker-compose up -d
Once the containers are running, verify their status with:
docker ps
If everything is set up correctly, you should see the containers listed (e.g., jobflow_tutorial-jupyter-1, jobflow_tutorial-slurm-1, jobflow_tutorial-mongodb-1).
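Optionally, to restrict the listing to the services defined in this setup, the Compose CLI can be used from the project directory:
```
docker-compose ps
```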
The JupyterLab container is the main entry point for the execution of workflows. Its base Python environment includes atomate2, jobflow, jobflow-remote and all the related packages required to execute the workflows. Jobflow-remote is already fully configured.
Connect to http://localhost:8888 (or your custom JUPYTER_PORT) and use atomate as the password.
The container can also be accessed through a shell. Run
docker container list
to get the name of the container (similar to jobflow_tutorial-jupyter-1) and launch a bash session inside it with
docker exec -it <container-name> /bin/bash
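Inside the container, one way to confirm that jobflow-remote is set up correctly is to run its built-in checks. The subcommands below assume a recent jobflow-remote version, and the output will vary with your setup:
```
# Check the connections to the worker and to the queue/output stores
jf project check
# List the jobs currently tracked by jobflow-remote (empty on a fresh setup)
jf job list
```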
The SLURM container is a local worker with SLURM to mimic the execution on an HPC cluster. It includes the same Python packages as the jupyter container and can be accessed through ssh with
ssh -p 2222 atomate@localhost
(or your custom SSH_PORT and SLURM_USERNAME) from the local machine, or with
ssh atomate@slurm
from the jupyter container. The password is the same as the username.
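Once logged in, the standard SLURM commands can be used to inspect the mock cluster; the exact partitions and node names depend on the SLURM configuration inside the container:
```
# Show the available partitions and nodes
sinfo
# Show the jobs currently queued or running
squeue
```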
The MongoDB database is accessible on localhost:27018 (or your custom MONGODB_PORT) from the local machine and on mongodb:27017 from the jupyter container. There is no password protection on the database.
It may be instructive to explore the content of the database with a GUI like
MongoDB Compass.
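For example, assuming the default MONGODB_PORT, the connection string mongodb://localhost:27018 can be pasted into MongoDB Compass, or used with the mongosh shell if it is installed on your machine:
```
mongosh "mongodb://localhost:27018"
```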
To ensure data persistence across container restarts, the following volumes are mounted:
- JupyterLab (jupyter_data): Stores data in /home/jovyan/work. Useful files are copied in this folder at container startup.
- SLURM (slurm_data): Holds job execution data in /home/${SLURM_USERNAME:-atomate}/jobs.
- MongoDB (mongodb_data): Persists database records in /data/db.
These volumes allow workflows and job results to be retained even if the containers are stopped or rebuilt.
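To see where this data lives, the named volumes can be listed and inspected from the host. The volume name below assumes the default Compose project prefix (jobflow_tutorial) and may differ on your setup:
```
# List the named volumes created by this setup
docker volume ls
# Show details (including the host mountpoint) of one of them
docker volume inspect jobflow_tutorial_jupyter_data
```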
The jobflow-remote GUI can be started in the jupyter container by running
jf gui
and can be accessed from the local machine by connecting to http://localhost:5001 (or your custom WEB_APP_PORT).
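Alternatively, the GUI can be launched in the background directly from the local machine, without opening an interactive shell first; the container name here is only an example and may differ on your setup:
```
docker exec -d jobflow_tutorial-jupyter-1 jf gui
```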
The local notebooks folder is copied into the ~/work folder of the jupyter container.
Warning
Switching the containers off and on will not overwrite the content of the ~/work/notebooks folder, but deleting the associated jupyter_data volume will delete the files there.
The ~/work/develop folder in the jupyter container is added to the PYTHONPATH in the jobflow-remote configuration, so that it can be used to store newly developed workflows that can be recognised by the local_shell worker.
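For instance, a package created under that folder becomes importable by jobs executed there; the package name below is purely illustrative:
```
# Inside the jupyter container: create a package for newly developed workflows
mkdir -p ~/work/develop/my_flows
touch ~/work/develop/my_flows/__init__.py
# Modules added here (e.g. my_flows.makers) can then be imported by jobs run on the local_shell worker
```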
To stop the containers:
docker-compose down
To remove all data volumes:
docker-compose down -v
Caution
This will delete all the notebooks, job files and DB content in the containers.