mmli-backend

Unified FastAPI-based backend for ChemScraper (CLEAN job-manager and Molli are future scope)

⭐️ Recommended local development (Docker)

(1/4) Create a .env

Create a .env from the .env.tpl template in this repo. The defaults are fine for testing without modification; change the passwords for production use.

cp .env.tpl .env

(2/4) Set up a Kubernetes cluster; here we use Minikube

  1. Install Minikube.
  2. Start Minikube with an external network (defined in this repo's docker-compose.yml):
minikube start --network=mmli-net --driver=docker --memory=24384
  3. Ensure it's running with minikube kubectl -- cluster-info:
Kubernetes control plane is running at https://192.168.49.2:8443
CoreDNS is running at https://192.168.49.2:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
  4. Apply the necessary configurations (from the repo root):
# Create `mmli` namespace
minikube kubectl -- create ns mmli

# Apply secret and config
minikube kubectl -- apply -f app/cfg/local.secret.yaml -n mmli
minikube kubectl -- apply -f app/cfg/local.config.yaml -n mmli

# Create PVC needed by molli jobs
minikube kubectl -- apply -f chart/weights.pvc.yaml

(3/4) Run Docker Compose build

Edit the docker-compose.yml to expose your kube config. In our case, Minikube requires three files: ca.crt, client.crt, and client.key.

Copy-paste this into the docker-compose.yml, under the rest container:

⚠️ Note: I had problems with ${HOME} and had to provide full absolute paths manually; e.g. replace ${HOME} with /home/username. ⚠️

rest:
    container_name: mmli-backend
    
    ...

    volumes:
        - ./app:/code/app
        - ./migrations:/code/migrations
        - ${HOME}/.kube/config:/opt/kubeconfig
        - ${HOME}/.minikube/ca.crt:${HOME}/.minikube/ca.crt
        - ${HOME}/.minikube/profiles/minikube/client.crt:${HOME}/.minikube/profiles/minikube/client.crt
        - ${HOME}/.minikube/profiles/minikube/client.key:${HOME}/.minikube/profiles/minikube/client.key
        

Finally start the compose. Monitor for errors from mmli-backend in the logs.

docker compose up --build # optionally add -d for detached

This will run MinIO + PostgreSQL + the Python app mmli-backend.

Test that the service works: navigate to localhost:8080/docs and you should see the FastAPI Swagger docs.

(4/4) Initialize the database

Initialize the Postgres database; this creates the SQL tables.

docker compose exec -w /code rest alembic upgrade head

# You should see logs like:
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> d775ee615d7b, init
INFO  [alembic.runtime.migration] Running upgrade d775ee615d7b -> 88355d0f323b, added moleculecacheentry for caching molecules, modified job schema, added flaggedmolecule for saving flagged molecules
INFO  [alembic.runtime.migration] Running upgrade 88355d0f323b -> e8569ab45dd1, removed moleculecacheentry
INFO  [alembic.runtime.migration] Running upgrade e8569ab45dd1 -> 30b240622d34, add chemical identifier model and table

Finally, verify the tables are created:

  1. Exec into the database container, running the psql command:
docker exec -it mmli-backend-postgresql psql -U postgres mmli
  2. Run the \d command to list tables.
psql (15.8 (Debian 15.8-1.pgdg120+1))
Type "help" for help.

mmli=# \d 
                     List of relations
 Schema |            Name            |   Type   |  Owner
--------+----------------------------+----------+----------
 public | alembic_version            | table    | postgres
 public | chemical_identifier        | table    | postgres
 public | chemical_identifier_id_seq | sequence | postgres
 public | flaggedmolecule            | table    | postgres
 public | job                        | table    | postgres
(5 rows)
  3. Check the job table with \d job:
mmli=# \d job
                        Table "public.job"
    Column    |       Type        | Collation | Nullable | Default
--------------+-------------------+-----------+----------+---------
 job_info     | character varying |           |          |
 email        | character varying |           |          |
 job_id       | character varying |           | not null |
 run_id       | character varying |           |          |
 phase        | character varying |           | not null |
 type         | character varying |           | not null |
 image        | character varying |           |          |
 command      | character varying |           |          |
 time_created | integer           |           | not null |
 time_start   | integer           |           | not null |
 time_end     | integer           |           | not null |
 deleted      | integer           |           | not null |
 user_agent   | character varying |           | not null |
Indexes:
    "job_id_pk" PRIMARY KEY, btree (job_id)
Referenced by:
    TABLE "flaggedmolecule" CONSTRAINT "flaggedmolecule_job_id_fkey" FOREIGN KEY (job_id) REFERENCES job(job_id)
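
The time_created, time_start, and time_end columns are plain integers. Assuming they hold Unix epoch seconds (an assumption; the schema itself does not say), a row can be made human-readable with a few lines of Python. The row values below are hypothetical:

```python
from datetime import datetime, timezone

# Hypothetical row from the job table; time_created/time_start/time_end are
# plain integers, assumed here to be Unix epoch seconds.
row = {
    "job_id": "123",
    "phase": "completed",
    "time_created": 1700000000,
    "time_start": 1700000005,
    "time_end": 1700000300,
}

# Convert the assumed epoch seconds to a UTC timestamp and a duration.
created = datetime.fromtimestamp(row["time_created"], tz=timezone.utc)
duration_s = row["time_end"] - row["time_start"]
print(created.isoformat(), duration_s)
```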

🎉 All done! 🎉 Check the Swagger docs at localhost:8080/docs for the available endpoints.

How to monitor running jobs

First, submit a job using curl, Swagger, or Postman, e.g.:

curl -X POST https://mmli.kastan.ai/aceretro/jobs \
-H "Content-Type: application/json" \
-d '{
  "job_id": "123",
  "run_id": "123",
  "email": "[email protected]",
  "job_info": "{\"nuc\": \"hi\", \"CORES_FILE_NAME\": \"hi\", \"SUBS_FILE_NAME\": \"hi\"}"
}'
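
Note that job_info is itself a JSON document serialized into a string inside the outer JSON payload, so it must be encoded twice. A minimal Python sketch of building such a payload, using the same field names as the curl example above:

```python
import json

# job_info is a nested JSON document embedded as a string, so it is
# serialized separately before building the outer request body.
job_info = {"nuc": "hi", "CORES_FILE_NAME": "hi", "SUBS_FILE_NAME": "hi"}

payload = {
    "job_id": "123",
    "run_id": "123",
    "email": "[email protected]",
    "job_info": json.dumps(job_info),  # JSON-as-string, as in the curl example
}

body = json.dumps(payload)
print(body)
```

The resulting body can be POSTed to the jobs endpoint with any HTTP client.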

Monitoring the job:

# After submitting a job, it should create a pod
minikube kubectl -- get pods -A

# Then get details of the pod, including failures
minikube kubectl -- describe pod mmli-job-molli-123456-j4pwd -n mmli

# get the logs from a pod
minikube kubectl -- logs mmli-job-aceretro-222222222222-n5cl4 -n mmli -c job

Local development setup (without Docker; not recommended)

(1/3) Configure Environment

Create a .env from the .env.tpl template in this repo. The defaults are fine for testing without modification; change the passwords for production use.

cp .env.tpl .env

Setting DEBUG=true enables automatic reloading of the app when the Python source code changes.

(2/3) Install dependencies

If you have Python and pip installed locally, install the dependencies:

# create a new virtual environment, e.g. for conda `conda create -n mmli-backend python=3.10 -y`
# conda activate mmli-backend
pip install -r requirements.txt

This setup runs only the Python app.

⚠️ You must run MinIO and PostgreSQL yourself. Set their credentials in the .env file.

(3/3) Initialize the database

Initialize the Postgres database; this creates the SQL tables starting with the "init" migration.

alembic upgrade head
# You should see these logs:
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> d775ee615d7b, init
...

Finally, verify the tables are created:

  1. Run the psql command:
psql mmli
  2. Run the \d command to list tables.
psql (15.8 (Debian 15.8-1.pgdg120+1))
Type "help" for help.

mmli=# \d 
                     List of relations
 Schema |            Name            |   Type   |  Owner
--------+----------------------------+----------+----------
 public | alembic_version            | table    | postgres
 public | chemical_identifier        | table    | postgres
 public | chemical_identifier_id_seq | sequence | postgres
 public | flaggedmolecule            | table    | postgres
 public | job                        | table    | postgres
(5 rows)
  3. Check the job table with \d job:
mmli=# \d job
                        Table "public.job"
    Column    |       Type        | Collation | Nullable | Default
--------------+-------------------+-----------+----------+---------
 job_info     | character varying |           |          |
 email        | character varying |           |          |
 job_id       | character varying |           | not null |
 run_id       | character varying |           |          |
 phase        | character varying |           | not null |
 type         | character varying |           | not null |
 image        | character varying |           |          |
 command      | character varying |           |          |
 time_created | integer           |           | not null |
 time_start   | integer           |           | not null |
 time_end     | integer           |           | not null |
 deleted      | integer           |           | not null |
 user_agent   | character varying |           | not null |
Indexes:
    "job_id_pk" PRIMARY KEY, btree (job_id)
Referenced by:
    TABLE "flaggedmolecule" CONSTRAINT "flaggedmolecule_job_id_fkey" FOREIGN KEY (job_id) REFERENCES job(job_id)

🎉 All done! 🎉 Check the Swagger docs at localhost:8080/docs for the available endpoints.

Database Migrations

Any time you add, modify, or remove fields in the Job or JobBase classes, the database schema is affected.

Migrations are handled using Alembic.

You can use Alembic to automatically generate a script that migrates the database to the new schema version.

See the migrations README for more info.
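
For orientation, a sketch of what an autogenerated Alembic revision script looks like. The revision ID abc123 and the priority column are hypothetical; 30b240622d34 is the latest revision shown in the upgrade logs above:

```python
"""Hypothetical migration: add a priority column to the job table."""
from alembic import op
import sqlalchemy as sa

# Alembic chains revisions via these module-level identifiers.
revision = "abc123"            # hypothetical new revision ID
down_revision = "30b240622d34" # latest revision from the logs above


def upgrade():
    # Apply the schema change (hypothetical example column).
    op.add_column("job", sa.Column("priority", sa.Integer(), nullable=True))


def downgrade():
    # Revert the schema change.
    op.drop_column("job", "priority")
```

Generating such a script and applying it is done with `alembic revision --autogenerate -m "..."` followed by `alembic upgrade head`, as in the initialization steps above.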
