This repository contains the configs and scripts used to run the WB Apollo server.
New instances are currently set up manually. To set up a new instance (for OS upgrades, machine replacement, etc.):
- Start a new machine (any Linux OS)
- Install docker (if not pre-installed)
- Ensure docker is enabled on system startup:

  ```
  sudo systemctl is-enabled docker
  # If the above command returns "disabled":
  sudo systemctl enable docker
  ```
- Migrate the data from prior instances if required (see below). If no migration is required, start up the apollo service as described below.
- Set up automatic mounting of the apollo volume through `/etc/fstab`. Mount through the device's UUID rather than its device name, to prevent errors when the volumes attached to the instance change (changing attached volumes can result in changing device names for all previously attached volumes). To this end, the fstab file on the machine at time of writing contains:

  ```
  UUID=bd2e223a-2eea-4b7f-9953-6139d3acaeb3 /mnt/apollo_volume/ ext4 defaults,nofail 0 2
  ```
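  To find the UUID to use in the fstab entry, you can query the block device directly (a quick check, not from the original instructions; the device name `/dev/nvme1n1` is an assumption and may differ on your machine):

  ```
  # Print the UUID and filesystem type of the data volume
  sudo blkid /dev/nvme1n1
  # Or list all block devices with their UUIDs
  lsblk -o NAME,UUID,FSTYPE,MOUNTPOINT
  ```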
To start up the apollo service on a machine with all volumes, directories and data in place:
- If no `apollo-service-config` directory is present on the apollo data volume (currently mounted on `/mnt/apollo_volume`), clone the content of this repository into the apollo data volume:

  ```
  git clone https://github.com/WormBase/wb-apollo.git ./apollo-service-config/
  ```

  Otherwise, if this repository has been cloned onto the apollo data volume before, ensure it is up to date:

  ```
  cd ./apollo-service-config/
  git pull
  cd ../
  ```
- Set all environment variables used and required by docker compose. Required variables are coded as `${VAR_NAME?}` in the apollo-service-commons.yml file. This can be done by sourcing the apollo-env.sh script, which will pull the required credentials from the AWS Systems Manager Parameter Store:

  ```
  . apollo-service-config/apollo-env.sh
  ```
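  To verify that every required variable resolves, you can ask docker compose to validate the rendered configuration (a general docker compose check, not part of the original steps); it fails with an explicit error for any unset `${VAR_NAME?}` variable:

  ```
  docker compose -f apollo-service-config/docker-compose.yml config --quiet
  ```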
- Start the apollo service daemon:

  ```
  docker compose -f apollo-service-config/docker-compose.yml up -d wb-apollo-service
  ```
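  To confirm the service came up cleanly, you can check the container status and follow its logs (standard docker compose commands, not part of the original steps):

  ```
  docker compose -f apollo-service-config/docker-compose.yml ps
  docker compose -f apollo-service-config/docker-compose.yml logs -f wb-apollo-service
  ```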
To migrate/restore all data from another Apollo instance to a new instance, for use by the Apollo service defined in this repository, follow these steps:
- Create a snapshot of the data volume attached to the old instance (on which the jbrowse data files are stored).
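  A sketch of creating such a snapshot with the AWS CLI; the volume ID below is a placeholder for the EBS volume attached to the old instance:

  ```
  aws ec2 create-snapshot \
      --volume-id vol-0123456789abcdef0 \
      --description "Apollo data volume snapshot for migration"
  ```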
- If the (postgres) DBs on the old instance are not stored on the data volume (i.e. there is no `./postgres_data` directory on the volume), make DB dumps of the apollo DB and the chado DB on it:

  ```
  $ sudo -u postgres pg_dump --format=c -d apollo-release-production > apollo-release-production_dump.pg_dump
  $ sudo -u postgres pg_dump --format=c -d apollo-production-chado > apollo-production-chado_dump.pg_dump
  ```
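  To sanity-check the dumps before transferring them (an optional check, not in the original steps), `pg_restore --list` prints each archive's table of contents without restoring anything:

  ```
  $ pg_restore --list apollo-release-production_dump.pg_dump | head
  $ pg_restore --list apollo-production-chado_dump.pg_dump | head
  ```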
- Transfer the dump files to the new machine if generated, and store them in the `./apollo_backups` directory (create the directory if needed).
- Once the snapshot from step 1 has completed, restore it as a new volume and attach it to the new instance.
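  A sketch of restoring and attaching the snapshot with the AWS CLI; the snapshot, volume and instance IDs, the availability zone, and the device name below are all placeholders:

  ```
  aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
      --availability-zone us-east-1a --volume-type gp3
  aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
      --instance-id i-0123456789abcdef0 --device /dev/sdf
  ```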
- Mount the new volume (assuming Amazon Linux as OS and the device attached at `/dev/nvme1n1`):

  ```
  $ export APOLLO_DATA_VOLUME=/mnt/apollo_volume
  $ mount /dev/nvme1n1 ${APOLLO_DATA_VOLUME}
  ```
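  On Nitro-based instances the NVMe device name can differ from the device name requested at attach time, so before mounting it may help to confirm which device corresponds to the restored volume (a general check, not part of the original steps):

  ```
  $ lsblk -f
  ```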
- If the root of this volume contains the JBrowse data files (subfolders for each species, rather than JBrowse/Postgres/Apollo data separation), reorganise the data on it (on the new instance):
  - All JBrowse data folders go in a folder `${APOLLO_DATA_VOLUME}/jbrowse_data`:

    ```
    $ cd ${APOLLO_DATA_VOLUME}
    $ mkdir jbrowse_data/
    $ mv * jbrowse_data/
    ```
  - Make an empty folder `./temp_apollo/data` in the Apollo data volume, for temporary apollo intermediate files.
  - Make an empty folder `./temp_apollo/io` in the Apollo data volume, for file transfer between container and host system.
  - Make an empty folder `./postgres_data` in the Apollo data volume, for permanent postgres DB storage (see the sketch after this list).
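  A minimal sketch of creating the three directories above in one go, assuming `APOLLO_DATA_VOLUME` is still set from the mount step:

  ```
  $ cd ${APOLLO_DATA_VOLUME}
  $ mkdir -p temp_apollo/data temp_apollo/io postgres_data
  ```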
- Start the apollo service as described in Apollo service startup, but start the service as a foreground process instead of as a daemon (drop the `-d` argument from the `docker compose up` command). Let it complete startup (to create the necessary DB roles), then stop the process (ctrl+C or cmd+.).
- Open a terminal into an apollo service container, with all data volumes mounted:

  ```
  $ docker compose -f apollo-service-config/docker-compose.yml run --rm apollo-import-export
  ```
- In this container, start the postgres server:

  ```
  /usr/lib/postgresql/9.6/bin/pg_ctl -D /var/lib/postgresql/9.6/main -w start
  ```
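  To confirm the server accepts connections before restoring (an optional check, not in the original steps, assuming the client utilities live alongside `pg_ctl`):

  ```
  /usr/lib/postgresql/9.6/bin/pg_isready
  ```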
- Drop and restore the main postgres database:

  ```
  dropdb $WEBAPOLLO_DB_NAME
  createdb -E UTF-8 -O $WEBAPOLLO_DB_USERNAME $WEBAPOLLO_DB_NAME
  pg_restore --no-owner -d $WEBAPOLLO_DB_NAME ./apollo_backups/apollo-release-production.pg_dump
  ```
- Drop and restore the chado postgres database:

  ```
  dropdb $CHADO_DB_NAME
  createdb -E UTF-8 -O $CHADO_DB_USERNAME $CHADO_DB_NAME
  pg_restore --no-owner -d $CHADO_DB_NAME ./apollo_backups/apollo-production-chado.pg_dump
  ```
- Exit the import/export container (ctrl+d) and start the apollo service again (this time in the background):

  ```
  docker compose -f apollo-service-config/docker-compose.yml up -d wb-apollo-service
  ```
- The Apollo server will attempt a database schema migration when needed. Inspect the logs for the error documented here. If the error occurred, patch the CHADO database manually:

  ```
  docker exec -it wb.apollo.server psql -c "ALTER TABLE chadoprop ADD COLUMN cvterm_id int8 not null DEFAULT 1"
  docker compose -f apollo-service-config/docker-compose.yml restart wb-apollo-service
  ```
- After successful server startup, browse to the web interface (`http://<IP>:8080/apollo/annotator/index`) and log in using the admin user. If you get a notification pop-up stating "Unable to write to directory apollo_data.", fill in the new apollo temp data directory path (`/temp_apollo/data`) in the textbox below it and click the "update common data path" button.
The Apollo container provides a set of CLI scripts and utilities for batch import/export and track management, in the `$CATALINA_BASE/webapps/apollo/jbrowse/bin/` directory (`/var/lib/tomcat9/webapps/apollo/jbrowse/bin/` when using Apollo 2.8.1).

To use any of the Apollo CLI scripts:

```
docker exec wb.apollo.server sh -c 'perl $CATALINA_BASE/webapps/apollo/jbrowse/bin/script-name.pl --arguments'
```
Keep in mind that since this runs the process in the apollo container, the paths available to it are also defined by the container environment:

- JBrowse track files found in the `jbrowse_data` subdirectory of the apollo volume on the host system live under `/data` in the container.
- Files put into the `temp_apollo/io` subdirectory in the apollo data volume can be accessed within the container at `/temp_apollo/io`, and vice versa.
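For example, to see which scripts are available in the running container (a quick check, not an officially documented step):

```
docker exec wb.apollo.server sh -c 'ls $CATALINA_BASE/webapps/apollo/jbrowse/bin/'
```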