Each subfolder here corresponds to a Docker image that uses a standalone script to perform a task, usually invoked in production by an Airflow DAG.
A contributor with Docker set up locally can build a new version of each image at any time after making changes. From the relevant subfolder, run:

```shell
docker build -t ghcr.io/cal-itp/data-infra/[gtfs-aggregator-parser/gtfs-rt-parser-v2/gtfs-schedule-validator/etc.]:[NEW VERSION TAG] .
```
That image can be used alongside a local Airflow instance to test the changed job locally prior to merging.
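As a sketch of that local workflow, the snippet below assembles a throwaway image reference to build and point a local Airflow instance at. The job name and tag are placeholders for illustration, not the jobs' real names:

```shell
# Placeholder values; substitute the real job name and a throwaway local tag
JOB="gtfs-schedule-validator"
TAG="dev-test"
IMAGE="ghcr.io/cal-itp/data-infra/${JOB}:${TAG}"
echo "$IMAGE"

# With Docker available, build from the job's subfolder and reference the
# resulting tag from your local Airflow instance:
#   docker build -t "$IMAGE" .
```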
When changes are finalized, a new version number should be specified in the given subfolder's `pyproject.toml` file. When changes to this directory are merged into `main`, a GitHub Action called `build-[JOB NAME]` automatically publishes an updated version of the image. A contributor with proper GHCR permissions can also manually deploy a new version of the image via the CLI:
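The version bump can be sketched as follows, assuming a standard `[project]` table in `pyproject.toml`. The file contents and path below are fabricated for illustration; the bumped version is what ends up as the published image tag:

```shell
# Fabricated pyproject.toml, written to a temp path for illustration only
cat > /tmp/pyproject_example.toml <<'EOF'
[project]
name = "gtfs-schedule-validator"
version = "2024.1.0"
EOF

# Extract the version string; this value doubles as the new image tag
VERSION=$(sed -n 's/^version = "\(.*\)"/\1/p' /tmp/pyproject_example.toml)
echo "$VERSION"
```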
```shell
docker build -t ghcr.io/cal-itp/data-infra/[gtfs-aggregator-parser/gtfs-rt-parser-v2/gtfs-schedule-validator/etc.]:[NEW VERSION TAG] .
docker push ghcr.io/cal-itp/data-infra/[gtfs-aggregator-parser/gtfs-rt-parser-v2/gtfs-schedule-validator/etc.]:[NEW VERSION TAG]
```
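Concretely, a manual deploy might look like the sketch below. The job name and version tag are placeholders, and pushing requires GHCR write permission plus a prior `docker login ghcr.io`:

```shell
# Placeholder job and version; substitute real values before deploying
JOB="gtfs-rt-parser-v2"
TAG="2024.1.0"
IMAGE="ghcr.io/cal-itp/data-infra/${JOB}:${TAG}"
echo "$IMAGE"

# A real deploy then runs (requires GHCR write access):
#   docker build -t "$IMAGE" .
#   docker push "$IMAGE"
```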
After deploying, no additional steps should be necessary. All internal code referencing the `gtfs-aggregator-scraper`, `gtfs-rt-parser-v2`, and `gtfs-schedule-validator` jobs uses the Airflow `image_tag` macro to automatically fetch the latest version during DAG runs.