Improve caching strategy across the board of CI workflow
We use various caches in our build. Until now, because of how
"standard" caching works, PRs from forks could not effectively use the
cache from the main Airflow repository, since caches are not shared
with other repositories. PR builds could only use the cache
effectively when they were rebased and continued running from the
same fork.

This PR improves the caching strategy by using the "stash" action from
the ASF. Unlike `cache`, the stash action stores the cache as artifacts,
which makes it possible for PRs coming from forks to use the cache
uploaded by `main` canary builds.
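
The restore/save pattern described above can be sketched roughly as follows. This is a minimal illustration assembled from steps that appear in the diff of this commit, not a complete workflow. The job name `example-job`, the step id `restore-example-cache`, the key prefix `cache-example-v1` and the `path/to/...` paths are hypothetical placeholders; the action reference, the `key`, `path`, `if-no-files-found` and `retention-days` inputs, and the `stash-hit` output are the ones used in the diff.

```yaml
jobs:
  example-job:  # hypothetical job name, for illustration only
    runs-on: ubuntu-latest
    steps:
      - name: "Checkout"
        uses: actions/checkout@v4
        with:
          persist-credentials: false
      # Try to restore a previously stashed cache (stored as an artifact).
      - name: "Restore example cache"
        uses: apache/infrastructure-actions/stash/restore@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
        with:
          path: path/to/cache/  # hypothetical cache directory
          key: cache-example-v1-${{ runner.os }}-${{ hashFiles('path/to/lockfile') }}
        id: restore-example-cache
      # Placeholder for the real work that populates the cache directory.
      - run: echo "steps that populate path/to/cache/ go here"
      # Save only when nothing was restored, so a hit does not trigger a new upload.
      - name: "Save example cache"
        uses: apache/infrastructure-actions/stash/save@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
        with:
          path: path/to/cache/
          key: cache-example-v1-${{ runner.os }}-${{ hashFiles('path/to/lockfile') }}
          if-no-files-found: 'error'
          retention-days: '2'
        if: steps.restore-example-cache.outputs.stash-hit != 'true'
```

Guarding the save step with `stash-hit` mirrors what the diff does for the eslint caches. The comments in the diff also note that a given cache should be saved from only one job in the workflow, because concurrent uploads of the same artifact can fail with 409 conflict errors.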

As part of this change, all the places where setup-python was used and
breeze was installed afterwards were reviewed and updated to use only
the breeze installation action (it already installs Python), and this
action has been improved to use uv caching effectively.
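
As a rough sketch of what the updated jobs look like, the steps below are taken from the diff of this commit and show the breeze action standing in for a separate setup-python step. The sketch assumes the calling workflow defines a `use-uv` input, as the workflows touched by this commit do.

```yaml
steps:
  - name: "Checkout"
    uses: actions/checkout@v4
    with:
      persist-credentials: false
  # The composite action sets up Python and installs Breeze in one step,
  # so no separate actions/setup-python step is needed in the job.
  - name: "Install Breeze"
    uses: ./.github/actions/breeze
    with:
      use-uv: ${{ inputs.use-uv }}
    id: breeze
  # The host Python version reported by the breeze action is passed on,
  # so the pre-commit cache key matches the interpreter actually in use.
  - name: "Install pre-commit"
    uses: ./.github/actions/install-pre-commit
    id: pre-commit
    with:
      python-version: ${{ steps.breeze.outputs.host-python-version }}
```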

Overall this PR should decrease setup overhead for many jobs across
the CI workflow.

Follow-up after #45266
potiuk committed Dec 31, 2024
1 parent 52ed7d7 commit 7007d97
Showing 29 changed files with 440 additions and 141 deletions.
22 changes: 18 additions & 4 deletions .github/actions/breeze/action.yml
@@ -22,6 +22,10 @@ inputs:
python-version:
description: 'Python version to use'
default: "3.9"
use-uv:
description: 'Whether to use uv tool'
required: "true"
type: "string"
outputs:
host-python-version:
description: Python version used in host
@@ -33,13 +37,11 @@ runs:
uses: actions/setup-python@v5
with:
python-version: ${{ inputs.python-version }}
cache: 'pip'
cache-dependency-path: ./dev/breeze/pyproject.toml
# NOTE! Installing Breeze without using cache is FASTER than when using cache - uv is so fast and has
# so low overhead, that just running upload cache/restore cache is slower than installing it from scratch
- name: "Install Breeze"
shell: bash
run: ./scripts/ci/install_breeze.sh
env:
PYTHON_VERSION: ${{ inputs.python-version }}
- name: "Free space"
shell: bash
run: breeze ci free-space
@@ -56,3 +58,15 @@ runs:
run: breeze setup config --no-cheatsheet --no-asciiart
env:
AIRFLOW_SOURCES_ROOT: "${{ github.workspace }}"
- name: "Use uv "
shell: bash
run: breeze setup config --use-uv
env:
AIRFLOW_SOURCES_ROOT: "${{ github.workspace }}"
if: inputs.use-uv == 'true'
- name: "Don't use uv "
shell: bash
run: breeze setup config --no-use-uv
env:
AIRFLOW_SOURCES_ROOT: "${{ github.workspace }}"
if: inputs.use-uv != 'true'
46 changes: 36 additions & 10 deletions .github/actions/install-pre-commit/action.yml
@@ -19,18 +19,19 @@
name: 'Install pre-commit'
description: 'Installs pre-commit and related packages'
inputs:
# TODO(potiuk): automate update of these versions
python-version:
description: 'Python version to use'
default: 3.9
default: "3.9"
uv-version:
description: 'uv version to use'
default: 0.5.5
default: "0.5.13"
pre-commit-version:
description: 'pre-commit version to use'
default: 4.0.1
default: "4.0.1"
pre-commit-uv-version:
description: 'pre-commit-uv version to use'
default: 4.1.4
default: "4.1.4"
runs:
using: "composite"
steps:
@@ -40,10 +41,35 @@ runs:
pip install uv==${{inputs.uv-version}} || true
uv tool install pre-commit==${{inputs.pre-commit-version}} --with uv==${{inputs.uv-version}} \
--with pre-commit-uv==${{inputs.pre-commit-uv-version}}
- name: Cache pre-commit envs
uses: actions/cache@v4
working-directory: ${{ github.workspace }}
# We need to use tar file with archive to restore all the permissions and symlinks
- name: "Delete ~.cache"
run: |
du ~/ --max-depth=2
echo
echo Deleting ~/.cache
echo
rm -rf ~/.cache
echo
shell: bash
- name: "Restore pre-commit cache"
uses: apache/infrastructure-actions/stash/restore@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
path: ~/.cache/pre-commit
key: "pre-commit-${{inputs.python-version}}-${{ hashFiles('.pre-commit-config.yaml') }}"
restore-keys: |
pre-commit-${{inputs.python-version}}-
key: cache-pre-commit-v4-${{ inputs.python-version }}-${{ hashFiles('.pre-commit-config.yaml') }}
path: /tmp/
id: restore-pre-commit-cache
- name: "Restore .cache from the tar file"
run: tar -C ~ -xzf /tmp/cache-pre-commit.tar.gz
shell: bash
if: steps.restore-pre-commit-cache.outputs.stash-hit == 'true'
- name: "Show restored files"
run: |
echo "Restored files"
du ~/ --max-depth=2
echo
shell: bash
if: steps.restore-pre-commit-cache.outputs.stash-hit == 'true'
- name: Install pre-commit hooks
shell: bash
run: pre-commit install-hooks || (cat ~/.cache/pre-commit/pre-commit.log && exit 1)
working-directory: ${{ github.workspace }}
7 changes: 6 additions & 1 deletion .github/actions/prepare_breeze_and_image/action.yml
@@ -28,6 +28,9 @@ inputs:
platform:
description: 'Platform for the build - linux/amd64 or linux/arm64'
required: true
use-uv:
description: 'Whether to use uv'
required: true
outputs:
host-python-version:
description: Python version used in host
@@ -40,11 +43,13 @@ runs:
shell: bash
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
use-uv: ${{ inputs.use-uv }}
id: breeze
- name: "Restore ${{ inputs.image-type }} docker image ${{ inputs.platform }}:${{ inputs.python }}"
uses: apache/infrastructure-actions/stash/restore@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
key: "${{ inputs.image-type }}-image-save-${{ inputs.platform }}-${{ inputs.python }}"
key: ${{ inputs.image-type }}-image-save-${{ inputs.platform }}-${{ inputs.python }}
path: "/tmp/"
- name: "Load ${{ inputs.image-type }} image ${{ inputs.platform }}:${{ inputs.python }}"
run: >
2 changes: 1 addition & 1 deletion .github/actions/prepare_single_ci_image/action.yml
@@ -38,7 +38,7 @@ runs:
- name: "Restore CI docker images ${{ inputs.platform }}:${{ inputs.python }}"
uses: apache/infrastructure-actions/stash/restore@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
key: "ci-image-save-${{ inputs.platform }}-${{ inputs.python }}"
key: ci-image-save-${{ inputs.platform }}-${{ inputs.python }}
path: "/tmp/"
if: contains(inputs.python-versions-list-as-string, inputs.python)
- name: "Load CI image ${{ inputs.platform }}:${{ inputs.python }}"
6 changes: 4 additions & 2 deletions .github/workflows/additional-ci-image-checks.yml
@@ -111,7 +111,7 @@ jobs:
python-versions: ${{ inputs.python-versions }}
branch: ${{ inputs.branch }}
constraints-branch: ${{ inputs.constraints-branch }}
use-uv: ${{ inputs.use-uv}}
use-uv: ${{ inputs.use-uv }}
include-success-outputs: ${{ inputs.include-success-outputs }}
docker-cache: ${{ inputs.docker-cache }}
disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
@@ -143,6 +143,8 @@ jobs:
run: ./scripts/ci/cleanup_docker.sh
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
use-uv: ${{ inputs.use-uv }}
- name: "Login to ghcr.io"
run: echo "${{ env.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
- name: "Check that image builds quickly"
@@ -167,7 +169,7 @@ jobs:
# platform: "linux/arm64"
# branch: ${{ inputs.branch }}
# constraints-branch: ${{ inputs.constraints-branch }}
# use-uv: ${{ inputs.use-uv}}
# use-uv: ${{ inputs.use-uv }}
# upgrade-to-newer-dependencies: ${{ inputs.upgrade-to-newer-dependencies }}
# docker-cache: ${{ inputs.docker-cache }}
# disable-airflow-repo-cache: ${{ inputs.disable-airflow-repo-cache }}
6 changes: 6 additions & 0 deletions .github/workflows/additional-prod-image-tests.yml
@@ -56,6 +56,10 @@ on: # yamllint disable-line rule:truthy
description: "Which version of python should be used by default"
required: true
type: string
use-uv:
description: "Whether to use uv"
required: true
type: string
jobs:
prod-image-extra-checks-main:
name: PROD image extra checks (main)
@@ -117,6 +121,7 @@ jobs:
platform: "linux/amd64"
image-type: "prod"
python: ${{ inputs.default-python-version }}
use-uv: ${{ inputs.use-uv }}
- name: "Test examples of PROD image building"
run: "
cd ./docker_tests && \
@@ -150,6 +155,7 @@ jobs:
platform: "linux/amd64"
image-type: "prod"
python: ${{ env.PYTHON_MAJOR_MINOR_VERSION }}
use-uv: ${{ inputs.use-uv }}
id: breeze
- name: "Test docker-compose quick start"
run: breeze testing docker-compose-tests
124 changes: 91 additions & 33 deletions .github/workflows/basic-tests.yml
@@ -56,6 +56,10 @@ on: # yamllint disable-line rule:truthy
description: "Whether to run only latest version checks (true/false)"
required: true
type: string
use-uv:
description: "Whether to use uv in the image"
required: true
type: string
jobs:
run-breeze-tests:
timeout-minutes: 10
@@ -72,16 +76,12 @@ jobs:
persist-credentials: false
- name: "Cleanup docker"
run: ./scripts/ci/cleanup_docker.sh
- uses: actions/setup-python@v5
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
python-version: "${{ inputs.default-python-version }}"
cache: 'pip'
cache-dependency-path: ./dev/breeze/pyproject.toml
- run: pip install --editable ./dev/breeze/
- run: python -m pytest -n auto --color=yes
use-uv: ${{ inputs.use-uv }}
- run: uv tool run --from apache-airflow-breeze pytest -n auto --color=yes
working-directory: ./dev/breeze/


tests-ui:
timeout-minutes: 10
name: React UI tests
@@ -108,15 +108,24 @@ jobs:
node-version: 21
cache: 'pnpm'
cache-dependency-path: 'airflow/ui/pnpm-lock.yaml'
- name: "Cache eslint"
uses: actions/cache@v4
- name: "Restore eslint cache (ui)"
uses: apache/infrastructure-actions/stash/restore@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
path: 'airflow/ui/node_modules'
key: ${{ runner.os }}-ui-node-modules-${{ hashFiles('airflow/ui/**/pnpm-lock.yaml') }}
path: airflow/ui/node_modules/
key: cache-ui-node-modules-v1-${{ runner.os }}-${{ hashFiles('airflow/ui/**/pnpm-lock.yaml') }}
id: restore-eslint-cache
- run: cd airflow/ui && pnpm install --frozen-lockfile
- run: cd airflow/ui && pnpm test
env:
FORCE_COLOR: 2
- name: "Save eslint cache (ui)"
uses: apache/infrastructure-actions/stash/save@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
path: airflow/ui/node_modules/
key: cache-ui-node-modules-v1-${{ runner.os }}-${{ hashFiles('airflow/ui/**/pnpm-lock.yaml') }}
if-no-files-found: 'error'
retention-days: '2'
if: steps.restore-eslint-cache.outputs.stash-hit != 'true'

tests-www:
timeout-minutes: 10
@@ -137,15 +146,66 @@ jobs:
uses: actions/setup-node@v4
with:
node-version: 21
- name: "Cache eslint"
uses: actions/cache@v4
- name: "Restore eslint cache (www)"
uses: apache/infrastructure-actions/stash/restore@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
path: 'airflow/www/node_modules'
key: ${{ runner.os }}-www-node-modules-${{ hashFiles('airflow/www/**/yarn.lock') }}
path: airflow/www/node_modules/
key: cache-www-node-modules-v1-${{ runner.os }}-${{ hashFiles('airflow/www/**/yarn.lock') }}
id: restore-eslint-cache
- run: yarn --cwd airflow/www/ install --frozen-lockfile --non-interactive
- run: yarn --cwd airflow/www/ run test
env:
FORCE_COLOR: 2
- name: "Save eslint cache (www)"
uses: apache/infrastructure-actions/stash/save@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
path: airflow/www/node_modules/
key: cache-www-node-modules-v1-${{ runner.os }}-${{ hashFiles('airflow/www/**/yarn.lock') }}
if-no-files-found: 'error'
retention-days: '2'
if: steps.restore-eslint-cache.outputs.stash-hit != 'true'

install-pre-commit:
timeout-minutes: 5
name: "Install pre-commit for cache"
runs-on: ${{ fromJSON(inputs.runs-on-as-json-default) }}
env:
PYTHON_MAJOR_MINOR_VERSION: "${{ inputs.default-python-version }}"
if: inputs.basic-checks-only == 'true'
steps:
- name: "Cleanup repo"
shell: bash
run: docker run -v "${GITHUB_WORKSPACE}:/workspace" -u 0:0 bash -c "rm -rf /workspace/*"
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@v4
with:
persist-credentials: false
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
use-uv: ${{ inputs.use-uv }}
id: breeze
- name: "Install pre-commit"
uses: ./.github/actions/install-pre-commit
id: pre-commit
with:
python-version: ${{steps.breeze.outputs.host-python-version}}
- name: "Prepare .tar file from pre-commit cache"
run: |
tar -C ~ -czf /tmp/cache-pre-commit.tar.gz .cache/pre-commit .cache/uv
shell: bash
# Saving pre-commit cache should happen only in one job in the entire workflow - because otherwise
# it will cause 409 conflict errors - see https://github.com/actions/upload-artifact/issues/478
# the way it works with airflow - even if the same action is in "ci-image-tests" the if condition
# above `if: inputs.basic-checks-only == 'true'` will prevent it from running in the other job
- name: "Save pre-commit[pre-commit] cache"
uses: apache/infrastructure-actions/stash/save@c94b890bbedc2fc61466d28e6bd9966bc6c6643c
with:
# yamllint disable rule:line-length
key: cache-pre-commit-v4-${{ steps.breeze.outputs.host-python-version }}-${{ hashFiles('.pre-commit-config.yaml') }}
path: /tmp/cache-pre-commit.tar.gz
if-no-files-found: 'error'
retention-days: '2'

# Those checks are run if no image needs to be built for checks. This is for simple changes that
# Do not touch any of the python code or any of the important files that might require building
@@ -154,6 +214,7 @@ jobs:
timeout-minutes: 30
name: "Static checks: basic checks only"
runs-on: ${{ fromJSON(inputs.runs-on-as-json-public) }}
needs: install-pre-commit
if: inputs.basic-checks-only == 'true'
steps:
- name: "Cleanup repo"
@@ -165,20 +226,10 @@ jobs:
persist-credentials: false
- name: "Cleanup docker"
run: ./scripts/ci/cleanup_docker.sh
- name: "Setup python"
uses: actions/setup-python@v5
with:
python-version: ${{ inputs.default-python-version }}
cache: 'pip'
cache-dependency-path: ./dev/breeze/pyproject.toml
- name: "Setup python"
uses: actions/setup-python@v5
with:
python-version: "${{ inputs.default-python-version }}"
cache: 'pip'
cache-dependency-path: ./dev/breeze/pyproject.toml
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
use-uv: ${{ inputs.use-uv }}
id: breeze
- name: "Install pre-commit"
uses: ./.github/actions/install-pre-commit
@@ -216,6 +267,7 @@ jobs:
timeout-minutes: 45
name: "Upgrade checks"
runs-on: ${{ fromJSON(inputs.runs-on-as-json-public) }}
needs: install-pre-commit
env:
PYTHON_MAJOR_MINOR_VERSION: "${{ inputs.default-python-version }}"
if: inputs.canary-run == 'true' && inputs.latest-versions-only != 'true'
@@ -229,12 +281,16 @@ jobs:
persist-credentials: false
- name: "Cleanup docker"
run: ./scripts/ci/cleanup_docker.sh
# Install python from scratch. No cache used. We always want to have fresh version of everything
- uses: actions/setup-python@v5
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
python-version: "${{ inputs.default-python-version }}"
- name: "Install latest pre-commit"
run: pip install pre-commit
use-uv: ${{ inputs.use-uv }}
id: breeze
- name: "Install pre-commit"
uses: ./.github/actions/install-pre-commit
id: pre-commit
with:
python-version: ${{steps.breeze.outputs.host-python-version}}
- name: "Autoupdate all pre-commits"
run: pre-commit autoupdate
- name: "Run automated upgrade for black"
@@ -305,6 +361,8 @@ jobs:
run: ./scripts/ci/cleanup_docker.sh
- name: "Install Breeze"
uses: ./.github/actions/breeze
with:
use-uv: ${{ inputs.use-uv }}
- name: "Cleanup dist files"
run: rm -fv ./dist/*
- name: Setup git for tagging