
Add lychee prek hook (offline mode) and fix internal markdown links #66356

Merged
shahar1 merged 2 commits into apache:main from potiuk:add-lychee-link-check
May 5, 2026

@potiuk
Member

@potiuk potiuk commented May 4, 2026

Summary

Add the lychee link checker as a prek hook in offline mode so we catch
broken intra-repo Markdown links before they land, and fix the 13
existing broken internal links the hook surfaced on main.

The hook

Configured in .pre-commit-config.yaml:

- repo: https://github.com/lycheeverse/lychee
  rev: e85aaf5524b2f808e63bae55e594c843220f10f2  # frozen: lychee-v0.24.2
  hooks:
    - id: lychee
      types: [markdown]
      args:
        - LYCHEE_VERSION=0.24.2
        - --offline
        - --no-progress
        - --root-dir
        - .
      exclude: |
        (?x)
        ^clients/python/|
        ^.*/openapi-gen/|
        ^.*/node_modules/|
        ^\.build/|
        ^generated/

Notes:

  • --offline means lychee skips every HTTP/HTTPS link and only validates
    on-disk references (relative paths, anchors, file:// links).
  • --root-dir . makes root-relative GitHub-style links like
    [X](/contributing-docs/X.rst) resolve from the repo root.
  • LYCHEE_VERSION=0.24.2 is passed as the first arg because lychee's
    pre-commit script otherwise requires the rev field to be a version
    tag, but airflow's convention is to pin SHAs. The arg keeps both
    happy. When bumping lychee, update both the rev SHA and the version
    string together.
  • Auto-generated content is excluded (the Python OpenAPI client docs
    under clients/python/, JS node_modules/, openapi-gen output,
    .build/, the generated/ dir).
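
To see which paths the exclude pattern skips, it can be tried out in isolation. The sketch below uses Python's `re` module as a stand-in for the hook runner's regex engine (an assumption; the runner may differ in edge cases), and the sample paths are hypothetical, picked only to exercise each alternative.

```python
import re

# The hook's exclude pattern, as the YAML block scalar would deliver it.
# (?x) enables verbose mode, so the line breaks between alternatives are ignored.
EXCLUDE = re.compile(r"""(?x)
^clients/python/|
^.*/openapi-gen/|
^.*/node_modules/|
^\.build/|
^generated/
""")


def is_excluded(path: str) -> bool:
    """Return True when a repo-relative path matches the exclude pattern."""
    return EXCLUDE.search(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, not taken from the PR.
    samples = [
        "clients/python/README.md",
        "airflow-core/src/airflow/ui/node_modules/pkg/README.md",
        ".build/notes.md",
        "docs/generated/index.md",
        "node_modules/pkg/README.md",
        "dev/README_RELEASE_AIRFLOW.md",
    ]
    for sample in samples:
        print(f"{sample}: {'skipped' if is_excluded(sample) else 'checked'}")
```

Under these assumptions, `^generated/` only matches a top-level `generated/` directory, and `^.*/node_modules/` requires at least one parent directory, so a `node_modules/` directory at the repo root would still be checked.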

The 13 broken links fixed

  • Missing locale guides (6 hits across 2 tables): added one-line stub
    files for ar, it, tr under
    .github/skills/airflow-translations/locales/. They are listed in the
    airflow-translations SKILL.md table because those locales are active
    in the UI i18n directory; until a full style guide is authored, each
    stub points the reader to the global rules in the parent SKILL.md.
  • airflow-core/src/airflow/_shared/{AGENTS,README}.md: the
    [shared folder](../../shared) link resolved inside airflow-core/
    rather than at the repo root. Fixed to ../../../../shared.
  • airflow-core/src/airflow/api_fastapi/execution_api/AGENTS.md:
    one missing ../ segment; the contributing-docs/... link was
    resolving to airflow-core/contributing-docs/ instead of the repo
    root. Fixed.
  • airflow-core/src/airflow/ui/tests/e2e/README.md: the referenced
    dag-trigger.spec.ts example no longer exists; the link now points
    at the existing dag-runs.spec.ts instead.
  • dev/breeze/doc/ci/02_images.md: docs moved from
    docs/docker-stack/ to docker-stack-docs/; link updated.
  • dev/README_RELEASE_HELM_CHART.md: helm-chart docs moved from
    docs/helm-chart/ to chart/docs/; link updated.
  • dev/system_tests/README.md: dropped the dangling reference to
    a dev/requirements.txt that no longer exists; replaced with a
    uv run --with PyGithub --with rich-click --with rich invocation
    that picks up the script's actual dependencies.
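
The relative-path fixes above are easier to verify with a small resolver. This is a minimal sketch of how a local link target maps to a repo-relative path, assuming plain POSIX path semantics; `resolve_link` is a hypothetical helper written for illustration, not lychee's implementation.

```python
import posixpath


def resolve_link(source_file: str, target: str, root_dir: str = ".") -> str:
    """Resolve a local Markdown link target to a normalized repo-relative path."""
    if target.startswith("/"):
        # Root-relative link: resolved from root_dir, as with --root-dir .
        joined = posixpath.join(root_dir, target.lstrip("/"))
    else:
        # Relative link: resolved from the directory holding the source file.
        joined = posixpath.join(posixpath.dirname(source_file), target)
    return posixpath.normpath(joined)


if __name__ == "__main__":
    source = "airflow-core/src/airflow/_shared/AGENTS.md"
    # Old link: climbs only two levels and stays inside airflow-core/.
    print(resolve_link(source, "../../shared"))
    # Fixed link: climbs four levels and lands on the repo-root shared/ dir.
    print(resolve_link(source, "../../../../shared"))
    # Root-relative GitHub-style link, independent of the source location.
    print(resolve_link(source, "/contributing-docs/X.rst"))
```

Counting directory levels this way shows why two `../` segments were not enough for a file four directories below the repo root.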

Test plan

  • prek run lychee --all-files passes (verified locally — 13
    errors on main → 0 errors after this PR).
  • prek run insert-license lint-markdown --all-files still passes
    across the touched files.

Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Opus 4.7)

Generated-by: Claude Code (Opus 4.7) following the guidelines

* Add the lychee link checker as a prek hook in offline mode
  (`--offline --no-progress --root-dir .`) so we catch broken
  intra-repo links in Markdown files before they land. The hook
  excludes auto-generated content (the Python OpenAPI client docs
  under `clients/python/`, JS `node_modules`, build output) and runs
  on every Markdown file otherwise. The lychee binary version is
  pinned via `LYCHEE_VERSION=0.24.2` as the first arg, alongside the
  SHA-pinned `rev`, so we keep the airflow SHA-pinning convention
  while satisfying lychee's pre-commit script which expects a
  version tag.

* Fix the 13 broken links the hook surfaced on `main`:

  - Add minimum-stub locale guideline files for active locales that
    were referenced from the airflow-translations skill table but had
    no file yet (`ar`, `it`, `tr`). The stub points the reader at the
    parent SKILL.md so the global rules apply until a real guide is
    written.
  - `airflow-core/src/airflow/_shared/{AGENTS,README}.md`: correct the
    `../../shared` path (which resolved inside `airflow-core/`) to
    `../../../../shared` so it lands on the repo-root `shared/` dir.
  - `airflow-core/src/airflow/api_fastapi/execution_api/AGENTS.md`:
    add the missing `../` so the `contributing-docs/...` link resolves
    to repo root instead of `airflow-core/contributing-docs/`.
  - `airflow-core/src/airflow/ui/tests/e2e/README.md`: replace the
    reference to a no-longer-existing `dag-trigger.spec.ts` example
    with the existing `dag-runs.spec.ts`.
  - `dev/breeze/doc/ci/02_images.md`: docs were moved from
    `docs/docker-stack/` to `docker-stack-docs/`; update the link.
  - `dev/README_RELEASE_HELM_CHART.md`: helm-chart docs moved from
    `docs/helm-chart/` to `chart/docs/`; update the link.
  - `dev/system_tests/README.md`: drop the dangling reference to a
    `dev/requirements.txt` that no longer exists; replace with a
    `uv run --with PyGithub --with rich-click --with rich` invocation
    that picks up the script's actual dependencies.
@boring-cyborg boring-cyborg Bot added the area:dev-tools, area:task-sdk, area:UI, and backport-to-v3-2-test labels May 4, 2026
@potiuk
Member Author

potiuk commented May 4, 2026

Should help us to keep our markdown cross-references in order :)

Contributor

@parkhojeong parkhojeong left a comment

This is a really useful feature :)

I tested this locally and found that broken fragments (anchors) are not reported with the current configuration.

Comment thread .pre-commit-config.yaml
Two follow-ups for apache#66356:

* CI failure: the `lychee` script-based hook auto-installs the
  pre-built `lychee` binary via cargo-binstall, but lychee 0.24.x
  binaries are linked against `GLIBC_2.38` / `GLIBC_2.39`, which
  the ubuntu-22.04 CI runners do not ship (they have glibc 2.35).
  Switch to the upstream `lychee-docker` variant — it runs the
  official `lycheeverse/lychee` Docker image and bundles its own
  libc, so it is portable across runners.

* Reviewer feedback (@parkhojeong): pass `--include-fragments` so
  lychee also reports broken Markdown anchor fragments. That
  surfaced three real broken fragments which are also fixed here:

  - `.github/skills/pr-triage/actions.md`: add explicit
    `<a id="mark-ready"></a>` and `<a id="mark-ready-with-ping"></a>`
    anchors before the `## ` headings, since GitHub's auto-generated
    anchors include the full descriptive heading (e.g.
    `mark-ready--add-ready-for-maintainer-review-label`) rather than
    the short action name that the existing cross-references expect.
  - `dev/README_RELEASE_AIRFLOW.md`: the link was pointing at
    `#removing-or-replacing-ownership` but the section is called
    "Relinquishing translation/code ownership" — fix to
    `#relinquishing-translationcode-ownership`.
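
The anchor mismatches above follow from how heading text is turned into a fragment. The sketch below approximates that derivation (lowercase, drop punctuation, turn spaces into hyphens); it is a simplification rather than GitHub's exact algorithm, it ignores duplicate-heading suffixes, and the second heading text is a guess made only to reproduce the anchor quoted in the commit message.

```python
import re


def github_style_anchor(heading: str) -> str:
    """Approximate the anchor GitHub derives from a heading's text."""
    lowered = heading.strip().lower()
    # Drop everything except word characters, hyphens and spaces ...
    kept = re.sub(r"[^\w\- ]", "", lowered)
    # ... then turn each remaining space into a hyphen.
    return kept.replace(" ", "-")


if __name__ == "__main__":
    # Section title quoted in the commit message: the "/" is dropped.
    print(github_style_anchor("Relinquishing translation/code ownership"))
    # Hypothetical heading text: removing the dash leaves two spaces,
    # which become the double hyphen seen in the real anchor.
    print(github_style_anchor("mark-ready \u2014 add ready for maintainer review label"))
```

Because the derived anchor carries the whole heading, a short link such as `#mark-ready` only resolves if an explicit `<a id="mark-ready"></a>` marker is present.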

The `LYCHEE_VERSION=0.24.2` arg used by the script-based hook is
no longer needed for the docker variant; the docker entry pins the
version itself.
@potiuk
Member Author

potiuk commented May 5, 2026

Good catch — added --include-fragments to the args. It surfaced three genuinely broken anchors which I fixed in this push:

  1. .github/skills/pr-triage/classify-and-act.md and SKILL.md were both linking at actions.md#mark-ready and actions.md#mark-ready-with-ping, but GitHub's auto-anchors for those ## headings actually include the full descriptive text (mark-ready--add-ready-for-maintainer-review-label). Added explicit <a id="mark-ready"></a> / <a id="mark-ready-with-ping"></a> markers in actions.md so the existing short-form links resolve.

  2. dev/README_RELEASE_AIRFLOW.md was linking at the i18n README's #removing-or-replacing-ownership, but that section is actually called "Relinquishing translation/code ownership" — fixed the link to #relinquishing-translationcode-ownership.

I also switched the hook to lychee-docker to fix the unrelated GLIBC incompatibility that was failing CI (the lychee 0.24 prebuilt binary needs glibc 2.39+, ubuntu-22.04 runners have 2.35). The docker image bundles its own libc so it works portably.
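
A quick way to check whether a given machine would hit the same incompatibility is to compare its libc version against the one the binary needs. This is a minimal sketch: `platform.libc_ver()` is a standard-library call, while the helper names and the `REQUIRED` value are assumptions for illustration (the commit message mentions both `GLIBC_2.38` and `GLIBC_2.39` symbols).

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.35' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(available: str, required: str) -> bool:
    """Return True when the available glibc is at least the required one."""
    return parse_version(available) >= parse_version(required)


if __name__ == "__main__":
    REQUIRED = "2.39"  # assumed threshold, taken from the comment above
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        # Non-glibc systems (or failed detection) report empty strings.
        print("glibc not detected; the check does not apply here")
    elif glibc_satisfies(version, REQUIRED):
        print(f"glibc {version} is new enough for a binary needing {REQUIRED}")
    else:
        print(f"glibc {version} is older than {REQUIRED}; prefer the Docker variant")
```

Comparing tuples rather than strings matters here, since a plain string comparison would rank "2.4" above "2.35".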


Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting

@parkhojeong
Contributor

Thanks for the quick follow-up. I reviewed the latest commit (fe61ea2), and it looks good to me.

@shahar1
Contributor

shahar1 commented May 5, 2026

@parkhojeong thanks for helping review it!
LGTM and merging :)

@shahar1 shahar1 merged commit 5936cf8 into apache:main May 5, 2026
143 checks passed
@github-actions
Contributor

github-actions Bot commented May 5, 2026

Backport failed to create: v3-2-test. View the failure log Run details

Note: As of "Merging PRs targeted for Airflow 3.X", the committer who merges the PR is responsible for backporting PRs that are bug fixes (generally speaking) to the maintenance branches.

In case of doubt, please ask in the #release-management Slack channel.

Status Branch Result
v3-2-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 5936cf8 v3-2-test

This should apply the commit to the v3-2-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

If you don't have cherry-picker installed, see the installation guide.

3 participants