
ci(docs): check changed markdown links on pull requests #1139

Open
13ernkastel wants to merge 10 commits into NVIDIA:main from 13ernkastel:codex/issue-552-docs-link-checker

Conversation


@13ernkastel 13ernkastel commented Mar 31, 2026

Summary

  • add a lightweight pull request workflow that runs the existing docs link checker on changed markdown files
  • improve check-docs.sh output so broken local links include the source line number
  • stabilize the checker's parsing locale so it behaves cleanly across environments
  • add a focused Vitest file that covers broken links and fenced-code exclusions

Why

Issue #552 asks for markdown link checking in CI. The repo already had a useful checker in test/e2e/e2e-cloud-experimental/check-docs.sh, but it only ran in broader E2E contexts and its broken-link output did not point back to the exact markdown line.

This keeps the fix small by reusing the existing checker instead of introducing a second link-checking tool. The pull request workflow runs --local-only on changed markdown files so review-time checks stay fast and avoid flaky network-driven failures.

Validation

  • bash -n test/e2e/e2e-cloud-experimental/check-docs.sh
  • ruby -e 'require "yaml"; puts YAML.load_file(".github/workflows/docs-links-pr.yaml")["name"]'
  • npx vitest run test/check-docs-links.test.js
  • bash test/e2e/e2e-cloud-experimental/check-docs.sh --only-links --local-only README.md

Closes #552.

Summary by CodeRabbit

  • New Features

    • Added a "Docs Links PR" workflow to run link checks on changed Markdown in pull requests.
  • Tests

    • Added test suite that verifies local Markdown link checking, reports broken links with source context, and ensures links inside fenced code blocks are ignored.
  • Chores

    • Improved link-checking to skip fenced code blocks, emit source line numbers, and make extraction more locale-stable.

Signed-off-by: 13ernkastel <LennonCMJ@live.com>

Contributor

coderabbitai bot commented Mar 31, 2026

📝 Walkthrough


Adds a PR-triggered GitHub Actions workflow that runs a local markdown link checker on changed .md files; the checker now records source line numbers and properly ignores both backtick and tilde fenced code blocks; new Vitest tests verify detection and exclusions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| GitHub Actions Workflow: /.github/workflows/docs-links-pr.yaml | New workflow triggered on PRs to main for .md changes; computes changed markdown files, exports has_files and the file list, and conditionally runs the link-check script with --only-links --local-only. |
| Link checking script: test/e2e/e2e-cloud-experimental/check-docs.sh | Set LC_ALL=C for deterministic processing; replace naive link extractor with a stateful fenced-code parser supporting backtick and tilde fences (variable length); extractor now emits line_no<TAB>target; check_local_ref and run_links_check updated to accept and report source line numbers in diagnostics. |
| Tests: test/check-docs-links.test.js | New Vitest suite that runs check-docs.sh --only-links --local-only against temporary Markdown fixtures; asserts broken local links are reported with source file and line number, and verifies links inside various fenced code blocks (including mismatched/short closers) are ignored. |
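
The stateful extractor described in the table can be sketched in JavaScript as follows. This is only an illustration of the idea, not the Perl code in check-docs.sh: the function name `extractTargets` and both regular expressions are assumptions, only inline `[text](target)` links are handled, and the fence rules are simplified (any leading indentation is accepted).

```javascript
// Hypothetical sketch: emit "line_no<TAB>target" for inline Markdown links
// that sit outside fenced code blocks. Not the actual check-docs.sh logic.
const FENCE_RE = /^\s*(`{3,}|~{3,})(.*)$/;
const LINK_RE = /\[[^\]]*\]\(([^)\s]+)[^)]*\)/g;

function extractTargets(markdown) {
  const records = [];
  let fence = null; // { ch, len } while inside a fenced block, else null

  markdown.split(/\r?\n/).forEach((line, index) => {
    const m = FENCE_RE.exec(line);
    if (m) {
      const ch = m[1][0];
      const len = m[1].length;
      if (fence === null) {
        // Opening fence: remember its character and length.
        fence = { ch, len };
      } else if (ch === fence.ch && len >= fence.len && m[2].trim() === "") {
        // Closing fence: same character, at least as long, nothing after it.
        fence = null;
      }
      return; // fence lines never contribute links
    }
    if (fence !== null) return; // skip everything inside a fenced block

    for (const link of line.matchAll(LINK_RE)) {
      records.push(`${index + 1}\t${link[1]}`);
    }
  });
  return records;
}
```

Calling `extractTargets` on a file's contents returns one record per link, so a caller can split each record on the tab to recover the line number and the target.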

Sequence Diagram(s)

sequenceDiagram
    participant PR as Pull Request
    participant GHA as GitHub Actions (docs-links-pr)
    participant Script as check-docs.sh
    participant FS as Repository File System
    participant CI as CI Result

    PR->>GHA: open/reopen/synchronize with .md changes
    GHA->>GHA: git diff base...head -> list of .md files
    alt markdown files present
        GHA->>Script: run with --only-links --local-only and file paths
        Script->>Script: parse files (skip fenced code), emit line_no<TAB>target
        loop per extracted target
            Script->>FS: check target exists
            FS-->>Script: exists / missing
        end
        alt missing targets found
            Script-->>CI: exit non-zero with "md_path:line_no -> target"
        else
            Script-->>CI: exit 0
        end
    else no markdown files
        GHA-->>CI: skip link check
    end
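
The per-target loop in the diagram above could look roughly like this in JavaScript. It is a sketch under assumptions: `findBrokenLocalLinks` is a hypothetical name, the input is assumed to be `[lineNo, target]` pairs, and the skip rules for external URLs and in-page anchors are a guess at what --local-only implies rather than a copy of check_local_ref.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: return one "md_path:line_no -> target" diagnostic
// for each local link target that does not exist on disk.
function findBrokenLocalLinks(mdPath, refs) {
  const baseDir = path.dirname(mdPath);
  const broken = [];
  for (const [lineNo, target] of refs) {
    // Assumption: URLs with a scheme (https:, mailto:, ...) and pure
    // in-page anchors are out of scope for a local-only check.
    if (/^[a-z][a-z0-9+.-]*:/i.test(target) || target.startsWith("#")) continue;

    // Drop any "#fragment" before resolving relative to the Markdown file.
    const filePart = target.split("#")[0];
    if (!fs.existsSync(path.resolve(baseDir, filePart))) {
      broken.push(`${mdPath}:${lineNo} -> ${target}`);
    }
  }
  return broken;
}
```

A caller would exit non-zero when the returned array is non-empty, which matches the two outcomes shown in the diagram.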

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through docs by lantern light,

counting links both day and night.
Fenced code hides its secret ways,
I skip the traps and mark the stray.
CI hums — I guard the docs' bright.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'ci(docs): check changed markdown links on pull requests' directly describes the main change: adding a CI workflow to validate markdown links in PRs. |
| Linked Issues check | ✅ Passed | All core requirements from issue #552 are met: CI workflow validates relative links in changed markdown files, detects broken local links with line numbers and paths, skips external URLs/anchors/fenced code blocks, excludes non-doc paths, and runs on PR events with local-only mode. |
| Out of Scope Changes check | ✅ Passed | All changes are directly aligned with issue #552 requirements: workflow for PR checks, enhanced link checker with line-number reporting, test coverage for fenced-block exclusions, and locale stability. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@13ernkastel 13ernkastel marked this pull request as ready for review March 31, 2026 05:54
@13ernkastel
Author

@coderabbitai review

Contributor

coderabbitai bot commented Mar 31, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
test/e2e/e2e-cloud-experimental/check-docs.sh (1)

224-230: Handle tilde fences (~~~) in fenced-code skipping.

extract_targets currently toggles fence state only for backtick fences, so links inside ~~~ fenced blocks can still be parsed as real links. Line 225 is the toggle point to broaden.

Suggested patch
-    if (/^\s*```/) { $in = !$in; next; }
+    if (/^\s*(```|~~~)/) { $in = !$in; next; }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/e2e-cloud-experimental/check-docs.sh` around lines 224 - 230, The
fenced-code detection only toggles for backticks (the Perl one-liner in
check-docs.sh that currently uses if (/^\s*```/) to flip $in), so add tilde
fence support by changing that condition to match either ``` or ~~~ (i.e.,
update the Perl fence-toggle regex inside the extract/processing one-liner to
/^\s*(```|~~~)/ so links inside ~~~ blocks are skipped as well).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docs-links-pr.yaml:
- Line 35: The current population of md_files captures all changed *.md
including vendored/generated paths; update the mapfile command that sets
md_files to exclude common non-doc directories by filtering out patterns like
node_modules, dist, vendor, build (e.g., modify the git diff pipeline that
produces md_files or append a grep -v -E '^(node_modules|dist|vendor|build)/'
before the sort) so md_files only contains real documentation markdown changes.
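
As an illustration of the filtering suggested above, here is a JavaScript sketch that mirrors the `grep -v -E '^(node_modules|dist|vendor|build)/'` step. The function name `selectDocMarkdown` is hypothetical, and the workflow itself does this in shell, not in Node.

```javascript
// Hypothetical mirror of the suggested grep filter: keep changed *.md paths
// that do not start with a vendored or generated top-level directory.
const EXCLUDED_PREFIX = /^(node_modules|dist|vendor|build)\//;

function selectDocMarkdown(changedPaths) {
  const kept = changedPaths.filter(
    (p) => p.endsWith(".md") && !EXCLUDED_PREFIX.test(p)
  );
  // De-duplicate and sort, as a `sort -u` style step would.
  return [...new Set(kept)].sort();
}
```

Because the pattern is anchored at the start of the path, only top-level directories are excluded; a nested path such as docs/vendor/notes.md would still be checked.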


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 47af3454-3672-41b5-a1a6-40f5430603a9

📥 Commits

Reviewing files that changed from the base of the PR and between 0086886 and 113bbf4.

📒 Files selected for processing (3)
  • .github/workflows/docs-links-pr.yaml
  • test/check-docs-links.test.js
  • test/e2e/e2e-cloud-experimental/check-docs.sh

@13ernkastel 13ernkastel force-pushed the codex/issue-552-docs-link-checker branch from bd5d7fc to 1ff2346 March 31, 2026 06:16
@13ernkastel
Author

Addressed the actionable review feedback in 1ff2346 and resolved the review conversation.

Updates included:

  • excluded node_modules/, dist/, vendor/, and build/ paths from the changed-markdown diff list
  • taught the markdown link extractor to ignore ~~~ fenced code blocks as well as backtick fences
  • added regression coverage for tilde-fenced blocks

Validation rerun:

  • bash -n test/e2e/e2e-cloud-experimental/check-docs.sh
  • ruby -e 'require "yaml"; puts YAML.load_file(".github/workflows/docs-links-pr.yaml")["name"]'
  • npx vitest run test/check-docs-links.test.js
  • bash test/e2e/e2e-cloud-experimental/check-docs.sh --only-links --local-only README.md

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/e2e-cloud-experimental/check-docs.sh`:
- Around line 224-225: Replace the permissive fence-toggle logic in the perl
one-liner (the /^\s*(```|~~~)/ branch that flips $in) with a delimiter-aware
handler: detect fence openings via /^\s*(`{3,}|~{3,})/ and record the fence
character and length into $fch and $flen when entering ($in=1), and only exit
the fence (set $in=0 and clear $fch/$flen) when a closing fence uses the same
character and length >= $flen; update the assignments/tuples that set ($in,
$fch, $flen) accordingly so the link-scanning next/while blocks remain gated by
the refined $in state.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ed0e22e1-3e2a-47bf-b23d-a297ccdf6b0a

📥 Commits

Reviewing files that changed from the base of the PR and between 113bbf4 and 1ff2346.

📒 Files selected for processing (3)
  • .github/workflows/docs-links-pr.yaml
  • test/check-docs-links.test.js
  • test/e2e/e2e-cloud-experimental/check-docs.sh
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/docs-links-pr.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/check-docs-links.test.js

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/e2e-cloud-experimental/check-docs.sh`:
- Around line 225-235: The fence-closing check using the regex
/^\s*(`{3,}|~{3,})/ incorrectly treats lines like "```not-a-close" as closes;
update the logic in the block that uses the regex and variables ($in, $fch,
$flen) so that a closing fence is only accepted if the matched fence is followed
only by optional whitespace (i.e., nothing else on the line). Concretely, either
change the regex to assert end-of-line or only whitespace after the marker, or
after matching check the remainder of the line for non-whitespace characters
before flipping ($in, $fch, $flen) to close the fence.
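
The closing rule requested in this comment can be expressed as a small predicate. The JavaScript below is a sketch, not the Perl in check-docs.sh: `isClosingFence` is a hypothetical name, and its `openCh` and `openLen` parameters stand in for the script's $fch and $flen.

```javascript
// Hypothetical predicate: does `line` close a fence that was opened with
// `openLen` copies of `openCh` ("`" or "~")?
function isClosingFence(line, openCh, openLen) {
  // The trailing \s*$ rejects lines such as "```not-a-close", because a
  // closing fence may be followed only by optional whitespace.
  const m = /^\s*(`{3,}|~{3,})\s*$/.exec(line);
  return m !== null && m[1][0] === openCh && m[1].length >= openLen;
}
```

With this check, a line like "```not-a-close" inside a backtick fence leaves the parser in the fenced state, so links after it stay hidden until a real closer appears.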

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b784f8ab-e97b-46f5-92cf-76320a8b344d

📥 Commits

Reviewing files that changed from the base of the PR and between 1ff2346 and 8af07c8.

📒 Files selected for processing (2)
  • test/check-docs-links.test.js
  • test/e2e/e2e-cloud-experimental/check-docs.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/check-docs-links.test.js

@cv cv enabled auto-merge (squash) March 31, 2026 07:20
Signed-off-by: 13ernkastel <LennonCMJ@live.com>
@13ernkastel
Author

The current head is ready from my side, but the latest required checks are still in GitHub's action_required state because this is a fork PR. A maintainer workflow approval should unblock dco-check, commit-lint, Docs Links PR, and pr for the current head (be3c965). Once those run, the PR should be mergeable.

@13ernkastel 13ernkastel requested a review from cv March 31, 2026 13:25

Development

Successfully merging this pull request may close these issues.

ci: add markdown link checker for docs and README

2 participants