chore(ci): docs/skills/MCP drift audit — script + baseline + weekly workflow (+ fix found drift)#630

Merged
masukai merged 2 commits into main from feat/drift-check-automation on Jun 10, 2026

@masukai masukai commented Jun 10, 2026

Contributor

Summary

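Automated detection for the docs / skills / MCP drift problem we kept hitting by hand: connectors and MCP tools ship in code PRs, but the surrounding surfaces are updated manually and regularly lag. Real examples that motivated this:

  • /drt-debug skill went 3 months without learning about drt doctor
  • the MCP server lagged the CLI by 4 months (no drt_doctor, no --diff parity, docstring missing drt_get_history since #445)
  • drt_list_connectors' hardcoded inventory silently lagged the registry by 10 connectors
  • the README Sources table never got a REST API source row (#474)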

Design

State-based, not diff-based

scripts/check_drift.sh audits current reality (the registry vs the surfaces), not a PR diff — so it also catches drift that accumulated before the check existed, and works identically for the weekly schedule and post-merge runs.

9 checks: destination → connector doc / create-sync skill / README table · source → README table / init skill · MCP tool → server docstring / README + README.ja tables · registered connector → drt_list_connectors inventory (sources and destinations blocks matched separately).
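
As a rough illustration of what "state-based" means here, the sketch below compares a registry against the text of one surface and reports every registered item the surface never mentions. It is not the code in scripts/check_drift.sh (which is a shell script); the function name `find_missing`, the sample registry, and the sample skill text are all assumptions made for the example, and the naive substring match is a simplification.

```python
# Illustrative sketch only: the real audit lives in scripts/check_drift.sh.
# find_missing and the sample data below are hypothetical.

def find_missing(check_id: str, registered: set[str], surface_text: str) -> list[str]:
    """Return 'check_id:item' for every registered item the surface never mentions."""
    haystack = surface_text.lower()
    return sorted(
        f"{check_id}:{name}"
        for name in registered
        if name.lower() not in haystack  # naive substring match, for illustration
    )


registry = {"databricks", "slack", "s3"}          # assumed registry contents
skill_doc = "Supported destinations: Slack, S3."  # assumed surface text

findings = find_missing("dest-skill", registry, skill_doc)
print(findings)  # ['dest-skill:databricks']
```

Because the function looks only at the current registry and the current surface text, it gives the same answer on a schedule, after a merge, or on a laptop, which is the property the design relies on.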

Contributor isolation — deliberately NOT on: pull_request

Per maintainer requirement: contributor PRs never see drift warnings. Docs/skill upkeep is a maintainer concern; surfacing it on first-time contributors' PRs adds onboarding friction. The workflow runs:

  • post-merge on main (path-filtered to registry / MCP / docs / skills / README)
  • weekly Monday 01:00 UTC (offset from contributors-audit at 00:30)
  • manual workflow_dispatch

and reports to a self-updating maintainer-facing tracking issue — same model as contributors-audit.yml (#621). The issue auto-closes when the audit comes back clean.
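
The tracking-issue lifecycle described above reduces to a small decision table. The sketch below is one way to express it, assuming the workflow knows two things: how many new drift items the audit found and whether the tracking issue is currently open. The `Action` enum and `plan_issue_action` are hypothetical names, not part of the workflow.

```python
# Illustrative sketch of the open / update / close decision; names are hypothetical.
from enum import Enum


class Action(Enum):
    CREATE = "create"  # new drift, no open tracking issue yet
    UPDATE = "update"  # new drift, refresh the existing issue body
    CLOSE = "close"    # audit is clean, close the open issue
    NOOP = "noop"      # audit is clean and nothing is open


def plan_issue_action(new_drift_count: int, issue_is_open: bool) -> Action:
    """Decide what to do with the maintainer-facing tracking issue."""
    if new_drift_count > 0:
        return Action.UPDATE if issue_is_open else Action.CREATE
    return Action.CLOSE if issue_is_open else Action.NOOP


print(plan_issue_action(0, False).value)  # noop
print(plan_issue_action(3, False).value)  # create
```

The clean-and-nothing-open case matches the test plan's expectation that the first run on main exits 0 without creating an issue.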

Baseline ratchet

scripts/drift_baseline.txt holds known historical gaps (check_id:item lines). Only NEW drift fails the run. The 14 destinations that predate the docs/connectors/ convention (discord / jira / teams / twilio / …) are baselined as a docs backlog — every entry is debt to burn down, not a permanent allowlist.
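
The ratchet can be sketched as a partition of findings into known and new, with only the new ones affecting the exit code. This is an illustration under assumptions, not the script's implementation: the helper names are made up, skipping `#` comment lines is an assumed convenience, and the `dest-doc` check id in the sample data is hypothetical (only `dest-skill`, `src-readme`, and `mcp-inventory` appear in this PR).

```python
# Illustrative sketch of a baseline ratchet; helper names and sample ids are hypothetical.

def load_baseline(text: str) -> set[str]:
    """Parse 'check_id:item' lines, skipping blanks and (assumed) '#' comments."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def classify(findings: list[str], baseline: set[str]) -> tuple[list[str], list[str]]:
    """Split findings into (known, new); only 'new' should fail the run."""
    known = sorted(f for f in findings if f in baseline)
    new = sorted(f for f in findings if f not in baseline)
    return known, new


baseline = load_baseline("# docs backlog\ndest-doc:discord\ndest-doc:jira\n")
known, new = classify(["dest-doc:jira", "dest-skill:databricks"], baseline)
exit_code = 1 if new else 0

print(known)      # ['dest-doc:jira']
print(new)        # ['dest-skill:databricks']
print(exit_code)  # 1
```

Removing a line from the baseline once its gap is fixed tightens the ratchet: the same gap reappearing later would count as new drift.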

Dogfooding: drift found by the first run, fixed in this PR

| Drift | Fix |
| --- | --- |
| dest-skill:databricks (#629 merged hours earlier without the skill mention) | Added to /drt-create-sync destinations list + mirror-mode availability note |
| src-readme:rest_api (REST API source, #474, v0.7, missing from Sources tables) | Row added to README.md + README.ja.md |
| mcp-inventory × 10 (drt_list_connectors missing mixpanel / s3 / snowflake-dest / databricks ×2 / twilio / intercom / email_smtp / google_ads / staged_upload / sqlserver / rest_api-src) | Inventory now carries all 27 destinations + 11 sources, with a comment pointing at the drift check that enforces it |

The databricks-in-skill catch is exactly the failure mode this tool exists for — it happened within hours of #629 merging.

Usage

make check-drift     # local run; exit 0 = no new drift

To accept an intentional, short-lived gap: add check_id:item to scripts/drift_baseline.txt (and burn it down later).

Test plan

  • bash scripts/check_drift.sh — exit 0 after fixes, 14 baselined gaps reported as KNOWN
  • Calibration run before fixes correctly flagged all 12 real gaps (exit 1)
  • make check-skills clean after make sync-skills
  • ruff check drt clean
  • 22 MCP tests pass (inventory change is data-only)
  • CI green
  • Post-merge: first workflow run on main should exit 0 (no tracking issue)

Out of scope

  • Burn-down of the 14 baselined connector docs (docs backlog — good-first-issue candidates)
  • "Quality" drift (a doc that exists but is stale) — that's a different class of problem; this catches presence/absence only

🤖 Generated with Claude Code

…ly workflow (+ fix the drift it found)

## Why

Connectors and MCP tools ship in code PRs, but the surrounding
surfaces are updated by hand and regularly lag:

- /drt-debug skill went 3 months without learning about `drt doctor`
- the MCP server lagged the CLI by 4 months (no drt_doctor, no
  --diff parity, docstring missing drt_get_history since #445)
- drt_list_connectors' hardcoded inventory silently lagged the
  registry by 10 connectors
- the README Sources table never got a REST API source row (#474)

## What

**scripts/check_drift.sh** — 9 state-based checks (audits current
reality, not a PR diff, so it also catches drift that accumulated
before the check existed):

1. destination → docs/connectors/<name>.md
2. destination → /drt-create-sync skill mention
3. destination → README.md Destinations table
4. source → README.md Sources table
5. source → /drt-init skill mention
6. MCP tool → server.py module docstring
7. MCP tool → README.md MCP table
8. MCP tool → README.ja.md MCP table
9. registered connector → drt_list_connectors inventory (sources and
   destinations blocks matched separately so presence in one can't
   mask absence from the other)
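
To see why check 9 matches the two blocks separately, consider a connector registered as both a source and a destination but listed in only one inventory block. The sketch below is a hypothetical Python rendering of that idea; the function name, the dict layout, and the `mcp-inventory:<side>:<name>` finding format are assumptions for illustration, not the script's actual output.

```python
# Illustrative sketch of per-side inventory matching; names and format are hypothetical.

def missing_from_inventory(
    registry: dict[str, set[str]], inventory: dict[str, set[str]]
) -> list[str]:
    """Compare each side on its own so one block cannot vouch for the other."""
    missing = []
    for side in ("sources", "destinations"):
        for name in sorted(registry[side] - inventory.get(side, set())):
            missing.append(f"mcp-inventory:{side}:{name}")
    return missing


registry = {
    "sources": {"databricks", "rest_api"},
    "destinations": {"databricks", "s3"},
}
inventory = {
    "sources": {"databricks"},  # databricks appears here...
    "destinations": {"s3"},     # ...but not here
}

print(missing_from_inventory(registry, inventory))
# ['mcp-inventory:sources:rest_api', 'mcp-inventory:destinations:databricks']
```

A flat "is the name anywhere in the inventory" test would have accepted databricks, since it does appear in the sources block.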

Display-name matching is fuzzy (type key with underscores → spaces,
case-insensitive) with an alias map for exceptions (file → "CSV").
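
The matching rule can be sketched in a few lines. This is an illustration only: the `ALIASES` dict carries just the one exception named above, and `display_name` / `mentioned_in` are hypothetical helper names rather than anything in the script.

```python
# Illustrative sketch of the display-name matching rule; helper names are hypothetical.

ALIASES = {"file": "CSV"}  # the one exception named in this commit message


def display_name(type_key: str) -> str:
    """Map a registry type key to the name expected in a README table."""
    return ALIASES.get(type_key, type_key.replace("_", " "))


def mentioned_in(type_key: str, table_text: str) -> bool:
    """Case-insensitive check that the display name appears in the table text."""
    return display_name(type_key).lower() in table_text.lower()


print(mentioned_in("google_ads", "| Google Ads | OAuth |"))  # True
print(mentioned_in("file", "| CSV | local path |"))          # True
print(mentioned_in("rest_api", "| PostgreSQL | DSN |"))      # False
```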

**scripts/drift_baseline.txt** — ratchet for known historical gaps.
Only NEW drift fails the run (exit 1). The 14 destinations that
predate the docs/connectors/ convention are baselined as a docs
backlog. Every baseline entry is debt to burn down, not a permanent
allowlist.

**.github/workflows/drift-check.yml** — post-merge on main (path-
filtered) + weekly Monday 01:00 UTC + manual dispatch. Deliberately
NOT on pull_request: contributor PRs must never get drift warnings —
docs/skill upkeep is a maintainer concern, and surfacing it on
first-time contributors' PRs adds onboarding friction. Reports to a
self-updating maintainer-facing tracking issue (same model as
contributors-audit.yml from #621) and auto-closes the issue when the
audit comes back clean.

**Makefile**: `make check-drift`.

## Drift found by the first run, fixed here

The audit's calibration run found 12 real gaps (besides the 14
baselined docs):

1. **dest-skill:databricks** — #629 merged hours earlier without the
   /drt-create-sync skill mention. Added (destinations list + mirror
   mode availability note). Exactly the failure mode this tool
   exists to catch.

2. **src-readme:rest_api** — the REST API source (#474, v0.7) never
   got a README / README.ja Sources-table row. Added.

3. **mcp-inventory × 10** — drt_list_connectors was missing
   mixpanel, s3, snowflake (destinations side), databricks (both
   sides), twilio, intercom, email_smtp, google_ads, staged_upload,
   sqlserver, rest_api (source side). The inventory now carries all
   27 destinations + 11 sources, with a comment pointing at the
   drift check that enforces it.

`make sync-skills` ran; `.claude/commands/drt-create-sync.md`
matches. `make check-skills` clean. `make check-drift` exits 0.
22 MCP tests pass.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
@masukai masukai requested a review from yodakanohoshi June 10, 2026 00:12
codecov Bot commented Jun 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

…self-bootstrap guard

Self-review before merge caught a latent label time-bomb — the same
class of first-run failure yodakanohoshi flagged on #621's
contributors-audit workflow.

The drift-check workflow opened its tracking issue with
`--label "documentation"`. That label exists in the repo *right now*
only as a leftover GitHub default — `.github/labels.yml` defines
`docs`, and `sync-labels` runs `crazy-max/ghaction-github-labeler`
with `skip-delete: false`. So the moment sync-labels reconciles
(which #621 will trigger by adding the `contributors` label),
`documentation` gets DELETED and `docs` created — at which point
`gh issue create --label "documentation"` would hard-fail on the
next drift run.

Fix mirrors the #621 pattern:
- Use `docs` (the labels.yml canonical) instead of `documentation`.
- Add an idempotent `gh label create docs --force || true` guard
  right before `gh issue create`, so the workflow is self-sufficient
  regardless of whether sync-labels has run yet. labels.yml stays
  the source of truth; the guard's colour/description are cosmetic
  (sync-labels reconciles them).
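
The guard-then-create ordering can be sketched as follows. The workflow itself runs these commands in a shell step; this Python version only illustrates the control flow, with the command runner injected so the sketch never invokes `gh` on its own. `open_tracking_issue`, `Runner`, and the sample title and body are hypothetical.

```python
# Illustrative sketch of the self-bootstrap guard; names and sample strings are hypothetical.
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]  # takes argv, returns an exit code


def open_tracking_issue(run: Runner, title: str, body: str, label: str = "docs") -> int:
    """Ensure the label exists, then create the issue with that label."""
    # Equivalent of `gh label create docs --force || true`: the result is ignored
    # so a failure here never blocks the issue creation that follows.
    run(["gh", "label", "create", label, "--force"])
    return run(["gh", "issue", "create", "--title", title, "--body", body, "--label", label])


calls: list[list[str]] = []


def fake_run(argv: Sequence[str]) -> int:
    """Record the command instead of executing it; pretend the label step failed."""
    calls.append(list(argv))
    return 1 if argv[1] == "label" else 0


status = open_tracking_issue(fake_run, "Drift audit", "2 new gaps")
print(status)        # 0
print(calls[0][:3])  # ['gh', 'label', 'create']
```

In real use the runner would wrap `subprocess.run` and return its `returncode`; the fake runner here stands in for it so the ordering can be checked without network access.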

Verified: YAML valid, `make check-drift` still exits 0, open/update/
close paths all search the same title substring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@masukai masukai merged commit 3f968f6 into main Jun 10, 2026
8 checks passed
@masukai masukai deleted the feat/drift-check-automation branch June 10, 2026 07:19
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 10, 2026