fix(release): close 4 release-path bugs found by pr-review ensemble#6
Merged
Conversation
Ran pr-review (codex-cli + copilot-cli ensemble) retroactively against the just-merged direct-publish flow (PR #4). Both reviewers caught the same two release-path bugs; codex caught a third about breaking-change parsing; copilot caught a least-privilege gap. All four are real and fixable in one focused PR. 1. [medium] Single-package version gate skipped manual changesets. The workflow compared only `packages/eslint-config/package.json` before/after `changeset:version`. A valid manual changeset bumping only `@happyvertical/prettier-config` or `@happyvertical/tsconfig-base` would update those package files, pass through `changeset:version`, and then hit the eslint-only gate which sees no change → exit 0 without commit/publish. The manual release silently vanishes. Fixed by gating on the working tree itself (`git diff --quiet && git diff --cached --quiet`) after `changeset:version`. Catches every case: auto-generated bumps to all packages, manual changesets that touch any subset, and no-op runs where nothing was releasable. 2. [medium] Publish ran before push, allowing registry/git drift. Original order: commit → publish → push. If main advanced mid-run (concurrency only serializes release runs, not human pushes), publish would succeed and the push would fail non-fast-forward. Result: registry has new versions, main doesn't have the bump commit, next run tries to republish. Fixed by reordering: commit → fetch origin/main → verify it hasn't moved past our parent → push commit → publish → push tags. If origin/main advanced, we abort BEFORE publishing with a clear error message; the next push to main re-runs and picks up the new state. The narrow window where publish runs after push but before tags-push can still produce missing tags on a partial publish failure, but that's recoverable (rerun `changeset:publish`; published versions are idempotent). 3. [medium] Breaking-change footers were ignored. auto-changeset.ts read commit subjects only via `--pretty=format:%H|||%s` and matched `BREAKING CHANGE` against the subject. Standard Conventional Commits put breaking markers in a body footer: feat: change config shape BREAKING CHANGE: consumers must update their extends path That commit was being classified as patch — shipping a breaking change under a patch version. Fixed by including `%b` in the log format (with sentinels for safe parsing across newlines and pipes in bodies), passing the body through to `isBreaking()`, and matching both `BREAKING CHANGE` and `BREAKING-CHANGE` in subject OR body. 4. [low] `id-token: write` permission was declared but no step uses OIDC. GitHub Packages publishing doesn't need it; we don't mint attestations. Removed per least-privilege. Smoke test: `pnpm changeset:auto` defers correctly to existing manual changeset infrastructure.
Round-1 fix in f2ec77a introduced two bugs that the round-2 ensemble caught: 1. [HIGH] `\x00FIELD\x00` / `\x00COMMIT\x00` sentinels in argv Node's child_process refuses NUL bytes in argv with a TypeError. The try/catch in `git()` swallowed it as "no output", so getCommitsSinceLastRelease() always returned [], real.length === 0, and auto-changeset early-returned every run — silently breaking the whole release pipeline. Fix: use git's pretty-format escapes `%x1f` / `%x1e` which emit ASCII 31 / 30 in the OUTPUT, not in argv. Neither byte appears in commit text, and they don't trip Node's argv validation. Also tightened `git()` to distinguish git's non-zero exits (expected for `describe --tags` with no tags) from Node-level errors (our bugs) — Node-level errors now log to stderr so the next such typo doesn't silently skip releases. 2. [medium] No recovery path for partial publish failure If `pnpm changeset:publish` fails after the version commit was already pushed, the version bump is on main but packages aren't published. Re-running via push is skipped by the chore(release) guard, and the main-advanced check would abort anyway. Fix: add workflow_dispatch trigger with mode=publish-only that skips auto-changeset + version + commit + push and runs only changeset publish against current main. The job-level if: no longer skips workflow_dispatch invocations.
… push Round-3 ensemble found three real release-correctness hazards in the publish-only recovery path added in 10f879f: 1. [HIGH] Recovery could publish the wrong source tree `pnpm changeset:publish` runs against whatever ref the workflow_dispatch checked out. If main has advanced beyond the failed release commit `R` (any commit `N` landed in the interval), recovery publishes R's package.json versions with N's source code — the registry artifact for that version silently contains post-version-commit changes. Fix: guard the recovery step. Refuse to run unless HEAD is a `chore(release):` commit AND HEAD === origin/main. Recovery is only safe immediately after a failed publish, before any new commits land. If main has advanced, operator must resolve manually (revert the new commits, or hand-publish from a clean checkout of `R`). 2. [medium] Recovery doesn't recreate lost local tags In a partial failure, the first run can publish + tag some packages locally before erroring out. Local tags die with the runner. On recovery, `changeset publish` skips already-published versions (correctly idempotent), so the tags are never recreated by publish alone, and `git push --tags` pushes nothing for them. The next auto run's tag-based release boundary then scans too far back and double-includes old commits. Fix: run `pnpm changeset tag` after publish in the recovery step. That reads package.json versions and creates any missing tags regardless of publish outcome. 3. [medium] Tag-push step also pushed the main branch ref `git push --tags origin main` pushes BOTH tags and the main branch. In the auto path, main was already pushed at step 5 before publish ran. If another commit lands between that step-5 push and the step-7 tag push, the redundant main push fails with non-fast-forward — AFTER packages have already published to the registry, leaving registry/git drift. Fix: `git push origin --tags` in both paths. Recovery never needs to advance main; auto already pushed it explicitly.
Round-4 ensemble caught two more medium-severity correctness hazards in the workflow: 1. [medium] `git fetch origin main` may not update `refs/remotes/origin/main` With a source-only refspec (`main` instead of `main:main`), git's behaviour depends on the remote's configured fetch refspec, shallow state, and partial-clone setup. Under some configurations it only updates FETCH_HEAD, leaving `refs/remotes/origin/main` stale. The subsequent `git rev-parse origin/main` then compares against a stale value — the guard passes even though main has actually advanced, and the publish runs against the wrong tree. Fix: explicit destination refspec `+refs/heads/main:refs/remotes/origin/main` in both auto and recovery paths. The `+` forces non-fast-forward updates so the remote-tracking branch always reflects the remote's current state. 2. [medium] Commit subject embedded in `::error::` workflow command unescaped The recovery guard's failure message interpolates `$HEAD_MSG` (commit subject — user-controlled) directly into a workflow command. GitHub workflow commands require `%`, CR, LF to be escaped (`%` → `%25`, CR → `%0D`, LF → `%0A`) before interpolating user-controlled content; without escaping, a crafted commit subject can corrupt the command payload or inject additional workflow commands into the log parser. Fix: inline bash parameter expansion to escape the three bytes before embedding. Verified the escape produces correct output for subjects containing all three characters. (The non-error `echo` in the success path interpolates only the short SHA from `git rev-parse --short HEAD`, which is hex-only and can't contain `%`/CR/LF — no escaping needed there.)
…osition Round-5 ensemble caught a real bump-classification bug, empirically verified against this branch's own commits: `isBreaking()` body-scan used `\bBREAKING[- ]CHANGE\b`, matching the phrase anywhere in the commit body. That includes narrative text, docstring examples, and any commit that documents what a breaking-change footer is. Concrete failure: commit `f2ec77a` on this branch has a docstring example in its body that reads `BREAKING CHANGE: consumers must update their extends path`. Under the loose body regex, that triggers `isBreaking() === true` and forces a minor bump for the whole release — even though every commit on the branch is `fix(release): ...` (patch only). Per Conventional Commits spec, `BREAKING CHANGE:` (or `BREAKING-CHANGE:`) is a footer line. Anchor the regex with `^BREAKING[- ]CHANGE: ` and the multiline flag so only line-anchored occurrences match. Narrative mentions and indented examples no longer false-positive. Verified on this branch's 4 commits: all now classify as patch (PATCH verdict), matching the intended `fix:` semantics.
Round-5 tightened the subject regex to `^BREAKING[- ]CHANGE: ` (line start anchor), matching the body regex. That fixed the docstring-false- positive but accidentally narrowed too far: the previously documented inline subject form `feat: BREAKING CHANGE: remove legacy preset` no longer matches, so a breaking release could ship under a patch version. Failure path: - Subject `feat: BREAKING CHANGE: remove legacy preset` - `!` regex on line 141 — no match (`feat:` not `feat!:`) - body footer regex — no match (no body footer) - subject `^BREAKING[- ]CHANGE: ` — no match (starts with `feat:`) - `isBreaking()` → false → patch bump for a breaking change Fix: keep the body regex line-anchored (footer-correct) but loosen the subject regex back to `\bBREAKING[- ]CHANGE: ` — word boundary + colon-space. The colon requirement keeps narrative phrasing like "avoid BREAKING CHANGE false positives" out; the word-boundary lets the marker appear anywhere in the subject including after `feat:` / `fix:` prefixes. Verified 8 edge cases (5 subject + 3 body) including indented docstring examples, narrative prose, hyphenated variants, and the restored inline subject form. All pass.
Rounds 5 and 6 of the review loop were both bug-fixing the same feature — and that feature does not match how this org actually writes commits. The org's convention is `feat!:` / `feat(scope)!:`, not `BREAKING CHANGE:` footers. Footer scanning is a regex-tuning trap: - Round 5 caught the body regex matching docstring narrative (`f2ec77a`'s own JSDoc example would have triggered a false minor bump for this very PR). - Round 6 caught the round-5 tightening dropping the legitimate inline subject form (`feat: BREAKING CHANGE: ...`). Every fix for one false positive introduces a window for a different false negative — narrative vs footer, indented vs unindented, inline subject vs bare subject, hyphenated vs spaced. None of those edge cases would arise if we simply recognized `!`. Drop footer scanning entirely. `isBreaking()` is now one line. As a follow-on simplification, the body fetch goes away too: the git pretty format drops to `%H %s`, the field/record separator machinery disappears, and parsing uses newlines (git subjects can't contain newlines, so they're unambiguous). If a footer-marked commit ever lands (e.g. from an external contributor unfamiliar with the org convention), the documented escape hatch is to drop a manual `.changeset/*.md` file — the deferral check at the top of main() yields to it. Net: -51 / +27 lines, removes the two highest-friction regex paths in the file.
…t errors Round-7 ensemble found two more correctness issues: 1. [medium] Empty-changeset trap `pnpm changeset:version` reads and DELETES changeset files even when their frontmatter is empty (no bumps). The old gate `git diff --quiet && git diff --cached --quiet` would see the deletion as a change, commit it, run publish as a no-op, advance no tags. The next auto run would then re-scan commits from the same (unmoved) tag boundary and re-include the commits the empty changeset was meant to skip. Fix: scope the gate to `packages/*/package.json` only. If no package version actually bumped, exit clean — leave the deleted changeset file uncommitted on the runner (next push re-fetches it from origin anyway). 2. [low] Node-level git errors went silent The round-2 `git()` helper logged Node-level failures (NUL in argv, missing binary, spawn errors) but still returned empty. Callers interpret empty as "no commits" / "no tags" — a real spawn-time bug would skip the release with a green workflow run. Fix: throw on Node-level errors (e.status undefined/null), keep the empty-return path only for git's own non-zero exits (which are expected for `describe --tags` with no tags). Verified: NUL in argv now throws ERR_INVALID_ARG_VALUE rather than silently returning.
…curate commit title Round-8 ensemble found three issues, including a regression from round-7's empty-changeset fix: 1. [medium] Tag-push failures lost the release boundary If `pnpm changeset:publish` succeeds but `git push origin --tags` fails, packages are on the registry, the commit is on main, but the per-package tags are not on origin. The recovery path requires HEAD to still be the `chore(release):` commit — once any new commit lands, recovery is blocked, and the next auto run scans `LAST_TAG..HEAD`, filters out the chore commit, and re-includes the commits that were already published. Duplicate release with duplicate changelog entries. Fix: add a step-0 guard that scans for an untagged `chore(release):` commit between the last `@happyvertical/*@*` tag and HEAD. If found, abort with an operator-action message pointing at the resolution (hand-push the missing tags, then re-run). 2. [medium] No-op changesets were stuck on main (regression from round-7) Round-7's package-json gate exits early without committing the deletion of an empty manual changeset. The cleanup never reaches origin, so the no-op changeset stays on main — every future push repeats the same no-op cycle, silently blocking real auto-changeset runs. Fix: when the gate finds no package bumps but the tree changed, commit + push the cleanup with a `chore(release):` title (so the next push doesn't re-trigger us via the job-level guard). Includes the same origin-main-advanced check used in the main path to avoid drift. 3. [low] Subset releases got a misleading commit title The old `chore(release): v$AFTER` used eslint-config's version unconditionally. A manual changeset that bumps only prettier-config would commit `v<old-eslint-version>` while actually publishing a different package. Fix: build the commit title from the package names that actually bumped this run (`git diff --name-only -- 'packages/*/package.json'` → comma-separated list). Per-package versions live in the tags created by `changeset publish`; the commit title carries which packages changed.
…mits only Round-9 caught a regression where round-8's two fixes interacted badly: - Fix 8.1 added a preflight detecting any `chore(release):` commit between the last per-package tag and HEAD (signal of failed tag-push from a previous release). - Fix 8.2 added a cleanup path that creates `chore(release): clean up consumed no-op changesets` commits when a manual no-op changeset needs to be removed from main. After a cleanup commit lands, the preflight grep matches it (same `chore(release):` prefix), and EVERY future release self-blocks forever — the operator sees "previous release didn't complete tag push" errors on commits that were never a release. Fix: narrow the preflight grep to `^chore(release): bump ` — the exact format the version-bump commit uses (set by fix 8.3). The cleanup commit's `clean up consumed no-op changesets` suffix doesn't match, so it's correctly ignored by the partial-release detector. Verified the grep behavior in a synthetic git repo: pattern matches `chore(release): bump eslint-config,prettier-config`, rejects `chore(release): clean up consumed no-op changesets` and `fix(release): not a release commit`.
…very guidance
Round-10 ensemble found two issues with the round-8 partial-release
detector:
1. [HIGH] First-release partial publish bypassed the guard
The detector only ran when `LAST_PKG_TAG` was non-empty. That
excluded the exact case the guard is meant to catch in a fresh
repo: a first release that publishes packages but fails to push
tags. With no prior tags, `LAST_PKG_TAG` is empty, the guard is
skipped, and `auto-changeset` scans all history, filters out the
chore commit, and re-includes the commits that were already
published — duplicate release.
Fix: when no package tag exists, scan all history (`UNTAGGED_RANGE=HEAD`)
instead of skipping the guard. Same detection, same abort path.
2. [medium] Error message could lead operators to publish-skip
The old message said "previous release published packages but
failed to push tags" and told operators to push the missing tags.
But the same untagged-bump-commit state can arise from a partial
publish failure (some packages published, some didn't) before tag
push was even attempted. If an operator follows the old guidance
and pushes tags for versions that don't exist on the registry,
future auto runs treat those tags as the release boundary and
skip publishing the missing versions — silent data loss.
Fix: distinguish the two failure modes explicitly. Recovery
guidance now:
(a) Recommends workflow_dispatch publish-only when HEAD is still
the bump commit. That path is idempotent on both axes:
changeset:publish skips already-published versions, and
changeset tag creates tags only for versions in package.json.
Single action covers both partial-publish and tag-push-failure.
(b) When HEAD has advanced past the bump commit (the case where
this guard fires), tells operators to manually verify
registry state before pushing any tags, with a concrete
example of how to tag-only what's actually published.
…t guards Round-11 ensemble caught two findings, both rooted in the same bug class: matching commit message contents (body) instead of just the subject. 1. [medium] Untagged-release detector matched commit bodies `git log --grep` searches the FULL commit message — subject and body. Round-8 added `--grep='^chore(release): bump '` thinking it would only match release commit subjects, but a regular commit whose body documents the literal string `chore(release): bump …` (e.g. as a code example, like several of my own commit messages on this PR) sets `UNTAGGED_RELEASE` and aborts every future run. Fix: pipe `git log --pretty='%H %s'` through awk to filter on the subject line only. Verified in a synthetic repo: false-positive commit (body mentions `chore(release): bump …`) was matched by `--grep` but correctly skipped by the awk pipeline; the real bump commit still matched. 2. [medium] Publish-only recovery accepted cleanup commits Round-3's recovery guard checked `^chore(release):` — too loose. The cleanup commit from round-8 (`chore(release): clean up consumed no-op changesets`) matches that prefix. If an operator runs publish-only while that cleanup is at HEAD, `changeset tag` creates per-package tags pointing at the cleanup SHA instead of the actual version-bump commit. Future release scans then trust the wrong commit as the release boundary. Fix: narrow to `^chore(release): bump ` — the exact format of the real version-bump commits, matching the detector in step 0.
… push
Round-12 ensemble found two findings:
1. [medium] No-op cleanup path didn't preserve "skipped" boundary
Round-7/8 added a path that commits the cleanup of an empty manual
changeset (`chore(release): clean up consumed no-op changesets`)
so the empty file wouldn't keep blocking future runs. But:
- The cleanup commits no tag
- auto-changeset.ts's `LAST_PKG_TAG` still points at the previous
real release
- The commits the operator intended to "skip release" via the
empty changeset stay in the next auto run's scan range
- Next push releases them anyway
The cleanup path was treating empty changesets as a valid "skip
release" signal that this workflow can't actually honor (no
commit-range bookkeeping persists across runs).
Fix: remove the cleanup machinery. Fail loudly when an empty
changeset is detected. Operator's only valid actions are:
(a) fill in package bumps in the changeset frontmatter, or
(b) delete the empty changeset file.
This also kills the round-9 fix's reason-to-exist (narrowing the
detector to exclude `chore(release): clean up` commits) — those
commits no longer get created. The narrowing stays as defense in
depth.
2. [medium] Non-atomic tag push left partial state undetectable
`git push origin --tags` pushes all local tags but is NOT atomic:
a network failure mid-push can land some tags and leave others
local-only. The step-0 untagged-bump detector relies on
`LAST_PKG_TAG` being absent when the previous release's tags
never reached origin — a partial push sets `LAST_PKG_TAG` to one
of the freshly-pushed sibling tags, the detector treats the bump
commit as already-released, and the missing tags persist
silently.
Fix: `git push --atomic origin --tags` in both auto and recovery
paths. With atomic, either all tags reach origin or none do, so
the detector's "no tag pushed" branch is the only possible
partial-tag state.
…t subject alone Round-13 ensemble added copilot-cli (codex-cli had solo'd rounds 2-12). Copilot caught a real release-deadlock that codex missed across 12 rounds: The step-0 partial-release detector and the publish-only recovery guard both relied on subject pattern `^chore(release): bump ` to identify workflow-created release commits. But that subject is also a valid human Conventional Commit — e.g. `chore(release): bump pnpm/action-setup` for a dependency update. Deadlock path: - Human pushes `chore(release): bump pnpm/action-setup` - Job-level `if:` skips that push (filter on chore(release): prefix) - Any future push runs the workflow - Step-0 detector finds the human commit, treats it as a failed release commit, aborts with operator-action error - No release ever happens until someone manually deletes the human commit (impossible — it's already on main) or rewrites history Fix: combine subject pattern with committer-identity check. The workflow sets `have-release-bot` as the git committer for its own commits (line 81), so committer identity is a reliable machine-only marker that humans can't accidentally produce. - Step-0 detector: `git log --committer='have-release-bot' ...` combined with the existing awk subject filter - Recovery guard: also check `git log -1 --pretty=%cn HEAD` matches `have-release-bot`, in addition to the subject grep Verified in a synthetic repo: human commit with matching subject is correctly excluded; bot commit with same subject still matches. Process note: this finding shipped after rounds 2-12 of codex-cli solo because the loop wasn't running the ensemble it documents. From now: copilot-cli runs each round alongside codex-cli. Also rejecting copilot's second finding (loss of `BREAKING CHANGE` recognition) — that was the deliberate round-6.5 simplification (see commit 7e13016), not a regression. Copilot didn't see that design conversation.
…geset + escape HEAD_COMMITTER
Round-14 ensemble (codex-cli + copilot-cli both running this round —
correcting rounds 2-13 which were codex-cli solo). Copilot caught
two real findings codex missed; codex caught one of them at lower
severity. Deduplicated:
1. [medium] Subject-only chore(release) match in TWO more code paths
(extension of round-13 finding into job gate + auto-changeset)
Round-13 fixed the step-0 detector and the publish-only recovery
guard to require BOTH subject pattern AND `have-release-bot`
committer identity. But two other places still used subject-only
matching:
- Job-level `if:` filter (line 42): skips ANY commit starting
with `chore(release):` regardless of committer. A human commit
like `chore(release): bump pnpm/action-setup` (valid Conventional
Commit for a dependency update) would never trigger the release
workflow for its own changes.
- auto-changeset.ts filter (line 161): excludes ANY commit with
that subject prefix from the generated changelog summary. The
same human commit would be silently absent from release notes
even when a subsequent commit triggers a release.
Net effect: human-authored `chore(release):` commits become
invisible in release pipeline AND release notes.
Fix:
- Job-level if: combine `startsWith` with
`github.event.head_commit.committer.name == 'have-release-bot'`.
GitHub's push event payload includes committer.name so we can
gate at the workflow level without running any steps.
- auto-changeset.ts: re-add committer field to git log output
(using `%cn` with `\x1f` field separator — newline still
separates records since subjects/committers can't contain
newlines). Update the filter to require BOTH conditions.
Verified in synthetic git repos that human commits with matching
subject pass through the filter while bot commits with matching
subject are excluded.
2. [medium] HEAD_COMMITTER printed in ::error:: without escaping
(caught by both reviewers, copilot rated medium / codex rated low)
Round-13 added HEAD_COMMITTER to the recovery guard's failure
output but only escaped HEAD_MSG. Copilot empirically verified
that git accepts committer names containing `%0A::warning::pwn`
— direct workflow-command injection risk.
Fix: extract escape logic into a local `escape_wc` function and
apply to BOTH user-controlled git metadata fields before
embedding in workflow commands.
Process note: the round-1-13 codex-cli-solo run produced 22 findings
across 12 rounds. Copilot's first-round catch on a fundamental
deadlock scenario (covered by round 13) and this round's two findings
(both real) show why the ensemble is non-negotiable. Going forward:
both run every round; ensemble dedup happens at fix time.
…id SIGPIPE bypass
Round-15 ensemble: codex-cli clean, copilot-cli caught a real
high-confidence bug.
The step-0 untagged-release detector used `awk '/.../ {print $1; exit}'`
to find the first match efficiently. Under `set -o pipefail`, that
early exit closes awk's stdin, git log dies on SIGPIPE (exit 141),
and the pipeline returns non-zero.
Bash's default behavior (no `inherit_errexit`) is that command
substitution swallows pipeline failures silently. So
`UNTAGGED_RELEASE=$(... | awk ... exit)` produces empty output when
the pipeline fails — the subsequent `if [ -n "$UNTAGGED_RELEASE" ]`
check sees empty, skips the abort, and the detector is bypassed
ENTIRELY. The release proceeds against a state that contains an
untagged release commit, producing the exact duplicate-changelog
scenario this detector was added to prevent.
Fix: full-scan awk with deferred print:
```awk
/^[a-f0-9]+ chore\(release\): bump / && !found {hash=$1; found=1}
END {if (found) print hash}
```
Same "first match wins" semantics, no early exit, no SIGPIPE.
Verified on three input cases (match, no-match, empty) under
`set -euo pipefail` — all return cleanly without tripping the
pipeline failure path.
Not applying copilot's other two findings:
- Legacy `chore(release): v<ver>` format support: this repo has no
prior releases; no legacy state exists. The detector only needs
to recognize commits this workflow itself creates.
- Stronger committer identity (name + email): real concern but the
threat model is "anyone with push access". Push access already
permits many forms of release pipeline interference. Adding email
check is marginal defense in depth; keeping name-only for now.
willgriffin
added a commit
that referenced
this pull request
May 23, 2026
…msg in error output PR #5's commits use `fix(review-cycle,ship):` (multi-scope, one edit landing in both slash command surfaces). The old regex `(\([a-z0-9-]+\))?` rejected the comma, blocking the PR from merging despite the commits being well-formed Conventional Commits. Relax to `[a-z0-9][a-z0-9,/-]*`: - First char must be alphanumeric — prevents stray punctuation like `(,foo)` or `(-foo)`. - Allows comma for multi-scope commits. - Allows forward slash for dep-name scopes like `chore(tibdex/github-app-token):`. - Still rejects uppercase, colons in scope, leading punctuation. - Verified against 9 synthetic cases (5 pass, 4 fail as expected) AND all 13 PR #5 commits (all pass). Also fix the existing `echo "::error::Invalid commit message: $msg"` to escape `%`, CR, LF before embedding the user-controlled commit message in a workflow command. This is the same lesson from PR #6 round 14 / round 4: GitHub workflow commands parse `%`, `%0D`, and `%0A` as their respective bytes, so an attacker (or just an inconvenient commit message) containing those bytes can corrupt the command payload or inject additional workflow commands. Extract escape logic into `escape_wc()` for reuse. Also switched `echo "$msg" | grep` to `printf '%s' "$msg" | grep` — `echo` can interpret leading `-n`/`-e`/`-E` as flags in some shells, so a commit subject starting with one of those would silently produce wrong output. `printf '%s'` is unconditional.
willgriffin
added a commit
that referenced
this pull request
May 23, 2026
…, no rejections) Round-4 ensemble (codex-cli + copilot-cli + claude sub-agent + me). Surfaced 6 distinct findings; 4 accepted, 1 declined as nit, 1 was my own self-review note that overlapped with #1. 1. [medium, ALL 4 REVIEWERS] Cap-blocked-verify case overloaded `partial` status, breaking `/ship`'s partial-handling contract Round-3 fix #6 said "report status `partial`" when a P0/P1/P2 fix landed in the final permitted round (no verify possible). But the Status contract defines `partial` strictly as "skipped required reviewer". `/ship` then routes `partial` only by `Skipped reviewers`, which would be empty in this case — undefined behavior, agent could fall through to "treat as clean and ship unverified fix". Four-way confirmation: - codex: "make it `blocked`/`needs-verify`" - copilot: "either add explicit `verify-round-missed` reason + new ship branch, or classify as `blocked` to keep partial strict" - claude sub-agent: "either add third partial sub-branch in ship.md, or use different status (`blocked` with verify-needed marker)" - me (self-review): "contract overload — needs broadened definition or different status" Fix: reclassify as `blocked` with reason `verify-round-blocked-by-cap`. Three coordinated edits: (a) Round-cap section: change "report status as `partial`" to "report status as `blocked` with reason `verify-round-blocked-by-cap`". (b) Status contract `blocked =` definition: extend to enumerate "P0/P1/P2 fix landed in the final permitted round with no verify round possible". Reasoning: an unverified fix counts as potentially unaccepted because we don't yet know if the fix introduced new findings. (c) Status contract `partial =` definition: add "Single cause only — other 'incomplete' states (unverified fix, validation failed) are `blocked`, not `partial`" to lock down the single-cause invariant. (d) ship.md `blocked` branch: add explicit sub-case for `verify-round-blocked-by-cap` directing operator to re-run `/review-cycle rounds=N+1`. 2. [low, copilot + claude] `.gitignore` comment cited prior incident filenames the patterns don't actually catch Round-3 comment claimed defense-in-depth for `.deny-test.jsonl` / `.revparse-test.jsonl` from the 1fc8677 incident, but the narrow `copilot-session-*.jsonl` patterns don't match those names. Misleading framing. Fix: rewrite the comment to be honest about what the patterns cover and don't cover. Note explicitly that the narrow patterns were a deliberate round-7 walkback (avoided `*-test.jsonl` wildcard that would hide legitimate fixtures), and that the structural defense for arbitrary probe filenames is the /tmp rule, not gitignore. 3. [low, codex] Commitlint workflow had unused `pull-requests: read` permission The job only does `actions/checkout` + `git log` over commit SHAs from the event payload. No PR API calls. `pull-requests: read` is dead scope. Per least-privilege, dropped it. 4. [low, claude] Commitlint error message didn't mention `@` (the regex was updated to allow it in round 3 but the help text lagged) A user hitting the error with `feat(@foo/bar): ...` for an unrelated reason would read the help message and conclude `@` isn't supported. One-line fix to add "optional leading @ (scoped npm packages)" to the chars list. DECLINED: 5. [nit, claude] Regex accepts `feat(@scope):` (no `/pkg` suffix) — not a valid npm-scoped reference Tightening to require `/` after `@` would reject legitimate non-npm uses (e.g. someone wanting to use `@`-prefix for their own scope convention). The regex isn't claiming to enforce npm-semantic validity. Out of scope for a commit-message linter. Round-4 ran the full 4-reviewer ensemble in parallel against the same commit (1dd7688). Three rounds total of 4-reviewer ensembles now — the claude-sub-agent-via-Agent-tool workaround is proving durable enough to fold into the docs as the canonical claude-cli substitute when OAuth fails.
willgriffin
added a commit
that referenced
this pull request
May 23, 2026
… converged) Round-10 ensemble (codex + copilot + claude sub-agent + me). 4 distinct findings, all converging on the same conclusion: the WIP-commit recipe added in round 8 and reworked in round 9 has accumulated too many independent footguns to keep patching. The footgun inventory (caught across this and prior rounds): 1. [HIGH, round 10, codex + copilot 2-way] `git reset --hard` step loses untracked files included in the WIP commit. `git diff` doesn't capture them, so the restore patch can't recreate them. Concrete local data loss for new files/binary assets. 2. [medium, round 10, claude sub-agent] `git apply` of the unstaged patch is unsound when staged + unstaged changes touch overlapping hunks in the same file. The unstaged patch's context is the INDEX state, but we apply it to a worktree at pre-WIP HEAD (no staged applied yet). Hunks fuzz-fail, leave `.rej` files, silent failure if not run with `set -e`. 3. [medium, round 10, claude sub-agent] `git commit` failures (extremely common — commitlint hooks, gitleaks, gpg signing, biome, plus this very repo's own commitlint workflow which rejects `wip:` as a non-Conventional-Commits type) silently make `WIP_SHA` empty, and the downstream verify check misdiagnoses it as "HEAD moved during review". User goes hunting for non-existent HEAD movement. 4. [medium, round 10, copilot + claude + me 3-way] Fixed `/tmp/` paths collide across concurrent runs and across retry runs after a failure (second invocation overwrites recovery state from first failed run). 5. [round 9, codex + claude] `git add -A` stages unrelated untracked files which end up in the WIP commit AND in `pr-review --base`'s diff sent to external reviewers — potential local data leak. 6. [round 9, claude] `git reset --mixed HEAD~1` is positional; if HEAD moved (interrupted run, stray amend, side commit), wrong commit gets reset. 7. [round 9, claude + me 2-way] Hard Rules section forbids `git reset --hard`; recipe's "carve-out" was only inline, not in the Hard Rules section itself. 8. [round 9, me] WIP dance destroys user's pre-WIP staging discipline (carefully staged/unstaged split → everything unstaged after reset). That's 8 distinct sharp edges across 2 rounds. Each round of fixes introduces another caveat to document. Same pattern as PR #6's round-6.5 BREAKING-CHANGE-scanning removal: when a recipe needs more caveats than recipe, delete it. Fix: remove the WIP-commit alternative entirely. Replace with a fully-fleshed-out snapshot-comparison recipe that: - Uses `mktemp -d` for per-run snapshot directory (no /tmp collisions). - Captures status + unstaged diff + staged diff + untracked file hashes BEFORE the reviewer. - Captures same shape AFTER. - Diffs the four pairs; ANY difference → round invalid. - Catches same-status-but-different-content cases (e.g. `M path → M path` with different bytes) that a simple "is status empty" check would miss. - Catches added/modified untracked files via the hash list. - Preserves staging discipline. - No `git reset` calls → Hard Rules don't need a carve-out. Brief "why not WIP" explanation kept inline as a tombstone so future contributors don't reinvent the WIP recipe — enumerates the 8 footguns so the choice is documented, not just made. Net: -71 / +49 lines across both variants. Removes the most caveat-heavy section of these docs and replaces it with a single correct recipe.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up to the just-merged direct-publish flow (#4). Ran
pr-reviewensemble (codex-cli + copilot-cli) retroactively against the release workflow + auto-changeset script — caught 4 real bugs that would degrade release behavior in different scenarios.Findings → fixes
id-token: writepermissionFix details
Working-tree gate instead of single-package:
git diff --quiet && git diff --cached --quietafterchangeset:versioncatches any package's changes — auto-generated, manual subset, or no-op alike. The old eslint-config-only check was a coupling assumption that didn't hold for manual changesets.Reordered publish flow: now
commit → fetch origin/main → verify it hasn't moved → push commit → publish → push tags. If origin/main advanced, we abort before publish with a clear error. Next run picks up the new state. Narrow window remains where partial publish failure could leave tags un-pushed, but that's recoverable (publish is idempotent for already-published versions).Read commit body in auto-changeset.ts: format now includes
%b(with\x00-sentinel separators so embedded newlines/pipes don't break parsing).isBreaking()checks both subject and body forBREAKING CHANGE/BREAKING-CHANGE. Matches what the Conventional Commits spec actually says about where to put breaking markers.Dropped
id-token: write: no step in the workflow uses OIDC. GitHub Packages publish via App-minted token doesn't need it.Test plan
pnpm changeset:autodefers correctly to existing manual changeset infrastructure (current state of the repo).changeset/foo.mdbumping only prettier-config, verify it publishes (previously would silently skip)feat: x\n\nBREAKING CHANGE: y, verify auto-changeset produces a minor bump (previously would produce patch)Notes