pr-review

Pre-PR self-review prompt generator. Harness-agnostic: it emits a prompt + diff bundle to stdout, and you pipe it to whichever LLM tool you prefer (codex-cli, claude-cli, gemini, web chat, …).

Built to compress review iterations by front-loading the kind of analysis a careful reviewer (or an AI reviewer like GitHub copilot-cli) does after you open the PR — so the author can fix issues now instead of in a back-and-forth.

Install

# From the directory that contains your pr-review checkout:
export PATH="$PWD/pr-review/bin:$PATH"

# Or symlink into your $PATH:
ln -s "$PWD/pr-review/bin/pr-review" ~/bin/pr-review

Quick start

From inside any git repo:

# codex-cli (uses your ~/.codex/config.toml defaults: gpt-5.5 + xhigh)
pr-review | codex exec -

# Claude Code (one-shot)
claude -p "$(pr-review)"

# Gemini CLI
pr-review | gemini -

# Paste into a web chat
pr-review --pretty | pbcopy   # macOS
pr-review --pretty | xclip -selection clipboard   # Linux (plain xclip fills the primary selection, not the clipboard)

Default base branch is auto-detected from origin/HEAD, falling back to origin/dev, origin/develop, origin/main, origin/master in that order. Override with --base origin/main.
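
The fallback order described above could be implemented roughly as follows. This is a minimal sketch, not the script's actual code: the function name detect_base is hypothetical, and it assumes the remote is called origin.

```shell
# Hypothetical helper: print the base branch using the documented fallback order.
detect_base() {
  local repo ref
  repo=${1:-.}
  # origin/HEAD is a symbolic ref to the remote's default branch, when it is set.
  if ref=$(git -C "$repo" symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null); then
    printf '%s\n' "$ref"
    return 0
  fi
  # Otherwise take the first candidate that exists as a remote-tracking ref.
  for ref in origin/dev origin/develop origin/main origin/master; do
    if git -C "$repo" rev-parse --verify --quiet "refs/remotes/$ref" > /dev/null; then
      printf '%s\n' "$ref"
      return 0
    fi
  done
  return 1
}
```

If origin/HEAD is unset in a clone, `git remote set-head origin --auto` repopulates it from the remote.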

How it works

            ┌──────────────────────┐
            │ pr-review/prompts/   │  shared core checklist (10 themes
            │   checklist.md       │  mined from real PR review history)
            │   output-json.md     │
            │   output-pretty.md   │
            └──────────┬───────────┘
                       │
            ┌──────────▼───────────┐
            │ <repo>/.pr-review/   │  per-repo extensions:
            │   extensions.md      │  framework conventions, infra
            │                      │  patterns, tenant model, paths
            └──────────┬───────────┘
                       │
            ┌──────────▼───────────┐
            │ pr-review (script)   │  bundles checklist + extensions +
            │                      │  diff + repo metadata into a prompt
            └──────────┬───────────┘
                       │ stdout
                       ▼
              your harness of choice

The script only emits a prompt — it never calls an LLM directly. That keeps it harness-agnostic and easy to maintain as CLIs evolve.
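
To make the bundling step concrete, here is a sketch of how the pieces in the diagram might be concatenated. The function name assemble_prompt and the section headings it prints are assumptions for illustration; the real script also adds repo metadata, which is left out here.

```shell
# Hypothetical sketch: core checklist + optional repo extensions + output
# format instructions + diff, written to stdout. The diff is read from stdin.
assemble_prompt() {
  core_dir=$1
  repo=$2
  format=${3:-json}   # json or pretty, matching output-json.md / output-pretty.md
  cat "$core_dir/checklist.md"
  if [ -f "$repo/.pr-review/extensions.md" ]; then
    printf '\n## Repo-specific extensions\n\n'
    cat "$repo/.pr-review/extensions.md"
  fi
  printf '\n'
  cat "$core_dir/output-$format.md"
  printf '\n## Diff\n\n'
  cat   # pass the diff through from stdin
}

# Example (assumed paths):
#   git diff origin/main...HEAD | assemble_prompt pr-review/prompts . pretty
```

Because the function only writes to stdout, it composes with any harness the same way the script does.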

Options

--base BRANCH         Base branch to diff against (default: auto-detect)
--repo DIR            Repo directory (default: $PWD)
--pretty              Emit markdown findings instead of JSON
--include-new-files   Embed full contents of newly-added files (<50KB)
--no-diff             Omit the diff from the prompt (for tools that fetch their own, e.g. codex review)
--core-dir DIR        Path to shared core (default: discovered)
-h, --help            Show help
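
As an illustration of what --include-new-files has to work out, the sketch below lists files added since the base branch that fall under the size cap. It is an assumption about the approach, not the script's code: list_small_new_files is a hypothetical name, the three-dot diff range is a guess, and 50KB is taken to mean 51200 bytes.

```shell
# Hypothetical helper: print newly added files smaller than the limit.
# Run from the repo root, since git prints paths relative to it.
list_small_new_files() {
  base=$1
  limit=${2:-51200}   # assumed meaning of "50KB"
  git diff --name-only --diff-filter=A "$base...HEAD" |
  while IFS= read -r path; do
    [ -f "$path" ] || continue
    size=$(( $(wc -c < "$path") ))
    if [ "$size" -lt "$limit" ]; then
      printf '%s\n' "$path"
    fi
  done
}

# Example: list_small_new_files origin/main
```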

Per-repo extensions

Each repo can add its own patterns in <repo>/.pr-review/extensions.md. These are appended to the core checklist when reviewing that repo.

Good extensions cover:

Framework conventions (e.g. SMRT's 0 vs 0.0, Svelte 5 $state proxy equality, when to use transformJSON vs toJSON).
Project-specific paths (apps/dashboard/migrations/ vs the repo root migrations/, kustomize overlay layout).
Tenant model specifics (which columns are tenant-scoped, which routes must check access level).
Known footguns captured from past review comments.

See anytown.ai/.pr-review/extensions.md for a worked example.
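
For a repo that has no extensions file yet, a starter could be scaffolded along these lines. The helper name init_extensions and every entry in the template are placeholders that follow the categories listed above; replace them with conventions from your own review history.

```shell
# Hypothetical scaffold: create .pr-review/extensions.md without overwriting.
init_extensions() {
  repo=${1:-.}
  file="$repo/.pr-review/extensions.md"
  if [ -e "$file" ]; then
    echo "refusing to overwrite $file" >&2
    return 1
  fi
  mkdir -p "$repo/.pr-review"
  cat > "$file" <<'EOF'
## Project-specific paths
- App migrations live in apps/dashboard/migrations/, not the repo root migrations/.

## Tenant model
- (placeholder) List the tenant-scoped columns and the routes that must check access level.

## Known footguns
- (placeholder) Add one line per recurring review comment.
EOF
}
```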

Output

Default is JSON (machine-readable, easy to post as PR comments later):

{
  "summary": "Refactor extracting PortalChatTool from PortalAssistantSidebar...",
  "findings": [
    {
      "severity": "medium",
      "category": "refactor-regression",
      "file": "packages/site-portal/src/lib/components/PortalChatTool.svelte",
      "line": 1131,
      "title": "Dropped .primary CSS rule during component extraction",
      "body": "The previous component had `.primary, .composer button { ... }` but here only `.composer button` keeps the primary styling. The Approve button (line 771-782) now falls through to the generic button rule and looks identical to Reject. **Fix:** add the `.primary` rule mirroring the dropped declaration.",
      "confidence": "high"
    }
  ],
  "skipped": []
}

Use --pretty for a human-readable markdown report instead.
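
Because the default output is JSON, findings can be reshaped with standard tools. The sketch below flattens each finding to one file:line line using jq. It assumes jq is installed and that the harness wrote bare JSON in the shape shown above; list_findings is a hypothetical name.

```shell
# Hypothetical helper: print one "file:line [severity] title" line per finding.
list_findings() {
  jq -r '.findings[] | "\(.file):\(.line) [\(.severity)] \(.title)"' "$1"
}

# Example: list_findings /tmp/findings.json
```

A one-line-per-finding form like this is a convenient starting point for posting PR comments or for grepping by file.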

Workflow recipes

Run before every push

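#!/usr/bin/env bash
# .git/hooks/pre-push (or via husky); the file must be executable
pr-review --pretty | tee /tmp/pr-review.md
# Git feeds ref info to this hook on stdin, so read the answer from the terminal.
read -r -p "Review and continue? [y/N] " ans < /dev/tty
[[ "$ans" == "y" ]] || exit 1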

Run in a Claude Code slash command

Add a skill at ~/.claude/skills/pr-review/skill.md, or a project-level skill, that runs the script and pipes its output through claude-cli's planning loop, surfacing findings in your terminal.

Iterate to convergence

# Round 1: get findings
pr-review --pretty | codex exec - > /tmp/round1.md

# Round 2: ask the same harness to apply the fixes you accept
codex exec "Apply the high+medium severity findings from /tmp/round1.md to the working tree"

# Round 3: re-run pr-review to confirm convergence
pr-review --pretty | codex exec -

When round-N produces no high/medium findings, you're done. Push.
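
The stopping rule can be checked mechanically if a round is run with the default JSON output instead of --pretty. This is a sketch under that assumption; converged is a hypothetical name, and it relies on the harness writing bare JSON in the shape shown in the Output section.

```shell
# Hypothetical gate: succeed only when no high or medium findings remain.
converged() {
  jq -e '[.findings[] | select(.severity == "high" or .severity == "medium")] | length == 0' "$1" > /dev/null
}

# Example:
#   pr-review | codex exec - > /tmp/round.json
#   converged /tmp/round.json && echo "No high/medium findings left; push."
```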

Ensemble review (multiple LLMs in parallel)

For high-stakes PRs, run the same pr-review prompt through several LLMs in parallel and merge findings. Each model has different blind spots — codex-cli catches things claude-cli misses and vice versa. copilot-cli adds a third viewpoint that's tuned to GitHub's review style.

This is a recipe, not a script — pr-review stays emit-only on purpose so you compose with whatever ensemble glue suits your harness.

PR_BASE=origin/dev
OUT=/tmp/pr-review-$(date +%s)
mkdir -p "$OUT"

# Capture each tool's findings to separate files
pr-review --base "$PR_BASE"            | codex exec -        > "$OUT/codex.json" 2> "$OUT/codex.err" &
pr-review --base "$PR_BASE"            | claude -p           > "$OUT/claude.json" 2> "$OUT/claude.err" &
pr-review --base "$PR_BASE" --no-diff  | codex review --base "$PR_BASE" - > "$OUT/codex-review.txt" 2> "$OUT/codex-review.err" &
gh copilot suggest -t shell "$(pr-review --base "$PR_BASE" --pretty)" > "$OUT/copilot.txt" 2> "$OUT/copilot.err" &
wait

# Merge — ask one LLM to deduplicate and rank
cat "$OUT"/*.json "$OUT"/*.txt | claude -p "Merge these review findings into one prioritized checklist. Dedupe near-duplicates. Order by severity desc, then by file. Drop findings that fewer than 2 reviewers flagged unless severity is high." > "$OUT/merged.md"

less "$OUT/merged.md"

Operational notes:

Allow at least 15 minutes per reviewer. Some tools time out at 2-5 minutes by default — bump it. Slow reviews are usually thorough reviews; fast ones often skim.
Capture stdout and stderr separately. When a tool returns no findings or a malformed result, the cause is almost always in stderr. Saving them apart makes triage trivial.
Use --no-diff for codex review. It fetches its own diff from --base. Sending the diff inside the prompt too just confuses the model about which diff to review.
Treat each tool's output as evidence, not as orders. Verify before fixing. Tools hallucinate file paths and line numbers more often when temperature is high — concrete file:line references with matching context are reliable, vague claims are not.
Don't convert a downstream PR to draft solely because one reviewer raised a low-confidence concern. Address the concrete findings, document the dismissed ones in the PR body, and ship.
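
The first two notes can be folded into a small wrapper. This is a sketch rather than part of pr-review: run_reviewer, triage_empty and REVIEW_TIMEOUT are hypothetical names, OUT is the directory from the recipe above, and the timeout command (GNU coreutils) is used only if it is installed.

```shell
# Hypothetical wrapper: run one reviewer with a generous timeout and keep
# stdout and stderr in separate files under $OUT. The prompt arrives on stdin.
run_reviewer() {
  name=$1
  shift
  if command -v timeout > /dev/null 2>&1; then
    timeout "${REVIEW_TIMEOUT:-900}" "$@" > "$OUT/$name.out" 2> "$OUT/$name.err"
  else
    "$@" > "$OUT/$name.out" 2> "$OUT/$name.err"
  fi
}

# Hypothetical triage: for every reviewer with empty stdout, show the tail of
# its stderr, which is where the cause usually is.
triage_empty() {
  for out in "$OUT"/*.out; do
    [ -e "$out" ] || continue
    if [ ! -s "$out" ]; then
      printf '%s produced no output; last stderr lines:\n' "$(basename "$out" .out)"
      tail -n 5 "${out%.out}.err"
    fi
  done
}

# Example:
#   pr-review --base "$PR_BASE" | run_reviewer codex codex exec - &
#   wait; triage_empty
```

The default of 900 seconds matches the 15-minute floor suggested above.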

For a real-world example of this pattern wired into a shipping pipeline, see the /review-cycle and /ship slash commands at https://github.com/willgriffin/personal-ship-command (or whatever ensemble harness you prefer).

The calibration loop

The checklist will rot if you don't close the feedback loop. pr-review-tune compares pre-PR findings against what reviewers actually said on the merged PR, and proposes specific edits to checklist.md / extensions.md as a unified diff.

Setup: capture pre-PR findings

# Run as part of your pre-push workflow
pr-review | codex exec - | pr-review-capture | tee /dev/tty
#                          ^^^^^^^^^^^^^^^^^
#                          saves to .pr-review/history/<sha>.json

.pr-review/history/ is gitignored by default — findings stay local. (If you want shared visibility across collaborators, you can post them as a PR comment via gh pr comment in a second step; not built in by design.)
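
The internals of pr-review-capture are not shown here. As a stand-in with the behaviour described in the comment above, a capture step can be as small as a tee keyed on the current commit. capture_findings is a hypothetical name.

```shell
# Hypothetical stand-in: save stdin to .pr-review/history/<sha>.json and
# pass it through unchanged so later pipeline stages still see the findings.
capture_findings() {
  repo=${1:-.}
  sha=$(git -C "$repo" rev-parse HEAD) || return 1
  mkdir -p "$repo/.pr-review/history"
  tee "$repo/.pr-review/history/$sha.json"
}

# Example: pr-review | codex exec - | capture_findings | tee /dev/tty
```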

Tune: after some PRs have merged

# Tune against the last 10 merged PRs
pr-review-tune --last 10 | codex exec - > tune-proposal.md

# Or a specific PR
pr-review-tune --pr 422 | claude -p

# Or everything since a date
pr-review-tune --since 2026-04-01 | codex exec - > tune-proposal.md

The tune prompt enforces four rules to prevent prompt rot:

N-occurrence threshold (default 3): a missed pattern must appear in ≥N PRs in the batch before it can be proposed as a new checklist item. One-off quirks won't rewrite the prompt.
Consolidation preferred: existing sections get strengthened rather than the checklist growing unboundedly.
Removal proposals: items that fire repeatedly with zero reviewer agreement get flagged for weakening or removal.
Token budget (default ~8000 tokens for the checklist): forces aggressive consolidation when near the cap.
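
The N-occurrence rule is applied by the tune prompt itself, but the counting it implies is easy to sketch. The input format here is an assumption made for illustration: one tab-separated "PR number, missed pattern" pair per line. patterns_over_threshold is a hypothetical name.

```shell
# Hypothetical helper: read "pr<TAB>pattern" lines on stdin and print the
# patterns that were missed in at least N distinct PRs (default 3).
patterns_over_threshold() {
  n=${1:-3}
  sort -u |            # count each pattern at most once per PR
    cut -f2 |
    sort |
    uniq -c |
    awk -v n="$n" '$1 >= n { sub(/^ *[0-9]+ /, ""); print }'
}

# Example: patterns_over_threshold 3 < missed-patterns.tsv
```

Deduplicating per PR first is what keeps a single noisy PR from pushing a one-off quirk over the threshold.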

The output is a unified diff. Review it, apply with patch -p0, commit:

codex exec - < tune-proposal.md > diff.patch    # or however your harness emits
patch -p0 < diff.patch                          # apply the proposed edits
git diff prompts/ .pr-review/extensions.md      # see what changed
# decide what to keep, commit if good

Cadence

Per-PR calibration is noisy; don't run after every PR.
Every ~10 PRs is a good cadence for --last 10. Apply edits manually for the first 2-3 rounds so you can see whether the tune prompt is drifting. Automate later if it isn't.
Monthly coverage health check: run --since over a longer window and ask the harness specifically about categories of bugs reviewers still find that the checklist doesn't address. That surfaces blind spots that the per-PR loop won't.

What the tune loop can't fix

The loop converges toward "what reviewers caught that pr-review didn't." It does not catch what neither caught:

Business-logic intent (does the PR do the right thing?)
Cross-PR design questions
Bugs that fall in blind spots shared by pr-review and copilot-cli

Keep a human in the loop periodically. The tune doesn't replace review; it shortens iterations on the review you'd do anyway.

What this catches

Based on analysis of 110 review comments across 36 anytown.ai PRs:

Theme	% of comments	Catchable pre-PR?
Refactor regressions (dropped CSS/handlers/fallbacks)	~20%	Yes — diff-aware
Tenant isolation gaps	~15%	Yes — pattern-based
Svelte 5 effects/onMount races	~10%	Yes — checklist
Silent error swallowing	~10%	Yes — grep-able
Hardcoded values	~10%	Yes — grep + checklist
Doc-code drift	~10%	Partial — depends on doc proximity
Infra duplication / RBAC scope	~8%	Yes — checklist
Dead config / unwired params	~6%	Partial — needs call-site analysis
Concurrency on shared rows	~5%	Yes — checklist
Pure mechanical (separators, dupe imports)	~6%	Leave to linter

Realistic compression target: 50–70% of round-2 review cycles eliminated. A/B test it: count the findings copilot-cli adds on top of what pr-review already caught. If that delta drops below half of baseline, it's working.
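
The success test above is simple arithmetic: with a baseline of B copilot-cli findings per PR before adopting pr-review and a delta of D findings it still adds afterwards, the target is D < B/2. A sketch in integer shell arithmetic, with compression_working as a hypothetical name:

```shell
# Hypothetical check: succeed when the remaining delta is below half of baseline.
# Comparing 2*delta against baseline avoids integer-division rounding.
compression_working() {
  baseline=$1
  delta=$2
  [ $((delta * 2)) -lt "$baseline" ]
}

# Example with made-up counts: a baseline of 20 and a delta of 9 passes;
# a delta of 10 does not.
#   compression_working 20 9 && echo "working"
```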

License

MIT.
