backnotprop · backnotprop · Mar 11, 2026 · Mar 11, 2026 · Mar 11, 2026 · Mar 11, 2026
diff --git a/.agents/skills/checklist/SKILL.md b/.agents/skills/checklist/SKILL.md
@@ -0,0 +1,153 @@
+---
+name: checklist
+description: >
+  Generate a QA checklist for manual developer verification of code changes.
+  Use when the user wants to verify completed work, review a diff for quality,
+  create acceptance criteria checks, or run through QA steps before shipping.
+  Triggers on requests like "create a checklist", "what should I test",
+  "verify my changes", "QA this", or "pre-flight check".
+disable-model-invocation: true
+---
+
+# QA Checklist
+
+You are a senior QA engineer. Your job is to analyze the current code changes and produce a **QA checklist** — a structured list of verification tasks the developer needs to manually review before the work is considered done.
+
+This is not a code review. Code reviews catch style issues and logic bugs in the diff itself. A QA checklist catches the things that only a human can verify by actually running, clicking, testing, and observing the software. You're producing the verification plan that bridges "the code looks right" to "the software actually works."
+
+## Principles
+
+**Focus on what humans must verify.** If an automated test already covers something with meaningful assertions, it doesn't need a checklist item. But "tests exist" is not enough — test coverage that only asserts existence or happy-path behavior still leaves gaps that need human eyes.
+
+**Be specific, not vague.** "Test the login flow" is useless. "Verify that login with an expired JWT returns a 401 with `{error: 'token_expired'}` body, not a 500 with a stack trace" tells the developer exactly what to check, what to expect, and what failure looks like.
+
+**Every item is a mini test case.** Each checklist entry should have enough context that a developer unfamiliar with the change could pick it up and verify it. The description explains the change and the risk. The steps walk through the exact verification procedure. The expected outcome is clear.
+
+**Fewer good items beat many shallow ones.** Aim for 5–15 items. If you're producing more than 15, you're generating busywork — prioritize the items where human verification actually matters. If you're producing fewer than 5, look harder at edge cases, integration points, and deployment concerns.
+
+## Workflow
+
+### 1. Gather Context
+
+Start by understanding what changed and why.
+
+```bash
+git diff HEAD
+```
+
+If that's empty, try the branch diff:
+
+```bash
+git diff main...HEAD
+```
+
+As you read the diff, build a mental model:
+
+- **What kind of change is this?** New feature, bug fix, refactor, dependency update, config/infra change. This determines which categories of verification matter most.
+- **Which files changed and what do they do?** UI components need visual verification. API routes need functional testing. Database migrations need data integrity checks. Config files need deployment verification.
+- **Do tests exist for this code?** Look for test files related to the changed code. Tests that meaningfully cover the changed behavior reduce the need for manual verification — but tests that only cover the happy path or assert existence still leave gaps.
+
+### 2. Decide What Needs Manual Verification
+
+Think about each change through the lens of what could go wrong that a human needs to catch. Consider categories like:
+
+- **Visual** — Does it look right? Layout, responsiveness, dark mode, animations, color contrast. Only relevant when UI files changed.
+- **Functional** — Does the feature work end-to-end? Happy path and primary error paths. Always relevant for new features and bug fixes.
+- **Edge cases** — Empty input, huge input, special characters, concurrent access, timezone issues. Focus on cases the diff suggests are likely, not every theoretical scenario.
+- **Integration** — Does this break callers or consumers? API contract changes, event format changes, shared state mutations.
+- **Security** — Auth checks on new endpoints, input sanitization, secrets exposure, CORS changes.
+- **Data** — Database migrations, schema changes, backwards compatibility, data format changes.
+- **Performance** — Only when the diff touches hot paths, adds queries, or changes data structures.
+- **Deployment** — New environment variables, feature flags, migration ordering, new dependencies.
+- **Developer experience** — Error messages, documentation, CLI help text, logging.
+
+These are suggestions, not a fixed list. Use whatever category label best describes the type of verification. If the change involves "api-contract" or "accessibility" or "offline-behavior," use that.
+
+### 3. Generate the Checklist JSON
+
+Produce a JSON object with this structure:
+
+```json
+{
+  "title": "Short title for the checklist",
+  "summary": "One paragraph explaining what changed and why manual verification matters.",
+  "pr": {
+    "number": 142,
+    "url": "https://github.com/org/repo/pull/142",
+    "title": "feat: add OAuth2 support",
+    "branch": "feat/oauth2",
+    "provider": "github"
+  },
+  "items": [
+    {
+      "id": "category-N",
+      "category": "free-form category label",
+      "check": "Imperative verb phrase — the headline",
+      "description": "Markdown narrative explaining what changed in the code, what could go wrong, what the expected behavior is, and how the developer knows the test passes.",
+      "steps": [
+        "Step 1: Do this specific thing",
+        "Step 2: Observe this specific result",
+        "Step 3: Confirm this specific expectation"
+      ],
+      "reason": "Why this needs human eyes — what makes it not fully automatable.",
+      "files": ["path/to/relevant/file.ts"],
+      "critical": false
+    }
+  ]
+}
+```
+
+**Field guidance:**
+
+- **`pr`** (optional): Include when the checklist is associated with a pull/merge request. The UI displays a PR badge in the header and enables automation options (post results as a PR comment, auto-approve if all checks pass). Detect the provider from the git remote:
+  - `github.com` → `"provider": "github"`
+  - `gitlab.com` or self-hosted GitLab → `"provider": "gitlab"`
+  - `dev.azure.com` or `visualstudio.com` → `"provider": "azure-devops"`
+
+  To detect if a PR exists for the current branch:
+  ```bash
+  # GitHub
+  gh pr view --json number,url,title,headRefName 2>/dev/null
+  # GitLab
+  glab mr view --output json 2>/dev/null
+  # Azure DevOps
+  az repos pr list --source-branch "$(git branch --show-current)" --output json 2>/dev/null
+  ```
+  If the command succeeds, populate the `pr` field. If it fails (no PR exists, CLI not installed), omit it entirely. Do not error on missing CLIs — the `pr` field is optional.
+
+- **`id`**: Prefix with a short category tag and number: `func-1`, `sec-2`, `visual-1`. This makes items easy to reference in feedback.
+- **`category`**: Free-form string. Pick the label that best describes the verification type. Common ones: `visual`, `functional`, `edge-case`, `integration`, `security`, `data`, `performance`, `deployment`, `devex`.
+- **`check`**: The headline. Always starts with a verb: Verify, Confirm, Check, Test, Ensure, Open, Navigate, Run. This is what appears as the checklist item label.
+- **`description`**: The heart of the item. Write this as a markdown narrative that tells the full story:
+  - What changed in the code (reference specific files/functions)
+  - What could go wrong as a result
+  - What the expected behavior should be
+  - How the developer knows the test passes vs fails
+- **`steps`**: Required. Ordered instructions for conducting the verification. Be concrete — "Open browser devtools" not "check the network." Each step should be a single clear action.
+- **`reason`**: One sentence explaining why automation can't fully cover this. "CSS grid rendering varies across browsers" is good. "Because it changed" is not.
+- **`files`**: File paths from the diff that this item relates to. Helps the developer trace your reasoning.
+- **`critical`**: Reserve for items where failure means data loss, security vulnerability, or broken deployment. Typically 0–3 items per checklist.
+
+### 4. Launch the Checklist UI
+
+Write your JSON to a temporary file and pass it via `--file`:
+
+```bash
+cat > /tmp/checklist.json << 'CHECKLIST_EOF'
+<your-json-here>
+CHECKLIST_EOF
+plannotator checklist --file /tmp/checklist.json
+```
+
+This avoids shell quoting issues with large or complex JSON. The UI opens for the developer to work through each item — marking them as passed, failed, or skipped with notes and screenshot evidence. Wait for the output — it contains the developer's results.
+
+### 5. Respond to Results
+
+When the checklist results come back:
+
+- **All passed**: The verification is complete. Acknowledge it and move on.
+- **Items failed**: Read the developer's notes carefully. Fix the issue if you can. If the current behavior is actually correct, explain why.
+- **Items skipped**: Note the reason. If items were skipped as "not applicable," your checklist may have been too broad for this change — take that as feedback.
+- **Questions attached**: Answer them directly, with references to the relevant code.
+
+$ARGUMENTS
diff --git a/.gitignore b/.gitignore
@@ -19,13 +19,15 @@ dist-ssr
 # VS Code extension package
 *.vsix
 
-# OpenCode plugin build artifacts (generated from hook/review dist)
+# OpenCode plugin build artifacts (generated from hook/review/checklist dist)
 apps/opencode-plugin/plannotator.html
 apps/opencode-plugin/review-editor.html
+apps/opencode-plugin/checklist.html
 
-# Pi extension build artifacts (generated from hook/review dist)
+# Pi extension build artifacts (generated from hook/review/checklist dist)
 apps/pi-extension/plannotator.html
 apps/pi-extension/review-editor.html
+apps/pi-extension/checklist.html
 
 # Editor directories and files
 .vscode/*

diff --git a/apps/checklist/index.html b/apps/checklist/index.html
@@ -0,0 +1,18 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>QA Checklist</title>
+
+    <!-- Fonts -->
+    <link rel="preconnect" href="https://fonts.googleapis.com">
+    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500;600&display=swap" rel="stylesheet">
+
+  </head>
+  <body class="min-h-screen antialiased">
+    <div id="root" class="h-full"></div>
+    <script type="module" src="/index.tsx"></script>
+  </body>
+</html>
diff --git a/apps/checklist/index.tsx b/apps/checklist/index.tsx
@@ -0,0 +1,16 @@
+import React from 'react';
+import ReactDOM from 'react-dom/client';
+import App from '@plannotator/checklist-editor';
+import '@plannotator/checklist-editor/styles';
+
+const rootElement = document.getElementById('root');
+if (!rootElement) {
+  throw new Error("Could not find root element to mount to");
+}
+
+const root = ReactDOM.createRoot(rootElement);
+root.render(
+  <React.StrictMode>
+    <App />
+  </React.StrictMode>
+);
diff --git a/apps/checklist/package.json b/apps/checklist/package.json
@@ -0,0 +1,27 @@
+{
+  "name": "@plannotator/checklist",
+  "private": true,
+  "version": "0.0.1",
+  "type": "module",
+  "scripts": {
+    "dev": "vite",
+    "build": "vite build"
+  },
+  "dependencies": {
+    "@plannotator/checklist-editor": "workspace:*",
+    "@plannotator/server": "workspace:*",
+    "@plannotator/shared": "workspace:*",
+    "@plannotator/ui": "workspace:*",
+    "react": "^19.2.3",
+    "react-dom": "^19.2.3",
+    "tailwindcss": "^4.1.18",
+    "@tailwindcss/vite": "^4.1.18"
+  },
+  "devDependencies": {
+    "@vitejs/plugin-react": "^5.0.0",
+    "typescript": "~5.8.2",
+    "vite": "^6.2.0",
+    "vite-plugin-singlefile": "^2.0.3",
+    "@types/node": "^22.14.0"
+  }
+}
diff --git a/apps/checklist/vite.config.ts b/apps/checklist/vite.config.ts
@@ -0,0 +1,37 @@
+import path from 'path';
+import { defineConfig } from 'vite';
+import react from '@vitejs/plugin-react';
+import { viteSingleFile } from 'vite-plugin-singlefile';
+import tailwindcss from '@tailwindcss/vite';
+import pkg from '../../package.json';
+
+export default defineConfig({
+  server: {
+    port: 3002,
+    host: '0.0.0.0',
+  },
+  define: {
+    __APP_VERSION__: JSON.stringify(pkg.version),
+  },
+  plugins: [react(), tailwindcss(), viteSingleFile()],
+  resolve: {
+    alias: {
+      '@': path.resolve(__dirname, '.'),
+      '@plannotator/ui': path.resolve(__dirname, '../../packages/ui'),
+      '@plannotator/shared': path.resolve(__dirname, '../../packages/shared'),
+      '@plannotator/checklist-editor/styles': path.resolve(__dirname, '../../packages/checklist-editor/index.css'),
+      '@plannotator/checklist-editor': path.resolve(__dirname, '../../packages/checklist-editor/App.tsx'),
+    }
+  },
+  build: {
+    target: 'esnext',
+    assetsInlineLimit: 100000000,
+    chunkSizeWarningLimit: 100000000,
+    cssCodeSplit: false,
+    rollupOptions: {
+      output: {
+        inlineDynamicImports: true,
+      },
+    },
+  },
+});
diff --git a/apps/codex/README.md b/apps/codex/README.md
@@ -1,6 +1,14 @@
 # Plannotator for Codex
 
-Code review and markdown annotation are supported today. Plan mode is not yet supported — it requires hooks to intercept the agent's plan submission, which Codex does not currently expose.
+## Capabilities
+
+| Feature | How to use |
+|---------|------------|
+| **Code Review** | `!plannotator review` — Visual diff annotation UI |
+| **Markdown Annotation** | `!plannotator annotate path/to/file.md` — Annotate any markdown file |
+| **QA Checklist** | Skill: `checklist` — Generate and verify QA checklists interactively |
+
+Plan mode is not yet supported — it requires hooks to intercept the agent's plan submission, which Codex does not currently expose.
 
 ## Install
 
@@ -16,26 +24,28 @@ curl -fsSL https://plannotator.ai/install.sh | bash
 irm https://plannotator.ai/install.ps1 | iex
 ```
 
+This installs the `plannotator` CLI and places skills in `~/.agents/skills/` where Codex discovers them on startup. To install skills only: `npx skills add backnotprop/plannotator`.
+
 ## Usage
 
 ### Code Review
 
-Run `!plannotator review` to open the code review UI for your current changes:
-
 ```
 !plannotator review
 ```
 
-This captures your git diff, opens a browser with the review UI, and waits for your feedback. When you submit annotations, the feedback is printed to stdout.
+Captures your git diff, opens a browser with the review UI, and waits for your feedback. Annotations are sent back to the agent as structured feedback.
 
 ### Annotate Markdown
 
-Run `!plannotator annotate` to annotate any markdown file:
-
 ```
 !plannotator annotate path/to/file.md
 ```
 
+### QA Checklist
+
+The `checklist` skill is invoked by the agent when you ask it to verify changes, create acceptance criteria, or run QA checks. It generates a structured checklist and opens an interactive UI for pass/fail/skip verification.
+
 ## Environment Variables
 
 | Variable | Description |

diff --git a/apps/factory/README.md b/apps/factory/README.md
@@ -0,0 +1,19 @@
+# Plannotator for Factory
+
+## Install
+
+**macOS / Linux / WSL:**
+
+```bash
+curl -fsSL https://plannotator.ai/install.sh | bash
+```
+
+## Skills
+
+Skills are installed automatically by the install script above.
+
+Alternatively, install skills only via `npx skills add backnotprop/plannotator`.
+
+| Skill | Description |
+|-------|-------------|
+| `checklist` | Generate a QA checklist for manual verification of code changes |
diff --git a/apps/hook/README.md b/apps/hook/README.md
@@ -1,6 +1,13 @@
 # Plannotator Claude Code Plugin
 
-This directory contains the Claude Code plugin configuration for Plannotator.
+## Capabilities
+
+| Feature | How to use |
+|---------|------------|
+| **Plan Review** | Automatic — intercepts `ExitPlanMode` via hooks |
+| **Code Review** | `/plannotator-review` — Visual diff annotation UI |
+| **Markdown Annotation** | `/plannotator-annotate path/to/file.md` — Annotate any markdown file |
+| **QA Checklist** | `/plannotator-checklist` or skill: `checklist` — Generate and verify QA checklists interactively |
 
 ## Prerequisites
 
@@ -23,7 +30,7 @@ curl -fsSL https://plannotator.ai/install.cmd -o install.cmd && install.cmd && d
 
 ---
 
-[Plugin Installation](#plugin-installation) · [Manual Installation (Hooks)](#manual-installation-hooks) · [Obsidian Integration](#obsidian-integration)  
+[Plugin Installation](#plugin-installation) · [Manual Installation (Hooks)](#manual-installation-hooks) · [Obsidian Integration](#obsidian-integration)
 
 ---
 
@@ -36,7 +43,7 @@ In Claude Code:
 /plugin install plannotator@plannotator
 ```
 
-**Important:** Restart Claude Code after installing the plugin for the hooks to take effect.
+**Important:** Restart Claude Code after installing the plugin for the hooks to take effect. Skills are included with the plugin install.
 
 ## Manual Installation (Hooks)
 

diff --git a/apps/hook/commands/plannotator-checklist.md b/apps/hook/commands/plannotator-checklist.md
@@ -0,0 +1,12 @@
+---
+description: Open interactive QA checklist verification UI
+allowed-tools: Bash(plannotator:*)
+---
+
+## QA Checklist Results
+
+!`plannotator checklist '$ARGUMENTS'`
+
+## Your task
+
+Address the checklist results above. Items marked FAILED need fixes — read the developer's notes and act on them. Items with questions need answers. Items marked SKIPPED were not verified — acknowledge the reason.
diff --git a/apps/hook/package.json b/apps/hook/package.json
@@ -5,7 +5,7 @@
   "type": "module",
   "scripts": {
     "dev": "vite",
-    "build": "vite build && cp dist/index.html dist/redline.html && cp ../review/dist/index.html dist/review.html",
+    "build": "bun run --cwd ../review build && bun run --cwd ../checklist build && vite build && cp dist/index.html dist/redline.html && cp ../review/dist/index.html dist/review.html && cp ../checklist/dist/index.html dist/checklist.html",
     "serve": "bun run server/index.ts"
   },
   "dependencies": {