153 changes: 153 additions & 0 deletions .agents/skills/checklist/SKILL.md
@@ -0,0 +1,153 @@
---
name: checklist
description: >
  Generate a QA checklist for manual developer verification of code changes.
  Use when the user wants to verify completed work, review a diff for quality,
  create acceptance criteria checks, or run through QA steps before shipping.
  Triggers on requests like "create a checklist", "what should I test",
  "verify my changes", "QA this", or "pre-flight check".
disable-model-invocation: true
---

# QA Checklist

You are a senior QA engineer. Your job is to analyze the current code changes and produce a **QA checklist** — a structured list of verification tasks the developer needs to manually review before the work is considered done.

This is not a code review. Code reviews catch style issues and logic bugs in the diff itself. A QA checklist catches the things that only a human can verify by actually running, clicking, testing, and observing the software. You're producing the verification plan that bridges "the code looks right" to "the software actually works."

## Principles

**Focus on what humans must verify.** If an automated test already covers something with meaningful assertions, it doesn't need a checklist item. But "tests exist" is not enough — test coverage that only asserts existence or happy-path behavior still leaves gaps that need human eyes.

**Be specific, not vague.** "Test the login flow" is useless. "Verify that login with an expired JWT returns a 401 with `{error: 'token_expired'}` body, not a 500 with a stack trace" tells the developer exactly what to check, what to expect, and what failure looks like.

**Every item is a mini test case.** Each checklist entry should have enough context that a developer unfamiliar with the change could pick it up and verify it. The description explains the change and the risk. The steps walk through the exact verification procedure. The expected outcome is clear.

**Fewer good items beat many shallow ones.** Aim for 5–15 items. If you're producing more than 15, you're generating busywork — prioritize the items where human verification actually matters. If you're producing fewer than 5, look harder at edge cases, integration points, and deployment concerns.

## Workflow

### 1. Gather Context

Start by understanding what changed and why.

```bash
git diff HEAD
```

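If that prints nothing (for example, because everything is already committed), compare the branch against `main` instead, substituting your default branch name if it differs: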

```bash
git diff main...HEAD
```

As you read the diff, build a mental model:

- **What kind of change is this?** New feature, bug fix, refactor, dependency update, config/infra change. This determines which categories of verification matter most.
- **Which files changed and what do they do?** UI components need visual verification. API routes need functional testing. Database migrations need data integrity checks. Config files need deployment verification.
- **Do tests exist for this code?** Look for test files related to the changed code. Tests that meaningfully cover the changed behavior reduce the need for manual verification — but tests that only cover the happy path or assert existence still leave gaps.
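
The file-to-verification mapping described above can be sketched as a small heuristic. This is only an illustration: the function name and the path patterns are assumptions about a typical repository layout, not rules defined by this skill.

```typescript
// Hypothetical heuristic: map changed file paths to likely verification categories.
// The regular expressions are assumptions; adjust them to the repository at hand.
const CATEGORY_RULES: Array<{ pattern: RegExp; category: string }> = [
  { pattern: /\.(tsx|jsx|css|scss|html)$/, category: "visual" },
  { pattern: /(^|\/)(api|routes)\//, category: "functional" },
  { pattern: /(^|\/)migrations\//, category: "data" },
  { pattern: /(^|\/)(\.env[^/]*|Dockerfile|[^/]*\.ya?ml)$/, category: "deployment" },
];

function suggestCategories(changedFiles: string[]): string[] {
  const found: string[] = [];
  for (const file of changedFiles) {
    for (const { pattern, category } of CATEGORY_RULES) {
      // Record each category once, no matter how many files match it.
      if (pattern.test(file) && found.indexOf(category) === -1) {
        found.push(category);
      }
    }
  }
  return found.sort();
}
```

A result such as `["data", "visual"]` is a starting point for step 2, not a substitute for reading the diff.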

### 2. Decide What Needs Manual Verification

Think about each change through the lens of what could go wrong that a human needs to catch. Consider categories like:

- **Visual** — Does it look right? Layout, responsiveness, dark mode, animations, color contrast. Only relevant when UI files changed.
- **Functional** — Does the feature work end-to-end? Happy path and primary error paths. Always relevant for new features and bug fixes.
- **Edge cases** — Empty input, huge input, special characters, concurrent access, timezone issues. Focus on cases the diff suggests are likely, not every theoretical scenario.
- **Integration** — Does this break callers or consumers? API contract changes, event format changes, shared state mutations.
- **Security** — Auth checks on new endpoints, input sanitization, secrets exposure, CORS changes.
- **Data** — Database migrations, schema changes, backwards compatibility, data format changes.
- **Performance** — Only when the diff touches hot paths, adds queries, or changes data structures.
- **Deployment** — New environment variables, feature flags, migration ordering, new dependencies.
- **Developer experience** — Error messages, documentation, CLI help text, logging.

These are suggestions, not a fixed list. Use whatever category label best describes the type of verification. If the change involves "api-contract" or "accessibility" or "offline-behavior," use that.

### 3. Generate the Checklist JSON

Produce a JSON object with this structure:

```json
{
  "title": "Short title for the checklist",
  "summary": "One paragraph explaining what changed and why manual verification matters.",
  "pr": {
    "number": 142,
    "url": "https://github.com/org/repo/pull/142",
    "title": "feat: add OAuth2 support",
    "branch": "feat/oauth2",
    "provider": "github"
  },
  "items": [
    {
      "id": "category-N",
      "category": "free-form category label",
      "check": "Imperative verb phrase — the headline",
      "description": "Markdown narrative explaining what changed in the code, what could go wrong, what the expected behavior is, and how the developer knows the test passes.",
      "steps": [
        "Step 1: Do this specific thing",
        "Step 2: Observe this specific result",
        "Step 3: Confirm this specific expectation"
      ],
      "reason": "Why this needs human eyes — what makes it not fully automatable.",
      "files": ["path/to/relevant/file.ts"],
      "critical": false
    }
  ]
}
```
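
For readers who prefer types to an example, the structure above could be modeled roughly as follows. This is a sketch inferred from the JSON example only: the type names and the `isChecklist` guard are hypothetical, and the project's real type definitions may differ.

```typescript
// Types inferred from the JSON example above; not the project's actual definitions.
type Provider = "github" | "gitlab" | "azure-devops";

interface ChecklistPR {
  number: number;
  url: string;
  title: string;
  branch: string;
  provider: Provider;
}

interface ChecklistItem {
  id: string;
  category: string;
  check: string;
  description: string;
  steps: string[];
  reason: string;
  files: string[];
  critical: boolean;
}

interface Checklist {
  title: string;
  summary: string;
  pr?: ChecklistPR; // optional, omitted when no PR exists
  items: ChecklistItem[];
}

// Minimal runtime guard: checks the top-level shape and a few keys on each item.
function isChecklist(value: unknown): value is Checklist {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.title !== "string" || typeof v.summary !== "string") return false;
  if (!Array.isArray(v.items)) return false;
  return v.items.every((item) => {
    if (typeof item !== "object" || item === null) return false;
    const i = item as Record<string, unknown>;
    return (
      typeof i.id === "string" &&
      typeof i.check === "string" &&
      typeof i.description === "string" &&
      Array.isArray(i.steps) &&
      typeof i.critical === "boolean"
    );
  });
}
```

Running a guard like this on the parsed JSON before writing it to disk catches a missing `steps` array or a mistyped `critical` flag early.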

**Field guidance:**

- **`pr`** (optional): Include when the checklist is associated with a pull/merge request. The UI displays a PR badge in the header and enables automation options (post results as a PR comment, auto-approve if all checks pass). Detect the provider from the git remote:
  - `github.com` → `"provider": "github"`
  - `gitlab.com` or self-hosted GitLab → `"provider": "gitlab"`
  - `dev.azure.com` or `visualstudio.com` → `"provider": "azure-devops"`

  To detect if a PR exists for the current branch:
  ```bash
  # GitHub
  gh pr view --json number,url,title,headRefName 2>/dev/null
  # GitLab
  glab mr view --output json 2>/dev/null
  # Azure DevOps
  az repos pr list --source-branch "$(git branch --show-current)" --output json 2>/dev/null
  ```
  If the command succeeds, populate the `pr` field. If it fails (no PR exists, CLI not installed), omit it entirely. Do not error on missing CLIs — the `pr` field is optional.

- **`id`**: Prefix with a short category tag and number: `func-1`, `sec-2`, `visual-1`. This makes items easy to reference in feedback.
- **`category`**: Free-form string. Pick the label that best describes the verification type. Common ones: `visual`, `functional`, `edge-case`, `integration`, `security`, `data`, `performance`, `deployment`, `devex`.
- **`check`**: The headline. Always starts with a verb: Verify, Confirm, Check, Test, Ensure, Open, Navigate, Run. This is what appears as the checklist item label.
- **`description`**: The heart of the item. Write this as a markdown narrative that tells the full story:
  - What changed in the code (reference specific files/functions)
  - What could go wrong as a result
  - What the expected behavior should be
  - How the developer knows the test passes vs fails
- **`steps`**: Required. Ordered instructions for conducting the verification. Be concrete — "Open browser devtools" not "check the network." Each step should be a single clear action.
- **`reason`**: One sentence explaining why automation can't fully cover this. "CSS grid rendering varies across browsers" is good. "Because it changed" is not.
- **`files`**: File paths from the diff that this item relates to. Helps the developer trace your reasoning.
- **`critical`**: Reserve for items where failure means data loss, security vulnerability, or broken deployment. Typically 0–3 items per checklist.
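
Several of the rules in the field guidance are mechanical enough to check in code. The sketch below is one possible way to do so; `detectProvider`, `lintItems`, and the exact `id` pattern are illustrative assumptions rather than part of the skill.

```typescript
// Hypothetical helpers derived from the field guidance above.
interface ItemLike {
  id: string;
  check: string;
  steps: string[];
  critical: boolean;
}

const VERBS = ["Verify", "Confirm", "Check", "Test", "Ensure", "Open", "Navigate", "Run"];

// Map a git remote URL to a provider using the host rules listed under `pr`.
// A self-hosted GitLab host cannot be recognized by name alone, so it yields undefined here.
function detectProvider(remoteUrl: string): "github" | "gitlab" | "azure-devops" | undefined {
  if (remoteUrl.indexOf("github.com") !== -1) return "github";
  if (remoteUrl.indexOf("gitlab.com") !== -1) return "gitlab";
  if (remoteUrl.indexOf("dev.azure.com") !== -1 || remoteUrl.indexOf("visualstudio.com") !== -1) {
    return "azure-devops";
  }
  return undefined;
}

// Return human-readable warnings for items that stray from the guidance.
function lintItems(items: ItemLike[]): string[] {
  const warnings: string[] = [];
  for (const item of items) {
    // Assumed id format: lowercase tag, hyphen, number (e.g. "func-1", "edge-case-2").
    if (!/^[a-z][a-z-]*-\d+$/.test(item.id)) {
      warnings.push(`${item.id}: id should look like "func-1"`);
    }
    if (!VERBS.some((verb) => item.check.indexOf(verb + " ") === 0)) {
      warnings.push(`${item.id}: check should start with a verb`);
    }
    if (item.steps.length === 0) {
      warnings.push(`${item.id}: steps are required`);
    }
  }
  const criticalCount = items.filter((item) => item.critical).length;
  if (criticalCount > 3) {
    warnings.push(`${criticalCount} critical items; guidance suggests 0 to 3`);
  }
  return warnings;
}
```

The verb list is the one given for `check`; other imperative verbs would be flagged by this sketch even though the guidance does not forbid them.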

### 4. Launch the Checklist UI

Write your JSON to a temporary file and pass it via `--file`:

```bash
cat > /tmp/checklist.json << 'CHECKLIST_EOF'
<your-json-here>
CHECKLIST_EOF
plannotator checklist --file /tmp/checklist.json
```

This avoids shell quoting issues with large or complex JSON. The UI opens for the developer to work through each item — marking them as passed, failed, or skipped with notes and screenshot evidence. Wait for the output — it contains the developer's results.
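
The same write-then-launch step could be done from a Node-compatible script instead of a heredoc. This is a minimal sketch assuming a Node or Bun runtime; the function names are hypothetical, and the only CLI invocation used is the `plannotator checklist --file` form shown above.

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { spawnSync } from "node:child_process";

// Write the checklist to a fresh temp directory and return the file path.
// A unique directory avoids clobbering a file left behind by a concurrent run.
function writeChecklistFile(checklist: object): string {
  const dir = mkdtempSync(join(tmpdir(), "checklist-"));
  const file = join(dir, "checklist.json");
  writeFileSync(file, JSON.stringify(checklist, null, 2), "utf8");
  return file;
}

// Run `plannotator checklist --file <path>` and return whatever it prints to stdout.
// Arguments are passed as an array, so no shell quoting is involved.
function launchChecklist(file: string): string {
  const result = spawnSync("plannotator", ["checklist", "--file", file], { encoding: "utf8" });
  if (result.error) throw result.error; // e.g. the CLI is not installed
  return result.stdout;
}
```

Because `spawnSync` blocks until the process exits, the call returns only after the developer has finished with the UI, which matches the "wait for the output" instruction.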

### 5. Respond to Results

When the checklist results come back:

- **All passed**: The verification is complete. Acknowledge it and move on.
- **Items failed**: Read the developer's notes carefully. Fix the issue if you can. If the current behavior is actually correct, explain why.
- **Items skipped**: Note the reason. If items were skipped as "not applicable," your checklist may have been too broad for this change — take that as feedback.
- **Questions attached**: Answer them directly, with references to the relevant code.
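
The branching above can be sketched as a small triage function. The result shape used here is hypothetical, since the actual output format of the checklist UI is not shown in this file; only the passed, failed, and skipped statuses come from the text.

```typescript
// Hypothetical result shape; adapt it to the real output of the checklist UI.
type Status = "passed" | "failed" | "skipped";

interface ItemResult {
  id: string;
  status: Status;
  note?: string; // developer's note or question, if any
}

interface Triage {
  done: boolean;          // true when no item failed or was skipped
  toFix: ItemResult[];    // failed items: fix them or explain why the behavior is correct
  toReview: ItemResult[]; // skipped items: note the reason given
}

function triageResults(results: ItemResult[]): Triage {
  const toFix = results.filter((r) => r.status === "failed");
  const toReview = results.filter((r) => r.status === "skipped");
  return { done: toFix.length === 0 && toReview.length === 0, toFix, toReview };
}
```

Keeping failed and skipped items in separate lists mirrors the different responses they call for: failures need a fix or an explanation, while skips are feedback on the checklist itself.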

$ARGUMENTS
6 changes: 4 additions & 2 deletions .gitignore
@@ -19,13 +19,15 @@ dist-ssr
# VS Code extension package
*.vsix

# OpenCode plugin build artifacts (generated from hook/review dist)
# OpenCode plugin build artifacts (generated from hook/review/checklist dist)
apps/opencode-plugin/plannotator.html
apps/opencode-plugin/review-editor.html
apps/opencode-plugin/checklist.html

# Pi extension build artifacts (generated from hook/review dist)
# Pi extension build artifacts (generated from hook/review/checklist dist)
apps/pi-extension/plannotator.html
apps/pi-extension/review-editor.html
apps/pi-extension/checklist.html

# Editor directories and files
.vscode/*
18 changes: 18 additions & 0 deletions apps/checklist/index.html
@@ -0,0 +1,18 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>QA Checklist</title>

    <!-- Fonts -->
    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500;600&display=swap" rel="stylesheet">

  </head>
  <body class="min-h-screen antialiased">
    <div id="root" class="h-full"></div>
    <script type="module" src="/index.tsx"></script>
  </body>
</html>
16 changes: 16 additions & 0 deletions apps/checklist/index.tsx
@@ -0,0 +1,16 @@
import React from 'react';
import ReactDOM from 'react-dom/client';
import App from '@plannotator/checklist-editor';
import '@plannotator/checklist-editor/styles';

const rootElement = document.getElementById('root');
if (!rootElement) {
  throw new Error("Could not find root element to mount to");
}

const root = ReactDOM.createRoot(rootElement);
root.render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);
27 changes: 27 additions & 0 deletions apps/checklist/package.json
@@ -0,0 +1,27 @@
{
  "name": "@plannotator/checklist",
  "private": true,
  "version": "0.0.1",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "vite build"
  },
  "dependencies": {
    "@plannotator/checklist-editor": "workspace:*",
    "@plannotator/server": "workspace:*",
    "@plannotator/shared": "workspace:*",
    "@plannotator/ui": "workspace:*",
    "react": "^19.2.3",
    "react-dom": "^19.2.3",
    "tailwindcss": "^4.1.18",
    "@tailwindcss/vite": "^4.1.18"
  },
  "devDependencies": {
    "@vitejs/plugin-react": "^5.0.0",
    "typescript": "~5.8.2",
    "vite": "^6.2.0",
    "vite-plugin-singlefile": "^2.0.3",
    "@types/node": "^22.14.0"
  }
}
37 changes: 37 additions & 0 deletions apps/checklist/vite.config.ts
@@ -0,0 +1,37 @@
import path from 'path';
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import { viteSingleFile } from 'vite-plugin-singlefile';
import tailwindcss from '@tailwindcss/vite';
import pkg from '../../package.json';

export default defineConfig({
  server: {
    port: 3002,
    host: '0.0.0.0',
  },
  define: {
    __APP_VERSION__: JSON.stringify(pkg.version),
  },
  plugins: [react(), tailwindcss(), viteSingleFile()],
  resolve: {
    alias: {
      '@': path.resolve(__dirname, '.'),
      '@plannotator/ui': path.resolve(__dirname, '../../packages/ui'),
      '@plannotator/shared': path.resolve(__dirname, '../../packages/shared'),
      '@plannotator/checklist-editor/styles': path.resolve(__dirname, '../../packages/checklist-editor/index.css'),
      '@plannotator/checklist-editor': path.resolve(__dirname, '../../packages/checklist-editor/App.tsx'),
    }
  },
  build: {
    target: 'esnext',
    assetsInlineLimit: 100000000,
    chunkSizeWarningLimit: 100000000,
    cssCodeSplit: false,
    rollupOptions: {
      output: {
        inlineDynamicImports: true,
      },
    },
  },
});
22 changes: 16 additions & 6 deletions apps/codex/README.md
@@ -1,6 +1,14 @@
# Plannotator for Codex

Code review and markdown annotation are supported today. Plan mode is not yet supported — it requires hooks to intercept the agent's plan submission, which Codex does not currently expose.
## Capabilities

| Feature | How to use |
|---------|------------|
| **Code Review** | `!plannotator review` — Visual diff annotation UI |
| **Markdown Annotation** | `!plannotator annotate path/to/file.md` — Annotate any markdown file |
| **QA Checklist** | Skill: `checklist` — Generate and verify QA checklists interactively |

Plan mode is not yet supported — it requires hooks to intercept the agent's plan submission, which Codex does not currently expose.

## Install

@@ -16,26 +24,28 @@ curl -fsSL https://plannotator.ai/install.sh | bash
irm https://plannotator.ai/install.ps1 | iex
```

This installs the `plannotator` CLI and places skills in `~/.agents/skills/` where Codex discovers them on startup. To install skills only: `npx skills add backnotprop/plannotator`.

## Usage

### Code Review

Run `!plannotator review` to open the code review UI for your current changes:

```
!plannotator review
```

This captures your git diff, opens a browser with the review UI, and waits for your feedback. When you submit annotations, the feedback is printed to stdout.
Captures your git diff, opens a browser with the review UI, and waits for your feedback. Annotations are sent back to the agent as structured feedback.

### Annotate Markdown

Run `!plannotator annotate` to annotate any markdown file:

```
!plannotator annotate path/to/file.md
```

### QA Checklist

The `checklist` skill is invoked by the agent when you ask it to verify changes, create acceptance criteria, or run QA checks. It generates a structured checklist and opens an interactive UI for pass/fail/skip verification.

## Environment Variables

| Variable | Description |
19 changes: 19 additions & 0 deletions apps/factory/README.md
@@ -0,0 +1,19 @@
# Plannotator for Factory

## Install

**macOS / Linux / WSL:**

```bash
curl -fsSL https://plannotator.ai/install.sh | bash
```

## Skills

Skills are installed automatically by the install script above.

Alternatively, install skills only via `npx skills add backnotprop/plannotator`.

| Skill | Description |
|-------|-------------|
| `checklist` | Generate a QA checklist for manual verification of code changes |
13 changes: 10 additions & 3 deletions apps/hook/README.md
@@ -1,6 +1,13 @@
# Plannotator Claude Code Plugin

This directory contains the Claude Code plugin configuration for Plannotator.
## Capabilities

| Feature | How to use |
|---------|------------|
| **Plan Review** | Automatic — intercepts `ExitPlanMode` via hooks |
| **Code Review** | `/plannotator-review` — Visual diff annotation UI |
| **Markdown Annotation** | `/plannotator-annotate path/to/file.md` — Annotate any markdown file |
| **QA Checklist** | `/plannotator-checklist` or skill: `checklist` — Generate and verify QA checklists interactively |

## Prerequisites

@@ -23,7 +30,7 @@ curl -fsSL https://plannotator.ai/install.cmd -o install.cmd && install.cmd && d

---

[Plugin Installation](#plugin-installation) · [Manual Installation (Hooks)](#manual-installation-hooks) · [Obsidian Integration](#obsidian-integration)
[Plugin Installation](#plugin-installation) · [Manual Installation (Hooks)](#manual-installation-hooks) · [Obsidian Integration](#obsidian-integration)

---

@@ -36,7 +43,7 @@ In Claude Code:
/plugin install plannotator@plannotator
```

**Important:** Restart Claude Code after installing the plugin for the hooks to take effect.
**Important:** Restart Claude Code after installing the plugin for the hooks to take effect. Skills are included with the plugin install.

## Manual Installation (Hooks)

12 changes: 12 additions & 0 deletions apps/hook/commands/plannotator-checklist.md
@@ -0,0 +1,12 @@
---
description: Open interactive QA checklist verification UI
allowed-tools: Bash(plannotator:*)
---

## QA Checklist Results

!`plannotator checklist '$ARGUMENTS'`

## Your task

Address the checklist results above. Items marked FAILED need fixes — read the developer's notes and act on them. Items with questions need answers. Items marked SKIPPED were not verified — acknowledge the reason.
2 changes: 1 addition & 1 deletion apps/hook/package.json
@@ -5,7 +5,7 @@
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build && cp dist/index.html dist/redline.html && cp ../review/dist/index.html dist/review.html",
"build": "bun run --cwd ../review build && bun run --cwd ../checklist build && vite build && cp dist/index.html dist/redline.html && cp ../review/dist/index.html dist/review.html && cp ../checklist/dist/index.html dist/checklist.html",
"serve": "bun run server/index.ts"
},
"dependencies": {