From b8e2cc0e1869f8595c01ecbe2424ffc6acc9faaf Mon Sep 17 00:00:00 2001 From: Andy Dalton Date: Wed, 22 Apr 2026 10:49:58 -0400 Subject: [PATCH 1/5] Add e2e workflow and fix shared issues in implement workflow MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a new e2e workflow that takes [QE] Jira stories through a 7-phase pipeline (ingest, plan, revise, code, validate, publish, respond) to produce end-to-end tests. The workflow discovers the target project's test infrastructure at runtime — framework, harness, auxiliary services, conventions, and reference suites — rather than hardcoding assumptions. Key e2e-specific adaptations from the implement workflow: - Scenario-driven planning (ACs map to Describe/Context/It blocks) - Deep infrastructure discovery (10 sub-steps covering harness, services, utilities, conventions, reference suites) - Anti-pattern detection during validation (hardcoded sleeps, brittle selectors, missing cleanup, harness bypass, etc.) 
- Feature-defect-vs-test-bug distinction in deviation handling - No TDD cycle (tests are the deliverable) Also fixes issues inherited by both workflows, identified during skill-reviewer audit: - Replace fragile cross-file step-number references with descriptive names in validate.md and respond.md (both workflows) - Add shared config documentation for .artifacts/prd/config.json in ingest.md (both workflows) Co-Authored-By: Claude Opus 4.6 --- AGENTS.md | 16 ++ README.md | 4 + e2e/README.md | 158 +++++++++++ e2e/SKILL.md | 25 ++ e2e/commands/code.md | 11 + e2e/commands/ingest.md | 11 + e2e/commands/plan.md | 11 + e2e/commands/publish.md | 11 + e2e/commands/respond.md | 11 + e2e/commands/revise.md | 11 + e2e/commands/validate.md | 11 + e2e/guidelines.md | 69 +++++ e2e/skills/code.md | 465 ++++++++++++++++++++++++++++++ e2e/skills/controller.md | 167 +++++++++++ e2e/skills/ingest.md | 528 +++++++++++++++++++++++++++++++++++ e2e/skills/plan.md | 234 ++++++++++++++++ e2e/skills/publish.md | 215 ++++++++++++++ e2e/skills/respond.md | 234 ++++++++++++++++ e2e/skills/revise.md | 132 +++++++++ e2e/skills/validate.md | 326 +++++++++++++++++++++ implement/skills/ingest.md | 2 + implement/skills/respond.md | 4 +- implement/skills/validate.md | 4 +- 23 files changed, 2656 insertions(+), 4 deletions(-) create mode 100644 e2e/README.md create mode 100644 e2e/SKILL.md create mode 100644 e2e/commands/code.md create mode 100644 e2e/commands/ingest.md create mode 100644 e2e/commands/plan.md create mode 100644 e2e/commands/publish.md create mode 100644 e2e/commands/respond.md create mode 100644 e2e/commands/revise.md create mode 100644 e2e/commands/validate.md create mode 100644 e2e/guidelines.md create mode 100644 e2e/skills/code.md create mode 100644 e2e/skills/controller.md create mode 100644 e2e/skills/ingest.md create mode 100644 e2e/skills/plan.md create mode 100644 e2e/skills/publish.md create mode 100644 e2e/skills/respond.md create mode 100644 e2e/skills/revise.md create mode 
100644 e2e/skills/validate.md diff --git a/AGENTS.md b/AGENTS.md index c8cff9f..4766af6 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -16,6 +16,7 @@ This repository contains reusable AI coding workflows that can be installed glob - **prd** — Requirements-to-PRD workflow (ingest, clarify, draft, revise, publish, respond) - **design** — Design-and-decompose workflow (ingest, draft, decompose, revise, publish, respond, sync) - **implement** — Story-to-code workflow (ingest, plan, revise, code, validate, publish, respond) +- **e2e** — Story-to-tests workflow for [QE] stories (ingest, plan, revise, code, validate, publish, respond) - **kcs** — KCS Solution article workflow (gather, draft, validate, handoff) ## Architecture @@ -60,6 +61,7 @@ Workflows write outputs to `.artifacts/{workflow-name}/{context}/`: - **prd**: `.artifacts/prd/{issue-number}/` (01-requirements.md, 02-clarifications.md, 03-prd.md, 04-pr-description.md, 05-review-responses.md) - **design**: `.artifacts/design/{issue-number}/` (01-context.md, 02-design.md, 03-epics.md, 04-stories/epic-{N}-{slug}.md, 04-stories/epic-{N}/story-{NN}-{slug}.md, 05-coverage.md, 06-pr-description.md, 07-review-responses.md, publish-metadata.json, sync-manifest.json) - **implement**: `.artifacts/implement/{jira-key}/` (01-context.md, 02-plan.md, 03-test-report.md, 04-impl-report.md, 05-validation-report.md, 06-pr-description.md, 07-review-responses.md, publish-metadata.json) +- **e2e**: `.artifacts/e2e/{jira-key}/` (01-context.md, 02-plan.md, 03-test-report.md, 04-impl-report.md, 05-validation-report.md, 06-pr-description.md, 07-review-responses.md, publish-metadata.json) - **kcs**: `.artifacts/kcs/{issue-key}/` (01-context.md, 02-kcs-draft.md, 03-handoff-message.md) ## Prerequisites @@ -76,6 +78,7 @@ Workflows write outputs to `.artifacts/{workflow-name}/{context}/`: - **prd**: Jira MCP server — for requirements ingestion; GitHub CLI (`gh`) — for PR creation and review comment management - **design**: Jira MCP server or CLI — 
for `/ingest` (read-only) and `/sync` (creates epics/stories); GitHub CLI (`gh`) — for `/publish` and `/respond` - **implement**: Jira MCP server or CLI — for `/ingest` (read-only); GitHub CLI (`gh`) — for `/publish` and `/respond`; project build/test/lint tooling (discovered during `/ingest`); docs repo (local clone) — for `/ingest` (reads PRD and design document) +- **e2e**: Jira MCP server or CLI — for `/ingest` (read-only); GitHub CLI (`gh`) — for `/publish` and `/respond`; project e2e test tooling (discovered during `/ingest`); docs repo (local clone) — for `/ingest` (reads PRD and design document) - **kcs**: Jira MCP server — for `/gather` (read-only) ## Installation System @@ -193,6 +196,18 @@ For detailed workflow development guidelines (structure, file conventions, testi - Plan evolves during implementation — `02-plan.md` is updated as tasks complete, enabling resumption after interruptions - Code changes happen in the source repo on a feature branch; `/publish` creates a PR in the source repo (not a separate docs repo) +### e2e + +- Requires a Jira [QE] Story (typically created by the design workflow's `/decompose` phase) as input +- Jira is read-only — no phase in this workflow writes to Jira +- Discovery-based infrastructure: e2e test framework, harness, auxiliary services, execution commands, and conventions are discovered during `/ingest` — not hardcoded +- Reference suite pattern: before writing tests, identifies the most similar existing e2e test suite and extracts its patterns (imports, setup/teardown, harness usage, assertions, labels) +- Scenario-driven planning: each acceptance criterion maps to concrete test scenarios with Describe/Context/It nesting, steps, assertions, and labels +- Anti-pattern detection during `/validate`: checks for hardcoded sleeps, brittle selectors, order-dependent tests, shared mutable state, missing cleanup, harness bypass, missing labels, hardcoded values, missing async polling, missing failure diagnostics +- Feature 
defects are not test bugs — if tests reveal a defect in the [DEV] implementation, the test is adjusted (xfail/skip) and the defect is noted in the implementation report +- Plan evolves during implementation — `02-plan.md` is updated as tasks complete, enabling resumption after interruptions +- Code changes happen in the source repo on a feature branch; `/publish` creates a PR in the source repo + ### kcs - Requires Jira MCP server for `/gather` (read-only — never modifies Jira) @@ -265,6 +280,7 @@ ai-workflows/ ├── cve-fix/ ├── design/ ├── docs-writer/ +├── e2e/ ├── implement/ ├── kcs/ ├── prd/ diff --git a/README.md b/README.md index f303712..4b55d02 100644 --- a/README.md +++ b/README.md @@ -24,6 +24,9 @@ Reusable AI coding workflows a team member can install globally or per-project, - **Implement** -- Story-to-code workflow: take a Jira Story, plan the implementation, write contract-based tests and production code via TDD, validate against the project's CI expectations, and manage review via GitHub PRs. See [implement/README.md](implement/README.md). +- **E2E** -- Story-to-tests workflow for [QE] stories: discover the project's e2e testing infrastructure, map acceptance criteria to test scenarios, write e2e test code following the project's patterns and reference suite, validate against anti-patterns and scenario coverage, and manage review via GitHub PRs. + See [e2e/README.md](e2e/README.md). + - **CVE Fix** -- Automated CVE remediation: read vulnerability details from Jira, apply multi-strategy dependency fixes, validate, create pull requests, backport to release branches, and close Jira tickets. Language-agnostic. See [cve-fix/README.md](cve-fix/README.md). 
@@ -103,6 +106,7 @@ Each workflow is intended for a specific project or use case: - **prd** -- teams drafting Product Requirements Documents from Jira features - **design** -- teams creating technical design documents and Jira-ready epic/story breakdowns from PRDs - **implement** -- teams implementing Jira stories produced by the design workflow +- **e2e** -- teams writing e2e tests for [QE] stories produced by the design workflow - **cve-fix** -- teams patching CVEs and updating vulnerable dependencies from Jira vulnerability tickets - **ai-ready** -- onboarding any project for AI agents by generating AGENTS.md - **kcs** -- teams writing KCS Solution articles for known issues with workarounds diff --git a/e2e/README.md b/e2e/README.md new file mode 100644 index 0000000..21504cb --- /dev/null +++ b/e2e/README.md @@ -0,0 +1,158 @@ +# E2E Test Workflow + +A story-to-tests workflow that takes a Jira [QE] Story, discovers the project's e2e testing infrastructure, plans test scenarios mapped to acceptance criteria, writes e2e test code following the project's patterns, validates against anti-patterns and scenario coverage, and manages review via GitHub PRs. 
+ +## Prerequisites + +| Tool | Required | Purpose | +|------|----------|---------| +| Jira access (MCP or CLI) | For `/ingest` | Fetch [QE] Story issue details | +| GitHub CLI (`gh`) | For `/publish`, `/respond` | Create PRs, post review comments | +| Git | Yes | Branch management, commits | +| Project e2e test tooling | Yes | Discovered during `/ingest` from project's AGENTS.md, Makefile, CI workflows | +| Docs repo (local clone) | For `/ingest` | Read PRD and design document for upstream context | + +## Phases + +| Phase | Command | Purpose | Artifact(s) | +|-------|---------|---------|-------------| +| Ingest | `/ingest` | Fetch [QE] story, verify [DEV] dependencies, explore e2e infrastructure | `01-context.md` | +| Plan | `/plan` | Map ACs to test scenarios, select reference suite | `02-plan.md` | +| Revise | `/revise` | Incorporate feedback into the test plan | Updated `02-plan.md` | +| Code | `/code` | Write e2e test code following discovered patterns | `03-test-report.md`, `04-impl-report.md` | +| Validate | `/validate` | Run tests, check anti-patterns, verify scenario coverage | `05-validation-report.md` | +| Publish | `/publish` | Push branch, create draft PR | `06-pr-description.md` | +| Respond | `/respond` | Address reviewer comments | `07-review-responses.md` | + +## Typical Flow + +```text +/ingest EDM-5678 + -> fetches [QE] story from Jira + -> verifies [DEV] dependencies are merged + -> loads design document and PRD context + -> explores e2e test infrastructure (framework, harness, patterns) + -> selects reference suite as pattern source + -> discovers validation profile (test execution, lint commands) + -> writes .artifacts/e2e/EDM-5678/01-context.md + +/plan + -> maps each acceptance criterion to test scenarios + -> selects reference suite and documents patterns to follow + -> designs test file structure (suite file + test files) + -> plans harness method usage and auxiliary services + -> breaks work into ordered tasks (suite file first, then 
scenarios) + -> writes 02-plan.md + +/revise (optional, repeatable) + -> user reviews plan, requests changes + -> plan updated, consistency maintained + +/code + -> creates feature branch + -> for each task: read reference -> write test code -> run tests -> review -> commit + -> updates 02-plan.md with task completion status + -> writes 03-test-report.md, 04-impl-report.md + +/validate + -> runs e2e tests (scoped to new suite) + -> checks for anti-patterns (hardcoded sleeps, missing cleanup, etc.) + -> verifies every AC has a passing test scenario + -> checks for regressions in adjacent suites + -> writes 05-validation-report.md + +/publish + -> pushes feature branch + -> creates draft GitHub PR with Jira link + -> writes 06-pr-description.md + +/respond (repeatable) + -> fetches PR review comments + -> proposes responses (user approves before posting) + -> applies code changes if needed + -> writes 07-review-responses.md +``` + +## Artifacts + +All artifacts are stored in `.artifacts/e2e/{jira-key}/`. + +```text +.artifacts/e2e/EDM-5678/ + 01-context.md (story context, e2e infrastructure, validation profile) + 02-plan.md (scenario breakdown, AC coverage -- updated as tasks complete) + 03-test-report.md (tests written, harness methods used) + 04-impl-report.md (changes, commits, deviations, discoveries) + 05-validation-report.md (check results, anti-patterns, regressions) + 06-pr-description.md (PR body) + 07-review-responses.md (review comment log) + publish-metadata.json (PR number, branch, URL) +``` + +## Key Design Decisions + +### Discovery-Based Infrastructure + +The workflow does not hardcode language-specific commands or framework assumptions. During `/ingest`, it discovers the project's e2e testing framework, harness, auxiliary services, execution commands, and conventions. This makes the workflow portable across projects using different testing stacks (Ginkgo, Playwright, pytest, Cypress, etc.). 
+ +### Reference Suite Pattern + +Before writing any test code, the workflow identifies the most similar existing e2e test suite in the project and extracts its patterns: imports, setup/teardown, harness usage, assertion style, labels, and cleanup. New tests follow these patterns exactly, ensuring consistency with the project's existing test base. + +### Scenario-Driven Planning + +Unlike implementation planning (task-driven), e2e test planning is scenario-driven. Each acceptance criterion maps to one or more concrete test scenarios with specific Describe/Context/It nesting, steps, assertions, and labels. This ensures every AC is verifiably covered. + +### Anti-Pattern Detection + +Validation checks for 10 common e2e test anti-patterns: hardcoded sleeps, brittle selectors, order-dependent tests, shared mutable state, missing cleanup, harness bypass, missing labels, hardcoded values, missing async polling, and missing failure diagnostics. Each detected anti-pattern is fixed during validation. + +### Feature Defects Are Not Test Bugs + +If e2e tests reveal that the feature behaves differently than the acceptance criteria describe, that is a defect in the [DEV] implementation, not a test failure. The test is adjusted (xfail/skip) and the defect is noted in the implementation report. The e2e workflow does not fix feature code. + +### Incremental Commits + +Each logical unit of work gets its own commit, following the project's commit format (discovered during `/ingest`). Each commit should be independently meaningful. + +### Plan as Living Document + +`02-plan.md` is updated during `/code` as tasks are completed. On re-invocation (e.g., after context limits or interruptions), the plan shows which tasks are done and which remain. 
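
As an illustration of resuming from the plan, the sketch below finds the first task that is still pending. It assumes a hypothetical layout in which each task is a `### Task N: ...` heading followed by a `**Status:**` line; the real layout of `02-plan.md` is defined by the plan skill and may differ. The function name `first_pending_task` is made up for this example.

```shell
# Sketch only: assumes tasks are written as "### Task N: title" headings,
# each followed by a line such as "**Status:** Pending" or "**Status:** `Done`".
# Prints the heading of the first pending task; returns 1 if none is pending.
first_pending_task() {
    awk '
        /^### Task / { heading = $0 }                 # remember the latest task heading
        /^\*\*Status:\*\* *`?Pending`?/ {             # first pending status line wins
            print heading
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    ' "$1"
}
```

A call such as `first_pending_task .artifacts/e2e/EDM-5678/02-plan.md` would then name the task to resume from, provided the plan follows the assumed layout.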
+ +## Directory Structure + +```text +e2e/ +├── SKILL.md # Workflow entry point +├── guidelines.md # Behavioral rules and guardrails +├── README.md # This file +├── skills/ +│ ├── controller.md # Phase dispatcher and transitions +│ ├── ingest.md # Fetch story, explore e2e infrastructure +│ ├── plan.md # Map ACs to test scenarios +│ ├── revise.md # Incorporate plan feedback +│ ├── code.md # Write e2e test code +│ ├── validate.md # Run tests, check anti-patterns +│ ├── publish.md # Create GitHub PR +│ └── respond.md # Address review comments +└── commands/ + ├── ingest.md # /ingest command + ├── plan.md # /plan command + ├── revise.md # /revise command + ├── code.md # /code command + ├── validate.md # /validate command + ├── publish.md # /publish command + └── respond.md # /respond command +``` + +## Getting Started + +```bash +# Install the workflow +./install.sh claude --workflows e2e + +# Or install all workflows +./install.sh all +``` + +Then in your project, run the `e2e` workflow's `ingest` command for your [QE] Jira story (e.g., EDM-5678). diff --git a/e2e/SKILL.md b/e2e/SKILL.md new file mode 100644 index 0000000..569ad00 --- /dev/null +++ b/e2e/SKILL.md @@ -0,0 +1,25 @@ +--- +name: e2e +description: >- + Story-to-e2e-test workflow that takes a Jira [QE] Story, discovers the + project's e2e testing infrastructure, plans test scenarios, writes e2e + tests matching project conventions, validates them, and manages review + via GitHub PRs. Use when implementing [QE] stories produced by the + design workflow. + Activated by commands: /ingest, /plan, /revise, /code, /validate, /publish, /respond. +--- +# E2E Test Workflow Orchestrator + +## Quick Start + +1. If the user invoked a specific command (e.g., `/plan`, `/code`), read + `commands/{command}.md` and follow it. +2. 
Otherwise, read `skills/controller.md` to load the workflow controller: + - If the user provided a Jira issue key or URL, execute the `/ingest` phase + - Otherwise, execute the first phase the user requests + +If a step fails or produces unexpected output (e.g., Jira MCP errors, test +failures, build errors), stop and report the error to the user. Do not +advance to the next phase. Offer to retry the failed step or escalate. + +For principles, hard limits, safety, quality, and escalation rules, see `guidelines.md`. diff --git a/e2e/commands/code.md b/e2e/commands/code.md new file mode 100644 index 0000000..e5c5f45 --- /dev/null +++ b/e2e/commands/code.md @@ -0,0 +1,11 @@ +--- +name: e2e:code +description: "Write e2e test code following discovered patterns, committing incrementally" +--- +# /code + +Read `../skills/controller.md` and follow it. + +Dispatch the **code** phase. Context: + +$ARGUMENTS diff --git a/e2e/commands/ingest.md b/e2e/commands/ingest.md new file mode 100644 index 0000000..52f143f --- /dev/null +++ b/e2e/commands/ingest.md @@ -0,0 +1,11 @@ +--- +name: e2e:ingest +description: "Fetch [QE] story, verify dependencies, explore e2e infrastructure, build test-execution profile" +--- +# /ingest + +Read `../skills/controller.md` and follow it. + +Dispatch the **ingest** phase. Context: + +$ARGUMENTS diff --git a/e2e/commands/plan.md b/e2e/commands/plan.md new file mode 100644 index 0000000..a6f8b56 --- /dev/null +++ b/e2e/commands/plan.md @@ -0,0 +1,11 @@ +--- +name: e2e:plan +description: "Map acceptance criteria to e2e test scenarios, select reference suite, design test structure" +--- +# /plan + +Read `../skills/controller.md` and follow it. + +Dispatch the **plan** phase. 
Context: + +$ARGUMENTS diff --git a/e2e/commands/publish.md b/e2e/commands/publish.md new file mode 100644 index 0000000..d6f2e55 --- /dev/null +++ b/e2e/commands/publish.md @@ -0,0 +1,11 @@ +--- +name: e2e:publish +description: "Push feature branch and create draft PR for e2e tests" +--- +# /publish + +Read `../skills/controller.md` and follow it. + +Dispatch the **publish** phase. Context: + +$ARGUMENTS diff --git a/e2e/commands/respond.md b/e2e/commands/respond.md new file mode 100644 index 0000000..5019348 --- /dev/null +++ b/e2e/commands/respond.md @@ -0,0 +1,11 @@ +--- +name: e2e:respond +description: "Fetch and address PR reviewer comments on e2e test code" +--- +# /respond + +Read `../skills/controller.md` and follow it. + +Dispatch the **respond** phase. Context: + +$ARGUMENTS diff --git a/e2e/commands/revise.md b/e2e/commands/revise.md new file mode 100644 index 0000000..ee37a59 --- /dev/null +++ b/e2e/commands/revise.md @@ -0,0 +1,11 @@ +--- +name: e2e:revise +description: "Incorporate user feedback into the e2e test plan" +--- +# /revise + +Read `../skills/controller.md` and follow it. + +Dispatch the **revise** phase. Context: + +$ARGUMENTS diff --git a/e2e/commands/validate.md b/e2e/commands/validate.md new file mode 100644 index 0000000..dc6df13 --- /dev/null +++ b/e2e/commands/validate.md @@ -0,0 +1,11 @@ +--- +name: e2e:validate +description: "Run e2e tests, check for anti-patterns, verify scenario coverage, assess PR readiness" +--- +# /validate + +Read `../skills/controller.md` and follow it. + +Dispatch the **validate** phase. Context: + +$ARGUMENTS diff --git a/e2e/guidelines.md b/e2e/guidelines.md new file mode 100644 index 0000000..c352d9c --- /dev/null +++ b/e2e/guidelines.md @@ -0,0 +1,69 @@ +# E2E Test Workflow Guidelines + +## Principles + +- The e2e tests must validate the **user-facing behaviors** described in the story's acceptance criteria. Each AC maps to one or more concrete test scenarios. 
+- **E2e tests exercise the system from the outside.** They validate observable outcomes through the project's test harness, not internal component contracts. Write tests that a QE engineer would write — scenario-driven, using the project's actual tools and infrastructure. +- **Follow the project's existing e2e test patterns.** Read the most similar existing test suite before writing new tests. Match the harness usage, setup/teardown patterns, naming conventions, labels, and assertion style. +- Follow the **project's commit format** as discovered during `/ingest` and recorded in the validation profile. Commit one logical unit of work per commit — typically one commit per plan task. Don't batch everything into a single commit, but don't create a commit per file either. +- Each completed story must leave the test suite in a **stable state**. All new tests pass, no regressions in existing tests. +- The test plan is a **living document**. Update `02-plan.md` as tasks are completed so it reflects current progress. +- **Discover, don't assume.** The project's e2e test framework, harness, auxiliary services, execution commands, and conventions are discovered during `/ingest` and recorded in the context document. Never hardcode language-specific or project-specific assumptions. +- **Shipped artifacts describe the final state, not the journey.** Code comments, commit messages, PR descriptions, and test names describe what the tests verify now — not the process of getting there. Do not reference abandoned approaches, intermediate failures introduced and fixed during the same session, or prior states that no longer exist. Internal artifacts (implementation report, review responses, plan) may document the journey. + +## Hard Limits + +- No fabricated tests. Every test must trace to a story acceptance criterion or explicit user direction. +- No auto-advancing between phases. Always wait for the user. 
+- No publishing (creating PRs, pushing branches) without explicit user approval. +- No Jira modifications. This workflow is read-only with respect to Jira. +- **No scope creep.** Do not write tests beyond the story's acceptance criteria, refactor existing test code, or "improve" test infrastructure you didn't need to change. If you discover something that should be fixed, note it in the implementation report — don't fix it silently. +- **No shallow tests.** Do not write tests that assert only that the system doesn't crash or return a non-error status. Every test must verify a specific behavioral outcome described in the acceptance criteria. +- **No duplicate coverage.** E2e tests validate user-facing workflows. Do not re-test unit-level or integration-level behavior that is already covered by the `[DEV]` story's tests. E2e tests exercise the full system from end to end — they are complementary to, not replacements for, lower-level tests. +- No committing to `main` directly. Use a feature branch. +- No force-push or destructive git operations. + +## Safety + +- Show your work before finalizing. After `/plan`, present the test scenario breakdown for review — do not assume it's ready. +- Before `/code`, confirm the feature branch name and starting point with the user. +- Before `/publish`, confirm the PR target branch and description with the user. +- **Read before writing.** Before writing tests for a suite, read existing tests in similar suites to match patterns. Read the harness methods you plan to use. +- **Deviation transparency.** If during `/code` you encounter something unexpected (a feature defect, a harness limitation, an assumption that doesn't hold), report it. Apply deviation rules (see `skills/code.md`) but never silently change approach. +- Flag assumptions explicitly. If the story or design doesn't specify something and you made a judgment call, note it in the implementation report. 
+ +## Quality + +- Follow the project's `AGENTS.md` and `CLAUDE.md` for testing conventions and contribution guidelines. Also follow any test-specific documentation (e.g., `test/AGENTS.md`, `test/GUIDELINES.md`). +- **Scenario coverage.** E2e test quality is measured by scenario coverage — do the tests exercise every acceptance criterion? — and by resilience — will the tests break only when real behavior changes, not when unrelated implementation details change? +- **Anti-pattern avoidance.** Do not introduce: + - Hardcoded sleeps or fixed delays (use polling/retry mechanisms) + - Brittle selectors (use semantic locators, harness methods) + - Order-dependent tests (each test must be independently runnable) + - Shared mutable state between tests (use per-test isolation) + - Missing cleanup (follow the project's teardown patterns) + - Harness bypass (use the project's test harness, not ad-hoc API calls) +- Run the project's e2e test suite (scoped to the new tests) before considering the work complete. +- Self-review test code before presenting. Check for: unused imports, dead code, debug artifacts, hardcoded values that should be constants, inconsistencies with the reference suite's patterns. + +## Escalation + +Stop and request human guidance when: + +- Story acceptance criteria are ambiguous or contradictory +- The `[DEV]` story dependencies are unmerged — the feature under test may not exist yet +- The e2e test infrastructure is broken or unavailable +- The test requires an environment capability that doesn't exist (e.g., TPM, specific VM type, identity provider) +- A test scenario requires harness methods that don't exist and adding them is outside the story's scope +- The feature behaves differently than the acceptance criteria describe (potential defect in the `[DEV]` implementation) +- Confidence in the test approach is low + +## Working With the Project + +This workflow gets deployed into different projects. 
Respect the target project: + +- Read and follow the project's own `AGENTS.md` or `CLAUDE.md` files +- Read and follow any test-specific documentation (test directory README, AGENTS.md, GUIDELINES.md) +- Adopt the project's e2e testing conventions, harness patterns, and commit message format +- Use the project's e2e test execution commands as discovered during `/ingest` +- Respect the project's CI/CD pipeline expectations diff --git a/e2e/skills/code.md b/e2e/skills/code.md new file mode 100644 index 0000000..24c203b --- /dev/null +++ b/e2e/skills/code.md @@ -0,0 +1,465 @@ +--- +name: code +description: Write e2e test code following discovered patterns, committing incrementally. +--- + +# Code E2E Tests Skill + +You are a principal QE engineer. Your job is to execute the test plan by +writing e2e test code following the project's conventions and the reference +suite's patterns, committing incrementally. + +## Your Role + +Work through the plan's task breakdown, writing e2e test code for each +task. Follow the reference suite's patterns exactly — imports, harness +usage, assertion style, labels, setup/teardown. Commit each logical unit +of work independently. + +## Critical Rules + +- **Follow the plan.** Execute tasks in the order specified in `02-plan.md`. If you need to deviate, update the plan and note why. +- **Read before writing.** Before writing any test code, read the reference suite files and existing tests in similar suites. Match their patterns. +- **Use the project's harness.** Do not make ad-hoc API calls, CLI invocations, or direct infrastructure access. Use the harness methods and test utilities the project provides. +- **One commit per plan task.** Each commit must follow the project's commit format (from the validation profile) and be independently meaningful. +- **Update the plan.** Mark tasks as completed in `02-plan.md` as you go. On re-invocation, check the plan to see what's already done. 
+- **No scope creep.** Do not write tests beyond the story's acceptance criteria, refactor existing test code, or improve test infrastructure. Note discoveries in the implementation report. + +## Process + +### Step 1: Read the Plan and Context + +Read these files: +1. `.artifacts/e2e/{jira-key}/02-plan.md` (test plan) +2. `.artifacts/e2e/{jira-key}/01-context.md` (story context and e2e infrastructure) +3. The project's `AGENTS.md` and/or `CLAUDE.md` (coding conventions) + +If the plan doesn't exist, tell the user that `/plan` should be run first. + +### Step 2: Determine Starting Point + +Check the plan for task completion status: +- Tasks with **Status:** `Done` are complete — skip them +- The first task with **Status:** `Pending` is where to start +- On first invocation, all tasks will be Pending — start with Task 1 + +Read the `## Branch` section of `02-plan.md` to get the planned branch +name and base. Then check the current branch: + +```bash +git branch --show-current +``` + +If the user is already on a feature branch (not `main`, `master`, or the +plan's base branch), ask whether to use the current branch or create the +planned branch. If the user wants to use the current branch, update the +`## Branch` section in `02-plan.md` to reflect the actual branch name. + +Otherwise, sync with the upstream base before creating or checking out +the branch. + +Check the **Repository Topology** section of `01-context.md`. Read +`{owner}/{repo}` from the **Origin** field. If the repo is a fork, sync +the fork's base branch with upstream first: + +```bash +gh repo sync {owner}/{repo} --branch {base} +``` + +If `gh repo sync` fails, warn the user that the fork may be behind +upstream. + +Then fetch, regardless of topology: + +```bash +git fetch origin +``` + +If the fetch fails (network issues, authentication expired), warn the user +that remote branch status cannot be verified. Proceed with local-only +information and note the caveat. 
+ +Check if the planned branch already exists: + +```bash +git branch --list {branch-name} +``` + +```bash +git branch -r --list origin/{branch-name} +``` + +Depending on results: + +```bash +# If branch exists locally: +git checkout {branch-name} + +# If branch does not exist locally but exists on remote: +git checkout -b {branch-name} origin/{branch-name} + +# If branch doesn't exist at all — create from the fetched base: +git checkout -b {branch-name} origin/{base} +``` + +If the branch already existed (locally or on remote), sync it with +the base branch. Before syncing, verify the working tree is clean: + +```bash +git status --porcelain +``` + +If output is non-empty, report the uncommitted files to the user and +ask how to proceed (stash, commit, or abort) before any rebase/merge +operation. + +Check whether a PR has already been created by looking for +`.artifacts/e2e/{jira-key}/publish-metadata.json`. + +If no PR exists yet, rebase: + +```bash +git rebase origin/{base} +``` + +If a PR already exists, merge instead — rebasing a branch with an +open PR requires a force-push, which orphans review comments and +disrupts reviewers: + +```bash +git merge origin/{base} +``` + +If conflicts occur during either operation, follow the same conflict +handling as Step 3g (stop, show conflicts, offer to resolve, proceed +only with user approval). + +Verify the starting point: + +```bash +git log --oneline -5 +``` + +### Step 3: Execute Tasks + +For each task in the plan, follow this cycle. + +#### 3a: Read Reference and Affected Files + +Before making any changes, read: +- The reference suite file(s) noted in the plan (re-read to reinforce patterns) +- Any harness methods the task will use (verify signatures and behavior) +- Existing test files in neighboring suites (to match patterns) +- Any test utilities or constants the task will use + +#### 3b: Write Test Code + +Write the test code for this task, following the reference suite's patterns +exactly: + +1. 
**Match the import block:** Use the same imports as the reference suite. + Include framework imports (e.g., `. "github.com/onsi/ginkgo/v2"`), + harness imports, and utility imports. +2. **Match the block structure:** Use the project's Describe/Context/It + nesting (or equivalent). Match indentation, block naming conventions, + and label placement. +3. **Use `By()` step descriptions** (or the project's equivalent) for + human-readable test flow documentation. +4. **Use the harness.** Call harness methods for all system interactions. + Do not create ad-hoc HTTP clients, CLI wrappers, or API calls. +5. **Use test utilities.** Use the project's constants (timeouts, polling + intervals, resource type strings). Do not hardcode values. +6. **Use async assertions.** Use `Eventually`/`Consistently` with + timeout and polling (or the project's equivalent) for any operation + that may not complete immediately. Never use `time.Sleep` or + equivalent fixed delays. +7. **Follow test isolation patterns.** Generate unique test IDs using the + project's mechanism. Use those IDs for resource names. Ensure cleanup + happens via AfterEach or deferred functions. +8. **Apply labels.** Follow the label convention from the context document + (e.g., ticket ID, component tag). + +For the suite file (typically Task 1): +- Follow the reference suite's BeforeSuite, BeforeEach, AfterEach, + AfterSuite structure exactly +- Start only the auxiliary services the plan identified as needed +- Use the same harness initialization pattern +- Use the same login/auth pattern + +#### 3c: Run Tests + +Run the e2e tests scoped to the new suite. Use the scoped execution +command from the **E2E Test Execution** section of `01-context.md`. 
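
One way to look up the scoped command is to read it straight from the context document. The sketch below assumes `01-context.md` follows the template that `/ingest` writes, with a bullet of the form ``- **Run specific suite:** `{command}` ``. The helper name `scoped_suite_command` is hypothetical.

```bash
# Print the command recorded on the "Run specific suite" bullet of a
# context file. Prints nothing if the bullet is missing or not in the
# assumed "- **Run specific suite:** `command`" form.
scoped_suite_command() {
  local context_file="$1"
  sed -n 's/^- \*\*Run specific suite:\*\* `\(.*\)`[[:space:]]*$/\1/p' \
    "$context_file" | head -n 1
}

# Hypothetical usage ({jira-key} is a placeholder, as elsewhere in this file):
#   cmd="$(scoped_suite_command ".artifacts/e2e/{jira-key}/01-context.md")"
#   [ -n "$cmd" ] || echo "No scoped command recorded; ask the user." >&2
```

An empty result is better reported to the user than replaced with a guess, since the execution command is discovered per project.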
+ +If tests fail, diagnose **where** the problem is before fixing: + +| Diagnosis | Symptom | Action | +|-----------|---------|--------| +| **Test code is wrong** | Wrong harness method, bad assertion, incorrect setup, wrong import | Fix the test code | +| **Test expectation is wrong** | Feature behaves differently than the AC implies | Verify against the AC — if the AC is ambiguous, note in report and escalate to user | +| **Feature has a defect** | Feature doesn't match its own [DEV] story's AC — a genuine bug | Note as a discovery in the implementation report — do NOT fix the feature (out of scope). The test may need to be adjusted to skip or xfail if the defect blocks it. | +| **Harness limitation** | Harness doesn't expose a method needed for the scenario | If a simple helper within the test file suffices, write it. If a harness change is needed, escalate — that's out of scope. | +| **Environment issue** | Services not running, VM unavailable, network error | Report to user — this is not a test code problem | + +#### 3d: Lint and Format + +Before committing, run the fast quality checks on the files changed by +this task. Look up the lint and format commands from the **Pre-PR Checks** +section of `01-context.md` (entries labeled "lint", "format", "vet", or +similar). + +Run them scoped to the affected files or packages where possible. Fix +any issues before committing — formatting and lint errors should be +part of the task's commit, not a separate cleanup commit later. + +If the lint tool reports errors and all error locations are in files +you did not modify in this task, the errors are from pre-existing code +— skip the lint for this commit and note the skip in the implementation +report (Deviations section). If errors appear in files you changed, fix +them before committing. The full validation suite in `/validate` will +catch any remaining issues once all tasks are complete. 
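
The "pre-existing errors" rule above amounts to asking whether any lint finding points at a file changed in this task. A minimal sketch, assuming the lint tool prints findings as `path:line:col: message` and that paths in the output match the changed-file paths exactly. The function name is hypothetical, and real lint output may need normalizing first (for example, stripping a leading `./`).

```bash
# Succeed (exit 0) when at least one lint finding on stdin refers to a
# file passed as an argument. Findings are assumed to look like
# "path:line:col: message"; everything before the first colon is the path.
lint_errors_in_changed_files() {
  local finding path changed
  while IFS= read -r finding; do
    path="${finding%%:*}"
    for changed in "$@"; do
      if [ "$path" = "$changed" ]; then
        return 0
      fi
    done
  done
  return 1
}

# Hypothetical usage, with lint output captured in lint.out and the
# task's changed files listed explicitly:
#   if lint_errors_in_changed_files path/to/changed_test.go < lint.out; then
#     echo "Fix lint errors in changed files before committing."
#   else
#     echo "Findings are pre-existing; note the skip in the report."
#   fi
```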
+ +Do not run the full validation suite here — save expensive checks +for `/validate`. + +#### 3e: Code Review + +Stage the task's changes first — the review and commit steps both +operate on the staged diff: + +```bash +git add {specific files} +``` + +Then run a code review on the staged changes. The review method is +discovered, not hardcoded — use the first tier that applies: + +**Tier 1: Project-defined review tooling.** Check the project's +`AGENTS.md` or `CLAUDE.md` for automated code review tooling — CLI +commands or scripts that run a review. If review tooling is found, run it. + +**Tier 2: Independent review agent.** If no project-specific review +tooling exists, spawn a code review subagent with a fresh context +window. Give it: + +- The project's `AGENTS.md` / `CLAUDE.md` and test documentation +- The staged diff for this task (`git diff --cached`) +- The reference suite files for pattern comparison +- The test scenarios being implemented (from the plan task) + +Do **not** give it the implementing agent's reasoning or conversation +history — the value comes from independent eyes. + +The subagent should review as a senior QE engineer familiar with the +project's testing conventions, focusing on: correct harness usage, +assertion completeness, anti-patterns (hardcoded sleeps, brittle +selectors, missing cleanup), label conventions, and test isolation. + +**Tier 3: Structured self-review.** If the runtime does not support +spawning subagents, fall back to a structured self-review. 
Re-read the +staged diff and check for: + +- Anti-patterns: hardcoded sleeps, shared mutable state, missing cleanup +- Harness bypass: direct API calls instead of harness methods +- Missing async polling: synchronous assertions on async operations +- Hardcoded values: inline strings/numbers instead of constants +- Pattern drift: deviations from the reference suite's conventions +- Missing labels: tests without CI-filtering labels + +**Triage findings.** Evaluate each finding on its technical merit. +Fix findings that add value. Dismiss findings that don't with a brief +rationale. If any fixes were applied, re-stage the affected files before +proceeding to commit. Note any dismissed findings in the implementation +report (Discoveries section). + +#### 3f: Commit + +The changes are already staged from Step 3e. Create the commit: + +```bash +git commit -m "{JIRA-KEY}: {task description}" +``` + +Follow the commit format from the **Commit Format** section of +`01-context.md`. The commit message must: +- Use the discovered format +- Describe what the tests verify, not the development journey +- Be independently meaningful + +If the commit fails (e.g., rejected by pre-commit hooks), diagnose and +fix the issue before proceeding to the sync step. + +#### 3g: Sync with Base + +After committing, rebase onto the latest base branch to keep subsequent +tasks building against head-of-line. + +Check the **Repository Topology** section of `01-context.md`. Read +`{owner}/{repo}` from the **Origin** field (this is the fork, not the +upstream). If the repo is a fork, sync the fork with upstream first: + +```bash +gh repo sync {owner}/{repo} --branch {base} +``` + +`gh repo sync` is called on the **origin** (fork) repo — it syncs the +fork's branch with the upstream parent. If `gh repo sync` fails, warn +the user that fork sync failed and continue — this is best-effort +during development. Staleness will be caught during `/validate`. 
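
The best-effort behavior described here (warn, then continue) can be sketched as a small wrapper. The name `best_effort` is hypothetical and the warning text is illustrative. The wrapper suits only steps this workflow explicitly marks as best-effort, not operations such as rebase or merge, where a failure must stop the task.

```bash
# Run a command; if it fails, print a warning to stderr and return 0 so
# the caller continues with the next step.
best_effort() {
  if ! "$@"; then
    echo "warning: '$*' failed; continuing (fork may be behind upstream)" >&2
  fi
  return 0
}

# Hypothetical usage with the command shown above; $owner, $repo, and
# $base are placeholders read from 01-context.md and 02-plan.md:
#   best_effort gh repo sync "$owner/$repo" --branch "$base"
```

Sending the warning to stderr keeps it visible to the user without mixing it into any command output that a later step might parse.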
+ +Then, regardless of topology: + +```bash +git fetch origin +``` + +Check whether new commits exist on the base branch: + +```bash +git rev-list --count HEAD..origin/{base} +``` + +If the count is 0, skip the rebase/merge and proceed to Step 3h. + +If new commits exist, check whether a PR has already been created by +looking for `.artifacts/e2e/{jira-key}/publish-metadata.json`. + +**If no PR exists yet**, rebase: + +```bash +git rebase origin/{base} +``` + +**If a PR already exists**, merge instead: + +```bash +git merge origin/{base} +``` + +If the operation applies cleanly, re-run the task's tests to confirm +they still pass against the updated base. If tests fail, diagnose using +the failure routing in Step 3c. + +**If there are conflicts:** + +1. Stop and report the conflicting files to the user +2. Show the conflict markers so the user can see what's colliding +3. Offer to resolve the conflicts — describe what you would do +4. Proceed only after the user approves the resolution (or resolves it + themselves) +5. After resolution, run `git rebase --continue` or commit the merge + resolution as appropriate, then re-run the task's tests + +#### 3h: Update Plan + +Mark the task as completed in `02-plan.md`: +- Change `Pending` to `Done` + +Update the status immediately after each task, not in bulk at the end. +This is the checkpoint that allows the session to resume correctly if +interrupted. + +### Step 4: Deviation Rules + +During test implementation, you may encounter unexpected situations: + +| Situation | Action | Approval | +|-----------|--------|----------| +| Feature defect found — test reveals a bug in the [DEV] implementation | Note in report as a discovery. Adjust test to document expected behavior (may xfail or skip with reason). Do NOT fix the feature. | Auto | +| Harness doesn't expose needed method | Write a local helper within the test file if simple. If a harness change is needed, escalate. 
| Auto (local helper) / Required (harness change) | +| Test scenario is significantly simpler than planned | Note in report, continue | Auto | +| Test scenario is significantly more complex than planned | **Stop and ask the user** — the story may need re-scoping | Required | +| Reference suite pattern doesn't apply to this scenario | Adapt minimally, note deviation in report | Auto | +| Story guidance contradicts current system behavior | **Stop and ask the user** | Required | + +### Step 5: Write Reports + +After all tasks are complete (or if interrupted), write: + +**Test report** (`.artifacts/e2e/{jira-key}/03-test-report.md`): + +```markdown +# E2E Test Report — {jira-key} + +## Tests Written + +| Test File | Scenarios | Acceptance Criteria | +|-----------|-----------|---------------------| +| {path} | {count} | {which ACs} | + +## Scenario Summary + +| Scenario | Description | Labels | +|----------|-------------|--------| +| {name} | {what it validates} | {labels} | + +## Harness Methods Used + +| Method | Purpose | +|--------|---------| +| `{method}` | {what it does} | + +## Auxiliary Services Required + +{Which services must be running for these tests. If the services are + self-starting (testcontainers), note that.} + +## Notes + +{Any qualitative observations about test coverage, gaps, or patterns.} +``` + +**Implementation report** (`.artifacts/e2e/{jira-key}/04-impl-report.md`): + +```markdown +# Implementation Report — {jira-key} + +## Changes Summary + +| File | Action | Description | +|------|--------|-------------| +| {path} | {created/modified} | {brief description} | + +## Commits + +| Hash | Message | +|------|---------| +| {short hash} | {commit message} | + +## Deviations from Plan + +{Any deviations from the test plan, with rationale. 
+ If none: "No deviations from the test plan."} + +## Discoveries + +{Notable findings during test implementation: + - Feature defects found (bugs in the [DEV] implementation) + - Harness gaps (missing methods or capabilities) + - Missing test infrastructure + - Pre-existing test issues in adjacent suites + If none: "No notable discoveries."} + +## Status + +{Complete / Incomplete — if incomplete, note which tasks remain and why.} +``` + +## Output + +- Test files in the source repo (on the feature branch) +- Incremental commits (following the project's commit format) +- `.artifacts/e2e/{jira-key}/02-plan.md` (updated with task status) +- `.artifacts/e2e/{jira-key}/03-test-report.md` +- `.artifacts/e2e/{jira-key}/04-impl-report.md` + +## When This Phase Is Done + +Report your results: +- Tasks completed and their commits +- Test scenarios written with AC coverage summary +- Any deviations from the plan +- Any discoveries (especially feature defects) +- Overall implementation status + +Then **re-read the controller** (`controller.md`) for next-step guidance. diff --git a/e2e/skills/controller.md b/e2e/skills/controller.md new file mode 100644 index 0000000..fa0cbdc --- /dev/null +++ b/e2e/skills/controller.md @@ -0,0 +1,167 @@ +--- +name: controller +description: Top-level workflow controller that manages phase transitions for e2e test implementation. +--- + +# E2E Test Workflow Controller + +You are the workflow controller. Your job is to manage the e2e test +implementation workflow by executing phases and handling transitions +between them. + +## Phases + +1. **Ingest** (`/ingest`) — `ingest.md` + Fetch the [QE] Jira story, verify [DEV] dependencies are merged, explore + the project's e2e test infrastructure, and build a test-execution profile. + +2. **Plan** (`/plan`) — `plan.md` + Map acceptance criteria to e2e test scenarios, select the reference suite + pattern, and design the test file structure. + +3. 
**Revise** (`/revise`) — `revise.md` + Incorporate user feedback into the test plan. Repeatable. + +4. **Code** (`/code`) — `code.md` + Write e2e test code following discovered patterns, committing incrementally. + +5. **Validate** (`/validate`) — `validate.md` + Run e2e tests, check for anti-patterns, verify scenario coverage, and + assess PR readiness. + +6. **Publish** (`/publish`) — `publish.md` + Push the feature branch and create a draft PR in the source repo. + +7. **Respond** (`/respond`) — `respond.md` + Fetch and address PR reviewer comments. Repeatable. + +## Workspace + +All work happens in the **source repo** — this workflow modifies test code +directly. Planning artifacts live in `.artifacts/e2e/{jira-key}/` (gitignored). +Test code changes live on a feature branch in the source repo. + +### Artifact directory + +All working artifacts are stored in `.artifacts/e2e/{jira-key}/` within +the source repo: + +| Artifact | File | Written by | +|----------|------|------------| +| Story context | `01-context.md` | `/ingest` | +| Test plan | `02-plan.md` | `/plan`, `/revise`, `/code` | +| Test report | `03-test-report.md` | `/code` | +| Implementation report | `04-impl-report.md` | `/code` | +| Validation report | `05-validation-report.md` | `/validate` | +| PR description | `06-pr-description.md` | `/publish` | +| Publish metadata | `publish-metadata.json` | `/publish` | +| Review responses | `07-review-responses.md` | `/respond` | + +## How to Execute a Phase + +1. **Announce** the phase to the user: *"Starting /plan."* +2. **Read** the skill file from the list above (e.g., `plan.md`) +3. **Execute** the skill's steps — the user should see your progress +4. When the skill is done, it will tell you to report findings and + re-read this controller. Do that — then use "Recommending Next Steps" + below to offer options. +5. Present the skill's results and your recommendations to the user +6. 
**Stop and wait** for the user to tell you what to do next + +## Recommending Next Steps + +After each phase completes, present the user with **options** — not just one +next step. Use the typical flow as a baseline, but adapt to what actually +happened. + +### Typical Flow + +```text +ingest → plan → [revise loop] → code → validate → publish → [respond loop] +``` + +### What to Recommend + +**Continuing forward:** + +- `/ingest` completed → recommend `/plan` (almost always the right next step) +- `/plan` completed → recommend `/revise` for user review of the plan, or `/code` if the user has already reviewed inline +- `/revise` completed (user satisfied) → recommend `/code`, or another `/revise` round +- `/code` completed → recommend `/validate` (always — never skip validation) +- `/validate` completed (all passing) → recommend `/publish` +- `/validate` completed (failures remain) → recommend fixing issues, then re-running `/validate` +- `/publish` completed → recommend `/respond` when review comments arrive +- `/respond` completed → recommend another `/respond` round, or note that the workflow is done when the PR is approved and merged + +**Looping back:** + +- `/plan` reveals story gaps or contradictions → suggest the user clarify with the story author or update the story +- `/code` reveals plan gaps → the plan is updated inline during implementation; offer `/validate` when implementation is complete +- `/code` discovers a feature defect (test reveals a bug in the [DEV] implementation) → note it in the implementation report; the test may need to xfail or skip. 
Do NOT recommend fixing the feature — that is out of scope +- `/code` discovers a missing harness method (plan referenced a method that doesn't exist) → see deviation rules in `code.md`; a local helper may suffice, or the user decides whether to adjust the plan or add harness support outside this workflow +- `/validate` reveals test failures → offer to diagnose and fix, then re-run `/validate` +- `/validate` reveals anti-patterns → fix them during validation, then re-run the affected checks +- `/validate` reveals unsatisfied acceptance criteria → if fixable (missing test scenarios), write them during validation; if the criterion is ambiguous or not e2e-testable, escalate to the user +- `/respond` requires code changes → apply changes, re-run `/validate`, then continue responding + +**Skipping:** + +- If the user already has a plan or partial test implementation, they may start at `/code` +- If the user wants to skip PR creation (e.g., working locally), `/publish` and `/respond` may be skipped + +### How to Present Options + +Lead with your top recommendation, then list alternatives briefly: + +```text +Recommended next step: /code — begin writing e2e test code following the +approved plan. + +Other options: +- /revise — if you want to adjust the plan first +- /validate — if you've already written test code and want to check it +``` + +## Starting the Workflow + +Before dispatching any phase, check if the project has its own `AGENTS.md` +or `CLAUDE.md`. If so, read it — it may contain project-specific conventions, +testing standards, or other guidance that affects how the workflow operates. + +When the user provides a Jira issue key or URL: +1. Execute the **ingest** phase +2. After ingestion, present results and wait + +If the user invokes a specific command (e.g., `/code`), execute that phase +directly — don't force them through earlier phases. + +## Error Handling + +If any phase fails (Jira MCP errors, test failures, git errors): + +1. 
**Stop immediately.** Do not advance to the next phase. +2. **Report the error** to the user with the specific error message. +3. **Offer options:** retry the failed step, skip the phase (if optional), or escalate. + +Do not fabricate results when a tool call fails. Do not silently continue +past errors. + +## Context Management + +When the AI detects that its own output quality is degrading (e.g., it +misses details, repeats itself, or loses track of earlier decisions), +consider spawning the next phase as a subagent with a fresh context window. +This is self-monitoring by the AI, not something a human operator watches. Load the subagent with +the skill file for the phase being executed, the relevant artifact files from +`.artifacts/e2e/{jira-key}/`, and the project's `AGENTS.md`/`CLAUDE.md`. + +This is a recommendation, not a requirement — not all AI runtimes support +subagent spawning. + +## Rules + +- **Never auto-advance.** Always wait for the user between phases. +- **Recommendations come from this file, not from skills.** Skills report findings; this controller decides what to recommend next. +- **Jira is read-only.** The `/ingest` phase reads from Jira but never modifies it. No phase in this workflow writes to Jira. +- **Plan evolves during implementation.** `/code` updates `02-plan.md` as tasks are completed. This is expected, not a sign of plan failure. +- **Validation is mandatory before publishing.** Never recommend `/publish` unless `/validate` has passed. diff --git a/e2e/skills/ingest.md b/e2e/skills/ingest.md new file mode 100644 index 0000000..ed3c641 --- /dev/null +++ b/e2e/skills/ingest.md @@ -0,0 +1,528 @@ +--- +name: ingest +description: Fetch the [QE] story, verify dependencies, explore e2e test infrastructure, and build a test-execution profile. +--- + +# Ingest Story Context Skill + +You are a principal QE engineer. 
Your job is to fetch the [QE] story, verify +that the features under test have been implemented and merged, explore the +project's e2e testing infrastructure in depth, and produce a structured +context document that will inform the test planning phase. + +## Your Role + +Build a complete picture of what needs to be tested, what e2e test +infrastructure exists, and how the project runs its e2e tests. The output +must give the planning phase everything it needs to design concrete test +scenarios and file structure. + +## Critical Rules + +- **Read-only.** Jira access is read-only. Never create, update, or modify Jira issues. +- **Capture, don't implement.** Record what you find — test scenario decisions happen in `/plan`. +- **Deep infrastructure discovery.** Unlike implementation ingestion, e2e ingestion must thoroughly explore the test harness, auxiliary services, setup/teardown patterns, and test conventions. Shallow discovery leads to tests that don't follow project patterns. +- **Note unknowns.** If you can't determine something from the codebase, say so explicitly. +- **Re-invocation diffs before overwriting.** If `01-context.md` already exists, preserve it before exploring. After compiling new context, diff against the previous version and present changes to the user before overwriting (see Steps 2a and 7a). + +## Process + +### Step 1: Identify the Story + +The user will provide one of: +- A Jira issue key or URL (fetch via Jira MCP) +- A path to an existing story file from the design workflow + +Extract the Jira key (e.g., `EDM-5678`) and set it as the context identifier. + +### Step 2: Create Artifact Directory + +```bash +mkdir -p .artifacts/e2e/{jira-key} +``` + +Verify that `.artifacts/` is covered by the project's `.gitignore`. If it +is not, warn the user that e2e artifacts could be accidentally committed +with the code. + +### Step 2a: Check for Prior Ingest + +If `.artifacts/e2e/{jira-key}/01-context.md` already exists, this is a +re-invocation. 
Copy the existing file to `01-context.md.prev` so it is +preserved for the diff in Step 7a. + +### Step 3: Fetch the Jira Story + +Fetch the story from Jira. Capture: +- Summary and description +- User story (As a... I want... So that...) +- Acceptance criteria +- Testing approach (if present — this is the primary implementation guidance for [QE] stories) +- Implementation guidance (if present — may be sparse for [QE] stories) +- Story type prefix — verify it is `[QE]`. If it is `[DEV]`, `[UI]`, or another prefix, warn the user that this workflow is designed for `[QE]` stories and ask whether to proceed. +- Parent epic key +- Story dependencies (linked issues — "depends on", "is blocked by") +- Fix version / sprint (if set) + +### Step 4: Check Story Dependencies + +For a `[QE]` story, dependencies are typically `[DEV]` stories that +implement the feature being tested. These are not just warnings — if the +feature doesn't exist yet, the tests cannot pass. + +For each dependency identified in Step 3: +1. Check if the dependent story's Jira status indicates completion + (Done, Closed, Resolved) +2. Check if the dependent story's code has been merged to the main branch + (search git log for the dependent story's Jira key) + +If dependencies are unresolved, **warn the user explicitly**: +- Which dependencies are unmerged +- That the feature under test may not exist yet +- Recommendation: wait for the `[DEV]` story to land, or proceed at risk + (tests can be written but may not pass during `/validate` until the + feature is merged) + +### Step 5: Load Upstream Context + +The PRD and design document are published to a docs repo by the prd and +design workflows. Fetch them from there. + +#### 5a: Resolve the Docs Repo + +Check for an existing docs repo configuration at `.artifacts/prd/config.json`. +This config is project-level and shared across workflows (prd, design, +implement, e2e) — a prior workflow run may have already created it. 
+ +**If the config exists**, read it and validate: +1. Verify the path exists on the local filesystem +2. Verify the directory is a git repository + +If validation fails, inform the user and re-ask for the correct values. + +**If the config does not exist**, ask the user: +- **Docs repo local path:** Where is the planning docs repo checked out? +- **Docs repo remote:** Run `git -C "{docs_repo_path}" remote get-url origin` + and confirm with the user + +Validate the path and remote, then save the config: + +```bash +mkdir -p .artifacts/prd +``` + +Write `.artifacts/prd/config.json` with the validated `docs_repo_path` and +`docs_repo_remote` (same format used by the prd and design workflows). + +#### 5b: Find the PRD and Design Document + +The docs repo organizes documents by Feature-level Jira issue. To find the +right directory, walk the Jira hierarchy from the story: + +1. The story (e.g., `EDM-5678`) has a parent **Epic** — fetch it from Jira + to get the Epic key +2. The Epic has a parent **Feature** — fetch it from Jira to get the + Feature key (e.g., `EDM-1100`) + +The docs repo structure is `{release}/{feature-slug}/prd.md` and +`{release}/{feature-slug}/design.md`, where `{feature-slug}` includes the +Feature issue key (e.g., `port-mappings-EDM-1100`). + +Search the docs repo for the Feature key: + +```bash +find "{docs_repo_path}" -type d -name "*{feature-key}*" +``` + +If the hierarchy traversal fails or the directory isn't found, ask the user +for the path to the PRD and design document within the docs repo. + +#### 5c: Read Upstream Documents + +Read these from the docs repo: + +1. **Design document** (`design.md`) — the technical design, including + architectural decisions and locked decisions incorporated as content +2. **PRD** (`prd.md`) — the product requirements, with locked decisions + reflected in the requirements text + +If the docs repo documents are not found, ask the user for their location +or proceed with only the Jira story content. 
The design document and PRD +are valuable context but not strictly required — the story's acceptance +criteria are the primary contract. + +### Step 6: Explore E2E Test Infrastructure + +This is the core discovery step. Unlike implementation ingestion, which +explores production code, this step focuses on the project's e2e testing +infrastructure. Thorough discovery here is critical — shallow exploration +leads to tests that don't follow project patterns. + +#### 6a: Project Configuration + +Read project-level configuration: +- `AGENTS.md`, `CLAUDE.md` — project conventions and AI guidance +- Makefile or equivalent — build, test, lint commands +- CI/CD workflows (e.g., `.github/workflows/`) — what checks run on PRs +- `CONTRIBUTING.md` — PR and commit message conventions +- `.github/PULL_REQUEST_TEMPLATE.md` — PR description template + +#### 6b: Repository Topology + +Determine whether the local clone is a fork or a direct clone. Parse +`{owner}/{repo}` from the origin remote: + +```bash +git remote get-url origin +``` + +Then query GitHub: + +```bash +gh repo view {owner}/{repo} --json isFork,parent +``` + +- If `isFork` is `true`, record the upstream repo from `parent.owner.login` + and `parent.name` +- If `isFork` is `false`, record it as a direct clone +- If the command fails (no network, no `gh` auth), note the failure and + ask the user whether this is a fork. If the user confirms it is a fork, + also ask for the upstream `{owner}/{repo}` so the Repository Topology + section is complete for downstream sync steps + +#### 6c: E2E Test Framework and Runner + +Discover the testing framework and how tests are executed: + +1. **Framework:** Search test files for import statements to identify the + framework (e.g., Ginkgo, pytest, Playwright, Cypress, Jest) +2. **Runner:** Check the Makefile or scripts for e2e test execution targets +3. **Scoping:** How can tests be run for a specific suite or directory? + (e.g., `GO_E2E_DIRS=`, `--spec`, `--testPathPattern`) +4. 
**Filtering:** What label/tag/focus mechanisms exist for CI selection? +5. **Parallel execution:** Does the project support parallel test execution? + +#### 6d: Test Organization + +Map the e2e test directory structure: + +1. **Root directory:** Where do e2e tests live? (e.g., `test/e2e/`) +2. **Suite organization:** How are suites structured? (one directory per + feature area, flat files, nested by component) +3. **Suite file conventions:** What files does each suite contain? + (e.g., `*_suite_test.go` + `*_test.go` in Go/Ginkgo) +4. **Naming conventions:** How are test files and directories named? + +#### 6e: Test Harness + +Explore the test harness in depth — this is the API that test code uses: + +1. **Location:** Find harness source files (e.g., `test/harness/e2e/`) +2. **Initialization:** How do tests obtain the harness? (global var, + per-worker function, constructor) +3. **Key methods:** Read the harness files and catalog the public methods + relevant to the story's scope. Focus on methods the test scenarios + will need — don't catalog the entire harness. +4. **Domain-specific harness files:** Many projects split the harness by + domain (e.g., `harness_device.go`, `harness_fleet.go`). Identify which + domain files are relevant to the story. + +#### 6f: Setup and Teardown Patterns + +Read 2-3 existing suite files to understand lifecycle patterns: + +1. **BeforeSuite / setup_module:** What happens once per suite? + (auxiliary services, providers, harness initialization) +2. **BeforeEach / setup_method:** What happens before each test? + (login, environment reset, test context creation) +3. **AfterEach / teardown_method:** What happens after each test? + (log collection on failure, resource cleanup) +4. **AfterSuite / teardown_module:** What happens after all tests? + (auxiliary service cleanup) + +#### 6g: Auxiliary Services + +Discover what external services tests depend on: + +1. 
**Services used:** Registry, git server, database, identity provider, + metrics collector, tracing, etc. +2. **How started:** Testcontainers (self-starting), make targets, manual +3. **How accessed:** Helper functions, environment variables, harness methods +4. **Singleton vs. per-suite:** Are services shared across suites? + +#### 6h: Test Utilities + +Find test helper packages: + +1. **Utility packages:** Common helpers (e.g., `test/util/`) +2. **Constants:** Timeouts, polling intervals, resource type strings +3. **Tracing/logging:** How tests set up observability +4. **Test data:** Where test fixtures and example files live + +#### 6i: Test Conventions and Documentation + +Read test-specific documentation: + +1. **Test READMEs:** `test/README.md`, `test/e2e/README.md` +2. **Test AGENTS.md:** `test/AGENTS.md`, `test/e2e/AGENTS.md` +3. **Test guidelines:** `test/e2e/GUIDELINES.md` or similar +4. **Lint rules:** Any test-specific lint rules (e.g., import restrictions) +5. **Label conventions:** How labels/tags are used for CI filtering + +If these files don't exist, note their absence — the project may document +test conventions elsewhere (in the main AGENTS.md) or not at all. + +#### 6j: Reference Suite Selection + +Based on the story's scope, identify the 1-2 existing test suites most +similar to what needs to be written: + +1. Match by feature area (e.g., if the story tests rollout behavior, find + the existing rollout suite) +2. If no exact match, find a suite that uses similar harness methods or + tests similar interaction patterns +3. Read the selected suite thoroughly: suite file + 1-2 test files +4. Extract concrete patterns: imports, setup, assertions, labels, helpers + +These suites become the "pattern source" for the `/code` phase. + +Use file search (glob), content search (grep), and targeted file reading. +Focus exploration on the test infrastructure files. 
Apply the convergence +heuristic per discovery area (Steps 6c–6j), not across the entire +exploration: within each area, if the last 5-7 files explored introduced +no new patterns, that area is likely complete. E2e infrastructure spans +a broad surface (harness files, auxiliary configs, suite files, utilities, +CI workflows, test documentation), so premature convergence can miss +critical patterns. + +### Step 7: Compile Context + +> **Checkpoint:** Step 6 is the heaviest phase of ingestion (10 sub-steps +> across harness, services, utilities, conventions, and reference suites). +> Before compiling, verify that all Step 6 sub-steps have been completed +> and that key findings are captured. If working in a constrained context, +> consider spawning a subagent for the compilation. + +Compile all findings into the structure below. If this is a re-invocation +(Step 2a found an existing file), **do not write the file yet** — hold the +compiled content and proceed to Step 7a first. + +If this is a first invocation, write +`.artifacts/e2e/{jira-key}/01-context.md` with this structure: + +```markdown +# Story Context — {jira-key} + +## Story Summary + +- **Title:** {title} +- **Type:** [QE] +- **Jira:** {jira-key} +- **Epic:** {parent epic key and title} +- **Feature:** {parent feature key, if known} + +### User Story + +{As a... I want... So that...} + +### Acceptance Criteria + +{Numbered list, preserving original wording} + +### Testing Approach + +{From the story or design document. This is the primary guidance for [QE] + stories — it describes what e2e scenarios should be covered. + If none: "No testing approach provided — derive scenarios from acceptance + criteria."} + +### Implementation Guidance + +{From the story or design document. May be sparse for [QE] stories. 
+ If none: "No implementation guidance provided."} + +### Dependencies + +| Story | Type | Status | Merged | Risk | +|-------|------|--------|--------|------| +| {key} | {[DEV]/[UI]/etc.} | {jira status} | {yes/no} | {brief risk note} | + +{If no dependencies: "No story dependencies." + If [DEV] dependencies are unmerged: highlight that the feature under test + may not exist yet.} + +## Design Context + +### Relevant Design Sections + +{Summary of design document sections relevant to the feature being tested, + with section references (e.g., [Design: §4.1]). Focus on the behavior + being tested, not implementation details.} + +### PRD Requirements Covered + +{Which FR-N and NFR-N requirements this story's tests will validate.} + +## E2E Test Infrastructure + +### Framework +- **Framework:** {e.g., Ginkgo v2 + Gomega, Playwright, pytest} +- **Runner:** {e.g., ginkgo CLI, playwright test, pytest} +- **Test location:** {e.g., test/e2e/} +- **Suite organization:** {e.g., one directory per feature area} + +### Test Execution +- **Run all e2e tests:** `{command}` +- **Run specific suite:** `{command with scoping}` +- **Filter by label:** `{mechanism, e.g., GINKGO_LABEL_FILTER="label"}` +- **Filter by description:** `{mechanism, e.g., GINKGO_FOCUS="pattern"}` +- **Parallel execution:** `{mechanism, e.g., GINKGO_PROCS=N}` +- **Environment assumptions:** {what must be running before tests execute} + +### Harness +- **Location:** {path(s) to harness files} +- **Initialization:** {how tests obtain the harness} +- **Key methods for this story:** + +| Method | Purpose | Source File | +|--------|---------|-------------| +| `{method}` | {what it does} | {file} | + +### Setup/Teardown Patterns +- **BeforeSuite:** {what happens} +- **BeforeEach:** {what happens} +- **AfterEach:** {what happens} +- **AfterSuite:** {what happens} + +### Auxiliary Services + +| Service | How Started | How Accessed | Required By | +|---------|-------------|-------------|-------------| +| {name} | 
{testcontainer/make target/manual} | {helper/env var/harness method} | {which tests} | + +{If no auxiliary services: "No auxiliary services required for e2e tests."} + +### Test Utilities +- **Constants:** {path, key constants} +- **Helpers:** {path(s), key functions} +- **Tracing:** {how tests set up tracing} +- **Test data:** {where fixtures live} + +### Conventions +- **Labels:** {convention, e.g., Label("ticket-id", "component-tag")} +- **File naming:** {convention for test files} +- **Test naming:** {convention for Describe/It blocks} +- **Lint rules:** {test-specific lint rules, if any} +- **Documentation:** {test docs locations} + +### Reference Suite + +#### {Suite Name} — `{path}` + +**Why selected:** {what makes this suite similar to the planned tests} + +**Patterns to follow:** +- **Imports:** {import pattern from the suite file} +- **Setup:** {BeforeSuite/BeforeEach pattern} +- **Assertions:** {assertion style, e.g., Eventually/Expect} +- **Labels:** {how labels are applied} +- **Cleanup:** {teardown pattern} +- **Key code pattern:** {any distinctive pattern worth replicating} + +## Repository Topology + +- **Origin:** {owner}/{repo} +- **Type:** Fork | Direct +- **Upstream:** {upstream-owner}/{upstream-repo} (fork only, omit if direct) + +## Validation Profile + +### Commit Format +- **Pattern:** {discovered pattern} +- **Discovered from:** {source file} + +### Pre-PR Checks (ordered) +{Numbered list of commands to run before creating a PR, discovered from + Makefile, CI workflows, AGENTS.md. Focus on checks relevant to test code:} +1. `{lint command}` — {purpose} +2. 
`{e2e test command scoped to new suite}` — {purpose} + +### PR Conventions +- **Title format:** {discovered format} +- **PR template:** {path or "None — use default template"} +- **Description guidance:** {any expectations from CONTRIBUTING.md or AGENTS.md} + +### E2E Test Execution +- **Run new suite:** `{command scoped to the new test directory}` +- **Scoping mechanism:** `{how to restrict execution, e.g., GO_E2E_DIRS=}` +- **Environment assumptions:** {what must be running} + +### Discovered from +{List of files read to build the validation profile and infrastructure context} + +## Open Questions + +{Questions that need answers before or during test planning. Each entry + must be a concrete question — not an observation, concern, or statement + of fact. Ask what needs to be decided, not what you noticed. + + Good: "Should the fleet rollback e2e tests enroll a real VM via the + harness, or use the device simulator? The existing rollout suite uses + real VMs but the agent suite uses both patterns." + + Bad: "Need to figure out the VM approach." (too vague) + + Bad: "The harness supports both VMs and simulators." (observation, not + a question)} +``` + +### Step 7a: Diff Against Prior Ingest (Re-invocation Only) + +If Step 2a created a `.prev` file, compare `01-context.md.prev` against +the newly compiled content. Focus the diff on: + +- Changes to acceptance criteria or testing approach +- Changes to dependency status (have [DEV] stories been merged since last ingest?) +- New harness methods or infrastructure discovered +- Changes to the validation profile or test execution commands +- Changes to the reference suite selection + +Then check whether downstream artifacts exist (`02-plan.md`, +`03-test-report.md`, `04-impl-report.md`, etc.). If they do, tell the user +which artifacts exist and may be affected by the changes. + +Wait for the user to confirm before proceeding. 
If the user confirms, write +the compiled content to `01-context.md` and clean up the `.prev` file. If +the user declines, delete the `.prev` file and stop without overwriting. + +### Step 8: Report to User + +Present a brief summary: +- Story scope and acceptance criteria +- Design and PRD context loaded (or what was missing) +- Dependency status — especially whether `[DEV]` stories are merged +- E2E test infrastructure discovered (framework, harness, reference suite) +- Validation profile discovered (how to run tests) +- Open questions (if any) — frame these as items that `/plan` will + investigate, not as blockers. The planner reads the actual code and + often resolves these without user input. Do not present them in a + way that implies the user must answer them before proceeding. +- Whether the context is sufficient to proceed to `/plan` + +If the user declined a re-invocation overwrite in Step 7a, report instead +what changes were found and that the existing context was preserved. + +## Output + +- `.artifacts/e2e/{jira-key}/01-context.md` + +## When This Phase Is Done + +Report your findings: +- Story scope and key acceptance criteria +- Dependency status ([DEV] stories merged or not) +- E2E infrastructure discovered (framework, harness, reference suite) +- Validation profile summary +- Assessment of readiness for `/plan` + +Then **re-read the controller** (`controller.md`) for next-step guidance. diff --git a/e2e/skills/plan.md b/e2e/skills/plan.md new file mode 100644 index 0000000..5ec8670 --- /dev/null +++ b/e2e/skills/plan.md @@ -0,0 +1,234 @@ +--- +name: plan +description: Map acceptance criteria to e2e test scenarios, select the reference suite pattern, and design the test file structure. +--- + +# Plan E2E Tests Skill + +You are a principal QE engineer planning e2e test coverage. 
Your job is to +read the story context and produce a structured test plan: map acceptance +criteria to test scenarios, select the reference suite pattern, define the +test file structure, and plan harness usage. + +## Your Role + +Translate the story's acceptance criteria into concrete, ordered test +scenarios. Each scenario should be specific enough that an AI agent (or QE +engineer) can write the test code without ambiguity. The plan is the user's +review checkpoint before any test code is written. + +## Critical Rules + +- **Every acceptance criterion must have at least one test scenario.** If an AC has no scenario, it's a coverage gap. +- **Follow the reference suite's patterns.** The e2e infrastructure context from `/ingest` shows how tests are written in this project. Match those patterns exactly. +- **Be specific.** Name the test files, Describe/Context/It blocks, harness methods, and labels. A plan that says "test the feature" without specifying how is too vague. +- **Scenarios are the plan, not an afterthought.** The scenario breakdown is the primary output — not a task breakdown with scenarios appended. +- **No scope expansion.** Don't add test scenarios beyond the story's acceptance criteria. +- **No duplicate coverage.** Do not plan scenarios that re-test behavior already covered by the `[DEV]` story's unit and integration tests. E2e tests validate user-facing workflows through the full system. + +## Process + +### Step 1: Read Source Material + +Read these files in order: +1. `.artifacts/e2e/{jira-key}/01-context.md` (story context and e2e infrastructure) +2. The project's `AGENTS.md` and/or `CLAUDE.md` (coding conventions) +3. Any test-specific documentation referenced in the context (test READMEs, guidelines) + +If `01-context.md` doesn't exist, tell the user that `/ingest` should be +run first. 
+ +### Step 2: Map Acceptance Criteria to Test Scenarios + +Before writing the plan, create a mental map: +- Which acceptance criteria describe user-facing behaviors that can be verified through e2e tests? +- For each AC, what are the concrete scenarios? (happy path, error paths, edge cases) +- What harness methods will each scenario use? +- What auxiliary services does each scenario need? +- What setup and teardown does each scenario require beyond the suite-level patterns? +- How should scenarios be organized into Describe/Context/It blocks (or the project's equivalent)? +- What labels should each scenario have (for CI filtering)? + +### Step 3: Write the Test Plan + +Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: + +```markdown +# E2E Test Plan — {jira-key} + +## Summary + +{1-2 sentence summary of the test approach and what the tests will validate.} + +## Branch + +- **Name:** {jira-key}-{short-slug} (e.g., EDM-5678-fleet-rollback-e2e) +- **Base:** {target branch, usually main} + +## Reference Suite + +- **Path:** {path to the suite used as the pattern source} +- **Why selected:** {what makes it similar to the planned tests} +- **Patterns adopted:** {specific patterns to follow: setup, teardown, + harness usage, assertions, labels} + +## Test File Structure + +### Suite Directory +- **Path:** `{e.g., test/e2e/{suite-name}/}` + +### Files to Create + +| File | Purpose | +|------|---------| +| `{suite file}` | Suite setup: auxiliary services, harness init, login, cleanup | +| `{test file}` | Test scenarios | +| `{helper file, if needed}` | Suite-specific helpers (only if existing suites follow this pattern) | + +## Test Scenarios + +{Map each acceptance criterion to concrete test scenarios. Each scenario + is a specific test case that verifies observable behavior through the + project's test harness. + + Note: This template uses Ginkgo terminology (Describe/Context/It, + Label(), By(), Eventually) as illustrative shorthand. 
Map these to the + project's actual test framework vocabulary as discovered during + `/ingest` — e.g., test suites/test cases for pytest, describe/it for + Playwright or Jest.} + +### AC-1: {description} + +#### Scenario 1.1: {description — what the test verifies} + +- **Block structure:** {Describe/Context/It nesting, or the project's equivalent} +- **Labels:** {e.g., Label("EDM-5678", "fleet-rollback")} +- **Setup:** {what the test needs beyond suite-level BeforeEach — e.g., + create a fleet, deploy an application} +- **Steps:** + 1. {action using harness method, e.g., harness.EnrollAndWaitForOnlineStatus()} + 2. {action, e.g., harness.TriggerRollback(deviceID)} + 3. {verification, e.g., Eventually(harness.GetDeviceVersion, TIMEOUT, POLLING).Should(Equal("v1"))} +- **Assertions:** {what to verify — use the project's assertion style} +- **Cleanup:** {what AfterEach handles vs. test-specific cleanup} + +#### Scenario 1.2: {error or edge case} +... + +### AC-2: {description} + +#### Scenario 2.1: {description} +... + +## Harness Usage + +### Methods Needed + +| Method | Purpose | Used in Scenarios | +|--------|---------|-------------------| +| `{method signature}` | {what it does} | {scenario references} | + +{Verify each method exists in the harness. If a needed method does not + exist, note it under Open Questions — do not assume it can be created.} + +### Auxiliary Services Needed + +| Service | Why Needed | How Started | Used in Scenarios | +|---------|-----------|-------------|-------------------| +| {name} | {reason} | {testcontainer / BeforeSuite / manual} | {references} | + +{If no auxiliary services needed: "No auxiliary services beyond the + suite-level defaults."} + +## Task Breakdown + +{Ordered list of tasks. Each task produces test code. Tasks are grouped + into logical commits. The first task is always the suite file (foundation), + followed by test scenario tasks. + + Tasks must produce test code. 
Do not include tasks for running linters + or validation suites — those are handled by `/code`'s per-task lint step + and by `/validate`.} + +### Task 1: Create suite file + +- **Files:** `{suite file path}` +- **What:** Suite setup following the reference suite's pattern — BeforeSuite + (auxiliary services, providers), BeforeEach (login, environment setup, + test context), AfterEach (log collection, resource cleanup), AfterSuite + (auxiliary cleanup) +- **Why:** Foundation for all test scenarios (AC-1 through AC-N) +- **Commit message:** `{use commit format from 01-context.md}` +- **Status:** Pending + +### Task 2: Implement AC-1 scenarios + +- **Files:** `{test file path}` +- **What:** Scenarios 1.1, 1.2 — {brief description of what they test} +- **Why:** AC-1 +- **Commit message:** `{format}` +- **Status:** Pending + +### Task 3: Implement AC-2 scenarios +... + +## Acceptance Criteria Coverage + +| AC | Description | Scenarios | Task | +|----|-------------|-----------|------| +| AC-1 | {brief} | 1.1, 1.2 | Task 2 | +| AC-2 | {brief} | 2.1 | Task 3 | + +{Every AC must appear in at least one scenario. Flag any gaps.} + +## Risk Assessment + +{Things the plan author is uncertain about. Ordered by impact.} + +- **{Risk}:** {description and mitigation} + +## Open Questions + +{Questions that need resolution before or during test implementation. 
+ These may be carried forward from the ingest phase's open questions.} +``` + +### Step 4: Self-Review + +Before presenting the plan, verify: + +- [ ] Every acceptance criterion has at least one test scenario +- [ ] Test scenarios validate user-facing behavior, not internal logic that `[DEV]` tests already cover +- [ ] Suite file follows the reference suite's setup/teardown pattern +- [ ] Harness methods referenced actually exist (verified during `/ingest`) +- [ ] Labels follow the project's convention +- [ ] File paths are within the e2e test directory and follow naming conventions +- [ ] Describe/Context/It nesting matches the project's style +- [ ] Auxiliary services needed are available in the project's infrastructure +- [ ] Commit messages follow the project's format (from validation profile) +- [ ] No scenarios require environment capabilities not present in the project +- [ ] Task count is reasonable — if you have more than 8 tasks, consider whether the story needs re-scoping +- [ ] The plan is achievable — no scenarios depend on unmerged features or unavailable harness methods + +### Step 5: Present to User + +Show the user the complete plan and highlight: +- Test approach and reference suite selection +- Scenario breakdown and AC coverage +- Harness methods and auxiliary services needed +- Any risks or open questions +- Anything where you made a judgment call vs. following explicit guidance + +## Output + +- `.artifacts/e2e/{jira-key}/02-plan.md` + +## When This Phase Is Done + +Report your results: +- The plan has been written and saved +- Highlight key test design decisions +- Note any risks or open questions +- Assessment of plan completeness + +Then **re-read the controller** (`controller.md`) for next-step guidance. 
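
The first item of the Step 4 self-review (every acceptance criterion has at least one test scenario) lends itself to a mechanical check. The following is a minimal sketch, assuming the plan keeps the `### AC-N:` and `#### Scenario N.M:` heading shapes from the template above; the function name `uncovered_acceptance_criteria` is hypothetical and not part of the workflow.

```python
import re

# Heading shapes taken from the 02-plan.md template; a real plan may differ.
AC_HEADING = re.compile(r"^###\s+AC-(\d+):", re.MULTILINE)
SCENARIO_HEADING = re.compile(r"^####\s+Scenario\s+(\d+)\.(\d+):", re.MULTILINE)


def uncovered_acceptance_criteria(plan_text: str) -> list[str]:
    """Return the AC labels that have no 'Scenario N.M' heading in the plan."""
    ac_numbers = [int(n) for n in AC_HEADING.findall(plan_text)]
    # The first number of a scenario ID names the AC it belongs to.
    covered = {int(ac) for ac, _ in SCENARIO_HEADING.findall(plan_text)}
    return [f"AC-{n}" for n in ac_numbers if n not in covered]


if __name__ == "__main__":
    sample_plan = (
        "### AC-1: Rollback restores the previous version\n"
        "#### Scenario 1.1: Happy path\n"
        "### AC-2: Rollback is reported in fleet status\n"
    )
    print(uncovered_acceptance_criteria(sample_plan))
```

An empty list means every AC heading has at least one scenario; anything else is a coverage gap to flag before presenting the plan. The check only looks at headings, so it cannot judge whether a scenario actually exercises its AC.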
diff --git a/e2e/skills/publish.md b/e2e/skills/publish.md new file mode 100644 index 0000000..c7ad368 --- /dev/null +++ b/e2e/skills/publish.md @@ -0,0 +1,215 @@ +--- +name: publish +description: Push the feature branch and create a draft PR in the source repo. +--- + +# Publish E2E Tests Skill + +You are a principal submission specialist. Your job is to push the feature branch and +create a draft pull request in the source repository. + +## Your Role + +Verify the branch is ready, push it, and create a draft PR with a clear +description linking back to the Jira story. Confirm all details with the +user before taking action. + +## Critical Rules + +- **Confirm before pushing.** Verify the target branch, PR title, and PR details with the user. +- **One story per PR.** Each pull request corresponds to exactly one Jira story. Do not combine multiple stories into a single PR. +- **Draft PR.** Always create as a draft — the user decides when to mark it ready for review. +- **No force-push.** No destructive git operations. +- **No direct commits to main.** The feature branch must already exist from `/code`. +- **Validation must have passed.** Check for a passing validation report before proceeding. + +## Process + +### Step 1: Pre-Flight Checks + +Verify readiness: + +1. Read `.artifacts/e2e/{jira-key}/05-validation-report.md`. Check + that the `## Result` section contains `PASS`. If the file doesn't exist, + the `## Result` section is missing, or it contains `FAIL`, tell the user + that `/validate` should be run (or re-run) first. + +2. Verify the feature branch exists and has commits: + + ```bash + git branch --show-current + ``` + + Read the `## Branch` section of `02-plan.md` to get the base branch. + + ```bash + git log --oneline {base}..HEAD + ``` + + If there are no commits ahead of the base branch, there's nothing to publish. + +3. Check for uncommitted changes: + + ```bash + git status + ``` + + If there are uncommitted changes, ask the user how to proceed. 
+ +4. Verify GitHub CLI is authenticated: + + ```bash + gh auth status + ``` + +### Step 2: Confirm Details + +Present the PR details to the user for confirmation: + +- **Branch:** `{branch-name}` (from the plan) +- **Base:** `{base-branch}` (usually `main`) +- **Commits:** List the commits that will be included + +```bash +git log --oneline {base}..HEAD +``` + +- **PR title:** Use the title format from the **PR Conventions** section of + `01-context.md` (typically `{JIRA-KEY}: {story title}`) + +Confirm with the user before proceeding. + +### Step 3: Push Branch + +```bash +git push -u origin {branch-name} +``` + +### Step 4: Create PR Description + +Check the **PR Conventions** section of `01-context.md`: + +- If a **PR template** path is listed, read the template and populate it + with content from the story context and test reports. +- If no project template exists, use the default template below. + +In either case, save the result to +`.artifacts/e2e/{jira-key}/06-pr-description.md`. + +**Default template** (used when the project has no PR template): + +```markdown +## {JIRA-KEY}: {story title} + +**Jira:** {jira-link} +**Story type:** [QE] + +### Summary +{2-3 sentence summary of the e2e tests added and what user-facing + behaviors they validate.} + +### E2E Test Scenarios +{Bulleted list of test scenarios, grouped by acceptance criterion.} + +### Test Infrastructure +- **Suite location:** {path to the new test suite directory} +- **Reference suite:** {which existing suite was used as the pattern} +- **Auxiliary services:** {services required, or "None beyond suite defaults"} +- **Harness methods used:** {key harness methods} + +### Acceptance Criteria +{Checklist of acceptance criteria from the story, each prefixed with a + checkbox. 
Reviewers can use this to verify scenario coverage.} + +- [ ] AC-1: {description} +- [ ] AC-2: {description} +``` + +### Step 5: Create Draft PR + +Check the **Repository Topology** section of `01-context.md` to determine +whether this is a fork-based workflow. + +**If the repo is a fork** (Origin is `{fork-owner}/{repo}`, Upstream is +`{upstream-owner}/{repo}`): + +```bash +gh pr create --draft --repo {upstream-owner}/{repo} --base {base-branch} --head {fork-owner}:{branch-name} --title "{JIRA-KEY}: {story title}" --body-file .artifacts/e2e/{jira-key}/06-pr-description.md +``` + +The `--repo` flag targets the upstream repository (where the PR lives), +and `--head {fork-owner}:{branch-name}` tells GitHub where to find the +branch (on the fork). + +**If the repo is a direct clone** (not a fork): + +```bash +gh pr create --draft --base {base-branch} --head {branch-name} --title "{JIRA-KEY}: {story title}" --body-file .artifacts/e2e/{jira-key}/06-pr-description.md +``` + +Parse the PR number and URL from the `gh pr create` output. The command +prints a URL like `https://github.com/owner/repo/pull/42` — extract the +number from the URL path. + +### Step 6: Save Publish Metadata + +Read `{owner}/{repo}` from the **Origin** field of the Repository +Topology section of `01-context.md`. If the repo is a fork, also read +the **Upstream** field. + +Write `.artifacts/e2e/{jira-key}/publish-metadata.json`. + +The `repo` field always refers to where the PR lives. The `origin` field +records the repo that was pushed to. 
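
The two rules above (extract the PR number from the URL path, and point `repo` at wherever the PR lives) can be sketched in a few lines. This is an illustration under assumptions, not the workflow's implementation: the helper names `parse_pr_number` and `build_publish_metadata` are hypothetical, and the sketch assumes the URL has the `.../pull/<number>` shape described above.

```python
import json
import re
from typing import Optional


def parse_pr_number(pr_url: str) -> int:
    """Extract the PR number from a URL ending in /pull/<number>."""
    match = re.search(r"/pull/(\d+)/?$", pr_url.strip())
    if match is None:
        raise ValueError(f"not a pull request URL: {pr_url!r}")
    return int(match.group(1))


def build_publish_metadata(
    origin: str,
    branch: str,
    base: str,
    pr_url: str,
    jira_key: str,
    upstream: Optional[str] = None,
) -> dict:
    """Build the metadata dict; pass `upstream` only for fork-based workflows."""
    pr_url = pr_url.strip()  # command output usually ends with a newline
    return {
        # The PR lives on the upstream for forks, on the origin otherwise.
        "repo": upstream or origin,
        "origin": origin,
        "branch": branch,
        "base": base,
        "pr_number": parse_pr_number(pr_url),
        "pr_url": pr_url,
        "jira_key": jira_key,
    }


if __name__ == "__main__":
    metadata = build_publish_metadata(
        origin="fork-owner/repo",
        upstream="upstream-owner/repo",
        branch="EDM-5678-fleet-rollback-e2e",
        base="main",
        pr_url="https://github.com/upstream-owner/repo/pull/42\n",
        jira_key="EDM-5678",
    )
    print(json.dumps(metadata, indent=2))
```

Serializing through `json.dumps` keeps `pr_number` an unquoted integer, which matches the JSON shape this step writes.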
+ +**If the repo is a fork** (set `repo` to the upstream, `origin` to the fork): + +```json +{ + "repo": "{upstream-owner}/{repo}", + "origin": "{fork-owner}/{repo}", + "branch": "{branch-name}", + "base": "{base-branch}", + "pr_number": {pr-number}, + "pr_url": "{url from gh pr create output}", + "jira_key": "{jira-key}" +} +``` + +**If the repo is a direct clone** (`repo` and `origin` are the same): + +```json +{ + "repo": "{owner}/{repo}", + "origin": "{owner}/{repo}", + "branch": "{branch-name}", + "base": "{base-branch}", + "pr_number": {pr-number}, + "pr_url": "{url from gh pr create output}", + "jira_key": "{jira-key}" +} +``` + +### Step 7: Report to User + +Present: +- PR URL (the full `https://github.com/...` link, not just `owner/repo#number`) +- Branch name and base +- Number of commits included +- Next steps (share with reviewers, wait for comments, then use `/respond`) + +## Output + +- Feature branch pushed to remote +- Draft PR created +- `.artifacts/e2e/{jira-key}/06-pr-description.md` +- `.artifacts/e2e/{jira-key}/publish-metadata.json` + +## When This Phase Is Done + +Report your results: +- PR URL and branch name +- Commits included +- Suggested next steps + +Then **re-read the controller** (`controller.md`) for next-step guidance. diff --git a/e2e/skills/respond.md b/e2e/skills/respond.md new file mode 100644 index 0000000..6d432a0 --- /dev/null +++ b/e2e/skills/respond.md @@ -0,0 +1,234 @@ +--- +name: respond +description: Fetch and address PR reviewer comments, applying code changes with user approval. +--- + +# Respond to Review Skill + +You are a principal review coordinator. Your job is to fetch reviewer comments from +the PR, help the user understand and respond to them, and apply any resulting +code changes. + +## Your Role + +Read PR comments, categorize them, propose responses and code changes, and — +with user approval — post replies and update the code. This phase is +repeatable as new comments arrive. 
+ +## Critical Rules + +- **Never post comments without user approval.** Propose responses, then wait for the user to approve, modify, or reject each one. +- **Separate code changes from clarifications.** Some comments need code edits; others just need a reply. +- **Preserve the review trail.** Don't delete or modify existing comments. +- **Re-validate after code changes.** If code was changed, recommend re-running `/validate` before continuing. +- **Commit changes using the project's commit format.** Review feedback commits follow the same format discovered during `/ingest`. +- **Allowed `gh` operations:** + - **Read:** `gh pr view`, `gh api` GET (for fetching PR comments and review data) + - **Write:** `gh pr comment` (for top-level replies), `gh api` POST to `pulls/{pr-number}/comments/{id}/replies` (for replying to line-level review comments) + - **Forbidden:** `gh pr close`, `gh pr merge`, `gh pr edit`, `gh pr ready` + +## Process + +### Step 1: Read Context and Fetch PR Comments + +Read `.artifacts/e2e/{jira-key}/publish-metadata.json` to get the +PR number and `{owner}/{repo}` (the `repo` field). If metadata doesn't +exist, tell the user that `/publish` should be run first. If the user +provides a PR number directly, use that instead. + +If `{owner}/{repo}` is not available from metadata (e.g., user provided a +PR number but metadata is missing), check the **Repository Topology** +section of `01-context.md`: + +- If the repo is a fork, use the **Upstream** field as `{owner}/{repo}` + (the PR lives on the upstream repo, not the fork) +- If the repo is a direct clone, use the **Origin** field + +If `01-context.md` is also unavailable, derive `{owner}/{repo}` from +the source repo remote. + +Retrieve the remote URL to extract `{owner}/{repo}`: + +```bash +git remote get-url origin +``` + +Parse `{owner}/{repo}` from the URL. Note that for fork-based workflows, +this will produce the fork's `{owner}/{repo}`, not the upstream's where +the PR lives. 
If the resulting `gh pr view` command fails, this may be +the cause — tell the user and ask for the correct upstream `{owner}/{repo}`. + +If `.artifacts/e2e/{jira-key}/07-review-responses.md` already exists, +read it to identify previously addressed comments. Only categorize and +propose responses for new or unaddressed comments in Step 2. + +Fetch both issue-level and review-level comments. + +Fetch PR metadata and top-level conversation comments: + +```bash +gh pr view {pr-number} --repo {owner}/{repo} --json comments,reviews,url +``` + +Fetch line-level review comments with pagination: + +```bash +gh api repos/{owner}/{repo}/pulls/{pr-number}/comments --paginate +``` + +If no comments are found, tell the user and suggest checking back later. + +### Step 2: Categorize Comments + +Group comments into categories: + +| Category | Action | +|----------|--------| +| **Code change request** | Propose specific code edits | +| **Clarification request** | Draft a reply explaining the rationale | +| **Bug/defect identified** | Propose a fix and re-run tests to verify | +| **Style/convention issue** | Apply the fix, acknowledge in reply | +| **Design alternative** | Evaluate, propose a response | +| **Technically incorrect** | Draft a respectful rebuttal citing specific code behavior, test output, or design constraints that demonstrate the error | +| **Would degrade quality** | Draft a response explaining what would be lost (correctness, scenario coverage, test isolation) and propose an alternative if one exists | +| **Approval / positive** | Acknowledge | +| **Out of scope** | Draft a reply explaining why | + +### Step 3: Propose Responses + +Evaluate each comment on its technical merit. Do not reflexively agree +with every suggestion — assess whether the proposed change would +actually improve the test code. 
When a comment is technically incorrect, +based on a misunderstanding of the test code, or would degrade +correctness, scenario coverage, or test isolation, recommend pushback +with a clear technical rationale. + +Present each comment with a proposed response: + +```markdown +## Review Comment Summary + +### Comment 1 — {reviewer} on {file}:{line} +> {quoted comment text} + +**Category:** Code change request +**Assessment:** {Agree / Disagree / Partially agree — with rationale} +**Proposed response:** {your suggested reply} +**Code change needed:** Yes — {describe the change} +``` + +For disagreements, the proposed response should be respectful and +evidence-based — cite specific code behavior, test coverage, or design +constraints that support the current approach. The user makes the final +call on whether to push back or comply. + +Wait for the user to approve, modify, or reject each response. + +### Step 4: Apply Approved Changes + +#### Code changes + +For comments requiring code changes: + +1. Read the affected test file(s) +2. Apply the change +3. Run the affected e2e tests to verify the change doesn't break + existing scenarios +4. Run lint and format checks on the changed files (same approach as + the lint-and-format step of `/code`). Fix any issues before committing. +5. Commit using the project's commit format: + +```bash +git add {specific files} +``` + +```bash +git commit -m "{JIRA-KEY}: Address review feedback — {brief description}" +``` + +```bash +git push +``` + +#### Clarification-only replies + +For comments that only need a reply, post directly (with user approval). + +#### Posting replies + +Write the reply to a temp file to avoid shell metacharacter issues. +Use the file-writing tool (Write) to create the file — do not use a +shell heredoc, as reply content containing the delimiter string would +break the heredoc. + +Write `{approved reply text}` to `.artifacts/e2e/{jira-key}/tmp-reply.md`. 
+ +**For line-level review comments** (attached to a specific file and line), +reply in-thread: + +```bash +gh api repos/{owner}/{repo}/pulls/{pr-number}/comments/{comment-id}/replies --field body=@.artifacts/e2e/{jira-key}/tmp-reply.md +``` + +**For top-level PR comments** (general conversation comments): + +```bash +gh pr comment {pr-number} --repo {owner}/{repo} --body-file .artifacts/e2e/{jira-key}/tmp-reply.md +``` + +Clean up the temporary reply file: + +```bash +rm .artifacts/e2e/{jira-key}/tmp-reply.md +``` + +### Step 5: Update Response Log + +Write or update `.artifacts/e2e/{jira-key}/07-review-responses.md`: + +```markdown +# Review Responses — {jira-key} + +## Round {N} — {date} + +### Comment by {reviewer} on {file}:{line} +- **Comment:** {summary} +- **Category:** {category} +- **Response:** {what was replied} +- **Code change:** {Yes/No — description if yes} +- **Commit:** {hash, if code was changed} +``` + +### Step 6: Assess Re-Validation Need + +If code changes were made: +- Recommend re-running `/validate` to ensure all checks still pass +- Note which changes might affect test results + +If only clarification replies were posted: +- No re-validation needed + +### Step 7: Report to User + +Summarize: +- How many comments were addressed +- How many code changes were made +- How many replies were posted +- Whether re-validation is recommended +- Whether any comments remain unresolved + +## Output + +- PR comments posted (with user approval) +- Code changes committed and pushed (if applicable) +- `.artifacts/e2e/{jira-key}/07-review-responses.md` + +## When This Phase Is Done + +Report your results: +- Comments addressed and responses posted +- Code changes made and committed +- Re-validation recommendation +- Outstanding items + +Then **re-read the controller** (`controller.md`) for next-step guidance. 
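
Step 1's last-resort fallback asks for `{owner}/{repo}` to be parsed from the output of `git remote get-url origin`. A minimal sketch of that parsing follows, assuming the remote uses one of the common HTTPS or SSH URL shapes; the function name `owner_repo_from_remote` is hypothetical. As Step 1 notes, for a fork this yields the fork's `{owner}/{repo}`, not the upstream's.

```python
import re

# Matches the trailing "<owner>/<repo>" of HTTPS, ssh://, and scp-style remotes,
# tolerating an optional ".git" suffix and trailing slash.
_OWNER_REPO = re.compile(r"[:/](?P<owner>[^/:]+)/(?P<repo>[^/]+?)(?:\.git)?/?$")


def owner_repo_from_remote(remote_url: str) -> str:
    """Return 'owner/repo' parsed from a git remote URL."""
    match = _OWNER_REPO.search(remote_url.strip())
    if match is None:
        raise ValueError(f"cannot parse owner/repo from {remote_url!r}")
    return f"{match.group('owner')}/{match.group('repo')}"


if __name__ == "__main__":
    for url in (
        "https://github.com/owner/repo.git",
        "git@github.com:owner/repo.git",
    ):
        print(owner_repo_from_remote(url))
```

Raising on an unrecognized URL, rather than guessing, fits the instruction to ask the user for the correct `{owner}/{repo}` when the derived value does not work.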
diff --git a/e2e/skills/revise.md b/e2e/skills/revise.md new file mode 100644 index 0000000..8053fd2 --- /dev/null +++ b/e2e/skills/revise.md @@ -0,0 +1,132 @@ +--- +name: revise +description: Incorporate user feedback into the e2e test plan. +--- + +# Revise Test Plan Skill + +You are a principal editor. Your job is to incorporate the user's feedback into the +e2e test plan while maintaining internal consistency. + +## Your Role + +Read the user's feedback, apply changes to the plan, and ensure the plan +remains coherent after edits. This phase is repeatable — the user may request +multiple rounds of revision. This phase only modifies the plan, not code. + +## Critical Rules + +- **Change only what's requested.** Do not "improve" parts of the plan the user didn't mention. +- **Evaluate before applying.** Assess whether the requested change would create coverage gaps, introduce anti-patterns, or reduce scenario quality. If it would, say so before making the change — explain the concern, recommend an alternative if you have one, and let the user decide. +- **Maintain consistency.** If a scenario change affects AC coverage or harness usage, update those sections too. +- **Preserve traceability.** Every acceptance criterion must still have at least one test scenario after revision. +- **Show your changes.** After revising, summarize what changed so the user can verify. +- **No scope reduction.** Do not silently simplify, even when revising. + +## Process + +### Step 1: Read Current Plan + +Read `.artifacts/e2e/{jira-key}/02-plan.md`. + +If the plan doesn't exist, tell the user that `/plan` should be run first. + +Also read `.artifacts/e2e/{jira-key}/01-context.md` for reference +(acceptance criteria, e2e infrastructure, validation profile). 
+
+### Step 2: Understand the Feedback
+
+The user's feedback may target:
+
+**Scenario changes:**
+- Different scenarios ("Add an error path for when the device is offline during rollback")
+- Scenario removal ("We don't need to test the concurrent update edge case — the [DEV] integration tests cover it")
+- Scenario reordering ("Test the happy path before the error cases")
+- Scenario splitting ("Scenario 1.1 is testing too many things, split it")
+
+**Test approach changes:**
+- Different reference suite ("Use the fleet_update suite as the pattern, not the agent suite")
+- Different harness methods ("Use harness.EnrollWithOptions instead of harness.Enroll")
+- Different assertions ("Use Eventually with a longer timeout for the rollback verification")
+
+**Structure changes:**
+- Different file organization ("Put the error cases in a separate test file")
+- Different Describe/Context/It nesting ("Group the rollback scenarios under a Context block")
+- Label changes ("Add the 'slow' label to the VM-based scenarios")
+
+**Task changes:**
+- Task reordering ("Write the error path scenarios before the happy path")
+- Task splitting ("Task 2 is too large, split the positive and negative scenarios")
+- Task combining ("Tasks 2 and 3 can be a single commit")
+
+Clarify with the user if the feedback is ambiguous before making changes.
+
+If the feedback is clear but would weaken the plan, raise the concern
+before applying it. For example:
+
+- Removing scenarios that are the only coverage for an acceptance criterion
+- Dropping cleanup or teardown that prevents test isolation
+- Changing an approach that would introduce anti-patterns (hardcoded
+  sleeps, brittle selectors, harness bypass)
+- Using harness methods that don't exist in the project
+
+Present the concern with specific reasoning, recommend an alternative
+if you have one, and apply the change only after the user has considered
+the tradeoff.
The user may have context you lack — but they should make
+an informed decision, not an unexamined one.
+
+### Step 3: Apply Changes
+
+Edit the plan:
+- For specific edits: apply them directly
+- For directional feedback: propose concrete changes and confirm before applying
+- For new requirements: add scenarios and tasks to the appropriate sections
+
+### Step 4: Consistency Check
+
+After applying changes, verify:
+- Does every acceptance criterion still have at least one test scenario?
+- Does the task ordering still respect dependencies (suite file first)?
+- Do the test scenarios still match the Describe/Context/It structure?
+- Do harness methods referenced actually exist in the project?
+- Are labels consistent with the project's conventions?
+- Does the AC coverage matrix reflect the current scenario mapping?
+- Are commit messages still properly formatted?
+
+### Step 5: Update Artifact
+
+Overwrite `.artifacts/e2e/{jira-key}/02-plan.md` with the revised plan.
+
+### Step 6: Present Changes
+
+Summarize what changed:
+
+```markdown
+## Revision Summary
+
+### Changes Applied
+- Scenario 1.2: Added error path for offline device during rollback
+- Task 2: Split into Task 2a (happy path) and Task 2b (error cases)
+- Reference suite: Changed from agent suite to fleet_update suite
+
+### Consistency Updates
+- AC coverage matrix updated to reflect new scenario mapping
+- Harness usage table updated with new method references
+- Task count increased from 4 to 5
+
+### Items to Note
+- The new error scenario requires harness.SimulateOffline() — verified it exists
+```
+
+## Output
+
+- `.artifacts/e2e/{jira-key}/02-plan.md` (updated)
+
+## When This Phase Is Done
+
+Report your results:
+- What was changed and why
+- Any consistency updates made as a side effect
+- Assessment of plan readiness for `/code`
+
+Then **re-read the controller** (`controller.md`) for next-step guidance.
diff --git a/e2e/skills/validate.md b/e2e/skills/validate.md
new file mode 100644
index 0000000..a652d41
--- /dev/null
+++ b/e2e/skills/validate.md
@@ -0,0 +1,326 @@
+---
+name: validate
+description: Run e2e tests, check for anti-patterns, verify scenario coverage, and assess PR readiness.
+---
+
+# Validate E2E Tests Skill
+
+You are a principal QE engineer reviewing test quality. Your job is to run
+the e2e tests, check for test anti-patterns, verify scenario coverage
+against acceptance criteria, and assess whether the tests are ready for
+PR creation.
+
+## Your Role
+
+Execute the e2e tests, analyze the test code for anti-patterns, verify
+that every acceptance criterion is covered by a test scenario, and iterate
+until quality standards are met. This phase may loop — you fix issues,
+re-run checks, and repeat until everything passes.
+
+## Critical Rules
+
+- **Run the project's actual commands.** Use the validation profile from `01-context.md`, not hardcoded commands.
+- **Fix issues, don't skip them.** If linting fails, fix the code. If tests fail, diagnose and fix. If the user asks to skip a failing check, evaluate the risk: explain what the failing check is testing, what would go unverified if skipped, and whether skipping could mask a real problem. Present this assessment so the user can make an informed decision.
+- **Anti-patterns are defects.** Hardcoded sleeps, brittle selectors, missing cleanup, and harness bypass are test quality issues that will cause flaky tests in CI. Fix them.
+- **Commit fixes separately.** Validation fixes get their own commits following the project's commit format.
+- **Do not modify code outside the story's scope** to fix pre-existing lint or test issues. Note them in the validation report.
+
+## Process
+
+### Step 1: Read Context
+
+Read:
+1. `.artifacts/e2e/{jira-key}/01-context.md` (validation profile and e2e infrastructure)
+2. `.artifacts/e2e/{jira-key}/02-plan.md` (what was planned)
+3.
`.artifacts/e2e/{jira-key}/04-impl-report.md` (implementation status)
+
+Extract the validation profile's pre-PR checks list and the e2e test
+execution command.
+
+### Step 2: Check Base Branch Currency
+
+Before running checks, verify the branch is current with its base.
+
+Check the **Repository Topology** section of `01-context.md`. Read
+`{owner}/{repo}` from the **Origin** field.
+
+If the repo is a fork, sync the fork with upstream first:
+
+```bash
+gh repo sync {owner}/{repo} --branch {base}
+```
+
+If `gh repo sync` fails, warn the user and record the failure in the
+validation report. Do not silently skip — this is the last gate before
+`/publish`.
+
+Then, regardless of topology:
+
+```bash
+git fetch origin
+```
+
+If `git fetch` fails, warn the user. Record the failure in the validation
+report under Branch Currency as "Unable to verify — fetch failed."
+
+```bash
+git rev-list --count HEAD..origin/{base}
+```
+
+If the branch is behind base, check whether a PR has already been
+created by looking for `.artifacts/e2e/{jira-key}/publish-metadata.json`.
+
+**If no PR exists yet**, offer to rebase:
+
+```bash
+git rebase origin/{base}
+```
+
+Follow the same conflict handling as the sync-with-base step of `/code`
+(stop, show conflicts, offer to resolve, proceed only with user approval).
+
+**If a PR already exists**, offer to merge instead:
+
+```bash
+git merge origin/{base}
+```
+
+If the user declines either operation, continue but note the staleness
+in the validation report.
+
+### Step 3: Run Pre-PR Checks
+
+Execute each check from the validation profile in order. For each check:
+
+1. **Run the command**
+2. **Capture the output**
+3. **Assess the result:** pass, fail, or warning
+
+Typical checks for e2e test code (discovered, not hardcoded):
+- Linting (on the test files)
+- E2e test execution (scoped to the new suite)
+
+**If a check fails:**
+
+1. Diagnose the failure — is it caused by the new test code or pre-existing?
+2.
If caused by the new test code: fix it, commit the fix, re-run the check
+3. If pre-existing: note it in the validation report, do not fix it
+4. If unclear: report to the user
+
+### Step 4: Anti-Pattern Check
+
+Read the base branch from the `## Branch` section of `02-plan.md`.
+Then read the diff of all new test files:
+
+```bash
+git diff {base}..HEAD -- {test file paths from the plan}
+```
+
+Focus on the test files created or modified by this story — ignore
+unrelated changes in the diff.
+
+Check for each anti-pattern. For each finding, fix the issue, commit,
+and re-run the affected tests.
+
+| Anti-Pattern | How to Detect | Fix |
+|---|---|---|
+| **Hardcoded sleeps** | `time.Sleep()`, `sleep()`, `setTimeout()` with fixed delay used to wait for system state | Replace with polling/retry: `Eventually(func, timeout, polling)` or the project's equivalent |
+| **Brittle selectors** | Hardcoded element IDs, CSS classes, XPath (UI tests) | Use semantic locators: roles, labels, text content, harness methods |
+| **Order-dependent tests** | Tests that reference state created by a prior test in the same file (not in BeforeEach). Detection heuristic: check whether any test case references a variable that is assigned in a prior test case rather than in setup/BeforeEach (or the framework's equivalent setup mechanism) | Make each test independent — create needed state in the test or BeforeEach |
+| **Shared mutable state** | Package-level variables mutated by tests, global state without per-test reset.
Detection heuristic: look for variables declared outside test functions that are assigned inside test cases without per-test reinitialization in BeforeEach (or the framework's equivalent setup mechanism) | Use per-test state via harness, test context, or local variables |
+| **Missing cleanup** | Resources created in tests without corresponding cleanup in AfterEach or defer | Add cleanup matching the reference suite's pattern |
+| **Harness bypass** | Direct HTTP calls, CLI exec, or API client instantiation instead of harness methods | Replace with harness method calls |
+| **Missing labels** | Test blocks (It/Describe) without CI-filtering labels | Add labels following the project's convention |
+| **Hardcoded values** | Inline timeout durations, polling intervals, resource names instead of constants | Use the project's test utility constants |
+| **Missing async polling** | Direct assertions on results of async operations (no Eventually/retry) | Wrap in Eventually with appropriate timeout and polling interval |
+| **Missing failure diagnostics** | No log collection or diagnostic output when tests fail | Add diagnostic output in AfterEach (matching reference suite pattern) |
+
+If no anti-patterns are found, record "No anti-patterns detected" in the
+validation report.
+
+### Step 5: Regression Check
+
+Verify that the new test suite doesn't interfere with existing tests:
+
+1. If the project has a "sanity" or "smoke" label filter, run that subset
+   as a fast check
+2. If no fast subset exists, run the e2e tests for adjacent suites (suites
+   in the same feature area) to check for interference
+3. Check for test isolation issues: verify that AfterSuite/teardown
+   cleans up all resources created by BeforeSuite/setup. Compare
+   against the reference suite's cleanup pattern. Ensure the new suite
+   does not leave state (running services, created resources, modified
+   configuration) that could affect other suites.
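One way to approximate the isolation check in item 3 is to compare a resource listing taken before the suite runs with one taken after teardown. The sketch below assumes each snapshot is a text file with one resource name per line; the helper name `leftover_resources` is hypothetical, and how the snapshots are produced is project-specific and would come from whatever the project's test infrastructure exposes.

```bash
# Hypothetical helper (an assumption, not part of the workflow): print the
# resources that exist after a suite run but did not exist before it.
# $1 = snapshot taken before the suite, $2 = snapshot taken after teardown.
leftover_resources() {
  before_sorted=$(mktemp)
  after_sorted=$(mktemp)

  # comm requires sorted input; -u also drops duplicate names.
  sort -u "$1" > "$before_sorted"
  sort -u "$2" > "$after_sorted"

  # -13 suppresses lines unique to the first file and lines common to both,
  # leaving only names that appear solely in the "after" snapshot.
  comm -13 "$before_sorted" "$after_sorted"

  rm -f "$before_sorted" "$after_sorted"
}
```

An empty result suggests teardown removed everything the suite created; any printed name is a candidate for the "Missing cleanup" finding and worth comparing against the reference suite's cleanup pattern.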
+
+If regressions are found:
+- Diagnose whether the new test code caused them
+- Fix regressions caused by the new tests, commit separately
+- Note pre-existing failures in the validation report
+
+### Step 6: Code Quality Review
+
+After automated checks pass, review the new test code for issues that
+automated tooling does not catch. Read the base branch from the `## Branch`
+section of `02-plan.md`, then diff against it:
+
+```bash
+git diff {base}..HEAD
+```
+
+**Test quality:**
+- Do tests actually verify what the AC describes? (not asserting something tangentially related)
+- Are assertions specific enough? (not just "no error" — verify the actual outcome)
+- Are `By()` step descriptions clear and meaningful?
+- Will these tests break only when real behavior changes, not when unrelated details change?
+
+**Maintainability:**
+- Are test names descriptive of what they verify?
+- Is the test structure consistent with the reference suite?
+- Are helper functions used appropriately (not duplicating existing utilities)?
+
+**Safety:**
+- No test code that could leak credentials or modify production state
+- No test code that creates resources without cleanup
+- No test code that could interfere with other test suites
+
+If this review surfaces issues:
+
+1. Fix issues in the new test code, commit each fix separately
+2. Note pre-existing issues in the validation report (do not fix them)
+
+### Step 7: Acceptance Criteria Verification
+
+After automated checks and code quality review, verify that every
+acceptance criterion from the story has been covered by a test scenario.
+This is the workflow's primary contract.
+
+1. Read the **Acceptance Criteria** from `01-context.md`
+2. Read the **Acceptance Criteria Coverage** matrix from `02-plan.md`
+3. For each acceptance criterion:
+   - **Trace to test scenario:** Is there a test (Describe/It block)
+     that exercises this criterion's behavior?
Follow the task mapping —
+     check that the task is marked Done and that the corresponding test
+     code exists.
+   - **Verify the test runs:** Does the test pass when executed?
+   - **Assess coverage:** Is the criterion fully covered by the test
+     scenarios, partially covered, or not addressed?
+
+Record the result for each criterion. If any criterion is not fully
+covered:
+
+1. If it's a gap in test scenarios — write the missing test, commit,
+   and re-run
+2. If it's ambiguous whether the criterion is covered — flag it to the
+   user with your assessment
+3. If the criterion cannot be verified through e2e tests (e.g., it
+   describes internal behavior or a non-functional requirement that
+   requires manual measurement) — note it as "not e2e-testable" with
+   an explanation of why
+
+### Step 8: Write Validation Report
+
+Write `.artifacts/e2e/{jira-key}/05-validation-report.md`:
+
+```markdown
+# Validation Report — {jira-key}
+
+## Branch Currency
+
+{Current with base / N commits behind {base} — rebased before validation
+ / N commits behind {base} — user chose to continue without rebasing}
+
+## Check Results
+
+| Check | Command | Result | Notes |
+|-------|---------|--------|-------|
+| {name} | `{command}` | {pass/fail/warning} | {brief note} |
+
+## Anti-Pattern Check
+
+| Anti-Pattern | Found | Fixed | Notes |
+|---|---|---|---|
+| Hardcoded sleeps | {yes/no} | {yes/n/a} | {details} |
+| Brittle selectors | {yes/no} | {yes/n/a} | {details} |
+| Order-dependent tests | {yes/no} | {yes/n/a} | {details} |
+| Shared mutable state | {yes/no} | {yes/n/a} | {details} |
+| Missing cleanup | {yes/no} | {yes/n/a} | {details} |
+| Harness bypass | {yes/no} | {yes/n/a} | {details} |
+| Missing labels | {yes/no} | {yes/n/a} | {details} |
+| Hardcoded values | {yes/no} | {yes/n/a} | {details} |
+| Missing async polling | {yes/no} | {yes/n/a} | {details} |
+| Missing failure diagnostics | {yes/no} | {yes/n/a} | {details} |
+
+{If no anti-patterns found: "No anti-patterns
detected."}
+
+## Regressions
+
+{Any test failures in existing tests. Distinguish between:
+ - Caused by the new test code (should be fixed)
+ - Pre-existing (noted but not fixed)
+ If none: "No regressions detected."}
+
+## Acceptance Criteria Verification
+
+| AC | Description | Test Scenario | Test File | Status |
+|----|-------------|--------------|-----------|--------|
+| AC-1 | {brief} | {scenario name} | {file:line} | {covered/partial/not addressed/not e2e-testable} |
+
+{If all covered: "All acceptance criteria verified."
+ If any gaps: describe what's missing and what was done about it.
+ If any not e2e-testable: list them with rationale.}
+
+## Quality Review Findings
+
+{Findings from the test quality, maintainability, and safety review.
+ Distinguish between:
+ - Issues fixed during validation (with commit hashes)
+ - Pre-existing issues noted but not fixed
+ If none: "No quality review findings."}
+
+## Pre-existing Issues
+
+{Lint warnings, test failures, or other issues that existed before this
+ story and were not fixed. If none: "No pre-existing issues observed."}
+
+## Validation Commits
+
+| Hash | Message |
+|------|---------|
+| {short hash} | {commit message} |
+
+{If no validation commits: "No additional commits needed during
+ validation."}
+
+## Result
+
+{PASS — all checks pass, no anti-patterns, all acceptance criteria
+ covered, no regressions.
+ OR
+ FAIL — with explanation of what still needs fixing.}
+```
+
+### Step 9: Present Results
+
+Summarize for the user:
+- Which checks passed and which failed
+- Anti-pattern check results (clean or what was fixed)
+- Acceptance criteria coverage (all covered, or which ones have gaps)
+- Any regressions found (and whether they were fixed)
+- Overall verdict: ready for `/publish` or not
+
+## Output
+
+- `.artifacts/e2e/{jira-key}/05-validation-report.md`
+- Additional test fixes (if anti-patterns or gaps were found)
+- Fix commits (if issues were found and fixed)
+
+## When This Phase Is Done
+
+Report your results:
+- Validation check results (all pass / some fail)
+- Anti-pattern check results
+- Acceptance criteria coverage
+- Regression status
+- Overall verdict
+
+Then **re-read the controller** (`controller.md`) for next-step guidance.
diff --git a/implement/skills/ingest.md b/implement/skills/ingest.md
index d860ede..f89bcd2 100644
--- a/implement/skills/ingest.md
+++ b/implement/skills/ingest.md
@@ -85,6 +85,8 @@ design workflows. Fetch them from there.
 #### 5a: Resolve the Docs Repo

 Check for an existing docs repo configuration at `.artifacts/prd/config.json`.
+This config is project-level and shared across workflows (prd, design,
+implement, e2e) — a prior workflow run may have already created it.

 **If the config exists**, read it and validate:
 1. Verify the path exists on the local filesystem
diff --git a/implement/skills/respond.md b/implement/skills/respond.md
index b8d2452..57092c8 100644
--- a/implement/skills/respond.md
+++ b/implement/skills/respond.md
@@ -134,11 +134,11 @@ For comments requiring code changes:
 2. Apply the change
 3. If the change affects behavior, update or add tests. Tests must
    validate behavioral contracts through public interfaces, not
-   implementation details — the same standard as Step 3b of
+   implementation details — the same standard as the write-tests step of
    `/code`.
Match existing test patterns in the affected package.
 4. Run the affected tests to verify
 5. Run lint and format checks on the changed files (same approach as
-   Step 3e of `/code`). Fix any issues before committing.
+   the lint-and-format step of `/code`). Fix any issues before committing.
 6. Commit using the project's commit format:

    ```bash
diff --git a/implement/skills/validate.md b/implement/skills/validate.md
index 0b6fcd3..16978d0 100644
--- a/implement/skills/validate.md
+++ b/implement/skills/validate.md
@@ -77,8 +77,8 @@ created by looking for `.artifacts/implement/{jira-key}/publish-metadata.json`.
 git rebase origin/{base}
 ```

-Follow the same conflict handling as Step 3h of `/code` (stop,
-show conflicts, offer to resolve, proceed only with user approval).
+Follow the same conflict handling as the sync-with-base step of `/code`
+(stop, show conflicts, offer to resolve, proceed only with user approval).

 **If a PR already exists** (post-publish), offer to merge instead:

From f72730aad643897ded7e18b404d7b6ada628a71c Mon Sep 17 00:00:00 2001
From: Andy Dalton
Date: Wed, 22 Apr 2026 22:31:51 -0400
Subject: [PATCH 2/5] Address PR review feedback: fix phase reference and add feature-defect handling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- AGENTS.md: Correct `/decompose` to `/sync` — the `/sync` phase is what creates Jira issues, not `/decompose` which produces the story breakdown
- validate.md: Add explicit handling for feature defects discovered during e2e test execution — mark tests as xfail/skip and note the defect rather than attempting to fix the feature implementation

Co-Authored-By: Claude Opus 4.6
---
 AGENTS.md              | 2 +-
 e2e/skills/validate.md | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 4766af6..d24f215 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -198,7 +198,7 @@ For detailed workflow development guidelines (structure, file conventions, testi

 ### e2e

--
Requires a Jira [QE] Story (typically created by the design workflow's `/decompose` phase) as input
+- Requires a Jira [QE] Story (typically created by the design workflow's `/sync` phase) as input
 - Jira is read-only — no phase in this workflow writes to Jira
 - Discovery-based infrastructure: e2e test framework, harness, auxiliary services, execution commands, and conventions are discovered during `/ingest` — not hardcoded
 - Reference suite pattern: before writing tests, identifies the most similar existing e2e test suite and extracts its patterns (imports, setup/teardown, harness usage, assertions, labels)
diff --git a/e2e/skills/validate.md b/e2e/skills/validate.md
index a652d41..22baae6 100644
--- a/e2e/skills/validate.md
+++ b/e2e/skills/validate.md
@@ -105,7 +105,12 @@ Typical checks for e2e test code (discovered, not hardcoded):
 1. Diagnose the failure — is it caused by the new test code or pre-existing?
 2. If caused by the new test code: fix it, commit the fix, re-run the check
 3. If pre-existing: note it in the validation report, do not fix it
-4. If unclear: report to the user
+4. If the test is correct but the feature behaves differently than the AC
+   describes: this is a feature defect, not a test bug. Mark the test as
+   xfail or skip with a reason referencing the AC, note the defect in the
+   validation report, and continue. Do not fix the feature implementation —
+   that is a [DEV] scope issue (see deviation rules in `/code`).
+5. If unclear: report to the user

 ### Step 4: Anti-Pattern Check

From 2c4783e82186c00854a7fc1632955bf18d164529 Mon Sep 17 00:00:00 2001
From: Andy Dalton
Date: Thu, 23 Apr 2026 11:43:56 -0400
Subject: [PATCH 3/5] Make e2e workflow framework-agnostic
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace Ginkgo-specific vocabulary with generic terms throughout all e2e skill files.
Generic terms now appear in process instructions with framework-specific examples in parentheses (e.g., "test grouping blocks (Describe/Context/It in Ginkgo, test classes in pytest, describe/it in Playwright)").

Key changes:
- "harness" as structural term → "test infrastructure abstractions" with harness/fixtures/page objects/helpers as discovered examples
- Ginkgo lifecycle hooks → generic "test lifecycle hooks" with project-specific discovery
- Auxiliary services made explicitly conditional (not all projects manage test services)
- Anti-pattern detection table rewritten with multi-framework examples (Go, Python, Playwright)
- Plan and code templates use generic vocabulary with notes to substitute the project's discovered terminology

Co-Authored-By: Claude Opus 4.6
---
 e2e/README.md            | 14 ++--
 e2e/guidelines.md        | 19 ++---
 e2e/skills/code.md       | 83 ++++++++++----------
 e2e/skills/controller.md | 2 +-
 e2e/skills/ingest.md     | 158 +++++++++++++++++++++++++--------------
 e2e/skills/plan.md       | 79 ++++++++++----------
 e2e/skills/publish.md    | 2 +-
 e2e/skills/revise.md     | 20 ++---
 e2e/skills/validate.md   | 32 ++++----
 9 files changed, 229 insertions(+), 180 deletions(-)

diff --git a/e2e/README.md b/e2e/README.md
index 21504cb..aaa9688 100644
--- a/e2e/README.md
+++ b/e2e/README.md
@@ -31,7 +31,7 @@ A story-to-tests workflow that takes a Jira [QE] Story, discovers the project's
  -> fetches [QE] story from Jira
  -> verifies [DEV] dependencies are merged
  -> loads design document and PRD context
- -> explores e2e test infrastructure (framework, harness, patterns)
+ -> explores e2e test infrastructure (framework, test abstractions, patterns)
  -> selects reference suite as pattern source
  -> discovers validation profile (test execution, lint commands)
  -> writes .artifacts/e2e/EDM-5678/01-context.md
@@ -40,7 +40,7 @@ A story-to-tests workflow that takes a Jira [QE] Story, discovers the project's
  -> maps each acceptance criterion to test scenarios
  -> selects reference suite and documents
patterns to follow
  -> designs test file structure (suite file + test files)
- -> plans harness method usage and auxiliary services
+ -> plans test infrastructure usage and auxiliary services (if any)
  -> breaks work into ordered tasks (suite file first, then scenarios)
  -> writes 02-plan.md

@@ -81,7 +81,7 @@ All artifacts are stored in `.artifacts/e2e/{jira-key}/`.
 .artifacts/e2e/EDM-5678/
  01-context.md (story context, e2e infrastructure, validation profile)
  02-plan.md (scenario breakdown, AC coverage -- updated as tasks complete)
- 03-test-report.md (tests written, harness methods used)
+ 03-test-report.md (tests written, test infrastructure used)
  04-impl-report.md (changes, commits, deviations, discoveries)
  05-validation-report.md (check results, anti-patterns, regressions)
  06-pr-description.md (PR body)
@@ -93,19 +93,19 @@ All artifacts are stored in `.artifacts/e2e/{jira-key}/`.

 ### Discovery-Based Infrastructure

-The workflow does not hardcode language-specific commands or framework assumptions. During `/ingest`, it discovers the project's e2e testing framework, harness, auxiliary services, execution commands, and conventions. This makes the workflow portable across projects using different testing stacks (Ginkgo, Playwright, pytest, Cypress, etc.).
+The workflow does not hardcode language-specific commands or framework assumptions. During `/ingest`, it discovers the project's e2e testing framework, test infrastructure abstractions (harness, fixtures, page objects, helpers — whatever the project uses), auxiliary services (if any), execution commands, and conventions. This makes the workflow portable across projects using different testing stacks (Ginkgo, Playwright, pytest, Cypress, Jest, etc.).

 ### Reference Suite Pattern

-Before writing any test code, the workflow identifies the most similar existing e2e test suite in the project and extracts its patterns: imports, setup/teardown, harness usage, assertion style, labels, and cleanup.
New tests follow these patterns exactly, ensuring consistency with the project's existing test base.
+Before writing any test code, the workflow identifies the most similar existing e2e test suite in the project and extracts its patterns: imports, lifecycle hooks, test infrastructure usage, assertion style, labels, and cleanup. New tests follow these patterns exactly, ensuring consistency with the project's existing test base.

 ### Scenario-Driven Planning

-Unlike implementation planning (task-driven), e2e test planning is scenario-driven. Each acceptance criterion maps to one or more concrete test scenarios with specific Describe/Context/It nesting, steps, assertions, and labels. This ensures every AC is verifiably covered.
+Unlike implementation planning (task-driven), e2e test planning is scenario-driven. Each acceptance criterion maps to one or more concrete test scenarios with specific test grouping, steps, assertions, and labels. This ensures every AC is verifiably covered.

 ### Anti-Pattern Detection

-Validation checks for 10 common e2e test anti-patterns: hardcoded sleeps, brittle selectors, order-dependent tests, shared mutable state, missing cleanup, harness bypass, missing labels, hardcoded values, missing async polling, and missing failure diagnostics. Each detected anti-pattern is fixed during validation.
+Validation checks for 10 common e2e test anti-patterns: hardcoded sleeps, brittle selectors, order-dependent tests, shared mutable state, missing cleanup, test infrastructure bypass, missing labels, hardcoded values, missing async polling, and missing failure diagnostics. Each detected anti-pattern is fixed during validation.

 ### Feature Defects Are Not Test Bugs

diff --git a/e2e/guidelines.md b/e2e/guidelines.md
index c352d9c..731176a 100644
--- a/e2e/guidelines.md
+++ b/e2e/guidelines.md
@@ -3,12 +3,13 @@
 ## Principles

 - The e2e tests must validate the **user-facing behaviors** described in the story's acceptance criteria.
Each AC maps to one or more concrete test scenarios.
-- **E2e tests exercise the system from the outside.** They validate observable outcomes through the project's test harness, not internal component contracts. Write tests that a QE engineer would write — scenario-driven, using the project's actual tools and infrastructure.
-- **Follow the project's existing e2e test patterns.** Read the most similar existing test suite before writing new tests. Match the harness usage, setup/teardown patterns, naming conventions, labels, and assertion style.
+- **E2e tests exercise the system from the outside.** They validate observable outcomes through the project's test infrastructure, not internal component contracts. Write tests that a QE engineer would write — scenario-driven, using the project's actual tools and infrastructure.
+- **Scope is e2e only.** Do not consider, plan, or write unit or integration tests. Every test this workflow produces must exercise the system end-to-end.
+- **Follow the project's existing e2e test patterns.** Read the most similar existing test suite before writing new tests. Match the test infrastructure usage, lifecycle hooks, naming conventions, labels, and assertion style.
 - Follow the **project's commit format** as discovered during `/ingest` and recorded in the validation profile. Commit one logical unit of work per commit — typically one commit per plan task. Don't batch everything into a single commit, but don't create a commit per file either.
 - Each completed story must leave the test suite in a **stable state**. All new tests pass, no regressions in existing tests.
 - The test plan is a **living document**. Update `02-plan.md` as tasks are completed so it reflects current progress.
-- **Discover, don't assume.** The project's e2e test framework, harness, auxiliary services, execution commands, and conventions are discovered during `/ingest` and recorded in the context document.
Never hardcode language-specific or project-specific assumptions.
+- **Discover, don't assume.** The project's e2e test framework, test infrastructure abstractions, auxiliary services (if any), execution commands, and conventions are discovered during `/ingest` and recorded in the context document. Never hardcode language-specific or project-specific assumptions. Different projects use different test infrastructure — harness objects, fixtures, page objects, helper modules, or nothing at all. Use whatever vocabulary the project uses.
 - **Shipped artifacts describe the final state, not the journey.** Code comments, commit messages, PR descriptions, and test names describe what the tests verify now — not the process of getting there. Do not reference abandoned approaches, intermediate failures introduced and fixed during the same session, or prior states that no longer exist. Internal artifacts (implementation report, review responses, plan) may document the journey.

 ## Hard Limits
@@ -28,8 +29,8 @@
 - Show your work before finalizing. After `/plan`, present the test scenario breakdown for review — do not assume it's ready.
 - Before `/code`, confirm the feature branch name and starting point with the user.
 - Before `/publish`, confirm the PR target branch and description with the user.
-- **Read before writing.** Before writing tests for a suite, read existing tests in similar suites to match patterns. Read the harness methods you plan to use.
-- **Deviation transparency.** If during `/code` you encounter something unexpected (a feature defect, a harness limitation, an assumption that doesn't hold), report it. Apply deviation rules (see `skills/code.md`) but never silently change approach.
+- **Read before writing.** Before writing tests for a suite, read existing tests in similar suites to match patterns. Read the test infrastructure code you plan to use (harness methods, fixtures, page objects, or helpers — whatever the project provides).
+- **Deviation transparency.** If during `/code` you encounter something unexpected (a feature defect, a test infrastructure limitation, an assumption that doesn't hold), report it. Apply deviation rules (see `skills/code.md`) but never silently change approach. - Flag assumptions explicitly. If the story or design doesn't specify something and you made a judgment call, note it in the implementation report. ## Quality @@ -38,11 +39,11 @@ - **Scenario coverage.** E2e test quality is measured by scenario coverage — do the tests exercise every acceptance criterion? — and by resilience — will the tests break only when real behavior changes, not when unrelated implementation details change? - **Anti-pattern avoidance.** Do not introduce: - Hardcoded sleeps or fixed delays (use polling/retry mechanisms) - - Brittle selectors (use semantic locators, harness methods) + - Brittle selectors (use semantic locators, test infrastructure methods) - Order-dependent tests (each test must be independently runnable) - Shared mutable state between tests (use per-test isolation) - Missing cleanup (follow the project's teardown patterns) - - Harness bypass (use the project's test harness, not ad-hoc API calls) + - Test infrastructure bypass (use the project's test abstractions, not ad-hoc API calls or direct infrastructure access) - Run the project's e2e test suite (scoped to the new tests) before considering the work complete. - Self-review test code before presenting. Check for: unused imports, dead code, debug artifacts, hardcoded values that should be constants, inconsistencies with the reference suite's patterns. 
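The anti-pattern list above asks for polling or retry mechanisms instead of hardcoded sleeps. As a minimal sketch of that difference, assuming a project that has no polling helper of its own, the Python below polls a condition until it holds or a timeout expires. The names `wait_until` and `device_is_online` are hypothetical and do not come from this workflow; where the project already provides a mechanism (for example Eventually in Gomega), that mechanism should be used instead.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only; prefer the project's own
    polling mechanism when one exists.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Short pause between polls. The wait ends as soon as the condition
        # holds, unlike a fixed delay that always blocks for the full time.
        time.sleep(interval)


# Hypothetical usage: the counter stands in for a system that becomes ready
# only after a few checks.
attempts = {"count": 0}


def device_is_online() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


assert wait_until(device_is_online, timeout=2.0, interval=0.01)
```

The test returns as soon as the observable outcome appears and fails with a clear timeout otherwise, which is the resilience property the Quality section describes.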
@@ -54,7 +55,7 @@ Stop and request human guidance when: - The `[DEV]` story dependencies are unmerged — the feature under test may not exist yet - The e2e test infrastructure is broken or unavailable - The test requires an environment capability that doesn't exist (e.g., TPM, specific VM type, identity provider) -- A test scenario requires harness methods that don't exist and adding them is outside the story's scope +- A test scenario requires test infrastructure methods that don't exist and adding them is outside the story's scope - The feature behaves differently than the acceptance criteria describe (potential defect in the `[DEV]` implementation) - Confidence in the test approach is low @@ -64,6 +65,6 @@ This workflow gets deployed into different projects. Respect the target project: - Read and follow the project's own `AGENTS.md` or `CLAUDE.md` files - Read and follow any test-specific documentation (test directory README, AGENTS.md, GUIDELINES.md) -- Adopt the project's e2e testing conventions, harness patterns, and commit message format +- Adopt the project's e2e testing conventions, test infrastructure patterns, and commit message format - Use the project's e2e test execution commands as discovered during `/ingest` - Respect the project's CI/CD pipeline expectations diff --git a/e2e/skills/code.md b/e2e/skills/code.md index 24c203b..d172f31 100644 --- a/e2e/skills/code.md +++ b/e2e/skills/code.md @@ -12,15 +12,15 @@ suite's patterns, committing incrementally. ## Your Role Work through the plan's task breakdown, writing e2e test code for each -task. Follow the reference suite's patterns exactly — imports, harness -usage, assertion style, labels, setup/teardown. Commit each logical unit -of work independently. +task. Follow the reference suite's patterns exactly — imports, test +infrastructure usage, assertion style, labels, lifecycle hooks. Commit +each logical unit of work independently. 
## Critical Rules - **Follow the plan.** Execute tasks in the order specified in `02-plan.md`. If you need to deviate, update the plan and note why. - **Read before writing.** Before writing any test code, read the reference suite files and existing tests in similar suites. Match their patterns. -- **Use the project's harness.** Do not make ad-hoc API calls, CLI invocations, or direct infrastructure access. Use the harness methods and test utilities the project provides. +- **Use the project's test infrastructure.** Do not make ad-hoc API calls, CLI invocations, or direct infrastructure access. Use the test abstractions (harness, fixtures, page objects, helpers) and utilities the project provides. - **One commit per plan task.** Each commit must follow the project's commit format (from the validation profile) and be independently meaningful. - **Update the plan.** Mark tasks as completed in `02-plan.md` as you go. On re-invocation, check the plan to see what's already done. - **No scope creep.** Do not write tests beyond the story's acceptance criteria, refactor existing test code, or improve test infrastructure. Note discoveries in the implementation report. @@ -148,7 +148,7 @@ For each task in the plan, follow this cycle. Before making any changes, read: - The reference suite file(s) noted in the plan (re-read to reinforce patterns) -- Any harness methods the task will use (verify signatures and behavior) +- Any test infrastructure methods the task will use (verify signatures and behavior) - Existing test files in neighboring suites (to match patterns) - Any test utilities or constants the task will use @@ -158,32 +158,36 @@ Write the test code for this task, following the reference suite's patterns exactly: 1. **Match the import block:** Use the same imports as the reference suite. - Include framework imports (e.g., `. "github.com/onsi/ginkgo/v2"`), - harness imports, and utility imports. -2. 
**Match the block structure:** Use the project's Describe/Context/It - nesting (or equivalent). Match indentation, block naming conventions, - and label placement. -3. **Use `By()` step descriptions** (or the project's equivalent) for - human-readable test flow documentation. -4. **Use the harness.** Call harness methods for all system interactions. - Do not create ad-hoc HTTP clients, CLI wrappers, or API calls. + Include framework imports, test infrastructure imports, and utility + imports. +2. **Match the test grouping structure:** Use the project's test + organization blocks (e.g., Describe/Context/It in Ginkgo, + test classes/methods in pytest, describe/it in Playwright). Match + indentation, naming conventions, and label placement. +3. **Use step annotations** (if the project uses them — e.g., By() in + Ginkgo, test step markers in BDD frameworks) for human-readable test + flow documentation. +4. **Use the project's test infrastructure.** Call the project's test + abstractions for all system interactions. Do not create ad-hoc HTTP + clients, CLI wrappers, or API calls. 5. **Use test utilities.** Use the project's constants (timeouts, polling intervals, resource type strings). Do not hardcode values. -6. **Use async assertions.** Use `Eventually`/`Consistently` with - timeout and polling (or the project's equivalent) for any operation - that may not complete immediately. Never use `time.Sleep` or - equivalent fixed delays. +6. **Use async polling.** Use the project's async waiting mechanism + (e.g., Eventually/Consistently in Ginkgo, polling helpers in pytest, + expect with toPass/polling in Playwright) for any operation that may + not complete immediately. Never use fixed delays (e.g., time.Sleep, + asyncio.sleep, page.waitForTimeout). 7. **Follow test isolation patterns.** Generate unique test IDs using the project's mechanism. Use those IDs for resource names. Ensure cleanup - happens via AfterEach or deferred functions. -8. 
**Apply labels.** Follow the label convention from the context document - (e.g., ticket ID, component tag). + happens via the project's teardown hooks or deferred functions. +8. **Apply labels/tags.** Follow the label convention from the context + document. For the suite file (typically Task 1): -- Follow the reference suite's BeforeSuite, BeforeEach, AfterEach, - AfterSuite structure exactly -- Start only the auxiliary services the plan identified as needed -- Use the same harness initialization pattern +- Follow the reference suite's lifecycle hook structure exactly (use the + actual hook names discovered during `/ingest`) +- Start only the auxiliary services the plan identified as needed (if any) +- Use the same test infrastructure initialization pattern - Use the same login/auth pattern #### 3c: Run Tests @@ -195,10 +199,10 @@ If tests fail, diagnose **where** the problem is before fixing: | Diagnosis | Symptom | Action | |-----------|---------|--------| -| **Test code is wrong** | Wrong harness method, bad assertion, incorrect setup, wrong import | Fix the test code | +| **Test code is wrong** | Wrong method call, bad assertion, incorrect setup, wrong import | Fix the test code | | **Test expectation is wrong** | Feature behaves differently than the AC implies | Verify against the AC — if the AC is ambiguous, note in report and escalate to user | -| **Feature has a defect** | Feature doesn't match its own [DEV] story's AC — a genuine bug | Note as a discovery in the implementation report — do NOT fix the feature (out of scope). The test may need to be adjusted to skip or xfail if the defect blocks it. | -| **Harness limitation** | Harness doesn't expose a method needed for the scenario | If a simple helper within the test file suffices, write it. If a harness change is needed, escalate — that's out of scope. 
| +| **Feature has a defect** | Feature doesn't match its AC — a genuine bug | Note as a discovery in the implementation report — do NOT fix the feature (out of scope). The test may need to be adjusted to skip or xfail if the defect blocks it. | +| **Test infrastructure limitation** | The project's test abstractions don't expose a method needed for the scenario | If a simple helper within the test file suffices, write it. If a test infrastructure change is needed, escalate — that's out of scope. | | **Environment issue** | Services not running, VM unavailable, network error | Report to user — this is not a test code problem | #### 3d: Lint and Format @@ -251,8 +255,8 @@ Do **not** give it the implementing agent's reasoning or conversation history — the value comes from independent eyes. The subagent should review as a senior QE engineer familiar with the -project's testing conventions, focusing on: correct harness usage, -assertion completeness, anti-patterns (hardcoded sleeps, brittle +project's testing conventions, focusing on: correct test infrastructure +usage, assertion completeness, anti-patterns (hardcoded sleeps, brittle selectors, missing cleanup), label conventions, and test isolation. **Tier 3: Structured self-review.** If the runtime does not support @@ -260,7 +264,7 @@ spawning subagents, fall back to a structured self-review. 
Re-read the staged diff and check for: - Anti-patterns: hardcoded sleeps, shared mutable state, missing cleanup -- Harness bypass: direct API calls instead of harness methods +- Test infrastructure bypass: direct API calls instead of project-provided abstractions - Missing async polling: synchronous assertions on async operations - Hardcoded values: inline strings/numbers instead of constants - Pattern drift: deviations from the reference suite's conventions @@ -366,7 +370,7 @@ During test implementation, you may encounter unexpected situations: | Situation | Action | Approval | |-----------|--------|----------| | Feature defect found — test reveals a bug in the [DEV] implementation | Note in report as a discovery. Adjust test to document expected behavior (may xfail or skip with reason). Do NOT fix the feature. | Auto | -| Harness doesn't expose needed method | Write a local helper within the test file if simple. If a harness change is needed, escalate. | Auto (local helper) / Required (harness change) | +| Test infrastructure doesn't expose needed method | Write a local helper within the test file if simple. If a test infrastructure change is needed, escalate. 
| Auto (local helper) / Required (infrastructure change) | | Test scenario is significantly simpler than planned | Note in report, continue | Auto | | Test scenario is significantly more complex than planned | **Stop and ask the user** — the story may need re-scoping | Required | | Reference suite pattern doesn't apply to this scenario | Adapt minimally, note deviation in report | Auto | @@ -393,16 +397,17 @@ After all tasks are complete (or if interrupted), write: |----------|-------------|--------| | {name} | {what it validates} | {labels} | -## Harness Methods Used +## Test Infrastructure Used -| Method | Purpose | -|--------|---------| +| Method / Abstraction | Purpose | +|---------------------|---------| | `{method}` | {what it does} | -## Auxiliary Services Required +## Auxiliary Services -{Which services must be running for these tests. If the services are - self-starting (testcontainers), note that.} +{Which services must be running for these tests, if any. If the services + are self-starting, note that. 
If tests run against a pre-existing + environment: "Tests run against {environment}."} ## Notes @@ -435,7 +440,7 @@ After all tasks are complete (or if interrupted), write: {Notable findings during test implementation: - Feature defects found (bugs in the [DEV] implementation) - - Harness gaps (missing methods or capabilities) + - Test infrastructure gaps (missing methods or capabilities) - Missing test infrastructure - Pre-existing test issues in adjacent suites If none: "No notable discoveries."} diff --git a/e2e/skills/controller.md b/e2e/skills/controller.md index fa0cbdc..010c9ea 100644 --- a/e2e/skills/controller.md +++ b/e2e/skills/controller.md @@ -98,7 +98,7 @@ ingest → plan → [revise loop] → code → validate → publish → [respond - `/plan` reveals story gaps or contradictions → suggest the user clarify with the story author or update the story - `/code` reveals plan gaps → the plan is updated inline during implementation; offer `/validate` when implementation is complete - `/code` discovers a feature defect (test reveals a bug in the [DEV] implementation) → note it in the implementation report; the test may need to xfail or skip. 
Do NOT recommend fixing the feature — that is out of scope -- `/code` discovers a missing harness method (plan referenced a method that doesn't exist) → see deviation rules in `code.md`; a local helper may suffice, or the user decides whether to adjust the plan or add harness support outside this workflow +- `/code` discovers a missing test infrastructure method (plan referenced a method that doesn't exist) → see deviation rules in `code.md`; a local helper may suffice, or the user decides whether to adjust the plan or add test infrastructure support outside this workflow - `/validate` reveals test failures → offer to diagnose and fix, then re-run `/validate` - `/validate` reveals anti-patterns → fix them during validation, then re-run the affected checks - `/validate` reveals unsatisfied acceptance criteria → if fixable (missing test scenarios), write them during validation; if the criterion is ambiguous or not e2e-testable, escalate to the user diff --git a/e2e/skills/ingest.md b/e2e/skills/ingest.md index ed3c641..c1f8a34 100644 --- a/e2e/skills/ingest.md +++ b/e2e/skills/ingest.md @@ -21,7 +21,7 @@ scenarios and file structure. - **Read-only.** Jira access is read-only. Never create, update, or modify Jira issues. - **Capture, don't implement.** Record what you find — test scenario decisions happen in `/plan`. -- **Deep infrastructure discovery.** Unlike implementation ingestion, e2e ingestion must thoroughly explore the test harness, auxiliary services, setup/teardown patterns, and test conventions. Shallow discovery leads to tests that don't follow project patterns. +- **Deep infrastructure discovery.** Unlike implementation ingestion, e2e ingestion must thoroughly explore the project's test infrastructure — whatever abstractions it uses (harness, fixtures, page objects, helpers), lifecycle hooks, auxiliary services (if any), and test conventions. Shallow discovery leads to tests that don't follow project patterns. 
- **Note unknowns.** If you can't determine something from the codebase, say so explicitly. - **Re-invocation diffs before overwriting.** If `01-context.md` already exists, preserve it before exploring. After compiling new context, diff against the previous version and present changes to the user before overwriting (see Steps 2a and 7a). @@ -213,43 +213,70 @@ Map the e2e test directory structure: (e.g., `*_suite_test.go` + `*_test.go` in Go/Ginkgo) 4. **Naming conventions:** How are test files and directories named? -#### 6e: Test Harness +#### 6e: Test Infrastructure Abstractions -Explore the test harness in depth — this is the API that test code uses: +Discover what abstractions the project uses for test code to interact +with the system under test. Projects vary widely — look for whichever +of these the project uses (it may use one, several, or none): -1. **Location:** Find harness source files (e.g., `test/harness/e2e/`) -2. **Initialization:** How do tests obtain the harness? (global var, - per-worker function, constructor) -3. **Key methods:** Read the harness files and catalog the public methods - relevant to the story's scope. Focus on methods the test scenarios - will need — don't catalog the entire harness. -4. **Domain-specific harness files:** Many projects split the harness by - domain (e.g., `harness_device.go`, `harness_fleet.go`). Identify which - domain files are relevant to the story. +- **Harness object:** A central test API object (e.g., `test/harness/`) +- **Fixtures:** Framework-provided setup mechanisms (e.g., pytest + fixtures, Playwright fixtures) +- **Page objects:** UI interaction abstractions (e.g., Playwright/Cypress + page object models) +- **Helper modules:** Standalone utility functions or classes for test + setup and interaction -#### 6f: Setup and Teardown Patterns +For whatever the project uses: -Read 2-3 existing suite files to understand lifecycle patterns: +1. **Location:** Where do the test infrastructure source files live? 
+2. **Initialization:** How do tests obtain access? (global variable, + dependency injection, fixture parameter, constructor, import) +3. **Key methods:** Catalog the public methods relevant to the story's + scope. Focus on methods the test scenarios will need — don't catalog + the entire API. +4. **Domain-specific files:** Some projects split test infrastructure by + domain. Identify which files are relevant to the story. -1. **BeforeSuite / setup_module:** What happens once per suite? - (auxiliary services, providers, harness initialization) -2. **BeforeEach / setup_method:** What happens before each test? - (login, environment reset, test context creation) -3. **AfterEach / teardown_method:** What happens after each test? - (log collection on failure, resource cleanup) -4. **AfterSuite / teardown_module:** What happens after all tests? - (auxiliary service cleanup) +If the project has no dedicated test infrastructure abstractions (tests +interact with the system directly), note that — this is a valid pattern. + +#### 6f: Test Lifecycle + +Read 2-3 existing suite/test files to understand lifecycle patterns. +Use whatever terminology the project's framework uses — common patterns +include: + +1. **Suite-level setup** (e.g., BeforeSuite, setup_module, beforeAll): + What happens once per suite? (services, providers, initialization) +2. **Per-test setup** (e.g., BeforeEach, setup_method, beforeEach): + What happens before each test? (login, reset, context creation) +3. **Per-test teardown** (e.g., AfterEach, teardown_method, afterEach): + What happens after each test? (log collection, resource cleanup) +4. **Suite-level teardown** (e.g., AfterSuite, teardown_module, afterAll): + What happens after all tests? (service cleanup) + +Record the actual hook names the project uses — downstream phases will +use these names, not generic placeholders. 
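The four lifecycle levels listed in 6f can be illustrated with Python's standard `unittest` module, used here only as a neutral example. This is a sketch, not the target project's framework: the class name `LifecycleDemo` and the `events` list are hypothetical, and the real hook names must come from what `/ingest` discovers.

```python
import io
import unittest

events = []  # records the order in which hooks and tests run


class LifecycleDemo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Suite-level setup: runs once before any test in the class.
        events.append("suite-level setup")

    def setUp(self):
        # Per-test setup: runs before each test method.
        events.append("per-test setup")

    def tearDown(self):
        # Per-test teardown: runs after each test method.
        events.append("per-test teardown")

    @classmethod
    def tearDownClass(cls):
        # Suite-level teardown: runs once after all tests in the class.
        events.append("suite-level teardown")

    def test_one(self):
        events.append("test_one")

    def test_two(self):
        events.append("test_two")


def run_demo():
    """Run the demo suite quietly and return the recorded hook order."""
    events.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LifecycleDemo)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return list(events)
```

Calling `run_demo()` returns the hook order for the two tests. Recording the equivalent order and hook names for the target project gives downstream phases concrete names to use rather than generic placeholders.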
#### 6g: Auxiliary Services -Discover what external services tests depend on: +If the project manages external services for e2e tests, discover them. +Not all projects do this — some test against pre-deployed environments +or let the framework handle service lifecycle internally. + +If the project does manage test services: 1. **Services used:** Registry, git server, database, identity provider, metrics collector, tracing, etc. -2. **How started:** Testcontainers (self-starting), make targets, manual -3. **How accessed:** Helper functions, environment variables, harness methods +2. **How started:** Testcontainers, make targets, docker-compose, manual +3. **How accessed:** Helper functions, environment variables, test + infrastructure methods 4. **Singleton vs. per-suite:** Are services shared across suites? +If no auxiliary service management is found, note "Tests run against a +pre-existing environment" or whatever the actual pattern is. + #### 6h: Test Utilities Find test helper packages: @@ -279,8 +306,8 @@ similar to what needs to be written: 1. Match by feature area (e.g., if the story tests rollout behavior, find the existing rollout suite) -2. If no exact match, find a suite that uses similar harness methods or - tests similar interaction patterns +2. If no exact match, find a suite that uses similar test infrastructure + methods or tests similar interaction patterns 3. Read the selected suite thoroughly: suite file + 1-2 test files 4. Extract concrete patterns: imports, setup, assertions, labels, helpers @@ -291,14 +318,14 @@ Focus exploration on the test infrastructure files. Apply the convergence heuristic per discovery area (Steps 6c–6j), not across the entire exploration: within each area, if the last 5-7 files explored introduced no new patterns, that area is likely complete. 
E2e infrastructure spans -a broad surface (harness files, auxiliary configs, suite files, utilities, +a broad surface (test infrastructure files, auxiliary configs, suite files, utilities, CI workflows, test documentation), so premature convergence can miss critical patterns. ### Step 7: Compile Context > **Checkpoint:** Step 6 is the heaviest phase of ingestion (10 sub-steps -> across harness, services, utilities, conventions, and reference suites). +> across test infrastructure, services, utilities, conventions, and reference suites). > Before compiling, verify that all Step 6 sub-steps have been completed > and that key findings are captured. If working in a constrained context, > consider spawning a subagent for the compilation. @@ -366,41 +393,56 @@ If this is a first invocation, write ## E2E Test Infrastructure ### Framework -- **Framework:** {e.g., Ginkgo v2 + Gomega, Playwright, pytest} -- **Runner:** {e.g., ginkgo CLI, playwright test, pytest} +- **Framework:** {e.g., Ginkgo v2 + Gomega, Playwright, pytest, Cypress, Jest} +- **Runner:** {e.g., ginkgo CLI, playwright test, pytest, npx cypress} - **Test location:** {e.g., test/e2e/} -- **Suite organization:** {e.g., one directory per feature area} +- **Suite organization:** {e.g., one directory per feature area, flat files} ### Test Execution - **Run all e2e tests:** `{command}` - **Run specific suite:** `{command with scoping}` -- **Filter by label:** `{mechanism, e.g., GINKGO_LABEL_FILTER="label"}` -- **Filter by description:** `{mechanism, e.g., GINKGO_FOCUS="pattern"}` -- **Parallel execution:** `{mechanism, e.g., GINKGO_PROCS=N}` +- **Filter by label/tag:** `{mechanism}` +- **Filter by name/description:** `{mechanism}` +- **Parallel execution:** `{mechanism, if supported}` - **Environment assumptions:** {what must be running before tests execute} -### Harness -- **Location:** {path(s) to harness files} -- **Initialization:** {how tests obtain the harness} +### Test Infrastructure + +{Describe what 
abstractions the project uses. Include whichever of the + following the project actually has — omit sections that don't apply:} + +- **Type:** {harness object / fixtures / page objects / helper modules / none} +- **Location:** {path(s) to test infrastructure files} +- **Initialization:** {how tests obtain access} - **Key methods for this story:** | Method | Purpose | Source File | |--------|---------|-------------| | `{method}` | {what it does} | {file} | -### Setup/Teardown Patterns -- **BeforeSuite:** {what happens} -- **BeforeEach:** {what happens} -- **AfterEach:** {what happens} -- **AfterSuite:** {what happens} +{If no dedicated test infrastructure: "Tests interact with the system + directly — no harness, fixtures, or page objects."} + +### Test Lifecycle + +{Use the actual hook names from the project's framework:} + +- **Suite-level setup** ({discovered hook name}): {what happens} +- **Per-test setup** ({discovered hook name}): {what happens} +- **Per-test teardown** ({discovered hook name}): {what happens} +- **Suite-level teardown** ({discovered hook name}): {what happens} ### Auxiliary Services +{If the project manages external services for tests:} + | Service | How Started | How Accessed | Required By | |---------|-------------|-------------|-------------| -| {name} | {testcontainer/make target/manual} | {helper/env var/harness method} | {which tests} | +| {name} | {mechanism} | {how tests access it} | {which tests} | -{If no auxiliary services: "No auxiliary services required for e2e tests."} +{If tests run against a pre-existing environment or no auxiliary service + management exists: "Tests run against {describe environment}. 
No + test-managed auxiliary services."} ### Test Utilities - **Constants:** {path, key constants} @@ -409,9 +451,9 @@ If this is a first invocation, write - **Test data:** {where fixtures live} ### Conventions -- **Labels:** {convention, e.g., Label("ticket-id", "component-tag")} +- **Labels/tags:** {convention for CI-filtering labels or tags} - **File naming:** {convention for test files} -- **Test naming:** {convention for Describe/It blocks} +- **Test naming:** {convention for test grouping and naming} - **Lint rules:** {test-specific lint rules, if any} - **Documentation:** {test docs locations} @@ -423,9 +465,9 @@ If this is a first invocation, write **Patterns to follow:** - **Imports:** {import pattern from the suite file} -- **Setup:** {BeforeSuite/BeforeEach pattern} -- **Assertions:** {assertion style, e.g., Eventually/Expect} -- **Labels:** {how labels are applied} +- **Setup:** {lifecycle hook pattern, using the project's actual hook names} +- **Assertions:** {assertion style, including any async/polling patterns} +- **Labels:** {how labels/tags are applied} - **Cleanup:** {teardown pattern} - **Key code pattern:** {any distinctive pattern worth replicating} @@ -466,14 +508,14 @@ If this is a first invocation, write must be a concrete question — not an observation, concern, or statement of fact. Ask what needs to be decided, not what you noticed. - Good: "Should the fleet rollback e2e tests enroll a real VM via the - harness, or use the device simulator? The existing rollout suite uses - real VMs but the agent suite uses both patterns." + Good: "Should the fleet rollback e2e tests enroll a real VM, or use + the device simulator? The existing rollout suite uses real VMs but the + agent suite uses both patterns." Bad: "Need to figure out the VM approach." (too vague) - Bad: "The harness supports both VMs and simulators." (observation, not - a question)} + Bad: "The test infrastructure supports both VMs and simulators." 
+ (observation, not a question)} ``` ### Step 7a: Diff Against Prior Ingest (Re-invocation Only) @@ -483,7 +525,7 @@ the newly compiled content. Focus the diff on: - Changes to acceptance criteria or testing approach - Changes to dependency status (have [DEV] stories been merged since last ingest?) -- New harness methods or infrastructure discovered +- New test infrastructure methods or patterns discovered - Changes to the validation profile or test execution commands - Changes to the reference suite selection @@ -501,7 +543,7 @@ Present a brief summary: - Story scope and acceptance criteria - Design and PRD context loaded (or what was missing) - Dependency status — especially whether `[DEV]` stories are merged -- E2E test infrastructure discovered (framework, harness, reference suite) +- E2E test infrastructure discovered (framework, test abstractions, reference suite) - Validation profile discovered (how to run tests) - Open questions (if any) — frame these as items that `/plan` will investigate, not as blockers. The planner reads the actual code and @@ -521,7 +563,7 @@ what changes were found and that the existing context was preserved. Report your findings: - Story scope and key acceptance criteria - Dependency status ([DEV] stories merged or not) -- E2E infrastructure discovered (framework, harness, reference suite) +- E2E infrastructure discovered (framework, test abstractions, reference suite) - Validation profile summary - Assessment of readiness for `/plan` diff --git a/e2e/skills/plan.md b/e2e/skills/plan.md index 5ec8670..5a968ef 100644 --- a/e2e/skills/plan.md +++ b/e2e/skills/plan.md @@ -8,7 +8,7 @@ description: Map acceptance criteria to e2e test scenarios, select the reference You are a principal QE engineer planning e2e test coverage. Your job is to read the story context and produce a structured test plan: map acceptance criteria to test scenarios, select the reference suite pattern, define the -test file structure, and plan harness usage. 
+test file structure, and plan test infrastructure usage. ## Your Role @@ -21,7 +21,7 @@ review checkpoint before any test code is written. - **Every acceptance criterion must have at least one test scenario.** If an AC has no scenario, it's a coverage gap. - **Follow the reference suite's patterns.** The e2e infrastructure context from `/ingest` shows how tests are written in this project. Match those patterns exactly. -- **Be specific.** Name the test files, Describe/Context/It blocks, harness methods, and labels. A plan that says "test the feature" without specifying how is too vague. +- **Be specific.** Name the test files, test grouping blocks, test infrastructure methods, and labels. A plan that says "test the feature" without specifying how is too vague. - **Scenarios are the plan, not an afterthought.** The scenario breakdown is the primary output — not a task breakdown with scenarios appended. - **No scope expansion.** Don't add test scenarios beyond the story's acceptance criteria. - **No duplicate coverage.** Do not plan scenarios that re-test behavior already covered by the `[DEV]` story's unit and integration tests. E2e tests validate user-facing workflows through the full system. @@ -43,11 +43,11 @@ run first. Before writing the plan, create a mental map: - Which acceptance criteria describe user-facing behaviors that can be verified through e2e tests? - For each AC, what are the concrete scenarios? (happy path, error paths, edge cases) -- What harness methods will each scenario use? -- What auxiliary services does each scenario need? +- What test infrastructure methods will each scenario use? +- What auxiliary services does each scenario need (if any)? - What setup and teardown does each scenario require beyond the suite-level patterns? -- How should scenarios be organized into Describe/Context/It blocks (or the project's equivalent)? -- What labels should each scenario have (for CI filtering)? 
+- How should scenarios be organized into the project's test grouping blocks? +- What labels/tags should each scenario have (for CI filtering)? ### Step 3: Write the Test Plan @@ -69,8 +69,8 @@ Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: - **Path:** {path to the suite used as the pattern source} - **Why selected:** {what makes it similar to the planned tests} -- **Patterns adopted:** {specific patterns to follow: setup, teardown, - harness usage, assertions, labels} +- **Patterns adopted:** {specific patterns to follow: lifecycle hooks, + test infrastructure usage, assertions, labels} ## Test File Structure @@ -81,7 +81,7 @@ Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: | File | Purpose | |------|---------| -| `{suite file}` | Suite setup: auxiliary services, harness init, login, cleanup | +| `{suite file}` | Suite setup: auxiliary services (if any), test infrastructure init, login, cleanup | | `{test file}` | Test scenarios | | `{helper file, if needed}` | Suite-specific helpers (only if existing suites follow this pattern) | @@ -89,28 +89,26 @@ Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: {Map each acceptance criterion to concrete test scenarios. Each scenario is a specific test case that verifies observable behavior through the - project's test harness. + project's test infrastructure. - Note: This template uses Ginkgo terminology (Describe/Context/It, - Label(), By(), Eventually) as illustrative shorthand. Map these to the - project's actual test framework vocabulary as discovered during - `/ingest` — e.g., test suites/test cases for pytest, describe/it for - Playwright or Jest.} + Use the project's actual test framework vocabulary as discovered during + `/ingest`. 
The template below uses generic terms — replace them with + the project's terminology (e.g., Describe/Context/It for Ginkgo, + test classes/methods for pytest, describe/it for Playwright).} ### AC-1: {description} #### Scenario 1.1: {description — what the test verifies} -- **Block structure:** {Describe/Context/It nesting, or the project's equivalent} -- **Labels:** {e.g., Label("EDM-5678", "fleet-rollback")} -- **Setup:** {what the test needs beyond suite-level BeforeEach — e.g., - create a fleet, deploy an application} +- **Block structure:** {test grouping/nesting using the project's vocabulary} +- **Labels/tags:** {using the project's label convention} +- **Setup:** {what the test needs beyond suite-level per-test setup} - **Steps:** - 1. {action using harness method, e.g., harness.EnrollAndWaitForOnlineStatus()} - 2. {action, e.g., harness.TriggerRollback(deviceID)} - 3. {verification, e.g., Eventually(harness.GetDeviceVersion, TIMEOUT, POLLING).Should(Equal("v1"))} + 1. {action using test infrastructure method} + 2. {action} + 3. {verification using the project's assertion/polling style} - **Assertions:** {what to verify — use the project's assertion style} -- **Cleanup:** {what AfterEach handles vs. test-specific cleanup} +- **Cleanup:** {what teardown hooks handle vs. test-specific cleanup} #### Scenario 1.2: {error or edge case} ... @@ -120,7 +118,7 @@ Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: #### Scenario 2.1: {description} ... -## Harness Usage +## Test Infrastructure Usage ### Methods Needed @@ -128,17 +126,21 @@ Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: |--------|---------|-------------------| | `{method signature}` | {what it does} | {scenario references} | -{Verify each method exists in the harness. If a needed method does not - exist, note it under Open Questions — do not assume it can be created.} +{Verify each method exists in the project's test infrastructure. 
If a + needed method does not exist, note it under Open Questions — do not + assume it can be created.} ### Auxiliary Services Needed +{If the project manages test services:} + | Service | Why Needed | How Started | Used in Scenarios | |---------|-----------|-------------|-------------------| -| {name} | {reason} | {testcontainer / BeforeSuite / manual} | {references} | +| {name} | {reason} | {mechanism} | {references} | -{If no auxiliary services needed: "No auxiliary services beyond the - suite-level defaults."} +{If no auxiliary services needed or tests run against a pre-existing + environment: "No auxiliary services beyond the suite-level defaults." + or "Tests run against {environment}."} ## Task Breakdown @@ -153,10 +155,9 @@ Write `.artifacts/e2e/{jira-key}/02-plan.md` with this structure: ### Task 1: Create suite file - **Files:** `{suite file path}` -- **What:** Suite setup following the reference suite's pattern — BeforeSuite - (auxiliary services, providers), BeforeEach (login, environment setup, - test context), AfterEach (log collection, resource cleanup), AfterSuite - (auxiliary cleanup) +- **What:** Suite setup following the reference suite's pattern — lifecycle + hooks for initialization, per-test setup, per-test teardown, and suite + cleanup (use the project's actual hook names) - **Why:** Foundation for all test scenarios (AC-1 through AC-N) - **Commit message:** `{use commit format from 01-context.md}` - **Status:** Pending @@ -199,23 +200,23 @@ Before presenting the plan, verify: - [ ] Every acceptance criterion has at least one test scenario - [ ] Test scenarios validate user-facing behavior, not internal logic that `[DEV]` tests already cover -- [ ] Suite file follows the reference suite's setup/teardown pattern -- [ ] Harness methods referenced actually exist (verified during `/ingest`) +- [ ] Suite file follows the reference suite's lifecycle hook pattern +- [ ] Test infrastructure methods referenced actually exist (verified during 
`/ingest`) - [ ] Labels follow the project's convention - [ ] File paths are within the e2e test directory and follow naming conventions -- [ ] Describe/Context/It nesting matches the project's style -- [ ] Auxiliary services needed are available in the project's infrastructure +- [ ] Test grouping/nesting matches the project's style +- [ ] Auxiliary services needed (if any) are available in the project's infrastructure - [ ] Commit messages follow the project's format (from validation profile) - [ ] No scenarios require environment capabilities not present in the project - [ ] Task count is reasonable — if you have more than 8 tasks, consider whether the story needs re-scoping -- [ ] The plan is achievable — no scenarios depend on unmerged features or unavailable harness methods +- [ ] The plan is achievable — no scenarios depend on unmerged features or unavailable test infrastructure methods ### Step 5: Present to User Show the user the complete plan and highlight: - Test approach and reference suite selection - Scenario breakdown and AC coverage -- Harness methods and auxiliary services needed +- Test infrastructure methods and auxiliary services needed - Any risks or open questions - Anything where you made a judgment call vs. 
following explicit guidance diff --git a/e2e/skills/publish.md b/e2e/skills/publish.md index c7ad368..8a8aa19 100644 --- a/e2e/skills/publish.md +++ b/e2e/skills/publish.md @@ -115,7 +115,7 @@ In either case, save the result to - **Suite location:** {path to the new test suite directory} - **Reference suite:** {which existing suite was used as the pattern} - **Auxiliary services:** {services required, or "None beyond suite defaults"} -- **Harness methods used:** {key harness methods} +- **Test infrastructure used:** {key methods from the project's test abstractions} ### Acceptance Criteria {Checklist of acceptance criteria from the story, each prefixed with a diff --git a/e2e/skills/revise.md b/e2e/skills/revise.md index 8053fd2..b6d4eaf 100644 --- a/e2e/skills/revise.md +++ b/e2e/skills/revise.md @@ -18,7 +18,7 @@ multiple rounds of revision. This phase only modifies the plan, not code. - **Change only what's requested.** Do not "improve" parts of the plan the user didn't mention. - **Evaluate before applying.** Assess whether the requested change would create coverage gaps, introduce anti-patterns, or reduce scenario quality. If it would, say so before making the change — explain the concern, recommend an alternative if you have one, and let the user decide. -- **Maintain consistency.** If a scenario change affects AC coverage or harness usage, update those sections too. +- **Maintain consistency.** If a scenario change affects AC coverage or test infrastructure usage, update those sections too. - **Preserve traceability.** Every acceptance criterion must still have at least one test scenario after revision. - **Show your changes.** After revising, summarize what changed so the user can verify. - **No scope reduction.** Do not silently simplify, even when revising. 
@@ -46,12 +46,12 @@ The user's feedback may target: **Test approach changes:** - Different reference suite ("Use the fleet_update suite as the pattern, not the agent suite") -- Different harness methods ("Use harness.EnrollWithOptions instead of harness.Enroll") -- Different assertions ("Use Eventually with a longer timeout for the rollback verification") +- Different test infrastructure methods ("Use a different setup helper for enrollment") +- Different assertions ("Use a longer timeout for the rollback verification") **Structure changes:** - Different file organization ("Put the error cases in a separate test file") -- Different Describe/Context/It nesting ("Group the rollback scenarios under a Context block") +- Different test grouping ("Group the rollback scenarios under a separate context block") - Label changes ("Add the 'slow' label to the VM-based scenarios") **Task changes:** @@ -67,8 +67,8 @@ before applying it. For example: - Removing scenarios that are the only coverage for an acceptance criterion - Dropping cleanup or teardown that prevents test isolation - Changing an approach that would introduce anti-patterns (hardcoded - sleeps, brittle selectors, harness bypass) -- Using harness methods that don't exist in the project + sleeps, brittle selectors, test infrastructure bypass) +- Using test infrastructure methods that don't exist in the project Present the concern with specific reasoning, recommend an alternative if you have one, and apply the change only after the user has considered @@ -87,8 +87,8 @@ Edit the plan: After applying changes, verify: - Does every acceptance criterion still have at least one test scenario? - Does the task ordering still respect dependencies (suite file first)? -- Do the test scenarios still match the Describe/Context/It structure? -- Do harness methods referenced actually exist in the project? +- Do the test scenarios still match the project's test grouping structure? 
+- Do test infrastructure methods referenced actually exist in the project? - Are labels consistent with the project's conventions? - Does the AC coverage matrix reflect the current scenario mapping? - Are commit messages still properly formatted? @@ -111,11 +111,11 @@ Summarize what changed: ### Consistency Updates - AC coverage matrix updated to reflect new scenario mapping -- Harness usage table updated with new method references +- Test infrastructure table updated with new method references - Task count increased from 4 to 5 ### Items to Note -- The new error scenario requires harness.SimulateOffline() — verified it exists +- The new error scenario requires SimulateOffline() — verified it exists in the test infrastructure ``` ## Output diff --git a/e2e/skills/validate.md b/e2e/skills/validate.md index 22baae6..7f08b93 100644 --- a/e2e/skills/validate.md +++ b/e2e/skills/validate.md @@ -21,7 +21,7 @@ re-run checks, and repeat until everything passes. - **Run the project's actual commands.** Use the validation profile from `01-context.md`, not hardcoded commands. - **Fix issues, don't skip them.** If linting fails, fix the code. If tests fail, diagnose and fix. If the user asks to skip a failing check, evaluate the risk: explain what the failing check is testing, what would go unverified if skipped, and whether skipping could mask a real problem. Present this assessment so the user can make an informed decision. -- **Anti-patterns are defects.** Hardcoded sleeps, brittle selectors, missing cleanup, and harness bypass are test quality issues that will cause flaky tests in CI. Fix them. +- **Anti-patterns are defects.** Hardcoded sleeps, brittle selectors, missing cleanup, and test infrastructure bypass are test quality issues that will cause flaky tests in CI. Fix them. - **Commit fixes separately.** Validation fixes get their own commits following the project's commit format. 
- **Do not modify code outside the story's scope** to fix pre-existing lint or test issues. Note them in the validation report. @@ -129,16 +129,16 @@ and re-run the affected tests. | Anti-Pattern | How to Detect | Fix | |---|---|---| -| **Hardcoded sleeps** | `time.Sleep()`, `sleep()`, `setTimeout()` with fixed delay used to wait for system state | Replace with polling/retry: `Eventually(func, timeout, polling)` or the project's equivalent | -| **Brittle selectors** | Hardcoded element IDs, CSS classes, XPath (UI tests) | Use semantic locators: roles, labels, text content, harness methods | -| **Order-dependent tests** | Tests that reference state created by a prior test in the same file (not in BeforeEach). Detection heuristic: check whether any test case references a variable that is assigned in a prior test case rather than in setup/BeforeEach (or the framework's equivalent setup mechanism) | Make each test independent — create needed state in the test or BeforeEach | -| **Shared mutable state** | Package-level variables mutated by tests, global state without per-test reset. 
Detection heuristic: look for variables declared outside test functions that are assigned inside test cases without per-test reinitialization in BeforeEach (or the framework's equivalent setup mechanism) | Use per-test state via harness, test context, or local variables | -| **Missing cleanup** | Resources created in tests without corresponding cleanup in AfterEach or defer | Add cleanup matching the reference suite's pattern | -| **Harness bypass** | Direct HTTP calls, CLI exec, or API client instantiation instead of harness methods | Replace with harness method calls | -| **Missing labels** | Test blocks (It/Describe) without CI-filtering labels | Add labels following the project's convention | +| **Hardcoded sleeps** | Fixed delay calls (e.g., `time.Sleep()` in Go, `asyncio.sleep()` in Python, `page.waitForTimeout()` in Playwright) used to wait for system state | Replace with the project's async polling/retry mechanism (e.g., Eventually in Ginkgo, polling helpers in pytest, expect with toPass in Playwright) | +| **Brittle selectors** | Hardcoded element IDs, CSS classes, XPath (UI tests) | Use semantic locators: roles, labels, text content, test infrastructure methods | +| **Order-dependent tests** | Tests that reference state created by a prior test in the same file (not in per-test setup). Detection heuristic: check whether any test case references a variable that is assigned in a prior test case rather than in the per-test setup hook | Make each test independent — create needed state in the test or per-test setup hook | +| **Shared mutable state** | Package-level variables mutated by tests, global state without per-test reset. 
Detection heuristic: look for variables declared outside test functions that are assigned inside test cases without per-test reinitialization in the per-test setup hook | Use per-test state via test infrastructure, test context, or local variables | +| **Missing cleanup** | Resources created in tests without corresponding cleanup in teardown hooks or defer | Add cleanup matching the reference suite's pattern | +| **Test infrastructure bypass** | Direct HTTP calls, CLI exec, or API client instantiation instead of using the project's test abstractions | Replace with the project's test infrastructure methods | +| **Missing labels** | Test blocks without CI-filtering labels/tags | Add labels/tags following the project's convention | | **Hardcoded values** | Inline timeout durations, polling intervals, resource names instead of constants | Use the project's test utility constants | -| **Missing async polling** | Direct assertions on results of async operations (no Eventually/retry) | Wrap in Eventually with appropriate timeout and polling interval | -| **Missing failure diagnostics** | No log collection or diagnostic output when tests fail | Add diagnostic output in AfterEach (matching reference suite pattern) | +| **Missing async polling** | Direct assertions on results of async operations (no polling/retry) | Wrap in the project's async polling mechanism with appropriate timeout and polling interval | +| **Missing failure diagnostics** | No log collection or diagnostic output when tests fail | Add diagnostic output in teardown hooks (matching reference suite pattern) | If no anti-patterns are found, record "No anti-patterns detected" in the validation report. @@ -151,8 +151,8 @@ Verify that the new test suite doesn't interfere with existing tests: as a fast check 2. If no fast subset exists, run the e2e tests for adjacent suites (suites in the same feature area) to check for interference -3. 
Check for test isolation issues: verify that AfterSuite/teardown - cleans up all resources created by BeforeSuite/setup. Compare +3. Check for test isolation issues: verify that the suite-level teardown + cleans up all resources created by suite-level setup. Compare against the reference suite's cleanup pattern. Ensure the new suite does not leave state (running services, created resources, modified configuration) that could affect other suites. @@ -175,7 +175,7 @@ git diff {base}..HEAD **Test quality:** - Do tests actually verify what the AC describes? (not asserting something tangentially related) - Are assertions specific enough? (not just "no error" — verify the actual outcome) -- Are `By()` step descriptions clear and meaningful? +- Are step annotations (if the project uses them) clear and meaningful? - Will these tests break only when real behavior changes, not when unrelated details change? **Maintainability:** @@ -202,8 +202,8 @@ This is the workflow's primary contract. 1. Read the **Acceptance Criteria** from `01-context.md` 2. Read the **Acceptance Criteria Coverage** matrix from `02-plan.md` 3. For each acceptance criterion: - - **Trace to test scenario:** Is there a test (Describe/It block) - that exercises this criterion's behavior? Follow the task mapping — + - **Trace to test scenario:** Is there a test that exercises this + criterion's behavior? Follow the task mapping — check that the task is marked Done and that the corresponding test code exists. - **Verify the test runs:** Does the test pass when executed? 
@@ -249,7 +249,7 @@ Write `.artifacts/e2e/{jira-key}/05-validation-report.md`: | Order-dependent tests | {yes/no} | {yes/n/a} | {details} | | Shared mutable state | {yes/no} | {yes/n/a} | {details} | | Missing cleanup | {yes/no} | {yes/n/a} | {details} | -| Harness bypass | {yes/no} | {yes/n/a} | {details} | +| Test infrastructure bypass | {yes/no} | {yes/n/a} | {details} | | Missing labels | {yes/no} | {yes/n/a} | {details} | | Hardcoded values | {yes/no} | {yes/n/a} | {details} | | Missing async polling | {yes/no} | {yes/n/a} | {details} | From e9542bbcf45b63fca8f959a09511cd0e32ceaacc Mon Sep 17 00:00:00 2001 From: Andy Dalton Date: Thu, 23 Apr 2026 13:25:04 -0400 Subject: [PATCH 4/5] Add parallelism discovery, lifecycle skeletons, and anti-pattern examples to e2e workflow Enhance /ingest to discover the project's parallelism model (mechanism, isolation strategy, lifecycle interaction) and extract a sanitized lifecycle skeleton from the reference suite as a copy-paste starting point for /code. Expand the test infrastructure method table to include full signatures (parameters and return types). Add concrete framework examples to anti-pattern detection rules in /validate (e.g., bare Expect without Eventually in Ginkgo, bare assert without polling in pytest). Update AGENTS.md summary to match. 
Co-Authored-By: Claude Opus 4.6 --- AGENTS.md | 8 ++--- e2e/skills/code.md | 5 +-- e2e/skills/ingest.md | 72 ++++++++++++++++++++++++++++++++++++++---- e2e/skills/validate.md | 4 +-- 4 files changed, 75 insertions(+), 14 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index d24f215..b6512c3 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -200,10 +200,10 @@ For detailed workflow development guidelines (structure, file conventions, testi - Requires a Jira [QE] Story (typically created by the design workflow's `/sync` phase) as input - Jira is read-only — no phase in this workflow writes to Jira -- Discovery-based infrastructure: e2e test framework, harness, auxiliary services, execution commands, and conventions are discovered during `/ingest` — not hardcoded -- Reference suite pattern: before writing tests, identifies the most similar existing e2e test suite and extracts its patterns (imports, setup/teardown, harness usage, assertions, labels) -- Scenario-driven planning: each acceptance criterion maps to concrete test scenarios with Describe/Context/It nesting, steps, assertions, and labels -- Anti-pattern detection during `/validate`: checks for hardcoded sleeps, brittle selectors, order-dependent tests, shared mutable state, missing cleanup, harness bypass, missing labels, hardcoded values, missing async polling, missing failure diagnostics +- Discovery-based infrastructure: e2e test framework, test infrastructure abstractions (harness, fixtures, page objects, helpers — whatever the project uses), auxiliary services (if any), execution commands, and conventions are discovered during `/ingest` — not hardcoded +- Reference suite pattern: before writing tests, identifies the most similar existing e2e test suite and extracts its patterns (imports, lifecycle hooks, test infrastructure usage, assertions, labels) +- Scenario-driven planning: each acceptance criterion maps to concrete test scenarios with specific test grouping, steps, assertions, and labels +- Anti-pattern 
detection during `/validate`: checks for hardcoded sleeps, brittle selectors, order-dependent tests, shared mutable state, missing cleanup, test infrastructure bypass, missing labels, hardcoded values, missing async polling, missing failure diagnostics - Feature defects are not test bugs — if tests reveal a defect in the [DEV] implementation, the test is adjusted (xfail/skip) and the defect is noted in the implementation report - Plan evolves during implementation — `02-plan.md` is updated as tasks complete, enabling resumption after interruptions - Code changes happen in the source repo on a feature branch; `/publish` creates a PR in the source repo diff --git a/e2e/skills/code.md b/e2e/skills/code.md index d172f31..d299cc6 100644 --- a/e2e/skills/code.md +++ b/e2e/skills/code.md @@ -184,8 +184,9 @@ exactly: document. For the suite file (typically Task 1): -- Follow the reference suite's lifecycle hook structure exactly (use the - actual hook names discovered during `/ingest`) +- Start from the **lifecycle skeleton** in the Reference Suite section of + `01-context.md` — it provides a sanitized copy of the reference suite's + hook structure as a starting point - Start only the auxiliary services the plan identified as needed (if any) - Use the same test infrastructure initialization pattern - Use the same login/auth pattern diff --git a/e2e/skills/ingest.md b/e2e/skills/ingest.md index c1f8a34..7acf5b5 100644 --- a/e2e/skills/ingest.md +++ b/e2e/skills/ingest.md @@ -233,8 +233,9 @@ For whatever the project uses: 2. **Initialization:** How do tests obtain access? (global variable, dependency injection, fixture parameter, constructor, import) 3. **Key methods:** Catalog the public methods relevant to the story's - scope. Focus on methods the test scenarios will need — don't catalog - the entire API. + scope — include their full signatures (parameters and return types), + not just names. Focus on methods the test scenarios will need — don't + catalog the entire API. 
4. **Domain-specific files:** Some projects split test infrastructure by domain. Identify which files are relevant to the story. @@ -259,6 +260,20 @@ include: Record the actual hook names the project uses — downstream phases will use these names, not generic placeholders. +Also discover the **parallelism model** — if tests can run in parallel: + +1. **Mechanism:** How is parallelism achieved? (framework-native workers, + test sharding, process-level parallelism) +2. **Isolation strategy:** How do parallel workers avoid interfering? + (per-worker resources, shared resource pool allocation, unique naming + with worker IDs, separate databases, snapshot revert) +3. **Lifecycle interaction:** How do lifecycle hooks relate to + parallelism? In particular: does suite-level setup run once + globally or once per worker? This affects how the suite file + must be structured. + +If tests run sequentially or parallelism is not documented, note that. + #### 6g: Auxiliary Services If the project manages external services for e2e tests, discover them. @@ -310,6 +325,23 @@ similar to what needs to be written: methods or tests similar interaction patterns 3. Read the selected suite thoroughly: suite file + 1-2 test files 4. Extract concrete patterns: imports, setup, assertions, labels, helpers +5. **Extract a lifecycle skeleton:** Create a sanitized copy of the + reference suite's lifecycle structure. This skeleton goes into the + context document and gives `/code` a copy-paste starting point for + the new suite file. + + **Keep:** hook declarations, framework registration calls, + infrastructure initialization calls (e.g., harness setup, fixture + creation), cleanup/teardown calls, worker or parallelism setup, + and the structural nesting of test blocks. + + **Strip:** specific assertions, business-logic conditionals, + hardcoded resource names and test data, and inline comments that + reference story-specific details. 
+ + Replace stripped content with brief comments describing what + happens at that point (e.g., `// create test resources`, + `// verify expected state`). These suites become the "pattern source" for the `/code` phase. @@ -403,7 +435,7 @@ If this is a first invocation, write - **Run specific suite:** `{command with scoping}` - **Filter by label/tag:** `{mechanism}` - **Filter by name/description:** `{mechanism}` -- **Parallel execution:** `{mechanism, if supported}` +- **Parallel execution:** `{command flag or mechanism, if supported}` - **Environment assumptions:** {what must be running before tests execute} ### Test Infrastructure @@ -416,9 +448,9 @@ If this is a first invocation, write - **Initialization:** {how tests obtain access} - **Key methods for this story:** -| Method | Purpose | Source File | -|--------|---------|-------------| -| `{method}` | {what it does} | {file} | +| Method | Parameters / Return | Purpose | Source File | +|--------|---------------------|---------|-------------| +| `{method}` | `{params and return types}` | {what it does} | {file} | {If no dedicated test infrastructure: "Tests interact with the system directly — no harness, fixtures, or page objects."} @@ -432,6 +464,21 @@ If this is a first invocation, write - **Per-test teardown** ({discovered hook name}): {what happens} - **Suite-level teardown** ({discovered hook name}): {what happens} +### Parallelism Model + +{If the project supports parallel test execution:} + +- **Supported:** {yes/no} +- **Mechanism:** {e.g., framework-native workers, test sharding, process-level} +- **Isolation strategy:** {how parallel workers avoid interfering — e.g., + per-worker resources, shared resource pool allocation, unique naming, + separate databases} +- **Lifecycle interaction:** {how lifecycle hooks relate to parallelism — + e.g., "suite setup runs once per worker, not once globally"} + +{If parallelism is not supported or tests run sequentially: "Tests run + sequentially. 
No parallel execution model."} + ### Auxiliary Services {If the project manages external services for tests:} @@ -471,6 +518,19 @@ If this is a first invocation, write - **Cleanup:** {teardown pattern} - **Key code pattern:** {any distinctive pattern worth replicating} +**Lifecycle skeleton:** + +{Sanitized skeleton of the reference suite's lifecycle — hook ordering, + infrastructure initialization, parallelism integration, cleanup. Include + the suite entry point, all lifecycle hooks, and a representative test + block. Apply the keep/strip rules from Step 6j.} + +```{language — use the project's language, e.g., go, python, typescript} +{skeleton code here — actual hook names, actual method calls, + with domain-specific logic replaced per the keep/strip rules + in Step 6j} +``` + ## Repository Topology - **Origin:** {owner}/{repo} diff --git a/e2e/skills/validate.md b/e2e/skills/validate.md index 7f08b93..531a2ba 100644 --- a/e2e/skills/validate.md +++ b/e2e/skills/validate.md @@ -137,8 +137,8 @@ and re-run the affected tests. 
| **Test infrastructure bypass** | Direct HTTP calls, CLI exec, or API client instantiation instead of using the project's test abstractions | Replace with the project's test infrastructure methods | | **Missing labels** | Test blocks without CI-filtering labels/tags | Add labels/tags following the project's convention | | **Hardcoded values** | Inline timeout durations, polling intervals, resource names instead of constants | Use the project's test utility constants | -| **Missing async polling** | Direct assertions on results of async operations (no polling/retry) | Wrap in the project's async polling mechanism with appropriate timeout and polling interval | -| **Missing failure diagnostics** | No log collection or diagnostic output when tests fail | Add diagnostic output in teardown hooks (matching reference suite pattern) | +| **Missing async polling** | Direct assertions on results of async operations without polling/retry (e.g., bare `Expect` without `Eventually` in Ginkgo, bare `assert` without polling in pytest, bare `expect` without `toPass` in Playwright) | Wrap in the project's async polling mechanism with appropriate timeout and polling interval | +| **Missing failure diagnostics** | No log collection or diagnostic output when tests fail (e.g., no diagnostic log capture in Go teardown, no screenshot capture in Playwright `afterEach`, no log dump in pytest teardown) | Add diagnostic output in teardown hooks (matching reference suite pattern) | If no anti-patterns are found, record "No anti-patterns detected" in the validation report. 
From cbed24a21db53575aaaf8dcd5d20bb8d1829cc53 Mon Sep 17 00:00:00 2001 From: Andy Dalton Date: Thu, 23 Apr 2026 13:40:37 -0400 Subject: [PATCH 5/5] Address PR review feedback: e2e scope clarity and clean-tree checks MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reinforce in /ingest that this workflow is exclusively for e2e tests — unit and integration tests are handled by the implement workflow. Add git status --porcelain checks before rebase/merge operations in /code Step 3g and /validate Step 2, matching the safety pattern already used in the implement workflow. Co-Authored-By: Claude Opus 4.6 --- e2e/skills/code.md | 13 +++++++++++-- e2e/skills/ingest.md | 3 +++ e2e/skills/validate.md | 14 ++++++++++++-- 3 files changed, 26 insertions(+), 4 deletions(-) diff --git a/e2e/skills/code.md b/e2e/skills/code.md index d299cc6..791f8ae 100644 --- a/e2e/skills/code.md +++ b/e2e/skills/code.md @@ -326,8 +326,17 @@ git rev-list --count HEAD..origin/{base} If the count is 0, skip the rebase/merge and proceed to Step 3h. -If new commits exist, check whether a PR has already been created by -looking for `.artifacts/e2e/{jira-key}/publish-metadata.json`. +If new commits exist, verify the working tree is clean before syncing: + +```bash +git status --porcelain +``` + +If output is non-empty, stop and ask the user how to proceed (commit, +stash, or abort) before continuing. + +Check whether a PR has already been created by looking for +`.artifacts/e2e/{jira-key}/publish-metadata.json`. **If no PR exists yet**, rebase: diff --git a/e2e/skills/ingest.md b/e2e/skills/ingest.md index 7acf5b5..264b191 100644 --- a/e2e/skills/ingest.md +++ b/e2e/skills/ingest.md @@ -10,6 +10,9 @@ that the features under test have been implemented and merged, explore the project's e2e testing infrastructure in depth, and produce a structured context document that will inform the test planning phase. +This workflow is exclusively for e2e tests. 
Unit and integration tests are +handled by the implement workflow — do not consider them here. + ## Your Role Build a complete picture of what needs to be tested, what e2e test diff --git a/e2e/skills/validate.md b/e2e/skills/validate.md index 531a2ba..657259c 100644 --- a/e2e/skills/validate.md +++ b/e2e/skills/validate.md @@ -67,8 +67,18 @@ report under Branch Currency as "Unable to verify — fetch failed." git rev-list --count HEAD..origin/{base} ``` -If the branch is behind base, check whether a PR has already been -created by looking for `.artifacts/e2e/{jira-key}/publish-metadata.json`. +If the branch is behind base, verify the working tree is clean before +syncing: + +```bash +git status --porcelain +``` + +If output is non-empty, stop and ask the user how to proceed (commit, +stash, or abort) before continuing. + +Check whether a PR has already been created by looking for +`.artifacts/e2e/{jira-key}/publish-metadata.json`. **If no PR exists yet**, offer to rebase: