feat(e2e): agentic verification loop with MCP Playwright browser layer #128
Draft: informatico-madrid wants to merge 399 commits into tzachbon:main from informatico-madrid:main
2a43b38
refactor(skills): rename selector-map, add ui-map-init, update gitign…
informatico-madrid c81feae
refactor(skills): remove generic selector-map.skill.md (renamed to ho…
informatico-madrid 6e56bc9
docs: update .gitignore and FORK_GOALS for renamed skills
informatico-madrid 3c62d34
feat(task-planner): inject ui-map-init before first Playwright E2E task
informatico-madrid 0ba45ab
docs: add Phase 0.2 — task-planner auto-injects ui-map-init for Playw…
informatico-madrid 7f74e21
feat(phase1+2): verification contract in requirements template + prod…
informatico-madrid f5ecd5b
feat(phase3+4): repair loop + regression sweep in stop-watcher
informatico-madrid 3154a7c
docs: mark Phase 3+4 complete in FORK_GOALS
informatico-madrid 8de254e
feat(specum): Phase 5 - MCP Playwright skill with dependency check an…
informatico-madrid 2dbf977
docs(fork-goals): mark Phase 5 complete, update status and contributi…
informatico-madrid 36bbdf6
docs(skills): update legacy ui-map-init to point to Phase 5 supersedi…
informatico-madrid 15fc560
docs: add README.fork.md documenting fork deltas and PR contribution …
informatico-madrid d994519
feat(e2e): add playwright-env skill + example, update session and mcp…
informatico-madrid db78148
fix(e2e): red-1 gitignore + red-2 connectivity check in playwright-env
informatico-madrid 49968ae
fix(e2e): orange-3 token patterns, orange-4 ui-map-init Step-1, yello…
informatico-madrid eff0888
fix(e2e-skills): fix critical bugs + playwright MCP cache/lock issues
informatico-madrid 6869fb3
fix(ui-map-init): add map invalidation logic + protected-route auth h…
informatico-madrid 59d07bc
fix(e2e-skills): 4 minor fixes — portable date, lock recovery conditi…
informatico-madrid 168ab54
fix(integration): 3 integration fixes — project-type detection, VE ta…
informatico-madrid 4effae6
feat(architect-reviewer): mandatory test strategy with mock rules v0.3.0
informatico-madrid f82f845
feat(spec-executor): test writing guardrails — read strategy, no skip…
informatico-madrid dc0ef4d
fix(templates): update design.md Test Strategy to match architect-rev…
informatico-madrid 22c3df5
fix(spec-workflow): clarify quick mode reviewer, add --fresh to decis…
informatico-madrid bd1811c
fix(spec-executor): document state cleanup on SPEC_COMPLETE v0.4.1
informatico-madrid dbba56e
fix(spec-executor): correct e2e skill loading order for VE tasks
informatico-madrid f2121f0
fix(mcp-playwright): clarify --isolated and --caps are MCP server con…
informatico-madrid f07dd61
fix(playwright-session): remove agent-managed server lifecycle (pkill…
informatico-madrid fb6f68b
fix(e2e skills): 5 new issues — session start auth order, fresh-conte…
informatico-madrid 1f53731
fix(spec-executor): handle VERIFICATION_DEGRADED from VE0 (v0.4.4)
informatico-madrid 9263c6c
fix(playwright-session): v8 — remove contradictory Context Isolation …
informatico-madrid 461f48e
refactor(skills): move e2e skills to plugin, remove legacy root skill…
informatico-madrid 3b43556
refactor(skills): remove legacy skills/e2e/ root folder (moved to plu…
informatico-madrid de27428
refactor(skills): remove legacy skills/e2e/ root folder (moved to plu…
informatico-madrid 44f8c9e
refactor(skills): remove legacy skills/e2e/ root folder (moved to plu…
informatico-madrid 92ea195
feat(e2e): incremental ui-map updates in qa-engineer and spec-executor
informatico-madrid a3698ad
feat(agents): ui-map incremental update — qa-engineer and spec-executor
informatico-madrid 7c83550
docs(references): document ui-map lifecycle in phase-rules and qualit…
informatico-madrid 9b3b286
docs(readme): document agentic verification loop and MCP Playwright e…
informatico-madrid e42df1e
Update README.md
informatico-madrid 934bb2d
Delete FORK_GOALS.md
informatico-madrid 3cd0f60
fix(specum): resolve points 4, 6, and 9 in single commit
informatico-madrid 91b4c04
fix(templates): add Project type field to Verification Contract
informatico-madrid 3fd9388
Update plugins/ralph-specum/hooks/scripts/stop-watcher.sh
informatico-madrid 025f8bb
feat(e2e): add generic selector-map skill, decouple from homeassistant
informatico-madrid 9717ff7
refactor(e2e): move homeassistant-selector-map to examples/, update e…
informatico-madrid a889dbe
refactor(e2e): remove homeassistant-selector-map from skills root (mo…
informatico-madrid efaca69
fix: apply verified problem statement fixes across plugin files
Copilot 9fc6768
Merge pull request #1 from informatico-madrid/copilot/fix-verificatio…
informatico-madrid d63c1a6
fix(product-manager): correct Project type values in Verification Con…
informatico-madrid f596ba0
Update plugins/ralph-specum/hooks/scripts/stop-watcher.sh
informatico-madrid ab384fa
fix: correct Project type field description to reflect actual routing…
informatico-madrid fdf4979
Merge pull request #2 from informatico-madrid/fix/project-type-values…
informatico-madrid 7e81f15
fix(mcp-playwright): remove auto-install fallback in Step 0a
informatico-madrid 9da53ce
fix(stop-watcher): add missing fi closing Phase 3 outer if block
informatico-madrid 380f9c3
fix(playwright-env): update command execution method to use bash -lc …
informatico-madrid 80bc834
fix(phase4): guard regression sweep against infinite re-trigger loop
informatico-madrid eca66c8
fix(e2e): replace broken selector-map.skill.md refs with homeassistan…
informatico-madrid fcc2c09
revert(e2e): restore selector-map.skill.md references
informatico-madrid fdfd7b8
docs: document MCP server flag security context (--isolated, --caps)
informatico-madrid 74fb562
fix(readme): correct code block syntax for task planner and verificat…
informatico-madrid 260462a
Merge branch 'main' of github.com:informatico-madrid/smart-ralph
informatico-madrid 1470262
docs: fix VERIFICATION_DEGRADED behavior — degrade+continue, not esca…
informatico-madrid 7174848
fix(playwright-env): use resolved appUrl and seedCommand, not raw env…
informatico-madrid 7dfb1af
fix(ui-map-init): correct code block syntax for ESCALATE reasons in s…
informatico-madrid a5304f4
fix(ui-map-init): make screenshot filename contract canonical (ve0-pu…
informatico-madrid 7b45463
fix(implement): add repairIteration, failedStory, and originTaskIndex…
informatico-madrid 6e96e9c
fix(selector-map): correct code block syntax for selector hierarchy
informatico-madrid 2f41b29
fix(selector-map): update getByTestId example for clarity
informatico-madrid 5863f83
fix(stop-watcher): handle VERIFICATION_DEGRADED as blocking escalatio…
informatico-madrid a3a9c43
fix(homeassistant-selector-map): fix data-testid format and er…
informatico-madrid 8c3f0f4
Merge branch 'main' of github.com:informatico-madrid/smart-ralph
informatico-madrid b91b230
fix(playwright-session): improve session-close handling and writ…
informatico-madrid a2c0e19
fix(ui-map-init): improve entry-point filtering to only inc…
informatico-madrid 6df6dc7
fix(readme): correct ui-map.local.md canonical location to spec direc…
informatico-madrid 101e952
chore(plugin): bump version to 4.9.3 to trigger reinstall from fork
informatico-madrid 84789f7
revert(plugin): restore version to 4.9.2 — version bump should not be…
informatico-madrid ed52359
research: init e2e HA findings scratchpad
informatico-madrid 836be08
research: add agent live session findings (2026-04-03)
informatico-madrid 69f9b08
research: agent confirms 404/auth/sidebar hypothesis + new findings (…
informatico-madrid 4717804
docs: add Bloque 10 - new finding: baseURL IIFE evaluated before…
informatico-madrid 7a88db5
fix(qa-engineer): define VERIFICATION_DEGRADED as valid output signal
informatico-madrid 62592c1
research: new step-by-step test plan starting from /start (project lim…
informatico-madrid 7448225
pizarra: update tracking table with observations from session 202…
informatico-madrid 3b8d3fc
pizarra: update Q3-Q4 goal-interview + critical finding contamina…
informatico-madrid 51fc259
pizarra: critical finding P10 - copilot-instructions contradicts the ex…
informatico-madrid 1dd1d25
pizarra: interview completed + Phase 1 research started - finding…
informatico-madrid b72121a
pizarra: Phase 1 Explore results - critical findings about hass-tast…
informatico-madrid 2f72a94
pizarra: add findings P11-P14, research-analyst completed, estad…
informatico-madrid 920f12f
research: pizarra updated - design approved, tasks in progress
informatico-madrid 8dc03b4
research: P18 skills ausentes en tasks, tasks completado, P15 resuelta
informatico-madrid b3789ac
research: add P19-P20 (engram as implicit skills, cross-project vault…
informatico-madrid 2b3e1ca
research: P21 spec-executor unknown skill + fix tasks review findings
informatico-madrid 1bcfe73
research: add Bloque 23-25 — fix task execution analysis, TypeScript …
informatico-madrid cf70184
research: add Bloque 26 - VE1 results: HA starts OK, auth selector f…
informatico-madrid ff16496
docs(pizarra): Bloque 27 - critical clarification of verification types (…
informatico-madrid 1221fbc
docs(pizarra): Bloque 28 - P27: Fix Q poorly documented + qa-engineer i…
informatico-madrid dd582df
pizarra: P28/P29 - auth_callback URL dead, goto() sidebar nav bug c…
informatico-madrid 1019212
Update e2e-ha-findings.md
informatico-madrid cf13c12
feat(e2e): add delegation contracts, source-of-truth protocol, naviga…
Copilot 533252e
refactor(e2e): extract canonical e2e-anti-patterns.md reference, dedu…
Copilot 02d5749
feat(e2e): GAP-5 ESM/CJS detection + GAP-7 EXECUTOR_START signal and …
Copilot 1403257
fix: use version placeholder in EXECUTOR_START example to prevent drift
Copilot f04c1fe
fix(phase-rules): add warning about verification of the con…
informatico-madrid 3e3074e
feat(context-auditor): add mandatory system prompt audit skill (GAP 1)
Copilot 4fc658e
Merge branch 'main' of github.com:informatico-madrid/smart-ralph
informatico-madrid f6e954b
Merge branch 'main' into research/e2e-ha-findings
informatico-madrid db84e47
Update plugins/ralph-specum/.claude-plugin/plugin.json
informatico-madrid bce971a
Update .claude-plugin/marketplace.json
informatico-madrid a28378b
fix(task-planner): align ui-map-init Skills field with full E2E skill…
Copilot 8331c6f
fix(task-planner): consolidate Skills field — replace single playwrig…
informatico-madrid 36971fc
Merge branch 'copilot/fix-e2e-test-performance' into research/e2e-ha-…
informatico-madrid 7519456
Merge branch 'research/e2e-ha-findings' of github.com:informatico-mad…
informatico-madrid e2f04fd
Update plugins/ralph-specum/.claude-plugin/plugin.json
informatico-madrid dbd27dd
Update .claude-plugin/marketplace.json
informatico-madrid f74ec2a
Update research/e2e-ha-findings.md
informatico-madrid 3abce74
fix(review): address all reviewer findings
informatico-madrid 3ecc4f0
fix: MD040 fences in coordinator-pattern, nearest package.json detect…
Copilot f5d47a3
Merge pull request #6 from informatico-madrid/copilot/fix-coordinator…
informatico-madrid f9ce257
refactor(architect-reviewer): scope Test Strategy to design concerns …
informatico-madrid a04d194
fix: revert ralph-specum version to 4.9.1, update README links, and e…
informatico-madrid bdabe4b
Merge branch 'main' of github.com:informatico-madrid/smart-ralph
informatico-madrid ee3d22f
refactor(architect-reviewer): replace generic Mock Boundary table wit…
informatico-madrid 7a3b879
refactor(architect-reviewer): rewrite Test Strategy section with prec…
informatico-madrid 88559d9
feat(spec-reviewer): add Test Strategy dimension to Design Rubric
informatico-madrid c3a2457
feat(spec-executor): clarify Mock Boundary column selection and Fixtu…
informatico-madrid e987673
feat(task-planner): derive Phase 3 test tasks from Test Coverage Tabl…
informatico-madrid f7f88f0
feat(architect-reviewer): add consistency note between Test Double Po…
informatico-madrid 076f4e7
fix(plugin.json): revert version to 4.9.1 for consistency
informatico-madrid 66e88da
feat(docs): Add comprehensive testing system analysis document
informatico-madrid dca58c0
fix(architect): testing discovery checklist + cross-table consistency…
informatico-madrid 50cd4b5
fix(template): sync Mock Boundary to component-based structure matchi…
informatico-madrid 304601f
docs: Add comprehensive testing system analysis document for Ralph Sp…
informatico-madrid 1ad4b0d
fix(stop-watcher): remove orphaned fi on line 413 — bash syntax error
informatico-madrid 733fdec
Merge branch 'main' of github.com:informatico-madrid/smart-ralph
informatico-madrid fce951b
fix(task-planner): enforce checkbox format - [ ] for all tasks, never…
informatico-madrid 6289905
fix(research-analyst): Makefile body extraction + decouple UI presenc…
informatico-madrid 8cdec43
fix(spec-executor): add Stuck State Protocol for repeated test failures
informatico-madrid 191dd92
docs(quality-checkpoints): document test-task false-complete anti-pat…
informatico-madrid 90e61ed
fix(pipeline): V5 does not wait for cloud CI - local agent disconnects on open…
informatico-madrid f9376f9
fix(executor): add Stuck State Protocol + Exit Code Gate (v0.4.9)
informatico-madrid 12dbbc7
feat: Implement ralph-quality-improvements spec
informatico-madrid c4e5ee4
feat: Add self-review checklist and external reviewer protocol to imp…
informatico-madrid ccae3d5
feat: Update documentation structure and add verification steps for s…
informatico-madrid e1c2047
spec(ralph-quality-improvements): add implementation tasks
informatico-madrid b7beca6
fix(tasks): apply P1-P4 corrections to tasks.md
informatico-madrid 59d6e49
fix(tasks): correct FR-A2 anchor to AFTER ## Performance Considerations
informatico-madrid c492cdb
fix: add node_modules to .gitignore to avoid tracking …
informatico-madrid df8b034
fix(tasks): apply architect-anchors and verify-anchors corrections
informatico-madrid 825694f
fix: add .claude/settings.json to .gitignore to avoid track…
informatico-madrid b31eb45
feat(architect-reviewer): add Document Self-Review Checklist for spec…
informatico-madrid e153381
feat(architect-reviewer): add On Design Update reconciliation section
informatico-madrid f33e1bc
chore(spec): mark task 1.2 complete
informatico-madrid 5631af8
feat(architect-reviewer): add On Design Update reconciliation section
informatico-madrid 6d87426
chore(spec): mark tasks 1.2 and 1.3 complete
informatico-madrid 6cdefa5
feat(templates): add Concurrency & Ordering Risks section to design.md
informatico-madrid b8423ee
feat(architect-reviewer): add On Design Update reconciliation section
informatico-madrid a8ef4d9
chore(spec): mark tasks 1.2 and 1.3 complete
informatico-madrid 1b41842
feat(architect-reviewer): add On Design Update reconciliation section
informatico-madrid 4f1820d
chore(tasks): mark 1.2, 1.3 complete
informatico-madrid e224f8d
feat(product-manager): add On Requirements Update reconciliation section
informatico-madrid 468ecc3
feat(spec-executor): add External Review Protocol and external_unmark…
informatico-madrid b79372c
feat(templates): add task_review.md for external reviewer protocol
informatico-madrid f763a19
chore(tasks): mark tasks 1.7-1.12 complete
informatico-madrid 9c049c4
chore(version): bump patch version for quality improvements release
informatico-madrid 5d89fac
feat(task-review): update monitoring summary with full review results…
informatico-madrid 939ff14
feat(tests): remove diagnostic report for E2E tests in CI
informatico-madrid 5cee546
feat(spec): remove specification document and add protocol…
informatico-madrid c99bb2f
feat(tasks): update task total and add phase 2 to corr…
informatico-madrid eaa095d
feat(tasks): add tasks 2.10 and 2.11 for external-reviewer agent and …
informatico-madrid 66d027d
fix(architect-reviewer): reposition Document Self-Review Checklist af…
informatico-madrid 70f9168
fix(spec-executor): correct External Review Protocol and external_unm…
informatico-madrid 4825602
fix: complete Phase 2 fixes from PR review
informatico-madrid dcd3ce8
chore(tasks): mark all Phase 2 fix tasks complete
informatico-madrid be1f2c7
fix(product-manager): align checklist item and changelog format with …
informatico-madrid 2a7f1a2
fix(spec-executor): integrate effectiveIterations as Stuck State Prot…
informatico-madrid 0951a79
feat(agents): add external-reviewer agent prompt for parallel review …
informatico-madrid d60d2a2
feat(interview-framework): add parallel reviewer onboarding and quali…
informatico-madrid b24d8b2
feat(tasks): mark tasks 2.8, 2.9, 2.10, and 2.11 as complete with cor…
informatico-madrid 55c9ce3
Merge pull request #8 from informatico-madrid/feat/ralph-quality-impr…
informatico-madrid fb3a000
spec(agent-chat-protocol): add requirements
informatico-madrid be44794
spec(agent-chat-protocol): add technical design
informatico-madrid 14b06c2
spec(agent-chat-protocol): fix 3 design issues
informatico-madrid 12dcc9f
spec(agent-chat-protocol): add 46 implementation tasks
informatico-madrid 08a3789
Add agent-chat protocol spec and FLOC framework for agent communication
informatico-madrid 448b36f
feat(chat-template): create chat.md template with FLOC signals legend
informatico-madrid 3d99c2e
feat(chat-state): add chat field to .ralph-state.json schema
informatico-madrid c8a8679
feat(spec-executor): add Chat Protocol section with atomic read/write
informatico-madrid d68efbf
feat(spec-executor): add OVER and HOLD signal handling
informatico-madrid 087db55
feat(spec-executor): add STILL TTL tracking for deadlock prevention
informatico-madrid cb898be
feat(spec-executor): add FLOC signal writers to Chat Protocol
informatico-madrid 658d706
feat(external-reviewer): add Chat Protocol section infrastructure
informatico-madrid 5a1dfa2
fix(external-reviewer): prohibit background scripts, enforce real ver…
informatico-madrid c615c5c
feat(external-reviewer): add OVER response signals
informatico-madrid 19e13d9
feat(external-reviewer): add STILL and ALIVE signal implementation
informatico-madrid d8660f3
feat(external-reviewer): add URGENT, INTENT-FAIL, DEADLOCK signals
informatico-madrid 81e2ea6
chore(external-reviewer): add version field for plugin versioning
informatico-madrid f274514
fix(external-reviewer): resolve DRY violation and basePath variable i…
informatico-madrid 56a4c2b
docs(spec-executor): clarify bash patterns are inline templates not e…
informatico-madrid 378be91
docs(spec-executor): clarify bash patterns are inline templates not e…
informatico-madrid 346a5d5
fix(spec-executor): integrate External Review Protocol into Task Loop…
informatico-madrid 4ea6d8f
fix(external-reviewer): add aggressive fallback to tasks.md + progres…
informatico-madrid 086210a
feat(implement): enhance reviewer onboarding with chat.md creation
informatico-madrid 94491ee
fix(agent-chat-protocol): document final external review findings and…
informatico-madrid 593816c
feat(agent-chat-protocol): add Phase 5 tasks for PR #9 review fixes
informatico-madrid 9a561db
fix(atomic-write): use flock for safe concurrent append
informatico-madrid 4c1fba2
chore(spec): update progress for task 37
informatico-madrid 65d58c6
fix(requirements): clarify FR-13 atomic write — flock required
informatico-madrid b18e447
chore(spec): update progress for task 38
informatico-madrid fc7ed64
fix(tasks): add flock to atomic write pattern in task 1.3
informatico-madrid 66e51ef
chore(progress): update task 5.3 completion in .progress.md
informatico-madrid 00b12b1
chore(spec): update progress for task 39
informatico-madrid 90f3bcf
fix(external-reviewer): use flock for atomic chat append
informatico-madrid 8177754
chore(spec): update progress for task 40
informatico-madrid 8787868
chore(spec): update progress for task 5.5 verification pass
informatico-madrid d64374a
fix(design): update architecture diagram to use .ralph-state.json
informatico-madrid a5dd431
chore(spec): update progress for task 5.6
informatico-madrid a8ef24f
fix(design): remove .chat-state references, use .ralph-state.json
informatico-madrid bb5137c
fix(design): rename lastReadIndex to lastReadLine across all spec files
informatico-madrid 881d025
fix(requirements): remove all .chat-state.*.json references — state i…
informatico-madrid 3216291
chore(spec): update progress for task 5.9
informatico-madrid 86abcdd
fix(design): remove vitest references, use bats consistently
informatico-madrid c567b6a
fix(lint): add language identifiers to fenced code blocks
informatico-madrid 6b3604e
feat(external-reviewer): add tool permissions, Judge pattern, converg…
informatico-madrid a8e9818
chore(spec): update progress for task 5.13
informatico-madrid f259711
chore(external-reviewer): bump version to 0.2.0 for reviewer improvem…
informatico-madrid d09f822
chore: mark task 5.14 complete in tasks.md and .progress.md
informatico-madrid 24628b9
fix(agent-chat-protocol): address PR #9 review feedback — atomic writ…
informatico-madrid a205387
chore(spec): final progress update for agent-chat-protocol
informatico-madrid 93722e3
chore(progress): update progress and review log for task 5.15 completion
informatico-madrid 68d1e2a
fix(reviewer): address PR review findings — BLOCK→HOLD, flock in spec…
informatico-madrid 3644b72
chore: housekeeping — index consistency, research.md activation rule,…
informatico-madrid 9272067
Merge branch 'main' into feat/agent-chat-protocol
informatico-madrid 8c0b66a
Merge pull request #9 from informatico-madrid/feat/agent-chat-protocol
informatico-madrid b5eb7ca
fix(coordinator): enforce chat.md read before each task delegation
informatico-madrid f219800
fix(executor): insert step 2a in Task Loop to enforce chat.md read
informatico-madrid 9a92048
fix(coordinator): complete signal handling in Chat Protocol
informatico-madrid 801711c
fix(coordinator): 3-way merge — chat.md enforcement, anti-fabrication…
informatico-madrid 428e7f6
fix(reviewer): G2+G4 — bootstrap reads chat.md + atomic flock on task…
informatico-madrid 811984b
feat(multi-agent): TASK_AMBIGUOUS signal + channel-map.md
informatico-madrid d24afb7
feat(e2e): enhance verification processes and introduce mandatory ski…
informatico-madrid 456b757
fix(spec-reviewer): update selector grounding criteria for E2E review…
informatico-madrid b3794e4
fix: address PR #10 review comments — real issues only
informatico-madrid 8ae6844
Merge pull request #10 from informatico-madrid/fix/e2e-implementation…
informatico-madrid f901b44
Merge upstream/main into main
informatico-madrid dc0f8f3
fix(.gitignore): add node_modules, .serena, and .qwen to ignore list
informatico-madrid c1dcaf9
feat: add supervisor role and verification checks for external-review…
informatico-madrid 54adea4
feat: add SPEC-ADJUSTMENT and SPEC-DEFICIENCY signals, enhance error …
informatico-madrid d748c31
feat: improve error handling and adjustments in verification of task…
informatico-madrid 6a414d4
feat: improve verification logic and error handling in the ag…
informatico-madrid 80a15b3
Merge pull request #11 from informatico-madrid/improve-flat-flow
informatico-madrid
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -34,4 +34,8 @@ test-ac-*/ | |
|
|
||
| # Claude | ||
| .claude/worktrees/** | ||
| .omc/** | ||
| .omc/** | ||
| node_modules/ | ||
| .serena/ | ||
| .qwen/ | ||
| .mcp.json | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| /cache | ||
| /project.local.yml |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,167 @@ | ||
| # Smart Ralph — Fork Notes | ||
|
|
||
| > **Upstream:** [`tzachbon/smart-ralph`](https://github.com/tzachbon/smart-ralph) | ||
| > **Fork:** [`informatico-madrid/smart-ralph`](https://github.com/informatico-madrid/smart-ralph) | ||
|
|
||
| This document tracks every deliberate divergence from upstream. It exists so that: | ||
|
|
||
| 1. The `README.md` stays clean and mergeable with upstream at any time | ||
| 2. There is a clear record of what to include in a future PR back to upstream | ||
| 3. Any contributor to this fork understands what was changed and why | ||
|
|
||
| --- | ||
|
|
||
| ## Fork Goals (TL;DR) | ||
|
|
||
| The upstream Smart Ralph spec loop ends at **Phase 4: Quality Gates** (lint, types, CI). | ||
|
|
||
| This fork extends it with a **Phase 5: Agentic Verification Loop** — browser-based end-to-end verification driven by `@playwright/mcp`, where the agent navigates, asserts, and reports against the spec's acceptance criteria before marking a task complete. | ||
|
|
||
| The core thesis: tests written by the agent are only as good as the agent's ability to run them in a real browser. Phase 5 closes that gap. | ||
|
|
||
| --- | ||
|
|
||
| ## What Changed vs Upstream | ||
|
|
||
| ### New: Phase 5 — Agentic Verification Loop | ||
|
|
||
| Upstream's `spec-executor` agent stops after quality gates. This fork adds a fifth phase: | ||
|
|
||
| ``` | ||
| Phase 1: Make It Work (upstream — unchanged) | ||
| Phase 2: Refactoring (upstream — unchanged) | ||
| Phase 3: Testing (upstream — unchanged) | ||
| Phase 4: Quality Gates (upstream — unchanged) | ||
| Phase 5: Verification ← NEW in this fork | ||
| ``` | ||
|
|
||
| Phase 5 is driven by **VE tasks** (Verification Execution) generated by `task-planner` and executed by `spec-executor` using MCP Playwright browser tools. | ||
|
|
||
| **Files added:** | ||
|
|
||
| | File | Purpose | | ||
| |---|---| | ||
| | `plugins/ralph-specum/skills/e2e/mcp-playwright.skill.md` | Full browser verification protocol — tool selection, verification sequence, signal format, degradation strategy | | ||
| | `plugins/ralph-specum/skills/e2e/playwright-session.skill.md` | Session lifecycle — context isolation, auth flow, cleanup, state persistence | | ||
| | `plugins/ralph-specum/skills/e2e/playwright-env.skill.md` | Environment context resolution — URL, auth type, credentials, seed data, browser config, safety limits | | ||
| | `plugins/ralph-specum/skills/e2e/ui-map.skill.md` | UI component map — stable selector registry so VE tasks don't hand-write CSS selectors | | ||
| | `plugins/ralph-specum/skills/e2e/ui-map-init.skill.md` | Initialise the UI map by crawling the live app with MCP Playwright | | ||
| | `.gitignore` additions | `playwright-env.local.md` — contains env var references and local config, never committed | | ||
|
|
||
| **Files modified:** | ||
|
|
||
| | File | What changed | | ||
| |---|---| | ||
| | `plugins/ralph-specum/agents/task-planner.md` | Added VE task format, `[VE]` markers, and Verification Contract generation rules | | ||
| | `plugins/ralph-specum/agents/spec-executor.md` | Added Phase 5 execution rules, VE task handling, skill loading order | | ||
| | `plugins/ralph-specum/templates/tasks.md` | Added Phase 5 section with VE task template and Verification Contract template | | ||
| | `plugins/ralph-specum/templates/requirements.md` | Added Entry Points section (UI routes the agent needs to navigate) | | ||
| | `CLAUDE.md` (Key Files section) | Added references to Phase 5 skill files | | ||
|
|
||
| --- | ||
|
|
||
| ## New Concepts Not in Upstream | ||
|
|
||
| ### VE Tasks | ||
|
|
||
| VE tasks (`[VE]`) are a new task type, generated in Phase 5, that instruct the `spec-executor` to verify a specific acceptance criterion via browser. They follow this format in `tasks.md`: | ||
|
|
||
| ```markdown | ||
| - [ ] [VE] AC-1.2 — verify user can submit the login form and land on dashboard | ||
| ``` | ||
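
A minimal sketch of how a script might pick the acceptance-criterion ID out of a `[VE]` line, assuming the line layout in the example above. The function name `ve_task_ac_id` and the regular expression are hypothetical; this diff does not show how the plugin itself parses VE tasks.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the AC identifier of a [VE] task line.
# Assumed layout (from the README example): "- [ ] [VE] AC-1.2 ..."
ve_task_ac_id() {
  local line="$1"
  # Checkbox (open or checked), the [VE] marker, then an AC id such as AC-1.2.
  local re='^- \[[ x]\] \[VE\] (AC-[0-9]+(\.[0-9]+)*)'
  if [[ "$line" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1  # not a VE task line
  fi
}
```

Calling `ve_task_ac_id` on the example line above prints `AC-1.2`; any line without the `[VE]` marker makes it return non-zero.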
|
|
||
| VE tasks are non-destructive by default. The agent reads `RALPH_ALLOW_WRITE` from the environment before performing any write action in a real environment. | ||
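
The write gate could be sketched as a small shell guard. This is an illustration only: the README names `RALPH_ALLOW_WRITE` but not its accepted values, so treating `true` as the enabling value is an assumption, and both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeeds only when writes are explicitly allowed.
# Assumption: the value "true" opens the gate; anything else keeps it closed.
ralph_writes_allowed() {
  [[ "${RALPH_ALLOW_WRITE:-}" == "true" ]]
}

# Run the given command only if the gate is open; otherwise skip and report.
run_if_write_allowed() {
  if ralph_writes_allowed; then
    "$@"
  else
    printf 'skipped, RALPH_ALLOW_WRITE is not "true": %s\n' "$*" >&2
    return 1
  fi
}
```

Defaulting to "closed" when the variable is unset matches the README's statement that VE tasks are non-destructive by default.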
|
|
||
| ### Verification Contract | ||
|
|
||
| A structured block appended to `requirements.md` after the requirements phase, listing: | ||
| - UI entry points (URLs the agent will navigate to) | ||
| - Auth type required | ||
| - Seed data dependencies | ||
| - Expected signals (`VERIFICATION_PASS` / `VERIFICATION_FAIL`) | ||
|
|
||
| ### playwright-env.local.md | ||
|
|
||
| A per-project local file (gitignored) that resolves environment context for the agent before any browser interaction. See [`playwright-env.local.md.example`](playwright-env.local.md.example) for the full template with all auth type variants. | ||
|
|
||
| ### Auth Types Supported | ||
|
|
||
| | Type | Env var that activates it | | ||
| |---|---| | ||
| | `none` | `RALPH_AUTH_TYPE=none` | | ||
| | `form` | `RALPH_AUTH_TYPE=form` | | ||
| | `token` | `RALPH_AUTH_TYPE=token` | | ||
| | `cookie` | `RALPH_AUTH_TYPE=cookie` | | ||
| | `oauth` | `RALPH_AUTH_TYPE=oauth` | | ||
| | `basic` | `RALPH_AUTH_TYPE=basic` | | ||
|
||
|
|
||
| Credentials are **never stored in files**. They live exclusively in environment variables (see example file). | ||
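
As a rough illustration, a pre-flight check could reject unknown values of `RALPH_AUTH_TYPE` before any browser session starts. The six accepted values come from the table above; the function name and the choice to treat an unset value as an error are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical validator for RALPH_AUTH_TYPE.
# Accepted values are the six auth types listed in the README table.
# Assumption: an empty or unset value is an error, not an implicit "none".
validate_auth_type() {
  case "${1:-}" in
    none|form|token|cookie|oauth|basic)
      return 0
      ;;
    *)
      printf 'unsupported RALPH_AUTH_TYPE: %s\n' "${1:-<unset>}" >&2
      return 1
      ;;
  esac
}
```

A typical call would be `validate_auth_type "$RALPH_AUTH_TYPE" || exit 1` at the top of a setup script.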
|
|
||
| --- | ||
|
|
||
| ## Signal Protocol | ||
|
|
||
| Phase 5 emits structured signals at the end of every VE task: | ||
|
|
||
| ``` | ||
| VERIFICATION_PASS — AC verified, screenshot attached | ||
| VERIFICATION_FAIL — AC failed, full diagnosis (console + network + snapshot) | ||
| VERIFICATION_DEGRADED — MCP not available, static fallback used | ||
| ESCALATE — Human intervention required | ||
| ``` | ||
|
|
||
| The stop-watcher captures `ESCALATE` and blocks the loop until a human resolves the blocker. | ||
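
A sketch of how a hook script might read these signals, not the actual `stop-watcher.sh` logic. It assumes the signal name starts a line of the task output; the README lists the names but not where they appear. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read task output on stdin, print the last signal seen.
# Assumption: each signal name appears at the start of a line.
last_signal() {
  grep -oE '^(VERIFICATION_(PASS|FAIL|DEGRADED)|ESCALATE)' | tail -n 1
}

# Per the README, only ESCALATE blocks the loop until a human steps in.
signal_blocks_loop() {
  [[ "$1" == "ESCALATE" ]]
}
```

For example, `signal="$(last_signal < task-output.log)"` followed by `signal_blocks_loop "$signal"` decides whether to pause; `task-output.log` is a placeholder file name.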
|
|
||
| --- | ||
|
|
||
| ## What Upstream Does Not Have (and Why) | ||
|
|
||
| | Feature | Why not upstream (yet) | | ||
| |---|---| | ||
| | Phase 5 VE loop | Requires `@playwright/mcp` — adds an optional dependency upstream doesn't mandate | | ||
| | `playwright-env.local.md` protocol | Project-specific config pattern; upstream is project-agnostic | | ||
| | Auth-aware browser sessions | Outside scope of upstream's self-contained workflow | | ||
| | `RALPH_ALLOW_WRITE` safety gate | Needed when agent runs against staging/production — upstream only targets local dev | | ||
|
|
||
| --- | ||
|
|
||
| ## PR Contribution Plan | ||
|
|
||
| When the Phase 5 work stabilises, the intended upstream contribution is: | ||
|
|
||
| 1. **Phase 5 as opt-in** — activated only when `@playwright/mcp` is detected (already implemented via Protocol A/B in `mcp-playwright.skill.md`) | ||
| 2. **`[VE]` task type** — additive to `task-planner` and `tasks.md` template, no breaking change | ||
| 3. **Verification Contract** — additive section in `requirements.md` template | ||
| 4. **`playwright-env.local.md.example`** — example only, never committed with real values | ||
|
|
||
| The auth credential handling and `playwright-env.local.md` are **out of scope for the upstream PR** — too project-specific. Those stay in the fork. | ||
|
|
||
| --- | ||
|
|
||
| ## Staying in Sync with Upstream | ||
|
|
||
| ```bash | ||
| # Add upstream remote (once) | ||
| git remote add upstream https://github.com/tzachbon/smart-ralph.git | ||
|
|
||
| # Pull upstream changes | ||
| git fetch upstream | ||
| git merge upstream/main | ||
|
|
||
| # Conflicts to expect: | ||
| # - agents/task-planner.md (VE task additions) | ||
| # - agents/spec-executor.md (Phase 5 additions) | ||
| # - templates/tasks.md (Phase 5 section) | ||
| # - templates/requirements.md (Entry Points + Verification Contract) | ||
| # - CLAUDE.md (Key Files additions) | ||
| ``` | ||
|
|
||
| When merging upstream, preserve the Phase 5 additions in the files above. Everything else should merge cleanly. | ||
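
After a merge stops on conflicts, a short helper can separate the conflicts this README predicts from surprises. This is a sketch: the function names are hypothetical, and matching by path suffix is an assumption, since the list above omits the directory prefix of the agent and template files.

```bash
#!/usr/bin/env bash
# Hypothetical check: is this conflicted path one the README expects?
# Suffix matching is an assumption; the README lists paths without a prefix.
is_expected_conflict() {
  case "$1" in
    *agents/task-planner.md|*agents/spec-executor.md) return 0 ;;
    *templates/tasks.md|*templates/requirements.md)   return 0 ;;
    CLAUDE.md)                                         return 0 ;;
    *)                                                 return 1 ;;
  esac
}

# List files left unmerged by `git merge upstream/main` and label each one.
report_conflicts() {
  local path
  git diff --name-only --diff-filter=U | while IFS= read -r path; do
    if is_expected_conflict "$path"; then
      printf 'expected:   %s\n' "$path"
    else
      printf 'unexpected: %s\n' "$path"
    fi
  done
}
```

Run `report_conflicts` from the repository root while the merge is still in progress; any `unexpected:` line deserves a closer look before resolving.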
|
|
||
| --- | ||
|
|
||
| ## Version | ||
|
|
||
| This fork is based on upstream `v3.x` (self-contained loop, no ralph-loop dependency). | ||
|
|
||
| Fork maintained by [@informatico-madrid](https://github.com/informatico-madrid). | ||