Ticket: P1.5-S5-05
Type: Feature | Est: TBD — scope in dedicated session
Goal: Automated harness that runs all 30 incidents in both invocation modes (tool and API), collects per-incident results, and scores AC-01 through AC-06 against ground_truth.json.
Scope: This ticket is a placeholder. Do not start implementation without a dedicated scoping session that produces an implementation spec in .docs/.
Acceptance criteria:
Ticket:
P1.5-S5-05Type: Feature | Est: TBD — scope in dedicated session
Goal: Automated harness that runs all 30 incidents in both invocation modes (tool and API), collects per-incident results, and scores AC-01 through AC-06 against
ground_truth.json.Scope: This ticket is a placeholder. Do not start implementation without a dedicated scoping session that produces an implementation spec in
.docs/.Acceptance criteria:
.docs/test_harness_spec.md