docs: add PRDBench experiment results (Chorus vs CC Baseline) by ChenNima · Pull Request #77 · Chorus-AIDLC/Chorus

ChenNima · 2026-04-02T04:28:10Z

Summary

- Add `docs/benchmark/EXPERIMENT_RESULTS.md` with empirical findings from PRDBench Task 47
- Update `docs/benchmark/README.md` with a link to the experiment results
- Three models tested (Opus, Sonnet, Haiku) × two setups (CC Baseline, Chorus AI-DLC)

Key findings

- Opus + Chorus matches the Opus baseline (69.0%), but is 3x slower
- Sonnet + Chorus scores below baseline (37-42% vs 50%), with a premature exit pattern
- Haiku + Chorus scores far below baseline (14-36% vs 64%), with context degradation and MCP tool loss
- Test-driven acceptance criteria are essential for Chorus to match baseline
- Chorus has a model capability threshold: below it, the harness hurts
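
The gaps above are easier to compare as percentage-point deltas against each model's baseline. The sketch below recomputes them from the scores quoted in this PR only. The names `BASELINE`, `CHORUS`, and `delta_range` are hypothetical and do not come from the benchmark code, and treating the quoted ranges as (low, high) bounds is an assumption.

```python
# Scores in percent, copied from the key findings above.
# Assumption: a range such as "37-42%" is treated as (low, high) bounds.
BASELINE = {"Opus": 69.0, "Sonnet": 50.0, "Haiku": 64.0}
CHORUS = {
    "Opus": (69.0, 69.0),
    "Sonnet": (37.0, 42.0),
    "Haiku": (14.0, 36.0),
}


def delta_range(model: str) -> tuple[float, float]:
    """Return (worst, best) Chorus-minus-baseline gap in percentage points."""
    low, high = CHORUS[model]
    base = BASELINE[model]
    return (low - base, high - base)


if __name__ == "__main__":
    for model in BASELINE:
        worst, best = delta_range(model)
        print(f"{model}: {worst:+.1f} to {best:+.1f} points vs baseline")
```

Under these assumptions the Sonnet gap spans -13 to -8 points and the Haiku gap spans -50 to -28 points, which lines up with the "-13%" and "-28~50%" figures in the commit message when those are read as percentage points.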

Changed files

| File | Change |
| --- | --- |
| `docs/benchmark/EXPERIMENT_RESULTS.md` | New: full experiment report |
| `docs/benchmark/README.md` | Add index link to experiment results |

Test plan

- Markdown renders correctly
- Links in README point to correct file

🤖 Generated with Claude Code

Empirical findings from running PRDBench Task 47 (Library Management System) across three models (Opus, Sonnet, Haiku) with and without Chorus AI-DLC harness.

Key results:
- Opus: Chorus matches baseline (69.0% = 69.0%), 3x slower
- Sonnet: Chorus hurts (-13%), premature exit pattern
- Haiku: Chorus significantly hurts (-28~50%), context degradation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

github-actions · 2026-04-02T04:29:24Z

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 96.43% (🎯 95%) | 1896 / 1966 |
| 🔵 | Statements | 95.43% (🎯 95%) | 2048 / 2146 |
| 🔵 | Functions | 95% (🎯 93%) | 399 / 420 |
| 🔵 | Branches | 87.55% (🎯 85%) | 1266 / 1446 |

File Coverage

No changed files found.

Generated in workflow #91 for commit 500849d by the Vitest Coverage Report Action

ChenNima wants to merge 1 commit into `develop` from `docs/prdbench-experiment-results`
