Problem
v0.9.0 is large enough that “tests pass” is not a complete release criterion. The release needs an explicit acceptance matrix covering core stability, provider routing, UI, Model Lab, WhaleFlow, docs, packaging, and rollback.
Required matrix sections
- Core build/test:
  - fmt, clippy, workspace tests, release build
  - provider registry drift check
  - packaging smoke for Cargo/npm/GitHub release flow
- Provider/model/auth:
  - DeepSeek V4
  - Xiaomi MiMo token-plan and pay-as-you-go
  - Arcee Trinity Thinking
  - Hugging Face route
  - OpenRouter/Novita/Fireworks/Volcengine provider env behavior
- Runtime stability:
  - Windows input/render smoke or documented manual verification
  - large-repo startup smoke
  - sub-agent timeout/completion smoke
  - long-running command/live-state smoke
- UI/UX:
  - first-look screen
  - slash picker readability
  - transcript tool-collapse
  - sidebar popovers
  - plan review/handoff
- v0.9.0 feature gates:
  - WhaleFlow MVP if included
  - HF/Model Lab MVP if included
  - HarnessProfile MVP if included
  - codebase_search MVP if included
- Remote workbench:
  - explicitly included, experimental, or deferred
  - if included: VM install + Telegram bridge smoke
- Docs and migration:
  - README/config/docs agree
  - release notes list breaking changes/deprecations
  - upgrade/rollback steps exist
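The "if included" rows above only apply once each gated feature has a recorded scope decision. A minimal sketch of that dependency, assuming the decisions are tracked as a simple mapping; `Scope`, `GATED_ROWS`, and `required_rows` are hypothetical names used for illustration and are not part of the project:

```python
from enum import Enum


class Scope(Enum):
    INCLUDED = "included"
    EXPERIMENTAL = "experimental"
    DEFERRED = "deferred"


# Gated feature -> matrix rows that apply when the feature is included.
# Row names are taken from the matrix; the mapping itself is illustrative.
GATED_ROWS = {
    "WhaleFlow": ["WhaleFlow MVP"],
    "HF/Model Lab": ["HF/Model Lab MVP"],
    "HarnessProfile": ["HarnessProfile MVP"],
    "codebase_search": ["codebase_search MVP"],
    "Remote workbench": ["VM install smoke", "Telegram bridge smoke"],
}


def required_rows(decisions):
    """Return the matrix rows implied by the recorded scope decisions.

    Raises ValueError when a gated feature has no decision, so nothing is
    silently included or dropped. Assumption: only INCLUDED features add
    rows; how EXPERIMENTAL features are checked is left to the release owner.
    """
    rows = []
    for feature, feature_rows in GATED_ROWS.items():
        if feature not in decisions:
            raise ValueError(f"no scope decision recorded for {feature!r}")
        if decisions[feature] is Scope.INCLUDED:
            rows.extend(feature_rows)
    return rows
```

Calling `required_rows` with every feature deferred except an included Remote workbench yields only the VM install and Telegram bridge smoke rows; leaving any feature out of the mapping raises instead of guessing.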
Acceptance criteria
- The matrix is checked off before any v0.9.0 tag/release.
- Each unchecked item has an owner and explicit defer/ship decision.
- Manual smoke results include dates, environment, provider/model, and redacted config info.
- The final release prompt points agents to this issue first.
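The criteria above can be checked mechanically before tagging. The following is a rough sketch, assuming each matrix row is recorded as a small structured entry; `MatrixItem`, `SMOKE_FIELDS`, and `release_blockers` are hypothetical names, and the field names are one possible encoding of "dates, environment, provider/model, and redacted config info":

```python
from dataclasses import dataclass
from typing import Optional

# Fields a manual smoke result is expected to carry (assumed encoding).
SMOKE_FIELDS = ("date", "environment", "provider_model", "redacted_config")


@dataclass
class MatrixItem:
    name: str
    checked: bool = False
    owner: Optional[str] = None
    decision: Optional[str] = None      # "ship" or "defer" for unchecked rows
    manual_smoke: Optional[dict] = None  # set only for manually verified rows


def release_blockers(items):
    """Return human-readable problems; an empty list means the gate passes."""
    problems = []
    for item in items:
        if not item.checked:
            # Unchecked rows need both an owner and an explicit decision.
            if not item.owner:
                problems.append(f"{item.name}: unchecked item has no owner")
            if item.decision not in ("ship", "defer"):
                problems.append(f"{item.name}: unchecked item has no defer/ship decision")
        if item.manual_smoke is not None:
            # Manual results must carry every expected field, non-empty.
            missing = [f for f in SMOKE_FIELDS if not item.manual_smoke.get(f)]
            if missing:
                problems.append(
                    f"{item.name}: manual smoke result missing {', '.join(missing)}"
                )
    return problems
```

A release script could refuse to create the tag whenever `release_blockers` returns a non-empty list, and print the list so each gap can be assigned.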