Introduce adversarial debate evaluation (autoreason) foundation for EGRI subjective domains. Adds ComparativeEvaluator and LlmBackend traits, debate types (DebateConfig, Winner, JudgeVote, CritiqueResult, DebateRound, ComparisonOutcome), PromotionPolicy::Comparative variant, and the full AUTOREASON.md design document. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Caution: Review failed. Pull request was closed or merged during review.
Walkthrough
The PR introduces the Autoreason protocol, a debate-based evaluation system for subjective-domain artifact comparison. New trait abstractions (
Estimated code review effort: 3 (Moderate), ~20 minutes
Pre-merge checks: 3 passed
Summary
- `ComparativeEvaluator` trait: compares two artifacts instead of scoring one in isolation, enabling subjective domain evaluation
- `LlmBackend` trait: stateless LLM abstraction enforcing context isolation between debate phases
- Debate types: `DebateConfig`, `Winner`, `IssueSeverity`, `CritiqueIssue`, `CritiqueResult`, `JudgeVote`, `DebateRound`, `ComparisonOutcome`
- `PromotionPolicy::Comparative` variant + `DebateSpec`/`TaskSpec` in `ProblemSpec`
- `AUTOREASON.md` design document (protocol, architecture, dependency chain, cost model, safety analysis)
Context
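To make the two traits concrete, here is a minimal sketch of how they might fit together. Only the names `ComparativeEvaluator`, `LlmBackend` and `Winner` come from this PR; every method signature, the `Winner` variants, and the `LongerWins` and `SingleJudge` helpers are assumptions made for illustration and may not match the actual crate.

```rust
// Hypothetical sketch: the trait and enum names come from the PR summary,
// but every signature, variant and helper below is an assumption.

/// Outcome of comparing two artifacts (variants are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Winner {
    Incumbent,
    Challenger,
    Tie,
}

/// Stateless LLM abstraction: each call receives its full prompt and keeps
/// no history, so one debate phase cannot leak context into another.
pub trait LlmBackend {
    fn complete(&self, prompt: &str) -> String;
}

/// Compares two artifacts instead of scoring one in isolation.
pub trait ComparativeEvaluator {
    fn compare(&self, incumbent: &str, challenger: &str) -> Winner;
}

/// Toy backend for illustration: answers with the label of the longer line.
pub struct LongerWins;

impl LlmBackend for LongerWins {
    fn complete(&self, prompt: &str) -> String {
        // Expects a two-line prompt of the form "A: <text>\nB: <text>".
        let mut lines = prompt.lines();
        let a = lines.next().unwrap_or("").len();
        let b = lines.next().unwrap_or("").len();
        match a.cmp(&b) {
            std::cmp::Ordering::Greater => "A".to_string(),
            std::cmp::Ordering::Less => "B".to_string(),
            std::cmp::Ordering::Equal => "TIE".to_string(),
        }
    }
}

/// Evaluator that delegates the judgement to a single backend call.
pub struct SingleJudge<B: LlmBackend> {
    pub backend: B,
}

impl<B: LlmBackend> ComparativeEvaluator for SingleJudge<B> {
    fn compare(&self, incumbent: &str, challenger: &str) -> Winner {
        let prompt = format!("A: {incumbent}\nB: {challenger}");
        match self.backend.complete(&prompt).as_str() {
            "A" => Winner::Incumbent,
            "B" => Winner::Challenger,
            _ => Winner::Tie,
        }
    }
}
```

Because the backend takes only a prompt string and returns a string, a test double like `LongerWins` can stand in for a real model without any shared state.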
Phase 1 of 7 for BRO-348 (Autoreason epic). Extends EGRI to subjective domains where no objective scalar metric exists, using adversarial multi-agent debate with blind judge panels. Zero changes to existing EgriLoop behavior — all 67 existing + new tests pass.
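The blind judge panel could be sketched roughly as follows. `Winner` and `JudgeVote` are named in this PR, but their fields, the `blind_vote` and `tally` functions, and the strict-majority rule are assumptions for illustration; the design in `AUTOREASON.md` may differ.

```rust
// Hypothetical sketch: `Winner` and `JudgeVote` are named in the PR, but
// their fields and the tallying rule here are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Winner {
    Incumbent,
    Challenger,
    Tie,
}

/// One judge's vote, already mapped back from blind presentation order.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JudgeVote {
    pub judge_id: usize,
    pub winner: Winner,
}

/// Ask one judge to pick between two anonymised texts.
/// `swap` controls which artifact is shown first, so position carries no
/// information about which one is the incumbent.
/// The judge returns Some(true) if the first text wins, Some(false) if the
/// second wins, and None for a tie.
pub fn blind_vote<F>(
    judge_id: usize,
    swap: bool,
    incumbent: &str,
    challenger: &str,
    judge: F,
) -> JudgeVote
where
    F: Fn(&str, &str) -> Option<bool>,
{
    let (first, second) = if swap {
        (challenger, incumbent)
    } else {
        (incumbent, challenger)
    };
    let winner = match judge(first, second) {
        None => Winner::Tie,
        // "First wins" means the incumbent only when the order was not swapped.
        Some(first_wins) => {
            if first_wins != swap {
                Winner::Incumbent
            } else {
                Winner::Challenger
            }
        }
    };
    JudgeVote { judge_id, winner }
}

/// Strict-majority tally: a side wins only with more than half of all
/// votes; anything else is reported as a tie.
pub fn tally(votes: &[JudgeVote]) -> Winner {
    let challenger = votes.iter().filter(|v| v.winner == Winner::Challenger).count();
    let incumbent = votes.iter().filter(|v| v.winner == Winner::Incumbent).count();
    if challenger * 2 > votes.len() {
        Winner::Challenger
    } else if incumbent * 2 > votes.len() {
        Winner::Incumbent
    } else {
        Winner::Tie
    }
}
```

Alternating `swap` across judges is one simple way to cancel out position bias; a real implementation might randomise it instead.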
Test plan
- `cargo clippy -- -D warnings` clean
- `cargo fmt --check` clean
- `PromotionPolicy::Comparative` correctly escalates in `DefaultSelector` (requires `DebateLoop`)

🤖 Generated with Claude Code
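As a rough illustration of the escalation behaviour checked in the test plan, the sketch below shows a selector that refuses to decide a comparative policy on scalar scores. Only `PromotionPolicy::Comparative`, `DefaultSelector` and `DebateLoop` are named in this PR; the `KeepIfImproves` variant, the `Decision` enum and the `select` signature are hypothetical.

```rust
// Hypothetical sketch: only `PromotionPolicy::Comparative`, `DefaultSelector`
// and `DebateLoop` are named in the PR. The other variant, the `Decision`
// enum and the `select` signature are assumptions for illustration.

#[derive(Debug, Clone, PartialEq)]
pub enum PromotionPolicy {
    /// Assumed scalar policy: promote when the score improves.
    KeepIfImproves,
    /// Needs a pairwise debate; a scalar selector cannot decide it.
    Comparative,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Decision {
    Promote,
    Discard,
    /// Hand the decision to another component, with a reason.
    Escalate(String),
}

pub struct DefaultSelector;

impl DefaultSelector {
    pub fn select(&self, policy: &PromotionPolicy, baseline: f64, candidate: f64) -> Decision {
        match policy {
            PromotionPolicy::KeepIfImproves => {
                if candidate > baseline {
                    Decision::Promote
                } else {
                    Decision::Discard
                }
            }
            // Scores are ignored here: the comparative policy always escalates.
            PromotionPolicy::Comparative => {
                Decision::Escalate("Comparative policy requires DebateLoop".to_string())
            }
        }
    }
}
```

Escalating rather than guessing keeps the existing scalar path unchanged while making it explicit that a debate is required.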
Closes BRO-349