feat(autoany): autoreason types, traits, and design doc (BRO-349) #2

Merged
broomva merged 1 commit into main from feature/bro-349-autoreason-phase-1-types-and-traits
Apr 1, 2026


@broomva broomva commented Apr 1, 2026

Summary

  • Add ComparativeEvaluator trait — compares two artifacts instead of scoring one in isolation, enabling subjective domain evaluation
  • Add LlmBackend trait — stateless LLM abstraction enforcing context isolation between debate phases
  • Add 8 debate types: DebateConfig, Winner, IssueSeverity, CritiqueIssue, CritiqueResult, JudgeVote, DebateRound, ComparisonOutcome
  • Add PromotionPolicy::Comparative variant + DebateSpec/TaskSpec in ProblemSpec
  • Add comprehensive AUTOREASON.md design document (protocol, architecture, dependency chain, cost model, safety analysis)
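
To make the two trait roles concrete, here is a minimal sketch of how they might fit together. Only the names ComparativeEvaluator, LlmBackend, Winner, and ComparisonOutcome come from this PR; every signature, field, and variant below, plus the toy LongerIsBetter backend and LlmJudge wrapper, is an assumption for illustration and may differ from the merged code. Serde derives are left out so the snippet depends only on the standard library.

```rust
/// Which side a comparison favours (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Winner {
    Incumbent,
    Revision,
    Tie,
}

/// Result of one pairwise comparison (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct ComparisonOutcome {
    pub winner: Winner,
    pub rationale: String,
}

/// Stateless LLM abstraction: each call carries its full prompt, so no
/// context can leak from one debate phase into the next.
pub trait LlmBackend {
    /// `model` optionally overrides the backend's default model.
    fn generate(&self, prompt: &str, model: Option<&str>) -> Result<String, String>;
}

/// Compares two artifacts instead of scoring one in isolation.
pub trait ComparativeEvaluator {
    type Artifact;
    fn compare(&self, incumbent: &Self::Artifact, revision: &Self::Artifact) -> ComparisonOutcome;
}

/// Hypothetical toy backend: answers "A", "B", or "TIE" by text length.
pub struct LongerIsBetter;

impl LlmBackend for LongerIsBetter {
    fn generate(&self, prompt: &str, _model: Option<&str>) -> Result<String, String> {
        // Prompt layout assumed for this sketch: "<A>\n---\n<B>".
        let (a, b) = prompt.split_once("\n---\n").ok_or("malformed prompt")?;
        let verdict = match a.len().cmp(&b.len()) {
            std::cmp::Ordering::Greater => "A",
            std::cmp::Ordering::Less => "B",
            std::cmp::Ordering::Equal => "TIE",
        };
        Ok(verdict.to_string())
    }
}

/// Hypothetical evaluator that delegates the pairwise verdict to a backend.
pub struct LlmJudge<B: LlmBackend> {
    pub backend: B,
}

impl<B: LlmBackend> ComparativeEvaluator for LlmJudge<B> {
    type Artifact = String;

    fn compare(&self, incumbent: &String, revision: &String) -> ComparisonOutcome {
        let prompt = format!("{incumbent}\n---\n{revision}");
        // Backend errors and unrecognised replies are treated as a tie here.
        let winner = match self.backend.generate(&prompt, None).as_deref() {
            Ok("A") => Winner::Incumbent,
            Ok("B") => Winner::Revision,
            _ => Winner::Tie,
        };
        ComparisonOutcome {
            winner,
            rationale: format!("backend verdict: {winner:?}"),
        }
    }
}
```

Because generate takes only a prompt and an optional model name, a caller cannot accidentally reuse conversation state between phases, which is one plausible reading of the "context isolation" goal above.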

Context

Phase 1 of 7 for BRO-348 (Autoreason epic). Extends EGRI to subjective domains where no objective scalar metric exists, using adversarial multi-agent debate with blind judge panels. Zero changes to existing EgriLoop behavior — all 67 existing + new tests pass.

Test plan

  • All 67 unit + integration tests pass
  • cargo clippy -- -D warnings clean
  • cargo fmt --check clean
  • Serde roundtrip tests for all new types
  • PromotionPolicy::Comparative correctly escalates in DefaultSelector (requires DebateLoop)
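
The last test-plan item can be pictured with a small sketch. PromotionPolicy::Comparative, DefaultSelector, and Action::Escalated are named in this PR; the KeepIfImproves variant, the other Action variants, the select signature, and the reason text are assumptions made up for this example, not the crate's actual definitions.

```rust
/// Promotion policies (only `Comparative` is confirmed by the PR).
#[derive(Debug, Clone, PartialEq)]
pub enum PromotionPolicy {
    /// Assumed scalar policy: promote when the candidate scores higher.
    KeepIfImproves,
    /// Debate-mode evaluation; meant to be driven by a DebateLoop.
    Comparative,
}

/// Selector decisions (shape of each variant is assumed).
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Promoted,
    Discarded,
    Escalated { reason: String },
}

pub struct DefaultSelector;

impl DefaultSelector {
    /// Hypothetical signature: compares two scalar scores under a policy.
    pub fn select(&self, policy: &PromotionPolicy, candidate: f64, incumbent: f64) -> Action {
        match policy {
            PromotionPolicy::KeepIfImproves if candidate > incumbent => Action::Promoted,
            PromotionPolicy::KeepIfImproves => Action::Discarded,
            // A scalar selector cannot judge a comparative policy, so it
            // escalates with a misconfiguration reason instead of guessing.
            PromotionPolicy::Comparative => Action::Escalated {
                reason: "Comparative policy requires DebateLoop".to_string(),
            },
        }
    }
}
```

Escalating rather than promoting or discarding keeps the existing scalar path unchanged while making a misrouted Comparative spec visible to the operator.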

🤖 Generated with Claude Code

Closes BRO-349

Summary by CodeRabbit

Release Notes

  • New Features

    • Comparative evaluation system for assessing artifact quality through multi-agent debate and pairwise comparison
    • Configurable debate parameters including judge count, convergence thresholds, label randomization, and model diversity options
  • Documentation

    • Added comprehensive specification and integration guide for the Autoreason debate protocol
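
As a rough illustration of the configurable parameters listed above, a DebateConfig with a Default impl could look like the following. The type name and the existence of a Default impl come from this PR; the field names, default values, and the validate helper are assumptions for this sketch only.

```rust
/// Debate parameters (all field names and defaults are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct DebateConfig {
    /// Number of judges on the blind panel.
    pub judge_count: usize,
    /// Consecutive rounds the incumbent must win before the loop stops.
    pub convergence_threshold: usize,
    /// Shuffle the A/B labels shown to judges to reduce position bias.
    pub randomize_labels: bool,
    /// Spread judges across different models.
    pub diverse_models: bool,
}

impl Default for DebateConfig {
    fn default() -> Self {
        Self {
            judge_count: 3,
            convergence_threshold: 2,
            randomize_labels: true,
            diverse_models: false,
        }
    }
}

impl DebateConfig {
    /// Hypothetical check: an odd, non-zero panel avoids split votes
    /// between two candidates.
    pub fn validate(&self) -> Result<(), String> {
        if self.judge_count == 0 || self.judge_count % 2 == 0 {
            return Err(format!("judge_count must be odd, got {}", self.judge_count));
        }
        Ok(())
    }
}
```

With a Default impl in place, callers can override a single field using struct-update syntax, for example `DebateConfig { judge_count: 5, ..DebateConfig::default() }`.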

Introduce adversarial debate evaluation (autoreason) foundation for EGRI
subjective domains. Adds ComparativeEvaluator and LlmBackend traits,
debate types (DebateConfig, Winner, JudgeVote, CritiqueResult, DebateRound,
ComparisonOutcome), PromotionPolicy::Comparative variant, and the full
AUTOREASON.md design document.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

linear bot commented Apr 1, 2026


coderabbitai bot commented Apr 1, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

The PR introduces the Autoreason protocol, a debate-based evaluation system for subjective-domain artifact comparison. New trait abstractions (ComparativeEvaluator, LlmBackend), comprehensive debate-related data types, specification extensions, and integration points are added alongside detailed protocol documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| New Trait Abstractions: autoany-core/src/comparative_evaluator.rs, autoany-core/src/llm_backend.rs | Introduced ComparativeEvaluator trait for pairwise artifact comparison returning ComparisonOutcome, and LlmBackend trait for stateless LLM generation with optional model selection override. |
| Debate & Comparison Types: autoany-core/src/types.rs | Added 8 new types supporting debate rounds: DebateConfig, Winner, IssueSeverity, CritiqueIssue, CritiqueResult, JudgeVote, DebateRound, and ComparisonOutcome. Includes serde support and Default impl for DebateConfig. |
| Specification & Policies: autoany-core/src/spec.rs | Extended ProblemSpec with optional debate and task fields. Added DebateSpec and TaskSpec public types. Introduced PromotionPolicy::Comparative variant for debate-mode evaluation routing. |
| Module Exports & Integration: autoany-core/src/lib.rs, autoany-core/src/selector.rs | Exported new comparative_evaluator and llm_backend modules. Added explicit PromotionPolicy::Comparative handling in DefaultSelector::select returning Action::Escalated with misconfiguration reason. |
| Protocol Documentation: autoany/references/AUTOREASON.md | Comprehensive specification of the Autoreason adversarial multi-agent debate protocol including five-phase control flow (Attack, Revise, Synthesize, Judge, Decide), EGRI integration approach, type/trait dependencies, convergence detection, and safety considerations. |
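
The five-phase control flow and the judge panel's decision can be sketched as below. The phase names, Winner, and JudgeVote are taken from this PR; the variant and field names, the successor helper, and the strict-majority rule in decide are assumptions for illustration, since AUTOREASON.md itself is not reproduced here.

```rust
/// The five phases named in the protocol documentation, in order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Attack,
    Revise,
    Synthesize,
    Judge,
    Decide,
}

impl Phase {
    /// Next phase within a round, or None after Decide (helper is assumed).
    pub fn successor(self) -> Option<Phase> {
        match self {
            Phase::Attack => Some(Phase::Revise),
            Phase::Revise => Some(Phase::Synthesize),
            Phase::Synthesize => Some(Phase::Judge),
            Phase::Judge => Some(Phase::Decide),
            Phase::Decide => None,
        }
    }
}

/// Which artifact a judge or panel prefers (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Winner {
    Incumbent,
    Revision,
    Tie,
}

/// One blind judge's ballot (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct JudgeVote {
    pub judge_id: usize,
    pub winner: Winner,
}

/// Assumed decision rule: a side wins only with a strict majority of all
/// ballots cast; anything else, including an empty panel, is a tie.
pub fn decide(votes: &[JudgeVote]) -> Winner {
    let incumbent = votes.iter().filter(|v| v.winner == Winner::Incumbent).count();
    let revision = votes.iter().filter(|v| v.winner == Winner::Revision).count();
    if revision * 2 > votes.len() {
        Winner::Revision
    } else if incumbent * 2 > votes.len() {
        Winner::Incumbent
    } else {
        Winner::Tie
    }
}
```

Treating a non-majority as a tie is a conservative choice for this sketch: the incumbent is never displaced on a split panel.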

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Debaters gather 'round the ring,
Incumbent and Revision sling,
Their votes converge, consensus grows—
No scores today, just pairwise prose!
A protocol born of reasoned cheer,
Where majorities make winners clear. 🏆

🚥 Pre-merge checks | ✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the primary change: addition of Autoreason types, traits, and design documentation for the feature branch BRO-349. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%. |


@broomva broomva merged commit 774b71e into main Apr 1, 2026
2 of 3 checks passed
@broomva broomva deleted the feature/bro-349-autoreason-phase-1-types-and-traits branch April 1, 2026 16:25