feat(autoany): autoreason types, traits, and design doc (BRO-349) by broomva · Pull Request #2 · broomva/autoany

broomva · 2026-04-01T16:23:29Z

Summary

- Add ComparativeEvaluator trait: compares two artifacts instead of scoring one in isolation, enabling evaluation in subjective domains
- Add LlmBackend trait: stateless LLM abstraction that enforces context isolation between debate phases
- Add 8 debate types: DebateConfig, Winner, IssueSeverity, CritiqueIssue, CritiqueResult, JudgeVote, DebateRound, ComparisonOutcome
- Add PromotionPolicy::Comparative variant, plus DebateSpec and TaskSpec in ProblemSpec
- Add AUTOREASON.md design document (protocol, architecture, dependency chain, cost model, safety analysis)
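
For readers who have not opened the diff, here is a minimal sketch of how the two traits could fit together. The method names (`generate`, `compare`), the error type, the `ComparisonOutcome` variants, and the `LongerTextBackend` / `LlmJudge` helpers are all assumptions made for illustration; the actual definitions in `autoany-core` may differ.

```rust
/// Result of a pairwise comparison (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComparisonOutcome {
    IncumbentWins,
    CandidateWins,
    Tie,
}

/// Stateless LLM abstraction: every call carries its full context in
/// `prompt`, so nothing leaks from one debate phase into the next.
pub trait LlmBackend {
    /// `model` optionally overrides the backend's default model.
    fn generate(&self, prompt: &str, model: Option<&str>) -> Result<String, String>;
}

/// Compares two artifacts instead of scoring one in isolation.
pub trait ComparativeEvaluator {
    fn compare(&self, incumbent: &str, candidate: &str) -> Result<ComparisonOutcome, String>;
}

/// Toy stand-in for a real model (hypothetical): it "prefers" the longer
/// text. It expects a two-line prompt of the form "A: ...\nB: ..." and
/// ignores the model override.
pub struct LongerTextBackend;

impl LlmBackend for LongerTextBackend {
    fn generate(&self, prompt: &str, _model: Option<&str>) -> Result<String, String> {
        let mut lines = prompt.lines();
        let a = lines
            .next()
            .and_then(|l| l.strip_prefix("A: "))
            .ok_or("missing A")?;
        let b = lines
            .next()
            .and_then(|l| l.strip_prefix("B: "))
            .ok_or("missing B")?;
        let verdict = if a.len() > b.len() {
            "A"
        } else if b.len() > a.len() {
            "B"
        } else {
            "TIE"
        };
        Ok(verdict.to_string())
    }
}

/// Evaluator that delegates the verdict to any `LlmBackend`.
pub struct LlmJudge<B: LlmBackend> {
    pub backend: B,
}

impl<B: LlmBackend> ComparativeEvaluator for LlmJudge<B> {
    fn compare(&self, incumbent: &str, candidate: &str) -> Result<ComparisonOutcome, String> {
        // Single-line artifacts only: the toy prompt format is line-based.
        let prompt = format!("A: {incumbent}\nB: {candidate}");
        match self.backend.generate(&prompt, None)?.trim() {
            "A" => Ok(ComparisonOutcome::IncumbentWins),
            "B" => Ok(ComparisonOutcome::CandidateWins),
            "TIE" => Ok(ComparisonOutcome::Tie),
            other => Err(format!("unparseable verdict: {other}")),
        }
    }
}
```

With this sketch, `LlmJudge { backend: LongerTextBackend }.compare("short", "a longer draft")` returns `Ok(ComparisonOutcome::CandidateWins)`. Because the backend receives only the prompt it is handed, swapping in a real model would not change the evaluator code.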

Context

Phase 1 of 7 for BRO-348 (Autoreason epic). Extends EGRI to subjective domains where no objective scalar metric exists, using adversarial multi-agent debate with blind judge panels. Zero changes to existing EgriLoop behavior — all 67 existing + new tests pass.

Test plan

All 67 unit + integration tests pass
cargo clippy -- -D warnings clean
cargo fmt --check clean
Serde roundtrip tests for all new types
PromotionPolicy::Comparative correctly escalates in DefaultSelector (requires DebateLoop)
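
The escalation behavior in the last item can be sketched as follows. Only `PromotionPolicy::Comparative`, `DefaultSelector::select`, and `Action::Escalated` come from this PR; the `KeepIfImproves` variant, the `Promoted` / `Discarded` actions, the `select` signature, and the reason string are hypothetical placeholders.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum PromotionPolicy {
    /// Hypothetical scalar policy: promote when the score improves.
    KeepIfImproves,
    /// Debate-mode policy: there is no scalar score to compare.
    Comparative,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Promoted,
    Discarded,
    /// Hand the decision to a human or another component.
    Escalated { reason: String },
}

pub struct DefaultSelector;

impl DefaultSelector {
    /// Assumed signature: the real selector likely takes richer inputs.
    pub fn select(
        &self,
        policy: &PromotionPolicy,
        candidate_score: f64,
        incumbent_score: f64,
    ) -> Action {
        match policy {
            PromotionPolicy::KeepIfImproves if candidate_score > incumbent_score => {
                Action::Promoted
            }
            PromotionPolicy::KeepIfImproves => Action::Discarded,
            // A scalar selector cannot judge a comparative policy, so it
            // reports a misconfiguration instead of guessing.
            PromotionPolicy::Comparative => Action::Escalated {
                reason: "Comparative policy requires DebateLoop".to_string(),
            },
        }
    }
}
```

Escalating rather than panicking or silently discarding keeps existing EgriLoop callers working while making the misconfiguration visible.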

🤖 Generated with Claude Code

Closes BRO-349

Summary by CodeRabbit

Release Notes

New Features
- Comparative evaluation system for assessing artifact quality through multi-agent debate and pairwise comparison
- Configurable debate parameters including judge count, convergence thresholds, label randomization, and model diversity options

Documentation
- Added comprehensive specification and integration guide for the Autoreason debate protocol
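
As a rough illustration of the configurable parameters listed above, a `DebateConfig` with a `Default` impl might look like the sketch below. The field names, the default values, the reading of "convergence threshold" as a count of consecutive rounds, and the `validate` helper are all assumptions; only the type name and the existence of a `Default` impl come from the PR.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct DebateConfig {
    /// Number of blind judges on the panel (assumed field name).
    pub judge_count: usize,
    /// Assumed meaning: consecutive rounds the incumbent must win
    /// before the debate is considered converged.
    pub convergence_threshold: usize,
    /// Shuffle the A/B labels shown to judges to reduce position bias.
    pub randomize_labels: bool,
    /// Use different models for different judges.
    pub diverse_models: bool,
}

impl Default for DebateConfig {
    fn default() -> Self {
        // Illustrative defaults, not the values used in autoany-core.
        Self {
            judge_count: 3,
            convergence_threshold: 2,
            randomize_labels: true,
            diverse_models: false,
        }
    }
}

impl DebateConfig {
    /// Hypothetical sanity check: an odd, non-zero panel avoids tied
    /// two-way votes, and a zero threshold would converge immediately.
    pub fn validate(&self) -> Result<(), String> {
        if self.judge_count == 0 || self.judge_count % 2 == 0 {
            return Err(format!(
                "judge_count must be odd and non-zero, got {}",
                self.judge_count
            ));
        }
        if self.convergence_threshold == 0 {
            return Err("convergence_threshold must be at least 1".to_string());
        }
        Ok(())
    }
}
```

Callers can override a single field with struct update syntax, for example `DebateConfig { judge_count: 5, ..DebateConfig::default() }`.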

Introduce adversarial debate evaluation (autoreason) foundation for EGRI subjective domains. Adds ComparativeEvaluator and LlmBackend traits, debate types (DebateConfig, Winner, JudgeVote, CritiqueResult, DebateRound, ComparisonOutcome), PromotionPolicy::Comparative variant, and the full AUTOREASON.md design document.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

linear · 2026-04-01T16:23:33Z

BRO-349 Autoreason Phase 1: Types and traits

coderabbitai · 2026-04-01T16:23:44Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

The PR introduces the Autoreason protocol, a debate-based evaluation system for subjective-domain artifact comparison. New trait abstractions (ComparativeEvaluator, LlmBackend), comprehensive debate-related data types, specification extensions, and integration points are added alongside detailed protocol documentation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| New Trait Abstractions: `autoany-core/src/comparative_evaluator.rs`, `autoany-core/src/llm_backend.rs` | Introduced `ComparativeEvaluator` trait for pairwise artifact comparison returning `ComparisonOutcome`, and `LlmBackend` trait for stateless LLM generation with optional model selection override. |
| Debate & Comparison Types: `autoany-core/src/types.rs` | Added 8 new types supporting debate rounds: `DebateConfig`, `Winner`, `IssueSeverity`, `CritiqueIssue`, `CritiqueResult`, `JudgeVote`, `DebateRound`, and `ComparisonOutcome`. Includes serde support and `Default` impl for `DebateConfig`. |
| Specification & Policies: `autoany-core/src/spec.rs` | Extended `ProblemSpec` with optional `debate` and `task` fields. Added `DebateSpec` and `TaskSpec` public types. Introduced `PromotionPolicy::Comparative` variant for debate-mode evaluation routing. |
| Module Exports & Integration: `autoany-core/src/lib.rs`, `autoany-core/src/selector.rs` | Exported new `comparative_evaluator` and `llm_backend` modules. Added explicit `PromotionPolicy::Comparative` handling in `DefaultSelector::select` returning `Action::Escalated` with misconfiguration reason. |
| Protocol Documentation: `autoany/references/AUTOREASON.md` | Comprehensive specification of the Autoreason adversarial multi-agent debate protocol including five-phase control flow (Attack, Revise, Synthesize, Judge, Decide), EGRI integration approach, type/trait dependencies, convergence detection, and safety considerations. |
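
The five-phase control flow named in the last row can be pictured as a small state machine followed by a majority vote. This is only a sketch: the `Phase` enum, the `Winner` variants (`Incumbent`, `Revision`, `Synthesis`), and the strict-majority rule in `decide` are assumptions based on the phase names, not a copy of what AUTOREASON.md specifies.

```rust
/// The five phases of one debate round, in order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Attack,
    Revise,
    Synthesize,
    Judge,
    Decide,
}

impl Phase {
    /// Next phase in the round, or `None` once the round is decided.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::Attack => Some(Phase::Revise),
            Phase::Revise => Some(Phase::Synthesize),
            Phase::Synthesize => Some(Phase::Judge),
            Phase::Judge => Some(Phase::Decide),
            Phase::Decide => None,
        }
    }
}

/// Which artifact a judge preferred (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Winner {
    Incumbent,
    Revision,
    Synthesis,
}

/// Decide phase: return the artifact with a strict majority of votes,
/// or `None` when no artifact has more than half of them.
pub fn decide(votes: &[Winner]) -> Option<Winner> {
    for candidate in [Winner::Incumbent, Winner::Revision, Winner::Synthesis] {
        let count = votes.iter().filter(|v| **v == candidate).count();
        if count * 2 > votes.len() {
            return Some(candidate);
        }
    }
    None
}
```

For a three-judge panel, `decide(&[Winner::Revision, Winner::Incumbent, Winner::Revision])` returns `Some(Winner::Revision)`, while a three-way split returns `None`. How the real protocol handles a split vote is not stated in this PR page.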

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Debaters gather 'round the ring,
Incumbent and Revision sling,
Their votes converge, consensus grows—
No scores today, just pairwise prose!
A protocol born of reasoned cheer, ✨
Where majorities make winners clear. 🏆

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the primary change: addition of Autoreason types, traits, and design documentation for the feature branch BRO-349. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%. |

broomva merged commit 774b71e into main Apr 1, 2026
2 of 3 checks passed

broomva deleted the feature/bro-349-autoreason-phase-1-types-and-traits branch April 1, 2026 16:25

broomva merged 1 commit into main from feature/bro-349-autoreason-phase-1-types-and-traits
