Add orcarouter-adaptive router predictions by ZhenghuaBao · Pull Request #104 · RouteWorks/RouterArena

ZhenghuaBao · 2026-05-21T11:00:40Z

Method

orcarouter-adaptive is a LinUCB contextual bandit with embedding-augmented features.

Algorithm: per-arm LinUCB ridge regression with UCB exploration. Each /decide call scores all candidate arms and picks the argmax.
Feature vector: lexical features concatenated with a sentence-transformer embedding (all-MiniLM-L6-v2, L2-normalized, CPU inference for determinism across environments).
Reward shape: per-query Arena Score — aligns the bandit's training objective with the leaderboard evaluation metric.
Pre-training: per-arm parameters are initialized via offline observations before evaluation. /decide calls during evaluation are read-only on bandit state.
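
A minimal sketch of the steps listed above, not the submitted implementation. It assumes a standard disjoint LinUCB formulation with one ridge-regression model per arm. The class and function names (`LinUCBArm`, `build_features`, `decide`), the arm names, the `alpha` and `ridge` values, and the toy rewards are all hypothetical. A hand-written vector stands in for the all-MiniLM-L6-v2 embedding so the block runs with the standard library only.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_features(lexical, embedding):
    """Concatenate lexical features with an L2-normalized embedding."""
    norm = math.sqrt(dot(embedding, embedding)) or 1.0
    return list(lexical) + [e / norm for e in embedding]


class LinUCBArm:
    """One arm: ridge-regression statistics stored as A^{-1} and b."""

    def __init__(self, dim, ridge=1.0):
        # A starts as ridge * I, so A^{-1} starts as I / ridge.
        self.a_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, x):
        return [dot(row, x) for row in self.a_inv]

    def ucb(self, x, alpha):
        """Predicted reward plus an exploration bonus; does not mutate state."""
        theta = self._a_inv_times(self.b)          # theta = A^{-1} b
        width = math.sqrt(max(dot(x, self._a_inv_times(x)), 0.0))
        return dot(theta, x) + alpha * width

    def update(self, x, reward):
        """Sherman-Morrison rank-one update for A <- A + x x^T, b <- b + r x."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + dot(x, a_inv_x)
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
            self.b[i] += reward * x[i]


def decide(arms, x, alpha=1.0):
    """Score every candidate arm and return the argmax; read-only on arm state."""
    scores = {name: arm.ucb(x, alpha) for name, arm in arms.items()}
    return max(scores, key=scores.get)


# Offline pre-training on toy observations (arm, features, reward in [0, 1]).
arms = {"cheap": LinUCBArm(dim=2), "strong": LinUCBArm(dim=2)}
offline = [("cheap", [1.0, 0.0], 1.0), ("cheap", [0.0, 1.0], 0.0),
           ("strong", [1.0, 0.0], 0.0), ("strong", [0.0, 1.0], 1.0)] * 5
for name, x, reward in offline:
    arms[name].update(x, reward)

# Evaluation-time call: only decide() runs, so the bandit state is untouched.
choice = decide(arms, [1.0, 0.0], alpha=0.1)
```

Storing the inverse matrix and updating it with Sherman-Morrison avoids a full matrix inversion on every observation. Because `decide` never calls `update`, routing decisions are repeatable for a fixed state and feature vector.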

Pool

10 models across 5 providers:

Provider	Models
Anthropic	claude-haiku-4-5-20251001, claude-sonnet-4
Google	gemini-2.5-flash, gemini-2.5-flash-lite
OpenAI	gpt-4o-mini, gpt-5-mini
DeepSeek	deepseek-chat, deepseek-reasoner
Alibaba	qwen3-235b-a22b-instruct-2507, qwen3-30b-a3b-instruct-2507

Files

router_inference/config/orcarouter-adaptive.json — router config
router_inference/predictions/orcarouter-adaptive.json — full-split predictions with populated generated_result
router_inference/predictions/orcarouter-adaptive-robustness.json — robustness predictions (routing-only)
universal_model_names.py — provider-prefix → bare-name mappings for the pool
model_cost/model_cost.json — token cost entries for pool models not previously listed

Local validation via router_inference/check_config_prediction_files.py passes for both full and robustness splits.

ZhenghuaBao · 2026-05-21T11:01:16Z

/evaluate

github-actions · 2026-05-21T11:13:05Z

Router Evaluation Results

Router: orcarouter-adaptive
Dataset Split: full

RouterArena Metrics

Metric	Value
RouterArena Score	0.7208
Accuracy	75.54%
Total Cost	$8.380033
Avg Cost per Query	$0.000998
Avg Cost per 1K Queries	$0.9976
Number of Queries	8400
Robustness Score	0.2262
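
The cost rows in the table above can be cross-checked against each other with simple arithmetic. The sketch below uses only the reported values; the variable names are illustrative, and the percent scaling of the two scores is an assumption about how a 0-1 score would be shown on a 0-100 scale.

```python
# Values copied from the bot-reported metrics table.
total_cost_usd = 8.380033
num_queries = 8400
arena_score = 0.7208
robustness_score = 0.2262

# Average cost per query and per 1K queries follow from the total.
avg_cost_per_query = total_cost_usd / num_queries
avg_cost_per_1k = avg_cost_per_query * 1000

# Formatted the way the table reports them.
per_query_str = f"${avg_cost_per_query:.6f}"
per_1k_str = f"${avg_cost_per_1k:.4f}"

# Assumed display scaling: 0-1 scores shown as percentages with 2 decimals.
arena_pct = f"{arena_score * 100:.2f}"
robustness_pct = f"{robustness_score * 100:.2f}"
```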

Evaluation completed by RouterArena automated workflow

ZhenghuaBao · 2026-05-21T11:14:02Z

Hi @yl231,

All pipelines pass and evaluation results are included (full 8400 + 420 robustness splits). We'd like to merge OrcaRouter Adaptive as an online learning entry — a different regime from the offline/pre-trained routers currently on the leaderboard, intended as a reference point for contextual bandits in routing.

Thanks in advance for the review!

@ZhenghuaBao

OrcaRouter-Adaptive (LinUCB contextual bandit with embedding features) submitted in #104 by @ZhenghuaBao. With an Acc-Cost Arena score of 72.08 it slots between Sqwish (75.27) and Azure-Model-Router (71.87), landing at 🥈. Every row below shifts down by one (16 → 17 entries).

Bot-reported metrics:

Acc-Cost Arena: 72.08
Accuracy: 75.54
Cost/1K Queries: $1.00 ($0.9976 rounded to 2dp like other rows)
Robustness: 22.62

The submitted prediction file does not include for_optimality entries, so the three optimality columns are dashed (same convention as GPT-5 and the Azure row). URL and affiliation cells are left blank for the maintainer to fill in pre-merge, matching the Auto Router / Sqwish Router pattern.

Co-authored-by: yl231 <yl231@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The conflict resolution in the "Merge branch 'main'" commit dropped the `},` separator between the last orcarouter-adaptive entry and the first upstream-added entry, which broke JSON parsing in the pre-commit check-json hook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
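
The failure mode described in this commit message can be reproduced in a few lines. This is an illustrative sketch only: the keys and values are hypothetical stand-ins, not the contents of the affected file, and `parses_as_json` is a helper defined here rather than part of the check-json hook.

```python
import json


def parses_as_json(text):
    """Return True if text is valid JSON, False if parsing fails."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Two adjacent entries with the `},` separator intact.
fixed = '{"model-a": {"cost": 1.0}, "model-b": {"cost": 2.0}}'

# The same text after a merge resolution drops the `},` between the entries.
broken = '{"model-a": {"cost": 1.0 "model-b": {"cost": 2.0}}'

fixed_ok = parses_as_json(fixed)
broken_ok = parses_as_json(broken)
```

Running a parse like this on the merged file before committing catches the dropped separator locally, which is the same class of error the hook reports.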

ZhenghuaBao and others added 2 commits May 21, 2026 18:49

Add orcarouter-adaptive router predictions

cefac61

10-model pool: anthropic claude-haiku/sonnet-4, google gemini-2.5-flash/-lite, openai gpt-4o-mini/gpt-5-mini, deepseek chat/reasoner, alibaba qwen3-235b/30b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Register orcarouter-adaptive model pool in name + cost registries

466286d

Add prefix→bare mappings (universal_model_names.py) and cost entries (model_cost/model_cost.json) for the 5 pool models not already present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

yl231 mentioned this pull request May 21, 2026

Add OrcaRouter-Adaptive to leaderboard (#104) #108

Merged

ZhenghuaBao and others added 2 commits May 22, 2026 09:52

Merge branch 'main' into submit/orcarouter-adaptive-v2

49eff9e

yl231 self-assigned this May 25, 2026

yl231 approved these changes May 25, 2026

View reviewed changes

yl231 merged commit 571bf08 into RouteWorks:main May 25, 2026
5 checks passed

yl231 merged 4 commits into RouteWorks:main from Continuum-AI-Corp:submit/orcarouter-adaptive-v2
