fix(runtime-fallback): 9 critical bug fixes for auto-retry, agent preservation, and model override #1777

youngbinkim0 · 2026-02-11T22:02:14Z

Summary

Fixes 9 critical bugs in the runtime-fallback feature from #1408 that prevented it from working end-to-end. After these fixes, runtime-fallback correctly:

- Detects Anthropic "credit balance too low" errors (HTTP 400)
- Automatically retries with the next fallback model
- Preserves the original agent identity during retry
- Allows manual model override via the UI without being overridden by fallback state

Related PR

This builds on top of #1408 (feat/runtime-fallback-only). All fixes are applied to the same branch.

Bug Fixes (9 total)

1. `extractStatusCode()` missed nested `data.statusCode`

Problem: Anthropic errors have structure {name: "APIError", data: {statusCode: 400, message: "..."}} but extractStatusCode only checked top-level status/statusCode properties.
Fix: Added (errorObj.data as Record<string, unknown>)?.statusCode check.
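
A minimal sketch of the nested lookup, for illustration only. The function body below is an assumption reconstructed from the description above (top-level `status`/`statusCode` first, then `data.statusCode`); it is not the code from this PR.

```typescript
// Hypothetical reconstruction of the lookup order described in fix #1.
function extractStatusCode(error: unknown): number | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const errorObj = error as Record<string, unknown>;

  // Top-level properties: the only places checked before the fix.
  for (const key of ["status", "statusCode"]) {
    const value = errorObj[key];
    if (typeof value === "number") return value;
  }

  // Nested data.statusCode: the shape reported for Anthropic errors.
  const nested = (errorObj.data as Record<string, unknown> | undefined)?.statusCode;
  return typeof nested === "number" ? nested : undefined;
}

const anthropicStyleError = {
  name: "APIError",
  data: { statusCode: 400, message: "..." },
};
console.log(extractStatusCode(anthropicStyleError)); // 400
```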

2. Error regex pattern too strict

Problem: /credit.?balance.?too.?low/ failed because `.?` matches at most one character, while the actual Anthropic message has multi-character gaps between the words.
Fix: Changed to /credit.*balance.*too.*low/i.
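
The difference between the two patterns can be seen on a sample string. The message below is an illustrative assumption with extra words between the keywords; the exact Anthropic wording is not quoted in this PR.

```typescript
const strictPattern = /credit.?balance.?too.?low/;
const relaxedPattern = /credit.*balance.*too.*low/i;

// Assumed sample text: the point is only that more than one character
// separates "balance" and "too".
const sampleMessage = "Your credit balance is too low to access the API";

console.log(strictPattern.test(sampleMessage)); // false
console.log(relaxedPattern.test(sampleMessage)); // true
```

The relaxed pattern trades precision for recall: `.*` also matches unrelated text between the keywords, which is presumably acceptable here because the pattern is only used to decide whether an error is retryable.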

3. Zod schema rejected valid config silently

Problem: Schema had .max(10) for max_fallback_attempts, but users could reasonably want more. Zod silently rejected the entire runtime_fallback block when any field failed validation.
Fix: Bumped schema limit to .max(20).

4. `getFallbackModelsForSession()` returned empty array

Problem: The session.error event doesn't include the agent, and session IDs don't contain agent names for main sessions, so the function couldn't find any fallback models.
Fix: Added fallback logic: try "sisyphus" first, then iterate all known agents until one with fallback_models is found.

5. "No model info available, cannot fallback"

Problem: Neither session.error nor session.created events include the current model. Without a model, createFallbackState() couldn't be initialized.
Fix: Added detectAgentFromSession() helper to derive model from agent config when event data is missing.

6. No auto-retry after fallback model selection

Problem: The hook set pendingFallbackModel but required the user to manually resend their message.
Fix: Added ctx.client.session.promptAsync calls that fetch the last user message from the session and resend it with the fallback model. Added sessionRetryInFlight guard to prevent double-retries.
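
The in-flight guard can be sketched as follows. This is an assumption about the shape of the guard: `sessionRetryInFlight` is modelled as a `Set` of session IDs, `retryOnce` is a hypothetical helper name, and the actual resend (the `promptAsync` call in the PR) is passed in as a callback rather than reproduced here.

```typescript
// Assumed representation: one entry per session with a retry in progress.
const sessionRetryInFlight = new Set<string>();

// Hypothetical helper: runs `resend` unless a retry for this session is
// already running. Resolves to true if the retry was started by this call.
async function retryOnce(
  sessionID: string,
  resend: () => Promise<void>,
): Promise<boolean> {
  if (sessionRetryInFlight.has(sessionID)) return false;
  sessionRetryInFlight.add(sessionID);
  try {
    await resend();
    return true;
  } finally {
    // Always release the guard, even if the resend throws.
    sessionRetryInFlight.delete(sessionID);
  }
}
```

Because the guard is set synchronously before the first `await`, a second error event for the same session that arrives while the resend is pending is skipped instead of triggering a duplicate prompt.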

7. Fallback model reverted after one message

Problem: chatMessageHandler cleared pendingFallbackModel after one use, so subsequent messages reverted to the original (failing) model.
Fix: Changed to persistent override: checks state.currentModel !== state.originalModel on every chat.message and applies the fallback model continuously.

8. Manual UI model changes overridden by persistent fallback

Problem: After fix #7 made fallback persistent, users could no longer change models via the UI because the fallback state would override their selection.
Fix: Added detection in chatMessageHandler: if requestedModel !== state.currentModel, the user changed the model manually, so reset fallback state via createFallbackState(requestedModel).
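
Fixes #7 and #8 together amount to a small decision rule on every chat.message. The sketch below is one plausible reading, not the PR's code: `FallbackState` is reduced to two fields, `resolveModelForMessage` is a hypothetical name, and it assumes that a request for the original model means "no manual change" while any third model counts as a manual change. The PR text itself only states the `requestedModel !== state.currentModel` comparison.

```typescript
interface FallbackState {
  originalModel: string;
  currentModel: string;
}

function createFallbackState(model: string): FallbackState {
  return { originalModel: model, currentModel: model };
}

// Hypothetical helper: returns the model to use and the state to keep.
function resolveModelForMessage(
  state: FallbackState,
  requestedModel: string,
): { model: string; state: FallbackState } {
  const isManualChange =
    requestedModel !== state.currentModel &&
    requestedModel !== state.originalModel; // assumption, see lead-in
  if (isManualChange) {
    // Fix #8: the user picked a different model, so start over from it.
    return { model: requestedModel, state: createFallbackState(requestedModel) };
  }
  // Fix #7: keep applying the fallback model while one is active.
  return { model: state.currentModel, state };
}

// Model names below are placeholders.
const active: FallbackState = { originalModel: "provider-a/primary", currentModel: "provider-b/backup" };
console.log(resolveModelForMessage(active, "provider-a/primary").model); // provider-b/backup
console.log(resolveModelForMessage(active, "provider-c/manual").model); // provider-c/manual
```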

9. Auto-retry defaulted to Sisyphus agent

Problem: promptAsync body only included model + parts, no agent field. OpenCode defaults to "sisyphus" when agent is omitted.
Fix: Added agent: resolvedAgent to both auto-retry promptAsync calls. Also added:

- resolveAgentForSession() with 3-tier resolution: event agent → session memory → session ID pattern
- normalizeAgentName() for display variants like "Prometheus (Planner)" → "prometheus"
- resolveAgentForSessionFromContext() that fetches recent messages to find agent
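
The three-tier resolution and the name normalization can be sketched like this. Signatures, the `KNOWN_AGENTS` subset, and the `Map` used as "session memory" are assumptions for illustration; the PR's own helpers may differ.

```typescript
// Illustrative subset, ordered longest-first so "sisyphus-junior" wins over "sisyphus".
const KNOWN_AGENTS = ["sisyphus-junior", "prometheus", "sisyphus"];

// Strip a trailing parenthetical such as " (Planner)" and lowercase the rest.
function normalizeAgentName(name: string | undefined): string | undefined {
  if (!name) return undefined;
  const base = name.replace(/\s*\(.*\)\s*$/, "").trim().toLowerCase();
  return base.length > 0 ? base : undefined;
}

// Tier 3: look for a known agent name inside the session ID.
function agentFromSessionID(sessionID: string): string | undefined {
  const lowered = sessionID.toLowerCase();
  return KNOWN_AGENTS.find((name) => lowered.includes(name));
}

// Tier 1: agent on the event. Tier 2: remembered agent. Tier 3: session ID.
function resolveAgentForSession(
  sessionID: string,
  eventAgent: string | undefined,
  sessionAgents: Map<string, string>, // hypothetical session memory
): string | undefined {
  return (
    normalizeAgentName(eventAgent) ??
    normalizeAgentName(sessionAgents.get(sessionID)) ??
    agentFromSessionID(sessionID)
  );
}

console.log(normalizeAgentName("Prometheus (Planner)")); // prometheus
```

When all three tiers fail the sketch returns `undefined`, leaving it to the caller to decide whether to fall back to a context lookup such as resolveAgentForSessionFromContext().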

Additional Improvements

- Added 400 to default retry_on_errors (Anthropic "credit balance" = HTTP 400)
- Added credit.*balance.*too.*low and insufficient.?credits? patterns to RETRYABLE_ERROR_PATTERNS
- Moved AGENT_NAMES and agentPattern to module scope for reuse
- Registered runtimeFallback hooks in event.ts and chat-message.ts
- Cleanup of sessionRetryInFlight in session deletion and stale session cleanup
- User message extraction reads from message.parts first, then message.info.parts

Files Changed

| File | Changes |
| --- | --- |
| `src/hooks/runtime-fallback/index.ts` | All 9 bug fixes, agent resolution, auto-retry logic |
| `src/hooks/runtime-fallback/index.test.ts` | New agent preservation test, updated mocks |
| `src/hooks/runtime-fallback/constants.ts` | Added 400 to retry codes, credit balance patterns |
| `src/config/schema/runtime-fallback.ts` | Bumped max_fallback_attempts to 20 |
| `src/plugin/chat-message.ts` | Register runtimeFallback chat.message hook |
| `src/plugin/event.ts` | Register runtimeFallback event hook |

Key Discoveries

- Anthropic "credit balance too low" = HTTP 400 (not 402). Error type is invalid_request_error.
- OpenCode's session.error event is sparse: only sessionID + error. No agent, no model.
- Zod validation silently rejects entire config section on any field failure.
- OpenCode defaults to sisyphus when agent omitted from promptAsync body.
- Agent names in events can be display names (e.g., "Prometheus (Planner)") and need normalization.

Testing

✅ 25 tests passing (bun test src/hooks/runtime-fallback/index.test.ts)
✅ Typecheck clean (bun run typecheck)
✅ Manually verified end-to-end: Anthropic credit error → auto-fallback → auto-retry → correct agent preserved → manual model override works

Summary by cubic

Fixes runtime-fallback so it reliably auto-switches and retries on provider errors, preserves the right agent, and respects manual model changes. Adds agent/category fallback_models with provider-aware selection and a runtime_fallback config for cooldowns, session timeouts, and notifications.

New Features
- runtime_fallback config (enabled, retry_on_errors incl 400, max_attempts up to 20, cooldown, timeout, toast).
- Support fallback_models on agents and categories; model resolution honors them at init and runtime with provider checks.
- Auto-retry resends the last user message with the resolved agent; persistent override until the user changes models.
Bug Fixes
- Broaden error detection and status parsing (Anthropic credit errors on 400, nested data.statusCode, stricter status code matching).
- Robust model/agent resolution when events lack data; agent preserved in auto-retry with guards against double retries.
- Persistent fallback across messages with correct reset on manual model changes; fix provider constraints for userFallbackModels.
- Harden fallback progression and success detection; add per-model cooldown and session-level timeout to advance when a model hangs.

Written for commit 3d0e070. Summary will update on new commits.

Add configuration schemas for runtime model fallback feature:
- RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts, cooldown_seconds, notify_on_fallback
- FallbackModelsSchema for init-time fallback model selection
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Export types and schemas from config/index.ts

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>

- Add Category-level fallback_models support in getFallbackModelsForSession()
  - Try agent-level fallback_models first
  - Then try agent's category fallback_models
- Support all builtin agents including hephaestus, sisyphus-junior, build, plan
- Expand agent name recognition regex to include:
  - hephaestus, sisyphus-junior, build, plan, multimodal-looker
- Add comprehensive test coverage (6 new tests, total 24):
  - Model switching via chat.message hook
  - Agent-level fallback_models configuration
  - SessionID agent pattern detection
  - Cooldown mechanism validation
  - Max attempts limit enforcement

All 24 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>

Implement full fallback_models support across all integration points:

1. Model Resolution Pipeline (src/shared/model-resolution-pipeline.ts)
   - Add userFallbackModels to ModelResolutionRequest
   - Process user fallback_models before hardcoded fallback chain
   - Support both connected provider and availability checking modes
2. Agent Utils (src/agents/utils.ts)
   - Update applyModelResolution to accept userFallbackModels
   - Inject fallback_models for all builtin agents (sisyphus, oracle, etc.)
   - Support both single string and array formats
3. Model Resolver (src/shared/model-resolver.ts)
   - Add userFallbackModels to ExtendedModelResolutionInput type
   - Pass through to resolveModelPipeline
4. Delegate Task Executor (src/tools/delegate-task/executor.ts)
   - Extract category fallback_models configuration
   - Pass to model resolution pipeline
   - Register session category for runtime-fallback hook
5. Session Category Registry (src/shared/session-category-registry.ts)
   - New module: maps sessionID -> category
   - Used by runtime-fallback to lookup category fallback_models
   - Auto-cleanup support
6. Runtime Fallback Hook (src/hooks/runtime-fallback/index.ts)
   - Check SessionCategoryRegistry first for category fallback_models
   - Fallback to agent-level configuration
   - Import and use SessionCategoryRegistry

Test Results:
- runtime-fallback: 24/24 tests passing
- model-resolver: 46/46 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>

…Execution

…ching Replace word-boundary regex with stricter patterns that match status codes only at start/end of string or surrounded by whitespace. Prevents false matches like '1429' or '4290'. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <[email protected]>
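
As a rough illustration of the stricter matching described in this commit, the sketch below accepts a status code only at the start or end of the string or when it is surrounded by whitespace. The helper name `messageContainsStatusCode` and the exact regex are assumptions, not the pattern used in the repository.

```typescript
// Hypothetical helper: true only when `code` stands alone in the message,
// delimited by whitespace or by the string boundaries.
function messageContainsStatusCode(message: string, code: number): boolean {
  const pattern = new RegExp(`(?:^|\\s)${code}(?:\\s|$)`);
  return pattern.test(message);
}

console.log(messageContainsStatusCode("429 Too Many Requests", 429)); // true
console.log(messageContainsStatusCode("request failed with 429", 429)); // true
console.log(messageContainsStatusCode("request id 1429", 429)); // false
console.log(messageContainsStatusCode("4290", 429)); // false
```

One consequence of this rule is that a code followed directly by punctuation, such as `429,`, no longer matches, so it is stricter than a `\b` word-boundary check in both directions.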

Add shared utility to normalize fallback_models config values. Handles both single string and array inputs consistently.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>
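
A minimal sketch of what such a normalizer might look like. Returning an empty array for a missing value is an assumption; the shared utility in the repository may return `undefined` instead. The model names in the usage lines are placeholders.

```typescript
// Hypothetical version of normalizeFallbackModels: accepts a single model
// string, an array of model strings, or nothing, and always returns an array.
function normalizeFallbackModels(value: string | string[] | undefined): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

console.log(normalizeFallbackModels("provider-a/model-x")); // [ 'provider-a/model-x' ]
console.log(normalizeFallbackModels(["provider-a/model-x", "provider-b/model-y"]).length); // 2
console.log(normalizeFallbackModels(undefined).length); // 0
```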

Replace 5 instances of inline fallback_models normalization with the shared normalizeFallbackModels() utility function. Eliminates code duplication and ensures consistent behavior.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>

Resolved conflicts in:
- src/config/schema.ts (kept both hooks)
- src/hooks/index.ts (exported both hooks)
- src/index.ts (imported both hooks)
- src/shared/index.ts (exported both utilities)

Resolved conflicts in:
- src/config/schema.ts (HookNameSchema + OhMyOpenCodeConfigSchema)
- src/agents/utils.ts (imports + model resolution calls)
- docs/configurations.md (category options table + runtime fallback docs)
- src/hooks/AGENTS.md (hook list)
- src/tools/delegate-task/executor.ts (imports + session category registry)
- src/tools/delegate-task/tools.test.ts (test case updates)
- src/features/background-agent/manager.ts (cleanup + SessionCategoryRegistry)

- Fix bun.lock version conflicts (3.3.1 -> 3.3.2) - Remove Git conflict markers from docs/configurations.md - Remove duplicate normalizeFallbackModels, import from shared module

Implements runtime model fallback that automatically switches to backup models when the primary model encounters transient errors (rate limits, overload, etc.).

Features:
- runtime_fallback configuration with customizable error codes, cooldown, notifications
- Runtime fallback hook intercepts API errors (429, 503, 529)
- Support for fallback_models from agent/category configuration
- Session-state TTL and periodic cleanup to prevent memory leaks
- Robust agent name detection with explicit AGENT_NAMES array
- Session category registry for category-specific fallback lookup

Schema changes:
- Add RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts, cooldown_seconds, notify_on_fallback options
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Add runtime-fallback to HookNameSchema

Files added:
- src/hooks/runtime-fallback/index.ts - Main hook implementation
- src/hooks/runtime-fallback/types.ts - Type definitions
- src/hooks/runtime-fallback/constants.ts - Constants and defaults
- src/hooks/runtime-fallback/index.test.ts - Comprehensive tests
- src/config/schema/runtime-fallback.ts - Schema definition
- src/shared/session-category-registry.ts - Session category tracking

Files modified:
- src/hooks/index.ts - Export runtime-fallback hook
- src/plugin/hooks/create-session-hooks.ts - Register runtime-fallback hook
- src/config/schema.ts - Export runtime-fallback schema
- src/config/schema/oh-my-opencode-config.ts - Add runtime_fallback config
- src/config/schema/agent-overrides.ts - Add fallback_models to agent config
- src/config/schema/categories.ts - Add fallback_models to category config
- src/config/schema/hooks.ts - Add runtime-fallback to hook names
- src/shared/index.ts - Export session-category-registry
- docs/configurations.md - Add Runtime Fallback documentation
- docs/features.md - Add runtime-fallback to hooks list

Supersedes code-yeongyu#1237, code-yeongyu#1408
Closes code-yeongyu#1408

- Add normalizeFallbackModels helper to centralize string/array normalization (P3)
- Export RuntimeFallbackConfig and FallbackModels types from config/index.ts
- Fix agent detection regex to use word boundaries for sessionID matching
- Improve tests to verify actual fallback switching logic (not just log paths)
- Add SessionCategoryRegistry cleanup in executeSyncTask on completion/error (P2)
- All 24 runtime-fallback tests pass, 115 delegate-task tests pass

…-my-opencode into feat/runtime-fallback-only

…gent detection The \b word boundary regex treats '-' as a boundary, causing 'sisyphus-junior-session-123' to incorrectly match 'sisyphus' instead of 'sisyphus-junior'. Sorting agent names by length (descending) ensures longer names are matched first, fixing the hyphenated agent detection issue. Fixes cubic-dev-ai review issue code-yeongyu#8
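
The effect of sorting by length can be reproduced in a few lines. `buildAgentPattern` and the three-name list are illustrative assumptions; only the technique (longest alternatives first inside a `\b...\b` alternation) comes from the commit message.

```typescript
// Illustrative subset of agent names, deliberately listed shortest-first.
const AGENT_NAMES = ["sisyphus", "sisyphus-junior", "hephaestus"];

// Hypothetical helper: builds an alternation from the names in the order given.
function buildAgentPattern(names: string[]): RegExp {
  return new RegExp(`\\b(${names.join("|")})\\b`, "i");
}

// Longest names first, so "sisyphus-junior" is tried before "sisyphus".
const sortedNames = [...AGENT_NAMES].sort((a, b) => b.length - a.length);

const unsortedPattern = buildAgentPattern(AGENT_NAMES);
const sortedPattern = buildAgentPattern(sortedNames);

const sessionID = "sisyphus-junior-session-123";
console.log(unsortedPattern.exec(sessionID)?.[1]); // sisyphus
console.log(sortedPattern.exec(sessionID)?.[1]); // sisyphus-junior
```

Regex alternation is ordered, so the first alternative that satisfies the trailing `\b` wins; since `-` counts as a word boundary, the shorter name succeeds unless the longer one is tried first.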

…servation, and model override

Bug fixes:
1. extractStatusCode: handle nested data.statusCode (Anthropic error structure)
2. Error regex: relax credit.*balance.*too.*low pattern for multi-char gaps
3. Zod schema: bump max_fallback_attempts from 10 to 20 (config rejected silently)
4. getFallbackModelsForSession: fallback to sisyphus/any agent when session.error lacks agent
5. Model detection: derive model from agent config when session.error lacks model info
6. Auto-retry: resend last user message with fallback model via promptAsync
7. Persistent fallback: override model on every chat.message (not just pendingFallbackModel)
8. Manual model change: detect UI model changes and reset fallback state
9. Agent preservation: include agent in promptAsync body to prevent defaulting to sisyphus

Additional:
- Add sessionRetryInFlight guard to prevent double-retries
- Add resolveAgentForSession with 3-tier resolution (event → session memory → session ID)
- Add normalizeAgentName for display names like "Prometheus (Planner)" → "prometheus"
- Add resolveAgentForSessionFromContext to fetch agent from session messages
- Move AGENT_NAMES and agentPattern to module scope for reuse
- Register runtime-fallback hooks in event.ts and chat-message.ts
- Remove diagnostic debug logging from isRetryableError
- Add 400 to default retry_on_errors and credit/balance patterns to RETRYABLE_ERROR_PATTERNS

github-actions · 2026-02-11T22:02:26Z

All contributors have signed the CLA. Thank you! ✅
Posted by the CLA Assistant Lite bot.

youngbinkim0 · 2026-02-11T22:03:00Z

I have read the CLA Document and I hereby sign the CLA

youngbinkim0 · 2026-02-11T22:03:26Z

recheck

youngbinkim0 · 2026-02-11T22:04:31Z

recheck

cubic-dev-ai

2 issues found across 27 files

Confidence score: 3/5

- Potential user-impacting inconsistency in src/shared/model-resolution-pipeline.ts: new userFallbackModels reads connected providers only from cache, ignoring constraints.connectedProviders, which could change model resolution behavior.
- Code duplication in src/hooks/runtime-fallback/index.ts auto-retry handlers raises maintenance risk and could lead to subtle divergence, though it’s not an immediate blocker.
- Pay close attention to src/shared/model-resolution-pipeline.ts and src/hooks/runtime-fallback/index.ts: connected provider handling and duplicated retry logic.

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/hooks/runtime-fallback/index.ts">

<violation number="1" location="src/hooks/runtime-fallback/index.ts:473">
P2: Significant code duplication in auto-retry logic between `session.error` and `message.updated` event handlers. The logic for fetching session messages, extracting the last user message, parsing parts, and calling `promptAsync` is duplicated almost verbatim (approximately 40+ lines). This increases maintenance burden and risk of inconsistencies.</violation>
</file>

<file name="src/shared/model-resolution-pipeline.ts">

<violation number="1" location="src/shared/model-resolution-pipeline.ts:105">
P1: Inconsistent handling of `constraints.connectedProviders` - the new `userFallbackModels` logic reads connected providers only from cache, ignoring the `constraints.connectedProviders` parameter that is respected in `categoryDefaultModel` and `fallbackChain` logic. This prevents callers from enforcing specific provider constraints in the user fallback code path.</violation>
</file>



…r constraint inconsistency

- Extract duplicated auto-retry logic (~40 lines each) from session.error and message.updated handlers into shared autoRetryWithFallback() helper
- Fix userFallbackModels path in model-resolution-pipeline to respect constraints.connectedProviders parameter instead of reading cache directly, matching the behavior of categoryDefaultModel and fallbackChain paths

Rebase Bot and others added 20 commits February 4, 2026 19:41

docs: add runtime-fallback and fallback_models documentation

c62d8f1

fix(delegate-task): restore overrideModel priority in resolveCategory…

7a6b8da

…Execution

fix(runtime-fallback): per-model cooldown and stricter retry patterns

8e98f98

fix(session-category-registry): cleanup entries for task sessions

8433b95

test(delegate-task): stabilize browserProvider and default variant cases

708bcc1

test(agents): update Atlas uiSelectedModel expectation

cea5f43

Merge branch 'dev' into feat/runtime-fallback-only

bd7f2be

fix: resolve merge conflicts in PR code-yeongyu#1408

8f72f52

Merge branch 'feat/runtime-fallback-only' of github.com:youming-ai/oh…

e88340f

…-my-opencode into feat/runtime-fallback-only

github-actions bot added a commit that referenced this pull request Feb 11, 2026

@youngbinkim0 has signed the CLA in #1777

e4be8ce

cubic-dev-ai bot reviewed Feb 11, 2026


youngbinkim0 added 4 commits February 11, 2026 17:14

fix(runtime-fallback): harden fallback progression and success detection

e03b080

feat(runtime-fallback): add configurable session timeout controls

f4df912

docs(runtime-fallback): document retry classes and timeout behavior

3d0e070
