feat(runtime-fallback): auto-retry, generic retry detection, and timeout toggle #1777
base: dev
Add configuration schemas for runtime model fallback feature:
- RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts, cooldown_seconds, notify_on_fallback
- FallbackModelsSchema for init-time fallback model selection
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Export types and schemas from config/index.ts

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>
- Add Category-level fallback_models support in getFallbackModelsForSession()
  - Try agent-level fallback_models first
  - Then try agent's category fallback_models
  - Support all builtin agents including hephaestus, sisyphus-junior, build, plan
- Expand agent name recognition regex to include:
  - hephaestus, sisyphus-junior, build, plan, multimodal-looker
- Add comprehensive test coverage (6 new tests, total 24):
  - Model switching via chat.message hook
  - Agent-level fallback_models configuration
  - SessionID agent pattern detection
  - Cooldown mechanism validation
  - Max attempts limit enforcement

All 24 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>
Implement full fallback_models support across all integration points:

1. Model Resolution Pipeline (src/shared/model-resolution-pipeline.ts)
   - Add userFallbackModels to ModelResolutionRequest
   - Process user fallback_models before hardcoded fallback chain
   - Support both connected provider and availability checking modes
2. Agent Utils (src/agents/utils.ts)
   - Update applyModelResolution to accept userFallbackModels
   - Inject fallback_models for all builtin agents (sisyphus, oracle, etc.)
   - Support both single string and array formats
3. Model Resolver (src/shared/model-resolver.ts)
   - Add userFallbackModels to ExtendedModelResolutionInput type
   - Pass through to resolveModelPipeline
4. Delegate Task Executor (src/tools/delegate-task/executor.ts)
   - Extract category fallback_models configuration
   - Pass to model resolution pipeline
   - Register session category for runtime-fallback hook
5. Session Category Registry (src/shared/session-category-registry.ts)
   - New module: maps sessionID -> category
   - Used by runtime-fallback to lookup category fallback_models
   - Auto-cleanup support
6. Runtime Fallback Hook (src/hooks/runtime-fallback/index.ts)
   - Check SessionCategoryRegistry first for category fallback_models
   - Fallback to agent-level configuration
   - Import and use SessionCategoryRegistry

Test Results:
- runtime-fallback: 24/24 tests passing
- model-resolver: 46/46 tests passing

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <[email protected]>
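The lookup order described in item 6 (session category registry first, then agent-level configuration) could be sketched roughly as follows. This is an illustrative sketch only: `sessionCategories`, `registerSessionCategory`, `FallbackConfig`, and `getFallbackModels` are hypothetical names, not the identifiers used in the PR.

```typescript
// Hypothetical config shape; the real schema lives in src/config/schema/.
interface FallbackConfig {
  agents: Record<string, { fallback_models?: string | string[] }>;
  categories: Record<string, { fallback_models?: string | string[] }>;
}

// Hypothetical stand-in for the session category registry: sessionID -> category.
const sessionCategories = new Map<string, string>();

function registerSessionCategory(sessionID: string, category: string): void {
  sessionCategories.set(sessionID, category);
}

// Accept both the single-string and the array form.
function toArray(value: string | string[] | undefined): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Category fallback_models win when the session is registered; otherwise
// fall back to the agent-level configuration.
function getFallbackModels(
  sessionID: string,
  agent: string | undefined,
  config: FallbackConfig,
): string[] {
  const category = sessionCategories.get(sessionID);
  if (category !== undefined) {
    const models = toArray(config.categories[category]?.fallback_models);
    if (models.length > 0) return models;
  }
  if (agent !== undefined) {
    return toArray(config.agents[agent]?.fallback_models);
  }
  return [];
}
```

A registry entry should be removed once the session ends (for example with `sessionCategories.delete(sessionID)`), which is the role the "auto-cleanup support" bullet appears to play.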
…ching Replace word-boundary regex with stricter patterns that match status codes only at start/end of string or surrounded by whitespace. Prevents false matches like '1429' or '4290'. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <[email protected]>
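A minimal sketch of the stricter matching described in this commit, assuming the pattern is built per status code. `containsStatusCode` is a hypothetical helper name; the actual patterns in the PR may differ.

```typescript
// Hypothetical helper: match a status code only when it stands alone, i.e. at
// the start or end of the string or surrounded by whitespace.
function containsStatusCode(message: string, code: number): boolean {
  const pattern = new RegExp(`(?:^|\\s)${code}(?:\\s|$)`);
  return pattern.test(message);
}

console.log(containsStatusCode("HTTP 429 Too Many Requests", 429)); // true
console.log(containsStatusCode("request id 1429", 429)); // false
console.log(containsStatusCode("port 4290 unreachable", 429)); // false
```

Note that a pattern this strict also rejects a code followed directly by punctuation (such as `429:`), which is the trade-off of dropping the word-boundary match.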
Add shared utility to normalize fallback_models config values. Handles both single string and array inputs consistently. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <[email protected]>
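A normalization helper of this kind might look like the sketch below. The function name comes from the later commits in this PR; the signature and the choice to return an empty array for a missing value are assumptions, not the confirmed implementation.

```typescript
// Assumed input shape: a single model string, an array of model strings,
// or nothing at all.
type FallbackModelsInput = string | string[] | undefined;

// Sketch: always hand callers an array so they never branch on the input form.
// Returning [] for undefined is an assumption made for this example.
function normalizeFallbackModels(value: FallbackModelsInput): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? [...value] : [value];
}
```

Copying the array (`[...value]`) keeps callers from mutating the configuration object by accident.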
Replace 5 instances of inline fallback_models normalization with the shared normalizeFallbackModels() utility function. Eliminates code duplication and ensures consistent behavior. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <[email protected]>
Resolved conflicts in:
- src/config/schema.ts (kept both hooks)
- src/hooks/index.ts (exported both hooks)
- src/index.ts (imported both hooks)
- src/shared/index.ts (exported both utilities)
Resolved conflicts in:
- src/config/schema.ts (HookNameSchema + OhMyOpenCodeConfigSchema)
- src/agents/utils.ts (imports + model resolution calls)
- docs/configurations.md (category options table + runtime fallback docs)
- src/hooks/AGENTS.md (hook list)
- src/tools/delegate-task/executor.ts (imports + session category registry)
- src/tools/delegate-task/tools.test.ts (test case updates)
- src/features/background-agent/manager.ts (cleanup + SessionCategoryRegistry)
- Fix bun.lock version conflicts (3.3.1 -> 3.3.2)
- Remove Git conflict markers from docs/configurations.md
- Remove duplicate normalizeFallbackModels, import from shared module
Implements runtime model fallback that automatically switches to backup models when the primary model encounters transient errors (rate limits, overload, etc.).

Features:
- runtime_fallback configuration with customizable error codes, cooldown, notifications
- Runtime fallback hook intercepts API errors (429, 503, 529)
- Support for fallback_models from agent/category configuration
- Session-state TTL and periodic cleanup to prevent memory leaks
- Robust agent name detection with explicit AGENT_NAMES array
- Session category registry for category-specific fallback lookup

Schema changes:
- Add RuntimeFallbackConfigSchema with enabled, retry_on_errors, max_fallback_attempts, cooldown_seconds, notify_on_fallback options
- Add fallback_models to AgentOverrideConfigSchema and CategoryConfigSchema
- Add runtime-fallback to HookNameSchema

Files added:
- src/hooks/runtime-fallback/index.ts - Main hook implementation
- src/hooks/runtime-fallback/types.ts - Type definitions
- src/hooks/runtime-fallback/constants.ts - Constants and defaults
- src/hooks/runtime-fallback/index.test.ts - Comprehensive tests
- src/config/schema/runtime-fallback.ts - Schema definition
- src/shared/session-category-registry.ts - Session category tracking

Files modified:
- src/hooks/index.ts - Export runtime-fallback hook
- src/plugin/hooks/create-session-hooks.ts - Register runtime-fallback hook
- src/config/schema.ts - Export runtime-fallback schema
- src/config/schema/oh-my-opencode-config.ts - Add runtime_fallback config
- src/config/schema/agent-overrides.ts - Add fallback_models to agent config
- src/config/schema/categories.ts - Add fallback_models to category config
- src/config/schema/hooks.ts - Add runtime-fallback to hook names
- src/shared/index.ts - Export session-category-registry
- docs/configurations.md - Add Runtime Fallback documentation
- docs/features.md - Add runtime-fallback to hooks list

Supersedes code-yeongyu#1237, code-yeongyu#1408
Closes code-yeongyu#1408
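To illustrate how max_fallback_attempts and cooldown_seconds might interact when choosing the next model, here is a rough sketch. The state shape, the `pickNextModel` name, and the exact cooldown semantics (a failed model is skipped until its cooldown has elapsed) are assumptions for illustration and are not taken from the hook's source.

```typescript
// Hypothetical per-session state: how many fallbacks were used so far, and
// when each model last failed (epoch milliseconds).
interface FallbackState {
  attempts: number;
  failedAt: Map<string, number>;
}

interface FallbackOptions {
  maxAttempts: number; // stands in for max_fallback_attempts
  cooldownMs: number; // stands in for cooldown_seconds * 1000
}

// Pick the first configured model that is not cooling down, unless the
// attempt budget for this session is already spent.
function pickNextModel(
  models: string[],
  state: FallbackState,
  options: FallbackOptions,
  now: number,
): string | undefined {
  if (state.attempts >= options.maxAttempts) return undefined;
  const next = models.find((model) => {
    const failedAt = state.failedAt.get(model);
    return failedAt === undefined || now - failedAt >= options.cooldownMs;
  });
  if (next !== undefined) state.attempts += 1;
  return next;
}
```

Passing `now` explicitly (rather than calling `Date.now()` inside) keeps the selection logic deterministic and easy to test.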
- Add normalizeFallbackModels helper to centralize string/array normalization (P3)
- Export RuntimeFallbackConfig and FallbackModels types from config/index.ts
- Fix agent detection regex to use word boundaries for sessionID matching
- Improve tests to verify actual fallback switching logic (not just log paths)
- Add SessionCategoryRegistry cleanup in executeSyncTask on completion/error (P2)
- All 24 runtime-fallback tests pass, 115 delegate-task tests pass
…-my-opencode into feat/runtime-fallback-only
…gent detection The \b word boundary regex treats '-' as a boundary, causing 'sisyphus-junior-session-123' to incorrectly match 'sisyphus' instead of 'sisyphus-junior'. Sorting agent names by length (descending) ensures longer names are matched first, fixing the hyphenated agent detection issue. Fixes cubic-dev-ai review issue code-yeongyu#8
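The length-descending sort can be sketched as below. The agent list here is a partial, illustrative subset and `detectAgent` is a hypothetical name; only the sorting idea is taken from the commit message.

```typescript
// Illustrative subset of agent names; the PR keeps its own AGENT_NAMES array.
const AGENT_NAMES = [
  "sisyphus",
  "sisyphus-junior",
  "hephaestus",
  "build",
  "plan",
  "multimodal-looker",
];

// Regex alternation tries alternatives left to right, so longer names must
// come first or "sisyphus" would win over "sisyphus-junior".
const sortedNames = [...AGENT_NAMES].sort((a, b) => b.length - a.length);
const agentPattern = new RegExp(`\\b(${sortedNames.join("|")})\\b`, "i");

function detectAgent(sessionID: string): string | undefined {
  const match = agentPattern.exec(sessionID);
  return match?.[1]?.toLowerCase();
}

console.log(detectAgent("sisyphus-junior-session-123")); // "sisyphus-junior"
console.log(detectAgent("sisyphus-session-123")); // "sisyphus"
```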
…servation, and model override

Bug fixes:
1. extractStatusCode: handle nested data.statusCode (Anthropic error structure)
2. Error regex: relax credit.*balance.*too.*low pattern for multi-char gaps
3. Zod schema: bump max_fallback_attempts from 10 to 20 (config rejected silently)
4. getFallbackModelsForSession: fallback to sisyphus/any agent when session.error lacks agent
5. Model detection: derive model from agent config when session.error lacks model info
6. Auto-retry: resend last user message with fallback model via promptAsync
7. Persistent fallback: override model on every chat.message (not just pendingFallbackModel)
8. Manual model change: detect UI model changes and reset fallback state
9. Agent preservation: include agent in promptAsync body to prevent defaulting to sisyphus

Additional:
- Add sessionRetryInFlight guard to prevent double-retries
- Add resolveAgentForSession with 3-tier resolution (event → session memory → session ID)
- Add normalizeAgentName for display names like "Prometheus (Planner)" → "prometheus"
- Add resolveAgentForSessionFromContext to fetch agent from session messages
- Move AGENT_NAMES and agentPattern to module scope for reuse
- Register runtime-fallback hooks in event.ts and chat-message.ts
- Remove diagnostic debug logging from isRetryableError
- Add 400 to default retry_on_errors and credit/balance patterns to RETRYABLE_ERROR_PATTERNS
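Two of the helpers named above lend themselves to a short sketch. The function names come from the commit message, but the bodies below are assumptions about plausible behavior (including which top-level fields are checked), not the code in the PR.

```typescript
// Sketch: look for a numeric status code at the top level first, then in a
// nested data.statusCode (the shape the commit attributes to Anthropic errors).
function extractStatusCode(error: unknown): number | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const e = error as {
    statusCode?: unknown;
    status?: unknown;
    data?: { statusCode?: unknown } | null;
  };
  const candidates: unknown[] = [e.statusCode, e.status, e.data?.statusCode];
  return candidates.find((c): c is number => typeof c === "number");
}

// Sketch: strip a trailing parenthesized label and lowercase the rest, so a
// display name such as "Prometheus (Planner)" becomes "prometheus".
function normalizeAgentName(name: string): string {
  return name.replace(/\s*\(.*\)\s*$/, "").trim().toLowerCase();
}

console.log(extractStatusCode({ data: { statusCode: 529 } })); // 529
console.log(normalizeAgentName("Prometheus (Planner)")); // "prometheus"
```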
2 issues found across 27 files
Confidence score: 3/5
- Potential user-impacting inconsistency in `src/shared/model-resolution-pipeline.ts`: new `userFallbackModels` reads connected providers only from cache, ignoring `constraints.connectedProviders`, which could change model resolution behavior.
- Code duplication in `src/hooks/runtime-fallback/index.ts` auto-retry handlers raises maintenance risk and could lead to subtle divergence, though it’s not an immediate blocker.
- Pay close attention to `src/shared/model-resolution-pipeline.ts` and `src/hooks/runtime-fallback/index.ts` - connected provider handling and duplicated retry logic.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/hooks/runtime-fallback/index.ts">
<violation number="1" location="src/hooks/runtime-fallback/index.ts:473">
P2: Significant code duplication in auto-retry logic between `session.error` and `message.updated` event handlers. The logic for fetching session messages, extracting the last user message, parsing parts, and calling `promptAsync` is duplicated almost verbatim (approximately 40+ lines). This increases maintenance burden and risk of inconsistencies.</violation>
</file>
<file name="src/shared/model-resolution-pipeline.ts">
<violation number="1" location="src/shared/model-resolution-pipeline.ts:105">
P1: Inconsistent handling of `constraints.connectedProviders` - the new `userFallbackModels` logic reads connected providers only from cache, ignoring the `constraints.connectedProviders` parameter that is respected in `categoryDefaultModel` and `fallbackChain` logic. This prevents callers from enforcing specific provider constraints in the user fallback code path.</violation>
</file>
…r constraint inconsistency
- Extract duplicated auto-retry logic (~40 lines each) from session.error and message.updated handlers into shared autoRetryWithFallback() helper
- Fix userFallbackModels path in model-resolution-pipeline to respect constraints.connectedProviders parameter instead of reading cache directly, matching the behavior of categoryDefaultModel and fallbackChain paths
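The provider-constraint fix can be pictured with a sketch like the one below: an explicit `constraints.connectedProviders` takes precedence, and the cache is consulted only when the caller supplied nothing. The `provider/model` string format, the `readCache` callback, the function name, and the behavior when no provider information exists at all are assumptions made for illustration.

```typescript
interface Constraints {
  connectedProviders?: string[] | null;
}

// Assumes model IDs look like "provider/model"; anything before the first
// slash is treated as the provider.
function providerOf(model: string): string {
  const slash = model.indexOf("/");
  return slash === -1 ? model : model.slice(0, slash);
}

// Sketch: prefer the caller-supplied provider list, fall back to the cache.
function resolveUserFallback(
  userFallbackModels: string[],
  constraints: Constraints,
  readCache: () => string[] | null,
): string | undefined {
  const connected = constraints.connectedProviders ?? readCache();
  // Assumption: with no provider information at all, take the first entry.
  if (connected === null) return userFallbackModels[0];
  const allowed = new Set(connected);
  return userFallbackModels.find((model) => allowed.has(providerOf(model)));
}
```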
Refactor retry signal detection to be provider-agnostic:
- Replace hardcoded Copilot/OpenAI checks with generic pattern matching
- Detect any provider message containing limit/quota keywords + [retrying in X]
- Add OpenAI pattern: 'usage limit has been reached [retrying in X]'
- Update logging to use generic 'provider' instead of specific names
- Add 'usage limit has been reached' to RETRYABLE_ERROR_PATTERNS

This enables fallback escalation for any provider that signals automatic retries due to quota/rate limits, not just Copilot and OpenAI.

Closes PR discussion: generalize retry pattern detection
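A provider-agnostic check along these lines might look like the following sketch. The two regular expressions and the `isProviderAutoRetrySignal` name are illustrative guesses based on the description (limit/quota keywords plus a `[retrying in X]` marker), not the patterns shipped in the PR.

```typescript
// Illustrative patterns: a limit/quota keyword anywhere in the message, plus a
// bracketed "[retrying in ...]" marker.
const LIMIT_KEYWORDS = /limit|quota/i;
const RETRYING_MARKER = /\[retrying in [^\]]+\]/i;

// A message counts as an auto-retry signal only when both parts are present,
// regardless of which provider produced it.
function isProviderAutoRetrySignal(message: string): boolean {
  return LIMIT_KEYWORDS.test(message) && RETRYING_MARKER.test(message);
}

console.log(isProviderAutoRetrySignal("usage limit has been reached [retrying in 5m]")); // true
console.log(isProviderAutoRetrySignal("connection reset [retrying in 2s]")); // false
```

Requiring both parts keeps ordinary network retries (which also print a retry marker) from triggering a model fallback.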
Make provider auto-retry signal detection respect timeout_seconds setting:
- When timeout_seconds=0, disable quota-based fallback escalation
- Only treat auto-retry signals as errors when timeout is enabled
- Add test to verify behavior when timeout_seconds is disabled
- Update documentation to explain timeout_seconds=0 behavior

This allows users to disable timeout-based fallbacks while keeping error-based fallback functionality intact.
Previously, the Zod schema rejected timeout_seconds: 0 due to .min(1). It now accepts 0, which disables timeout-based fallback.
- Changed z.number().min(1) to z.number().min(0)
- Updated comment to clarify 0 disables timeout checks
- All tests pass (44 runtime-fallback + 46 schema tests)
- Build successful
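The effect of relaxing the lower bound can be shown with a plain TypeScript check standing in for the Zod schema. Both function names below are hypothetical, and the real validation is done by `z.number().min(0)` as stated above.

```typescript
// Stand-in for the schema check: any number >= 0 is accepted, so 0 is valid.
function parseTimeoutSeconds(value: unknown): number {
  if (typeof value !== "number" || Number.isNaN(value) || value < 0) {
    throw new RangeError("timeout_seconds must be a number >= 0");
  }
  return value;
}

// 0 acts as the "off" switch for timeout-based fallback; error-based
// fallback is unaffected by this flag.
function isTimeoutFallbackEnabled(timeoutSeconds: number): boolean {
  return timeoutSeconds > 0;
}

console.log(isTimeoutFallbackEnabled(parseTimeoutSeconds(0))); // false
console.log(isTimeoutFallbackEnabled(parseTimeoutSeconds(30))); // true
```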
Summary
This PR implements runtime fallback improvements to detect provider auto-retry signals and adds a timeout toggle for quota-based fallback escalation.
Changes
Test Results
Example Configuration
{
Summary
This PR implements runtime fallback auto-retry with full agent preservation. The fix chain enables end-to-end runtime fallback to work correctly in production.
What Changed
Bug Fixes (9 total)
- `data.statusCode` structure
- `max_fallback_attempts` now supports 20 (was 10)
- `session.error` events without agent info
New Features
- `timeout_seconds: 0` disables quota-based fallback escalation
Testing
✅ 44 runtime-fallback tests passing
✅ 46 schema tests passing
✅ TypeScript compilation clean
✅ Manual end-to-end verification (Anthropic error → auto-fallback → agent preserved → manual override works)
Related
Builds on #1408 (feat/runtime-fallback-only)