Changes from 20 commits
27 commits
- 8f212f8 feat(config): add runtime_fallback and fallback_models schema (Feb 3, 2026)
- 6190d46 fix(runtime-fallback): add Category support and expand test coverage (Feb 3, 2026)
- 6689ff2 feat(fallback_models): complete init-time and runtime integration (Feb 3, 2026)
- c62d8f1 docs: add runtime-fallback and fallback_models documentation (youming-ai, Feb 3, 2026)
- 7a6b8da fix(delegate-task): restore overrideModel priority in resolveCategory… (youming-ai, Feb 3, 2026)
- 8e98f98 fix(runtime-fallback): per-model cooldown and stricter retry patterns (youming-ai, Feb 4, 2026)
- 8433b95 fix(session-category-registry): cleanup entries for task sessions (youming-ai, Feb 4, 2026)
- 708bcc1 test(delegate-task): stabilize browserProvider and default variant cases (youming-ai, Feb 4, 2026)
- cea5f43 test(agents): update Atlas uiSelectedModel expectation (youming-ai, Feb 4, 2026)
- 4a64646 fix(runtime-fallback): use precise regex patterns for status code mat… (youming-ai, Feb 5, 2026)
- be3c53b refactor(shared): add normalizeFallbackModels utility function (youming-ai, Feb 5, 2026)
- e61dd6d refactor(agents): use normalizeFallbackModels utility across codebase (youming-ai, Feb 5, 2026)
- bd7f2be Merge branch 'dev' into feat/runtime-fallback-only (youming-ai, Feb 5, 2026)
- c375961 Merge branch 'upstream/dev' into feat/runtime-fallback-only (youming-ai, Feb 8, 2026)
- 8f72f52 fix: resolve merge conflicts in PR #1408 (youming-ai, Feb 9, 2026)
- 62fac11 feat(runtime-fallback): automatic model switching on API errors (youming-ai, Feb 9, 2026)
- eaf52ca fix(runtime-fallback): address cubic AI review issues (youming-ai, Feb 9, 2026)
- e88340f Merge branch 'feat/runtime-fallback-only' of github.com:youming-ai/oh… (youming-ai, Feb 10, 2026)
- 4aed41b fix(runtime-fallback): sort agent names by length to fix hyphenated a… (youming-ai, Feb 10, 2026)
- 6796c6d fix(runtime-fallback): 9 critical bug fixes for auto-retry, agent pre… (youngbinkim0, Feb 11, 2026)
- 45d350c refactor(runtime-fallback): extract auto-retry helper and fix provide… (youngbinkim0, Feb 11, 2026)
- e03b080 fix(runtime-fallback): harden fallback progression and success detection (youngbinkim0, Feb 12, 2026)
- f4df912 feat(runtime-fallback): add configurable session timeout controls (youngbinkim0, Feb 12, 2026)
- 3d0e070 docs(runtime-fallback): document retry classes and timeout behavior (youngbinkim0, Feb 12, 2026)
- 6c4e376 feat(runtime-fallback): generalize provider auto-retry signal detection (youngbinkim0, Feb 12, 2026)
- c430d96 feat(runtime-fallback): add timeout toggle for quota retry detection (youngbinkim0, Feb 12, 2026)
- 7029ba2 fix(config): allow timeout_seconds to be 0 to disable fallback (youngbinkim0, Feb 12, 2026)
28 changes: 14 additions & 14 deletions bun.lock

148 changes: 132 additions & 16 deletions docs/configurations.md
@@ -163,19 +163,20 @@ Override built-in agent settings:
}
```

Each agent supports: `model`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`, `category`, `variant`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `providerOptions`.
Each agent supports: `model`, `fallback_models`, `temperature`, `top_p`, `prompt`, `prompt_append`, `tools`, `disable`, `description`, `mode`, `color`, `permission`, `category`, `variant`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `providerOptions`.

### Additional Agent Options

| Option | Type | Description |
| ------------------- | ------- | ----------------------------------------------------------------------------------------------- |
| `category` | string | Category name to inherit model and other settings from category defaults |
| `variant` | string | Model variant (e.g., `max`, `high`, `medium`, `low`, `xhigh`) |
| `maxTokens` | number | Maximum tokens for response. Passed directly to OpenCode SDK. |
| `thinking` | object | Extended thinking configuration for Anthropic models. See [Thinking Options](#thinking-options) below. |
| `reasoningEffort` | string | OpenAI reasoning effort level. Values: `low`, `medium`, `high`, `xhigh`. |
| `textVerbosity` | string | Text verbosity level. Values: `low`, `medium`, `high`. |
| `providerOptions` | object | Provider-specific options passed directly to OpenCode SDK. |
| Option | Type | Description |
| ------------------- | -------------- | ----------------------------------------------------------------------------------------------- |
| `fallback_models` | string/array | Fallback models for runtime switching on API errors. Single string or array of model strings. |
| `category` | string | Category name to inherit model and other settings from category defaults |
| `variant` | string | Model variant (e.g., `max`, `high`, `medium`, `low`, `xhigh`) |
| `maxTokens` | number | Maximum tokens for response. Passed directly to OpenCode SDK. |
| `thinking` | object | Extended thinking configuration for Anthropic models. See [Thinking Options](#thinking-options) below. |
| `reasoningEffort` | string | OpenAI reasoning effort level. Values: `low`, `medium`, `high`, `xhigh`. |
| `textVerbosity` | string | Text verbosity level. Values: `low`, `medium`, `high`. |
| `providerOptions` | object | Provider-specific options passed directly to OpenCode SDK. |

#### Thinking Options (Anthropic)

@@ -714,6 +715,63 @@ Configure concurrency limits for background agent tasks. This controls how many
- Allow more concurrent tasks for fast/cheap models (e.g., Gemini Flash)
- Respect provider rate limits by setting provider-level caps

## Runtime Fallback

Automatically switch to backup models when the primary model encounters transient API errors (rate limits, overload, etc.). This keeps conversations running without manual intervention.

```json
{
"runtime_fallback": {
"enabled": true,
"retry_on_errors": [429, 503, 529],
"max_fallback_attempts": 3,
"cooldown_seconds": 60,
"notify_on_fallback": true
}
}
```

| Option | Default | Description |
| ----------------------- | ----------------- | --------------------------------------------------------------------------- |
| `enabled` | `true` | Enable runtime fallback |
| `retry_on_errors`       | `[429, 503, 529]` | HTTP status codes that trigger fallback (429 rate limit, 503 service unavailable, 529 overloaded) |
| `max_fallback_attempts` | `3` | Maximum fallback attempts per session (1-10) |
| `cooldown_seconds` | `60` | Cooldown in seconds before retrying a failed model |
| `notify_on_fallback` | `true` | Show toast notification when switching to a fallback model |
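
The defaults in the table above can be read as a merge of user settings over a fixed baseline. The following is a minimal sketch of that idea, not the project's actual schema: the `RuntimeFallbackConfig` field list is copied from the table, while `resolveRuntimeFallback`, `DEFAULT_RUNTIME_FALLBACK`, and the integer check on `max_fallback_attempts` are assumptions made for illustration.

```typescript
// Hypothetical shape mirroring the documented options; not the project's real schema.
interface RuntimeFallbackConfig {
  enabled: boolean
  retry_on_errors: number[]
  max_fallback_attempts: number
  cooldown_seconds: number
  notify_on_fallback: boolean
}

// Defaults copied from the options table.
const DEFAULT_RUNTIME_FALLBACK: RuntimeFallbackConfig = {
  enabled: true,
  retry_on_errors: [429, 503, 529],
  max_fallback_attempts: 3,
  cooldown_seconds: 60,
  notify_on_fallback: true,
}

// Merge user overrides over the defaults and enforce the documented 1-10 range.
// Requiring an integer is an assumption; the table only states the range.
function resolveRuntimeFallback(
  user: Partial<RuntimeFallbackConfig> = {},
): RuntimeFallbackConfig {
  const merged: RuntimeFallbackConfig = { ...DEFAULT_RUNTIME_FALLBACK, ...user }
  const attempts = merged.max_fallback_attempts
  if (!Number.isInteger(attempts) || attempts < 1 || attempts > 10) {
    throw new RangeError("max_fallback_attempts must be an integer from 1 to 10")
  }
  return merged
}
```

Calling `resolveRuntimeFallback({ cooldown_seconds: 30 })` would keep every other default and change only the cooldown.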

### How It Works

1. When an API error matching `retry_on_errors` occurs, the hook intercepts it.
2. The next request automatically uses the next available model from `fallback_models`.
3. Failed models enter a cooldown period (`cooldown_seconds`) before being retried.
4. If `notify_on_fallback` is enabled, a toast notification reports the model switch.
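
The steps above can be sketched as a small selection routine with per-model cooldown. This is an illustration under stated assumptions, not the hook's implementation: `FailureLog`, `pickNextModel`, and `onRetryableError` are hypothetical names, and timestamps are passed in explicitly so the behaviour is easy to follow.

```typescript
// Hypothetical bookkeeping: model id -> time (ms) at which it last failed.
type FailureLog = Record<string, number>

// Return the first fallback model that is not cooling down,
// or undefined when every candidate is still in its cooldown window.
function pickNextModel(
  fallbackModels: string[],
  failedAt: FailureLog,
  cooldownSeconds: number,
  now: number,
): string | undefined {
  for (const model of fallbackModels) {
    const failed = failedAt[model]
    if (failed === undefined || now - failed >= cooldownSeconds * 1000) {
      return model
    }
  }
  return undefined
}

// Record the failure of the current model, then choose a replacement.
function onRetryableError(
  currentModel: string,
  fallbackModels: string[],
  failedAt: FailureLog,
  cooldownSeconds: number,
  now: number,
): string | undefined {
  failedAt[currentModel] = now
  return pickNextModel(fallbackModels, failedAt, cooldownSeconds, now)
}
```

Keeping the failure time per model, rather than one timer per session, matches the "per-model cooldown" wording in the commit list; how the real hook stores this state is not shown in the diff.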

### Configuring Fallback Models

Define `fallback_models` at the agent or category level:

```json
{
"agents": {
"sisyphus": {
"model": "anthropic/claude-opus-4-5",
"fallback_models": ["openai/gpt-5.2", "google/gemini-3-pro"]
}
},
"categories": {
"ultrabrain": {
"model": "openai/gpt-5.2-codex",
"fallback_models": ["anthropic/claude-opus-4-5", "google/gemini-3-pro"]
}
}
}
```

When the primary model fails (using the `sisyphus` agent above as the example):
1. First fallback: `openai/gpt-5.2`
2. Second fallback: `google/gemini-3-pro`
3. After `max_fallback_attempts` is reached, the session returns to the primary model

## Categories

Categories enable domain-specific task delegation via the `task` tool. Each category applies runtime presets (model, temperature, prompt additions) when calling the `Sisyphus-Junior` agent.
@@ -825,14 +883,72 @@ Add your own categories or override built-in ones:
}
```

Each category supports: `model`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`, `variant`, `description`, `is_unstable_agent`.
Each category supports: `model`, `fallback_models`, `temperature`, `top_p`, `maxTokens`, `thinking`, `reasoningEffort`, `textVerbosity`, `tools`, `prompt_append`, `variant`, `description`, `is_unstable_agent`.

### Additional Category Options

| Option | Type | Default | Description |
| ------------------ | ------- | ------- | --------------------------------------------------------------------------------------------------- |
| `description` | string | - | Human-readable description of the category's purpose. Shown in task prompt. |
| `is_unstable_agent`| boolean | `false` | Mark agent as unstable - forces background mode for monitoring. Auto-enabled for gemini models. |
| Option | Type | Default | Description |
| ------------------- | ------------ | ------- | --------------------------------------------------------------------------------------------------- |
| `fallback_models` | string/array | - | Fallback models for runtime switching on API errors. Single string or array of model strings. |
| `description` | string | - | Human-readable description of the category's purpose. Shown in task prompt. |
| `is_unstable_agent` | boolean | `false` | Mark agent as unstable - forces background mode for monitoring. Auto-enabled for gemini models. |

## Runtime Fallback

Automatically switch to backup models when the primary model encounters transient API errors (rate limits, overload, etc.). This keeps conversations running without manual intervention.

```json
{
"runtime_fallback": {
"enabled": true,
"retry_on_errors": [429, 503, 529],
"max_fallback_attempts": 3,
"cooldown_seconds": 60,
"notify_on_fallback": true
}
}
```

| Option | Default | Description |
| ----------------------- | ----------------- | --------------------------------------------------------------------------- |
| `enabled` | `true` | Enable runtime fallback |
| `retry_on_errors`       | `[429, 503, 529]` | HTTP status codes that trigger fallback (429 rate limit, 503 service unavailable, 529 overloaded) |
| `max_fallback_attempts` | `3` | Maximum fallback attempts per session (1-10) |
| `cooldown_seconds` | `60` | Cooldown in seconds before retrying a failed model |
| `notify_on_fallback` | `true` | Show toast notification when switching to a fallback model |

### How It Works

1. When an API error matching `retry_on_errors` occurs, the hook intercepts it.
2. The next request automatically uses the next available model from `fallback_models`.
3. Failed models enter a cooldown period (`cooldown_seconds`) before being retried.
4. If `notify_on_fallback` is enabled, a toast notification reports the model switch.
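
The first step depends on deciding whether an error "matches" `retry_on_errors`. One way to do that is sketched below, assuming the error exposes either a numeric `status` or only a message string. `ApiErrorLike` and `isRetryableError` are hypothetical names; the word-boundary pattern is an illustration of precise status-code matching, not the pattern the PR uses.

```typescript
// Hypothetical error shape; real provider errors vary.
interface ApiErrorLike {
  status?: number
  message?: string
}

// True when the error carries, or its message mentions, a retryable status code.
function isRetryableError(error: ApiErrorLike, retryOnErrors: number[]): boolean {
  if (typeof error.status === "number") {
    return retryOnErrors.indexOf(error.status) !== -1
  }
  const message = error.message ?? ""
  for (const code of retryOnErrors) {
    // \b keeps 429 from matching inside longer digit runs such as "14290".
    if (new RegExp(`\\b${code}\\b`).test(message)) {
      return true
    }
  }
  return false
}
```

Preferring the structured `status` field and using the message only as a fallback avoids false positives from numbers that merely appear in error text.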

### Configuring Fallback Models

Define `fallback_models` at the agent or category level:

```json
{
"agents": {
"sisyphus": {
"model": "anthropic/claude-opus-4-5",
"fallback_models": ["openai/gpt-5.2", "google/gemini-3-pro"]
}
},
"categories": {
"ultrabrain": {
"model": "openai/gpt-5.2-codex",
"fallback_models": ["anthropic/claude-opus-4-5", "google/gemini-3-pro"]
}
}
}
```

When the primary model fails (using the `sisyphus` agent above as the example):
1. First fallback: `openai/gpt-5.2`
2. Second fallback: `google/gemini-3-pro`
3. After `max_fallback_attempts` is reached, the session returns to the primary model
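
The progression in the list above can be written as a function from attempt number to model. This is a sketch of one reading of the documented behaviour: `modelForAttempt` is a hypothetical helper, and returning to the primary model once the fallback list is exhausted (not only once `max_fallback_attempts` is exceeded) is an assumption.

```typescript
// Attempt 0 is the primary model; attempts 1..N walk the fallback list in order.
// Once the attempt budget or the list is exhausted, fall back to the primary model.
function modelForAttempt(
  primary: string,
  fallbackModels: string[],
  attempt: number,
  maxFallbackAttempts: number,
): string {
  const exhausted = attempt > maxFallbackAttempts || attempt > fallbackModels.length
  if (attempt <= 0 || exhausted) {
    return primary
  }
  return fallbackModels[attempt - 1] ?? primary
}
```

With the `sisyphus` configuration shown earlier, attempts 1 and 2 would select `openai/gpt-5.2` and `google/gemini-3-pro`, and any later attempt would select `anthropic/claude-opus-4-5` again.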

## Model Resolution System

@@ -967,7 +1083,7 @@ Disable specific built-in hooks via `disabled_hooks` in `~/.config/opencode/oh-m
}
```

Available hooks: `todo-continuation-enforcer`, `context-window-monitor`, `session-recovery`, `session-notification`, `comment-checker`, `grep-output-truncator`, `tool-output-truncator`, `directory-agents-injector`, `directory-readme-injector`, `empty-task-response-detector`, `think-mode`, `anthropic-context-window-limit-recovery`, `rules-injector`, `background-notification`, `auto-update-checker`, `startup-toast`, `keyword-detector`, `agent-usage-reminder`, `non-interactive-env`, `interactive-bash-session`, `compaction-context-injector`, `thinking-block-validator`, `claude-code-hooks`, `ralph-loop`, `preemptive-compaction`, `auto-slash-command`, `sisyphus-junior-notepad`, `start-work`
Available hooks: `todo-continuation-enforcer`, `context-window-monitor`, `session-recovery`, `session-notification`, `comment-checker`, `grep-output-truncator`, `tool-output-truncator`, `directory-agents-injector`, `directory-readme-injector`, `empty-task-response-detector`, `think-mode`, `anthropic-context-window-limit-recovery`, `rules-injector`, `background-notification`, `auto-update-checker`, `startup-toast`, `keyword-detector`, `agent-usage-reminder`, `non-interactive-env`, `interactive-bash-session`, `compaction-context-injector`, `thinking-block-validator`, `claude-code-hooks`, `ralph-loop`, `preemptive-compaction`, `auto-slash-command`, `sisyphus-junior-notepad`, `start-work`, `runtime-fallback`

**Note on `directory-agents-injector`**: This hook is **automatically disabled** when running on OpenCode 1.1.37+ because OpenCode now has native support for dynamically resolving AGENTS.md files from subdirectories (PR #10678). This prevents duplicate AGENTS.md injection. For older OpenCode versions, the hook remains active to provide the same functionality.

1 change: 1 addition & 0 deletions docs/features.md
@@ -352,6 +352,7 @@ Hooks intercept and modify behavior at key points in the agent lifecycle.
| **session-recovery** | Stop | Recovers from session errors - missing tool results, thinking block issues, empty messages. |
| **anthropic-context-window-limit-recovery** | Stop | Handles Claude context window limits gracefully. |
| **background-compaction** | Stop | Auto-compacts sessions hitting token limits. |
| **runtime-fallback** | Event | Automatically switches to backup models on API errors (429, 503, 529). Configurable retry logic with per-model cooldown. |

#### Truncation & Context Management

1 change: 1 addition & 0 deletions src/agents/types.ts
@@ -90,6 +90,7 @@ export type AgentName = BuiltinAgentName
export type AgentOverrideConfig = Partial<AgentConfig> & {
prompt_append?: string
variant?: string
fallback_models?: string | string[]
}

export type AgentOverrides = Partial<Record<OverridableAgentName, AgentOverrideConfig>>
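
Because `fallback_models` accepts either a single string or an array, callers need one canonical form. The commit list mentions a `normalizeFallbackModels` utility, but its body is not part of this diff. The following is only a guess at the general idea under a deliberately different name, and returning an empty array for a missing value is an assumption.

```typescript
// Same union as the `fallback_models` field on AgentOverrideConfig.
type FallbackModelsInput = string | string[]

// Hypothetical helper: always hand back an array, whatever form the config used.
function toFallbackModelList(value?: FallbackModelsInput): string[] {
  if (value === undefined) {
    return []
  }
  // A bare string becomes a one-element list; arrays are copied so the caller
  // cannot mutate the original config object by accident.
  return typeof value === "string" ? [value] : [...value]
}
```

With this in place, code that walks the fallback chain can iterate over the result without checking which form the user wrote.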
4 changes: 2 additions & 2 deletions src/agents/utils.test.ts
@@ -51,7 +51,7 @@ describe("createBuiltinAgents with model overrides", () => {
expect(agents.sisyphus.thinking).toBeUndefined()
})

test("Atlas uses uiSelectedModel when provided", async () => {
test("Atlas does not use uiSelectedModel (respects its own fallback chain)", async () => {
// #given
const fetchSpy = spyOn(shared, "fetchAvailableModels").mockResolvedValue(
new Set(["openai/gpt-5.2", "anthropic/claude-sonnet-4-5"])
@@ -75,7 +75,7 @@ describe("createBuiltinAgents with model overrides", () => {

// #then
expect(agents.atlas).toBeDefined()
expect(agents.atlas.model).toBe("openai/gpt-5.2")
expect(agents.atlas.model).toBe("anthropic/claude-sonnet-4-5")
} finally {
fetchSpy.mockRestore()
}
2 changes: 2 additions & 0 deletions src/config/index.ts
@@ -29,4 +29,6 @@ export type {
TmuxLayout,
SisyphusConfig,
SisyphusTasksConfig,
RuntimeFallbackConfig,
FallbackModels,
} from "./schema"
1 change: 1 addition & 0 deletions src/config/schema.ts
@@ -14,6 +14,7 @@ export * from "./schema/hooks"
export * from "./schema/notification"
export * from "./schema/oh-my-opencode-config"
export * from "./schema/ralph-loop"
export * from "./schema/runtime-fallback"
export * from "./schema/skills"
export * from "./schema/sisyphus"
export * from "./schema/sisyphus-agent"
3 changes: 3 additions & 0 deletions src/config/schema/agent-overrides.ts
@@ -1,5 +1,6 @@
import { z } from "zod"
import { AgentPermissionSchema } from "./internal/permission"
import { FallbackModelsSchema } from "./runtime-fallback"

export const AgentOverrideConfigSchema = z.object({
/** @deprecated Use `category` instead. Model is inherited from category defaults. */
@@ -37,6 +38,8 @@ export const AgentOverrideConfigSchema = z.object({
textVerbosity: z.enum(["low", "medium", "high"]).optional(),
/** Provider-specific options. Passed directly to OpenCode SDK. */
providerOptions: z.record(z.string(), z.unknown()).optional(),
/** Fallback models for runtime switching on API errors. Single string or array of model strings. */
fallback_models: FallbackModelsSchema.optional(),
})

export const AgentOverridesSchema = z.object({