fix: handle session.status cooldown retries in runtime fallback #2368

Open
code-yeongyu wants to merge 7 commits into dev from fix/issue-2301

@code-yeongyu (Owner) commented Mar 7, 2026

Summary

  • route session.status retry events through runtime fallback so cooldown retry loops can switch to the next fallback model
  • stop model-fallback from handling session.status when runtime_fallback is enabled and dedupe retry updates by attempt, model, and retry class
  • add regression coverage for runtime fallback status handling, plugin gating, and retry-status normalization helpers
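
For illustration, deduping retry updates by attempt, model, and retry class could look roughly like the sketch below. This is a minimal sketch, not the code in this PR: `RetryStatus`, `RetryClass`, `retryKey`, and `RetryDeduper` are hypothetical names, and the retry classes listed are assumptions based on the cooldown/quota/capacity wording in the summary.

```typescript
// Hypothetical types: none of these names come from the PR's source.
type RetryClass = "cooldown" | "quota" | "capacity" | "other"

interface RetryStatus {
  attempt: number
  model: string | undefined
  retryClass: RetryClass
}

// Build a stable key that ignores the countdown text, so repeated
// "retrying in Ns" updates for the same attempt collapse to one key.
// JSON.stringify avoids ambiguity when a model ID contains separators.
function retryKey(status: RetryStatus): string {
  return JSON.stringify([status.attempt, status.model ?? null, status.retryClass])
}

class RetryDeduper {
  private lastKeyBySession = new Map<string, string>()

  // Returns true only when the key differs from the last one seen
  // for this session, i.e. when the update is worth acting on.
  shouldHandle(sessionID: string, status: RetryStatus): boolean {
    const key = retryKey(status)
    if (this.lastKeyBySession.get(sessionID) === key) return false
    this.lastKeyBySession.set(sessionID, key)
    return true
  }

  // Forget the last key, e.g. when the session is deleted.
  clear(sessionID: string): void {
    this.lastKeyBySession.delete(sessionID)
  }
}
```

With a key like this, a stream of countdown updates for the same attempt triggers the fallback once, while a new attempt, a different model, or a different retry class is treated as a fresh event.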

Testing

  • bun test
  • bun run typecheck
  • bun run build

Summary by cubic

Handle provider cooldown auto-retry signals from session.status via runtime fallback to switch to the next model immediately and avoid countdown spam. Also gate model-fallback when runtime_fallback is enabled.

  • New Features

    • Runtime fallback handles session.status "retry" events, aborts in-flight requests, and prepares the next fallback model.
    • Shared utils to extract retry attempt/model and normalize messages for stable dedupe.
  • Bug Fixes

    • Deduplicate session.status countdown updates by attempt, model, and retry class.
    • Match more cooldown/quota/capacity messages as retryable.
    • Prevent model-fallback from reacting to session.status when runtime fallback is enabled.
    • Clear session.status retry state on session deletion.

Written for commit b1946a6. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 8 files

Confidence score: 3/5

  • There is some merge risk because both findings are medium severity (6/10) with high confidence (9/10), indicating concrete behavior gaps rather than speculative nits.
  • In src/hooks/runtime-fallback/event-handler.ts, not clearing the session-status retry key after session.idle/session.stop/session.error can cause retry deduplication to leak across requests in the same session, leading to incorrect retry handling.
  • In src/shared/retry-status-utils.ts, the regex currently misses : in the character class, so model IDs containing colons (such as Ollama-style IDs) may not be matched correctly and can break retry-status detection for those models.
  • Pay close attention to src/hooks/runtime-fallback/event-handler.ts and src/shared/retry-status-utils.ts - retry-key lifecycle and model-ID parsing need fixes to avoid request-level retry regressions.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/hooks/runtime-fallback/event-handler.ts">

<violation number="1" location="src/hooks/runtime-fallback/event-handler.ts:38">
P2: Clear the session status retry key when a request completes (`session.idle`, `session.stop`, `session.error`) to avoid incorrectly deduplicating retry events across different requests in the same session.</violation>
</file>

<file name="src/shared/retry-status-utils.ts">

<violation number="1" location="src/shared/retry-status-utils.ts:17">
P2: Add `:` to the regex character class to correctly match model IDs that contain colons (e.g., Ollama models).</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@@ -6,9 +6,11 @@ import { extractStatusCode, extractErrorName, classifyErrorType, isRetryableErro
import { createFallbackState, prepareFallback } from "./fallback-state"
@cubic-dev-ai cubic-dev-ai bot Mar 7, 2026

P2: Clear the session status retry key when a request completes (session.idle, session.stop, session.error) to avoid incorrectly deduplicating retry events across different requests in the same session.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/hooks/runtime-fallback/event-handler.ts, line 38:

<comment>Clear the session status retry key when a request completes (`session.idle`, `session.stop`, `session.error`) to avoid incorrectly deduplicating retry events across different requests in the same session.</comment>

<file context>
@@ -33,6 +35,7 @@ export function createEventHandler(deps: HookDeps, helpers: AutoRetryHelpers) {
       sessionRetryInFlight.delete(sessionID)
       sessionAwaitingFallbackResult.delete(sessionID)
       helpers.clearSessionFallbackTimeout(sessionID)
+      sessionStatusHandler.clearRetryKey(sessionID)
       SessionCategoryRegistry.remove(sessionID)
     }
</file context>
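
The lifecycle fix requested above can be sketched as follows. This is a minimal sketch under stated assumptions, not the PR's `createEventHandler`: `createRetryKeyStore`, `record`, and `onSessionEvent` are hypothetical helpers, and only `clearRetryKey` and the three event names are taken from the review comment.

```typescript
// Hypothetical event shape; the real event payload may differ.
type SessionEvent = { type: string; sessionID: string }

// Events named in the review as marking the end of a request.
const REQUEST_END_EVENTS = new Set(["session.idle", "session.stop", "session.error"])

interface RetryKeyStore {
  record(sessionID: string, key: string): boolean
  clearRetryKey(sessionID: string): void
}

function createRetryKeyStore(): RetryKeyStore {
  const keys = new Map<string, string>()
  return {
    // Returns false when the same key was already recorded for the session.
    record(sessionID, key) {
      if (keys.get(sessionID) === key) return false
      keys.set(sessionID, key)
      return true
    },
    clearRetryKey(sessionID) {
      keys.delete(sessionID)
    },
  }
}

// Drop the stored key once a request finishes, so the next request in the
// same session is not deduplicated against a stale retry key.
function onSessionEvent(store: RetryKeyStore, event: SessionEvent): void {
  if (REQUEST_END_EVENTS.has(event.type)) {
    store.clearRetryKey(event.sessionID)
  }
}
```

Without the clearing step, a second request that hits the same attempt number, model, and retry class would be silently skipped as a duplicate of the first.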
}

export function extractRetryStatusModel(message: string): string | undefined {
  return message.match(/model\s+([a-z0-9._/-]+)(?=\s+(?:are|is)\b)/i)?.[1]?.toLowerCase()
@cubic-dev-ai cubic-dev-ai bot Mar 7, 2026

P2: Add `:` to the regex character class to correctly match model IDs that contain colons (e.g., Ollama models).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/shared/retry-status-utils.ts, line 17:

<comment>Add `:` to the regex character class to correctly match model IDs that contain colons (e.g., Ollama models).</comment>

<file context>
@@ -0,0 +1,51 @@
+}
+
+export function extractRetryStatusModel(message: string): string | undefined {
+  return message.match(/model\s+([a-z0-9._/-]+)(?=\s+(?:are|is)\b)/i)?.[1]?.toLowerCase()
+}
+
</file context>
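
A sketch of the suggested change, with `:` added to the character class. The sample messages are made up for illustration and are not real provider output; `extractRetryStatusModelOriginal` is a hypothetical name used only to keep the current pattern around for comparison.

```typescript
// Current pattern from the PR: the character class has no ":".
function extractRetryStatusModelOriginal(message: string): string | undefined {
  return message.match(/model\s+([a-z0-9._/-]+)(?=\s+(?:are|is)\b)/i)?.[1]?.toLowerCase()
}

// Suggested pattern: ":" added so colon-separated model tags are captured.
// The trailing "-" stays last in the class so it is read as a literal hyphen.
function extractRetryStatusModel(message: string): string | undefined {
  return message.match(/model\s+([a-z0-9._/:-]+)(?=\s+(?:are|is)\b)/i)?.[1]?.toLowerCase()
}

// Hypothetical status messages, for illustration only.
const colonMessage = "Requests for model llama3:8b are rate limited"
const plainMessage = "The model provider/Model-A is cooling down"

const beforeFix = extractRetryStatusModelOriginal(colonMessage) // no match
const afterFix = extractRetryStatusModel(colonMessage)          // "llama3:8b"
const unchanged = extractRetryStatusModel(plainMessage)         // "provider/model-a"
```

With the original pattern the capture stops at `llama3`, the lookahead then sees `:` instead of whitespace, and the whole match fails, so no model is extracted for such IDs.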