Skip to content

fix: handle rate_limit_event in claude_agent_sdk to prevent parse errors#1875

Open
Rex-Arnab wants to merge 7 commits intoAndyMik90:developfrom
Rex-Arnab:fix/1864-rate-limit-event
Open

fix: handle rate_limit_event in claude_agent_sdk to prevent parse errors#1875
Rex-Arnab wants to merge 7 commits intoAndyMik90:developfrom
Rex-Arnab:fix/1864-rate-limit-event

Conversation

@Rex-Arnab
Copy link

@Rex-Arnab Rex-Arnab commented Feb 18, 2026

Added a monkey patch in backend run and spec_runner modules to intercept
and handle "rate_limit_event" messages from claude_agent_sdk. The patch
converts these events into SystemMessage instances instead of raising
MessageParseError, allowing sessions to continue and retry after rate
limit resets. Also updated error_utils to detect "rate_limit" error strings.

Summary by CodeRabbit

  • Bug Fixes

    • Broader detection of rate-limit messages to avoid misclassification.
    • Rate-limit notifications no longer close streams; sessions remain open and recover.
    • Visible rate-limit notices (debug/info and stdout) added to aid diagnosis.
  • Chores

    • Startup now applies runtime patches so rate-limit notifications are handled consistently.
  • Tests

    • Added tests covering the patching behavior and rate-limit handling in sessions.

Added a monkey patch in backend run and spec_runner modules to intercept
and handle "rate_limit_event" messages from claude_agent_sdk. The patch
converts these events into SystemMessage instances instead of raising
MessageParseError, allowing sessions to continue and retry after rate
limit resets. Also updated error_utils to detect "rate_limit" error strings.
@CLAassistant
Copy link

CLAassistant commented Feb 18, 2026

CLA assistant check
All committers have signed the CLA.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Rex-Arnab, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the application's robustness by gracefully handling rate limit events from the claude_agent_sdk. By preventing parse errors and converting these events into retryable system messages, the system can now recover more effectively from temporary API rate limits, ensuring smoother operation and improved user experience.

Highlights

  • Claude SDK Rate Limit Handling: Implemented a monkey patch in backend/run.py and runners/spec_runner.py to intercept rate_limit_event messages from claude_agent_sdk.
  • Graceful Error Conversion: Converted rate_limit_event messages into SystemMessage instances instead of raising MessageParseError, allowing sessions to continue and retry after rate limit resets.
  • Enhanced Rate Limit Detection: Updated error_utils.py to include "rate_limit" as a recognized string for detecting rate limit errors, specifically targeting Unknown message type: rate_limit_event from the Claude SDK.
Changelog
  • apps/backend/core/error_utils.py
    • Added "rate_limit" to the list of patterns used to identify rate limit errors.
  • apps/backend/run.py
    • Introduced a monkey patch function _patch_claude_agent_sdk_rate_limit to modify the claude_agent_sdk's message parsing behavior.
    • The patch intercepts rate_limit_event dictionaries and converts them into SystemMessage objects, logging a warning instead of raising an exception.
    • Applied the patch to both message_parser.parse_message and client.parse_message within the SDK.
  • apps/backend/runners/spec_runner.py
    • Implemented an identical monkey patch as in apps/backend/run.py to ensure consistent rate limit event handling in the spec runner.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an idempotent runtime patch to handle unknown rate_limit_event messages from claude_agent_sdk as SystemMessage, broadens rate-limit keyword detection, applies the patch early across multiple runners/entrypoints, and logs/prints rate-limit notices in agent session processing while keeping streams open.

Changes

Cohort / File(s) Summary
Error detection
apps/backend/core/error_utils.py
Added "rate_limit" to is_rate_limit_error keyword checks to expand rate-limit message detection.
SDK runtime patch
apps/backend/core/sdk_patches.py
New module exposing apply_claude_agent_sdk_patches() that monkey-patches claude_agent_sdk parsing to return a SystemMessage for rate_limit_event; idempotent, logged, and guarded by try/except.
Startup patch application (runners & entrypoints)
apps/backend/run.py, apps/backend/runners/spec_runner.py, apps/backend/runners/github/runner.py, apps/backend/runners/gitlab/runner.py, apps/backend/runners/ideation_runner.py, apps/backend/runners/insights_runner.py, apps/backend/runners/roadmap_runner.py
Import and invoke apply_claude_agent_sdk_patches() early during module initialization so the SDK patch is applied before dotenv loading and normal SDK usage.
Session handling
apps/backend/agents/session.py
Added handling for SystemMessage with subtype rate_limit_event: debug log, stdout notice, and INFO entry to task logger; stream is kept open and processing continues.
Tests
tests/test_sdk_patches.py, tests/test_session_rate_limit.py
New unit tests covering patch behavior, idempotency, unknown/known message handling, rate-limit absorption in session flow, logging, and stream continuation.

Sequence Diagram(s)

sequenceDiagram
    participant Startup
    participant Patcher as apply_claude_agent_sdk_patches()
    participant SDK as claude_agent_sdk
    participant AgentSession
    participant TaskLogger

    Startup->>Patcher: apply_claude_agent_sdk_patches()
    Patcher->>SDK: replace parse_message (wrap original)
    Startup->>SDK: continue initialization
    SDK->>AgentSession: deliver parsed message
    alt subtype == rate_limit_event
        SDK->>AgentSession: SystemMessage(rate_limit_event)
        AgentSession->>TaskLogger: log INFO "rate limit reached"
        AgentSession->>Startup: print rate-limit notice (stdout)
        AgentSession-->>SDK: keep stream open (continue)
    else normal message
        SDK->>AgentSession: AssistantMessage / other
        AgentSession->>TaskLogger: normal processing
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

bug, size/M

Poem

🐰 I found a sneaky rate-limit sigh,
I patched the parser, gave it a try.
Unknown events wear a System crown,
Streams stay open — no falling down.
Nibble, log, and hop — onward we fly! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.97% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: handling rate_limit_event messages in claude_agent_sdk to prevent parse errors, which is the core objective of this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 Thanks for your first PR!

A maintainer will review it soon. Please make sure:

  • Your branch is synced with develop
  • CI checks pass
  • You've followed our contribution guide

Welcome to the Auto Claude community!

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a monkey patch to handle rate_limit_event from claude_agent_sdk and prevent parsing errors. The patch seems correct and addresses the issue described.

My main feedback is about code duplication. The monkey-patching function is duplicated in run.py and spec_runner.py, which is a maintenance risk. I've recommended extracting it into a shared utility module.

I also pointed out that the patch function silently swallows errors, which could make debugging difficult in the future. I've suggested adding logging to provide visibility into potential patching failures.

@sentry
Copy link

sentry bot commented Feb 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/run.py`:
- Around line 86-89: The import block is unsorted causing Ruff I001; move the
standard-library import "import logging as _logging" to appear before the
third-party "from claude_agent_sdk..." imports so stdlib imports come first,
then third-party—i.e., reorder the lines so _logging is imported before _mp,
_ic, and _SystemMessage to satisfy Ruff's import sorting.

In `@apps/backend/runners/spec_runner.py`:
- Around line 100-103: Reorder the import block so the stdlib import comes
before third-party imports: move "import logging as _logging" to appear before
the "from claude_agent_sdk._internal import message_parser as _mp", "from
claude_agent_sdk._internal import client as _ic", and "from
claude_agent_sdk.types import SystemMessage as _SystemMessage" lines in
spec_runner.py; ensure the names (_mp, _ic, _SystemMessage, _logging) remain
unchanged and then run Ruff/isort to verify the import ordering rule I001 is
satisfied.
- Around line 87-127: The function _patch_claude_agent_sdk_rate_limit is
duplicated in spec_runner.py and run.py; extract it into a single shared module
(e.g., create apps/backend/core/sdk_patches.py) as a public function named
patch_claude_agent_sdk_rate_limit that contains the same monkey‑patch logic
(imports of claude_agent_sdk._internal.message_parser and client, _SystemMessage
handling, swapping parse_message on _mp and _ic) and make it idempotent (guard
like _mp._rate_limit_patched or an early return). Then remove the duplicate
inline definition from spec_runner.py and run.py and replace each with an import
and call to patch_claude_agent_sdk_rate_limit(). Ensure the shared function
preserves the try/except fail‑safe behavior so startup never breaks if patching
fails.

---

Duplicate comments:
In `@apps/backend/run.py`:
- Around line 73-113: Extract the duplicated _patch_claude_agent_sdk_rate_limit
implementation into a new module (e.g., core.sdk_patches) and replace the copies
in run.py and spec_runner.py with a single import and call to that function;
inside the new module implement an idempotency guard (e.g., a module-level
boolean like _CLAUDE_SDK_RATE_LIMIT_PATCHED) so repeated calls to
_patch_claude_agent_sdk_rate_limit() do nothing after the first patch, and keep
the same behavior of patching
claude_agent_sdk._internal.message_parser.parse_message and
claude_agent_sdk._internal.client.parse_message and returning a
claude_agent_sdk.types.SystemMessage for rate_limit_event while swallowing patch
errors to avoid breaking startup.

- Added runtime patch in sdk_patches.py to override claude_agent_sdk's parse_message
  to return a SystemMessage for "rate_limit_event" instead of raising an error.
- Updated session.py to handle "rate_limit_event" SystemMessage by logging and
  keeping the stream open while waiting for rate limit reset.
- Removed previous inline patch from run.py and moved patch logic to a dedicated module.
- This prevents session failures caused by unrecognized rate_limit_event messages,
  allowing smoother handling of rate limits during agent sessions.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/agents/session.py`:
- Around line 651-661: The failing Ruff format is caused by overly long lines in
the SystemMessage rate_limit_event branch: split the long task_logger.log call
(in the block where msg_type == "SystemMessage" and getattr(msg, "subtype",
None) == "rate_limit_event") so no line exceeds the configured length — e.g.,
place each argument (the message string, LogEntryType.INFO, phase,
print_to_console=False) on its own line or shorten the message string; then run
ruff format apps/backend/ and commit the result.

In `@apps/backend/core/sdk_patches.py`:
- Around line 29-40: The _patched_parse wrapper currently catches all exceptions
from _orig_parse and only then checks for a rate_limit_event, which can hide
unrelated errors; change _patched_parse to first proactively inspect the input
(if isinstance(data, dict) and data.get("type") == "rate_limit_event") and
immediately return _SystemMessage(subtype="rate_limit_event", data=data) with a
logger.warning, otherwise call and return _orig_parse(data) without a broad
try/except so other errors bubble up; keep references to _patched_parse,
_orig_parse, _SystemMessage and logger to locate and update the code.
- Around line 22-44: Add an idempotency guard so the patch isn't applied more
than once: before capturing _orig_parse, check if _mp.parse_message already has
a sentinel (e.g., getattr(_mp.parse_message, "_is_patched_by_us", False)) and
skip re-patching if true; otherwise capture _orig_parse = _mp.parse_message,
create _patched_parse, assign it to both _mp.parse_message and
_ic.parse_message, and then set a sentinel attribute on the patched function
(e.g., _patched_parse._is_patched_by_us = True) and optionally store the
original via __wrapped__ = _orig_parse to preserve access.

…lity

Reformatted the handling of the "rate_limit_event" SystemMessage in the agent session to improve code readability and maintain consistent formatting without changing functionality.
- Added idempotent patch to claude_agent_sdk's parse_message to handle "rate_limit_event" by returning a SystemMessage, preventing stream closure on rate limits.
- Mark patch as applied to avoid duplicate patching.
- Applied this patch automatically in multiple runner entry points (GitHub, GitLab, ideation, insights, roadmap) to ensure consistent behavior across platforms.
…_patches

- Added a new test suite in tests/test_sdk_patches.py covering:
  - Successful patching of parse_message on both internal modules
  - Idempotency guard to prevent double-wrapping
  - Handling of "rate_limit_event" messages returning SystemMessage instances
  - Ensuring unknown message types still raise parse errors
  - Graceful failure when SDK internals are unavailable
- These tests improve reliability and maintainability of SDK patching logic by verifying expected behaviors and edge cases.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_sdk_patches.py`:
- Around line 124-127: The assertion incorrectly checks for "FakeSystemMessage"
(missing leading underscore) so the class-name branch is always false; update
the check to match the actual fake class name _FakeSystemMessage (i.e., replace
"FakeSystemMessage" with "_FakeSystemMessage") or remove the class-name check
and rely on hasattr(result, "subtype") — ensure you update the assertion that
references result's class name and/or the OR condition around result.subtype to
reference the correct symbol _FakeSystemMessage.

In `@tests/test_session_rate_limit.py`:
- Line 138: Remove the redundant local import of MagicMock (the line "from
unittest.mock import MagicMock as MM") since MagicMock is already imported at
module level; delete that duplicate import and ensure references use the
existing MagicMock/MM binding (e.g., existing module-level MagicMock or MM) so
there are no duplicate imports or shadowed names.
- Around line 123-133: The test test_non_rate_limit_system_message_ignored
currently allows "error" as a valid status even though non-rate-limit
SystemMessage should not produce an error; update the assertion so it asserts
status is one of ("continue", "complete") (i.e., exclude "error") to reflect the
intended behavior when creating msg = SystemMessage(subtype="some_other_event",
...) and calling run_agent_session(client, "test prompt", self.spec_dir).
- Around line 96-105: The test test_rate_limit_event_does_not_raise currently
asserts status in ("continue","complete","error") which masks failures; update
the assertion to ensure the session did not surface an error by asserting status
!= "error" (or assert a specific expected status such as "continue"/"complete")
after calling run_agent_session with the
SystemMessage(subtype="rate_limit_event", ...); locate the test function
test_rate_limit_event_does_not_raise and replace the weak membership assertion
with a negative error check (or explicit expected status) so the
rate_limit_event must be absorbed rather than swallowed into an "error" result.
- Around line 33-38: The ResultMessage class is dead code and should be removed
to avoid confusion: delete the ResultMessage class definition (the class named
ResultMessage with attributes subtype, is_error, and result) from
tests/test_session_rate_limit.py and run the test suite to ensure nothing
references it; if any test needs a similar sentinel later, replace usages with a
local fixture or explicit value rather than this unused class.

Added `# pragma: no cover` comments to all imports and calls of `apply_claude_agent_sdk_patches` across backend runner modules to exclude them from test coverage. This clarifies coverage reports by ignoring these patch applications which are environment-specific and not directly testable.

test: clean up test_sdk_patches and test_session_rate_limit

- Removed unused imports and dead code in `test_sdk_patches.py` to address CodeQL findings and fix incorrect class name usage.
- Removed unused `ResultMessage` class and tightened assertions in `test_session_rate_limit.py` to improve test accuracy and prevent false positives.
- Simplified mocking imports by removing redundant aliases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants