fix: handle rate_limit_event in claude_agent_sdk to prevent parse errors#1875
fix: handle rate_limit_event in claude_agent_sdk to prevent parse errors#1875Rex-Arnab wants to merge 7 commits intoAndyMik90:developfrom
Conversation
Added a monkey patch in backend run and spec_runner modules to intercept and handle "rate_limit_event" messages from claude_agent_sdk. The patch converts these events into SystemMessage instances instead of raising MessageParseError, allowing sessions to continue and retry after rate limit resets. Also updated error_utils to detect "rate_limit" error strings.
Summary of ChangesHello @Rex-Arnab, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the application's robustness by gracefully handling rate limit events from the Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an idempotent runtime patch to handle unknown Changes
Sequence Diagram(s)sequenceDiagram
participant Startup
participant Patcher as apply_claude_agent_sdk_patches()
participant SDK as claude_agent_sdk
participant AgentSession
participant TaskLogger
Startup->>Patcher: apply_claude_agent_sdk_patches()
Patcher->>SDK: replace parse_message (wrap original)
Startup->>SDK: continue initialization
SDK->>AgentSession: deliver parsed message
alt subtype == rate_limit_event
SDK->>AgentSession: SystemMessage(rate_limit_event)
AgentSession->>TaskLogger: log INFO "rate limit reached"
AgentSession->>Startup: print rate-limit notice (stdout)
AgentSession-->>SDK: keep stream open (continue)
else normal message
SDK->>AgentSession: AssistantMessage / other
AgentSession->>TaskLogger: normal processing
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🎉 Thanks for your first PR!
A maintainer will review it soon. Please make sure:
- Your branch is synced with
develop - CI checks pass
- You've followed our contribution guide
Welcome to the Auto Claude community!
There was a problem hiding this comment.
Code Review
This pull request introduces a monkey patch to handle rate_limit_event from claude_agent_sdk and prevent parsing errors. The patch seems correct and addresses the issue described.
My main feedback is about code duplication. The monkey-patching function is duplicated in run.py and spec_runner.py, which is a maintenance risk. I've recommended extracting it into a shared utility module.
I also pointed out that the patch function silently swallows errors, which could make debugging difficult in the future. I've suggested adding logging to provide visibility into potential patching failures.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/run.py`:
- Around line 86-89: The import block is unsorted causing Ruff I001; move the
standard-library import "import logging as _logging" to appear before the
third-party "from claude_agent_sdk..." imports so stdlib imports come first,
then third-party—i.e., reorder the lines so _logging is imported before _mp,
_ic, and _SystemMessage to satisfy Ruff's import sorting.
In `@apps/backend/runners/spec_runner.py`:
- Around line 100-103: Reorder the import block so the stdlib import comes
before third-party imports: move "import logging as _logging" to appear before
the "from claude_agent_sdk._internal import message_parser as _mp", "from
claude_agent_sdk._internal import client as _ic", and "from
claude_agent_sdk.types import SystemMessage as _SystemMessage" lines in
spec_runner.py; ensure the names (_mp, _ic, _SystemMessage, _logging) remain
unchanged and then run Ruff/isort to verify the import ordering rule I001 is
satisfied.
- Around line 87-127: The function _patch_claude_agent_sdk_rate_limit is
duplicated in spec_runner.py and run.py; extract it into a single shared module
(e.g., create apps/backend/core/sdk_patches.py) as a public function named
patch_claude_agent_sdk_rate_limit that contains the same monkey‑patch logic
(imports of claude_agent_sdk._internal.message_parser and client, _SystemMessage
handling, swapping parse_message on _mp and _ic) and make it idempotent (guard
like _mp._rate_limit_patched or an early return). Then remove the duplicate
inline definition from spec_runner.py and run.py and replace each with an import
and call to patch_claude_agent_sdk_rate_limit(). Ensure the shared function
preserves the try/except fail‑safe behavior so startup never breaks if patching
fails.
---
Duplicate comments:
In `@apps/backend/run.py`:
- Around line 73-113: Extract the duplicated _patch_claude_agent_sdk_rate_limit
implementation into a new module (e.g., core.sdk_patches) and replace the copies
in run.py and spec_runner.py with a single import and call to that function;
inside the new module implement an idempotency guard (e.g., a module-level
boolean like _CLAUDE_SDK_RATE_LIMIT_PATCHED) so repeated calls to
_patch_claude_agent_sdk_rate_limit() do nothing after the first patch, and keep
the same behavior of patching
claude_agent_sdk._internal.message_parser.parse_message and
claude_agent_sdk._internal.client.parse_message and returning a
claude_agent_sdk.types.SystemMessage for rate_limit_event while swallowing patch
errors to avoid breaking startup.
- Added runtime patch in sdk_patches.py to override claude_agent_sdk's parse_message to return a SystemMessage for "rate_limit_event" instead of raising an error. - Updated session.py to handle "rate_limit_event" SystemMessage by logging and keeping the stream open while waiting for rate limit reset. - Removed previous inline patch from run.py and moved patch logic to a dedicated module. - This prevents session failures caused by unrecognized rate_limit_event messages, allowing smoother handling of rate limits during agent sessions.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/agents/session.py`:
- Around line 651-661: The failing Ruff format is caused by overly long lines in
the SystemMessage rate_limit_event branch: split the long task_logger.log call
(in the block where msg_type == "SystemMessage" and getattr(msg, "subtype",
None) == "rate_limit_event") so no line exceeds the configured length — e.g.,
place each argument (the message string, LogEntryType.INFO, phase,
print_to_console=False) on its own line or shorten the message string; then run
ruff format apps/backend/ and commit the result.
In `@apps/backend/core/sdk_patches.py`:
- Around line 29-40: The _patched_parse wrapper currently catches all exceptions
from _orig_parse and only then checks for a rate_limit_event, which can hide
unrelated errors; change _patched_parse to first proactively inspect the input
(if isinstance(data, dict) and data.get("type") == "rate_limit_event") and
immediately return _SystemMessage(subtype="rate_limit_event", data=data) with a
logger.warning, otherwise call and return _orig_parse(data) without a broad
try/except so other errors bubble up; keep references to _patched_parse,
_orig_parse, _SystemMessage and logger to locate and update the code.
- Around line 22-44: Add an idempotency guard so the patch isn't applied more
than once: before capturing _orig_parse, check if _mp.parse_message already has
a sentinel (e.g., getattr(_mp.parse_message, "_is_patched_by_us", False)) and
skip re-patching if true; otherwise capture _orig_parse = _mp.parse_message,
create _patched_parse, assign it to both _mp.parse_message and
_ic.parse_message, and then set a sentinel attribute on the patched function
(e.g., _patched_parse._is_patched_by_us = True) and optionally store the
original via __wrapped__ = _orig_parse to preserve access.
…lity Reformatted the handling of the "rate_limit_event" SystemMessage in the agent session to improve code readability and maintain consistent formatting without changing functionality.
- Added idempotent patch to claude_agent_sdk's parse_message to handle "rate_limit_event" by returning a SystemMessage, preventing stream closure on rate limits. - Mark patch as applied to avoid duplicate patching. - Applied this patch automatically in multiple runner entry points (GitHub, GitLab, ideation, insights, roadmap) to ensure consistent behavior across platforms.
…_patches - Added a new test suite in tests/test_sdk_patches.py covering: - Successful patching of parse_message on both internal modules - Idempotency guard to prevent double-wrapping - Handling of "rate_limit_event" messages returning SystemMessage instances - Ensuring unknown message types still raise parse errors - Graceful failure when SDK internals are unavailable - These tests improve reliability and maintainability of SDK patching logic by verifying expected behaviors and edge cases.
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_sdk_patches.py`:
- Around line 124-127: The assertion incorrectly checks for "FakeSystemMessage"
(missing leading underscore) so the class-name branch is always false; update
the check to match the actual fake class name _FakeSystemMessage (i.e., replace
"FakeSystemMessage" with "_FakeSystemMessage") or remove the class-name check
and rely on hasattr(result, "subtype") — ensure you update the assertion that
references result's class name and/or the OR condition around result.subtype to
reference the correct symbol _FakeSystemMessage.
In `@tests/test_session_rate_limit.py`:
- Line 138: Remove the redundant local import of MagicMock (the line "from
unittest.mock import MagicMock as MM") since MagicMock is already imported at
module level; delete that duplicate import and ensure references use the
existing MagicMock/MM binding (e.g., existing module-level MagicMock or MM) so
there are no duplicate imports or shadowed names.
- Around line 123-133: The test test_non_rate_limit_system_message_ignored
currently allows "error" as a valid status even though non-rate-limit
SystemMessage should not produce an error; update the assertion so it asserts
status is one of ("continue", "complete") (i.e., exclude "error") to reflect the
intended behavior when creating msg = SystemMessage(subtype="some_other_event",
...) and calling run_agent_session(client, "test prompt", self.spec_dir).
- Around line 96-105: The test test_rate_limit_event_does_not_raise currently
asserts status in ("continue","complete","error") which masks failures; update
the assertion to ensure the session did not surface an error by asserting status
!= "error" (or assert a specific expected status such as "continue"/"complete")
after calling run_agent_session with the
SystemMessage(subtype="rate_limit_event", ...); locate the test function
test_rate_limit_event_does_not_raise and replace the weak membership assertion
with a negative error check (or explicit expected status) so the
rate_limit_event must be absorbed rather than swallowed into an "error" result.
- Around line 33-38: The ResultMessage class is dead code and should be removed
to avoid confusion: delete the ResultMessage class definition (the class named
ResultMessage with attributes subtype, is_error, and result) from
tests/test_session_rate_limit.py and run the test suite to ensure nothing
references it; if any test needs a similar sentinel later, replace usages with a
local fixture or explicit value rather than this unused class.
Added `# pragma: no cover` comments to all imports and calls of `apply_claude_agent_sdk_patches` across backend runner modules to exclude them from test coverage. This clarifies coverage reports by ignoring these patch applications which are environment-specific and not directly testable. test: clean up test_sdk_patches and test_session_rate_limit - Removed unused imports and dead code in `test_sdk_patches.py` to address CodeQL findings and fix incorrect class name usage. - Removed unused `ResultMessage` class and tightened assertions in `test_session_rate_limit.py` to improve test accuracy and prevent false positives. - Simplified mocking imports by removing redundant aliases.
Added a monkey patch in backend run and spec_runner modules to intercept
and handle "rate_limit_event" messages from claude_agent_sdk. The patch
converts these events into SystemMessage instances instead of raising
MessageParseError, allowing sessions to continue and retry after rate
limit resets. Also updated error_utils to detect "rate_limit" error strings.
Summary by CodeRabbit
Bug Fixes
Chores
Tests