-
Notifications
You must be signed in to change notification settings - Fork 1.4k
fix: set idleTimeout to 255 (Bun max) #659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
nathanclevenger
wants to merge
26
commits into
anthropics:main
Choose a base branch
from
dot-do:fix/idle-timeout-max-255
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- Changed condition from always() to !cancelled() - Prevents 'Claude encountered an error' message when workflow is cancelled - Still posts error messages on actual failures - Solves issue where concurrent workflow runs trigger error messages unnecessarily
- Add 'assigned' to validActions in validateTrackProgressEvent() - Add 'assigned' to supportedActions for PR event detection - Add tests for assigned action with track_progress - Enables automation workflows that trigger on PR assignment Fixes: platform workflow failures when auto-assigning PRs after CHANGES_REQUESTED reviews
- Update validation to accept AWS_BEARER_TOKEN_BEDROCK as alternative to full AWS credentials - AWS_REGION still required - Supports both authentication methods: - Bearer token (simpler): AWS_BEARER_TOKEN_BEDROCK - IAM credentials: AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY - Aligns with Claude Code CLI documentation References: https://docs.claude.com/en/docs/claude-code/amazon-bedrock
Implements automatic retry with exponential backoff when Claude API returns 429 rate limit errors: - Detects rate limit errors in CLI output (429, 'Too many requests', 'rate limit', 'throttl') - Throws retriable error instead of hard failing - Adds runClaudeWithRetry wrapper with exponential backoff - Default: 5 attempts with 1min → 2min → 4min → 8min → 16min delays - Only retries on rate limit errors, fails fast on other errors - Updates main entrypoint to use runClaudeWithRetry by default This prevents complete workflow failures when hitting transient API rate limits, improving reliability of automated code reviews and other Claude-powered workflows. Resolves: dot-do/platform#7400 (comment)
Strategy to minimize costs while handling rate limits: 1. First attempt: Use AWS Bedrock (we have credits, prefer this) 2. If rate limited: Switch to Anthropic API immediately (costs money but available) 3. If still rate limited: Retry with Anthropic using exponential backoff Key changes: - Detect rate limit on first Bedrock attempt - Switch environment from Bedrock to Anthropic API - No delay when switching providers (immediate retry) - Clear logging shows which provider is being used - TODO added noting AWS rate limit increase has been requested This prevents long retry loops with Bedrock when we can immediately fallback to Anthropic API as a secondary provider.
When a code review is submitted via `gh pr review`, delete the initial "Claude Code is working..." comment instead of updating it to show "Claude finished @user's task". This reduces noise in PRs with multiple review iterations. The logic: 1. Check if this is a PR (not an issue) 2. Check if the bot submitted a review in the last 5 minutes 3. If yes, delete the progress comment instead of updating it 4. If deletion fails, fall back to normal update behavior This ensures that only the review itself is visible, not both the review and a separate completion comment. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Addresses code review feedback: - Extract review detection logic to testable helper function - Add type guard for review.submitted_at to prevent undefined dates - Return structured result with review ID and timestamp for better logging - Make time window explicit (5 minutes) with clear parameter This improves: - Testability: Helper function can be unit tested independently - Type safety: Explicit check for submitted_at before date conversion - Clarity: Structured return type makes intent clear - Maintainability: All review detection logic in one place 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Fixes crash when pullRequest.files is null (occurs on large PRs >100 files or when GitHub API returns null for files field). ## Problem The code was accessing `pullRequest.files.nodes` without checking if `files` was null first, causing: ``` TypeError: null is not an object (evaluating 'pullRequest.files.nodes') ``` This happens on PRs with >100 changed files where the GraphQL query `files(first: 100)` returns null. ## Solution - Use optional chaining: `pullRequest.files?.nodes || []` - Add warning log when files data is null - Gracefully fallback to empty array for changedFiles ## Impact - Prevents crashes on large PRs - Reviews can now proceed even without full file list - User gets clear warning about missing files data Co-Authored-By: Claude <[email protected]>
…ilover ## Problem The old code would stick with Anthropic after the first 429, never trying Bedrock again on subsequent attempts. This wastes Anthropic credits when Bedrock might work on the next try. ## Correct Behavior **Each attempt:** 1. ALWAYS try AWS Bedrock first (maximize free credits) 2. If Bedrock returns 429 → IMMEDIATELY try Anthropic (no delay) 3. If Anthropic succeeds → done 4. If both fail → next attempt starts over with Bedrock **Example flow:** - Attempt 1: Bedrock (429) → Anthropic (success) ✅ - Attempt 2: Bedrock (429) → Anthropic (success) ✅ - Attempt 3: Bedrock (429) → Anthropic (success) ✅ Every attempt gives Bedrock a chance to work (in case 429 was temporary), but fails over to Anthropic immediately with zero delay. ## Key Changes - **Reset to Bedrock** at start of each attempt - **Zero delays** anywhere - **Immediate failover** on any 429 - Removed exponential backoff parameters (unused) ## Impact - 💰 Maximizes AWS Bedrock credit usage (tries every time) - ⚡ Zero delays on failover (immediate) - 💰 Only uses Anthropic credits when Bedrock unavailable - 🚀 Fast retry cycles Co-Authored-By: Claude <[email protected]>
Replaces complex retry logic with single execution through claude-lb worker proxy. The proxy handles Bedrock-first routing with immediate HTTP-level Anthropic failover. Key changes: - Set ANTHROPIC_BASE_URL to route through claude-lb.dotdo.workers.dev - Simplify runClaudeWithRetry to single execution (no retry loop) - Remove CLI-level retry overhead (~177s MCP initialization per retry) - Proxy handles failover at HTTP level (milliseconds, not minutes) Benefits: - Single Claude CLI process (no re-initialization) - HTTP-level failover (<1s vs 3+ minutes) - Unified AI Gateway tracking for both providers - Pass-through authentication (no secrets in worker) - Bedrock-first routing maintained 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Implements local HTTP proxy server that intercepts Claude CLI requests and adds authentication headers before forwarding to claude-lb worker. Architecture: ``` Claude CLI ↓ ANTHROPIC_BASE_URL=http://127.0.0.1:18765 ↓ Makes request Local Proxy Server (this action) ↓ Adds headers: ↓ x-bedrock-token: $AWS_BEARER_TOKEN_BEDROCK ↓ x-anthropic-key: $ANTHROPIC_API_KEY ↓ Forwards to https://claude-lb.dotdo.workers.dev claude-lb Worker (NO secrets!) ├─ Try Bedrock with x-bedrock-token └─ On 429, instant Anthropic failover with x-anthropic-key ↓ Returns response Local Proxy → Claude CLI ``` Security Benefits: - 🔒 Zero secrets in worker - 🔒 Credentials only in GitHub Secrets - 🔒 Fresh credentials per request - ⚡ <1 second failover - 💪 Conversation state preserved Files Changed: - base-action/src/proxy-server.ts (new) - Local HTTP proxy - base-action/src/index.ts - Start proxy and route through it 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Supports 3 modes: bedrock-anthropic, anthropic-only, direct - Auto-detects mode based on available credentials - Adds direct Anthropic API fallback if proxy fails - Includes stats/health endpoint for monitoring - Maintains secure pass-through auth (no secrets in worker) - Enables centralized observability via Cloudflare AI Gateway Fixes: claude-lb proxy now works without Bedrock credentials
This fixes a control flow bug where the /health endpoint was unreachable because it came AFTER the 404 response for non-/v1/messages URLs. When a request came in for /health: 1. Line 171 checked: req.url !== '/v1/messages' (true for /health) 2. Line 172 returned 404 immediately 3. Lines 164-168 (health check) were never reached (dead code) Solution: Move /health endpoint check BEFORE the 404 response, making it the first check in the request handler. Fixes automated Claude Code reviews returning 'API Error: 404 Not Found'
Co-authored-by: Claude <[email protected]>
Co-authored-by: Claude <[email protected]>
Bun.serve requires idleTimeout to be 255 seconds or less. The previous value of 300 caused: TypeError [ERR_INVALID_ARG_TYPE]: Bun.serve expects idleTimeout to be 255 or less This changes the timeout from 5 minutes (300s) to 4.25 minutes (255s), which is still sufficient for long-running Claude API requests.
| const server = Bun.serve({ | ||
| port: PROXY_PORT, | ||
| hostname: '127.0.0.1', | ||
| idleTimeout: 255, // Max allowed by Bun (4.25 minutes) for long-running Claude API requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idleTimeout: 0
will make it unlimited
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
The proxy was failing with:
PR #12 set
idleTimeout: 300(5 minutes), but Bun requires it to be ≤255 seconds.Solution
Changed
idleTimeoutfrom 300 to 255 seconds (4.25 minutes).This is still sufficient for long-running Claude API requests while staying within Bun's limit.
Testing