fix: set idleTimeout to 255 (Bun max) #659

nathanclevenger · 2025-11-04T10:59:40Z

Problem

The proxy was failing with:

TypeError [ERR_INVALID_ARG_TYPE]: Bun.serve expects idleTimeout to be 255 or less

PR #12 set idleTimeout: 300 (5 minutes), but Bun requires it to be ≤255 seconds.

Solution

Changed idleTimeout from 300 to 255 seconds (4.25 minutes).

This is still sufficient for long-running Claude API requests while staying within Bun's limit.

Testing

Confirms idleTimeout is now within Bun's allowed range
Maintains adequate timeout for Claude API requests

- Changed condition from always() to !cancelled() - Prevents 'Claude encountered an error' message when workflow is cancelled - Still posts error messages on actual failures - Solves issue where concurrent workflow runs trigger error messages unnecessarily

- Add 'assigned' to validActions in validateTrackProgressEvent() - Add 'assigned' to supportedActions for PR event detection - Add tests for assigned action with track_progress - Enables automation workflows that trigger on PR assignment Fixes: platform workflow failures when auto-assigning PRs after CHANGES_REQUESTED reviews

- Update validation to accept AWS_BEARER_TOKEN_BEDROCK as alternative to full AWS credentials - AWS_REGION still required - Supports both authentication methods: - Bearer token (simpler): AWS_BEARER_TOKEN_BEDROCK - IAM credentials: AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY - Aligns with Claude Code CLI documentation References: https://docs.claude.com/en/docs/claude-code/amazon-bedrock

Implements automatic retry with exponential backoff when Claude API returns 429 rate limit errors: - Detects rate limit errors in CLI output (429, 'Too many requests', 'rate limit', 'throttl') - Throws retriable error instead of hard failing - Adds runClaudeWithRetry wrapper with exponential backoff - Default: 5 attempts with 1min → 2min → 4min → 8min → 16min delays - Only retries on rate limit errors, fails fast on other errors - Updates main entrypoint to use runClaudeWithRetry by default This prevents complete workflow failures when hitting transient API rate limits, improving reliability of automated code reviews and other Claude-powered workflows. Resolves: dot-do/platform#7400 (comment)

Strategy to minimize costs while handling rate limits: 1. First attempt: Use AWS Bedrock (we have credits, prefer this) 2. If rate limited: Switch to Anthropic API immediately (costs money but available) 3. If still rate limited: Retry with Anthropic using exponential backoff Key changes: - Detect rate limit on first Bedrock attempt - Switch environment from Bedrock to Anthropic API - No delay when switching providers (immediate retry) - Clear logging shows which provider is being used - TODO added noting AWS rate limit increase has been requested This prevents long retry loops with Bedrock when we can immediately fallback to Anthropic API as a secondary provider.

…te-limit

@user

When a code review is submitted via `gh pr review`, delete the initial "Claude Code is working..." comment instead of updating it to show "Claude finished @user's task". This reduces noise in PRs with multiple review iterations. The logic: 1. Check if this is a PR (not an issue) 2. Check if the bot submitted a review in the last 5 minutes 3. If yes, delete the progress comment instead of updating it 4. If deletion fails, fall back to normal update behavior This ensures that only the review itself is visible, not both the review and a separate completion comment. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Addresses code review feedback: - Extract review detection logic to testable helper function - Add type guard for review.submitted_at to prevent undefined dates - Return structured result with review ID and timestamp for better logging - Make time window explicit (5 minutes) with clear parameter This improves: - Testability: Helper function can be unit tested independently - Type safety: Explicit check for submitted_at before date conversion - Clarity: Structured return type makes intent clear - Maintainability: All review detection logic in one place 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

…itted

Fixes crash when pullRequest.files is null (occurs on large PRs >100 files or when GitHub API returns null for files field). ## Problem The code was accessing `pullRequest.files.nodes` without checking if `files` was null first, causing: ``` TypeError: null is not an object (evaluating 'pullRequest.files.nodes') ``` This happens on PRs with >100 changed files where the GraphQL query `files(first: 100)` returns null. ## Solution - Use optional chaining: `pullRequest.files?.nodes || []` - Add warning log when files data is null - Gracefully fallback to empty array for changedFiles ## Impact - Prevents crashes on large PRs - Reviews can now proceed even without full file list - User gets clear warning about missing files data Co-Authored-By: Claude <[email protected]>

…ilover ## Problem The old code would stick with Anthropic after the first 429, never trying Bedrock again on subsequent attempts. This wastes Anthropic credits when Bedrock might work on the next try. ## Correct Behavior **Each attempt:** 1. ALWAYS try AWS Bedrock first (maximize free credits) 2. If Bedrock returns 429 → IMMEDIATELY try Anthropic (no delay) 3. If Anthropic succeeds → done 4. If both fail → next attempt starts over with Bedrock **Example flow:** - Attempt 1: Bedrock (429) → Anthropic (success) ✅ - Attempt 2: Bedrock (429) → Anthropic (success) ✅ - Attempt 3: Bedrock (429) → Anthropic (success) ✅ Every attempt gives Bedrock a chance to work (in case 429 was temporary), but fails over to Anthropic immediately with zero delay. ## Key Changes - **Reset to Bedrock** at start of each attempt - **Zero delays** anywhere - **Immediate failover** on any 429 - Removed exponential backoff parameters (unused) ## Impact - 💰 Maximizes AWS Bedrock credit usage (tries every time) - ⚡ Zero delays on failover (immediate) - 💰 Only uses Anthropic credits when Bedrock unavailable - 🚀 Fast retry cycles Co-Authored-By: Claude <[email protected]>

Replaces complex retry logic with single execution through claude-lb worker proxy. The proxy handles Bedrock-first routing with immediate HTTP-level Anthropic failover. Key changes: - Set ANTHROPIC_BASE_URL to route through claude-lb.dotdo.workers.dev - Simplify runClaudeWithRetry to single execution (no retry loop) - Remove CLI-level retry overhead (~177s MCP initialization per retry) - Proxy handles failover at HTTP level (milliseconds, not minutes) Benefits: - Single Claude CLI process (no re-initialization) - HTTP-level failover (<1s vs 3+ minutes) - Unified AI Gateway tracking for both providers - Pass-through authentication (no secrets in worker) - Bedrock-first routing maintained 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Implements local HTTP proxy server that intercepts Claude CLI requests and adds authentication headers before forwarding to claude-lb worker. Architecture: ``` Claude CLI ↓ ANTHROPIC_BASE_URL=http://127.0.0.1:18765 ↓ Makes request Local Proxy Server (this action) ↓ Adds headers: ↓ x-bedrock-token: $AWS_BEARER_TOKEN_BEDROCK ↓ x-anthropic-key: $ANTHROPIC_API_KEY ↓ Forwards to https://claude-lb.dotdo.workers.dev claude-lb Worker (NO secrets!) ├─ Try Bedrock with x-bedrock-token └─ On 429, instant Anthropic failover with x-anthropic-key ↓ Returns response Local Proxy → Claude CLI ``` Security Benefits: - 🔒 Zero secrets in worker - 🔒 Credentials only in GitHub Secrets - 🔒 Fresh credentials per request - ⚡ <1 second failover - 💪 Conversation state preserved Files Changed: - base-action/src/proxy-server.ts (new) - Local HTTP proxy - base-action/src/index.ts - Start proxy and route through it 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

- Supports 3 modes: bedrock-anthropic, anthropic-only, direct - Auto-detects mode based on available credentials - Adds direct Anthropic API fallback if proxy fails - Includes stats/health endpoint for monitoring - Maintains secure pass-through auth (no secrets in worker) - Enables centralized observability via Cloudflare AI Gateway Fixes: claude-lb proxy now works without Bedrock credentials

This fixes a control flow bug where the /health endpoint was unreachable because it came AFTER the 404 response for non-/v1/messages URLs. When a request came in for /health: 1. Line 171 checked: req.url !== '/v1/messages' (true for /health) 2. Line 172 returned 404 immediately 3. Lines 164-168 (health check) were never reached (dead code) Solution: Move /health endpoint check BEFORE the 404 response, making it the first check in the request handler. Fixes automated Claude Code reviews returning 'API Error: 404 Not Found'

Co-authored-by: Claude <[email protected]>

Bun.serve requires idleTimeout to be 255 seconds or less. The previous value of 300 caused: TypeError [ERR_INVALID_ARG_TYPE]: Bun.serve expects idleTimeout to be 255 or less This changes the timeout from 5 minutes (300s) to 4.25 minutes (255s), which is still sufficient for long-running Claude API requests.

Jarred-Sumner · 2025-11-04T12:21:21Z

base-action/src/proxy-server.ts

+  const server = Bun.serve({
+    port: PROXY_PORT,
+    hostname: '127.0.0.1',
+    idleTimeout: 255, // Max allowed by Bun (4.25 minutes) for long-running Claude API requests


idleTimeout: 0

will make it unlimited

Claude Code and others added 26 commits October 14, 2025 12:21

chore: sync with upstream anthropics/claude-code-action

04a053b

merge: integrate track_progress support for pull_request.assigned events

30e8f60

Merge pull request #1 from dot-do/feature/retry-on-rate-limit

07cf46a

Merge pull request #2 from dot-do/feature/fallback-to-anthropic-on-ra…

13f3147

…te-limit

Merge pull request #3 from dot-do/fix/delete-comment-when-review-subm…

d4042c6

…itted

Merge pull request #5 from dot-do/fix/reduce-anthropic-retry-delay

690c9c2

Merge pull request #6 from dot-do/fix/early-429-detection

c566a2c

Merge pull request #4 from dot-do/fix/handle-null-files-pagination

dd4d00c

Merge pull request #8 from dot-do/feature/proxy-auth-headers

d3d06c1

fix: move /health endpoint check before 404 response (#10)

af6557d

Co-authored-by: Claude <[email protected]>

fix: add idleTimeout and bedrock-only proxy mode (#11)

0cabd93

refactor: simplify proxy server - require both API keys (#12)

a08b95f

Co-authored-by: Claude <[email protected]>

Jarred-Sumner reviewed Nov 4, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: set idleTimeout to 255 (Bun max) #659

fix: set idleTimeout to 255 (Bun max) #659

Uh oh!

nathanclevenger commented Nov 4, 2025

Uh oh!

Jarred-Sumner Nov 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: set idleTimeout to 255 (Bun max) #659

Are you sure you want to change the base?

fix: set idleTimeout to 255 (Bun max) #659

Uh oh!

Conversation

nathanclevenger commented Nov 4, 2025

Problem

Solution

Testing

Uh oh!

Jarred-Sumner Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants