Bug
The refill() method in src/services/rate-limiter.ts calculates token replenishment using the full refillIntervalMs (60s) as the divisor instead of the per-token interval (refillIntervalMs / refillRate = 7.5s).
This means after the initial burst of 8 tokens is consumed, the next token only becomes available after 60s — but the queue timeout is 30s. Queued requests always time out before a token can be refilled.
Root cause
// Current (broken): tokens refill all-at-once after full interval
const intervalsElapsed = Math.floor(elapsed / this.refillIntervalMs);
const tokensToAdd = intervalsElapsed * this.refillRate;
Should be:
// Fixed: tokens refill individually at evenly spaced intervals
const msPerToken = this.refillIntervalMs / this.refillRate;
const tokensToAdd = Math.floor(elapsed / msPerToken);
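As a rough illustration of the difference, the two calculations can be compared side by side. This is a minimal sketch, not the actual code in src/services/rate-limiter.ts: the BucketState shape, the standalone refill() signature, the maxTokens parameter, and brokenTokensToAdd() are hypothetical names introduced for the example, and the defaults (8 tokens, 60s interval) are taken from the numbers in this report.

```typescript
// Hypothetical state shape; the real limiter presumably keeps these as class fields.
interface BucketState {
  tokens: number;
  lastRefill: number; // ms timestamp up to which tokens have been credited
}

// Fixed calculation: one token every refillIntervalMs / refillRate milliseconds.
function refill(
  state: BucketState,
  now: number,
  maxTokens = 8,
  refillRate = 8,
  refillIntervalMs = 60_000,
): BucketState {
  const msPerToken = refillIntervalMs / refillRate; // 7500 ms with the defaults
  const elapsed = now - state.lastRefill;
  const tokensToAdd = Math.floor(elapsed / msPerToken);
  if (tokensToAdd <= 0) return state;
  return {
    tokens: Math.min(maxTokens, state.tokens + tokensToAdd),
    // Advance by whole token intervals only, so partial progress
    // toward the next token is not thrown away.
    lastRefill: state.lastRefill + tokensToAdd * msPerToken,
  };
}

// Old calculation, kept only for comparison: nothing is added until a full interval passes.
function brokenTokensToAdd(
  elapsed: number,
  refillRate = 8,
  refillIntervalMs = 60_000,
): number {
  return Math.floor(elapsed / refillIntervalMs) * refillRate;
}
```

With an empty bucket and 30s elapsed (the queue timeout), brokenTokensToAdd(30_000) is 0, while refill({ tokens: 0, lastRefill: 0 }, 30_000) credits 4 tokens.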
Additionally, queued requests only schedule a single setTimeout for refill checking. If multiple requests are queued, only the first one triggers a check — the rest are never processed.
Impact
With the default config (8 tokens/min, 30s queue timeout), any request that arrives after the initial 8-request burst hangs for 30s and then fails with "Rate limit queue timeout".
Suggested fix
- Change refill() to calculate msPerToken = refillIntervalMs / refillRate and use that for both token calculation and lastRefill update
- Replace the one-shot setTimeout with a setInterval-based scheduleRefill() that runs while requests are queued
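One possible shape for the two suggested changes together is sketched below. It is an assumption-laden illustration rather than the project's implementation: the class name TokenBucketSketch, the Waiter type, the acquire(), drain(), and stopRefill() methods, and the pending getter are all hypothetical. Only refill(), scheduleRefill(), msPerToken, lastRefill, and the "Rate limit queue timeout" message come from this report.

```typescript
// Hypothetical queue entry: resolves the caller or rejects on timeout.
type Waiter = { resolve: () => void; timeout: ReturnType<typeof setTimeout> };

class TokenBucketSketch {
  private tokens: number;
  private lastRefill: number;
  private queue: Waiter[] = [];
  private refillTimer: ReturnType<typeof setInterval> | null = null;
  private readonly msPerToken: number;
  private readonly maxTokens: number;
  private readonly queueTimeoutMs: number;

  constructor(maxTokens = 8, refillRate = 8, refillIntervalMs = 60_000, queueTimeoutMs = 30_000) {
    this.maxTokens = maxTokens;
    this.queueTimeoutMs = queueTimeoutMs;
    this.msPerToken = refillIntervalMs / refillRate; // 7500 ms with the defaults
    this.tokens = maxTokens;
    this.lastRefill = Date.now();
  }

  get pending(): number {
    return this.queue.length;
  }

  acquire(): Promise<void> {
    this.refill();
    // Only take a token directly if nobody is already waiting (keeps FIFO order).
    if (this.tokens > 0 && this.queue.length === 0) {
      this.tokens--;
      return Promise.resolve();
    }
    return new Promise<void>((resolve, reject) => {
      const waiter: Waiter = {
        resolve,
        timeout: setTimeout(() => {
          this.queue = this.queue.filter((w) => w !== waiter);
          if (this.queue.length === 0) this.stopRefill();
          reject(new Error("Rate limit queue timeout"));
        }, this.queueTimeoutMs),
      };
      this.queue.push(waiter);
      this.scheduleRefill();
    });
  }

  private refill(): void {
    const tokensToAdd = Math.floor((Date.now() - this.lastRefill) / this.msPerToken);
    if (tokensToAdd <= 0) return;
    this.tokens = Math.min(this.maxTokens, this.tokens + tokensToAdd);
    this.lastRefill += tokensToAdd * this.msPerToken; // keep the fractional remainder
  }

  // Hand refilled tokens to every waiter that can be served, not just the first.
  private drain(): void {
    this.refill();
    while (this.tokens > 0 && this.queue.length > 0) {
      const waiter = this.queue.shift()!;
      clearTimeout(waiter.timeout);
      this.tokens--;
      waiter.resolve();
    }
    if (this.queue.length === 0) this.stopRefill();
  }

  // A single interval runs for as long as any request is queued.
  private scheduleRefill(): void {
    if (this.refillTimer !== null) return;
    this.refillTimer = setInterval(() => this.drain(), this.msPerToken);
  }

  private stopRefill(): void {
    if (this.refillTimer === null) return;
    clearInterval(this.refillTimer);
    this.refillTimer = null;
  }
}
```

The interval is started when the first request is queued and cleared once the queue is empty, so no timer is left running while the limiter is idle.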