Skip to content

fix: bound per-client rate limiter storage with LRU eviction#4057

Open
strawgate wants to merge 4 commits into
mainfrom
codex/rate-limiting-fix
Open

fix: bound per-client rate limiter storage with LRU eviction#4057
strawgate wants to merge 4 commits into
mainfrom
codex/rate-limiting-fix

Conversation

@strawgate
Copy link
Copy Markdown
Collaborator

@strawgate strawgate commented Apr 26, 2026

RateLimitingMiddleware and SlidingWindowRateLimitingMiddleware stored per-client limiters in an unbounded defaultdict keyed by client id. Every distinct client id created a permanent entry that was never evicted, so a long-running server facing many transient clients (or attacker-supplied ids) leaked memory without bound.

This replaces the defaultdict with an OrderedDict used as an LRU cache, capped by a configurable max_clients (default 10,000). Accessing a client moves it to the most-recently-used end; inserting past the cap evicts the least-recently-used entry. Eviction is safe because an evicted client was idle long enough for max_clients other clients to be more recent — by which point its token bucket would have refilled (or its sliding window expired) anyway, so it is indistinguishable from a fresh client. max_clients < 1 is rejected at construction. The same fix is applied to both middlewares. Note the internal storage attribute is renamed limiters_client_limiters to reflect that it is no longer a public auto-vivifying dict.

RateLimitingMiddleware(max_requests_per_second=10, max_clients=10_000)
# memory is now bounded; LRU clients are evicted instead of accumulating

Refs #4053 — fixes Bug 1 (rate-limiter dicts never evicted). Bugs 2 and 3 (FilesystemProvider lazy-lock race, unbounded _warned_files) are out of scope here.

@strawgate strawgate added the DON'T MERGE PR is not ready for merging. Used by authors to prevent premature merging. label Apr 26, 2026
@strawgate
Copy link
Copy Markdown
Collaborator Author

surely we can find a non-hand rolled ttl dict

@marvin-context-protocol marvin-context-protocol Bot added bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. server Related to FastMCP server implementation or server-side functionality. labels Apr 26, 2026
chatgpt-codex-connector[bot]

This comment was marked as resolved.

@marvin-context-protocol
Copy link
Copy Markdown
Contributor

marvin-context-protocol Bot commented Apr 26, 2026

Edited to reflect the latest analysis.

tl;dr: ruff format (via the static_analysis / prek hook) reformatted tests/server/middleware/test_rate_limiting.py. Run uv run ruff format . locally and push the result.

The ruff-format hook collapsed four RateLimitingMiddleware(get_client_id=lambda ctx: ctx, max_clients=N) calls in TestRateLimiterLRUEviction that were wrapped across three lines but fit on one. Every other hook (ruff check, ty, loq, codespell, prettier, validate-pyproject) passed; the only failure was the formatter rewriting that single file.

Log excerpt — job 76372877049
ruff format..............................................................Failed
- hook id: ruff-format
- files were modified by this hook

  1 file reformatted, 762 files left unchanged
...
diff --git a/tests/server/middleware/test_rate_limiting.py b/tests/server/middleware/test_rate_limiting.py
@@ -274,9 +274,7 @@ class TestRateLimiterLRUEviction:
     def test_token_bucket_evicts_lru_when_full(self):
-        mw = RateLimitingMiddleware(
-            get_client_id=lambda ctx: ctx, max_clients=2
-        )
+        mw = RateLimitingMiddleware(get_client_id=lambda ctx: ctx, max_clients=2)

(Three additional identical hunks at lines 282, 296, and 331 for max_clients=2/5/1.)

##[error]prek exited with code 1

@strawgate strawgate force-pushed the codex/rate-limiting-fix branch from e5c8f0f to a6dde66 Compare April 26, 2026 06:16
chatgpt-codex-connector[bot]

This comment was marked as resolved.

@strawgate strawgate force-pushed the codex/rate-limiting-fix branch from a6dde66 to 30869a2 Compare April 26, 2026 15:13
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 30869a2e25

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread fastmcp_slim/fastmcp/server/middleware/rate_limiting.py
Comment thread fastmcp_slim/fastmcp/server/middleware/rate_limiting.py
@strawgate strawgate force-pushed the codex/rate-limiting-fix branch from 30869a2 to d280a50 Compare May 13, 2026 03:40
strawgate and others added 2 commits May 17, 2026 00:19
🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 <[email protected]>
🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@strawgate strawgate changed the title fix: rate limiting token bucket timing and dict memory leak fix: bound per-client rate limiter storage with LRU eviction May 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. DON'T MERGE PR is not ready for merging. Used by authors to prevent premature merging. server Related to FastMCP server implementation or server-side functionality.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant