Skip to content

feat: MCP tools rate limiting#2973

Open
joeyorlando wants to merge 6 commits intomainfrom
jorlando/mcp-tools-rate-limitng
Open

feat: MCP tools rate limiting#2973
joeyorlando wants to merge 6 commits intomainfrom
jorlando/mcp-tools-rate-limitng

Conversation

@joeyorlando
Copy link
Contributor

@joeyorlando joeyorlando commented Feb 23, 2026

Summary

  • Add MCP rate limiting enforcement to the MCP Gateway, checking limits before tool execution
  • New /mcp-rate-limits page under "MCP & Tools" sidebar for managing per-server and per-tool rate limits with configurable time windows
  • Sliding window rate limiter reuses existing isRateLimited infrastructure from CacheManager
  • Rate-limited calls return helpful MCP tool error results with limit info and retry timing

Changes

  • Database: Added windowMs column to limits table for per-limit custom time windows
  • Backend: New checkMcpRateLimits() service integrated into mcp-client.ts before tool execution; checks agent → team → organization limits in priority order
  • Frontend: New MCP Rate Limits page with CRUD, usage progress bars, window presets (1 min to 1 month); removed "Coming Soon" MCP section from /cost/limits
  • Metrics: mcp_rate_limit_rejections_total Prometheus counter with agent/server/tool labels
  • Grafana: New "Rate Limits" row in MCP Monitoring dashboard with rejection panels by server and tool
  • Docs: New "MCP Rate Limits" section in MCP Gateway docs; updated observability docs with new metric

@London-Cat
Copy link
Collaborator

London-Cat commented Feb 23, 2026

📊 Reputation Summary

User Rep Pull Requests Activity Assigned Core Reactions
joeyorlando ⚡ 2278 90✅ 6🔄 4❌ 100 issues, 50 comments 4

How is the score calculated? Read about it in the Reputation Bot repository 🤖

@joeyorlando joeyorlando marked this pull request as draft February 23, 2026 21:17
@claude
Copy link
Contributor

claude bot commented Feb 23, 2026

Claude finished @joeyorlando's task in 3m 21s —— View job


PR Review: MCP Tools Rate Limiting (#2973)

  • Read repository guidelines (CLAUDE.md)
  • Review database schema & migrations
  • Review backend rate limiting logic
  • Review backend routes & models
  • Review frontend changes
  • Review e2e tests
  • Review shared types & access control
  • Review metrics & observability
  • Review changes to existing limits model
  • Compile and post review

Overview

This PR adds MCP rate limiting with a clean full-stack implementation: new mcp_rate_limits table, a sliding window rate limiter reusing the existing isRateLimited infrastructure, CRUD API routes, a frontend management page, Prometheus metrics, and Grafana dashboards. The architecture cleanly separates MCP rate limits from existing LLM token cost limits. Nice work on the overall design.


Issues Found

1. Bug: createAgent fixture signature mismatch causes 47 test failures

Severity: High — This is the root cause of all 47 e2e test failures.

In platform/e2e-tests/tests/api/fixtures.ts:108-122, the createAgent fixture now requires a scope parameter:

const createAgent = async (
  request: APIRequestContext,
  name: string,
  scope: "personal" | "team" | "org",  // NEW required param
) => ...

But the MCP rate limit e2e tests in platform/e2e-tests/tests/api/mcp-gateway-rate-limits.spec.ts:34 call it without scope:

const createResponse = await createAgent(
  request,
  `MCP Rate Limit Test ${uniqueSuffix}`,
  // missing 'scope' argument
);

The token-cost-limits spec also appears to be affected (32 of the 47 failures). This was likely a merge conflict from the base where scope was added. The fix is straightforward: add "org" as the third argument to all createAgent calls in the new test files.

Fix this →

2. Bug: No authorization scoping on MCP rate limits API

Severity: High — Security concern.

The GET /api/mcp-rate-limits endpoint (platform/backend/src/routes/mcp-rate-limits.ts:34) returns all MCP rate limits across the entire system without filtering by organizationId. Similarly, POST, PATCH, and DELETE operations don't verify that the target limit belongs to the requesting user's organization.

For example, findAll in platform/backend/src/models/mcp-rate-limit.ts:27-46 queries without any organization scope:

static async findAll(filters?: {
  agentId?: string;
  limitType?: McpRateLimitType;
}) {
  // No organizationId filter
  return db.select().from(schema.mcpRateLimitsTable).where(...);
}

In a multi-tenant deployment, one organization could see and modify another organization's rate limits. The existing limits route (platform/backend/src/routes/limits.ts) has this same pattern but the limits table uses entityId tied to teams/orgs which provides implicit scoping. The new mcp_rate_limits table is only scoped by agentId — but agents already belong to organizations, so the fix would be to join through the agents table to filter by organization, or verify the agent belongs to the user's organization on mutation routes.

Fix this →

3. Missing: No unique constraint preventing duplicate rate limits

Severity: Medium

The mcp_rate_limits table in platform/backend/src/database/schemas/mcp-rate-limit.ts has indexes but no unique constraint on (agentId, limitType, mcpServerName, toolName). This means a user could accidentally create two identical rate limits for the same agent+server+tool combination, leading to unpredictable enforcement (the stricter limit wins due to the for-loop in checkMcpRateLimits).

Consider adding a unique index or a check in the create handler to prevent duplicates.

4. Missing: Rate limit cache not invalidated when limits are updated/deleted

Severity: Medium

When a limit is updated via PATCH /api/mcp-rate-limits/:id or deleted via DELETE /api/mcp-rate-limits/:id, the existing cache entry in CacheManager keyed as mcp-rate-limit-{limitId} is not invalidated. This means:

  • Update scenario: If maxCalls is increased from 10 to 100, the old cache entry still has the current count. Since isRateLimited compares against the new maxCalls from the database, this is actually OK for increases. But if windowSeconds is changed, the old window timing in cache persists.
  • Delete scenario: If a rate limit is deleted, the cache entry remains. Since checkMcpRateLimits fetches limits from DB first and filters, the orphaned cache entry won't cause false rejections. But getMcpUsageForLimit would still return stale usage for a deleted limit.

This is not a blocking issue since enforcement always reads fresh limits from DB, but it could confuse the usage display. Consider invalidating the cache key on update/delete.

5. Suggestion: formatWindowSeconds duplicated between backend and frontend

Severity: Low

The formatWindowSeconds function exists in both:

  • platform/backend/src/clients/mcp-rate-limit.ts:138-158
  • platform/frontend/src/app/mcp/rate-limits/page.client.tsx:379-385

These have different implementations (backend returns full words like "1 hour", frontend returns abbreviated like "1h" for non-preset values). Consider extracting to @shared if you want consistency, or keep as-is if the different formats are intentional.

6. Observation: Migration drops limit_type column from limits table

Severity: Low — Worth noting for reviewers.

The migration at platform/backend/src/database/migrations/0160_dark_piledriver.sql:17 drops the limit_type column from the existing limits table and also drops limits_type_idx. This is a breaking change for the existing limits infrastructure. The existing limits table was previously used for both token cost limits and (unused) MCP limits — now MCP limits live in the new mcp_rate_limits table and the old limit_type column is removed.

The corresponding code changes in platform/backend/src/types/limit.ts and platform/backend/src/models/limit.ts remove all references to limitType, which is consistent. The RBAC migration that renames limitllmTokenLimit and copies permissions to mcpRateLimit is a nice touch for backward compatibility.

7. Observation: Frontend dialog doesn't reset state properly on re-open

Severity: Low

In platform/frontend/src/app/mcp/rate-limits/page.client.tsx:508-567, the McpRateLimitDialog component manages form state with individual useState hooks initialized from initialData. The resetForm callback on line 547 resets state from initialData, but the useCallback dependency array only includes initialData. If the dialog is opened for create (no initialData) and then opened for edit (with initialData), the form fields may not update because initialData goes from null → object, which correctly triggers resetForm via handleOpenChange. This looks correct on closer inspection.


What Looks Good

  • Fail-open design (platform/backend/src/clients/mcp-rate-limit.ts:93-100): Rate limit errors don't block tool calls — good defensive design.
  • Clean separation: New mcp_rate_limits table + model + types + routes is well-structured and follows repo conventions.
  • Agent name lookup only on rejection path (line 64): Avoids unnecessary DB queries on the hot path.
  • Comprehensive test coverage: Unit tests (mcp-rate-limit.test.ts, mcp-rate-limit.test.ts in models), e2e API tests, and e2e UI tests.
  • RBAC integration: Proper mcpRateLimit resource with CRUD actions, editor/member permissions, route mappings, and frontend PermissionButton usage.
  • Metrics: New mcp_rate_limit_rejections_total counter with useful labels for monitoring.
  • Zod validation in CreateMcpRateLimitSchema: Properly validates tool_calls requires toolName, and maxCalls/windowSeconds > 0.
  • Frontend UX: Tool search combobox, usage progress bars, window presets, and grouped agent selects are polished.
  • refetchInterval: 5000 on the rate limits query provides near-real-time usage updates.
  • Migration: Clean RBAC permission migration from limitllmTokenLimit + mcpRateLimit.

Summary

The core implementation is solid. The two high-priority items to address are:

  1. Fix the createAgent fixture call (missing scope parameter) — this is causing all 47 e2e failures
  2. Add organization scoping to the MCP rate limits API routes to prevent cross-tenant access

The medium-priority items (duplicate prevention, cache invalidation) would improve robustness but aren't blockers.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 23, 2026

Playwright test results

failed  47 failed
passed  543 passed
flaky  6 flaky

Details

stats  596 tests across 59 suites
duration  3 minutes, 54 seconds
commit  dd9edfe

Failed tests

api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenAI › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenAI › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Anthropic › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Anthropic › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Gemini › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Gemini › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cohere › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cohere › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cerebras › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cerebras › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Mistral › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Mistral › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Groq › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Groq › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-xAI › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-xAI › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenRouter › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenRouter › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Perplexity › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Perplexity › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-vLLM › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-vLLM › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Ollama › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Ollama › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Zhipuai › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Zhipuai › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-DeepSeek › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-DeepSeek › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Minimax › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Minimax › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Bedrock › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Bedrock › allows request when under limit
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › mcp_server_calls limit blocks calls after threshold is exceeded
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › tool_calls limit blocks calls after threshold for a specific tool
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › calls succeed normally when no rate limit is configured
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › rate limit usage is reflected in the MCP rate limits API response
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › creates an mcp_server_calls limit
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › creates a tool_calls limit with toolName
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › lists limits filtered by agentId with mcpUsage field
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › updates a limit via PATCH
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › deletes a limit and GET returns 404
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › rejects tool_calls without toolName
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › shows empty state when no rate limits exist
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can create a per-server rate limit
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can create a per-tool rate limit
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can edit a rate limit
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can delete a rate limit

Flaky tests

api › api/mcp-catalog-labels.spec.ts › MCP Catalog Labels › update catalog item labels
api › api/mcp-catalog-labels.spec.ts › MCP Catalog Labels › remove all labels
api › api/mcp-gateway.spec.ts › MCP Gateway - External MCP Server Tests › should list internal-dev-test-server tool
api › api/mcp-gateway.spec.ts › MCP Gateway - External MCP Server Tests › should invoke internal-dev-test-server tool successfully
api › api/organization.spec.ts › Organization API logo validation › Error handling › should reject oversized PNG logo
chromium › ui/chat-settings.spec.ts › Provider Settings - Virtual API Keys › Can create a virtual key from the Virtual API Keys tab

@joeyorlando joeyorlando marked this pull request as ready for review February 24, 2026 18:27
Separate MCP rate limits (per-server and per-tool call limits) from the
shared limits table into their own table, model, routes, types, RBAC
resource, and frontend page.

- New `mcp_rate_limits` table with direct agent FK and cascade delete
- New `McpRateLimitModel` with CRUD operations
- New `/api/mcp-rate-limits` REST endpoints with RBAC
- Rename `limit` RBAC resource to `llmTokenLimit`, add `mcpRateLimit`
- New `/mcp-rate-limits` frontend page with create/edit/delete dialogs
- Remove MCP limits section from cost/limits page
- E2E API and UI tests
- Documentation and Grafana dashboard updates
@joeyorlando joeyorlando force-pushed the jorlando/mcp-tools-rate-limitng branch from a6a5e46 to 528d4cc Compare March 2, 2026 20:19
joeyorlando and others added 5 commits March 2, 2026 19:40
…te-limitng

Resolved conflicts:
- shared/access-control.ts: kept llmTokenLimit permission key with
  updated /llm/cost/limits path from URL restructuring
- frontend/src/app/llm/cost/limits/page.tsx: accepted new CatalogItem
  import from main (unused, removed)

Updated rate-limits paths for new URL structure:
- Moved frontend/src/app/mcp-rate-limits/ to mcp/rate-limits/
- Updated sidebar nav from /mcp-rate-limits to /mcp/rate-limits
- Fixed import path to @/app/mcp/registry/_parts/mcp-server-card
- Fixed href from /mcp-gateways to /mcp/gateways
- Updated e2e UI test goToPage calls
- Added /mcp-rate-limits redirect in next.config.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants