feat: MCP tools rate limiting by joeyorlando · Pull Request #2973 · archestra-ai/archestra

joeyorlando · 2026-02-23T21:10:35Z

Summary

Add MCP rate limiting enforcement to the MCP Gateway, checking limits before tool execution
New /mcp-rate-limits page under "MCP & Tools" sidebar for managing per-server and per-tool rate limits with configurable time windows
Sliding window rate limiter reuses existing isRateLimited infrastructure from CacheManager
Rate-limited calls return helpful MCP tool error results with limit info and retry timing

Changes

Database: Added windowMs column to limits table for per-limit custom time windows
Backend: New checkMcpRateLimits() service integrated into mcp-client.ts before tool execution; checks agent → team → organization limits in priority order
Frontend: New MCP Rate Limits page with CRUD, usage progress bars, window presets (1 min to 1 month); removed "Coming Soon" MCP section from /cost/limits
Metrics: mcp_rate_limit_rejections_total Prometheus counter with agent/server/tool labels
Grafana: New "Rate Limits" row in MCP Monitoring dashboard with rejection panels by server and tool
Docs: New "MCP Rate Limits" section in MCP Gateway docs; updated observability docs with new metric

London-Cat · 2026-02-23T21:12:12Z

📊 Reputation Summary

User	Rep	Pull Requests	Activity	Assigned	Core Reactions
joeyorlando	⚡ 2278	90✅ 6🔄 4❌	100 issues, 50 comments	4	—

How is the score calculated? Read about it in the Reputation Bot repository 🤖

claude · 2026-02-23T21:18:11Z

Claude finished @joeyorlando's task in 3m 21s —— View job

PR Review: MCP Tools Rate Limiting (#2973)

Overview

This PR adds MCP rate limiting with a clean full-stack implementation: new mcp_rate_limits table, a sliding window rate limiter reusing the existing isRateLimited infrastructure, CRUD API routes, a frontend management page, Prometheus metrics, and Grafana dashboards. The architecture cleanly separates MCP rate limits from existing LLM token cost limits. Nice work on the overall design.

Issues Found

1. Bug: `createAgent` fixture signature mismatch causes 47 test failures

Severity: High — This is the root cause of all 47 e2e test failures.

In platform/e2e-tests/tests/api/fixtures.ts:108-122, the createAgent fixture now requires a scope parameter:

const createAgent = async (
  request: APIRequestContext,
  name: string,
  scope: "personal" | "team" | "org",  // NEW required param
) => ...

But the MCP rate limit e2e tests in platform/e2e-tests/tests/api/mcp-gateway-rate-limits.spec.ts:34 call it without scope:

const createResponse = await createAgent(
  request,
  `MCP Rate Limit Test ${uniqueSuffix}`,
  // missing 'scope' argument
);

The token-cost-limits spec also appears to be affected (32 of the 47 failures). This was likely a merge conflict from the base where scope was added. The fix is straightforward: add "org" as the third argument to all createAgent calls in the new test files.

Fix this →

2. Bug: No authorization scoping on MCP rate limits API

Severity: High — Security concern.

The GET /api/mcp-rate-limits endpoint (platform/backend/src/routes/mcp-rate-limits.ts:34) returns all MCP rate limits across the entire system without filtering by organizationId. Similarly, POST, PATCH, and DELETE operations don't verify that the target limit belongs to the requesting user's organization.

For example, findAll in platform/backend/src/models/mcp-rate-limit.ts:27-46 queries without any organization scope:

static async findAll(filters?: {
  agentId?: string;
  limitType?: McpRateLimitType;
}) {
  // No organizationId filter
  return db.select().from(schema.mcpRateLimitsTable).where(...);
}

In a multi-tenant deployment, one organization could see and modify another organization's rate limits. The existing limits route (platform/backend/src/routes/limits.ts) has this same pattern but the limits table uses entityId tied to teams/orgs which provides implicit scoping. The new mcp_rate_limits table is only scoped by agentId — but agents already belong to organizations, so the fix would be to join through the agents table to filter by organization, or verify the agent belongs to the user's organization on mutation routes.

Fix this →

3. Missing: No unique constraint preventing duplicate rate limits

Severity: Medium

The mcp_rate_limits table in platform/backend/src/database/schemas/mcp-rate-limit.ts has indexes but no unique constraint on (agentId, limitType, mcpServerName, toolName). This means a user could accidentally create two identical rate limits for the same agent+server+tool combination, leading to unpredictable enforcement (the stricter limit wins due to the for-loop in checkMcpRateLimits).

Consider adding a unique index or a check in the create handler to prevent duplicates.

4. Missing: Rate limit cache not invalidated when limits are updated/deleted

Severity: Medium

When a limit is updated via PATCH /api/mcp-rate-limits/:id or deleted via DELETE /api/mcp-rate-limits/:id, the existing cache entry in CacheManager keyed as mcp-rate-limit-{limitId} is not invalidated. This means:

Update scenario: If maxCalls is increased from 10 to 100, the old cache entry still has the current count. Since isRateLimited compares against the new maxCalls from the database, this is actually OK for increases. But if windowSeconds is changed, the old window timing in cache persists.
Delete scenario: If a rate limit is deleted, the cache entry remains. Since checkMcpRateLimits fetches limits from DB first and filters, the orphaned cache entry won't cause false rejections. But getMcpUsageForLimit would still return stale usage for a deleted limit.

This is not a blocking issue since enforcement always reads fresh limits from DB, but it could confuse the usage display. Consider invalidating the cache key on update/delete.

5. Suggestion: `formatWindowSeconds` duplicated between backend and frontend

Severity: Low

The formatWindowSeconds function exists in both:

platform/backend/src/clients/mcp-rate-limit.ts:138-158
platform/frontend/src/app/mcp/rate-limits/page.client.tsx:379-385

These have different implementations (backend returns full words like "1 hour", frontend returns abbreviated like "1h" for non-preset values). Consider extracting to @shared if you want consistency, or keep as-is if the different formats are intentional.

6. Observation: Migration drops `limit_type` column from `limits` table

Severity: Low — Worth noting for reviewers.

The migration at platform/backend/src/database/migrations/0160_dark_piledriver.sql:17 drops the limit_type column from the existing limits table and also drops limits_type_idx. This is a breaking change for the existing limits infrastructure. The existing limits table was previously used for both token cost limits and (unused) MCP limits — now MCP limits live in the new mcp_rate_limits table and the old limit_type column is removed.

The corresponding code changes in platform/backend/src/types/limit.ts and platform/backend/src/models/limit.ts remove all references to limitType, which is consistent. The RBAC migration that renames limit → llmTokenLimit and copies permissions to mcpRateLimit is a nice touch for backward compatibility.

7. Observation: Frontend dialog doesn't reset state properly on re-open

Severity: Low

In platform/frontend/src/app/mcp/rate-limits/page.client.tsx:508-567, the McpRateLimitDialog component manages form state with individual useState hooks initialized from initialData. The resetForm callback on line 547 resets state from initialData, but the useCallback dependency array only includes initialData. If the dialog is opened for create (no initialData) and then opened for edit (with initialData), the form fields may not update because initialData goes from null → object, which correctly triggers resetForm via handleOpenChange. This looks correct on closer inspection.

What Looks Good

Fail-open design (platform/backend/src/clients/mcp-rate-limit.ts:93-100): Rate limit errors don't block tool calls — good defensive design.
Clean separation: New mcp_rate_limits table + model + types + routes is well-structured and follows repo conventions.
Agent name lookup only on rejection path (line 64): Avoids unnecessary DB queries on the hot path.
Comprehensive test coverage: Unit tests (mcp-rate-limit.test.ts, mcp-rate-limit.test.ts in models), e2e API tests, and e2e UI tests.
RBAC integration: Proper mcpRateLimit resource with CRUD actions, editor/member permissions, route mappings, and frontend PermissionButton usage.
Metrics: New mcp_rate_limit_rejections_total counter with useful labels for monitoring.
Zod validation in CreateMcpRateLimitSchema: Properly validates tool_calls requires toolName, and maxCalls/windowSeconds > 0.
Frontend UX: Tool search combobox, usage progress bars, window presets, and grouped agent selects are polished.
refetchInterval: 5000 on the rate limits query provides near-real-time usage updates.
Migration: Clean RBAC permission migration from limit → llmTokenLimit + mcpRateLimit.

Summary

The core implementation is solid. The two high-priority items to address are:

Fix the createAgent fixture call (missing scope parameter) — this is causing all 47 e2e failures
Add organization scoping to the MCP rate limits API routes to prevent cross-tenant access

The medium-priority items (duplicate prevention, cache invalidation) would improve robustness but aren't blockers.

github-actions · 2026-02-23T21:36:57Z

Playwright test results

47 failed
543 passed
6 flaky

Details

596 tests across 59 suites
3 minutes, 54 seconds
dd9edfe

Failed tests

api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenAI › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenAI › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Anthropic › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Anthropic › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Gemini › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Gemini › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cohere › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cohere › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cerebras › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Cerebras › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Mistral › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Mistral › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Groq › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Groq › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-xAI › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-xAI › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenRouter › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-OpenRouter › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Perplexity › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Perplexity › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-vLLM › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-vLLM › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Ollama › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Ollama › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Zhipuai › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Zhipuai › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-DeepSeek › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-DeepSeek › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Minimax › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Minimax › allows request when under limit
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Bedrock › blocks request when profile token cost limit is exceeded
api › api/llm-proxy/token-cost-limits.spec.ts › LLMProxy-TokenCostLimits-Bedrock › allows request when under limit
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › mcp_server_calls limit blocks calls after threshold is exceeded
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › tool_calls limit blocks calls after threshold for a specific tool
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › calls succeed normally when no rate limit is configured
api › api/mcp-gateway-rate-limits.spec.ts › MCP Gateway - Rate Limits › rate limit usage is reflected in the MCP rate limits API response
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › creates an mcp_server_calls limit
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › creates a tool_calls limit with toolName
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › lists limits filtered by agentId with mcpUsage field
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › updates a limit via PATCH
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › deletes a limit and GET returns 404
api › api/mcp-gateway-rate-limits.spec.ts › MCP Rate Limits - CRUD API › rejects tool_calls without toolName
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › shows empty state when no rate limits exist
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can create a per-server rate limit
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can create a per-tool rate limit
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can edit a rate limit
chromium › ui/mcp-rate-limits.spec.ts › MCP Rate Limits › can delete a rate limit

Flaky tests

api › api/mcp-catalog-labels.spec.ts › MCP Catalog Labels › update catalog item labels
api › api/mcp-catalog-labels.spec.ts › MCP Catalog Labels › remove all labels
api › api/mcp-gateway.spec.ts › MCP Gateway - External MCP Server Tests › should list internal-dev-test-server tool
api › api/mcp-gateway.spec.ts › MCP Gateway - External MCP Server Tests › should invoke internal-dev-test-server tool successfully
api › api/organization.spec.ts › Organization API logo validation › Error handling › should reject oversized PNG logo
chromium › ui/chat-settings.spec.ts › Provider Settings - Virtual API Keys › Can create a virtual key from the Virtual API Keys tab

Separate MCP rate limits (per-server and per-tool call limits) from the shared limits table into their own table, model, routes, types, RBAC resource, and frontend page. - New `mcp_rate_limits` table with direct agent FK and cascade delete - New `McpRateLimitModel` with CRUD operations - New `/api/mcp-rate-limits` REST endpoints with RBAC - Rename `limit` RBAC resource to `llmTokenLimit`, add `mcpRateLimit` - New `/mcp-rate-limits` frontend page with create/edit/delete dialogs - Remove MCP limits section from cost/limits page - E2E API and UI tests - Documentation and Grafana dashboard updates

…te-limitng Resolved conflicts: - shared/access-control.ts: kept llmTokenLimit permission key with updated /llm/cost/limits path from URL restructuring - frontend/src/app/llm/cost/limits/page.tsx: accepted new CatalogItem import from main (unused, removed) Updated rate-limits paths for new URL structure: - Moved frontend/src/app/mcp-rate-limits/ to mcp/rate-limits/ - Updated sidebar nav from /mcp-rate-limits to /mcp/rate-limits - Fixed import path to @/app/mcp/registry/_parts/mcp-server-card - Fixed href from /mcp-gateways to /mcp/gateways - Updated e2e UI test goToPage calls - Added /mcp-rate-limits redirect in next.config.ts

joeyorlando marked this pull request as draft February 23, 2026 21:17

joeyorlando marked this pull request as ready for review February 24, 2026 18:27

joeyorlando force-pushed the jorlando/mcp-tools-rate-limitng branch from a6a5e46 to 528d4cc Compare March 2, 2026 20:19

joeyorlando and others added 5 commits March 2, 2026 19:40

wip

4f1d688

wip

01016d8

wip

d4f6d56

Merge branch 'main' into jorlando/mcp-tools-rate-limitng

dd9edfe

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: MCP tools rate limiting#2973

feat: MCP tools rate limiting#2973
joeyorlando wants to merge 6 commits intomainfrom
jorlando/mcp-tools-rate-limitng

joeyorlando commented Feb 23, 2026 •

edited

Loading

Uh oh!

London-Cat commented Feb 23, 2026 •

edited

Loading

Uh oh!

claude bot commented Feb 23, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

joeyorlando commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Uh oh!

London-Cat commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📊 Reputation Summary

Uh oh!

claude bot commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: MCP Tools Rate Limiting (#2973)

Overview

Issues Found

1. Bug: createAgent fixture signature mismatch causes 47 test failures

2. Bug: No authorization scoping on MCP rate limits API

3. Missing: No unique constraint preventing duplicate rate limits

4. Missing: Rate limit cache not invalidated when limits are updated/deleted

5. Suggestion: formatWindowSeconds duplicated between backend and frontend

6. Observation: Migration drops limit_type column from limits table

7. Observation: Frontend dialog doesn't reset state properly on re-open

What Looks Good

Summary

Uh oh!

github-actions bot commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Playwright test results

Details

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

joeyorlando commented Feb 23, 2026 •

edited

Loading

London-Cat commented Feb 23, 2026 •

edited

Loading

claude bot commented Feb 23, 2026 •

edited

Loading

1. Bug: `createAgent` fixture signature mismatch causes 47 test failures

5. Suggestion: `formatWindowSeconds` duplicated between backend and frontend

6. Observation: Migration drops `limit_type` column from `limits` table

github-actions bot commented Feb 23, 2026 •

edited

Loading