feat: Add API rate limiting (DRF throttling) #1147

@mihow

Description

Problem

Antenna has no API rate limiting. This creates two risks:

  1. Worker polling storms — Multiple ADC workers each poll /jobs/ every 5 seconds, so load scales linearly with N workers × M pipelines. We recently consolidated to a single pipeline__slug__in call per poll cycle (PSv2: Improve task fetching & web worker concurrency configuration #1142, feat: fetch jobs for all pipelines in a single API request ami-data-companion#114), but there's no server-side protection if workers misbehave or a bug causes tight retry loops.

  2. Crawler / bot abuse — The public API is exposed to the internet. Facebook's crawler bots and similar have been observed hitting the API. Without throttling, a bot or misconfigured client can saturate the server.

Proposed Approach

Use DRF's built-in throttling framework (rest_framework.throttling).

Throttle classes to configure

| Scope | Rate | Purpose |
| --- | --- | --- |
| `anon` | 60/min | Unauthenticated requests (bots, crawlers) |
| `user` | 300/min | Authenticated users (UI, normal API usage) |
| `worker` | 30/min | Worker job-polling endpoints specifically |

Implementation sketch

1. Settings (config/settings/base.py):

```python
REST_FRAMEWORK = {
    ...
    "DEFAULT_THROTTLE_CLASSES": [
        "rest_framework.throttling.AnonRateThrottle",
        "rest_framework.throttling.UserRateThrottle",
    ],
    "DEFAULT_THROTTLE_RATES": {
        "anon": "60/min",
        "user": "300/min",
        "worker": "30/min",
    },
}
```

2. Worker-specific throttle for job polling endpoints:

```python
from rest_framework.throttling import SimpleRateThrottle

class WorkerPollThrottle(SimpleRateThrottle):
    scope = "worker"

    def get_cache_key(self, request, view):
        # Throttle per authenticated identity (worker auth token / user pk).
        # Unauthenticated requests return None here and are left to
        # AnonRateThrottle instead.
        if request.user.is_authenticated:
            return self.cache_format % {"scope": self.scope, "ident": request.user.pk}
        return None
```

Apply to JobViewSet.list when ids_only=1 or incomplete_only=1 query params are present (these are the worker polling patterns).
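A minimal sketch of that conditional wiring, assuming the standard DRF `get_throttles()` hook on the viewset; the helper name `is_worker_poll` and the exact param matching are illustrative, not existing Antenna code:

```python
# Query params that mark a request as worker job polling (per this issue).
WORKER_POLL_PARAMS = ("ids_only", "incomplete_only")

def is_worker_poll(query_params) -> bool:
    """Return True when the request matches the ADC worker polling pattern."""
    return any(query_params.get(p) == "1" for p in WORKER_POLL_PARAMS)

# In JobViewSet (DRF), the throttle could then be applied conditionally:
#
# class JobViewSet(viewsets.ModelViewSet):
#     def get_throttles(self):
#         if self.action == "list" and is_worker_poll(self.request.query_params):
#             return [WorkerPollThrottle()]
#         return super().get_throttles()
```

This keeps the stricter 30/min rate on polling traffic only, while regular authenticated list requests stay under the 300/min `user` rate.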

3. Backend: Use Redis (already in the stack) for throttle state via django-redis cache backend.
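Since DRF's SimpleRateThrottle stores its request history in the default cache, pointing default at Redis is sufficient. A sketch of the settings fragment; the REDIS_URL env var and fallback address are assumptions about the deployment:

```python
# config/settings/base.py: route the default cache to Redis so throttle
# counters are shared across app processes. Addresses here are assumptions.
import os

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": os.environ.get("REDIS_URL", "redis://redis:6379/1"),
        "OPTIONS": {"CLIENT_CLASS": "django_redis.client.DefaultClient"},
    }
}
```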

Considerations

  • DRF returns 429 Too Many Requests with a Retry-After header — well-behaved clients (including the ADC worker) should respect it
  • The ADC worker's get_jobs() already wraps the HTTP call in a try/except, so a 429 would be handled gracefully (logged as an error, retried on the next poll cycle)
  • Consider whether ScopedRateThrottle is more appropriate for fine-grained per-view control
  • Burst vs sustained rates: DRF's default uses a simple sliding window; for more sophisticated token-bucket behavior, consider django-ratelimit or a reverse proxy (nginx/Caddy) layer
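If the worker were to honor Retry-After explicitly rather than just retrying on the next cycle, a small helper could compute the back-off. This is a sketch, not existing ADC code; `retry_delay` and `DEFAULT_POLL_INTERVAL` are hypothetical names:

```python
# Hypothetical helper for the ADC worker's poll loop: after a 429, wait as
# long as the server's Retry-After header asks before polling again.
DEFAULT_POLL_INTERVAL = 5.0  # seconds; the worker's normal poll cadence

def retry_delay(headers: dict) -> float:
    """Seconds to wait after a 429, per the Retry-After header if present."""
    value = headers.get("Retry-After")
    try:
        return max(float(value), 0.0)
    except (TypeError, ValueError):
        # Header missing or in HTTP-date form: fall back to the poll interval.
        return DEFAULT_POLL_INTERVAL
```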
