feat: Configurable rate limiting per domain #51

@Snider

Summary

Fine-grained rate limiting with per-domain configuration.

Use Case

Different sites tolerate different request rates. The GitHub API enforces strict limits, while static sites can usually handle more.

Commands

# Global rate limit
borg collect website https://example.com --rate-limit 2/s

# Per-domain config file
borg collect website https://example.com --rate-config .borg-rates.yaml

Configuration

# .borg-rates.yaml
defaults:
  requests_per_second: 1
  burst: 5

domains:
  api.github.com:
    requests_per_second: 0.5  # 1 req per 2 seconds
    burst: 1
    
  bitcointalk.org:
    requests_per_second: 0.2  # 1 req per 5 seconds
    burst: 1
    reason: "aggressive anti-scraping"
    
  eprint.iacr.org:
    requests_per_second: 2
    burst: 10
    
  "*.archive.org":
    requests_per_second: 1
    burst: 3
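One way the lookup against this file could work is sketched below. It assumes the YAML has already been loaded into a plain dict (shown inline here), that an exact domain match wins over a wildcard, and that unmatched hosts fall back to `defaults`. The function name `resolve_limits` and the use of shell-style globbing for `*` are illustrative assumptions, not part of the proposal.

```python
from fnmatch import fnmatchcase

# The .borg-rates.yaml example as the dict a YAML loader would return.
CONFIG = {
    "defaults": {"requests_per_second": 1, "burst": 5},
    "domains": {
        "api.github.com": {"requests_per_second": 0.5, "burst": 1},
        "bitcointalk.org": {"requests_per_second": 0.2, "burst": 1},
        "eprint.iacr.org": {"requests_per_second": 2, "burst": 10},
        "*.archive.org": {"requests_per_second": 1, "burst": 3},
    },
}


def resolve_limits(host, config):
    """Return the effective limits for a host (hypothetical helper)."""
    host = host.lower()
    defaults = config.get("defaults", {})
    domains = config.get("domains", {})

    # 1. An exact match takes priority.
    if host in domains:
        return {**defaults, **domains[host]}

    # 2. Wildcard patterns, longest first so the most specific one wins.
    for pattern in sorted(domains, key=len, reverse=True):
        if "*" in pattern and fnmatchcase(host, pattern.lower()):
            return {**defaults, **domains[pattern]}

    # 3. Fall back to the defaults.
    return dict(defaults)
```

Under these assumptions `web.archive.org` picks up the `*.archive.org` entry, while the bare `archive.org` does not match the pattern and gets the defaults. Whether the apex domain should be covered by the wildcard is a design question the issue leaves open.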

Token Bucket Algorithm

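  • requests_per_second: sustained rate at which tokens are refilled
  • burst: bucket capacity, i.e. the maximum number of requests that can be sent back-to-back before throttling to the sustained rate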
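A minimal token bucket sketch follows to make the two parameters concrete. The class and method names are hypothetical, the bucket is assumed to start full, and the clock is injectable so the behaviour can be checked without sleeping. It is not thread-safe as written.

```python
import time


class TokenBucket:
    """Illustrative token bucket; not the project's implementation."""

    def __init__(self, requests_per_second, burst, clock=time.monotonic):
        self.rate = float(requests_per_second)  # tokens added per second
        self.capacity = float(burst)            # maximum stored tokens
        self.tokens = float(burst)              # assumption: start full
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Consume one token if available; return whether the request may go."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)
```

With `requests_per_second: 0.5` and `burst: 2`, two requests pass immediately, the third is refused, and `wait_time()` reports two seconds until the next token.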

CLI Options

Flag               Description
--rate-limit N/s   Requests per second
--rate-limit N/m   Requests per minute
--burst N          Burst allowance
--rate-config      Config file path
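As a sketch of how the `N/s` and `N/m` forms might be normalised to a single unit, the hypothetical helper below converts a `--rate-limit` value into requests per second. The function name, the decision to accept fractional `N`, and the error handling are assumptions for illustration.

```python
def parse_rate_limit(spec):
    """Convert 'N/s' or 'N/m' into requests per second (hypothetical helper)."""
    count, sep, unit = spec.strip().partition("/")
    seconds_per_unit = {"s": 1.0, "m": 60.0}.get(unit.strip().lower())
    if not sep or seconds_per_unit is None:
        raise ValueError(f"expected N/s or N/m, got {spec!r}")

    value = float(count)  # raises ValueError if N is not a number
    if value <= 0:
        raise ValueError(f"rate must be positive, got {spec!r}")
    return value / seconds_per_unit
```

For example, `2/s` maps to 2.0 requests per second and `30/m` to 0.5, which matches the `requests_per_second` unit used in the config file.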

Automatic Detection

  • Parse Retry-After headers
  • Detect 429 responses
  • Adjust rate dynamically
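The sketch below shows one possible shape for this behaviour. `Retry-After` may carry either a number of seconds or an HTTP date, so both forms are handled. The adjustment policy (halve the rate on each 429, never below a floor) is an assumption chosen for illustration, since the issue does not specify one, and the names `parse_retry_after` and `on_response` are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the delay in seconds, or None if the header is unparseable."""
    value = value.strip()
    if value.isdigit():  # delay-seconds form, e.g. "120"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def on_response(current_rate, status, retry_after=None, min_rate=0.1):
    """Return (new_rate, pause_seconds) after a response.

    Assumed policy: a 429 halves the rate (not below min_rate) and pauses
    for the Retry-After delay when one is given; other statuses change nothing.
    """
    if status != 429:
        return current_rate, 0.0
    pause = parse_retry_after(retry_after) if retry_after else None
    return max(min_rate, current_rate / 2), pause or 0.0
```

A gradual recovery step after a run of successful responses would likely be needed as well, otherwise the rate only ever goes down. That part is left out here.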

Acceptance Criteria

  • Token bucket rate limiter
  • Per-domain configuration
  • Wildcard domain matching
  • Dynamic adjustment on 429
  • Retry-After header support

Metadata

Labels: jules (For Jules AI to work on)