
Summary

This PR adds a RateLimiter utility class that implements token bucket rate limiting, helping developers manage API rate limits and avoid being throttled by the Gradient service.

Problem

The Gradient API enforces rate limits that developers must respect to avoid being throttled or blocked. Currently, developers have no built-in way to manage request rates, leading to:

  • Unexpected throttling errors during high-traffic periods
  • Difficulty implementing proper rate limiting logic
  • Poor user experience when requests are rejected
  • Manual implementation of rate limiting across different parts of applications

Solution

Add a RateLimiter class built on the token bucket algorithm (a minimal implementation sketch follows the list below):

  • Configurable requests per minute limit
  • Automatic token refill based on elapsed time
  • Simple API for checking if requests can be made
  • Wait time calculation for rate limit management
  • Thread-safe implementation using only the standard library

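The merged implementation lives in the diff rather than in this description; the sketch below only illustrates how a token bucket limiter matching the bullets above could work. The class and method names (RateLimiter, acquire, wait_time, requests_per_minute) come from the usage examples further down; the internals are assumptions, not the actual code.

import threading
import time


class RateLimiter:
    """Illustrative token bucket limiter; not the merged implementation."""

    def __init__(self, requests_per_minute: int = 60) -> None:
        self.capacity = requests_per_minute
        self.tokens = float(requests_per_minute)
        self.refill_rate = requests_per_minute / 60.0  # tokens added per second
        self.last_refill = time.monotonic()
        self._lock = threading.Lock()  # standard-library thread safety

    def _refill(self) -> None:
        # Replenish tokens in proportion to the time elapsed since the last refill.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now

    def acquire(self) -> bool:
        # Consume one token if available; report whether the request may proceed.
        with self._lock:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    def wait_time(self) -> float:
        # Seconds until at least one token will be available.
        with self._lock:
            self._refill()
            if self.tokens >= 1:
                return 0.0
            return (1 - self.tokens) / self.refill_rate
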
Key Features

  • Token Bucket Algorithm: Industry-standard rate limiting
  • Configurable Limits: Adjustable requests per minute
  • Automatic Refill: Tokens replenish over time
  • Wait Time Calculation: Know how long to wait before the next request
  • Thread Safe: Uses only the standard library, no external dependencies (see the multi-thread sketch after this list)
  • Simple API: Easy to integrate into existing code

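Because the limiter guards its state with a standard-library lock (as in the sketch above), a single instance can be shared by several worker threads. A hypothetical example, assuming the same acquire/wait_time API shown in the usage examples below:

import threading
import time

from gradient._utils import RateLimiter

limiter = RateLimiter(requests_per_minute=30)

def worker(worker_id: int) -> None:
    # Block until a token is available, then perform the rate-limited call.
    while not limiter.acquire():
        time.sleep(limiter.wait_time())
    print(f"worker {worker_id} got a token")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
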
Benefits

  • Prevents API throttling errors
  • Smooths out request patterns
  • Improves application reliability
  • Helps stay within API quotas
  • Better user experience during high load

Testing

Added a comprehensive test suite covering:

  • Basic rate limiting behavior
  • Token acquisition and exhaustion
  • Wait time calculations
  • Token refill over time
  • Custom rate limit configurations

All tests pass with full coverage of the rate-limiting functionality (a sketch of representative cases is shown below).

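The actual tests are part of the PR diff; the cases below are only a sketch of what the listed scenarios might look like, written against the assumed acquire/wait_time semantics from the implementation sketch above.

from gradient._utils import RateLimiter


def test_tokens_exhaust_at_the_configured_limit():
    limiter = RateLimiter(requests_per_minute=2)
    assert limiter.acquire()
    assert limiter.acquire()
    # A third request in the same instant should be rejected.
    assert not limiter.acquire()


def test_wait_time_reports_time_until_next_token():
    limiter = RateLimiter(requests_per_minute=1)
    assert limiter.acquire()
    # The bucket is now empty; the next token arrives within one minute.
    assert 0.0 < limiter.wait_time() <= 60.0
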
Usage Examples

from gradient._utils import RateLimiter
import time

# Create rate limiter for 30 requests per minute
limiter = RateLimiter(requests_per_minute=30)

# Before making API calls
if limiter.acquire():
    # Make API request
    response = client.chat.completions.create(...)
else:
    # Wait for a token to become available, then consume it before sending
    wait_seconds = limiter.wait_time()
    time.sleep(wait_seconds)
    limiter.acquire()
    response = client.chat.completions.create(...)

# Or integrate into request loop
def make_rate_limited_request():
    while not limiter.acquire():
        wait_seconds = limiter.wait_time()
        time.sleep(wait_seconds)
    
    return client.chat.completions.create(...)
