Description
Epic: Cache
Summary
TLDR - Implement HTTP PURGE method for immediate cache invalidation, following Varnish's pattern. This allows operators to remove specific cached responses instantly via PURGE /api/endpoint requests. Essential for maintaining cache consistency when backend data changes, providing immediate invalidation without waiting for TTL expiration.
Context
HARP uses hishel 1.0 for HTTP caching within the AsyncCacheTransport layer. Currently, cached responses can only expire via TTL or be cleared entirely. Production deployments need targeted invalidation when:
- Content updates require immediate propagation
- Cached error responses need removal
- Security patches invalidate previous responses
- Data corrections must reflect immediately
PURGE is not a standardized HTTP method, but the convention popularized by Varnish is widely supported by caches and CDNs: sending PURGE /path immediately removes that exact URL from cache. HARP's kernel event architecture allows intercepting PURGE requests at the ASGI layer before they reach the proxy, enabling http_client to handle cache operations directly.
sequenceDiagram
participant Client
participant Kernel as ASGIKernel
participant PurgeService as CachePurgeListener
participant Storage as Cache Storage
Client->>Kernel: PURGE /api/users/123
Kernel->>PurgeService: EVENT_CORE_REQUEST
PurgeService->>PurgeService: Check IP ACL
PurgeService->>Storage: delete cache entry
Storage-->>PurgeService: deleted
PurgeService-->>Kernel: 200 Purged
Kernel-->>Client: 200 Purged
Note over Kernel,Storage: Bypasses proxy entirely
Input
PURGE Request Format:
PURGE /api/users/123 HTTP/1.1
Host: api.example.com
X-Forwarded-For: 10.0.0.5
Configuration (config.yml):
http_client:
  cache:
    enabled: true
    purge:
      enabled: true # Opt-in for security (default: false)
      acl:
        - 127.0.0.1
        - 10.0.0.0/8
        - 172.16.0.0/12
    policy:
      shared: true
      # ... existing cache policy settings
Security: IP-based ACL restricts PURGE to authorized sources (configurable allowlist)
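A minimal sketch of the ACL check, assuming each entry is either a single address or a CIDR range as in the configuration above. The name `is_allowed` is hypothetical and not part of HARP; only the standard-library `ipaddress` module is used.

```python
import ipaddress


def is_allowed(client_ip: str, acl: list[str]) -> bool:
    """Return True if client_ip falls inside any ACL entry (single IP or CIDR)."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable addresses are rejected
    for entry in acl:
        # A bare IP such as "127.0.0.1" parses as a one-address network.
        # strict=False also accepts entries with host bits set, e.g. "10.0.0.5/8".
        network = ipaddress.ip_network(entry, strict=False)
        # Membership is False when the address and network differ in IP version.
        if address in network:
            return True
    return False


# The allowlist from the example configuration.
ACL = ["127.0.0.1", "10.0.0.0/8", "172.16.0.0/12"]
```

With this allowlist, `is_allowed("10.0.0.5", ACL)` is true and `is_allowed("192.168.1.1", ACL)` is false. Which address gets checked (the socket peer or X-Forwarded-For) is not specified here; X-Forwarded-For is client-supplied unless a trusted proxy overwrites it, so that choice affects how much protection the ACL gives.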
Cache Key Matching: Exact URL + Host header using injected SpecificationPolicy instance - guaranteed match with hishel's cache key generation
Output and Testing Scenarios
Expected Responses:
- 200 Purged - Successfully removed from cache
- 404 Not in cache - URL was not cached
- 403 Forbidden - IP not in allowlist
- 405 Method Not Allowed - PURGE disabled
Testing Scenarios:
- Happy Path: Cache GET /api/users/123, PURGE /api/users/123, verify removal
- Security: PURGE from unauthorized IP returns 403
- Not Cached: PURGE non-existent URL returns 404
- Case Sensitivity: PURGE /Api/Users/123 doesn't match /api/users/123
- Host Header: Different Host headers create different cache entries
- Bypass Verification: PURGE returns without hitting backend service
- Disabled: When cache.purge.enabled=false, PURGE returns 405
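The response table and scenarios above could be exercised against a small in-memory model. This is a sketch under assumptions: `PurgeSettings`, `cache_key` and `handle_purge` are hypothetical names, a plain dict stands in for the cache storage, and `cache_key` is a placeholder hash rather than hishel's key generation (the proposal reuses the injected SpecificationPolicy for that). The order of the checks (disabled, then ACL, then lookup) is also an assumption.

```python
import hashlib
import ipaddress
from dataclasses import dataclass, field


@dataclass
class PurgeSettings:
    """Stand-in for the proposed CachePurgeSettings."""
    enabled: bool = False
    acl: list[str] = field(default_factory=lambda: ["127.0.0.1"])


def cache_key(host: str, path: str) -> str:
    """Placeholder key (NOT hishel's algorithm): exact, case-sensitive Host + path."""
    return hashlib.sha256(f"{host}|{path}".encode()).hexdigest()


def handle_purge(host: str, path: str, client_ip: str,
                 settings: PurgeSettings, storage: dict[str, bytes]) -> tuple[int, str]:
    """Map a PURGE request to one of the four expected responses."""
    if not settings.enabled:
        return 405, "Method Not Allowed"
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return 403, "Forbidden"
    if not any(address in ipaddress.ip_network(entry, strict=False) for entry in settings.acl):
        return 403, "Forbidden"
    key = cache_key(host, path)
    if key not in storage:
        return 404, "Not in cache"
    del storage[key]  # the actual invalidation
    return 200, "Purged"


settings = PurgeSettings(enabled=True, acl=["127.0.0.1", "10.0.0.0/8"])
storage = {cache_key("api.example.com", "/api/users/123"): b"cached body"}
```

Because the key is derived from the exact Host and path, a PURGE for /Api/Users/123 or for a different Host misses the entry and yields 404, matching the Case Sensitivity and Host Header scenarios.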
Possible Implementation
Chosen Approach: Kernel Event Interception via EVENT_CORE_REQUEST
Implement in harp_apps/http_client/ using kernel-level event interception with subscriber pattern:
1. Settings Structure
File: harp_apps/http_client/settings/cache.py
class CachePurgeSettings(BaseModel):
    """Cache purge configuration."""

    enabled: bool = False  # Opt-in for security
    acl: list[str] = Field(default_factory=lambda: ["127.0.0.1"])


class CacheSettings(BaseModel):
    enabled: bool = True
    purge: CachePurgeSettings = Field(default_factory=CachePurgeSettings)
    transport: Service
    policy: Service  # SpecificationPolicy - reused for cache key generation!
    storage: Service
2. Service Registration
File: harp_apps/http_client/services.yml
- condition: [!cfg "cache.enabled", !!bool "true"]
  services:
    # Existing cache services (options, policy, transport, storage)
    # NEW: Purge listener (nested condition)
    - condition: [!cfg "cache.purge.enabled", !!bool "false"]
      services:
        - name: http_client.cache.purge.listener
          class: harp_apps.http_client.contrib.cache.purge_listener.CachePurgeListener
          kwargs:
            settings: !cfg "cache.purge"
            policy: !svc http_client.cache.policy # Reuse injected SpecificationPolicy
            storage: !svc http_client.cache.storage
3. Listener Implementation
File: harp_apps/http_client/contrib/cache/purge_listener.py
class CachePurgeListener:
    """Listens for PURGE requests and invalidates cache entries."""

    def __init__(self, settings: CachePurgeSettings, policy: SpecificationPolicy, storage: AsyncStorage):
        self.settings = settings
        self.policy = policy  # Same instance used by AsyncCacheTransport
        self.storage = storage

    async def subscribe(self, dispatcher):
        """Register EVENT_CORE_REQUEST listener."""
        dispatcher.add_listener(EVENT_CORE_REQUEST, self.on_core_request, priority=100)

    async def unsubscribe(self, dispatcher):
        """Unregister listener on shutdown."""
        dispatcher.remove_listener(EVENT_CORE_REQUEST, self.on_core_request)

    async def on_core_request(self, event: RequestEvent):
        """Intercept PURGE requests and handle cache invalidation."""
        if event.request.method == "PURGE":
            # Check ACL, generate cache key using self.policy, delete from storage
            # Set early response via event.set_controller()
            ...
4. Lifecycle Management
File: harp_apps/http_client/__app__.py
async def on_bind(event: OnBindEvent):
    """Subscribe purge listener when enabled."""
    settings = event.settings.get("http_client", {}).get("cache", {})
    if settings.get("enabled") and settings.get("purge", {}).get("enabled"):
        listener = await event.container.resolve("http_client.cache.purge.listener")
        await listener.subscribe(event.container.dispatcher)


async def on_shutdown(event: OnShutdownEvent):
    """Unsubscribe purge listener."""
    settings = event.settings.get("http_client", {}).get("cache", {})
    if settings.get("enabled") and settings.get("purge", {}).get("enabled"):
        listener = await event.container.resolve("http_client.cache.purge.listener")
        await listener.unsubscribe(event.container.dispatcher)


application = Application(
    on_bind=on_bind,
    on_shutdown=on_shutdown,
    settings_type=HttpClientSettings,
)
Key Architecture Benefits
- Lives in http_client (semantic fit - cache operations)
- Uses EVENT_CORE_REQUEST (earliest kernel interception)
- Bypasses proxy flow entirely (performance + simplicity)
- Reuses injected SpecificationPolicy (guaranteed cache key match)
- Subscriber pattern (clean lifecycle, follows Rules app convention)
- Nested configuration (cache.purge) - logical hierarchy
- Security by default (opt-in; enabled defaults to false)
- No modifications to proxy or kernel code
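The subscriber pattern and the proxy bypass listed above can be illustrated with a toy dispatcher. Everything in this sketch is a stand-in: `MiniDispatcher`, `FakeRequestEvent`, `EVENT_REQUEST` and `handle` are hypothetical and do not reproduce HARP's dispatcher, RequestEvent or kernel. In particular, the priority argument and event.set_controller() from the listener above are replaced by registration order and a plain `response` attribute.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class FakeRequestEvent:
    """Stand-in for a request event; `response` models an early response."""
    method: str
    path: str
    response: Optional[str] = None


Listener = Callable[[FakeRequestEvent], Awaitable[None]]
EVENT_REQUEST = "request"  # placeholder id, not HARP's EVENT_CORE_REQUEST value


class MiniDispatcher:
    """Toy event dispatcher: listeners run in registration order."""

    def __init__(self) -> None:
        self._listeners: dict[str, list[Listener]] = defaultdict(list)

    def add_listener(self, event_id: str, listener: Listener) -> None:
        self._listeners[event_id].append(listener)

    def remove_listener(self, event_id: str, listener: Listener) -> None:
        self._listeners[event_id].remove(listener)

    async def dispatch(self, event_id: str, event: FakeRequestEvent) -> FakeRequestEvent:
        for listener in list(self._listeners[event_id]):
            await listener(event)
        return event


async def purge_listener(event: FakeRequestEvent) -> None:
    """Claim PURGE requests by setting an early response."""
    if event.method == "PURGE":
        event.response = "handled by purge listener"


async def handle(dispatcher: MiniDispatcher, method: str, path: str) -> str:
    """Toy kernel: dispatch first, fall through to the proxy only if unclaimed."""
    event = await dispatcher.dispatch(EVENT_REQUEST, FakeRequestEvent(method, path))
    if event.response is not None:
        return event.response  # the proxy is never reached
    return "forwarded to proxy"
```

Subscribing is `dispatcher.add_listener(EVENT_REQUEST, purge_listener)` and unsubscribing is the matching `remove_listener` call; once the listener is removed, the toy kernel forwards every request again, which is why registration and removal are tied to the application lifecycle hooks.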
Integration Points
- harp/asgi/events.py - EVENT_CORE_REQUEST definition (line 15)
- harp/asgi/kernel.py - Event dispatch at request arrival (lines 166-174)
- harp_apps/http_client/__app__.py - Listener registration in on_bind/on_shutdown hooks
- harp_apps/http_client/services.yml - Conditional service registration (lines 42-56)
- harp_apps/storage/types/blob_storage.py - IBlobStorage.delete()
Current Challenges
None - The SpecificationPolicy is already injected via DI and can be reused directly for cache key generation, ensuring exact matches with hishel's algorithm.
Pattern Matching: Initial version supports exact URL only. Wildcard support (PURGE /api/users/*) requires additional metadata storage or blob scanning, deferred to future enhancement.
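To make the "additional metadata storage" option concrete, here is one possible shape for it: a side index recording which (host, path) produced which cache key, so a wildcard can be resolved to keys without scanning blobs. `PathIndex` and its methods are hypothetical, nothing like it exists in HARP or hishel as far as this proposal states, and the glob semantics come from the standard-library `fnmatch` module.

```python
from fnmatch import fnmatchcase


class PathIndex:
    """Hypothetical side index: (host, path) -> cache key of the stored response."""

    def __init__(self) -> None:
        self._keys: dict[tuple[str, str], str] = {}

    def record(self, host: str, path: str, key: str) -> None:
        """Remember the key under which a response for host + path was cached."""
        self._keys[(host, path)] = key

    def match(self, host: str, pattern: str) -> list[str]:
        """Return the keys whose path matches the glob pattern on this exact host."""
        return [
            key
            for (entry_host, entry_path), key in self._keys.items()
            # fnmatchcase is case-sensitive on every platform; "*" also matches "/".
            if entry_host == host and fnmatchcase(entry_path, pattern)
        ]


index = PathIndex()
index.record("api.example.com", "/api/users/123", "key-1")
index.record("api.example.com", "/api/users/456", "key-2")
index.record("api.example.com", "/api/orders/1", "key-3")
```

A wildcard PURGE /api/users/* would then delete every key returned by `index.match("api.example.com", "/api/users/*")`. Two caveats with this sketch: `fnmatch` also treats `?` and `[` as special characters, and the index must be kept in sync with TTL expiry, which is part of why the feature is deferred.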