feat(traffic): scaffold Cloudflare Worker traffic source (foundation) #635
Draft
arberx wants to merge 1 commit into
First slice of the Cloudflare adapter: contracts, schema, and the push-receive integration package. The HTTP route + CLI + doctor checks land in a follow-up PR.

Why a Worker instead of GraphQL Analytics or Logpush: Cloudflare's GraphQL API is aggregate-only and Logpush is Business+ only, so the Worker is the universal raw-row access path. Also unblocks future "Cloudflare-as-proxy" support for hosts with no native logs.

This is the first push-receive traffic source; every existing adapter pulls. Safe because canonry is single-tenant per deployment; the Worker only ever talks to the operator's own canonry instance.

Includes:
- Zod schemas for the source config, connect request/response, ingest payload, and the per-event shape
- integration-cloudflare-worker package with HMAC-SHA256 signature verifier, event → NormalizedTrafficRequest normalizer, and Worker script generator (broad edge-side bot/referer filter; strict classification stays server-side in integration-traffic)
- traffic_sources columns ingest_token_hash + last_worker_version (migration v67)
- plans/cloudflare-worker-traffic-source.md design doc
- Tests written first (TDD): 43 tests in the new package, 26 contract schema cases, 3 DB column round-trip tests

Full workspace test passes (3381/3381).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
First slice of the Cloudflare adapter — contracts, DB schema, and the push-receive integration package. The HTTP route + CLI + doctor checks land in a follow-up PR.
Design doc: plans/cloudflare-worker-traffic-source.md
Why Worker push instead of GraphQL Analytics or Logpush
Cloudflare's GraphQL API is aggregate-only and Logpush is Business+ only, so the Worker is the universal raw-row access path.
Also unlocks the future "Cloudflare-as-proxy" story for hosts with no native logs (Shopify, Webflow, Ghost, etc.): once a customer has canonry's Worker on their zone, that zone is a fully ingestible traffic source regardless of where the site is actually hosted.
Why push-receive
This is the first push-receive traffic source: every existing adapter (cloud-run, vercel, wordpress) pulls. The principle in plans/server-side-ai-traffic-ingestion.md (no canonry-hosted endpoint in the hot path) is preserved because canonry is single-tenant per deployment: the Worker only ever talks to the operator's own canonry instance, never to a canonry-hosted SaaS relay.
What's in this PR
- Zod schemas (packages/contracts/src/traffic.ts):
  - cloudflareWorkerSourceConfigSchema
  - trafficConnectCloudflareRequestSchema / trafficConnectCloudflareResponseSchema
  - cloudflareWorkerEventSchema / cloudflareWorkerIngestRequestSchema
- Package @ainyc/canonry-integration-cloudflare-worker:
  - generateWorkerScript: produces the JS string with embedded source-id, bearer, HMAC secret, version, and bot keyword constants
  - generateWranglerToml: companion wrangler.toml for operators who prefer wrangler deploy
  - verifyRequestSignature: HMAC-SHA256 + ±300s timestamp window, constant-time comparison
  - normalizeCloudflareWorkerEvent: Worker event → provider-neutral NormalizedTrafficRequest
  - DEFAULT_BOT_LIST: versioned edge-side keyword set
- DB schema (packages/db/src/schema.ts + migration v67):
  - traffic_sources.ingest_token_hash (sha256 of the per-source bearer)
  - traffic_sources.last_worker_version (drives the future cloudflare.worker.version-stale doctor check)
Two-tier filtering
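As a rough illustration of the edge-side tier in this section, here is a minimal sketch of a broad keyword-plus-score filter. Everything in it is an assumption made for illustration: the `EdgeRequestInfo` shape, the `shouldForward` name, the keyword list (which is not the PR's DEFAULT_BOT_LIST), and the default threshold. It also assumes a Cloudflare bot score where lower values mean "more likely automated".

```typescript
// Hypothetical shape of the fields the Worker would inspect at the edge.
interface EdgeRequestInfo {
  userAgent: string;
  referer: string;
  botScore?: number; // assumed: lower score = more likely automated; may be absent
}

// Illustrative keywords only; NOT the PR's DEFAULT_BOT_LIST.
const BROAD_KEYWORDS: string[] = ["bot", "crawler", "spider", "gpt", "perplexity"];

// Returns true when the request looks interesting enough to forward to canonry.
// Deliberately broad: strict classification is left to the server-side tier.
function shouldForward(
  info: EdgeRequestInfo,
  keywords: string[] = BROAD_KEYWORDS,
  scoreThreshold: number = 30, // assumed default, configurable per deployment
): boolean {
  const haystack = `${info.userAgent} ${info.referer}`.toLowerCase();
  if (keywords.some((k) => haystack.includes(k))) {
    return true; // UA or referer matched a broad keyword
  }
  // Fall back to the bot score when one is available.
  return info.botScore !== undefined && info.botScore < scoreThreshold;
}
```

Because the match is a substring test over broad categories, a newly named crawler whose user agent contains "bot" would already be forwarded without redeploying the Worker; deciding which operator it belongs to is the server's job.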
The edge-side filter is generic and stable (broad UA/referer keywords + Cloudflare bot signals), so the Worker only needs redeploys when the category of signal changes, not when individual bot names are added to canonry's list. The strict bot/operator classification stays server-side in packages/integration-traffic.
Secrets
- Per-source bearer: sha256 hash stored in the DB (traffic_sources.ingest_token_hash), cleartext in ~/.canonry/config.yaml, embedded in the Worker script at generation time
- … ~/.canonry/config.yaml, embedded in the Worker script
Test plan
Written TDD (red → green for every unit). All workspace tests pass (3381/3381).
- … waitUntil used, all 4 documented headers emitted, HMAC-SHA256 via SubtleCrypto, POST method, JS parses, custom score threshold, default bot list shape)
Out of scope (next PR)
- HTTP routes: POST /traffic/connect/cloudflare, POST /traffic/cloudflare/ingest, POST /traffic/cloudflare/rotate/:sourceId
- CLI: canonry traffic connect cloudflare, traffic rotate cloudflare, traffic verify cloudflare
- Doctor checks: cloudflare.worker.last-seen, cloudflare.worker.version-stale, cloudflare.worker.signature-failures
🤖 Generated with Claude Code
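For reviewers, the three checks attributed to verifyRequestSignature (HMAC-SHA256, a ±300s timestamp window, constant-time comparison) can be sketched as follows. This is not the package's code. The function names, the hex encoding, and especially the signed string (`timestamp.body`) are assumptions, since the PR description does not state the wire format.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Replay window from the PR description: ±300 seconds.
const MAX_SKEW_SECONDS = 300;

// ASSUMPTION: signature = hex(HMAC-SHA256(secret, `${timestamp}.${body}`)).
function signPayload(secret: string, timestamp: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// Hypothetical verifier; the name is chosen to avoid implying it is the real one.
function verifySignatureSketch(
  secret: string,
  timestamp: number,
  body: string,
  signatureHex: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  // 1. Reject timestamps outside the ±300s window (limits replay).
  if (!Number.isFinite(timestamp) || Math.abs(nowSeconds - timestamp) > MAX_SKEW_SECONDS) {
    return false;
  }
  // 2. Recompute the expected HMAC over the same signed string.
  const expected = Buffer.from(signPayload(secret, timestamp, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // 3. timingSafeEqual throws on unequal lengths, so compare lengths first,
  //    then compare the bytes in constant time.
  if (received.length !== expected.length) {
    return false;
  }
  return timingSafeEqual(received, expected);
}
```

Passing `nowSeconds` explicitly keeps the window check deterministic in tests, for example `verifySignatureSketch(secret, ts, body, sig, ts + 301)` returns false for an otherwise valid signature.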