feat(traffic): add Cloud Run puller foundation #386
Merged
Required by scripts/check-docs.sh — every package under packages/ must have AGENTS.md and CLAUDE.md.
Local probe that pulls Cloud Run request logs and GA4 AI-referral rows over the same window and surfaces the gap: per-AI-surface comparison (CR referer hits vs GA sessions), path-level join with crawled/clicked verdicts, and crawler-only summary. Reuses the existing Cloud Run puller, traffic classifier, and GA4 client; supports live (canonry project lookup or manual service-account JSON) and offline fixture modes for the replay loop before persistence/API/CLI surfaces land.
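A path-level join of this kind could be sketched roughly as follows. The input shapes, the `joinByPath` name, and the verdict rule (any AI-referral hit or GA session means "clicked", crawler hits alone mean "crawled") are assumptions made for illustration; the probe's real types and rules are not shown in this PR.

```typescript
// Hypothetical input shapes; the probe's real types are not shown here.
interface CloudRunHit {
  path: string;
  kind: "crawler" | "ai-referral";
}

interface GaRow {
  path: string;
  sessions: number;
}

type Verdict = "clicked" | "crawled" | "none";

interface PathJoinRow {
  path: string;
  crawlerHits: number;
  referralHits: number;
  gaSessions: number;
  verdict: Verdict;
}

// Join Cloud Run hits and GA4 rows on path, then assign a verdict per path.
function joinByPath(hits: CloudRunHit[], gaRows: GaRow[]): PathJoinRow[] {
  const rows = new Map<string, PathJoinRow>();

  // Fetch the row for a path, creating a zeroed row on first sight.
  const rowFor = (path: string): PathJoinRow => {
    let row = rows.get(path);
    if (!row) {
      row = { path, crawlerHits: 0, referralHits: 0, gaSessions: 0, verdict: "none" };
      rows.set(path, row);
    }
    return row;
  };

  for (const hit of hits) {
    const row = rowFor(hit.path);
    if (hit.kind === "crawler") row.crawlerHits += 1;
    else row.referralHits += 1;
  }
  for (const ga of gaRows) {
    rowFor(ga.path).gaSessions += ga.sessions;
  }

  // Assumed rule: click evidence from either side wins over crawl evidence.
  rows.forEach((row) => {
    if (row.referralHits > 0 || row.gaSessions > 0) row.verdict = "clicked";
    else if (row.crawlerHits > 0) row.verdict = "crawled";
  });

  return Array.from(rows.values()).sort((a, b) => a.path.localeCompare(b.path));
}
```

In this sketch, a row with `gaSessions > 0` and `referralHits === 0` is where the gap between GA4 and the Cloud Run logs would show up.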
The hand-rolled per-event matching only checked the Referer header, so UTM-only AI clicks (e.g. links from the ChatGPT app that strip the referer but tag ?utm_source=chatgpt.com) showed up in the probe totals but were missed by the source comparison and path join. On a 24h ainyc.ai pull both ChatGPT clicks were UTM-only, so the comparison reported 0 Cloud Run hits vs 1 GA session, which is the wrong direction. Switch to classifyAiReferral / classifyCrawler from @ainyc/canonry-integration-traffic, which already handle referer + UTM, so the path join now agrees with the probe totals. Source comparison is now grouped by product (ChatGPT, Copilot, …) instead of by rule, so the chatgpt.com / chat.openai.com double-row goes away, and each row exposes a referer/utm breakdown so it's obvious which evidence channel produced the click.
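To make the referer-plus-UTM idea concrete, here is a minimal sketch. It is not the real classifyAiReferral from @ainyc/canonry-integration-traffic: the `classifyAiReferralSketch` and `groupByProduct` names, the host table, and the row shape are all hypothetical stand-ins for whatever the package actually does.

```typescript
type Evidence = "referer" | "utm";

interface AiReferralMatch {
  product: string;
  evidence: Evidence;
}

// Illustrative host -> product table (an assumption, not the package's rule set).
const HOST_TO_PRODUCT = new Map<string, string>([
  ["chatgpt.com", "ChatGPT"],
  ["chat.openai.com", "ChatGPT"],
  ["copilot.microsoft.com", "Copilot"],
]);

function productForHost(host: string): string | undefined {
  return HOST_TO_PRODUCT.get(host.toLowerCase().replace(/^www\./, ""));
}

// Check the Referer header first, then fall back to ?utm_source=...
function classifyAiReferralSketch(requestUrl: string, referer?: string): AiReferralMatch | undefined {
  if (referer) {
    try {
      const product = productForHost(new URL(referer).hostname);
      if (product) return { product, evidence: "referer" };
    } catch {
      // Malformed referer: fall through to the UTM check.
    }
  }
  try {
    // The base URL only matters when requestUrl is a bare path.
    const utm = new URL(requestUrl, "https://placeholder.invalid").searchParams.get("utm_source");
    const product = utm ? productForHost(utm) : undefined;
    if (product) return { product, evidence: "utm" };
  } catch {
    // Malformed request URL: treat as unclassified.
  }
  return undefined;
}

interface ProductRow {
  product: string;
  referer: number;
  utm: number;
  total: number;
}

// Group by product rather than by matching rule, keeping a per-channel breakdown.
function groupByProduct(events: Array<{ requestUrl: string; referer?: string }>): ProductRow[] {
  const rows = new Map<string, ProductRow>();
  for (const event of events) {
    const match = classifyAiReferralSketch(event.requestUrl, event.referer);
    if (!match) continue;
    const row = rows.get(match.product) ?? { product: match.product, referer: 0, utm: 0, total: 0 };
    row[match.evidence] += 1;
    row.total += 1;
    rows.set(match.product, row);
  }
  return Array.from(rows.values());
}
```

With a table like this, a chatgpt.com UTM click and a chat.openai.com referer click land in a single ChatGPT row with `referer: 1, utm: 1` rather than in two rule-level rows.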
Summary
Starts the server-side traffic ingestion stack with the provider-neutral model and Cloud Run / Cloud Logging adapter foundation.
This PR intentionally does not add public API, CLI, MCP, or dashboard traffic surfaces yet. The new code is a reusable integration package and shared contract layer so the next stacked PR can add persistence and public reads/writes in one complete API/CLI slice.
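As a rough illustration of what the adapter layer does, the sketch below maps a Cloud Logging entry for a `cloud_run_revision` resource onto a provider-neutral request record. The input field names follow the public LogEntry and HttpRequest resources; the output shape `NormalizedRequestSketch` and the `normalizeEntry` helper are hypothetical, and the real NormalizedTrafficRequest in @ainyc/canonry-contracts may look quite different.

```typescript
// Subset of the Cloud Logging LogEntry / HttpRequest fields used by this sketch.
interface HttpRequestLog {
  requestMethod?: string;
  requestUrl?: string;
  status?: number;
  userAgent?: string;
  referer?: string;
}

interface LogEntryLike {
  timestamp?: string;
  resource?: { type?: string; labels?: Record<string, string> };
  httpRequest?: HttpRequestLog;
}

// Hypothetical provider-neutral shape (an assumption, not the real contract).
interface NormalizedRequestSketch {
  provider: "cloud-run";
  timestamp: string;
  method: string;
  host: string;
  path: string;
  query: string;
  status: number | null;
  userAgent: string | null;
  referer: string | null;
  service: string | null;
}

// Returns null for entries that are not usable Cloud Run request logs.
function normalizeEntry(entry: LogEntryLike): NormalizedRequestSketch | null {
  const req = entry.httpRequest;
  if (entry.resource?.type !== "cloud_run_revision") return null;
  if (!req?.requestUrl || !entry.timestamp) return null;

  let url: URL;
  try {
    url = new URL(req.requestUrl);
  } catch {
    return null; // Skip entries whose request URL cannot be parsed.
  }

  return {
    provider: "cloud-run",
    timestamp: entry.timestamp,
    method: req.requestMethod ?? "GET",
    host: url.hostname,
    path: url.pathname,
    query: url.search,
    status: req.status ?? null,
    userAgent: req.userAgent ?? null,
    referer: req.referer ?? null,
    service: entry.resource?.labels?.service_name ?? null,
  };
}
```

Keeping the query string on the normalized record matters for the downstream classifier, since UTM-only clicks carry their only evidence there.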
What changed
- NormalizedTrafficRequest in @ainyc/canonry-contracts.
- @ainyc/canonry-integration-cloud-run with:
  - cloud_run_revision;
  - entries.list pull support with page tokens;
  - LogEntry.httpRequest into Canonry request evidence.
- @ainyc/canonry-integration-traffic with local, provider-neutral crawler/referrer classification and hourly rollups over normalized request events.
- scripts/test-cloud-run-traffic-pull.ts plus a fixture so we can test pull -> normalize -> ingest -> analyze before wiring Canonry DB/API/CLI surfaces.
- 3.3.0.

Local probe
Fixture mode:
Real Cloud Run logs:
Use --narrow-bots only when testing crawler detection specifically; it lowers Cloud Logging volume but misses human AI referrals.

Validation
- pnpm run typecheck
- pnpm run lint
- pnpm -r --no-bail run test
- pnpm --filter @ainyc/canonry-integration-cloud-run test -- --runInBand
- pnpm --filter @ainyc/canonry-integration-cloud-run typecheck
- pnpm exec eslint scripts/test-cloud-run-traffic-pull.ts packages/integration-cloud-run/src packages/integration-cloud-run/test
- pnpm exec tsc --noEmit --target ES2022 --lib ES2022,DOM,DOM.Iterable --module NodeNext --moduleResolution NodeNext --types node --skipLibCheck scripts/test-cloud-run-traffic-pull.ts
- pnpm tsx scripts/test-cloud-run-traffic-pull.ts --gcp-project openclaw-nyc --service openclaw-nyc --location us-east1 --since 6h --url-contains ainyc.ai --use-gcloud --page-size 1000 --max-pages 3 --out .tmp/ainyc-cloud-run-traffic-report.json
- pnpm tsx scripts/test-cloud-run-traffic-pull.ts --gcp-project openclaw-nyc --service openclaw-nyc --location us-east1 --since 6h --url-contains ainyc.ai --narrow-bots --use-gcloud --page-size 1000 --max-pages 3 --out .tmp/ainyc-cloud-run-traffic-bots-report.json
- pnpm -r run lint