
fix(ga): dedupe AI referral rows to winning attribution dimension #430

Merged
arberx merged 1 commit into main from arberx/issue-428 on May 8, 2026

@arberx (Member) commented May 8, 2026

Summary

GA4 emits one row per attribution dimension (session, first_user, manual_utm) for the same visit, so /ga/traffic returned three rows for what users perceive as one source; the bug report showed 1 source, 2 sessions, and 6 rows. The chart, report, and aggregate counters already deduped; the detail tables were the missing case. This PR collapses each group to its winning-dimension row, the one with the highest sessions, per (source, medium), and per (source, medium, landingPage) for the landing-page table. The fix is applied at the API so CLI, web, and MCP consumers all see the corrected shape. It also adds a regression test that seeds three dimensions for the same source and asserts only the winner survives.
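
For readers who want the shape of the dedup without opening the diff, here is a minimal sketch of the idea, not the code merged in this PR. It assumes the winner is the dimension with the highest sessions, as the commit message states. The `TrafficRow` type, its field names, the `dedupeToWinner` helper, the key functions, and the sample data are all hypothetical. Tie-breaking (first row seen wins) is also an assumption, since the PR does not specify it.

```typescript
// Hypothetical row shape; field names are illustrative, not the project's actual types.
type AttributionDimension = "session" | "first_user" | "manual_utm";

interface TrafficRow {
  source: string;
  medium: string;
  landingPage?: string;
  dimension: AttributionDimension;
  sessions: number;
}

// For each key, keep only the row with the highest session count.
// On a tie the first row seen is kept (an assumption, not confirmed by the PR).
function dedupeToWinner(
  rows: readonly TrafficRow[],
  keyOf: (row: TrafficRow) => string,
): TrafficRow[] {
  const winners = new Map<string, TrafficRow>();
  for (const row of rows) {
    const key = keyOf(row);
    const current = winners.get(key);
    if (current === undefined || row.sessions > current.sessions) {
      winners.set(key, row);
    }
  }
  return Array.from(winners.values());
}

// Grouping keys for the two tables described above.
// JSON.stringify avoids collisions that a plain delimiter join could cause.
const bySourceMedium = (r: TrafficRow): string =>
  JSON.stringify([r.source, r.medium]);
const bySourceMediumLandingPage = (r: TrafficRow): string =>
  JSON.stringify([r.source, r.medium, r.landingPage ?? ""]);

// Illustrative data: one source reported under all three dimensions.
const sampleRows: TrafficRow[] = [
  { source: "chatgpt.com", medium: "referral", landingPage: "/", dimension: "session", sessions: 2 },
  { source: "chatgpt.com", medium: "referral", landingPage: "/", dimension: "first_user", sessions: 1 },
  { source: "chatgpt.com", medium: "referral", landingPage: "/", dimension: "manual_utm", sessions: 0 },
];

const dedupedSources = dedupeToWinner(sampleRows, bySourceMedium);
const dedupedLandingPages = dedupeToWinner(sampleRows, bySourceMediumLandingPage);
console.log(dedupedSources.map((r) => r.dimension));
```

With the sample data, both calls return a single row, the `session` one, because it has the most sessions. Passing the key as a function lets the same reducer serve both the (source, medium) table and the (source, medium, landingPage) table.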

Closes #428

Test plan

  • pnpm typecheck passes
  • pnpm lint passes
  • pnpm test — all 2090 tests pass, including the new dedup regression in packages/api-routes/test/ga.test.ts
  • Verify the "Known AI referrers" tables on a real project show one row per source instead of one row per attribution dimension

GA4 emits one row per attribution dimension (session, first_user,
manual_utm), but those are overlapping lenses on the same visit. The
detail tables in /ga/traffic returned all three, inflating the row
count (e.g. 1 source, 2 sessions, 6 rows). The chart, report, and
aggregate counters already deduped — the tables were the missing case.

Fix at the API so CLI, web, and MCP all see the same shape: keep the
dimension with the highest sessions per (source, medium) and per
(source, medium, landingPage). Adds a regression test that seeds
multi-dimension rows and asserts only the winner is returned.

Closes #428
@arberx arberx merged commit 36efd9e into main May 8, 2026
2 checks passed
@arberx arberx deleted the arberx/issue-428 branch May 8, 2026 00:30

Development

Successfully merging this pull request may close these issues.

TrafficSection: AI referrer tables show duplicate rows from GA4 attribution dimensions
