GSC auth is operationally brittle: connected tokens can still fail with hidden 403 property permission errors #346

@arberx

Summary

GSC auth in Canonry is operationally brittle. A project can show as connected, hold a valid refresh token, and refresh access tokens successfully, yet later canonry google refresh <project> runs and GSC syncs still fail with a 403 permission error on the selected property. In practice the operator ends up reconnecting GSC repeatedly, because Canonry never states which principal lost access or how to fix it.

Reproduction / observed behavior

Project: ainyc
Property: sc-domain:ainyc.ai

Observed from the live install:

canonry google refresh ainyc --format json

Returns:

{
  "error": {
    "code": "GOOGLE_REFRESH_FAILED",
    "message": "GSC refresh failed.",
    "details": {
      "runId": "a0adb808-9386-49aa-81c2-bbb8d82373e9",
      "status": "failed",
      "project": "ainyc"
    }
  }
}

The underlying run record shows the real error:

GSC API error (403): {
  "error": {
    "code": 403,
    "message": "User does not have sufficient permission for site 'sc-domain:ainyc.ai'. See also: https://support.google.com/webmasters/answer/2451999.",
    "errors": [
      {
        "message": "User does not have sufficient permission for site 'sc-domain:ainyc.ai'. See also: https://support.google.com/webmasters/answer/2451999.",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }
}

At the same time, the stored Canonry Google connection still looks healthy at a glance:

  • domain: ainyc.ai
  • connectionType: gsc
  • propertyId: sc-domain:ainyc.ai
  • scopes include:
    • https://www.googleapis.com/auth/indexing
    • https://www.googleapis.com/auth/webmasters.readonly
  • refresh token present
  • access token refresh still succeeds and updates tokenExpiresAt

So the current UX is:

  • Canonry says the project is connected
  • token refresh works
  • actual GSC calls fail with forbidden
  • operator is left to infer that the authenticated Google principal no longer has access to the selected property

Why this is a product issue

This is not just an external Google failure. Canonry currently hides the critical operator-facing diagnosis:

  • which Google principal is actually being used for GSC
  • whether that principal can still access the selected property
  • whether the problem is token expiry vs revoked access vs wrong property selection

That makes ongoing operations feel flaky, because the practical remediation path becomes “reconnect until it works”.

Current implementation details contributing to this

A few implementation choices make this harder to operate:

  1. canonry google status shows only connection type, property, createdAt, and updatedAt

    • It does not show the authenticated principal/email/account identity.
    • It does not show last auth validation result.
  2. OAuth callback stores tokens/scopes/property but not the identity behind the token

    • packages/api-routes/src/google.ts
    • connection record persists tokens and scopes, but not the granted account email / subject.
  3. gsc-sync refreshes the access token and proceeds straight into API calls

    • packages/canonry/src/gsc-sync.ts
    • when Google returns 403 for the property, the run fails generically.
  4. CLI error surface is too generic for operators

    • canonry google refresh ... reports GOOGLE_REFRESH_FAILED
    • the actionable explanation is buried in run storage instead of surfaced directly.

Expected behavior

Canonry should make GSC auth state explicit and durable:

  1. Persist and display the principal identity

    • store the Google account email / subject granted during OAuth
    • show it in canonry google status <project>
  2. Add explicit auth preflight / validation

    • before or during sync, validate access to the selected property
    • classify results as one of:
      • connected and authorized
      • connected but property access forbidden
      • connected but property missing / mismatched
      • token invalid / refresh failed
  3. Improve CLI/operator error messages

    • surface the real failure inline, e.g.
      • GSC property access forbidden for sc-domain:ainyc.ai
      • Authorized principal: <email>
      • Fix: grant this principal access in Search Console or reconnect using an account that already has access
  4. Track connection health separately from token freshness

    • a token can be refreshable while the principal is unauthorized for the property
    • those are different states and should be shown differently
  5. Reduce repeated noisy failures

    • once Canonry knows a connection is unauthorized, status/reporting should surface an auth-health incident instead of repeating generic sync failures
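
To illustrate items 2 and 4, here is a minimal TypeScript sketch that keeps token freshness and property authorization as two independent inputs and derives the operator-facing state from both. All type and function names here are hypothetical and do not come from the Canonry codebase; the "not yet validated" state is an added assumption for connections that have never been checked.

```typescript
// Hypothetical types: none of these names exist in Canonry today.
type TokenState = "fresh" | "refresh_failed";
type PropertyAccess = "authorized" | "forbidden" | "missing" | "unchecked";

type ConnectionState =
  | "connected and authorized"
  | "connected but property access forbidden"
  | "connected but property missing / mismatched"
  | "token invalid / refresh failed"
  | "connected, property access not yet validated"; // assumed extra state

function deriveConnectionState(
  token: TokenState,
  access: PropertyAccess,
): ConnectionState {
  // Token problems take priority: without a usable token,
  // property access cannot be checked at all.
  if (token === "refresh_failed") return "token invalid / refresh failed";

  switch (access) {
    case "authorized":
      return "connected and authorized";
    case "forbidden":
      return "connected but property access forbidden";
    case "missing":
      return "connected but property missing / mismatched";
    case "unchecked":
      return "connected, property access not yet validated";
  }
}
```

The ainyc case above would map to deriveConnectionState("fresh", "forbidden"), which is exactly the state the current status output cannot express.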

Suggested implementation

Connection model

Extend Google connection records to include fields like:

  • principalEmail
  • principalSubject or grantedAccountId if available
  • lastValidatedAt
  • lastValidationStatus (ok, forbidden, property_missing, token_invalid, etc.)
  • lastValidationError
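
As a rough sketch of what these fields could look like in TypeScript: the field names follow the list above, while the interface name, the nullable types, the "unknown" status, and the recordValidation helper are assumptions made for illustration, not the existing schema.

```typescript
// Hypothetical shape for the extra connection fields.
type ValidationStatus =
  | "ok"
  | "forbidden"
  | "property_missing"
  | "token_invalid"
  | "unknown"; // assumed default before the first validation

interface GoogleConnectionHealth {
  principalEmail: string | null;
  principalSubject: string | null;
  lastValidatedAt: string | null; // ISO 8601 timestamp
  lastValidationStatus: ValidationStatus;
  lastValidationError: string | null;
}

// Returns a new record with the validation outcome applied.
// The error text is cleared whenever the status is "ok".
function recordValidation(
  prev: GoogleConnectionHealth,
  status: ValidationStatus,
  error: string | null,
  now: Date = new Date(),
): GoogleConnectionHealth {
  return {
    ...prev,
    lastValidatedAt: now.toISOString(),
    lastValidationStatus: status,
    lastValidationError: status === "ok" ? null : error,
  };
}
```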

Validation path

On OAuth completion, or on first refresh after connect:

  • call sites.list
  • verify the selected propertyId is present and capture permission level
  • persist the result

Before gsc-sync:

  • if validation status is stale or unknown, revalidate
  • on 403, update connection health fields with a classified auth failure
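
The validation step could look roughly like the sketch below. It assumes the caller has already fetched the siteEntry array from the Search Console sites.list response (each entry carries siteUrl and permissionLevel) and only covers the classification and staleness logic. The function names are hypothetical, and treating siteUnverifiedUser as forbidden is an assumption that should be checked against real API behavior.

```typescript
// One entry from the siteEntry array of a sites.list response.
interface SiteEntry {
  siteUrl: string;
  permissionLevel: string;
}

type ValidationResult =
  | { status: "ok"; permissionLevel: string }
  | { status: "forbidden"; permissionLevel: string }
  | { status: "property_missing" };

// Hypothetical helper: checks whether the selected property is visible
// to the authenticated principal and with which permission level.
function validatePropertyAccess(
  entries: SiteEntry[],
  propertyId: string,
): ValidationResult {
  const match = entries.find((e) => e.siteUrl === propertyId);
  if (!match) return { status: "property_missing" };

  // Assumption: an unverified user cannot read Search Console data.
  if (match.permissionLevel === "siteUnverifiedUser") {
    return { status: "forbidden", permissionLevel: match.permissionLevel };
  }
  return { status: "ok", permissionLevel: match.permissionLevel };
}

// Hypothetical staleness check used before gsc-sync decides to revalidate.
function isValidationStale(
  lastValidatedAt: string | null,
  maxAgeMs: number,
  now: Date = new Date(),
): boolean {
  if (lastValidatedAt === null) return true; // never validated
  return now.getTime() - new Date(lastValidatedAt).getTime() > maxAgeMs;
}
```

Keeping the classification pure (no network calls) means the same function can run after OAuth completion and inside gsc-sync, and can be unit-tested with recorded sites.list payloads.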

CLI / API UX

  • canonry google status <project> should show:
    • property
    • principal
    • permission / validation state
    • last validated at
  • canonry google refresh/sync should bubble up the classified auth error directly
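
A sketch of how the classified error could be rendered for the CLI, reusing the wording proposed under Expected behavior for the forbidden case. The ClassifiedAuthError type, the formatAuthError function, and the messages for the other two states are hypothetical; canonry google connect in the last message is a placeholder for whatever the real reconnect command is.

```typescript
// Hypothetical error shape passed from gsc-sync to the CLI layer.
interface ClassifiedAuthError {
  status: "forbidden" | "property_missing" | "token_invalid";
  propertyId: string;
  principalEmail: string | null;
}

function formatAuthError(err: ClassifiedAuthError): string {
  // Fall back to an explicit marker when the identity was never stored.
  const principal = err.principalEmail ?? "unknown (identity not recorded)";

  switch (err.status) {
    case "forbidden":
      return [
        `GSC property access forbidden for ${err.propertyId}`,
        `Authorized principal: ${principal}`,
        "Fix: grant this principal access in Search Console or reconnect using an account that already has access",
      ].join("\n");
    case "property_missing":
      return [
        `GSC property ${err.propertyId} is not visible to the authorized principal`,
        `Authorized principal: ${principal}`,
        "Fix: select a property this principal can access, or reconnect with a different account",
      ].join("\n");
    case "token_invalid":
      return [
        "GSC token is invalid or could not be refreshed",
        `Authorized principal: ${principal}`,
        // Placeholder command name, not a confirmed Canonry CLI command.
        "Fix: reconnect Google for this project (e.g. canonry google connect)",
      ].join("\n");
  }
}
```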

Documentation / operator enablement

Canonry should also ship an Agent Skill for Google auth setup and recovery, specifically so an agent and operator can configure this correctly together.

The skill should help with:

  • choosing the right auth path (GSC OAuth vs GA service account vs indexing scope)
  • verifying the exact redirect URI Canonry expects
  • listing accessible GSC properties after auth
  • selecting the correct property
  • confirming the granted Google principal has Search Console access
  • diagnosing 401 vs 403 vs property mismatch
  • explaining exact remediation steps in operator language

Right now this setup is too easy to misconfigure and too hard to diagnose from the normal CLI surface.
