feat: add aeorank-cli doctor — diagnose AI crawler access in 30s #5

@vinpatel

Description

The #1 invisible-to-ChatGPT failure mode is a site that blocks AI crawlers in `robots.txt` or via firewall/WAF rules (Cloudflare Bot Fight Mode, AWS WAF AI-bot blocklists). Users don't realize they're blocked until they run the full scanner.

A `doctor` command would diagnose this in 30 seconds without a full scan.

Scope

`aeorank-cli doctor <url>` should:

  1. Fetch `robots.txt` and list every AI agent (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider, Amazonbot, …) as allowed / blocked / unspecified.
  2. Make a HEAD request as each agent's UA and report the status code.
  3. Flag Cloudflare / AWS WAF signals in response headers (`cf-mitigated`, `x-amzn-waf-action`).
  4. Check for common firewall blocks on `/llms.txt` specifically (many WAFs serve 403 for unknown paths with .txt extensions).
  5. Print a clean table and a one-line verdict: "✅ All major AI crawlers can reach your site" or "❌ 4 of 9 AI crawlers blocked — see details".
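
Step 1 could start from something like the sketch below. It is a simplified illustration, not the project's implementation: `parseGroups`, `classifyAgents`, and the `AI_AGENTS` list are hypothetical names (the real agent list would come from the existing `packages/core` constants), and it only looks at root-level `Disallow: /` and `Allow: /` rules instead of doing full path matching.

```typescript
type Access = "allowed" | "blocked" | "unspecified";

// Hypothetical list for illustration; the real one lives in packages/core.
const AI_AGENTS = [
  "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
  "CCBot", "Bytespider", "Amazonbot",
];

interface Rule { field: string; value: string }
interface Group { agents: string[]; rules: Rule[] }

// Split robots.txt into groups: consecutive User-agent lines share
// the Allow/Disallow rules that follow them.
function parseGroups(robotsTxt: string): Group[] {
  const groups: Group[] = [];
  let current: Group | null = null;
  let lastWasAgent = false;
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // drop comments
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === "user-agent") {
      if (!current || !lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if (current && (field === "allow" || field === "disallow")) {
      current.rules.push({ field, value });
      lastWasAgent = false;
    }
  }
  return groups;
}

// Assumption: an agent is "unspecified" when no group names it explicitly,
// even if a wildcard (*) group exists.
function classifyAgents(
  robotsTxt: string,
  agents: string[] = AI_AGENTS,
): Record<string, Access> {
  const groups = parseGroups(robotsTxt);
  const result: Record<string, Access> = {};
  for (const agent of agents) {
    const rules: Rule[] = [];
    let named = false;
    for (const g of groups) {
      if (g.agents.includes(agent.toLowerCase())) {
        named = true;
        rules.push(...g.rules);
      }
    }
    if (!named) {
      result[agent] = "unspecified";
      continue;
    }
    const rootBlocked = rules.some((r) => r.field === "disallow" && r.value === "/");
    const rootAllowed = rules.some((r) => r.field === "allow" && r.value === "/");
    result[agent] = rootBlocked && !rootAllowed ? "blocked" : "allowed";
  }
  return result;
}
```

Keeping the classifier a pure function of the `robots.txt` text means it can be unit-tested without any network access. Whether a wildcard group should count as "unspecified" or be inherited is a design decision left open here.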

Acceptance

  • New command in `packages/cli/src/commands/doctor.ts`
  • Covers the AI agents in the existing `packages/core` constants
  • No false positives against aeorank.dev, vercel.com, stripe.com
  • Tests with mocked fetch
  • Docs page
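
For the "tests with mocked fetch" criterion, one option is to pass the fetch implementation into the probe as a parameter. The sketch below assumes that design; `probeAgent`, `FetchLike`, `ProbeResult`, and the `ExampleBot/1.0` user-agent string are hypothetical names for illustration, and only the two header names come from the scope above.

```typescript
// Minimal response shape the probe needs; the standard fetch Response
// also has `status` and `headers.get`.
interface ProbeResponse {
  status: number;
  headers: { get(name: string): string | null };
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<ProbeResponse>;

interface ProbeResult {
  agent: string;
  status: number;
  wafSignal: string | null; // name of the first WAF header seen, if any
}

// Header names taken from the issue scope.
const WAF_HEADERS = ["cf-mitigated", "x-amzn-waf-action"];

// Send a HEAD request with the agent's User-Agent and record the status
// code plus any WAF signal header present on the response.
async function probeAgent(
  url: string,
  agent: string,
  userAgent: string,
  fetchImpl: FetchLike,
): Promise<ProbeResult> {
  const res = await fetchImpl(url, {
    method: "HEAD",
    headers: { "User-Agent": userAgent },
  });
  const hit = WAF_HEADERS.find((h) => res.headers.get(h) !== null);
  return { agent, status: res.status, wafSignal: hit ?? null };
}

// A mocked fetch that simulates a Cloudflare challenge response.
const mockFetch: FetchLike = async () => ({
  status: 403,
  headers: {
    get: (name: string) => (name === "cf-mitigated" ? "challenge" : null),
  },
});
```

In a test, `probeAgent("https://example.com/", "ExampleBot", "ExampleBot/1.0", mockFetch)` resolves without touching the network. In the real command the global `fetch` would presumably be passed in instead.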

Meatier than a good-first-issue but still scoped — ideal for someone who wants to ship a distinct feature.

Metadata

Assignees

No one assigned

Labels

  • cli: aeorank-cli command, flag, or output
  • enhancement: New feature or request
  • help wanted: Extra attention is needed
