Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 50 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,52 @@ npx @ainyc/aeo-audit https://example.com --include-geo
npx @ainyc/aeo-audit https://example.com --include-agent-skills
```

### Platform Detection Mode

Detect what platform, CMS, framework, or static site generator a website is built on. Useful for competitor research, lead qualification, and triage before an audit.

```bash
# Identify the stack (WordPress, Webflow, Shopify, Next.js, Vercel, etc.)
npx @ainyc/aeo-audit https://example.com --detect-platform

# JSON for programmatic use
npx @ainyc/aeo-audit https://example.com --detect-platform --format json

# Only show high-confidence matches
npx @ainyc/aeo-audit https://example.com --detect-platform --min-confidence high
```

The detector inspects HTML, response headers, `<meta name="generator">`, script and link sources, and platform-specific globals to fingerprint:

- **CMS:** WordPress, Drupal, Joomla, Ghost, HubSpot, Craft CMS, Sanity, Contentful, Notion
- **Site builders:** Wix, Squarespace, Webflow, Framer, Carrd, Bubble
- **E-commerce:** Shopify, WooCommerce, BigCommerce, Magento, PrestaShop
- **Frameworks:** Next.js, Nuxt, Gatsby, Remix, Astro, SvelteKit, Angular, Vue, React, Ember, Qwik
- **Static site generators:** Hugo, Jekyll, Eleventy, Hexo, Docusaurus, MkDocs
- **Hosting / CDN:** Vercel, Netlify, Cloudflare, GitHub Pages, Fastly, AWS CloudFront

Each detected platform is reported with a confidence bucket (`high`, `medium`, `low`), a numeric score, an optional version, and the list of signals that matched. When no CMS, site builder, or e-commerce platform is found, the report flags the site as `custom-built` (framework and hosting fingerprints are still surfaced for context). Exit code is `0` when at least one platform is detected, `1` otherwise.

#### Batch detection

Pass `--urls` to fingerprint many sites in a single run. Pages are fetched with bounded concurrency (5 in flight by default; tune with `--concurrency`).

```bash
# From a file (one URL per line; # comments and blank lines are skipped)
npx @ainyc/aeo-audit --detect-platform --urls urls.txt

# Inline comma-separated list
npx @ainyc/aeo-audit --detect-platform --urls https://a.com,https://b.com,https://c.com

# From stdin
cat urls.txt | npx @ainyc/aeo-audit --detect-platform --urls -

# JSON for downstream processing
npx @ainyc/aeo-audit --detect-platform --urls urls.txt --format json
```

Per-URL fetch errors don't abort the batch — each entry is reported with `status: 'success'` or `status: 'error'`. Exit code is `0` when at least one URL succeeded, `1` otherwise.

### Sitemap Mode

Audit every page discovered from the site's sitemap with bounded concurrency (5 in flight):
Expand Down Expand Up @@ -107,6 +153,10 @@ When the sitemap has more URLs than `--limit`, the run audits the highest-priori
| `--sitemap [url]` | Audit all pages from the sitemap (auto-discovers `/sitemap.xml` or uses an explicit URL) |
| `--limit <n>` | Max pages to audit in sitemap mode (default 200, sorted by sitemap priority) |
| `--top-issues` | In sitemap mode, skip per-page output and show only cross-cutting issues |
| `--detect-platform` | Identify the platform/CMS/framework powering the site instead of running an audit |
| `--urls <src>` | In `--detect-platform` mode, run on multiple URLs. `<src>` is a file path (one URL per line), a comma-separated list, or `-` for stdin |
| `--concurrency <n>` | In `--detect-platform` batch mode, max in-flight fetches (default 5) |
| `--min-confidence <lvl>` | In platform-detect mode, only report matches at or above this level: `low` (default), `medium`, `high` |
| `-h`, `--help` | Show the help message |

Exit code `0` for score >= 70, `1` for < 70 (CI-friendly). In sitemap mode the exit code is based on the aggregate score.
Expand Down
2 changes: 1 addition & 1 deletion package.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "@ainyc/aeo-audit",
"version": "1.5.0",
"version": "1.6.0",
"description": "The most comprehensive open-source Answer Engine Optimization (AEO) audit tool. Scores websites across 13 ranking factors that determine AI citation.",
"type": "module",
"main": "./dist/index.js",
Expand Down
36 changes: 35 additions & 1 deletion skills/aeo/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ npx @ainyc/aeo-audit@1 "<url>" [flags] --format json
- `schema`: validate JSON-LD and entity consistency
- `llms`: create or improve `llms.txt` and `llms-full.txt`
- `monitor`: compare changes over time or benchmark competitors
- `detect-platform`: identify the CMS, site builder, framework, or hosting stack a site uses

If no mode is provided, default to `audit`.

Expand All @@ -55,10 +56,14 @@ If no mode is provided, default to `audit`.
- `schema https://example.com`
- `llms https://example.com`
- `monitor https://site-a.com --compare https://site-b.com`
- `detect-platform https://example.com`
- `detect-platform https://example.com --min-confidence high`
- `detect-platform --urls competitors.txt`
- `detect-platform --urls https://a.com,https://b.com`

## Mode Selection

- If the first argument is one of `audit`, `fix`, `schema`, `llms`, or `monitor`, use that mode.
- If the first argument is one of `audit`, `fix`, `schema`, `llms`, `monitor`, or `detect-platform`, use that mode.
- If no explicit mode is given, infer the intent from the request and default to `audit`.

## Audit
Expand Down Expand Up @@ -101,6 +106,35 @@ Returns:
- Aggregate score and grade
- Prioritized fixes ranked by site-wide impact

### Detect Platform Mode

Use `--detect-platform` when the user wants to know what stack a site is built on (e.g., "is this WordPress?", "what framework does competitor X use?", "is this site custom-built?"). This is much faster than a full audit because it skips analyzer scoring.

```bash
npx @ainyc/aeo-audit@1 "<url>" --detect-platform --format json
npx @ainyc/aeo-audit@1 "<url>" --detect-platform --min-confidence high --format json
```

Flags:
- `--detect-platform` — switch to detection mode instead of auditing
- `--min-confidence <lvl>` — filter to `low` (default), `medium`, or `high` confidence
- `--urls <src>` — run on multiple URLs at once (file path, comma-separated list, or `-` for stdin)
- `--concurrency <n>` — max in-flight fetches in batch mode (default 5)

The report groups detections by category (CMS, site builder, e-commerce, framework, SSG, hosting), each with a confidence bucket, a 0–100 score, an optional version, and the signals that matched. When the report's `isCustom` flag is true, no CMS/site-builder/e-commerce platform was identified — the site is likely custom-built. Exit code is `0` when at least one platform is detected, `1` otherwise.

#### Batch detection

When the user wants to fingerprint many sites at once (competitor lists, customer cohorts), pass `--urls`:

```bash
npx @ainyc/aeo-audit@1 --detect-platform --urls urls.txt --format json
npx @ainyc/aeo-audit@1 --detect-platform --urls https://a.com,https://b.com --format json
cat urls.txt | npx @ainyc/aeo-audit@1 --detect-platform --urls - --format json
```

The batch report contains a `results` array; each entry has `status: 'success'` or `'error'`, plus the same shape as a single-URL report on success. Per-URL fetch errors do not abort the run. Exit code is `0` when at least one URL succeeded, `1` otherwise.

## Fix

Use when the user wants code changes applied after the audit.
Expand Down
166 changes: 157 additions & 9 deletions src/cli.ts
Original file line number Diff line number Diff line change
@@ -1,11 +1,33 @@
import { readFile } from 'node:fs/promises'
import { runAeoAudit } from './index.js'
import { runSitemapAudit } from './sitemap.js'
import { detectPlatform, detectPlatformBatch } from './detect-platform.js'
import { isAeoAuditError } from './errors.js'
import { formatJson } from './formatters/json.js'
import { formatSitemapJson } from './formatters/json.js'
import { formatMarkdown, formatSitemapMarkdown } from './formatters/markdown.js'
import { formatText, formatSitemapText } from './formatters/text.js'
import type { AuditReport, SitemapAuditReport, SitemapAuditOptions } from './types.js'
import {
formatBatchPlatformJson,
formatJson,
formatPlatformJson,
formatSitemapJson,
} from './formatters/json.js'
import {
formatBatchPlatformMarkdown,
formatMarkdown,
formatPlatformMarkdown,
formatSitemapMarkdown,
} from './formatters/markdown.js'
import {
formatBatchPlatformText,
formatPlatformText,
formatSitemapText,
formatText,
} from './formatters/text.js'
import type {
BatchPlatformDetectionReport,
PlatformConfidence,
PlatformDetectionReport,
SitemapAuditOptions,
SitemapAuditReport,
} from './types.js'

const FORMATTERS = {
json: formatJson,
Expand All @@ -19,6 +41,18 @@ const SITEMAP_FORMATTERS = {
text: (report: SitemapAuditReport, topIssuesOnly: boolean) => formatSitemapText(report, topIssuesOnly),
}

const PLATFORM_FORMATTERS = {
json: (report: PlatformDetectionReport) => formatPlatformJson(report),
markdown: (report: PlatformDetectionReport) => formatPlatformMarkdown(report),
text: (report: PlatformDetectionReport) => formatPlatformText(report),
}

const BATCH_PLATFORM_FORMATTERS = {
json: (report: BatchPlatformDetectionReport) => formatBatchPlatformJson(report),
markdown: (report: BatchPlatformDetectionReport) => formatBatchPlatformMarkdown(report),
text: (report: BatchPlatformDetectionReport) => formatBatchPlatformText(report),
}

type FormatterName = keyof typeof FORMATTERS

interface ParsedArgs {
Expand All @@ -32,6 +66,10 @@ interface ParsedArgs {
sitemapUrl: string | null
limit: number | null
topIssues: boolean
detectPlatform: boolean
minConfidence: PlatformConfidence | null
urls: string | null
concurrency: number | null
}

function isFormatterName(value: string): value is FormatterName {
Expand All @@ -51,6 +89,10 @@ function parseArgs(argv: string[]): ParsedArgs {
sitemapUrl: null,
limit: null,
topIssues: false,
detectPlatform: false,
minConfidence: null,
urls: null,
concurrency: null,
}

for (let i = 0; i < args.length; i += 1) {
Expand Down Expand Up @@ -79,6 +121,23 @@ function parseArgs(argv: string[]): ParsedArgs {
i += 1
} else if (args[i] === '--top-issues') {
result.topIssues = true
} else if (args[i] === '--detect-platform') {
result.detectPlatform = true
} else if (args[i] === '--min-confidence' && args[i + 1]) {
const value = args[i + 1]
if (value === 'high' || value === 'medium' || value === 'low') {
result.minConfidence = value
}
i += 1
} else if (args[i] === '--urls' && args[i + 1]) {
result.urls = args[i + 1]
i += 1
} else if (args[i] === '--concurrency' && args[i + 1]) {
const num = parseInt(args[i + 1], 10)
if (Number.isFinite(num) && num > 0) {
result.concurrency = num
}
i += 1
} else if (args[i] === '--help' || args[i] === '-h') {
result.help = true
} else if (!args[i].startsWith('-')) {
Expand All @@ -89,6 +148,39 @@ function parseArgs(argv: string[]): ParsedArgs {
return result
}

export function parseUrlList(text: string): string[] {
const urls: string[] = []
for (const rawLine of text.split(/\r?\n/)) {
const line = rawLine.trim()
if (line.length === 0 || line.startsWith('#')) continue
// Allow comma-separated values on a single line too.
for (const part of line.split(',')) {
const candidate = part.trim()
if (candidate.length > 0) urls.push(candidate)
}
}
return urls
}

async function readStdin(): Promise<string> {
const chunks: Buffer[] = []
for await (const chunk of process.stdin) {
chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
}
return Buffer.concat(chunks).toString('utf-8')
}

async function resolveUrls(spec: string): Promise<string[]> {
if (spec === '-') {
return parseUrlList(await readStdin())
}
if (spec.startsWith('http://') || spec.startsWith('https://')) {
return parseUrlList(spec)
}
const text = await readFile(spec, 'utf-8')
return parseUrlList(text)
}

function printHelp() {
console.log(`
Usage: aeo-audit <url> [options]
Expand All @@ -103,6 +195,14 @@ Options:
--limit <n> Max pages to audit in sitemap mode (default 200, sorted by sitemap priority).
When the sitemap exceeds the limit, a notice is printed to stderr.
--top-issues In sitemap mode, skip per-page output and show only cross-cutting issues
--detect-platform Detect what platform/CMS/framework the site is built on (WordPress,
Webflow, Shopify, Next.js, etc.) instead of running a full audit.
--urls <src> In --detect-platform mode, run on multiple URLs. <src> can be a path
to a text file (one URL per line, # comments allowed), a comma-separated
list (e.g. https://a.com,https://b.com), or - to read from stdin.
--concurrency <n> In --detect-platform batch mode, max in-flight fetches (default 5).
--min-confidence <lvl> In platform-detect mode, only report platforms at or above this
confidence level: low (default), medium, high.
-h, --help Show this help message

Examples:
Expand All @@ -115,8 +215,16 @@ Examples:
aeo-audit https://example.com --sitemap https://example.com/sitemap.xml
aeo-audit https://example.com --sitemap --limit 10
aeo-audit https://example.com --sitemap --top-issues
aeo-audit https://example.com --detect-platform
aeo-audit https://example.com --detect-platform --format json
aeo-audit https://example.com --detect-platform --min-confidence medium
aeo-audit --detect-platform --urls urls.txt
aeo-audit --detect-platform --urls https://a.com,https://b.com --format json
cat urls.txt | aeo-audit --detect-platform --urls -

Exit code: 0 when score >= 70, 1 otherwise. In sitemap mode, the aggregate score is used.
In --detect-platform mode, exit code is 0 if any platform is detected, 1 otherwise.
In --detect-platform batch mode, exit code is 0 if at least one URL succeeded, 1 otherwise.
`)
}

Expand All @@ -128,17 +236,57 @@ export async function main(argv: string[] = process.argv): Promise<number> {
return 0
}

if (!args.url) {
console.error('Error: URL is required. Run with --help for usage.')
if (!isFormatterName(args.format)) {
console.error(`Error: Unknown format "${args.format}". Use: text, json, markdown`)
return 1
}

if (!isFormatterName(args.format)) {
console.error(`Error: Unknown format "${args.format}". Use: text, json, markdown`)
if (args.urls && !args.detectPlatform) {
console.error('Error: --urls is only supported with --detect-platform.')
return 1
}

try {
if (args.detectPlatform) {
if (args.urls) {
if (args.url) {
console.error('Error: cannot combine a positional URL with --urls. Use one or the other.')
return 1
}

const urls = await resolveUrls(args.urls)
if (urls.length === 0) {
console.error('Error: no URLs found in --urls input.')
return 1
}

const batch = await detectPlatformBatch(urls, {
minConfidence: args.minConfidence ?? undefined,
concurrency: args.concurrency ?? undefined,
})
const batchFormatter = BATCH_PLATFORM_FORMATTERS[args.format]
console.log(batchFormatter(batch))
return batch.successful > 0 ? 0 : 1
}

if (!args.url) {
console.error('Error: URL is required (or pass --urls <file|->|<url1,url2>). Run with --help for usage.')
return 1
}

const report = await detectPlatform(args.url, {
minConfidence: args.minConfidence ?? undefined,
})
const platformFormatter = PLATFORM_FORMATTERS[args.format]
console.log(platformFormatter(report))
return report.detected.length > 0 ? 0 : 1
}

if (!args.url) {
console.error('Error: URL is required. Run with --help for usage.')
return 1
}

if (args.sitemap) {
const options: SitemapAuditOptions = {
factors: args.factors,
Expand Down
Loading
Loading