feat(detect-platform): platform/CMS/framework fingerprinting mode#25
Merged
feat(detect-platform): platform/CMS/framework fingerprinting mode#25
Conversation
…mode Adds --detect-platform mode (single URL or batched via --urls) that fingerprints CMS, site builder, e-commerce, framework, SSG, and hosting stacks from HTML, headers, generator meta, and platform-specific globals. Skips auxiliary fetches in detection mode and tunes HubSpot CMS weights so tracking pixels alone don't trigger high-confidence false positives. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
--detect-platformCLI mode that fingerprints the stack a site is built on (CMS, site builder, e-commerce, framework, SSG, hosting), with single-URL and--urlsbatch flavors, bounded concurrency (--concurrency, default 5), and--min-confidencefiltering.<meta name="generator">, script/link/img sources, and platform-specific globals; emits per-platform confidence scores plus aggregatedisCustomflag, with text/markdown/JSON formatters and exit code keyed off detection success./llms.txt,/llms-full.txt,/robots.txt,/sitemap.xmlauxiliary fetches in detection mode (newfetchPage({ skipAuxiliary })option) so batch detection doesn't burn 4×N unrelated requests.<meta name="generator">remains the strong signal.detect-platformand CLI URL-list parsing.Test plan
pnpm run typecheckpnpm test(132 tests pass)pnpm lintpnpm run build && node bin/aeo-audit.js https://example.com --detect-platform--urlsfrom a file and from stdin🤖 Generated with Claude Code