OpenAPI mindset for CLIs: crawl real `--help` output, build structured CLI maps, and generate Claude Code plugins with reliable command knowledge.
- Input: any CLI binary available in your environment (`docker`, `git`, `gh`, `uv`, `pnpm`, etc.)
- Phase 1: `cli-crawler` parses help output into `output/<cli>.json`
- Phase 2: `generate-plugin` converts that JSON into `plugins/cli-<name>/`
- Result: Claude Code gets concrete, tool-specific command intelligence instead of guessing flags
LLMs are strong at reasoning, but weak on precise, current CLI syntax unless the exact tool/version is in context.
Typical failure modes when relying on model memory alone:
- hallucinated flags
- wrong argument order
- outdated commands after CLI updates
- missing required options hidden in subcommand help
This project creates a deterministic bridge between real CLI help and AI plugin context.
`--help` is necessary, but not sufficient as a direct context source for agents:
| Approach | Works for humans | Works for agents at scale | Main issue |
|---|---|---|---|
| Manual copy/paste from `--help` | Yes | No | token-heavy, inconsistent, stale quickly |
| man/online docs | Yes | Sometimes | fragmented formats, weak machine structure |
| Raw CLI help in prompt | Sometimes | Poorly | noisy, hard to navigate, expensive context |
| CLI Plugins (this project) | Yes | Yes | structured, compact, regenerable |
- Format-agnostic parsing: Parses multiple help styles (Cobra, Click, Rich-Click, Git/manpage, POSIX-like patterns) with deterministic heuristics.
- Progressive disclosure by default: Generates a compact `SKILL.md` plus on-demand references (`commands.md`, `examples.md`) to reduce context pressure.
- Safety-first help detection: Includes protections such as auth-required precedence and a non-mutating fallback policy for subcommand probing.
- Cross-platform resilience (Windows/WSL): Supports executable-name canonicalization (e.g., `.exe` identity handling) while preserving runnable invocation.
- Global-only CLI quality fallback: When a CLI has no subcommands, plugin output still provides meaningful usage examples and stable version labeling.
- Inventory drift visibility: `config-audit` compares `config.yaml`, `output/*.json`, and `plugins/cli-*` to highlight stale/missing artifacts.
- Idempotent generation: Re-runs overwrite generated outputs cleanly and predictably.
- No runtime external dependencies: Core runtime (`cli-crawler`, `generate-plugin`, `config-audit`) uses project code and a stdlib-friendly implementation.
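To make the "deterministic heuristics" idea concrete, here is a minimal sketch of turning one help layout into structured data. It is not the project's parser: the sample text, the two regular expressions, and the `parse_help` name are all assumptions chosen for illustration, and they only cover a Cobra-like "heading, then indented rows" layout.

```python
import json
import re

# Hypothetical sample in a Cobra-like layout; real help output varies widely.
SAMPLE_HELP = """\
Usage: mytool [OPTIONS] COMMAND

Commands:
  build    Build the project
  deploy   Deploy to a target

Options:
  -v, --verbose    Enable verbose logging
      --config FILE    Path to config file
"""

# A section heading is a capitalized label alone on its line, ending in a colon.
SECTION_RE = re.compile(r"^(?P<name>[A-Z][A-Za-z ]*):\s*$")
# An entry is indented, with two or more spaces separating name and description.
ENTRY_RE = re.compile(r"^\s{2,}(?P<left>\S.*?)\s{2,}(?P<desc>\S.*)$")


def parse_help(text: str) -> dict:
    """Group indented 'name  description' rows under their section heading."""
    sections = {}
    current = None
    for line in text.splitlines():
        heading = SECTION_RE.match(line)
        if heading:
            current = heading.group("name").lower()
            sections[current] = []
            continue
        entry = ENTRY_RE.match(line)
        if entry and current is not None:
            sections[current].append(
                {"name": entry.group("left"), "description": entry.group("desc")}
            )
    return sections


if __name__ == "__main__":
    print(json.dumps(parse_help(SAMPLE_HELP), indent=2))
```

Supporting several help styles would presumably mean several such rule sets tried in a fixed order, which is what keeps the result deterministic for a given input.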
    CLI binary
      -> cli-crawler
      -> output/<cli>.json (+ optional output/<cli>.raw.json)
      -> generate-plugin
      -> plugins/cli-<slug>/
- `cli-crawler`: crawl and parse CLI help output
- `generate-plugin`: build Claude plugin from crawler JSON
- `config-audit`: detect inventory drift and suggest minimal config overrides
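The crawl step ultimately comes down to running a binary with a help flag and capturing what it prints. The sketch below shows that single step under stated assumptions: `fetch_help` is a hypothetical helper, the 5-second default is only an illustrative value, and the real crawler's safety policies (auth detection, non-mutating fallbacks, retries) are not modeled here.

```python
import subprocess
import sys


def fetch_help(argv: list[str], timeout: float = 5.0) -> str:
    """Run argv + ['--help'] and return whatever text the tool printed."""
    try:
        proc = subprocess.run(
            [*argv, "--help"],
            capture_output=True,
            text=True,
            timeout=timeout,
            stdin=subprocess.DEVNULL,  # never let a tool wait for interactive input
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # Missing binary or a hung process: report "no help available".
        return ""
    # Some tools print help to stderr, so fall back to it when stdout is empty.
    return proc.stdout or proc.stderr


if __name__ == "__main__":
    # The Python interpreter itself is used as a CLI that is always present.
    print(fetch_help([sys.executable])[:200])
```

Passing the command as a list rather than a shell string avoids shell interpretation of the arguments, which matters when subcommand names come from parsed output.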
    git clone https://github.com/nsalvacao/cli-plugins.git
    cd cli-plugins

    # Option A (recommended)
    uv sync

    # Option B
    # python3 -m venv .venv-wsl
    # source .venv-wsl/bin/activate
    # pip install -e ".[dev]"

    uv run cli-crawler docker -o output/docker.json --raw -v

    uv run generate-plugin output/docker.json

Plugin output: `plugins/cli-docker/`

    cp -r plugins/cli-docker ~/.claude/plugins/cli-docker

Or test directly:
    claude --plugin-dir plugins/cli-docker

Generated plugin layout:

    plugins/cli-my-tool/
    ├── .claude-plugin/
    │   └── plugin.json
    ├── skills/
    │   └── cli-my-tool/
    │       ├── SKILL.md
    │       └── references/
    │           ├── commands.md
    │           └── examples.md
    ├── commands/
    │   └── scan-cli.md
    └── scripts/
        └── rescan.sh

`rescan.sh` is generated per plugin and re-runs crawl + generation for that CLI.
Use `config.yaml` for operational overrides, not as a full inventory catalog.
Example:
    defaults:
      timeout: 5
      max_depth: 5
      max_concurrent: 5
      retry: 1
      environment: wsl
      raw_threshold: 10000
    clis:
      docker:
        max_depth: 6
      kubectl:
        timeout: 8
        plugins:
          discovery_command: kubectl plugin list

Recommended minimal override fields:

- `environment`
- `help_pattern`
- `max_depth`
- `max_concurrent`
- `plugins.discovery_command`
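One plausible reading of "overrides" is that each CLI starts from `defaults` and only the fields listed under `clis.<name>` replace them. The sketch below illustrates that reading with the values from the example config; the shallow-merge rule and the `effective_settings` name are assumptions, not the project's documented behavior.

```python
# Values mirror the example config; the merge rule itself is an assumption.
CONFIG = {
    "defaults": {
        "timeout": 5,
        "max_depth": 5,
        "max_concurrent": 5,
        "retry": 1,
        "environment": "wsl",
        "raw_threshold": 10000,
    },
    "clis": {
        "docker": {"max_depth": 6},
        "kubectl": {
            "timeout": 8,
            "plugins": {"discovery_command": "kubectl plugin list"},
        },
    },
}


def effective_settings(config: dict, cli: str) -> dict:
    """Return the defaults with any per-CLI overrides applied on top."""
    merged = dict(config.get("defaults", {}))
    merged.update(config.get("clis", {}).get(cli, {}))
    return merged


if __name__ == "__main__":
    for name in ("docker", "kubectl", "git"):
        print(name, effective_settings(CONFIG, name))
```

Under this rule a CLI that is absent from `clis` (such as `git` here) simply runs with the defaults, which fits the advice to keep the file limited to overrides.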
Run:
    uv run config-audit \
      --config config.yaml \
      --output-dir output \
      --plugins-dir plugins \
      --report output/config-audit.json

The report highlights:

- `missing_in_config`
- `stale_in_config`
- `missing_output`
- `missing_plugin`
- `suggested_minimal_overrides`
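Drift detection of this kind can be expressed as set differences between the three inventories. The sketch below is a guess at what each report key could mean, not the tool's actual logic: the `audit` function, its inputs, and the interpretation of every key are assumptions, and `suggested_minimal_overrides` is left out because its rules are not described here.

```python
def audit(configured: set, outputs: set, plugins: set) -> dict:
    """Compare CLI names found in config.yaml, output/*.json, and plugins/cli-*."""
    known = outputs | plugins
    return {
        # Artifacts exist on disk but the CLI has no config entry (assumed meaning).
        "missing_in_config": sorted(known - configured),
        # Config entry exists but nothing was ever generated (assumed meaning).
        "stale_in_config": sorted(configured - known),
        # A plugin exists without the crawler JSON it was built from.
        "missing_output": sorted(plugins - outputs),
        # Crawler JSON exists but no plugin was generated from it.
        "missing_plugin": sorted(outputs - plugins),
    }


if __name__ == "__main__":
    # Hypothetical inventory, with names already stripped of ".json" and "cli-".
    report = audit(
        configured={"docker", "kubectl", "helm"},
        outputs={"docker", "git"},
        plugins={"docker", "kubectl"},
    )
    for key, names in report.items():
        print(f"{key}: {names}")
```

Sorting each list keeps the report stable across runs, which makes it easier to diff one audit against the next.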
This repo follows a strict loop for each batch:
- plan and define atomic tasks
- tests first (RED)
- implementation (GREEN)
- full regression run (`pytest tests/`)
- E2E with a new CLI
- update tracking in specs/control files
- atomic commit + PR + review
Common checks:
    uv run ruff check .
    uv run ruff format --check .
    uv run pytest tests/

A major next step is to support grouping multiple related CLIs into a single plugin (for example, one domain plugin for several DevOps CLIs).
Expected impact:
- drastically fewer installed plugins
- simpler activation/maintenance model
- better domain-level discoverability
- lower operational overhead for large toolchains
This is planned through the group taxonomy and inference backlog (deterministic first, ML fallback), then future grouped-plugin generation.
- observability and confidence scoring expansion
- dashboard UI for inventory, operations, and plugin inspection
- PyPI release hardening and publish workflow
See docs/CONTRIBUTING.md.