feat: harden crawler/parser pipeline and expand plugin coverage #8

Merged
nsalvacao merged 5 commits into main from 001-cli-plugins-base on Feb 16, 2026

@nsalvacao (Owner) commented Feb 16, 2026

Summary

  • harden crawler help detection, auth failure handling, thread-safety, sectionless/embedded parsing, and version attribution
  • add canonical CLI identity support (e.g. git.exe -> git) for version matching and plugin slug generation
  • add config inventory audit command (config-audit) with tests and README policy/runbook updates
  • regenerate/normalize plugin artifacts for expanded CLI coverage and normalize plugins/cli-jq layout
  • update workflow/backlog tracking in AGENTS/tasks
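
The canonical identity bullet above (git.exe -> git) can be pictured with a small sketch. The PR names `canonical_cli_name` and `plugin_slug` but does not show their bodies, so the suffix list and the `cli-<name>` slug format below are assumptions inferred from the `plugins/cli-*` directory names, not the actual implementation.

```python
# Hypothetical suffix list; the real module may recognise a different set.
_EXECUTABLE_SUFFIXES = (".exe", ".cmd", ".bat")


def canonical_cli_name(command: str) -> str:
    """Reduce an invocation such as 'C:\\tools\\git.exe' to 'git'."""
    # Accept both Windows and POSIX separators regardless of the host OS.
    name = command.replace("\\", "/").rsplit("/", 1)[-1].strip()
    lowered = name.lower()
    for suffix in _EXECUTABLE_SUFFIXES:
        # Only strip when something is left over (".exe" alone stays as-is).
        if lowered.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)]
    return name


def plugin_slug(command: str) -> str:
    """Build a plugin directory name such as 'cli-git' (format assumed)."""
    return f"cli-{canonical_cli_name(command).lower()}"
```

With this sketch, `plugin_slug("GIT.EXE")` and `plugin_slug("git")` both yield `cli-git`, which is the property that lets version matching and slug generation agree across platforms.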

Validation

  • ruff check .
  • .venv-wsl/bin/python -m pytest tests/ (206 passed)

Commits

  • feat(crawler): harden parsing, discovery and plugin generation
  • feat(config): add inventory audit command and config policy docs
  • docs(workflow): update agent loop and backlog tracking
  • chore(plugins): regenerate and normalize generated CLI plugins

@nsalvacao (Owner, Author)

@copilot review

Copilot AI (Contributor) commented Feb 16, 2026

@nsalvacao I've opened a new pull request, #9, to work on those changes. Once the pull request is ready, I'll request review from you.

@gemini-code-assist

Summary of Changes

Hello @nsalvacao, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and coverage of the CLI crawler and plugin generation pipeline. It addresses several edge cases in help text parsing, improves the reliability of version detection, and introduces a new tool for auditing configuration consistency. The changes also expand the range of CLIs for which plugins can be generated, ensuring a more comprehensive and accurate set of reference materials.

Highlights

  • Crawler/Parser Hardening: Improved help detection, authentication failure handling, thread-safety in crawl state, and parsing for sectionless/embedded help outputs. This includes better version attribution and handling of various CLI help formats.
  • Expanded Plugin Coverage: Added canonical CLI identity support (e.g., git from git.exe) for version matching and plugin slug generation. New plugins were generated for awk, curl, g++, gcc, grep, make, node, npx, perl, pip, pnpm, python, python3, tar, yq, and zip.
  • Config Inventory Audit Command: Introduced a new config-audit command with tests and updated README policy/runbook. This command helps detect drift between config.yaml, crawled output, and generated plugins.
  • Plugin Artifact Normalization: Regenerated and normalized plugin artifacts for expanded CLI coverage and standardized layout, ensuring consistency across all generated plugins.
  • Workflow and Backlog Tracking Updates: Updated AGENTS.md and specs/001-cli-plugins-base/tasks.md to reflect new tasks and improved quality verification steps in the agent cycle.
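
As a rough illustration of the drift detection described for config-audit, the sketch below compares three inventories with plain set arithmetic. `DriftReport` and `audit_inventory` are hypothetical names, and the real command (per the changelog) also tracks overrides and writes a JSON report, which this sketch omits.

```python
from dataclasses import dataclass, field


@dataclass
class DriftReport:
    """Illustrative report shape; field names are assumptions."""
    missing_output: list[str] = field(default_factory=list)  # configured, never crawled
    missing_plugin: list[str] = field(default_factory=list)  # crawled, no plugin generated
    unconfigured: list[str] = field(default_factory=list)    # artifacts with no config entry

    @property
    def has_drift(self) -> bool:
        return bool(self.missing_output or self.missing_plugin or self.unconfigured)


def audit_inventory(configured: set[str], crawled: set[str], plugins: set[str]) -> DriftReport:
    """Compare CLI names from config.yaml, crawled output, and generated plugins."""
    return DriftReport(
        missing_output=sorted(configured - crawled),
        missing_plugin=sorted(crawled - plugins),
        unconfigured=sorted((crawled | plugins) - configured),
    )


report = audit_inventory(
    configured={"git", "jq", "curl"},
    crawled={"git", "jq", "yq"},
    plugins={"git"},
)
```

Here `report.missing_output` is `['curl']`, `report.missing_plugin` is `['jq', 'yq']`, and `report.unconfigured` is `['yq']`. Sorting keeps the report deterministic between runs.
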
Changelog
  • .gitignore
    • Added .productivity/ to the ignore list.
  • AGENTS.md
    • Updated step 7 to include quality verification of output and generated plugins.
    • Modified step 8 to mark tasks in CLAUDE.local.md in addition to tasks.md.
  • README.md
    • Added a 'Minimal override policy' section for config.yaml.
    • Introduced a new 'Config Audit (Inventory Drift)' section with usage examples for the config-audit command.
  • config.yaml
    • Updated the default raw_threshold value from 10240 to 10000.
  • plugins/cli-awk/.claude-plugin/plugin.json
    • Added new file for the awk CLI plugin metadata.
  • plugins/cli-awk/commands/scan-cli.md
    • Added new file for the awk CLI scan command documentation.
  • plugins/cli-awk/scripts/rescan.sh
    • Added new file for the awk CLI rescan script.
  • plugins/cli-awk/skills/cli-awk/SKILL.md
    • Added new file for the awk CLI skill documentation.
  • plugins/cli-awk/skills/cli-awk/references/commands.md
    • Added new file for the awk CLI commands reference.
  • plugins/cli-awk/skills/cli-awk/references/examples.md
    • Added new file for the awk CLI usage examples.
  • plugins/cli-curl/.claude-plugin/plugin.json
    • Added new file for the curl CLI plugin metadata.
  • plugins/cli-curl/commands/scan-cli.md
    • Added new file for the curl CLI scan command documentation.
  • plugins/cli-curl/scripts/rescan.sh
    • Added new file for the curl CLI rescan script.
  • plugins/cli-curl/skills/cli-curl/SKILL.md
    • Added new file for the curl CLI skill documentation.
  • plugins/cli-curl/skills/cli-curl/references/commands.md
    • Added new file for the curl CLI commands reference.
  • plugins/cli-curl/skills/cli-curl/references/examples.md
    • Added new file for the curl CLI usage examples.
  • plugins/cli-docker/.claude-plugin/plugin.json
    • Updated keywords to be more relevant to Docker subcommands and removed author metadata.
  • plugins/cli-docker/skills/cli-docker/SKILL.md
    • Refactored the SKILL.md content for docker to use a compact format, removing verbose sections like 'Prerequisites' and 'Quick Reference', and summarizing top-level commands.
  • plugins/cli-g++/.claude-plugin/plugin.json
    • Added new file for the g++ CLI plugin metadata.
  • plugins/cli-g++/commands/scan-cli.md
    • Added new file for the g++ CLI scan command documentation.
  • plugins/cli-g++/scripts/rescan.sh
    • Added new file for the g++ CLI rescan script.
  • plugins/cli-g++/skills/cli-g++/SKILL.md
    • Added new file for the g++ CLI skill documentation.
  • plugins/cli-g++/skills/cli-g++/references/commands.md
    • Added new file for the g++ CLI commands reference.
  • plugins/cli-g++/skills/cli-g++/references/examples.md
    • Added new file for the g++ CLI usage examples.
  • plugins/cli-gcc/.claude-plugin/plugin.json
    • Added new file for the gcc CLI plugin metadata.
  • plugins/cli-gcc/commands/scan-cli.md
    • Added new file for the gcc CLI scan command documentation.
  • plugins/cli-gcc/scripts/rescan.sh
    • Added new file for the gcc CLI rescan script.
  • plugins/cli-gcc/skills/cli-gcc/SKILL.md
    • Added new file for the gcc CLI skill documentation.
  • plugins/cli-gcc/skills/cli-gcc/references/commands.md
    • Added new file for the gcc CLI commands reference.
  • plugins/cli-gcc/skills/cli-gcc/references/examples.md
    • Added new file for the gcc CLI usage examples.
  • plugins/cli-gh/.claude-plugin/plugin.json
    • Updated keywords to be more relevant to gh subcommands and removed author metadata.
  • plugins/cli-gh/skills/cli-gh/SKILL.md
    • Refactored the SKILL.md content for gh to use a compact format, removing verbose sections like 'Prerequisites' and 'Quick Reference', and summarizing top-level commands.
  • plugins/cli-gh/skills/cli-gh/references/commands.md
    • Removed a placeholder description for gh api.
  • plugins/cli-git/.claude-plugin/plugin.json
    • Updated keywords to be more relevant to git concepts and removed author metadata.
  • plugins/cli-git/skills/cli-git/SKILL.md
    • Refactored the SKILL.md content for git to use a compact format, removing verbose sections like 'Prerequisites' and 'Quick Reference', and summarizing top-level commands.
  • plugins/cli-git/skills/cli-git/references/commands.md
    • Updated the description for git reset.
  • plugins/cli-git/skills/cli-git/references/examples.md
    • Reduced the number of examples and condensed their descriptions for brevity.
  • plugins/cli-grep/.claude-plugin/plugin.json
    • Added new file for the grep CLI plugin metadata.
  • plugins/cli-grep/commands/scan-cli.md
    • Added new file for the grep CLI scan command documentation.
  • plugins/cli-grep/scripts/rescan.sh
    • Added new file for the grep CLI rescan script.
  • plugins/cli-grep/skills/cli-grep/SKILL.md
    • Added new file for the grep CLI skill documentation.
  • plugins/cli-grep/skills/cli-grep/references/commands.md
    • Added new file for the grep CLI commands reference.
  • plugins/cli-grep/skills/cli-grep/references/examples.md
    • Added new file for the grep CLI usage examples.
  • plugins/cli-jq/.claude-plugin/plugin.json
    • Renamed from plugins/cli-jq/cli-jq/.claude-plugin/plugin.json.
    • Removed author metadata.
  • plugins/cli-jq/commands/scan-cli.md
    • Renamed from plugins/cli-jq/cli-jq/commands/scan-cli.md.
  • plugins/cli-jq/scripts/rescan.sh
    • Renamed from plugins/cli-jq/cli-jq/scripts/rescan.sh.
  • plugins/cli-jq/skills/cli-jq/SKILL.md
    • Renamed from plugins/cli-jq/cli-jq/skills/cli-jq/SKILL.md.
    • Refactored content to use a compact format, removing verbose sections and summarizing top-level commands.
  • plugins/cli-jq/skills/cli-jq/references/commands.md
    • Renamed from plugins/cli-jq/cli-jq/skills/cli-jq/references/commands.md.
  • plugins/cli-jq/skills/cli-jq/references/examples.md
    • Renamed from plugins/cli-jq/cli-jq/skills/cli-jq/references/examples.md.
  • plugins/cli-make/.claude-plugin/plugin.json
    • Added new file for the make CLI plugin metadata.
  • plugins/cli-make/commands/scan-cli.md
    • Added new file for the make CLI scan command documentation.
  • plugins/cli-make/scripts/rescan.sh
    • Added new file for the make CLI rescan script.
  • plugins/cli-make/skills/cli-make/SKILL.md
    • Added new file for the make CLI skill documentation.
  • plugins/cli-make/skills/cli-make/references/commands.md
    • Added new file for the make CLI commands reference.
  • plugins/cli-make/skills/cli-make/references/examples.md
    • Added new file for the make CLI usage examples.
  • plugins/cli-node/.claude-plugin/plugin.json
    • Added new file for the node CLI plugin metadata.
  • plugins/cli-node/commands/scan-cli.md
    • Added new file for the node CLI scan command documentation.
  • plugins/cli-node/scripts/rescan.sh
    • Added new file for the node CLI rescan script.
  • plugins/cli-node/skills/cli-node/SKILL.md
    • Added new file for the node CLI skill documentation.
  • plugins/cli-node/skills/cli-node/references/commands.md
    • Added new file for the node CLI commands reference.
  • plugins/cli-node/skills/cli-node/references/examples.md
    • Added new file for the node CLI usage examples.
  • plugins/cli-npx/.claude-plugin/plugin.json
    • Added new file for the npx CLI plugin metadata.
  • plugins/cli-npx/commands/scan-cli.md
    • Added new file for the npx CLI scan command documentation.
  • plugins/cli-npx/scripts/rescan.sh
    • Added new file for the npx CLI rescan script.
  • plugins/cli-npx/skills/cli-npx/SKILL.md
    • Added new file for the npx CLI skill documentation.
  • plugins/cli-npx/skills/cli-npx/references/commands.md
    • Added new file for the npx CLI commands reference.
  • plugins/cli-npx/skills/cli-npx/references/examples.md
    • Added new file for the npx CLI usage examples.
  • plugins/cli-perl/.claude-plugin/plugin.json
    • Added new file for the perl CLI plugin metadata.
  • plugins/cli-perl/commands/scan-cli.md
    • Added new file for the perl CLI scan command documentation.
  • plugins/cli-perl/scripts/rescan.sh
    • Added new file for the perl CLI rescan script.
  • plugins/cli-perl/skills/cli-perl/SKILL.md
    • Added new file for the perl CLI skill documentation.
  • plugins/cli-perl/skills/cli-perl/references/commands.md
    • Added new file for the perl CLI commands reference.
  • plugins/cli-perl/skills/cli-perl/references/examples.md
    • Added new file for the perl CLI usage examples.
  • plugins/cli-pip/.claude-plugin/plugin.json
    • Added new file for the pip CLI plugin metadata.
  • plugins/cli-pip/commands/scan-cli.md
    • Added new file for the pip CLI scan command documentation.
  • plugins/cli-pip/scripts/rescan.sh
    • Added new file for the pip CLI rescan script.
  • plugins/cli-pip/skills/cli-pip/SKILL.md
    • Added new file for the pip CLI skill documentation.
  • plugins/cli-pip/skills/cli-pip/references/commands.md
    • Added new file for the pip CLI commands reference.
  • plugins/cli-pip/skills/cli-pip/references/examples.md
    • Added new file for the pip CLI usage examples.
  • plugins/cli-pnpm/.claude-plugin/plugin.json
    • Added new file for the pnpm CLI plugin metadata.
  • plugins/cli-pnpm/commands/scan-cli.md
    • Added new file for the pnpm CLI scan command documentation.
  • plugins/cli-pnpm/scripts/rescan.sh
    • Added new file for the pnpm CLI rescan script.
  • plugins/cli-pnpm/skills/cli-pnpm/SKILL.md
    • Added new file for the pnpm CLI skill documentation.
  • plugins/cli-pnpm/skills/cli-pnpm/references/commands.md
    • Added new file for the pnpm CLI commands reference.
  • plugins/cli-pnpm/skills/cli-pnpm/references/examples.md
    • Added new file for the pnpm CLI usage examples.
  • plugins/cli-python/.claude-plugin/plugin.json
    • Added new file for the python CLI plugin metadata.
  • plugins/cli-python/commands/scan-cli.md
    • Added new file for the python CLI scan command documentation.
  • plugins/cli-python/scripts/rescan.sh
    • Added new file for the python CLI rescan script.
  • plugins/cli-python/skills/cli-python/SKILL.md
    • Added new file for the python CLI skill documentation.
  • plugins/cli-python/skills/cli-python/references/commands.md
    • Added new file for the python CLI commands reference.
  • plugins/cli-python/skills/cli-python/references/examples.md
    • Added new file for the python CLI usage examples.
  • plugins/cli-python3/.claude-plugin/plugin.json
    • Added new file for the python3 CLI plugin metadata.
  • plugins/cli-python3/commands/scan-cli.md
    • Added new file for the python3 CLI scan command documentation.
  • plugins/cli-python3/scripts/rescan.sh
    • Added new file for the python3 CLI rescan script.
  • plugins/cli-python3/skills/cli-python3/SKILL.md
    • Added new file for the python3 CLI skill documentation.
  • plugins/cli-python3/skills/cli-python3/references/commands.md
    • Added new file for the python3 CLI commands reference.
  • plugins/cli-python3/skills/cli-python3/references/examples.md
    • Added new file for the python3 CLI usage examples.
  • plugins/cli-tar/.claude-plugin/plugin.json
    • Added new file for the tar CLI plugin metadata.
  • plugins/cli-tar/commands/scan-cli.md
    • Added new file for the tar CLI scan command documentation.
  • plugins/cli-tar/scripts/rescan.sh
    • Added new file for the tar CLI rescan script.
  • plugins/cli-tar/skills/cli-tar/SKILL.md
    • Added new file for the tar CLI skill documentation.
  • plugins/cli-tar/skills/cli-tar/references/commands.md
    • Added new file for the tar CLI commands reference.
  • plugins/cli-tar/skills/cli-tar/references/examples.md
    • Added new file for the tar CLI usage examples.
  • plugins/cli-yq/.claude-plugin/plugin.json
    • Added new file for the yq CLI plugin metadata.
  • plugins/cli-yq/commands/scan-cli.md
    • Added new file for the yq CLI scan command documentation.
  • plugins/cli-yq/scripts/rescan.sh
    • Added new file for the yq CLI rescan script.
  • plugins/cli-yq/skills/cli-yq/SKILL.md
    • Added new file for the yq CLI skill documentation.
  • plugins/cli-yq/skills/cli-yq/references/commands.md
    • Added new file for the yq CLI commands reference.
  • plugins/cli-yq/skills/cli-yq/references/examples.md
    • Added new file for the yq CLI usage examples.
  • plugins/cli-zip/.claude-plugin/plugin.json
    • Added new file for the zip CLI plugin metadata.
  • plugins/cli-zip/commands/scan-cli.md
    • Added new file for the zip CLI scan command documentation.
  • plugins/cli-zip/scripts/rescan.sh
    • Added new file for the zip CLI rescan script.
  • plugins/cli-zip/skills/cli-zip/SKILL.md
    • Added new file for the zip CLI skill documentation.
  • plugins/cli-zip/skills/cli-zip/references/commands.md
    • Added new file for the zip CLI commands reference.
  • plugins/cli-zip/skills/cli-zip/references/examples.md
    • Added new file for the zip CLI usage examples.
  • pyproject.toml
    • Added config-audit as a new console script entry point.
  • specs/001-cli-plugins-base/tasks.md
    • Updated task descriptions to be more concise and consistent.
    • Marked several tasks as completed (e.g., T001, T002, T038, T039, T003, T004, T005, T032, T066, T044, T007, T067, T008, T028, T029, T040, T041, T042, T043, T045, T046, T009, T010, T011, T012, T047, T048, T049, T079, T082, T084, T089, T091, T093, T095, T097, T099, T101, T118, T013, T014, T015, T016, T050, T051, T052, T080, T053, T081, T083, T090, T092, T094, T096, T098, T100, T102, T119, T120, T131, T132, T133, T134, T054, T055, T069, T070, T071, T072).
    • Added new tasks related to configuration hygiene (T069-T072) and CLI group inference (T073-T078).
    • Added new tasks for Dashboard UI & Operations Cockpit (T103-T117).
    • Updated task dependencies to reflect the new task structure and progress.
  • src/config/audit.py
    • Added new file implementing a configuration audit tool to detect drift between config.yaml, crawled output, and generated plugins.
  • src/crawler/cli_crawler.py
    • Added run_root_help, crawl_single, and crawl_configured helper functions.
    • Updated __all__ export list.
  • src/crawler/config.py
    • Updated the default raw_threshold value from 10240 to 10000.
  • src/crawler/detector.py
    • Added AUTH_REQUIRED_RE for detecting authentication failures.
    • Implemented SAFE_BARE_FALLBACK_SUBCOMMANDS and MUTATING_SUBCOMMAND_TOKENS to prevent executing potentially harmful bare subcommands during help detection.
    • Modified detect_help_pattern and detect_subcommand_help to handle authentication-required results and apply safety guards for bare subcommand fallbacks.
  • src/crawler/discovery.py
    • Added threading.Lock to CrawlState for thread-safe access to visited, raw_outputs, errors, and warnings.
    • Implemented atomic methods (mark_visited, set_raw_output, get_raw_output, increment_errors, add_warning, extend_warnings) for CrawlState.
  • src/crawler/formatter.py
    • Modified write_output to always write a separate raw JSON sidecar file, regardless of size, for deterministic output layout.
    • Removed the config parameter from write_output as raw_threshold is no longer used for conditional raw file writing.
  • src/crawler/parser.py
    • Implemented _truncate_embedded_help to filter out foreign embedded tool help blocks (e.g., jq help within yq output).
    • Added _deduplicate_flags and _merge_flag_metadata for deterministic flag deduplication and merging.
    • Introduced _DESC_NOISE_PATTERNS and _clean_description to filter noisy runtime/status messages from flag and command descriptions.
    • Integrated extract_usage_line_options for parsing flags from sectionless help outputs.
    • Updated parse_help_output to use embedded help truncation and _clean_description.
  • src/crawler/parsers/commands.py
    • Added CMD_TABULAR_ALIAS_RE to parse commands with aliases in tabular formats (e.g., i, install).
    • Improved _try_tabular to handle wrapped descriptions and correctly extract aliases as separate commands.
  • src/crawler/parsers/flags.py
    • Added FLAG_GNU_SINGLE_DASH_TOKEN_RE for parsing GNU-style single-dash long options (e.g., -print-file-name).
    • Implemented _try_parse_gnu_single_dash and _looks_like_gnu_value_token to support parsing these complex flag formats.
  • src/crawler/parsers/manpage.py
    • Refined _parse_manpage_examples with EXAMPLE_COMMAND_RE, EXAMPLE_BULLET_PREFIX_RE, and EXAMPLE_PROSE_STARTERS to improve example extraction accuracy from man pages, especially for npm-style outputs.
  • src/crawler/parsers/sections.py
    • Added new regex patterns for pnpm-style grouped command headings (e.g., 'Manage your dependencies:').
    • Updated _BOX_HEADER_RE and _BOX_BORDER_RE to support Unicode box-drawing characters in rich-click output.
    • Modified _strip_rich_boxes to correctly process Unicode box characters.
  • src/crawler/parsers/usage.py
    • Added extract_usage_line_options to parse flags from sectionless help outputs (e.g., Python's --help output).
    • Implemented helper functions like _extract_value_hint, _extract_choices_from_hint, _normalize_option_description, _split_option_spec_and_description, _extract_continuation_description, _split_colon_description_outside_placeholders, _rebalance_alias_split, _extract_option_tokens, _normalize_short_option_fragment, and _looks_like_option_value_token for robust parsing of various option formats.
  • src/crawler/pipeline.py
    • Added _resolve_output_path to handle flexible output paths (directory or specific JSON file).
    • Implemented _apply_progressive_loading to truncate very long help outputs for parsing, improving performance and token usage.
    • Introduced _compute_confidence_score to provide a crawl-level confidence score, penalizing degraded modes (auth required, unknown pattern, progressive loading).
    • Updated crawl_cli to integrate these new features, including handling of authentication failures and progressive loading.
  • src/crawler/version.py
    • Refactored version detection logic with a scoring system to prioritize reliable version candidates.
    • Implemented _is_placeholder_version to filter out non-informative versions (e.g., '0.0.0', 'unknown').
    • Added _build_cli_mention_re, _line_has_structural_cli_version, _cli_name_aliases, _line_has_foreign_version_subject, _line_has_foreign_subject, and _is_cli_name_match for more accurate and robust version attribution.
    • Integrated canonical_cli_name for consistent CLI identity across different OS executable names.
  • src/generator/plugin_generator.py
    • Implemented generate_semantic_keywords to create more relevant keywords for plugins based on CLI name, command groups, and domain terms from descriptions.
    • Added _resolve_author and _build_author_metadata for configurable author information in plugin.json.
    • Introduced _approx_token_count for estimating token usage in generated markdown, _escape_triple_backticks for preventing markdown fence breaks, and _example_command_from_usage for generating examples from usage patterns.
    • Refactored _collect_document_examples to prioritize explicit examples and fallback to usage patterns when none are found.
    • Updated generate_plugin_json, generate_skill_md, generate_examples_md, generate_rescan_sh, and generate_plugin to use these new features, including progressive disclosure for SKILL.md and canonical plugin slugs.
  • src/lib/cli_identity.py
    • Added new file containing canonical_cli_name to normalize CLI names (e.g., git.exe to git) and plugin_slug for generating consistent plugin identifiers.
  • tests/conftest.py
    • Added pnpm_help fixture for pnpm help text.
  • tests/fixtures/pnpm/help.txt
    • Added new fixture file containing pnpm help output.
  • tests/integration/test_gcc_climap_generation.py
    • Added new integration test to verify gcc CLIMap and plugin generation quality, specifically for GNU single-dash flags.
  • tests/integration/test_pnpm_climap_generation.py
    • Added new integration test to verify pnpm CLIMap and plugin generation quality, including grouped commands and wrapped descriptions.
  • tests/performance/test_smoke_perf.py
    • Added test_parse_1000_line_fixture_under_5_seconds to ensure parsing performance for large help outputs.
    • Added test_docker_crawl_and_generate_under_30_seconds to verify end-to-end performance for a complex CLI like Docker.
  • tests/test_discovery_thread_safety.py
    • Added new unit tests (test_mark_visited_is_atomic, test_error_and_warning_updates_are_thread_safe) to verify thread-safety of CrawlState operations.
  • tests/test_generate_plugin.py
    • Updated test_counts to assert cf_stats.version starts with '3.1.0-alpha.' for flexibility.
    • Updated test_valid_json to assert obj['version'] matches cf_map['cli_version'].
    • Added test_escapes_backticks_in_example_descriptions to ensure proper escaping in generated examples.
    • Added test_fallback_examples_from_usage_when_none_explicit to verify example generation from usage patterns.
  • tests/test_parser_description_cleaning.py
    • Added new unit tests to ensure noisy runtime/status messages and circular descriptions are discarded from parsed descriptions, while valid descriptions are retained.
  • tests/test_parser_manpage.py
    • Added test_npm_style_examples_are_extracted and test_examples_with_left_aligned_prose_still_extract_commands to verify improved manpage example extraction.
  • tests/test_parser_sections.py
    • Added test_pipe_table_without_box_markers_is_not_stripped to ensure pipe tables are not incorrectly stripped by box-drawing logic.
    • Updated test_richclick_arguments_section to include SectionType.FLAGS in assertions.
  • tests/test_pipeline_integration.py
    • Added test_parse_unicode_richclick_flags to verify parsing of rich-click output with Unicode box characters.
    • Modified test_json_output_exists to allow auxiliary reports in the output directory.
  • tests/unit/test_author_config.py
    • Added new unit tests to verify that author metadata is correctly omitted when not configured, loaded from environment variables, and overridden by explicit arguments.
  • tests/unit/test_cli_crawler_basic.py
    • Added new basic crawler smoke test (test_git_initial_crawl_root_help) to ensure initial git help parsing works.
  • tests/unit/test_cli_name_canonicalization.py
    • Added new unit tests to verify that detect_version canonicalizes executable suffixes while preserving the original invocation command, and that generate_plugin uses canonical slugs for plugin naming.
  • tests/unit/test_command_parsing_basic.py
    • Added new basic command hierarchy parsing tests for docker and claude-flow.
  • tests/unit/test_config_audit.py
    • Added new unit tests to verify the config-audit tool handles empty directories, detects inventory drift, tracks environment and plugin discovery overrides, writes JSON reports, and ignores non-CLIMap JSON files.
  • tests/unit/test_edge_case_auth_help.py
    • Added new unit tests to ensure authentication-required help outputs are handled gracefully (returning clear warnings in non-strict mode, raising errors in strict mode) and that auth detection prioritizes stderr.
  • tests/unit/test_edge_case_long_help.py
    • Added new unit test to verify that very long help outputs trigger progressive loading, truncating the parsed content while recording metadata about the truncation.
  • tests/unit/test_embedded_help_boundary.py
    • Added new unit test to confirm that the parser correctly identifies and ignores foreign embedded help blocks (e.g., jq help within yq output).
  • tests/unit/test_flag_dedup_embedded_help.py
    • Added new unit test to ensure duplicate flags (like --help and --version) from embedded help outputs are deduplicated deterministically.
  • tests/unit/test_flag_parsing_basic.py
    • Added new basic flag parsing tests for standard, short, and boolean flags.
  • tests/unit/test_flag_parsing_gnu_single_dash.py
    • Added new unit tests to verify parsing of GNU single-dash long options and pass-through families (e.g., -print-file-name, -Wa,<options>).
  • tests/unit/test_keyword_generation.py
    • Added new unit tests to ensure semantic keywords are generated from CLI names, command groups, and domain terms, avoiding generic first words.
  • tests/unit/test_output_layout.py
    • Added new unit tests to verify _resolve_output_path correctly handles directory and file paths, and that write_output always creates a raw sidecar file.
  • tests/unit/test_parser_pnpm_grouped_help.py
    • Added new unit tests to confirm that pnpm-style grouped help headers are detected as command sections, and that grouped commands with wrapped descriptions are correctly extracted.
  • tests/unit/test_parsing_rich_man.py
    • Added new tests to verify rich-click flag extraction rate is above 50% and that manpage examples are correctly extracted from npm-style content.
  • tests/unit/test_progressive_disclosure.py
    • Added new unit tests to verify that SKILL.md adheres to a token budget for compact display, while references/commands.md and references/examples.md retain full details.
  • tests/unit/test_subcommand_help_safety.py
    • Added new unit tests to ensure that potentially mutating subcommands do not trigger bare fallback help detection, and that explicit help/version subcommands can use bare fallback safely.
  • tests/unit/test_usage_line_option_extraction.py
    • Added new unit tests to verify sectionless usage-line option extraction for Python-style help, including normalization of descriptions, enrichment of long options, handling of placeholder colons, and parsing of attached/combined short forms.
  • tests/unit/test_version_detection_fallback.py
    • Added new unit tests to verify robust version detection, including skipping placeholder versions (e.g., '0.0.0', 'unknown'), preferring CLI-attributed versions over dependency versions, and handling various version string formats.
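
The `src/crawler/discovery.py` entry above describes guarding `CrawlState` with a `threading.Lock`. A minimal sketch of that pattern follows; the method names come from the changelog, but their bodies and the boolean return of `mark_visited` are assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CrawlState:
    """Shared crawl bookkeeping; every mutation happens under one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.visited: set[str] = set()
        self.errors = 0
        self.warnings: list[str] = []

    def mark_visited(self, key: str) -> bool:
        """Return True only for the first caller to claim `key` (assumed contract)."""
        with self._lock:
            # Check and insert in one critical section so two workers
            # cannot both decide that they own the same command path.
            if key in self.visited:
                return False
            self.visited.add(key)
            return True

    def increment_errors(self) -> None:
        with self._lock:
            self.errors += 1

    def add_warning(self, message: str) -> None:
        with self._lock:
            self.warnings.append(message)


state = CrawlState()
with ThreadPoolExecutor(max_workers=8) as pool:
    claims = list(pool.map(state.mark_visited, ["git status"] * 100))
```

Exactly one of the 100 concurrent claims succeeds. Without the lock, the separate membership test and `add` could interleave and let several workers crawl the same subcommand.
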
Activity
  • The pull request introduces significant enhancements to the crawler and parser, focusing on robustness and expanded CLI coverage.
  • A new config-audit command has been added to help maintain consistency between configuration, crawled data, and generated plugins.
  • Numerous parsing improvements have been implemented to handle diverse help output formats, including rich-click, man pages, and sectionless styles.
  • Version detection logic has been refined to be more accurate and resilient to placeholder values or embedded dependency versions.
  • Thread-safety has been added to the crawling state to prevent race conditions during concurrent operations.
  • Plugin generation now includes semantic keyword generation and configurable author metadata.
  • Progressive disclosure for generated plugin documentation (SKILL.md, references/) ensures compact summaries while retaining full detail on demand.
  • New integration and unit tests have been added to validate these improvements across various CLIs like gcc and pnpm, and to ensure performance and safety.
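
To make the version-detection point above concrete, here is a simplified scoring sketch. It only captures two ideas from the summary (skip placeholder versions, prefer lines that name the CLI over lines about dependencies). The regexes, the two-level score, and the `mentions_cli` helper are assumptions, not the logic in `src/crawler/version.py`.

```python
import re
from typing import Optional

# Simplified version pattern; real-world version strings are more varied.
VERSION_RE = re.compile(r"(?<![\w.])\d+\.\d+(?:\.\d+)?(?:[-+][0-9A-Za-z.]+)?")
PLACEHOLDER_VERSIONS = {"0.0.0", "0.0", "unknown"}


def is_placeholder_version(candidate: str) -> bool:
    return candidate.strip().lower() in PLACEHOLDER_VERSIONS


def mentions_cli(line: str, cli_name: str) -> bool:
    # Lookarounds instead of \b so names like "g++" still match.
    pattern = rf"(?<![\w-]){re.escape(cli_name)}(?![\w-])"
    return re.search(pattern, line, re.IGNORECASE) is not None


def detect_version(cli_name: str, output: str) -> Optional[str]:
    """Pick the best-scoring version candidate, or None if none is usable."""
    best_score, best_version = 0, None
    for line in output.splitlines():
        # Lines that name the CLI outrank lines about other tools.
        score = 2 if mentions_cli(line, cli_name) else 1
        for match in VERSION_RE.finditer(line):
            version = match.group(0)
            if not is_placeholder_version(version) and score > best_score:
                best_score, best_version = score, version
    return best_version


sample = "yq version 0.0.0\nbuilt with jq 1.7.1\nyq 4.44.3"
detected = detect_version("yq", sample)
```

For the sample text, the placeholder on the first line is skipped and the dependency version on the second line is outranked by the line that names `yq`, so `detected` is `"4.44.3"`.
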
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@nsalvacao nsalvacao merged commit 180df34 into main Feb 16, 2026
9 checks passed

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a substantial and impressive pull request that significantly hardens the crawler and generator, expands parser coverage, and improves the overall quality and maintainability of the project. The introduction of thread-safety in the crawler, robust version detection, and safety guards against executing mutating commands are excellent improvements. The expanded parser coverage for pnpm and GNU-style options is a great addition. I also appreciate the new config-audit tool for maintaining configuration hygiene and the move to canonical CLI identities. The plugin generator has been greatly enhanced with semantic keyword generation and a more user-friendly progressive disclosure format. The extensive suite of new tests accompanying these features demonstrates a strong commitment to quality. The code is well-structured and the changes are well-implemented. I only found a couple of minor typos in the documentation files. Fantastic work!
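The canonical CLI identity idea mentioned in this review (for example `git.exe` -> `git`, used for version matching and plugin slugs) could look roughly like the sketch below. It is not the project's code: the function names `canonical_cli_name` and `plugin_slug` and the suffix list are hypothetical, and the `cli-<name>` slug shape is only inferred from the `plugins/cli-jq` directory named in the PR summary.

```python
import re

# Assumption: launcher suffixes to strip. The project's real list may differ.
_EXECUTABLE_SUFFIXES = (".exe", ".cmd", ".bat", ".com")


def canonical_cli_name(raw: str) -> str:
    """Reduce an invocation such as 'C:\\Git\\bin\\git.exe' to 'git'."""
    # Keep only the last path component, accepting both '/' and '\\'.
    name = re.split(r"[\\/]", raw.strip())[-1].lower()
    for suffix in _EXECUTABLE_SUFFIXES:
        # Strip a known suffix, but never reduce the name to an empty string.
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)]
    return name


def plugin_slug(raw: str) -> str:
    """Build a 'cli-<name>' slug (hypothetical helper)."""
    return f"cli-{canonical_cli_name(raw)}"
```

Normalizing once and reusing the result for both version matching and slug generation would keep `git` and `git.exe` from producing two separate plugins on Windows.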

* Run `cli-crawler <tool>` -> `generate-plugin <tool>`.
* Verify the output manually in D:\GitHub\cli-plugins\output and D:\GitHub\cli-plugins\plugins to prove the solution works in the wild.
8. **MARK TASK**: Check off `[x]` in `specs/001-cli-plugins-base/tasks.md` (or relevant task file).
* Verify the quality of the output and the generated plugin against the main goal of the project. If the quality is not good, go back to step 5 ou defer creating new improvement tasks in `specs/001-cli-plugins-base/tasks.md` and `CLAUDE.local.md`.

medium

There's a typo here. 'ou' should be 'or'.

Suggested change
* Verify the quality of the output and the generated plugin against the main goal of the project. If the quality is not good, go back to step 5 ou defer creating new improvement tasks in `specs/001-cli-plugins-base/tasks.md` and `CLAUDE.local.md`.
* Verify the quality of the output and the generated plugin against the main goal of the project. If the quality is not good, go back to step 5 or defer creating new improvement tasks in `specs/001-cli-plugins-base/tasks.md` and `CLAUDE.local.md`.

- [x] T005 [P] Setup Pytest for testing in `pyproject.toml` — add `[tool.pytest.ini_options]` with testpaths, markers for unit/integration/e2e
- [x] T032 [MOVED from Phase 6] Create CI/CD pipeline: `.github/workflows/ci.yml` — pytest matrix on Python 3.11+3.12, ruff lint, black check; runs on push and PR; add CI badge to `README.md` *(constitution §CI/CD Automatizado; execution-plan A7)*
- [x] T066 [P] [NEW] Add `__version__` to package via `src/__init__.py` sourced from `pyproject.toml`; add `--version` flag to `cli-crawler` and `generate-plugin` entry points *(evaluation-results A5, L3)*
- [x] **T001 Createeeee project directories** - `src/crawler`, `src/generator`, `src/config`, `src/lib`, `tests/unit`, `tests/integration`, `tests/end_to_end` — decision: migrate existing `crawler/` → `src/crawler/` (preserve git history with `git mv`)

medium

There's a typo in the task title. 'Createeeee' should be 'Create'.

Suggested change
- [x] **T001 Createeeee project directories** - `src/crawler`, `src/generator`, `src/config`, `src/lib`, `tests/unit`, `tests/integration`, `tests/end_to_end` — decision: migrate existing `crawler/` → `src/crawler/` (preserve git history with `git mv`)
- [x] **T001 Create project directories** - `src/crawler`, `src/generator`, `src/config`, `src/lib`, `tests/unit`, `tests/integration`, `tests/end_to_end` — decision: migrate existing `crawler/` → `src/crawler/` (preserve git history with `git mv`)
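
Task T066 in the diff context above asks for a `--version` flag on the `cli-crawler` and `generate-plugin` entry points, with the version sourced from `pyproject.toml`. Below is a minimal sketch of one common way to do that. It assumes the package is installed so that its metadata (built from `pyproject.toml`) is readable through `importlib.metadata`. The distribution name `cli-plugins`, the helper names and the fallback string are hypothetical and may not match the repository.

```python
import argparse
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str) -> str:
    """Return the installed distribution's version, or a placeholder."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed, e.g. running from a bare checkout.
        return "0.0.0+unknown"


def build_parser(prog: str, dist_name: str = "cli-plugins") -> argparse.ArgumentParser:
    """Create a parser whose --version flag prints '<prog> <version>'."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument(
        "--version",
        action="version",  # argparse prints the string and exits with status 0
        version=f"%(prog)s {resolve_version(dist_name)}",
    )
    return parser
```

Both entry points could share `build_parser`, so the version string is defined in one place.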

@nsalvacao nsalvacao deleted the 001-cli-plugins-base branch February 16, 2026 12:36

2 participants