feat: add LLM enrichment for model name extraction #63

Open
cassiocouto wants to merge 2 commits into Trusera:main from cassiocouto:feat/llm-enrichment

cassiocouto commented Mar 2, 2026

Summary

Closes #36 — adds optional LLM-based enrichment to extract specific model names (e.g., gpt-4o, claude-3-opus-20240229) from code snippets around detected AI components.

  • New --llm-enrich CLI flag with --llm-model, --llm-api-key, and --llm-base-url options
  • Uses litellm as a unified LLM client — supports OpenAI, Anthropic, Ollama (local), and 100+ providers
  • Only enriches llm_provider and model component types with empty model_name (filters out containers, tools, MCP servers, etc.)
  • Reads ~20 lines of source context around detection sites for accurate extraction
  • Cross-references extracted names against the built-in model registry for provider/deprecation metadata
  • Batched LLM calls (default 5 per request) with graceful fallback to individual calls on error
  • Dependency-missing guard: clear error with install hint when litellm is not installed
  • Privacy warning for non-local models; recommends ollama/* for sensitive codebases
  • 39 new tests (unit + CLI integration), all mocked — no real LLM calls in tests
  • New docs/enrichment.md with usage guide, privacy section, and cost guidance
  • Also fixes two pre-existing test failures (test_demo_command, test_sarif_relative_path_calculation on Windows)
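
The filtering, context-reading, and batch-with-fallback behavior listed above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Component`, `needs_enrichment`, `context_window`, `enrich`, and the injected `ask_llm` callable are all hypothetical names, and the real implementation calls litellm rather than a plain function.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical names throughout; the PR's real classes and functions may differ.
ENRICHABLE_TYPES = {"llm_provider", "model"}


@dataclass
class Component:
    name: str
    component_type: str
    line: int  # 1-based line of the detection site
    model_name: str = ""


def needs_enrichment(c: Component) -> bool:
    """Keep only llm_provider/model components that have no model name yet."""
    return c.component_type in ENRICHABLE_TYPES and not c.model_name


def context_window(source: str, line: int, radius: int = 10) -> str:
    """Return about 2 * radius lines of source centred on a 1-based line."""
    lines = source.splitlines()
    start = max(0, line - 1 - radius)
    end = min(len(lines), line + radius)
    return "\n".join(lines[start:end])


def enrich(
    components: Sequence[Component],
    ask_llm: Callable[[List[Component]], List[str]],
    batch_size: int = 5,
) -> None:
    """Fill in model_name in place; ask_llm returns one name per component."""
    todo = [c for c in components if needs_enrichment(c)]
    for i in range(0, len(todo), batch_size):
        batch = todo[i:i + batch_size]
        try:
            names = ask_llm(batch)
            if len(names) != len(batch):
                raise ValueError("unexpected number of results")
        except Exception:
            # Fallback: retry each component alone; leave it empty if that fails too.
            names = []
            for c in batch:
                try:
                    names.append(ask_llm([c])[0])
                except Exception:
                    names.append("")
        for c, name in zip(batch, names):
            c.model_name = name
```

Passing the LLM call in as a parameter is an assumption made here so the sketch runs without any provider; it also mirrors how the mocked tests can swap the client out.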

Test plan

  • pytest tests/test_enrichment/ — 39 tests pass (prompt templates, JSON parsing, component filtering, batch/single enrichment, error handling, CLI flags, privacy warnings)
  • pytest tests/ — full suite: 786 passed, 0 failed
  • ruff check — no lint errors on new/modified files
  • Manual test with --llm-enrich --llm-model ollama/llama3 against a real project (requires Ollama running locally)
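
For reviewers wondering what "all mocked" looks like in practice, here is a small sketch of the pattern, assuming the LLM reply is a JSON array of model names. `parse_model_names`, `extract_names`, and the injected `completion` callable are hypothetical stand-ins, not the functions in this PR; only `unittest.mock.MagicMock` and `json` are real APIs.

```python
import json
from typing import Callable, List
from unittest.mock import MagicMock


def parse_model_names(raw: str) -> List[str]:
    """Parse a reply expected to be a JSON array of strings; return [] otherwise."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [item for item in data if isinstance(item, str)]


def extract_names(snippet: str, completion: Callable[..., str]) -> List[str]:
    """`completion` is an injected callable standing in for the LLM client."""
    reply = completion(prompt=f"List the model names used in:\n{snippet}")
    return parse_model_names(reply)


def test_extract_names_uses_mock() -> None:
    # The mock returns a canned reply, so no network call is made.
    fake = MagicMock(return_value='["gpt-4o"]')
    assert extract_names('client.chat(model="gpt-4o")', fake) == ["gpt-4o"]
    fake.assert_called_once()


def test_malformed_reply_is_ignored() -> None:
    fake = MagicMock(return_value="not json")
    assert extract_names("x = 1", fake) == []
```

Run with pytest, both tests pass without an API key or a local Ollama instance.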

cassiocouto requested a review from Zie619 as a code owner March 2, 2026 14:09

Development

Successfully merging this pull request may close these issues.

Add LLM enrichment feature for model name extraction