diff --git a/.prompts/docs-review.md b/.prompts/docs-review.md
index 42f67915e..6917586b7 100644
--- a/.prompts/docs-review.md
+++ b/.prompts/docs-review.md
@@ -5,8 +5,18 @@ Review documentation for accuracy, completeness, and consistency. Focus on thing
 Don't waste time on these—CI and pre-commit hooks handle them:
 
 - **README help output**: `markdown-code-runner` regenerates `agent-cli --help` blocks
+- **Options tables**: `docs_gen` module auto-generates options from CLI introspection
 - **Linting/formatting**: Handled by pre-commit
 
+The `docs_gen` module (`agent_cli/docs_gen.py`) provides:
+
+- `all_options_for_docs(cmd)`: Complete options tables grouped by panel
+- `env_vars_table()`: Environment variables documentation
+- `provider_matrix()`: Provider comparison table
+- `config_example(cmd)`: Example TOML configuration
+- `commands_table()`: Commands overview table
+
+Run `uv run python docs/update_docs.py` to regenerate all auto-generated content.
+
 ## What This Review Is For
 
 Focus on things that require judgment:
@@ -31,53 +41,41 @@ git diff --name-only HEAD~20 | grep "\.py$"
 
 Look for new features, changed defaults, renamed options, or removed functionality.
 
-### 2. Verify Command Documentation (HIGH PRIORITY)
+### 2. Verify Command Documentation
 
-**Options tables in `docs/commands/` are manually maintained and frequently drift from the actual CLI.** This is the most common source of documentation errors.
+Options tables are now auto-generated, so focus on what's NOT automated:
 
-For EACH command doc file, systematically compare against the actual CLI:
+**Check for missing command docs:**
 
 ```bash
-# Get list of all commands
+# Compare commands in CLI vs docs that exist
 agent-cli --help
-
-# For each command, dump actual options and compare to docs
-agent-cli transcribe --help
-agent-cli chat --help
-agent-cli autocorrect --help
-agent-cli speak --help
-agent-cli voice-edit --help
-agent-cli assistant --help
-agent-cli transcribe-daemon --help
-agent-cli rag-proxy --help
-agent-cli memory --help
-agent-cli server --help
+ls docs/commands/*.md
 ```
 
-**Check each option in the help output against the docs:**
+Every command should have a corresponding `docs/commands/<command>.md` file. When adding a new command doc, also update:
+
+- `docs/commands/index.md` - add to the commands table
+- `zensical.toml` - add to the `nav` sidebar under Commands
 
-| Check | Common Issues |
-|-------|---------------|
-| Option exists in docs? | New options added to CLI but not documented |
-| Option still exists in CLI? | Removed options still in docs |
-| Default value correct? | Defaults change, docs not updated |
-| Short flag correct? | `-m` vs `-M`, missing short flags |
-| Description accurate? | Behavior changed, description stale |
-| Type correct? | `PATH` vs `TEXT`, `INTEGER` vs `FLOAT` |
+**What still needs manual review:**
 
-**Also check `agent_cli/opts.py`** - this defines shared options used across commands. Changes here affect multiple commands.
+| Check | What to Look For |
+|-------|------------------|
+| Description accuracy | Does the prose description match actual behavior? |
+| Example commands | Would these actually work? Are they useful? |
+| Workflow explanations | Is the step-by-step flow still accurate? |
+| Use cases | Are suggested use cases realistic? |
+| Cross-links | Do links to related commands work? |
 
-**Check for missing command docs:**
+**Verify auto-generation is working:**
 
 ```bash
-# Compare commands in CLI vs docs that exist
-agent-cli --help
-ls docs/commands/*.md
+# Run update script and check for changes
+uv run python docs/update_docs.py
+git diff docs/
 ```
 
-Every command should have a corresponding `docs/commands/<command>.md` file. When adding a new command doc, also update:
-- `docs/commands/index.md` - add to the commands table
-- `zensical.toml` - add to the `nav` sidebar under Commands
+If the script produces changes, either commit them or investigate why docs drifted.
 
 ### 3. Verify docs/configuration.md
@@ -118,19 +116,19 @@ Check:
 
 These are particularly prone to errors when docs are AI-generated or AI-maintained:
 
-| Area | How to Verify | Past Issues |
-|------|---------------|-------------|
-| **Model names** | Check `agent_cli/opts.py` defaults | `gpt-4o-mini` → `gpt-5-mini` |
-| **Tool/function names** | Check `agent_cli/_tools.py` | `web_search` didn't exist, was `duckduckgo_search` |
-| **File paths** | Grep for `PID_DIR`, `CONFIG_DIR`, etc. | PID files documented in wrong directory |
+| Area | How to Verify | Notes |
+|------|---------------|-------|
+| **Tool/function names** | Check `agent_cli/_tools.py` | Tool names in prose/examples may drift |
+| **File paths** | Grep for `PID_DIR`, `CONFIG_DIR`, etc. | Paths in prose may be wrong |
 | **Dependencies** | Compare against `pyproject.toml` | Listed packages that don't exist |
-| **Environment variables** | Check `envvar=` in opts.py | Undocumented or renamed env vars |
 | **Provider names** | Check `agent_cli/services/` | Providers listed that aren't implemented |
 
-```bash
-# Verify model defaults
-grep -E "model.*=.*\"" agent_cli/opts.py
+**Now auto-generated (lower risk):**
+
+- Model defaults → captured in options tables via `docs_gen`
+- Environment variables → use `env_vars_table()` for accuracy
+- Option defaults/types → auto-generated from CLI introspection
 
+```bash
 # Verify tool names
 grep -E "def.*_tool|Tool\(" agent_cli/_tools.py
@@ -140,8 +138,8 @@ grep -rE "(PID_DIR|CONFIG_DIR|CACHE_DIR)" agent_cli/
 
 # Verify dependencies
 cat pyproject.toml | grep -A 50 "dependencies"
 
-# Verify env vars
-grep "envvar=" agent_cli/opts.py
+# Verify providers match implementations
+ls agent_cli/services/
 ```
 
 ### 6. Check Examples
@@ -155,10 +153,12 @@ For examples in any doc:
 
 The same info appears in multiple places. Check for conflicts:
 
 - README.md vs docs/index.md
-- docs/commands/*.md vs actual CLI help
+- Prose/examples in docs vs actual CLI behavior
 - docs/configuration.md vs agent_cli/example-config.toml
 - Provider/port info across architecture docs
 
+Note: Options tables are auto-generated, so conflicts there indicate the update script wasn't run.
+
 ### 8. Cross-Links for Navigation
 
 When commands are mentioned in prose or examples, they should link to their documentation pages. This improves discoverability and user navigation.
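The cross-link check above can be partly mechanized. A rough, hedged sketch of such a check (the grep heuristic and the sample file are illustrative, not the project's actual tooling):

```shell
# Flag backticked `agent-cli <cmd>` mentions that are not already Markdown links.
tmpdir=$(mktemp -d)
cat > "$tmpdir/page.md" <<'EOF'
Use `agent-cli transcribe` to dictate text anywhere.
See [`agent-cli chat`](chat.md) for the conversational agent.
EOF
# Lines matching a bare backticked command but containing no link syntax:
grep -nE '`agent-cli [a-z-]+`' "$tmpdir/page.md" | grep -v '](' || true
```

Run against `docs/`, this only surfaces candidates; whether a given mention should become a link is still a judgment call.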
@@ -215,8 +215,8 @@ For each issue, provide a ready-to-apply fix:
 
 ```
 ### Issue: [Brief description]
-- **File**: docs/commands/chat.md:45
-- **Problem**: `--history-dir` default shown as `~/.chat-history` but actual default is `~/.config/agent-cli/history`
-- **Fix**: Update the default value in the options table
-- **Verify**: `agent-cli chat --help`
+- **File**: docs/commands/chat.md:25
+- **Problem**: Example uses `--model gpt-4` but the default model is now `gpt-4o-mini`
+- **Fix**: Update the example to use the current default or a valid model name
+- **Verify**: `agent-cli chat --help` shows current default
 ```
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 000000000..68ad951ca
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,114 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Development Rules
+
+- Prefer functional-style Python over classes with inheritance
+- Keep it DRY - reuse code, check for existing patterns before adding new ones
+- Implement the simplest solution; don't generalize until needed
+- Only implement what's asked for, nothing extra
+- Always run `pytest` before claiming a task is done
+- Run `pre-commit run --all-files` before committing
+- Use `git add <file>` not `git add .`
+- CLI help in README.md is auto-generated - don't edit manually
+
+## Build & Development Commands
+
+```bash
+# Install all dependencies (recommended for development)
+uv sync --all-extras
+
+# Or install specific extras only
+uv sync --extra rag     # RAG proxy dependencies
+uv sync --extra memory  # Memory proxy dependencies
+uv sync --extra vad     # Voice activity detection (transcribe-daemon)
+
+# Run the CLI during development
+uv run agent-cli
+
+# Run tests (coverage enabled by default via pyproject.toml)
+uv run pytest
+
+# Linting (pre-commit runs ruff + other checks)
+pre-commit run --all-files
+
+# Update auto-generated documentation (CODE:START blocks in markdown)
+uv run python docs/update_docs.py
+```
+
+## Architecture Overview
+
+### CLI Structure
+
+The CLI is built with **Typer**. Entry point is `agent_cli/cli.py`, which registers all commands.
+
+**Shared Options Pattern**: Common CLI options (providers, API keys, audio devices) are defined once in `agent_cli/opts.py` and reused across commands. This ensures consistency and enables auto-generated documentation.
+
+### Provider Abstraction
+
+The codebase uses a **provider pattern** for AI services, allowing switching between local and cloud backends:
+
+| Capability | Providers | Implementation |
+|------------|-----------|----------------|
+| ASR (Speech-to-Text) | `wyoming`, `openai` | `services/asr.py` |
+| LLM | `ollama`, `openai`, `gemini` | `services/llm.py` |
+| TTS (Text-to-Speech) | `wyoming`, `openai`, `kokoro` | `services/tts.py` |
+
+Each agent accepts `--{asr,llm,tts}-provider` flags to select the backend.
+
+### Key Modules
+
+```
+agent_cli/
+├── cli.py            # Typer app, command registration
+├── opts.py           # Shared CLI option definitions (single source of truth)
+├── config.py         # Config file loading, dataclasses for typed configs
+├── agents/           # CLI commands (one file per command)
+│   ├── transcribe.py   # Voice-to-text with optional LLM cleanup
+│   ├── autocorrect.py  # Grammar/spelling correction
+│   ├── chat.py         # Conversational agent with tools
+│   ├── voice_edit.py   # Voice commands on clipboard text
+│   └── ...
+├── services/         # Provider implementations
+│   ├── asr.py          # Wyoming/OpenAI transcription
+│   ├── llm.py          # Ollama/OpenAI/Gemini LLM calls
+│   └── tts.py          # Wyoming/OpenAI/Kokoro TTS
+├── core/             # Shared utilities
+│   ├── audio.py        # Audio recording, device selection
+│   ├── process.py      # Background process management (--toggle, --stop)
+│   └── utils.py        # Console output, logging setup
+├── rag/              # RAG proxy server implementation
+├── memory/           # Long-term memory proxy server
+└── docs_gen.py       # Auto-generates docs from CLI introspection
+```
+
+### Agent Pattern
+
+Each agent in `agents/` follows a consistent pattern:
+
+1. Import shared options from `opts.py`
+2. Define a Typer command decorated with `@app.command()`
+3. Use `config.py` dataclasses to group related options
+4. Call provider services from `services/`
+
+### Background Process Management
+
+Commands like `transcribe`, `voice-edit`, and `chat` support running as background processes with hotkey integration:
+
+- `--toggle`: Start if stopped, stop if running
+- `--stop`: Stop any running instance
+- `--status`: Check if running
+
+PID files are stored in `~/.cache/agent-cli/`.
+
+### Documentation Auto-Generation
+
+The `docs_gen` module introspects Typer commands to generate Markdown tables. Documentation files use `markdown-code-runner` markers:
+
+```markdown
+<!-- CODE:START -->
+<!-- CODE:END -->
+<!-- OUTPUT:START -->
+<!-- OUTPUT:END -->
+```
+
+Run `uv run python docs/update_docs.py` to regenerate all auto-generated content.
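The CLAUDE.md above describes `docs_gen`'s approach: options tables derived from CLI introspection rather than hand-maintained. A minimal stdlib sketch of that idea (illustrative only — the real `docs_gen` introspects Typer/Click commands, and these option names are hypothetical):

```python
"""Sketch: derive a Markdown options table from a CLI parser definition."""
import argparse

parser = argparse.ArgumentParser(prog="transcribe")
parser.add_argument("--model", "-m", default="base", help="Whisper model to use.")
parser.add_argument("--language", default="en", help="Spoken language code.")


def options_table(p: argparse.ArgumentParser) -> str:
    """Render every optional argument as a Markdown table row."""
    rows = ["| Option | Default | Description |", "|--------|---------|-------------|"]
    for action in p._actions:  # the conventional introspection point for argparse
        if action.option_strings and "--help" not in action.option_strings:
            flags = ", ".join(f"`{s}`" for s in action.option_strings)
            rows.append(f"| {flags} | `{action.default}` | {action.help} |")
    return "\n".join(rows)


print(options_table(parser))
```

Because the table is computed from the same object that parses the CLI, defaults and flags cannot drift from the docs — the property the review checklist relies on.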
diff --git a/docs/architecture/index.md b/docs/architecture/index.md
index fca6c40c9..4bda9816e 100644
--- a/docs/architecture/index.md
+++ b/docs/architecture/index.md
@@ -44,7 +44,7 @@ Each AI capability (ASR, LLM, TTS) has multiple backend providers:
 
 | Provider | Implementation | GPU Support | Latency |
 |----------|---------------|-------------|---------|
-| `wyoming` | Wyoming Faster Whisper | CUDA/Metal | Low |
+| `wyoming` | Wyoming Whisper (faster-whisper/MLX) | CUDA/Metal | Low |
 | `openai` | OpenAI Whisper API | Cloud | Medium |
 
 ### LLM (Large Language Model)
diff --git a/docs/commands/install-hotkeys.md b/docs/commands/install-hotkeys.md
index da0c90137..b34dc6e18 100644
--- a/docs/commands/install-hotkeys.md
+++ b/docs/commands/install-hotkeys.md
@@ -16,18 +16,19 @@ agent-cli install-hotkeys [OPTIONS]
 
 Sets up hotkeys for common workflows:
 
-macOS:
+**macOS:**
+
 - Cmd+Shift+R: Toggle voice transcription
 - Cmd+Shift+A: Autocorrect clipboard text
 - Cmd+Shift+V: Voice edit clipboard text
 
-Linux:
+**Linux:**
+
 - Super+Shift+R: Toggle voice transcription
 - Super+Shift+A: Autocorrect clipboard text
 - Super+Shift+V: Voice edit clipboard text
 
-On macOS, you may need to grant Accessibility permissions to skhd in
-System Settings -> Privacy & Security -> Accessibility.
+On macOS, you may need to grant Accessibility permissions to skhd in System Settings → Privacy & Security → Accessibility.
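For reference, the default macOS bindings listed above map onto skhd rules along these lines — a hand-written sketch, not the exact file `install-hotkeys` writes:

```text
# ~/.config/skhd/skhdrc (sketch)
cmd + shift - r : agent-cli transcribe --toggle
cmd + shift - a : agent-cli autocorrect
cmd + shift - v : agent-cli voice-edit
```

skhd's `modifier - key : command` syntax runs the right-hand side through a shell, which is why the plain `agent-cli` invocations work unchanged.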
 
 ## Options
diff --git a/docs/commands/install-services.md b/docs/commands/install-services.md
index eafb04fb9..1423132a5 100644
--- a/docs/commands/install-services.md
+++ b/docs/commands/install-services.md
@@ -17,7 +17,7 @@ agent-cli install-services [OPTIONS]
 
 Installs the following services (based on your OS):
 
 - Ollama (local LLM server)
-- Wyoming Faster Whisper (speech-to-text)
+- Wyoming Whisper (faster-whisper on Linux/Intel, MLX Whisper on Apple Silicon)
 - Wyoming Piper (text-to-speech)
 - Wyoming OpenWakeWord (wake word detection)
diff --git a/docs/commands/rag-proxy.md b/docs/commands/rag-proxy.md
index e452dc0b8..0fdaa0412 100644
--- a/docs/commands/rag-proxy.md
+++ b/docs/commands/rag-proxy.md
@@ -143,13 +143,13 @@ Any OpenAI-compatible client can use the RAG proxy:
 
 # curl
 curl http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
-  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "What do my notes say about X?"}]}'
+  -d '{"model": "", "messages": [{"role": "user", "content": "What do my notes say about X?"}]}'
 
 # Python (openai library)
 from openai import OpenAI
 client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
 response = client.chat.completions.create(
-    model="gpt-5-mini",
+    model="",
     messages=[{"role": "user", "content": "Summarize my project notes"}]
 )
 ```
diff --git a/docs/commands/start-services.md b/docs/commands/start-services.md
index 3016e16cc..721720b23 100644
--- a/docs/commands/start-services.md
+++ b/docs/commands/start-services.md
@@ -17,7 +17,7 @@ agent-cli start-services [OPTIONS]
 
 Starts:
 
 - Ollama (LLM server)
-- Wyoming Faster Whisper (speech-to-text)
+- Wyoming Whisper (faster-whisper on Linux/Intel, MLX Whisper on Apple Silicon)
 - Wyoming Piper (text-to-speech)
 - Wyoming OpenWakeWord (wake word detection)
diff --git a/docs/commands/transcribe.md b/docs/commands/transcribe.md
index c79f8aa74..9d5d170cf 100644
--- a/docs/commands/transcribe.md
+++ b/docs/commands/transcribe.md
@@ -18,7 +18,7 @@ This command:
 
 1. Starts listening to your microphone immediately
 2. Records your speech
-3. When you press `Ctrl+C`, sends audio to a Whisper server
+3. When you press `Ctrl+C`, stops recording and finalizes transcription (Wyoming streams live; OpenAI uploads after stop)
 4. Copies the transcribed text to your clipboard
 5. Optionally uses an LLM to clean up the transcript
diff --git a/docs/index.md b/docs/index.md
index 1a9cf01d1..f87a84d04 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -13,9 +13,9 @@ A collection of **local-first**, AI-powered command-line agents that run entirel
 
 Agent CLI provides a suite of powerful tools for voice and text interaction, designed for privacy, offline capability, and seamless integration with system-wide hotkeys and workflows.
 
 !!! important "Local and Private by Design"
-    All agents are designed to run **100% locally**. Your data—whether from your clipboard, microphone, or files—is never sent to any cloud API. This ensures your privacy and allows the tools to work completely offline.
+    All agents can run **100% locally**. Your data—whether from your clipboard, microphone, or files—stays on your machine unless you configure a cloud provider. This keeps workflows private and allows the tools to work offline.
 
-    You can also optionally configure the agents to use OpenAI/Gemini services.
+    You can optionally configure the agents to use OpenAI/Gemini services.
 
 ## Quick Demo
diff --git a/docs/installation/docker.md b/docs/installation/docker.md
index ff2daaef2..953e7f516 100644
--- a/docs/installation/docker.md
+++ b/docs/installation/docker.md
@@ -6,7 +6,8 @@ icon: lucide/container
 
 Universal Docker setup that works on any platform with Docker support.
 
-> **⚠️ Important Limitations**
+> [!WARNING]
+> **Important Limitations**
 >
 > - **macOS**: Docker does not support GPU acceleration. For 10x better performance, use [macOS native setup](macos.md)
 > - **Linux**: Limited GPU support. For full NVIDIA GPU acceleration, use [Linux native setup](linux.md)
@@ -37,7 +38,7 @@ Universal Docker setup that works on any platform with Docker support.
 
 3. **Install agent-cli:**
 
    ```bash
-   uv tools install agent-cli
+   uv tool install agent-cli
+   # or: pip install agent-cli
    ```
diff --git a/docs/installation/index.md b/docs/installation/index.md
index 45eae050e..31198993a 100644
--- a/docs/installation/index.md
+++ b/docs/installation/index.md
@@ -15,7 +15,8 @@ Choose the best installation method for your platform and performance needs.
 
 | **NixOS** | [System Integration](nixos.md) | ✅ NVIDIA GPU | Best |
 | **Any Platform** | [Docker Setup](docker.md) | ⚠️ Limited\* | Good |
 
-> **Note**: Docker on macOS does not support GPU acceleration. For best performance on Mac, use the native setup.
+> [!NOTE]
+> Docker on macOS does not support GPU acceleration. For best performance on Mac, use the [native setup](macos.md).
 
 ## Installation Methods
@@ -65,7 +66,7 @@ Choose the best installation method for your platform and performance needs.
 
 All installation methods set up these services:
 
 - **🧠 Ollama** - LLM server (gemma3:4b model)
-- **🎤 Wyoming Faster Whisper** - Speech-to-text
+- **🎤 Wyoming Whisper** - Speech-to-text (faster-whisper on Linux/Intel, MLX Whisper on Apple Silicon)
 - **🗣️ Wyoming Piper** - Text-to-speech
 - **👂 Wyoming OpenWakeWord** - Wake word detection
@@ -84,7 +85,7 @@ Once services are running, install the agent-cli package:
 
 ```bash
 # Using uv (recommended)
-uv tools install agent-cli
+uv tool install agent-cli
 
 # Using pip
 pip install agent-cli
diff --git a/docs/installation/linux.md b/docs/installation/linux.md
index c3793c4ba..51ac2ac64 100644
--- a/docs/installation/linux.md
+++ b/docs/installation/linux.md
@@ -6,8 +6,8 @@ icon: lucide/terminal
 
 Native Linux setup with full NVIDIA GPU acceleration for optimal performance.
 
-> **🐧 Recommended for Linux**
-> This setup provides optimal performance with full NVIDIA GPU acceleration support.
+> [!TIP]
+> **🐧 Recommended for Linux** — Optimal performance with full NVIDIA GPU acceleration.
 
 ## Prerequisites
@@ -41,7 +41,7 @@ Native Linux setup with full NVIDIA GPU acceleration for optimal performance.
 
 3. **Install agent-cli:**
 
    ```bash
-   uv tools install agent-cli
+   uv tool install agent-cli
    ```
 
 4. **Test the setup:**
diff --git a/docs/installation/macos.md b/docs/installation/macos.md
index e4a9174a4..5fcd6f87d 100644
--- a/docs/installation/macos.md
+++ b/docs/installation/macos.md
@@ -6,8 +6,8 @@ icon: lucide/apple
 
 Native macOS setup with full Metal GPU acceleration for optimal performance.
 
-> **🍎 Recommended for macOS**
-> This setup provides ~10x better performance than Docker by utilizing Metal GPU acceleration.
+> [!TIP]
+> **🍎 Recommended for macOS** — ~10x better performance than Docker via Metal GPU acceleration.
 
 ## Prerequisites
@@ -87,19 +87,25 @@ scripts/start-all-services.sh
 
 If you prefer running services individually:
 
 ```bash
-# Terminal 1: Ollama (native GPU acceleration)
+# Ollama (brew service recommended)
+brew services start ollama
+# Or run in foreground:
 ollama serve
 
-# Terminal 2: Whisper (CPU optimized)
+# Whisper (Apple Silicon: launchd service or manual)
+launchctl list com.wyoming_mlx_whisper
+# Or run in foreground:
 scripts/run-whisper.sh
 
-# Terminal 3: Piper (Apple Silicon compatible)
+# Piper
 scripts/run-piper.sh
 
-# Terminal 4: OpenWakeWord (macOS compatible fork)
+# OpenWakeWord
 scripts/run-openwakeword.sh
 ```
 
+Intel Macs: prefer Docker or a Linux-style Wyoming Faster Whisper setup; MLX Whisper is Apple Silicon only.
+
 ## Why Native Setup?
 
 - **10x faster than Docker** - Full Metal GPU acceleration
diff --git a/docs/installation/nixos.md b/docs/installation/nixos.md
index fe8f729b2..168f162b6 100644
--- a/docs/installation/nixos.md
+++ b/docs/installation/nixos.md
@@ -6,8 +6,8 @@ icon: lucide/snowflake
 
 Native NixOS setup using system configuration with full GPU acceleration support.
-> **❄️ For NixOS Users**
-> This setup integrates agent-cli services directly into your NixOS system configuration.
+> [!TIP]
+> **❄️ For NixOS Users** — Integrates agent-cli services directly into your NixOS system configuration.
 
 ## Prerequisites
diff --git a/docs/system-integration.md b/docs/system-integration.md
index 0a03164f6..c5a3de910 100644
--- a/docs/system-integration.md
+++ b/docs/system-integration.md
@@ -123,10 +123,10 @@ Install [AutoHotkey](https://www.autohotkey.com/) and create a script:
 
 ```ahk
 ; Transcribe to clipboard
-#+r::Run, wsl agent-cli transcribe --toggle --input-device-index 1
+#+r::Run, agent-cli transcribe --toggle --input-device-index 1
 
 ; Autocorrect clipboard
-#+a::Run, wsl agent-cli autocorrect
+#+a::Run, agent-cli autocorrect
 ```
 
 ### PowerToys