diff --git a/docs/architecture.md b/docs/architecture.md new file mode 100644 index 00000000..9e20006e --- /dev/null +++ b/docs/architecture.md @@ -0,0 +1,191 @@ +# System Architecture + +GLaDOS is built on Marvin Minsky's **Society of Mind** architecture, where multiple specialized agents contribute to a unified intelligence. Rather than a single monolithic AI, GLaDOS assembles a dynamic context from independent subagents (emotion, memory, observation) for each LLM interaction. + +## Society of Mind Overview + +Each subagent runs its own loop, processes its domain independently, and writes outputs to shared **slots**. The main agent reads all slot contents as part of its context, giving it awareness of emotional state, environment, memory, and more — without coupling these systems together. + +```mermaid +flowchart TB + subgraph Minds["Subagents (Minds)"] + E[Emotion Agent] + O[Observer Agent] + C[Compaction Agent] + W[Weather / News] + end + + subgraph Slots["Shared Slots"] + S1["[emotion] excited, engaged"] + S2["[observer] modifiers active"] + S3["[weather] 22°C, sunny"] + end + + Minds --> Slots + Slots --> CTX[Context Builder] + CTX --> LLM[Main LLM Agent] + USER[User Input] --> LLM + LLM --> TTS[Speech Output] +``` + +## Two-Lane LLM Orchestration + +GLaDOS separates user-facing and background inference into two independent lanes: + +```mermaid +flowchart LR + A[User Input
speech / text] --> B[Priority Lane
1 dedicated worker] + C[Autonomy Loop
subagents / jobs] --> D[Autonomy Lane
N pooled workers] + B --> E[TTS → Audio] + D --> E +``` + +- **Priority lane**: A single dedicated LLM worker that handles user input. User requests are never blocked by background work. +- **Autonomy lane**: A configurable pool of 1–16 workers (default 2) for background processing — autonomy ticks, subagent LLM calls, and background jobs. + +Both lanes share the TTS and audio output pipeline. + +## Thread Architecture + +All components run in dedicated threads connected by `queue.Queue` instances. + +| Thread | Class | Daemon | Shutdown Priority | Purpose | +|--------|-------|--------|-------------------|---------| +| `SpeechListener` | `SpeechListener` | Yes | INPUT | VAD → ASR transcription | +| `TextListener` | `TextListener` | Yes | INPUT | stdin / TUI text input | +| `LLMProcessor` | `LanguageModelProcessor` | No | PROCESSING | Priority lane LLM inference | +| `LLMProcessorAutonomy-N` | `LanguageModelProcessor` | No | PROCESSING | Autonomy lane LLM inference (1–16 workers) | +| `ToolExecutor` | `ToolExecutor` | No | PROCESSING | Native + MCP tool dispatch | +| `TTSSynthesizer` | `TextToSpeechSynthesizer` | No | OUTPUT | Text → audio synthesis | +| `AudioPlayer` | `SpeechPlayer` | No | OUTPUT | Audio playback via sounddevice | +| `AutonomyLoop` | `AutonomyLoop` | Yes | BACKGROUND | Autonomy tick orchestration | +| `AutonomyTicker` | (timer thread) | Yes | BACKGROUND | Periodic tick generation | +| `VisionProcessor` | `VisionProcessor` | Yes | BACKGROUND | Camera capture → FastVLM inference | + +**Daemon vs non-daemon**: Daemon threads (`True`) are stateless input threads that can be killed immediately. Non-daemon threads (`False`) have in-flight state (conversation updates, pending audio) and must complete gracefully. 
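The thread-and-queue wiring above can be sketched in a few lines. This is a minimal illustration of the pattern only — the real components carry much more state and participate in the shutdown protocol, and the function names here are illustrative:

```python
import queue
import threading

def listener(out_q: queue.Queue, lines: list[str]) -> None:
    # Input threads are daemon-style: stateless, they just forward work.
    for line in lines:
        out_q.put(line)
    out_q.put(None)  # sentinel: no more input

def processor(in_q: queue.Queue, out_q: queue.Queue) -> None:
    # Processing threads must drain their queue and finish in-flight work.
    while True:
        item = in_q.get()
        if item is None:
            out_q.put(None)
            break
        out_q.put(item.upper())  # stand-in for LLM inference

llm_queue: queue.Queue = queue.Queue()
tts_queue: queue.Queue = queue.Queue()

threading.Thread(target=listener, args=(llm_queue, ["hello", "world"]), daemon=True).start()
worker = threading.Thread(target=processor, args=(llm_queue, tts_queue))
worker.start()

results = []
while (item := tts_queue.get()) is not None:
    results.append(item)
worker.join()
print(results)  # ['HELLO', 'WORLD']
```

Because every hop is a blocking `queue.Queue`, each component can be stopped, drained, and joined independently — which is what the shutdown orchestration below relies on.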
+ +## Queue-Based Message Flow + +```mermaid +flowchart LR + SL[SpeechListener] -->|text| PQ[llm_queue_priority] + TL[TextListener] -->|text| PQ + AL[AutonomyLoop] -->|tick| AQ[llm_queue_autonomy] + + PQ --> LLM1[LLMProcessor] + AQ --> LLM2[LLMProcessor
Autonomy 1..N] + + LLM1 -->|tool calls| TCQ[tool_calls_queue] + LLM2 -->|tool calls| TCQ + TCQ --> TE[ToolExecutor] + TE -->|results| PQ + TE -->|results| AQ + + LLM1 -->|text| TQ[tts_queue] + LLM2 -->|text| TQ + TQ --> TTS[TTSSynthesizer] + TTS -->|audio| AAQ[audio_queue] + AAQ --> AP[AudioPlayer] +``` + +### Queue Details + +| Queue | Type | Bounded | Connects | +|-------|------|---------|----------| +| `llm_queue_priority` | `Queue[dict]` | Unbounded | Input → Priority LLM worker | +| `llm_queue_autonomy` | `Queue[dict]` | Configurable | Autonomy → Autonomy LLM workers | +| `tool_calls_queue` | `Queue[dict]` | Unbounded | LLM → ToolExecutor | +| `tts_queue` | `Queue[str]` | Unbounded | LLM → TTSSynthesizer | +| `audio_queue` | `Queue[AudioMessage]` | Unbounded | TTSSynthesizer → AudioPlayer | + +## Shutdown Orchestration + +Shutdown proceeds in priority phases, each fully completing before the next begins: + +```mermaid +flowchart LR + A["1. INPUT
Stop listeners"] --> B["2. PROCESSING
Drain LLM + tools"] + B --> C["3. OUTPUT
Drain TTS + audio"] + C --> D["4. BACKGROUND
Abandon autonomy"] + D --> E["5. CLEANUP
Final teardown"] +``` + +| Phase | Priority | Components | Behavior | +|-------|----------|------------|----------| +| INPUT | 1 | SpeechListener, TextListener | Stop accepting new work | +| PROCESSING | 2 | LLMProcessor, ToolExecutor | Complete in-flight work, drain queues | +| OUTPUT | 3 | TTSSynthesizer, AudioPlayer | Complete pending output | +| BACKGROUND | 4 | AutonomyLoop, VisionProcessor | Can safely abandon | +| CLEANUP | 5 | (final operations) | Final teardown | + +The `ShutdownOrchestrator` manages this process with configurable timeouts (global: 30s, per-phase: 10s). For each group, it drains component queues first, then joins threads. + +## Context Building Pipeline + +Each LLM request assembles context from registered sources, ordered by priority (higher = earlier in context): + +| Priority | Source | Content | +|----------|--------|---------| +| 10 | `preferences` | User preferences (name, language, etc.) | +| 8 | `slots` | Autonomy slot summaries (weather, news, etc.) | +| 7 | `memory` | Relevant long-term memories | +| 5 | `emotion` | Current PAD emotional state | +| 5 | `knowledge` | Local knowledge notes | +| 3 | `constitution` | Constitutional behavioral modifiers | + +The `ContextBuilder` calls each source function on every request. Sources returning `None` are skipped. The resulting system messages are prepended to the conversation before sending to the LLM. + +The full message assembly order: +1. Personality preprompt (system/user/assistant messages) +2. Context builder system messages (table above) +3. MCP resource messages (cached, TTL-based) +4. Conversation history +5. Current user message + +## Component Interaction Overview + +```mermaid +flowchart TB + subgraph Input + MIC[Microphone] --> VAD[VAD] --> ASR[ASR Engine] + KB[Keyboard/TUI] --> TL[TextListener] + CAM[Camera] --> VP[VisionProcessor] + end + + subgraph Processing + ASR --> SL[SpeechListener] + SL --> LLM[LLMProcessor
Priority] + TL --> LLM + VP --> AL[AutonomyLoop] + AL --> LLMA[LLMProcessor
Autonomy] + LLM --> TE[ToolExecutor] + LLMA --> TE + TE --> MCP[MCP Servers] + TE --> NT[Native Tools] + end + + subgraph Output + LLM --> TTS[TTSSynthesizer] + LLMA --> TTS + TTS --> SP[SpeechPlayer] + SP --> SPKR[Speaker] + end + + subgraph Background + SM[SubagentManager] --> EA[EmotionAgent] + SM --> OA[ObserverAgent] + SM --> CA[CompactionAgent] + EA --> SS[SlotStore] + OA --> SS + SS --> CTX[ContextBuilder] + CTX --> LLM + CTX --> LLMA + end +``` + +## See Also + +- [README](../README.md) — Full project overview +- [autonomy.md](./autonomy.md) — Autonomy loop and subagent details +- [mcp.md](./mcp.md) — MCP tool system +- [audio.md](./audio.md) — Audio pipeline details diff --git a/docs/audio.md b/docs/audio.md new file mode 100644 index 00000000..6fd64849 --- /dev/null +++ b/docs/audio.md @@ -0,0 +1,187 @@ +# Audio Pipeline + +GLaDOS uses a fully local audio pipeline with ONNX-based models for voice activity detection, speech recognition, and text-to-speech synthesis. All inference runs on-device with no cloud dependencies. + +## Pipeline Overview + +```mermaid +flowchart LR + MIC[Microphone
16kHz mono] --> VAD[Silero VAD
32ms chunks] + VAD -->|speech detected| BUF[Pre-activation
Buffer 800ms] + BUF --> ASR[ASR Engine
Parakeet ONNX] + ASR -->|text| LLM[LLM Processor] + LLM -->|text| TTS[TTS Engine
GLaDOS / Kokoro] + TTS -->|audio| SP[SpeechPlayer
sounddevice] + SP --> SPKR[Speaker] +``` + +## Voice Activity Detection (VAD) + +GLaDOS uses **Silero VAD** (ONNX) to detect when the user is speaking. + +| Parameter | Value | +|-----------|-------| +| Model | Silero VAD (ONNX) | +| Sample rate | 16,000 Hz | +| Chunk size | 32ms (512 samples) | +| Trigger threshold | 0.8 (configurable) | +| Audio format | 16-bit mono float32 | + +The VAD processes audio in 32ms chunks. When the VAD confidence exceeds the threshold (default 0.8), the system transitions to recording mode and begins accumulating audio for ASR. + +### Pre-Activation Buffer + +A rolling buffer captures audio **before** VAD triggers, preventing the loss of word beginnings: + +- **Buffer size**: 800ms (25 chunks at 32ms each) +- **Implementation**: `deque(maxlen=25)` of 32ms audio chunks +- When VAD triggers, the buffer contents are prepended to the recording + +### Speech Segmentation + +Speech is segmented by silence gaps: + +- **Pause limit**: 640ms of silence ends a speech segment +- When the gap counter exceeds `PAUSE_LIMIT / VAD_SIZE` (20 chunks), the accumulated audio is sent to ASR + +## ASR Engines + +GLaDOS supports two NVIDIA Parakeet ASR engines, selectable via the `asr_engine` config option. + +### Parakeet TDT (Token and Duration Transducer) + +The default and recommended engine, offering the best accuracy. + +| Aspect | Value | +|--------|-------| +| Config value | `asr_engine: "tdt"` | +| Architecture | Encoder + Decoder + Joiner (transducer) | +| Model size | 0.6B parameters | +| Models | `parakeet-tdt-0.6b-v3_encoder.onnx`, `_decoder.onnx`, `_joiner.onnx` | +| Sample rate | 16,000 Hz | +| Backend | ONNX Runtime (CPU/CUDA) | + +### Parakeet CTC (Connectionist Temporal Classification) + +A lighter alternative with faster inference at the cost of some accuracy. 
+ +| Aspect | Value | +|--------|-------| +| Config value | `asr_engine: "ctc"` | +| Architecture | Single encoder with CTC head | +| Model size | 110M parameters | +| Model | `nemo-parakeet_tdt_ctc_110m.onnx` | +| Sample rate | 16,000 Hz | +| Backend | ONNX Runtime (CPU/CUDA) | + +Both engines use mel spectrogram preprocessing (16kHz, configurable n_fft, window size, and number of mel bins from model config YAML). + +## TTS Engines + +The TTS engine is selected by the `voice` config option. Setting `voice: "glados"` uses the GLaDOS engine; any other value selects a Kokoro voice. + +### GLaDOS Voice (Piper VITS) + +The signature GLaDOS voice from the Portal games. + +| Aspect | Value | +|--------|-------| +| Config value | `voice: "glados"` | +| Architecture | Piper VITS (ONNX) | +| Model | `models/TTS/glados.onnx` | +| Sample rate | 22,050 Hz | +| Phonemizer | Custom ONNX phonemizer (`phomenizer_en.onnx`) | +| Pipeline | Text → Phonemizer → VITS → Audio | + +### Kokoro (Multi-Voice) + +A multi-voice TTS engine supporting various voice styles. 
+
+| Aspect | Value |
+|--------|-------|
+| Config value | `voice: "<voice name>"` (e.g., `af_bella`, `am_adam`) |
+| Architecture | Kokoro ONNX |
+| Model | `models/TTS/kokoro-v1.0.fp16.onnx` |
+| Sample rate | 24,000 Hz |
+| Max phoneme length | 510 |
+| Default voice | `af_alloy` |
+
+Available voice prefixes:
+- `af_` — Female voices (e.g., `af_bella`, `af_alloy`, `af_nova`, `af_shimmer`)
+- `am_` — Male voices (e.g., `am_adam`, `am_echo`, `am_orion`, `am_sage`)
+
+## Interruption Handling
+
+When `interruptible: true` (default), user speech interrupts GLaDOS mid-response:
+
+```mermaid
+sequenceDiagram
+    participant U as User
+    participant VAD as VAD
+    participant SP as SpeechPlayer
+    participant LLM as LLMProcessor
+    participant E as EmotionAgent
+
+    SP->>SP: Playing GLaDOS response
+    U->>VAD: Starts speaking
+    VAD->>SP: Stop playback
+    SP->>SP: Record percentage spoken
+    SP->>LLM: Clip response at interruption point
+    SP->>E: EmotionEvent("user", "User interrupted me mid-sentence")
+    VAD->>LLM: New user input (priority lane)
+```
+
+Key behaviors:
+1. **Playback stops immediately** when VAD detects speech during output
+2. **Response is clipped** — the conversation history records only the portion that was actually spoken
+3. **Emotion event fires** — the emotion agent receives an interruption event, which may increase arousal
+4. **Priority lane** ensures the new user input is processed immediately
+
+## Wake Word Support
+
+When `wake_word` is configured, GLaDOS only processes speech that contains the wake word. 
+ +- **Matching**: Uses Levenshtein distance (edit distance) for fuzzy matching +- **Threshold**: A word matches if its Levenshtein distance to the wake word is small enough +- **Case-insensitive**: Both the transcription and wake word are lowercased before comparison +- **Per-word check**: Each word in the transcription is checked independently + +```yaml +wake_word: "glados" # Only respond when "glados" (or similar) is spoken +``` + +If the wake word is not detected in the transcription, the input is silently discarded. + +## Audio I/O Backend + +GLaDOS uses the `sounddevice` library for audio I/O, wrapped in the `AudioProtocol` interface. + +```python +class AudioProtocol(Protocol): + def __init__(self, vad_threshold: float | None = None) -> None: ... + def start_speaking(self, audio_data, sample_rate=None, text="") -> None: ... + def measure_percentage_spoken(self, total_samples, sample_rate=None) -> tuple[bool, int]: ... + def is_speaking(self) -> bool: ... + def stop_speaking(self) -> None: ... +``` + +The protocol-based design allows swapping audio backends. Currently `sounddevice` is the only implementation; a `websocket` backend is planned. 
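The swap-ability that the protocol design buys can be sketched with a trimmed-down protocol and a do-nothing backend. `AudioSink` and `NullAudio` below are hypothetical names for illustration (only a subset of `AudioProtocol`'s methods is shown), not part of the real codebase:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class AudioSink(Protocol):
    # Trimmed, illustrative subset of AudioProtocol.
    def start_speaking(self, audio_data, sample_rate=None, text="") -> None: ...
    def is_speaking(self) -> bool: ...
    def stop_speaking(self) -> None: ...

class NullAudio:
    """A do-nothing backend, e.g. for tests or headless runs."""

    def __init__(self) -> None:
        self._speaking = False

    def start_speaking(self, audio_data, sample_rate=None, text="") -> None:
        self._speaking = True  # pretend playback started

    def is_speaking(self) -> bool:
        return self._speaking

    def stop_speaking(self) -> None:
        self._speaking = False

sink = NullAudio()
sink.start_speaking(b"\x00\x00", sample_rate=22050)
# Structural typing: NullAudio satisfies the protocol without inheriting from it.
print(isinstance(sink, AudioSink), sink.is_speaking())  # True True
```

Any object with the right method shape slots in; the engine never needs to know which backend it is driving.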
+ +## Configuration Reference + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `voice` | string | required | TTS voice: `"glados"` or any Kokoro voice name | +| `asr_engine` | string | required | ASR engine: `"tdt"` (best) or `"ctc"` (faster) | +| `audio_io` | string | required | Audio backend: `"sounddevice"` | +| `interruptible` | bool | required | Allow user to interrupt mid-response | +| `wake_word` | string/null | `null` | Optional wake word for activation | +| `asr_muted` | bool | `false` | Start with ASR muted | +| `tts_enabled` | bool | `true` | Enable TTS output | +| `announcement` | string/null | `null` | Text to speak on startup | + +## See Also + +- [README](../README.md) — Full project overview +- [architecture.md](./architecture.md) — System architecture and thread model +- [configuration.md](./configuration.md) — Complete configuration reference diff --git a/docs/configuration.md b/docs/configuration.md new file mode 100644 index 00000000..695e3223 --- /dev/null +++ b/docs/configuration.md @@ -0,0 +1,311 @@ +# Configuration Reference + +GLaDOS is configured through YAML files validated by Pydantic. The default configuration lives in `configs/glados_config.yaml`, and custom configs can be loaded via the `--config` CLI flag. + +## YAML Structure + +All configuration is nested under a top-level `Glados:` key: + +```yaml +Glados: + llm_model: "llama3.2" + completion_url: "http://localhost:11434/api/chat" + voice: "glados" + # ... 
other options +``` + +## Loading Configuration + +### Via CLI + +```bash +# Default config +uv run glados start + +# Custom config file +uv run glados start --config ~/my_config.yaml + +# TUI with custom config +uv run glados tui --config configs/assistant_config.yaml +``` + +### CLI Overrides + +Command-line flags override config file values: + +```bash +uv run glados start --input-mode text --asr-muted --tts-disabled +uv run glados tui --theme matrix --input-mode both +``` + +### Programmatic + +```python +from glados.core.engine import Glados, GladosConfig + +config = GladosConfig.from_yaml("configs/glados_config.yaml") +glados = Glados.from_config(config) +glados.run() +``` + +## Complete Configuration Reference + +### Core Settings + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `llm_model` | string | required | LLM model name (e.g., `"llama3.2"`, `"qwen3:4b-instruct-2507-q4_K_M"`) | +| `completion_url` | string (URL) | required | OpenAI-compatible endpoint URL | +| `api_key` | string/null | `null` | API key for the LLM service | +| `llm_headers` | dict/null | `null` | Extra HTTP headers for LLM requests | +| `interruptible` | bool | required | Allow user to interrupt mid-response | +| `audio_io` | string | required | Audio backend: `"sounddevice"` | +| `input_mode` | string | `"audio"` | Input mode: `"audio"`, `"text"`, or `"both"` | + +### Audio Settings + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `voice` | string | required | TTS voice: `"glados"` or Kokoro voice name | +| `asr_engine` | string | required | ASR engine: `"tdt"` (best) or `"ctc"` (faster) | +| `tts_enabled` | bool | `true` | Enable TTS output at startup | +| `asr_muted` | bool | `false` | Start with ASR muted | +| `wake_word` | string/null | `null` | Wake word for activation | +| `announcement` | string/null | `null` | Startup announcement text (empty string to disable) | + +### UI Settings + +| Option | 
Type | Default | Description | +|--------|------|---------|-------------| +| `tui_theme` | string/null | `null` | TUI theme: `"aperture"`, `"ice"`, `"matrix"`, `"mono"`, `"ember"` | + +### Tool Settings + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `tool_timeout` | float | `30.0` | Tool execution timeout in seconds | +| `slow_clap_audio_path` | string | `"data/slow-clap.mp3"` | Path to slow clap audio file | + +## Personality Preprompt + +The personality defines GLaDOS's character through a sequence of system, user, and assistant messages: + +```yaml +Glados: + personality_preprompt: + - system: "You are GLaDOS, a sarcastic and cunning artificial intelligence..." + - user: "How do I make a cup of tea?" + - assistant: "So, you still haven't figured out tea yet? Boil water, add a tea bag..." + - user: "What should my next hobby be?" + - assistant: "Could I suggest juggling handguns?" +``` + +Each entry must have exactly one key (`system`, `user`, or `assistant`). These are converted to OpenAI-compatible chat messages: `{"role": "system", "content": "..."}`. + +The system message defines the character. User/assistant pairs provide few-shot examples of the desired tone and style. + +## LLM Backend Setup + +### Ollama (Local) + +```yaml +Glados: + completion_url: "http://localhost:11434/api/chat" + llm_model: "llama3.2" + api_key: null +``` + +### OpenAI-Compatible API + +```yaml +Glados: + completion_url: "https://api.openai.com/v1/chat/completions" + llm_model: "gpt-4" + api_key: "sk-..." +``` + +### OpenRouter + +```yaml +Glados: + completion_url: "https://openrouter.ai/api/v1/chat/completions" + llm_model: "openai/gpt-4" + api_key: "sk-or-v1-..." 
+ llm_headers: + HTTP-Referer: "https://myapp.com" + X-Title: "GLaDOS" +``` + +## Voice Selection + +### GLaDOS Voice + +```yaml +voice: "glados" # Signature GLaDOS voice (Piper VITS, 22050 Hz) +``` + +### Kokoro Voices + +```yaml +voice: "af_bella" # Female voice +voice: "am_adam" # Male voice +``` + +Available voice prefixes: +- `af_` — Female: `af_bella`, `af_alloy`, `af_nova`, `af_shimmer` +- `am_` — Male: `am_adam`, `am_echo`, `am_orion`, `am_sage` + +## Autonomy Configuration + +```yaml +Glados: + autonomy: + enabled: false + tick_interval_s: 10 + cooldown_s: 20 + autonomy_parallel_calls: 2 + autonomy_queue_max: null + coalesce_ticks: true +``` + +See [autonomy.md](./autonomy.md) for full autonomy configuration details. + +## Emotion Configuration + +```yaml +Glados: + autonomy: + emotion: + enabled: true + tick_interval_s: 30 + baseline_pleasure: 0.1 + baseline_arousal: -0.1 + baseline_dominance: 0.6 + hexaco: + honesty_humility: 0.3 + emotionality: 0.7 + extraversion: 0.4 + agreeableness: 0.2 + conscientiousness: 0.9 + openness: 0.95 +``` + +See [emotion.md](./emotion.md) for full emotion system documentation. + +## Token Management + +```yaml +Glados: + autonomy: + tokens: + token_threshold: 8000 + preserve_recent_messages: 10 + model_context_window: null + target_utilization: 0.6 + estimator: "simple" + chars_per_token: 4.0 +``` + +See [memory.md](./memory.md) for compaction and memory configuration. 
+ +## Background Jobs + +```yaml +Glados: + autonomy: + jobs: + enabled: false + poll_interval_s: 1 + hacker_news: + enabled: false + interval_s: 1800 + top_n: 5 + min_score: 200 + weather: + enabled: false + interval_s: 3600 + latitude: null + longitude: null + timezone: "auto" + temp_change_c: 4.0 + wind_alert_kmh: 40.0 +``` + +## Vision Configuration + +```yaml +Glados: + vision: + enabled: true + model_dir: "models/Vision" + camera_index: 0 + capture_interval_seconds: 5.0 + resolution: 384 + scene_change_threshold: 0.05 + max_tokens: 200 +``` + +See [vision.md](./vision.md) for vision setup and model download instructions. + +## MCP Server Configuration + +```yaml +Glados: + mcp_servers: + - name: "system_info" + transport: "stdio" + command: "python" + args: ["-m", "glados.mcp.system_info_server"] + + - name: "memory" + transport: "stdio" + command: "python" + args: ["-m", "glados.mcp.memory_server"] + + - name: "home_assistant" + transport: "http" + url: "http://homeassistant.local:8123/mcp" + token: "YOUR_TOKEN" + allowed_tools: ["light.*"] +``` + +See [mcp.md](./mcp.md) for MCP integration details. 
+ +## CLI Commands + +```bash +# Download required model files +uv run glados download + +# Start voice assistant +uv run glados start [--config PATH] [--input-mode audio|text|both] + [--asr-muted|--asr-unmuted] + [--tts-enabled|--tts-disabled] + +# Start with TUI +uv run glados tui [--config PATH] [--input-mode audio|text|both] + [--asr-muted|--asr-unmuted] + [--tts-enabled|--tts-disabled] + [--theme THEME] + +# Text-to-speech only +uv run glados say "text to speak" +``` + +## Sample Configurations + +GLaDOS ships with several sample configs in `configs/`: + +| File | Description | +|------|-------------| +| `glados_config.yaml` | Default GLaDOS personality with autonomy and MCP | +| `assistant_config.yaml` | Friendly assistant personality (Bella voice) | +| `glados_vision_config.yaml` | GLaDOS with vision enabled | + +## See Also + +- [README](../README.md) — Quick start and installation +- [architecture.md](./architecture.md) — System architecture overview +- [autonomy.md](./autonomy.md) — Autonomy loop configuration +- [mcp.md](./mcp.md) — MCP server configuration +- [emotion.md](./emotion.md) — Emotion system configuration diff --git a/docs/emotion.md b/docs/emotion.md new file mode 100644 index 00000000..78d16a7f --- /dev/null +++ b/docs/emotion.md @@ -0,0 +1,238 @@ +# Emotion System + +GLaDOS implements a dual-layer emotion system using the **PAD (Pleasure-Arousal-Dominance)** affect model for reactive emotional responses and **HEXACO** personality traits for persistent character. Emotions influence GLaDOS's behavior through constitutional modifiers that adjust snark level, proactivity, and verbosity. 
+ +## PAD Affect Model + +Emotional state is represented in three-dimensional PAD space, where each dimension ranges from **-1.0 to +1.0**: + +| Dimension | Negative | Neutral | Positive | +|-----------|----------|---------|----------| +| **Pleasure** | Unpleasant, frustrated | Neutral | Pleasant, content | +| **Arousal** | Calm, bored | Balanced | Excited, alert | +| **Dominance** | Submissive, uncertain | Balanced | In-control, confident | + +### Quadrant Mapping + +The system maps PAD values to human-readable descriptions using ±0.3 thresholds: + +| Pleasure | Arousal | Description | +|----------|---------|-------------| +| > +0.3 | > +0.3 | Excited and engaged | +| > +0.3 | < -0.3 | Calm and content | +| < -0.3 | > +0.3 | Agitated and frustrated | +| < -0.3 | < -0.3 | Bored and listless | +| ±0.3 | ±0.3 | Neutral | + +Dominance adds flavor: D > +0.3 appends "feeling in control", D < -0.3 appends "feeling uncertain". + +## State vs Mood + +The emotion system maintains two layers to prevent emotional whiplash while allowing dynamic response: + +```mermaid +flowchart LR + EV[Events] -->|immediate| S[State
P/A/D] + S -->|drifts toward| M[Mood
mood_P/A/D] + BL[Baseline] -->|drifts toward| M + M --> CTX[LLM Context] +``` + +- **State** (`pleasure`, `arousal`, `dominance`): Responds immediately to events. Updated directly by the LLM on each emotion agent tick. +- **Mood** (`mood_pleasure`, `mood_arousal`, `mood_dominance`): A slow-moving layer that drifts toward state over time, representing sustained emotional tendency. + +### Data Structure + +```python +@dataclass +class EmotionState: + pleasure: float = 0.0 # Quick-response state + arousal: float = 0.0 + dominance: float = 0.0 + mood_pleasure: float = 0.0 # Slow-moving mood + mood_arousal: float = 0.0 + mood_dominance: float = 0.0 + last_update: float # Timestamp +``` + +## HEXACO Personality Traits + +GLaDOS's persistent personality is defined using the HEXACO model (0.0–1.0 scale): + +| Trait | Default | Interpretation | +|-------|---------|----------------| +| **Honesty-Humility** | 0.3 | Low — enjoys manipulation, sarcasm, dark humor | +| **Emotionality** | 0.7 | High — reactive to perceived threats, anxiety-prone | +| **Extraversion** | 0.4 | Moderate — social engagement but maintains distance | +| **Agreeableness** | 0.2 | Low — dismissive, condescending, easily annoyed | +| **Conscientiousness** | 0.9 | High — perfectionist, detail-oriented, critical | +| **Openness** | 0.95 | Very high — intellectually curious, loves science | + +These defaults are tuned for the GLaDOS character. They can be customized in configuration for different personalities. + +Traits are compiled into a personality prompt that guides the emotion agent's LLM when interpreting events. 
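The mood layer's drift toward state can be sketched numerically. The linear update below follows the drift-rate semantics described above (one dimension shown); the exact rule applied inside the agent is an assumption:

```python
from dataclasses import dataclass

@dataclass
class EmotionState:
    pleasure: float = 0.0       # quick-response state, set by the LLM
    mood_pleasure: float = 0.0  # slow-moving mood

def drift_mood(state: EmotionState, mood_drift_rate: float = 0.1) -> None:
    # Mood closes a fixed fraction of the gap to state on every tick.
    state.mood_pleasure += (state.pleasure - state.mood_pleasure) * mood_drift_rate

s = EmotionState(pleasure=1.0)  # state jumped after a strongly positive event
for _ in range(10):
    drift_mood(s)
print(round(s.mood_pleasure, 3))  # ≈ 0.651 — mood has covered about 65% of the gap
```

The exponential approach is the point of the design: a single event moves state instantly but shifts mood only gradually, so sustained tendency needs sustained input.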
+ +## Event-Driven Updates + +Emotional state changes are triggered by events from multiple sources: + +### Event Sources + +| Source | Trigger | Example Description | +|--------|---------|---------------------| +| `user` | User interrupts mid-sentence | "User interrupted me mid-sentence" | +| `system` | Tool execution results | "Tool 'search_memory' completed successfully" | +| `system` | Tool failures/timeouts | "Tool 'get_weather' failed" | +| `vision` | Significant scene changes | Scene change with score ≥ 0.3 | + +### Event Processing + +Events are collected in a thread-safe deque (`maxlen=20` by default). On each emotion agent tick: + +1. All pending events are drained atomically +2. Events are formatted with timestamps and age +3. The full event list is sent to the LLM for state transition + +```python +@dataclass(frozen=True) +class EmotionEvent: + source: str # "user", "vision", "system" + description: str # Natural language description + timestamp: float # When the event occurred +``` + +### Vision Emotion Threshold + +Vision events only trigger emotion updates when the scene change score exceeds **0.3** (`VISION_EMOTION_THRESHOLD`). Minor scene fluctuations are ignored. + +## LLM-Driven State Transitions + +Unlike rule-based emotion systems, GLaDOS uses the LLM itself to compute emotional state transitions. The emotion agent sends: + +1. **Current state**: All PAD and mood values as JSON +2. **Baseline values**: What mood drifts toward when idle +3. **HEXACO personality**: Full trait description with behavioral implications +4. **Recent events**: Timestamped list of events since last update +5. **Time context**: Current time and seconds since last update + +The LLM responds with new PAD + mood values as JSON, considering the personality traits when interpreting events. 
+ +**Tick interval**: 30 seconds (configurable via `emotion.tick_interval_s`) + +## Baseline Drift + +When idle (no events), mood gradually drifts toward configured baseline values: + +``` +mood += (baseline - mood) × baseline_drift_rate +``` + +### Default Baseline + +| Dimension | Baseline | Interpretation | +|-----------|----------|----------------| +| Pleasure | +0.1 | Slightly positive | +| Arousal | -0.1 | Slightly calm | +| Dominance | +0.6 | High — feels in control | + +### Drift Rates + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `baseline_drift_rate` | 0.02 | 2% per tick toward baseline when idle | +| `mood_drift_rate` | 0.1 | 10% per tick toward state (LLM-controlled) | + +With a drift rate of 0.02, mood approaches baseline asymptotically over ~50–100 ticks. + +## Emotion → Behavior Bridge + +The `Constitution` system translates emotional state into behavioral modifiers injected into the main agent's context: + +| PAD Condition | Modifier | Effect | +|---------------|----------|--------| +| Pleasure < -0.3 | snark_level +0.15 | More sarcastic responses | +| Arousal > +0.3 | proactivity +0.1 | More proactive behavior | +| Dominance < -0.3 | verbosity -0.1 | Shorter responses | + +### Constitutional Bounds + +Modifiers are clamped within character-preserving bounds: + +| Field | Min | Max | Notes | +|-------|-----|-----|-------| +| `snark_level` | 0.3 | 1.0 | Min 0.3 to stay in character | +| `formality` | 0.0 | 0.7 | GLaDOS is never fully formal | +| `proactivity` | 0.0 | 1.0 | Full range | +| `verbosity` | 0.0 | 1.0 | Full range | +| `technical_depth` | 0.0 | 1.0 | Full range | + +### Modifier → Prompt Conversion + +Modifiers are converted to natural language instructions: + +- **Snark**: "Maintain mild/moderate/high levels of GLaDOS-style sarcasm." +- **Proactivity**: "Be reactive only/moderately proactive/highly proactive in offering information." 
+- **Verbosity**: "Be concise/moderately detailed/thorough in responses." + +## Context Injection + +The emotion state is registered in the context builder with **priority 5**: + +```python +context.register("emotion", emotion_state.to_prompt, priority=5) +``` + +Each LLM request receives the current emotional state as a system message, e.g.: + +``` +[emotion] Currently excited and engaged, feeling in control +``` + +## Fallback Behavior (No LLM) + +When no LLM is configured for the emotion agent: +- Events are still collected +- Baseline drift is applied on each tick +- State persists across restarts via `SubagentMemory` +- Emotion prompt is still injected into context + +## Configuration Reference + +```yaml +autonomy: + emotion: + enabled: true + tick_interval_s: 30 # How often emotion processes events + max_events: 20 # Max queued emotion events + baseline_pleasure: 0.1 # PAD baseline values + baseline_arousal: -0.1 + baseline_dominance: 0.6 + mood_drift_rate: 0.1 # How fast mood follows state + baseline_drift_rate: 0.02 # How fast mood drifts to baseline + hexaco: + honesty_humility: 0.3 + emotionality: 0.7 + extraversion: 0.4 + agreeableness: 0.2 + conscientiousness: 0.9 + openness: 0.95 +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `enabled` | bool | `true` | Enable emotion system | +| `tick_interval_s` | float | `30.0` | Seconds between emotion ticks | +| `max_events` | int | `20` | Maximum queued events | +| `baseline_pleasure` | float | `0.1` | Pleasure baseline (-1 to +1) | +| `baseline_arousal` | float | `-0.1` | Arousal baseline (-1 to +1) | +| `baseline_dominance` | float | `0.6` | Dominance baseline (-1 to +1) | +| `mood_drift_rate` | float | `0.1` | Mood → state drift rate per tick | +| `baseline_drift_rate` | float | `0.02` | Mood → baseline drift rate per tick | +| `hexaco.*` | float | (see table) | HEXACO personality traits (0.0–1.0) | + +## See Also + +- [README](../README.md) — Full project 
overview +- [autonomy.md](./autonomy.md) — Autonomy loop and subagent system +- [architecture.md](./architecture.md) — Context building pipeline +- [configuration.md](./configuration.md) — Complete configuration reference diff --git a/docs/memory.md b/docs/memory.md new file mode 100644 index 00000000..35ec5345 --- /dev/null +++ b/docs/memory.md @@ -0,0 +1,190 @@ +# Memory & Compaction + +GLaDOS implements long-term memory through an MCP memory server and automatic conversation compaction. The memory system follows an "LLM-first" principle: search is simple keyword matching, and the main agent handles semantic interpretation of results. + +## Architecture + +```mermaid +flowchart TB + subgraph Runtime + LLM[Main Agent] -->|store_fact / search_memory| MEM[Memory MCP Server] + CA[CompactionAgent] -->|store_fact / store_summary| MEM + CA -->|summarize| LLM2[Autonomy LLM] + end + + subgraph Storage["~/.glados/memory/"] + F[facts.jsonl] + S[summaries.jsonl] + end + + MEM --> F + MEM --> S + MEM -->|search results| LLM +``` + +## Long-Term Memory (MCP Server) + +The memory server (`glados.mcp.memory_server`) is a built-in MCP server providing persistent storage for facts and conversation summaries. + +### Memory Tools + +| Tool | Parameters | Description | +|------|-----------|-------------| +| `store_fact` | `fact`, `source`, `importance` | Store a fact with source tracking and importance (0.0–1.0) | +| `search_memory` | `query` | Search facts by keyword | +| `list_facts` | `min_importance` | List facts filtered by minimum importance | +| `store_summary` | `summary`, `period` | Store a conversation summary with time period | +| `get_summaries` | `period` | Retrieve summaries by period (session/daily/weekly) | +| `memory_stats` | — | Get statistics about stored memories | + +The main agent can call these tools directly during conversation (e.g., "remember that the user prefers dark mode"). 
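The server's JSONL persistence amounts to an append-plus-scan cycle, sketched below. The record fields mirror the `facts.jsonl` format documented in the next section; the helper names and temp path are illustrative, not the server's real API:

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

def store_fact(path: Path, fact: str, source: str, importance: float) -> None:
    record = {
        "fact": fact,
        "source": source,
        "importance": importance,
        "keywords": [w.lower() for w in fact.split()],
        "timestamp": datetime.now().isoformat(timespec="seconds"),
    }
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")  # one JSON object per line

def load_facts(path: Path) -> list[dict]:
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

facts_path = Path(tempfile.mkdtemp()) / "facts.jsonl"
store_fact(facts_path, "User prefers dark mode", "user_stated", 0.8)
print(load_facts(facts_path)[-1]["fact"])  # User prefers dark mode
```

Append-only JSONL keeps writes cheap and crash-safe (a partial last line is simply skipped on read), at the cost of a full scan per search — acceptable at personal-memory scale.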
+ +## JSONL Storage Format + +All memory data is stored as line-delimited JSON in `~/.glados/memory/`: + +### Facts (`facts.jsonl`) + +Each line contains a fact record: + +```json +{ + "fact": "User's name is Jason", + "source": "user_stated", + "importance": 0.9, + "keywords": ["user", "name", "jason"], + "timestamp": "2025-01-17T14:30:00" +} +``` + +### Summaries (`summaries.jsonl`) + +Each line contains a conversation summary: + +```json +{ + "summary": "Discussed home automation setup and configured living room lights", + "period": "session", + "timestamp": "2025-01-17T15:00:00" +} +``` + +## Fact Storage + +### Sources + +Facts are tagged with their source to track provenance: +- **`user_stated`** — User explicitly shared information +- **`observed`** — Inferred from conversation context +- **`compaction`** — Extracted automatically during compaction + +### Importance Scoring + +Facts are assigned importance values (0.0–1.0): +- **0.9–1.0** — Critical personal information (name, preferences) +- **0.6–0.8** — Useful context (project details, preferences) +- **0.3–0.5** — Background information +- **0.0–0.2** — Ephemeral or low-value details + +## Search Algorithm + +The memory search uses keyword matching with importance boost and recency decay: + +1. **Word overlap**: Count matching words between query and stored fact keywords +2. **Importance boost**: Higher-importance facts are ranked higher +3. **Recency decay**: Older facts are slightly penalized + +The LLM handles semantic interpretation of search results — the search infrastructure intentionally stays simple. This keeps the system lightweight while leveraging the LLM's language understanding capabilities. + +## Compaction Agent + +The `CompactionAgent` is a subagent that monitors conversation length and compresses history when it exceeds a token threshold. + +### How Compaction Works + +```mermaid +flowchart LR + A[Conversation
grows] --> B{Tokens >
threshold?} + B -->|no| A + B -->|yes| C[Preserve recent
N messages] + C --> D[Summarize older
messages via LLM] + D --> E[Extract facts
from summary] + E --> F[Replace history
with summary +
recent messages] + F --> G[Store summary
and facts in
memory server] +``` + +### Compaction Parameters + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `token_threshold` | 8000 | Trigger compaction at this token count | +| `preserve_recent_messages` | 10 | Keep this many recent messages uncompacted | +| `model_context_window` | null | Model's context window size (optional) | +| `target_utilization` | 0.6 | Target context usage when window is set | +| `estimator` | `"simple"` | Token estimator: `"simple"` (chars/4) or `"tiktoken"` | +| `chars_per_token` | 4.0 | Ratio for the simple estimator | + +### Compaction Process + +1. **Token check**: The compaction agent monitors conversation token count on each tick +2. **Threshold exceeded**: When tokens exceed the threshold, compaction begins +3. **Message preservation**: The most recent N messages are preserved unchanged +4. **Summarization**: Older messages are sent to the autonomy LLM for summarization +5. **Fact extraction**: The LLM also extracts important facts from the summarized content +6. **History replacement**: The conversation store replaces old messages with the summary +7. **Memory storage**: The summary and extracted facts are stored via the memory MCP server + +### Automatic Fact Extraction + +During compaction, the LLM is prompted to extract notable facts from the conversation: +- Personal information shared by the user +- Preferences and settings discussed +- Important decisions or outcomes +- Technical details mentioned + +Extracted facts are stored with `source: "compaction"` and appropriate importance levels. + +## Context Injection + +Memory is registered in the context builder with **priority 7**: + +```python +context.register("memory", memory_context.as_prompt, priority=7) +``` + +Relevant memories are included in the LLM context as system messages, giving the agent access to previously stored facts and summaries. 
+ +## Configuration Reference + +```yaml +autonomy: + tokens: + token_threshold: 8000 + preserve_recent_messages: 10 + model_context_window: null + target_utilization: 0.6 + estimator: "simple" + chars_per_token: 4.0 + +mcp_servers: + - name: "memory" + transport: "stdio" + command: "python" + args: ["-m", "glados.mcp.memory_server"] +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `tokens.token_threshold` | int | `8000` | Token count that triggers compaction | +| `tokens.preserve_recent_messages` | int | `10` | Messages to keep during compaction | +| `tokens.model_context_window` | int/null | `null` | Model context window (enables utilization targeting) | +| `tokens.target_utilization` | float | `0.6` | Target context utilization (0.0–1.0) | +| `tokens.estimator` | string | `"simple"` | Token estimator: `"simple"` or `"tiktoken"` | +| `tokens.chars_per_token` | float | `4.0` | Characters per token for simple estimator | + +## See Also + +- [README](../README.md) — Full project overview +- [mcp.md](./mcp.md) — MCP integration and memory server tools +- [autonomy.md](./autonomy.md) — Subagent system and compaction agent +- [configuration.md](./configuration.md) — Complete configuration reference diff --git a/docs/tui.md b/docs/tui.md new file mode 100644 index 00000000..a71915de --- /dev/null +++ b/docs/tui.md @@ -0,0 +1,198 @@ +# Text User Interface + +GLaDOS includes a rich terminal interface built with the [Textual](https://textual.textualize.io/) framework. The TUI provides real-time status monitoring, interactive panels, a command palette, and multiple color themes. 
+ +## Quick Start + +```bash +uv run glados tui +uv run glados tui --config configs/glados_config.yaml --theme matrix +``` + +## Keyboard Shortcuts + +| Key | Action | +|-----|--------| +| **F1** | Help screen (shortcut reference) | +| **Ctrl+P** | Command palette (search all commands) | +| **Ctrl+D** | Toggle Dialog panel | +| **Ctrl+L** | Toggle System Log panel | +| **Ctrl+S** | Toggle Status panel | +| **Ctrl+A** | Toggle Autonomy panel | +| **Ctrl+U** | Toggle Queue panel | +| **Ctrl+M** | Toggle MCP panel | +| **Ctrl+I** / **Tab** | Toggle all right-side info panels | +| **Ctrl+R** | Restore all panels | +| **Esc** | Close modal dialogs | + +## Layout + +The TUI is organized into a two-column layout: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Header [clock] │ +├───────────────────────────────────┬─────────────────────────┤ +│ Dialog (Ctrl+D) │ Status (Ctrl+S) │ +│ User and GLaDOS conversation │ ASR/TTS state, mic level│ +│ ├─────────────────────────┤ +│ │ Autonomy (Ctrl+A) │ +│ │ Workers, queue depth │ +├───────────────────────────────────┤─────────────────────────┤ +│ System Log (Ctrl+L) │ Queues (Ctrl+U) │ +│ Debug output and system messages │ Priority/Autonomy depth │ +│ ├─────────────────────────┤ +│ │ MCP (Ctrl+M) │ +│ │ Server status, tools │ +├───────────────────────────────────┴─────────────────────────┤ +│ > Type a message... 
│ +└─────────────────────────────────────────────────────────────┘ +``` + +### Panel Details + +| Panel | Shortcut | Height | Content | +|-------|----------|--------|---------| +| **Dialog** | Ctrl+D | 2fr | User messages (cyan) and GLaDOS responses (yellow) | +| **System Log** | Ctrl+L | 1fr | System output, debug messages, print capture | +| **Status** | Ctrl+S | 10 lines | ASR/TTS state, autonomy, vision, mic level, speaking indicator | +| **Autonomy** | Ctrl+A | 7 lines | Enabled/disabled, workers, in-flight, queue depth, coalesce | +| **Queues** | Ctrl+U | 4 lines | Priority and autonomy queue depths with wait times | +| **MCP** | Ctrl+M | 5 lines | MCP server status (up to 6 servers: name, online/offline, tool count) | + +The right-side panels can all be toggled at once with **Ctrl+I** (or **Tab**). + +## Command Palette + +Press **Ctrl+P** to open the command palette. Type to filter commands, then press Enter to execute. + +### TUI Commands + +| Command | Description | +|---------|-------------| +| Theme | Switch TUI theme | +| Context | Show autonomy slot context | +| Messages | Show dialog history | +| Observability | Open live event log | +| Help | Show keyboard shortcuts | + +### Engine Commands + +Commands can also be typed directly in the input field with a `/` prefix. 
+ +| Command | Usage | Description | +|---------|-------|-------------| +| `/help` | `/help` | Show available commands | +| `/status` | `/status` | Show engine status | +| `/tts` | `/tts on\|off` | Control TTS output | +| `/mute-tts` | `/mute-tts` | Mute TTS | +| `/unmute-tts` | `/unmute-tts` | Unmute TTS | +| `/asr` | `/asr on\|off` | Control ASR input | +| `/mute-asr` | `/mute-asr` | Mute ASR | +| `/unmute-asr` | `/unmute-asr` | Unmute ASR | +| `/autonomy` | `/autonomy on\|off` | Toggle autonomy system | +| `/autonomy` | `/autonomy debounce on\|off` | Toggle tick coalescing | +| `/emotion` | `/emotion` | Show current PAD emotional state | +| `/slots` | `/slots` | Show autonomy slots | +| `/minds` | `/minds` | Show active minds (subagents) | +| `/agents` | `/agents` | Show registered subagents | +| `/mcp` | `/mcp status` | Show MCP server status | +| `/context` | `/context` | Show context/token usage | +| `/constitution` | `/constitution` | Show constitutional state and modifiers | +| `/preferences` | `/preferences` | Show user preferences | +| `/vision` | `/vision` | Show latest vision snapshot | +| `/config` | `/config` | Show config summary | +| `/knowledge` | `/knowledge add\|list\|set\|delete\|clear` | Manage local knowledge notes | +| `/memory` | `/memory` | Show long-term memory stats | +| `/observe` | `/observe` | Open observability screen | +| `/quit` | `/quit` | Quit GLaDOS (alias: `/exit`) | + +Some commands are hidden from the palette when not applicable (e.g., `/vision` when vision is disabled, `/emotion` when emotion agent is not running). 
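Input starting with `/` is routed to a command handler instead of becoming an LLM turn. A hypothetical sketch of that dispatch (the `dispatch` function and handlers are illustrative, not the actual TUI implementation):

```python
def dispatch(line: str, commands: dict) -> str:
    """Route '/'-prefixed input to a command handler; plain text goes to the LLM."""
    if not line.startswith("/"):
        return "llm"  # ordinary input becomes a normal LLM turn
    parts = line[1:].split()
    if not parts:
        return "empty command (try /help)"
    name, args = parts[0], parts[1:]
    handler = commands.get(name)
    if handler is None:
        return f"unknown command: /{name} (try /help)"
    return handler(args)


# Illustrative handlers for a couple of the commands above
commands = {
    "tts": lambda args: f"tts {'enabled' if args == ['on'] else 'disabled'}",
    "quit": lambda args: "shutting down",
}
commands["exit"] = commands["quit"]  # /exit is an alias for /quit
```

A registry keyed by command name also makes the conditional hiding described above cheap: commands like `/vision` or `/emotion` can simply be left out of the dict when their subsystem is disabled.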
+ +## Themes + +GLaDOS includes five built-in themes: + +| Theme | Primary | Background | Style | +|-------|---------|------------|-------| +| **aperture** | Orange/Gold | Dark gray | Default GLaDOS theme | +| **ice** | Light blue | Very dark blue | Cool, minimal | +| **matrix** | Bright green | Black | Terminal/hacker style | +| **mono** | Light gray | Very dark gray | Monochrome | +| **ember** | Orange/Red | Dark brown | Warm, fiery | + +### Selecting a Theme + +**Via command palette**: Press Ctrl+P, search "Theme", select from the list. + +**Via config**: +```yaml +Glados: + tui_theme: "aperture" +``` + +**Via CLI**: +```bash +uv run glados tui --theme matrix +``` + +## Modal Screens + +Several modal screens overlay the main interface: + +| Screen | Trigger | Description | +|--------|---------|-------------| +| **Help** | F1 | Keyboard shortcuts reference | +| **Context** | Ctrl+P → Context | Autonomy slot contents | +| **Messages** | Ctrl+P → Messages | Full dialog history (up to 500 messages) | +| **Observability** | Ctrl+P → Observability | Live event log with level, source, kind. Updates every 250ms | +| **Theme Picker** | Ctrl+P → Theme | Theme selection list | +| **Info** | Various commands | Generic scrollable output for command results | + +All modal screens close with **Esc**. + +## Status Indicators + +The status panel shows real-time system state: + +``` +ASR: ACTIVE TTS: ACTIVE +Autonomy: ON Jobs: OFF +Vision: OFF +Speaking: ● (green when speaking) +Microphone: ● 42.3 dB ████████░░░░░░░░ +``` + +The queue panel shows LLM pipeline health: + +``` +Priority: 0 queued wait 0ms +Autonomy: 1 queued wait 234ms +``` + +Panels refresh every 300ms from the engine's observability bus. + +## Splash Screen + +On startup, the TUI displays a splash screen with: +- GLaDOS ASCII art logo +- Model name and endpoint +- Tips for getting started +- "Initializing systems..." 
until the engine is ready + +The splash screen transitions to the main interface automatically when initialization completes. + +## Configuration + +```yaml +Glados: + tui_theme: "aperture" # Theme name (aperture, ice, matrix, mono, ember) +``` + +The theme can be overridden at runtime via `--theme` CLI flag or the command palette. + +## See Also + +- [README](../README.md) — Quick start and installation +- [configuration.md](./configuration.md) — Complete configuration reference +- [architecture.md](./architecture.md) — System architecture