diff --git a/skills/contribute/SKILL.md b/skills/contribute/SKILL.md
index cd4b6db..71c9bfa 100644
--- a/skills/contribute/SKILL.md
+++ b/skills/contribute/SKILL.md
@@ -98,23 +98,23 @@ let value = parse(input).context("parsing model config")?;
 | Change config | `model_gateway/src/config/types.rs` |
 | Change worker creation | `model_gateway/src/core/steps/worker/local/` |
 | Change service discovery | `model_gateway/src/service_discovery.rs` |
-| Change API types | `protocols/src/` (careful — shared by all crates) |
+| Change API types | `crates/protocols/src/` (careful — shared by all crates) |
 | Add routing policy | `model_gateway/src/core/routing/` |
-| Add tool parser | `tool_parser/src/parsers/` |
-| Add reasoning parser | `reasoning_parser/src/parsers/` |
+| Add tool parser | `crates/tool_parser/src/parsers/` |
+| Add reasoning parser | `crates/reasoning_parser/src/parsers/` |
 | Update Python bindings | `bindings/python/src/lib.rs` |
 | Update Go SDK | `bindings/golang/` |
-| Add storage backend | `data_connector/src/` |
+| Add storage backend | `crates/data_connector/src/` |
 | Add E2E tests | `e2e_test/` |
-| Add WASM middleware | `wasm/examples/` |
-| Add MCP tool support | `mcp/src/` |
+| Add WASM middleware | `crates/wasm/examples/` |
+| Add MCP tool support | `crates/mcp/src/` |
 
 ## Rationalization Prevention
 
 | Excuse | Reality |
 |--------|---------|
 | "Clippy is clean enough with a few warnings" | `-D warnings` means zero. One warning = not clean. |
-| "I didn't change bindings, skip step 4" | If you touched `config/types.rs` or `protocols/`, the struct literal in `bindings/python/src/lib.rs` may need a default. Check. |
+| "I didn't change bindings, skip step 4" | If you touched `config/types.rs` or `crates/protocols/`, the struct literal in `bindings/python/src/lib.rs` may need a default. Check. |
 | "Only touched one file, don't need full gate" | The two-path config rule means a one-file change can silently break propagation. Run all five. |
 | "Tests are slow, I'll run them later" | "Later" means shipping untested code. Run them now. |
 | "It's just a docs change" | Even docs PRs need clean formatting and conventional commits. Steps 1 and 5 still apply. |
diff --git a/skills/implement/SKILL.md b/skills/implement/SKILL.md
index 92765b2..af641be 100644
--- a/skills/implement/SKILL.md
+++ b/skills/implement/SKILL.md
@@ -22,7 +22,7 @@ Do NOT write implementation code until you have:
 3. Created a task for each step in the recipe
 </HARD-GATE>
 
-**Escape hatch:** Single-file changes under 20 lines that don't touch `config/types.rs`, `protocols/`, `main.rs` (CliArgs or conversion functions), or `bindings/` may skip the full recipe. You MUST still chain to `smg:contribute` before PR.
+**Escape hatch:** Single-file changes under 20 lines that don't touch `config/types.rs`, `crates/protocols/`, `main.rs` (CliArgs or conversion functions), or `bindings/` may skip the full recipe. You MUST still chain to `smg:contribute` before PR.
 
 ## Detection Table
 
diff --git a/skills/implement/auth-feature.md b/skills/implement/auth-feature.md
index 456cea9..4fa9eb0 100644
--- a/skills/implement/auth-feature.md
+++ b/skills/implement/auth-feature.md
@@ -6,7 +6,7 @@ Two-factor auth: API key (SHA-256) + JWT/OIDC. Roles: Admin (control plane) and
 
 ### Adding a New Role
 
-1. Extend `Role` enum in `auth/src/`
+1. Extend `Role` enum in `crates/auth/src/`
 2. Update permission checks in middleware
 3. Update role mapping for JWT claims
 4. Add audit logging for new role actions
@@ -26,7 +26,7 @@ Two-factor auth: API key (SHA-256) + JWT/OIDC. Roles: Admin (control plane) and
 
 ### Adding a Custom Auth Method
 
-1. Implement validation logic in `auth/src/`
+1. Implement validation logic in `crates/auth/src/`
 2. Extract `Principal` from request
 3. Integrate in middleware chain (`/admin/*` routes)
 4. Add audit event for the new method
diff --git a/skills/implement/grpc-backend.md b/skills/implement/grpc-backend.md
index af95bb2..d3f06d8 100644
--- a/skills/implement/grpc-backend.md
+++ b/skills/implement/grpc-backend.md
@@ -6,7 +6,7 @@ gRPC clients connect to LLM backends (SGLang, vLLM, TRT). Use shared macros for
 
 ### Step 1: Create client file
 
-**File:** `grpc_client/src/mybackend.rs`
+**File:** `crates/grpc_client/src/mybackend.rs`
 
 Implement connection, health check, and inference methods. Use shared macros:
 ```rust
diff --git a/skills/implement/kv-index-feature.md b/skills/implement/kv-index-feature.md
index c58b92c..0a18bf7 100644
--- a/skills/implement/kv-index-feature.md
+++ b/skills/implement/kv-index-feature.md
@@ -15,7 +15,7 @@ Both implement `RadixTree` trait: prefix insertion, longest-prefix-match, LRU ev
 
 ### Adding Index Features
 
-1. Implement in `kv_index/src/`
+1. Implement in `crates/kv_index/src/`
 2. Ensure `Send + Sync` (accessed from routing hot path)
 3. Support both String and Token variants if applicable
 4. Add eviction/cleanup mechanism (prevent unbounded memory)
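Note on the `kv-index-feature.md` hunk above: steps 2 and 4 (`Send + Sync`, bounded memory) can be checked mechanically. The following is a minimal sketch, not the crate's `RadixTree`. `BoundedIndex`, its FIFO eviction (simpler than the LRU eviction the doc mentions), and `check_bounds` are all hypothetical names introduced for illustration.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::RwLock;

/// Hypothetical bounded index (illustration only, not the kv_index API).
pub struct BoundedIndex {
    inner: RwLock<Inner>,
    capacity: usize,
}

struct Inner {
    map: HashMap<String, u64>,
    order: VecDeque<String>, // insertion order, oldest first
}

impl BoundedIndex {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            inner: RwLock::new(Inner {
                map: HashMap::new(),
                order: VecDeque::new(),
            }),
            capacity,
        }
    }

    /// Insert a key, evicting the oldest entries once capacity is exceeded (FIFO).
    pub fn insert(&self, key: &str, value: u64) {
        let mut inner = self.inner.write().unwrap();
        // Only record insertion order for keys that were not already present.
        if inner.map.insert(key.to_string(), value).is_none() {
            inner.order.push_back(key.to_string());
        }
        while inner.map.len() > self.capacity {
            match inner.order.pop_front() {
                Some(oldest) => {
                    inner.map.remove(&oldest);
                }
                None => break,
            }
        }
    }

    pub fn get(&self, key: &str) -> Option<u64> {
        self.inner.read().unwrap().map.get(key).copied()
    }

    pub fn len(&self) -> usize {
        self.inner.read().unwrap().map.len()
    }
}

/// Compiles only if `T` is `Send + Sync`; calling it is a no-op at runtime.
fn assert_send_sync<T: Send + Sync>() {}

/// Fails to compile if `BoundedIndex` ever stops being `Send + Sync`.
pub fn check_bounds() {
    assert_send_sync::<BoundedIndex>();
}
```

Putting the `assert_send_sync` call in a test or helper turns the thread-safety requirement into a compile error rather than a review comment, for example if someone later adds an `Rc` or `RefCell` field.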
diff --git a/skills/implement/mcp-feature.md b/skills/implement/mcp-feature.md
index e9cf688..00ddb5c 100644
--- a/skills/implement/mcp-feature.md
+++ b/skills/implement/mcp-feature.md
@@ -5,7 +5,7 @@ Model Context Protocol client for external tool servers. Manages discovery, exec
 ## Architecture
 
 ```
-mcp/src/
+crates/mcp/src/
   core/
     orchestrator.rs  → Tool execution, routing, validation (101KB)
     session.rs       → Server bindings, tool sessions
@@ -25,7 +25,7 @@ Implement `rmcp::Transport` trait for the new connection type.
 
 ### Adding a Response Format
 
-**Directory:** `mcp/src/transform/`
+**Directory:** `crates/mcp/src/transform/`
 
 Convert MCP tool results to API-compatible format (OpenAI function calling, Claude tool use, custom).
 
@@ -63,4 +63,4 @@ ToolAnnotations { read_only, destructive, idempotent, open_world }
 
 Use `#[serial_test]` for approval workflow tests (shared state).
 
-**Verify:** `cargo test -p mcp`
+**Verify:** `cargo test -p smg-mcp`
diff --git a/skills/implement/mesh-feature.md b/skills/implement/mesh-feature.md
index c452eb2..d837186 100644
--- a/skills/implement/mesh-feature.md
+++ b/skills/implement/mesh-feature.md
@@ -5,7 +5,7 @@ SWIM gossip protocol with CRDT stores. Optional — only active with multiple ga
 ## Architecture
 
 ```
-mesh/src/
+crates/mesh/src/
   service.rs          → MeshServerBuilder, cluster state
   ping_server.rs      → SWIM gossip (60KB), message batching
   sync.rs             → MeshSyncManager, state reconciliation
@@ -20,10 +20,10 @@ mesh/src/
 
 ### Adding a New CRDT Store
 
-1. Define CRDT type in `mesh/src/crdt_kv/`
-2. Register in `StateStores` (`mesh/src/stores.rs`)
-3. Add sync logic in `MeshSyncManager` (`mesh/src/sync.rs`)
-4. Emit updates in gossip messages (`mesh/src/ping_server.rs`)
+1. Define CRDT type in `crates/mesh/src/crdt_kv/`
+2. Register in `StateStores` (`crates/mesh/src/stores.rs`)
+3. Add sync logic in `MeshSyncManager` (`crates/mesh/src/sync.rs`)
+4. Emit updates in gossip messages (`crates/mesh/src/ping_server.rs`)
 5. Version with `version: u64` for causality tracking
 
 ### Adding a Cluster Integration
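Note on the "Adding a New CRDT Store" steps in the `mesh-feature.md` hunk above: step 5 asks for a `version: u64` field. One way such a field can drive merging is a last-writer-wins register, sketched below. Only the field name `version` comes from the doc. `Versioned`, `set`, and `merge` are hypothetical, and the real types in `crates/mesh/src/crdt_kv/` may resolve conflicts differently.

```rust
/// Hypothetical versioned value; only the `version: u64` field is taken from the doc.
#[derive(Debug, Clone, PartialEq)]
pub struct Versioned<T> {
    pub value: T,
    pub version: u64,
}

impl<T: Clone> Versioned<T> {
    pub fn new(value: T) -> Self {
        Self { value, version: 1 }
    }

    /// Local write: bump the version so peers can tell it supersedes older state.
    pub fn set(&mut self, value: T) {
        self.value = value;
        self.version += 1;
    }

    /// Merge a remote copy: keep whichever side has the higher version.
    /// Ties keep the local value; a real store would need a deterministic
    /// tie-break (for example, a node ID) so all peers converge.
    pub fn merge(&mut self, remote: &Versioned<T>) {
        if remote.version > self.version {
            *self = remote.clone();
        }
    }
}
```

Because `merge` only ever moves to a higher version, applying the same gossip message twice, or applying stale messages out of order, leaves the state unchanged.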
diff --git a/skills/implement/multimodal-feature.md b/skills/implement/multimodal-feature.md
index c88e9fa..cd6aec8 100644
--- a/skills/implement/multimodal-feature.md
+++ b/skills/implement/multimodal-feature.md
@@ -17,19 +17,20 @@ User message with image URL/data
 
 ### Adding a New Modality
 
-1. Extend `Modality` enum and `ChatContentPart` in `multimodal/src/`
+1. Extend `Modality` enum and `ChatContentPart` in `crates/multimodal/src/`
 2. Add fetch method to media connector
 3. Implement processing pipeline
 4. Track with UUID for deduplication
 
 ### Adding a Vision Processor
 
-**Directory:** `multimodal/src/vision/`
+**Directory:** `crates/multimodal/src/vision/`
 
 1. Implement processor trait (image → model-specific tensor format)
 2. Handle resizing, normalization, placeholder insertion
-3. Register in `ImageProcessorRegistry`
-4. Add NPZ array comparison tests for output validation
+3. Add a per-model spec module in `crates/multimodal/src/registry/` (e.g. `mymodel.rs`)
+4. Register the new module in the registry's `mod.rs`
+5. Add NPZ array comparison tests for output validation
 
 ### Adding a Media Source
 
diff --git a/skills/implement/reasoning-parser.md b/skills/implement/reasoning-parser.md
index 84557ae..1973785 100644
--- a/skills/implement/reasoning-parser.md
+++ b/skills/implement/reasoning-parser.md
@@ -140,7 +140,7 @@ This flag controls whether the parser assumes the first token is reasoning or no
 
 ## Step 1: Create parser file
 
-**File:** `reasoning_parser/src/parsers/{MODEL_NAME}.rs`
+**File:** `crates/reasoning_parser/src/parsers/{MODEL_NAME}.rs`
 
 Generate this file, substituting the 4 inputs:
 
@@ -282,26 +282,26 @@ mod tests {
 }
 ```
 
-**Verify:** `cargo check -p reasoning_parser`
+**Verify:** `cargo check -p reasoning-parser`
 
 ## Step 2: Register in module exports
 
-**File:** `reasoning_parser/src/parsers/mod.rs` — add:
+**File:** `crates/reasoning_parser/src/parsers/mod.rs` — add:
 ```rust
 pub mod {MODEL_NAME};
 pub use {MODEL_NAME}::{ModelName}Parser;
 ```
 
-**File:** `reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
+**File:** `crates/reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
 ```rust
 {ModelName}Parser,
 ```
 
-**Verify:** `cargo check -p reasoning_parser`
+**Verify:** `cargo check -p reasoning-parser`
 
 ## Step 3: Register in factory
 
-**File:** `reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add:
+**File:** `crates/reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add:
 
 ```rust
 // Parser registration
@@ -314,12 +314,12 @@ registry.register_pattern("{pattern-2}", "{MODEL_NAME}");
 
 Pattern matching is **case-insensitive substring**: `model_id.to_lowercase().contains(pattern)`.
 
-**Verify:** `cargo check -p reasoning_parser`
+**Verify:** `cargo check -p reasoning-parser`
 
 ## Step 4: Run tests
 
 ```bash
-cargo test -p reasoning_parser
+cargo test -p reasoning-parser
 ```
 
 All 7 tests in the new file plus all existing tests must pass.
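Note on Step 3 in the `reasoning-parser.md` hunk above: the doc describes pattern matching as `model_id.to_lowercase().contains(pattern)`. The sketch below shows what that rule implies when choosing patterns. `PatternRegistry` and `resolve` are hypothetical stand-ins rather than the crate's `ParserFactory` internals, and lowercasing the pattern at registration time is an assumption.

```rust
/// Hypothetical registry: (pattern, parser name) pairs checked in registration order.
pub struct PatternRegistry {
    patterns: Vec<(String, String)>,
}

impl PatternRegistry {
    pub fn new() -> Self {
        Self { patterns: Vec::new() }
    }

    /// Store the pattern lowercased (an assumption) so it can be compared
    /// against a lowercased model ID.
    pub fn register_pattern(&mut self, pattern: &str, parser: &str) {
        self.patterns
            .push((pattern.to_lowercase(), parser.to_string()));
    }

    /// Return the first parser whose pattern is a substring of the lowercased model ID.
    pub fn resolve(&self, model_id: &str) -> Option<&str> {
        let id = model_id.to_lowercase();
        self.patterns
            .iter()
            .find(|(pattern, _)| id.contains(pattern.as_str()))
            .map(|(_, parser)| parser.as_str())
    }
}
```

With substring matching, a short pattern such as `"qwen"` also matches every longer model ID that contains it, so under this first-match sketch more specific patterns would need to be registered first.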
diff --git a/skills/implement/storage-backend.md b/skills/implement/storage-backend.md
index 7358e0f..5ab49fc 100644
--- a/skills/implement/storage-backend.md
+++ b/skills/implement/storage-backend.md
@@ -6,7 +6,7 @@
 
 ### Step 1: Create module
 
-**Directory:** `data_connector/src/mybackend/`
+**Directory:** `crates/data_connector/src/mybackend/`
 
 Implement all storage trait methods with consistent behavior across operations.
 
diff --git a/skills/implement/tool-parser.md b/skills/implement/tool-parser.md
index c3ecc66..141fa20 100644
--- a/skills/implement/tool-parser.md
+++ b/skills/implement/tool-parser.md
@@ -104,7 +104,7 @@ Observe the raw format before the API normalizes it.
 
 ## Step 1: Create parser file
 
-**File:** `tool_parser/src/parsers/{PARSER_NAME}.rs`
+**File:** `crates/tool_parser/src/parsers/{PARSER_NAME}.rs`
 
 For the most common case — **JSON with tags** — generate this template:
 
@@ -251,26 +251,26 @@ impl ToolParser for {ParserName}Parser {
 }
 ```
 
-**Verify:** `cargo check -p tool_parser`
+**Verify:** `cargo check -p tool-parser`
 
 ## Step 2: Register in module exports
 
-**File:** `tool_parser/src/parsers/mod.rs` — add:
+**File:** `crates/tool_parser/src/parsers/mod.rs` — add:
 ```rust
 pub mod {PARSER_NAME};
 pub use {PARSER_NAME}::{ParserName}Parser;
 ```
 
-**File:** `tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
+**File:** `crates/tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
 ```rust
 {ParserName}Parser,
 ```
 
-**Verify:** `cargo check -p tool_parser`
+**Verify:** `cargo check -p tool-parser`
 
 ## Step 3: Register in factory
 
-**File:** `tool_parser/src/factory.rs`
+**File:** `crates/tool_parser/src/factory.rs`
 
 In `ParserFactory::new()`:
 ```rust
@@ -285,11 +285,11 @@ registry.map_model("{model-pattern-2}*", "{PARSER_NAME}");
 
 Pattern matching uses **glob wildcards** (`*` matches any characters).
 
-**Verify:** `cargo check -p tool_parser`
+**Verify:** `cargo check -p tool-parser`
 
 ## Step 4: Write tests
 
-**File:** `tool_parser/tests/tool_parser_{PARSER_NAME}.rs`
+**File:** `crates/tool_parser/tests/tool_parser_{PARSER_NAME}.rs`
 
 ```rust
 mod common;
diff --git a/skills/implement/wasm-plugin.md b/skills/implement/wasm-plugin.md
index d454015..19d8f37 100644
--- a/skills/implement/wasm-plugin.md
+++ b/skills/implement/wasm-plugin.md
@@ -21,7 +21,7 @@ Wasmtime component model with WIT interface. Plugins intercept requests/response
 
 ### Step 1: Define types in WIT (if new interface needed)
 
-**File:** `wasm/src/interface/spec.wit`
+**File:** `crates/wasm/src/interface/spec.wit`
 
 ```wit
 interface middleware-types {
@@ -33,23 +33,23 @@ interface middleware-types {
 
 ### Step 2: Add attachment point (if new hook)
 
-**File:** `wasm/src/module.rs`
+**File:** `crates/wasm/src/module.rs`
 
 Add to `MiddlewareAttachPoint` enum.
 
 ### Step 3: Implement handler matching
 
-**File:** `wasm/src/runtime.rs`
+**File:** `crates/wasm/src/runtime.rs`
 
 Match the new attachment point and execute WASM module.
 
 ### Step 4: Update module validation
 
-**File:** `wasm/src/module_manager.rs`
+**File:** `crates/wasm/src/module_manager.rs`
 
 ### Step 5: Write example guest plugin
 
-**Directory:** `wasm/examples/`
+**Directory:** `crates/wasm/examples/`
 
 ```rust
 wit_bindgen::generate!({ world: "smg" });
diff --git a/skills/map/SKILL.md b/skills/map/SKILL.md
index 18b2efa..f8b0afe 100644
--- a/skills/map/SKILL.md
+++ b/skills/map/SKILL.md
@@ -25,23 +25,26 @@ High-performance Rust gateway for LLM inference backends. Routes requests to wor
 | `tool_parser` | 13+ tool call parsers (JSON, Mistral, Qwen, DeepSeek, Pythonic, etc.). Streaming with incremental JSON | `ToolParser` trait, `ParserFactory`, `StreamingParseResult` |
 | `reasoning_parser` | Reasoning extraction from 10+ model families (DeepSeek-R1, Qwen3, Kimi, Cohere). Streaming | `ReasoningParser` trait, `ParserFactory`, `ParserResult` |
 | `tokenizer` | LLM tokenization, chat templates | `Tokenizer` |
-| `multimodal` | Image/audio processing. Vision processors (LLaVA, LLaVA-Next), media fetching | `ImageFrame`, `MultiModalInputs`, `ChatContentPart` |
+| `multimodal` | Image/audio processing. Per-model vision specs (LLaVA, Qwen-VL, Llama4, Phi3-V), media fetching | `ImageFrame`, `ChatContentPart`, `MediaConnector` |
 | `workflow` | Step-based async workflow engine (wfaas) | `StepExecutor`, `WorkflowContext` |
 | `bindings/python` | PyO3 bindings. `Router` class with ~80 constructor params, enum mapping | `Router`, `PolicyType` |
 | `bindings/golang` | Go SDK via FFI (cgo). OpenAI-style API, streaming, tool calling | `Client`, `ChatCompletionRequest` |
 | `clients/rust` | Rust client library | |
+| `grpc_servicer` | Python gRPC servicer wrapping vLLM/SGLang backends | |
 
 ## Layering Rule
 
 ```
-protocols (shared types — ALL consumers)
+crates/protocols (shared types — ALL consumers)
     ↑
 model_gateway (implementation — ONE consumer writes each field)
     ↑
 bindings/* (language SDKs — wrap model_gateway + protocols)
 ```
 
-**Iron law**: If only one crate writes a field, it doesn't belong in `protocols/`. K8s-specific, runtime-specific, or gateway-specific fields stay in `model_gateway`.
+**Directory layout**: Library crates live under `crates/` (e.g. `crates/mcp/`, `crates/mesh/`). `model_gateway/` and `bindings/` remain at the repo root.
+
+**Iron law**: If only one crate writes a field, it doesn't belong in `crates/protocols/`. K8s-specific, runtime-specific, or gateway-specific fields stay in `model_gateway`.
 
 ## Config Propagation (3-Stage)
 
@@ -61,6 +64,9 @@ ServiceDiscoveryConfig / ServerConfig — typed, runtime
 Client → HTTP/gRPC handler → Auth middleware → WASM OnRequest
   → Routing policy selects worker → Proxy to backend
   → Stream response → Tool/reasoning parsing → WASM OnResponse → Client
+
+Realtime (WebSocket):
+Client → WS upgrade → Realtime session registry → Proxy to backend WS
 ```
 
 ## Worker Lifecycle (5-Step Workflow)
diff --git a/skills/review-pr/SKILL.md b/skills/review-pr/SKILL.md
index 587822f..721f229 100644
--- a/skills/review-pr/SKILL.md
+++ b/skills/review-pr/SKILL.md
@@ -38,15 +38,15 @@ Do NOT write review comments, approve, or provide feedback until you have:
 
 | Files Changed | Review Sections |
 |---------------|-----------------|
-| `protocols/src/` | 1 (Layering), 3 (Worker Lifecycle) |
+| `crates/protocols/src/` | 1 (Layering), 3 (Worker Lifecycle) |
 | `model_gateway/src/config/` | 2 (Config Plumbing) |
 | `model_gateway/src/main.rs` | 2 (Config Plumbing) |
 | `model_gateway/src/service_discovery.rs` | 3 (Worker Lifecycle) |
 | `model_gateway/src/core/steps/worker/` | 3 (Worker Lifecycle) |
 | `model_gateway/src/core/routing/` | 4 (Routing Policy) |
-| `tool_parser/src/` | 5 (Parser Changes) |
-| `reasoning_parser/src/` | 5 (Parser Changes) |
-| `data_connector/src/` | 6 (Storage) |
+| `crates/tool_parser/src/` | 5 (Parser Changes) |
+| `crates/reasoning_parser/src/` | 5 (Parser Changes) |
+| `crates/data_connector/src/` | 6 (Storage) |
 | `bindings/` | 2 (Config Plumbing) |
 | Any file | 7 (Error Handling), 8 (Testing), 9 (Code Quality) |
 
@@ -56,7 +56,7 @@ Sections 7, 8, 9 always apply. Section 10 applies to PRs touching 3+ files or ad
 
 ### 1. Layering & Separation of Concerns
 
-- [ ] No new fields in `protocols/` types that only one crate sets
+- [ ] No new fields in `crates/protocols/` types that only one crate sets
 - [ ] Config types at correct layer: user-facing → `config/types.rs`, runtime → module-specific
 - [ ] No raw strings parsed at runtime — parse at boundary
 - [ ] WASM/MCP concerns stay in their crates, not leaking into core
@@ -111,8 +111,8 @@ Sections 7, 8, 9 always apply. Section 10 applies to PRs touching 3+ files or ad
 - [ ] Unit tests for new types/parsing including error cases
 - [ ] Integration test for full flow
 - [ ] Existing test struct literals updated with new fields
-- [ ] E2E tests if user-facing behavior changes
-- [ ] Thread-unsafe tests marked `@pytest.mark.thread_unsafe`
+- [ ] E2E tests if user-facing behavior changes (in `e2e_test/` — tests run sequentially with class-scoped backends)
+- [ ] E2E test markers set: `@pytest.mark.engine(...)`, `@pytest.mark.gpu(count)`, `@pytest.mark.model(...)` as needed
 
 ### 9. Code Quality
 
@@ -126,7 +126,7 @@ Sections 7, 8, 9 always apply. Section 10 applies to PRs touching 3+ files or ad
 
 ### 10. Architecture Smell Tests
 
-- "If I remove K8s, does this change still make sense?" → shouldn't be in `protocols/`
+- "If I remove K8s, does this change still make sense?" → shouldn't be in `crates/protocols/`
 - "Can existing config overrides or labels achieve this?" → may be unnecessary
 - "Does this compose with DP-aware mode, PD disagg, mesh HA?" → don't break existing
 - "Is this Send + Sync safe under concurrent load?" → all routing state thread-safe
diff --git a/skills/review-pr/anti-patterns.md b/skills/review-pr/anti-patterns.md
index d3488b7..68f02d3 100644
--- a/skills/review-pr/anti-patterns.md
+++ b/skills/review-pr/anti-patterns.md
@@ -14,9 +14,9 @@ Per-subsystem anti-patterns to check during PR review.
 
 | Anti-Pattern | Consequence | What to Look For |
 |-------------|-------------|------------------|
-| Adding `_override` field to WorkerSpec | Bypasses label pipeline, creates parallel data path | New fields on `WorkerSpec` in `protocols/src/worker.rs` |
+| Adding `_override` field to WorkerSpec | Bypasses label pipeline, creates parallel data path | New fields on `WorkerSpec` in `crates/protocols/src/worker.rs` |
 | Post-hoc ModelCard mutation | Race conditions, stale data in routing | `model_card.model_id = ...` after `build_model_card()` |
-| Injecting K8s-specific data into `protocols/` types | Tight coupling to K8s, breaks non-K8s deployments | New fields in `protocols/` that reference namespaces, pods, labels |
+| Injecting K8s-specific data into `crates/protocols/` types | Tight coupling to K8s, breaks non-K8s deployments | New fields in `crates/protocols/` that reference namespaces, pods, labels |
 
 ## Routing