Merged
14 changes: 7 additions & 7 deletions skills/contribute/SKILL.md
@@ -98,23 +98,23 @@ let value = parse(input).context("parsing model config")?;
| Change config | `model_gateway/src/config/types.rs` |
| Change worker creation | `model_gateway/src/core/steps/worker/local/` |
| Change service discovery | `model_gateway/src/service_discovery.rs` |
| Change API types | `protocols/src/` (careful — shared by all crates) |
| Change API types | `crates/protocols/src/` (careful — shared by all crates) |
| Add routing policy | `model_gateway/src/core/routing/` |
| Add tool parser | `tool_parser/src/parsers/` |
| Add reasoning parser | `reasoning_parser/src/parsers/` |
| Add tool parser | `crates/tool_parser/src/parsers/` |
| Add reasoning parser | `crates/reasoning_parser/src/parsers/` |
| Update Python bindings | `bindings/python/src/lib.rs` |
| Update Go SDK | `bindings/golang/` |
| Add storage backend | `data_connector/src/` |
| Add storage backend | `crates/data_connector/src/` |
| Add E2E tests | `e2e_test/` |
| Add WASM middleware | `wasm/examples/` |
| Add MCP tool support | `mcp/src/` |
| Add WASM middleware | `crates/wasm/examples/` |
| Add MCP tool support | `crates/mcp/src/` |

## Rationalization Prevention

| Excuse | Reality |
|--------|---------|
| "Clippy is clean enough with a few warnings" | `-D warnings` means zero. One warning = not clean. |
| "I didn't change bindings, skip step 4" | If you touched `config/types.rs` or `protocols/`, the struct literal in `bindings/python/src/lib.rs` may need a default. Check. |
| "I didn't change bindings, skip step 4" | If you touched `config/types.rs` or `crates/protocols/`, the struct literal in `bindings/python/src/lib.rs` may need a default. Check. |
| "Only touched one file, don't need full gate" | The two-path config rule means a one-file change can silently break propagation. Run all five. |
| "Tests are slow, I'll run them later" | "Later" means shipping untested code. Run them now. |
| "It's just a docs change" | Even docs PRs need clean formatting and conventional commits. Steps 1 and 5 still apply. |
2 changes: 1 addition & 1 deletion skills/implement/SKILL.md
@@ -22,7 +22,7 @@ Do NOT write implementation code until you have:
3. Created a task for each step in the recipe
</HARD-GATE>

**Escape hatch:** Single-file changes under 20 lines that don't touch `config/types.rs`, `protocols/`, `main.rs` (CliArgs or conversion functions), or `bindings/` may skip the full recipe. You MUST still chain to `smg:contribute` before PR.
**Escape hatch:** Single-file changes under 20 lines that don't touch `config/types.rs`, `crates/protocols/`, `main.rs` (CliArgs or conversion functions), or `bindings/` may skip the full recipe. You MUST still chain to `smg:contribute` before PR.

## Detection Table

4 changes: 2 additions & 2 deletions skills/implement/auth-feature.md
@@ -6,7 +6,7 @@ Two-factor auth: API key (SHA-256) + JWT/OIDC. Roles: Admin (control plane) and

### Adding a New Role

1. Extend `Role` enum in `auth/src/`
1. Extend `Role` enum in `crates/auth/src/`
2. Update permission checks in middleware
3. Update role mapping for JWT claims
4. Add audit logging for new role actions
@@ -26,7 +26,7 @@ Two-factor auth: API key (SHA-256) + JWT/OIDC. Roles: Admin (control plane) and

### Adding a Custom Auth Method

1. Implement validation logic in `auth/src/`
1. Implement validation logic in `crates/auth/src/`
2. Extract `Principal` from request
3. Integrate in middleware chain (`/admin/*` routes)
4. Add audit event for the new method
2 changes: 1 addition & 1 deletion skills/implement/grpc-backend.md
@@ -6,7 +6,7 @@ gRPC clients connect to LLM backends (SGLang, vLLM, TRT). Use shared macros for

### Step 1: Create client file

**File:** `grpc_client/src/mybackend.rs`
**File:** `crates/grpc_client/src/mybackend.rs`

Implement connection, health check, and inference methods. Use shared macros:
```rust
2 changes: 1 addition & 1 deletion skills/implement/kv-index-feature.md
@@ -15,7 +15,7 @@ Both implement `RadixTree` trait: prefix insertion, longest-prefix-match, LRU ev

### Adding Index Features

1. Implement in `kv_index/src/`
1. Implement in `crates/kv_index/src/`
2. Ensure `Send + Sync` (accessed from routing hot path)
3. Support both String and Token variants if applicable
4. Add eviction/cleanup mechanism (prevent unbounded memory)
6 changes: 3 additions & 3 deletions skills/implement/mcp-feature.md
@@ -5,7 +5,7 @@ Model Context Protocol client for external tool servers. Manages discovery, exec
## Architecture

```
mcp/src/
crates/mcp/src/
core/
orchestrator.rs → Tool execution, routing, validation (101KB)
session.rs → Server bindings, tool sessions
@@ -25,7 +25,7 @@ Implement `rmcp::Transport` trait for the new connection type.

### Adding a Response Format

**Directory:** `mcp/src/transform/`
**Directory:** `crates/mcp/src/transform/`

Convert MCP tool results to API-compatible format (OpenAI function calling, Claude tool use, custom).

@@ -63,4 +63,4 @@ ToolAnnotations { read_only, destructive, idempotent, open_world }

Use `#[serial_test]` for approval workflow tests (shared state).

**Verify:** `cargo test -p mcp`
**Verify:** `cargo test -p smg-mcp`
10 changes: 5 additions & 5 deletions skills/implement/mesh-feature.md
@@ -5,7 +5,7 @@ SWIM gossip protocol with CRDT stores. Optional — only active with multiple ga
## Architecture

```
mesh/src/
crates/mesh/src/
service.rs → MeshServerBuilder, cluster state
ping_server.rs → SWIM gossip (60KB), message batching
sync.rs → MeshSyncManager, state reconciliation
@@ -20,10 +20,10 @@ mesh/src/

### Adding a New CRDT Store

1. Define CRDT type in `mesh/src/crdt_kv/`
2. Register in `StateStores` (`mesh/src/stores.rs`)
3. Add sync logic in `MeshSyncManager` (`mesh/src/sync.rs`)
4. Emit updates in gossip messages (`mesh/src/ping_server.rs`)
1. Define CRDT type in `crates/mesh/src/crdt_kv/`
2. Register in `StateStores` (`crates/mesh/src/stores.rs`)
3. Add sync logic in `MeshSyncManager` (`crates/mesh/src/sync.rs`)
4. Emit updates in gossip messages (`crates/mesh/src/ping_server.rs`)
5. Version with `version: u64` for causality tracking
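
As an illustration of step 5, the sketch below shows one way a `version: u64` field can drive a deterministic merge. It is a minimal last-writer-wins example, not the mesh crate's actual types: `VersionedEntry`, `node_id`, and `merge` are hypothetical names, and the tie-break on `node_id` is an assumption. A single counter orders writes but does not detect concurrent updates the way a vector clock would.

```rust
/// Hypothetical versioned entry; names are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
struct VersionedEntry {
    value: String,
    /// Bumped by the writer on every update.
    version: u64,
    /// Assumed tie-breaker when two nodes produce the same version.
    node_id: u32,
}

impl VersionedEntry {
    /// Keep whichever entry has the larger (version, node_id) pair.
    /// Taking a maximum makes the merge commutative and idempotent,
    /// so gossip can deliver updates in any order or more than once.
    fn merge(&self, other: &Self) -> Self {
        if (other.version, other.node_id) > (self.version, self.node_id) {
            other.clone()
        } else {
            self.clone()
        }
    }
}
```

Merging in either order yields the same entry, which is the property a gossip-replicated store needs for replicas to converge.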

### Adding a Cluster Integration
9 changes: 5 additions & 4 deletions skills/implement/multimodal-feature.md
@@ -17,19 +17,20 @@ User message with image URL/data

### Adding a New Modality

1. Extend `Modality` enum and `ChatContentPart` in `multimodal/src/`
1. Extend `Modality` enum and `ChatContentPart` in `crates/multimodal/src/`
2. Add fetch method to media connector
3. Implement processing pipeline
4. Track with UUID for deduplication

### Adding a Vision Processor

**Directory:** `multimodal/src/vision/`
**Directory:** `crates/multimodal/src/vision/`

1. Implement processor trait (image → model-specific tensor format)
2. Handle resizing, normalization, placeholder insertion
3. Register in `ImageProcessorRegistry`
4. Add NPZ array comparison tests for output validation
3. Add per-model spec module in `crates/multimodal/src/registry/` (e.g. `mymodel.rs`)
4. Register in the registry's `mod.rs`
5. Add NPZ array comparison tests for output validation
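
The normalization part of step 2 can be sketched as follows. This is a minimal example under stated assumptions, not the crate's processor trait: `normalize_chw` is a hypothetical helper, the input is assumed to be tightly packed interleaved RGB bytes, and the per-channel `mean` and `std` are supplied by the caller because they are model-specific.

```rust
/// Hypothetical helper: convert interleaved RGB bytes (HWC layout) into a
/// channel-first (CHW) f32 buffer, scaling each byte to [0, 1] and then
/// normalizing per channel as (x - mean) / std.
fn normalize_chw(
    rgb: &[u8],
    width: usize,
    height: usize,
    mean: [f32; 3],
    std: [f32; 3],
) -> Vec<f32> {
    assert_eq!(rgb.len(), width * height * 3, "expected tightly packed RGB");
    let plane = width * height; // number of pixels per channel plane
    let mut out = vec![0.0f32; plane * 3];
    for (i, px) in rgb.chunks_exact(3).enumerate() {
        for c in 0..3 {
            let scaled = px[c] as f32 / 255.0;
            out[c * plane + i] = (scaled - mean[c]) / std[c];
        }
    }
    out
}
```

A small fixed input like a two-pixel image makes a convenient unit test alongside the NPZ comparison tests, since the expected output can be computed by hand.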

### Adding a Media Source

16 changes: 8 additions & 8 deletions skills/implement/reasoning-parser.md
@@ -140,7 +140,7 @@ This flag controls whether the parser assumes the first token is reasoning or no

## Step 1: Create parser file

**File:** `reasoning_parser/src/parsers/{MODEL_NAME}.rs`
**File:** `crates/reasoning_parser/src/parsers/{MODEL_NAME}.rs`

Generate this file, substituting the 4 inputs:

@@ -282,26 +282,26 @@ mod tests {
}
```

**Verify:** `cargo check -p reasoning_parser`
**Verify:** `cargo check -p reasoning-parser`

## Step 2: Register in module exports

**File:** `reasoning_parser/src/parsers/mod.rs` — add:
**File:** `crates/reasoning_parser/src/parsers/mod.rs` — add:
```rust
pub mod {MODEL_NAME};
pub use {MODEL_NAME}::{ModelName}Parser;
```

**File:** `reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
**File:** `crates/reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
```rust
{ModelName}Parser,
```

**Verify:** `cargo check -p reasoning_parser`
**Verify:** `cargo check -p reasoning-parser`

## Step 3: Register in factory

**File:** `reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add:
**File:** `crates/reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add:

```rust
// Parser registration
@@ -314,12 +314,12 @@ registry.register_pattern("{pattern-2}", "{MODEL_NAME}");

Pattern matching is **case-insensitive substring**: `model_id.to_lowercase().contains(pattern)`.
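
The matching rule above can be sketched in isolation. `find_parser` and the slice of `(pattern, parser_name)` pairs are hypothetical stand-ins for the registry, and first-match-wins ordering is an assumption. Because only the model ID is lowercased, patterns need to be registered in lowercase to match at all.

```rust
/// Hypothetical lookup: returns the parser name for the first registered
/// pattern that occurs as a substring of the lowercased model id.
fn find_parser<'a>(patterns: &[(&'a str, &'a str)], model_id: &str) -> Option<&'a str> {
    // Only the model id is lowercased; patterns are compared as registered.
    let id = model_id.to_lowercase();
    patterns
        .iter()
        .find(|(pattern, _)| id.contains(*pattern))
        .map(|(_, parser)| *parser)
}
```

With this rule, a pattern such as `deepseek-r1` matches `DeepSeek-R1-Distill-Qwen-7B`, so overly short patterns can capture unrelated model IDs.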

**Verify:** `cargo check -p reasoning_parser`
**Verify:** `cargo check -p reasoning-parser`

## Step 4: Run tests

```bash
cargo test -p reasoning_parser
cargo test -p reasoning-parser
```

All 7 tests in the new file plus all existing tests must pass.
2 changes: 1 addition & 1 deletion skills/implement/storage-backend.md
@@ -6,7 +6,7 @@

### Step 1: Create module

**Directory:** `data_connector/src/mybackend/`
**Directory:** `crates/data_connector/src/mybackend/`

Implement all storage trait methods with consistent behavior across operations.

16 changes: 8 additions & 8 deletions skills/implement/tool-parser.md
@@ -104,7 +104,7 @@ Observe the raw format before the API normalizes it.

## Step 1: Create parser file

**File:** `tool_parser/src/parsers/{PARSER_NAME}.rs`
**File:** `crates/tool_parser/src/parsers/{PARSER_NAME}.rs`

For the most common case — **JSON with tags** — generate this template:

@@ -251,26 +251,26 @@ impl ToolParser for {ParserName}Parser {
}
```

**Verify:** `cargo check -p tool_parser`
**Verify:** `cargo check -p tool-parser`

## Step 2: Register in module exports

**File:** `tool_parser/src/parsers/mod.rs` — add:
**File:** `crates/tool_parser/src/parsers/mod.rs` — add:
```rust
pub mod {PARSER_NAME};
pub use {PARSER_NAME}::{ParserName}Parser;
```

**File:** `tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
**File:** `crates/tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
```rust
{ParserName}Parser,
```

**Verify:** `cargo check -p tool_parser`
**Verify:** `cargo check -p tool-parser`

## Step 3: Register in factory

**File:** `tool_parser/src/factory.rs`
**File:** `crates/tool_parser/src/factory.rs`

In `ParserFactory::new()`:
```rust
@@ -285,11 +285,11 @@ registry.map_model("{model-pattern-2}*", "{PARSER_NAME}");

Pattern matching uses **glob wildcards** (`*` matches any characters).
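
To make the wildcard semantics concrete, here is a minimal sketch of `*` matching. `glob_match` is a hypothetical helper, not the factory's implementation. It assumes `*` is the only special character and that matching is case-sensitive, neither of which the text above confirms.

```rust
/// Hypothetical matcher: `*` matches any run of characters (including none);
/// every other character must match literally.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // position of the most recent `*`
    let mut mark = 0usize; // text position that `*` currently extends to
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try matching it against nothing.
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing stars can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}
```

Under these assumptions a pattern like `qwen*` matches `qwen2.5-7b` but not `my-qwen`, which is the main behavioral difference from the substring rule used by the reasoning parser.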

**Verify:** `cargo check -p tool_parser`
**Verify:** `cargo check -p tool-parser`

## Step 4: Write tests

**File:** `tool_parser/tests/tool_parser_{PARSER_NAME}.rs`
**File:** `crates/tool_parser/tests/tool_parser_{PARSER_NAME}.rs`

```rust
mod common;
10 changes: 5 additions & 5 deletions skills/implement/wasm-plugin.md
@@ -21,7 +21,7 @@ Wasmtime component model with WIT interface. Plugins intercept requests/response

### Step 1: Define types in WIT (if new interface needed)

**File:** `wasm/src/interface/spec.wit`
**File:** `crates/wasm/src/interface/spec.wit`

```wit
interface middleware-types {
@@ -33,23 +33,23 @@ interface middleware-types {

### Step 2: Add attachment point (if new hook)

**File:** `wasm/src/module.rs`
**File:** `crates/wasm/src/module.rs`

Add to `MiddlewareAttachPoint` enum.

### Step 3: Implement handler matching

**File:** `wasm/src/runtime.rs`
**File:** `crates/wasm/src/runtime.rs`

Match the new attachment point and execute WASM module.

### Step 4: Update module validation

**File:** `wasm/src/module_manager.rs`
**File:** `crates/wasm/src/module_manager.rs`

### Step 5: Write example guest plugin

**Directory:** `wasm/examples/`
**Directory:** `crates/wasm/examples/`

```rust
wit_bindgen::generate!({ world: "smg" });
12 changes: 9 additions & 3 deletions skills/map/SKILL.md
@@ -25,23 +25,26 @@ High-performance Rust gateway for LLM inference backends. Routes requests to wor
| `tool_parser` | 13+ tool call parsers (JSON, Mistral, Qwen, DeepSeek, Pythonic, etc.). Streaming with incremental JSON | `ToolParser` trait, `ParserFactory`, `StreamingParseResult` |
| `reasoning_parser` | Reasoning extraction from 10+ model families (DeepSeek-R1, Qwen3, Kimi, Cohere). Streaming | `ReasoningParser` trait, `ParserFactory`, `ParserResult` |
| `tokenizer` | LLM tokenization, chat templates | `Tokenizer` |
| `multimodal` | Image/audio processing. Vision processors (LLaVA, LLaVA-Next), media fetching | `ImageFrame`, `MultiModalInputs`, `ChatContentPart` |
| `multimodal` | Image/audio processing. Per-model vision specs (LLaVA, Qwen-VL, Llama4, Phi3-V), media fetching | `ImageFrame`, `ChatContentPart`, `MediaConnector` |
| `workflow` | Step-based async workflow engine (wfaas) | `StepExecutor`, `WorkflowContext` |
| `bindings/python` | PyO3 bindings. `Router` class with ~80 constructor params, enum mapping | `Router`, `PolicyType` |
| `bindings/golang` | Go SDK via FFI (cgo). OpenAI-style API, streaming, tool calling | `Client`, `ChatCompletionRequest` |
| `clients/rust` | Rust client library | |
| `grpc_servicer` | Python gRPC servicer wrapping vLLM/SGLang backends | |

## Layering Rule

```
protocols (shared types — ALL consumers)
crates/protocols (shared types — ALL consumers)
model_gateway (implementation — ONE consumer writes each field)
bindings/* (language SDKs — wrap model_gateway + protocols)
```

**Iron law**: If only one crate writes a field, it doesn't belong in `protocols/`. K8s-specific, runtime-specific, or gateway-specific fields stay in `model_gateway`.
**Directory layout**: Library crates live under `crates/` (e.g. `crates/mcp/`, `crates/mesh/`). `model_gateway/` and `bindings/` remain at repo root.

**Iron law**: If only one crate writes a field, it doesn't belong in `crates/protocols/`. K8s-specific, runtime-specific, or gateway-specific fields stay in `model_gateway`.

## Config Propagation (3-Stage)

@@ -61,6 +64,9 @@ ServiceDiscoveryConfig / ServerConfig — typed, runtime
Client → HTTP/gRPC handler → Auth middleware → WASM OnRequest
→ Routing policy selects worker → Proxy to backend
→ Stream response → Tool/reasoning parsing → WASM OnResponse → Client

Realtime (WebSocket):
Client → WS upgrade → Realtime session registry → Proxy to backend WS
```

## Worker Lifecycle (5-Step Workflow)
16 changes: 8 additions & 8 deletions skills/review-pr/SKILL.md
@@ -38,15 +38,15 @@ Do NOT write review comments, approve, or provide feedback until you have:

| Files Changed | Review Sections |
|---------------|-----------------|
| `protocols/src/` | 1 (Layering), 3 (Worker Lifecycle) |
| `crates/protocols/src/` | 1 (Layering), 3 (Worker Lifecycle) |
| `model_gateway/src/config/` | 2 (Config Plumbing) |
| `model_gateway/src/main.rs` | 2 (Config Plumbing) |
| `model_gateway/src/service_discovery.rs` | 3 (Worker Lifecycle) |
| `model_gateway/src/core/steps/worker/` | 3 (Worker Lifecycle) |
| `model_gateway/src/core/routing/` | 4 (Routing Policy) |
| `tool_parser/src/` | 5 (Parser Changes) |
| `reasoning_parser/src/` | 5 (Parser Changes) |
| `data_connector/src/` | 6 (Storage) |
| `crates/tool_parser/src/` | 5 (Parser Changes) |
| `crates/reasoning_parser/src/` | 5 (Parser Changes) |
| `crates/data_connector/src/` | 6 (Storage) |
| `bindings/` | 2 (Config Plumbing) |
| Any file | 7 (Error Handling), 8 (Testing), 9 (Code Quality) |

@@ -56,7 +56,7 @@ Sections 7, 8, 9 always apply. Section 10 applies to PRs touching 3+ files or ad

### 1. Layering & Separation of Concerns

- [ ] No new fields in `protocols/` types that only one crate sets
- [ ] No new fields in `crates/protocols/` types that only one crate sets
- [ ] Config types at correct layer: user-facing → `config/types.rs`, runtime → module-specific
- [ ] No raw strings parsed at runtime — parse at boundary
- [ ] WASM/MCP concerns stay in their crates, not leaking into core
@@ -111,8 +111,8 @@ Sections 7, 8, 9 always apply. Section 10 applies to PRs touching 3+ files or ad
- [ ] Unit tests for new types/parsing including error cases
- [ ] Integration test for full flow
- [ ] Existing test struct literals updated with new fields
- [ ] E2E tests if user-facing behavior changes
- [ ] Thread-unsafe tests marked `@pytest.mark.thread_unsafe`
- [ ] E2E tests if user-facing behavior changes (in `e2e_test/` — tests run sequentially with class-scoped backends)
- [ ] E2E test markers set: `@pytest.mark.engine(...)`, `@pytest.mark.gpu(count)`, `@pytest.mark.model(...)` as needed

### 9. Code Quality

@@ -126,7 +126,7 @@ Sections 7, 8, 9 always apply. Section 10 applies to PRs touching 3+ files or ad

### 10. Architecture Smell Tests

- "If I remove K8s, does this change still make sense?" → shouldn't be in `protocols/`
- "If I remove K8s, does this change still make sense?" → shouldn't be in `crates/protocols/`
- "Can existing config overrides or labels achieve this?" → may be unnecessary
- "Does this compose with DP-aware mode, PD disagg, mesh HA?" → don't break existing
- "Is this Send + Sync safe under concurrent load?" → all routing state thread-safe