diff --git a/skills/implement/reasoning-parser.md b/skills/implement/reasoning-parser.md
index 70cb70a..84557ae 100644
--- a/skills/implement/reasoning-parser.md
+++ b/skills/implement/reasoning-parser.md
@@ -1,88 +1,365 @@
 # Adding a Reasoning Parser to SMG
 
-10+ model families supported. Most delegate to `BaseReasoningParser` — only add custom logic if needed.
+## Collect These 4 Inputs First
 
-## Key Decision: `initial_in_reasoning`
+Before writing any code, get these from the user:
+
+| Input | Example | How To Decide |
+|-------|---------|---------------|
+| `MODEL_NAME` | `mymodel` | Snake case. Used as parser name, model_type, factory key |
+| `PATTERNS` | `["my-model", "my-model-v2", "mymodel"]` | Case-insensitive substrings matched against model IDs. Check the provider's model catalog for all variant names |
+| `TOKENS` | `<think>` / `</think>` | See **How to Find Reasoning Tokens** below |
+| `INITIAL_IN_REASONING` | `true` or `false` | See **How to Determine initial_in_reasoning** below |
+
+## How to Find Reasoning Tokens
+
+There is no single canonical source. Check these locations in order:
+
+### 1. Check existing implementations in vLLM and SGLang
+
+**Fastest path.** These projects track reasoning tokens across all major models. If the model already has a parser there, just copy the tokens.
+
+**vLLM** — https://github.com/vllm-project/vllm/tree/main/vllm/reasoning
+- Registry: `__init__.py` → `_REASONING_PARSERS_TO_REGISTER` dict
+- Each parser file defines `start_token` / `end_token` properties
+- 17 models: deepseek_r1, deepseek_v3, qwen3, kimi_k2, glm45, step3, step3p5, minimax_m2, mistral, granite, ernie45, gptoss, hunyuan_a13b, olmo3, holo2, seed_oss, nano_v3
+
+**SGLang** — https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/parser/reasoning_parser.py
+- Single file with all parsers and a `DetectorMap` dict mapping model names to detector classes
+- Each detector class defines `think_start_token` / `think_end_token`
+
+If the model exists in either project, use their tokens directly — they've already been validated against real model output.
+
+### 2. HuggingFace `tokenizer_config.json` → `added_tokens_decoder`
+
+Most reliable primary source. Download from `https://huggingface.co/{org}/{model}/raw/main/tokenizer_config.json` and search for thinking-related tokens.
+
+**Example — Qwen3** (tokens 151667 and 151668):
+```json
+"151667": { "content": "<think>", "special": false },
+"151668": { "content": "</think>", "special": false }
+```
+
+### 3. HuggingFace `tokenizer_config.json` → `chat_template`
+
+The Jinja2 chat template often shows how reasoning tokens are handled during formatting.
+
+**Example — DeepSeek-R1** (template strips reasoning from assistant messages):
+```jinja2
+{% if '</think>' in content %}
+  {% set content = content.split('</think>')[-1] %}
+{% endif %}
+```
+
+The `add_generation_prompt` section may also reveal the start token — DeepSeek-R1 appends `<think>\n` when generating.
+
+**Example — Qwen3** (template wraps reasoning_content):
+```jinja2
+'<think>\n' + reasoning_content + '\n</think>\n\n' + content
+```
+
+### 4. HuggingFace model card / README
+
+Some providers document the token format in their model card.
+
+**Example — Cohere Command-A Reasoning:**
+```
+<|START_THINKING|>
+[reasoning]
+<|END_THINKING|>
+```
+
+Cohere also supports a `reasoning=True` parameter in `apply_chat_template()`.
+
+### 5. Provider API docs
+
+Check the provider's API documentation for reasoning/thinking output format.
+
+**Example:** aggregators such as Together.ai and OpenRouter often document thinking output as a separate `reasoning` field with its own delimiter format.
+
+### 6. Send a test request and observe
+
+Last resort. Send a reasoning-capable prompt through the provider's API and inspect the raw response tokens.
+
+```bash
+curl -s https://api.provider.com/v1/chat/completions \
+  -H "Authorization: Bearer $KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"model":"model-id","messages":[{"role":"user","content":"Think step by step: what is 17*23?"}]}' \
+  | jq '.choices[0].message'
+```
+
+Look for the delimiters wrapping the chain-of-thought content.
+
+### Known Token Formats
+
+**In SMG today:**
+
+| Model Family | Start Token | End Token | Notes |
+|---|---|---|---|
+| DeepSeek-R1 | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| Qwen3 | `<think>` | `</think>` | `initial_in_reasoning=false` |
+| QwenThinking | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| Kimi | `◁think▷` | `◁/think▷` | Unicode tokens |
+| GLM-4.5 | `<think>` | `</think>` | `initial_in_reasoning=false` |
+| Step3 | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| NanoV3 (Nemotron) | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| MiniMax M2 | `<think>` | `</think>` | Model doesn't emit start token — SMG prepends it |
+| Cohere Command | `<\|START_THINKING\|>` | `<\|END_THINKING\|>` | `initial_in_reasoning=false` |
+
+**In vLLM/SGLang but NOT yet in SMG** (candidates to add):
+
+| Model Family | Start Token | End Token | Notes | Source |
+|---|---|---|---|---|
+| Mistral | `[THINK]` | `[/THINK]` | Uses `mistral_common` special tokens | vLLM |
+| Granite | regex: `Here's my thought process:` | regex: `Here's my response:` | Regex-based, not token-based | vLLM |
+| Ernie 4.5 | `<think>` | `</think>` + `<response>`/`</response>` | Extra response delimiters | vLLM |
+| GptOss | `<\|channel\|>analysis<\|message\|>` | `<\|end\|>` | Multi-token sequence | vLLM, SGLang |
+| Hunyuan A13B | `<think>` | `</think>` | Standard tokens | vLLM |
+| OLMo 3 | `<think>` | `</think>` | Standard tokens | vLLM |
+| Seed OSS | `<think>` | `</think>` | Standard tokens | vLLM |
+| Holo2 | `<think>` | `</think>` | Standard tokens | vLLM |
+
+Most models use `<think>`/`</think>`. If the model you're adding uses those, you can usually just confirm via one of the sources above. For non-standard tokens (Mistral, Granite, GptOss), check the vLLM/SGLang implementations for the exact format.
+
+## How to Determine `initial_in_reasoning`
+
+This flag controls whether the parser assumes the first token is reasoning or normal text.
+
+**Check the chat template's `add_generation_prompt` section:**
+- If it appends the start token (e.g. `<think>\n`) → `initial_in_reasoning: true`
+- If it appends nothing or just a role tag → `initial_in_reasoning: false`
+
+**Or send a test request** and observe:
+- If the very first output token is reasoning content (before any start token) → `true`
+- If the first output is normal text or starts with an explicit start token → `false`
 
 | Value | Behavior | Models |
 |-------|----------|--------|
-| `true` | Everything is reasoning until end token | DeepSeek-R1, Step3, Nemotron |
-| `false` | Requires explicit start token | Qwen3, GLM-4, Kimi, MiniMax, Cohere |
+| `true` | Everything is reasoning until end token appears | DeepSeek-R1, Step3, Nemotron, QwenThinking |
+| `false` | Normal text until explicit start token | Qwen3, GLM-4.5, Kimi, MiniMax, Cohere |
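+
+The chat-template check above can be sketched as a small helper. This is an illustration only, not SMG code: `infer_initial_in_reasoning` is a hypothetical name, and it assumes you have already rendered the chat template with `add_generation_prompt` enabled and hold the resulting prompt string.
+
+```rust
+/// Hypothetical helper: returns true when the rendered generation prompt
+/// already ends with the reasoning start token, meaning the model's first
+/// output token is reasoning content.
+fn infer_initial_in_reasoning(rendered_prompt: &str, start_token: &str) -> bool {
+    // An empty token would match every prompt, so treat it as "not found".
+    if start_token.is_empty() {
+        return false;
+    }
+    // Templates usually follow the token with whitespace such as "\n".
+    rendered_prompt.trim_end().ends_with(start_token)
+}
+```
+
+A prompt ending in `<think>\n` yields `true`, while a prompt ending in a bare role tag yields `false`. Treat the result as a hint and confirm it against real model output.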
 
-## Steps
+## Step 1: Create parser file
 
-### Step 1: Create parser file
+**File:** `reasoning_parser/src/parsers/{MODEL_NAME}.rs`
 
-**File:** `reasoning_parser/src/parsers/mymodel.rs`
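+Generate this file, substituting the 4 inputs. `{ModelName}` is `MODEL_NAME` in PascalCase, and `{START_TOKEN}` / `{END_TOKEN}` are the two values from `TOKENS`: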
 
 ```rust
-pub struct MyModelParser {
+use crate::{
+    parsers::BaseReasoningParser,
+    traits::{ParseError, ParserConfig, ParserResult, ReasoningParser},
+};
+
+pub struct {ModelName}Parser {
     base: BaseReasoningParser,
 }
 
-impl MyModelParser {
+impl {ModelName}Parser {
     pub fn new() -> Self {
+        let config = ParserConfig {
+            think_start_token: "{START_TOKEN}".to_string(),
+            think_end_token: "{END_TOKEN}".to_string(),
+            initial_in_reasoning: {INITIAL_IN_REASONING},
+            stream_reasoning: true,
+            max_buffer_size: 65536,
+        };
         Self {
-            base: BaseReasoningParser::new(ParserConfig {
-                think_start_token: "<think>".to_string(),
-                think_end_token: "</think>".to_string(),
-                initial_in_reasoning: false,  // Choose based on model behavior
-                stream_reasoning: true,
-                max_buffer_size: 65536,
-            }).with_model_type("mymodel".to_string()),
+            base: BaseReasoningParser::new(config)
+                .with_model_type("{MODEL_NAME}".to_string()),
         }
     }
 }
 
-impl ReasoningParser for MyModelParser {
+impl Default for {ModelName}Parser {
+    fn default() -> Self {
+        Self::new()
+    }
+}
+
+impl ReasoningParser for {ModelName}Parser {
     fn detect_and_parse_reasoning(&mut self, text: &str) -> Result<ParserResult, ParseError> {
         self.base.detect_and_parse_reasoning(text)
     }
-    fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError> {
+
+    fn parse_reasoning_streaming_incremental(
+        &mut self,
+        text: &str,
+    ) -> Result<ParserResult, ParseError> {
         self.base.parse_reasoning_streaming_incremental(text)
     }
-    fn reset(&mut self) { self.base.reset() }
-    fn model_type(&self) -> &str { self.base.model_type() }
-    fn is_in_reasoning(&self) -> bool { self.base.is_in_reasoning() }
+
+    fn reset(&mut self) {
+        self.base.reset();
+    }
+
+    fn model_type(&self) -> &str {
+        self.base.model_type()
+    }
+
+    fn is_in_reasoning(&self) -> bool {
+        self.base.is_in_reasoning()
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_model_type() {
+        let parser = {ModelName}Parser::new();
+        assert_eq!(parser.model_type(), "{MODEL_NAME}");
+    }
+
+    #[test]
+    fn test_reasoning_extraction() {
+        let mut parser = {ModelName}Parser::new();
+        let result = parser
+            .detect_and_parse_reasoning("{START_TOKEN}thinking here{END_TOKEN}normal text")
+            .unwrap();
+        assert_eq!(result.reasoning_text, "thinking here");
+        assert_eq!(result.normal_text, "normal text");
+    }
+
+    #[test]
+    fn test_streaming_partial_token() {
+        let mut parser = {ModelName}Parser::new();
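+        // Send a partial end token; no characters may be lost. The literal "</" assumes a
+        // `</think>`-style END_TOKEN, so substitute a prefix of your own END_TOKEN if it differs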
+        let r1 = parser
+            .parse_reasoning_streaming_incremental("{START_TOKEN}partial</")
+            .unwrap();
+        let r2 = parser
+            .parse_reasoning_streaming_incremental("not-a-token")
+            .unwrap();
+        // The "</not-a-token" should appear in reasoning (not lost)
+        let combined = format!("{}{}", r1.reasoning_text, r2.reasoning_text);
+        assert!(combined.contains("partial"));
+        assert!(combined.contains("</not-a-token"));
+    }
+
+    #[test]
+    fn test_streaming_multiple_chunks() {
+        let mut parser = {ModelName}Parser::new();
+        let _r1 = parser
+            .parse_reasoning_streaming_incremental("{START_TOKEN}chunk1")
+            .unwrap();
+        let _r2 = parser
+            .parse_reasoning_streaming_incremental("chunk2{END_TOKEN}")
+            .unwrap();
+        let r3 = parser
+            .parse_reasoning_streaming_incremental("normal output")
+            .unwrap();
+        assert_eq!(r3.normal_text, "normal output");
+    }
+
+    #[test]
+    fn test_reset() {
+        let mut parser = {ModelName}Parser::new();
+        parser
+            .parse_reasoning_streaming_incremental("{START_TOKEN}thinking")
+            .unwrap();
+        assert!(parser.is_in_reasoning());
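+        parser.reset();
+        // reset() is expected to restore the initial state, which is "not in reasoning"
+        // only when initial_in_reasoning=false (assumed BaseReasoningParser behavior)
+        assert_eq!(parser.is_in_reasoning(), {INITIAL_IN_REASONING});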
+    }
+
+    #[test]
+    fn test_initial_state() {
+        let parser = {ModelName}Parser::new();
+        // If initial_in_reasoning=true: parser starts in reasoning mode
+        // If initial_in_reasoning=false: parser starts in normal mode
+        assert_eq!(parser.is_in_reasoning(), {INITIAL_IN_REASONING});
+    }
+
+    #[test]
+    fn test_empty_reasoning_block() {
+        let mut parser = {ModelName}Parser::new();
+        let result = parser
+            .detect_and_parse_reasoning("{START_TOKEN}{END_TOKEN}normal")
+            .unwrap();
+        assert_eq!(result.normal_text, "normal");
+    }
 }
 ```
 
-Only add custom logic (override base methods) for non-standard token formats like Kimi (Unicode `◁think▷`) or MiniMax (auto-prepends `<think>`).
+**Verify:** `cargo check -p reasoning_parser`
+
+## Step 2: Register in module exports
+
+**File:** `reasoning_parser/src/parsers/mod.rs` — add:
+```rust
+pub mod {MODEL_NAME};
+pub use {MODEL_NAME}::{ModelName}Parser;
+```
+
+**File:** `reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
+```rust
+{ModelName}Parser,
+```
+
+**Verify:** `cargo check -p reasoning_parser`
 
-**Verify:** `cargo build`
+## Step 3: Register in factory
 
-### Step 2: Register in factory
+**File:** `reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add:
 
 ```rust
-registry.register_parser("mymodel", || Box::new(MyModelParser::new()));
-registry.register_pattern("my-model", "mymodel");  // Case-insensitive substring
+// Parser registration
+registry.register_parser("{MODEL_NAME}", || Box::new({ModelName}Parser::new()));
+
+// Pattern mappings — one per model ID variant
+registry.register_pattern("{pattern-1}", "{MODEL_NAME}");
+registry.register_pattern("{pattern-2}", "{MODEL_NAME}");
 ```
 
-**Verify:** `cargo build`
+Pattern matching is **case-insensitive substring**: `model_id.to_lowercase().contains(pattern)`. Only the model ID is lowercased in that expression, so register patterns in lowercase.
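+
+The matching rule can be sketched in isolation as follows. This is a simplified stand-in rather than the registry's real code: `pattern_matches` and `resolve_parser` are hypothetical names, and lowercasing the pattern as well as the model ID is a defensive choice made for this sketch.
+
+```rust
+/// Case-insensitive substring match between a model ID and a pattern.
+fn pattern_matches(model_id: &str, pattern: &str) -> bool {
+    // Lowercase both sides so a mixed-case pattern cannot silently fail.
+    model_id.to_lowercase().contains(&pattern.to_lowercase())
+}
+
+/// Returns the parser name of the first (pattern, parser) pair that matches.
+fn resolve_parser<'a>(model_id: &str, patterns: &[(&str, &'a str)]) -> Option<&'a str> {
+    patterns
+        .iter()
+        .find(|(pattern, _)| pattern_matches(model_id, pattern))
+        .map(|(_, name)| *name)
+}
+```
+
+In this sketch the first matching pattern wins, so a short pattern such as `my-model` also covers `my-model-v2`.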
+
+**Verify:** `cargo check -p reasoning_parser`
+
+## Step 4: Run tests
+
+```bash
+cargo test -p reasoning_parser
+```
 
-### Step 3: Export
+All 7 tests in the new file plus all existing tests must pass.
 
-**Files:** `reasoning_parser/src/parsers/mod.rs` and `reasoning_parser/src/lib.rs`
+## Step 5: Run full quality gate
 
-### Step 4: Write tests
+Invoke `smg:contribute` to run fmt → clippy → test → bindings → commit format.
 
-Required test cases:
-- Streaming with partial tokens (e.g. `</` that doesn't complete `</think>`)
-- Buffer overflow (>64KB reasoning block)
-- Normal text / reasoning text separation
-- Reset between requests (NOT between chunks)
-- `initial_in_reasoning` behavior matches expectations
+## When You Need Custom Logic
 
-**Verify:** `cargo test`
+Most parsers are pure `BaseReasoningParser` wrappers. Use this table to decide whether you need to override any methods:
 
-## Critical: Partial Token Bug
+| Scenario | Example | What To Do |
+|----------|---------|------------|
+| Model always starts with reasoning but doesn't emit start token | MiniMax M2 | Prepend start token on first chunk (track `is_first_chunk` state) |
+| Non-standard delimiters | Kimi (`◁think▷`), Cohere (`<\|START_THINKING\|>`) | Just change token strings in `ParserConfig` |
+| Model needs preprocessing | Hypothetical | Override trait methods, call `self.base` after preprocessing |
+
+For MiniMax-style prepending, add an `is_first_chunk: bool` field and override `parse_reasoning_streaming_incremental`:
+
+```rust
+fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError> {
+    let modified = if self.is_first_chunk {
+        self.is_first_chunk = false;
+        format!("{START_TOKEN}{text}")
+    } else {
+        text.to_string()
+    };
+    self.base.parse_reasoning_streaming_incremental(&modified)
+}
+```
 
-When `</` appears but doesn't complete `</think>`, the system must not lose the `</` prefix. The `BaseReasoningParser` handles this correctly — verify in tests by sending `</` followed by non-`think>` text.
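+Don't forget to reset `is_first_chunk = true` in `reset()`. If the model also omits the start token in non-streaming responses, `detect_and_parse_reasoning` likely needs the same prepend.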
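+
+The first-chunk state handling can be tried without the rest of the crate. The sketch below is self-contained and hypothetical: `FirstChunkPrepender` is not an SMG type, and in a real parser the returned string would be passed to `self.base`.
+
+```rust
+/// Hypothetical helper that tracks whether the start token still needs
+/// to be prepended to the stream.
+struct FirstChunkPrepender {
+    start_token: String,
+    is_first_chunk: bool,
+}
+
+impl FirstChunkPrepender {
+    fn new(start_token: &str) -> Self {
+        Self {
+            start_token: start_token.to_string(),
+            is_first_chunk: true,
+        }
+    }
+
+    /// Returns the text to hand to the base parser for this chunk.
+    fn prepare(&mut self, text: &str) -> String {
+        if self.is_first_chunk {
+            self.is_first_chunk = false;
+            format!("{}{}", self.start_token, text)
+        } else {
+            text.to_string()
+        }
+    }
+
+    /// Call between requests so the next stream is prefixed again.
+    fn reset(&mut self) {
+        self.is_first_chunk = true;
+    }
+}
+```
+
+Only the first chunk of each request gains the prefix. Calling `reset()` between chunks would prefix every chunk and corrupt the stream.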
 
 ## Common Mistakes
 
-| Mistake | Consequence |
-|---------|-------------|
-| Wrong `initial_in_reasoning` | All output classified as wrong type |
-| Resetting between chunks instead of requests | Loses buffer state mid-stream |
-| Not delegating to `BaseReasoningParser` | Reimplements solved edge cases |
+| Mistake | Consequence | Fix |
+|---------|-------------|-----|
+| Wrong `initial_in_reasoning` | All output classified as wrong type | Test with real model output to observe whether first token is reasoning |
+| Missing pattern variant | Model ID doesn't match, falls back to base | Check provider's model catalog for ALL variant names (aliases, versioned IDs) |
+| Resetting between chunks instead of requests | Loses buffer state mid-stream | Only call `reset()` between separate API requests |
+| Not delegating to `BaseReasoningParser` | Reimplements solved edge cases (partial tokens, buffer overflow) | Always compose `BaseReasoningParser` unless you have a very good reason |
+| Forgetting `parsers/mod.rs` export | The new file is never compiled, and `factory.rs` fails to resolve `{ModelName}Parser` | Always update both `mod.rs` and `lib.rs` |
diff --git a/skills/implement/tool-parser.md b/skills/implement/tool-parser.md
index becec9a..c3ecc66 100644
--- a/skills/implement/tool-parser.md
+++ b/skills/implement/tool-parser.md
@@ -1,84 +1,423 @@
 # Adding a Tool Parser to SMG
 
-13+ formats supported. Most are JSON-based and can reuse `handle_json_tool_streaming()`.
+## Collect These 4 Inputs First
 
-## Steps
+Before writing any code, get these from the user:
 
-### Step 1: Create parser file
+| Input | Example | How To Decide |
+|-------|---------|---------------|
+| `PARSER_NAME` | `mymodel` | Snake case. Used as parser name, factory key, test file name |
+| `MODEL_PATTERNS` | `["my-model*", "MyModel*"]` | Wildcard patterns matched against model IDs. Check the provider's model catalog for all variant names |
+| `FORMAT` | See **Tool Call Format Types** below | The format the model uses to emit tool calls |
+| `DELIMITERS` | `<tool_call>` / `</tool_call>` | Start/end tokens wrapping tool calls. See **How to Find Tool Call Format** below |
 
-**File:** `tool_parser/src/parsers/myformat.rs`
+## How to Find Tool Call Format
+
+### 1. Check existing implementations in vLLM and SGLang
+
+**Fastest path.** These projects track tool call formats across all major models.
+
+**vLLM** — https://github.com/vllm-project/vllm/tree/main/vllm/tool_parsers
+- Registry: `__init__.py` → `_TOOL_PARSERS_TO_REGISTER` dict
+- 34 parser files covering all major model families
+- Each parser defines start/end tokens and format structure
+
+**SGLang** — https://github.com/sgl-project/sglang/tree/main/python/sglang/srt/function_call
+- 25 detector files, one per model family
+- `function_call_parser.py` is the router
+
+If the model exists in either project, match their format.
+
+### 2. HuggingFace `tokenizer_config.json` → `chat_template`
+
+The Jinja2 chat template shows how tools are formatted in the prompt AND how the model is expected to respond. Look for:
+- Tool-related special tokens in `added_tokens_decoder` (e.g., `<tool_call>`, `<|tool_calls_section_begin|>`)
+- Template sections handling `message.tool_calls` — shows the output format
+- `tool_use` or `function_call` handling in the template
+
+### 3. HuggingFace model card / README
+
+Model cards often document the expected tool call output format with examples.
+
+### 4. Send a test request with tools
+
+```bash
+curl -s https://api.provider.com/v1/chat/completions \
+  -H "Authorization: Bearer $KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "model-id",
+    "tools": [{"type":"function","function":{"name":"get_weather","parameters":{"type":"object","properties":{"city":{"type":"string"}}}}}],
+    "messages": [{"role":"user","content":"What is the weather in Paris?"}]
+  }' | jq '.choices[0].message'
+```
+
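+Observe the raw format before the API normalizes it. If the response already contains a structured `tool_calls` array, the provider has parsed the output for you; use a raw-text endpoint or run the model locally to see the original delimiters.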
+
+## Tool Call Format Types
+
+| Type | Description | Parsers Using It | Reuse Strategy |
+|------|-------------|------------------|----------------|
+| **JSON with tags** | JSON wrapped in model-specific tags | Mistral `[TOOL_CALLS]`, Cohere `<\|START_ACTION\|>`, Step3 `<stepml:function_call>`, KimiK2 `<\|tool_call_begin\|>`, MiniMax `<FUNCTION_CALL>` | Extract between tags, delegate to `helpers::handle_json_tool_streaming()` |
+| **XML with parameters** | XML tags with key-value parameter children | QwenCoder `<tool_call><function=..><parameter=..>`, GLM4 `<tool_call>..<arg_key>..<arg_value>` | Custom XML parsing |
+| **Raw JSON** | Pure JSON object or array | JsonParser (OpenAI, Claude, Gemini) | Use `JsonParser` directly or register as `json` |
+| **Pythonic** | Python function call syntax `[func(arg=val)]` | PythonicParser (Llama 4, DeepSeek R1) | Use `PythonicParser` directly |
+| **Unicode tokens** | Full-width Unicode delimiters | DeepSeek V3 `<｜tool▁calls▁begin｜>` | Custom parser with Unicode handling |
+
+**Most new models use "JSON with tags"** — wrap standard JSON in model-specific start/end tokens. This is the easiest to implement because `helpers::handle_json_tool_streaming()` handles most of the streaming logic.
+
+### Known Formats
+
+**In SMG today:**
+
+| Model Family | Start Token | End Token | Format | Parser |
+|---|---|---|---|---|
+| OpenAI, Claude, Gemini | `{` or `[` | `}` or `]` | Raw JSON | `json` |
+| Mistral, Mixtral | `[TOOL_CALLS] [` | `]` | JSON array with prefix | `mistral` |
+| Qwen 2/2.5/3 | `<tool_call>\n` | `\n</tool_call>` | JSON in XML tags | `qwen` |
+| Qwen 3/2.5 Coder | `<tool_call>` | `</tool_call>` | XML with `<function=..><parameter=..>` | `qwen_coder` |
+| Llama 3.2 | `<\|python_tag\|>` | JSON boundary | JSON with prefix tag | `llama` |
+| Llama 4, DeepSeek R1 | `[` | `]` | Pythonic `func(arg=val)` | `pythonic` |
+| DeepSeek V3 | `<｜tool▁calls▁begin｜>` | `<｜tool▁calls▁end｜>` | Unicode tokens + JSON code block | `deepseek` |
+| GLM-4.5/4.6 | `<tool_call>` | `</tool_call>` | XML with `<arg_key>`/`<arg_value>` | `glm45_moe` |
+| GLM-4.7 | `<tool_call>` | `</tool_call>` | XML (whitespace variant) | `glm47_moe` |
+| Step-3 | `<stepml:function_call>` | `</stepml:function_call>` | JSON in XML tags | `step3` |
+| Kimi K2 | `<\|tool_call_begin\|>` | `<\|tool_call_end\|>` | JSON in pipe-delimited tags | `kimik2` |
+| MiniMax M2 | `<FUNCTION_CALL>` | `</FUNCTION_CALL>` | JSON in all-caps XML tags | `minimax_m2` |
+| Cohere Command | `<\|START_ACTION\|>` | `<\|END_ACTION\|>` | JSON with `tool_name`/`parameters` fields | `cohere` |
+
+**In vLLM/SGLang but NOT yet in SMG** (candidates to add):
+
+| Model Family | Format | Source |
+|---|---|---|
+| Hermes 2 Pro | `<tool_call>` JSON `</tool_call>` (with optional `<scratch_pad>`) | vLLM, SGLang |
+| Jamba | `<tool_calls>` JSON array `</tool_calls>` | vLLM |
+| xLAM | Multiple: JSON code block, `[TOOL_CALLS]`, or XML | vLLM |
+| FunctionGemma | `<start_function_call>call:name{args}<end_function_call>` | vLLM |
+| Granite | `<\|tool_call\|>` or `<tool_call>` + JSON array | vLLM, SGLang |
+| Phi-4 Mini | JSON format | vLLM |
+| Seed OSS | `<seed:tool_call>` with `<function=..><parameter=..>` | vLLM |
+| InternLM | Custom format | vLLM, SGLang |
+| Hunyuan A13B | Custom format | vLLM |
+| OLMo 3 | Pythonic format | vLLM |
+| GigaChat 3 | Custom format | vLLM, SGLang |
+| DeepSeek V3.1/V3.2 | Variant of DeepSeek V3 Unicode tokens | vLLM, SGLang |
+
+## Step 1: Create parser file
+
+**File:** `tool_parser/src/parsers/{PARSER_NAME}.rs`
+
+For the most common case — **JSON with tags** — generate this template:
 
 ```rust
-pub struct MyFormatParser {
-    buffer: String,
+use async_trait::async_trait;
+use openai_protocol::common::Tool;
+use serde_json::Value;
+
+use crate::{
+    errors::ParserResult,
+    parsers::helpers,
+    partial_json::PartialJson,
+    traits::ToolParser,
+    types::{FunctionCall, StreamingParseResult, ToolCall, ToolCallItem},
+};
+
+const START_TOKEN: &str = "{START_TOKEN}";
+const END_TOKEN: &str = "{END_TOKEN}";
+
+pub struct {ParserName}Parser {
     partial_json: PartialJson,
+    buffer: String,
+    prev_tool_call_arr: Vec<Value>,
     current_tool_id: i32,
+    current_tool_name_sent: bool,
     streamed_args_for_tool: Vec<String>,
 }
 
+impl {ParserName}Parser {
+    pub fn new() -> Self {
+        Self {
+            partial_json: PartialJson::default(),
+            buffer: String::new(),
+            prev_tool_call_arr: Vec::new(),
+            current_tool_id: -1,
+            current_tool_name_sent: false,
+            streamed_args_for_tool: Vec::new(),
+        }
+    }
+}
+
+impl Default for {ParserName}Parser {
+    fn default() -> Self {
+        Self::new()
+    }
+}
+
 #[async_trait]
-impl ToolParser for MyFormatParser {
-    async fn parse_complete(&self, output: &str) -> ParserResult<(String, Vec<ToolCall>)> { ... }
-    async fn parse_incremental(&mut self, chunk: &str, tools: &[Tool]) -> ParserResult<StreamingParseResult> { ... }
-    fn has_tool_markers(&self, text: &str) -> bool { ... }
-    fn reset(&mut self) { ... }  // Clear buffer, tool_id, streamed_args
+impl ToolParser for {ParserName}Parser {
+    async fn parse_complete(&self, output: &str) -> ParserResult<(String, Vec<ToolCall>)> {
+        // No start token: the whole output is normal text
+        let Some(first) = output.find(START_TOKEN) else {
+            return Ok((output.to_string(), vec![]));
+        };
+        let normal_text = output[..first].to_string();
+
+        // Walk every START_TOKEN..END_TOKEN pair so that multiple calls are extracted.
+        // Text between two tool-call blocks is dropped in this template.
+        let mut calls = Vec::new();
+        let mut rest = &output[first..];
+        while let Some(start) = rest.find(START_TOKEN) {
+            let after_start = &rest[start + START_TOKEN.len()..];
+            // A missing END_TOKEN means the last block runs to the end of the output
+            let (json_str, tail) = match after_start.find(END_TOKEN) {
+                Some(end) => (&after_start[..end], &after_start[end + END_TOKEN.len()..]),
+                None => (after_start, ""),
+            };
+            let value: Value = serde_json::from_str(json_str.trim())?;
+
+            // Adapt based on your model's JSON structure:
+            // Most use {"name": "func", "arguments": {...}}
+            // Some use {"tool_name": "func", "parameters": {...}}
+            calls.extend(helpers::extract_tool_calls_from_value(&value)?);
+            rest = tail;
+        }
+        Ok((normal_text, calls))
+    }
+
+    async fn parse_incremental(
+        &mut self,
+        chunk: &str,
+        tools: &[Tool],
+    ) -> ParserResult<StreamingParseResult> {
+        self.buffer.push_str(chunk);
+
+        // Look for start token
+        let Some(start_idx) = self.buffer.find(START_TOKEN) else {
+            // No tool call yet — emit as normal text
+            let normal = self.buffer.clone();
+            self.buffer.clear();
+            return Ok(StreamingParseResult {
+                normal_text: normal,
+                calls: vec![],
+            });
+        };
+
+        // Emit any text before the tool call as normal text
+        let normal_text = self.buffer[..start_idx].to_string();
+        let json_start = start_idx + START_TOKEN.len();
+
+        // Check for end token. Copy the JSON slice into an owned String: the helper
+        // below takes `&mut self.buffer`, which cannot coexist with a borrowed slice.
+        let json_text = match self.buffer[json_start..].find(END_TOKEN) {
+            Some(end_idx) => self.buffer[json_start..json_start + end_idx].to_string(),
+            None => self.buffer[json_start..].to_string(),
+        };
+
+        // Build tool index map from available tools
+        let tool_indices: std::collections::HashMap<String, usize> = tools
+            .iter()
+            .enumerate()
+            .filter_map(|(i, t)| t.function.as_ref().map(|f| (f.name.clone(), i)))
+            .collect();
+
+        // Delegate to shared JSON streaming helper
+        let mut result = helpers::handle_json_tool_streaming(
+            &json_text,
+            0,
+            &mut self.partial_json,
+            &tool_indices,
+            &mut self.buffer,
+            &mut self.current_tool_id,
+            &mut self.current_tool_name_sent,
+            &mut self.streamed_args_for_tool,
+            &mut self.prev_tool_call_arr,
+        )?;
+
+        result.normal_text = normal_text;
+        Ok(result)
+    }
+
+    fn has_tool_markers(&self, text: &str) -> bool {
+        text.contains(START_TOKEN)
+    }
+
+    fn get_unstreamed_tool_args(&self) -> Option<Vec<ToolCallItem>> {
+        helpers::get_unstreamed_args(&self.prev_tool_call_arr, &self.streamed_args_for_tool)
+    }
+
+    fn reset(&mut self) {
+        helpers::reset_parser_state(
+            &mut self.buffer,
+            &mut self.prev_tool_call_arr,
+            &mut self.current_tool_id,
+            &mut self.current_tool_name_sent,
+            &mut self.streamed_args_for_tool,
+        );
+    }
 }
 ```
 
-**Verify:** `cargo build`
+**Verify:** `cargo check -p tool_parser`
 
-### Step 2: Implement streaming (two-stage pattern)
+## Step 2: Register in module exports
 
-Stage 1 — Name detected:
+**File:** `tool_parser/src/parsers/mod.rs` — add:
 ```rust
-StreamingParseResult { items: vec![ToolCallItem { name: Some("func"), parameters: "" }] }
+pub mod {PARSER_NAME};
+pub use {PARSER_NAME}::{ParserName}Parser;
 ```
 
-Stage 2 — Arguments delta:
+**File:** `tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
 ```rust
-// Calculate delta = current_args[previous_length..]
-StreamingParseResult { items: vec![ToolCallItem { name: None, parameters: delta }] }
+{ParserName}Parser,
 ```
 
-For JSON-based formats, use `helpers::handle_json_tool_streaming()` — handles 80% of the logic.
+**Verify:** `cargo check -p tool_parser`
 
-**Anti-pattern:** Not tracking `streamed_args_for_tool` — re-sends entire argument string each chunk.
+## Step 3: Register in factory
 
-### Step 3: Register in factory
+**File:** `tool_parser/src/factory.rs`
 
-**File:** `tool_parser/src/` (factory/registry)
+In `ParserFactory::new()`:
+```rust
+registry.register_parser("{PARSER_NAME}", || Box::new({ParserName}Parser::new()));
+```
 
+In `ParserFactory::register_default_mappings()`:
 ```rust
-registry.register_parser("myformat", || Box::new(MyFormatParser::new()));
-registry.map_model("my-model*", "myformat");  // Wildcard pattern matching
+// Use each MODEL_PATTERNS entry as-is; the `*` wildcard is already part of the pattern
+registry.map_model("{model-pattern-1}", "{PARSER_NAME}");
+registry.map_model("{model-pattern-2}", "{PARSER_NAME}");
 ```
 
-**Verify:** `cargo build`
+Pattern matching uses **glob wildcards** (`*` matches any characters).
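+
+To reason about which model IDs a pattern will catch, the `*` semantics can be approximated with the sketch below. It is not the registry's implementation: `wildcard_match` is a hypothetical name, it supports only `*`, and it is case-sensitive. The real matcher may differ on both points.
+
+```rust
+/// Minimal `*`-only glob match: `*` matches any run of characters,
+/// including an empty one.
+fn wildcard_match(pattern: &str, text: &str) -> bool {
+    let parts: Vec<&str> = pattern.split('*').collect();
+    // No wildcard at all: require an exact match.
+    if parts.len() == 1 {
+        return pattern == text;
+    }
+    let first = parts[0];
+    let last = parts[parts.len() - 1];
+    // The literal prefix and suffix must both fit without overlapping.
+    if text.len() < first.len() + last.len()
+        || !text.starts_with(first)
+        || !text.ends_with(last)
+    {
+        return false;
+    }
+    // The middle literals must appear in order between prefix and suffix.
+    let end_limit = text.len() - last.len();
+    let mut pos = first.len();
+    for part in &parts[1..parts.len() - 1] {
+        match text[pos..end_limit].find(part) {
+            Some(i) => pos += i + part.len(),
+            None => return false,
+        }
+    }
+    true
+}
+```
+
+Under these assumptions `my-model*` matches `my-model-v2` but not `org/my-model-v2`, so check whether your model IDs carry an organization prefix.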
 
-### Step 4: Export
+**Verify:** `cargo check -p tool_parser`
 
-**Files:** `tool_parser/src/parsers/mod.rs` and `tool_parser/src/lib.rs`
+## Step 4: Write tests
 
-Add `pub mod myformat;` and re-export.
+**File:** `tool_parser/tests/tool_parser_{PARSER_NAME}.rs`
 
-### Step 5: Write tests
+```rust
+mod common;
+
+use common::create_test_tools;
+use tool_parser::{{ParserName}Parser, ToolParser};
 
-**File:** `tool_parser/tests/tool_parser_myformat.rs`
+#[tokio::test]
+async fn test_parse_single_tool_call() {
+    let parser = {ParserName}Parser::new();
+    let input = r#"Some text{START_TOKEN}{"name":"get_weather","arguments":{"city":"Paris"}}{END_TOKEN}"#;
+    let (normal, calls) = parser.parse_complete(input).await.unwrap();
+    assert_eq!(normal, "Some text");
+    assert_eq!(calls.len(), 1);
+    assert_eq!(calls[0].function.name, "get_weather");
+}
+
+#[tokio::test]
+async fn test_parse_multiple_tool_calls() {
+    let parser = {ParserName}Parser::new();
+    // Test with multiple sequential tool calls
+    let input = r#"{START_TOKEN}{"name":"func1","arguments":{}}{END_TOKEN}{START_TOKEN}{"name":"func2","arguments":{}}{END_TOKEN}"#;
+    let (_, calls) = parser.parse_complete(input).await.unwrap();
+    assert_eq!(calls.len(), 2);
+}
+
+#[tokio::test]
+async fn test_no_tool_calls() {
+    let parser = {ParserName}Parser::new();
+    let input = "Just normal text, no tool calls here.";
+    let (normal, calls) = parser.parse_complete(input).await.unwrap();
+    assert_eq!(normal, input);
+    assert!(calls.is_empty());
+}
+
+#[tokio::test]
+async fn test_has_tool_markers() {
+    let parser = {ParserName}Parser::new();
+    assert!(parser.has_tool_markers("text {START_TOKEN} more"));
+    assert!(!parser.has_tool_markers("plain text"));
+}
+
+#[tokio::test]
+async fn test_streaming_chunks() {
+    let mut parser = {ParserName}Parser::new();
+    let tools = create_test_tools();
+
+    // Split a tool call across multiple chunks
+    let chunks = vec![
+        "Normal text",
+        "{START_TOKEN}{\"na",
+        "me\":\"get_weather\",\"ar",
+        "guments\":{\"city\":\"",
+        "Paris\"}}{END_TOKEN}",
+    ];
+
+    let mut all_normal = String::new();
+    let mut all_calls = Vec::new();
+    for chunk in chunks {
+        let result = parser.parse_incremental(chunk, &tools).await.unwrap();
+        all_normal.push_str(&result.normal_text);
+        all_calls.extend(result.calls);
+    }
+    assert_eq!(all_normal, "Normal text");
+    assert!(!all_calls.is_empty());
+}
+
+#[tokio::test]
+async fn test_empty_arguments() {
+    let parser = {ParserName}Parser::new();
+    let input = r#"{START_TOKEN}{"name":"no_args","arguments":{}}{END_TOKEN}"#;
+    let (_, calls) = parser.parse_complete(input).await.unwrap();
+    assert_eq!(calls[0].function.arguments, "{}");
+}
+
+#[tokio::test]
+async fn test_reset() {
+    let mut parser = {ParserName}Parser::new();
+    let tools = create_test_tools();
+    parser.parse_incremental("{START_TOKEN}{\"name", &tools).await.unwrap();
+    parser.reset();
+    // After reset, parser should handle new input cleanly
+    let result = parser.parse_incremental("fresh text", &tools).await.unwrap();
+    assert_eq!(result.normal_text, "fresh text");
+}
+```
+
+**Verify:** `cargo test -p tool_parser --test tool_parser_{PARSER_NAME}`
+
+## Step 5: Run full quality gate
+
+Invoke `smg:contribute` to run fmt → clippy → test → bindings → commit format.
+
+## Adapting for Non-Standard JSON Fields
+
+Some models use non-standard field names. Map them in `parse_complete`:
+
+| Model | Name Field | Arguments Field |
+|-------|-----------|-----------------|
+| Most models | `name` | `arguments` |
+| Cohere | `tool_name` | `parameters` |
+| Llama 3.2 | `name` | `parameters` |
+
+Use `helpers::normalize_tool_call_fields()` if available, or map manually:
+```rust
+let name = obj.get("name")
+    .or_else(|| obj.get("tool_name"))
+    .and_then(|v| v.as_str());
+let args = obj.get("arguments")
+    .or_else(|| obj.get("parameters"));
+```
 
-Required test cases:
-- Single tool call
-- Multiple tool calls
-- Empty arguments
-- Streaming with chunks split at different boundaries
-- Invalid tool name (should skip, not error)
-- Mixed normal text and tool calls
-- Unicode and special characters
+## When NOT to Use the JSON-with-Tags Template
 
-**Verify:** `cargo test --test tool_parser_myformat`
+| Scenario | Example | What To Do Instead |
+|----------|---------|-------------------|
+| Model uses XML with key-value parameters | QwenCoder `<parameter=key>value</parameter>` | Write custom XML extraction (see `qwen_coder.rs`) |
+| Model uses Python function syntax | Llama 4 `[func(arg=val)]` | Use `PythonicParser` or register as `pythonic` |
+| Model uses raw JSON (no tags) | OpenAI, Claude | Register as `json` — no new parser needed |
+| Model uses Unicode delimiter tokens | DeepSeek V3 full-width chars | Write custom parser with Unicode-aware matching (see `deepseek.rs`) |
 
-## Key Rules
+## Common Mistakes
 
-- Validate tool names against the `tools` list — skip invalid calls
-- Reset state between **requests**, NOT between **chunks**
-- Use `partial_json.parse_value()` for incomplete JSON
-- Parsers are pooled with `Arc<Mutex<>>` — must be `Send + Sync`
+| Mistake | Consequence | Fix |
+|---------|-------------|-----|
+| Not handling multiple tool calls | Only first call extracted | Loop over all start/end token pairs or parse JSON array |
+| Resetting between chunks instead of requests | Loses buffer state mid-stream | Only call `reset()` between separate API requests |
+| Not validating tool names against `tools` list | Invalid tool calls forwarded to client | Skip calls where name doesn't match any provided tool |
+| Re-sending full arguments each chunk | Client receives duplicate argument data | Track `streamed_args_for_tool` and send only the delta |
+| Missing `get_unstreamed_tool_args()` | Final arguments lost on fast completions | Implement using `helpers::get_unstreamed_args()` |
+| Forgetting `Send + Sync` bounds | Parser can't be pooled with `Arc<Mutex<>>` | Avoid `Rc`, `RefCell`, or non-`Send` types in struct fields |
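+
+The delta rule from the table can be illustrated with a small, self-contained function. `args_delta` is a hypothetical name, and the empty-string fallback is an assumption for this sketch. In SMG the shared streaming helper is expected to do this bookkeeping for you.
+
+```rust
+/// Returns the part of `current_args` that has not been streamed yet.
+fn args_delta<'a>(current_args: &'a str, already_streamed: &str) -> &'a str {
+    // If the new text does not extend what was already sent, emit nothing
+    // rather than re-sending the whole argument string.
+    current_args.strip_prefix(already_streamed).unwrap_or("")
+}
+```
+
+After each emitted delta, append it to the per-tool entry in `streamed_args_for_tool` so that the next chunk is compared against everything sent so far.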