diff --git a/skills/implement/reasoning-parser.md b/skills/implement/reasoning-parser.md
index 70cb70a..84557ae 100644
--- a/skills/implement/reasoning-parser.md
+++ b/skills/implement/reasoning-parser.md
@@ -1,88 +1,365 @@
# Adding a Reasoning Parser to SMG
-10+ model families supported. Most delegate to `BaseReasoningParser` — only add custom logic if needed.
+## Collect These 4 Inputs First
-## Key Decision: `initial_in_reasoning`
+Before writing any code, get these from the user:
+
+| Input | Example | How To Decide |
+|-------|---------|---------------|
+| `MODEL_NAME` | `mymodel` | Snake case. Used as parser name, model_type, factory key |
+| `PATTERNS` | `["my-model", "my-model-v2", "mymodel"]` | Case-insensitive substrings matched against model IDs. Check the provider's model catalog for all variant names |
+| `TOKENS` | `<think>` / `</think>` | See **How to Find Reasoning Tokens** below |
+| `INITIAL_IN_REASONING` | `true` or `false` | See **How to Determine initial_in_reasoning** below |
+
+## How to Find Reasoning Tokens
+
+There is no single canonical source. Check these locations in order:
+
+### 1. Check existing implementations in vLLM and SGLang
+
+**Fastest path.** These projects track reasoning tokens across all major models. If the model already has a parser there, just copy the tokens.
+
+**vLLM** — https://github.com/vllm-project/vllm/tree/main/vllm/reasoning
+- Registry: `__init__.py` → `_REASONING_PARSERS_TO_REGISTER` dict
+- Each parser file defines `start_token` / `end_token` properties
+- 17 models: deepseek_r1, deepseek_v3, qwen3, kimi_k2, glm45, step3, step3p5, minimax_m2, mistral, granite, ernie45, gptoss, hunyuan_a13b, olmo3, holo2, seed_oss, nano_v3
+
+**SGLang** — https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/parser/reasoning_parser.py
+- Single file with all parsers and a `DetectorMap` dict mapping model names to detector classes
+- Each detector class defines `think_start_token` / `think_end_token`
+
+If the model exists in either project, use their tokens directly — they've already been validated against real model output.
+
+### 2. HuggingFace `tokenizer_config.json` → `added_tokens_decoder`
+
+Most reliable primary source. Download from `https://huggingface.co/{org}/{model}/raw/main/tokenizer_config.json` and search for thinking-related tokens.
+
+**Example — Qwen3** (tokens 151667 and 151668):
+```json
+"151667": { "content": "<think>", "special": false },
+"151668": { "content": "</think>", "special": false }
+```
+
+### 2b. HuggingFace `tokenizer_config.json` → `chat_template`
+
+The Jinja2 chat template often shows how reasoning tokens are handled during formatting.
+
+**Example — DeepSeek-R1** (template strips reasoning from assistant messages):
+```jinja2
+{% if '</think>' in content %}
+ {% set content = content.split('</think>')[-1] %}
+{% endif %}
+```
+
+The `add_generation_prompt` section may also reveal the start token. For example, DeepSeek-R1 appends `<think>\n` when generating.
+
+**Example — Qwen3** (template wraps reasoning_content):
+```jinja2
+'<think>\n' + reasoning_content + '\n</think>\n\n' + content
+```
+
+### 3. HuggingFace model card / README
+
+Some providers document the token format in their model card.
+
+**Example — Cohere Command-A Reasoning:**
+```
+<|START_THINKING|>
+[reasoning]
+<|END_THINKING|>
+```
+
+Cohere also supports `reasoning=True` parameter in `apply_chat_template()`.
+
+### 4. Provider API docs
+
+Check the provider's API documentation for reasoning/thinking output format.
+
+**Example:** Together.ai, OpenRouter, and similar providers often document thinking output as a separate `reasoning` field with its own delimiter format.
+
+### 5. Send a test request and observe
+
+Last resort. Send a reasoning-capable prompt through the provider's API and inspect the raw response tokens.
+
+```bash
+curl -s https://api.provider.com/v1/chat/completions \
+ -H "Authorization: Bearer $KEY" \
+ -H "Content-Type: application/json" \
+ -d '{"model":"model-id","messages":[{"role":"user","content":"Think step by step: what is 17*23?"}]}' \
+ | jq '.choices[0].message'
+```
+
+Look for the delimiters wrapping the chain-of-thought content.
+
+### Known Token Formats
+
+**In SMG today:**
+
+| Model Family | Start Token | End Token | Notes |
+|---|---|---|---|
+| DeepSeek-R1 | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| Qwen3 | `<think>` | `</think>` | `initial_in_reasoning=false` |
+| QwenThinking | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| Kimi | `◁think▷` | `◁/think▷` | Unicode tokens |
+| GLM-4.5 | `<think>` | `</think>` | `initial_in_reasoning=false` |
+| Step3 | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| NanoV3 (Nemotron) | `<think>` | `</think>` | `initial_in_reasoning=true` |
+| MiniMax M2 | `<think>` | `</think>` | Model doesn't emit start token; SMG prepends it |
+| Cohere Command | `<\|START_THINKING\|>` | `<\|END_THINKING\|>` | `initial_in_reasoning=false` |
+
+**In vLLM/SGLang but NOT yet in SMG** (candidates to add):
+
+| Model Family | Start Token | End Token | Notes | Source |
+|---|---|---|---|---|
+| Mistral | `[THINK]` | `[/THINK]` | Uses `mistral_common` special tokens | vLLM |
+| Granite | regex: `Here's my thought process:` | regex: `Here's my response:` | Regex-based, not token-based | vLLM |
+| Ernie 4.5 | `<think>` | `</think>` + `<response>`/`</response>` | Extra response delimiters | vLLM |
+| GptOss | `<\|channel\|>analysis<\|message\|>` | `<\|end\|>` | Multi-token sequence | vLLM, SGLang |
+| Hunyuan A13B | `<think>` | `</think>` | Standard tokens | vLLM |
+| OLMo 3 | `<think>` | `</think>` | Standard tokens | vLLM |
+| Seed OSS | `` | `` | Standard tokens | vLLM |
+| Holo2 | `<think>` | `</think>` | Standard tokens | vLLM |
+
+Most models use `<think>`/`</think>`. If the model you're adding uses those, you can usually just confirm via one of the sources above. For non-standard tokens (Mistral, Granite, GptOss), check the vLLM/SGLang implementations for the exact format.
+
+## How to Determine `initial_in_reasoning`
+
+This flag controls whether the parser assumes the first token is reasoning or normal text.
+
+**Check the chat template's `add_generation_prompt` section:**
+- If it appends the start token (e.g. `<think>\n`) → `initial_in_reasoning: true`
+- If it appends nothing or just a role tag → `initial_in_reasoning: false`
+
+**Or send a test request** and observe:
+- If the very first output token is reasoning content (before any start token) → `true`
+- If the first output is normal text or starts with an explicit start token → `false`
+
| Value | Behavior | Models |
|-------|----------|--------|
-| `true` | Everything is reasoning until end token | DeepSeek-R1, Step3, Nemotron |
-| `false` | Requires explicit start token | Qwen3, GLM-4, Kimi, MiniMax, Cohere |
+| `true` | Everything is reasoning until end token appears | DeepSeek-R1, Step3, Nemotron, QwenThinking |
+| `false` | Normal text until explicit start token | Qwen3, GLM-4.5, Kimi, MiniMax, Cohere |
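
To make the effect of the flag concrete, here is a minimal sketch of a non-streaming splitter. It is not SMG's `BaseReasoningParser`; the function name `split_reasoning` and its return shape are assumptions made for illustration only.

```rust
/// Hypothetical, simplified splitter (not the real BaseReasoningParser).
/// Returns (reasoning_text, normal_text) for one complete model output.
fn split_reasoning(
    text: &str,
    start_token: &str,
    end_token: &str,
    initial_in_reasoning: bool,
) -> (String, String) {
    // Decide where the reasoning span begins.
    let (before, body, in_reasoning) = match text.find(start_token) {
        // Explicit start token: reasoning begins right after it.
        Some(i) => (&text[..i], &text[i + start_token.len()..], true),
        // No start token: the flag decides how the text is treated.
        None => ("", text, initial_in_reasoning),
    };
    if !in_reasoning {
        return (String::new(), text.to_string());
    }
    match body.find(end_token) {
        Some(j) => (
            body[..j].to_string(),
            format!("{}{}", before, &body[j + end_token.len()..]),
        ),
        // End token never arrived: everything after the start is reasoning.
        None => (body.to_string(), before.to_string()),
    }
}
```

With the same input `step 1</think>answer`, the `true` setting yields reasoning `step 1` and normal text `answer`, while `false` returns the whole string (end token included) as normal text. That is the "all output classified as wrong type" failure in practice.
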
-## Steps
+## Step 1: Create parser file
-### Step 1: Create parser file
+**File:** `reasoning_parser/src/parsers/{MODEL_NAME}.rs`
-**File:** `reasoning_parser/src/parsers/mymodel.rs`
+Generate this file, substituting the 4 inputs:
```rust
-pub struct MyModelParser {
+use crate::{
+ parsers::BaseReasoningParser,
+ traits::{ParseError, ParserConfig, ParserResult, ReasoningParser},
+};
+
+pub struct {ModelName}Parser {
base: BaseReasoningParser,
}
-impl MyModelParser {
+impl {ModelName}Parser {
pub fn new() -> Self {
+ let config = ParserConfig {
+ think_start_token: "{START_TOKEN}".to_string(),
+ think_end_token: "{END_TOKEN}".to_string(),
+ initial_in_reasoning: {INITIAL_IN_REASONING},
+ stream_reasoning: true,
+ max_buffer_size: 65536,
+ };
Self {
- base: BaseReasoningParser::new(ParserConfig {
- think_start_token: "".to_string(),
- think_end_token: "".to_string(),
- initial_in_reasoning: false, // Choose based on model behavior
- stream_reasoning: true,
- max_buffer_size: 65536,
- }).with_model_type("mymodel".to_string()),
+ base: BaseReasoningParser::new(config)
+ .with_model_type("{MODEL_NAME}".to_string()),
}
}
}
-impl ReasoningParser for MyModelParser {
+impl Default for {ModelName}Parser {
+ fn default() -> Self {
+ Self::new()
+ }
+}
+
+impl ReasoningParser for {ModelName}Parser {
 fn detect_and_parse_reasoning(&mut self, text: &str) -> Result<ParserResult, ParseError> {
self.base.detect_and_parse_reasoning(text)
}
- fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError> {
+
+ fn parse_reasoning_streaming_incremental(
+ &mut self,
+ text: &str,
+ ) -> Result<ParserResult, ParseError> {
self.base.parse_reasoning_streaming_incremental(text)
}
- fn reset(&mut self) { self.base.reset() }
- fn model_type(&self) -> &str { self.base.model_type() }
- fn is_in_reasoning(&self) -> bool { self.base.is_in_reasoning() }
+
+ fn reset(&mut self) {
+ self.base.reset();
+ }
+
+ fn model_type(&self) -> &str {
+ self.base.model_type()
+ }
+
+ fn is_in_reasoning(&self) -> bool {
+ self.base.is_in_reasoning()
+ }
+}
+
+#[cfg(test)]
+mod tests {
+ use super::*;
+
+ #[test]
+ fn test_model_type() {
+ let parser = {ModelName}Parser::new();
+ assert_eq!(parser.model_type(), "{MODEL_NAME}");
+ }
+
+ #[test]
+ fn test_reasoning_extraction() {
+ let mut parser = {ModelName}Parser::new();
+ let result = parser
+ .detect_and_parse_reasoning("{START_TOKEN}thinking here{END_TOKEN}normal text")
+ .unwrap();
+ assert_eq!(result.reasoning_text, "thinking here");
+ assert_eq!(result.normal_text, "normal text");
+ }
+
+ #[test]
+ fn test_streaming_partial_token() {
+ let mut parser = {ModelName}Parser::new();
+ // Send partial end token — must not lose characters
+ let r1 = parser
+ .parse_reasoning_streaming_incremental("{START_TOKEN}partial")
+ .unwrap();
+ let r2 = parser
+ .parse_reasoning_streaming_incremental("not-a-token")
+ .unwrap();
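+        // Nothing from either chunk may be dropped, wherever the parser routes it.
+        let combined = format!(
+            "{}{}{}{}",
+            r1.reasoning_text, r1.normal_text, r2.reasoning_text, r2.normal_text
+        );
+        assert!(combined.contains("partial"));
+        assert!(combined.contains("not-a-token"));
+    }
+}
+```
+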
+**Verify:** `cargo check -p reasoning_parser`
+
+## Step 2: Register in module exports
+
+**File:** `reasoning_parser/src/parsers/mod.rs` — add:
+```rust
+pub mod {MODEL_NAME};
+pub use {MODEL_NAME}::{ModelName}Parser;
+```
+
+**File:** `reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
+```rust
+{ModelName}Parser,
+```
+
+**Verify:** `cargo check -p reasoning_parser`
-**Verify:** `cargo build`
+## Step 3: Register in factory
-### Step 2: Register in factory
+**File:** `reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add:
```rust
-registry.register_parser("mymodel", || Box::new(MyModelParser::new()));
-registry.register_pattern("my-model", "mymodel"); // Case-insensitive substring
+// Parser registration
+registry.register_parser("{MODEL_NAME}", || Box::new({ModelName}Parser::new()));
+
+// Pattern mappings — one per model ID variant
+registry.register_pattern("{pattern-1}", "{MODEL_NAME}");
+registry.register_pattern("{pattern-2}", "{MODEL_NAME}");
```
-**Verify:** `cargo build`
+Pattern matching is **case-insensitive substring**: `model_id.to_lowercase().contains(pattern)`.
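
A minimal sketch of that rule follows. The helper names `pattern_matches` and `resolve_parser` are hypothetical and are not part of the SMG factory API. Because only the model ID is lowercased, the sketch assumes every registered pattern is already written in lowercase.

```rust
/// Hypothetical helper mirroring the rule quoted above:
/// lowercase the model ID, then look for the pattern as a substring.
fn pattern_matches(model_id: &str, pattern: &str) -> bool {
    model_id.to_lowercase().contains(pattern)
}

/// Returns the parser name for the first pattern that matches, if any.
/// `patterns` holds (pattern, parser_name) pairs in registration order.
fn resolve_parser<'a>(model_id: &str, patterns: &[(&str, &'a str)]) -> Option<&'a str> {
    patterns
        .iter()
        .find(|(pattern, _)| pattern_matches(model_id, pattern))
        .map(|(_, parser)| *parser)
}
```

Under this rule a pattern such as `My-Model` can never match, since the lowercased ID contains no uppercase letters. Registering patterns in lowercase avoids that silent miss.
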
+
+**Verify:** `cargo check -p reasoning_parser`
+
+## Step 4: Run tests
+
+```bash
+cargo test -p reasoning_parser
+```
-### Step 3: Export
+All tests in the new file plus all existing tests must pass.
-**Files:** `reasoning_parser/src/parsers/mod.rs` and `reasoning_parser/src/lib.rs`
+## Step 5: Run full quality gate
-### Step 4: Write tests
+Invoke `smg:contribute` to run fmt → clippy → test → bindings → commit format.
-Required test cases:
-- Streaming with partial tokens (e.g. `</` that doesn't complete `</think>`)
-- Buffer overflow (>64KB reasoning block)
-- Normal text / reasoning text separation
-- Reset between requests (NOT between chunks)
-- `initial_in_reasoning` behavior matches expectations
+## When You Need Custom Logic
-**Verify:** `cargo test`
+Most parsers are pure `BaseReasoningParser` wrappers. Only override methods when:
-## Critical: Partial Token Bug
+| Scenario | Example | What To Do |
+|----------|---------|------------|
+| Model always starts with reasoning but doesn't emit start token | MiniMax M2 | Prepend start token on first chunk (track `is_first_chunk` state) |
+| Non-standard delimiters | Kimi (`◁think▷`), Cohere (`<\|START_THINKING\|>`) | Just change token strings in `ParserConfig` |
+| Model needs preprocessing | Hypothetical | Override trait methods, call `self.base` after preprocessing |
+
+For MiniMax-style prepending, add an `is_first_chunk: bool` field and override `parse_reasoning_streaming_incremental`:
+
+```rust
+fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError> {
+ let modified = if self.is_first_chunk {
+ self.is_first_chunk = false;
+ format!("{START_TOKEN}{text}")
+ } else {
+ text.to_string()
+ };
+ self.base.parse_reasoning_streaming_incremental(&modified)
+}
+```
-When `</` appears but doesn't complete `</think>`, the system must not lose the `</` prefix. The `BaseReasoningParser` handles this correctly; verify in tests by sending `</` followed by non-`think>` text.
+Don't forget to reset `is_first_chunk = true` in `reset()`.
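
The interaction between the first-chunk flag and `reset()` can be sketched in isolation. Everything below is a stand-in: `StubBase` replaces the real base parser so the example compiles on its own, and `<think>` is assumed as the start token.

```rust
/// Stand-in for the real base parser so the sketch is self-contained.
#[derive(Default)]
struct StubBase {
    seen: String,
}

impl StubBase {
    fn feed(&mut self, text: &str) {
        self.seen.push_str(text);
    }
    fn reset(&mut self) {
        self.seen.clear();
    }
}

/// Hypothetical wrapper that prepends the start token once per request.
struct PrependingParser {
    base: StubBase,
    is_first_chunk: bool,
}

impl PrependingParser {
    fn new() -> Self {
        Self { base: StubBase::default(), is_first_chunk: true }
    }

    fn feed(&mut self, text: &str) {
        if self.is_first_chunk {
            // Only the first chunk of a request gets the synthetic start token.
            self.is_first_chunk = false;
            self.base.feed("<think>");
        }
        self.base.feed(text);
    }

    fn reset(&mut self) {
        self.base.reset();
        // Re-arm the flag so the next request is prefixed again.
        self.is_first_chunk = true;
    }
}
```

If `reset()` did not re-arm the flag, the second request handled by a reused parser would start without the start token and its reasoning would be misclassified.
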
## Common Mistakes
-| Mistake | Consequence |
-|---------|-------------|
-| Wrong `initial_in_reasoning` | All output classified as wrong type |
-| Resetting between chunks instead of requests | Loses buffer state mid-stream |
-| Not delegating to `BaseReasoningParser` | Reimplements solved edge cases |
+| Mistake | Consequence | Fix |
+|---------|-------------|-----|
+| Wrong `initial_in_reasoning` | All output classified as wrong type | Test with real model output to observe whether first token is reasoning |
+| Missing pattern variant | Model ID doesn't match, falls back to base | Check provider's model catalog for ALL variant names (aliases, versioned IDs) |
+| Resetting between chunks instead of requests | Loses buffer state mid-stream | Only call `reset()` between separate API requests |
+| Not delegating to `BaseReasoningParser` | Reimplements solved edge cases (partial tokens, buffer overflow) | Always compose `BaseReasoningParser` unless you have a very good reason |
+| Forgetting `parsers/mod.rs` export | The new file is never compiled, so the `lib.rs` re-export and the factory cannot resolve the type | Always update both `mod.rs` and `lib.rs` |
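
The partial-token edge case named in the table can be illustrated with a small hold-back routine. This is a sketch of the general technique, not the code inside `BaseReasoningParser`; the function names are hypothetical.

```rust
/// Length of the longest suffix of `buf` that is a proper prefix of `token`.
/// That suffix might be the beginning of the token, so it must not be emitted yet.
fn partial_token_len(buf: &str, token: &str) -> usize {
    let max = buf.len().min(token.len().saturating_sub(1));
    (1..=max)
        .rev()
        .find(|&n| {
            buf.is_char_boundary(buf.len() - n) && token.starts_with(&buf[buf.len() - n..])
        })
        .unwrap_or(0)
}

/// Splits a buffer into (safe_to_emit, held_back_for_next_chunk).
fn split_safe<'a>(buf: &'a str, token: &str) -> (&'a str, &'a str) {
    let keep = partial_token_len(buf, token);
    buf.split_at(buf.len() - keep)
}
```

A caller would prepend the held-back part to the next chunk before scanning again. If the next chunk does not complete the token, the held-back characters are emitted as ordinary text, so nothing is lost.
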
diff --git a/skills/implement/tool-parser.md b/skills/implement/tool-parser.md
index becec9a..c3ecc66 100644
--- a/skills/implement/tool-parser.md
+++ b/skills/implement/tool-parser.md
@@ -1,84 +1,423 @@
# Adding a Tool Parser to SMG
-13+ formats supported. Most are JSON-based and can reuse `handle_json_tool_streaming()`.
+## Collect These 4 Inputs First
-## Steps
+Before writing any code, get these from the user:
-### Step 1: Create parser file
+| Input | Example | How To Decide |
+|-------|---------|---------------|
+| `PARSER_NAME` | `mymodel` | Snake case. Used as parser name, factory key, test file name |
+| `MODEL_PATTERNS` | `["my-model*", "MyModel*"]` | Wildcard patterns matched against model IDs. Check the provider's model catalog for all variant names |
+| `FORMAT` | See **Tool Call Format Types** below | The format the model uses to emit tool calls |
+| `DELIMITERS` | `` / `` | Start/end tokens wrapping tool calls. See **How to Find Tool Call Format** below |
-**File:** `tool_parser/src/parsers/myformat.rs`
+## How to Find Tool Call Format
+
+### 1. Check existing implementations in vLLM and SGLang
+
+**Fastest path.** These projects track tool call formats across all major models.
+
+**vLLM** — https://github.com/vllm-project/vllm/tree/main/vllm/tool_parsers
+- Registry: `__init__.py` → `_TOOL_PARSERS_TO_REGISTER` dict
+- 34 parser files covering all major model families
+- Each parser defines start/end tokens and format structure
+
+**SGLang** — https://github.com/sgl-project/sglang/tree/main/python/sglang/srt/function_call
+- 25 detector files, one per model family
+- `function_call_parser.py` is the router
+
+If the model exists in either project, match their format.
+
+### 2. HuggingFace `tokenizer_config.json` → `chat_template`
+
+The Jinja2 chat template shows how tools are formatted in the prompt AND how the model is expected to respond. Look for:
+- Tool-related special tokens in `added_tokens_decoder` (e.g., ``, `<|tool_calls_section_begin|>`)
+- Template sections handling `message.tool_calls` — shows the output format
+- `tool_use` or `function_call` handling in the template
+
+### 3. HuggingFace model card / README
+
+Model cards often document the expected tool call output format with examples.
+
+### 4. Send a test request with tools
+
+```bash
+curl -s https://api.provider.com/v1/chat/completions \
+ -H "Authorization: Bearer $KEY" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "model-id",
+ "tools": [{"type":"function","function":{"name":"get_weather","parameters":{"type":"object","properties":{"city":{"type":"string"}}}}}],
+ "messages": [{"role":"user","content":"What is the weather in Paris?"}]
+ }' | jq '.choices[0].message'
+```
+
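+Look at both `content` and `tool_calls` in the response. If the provider has already normalized the call into `tool_calls`, the model's raw delimiters are not visible here, and one of the sources above is the better reference.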
+
+## Tool Call Format Types
+
+| Type | Description | Parsers Using It | Reuse Strategy |
+|------|-------------|------------------|----------------|
+| **JSON with tags** | JSON wrapped in model-specific tags | Mistral `[TOOL_CALLS]`, Cohere `<\|START_ACTION\|>`, Step3 ``, KimiK2 `<\|tool_call_begin\|>`, MiniMax `` | Extract between tags, delegate to `helpers::handle_json_tool_streaming()` |
+| **XML with parameters** | XML tags with key-value parameter children | QwenCoder ``, GLM4 `....` | Custom XML parsing |
+| **Raw JSON** | Pure JSON object or array | JsonParser (OpenAI, Claude, Gemini) | Use `JsonParser` directly or register as `json` |
+| **Pythonic** | Python function call syntax `[func(arg=val)]` | PythonicParser (Llama 4, DeepSeek R1) | Use `PythonicParser` directly |
+| **Unicode tokens** | Full-width Unicode delimiters | DeepSeek V3 `<｜tool▁calls▁begin｜>` | Custom parser with Unicode handling |
+
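+**Most new models use "JSON with tags"**: they wrap standard JSON in model-specific start/end tokens. This is the easiest format to implement because `helpers::handle_json_tool_streaming()` already handles most of the streaming logic, including emitting the tool name first and then only the argument deltas.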
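
As a rough illustration of the "extract between tags" step, the sketch below collects every tagged payload from a complete output. The function name `extract_tagged` is hypothetical, the tags are passed in by the caller, and both tags are assumed to be non-empty.

```rust
/// Collects the text between every start/end tag pair, in order.
/// An unterminated final block is returned as-is so the caller can
/// decide whether to treat it as partial JSON.
fn extract_tagged<'a>(output: &'a str, start: &str, end: &str) -> Vec<&'a str> {
    let mut payloads = Vec::new();
    let mut rest = output;
    while let Some(s) = rest.find(start) {
        let after = &rest[s + start.len()..];
        match after.find(end) {
            Some(e) => {
                payloads.push(after[..e].trim());
                // Continue scanning after the end tag for further calls.
                rest = &after[e + end.len()..];
            }
            None => {
                payloads.push(after.trim());
                break;
            }
        }
    }
    payloads
}
```

Each returned slice would then be handed to a JSON parser. Scanning in a loop rather than stopping at the first pair is what keeps multiple sequential tool calls from being dropped.
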
+
+### Known Formats
+
+**In SMG today:**
+
+| Model Family | Start Token | End Token | Format | Parser |
+|---|---|---|---|---|
+| OpenAI, Claude, Gemini | `{` or `[` | `}` or `]` | Raw JSON | `json` |
+| Mistral, Mixtral | `[TOOL_CALLS] [` | `]` | JSON array with prefix | `mistral` |
+| Qwen 2/2.5/3 | `\n` | `\n` | JSON in XML tags | `qwen` |
+| Qwen 3/2.5 Coder | `` | `` | XML with `` | `qwen_coder` |
+| Llama 3.2 | `<\|python_tag\|>` | JSON boundary | JSON with prefix tag | `llama` |
+| Llama 4, DeepSeek R1 | `[` | `]` | Pythonic `func(arg=val)` | `pythonic` |
+| DeepSeek V3 | `<｜tool▁calls▁begin｜>` | `<｜tool▁calls▁end｜>` | Unicode tokens + JSON code block | `deepseek` |
+| GLM-4.5/4.6 | `` | `` | XML with ``/`` | `glm45_moe` |
+| GLM-4.7 | `` | `` | XML (whitespace variant) | `glm47_moe` |
+| Step-3 | `` | `` | JSON in XML tags | `step3` |
+| Kimi K2 | `<\|tool_call_begin\|>` | `<\|tool_call_end\|>` | JSON in pipe-delimited tags | `kimik2` |
+| MiniMax M2 | `` | `` | JSON in all-caps XML tags | `minimax_m2` |
+| Cohere Command | `<\|START_ACTION\|>` | `<\|END_ACTION\|>` | JSON with `tool_name`/`parameters` fields | `cohere` |
+
+**In vLLM/SGLang but NOT yet in SMG** (candidates to add):
+
+| Model Family | Format | Source |
+|---|---|---|
+| Hermes 2 Pro | `` JSON `` (with optional ``) | vLLM, SGLang |
+| Jamba | `` JSON array `` | vLLM |
+| xLAM | Multiple: JSON code block, `[TOOL_CALLS]`, or XML | vLLM |
+| FunctionGemma | `call:name{args}` | vLLM |
+| Granite | `<\|tool_call\|>` or `` + JSON array | vLLM, SGLang |
+| Phi-4 Mini | JSON format | vLLM |
+| Seed OSS | `` with `` | vLLM |
+| InternLM | Custom format | vLLM, SGLang |
+| Hunyuan A13B | Custom format | vLLM |
+| OLMo 3 | Pythonic format | vLLM |
+| GigaChat 3 | Custom format | vLLM, SGLang |
+| DeepSeek V3.1/V3.2 | Variant of DeepSeek V3 Unicode tokens | vLLM, SGLang |
+
+## Step 1: Create parser file
+
+**File:** `tool_parser/src/parsers/{PARSER_NAME}.rs`
+
+For the most common case — **JSON with tags** — generate this template:
```rust
-pub struct MyFormatParser {
- buffer: String,
+use async_trait::async_trait;
+use openai_protocol::common::Tool;
+use serde_json::Value;
+
+use crate::{
+ errors::ParserResult,
+ parsers::helpers,
+ partial_json::PartialJson,
+ traits::ToolParser,
+ types::{FunctionCall, StreamingParseResult, ToolCall, ToolCallItem},
+};
+
+const START_TOKEN: &str = "{START_TOKEN}";
+const END_TOKEN: &str = "{END_TOKEN}";
+
+pub struct {ParserName}Parser {
partial_json: PartialJson,
+ buffer: String,
+ prev_tool_call_arr: Vec<Value>,
current_tool_id: i32,
+ current_tool_name_sent: bool,
 streamed_args_for_tool: Vec<String>,
}
+impl {ParserName}Parser {
+ pub fn new() -> Self {
+ Self {
+ partial_json: PartialJson::default(),
+ buffer: String::new(),
+ prev_tool_call_arr: Vec::new(),
+ current_tool_id: -1,
+ current_tool_name_sent: false,
+ streamed_args_for_tool: Vec::new(),
+ }
+ }
+}
+
+impl Default for {ParserName}Parser {
+ fn default() -> Self {
+ Self::new()
+ }
+}
+
#[async_trait]
-impl ToolParser for MyFormatParser {
- async fn parse_complete(&self, output: &str) -> ParserResult<(String, Vec<ToolCall>)> { ... }
- async fn parse_incremental(&mut self, chunk: &str, tools: &[Tool]) -> ParserResult<StreamingParseResult> { ... }
- fn has_tool_markers(&self, text: &str) -> bool { ... }
- fn reset(&mut self) { ... } // Clear buffer, tool_id, streamed_args
+impl ToolParser for {ParserName}Parser {
+    async fn parse_complete(&self, output: &str) -> ParserResult<(String, Vec<ToolCall>)> {
+        // No start token: the whole output is normal text
+        let Some(first) = output.find(START_TOKEN) else {
+            return Ok((output.to_string(), vec![]));
+        };
+        let normal_text = output[..first].to_string();
+
+        // Walk every START_TOKEN..END_TOKEN pair, not only the first one
+        let mut calls = Vec::new();
+        let mut rest = &output[first..];
+        while let Some(start) = rest.find(START_TOKEN) {
+            let after_start = &rest[start + START_TOKEN.len()..];
+            // A missing END_TOKEN means the call runs to the end of the output
+            let (json_str, next) = match after_start.find(END_TOKEN) {
+                Some(end) => (&after_start[..end], &after_start[end + END_TOKEN.len()..]),
+                None => (after_start, ""),
+            };
+
+            // Parse JSON and extract tool calls
+            let value: Value = serde_json::from_str(json_str.trim())?;
+
+            // Adapt based on your model's JSON structure:
+            // Most use {"name": "func", "arguments": {...}}
+            // Some use {"tool_name": "func", "parameters": {...}}
+            calls.extend(helpers::extract_tool_calls_from_value(&value)?);
+            rest = next;
+        }
+        Ok((normal_text, calls))
+    }
+
+ async fn parse_incremental(
+ &mut self,
+ chunk: &str,
+ tools: &[Tool],
+ ) -> ParserResult<StreamingParseResult> {
+ self.buffer.push_str(chunk);
+
+ // Look for start token
+ let Some(start_idx) = self.buffer.find(START_TOKEN) else {
+ // No tool call yet — emit as normal text
+ let normal = self.buffer.clone();
+ self.buffer.clear();
+ return Ok(StreamingParseResult {
+ normal_text: normal,
+ calls: vec![],
+ });
+ };
+
+ // Emit any text before the tool call as normal text
+ let normal_text = self.buffer[..start_idx].to_string();
+ let json_start = start_idx + START_TOKEN.len();
+
+ // Check for end token
+        // Copy the JSON span into an owned String so the helper below can
+        // take `&mut self.buffer` without a borrow conflict.
+        let json_text = match self.buffer[json_start..].find(END_TOKEN) {
+            Some(end_idx) => self.buffer[json_start..json_start + end_idx].to_string(),
+            None => self.buffer[json_start..].to_string(),
+        };
+
+        // Build tool index map from available tools
+        let tool_indices: std::collections::HashMap<String, usize> = tools
+            .iter()
+            .enumerate()
+            .filter_map(|(i, t)| t.function.as_ref().map(|f| (f.name.clone(), i)))
+            .collect();
+
+        // Delegate to shared JSON streaming helper
+        let mut result = helpers::handle_json_tool_streaming(
+            &json_text,
+ 0,
+ &mut self.partial_json,
+ &tool_indices,
+ &mut self.buffer,
+ &mut self.current_tool_id,
+ &mut self.current_tool_name_sent,
+ &mut self.streamed_args_for_tool,
+ &mut self.prev_tool_call_arr,
+ )?;
+
+ result.normal_text = normal_text;
+ Ok(result)
+ }
+
+ fn has_tool_markers(&self, text: &str) -> bool {
+ text.contains(START_TOKEN)
+ }
+
+ fn get_unstreamed_tool_args(&self) -> Option<Vec<ToolCallItem>> {
+ helpers::get_unstreamed_args(&self.prev_tool_call_arr, &self.streamed_args_for_tool)
+ }
+
+ fn reset(&mut self) {
+ helpers::reset_parser_state(
+ &mut self.buffer,
+ &mut self.prev_tool_call_arr,
+ &mut self.current_tool_id,
+ &mut self.current_tool_name_sent,
+ &mut self.streamed_args_for_tool,
+ );
+ }
}
```
-**Verify:** `cargo build`
+**Verify:** `cargo check -p tool_parser`
-### Step 2: Implement streaming (two-stage pattern)
+## Step 2: Register in module exports
-Stage 1 — Name detected:
+**File:** `tool_parser/src/parsers/mod.rs` — add:
```rust
-StreamingParseResult { items: vec![ToolCallItem { name: Some("func"), parameters: "" }] }
+pub mod {PARSER_NAME};
+pub use {PARSER_NAME}::{ParserName}Parser;
```
-Stage 2 — Arguments delta:
+**File:** `tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block:
```rust
-// Calculate delta = current_args[previous_length..]
-StreamingParseResult { items: vec![ToolCallItem { name: None, parameters: delta }] }
+{ParserName}Parser,
```
-For JSON-based formats, use `helpers::handle_json_tool_streaming()` — handles 80% of the logic.
+**Verify:** `cargo check -p tool_parser`
-**Anti-pattern:** Not tracking `streamed_args_for_tool` — re-sends entire argument string each chunk.
+## Step 3: Register in factory
-### Step 3: Register in factory
+**File:** `tool_parser/src/factory.rs`
-**File:** `tool_parser/src/` (factory/registry)
+In `ParserFactory::new()`:
+```rust
+registry.register_parser("{PARSER_NAME}", || Box::new({ParserName}Parser::new()));
+```
+In `ParserFactory::register_default_mappings()`:
```rust
-registry.register_parser("myformat", || Box::new(MyFormatParser::new()));
-registry.map_model("my-model*", "myformat"); // Wildcard pattern matching
+// Each MODEL_PATTERNS entry already carries its own `*` wildcard (e.g. "my-model*")
+registry.map_model("{model-pattern-1}", "{PARSER_NAME}");
+registry.map_model("{model-pattern-2}", "{PARSER_NAME}");
```
-**Verify:** `cargo build`
+Pattern matching uses **glob wildcards** (`*` matches any characters).
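
For intuition, a wildcard match of this kind can be sketched in a few lines. This is not the matcher used by the SMG registry; `glob_match` is a hypothetical function, and case handling is left to the caller because the document does not say whether the real matcher is case-sensitive.

```rust
/// Minimal glob matcher: `*` matches any run of characters (including none);
/// every other character must match literally.
fn glob_match(pattern: &str, text: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        // No wildcard at all: require an exact match.
        return pattern == text;
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    if !text.starts_with(first)
        || text.len() < first.len() + last.len()
        || !text.ends_with(last)
    {
        return false;
    }
    // Literal pieces between wildcards must appear in order in the middle.
    let mut middle = &text[first.len()..text.len() - last.len()];
    for part in &parts[1..parts.len() - 1] {
        match middle.find(*part) {
            Some(i) => middle = &middle[i + part.len()..],
            None => return false,
        }
    }
    true
}
```

A trailing-wildcard pattern such as `my-model*` therefore matches `my-model-v2` but not `other-my-model`, which is why variants with a vendor prefix need their own pattern or a leading `*`.
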
-### Step 4: Export
+**Verify:** `cargo check -p tool_parser`
-**Files:** `tool_parser/src/parsers/mod.rs` and `tool_parser/src/lib.rs`
+## Step 4: Write tests
-Add `pub mod myformat;` and re-export.
+**File:** `tool_parser/tests/tool_parser_{PARSER_NAME}.rs`
-### Step 5: Write tests
+```rust
+mod common;
+
+use common::create_test_tools;
+use tool_parser::{{ParserName}Parser, ToolParser};
-**File:** `tool_parser/tests/tool_parser_myformat.rs`
+#[tokio::test]
+async fn test_parse_single_tool_call() {
+ let parser = {ParserName}Parser::new();
+ let input = r#"Some text{START_TOKEN}{"name":"get_weather","arguments":{"city":"Paris"}}{END_TOKEN}"#;
+ let (normal, calls) = parser.parse_complete(input).await.unwrap();
+ assert_eq!(normal, "Some text");
+ assert_eq!(calls.len(), 1);
+ assert_eq!(calls[0].function.name, "get_weather");
+}
+
+#[tokio::test]
+async fn test_parse_multiple_tool_calls() {
+ let parser = {ParserName}Parser::new();
+ // Test with multiple sequential tool calls
+ let input = r#"{START_TOKEN}{"name":"func1","arguments":{}}{END_TOKEN}{START_TOKEN}{"name":"func2","arguments":{}}{END_TOKEN}"#;
+ let (_, calls) = parser.parse_complete(input).await.unwrap();
+ assert_eq!(calls.len(), 2);
+}
+
+#[tokio::test]
+async fn test_no_tool_calls() {
+ let parser = {ParserName}Parser::new();
+ let input = "Just normal text, no tool calls here.";
+ let (normal, calls) = parser.parse_complete(input).await.unwrap();
+ assert_eq!(normal, input);
+ assert!(calls.is_empty());
+}
+
+#[tokio::test]
+async fn test_has_tool_markers() {
+ let parser = {ParserName}Parser::new();
+ assert!(parser.has_tool_markers("text {START_TOKEN} more"));
+ assert!(!parser.has_tool_markers("plain text"));
+}
+
+#[tokio::test]
+async fn test_streaming_chunks() {
+ let mut parser = {ParserName}Parser::new();
+ let tools = create_test_tools();
+
+ // Split a tool call across multiple chunks
+ let chunks = vec![
+ "Normal text",
+ "{START_TOKEN}{\"na",
+ "me\":\"get_weather\",\"ar",
+ "guments\":{\"city\":\"",
+ "Paris\"}}{END_TOKEN}",
+ ];
+
+ let mut all_normal = String::new();
+ let mut all_calls = Vec::new();
+ for chunk in chunks {
+ let result = parser.parse_incremental(chunk, &tools).await.unwrap();
+ all_normal.push_str(&result.normal_text);
+ all_calls.extend(result.calls);
+ }
+ assert_eq!(all_normal, "Normal text");
+ assert!(!all_calls.is_empty());
+}
+
+#[tokio::test]
+async fn test_empty_arguments() {
+ let parser = {ParserName}Parser::new();
+ let input = r#"{START_TOKEN}{"name":"no_args","arguments":{}}{END_TOKEN}"#;
+ let (_, calls) = parser.parse_complete(input).await.unwrap();
+ assert_eq!(calls[0].function.arguments, "{}");
+}
+
+#[tokio::test]
+async fn test_reset() {
+ let mut parser = {ParserName}Parser::new();
+ let tools = create_test_tools();
+ parser.parse_incremental("{START_TOKEN}{\"name", &tools).await.unwrap();
+ parser.reset();
+ // After reset, parser should handle new input cleanly
+ let result = parser.parse_incremental("fresh text", &tools).await.unwrap();
+ assert_eq!(result.normal_text, "fresh text");
+}
+```
+
+**Verify:** `cargo test --test tool_parser_{PARSER_NAME}`
+
+## Step 5: Run full quality gate
+
+Invoke `smg:contribute` to run fmt → clippy → test → bindings → commit format.
+
+## Adapting for Non-Standard JSON Fields
+
+Some models use non-standard field names. Map them in `parse_complete`:
+
+| Model | Name Field | Arguments Field |
+|-------|-----------|-----------------|
+| Most models | `name` | `arguments` |
+| Cohere | `tool_name` | `parameters` |
+| Llama 3.2 | `name` | `parameters` |
+
+Use `helpers::normalize_tool_call_fields()` if available, or map manually:
+```rust
+let name = obj.get("name")
+ .or_else(|| obj.get("tool_name"))
+ .and_then(|v| v.as_str());
+let args = obj.get("arguments")
+ .or_else(|| obj.get("parameters"));
+```
-Required test cases:
-- Single tool call
-- Multiple tool calls
-- Empty arguments
-- Streaming with chunks split at different boundaries
-- Invalid tool name (should skip, not error)
-- Mixed normal text and tool calls
-- Unicode and special characters
+## When NOT to Use the JSON-with-Tags Template
-**Verify:** `cargo test --test tool_parser_myformat`
+| Scenario | Example | What To Do Instead |
+|----------|---------|-------------------|
+| Model uses XML with key-value parameters | QwenCoder `value` | Write custom XML extraction (see `qwen_coder.rs`) |
+| Model uses Python function syntax | Llama 4 `[func(arg=val)]` | Use `PythonicParser` or register as `pythonic` |
+| Model uses raw JSON (no tags) | OpenAI, Claude | Register as `json` — no new parser needed |
+| Model uses Unicode delimiter tokens | DeepSeek V3 full-width chars | Write custom parser with Unicode-aware matching (see `deepseek.rs`) |
-## Key Rules
+## Common Mistakes
-- Validate tool names against the `tools` list — skip invalid calls
-- Reset state between **requests**, NOT between **chunks**
-- Use `partial_json.parse_value()` for incomplete JSON
-- Parsers are pooled with `Arc>` — must be `Send + Sync`
+| Mistake | Consequence | Fix |
+|---------|-------------|-----|
+| Not handling multiple tool calls | Only first call extracted | Loop over all start/end token pairs or parse JSON array |
+| Resetting between chunks instead of requests | Loses buffer state mid-stream | Only call `reset()` between separate API requests |
+| Not validating tool names against `tools` list | Invalid tool calls forwarded to client | Skip calls where name doesn't match any provided tool |
+| Re-sending full arguments each chunk | Client receives duplicate argument data | Track `streamed_args_for_tool` and send only the delta |
+| Missing `get_unstreamed_tool_args()` | Final arguments lost on fast completions | Implement using `helpers::get_unstreamed_args()` |
+| Forgetting `Send + Sync` bounds | Parser can't be pooled with `Arc<Mutex<...>>` | Avoid `Rc`, `RefCell`, or non-`Send` types in struct fields |
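
The delta rule from the table can be sketched on its own. `ArgStreamer` and `next_delta` are hypothetical names used only for illustration; the field name is borrowed from the template above, and the real bookkeeping lives in the shared helpers.

```rust
/// Tracks how much of each tool call's argument string has been streamed.
struct ArgStreamer {
    streamed_args_for_tool: Vec<String>,
}

impl ArgStreamer {
    fn new() -> Self {
        Self { streamed_args_for_tool: Vec::new() }
    }

    /// Returns only the not-yet-sent suffix of `current_args` for `tool_index`.
    fn next_delta(&mut self, tool_index: usize, current_args: &str) -> String {
        if self.streamed_args_for_tool.len() <= tool_index {
            self.streamed_args_for_tool.resize(tool_index + 1, String::new());
        }
        let sent = &mut self.streamed_args_for_tool[tool_index];
        // Emit a delta only when the new snapshot extends what was already sent.
        let delta = current_args
            .strip_prefix(sent.as_str())
            .unwrap_or("")
            .to_string();
        sent.push_str(&delta);
        delta
    }
}
```

Calling `next_delta` with a growing argument snapshot returns each new fragment exactly once, so a client that concatenates the deltas reconstructs the full argument string without duplicates.
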