diff --git a/skills/implement/reasoning-parser.md b/skills/implement/reasoning-parser.md index 70cb70a..84557ae 100644 --- a/skills/implement/reasoning-parser.md +++ b/skills/implement/reasoning-parser.md @@ -1,88 +1,365 @@ # Adding a Reasoning Parser to SMG -10+ model families supported. Most delegate to `BaseReasoningParser` — only add custom logic if needed. +## Collect These 4 Inputs First -## Key Decision: `initial_in_reasoning` +Before writing any code, get these from the user: + +| Input | Example | How To Decide | +|-------|---------|---------------| +| `MODEL_NAME` | `mymodel` | Snake case. Used as parser name, model_type, factory key | +| `PATTERNS` | `["my-model", "my-model-v2", "mymodel"]` | Case-insensitive substrings matched against model IDs. Check the provider's model catalog for all variant names | +| `TOKENS` | `` / `` | See **How to Find Reasoning Tokens** below | +| `INITIAL_IN_REASONING` | `true` or `false` | See **How to Determine initial_in_reasoning** below | + +## How to Find Reasoning Tokens + +There is no single canonical source. Check these locations in order: + +### 1. Check existing implementations in vLLM and SGLang + +**Fastest path.** These projects track reasoning tokens across all major models. If the model already has a parser there, just copy the tokens. 
+ +**vLLM** — https://github.com/vllm-project/vllm/tree/main/vllm/reasoning +- Registry: `__init__.py` → `_REASONING_PARSERS_TO_REGISTER` dict +- Each parser file defines `start_token` / `end_token` properties +- 17 models: deepseek_r1, deepseek_v3, qwen3, kimi_k2, glm45, step3, step3p5, minimax_m2, mistral, granite, ernie45, gptoss, hunyuan_a13b, olmo3, holo2, seed_oss, nano_v3 + +**SGLang** — https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/parser/reasoning_parser.py +- Single file with all parsers and a `DetectorMap` dict mapping model names to detector classes +- Each detector class defines `think_start_token` / `think_end_token` + +If the model exists in either project, use their tokens directly — they've already been validated against real model output. + +### 2a. HuggingFace `tokenizer_config.json` → `added_tokens_decoder` + +Most reliable primary source. Download from `https://huggingface.co/{org}/{model}/raw/main/tokenizer_config.json` and search for thinking-related tokens. + +**Example — Qwen3** (tokens 151667 and 151668): +```json +"151667": { "content": "", "special": false }, +"151668": { "content": "", "special": false } +``` + +### 2b. HuggingFace `tokenizer_config.json` → `chat_template` + +The Jinja2 chat template often shows how reasoning tokens are handled during formatting. + +**Example — DeepSeek-R1** (template strips reasoning from assistant messages): +```jinja2 +{% if '' in content %} + {% set content = content.split('')[-1] %} +{% endif %} +``` + +The `add_generation_prompt` section may also reveal the start token — DeepSeek-R1 appends `\n` when generating. + +**Example — Qwen3** (template wraps reasoning_content): +```jinja2 +'\n' + reasoning_content + '\n\n\n' + content +``` + +### 3. HuggingFace model card / README + +Some providers document the token format in their model card. 
+ +**Example — Cohere Command-A Reasoning:** +``` +<|START_THINKING|> +[reasoning] +<|END_THINKING|> +``` + +Cohere also supports `reasoning=True` parameter in `apply_chat_template()`. + +### 4. Provider API docs + +Check the provider's API documentation for reasoning/thinking output format. + +**Example — Together.ai, OpenRouter, etc.** often document thinking output as a separate `reasoning` field with its own delimiter format. + +### 5. Send a test request and observe + +Last resort. Send a reasoning-capable prompt through the provider's API and inspect the raw response tokens. + +```bash +curl -s https://api.provider.com/v1/chat/completions \ + -H "Authorization: Bearer $KEY" \ + -d '{"model":"model-id","messages":[{"role":"user","content":"Think step by step: what is 17*23?"}]}' \ + | jq '.choices[0].message' +``` + +Look for the delimiters wrapping the chain-of-thought content. + +### Known Token Formats + +**In SMG today:** + +| Model Family | Start Token | End Token | Notes | +|---|---|---|---| +| DeepSeek-R1 | `` | `` | `initial_in_reasoning=true` | +| Qwen3 | `` | `` | `initial_in_reasoning=false` | +| QwenThinking | `` | `` | `initial_in_reasoning=true` | +| Kimi | `◁think▷` | `◁/think▷` | Unicode tokens | +| GLM-4.5 | `` | `` | `initial_in_reasoning=false` | +| Step3 | `` | `` | `initial_in_reasoning=true` | +| NanoV3 (Nemotron) | `` | `` | `initial_in_reasoning=true` | +| MiniMax M2 | `` | `` | Model doesn't emit start token — SMG prepends it | +| Cohere Command | `<\|START_THINKING\|>` | `<\|END_THINKING\|>` | `initial_in_reasoning=false` | + +**In vLLM/SGLang but NOT yet in SMG** (candidates to add): + +| Model Family | Start Token | End Token | Notes | Source | +|---|---|---|---|---| +| Mistral | `[THINK]` | `[/THINK]` | Uses `mistral_common` special tokens | vLLM | +| Granite | regex: `Here's my thought process:` | regex: `Here's my response:` | Regex-based, not token-based | vLLM | +| Ernie 4.5 | `` | `` + ``/`` | Extra response delimiters | vLLM 
| +| GptOss | `<\|channel\|>analysis<\|message\|>` | `<\|end\|>` | Multi-token sequence | vLLM, SGLang | +| Hunyuan A13B | `` | `` | Standard tokens | vLLM | +| OLMo 3 | `` | `` | Standard tokens | vLLM | +| Seed OSS | `` | `` | Standard tokens | vLLM | +| Holo2 | `` | `` | Standard tokens | vLLM | + +Most models use ``/``. If the model you're adding uses those, you can usually just confirm via one of the sources above. For non-standard tokens (Mistral, Granite, GptOss), check the vLLM/SGLang implementations for the exact format. + +## How to Determine `initial_in_reasoning` + +This flag controls whether the parser assumes the first token is reasoning or normal text. + +**Check the chat template's `add_generation_prompt` section:** +- If it appends the start token (e.g. `\n`) → `initial_in_reasoning: true` +- If it appends nothing or just a role tag → `initial_in_reasoning: false` + +**Or send a test request** and observe: +- If the very first output token is reasoning content (before any start token) → `true` +- If the first output is normal text or starts with explicit start token → `false` | Value | Behavior | Models | |-------|----------|--------| -| `true` | Everything is reasoning until end token | DeepSeek-R1, Step3, Nemotron | -| `false` | Requires explicit start token | Qwen3, GLM-4, Kimi, MiniMax, Cohere | +| `true` | Everything is reasoning until end token appears | DeepSeek-R1, Step3, Nemotron, QwenThinking | +| `false` | Normal text until explicit start token | Qwen3, GLM-4.5, Kimi, MiniMax, Cohere | -## Steps +## Step 1: Create parser file -### Step 1: Create parser file +**File:** `reasoning_parser/src/parsers/{MODEL_NAME}.rs` -**File:** `reasoning_parser/src/parsers/mymodel.rs` +Generate this file, substituting the 4 inputs: ```rust -pub struct MyModelParser { +use crate::{ + parsers::BaseReasoningParser, + traits::{ParseError, ParserConfig, ParserResult, ReasoningParser}, +}; + +pub struct {ModelName}Parser { base: BaseReasoningParser, } -impl 
MyModelParser { +impl {ModelName}Parser { pub fn new() -> Self { + let config = ParserConfig { + think_start_token: "{START_TOKEN}".to_string(), + think_end_token: "{END_TOKEN}".to_string(), + initial_in_reasoning: {INITIAL_IN_REASONING}, + stream_reasoning: true, + max_buffer_size: 65536, + }; Self { - base: BaseReasoningParser::new(ParserConfig { - think_start_token: "".to_string(), - think_end_token: "".to_string(), - initial_in_reasoning: false, // Choose based on model behavior - stream_reasoning: true, - max_buffer_size: 65536, - }).with_model_type("mymodel".to_string()), + base: BaseReasoningParser::new(config) + .with_model_type("{MODEL_NAME}".to_string()), } } } -impl ReasoningParser for MyModelParser { +impl Default for {ModelName}Parser { + fn default() -> Self { + Self::new() + } +} + +impl ReasoningParser for {ModelName}Parser { fn detect_and_parse_reasoning(&mut self, text: &str) -> Result<ParserResult, ParseError> { self.base.detect_and_parse_reasoning(text) } - fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError> { + + fn parse_reasoning_streaming_incremental( + &mut self, + text: &str, + ) -> Result<ParserResult, ParseError> { self.base.parse_reasoning_streaming_incremental(text) } - fn reset(&mut self) { self.base.reset() } - fn model_type(&self) -> &str { self.base.model_type() } - fn is_in_reasoning(&self) -> bool { self.base.is_in_reasoning() } + + fn reset(&mut self) { + self.base.reset(); + } + + fn model_type(&self) -> &str { + self.base.model_type() + } + + fn is_in_reasoning(&self) -> bool { + self.base.is_in_reasoning() + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_model_type() { + let parser = {ModelName}Parser::new(); + assert_eq!(parser.model_type(), "{MODEL_NAME}"); + } + + #[test] + fn test_reasoning_extraction() { + let mut parser = {ModelName}Parser::new(); + let result = parser + .detect_and_parse_reasoning("{START_TOKEN}thinking here{END_TOKEN}normal text") + .unwrap(); + assert_eq!(result.reasoning_text, "thinking here"); + 
assert_eq!(result.normal_text, "normal text"); + } + + #[test] + fn test_streaming_partial_token() { + let mut parser = {ModelName}Parser::new(); + // Send partial end token — must not lose characters + let r1 = parser + .parse_reasoning_streaming_incremental("{START_TOKEN}partial`). +**Verify:** `cargo check -p reasoning_parser` + +## Step 2: Register in module exports + +**File:** `reasoning_parser/src/parsers/mod.rs` — add: +```rust +pub mod {MODEL_NAME}; +pub use {MODEL_NAME}::{ModelName}Parser; +``` + +**File:** `reasoning_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block: +```rust +{ModelName}Parser, +``` + +**Verify:** `cargo check -p reasoning_parser` -**Verify:** `cargo build` +## Step 3: Register in factory -### Step 2: Register in factory +**File:** `reasoning_parser/src/factory.rs` — in `ParserFactory::new()`, add: ```rust -registry.register_parser("mymodel", || Box::new(MyModelParser::new())); -registry.register_pattern("my-model", "mymodel"); // Case-insensitive substring +// Parser registration +registry.register_parser("{MODEL_NAME}", || Box::new({ModelName}Parser::new())); + +// Pattern mappings — one per model ID variant +registry.register_pattern("{pattern-1}", "{MODEL_NAME}"); +registry.register_pattern("{pattern-2}", "{MODEL_NAME}"); ``` -**Verify:** `cargo build` +Pattern matching is **case-insensitive substring**: `model_id.to_lowercase().contains(pattern)`. + +**Verify:** `cargo check -p reasoning_parser` + +## Step 4: Run tests + +```bash +cargo test -p reasoning_parser +``` -### Step 3: Export +All 7 tests in the new file plus all existing tests must pass. -**Files:** `reasoning_parser/src/parsers/mod.rs` and `reasoning_parser/src/lib.rs` +## Step 5: Run full quality gate -### Step 4: Write tests +Invoke `smg:contribute` to run fmt → clippy → test → bindings → commit format. -Required test cases: -- Streaming with partial tokens (e.g. 
``) -- Buffer overflow (>64KB reasoning block) -- Normal text / reasoning text separation -- Reset between requests (NOT between chunks) -- `initial_in_reasoning` behavior matches expectations +## When You Need Custom Logic -**Verify:** `cargo test` +Most parsers are pure `BaseReasoningParser` wrappers. Only override methods when: -## Critical: Partial Token Bug +| Scenario | Example | What To Do | +|----------|---------|------------| +| Model always starts with reasoning but doesn't emit start token | MiniMax M2 | Prepend start token on first chunk (track `is_first_chunk` state) | +| Non-standard delimiters | Kimi (`◁think▷`), Cohere (`<\|START_THINKING\|>`) | Just change token strings in `ParserConfig` | +| Model needs preprocessing | Hypothetical | Override trait methods, call `self.base` after preprocessing | + +For MiniMax-style prepending, add an `is_first_chunk: bool` field and override `parse_reasoning_streaming_incremental`: + +```rust +fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError> { + let modified = if self.is_first_chunk { + self.is_first_chunk = false; + format!("{START_TOKEN}{text}") + } else { + text.to_string() + }; + self.base.parse_reasoning_streaming_incremental(&modified) +} +``` -When ``, the system must not lose the `` text. +Also set `is_first_chunk = true` in `reset()`; otherwise the next request on a reused parser skips the prepend. 
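
The first-chunk bookkeeping described above can be sketched in isolation. This is a minimal illustration under stated assumptions: `FirstChunkPrepender` and `prepare` are hypothetical names, not SMG types, and the base parser is left out so the snippet compiles on its own. In a real parser, the string returned by `prepare` would be handed to `self.base`.

```rust
/// Hypothetical helper (not part of SMG): tracks whether the next chunk
/// is the first one of a request and prepends the start token once.
pub struct FirstChunkPrepender {
    start_token: String,
    is_first_chunk: bool,
}

impl FirstChunkPrepender {
    pub fn new(start_token: &str) -> Self {
        Self {
            start_token: start_token.to_string(),
            is_first_chunk: true,
        }
    }

    /// Returns the text that would be passed on to the base parser.
    pub fn prepare(&mut self, text: &str) -> String {
        if self.is_first_chunk {
            // Only the first chunk of a request gets the start token.
            self.is_first_chunk = false;
            format!("{}{}", self.start_token, text)
        } else {
            text.to_string()
        }
    }

    /// Call between requests, never between chunks of one stream.
    pub fn reset(&mut self) {
        self.is_first_chunk = true;
    }
}
```

Keeping the flag next to the base parser, and resetting both in the same `reset()`, is one way to make sure a pooled parser behaves the same on its second request as on its first.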
## Common Mistakes -| Mistake | Consequence | -|---------|-------------| -| Wrong `initial_in_reasoning` | All output classified as wrong type | -| Resetting between chunks instead of requests | Loses buffer state mid-stream | -| Not delegating to `BaseReasoningParser` | Reimplements solved edge cases | +| Mistake | Consequence | Fix | +|---------|-------------|-----| +| Wrong `initial_in_reasoning` | All output classified as wrong type | Test with real model output to observe whether first token is reasoning | +| Missing pattern variant | Model ID doesn't match, falls back to base | Check provider's model catalog for ALL variant names (aliases, versioned IDs) | +| Resetting between chunks instead of requests | Loses buffer state mid-stream | Only call `reset()` between separate API requests | +| Not delegating to `BaseReasoningParser` | Reimplements solved edge cases (partial tokens, buffer overflow) | Always compose `BaseReasoningParser` unless you have a very good reason | +| Forgetting `parsers/mod.rs` export | Compiles but factory can't find the type | Always update both `mod.rs` and `lib.rs` | diff --git a/skills/implement/tool-parser.md b/skills/implement/tool-parser.md index becec9a..c3ecc66 100644 --- a/skills/implement/tool-parser.md +++ b/skills/implement/tool-parser.md @@ -1,84 +1,423 @@ # Adding a Tool Parser to SMG -13+ formats supported. Most are JSON-based and can reuse `handle_json_tool_streaming()`. +## Collect These 4 Inputs First -## Steps +Before writing any code, get these from the user: -### Step 1: Create parser file +| Input | Example | How To Decide | +|-------|---------|---------------| +| `PARSER_NAME` | `mymodel` | Snake case. Used as parser name, factory key, test file name | +| `MODEL_PATTERNS` | `["my-model*", "MyModel*"]` | Wildcard patterns matched against model IDs. 
Check the provider's model catalog for all variant names | +| `FORMAT` | See **Tool Call Format Types** below | The format the model uses to emit tool calls | +| `DELIMITERS` | `` / `` | Start/end tokens wrapping tool calls. See **How to Find Tool Call Format** below | -**File:** `tool_parser/src/parsers/myformat.rs` +## How to Find Tool Call Format + +### 1. Check existing implementations in vLLM and SGLang + +**Fastest path.** These projects track tool call formats across all major models. + +**vLLM** — https://github.com/vllm-project/vllm/tree/main/vllm/tool_parsers +- Registry: `__init__.py` → `_TOOL_PARSERS_TO_REGISTER` dict +- 34 parser files covering all major model families +- Each parser defines start/end tokens and format structure + +**SGLang** — https://github.com/sgl-project/sglang/tree/main/python/sglang/srt/function_call +- 25 detector files, one per model family +- `function_call_parser.py` is the router + +If the model exists in either project, match their format. + +### 2. HuggingFace `tokenizer_config.json` → `chat_template` + +The Jinja2 chat template shows how tools are formatted in the prompt AND how the model is expected to respond. Look for: +- Tool-related special tokens in `added_tokens_decoder` (e.g., ``, `<|tool_calls_section_begin|>`) +- Template sections handling `message.tool_calls` — shows the output format +- `tool_use` or `function_call` handling in the template + +### 3. HuggingFace model card / README + +Model cards often document the expected tool call output format with examples. + +### 4. 
Send a test request with tools + +```bash +curl -s https://api.provider.com/v1/chat/completions \ + -H "Authorization: Bearer $KEY" \ + -d '{ + "model": "model-id", + "tools": [{"type":"function","function":{"name":"get_weather","parameters":{"type":"object","properties":{"city":{"type":"string"}}}}}], + "messages": [{"role":"user","content":"What is the weather in Paris?"}] + }' | jq '.choices[0].message' +``` + +Observe the raw format before the API normalizes it. + +## Tool Call Format Types + +| Type | Description | Parsers Using It | Reuse Strategy | +|------|-------------|------------------|----------------| +| **JSON with tags** | JSON wrapped in model-specific tags | Mistral `[TOOL_CALLS]`, Cohere `<\|START_ACTION\|>`, Step3 ``, KimiK2 `<\|tool_call_begin\|>`, MiniMax `` | Extract between tags, delegate to `helpers::handle_json_tool_streaming()` | +| **XML with parameters** | XML tags with key-value parameter children | QwenCoder ``, GLM4 `....` | Custom XML parsing | +| **Raw JSON** | Pure JSON object or array | JsonParser (OpenAI, Claude, Gemini) | Use `JsonParser` directly or register as `json` | +| **Pythonic** | Python function call syntax `[func(arg=val)]` | PythonicParser (Llama 4, DeepSeek R1) | Use `PythonicParser` directly | +| **Unicode tokens** | Full-width Unicode delimiters | DeepSeek V3 `<|tool▁calls▁begin|>` | Custom parser with Unicode handling | + +**Most new models use "JSON with tags"** — wrap standard JSON in model-specific start/end tokens. This is the easiest to implement because `helpers::handle_json_tool_streaming()` handles 80% of the logic. 
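
To make the "JSON with tags" idea concrete, here is a small sketch of the tag-splitting step on a complete (non-streaming) output. It is an illustration only: `extract_tagged_payloads` is a hypothetical function, not an SMG helper, it does not parse the JSON itself, and it is no substitute for `helpers::handle_json_tool_streaming()` in the streaming path.

```rust
/// Hypothetical helper (not part of SMG): separates normal text from the
/// raw payloads found between each `start`/`end` delimiter pair.
pub fn extract_tagged_payloads(text: &str, start: &str, end: &str) -> (String, Vec<String>) {
    // An empty start delimiter would match everywhere; treat it as "no tags".
    if start.is_empty() {
        return (text.to_string(), Vec::new());
    }
    let mut normal = String::new();
    let mut payloads = Vec::new();
    let mut rest = text;
    while let Some(s) = rest.find(start) {
        // Everything before the start delimiter is normal text.
        normal.push_str(&rest[..s]);
        let after = &rest[s + start.len()..];
        match after.find(end) {
            Some(e) => {
                payloads.push(after[..e].trim().to_string());
                rest = &after[e + end.len()..];
            }
            None => {
                // Unterminated call: keep the remainder as the last payload.
                payloads.push(after.trim().to_string());
                rest = "";
            }
        }
    }
    normal.push_str(rest);
    (normal, payloads)
}
```

Each returned payload would then go through the JSON parsing step. Looping over every delimiter pair, rather than stopping at the first, is what lets one output carry several tool calls.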
+ +### Known Formats + +**In SMG today:** + +| Model Family | Start Token | End Token | Format | Parser | +|---|---|---|---|---| +| OpenAI, Claude, Gemini | `{` or `[` | `}` or `]` | Raw JSON | `json` | +| Mistral, Mixtral | `[TOOL_CALLS] [` | `]` | JSON array with prefix | `mistral` | +| Qwen 2/2.5/3 | `\n` | `\n` | JSON in XML tags | `qwen` | +| Qwen 3/2.5 Coder | `` | `` | XML with `` | `qwen_coder` | +| Llama 3.2 | `<\|python_tag\|>` | JSON boundary | JSON with prefix tag | `llama` | +| Llama 4, DeepSeek R1 | `[` | `]` | Pythonic `func(arg=val)` | `pythonic` | +| DeepSeek V3 | `<|tool▁calls▁begin|>` | `<|tool▁calls▁end|>` | Unicode tokens + JSON code block | `deepseek` | +| GLM-4.5/4.6 | `` | `` | XML with ``/`` | `glm45_moe` | +| GLM-4.7 | `` | `` | XML (whitespace variant) | `glm47_moe` | +| Step-3 | `` | `` | JSON in XML tags | `step3` | +| Kimi K2 | `<\|tool_call_begin\|>` | `<\|tool_call_end\|>` | JSON in pipe-delimited tags | `kimik2` | +| MiniMax M2 | `` | `` | JSON in all-caps XML tags | `minimax_m2` | +| Cohere Command | `<\|START_ACTION\|>` | `<\|END_ACTION\|>` | JSON with `tool_name`/`parameters` fields | `cohere` | + +**In vLLM/SGLang but NOT yet in SMG** (candidates to add): + +| Model Family | Format | Source | +|---|---|---| +| Hermes 2 Pro | `` JSON `` (with optional ``) | vLLM, SGLang | +| Jamba | `` JSON array `` | vLLM | +| xLAM | Multiple: JSON code block, `[TOOL_CALLS]`, or XML | vLLM | +| FunctionGemma | `call:name{args}` | vLLM | +| Granite | `<\|tool_call\|>` or `` + JSON array | vLLM, SGLang | +| Phi-4 Mini | JSON format | vLLM | +| Seed OSS | `` with `` | vLLM | +| InternLM | Custom format | vLLM, SGLang | +| Hunyuan A13B | Custom format | vLLM | +| OLMo 3 | Pythonic format | vLLM | +| GigaChat 3 | Custom format | vLLM, SGLang | +| DeepSeek V3.1/V3.2 | Variant of DeepSeek V3 Unicode tokens | vLLM, SGLang | + +## Step 1: Create parser file + +**File:** `tool_parser/src/parsers/{PARSER_NAME}.rs` + +For the most common case — **JSON with 
tags** — generate this template: ```rust -pub struct MyFormatParser { - buffer: String, +use async_trait::async_trait; +use openai_protocol::common::Tool; +use serde_json::Value; + +use crate::{ + errors::ParserResult, + parsers::helpers, + partial_json::PartialJson, + traits::ToolParser, + types::{FunctionCall, StreamingParseResult, ToolCall, ToolCallItem}, +}; + +const START_TOKEN: &str = "{START_TOKEN}"; +const END_TOKEN: &str = "{END_TOKEN}"; + +pub struct {ParserName}Parser { partial_json: PartialJson, + buffer: String, + prev_tool_call_arr: Vec<Value>, current_tool_id: i32, + current_tool_name_sent: bool, streamed_args_for_tool: Vec<String>, } +impl {ParserName}Parser { + pub fn new() -> Self { + Self { + partial_json: PartialJson::default(), + buffer: String::new(), + prev_tool_call_arr: Vec::new(), + current_tool_id: -1, + current_tool_name_sent: false, + streamed_args_for_tool: Vec::new(), + } + } +} + +impl Default for {ParserName}Parser { + fn default() -> Self { + Self::new() + } +} + #[async_trait] -impl ToolParser for MyFormatParser { - async fn parse_complete(&self, output: &str) -> ParserResult<(String, Vec<ToolCall>)> { ... } - async fn parse_incremental(&mut self, chunk: &str, tools: &[Tool]) -> ParserResult<StreamingParseResult> { ... } - fn has_tool_markers(&self, text: &str) -> bool { ... } - fn reset(&mut self) { ... 
} // Clear buffer, tool_id, streamed_args +impl ToolParser for {ParserName}Parser { + async fn parse_complete(&self, output: &str) -> ParserResult<(String, Vec<ToolCall>)> { + // Find content between START_TOKEN and END_TOKEN + let Some(start) = output.find(START_TOKEN) else { + return Ok((output.to_string(), vec![])); + }; + let normal_text = output[..start].to_string(); + let after_start = &output[start + START_TOKEN.len()..]; + let json_str = if let Some(end) = after_start.find(END_TOKEN) { + &after_start[..end] + } else { + after_start + }; + + // Parse JSON and extract tool calls + let json_str = json_str.trim(); + let value: Value = serde_json::from_str(json_str)?; + + // Adapt based on your model's JSON structure: + // Most use {"name": "func", "arguments": {...}} + // Some use {"tool_name": "func", "parameters": {...}} + let calls = helpers::extract_tool_calls_from_value(&value)?; + Ok((normal_text, calls)) + } + + async fn parse_incremental( + &mut self, + chunk: &str, + tools: &[Tool], + ) -> ParserResult<StreamingParseResult> { + self.buffer.push_str(chunk); + + // Look for start token + let Some(start_idx) = self.buffer.find(START_TOKEN) else { + // No tool call yet — emit as normal text + // NOTE: a START_TOKEN split across two chunks is also flushed here; hold back a possible partial prefix if your model streams the token in pieces + let normal = self.buffer.clone(); + self.buffer.clear(); + return Ok(StreamingParseResult { + normal_text: normal, + calls: vec![], + }); + }; + + // Emit any text before the tool call as normal text + let normal_text = self.buffer[..start_idx].to_string(); + let json_start = start_idx + START_TOKEN.len(); + + // Check for end token + let json_text = if let Some(end_idx) = self.buffer[json_start..].find(END_TOKEN) { + &self.buffer[json_start..json_start + end_idx] + } else { + &self.buffer[json_start..] 
+ }; + + // Build tool index map from available tools + let tool_indices: std::collections::HashMap<String, usize> = tools + .iter() + .enumerate() + .filter_map(|(i, t)| t.function.as_ref().map(|f| (f.name.clone(), i))) + .collect(); + + // Delegate to shared JSON streaming helper + let mut result = helpers::handle_json_tool_streaming( + // Owned copy: `json_text` borrows `self.buffer`, which is also passed mutably below + &json_text.to_string(), + 0, + &mut self.partial_json, + &tool_indices, + &mut self.buffer, + &mut self.current_tool_id, + &mut self.current_tool_name_sent, + &mut self.streamed_args_for_tool, + &mut self.prev_tool_call_arr, + )?; + + result.normal_text = normal_text; + Ok(result) + } + + fn has_tool_markers(&self, text: &str) -> bool { + text.contains(START_TOKEN) + } + + fn get_unstreamed_tool_args(&self) -> Option<Vec<ToolCallItem>> { + helpers::get_unstreamed_args(&self.prev_tool_call_arr, &self.streamed_args_for_tool) + } + + fn reset(&mut self) { + helpers::reset_parser_state( + &mut self.buffer, + &mut self.prev_tool_call_arr, + &mut self.current_tool_id, + &mut self.current_tool_name_sent, + &mut self.streamed_args_for_tool, + ); + } } ``` -**Verify:** `cargo build` +**Verify:** `cargo check -p tool_parser` -### Step 2: Implement streaming (two-stage pattern) +## Step 2: Register in module exports -Stage 1 — Name detected: +**File:** `tool_parser/src/parsers/mod.rs` — add: ```rust -StreamingParseResult { items: vec![ToolCallItem { name: Some("func"), parameters: "" }] } +pub mod {PARSER_NAME}; +pub use {PARSER_NAME}::{ParserName}Parser; ``` -Stage 2 — Arguments delta: +**File:** `tool_parser/src/lib.rs` — add to the `pub use parsers::{ ... }` block: ```rust -// Calculate delta = current_args[previous_length..] -StreamingParseResult { items: vec![ToolCallItem { name: None, parameters: delta }] } +{ParserName}Parser, ``` -For JSON-based formats, use `helpers::handle_json_tool_streaming()` — handles 80% of the logic. +**Verify:** `cargo check -p tool_parser` -**Anti-pattern:** Not tracking `streamed_args_for_tool` — re-sends entire argument string each chunk. 
+## Step 3: Register in factory -### Step 3: Register in factory +**File:** `tool_parser/src/factory.rs` -**File:** `tool_parser/src/` (factory/registry) +In `ParserFactory::new()`: +```rust +registry.register_parser("{PARSER_NAME}", || Box::new({ParserName}Parser::new())); +``` +In `ParserFactory::register_default_mappings()`: ```rust -registry.register_parser("myformat", || Box::new(MyFormatParser::new())); -registry.map_model("my-model*", "myformat"); // Wildcard pattern matching +registry.map_model("{model-pattern-1}*", "{PARSER_NAME}"); +registry.map_model("{model-pattern-2}*", "{PARSER_NAME}"); ``` -**Verify:** `cargo build` +Pattern matching uses **glob wildcards** (`*` matches any characters). -### Step 4: Export +**Verify:** `cargo check -p tool_parser` -**Files:** `tool_parser/src/parsers/mod.rs` and `tool_parser/src/lib.rs` +## Step 4: Write tests -Add `pub mod myformat;` and re-export. +**File:** `tool_parser/tests/tool_parser_{PARSER_NAME}.rs` -### Step 5: Write tests +```rust +mod common; + +use common::create_test_tools; +use tool_parser::{{ParserName}Parser, ToolParser}; -**File:** `tool_parser/tests/tool_parser_myformat.rs` +#[tokio::test] +async fn test_parse_single_tool_call() { + let parser = {ParserName}Parser::new(); + let input = r#"Some text{START_TOKEN}{"name":"get_weather","arguments":{"city":"Paris"}}{END_TOKEN}"#; + let (normal, calls) = parser.parse_complete(input).await.unwrap(); + assert_eq!(normal, "Some text"); + assert_eq!(calls.len(), 1); + assert_eq!(calls[0].function.name, "get_weather"); +} + +#[tokio::test] +async fn test_parse_multiple_tool_calls() { + let parser = {ParserName}Parser::new(); + // Test with multiple sequential tool calls + // The Step 1 template reads only the first START/END pair; extend `parse_complete` to loop over all pairs before expecting this to pass + let input = r#"{START_TOKEN}{"name":"func1","arguments":{}}{END_TOKEN}{START_TOKEN}{"name":"func2","arguments":{}}{END_TOKEN}"#; + let (_, calls) = parser.parse_complete(input).await.unwrap(); + assert_eq!(calls.len(), 2); +} + +#[tokio::test] +async fn test_no_tool_calls() { + let parser = 
{ParserName}Parser::new(); + let input = "Just normal text, no tool calls here."; + let (normal, calls) = parser.parse_complete(input).await.unwrap(); + assert_eq!(normal, input); + assert!(calls.is_empty()); +} + +#[tokio::test] +async fn test_has_tool_markers() { + let parser = {ParserName}Parser::new(); + assert!(parser.has_tool_markers("text {START_TOKEN} more")); + assert!(!parser.has_tool_markers("plain text")); +} + +#[tokio::test] +async fn test_streaming_chunks() { + let mut parser = {ParserName}Parser::new(); + let tools = create_test_tools(); + + // Split a tool call across multiple chunks + let chunks = vec![ + "Normal text", + "{START_TOKEN}{\"na", + "me\":\"get_weather\",\"ar", + "guments\":{\"city\":\"", + "Paris\"}}{END_TOKEN}", + ]; + + let mut all_normal = String::new(); + let mut all_calls = Vec::new(); + for chunk in chunks { + let result = parser.parse_incremental(chunk, &tools).await.unwrap(); + all_normal.push_str(&result.normal_text); + all_calls.extend(result.calls); + } + assert_eq!(all_normal, "Normal text"); + assert!(!all_calls.is_empty()); +} + +#[tokio::test] +async fn test_empty_arguments() { + let parser = {ParserName}Parser::new(); + let input = r#"{START_TOKEN}{"name":"no_args","arguments":{}}{END_TOKEN}"#; + let (_, calls) = parser.parse_complete(input).await.unwrap(); + assert_eq!(calls[0].function.arguments, "{}"); +} + +#[tokio::test] +async fn test_reset() { + let mut parser = {ParserName}Parser::new(); + let tools = create_test_tools(); + parser.parse_incremental("{START_TOKEN}{\"name", &tools).await.unwrap(); + parser.reset(); + // After reset, parser should handle new input cleanly + let result = parser.parse_incremental("fresh text", &tools).await.unwrap(); + assert_eq!(result.normal_text, "fresh text"); +} +``` + +**Verify:** `cargo test --test tool_parser_{PARSER_NAME}` + +## Step 5: Run full quality gate + +Invoke `smg:contribute` to run fmt → clippy → test → bindings → commit format. 
+ +## Adapting for Non-Standard JSON Fields + +Some models use non-standard field names. Map them in `parse_complete`: + +| Model | Name Field | Arguments Field | +|-------|-----------|-----------------| +| Most models | `name` | `arguments` | +| Cohere | `tool_name` | `parameters` | +| Llama 3.2 | `name` | `parameters` | + +Use `helpers::normalize_tool_call_fields()` if available, or map manually: +```rust +let name = obj.get("name") + .or_else(|| obj.get("tool_name")) + .and_then(|v| v.as_str()); +let args = obj.get("arguments") + .or_else(|| obj.get("parameters")); +``` -Required test cases: -- Single tool call -- Multiple tool calls -- Empty arguments -- Streaming with chunks split at different boundaries -- Invalid tool name (should skip, not error) -- Mixed normal text and tool calls -- Unicode and special characters +## When NOT to Use the JSON-with-Tags Template -**Verify:** `cargo test --test tool_parser_myformat` +| Scenario | Example | What To Do Instead | +|----------|---------|-------------------| +| Model uses XML with key-value parameters | QwenCoder `value` | Write custom XML extraction (see `qwen_coder.rs`) | +| Model uses Python function syntax | Llama 4 `[func(arg=val)]` | Use `PythonicParser` or register as `pythonic` | +| Model uses raw JSON (no tags) | OpenAI, Claude | Register as `json` — no new parser needed | +| Model uses Unicode delimiter tokens | DeepSeek V3 full-width chars | Write custom parser with Unicode-aware matching (see `deepseek.rs`) | -## Key Rules +## Common Mistakes -- Validate tool names against the `tools` list — skip invalid calls -- Reset state between **requests**, NOT between **chunks** -- Use `partial_json.parse_value()` for incomplete JSON -- Parsers are pooled with `Arc>` — must be `Send + Sync` +| Mistake | Consequence | Fix | +|---------|-------------|-----| +| Not handling multiple tool calls | Only first call extracted | Loop over all start/end token pairs or parse JSON array | +| Resetting between chunks 
instead of requests | Loses buffer state mid-stream | Only call `reset()` between separate API requests | +| Not validating tool names against `tools` list | Invalid tool calls forwarded to client | Skip calls where name doesn't match any provided tool | +| Re-sending full arguments each chunk | Client receives duplicate argument data | Track `streamed_args_for_tool` and send only the delta | +| Missing `get_unstreamed_tool_args()` | Final arguments lost on fast completions | Implement using `helpers::get_unstreamed_args()` | +| Forgetting `Send + Sync` bounds | Parser can't be pooled with `Arc>` | Avoid `Rc`, `RefCell`, or non-`Send` types in struct fields |
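
The "send only the delta" rule from the table above can be sketched as follows. This is a minimal illustration, assuming arguments only ever grow by appending. `ArgDeltaTracker` is a hypothetical type that mirrors the role the document gives to `streamed_args_for_tool`; it is not the SMG implementation.

```rust
/// Hypothetical tracker (not part of SMG): remembers, per tool index,
/// how much of the argument string has already been sent to the client.
pub struct ArgDeltaTracker {
    streamed: Vec<String>,
}

impl ArgDeltaTracker {
    pub fn new() -> Self {
        Self { streamed: Vec::new() }
    }

    /// Returns only the part of `current_args` that has not been sent yet.
    pub fn delta(&mut self, tool_index: usize, current_args: &str) -> String {
        if self.streamed.len() <= tool_index {
            self.streamed.resize(tool_index + 1, String::new());
        }
        let sent = &mut self.streamed[tool_index];
        // If the new text does not extend what was sent, emit nothing
        // rather than re-sending or sending inconsistent data.
        if !current_args.starts_with(sent.as_str()) {
            return String::new();
        }
        let delta = current_args[sent.len()..].to_string();
        sent.push_str(&delta);
        delta
    }

    /// Call between requests so stale prefixes do not leak into a new stream.
    pub fn reset(&mut self) {
        self.streamed.clear();
    }
}
```

Concatenating every returned delta for a tool index reproduces the full argument string exactly once, which is the property the client relies on.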