Goal
llm router classifier + routing.router desugaring: the L0(a) "LLM-as-router" on-ramp and the starting point for most users. Kept inside the one-engine invariant (everything compiles to a Decision).
Scope
New classifier type "llm": runs a (small, local) chat LLM with an author-supplied natural-language prompt, constrained to choose among the candidate names. It parses the structured choice and returns it as the label (score 1.0), together with the model's short rationale.
routing.router sugar: at load time the parser desugars { "router": { "type":"llm", "model", "prompt" } } into one llm classifier (labels = routing.candidates) + auto-generated identity rules (label X → route_to X). Engine stays single-path.
Runs the chat LLM through the existing Router chat path, extending the M-svc wiring from #2384 ([Router] ClassifierServices to Router wiring). The LLM model is a bundled component, declared as routing.router.model with role = classifier-model.
Trace records the chosen candidate + rationale — so even L0(a) is auditable.
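The load-time desugaring described above could look roughly like the following. This is a minimal sketch, not the parser from #2383: the output keys "classifiers", "rules", "when" and the classifier name "router_llm" are hypothetical names chosen for illustration; only "labels", "route_to", "candidates", "default_model" and the router fields come from this issue.

```python
import copy


def desugar_router(routing: dict) -> dict:
    """Expand routing.router sugar into one llm classifier plus identity rules.

    Assumption: the desugared policy stores classifiers and rules under the
    hypothetical keys "classifiers" and "rules".
    """
    router = routing.get("router")
    if router is None or router.get("type") != "llm":
        return routing  # nothing to desugar

    candidates = list(routing["candidates"])

    # Copy everything except the sugar itself (e.g. candidates, default_model).
    out = {k: copy.deepcopy(v) for k, v in routing.items() if k != "router"}

    out["classifiers"] = [{
        "name": "router_llm",        # hypothetical classifier name
        "type": "llm",
        "model": router["model"],
        "prompt": router["prompt"],
        "labels": candidates,        # labels = routing.candidates
    }]

    # Identity rules: label X -> route_to X, one per candidate.
    out["rules"] = [
        {"when": {"classifier": "router_llm", "label": name}, "route_to": name}
        for name in candidates
    ]
    return out
```

Because the sugar is expanded before evaluation, the engine only ever sees classifiers and rules, which is how the single-path property is kept.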
Out of scope
The L0(b) offline NL→policy compiler (separate, deferred).
Determinism guarantees: L0(a) is explicitly the least deterministic and slowest tier; docs steer steady-state users to L0(b)/L1–L3.
Acceptance
A routing.router (llm) policy with N candidates routes a request to a candidate the LLM selects; the decision trace shows the chosen label + rationale.
Tested with a fake LLM-invoke returning a fixed choice (unit) + one live end-to-end path.
Invalid choice (LLM returns a non-candidate) falls back to default_model.
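The acceptance criteria above can be exercised with a fake LLM-invoke, sketched below under stated assumptions. The reply format (a JSON object with "choice" and "rationale", or a bare model name), the function names, and the trace keys are all hypothetical; the real classifier interface is whatever #2379 and #2384 define.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class LlmResult:
    label: Optional[str]  # None when the reply names no candidate
    score: float
    rationale: str


def classify_with_llm(invoke: Callable[[str, str], str], prompt: str,
                      request_text: str, candidates: Sequence[str]) -> LlmResult:
    """Ask the router LLM for a choice and accept it only if it is a candidate."""
    reply = invoke(prompt, request_text)
    try:
        # Assumed structured reply: {"choice": "...", "rationale": "..."}
        data = json.loads(reply)
        choice = str(data.get("choice", "")).strip()
        rationale = str(data.get("rationale", ""))
    except (ValueError, AttributeError):
        # Not a JSON object: treat the whole reply as a bare model name.
        choice, rationale = reply.strip(), ""
    if choice in candidates:
        return LlmResult(choice, 1.0, rationale)
    return LlmResult(None, 0.0, rationale)


def route(invoke: Callable[[str, str], str], routing: dict,
          request_text: str) -> Tuple[str, dict]:
    """Return (model to route to, trace entry); invalid choices use default_model."""
    router = routing["router"]
    result = classify_with_llm(invoke, router["prompt"], request_text,
                               routing["candidates"])
    chosen = result.label if result.label is not None else routing["default_model"]
    trace = {
        "label": result.label,
        "rationale": result.rationale,
        "routed_to": chosen,
        "fallback": result.label is None,
    }
    return chosen, trace


# Unit-test style usage with fake invokes returning fixed replies.
ROUTING = {
    "candidates": ["Qwen3-8B-GGUF", "Qwen3.5-35B-A3B-GGUF"],
    "default_model": "Qwen3-8B-GGUF",
    "router": {"type": "llm", "model": "Qwen3-1.7B-GGUF", "prompt": "You route requests."},
}


def fake_valid(prompt: str, text: str) -> str:
    return '{"choice": "Qwen3.5-35B-A3B-GGUF", "rationale": "hard reasoning"}'


def fake_invalid(prompt: str, text: str) -> str:
    return "Some-Other-Model"


print(route(fake_valid, ROUTING, "Prove this lemma."))
print(route(fake_invalid, ROUTING, "Hello"))
```

Injecting the invoke function keeps the unit test independent of a running model; the live end-to-end path would pass the real Router chat call instead.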
Example — L0(a) collection.router JSON (LLM-as-router)
The lean local form — components reference already-registered models by name (like the built-in Ultra Collection); no models[] needed:
{
  "model_name": "user.Router-Auto",
  "recipe": "collection.router",
  "components": ["Qwen3-1.7B-GGUF", "Qwen3-8B-GGUF", "Qwen3.5-35B-A3B-GGUF"],
  "routing": {
    "candidates": ["Qwen3-8B-GGUF", "Qwen3.5-35B-A3B-GGUF"],
    "default_model": "Qwen3-8B-GGUF",
    "router": {
      "type": "llm",
      "model": "Qwen3-1.7B-GGUF",
      "prompt": "You route requests. Reply with ONLY a model name. Use Qwen3-8B-GGUF for everyday questions; use Qwen3.5-35B-A3B-GGUF for hard reasoning, coding, or long context."
    }
  }
}
The router LLM (Qwen3-1.7B-GGUF) is a classifier-model, not a candidate. No rules are written by hand: routing.router desugars to one llm classifier + identity rules. For Hugging Face redistribution, embed each component's full definition in a models[] array (as LMX-Omni-52B-Halo.json does); locally it is omitted.
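The relationship between components, candidates, and the router model in the example can be checked mechanically. The sketch below is an illustration only: the function names and the "unreferenced" role are hypothetical, and the checks (every referenced model must appear in components) are an assumption about what a loader might enforce, not a statement of what it does.

```python
def component_roles(collection: dict) -> dict:
    """Map each component name to its role in the routing policy."""
    routing = collection["routing"]
    candidates = set(routing["candidates"])
    router_model = routing["router"]["model"]
    roles = {}
    for name in collection["components"]:
        if name == router_model:
            roles[name] = "classifier-model"
        elif name in candidates:
            roles[name] = "candidate"
        else:
            roles[name] = "unreferenced"  # hypothetical label for unused components
    return roles


def check_collection(collection: dict) -> list:
    """Return a list of problems; an empty list means the references line up."""
    problems = []
    components = set(collection["components"])
    routing = collection["routing"]
    for name in routing["candidates"]:
        if name not in components:
            problems.append("candidate not in components: " + name)
    if routing["router"]["model"] not in components:
        problems.append("router model not in components: " + routing["router"]["model"])
    # Assumption: default_model must also be a bundled component.
    if routing["default_model"] not in components:
        problems.append("default_model not in components: " + routing["default_model"])
    return problems


EXAMPLE = {
    "model_name": "user.Router-Auto",
    "recipe": "collection.router",
    "components": ["Qwen3-1.7B-GGUF", "Qwen3-8B-GGUF", "Qwen3.5-35B-A3B-GGUF"],
    "routing": {
        "candidates": ["Qwen3-8B-GGUF", "Qwen3.5-35B-A3B-GGUF"],
        "default_model": "Qwen3-8B-GGUF",
        "router": {"type": "llm", "model": "Qwen3-1.7B-GGUF", "prompt": "You route requests."},
    },
}

print(component_roles(EXAMPLE))
print(check_collection(EXAMPLE))
```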
Depends on: #2379 (classifier registry), #2383 (parser/desugaring), #2384 (Router chat wiring)