An intelligent model selection system that automatically chooses the best LLM for a given task based on performance history, cost, and latency requirements.
- Automatic model selection based on task requirements
- Support for multiple LLM providers (OpenAI, Anthropic, Mistral)
- Performance tracking and evaluation
- Cost and latency optimization
- SQLite database for storing task history
- Extensible architecture for adding new models and evaluation metrics
llm_router/
├── config/
│   └── model_registry.yaml   # Model configurations and pricing
├── data/
│   └── task_history.db       # SQLite database for task history
├── src/
│   ├── models.py             # Data models and types
│   ├── model_runner.py       # LLM API integration
│   ├── evaluator.py          # Output quality evaluation
│   ├── router_agent.py       # Model selection logic
│   ├── task_runner.py        # Pipeline orchestration
│   └── example.py            # Usage example
└── requirements.txt          # Project dependencies
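The registry file describes each model's provider, pricing, and latency characteristics. A hypothetical entry in `config/model_registry.yaml` might look like the following; the key names and numbers are assumptions for illustration, not the project's actual schema:

models:
  gpt-4o:
    provider: openai
    input_cost_per_1k_tokens: 0.0025    # illustrative pricing, USD
    output_cost_per_1k_tokens: 0.01
    avg_latency_ms: 1800
    strengths:
      - code_generation
      - reasoning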
- Clone the repository:
git clone https://github.com/yourusername/llm_router.git
cd llm_router
- Create a virtual environment and install dependencies:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Set up environment variables:
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export MISTRAL_API_KEY="your-mistral-key"
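Before running anything, it can help to confirm the keys are actually visible to Python. A minimal check using only the standard library:

import os

# The three keys exported above; the router needs at least the providers you plan to use.
required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "MISTRAL_API_KEY"]
missing = [key for key in required if not os.environ.get(key)]
if missing:
    raise RuntimeError(f"Missing API keys: {', '.join(missing)}")

A basic usage example (see also `src/example.py`) then looks like this: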
from src.models import TaskSpec, TaskType
from src.task_runner import TaskRunner
# Initialize task runner
runner = TaskRunner()
# Create a task
task = TaskSpec(
    task_id="unique-id",
    prompt="Your task prompt here",
    task_type=TaskType.CODE_GENERATION,
    importance=0.8,
    latency_budget_ms=5000,
)
# Run the task
result = runner.run_task(task)
# Access results
print(f"Selected model: {result.selected_model}")
print(f"Output: {result.output.output_text}")
print(f"Quality score: {result.evaluation_score}")
The router uses a weighted scoring system to select the best model:
score = α * quality - β * cost - γ * latency
Where:
- quality: Historical performance and task-specific capabilities
- cost: Token pricing and usage
- latency: Response time and budget compliance
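A minimal sketch of how that score could be computed is shown below. The weight defaults, the normalization, and the example numbers are illustrative assumptions, not the exact logic in `src/router_agent.py`:

def score_model(quality: float, cost: float, latency: float,
                alpha: float = 1.0, beta: float = 0.5, gamma: float = 0.3) -> float:
    """Higher is better: reward expected quality, penalize cost and latency.

    Assumes the three inputs are normalized to comparable scales, e.g. quality
    in [0, 1], cost in USD per task, latency in seconds.
    """
    return alpha * quality - beta * cost - gamma * latency

# Illustrative candidate stats: (quality, cost, latency)
candidates = {
    "gpt-4o": (0.92, 0.030, 2.1),
    "claude-3-5-sonnet": (0.90, 0.018, 1.8),
    "mistral-large": (0.84, 0.012, 1.5),
}
best = max(candidates, key=lambda name: score_model(*candidates[name]))

Raising β or γ shifts selection toward cheaper or faster models, which is useful for low-importance tasks with tight latency budgets.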
- Add the model configuration to `config/model_registry.yaml`
- Implement the API integration in `src/model_runner.py`
- Extend `src/evaluator.py` with new evaluation metrics (see the sketch below)
- Implement custom scoring logic
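For the evaluation step, a custom metric can usually be expressed as a plain scoring function. The function below is a guess at how such a metric might slot into `src/evaluator.py`, not the project's actual evaluator API:

def keyword_coverage(output_text: str, required_keywords: list[str]) -> float:
    """Hypothetical custom metric: fraction of required keywords present in the output."""
    if not required_keywords:
        return 1.0
    text = output_text.lower()
    hits = sum(1 for keyword in required_keywords if keyword.lower() in text)
    return hits / len(required_keywords)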
The project is designed to be easily integrated with:
- Streamlit for visualization (see the sketch below)
- LangSmith for experiment tracking
- Custom monitoring systems
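As an example of the first option, a minimal Streamlit page over the stored task history could look like this; the task_history table and its columns are assumptions about the schema of data/task_history.db:

import sqlite3

import pandas as pd
import streamlit as st

st.title("LLM Router - Task History")

# Hypothetical schema: adjust the table/column names to match data/task_history.db.
conn = sqlite3.connect("data/task_history.db")
df = pd.read_sql_query(
    "SELECT * FROM task_history ORDER BY rowid DESC LIMIT 100", conn
)
conn.close()

st.dataframe(df)

Saved as, say, dashboard.py, it can be launched with streamlit run dashboard.py.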
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.