This is a web application that lets you compare responses from two different LLM models side by side, and then summarize their common points into a synthesized answer. This approach can alleviate the effect of a single model's hallucinations and give you a more accurate answer to your prompt.
- Model Selection: Choose from available Ollama models via dropdown menus.
- Dual Model Comparison: Compare the responses of two different LLM models to the same prompt, side by side.
- Token Generation Rate: Displays the token generation rate for each model's response so you can compare LLM performance.
- Automatic Synthesis: Automatically identifies and summarizes the common points between the two responses using a third LLM specialized in text analysis and summarization (see the sketch after this list).
- Token Management: Uses a system prompt to enforce a 300-token limit for consistent, concise responses.
- Clean Interface: Simple, intuitive UI for easy interaction.
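
To make the synthesis and token-rate features concrete, here is a minimal sketch of how a backend can implement them against Ollama's `/api/generate` endpoint. It is an illustration under assumptions, not the code in `backend.py`: the `requests` dependency, the `llama3` model name, and the prompt wording are all placeholders.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def query_model(model: str, prompt: str) -> tuple[str, float]:
    """Ask one Ollama model for an answer and derive its token generation rate."""
    resp = requests.post(OLLAMA_URL, json={
        "model": model,
        "prompt": prompt,
        "stream": False,
        # One way to enforce a 300-token limit via a system prompt
        # (the exact wording the app uses is a guess here).
        "system": "Answer in at most 300 tokens.",
    })
    data = resp.json()
    # Ollama reports eval_count (tokens generated) and eval_duration
    # (nanoseconds), which together give a tokens-per-second figure.
    rate = data["eval_count"] / (data["eval_duration"] / 1e9)
    return data["response"], rate

def synthesize(answer_a: str, answer_b: str, synth_model: str = "llama3") -> str:
    """Have a third model identify and summarize the common points."""
    prompt = (
        "List the key points on which the two answers below agree, then "
        "write a short synthesized summary of those shared insights.\n\n"
        f"Answer A:\n{answer_a}\n\nAnswer B:\n{answer_b}"
    )
    resp = requests.post(OLLAMA_URL, json={
        "model": synth_model,
        "prompt": prompt,
        "stream": False,
    })
    return resp.json()["response"]
```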
- Python 3.10
- Flask
- Ollama installed and running locally
- Required Python packages (see requirements.txt)
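
The authoritative list is in `requirements.txt`; for orientation, a Flask backend that talks to Ollama over HTTP needs little more than the following (illustrative only, not the repository's actual pins):

```text
flask
requests
```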
- Clone the repository:

  ```bash
  git clone https://github.com/maverick001/chatbot-enhancer.git
  cd chatbot-enhancer
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Make sure Ollama is running locally on port 11434 (a quick check is shown after these steps).

- Start the Flask server:

  ```bash
  python backend.py
  ```

- Open your browser and navigate to: http://localhost:5000
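
If you are unsure whether Ollama is up, one quick check (assuming a default local install) is to query its API for the list of installed models:

```bash
# Succeeds and lists local models if Ollama is listening on port 11434
curl http://localhost:11434/api/tags
```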
- Select your desired models from the dropdown menus on both sides
- Enter your prompt in the input field at the bottom
- Click "Send" to generate responses
- Watch the responses stream in real time in the side panels
- Read the synthesized analysis in the center panel, which includes:
- Key common points between both responses
- A synthesized summary of the shared insights
- Backend: Flask (Python)
- Frontend: HTML, CSS, JavaScript
- LLM Integration: Ollama API
- Streaming Support: Server-Sent Events (SSE)
- GPU Acceleration: Enabled by default
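
The streaming path is the most involved part of this stack. As a rough sketch (again assuming `requests` and Ollama's `/api/generate`, not the actual `backend.py`), a Flask route can proxy Ollama's line-delimited JSON stream to the browser as Server-Sent Events:

```python
import json

import requests
from flask import Flask, Response, request

app = Flask(__name__)

@app.route("/stream")
def stream():
    # Hypothetical query parameters; the real app may pass these differently.
    model = request.args.get("model", "llama3")
    prompt = request.args.get("prompt", "")

    def generate():
        # With stream=True, Ollama emits one JSON object per line.
        with requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": True},
            stream=True,
        ) as upstream:
            for line in upstream.iter_lines():
                if not line:
                    continue
                chunk = json.loads(line)
                # Forward each token fragment as one SSE message.
                yield f"data: {json.dumps({'token': chunk.get('response', '')})}\n\n"
                if chunk.get("done"):
                    break

    return Response(generate(), mimetype="text/event-stream")
```

On the client side, the page can subscribe with an `EventSource` and append each token fragment to the matching panel as it arrives.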
MIT License
Contributions are welcome! Please feel free to submit a Pull Request.