- Clone this repository
- Run `npm run setup:all` to configure everything
- Run `npm run dev`
- Select your Ollama models, start the benchmark, and read results instantly
That is it. No heavy setup, no complex pipeline.
Fast benchmarking for Ollama models running locally or on a remote endpoint: a simple GUI, instant feedback, and practical metrics to quickly identify the fastest model on your real machine.
This project has one clear goal: real tests, right away, without heavy setup.
- Lightweight and immediate Tkinter interface
- Multi-select Ollama models
- Speed ranking (tokens/sec)
- Optional hardware metrics (CPU, RAM, GPU when available)
- CSV export for comparisons and history
- Minimal dependencies, low footprint
Great for publishing benchmarks from real hardware without heavy pipelines.
Also ideal for Edge AI scenarios: small systems with limited resources (mini PCs, SBCs, embedded devices) where quickly understanding small-model performance in real conditions is critical.
- Quick start with npm commands
- Minimal configuration
- No external infrastructure required
- Designed for daily Linux desktop use
- Suitable for Edge AI devices with constrained resources
The tool is designed to add very little overhead on top of the inference workload, so results better reflect real Ollama model performance.
This makes it useful for comparing quantizations and lightweight models when choosing the best balance between speed, RAM usage, and stability on limited hardware.
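As a reference for how such a measurement can stay lightweight: Ollama's non-streaming `/api/generate` response includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), so tokens/sec falls out of a single request with no extra instrumentation. A minimal sketch using only the standard library; the model name and prompt are placeholders, not this project's actual `benchmark_runner.py` logic:

```python
import json
import urllib.request

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generated tokens divided by generation time in seconds."""
    return eval_count / (eval_duration_ns / 1e9)

def benchmark_once(endpoint: str, model: str, prompt: str) -> float:
    """Run one non-streaming generation and return its tokens/sec."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{endpoint}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # Ollama reports eval_count (tokens) and eval_duration (nanoseconds).
    return tokens_per_second(data["eval_count"], data["eval_duration"])

if __name__ == "__main__":
    # Placeholder model name; use one installed on your machine.
    print(benchmark_once("http://localhost:11434", "llama3.2", "Say hi"))
```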
For each selected model:
- Avg tok/s (main metric)
- Std tok/s (run stability)
- Avg total s and Avg load s
- Avg CPU percent, Peak RAM MB
- Peak GPU percent, Avg GPU percent (best effort if nvidia-smi is available)
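The Avg/Std columns above boil down to simple statistics over per-run tokens/sec values. A sketch of that aggregation (the function name and return keys are illustrative, not the project's actual API):

```python
from statistics import mean, stdev

def summarize(tok_s_runs: list[float]) -> dict:
    """Aggregate per-run tokens/sec into Avg and Std columns."""
    return {
        "avg_tok_s": mean(tok_s_runs),
        # stdev needs at least two samples; report 0.0 for a single run
        "std_tok_s": stdev(tok_s_runs) if len(tok_s_runs) > 1 else 0.0,
    }
```

A lower `std_tok_s` means more stable runs, which is the point of reporting it next to the average.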
```
npm run setup:all
```

This command:
- checks Python
- creates the local .venv
- checks or installs GUI prerequisites (tkinter) when possible
- validates project syntax
If you want to explicitly install system tkinter (with sudo), run:
```
npm run install:tk:system
```

To start the app:

```
npm run dev
```

Alternative aliases:

```
npm run setup
npm run run
```

- Enter your Ollama endpoint (default: http://localhost:11434)
- Click Refresh models
- Select the models
- Click Start benchmark
- Read ranking, row colors, and quick chart at a glance
- Export results with Export CSV
Export generates two files:
- *_summary.csv with aggregated ranking
- *_runs.csv with run-by-run details
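For follow-up analysis, the exported summary can be re-ranked with nothing but the standard library. A sketch assuming hypothetical column names (`model`, `avg_tok_s`, `std_tok_s`) and sample data; check your actual `*_summary.csv` header before reusing it:

```python
import csv
import io

def rank_models(summary_csv_text: str, key: str = "avg_tok_s") -> list[dict]:
    """Parse a *_summary.csv and return rows sorted fastest-first."""
    rows = list(csv.DictReader(io.StringIO(summary_csv_text)))
    return sorted(rows, key=lambda r: float(r[key]), reverse=True)

# Hypothetical two-model summary (values are illustrative):
sample = "model,avg_tok_s,std_tok_s\nqwen2.5:3b,42.1,1.3\nllama3.2:1b,55.7,2.0\n"
ranking = rank_models(sample)
print([r["model"] for r in ranking])  # → ['llama3.2:1b', 'qwen2.5:3b']
```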
Perfect for public repositories, benchmark sharing, and follow-up analysis.
- tkinter is a system dependency (not a pip package), so Linux may require OS-level installation.
- With a remote Ollama endpoint, displayed hardware metrics come from the client machine running the GUI.
- Python 3
- Tkinter
- Ollama HTTP API (/api/tags, /api/generate)
- NPM scripts for a consistent workflow (setup and run)
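The two Ollama endpoints listed above are small JSON APIs: `/api/tags` returns installed models under a `models` array, each with a `name` field. A minimal client sketch (standard library only; not the project's `ollama_client.py`):

```python
import json
import urllib.request

def parse_models(tags_payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_models(endpoint: str = "http://localhost:11434") -> list[str]:
    """Fetch the installed models from a running Ollama server."""
    with urllib.request.urlopen(f"{endpoint}/api/tags") as resp:
        return parse_models(json.load(resp))

if __name__ == "__main__":
    print(list_models())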
- main.py: GUI entrypoint
- ui.py: interface, table, quick chart, CSV export
- ollama_client.py: Ollama HTTP client
- benchmark_runner.py: benchmark orchestration
- hw_monitor.py: CPU/RAM/GPU sampling
- models.py: result dataclasses
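To illustrate what a result dataclass might carry, here is a hypothetical shape mirroring the metrics above; the real field names live in `models.py`:

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    """One benchmark run (hypothetical fields; see models.py)."""
    model: str
    tok_s: float
    total_s: float
    load_s: float

r = RunResult(model="llama3.2:1b", tok_s=55.7, total_s=4.2, load_s=0.9)
```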
MIT. See LICENSE.
