Open Query AI is a lightweight, open-source take on Cluely from before the hype. It runs as a local, real-time speech-to-LLM assistant powered entirely by open-source models. Speak to your computer and get instant answers back from an offline LLM: no accounts, no paid APIs.
Built with:
- faster-whisper – GPU-accelerated, low-latency speech-to-text
- Ollama – run local LLMs (Gemma 3 and others) on your machine
- Real-time microphone capture – live transcription (partials + final commit)
- In-context hotkey toggle: decide when user input should be sent to the LLM
- Open-source models only β no closed API keys, no rate limits
- Lightweight β runs entirely on your local machine
- Easy to extend (custom prompts, text-to-speech, conversation memory, etc.)
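The in-context hotkey toggle above can be sketched as a small gate that decides whether finalized transcripts are forwarded to the LLM. This is an illustrative sketch, not the actual `prompter.py` implementation; all names are hypothetical:

```python
class InContextGate:
    """Gate that forwards final transcripts to the LLM only while toggled on."""

    def __init__(self):
        self.in_context = False  # start muted: nothing is sent to the LLM

    def toggle(self):
        """Bound to a global hotkey in the real app; here just flips the flag."""
        self.in_context = not self.in_context
        return self.in_context

    def route(self, final_transcript, send_to_llm):
        """Forward the transcript only when the gate is open."""
        if self.in_context:
            return send_to_llm(final_transcript)
        return None  # dropped: user has not opted in


# Usage: pretend the LLM call just echoes the prompt.
gate = InContextGate()
assert gate.route("ignore me", lambda p: p) is None           # gate closed
gate.toggle()
assert gate.route("answer me", lambda p: p) == "answer me"    # gate open
```

The point of the gate is privacy: transcription can run continuously, but nothing leaves the speech-to-text stage unless the user explicitly opts in.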
```bash
uv sync              # install all dependencies
uv run prompter.py   # start the live transcriber + LLM
```

The following software needs to be installed on your local machine before running.
We highly recommend the following:
- uv – a blazing-fast Python environment manager (written in Rust)
- Ollama – a streamlined, open-source platform for running and managing LLMs on your local machine. It simplifies downloading, setting up, and interacting with open-source models
ℹ️ Make sure Ollama is running in the background for LLM-based workflows.
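Once Ollama is running, it serves a REST API on its default port, `11434`. As a sketch (the exact model name and helper functions are illustrative, not from this project), a single-turn request to `/api/chat` using only the standard library looks like:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_payload(model, prompt):
    """Build the JSON body for a single-turn /api/chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete response instead of streamed chunks
    }


def ask_ollama(model, prompt):
    """POST the prompt to a locally running Ollama server (requires `ollama serve`)."""
    body = json.dumps(build_chat_payload(model, prompt)).encode()
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```

With `stream` set to `False`, the server returns one JSON object whose `message.content` field holds the full reply; streaming mode instead emits one JSON chunk per line.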
To enable GPU-accelerated transcription with faster-whisper:
- NVIDIA GPU with sufficient VRAM for your chosen model
- NVIDIA GPU driver (version depends on your CUDA setup)
- CUDA Toolkit (typically version 11+)
- cuDNN (sometimes bundled with CUDA)
To ensure PyTorch is installed with CUDA support:
```bash
uv pip install torch --index-url https://download.pytorch.org/whl/cu128 && uv sync
```

This project used to be a Next.js app that connected to OpenAI's GPT API and offered users access to ChatGPT without having to register. However, since OpenAI now expires its free API keys, the only way to keep using that API is to sign up for the paid tier.
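After installing, you can sanity-check the GPU setup. A minimal sketch (the function name is illustrative) that reports CUDA status without hard-failing when PyTorch is absent:

```python
def cuda_status():
    """Report whether a CUDA-capable PyTorch build is usable."""
    try:
        import torch  # only present after the install step above
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        # torch.cuda.get_device_name(0) reports the first visible GPU
        return f"CUDA ready: {torch.cuda.get_device_name(0)}"
    return "PyTorch installed, but CUDA is not available (CPU fallback)"


print(cuda_status())
```

If this reports a CPU fallback, the usual culprits are a CPU-only torch wheel (check the `--index-url` above) or a driver/CUDA version mismatch.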
- You control your data (runs entirely offline if desired)
- No rate limits / API key expirations
- Full extensibility for custom workflows
- True to the spirit of OpenAI's founding principles
```
🎤 Microphone
   ↓ (sounddevice)
[Audio Queue]
   ↓ (Silero VAD – voice activity detection)
Partial transcript every X seconds
   ↓ (Whisper transcription)
Final transcript when speech ends
   ↓ if in_context = True
Send to Ollama (Gemma 3 or other LLM)
   ↓
LLM Response → log / console / next step
```
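The flow above can be simulated end to end with stubbed stages. This is a pure-Python sketch in which VAD, transcription, and the LLM are placeholders; all names are illustrative:

```python
import queue


def run_pipeline(audio_chunks, in_context, vad, transcribe, ask_llm):
    """Drain the audio queue, commit a final transcript when speech ends,
    and forward it to the LLM only when in_context is True."""
    audio_q = queue.Queue()
    for chunk in audio_chunks:
        audio_q.put(chunk)

    voiced = []
    while not audio_q.empty():
        chunk = audio_q.get()
        if vad(chunk):            # keep only chunks that contain speech
            voiced.append(chunk)

    final_transcript = transcribe(voiced)  # final commit after speech ends
    if in_context and final_transcript:
        return ask_llm(final_transcript)
    return None                   # gate closed or nothing was said


# Stub stages: VAD passes non-silent chunks, "transcription" joins them,
# and the "LLM" echoes a canned answer.
answer = run_pipeline(
    audio_chunks=["hello", "", "world"],
    in_context=True,
    vad=lambda c: bool(c),
    transcribe=lambda chunks: " ".join(chunks),
    ask_llm=lambda text: f"LLM saw: {text}",
)
assert answer == "LLM saw: hello world"
```

In the real app each stage runs concurrently (the microphone callback feeds the queue while transcription drains it), but the data flow between stages is the same as in this single-threaded sketch.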