✨ A lightweight, terminal-based, open-source version of "Cluely" before the hype 🚂.

stevenxchung/open-query-ai

πŸŽ™οΈ Open Query AI

Open Query AI is a lightweight, open-source take on Cluely from before the hype. It runs as a local, real-time speech-to-LLM assistant powered entirely by open-source models. Speak to your computer → get instant answers back from an offline LLM: no accounts, no paid APIs.

Demo

Built with:

  • faster-whisper – GPU-accelerated, low-latency speech-to-text
  • Ollama – Run local LLMs (Gemma 3 & others) on your machine

✨ Features

  • Real-time microphone capture → live transcription (partials + final commit)
  • In-context hotkey toggle: decide when user input should be sent to the LLM
  • Open-source models only – no closed API keys, no rate limits
  • Lightweight – runs entirely on your local machine
  • Easy to extend (custom prompts, text-to-speech, conversation memory, etc.)

πŸ‘©β€πŸ’» Running

uv sync            # Install all dependencies
uv run prompter.py # Start the live transcriber + LLM

Requirements

The following software needs to be installed on your local machine before running.

📦 Package Managers

We highly recommend the following:

  • uv – Blazing fast Python environment manager (written in Rust)

🤖 Local LLMs

  • Ollama – A streamlined, open-source platform for running and managing LLMs on your local machine. It simplifies downloading, setting up, and interacting with open-source models

ℹ️ Make sure Ollama is running in the background for LLM-based workflows.
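As a rough illustration of what "LLM-based workflows" involves, the sketch below sends one prompt to a locally running Ollama server over its HTTP API. It is not taken from prompter.py: the helper names `build_payload` and `ask_ollama` are hypothetical, the `gemma3` model tag is an assumption (use whichever model you have pulled), and the URL assumes Ollama's default port.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "gemma3") -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def ask_ollama(prompt: str, model: str = "gemma3", timeout: float = 60.0) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    # Raises urllib.error.URLError (an OSError) if Ollama is not running.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `ask_ollama("Say hello in one short sentence.")` should return the model's reply as a string when the server is up; a connection error here usually means Ollama is not running in the background.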

πŸŽ™οΈ Audio to Text (Transcription)

To enable GPU-accelerated transcription with faster-whisper:

  • NVIDIA GPU with sufficient VRAM for your chosen model
  • NVIDIA GPU driver (version depends on your CUDA setup)
  • CUDA Toolkit (typically version 11+)
  • cuDNN (sometimes bundled with CUDA)

To ensure PyTorch is installed with CUDA support (the cu128 index targets CUDA 12.8 builds; choose the index that matches your CUDA version):

uv pip install torch --index-url https://download.pytorch.org/whl/cu128 && uv sync
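One way to confirm the install worked is to ask PyTorch whether it can see a CUDA device and fall back to the CPU otherwise. The sketch below is a hypothetical helper, not part of this repository; the `float16` and `int8` values are faster-whisper compute types chosen here as reasonable defaults, not settings confirmed by the project.

```python
def pick_device() -> tuple[str, str]:
    """Return a (device, compute_type) pair suited to the current machine."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu", "int8"
    if torch.cuda.is_available():
        # A CUDA-capable GPU and a CUDA build of PyTorch were both found.
        return "cuda", "float16"
    return "cpu", "int8"


device, compute_type = pick_device()
print(f"Transcription would run on {device} with compute type {compute_type}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is likely CPU-only or the driver and CUDA versions do not match.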

Problem

This used to be a Next.js app that connected to OpenAI's GPT API and offered users access to ChatGPT without having to register. However, because their API keys expire, the only way to keep accessing the API is to sign up for the paid version 🙄.

πŸ’‘ Why Local + Open Source?

  • You control your data (runs entirely offline if desired)
  • No rate limits / API key expirations
  • Full extensibility for custom workflows
  • True to the spirit of OpenAI's founding principles

Architecture and Design

🎀 Microphone
   ↓ (sounddevice)
[Audio Queue]
   ↓ (Silero VAD – Voice Activity Detection)
Partial transcript every X seconds
   ↓ (Whisper transcription)
Final transcript when speech ends
   ↓ if in_context = True
Send to Ollama (Gemma3 or other LLM)
   ↓
LLM Response → log/console/next step
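
The last two steps of the diagram (gate on in_context, then forward to the LLM) can be sketched as below. This is a minimal illustration under stated assumptions, not the code in prompter.py: the `Prompter` class, its method names, and the injected `ask_llm` callable are all hypothetical, and the audio capture, VAD, and Whisper stages are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Prompter:
    """Hypothetical stand-in for the final-transcript -> LLM step."""

    ask_llm: Callable[[str], str]  # e.g. a function that calls Ollama
    in_context: bool = False       # flipped by the hotkey
    log: List[str] = field(default_factory=list)

    def toggle(self) -> bool:
        """Flip the in_context flag, as the hotkey would, and return it."""
        self.in_context = not self.in_context
        return self.in_context

    def on_final_transcript(self, text: str) -> Optional[str]:
        """Forward a finished utterance to the LLM only while in context."""
        text = text.strip()
        if not text or not self.in_context:
            return None  # transcript is dropped, nothing is sent
        reply = self.ask_llm(text)
        self.log.append(reply)
        return reply
```

Passing the LLM call in as a plain function keeps the gating logic testable without a microphone or a running model; an echo function is enough to exercise it.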
