AI-powered voice-to-text flow application with context-aware processing and system tray integration.
- 🎤 Voice Transcription: Real-time speech-to-text with OpenAI Whisper
- 🤖 AI Completion: Context-aware text completion and commands
- 🔧 System Tray: Background daemon with tray icon and global hotkeys
- ⌨️ Global Hotkeys: Push-to-talk and single-press voice activation
- 📝 Multiple Modes: Transcribe, Auto-Transcribe, and Command modes
- ⚙️ Configurable: Customizable prompts, models, and settings
- Linux (Ubuntu/Debian/Mint) with desktop environment
- Python 3.12 (required for system tray support)
- OpenAI API Key (for transcription and completion)
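The requirements above can be checked from Python before installing anything. This is a minimal sketch, not part of whisper-flow: `check_requirements` is a hypothetical helper, and it reads the "Python 3.12" requirement literally as "exactly 3.12". It does not try to detect a desktop environment.

```python
import os
import sys


def check_requirements(version_info=sys.version_info, environ=os.environ,
                       platform=sys.platform):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # The README lists Linux as the supported platform.
    if not platform.startswith("linux"):
        problems.append(f"unsupported platform: {platform} (Linux expected)")
    # Assumption: "Python 3.12" is taken to mean exactly major.minor == 3.12.
    if tuple(version_info[:2]) != (3, 12):
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.12 expected, found {found}")
    # The API key is read from the environment, as in the export example.
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

An empty result means the three listed requirements look satisfied.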
# Install required system packages for tray icon support
sudo apt update && sudo apt install -y \
python3-gi \
gir1.2-gtk-3.0 \
gir1.2-appindicator3-0.1 \
libappindicator3-1 \
python3-venv \
python3-pip
# Clone the repository
git clone https://github.com/sapountzis/whisper-flow.git
cd whisper-flow
# Create virtual environment with system site packages
python3.12 -m venv .venv --system-site-packages
# Activate environment
source .venv/bin/activate
# Install dependencies
pip install -e .
# Set your OpenAI API key
export OPENAI_API_KEY="your-api-key-here"
# Or add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
echo 'export OPENAI_API_KEY="your-api-key-here"' >> ~/.bashrc
# Start the daemon with tray icon (GTK backend configured by default)
whisper-flow daemon --foreground
# Or start in background
whisper-flow daemon
You should see a microphone icon in your system tray. Right-click it to access the menu!
- Right-click the tray icon to access:
  - Settings
  - Test Configuration
  - Exit
- 🎤 Transcribe: Ctrl+Cmd (push-to-talk)
- 🔴 Auto-Transcribe: Ctrl+Cmd+Space (single press)
- 🤖 Command: Ctrl+Cmd+Alt (single press)
- 🛑 Cancel: Escape
- 📋 Menu: F1
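The difference between push-to-talk and single-press activation can be pictured with a toy state model. This is only an illustrative sketch, not whisper-flow's actual code: the `Recorder` class and its methods are hypothetical, and it assumes a second single press stops recording, which the list above does not state.

```python
from dataclasses import dataclass


@dataclass
class Recorder:
    """Toy model of the two activation styles (hypothetical, for illustration)."""

    recording: bool = False

    def push_to_talk(self, pressed: bool) -> None:
        # Push-to-talk: record only while the hotkey is held down.
        self.recording = pressed

    def single_press(self) -> None:
        # Single press (assumed behaviour): each press toggles recording.
        self.recording = not self.recording

    def cancel(self) -> None:
        # Cancel (Escape): stop recording regardless of how it started.
        self.recording = False
```

In this model, holding the Transcribe hotkey maps to `push_to_talk(True)` and releasing it to `push_to_talk(False)`, while the Auto-Transcribe and Command hotkeys each map to one `single_press()` call.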
# Initialize configuration
whisper-flow init-config
# Start daemon
whisper-flow daemon
# Stop daemon
whisper-flow stop
# Check status
whisper-flow status
# Test configuration
whisper-flow validate
If you see the tray icon but the menu doesn't appear:
- Check backend: The daemon should show "Pystray backend: gtk"
- Verify gi module: python -c "import gi; print('OK')"
- Restart daemon: whisper-flow stop && whisper-flow daemon
- "No module named 'gi'": Install system packages from step 1
- "XOrg backend": Ensure Python 3.12 and system packages are installed
- No tray icon: Check if your desktop environment supports system tray
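The gi check above can also be run as a short script inside the activated virtual environment. This is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name, and it only tests whether the module can be located, not whether GTK itself works.

```python
import importlib.util


def missing_modules(names=("gi",)):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Install the system packages, then recreate the venv "
              "with --system-site-packages.")
    else:
        print("OK")
```

If `gi` is reported missing inside the venv but imports fine with the system Python, the venv was probably created without `--system-site-packages`.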
The GTK backend is configured by default. To use a different backend:
# Set environment variable
export PYSTRAY_BACKEND=appindicator
whisper-flow daemon
# Or edit config file
# ~/.config/whisper-flow/config.yaml: pystray_backend: "appindicator"
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Format code
black src/
isort src/
whisper-flow/
├── src/whisper_flow/
│ ├── app.py # Main application logic
│ ├── daemon.py # System tray daemon
│ ├── cli.py # Command-line interface
│ ├── audio.py # Audio recording
│ ├── transcription.py # OpenAI Whisper integration
│ ├── completion.py # AI completion
│ └── config.py # Configuration management
├── pyproject.toml # Project configuration
└── README.md # This file
Configuration files are stored in ~/.config/whisper-flow/:
- config.yaml - Main settings
- prompts.yaml - Default prompts
- transcribe.yaml - Transcription mode prompts
- auto_transcribe.yaml - Auto-transcribe mode prompts
- command.yaml - Command mode prompts
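To see which of these files are present, for example after running `whisper-flow init-config`, a short script can list them. This is a sketch under the assumption that the files live directly in `~/.config/whisper-flow/` under the names above; `config_status` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# File names as listed in this README.
CONFIG_FILES = (
    "config.yaml",
    "prompts.yaml",
    "transcribe.yaml",
    "auto_transcribe.yaml",
    "command.yaml",
)


def config_status(config_dir=None):
    """Map each expected file name to whether it exists in config_dir."""
    if config_dir is None:
        base = Path.home() / ".config" / "whisper-flow"
    else:
        base = Path(config_dir)
    return {name: (base / name).is_file() for name in CONFIG_FILES}


if __name__ == "__main__":
    for name, present in config_status().items():
        print(f"{name}: {'found' if present else 'missing'}")
```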
MIT License - see LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Once setup is complete, you can run individual tests:
# Test tray functionality
python tests/test_tray.py
# Test daemon components
python tests/test_daemon_tray.py