Dhevenddra/voice-assitant

🎙️ Voice-Native AI Assistant

A local-first, speech-to-speech conversational AI with multi-model orchestration, layered memory, and emotional state awareness.

Python 3.10+ · License: MIT


✨ Features

  • 🎤 Voice-First Design - Real-time speech-to-speech conversation
  • 🧠 Multi-Model Orchestration - Routes each request to the best-suited model based on its complexity
  • 💾 Layered Memory - Episodic, semantic, and identity memory systems
  • 🌐 Beautiful Web UI - Modern browser interface with emotional orb
  • 💻 CLI Mode - Simple text-based chat for quick testing
  • 🔒 Local-First - Your data stays on your machine
  • Ollama Powered - Uses local LLMs (llama3.2, phi4-mini, deepseek-r1)
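
The README does not describe how routing decides which model to use. As a rough sketch only, assuming a simple heuristic based on prompt length and a few reasoning keywords, complexity-based routing across the three listed models might look like the following. The names `estimate_complexity` and `pick_model`, the thresholds, and the score-to-model mapping are all hypothetical and not taken from the project.

```python
# Hypothetical sketch of complexity-based routing; not the project's actual router.
REASONING_HINTS = ("why", "prove", "explain", "compare", "step by step")


def estimate_complexity(prompt: str) -> int:
    """Return a crude complexity score: 0 (simple), 1 (moderate), 2 (complex)."""
    text = prompt.lower()
    score = 0
    if len(text.split()) > 30:  # long prompts count as more complex (assumed threshold)
        score += 1
    if any(hint in text for hint in REASONING_HINTS):  # reasoning-style wording
        score += 1
    return score


def pick_model(prompt: str) -> str:
    """Map the complexity score to one of the models named in this README."""
    # The mapping below is an assumption made for illustration.
    models = {0: "phi4-mini", 1: "llama3.2", 2: "deepseek-r1"}
    return models[estimate_complexity(prompt)]
```

A keyword heuristic like this is cheap enough to run on every turn, which matters for real-time speech. A real router would likely need a better signal than substring matching.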

🚀 Quick Start

Prerequisites

  • Python 3.10+
  • Ollama installed and running

1. Clone & Install

git clone https://github.com/Dhevenddra/voice-assitant.git
cd voice-assitant
pip install -e .

2. Pull AI Models

ollama pull llama3.2
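
To confirm the pull succeeded before starting the assistant, one option is to ask the local Ollama server which models it has. The sketch below assumes Ollama is listening on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint. The helper names `has_model` and `fetch_tags` are illustrative and not part of this project.

```python
import json
import urllib.error
import urllib.request


def has_model(tags: dict, name: str) -> bool:
    """Return True if `name` matches a model in an /api/tags response."""
    for entry in tags.get("models", []):
        full_name = entry.get("name", "")
        # Match either the exact tag ("llama3.2:latest") or the base name ("llama3.2").
        if full_name == name or full_name.split(":")[0] == name:
            return True
    return False


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the locally pulled models; return an empty dict if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return {}
```

Calling `has_model(fetch_tags(), "llama3.2")` returns `False` both when the model is missing and when Ollama is not running, so a caller that needs to tell those cases apart should check for the empty dict first.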

3. Run

Option A: CLI Mode (Type to Chat)

python -m src.cli

Option B: Web UI (Visual Interface)

# Terminal 1: Start backend
python -m src.cli

# Terminal 2: Start web server
cd ui && python -m http.server 8080

Then open: http://localhost:8080/index.html
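
If you prefer not to `cd` into `ui`, the same static server can be started from Python. This is a minimal sketch using only the standard library. `make_ui_server` is a hypothetical helper, not something shipped in this repository.

```python
import functools
import http.server


def make_ui_server(directory: str = "ui", port: int = 8080) -> http.server.ThreadingHTTPServer:
    """Build a static file server for `directory`, bound to localhost only."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Run it from the repository root with `make_ui_server().serve_forever()`. Unlike `python -m http.server 8080`, which listens on all interfaces by default, this sketch binds to `127.0.0.1`, which keeps the UI reachable only from your own machine.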


💬 Usage

CLI Commands

Command    Description
/help      Show all commands
/status    View current state
/memory    See conversation history
/clear     Clear conversation
/quit      Exit
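
For readers curious how slash commands like these are typically separated from ordinary chat input, here is a small illustrative dispatcher. It is a sketch under assumptions: the `handle_input` function, its return strings, and the use of a plain list as history are hypothetical and do not reflect the code in `src/cli.py`.

```python
# Hypothetical sketch of slash-command handling; not the project's actual CLI code.
from typing import Dict, List

COMMANDS: Dict[str, str] = {
    "/help": "Show all commands",
    "/status": "View current state",
    "/memory": "See conversation history",
    "/clear": "Clear conversation",
    "/quit": "Exit",
}


def handle_input(line: str, history: List[str]) -> str:
    """Return a response for one line of input, mutating `history` as needed."""
    text = line.strip()
    if not text.startswith("/"):
        # Ordinary chat text: record it (a real CLI would also query the model here).
        history.append(text)
        return f"(message sent to the model: {text})"
    if text == "/help":
        return "\n".join(f"{name}  {desc}" for name, desc in COMMANDS.items())
    if text == "/memory":
        return "\n".join(history) or "(no history yet)"
    if text == "/clear":
        history.clear()
        return "Conversation cleared."
    if text == "/status":
        return f"{len(history)} message(s) in this session"
    if text == "/quit":
        return "Goodbye."
    return f"Unknown command: {text}. Type /help for a list."
```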

Web UI

  1. Click anywhere to enable audio
  2. Type in the text box and press Enter
  3. Watch the emotional orb respond
  4. Hear the assistant speak back

📁 Project Structure

voice-assistant/
├── src/                 # Main application code
│   ├── cli.py           # CLI interface
│   ├── main.py          # Full voice mode
│   ├── audio/           # Microphone/speaker
│   ├── stt/             # Speech-to-Text
│   ├── tts/             # Text-to-Speech
│   ├── models/          # AI model routing
│   └── memory/          # Memory systems
├── ui/                  # Web interface
│   └── index.html       # Browser UI
├── docs/                # Documentation
│   ├── QUICKSTART.md    # Getting started guide
│   ├── ARCHITECTURE.md  # System design
│   └── API_REFERENCE.md # API documentation
├── config.yaml          # Configuration
└── data/                # Memory storage
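
To make the layered memory idea concrete, the sketch below models the episodic, semantic, and identity layers named under Features as one small data class that can be serialized to JSON, for example into a file under `data/`. Everything here is an assumption made for illustration: the `LayeredMemory` class, its fields, and its methods are hypothetical and are not the contents of `src/memory/`.

```python
# Hypothetical sketch of three memory layers; not the project's actual classes.
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class LayeredMemory:
    episodic: List[str] = field(default_factory=list)       # conversation turns, in order
    semantic: Dict[str, str] = field(default_factory=dict)  # facts learned during conversation
    identity: Dict[str, str] = field(default_factory=dict)  # stable traits of the assistant

    def remember_turn(self, speaker: str, text: str, max_turns: int = 50) -> None:
        """Append a turn and keep only the most recent `max_turns` (must be positive)."""
        self.episodic.append(f"{speaker}: {text}")
        del self.episodic[:-max_turns]

    def learn_fact(self, key: str, value: str) -> None:
        """Store or overwrite a single fact in the semantic layer."""
        self.semantic[key] = value

    def build_context(self, recent: int = 5) -> str:
        """Assemble a prompt preamble: identity, then facts, then recent turns."""
        lines = [f"[identity] {k}: {v}" for k, v in self.identity.items()]
        lines += [f"[fact] {k}: {v}" for k, v in self.semantic.items()]
        lines += self.episodic[-recent:]
        return "\n".join(lines)

    def to_json(self) -> str:
        """Serialize all three layers to a JSON string."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "LayeredMemory":
        """Rebuild a LayeredMemory from the output of `to_json`."""
        return cls(**json.loads(raw))
```

Keeping the layers separate lets each one follow its own retention rule: turns are trimmed aggressively, while facts and identity persist across sessions.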

⚙️ Configuration

Copy .env.example to .env and configure:

# Optional: Claude API key for the synthesis model
ANTHROPIC_API_KEY=your_key_here

# Force offline mode (local models only)
VOICE_ASSISTANT_OFFLINE=1

# Enable debug logging
VOICE_ASSISTANT_DEBUG=1
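
As an illustration of how these variables could be consumed, the sketch below reads them from the process environment into a small settings object. It assumes that the value `1` means "enabled" and that offline mode disables the Claude-backed model even when a key is present. The `Settings` class is hypothetical, and the sketch does not parse the `.env` file itself, so the variables must already be exported or loaded by another tool.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat "1" as enabled and anything else (or unset) as disabled. Assumed rule."""
    return env.get(name, "").strip() == "1"


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: Optional[str]
    offline: bool
    debug: bool

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Build settings from a mapping of environment variables."""
        return cls(
            anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
            offline=_flag(env, "VOICE_ASSISTANT_OFFLINE"),
            debug=_flag(env, "VOICE_ASSISTANT_DEBUG"),
        )

    @property
    def can_use_claude(self) -> bool:
        """True only when a key is set and offline mode is not forced."""
        return self.anthropic_api_key is not None and not self.offline
```

Passing the environment in as a mapping, rather than reading `os.environ` directly inside the class, makes the logic easy to exercise in tests with a plain dictionary.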

📚 Documentation

Guide               Description
QUICKSTART.md       Get up and running fast
ARCHITECTURE.md     Technical deep-dive
API_REFERENCE.md    API documentation
HOSTING_GUIDE.md    Deployment guide

🛠️ Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Pull additional models
ollama pull phi4-mini:3.8b
ollama pull deepseek-r1:8b

🎯 Philosophy

  • Local-first: All memory and state live locally
  • Voice-primary: Designed around real-time spoken dialogue
  • Conversation, not commands: Supports interruption, pauses, reflection
  • Memory is identity: Selective, structured, meaningful memory
  • Privacy is a feature: Full user control over data

📄 License

MIT License - see LICENSE for details.


Made with ❤️ for natural, human-like AI conversations
