LIA

LIA (Local Intelligent Agent) is an AI voice assistant that transcribes speech, generates a response, and converts that response to speech. It uses WhisperCPP for speech recognition, Ollama for text generation, and TTS for speech synthesis.

Features

  • Real-time audio recording
  • Speech-to-text transcription using WhisperCPP
  • Text generation using Ollama
  • Text-to-speech conversion using TTS
  • Audio playback of generated responses

Prerequisites

Before you begin, ensure you have met the following requirements:

  • Python 3.7 or higher
  • Ollama server running locally on port 11434
  • WhisperCPP model file (ggml-base.en.bin) in the project directory
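The prerequisites above can be checked before the first run. The following is a minimal sketch using only the standard library; the helper names (`python_ok`, `model_file_present`, `ollama_reachable`) are hypothetical and not part of this project, and the reachability check assumes the Ollama server answers a plain GET on its root URL.

```python
import sys
import urllib.error
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://localhost:11434"  # port taken from the prerequisites
MODEL_FILE = "ggml-base.en.bin"        # model file named in the prerequisites


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def model_file_present(directory="."):
    """Return True if the WhisperCPP model file exists in `directory`."""
    return (Path(directory) / MODEL_FILE).is_file()


def ollama_reachable(base_url=OLLAMA_URL, timeout=2.0):
    """Return True if an HTTP server answers with status 200 at `base_url`.

    Assumption: a running Ollama server responds to GET on its root URL.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not reachable.
        return False
```

Calling the three helpers from the project directory and printing their results gives a quick yes/no view of what is still missing.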

Installation

  1. Clone this repository:
     git clone https://github.com/yourusername/ai-voice-assistant.git
     cd ai-voice-assistant
  2. Create a virtual environment (optional but recommended):
     python -m venv venv
     source venv/bin/activate  # On Windows, use venv\Scripts\activate
  3. Install the required packages:
     pip install -r requirements.txt
  4. Download the WhisperCPP model file (ggml-base.en.bin) and place it in the project directory.

Usage

  1. Ensure that the Ollama server is running on http://localhost:11434.

  2. Run the main script:

python main.py
  3. The program will start recording audio. Speak into your microphone.

  4. Press Enter to stop the recording.

  5. The system will transcribe your speech, generate a response, and play it back through your speakers.
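The "record until Enter is pressed" behaviour in the steps above is commonly built from a background thread that waits on the keyboard and an event flag that the recording loop polls. The sketch below illustrates that pattern only; it is not taken from main.py. All function names are hypothetical, and `read_chunk` stands in for whatever call the project uses to read audio bytes from the microphone.

```python
import threading


def stop_on_enter(stop_event):
    """Block until the user presses Enter, then signal the recorder to stop."""
    input()
    stop_event.set()


def record_until_stopped(read_chunk, stop_event, max_chunks=100_000):
    """Collect chunks from `read_chunk` until `stop_event` is set.

    `read_chunk` is a placeholder for the real audio read call and must
    return bytes. `max_chunks` is a safety cap so the loop cannot grow
    without bound.
    """
    chunks = []
    while not stop_event.is_set() and len(chunks) < max_chunks:
        chunks.append(read_chunk())
    return b"".join(chunks)


def record_interactively(read_chunk):
    """Record from `read_chunk` until Enter is pressed on the keyboard."""
    stop_event = threading.Event()
    # Daemon thread: it will not keep the process alive if recording
    # ends for another reason (for example the chunk cap is reached).
    threading.Thread(
        target=stop_on_enter, args=(stop_event,), daemon=True
    ).start()
    return record_until_stopped(read_chunk, stop_event)
```

Keeping the keyboard wait in its own thread lets the main loop keep reading audio without blocking on input.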

Configuration

  • To change the Ollama model, modify the "model" key in the generate_response function.
  • To use a different TTS model, update the model path in the TTS initialization.
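For orientation, a function like generate_response might look roughly like the sketch below, which posts to Ollama's /api/generate endpoint using only the standard library. The project's actual implementation is not shown in this README, so treat this as an assumption: the helper `build_payload`, the default model name "llama3", and the timeout value are all illustrative.

```python
import json
import urllib.request

OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, model="llama3"):
    """Build the JSON body for Ollama's /api/generate endpoint.

    The default model name is a placeholder, not the project's setting.
    """
    return {
        "model": model,   # change this value to switch Ollama models
        "prompt": prompt,
        "stream": False,  # request a single JSON object, not a stream
    }


def generate_response(prompt, model="llama3", timeout=120):
    """Send the prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the generated text is in the "response" field.
        return json.loads(resp.read().decode("utf-8"))["response"]
```

The chosen model must already be available to the local Ollama server (for example via `ollama pull`), otherwise the request returns an error.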

Contributing

Contributions to this project are welcome. Please fork the repository and create a pull request with your changes.

License

MIT License

Acknowledgements

  • WhisperCPP for speech recognition
  • Ollama for text generation
  • TTS for text-to-speech conversion

Contact

If you have any questions or feedback, please open an issue in this repository.