A kawaii-style AI assistant web application featuring voice interaction, real-time speech processing, and memory capabilities.
Features:
- 🎨 Kawaii-style UI with responsive design
- 🎤 Real-time voice interaction with silence detection
- 🌙 Light/Dark mode support
- 💫 Smooth animations and transitions
- 🔊 High-quality Text-to-Speech using OpenAI's TTS API
- 🎯 Accurate speech recognition with WhisperX
- 🎙️ Browser-based Voice Activity Detection
- 🧠 Memory capabilities through Memobase integration
- 🚀 LLM integration with Ollama
Coming Soon:
- STT optimization on macOS using lightning-whisper-mlx
Prerequisites:
- Python 3.11+
- Node.js
- Docker and Docker Compose (for Memobase)
- FFmpeg (for audio processing)
- Ollama installed and running
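Before continuing, a quick way to confirm everything is installed (the backend setup below also relies on the `uv` package manager):

```bash
# Verify each prerequisite is installed and on your PATH
python3 --version
node --version
ffmpeg -version
docker-compose --version
ollama list        # also confirms the Ollama server is responding
uv --version
```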
Backend Setup:
- Create and activate the Python virtual environment:
```bash
uv sync
```
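`uv sync` creates a `.venv` in the project directory and installs the locked dependencies. To activate the environment in your shell afterwards (assuming a POSIX shell, run from the `backend` directory):

```bash
source .venv/bin/activate
```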
- Set up environment variables:
```bash
# Create .env file in backend directory
cp .env.example .env
# Edit .env with your API keys and configurations
```
- Set up Memobase:
```bash
# Navigate to memobase directory
cd memobase/src/server
# Start Memobase services
docker-compose up -d
```
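To check that the Memobase services came up cleanly:

```bash
# List the compose services and their status
docker-compose ps
# Tail the logs if something looks wrong
docker-compose logs -f
```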
- Configure Memobase by editing `memobase/src/server/api/config.yaml`:
```yaml
llm_api_key: ollama
llm_base_url: http://host.docker.internal:11434/v1
best_llm_model: phi4 # Must match your Ollama model
```
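Since `best_llm_model` must match a model available in your local Ollama, it may help to pull and verify it first (using `phi4` from the example config):

```bash
ollama pull phi4   # download the model if you don't have it yet
ollama list        # confirm it appears among your local models
```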
- Start the backend server:
```bash
uvicorn main:app --reload
```
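Assuming uvicorn's default bind of `http://127.0.0.1:8000` (matching `NEXT_PUBLIC_API_URL` below), you can verify the server is up via FastAPI's auto-generated docs page:

```bash
# FastAPI serves interactive API docs at /docs by default; expect a 200
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/docs
```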
Frontend Setup:
- Install Node.js dependencies:
```bash
cd frontend
npm install
```
- Set up environment variables:
```bash
# Create .env.local file in frontend directory
cp .env.example .env.local
# Edit .env.local with your configurations
```
- Start the development server:
```bash
npm run dev
```
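The Next.js dev server listens on `http://localhost:3000` by default; with both servers running, open it in a browser:

```bash
# macOS; on Linux use xdg-open instead
open http://localhost:3000
```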
Tech Stack:

Frontend:
- Next.js
- TailwindCSS
- Pixi Live2D Display
- WebRTC

Backend:
- Python 3.12
- FastAPI
- pyannote.audio
- WhisperX

LLM:
- Ollama
- OpenAI (Coming Soon)

Memory:
- Memobase

TTS:
- OpenAI `tts-1` (the standard non-HD model; the cheapest option, but still good quality)
Project Structure:
```
project-root/
├── backend/
│   ├── main.py                  # FastAPI application
│   ├── services/
│   │   ├── vad.py               # Voice Activity Detection
│   │   ├── stt.py               # Speech-to-Text
│   │   ├── tts.py               # Text-to-Speech
│   │   └── llm.py               # Language Model
│   └── requirements.txt
├── frontend/
│   ├── components/
│   │   ├── Live2DModel.js
│   │   └── AudioTranscriber.js
│   ├── pages/
│   ├── public/
│   │   └── live2d/              # Live2D model assets
│   └── styles/
└── docs/
    ├── CHANGELOG.md
    ├── CONTRIBUTING.md
    ├── CODE_OF_CONDUCT.md
    ├── SECURITY.md
    └── LICENSE.md
```
Environment Variables:

Backend (.env):
```env
HF_TOKEN=your_huggingface_token
OPENAI_API_KEY=your_openai_key
INFERENCE_SERVER=ollama
OLLAMA_MODEL=your_model_name
USER_NAME=your_username
STREAM=True
STT_LANGUAGE=it # Language for speech recognition
```

Frontend (.env.local):
```env
NEXT_PUBLIC_API_URL=http://localhost:8000
```
Required Memobase configuration files:
- memobase/src/server/api/.env
- memobase/src/server/api/config.yaml
- memobase/src/server/.env
API Endpoints:
- `POST /api/transcribe` - Speech-to-text conversion
- `POST /api/chat` - Chat with memory-enabled LLM
- `POST /api/tts` - Text-to-speech conversion
- `POST /api/detect-voice` - Voice activity detection
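As a quick smoke test, here are hypothetical `curl` calls against two of the endpoints; the payload field names are assumptions, so check the actual request schemas on the FastAPI `/docs` page:

```bash
# Chat endpoint (JSON body; the "message" field is an assumption)
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Transcription endpoint (multipart upload; the "file" field is an assumption)
curl -X POST http://localhost:8000/api/transcribe \
  -F "file=@recording.wav"
```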
See CONTRIBUTING.md for detailed contribution guidelines.
Quick start:
- Fork the repository
- Create a new branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Please read our Code of Conduct before contributing.
This project is licensed under the MIT License - see LICENSE.md for details.
See SECURITY.md for reporting security vulnerabilities.
For immediate security concerns, please contact [email protected].
See CHANGELOG.md for a list of changes and versions.