🎤 ZKA Voice Agent

A full-featured voice AI agent: record your voice and get a spoken response. It keeps conversation memory, manages cronjobs, sends messages, and more.

🚀 Quick Start

1. Clone & Setup Server

git clone https://github.com/zeroknowledge0x/zka-voice.git
cd zka-voice/server
chmod +x setup.sh && ./setup.sh

2. Edit Config

# API keys (required)
nano .env

# Agent config (optional)
cp hermes_config.json.example hermes_config.json
nano hermes_config.json

3. Start Server

python3 audio-server.py

Open http://localhost:8082 in your browser. Done! ✅
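
If the page does not load, it can help to confirm that something is actually listening on the port before debugging the browser side. The following is a minimal sketch, not part of the repository; the function name `server_is_up` is hypothetical, and it only checks that a TCP connection can be opened, not that the server is healthy.

```python
import socket


def server_is_up(host: str = "localhost", port: int = 8082, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling `server_is_up()` with no arguments probes the default address from the step above.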

🌐 Deploy Web Client (Free)

Host the web UI for free — no server needed for the frontend!

Option A: Vercel (Recommended)

Go to vercel.com and sign up with GitHub
Click "New Project"
Import zeroknowledge0x/zka-voice
Set Root Directory to web-client
Click Deploy
Get your URL (e.g., https://zka-voice.vercel.app)

Option B: Cloudflare Pages

Go to pages.cloudflare.com
Click "Create a project"
Connect GitHub repo zeroknowledge0x/zka-voice
Set Build output directory to web-client
Click Deploy
Get your URL (e.g., https://zka-voice.pages.dev)

Option C: GitHub Pages

Go to repo Settings → Pages
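Source: GitHub Actions ("Deploy from a branch" only offers / (root) or /docs as the folder, so it cannot publish /web-client directly)
Add a workflow that uploads the web-client folder as the Pages artifact
Push to main to trigger the deploy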
Get your URL (e.g., https://zeroknowledge0x.github.io/zka-voice)

🤖 Auto-Install (for Hermes Agents)

If you have a Hermes Agent, just say:

"Install ZKA Voice from https://github.com/zeroknowledge0x/zka-voice"

The agent will automatically clone, set up, and run the project. No manual commands needed!

File: SKILL.md — skill loaded by other Hermes agents.

📱 Access from iPhone/Android

# Install cloudflared (Linux amd64 build; writing to /usr/local/bin usually requires root)
sudo curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
sudo chmod +x /usr/local/bin/cloudflared

# Start tunnel
cloudflared tunnel --url http://localhost:8082

cloudflared prints a URL of the form https://xxx.trycloudflare.com. Open it on your phone.

🔑 API Keys (All FREE)

Service	Purpose	Sign Up
Groq	STT (Whisper)	console.groq.com
MiMo	LLM	mimo.xiaomi.com
Edge TTS	Text-to-Speech	No sign up needed

Groq is required for STT. MiMo is optional: if no MiMo key is set, the server falls back to Groq for the LLM.
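
The fallback rule above can be sketched as a small function. This is an illustration only: the variable names `GROQ_API_KEY` and `MIMO_API_KEY` are assumptions (check `.env.example` for the real names), and `pick_llm_provider` is a hypothetical helper, not a function from the server.

```python
from collections.abc import Mapping


def pick_llm_provider(env: Mapping[str, str]) -> str:
    """Choose the LLM backend from environment-style settings.

    GROQ_API_KEY and MIMO_API_KEY are assumed names, used here for illustration.
    """
    if not env.get("GROQ_API_KEY"):
        # Groq is always needed because it handles speech-to-text.
        raise ValueError("GROQ_API_KEY is required (used for STT)")
    # Prefer MiMo when a key is present, otherwise fall back to Groq.
    return "mimo" if env.get("MIMO_API_KEY") else "groq"
```

In practice the mapping passed in would be `os.environ` after the `.env` file has been loaded.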

⚙️ Config Examples

See examples/ folder for config samples:

basic-english.json — Simple, English language
basic-indonesia.json — Simple, Indonesian language
full-telegram.json — Full features + Telegram sync
developer.json — Developer-focused, terminal tools

Usage:

cp examples/basic-english.json server/hermes_config.json
# Edit as needed
nano server/hermes_config.json
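
A hand-edited JSON file is easy to break with a stray comma. The sketch below does the same copy as the commands above but parses the example first, so a malformed file is caught before the server reads it. `install_example_config` is a hypothetical helper and makes no assumption about which keys the config contains.

```python
import json
import shutil
from pathlib import Path


def install_example_config(example: Path, target: Path) -> dict:
    """Validate an example config as JSON, copy it to target, and return it."""
    with example.open(encoding="utf-8") as fh:
        config = json.load(fh)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(config, dict):
        raise ValueError(f"{example} must contain a JSON object at the top level")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(example, target)
    return config
```

For example, `install_example_config(Path("examples/basic-english.json"), Path("server/hermes_config.json"))` mirrors the `cp` command shown above.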

✨ Features

🎙️ Hold-to-Talk — record voice, get audio response
🧠 Memory — remembers conversation context
⚡ Dual Mode — Chat (natural) & Command (12 tools)
🔧 Cronjob Management — manage from voice
📱 Telegram Sync — auto-send to topic
🔐 Password Protection — token-based auth
🌍 Multi-STT — Groq Whisper + MiMo fallback
🗣️ Indonesian TTS — Edge TTS ArdiNeural
📱 Mobile-Friendly — optimized for iPhone/Android

🛠️ Tools (Command Mode)

Tool	Description
`cronjob_list`	List all cronjobs
`cronjob_run`	Run a cronjob
`cronjob_pause`	Pause a cronjob
`cronjob_resume`	Resume a cronjob
`memory_read`	Read memory
`memory_write`	Write to memory
`skills_list`	List skills
`skills_search`	Search skills
`telegram_send`	Send Telegram message
`terminal_run`	Run command
`server_status`	Server status
`github_repos`	List repos
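
One common way to expose a fixed set of named tools like the table above is a registry that maps tool names to functions. The sketch below is an assumption about the general shape, not the server's actual code: the `tool` decorator, the `dispatch` function, the in-memory `_MEMORY` store, and the tool bodies are all hypothetical placeholders.

```python
from typing import Callable

# Registry mapping tool names to the functions that implement them.
TOOLS: dict[str, Callable[..., str]] = {}

# Placeholder store standing in for the agent's real memory backend.
_MEMORY: dict[str, str] = {}


def tool(name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator that registers a function under a tool name."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("memory_write")
def memory_write(key: str, value: str) -> str:
    _MEMORY[key] = value
    return f"stored {key}"


@tool("memory_read")
def memory_read(key: str) -> str:
    return _MEMORY.get(key, "")


def dispatch(name: str, **kwargs: str) -> str:
    """Look up a tool by name and call it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

With this shape, the LLM only needs to emit a tool name and arguments, and unknown names fail loudly instead of being silently ignored.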

🏗️ Architecture

┌──────────────────┐     ┌──────────────────┐     ┌──────────────┐
│   Web Client     │────▶│  HTTP Server     │────▶│   Groq STT   │
│  (Vercel/CF)     │◀────│  (port 8082)     │◀────│  (Whisper)   │
└──────────────────┘     └──────┬───────────┘     └──────────────┘
                                │
                         ┌──────▼───────────┐     ┌──────────────┐
                         │   MiMo LLM       │────▶│  Edge TTS    │
                         │  (or Groq)       │◀────│ (ArdiNeural) │
                         └──────────────────┘     └──────────────┘

Pipeline: Audio → ffmpeg → Groq STT → LLM → Edge TTS → Audio

Latency: ~3-5 seconds total
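
To see where the total latency goes, each pipeline stage can be timed separately. The sketch below chains stages in the order given above and records how long each one takes. It is illustrative only: `run_pipeline` and `ffmpeg_to_wav_cmd` are hypothetical names, the stages are passed in as plain callables, and the 16 kHz mono WAV settings are an assumption, since the server's actual ffmpeg arguments are not documented here.

```python
import time
from typing import Any, Callable


def ffmpeg_to_wav_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command converting src to 16 kHz mono WAV (assumed settings)."""
    # -y overwrites dst, -ar sets the sample rate, -ac sets the channel count.
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]


def run_pipeline(
    audio: bytes, stages: list[tuple[str, Callable[[Any], Any]]]
) -> tuple[Any, dict[str, float]]:
    """Feed audio through each stage in order and time every stage."""
    timings: dict[str, float] = {}
    value: Any = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)  # output of one stage is the input of the next
        timings[name] = time.perf_counter() - start
    return value, timings
```

A real run would pass four stages (ffmpeg conversion, STT, LLM, TTS), and the returned `timings` shows which one dominates.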

📁 Project Structure

zka-voice/
├── web-client/                 # Static web UI
│   ├── index.html             # Main UI (hold-to-talk, dual mode)
│   └── config.js              # Server URL config
├── server/                     # Python backend
│   ├── audio-server.py        # Main server
│   ├── hermes_context.py      # Context builder (generic)
│   ├── setup.sh               # One-click setup
│   ├── requirements.txt       # Dependencies
│   ├── .env.example           # API key template
│   └── hermes_config.json.example  # Agent config template
├── examples/                   # Config examples
│   ├── basic-english.json
│   ├── basic-indonesia.json
│   ├── full-telegram.json
│   └── developer.json
├── SKILL.md                   # Auto-install skill for Hermes
├── vercel.json                # Vercel deployment config
├── wrangler.toml              # Cloudflare Pages config
├── .gitignore
└── README.md

📋 Requirements

Python 3.10+
ffmpeg
Groq API key (free)
A VPS or laptop to run the server
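
The first two requirements can be checked from Python before running the setup script. This is a small sketch with a hypothetical function name; the `version` and `which` parameters exist only so the check can be exercised without changing the machine.

```python
import shutil
import sys
from typing import Callable, Optional


def missing_requirements(
    version: tuple[int, int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names of unmet requirements (empty list means all good)."""
    missing = []
    if tuple(version) < (3, 10):
        missing.append("Python 3.10+")
    if which("ffmpeg") is None:  # ffmpeg must be on PATH
        missing.append("ffmpeg")
    return missing
```

Calling `missing_requirements()` with no arguments checks the current interpreter and PATH.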

🔒 Security

Password protection with auth tokens that expire after 24 hours
API keys not committed to repo
Cloudflare Tunnel serves the tunnel URL over HTTPS automatically
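
Token-based auth with a 24-hour lifetime can be sketched as follows. This shows the general technique under stated assumptions, not the server's implementation: `TokenStore` and `check_password` are hypothetical names, tokens live in memory only, and the clock is injectable so expiry can be tested.

```python
import hmac
import secrets
import time
from typing import Callable

TOKEN_TTL_SECONDS = 24 * 60 * 60  # 24 hours


def check_password(supplied: str, expected: str) -> bool:
    """Compare passwords in constant time to avoid timing leaks."""
    return hmac.compare_digest(supplied.encode(), expected.encode())


class TokenStore:
    """In-memory store of random tokens with a fixed time-to-live."""

    def __init__(self, ttl: float = TOKEN_TTL_SECONDS, clock: Callable[[], float] = time.time):
        self._ttl = ttl
        self._clock = clock
        self._expiry: dict[str, float] = {}

    def issue(self) -> str:
        token = secrets.token_urlsafe(32)
        self._expiry[token] = self._clock() + self._ttl
        return token

    def is_valid(self, token: str) -> bool:
        expires = self._expiry.get(token)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expiry[token]  # drop expired tokens as they are seen
            return False
        return True
```

A login handler would call `check_password`, then `issue()` on success, and every later request would be gated on `is_valid(token)`.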

📝 License

MIT — free to use, modify, and share.

🙏 Credits

Groq — STT Whisper
Xiaomi MiMo — LLM
Edge TTS — Text-to-Speech
Cloudflare — Tunnel & Pages
Vercel — Web Hosting

💬 Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Made with ❤️ by ZKA Labs
