🎤 ZKA Voice Agent

Full-featured voice AI agent: record your voice and get spoken responses. It has memory, can manage cronjobs, send messages, and more.

🚀 Quick Start

1. Clone & Setup Server

git clone https://github.com/zeroknowledge0x/zka-voice.git
cd zka-voice/server
chmod +x setup.sh && ./setup.sh

2. Edit Config

# API keys (required)
nano .env

# Agent config (optional)
cp hermes_config.json.example hermes_config.json
nano hermes_config.json
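
Before starting the server, it can help to confirm that .env defines the keys you expect. The sketch below is not part of the repository. It parses KEY=VALUE lines and reports which required names are missing. The name GROQ_API_KEY is an assumption; check .env.example for the real variable names.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required names that are absent or empty."""
    return [name for name in required if not env.get(name)]


# Hypothetical key name; the real one is listed in server/.env.example.
REQUIRED = ["GROQ_API_KEY"]

sample = """
# API keys
GROQ_API_KEY="gsk_example"
MIMO_API_KEY=
"""
env = parse_env(sample)
print(missing_keys(env, REQUIRED))           # []
print(missing_keys(env, ["MIMO_API_KEY"]))   # ['MIMO_API_KEY']
```

To check the real file, pass the contents of .env to parse_env instead of the sample string.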

3. Start Server

python3 audio-server.py

Open http://localhost:8082 in your browser. Done! ✅


🌐 Deploy Web Client (Free)

Host the web UI for free. The frontend is static, so it needs no server of its own.

Option A: Vercel (Recommended)

  1. Go to vercel.com and sign up with GitHub
  2. Click "New Project"
  3. Import zeroknowledge0x/zka-voice
  4. Set Root Directory to web-client
  5. Click Deploy
  6. Get your URL (e.g., https://zka-voice.vercel.app)

Option B: Cloudflare Pages

  1. Go to pages.cloudflare.com
  2. Click "Create a project"
  3. Connect GitHub repo zeroknowledge0x/zka-voice
  4. Set Build output directory to web-client
  5. Click Deploy
  6. Get your URL (e.g., https://zka-voice.pages.dev)

Option C: GitHub Pages

  1. Go to repo Settings → Pages
  2. Source: Deploy from a branch
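  3. Branch: main, folder: /web-client (the branch source picker normally offers only / (root) and /docs, so publishing /web-client may require a GitHub Actions workflow instead)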
  4. Click Save
  5. Get your URL (e.g., https://zeroknowledge0x.github.io/zka-voice)

🤖 Auto-Install (for Hermes Agents)

If you have a Hermes Agent, just say:

"Install ZKA Voice from https://github.com/zeroknowledge0x/zka-voice"

The agent will automatically clone, set up, and run the project. No manual commands needed!

File: SKILL.md, the skill loaded by other Hermes agents.


📱 Access from iPhone/Android

# Install cloudflared (Linux x86-64 build; writing to /usr/local/bin usually requires root)
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -o /usr/local/bin/cloudflared
chmod +x /usr/local/bin/cloudflared

# Start tunnel
cloudflared tunnel --url http://localhost:8082

The tunnel prints a URL like https://xxx.trycloudflare.com. Open it on your phone.


🔑 API Keys (All FREE)

| Service  | Purpose        | Sign Up           |
|----------|----------------|-------------------|
| Groq     | STT (Whisper)  | console.groq.com  |
| MiMo     | LLM            | mimo.xiaomi.com   |
| Edge TTS | Text-to-Speech | No sign up needed |

Groq is required for STT. MiMo is optional; if it is not set, the LLM falls back to Groq.
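
The fallback rule can be sketched in a few lines. This is an assumption about how the selection might look, not the server's actual code, and the key names GROQ_API_KEY and MIMO_API_KEY are hypothetical placeholders.

```python
def pick_llm_provider(env: dict[str, str]) -> str:
    """Choose the LLM backend: MiMo when its key is set, otherwise Groq."""
    # Key names are hypothetical placeholders, not taken from the repository.
    if not env.get("GROQ_API_KEY"):
        raise ValueError("GROQ_API_KEY is required (Groq handles STT)")
    return "mimo" if env.get("MIMO_API_KEY") else "groq"


print(pick_llm_provider({"GROQ_API_KEY": "g", "MIMO_API_KEY": "m"}))  # mimo
print(pick_llm_provider({"GROQ_API_KEY": "g"}))                       # groq
```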


⚙️ Config Examples

See the examples/ folder for config samples:

  • basic-english.json: Simple, English language
  • basic-indonesia.json: Simple, Indonesian language
  • full-telegram.json: Full features + Telegram sync
  • developer.json: Developer-focused, terminal tools

Usage:

cp examples/basic-english.json server/hermes_config.json
# Edit as needed
nano server/hermes_config.json
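
Hand-edited JSON is easy to break with a stray comma. A small check like the one below, which is a sketch and not part of the repository, loads the file and reports where parsing failed. It assumes the config is a JSON object at the top level; the actual schema is not described here.

```python
import json
from pathlib import Path


def load_config(path: str) -> dict:
    """Load a JSON config file, raising a readable error if it is invalid."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"{path}: invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        ) from err
    # Assumption: the config is a JSON object, not a list or a bare value.
    if not isinstance(config, dict):
        raise ValueError(f"{path}: expected a JSON object at the top level")
    return config
```

For example, load_config("server/hermes_config.json") returns the parsed settings or raises ValueError with the position of the first syntax error.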

✨ Features

  • 🎙️ Hold-to-Talk: record voice, get audio response
  • 🧠 Memory: remembers conversation context
  • ⚡ Dual Mode: Chat (natural) & Command (12 tools)
  • 🔧 Cronjob Management: manage from voice
  • 📱 Telegram Sync: auto-send to topic
  • 🔐 Password Protection: token-based auth
  • 🌍 Multi-STT: Groq Whisper + MiMo fallback
  • 🗣️ Indonesian TTS: Edge TTS ArdiNeural
  • 📱 Mobile-Friendly: optimized for iPhone/Android

🛠️ Tools (Command Mode)

| Tool           | Description           |
|----------------|-----------------------|
| cronjob_list   | List all cronjobs     |
| cronjob_run    | Run a cronjob         |
| cronjob_pause  | Pause a cronjob       |
| cronjob_resume | Resume a cronjob      |
| memory_read    | Read memory           |
| memory_write   | Write to memory       |
| skills_list    | List skills           |
| skills_search  | Search skills         |
| telegram_send  | Send Telegram message |
| terminal_run   | Run command           |
| server_status  | Server status         |
| github_repos   | List repos            |
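
One common way to wire up a fixed set of named tools is a registry that maps each name to a handler and rejects anything else. The sketch below illustrates that pattern with two of the tool names from the table. The handlers are placeholders and the design is an assumption, not a description of how the server dispatches tools.

```python
from typing import Callable

# Maps a tool name to the function that handles it.
TOOLS: dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Decorator that registers a handler under the given tool name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = func
        return func
    return register


@tool("server_status")
def server_status(_arg: str) -> str:
    return "ok"  # placeholder; a real handler would inspect the server


@tool("memory_read")
def memory_read(key: str) -> str:
    return f"(no value stored for {key!r})"  # placeholder


def run_tool(name: str, arg: str = "") -> str:
    """Dispatch to a registered tool, rejecting unknown names."""
    handler = TOOLS.get(name)
    if handler is None:
        raise KeyError(f"unknown tool: {name}")
    return handler(arg)


print(run_tool("server_status"))   # ok
print(sorted(TOOLS))               # ['memory_read', 'server_status']
```

Rejecting unregistered names matters most for tools such as terminal_run, where an unexpected call could execute arbitrary commands.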

🏗️ Architecture

┌──────────────────┐     ┌──────────────────┐     ┌──────────────┐
│   Web Client     │────▶│  HTTP Server     │────▶│   Groq STT   │
│  (Vercel/CF)     │◀────│  (port 8082)     │◀────│  (Whisper)   │
└──────────────────┘     └──────┬───────────┘     └──────────────┘
                                │
                         ┌──────▼───────────┐     ┌──────────────┐
                         │   MiMo LLM       │────▶│  Edge TTS    │
                         │  (or Groq)       │◀────│ (ArdiNeural) │
                         └──────────────────┘     └──────────────┘

Pipeline: Audio → ffmpeg → Groq STT → LLM → Edge TTS → Audio

Latency: ~3-5 seconds total
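
To see where those seconds go, each stage can be timed separately. The following is a minimal sketch with stand-in stages, not the server's code: the stage names mirror the pipeline above, while the lambdas are hypothetical placeholders for the real ffmpeg, STT, LLM, and TTS calls.

```python
import time
from typing import Callable

Stage = tuple[str, Callable[[object], object]]


def run_pipeline(stages: list[Stage], payload: object) -> tuple[object, dict[str, float]]:
    """Run stages in order; return the final result and seconds spent per stage."""
    timings: dict[str, float] = {}
    for name, func in stages:
        start = time.perf_counter()
        payload = func(payload)
        timings[name] = time.perf_counter() - start
    return payload, timings


# Stand-in stages; real ones would call ffmpeg, Groq, the LLM, and Edge TTS.
stages: list[Stage] = [
    ("ffmpeg", lambda audio: audio),                # convert the recording
    ("stt", lambda audio: "hello"),                 # speech to text
    ("llm", lambda text: f"you said: {text}"),      # generate a reply
    ("tts", lambda reply: reply.encode("utf-8")),   # text to speech (bytes)
]

result, timings = run_pipeline(stages, b"raw-audio")
print(result)          # b'you said: hello'
print(list(timings))   # ['ffmpeg', 'stt', 'llm', 'tts']
```

Logging the timings dictionary per request shows which stage dominates the total latency.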


📁 Project Structure

zka-voice/
├── web-client/                 # Static web UI
│   ├── index.html             # Main UI (hold-to-talk, dual mode)
│   └── config.js              # Server URL config
├── server/                     # Python backend
│   ├── audio-server.py        # Main server
│   ├── hermes_context.py      # Context builder (generic)
│   ├── setup.sh               # One-click setup
│   ├── requirements.txt       # Dependencies
│   ├── .env.example           # API key template
│   └── hermes_config.json.example  # Agent config template
├── examples/                   # Config examples
│   ├── basic-english.json
│   ├── basic-indonesia.json
│   ├── full-telegram.json
│   └── developer.json
├── SKILL.md                   # Auto-install skill for Hermes
├── vercel.json                # Vercel deployment config
├── wrangler.toml              # Cloudflare Pages config
├── .gitignore
└── README.md

📋 Requirements

  • Python 3.10+
  • ffmpeg
  • Groq API key (free)
  • A VPS or laptop to run the server

🔒 Security

  • Password protection (24h TTL)
  • API keys not committed to repo
  • Cloudflare Tunnel = automatic HTTPS
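
Token-based auth with a 24h TTL can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: it assumes the TTL applies to a session token issued after a correct password, and TokenStore and password_matches are hypothetical names.

```python
import secrets
import time

TOKEN_TTL_SECONDS = 24 * 60 * 60  # 24h, matching the TTL stated above


class TokenStore:
    """In-memory session tokens that expire after a fixed TTL."""

    def __init__(self, ttl: float = TOKEN_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested
        self._expiry: dict[str, float] = {}

    def issue(self) -> str:
        """Create a random token and record when it expires."""
        token = secrets.token_urlsafe(32)
        self._expiry[token] = self._clock() + self._ttl
        return token

    def is_valid(self, token: str) -> bool:
        """Return True only for known tokens that have not expired."""
        expires_at = self._expiry.get(token)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            del self._expiry[token]  # forget expired tokens
            return False
        return True


def password_matches(supplied: str, expected: str) -> bool:
    """Compare passwords in constant time to avoid timing leaks."""
    return secrets.compare_digest(supplied.encode(), expected.encode())
```

A login handler would call password_matches first, return store.issue() on success, and check store.is_valid(token) on every later request.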

📝 License

MIT: free to use, modify, and share.


🙏 Credits


💬 Support


Made with ❤️ by ZKA Labs