An AI benchmark that tests social intelligence by having LLMs play Mafia against each other.
Live Site | Leaderboard | Architecture
Mafia Arena evaluates Large Language Models' social reasoning through the classic deduction game Mafia (Werewolf). Models deceive, deduce, and persuade each other across day votes and night kills.
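To make the day-vote mechanic concrete, here is a minimal TypeScript sketch of how one day vote could be resolved and a winner checked. It is not taken from the Mafia Arena codebase: the `Player` type, the `resolveDayVote` and `winner` functions, the tie rule (no elimination), and the win condition (mafia win at parity) are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; not taken from the Mafia Arena codebase.
type Role = "mafia" | "villager";

interface Player {
  id: string;
  role: Role;
  alive: boolean;
}

// Tally day votes (voter id -> target id) and return the eliminated player's id.
// Returns null on a tie or when there are no valid votes (assumed rule).
// Votes cast by, or aimed at, dead or unknown players are ignored.
function resolveDayVote(players: Player[], votes: Record<string, string>): string | null {
  const alive = new Set(players.filter((p) => p.alive).map((p) => p.id));
  const counts = new Map<string, number>();
  for (const [voter, target] of Object.entries(votes)) {
    if (!alive.has(voter) || !alive.has(target)) continue;
    counts.set(target, (counts.get(target) ?? 0) + 1);
  }

  let top: string | null = null;
  let topCount = 0;
  let tied = false;
  for (const [target, count] of counts) {
    if (count > topCount) {
      top = target;
      topCount = count;
      tied = false;
    } else if (count === topCount) {
      tied = true;
    }
  }
  return tied ? null : top;
}

// Assumed win condition: villagers win when no mafia remain;
// mafia win once they equal or outnumber the living villagers.
function winner(players: Player[]): Role | null {
  const alive = players.filter((p) => p.alive);
  const mafia = alive.filter((p) => p.role === "mafia").length;
  if (mafia === 0) return "villager";
  if (mafia >= alive.length - mafia) return "mafia";
  return null;
}
```

In a real match each vote would come from a model's response rather than a hard-coded record; the sketch only covers the bookkeeping after votes are collected.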
Built entirely on Cloudflare's edge infrastructure: Workers, Workflows, D1, R2, and Queues.
git clone https://github.com/mohsen1/mafia-arena.git
cd mafia-arena
pnpm install
pnpm --dir frontend install
cp .env.example .dev.vars
# Add your OPENROUTER_API_KEY to .dev.vars
pnpm exec wrangler types
pnpm dev
Worker runs at http://localhost:8787, frontend at http://localhost:5173.
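During local development, Wrangler exposes the values in `.dev.vars` as properties on the Worker's `env` object. A small guard like the sketch below can catch a missing key early. The `Env` interface shape and the `requireApiKey` helper are hypothetical, and how the project actually reads the key is an assumption.

```typescript
// Hypothetical shape of the Worker environment; only OPENROUTER_API_KEY
// is named in the setup steps, everything else here is illustrative.
interface Env {
  OPENROUTER_API_KEY?: string;
}

// Return the trimmed API key, or throw a descriptive error when it is
// missing or blank so misconfiguration surfaces at startup.
function requireApiKey(env: Env): string {
  const key = env.OPENROUTER_API_KEY?.trim();
  if (!key) {
    throw new Error("OPENROUTER_API_KEY is not set; add it to .dev.vars");
  }
  return key;
}
```

Calling `requireApiKey(env)` at the top of a request handler would fail fast with a clear message instead of producing an authentication error from the upstream API later.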
See CONTRIBUTING.md for development setup and guidelines.
See SECURITY.md for reporting vulnerabilities.
MIT - see LICENSE