First, thanks to the Harvard Computer Society AI Bootcamp for teaching me so much, especially Carl, who always answered my seemingly never-ending stream of questions.
Second, thanks to Abhinav Sukla for teaching me about poker, helping test the model, and being a great friend.
Κέρδος (kérdos) is Ancient Greek for gain, profit, advantage.
True to its name, Kerdos is a fully local reinforcement-learning bot that consistently beats semi-pro human players in heads-up fixed-limit Texas Hold’em—yet installs and runs in under two minutes on any recent macOS or Linux laptop.
| Component | Details |
|---|---|
| Algorithms | Neural Fictitious Self-Play (NFSP) for balance; Deep Q-Network (DQN) for raw aggression |
| Training | 325 k self-play hands |
| Evaluation | Seat-swapped duel shows NFSP ties DQN while being ≈ 80 % less exploitable |
| Human test | A semi-pro club player won only 1 of 3 hands; variance dominates single hands, but NFSP held its edge over a 500-hand set |
| Code size | < 250 LoC for eval/play (excluding train.py and others); the demo repo ships a single runnable file (play.py), though 20+ files went into building it |
| Footprint | Repo < 5 MB; largest checkpoint fetched once from Google Drive |
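For readers curious what the NFSP/DQN pairing looks like in RLCard, here is a minimal sketch of building the limit hold'em environment, both agent types, and a seat-swapped duel. This is not the repo's training code; the layer sizes and hand counts are illustrative.

```python
import rlcard
from rlcard.agents import DQNAgent, NFSPAgent
from rlcard.utils import get_device, tournament

device = get_device()                  # picks CUDA/MPS/CPU automatically
env = rlcard.make('limit-holdem')      # two-player fixed-limit hold'em

# Balanced agent: Neural Fictitious Self-Play
nfsp = NFSPAgent(
    num_actions=env.num_actions,
    state_shape=env.state_shape[0],
    hidden_layers_sizes=[128, 128],    # illustrative sizes
    q_mlp_layers=[128, 128],
    device=device,
)

# Aggressive agent: plain Deep Q-Network
dqn = DQNAgent(
    num_actions=env.num_actions,
    state_shape=env.state_shape[0],
    mlp_layers=[128, 128],
    device=device,
)

# Seat-swapped duel: play the same number of hands in each seating
# and average payoffs to cancel positional advantage.
env.set_agents([nfsp, dqn])
payoffs_a = tournament(env, 5000)
env.set_agents([dqn, nfsp])
payoffs_b = tournament(env, 5000)
print('NFSP average payoff:', (payoffs_a[0] + payoffs_b[1]) / 2)
```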
```bash
git clone https://github.com/R-madhok/kerdos.git
cd kerdos
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python play.py
```
First run downloads nfsp_limit_holdem_2.pt from Google Drive. After that, everything runs completely offline.
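The download step is just a gdown call guarded by a file-existence check. A sketch of what that first-run logic plausibly looks like; the Drive file ID below is a placeholder, the real one lives in play.py:

```python
import os
import gdown

CHECKPOINT = "nfsp_limit_holdem_2.pt"
DRIVE_FILE_ID = "<google-drive-file-id>"  # placeholder; see play.py for the real ID

# Fetch the NFSP checkpoint once; later runs find it on disk and stay offline.
if not os.path.exists(CHECKPOINT):
    url = f"https://drive.google.com/uc?id={DRIVE_FILE_ID}"
    gdown.download(url, CHECKPOINT, quiet=False)
```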
| Prompt | What to type | Result |
|---|---|---|
| Choose opponent … | N-2 | Balanced NFSP-2 bot |
| ″ ″ | DQ | Aggressive DQN bot |
| In-game action | 0 | call / check |
| ″ | 1 | raise |
| ″ | 2 | fold |
Pay-off line prints after each showdown. Press Ctrl-C to quit.
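Under the hood, the in-game prompt is little more than a mapping from the digits above to actions. A stripped-down sketch of that input loop; the prompt wording is illustrative, not copied from play.py:

```python
# Map the menu digits to limit hold'em actions (0 = call/check, 1 = raise, 2 = fold)
ACTIONS = {"0": "call/check", "1": "raise", "2": "fold"}

def ask_action() -> str:
    """Keep prompting until the player types one of the valid digits."""
    while True:
        choice = input("Your action [0=call/check, 1=raise, 2=fold]: ").strip()
        if choice in ACTIONS:
            return ACTIONS[choice]
        print("Invalid input, please type 0, 1, or 2.")
```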
| File | Purpose |
|---|---|
| play.py | Self-contained CLI (downloads NFSP on first run) |
| dqn_limit_holdem.pt | Sub-100 MB checkpoint committed in Git |
| requirements.txt | Exact package versions (rlcard 1.2.0, torch ≥2.3, gdown) |
| nfsp_limit_holdem_2.pt | Not in Git; fetched automatically from Drive |
| README.md | You are here |
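How play.py turns these checkpoints back into agents is its own business, but if they were written with torch.save(agent, path), as RLCard's example scripts do, reloading is a one-liner. A sketch under that assumption:

```python
import torch

# Assumes each checkpoint is a whole pickled agent object (torch.save(agent, path)).
# weights_only=False is needed on newer torch versions to unpickle full objects.
dqn_agent = torch.load("dqn_limit_holdem.pt", map_location="cpu", weights_only=False)
nfsp_agent = torch.load("nfsp_limit_holdem_2.pt", map_location="cpu", weights_only=False)
```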
Training scripts (train.py, resume_train.py, duel.py) will be made available soon on the full branch if you want to reproduce results.
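Until those scripts land, a typical RLCard NFSP self-play loop looks like the sketch below. This follows the standard pattern from RLCard's example scripts, not the repo's actual train.py; the layer sizes, episode count, and save path are illustrative.

```python
import rlcard
import torch
from rlcard.agents import NFSPAgent
from rlcard.utils import get_device, reorganize

device = get_device()
env = rlcard.make('limit-holdem')

# One NFSP agent per seat, learning from self-play
agents = [
    NFSPAgent(
        num_actions=env.num_actions,
        state_shape=env.state_shape[0],
        hidden_layers_sizes=[128, 128],
        q_mlp_layers=[128, 128],
        device=device,
    )
    for _ in range(env.num_players)
]
env.set_agents(agents)

for episode in range(325_000):             # roughly 325k self-play hands
    for agent in agents:
        agent.sample_episode_policy()       # NFSP mixes best-response and average policy
    trajectories, payoffs = env.run(is_training=True)
    trajectories = reorganize(trajectories, payoffs)
    for agent, traj in zip(agents, trajectories):
        for ts in traj:
            agent.feed(ts)

torch.save(agents[0], 'nfsp_limit_holdem_2.pt')  # illustrative save path
```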
- Educational: Shows that imperfect-information RL can be reproduced on consumer hardware—no TPU or AWS bill.
- Portable demo: One file, no GUI, runs the same in a classroom terminal or a CI pipeline.
- Balanced play: NFSP converges toward Nash-like strategies; friends can't crush it simply by "trapping" it, as they do against naïve DQN bots.
MIT License for all code and checkpoints. RLCard and PyTorch retain their original open-source licenses.