Open source AI outreach intelligence tool for entrepreneurs.
Find real people with real problems you can actually help — across every platform where they talk about it.
Inkwell scans communities (Reddit today; Hacker News / Product Hunt / more via community scanners) for posts where people are asking for help, sharing projects, or discussing problems you can solve. It splits the work into two clean stages:
Scanning is free (no LLM tokens). Rule-based heuristics pick the signal, score engagement potential (Yes / Maybe / No), summarize the post, and surface the most interesting comment. Deterministic, auditable, fast.
Voice drafting is BYOK, on demand. When you see a signal worth replying to, click Draft and Inkwell uses your LLM key to write a reply in your voice — trained on your dos, don'ts, and example comments. The key lives in your browser's localStorage and never hits the server except on that one draft request.
Most outreach tools start with a contact database and blast cold messages. Inkwell flips this: it starts with real signals — people publicly expressing problems — and helps you engage authentically where they already are.
Three things make it different:
- $0 until you draft. Every other "AI outreach" tool bills you to scan. We don't spend a token until you ask for a reply in your voice.
- Your voice, not LLM-speak. The persona is a first-class artifact (YAML). Fork someone else's. Publish your own. Replies stop sounding generic.
- Self-hosted, no lock-in. One `pip install`, runs on your laptop, your data never leaves your machine. LLM provider of your choice (OpenAI, Claude, Ollama, local).
```bash
git clone https://github.com/sausi-7/inkwell.git
cd inkwell

# Create a virtualenv — keeps deps off your system Python
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Editable install (CLI + web UI + FastAPI / LiteLLM / Jinja2 / …)
pip install -e ".[dev]"
```

Every `inkwell`, `python`, and `pytest` command in this README assumes the venv is active. Re-run `source .venv/bin/activate` in a fresh terminal to come back.
```bash
cp .env.example .env
```

Edit `.env` with your API keys:

```bash
GOOGLE_CLIENT_ID=your_client_id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your_client_secret
OPENAI_API_KEY=sk-your_openai_key
SPREADSHEET_ID=your_spreadsheet_id
```
Recommended — Web UI:

```bash
python -m inkwell serve
```

Then open http://localhost:8000. The landing page walks you through profile → output → BYOK key → subreddits → first scan. The key lives in your browser; the server never sees it except when you click Draft.
CLI — headless / cron:

```bash
# Scan Reddit and export to Google Sheets (needs Google OAuth — one-time)
python -m inkwell scan

# Scan and export to CSV only (no Google setup required)
python -m inkwell scan --csv --no-sheets

# Generate a voice draft for a stored signal (uses your LLM key)
python -m inkwell draft reddit_abc123 --model gpt-4o-mini
```

First Google Sheets run: a browser window opens for OAuth. Sign in and grant Sheets access. This happens once.
Four pages, plain HTML/CSS/JS, no framework. Binds to 127.0.0.1 only — single-user, self-hosted.
- `/` — Home: onboarding checklist that turns green as each step is complete.
- `/profile` — edit your persona visually. See the exact prompt the LLM will use.
- `/settings` — pick output (CSV / Sheets / both), paste your LLM key (stored in browser localStorage, never on the server), test the Google Sheets connection and the LLM key with one click, edit filters and the subreddit list.
- `/scan` — start a scan, watch SSE-streamed progress, and see signals scored as they arrive. Click Draft on any signal to generate a reply and a top-level comment in your voice. Drafts are cached per-signal so refreshing doesn't re-bill you.
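The per-signal draft cache can be sketched in a few lines. This is illustrative only; names like `data/drafts` and `cached_draft` are assumptions, not Inkwell's actual internals.

```python
import json
from pathlib import Path

# Hypothetical cache directory; Inkwell's real layout may differ.
CACHE_DIR = Path("data/drafts")

def cached_draft(signal_id: str, generate):
    """Return a cached draft for signal_id, or generate and cache one.

    `generate` is a zero-arg callable that performs the (billable) LLM
    call. The cache is checked first, so refreshing never re-bills.
    """
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path = CACHE_DIR / f"{signal_id}.json"
    if path.exists():
        return json.loads(path.read_text())["draft"]
    draft = generate()  # only here are LLM tokens spent
    path.write_text(json.dumps({"draft": draft}))
    return draft
```

The point of the pattern: token spend is gated behind a cache miss, which is what makes a page refresh free.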
- No auth. The server binds to `127.0.0.1` — do not expose it on a shared network without adding your own reverse proxy + auth.
- The LLM key is stored in the browser's `localStorage` under `ink_llm_key`. It never lands in `.env` or any log.
- On a draft request, the browser sends the key as the `X-LLM-Key` header. The server passes it straight to LiteLLM and does not write it anywhere.
```text
You configure subreddits, personality, and filters
                 |
                 v
+---------------------------------+
| Scan communities for signals    |  Reddit, HN, Product Hunt...
+---------------------------------+
                 |
                 v
+---------------------------------+
| Rule-based filter + score       |  ZERO tokens — keywords, score,
| (analyzers/rules.py)            |  flairs, velocity, age, Yes/Maybe/No
+---------------------------------+
                 |
                 v
+---------------------------------+
| Export + persist signals        |  Google Sheets, CSV, data/signals
+---------------------------------+
                 |
                 v  (user clicks "Draft" in the web UI, per signal)
                 |
+---------------------------------+
| Voice drafting (BYOK)           |  LLM tokens spent ONLY here.
| (analyzers/voice.py)            |  Key is per-request from browser
| OpenAI / Claude / Ollama        |  localStorage — never persisted.
+---------------------------------+
```
| Column | Description |
|---|---|
| Subreddit | e.g. r/SaaS |
| Post title | Title of the post |
| Summary | AI-generated 1-2 sentence summary |
| Age (hrs) | Hours since post creation |
| Created UTC | Post creation timestamp |
| Engage? | Yes, No, or Maybe |
| Why | 1-sentence reasoning for the recommendation |
| Status | active, archived, inactive, or blocked |
| Coolest comment | Most interesting/insightful comment in the thread |
| Suggested reply to cool comment | A reply written in your voice |
| Suggested post comment | A standalone comment for the post |
| Post link | Direct URL to the post |
| Source URL(s) | API endpoints used |
Only posts marked "Yes" have the coolest comment populated in the CSV/Sheets export. The voice columns (Suggested reply to cool comment, Suggested post comment) show dashes by default — they're filled on demand when you click Draft in the web UI or run `inkwell draft <signal_id>`. This is the BYOK split: scans never burn tokens.
Simple list of subreddit names to scan:

```yaml
- Entrepreneur
- SaaS
- SideProject
- learnprogramming
- indiegames
```

Two lists are included: `subreddits.yml` (21 curated) and `subreddits_1.yml` (101 broad). Switch between them:

```bash
python -m inkwell scan --subreddits subreddits_1.yml
```

Define the voice and tone of AI-generated comments:
```yaml
name: Saurabh
bio: "Indie developer and creative coder. Building AI-powered tools..."
interests: [game development, AI/ML tools, creative coding, indie hacking]
expertise: [Python scripting, AI integration, app development]
tone:
  style: "conversational, slightly nerdy, genuinely curious"
  humor: "dry wit, occasional puns, self-deprecating"
  formality: "casual but knowledgeable"
dos:
  - Share specific experiences
  - Ask follow-up questions
  - Offer concrete advice
donts:
  - Never mention products unless asked
  - No excessive enthusiasm
  - Max 1 emoji per comment
example_comments:
  - "I ran into the same issue last month..."
```

The AI uses this profile so comments sound like you, not a generic bot. Without this file, a default conversational tone is used.
Control which posts get analyzed. Filters run before AI, saving API costs:
| Filter | What It Does | Default |
|---|---|---|
| `keywords.include` | Post must contain at least one keyword | `[]` (no filter) |
| `keywords.exclude` | Skip if post contains any of these | `[hiring, nsfw, crypto airdrop]` |
| `thresholds.min_score` | Minimum upvote score | `2` |
| `thresholds.max_comments` | Skip mega-threads | `500` |
| `thresholds.max_age_hours` | How far back to scan | `24` |
| `post_type.allow` | `all`, `self_only`, or `link_only` | `all` |
| `flairs.include/exclude` | Filter by post flair | `exclude: [Meme, Shitpost, NSFW]` |
| `allowed_statuses` | Which post statuses to keep | `[active]` |
| `ai_preferences` | Guides AI engagement decisions (not a hard filter) | see file |
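Put together, the rule pass is a cheap boolean check per post. A simplified sketch; field names mirror the table above, but the real `filters/rule_filter.py` implements more than this:

```python
def passes_filters(post: dict, cfg: dict) -> bool:
    """Apply the zero-token rule filters from filters.yml to one post.

    Simplified sketch: no flair or post-type checks, no status logic.
    """
    text = (post["title"] + " " + post.get("body", "")).lower()
    kw = cfg.get("keywords", {})
    include = [k.lower() for k in kw.get("include", [])]
    if include and not any(k in text for k in include):
        return False  # none of the required keywords present
    if any(k.lower() in text for k in kw.get("exclude", [])):
        return False  # blocklisted keyword present
    th = cfg.get("thresholds", {})
    if post.get("score", 0) < th.get("min_score", 2):
        return False  # too few upvotes
    if post.get("num_comments", 0) > th.get("max_comments", 500):
        return False  # mega-thread, too late to stand out
    if post.get("age_hours", 0) > th.get("max_age_hours", 24):
        return False  # too old
    return True
```

Because every check is a string or integer comparison, the whole scan stays deterministic and free.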
Inkwell uses LiteLLM so one setting switches providers. Pick any model and set the matching API key:
```bash
# OpenAI
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

# Anthropic Claude
ANTHROPIC_API_KEY=sk-ant-...
LLM_MODEL=claude-sonnet-4-6

# Ollama (local, no key needed — just have Ollama running)
LLM_MODEL=ollama/llama3
OLLAMA_API_BASE=http://localhost:11434
```
Any model LiteLLM supports works — Gemini, Groq, Together, Mistral, etc. See LiteLLM providers.
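The reason one setting is enough: LiteLLM routes by the model string itself, with non-default providers named as a prefix. A toy sketch of that convention (this is not LiteLLM's API; LiteLLM does the real routing internally):

```python
def provider_for(model: str) -> str:
    """Guess which provider a LiteLLM-style model string routes to.

    Illustrative of the naming convention only.
    """
    if "/" in model:
        # Explicit provider prefix, e.g. "ollama/llama3", "groq/..."
        return model.split("/", 1)[0]
    if model.startswith("claude"):
        return "anthropic"   # e.g. "claude-sonnet-4-6"
    if model.startswith("gpt-"):
        return "openai"      # e.g. "gpt-4o-mini"
    return "unknown"
```

So switching from OpenAI to a local Ollama model really is just changing `LLM_MODEL` in `.env`.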
Press Ctrl+C at any time. Progress is saved automatically after each subreddit.
When you run again on the same day:
- Completed subreddits are skipped
- Already-analyzed posts are skipped
- Continues exactly where it left off
Progress resets automatically each new day.
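The checkpoint idea is a small date-keyed state file. A sketch only; `storage/progress.py` is the real implementation and its file format may differ:

```python
import json
from datetime import date
from pathlib import Path

def load_progress(path: Path) -> set[str]:
    """Return subreddits already completed today; reset on a new day."""
    if path.exists():
        state = json.loads(path.read_text())
        if state.get("date") == date.today().isoformat():
            return set(state.get("done", []))
    return set()  # new day (or first run): start fresh

def mark_done(path: Path, subreddit: str) -> None:
    """Checkpoint after each subreddit so Ctrl+C loses nothing."""
    done = load_progress(path) | {subreddit}
    path.write_text(json.dumps({"date": date.today().isoformat(),
                                "done": sorted(done)}))
```

Writing after every subreddit (not every post) keeps the checkpoint cheap while still bounding lost work to one subreddit.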
```text
inkwell/
├── inkwell/                    # Python package
│   ├── __main__.py             # CLI entry point
│   ├── app.py                  # FastAPI web app (Phase 1)
│   ├── config.py               # Settings loader
│   ├── scanners/               # Platform scanners
│   │   ├── base.py             # Scanner protocol + data models
│   │   ├── reddit.py           # Reddit scanner
│   │   └── registry.py         # Auto-discovery registry
│   ├── analyzers/              # AI analysis engine
│   │   ├── base.py             # Analysis data model
│   │   ├── pipeline.py         # Prompt building + LLM call
│   │   └── llm_client.py       # LLM wrapper (OpenAI, LiteLLM)
│   ├── filters/                # Signal filtering
│   │   ├── rule_filter.py      # Rule-based pre-filtering
│   │   └── dedup.py            # Cross-day deduplication
│   ├── personas/               # Voice/tone engine
│   │   ├── loader.py           # Load from YAML
│   │   └── prompt_builder.py   # Build persona prompt blocks
│   ├── exporters/              # Output adapters
│   │   ├── google_sheets.py    # Google Sheets export
│   │   └── csv_exporter.py     # CSV export
│   └── storage/                # Local file storage
│       ├── progress.py         # Checkpoint/resume
│       ├── signals.py          # Signal CRUD (JSON files)
│       ├── campaigns.py        # Campaign management
│       ├── feedback.py         # Quality ratings
│       └── scan_history.py     # Scan run tracking
├── config/                     # YAML configuration files
│   ├── personality.yml         # Your voice profile
│   ├── filters.yml             # Filtering rules
│   └── subreddits.yml          # Target subreddits
├── data/                       # Runtime data (auto-created)
│   ├── signals/                # Daily signal JSON files
│   ├── campaigns/              # Campaign state files
│   └── scan_history/           # Scan run logs
├── pyproject.toml              # Dependencies + packaging
├── .env.example                # Environment template
└── README.md                   # This file
```
- Google Cloud account with OAuth 2.0 credentials (Desktop app type)
- Google Sheets API enabled in your Google Cloud project
- OpenAI API key with access to GPT-4o-mini (or your chosen model)
- Google Sheet where results will be written
- Go to Google Cloud Console
- Create a new project (or select existing)
- Navigate to APIs & Services > Library — search Google Sheets API, click Enable
- Go to APIs & Services > OAuth consent screen
- Choose External, click Create
- Fill in app name and your email
- On Scopes, add `https://www.googleapis.com/auth/spreadsheets`
- On Test users, add your Google email
- Go to APIs & Services > Credentials
- Click Create Credentials > OAuth client ID
- Application type: Desktop app
- Copy the Client ID and Client Secret into your `.env`
Open your target Google Sheet. The URL looks like:

```text
https://docs.google.com/spreadsheets/d/1HJy1bAfynXs.../edit
```

The long string after `/d/` is your Spreadsheet ID.
```bash
# Run Reddit scan (default: exports to Google Sheets)
python -m inkwell scan

# Use a different subreddit list
python -m inkwell scan --subreddits subreddits_1.yml

# Export to CSV instead of Sheets
python -m inkwell scan --csv --no-sheets

# Both CSV and Sheets
python -m inkwell scan --csv

# Verbose logging
python -m inkwell scan -v

# Start web dashboard
python -m inkwell serve

# Web dashboard on custom port with auto-reload
python -m inkwell serve --port 3000 --reload
```

| Problem | Fix |
|---|---|
| `ERROR: Set OPENAI_API_KEY in .env` | Your `.env` file is missing or the key name is wrong |
| `ERROR: Set GOOGLE_CLIENT_ID...` | Google OAuth credentials not set in `.env` |
| No subreddits found | Check `config/subreddits.yml` exists and is valid YAML |
| Browser doesn't open for OAuth | Run on a machine with a browser, or check firewall |
| No rows in sheet | Check `SPREADSHEET_ID` in `.env` is correct |
| `token.json` auth error | Delete `token.json` and re-run to re-authenticate |
| Rate limited (429) | Handled automatically with backoff — just wait |
| All posts filtered out | Check `config/filters.yml` — filters may be too strict |
- Phase 0 — Modular architecture (scanners, analyzers, filters, exporters, storage)
- Multi-provider LLM — OpenAI, Claude, Ollama, anything LiteLLM supports
- BYOK analyzer split — scans are free; voice drafting is on-demand, key-in-browser
- Web UI — Profile builder, settings (output + BYOK + filters), scan runner with live SSE progress and draft modal
- Persona marketplace — fork/publish personas under `config/personas/`
- More scanners — Hacker News, Product Hunt, Dev.to (protocol-based, ~100 LOC each)
- More exporters — Notion, Airtable, Slack webhook
- Scheduling — APScheduler cron for daily sweeps; email/Slack digests
- Feedback loop — ratings feed back into the engagement score
See README_TECHNICAL.md for architecture details.
Inkwell is built as an open platform — scanners, exporters, and personas are all intentionally small surfaces so a contributor can ship a real feature in an afternoon.
Three-minute path in:
- 👀 Browse open good-first-issues — scanner / exporter / persona / docs. Each has clear acceptance criteria and files-to-touch.
- 💬 Comment "I'd like this" — you'll get assigned.
- 🛠️ Read CONTRIBUTING.md for dev setup, code style, and the step-by-step for new scanners & exporters. Read docs/ROADMAP.md for project direction.
Prefer a chat first? Open a discussion — ideas, persona show-and-tell, or "help me set this up."
The repo's whole architecture is designed around plain Python protocols (structural typing, no inheritance), one YAML per concept (persona, filters, subreddits), and plain HTML/CSS/JS (no build step, no framework). Easy to read, easy to fork, easy to extend.
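The protocol pattern can be sketched in a few lines. This is illustrative; the method and attribute names are assumptions, and `scanners/base.py` defines the real interface:

```python
from typing import Iterable, Protocol, runtime_checkable

@runtime_checkable
class Scanner(Protocol):
    """Structural type for platform scanners: no inheritance required.

    Any class with a matching shape satisfies the protocol.
    """
    name: str

    def scan(self, targets: list[str]) -> Iterable[dict]: ...

class HackerNewsScanner:  # note: no base class, just matching shape
    name = "hackernews"

    def scan(self, targets: list[str]) -> Iterable[dict]:
        # A real scanner would hit the HN API; stubbed for illustration.
        return [{"source": self.name, "target": t} for t in targets]
```

Because conformance is structural, a new scanner never imports or subclasses anything from the core, which is exactly what keeps each one around 100 LOC.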
MIT — do whatever you want with this, including commercial use and forks. Just keep the copyright line.
Built by Saurabh Singh. Originally started as a Reddit outreach script, now growing into a full outreach intelligence platform.
If Inkwell helps you find great conversations, consider starring the repo and sharing it with other entrepreneurs.