Scrape and enrich US political election data from Wikipedia and Ballotpedia into a normalized SQLite database.
```mermaid
---
config:
  theme: neutral
---
erDiagram
    elections ||--o{ candidates : has
    candidates ||--o{ contact_links : has
    elections {
        int election_id PK
        text state
        text race_type
        int year
        text district
        text election_stage
        text wikipedia_url
    }
    candidates {
        int candidate_id PK
        int election_id FK
        text party
        text candidate_name
        text wikipedia_url
        text ballotpedia_url
        real vote_pct
        int is_winner
    }
    contact_links {
        int contact_link_id PK
        int candidate_id FK
        text link_type
        text url
        text source
    }
```
- `link_type` values: `campaign_site`, `campaign_site_archived`, `campaign_facebook`, `campaign_x`, `campaign_instagram`, `personal_website`, `personal_facebook`, `personal_linkedin`
- `source` values: `wikipedia`, `ballotpedia`, `web_search`, `wayback`, `csv_import`
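The schema above can be expressed as plain SQLite DDL. A minimal sketch derived from the ER diagram (table and column names come from the diagram; the `NOT NULL`, `REFERENCES`, and `DEFAULT` details are assumptions, not taken from the project):

```python
import sqlite3

# Schema sketch matching the ER diagram; constraint details are assumptions.
DDL = """
CREATE TABLE IF NOT EXISTS elections (
    election_id     INTEGER PRIMARY KEY,
    state           TEXT NOT NULL,
    race_type       TEXT NOT NULL,
    year            INTEGER NOT NULL,
    district        TEXT,
    election_stage  TEXT,
    wikipedia_url   TEXT
);
CREATE TABLE IF NOT EXISTS candidates (
    candidate_id    INTEGER PRIMARY KEY,
    election_id     INTEGER NOT NULL REFERENCES elections(election_id),
    party           TEXT,
    candidate_name  TEXT NOT NULL,
    wikipedia_url   TEXT,
    ballotpedia_url TEXT,
    vote_pct        REAL,
    is_winner       INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS contact_links (
    contact_link_id INTEGER PRIMARY KEY,
    candidate_id    INTEGER NOT NULL REFERENCES candidates(candidate_id),
    link_type       TEXT NOT NULL,
    url             TEXT NOT NULL,
    source          TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # ['candidates', 'contact_links', 'elections']
```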
```shell
# Install
uv sync
```
```shell
# Scrape 2024 House races and enrich with contact info
python -m camplinks --year 2024 --race house

# Scrape 2024 Senate races
python -m camplinks --year 2024 --race senate

# Scrape 2025 gubernatorial races
python -m camplinks --year 2025 --race governor

# Scrape 2025 mayoral elections (Wikipedia, 62+ cities)
python -m camplinks --year 2025 --race municipal

# Scrape 2023-2026 mayoral elections (Ballotpedia, top-100 cities)
python -m camplinks --year 2023 --race bp_municipal --stage scrape

# Scrape gubernatorial elections from Ballotpedia (all 50 states)
python -m camplinks --year 2026 --race bp_governor --stage scrape

# Run all registered race types
python -m camplinks --year 2024 --race all
```

| Key | Description |
|---|---|
| house | US House of Representatives |
| senate | US Senate |
| governor | Governor (statewide) |
| attorney_general | Attorney General (statewide) |
| special_house | House special elections |
| state_leg | State legislature (regular sessions) |
| state_leg_special | State legislature special elections |
| municipal | Mayoral elections (Wikipedia) |
| bp_municipal | Mayoral elections (Ballotpedia, top-100 cities) |
| bp_governor | Gubernatorial elections (Ballotpedia, all states) |
| judicial | State Supreme Court elections |
| all | Run all of the above |
The database is written to `camplinks.db` by default. Override with `--db path/to/db`.

The pipeline runs four stages in order. Each stage is idempotent (safe to re-run).
| Stage | What it does | Data source |
|---|---|---|
| scrape | Fetch election results from Wikipedia | Wikipedia state election pages |
| enrich | Extract campaign websites from candidate Wikipedia pages | Wikipedia candidate infoboxes |
| search | Find missing contact info via Ballotpedia and web search | Ballotpedia + DuckDuckGo |
| validate | Check campaign site accessibility, archive dead links | Wayback Machine API |
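Idempotency of the scrape stage can be achieved with SQLite upserts keyed on a natural unique constraint. A hypothetical sketch, not the project's actual implementation (the `UNIQUE` column choice is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE elections (
    election_id INTEGER PRIMARY KEY,
    state TEXT, race_type TEXT, year INTEGER, district TEXT,
    UNIQUE (state, race_type, year, district)
);
""")

def upsert_election(state, race_type, year, district):
    # Re-running the same scrape hits the UNIQUE constraint and becomes a no-op.
    conn.execute(
        """INSERT INTO elections (state, race_type, year, district)
           VALUES (?, ?, ?, ?)
           ON CONFLICT (state, race_type, year, district) DO NOTHING""",
        (state, race_type, year, district),
    )

for _ in range(3):  # safe to re-run
    upsert_election("California", "US House", 2024, "12")

count = conn.execute("SELECT COUNT(*) FROM elections").fetchone()[0]
print(count)  # 1
```

`ON CONFLICT ... DO NOTHING` (SQLite 3.24+) keeps repeated runs from duplicating rows without any read-before-write logic.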
Run individual stages with `--stage`:

```shell
python -m camplinks --year 2024 --race house --stage scrape
python -m camplinks --year 2024 --race house --stage enrich
python -m camplinks --year 2024 --race house --stage search
python -m camplinks --year 2024 --race house --stage validate
```

```python
import sqlite3

conn = sqlite3.connect("camplinks.db")
conn.row_factory = sqlite3.Row

# All 2024 House winners with their campaign sites
rows = conn.execute("""
    SELECT c.candidate_name, c.party, e.state, e.district, cl.url
    FROM candidates c
    JOIN elections e ON c.election_id = e.election_id
    LEFT JOIN contact_links cl ON c.candidate_id = cl.candidate_id
        AND cl.link_type = 'campaign_site'
    WHERE c.is_winner = 1 AND e.year = 2024 AND e.race_type = 'US House'
    ORDER BY e.state, e.district
""").fetchall()

for r in rows:
    print(f"{r['state']}-{r['district']}: {r['candidate_name']} ({r['party']}) - {r['url']}")
```

Or with Polars:
```python
import polars as pl

df = pl.read_database_uri(
    "SELECT * FROM candidates c JOIN elections e ON c.election_id = e.election_id",
    "sqlite:///camplinks.db",
)
```

See USAGE.md for a walkthrough with examples.
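The same database supports quick coverage checks, such as how many contact links each source contributed. A sketch using only columns from the schema above; the table definition and sample rows here are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact_links (contact_link_id INTEGER PRIMARY KEY, "
    "candidate_id INTEGER, link_type TEXT, url TEXT, source TEXT)"
)
conn.executemany(
    "INSERT INTO contact_links (candidate_id, link_type, url, source) VALUES (?, ?, ?, ?)",
    [
        (1, "campaign_site", "https://example.com/a", "wikipedia"),
        (2, "campaign_site", "https://example.com/b", "ballotpedia"),
        (3, "campaign_site_archived", "https://web.archive.org/web/2024/https://example.com/c", "wayback"),
        (4, "campaign_site", "https://example.com/d", "wikipedia"),
    ],
)

# Links per source, most productive source first
for source, n in conn.execute(
    "SELECT source, COUNT(*) AS n FROM contact_links "
    "GROUP BY source ORDER BY n DESC, source"
):
    print(f"{source}: {n}")
```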
If you have an existing `house_races_2024.csv` from the old wide-format pipeline:

```shell
python convert_to_tidy.py --csv house_races_2024.csv --db camplinks.db
```

To set up a development environment and run the checks:

```shell
uv sync
uv run pytest tests/
uv run mypy camplinks/
uv run ruff check .
```

See CONTRIBUTING.md for setup instructions and guidelines.