Networking-centric analysis of Ethereum mainnet, published as a Quarto website.
## Quick Start

```bash
# Install dependencies
uv sync

# Create .env with ClickHouse credentials
cat > .env << 'EOF'
CLICKHOUSE_HOST=your-host
CLICKHOUSE_PORT=8443
CLICKHOUSE_USER=your-user
CLICKHOUSE_PASSWORD=your-password
EOF

# Fetch data for yesterday
uv run python scripts/fetch_data.py --output-dir notebooks/data

# Start dev server
quarto preview
```

## Notebooks

| Notebook | Description |
|---|---|
| 01-blob-inclusion | Blob inclusion patterns per block and epoch |
| 02-blob-flow | Blob flow across validators, builders, and relays |
| 03-column-propagation | Column propagation timing across 128 data columns |
## Project Structure

```
.
├── _quarto.yml                  # Quarto config
├── index.qmd                    # Home page
├── archive.qmd                  # Archive page (generated)
├── queries/                     # Query layer (fetch + write to Parquet)
│   ├── blob_inclusion.py        # fetch_blobs_per_slot(), fetch_blocks_blob_epoch(), ...
│   ├── blob_flow.py             # fetch_proposer_blobs()
│   └── column_propagation.py    # fetch_col_first_seen()
├── scripts/
│   ├── fetch_data.py            # CLI for data fetching
│   ├── generate_archive.py      # Generates archive.qmd for site
│   └── generate_historical_index.py
├── notebooks/
│   ├── loaders.py               # load_parquet()
│   ├── data/                    # Local data cache (gitignored)
│   └── *.qmd                    # Quarto notebooks (load + visualize)
└── _site/                       # Built output (gitignored)
```
## Data Flow

```
ClickHouse ──[fetch_data.py]──> Parquet files ──[notebooks]──> Visualizations
                                      │
                                      └── Stored on `data` branch (CI)
                                          or `notebooks/data/` (local dev)
```
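Notebooks consume these files through `load_parquet()` in `notebooks/loaders.py`. A minimal sketch of how such a loader could resolve the data location, assuming a per-date directory layout and a `latest` field in `manifest.json` (both are assumptions, not the actual implementation):

```python
# Hypothetical loader sketch -- the real notebooks/loaders.py may differ.
import json
import os
from pathlib import Path

import pandas as pd


def load_parquet(name: str) -> pd.DataFrame:
    """Load one named dataset for the target date from the data cache."""
    # DATA_ROOT lets CI point at the checked-out `data` branch;
    # local dev falls back to notebooks/data/.
    root = Path(os.environ.get("DATA_ROOT", "notebooks/data"))
    # TARGET_DATE pins the date; otherwise use the manifest's latest entry.
    date = os.environ.get("TARGET_DATE")
    if date is None:
        manifest = json.loads((root / "manifest.json").read_text())
        date = manifest["latest"]  # assumed manifest field
    return pd.read_parquet(root / date / f"{name}.parquet")
```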
## CI/CD

Two GitHub Actions workflows:

- **Fetch Daily Data** (`fetch-data.yml`)
  - Runs daily at 1am UTC
  - Fetches yesterday's data from ClickHouse
  - Commits Parquet files to `data` branch
  - Maintains 30-day rolling window
- **Build and Deploy** (`build-book.yml`)
  - Triggers on push to `main` or after data fetch
  - Checks out `data` branch for Parquet files
  - Builds Quarto site (executes notebooks at build time)
  - Deploys to GitHub Pages

## Branches
| Branch | Purpose |
|---|---|
| `main` | Source code, notebooks, queries |
| `data` | Parquet files + `manifest.json` |
| `gh-pages` | Built static site (auto-deployed) |
## Fetching Data

```bash
# Fetch yesterday's data (default)
uv run python scripts/fetch_data.py --output-dir notebooks/data

# Fetch specific date
uv run python scripts/fetch_data.py --date 2025-01-15 --output-dir notebooks/data

# Fetch with custom retention
uv run python scripts/fetch_data.py --output-dir notebooks/data --max-days 7
```
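The `--max-days` flag maintains the rolling window described above. A sketch of what that pruning could look like, assuming per-date subdirectories named `YYYY-MM-DD` (the function name and layout are assumptions, not the actual `fetch_data.py` code):

```python
# Hypothetical pruning logic -- the real fetch_data.py may differ.
import shutil
from datetime import date, timedelta
from pathlib import Path


def prune_old_data(output_dir: Path, max_days: int = 30) -> None:
    """Delete per-date data directories older than the retention window."""
    cutoff = date.today() - timedelta(days=max_days)
    for entry in output_dir.iterdir():
        if not entry.is_dir():
            continue  # skip manifest.json and other loose files
        try:
            entry_date = date.fromisoformat(entry.name)  # dirs named YYYY-MM-DD
        except ValueError:
            continue  # not a date-named directory
        if entry_date < cutoff:
            shutil.rmtree(entry)
```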
## Working with Notebooks

```bash
# Option 1: Jupyter Lab (from repo root)
uv run jupyter lab

# Option 2: VS Code with Quarto extension
# Install the Quarto extension and open any .qmd file
```

## Building the Site

```bash
# Start dev server with hot reload
quarto preview

# Build static HTML
quarto render
# Output is in _site/
```

The CI workflow handles this, but to replicate locally:

```bash
# Build with execution (uses latest date from manifest)
quarto render
# Or specify a date
TARGET_DATE=2025-01-15 quarto render
# Serve locally to test
python -m http.server -d _site
```

## Environment Variables

| Variable | Description |
|---|---|
| `CLICKHOUSE_HOST` | ClickHouse server hostname |
| `CLICKHOUSE_PORT` | ClickHouse server port (default: 8443) |
| `CLICKHOUSE_USER` | ClickHouse username |
| `CLICKHOUSE_PASSWORD` | ClickHouse password |
| `DATA_ROOT` | Override data directory (used by CI) |
| `TARGET_DATE` | Date for notebook execution (`YYYY-MM-DD`), defaults to latest in manifest |
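Since the query layer calls `client.query_df()`, the client is presumably built with `clickhouse-connect`. A hedged sketch of wiring it to the variables above (the helper name and the use of `python-dotenv` are assumptions, not confirmed repo code):

```python
# Hypothetical helper -- the repo's actual client setup may differ.
import os

import clickhouse_connect
from dotenv import load_dotenv  # assumes python-dotenv is available


def get_client():
    """Build a ClickHouse client from the environment variables above."""
    load_dotenv()  # pick up .env in local dev; CI sets real env vars
    return clickhouse_connect.get_client(
        host=os.environ["CLICKHOUSE_HOST"],
        port=int(os.environ.get("CLICKHOUSE_PORT", "8443")),
        username=os.environ["CLICKHOUSE_USER"],
        password=os.environ["CLICKHOUSE_PASSWORD"],
        secure=True,  # port 8443 is the TLS port
    )
```

A client built this way is what gets passed as the `client` argument to the query functions shown in the next section.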
## Adding a New Analysis

1. Add a query function in `queries/`:

   ```python
   from pathlib import Path


   def fetch_my_data(client, target_date: str, output_path: Path, network: str = "mainnet") -> int:
       # Query ClickHouse for one day of data and write it to Parquet.
       query = f"SELECT ... WHERE {_get_date_filter(target_date)}"
       df = client.query_df(query)
       output_path.parent.mkdir(parents=True, exist_ok=True)
       df.to_parquet(output_path, index=False)
       return len(df)
   ```

2. Register it in `scripts/fetch_data.py`:

   ```python
   FETCHERS = [
       ...
       ("my_data", fetch_my_data),
   ]
   ```

3. Create a Quarto notebook in `notebooks/`:

   ````
   ---
   title: "My Analysis"
   ---

   ```{python}
   from loaders import load_parquet

   df = load_parquet("my_data")
   # Visualize...
   ```
   ````

4. Add it to the navbar in `_quarto.yml`.