
First party data pipeline: Analytics.js > YOUR Cloudflare data pipeline > R2 (Iceberg) > DuckDB + R2 SQL πŸš€


icelight

First-party product analytics platform that streams Analytics.js events to Apache Iceberg tables on Cloudflare R2: a very cost-effective replacement for Google Analytics, Mixpanel, and similar tools.

Overview

icelight provides a complete solution for collecting analytics events and storing them in queryable Iceberg tables using Cloudflare's infrastructure:

  • Event Ingestion: RudderStack/Segment-compatible HTTP endpoints
  • Data Storage: Apache Iceberg tables on R2 with automatic compaction
  • Query API: SQL queries via R2 SQL or DuckDB, plus a semantic layer
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Your App /    │────▢│  Event Ingest   │────▢│   Cloudflare    β”‚
β”‚  Analytics SDK  β”‚     β”‚    Worker       β”‚     β”‚    Pipeline     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                                         β”‚
                        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”
                        β”‚   Query API     │◀────│  R2 + Iceberg   β”‚
                        β”‚    Worker       β”‚     β”‚   Data Catalog  β”‚
                        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Live Demo: https://try.icelight.dev

Prerequisites

  • A Cloudflare account with R2 and Workers enabled
  • Node.js and pnpm
  • The Wrangler CLI (run via npx, as below)

Quick Start

1. Clone & Install

git clone https://github.com/cliftonc/icelight.git
cd icelight
pnpm install

2. Login to Cloudflare

npx wrangler login

3. Launch Everything

pnpm launch

Enter a project name when prompted. The script will:

  • Create an R2 bucket with Data Catalog enabled
  • Create and configure the Pipeline (stream, sink, pipeline)
  • Deploy the Event Ingest worker
  • Deploy the Query API worker (this is the same code as https://try.icelight.dev)
  • Deploy the DuckDB container and API

Once complete, you'll see your worker URLs. If anything goes wrong, it is safe to run the script again: it inspects your Cloudflare environment and will attempt to resolve any differences.

4. Open the Web UI

Visit your Query API URL in a browser:

https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev

The Web UI includes:

  • Analysis Builder: Visual query builder with charts
  • R2 SQL: Direct SQL queries against your Iceberg tables
  • DuckDB: Full SQL support (JOINs, aggregations, window functions)
  • Event Simulator: Send test events using the RudderStack SDK

5. Finished?

If you found this interesting but not that useful (I'd love to know why via an issue), you can clean up:

pnpm teardown

This command will remove everything created in your Cloudflare account, including any data loaded into the bucket.

Client SDK Integration

RudderStack / Segment

Icelight simply uses the open-source Analytics.js library (from Segment and RudderStack).

import { Analytics } from '@rudderstack/analytics-js';

const analytics = new Analytics({
  writeKey: 'any-value',
  dataPlaneUrl: 'https://icelight-event-ingest.YOUR-SUBDOMAIN.workers.dev'
});

analytics.track('Purchase Completed', { orderId: '12345', revenue: 99.99 });
analytics.identify('user-123', { email: '[email protected]', plan: 'premium' });

Direct HTTP

You can also send messages directly, e.g. from your backend:

# Track event
curl -X POST https://YOUR-WORKER.workers.dev/v1/track \
  -H "Content-Type: application/json" \
  -d '{"userId":"user-123","event":"Button Clicked","properties":{"button":"signup"}}'

# Batch events
curl -X POST https://YOUR-WORKER.workers.dev/v1/batch \
  -H "Content-Type: application/json" \
  -d '{"batch":[
    {"type":"track","userId":"u1","event":"Page View"},
    {"type":"identify","userId":"u1","traits":{"name":"John"}}
  ]}'
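The same batch envelope can be built programmatically on the server. A minimal Node.js sketch, based on the curl examples above (the `buildBatch`/`sendBatch` helpers are illustrative names, and the `sentAt` field is an assumption from the standard Analytics.js payload shape):

```javascript
// Build a Segment/RudderStack-style batch payload. Each message needs a
// `type` plus a `userId` or `anonymousId`; a timestamp is added if missing.
function buildBatch(messages) {
  return {
    batch: messages.map((m) => ({
      ...m,
      timestamp: m.timestamp ?? new Date().toISOString(),
    })),
    sentAt: new Date().toISOString(),
  };
}

// POST the payload to the ingest worker's /v1/batch endpoint.
async function sendBatch(ingestUrl, messages) {
  const res = await fetch(`${ingestUrl}/v1/batch`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildBatch(messages)),
  });
  if (!res.ok) throw new Error(`Ingest failed: ${res.status}`);
}

// Example (replace the URL with your deployed worker):
// await sendBatch('https://icelight-event-ingest.YOUR-SUBDOMAIN.workers.dev', [
//   { type: 'track', userId: 'u1', event: 'Page View' },
//   { type: 'identify', userId: 'u1', traits: { name: 'John' } },
// ]);
```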

Querying Data

Via Web UI

The Query API includes a web-based explorer at your worker URL with R2 SQL, DuckDB, and a visual Analysis Builder.

Via API

# R2 SQL query
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT * FROM analytics.events LIMIT 10"}'

# DuckDB query (full SQL support)
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/duckdb \
  -H "Content-Type: application/json" \
  -d '{"query": "SELECT type, COUNT(*) FROM r2_datalake.analytics.events GROUP BY type"}'

# Semantic API query
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/cubejs-api/v1/load \
  -H "Content-Type: application/json" \
  -d '{"query": {"dimensions": ["Events.type"], "measures": ["Events.count"], "limit": 100}}'

# Get CSV output
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT * FROM analytics.events LIMIT 10", "format": "csv"}'

# List tables
curl https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/tables/analytics

# Describe table schema
curl https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/tables/analytics/events
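A small JavaScript client can wrap the /query endpoint the same way, sketched from the curl examples above (the `sql` and optional `format` fields follow those examples; `buildQueryRequest` and `runQuery` are illustrative names, not part of Icelight):

```javascript
// Build the JSON body for the Query API's /query endpoint.
// `format` is optional; the examples above show 'csv' as one value.
function buildQueryRequest(sql, format) {
  const body = { sql };
  if (format) body.format = format;
  return body;
}

// Execute an R2 SQL query against the deployed Query API worker.
async function runQuery(queryApiUrl, sql, format) {
  const res = await fetch(`${queryApiUrl}/query`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildQueryRequest(sql, format)),
  });
  if (!res.ok) throw new Error(`Query failed: ${res.status}`);
  return format === 'csv' ? res.text() : res.json();
}

// Example (replace with your worker URL):
// const rows = await runQuery(
//   'https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev',
//   'SELECT * FROM analytics.events LIMIT 10'
// );
```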

Via External Tools

Connect PyIceberg, DuckDB, or Spark to your R2 Data Catalog. See Cloudflare R2 SQL docs for connection details.

API Endpoints

Ingestion Worker

These are the standard Analytics.js endpoints:

Endpoint       Method  Description
/v1/batch      POST    Batch events (primary)
/v1/track      POST    Single track event
/v1/identify   POST    Single identify event
/v1/page       POST    Single page event
/v1/screen     POST    Single screen event
/v1/group      POST    Single group event
/v1/alias      POST    Single alias event
/health        GET     Health check

Query API Worker

Endpoint                   Method  Description
/query                     POST    Execute R2 SQL query
/duckdb                    POST    Execute DuckDB query (full SQL)
/tables/:namespace         GET     List tables in namespace
/tables/:namespace/:table  GET     Describe table schema
/cubejs-api/v1/meta        GET     Get Cube.js-compatible semantic layer metadata
/cubejs-api/v1/load        POST    Execute Cube.js-compatible semantic query
/health                    GET     Health check

Development

# Run setup
pnpm install
pnpm launch

# Run ingest worker locally
pnpm dev:ingest

# Run query worker locally
pnpm dev:query

# Build all packages
pnpm build

# Type check
pnpm typecheck

The DuckDB container does not work locally, as there is currently no local development solution for Cloudflare Containers. Note that the Ingest and Query workers connect to remote infrastructure in Cloudflare even when run locally.

Cleanup

pnpm teardown

Troubleshooting

"send is not a function" error

Ensure compatibility_date in wrangler.local.jsonc is "2025-01-01" or later. The Pipelines send() method requires this.

"Not logged in" error

Run npx wrangler login and complete the browser authorization flow.

Pipeline binding not working

  1. Check that wrangler.local.jsonc exists in workers/event-ingest/ - if not, run pnpm launch
  2. Verify the pipeline binding in wrangler.local.jsonc has the correct stream ID
  3. Run npx wrangler pipelines streams list to see your streams
  4. Redeploy after any config changes: pnpm deploy:ingest

Query API returns empty data

  1. Check that data has been flushed to R2 (pipelines have a 5-minute flush interval by default)
  2. Verify the WAREHOUSE_NAME in workers/query-api/wrangler.local.jsonc matches your bucket name
  3. Check that CF_ACCOUNT_ID and CF_API_TOKEN secrets are set correctly

"wrangler.local.jsonc not found" error

Run pnpm launch to create the local configuration files with your pipeline bindings.

Limitations

  • Cloudflare Pipelines: Currently in open beta - API may change
  • R2 SQL: Read-only, limited query support (improving in 2026)
  • Local Development: Pipelines require --remote flag for full testing


License

MIT
