First-party product analytics platform. Stream Analytics.js events to Apache Iceberg tables on Cloudflare R2 - a very cost-effective replacement for Google Analytics, Mixpanel, etc.
icelight provides a complete solution for collecting analytics events and storing them in queryable Iceberg tables using Cloudflare's infrastructure:
- Event Ingestion: RudderStack/Segment-compatible HTTP endpoints
- Data Storage: Apache Iceberg tables on R2 with automatic compaction
- Query API: SQL queries via R2 SQL or DuckDB, plus a semantic layer
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Your App /    │────▶│  Event Ingest   │────▶│   Cloudflare    │
│  Analytics SDK  │     │     Worker      │     │    Pipeline     │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                        ┌─────────────────┐     ┌────────┴────────┐
                        │    Query API    │◀────│  R2 + Iceberg   │
                        │     Worker      │     │  Data Catalog   │
                        └─────────────────┘     └─────────────────┘
Live Demo: https://try.icelight.dev
- Cloudflare Account: dash.cloudflare.com/sign-up (free tier works)
- Node.js 18+: nodejs.org
- pnpm 8+:
npm install -g pnpm
git clone https://github.com/cliftonc/icelight.git
cd icelight
pnpm install
npx wrangler login
pnpm launch
Enter a project name when prompted. The script will:
- Create an R2 bucket with Data Catalog enabled
- Create and configure the Pipeline (stream, sink, pipeline)
- Deploy the Event Ingest worker
- Deploy the Query API worker (this is the same code as https://try.icelight.dev)
- Deploy the DuckDB container and API
Once complete, you'll see your worker URLs. If anything goes wrong, you can safely run the script again: it inspects your Cloudflare environment and attempts to reconcile any differences.
Visit your Query API URL in a browser:
https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev
The Web UI includes:
- Analysis Builder: Visual query builder with charts
- R2 SQL: Direct SQL queries against your Iceberg tables
- DuckDB: Full SQL support (JOINs, aggregations, window functions)
- Event Simulator: Send test events using the RudderStack SDK
If you found this interesting but not that useful (I'd love to know why via an issue), you can clean up:
pnpm teardown
This command will remove everything created in your Cloudflare account, including any data loaded into the bucket.
Icelight simply uses the open-source Analytics.js library (from Segment and RudderStack).
import { Analytics } from '@rudderstack/analytics-js';
const analytics = new Analytics({
writeKey: 'any-value',
dataPlaneUrl: 'https://icelight-event-ingest.YOUR-SUBDOMAIN.workers.dev'
});
analytics.track('Purchase Completed', { orderId: '12345', revenue: 99.99 });
analytics.identify('user-123', { email: 'user@example.com', plan: 'premium' });
You can also send messages directly, e.g. from your backend:
# Track event
curl -X POST https://YOUR-WORKER.workers.dev/v1/track \
-H "Content-Type: application/json" \
-d '{"userId":"user-123","event":"Button Clicked","properties":{"button":"signup"}}'
# Batch events
curl -X POST https://YOUR-WORKER.workers.dev/v1/batch \
-H "Content-Type: application/json" \
-d '{"batch":[
{"type":"track","userId":"u1","event":"Page View"},
{"type":"identify","userId":"u1","traits":{"name":"John"}}
]}'
The Query API includes a web-based explorer at your worker URL with R2 SQL, DuckDB, and a visual Analysis Builder.
# R2 SQL query
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/query \
-H "Content-Type: application/json" \
-d '{"sql": "SELECT * FROM analytics.events LIMIT 10"}'
# DuckDB query (full SQL support)
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/duckdb \
-H "Content-Type: application/json" \
-d '{"query": "SELECT type, COUNT(*) FROM r2_datalake.analytics.events GROUP BY type"}'
# Semantic API query
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/cubejs-api/v1/load \
-H "Content-Type: application/json" \
-d '{"query": {"dimensions": ["Events.type"], "measures": ["Events.count"], "limit": 100}}'
# Get CSV output
curl -X POST https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/query \
-H "Content-Type: application/json" \
-d '{"sql": "SELECT * FROM analytics.events LIMIT 10", "format": "csv"}'
# List tables
curl https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/tables/analytics
# Describe table schema
curl https://icelight-query-api.YOUR-SUBDOMAIN.workers.dev/tables/analytics/events
You can also connect PyIceberg, DuckDB, or Spark directly to your R2 Data Catalog. See the Cloudflare R2 SQL docs for connection details.
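As a rough sketch, a PyIceberg connection might look like the following. The catalog URI, warehouse, and token shown here are illustrative placeholders, not values icelight creates for you - take the real ones from your R2 Data Catalog settings per the Cloudflare docs.
# Minimal PyIceberg sketch for reading the events table from the R2 Data Catalog.
# The uri, warehouse, and token values are placeholders for illustration only.
from pyiceberg.catalog.rest import RestCatalog

catalog = RestCatalog(
    name="r2_datalake",
    uri="https://catalog.cloudflarestorage.com/YOUR-ACCOUNT-ID/YOUR-BUCKET",  # placeholder
    warehouse="YOUR-WAREHOUSE-NAME",  # placeholder
    token="YOUR-CF-API-TOKEN",  # placeholder
)

# List namespaces, then read a few rows as an Arrow table
print(catalog.list_namespaces())
events = catalog.load_table(("analytics", "events"))
print(events.scan(limit=10).to_arrow())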
These are the standard Analytics.js endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/v1/batch` | POST | Batch events (primary) |
| `/v1/track` | POST | Single track event |
| `/v1/identify` | POST | Single identify event |
| `/v1/page` | POST | Single page event |
| `/v1/screen` | POST | Single screen event |
| `/v1/group` | POST | Single group event |
| `/v1/alias` | POST | Single alias event |
| `/health` | GET | Health check |
| Endpoint | Method | Description |
|---|---|---|
| `/query` | POST | Execute R2 SQL query |
| `/duckdb` | POST | Execute DuckDB query (full SQL) |
| `/tables/:namespace` | GET | List tables in namespace |
| `/tables/:namespace/:table` | GET | Describe table schema |
| `/cubejs-api/v1/meta` | GET | Get Cube.js-compatible semantic layer metadata |
| `/cubejs-api/v1/load` | POST | Execute Cube.js-compatible semantic query |
| `/health` | GET | Health check |
# Run setup
pnpm install
pnpm launch
# Run ingest worker locally
pnpm dev:ingest
# Run query worker locally
pnpm dev:query
# Build all packages
pnpm build
# Type check
pnpm typecheck
The DuckDB container does not work locally, as there is currently no local development solution for Cloudflare Containers. Note that the Ingest and Query workers connect to remote Cloudflare infrastructure.
pnpm teardown
Ensure compatibility_date in wrangler.local.jsonc is "2025-01-01" or later. The Pipelines send() method requires this.
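For reference, the relevant field in the worker's wrangler.local.jsonc looks roughly like this (all other settings omitted):
{
  // Pipelines send() requires a compatibility date of 2025-01-01 or later
  "compatibility_date": "2025-01-01"
}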
Run npx wrangler login and complete the browser authorization flow.
- Check that `wrangler.local.jsonc` exists in `workers/event-ingest/` - if not, run `pnpm launch`
- Verify the pipeline binding in `wrangler.local.jsonc` has the correct stream ID
- Run `npx wrangler pipelines streams list` to see your streams
- Redeploy after any config changes: `pnpm deploy:ingest`
- Check that data has been flushed to R2 (pipelines have a 5-minute flush interval by default)
- Verify the `WAREHOUSE_NAME` in `workers/query-api/wrangler.local.jsonc` matches your bucket name
- Check that `CF_ACCOUNT_ID` and `CF_API_TOKEN` secrets are set correctly
Run pnpm launch to create the local configuration files with your pipeline bindings.
- Cloudflare Pipelines: Currently in open beta - API may change
- R2 SQL: Read-only, limited query support (improving in 2026)
- Local Development: Pipelines require the `--remote` flag for full testing
- Getting Started
- Configuration
- Querying Data - SQL queries, Semantic API, JSON field extraction
- SDK Integration
MIT