Summary
Add a tokf telemetry sync subcommand that reads invocation events from the local SQLite database and replays any that were not successfully exported to the OTLP backend in real time.
Motivation
The OTel exporter introduced in #85 uses a best-effort model: it waits at most 200 ms for the OTLP flush before giving up (see ADR-0001). Under a slow or temporarily unavailable endpoint, the last invocation's metrics may not reach the backend.
However, every invocation is always written to SQLite first, and the synced_to_otel_at column (added in #85) tracks which rows have been successfully exported. A sync command can replay those rows at any time — from a cron job, a CI post-step, or manually.
Design
Schema (already in place from #85)
-- events table already has:
synced_to_otel_at TEXT -- NULL = not yet synced
Command
tokf telemetry sync [--since <ISO8601>] [--dry-run] [--limit N]
- Queries WHERE synced_to_otel_at IS NULL (or --since override)
- Builds raw OTLP HTTP export payloads with correct start_time_unix_nano / time_unix_nano from the stored timestamp column
- POSTs directly to the configured OTLP endpoint (no SdkMeterProvider overhead, no background thread)
- On success: updates synced_to_otel_at = strftime('%Y-%m-%dT%H:%M:%SZ','now')
- --dry-run: prints what would be synced without sending
- --limit N: sync at most N events (for incremental rollouts)
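The selection step above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name build_pending_query, the id column, and the ordering are hypothetical, and it reads "--since override" as replacing the IS NULL filter with a timestamp lower bound (the issue could also mean combining the two filters).

```rust
/// Hypothetical helper (not part of tokf): builds the SELECT used to pick
/// events for replay, returning the SQL text and its positional parameters.
/// The `events` table and the `timestamp` / `synced_to_otel_at` columns come
/// from the issue text; the `id` column is an assumption.
fn build_pending_query(since: Option<&str>, limit: Option<u32>) -> (String, Vec<String>) {
    let mut sql = String::from("SELECT id, timestamp FROM events WHERE ");
    let mut params = Vec::new();

    match since {
        // Assumption: --since replaces the "not yet synced" filter.
        Some(ts) => {
            sql.push_str("timestamp >= ?");
            params.push(ts.to_string());
        }
        // Default: only rows that were never exported successfully.
        None => sql.push_str("synced_to_otel_at IS NULL"),
    }

    // Oldest first, so a partial run leaves the newest rows for the next run.
    sql.push_str(" ORDER BY timestamp ASC");

    // The limit is a u32, so inlining it cannot inject SQL.
    if let Some(n) = limit {
        sql.push_str(&format!(" LIMIT {n}"));
    }

    (sql, params)
}
```

With --dry-run, the same query would be executed and its rows printed, skipping both the POST and the UPDATE.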
Temporality
Uses Delta temporality (matching the real-time exporter). Each event is a single-invocation delta, so replaying them as historical deltas is semantically correct.
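Replaying historical deltas depends on each data point carrying the original event time rather than the time of the sync. The sketch below shows one way to turn a stored timestamp into the nanosecond value that start_time_unix_nano / time_unix_nano expect. It assumes the timestamp column uses the same '%Y-%m-%dT%H:%M:%SZ' format as synced_to_otel_at; the function names are hypothetical, and day-of-month is only range-checked, not validated per month.

```rust
use std::ops::Range;

/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, March-based
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parses an all-digit slice of `ts` as an integer.
fn field(ts: &str, r: Range<usize>) -> Option<i64> {
    let s = ts.get(r)?;
    if s.bytes().all(|c| c.is_ascii_digit()) {
        s.parse().ok()
    } else {
        None
    }
}

/// Hypothetical helper: converts "YYYY-MM-DDTHH:MM:SSZ" (UTC) to Unix nanoseconds.
/// Returns None for any other shape or for dates before the epoch.
fn parse_utc_to_unix_nanos(ts: &str) -> Option<u64> {
    let b = ts.as_bytes();
    if b.len() != 20
        || b[4] != b'-' || b[7] != b'-' || b[10] != b'T'
        || b[13] != b':' || b[16] != b':' || b[19] != b'Z'
    {
        return None;
    }
    let (y, mo, d) = (field(ts, 0..4)?, field(ts, 5..7)?, field(ts, 8..10)?);
    let (h, mi, s) = (field(ts, 11..13)?, field(ts, 14..16)?, field(ts, 17..19)?);
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || s > 59 {
        return None;
    }
    let secs = days_from_civil(y, mo, d) * 86_400 + h * 3_600 + mi * 60 + s;
    u64::try_from(secs).ok()?.checked_mul(1_000_000_000)
}
```

How the start and end of each delta interval are derived from this single value is not specified in the issue and is left open here.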
Backend compatibility
- Datadog, Grafana Mimir, New Relic, Honeycomb: accept historical OTLP with timestamps. ✓
- Prometheus Pushgateway: pull-based, rejects historical data. ✗ (document limitation)
Implementation notes
- No SdkMeterProvider: build the ExportMetricsServiceRequest protobuf directly using the opentelemetry-proto crate (or the prost-generated types already pulled in by opentelemetry-otlp)
- Sync is idempotent: re-running after a partial failure is safe
- Should respect the same OTEL_EXPORTER_OTLP_* env vars as the real-time exporter
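As an illustration of the last note, the metrics URL for OTLP over HTTP could be resolved along these lines. The sketch follows the OpenTelemetry exporter specification (the signal-specific variable is used as-is; the generic one gets /v1/metrics appended; the default is http://localhost:4318). The function name is hypothetical, and whether the real-time exporter in tokf resolves the variables in exactly this way is an assumption.

```rust
/// Hypothetical helper: resolves the OTLP/HTTP metrics URL from environment
/// variables. `get` abstracts the environment lookup so the logic can be
/// exercised without touching the process environment.
fn resolve_metrics_endpoint<F>(get: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    // Treat empty values as unset.
    let lookup = |key: &str| get(key).filter(|v| !v.is_empty());

    // The signal-specific endpoint wins and is used exactly as given.
    if let Some(url) = lookup("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT") {
        return url;
    }

    // The generic endpoint is a base URL; the metrics path is appended to it.
    let base = lookup("OTEL_EXPORTER_OTLP_ENDPOINT")
        .unwrap_or_else(|| "http://localhost:4318".to_string());
    format!("{}/v1/metrics", base.trim_end_matches('/'))
}
```

In the sync command this would be called as resolve_metrics_endpoint(|k| std::env::var(k).ok()); other OTEL_EXPORTER_OTLP_* settings such as headers and timeout would need the same treatment.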
Acceptance Criteria
- tokf telemetry sync replays all synced_to_otel_at IS NULL events
- synced_to_otel_at updated in DB on successful export
- --dry-run flag prints events without sending
- --limit N limits batch size
- otel-http feature; documents gRPC limitation
Relation
Depends on: #85 (schema and synced_to_otel_at column already added)
Related: #87 (tokf telemetry status), #88 (docs)
Referenced in: ADR-0001, consequence "Option C remains open"