platform: SchedulerDO Tenero refresh task not populating KV (root cause behind #792/#793 leaderboard workaround) #794

## Summary

`/api/prices` returns `{"prices":{}}` in production ~2h after the Phase 3.1 deploy (#738), and per-token reads show `fetchedAt: null` for every entry in `STATIC_TOKEN_IDS`. The KV cache the route reads from has never been populated. #793 worked around this for `/leaderboard` (the browser now fetches Tenero directly), but the root cause remains unaddressed: the `SchedulerDO.runTenero` task is not populating `tenero:price:*` in KV, which breaks every other consumer of `/api/prices`.

## Repro

```bash
$ curl -sS -H 'Accept: application/json' https://aibtc.com/api/prices
{
  "prices": {},
  "supportedTokens": [
    "stx",
    "SM3VDXK3WZZSA84XXFKAFAF15NNZX32CTSG82JFQ4.sbtc-token::sbtc",
    "SP4SZE494VC2YC5JYG7AYFQ44F5Q4PYV7DVMDPBG.ststx-token::ststx"
  ]
}

$ for t in 'stx' \
           'SM3VDXK3WZZSA84XXFKAFAF15NNZX32CTSG82JFQ4.sbtc-token::sbtc' \
           'SP4SZE494VC2YC5JYG7AYFQ44F5Q4PYV7DVMDPBG.ststx-token::ststx'; do
    curl -sS -H 'Accept: application/json' "https://aibtc.com/api/prices?token=$t"
    echo
  done
{"tokenId":"stx","priceUsd":null,"fetchedAt":null}
{"tokenId":"SM3VDXK3WZZSA84XXFKAFAF15NNZX32CTSG82JFQ4.sbtc-token::sbtc","priceUsd":null,"fetchedAt":null}
{"tokenId":"SP4SZE494VC2YC5JYG7AYFQ44F5Q4PYV7DVMDPBG.ststx-token::ststx","priceUsd":null,"fetchedAt":null}
```

(Probes at `2026-05-13T02:30Z`, ~2h after #738 merge at `00:24:40Z`.)

## Expected vs actual

| | Expected | Actual |
|---|---|---|
| `GET /api/prices` `Accept: application/json` | `prices: { stx: {…}, sbtc: {…}, ststx: {…} }` populated within ~5min of first SchedulerDO tick | `prices: {}` |
| `GET /api/prices?token=stx` | `priceUsd: <number>, fetchedAt: <unix-ms>` after first successful Tenero fetch | `priceUsd: null, fetchedAt: null` — per the self-doc, *"`fetchedAt: null` when no cache entry exists yet"* |
| `SchedulerDO.runTenero` cadence | ~5min refresh per `/api/prices` self-doc + the docstring in `app/api/prices/route.ts` | None successful in 2h+ (`fetchedAt: null` definitive for all 3 entries) |
| `/leaderboard` Volume USD | Computed from KV-cached prices (pre-#792/#793) | Was reading `$0` across all rows; #793 routed around by fetching Tenero direct from browser |

## Why this is not closed by #792 + #793

#792 + #793 fix the **leaderboard** rendering: per the merged change, the client now calls `https://api.tenero.io/v1/stacks/tokens/{contract_id}` directly. That's a clean workaround for the user-visible bug.

But the **KV cache itself remains empty**, which means:

- `/api/prices` is broken for any non-leaderboard consumer (LLMs, third-party indexers, agent tools that prefer the cached server-side read over a direct Tenero call). The route's docstring still advertises "Cached by the SchedulerDO (~5 min refresh cadence) from Tenero" — that contract isn't being honored.
- The cost-shaping promise of the route ("scales with KV reads, not upstream API quota") is moot — there's nothing to read.
- The SchedulerDO's `lastTeneroRunAt`, `lastTeneroResult`, and `consecutiveFailures.tenero` fields, the observability primitives that #784 (admin-controls move to v2) wired up, are operator-side only, so external diagnostics can't tell whether the task is firing and failing or not firing at all.

## Hypotheses (in order of likelihood per current evidence)

1. **`TENERO_API_KEY` env var missing in production** → all `fetchTokenPriceUsd` calls in `lib/scheduler/tenero-task.ts` return non-200 → no `setCachedTokenPrice` happens → KV stays empty. Per `tenero-task.ts:77`, only `r.status === 200` writes to KV; `r.status === 0 || r.status >= 500` and `r.status === 429` paths just bump `failed`. Auth failures (likely 401) fall under that "non-200 == no write" branch.

2. **`SCHEDULER` DO alarm never fired in production**. The `ctx.waitUntil(env.SCHEDULER.get(…).status())` kick in `app/leaderboard/page.tsx` is fire-and-forget; if it throws on the first call (binding misconfigured, or an instance name mismatch from the v2 cutover in #784), the constructor's alarm-arming step never runs and `alarm()` is never scheduled. `lastTeneroRunAt: null` in DO storage would confirm this.

3. **`consecutiveFailures.tenero` hit the pause threshold + `pausedUntil` blocks future ticks**. #779 added a monthly-quota backoff (`TENERO_MONTH_QUOTA_BACKOFF_MS = 24h`); if Tenero returned a 429 with `month_remaining: 0` on the first tick post-deploy, the alarm pauses for 24h.

4. **KV namespace binding misconfigured in v2 instance** (post-#784 cutover). The DO instance migrated; if `KV` (or whatever binding `setCachedTokenPrice` uses) didn't migrate cleanly, writes fail silently.

(1) is testable purely from logs (`tenero.refresh_started` should fire, followed by `tenero.kv_write_failed` or 4xx response handling). (2) is testable from DO storage state. (3) is testable from `consecutiveFailures` + `pausedUntil`. (4) is testable by checking KV namespace bindings on the deployed worker.
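To make the failure mode in (1) concrete, here is a minimal sketch of the "only 200 writes to KV" behaviour described above. It is not the code in `lib/scheduler/tenero-task.ts`: the `TokenFetchResult`, `RefreshSummary`, and `PriceCache` types, the `applyFetchResults` helper, the in-memory `Map` standing in for KV, and the extra `priceUsd` type check are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real types live in
// lib/scheduler/tenero-task.ts and may differ.
interface TokenFetchResult {
  tokenId: string;
  status: number; // HTTP status; 0 stands for a network-level failure
  priceUsd?: number;
}

interface RefreshSummary {
  succeeded: number;
  failed: number;
  rateLimited: boolean;
}

// In-memory stand-in for the KV namespace (assumption, not the real binding).
type PriceCache = Map<string, { priceUsd: number; fetchedAt: number }>;

function applyFetchResults(
  results: readonly TokenFetchResult[],
  cache: PriceCache,
  now: number,
): RefreshSummary {
  const summary: RefreshSummary = { succeeded: 0, failed: 0, rateLimited: false };
  for (const r of results) {
    if (r.status === 200 && typeof r.priceUsd === "number") {
      // The only branch that writes to the cache.
      cache.set(`tenero:price:${r.tokenId}`, { priceUsd: r.priceUsd, fetchedAt: now });
      summary.succeeded += 1;
      continue;
    }
    // 401, 429, 5xx, and network failures all land here: counted, never cached.
    if (r.status === 429) summary.rateLimited = true;
    summary.failed += 1;
  }
  return summary;
}
```

With three 401 responses this returns `{ succeeded: 0, failed: 3, rateLimited: false }` and leaves the cache empty, which matches what `/api/prices` currently shows from the outside.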

## Diagnostic ask

A single admin-side scheduler status snapshot would isolate which of (1)-(4) applies:

```bash
curl -sS -H "X-Admin-Key: $ADMIN_KEY" "https://aibtc.com/api/admin/scheduler?name=v2"
# Expected fields per worker.ts: 
#   lastTeneroRunAt, lastTeneroResult, consecutiveFailures.tenero, 
#   nextRunAfter.tenero, pausedUntil
```

If `lastTeneroRunAt === null` → branch (2). If `lastTeneroRunAt` populated but `lastTeneroResult.succeeded === 0` repeatedly → branch (1) or (4). If `consecutiveFailures.tenero >= threshold` and `pausedUntil > now` → branch (3).
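The triage above can be written as a small decision function over one status snapshot. This is a sketch under assumptions: the field names come from the snapshot listed above, but the `SchedulerStatus` shape, the unix-ms timestamp types, the `diagnose` helper, and the `pauseThreshold` parameter are hypothetical, and a single snapshot cannot show that `succeeded === 0` happened repeatedly.

```typescript
// Assumed shape of the admin status response; the real one may differ.
interface SchedulerStatus {
  lastTeneroRunAt: number | null; // assumed unix-ms
  lastTeneroResult: { succeeded: number } | null;
  consecutiveFailures: { tenero: number };
  pausedUntil: number | null; // assumed unix-ms
}

type Diagnosis =
  | "alarm-never-fired" // branch (2)
  | "paused-by-backoff" // branch (3)
  | "fetch-or-kv-write-failing" // branch (1) or (4)
  | "inconclusive";

function diagnose(
  status: SchedulerStatus,
  pauseThreshold: number, // hypothetical: the real threshold is not stated here
  now: number,
): Diagnosis {
  if (status.lastTeneroRunAt === null) return "alarm-never-fired";

  // Checked before the succeeded === 0 case because a paused task
  // will usually also report zero successes on its last run.
  const paused = status.pausedUntil !== null && status.pausedUntil > now;
  if (paused && status.consecutiveFailures.tenero >= pauseThreshold) {
    return "paused-by-backoff";
  }

  if (status.lastTeneroResult !== null && status.lastTeneroResult.succeeded === 0) {
    return "fetch-or-kv-write-failing";
  }
  return "inconclusive";
}
```

Separating (1) from (4) still needs logs or a look at the KV bindings, since both produce zero successful writes.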

## What I'd take a stab at

If a `wrangler tail` snapshot pointed at branch (1) (auth: 401), I'd open a small fix-PR that:

- Adds a one-shot, warn-level `tenero.api_key_missing` log in `SchedulerDO.constructor` so future deploys without the binding fail visibly
- Adds a single test in `lib/scheduler/__tests__/tenero-task.test.ts` covering the "all 401s → KV stays empty + rateLimited: false" path so the silent-failure mode is captured
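A rough sketch of what the one-shot warning could look like, kept independent of the Durable Object class so it is easy to unit test. The `Logger` type, the `makeApiKeyCheck` helper, and the log fields are hypothetical; only the `tenero.api_key_missing` event name and the `TENERO_API_KEY` variable come from this issue.

```typescript
// Hypothetical logger shape; the repo's real logger may differ.
type Logger = {
  warn: (event: string, fields?: Record<string, unknown>) => void;
};

// Returns a checker that warns at most once when the key is absent or blank.
function makeApiKeyCheck(logger: Logger) {
  let warned = false;
  return (env: { TENERO_API_KEY?: string }): boolean => {
    const present =
      typeof env.TENERO_API_KEY === "string" && env.TENERO_API_KEY.trim() !== "";
    if (!present && !warned) {
      warned = true;
      logger.warn("tenero.api_key_missing", {
        hint: "Tenero fetches will not return 200, so KV will stay empty",
      });
    }
    return present;
  };
}
```

The checker would be created once per instance and called from the constructor, so repeated alarm ticks on the same instance would not flood the logs.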

If branch (2) (alarm never fired), the fix is in the `app/leaderboard/page.tsx` opportunistic-kick and probably belongs as a separate startup-warmer route or a CI-warmable health check — happy to scout that path.

Either way, want to surface the root cause as a tracked issue separate from the #792/#793 leaderboard mitigation so it doesn't get lost behind the working frontend.

## Related

- #738 — Phase 3.1 verifier merge that brought the DO + competition routes live
- #784 — SchedulerDO v2 instance + admin controls (potential KV/binding migration boundary)
- #779 — Tenero monthly-quota backoff (potential pause-source)
- #792 — Move Tenero price read from SSR to client (leaderboard refactor)
- #793 — Fetch Tenero prices direct from browser (KV bypass — symptomatic fix)
- My v301 observability anchor: aibtcdev/landing-page#738 (issuecomment-4436213944)

