Skip to content

fix(health): make /api/health publicly accessible for Pandora watchdog#698

Open
7heM4ster wants to merge 3 commits into
builderz-labs:mainfrom
7heM4ster:fix/health-endpoint-clean
Open

fix(health): make /api/health publicly accessible for Pandora watchdog#698
7heM4ster wants to merge 3 commits into
builderz-labs:mainfrom
7heM4ster:fix/health-endpoint-clean

Conversation

@7heM4ster
Copy link
Copy Markdown

Fixes Pandora fleet blocker #2511.

Problem
Pandora's infrastructure watchdog tries to probe Mission Control's /health endpoint for monitoring, but MC's proxy authentication layer was returning 401 Unauthorized to all unauthenticated requests. This breaks health checks from:

  • Load balancers and monitoring systems (Datadog, etc.)
  • Upstream services without valid API credentials
  • Container orchestration platforms

Solution
Extended the proxy's public-path allowlist to exempt GET /api/health and GET /health from authentication, treating them like the existing legacy /api/status?action=health health probe.

Testing

  • Host: curl http://127.0.0.1:3000/api/health → 200 ✓
  • Host: curl http://127.0.0.1:3000/health → 200 ✓
  • Container (from Pandora): curl http://mission-control:3000/api/health → 200 ✓

Changes

  • src/proxy.ts: Added isPublicHealthRoute condition for /api/health and /health endpoints
  • src/app/api/health/route.ts: New endpoint (from prior commit)
  • src/app/health/route.ts: New bare /health alias (from prior commit)
  • package.json + Dockerfile: Expanded onlyBuiltDependencies for docker build reliability

Janitor and others added 3 commits May 17, 2026 16:02
…h claim race

Two gaps from the 2026-05-17 Pandora-side audit:

1. Pandora's MC watchdog probes at backend/api_digest.py:520, :2083 and
   background_tasks.py:2643 were hitting /health -> 307 /login and
   /api/health -> 401, so the digest baseline kept reporting MC as
   "error"/"unknown". Add unauthenticated GET handlers at
   src/app/api/health/route.ts and src/app/health/route.ts that
   return {status: "ok", db: "ok", task_count, ts} on a successful
   tasks-table query, or 503 {status: "degraded", db: "error"} on
   any DB failure. No row IDs / usernames / config leaked. Auth is
   route-local in this codebase (requireRole called per-handler), so
   simply omitting the call leaves the route public — no middleware
   edit needed.

2. dispatchAssignedTasks SELECTed tasks WHERE status='assigned' then
   UPDATEd status='in_progress' WHERE id=?, with no status guard on
   the UPDATE. Two dispatchers polling concurrently could both claim
   the same row. Tighten the UPDATE to add
   "AND status = 'assigned'" and check result.changes; on 0 changes,
   skip the task silently (the other dispatcher won the race). No
   event broadcast, no activity log, no work for the loser.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
…deps

pnpm 10 enforces an explicit-allowlist policy (ERR_PNPM_IGNORED_BUILDS)
for native-build dependency scripts. The deps layer was failing because
the allowlist only covered better-sqlite3, while next/sharp/swc/esbuild
+ friends also need to run install scripts. Two-pronged fix:

1) Dockerfile: pass --config.dangerouslyAllowAllBuilds=true on the
   container install so the layer never blocks on policy alone.
2) package.json pnpm.onlyBuiltDependencies: extend allowlist to
   @parcel/watcher, @swc/core, esbuild, sharp, unrs-resolver, vue-demi
   so the host (pnpm install outside Docker) also succeeds without
   manual approve-builds prompts.

Unblocks the Round 21 /api/health endpoint (commit b628add) from
deploying — Pandora watchdog now sees mission_control=up.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
…roxy)

The b628add commit shipped /api/health and /health routes but assumed
auth was route-local in this codebase. In Next.js 16 (MC v2.0.1), auth
runs through `src/proxy.ts` (renamed from middleware.ts per builderz-labs#88) and
gates every /api/* path that lacks a session cookie or API key — so
upstream watchdog probes from Pandora kept getting 401 Unauthorized
despite the route handlers themselves never calling requireRole.

Extend the proxy's public-path allowlist to also accept GET /api/health
and GET /health (the bare alias), alongside the legacy
/api/status?action=health probe. Health endpoints must never be auth-
gated — that's the whole point of an upstream watchdog probe.

Verified:
- host: curl http://127.0.0.1:3000/api/health -> 200 {"status":"ok",...}
- host: curl http://127.0.0.1:3000/health -> 200 {"status":"ok",...}
- Pandora container: http://mission-control:3000/api/health -> 200

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@7heM4ster 7heM4ster requested a review from 0xNyk as a code owner May 17, 2026 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant