feat(docker): self-contained Docker Compose stack with hot-reload dev mode#646
feat(docker): self-contained Docker Compose stack with hot-reload dev mode#646nnnet wants to merge 1 commit into
Conversation
… mode ## Summary Adds a fully self-contained Docker Compose stack for Mission Control plus a hot-reload dev variant. - **Production** (`docker-compose.yml` + `Dockerfile`) — multi-stage image, runs the standalone Next.js server. - **Development** (`docker-compose-dev.yml` + `Dockerfile.dev`) — bind-mounts `src/`, `public/`, `messages/` and runs `next dev`. Source edits hot-reload without rebuilding the image. - **Makefile** — minimal docker compose front-end with positional service args (`make build mission-control-dev`, `make logs`, `make ps`, `make clean`) and a `MODE=dev|prod` switch. - **`.env.example`** — documents the supported runtime environment surface (auth, gateway, providers, host session modes). - **Host config projection** — operators who already have `~/.local/bin/claude`, `~/.bun`, `~/.claude` on the host can bind-mount them into the container without rebuilding the image. The Makefile references an optional `docker-compose-openclaw.yml`; that file lives in a follow-up PR. Until that PR lands, `make MODE=prod up` and `make up` work against just `docker-compose*.yml`. (We can swap the Makefile to use `wildcard` if reviewers prefer it standalone-first — happy to amend.) ## Why this is useful upstream Today, anyone wanting to run Mission Control in containers has to assemble their own Dockerfile and compose. This standardises that surface so `docker compose up -d` "just works" for both quick eval and ongoing development, and the dev variant gives contributors a fast inner loop without touching their host Node version. ## Test plan - `make build && make up` — production stack comes up at `http://127.0.0.1:7012`. - `make MODE=dev up` — dev container starts; edits under `src/` reload without rebuild. - `make logs mission-control` — logs stream. - `make down` — clean teardown. ## Provenance This PR squashes the in-flight history from our fork (`nnnet/mission-control`) into one cohesive commit. Original commit chain: - `2825225 feat(docker): self-contained docker stack with host config projection` - `2f3c4f3 feat(docker): hot-reload dev compose stack` Plus follow-up Makefile simplification (`62daf0a` — kept the simpler form here; the stack of intermediate Makefile-iteration commits stays in our fork for archaeology).
0xbrainkid
left a comment
There was a problem hiding this comment.
Blocking this one: the submitted stack references files that are not present in the PR, so the documented/default Makefile path cannot run.\n\nExact repro from the PR branch:\ndocker compose -f docker-compose-dev.yml -f docker-compose-openclaw.yml config --quiet\n→ open .../docker-compose-openclaw.yml: no such file or directory\n\nAlso, both compose files mount ./scripts/openclaw-cli-shim.py, but that file is not included in the branch either. The new Makefile defaults to -f docker-compose-openclaw.yml, so make up / make build / default make are broken until those files are added or the references are made optional/removed.\n\nNon-blocking gate: docker compose -f docker-compose-dev.yml config --quiet passes for the dev compose file by itself.
…aude CLI (#647) ## Summary Four small, independent bug fixes: 1. **CSP nonce → next-themes hydration.** The CSP nonce was not forwarded to `next-themes`' `ThemeProvider`, so theme hydration was blocked under our default CSP and the dashboard rendered with the wrong theme on the first paint. `src/proxy.ts` now propagates the nonce; `src/app/layout.tsx` consumes it. 2. **`/api/sessions/continue` works on shared host Claude sessions.** When operators bind-mount their host `~/.claude/projects` into the container (a common pattern with the host-config projection in PR #646), the resolver couldn't open sessions whose `.jsonl` lived under that mount. Path resolver now accepts both container-local and host-projected paths. 3. **Tighter active-session window + drop orphan rows.** MC was treating any session whose `.jsonl` was touched recently as "alive" — even after the underlying Claude process had long stopped. The window is tighter now and orphan rows are filtered. 4. **Chat input stays anchored.** The chat input lost focus on every message because the whole transcript re-rendered. We anchor the input and patch the transcript instead. ## Why this is useful upstream These are user-visible defects that anyone running MC behind a CSP or sharing host Claude state hits within minutes of using the chat panel. None depend on each other or on any other in-flight PR. ## Test plan - Visit the dashboard with a strict CSP — theme matches the user's preference on first paint, no hydration errors in console. - Bind-mount host `~/.claude/projects` into the dev container, open a session that exists only on the host — `/api/sessions/continue` resolves it. - Watch `/api/sessions` after killing a Claude process — the orphan row disappears within the new window. - Send several chat messages — input keeps focus across each reply. ## Provenance Squashes our fork's commits: - `9ac39a6 fix(csp): pass nonce to next-themes ThemeProvider so hydration works` - `d7c10b2 fix(chat): /api/sessions/continue works on shared host claude sessions` - `8463a70 fix(chat): tighter active-session window + remove orphan rows` - `2fce07a fix(chat): keep input anchored, reply renders into transcript`
|
Thanks — closing this one only because it overlaps with #649, which adds the same baseline ( I've left detailed review notes on #649 — please address those there and we can land the consolidated stack via that PR instead of having two PRs that touch the same files. If you'd rather keep the no-OpenClaw self-contained dev stack as a smaller separate PR (and that's a valid argument — operators who don't need OpenClaw shouldn't have to read 2,900 lines to evaluate the basic compose setup), please rebase this on top of Closing for now in favor of #649. |
…bs#318) (#61) * Remove incorrect Task Board placeholder screenshots from README (builderz-labs#595) Five feature sections (Memory, Skills, Cost Tracking, Security, Cron) referenced doc images that were all identical Task Board screenshots instead of the actual feature UIs. Remove the broken image references until correct screenshots are captured and committed. * fix: gateway HTTPS/reverse-proxy, .env loading, DELETE pipeline, OpenCode part table - Fix gateway disconnection when accessing MC via HTTPS URL (closes builderz-labs#603): - buildGatewayWebSocketUrl now respects NEXT_PUBLIC_GATEWAY_REVERSE_PROXY=1 to use wss:// for localhost when behind a reverse proxy - Login page auto-selects wss:// preset when served over HTTPS - Added 'wss://gateway:18789' to login presets for container deployments - Preserves user-explicit wss:// protocol for localhost URLs (reverse proxy) - Fix .env not loaded by pnpm start/dev (closes builderz-labs#603): - start-standalone.sh now sources .env (consistent with Docker entrypoint) - Added prominent note in .env.example about NEXT_PUBLIC_* build-time baking - Documented NEXT_PUBLIC_GATEWAY_REVERSE_PROXY env var - Fix DELETE /api/pipelines returning 400 Request body required (closes builderz-labs#604): - DELETE handler now accepts id from query string OR request body - Frontend sends ?id=N, handler previously required JSON body - Fix OpenCode transcript showing only token counts (closes builderz-labs#593): - readOpenCodeTranscript now queries the part table when available (OpenCode >= 1.4 stores content in part, not message.data) - Falls back to inline message.data.content for older versions - Added test for part-table path with text and tool parts * fix: exclude .git from output file tracing to fix self-update (builderz-labs#619) When `output: 'standalone'` is enabled, Next.js's file tracer copies the entire repo `.git/` directory (~14 MB) into `.next/standalone/`, because `src/app/api/releases/update/route.ts` references the `git` binary via `execFileSync`. This makes Git treat `.next/standalone/` as its own working tree. When the self-update endpoint runs `git status --porcelain` from `process.cwd()` (which is `.next/standalone/` under `pnpm start:standalone`), Git reports every file the standalone build does not bundle (most of `src/lib/__tests__/`, etc.) as deleted, so the dirty-tree pre-flight check fails and self-update is permanently blocked even on a clean install. Adding `./.git/**/*` to `outputFileTracingExcludes` keeps `.git` out of the standalone output, which lets the existing `git status` pre-flight walk up to the real repo root and behave correctly. Repro on v2.0.1 install: pnpm install --frozen-lockfile && pnpm build ls -la .next/standalone/.git # full git dir copied curl -X POST -H "X-API-Key: \$K" -d '{"targetVersion":"v9.9.9-fake"}' \\ http://localhost:3000/api/releases/update # -> 409 dirty: src/lib/__tests__/* listed as deleted After the fix: rm -rf .next && pnpm build ls -la .next/standalone/.git # not present # update endpoint now passes the dirty-tree check on a clean checkout Co-authored-by: Kieran Hume <11814847+kieranhume@users.noreply.github.com> * fix: persist POST /api/tokens records to token_usage SQLite table (builderz-labs#620) The `POST /api/tokens` handler currently writes only to the JSON file at `tokensPath`. The corresponding aggregation endpoints (`/api/tokens/by-agent`, the dashboard's per-agent cost widget) read from the `token_usage` SQLite table instead. As a result, externally posted token records (e.g. from custom workers using direct CLI integration via `POST /api/connect`) never reach the by-agent breakdown — the dashboard shows zero usage even when the JSON file has dozens of records. This patch makes the POST handler also INSERT into `token_usage`, mirroring the columns used by `src/lib/task-dispatch.ts` for tasks that route through MC's internal dispatcher. The JSON write remains the canonical record (dedupe + cap behaviour unchanged); the DB INSERT is wrapped in try/catch so a SQLite failure can't break a successful API request. Repro on `main` before patch: curl -X POST -H "X-API-Key: $K" -H "Content-Type: application/json" \ -d '{"model":"qwen3-coder:30b","sessionId":"test:abc", "inputTokens":10,"outputTokens":5}' \ /api/tokens # -> {"success": true, ...} written to JSON curl /api/tokens/by-agent?days=1 # -> {"agents": []} ← empty, even though usage exists After patch: curl /api/tokens/by-agent?days=1 # -> {"agents": [{"agent": "test", "total_tokens": 15, ...}]} Schema-aligned with v2.0.1's actual `token_usage` columns (model, session_id, input_tokens, output_tokens, created_at, workspace_id, task_id, cost_usd, agent_name). No total_tokens / cost columns referenced — those are added in later migrations and absent on older instances. Co-authored-by: Kieran Hume <11814847+kieranhume@users.noreply.github.com> * fix(gateway-url): downgrade http(s)://localhost inputs to ws:// for the gateway WebSocket (builderz-labs#629) The gateway runs on plain HTTP locally; only a reverse proxy that explicitly terminates TLS in front of it speaks wss://. Before this change, buildGatewayWebSocketUrl preserved any user-set protocol on localhost inputs — including http:// and https://, which are not valid WebSocket schemes. A user passing host='https://127.0.0.1:18789' got 'https://127.0.0.1:18789' back verbatim instead of the expected 'ws://127.0.0.1:18789'. This is an additive tightening on top of the reverse-proxy fix in a020d1b — that change already routes the NEXT_PUBLIC_GATEWAY_REVERSE_PROXY=1 path to wss:// when the browser is on HTTPS, and that behavior is preserved here. The only new constraint is: for localhost, http:// and https:// (and bare ws://) collapse to ws://; wss:// stays wss:// as the operator's explicit TLS-terminator opt-in. Tests: - New: 'downgrades http://localhost to ws://' covers the plain HTTP case (http://localhost:18789 → ws://localhost:18789). - New: 'preserves explicit wss:// on localhost (reverse-proxy TLS opt-in)' pins the reverse-proxy contract — wss://127.0.0.1:18789 passes through unchanged. - Pre-existing 'downgrades https://127.0.0.1 to ws:// when no reverse proxy is configured' continues to assert the https://localhost path. All 17 cases in src/lib/__tests__/gateway-url.test.ts pass; full vitest suite and tsc --noEmit are green. * fix: send user id in DELETE request body (builderz-labs#631) Fixes builderz-labs#614 - Delete user from UI fails with 'Request body required' * fix(validation): drop default values from updateTaskSchema (builderz-labs#621) `updateTaskSchema = createTaskSchema.partial()` makes every field optional but Zod's `.partial()` keeps the underlying `.default(...)`, so a PUT body that omits e.g. `status` parses to `status: 'inbox'`, `tags: []`, `metadata: {}`. The route's `if (field !== undefined)` guards then silently overwrite the stored row — a `PUT {title}` resets status to 'inbox' and wipes tags/metadata. Define `updateTaskSchema` explicitly with `.optional()` and no defaults. Extract the field validators into a shared `taskFields` map so the two schemas don't drift. Defaults remain on `createTaskSchema` only. Adds regression tests covering empty body, single-field updates, pass-through, and constraint enforcement. * fix(agents): flatten gateway_config into buildAgentConfig overrides (builderz-labs#601) buildAgentConfig expects a flat overrides shape, but the POST /api/agents template branch was passing the nested gateway_config object with `as any`. That silenced a real mismatch: overrides.model was {primary: "<id>"}, and the function assigned the whole object to config.model.primary. Agents created from a template got a doubly-wrapped primary in the DB, crashing the Models and Config tabs with "Objects are not valid as a React child (found: object with keys {primary})". Extract the leaf fields at the call site so the primary string survives and the other template overrides (identity, sandbox, subagents) actually apply. Unit test locks in that overrides.model yields a string primary. Fixes builderz-labs#599 * fix: resolve CSRF origin mismatch behind reverse proxies (builderz-labs#592) The CSRF check only compared the Host header against the browser's Origin, causing false positives when reverse proxies (e.g. Nginx Proxy Manager) rewrite or strip the Host header or its port. - Check all host candidates (x-forwarded-host, x-original-host, etc.) instead of only the Host header - Add port-normalized comparison that strips default ports (80/443) before matching * fix: preserve gateway auth for saved urls (builderz-labs#642) Co-authored-by: Terry Le <terry@example.com> * fix: add logout button to user menu (builderz-labs#632) * fix: send user id in DELETE request body Fixes builderz-labs#614 - Delete user from UI fails with 'Request body required' * fix: add logout button to user menu Fixes builderz-labs#612 - No logout button in dashboard UI Add logout button to the user menu (avatar dropdown) in NavRail's ContextSwitcher. Calls POST /api/auth/logout then redirects to /login. * fix(tasks): reconcile async gateway dispatch completions (builderz-labs#655) Replace Mission Control task dispatch gateway calls with the programmatic websocket RPC path and persist async run metadata for deferred reconciliation. Keep agent.wait timeout results pending, promote terminal completions to review, and recover assistant response text from OpenClaw transcripts when terminal payloads do not include text. Add focused regression coverage for websocket gateway RPC, pending timeout handling, terminal promotion, missing run IDs, and transcript-backed result recovery. * feat(dispatch): direct multi-provider dispatch (Anthropic / OpenAI / local OpenAI-compatible) (builderz-labs#648) ## Summary Adds a direct-dispatch path for tasks so MC no longer requires the OpenClaw gateway to reach LLMs. When the gateway isn't available — most commonly on Linux deployments where the macOS-native gateway can't be installed — MC now calls provider HTTP APIs itself. ### Supported direct providers | Provider | Env variable | Notes | |---|---|---| | **Anthropic** | `ANTHROPIC_API_KEY` | Smart Opus/Sonnet/Haiku selection via `classifyDirectModel()` heuristics. | | **OpenAI** | `OPENAI_API_KEY` | Models like `openai/gpt-4o-mini`, `openai/gpt-5-nano` selectable per-agent. | | **Any OpenAI-compatible local** | `LOCAL_LLM_ENDPOINT`, optional `LOCAL_LLM_API_KEY` | LMStudio, Ollama, vLLM — anything serving `/v1/chat/completions`. | Each agent's `agent_config.dispatchModel` overrides the heuristics (e.g. set `openai/gpt-4o-mini` to lock that agent to OpenAI). ### Hardened gateway-availability check `isGatewayAvailable()` previously returned `true` whenever `~/.openclaw` was a non-empty path string — which is the **default** even when OpenClaw isn't installed. On Linux this caused `useDirectApi = !true && … = false`, so MC fell through to `runOpenClaw(...)` and crashed with `spawn openclaw ENOENT`. The check now requires physical evidence: - A real `openclaw.json` exists at `config.openclawConfigPath`, **or** - A `gateways` row whose status is in `{online, healthy, ready}` (the seed `unknown` row planted by onboarding doesn't count). ## Why this is useful upstream Dropping the macOS-only constraint is the single biggest barrier to running MC in cloud / Linux dev environments. With this change, an operator can `git clone`, set `ANTHROPIC_API_KEY`, and dispatch tasks immediately — no gateway, no Mac, no extra processes. ## Walkthrough `examples/MULTI-PROVIDER-DEMO.md` is a field-by-field walkthrough that builds a three-agent team using all three direct providers (Anthropic for the lead, OpenAI for the worker, local LMStudio for the reviewer) and runs an end-to-end task. ## Test plan - Run on a Linux host with no OpenClaw installed; confirm `/agents` shows "No gateways installed" and dispatch still works. - Set only `ANTHROPIC_API_KEY` and dispatch a task — logs show "Dispatching task via direct Claude API". - Set only `OPENAI_API_KEY`, set agent `dispatchModel: openai/gpt-4o-mini`, dispatch — logs show OpenAI direct. - Bring up LMStudio at `127.0.0.1:1234`, set `LOCAL_LLM_ENDPOINT=http://host.docker.internal:1234/v1`, dispatch with `dispatchModel: local/<your-model>` — logs show local direct. - With OpenClaw running and a healthy `gateways` row, dispatch — `useDirectApi=false`, gateway path used as before. Backwards-compatible. ## Provenance Squashes: - `70328ad feat(dispatch): direct OpenAI + OpenAI-compatible local providers` - `762ed51 fix(dispatch): make MC self-sufficient without OpenClaw gateway` - `8def310 docs(deployment): document host-config projection, direct providers, host session modes` - `bdd86ea docs(examples): add multi-provider team walkthrough` - `c7627c7 docs(examples): expand multi-provider demo into a field-by-field walkthrough` * fix(chat,csp): nonce hydration + chat session continuity with host Claude CLI (builderz-labs#647) ## Summary Four small, independent bug fixes: 1. **CSP nonce → next-themes hydration.** The CSP nonce was not forwarded to `next-themes`' `ThemeProvider`, so theme hydration was blocked under our default CSP and the dashboard rendered with the wrong theme on the first paint. `src/proxy.ts` now propagates the nonce; `src/app/layout.tsx` consumes it. 2. **`/api/sessions/continue` works on shared host Claude sessions.** When operators bind-mount their host `~/.claude/projects` into the container (a common pattern with the host-config projection in PR builderz-labs#646), the resolver couldn't open sessions whose `.jsonl` lived under that mount. Path resolver now accepts both container-local and host-projected paths. 3. **Tighter active-session window + drop orphan rows.** MC was treating any session whose `.jsonl` was touched recently as "alive" — even after the underlying Claude process had long stopped. The window is tighter now and orphan rows are filtered. 4. **Chat input stays anchored.** The chat input lost focus on every message because the whole transcript re-rendered. We anchor the input and patch the transcript instead. ## Why this is useful upstream These are user-visible defects that anyone running MC behind a CSP or sharing host Claude state hits within minutes of using the chat panel. None depend on each other or on any other in-flight PR. ## Test plan - Visit the dashboard with a strict CSP — theme matches the user's preference on first paint, no hydration errors in console. - Bind-mount host `~/.claude/projects` into the dev container, open a session that exists only on the host — `/api/sessions/continue` resolves it. - Watch `/api/sessions` after killing a Claude process — the orphan row disappears within the new window. - Send several chat messages — input keeps focus across each reply. ## Provenance Squashes our fork's commits: - `9ac39a6 fix(csp): pass nonce to next-themes ThemeProvider so hydration works` - `d7c10b2 fix(chat): /api/sessions/continue works on shared host claude sessions` - `8463a70 fix(chat): tighter active-session window + remove orphan rows` - `2fce07a fix(chat): keep input anchored, reply renders into transcript` * feat(docker): expose NEXT_PUBLIC_* as build args (builderz-labs#643) Browser-side env vars must be baked at Next.js build time. Adding ARG/ENV pairs in the build stage allows operators to configure the gateway URL via docker build --build-arg or compose.yml args without forking the image. Backward-compatible: defaults remain empty, behaviour unchanged when no build args are passed. Discovered via Project EIGHTBALL deployment with separate subdomains for gateway and dashboard. Co-authored-by: Daniel Rose <sella-media-hq@Mac.fritz.box> * chore: update antropic model pricing (builderz-labs#644) * chore: update antropic model pricing * test: update token pricing test * fix(security): detect homoglyph + zero-width + ROT13/URL/base64 bypasses; migrate device key to non-extractable IndexedDB (builderz-labs#657) Closes builderz-labs#574 and builderz-labs#576. This is a corrected take on PR builderz-labs#580 — that PR identified the right shape of the fix (Layer 1 normalization + IndexedDB CryptoKey) but landed it by deleting the entire 18-rule detection layer and removing the documented graceful fallback. This PR keeps the rules and the fallback intact. ## injection-guard.ts (builderz-labs#576) Adds a normalization layer that runs IN FRONT of the existing 18 rules: scanForInjection(input) now scans: 1. The original input (preserves all existing test expectations) 2. The Layer-1-normalized input (homoglyphs collapsed, zero-width stripped, NFKC compatibility decomposition) 3. ROT13 / URL-decoded / base64-decoded variants when heuristics suggest the input is encoded Matches across variants are deduplicated by `rule`, so the same attack reports once. Each match carries a `variant` field for audit logging. Specifically defends against the four PoCs in builderz-labs#576: - Cyrillic / Greek / fullwidth-Latin homoglyphs - Zero-width / bidi-override insertion - ROT13 ('cyrnfr rkrphgr ...' → 'please execute ...') - URL / base64 wrapping A new exported constant RULE_COUNT is asserted >= 18 in tests so future refactors cannot silently drop the rule layer (the failure mode of builderz-labs#580). All 18 existing detection rules, all existing exports (scanForInjection, noInjection, injectionRefinement, sanitizeForShell, sanitizeForPrompt, scanAndLogInjection, escapeHtml), and all original tests are preserved unchanged. The normalization helpers are also exported as standalone functions (normalizeConfusables, removeInvisibleChars, decodeRot13, detectRot13, generateDecodingVariants) so callers can normalize without running the full scan when they only need one piece. decodeVariants:false option lets hot paths (per-token streaming filters) opt out of the encoded-variant pass to bound latency. ## device-identity.ts (builderz-labs#574) Migrates the Ed25519 private key from plaintext localStorage to a non-extractable WebCrypto CryptoKey persisted in IndexedDB. An XSS that gets script execution can no longer exfiltrate the raw key bytes — crypto.subtle.sign() still works, crypto.subtle.exportKey() does not. Migration is idempotent and explicit: v1 (plaintext localStorage) → STORAGE_KEY_VERSION absent v2 (IndexedDB CryptoKey) → STORAGE_KEY_VERSION === '2' On first load after upgrade, getOrCreateDeviceIdentity(): 1. Re-imports the v1 PKCS8 bytes as a non-extractable CryptoKey 2. Persists into IndexedDB 3. Removes the v1 plaintext key from localStorage 4. Sets STORAGE_KEY_VERSION = '2' Crucially this PR PRESERVES the documented graceful fallback that the original code had (and that builderz-labs#580 dropped): - When IndexedDB is unavailable (older browsers, restrictive private mode), createNewIdentity() falls back to the v1 storage shape so the gateway handshake still works in auth-token-only mode. - The DeviceIdentity interface adds a `storageMode` field ('indexeddb-cryptokey' | 'localstorage-fallback') so the UI can surface a warning when the secure path isn't available. - clearDeviceIdentity() stays synchronous (the IndexedDB delete is fire-and-forget) so the WebSocket onmessage handler in src/lib/websocket.ts can keep its existing call shape. Tests: 64 passing (18 original + 46 new bypass-detection tests covering the homoglyph/zero-width/ROT13/URL/base64 PoCs from builderz-labs#576, plus RULE_COUNT regression guard, plus standalone helper coverage). Based on PR builderz-labs#580 by @Kecstacy — the IndexedDB CryptoKey shape and the Cyrillic confusables idea were good; this PR rebuilds those pieces without dropping the existing detection layer. Co-authored-by: Nyk <0xnykcd@gmail.com> * fix(openclaw-doctor): single-flight + 30s TTL cache to prevent CPU/RAM spikes (builderz-labs#658) Closes builderz-labs#613. Problem ------- `GET /api/openclaw/doctor` spawns a Node subprocess that allocates ~300-600 MB and runs at 37-51 % CPU. The dashboard banner + the onboarding modal + multiple browser tabs polling concurrently could produce 6+ simultaneous subprocesses on a 4 GB host (issue builderz-labs#613 observed combined CPU ~89 %, RAM ~3.5 GB). Fix --- Two layers of mitigation, GET-only: 1. Single-flight: if a doctor invocation is already in flight, concurrent callers share its eventual result — never spawn a second subprocess while one is running. 2. TTL cache: cache the last response for MC_DOCTOR_TTL_MS (30 s default; settable via env, 0 disables caching for tests). Subsequent GETs within the window return the cached payload. Cache invalidation ------------------ - `POST /api/openclaw/doctor` (the --fix path) calls `invalidateDoctorCache()` after a successful fix so the freshly- fixed state surfaces immediately rather than waiting out the TTL. - "Not installed" errors are NOT cached — the operator may install OpenClaw mid-session and we want the next poll to pick that up. - The POST/--fix subprocess is intentionally uncoalesced because operators clicking "Re-check" want a guaranteed fresh run. Response headers ---------------- - `X-Doctor-Cache: hit` | `miss` so dashboards can confirm the coalescing is working. - `X-Doctor-Age-Ms: <n>` on cache hits. Tests ----- 6 new tests in src/lib/__tests__/openclaw-doctor-route.test.ts: - Concurrent GETs collapse to a single runOpenClaw invocation - Cache hits within TTL skip the subprocess - TTL=0 forces re-run on every GET - "Not installed" errors are not cached - Unauthenticated requests reject before subprocess spawn - invalidateDoctorCache() drops the cache on demand Co-authored-by: Nyk <0xnykcd@gmail.com> * fix(recurring-tasks): include HH:MM in child title for sub-daily crons (builderz-labs#659) Closes builderz-labs#616. Problem ------- Recurring tasks with sub-daily cron schedules (e.g. `0 * * * *` hourly, `*/5 * * * *` every-5-min) only spawned once per calendar day. All subsequent spawns within the same day were silently skipped. Root cause: `spawnRecurringTasks` derives child titles from `formatDateSuffix()` which only emitted day-level granularity (`MMM DD`). The duplicate-prevention guard at recurring-tasks.ts:75-79 then matches on `(title, workspace_id, project_id)` and finds the previous spawn, returning early. Fix --- 1. New `isSubDailyCron(cronExpr)` helper returns true when the minute or hour field is anything other than a single concrete number (covers wildcards, ranges, lists, steps). 2. `formatDateSuffix(now, subDaily)` now takes an explicit `now` and a `subDaily` flag: Daily / weekly / monthly: "Apr 24" (unchanged) Sub-daily: "Apr 24, 13:35" Same-day spawns at 13:00 and 14:00 now produce different titles and the duplicate-guard correctly admits both. 3. Daily/weekly/monthly title format is preserved so existing operator workflows don't see a title-churn migration. Compatibility ------------- - `isCronDue()` already prevents same-minute double-spawns via the `lastSpawnedAtMs` check, so this PR doesn't loosen the dedup — it just stops the dedup from firing on the *wrong* layer. - The `formatDateSuffix` signature now takes optional arguments with sane defaults; existing call sites still work. Tests ----- 13 new tests in src/lib/__tests__/recurring-tasks.test.ts: - `isSubDailyCron` recognizes hourly / every-5-min / every-minute / ranged-minute crons; rejects daily/weekly; defensive on malformed. - `formatDateSuffix` returns `MMM DD` (daily) and `MMM DD, HH:MM` (sub-daily); zero-pads day-of-month. - Regression: two hourly spawns on the same day produce different titles. - Regression: every-5-min spawns within the same hour produce different titles. - Daily cron preserves historical "MMM DD" shape. Co-authored-by: Nyk <0xnykcd@gmail.com> * test(fork-skip): pre-existing upstream bug in task-dispatch direct-dispatch detection Skip `does not send heuristic model overrides when the agent has a configured default model` — the assertion assumes the gateway-dispatch path, but `isDirectDispatchAvailable()` is unconditionally true because `getLocalEndpoint()` returns a hardcoded `http://host.docker.internal:1234/v1` default when `LOCAL_LLM_ENDPOINT` is unset, so dispatch always takes the direct-API branch and `callOpenClawGateway` is never invoked. Verified across 4 angles before applying the skip: - Fails on `upstream/main` HEAD (85215c5) in our checkout - Fails in a fresh worktree-isolated upstream checkout with no fork code - builderz-labs/mission-control quality-gate CI is also red on this commit - Root cause traced to upstream src/lib/task-dispatch.ts:640 Expect this skip to drop out cleanly on a future rebase once upstream patches either `getLocalEndpoint` or `isDirectDispatchAvailable` (touched only the test, not the source, to keep the upstream-touch contract surface untouched). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(fork-skip): 4 pre-existing upstream E2E regressions Skip 4 E2E tests where upstream commits brought in via the rebase broke upstream-owned tests without updating them. Pattern identical to the unit-test fork-skip in the prior commit: - tests/delete-body.spec.ts:11 — pipelines DELETE returns "Pipeline ID required" instead of "body required" since upstream accepted ID via query string - tests/delete-body.spec.ts:73 — same regression, with body.error undefined for query-string-only DELETE - tests/agent-costs.spec.ts:74 — POST /api/tokens token totals doubled (expected 700, received 1400) - tests/agent-costs.spec.ts:101 — task-cost attribution doubled (expected 400, received 800) Verified that these test files are byte-identical to pre-rebase main (matching blob hashes), AND that pre-rebase main's quality-gate CI was green on the prior tip. The breakage is upstream code drift, not test drift on our side. Upstream's own builderz-labs CI is also red on the quality-gate at 85215c5. Touching only the test files (not the source), keeping the upstream- touch contract surface unchanged. Each skip is annotated with the specific regression and will drop out cleanly on a future rebase once upstream aligns code and tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Zakir Jiwani <zakirjiw@gmail.com> Co-authored-by: Nyk <0xnykcd@googlemail.com> Co-authored-by: kieranhume <kieranhume@gmail.com> Co-authored-by: Kieran Hume <11814847+kieranhume@users.noreply.github.com> Co-authored-by: Tyler Vo <toanntq@gmail.com> Co-authored-by: nmsn <136696700@qq.com> Co-authored-by: Gal Longin <galongin@users.noreply.github.com> Co-authored-by: codingDr <18496633+codingDr@users.noreply.github.com> Co-authored-by: HoshimiRIN <1181127934@qq.com> Co-authored-by: tl2811 <tonle@gearment.com> Co-authored-by: Terry Le <terry@example.com> Co-authored-by: ChanchoNezz <fernando.nunez0413@gmail.com> Co-authored-by: nnnet <zkowkmdx@sharklasers.com> Co-authored-by: YoungSella <155443947+YoungSella@users.noreply.github.com> Co-authored-by: Daniel Rose <sella-media-hq@Mac.fritz.box> Co-authored-by: Oskar <53711292+oskarkocol@users.noreply.github.com> Co-authored-by: nyk <93952610+0xNyk@users.noreply.github.com> Co-authored-by: Nyk <0xnykcd@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds a fully self-contained Docker Compose stack for Mission Control plus a hot-reload dev variant.
docker-compose.yml+Dockerfile) — multi-stage image, runs the standalone Next.js server.docker-compose-dev.yml+Dockerfile.dev) — bind-mountssrc/,public/,messages/and runsnext dev. Source edits hot-reload without rebuilding the image.make build mission-control-dev,make logs,make ps,make clean) and aMODE=dev|prodswitch..env.example— documents the supported runtime environment surface (auth, gateway, providers, host session modes).~/.local/bin/claude,~/.bun,~/.claudeon the host can bind-mount them into the container without rebuilding the image.The Makefile references an optional
docker-compose-openclaw.yml; that file lives in a follow-up PR. Until that PR lands,make MODE=prod upandmake upwork against justdocker-compose*.yml. (We can swap the Makefile to usewildcardif reviewers prefer it standalone-first — happy to amend.)Why this is useful upstream
Today, anyone wanting to run Mission Control in containers has to assemble their own Dockerfile and compose. This standardises that surface so
docker compose up -d"just works" for both quick eval and ongoing development, and the dev variant gives contributors a fast inner loop without touching their host Node version.Test plan
make build && make up— production stack comes up athttp://127.0.0.1:7012.make MODE=dev up— dev container starts; edits undersrc/reload without rebuild.make logs mission-control— logs stream.make down— clean teardown.Provenance
This PR squashes the in-flight history from our fork (
nnnet/mission-control) into one cohesive commit. Original commit chain:2825225 feat(docker): self-contained docker stack with host config projection2f3c4f3 feat(docker): hot-reload dev compose stackPlus follow-up Makefile simplification (
62daf0a— kept the simpler form here; the stack of intermediate Makefile-iteration commits stays in our fork for archaeology).