Releases: johnkf5-ops/the-dev-squad

v0.4.3 — TV in the Office, ping-pong polish, sprite facing fix

19 Apr 19:41

Office gets a working TV

The wall-mounted TV sprite now plays a real animated GIF channel whenever a sprite is on the couch. 24 channel GIFs ship under public/sprites/tv/ plus a static.gif interstitial. Each channel airs for 7s, then 1s of static, then the next channel.
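The timing described above can be sketched as a pure function of elapsed time. This is a minimal illustration, not the shipped code: `tvFrameAt`, `TvFrame`, and the "off" state when the couch is empty are assumptions; only the 7s channel and 1s static durations come from the notes.

```typescript
// Durations from the release notes: 7s of channel, then 1s of static.
const CHANNEL_MS = 7000;
const STATIC_MS = 1000;
const SLOT_MS = CHANNEL_MS + STATIC_MS;

// Hypothetical frame description; "off" when nobody is on the couch is an assumption.
type TvFrame =
  | { kind: "off" }
  | { kind: "channel"; slot: number }
  | { kind: "static"; slot: number };

// elapsedMs is the time since the TV started playing.
function tvFrameAt(elapsedMs: number, couchOccupied: boolean): TvFrame {
  if (!couchOccupied || elapsedMs < 0) return { kind: "off" };
  const slot = Math.floor(elapsedMs / SLOT_MS); // how many channels have aired so far
  const offset = elapsedMs % SLOT_MS;           // position inside the current 8s slot
  return offset < CHANNEL_MS ? { kind: "channel", slot } : { kind: "static", slot };
}
```

The `slot` counter only says how many channels have aired; which GIF fills each slot is decided by the rotation logic.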

Channel rotation uses a bag shuffle (Spotify-style): every channel airs exactly once per deck before any repeats. When the deck empties, it is reshuffled with Fisher-Yates, and the first item of the fresh deck is swapped out if it would repeat the channel that just aired.
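A bag shuffle with a back-to-back guard might look roughly like the sketch below. All names (`createChannelDeck`, `fisherYates`, `Rng`) are hypothetical, and swapping the first card with a random later slot is one possible way to do the swap; the notes do not say which slot the real code picks.

```typescript
type Rng = () => number; // returns a float in [0, 1), like Math.random

// Standard Fisher-Yates shuffle on a copy of the input.
function fisherYates<T>(items: readonly T[], rng: Rng = Math.random): T[] {
  const deck = items.slice();
  for (let i = deck.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [deck[i], deck[j]] = [deck[j], deck[i]];
  }
  return deck;
}

// Returns a next() function that deals every channel once per deck.
function createChannelDeck(channels: readonly string[], rng: Rng = Math.random): () => string {
  if (channels.length === 0) throw new Error("need at least one channel");
  let deck: string[] = [];
  let last: string | undefined;
  return function next(): string {
    if (deck.length === 0) {
      deck = fisherYates(channels, rng);
      // Guard: a fresh deck must not open with the channel that just aired.
      if (deck.length > 1 && deck[0] === last) {
        const k = 1 + Math.floor(rng() * (deck.length - 1)); // assumed: any later slot
        [deck[0], deck[k]] = [deck[k], deck[0]];
      }
    }
    const channel = deck.shift() as string;
    last = channel;
    return channel;
  };
}
```

Passing the random source in as a parameter keeps the rotation testable with a seeded generator.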

Ping-pong polish

  • Two-player only: both spots must be free, or the activity is skipped.
  • Both sprites depart together as a group and share a single return-time so the rally doesn't end solo.
  • Right-side player position tweaked so they're no longer drawn under the table.

Sprite facing fix

PixelSprite.getRow had 'left' and 'right' mapped to the wrong rows. Verified against the actual sprite sheet (row 2 is left-facing art, row 3 is right-facing art) and swapped the mapping. Sprites now face the direction they're walking.
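For illustration, a corrected mapping could be sketched as follows. Only the left and right rows come from the notes; treating row numbers as 0-based, the rows chosen for "down" and "up", and the helpers `facingFromStep` and `frameRect` are all assumptions. The 32×32 frame size and 8-column layout match the sheet format given in the v0.4.1 notes.

```typescript
const FRAME = 32;   // frame size in pixels (256x128 sheet, 8 columns x 4 rows)
const COLUMNS = 8;

type Facing = "down" | "up" | "left" | "right";

const FACING_ROW: Record<Facing, number> = {
  down: 0,  // assumed
  up: 1,    // assumed
  left: 2,  // per release notes: row 2 is left-facing art
  right: 3, // per release notes: row 3 is right-facing art
};

function getRow(facing: Facing): number {
  return FACING_ROW[facing];
}

// Hypothetical helper: derive facing from a movement step in screen coordinates (y grows downward).
function facingFromStep(dx: number, dy: number, fallback: Facing = "down"): Facing {
  if (dx === 0 && dy === 0) return fallback;
  if (Math.abs(dx) >= Math.abs(dy)) return dx < 0 ? "left" : "right";
  return dy < 0 ? "up" : "down";
}

// Hypothetical helper: source rectangle on the sprite sheet for a given animation frame.
function frameRect(facing: Facing, frame: number) {
  return { x: (frame % COLUMNS) * FRAME, y: getRow(facing) * FRAME, w: FRAME, h: FRAME };
}
```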

Notes

  • No API or pipeline behavior changes — pure office UI.
  • No new dependencies.

See CHANGELOG.md for full history.

v0.4.2 — Office overlap fix, TypeScript baseline cleared

19 Apr 17:25

Bug-fix and cleanup release. No agent or pipeline behavior change.

Fixed

  • Office wander overlap. Agents could land on top of each other in the Lunar Office Scene because the wander avoidance only considered other workers' idle positions — it ignored workers at home or at a phase override. Surfaced when Eddie was added in v0.4.1: his couch home (1080, 485) coincided exactly with a couch idle spot, so wanderers could walk right onto him.
    • New computeOccupiedPositions helper considers every other worker's actual current position (phase override > idle > home).
    • New isCandidateBlocked helper preserves the existing group-spot exception (couch / hookah / pingpong allow same-label stacking) but blocks exact-coord collisions unconditionally.
    • Both wander loops in LunarOfficeScene.tsx now use the helpers.
  • TypeScript baseline cleared. All 10 pre-existing `tsc --noEmit` errors fixed:
    • 6 × TS5097 (.ts import extension): added `"allowImportingTsExtensions": true` to tsconfig.json (one-line fix).
    • 2 × TS2322 (securityMode widening in orchestrator.ts) — annotated the let declaration as 'fast' | 'strict'.
    • 1 × TS2345 (Docker child stdin: null in runner.ts) — cast child as unknown as RunnerChild with a comment explaining nothing on the docker path reads stdin.
    • 1 × TS2345 ('pingpong' not in sendGroup signature in LunarOfficeScene.tsx) — widened the parameter type to include 'pingpong'.
    • `tsc --noEmit` now exits with zero errors.
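The two helpers from the wander overlap fix can be sketched as below. This is a guess at their shape, not the code in LunarOfficeScene.tsx: the `WorkerState` and `Spot` types are hypothetical, and the rule that non-group spots sharing a label block each other is an assumption about the pre-existing behavior.

```typescript
type Point = { x: number; y: number };
type Spot = Point & { label?: string };

// Hypothetical per-worker state.
interface WorkerState {
  id: string;
  home: Point;
  idle?: Spot;           // current idle-wander spot, if any
  phaseOverride?: Point; // position forced by the current pipeline phase, if any
}

// Precedence from the notes: phase override > idle > home.
function currentPosition(w: WorkerState): Spot {
  return w.phaseOverride ?? w.idle ?? w.home;
}

// Actual current position of every worker except the one that is choosing a spot.
function computeOccupiedPositions(workers: readonly WorkerState[], selfId: string): Spot[] {
  return workers.filter((w) => w.id !== selfId).map(currentPosition);
}

const GROUP_LABELS = new Set(["couch", "hookah", "pingpong"]);

function isCandidateBlocked(candidate: Spot, occupied: readonly Spot[]): boolean {
  return occupied.some((o) => {
    // Exact-coordinate collisions block unconditionally.
    if (o.x === candidate.x && o.y === candidate.y) return true;
    // Assumed legacy rule: same-label spots block each other unless the label is a group spot.
    if (candidate.label === undefined || o.label !== candidate.label) return false;
    return !GROUP_LABELS.has(candidate.label);
  });
}
```

With this shape, a worker sitting at home on (1080, 485) blocks a couch idle spot at the same coordinates, while a neighboring couch seat stays available.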

Removed

  • Dead code in LunarOfficeScene.tsx:
    • pickIdlePositions function — a deterministic seed-based wander picker that lost to the timer-based wander system that ships today.
    • gatherSpots array — planned cluster spots that lost to the idleSpots label-based group system that ships today.
    • Both were defined and never called. Deletion is safe — no compile or runtime change.

Notes

  • All four test files still pass (pipeline-runtime, supervisor-snapshot, hook-contract, audit-findings).
  • Pure cleanup release. No new features, no roadmap movement.

v0.4.1 — Agent E lives in the Office

19 Apr 17:14

Cosmetic polish on top of v0.4.0. Eddie (white pirate sprite) now appears in the Lunar Office Scene whenever the Security Audit toggle is on. No behavior, API, or pipeline changes — same audit flow shipped in v0.4.0, now visually represented in the office UI.

What's new

  • New sprite: agent_e.png — white pirate, generated from agent_b (gray) by recoloring 5 body shades to white tones. Same 256×128 / 32×32 / 8×4 format so PixelSprite renders it identically. Source pack is CC0 Foozle Scallywag Pirates.
  • Office behavior:
    • When the Security Audit toggle is ON: Eddie chills on the couch by the TV (homePosition 1080, 485). He participates in the existing idle-wander system with the rest of the team.
    • During the security-audit phase: Eddie takes A's seat at the planner desk (184, 345 facing back — pixel-perfect overlap with where A normally sits). A walks down to the ping-pong area and waits there.
    • After the audit (deploy / complete / reset): Eddie returns to the couch, A returns to the desk via the existing deploy override.
  • Toggle OFF: Eddie does not render. Office layout is identical to pre-v0.4.0.

Code touch points

  • public/sprites/agent_e.png (new)
  • src/components/mission/PixelSprite.tsx: spriteFile gets auditor
  • src/components/mission/LunarOfficeScene.tsx: WorkerId += 'auditor' plus all derived Records, workers array, homePositions, new phasePositions['security-audit'] entry, agentSpeechPool E entries, idle-positions state and reset, two wander loops gated on the new runFinalAudit prop, render filter, useEffect deps
  • src/app/page.tsx — pass runFinalAudit + agentSpeech('E') to the scene

Verified

  • Manual smoke: toggle on, Eddie appears on couch; toggle off, Eddie disappears
  • Tests: pipeline-runtime, supervisor-snapshot, hook-contract, audit-findings all pass
  • tsc: 10 errors, identical to pre-existing baseline (zero new)

Known cleanup, deferred

  • pickIdlePositions function and gatherSpots array in LunarOfficeScene.tsx are defined but never called. Will be swept out in a follow-up commit.

v0.4.0 — Optional Security Audit (Agent E)

18 Apr 19:03

This release adds an optional Security Audit phase driven by a new agent (E) and replaces the dormant Docker sandbox plan with an honest "not a sandbox" stance.

What's new

  • Agent E (Security Auditor): read-only OWASP-class audit that runs after the tester reports tests passing, only when the user toggles "Security Audit" on at build start. Findings are severity-ranked (critical / high / medium / low) calibrated by exploitability and prerequisites.
  • Pause-and-decide flow: pipelineStatus = 'awaiting-audit-decision' after E's verdict. The orchestrator exits cleanly. User actions (Send to C / Dismiss / Deploy) respawn the orchestrator via the same pattern as resumePipelineRun.
  • Per-finding scoped fix loop (user-initiated only): "Send to C" runs C with a focused fix-only-this prompt, D verifies the existing tests still pass, then E re-audits only that finding. Status flows through Open → Sent to C → Re-auditing → Resolved or Still Open. No automatic loops.
  • Security Audit panel in the Office View: findings list with per-finding action buttons, chat with E, gated Deploy now button with confirmation modal. A/B/C/D panels squish to make room while S stays full size; layout reverts when audit isn't active.
  • New POST /api/audit-action endpoint enforces a serialization gate (auditActionInFlight) and validates state preconditions.
  • New files: pipeline/role-e.md, src/components/agents/SecurityAuditPanel.tsx, src/app/api/audit-action/route.ts, scripts/test-audit-findings.mjs.
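The per-finding status flow and the serialization gate can be illustrated with a small sketch. The status names follow the flow listed above, but the string values, the `transition` and `createAuditGate` helpers, and the assumption that a Still Open finding can be sent to C again (user-initiated) are not taken from the source.

```typescript
type FindingStatus = "open" | "sent-to-c" | "re-auditing" | "resolved" | "still-open";

// Allowed transitions: Open -> Sent to C -> Re-auditing -> Resolved or Still Open.
const NEXT: Record<FindingStatus, readonly FindingStatus[]> = {
  "open": ["sent-to-c"],
  "sent-to-c": ["re-auditing"],
  "re-auditing": ["resolved", "still-open"],
  "resolved": [],
  "still-open": ["sent-to-c"], // assumption: the user may send it to C again
};

function transition(from: FindingStatus, to: FindingStatus): FindingStatus {
  if (!NEXT[from].includes(to)) {
    throw new Error(`invalid transition: ${from} -> ${to}`);
  }
  return to;
}

// Minimal in-flight gate in the spirit of auditActionInFlight:
// only one audit action may run at a time.
function createAuditGate() {
  let inFlight = false;
  return {
    tryAcquire(): boolean {
      if (inFlight) return false; // a second action is rejected, not queued
      inFlight = true;
      return true;
    },
    release(): void {
      inFlight = false;
    },
  };
}
```

A route handler would typically call `tryAcquire()` first, reject the request if it returns false, and call `release()` in a `finally` block once the action finishes.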

What's deprecated

  • Sandboxed/isolated execution is no longer an active roadmap item. The runner abstraction (pipeline/runner.ts) and DockerRunner code remain in the tree for narrow cases, but Claude Code subscription auth inside containers is too unreliable to make sandboxed execution a default. If you need OS-level isolation today, run The Dev Squad inside a VM you own.
  • Removed: SANDBOX-RUNNER-PLAN.md, SUPERVISOR-BUILD-PLAN.md, UI-AND-HEADLESS-PLAN.md.

Docs

  • README: 6 agents (added E with optional callout), new badges (no API key, built with Claude Code, runs on subscription, GitHub stars/forks, last commit, open issues, TypeScript, macOS), updated Team table, new Phase 3.5 Security Audit section, updated Controls reference, new audit JSON example.
  • ARCHITECTURE.md, SECURITY.md, SECURITY-ROADMAP.md: rewritten to add E and remove the sandbox promise. SECURITY-ROADMAP repositions E as the near-term safety layer; v0.5 (host-owned policy) is the next planned phase, no ship date.
  • TODO.md: dropped dead Docker hardening items, added v0.5 gate.
  • pipeline/checklist-template.md: rewrote Phase 4b for the new user-controlled review/fix-loop/deploy flow.
  • All role files updated. role-c, role-d, role-s mention the optional audit involvement; role-e covers chat, severity ranking, and re-audit modes.

Tests

  • Hook contract: 10 new E permission checks (read allowed; write/edit/bash/web/agent denied). 35 checks total, all pass.
  • Pipeline runtime: E auto-resumes for security-audit phase; new resume prompt asserted.
  • Supervisor snapshot: 4 new scenarios (audit running, awaiting-decision with findings, action-in-flight, clean-audit awaiting deploy).
  • New scripts/test-audit-findings.mjs validates AUDIT_SCHEMA shape and AuditFinding state-machine transitions.

Full diff: 38 files changed, +1818 / −1262.

v0.3.17 — Markdown rendering and existing repo guidance

08 Apr 16:11

Highlights

  • Fixed markdown and newline rendering across both Office View and Squad View
  • Added a shared markdown renderer so agent conversations preserve formatting instead of showing flattened raw text
  • Added README and architecture guidance for working on existing repos through the Supervisor-first flow
  • Bumped the app to v0.3.17

Notes

  • Existing projects are supported today by telling the Supervisor you are working on an existing codebase and letting the Planner build context from the real repo
  • This release closes the markdown formatting gap and makes the current existing-project workflow easier to understand

v0.3.16 — Planner write-step recovery

08 Apr 16:02

Highlights

  • Fixed the planner write-step recovery path so a stalled write no longer keeps resuming the same poisoned session
  • If A stalls after entering the plan write step and the file still does not exist, the planner now restarts a fresh write turn from the captured research summary instead of looping the same bad session
  • Verified with a live heavy planning smoke: the planner reached the write step, created plan.md, and moved into self-review
  • Added a README and architecture note that large planning runs can take 10 to 15 minutes or longer for bigger builds
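The recovery decision described in the highlights can be sketched as a small pure function. The `PlannerStall` and `RecoveryAction` shapes and the `chooseRecovery` name are hypothetical; the real orchestrator may track this state differently.

```typescript
// Hypothetical snapshot of a stalled planner turn.
interface PlannerStall {
  sessionId: string;
  enteredWriteStep: boolean; // did A reach the plan write step before stalling?
  planFileExists: boolean;   // does plan.md exist on disk?
  researchSummary?: string;  // research summary captured before the write step
}

type RecoveryAction =
  | { kind: "resume-session"; sessionId: string }
  | { kind: "fresh-write-turn"; researchSummary: string };

function chooseRecovery(stall: PlannerStall): RecoveryAction {
  // Stalled mid-write with no file on disk: resuming would replay the same bad
  // session, so start a fresh write turn from the captured research summary.
  if (stall.enteredWriteStep && !stall.planFileExists && stall.researchSummary) {
    return { kind: "fresh-write-turn", researchSummary: stall.researchSummary };
  }
  // Otherwise fall back to a normal resume of the existing session.
  return { kind: "resume-session", sessionId: stall.sessionId };
}
```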

Notes

  • This release specifically targets the plan.md write-loop failure mode reported in the planner stalling issue
  • Planning can still be slow on large, research-heavy builds, but the old write-step failure did not reproduce after the fix

v0.3.15 — Bash-aware stall detection

05 Apr 02:49

What's fixed

False stalls during long bash commands

The stall detector watched for event gaps on stdout, but long-running bash commands (xcodebuild, package resolution, SDK downloads) produce no stdout events while executing. A 10-minute build looked identical to a dead session.

Fix: The orchestrator now tracks bash tool lifecycle — when a Bash tool_use fires, it marks bash as in-flight. The stall watcher skips idle checks until the tool_result comes back. Stderr output also resets the idle timer.
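A watcher along these lines could look like the sketch below. The event shapes are simplified assumptions rather than the orchestrator's real stream format, and `createStallWatcher` is a hypothetical name; timestamps are passed in explicitly so the logic stays deterministic.

```typescript
// Simplified, assumed event shapes.
type StreamEvent =
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; toolUseId: string }
  | { type: "stdout" }
  | { type: "stderr" };

function createStallWatcher(idleTimeoutMs: number) {
  let lastActivity = 0;
  const bashInFlight = new Set<string>(); // ids of Bash tool calls awaiting a result

  return {
    observe(event: StreamEvent, now: number): void {
      lastActivity = now; // any event, including stderr output, resets the idle timer
      if (event.type === "tool_use" && event.name === "Bash") bashInFlight.add(event.id);
      if (event.type === "tool_result") bashInFlight.delete(event.toolUseId);
    },
    isStalled(now: number): boolean {
      // While a Bash command is running, silence is expected: skip the idle check.
      if (bashInFlight.size > 0) return false;
      return now - lastActivity > idleTimeoutMs;
    },
  };
}
```

Tracking in-flight calls by id rather than with a single boolean is a design choice of this sketch; it keeps the check correct if more than one Bash call is ever outstanding.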

Agent D gets auto-resume

D (Tester) was the only agent that couldn't be auto-resumed. If D stalled during code-review or testing, the pipeline just gave up. Now D gets the same 3-retry treatment as A and B.

D won't spiral on environment setup

D was spending 15+ minutes trying to download iOS SDKs, Metal toolchains, and simulator runtimes instead of reporting that the build failed. The role file now explicitly tells D to never install or download platform tools — report missing tools as a finding and test with what's available.

Full changelog

v0.3.14...v0.3.15

v0.3.14 — Fix auto mode for Coder and Tester

05 Apr 02:25

What's fixed

Auto mode was broken for agents C and D

The runner set CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 for the Coder and Tester agents. This env var silently kills --permission-mode auto — the CLI rejects it immediately with "Auto mode is unavailable for your plan" and exits with code 1, even on Max plans.

Root cause: Auto mode's AI safety classifier requires network traffic that this flag blocks. The flag was originally added for Docker isolation (minimal network in containers), but it breaks host runs.

Fix: Removed the flag entirely from both host and Docker runner paths.

This was the actual cause behind issue #2 for users who do have auto mode access: even with the correct plan, C and D would always fail on the first tool call.

Full changelog

v0.3.13...v0.3.14

v0.3.13 — Stall resilience for large plans

05 Apr 02:07

What's new

Agents get more time and more chances

Opus needs time to generate large outputs (800+ line Swift plans, thorough reviews). The orchestrator now waits longer before declaring a stall, and retries more aggressively before giving up.

| Setting              | Before    | After     |
| -------------------- | --------- | --------- |
| Idle timeout         | 2 minutes | 5 minutes |
| Auto-resume attempts | 1         | 3         |

Reviewer told to be concise

The plan reviewer (B) was stalling because it tried to summarize the entire plan back before giving its verdict. The prompt now tells B to skip the summary and just give the verdict — approved or list the issues. On stall recovery, B is told to output the JSON verdict immediately.

Why this matters

The consistent failure pattern was: agent reads large content → goes quiet generating a big response → marked stalled after 2 minutes → one retry fails → pipeline gives up. With 5 minutes of patience and 3 retries, most of these will land.
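The retry budget can be sketched as a small loop. This is a synchronous simplification under assumed names (`runWithAutoResume`, `TurnOutcome`); the real orchestrator is asynchronous and decides "stalled" with its idle timer.

```typescript
const MAX_AUTO_RESUMES = 3; // new setting from this release (previously 1)

type TurnOutcome = "completed" | "stalled";

// runTurn is a stand-in for "run or resume the agent and report how it ended".
function runWithAutoResume(
  runTurn: (resumeCount: number) => TurnOutcome,
  maxResumes: number = MAX_AUTO_RESUMES,
): { outcome: TurnOutcome; resumesUsed: number } {
  let resumesUsed = 0;
  while (true) {
    const outcome = runTurn(resumesUsed);
    // Stop on success, or once the resume budget is exhausted.
    if (outcome === "completed" || resumesUsed >= maxResumes) {
      return { outcome, resumesUsed };
    }
    resumesUsed++;
  }
}
```

With a budget of 3, an agent gets its initial turn plus up to three resumes before the pipeline gives up.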

Full changelog

v0.3.12...v0.3.13

v0.3.12 — Supervisor concept chat & UI authority

05 Apr 01:31

What's new

Supervisor actually thinks during concept phase

The Supervisor now engages via Claude when you're exploring ideas before starting a build. Ask questions, brainstorm, get feedback — S responds intelligently instead of repeating canned templates. Only the very first message gets auto-captured as the concept; everything after is a real conversation.

Markdown rendering in chat panels

S, A/B/C/D panels, and expanded modals now render markdown — headers, bold, lists, code blocks, and dividers display properly instead of showing raw ** and ## symbols.

UI toggles are the sole authority

Security Mode, Permission Mode, and Supervisor Goal toggles in the dashboard are now the only source of truth. Saying "start full build in strict mode" to S no longer overrides what you set in the UI — S triggers the action, the config comes from the toggles.
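The split between "chat triggers the action" and "toggles supply the config" can be illustrated as follows. Everything here is hypothetical: the `UiToggles` shape, the `buildRunRequest` name, and the regex, which is only a stand-in for however S actually decides that the user wants to start a run.

```typescript
type SecurityMode = "fast" | "strict";

// Hypothetical snapshot of the dashboard toggles.
interface UiToggles {
  securityMode: SecurityMode;
  permissionMode: string;
  supervisorGoal: string;
}

interface RunRequest {
  action: "start-run";
  config: UiToggles;
}

function buildRunRequest(chatMessage: string, toggles: UiToggles): RunRequest | null {
  // Stand-in intent check: the message only decides WHETHER to start a run.
  const wantsStart = /\bstart\b/i.test(chatMessage);
  if (!wantsStart) return null;
  // The config is copied from the toggles; words like "strict" in the chat are ignored.
  return { action: "start-run", config: { ...toggles } };
}
```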

Conversation preserved on pipeline start

The concept-phase conversation (your chat with S) is no longer wiped when the pipeline starts. Events, sessions, and usage carry over into the build.

Cleaner orchestrator startup

  • Removed misleading "resuming from checkpoint" messages on fresh starts
  • "Finishing the research pass" message only shows on actual resumes
  • Orchestrator correctly distinguishes fresh starts from resumes

Other fixes

  • Execution Path card shows "IDLE" when no run is active instead of Docker alpha info
  • Permission mode threaded through S's start-run path

Full changelog

v0.3.11...v0.3.12