Open
No due date
Last updated Jun 6, 2026

The v0.9.0 architectural release promotes CodeWhale from a turn/subagent workbench to a WhaleFlow workflow workbench. It adds typed branch-and-leaf workflows, pod-style background workflow monitoring, shared ARMH/RLM memoization, deterministic replay, external-memory evaluation, and a GEPA-style teacher/student promotion loop that turns validated lessons into a cached-main overlay.

Primary tracker: #2667 EPIC: v0.9.0 WhaleFlow branch/leaf workflow mode

In scope

  • WhaleFlow workflow mode: background workflow runs, /workflows-style monitoring, done/total progress, longest-running item peek, inspect/replay/report surfaces.
  • Typed Workflow IR as the source of truth: Starlark/YAML/generated plans compile to Rust-owned IR before execution.
  • Rust async executor: bounded branches, bounded leaves, cancellation, budgets, permissions, LoopUntil, Cond, Expand, BranchTournament, and Pareto reducers.
  • Branch/leaf semantics: isolated speculative branches, bounded leaves, losing-branch fruit harvesting, typed results.
  • ARMH/RLM integration: exact-context shared memo across branches with visible hit/miss/cost telemetry.
  • External-memory evaluation: decide whether Aleph-style memory belongs in core, optional plugins, or explicit workflow nodes, with visible state and clear/export controls.
  • TraceStore and deterministic replay: replay from recorded leaf/control outputs, not live model calls, unless explicitly allowed.
  • Teacher harness: TeacherReview proposes reusable lessons; StudentReplay and PromotionGate validate before promotion.
  • Cached-main overlay: promoted notes, workflows, tests, branch heuristics, model/cache policies, and prompt patches warm future runs without mutating Git main.
  • Janitor: stale invalidation, memo cleanup, candidate demotion, trace compaction, capacity enforcement.
  • Model-provider abstraction: workflow roles map to capabilities and configured providers; no workflow logic hardcodes Arcee, DeepSeek, Claude, tool calls, JSON mode, or large context.
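As an illustration of the deterministic-replay item above, here is a minimal sketch of replaying a node from recorded output and refusing live calls unless they are explicitly allowed. Every name in it (`TraceStore`, `replay_node`, `Source`, `ReplayError`, string node ids) is hypothetical and chosen for this example; it is not CodeWhale's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a leaf or control node in a workflow run.
type NodeId = String;

/// Hypothetical trace store: recorded node outputs keyed by node id.
#[derive(Default)]
pub struct TraceStore {
    recorded: HashMap<NodeId, String>,
}

impl TraceStore {
    /// Stores the output a node produced during the original run.
    pub fn record(&mut self, node: &str, output: &str) {
        self.recorded.insert(node.to_string(), output.to_string());
    }
}

/// Where a replayed output came from, so it can be surfaced to the user.
#[derive(Debug, PartialEq)]
pub enum Source {
    Recorded,
    Live,
}

#[derive(Debug, PartialEq)]
pub enum ReplayError {
    /// No recording exists and live calls were not allowed.
    MissingRecording(NodeId),
}

/// Returns the recorded output if present. Falls back to `live_call`
/// only when `allow_live` is true; otherwise reports the missing recording.
pub fn replay_node(
    store: &TraceStore,
    node: &str,
    allow_live: bool,
    live_call: impl FnOnce(&str) -> String,
) -> Result<(String, Source), ReplayError> {
    if let Some(output) = store.recorded.get(node) {
        return Ok((output.clone(), Source::Recorded));
    }
    if allow_live {
        return Ok((live_call(node), Source::Live));
    }
    Err(ReplayError::MissingRecording(node.to_string()))
}
```

In this sketch a replay with `allow_live` set to `false` never reaches the model: a missing recording is an error rather than a silent fallback, which is one way to keep replays reproducible.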

Non-goals

  • No model-weight RL in v0.9.0.
  • No arbitrary JS/Python as workflow source of truth.
  • No script-level async/await. Starlark is a pure graph builder; Rust executes IR.
  • No hidden external-memory dependency for normal CodeWhale operation.
  • No uncontrolled self-modifying agent. Teacher output is inspectable, replayed, and reversible.
  • No public performance claims until evals are reproducible.
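The "pure graph builder" non-goal above implies that a plan is plain data which Rust can inspect before anything runs. The sketch below assumes a much simplified, hypothetical IR (`Node`, `Limits`, `leaf_count`, `validate` are illustrative names, and the real node set with LoopUntil, Expand, BranchTournament, and reducers would be larger) and shows bounds being checked on the graph alone.

```rust
/// Hypothetical, simplified workflow IR: a plan is plain data, not code.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Leaf { role: String, prompt: String },
    Branch { children: Vec<Node> },
    Cond { when: String, then: Box<Node>, otherwise: Box<Node> },
}

/// Hypothetical static limits enforced before execution.
pub struct Limits {
    pub max_branch_width: usize,
    pub max_leaves: usize,
}

/// Upper bound on leaves: both arms of a `Cond` are counted.
pub fn leaf_count(node: &Node) -> usize {
    match node {
        Node::Leaf { .. } => 1,
        Node::Branch { children } => children.iter().map(leaf_count).sum(),
        Node::Cond { then, otherwise, .. } => leaf_count(then) + leaf_count(otherwise),
    }
}

/// Rejects any branch wider than `max_branch_width`, at any depth.
fn check_width(node: &Node, limits: &Limits) -> Result<(), String> {
    match node {
        Node::Leaf { .. } => Ok(()),
        Node::Branch { children } => {
            if children.len() > limits.max_branch_width {
                return Err(format!(
                    "branch has {} children, limit is {}",
                    children.len(),
                    limits.max_branch_width
                ));
            }
            children.iter().try_for_each(|c| check_width(c, limits))
        }
        Node::Cond { then, otherwise, .. } => {
            check_width(then, limits)?;
            check_width(otherwise, limits)
        }
    }
}

/// Validates a plan against the limits without executing any node.
pub fn validate(node: &Node, limits: &Limits) -> Result<(), String> {
    check_width(node, limits)?;
    let leaves = leaf_count(node);
    if leaves > limits.max_leaves {
        return Err(format!(
            "plan has {} leaves, limit is {}",
            leaves, limits.max_leaves
        ));
    }
    Ok(())
}
```

Because the plan carries no executable script code, checks like these can run at compile-to-IR time, before any budget is spent.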

Definition of done

  • workflows/rlm_cache_change.star runs with mock provider in CI and can dogfood CodeWhale RLM/ARMH/provider changes.
  • Branch/leaf engine, control flow, TraceStore, replay, ARMH shared memo, TeacherReview, StudentReplay, PromotionGate, overlay, and janitor have focused tests.
  • Workflow mode can run, inspect, and replay a workflow from CLI and TUI.
  • ARMH savings, provider costs, and any external-memory use are visible in workflow telemetry.
  • All behavior is behind config/feature flags until stable.
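To make the telemetry requirement above concrete, the following sketch shows one way an exact-context memo could count hits, misses, and saved cost. It is an assumption-laden illustration: `SharedMemo`, `MemoStats`, the abstract "cost units", and the use of the raw context string as the key are all hypothetical, and a memo actually shared across concurrent branches would also need synchronization (for example a `Mutex`), which is omitted here.

```rust
use std::collections::HashMap;

/// Hypothetical telemetry counters for the memo.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct MemoStats {
    pub hits: u64,
    pub misses: u64,
    /// Sum of the original cost of every call that a hit avoided.
    pub saved_cost_units: u64,
}

/// Hypothetical exact-context memo: the key is the full context string,
/// so any difference in context (even whitespace) is a miss.
#[derive(Default)]
pub struct SharedMemo {
    entries: HashMap<String, (String, u64)>, // context -> (output, cost paid)
    stats: MemoStats,
}

impl SharedMemo {
    /// Returns the cached output for `context`, or runs `compute`
    /// (which returns the output and its cost) and caches the result.
    pub fn get_or_compute(
        &mut self,
        context: &str,
        compute: impl FnOnce() -> (String, u64),
    ) -> String {
        if let Some((output, cost)) = self.entries.get(context) {
            self.stats.hits += 1;
            self.stats.saved_cost_units += *cost;
            return output.clone();
        }
        self.stats.misses += 1;
        let (output, cost) = compute();
        self.entries.insert(context.to_string(), (output.clone(), cost));
        output
    }

    /// Read-only view of the counters for workflow telemetry.
    pub fn stats(&self) -> &MemoStats {
        &self.stats
    }
}
```

Counting saved cost at the moment of each hit keeps the reported savings tied to calls that were actually avoided, rather than to an estimate made after the run.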

Release gate

  • Parity gates green on the v0.9.0 integration branch.
  • CHANGELOG [0.9.0] frames the release as WhaleFlow branch/leaf workflows and validated cached-main learning.
  • Docs explain the Claude-workflow-inspired UX while preserving CodeWhale's typed IR/Rust executor safety model.
9% complete
