fix: remove lastIterationHandledMessage flag, reorder prompt to action-first

ccchow · claude · ccchow · commit 6b8dcb373217 · 2026-03-07T11:52:20.000-08:00
The deferred-exit flag caused send_message loops: the LLM kept replying
without taking action. With "acknowledge last" ordering, unacknowledged
messages already prevent auto-exit (pendingMessages.length > 0), so the
flag is no longer needed. Prompt guidelines now: take action first,
acknowledge last, and use send_message only to answer questions or
explain decisions.
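
A minimal sketch of the exit check described above, for illustration only.
The types and the shouldExitLoop helper are hypothetical simplifications;
the real check is inline in runAutopilotLoop and reads pending messages via
getUnacknowledgedMessages.

```typescript
// Hypothetical, simplified types; the real ones live in backend/src/autopilot.ts.
interface LoopState {
  allNodesDone: boolean;
}

interface UserMessage {
  id: string;
  content: string;
}

// Exit only when every node is done and no user message is still unacknowledged.
function shouldExitLoop(state: LoopState, pendingMessages: UserMessage[]): boolean {
  return state.allNodesDone && pendingMessages.length === 0;
}

// Simulated unacknowledged-message queue for one blueprint.
let pending: UserMessage[] = [{ id: "m1", content: "Add a login page" }];

// Iteration 1: the LLM acts on the message. It is still unacknowledged,
// so the loop keeps running even if all nodes are done.
const exitBeforeAck = shouldExitLoop({ allNodesDone: true }, pending);

// Iteration 2: the LLM acknowledges "m1" after acting, which empties the queue.
pending = pending.filter((m) => m.id !== "m1");
const exitAfterAck = shouldExitLoop({ allNodesDone: true }, pending);
```

Because the acknowledgement comes last, the pending message itself holds the
loop open, and no extra per-iteration flag is required.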

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -99,7 +99,7 @@ Full lists: [`docs/CODING-GOTCHAS.md`](docs/CODING-GOTCHAS.md), [`docs/TESTING-G
 - **No auto-completion of blueprints**: Blueprints do NOT auto-transition to `"done"` when all nodes finish. The LLM must explicitly call `complete()` (or the user must take action). This applies everywhere: `runAutopilotLoop`, `maybeFinalizeBlueprint`, and `executeNextNode`. The `maybeFinalizeBlueprint` helper only resets stuck "running" blueprints back to "approved" — it never sets "done". The `complete` action in `executeDecision` has a structural guard: it rejects with `"active_nodes"` error if any nodes are still `"running"` or `"queued"`, and logs a warning if unacknowledged messages exist.
 - **Autopilot test mock ordering with reflections**: `mockRunSession` is shared between decision calls and reflection calls (`reflectAndUpdateMemory`). Reflections happen every `REFLECT_EVERY_N` (5) iterations, on pause, and on error. Tests with 5+ loop iterations must insert reflection response mocks at the right positions. For persistent `mockImplementation`, filter on prompt content: reflections contain `"reflecting"`, global memory contains `"global autopilot"` or `"global strategy"`.
 - **AI operations in plan-operations.ts**: `enrichNodeInternal`, `reevaluateNodeInternal`, `splitNodeInternal`, `smartDepsInternal`, `reevaluateAllInternal` are extracted from route handlers. Called by `plan-routes.ts` in manual mode only. In autopilot/FSD mode, these endpoints create a user message via `createAutopilotMessage` and call `triggerAutopilotIfNeeded` instead — the autopilot loop handles the request via its tool palette. `runWithRelatedSessionDetection` helper also lives here.
-- **Autopilot tool palette**: `autopilot.ts` uses read tools (`get_node_titles`, `get_node_details`, `get_node_handoff`), message tools (`read_user_messages`, `acknowledge_message`, `send_message`) instead of sub-agent AI operations. `send_message(content)` creates an "assistant"-role message visible in BlueprintChat. `AutopilotNodeState` is lightweight (no `description` or `suggestions` — fetched on-demand). Unacknowledged user messages are also injected directly into the prompt at each iteration (via `buildAutopilotPrompt`'s `userMessages` param) so the LLM sees them even without calling `read_user_messages`. The auto-exit condition (`allNodesDone && !pendingMessages`) is deferred for one iteration after message-related actions (`acknowledge_message`, `send_message`, `read_user_messages`) so the LLM can act on message content before the loop exits.
+- **Autopilot tool palette**: `autopilot.ts` uses read tools (`get_node_titles`, `get_node_details`, `get_node_handoff`), message tools (`read_user_messages`, `acknowledge_message`, `send_message`) instead of sub-agent AI operations. `send_message(content)` creates an "assistant"-role message visible in BlueprintChat. `AutopilotNodeState` is lightweight (no `description` or `suggestions` — fetched on-demand). Unacknowledged user messages are injected directly into the prompt at each iteration (via `buildAutopilotPrompt`'s `userMessages` param) so the LLM sees them even without calling `read_user_messages`. The prompt instructs the LLM to take action first, then `acknowledge_message` last — the unacknowledged message naturally prevents auto-exit (`pendingMessages.length > 0`) and keeps context in the prompt until the LLM has acted.
 - **triggerAutopilotIfNeeded helper**: `plan-routes.ts` has a `triggerAutopilotIfNeeded(blueprintId)` helper that checks if blueprint is in autopilot/FSD mode and no loop is running, then enqueues `runAutopilotLoop`. Used by message endpoint and AI operation endpoints to wake the autopilot when needed.
 - **Autopilot pause/resume flow**: When resuming from a safeguard pause, the resume handler must clear `pauseReason` and set `status: "running"` (both via API and optimistically in local state). `runAutopilotLoop` also clears `pauseReason` on start. The PUT endpoint's `switchingToAutopilot` only fires when `executionMode` changes FROM non-autopilot, so re-entering autopilot from a paused-autopilot state uses `runAllNodes` instead. The pause/resume UI is now inside `BlueprintChat` (not standalone `PauseBanner`).
 - **BlueprintChat replaces generator section**: The blueprint detail page uses `BlueprintChat` component instead of the old generator textarea + action buttons. It also subsumes `PauseBanner` (inline pause messages with Resume button) and `AutopilotLog` (interleaved log entries). The standalone `PauseBanner` and `AutopilotLog` components still exist for potential reuse but are no longer rendered on the blueprint detail page.
diff --git a/backend/src/autopilot.ts b/backend/src/autopilot.ts
@@ -762,10 +762,10 @@ Do NOT call complete() while unacknowledged messages exist — the user may be r
 ${userMessages.map((m) => `- [${m.id}] ${m.content}`).join("\n")}
 
 **Required**: For each message, follow this order:
-1. send_message(content) — reply to the user first: confirm what you understood and what you plan to do.
-2. Take action: create_node/batch_create_nodes for feature requests, run_node for tasks, etc.
-3. acknowledge_message(messageId) — mark as handled ONLY AFTER you have taken action.
-IMPORTANT: Do NOT acknowledge before acting — the message stays visible in your prompt until acknowledged, so you keep context about what the user asked for.
+1. Take action FIRST: create_node/batch_create_nodes for feature requests, run_node for tasks, etc.
+2. acknowledge_message(messageId) — mark as handled ONLY AFTER you have taken action.
+3. Optionally use send_message(content) to answer questions or explain decisions that don't require creating nodes.
+IMPORTANT: The message stays visible in your prompt until acknowledged, preserving context. Do NOT acknowledge before acting. Do NOT call send_message repeatedly — one reply per user message is enough.
 
 `;
   }
@@ -1329,9 +1329,6 @@ export async function runAutopilotLoop(blueprintId: string, options?: AutopilotL
   let blueprintMemory = getAutopilotMemory(blueprintId);
   const globalMemory = readGlobalMemory();
   let lastReflectionIteration = 0;
-  // Track whether the previous iteration handled a user message (acknowledge/send_message).
-  // When true, skip auto-exit for one iteration so the LLM can act on the message content.
-  let lastIterationHandledMessage = false;
 
   const safeguardState: LoopSafeguardState = {
     recentActions: [],
@@ -1351,17 +1348,15 @@ export async function runAutopilotLoop(blueprintId: string, options?: AutopilotL
 
       // 2. CHECK EXIT CONDITIONS
       // Exit loop when all nodes are done AND no pending user messages.
-      // BUT skip auto-exit if the previous iteration handled a message — give the LLM
-      // one more iteration to act on the message content (create nodes, run commands, etc.)
       // Blueprint status is NOT changed — it's managed by the user only.
+      // Unacknowledged messages naturally prevent exit (pendingMessages > 0).
+      // After the LLM acknowledges (which happens AFTER taking action), exit is allowed.
       const pendingMessages = getUnacknowledgedMessages(blueprintId);
-      if (state.allNodesDone && pendingMessages.length === 0 && !lastIterationHandledMessage) {
+      if (state.allNodesDone && pendingMessages.length === 0) {
         logAutopilot(blueprintId, iteration, state.summary, "All nodes complete, no pending user messages", "loop_exit");
         log.info(`Autopilot loop exiting for ${blueprintId.slice(0, 8)} at iteration ${iteration} (all nodes done, no pending messages)`);
         break;
       }
-      // Reset the flag — it applies for one iteration only
-      lastIterationHandledMessage = false;
 
       // Check if user switched to manual mode
       const current = getBlueprint(blueprintId);
@@ -1448,12 +1443,6 @@ export async function runAutopilotLoop(blueprintId: string, options?: AutopilotL
       // 4. EXECUTE — Carry out the AI's decision
       const result = await executeDecision(blueprintId, decision);
 
-      // Track message-handling actions to prevent premature auto-exit.
-      // Only acknowledge/send count — read_user_messages is a passive read that shouldn't defer exit.
-      if (decision.action === "acknowledge_message" || decision.action === "send_message") {
-        lastIterationHandledMessage = true;
-      }
-
       // 5. LOG
       logAutopilot(blueprintId, iteration, state.summary, decision, result);