fix: clear terminalEventSeen on task restart to prevent stuck-after-planning (#1828) #1840

Open
AndyMik90 wants to merge 2 commits into develop from fix/1828-stuck-after-planning

AndyMik90 (Owner) commented Feb 15, 2026

Summary

Fixes the root cause of tasks getting permanently stuck after the planning stage completes. The terminalEventSeen Set in TaskStateManager was never cleared when a task was restarted, causing handleProcessExited() to silently swallow PROCESS_EXITED events. The XState actor never received the transition, leaving the task permanently stuck in the 'coding' state with no way to recover.

Root cause: When spec_runner.py emits PLANNING_COMPLETE (a terminal event), terminalEventSeen.add(taskId) is called. If the subsequent coding process (run.py) fails or crashes, handleProcessExited() checks terminalEventSeen.has(taskId) and returns early, never sending PROCESS_EXITED to the XState actor. Restarting the task doesn't help because TASK_START never cleared the stale entry.
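
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual TaskStateManager: only terminalEventSeen, handleProcessExited, PLANNING_COMPLETE and PROCESS_EXITED come from this PR, while the class name, the handleTerminalEvent helper and the delivered array (standing in for the XState actor) are assumptions made for illustration.

```typescript
// Simplified stand-in for TaskStateManager (hypothetical shape).
type ExitEvent = { type: "PROCESS_EXITED"; exitCode: number };

class StaleStateSketch {
  private terminalEventSeen = new Set<string>();
  // Events that reached the imaginary XState actor.
  readonly delivered: ExitEvent[] = [];

  // Called when a process emits a terminal event such as PLANNING_COMPLETE.
  handleTerminalEvent(taskId: string): void {
    this.terminalEventSeen.add(taskId);
  }

  // Called whenever a process belonging to the task exits.
  handleProcessExited(taskId: string, exitCode: number): void {
    if (this.terminalEventSeen.has(taskId)) {
      return; // The exit is treated as already handled and is dropped.
    }
    this.delivered.push({ type: "PROCESS_EXITED", exitCode });
  }
}

const stale = new StaleStateSketch();
stale.handleTerminalEvent("task-1");    // spec_runner.py emits PLANNING_COMPLETE
stale.handleProcessExited("task-1", 0); // planner exit: ignored, as intended
stale.handleProcessExited("task-1", 1); // run.py crash: ignored as well
// stale.delivered is still empty, so the actor never leaves 'coding'.
```

Because nothing ever removes the entry from the Set, every later exit for the same task takes the early return, which is why restarting alone did not help.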

Changes

  • task-state-manager.ts: Added prepareForRestart(taskId) method that clears terminalEventSeen and lastSequenceByTask for a task without stopping the XState actor
  • execution-handlers.ts: Called prepareForRestart(taskId) in all 4 locations where a new agent process is started:
    1. TASK_START handler - before XState event dispatch
    2. TASK_STOP handler - after USER_STOPPED, so subsequent restart works
    3. TASK_UPDATE_STATUS auto-start path - when dragging card to in_progress
    4. TASK_RECOVER_STUCK auto-restart path - when recovering stuck tasks
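
As a rough illustration of the change listed above, the sketch below clears both per-task collections and calls the new method from a stand-in start handler. It assumes lastSequenceByTask is a Map from task ID to a sequence number; recordTerminalEvent, startTaskSketch and the delivered array are hypothetical names that do not appear in the PR.

```typescript
// Simplified stand-in for TaskStateManager (hypothetical shape).
class RestartableStateSketch {
  private terminalEventSeen = new Set<string>();
  // Assumed to be a Map of taskId -> last seen sequence number.
  private lastSequenceByTask = new Map<string, number>();
  readonly delivered: string[] = [];

  recordTerminalEvent(taskId: string, sequence: number): void {
    this.terminalEventSeen.add(taskId);
    this.lastSequenceByTask.set(taskId, sequence);
  }

  handleProcessExited(taskId: string): void {
    if (this.terminalEventSeen.has(taskId)) return;
    this.delivered.push(`PROCESS_EXITED:${taskId}`);
  }

  // Clears per-task tracking only; the actor itself is not stopped.
  prepareForRestart(taskId: string): void {
    this.terminalEventSeen.delete(taskId);
    this.lastSequenceByTask.delete(taskId);
  }
}

// Hypothetical call site standing in for the TASK_START handler.
function startTaskSketch(
  manager: RestartableStateSketch,
  taskId: string,
  spawn: (id: string) => void,
): void {
  manager.prepareForRestart(taskId); // clear stale state first
  spawn(taskId);                     // then start the new agent process
}

const restartable = new RestartableStateSketch();
restartable.recordTerminalEvent("task-1", 17); // previous run ended with a terminal event
startTaskSketch(restartable, "task-1", () => { /* spawn the agent process here */ });
restartable.handleProcessExited("task-1");     // the new process crashes
// restartable.delivered now holds "PROCESS_EXITED:task-1".
```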

Closes #1828

Summary by CodeRabbit

  • Bug Fixes
    • Improved task restart, stop, QA and recovery reliability by proactively clearing stale per-task execution state before restarts and auto-restarts.
    • Prevents residual tracking data from previous runs from blocking new processes or causing missed events, ensuring cleaner task initialization and more reliable recovery.

…lanning (#1828)

The terminalEventSeen Set in TaskStateManager was never cleared when a task
was restarted. When spec_runner.py emits PLANNING_COMPLETE, the taskId is
added to terminalEventSeen. If the subsequent coding process (run.py) fails,
handleProcessExited() returns early because terminalEventSeen.has(taskId)
is true, silently swallowing the PROCESS_EXITED event. The XState actor
never transitions, leaving the task permanently stuck in 'coding' state.

Additionally, lastSequenceByTask from the old process would cause events
from a new process (starting at sequence 0) to be dropped as duplicates.

Fix: Add prepareForRestart(taskId) method that clears both terminalEventSeen
and lastSequenceByTask without stopping the XState actor. Call it in all 4
locations where a new agent process is started:
- TASK_START handler
- TASK_STOP handler (so subsequent restart works)
- TASK_UPDATE_STATUS auto-start path
- TASK_RECOVER_STUCK auto-restart path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions bot added the area/frontend, bug, and size/S labels Feb 15, 2026

gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @AndyMik90, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug where tasks could become permanently stuck after the planning stage if they were restarted. The root cause was identified as stale tracking state (terminalEventSeen and lastSequenceByTask) in TaskStateManager not being cleared upon task restarts, leading to PROCESS_EXITED events being swallowed. The solution involves implementing a new prepareForRestart method to explicitly clear this state and integrating it into all relevant task restart and auto-start flows, ensuring tasks can recover and proceed correctly after a restart.

Highlights

  • New prepareForRestart method: Introduced prepareForRestart(taskId) in task-state-manager.ts to clear terminalEventSeen and lastSequenceByTask for a given task, ensuring proper state reset without affecting the XState actor.
  • Integration into task execution flows: Integrated prepareForRestart(taskId) into execution-handlers.ts at four critical points: TASK_START, TASK_STOP, TASK_UPDATE_STATUS (auto-start), and TASK_RECOVER_STUCK (auto-restart) to prevent state-related issues upon task restarts.
Changelog
  • apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts
    • Added a call to taskStateManager.prepareForRestart(taskId) within the TASK_START handler to clear stale tracking state before a new execution.
    • Inserted taskStateManager.prepareForRestart(taskId) in the TASK_STOP handler to ensure subsequent restarts function correctly.
    • Included taskStateManager.prepareForRestart(taskId) in the TASK_UPDATE_STATUS handler for auto-starting tasks when moving to 'in_progress'.
    • Applied taskStateManager.prepareForRestart(taskId) in the TASK_RECOVER_STUCK handler to clear state before auto-restarting stuck tasks.
  • apps/frontend/src/main/task-state-manager.ts
    • Implemented a new public method prepareForRestart(taskId) which removes the taskId entry from both the terminalEventSeen and lastSequenceByTask collections.
    • Added JSDoc comments explaining the purpose and behavior of the prepareForRestart method.
Activity
  • No human activity has been recorded for this pull request.

coderabbitai bot (Contributor) commented Feb 15, 2026

No actionable comments were generated in the recent review. 🎉


Walkthrough

Added proactive clearing of per-task tracking state by calling TaskStateManager.prepareForRestart(taskId) at multiple task restart/start points so restarted tasks don't inherit stale terminalEventSeen or lastSequenceByTask entries.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Task State Manager: apps/frontend/src/main/task-state-manager.ts | Added new public method prepareForRestart(taskId: string) that deletes per-task terminalEventSeen and lastSequenceByTask entries to reset tracking for a restarting task. |
| Task Execution Handlers: apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts | Inserted taskStateManager.prepareForRestart(taskId) calls at multiple restart/start points (TASK_START, TASK_UPDATE_STATUS auto-start path, TASK_STOP, TASK_REVIEW before QA start/restart, RECOVER_STUCK and its auto-restart branch) with comments explaining why stale state is cleared. |

Sequence Diagram(s)

mermaid
sequenceDiagram
participant IPC as "IPC Handler\n(execution-handlers)"
participant State as "TaskStateManager\n(prepareForRestart)"
participant Actor as "Task Actor\n(XState / runner)"
IPC->>State: prepareForRestart(taskId)
Note right of State: clears terminalEventSeen\nand lastSequenceByTask for taskId
IPC->>Actor: spawn/start/restart task actor
Actor-->>IPC: events / sequences
IPC->>State: update terminalEventSeen / lastSequenceByTask

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • AlexMadera

Poem

🐰 A hop to clear what once was seen,

old flags removed, the slate wiped clean.
Restart we do, no ghosts remain,
fresh sequences hum, the task runs plain. ✨

🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main fix: clearing terminalEventSeen on task restart to prevent tasks from becoming stuck after planning. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #1828 by implementing the fix to clear terminalEventSeen and lastSequenceByTask when tasks restart, preventing tasks from getting stuck after planning. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the stuck-after-planning issue through clearing stale task state. No unrelated modifications are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into develop |


gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request provides an excellent fix for a subtle but critical bug causing tasks to get stuck. The root cause analysis is clear, and the solution of introducing a prepareForRestart method in TaskStateManager is clean and effective. You've done a great job applying this fix consistently to the four identified task restart paths.

While reviewing, I identified one additional scenario where this state-clearing logic is needed but appears to be missing. In the TASK_REVIEW handler within apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts, when a review is rejected (!approved), agentManager.startQAProcess is called. This also initiates a new agent process for the task. If the previous run ended with a terminal event (e.g., ALL_SUBTASKS_DONE), terminalEventSeen would be set. Without clearing it, a failure in the new QA process could lead to the same "stuck task" bug.

I recommend adding a call to taskStateManager.prepareForRestart(taskId) before agentManager.startQAProcess is called in the rejection path of the TASK_REVIEW handler. Since this part of the file is not included in the current diff, I am providing this feedback here in the general summary.

With that addition, this fix will be even more robust.
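
One way to make the "clear state, then start" pattern harder to miss at a new call site is to route every start through a single helper. The sketch below is only an illustration of that idea and is not something this PR does: startWithCleanState, RestartAwareSketch and the recording fake are all hypothetical names, and the start callback stands in for calls such as agentManager.startQAProcess.

```typescript
// Minimal view of the state manager needed by the helper (hypothetical interface).
interface RestartAwareSketch {
  prepareForRestart(taskId: string): void;
}

// Clears stale per-task tracking, then runs the supplied start callback.
function startWithCleanState<T>(
  manager: RestartAwareSketch,
  taskId: string,
  start: () => T,
): T {
  manager.prepareForRestart(taskId);
  return start();
}

// Recording fake used to show the order of operations.
const callOrder: string[] = [];
const fakeManager: RestartAwareSketch = {
  prepareForRestart: (taskId) => {
    callOrder.push(`prepareForRestart:${taskId}`);
  },
};

startWithCleanState(fakeManager, "task-7", () => {
  // Stand-in for the real process start in the review rejection path.
  callOrder.push("startQAProcess:task-7");
});
// callOrder is ["prepareForRestart:task-7", "startQAProcess:task-7"].
```

The trade-off is an extra layer of indirection in the handlers; adding the explicit call at each site, as the PR does, keeps the diff smaller.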

coderabbitai bot (Contributor) left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts (1)

495-506: ⚠️ Potential issue | 🟠 Major

Missing prepareForRestart before startQAProcess on review rejection.

When a review is rejected, startQAProcess spawns a new QA process for the same taskId (line 499). If a terminal event (e.g. ALL_SUBTASKS_DONE, QA_PASSED) was previously recorded in terminalEventSeen for this task, the new QA process's exit will be swallowed by handleProcessExited (which returns early at line 78 when terminalEventSeen.has(taskId)) — the same class of bug this PR fixes at the other four call sites.

Proposed fix
+      // Clear stale tracking state before starting a new QA process
+      taskStateManager.prepareForRestart(taskId);
+
       // Restart QA process - use worktree path if it exists, otherwise main project
       // The QA process needs to run where the implementation_plan.json with completed subtasks is
       const qaProjectPath = hasWorktree ? worktreePath : project.path;

AndyMik90 self-assigned this Feb 15, 2026

AndyMik90 (Owner, Author) left a comment

🤖 Auto Claude PR Review

Merge Verdict: 🟠 NEEDS REVISION

🟠 Needs revision - 1 issue(s) require attention.

1 issue(s) must be addressed (0 required, 1 recommended)

Risk Assessment

| Factor | Level | Notes |
|---|---|---|
| Complexity | Low | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |

Findings Summary

  • Medium: 1 issue(s)

Generated by Auto Claude PR Review

Findings (1 selected of 1 total)

🟡 [4ae1f291d305] [MEDIUM] Missing prepareForRestart in TASK_REVIEW rejection path

📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:499

The PR description states prepareForRestart is called in "all 4 locations where a new agent process is started," but there are actually 5 locations. The TASK_REVIEW rejection handler starts a new QA fixer process via agentManager.startQAProcess() (line 499) without calling prepareForRestart().

When a task reaches human_review, a terminal event (e.g., ALL_SUBTASKS_DONE or QA_PASSED) was recorded during the previous execution, so terminalEventSeen.has(taskId) is true. If the QA fixer process crashes without emitting a terminal event, handleProcessExited() (task-state-manager.ts line 77) checks terminalEventSeen.has(taskId) → returns early → XState actor never receives PROCESS_EXITED → task gets stuck. Additionally, lastSequenceByTask retains the old high-water mark, so events from the new QA process with lower sequence numbers would be dropped by isNewSequence() (task-state-manager.ts line 386). This is the exact same bug class this PR fixes in the other 4 locations.

The PR establishes a clear pattern: call prepareForRestart(taskId) before starting any new process for an existing task. This is applied in 4 locations (TASK_START, TASK_STOP, TASK_UPDATE_STATUS auto-start, TASK_RECOVER_STUCK auto-restart). However, the TASK_REVIEW rejection path at line 499 calls agentManager.startQAProcess() to start a new QA fixer process without calling prepareForRestart() first. This creates the exact same stale-state scenario the PR fixes: terminalEventSeen would still contain the taskId from the previous execution's QA_PASSED/ALL_SUBTASKS_DONE event, causing handleProcessExited() to swallow exit events if the QA fixer crashes.
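
The sequence high-water-mark problem mentioned in this finding can be sketched as follows. This is an assumption-laden illustration: the real isNewSequence() is not shown in the PR, so the comparison rule, the Map type, and the fact that the check also records the new mark are guesses made for the example, and SequenceTrackerSketch is a hypothetical name.

```typescript
// Hypothetical model of per-task sequence de-duplication.
class SequenceTrackerSketch {
  private lastSequenceByTask = new Map<string, number>();

  // Accepts an event only if its sequence is above the stored high-water mark.
  isNewSequence(taskId: string, sequence: number): boolean {
    const last = this.lastSequenceByTask.get(taskId);
    if (last !== undefined && sequence <= last) {
      return false; // Treated as a duplicate or stale event.
    }
    this.lastSequenceByTask.set(taskId, sequence);
    return true;
  }

  // Mirrors the lastSequenceByTask half of prepareForRestart.
  prepareForRestart(taskId: string): void {
    this.lastSequenceByTask.delete(taskId);
  }
}

const tracker = new SequenceTrackerSketch();
tracker.isNewSequence("task-1", 41); // old process
tracker.isNewSequence("task-1", 42); // old process, mark is now 42

// A new process starts counting from 0 and is rejected against the old mark.
const acceptedBeforeReset = tracker.isNewSequence("task-1", 0); // false

tracker.prepareForRestart("task-1");
const acceptedAfterReset = tracker.isNewSequence("task-1", 0);  // true
```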

Searched Grep('prepareForRestart|startQAProcess|startTaskExecution|startSpecCreation', 'apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts') - confirmed startQAProcess at line 499 has no preceding prepareForRestart call, while all other start* calls do.

Suggested fix:

Add `taskStateManager.prepareForRestart(taskId);` before line 499 (the `agentManager.startQAProcess()` call), matching the pattern used at the other 4 call sites. For example:

```typescript
// Clear stale tracking state before starting new QA process
taskStateManager.prepareForRestart(taskId);
agentManager.startQAProcess(taskId, qaProjectPath, task.specId, project.id);
```

---
*This review was generated by Auto Claude.*

Add missing prepareForRestart(taskId) call before startQAProcess() in the
TASK_REVIEW rejection handler. This is the 5th location where a new agent
process is started for an existing task, but was missed in the original fix.
Without this, if the QA fixer process crashes after a review rejection,
terminalEventSeen would cause handleProcessExited() to swallow the exit
event, leaving the task permanently stuck.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

BUG: nothing happens after the planning stage
