Conversation
Summary of ChangesHello @AndyMik90, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug where tasks could become permanently stuck after the planning stage if they were restarted. The root cause was identified as stale tracking state ( Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
📝 WalkthroughWalkthroughAdded calls to Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request provides an excellent fix for a subtle but critical bug causing tasks to get stuck. The root cause analysis is clear, and the solution of introducing a prepareForRestart method in TaskStateManager is clean and effective. You've done a great job applying this fix consistently to the four identified task restart paths.
While reviewing, I identified one additional scenario where this state-clearing logic is needed but appears to be missing. In the TASK_REVIEW handler within apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts, when a review is rejected (!approved), agentManager.startQAProcess is called. This also initiates a new agent process for the task. If the previous run ended with a terminal event (e.g., ALL_SUBTASKS_DONE), terminalEventSeen would be set. Without clearing it, a failure in the new QA process could lead to the same "stuck task" bug.
I recommend adding a call to taskStateManager.prepareForRestart(taskId) before agentManager.startQAProcess is called in the rejection path of the TASK_REVIEW handler. Since this part of the file is not included in the current diff, I am providing this feedback here in the general summary.
With that addition, this fix will be even more robust.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts (1)
495-506:⚠️ Potential issue | 🟠 MajorMissing
prepareForRestartbeforestartQAProcesson review rejection.When a review is rejected,
startQAProcessspawns a new QA process for the sametaskId(line 499). If a terminal event (e.g.ALL_SUBTASKS_DONE,QA_PASSED) was previously recorded interminalEventSeenfor this task, the new QA process's exit will be swallowed byhandleProcessExited(which returns early at line 78 whenterminalEventSeen.has(taskId)) — the same class of bug this PR fixes at the other four call sites.Proposed fix
+ // Clear stale tracking state before starting a new QA process + taskStateManager.prepareForRestart(taskId); + // Restart QA process - use worktree path if it exists, otherwise main project // The QA process needs to run where the implementation_plan.json with completed subtasks is const qaProjectPath = hasWorktree ? worktreePath : project.path;
AndyMik90
left a comment
There was a problem hiding this comment.
🤖 Auto Claude PR Review
Merge Verdict: 🟠 NEEDS REVISION
🟠 Needs revision - 1 issue(s) require attention.
1 issue(s) must be addressed (0 required, 1 recommended)
Risk Assessment
| Factor | Level | Notes |
|---|---|---|
| Complexity | Low | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |
Findings Summary
- Medium: 1 issue(s)
Generated by Auto Claude PR Review
Findings (1 selected of 1 total)
🟡 [4ae1f291d305] [MEDIUM] Missing prepareForRestart in TASK_REVIEW rejection path
📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:499
The PR description states prepareForRestart is called in "all 4 locations where a new agent process is started," but there are actually 5 locations. The TASK_REVIEW rejection handler starts a new QA fixer process via agentManager.startQAProcess() (line 499) without calling prepareForRestart(). When a task reaches human_review, a terminal event (e.g., ALL_SUBTASKS_DONE or QA_PASSED) was recorded during the previous execution, so terminalEventSeen.has(taskId) is true. If the QA fixer process crashes without emitting a terminal event, handleProcessExited() (task-state-manager.ts line 77) checks terminalEventSeen.has(taskId) → returns early → XState actor never receives PROCESS_EXITED → task gets stuck. Additionally, lastSequenceByTask retains the old high-water mark, so events from the new QA process with lower sequence numbers would be dropped by isNewSequence() (task-state-manager.ts line 386). This is the exact same bug class this PR fixes in the other 4 locations. | The PR establishes a clear pattern: call prepareForRestart(taskId) before starting any new process for an existing task. This is applied in 4 locations (TASK_START, TASK_STOP, TASK_UPDATE_STATUS auto-start, TASK_RECOVER_STUCK auto-restart). However, the TASK_REVIEW rejection path at line 499 calls agentManager.startQAProcess() to start a new QA fixer process without calling prepareForRestart() first. This creates the exact same stale-state scenario the PR fixes: terminalEventSeen would still contain the taskId from the previous execution's QA_PASSED/ALL_SUBTASKS_DONE event, causing handleProcessExited() to swallow exit events if the QA fixer crashes.
Searched Grep('prepareForRestart|startQAProcess|startTaskExecution|startSpecCreation', 'apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts') - confirmed startQAProcess at line 499 has no preceding prepareForRestart call, while all other start* calls do.
Suggested fix:
Add `taskStateManager.prepareForRestart(taskId);` before line 499 (the `agentManager.startQAProcess()` call), matching the pattern used at the other 4 call sites. For example:
```typescript
// Clear stale tracking state before starting new QA process
taskStateManager.prepareForRestart(taskId);
agentManager.startQAProcess(taskId, qaProjectPath, task.specId, project.id);
---
*This review was generated by Auto Claude.*
AndyMik90
left a comment
There was a problem hiding this comment.
✅ Auto Claude Review - APPROVED
Status: Ready to Merge
Summary: ### Merge Verdict: ✅ READY TO MERGE
✅ Ready to merge - All checks passing, no blocking issues found.
No blocking issues found
Risk Assessment
| Factor | Level | Notes |
|---|---|---|
| Complexity | Low | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |
Generated by Auto Claude PR Review
This automated review found no blocking issues. The PR can be safely merged.
Generated by Auto Claude
…lanning (#1828) The terminalEventSeen Set in TaskStateManager was never cleared when a task was restarted. When spec_runner.py emits PLANNING_COMPLETE, the taskId is added to terminalEventSeen. If the subsequent coding process (run.py) fails, handleProcessExited() returns early because terminalEventSeen.has(taskId) is true, silently swallowing the PROCESS_EXITED event. The XState actor never transitions, leaving the task permanently stuck in 'coding' state. Additionally, lastSequenceByTask from the old process would cause events from a new process (starting at sequence 0) to be dropped as duplicates. Fix: Add prepareForRestart(taskId) method that clears both terminalEventSeen and lastSequenceByTask without stopping the XState actor. Call it in all 4 locations where a new agent process is started: - TASK_START handler - TASK_STOP handler (so subsequent restart works) - TASK_UPDATE_STATUS auto-start path - TASK_RECOVER_STUCK auto-restart path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add missing prepareForRestart(taskId) call before startQAProcess() in the TASK_REVIEW rejection handler. This is the 5th location where a new agent process is started for an existing task, but was missed in the original fix. Without this, if the QA fixer process crashes after a review rejection, terminalEventSeen would cause handleProcessExited() to swallow the exit event, leaving the task permanently stuck. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bdeac28 to
5e40408
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts (1)
540-554: 🧹 Nitpick | 🔵 Trivial
startQAProcessis called beforehandleUiEvent(USER_RESUMED)— consider swapping the orderThe QA process is spawned at line 547 while the XState actor is still in
human_review. The actor is transitioned viaUSER_RESUMEDat line 549 only after the process has started. This is safe in practice (Node.js event loop defers all process events until the current sync call stack completes), but the idiomatic pattern is to put the state machine in the correct state before starting the process it will receive events from.♻️ Suggested reorder
- // Clear stale tracking state before starting new QA process - taskStateManager.prepareForRestart(taskId); - - // Restart QA process - use worktree path if it exists, otherwise main project - const qaProjectPath = hasWorktree ? worktreePath : project.path; - console.warn('[TASK_REVIEW] Starting QA process with projectPath:', qaProjectPath); - agentManager.startQAProcess(taskId, qaProjectPath, task.specId, project.id); - taskStateManager.handleUiEvent( taskId, { type: 'USER_RESUMED' }, task, project ); + + // Clear stale tracking state before starting new QA process + taskStateManager.prepareForRestart(taskId); + + // Restart QA process - use worktree path if it exists, otherwise main project + const qaProjectPath = hasWorktree ? worktreePath : project.path; + console.warn('[TASK_REVIEW] Starting QA process with projectPath:', qaProjectPath); + agentManager.startQAProcess(taskId, qaProjectPath, task.specId, project.id);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts` around lines 540 - 554, The QA process is started via agentManager.startQAProcess while the XState actor is still in human_review; swap the calls so the state transition happens first: call taskStateManager.handleUiEvent(taskId, { type: 'USER_RESUMED' }, task, project) before invoking agentManager.startQAProcess(taskId, qaProjectPath, task.specId, project.id). Keep the existing taskStateManager.prepareForRestart(taskId) and the qaProjectPath calculation as-is, then perform the USER_RESUMED transition, and only after that startQAProcess so the state machine is in the correct state when the spawned process begins emitting events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts`:
- Around line 1149-1150: The auto-restart path calls
taskStateManager.prepareForRestart(taskId) but does not send an XState UI event
before launching the new process, so the actor can remain in an invalid state;
add a call to handleUiEvent(...) in the TASK_RECOVER_STUCK auto-restart path
(immediately after prepareForRestart and before invoking
startTaskExecution/startSpecCreation) to dispatch the appropriate UI event
(e.g., USER_RESUMED or a TASK_RECOVERED/TASK_RECOVER_STUCK event) so the actor
transitions to a receptive state prior to the new process starting.
- Around line 368-370: Remove the redundant call to
taskStateManager.prepareForRestart(taskId) from the TASK_STOP handler: locate
the TASK_STOP handling block where handleUiEvent(USER_STOPPED) is invoked and
delete the prepareForRestart(taskId) invocation so that terminalEventSeen
remains set and any late PROCESS_EXITED events are handled/ignored correctly by
the backlog state; rely on existing calls to taskStateManager.prepareForRestart
in TASK_START, TASK_UPDATE_STATUS, and TASK_RECOVER_STUCK to reset tracking
before a new run.
---
Outside diff comments:
In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts`:
- Around line 540-554: The QA process is started via agentManager.startQAProcess
while the XState actor is still in human_review; swap the calls so the state
transition happens first: call taskStateManager.handleUiEvent(taskId, { type:
'USER_RESUMED' }, task, project) before invoking
agentManager.startQAProcess(taskId, qaProjectPath, task.specId, project.id).
Keep the existing taskStateManager.prepareForRestart(taskId) and the
qaProjectPath calculation as-is, then perform the USER_RESUMED transition, and
only after that startQAProcess so the state machine is in the correct state when
the spawned process begins emitting events.
|
|
||
| // Clear stale tracking state so a subsequent restart works correctly | ||
| taskStateManager.prepareForRestart(taskId); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's understand the file structure and find the taskMachine definition
find . -type f -name "*.ts" -path "*/ipc-handlers/task/*" | head -20Repository: AndyMik90/Auto-Claude
Length of output: 792
🏁 Script executed:
# Search for the backlog state definition as requested
rg -n 'backlog' --type ts -A 10 -B 2Repository: AndyMik90/Auto-Claude
Length of output: 50378
🏁 Script executed:
# Find where the taskMachine is defined
rg -n 'taskMachine\s*=' --type ts -l | head -10Repository: AndyMik90/Auto-Claude
Length of output: 120
🏁 Script executed:
# Look at the execution-handlers file to see the context around lines 368-370
wc -l apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 132
🏁 Script executed:
sed -n '360,380p' apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 467
🏁 Script executed:
# Find the taskMachine definition
cat apps/frontend/src/shared/state-machines/task-machine.tsRepository: AndyMik90/Auto-Claude
Length of output: 7701
🏁 Script executed:
# Search for prepareForRestart calls
rg -n 'prepareForRestart' apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 343
🏁 Script executed:
# Search for handleProcessExited definition and terminalEventSeen
rg -n 'handleProcessExited|terminalEventSeen' apps/frontend/src/main/ipc-handlers/task/ -A 5Repository: AndyMik90/Auto-Claude
Length of output: 835
🏁 Script executed:
sed -n '175,190p' apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 828
🏁 Script executed:
sed -n '535,550p' apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 884
🏁 Script executed:
sed -n '710,730p' apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 1166
🏁 Script executed:
sed -n '1145,1160p' apps/frontend/src/main/ipc-handlers/task/execution-handlers.tsRepository: AndyMik90/Auto-Claude
Length of output: 768
🏁 Script executed:
# Now let's find the handleProcessExited and terminalEventSeen implementation
rg -n 'handleProcessExited|terminalEventSeen' apps/frontend/src/main/task-state-manager.ts -A 10Repository: AndyMik90/Auto-Claude
Length of output: 3268
🏁 Script executed:
# Check if there are any other callers of handleProcessExited to understand the flow
rg -n 'handleProcessExited' apps/frontend/src/main/ -B 3 -A 1Repository: AndyMik90/Auto-Claude
Length of output: 4367
🏁 Script executed:
# Verify the backlog state doesn't have PROCESS_EXITED handler by extracting just that section
sed -n '50,65p' apps/frontend/src/shared/state-machines/task-machine.tsRepository: AndyMik90/Auto-Claude
Length of output: 525
Remove prepareForRestart from TASK_STOP handler
The prepareForRestart(taskId) call at line 370 is redundant and logically inconsistent. After handleUiEvent(USER_STOPPED) transitions the actor to its backlog state, clearing terminalEventSeen here allows a stale PROCESS_EXITED event from the killed process to reach the actor—where it will be silently ignored since the backlog state doesn't define a handler for it. Every subsequent restart path already calls prepareForRestart before starting a new process (TASK_START line 183, TASK_UPDATE_STATUS line 718, TASK_RECOVER_STUCK line 1150), so the state is clean regardless. Delete the call and let the next start handler reset the tracking state at the right moment.
🔧 Proposed fix
taskStateManager.handleUiEvent(
taskId,
{ type: 'USER_STOPPED', hasPlan },
task,
project
);
-
- // Clear stale tracking state so a subsequent restart works correctly
- taskStateManager.prepareForRestart(taskId);
});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts` around lines
368 - 370, Remove the redundant call to
taskStateManager.prepareForRestart(taskId) from the TASK_STOP handler: locate
the TASK_STOP handling block where handleUiEvent(USER_STOPPED) is invoked and
delete the prepareForRestart(taskId) invocation so that terminalEventSeen
remains set and any late PROCESS_EXITED events are handled/ignored correctly by
the backlog state; rely on existing calls to taskStateManager.prepareForRestart
in TASK_START, TASK_UPDATE_STATUS, and TASK_RECOVER_STUCK to reset tracking
before a new run.
| // Clear stale tracking state before restarting | ||
| taskStateManager.prepareForRestart(taskId); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
prepareForRestart placement is correct; note that no XState event is sent before the new process starts
The call correctly clears terminalEventSeen and lastSequenceByTask before startTaskExecution/startSpecCreation. However, unlike TASK_START and TASK_UPDATE_STATUS, this auto-restart path sends no XState event (e.g., USER_RESUMED) to transition the actor before the new process begins. If the actor is in error or a mid-execution state, the machine may not accept events from the fresh process correctly.
This is a pre-existing design choice (outside the scope of this PR), but worth tracking separately.
Would you like me to open a follow-up issue to add an appropriate handleUiEvent call in the TASK_RECOVER_STUCK auto-restart path?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts` around lines
1149 - 1150, The auto-restart path calls
taskStateManager.prepareForRestart(taskId) but does not send an XState UI event
before launching the new process, so the actor can remain in an invalid state;
add a call to handleUiEvent(...) in the TASK_RECOVER_STUCK auto-restart path
(immediately after prepareForRestart and before invoking
startTaskExecution/startSpecCreation) to dispatch the appropriate UI event
(e.g., USER_RESUMED or a TASK_RECOVERED/TASK_RECOVER_STUCK event) so the actor
transitions to a receptive state prior to the new process starting.
Merged 9 upstream commits into fork's develop branch: - fix: PR review error visibility and gh CLI resolution (AndyMik90#1661) - fix: handle empty/greenfield projects in spec creation (AndyMik90#1426/AndyMik90#1841) - fix: clear terminalEventSeen on task restart (AndyMik90#1828/AndyMik90#1840) - fix: watch worktree path for implementation_plan.json (AndyMik90#1805/AndyMik90#1842) - fix: resolve Claude CLI not found on Windows (AndyMik90#1661/AndyMik90#1843) - fix: handle planning phase crash and resume recovery (AndyMik90#1562/AndyMik90#1844) - fix: show dismissed PR review findings in UI (AndyMik90#1852) - fix: preserve file/line info in PR review extraction (AndyMik90#1857) - chore: bump version to 2.7.6-beta.6 Conflict resolution strategy (5 files): - client.py: adopted upstream's CLAUDE.md truncation (better than skip) - agent-manager.ts: combined upstream projectPath cwd fix with fork's accountOverride - agent-process.ts: combined upstream PATH merge fix with fork's Copilot env handling - file-watcher.ts: adopted upstream's race-condition-safe watcher implementation - execution-handlers.ts: combined upstream plan file checking with fork's broken plan recovery + accountOverride
Mark PRs AndyMik90#1853, AndyMik90#1844, AndyMik90#1847 as applied. PR AndyMik90#1840 skipped (XState-specific). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes the root cause of tasks getting permanently stuck after the planning stage completes. The
terminalEventSeenSet inTaskStateManagerwas never cleared when a task was restarted, causinghandleProcessExited()to silently swallowPROCESS_EXITEDevents. The XState actor never received the transition, leaving the task permanently stuck in the 'coding' state with no way to recover.Root cause: When
spec_runner.pyemitsPLANNING_COMPLETE(a terminal event),terminalEventSeen.add(taskId)is called. If the subsequent coding process (run.py) fails or crashes,handleProcessExited()checksterminalEventSeen.has(taskId)and returns early, never sendingPROCESS_EXITEDto the XState actor. Restarting the task doesn't help becauseTASK_STARTnever cleared the stale entry.Changes
task-state-manager.ts: AddedprepareForRestart(taskId)method that clearsterminalEventSeenandlastSequenceByTaskfor a task without stopping the XState actorexecution-handlers.ts: CalledprepareForRestart(taskId)in all 4 locations where a new agent process is started:TASK_STARThandler - before XState event dispatchTASK_STOPhandler - after USER_STOPPED, so subsequent restart worksTASK_UPDATE_STATUSauto-start path - when dragging card to in_progressTASK_RECOVER_STUCKauto-restart path - when recovering stuck tasksCloses #1828
Summary by CodeRabbit