fix: handle planning phase crash and resume recovery (#1562) #1844
When spec creation crashes, the task gets stuck in "planning" state forever because the backend never emits PLANNING_FAILED to the frontend XState machine. Clicking Resume then also crashes because the resume logic transitions to "coding" state, but there are no subtasks yet.

Root causes and fixes:

1. Backend orchestrator (orchestrator.py):
   - Wrap run() in try/except to emit PLANNING_FAILED on unhandled exceptions
   - Add _emit_planning_failed() calls at every early return path
   - Fix spec_dir tracking after rename_spec_dir_from_requirements()
2. XState machine (task-machine.ts):
   - Add PLANNING_STARTED transitions from error and human_review states
   - This allows tasks that crashed during planning to resume back to planning
3. Execution handlers (execution-handlers.ts):
   - Detect error state with 0 subtasks and send PLANNING_STARTED (not USER_RESUMED)
   - Check the actual implementation_plan.json for subtasks instead of task.subtasks.length
   - Handles both with-actor and without-actor (app restart) code paths

Closes #1562

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
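For orientation, the sketch below illustrates the crash-handling pattern described in item 1. It is a hedged, hypothetical simplification, not the real orchestrator.py: the class name, constructor, and phase logic are assumptions, while run(), _run_phases(), _emit_planning_failed(), TaskEventEmitter, and the PLANNING_FAILED payload shape come from this PR's discussion.

```python
# Hedged sketch of the crash-handling pattern described above, not the actual
# orchestrator. The emitter wiring and phase logic are simplified assumptions.
from pathlib import Path


class SpecOrchestratorSketch:
    def __init__(self, spec_dir: Path, emitter) -> None:
        self.spec_dir = spec_dir
        self.emitter = emitter  # stand-in for TaskEventEmitter.from_spec_dir(spec_dir)

    def _emit_planning_failed(self, error: str) -> None:
        """Tell the frontend XState machine that planning terminally failed."""
        try:
            self.emitter.emit("PLANNING_FAILED", {"error": error, "recoverable": True})
        except Exception:
            pass  # best effort - never mask the original failure

    async def run(self) -> bool:
        try:
            ok = await self._run_phases()
            if not ok:
                # early-return failure paths also need a terminal event
                self._emit_planning_failed("Spec creation did not complete")
            return ok
        except Exception as exc:
            # Without this, a crash leaves the task stuck in "planning" forever.
            self._emit_planning_failed(str(exc))
            raise

    async def _run_phases(self) -> bool:
        # discovery -> requirements -> complexity -> ... (omitted in this sketch)
        return True
```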
📝 Walkthrough

Backend: rename_spec_dir_from_requirements now returns a Path, and the orchestrator centralizes phase execution and emits PLANNING_FAILED on terminal errors while updating spec_dir after renames. Frontend: the execution handler reads implementation_plan.json to determine planHasSubtasks and gates start/resume flows; the state machine allows PLANNING_STARTED from human_review and error.
Sequence Diagram(s)

sequenceDiagram
participant Client as Frontend Client
participant Executor as Execution Handler
participant Orchestrator as Backend Orchestrator
participant State as XState Machine
Client->>Executor: TASK_START / Resume
Executor->>Executor: Read implementation_plan.json (main/worktree)
Executor->>Executor: Determine planHasSubtasks (true/false)
alt planHasSubtasks == false
Executor->>Orchestrator: Start planning phases
rect rgba(255, 200, 100, 0.5)
Orchestrator->>Orchestrator: _run_phases() -> discovery/requirements/complexity
Orchestrator->>Orchestrator: call models.rename_spec_dir_from_requirements -> returns Path
Orchestrator->>Orchestrator: update self.spec_dir and validator if renamed
Orchestrator->>Orchestrator: on terminal error -> _emit_planning_failed(error)
end
Orchestrator->>State: Emit PLANNING_FAILED
Executor->>State: Emit PLANNING_STARTED (resume)
State->>State: Transition error/human_review -> planning (clearReviewReason)
else planHasSubtasks == true
Executor->>State: Emit PLAN_APPROVED
State->>Executor: Proceed to implementation
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (6 passed)
Summary of Changes

Hello @AndyMik90, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the resilience and user experience of the task planning phase. It addresses critical issues where tasks would become unresponsive or crash upon resuming after a planning failure. By introducing robust error signaling from the backend and intelligent state transition logic in the frontend, tasks can now gracefully recover and restart the planning process, preventing them from getting stuck in an unrecoverable state.
Code Review
This pull request effectively addresses a critical bug where tasks would get stuck in the "planning" state after a crash. The changes are well-thought-out and tackle the root causes on both the backend and frontend. On the backend, adding a try/except block to emit a PLANNING_FAILED event is a robust solution for state synchronization. The fix to update self.spec_dir after a directory rename is also a crucial correctness improvement. On the frontend, the state machine and IPC handlers are correctly updated to handle resuming a task that failed during planning, and the logic for detecting an implementation plan has been made more reliable. I have a couple of suggestions to further enhance the maintainability and robustness of these changes. Overall, this is a solid contribution that significantly improves the reliability of the task execution flow.
} else if (currentXState === 'error' && task.subtasks.length === 0) {
  // FIX (#1562): Task crashed during planning (no subtasks yet).
  // Send PLANNING_STARTED to go back to planning state, not coding.
  console.warn('[TASK_START] XState: error with 0 subtasks -> planning via PLANNING_STARTED');
  taskStateManager.handleUiEvent(taskId, { type: 'PLANNING_STARTED' }, task, project);
This condition uses task.subtasks.length === 0 to decide whether to restart the planning phase. As pointed out in the PR description, this can be unreliable. Later in this function (lines 230-250), you've introduced a much more robust check by reading implementation_plan.json to determine planHasSubtasks.
To make this fix fully robust, the logic for determining the initial XState event should also use this reliable check. I recommend moving the planHasSubtasks calculation (lines 230-250) to before the XState handling logic (before line 164).
Then you can change this condition and the one on line 194 to use !planHasSubtasks, like this:
} else if (currentXState === 'error' && !planHasSubtasks) {
// FIX (#1562): Task crashed during planning (no subtasks yet).
// Send PLANNING_STARTED to go back to planning state, not coding.
console.warn('[TASK_START] XState: error with 0 subtasks -> planning via PLANNING_STARTED');
taskStateManager.handleUiEvent(taskId, { type: 'PLANNING_STARTED' }, task, project);
}This will ensure the decision to restart planning is consistently based on the actual state of the plan file.
if not self.spec_dir.exists() and old_spec_dir.name.endswith("-pending"):
    # Directory was renamed - find the new path
    parent = old_spec_dir.parent
    prefix = old_spec_dir.name[:4]  # e.g., "001-"
    for candidate in parent.iterdir():
        if (
            candidate.name.startswith(prefix)
            and candidate.is_dir()
            and "pending" not in candidate.name
        ):
            self.spec_dir = candidate
            self.validator = SpecValidator(self.spec_dir)
            break
The logic to find the new spec_dir after it's been renamed is a bit complex and potentially brittle. It relies on iterating through the parent directory and matching a prefix, which could lead to incorrect behavior if other directories match the pattern.
A more robust and maintainable approach would be to modify the rename_spec_dir_from_requirements function (in apps/backend/spec/pipeline/models.py) to return the new path of the renamed directory. This would simplify the code here significantly and remove the need for this manual search logic.
For example, rename_spec_dir_from_requirements could be changed to return Path | None, and you could then update self.spec_dir directly with the returned value.
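A minimal sketch of that suggestion follows. It assumes illustrative slug derivation and prefix handling; only the function name, the models.py location, the "-pending" suffix, the "001-"-style prefix, and the requirements.json task_description field are taken from this review, so the real implementation may differ.

```python
# Hypothetical sketch of rename_spec_dir_from_requirements returning a Path.
# Slug construction and collision handling are assumptions for illustration.
import json
from pathlib import Path


def rename_spec_dir_from_requirements(spec_dir: Path) -> Path:
    """Rename a '-pending' spec dir based on requirements.json; return the final path."""
    requirements = spec_dir / "requirements.json"
    if not spec_dir.name.endswith("-pending") or not requirements.exists():
        return spec_dir  # nothing to rename; caller keeps the original path
    try:
        task_desc = json.loads(requirements.read_text(encoding="utf-8")).get("task_description", "")
    except (json.JSONDecodeError, OSError):
        return spec_dir
    if not task_desc:
        return spec_dir
    slug = "-".join(task_desc.lower().split())[:40].rstrip("-")
    new_dir = spec_dir.with_name(spec_dir.name[:4] + slug)  # keep the "001-" prefix
    if new_dir == spec_dir or new_dir.exists():
        return spec_dir
    spec_dir.rename(new_dir)
    return new_dir


# Caller side (orchestrator) then needs no directory scanning:
#     new_spec_dir = rename_spec_dir_from_requirements(self.spec_dir)
#     if new_spec_dir != self.spec_dir:
#         self.spec_dir = new_spec_dir
#         self.validator = SpecValidator(self.spec_dir)
```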
AndyMik90 left a comment
🤖 Auto Claude PR Review
Merge Verdict: 🟠 NEEDS REVISION
🟠 Needs revision - 4 issue(s) require attention.
4 issue(s) must be addressed (0 required, 4 recommended), 1 suggestion
Risk Assessment
| Factor | Level | Notes |
|---|---|---|
| Complexity | Low | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |
Findings Summary
- Medium: 4 issue(s)
- Low: 1 issue(s)
Generated by Auto Claude PR Review
Findings (5 selected of 5 total)
🟡 [a32816266b30] [MEDIUM] Incomplete fix: TASK_UPDATE_STATUS still uses unreliable task.subtasks.length
📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749
The PR description explicitly identifies 'task.subtasks.length === 0 was used instead of checking the actual implementation_plan.json file' as a root cause of #1562. This was fixed in the TASK_START handler (lines 230-250) by reading the actual implementation_plan.json file. However, the TASK_UPDATE_STATUS handler at line 749 still uses the old unreliable pattern: const needsImplementation = hasSpec && task.subtasks.length === 0. This code path is triggered when a user drags a task card to 'in_progress' on the Kanban board (manual status change with auto-start). If a task crashed during planning and the user drags it to 'in_progress' instead of clicking Resume, the same bug will manifest — task.subtasks.length may be stale or inaccurate, leading to incorrect execution decisions.
Suggested fix:
Apply the same fix as the TASK_START handler: read implementation_plan.json from disk and check for valid subtasks using the same pattern (lines 230-250). The existing `checkSubtasksCompletion` helper could also be reused here after reading and parsing the plan file, to avoid duplicating the subtask-counting logic a third time.
🟡 [6e248e327264] [MEDIUM] phase_executor.spec_dir not updated after directory rename
📁 apps/backend/spec/pipeline/orchestrator.py:371
After the spec directory is renamed (line 358) and self.spec_dir is correctly updated (line 369), the phase_executor object still holds the old stale spec_dir path. The code explicitly updates phase_executor.task_description (line 378) but misses updating phase_executor.spec_dir and phase_executor.spec_validator. All subsequent phases run through phase_executor (lines 397-407: historical_context, research, context, spec_writing, planning, validation, quick_spec) and they all reference self.spec_dir within the executor to read/write files (e.g., self.spec_dir / 'spec.md', self.spec_dir / 'implementation_plan.json'). After the rename, the old directory no longer exists on disk, so these phases would fail with file-not-found errors. The top-level try/except would catch this and emit PLANNING_FAILED (preventing stuck tasks), but spec creation would fail unnecessarily when the directory was successfully renamed. Note: the stale phase_executor.spec_dir is a pre-existing issue (before this PR, self.spec_dir wasn't updated either), but since this PR explicitly aims to fix stale spec_dir tracking and the comment says 'Update self.spec_dir after rename so subsequent phases use the correct path', the fix is incomplete.
Suggested fix:
After line 371 (after the break), add:
```python
# Also update phase executor to use the new path
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```

Alternatively, add it after the entire if block (after line 371) unconditionally:

```python
# Keep phase executor in sync with updated spec_dir
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```
#### 🟡 [6f884ef13cef] [MEDIUM] Duplicates existing _rename_spec_dir_from_requirements wrapper
📁 `apps/backend/spec/pipeline/orchestrator.py:357`
The new inline directory-search logic at lines 357-371 duplicates the existing `_rename_spec_dir_from_requirements()` wrapper method at lines 768-790 in the SAME class. Both call `rename_spec_dir_from_requirements(self.spec_dir)` then iterate the parent directory to find the renamed spec folder using the same prefix-matching pattern. Searched with Grep('_rename_spec_dir_from_requirements', 'apps/backend/') - confirmed the wrapper exists at line 768 and is used in tests. The inline version adds `candidate.is_dir()` and `self.validator = SpecValidator(...)` updates which the wrapper lacks, but the fix should enhance the existing wrapper rather than duplicate the logic.
**Suggested fix:**
Enhance the existing _rename_spec_dir_from_requirements() wrapper (line 768) to also add the candidate.is_dir() guard and self.validator = SpecValidator(self.spec_dir) update, then call self._rename_spec_dir_from_requirements() at line 358 instead of duplicating the logic inline.
#### 🟡 [00010260d37f] [MEDIUM] Duplicates existing checkSubtasksCompletion helper in same file
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:230`
The new code at lines 230-250 manually parses `implementation_plan.json` and counts subtasks by iterating `plan.phases` and summing `phase.subtasks.length`. The same file already defines `checkSubtasksCompletion()` at lines 44-58, which does the exact same traversal: `(plan?.phases as Array<...>)?.flatMap(phase => phase.subtasks || [])` and returns `totalCount`. Searched with Read tool at lines 44-58 - confirmed the helper extracts subtasks from the same `plan.phases` structure and counts them. The new code could reuse this helper: `planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0`.
**Suggested fix:**
Replace lines 230-250 with:
const planContent = safeReadFileSync(planFilePath);
let planHasSubtasks = false;
if (planContent) {
try {
const plan = JSON.parse(planContent);
planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0;
} catch {
// Invalid/corrupt plan file - treat as no subtasks
}
}

This reuses both safeReadFileSync and checkSubtasksCompletion from the same file.
#### 🔵 [7cbc0d966de4] [LOW] Uses existsSync+readFileSync instead of safeReadFileSync
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:234`
The new code uses `existsSync(planFilePath)` followed by `readFileSync(planFilePath, 'utf-8')` in a try/catch. The same file defines `safeReadFileSync()` at line 29, which wraps exactly this pattern (handles ENOENT gracefully, logs other errors). The file's own docstring describes it as 'Safe file read that handles missing files without TOCTOU issues'. Using `safeReadFileSync` would be consistent with the file's established conventions and avoid the TOCTOU race between existsSync and readFileSync.
**Suggested fix:**
Replace the existsSync + readFileSync pattern with const planContent = safeReadFileSync(planFilePath); followed by if (planContent) { try { ... JSON.parse ... } catch {} }.
---
*This review was generated by Auto Claude.*
…name, empty except (#1562)

- Move planHasSubtasks calculation (reads implementation_plan.json) before XState handling so the crash-during-planning check uses the reliable file-based check instead of task.subtasks.length
- Change rename_spec_dir_from_requirements to return the new Path directly instead of a bool, eliminating brittle directory scanning in orchestrator
- Add descriptive comment to empty except clause to satisfy code scanning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts (2)
746-749: ⚠️ Potential issue | 🟠 Major

TASK_UPDATE_STATUS still uses task.subtasks.length — same bug this PR fixes in TASK_START.

Line 749 uses task.subtasks.length === 0 for needsImplementation, but the TASK_START handler (Line 259) was explicitly changed to use the file-based planHasSubtasks check because task.subtasks.length is unreliable after crashes. The TASK_UPDATE_STATUS auto-start path (triggered when dragging a task to "in_progress") still has the original vulnerability: if the plan file is missing/empty but task.subtasks isn't populated, the task could be incorrectly routed.

Proposed fix — apply the same file-based check
  // Check if spec.md exists
  const specFilePath = path.join(specDir, AUTO_BUILD_PATHS.SPEC_FILE);
  const hasSpec = existsSync(specFilePath);
  const needsSpecCreation = !hasSpec;
- const needsImplementation = hasSpec && task.subtasks.length === 0;
+ // Use the same file-based check as TASK_START to reliably detect missing subtasks
+ let planHasSubtasksForUpdate = false;
+ const updatePlanPath = path.join(specDir, AUTO_BUILD_PATHS.IMPLEMENTATION_PLAN);
+ if (existsSync(updatePlanPath)) {
+   try {
+     const planContent = readFileSync(updatePlanPath, 'utf-8');
+     const plan = JSON.parse(planContent);
+     const phases = plan?.phases;
+     if (Array.isArray(phases)) {
+       const subtaskCount = phases.reduce(
+         (sum: number, phase: { subtasks?: unknown[] }) =>
+           sum + (Array.isArray(phase.subtasks) ? phase.subtasks.length : 0),
+         0
+       );
+       planHasSubtasksForUpdate = subtaskCount > 0;
+     }
+   } catch {
+     // Invalid/corrupt plan file - treat as no subtasks
+   }
+ }
+ const needsImplementation = hasSpec && !planHasSubtasksForUpdate;

Better yet, extract the plan subtask check into a shared helper to avoid the duplication:
function checkPlanHasSubtasks(specDir: string): boolean {
  const planFilePath = path.join(specDir, AUTO_BUILD_PATHS.IMPLEMENTATION_PLAN);
  if (!existsSync(planFilePath)) return false;
  try {
    const planContent = readFileSync(planFilePath, 'utf-8');
    const plan = JSON.parse(planContent);
    const phases = plan?.phases;
    if (Array.isArray(phases)) {
      return phases.reduce(
        (sum: number, phase: { subtasks?: unknown[] }) =>
          sum + (Array.isArray(phase.subtasks) ? phase.subtasks.length : 0),
        0
      ) > 0;
    }
  } catch {
    // Invalid/corrupt plan file
  }
  return false;
}
209-212: ⚠️ Potential issue | 🟠 Major

human_review with no subtasks sends USER_RESUMED → coding, which will fail.

When currentXState === 'human_review' and planHasSubtasks is false (e.g., user stopped during planning), this branch sends USER_RESUMED which transitions to coding. But with no subtasks, coding will fail. The state machine now supports PLANNING_STARTED from human_review (task-machine.ts Line 131), but this handler doesn't use it.

This edge case (stop during planning → resume from human_review) is less critical than the error-state fix but follows the same pattern.
Proposed fix
  } else if (currentXState === 'error' && !planHasSubtasks) {
    console.warn('[TASK_START] XState: error with no plan subtasks -> planning via PLANNING_STARTED');
    taskStateManager.handleUiEvent(taskId, { type: 'PLANNING_STARTED' }, task, project);
- } else if (currentXState === 'human_review' || currentXState === 'error') {
+ } else if (currentXState === 'human_review' && !planHasSubtasks) {
+   // Task was stopped during planning (no subtasks yet) - restart planning
+   console.warn('[TASK_START] XState: human_review with no plan subtasks -> planning via PLANNING_STARTED');
+   taskStateManager.handleUiEvent(taskId, { type: 'PLANNING_STARTED' }, task, project);
+ } else if (currentXState === 'human_review' || currentXState === 'error') {
    console.warn('[TASK_START] XState:', currentXState, '-> coding via USER_RESUMED');
    taskStateManager.handleUiEvent(taskId, { type: 'USER_RESUMED' }, task, project);

And the same for the no-actor fallback path (Lines 227–230).
🤖 Fix all issues with AI agents
In `@apps/backend/spec/pipeline/orchestrator.py`:
- Around line 766-769: The current change updates self.spec_dir but flips return
semantics by returning new_spec_dir.exists() which can be False even when the
rename succeeded; to preserve previous success/error semantics, detect whether a
rename actually occurred (compare new_spec_dir with the old value), set
self.spec_dir accordingly (as you already do), and return True when a rename
happened (indicating success); otherwise fall back to checking existence of the
existing spec dir (e.g., return self.spec_dir.exists()). Reference:
rename_spec_dir_from_requirements, self.spec_dir, new_spec_dir.exists().
- Around line 264-270: The _run_phases method is missing type hints for the
task_logger and ui parameters; update the signature of async def
_run_phases(self, interactive: bool, auto_approve: bool, task_logger, ui) ->
bool to add precise type annotations for task_logger and ui (e.g., TaskLogger or
logging.Logger and UIInterface or the concrete UI class used in this module)
consistent with the codebase typing conventions, and import any required types
or use forward references if necessary so the function signature becomes fully
typed.
- Around line 687-704: The except block in _emit_planning_failed currently
swallows all exceptions; change it to capture the exception (except Exception as
e) and emit a debug-level log so failures building or emitting the
PLANNING_FAILED event are visible for debugging. Use the module/class logger
(e.g., logger.debug or self.logger.debug if available) to log a short message
including the exception and context (e.g., spec_dir and that
TaskEventEmitter.from_spec_dir or task_emitter.emit failed) while still not
raising the error so original failure isn’t masked; reference
_emit_planning_failed, TaskEventEmitter.from_spec_dir, and task_emitter.emit
when making the change.
- Around line 356-360: After rename_spec_dir_from_requirements moves the spec
directory, update the PhaseExecutor instance to use the new paths: set
phase_executor.spec_dir = self.spec_dir and phase_executor.spec_validator =
self.validator (mirroring the existing assignment to
phase_executor.task_description). Locate the block around the rename call and
add those two assignments after self.spec_dir and self.validator are updated so
subsequent phases (historical_context, research, spec_writing, planning,
validation) reference the moved directory via phase_executor.spec_dir and
phase_executor.spec_validator instead of the deleted old path.
async def _run_phases(
    self,
    interactive: bool,
    auto_approve: bool,
    task_logger,
    ui,
) -> bool:
🧹 Nitpick | 🔵 Trivial
Missing type hints on _run_phases parameters.
task_logger and ui lack type annotations. Per coding guidelines, backend Python should use type hints.
Proposed fix
+ from types import ModuleType
+ from task_logger import TaskLogger
+
async def _run_phases(
self,
interactive: bool,
auto_approve: bool,
- task_logger,
- ui,
+ task_logger: TaskLogger,
+ ui: ModuleType,
) -> bool:

🤖 Prompt for AI Agents
In `@apps/backend/spec/pipeline/orchestrator.py` around lines 264 - 270, The
_run_phases method is missing type hints for the task_logger and ui parameters;
update the signature of async def _run_phases(self, interactive: bool,
auto_approve: bool, task_logger, ui) -> bool to add precise type annotations for
task_logger and ui (e.g., TaskLogger or logging.Logger and UIInterface or the
concrete UI class used in this module) consistent with the codebase typing
conventions, and import any required types or use forward references if
necessary so the function signature becomes fully typed.
def _emit_planning_failed(self, error: str) -> None:
    """Emit PLANNING_FAILED event so the frontend transitions to error state.

    Without this, the task stays stuck in 'planning' / 'in_progress' forever
    when spec creation fails, because the XState machine never receives a
    terminal event.

    Args:
        error: Human-readable error description
    """
    try:
        task_emitter = TaskEventEmitter.from_spec_dir(self.spec_dir)
        task_emitter.emit(
            "PLANNING_FAILED",
            {"error": error, "recoverable": True},
        )
    except Exception:
        pass  # Best effort - don't mask the original failure
🧹 Nitpick | 🔵 Trivial
Clean helper; consider logging the swallowed exception for debuggability.
The bare pass on Line 703–704 silences all failures, including unexpected ones (e.g., self.spec_dir is None). A debug-level log would help troubleshoot without masking the original error.
Suggested improvement
except Exception:
- pass # Best effort - don't mask the original failure
+ # Best effort - don't mask the original failure.
+ # Logging here could itself fail, so keep it minimal.
+ import sys
+ try:
+ sys.stderr.write("[orchestrator] _emit_planning_failed failed\n")
+ except Exception:
+ pass

🤖 Prompt for AI Agents
In `@apps/backend/spec/pipeline/orchestrator.py` around lines 687 - 704, The
except block in _emit_planning_failed currently swallows all exceptions; change
it to capture the exception (except Exception as e) and emit a debug-level log
so failures building or emitting the PLANNING_FAILED event are visible for
debugging. Use the module/class logger (e.g., logger.debug or self.logger.debug
if available) to log a short message including the exception and context (e.g.,
spec_dir and that TaskEventEmitter.from_spec_dir or task_emitter.emit failed)
while still not raising the error so original failure isn’t masked; reference
_emit_planning_failed, TaskEventEmitter.from_spec_dir, and task_emitter.emit
when making the change.
new_spec_dir = rename_spec_dir_from_requirements(self.spec_dir)
if new_spec_dir != self.spec_dir:
    self.spec_dir = new_spec_dir
return new_spec_dir.exists()
Backward-compat method now correctly updates self.spec_dir, but return semantics have subtly changed.
Previously this method presumably returned True on success / False on error. Now it returns new_spec_dir.exists(). If the rename succeeds but the new directory is somehow immediately deleted (race condition), this would return False even though the rename succeeded. This is an unlikely edge case, but worth noting for callers that rely on the return value.
🤖 Prompt for AI Agents
In `@apps/backend/spec/pipeline/orchestrator.py` around lines 766 - 769, The
current change updates self.spec_dir but flips return semantics by returning
new_spec_dir.exists() which can be False even when the rename succeeded; to
preserve previous success/error semantics, detect whether a rename actually
occurred (compare new_spec_dir with the old value), set self.spec_dir
accordingly (as you already do), and return True when a rename happened
(indicating success); otherwise fall back to checking existence of the existing
spec dir (e.g., return self.spec_dir.exists()). Reference:
rename_spec_dir_from_requirements, self.spec_dir, new_spec_dir.exists().
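One possible shape of that fix is sketched below. This is a hypothetical excerpt, not the PR's code: it assumes the orchestrator attributes (self.spec_dir, self.validator) and the helpers named in the review (rename_spec_dir_from_requirements, SpecValidator) exist in scope.

```python
# Hypothetical variant of the backward-compat wrapper that keeps the old
# True/False semantics while still updating spec_dir (method excerpt, assumes
# the surrounding orchestrator class and imports named in the review).
def _rename_spec_dir_from_requirements(self) -> bool:
    old_spec_dir = self.spec_dir
    new_spec_dir = rename_spec_dir_from_requirements(old_spec_dir)
    if new_spec_dir != old_spec_dir:
        # A rename actually happened: report success regardless of later races.
        self.spec_dir = new_spec_dir
        self.validator = SpecValidator(self.spec_dir)
        return True
    # No rename was needed; fall back to reporting whether the dir exists.
    return self.spec_dir.exists()
```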
AndyMik90 left a comment
🤖 Auto Claude PR Review
🔴 Follow-up Review: Blocked
🔴 Blocked - 5 CI check(s) failing. Fix CI before merge.
Resolution Status
- ✅ Resolved: 0 previous findings addressed
- ❌ Unresolved: 5 previous findings remain
- 🆕 New Issues: 4 new findings in recent changes
Finding Validation
- 🔍 Dismissed as False Positives: 1 findings were re-investigated and found to be incorrect
- ✓ Confirmed Valid: 6 findings verified as genuine issues
- 👤 Needs Human Review: 0 findings require manual verification
🚨 Blocking Issues
- quality: [UNRESOLVED] Incomplete fix: TASK_UPDATE_STATUS still uses unreliable task.subtasks.length
- quality: [UNRESOLVED] phase_executor.spec_dir not updated after directory rename
- quality: [UNRESOLVED] Duplicates existing checkSubtasksCompletion helper in same file
- quality: phase_executor.spec_dir and spec_validator not updated after directory rename
- quality: Backward-compat wrapper missing self.validator update (inconsistent with inline code)
- quality: [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
- quality: [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length (same bug as #1562)
Verdict
BLOCKED: 5 CI checks failing + 1 critical confirmed bug.
CI Status: All 5 CI checks are failing (CI Complete, test-python on ubuntu/windows/macos for 3.12 and 3.13). The PR cannot be merged until CI passes.
Critical Bug — phase_executor.spec_dir not updated after rename (confirmed by 3 agents): After the spec directory is renamed at orchestrator.py:357-360, self.spec_dir and self.validator are updated on the orchestrator, but phase_executor.spec_dir and phase_executor.spec_validator still point to the OLD (renamed/non-existent) directory. All subsequent phases (complexity_assessment through validation) use phase_executor.spec_dir, so every spec creation that triggers a directory rename will fail. This is the highest-priority fix.
High Bug — TASK_UPDATE_STATUS still uses unreliable task.subtasks.length: The TASK_START handler was correctly fixed to read implementation_plan.json, but TASK_UPDATE_STATUS at line 749 still uses task.subtasks.length === 0. Tasks resumed via drag-and-drop still have the #1562 bug.
Progress acknowledged: The commit makes good progress — the rename_spec_dir_from_requirements() return type change to Path is clean, the PLANNING_FAILED event emission is well-designed, and the XState transitions for error recovery are correct. The critical phase_executor.spec_dir issue needs 2 additional lines to fix.
Review Process
Agents invoked: resolution-verifier, new-code-reviewer, comment-analyzer, finding-validator
This is an AI-generated follow-up review using parallel specialist analysis with finding validation.
Findings (8 selected of 8 total)
🟡 [a32816266b30] [MEDIUM] [UNRESOLVED] Incomplete fix: TASK_UPDATE_STATUS still uses unreliable task.subtasks.length
📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749
The PR description explicitly identifies 'task.subtasks.length === 0 was used instead of checking the actual implementation_plan.json file' as a root cause of #1562. This was fixed in the TASK_START handler (lines 230-250) by reading the actual implementation_plan.json file. However, the TASK_UPDATE_STATUS handler at line 749 still uses the old unreliable pattern: const needsImplementation = hasSpec && task.subtasks.length === 0. This code path is triggered when a user drags a task card to 'in_progress' on the Kanban board (manual status change with auto-start). If a task crashed during planning and the user drags it to 'in_progress' instead of clicking Resume, the same bug will manifest — task.subtasks.length may be stale or inaccurate, leading to incorrect execution decisions.
Resolution note: TASK_UPDATE_STATUS at line 749 still uses const needsImplementation = hasSpec && task.subtasks.length === 0; while the TASK_START handler was fixed at line 259 to use const needsImplementation = hasSpec && !planHasSubtasks; where planHasSubtasks is computed by reading implementation_plan.json (lines 172-190).
Suggested fix:
Apply the same fix as the TASK_START handler: read implementation_plan.json from disk and check for valid subtasks using the same pattern (lines 230-250). The existing `checkSubtasksCompletion` helper could also be reused here after reading and parsing the plan file, to avoid duplicating the subtask-counting logic a third time.
🟡 [6e248e327264] [MEDIUM] [UNRESOLVED] phase_executor.spec_dir not updated after directory rename
📁 apps/backend/spec/pipeline/orchestrator.py:371
After the spec directory is renamed (line 358) and self.spec_dir is correctly updated (line 369), the phase_executor object still holds the old stale spec_dir path. The code explicitly updates phase_executor.task_description (line 378) but misses updating phase_executor.spec_dir and phase_executor.spec_validator. All subsequent phases run through phase_executor (lines 397-407: historical_context, research, context, spec_writing, planning, validation, quick_spec) and they all reference self.spec_dir within the executor to read/write files (e.g., self.spec_dir / 'spec.md', self.spec_dir / 'implementation_plan.json'). After the rename, the old directory no longer exists on disk, so these phases would fail with file-not-found errors. The top-level try/except would catch this and emit PLANNING_FAILED (preventing stuck tasks), but spec creation would fail unnecessarily when the directory was successfully renamed. Note: the stale phase_executor.spec_dir is a pre-existing issue (before this PR, self.spec_dir wasn't updated either), but since this PR explicitly aims to fix stale spec_dir tracking and the comment says 'Update self.spec_dir after rename so subsequent phases use the correct path', the fix is incomplete.
Resolution note: After rename at lines 357-360, self.spec_dir and self.validator are updated but phase_executor.spec_dir and phase_executor.spec_validator are NOT. Line 367 even updates phase_executor.task_description showing awareness of the need to sync, but spec_dir was missed. PhaseExecutor stores spec_dir as a plain attribute (executor.py line 58).
Suggested fix:
After line 371 (after the break), add:
```python
# Also update phase executor to use the new path
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```

Alternatively, add it after the entire if block (after line 371) unconditionally:

```python
# Keep phase executor in sync with updated spec_dir
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```
#### 🟡 [00010260d37f] [MEDIUM] [UNRESOLVED] Duplicates existing checkSubtasksCompletion helper in same file
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:230`
The new code at lines 230-250 manually parses `implementation_plan.json` and counts subtasks by iterating `plan.phases` and summing `phase.subtasks.length`. The same file already defines `checkSubtasksCompletion()` at lines 44-58, which does the exact same traversal: `(plan?.phases as Array<...>)?.flatMap(phase => phase.subtasks || [])` and returns `totalCount`. Searched with Read tool at lines 44-58 - confirmed the helper extracts subtasks from the same `plan.phases` structure and counts them. The new code could reuse this helper: `planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0`.
Resolution note: TASK_START handler at lines 178-185 manually counts subtasks with reduce() instead of using checkSubtasksCompletion helper (lines 44-58). The TASK_STOP handler (line 334-336) already uses the pattern: const { totalCount } = checkSubtasksCompletion(plan); hasPlan = totalCount > 0.
**Suggested fix:**
Replace lines 230-250 with:
const planContent = safeReadFileSync(planFilePath);
let planHasSubtasks = false;
if (planContent) {
try {
const plan = JSON.parse(planContent);
planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0;
} catch {
// Invalid/corrupt plan file - treat as no subtasks
}
}

This reuses both safeReadFileSync and checkSubtasksCompletion from the same file.
#### 🔵 [7cbc0d966de4] [LOW] [UNRESOLVED] Uses existsSync+readFileSync instead of safeReadFileSync
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:234`
The new code uses `existsSync(planFilePath)` followed by `readFileSync(planFilePath, 'utf-8')` in a try/catch. The same file defines `safeReadFileSync()` at line 29, which wraps exactly this pattern (handles ENOENT gracefully, logs other errors). The file's own docstring describes it as 'Safe file read that handles missing files without TOCTOU issues'. Using `safeReadFileSync` would be consistent with the file's established conventions and avoid the TOCTOU race between existsSync and readFileSync.
Resolution note: TASK_START at line 174 uses existsSync(planFilePath) + readFileSync in try/catch. safeReadFileSync is defined at line 29 in the same file and already used by TASK_STOP (line 332) and TASK_RECOVER_STUCK (line 965).
**Suggested fix:**
Replace the existsSync + readFileSync pattern with const planContent = safeReadFileSync(planFilePath); followed by if (planContent) { try { ... JSON.parse ... } catch {} }.
#### 🔴 [NEW-001] [CRITICAL] phase_executor.spec_dir and spec_validator not updated after directory rename
📁 `apps/backend/spec/pipeline/orchestrator.py:360`
After renaming the spec directory (lines 357-360), self.spec_dir and self.validator are updated on the orchestrator, but phase_executor.spec_dir and phase_executor.spec_validator still point to the OLD directory path. The PhaseExecutor stores spec_dir as a plain attribute (executor.py line 58), not a reference. All subsequent phases (complexity_assessment, historical_context, research, context, spec_writing, self_critique, planning, validation) use phase_executor.spec_dir extensively. The code at line 367 even updates phase_executor.task_description, showing awareness of the need to sync attributes, but spec_dir and spec_validator were missed.
**Suggested fix:**
Add after line 360:
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
#### 🟡 [NEW-002] [MEDIUM] Backward-compat wrapper missing self.validator update (inconsistent with inline code)
📁 `apps/backend/spec/pipeline/orchestrator.py:766`
The _rename_spec_dir_from_requirements backward-compat wrapper (lines 757-769) updates self.spec_dir but does NOT update self.validator, unlike the inline code at lines 357-360 which creates a new SpecValidator. If any code path calls this method, the validator will reference the old directory. Currently only tests call this method, limiting production impact.
**Suggested fix:**
Add self.validator = SpecValidator(self.spec_dir) inside the if block at line 768
#### 🔴 [CMT-001] [CRITICAL] [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
📁 `apps/backend/spec/pipeline/orchestrator.py:360`
Both @sentry[bot] (marked CRITICAL) and coderabbitai's analysis confirm that phase_executor.spec_dir is not updated after the directory rename. This concern has not been addressed in the latest commit and has not been responded to by the author.
**Suggested fix:**
Add phase_executor.spec_dir = self.spec_dir and phase_executor.spec_validator = self.validator after line 360
#### 🟠 [CMT-002] [HIGH] [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length (same bug as #1562)
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749`
CodeRabbit's actionable comment identifies that TASK_UPDATE_STATUS at line 749 still uses the unreliable task.subtasks.length === 0 check. This is the exact same bug that was fixed in TASK_START. The auto-start path (triggered by dragging task to in_progress) still has the vulnerability.
**Suggested fix:**
Apply the same planHasSubtasks pattern from TASK_START (reading implementation_plan.json) to TASK_UPDATE_STATUS
---
*This review was generated by Auto Claude.*
…hange (#1562) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/backend/spec/pipeline/orchestrator.py (1)
757-784: 🛠️ Refactor suggestion | 🟠 Major

Backward-compat method duplicates checks and doesn't update self.validator. Two issues:

1. Missing self.validator update: The main code path (Line 360) reinitializes self.validator = SpecValidator(self.spec_dir) after rename, but this method doesn't. Any caller relying on the validator post-rename will reference the stale path.
2. Redundant prerequisite checks: Lines 767–778 duplicate the exact same guards that rename_spec_dir_from_requirements() already performs internally (checking file existence, loading JSON, validating task_description). The function returns the original path when no rename is needed, so these pre-checks are unnecessary.

Proposed simplification
 def _rename_spec_dir_from_requirements(self) -> bool:
-    # Check prerequisites first
-    requirements_file = self.spec_dir / "requirements.json"
-    if not requirements_file.exists():
-        return False
-
-    try:
-        with open(requirements_file, encoding="utf-8") as f:
-            req = json.load(f)
-        task_desc = req.get("task_description", "")
-        if not task_desc:
-            return False
-    except (json.JSONDecodeError, OSError):
-        return False
-
-    # Attempt rename
     new_spec_dir = rename_spec_dir_from_requirements(self.spec_dir)
     if new_spec_dir != self.spec_dir:
         self.spec_dir = new_spec_dir
+        self.validator = SpecValidator(self.spec_dir)
         return True
AndyMik90 left a comment
🤖 Auto Claude PR Review
🟠 Follow-up Review: Needs Revision
🟠 Needs revision - 8 unresolved finding(s) from previous review.
Resolution Status
- ✅ Resolved: 0 previous findings addressed
- ❌ Unresolved: 8 previous findings remain
- 🆕 New Issues: 7 new findings in recent changes
Finding Validation
- 🔍 Dismissed as False Positives: 2 findings were re-investigated and found to be incorrect
- ✓ Confirmed Valid: 5 findings verified as genuine issues
- 👤 Needs Human Review: 0 findings require manual verification
🚨 Blocking Issues
- quality: [UNRESOLVED] [UNRESOLVED] Incomplete fix: TASK_UPDATE_STATUS still uses unreliable task.subtasks.length
- quality: [UNRESOLVED] [UNRESOLVED] phase_executor.spec_dir not updated after directory rename
- quality: [UNRESOLVED] [UNRESOLVED] Duplicates existing checkSubtasksCompletion helper in same file
- quality: [UNRESOLVED] phase_executor.spec_dir and spec_validator not updated after directory rename
- quality: [UNRESOLVED] Backward-compat wrapper missing self.validator update (inconsistent with inline code)
- quality: [UNRESOLVED] [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
- quality: [UNRESOLVED] [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length (same bug as #1562)
- quality: phase_executor.spec_dir and spec_validator not updated after directory rename
- quality: TASK_UPDATE_STATUS still uses unreliable task.subtasks.length instead of implementation_plan.json check
- quality: Backward-compat _rename_spec_dir_from_requirements missing self.validator update
- quality: Duplicates existing checkSubtasksCompletion helper in same file
- quality: [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
- quality: [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length
Verdict
CI Status: ✅ All 20 CI checks passing — CI is not blocking.
However, all 8 previous findings remain unresolved. The latest commit 62e8723 only updated tests for the rename_spec_dir_from_requirements return type change and did not address any of the substantive code issues.
Confirmed CRITICAL (1 unique issue, flagged by 3 findings + 2 external tools):
phase_executor.spec_dir and phase_executor.spec_validator not updated after directory rename (6e248e327264 / NEW-001 / CMT-001). After rename_spec_dir_from_requirements renames the spec directory, all subsequent phases use stale paths via phase_executor, causing file-not-found errors. This is independently confirmed by sentry[bot] and coderabbitai. Fix: Add 2 lines after line 360 to update phase_executor.spec_dir and phase_executor.spec_validator.
Confirmed MEDIUM (3 unique issues):
- TASK_UPDATE_STATUS still uses task.subtasks.length (a32816266b30 / CMT-002) — The same #1562 bug was fixed in TASK_START but not replicated in TASK_UPDATE_STATUS.
- Backward-compat wrapper missing self.validator update (NEW-002) — Inconsistency with inline code.
- Duplicates checkSubtasksCompletion helper (00010260d37f) — New code re-implements logic already available in the same file.
Confirmed LOW (1 issue):
- Uses existsSync + readFileSync instead of safeReadFileSync (7cbc0d966de4).
Dismissed as false positives (2):
- NEW-003: Exception handler using updated self.spec_dir is correct, not stale.
- NEW-004: Null subtask entries are unrealistic and have zero behavioral impact.
Verdict: NEEDS_REVISION — The critical phase_executor.spec_dir issue and the medium TASK_UPDATE_STATUS inconsistency are confirmed valid issues that should be fixed before merge. The critical issue would cause runtime failures when spec directories are renamed.
Review Process
Agents invoked: resolution-verifier, new-code-reviewer, comment-analyzer, finding-validator
This is an AI-generated follow-up review using parallel specialist analysis with finding validation.
Findings (15 selected of 15 total)
🟡 [a32816266b30] [MEDIUM] [UNRESOLVED] [UNRESOLVED] Incomplete fix: TASK_UPDATE_STATUS still uses unreliable task.subtasks.length
📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749
The PR description explicitly identifies 'task.subtasks.length === 0 was used instead of checking the actual implementation_plan.json file' as a root cause of #1562. This was fixed in the TASK_START handler (lines 230-250) by reading the actual implementation_plan.json file. However, the TASK_UPDATE_STATUS handler at line 749 still uses the old unreliable pattern: const needsImplementation = hasSpec && task.subtasks.length === 0. This code path is triggered when a user drags a task card to 'in_progress' on the Kanban board (manual status change with auto-start). If a task crashed during planning and the user drags it to 'in_progress' instead of clicking Resume, the same bug will manifest — task.subtasks.length may be stale or inaccurate, leading to incorrect execution decisions.
Resolution note: TASK_UPDATE_STATUS at line 749 still uses const needsImplementation = hasSpec && task.subtasks.length === 0; while the TASK_START handler was fixed at line 259 to use const needsImplementation = hasSpec && !planHasSubtasks; where planHasSubtasks is computed by reading implementation_plan.json (lines 172-190).
Resolution note: Line 749 of execution-handlers.ts: const needsImplementation = hasSpec && task.subtasks.length === 0; -- TASK_UPDATE_STATUS handler still uses task.subtasks.length instead of the planHasSubtasks approach used in the fixed TASK_START handler at line 259.
Suggested fix:
Apply the same fix as the TASK_START handler: read implementation_plan.json from disk and check for valid subtasks using the same pattern (lines 230-250). The existing `checkSubtasksCompletion` helper could also be reused here after reading and parsing the plan file, to avoid duplicating the subtask-counting logic a third time.
🟡 [6e248e327264] [MEDIUM] [UNRESOLVED] [UNRESOLVED] phase_executor.spec_dir not updated after directory rename
📁 apps/backend/spec/pipeline/orchestrator.py:371
After the spec directory is renamed (line 358) and self.spec_dir is correctly updated (line 369), the phase_executor object still holds the old stale spec_dir path. The code explicitly updates phase_executor.task_description (line 378) but misses updating phase_executor.spec_dir and phase_executor.spec_validator. All subsequent phases run through phase_executor (lines 397-407: historical_context, research, context, spec_writing, planning, validation, quick_spec) and they all reference self.spec_dir within the executor to read/write files (e.g., self.spec_dir / 'spec.md', self.spec_dir / 'implementation_plan.json'). After the rename, the old directory no longer exists on disk, so these phases would fail with file-not-found errors. The top-level try/except would catch this and emit PLANNING_FAILED (preventing stuck tasks), but spec creation would fail unnecessarily when the directory was successfully renamed. Note: the stale phase_executor.spec_dir is a pre-existing issue (before this PR, self.spec_dir wasn't updated either), but since this PR explicitly aims to fix stale spec_dir tracking and the comment says 'Update self.spec_dir after rename so subsequent phases use the correct path', the fix is incomplete.
Resolution note: After rename at lines 357-360, self.spec_dir and self.validator are updated but phase_executor.spec_dir and phase_executor.spec_validator are NOT. Line 367 even updates phase_executor.task_description showing awareness of the need to sync, but spec_dir was missed. PhaseExecutor stores spec_dir as a plain attribute (executor.py line 58).
Resolution note: Lines 357-367 of orchestrator.py: After rename, self.spec_dir and self.validator are updated, phase_executor.task_description is updated, but phase_executor.spec_dir and phase_executor.spec_validator are NOT updated. PhaseExecutor stores its own copies (executor.py lines 57-60).
Suggested fix:
After line 371 (after the break), add:
```python
# Also update phase executor to use the new path
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```

Alternatively, add it after the entire if block (after line 371) unconditionally:

```python
# Keep phase executor in sync with updated spec_dir
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```
#### 🟡 [00010260d37f] [MEDIUM] [UNRESOLVED] [UNRESOLVED] Duplicates existing checkSubtasksCompletion helper in same file
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:230`
The new code at lines 230-250 manually parses `implementation_plan.json` and counts subtasks by iterating `plan.phases` and summing `phase.subtasks.length`. The same file already defines `checkSubtasksCompletion()` at lines 44-58, which does the exact same traversal: `(plan?.phases as Array<...>)?.flatMap(phase => phase.subtasks || [])` and returns `totalCount`. Searched with Read tool at lines 44-58 - confirmed the helper extracts subtasks from the same `plan.phases` structure and counts them. The new code could reuse this helper: `planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0`.
Resolution note: TASK_START handler at lines 178-185 manually counts subtasks with reduce() instead of using checkSubtasksCompletion helper (lines 44-58). The TASK_STOP handler (line 334-336) already uses the pattern: const { totalCount } = checkSubtasksCompletion(plan); hasPlan = totalCount > 0.
Resolution note: Lines 173-190 of execution-handlers.ts manually parse implementation_plan.json and count subtasks via reduce(). Lines 44-58 define checkSubtasksCompletion which does the same plan parsing and subtask counting via flatMap(). The duplication remains.
**Suggested fix:**
Replace lines 230-250 with:
const planContent = safeReadFileSync(planFilePath);
let planHasSubtasks = false;
if (planContent) {
try {
const plan = JSON.parse(planContent);
planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0;
} catch {
// Invalid/corrupt plan file - treat as no subtasks
}
}

This reuses both safeReadFileSync and checkSubtasksCompletion from the same file.
#### 🔵 [7cbc0d966de4] [LOW] [UNRESOLVED] [UNRESOLVED] Uses existsSync+readFileSync instead of safeReadFileSync
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:234`
The new code uses `existsSync(planFilePath)` followed by `readFileSync(planFilePath, 'utf-8')` in a try/catch. The same file defines `safeReadFileSync()` at line 29, which wraps exactly this pattern (handles ENOENT gracefully, logs other errors). The file's own docstring describes it as 'Safe file read that handles missing files without TOCTOU issues'. Using `safeReadFileSync` would be consistent with the file's established conventions and avoid the TOCTOU race between existsSync and readFileSync.
Resolution note: TASK_START at line 174 uses existsSync(planFilePath) + readFileSync in try/catch. safeReadFileSync is defined at line 29 in the same file and already used by TASK_STOP (line 332) and TASK_RECOVER_STUCK (line 965).
Resolution note: Lines 174-176 of execution-handlers.ts: `if (existsSync(planFilePath)) { try { const planContent = readFileSync(planFilePath, 'utf-8');` -- safeReadFileSync is defined at line 29 and handles this exact pattern, but is not used.
**Suggested fix:**
Replace the existsSync + readFileSync pattern with const planContent = safeReadFileSync(planFilePath); followed by if (planContent) { try { ... JSON.parse ... } catch {} }.
#### 🔴 [NEW-001] [CRITICAL] [UNRESOLVED] phase_executor.spec_dir and spec_validator not updated after directory rename
📁 `apps/backend/spec/pipeline/orchestrator.py:360`
After renaming the spec directory (lines 357-360), self.spec_dir and self.validator are updated on the orchestrator, but phase_executor.spec_dir and phase_executor.spec_validator still point to the OLD directory path. The PhaseExecutor stores spec_dir as a plain attribute (executor.py line 58), not a reference. All subsequent phases (complexity_assessment, historical_context, research, context, spec_writing, self_critique, planning, validation) use phase_executor.spec_dir extensively. The code at line 367 even updates phase_executor.task_description, showing awareness of the need to sync attributes, but spec_dir and spec_validator were missed.
Resolution note: Same issue as 6e248e327264. Lines 357-360 of orchestrator.py update self.spec_dir and self.validator but NOT phase_executor.spec_dir or phase_executor.spec_validator. PhaseExecutor.__init__ stores these as independent instance attributes (executor.py line 58, 60).
**Suggested fix:**
Add after line 360:
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
#### 🟡 [NEW-002] [MEDIUM] [UNRESOLVED] Backward-compat wrapper missing self.validator update (inconsistent with inline code)
📁 `apps/backend/spec/pipeline/orchestrator.py:766`
The _rename_spec_dir_from_requirements backward-compat wrapper (lines 757-769) updates self.spec_dir but does NOT update self.validator, unlike the inline code at lines 357-360 which creates a new SpecValidator. If any code path calls this method, the validator will reference the old directory. Currently only tests call this method, limiting production impact.
Resolution note: Lines 780-784 of orchestrator.py: backward-compat wrapper updates self.spec_dir but not self.validator = SpecValidator(self.spec_dir), unlike inline code at line 360.
**Suggested fix:**
Add self.validator = SpecValidator(self.spec_dir) inside the if block at line 768
#### 🔴 [CMT-001] [CRITICAL] [UNRESOLVED] [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
📁 `apps/backend/spec/pipeline/orchestrator.py:360`
Both @sentry[bot] (marked CRITICAL) and coderabbitai's analysis confirm that phase_executor.spec_dir is not updated after the directory rename. This concern has not been addressed in the latest commit and has not been responded to by the author.
Resolution note: Same as 6e248e327264. Lines 355-367 of orchestrator.py do not update phase_executor.spec_dir or phase_executor.spec_validator after rename.
**Suggested fix:**
Add phase_executor.spec_dir = self.spec_dir and phase_executor.spec_validator = self.validator after line 360
#### 🟠 [CMT-002] [HIGH] [UNRESOLVED] [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length (same bug as #1562)
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749`
CodeRabbit's actionable comment identifies that TASK_UPDATE_STATUS at line 749 still uses the unreliable task.subtasks.length === 0 check. This is the exact same bug that was fixed in TASK_START. The auto-start path (triggered by dragging task to in_progress) still has the vulnerability.
Resolution note: Same as a32816266b30. Line 749 of execution-handlers.ts still uses task.subtasks.length === 0.
**Suggested fix:**
Apply the same planHasSubtasks pattern from TASK_START (reading implementation_plan.json) to TASK_UPDATE_STATUS
#### 🔴 [6e248e327264] [CRITICAL] phase_executor.spec_dir and spec_validator not updated after directory rename
📁 `apps/backend/spec/pipeline/orchestrator.py:360`
After rename_spec_dir_from_requirements renames the spec directory (lines 357-360), the orchestrator updates self.spec_dir and self.validator, but phase_executor.spec_dir and phase_executor.spec_validator still point to the OLD path. Line 367 updates phase_executor.task_description but misses spec_dir and spec_validator. All subsequent phases (historical_context, research, spec_writing, planning, validation) use phase_executor.spec_dir and will fail with file-not-found errors when the directory is renamed. Fix: Add phase_executor.spec_dir = self.spec_dir and phase_executor.spec_validator = self.validator after line 360.
**Suggested fix:**
After line 360, add:
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
#### 🟡 [a32816266b30] [MEDIUM] TASK_UPDATE_STATUS still uses unreliable task.subtasks.length instead of implementation_plan.json check
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749`
The TASK_UPDATE_STATUS handler at line 749 uses `task.subtasks.length === 0` to determine needsImplementation, which is the exact same unreliable pattern that was fixed in TASK_START (line 259) using planHasSubtasks from implementation_plan.json. The TASK_START handler has explicit 'FIX (#1562)' comments, but the TASK_UPDATE_STATUS auto-start path was not updated. This means dragging a crashed task to in_progress (triggering TASK_UPDATE_STATUS) can still misroute execution.
**Suggested fix:**
Apply the same planHasSubtasks approach from TASK_START (lines 164-190) in the TASK_UPDATE_STATUS handler, or extract a shared helper function.
#### 🟡 [NEW-002] [MEDIUM] Backward-compat _rename_spec_dir_from_requirements missing self.validator update
📁 `apps/backend/spec/pipeline/orchestrator.py:783`
The backward-compat method _rename_spec_dir_from_requirements (lines 757-784) updates self.spec_dir (line 783) after rename but does NOT update self.validator = SpecValidator(self.spec_dir), unlike the inline code at lines 357-360 which correctly updates both. If this method is called, self.validator will point to the old directory.
**Suggested fix:**
Add self.validator = SpecValidator(self.spec_dir) inside the if block at line 783.
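For NEW-002, a minimal sketch of the shape the fix could take, under the assumption that the wrapper already has the renamed path in hand; `SpecValidator` is again a stand-in and the simplified signature is an assumption, not the project's actual API.

```python
from pathlib import Path


class SpecValidator:
    """Stand-in validator bound to a spec directory."""

    def __init__(self, spec_dir: Path) -> None:
        self.spec_dir = spec_dir


class Orchestrator:
    def __init__(self, spec_dir: Path) -> None:
        self.spec_dir = spec_dir
        self.validator = SpecValidator(spec_dir)

    def _rename_spec_dir_from_requirements(self, new_dir: Path) -> None:
        """Backward-compat wrapper, sketched: keep spec_dir AND validator in sync."""
        if new_dir != self.spec_dir:
            self.spec_dir = new_dir
            # The assignment the finding says is missing: rebuild the validator
            # so it points at the renamed directory, matching the inline code path.
            self.validator = SpecValidator(self.spec_dir)


if __name__ == "__main__":
    orch = Orchestrator(Path("specs/001-temp-name"))
    orch._rename_spec_dir_from_requirements(Path("specs/001-login-flow"))
    assert orch.validator.spec_dir == orch.spec_dir == Path("specs/001-login-flow")
```

Keeping both updates inside the same `if` block means any test that calls the wrapper observes the same post-rename state as the main orchestration flow.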
#### 🟡 [00010260d37f] [MEDIUM] Duplicates existing checkSubtasksCompletion helper in same file
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:173`
New code at lines 173-190 manually parses implementation_plan.json and counts subtasks via reduce(). The same file defines checkSubtasksCompletion() at lines 44-58 which does the same plan parsing and subtask counting. The helper is used elsewhere in the file (lines 335, 984, 1013). The new code could call checkSubtasksCompletion(plan) and check totalCount > 0 instead of duplicating the logic.
**Suggested fix:**
Replace lines 178-185 with: const { totalCount } = checkSubtasksCompletion(plan); planHasSubtasks = totalCount > 0;
#### 🔵 [7cbc0d966de4] [LOW] Uses existsSync+readFileSync instead of safeReadFileSync helper
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:174`
New code uses existsSync(planFilePath) + readFileSync() in try/catch. The same file defines safeReadFileSync() at line 29, designed specifically to avoid this TOCTOU anti-pattern. The outer try/catch prevents crashes, making this a code quality issue rather than a bug.
**Suggested fix:**
Replace with: const planContent = safeReadFileSync(planFilePath); if (planContent) { try { const plan = JSON.parse(planContent); ... } catch {} }
#### 🔴 [CMT-001] [CRITICAL] [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
📁 `apps/backend/spec/pipeline/orchestrator.py:360`
Both @sentry[bot] (CRITICAL) and coderabbitai confirm that phase_executor.spec_dir is not updated after the directory rename. This is the same issue as finding 6e248e327264. The concern raised by both tools is validated - phase_executor.spec_dir and spec_validator remain stale.
**Suggested fix:**
Add phase_executor.spec_dir = self.spec_dir and phase_executor.spec_validator = self.validator after line 360.
#### 🟠 [CMT-002] [HIGH] [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length
📁 `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749`
CodeRabbit identified that the TASK_UPDATE_STATUS handler at line 749 still uses the unreliable task.subtasks.length === 0 check. This is the same root cause bug (#1562) that was fixed in TASK_START but not in TASK_UPDATE_STATUS. Same issue as finding a32816266b30.
**Suggested fix:**
Apply the same planHasSubtasks pattern used in TASK_START.
---
*This review was generated by Auto Claude.*
- Update phase_executor.spec_dir and spec_validator after directory rename so subsequent phases don't use stale paths (critical bug flagged by sentry, coderabbitai, and Auto Claude review)
- Fix TASK_UPDATE_STATUS handler to use file-based plan check instead of unreliable task.subtasks.length (same #1562 bug fixed in TASK_START)
- Replace manual subtask counting with existing checkSubtasksCompletion helper
- Use safeReadFileSync instead of existsSync+readFileSync (TOCTOU fix)
- Add self.validator update to backward-compat _rename_spec_dir_from_requirements

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
apps/backend/spec/pipeline/orchestrator.py (1)
760-788: 🧹 Nitpick | 🔵 Trivial
Backward-compat method looks correct for its stated purpose.
Note that unlike the main code path (lines 362–363), this method doesn't update `phase_executor.spec_dir`/`phase_executor.spec_validator`. This is fine since it's only used by tests, but worth a brief docstring note so future callers don't use it in orchestration flow.
apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts (2)
267-305: 🧹 Nitpick | 🔵 Trivial
The `needsImplementation` and `else` branches are identical calls — consider merging.
Both `needsImplementation` (line 274) and the `else` branch (line 292) invoke `agentManager.startTaskExecution` with the exact same arguments. The only difference is the log message. If `run.py` handles both cases identically, these could be merged into a single branch with a descriptive log:
```diff
-  } else if (needsImplementation) {
-    ...
-    console.warn('[TASK_START] Starting task execution (no valid subtasks in plan) for:', task.specId);
-    agentManager.startTaskExecution(/* ... */);
-  } else {
-    console.warn('[TASK_START] Starting task execution (has subtasks) for:', task.specId);
-    agentManager.startTaskExecution(/* ... */);
-  }
+  } else {
+    const reason = needsImplementation ? 'no valid subtasks in plan' : 'has subtasks';
+    console.warn(`[TASK_START] Starting task execution (${reason}) for:`, task.specId);
+    agentManager.startTaskExecution(/* ... */);
+  }
```
This reduces ~20 lines of duplication. Same applies to the equivalent block in `TASK_UPDATE_STATUS` (lines 765–798).
1187-1216: ⚠️ Potential issue | 🟡 Minor
Recovery auto-restart path missing `planHasSubtasks` check and XState event.
The `TASK_RECOVER_STUCK` auto-restart code (lines 1195–1216) checks only `hasSpec` when deciding whether to call `startTaskExecution`, unlike `TASK_START` and `TASK_UPDATE_STATUS`, which both verify `planHasSubtasks` before restarting.
When recovery encounters a task that crashed during planning (spec.md exists but implementation_plan.json is missing or has no subtasks), it will call `startTaskExecution` without sending the `PLANNING_STARTED` XState event. This leaves the state machine and the log message inconsistent: the recovery handler already reads and analyzes the plan correctly to determine status, but discards that information when restarting.
Add the same `planHasSubtasks` check to the recovery restart path for consistency with other handlers and to ensure proper XState transitions.
🤖 Fix all issues with AI agents
In `@apps/backend/spec/pipeline/orchestrator.py`:
- Around line 241-262: Replace the inline emit/try block that duplicates the
emission helper with a call to the centralized helper: remove the try/except
that creates TaskEventEmitter and emits "PLANNING_FAILED" and instead call
self._emit_planning_failed(...) passing the current task_logger (and the caught
exception e) so the emission logic is centralized; keep the subsequent
task_logger.end_phase(...) handling as-is.
In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts`:
- Around line 741-753: Extract a helper function
planFileHasSubtasks(planFilePath: string): boolean that encapsulates the current
pattern: read file with safeReadFileSync, JSON.parse the content inside a
try/catch, call checkSubtasksCompletion(parsedPlan).totalCount > 0 and return
that result (false on read/parse errors or missing file); then replace the
duplicated inline logic in both the TASK_UPDATE_STATUS handler (where
updatePlanFilePath / updatePlanHasSubtasks / needsImplementation are used) and
the TASK_START handler (the block using specDir,
AUTO_BUILD_PATHS.IMPLEMENTATION_PLAN and checkSubtasksCompletion) to call
planFileHasSubtasks(...) instead so both paths share the same implementation.
```python
try:
    return await self._run_phases(interactive, auto_approve, task_logger, ui)
except Exception as e:
    # Emit PLANNING_FAILED so the frontend XState machine transitions to error state
    # instead of leaving the task stuck in "planning" forever
    try:
        task_emitter = TaskEventEmitter.from_spec_dir(self.spec_dir)
        task_emitter.emit(
            "PLANNING_FAILED",
            {"error": str(e), "recoverable": True},
        )
    except Exception:
        pass  # Don't mask the original error
    try:
        task_logger.end_phase(
            LogPhase.PLANNING,
            success=False,
            message=f"Spec creation crashed: {e}",
        )
    except Exception:
        pass  # Best effort - don't mask the original error when logging fails
    raise
```
🛠️ Refactor suggestion | 🟠 Major
Use _emit_planning_failed helper instead of inlining duplicate logic.
Lines 246–251 replicate exactly what _emit_planning_failed (line 690) already does, including the protective try/except. Replace the inline block with a single call to keep the emission logic centralized.
♻️ Proposed fix
```diff
 try:
     return await self._run_phases(interactive, auto_approve, task_logger, ui)
 except Exception as e:
     # Emit PLANNING_FAILED so the frontend XState machine transitions to error state
     # instead of leaving the task stuck in "planning" forever
-    try:
-        task_emitter = TaskEventEmitter.from_spec_dir(self.spec_dir)
-        task_emitter.emit(
-            "PLANNING_FAILED",
-            {"error": str(e), "recoverable": True},
-        )
-    except Exception:
-        pass  # Don't mask the original error
+    self._emit_planning_failed(f"Spec creation crashed: {e}")
     try:
         task_logger.end_phase(
             LogPhase.PLANNING,
```
🤖 Prompt for AI Agents
In `@apps/backend/spec/pipeline/orchestrator.py` around lines 241 - 262, Replace
the inline emit/try block that duplicates the emission helper with a call to the
centralized helper: remove the try/except that creates TaskEventEmitter and
emits "PLANNING_FAILED" and instead call self._emit_planning_failed(...) passing
the current task_logger (and the caught exception e) so the emission logic is
centralized; keep the subsequent task_logger.end_phase(...) handling as-is.
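Since `_emit_planning_failed` itself is referenced but not shown in the diff, here is a hedged Python sketch of the kind of best-effort emitter the suggestion assumes, with a stub `TaskEventEmitter` so the snippet runs on its own; the real helper at orchestrator.py line 690 may differ in signature and payload.

```python
class TaskEventEmitter:
    """Stub standing in for the real TaskEventEmitter used in the snippet above."""

    def __init__(self, spec_dir: str) -> None:
        self.spec_dir = spec_dir

    @classmethod
    def from_spec_dir(cls, spec_dir: str) -> "TaskEventEmitter":
        return cls(spec_dir)

    def emit(self, event: str, payload: dict) -> None:
        print(f"emit {event} for {self.spec_dir}: {payload}")


class Orchestrator:
    def __init__(self, spec_dir: str) -> None:
        self.spec_dir = spec_dir

    def _emit_planning_failed(self, error_message: str) -> None:
        """Best-effort PLANNING_FAILED emission; never raises, so it cannot mask the original crash."""
        try:
            emitter = TaskEventEmitter.from_spec_dir(self.spec_dir)
            emitter.emit("PLANNING_FAILED", {"error": error_message, "recoverable": True})
        except Exception:
            pass  # Swallow emission failures: the caller is already handling an error


if __name__ == "__main__":
    Orchestrator("specs/001-login-flow")._emit_planning_failed("Spec creation crashed: example")
```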
```typescript
// FIX (#1562): Check actual plan file for subtasks, not just task.subtasks.length
const updatePlanFilePath = path.join(specDir, AUTO_BUILD_PATHS.IMPLEMENTATION_PLAN);
let updatePlanHasSubtasks = false;
const updatePlanContent = safeReadFileSync(updatePlanFilePath);
if (updatePlanContent) {
  try {
    const plan = JSON.parse(updatePlanContent);
    updatePlanHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0;
  } catch {
    // Invalid/corrupt plan file - treat as no subtasks
  }
}
const needsImplementation = hasSpec && !updatePlanHasSubtasks;
```
🧹 Nitpick | 🔵 Trivial
Duplicated plan-reading logic — extract a helper.
The pattern of reading the plan file → parsing JSON → calling checkSubtasksCompletion is repeated here (lines 741–752) and in the TASK_START handler (lines 164–182). Extract a small helper like planFileHasSubtasks(planFilePath: string): boolean to DRY this up and reduce the chance of the two paths diverging.
♻️ Proposed helper and usage
```diff
+/**
+ * Check if an implementation_plan.json file contains valid subtasks.
+ */
+function planFileHasSubtasks(planFilePath: string): boolean {
+  const content = safeReadFileSync(planFilePath);
+  if (!content) return false;
+  try {
+    const plan = JSON.parse(content);
+    return checkSubtasksCompletion(plan).totalCount > 0;
+  } catch {
+    return false;
+  }
+}
```
Then replace the inline blocks in both TASK_START and TASK_UPDATE_STATUS:
```diff
-  let planHasSubtasks = false;
-  const planContent = safeReadFileSync(planFilePath);
-  if (planContent) {
-    try {
-      const plan = JSON.parse(planContent);
-      planHasSubtasks = checkSubtasksCompletion(plan).totalCount > 0;
-    } catch {
-      // Invalid/corrupt plan file - treat as no subtasks
-    }
-  }
+  const planHasSubtasks = planFileHasSubtasks(planFilePath);
```
🤖 Prompt for AI Agents
In `@apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts` around lines
741 - 753, Extract a helper function planFileHasSubtasks(planFilePath: string):
boolean that encapsulates the current pattern: read file with safeReadFileSync,
JSON.parse the content inside a try/catch, call
checkSubtasksCompletion(parsedPlan).totalCount > 0 and return that result (false
on read/parse errors or missing file); then replace the duplicated inline logic
in both the TASK_UPDATE_STATUS handler (where updatePlanFilePath /
updatePlanHasSubtasks / needsImplementation are used) and the TASK_START handler
(the block using specDir, AUTO_BUILD_PATHS.IMPLEMENTATION_PLAN and
checkSubtasksCompletion) to call planFileHasSubtasks(...) instead so both paths
share the same implementation.
AndyMik90 left a comment
🤖 Auto Claude PR Review
🟡 Follow-up Review: Merge With Changes
🟡 Can merge - Minor suggestions noted, no blockers.
Resolution Status
- ✅ Resolved: 8 previous findings addressed
- ❌ Unresolved: 4 previous findings remain
- 🆕 New Issues: 4 new findings in recent changes
Finding Validation
- 🔍 Dismissed as False Positives: 0 findings were re-investigated and found to be incorrect
- ✓ Confirmed Valid: 4 findings verified as genuine issues
- 👤 Needs Human Review: 0 findings require manual verification
🚨 Blocking Issues
- quality: TASK_UPDATE_STATUS auto-start path missing planning crash recovery logic (NEW-001)
- quality: [UNRESOLVED] phase_executor.spec_dir and spec_validator not updated after directory rename
- quality: [UNRESOLVED] [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
- quality: [UNRESOLVED] [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length
Verdict
[Recovered via extraction] All 8 previous findings are resolved. However, there are 4 new findings (1 medium-severity new code issue and 3 low-severity code quality suggestions from comments), all confirmed valid. The medium-severity finding (missing planning crash recovery logic in TASK_UPDATE_STATUS auto-start path) warrants attention before merge, but none are blocking.
Review Process
Agents invoked: resolution-verifier, new-code-reviewer, comment-analyzer, finding-validator
This is an AI-generated follow-up review using parallel specialist analysis with finding validation.
Findings (7 selected of 7 total)
🟡 [FU-1CE8D5CA] [MEDIUM] TASK_UPDATE_STATUS auto-start path missing planning crash recovery logic (NEW-001)
📁 unknown:0
[Recovered via extraction] TASK_UPDATE_STATUS auto-start path missing planning crash recovery logic (NEW-001)
🔵 [FU-2252D5AC] [LOW] Inline PLANNING_FAILED duplication (CMT-001)
📁 unknown:0
[Recovered via extraction] Inline PLANNING_FAILED duplication (CMT-001)
🔵 [FU-3068C285] [LOW] Duplicated plan-reading pattern (CMT-002)
📁 unknown:0
[Recovered via extraction] Duplicated plan-reading pattern (CMT-002)
🔵 [FU-0D9D302C] [LOW] Identical branches (CMT-003)
📁 unknown:0
[Recovered via extraction] Identical branches (CMT-003)
🔴 [NEW-001] [CRITICAL] [UNRESOLVED] phase_executor.spec_dir and spec_validator not updated after directory rename
📁 apps/backend/spec/pipeline/orchestrator.py:360
After renaming the spec directory (lines 357-360), self.spec_dir and self.validator are updated on the orchestrator, but phase_executor.spec_dir and phase_executor.spec_validator still point to the OLD directory path. The PhaseExecutor stores spec_dir as a plain attribute (executor.py line 58), not a reference. All subsequent phases (complexity_assessment, historical_context, research, context, spec_writing, self_critique, planning, validation) use phase_executor.spec_dir extensively. The code at line 367 even updates phase_executor.task_description, showing awareness of the need to sync attributes, but spec_dir and spec_validator were missed.
Resolution note: Same issue as 6e248e327264. Lines 357-360 of orchestrator.py update self.spec_dir and self.validator but NOT phase_executor.spec_dir or phase_executor.spec_validator. PhaseExecutor.__init__ stores these as independent instance attributes (executor.py line 58, 60).
Suggested fix:
Add after line 360:
```python
phase_executor.spec_dir = self.spec_dir
phase_executor.spec_validator = self.validator
```
🔴 [CMT-001] [CRITICAL] [UNRESOLVED] [FROM COMMENTS] sentry[bot] and coderabbitai both flag: phase_executor.spec_dir stale after rename
📁 apps/backend/spec/pipeline/orchestrator.py:360
Both @sentry[bot] (CRITICAL) and coderabbitai confirm that phase_executor.spec_dir is not updated after the directory rename. This is the same issue as finding 6e248e327264. The concern raised by both tools is validated - phase_executor.spec_dir and spec_validator remain stale.
Suggested fix:
Add phase_executor.spec_dir = self.spec_dir and phase_executor.spec_validator = self.validator after line 360.
🟠 [CMT-002] [HIGH] [UNRESOLVED] [FROM COMMENTS] coderabbitai flags: TASK_UPDATE_STATUS still uses task.subtasks.length
📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:749
CodeRabbit identified that the TASK_UPDATE_STATUS handler at line 749 still uses the unreliable task.subtasks.length === 0 check. This is the same root cause bug (#1562) that was fixed in TASK_START but not in TASK_UPDATE_STATUS. Same issue as finding a32816266b30.
Suggested fix:
Apply the same planHasSubtasks pattern used in TASK_START.
This review was generated by Auto Claude.
Summary
Fixes #1562 — Tasks that crash during the spec creation/planning phase get stuck in "planning" state forever, and clicking Resume also crashes because it incorrectly transitions to "coding" state.
Root Causes
- No `PLANNING_STARTED` transition from `error` or `human_review` states, so resume always went to `coding` (which immediately fails with no subtasks)
- `self.spec_dir` wasn't updated after `rename_spec_dir_from_requirements()`, causing subsequent phases to use a stale path
- `task.subtasks.length === 0` was used instead of checking the actual `implementation_plan.json` file

Changes
Backend: `apps/backend/spec/pipeline/orchestrator.py`
- Wrap `run()` in try/except to emit `PLANNING_FAILED` on unhandled exceptions
- Call the `_emit_planning_failed()` helper at every early return path
- Fix `self.spec_dir` tracking after directory rename

Frontend: `apps/frontend/src/shared/state-machines/task-machine.ts`
- Add `PLANNING_STARTED` transitions from `error` and `human_review` states

Frontend: `apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts`
- Send `PLANNING_STARTED` instead of `USER_RESUMED` when the crashed task has no subtasks
- Check the actual `implementation_plan.json` for subtasks instead of `task.subtasks.length`

Test Plan
Closes #1562
Summary by CodeRabbit
Bug Fixes
Improvements