fix: self-healing for invalid file paths in coder pipeline #1847
Conversation
…g in coder pipeline

- Add functions for building a file index, finding correct paths using fuzzy matching, and auto-correcting subtask file paths.
- Introduce directory exclusion logic to optimize file searching.
- Enhance validation of file paths in implementation plans to support better error recovery.
- Add comprehensive tests for the new functionality, covering various matching scenarios and edge cases.

This update improves the robustness of the coder pipeline by enabling it to automatically correct file paths based on the existing project structure, enhancing overall stability and user experience.
📝 Walkthrough

Introduces file path self-healing with in-tree indexing and automatic path correction for implementation plans. Updates progress tracking to unblock phases when subtasks are stuck. Integrates validation and auto-correction into the agent loop and adds recovery marker cleanup to prevent infinite loops.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent as Coder Agent
    participant Index as File Index Builder
    participant Finder as Path Finder
    participant Corrector as Auto-Corrector
    participant Task as Task State
    participant Storage as JSON Storage
    Agent->>Index: _build_file_index(project_dir)
    Index-->>Agent: file_index (basename & directory maps)
    Agent->>Corrector: _auto_correct_subtask_files(subtask, missing_files)
    Corrector->>Finder: _find_correct_path_indexed(missing_path)
    Finder->>Index: lookup in file_index
    Index-->>Finder: best match candidate
    Finder-->>Corrector: corrected_path (or None)
    Corrector->>Task: update subtask.files (in-memory)
    Corrector->>Storage: write_json_atomic(plan_file)
    Storage-->>Corrector: persisted
    Corrector-->>Agent: list of corrected files
    Agent->>Agent: retry validation with corrected paths
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 4 failed (4 warnings)
Summary of Changes

Hello @AndyMik90, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness and self-healing capabilities of the coder pipeline by addressing critical issues related to invalid file paths. It introduces intelligent fuzzy matching to automatically correct erroneous file references, improves the system's ability to report and recover from stuck subtasks, and prevents infinite recovery loops, ensuring a more resilient and efficient development process.
Code Review
This is an excellent pull request that addresses a critical set of bugs causing the coder pipeline to get permanently stuck. The introduction of a self-healing mechanism for invalid file paths is a significant improvement to the system's reliability. The fuzzy path matching with a scoring algorithm is well-designed, and the post-planning validation is a great proactive measure. The changes are well-tested with a comprehensive new test suite, which gives high confidence in the solution. I have a few minor suggestions to improve code clarity and maintainability, but overall, this is a very strong contribution.
```python
def validate_subtask_files(subtask: dict, project_dir: Path) -> dict:
```

```python
def _find_correct_path_indexed(
    missing_path: str, parent_parts: tuple[str, ...], file_index: dict
```
The type hint for file_index is dict, which is too generic. For better type safety and clarity, it should be specified with the more precise type returned by _build_file_index.
```diff
-    missing_path: str, parent_parts: tuple[str, ...], file_index: dict
+    missing_path: str, parent_parts: tuple[str, ...], file_index: dict[str, list[tuple[str, Path]]]
```
```python
found = False
for phase in plan.get("phases", []):
    for plan_subtask in phase.get("subtasks", []):
        if plan_subtask.get("id") == subtask_id:
            plan_files = plan_subtask.get("files_to_modify", [])
            plan_subtask["files_to_modify"] = [
                corrections.get(f, f) for f in plan_files
            ]
            found = True
            break
    if found:
        break
```
The logic here to find and update a subtask in the plan duplicates functionality that already exists in agents.utils.find_subtask_in_plan. To improve maintainability and reduce code duplication, you could use the existing utility. You'll need to add find_subtask_in_plan to the import list from .utils.
```python
plan_subtask = find_subtask_in_plan(plan, subtask_id)
if plan_subtask:
    plan_files = plan_subtask.get("files_to_modify", [])
    plan_subtask["files_to_modify"] = [
        corrections.get(f, f) for f in plan_files
    ]
```
```python
# First pass: collect all missing files and their suffixes
missing_entries: list[
    tuple[dict, int, str]
```
|
|
```python
def test_exact_basename_with_shared_parents_wins(self, project_tree):
    """Score prefers candidates sharing more parent directory segments."""
    result = _find_correct_path("src/renderer/components/Modal.tsx", project_tree)
```
Check warning from Code scanning / CodeQL: Variable defined multiple times (Warning, test)
Copilot Autofix
To fix the problem, we should remove the redundant assignment to result that is immediately overwritten and whose value is never used. In general, when the same variable is assigned multiple times before any read, keep only the assignment whose value is actually used, unless earlier assignments are needed for their side effects.
Concretely, in tests/test_file_path_self_healing.py, inside TestFindCorrectPath.test_exact_basename_with_shared_parents_wins, delete the first line that assigns to result (line 181: result = _find_correct_path("src/renderer/components/Modal.tsx", project_tree)). The comment right below explains that we intend to "test a wrong parent instead", which is exactly what the remaining second assignment does. No new imports, helpers, or changes elsewhere are needed; we just remove that one unused assignment line.
```diff
@@ -178,7 +178,6 @@
     def test_exact_basename_with_shared_parents_wins(self, project_tree):
         """Score prefers candidates sharing more parent directory segments."""
-        result = _find_correct_path("src/renderer/components/Modal.tsx", project_tree)
         # File exists at the exact path, but _find_correct_path is only called
         # for missing paths. Let's test a wrong parent instead.
         result = _find_correct_path("src/components/Modal.tsx", project_tree)
```
```python
else:
    plan_validated = True
    planning_retry_context = None
```
Bug: After exhausting retries for invalid file paths, the plan is incorrectly marked as valid (plan_validated = True) and allowed to proceed, defeating the purpose of the validation step.
Severity: MEDIUM
Suggested Fix
When planning_validation_failures reaches max_planning_validation_retries and path_issues still exist, the system should not set plan_validated = True. Instead, it should explicitly fail the planning phase, for example by raising an error or returning a failure status to prevent the faulty plan from being executed.
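A minimal sketch of that explicit failure, assuming the loop state from this hunk and a hypothetical PlanValidationError exception introduced here purely for illustration:

```python
class PlanValidationError(Exception):
    """Hypothetical error for a plan that still has invalid paths after retries."""


if path_issues and planning_validation_failures >= max_planning_validation_retries:
    # Fail the planning phase loudly instead of silently accepting a bad plan
    raise PlanValidationError(
        f"Plan still references invalid file paths after "
        f"{planning_validation_failures} planner retries: {path_issues}"
    )
plan_validated = True
planning_retry_context = None
```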
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: apps/backend/agents/coder.py#L1227-L1229
Potential issue: After the maximum number of planning retries
(`max_planning_validation_retries`) is reached due to invalid file paths in the plan,
the code proceeds to an `else` block that sets `plan_validated = True`. This incorrectly
marks the plan as valid, even though it contains known `path_issues`. While a subsequent
validation step during the coding phase will eventually catch these invalid paths and
mark the subtasks as stuck, this behavior defeats the purpose of the post-planning
validation, which is intended to prevent faulty plans from being executed. The system
silently accepts a bad plan after retries have failed.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@apps/backend/agents/coder.py`:
- Around line 101-116: The _EXCLUDE_DIRS frozenset is missing several common
IDE/build/vendor directories which can cause false positives; update the
_EXCLUDE_DIRS declaration in apps/backend/agents/coder.py to include additional
common folders such as ".idea", ".vscode", "vendor", "target", "out", and any
other org-specific build dirs you encounter, keeping the set as a frozenset and
preserving existing entries; run any existing file-search/unit tests after
adding those names to ensure no regressions.
- Around line 457-463: The except clause around write_json_atomic should not
catch json.JSONDecodeError because write_json_atomic serializes/writes (not
decodes); update the except to only catch relevant write/serialization errors
(e.g., OSError and optionally TypeError) in the blocks that call
write_json_atomic (the exception handlers currently around write_json_atomic in
coder.py); remove json.JSONDecodeError from those except tuples so the handler
accurately reflects possible failures.
- Around line 336-349: _auto_correct_subtask_files currently calls
_find_correct_path for every missing file which triggers a full os.walk each
time; instead build the file index once and reuse it: call
_build_file_index(project_dir) at the start of _auto_correct_subtask_files, then
replace per-file calls to _find_correct_path with a single-index lookup via
_find_correct_path_indexed(missing_path, file_index). Keep the same
logging/print_status and still_missing behavior and maintain the same return
semantics as _validate_plan_file_paths.
- Around line 231-313: The duplicated scoring logic in _find_correct_path and
_find_correct_path_indexed should be extracted into a single helper (e.g.,
_score_and_select_candidate) that accepts candidate records (rel string and Path
or candidate_parts) plus the original parent_parts and returns the best_path or
None; move the base weights (10.0, 8.0), per-shared-parent bonus (+3.0), depth
penalty (-0.5), minimum score (8.0) and runner-up gap (3.0) into that helper so
both functions call it instead of reimplementing scoring, then update
_find_correct_path and _find_correct_path_indexed to build the candidate list
(rel_str + Path/parts) and delegate selection to the helper, keeping behavior
identical and adding a small unit test to confirm parity.
- Around line 1213-1229: When _validate_plan_file_paths(spec_dir, project_dir)
returns path_issues but planning_validation_failures has already reached
max_planning_validation_retries, add an explicit warning before proceeding: set
planning_retry_context = path_issues (if not already), call print_status with a
clear warning like "Proceeding despite unresolved plan file paths after max
retries" (or similar), and keep first_run/status/plan_validated behavior
consistent with the current else branch so higher-level validation still runs;
reference the symbols _validate_plan_file_paths, planning_validation_failures,
max_planning_validation_retries, planning_retry_context, print_status,
first_run, status, and plan_validated to locate where to add this log.
In `@tests/test_file_path_self_healing.py`:
- Around line 114-150: The fixture spec_dir_with_plan is unused and should be
removed or integrated; either delete the pytest fixture named spec_dir_with_plan
from tests/test_file_path_self_healing.py to avoid dead code, or replace the
inline plan setup in TestAutoCorrectSubtaskFiles and TestValidatePlanFilePaths
with this fixture by using spec_dir_with_plan as a test parameter and updating
those tests to read the implementation_plan.json created by the fixture; ensure
references to the fixture name (spec_dir_with_plan) and the test classes
(TestAutoCorrectSubtaskFiles, TestValidatePlanFilePaths) are updated
accordingly.
- Around line 264-268: The test_excludes_git_dir test doesn't exercise directory
exclusion because files like .git/config lack an extension and are filtered out
before directory pruning in _build_file_index; fix by updating the project_tree
fixture to include a file with a real extension inside the .git directory (e.g.,
.git/secret.ts) and then modify test_excludes_git_dir to assert that that .ts
file is not present in the index returned by _build_file_index (keep the same
call and iterate index.values(), but check that the .git/*.ts path is absent).
```python
# Directories to exclude from file path search
_EXCLUDE_DIRS = frozenset(
    {
        "node_modules",
        ".git",
        "dist",
        "build",
        ".venv",
        "__pycache__",
        ".next",
        ".nuxt",
        ".auto-claude",
        "coverage",
        ".tox",
    }
)
```
🧹 Nitpick | 🔵 Trivial
Verify completeness of _EXCLUDE_DIRS for common project layouts.
The set covers JavaScript/Python ecosystems well. Depending on the user base, consider whether directories like .idea, .vscode, vendor (Go/PHP), target (Rust/Java), or out (common build output) should also be excluded to avoid false positive matches.
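For concreteness, a sketch of the extended set; the added names are the reviewer's suggestions, not current code:

```python
# Sketch: _EXCLUDE_DIRS extended with common IDE/build/vendor directories
_EXCLUDE_DIRS = frozenset(
    {
        "node_modules", ".git", "dist", "build", ".venv", "__pycache__",
        ".next", ".nuxt", ".auto-claude", "coverage", ".tox",
        # Suggested additions:
        ".idea", ".vscode",  # IDE metadata
        "vendor",            # Go/PHP dependencies
        "target",            # Rust/Java build output
        "out",               # generic build output
    }
)
```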
🤖 Prompt for AI Agents
In `@apps/backend/agents/coder.py` around lines 101 - 116, The _EXCLUDE_DIRS
frozenset is missing several common IDE/build/vendor directories which can cause
false positives; update the _EXCLUDE_DIRS declaration in
apps/backend/agents/coder.py to include additional common folders such as
".idea", ".vscode", "vendor", "target", "out", and any other org-specific build
dirs you encounter, keeping the set as a frozenset and preserving existing
entries; run any existing file-search/unit tests after adding those names to
ensure no regressions.
```python
def _find_correct_path(missing_path: str, project_dir: Path) -> str | None:
    """
    Attempt to find the correct path for a missing file using fuzzy matching.

    Strategies:
    1. Same basename in nearby directory
    2. index.{ext} pattern (e.g., preload/api.ts -> preload/api/index.ts)

    Uses os.walk with directory pruning to avoid traversing into node_modules,
    .git, dist, etc. — unlike Path.rglob which traverses everything then filters.

    Args:
        missing_path: The incorrect file path from the plan
        project_dir: Root directory of the project

    Returns:
        Corrected relative path, or None if no good match found
    """
    missing = Path(missing_path)
    basename = missing.name
    stem = missing.stem
    suffix = missing.suffix
    parent_parts = missing.parent.parts

    if not suffix:
        return None

    candidates: list[tuple[str, float]] = []
    resolved_project = project_dir.resolve()
    resolved_str = str(resolved_project)

    # os.walk with pruning: modify dirs in-place to skip excluded directories
    for root, dirs, files in os.walk(resolved_project):
        dirs[:] = [d for d in dirs if d not in _EXCLUDE_DIRS]

        for filename in files:
            if not filename.endswith(suffix):
                continue

            full_path = os.path.join(root, filename)
            rel_str = os.path.relpath(full_path, resolved_str).replace(os.sep, "/")
            rel = Path(rel_str)

            score = 0.0

            # Strategy 1: Exact basename match
            if filename == basename:
                score += 10.0
            # Strategy 2: index.{ext} in directory matching stem
            elif filename == f"index{suffix}" and os.path.basename(root) == stem:
                score += 8.0
            else:
                continue

            # Bonus: shared parent directory segments
            candidate_parts = rel.parent.parts
            for i, part in enumerate(parent_parts):
                if i < len(candidate_parts) and candidate_parts[i] == part:
                    score += 3.0

            # Penalty: depth difference
            depth_diff = abs(len(candidate_parts) - len(parent_parts))
            score -= 0.5 * depth_diff

            candidates.append((rel_str, score))

    if not candidates:
        return None

    candidates.sort(key=lambda x: x[1], reverse=True)
    best_path, best_score = candidates[0]

    # Require minimum score
    if best_score < 8.0:
        return None

    # Require clear winner (gap >= 3.0 from runner-up)
    if len(candidates) > 1:
        runner_up_score = candidates[1][1]
        if best_score - runner_up_score < 3.0:
            return None  # Ambiguous match

    return best_path
```
🧹 Nitpick | 🔵 Trivial
Duplicated scoring logic across _find_correct_path and _find_correct_path_indexed.
Both functions implement the same scoring algorithm (base scores 10.0/8.0, +3.0 per shared parent segment, -0.5 depth penalty, ≥8.0 threshold, ≥3.0 gap check) independently. If thresholds or weights are tuned in one, the other will silently diverge.
Consider extracting the scoring into a shared helper that takes a list of (rel_str, Path) candidates and returns the best match:
♻️ Sketch of extracted scoring helper

```diff
+def _score_and_select(
+    candidates_raw: list[tuple[str, Path]],
+    parent_parts: tuple[str, ...],
+    base_score: float,
+) -> list[tuple[str, float]]:
+    """Score candidates by shared parent segments and depth difference."""
+    scored = []
+    for rel_str, rel_path in candidates_raw:
+        score = base_score
+        candidate_parts = rel_path.parent.parts
+        for i, part in enumerate(parent_parts):
+            if i < len(candidate_parts) and candidate_parts[i] == part:
+                score += 3.0
+        depth_diff = abs(len(candidate_parts) - len(parent_parts))
+        score -= 0.5 * depth_diff
+        scored.append((rel_str, score))
+    return scored
+
+
+def _pick_best(candidates: list[tuple[str, float]]) -> str | None:
+    if not candidates:
+        return None
+    candidates.sort(key=lambda x: x[1], reverse=True)
+    best_path, best_score = candidates[0]
+    if best_score < 8.0:
+        return None
+    if len(candidates) > 1 and best_score - candidates[1][1] < 3.0:
+        return None
+    return best_path
```

🤖 Prompt for AI Agents
In `@apps/backend/agents/coder.py` around lines 231 - 313, The duplicated scoring
logic in _find_correct_path and _find_correct_path_indexed should be extracted
into a single helper (e.g., _score_and_select_candidate) that accepts candidate
records (rel string and Path or candidate_parts) plus the original parent_parts
and returns the best_path or None; move the base weights (10.0, 8.0),
per-shared-parent bonus (+3.0), depth penalty (-0.5), minimum score (8.0) and
runner-up gap (3.0) into that helper so both functions call it instead of
reimplementing scoring, then update _find_correct_path and
_find_correct_path_indexed to build the candidate list (rel_str + Path/parts)
and delegate selection to the helper, keeping behavior identical and adding a
small unit test to confirm parity.
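A sketch of the parity test the prompt asks for, assuming the helper split above is adopted; it checks that the walk-based and index-based lookups agree for the same missing path:

```python
def test_walk_and_index_lookups_agree(project_tree):
    # project_tree is the existing fixture; both lookup paths should return
    # the same correction once they share one scoring implementation.
    missing = "src/components/Modal.tsx"
    walked = _find_correct_path(missing, project_tree)
    index = _build_file_index(project_tree, {".tsx"})
    indexed = _find_correct_path_indexed(missing, Path(missing).parent.parts, index)
    assert walked == indexed
```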
```python
    corrections: dict[str, str] = {}
    still_missing: list[str] = []

    for missing_path in missing_files:
        corrected = _find_correct_path(missing_path, project_dir)
        if corrected:
            corrections[missing_path] = corrected
            logger.info(f"Auto-corrected file path: {missing_path} -> {corrected}")
            print_status(f"Auto-corrected: {missing_path} -> {corrected}", "success")
        else:
            still_missing.append(missing_path)

    if not corrections:
        return still_missing
```
🧹 Nitpick | 🔵 Trivial
Each missing file triggers a full tree walk.
_auto_correct_subtask_files calls _find_correct_path per missing file, each of which does an os.walk over the entire project tree. For subtasks with multiple missing files, this is redundant. Consider using _build_file_index + _find_correct_path_indexed here (as _validate_plan_file_paths already does) to walk once.
♻️ Use indexed lookup for batch correction
```diff
 def _auto_correct_subtask_files(
     subtask: dict,
     missing_files: list[str],
     project_dir: Path,
     spec_dir: Path,
 ) -> list[str]:
     corrections: dict[str, str] = {}
     still_missing: list[str] = []
+    # Collect needed suffixes and build index once
+    suffixes = {Path(p).suffix for p in missing_files if Path(p).suffix}
+    if not suffixes:
+        return missing_files
+    file_index = _build_file_index(project_dir, suffixes)
+
     for missing_path in missing_files:
-        corrected = _find_correct_path(missing_path, project_dir)
+        missing = Path(missing_path)
+        corrected = _find_correct_path_indexed(
+            missing_path, missing.parent.parts, file_index
+        )
         if corrected:
```

🤖 Prompt for AI Agents
In `@apps/backend/agents/coder.py` around lines 336 - 349,
_auto_correct_subtask_files currently calls _find_correct_path for every missing
file which triggers a full os.walk each time; instead build the file index once
and reuse it: call _build_file_index(project_dir) at the start of
_auto_correct_subtask_files, then replace per-file calls to _find_correct_path
with a single-index lookup via _find_correct_path_indexed(missing_path,
file_index). Keep the same logging/print_status and still_missing behavior and
maintain the same return semantics as _validate_plan_file_paths.
```python
    # Persist any corrections that were made
    if corrections_made > 0:
        try:
            write_json_atomic(plan_file, plan)
            logger.info(f"Persisted {corrections_made} post-plan path correction(s)")
        except (OSError, json.JSONDecodeError, UnicodeDecodeError) as e:
            logger.warning(f"Failed to persist post-plan corrections: {e}")
```
🧹 Nitpick | 🔵 Trivial
write_json_atomic exception clause includes json.JSONDecodeError on a write path.
At line 462 (and similarly at line 381), the except clause catches json.JSONDecodeError alongside OSError. write_json_atomic serializes and writes—it wouldn't raise JSONDecodeError. While harmless (it just never matches), it could mislead readers into thinking decoding happens here.
Tighten except clause

```diff
-        except (OSError, json.JSONDecodeError, UnicodeDecodeError) as e:
+        except OSError as e:
             logger.warning(f"Failed to persist post-plan corrections: {e}")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    # Persist any corrections that were made
    if corrections_made > 0:
        try:
            write_json_atomic(plan_file, plan)
            logger.info(f"Persisted {corrections_made} post-plan path correction(s)")
        except OSError as e:
            logger.warning(f"Failed to persist post-plan corrections: {e}")
```
🤖 Prompt for AI Agents
In `@apps/backend/agents/coder.py` around lines 457 - 463, The except clause
around write_json_atomic should not catch json.JSONDecodeError because
write_json_atomic serializes/writes (not decodes); update the except to only
catch relevant write/serialization errors (e.g., OSError and optionally
TypeError) in the blocks that call write_json_atomic (the exception handlers
currently around write_json_atomic in coder.py); remove json.JSONDecodeError
from those except tuples so the handler accurately reflects possible failures.
```python
# Fix 5: Validate file paths in the newly created plan
path_issues = _validate_plan_file_paths(spec_dir, project_dir)
if (
    path_issues
    and planning_validation_failures < max_planning_validation_retries
):
    planning_validation_failures += 1
    planning_retry_context = path_issues
    print_status(
        "Plan has invalid file paths - retrying planner",
        "warning",
    )
    first_run = True
    status = "continue"
else:
    plan_validated = True
    planning_retry_context = None
```
Silent fallthrough when path validation retries are exhausted.
When path_issues is truthy but planning_validation_failures >= max_planning_validation_retries, the else branch sets plan_validated = True and proceeds without logging that uncorrectable paths remain. The subtask-level validation will eventually catch these, but a warning here would help with debugging.
🛡️ Add a warning when proceeding despite unresolved paths
```diff
         first_run = True
         status = "continue"
     else:
+        if path_issues:
+            logger.warning(
+                "Proceeding with %d uncorrectable file path(s) after %d planner retries",
+                path_issues.count("- `"),
+                planning_validation_failures,
+            )
+            print_status(
+                f"Proceeding with known path issues after {planning_validation_failures} planner retries",
+                "warning",
+            )
         plan_validated = True
         planning_retry_context = None
```

🤖 Prompt for AI Agents
In `@apps/backend/agents/coder.py` around lines 1213 - 1229, When
_validate_plan_file_paths(spec_dir, project_dir) returns path_issues but
planning_validation_failures has already reached
max_planning_validation_retries, add an explicit warning before proceeding: set
planning_retry_context = path_issues (if not already), call print_status with a
clear warning like "Proceeding despite unresolved plan file paths after max
retries" (or similar), and keep first_run/status/plan_validated behavior
consistent with the current else branch so higher-level validation still runs;
reference the symbols _validate_plan_file_paths, planning_validation_failures,
max_planning_validation_retries, planning_retry_context, print_status,
first_run, status, and plan_validated to locate where to add this log.
```python
@pytest.fixture
def spec_dir_with_plan(tmp_path):
    """Create a spec directory with an implementation plan containing wrong paths."""
    spec_dir = tmp_path / "spec"
    spec_dir.mkdir()

    plan = {
        "feature": "test feature",
        "phases": [
            {
                "id": "phase-1",
                "name": "Phase 1",
                "subtasks": [
                    {
                        "id": "task-1",
                        "description": "Fix the API",
                        "status": "pending",
                        "files_to_modify": [
                            "src/preload/api.ts",  # Wrong: should be src/preload/api/index.ts
                            "src/renderer/components/Button.tsx",  # Correct
                        ],
                    },
                    {
                        "id": "task-2",
                        "description": "Update store",
                        "status": "pending",
                        "files_to_modify": [
                            "src/renderer/stores/task-store.ts",  # Correct
                        ],
                    },
                ],
            }
        ],
    }

    (spec_dir / "implementation_plan.json").write_text(json.dumps(plan, indent=2))
    return spec_dir
```
🧹 Nitpick | 🔵 Trivial
Unused fixture spec_dir_with_plan.
This fixture is defined but never referenced by any test method. All tests that need a spec directory with a plan create their own inline. Remove it to avoid confusion, or wire it into the TestAutoCorrectSubtaskFiles / TestValidatePlanFilePaths tests that replicate similar setup.
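If the fixture is kept, wiring it in could look roughly like this; a sketch whose assertion assumes _auto_correct_subtask_files does not return corrected paths in its result list:

```python
def test_corrects_plan_paths_from_fixture(spec_dir_with_plan, project_tree):
    # Read the plan written by the fixture instead of building one inline
    plan = json.loads((spec_dir_with_plan / "implementation_plan.json").read_text())
    subtask = plan["phases"][0]["subtasks"][0]
    result = _auto_correct_subtask_files(
        subtask, ["src/preload/api.ts"], project_tree, spec_dir_with_plan
    )
    # The wrong path should have been corrected, not reported back
    assert "src/preload/api.ts" not in result
```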
🤖 Prompt for AI Agents
In `@tests/test_file_path_self_healing.py` around lines 114 - 150, The fixture
spec_dir_with_plan is unused and should be removed or integrated; either delete
the pytest fixture named spec_dir_with_plan from
tests/test_file_path_self_healing.py to avoid dead code, or replace the inline
plan setup in TestAutoCorrectSubtaskFiles and TestValidatePlanFilePaths with
this fixture by using spec_dir_with_plan as a test parameter and updating those
tests to read the implementation_plan.json created by the fixture; ensure
references to the fixture name (spec_dir_with_plan) and the test classes
(TestAutoCorrectSubtaskFiles, TestValidatePlanFilePaths) are updated
accordingly.
```python
def test_excludes_git_dir(self, project_tree):
    index = _build_file_index(project_tree, {".ts", ""})
    for entries in index.values():
        for rel_str, _ in entries:
            assert ".git" not in rel_str
```
test_excludes_git_dir doesn't actually exercise directory exclusion.
The .git/config file has no extension, so _build_file_index skips it at the ext_idx == -1 check before the directory pruning matters. The empty-string suffix "" never matches because filename[ext_idx:] always starts with ".". To truly test .git exclusion, place a .ts file inside .git/ in the fixture and assert it's absent.
Proposed fix
In the project_tree fixture, add a .ts file inside .git/:
(tmp_path / ".git").mkdir(parents=True)
(tmp_path / ".git/config").write_text("[core]")
+ (tmp_path / ".git/hooks.ts").write_text("// should be excluded")Then update the test:
def test_excludes_git_dir(self, project_tree):
- index = _build_file_index(project_tree, {".ts", ""})
+ index = _build_file_index(project_tree, {".ts"})
for entries in index.values():
for rel_str, _ in entries:
assert ".git" not in rel_str🤖 Prompt for AI Agents
In `@tests/test_file_path_self_healing.py` around lines 264 - 268, The
test_excludes_git_dir test doesn't exercise directory exclusion because files
like .git/config lack an extension and are filtered out before directory pruning
in _build_file_index; fix by updating the project_tree fixture to include a file
with a real extension inside the .git directory (e.g., .git/secret.ts) and then
modify test_excludes_git_dir to assert that that .ts file is not present in the
index returned by _build_file_index (keep the same call and iterate
index.values(), but check that the .git/*.ts path is absent).
AndyMik90 left a comment
🤖 Auto Claude PR Review
Merge Verdict: 🟠 NEEDS REVISION
🟠 Needs revision - 4 issue(s) require attention.
Branch is out of date with base branch. Update branch first - if no conflicts arise, you can merge. If merge conflicts arise, resolve them and run follow-up review again. 4 issue(s) must be addressed (0 required, 4 recommended). 3 non-blocking suggestion(s) to consider.
Risk Assessment
| Factor | Level | Notes |
|---|---|---|
| Complexity | High | Based on lines changed |
| Security Impact | None | Based on security findings |
| Scope Coherence | Good | Based on structural review |
🚨 Blocking Issues (Must Fix)
- Branch Out of Date: PR branch is behind the base branch and needs to be updated
Findings Summary
- Medium: 4 issue(s)
- Low: 3 issue(s)
Generated by Auto Claude PR Review
Findings (7 selected of 7 total)
🟡 [bdac7ab02ee9] [MEDIUM] Incorrect type annotation: list[tuple[dict, ...]] should be list[tuple[list, ...]]
📁 apps/backend/agents/coder.py:415
The type annotation for missing_entries declares tuple[dict, int, str] but the first element is actually a list (from subtask.get('files_to_modify', []) at line 422). At line 433, (files, i, file_path) is appended where files is a list[str], not a dict. The code works at runtime because both list and dict support __setitem__ (line 450: files_list[idx] = corrected), but this misleads type checkers and readers. A type checker in strict mode would flag files_list[idx] = corrected as invalid since dict.__setitem__ expects a key type matching the dict, not an int index.

The missing_entries variable is annotated as list[tuple[dict, int, str]] but the first element is actually a list[str] (from subtask.get('files_to_modify', [])), not a dict. The comment correctly says '(subtask_files_list, index, path)' but the type says dict. This has no runtime impact since Python doesn't enforce type annotations, but it would confuse static analysis tools (mypy/pyright) and code readers.
Suggested fix:
Change type to `list[tuple[list[str], int, str]]` to accurately reflect that the first element is a list of file path strings.
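Roughly, with the initializer assumed and only the annotation changing:

```python
# First pass: collect all missing files and their suffixes
# (subtask_files_list, index, path): first element is a list, not a dict
missing_entries: list[tuple[list[str], int, str]] = []
```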
🟡 [ff8364a6194b] [MEDIUM] Duplicated scoring logic between _find_correct_path and _find_correct_path_indexed
📁 apps/backend/agents/coder.py:189
The scoring algorithm (base scores 10.0/8.0, parent directory bonus +3.0, depth penalty -0.5, minimum threshold 8.0, disambiguation gap >= 3.0) is fully duplicated between _find_correct_path_indexed (lines 189-228) and _find_correct_path (lines 274-313). Both functions have identical scoring constants, sorting logic, and threshold checks. If scoring weights or thresholds need adjustment, both implementations must be updated in lockstep. _find_correct_path is called per-file during subtask validation, while _find_correct_path_indexed is used for batch post-plan validation.
Suggested fix:
Extract the scoring constants (BASE_SCORE_BASENAME=10.0, BASE_SCORE_INDEX=8.0, PARENT_BONUS=3.0, DEPTH_PENALTY=0.5, MIN_SCORE=8.0, MIN_GAP=3.0) into module-level constants, and extract the candidate selection/disambiguation logic into a shared helper function like `_select_best_candidate(candidates: list[tuple[str, float]]) -> str | None`.
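A sketch with the constant names the finding proposes and a shared selection helper mirroring the logic both functions currently duplicate:

```python
BASE_SCORE_BASENAME = 10.0  # exact basename match
BASE_SCORE_INDEX = 8.0      # index.{ext} in directory matching stem
PARENT_BONUS = 3.0          # per shared leading parent segment
DEPTH_PENALTY = 0.5         # per level of depth difference
MIN_SCORE = 8.0             # minimum score to accept a match
MIN_GAP = 3.0               # required lead over the runner-up


def _select_best_candidate(candidates: list[tuple[str, float]]) -> str | None:
    """Pick the highest-scoring candidate if it clears MIN_SCORE and MIN_GAP."""
    if not candidates:
        return None
    candidates.sort(key=lambda c: c[1], reverse=True)
    best_path, best_score = candidates[0]
    if best_score < MIN_SCORE:
        return None
    if len(candidates) > 1 and best_score - candidates[1][1] < MIN_GAP:
        return None  # ambiguous match
    return best_path
```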
🟡 [8df38235f1a0] [MEDIUM] [NEEDS REVIEW] Shared retry counter between schema and path validation causes silent plan acceptance
📁 apps/backend/agents/coder.py:1215
The planning_validation_failures counter (initialized at line 820) is shared between two independent validation stages: schema validation (line 1231) and file path validation (line 1219). When path_issues is truthy but planning_validation_failures >= max_planning_validation_retries, the code falls through to the else at line 1227-1229 and sets plan_validated = True with no warning. This means: (1) if schema validation consumed retries first, path validation gets fewer retries than the documented '3 times'; (2) when path retries ARE exhausted, a plan with known bad file paths is silently marked as validated. The per-subtask validation later catches this, but there's no log/warning indicating this fallthrough happened, making debugging difficult.
Suggested fix:
Add a warning log when path validation failures are bypassed: `if path_issues: logger.warning(f'Plan has {len(all_missing)} uncorrectable paths after {planning_validation_failures} retries - proceeding anyway')`. Consider also using a separate counter for path validation retries to ensure the documented '3 retries' budget is independent.
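A rough sketch of both suggestions, assuming a new path_validation_failures counter kept separate from the schema-validation one:

```python
# Separate budgets so schema retries don't eat into path-validation retries
if path_issues and path_validation_failures < max_planning_validation_retries:
    path_validation_failures += 1
    planning_retry_context = path_issues
    first_run = True
    status = "continue"
elif path_issues:
    # Retries exhausted: proceed, but say so loudly
    logger.warning(
        f"Plan has uncorrectable paths after {path_validation_failures} "
        "retries - proceeding anyway"
    )
    plan_validated = True
    planning_retry_context = None
else:
    plan_validated = True
    planning_retry_context = None
```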
🔵 [c6c23ef38eb9] [LOW] Wrong exception type in write error handlers: JSONDecodeError cannot be raised by write operations
📁 apps/backend/agents/coder.py:381
Both _auto_correct_subtask_files (line 381) and _validate_plan_file_paths (line 462) catch json.JSONDecodeError when calling write_json_atomic(). However, write_json_atomic uses json.dump() which raises TypeError or ValueError for non-serializable data — never JSONDecodeError (which is a read/parse error). If the plan data contains non-serializable objects (e.g., Path objects, sets), the TypeError would propagate uncaught instead of being handled by the defensive except block. The OSError and UnicodeDecodeError catches are also questionable for a write path (UnicodeDecodeError is a read error), though OSError is correct for filesystem write failures.
Suggested fix:
Change to `except (OSError, TypeError, ValueError) as e:` — `OSError` covers filesystem issues, `TypeError` covers non-serializable types, and `ValueError` covers circular references. Drop `json.JSONDecodeError` and `UnicodeDecodeError` which are read-path exceptions.
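Applied to the handlers around write_json_atomic, that would read roughly:

```python
try:
    write_json_atomic(plan_file, plan)
    logger.info(f"Persisted {corrections_made} post-plan path correction(s)")
except (OSError, TypeError, ValueError) as e:
    # OSError: filesystem failures; TypeError/ValueError: non-serializable data
    logger.warning(f"Failed to persist post-plan corrections: {e}")
```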
🔵 [c8089b9863e8] [LOW] Per-subtask auto-correction performs full os.walk per missing file
📁 apps/backend/agents/coder.py:339
_auto_correct_subtask_files calls _find_correct_path (which does a full os.walk of the project tree) once per missing file in the subtask. If a subtask has N missing files, this results in N full directory traversals. In contrast, _validate_plan_file_paths (post-planning) efficiently builds a file index once and queries it for all missing files. The per-subtask path is a fallback (post-planning validation should catch most cases), but when triggered on large projects with multiple missing files per subtask, this is unnecessarily expensive.
Suggested fix:
Build the file index once at the start of _auto_correct_subtask_files (like _validate_plan_file_paths does) and use _find_correct_path_indexed for lookups. This would change N tree walks into 1 tree walk + N index lookups.
🟡 [2790c9beaea7] [MEDIUM] Duplicated directory exclusion set instead of reusing existing SKIP_DIRS
📁 apps/backend/agents/coder.py:102
The PR introduces _EXCLUDE_DIRS as a new frozenset of directories to skip during file walks. However, the codebase already has two SKIP_DIRS constants for the same purpose:
- context/constants.py:9 — SKIP_DIRS (20 entries)
- analysis/analyzers/base.py:14 — SKIP_DIRS (22 entries)
The new _EXCLUDE_DIRS has only 11 entries and is missing directories present in the existing constants: venv, target, vendor, .idea, .vscode, .pytest_cache, .mypy_cache, .turbo, .cache, .worktrees. It also introduces .tox which appears nowhere else in the codebase.
Searched Grep('SKIP_DIRS|EXCLUDE_DIRS', 'apps/backend/') — confirmed two existing SKIP_DIRS constants and no prior uses of _EXCLUDE_DIRS.
The practical risk is that fuzzy path matching could return false positives from directories like target/, vendor/, or .cache/ that the rest of the codebase already knows to skip.
Suggested fix:
Import and reuse the existing `SKIP_DIRS` from `context/constants.py`:

```python
from context.constants import SKIP_DIRS

_EXCLUDE_DIRS = frozenset(SKIP_DIRS)
```

Or if a superset is needed, extend it:

```python
from context.constants import SKIP_DIRS

_EXCLUDE_DIRS = frozenset(SKIP_DIRS | {".tox"})
```

This keeps directory exclusion lists consistent across the codebase and ensures new exclusions (like .worktrees) are automatically picked up.
🔵 [62a717323c25] [LOW] Uses raw readFileSync instead of established safeReadFileSync helper
📁 apps/frontend/src/main/ipc-handlers/task/execution-handlers.ts:1076
The new recovery code at line 1076 uses readFileSync(attemptHistoryPath, 'utf-8') directly, then wraps it in a try/catch with ENOENT handling. However, this same file already defines safeReadFileSync() (line 29) which does exactly this pattern — safely reads a file and returns null on ENOENT. The existing function is already used elsewhere in the same file (lines 302, 935).
Searched Grep('safeReadFileSync', 'execution-handlers.ts') — confirmed 4 existing uses of this helper within the same file.
The current code works correctly but duplicates the established error handling pattern.
Suggested fix:
Use the existing safeReadFileSync helper:

```typescript
const historyContent = safeReadFileSync(attemptHistoryPath);
if (!historyContent) continue; // File doesn't exist, skip
try {
  const history = JSON.parse(historyContent);
  // ... rest of processing
} catch (historyErr) {
  console.warn(`[Recovery] Could not parse attempt_history at ${dir}:`, historyErr);
}
```
This review was generated by Auto Claude.
Base Branch
- develop branch (required for all feature/fix PRs)
- main (hotfix only - maintainers)

Description
When the planner agent generates wrong file paths in files_to_modify (e.g., preload/api.ts instead of preload/api/index.ts), the system gets permanently stuck with no self-healing. This happened overnight with spec 232: the app crashed, restarted, and entered an infinite recovery loop because the root cause (wrong paths) was never corrected.

This PR fixes 5 interconnected bugs that together create permanent stuckness:
- files_to_modify references wrong paths (basename match, index.{ext} pattern, scored by shared parent dirs)

Related Issue
Closes #1832
Type of Change
Area
AI Disclosure
Tool(s) used: Claude Code (Opus 4.6)
Testing level:
- Untested -- AI output not yet verified
- Lightly tested -- ran the app / spot-checked key paths
- Fully tested -- all tests pass, manually verified behavior
I understand what this PR does and how the underlying code works
Checklist
- develop branch

Platform Testing Checklist
- platform/module instead of direct process.platform checks
- findExecutable() or platform abstractions)

CI/Testing Requirements
Key Implementation Details
Fuzzy File Path Matching (coder.py)
- index.{ext} pattern (+8), shared parent dir segments (+3 each), depth penalty (-0.5 per level)
- os.walk() with directory pruning (excludes node_modules, .git, etc.) + batch indexing for multi-file validation

Files Changed
- apps/backend/agents/coder.py
- apps/backend/core/progress.py
- apps/frontend/.../execution-handlers.ts
- tests/test_file_path_self_healing.py

Feature Toggle
Breaking Changes
Breaking: No
Summary by CodeRabbit
New Features
Bug Fixes
Tests