Skip to content

fix: handle empty/greenfield projects in spec creation (#1426)#1841

Open
AndyMik90 wants to merge 3 commits intodevelopfrom
fix/1426-spec-crash-empty-projects
Open

fix: handle empty/greenfield projects in spec creation (#1426)#1841
AndyMik90 wants to merge 3 commits intodevelopfrom
fix/1426-spec-crash-empty-projects

Conversation

@AndyMik90
Copy link
Owner

@AndyMik90 AndyMik90 commented Feb 15, 2026

Summary

Fixes spec creation crashing at phase 6 (SPEC DOCUMENT CREATION) for empty/greenfield projects and prevents the planning phase from getting stuck in "active" state after crashes.

Closes #1426

Root Cause

  1. Phase 6 crash: The spec writer agent tries to read project_index.json, requirements.json, and context.json which contain no files/code for greenfield projects. The agent gets confused trying to write a spec referencing non-existent code, producing invalid output or timing out. This also affects non-empty projects with sparse context files.

  2. Stuck "active" planning state: The task logger starts LogPhase.PLANNING at orchestrator startup but only calls end_phase on specific success/failure code paths. An unexpected exception bypasses all end_phase calls, leaving the task permanently stuck in "active" planning status — which contributes to the restart loop behavior.

Changes

apps/backend/spec/phases/spec_phases.py

  • Added _is_greenfield_project() — checks project_index.json for 0 discovered files
  • Added _greenfield_context() — returns clear instructions for the agent to focus on creating files from scratch rather than modifying existing code
  • Both phase_quick_spec() and phase_spec_writing() now detect greenfield projects and inject this context into the agent prompt

apps/backend/spec/pipeline/orchestrator.py

  • Wrapped phase execution in try/except safety net ensuring task_logger.end_phase() is always called on unexpected errors
  • Extracted phase logic into _run_phases() method, with run() acting as the safety wrapper
  • Prevents tasks from being stuck in "active" planning state after crashes

apps/backend/prompts/spec_writer.md

  • Added greenfield awareness to Phase 0 — when input files are missing/empty/show 0 files, adapt to focus on files to CREATE rather than reference existing code

Test plan

  • All 219 spec-related tests pass (spec_phases, spec_pipeline, spec_complexity, implementation_plan)
  • Pre-commit hooks pass (ruff lint + format)
  • Manual test: create a spec in an empty directory to verify greenfield detection works
  • Manual test: verify a crashed spec creation properly marks planning phase as failed (not stuck "active")

Summary by CodeRabbit

  • New Features

    • Automatic detection and tailored guidance for greenfield projects, including explicit PHASE 0 instructions for creating initial structure, tech stack, and patterns when input files are missing.
  • Bug Fixes

    • Improved orchestration error handling to prevent planning deadlocks and ensure the planning phase always concludes on failures.
  • Improvements

    • Clearer phase lifecycle tracking and explicit handling of human-review exits as unapproved.

…uck planning state (#1426)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added area/backend This is backend only bug Something isn't working size/S Small (10-99 lines) labels Feb 15, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @AndyMik90, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the stability and user experience of the spec creation process. It addresses critical issues where the system would crash or get stuck when dealing with new, empty projects or encountering unexpected errors during execution. By making the spec generation agents 'greenfield-aware' and implementing comprehensive error handling, the system is now more resilient and provides clearer feedback, especially for initial project setups.

Highlights

  • Greenfield Project Handling: Improved spec creation for empty or new ('greenfield') projects by providing specific instructions to the AI agent, ensuring it focuses on creating new files rather than modifying non-existent ones.
  • Robust Error Handling: Prevented the planning phase from getting stuck in an 'active' state due to unexpected crashes by implementing a safety net that ensures proper phase termination.
  • Dynamic Context Injection: Introduced logic to detect greenfield projects and dynamically inject tailored context into the spec writing process, guiding the agent's approach.
  • Orchestrator Refactoring: Refactored the orchestrator's run method to encapsulate phase execution within a try/except block, enhancing overall stability.
Changelog
  • apps/backend/prompts/spec_writer.md
    • Updated the prompt to explicitly mention that input files might not exist for greenfield projects.
    • Added detailed instructions for the AI agent on how to adapt its approach when a greenfield project is detected, focusing on creation rather than modification.
  • apps/backend/spec/phases/spec_phases.py
    • Added a utility function _is_greenfield_project to determine if a project is empty based on project_index.json.
    • Introduced _greenfield_context to generate specific guidance for the AI agent when operating on a greenfield project.
    • Integrated greenfield detection and context injection into both phase_quick_spec and phase_spec_writing methods.
    • Imported get_project_index_stats for project analysis.
  • apps/backend/spec/pipeline/orchestrator.py
    • Refactored the run method to include a try/except block, ensuring that the planning phase is always properly terminated even if unexpected exceptions occur.
    • Extracted the core phase execution logic into a new private method _run_phases to separate concerns and facilitate error handling.
Activity
  • All 219 spec-related tests (spec_phases, spec_pipeline, spec_complexity, implementation_plan) passed.
  • Pre-commit hooks for ruff lint and format passed.
  • Manual tests are planned to verify greenfield detection and proper handling of crashed spec creation.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 15, 2026

📝 Walkthrough

Walkthrough

Detects greenfield (empty-project) repositories and injects tailored guidance into spec-writing prompts; adds utilities to identify greenfield projects and log them; refactors orchestrator to run phases in an async _run_phases coroutine, track/ensure planning phase termination, and improve exception handling (including SystemExit from human-review).

Changes

Cohort / File(s) Summary
Prompt updates
apps/backend/prompts/spec_writer.md
PHASE 0 expanded with explicit guidance and an IMPORTANT block describing how to handle missing/empty input files (create initial structure, define stack, use patterns instead of existing code).
Spec phases — greenfield detection & context
apps/backend/spec/phases/spec_phases.py
Added _is_greenfield_project(spec_dir: Path) -> bool, _greenfield_context() -> str, and SpecPhaseMixin._check_and_log_greenfield(); quick-spec and spec-writing phases now call the check and inject greenfield context when detected.
Orchestrator — phase flow & error handling
apps/backend/spec/pipeline/orchestrator.py
Added async _run_phases(...) and _planning_phase_ended state; delegated run to _run_phases, added try/except to surface errors, treat SystemExit from human-review as unapproved, and ensure planning phase is ended exactly once on success or failure.
Imports / typing updates
apps/backend/spec/phases/spec_phases.py, apps/backend/spec/pipeline/orchestrator.py
Updated imports to include Path, get_project_index_stats, and typing adjustments for TaskLogger usage.

Sequence Diagram

sequenceDiagram
    participant Orchestrator as Orchestrator
    participant ProjectIndex as ProjectIndex
    participant SpecPhases as SpecPhases
    participant SpecWriterAgent as SpecWriterAgent
    participant UI as UI

    Orchestrator->>ProjectIndex: read project_index / stats
    ProjectIndex-->>Orchestrator: return file_count

    Orchestrator->>SpecPhases: _check_and_log_greenfield()
    SpecPhases-->>Orchestrator: is_greenfield (true/false)

    alt Greenfield
        Orchestrator->>SpecPhases: request _greenfield_context()
        SpecPhases-->>Orchestrator: greenfield guidance string
        Orchestrator->>UI: log "Greenfield detected" (info)
        Orchestrator->>SpecWriterAgent: run spec writing with greenfield_context
    else Normal
        Orchestrator->>SpecWriterAgent: run spec writing with normal context
    end

    SpecWriterAgent->>SpecWriterAgent: generate spec document
    SpecWriterAgent-->>Orchestrator: result (success/failure)

    Orchestrator->>Orchestrator: set _planning_phase_ended (true) and handle errors
    Orchestrator-->>UI: emit phase completion or error state
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 In an empty patch I hop and write,
No files? I craft the stack by night.
Greenfield guidance, gentle and bright,
Orchestrator hums to end the plight.
Spec seeds sprout — hop, plan, delight!

🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title clearly summarizes the main change: fixing spec creation for empty/greenfield projects.
Linked Issues check ✅ Passed PR fully addresses all objectives from #1426: detects greenfield projects, handles zero-file scenarios, prevents phase crashes, ensures planning phase completes, and prevents infinite restart loops.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing greenfield project handling: spec_phases detects empty projects, orchestrator adds error handling, and spec_writer.md adds greenfield-aware prompts.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into develop

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/1426-spec-crash-empty-projects

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses a crash during spec creation for greenfield projects and improves the pipeline's robustness by preventing tasks from getting stuck. The changes to the agent prompts and the introduction of greenfield detection logic are well-implemented. The addition of a safety net in the orchestrator to handle unexpected errors is a significant improvement. I've provided a few suggestions to enhance type safety, reduce code duplication, and fix a potential issue with the error handling logic.

Comment on lines 241 to 256
# Track whether we've already ended the planning phase (to avoid double-end)
planning_phase_ended = False

try:
return await self._run_phases(interactive, auto_approve, task_logger, ui)
except Exception as e:
# Ensure planning phase is always properly ended on unexpected errors
# This prevents the task from being stuck in "active" planning state
if not planning_phase_ended:
planning_phase_ended = True
task_logger.end_phase(
LogPhase.PLANNING,
success=False,
message=f"Spec creation failed with unexpected error: {e}",
)
raise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The try...except block is a great addition for ensuring the planning phase always terminates. However, the planning_phase_ended flag is a local variable in the run method and is not shared with _run_phases. This means if _run_phases calls task_logger.end_phase() and then an unexpected exception occurs before it returns, task_logger.end_phase() will be called a second time in this except block, which could lead to issues.

To fix this, planning_phase_ended should be an instance variable (e.g., self.planning_phase_ended):

  1. Initialize self.planning_phase_ended = False in __init__ or at the start of run.
  2. In _run_phases, set self.planning_phase_ended = True immediately after every call to task_logger.end_phase().
  3. In this except block, check if not self.planning_phase_ended.

pass


def _is_greenfield_project(spec_dir) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better type safety and code clarity, please add a type hint for the spec_dir parameter. Based on its usage, it should be Path. You can use a forward reference "Path" and add from pathlib import Path inside the if TYPE_CHECKING: block to avoid potential circular import issues while keeping the type checker happy.

Suggested change
def _is_greenfield_project(spec_dir) -> bool:
def _is_greenfield_project(spec_dir: "Path") -> bool:

Comment on lines 114 to 119
is_greenfield = _is_greenfield_project(self.spec_dir)
if is_greenfield:
self.ui.print_status(
"Greenfield project detected - adapting spec for new project",
"info",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of code for detecting a greenfield project and logging a status message is duplicated from phase_quick_spec (lines 56-61). To improve maintainability and reduce redundancy, consider extracting this logic into a private helper method within the SpecPhaseMixin class.

For example:

def _check_and_log_greenfield(self) -> bool:
    """Checks for a greenfield project and logs a status message if detected."""
    is_greenfield = _is_greenfield_project(self.spec_dir)
    if is_greenfield:
        self.ui.print_status(
            "Greenfield project detected - adapting spec for new project",
            "info",
        )
    return is_greenfield

You could then replace the duplicated blocks in both phase_quick_spec and phase_spec_writing with a single call:
is_greenfield = self._check_and_log_greenfield()

Comment on lines 262 to 263
task_logger,
ui,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To improve code clarity and maintainability, please add type hints for the task_logger and ui parameters.

You'll need to add the following imports at the top of the file:

from types import ModuleType
from task_logger import TaskLogger # Can be added to existing import from task_logger
Suggested change
task_logger,
ui,
task_logger: "TaskLogger",
ui: "ModuleType",

Comment on lines +269 to +274
auto_approve: Whether to skip human review
task_logger: The task logger instance
ui: The UI module

Returns:
True if spec creation and review completed successfully

This comment was marked as outdated.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@apps/backend/spec/phases/spec_phases.py`:
- Around line 19-22: The function _is_greenfield_project is missing a type
annotation for spec_dir; update its signature to type-hint spec_dir as
pathlib.Path and add/import Path from pathlib at the top of the file if not
already present (ensure the import is used alongside existing imports), e.g.,
modify the _is_greenfield_project signature to accept spec_dir: Path and keep
the body calling get_project_index_stats unchanged so callers using Path stay
type-safe.

In `@apps/backend/spec/pipeline/orchestrator.py`:
- Around line 258-275: The _run_phases coroutine currently has untyped
parameters task_logger and ui; update the signature to add explicit type hints
(e.g., TaskLogger and UI protocol/classes or typing.Any with a comment) and
import or define those types (or Protocols) where appropriate so IDEs and
linters can catch misuse; update any call sites if needed and keep the docstring
consistent with the typed signature.
- Around line 241-256: The planning_phase_ended flag is local and never updated
by _run_phases, so make it an instance attribute: initialize
self._planning_phase_ended = False at the start of run(), replace local
planning_phase_ended checks with self._planning_phase_ended in the except block,
and in _run_phases set self._planning_phase_ended = True immediately after every
task_logger.end_phase(LogPhase.PLANNING, ...) call so the double-end guard works
across method boundaries (update all end_phase sites in _run_phases
accordingly).

@AndyMik90 AndyMik90 self-assigned this Feb 15, 2026
Copy link
Owner Author

@AndyMik90 AndyMik90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Auto Claude PR Review

Merge Verdict: 🟠 NEEDS REVISION

🟠 Needs revision - 1 issue(s) require attention.

1 issue(s) must be addressed (0 required, 1 recommended), 1 suggestions

Risk Assessment

Factor Level Notes
Complexity Low Based on lines changed
Security Impact None Based on security findings
Scope Coherence Good Based on structural review

Findings Summary

  • Medium: 1 issue(s)
  • Low: 1 issue(s)

Generated by Auto Claude PR Review

Findings (2 selected of 2 total)

🟡 [fd2ef33d3874] [MEDIUM] planning_phase_ended flag cannot prevent double end_phase() calls

📁 apps/backend/spec/pipeline/orchestrator.py:242

The planning_phase_ended variable (line 242) is designed to prevent end_phase() from being called twice — the comment on line 241 explicitly states this: 'Track whether we've already ended the planning phase (to avoid double-end)'. However, this flag is local to run() and _run_phases() has no way to update it.

_run_phases() calls end_phase() internally — both on success (line 445) and on various failure paths (lines 329, 343, 373, 434). If an exception occurs AFTER one of those internal end_phase() calls, it propagates to run()'s except block where planning_phase_ended is still False, causing a second end_phase() call.

Concrete scenario: _run_phases() calls end_phase(success=True) on line 445, then json.load(f) on line 454 throws JSONDecodeError (malformed task_metadata.json). The exception propagates to run(), and end_phase(success=False) is called again — overwriting the 'completed' status with 'failed'.

The guard if not planning_phase_ended: (line 249) will always evaluate to True because nothing ever sets it to True before the except block runs. | The planning_phase_ended flag (line 242) is a local variable in run() that is only ever set to True inside the except block (line 250). _run_phases() has no mechanism to communicate back that it already called end_phase(). This creates a double-end scenario:

  1. _run_phases() calls task_logger.end_phase(success=True) at line 445
  2. Code continues to lines 450-472 where exceptions are possible (e.g., json.load() at line 454 raises JSONDecodeError on corrupted task_metadata.json, or TaskEventEmitter.emit() at line 462 raises, or _run_review_checkpoint propagates an uncaught exception)
  3. Exception propagates to run(), where planning_phase_ended is still False
  4. The except handler calls end_phase(success=False), overwriting the correct success status

This means a successfully completed planning phase could be incorrectly marked as 'failed' - the exact class of state corruption this PR aims to prevent. The _run_review_checkpoint method only catches SystemExit and KeyboardInterrupt, so other exceptions (e.g., OSError, PermissionError) would trigger this path.

Suggested fix:

Move `end_phase()` calls out of `_run_phases()` entirely and handle them in `run()` using a try/except/else pattern:

```python
async def run(self, ...):
    task_logger.start_phase(LogPhase.PLANNING, ...)
    try:
        result = await self._run_phases(...)
    except Exception as e:
        task_logger.end_phase(LogPhase.PLANNING, success=False, message=f'...: {e}')
        raise
    else:
        if not result:
            # _run_phases returned False (handled failure)
            # end_phase already called inside _run_phases for handled failures
            pass
        return result

Alternatively, remove all end_phase() calls from _run_phases() and have run() determine success/failure from the return value, calling end_phase() exactly once.


#### 🔵 [c25233a05384] [LOW] Missing type annotation on spec_dir parameter
📁 `apps/backend/spec/phases/spec_phases.py:19`

The `_is_greenfield_project(spec_dir)` function is missing the `Path` type annotation on its `spec_dir` parameter. The function it calls, `get_project_index_stats(spec_dir: Path)`, properly annotates this parameter. This is inconsistent with the codebase's typing conventions. | The `_is_greenfield_project(spec_dir)` function omits the `Path` type annotation for its `spec_dir` parameter. Every other function in the `spec/` package that accepts a `spec_dir` parameter uses `spec_dir: Path` — confirmed by searching with Grep across `apps/backend/spec/` for `def.*spec_dir:` which returned 18 matching functions all using `spec_dir: Path` (e.g., `get_project_index_stats(spec_dir: Path)` in `discovery.py:65`, `save_assessment(spec_dir: Path, ...)` in `complexity.py:438`, `create_minimal_plan(spec_dir: Path, ...)` in `writer.py:13`). The function it directly delegates to — `get_project_index_stats` — also uses `spec_dir: Path`. This requires adding `from pathlib import Path` to the imports (or putting it under `TYPE_CHECKING`).

**Suggested fix:**

Add the Path type annotation: def _is_greenfield_project(spec_dir: Path) -> bool: and add from pathlib import Path to the imports (or confirm it's available via TYPE_CHECKING).


---
*This review was generated by Auto Claude.*

…s, dedup (#1426)

- Convert planning_phase_ended to instance attribute self._planning_phase_ended
  so _run_phases() can mark it True after each end_phase() call, preventing
  double-end on exception propagation
- Add Path type annotation to _is_greenfield_project(spec_dir)
- Extract duplicated greenfield detection into _check_and_log_greenfield() helper
- Add TaskLogger and types.ModuleType type hints to _run_phases() signature
- Simplify redundant SystemExit handler with explanatory comment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added size/M Medium (100-499 lines) and removed size/S Small (10-99 lines) labels Feb 15, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/backend/spec/pipeline/orchestrator.py (1)

305-324: 🧹 Nitpick | 🔵 Trivial

run_phase return type annotation is inaccurate for async phase functions.

run_phase is a sync function annotated as -> phases.PhaseResult, but for async phase functions it actually returns a coroutine (which is then awaited at the call site). This works at runtime but misleads type checkers and IDE tooling.

Suggested fix

Either make it async:

-        def run_phase(name: str, phase_fn: Callable) -> phases.PhaseResult:
+        async def run_phase(name: str, phase_fn: Callable) -> phases.PhaseResult:
             ...
-            return phase_fn()
+            return await phase_fn()

Or widen the return type annotation.

🤖 Fix all issues with AI agents
In `@apps/backend/spec/phases/spec_phases.py`:
- Around line 20-23: The current _is_greenfield_project uses
get_project_index_stats and treats an empty/missing/corrupt index (returned as
{}) the same as a valid index with file_count==0, causing false positives;
change _is_greenfield_project to first check whether "file_count" exists in the
stats returned by get_project_index_stats and only return True when that key is
present and equals 0; otherwise return False (and optionally log or propagate
the missing/corrupt-index condition from get_project_index_stats so
callers/orchestrator like _ensure_fresh_project_index can react).

Copy link
Owner Author

@AndyMik90 AndyMik90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Auto Claude PR Review

🟠 Follow-up Review: Needs Revision

🟠 Needs revision - 2 blocking issue(s) require fixes.

Resolution Status

  • Resolved: 2 previous findings addressed
  • Unresolved: 0 previous findings remain
  • 🆕 New Issues: 3 new findings in recent changes

Finding Validation

  • 🔍 Dismissed as False Positives: 0 findings were re-investigated and found to be incorrect
  • Confirmed Valid: 2 findings verified as genuine issues
  • 👤 Needs Human Review: 0 findings require manual verification

🚨 Blocking Issues

  • quality: Greenfield false positive when project index is missing or has unrecognized format
  • quality: [FROM COMMENTS] CodeRabbit: False-positive greenfield detection on missing/corrupt project index (unaddressed)

Verdict

Both previous findings (planning_phase_ended double-end bug and missing type annotation) are fully resolved in commit 09858dd. All 20 CI checks are passing. However, 1 new MEDIUM finding (confirmed valid) was identified: _is_greenfield_project() cannot distinguish between a truly empty project and a missing/corrupt project index, since get_project_index_stats() returns {} on any failure and the function treats that as greenfield. While the orchestrator's phase ordering mitigates the most common failure path (index file not existing), corrupt or unrecognized formats still produce false-positive greenfield classification, injecting misleading context into spec-writing prompts. A simple fix (e.g., returning False when stats is empty) would resolve this. 1 LOW finding (empty TYPE_CHECKING block) is a minor quality nit. CodeRabbit independently identified the same greenfield concern, which remains unaddressed.

Review Process

Agents invoked: resolution-verifier, new-code-reviewer, comment-analyzer, finding-validator


This is an AI-generated follow-up review using parallel specialist analysis with finding validation.

Findings (3 selected of 3 total)

🟡 [NEW-001] [MEDIUM] Greenfield false positive when project index is missing or has unrecognized format

📁 apps/backend/spec/phases/spec_phases.py:20

_is_greenfield_project() relies on get_project_index_stats() which returns {} on any failure (file missing, JSON parse error, unexpected format). In all these cases, stats.get('file_count', 0) == 0 evaluates to True, falsely classifying an existing project as greenfield. This causes misleading 'GREENFIELD PROJECT' context to be injected into spec-writing prompts, instructing the LLM to ignore existing files. While the orchestrator ordering mitigates the file-missing case (discovery must succeed first), corrupt or unrecognized index formats still produce false positives.

Suggested fix:

Distinguish between 'index not available' and 'project has zero files'. When stats is empty ({}), default to non-greenfield as a safe fallback:

def _is_greenfield_project(spec_dir: Path) -> bool:
    stats = get_project_index_stats(spec_dir)
    if not stats:
        return False  # Can't determine - don't assume greenfield
    return stats.get('file_count', 0) == 0

🔵 [NEW-002] [LOW] Empty TYPE_CHECKING block is dead code

📁 apps/backend/spec/phases/spec_phases.py:16

The if TYPE_CHECKING: pass block at lines 16-17 does nothing. TYPE_CHECKING is imported from typing but the conditional block contains only pass with no type-only imports. This is dead code that should be removed or populated with actual type-checking imports.

Suggested fix:

Remove the unused TYPE_CHECKING import and the empty conditional block:

- from typing import TYPE_CHECKING
...
- if TYPE_CHECKING:
-     pass

🟡 [CMT-001] [MEDIUM] [FROM COMMENTS] CodeRabbit: False-positive greenfield detection on missing/corrupt project index (unaddressed)

📁 apps/backend/spec/phases/spec_phases.py:20

CodeRabbit identified that get_project_index_stats returns {} on any failure, causing _is_greenfield_project to return True incorrectly. This concern was raised before the fix commit but was not addressed in 09858dd. Same root cause as NEW-001.

Suggested fix:

Same fix as NEW-001: check if stats is empty and return False as safe fallback.

This review was generated by Auto Claude.

When get_project_index_stats() returns {} (file missing, JSON parse
error, or unrecognized format), _is_greenfield_project() now returns
False instead of incorrectly classifying the project as greenfield.
Also removes unused TYPE_CHECKING import and empty conditional block.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/backend/spec/phases/spec_phases.py`:
- Around line 16-21: The current _is_greenfield_project uses
get_project_index_stats and treats any non-empty dict as valid; to avoid false
positives when the index has an unrecognized format, first verify that the
"file_count" key exists in the returned stats (e.g., check '"file_count" in
stats') before comparing its value, and return False if that key is missing so
only explicit file_count values are treated as greenfield indicators.

Comment on lines +16 to +21
def _is_greenfield_project(spec_dir: Path) -> bool:
"""Check if the project is empty/greenfield (0 discovered files)."""
stats = get_project_index_stats(spec_dir)
if not stats:
return False # Can't determine - don't assume greenfield
return stats.get("file_count", 0) == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Good fix for the false-positive greenfield detection on missing/corrupt index.

The if not stats: return False guard correctly prevents misclassifying projects when get_project_index_stats returns {} due to a missing or unparsable index file.

One remaining edge case: if project_index.json exists and parses as valid JSON but uses an unrecognized format (no "files" or "services" key), get_project_index_stats returns {"file_count": 0, ...} — a non-empty dict that passes the guard and yields file_count == 0 → True. This would be a false positive on a valid-but-unrecognized index format. Consider additionally checking that "file_count" is actually present in stats as a key (rather than relying on the default):

Proposed hardening
 def _is_greenfield_project(spec_dir: Path) -> bool:
     """Check if the project is empty/greenfield (0 discovered files)."""
     stats = get_project_index_stats(spec_dir)
     if not stats:
         return False  # Can't determine - don't assume greenfield
-    return stats.get("file_count", 0) == 0
+    if "file_count" not in stats:
+        return False  # Unrecognized format - don't assume greenfield
+    return stats["file_count"] == 0
🤖 Prompt for AI Agents
In `@apps/backend/spec/phases/spec_phases.py` around lines 16 - 21, The current
_is_greenfield_project uses get_project_index_stats and treats any non-empty
dict as valid; to avoid false positives when the index has an unrecognized
format, first verify that the "file_count" key exists in the returned stats
(e.g., check '"file_count" in stats') before comparing its value, and return
False if that key is missing so only explicit file_count values are treated as
greenfield indicators.

Copy link
Owner Author

@AndyMik90 AndyMik90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto Claude Review - APPROVED

Status: Ready to Merge

Summary: ## ✅ Follow-up Review: Ready To Merge

✅ Ready to merge - All checks passing and findings addressed.

Resolution Status

  • Resolved: 3 previous findings addressed
  • Unresolved: 0 previous findings remain
  • 🆕 New Issues: 0 new findings in recent changes

Finding Validation

  • 🔍 Dismissed as False Positives: 3 findings were re-investigated and found to be incorrect
  • Confirmed Valid: 0 findings verified as genuine issues
  • 👤 Needs Human Review: 0 findings require manual verification

Verdict

All 3 previous findings (NEW-001, NEW-002, CMT-001) are fully resolved in commit cdd8c31. The if not stats: return False guard correctly prevents greenfield false positives on missing/corrupt indexes. The dead TYPE_CHECKING block was removed. The planning phase double-end bug fix from the prior commit remains intact.

All 20 CI checks are passing. No merge conflicts.

2 new findings from code review and 1 from comment analysis were all dismissed as false positives by the finding-validator:

  • The "unrecognized index format" edge case (medium) is unreachable because ProjectAnalyzer always includes a services key in its output — there is no supported code path that produces project_index.json without either files or services.
  • The _planning_phase_ended outside __init__ (low) cannot cause an AttributeError because the private _run_phases() method is exclusively called from run() which always initializes the attribute first.

The commit cleanly addresses all previously identified issues with a minimal, focused fix.

Review Process

Agents invoked: resolution-verifier, new-code-reviewer, comment-analyzer, finding-validator


This is an AI-generated follow-up review using parallel specialist analysis with finding validation.


This automated review found no blocking issues. The PR can be safely merged.

Generated by Auto Claude

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/backend This is backend only bug Something isn't working size/M Medium (100-499 lines)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Spec creation crashes for empty/greenfield projects

1 participant