Skip to content

Handle autocrlf in carry-forward detection#913

Open
Soph wants to merge 3 commits intomainfrom
soph/codex-e2e-fix
Open

Handle autocrlf in carry-forward detection#913
Soph wants to merge 3 commits intomainfrom
soph/codex-e2e-fix

Conversation

@Soph
Copy link
Copy Markdown
Collaborator

@Soph Soph commented Apr 10, 2026

Fixed a false carry-forward bug in manual-commit shadow branch cleanup when Git line-ending normalization is enabled.

In the failing Codex E2E path, the second edit left src/main.go in a CRLF-normalized working-tree form while the committed blob was LF-normalized under core.autocrlf=true. Our carry-forward logic compared raw on-disk bytes to the commit blob hash, treated that as a dirty file, and incorrectly recreated a shadow branch after the user had already committed the full change.

This change updates workingTreeMatchesCommit to ask Git whether the path is clean before falling back to raw blob hashing. That makes the decision respect Git’s clean/smudge behavior and avoids phantom carry-forward on line-ending-only differences. It also adds a regression test covering the autocrlf normalization case.


Note

Medium Risk
Changes carry-forward detection to shell out to git diff for cleanliness checks, which can affect behavior across environments (git availability/config) and could alter which files are carried forward in edge cases.

Overview
Fixes a false-positive carry-forward case by updating workingTreeMatchesCommit to first ask Git (git diff --quiet) whether a path is clean, instead of relying solely on hashing raw on-disk bytes (which can disagree under core.autocrlf and other clean/smudge filters).

Adds a regression test that enables core.autocrlf=true and verifies line-ending-only normalization does not cause files to be treated as having remaining agent changes.

Reviewed by Cursor Bugbot for commit 6f18f10. Configure here.

Entire-Checkpoint: 5e0e77648a11
@Soph Soph requested a review from a team as a code owner April 10, 2026 12:07
Copilot AI review requested due to automatic review settings April 10, 2026 12:07
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 6f18f10. Configure here.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes false carry-forward detection during manual-commit shadow-branch cleanup when Git line-ending normalization (e.g., core.autocrlf=true) causes working-tree bytes to differ from committed blob bytes even though Git considers the path clean.

Changes:

  • Update workingTreeMatchesCommit to consult git diff --quiet before falling back to raw blob hashing.
  • Add a regression test intended to cover the core.autocrlf normalization scenario.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
cmd/entire/cli/strategy/content_overlap.go Uses Git’s cleanliness check to avoid phantom carry-forward due to clean/smudge normalization.
cmd/entire/cli/strategy/content_overlap_test.go Adds a test case meant to validate the autocrlf-normalization behavior.

The call chain from PostCommit passes an unbounded context, so
workingTreeMatchesCommit could block forever on a hung git process.
Thread the caller's context through and wrap it with a 5s timeout,
matching the pattern used by resolveWorktreeBranchGit for similar
local git operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 8b5b4eca788e
@gtrrz-victor
Copy link
Copy Markdown
Contributor

Follow-up: add 5s timeout to git diff in carry-forward detection

  • workingTreeMatchesCommit was using context.Background() with no timeout — the entire call chain from PostCommit passes an unbounded context
  • Now threads the caller's ctx and wraps it with context.WithTimeout(ctx, 5*time.Second), matching the pattern used by resolveWorktreeBranchGit for similar local git operations
  • If git diff hangs (e.g., misbehaving smudge filter), it will be killed after 5s and fall through to the existing hash comparison

@gtrrz-victor
Copy link
Copy Markdown
Contributor

Follow-up: distinguish git diff exit codes in carry-forward detection

  • Exit code 1 from git diff --exit-code means "file is dirty" — now returns false immediately instead of falling through to raw blob hashing, which could re-introduce the autocrlf mismatch this PR fixes
  • Only falls back to blob hash comparison on actual git failures (exit 128+, timeout, etc.)
  • Follows the existing errors.As(*exec.ExitError) + ExitCode() pattern from isDisconnected and DeleteBranchCLI

Exit 1 means "file is dirty" and should return false immediately,
not fall through to raw blob hashing which can re-introduce the
autocrlf mismatch. Only fall back to blob hash on actual git
failures (exit 128+, timeout).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: e4ee6688a68c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants