Fix Logs UI Error Display Oscillation Bug by AndyMik90 · Pull Request #1868 · AndyMik90/Aperant

AndyMik90 · 2026-02-18T14:12:15Z

Base Branch

This PR targets the develop branch (required for all feature/fix PRs)
This PR targets main (hotfix only - maintainers)

Description

Fixes oscillation bug in task logs UI where worktree coding phase would flip-flop between pending and active states during agent retries, and prevents stale cache emission when mid-write JSON parse failures occur. Implements anti-oscillation logic using nullish coalescing operator and cache versioning to gate event emissions.

Related Issue

Closes #236

Type of Change

Area

Frontend
Backend
Fullstack

Commit Message Format

Follow conventional commits: <type>: <subject>

Types: feat, fix, docs, style, refactor, test, chore

Example: feat: add user authentication system

AI Disclosure

This PR includes AI-generated code (Claude, Codex, Copilot, etc.)

Tool(s) used: Claude Code

Testing level:

Untested -- AI output not yet verified
Lightly tested -- ran the app / spot-checked key paths
Fully tested -- all tests pass, manually verified behavior
I understand what this PR does and how the underlying code works

Checklist

I've synced with develop branch
I've tested my changes locally
I've followed the code principles (SOLID, DRY, KISS)
My PR is small and focused (< 400 lines ideally)

Platform Testing Checklist

CRITICAL: This project supports Windows, macOS, and Linux. Platform-specific bugs are a common source of breakage.

Windows tested (either on Windows or via CI)
macOS tested (either on macOS or via CI)
Linux tested (CI covers this)
Used centralized platform/ module instead of direct process.platform checks
No hardcoded paths (used findExecutable() or platform abstractions)

If you only have access to one OS: CI now tests on all platforms. Ensure all checks pass before submitting.

CI/Testing Requirements

All CI checks pass on all platforms (Windows, macOS, Linux)
All existing tests pass
New features include test coverage
Bug fixes include regression tests

Screenshots

N/A

Feature Toggle

Behind localStorage flag: use_feature_name
Behind settings toggle
Behind environment variable/config
N/A - Feature is complete and ready for all users

Breaking Changes

Breaking: No

Details: Changes are backward compatible. Modified internal state management logic in TaskLogService without altering public API signatures.

Summary by CodeRabbit

Bug Fixes
- Improved stability of log retrieval to prevent rapid state flip‑flop during concurrent operations.
- Enhanced cache versioning so updates are only emitted when fresh data is available, avoiding stale notifications.
- Prevented failed parses from triggering spurious log updates.
Tests
- Added integration and unit tests validating oscillation prevention, stale-cache suppression, emission gating, and cache cleanup behaviors.

coderabbitai · 2026-02-18T14:12:37Z

📝 Walkthrough

Walkthrough

Adds per-spec cache versioning and version-gated emission logic to TaskLogService to suppress stale or mid-write log emissions and prevent oscillation between main/worktree phases. Adds integration and unit tests validating oscillation prevention, stale-cache suppression, parse-failure handling, and cache/version lifecyle cleanup.

Changes

Cohort / File(s)	Summary
Integration Tests `apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts`	Adds integration tests for oscillation prevention, consistent coding status under rapid calls, stale-cache emission suppression, and parse-failure scenarios (contains duplicated test blocks in diff).
Unit Tests `apps/frontend/src/main/__tests__/task-log-service.test.ts`	Adds unit tests for merge behavior, startWatching/loadLogsFromPath emission gating, cacheVersions lifecycle (clearCache/stopWatching), and parse-error resilience.
Service Implementation `apps/frontend/src/main/task-log-service.ts`	Introduces `cacheVersions: Map<string, number>`, increments per-spec version on successful parses, uses nullish coalescing for phase merge, gates `logs-changed` emits to version increases, and cleans up cacheVersions on stop/clear.
Repo Configuration `manifest_file`, `package.json`	Non-code manifest/package file changes included in diff (lines changed reported).

Sequence Diagram(s)

sequenceDiagram
  participant Renderer
  participant IPC as IPC Handler
  participant Service as TaskLogService
  participant FS as File/Parser

  Renderer->>IPC: request logs / subscribe
  IPC->>Service: startWatching(spec)
  Service->>FS: loadLogsFromPath(main/worktree)
  FS-->>Service: parse result (success/failure)
  alt parse success
    Service->>Service: increment cacheVersions[spec]
    Service-->>IPC: emit logs-changed (only if versionAfter > versionBefore)
    IPC-->>Renderer: forward logs-changed
  else parse failure (cached or mid-write)
    Service-->>IPC: skip emit (debug msg)
    IPC--/->>Renderer: no update
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

auto-claude: 199-bug-logs-disappear-after-restart #1728: Modifies the same TaskLogService and related log-loading behavior; code-level overlap with cache/versioning and emit gating changes.

Suggested labels

bug, size/M

Suggested reviewers

AlexMadera
MikeeBuilds

Poem

🐇
Versions hop, then numbers climb,
No more flip-flop, steady rhyme.
Fresh logs leap—old ones stay low,
I thump a drum and watch them flow. 🎉

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The PR objectives and code changes do not align with the linked issue `#236` about enabling spellcheck on Textarea components.	Verify that issue `#236` is the correct linked issue for this PR. The actual changes address logs UI oscillation, which appears unrelated to spellcheck functionality.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main fix: addressing an oscillation bug in the logs UI error display.
Out of Scope Changes check	✅ Passed	All code changes focus on fixing the logs UI oscillation bug through cache versioning and nullish coalescing, directly addressing the PR objectives.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch auto-claude/236-bug-logs-ui-bug-with-errors

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-02-18T14:12:39Z

Summary of Changes

Hello @AndyMik90, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical UI bug in the task logs display where the worktree coding phase would inconsistently switch between pending and active statuses during agent retries, leading to a confusing user experience. Additionally, it resolves an issue where the UI could display stale log data if a JSON file was being written to and parsed mid-update. The changes enhance the reliability and accuracy of the task log display by implementing robust anti-oscillation and cache freshness mechanisms.

Highlights

Oscillation Prevention: Implemented anti-oscillation logic in the task logs UI to prevent the worktree coding phase from flip-flopping between pending and active states during agent retries.
Stale Cache Prevention: Prevented the emission of stale cached log data when mid-write JSON parse failures occur, ensuring only fresh, successfully parsed logs are displayed.
Nullish Coalescing Operator: Refactored log merging logic to utilize the nullish coalescing operator (??) for more robust phase selection, avoiding status-based ternary conditions that caused oscillation.
Cache Versioning: Introduced a cache versioning mechanism in TaskLogService to intelligently gate logs-changed event emissions, only sending updates when a fresh parse has successfully occurred.

Changelog

apps/frontend/src/main/ipc-handlers/task/tests/logs-integration.test.ts
- Added new test suites to specifically verify the oscillation prevention logic for the worktree coding phase.
- Included tests to confirm that stale cache emissions are correctly prevented during JSON parse failures.
- Verified consistent coding status across rapid calls, ensuring no flip-flop behavior.
- Confirmed that logs-changed events are only forwarded to the renderer when fresh data is available from the service.
apps/frontend/src/main/task-log-service.ts
- Introduced a cacheVersions map to track the version of cached logs for each spec directory.
- Modified the loadLogs method to increment the cacheVersions for a spec directory upon successful parsing.
- Updated the mergeLogs function to use the nullish coalescing operator (??) for selecting coding and validation phases, prioritizing worktree logs if they exist, which prevents oscillation during agent retries.
- Adjusted the startWatching method to only emit logs-changed events if the combined cache version for main and worktree logs has increased, indicating a successful fresh parse.
- Added a debug log message for when logs-changed emission is skipped due to only cached data being available (mid-write parse failure).

Activity

The pull request was created by AndyMik90.
It targets the develop branch.
The changes are categorized as a bug fix affecting the backend.
AI-generated code from Claude Code was included in this PR.
The changes have been lightly tested, and regression tests were added for the bug fix.
The PR is small and focused, adhering to the recommended line count.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request effectively resolves two distinct bugs in the task logs UI: a visual oscillation during agent retries and the display of stale data due to JSON parsing failures. The use of the nullish coalescing operator is a clean fix for the oscillation bug. The introduction of a cache versioning system to gate event emissions is a robust solution for the stale data issue. The changes are well-supported by comprehensive new integration tests that specifically target these regression scenarios. I have one suggestion to refactor a small piece of duplicated code to enhance maintainability. Overall, this is a solid contribution.

gemini-code-assist · 2026-02-18T14:14:13Z

apps/frontend/src/main/task-log-service.ts

+        const versionBefore =
+          (this.cacheVersions.get(specDir) ?? 0) +
+          (currentWorktreeSpecDir ? (this.cacheVersions.get(currentWorktreeSpecDir) ?? 0) : 0);
+
        const logs = this.loadLogs(specDir);

-        if (logs) {
+        // Only emit when at least one fresh parse succeeded (version increased).
+        // If both paths returned cached data due to a mid-write JSON failure the
+        // version stays unchanged and we skip the emit to avoid oscillation.
+        const versionAfter =
+          (this.cacheVersions.get(specDir) ?? 0) +
+          (currentWorktreeSpecDir ? (this.cacheVersions.get(currentWorktreeSpecDir) ?? 0) : 0);


The logic to calculate the combined cache version for the main and worktree spec directories is duplicated for versionBefore and versionAfter. To improve maintainability and reduce redundancy, consider extracting this logic into a private helper method within the TaskLogService class.

For example:

private getCombinedCacheVersion(mainSpecDir: string, worktreeSpecDir: string | null): number { const mainVersion = this.cacheVersions.get(mainSpecDir) ?? 0; const worktreeVersion = worktreeSpecDir ? (this.cacheVersions.get(worktreeSpecDir) ?? 0) : 0; return mainVersion + worktreeVersion; }

You could then simplify the polling logic:

const versionBefore = this.getCombinedCacheVersion(specDir, currentWorktreeSpecDir); const logs = this.loadLogs(specDir); const versionAfter = this.getCombinedCacheVersion(specDir, currentWorktreeSpecDir);

AndyMik90

🤖 Auto Claude PR Review

Merge Verdict: 🟠 NEEDS REVISION

🟠 Needs revision - 1 structural/other issue(s) require attention.

Branch is out of date with base branch. Update branch first - if no conflicts arise, you can merge. If merge conflicts arise, resolve them and run follow-up review again. 2 non-blocking suggestion(s) to consider.

Risk Assessment

Factor	Level	Notes
Complexity	Medium	Based on lines changed
Security Impact	None	Based on security findings
Scope Coherence	Good	Based on structural review

🚨 Blocking Issues (Must Fix)

Branch Out of Date: PR branch is behind the base branch and needs to be updated

Findings Summary

Low: 2 issue(s)

Generated by Auto Claude PR Review

Findings (2 selected of 2 total)

🔵 [04fa6fc4b439] [LOW] cacheVersions not cleaned in clearCache() or stopWatching()

📁 apps/frontend/src/main/task-log-service.ts:531

The new cacheVersions Map is populated in loadLogsFromPath() (line 73) but never cleaned up. clearCache() (line 531) only clears logCache but not the corresponding cacheVersions entry. stopWatching() (line 443) clears pollIntervals and watchedPaths but neither logCache (pre-existing omission) nor cacheVersions. While this doesn't cause a functional bug — the monotonically increasing counter works regardless of accumulated value — it creates a minor inconsistency where clearCache() doesn't fully reset the cache state. If the class later adds cache-invalidation logic that depends on version being reset, this could become a problem. | The PR adds cacheVersions as a companion Map to logCache (both keyed by specDir, both tracking per-path state). However, the existing clearCache(specDir) method on line 531 only calls this.logCache.delete(specDir) without also calling this.cacheVersions.delete(specDir). While clearCache is not currently invoked anywhere in the codebase (verified via Grep for taskLogService.clearCache), this creates a lifecycle gap — any future caller of clearCache would clear the cached logs but leave orphaned version counters. The stopWatching method similarly doesn't clean either Map, so the omission is at least internally consistent with the existing cleanup pattern. Searched Grep('taskLogService\.clearCache', 'apps/frontend/src') — confirmed no callers exist.

Suggested fix:

Add `this.cacheVersions.delete(specDir);` inside `clearCache()` for consistency. Optionally also clear relevant cacheVersions entries in `stopWatching()` alongside the existing logCache cleanup gap.

🔵 [b4fcd3eed1d5] [LOW] Regression tests cover IPC layer but not the service-level fix

📁 apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts:483

The new test suites 'Oscillation prevention' and 'Stale cache emission prevention' test the IPC handler's pass-through behavior with the TaskLogService fully mocked. The actual bug fixes — the ?? operator in mergeLogs() (line 138) and the cacheVersions gating in startWatching() (lines 394-408) — are not exercised by any test. For example, the test 'should NOT forward any update to renderer when service does not emit (parse failure)' (line 643) validates that NOT calling an event handler results in no renderer update, which is trivially true. The test comments are honest about this ('The actual anti-oscillation fix lives in TaskLogService.mergeLogs()'), but there are no dedicated TaskLogService unit tests to verify the core fix behavior.

Suggested fix:

Add a unit test file for TaskLogService (e.g., `task-log-service.test.ts`) that directly tests: (1) `mergeLogs()` returns worktree coding phase even when status is 'pending' with empty entries (the ?? fix), and (2) the `startWatching()` poll skips emission when `loadLogsFromPath` returns cached data on parse failure (the cacheVersions gating).

This review was generated by Auto Claude.

…vice.ts Verified three root causes by reading actual source files: 1. RC1 CONFIRMED: mergeLogs() oscillation at lines 134-139 - condition falls back to main logs when worktree coding phase resets to pending during agent retry 2. RC2 CONFIRMED: Stale cache emitted in startWatching() polling loop (lines 329-413) when JSON parse fails mid-write, stale cache is emitted as logs-changed 3. RC3 NOTE: useTaskDetail.ts onTaskLogsChanged already uses additive setExpandedPhases pattern (lines 256-260) - fix appears already applied, subtask-2-3 should verify Investigation complete - findings documented in build-progress.txt. No code changes in this investigation-only subtask. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

…in task-log-service.ts Replace the entry-count/status-based ternary condition for coding and validation phase selection with nullish coalescing (??) to prevent flip-flopping when the worktree coding/validation phase resets to 'pending' during agent retries. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

…JSON parse failures Add cacheVersions Map to TaskLogService to track successful JSON parses per specDir. loadLogsFromPath increments the version on every successful parse. The startWatching polling loop now captures version sums before/after loadLogs() and only emits logs-changed when the version increased, preventing oscillation caused by emitting stale cached data during mid-write failures. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

…and stale cache emission Add test cases to logs-integration.test.ts: 1. Oscillation prevention - worktree coding phase pending/empty passes through IPC as-is 2. Stale cache prevention - renderer only updated when service emits logs-changed All 127 test files pass (3224 tests passing, 6 skipped). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

…ests Address PR review findings: - clearCache() now also deletes the corresponding cacheVersions entry, preventing stale version counters from accumulating across cache clears - stopWatching() cleans up cacheVersions for both main and worktree spec dirs, preventing stale counters when re-watching the same spec - Add task-log-service.test.ts with 7 unit tests covering: - mergeLogs() ?? operator: worktree pending phases are preserved (not overridden by main's completed phases) - startWatching() poll: emission is skipped when parse failure returns cached data (cacheVersions gating) - clearCache() and stopWatching() properly clean cacheVersions - loadLogsFromPath() cache fallback behavior on parse errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

apps/frontend/src/main/__tests__/task-log-service.test.ts

+      const service = new TaskLogService();
+      const specId = '001-test';
+      const specDir = '/project/.auto-claude/specs/001-test';
+      const worktreeSpecDir = '/project/.auto-claude/worktrees/tasks/001-test/.auto-claude/specs/001-test';


In general, unused variables should be removed to reduce clutter and avoid confusion. If the variable was meant to be used but isn't, the alternative is to restore or add the missing usage; but without evidence of such intended use, the safest fix is deletion.

The best minimal fix here is to remove the declaration of worktreeSpecDir from the stopWatching() - cacheVersions cleanup test. The rest of the code already hardcodes the corresponding worktree path when setting up findTaskWorktree.mockReturnValue, so removing the unused const will not affect behavior. Specifically, in apps/frontend/src/main/__tests__/task-log-service.test.ts, delete the line that declares const worktreeSpecDir = '/project/.auto-claude/worktrees/tasks/001-test/.auto-claude/specs/001-test'; (line 317 in the provided snippet). No additional imports, methods, or definitions are required.

apps/frontend/src/main/__tests__/task-log-service.test.ts

+      // but the cacheVersions entries should be gone.
+
+      // Load again to verify versions start fresh
+      const beforeSecondLoad = service.getCachedLogs(specDir);


In general, unused variables in tests (or production code) should be removed, unless they are needed for side effects or to clarify behavior. Here, beforeSecondLoad is simply assigned the return value of service.getCachedLogs(specDir) and never used, and the call itself has no side effects beyond reading internal state. The test’s purpose—to ensure cacheVersions is cleared on stopWatching and that watching can restart and emit events—is fully exercised by the rest of the code.

The minimal, behavior-preserving fix is to remove the beforeSecondLoad variable binding while preserving the call only if we still want to exercise getCachedLogs at that point. However, even that call is not necessary for the test’s assertions, and the comments already explain the intended behavior. The cleanest fix is to delete the entire line const beforeSecondLoad = service.getCachedLogs(specDir); and leave the surrounding comments intact. This change is localized to apps/frontend/src/main/__tests__/task-log-service.test.ts around line 338 and requires no new imports, methods, or definitions.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/frontend/src/main/__tests__/task-log-service.test.ts`:
- Around line 161-207: The test creates an untyped worktreeLogsRaw object with
null phase fields to simulate backend-produced nulls that bypass TypeScript's
TaskLogs typing; add a short inline comment above the worktreeLogsRaw
declaration (referencing worktreeLogsRaw and TaskLogs) stating this is an
intentional type bypass to test runtime null-fallback behavior (the stringified
readFileSync/JSON.parse path), so future readers know the nulls are deliberate
and not a typo.
- Around line 312-359: The test doesn't actually verify cacheVersions was reset
because versions could keep incrementing; update either the TaskLogService to
expose a test-only accessor (e.g., getCacheVersion(specDir) or
getCacheVersions()) or add a deterministic reload method (e.g., loadLogsFromPath
returning its computed version) and then change the test to read the version
before stopWatching, call stopWatching(specId), then assert the version for
specDir (and worktree path) equals the baseline (e.g., 0 or 1) rather than only
asserting that events fire after replay; reference stopWatching, startWatching,
getCachedLogs, and the internal cacheVersions when making the change.

In `@apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts`:
- Around line 598-641: The new test "should forward logs-changed events to
renderer when service emits fresh data" duplicates the existing "forward
logs-changed events to renderer" test; remove or consolidate to avoid redundancy
by either deleting the new duplicate block or merging its cacheVersions-focused
comments into the existing test; locate the duplicate by the test names and the
use of taskLogService.on mock and mockMainWindow.webContents?.send assertions,
and if consolidating, update the surviving test's description to mention the
cacheVersions / fresh-parse scenario and keep a single invocation of the
eventHandler('001-test-task', freshLogs) with the existing expect checks.
- Around line 643-660: The test 'should NOT forward any update to renderer when
service does not emit (parse failure)' is trivially asserting that not invoking
the service handler results in no send calls; update the test to either (a)
convert it into an explicit documentation/contract test by adding a clear
comment stating it only verifies "no-op when no event emitted" and reference the
more thorough unit test in task-log-service.test.ts, or (b) remove this
redundant test and rely on the version-gating unit test in task-log-service
(taskLogService and its startWatching/version check) that actually exercises the
parse-failure behavior; ensure references to taskLogService and
mockMainWindow.webContents.send remain or are removed consistently.

coderabbitai · 2026-02-20T11:44:10Z

apps/frontend/src/main/__tests__/task-log-service.test.ts

+    it('should fall back to main coding phase when worktree coding is null', () => {
+      const service = new TaskLogService();
+
+      const mainLogs = makeTaskLogs({
+        phases: {
+          planning: makePhaseLogs({ phase: 'planning', status: 'completed' }),
+          coding: makePhaseLogs({
+            phase: 'coding',
+            status: 'active',
+            entries: [{ type: 'text', content: 'Main coding', phase: 'coding', timestamp: '2024-01-01T00:30:00Z' }]
+          }),
+          validation: makePhaseLogs({ phase: 'validation' })
+        }
+      });
+
+      // Worktree logs where coding is explicitly null (not yet started in worktree)
+      const worktreeLogsRaw = {
+        spec_id: '001-test',
+        created_at: '2024-01-01T00:00:00Z',
+        updated_at: '2024-01-01T01:00:00Z',
+        phases: {
+          planning: makePhaseLogs({ phase: 'planning', status: 'completed' }),
+          coding: null,
+          validation: null
+        }
+      };
+
+      const mainSpecDir = '/project/.auto-claude/specs/001-test';
+      const worktreeSpecDir = '/project/.auto-claude/worktrees/tasks/001-test/.auto-claude/specs/001-test';
+
+      existsSync.mockReturnValue(true);
+      readFileSync.mockImplementation((filePath: string) => {
+        if (filePath.includes(worktreeSpecDir)) {
+          return JSON.stringify(worktreeLogsRaw);
+        }
+        return JSON.stringify(mainLogs);
+      });
+
+      findTaskWorktree.mockReturnValue('/project/.auto-claude/worktrees/tasks/001-test');
+
+      const result = service.loadLogs(mainSpecDir, '/project', '.auto-claude/specs', '001-test');
+
+      expect(result).not.toBeNull();
+      // With null worktree coding, should fall back to main
+      expect(result!.phases.coding.status).toBe('active');
+      expect(result!.phases.coding.entries).toHaveLength(1);
+    });


🧹 Nitpick | 🔵 Trivial

Good null-fallback test, but the worktreeLogsRaw object bypasses TypeScript's TaskLogs type.

worktreeLogsRaw is declared as an untyped object literal with coding: null and validation: null (lines 183-184). Since TaskLogs.phases.coding is typed as TaskPhaseLog (not TaskPhaseLog | null), this only works because readFileSync returns a stringified version that's parsed via JSON.parse(...) as TaskLogs — a runtime cast that doesn't enforce the type at compile time. This is intentional for testing the ?? behavior with truly-null phase values (which the Python backend can produce), but it's worth a brief comment noting this deliberate type bypass.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/frontend/src/main/__tests__/task-log-service.test.ts` around lines 161 - 207, The test creates an untyped worktreeLogsRaw object with null phase fields to simulate backend-produced nulls that bypass TypeScript's TaskLogs typing; add a short inline comment above the worktreeLogsRaw declaration (referencing worktreeLogsRaw and TaskLogs) stating this is an intentional type bypass to test runtime null-fallback behavior (the stringified readFileSync/JSON.parse path), so future readers know the nulls are deliberate and not a typo.

coderabbitai · 2026-02-20T11:44:10Z

apps/frontend/src/main/__tests__/task-log-service.test.ts

+  describe('stopWatching() - cacheVersions cleanup', () => {
+    it('should delete cacheVersions entries for watched paths when stopping', () => {
+      const service = new TaskLogService();
+      const specId = '001-test';
+      const specDir = '/project/.auto-claude/specs/001-test';
+      const worktreeSpecDir = '/project/.auto-claude/worktrees/tasks/001-test/.auto-claude/specs/001-test';
+
+      existsSync.mockReturnValue(true);
+      readFileSync.mockReturnValue(JSON.stringify(makeTaskLogs()));
+      findTaskWorktree.mockReturnValue('/project/.auto-claude/worktrees/tasks/001-test');
+
+      // Start watching (this will load initial logs, populating cacheVersions)
+      service.startWatching(specId, specDir, '/project', '.auto-claude/specs');
+
+      // Verify cache is populated for main spec dir
+      expect(service.getCachedLogs(specDir)).not.toBeNull();
+
+      // Stop watching
+      service.stopWatching(specId);
+
+      // After stopping, cacheVersions should be cleaned for both main and worktree paths.
+      // We verify indirectly: loading from the same path should start fresh version counting.
+      // The logCache may still have data (stopWatching doesn't clear logCache),
+      // but the cacheVersions entries should be gone.
+
+      // Load again to verify versions start fresh
+      const beforeSecondLoad = service.getCachedLogs(specDir);
+      // logCache is NOT cleared by stopWatching (only cacheVersions), so this may still exist
+      // The important thing is that cacheVersions was cleaned
+
+      // Start watching again - should not skip due to stale cacheVersions
+      service.startWatching(specId, specDir, '/project', '.auto-claude/specs');
+
+      const emittedEvents: Array<{ event: string; args: unknown[] }> = [];
+      service.on('logs-changed', (...args: unknown[]) => {
+        emittedEvents.push({ event: 'logs-changed', args });
+      });
+
+      // Change file content and advance timer
+      const updatedLogs = makeTaskLogs({ updated_at: '2024-01-01T03:00:00Z' });
+      readFileSync.mockReturnValue(JSON.stringify(updatedLogs));
+      vi.advanceTimersByTime(1100);
+
+      // Should emit because cacheVersions was cleaned and fresh parse succeeds
+      expect(emittedEvents.length).toBeGreaterThanOrEqual(1);
+
+      service.stopWatching(specId);
+    });


🧹 Nitpick | 🔵 Trivial

stopWatching cleanup test doesn't strongly prove cacheVersions was actually reset.

After stopWatching + startWatching again, the emission test (line 356) would pass even if cacheVersions were not cleaned — the version would simply continue incrementing from its previous value, and versionAfter > versionBefore would still hold. To make this test truly validate cleanup, you could expose a getCacheVersion(specDir) method for testing, or assert that after stopWatching, re-loading starts the version counter from a known baseline (e.g., by interleaving a clearCache + loadLogsFromPath and checking the version is 1, not a higher number).

As-is, the test serves as a smoke test for the watch-stop-rewatch cycle, which still has value.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/frontend/src/main/__tests__/task-log-service.test.ts` around lines 312 - 359, The test doesn't actually verify cacheVersions was reset because versions could keep incrementing; update either the TaskLogService to expose a test-only accessor (e.g., getCacheVersion(specDir) or getCacheVersions()) or add a deterministic reload method (e.g., loadLogsFromPath returning its computed version) and then change the test to read the version before stopWatching, call stopWatching(specId), then assert the version for specDir (and worktree path) equals the baseline (e.g., 0 or 1) rather than only asserting that events fire after replay; reference stopWatching, startWatching, getCachedLogs, and the internal cacheVersions when making the change.

coderabbitai · 2026-02-20T11:44:10Z

apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts

+  describe('Stale cache emission prevention (regression: mid-write JSON parse failures)', () => {
+    it('should forward logs-changed events to renderer when service emits fresh data', async () => {
+      // The service uses cacheVersions to gate emission: it only emits logs-changed when a
+      // fresh parse succeeded (version incremented). This test verifies that when the service
+      // DOES emit (fresh data available), the IPC handler correctly forwards it to the renderer.
+      const { taskLogService } = await import('../../../task-log-service');
+
+      const freshLogs: TaskLogs = {
+        spec_id: '001-test-task',
+        created_at: '2024-01-01T00:00:00Z',
+        updated_at: '2024-01-01T01:01:00Z',
+        phases: {
+          planning: { phase: 'planning', status: 'completed', started_at: null, completed_at: null, entries: [] },
+          coding: {
+            phase: 'coding',
+            status: 'active',
+            started_at: '2024-01-01T00:30:00Z',
+            completed_at: null,
+            entries: [
+              { type: 'text', content: 'Coding in progress', phase: 'coding', timestamp: '2024-01-01T01:00:00Z' }
+            ]
+          },
+          validation: { phase: 'validation', status: 'pending', started_at: null, completed_at: null, entries: [] }
+        }
+      };
+
+      const onCall = (taskLogService.on as Mock).mock.calls.find(
+        call => call[0] === 'logs-changed'
+      );
+      expect(onCall).toBeDefined();
+      if (!onCall) throw new Error('logs-changed handler not registered');
+      const eventHandler = onCall[1];
+
+      // Service emits (fresh parse succeeded, cacheVersions incremented)
+      eventHandler('001-test-task', freshLogs);
+
+      // IPC handler must forward it to the renderer exactly once
+      expect(mockMainWindow.webContents?.send).toHaveBeenCalledTimes(1);
+      expect(mockMainWindow.webContents?.send).toHaveBeenCalledWith(
+        'task:logsChanged',
+        '001-test-task',
+        freshLogs
+      );
+    });


🧹 Nitpick | 🔵 Trivial

Duplicate test: "forward logs-changed events to renderer when service emits fresh data" (line 599) is functionally identical to the existing "forward logs-changed events to renderer" (line 710).

Both tests locate the logs-changed handler via taskLogService.on, invoke it with mock logs, and assert webContents.send was called with 'task:logsChanged'. The only difference is the inline comments about cacheVersions. Consider removing the new test or consolidating the two to avoid redundant test maintenance.

Also applies to: 709-741

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts` around lines 598 - 641, The new test "should forward logs-changed events to renderer when service emits fresh data" duplicates the existing "forward logs-changed events to renderer" test; remove or consolidate to avoid redundancy by either deleting the new duplicate block or merging its cacheVersions-focused comments into the existing test; locate the duplicate by the test names and the use of taskLogService.on mock and mockMainWindow.webContents?.send assertions, and if consolidating, update the surviving test's description to mention the cacheVersions / fresh-parse scenario and keep a single invocation of the eventHandler('001-test-task', freshLogs) with the existing expect checks.

coderabbitai · 2026-02-20T11:44:10Z

apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts

+    it('should NOT forward any update to renderer when service does not emit (parse failure)', async () => {
+      // When a mid-write JSON parse failure occurs, the service's cacheVersions does NOT
+      // increment, so the service skips the logs-changed emit. This test confirms the renderer
+      // receives NO update when the service stays silent (no stale cache forwarded).
+      const { taskLogService } = await import('../../../task-log-service');
+
+      const onCall = (taskLogService.on as Mock).mock.calls.find(
+        call => call[0] === 'logs-changed'
+      );
+      expect(onCall).toBeDefined();
+      if (!onCall) throw new Error('logs-changed handler not registered');
+
+      // Simulate parse failure: service does NOT call the event handler
+      // (equivalent to the version check in startWatching() preventing the emit)
+
+      // Renderer should receive zero updates
+      expect(mockMainWindow.webContents?.send).not.toHaveBeenCalled();
+    });


🧹 Nitpick | 🔵 Trivial

The "parse failure → no forwarding" test is trivially true.

This test asserts that webContents.send was not called when no event handler was invoked. While it documents the expected no-op behavior, it doesn't actually exercise any code path — it only proves that not calling a function results in no side effects. Consider adding a comment acknowledging this is a documentation/contract test rather than a behavioral test, or removing it in favor of the unit test in task-log-service.test.ts which actually tests the version-gating logic.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@apps/frontend/src/main/ipc-handlers/task/__tests__/logs-integration.test.ts` around lines 643 - 660, The test 'should NOT forward any update to renderer when service does not emit (parse failure)' is trivially asserting that not invoking the service handler results in no send calls; update the test to either (a) convert it into an explicit documentation/contract test by adding a clear comment stating it only verifies "no-op when no event emitted" and reference the more thorough unit test in task-log-service.test.ts, or (b) remove this redundant test and rely on the version-gating unit test in task-log-service (taskLogService and its startWatching/version check) that actually exercises the parse-failure behavior; ensure references to taskLogService and mockMainWindow.webContents.send remain or are removed consistently.

AndyMik90

🤖 Auto Claude PR Review

🟠 Follow-up Review: Needs Revision

🟠 Needs revision - 3 blocking issue(s) require fixes.

Resolution Status

✅ Resolved: 0 previous findings addressed
❌ Unresolved: 0 previous findings remain
🆕 New Issues: 7 new findings in recent changes

Finding Validation

🔍 Dismissed as False Positives: 0 findings were re-investigated and found to be incorrect
✓ Confirmed Valid: 5 findings verified as genuine issues
👤 Needs Human Review: 0 findings require manual verification

🚨 Blocking Issues

Branch Out of Date: PR branch is behind the base branch and needs to be updated
quality: CI Complete check is failing
quality: test-frontend windows-latest check is failing
quality: Validated code finding (details not provided in summary)

Verdict

[Recovered via extraction] 2 CI checks are failing (CI Complete, test-frontend windows-latest), which blocks merge readiness. Both previous findings are resolved. 5 new code findings were identified (1 MEDIUM, 4 LOW), all validated. No unanswered contributor questions.

Review Process

Agents invoked: resolution-verifier, new-code-reviewer, comment-analyzer, finding-validator

This is an AI-generated follow-up review using parallel specialist analysis with finding validation.

⚠️ CI Status: 2 check(s) failing: CI Complete, test-frontend (windows-latest)

Findings (3 selected of 3 total)

🟠 [FU-778068A5] [HIGH] CI Complete check is failing

📁 unknown:0

[Recovered via extraction] CI Complete check is failing

🟠 [FU-0856E883] [HIGH] test-frontend windows-latest check is failing

📁 unknown:0

[Recovered via extraction] test-frontend windows-latest check is failing

🟡 [FU-776B16C0] [MEDIUM] Validated code finding (details not provided in summary)

📁 unknown:0

[Recovered via extraction] Validated code finding (details not provided in summary)

This review was generated by Auto Claude.

github-actions bot added area/frontend This is frontend only size/M Medium (100-499 lines) labels Feb 18, 2026

gemini-code-assist bot reviewed Feb 18, 2026

View reviewed changes

AndyMik90 self-assigned this Feb 18, 2026

AndyMik90 commented Feb 18, 2026

View reviewed changes

AndyMik90 and others added 5 commits February 20, 2026 12:38

AndyMik90 force-pushed the auto-claude/236-bug-logs-ui-bug-with-errors branch from 5f1fc0f to e48c74c Compare February 20, 2026 11:39

github-actions bot added size/L Large (500-999 lines) and removed size/M Medium (100-499 lines) labels Feb 20, 2026

github-advanced-security bot found potential problems Feb 20, 2026

View reviewed changes

coderabbitai bot reviewed Feb 20, 2026

View reviewed changes

AndyMik90 commented Feb 20, 2026

View reviewed changes

Uh oh!

Conversation

AndyMik90 commented Feb 18, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Base Branch

Description

Related Issue

Type of Change

Area

Commit Message Format

AI Disclosure

Checklist

Platform Testing Checklist

CI/Testing Requirements

Screenshots

Feature Toggle

Breaking Changes

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (2 warnings)

Uh oh!

gemini-code-assist bot commented Feb 18, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

AndyMik90 left a comment

Choose a reason for hiding this comment

🤖 Auto Claude PR Review

Merge Verdict: 🟠 NEEDS REVISION

Risk Assessment

🚨 Blocking Issues (Must Fix)

Findings Summary

Findings (2 selected of 2 total)

🔵 [04fa6fc4b439] [LOW] cacheVersions not cleaned in clearCache() or stopWatching()

🔵 [b4fcd3eed1d5] [LOW] Regression tests cover IPC layer but not the service-level fix

Uh oh!

Check notice

Copilot Autofix

Check notice

Copilot Autofix

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

AndyMik90 left a comment

Choose a reason for hiding this comment

🤖 Auto Claude PR Review

🟠 Follow-up Review: Needs Revision

Resolution Status

Finding Validation

🚨 Blocking Issues

AndyMik90 commented Feb 18, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 18, 2026 •

edited

Loading