Cap and coalesce terminal event streams to reduce memory growth #376

Merged
sbertix merged 2 commits into main from sbertix/fix-unbound-accumulation
Jun 2, 2026
@sbertix sbertix commented Jun 2, 2026

Collaborator

Summary

The terminal and worktree-info event streams used unbounded AsyncStream buffers. When a producer outpaces the main-actor consumer (for example a per-tab projection / progress / task-status storm), the in-process buffer could grow without bound and steadily increase memory use over a long session. This bounds and coalesces those streams, and removes the redundant churn at the source.

Event streams

  • Both streams now use .bufferingNewest(2048) instead of an unbounded buffer, so a wedged consumer caps memory instead of growing forever.
  • Latest-wins terminal state events (tab projection, progress, task status, focus) are coalesced by identity, so a burst of identical values collapses to the last one instead of flooding the stream. The coalesce cache is seeded from the resubscribe replay so the first live event after a resubscribe dedups consistently.
  • The pre-subscription pending buffer is bounded and coalesced, with the teardown purge mirrored into it, and the prune path clears both dedup caches in one place.
  • On a backpressure drop, a compact identity of the shed event is logged (case plus key ids, never the payload) so a drop storm cannot flood the log.
  • The worktree-info stream is capped but not deduped, since its events are refresh signals where each repeat is meaningful. Its per-worktree refresh loop now surfaces a non-cancellation error instead of breaking silently.
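As a rough illustration of the cap and the drop logging described above, here is a minimal sketch. `TerminalEvent`, `logIdentity`, `makeBoundedStream`, and `emit` are hypothetical names chosen for this example, not the types in this PR; only `AsyncStream`, `.bufferingNewest`, and the `.dropped` yield result come from the standard library.

```swift
// Hypothetical event type for illustration; the PR's real enum is not shown here.
enum TerminalEvent: Equatable, Sendable {
    case progress(tabID: Int, fraction: Double)
    case taskStatus(tabID: Int, isRunning: Bool)

    /// Case name plus key ids only, so a drop log line never carries the payload.
    var logIdentity: String {
        switch self {
        case .progress(let tabID, _): return "progress(tab: \(tabID))"
        case .taskStatus(let tabID, _): return "taskStatus(tab: \(tabID))"
        }
    }
}

typealias TerminalStream = AsyncStream<TerminalEvent>

/// Builds a stream that keeps at most `limit` unconsumed events.
func makeBoundedStream(limit: Int) -> (stream: TerminalStream, continuation: TerminalStream.Continuation) {
    TerminalStream.makeStream(bufferingPolicy: .bufferingNewest(limit))
}

/// Yields an event and returns a log line if the buffer had to shed one.
@discardableResult
func emit(_ event: TerminalEvent, to continuation: TerminalStream.Continuation) -> String? {
    // With .bufferingNewest, a full buffer evicts its oldest element and
    // reports that element through .dropped.
    if case .dropped(let shed) = continuation.yield(event) {
        return "event stream dropped \(shed.logIdentity)"
    }
    return nil
}
```

With a limit of 2 and no consumer, the third `emit` call sheds the oldest buffered event and returns a line naming only its case and tab id. The PR uses a limit of 2048.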

Test harness

  • Speeds up and de-flakes the presence test harness drain(), which spawned hundreds of thousands of detached tasks per call and never returned early. The slowest socket-presence tests drop from between 13 and 43 seconds each to under 1.5 seconds each, with no behavior change.

Tests

  • Added coverage for the coalescing, the live and pending buffer caps, the teardown purge, lifecycle events never coalescing, the pending-replay cache seeding, and the worktree-info buffer cap.
  • Full suite green (1588 tests).

sbertix added 2 commits June 2, 2026 18:14
The terminal and worktree-info event streams used unbounded AsyncStream
buffers, so a producer outpacing the main-actor consumer (e.g. a per-tab
projection / progress / task-status storm) could grow the in-process
buffer without bound and steadily increase memory use over a long session.

Switch both streams to bufferingNewest(2048). Coalesce the latest-wins
terminal state events (tab projection, progress, task status, focus) by
identity so a storm collapses to its last value, dropping only a value
equal to the immediately-previous one per key. Seed the coalesce cache
from the resubscribe replay (both the pending-buffer drain and the
projection re-seed run through emit) so the first live event after a
resubscribe is deduped consistently. Bound and coalesce the
pre-subscription pending buffer, mirror the teardown purge into it, and
clear lastEmittedProjections alongside the coalesce keys on prune. On a
backpressure drop, log a compact identity of the shed event (case plus
key ids, never the payload) so a drop storm can't flood the log.
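The per-key rule above (drop only a value equal to the immediately-previous one for that key) can be sketched roughly as follows. `Coalescer`, `shouldEmit`, and `prune` are hypothetical names for this example and are not the identifiers used in the change; lifecycle events would simply bypass this check.

```swift
/// Minimal latest-wins dedup cache, keyed by event identity.
/// Hypothetical helper for illustration, not the PR's implementation.
struct Coalescer<Key: Hashable, Value: Equatable> {
    private var lastEmitted: [Key: Value] = [:]

    /// Returns false only when `value` equals the previous value for `key`.
    /// Routing replayed events through this same call also seeds the cache,
    /// so the first live event after a resubscribe is compared consistently.
    mutating func shouldEmit(_ value: Value, for key: Key) -> Bool {
        if lastEmitted[key] == value { return false }
        lastEmitted[key] = value
        return true
    }

    /// Forgets a key on teardown so a later value for it is never suppressed.
    mutating func prune(_ key: Key) {
        lastEmitted[key] = nil
    }
}
```

Because only the immediately-previous value is remembered, a sequence such as busy, idle, busy for one key is forwarded in full, while busy, busy collapses to a single event.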

The worktree-info stream is capped but not deduped: its events are refresh
signals where each repeat is meaningful. Surface a non-cancellation error
from the per-worktree refresh loop instead of breaking silently.
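One way to picture the refresh-loop change is a loop that still stops on any error but tells the caller why it stopped. This is a sketch under assumptions: `RefreshLoopExit`, `runRefreshLoop`, and the interval parameter are hypothetical, and the real loop may report the error differently (for example by logging in place).

```swift
/// Why the loop stopped. Hypothetical type for illustration.
enum RefreshLoopExit {
    case cancelled
    case failed(any Error)
}

/// Repeats `refresh` until it throws. Cancellation is the expected way to
/// stop; any other error is handed back instead of being swallowed.
func runRefreshLoop(
    intervalNanoseconds: UInt64,
    refresh: () async throws -> Void
) async -> RefreshLoopExit {
    while true {
        do {
            try await refresh()
            // Task.sleep throws CancellationError when the task is cancelled.
            try await Task.sleep(nanoseconds: intervalNanoseconds)
        } catch is CancellationError {
            return .cancelled
        } catch {
            return .failed(error)
        }
    }
}
```

The caller can then log the `.failed` case and ignore `.cancelled`, which keeps normal teardown quiet.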

Add tests for the coalescing, the live and pending buffer caps, the
teardown purge, lifecycle events never coalescing, the pending-replay
cache seeding, and the worktree-info buffer cap.

drain() ran Task.megaYield(count: 10_000) up to 64 times per call, and each
megaYield spawns `count` detached tasks, so a single drain churned hundreds of
thousands of tasks; the settle check also never early-returned. Socket-presence
tests that drain several times took 13 to 43 seconds each despite a TestClock.

Lower the per-pass yield count and settle on a genuinely quiescent pass
(consumer parked, nothing processed, and no idle-hook debounce still scheduled
via a test-only count on the manager). The last clause closes the race where
clock.advance returned but the awoken idle task had not yet emitted, which would
otherwise let a busy suite conclude "idle" too early. The six socket tests drop
from between 13 and 43 seconds each to between 0.06 and 1.5 seconds each, with
no behavior change.
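The settle logic can be sketched as below. This is not the harness code: `DrainSnapshot`, its fields, and the default counts are hypothetical, and plain `Task.yield()` stands in for `Task.megaYield` to keep the example dependent on the standard library only.

```swift
/// What the harness observes after one pass. Hypothetical shape.
struct DrainSnapshot {
    var consumerParked: Bool
    var processedThisPass: Int
    var scheduledIdleDebounces: Int

    /// All three clauses must hold; the last one covers an idle task that
    /// has been woken by the clock but has not emitted yet.
    var isQuiescent: Bool {
        consumerParked && processedThisPass == 0 && scheduledIdleDebounces == 0
    }
}

/// Yields a small, fixed number of times per pass and returns as soon as a
/// pass is quiescent. Returns the pass number, or nil if it never settled.
func drain(
    maxPasses: Int = 64,
    yieldsPerPass: Int = 50,
    snapshot: () -> DrainSnapshot
) async -> Int? {
    for pass in 0..<max(maxPasses, 0) {
        for _ in 0..<max(yieldsPerPass, 0) {
            await Task.yield()
        }
        if snapshot().isQuiescent { return pass + 1 }
    }
    return nil
}
```

The early return is what removes most of the cost: an already idle system settles on the first pass instead of running every pass.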
@sbertix sbertix enabled auto-merge (squash) June 2, 2026 16:17
tuist Bot commented Jun 2, 2026

🛠️ Tuist Run Report 🛠️

Builds 🔨

Scheme Status Duration Commit
supacode 2m 33s f3f9f4754

@sbertix sbertix merged commit b1ecdf3 into main Jun 2, 2026
2 checks passed
@sbertix sbertix deleted the sbertix/fix-unbound-accumulation branch June 2, 2026 16:32