[Feature] UI rewrite v2: modular, perf-gated rollout #599

@Astro-Han

Description

What task are you trying to do?

Replace the old catch-all tracker for the remaining UI rewrite work with a modular rollout plan that can ship incrementally without repeating the PR #589 performance regression.

Which area would this change affect?

UI or design system

What do you do today?

Today the remaining rewrite work is split between two bad states:

That means the team cannot safely keep shipping large rewrite slices without a perf guardrail and smaller execution boundaries.

What would a good result look like?

A new master tracker owns only the remaining rewrite work and organizes it into 7 modular areas:

Order | Area    | Scope
1     | PR0     | Automated perf regression CI
2     | Area A  | Message flow
3     | Area B  | Right panel
4     | Area C  | Home
5     | Area D  | Settings
6     | Area E1 | Visual shell
7     | Area E2 | Final shell

PR0 ships first. After that, each area moves independently, subject to a soft file-size rule and explicit perf verification.

May launch contract

The May goal is to ship the UI rewrite series as a launchable experience and ship the complete first-user onboarding flow. This is not scoped to #601 alone, and perf guard work should protect the end-to-end user experience without becoming a separate project that delays the launch train.

User-facing done means:

  1. A first-time user can open PawWork, understand what to do next, and complete the first real task without guessing.
  2. Home, entry points, the main conversation flow, tool presentation, right-side context, and the basic shell path feel visually coherent.
  3. Long-session typing, scrolling, and large tool output stay usable without obvious regressions.
  4. The launch path is covered by focused E2E, perf guard coverage, and Electron desktop manual verification.
  5. Before calling the launch train complete, latest dev gets one integrated pass through first launch -> onboarding -> first successful task -> main conversation flow, with an Electron manual check plus focused E2E, perf, and CI snapshots.

Current ownership:

  • CC leads product and UI implementation for the rewrite and onboarding surfaces.
  • Codex supports with perf guards, review, focused E2E, desktop manual checks, integration quality, and risk-driven follow-up fixes.
  • Perf guard work is prioritized by the user experience it protects. The next Codex guard is long-session input lag; heavy diff/edit/apply_patch and virtualization/scroll-owner work only enter the launch path when they protect a real upcoming UI rewrite or onboarding risk.

This contract should be used with #545 for onboarding details and #600 for perf-gate mechanics.

What would count as done?

  • PR0 is merged and active before any new visual rewrite PR ships.
  • Each area starts from its own sub-issue instead of a large umbrella slice.
  • Every area's first PR must pass the perf gate from [Task] UI rewrite v2 PR0: perf-gated regression CI #600.
  • File-size guidance is explicit: soft cap 500 LOC, ideal 200 LOC, no splitting for the sake of splitting.
  • Already-delivered surfaces remain stable and are not re-opened by default.
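
The file-size guidance above could be checked mechanically during review. The following is only a sketch: the thresholds come from this issue, but `classifyFileSize`, `countLines`, and `reviewFiles` are hypothetical names, not existing PawWork tooling. It warns rather than fails, since the cap is soft.

```typescript
// Hypothetical review helper; the names are illustrative, not part of the repo.
type SizeVerdict = "ideal" | "ok" | "over-soft-cap";

const IDEAL_LOC = 200; // "ideal 200 LOC" from the guidance above
const SOFT_CAP_LOC = 500; // "soft cap 500 LOC" from the guidance above

function classifyFileSize(loc: number): SizeVerdict {
  if (loc <= IDEAL_LOC) return "ideal";
  if (loc <= SOFT_CAP_LOC) return "ok";
  return "over-soft-cap";
}

// Counts lines in a source string; a trailing newline does not add a line.
function countLines(source: string): number {
  if (source.length === 0) return 0;
  const parts = source.split("\n");
  return source.endsWith("\n") ? parts.length - 1 : parts.length;
}

// Returns one warning per file over the soft cap. It never throws,
// because the rule is guidance and not a hard gate.
function reviewFiles(files: Record<string, string>): string[] {
  const warnings: string[] = [];
  for (const path of Object.keys(files)) {
    const loc = countLines(files[path]);
    if (classifyFileSize(loc) === "over-soft-cap") {
      warnings.push(`${path}: ${loc} LOC exceeds the soft cap of ${SOFT_CAP_LOC}`);
    }
  }
  return warnings;
}
```

A reviewer would still decide whether a flagged file has a natural seam, in line with "no splitting for the sake of splitting".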

What should stay out of scope?

Which audience does this matter to most?

Both

Extra context

Already delivered

These remain in production. New rewrites must not regress them.

Per-area workflow

  1. Before the first PR in an area, run a read-only audit over the files that area touches.
  2. Use that audit to scope the first PR. Each PR should cover a single surface and stay within roughly 5-8 files and 1000 LOC of behavioral change.
  3. Every PR must pass the PR0 perf gate.
  4. No PR ships visual changes that regress already-delivered surfaces.

Postmortem: PR #589 (5-cut perf cluster)

PR #589 attempted to fix C-min streaming-markdown jank by stacking 5 cuts:

  1. 969f6a982 - per-block dirty-tail markdown render with full-html fingerprint.
  2. f7646cb36 - delta cross-status coalescing.
  3. 61e7a0f78 - message.part.delta 30fps gate.
  4. aed399425 - animation suppression during streaming.
  5. f49fd6d21 - TerminalPanel hidden guard.
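
For readers unfamiliar with the shape of cuts 2 and 3, here is a generic sketch of coalescing stream deltas behind a roughly 30fps gate. It is not the code from PR #589. `DeltaGate`, its parameters, and the injectable clock are hypothetical and exist only to illustrate the technique.

```typescript
// Generic illustration only; not taken from PR #589.
class DeltaGate {
  private buffer = "";
  private lastFlushMs = Number.NEGATIVE_INFINITY;
  private readonly onFlush: (text: string) => void;
  private readonly minIntervalMs: number;
  private readonly now: () => number;

  constructor(
    onFlush: (text: string) => void,
    minIntervalMs: number = 1000 / 30, // about 33 ms, i.e. at most ~30 flushes per second
    now: () => number = () => Date.now(), // injectable clock for deterministic testing
  ) {
    this.onFlush = onFlush;
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Buffers the delta and flushes only if enough time has passed since the last flush.
  push(delta: string): void {
    this.buffer += delta;
    if (this.now() - this.lastFlushMs >= this.minIntervalMs) {
      this.flush();
    }
  }

  // Emits everything buffered so far as one coalesced update.
  flush(): void {
    if (this.buffer.length === 0) return;
    const text = this.buffer;
    this.buffer = "";
    this.lastFlushMs = this.now();
    this.onFlush(text);
  }
}
```

A caller has to invoke `flush()` when the stream ends, otherwise the last buffered deltas are never delivered. A production version would more likely schedule the trailing flush with a timer or animation frame.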

Observed on retest: home-page INP regressed from about 213-240 ms on base 9e071d2ad to 199-3120 ms on head f49fd6d21. The best case was roughly unchanged, but the worst case grew about 13x.
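
To make the retest numbers concrete, the sketch below shows how an automated gate might compare worst-case INP between base and head. It does not describe the actual mechanics in #600. `checkInpGate` and the 1.2x tolerance are assumptions, and the sample arrays hold only the range endpoints quoted above, not real measurement sets.

```typescript
// Hypothetical gate check; the real mechanics are defined in #600.
interface GateResult {
  passed: boolean;
  baseWorstMs: number;
  headWorstMs: number;
  ratio: number;
}

function worstCase(samplesMs: number[]): number {
  if (samplesMs.length === 0) throw new Error("no samples provided");
  return Math.max(...samplesMs);
}

// Fails when head's worst-case INP exceeds base's worst case by more than maxRatio.
// The default of 1.2 is an assumed tolerance, not a value taken from this issue.
function checkInpGate(baseMs: number[], headMs: number[], maxRatio: number = 1.2): GateResult {
  const baseWorstMs = worstCase(baseMs);
  const headWorstMs = worstCase(headMs);
  const ratio = headWorstMs / baseWorstMs;
  return { passed: ratio <= maxRatio, baseWorstMs, headWorstMs, ratio };
}

// Range endpoints from the PR #589 retest: base 9e071d2ad vs head f49fd6d21.
const retest = checkInpGate([213, 240], [199, 3120]);
console.log(retest.passed, retest.ratio); // false 13
```

Comparing worst cases rather than averages matters here: the head's best sample (199 ms) was slightly better than base, so a mean over few samples could hide the kind of spike a user actually feels.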

Why the team closed instead of fixing forward:

Lessons preserved here:

  • Performance work must ship behind an automated regression gate, not manual retest.
  • A perf cluster PR stacking unrelated cuts is the wrong shape.
  • Fingerprints that scale with content length are a smell; prefer identity or version counters.
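
The last lesson can be illustrated with a small sketch. All names here (`contentFingerprint`, `VersionedBlock`, `BlockRenderer`) are hypothetical and do not come from the PawWork codebase. The point is only that a version counter makes the "did this change?" check O(1), while a content fingerprint costs more as the streamed message grows.

```typescript
// Content fingerprint: cost grows with html.length on every check.
function contentFingerprint(html: string): number {
  let hash = 0;
  for (let i = 0; i < html.length; i++) {
    hash = (hash * 31 + html.charCodeAt(i)) | 0; // keep within 32-bit integer range
  }
  return hash;
}

// Version counter: the writer bumps a number on every real change.
class VersionedBlock {
  private text = "";
  private version = 0;

  append(delta: string): void {
    if (delta.length === 0) return; // an empty delta is not a change
    this.text += delta;
    this.version += 1;
  }

  getVersion(): number {
    return this.version;
  }

  getText(): string {
    return this.text;
  }
}

// The renderer compares one integer instead of rescanning the content.
class BlockRenderer {
  private lastRenderedVersion = -1;
  renderCount = 0;

  renderIfChanged(block: VersionedBlock): boolean {
    if (block.getVersion() === this.lastRenderedVersion) return false;
    this.lastRenderedVersion = block.getVersion();
    this.renderCount += 1; // stand-in for the real markdown render
    return true;
  }
}
```

The trade-off is that a version counter only works when every write goes through the owner of the counter, whereas a fingerprint can detect changes made anywhere.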

Metadata

Assignees

No one assigned

    Labels

    P1 (High priority), app (Application behavior and product flows), enhancement (New feature or request), ui (Design system and user interface)
