[Task] UI rewrite v2 PR0: perf-gated regression CI #600

@Astro-Han

Description

Goal

An automated perf regression gate is active on every PR before any visual rewrite ships.

Scope

In scope:

  • Playwright-based perf probe.
  • The 4 initial scenarios: homepage-cold, session-streaming-long, tool-call-expand, session-scroll-reading.
  • JSON output in the schema described below.
  • Base/head comparison rules, ratchet rules, and CI artifact capture.
  • Trace-on-failure support as part of the same measurement pipeline when needed by PR0.2.
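To make the scope concrete, here is a minimal sketch of what the JSON report could look like for the 4 scenarios. The schema itself is not spelled out in this issue, so every type and field name below (`PerfReport`, `ScenarioResult`, `MetricSummary`, `missingScenarios`) is a hypothetical placeholder; only the scenario names and the median + worst requirement come from the issue text.

```typescript
// Scenario names are taken from the issue; everything else is an assumed shape.
const SCENARIOS = [
  "homepage-cold",
  "session-streaming-long",
  "tool-call-expand",
  "session-scroll-reading",
] as const;

type ScenarioId = (typeof SCENARIOS)[number];

// Hypothetical per-metric summary: median and worst value across runs.
interface MetricSummary {
  median: number;
  worst: number;
}

// Hypothetical per-scenario entry in the report.
interface ScenarioResult {
  scenario: ScenarioId;
  runs: number;
  metrics: Record<string, MetricSummary>;
}

// Hypothetical top-level report written by the probe.
interface PerfReport {
  commit: string;
  results: ScenarioResult[];
}

// Returns the scenarios that a report does not cover, so CI can refuse
// to treat an incomplete report as a pass.
function missingScenarios(report: PerfReport): ScenarioId[] {
  const seen = new Set<string>(report.results.map((r) => r.scenario));
  return SCENARIOS.filter((s) => !seen.has(s));
}
```

A completeness check like `missingScenarios` is one way to keep a partially failed probe run from being read as a clean result.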

Out of scope:

  • Terminal / ghostty scenario (deferred to PR0.3).
  • Lighthouse as the primary measurement path.
  • Any visual rewrite itself.

Relevant files or context

Parent: #599

Likely files and existing probes:

  • packages/app Playwright suite
  • debug-bar.tsx
  • renderer-diagnostics.ts

Reference standards:

Key measurement design:

  • Playwright Test + browser Performance APIs + same-runner base/head comparison + trace-on-failure.
  • In v1, the only hard-fail conditions are a regression delta against base and a catastrophic absolute value.
  • Web Vitals good lines stay warn-only until the ratchet promotes them.
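The comparison rules above could be sketched as a single pure function. This is an illustration under stated assumptions, not the planned comparator: the names (`compareMetric`, `Thresholds`, `Verdict`) are hypothetical, metrics are assumed to be lower-is-better, and the threshold values in the usage note are placeholders rather than agreed numbers.

```typescript
type Verdict = "pass" | "warn" | "fail";

// All thresholds are hypothetical and would be tuned per metric.
interface Thresholds {
  maxRegressionRatio: number; // allowed relative slowdown of head vs base, e.g. 0.1 = 10%
  catastrophicAbsolute: number; // absolute value that fails regardless of base
  goodLine: number; // Web Vitals "good" line; warn-only in v1
}

interface Comparison {
  verdict: Verdict;
  reasons: string[];
}

// Compares one lower-is-better metric measured on base and head.
function compareMetric(base: number, head: number, t: Thresholds): Comparison {
  const reasons: string[] = [];
  let verdict: Verdict = "pass";

  // Warn-only: crossing the "good" line never blocks a PR by itself.
  if (head > t.goodLine) {
    verdict = "warn";
    reasons.push("above good line");
  }

  // Hard fail 1: relative regression against the same-runner base.
  if (base > 0 && (head - base) / base > t.maxRegressionRatio) {
    verdict = "fail";
    reasons.push("regression delta exceeded");
  }

  // Hard fail 2: catastrophic absolute value, independent of base.
  if (head > t.catastrophicAbsolute) {
    verdict = "fail";
    reasons.push("catastrophic absolute exceeded");
  }

  return { verdict, reasons };
}
```

For example, with `{ maxRegressionRatio: 0.1, catastrophicAbsolute: 10000, goodLine: 2500 }`, a move from 1000 ms to 1300 ms fails on delta, while 2400 ms to 2600 ms only warns. Keeping the good line as a separate warn path is what lets the ratchet promote it to a hard fail later without changing the other two rules.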

Verification

  • PR0.1 lands and dev-tip produces at least 3 baseline JSON runs across all 4 scenarios.
  • PR0.2 lands and a deliberate regressing PR is blocked by the comparator.
  • JSON output includes median + worst values per scenario.
  • CI uploads artifacts rather than silently passing.
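The median + worst requirement can be illustrated with a small aggregation helper. This is a sketch under assumptions: `summarizeSamples` and `SampleSummary` are hypothetical names, and "worst" is taken to mean the maximum because the metrics are assumed to be lower-is-better timings.

```typescript
// Hypothetical summary of repeated measurements for one metric in one scenario.
interface SampleSummary {
  runs: number;
  median: number;
  worst: number;
}

// Median of a non-empty list; averages the two middle values for even counts.
function medianOf(values: number[]): number {
  if (values.length === 0) {
    throw new Error("cannot take the median of zero samples");
  }
  const sorted = [...values].sort((a, b) => a - b); // numeric sort, input untouched
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Collapses raw samples into the median + worst pair the report needs.
function summarizeSamples(samples: number[]): SampleSummary {
  return {
    runs: samples.length,
    median: medianOf(samples),
    worst: Math.max(...samples), // lower-is-better assumption: worst = largest
  };
}
```

Reporting both values lets the comparator gate on the median, which is robust to one noisy run, while the worst value still shows up in the artifact for diagnosis.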

Execution mode

The agent should investigate and propose a plan first.

Metadata

Assignees

No one assigned

Labels

  • P1: High priority
  • app: Application behavior and product flows
  • ci: Continuous integration / GitHub Actions
  • task: Maintainer or agent execution task
  • tech-debt: Internal cleanup and maintainability debt
  • ui: Design system and user interface
