test(app): add PR0.4 low-end perf profile #610

Merged
Astro-Han merged 3 commits into dev from codex/pr0-low-end-profile
May 14, 2026

Conversation

@Astro-Han (Owner) commented May 13, 2026

Summary

  • Add a single low-end perf profile that runs only the Session Timeline recompute scenario.
  • Keep the existing default perf probe broad, then merge default and optional low-end artifacts before comparison/commenting.
  • Split comparator semantics by profile: default regressions still fail as before; low-end moderate regressions warn, and only clear catastrophic regressions fail.
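
The split described in the last bullet can be pictured with a minimal sketch. The constants and the `classifyRegression` helper below are hypothetical and exist only for illustration; the PR's actual cutoffs are not quoted in this thread, so the numbers here are assumptions.

```typescript
type PerfProfile = "default" | "low-end"
type Verdict = "pass" | "warn" | "fail"

// Assumed cutoffs in milliseconds of added interaction time (not the PR's real values).
const DEFAULT_FAIL_MS = 50
const LOW_END_WARN_MS = 50
const LOW_END_CATASTROPHIC_MS = 200

// Default profile: any regression past the cutoff fails, as before.
// Low-end profile: moderate regressions only warn; catastrophic ones fail.
function classifyRegression(profile: PerfProfile, baseMs: number, headMs: number): Verdict {
  const delta = headMs - baseMs
  if (profile === "low-end") {
    if (delta >= LOW_END_CATASTROPHIC_MS) return "fail"
    if (delta >= LOW_END_WARN_MS) return "warn"
    return "pass"
  }
  return delta >= DEFAULT_FAIL_MS ? "fail" : "pass"
}
```

Under these assumed cutoffs, a low-end run that is 60 ms slower than base would surface as a warning rather than block the PR, while the same delta on the default profile would still fail.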

Why

PR0.4 needs a low-cost CPU pressure profile before Area A visual work lands. This is intentionally not a real-device low-end phone simulation. It is a repeatable CI signal for Session Timeline main-thread cost under a smaller CPU budget.

Related Issue

Refs #600.

Human Review Status

Pending. A human should make the final merge decision after reviewing the final diff and verification evidence.

Review Focus

  • Whether the low-end trigger scope stays narrow enough for CI cost.
  • Whether session-timeline-recompute is the right minimal Area A-facing scenario.
  • Whether low-end warning/failure thresholds are conservative enough to avoid noisy blocking.

Risk Notes

This adds optional extra perf E2E work only for workflow dispatch or Session Timeline/perf-related paths. It does not change product runtime behavior. I requested maintainer labeling through this PR body instead of applying labels directly.

How To Verify

bun test --preload ./happydom.ts ./src/testing/perf-metrics.test.ts ./src/testing/perf-workflow.test.ts
13 passed, 47 expects

bun --cwd packages/app typecheck
passed: tsgo -b

ruby -e 'require "yaml"; YAML.load_file(".github/workflows/perf-probe-baseline.yml"); puts "ok"'
ok

git diff --check / git diff --cached --check
no whitespace errors

PAWWORK_PERF_PROFILE=low-end PAWWORK_PERF_OUTPUT=/tmp/pawwork-low-end-perf.json bun --cwd packages/app test:e2e:local:perf
passed: 1 low-end scenario, 5 default scenarios skipped

PAWWORK_PERF_OUTPUT=/tmp/pawwork-default-perf.json bun --cwd packages/app test:e2e:local:perf
passed: 5 default scenarios, 1 low-end scenario skipped

bun packages/app/script/compare-perf.ts with default, low-end, and combined artifacts
passed; combined comment includes default rows plus low-end / session-timeline-recompute
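
The combined artifact used in the last step can be thought of as a concatenation of the default summaries with whatever optional low-end summaries exist. The sketch below is an in-memory illustration only: `ScenarioSummary`, `mergePerfArtifacts`, and the field names are hypothetical, and the real `merge-perf-artifacts.ts` script works on JSON files via CLI flags rather than on arrays.

```typescript
type PerfProfile = "default" | "low-end"

// Hypothetical, trimmed-down shape of one scenario summary.
interface ScenarioSummary {
  profile: PerfProfile
  scenario: string
  interactionMedian: number
}

// Required artifacts must be present; optional ones (e.g. low-end) may be undefined
// when the low-end probe did not run for this PR.
function mergePerfArtifacts(
  required: ScenarioSummary[][],
  optional: Array<ScenarioSummary[] | undefined>,
): ScenarioSummary[] {
  if (required.length === 0) throw new Error("at least one required artifact is expected")
  const merged: ScenarioSummary[] = []
  for (const artifact of required) merged.push(...artifact)
  for (const artifact of optional) {
    if (artifact !== undefined) merged.push(...artifact)
  }
  return merged
}
```

Treating the low-end artifact as optional keeps the default comparison working unchanged on PRs where the low-end probe is skipped.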

Screenshots or Recordings

Not applicable. This PR changes perf harness and CI workflow behavior, not visible UI.

Checklist

  • Human review status is stated above as pending, approved, or not required
  • I linked the related issue, or stated why there is no issue
  • This PR has type, primary area, and priority labels, or I requested maintainer labeling
  • I described the review focus and any meaningful risks
  • I listed the relevant verification steps and the key result for each
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope
  • I manually checked visible UI or copy changes when needed, with screenshots or recordings
  • I considered macOS and Windows impact for platform, packaging, updater, signing, paths, shell, or permissions changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant
  • I reviewed the final diff for unrelated changes and suspicious dependency changes
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Summary by CodeRabbit

Release Notes

  • New Features

    • Added low-end device performance profiling with CPU throttling simulation to evaluate application behavior under resource constraints.
  • Tests

    • Expanded performance testing infrastructure with profile-aware baseline comparisons, enabling performance measurements across multiple device profiles.

@coderabbitai (Bot, Contributor) commented May 13, 2026

Warning

Rate limit exceeded

@Astro-Han has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 38 minutes and 33 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 41b1f7ae-faf6-4b1b-bd3c-bc510d334b6a

📥 Commits

Reviewing files that changed from the base of the PR and between c8d0917 and 1099368.

📒 Files selected for processing (4)
  • .github/workflows/perf-probe-baseline.yml
  • packages/app/e2e/perf/perf-probe.spec.ts
  • packages/app/e2e/perf/probe.ts
  • packages/app/src/testing/perf-workflow.test.ts
Walkthrough

Introduces a profile dimension (default vs low-end) across the performance testing system. E2E scenarios conditionally execute and apply CPU throttling based on the active profile. Comparison logic applies profile-specific regression thresholds and keys results by both profile and scenario. The CI workflow detects file changes to conditionally run low-end perf probes, merges artifacts, and re-compares on failure.
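
As a rough illustration of the profile reading, scenario gating, and throttling described above, here is a minimal sketch. The helper names (`readProfileFromEnv`, `scenarioEnabled`, `throttleForProfile`), the `CdpSessionLike` interface, and the throttling rate of 4 are assumptions made for this example; the PR's own helpers take different arguments. `Emulation.setCPUThrottlingRate` is a real Chrome DevTools Protocol method, and the scenario names are taken from the perf delta table in this thread.

```typescript
type PerfProfile = "default" | "low-end"

const defaultScenarios = new Set<string>([
  "homepage-cold",
  "session-streaming-long",
  "tool-call-expand",
  "terminal-side-panel-open",
  "session-scroll-reading",
])
const lowEndScenarios = new Set<string>(["session-timeline-recompute"])

// Anything other than the exact string "low-end" falls back to the default profile.
function readProfileFromEnv(env: Record<string, string | undefined>): PerfProfile {
  return env.PAWWORK_PERF_PROFILE === "low-end" ? "low-end" : "default"
}

// A scenario runs only when it belongs to the active profile's set.
function scenarioEnabled(profile: PerfProfile, scenario: string): boolean {
  return (profile === "low-end" ? lowEndScenarios : defaultScenarios).has(scenario)
}

// Structural stand-in for a CDP session (for example, one obtained from Playwright).
interface CdpSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>
}

// Assumed slowdown factor; the PR's real rate is not stated in this thread.
const ASSUMED_LOW_END_CPU_RATE = 4

async function throttleForProfile(cdp: CdpSessionLike, profile: PerfProfile): Promise<void> {
  if (profile !== "low-end") return
  await cdp.send("Emulation.setCPUThrottlingRate", { rate: ASSUMED_LOW_END_CPU_RATE })
}
```

Keeping the gating in plain sets means each run executes exactly one profile's scenarios and reports the rest as skipped, which matches the "1 low-end scenario, 5 default scenarios skipped" result in How To Verify.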

Changes

Low-End Performance Profiling

| Layer | File(s) | Summary |
| --- | --- | --- |
| Profile type definition and scenario classification | packages/app/src/testing/perf-metrics.ts (types), packages/app/e2e/perf/profiles.ts | PerfProfile type and PerfScenarioSummary/PerfScenarioComparison extended with profile field. PerfScenarioName union defines scenarios; defaultScenarios and lowEndScenarios sets gate execution by profile. |
| Profile reading, scenario gating, and CPU throttling | packages/app/e2e/perf/profiles.ts | readPerfProfile() maps PAWWORK_PERF_PROFILE env var to default or low-end. shouldRunScenario() gates scenarios by profile membership. applyPerfProfile() applies CDP Emulation.setCPUThrottlingRate for low-end profiles. |
| Aggregation with profiles and low-end thresholds | packages/app/src/testing/perf-metrics.ts | aggregatePerfRuns propagates optional profile into returned PerfScenarioSummary. lowEndWarningThresholds and lowEndCatastrophicThresholds define profile-specific regression cutoffs. |
| Comparison logic with profile-aware thresholds and keying | packages/app/src/testing/perf-metrics.ts | comparePerfScenarioSummaries resolves profile and applies low-end thresholds when profile is low-end, returning comparison with profile included. comparePerfBaselines keys scenarios by profile + scenario via scenarioKey(), updating failure key formats and missing-scenario detection. |
| Baseline comment rendering with profile labels | packages/app/src/testing/perf-metrics.ts | Markdown baseline comparison table first column now shows `"<profile> / <scenario>"` instead of scenario-only label. |
| E2E spec profile loading and application | packages/app/e2e/perf/perf-probe.spec.ts | Loads perfProfile via readPerfProfile(). Introduces skipUnlessScenario() helper to conditionally skip scenarios. Each scenario (homepage-cold through session-timeline-recompute) conditionally skips, applies profile via applyPerfProfile(page, perfProfile), and includes profile in baseline summarization. |
| Probe module and timeline fixture updates | packages/app/e2e/perf/probe.ts, packages/app/e2e/perf/timeline-fixture.ts | probe.ts imports PerfProfile and updates summarizeScenarioRuns to accept optional profile. New timeline-fixture.ts exports seedTimelineRecomputeSession() for 36-turn timeline seeding. |
| Test suite updates for profile dimension | packages/app/src/testing/perf-metrics.test.ts | Introduces scenario() helper that defaults profile to "default". Updates existing comparison tests to include profile: "default" in inputs. Expands baseline-comparison coverage with profile-keyed missing-scenario failures and low-end regression/warning/fail assertions. |
| Artifact merging script | packages/app/script/merge-perf-artifacts.ts | New script reads perf JSON from required and optional input paths, merges into single array, and writes to output directory. Parses CLI flags (--output, --required, --optional). |
| CI workflow low-end detection and execution | .github/workflows/perf-probe-baseline.yml | Adds Detect low-end perf scope step to compute run_low_end from file changes. Conditionally runs low-end perf probes for base and head with PAWWORK_PERF_PROFILE: low-end, merges low-end artifacts into combined JSONs, re-compares using combined files, and conditionally confirms low-end regressions on failure. Adds diagnostic trace capture when both comparisons fail. |
| Workflow contract validation test | packages/app/src/testing/perf-workflow.test.ts | New test file loads perf-probe-baseline.yml and asserts expected strings: fetch-depth: 0, low-end scope markers, combined artifact filenames, and reference counts. |
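
The profile + scenario keying mentioned in the comparison row can be sketched as follows. This is an illustration under assumptions: the `KeyedSummary` shape, the `/` separator, and `findMissingScenarios` are hypothetical, and only the `scenarioKey` name comes from the walkthrough.

```typescript
type PerfProfile = "default" | "low-end"

// Hypothetical minimal shape; profile is optional and treated as "default" when absent.
interface KeyedSummary {
  profile?: PerfProfile
  scenario: string
}

// Compose a lookup key so the same scenario name under two profiles never collides.
function scenarioKey(summary: KeyedSummary): string {
  return `${summary.profile ?? "default"}/${summary.scenario}`
}

// Report base entries that have no head entry with the same profile and scenario.
function findMissingScenarios(base: KeyedSummary[], head: KeyedSummary[]): string[] {
  const headKeys = new Set(head.map((summary) => scenarioKey(summary)))
  return base.map((summary) => scenarioKey(summary)).filter((key) => !headKeys.has(key))
}
```

With scenario-only keys, a head artifact that contains `session-timeline-recompute` under the default profile would mask a missing low-end run; keying on both fields makes that gap visible.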

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Astro-Han/pawwork#608: Extends existing baseline comparison logic; this PR adds a profile/low-end dimension on top of that foundation.
  • Astro-Han/pawwork#607: Extends the perf baseline workflow/spec structure; this PR adds profile-aware scenario selection and low-end artifact handling on the same foundation.
  • Astro-Han/pawwork#609: Both PRs modify the perf-probe CI harness with shared workflow comparison/diagnostics flow; this PR applies new profile-based execution to scenarios adjusted by that PR.

Suggested labels

ci, app, P2

Poem

🐰 Low-end profiles hop into view,
CPU throttles to test what's true—
Base and head both run the race,
Scenarios grouped in their rightful place.
Artifacts merge, thresholds align,
Performance testing now works by design! 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a low-end perf profile. It uses conventional commits format (test scope) and is directly related to all the changeset modifications. |
| Description check | ✅ Passed | The description is comprehensive and follows the template structure with all major sections completed: Summary, Why, Related Issue, Human Review Status, Review Focus, Risk Notes, How To Verify, Screenshots, and full Checklist. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions (Bot) added the ci (Continuous integration / GitHub Actions) and app (Application behavior and product flows) labels May 13, 2026

@github-actions (Bot) left a comment

Suggested priority: P2 (includes user-path files (packages/app/src/testing/perf-metrics.test.ts, packages/app/src/testing/perf-metrics.ts, packages/app/src/testing/perf-workflow.test.ts)).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

@Astro-Han added the P2 (Medium priority) label May 13, 2026

@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces performance profiles, including a "low-end" profile that implements CPU throttling via the Chrome DevTools Protocol. It updates the E2E performance suite to support profile-specific scenarios and thresholds, adds a new timeline recomputation test, and provides a script for merging performance results. A review comment suggests parallelizing the session seeding process to improve setup efficiency, especially under throttled conditions.

Comment thread packages/app/e2e/perf/timeline-fixture.ts

@coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/app/e2e/perf/probe.ts (1)

2-2: ⚡ Quick win

Type scenario as PerfScenarioName in this API.

This adds compile-time protection against misspelled scenario keys in perf summary outputs.

Proposed diff
-import { aggregatePerfRuns, summarizePerfRun, type PerfProfile, type PerfRunSummary } from "../../src/testing/perf-metrics"
+import {
+  aggregatePerfRuns,
+  summarizePerfRun,
+  type PerfProfile,
+  type PerfRunSummary,
+  type PerfScenarioName,
+} from "../../src/testing/perf-metrics"
...
-export function summarizeScenarioRuns(input: { branch: string; profile?: PerfProfile; scenario: string; runs: PerfRunSummary[] }) {
+export function summarizeScenarioRuns(input: {
+  branch: string
+  profile?: PerfProfile
+  scenario: PerfScenarioName
+  runs: PerfRunSummary[]
+}) {
   return aggregatePerfRuns(input)
 }

Also applies to: 161-161

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/app/e2e/perf/probe.ts` at line 2, The exported API that accepts a
scenario key (used by aggregatePerfRuns / summarizePerfRun and their outputs
PerfProfile / PerfRunSummary) should type that parameter as the PerfScenarioName
union to prevent misspelled keys; update the function signatures and any related
variables named scenario to use the PerfScenarioName type (and adjust callers to
satisfy the new type) so compile-time checks enforce valid scenario keys across
aggregatePerfRuns, summarizePerfRun, and any perf summary construction code that
assigns to scenario.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/app/e2e/perf/perf-probe.spec.ts`:
- Line 20: The module-level constant perfProfile should follow test constant
naming conventions; rename the identifier perfProfile to PERF_PROFILE and update
all usages accordingly (including the initialization call readPerfProfile() and
any references in perf-probe.spec.ts such as assertions or function calls that
use perfProfile) to maintain SCREAMING_SNAKE_CASE for test constants.

---

Nitpick comments:
In `@packages/app/e2e/perf/probe.ts`:
- Line 2: The exported API that accepts a scenario key (used by
aggregatePerfRuns / summarizePerfRun and their outputs PerfProfile /
PerfRunSummary) should type that parameter as the PerfScenarioName union to
prevent misspelled keys; update the function signatures and any related
variables named scenario to use the PerfScenarioName type (and adjust callers to
satisfy the new type) so compile-time checks enforce valid scenario keys across
aggregatePerfRuns, summarizePerfRun, and any perf summary construction code that
assigns to scenario.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: e50084bf-21e6-4069-99d5-c1093922b52a

📥 Commits

Reviewing files that changed from the base of the PR and between dc2ea6c and c8d0917.

📒 Files selected for processing (9)
  • .github/workflows/perf-probe-baseline.yml
  • packages/app/e2e/perf/perf-probe.spec.ts
  • packages/app/e2e/perf/probe.ts
  • packages/app/e2e/perf/profiles.ts
  • packages/app/e2e/perf/timeline-fixture.ts
  • packages/app/script/merge-perf-artifacts.ts
  • packages/app/src/testing/perf-metrics.test.ts
  • packages/app/src/testing/perf-metrics.ts
  • packages/app/src/testing/perf-workflow.test.ts

Comment thread packages/app/e2e/perf/perf-probe.spec.ts (outdated)

@github-actions (Bot) commented May 13, 2026

Perf delta summary

Comparator: pass

| Profile / Scenario | interaction median | interaction worst | long task max | tbt | frame gap p95 | frame gap max | jank count | cls | status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| default / homepage-cold | 48 -> 24 (-24) | 64 -> 40 (-24) | 97 -> 64 (-33) | 47 -> 14 (-33) | 16.8 -> 16.8 (0) | 133.4 -> 100 (-33.4) | 4 -> 3 (-1) | 0 -> 0 (0) | pass |
| default / session-streaming-long | 48 -> 40 (-8) | 56 -> 56 (0) | 0 -> 0 (0) | 0 -> 0 (0) | 33.2 -> 16.8 (-16.4) | 33.4 -> 33.3 (-0.1) | 0 -> 0 (0) | 0 -> 0 (0) | pass |
| default / tool-call-expand | 16 -> 16 (0) | 24 -> 24 (0) | 0 -> 0 (0) | 0 -> 0 (0) | 16.8 -> 16.8 (0) | 16.8 -> 16.8 (0) | 0 -> 0 (0) | 0 -> 0 (0) | pass |
| default / terminal-side-panel-open | 40 -> 40 (0) | 72 -> 48 (-24) | 0 -> 0 (0) | 0 -> 0 (0) | 16.8 -> 16.8 (0) | 16.8 -> 16.8 (0) | 0 -> 0 (0) | 0 -> 0 (0) | pass |
| default / session-scroll-reading | 16 -> 16 (0) | 32 -> 24 (-8) | 0 -> 0 (0) | 0 -> 0 (0) | 16.8 -> 16.7 (-0.1) | 16.8 -> 16.7 (-0.1) | 0 -> 0 (0) | 0.505 -> 0.505 (0) | warn: cls |
| low-end / session-timeline-recompute | 120 -> 128 (+8) | 136 -> 136 (0) | 106 -> 113 (+7) | 157 -> 144 (-13) | 83.4 -> 83.3 (-0.1) | 183.3 -> 183.4 (+0.1) | 3 -> 3 (0) | 0.194 -> 0.194 (0) | pass |
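
Each cell in the table above follows a `base -> head (delta)` pattern. A minimal sketch of how such a cell could be produced is shown below; `formatDelta` is a hypothetical helper written for this illustration, and rounding the delta to three decimals is an assumption chosen to avoid floating-point noise, not necessarily what the PR's renderer does.

```typescript
// Format one metric cell as "base -> head (delta)", with an explicit "+" on increases.
function formatDelta(base: number, head: number): string {
  // Round to three decimals so that, for example, 100 - 133.4 prints as -33.4.
  const delta = Math.round((head - base) * 1000) / 1000
  const signed = delta > 0 ? `+${delta}` : `${delta}`
  return `${base} -> ${head} (${signed})`
}
```

Negative deltas are improvements for every metric in this table, since all of them are costs (latency, blocking time, frame gaps, jank, layout shift).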

@Astro-Han Astro-Han merged commit 5f2056e into dev May 14, 2026
24 checks passed
@Astro-Han Astro-Han deleted the codex/pr0-low-end-profile branch May 14, 2026 01:27
Astro-Han added a commit that referenced this pull request May 14, 2026
Root cause:
- PR #610 added `packages/app/src/testing/perf-workflow.test.ts`, which read `.github/workflows/perf-probe-baseline.yml` and matched a literal LF-only snippet.
- Git for Windows checked the workflow file out with CRLF, so Windows advisory `unit-windows-app` failed even though the workflow content was correct.

Change boundary:
- Normalize line endings inside the perf workflow contract test before doing snippet assertions.
- Add a focused CRLF regression case matching the Windows advisory failure.
- Test-only change; no product runtime, packaging, updater, or workflow execution behavior changed.
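
A minimal sketch of the normalization idea described above, assuming hypothetical helper names (`normalizeLineEndings`, `containsSnippet`); the follow-up commit's actual helper is not shown in this thread.

```typescript
// Convert CRLF line endings to LF so snippet matching is checkout-independent.
function normalizeLineEndings(text: string): string {
  return text.replace(/\r\n/g, "\n")
}

// Compare on normalized text so a CRLF checkout (e.g. Git for Windows) still matches
// a snippet written with LF in the test source.
function containsSnippet(fileText: string, snippet: string): boolean {
  return normalizeLineEndings(fileText).indexOf(normalizeLineEndings(snippet)) !== -1
}
```

For example, a workflow file checked out as `"with:\r\n  fetch-depth: 0\r\n"` would not contain the literal LF snippet `"with:\n  fetch-depth: 0\n"`, but `containsSnippet` treats the two as matching.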

Verification:
- Added the CRLF regression test first and confirmed the old assertion failed.
- `bun test --preload ./happydom.ts ./src/testing/perf-workflow.test.ts` passed: 2 tests, 0 failures.
- `bun run test:ci` in `packages/app` passed: 1067 tests, 0 failures.
- `git diff --check` passed.
- PR #623 checks passed, including CodeQL, CI, desktop smoke, e2e-artifacts, perf-probe-baseline, and PR title checks.
- Manual Windows advisory workflow `25872283970` on `codex/fix-windows-perf-workflow-eol` passed, including `unit-windows-app` and all Windows advisory jobs.

Review and risk:
- No unresolved review threads.
- Residual risk is low and limited to the test assertion helper; this does not alter release artifacts or application behavior.

Labels

  • app: Application behavior and product flows
  • ci: Continuous integration / GitHub Actions
  • P2: Medium priority
