
fix(ui): defer default-open tool details #622

Merged
Astro-Han merged 5 commits into dev from codex/i601-basic-tool-defer
May 14, 2026

Conversation

@Astro-Han (Owner) commented May 14, 2026

Summary

Fix BasicTool deferred details so default-open heavy tool cards do not resolve or mount their body synchronously.

Add defer to the bash tool card, keep existing edit, write, and apply_patch deferred paths explicit, add a real Solid render test that proves deferred children are not evaluated until the next animation frame, and add a default-profile perf probe for a default-open heavy bash output session.
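The gate described above can be sketched without Solid. This is a minimal illustration, not the code in basic-tool.tsx: `initialReady`, `createDeferredDetails`, and the manual frame queue are hypothetical names, and the rule that a deferred tool starts with `ready` set to false is taken from the walkthrough of this PR rather than from the actual source.

```typescript
// Hypothetical, framework-free sketch of a deferred "ready" gate.
type FrameScheduler = (callback: () => void) => void

// Assumed rule: a deferred tool never starts ready; otherwise readiness follows the open state.
function initialReady(defaultOpen: boolean, defer: boolean): boolean {
  return defer ? false : defaultOpen
}

function createDeferredDetails<T>(
  options: { defaultOpen: boolean; defer: boolean },
  children: () => T, // a thunk, so the heavy body is not evaluated eagerly
  schedule: FrameScheduler,
) {
  const open = options.defaultOpen
  let ready = initialReady(options.defaultOpen, options.defer)
  // Default-open + defer: flip the gate on the next frame, not during first render.
  if (open && !ready) {
    schedule(() => {
      ready = true
    })
  }
  return {
    isOpen: () => open,
    // Children are only resolved once the gate is ready.
    details: (): T | undefined => (open && ready ? children() : undefined),
  }
}

// Usage: a manual queue stands in for requestAnimationFrame.
const frames: Array<() => void> = []
let evaluations = 0
const card = createDeferredDetails(
  { defaultOpen: true, defer: true },
  () => {
    evaluations += 1
    return "900 lines of bash output"
  },
  (callback) => frames.push(callback),
)
const beforeFrame = card.details() // gate closed, children not evaluated
const evaluationsBeforeFrame = evaluations
frames.splice(0).forEach((run) => run()) // simulate the next animation frame
const afterFrame = card.details() // gate open, children evaluated once
```

The card reports itself as open from the start, which is the "keeps default-open visual state" half of the behavior; only the body waits for the frame.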

Why

Part of #601. After PR #620 split the message-part renderers, the remaining BasicTool defer gap was behavioral: defaultOpen=true + defer=true could still pay the first-render cost for large shell output, file bodies, or diffs.

The new perf scenario gives this PR a concrete base/head guardrail for the most direct changed tool path: shell tools default-open with a large completed bash output.

Related Issue

Part of #601. Follows the read-only audit in #601 (comment).

Human Review Status

Ready for human merge decision. Gemini and CodeRabbit inline review threads were replied to and resolved.

Review Focus

  • Confirm BasicTool keeps default-open visual state while deferring heavy details until the next animation frame.
  • Confirm deferred children are not resolved before the ready gate.
  • Confirm bash, edit, write, and apply_patch remain the only intended heavy BasicTool body targets in this slice.
  • Confirm tool-default-open-heavy-bash runs in the default perf profile only, with no diff/editor perf scope added in this PR.
  • Confirm this PR does not touch timeline virtualization, streaming markdown, ContextToolGroup, settings defaults, or scroll state.

Risk Notes

Low UI behavior risk. Default-open deferred cards may show their details one frame later, but card position, trigger, copy behavior, and expand/collapse semantics are intended to stay unchanged.

Animated deferred details now stay mounted until the close animation finishes, then reset readiness. Non-animated deferred tools still reset immediately on close.
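The close semantics in the paragraph above can be written as a small state transition function. This is a sketch under assumptions: the state shape, the event names, and `shouldMountDetails` are hypothetical and chosen for illustration; the real component tracks this through Solid state and effects.

```typescript
// Hypothetical model of the deferred details lifecycle.
type DetailsState = { open: boolean; ready: boolean; closing: boolean }
type DetailsEvent = "open" | "frame" | "close" | "closeAnimationEnd"

function nextDetailsState(state: DetailsState, event: DetailsEvent, animated: boolean): DetailsState {
  switch (event) {
    case "open":
      // Reopening keeps readiness if the body is still mounted (no redundant frame wait).
      return { open: true, ready: state.ready, closing: false }
    case "frame":
      // The frame callback only matters while the card is open.
      return state.open ? { ...state, ready: true } : state
    case "close":
      return animated
        ? { open: false, ready: state.ready, closing: true } // stay mounted during the close animation
        : { open: false, ready: false, closing: false } // non-animated: reset immediately
    case "closeAnimationEnd":
      return state.closing ? { open: false, ready: false, closing: false } : state
  }
}

// The body is mounted while ready and either open or still animating closed.
const shouldMountDetails = (state: DetailsState): boolean =>
  state.ready && (state.open || state.closing)

// Usage: replay event sequences from a default-open, deferred start.
const start: DetailsState = { open: true, ready: false, closing: false }
const replay = (events: DetailsEvent[], animated: boolean): DetailsState =>
  events.reduce((state, event) => nextDetailsState(state, event, animated), start)

const animatedClosing = replay(["frame", "close"], true)
const animatedClosed = replay(["frame", "close", "closeAnimationEnd"], true)
const instantClosed = replay(["frame", "close"], false)
```

Modeling the animated and non-animated paths with one function makes the difference explicit: only the animated path has a window where the card is closed but its body is still mounted.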

The perf probe adds CI time for one 3-run default-profile scenario. It is a regression guardrail, not a hard requirement that every run show a fixed millisecond improvement.

No dependency, schema, permission, generated-file, or platform behavior changes intended.

How To Verify

  • Install: bun install --frozen-lockfile passed in this worktree
  • Red test: bun --cwd packages/ui test src/components/basic-tool-defer.test.ts first failed because the real BasicTool render path evaluated deferred children before the ready gate
  • Focused BasicTool tests: bun --cwd packages/ui test src/components/basic-tool-defer.test.ts src/components/basic-tool-trigger.test.ts src/components/message-part-registry.test.ts passed, 12 pass
  • Focused UI regression tests: bun --cwd packages/ui test src/components/basic-tool-defer.test.ts src/components/message-part-registry.test.ts src/components/message-part-stale.test.ts test/components/message-part-rename.test.ts src/components/basic-tool-trigger.test.ts src/components/apply-patch-file.test.ts src/components/markdown-stream.test.ts passed, 33 pass
  • Perf profile and metrics tests: cd packages/app && bun test ./e2e/perf/profiles.unit.ts src/testing/perf-workflow.test.ts src/testing/perf-metrics.test.ts passed, 14 pass
  • Local heavy bash perf probe: cd packages/app && bun run test:e2e:local -- e2e/perf/perf-probe.spec.ts --grep "tool-default-open-heavy-bash" passed, 1 passed; local JSON emitted 3 runs for default / tool-default-open-heavy-bash
  • Local smoke check for Playwright discovery: cd packages/app && bun run test:e2e:local:smoke passed, 21 passed and 1 skipped
  • UI typecheck: bun --cwd packages/ui typecheck passed
  • App typecheck: cd packages/app && bun run typecheck passed
  • Diff check: git diff --check passed
  • Electron smoke: bun run dev:desktop launched; desktop shell opened to the new-session composer without renderer crash; no prompt was sent
  • PR CI: ci, desktop-smoke, e2e-artifacts, codeql, dependency-review, commit-lint, labeler, pr-title-lint, pr-priority-triage, CodeRabbit, and perf-probe-baseline passed; mergeState CLEAN
  • Perf result: default / tool-default-open-heavy-bash passed; long task max 75 -> 68 (-7), TBT 42 -> 18 (-24), jank 4 -> 3 (-1), interaction median 24 -> 24 (0)

Screenshots or Recordings

Not included. This is a one-frame lazy-mount behavior change with no intended visible copy or layout change. Electron smoke, local smoke, and local heavy bash perf probe covered the relevant runtime paths.

Checklist

  • Human review status is stated above as pending, approved, or not required
  • I linked the related issue, or stated why there is no issue
  • This PR has type, primary area, and priority labels, or I requested maintainer labeling
  • I described the review focus and any meaningful risks
  • I listed the relevant verification steps and the key result for each
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope
  • I manually checked visible UI or copy changes when needed, with screenshots or recordings
  • I considered macOS and Windows impact for platform, packaging, updater, signing, paths, shell, or permissions changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant
  • I reviewed the final diff for unrelated changes and suspicious dependency changes
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

@github-actions github-actions Bot left a comment

Suggested priority: P2 (includes non-doc, non-test paths outside the low-risk bucket).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

@Astro-Han Astro-Han added tech-debt Internal cleanup and maintainability debt ui Design system and user interface P2 Medium priority labels May 14, 2026
@github-actions github-actions Bot removed P2 Medium priority tech-debt Internal cleanup and maintainability debt labels May 14, 2026

@coderabbitai coderabbitai Bot (Contributor) commented May 14, 2026

Walkthrough

Adds deferred initialization for BasicTool (new exported helper, updated ready synchronization and rendering with RAF-based gating), a test fixture and Bun tests controlling RAF, applies defer to the bash tool and updates registry assertions, and adds a new perf scenario + helpers to seed heavy bash sessions and profile it.

Changes

Deferred Tool Initialization

| Layer / File(s) | Summary |
| --- | --- |
| BasicTool contract, imports, and store init (packages/ui/src/components/basic-tool.tsx) | Adds untrack import and exports basicToolInitialReady; store ready now initializes via this helper so defer starts ready false while open still follows defaultOpen. |
| Ready synchronization and close handling (packages/ui/src/components/basic-tool.tsx) | Replaces previous on(open,...) effect with a props.defer-only effect that schedules/cancels requestAnimationFrame to set ready, skips redundant scheduling when already ready, and forces ready false on close when appropriate. |
| Details rendering, animated path, and attributes (packages/ui/src/components/basic-tool.tsx) | Updates hasDetails/shouldRenderDetails/detailsReady predicates; gates children on detailsReady; during animated closing applies pointer-events:none, aria-hidden, and inert; resets ready after animated close when deferred. |
| Test fixture and defer unit tests (packages/ui/test/fixtures/basic-tool-render.fixture.tsx, packages/ui/src/components/basic-tool-defer.test.ts) | Adds a mount fixture that counts details renders and exports helpers; adds Bun/Vite/Solid tests that control RAF to validate deferred/non-deferred mounting, mounting after frame flush, reset on close, and closed-start behavior. |
| Bash tool defer and registry tests (packages/ui/src/components/message-part/tools/bash.tsx, packages/ui/src/components/message-part-registry.test.ts) | Applies defer to the bash tool invocation; refactors registry test to read individual tool sources and assert per-file defer counts via a readToolSource helper. |
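The walkthrough mentions tests that "control RAF". One common way to do that, shown here as a sketch and not as the fixture actually used in basic-tool-defer.test.ts, is to replace the global `requestAnimationFrame` and `cancelAnimationFrame` with a queue that the test flushes by hand. `installFakeRaf` and its return shape are hypothetical.

```typescript
// Hypothetical test helper: a manually flushed requestAnimationFrame.
type FrameCallback = (time: number) => void
type RafGlobals = {
  requestAnimationFrame?: (callback: FrameCallback) => number
  cancelAnimationFrame?: (id: number) => void
}

function installFakeRaf() {
  const target = globalThis as unknown as RafGlobals
  const original = { raf: target.requestAnimationFrame, caf: target.cancelAnimationFrame }
  const pending = new Map<number, FrameCallback>()
  let nextId = 1

  target.requestAnimationFrame = (callback) => {
    const id = nextId++
    pending.set(id, callback)
    return id
  }
  target.cancelAnimationFrame = (id) => {
    pending.delete(id)
  }

  return {
    pendingCount: () => pending.size,
    // Run every callback queued so far; callbacks queued during the flush wait for the next one.
    flush(time = 0) {
      const callbacks = Array.from(pending.values())
      pending.clear()
      callbacks.forEach((callback) => callback(time))
    },
    // Put back whatever was installed before (possibly undefined outside a browser).
    restore() {
      target.requestAnimationFrame = original.raf
      target.cancelAnimationFrame = original.caf
    },
  }
}

// Usage: one frame request is kept, one is cancelled before the flush.
const raf = installFakeRaf()
const globals = globalThis as unknown as Required<RafGlobals>
let mounted = 0
globals.requestAnimationFrame(() => {
  mounted += 1
})
const cancelledId = globals.requestAnimationFrame(() => {
  mounted += 100
})
globals.cancelAnimationFrame(cancelledId)
const mountedBeforeFlush = mounted
const pendingBeforeFlush = raf.pendingCount()
raf.flush()
const mountedAfterFlush = mounted
raf.restore()
```

A queue like this lets a test assert both halves of the defer contract: nothing mounts before `flush()`, and a cancelled frame (for example after an immediate close) never mounts anything.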

Perf E2E: heavy-bash scenario

| Layer / File(s) | Summary |
| --- | --- |
| Perf imports, types, heavy command (packages/app/e2e/perf/perf-probe.spec.ts) | Extends imports for session lifecycle helpers, adds a type-only import for createSdk, and defines heavyBashCommand and local perf helper types/state. |
| Force shell tool parts expansion init script (packages/app/e2e/perf/perf-probe.spec.ts) | Adds enableShellToolPartsExpanded to patch localStorage settings.v3 and inject an init script forcing general.shellToolPartsExpanded = true in the page. |
| Seed heavy bash session helper (packages/app/e2e/perf/perf-probe.spec.ts) | Adds seedHeavyBashSession that creates a permissioned bash session, registers the bash tool with a heavy command, sends a prompt, waits for idle/save, and polls until expected output appears. |
| Perf probe test scenario (packages/app/e2e/perf/perf-probe.spec.ts) | Adds tool-default-open-heavy-bash test that seeds three heavy-bash sessions, navigates to each, verifies the tool starts expanded, snapshots perf, and cleans up sessions. |
| Profiles gating (packages/app/e2e/perf/profiles.ts, packages/app/e2e/perf/profiles.test.ts) | Adds "tool-default-open-heavy-bash" to PerfScenarioName and defaultScenarios; adds a Bun test asserting it runs under default but not under low-end. |
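The settings patch described for enableShellToolPartsExpanded can be illustrated as a pure function over the stored JSON string. Only the storage key settings.v3 and the flag general.shellToolPartsExpanded come from the walkthrough; the overall JSON shape, the function name `withShellToolPartsExpanded`, and the fallback to an empty object on unparsable input are assumptions made for this sketch.

```typescript
// Storage key named in the walkthrough; the JSON layout below is an assumption.
const SETTINGS_KEY = "settings.v3"

const isPlainObject = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null && !Array.isArray(value)

// Return the stored settings with general.shellToolPartsExpanded forced to true,
// preserving every other key that was already present.
function withShellToolPartsExpanded(raw: string | null): string {
  let parsed: unknown = {}
  try {
    parsed = raw ? JSON.parse(raw) : {}
  } catch {
    parsed = {} // unparsable settings are treated as empty in this sketch
  }
  const settings = isPlainObject(parsed) ? parsed : {}
  const general = isPlainObject(settings.general) ? settings.general : {}
  return JSON.stringify({ ...settings, general: { ...general, shellToolPartsExpanded: true } })
}

// Usage: a Map stands in for window.localStorage.
const storage = new Map<string, string>([[SETTINGS_KEY, '{"general":{"theme":"dark"},"other":1}']])
storage.set(SETTINGS_KEY, withShellToolPartsExpanded(storage.get(SETTINGS_KEY) ?? null))
const patched = JSON.parse(storage.get(SETTINGS_KEY) ?? "{}")
```

In a Playwright spec a function like this would presumably run inside an init script so that the flag is set before the app reads its settings; that wiring is not shown here.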

Sequence Diagram(s)

No sequence diagram is provided.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

bug, app

Poem

🐰 I wait for frames in a cozy burrow,
Details tucked until the rafters hum,
A click, a frame, the hidden blooms follow,
Ready pops up — a tiny victory drum!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the main change: fixing BasicTool to defer default-open tool details, which is the core objective of this PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description comprehensively covers all required template sections with detailed explanations, concrete verification steps, risk assessment, and review focus. |

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


@Astro-Han Astro-Han added P2 Medium priority tech-debt Internal cleanup and maintainability debt labels May 14, 2026
@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a mechanism to defer the mounting of tool details, improving initial rendering performance for heavy tools like Bash. It includes a new test suite for deferred loading logic and updates the BasicTool component to manage its 'ready' state based on a new defer prop. Feedback from the review highlights a visual 'pop' issue when closing tools, suggesting that unmounting should be delayed until animations finish to maintain visual continuity. Additionally, a performance optimization was recommended to use untrack on the ready signal within the effect to prevent unnecessary re-runs.

Comment thread packages/ui/src/components/basic-tool.tsx Outdated
Comment thread packages/ui/src/components/basic-tool.tsx Outdated

github-actions Bot commented May 14, 2026

Perf delta summary

Comparator: pass

| Profile / Scenario | interaction median | interaction worst | long task max | tbt | frame gap p95 | frame gap max | jank count | cls | status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| default / homepage-cold | 32 -> 40 (+8) | 72 -> 56 (-16) | 62 -> 70 (+8) | 12 -> 20 (+8) | 16.8 -> 16.8 (0) | 116.6 -> 100 (-16.6) | 3 -> 3 (0) | 0 -> 0 (0) | pass |
| default / session-streaming-long | 48 -> 40 (-8) | 56 -> 64 (+8) | 0 -> 0 (0) | 0 -> 0 (0) | 16.8 -> 16.8 (0) | 33.4 -> 33.4 (0) | 0 -> 0 (0) | 0 -> 0 (0) | pass |
| default / tool-call-expand | 16 -> 16 (0) | 16 -> 24 (+8) | 0 -> 0 (0) | 0 -> 0 (0) | 16.7 -> 16.7 (0) | 16.7 -> 16.7 (0) | 0 -> 0 (0) | 0 -> 0 (0) | pass |
| default / tool-default-open-heavy-bash | 24 -> 24 (0) | 32 -> 32 (0) | 75 -> 68 (-7) | 42 -> 18 (-24) | 50.1 -> 50 (-0.1) | 100 -> 116.7 (+16.7) | 4 -> 3 (-1) | 0 -> 0 (0) | pass |
| default / terminal-side-panel-open | 40 -> 40 (0) | 48 -> 48 (0) | 0 -> 0 (0) | 0 -> 0 (0) | 16.8 -> 16.8 (0) | 33.4 -> 16.8 (-16.6) | 0 -> 0 (0) | 0 -> 0 (0) | pass |
| default / session-scroll-reading | 24 -> 24 (0) | 32 -> 24 (-8) | 0 -> 0 (0) | 0 -> 0 (0) | 16.8 -> 16.7 (-0.1) | 16.8 -> 16.7 (-0.1) | 0 -> 0 (0) | 0.505 -> 0.505 (0) | warn: cls |
| low-end / session-timeline-recompute | 144 -> 136 (-8) | 144 -> 144 (0) | 122 -> 124 (+2) | 210 -> 233 (+23) | 100 -> 100 (0) | 200 -> 216.7 (+16.7) | 3 -> 3 (0) | 0.081 -> 0.081 (0) | pass |
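Each cell in the table reads as "base -> head (delta)". As a small worked example of that arithmetic, the sketch below reproduces the cell format; `formatDelta` is a hypothetical helper, and rounding the delta to one decimal place is an assumption inferred from the values shown, not the comparator's actual implementation or pass/fail logic.

```typescript
// Hypothetical formatter for a "base -> head (delta)" cell.
function formatDelta(base: number, head: number): string {
  // Round to one decimal to hide floating-point noise such as 16.700000000000003.
  const delta = Math.round((head - base) * 10) / 10
  const sign = delta > 0 ? "+" : ""
  return `${base} -> ${head} (${sign}${delta})`
}

// Cells from the tool-default-open-heavy-bash row.
const longTaskMax = formatDelta(75, 68) // "75 -> 68 (-7)"
const tbt = formatDelta(42, 18) // "42 -> 18 (-24)"
const frameGapP95 = formatDelta(50.1, 50) // "50.1 -> 50 (-0.1)"
const frameGapMax = formatDelta(100, 116.7) // "100 -> 116.7 (+16.7)"
```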

@github-actions github-actions Bot added the app Application behavior and product flows label May 14, 2026

@coderabbitai coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/app/e2e/perf/perf-probe.spec.ts (1)

41-42: ⚡ Quick win

Rename the test constant to SCREAMING_SNAKE_CASE.

Line 41 introduces heavyBashCommand, which breaks the test constant naming convention used in this suite.

♻️ Proposed rename
-const heavyBashCommand =
+const HEAVY_BASH_COMMAND =
   'node -e \'for (let i = 0; i < 900; i++) console.log(String(i).padStart(4, "0") + " " + "heavy bash output ".repeat(8))\''

@@
-    command: heavyBashCommand,
+    command: HEAVY_BASH_COMMAND,

As per coding guidelines "Use SCREAMING_SNAKE_CASE for constants in tests".

Also applies to: 210-210
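For a sense of how heavy the seeded output is, the loop from the command above can be replayed in TypeScript to count lines and bytes. This is a sketch: `heavyBashLines` is a hypothetical name, and the byte count assumes console.log appends a single "\n" per line and that the output is plain ASCII.

```typescript
// Rebuild the lines the node -e one-liner prints, without spawning a process.
function heavyBashLines(count = 900): string[] {
  const lines: string[] = []
  for (let i = 0; i < count; i++) {
    lines.push(String(i).padStart(4, "0") + " " + "heavy bash output ".repeat(8))
  }
  return lines
}

const lines = heavyBashLines()
// Every line has the same length: 4 digits + 1 space + 8 * 18 characters = 149.
const lineLengths = new Set(lines.map((line) => line.length))
// Add one byte per line for the trailing newline: 900 * 150 = 135000.
const totalBytes = lines.reduce((sum, line) => sum + line.length + 1, 0)
```

Under those assumptions the completed tool part carries roughly 135 KB of text in 900 lines, which is the body the defer gate keeps out of the first render.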

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/app/e2e/perf/perf-probe.spec.ts` around lines 41 - 42, Rename the
test constant heavyBashCommand to SCREAMING_SNAKE_CASE (e.g.,
HEAVY_BASH_COMMAND) and update every reference to it in the spec; locate the
constant declaration named heavyBashCommand and replace it with
HEAVY_BASH_COMMAND, then update all usages (including the other occurrence in
the file) to the new identifier so tests compile and follow the test-constant
naming convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/ui/src/components/basic-tool-defer.test.ts`:
- Line 37: The Vite server root is set incorrectly in the test file (root: new
URL("../..", import.meta.url).pathname) which resolves to packages/ui/src/
causing ssrLoadModule("/test/fixtures/basic-tool-render.fixture.tsx") to look
under packages/ui/src/test/...; update the root to move one level higher by
using new URL("../../..", import.meta.url).pathname so the root points at
packages/ui/ and the fixture at packages/ui/test/fixtures/... is resolved
correctly.

---

Nitpick comments:
In `@packages/app/e2e/perf/perf-probe.spec.ts`:
- Around line 41-42: Rename the test constant heavyBashCommand to
SCREAMING_SNAKE_CASE (e.g., HEAVY_BASH_COMMAND) and update every reference to it
in the spec; locate the constant declaration named heavyBashCommand and replace
it with HEAVY_BASH_COMMAND, then update all usages (including the other
occurrence in the file) to the new identifier so tests compile and follow the
test-constant naming convention.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 05249c00-0b30-4238-82ba-53fb0e679c71

📥 Commits

Reviewing files that changed from the base of the PR and between 726f028 and a4e6cce.

📒 Files selected for processing (6)
  • packages/app/e2e/perf/perf-probe.spec.ts
  • packages/app/e2e/perf/profiles.test.ts
  • packages/app/e2e/perf/profiles.ts
  • packages/ui/src/components/basic-tool-defer.test.ts
  • packages/ui/src/components/basic-tool.tsx
  • packages/ui/test/fixtures/basic-tool-render.fixture.tsx
✅ Files skipped from review due to trivial changes (1)
  • packages/app/e2e/perf/profiles.test.ts

Comment thread packages/ui/src/components/basic-tool-defer.test.ts
@Astro-Han Astro-Han merged commit 7e53c57 into dev May 14, 2026
27 checks passed
@Astro-Han Astro-Han deleted the codex/i601-basic-tool-defer branch May 16, 2026 00:33

Labels

app Application behavior and product flows P2 Medium priority tech-debt Internal cleanup and maintainability debt ui Design system and user interface
