fix: scroll element into view before click, hover, tap, and drag #1073

Open
jin-2-kakaoent wants to merge 7 commits into vercel-labs:main from hyunjinee:fix/scroll-before-click-1044


@jin-2-kakaoent jin-2-kakaoent commented Mar 29, 2026

Closes #1044

Summary

  • Automatically scroll elements into view before click, hover, tap, and drag using CDP DOM.scrollIntoViewIfNeeded with JS fallback
  • Return an error when an element cannot be scrolled into the viewport (eliminates silent failure)
  • Detect detached (stale) nodes lazily via isConnected check inside the existing getBoundingClientRect JS call — zero extra CDP round-trips on the happy path
  • Retry with a fresh accessibility tree lookup when a detached node is detected, preserving the existing stale-ref fallback behavior
  • Fix drag coordinate invalidation: mouseDown at source before scrolling to target, preventing viewport-relative coords from going stale
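
The drag-ordering point in the last bullet can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyPage` and both `press_*` functions are hypothetical names that do not appear in the PR, scrolling is vertical only, and the "centre the target" scroll policy is invented for the illustration. Each function returns the page Y coordinate that the mouseDown actually lands on.

```rust
/// Toy page model with vertical scrolling only (all values in CSS pixels).
/// Hypothetical type, used only to illustrate the ordering problem.
pub struct ToyPage {
    pub scroll_y: f64,
    pub viewport_h: f64,
}

impl ToyPage {
    /// Scroll only when `page_y` is outside the viewport, then centre it.
    pub fn scroll_into_view_if_needed(&mut self, page_y: f64) {
        let visible = page_y >= self.scroll_y && page_y <= self.scroll_y + self.viewport_h;
        if !visible {
            self.scroll_y = (page_y - self.viewport_h / 2.0).max(0.0);
        }
    }

    /// Page-absolute Y to viewport-relative Y at the current scroll offset.
    pub fn to_viewport(&self, page_y: f64) -> f64 {
        page_y - self.scroll_y
    }

    /// Page Y that a viewport-relative input event lands on right now.
    pub fn hit_page_y(&self, viewport_y: f64) -> f64 {
        viewport_y + self.scroll_y
    }
}

/// Problematic order: measure the source, scroll to the target, then press.
/// The source coordinate was measured before the second scroll, so it is stale.
pub fn press_after_target_scroll(page: &mut ToyPage, source_y: f64, target_y: f64) -> f64 {
    page.scroll_into_view_if_needed(source_y);
    let source_vp = page.to_viewport(source_y);
    page.scroll_into_view_if_needed(target_y);
    page.hit_page_y(source_vp)
}

/// Order described in this PR: press at the source while its coordinate is
/// still valid, and only then scroll to the target.
pub fn press_before_target_scroll(page: &mut ToyPage, source_y: f64, target_y: f64) -> f64 {
    page.scroll_into_view_if_needed(source_y);
    let source_vp = page.to_viewport(source_y);
    let pressed_at = page.hit_page_y(source_vp);
    page.scroll_into_view_if_needed(target_y);
    pressed_at
}
```

With a 600px viewport, a source at page Y 100 and a target at page Y 2000, the first ordering presses at page Y 1800 (not the source), while the second presses at page Y 100.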

Root cause

Two issues combined to produce the silent failure:

1. No auto-scroll before interactions

snapshot uses Accessibility.getFullAXTree which returns all elements regardless of viewport position. When interacting with an off-screen element, Input.dispatchMouseEvent / Input.dispatchTouchEvent was dispatched to coordinates outside the viewport — CDP doesn't error on this, so the interaction silently had no effect.

2. Coordinate system mismatch in resolve_element_center

DOM.getBoxModel returns page-absolute coordinates while Input.dispatchMouseEvent expects viewport-relative coordinates. After scrolling, these diverge and interactions miss the target.
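
Taking the description above at face value, the divergence is just the scroll offset. A minimal sketch, with `Point` and `page_to_viewport` as hypothetical names that are not part of the PR:

```rust
/// A point in CSS pixels. Hypothetical helper type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Convert a page-absolute point to viewport-relative coordinates by
/// subtracting the current scroll offset (window.scrollX / window.scrollY).
pub fn page_to_viewport(page: Point, scroll_x: f64, scroll_y: f64) -> Point {
    Point {
        x: page.x - scroll_x,
        y: page.y - scroll_y,
    }
}
```

At scroll offset (0, 0) the two systems agree, which is why the mismatch only shows up once the page has scrolled: a point at page (100, 2400) maps to viewport (100, 400) after scrolling down 2000px.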

Changes

  • get_center_and_viewport: uses getBoundingClientRect for viewport-relative coordinates, returns CenterResult::Found or CenterResult::Detached
  • scroll_into_view_if_needed: CDP DOM.scrollIntoViewIfNeeded with JS scrollIntoView fallback (errors propagated)
  • assert_in_viewport: validates element is within viewport after scrolling
  • resolve_element_object_id_fresh: skips cached backend_node_id for retry after detached node detection
  • resolve_scroll_and_center: shared helper for click/hover/tap/drag — resolve, scroll, get center, retry on detach
  • handle_drag: reordered to mouseDown at source before scrolling to target, preventing source coordinate invalidation
  • Removed dead code: resolve_element_center, box_model_center
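
The resolve, scroll, measure, retry-on-detach flow listed above could look roughly like the sketch below. It is not the PR's code: the `Page` trait, the `String` error type, the `fresh` flag, and the payload of `CenterResult::Found` are assumptions made so the example is self-contained. Only the function names and the two `CenterResult` variant names come from the description. How a scroll failure on a detached node is handled is not stated in the PR; this sketch simply propagates it.

```rust
/// Viewport-relative centre of an element, in CSS pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Center {
    pub x: f64,
    pub y: f64,
}

/// Variant names follow the PR description; the payload is an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CenterResult {
    Found { center: Center, viewport_w: f64, viewport_h: f64 },
    Detached,
}

/// Hypothetical abstraction over the CDP calls, so the flow can be tested
/// without a browser.
pub trait Page {
    /// `fresh = true` skips any cached node id and looks the element up again.
    fn resolve(&mut self, target: &str, fresh: bool) -> Result<String, String>;
    fn scroll_into_view_if_needed(&mut self, object_id: &str) -> Result<(), String>;
    fn get_center_and_viewport(&mut self, object_id: &str) -> Result<CenterResult, String>;
}

/// Error out if the centre is still outside the viewport after scrolling.
fn assert_in_viewport(c: Center, w: f64, h: f64) -> Result<(), String> {
    if c.x >= 0.0 && c.y >= 0.0 && c.x <= w && c.y <= h {
        Ok(())
    } else {
        Err(format!("element center ({}, {}) is outside the {}x{} viewport", c.x, c.y, w, h))
    }
}

/// One attempt: resolve, scroll, measure. `Ok(None)` means the node was detached.
fn attempt<P: Page>(page: &mut P, target: &str, fresh: bool) -> Result<Option<Center>, String> {
    let object_id = page.resolve(target, fresh)?;
    page.scroll_into_view_if_needed(&object_id)?;
    match page.get_center_and_viewport(&object_id)? {
        CenterResult::Detached => Ok(None),
        CenterResult::Found { center, viewport_w, viewport_h } => {
            assert_in_viewport(center, viewport_w, viewport_h)?;
            Ok(Some(center))
        }
    }
}

/// Try the (possibly cached) node first; on detach, retry once with a fresh lookup.
pub fn resolve_scroll_and_center<P: Page>(page: &mut P, target: &str) -> Result<Center, String> {
    if let Some(center) = attempt(page, target, false)? {
        return Ok(center);
    }
    attempt(page, target, true)?
        .ok_or_else(|| format!("element {} is detached from the document", target))
}
```

Limiting the retry to a single fresh lookup keeps the happy path at one attempt and bounds the worst case at two.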

Test plan

  • E2E: e2e_offscreen_scroll_before_interactions — click, hover, click-by-ref, tap, and drag on off-screen elements in a single unified test
  • E2E: e2e_click_stale_ref_falls_back_to_role_name — stale ref fallback still works
  • Manual: off-screen button clicked, hovered, tapped, and dragged via CLI — all verified
  • cargo fmt + cargo clippy clean
  • All E2E tests pass

Clicking off-screen elements silently dispatched mouse events to
coordinates outside the viewport, causing the click to have no effect
without returning an error.

- Add `scroll_into_view_if_needed` using CDP `DOM.scrollIntoViewIfNeeded`
  with JS `scrollIntoView` fallback for unsupported environments
- Add `assert_in_viewport` to return an error when an element cannot be
  scrolled into view
- Unify `resolve_element_center` to always use `getBoundingClientRect`
  (viewport-relative) instead of `DOM.getBoxModel` (page-absolute)
- Extract `get_center_and_viewport` to fetch element center and viewport
  dimensions in a single CDP call

Closes vercel-labs#1044

vercel bot commented Mar 29, 2026

@hyunjinee is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

…abs#1044)

Verify that click, hover, and click-by-ref on off-screen elements
auto-scroll into view before dispatching the interaction.

DOM.resolveNode succeeds for detached nodes, which broke the stale-ref
fallback path. Instead of adding an extra CDP round-trip to check
isConnected, fold the check into the existing getBoundingClientRect JS
call (zero overhead on the happy path) and retry with a fresh
accessibility tree lookup when a detached node is detected.
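
Folding the `isConnected` check into the measurement call might look like the sketch below. This is an assumption-laden illustration, not the PR's code: `CENTER_JS`, `RawCenter`, and `classify` are hypothetical names, the shape of the returned object is invented, and the script is assumed to be passed as `functionDeclaration` to CDP `Runtime.callFunctionOn` with the element bound to `this`. Deserialising the CDP response into `RawCenter` is left out.

```rust
/// JS function declaration that checks connectivity and measures the element
/// in a single call. Hypothetical; the PR's actual script may differ.
pub const CENTER_JS: &str = r#"function () {
  if (!this.isConnected) return { detached: true };
  const r = this.getBoundingClientRect();
  return {
    detached: false,
    x: r.left + r.width / 2,
    y: r.top + r.height / 2,
    vw: window.innerWidth,
    vh: window.innerHeight,
  };
}"#;

/// Assumed, already-deserialised form of the object returned by `CENTER_JS`.
#[derive(Debug, Clone, Copy, Default)]
pub struct RawCenter {
    pub detached: bool,
    pub x: f64,
    pub y: f64,
    pub vw: f64,
    pub vh: f64,
}

/// Variant names follow the PR description; the payload is an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CenterResult {
    Found { x: f64, y: f64, viewport_w: f64, viewport_h: f64 },
    Detached,
}

/// Map the raw JS result onto the two outcomes the caller cares about.
pub fn classify(raw: RawCenter) -> CenterResult {
    if raw.detached {
        CenterResult::Detached
    } else {
        CenterResult::Found { x: raw.x, y: raw.y, viewport_w: raw.vw, viewport_h: raw.vh }
    }
}
```

Because the connectivity check rides along with a call that is made anyway, a connected element costs no additional round-trip; only the detached case triggers the extra lookup.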

ctate commented Mar 29, 2026

@jin-2-kakaoent LGTM! One minor follow-up to consider: tap_touch and drag still use the old resolve_element_center path without auto-scroll, so they'd benefit from the same treatment.

…ck-1044

# Conflicts:
#	cli/src/native/e2e_tests.rs
jin-2-kakaoent pushed a commit to hyunjinee/agent-browser that referenced this pull request Mar 30, 2026
…alidation

- tap_touch and handle_drag now use resolve_scroll_and_center for
  auto-scroll + detached-node retry (addresses review feedback on vercel-labs#1073)
- Fix drag ordering: mouseDown at source before scrolling to target,
  preventing source coordinates from being invalidated
- Remove unused resolve_element_center (dead code after migration)
- Unify e2e offscreen tests into a single e2e_offscreen_scroll_before_interactions
@jin-2-kakaoent jin-2-kakaoent force-pushed the fix/scroll-before-click-1044 branch from 2ee77ed to f2c4d7d Compare March 30, 2026 01:09
- Remove unused box_model_center and its test
- Propagate JS scrollIntoView fallback error instead of silently ignoring
@jin-2-kakaoent jin-2-kakaoent changed the title from "fix: scroll element into view before click and hover" to "fix: scroll element into view before click, hover, tap, and drag" Mar 30, 2026
jin-2-kakaoent (Contributor, Author) commented

@ctate Thanks for the feedback! I've applied resolve_scroll_and_center (auto-scroll + detached node retry) to both tap_touch and handle_drag.

All e2e + manual tests pass.

Development

Successfully merging this pull request may close these issues.

Clicking an invisible element with agent-browser seems to have no effect, yet the command doesn't throw an error?