Releases: 4fuu/open-browser-cli

0.6.1 (631369d)

31 Mar 10:04

browser-cli 0.6.1

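0.6.1 is a Linux binary compatibility fix release; it contains no functional changes.

Changes

  • Linux prebuilt binaries switch from glibc dynamic linking to musl static linking
    • The previous binaries were built in CI (Ubuntu 24.04, glibc 2.39) and could not run on distributions with an older glibc, such as Ubuntu 20.04 or Debian 11.
    • After the switch to musl the artifacts are fully statically linked and run directly on any Linux distribution, with no glibc version constraint.
    • Affected targets: x86_64-unknown-linux-musl and aarch64-unknown-linux-musl
    • macOS and Windows binaries are not affected.

Compatibility notes

  • This is purely a build-layer fix; CLI behavior and the protocol are completely unchanged.
  • Users already on 0.6.0 can replace the binary directly, with no configuration changes.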

English

0.6.1 is a Linux binary compatibility fix with no functional changes.

Changes

  • Linux prebuilt binaries now use musl static linking instead of glibc dynamic linking
    • Previous binaries were compiled on Ubuntu 24.04 (glibc 2.39) and would fail to run on older distributions such as Ubuntu 20.04 or Debian 11.
    • The switch to musl produces fully self-contained binaries that run on any Linux distribution regardless of the installed glibc version.
    • Affected targets: x86_64-unknown-linux-musl and aarch64-unknown-linux-musl.
    • macOS and Windows binaries are not affected.
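
To illustrate why the glibc-linked builds could fail on older hosts, the following is a minimal sketch that compares a host's glibc version against the 2.39 build environment named above. The helper names `parse_version` and `glibc_too_old` are hypothetical and not part of browser-cli, and a plain version comparison is a simplification: whether a given binary actually fails depends on which glibc symbol versions it references.

```python
import platform


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '2.39' into a comparable tuple (2, 39)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_too_old(host_version: str, build_version: str = "2.39") -> bool:
    """Return True when the host glibc is older than the build-time glibc."""
    return parse_version(host_version) < parse_version(build_version)


if __name__ == "__main__":
    # platform.libc_ver() reports ('glibc', '<version>') on glibc systems and
    # empty strings where it cannot detect a C library.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(f"glibc {version}; older than build glibc: {glibc_too_old(version)}")
    else:
        print("glibc not detected on this host")
```

A statically linked musl binary sidesteps this check entirely, because it does not load the host's C library at run time.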

Compatibility notes

  • This is a build-layer fix only; CLI behavior and protocol are completely unchanged.
  • Users already on 0.6.0 can drop in the new binary without any configuration changes.

Full Changelog: 0.6.0...0.6.1

0.6.0 (9e21fc1)

29 Mar 11:16

browser-cli 0.6.0

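0.6.0 is a release centered on browser resource retrieval, media-state visibility, and more stable extension publishing. It rolls up the CLI capability expansion, structured-page enhancements, extension-side interaction fixes, and browser-extension packaging/release improvements made since 0.5.4.

Highlights

  • New screenshot and resource-download capabilities
    • Added browser-cli screenshot <session-id>, with options such as --output, --quality, and --json, to save a screenshot of the current page viewport directly to a local file.
    • Added browser-cli download <session-id> <target>. The target can be an element ID or a direct URL, and the image or file is downloaded through the browser's current session context, so logged-in state and cookie access are preserved.
    • Download results infer a filename automatically and sanitize it; both screenshot and download support structured JSON output, which makes them easy to chain in agents and scripts.
  • The structured page view now exposes media playback state
    • Browser snapshots now extract the playback state of <audio> / <video> elements: playing / paused / ended, current time, total duration, mute state, and video resolution.
    • The Rust side adds a dedicated media node, and XML / JSON output preserves these fields so that media controls and playback state on the page can be understood and used more reliably.
    • Plugin target traversal also understands media nodes now, giving later plugin extensions and automation rules a more complete semantic base.
  • More stable extension-side interaction and capture behavior
    • When a click or type action triggers page scrolling, the in-browser cursor indicator now follows the target element, so the animated pulse no longer tends to get stuck at the old position.
    • The screenshot flow validates the quality parameter more strictly and restores the previously active tab once capture completes, reducing disruption to the current browsing context.
  • More robust browser-extension packaging and Chrome integration
    • Extension builds now generate browser-specific artifacts; Chrome and Firefox are each published with their own manifest and package contents.
    • The Chrome package no longer mixes Manifest V2 background.scripts with Manifest V3 service_worker, and the artifact paths used when loading the unpacked extension now match the contents of the final zip.
    • On macOS, the default install path of the Chrome Native Messaging host has been corrected to the right directory, reducing cases where the extension still cannot connect to the relay after browser-cli setup --browser chrome.

Compatibility and behavior notes

  • This is a backward-compatible release; the new screenshot and download commands are additive and do not affect existing open / page / click / type / wait workflows.
  • screenshot --full-page currently still falls back to a viewport screenshot and prints a notice; this release first establishes the unified command entry point and the capture pipeline, and full-page stitching can be added later.
  • download targets resources that are accessible from the browser's current session: it reuses the network context of the extension's background page, so it is best suited to downloading images behind a login, file links, or media resources directly reachable from the page.
  • After upgrading to this release, Chrome users are advised to reload the extension; macOS users who previously ran browser-cli setup --browser chrome --extension-id <id> are advised to run it again so that the native host manifest is written to the corrected directory.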

English

0.6.0 is a feature release focused on browser-side resource capture, richer media-state visibility, and more reliable extension packaging and integration, covering this round of CLI expansion, structured-page enhancements, extension interaction fixes, and browser-extension release improvements since 0.5.4.

Highlights

  • New screenshot and resource-download workflows
    • Added browser-cli screenshot <session-id> with options such as --output, --quality, and --json, so the current page viewport can be captured directly to a local file.
    • Added browser-cli download <session-id> <target>, where the target can be either an element ID or a direct URL, and downloads are performed through the browser session context so authenticated resources remain accessible.
    • Download output now auto-detects and sanitizes filenames, and both screenshot and download flows support structured JSON responses for agent and script pipelines.
  • Structured page output now exposes media playback state
    • Browser snapshots now extract playback information from <audio> / <video> elements, including playing / paused / ended, current time, duration, mute state, and video resolution.
    • The Rust side now models these as dedicated media nodes, and both XML and JSON output preserve the media fields so playback surfaces can be understood more reliably.
    • Plugin target traversal has been updated to understand media nodes as well, which gives future plugin rules and automation flows a stronger semantic base.
  • Extension-side interaction and capture behavior is more robust
    • When click or type actions trigger page scrolling, the visible cursor overlay now stays attached to the active target instead of leaving pulse feedback behind at stale coordinates.
    • Screenshot handling now validates quality input more carefully and restores the previously active tab after capture, reducing disruption to the current browsing context.
  • Browser-extension packaging and Chrome integration are more reliable
    • Extension builds now produce browser-specific artifacts, and Chrome / Firefox release packages are assembled from their corresponding generated manifests.
    • The Chrome package no longer mixes Manifest V2 background.scripts with Manifest V3 service_worker, and the paths used when loading the unpacked extension now match the contents of the final zip.
    • On macOS, the default Chrome Native Messaging host install path has been corrected so browser-cli setup --browser chrome is less likely to leave the extension unable to connect to the relay.
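
The filename inference and sanitization mentioned for download can be pictured with a short sketch. This is an illustration under assumptions, not browser-cli's actual logic: the function name `infer_filename`, the allowed character set, and the `download` fallback are all hypothetical.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def infer_filename(url: str, default: str = "download") -> str:
    """Derive a safe local filename from the last path segment of a URL."""
    # Ignore the query string and fragment; decode percent-escapes in the path.
    path = unquote(urlsplit(url).path)
    # Keep only the final segment so directory components cannot leak through.
    name = PurePosixPath(path).name
    # Replace anything outside a conservative allow-list with an underscore.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Drop leading dots so the result is never hidden or a relative reference.
    name = name.lstrip(".")
    return name or default
```

Taking only the final path segment before cleaning is the important design choice here: it means an encoded `../` sequence in the URL cannot steer the file outside the chosen output directory.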

Compatibility and behavior notes

  • This is a backward-compatible release. The new screenshot and download commands are additive and do not break existing open, page, click, type, or wait flows.
  • screenshot --full-page still falls back to viewport capture with a warning; this release establishes the command surface and capture pipeline first, while full-page stitching can be improved later.
  • download is designed for resources that are accessible from the browser's current session context, which makes it especially useful for authenticated images, file links, and page-visible media resources.
  • Chrome users should reload the extension after upgrading. macOS users who previously ran browser-cli setup --browser chrome --extension-id <id> should run it again once so the native host manifest is rewritten into the corrected directory.

Full Changelog: 0.5.4...0.6.0

0.5.4 (4bfcb96)

28 Mar 10:11

browser-cli 0.5.4

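0.5.4 is a patch release focused on layered JSON output, better view focusing behavior, stronger interaction feedback in the browser extension, and documentation updates. It rolls up the improvements to CLI output, structured views, extension-side interaction feedback, and usage documentation made since 0.5.3.

Highlights

  • --json now distinguishes a compact default mode from a full --verbose mode
    • page, search, block, and view gain --verbose / -v, which explicitly expands the full structure when needed.
    • By default, --json returns compact data better suited to agent consumption: low-signal containers are flattened, and raw expansion fields such as element_refs, full_texts, and full_blocks are omitted.
    • search --json keeps only page, tag, context, and the matched element_id by default, reducing downstream parsing noise.
  • view focuses more tightly on the current target by default
    • When the target element is inside a list / table, the default view narrows to the single item / row containing the target instead of expanding the whole surrounding list.
    • If the full list or table context is really needed, use view --verbose explicitly.
    • open, click, type, and wait, the paths that return a page by default, still output the full XML page and are not unexpectedly trimmed by the new compact JSON strategy.
  • More visible automation-state indication in the browser extension
    • The extension-side content script adjusts the cursor overlay style: when idle it shows a smaller cursor and a more prominent zzz idle marker, and it switches back to the highlighted working state while a task is running.
    • This makes it easier to tell at a glance in a real browser whether automation is merely idle or actively performing an action.
  • Documentation and development notes updated together
    • README.md, README.en.md, and SKILL.md now cover --verbose, the default focusing behavior of view, and the latest command cheat sheet and output descriptions.
    • The repository adds CONTRIBUTING.md, which gathers local development, extension reloading, test requirements, and pre-submission checks into a standalone guide.

Compatibility and behavior notes

  • This is a compatible patch release with no breaking protocol-level changes.
  • The primary XML output behavior is essentially unchanged; the compact / full switch mainly shows up in --json results and in the default focus scope of view.
  • Downstream scripts that relied on the full raw fields in page --json, search --json, block --json, or view --json need to pass --verbose explicitly.
  • The extension-side change in this release is mainly a visual-feedback improvement for the automation cursor and does not affect the protocol or structured page fields.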

English

0.5.4 is a patch release focused on layered JSON output, a tighter default view scope, clearer browser-extension automation feedback, and refreshed documentation, covering this round of CLI-output, structured-view, extension UX, and usage-guide refinements since 0.5.3.

Highlights

  • --json now distinguishes between compact default output and full --verbose output
    • page, search, block, and view now accept --verbose / -v so callers can opt into the full structure when needed.
    • Default --json output is now more agent-friendly: low-signal containers are flattened and raw expansion fields such as element_refs, full_texts, and full_blocks are omitted.
    • search --json now keeps only page, tag, context, and matched element_id by default, which reduces downstream parsing noise.
  • view now defaults to a tighter target-focused scope
    • When the target element lives inside a list or table, default view output narrows to the single matching item / row instead of expanding the entire surrounding block.
    • When full surrounding list or table context is needed, view --verbose keeps the larger structure intact.
    • open, click, type, and wait still return full XML pages on their default non-JSON paths, so the new compact JSON behavior does not silently trim those flows.
  • Browser-extension automation state is easier to read
    • The extension-side content script updates the cursor overlay so idle state is shown with a smaller cursor and a more obvious zzz marker, while active task mode still switches back to the highlighted working state.
    • This makes it easier to tell at a glance whether the real browser is idle or currently executing automation.
  • Documentation and contributor guidance were refreshed together
    • README.md, README.en.md, and SKILL.md now document --verbose, the new default view focus behavior, and the updated command/output conventions.
    • A new CONTRIBUTING.md collects local setup, extension reload, test requirements, and pre-submission checks into one place.
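
As a rough picture of the compact-versus-verbose split, the sketch below strips the raw expansion fields named above from a nested structure unless verbose output is requested. The field names come from these notes; the surrounding JSON shape and the `compact` helper are hypothetical, and the container-flattening step is deliberately left out.

```python
# Raw expansion fields that the notes say are omitted from default --json output.
RAW_FIELDS = ("element_refs", "full_texts", "full_blocks")


def compact(node, verbose: bool = False):
    """Recursively drop raw expansion fields unless verbose output is wanted."""
    if verbose:
        return node
    if isinstance(node, dict):
        return {
            key: compact(value)
            for key, value in node.items()
            if key not in RAW_FIELDS
        }
    if isinstance(node, list):
        return [compact(child) for child in node]
    return node
```

A downstream script that still needs the dropped fields would correspond to the `verbose=True` path here, which mirrors passing --verbose on the command line.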

Compatibility and behavior notes

  • This is a compatible patch release with no protocol-breaking changes.
  • XML output behavior is largely unchanged; the new compact-vs-full distinction mainly affects --json results and the default scope of view.
  • Downstream tooling that depended on full raw fields from page --json, search --json, block --json, or view --json should now pass --verbose explicitly.
  • The browser-extension change in this release is purely about automation cursor feedback and does not alter the protocol or structured page fields.

Full Changelog: 0.5.3...0.5.4

0.5.3 (16ff0dd)

28 Mar 08:06

browser-cli 0.5.3

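0.5.3 is a patch release focused on preserving semantics in structured output, making class information more readable, and strengthening interaction detection for list items. It rolls up the page-structuring, XML-rendering, and plugin target-traversal improvements made since 0.5.2.

Highlights

  • Structured results now preserve more useful class semantics
    • Interactive nodes such as link and button, as well as list item nodes, now keep their own class information in the structured tree.
    • XML output deduplicates and compacts class information while preserving its semantics, reducing noise from repeated sibling nodes, repeated paths, and BEM base classes.
    • For list items with a repeated structure, the class is kept on the first sample of the pattern and omitted on subsequent isomorphic items, so output is shorter but still carries enough semantics.
  • List-item interaction detection better matches real pages
    • If an <li> itself carries a weak interaction signal such as cursor:pointer, the structuring stage now identifies it directly as a clickable element instead of first wrapping it as a plain item.
    • This makes custom site components such as leaderboards, tab lists, and card lists easier to hit directly with click.
  • Further XML context denoising
    • class paths automatically drop redundant parent prefixes, for example fragments repeated between a BEM base class and its modifier, or between a parent block name and a child element name.
    • Identical class values repeated across sibling nodes are also deduplicated structurally, so long list output is not filled with template classes.
    • Plugin target traversal and CLI target lookup have been adapted to the new node fields, keeping interaction lookup behavior consistent.

Compatibility and behavior notes

  • This is a compatible patch release; it introduces no new command arguments and no breaking protocol-level changes.
  • Existing eN / tN / bN workflows are unchanged; this release mainly improves how usable class semantics are in the output and the hit rate for interactive list elements.
  • class attributes in XML output are more informative than before but are also actively deduplicated and compacted; downstream logic that relies on repeated template classes being expanded verbatim needs to adapt to the new, more compact representation.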

English

0.5.3 is a patch release focused on semantic preservation in structured output, more useful class rendering, and stronger interactive detection for list items, covering this round of page-structuring, XML-rendering, and plugin target-traversal refinements since 0.5.2.

Highlights

  • Structured output now preserves more useful class semantics
    • Interactive nodes such as link, button, and list item nodes now retain their own class information in the structured tree.
    • XML rendering keeps that class context while deduplicating and compacting it, reducing noise from repeated siblings, repeated path segments, and BEM base classes.
    • For repetitive list structures, class information is preferentially kept on the first representative item while later isomorphic items omit duplicate class data, keeping output shorter without losing the pattern.
  • List-item interaction detection better matches real pages
    • When an <li> itself carries weak interactive signals such as cursor:pointer, the structuring pass now emits it directly as an interactive target instead of wrapping it as a plain item first.
    • This makes custom site components such as ranking lists, tab strips, and card lists easier to target directly with click.
  • XML context denoising goes further
    • Class paths now compact redundant parent prefixes automatically, including repeated BEM base/modifier relationships and repeated block/element prefixes.
    • Repeated sibling classes are also deduplicated structurally so long lists are not flooded with template-level class noise.
    • Plugin target traversal and CLI target lookup have been updated to understand the expanded node fields and keep interaction resolution behavior consistent.

Compatibility and behavior notes

  • This is a compatible patch release with no new command syntax and no protocol-breaking changes.
  • Existing eN / tN / bN workflows remain unchanged; the main improvement here is better class-level semantics in output and more reliable targeting of interactive list items.
  • class attributes in XML now carry more signal than before, but they are also intentionally deduplicated and compacted; downstream logic that depended on fully repeated template classes should adapt to the denser representation.

Full Changelog: 0.5.2...0.5.3

0.5.2 (23b0d2e)

28 Mar 06:31

browser-cli 0.5.2

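0.5.2 is a patch release focused on multi-window session tracking, consistent short-ID arguments, and stronger weak-interaction detection. It rolls up the CLI, structured-parsing, and extension-side session-management improvements made since 0.5.1.

Highlights

  • Clicks that open a new window / tab now create a new session automatically
    • When an ordinary click triggers the site's own new tab / window behavior, the extension now recognizes the context switch and creates a new session for the destination page instead of letting tracking break.
    • The CLI receives the new session_id and the destination page content directly, saving an extra list / page call.
    • The original session stays on the old page and is not rebound to the new tab, which fits the explicit session model better.
  • More uniform ID argument semantics
    • click / type now accept both e3 and 3.
    • text / block now also accept both the full ID and its numeric part, for example t1 / 1 and b1 / 1.
    • README.md, README.en.md, SKILL.md, and the CLI help have been updated to the unified argument description.
  • Stronger weak-interaction detection
    • Browser snapshots preserve the runtime cursor: pointer signal on non-native interactive nodes.
    • The Rust-side structuring stage conservatively classifies cursor:pointer nodes with visible text as buttons, reducing misses on custom list items / card items.

Compatibility and behavior notes

  • This is a compatible patch release; it introduces no breaking protocol or session-model changes.
  • click --new-session is retained and continues to serve the case of explicitly opening a link target as a new session; what this release adds is automatic tracking of the site's own new tab / window behavior.
  • Existing numeric-ID workflows remain valid; eN / tN / bN are just newly added equivalent forms and do not affect existing scripts.
  • Because the extension adds the webNavigation permission to identify newly created navigation targets, the browser extension must be reloaded after upgrading for the change to take full effect.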

English

0.5.2 is a patch release focused on cross-window session tracking, consistent short-ID argument handling, and stronger weak-interaction detection across the CLI, structured page parsing, and extension-side session management since 0.5.1.

Highlights

  • Click-opened tabs/windows now become tracked sessions automatically
    • When a normal click triggers site-driven new-tab or new-window behavior, the extension now detects that context change and creates a new session for the destination page instead of losing tracking.
    • The CLI now receives the new session_id together with the destination page content, removing the need for an extra list / page round trip.
    • The original session stays on the old page and is not rebound to the new tab, which better matches the explicit session model.
  • More consistent ID argument semantics
    • click / type now accept both e3 and 3.
    • text / block now also accept either full IDs or their numeric forms, such as t1 / 1 and b1 / 1.
    • README.md, README.en.md, SKILL.md, and CLI help have been updated to reflect the unified argument behavior.
  • Stronger weak-interaction detection
    • Browser snapshots now preserve runtime cursor: pointer signals on non-native interactive nodes.
    • During Rust-side structuring, visible-text nodes with cursor:pointer are conservatively classified as buttons, reducing misses on custom list items and card-like controls.
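
The unified ID handling, where e3 and 3 name the same element, amounts to stripping an optional one-letter prefix before parsing the number. The sketch below shows that idea; `parse_target_id` is a hypothetical helper, not the CLI's actual parser, and its error behavior is an assumption.

```python
def parse_target_id(arg: str, prefix: str) -> int:
    """Accept either a prefixed ID such as 'e3' or its bare number '3'."""
    text = arg.strip()
    if text.startswith(prefix):
        text = text[len(prefix):]
    # Require plain ASCII digits so inputs like 'x3' or 'e' are rejected.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a valid {prefix}N identifier: {arg!r}")
    return int(text)
```

With this shape, `parse_target_id("e3", "e")` and `parse_target_id("3", "e")` resolve to the same element number, and the same function covers tN and bN by changing the prefix.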

Compatibility and behavior notes

  • This is a compatible patch release with no protocol-breaking or session-model-breaking changes.
  • click --new-session remains available for explicitly opening link targets as separate sessions; what is new here is automatic tracking for site-driven new tab/window behavior.
  • Existing numeric-ID workflows remain valid; eN / tN / bN are additive equivalent forms and do not break current scripts.
  • Because the extension now uses the webNavigation permission to detect newly created navigation targets, reloading the browser extension is required after upgrading for the full behavior to take effect.

Full Changelog: 0.5.1...0.5.2

0.5.1 (a5b6fd2)

28 Mar 05:46

browser-cli 0.5.1

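0.5.1 is a patch release focused on preserving page semantics, structured-view readability, and fixes to element resolution in long blocks. It rolls up the page-parsing, XML-output, and extension-side interaction refinements made since 0.5.0.

Highlights

  • More complete interaction-semantics detection
    • The structuring stage now recognizes more role- / state-driven controls, including common custom interaction semantics such as role="link", tab, menuitem, option, and switch.
    • Nodes that have tabindex together with state such as aria-selected / aria-current / aria-expanded are now conservatively treated as interactive elements, reducing misses on custom components.
    • The browser-side snapshot adds class, tabindex, more ARIA state, and runtime onclick information, giving the Rust-side structuring logic a more complete set of raw facts.
  • More stable focused views and long-block target resolution
    • view can now correctly locate elements inside truncated list / table blocks and automatically returns the expanded context.
    • Text queries in click and link resolution for --new-session keep searching the full block content instead of being limited to the small portion currently expanded on the page.
  • XML output closer to real semantics
    • Purely presentational wrapper containers are skipped, reducing noise from meaningless layers such as div / span / i.
    • role / class semantics on flattened wrapper layers are inherited by the final output nodes, so later rules, plugins, or agents can use that context.
    • List items containing a single interactive leaf are output inline more compactly, while item groups with real record boundaries are still preserved.
  • More restrained and natural cursor presence
    • Idle wandering now makes decisions at intervals and stops automatically after a long period without CLI activity, avoiding prolonged pointless movement on the page.
    • Movement paths add slight wobble and mid-path hesitation pauses, making the mouse presence feel more natural.

Compatibility and behavior notes

  • This is a compatible patch release; it introduces no new session model and no breaking protocol-level changes.
  • Existing eN / tN / bN workflows remain usable; this release mainly makes structured results more stable and easier to target for complex custom components and long blocks.
  • XML output shows fewer non-semantic wrapper layers than in 0.5.0; downstream logic that depends on the exact nesting of those presentational containers needs to adapt to the new structured view.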

English

0.5.1 is a patch release focused on semantic preservation, structured-view readability, and more reliable element resolution inside truncated blocks, covering this round of page parsing, XML rendering, and extension-side interaction refinements since 0.5.0.

Highlights

  • More complete interactive semantic detection
    • The structured page pass now recognizes more role- and state-driven widgets, including common custom semantics such as role="link", tab, menuitem, option, and switch.
    • Nodes with tabindex plus widget state such as aria-selected, aria-current, or aria-expanded are now classified more conservatively as interactive, reducing misses on custom components.
    • Browser-side snapshots now include class, tabindex, more ARIA state, and runtime onclick signals so the Rust side has a fuller set of raw facts to classify from.
  • More reliable focused views and long-block target resolution
    • view can now resolve elements that live inside truncated list/table blocks and automatically return the expanded surrounding context.
    • Text-query resolution in click and link extraction for --new-session now continue searching through full block contents instead of only the currently expanded slice.
  • XML output is closer to real page semantics
    • Purely presentational wrapper containers are skipped to reduce noise from meaningless div / span / i layers.
    • role / class semantics from flattened wrappers are propagated down to the final rendered nodes, which gives downstream rules, plugins, and agents more usable context.
    • List items that only wrap a single interactive leaf are rendered more compactly, while true record boundaries are still preserved.
  • Cursor presence becomes more restrained and natural
    • Idle wandering now runs on interval-based decisions and stops automatically after prolonged CLI inactivity, avoiding endless meaningless movement.
    • Cursor motion now includes slight wobble and occasional hesitation pauses to make on-page presence look less mechanical.
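
The role- and state-driven detection described above can be approximated with a small predicate. This is only a sketch: the role and attribute names are taken from these notes, while the attribute-dictionary input and the function `looks_interactive` are hypothetical, and the real structuring pass weighs more signals (for example runtime onclick).

```python
# Roles the notes list as recognized custom interaction semantics.
INTERACTIVE_ROLES = {"link", "tab", "menuitem", "option", "switch"}
# Widget state attributes that count only in combination with tabindex.
STATE_ATTRS = ("aria-selected", "aria-current", "aria-expanded")


def looks_interactive(attrs: dict) -> bool:
    """Classify a node as interactive from its role or tabindex-plus-state."""
    if attrs.get("role") in INTERACTIVE_ROLES:
        return True
    has_tabindex = "tabindex" in attrs
    has_state = any(name in attrs for name in STATE_ATTRS)
    # Conservative rule: tabindex alone or state alone is not enough.
    return has_tabindex and has_state
```

Requiring both tabindex and a state attribute keeps the rule conservative, so a purely decorative element with a stray ARIA attribute is not promoted to a click target.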

Compatibility and behavior notes

  • This is a compatible patch release with no new session-model or protocol-breaking changes.
  • Existing eN / tN / bN workflows remain valid; the main goal here is to make structured output more stable and easier to target on complex custom widgets and long blocks.
  • XML output in 0.5.1 intentionally shows fewer non-semantic wrapper layers than 0.5.0; downstream logic that depended on those presentational container levels should adapt to the cleaner structure.

Full Changelog: 0.5.0...0.5.1

0.5.0 (59eda41)

28 Mar 04:06

browser-cli 0.5.0

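0.5.0 is a release focused on command usability, structured page views, and interaction polish. It rolls up the CLI, page-parsing, and extension-side improvements made since 0.4.3.

Highlights

  • More natural command arguments and default behavior
    • open adds --wait and --quiet, and by default returns the current page directly instead of only session information.
    • click / type can now locate interactive elements either by numeric element ID or directly by a text query against the page.
    • wait adds a --for <text> text-waiting mode and supports --quiet to output only a compact result.
    • block adds --all, which expands an entire long list or long table block at once.
  • New focused-view capability
    • Added the view command, which returns a focused view for an element, a long text, a long list/table block, or a text query.
    • It suits complex pages where only one local context is needed, reducing manual paging and follow-up searches.
  • Upgraded page structuring model
    • The internal page structure moves from a flatter representation to tree-shaped nodes, so output better preserves the context of containers, lists, tables, and nested interactive elements.
    • XML / JSON rendering, long-block pagination, and focused views now share one tree structure, so results on complex DOMs are more stable and closer to the real page hierarchy.
  • Plugin and extension side enhanced in step
    • Plugin matching logic has been adapted to the new tree-shaped page structure; string target resolution searches interactive nodes recursively.
    • The browser extension supports an extra stabilization wait before the first snapshot after opening a page, reducing problems where a freshly loaded page is still shifting.
    • Cursor movement is further humanized with segmented Bezier motion, detours, and a more natural pause rhythm.

Compatibility and behavior notes

  • The existing numeric element-ID workflow remains usable; text queries are an added capability and do not prevent existing scripts from passing numeric IDs.
  • click / type / wait now return the updated page by default; automation that only needs a success summary can use --quiet.
  • This release introduces no breaking session-model changes; the overall way of using the CLI / Relay / Extension stays the same.
  • README.md, AGENTS.md, and SKILL.md have been updated to the new argument semantics, so AI agents and scripts can use the new interface directly.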

English

0.5.0 is a release focused on command ergonomics, structured page views, and interaction polish, covering this round of CLI, page parsing, and extension-side improvements since 0.4.3.

Highlights

  • More natural command targets and defaults
    • open now supports --wait and --quiet, and returns the current page by default instead of only printing session info.
    • click / type now accept either numeric element IDs or direct text queries to resolve interactive targets.
    • wait now supports --for <text> for text-based waiting and --quiet for compact automation output.
    • block now supports --all to expand an entire long list/table block at once.
  • New focused inspection flow
    • Added a new view command that can return a focused view for an element, long text item, long list/table block, or text query.
    • This is especially useful on complex pages where only one local subtree or content segment is needed.
  • Upgraded structured page model
    • Internal page representation moved from a flatter structure to tree-based nodes, preserving container, list, table, and nested interactive context much better.
    • XML / JSON rendering, block pagination, and focused views now share the same tree model, which improves stability on complex DOM layouts.
  • Plugin and extension improvements
    • Plugin target matching now traverses the new tree structure recursively when resolving interactive targets.
    • The extension can now wait for additional DOM stability before sending the first snapshot after page open.
    • Cursor motion has been further humanized with segmented bezier movement, detours, and more natural pause timing.
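
Recursive target resolution over a tree of nodes might look like the sketch below, which returns the first interactive node whose text contains the query, searching depth-first. The `Node` dataclass, its fields, and the case-insensitive substring match are assumptions for illustration; they are not the project's actual data model or matching rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a kind, visible text, and nested children."""
    kind: str
    text: str = ""
    interactive: bool = False
    children: List["Node"] = field(default_factory=list)


def find_interactive(node: Node, query: str) -> Optional[Node]:
    """Depth-first search for the first interactive node matching the query."""
    if node.interactive and query.lower() in node.text.lower():
        return node
    for child in node.children:
        hit = find_interactive(child, query)
        if hit is not None:
            return hit
    return None
```

Because containers such as lists and tables are ordinary nodes in the tree, the same traversal reaches a button nested several levels deep without any special casing.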

Compatibility and behavior notes

  • Existing numeric element-ID workflows remain valid; text-query targeting is additive and does not break current scripts.
  • click / type / wait now return the updated page by default. Use --quiet when automation only needs a compact success result.
  • This release does not introduce a breaking session-model change. The overall CLI / Relay / Extension workflow remains the same.
  • README.md, AGENTS.md, and SKILL.md have been updated to reflect the new parameter semantics so agents and scripts can follow the latest interface directly.

Full Changelog: 0.4.3...0.5.0

0.4.3 (ef54ddf)

27 Mar 17:31

browser-cli 0.4.3

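0.4.3 is a minor release focused on cursor visualization and more natural simulation of human operation. It rolls up the interaction enhancements made since 0.4.2.

  • Added persistent cursor rendering. After a page is opened with open, the extension immediately shows a visible cursor in the page instead of only at the moment of a click.
  • Added idle wandering. The cursor moves randomly between visible interactive areas and blank areas of the page, strengthening the sense of a human presence while the page is idle.
  • Added a task-state cursor marker. When an explicit action command is received, the cursor turns yellow to distinguish it from the regular idle state.
  • Improved the approach path before click and type actions. Before acting, the extension scrolls to the target, approaches the control along a curved path, and emits more continuous mouse-move events along the way.
  • Improved control hit accuracy. Before clicking, the extension samples the target's visible area and validates the hit point, and it shows scale feedback on click, reducing misclicks caused by edge hits and occlusion.

This release introduces no breaking protocol-level changes; existing CLI / Relay / Extension workflows are unchanged.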

English

0.4.3 is a small release focused on cursor visibility and more natural human-like browser interaction, covering all changes since 0.4.2.

  • Added persistent cursor rendering. After a page is opened with open, the extension now shows a visible cursor immediately instead of only during click execution.
  • Added idle wandering behavior. The cursor now moves between visible interactive regions and whitespace while the page is idle, making browser presence look less static.
  • Added a task-state cursor marker. When an explicit action is being executed, the cursor switches to yellow so task mode is visually distinct from normal idle mode.
  • Improved approach paths before click and type actions. The extension now scrolls the target into view, approaches it along a curved path, and emits a more continuous stream of mouse movement events before acting.
  • Improved hit accuracy for actionable controls. Before clicking, the extension samples the visible target area and validates hit points, then applies a click-scale feedback effect to reduce edge hits and occlusion-related misses.
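
One way to picture a curved approach path with a continuous stream of movement points is a quadratic Bezier curve sampled at regular steps. The notes for this release only say "curved path", so the choice of a quadratic Bezier, the control point, and the function name `bezier_path` are all assumptions made for this sketch.

```python
def bezier_path(start, control, end, steps: int = 20) -> list:
    """Sample a quadratic Bezier curve from start to end via a control point."""
    points = []
    for i in range(steps + 1):
        t = i / steps
        # Standard quadratic Bezier: (1-t)^2 * P0 + 2(1-t)t * P1 + t^2 * P2
        x = (1 - t) ** 2 * start[0] + 2 * (1 - t) * t * control[0] + t ** 2 * end[0]
        y = (1 - t) ** 2 * start[1] + 2 * (1 - t) * t * control[1] + t ** 2 * end[1]
        points.append((x, y))
    return points
```

Each sampled point would correspond to one synthetic mouse-move event, so a larger `steps` value gives a denser, smoother-looking stream of events along the curve.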

This release does not introduce any breaking protocol changes. Existing CLI / Relay / Extension workflows remain unchanged.

Full Changelog: 0.4.2...0.4.3

0.4.2 (29a6eab)

27 Mar 16:26

browser-cli 0.4.2

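0.4.2 is a minor release focused on structured-output quality and navigation stability. It rolls up the improvements made since 0.4.0.

  • Improved block pagination. Very long list / table blocks are no longer cut at a fixed item count; they are paginated by a budget of rendered XML lines, so the first screen of content is more sensible and later block pages track actual reading density more closely.
  • Optimized table XML output. A row with a single, short cell is now compressed onto one line as <row><cell>...</cell></row>, reducing redundancy and improving readability in terminals and for AI consumers.
  • Fixed structural misalignment in navigation-style lists. Lists made up purely of links are no longer output twice, as both <list> and <link>; in mixed lists, plain-text items and clickable items are split and ordered more accurately.
  • Improved element label fallback and search matching. The title attribute is now included in label fallback and in the searched fields, reducing cases where an element is actionable but has no readable label.
  • Fixed a cache problem after cross-page navigation. When click triggers page navigation, the extension actively requests a new snapshot once the new page has finished loading, so Relay / CLI do not keep reading the old page's cache.

This release introduces no breaking protocol-level changes; existing CLI / Relay / Extension workflows are unchanged.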

English

0.4.2 is a small release focused on structured output quality and navigation stability, rolling up all changes since 0.4.0.

  • Improved block pagination. Long list / table blocks are no longer split by a fixed item count. They are now paginated by rendered XML line budget, which makes the first page denser and follow-up block pages more predictable.
  • Cleaner table XML. Single-cell rows with short content are now emitted inline as <row><cell>...</cell></row>, reducing output noise for terminal and AI consumers.
  • Fixed structured-output misalignment for navigation-style lists. Lists made entirely of links are no longer duplicated as both <list> and <link>, and mixed lists now separate plain text items from actionable links more accurately.
  • Better label fallback and search coverage. The title attribute is now included in searchable fields and used as a fallback label source when needed.
  • Fixed stale cache after navigation. When a click causes a page transition, the extension now waits for the new page to load and requests a fresh snapshot, preventing Relay / CLI from serving the previous page state.
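
Paginating by a rendered line budget rather than a fixed item count can be sketched like this: each item costs as many lines as its rendered XML occupies, and a new page starts when the next item would exceed the budget. The function name `paginate_by_lines`, the string-based items, and the budget value are hypothetical; the actual budget and rendering rules are not stated in these notes.

```python
def paginate_by_lines(rendered_items: list, budget: int) -> list:
    """Group rendered items into pages whose total line count fits the budget."""
    pages, current, used = [], [], 0
    for item in rendered_items:
        cost = item.count("\n") + 1  # number of lines this item renders to
        # Start a new page when adding this item would overflow a non-empty page.
        if current and used + cost > budget:
            pages.append(current)
            current, used = [], 0
        current.append(item)
        used += cost
    if current:
        pages.append(current)
    return pages
```

An item larger than the whole budget still gets a page of its own here, so no content is dropped; short inline rows simply pack more densely than multi-line ones.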

This release does not introduce any breaking protocol changes. Existing CLI / Relay / Extension workflows remain unchanged.

Full Changelog: 0.4.0...0.4.2

0.4.1 (795d724)

27 Mar 16:11

Full Changelog: 0.4.0...0.4.1