Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
### Bug Fixes

- Fixed **Lightpanda engine** compatibility (#1050)
- Fixed **Helium auto-connect discovery** by scanning Helium user data directories on macOS and updating CDP guidance/help to treat Helium as a supported Chromium-based browser
- Fixed **Windows daemon TCP bind** failing when Hyper-V reserves the port by falling back to an OS-assigned port and writing it to a `.port` file (#1041)
- Fixed **Windows dashboard relay** using Unix socket instead of TCP (#1038)
- Fixed **radio/checkbox elements** being dropped from compact snapshot tree because the `ref=` check required a leading `[` that those elements lack (#1008)
Expand Down
24 changes: 13 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -373,19 +373,21 @@ agent-browser provides multiple ways to persist login sessions so you don't re-a
|----------|----------|------------|
| **Persistent profile** | Full browser state (cookies, IndexedDB, service workers, cache) across restarts | `--profile <path>` / `AGENT_BROWSER_PROFILE` |
| **Session persistence** | Auto-save/restore cookies + localStorage by name | `--session-name <name>` / `AGENT_BROWSER_SESSION_NAME` |
| **Import from your browser** | Grab auth from a Chrome session you already logged into | `--auto-connect` + `state save` |
| **Import from your browser** | Grab auth from a Chrome/Chromium-family browser session you already logged into | `--auto-connect` + `state save` |
| **State file** | Load a previously saved state JSON on launch | `--state <path>` / `AGENT_BROWSER_STATE` |
| **Auth vault** | Store credentials locally (encrypted), login by name | `auth save` / `auth login` |

### Import auth from your browser

If you are already logged in to a site in Chrome, you can grab that auth state and reuse it:
If you are already logged in to a site in Chrome, Chromium, Brave, or Helium, you can grab that auth state and reuse it:

```bash
# 1. Launch Chrome with remote debugging enabled
# 1. Launch a Chromium-based browser with remote debugging enabled
# macOS:
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222
# Or use --auto-connect to discover an already-running Chrome
# Helium:
open -na "Helium" --args --remote-debugging-port=9222
# Or use --auto-connect to discover an already-running browser

# 2. Connect and save the authenticated state
agent-browser --auto-connect state save ./my-auth.json
Expand Down Expand Up @@ -584,7 +586,7 @@ This is useful for multimodal AI models that can reason about visual layout, unl
| `--screenshot-format <fmt>` | Screenshot format: `png`, `jpeg` (or `AGENT_BROWSER_SCREENSHOT_FORMAT` env) |
| `--headed` | Show browser window (not headless) (or `AGENT_BROWSER_HEADED` env) |
| `--cdp <port\|url>` | Connect via Chrome DevTools Protocol (port or WebSocket URL) |
| `--auto-connect` | Auto-discover and connect to running Chrome (or `AGENT_BROWSER_AUTO_CONNECT` env) |
| `--auto-connect` | Auto-discover and connect to a running Chromium-based browser, including Helium (or `AGENT_BROWSER_AUTO_CONNECT` env) |
| `--color-scheme <scheme>` | Color scheme: `dark`, `light`, `no-preference` (or `AGENT_BROWSER_COLOR_SCHEME` env) |
| `--download-path <path>` | Default download directory (or `AGENT_BROWSER_DOWNLOAD_PATH` env) |
| `--content-boundaries` | Wrap page output in boundary markers for LLM safety (or `AGENT_BROWSER_CONTENT_BOUNDARIES` env) |
Expand Down Expand Up @@ -926,28 +928,28 @@ This enables control of:

### Auto-Connect

Use `--auto-connect` to automatically discover and connect to a running Chrome instance without specifying a port:
Use `--auto-connect` to automatically discover and connect to a running Chromium-based browser without specifying a port:

```bash
# Auto-discover running Chrome with remote debugging
# Auto-discover a running Chromium-based browser with remote debugging
agent-browser --auto-connect open example.com
agent-browser --auto-connect snapshot

# Or via environment variable
AGENT_BROWSER_AUTO_CONNECT=1 agent-browser snapshot
```

Auto-connect discovers Chrome by:
Auto-connect discovers browsers by:

1. Reading Chrome's `DevToolsActivePort` file from the default user data directory
1. Reading `DevToolsActivePort` from the default user data directories for Chrome, Chrome Canary, Chromium, Brave, and Helium
2. Falling back to probing common debugging ports (9222, 9229)
3. If HTTP-based discovery (`/json/version`, `/json/list`) fails, falling back to a direct WebSocket connection

This is useful when:

- Chrome 144+ has remote debugging enabled via `chrome://inspect/#remote-debugging` (which uses a dynamic port)
- Chrome 144+ or another Chromium-based browser has remote debugging enabled with a dynamic port
- You want a zero-configuration connection to your existing browser
- You don't want to track which port Chrome is using
- You don't want to track which port the browser is using

## Streaming (Browser Preview)

Expand Down
23 changes: 22 additions & 1 deletion cli/src/native/cdp/chrome.rs
Original file line number Diff line number Diff line change
Expand Up @@ -553,7 +553,9 @@ pub async fn auto_connect_cdp() -> Result<String, String> {
}
}

Err("No running Chrome instance found. Launch Chrome with --remote-debugging-port or use --cdp.".to_string())
Err(
"No running Chromium-based browser instance found. Launch Chrome, Chromium, Brave, or Helium with --remote-debugging-port or use --cdp.".to_string(),
)
}

fn is_port_reachable(port: u16) -> bool {
Expand All @@ -574,6 +576,8 @@ fn get_chrome_user_data_dirs() -> Vec<PathBuf> {
"Google/Chrome Canary",
"Chromium",
"BraveSoftware/Brave-Browser",
"net.imput.helium",
"net.imput.helium-cdp",
] {
dirs.push(base.join(name));
}
Expand Down Expand Up @@ -1050,4 +1054,21 @@ mod tests {

assert!(!dir.exists(), "Temp dir should be cleaned up on drop");
}

#[cfg(target_os = "macos")]
#[test]
fn test_get_chrome_user_data_dirs_includes_helium() {
let dirs = get_chrome_user_data_dirs();
let paths: Vec<String> = dirs
.iter()
.map(|p| p.to_string_lossy().to_string())
.collect();

assert!(paths
.iter()
.any(|p| p.ends_with("Library/Application Support/net.imput.helium")));
assert!(paths
.iter()
.any(|p| p.ends_with("Library/Application Support/net.imput.helium-cdp")));
}
}
6 changes: 3 additions & 3 deletions cli/src/output.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2791,7 +2791,7 @@ Authentication:
(or AGENT_BROWSER_SESSION_NAME env)
--state <path> Load saved auth state (cookies + storage) from JSON file
(or AGENT_BROWSER_STATE env)
--auto-connect Connect to a running Chrome to reuse its auth state
--auto-connect Connect to a running Chromium-based browser to reuse auth state
Tip: agent-browser --auto-connect state save ./auth.json
--headers <json> HTTP headers scoped to URL's origin (e.g., Authorization bearer token)

Expand Down Expand Up @@ -2863,7 +2863,7 @@ Environment:
AGENT_BROWSER_DEBUG Debug output
AGENT_BROWSER_IGNORE_HTTPS_ERRORS Ignore HTTPS certificate errors
AGENT_BROWSER_PROVIDER Browser provider (ios, browserbase, kernel, browseruse, browserless)
AGENT_BROWSER_AUTO_CONNECT Auto-discover and connect to running Chrome
AGENT_BROWSER_AUTO_CONNECT Auto-discover and connect to a running Chromium-based browser
AGENT_BROWSER_ALLOW_FILE_ACCESS Allow file:// URLs to access local files
AGENT_BROWSER_COLOR_SCHEME Color scheme preference (dark, light, no-preference)
AGENT_BROWSER_DOWNLOAD_PATH Default download directory for browser downloads
Expand Down Expand Up @@ -2906,7 +2906,7 @@ Examples:
agent-browser screenshot --annotate # Labeled screenshot for vision models
agent-browser wait --load networkidle # Wait for slow pages to load
agent-browser --cdp 9222 snapshot # Connect via CDP port
agent-browser --auto-connect snapshot # Auto-discover running Chrome
agent-browser --auto-connect snapshot # Auto-discover running Chrome/Chromium/Brave/Helium
agent-browser stream enable # Start runtime streaming on an auto-selected port
agent-browser stream status # Inspect runtime streaming state
agent-browser --color-scheme dark open example.com # Dark mode
Expand Down
17 changes: 9 additions & 8 deletions docs/src/app/cdp-mode/page.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ Connect to an existing browser via Chrome DevTools Protocol:

```bash
# Start Chrome with: google-chrome --remote-debugging-port=9222
# Or on macOS: open -na "Helium" --args --remote-debugging-port=9222

# Connect once, then run commands without --cdp
agent-browser connect 9222
Expand Down Expand Up @@ -34,28 +35,28 @@ The `--cdp` flag accepts either:

## Auto-Connect

Use `--auto-connect` to automatically discover and connect to a running Chrome instance without specifying a port:
Use `--auto-connect` to automatically discover and connect to a running Chromium-based browser without specifying a port:

```bash
# Auto-discover running Chrome with remote debugging
# Auto-discover a running Chromium-based browser with remote debugging
agent-browser --auto-connect open example.com
agent-browser --auto-connect snapshot

# Or via environment variable
AGENT_BROWSER_AUTO_CONNECT=1 agent-browser snapshot
```

Auto-connect discovers Chrome by:
Auto-connect discovers browsers by:

1. Reading Chrome's `DevToolsActivePort` file from the default user data directory
1. Reading `DevToolsActivePort` from the default user data directories for Chrome, Chrome Canary, Chromium, Brave, and Helium
2. Falling back to probing common debugging ports (9222, 9229)
3. If HTTP-based discovery (`/json/version`, `/json/list`) fails, falling back to a direct WebSocket connection

This is useful when:

- Chrome 144+ has remote debugging enabled via `chrome://inspect/#remote-debugging` (which uses a dynamic port)
- Chrome 144+ or another Chromium-based browser has remote debugging enabled with a dynamic port
- You want a zero-configuration connection to your existing browser
- You don't want to track which port Chrome is using
- You don't want to track which port the browser is using

## Color scheme

Expand All @@ -77,7 +78,7 @@ AGENT_BROWSER_COLOR_SCHEME=dark agent-browser --cdp 9222 open https://example.co
This enables control of:

- Electron apps
- Chrome/Chromium with remote debugging
- Chrome/Chromium/Brave/Helium with remote debugging
- WebView2 applications
- Remote browser services (via WebSocket URL)
- Any browser exposing a CDP endpoint
Expand All @@ -103,7 +104,7 @@ This enables control of:
<tr><td><code>--exact</code></td><td>Exact text match</td></tr>
<tr><td><code>--headed</code></td><td>Show browser window</td></tr>
<tr><td><code>{"--cdp <port|url>"}</code></td><td>CDP connection (port or WebSocket URL)</td></tr>
<tr><td><code>--auto-connect</code></td><td>Auto-discover and connect to running Chrome</td></tr>
<tr><td><code>--auto-connect</code></td><td>Auto-discover and connect to a running Chromium-based browser, including Helium</td></tr>
<tr><td><code>--color-scheme &lt;scheme&gt;</code></td><td>Persistent color scheme (<code>dark</code>, <code>light</code>, <code>no-preference</code>)</td></tr>
<tr><td><code>--debug</code></td><td>Debug output</td></tr>
</tbody>
Expand Down
2 changes: 1 addition & 1 deletion docs/src/app/commands/page.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -368,7 +368,7 @@ agent-browser reload # Reload page
--screenshot-format <fmt> # Format: png (default), jpeg (or AGENT_BROWSER_SCREENSHOT_FORMAT)
--headed # Show browser window (not headless)
--cdp <port|url> # Connect via Chrome DevTools Protocol (port or WebSocket URL)
--auto-connect # Auto-discover and connect to running Chrome
--auto-connect # Auto-discover and connect to a running Chromium-based browser, including Helium
--color-scheme <scheme> # Color scheme: dark, light, no-preference
--download-path <path> # Default download directory
--content-boundaries # Wrap page output in boundary markers for LLM safety
Expand Down
2 changes: 1 addition & 1 deletion docs/src/app/configuration/page.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -163,7 +163,7 @@ These environment variables configure additional daemon and runtime behavior:
<tr><th>Variable</th><th>Description</th><th>Default</th></tr>
</thead>
<tbody>
<tr><td><code>AGENT_BROWSER_AUTO_CONNECT</code></td><td>Auto-discover and connect to a running Chrome instance.</td><td>(disabled)</td></tr>
<tr><td><code>AGENT_BROWSER_AUTO_CONNECT</code></td><td>Auto-discover and connect to a running Chromium-based browser, including Helium.</td><td>(disabled)</td></tr>
<tr><td><code>AGENT_BROWSER_ALLOW_FILE_ACCESS</code></td><td>Allow <code>file://</code> URLs to access local files.</td><td>(disabled)</td></tr>
<tr><td><code>AGENT_BROWSER_COLOR_SCHEME</code></td><td>Color scheme preference (<code>dark</code>, <code>light</code>, <code>no-preference</code>).</td><td>(none)</td></tr>
<tr><td><code>AGENT_BROWSER_DOWNLOAD_PATH</code></td><td>Default directory for browser downloads.</td><td>(temp directory)</td></tr>
Expand Down
9 changes: 6 additions & 3 deletions docs/src/app/sessions/page.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -55,19 +55,22 @@ The profile directory stores:

## Import auth from your browser

If you are already logged in to a site in Chrome, you can grab that auth state and reuse it in agent-browser. This is the fastest way to bypass login flows, OAuth, SSO, or 2FA.
If you are already logged in to a site in Chrome, Chromium, Brave, or Helium, you can grab that auth state and reuse it in agent-browser. This is the fastest way to bypass login flows, OAuth, SSO, or 2FA.

**Step 1:** Start Chrome with remote debugging:
**Step 1:** Start a Chromium-based browser with remote debugging:

```bash
# macOS
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222

# macOS (Helium)
open -na "Helium" --args --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222
```

Log in to your target site(s) in this Chrome window.
Log in to your target site(s) in that browser window.

`--remote-debugging-port` exposes full browser control on localhost. Any local process can connect. Only use on trusted machines and close Chrome when done.

Expand Down
6 changes: 3 additions & 3 deletions skills/agent-browser/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ When automating a site that requires login, choose the approach that fits:
**Option 1: Import auth from the user's browser (fastest for one-off tasks)**

```bash
# Connect to the user's running Chrome (they're already logged in)
# Connect to the user's running Chromium-based browser (they're already logged in)
agent-browser --auto-connect state save ./auth.json
# Use that auth state
agent-browser --state ./auth.json open https://app.example.com/dashboard
Expand Down Expand Up @@ -345,15 +345,15 @@ agent-browser session list
### Connect to Existing Chrome
```bash
# Auto-discover running Chrome with remote debugging enabled
# Auto-discover a running Chromium-based browser with remote debugging enabled
agent-browser --auto-connect open https://example.com
agent-browser --auto-connect snapshot
# Or with explicit CDP port
agent-browser --cdp 9222 snapshot
```
Auto-connect discovers Chrome via `DevToolsActivePort`, common debugging ports (9222, 9229), and falls back to a direct WebSocket connection if HTTP-based CDP discovery fails.
Auto-connect discovers Chrome, Chromium, Brave, and Helium via `DevToolsActivePort`, common debugging ports (9222, 9229), and falls back to a direct WebSocket connection if HTTP-based CDP discovery fails.
### Color Scheme (Dark Mode)
Expand Down
11 changes: 7 additions & 4 deletions skills/agent-browser/references/authentication.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,14 +21,17 @@ Login flows, session persistence, OAuth, 2FA, and authenticated browsing.

## Import Auth from Your Browser

The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into.
The fastest way to authenticate is to reuse cookies from a Chromium-based browser session you are already logged into.

**Step 1: Start Chrome with remote debugging**
**Step 1: Start a Chromium-based browser with remote debugging**

```bash
# macOS
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222

# macOS (Helium)
open -na "Helium" --args --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222

Expand All @@ -43,7 +46,7 @@ Log in to your target site(s) in this Chrome window as you normally would.
**Step 2: Grab the auth state**

```bash
# Auto-discover the running Chrome and save its cookies + localStorage
# Auto-discover the running browser and save its cookies + localStorage
agent-browser --auto-connect state save ./my-auth.json
```

Expand All @@ -58,7 +61,7 @@ agent-browser state load ./my-auth.json
agent-browser open https://app.example.com/dashboard
```

This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies.
This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as the browser already has valid session cookies.

> **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices).

Expand Down