Merged
22 changes: 22 additions & 0 deletions example/browser-agent/Dockerfile
@@ -0,0 +1,22 @@
# Browser Agent Image
# Build context: repository root
#
# Build:
# docker build -t browser-agent:latest -f example/browser-agent/Dockerfile .

FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim

WORKDIR /app

COPY example/browser-agent/requirements.txt ./
RUN uv venv && uv pip install -r requirements.txt

COPY example/browser-agent/browser_agent.py ./

ENV PYTHONPATH="/app" \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["uv", "run", "browser_agent.py"]
132 changes: 132 additions & 0 deletions example/browser-agent/README.md
@@ -0,0 +1,132 @@
# Browser Agent with Playwright MCP Tool

> An AI-powered browser agent that handles web search and analysis requests,
> using the official [Playwright MCP](https://github.com/microsoft/playwright-mcp)
> tool running in an isolated AgentCube sandbox.

## Architecture

```
┌───────────────┐         ┌────────────────┐         ┌───────────────────────────────┐
│    Client     │──HTTP──▶│ Browser Agent  │──HTTP──▶│      Router (AgentCube)       │
│  (curl/SDK)   │         │  (Deployment)  │         │  session mgmt + JWT + proxy   │
└───────────────┘         └────────────────┘         └───────────────┬───────────────┘
                                                                     │ reverse proxy
                                                     ┌───────────────▼───────────────┐
                                                     │ Playwright MCP Tool (sandbox) │
                                                     │   AgentRuntime microVM pod    │
                                                     │ official MCP browser service  │
                                                     └───────────────────────────────┘
```

### Components

| Component | Type | Image | Description |
|-----------|------|-------|-------------|
| **Playwright MCP Tool** | `AgentRuntime` CRD | `mcr.microsoft.com/playwright/mcp:latest` | Official Playwright MCP container from Microsoft. Runs as a real browser tool server in the sandbox, not as a custom in-repo agent. |
| **Browser Agent** | `Deployment` | `browser-agent:latest` | LLM-powered orchestrator that receives user requests, plans browser tasks, and calls the Playwright MCP tool via the AgentCube Router. |

### How It Works

1. **User sends a request** (e.g., "Search for the latest Kubernetes release notes")
2. **Browser Agent** uses an LLM to plan a concrete browser task
3. **Browser Agent** connects to the Playwright MCP tool via the AgentCube Router
4. **Router** provisions a sandbox pod (or reuses an existing session), signs a JWT, and proxies the request
5. **Playwright MCP Tool** inside the sandbox exposes browser automation tools over MCP
6. **Browser Agent** summarizes the result using the LLM and returns it to the user
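
The plan, call, and summarize flow in the steps above can be pictured as a bounded loop. The sketch below is illustrative only and is not taken from `browser_agent.py`: `plan_next_call`, `call_tool`, and `summarize` are hypothetical callables standing in for the LLM and the MCP client, and the `max_rounds` cap is assumed to play the role of the `MAX_TOOL_ROUNDS` setting.

```python
from typing import Any, Callable, Optional

# Hypothetical shape: a tool call is a (tool_name, arguments) pair.
ToolCall = tuple[str, dict[str, Any]]


def run_browser_task(
    request: str,
    plan_next_call: Callable[[str, list[str]], Optional[ToolCall]],
    call_tool: Callable[[str, dict[str, Any]], str],
    summarize: Callable[[str, list[str]], str],
    max_rounds: int = 10,
) -> dict[str, Any]:
    """Alternate planning and tool calls until the planner stops or the cap is hit."""
    observations: list[str] = []
    steps = 0
    while steps < max_rounds:
        next_call = plan_next_call(request, observations)
        if next_call is None:  # the planner decided it has enough information
            break
        name, arguments = next_call
        observations.append(call_tool(name, arguments))
        steps += 1
    # The summary is produced from whatever was observed, even if the cap was reached.
    return {"answer": summarize(request, observations), "steps": steps}
```

Passing the three collaborators in as callables keeps the loop testable without an LLM or a sandbox; the real agent may structure this differently.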

Session reuse: the `session_id` returned in the first response can be passed in subsequent requests to reuse the same browser sandbox. The MCP server is started with `--shared-browser-context`, so repeated requests can keep the same browser state inside that sandbox.
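
A minimal Python client for this session flow might look like the sketch below. It assumes the agent is reachable over HTTP (for example through `kubectl port-forward`) and that `/chat` accepts a `message` plus an optional `session_id` and returns a `session_id` field, as in the curl examples in this README. The helper names `build_chat_payload` and `chat` are made up for illustration.

```python
import json
import urllib.request
from typing import Any, Optional


def build_chat_payload(message: str, session_id: Optional[str] = None) -> dict[str, Any]:
    """Build the JSON body for /chat; session_id is sent only when reusing a sandbox."""
    payload: dict[str, Any] = {"message": message}
    if session_id:
        payload["session_id"] = session_id
    return payload


def chat(base_url: str, message: str, session_id: Optional[str] = None,
         timeout: float = 300.0) -> dict[str, Any]:
    """POST one request to the agent and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/chat",
        data=json.dumps(build_chat_payload(message, session_id)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A first call such as `first = chat("http://localhost:8000", "...")` followed by `chat("http://localhost:8000", "...", session_id=first["session_id"])` would then target the same browser sandbox.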

## Prerequisites

- AgentCube deployed in a Kubernetes cluster (Router + Workload Manager running)
- An OpenAI-compatible LLM API key
- `kubectl` configured to access the cluster

## Quick Start

### 1. Create the API key secret

```bash
kubectl create secret generic browser-agent-secrets \
  --from-literal=openai-api-key=<YOUR_API_KEY>
```

### 2. Deploy the Playwright MCP Tool (AgentRuntime)

```bash
# Create the AgentRuntime CRD using the official Microsoft image
kubectl apply -f example/browser-agent/browser-use-tool.yaml
```

### 3. Deploy the Browser Agent

```bash
# Build the agent image (from repo root)
docker build -t browser-agent:latest \
  -f example/browser-agent/Dockerfile .

# Deploy
kubectl apply -f example/browser-agent/deployment.yaml
```

### 4. Test

```bash
# Port-forward to the agent
kubectl port-forward deploy/browser-agent 8000:8000

# Send a search request
curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Search for the latest news about Kubernetes 1.33 release"}' \
  | python -m json.tool

# Reuse the same browser session (pass session_id from previous response)
curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Now find the Patch Releases list from the same release", "session_id": "<SESSION_ID>"}' \
  | python -m json.tool
```
Comment on lines +85 to +90
FAUST-BENCHOU (Contributor), Apr 9, 2026

It works well for me:

(base) zhoujinyu@zhoujinyudeMacBook-Air agentcube % curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Search for the latest news about Kubernetes 1.33 release"}' \
  | python -m json.tool

{
    "answer": "Kubernetes 1.33, codenamed \"Octarine,\" was released on April 23, 2025. The latest patch is 1.33.10 (as of March 2026).\n\n**Key Highlights:**\n*   **Stable Features:** Sidecar containers, in-place pod resource resize (Beta), multiple Service CIDRs, and the nftables backend for kube-proxy.\n*   **Beta Features:** User namespaces for Linux pods (default on), asynchronous scheduler preemption, and mounting OCI images as volumes.\n*   **Deprecations:** The stable Endpoints API is deprecated in favor of EndpointSlices. The in-tree `gitRepo` volume driver has been removed.\n\nThe release includes 64 total enhancements from a large community of contributors, with a focus on security, performance, and resource management.",
    "success": true,
    "session_id": "7b6d29da-62df-4e03-b34a-d73e555fec5d",
    "urls_visited": [],
    "steps": 10
}
(base) zhoujinyu@zhoujinyudeMacBook-Air agentcube % curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Now find the deprecation list from the same release", "session_id": "7b6d29da-62df-4e03-b34a-d73e555fec5d"}' \
  | python -m json.tool
{
    "answer": "Based on the browser result, I cannot retrieve the specific deprecation list you requested due to technical constraints. However, here is how you can find it yourself and a summary of common deprecations in recent Kubernetes releases:\n\n**To find the exact list:**\n1. Go to the official Kubernetes release notes: [https://kubernetes.io/releases/](https://kubernetes.io/releases/)\n2. Select the specific release you need.\n3. Look for sections titled \"Deprecations\" or search for \"deprecated\" within the notes.\n\n**Common deprecation categories in recent releases typically include:**\n- **Legacy and beta APIs** being phased out in favor of stable versions.\n- **In-tree cloud provider plugins** moving to out-of-tree components.\n- **Older kubectl flags and commands** with newer alternatives.\n- **Storage and network plugins** transitioning to CSI and newer standards.\n\nFor the precise and complete list, please refer to the official release notes for your specific Kubernetes version.",
    "success": true,
    "session_id": "7b6d29da-62df-4e03-b34a-d73e555fec5d",
    "urls_visited": [],
    "steps": 4
}

However, the second example may be too hard for the agent; it answered "I cannot retrieve the specific deprecation list you requested due to technical constraints".
Maybe it could be changed to:

curl -s http://localhost:8000/chat \
 -H 'Content-Type: application/json' \
 -d '{"message": "Now find the Patch Releases list from the same release", "session_id": "<SESSION_ID>"}' \
 | python -m json.tool

or another, easier question, since we only need to show here that the session ID works.

Member Author


Good suggestion.


## Configuration

### Browser Agent (Deployment)

| Env Var | Default | Description |
|---------|---------|-------------|
| `OPENAI_API_KEY` | (required) | LLM API key |
| `OPENAI_API_BASE` | `https://api.openai.com/v1` | LLM API base URL |
| `OPENAI_MODEL` | `gpt-4o` | LLM model name |
| `ROUTER_URL` | `http://router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |
Copilot AI Apr 3, 2026


The documented default ROUTER_URL here (http://router.agentcube.svc.cluster.local:8080) doesn’t match the service name used by the Helm chart (agentcube-router) and your provided deployment.yaml (http://agentcube-router.agentcube.svc.cluster.local:8080). This will cause confusion / misconfiguration when users follow the README defaults. Update the README (and ideally browser_agent.py defaults) to a consistent Router service address.

Suggested change
- | `ROUTER_URL` | `http://router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |
+ | `ROUTER_URL` | `http://agentcube-router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |

| `PLAYWRIGHT_MCP_NAME` | `browser-use-tool` | Name of the Playwright MCP AgentRuntime CRD |
| `PLAYWRIGHT_MCP_NAMESPACE` | `default` | Namespace of the AgentRuntime |
| `BROWSER_TASK_TIMEOUT` | `300` | Timeout (seconds) for browser task execution |
| `MAX_TOOL_ROUNDS` | `10` | Maximum LLM-to-tool interaction rounds |
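
As a rough illustration of how these settings could be consumed, the sketch below reads them from the environment with the documented defaults. It is an assumption about structure, not the actual code in `browser_agent.py`: `AgentConfig` and `load_config` are hypothetical names, and the `ROUTER_URL` default uses the `agentcube-router` service name proposed in the review comment above rather than the value in the table row.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgentConfig:
    openai_api_key: str
    openai_api_base: str
    openai_model: str
    router_url: str
    playwright_mcp_name: str
    playwright_mcp_namespace: str
    browser_task_timeout: int  # seconds
    max_tool_rounds: int


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Read settings from the environment, falling back to the documented defaults."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return AgentConfig(
        openai_api_key=api_key,
        openai_api_base=env.get("OPENAI_API_BASE", "https://api.openai.com/v1"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o"),
        # Assumed default: the service name from the review suggestion.
        router_url=env.get(
            "ROUTER_URL", "http://agentcube-router.agentcube.svc.cluster.local:8080"
        ),
        playwright_mcp_name=env.get("PLAYWRIGHT_MCP_NAME", "browser-use-tool"),
        playwright_mcp_namespace=env.get("PLAYWRIGHT_MCP_NAMESPACE", "default"),
        browser_task_timeout=int(env.get("BROWSER_TASK_TIMEOUT", "300")),
        max_tool_rounds=int(env.get("MAX_TOOL_ROUNDS", "10")),
    )
```

Failing fast on a missing `OPENAI_API_KEY` surfaces a misconfigured secret at startup instead of on the first request.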

### Playwright MCP Tool (AgentRuntime)

| Flag | Value | Description |
|---------|---------|-------------|
| `--port` | `8931` | MCP HTTP endpoint port |
| `--host` | `0.0.0.0` | Bind address |
| `--shared-browser-context` | enabled | Reuse the same browser context for repeat clients in the same sandbox |
| `--caps=vision` | enabled | Coordinate-based actions and screenshots |

## Files

```
example/browser-agent/
├── README.md # This file
├── browser_agent.py # Browser Agent: LLM planner + MCP client
├── browser-use-tool.yaml # AgentRuntime CRD for the Playwright MCP tool
├── deployment.yaml # K8s Deployment for the browser agent
├── Dockerfile # Dockerfile for browser agent
└── requirements.txt # Python deps for browser agent
```

## Why This Design

- `playwright-python` is a library, not a tool server. By itself it does not give AgentCube an MCP or HTTP endpoint to proxy.
- `microsoft/playwright-mcp` is already a real browser tool server with official Docker packaging and HTTP transport support.
- This removes the custom in-repo tool wrapper and keeps the sandboxed browser component as a pure tool.
39 changes: 39 additions & 0 deletions example/browser-agent/browser-use-tool.yaml
@@ -0,0 +1,39 @@
apiVersion: runtime.agentcube.volcano.sh/v1alpha1
kind: AgentRuntime
metadata:
  name: browser-use-tool
  namespace: default
spec:
  targetPort:
    - pathPrefix: "/"
      port: 8931
      protocol: "HTTP"
  podTemplate:
    labels:
      app: browser-use-tool
    spec:
      containers:
        - name: playwright-mcp
          image: mcr.microsoft.com/playwright/mcp:latest
          imagePullPolicy: IfNotPresent
          args:
            - "--port"
            - "8931"
            - "--host"
            - "0.0.0.0"
            - "--allowed-hosts"
            - "*"
Comment on lines +24 to +25
Copilot AI Apr 3, 2026


This AgentRuntime config passes --allowed-hosts *, effectively disabling host allowlisting in the Playwright MCP server. That makes SSRF/internal-network access much easier if an untrusted prompt/user controls browsing targets. Consider removing this flag (use tool defaults) or setting a restrictive allowlist that matches your intended use (and/or enforce allowlists at the network policy level).

Suggested change
- "--allowed-hosts"
- "*"

            - "--shared-browser-context"
            - "--caps=vision"
          ports:
            - containerPort: 8931
              protocol: TCP
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
  sessionTimeout: "30m"
  maxSessionDuration: "8h"