Add browser agent example with session reuse #255
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # Browser Agent Image | ||
| # Build context: repository root | ||
| # | ||
| # Build: | ||
| # docker build -t browser-agent:latest -f example/browser-agent/Dockerfile . | ||
|
|
||
| FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim | ||
|
|
||
| WORKDIR /app | ||
|
|
||
| COPY example/browser-agent/requirements.txt ./ | ||
| RUN uv venv && uv pip install -r requirements.txt | ||
|
|
||
| COPY example/browser-agent/browser_agent.py ./ | ||
|
|
||
| ENV PYTHONPATH="/app" \ | ||
| PYTHONDONTWRITEBYTECODE=1 \ | ||
| PYTHONUNBUFFERED=1 | ||
|
|
||
| EXPOSE 8000 | ||
|
|
||
| CMD ["uv", "run", "browser_agent.py"] |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,132 @@ | ||||||
| # Browser Agent with Playwright MCP Tool | ||||||
|
|
||||||
| > An AI-powered browser agent that handles web search and analysis requests, | ||||||
| > using the official [Playwright MCP](https://github.com/microsoft/playwright-mcp) | ||||||
| > tool running in an isolated AgentCube sandbox. | ||||||
|
|
||||||
| ## Architecture | ||||||
|
|
||||||
| ``` | ||||||
| ┌───────────────┐ ┌────────────────┐ ┌───────────────────────────────┐ | ||||||
| │ Client │──HTTP──▶ Browser Agent │──HTTP──▶ Router (AgentCube) │ | ||||||
| │ (curl/SDK) │ │ (Deployment) │ │ session mgmt + JWT + proxy │ | ||||||
| └───────────────┘ └────────────────┘ └───────────────┬───────────────┘ | ||||||
| │ reverse proxy | ||||||
| ┌───────────────▼───────────────┐ | ||||||
| │ Playwright MCP Tool (sandbox) │ | ||||||
| │ AgentRuntime microVM pod │ | ||||||
| │ official MCP browser service │ | ||||||
| └───────────────────────────────┘ | ||||||
| ``` | ||||||
|
|
||||||
| ### Components | ||||||
|
|
||||||
| | Component | Type | Image | Description | | ||||||
| |-----------|------|-------|-------------| | ||||||
| | **Playwright MCP Tool** | `AgentRuntime` CRD | `mcr.microsoft.com/playwright/mcp:latest` | Official Playwright MCP container from Microsoft. Runs as a real browser tool server in the sandbox, not as a custom in-repo agent. | | ||||||
| | **Browser Agent** | `Deployment` | `browser-agent:latest` | LLM-powered orchestrator that receives user requests, plans browser tasks, and calls the Playwright MCP tool via the AgentCube Router. | | ||||||
|
|
||||||
| ### How It Works | ||||||
|
|
||||||
| 1. **User sends a request** (e.g., "Search for the latest Kubernetes release notes") | ||||||
| 2. **Browser Agent** uses an LLM to plan a concrete browser task | ||||||
| 3. **Browser Agent** connects to the Playwright MCP tool via the AgentCube Router | ||||||
| 4. **Router** provisions a sandbox pod (or reuses an existing session), signs a JWT, and proxies the request | ||||||
| 5. **Playwright MCP Tool** inside the sandbox exposes browser automation tools over MCP | ||||||
| 6. **Browser Agent** summarizes the result using the LLM and returns it to the user | ||||||
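The steps above can be sketched as one small orchestration function. This is only an illustration of the flow, not the code in `browser_agent.py`: `plan_task`, `call_browser_tool`, and `summarize` are hypothetical callables standing in for the LLM planner, the MCP call through the Router, and the LLM summarizer, and the shape of the returned dict is an assumption.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces; the real agent's signatures may differ.
Planner = Callable[[str], str]
ToolCaller = Callable[[str, Optional[str]], Tuple[str, str]]
Summarizer = Callable[[str, str], str]


def handle_request(
    message: str,
    plan_task: Planner,
    call_browser_tool: ToolCaller,
    summarize: Summarizer,
    session_id: Optional[str] = None,
) -> dict:
    """Plan a browser task, run it in the sandbox, and summarize the result."""
    # Step 2: turn the user's request into a concrete browser task.
    task = plan_task(message)
    # Steps 3-5: the tool call goes through the Router, which creates or
    # reuses a sandbox session; the caller gets back the session ID used.
    raw_result, used_session = call_browser_tool(task, session_id)
    # Step 6: condense the raw tool output into an answer for the user.
    answer = summarize(message, raw_result)
    return {"response": answer, "session_id": used_session}
```

Passing the three stages in as callables keeps the flow testable without an LLM or a cluster: each stage can be replaced by a stub.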
|
|
||||||
| Session reuse: the `session_id` returned in the first response can be passed in subsequent requests to reuse the same browser sandbox. The MCP server is started with `--shared-browser-context`, so repeated requests can keep the same browser state inside that sandbox. | ||||||
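From the client side, session reuse amounts to copying `session_id` from one reply into the next request. The sketch below uses only the Python standard library. It assumes the `/chat` endpoint accepts the JSON body shown in this README's curl examples and that the reply is a JSON object with a top-level `session_id`; the helper names (`build_chat_payload`, `chat`, `chat_twice`) and the timeout value are made up for illustration.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple


def build_chat_payload(message: str, session_id: Optional[str] = None) -> dict:
    """Build the JSON body for POST /chat; omit session_id on the first call."""
    payload = {"message": message}
    if session_id:
        payload["session_id"] = session_id
    return payload


def chat(base_url: str, message: str, session_id: Optional[str] = None,
         timeout: float = 300.0) -> dict:
    """Send one chat request to the agent and return the decoded JSON reply."""
    body = json.dumps(build_chat_payload(message, session_id)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def chat_twice(base_url: str, first_message: str, second_message: str,
               send: Callable[..., dict] = chat) -> Tuple[dict, dict]:
    """Send two requests, reusing the session returned by the first one."""
    first = send(base_url, first_message)
    # Assumption: the reply carries the sandbox session under "session_id".
    session_id = first.get("session_id")
    second = send(base_url, second_message, session_id)
    return first, second
```

With the agent reachable on `http://localhost:8000`, `chat_twice("http://localhost:8000", "Search for ...", "Now find ...")` would run both requests against the same sandbox, provided the reply has the assumed shape.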
|
|
||||||
| ## Prerequisites | ||||||
|
|
||||||
| - AgentCube deployed in a Kubernetes cluster (Router + Workload Manager running) | ||||||
| - An OpenAI-compatible LLM API key | ||||||
| - `kubectl` configured to access the cluster | ||||||
|
|
||||||
| ## Quick Start | ||||||
|
|
||||||
| ### 1. Create the API key secret | ||||||
|
|
||||||
| ```bash | ||||||
| kubectl create secret generic browser-agent-secrets \ | ||||||
| --from-literal=openai-api-key=<YOUR_API_KEY> | ||||||
| ``` | ||||||
|
|
||||||
| ### 2. Deploy the Playwright MCP Tool (AgentRuntime) | ||||||
|
|
||||||
| ```bash | ||||||
| # Create the AgentRuntime resource using the official Microsoft image | ||||||
| kubectl apply -f example/browser-agent/browser-use-tool.yaml | ||||||
| ``` | ||||||
|
|
||||||
| ### 3. Deploy the Browser Agent | ||||||
|
|
||||||
| ```bash | ||||||
| # Build the agent image (from repo root) | ||||||
| docker build -t browser-agent:latest \ | ||||||
| -f example/browser-agent/Dockerfile . | ||||||
|
|
||||||
| # Deploy | ||||||
| kubectl apply -f example/browser-agent/deployment.yaml | ||||||
| ``` | ||||||
|
|
||||||
| ### 4. Test | ||||||
|
|
||||||
| ```bash | ||||||
| # Port-forward to the agent (this blocks; run the curl commands in a second terminal) | ||||||
| kubectl port-forward deploy/browser-agent 8000:8000 | ||||||
|
|
||||||
| # Send a search request | ||||||
| curl -s http://localhost:8000/chat \ | ||||||
| -H 'Content-Type: application/json' \ | ||||||
| -d '{"message": "Search for the latest news about Kubernetes 1.33 release"}' \ | ||||||
| | python -m json.tool | ||||||
|
|
||||||
| # Reuse the same browser session (pass session_id from previous response) | ||||||
| curl -s http://localhost:8000/chat \ | ||||||
| -H 'Content-Type: application/json' \ | ||||||
| -d '{"message": "Now find the Patch Releases list from the same release", "session_id": "<SESSION_ID>"}' \ | ||||||
| | python -m json.tool | ||||||
| ``` | ||||||
|
|
||||||
| ## Configuration | ||||||
|
|
||||||
| ### Browser Agent (Deployment) | ||||||
|
|
||||||
| | Env Var | Default | Description | | ||||||
| |---------|---------|-------------| | ||||||
| | `OPENAI_API_KEY` | (required) | LLM API key | | ||||||
| | `OPENAI_API_BASE` | `https://api.openai.com/v1` | LLM API base URL | | ||||||
| | `OPENAI_MODEL` | `gpt-4o` | LLM model name | | ||||||
| | `ROUTER_URL` | `http://router.agentcube.svc.cluster.local:8080` | AgentCube Router URL | | ||||||
Suggested change:
- | `ROUTER_URL` | `http://router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |
+ | `ROUTER_URL` | `http://agentcube-router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |
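A minimal sketch of how an agent could read these settings, assuming they are plain environment variables with the defaults listed in the table. `AgentConfig` and `load_config` are hypothetical names, not taken from `browser_agent.py`. The `ROUTER_URL` default below is the one in the README diff; the review suggestion proposes `http://agentcube-router.agentcube.svc.cluster.local:8080` instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentConfig:
    """Settings from the Configuration table (hypothetical container type)."""
    openai_api_key: str
    openai_api_base: str
    openai_model: str
    router_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> AgentConfig:
    """Read settings from env (defaults to os.environ) and apply defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # The table marks this variable as required, so fail early.
        raise RuntimeError("OPENAI_API_KEY is required")
    return AgentConfig(
        openai_api_key=api_key,
        openai_api_base=env.get("OPENAI_API_BASE", "https://api.openai.com/v1"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o"),
        router_url=env.get(
            "ROUTER_URL", "http://router.agentcube.svc.cluster.local:8080"
        ),
    )
```

Accepting a mapping instead of reading `os.environ` directly makes the defaults easy to check in a unit test.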
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,39 @@ | ||||||
| apiVersion: runtime.agentcube.volcano.sh/v1alpha1 | ||||||
| kind: AgentRuntime | ||||||
| metadata: | ||||||
|   name: browser-use-tool | ||||||
|   namespace: default | ||||||
| spec: | ||||||
|   targetPort: | ||||||
|     - pathPrefix: "/" | ||||||
|       port: 8931 | ||||||
|       protocol: "HTTP" | ||||||
|   podTemplate: | ||||||
|     labels: | ||||||
|       app: browser-use-tool | ||||||
|     spec: | ||||||
|       containers: | ||||||
|         - name: playwright-mcp | ||||||
|           image: mcr.microsoft.com/playwright/mcp:latest | ||||||
|           imagePullPolicy: IfNotPresent | ||||||
|           args: | ||||||
|             - "--port" | ||||||
|             - "8931" | ||||||
|             - "--host" | ||||||
|             - "0.0.0.0" | ||||||
|             - "--allowed-hosts" | ||||||
|             - "*" | ||||||
|
Comment on lines +24 to +25
| - "--allowed-hosts" | |
| - "*" |
It works well for me, but the second example may be too hard for the agent to find:
> I cannot retrieve the specific deprecation list you requested due to technical constraints

Maybe it can be changed to an easier question, since we only need to prove that the session ID works here.
good suggestion