Add browser agent example with session reuse by hzxuzhonghu · Pull Request #255 · volcano-sh/agentcube

hzxuzhonghu · 2026-04-03T09:50:43Z

Summary

- add a browser agent example backed by the Playwright MCP runtime
- deploy the browser agent and browser-use tool with Kubernetes manifests and a Dockerfile
- preserve AgentCube session reuse across MCP calls and stop cleanly when the tool-call limit is reached

Testing

/root/go/src/agent-cube/.venv/bin/python -m py_compile example/browser-agent/browser_agent.py

Notes

unrelated local changes in go.mod and generated API files were intentionally left out of this PR because they contain stash conflict markers and are not part of the browser-agent change set

Fix #254

Copilot

Pull request overview

Adds a new end-to-end “browser agent” example that orchestrates a Playwright MCP tool runtime through the AgentCube Router with session reuse, plus accompanying Kubernetes/Docker assets and refreshed architecture documentation.

Changes:

- Introduces example/browser-agent/ (FastAPI service + Dockerfile + manifests) to call a Playwright MCP AgentRuntime via the Router and reuse x-agentcube-session-id.
- Adds request body size limiting middleware to PicoD’s Gin server.
- Replaces the docs architecture “overview” page with a new, expanded architecture.md.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 10 comments.

File	Description
pkg/picod/server.go	Adds global Gin middleware for 32MB request body limit + multipart memory cap.
example/browser-agent/requirements.txt	Defines Python dependencies for the browser agent service.
example/browser-agent/README.md	Documents the browser agent architecture, deployment, and session reuse flow.
example/browser-agent/Dockerfile	Builds the browser agent container image (uv + Python 3.12).
example/browser-agent/deployment.yaml	Deploys the browser agent into Kubernetes with env-based configuration.
example/browser-agent/browser-use-tool.yaml	Defines the Playwright MCP tool as an AgentRuntime workload.
example/browser-agent/browser_agent.py	Implements FastAPI endpoint, LLM planning/summarization, and MCP client calls via Router.
docs/agentcube/docs/architecture/overview.md	Removes the prior architecture overview doc page.
docs/agentcube/docs/architecture/architecture.md	Adds a new, more comprehensive architecture document.

Copilot · 2026-04-03T09:55:06Z

+	// Limit request body size to 32 MB to prevent DoS attacks
+	engine.Use(func(c *gin.Context) {
+		c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, 32<<20)
+		c.Next()
+	})
+	engine.MaxMultipartMemory = 32 << 20


AuthMiddleware() in pkg/picod/auth.go already wraps c.Request.Body with http.MaxBytesReader using the MaxBodySize constant (32MB). Adding another MaxBytesReader middleware here is redundant and uses a separate hard-coded limit, increasing the risk of the limits drifting. Consider removing this middleware and using engine.MaxMultipartMemory = MaxBodySize (or a shared constant) if you want multipart parsing to match the enforced body size.

Suggested change

-	// Limit request body size to 32 MB to prevent DoS attacks
-	engine.Use(func(c *gin.Context) {
-		c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, 32<<20)
-		c.Next()
-	})
-	engine.MaxMultipartMemory = 32 << 20
+	engine.MaxMultipartMemory = MaxBodySize

Copilot · 2026-04-03T09:55:07Z

+class PlaywrightMCPClient:
+    """Client for calling the Playwright MCP tool via AgentCube Router."""
+
+    def __init__(self):
+        self.base_url = (
+            f"{ROUTER_URL}/v1/namespaces/{PLAYWRIGHT_MCP_NAMESPACE}"
+            f"/agent-runtimes/{PLAYWRIGHT_MCP_NAME}/invocations/mcp"
+        )
+        self.session_id: Optional[str] = None
+


PlaywrightMCPClient stores a mutable self.session_id and run_task() falls back to session_id or self.session_id. Because the FastAPI app uses a single global browser_client, concurrent or unrelated requests can accidentally reuse another caller's AgentCube session (cross-user sandbox reuse) and leak browser state. Make session reuse strictly client-provided (require/echo session_id) or scope the client/session to a single request (no global mutable state).

Copilot · 2026-04-03T09:55:07Z

+# ========================= FastAPI App =========================
+app = FastAPI(title="Browser Agent", description="AI agent with Playwright MCP tool")
+browser_client = PlaywrightMCPClient()
+


browser_client is instantiated as a module-level singleton. Combined with the client's internal mutable session tracking, this can cause request cross-talk under concurrency. Even if you remove self.session_id, consider constructing the MCP client per request (or keeping it stateless) to avoid any future shared mutable state issues.
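
One way to address both comments is to keep the client free of mutable attributes and pass the session ID through every call. The following is only a minimal sketch: `StatelessBrowserClient` and `TaskResult` are hypothetical names, and `uuid4` stands in for whatever session ID the Router would actually return (presumably via `x-agentcube-session-id`).

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskResult:
    """Hypothetical result type: the answer plus the session it ran in."""
    answer: str
    session_id: str


class StatelessBrowserClient:
    """Hypothetical stand-in for PlaywrightMCPClient with no stored session."""

    def run_task(self, task: str, session_id: Optional[str] = None) -> TaskResult:
        # Reuse a session only when the caller supplies one; otherwise start fresh.
        # A real client would forward the ID to the Router and read the ID back
        # from the response; uuid4 is a placeholder for that round trip.
        effective_id = session_id or str(uuid.uuid4())
        return TaskResult(answer=f"handled: {task}", session_id=effective_id)


# Sharing one instance is safe because it carries no per-request state.
client = StatelessBrowserClient()
first = client.run_task("open example.com")
follow_up = client.run_task("read the title", session_id=first.session_id)
other_caller = client.run_task("unrelated task")
```

With this shape, session reuse still works, but only for callers that echo back the `session_id` they received.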

Copilot · 2026-04-03T09:55:07Z

+    except Exception as e:
+        log.exception("Browser task execution failed")
+        return ChatResponse(answer=f"Error: {e}", success=False)


The general exception handler returns ChatResponse(answer=f"Error: {e}"), which can leak internal exception details (URLs, headers, stack context) to callers. Prefer returning a generic user-facing error message and logging the exception server-side (you already call log.exception(...)).

Suggested change

-    except Exception as e:
-        log.exception("Browser task execution failed")
-        return ChatResponse(answer=f"Error: {e}", success=False)
+    except Exception:
+        log.exception("Browser task execution failed")
+        return ChatResponse(
+            answer="The browser task failed due to an internal error. Please try again.",
+            success=False,
+        )

Copilot · 2026-04-03T09:55:07Z

+            - "--allowed-hosts"
+            - "*"


This AgentRuntime config passes --allowed-hosts *, effectively disabling host allowlisting in the Playwright MCP server. That makes SSRF/internal-network access much easier if an untrusted prompt/user controls browsing targets. Consider removing this flag (use tool defaults) or setting a restrictive allowlist that matches your intended use (and/or enforce allowlists at the network policy level).

Suggested change

-            - "--allowed-hosts"
-            - "*"
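
Independently of what the flag itself enforces, a restrictive allowlist on browsing targets could also be applied in the agent before a URL is handed to the tool. This is a rough sketch under stated assumptions: `ALLOWED_HOSTS` and `is_allowed` are hypothetical and not part of this PR or of Playwright MCP.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = {"kubernetes.io", "github.com"}


def is_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is allowlisted.

    A host matches if it equals an allowlisted name or is a subdomain of it.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

A check like this does not replace network policy, since redirects and DNS tricks can still reach internal addresses; it only narrows what the agent asks the browser to open.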

Copilot · 2026-04-03T09:55:08Z

+| `OPENAI_API_KEY` | (required) | LLM API key |
+| `OPENAI_API_BASE` | `https://api.openai.com/v1` | LLM API base URL |
+| `OPENAI_MODEL` | `gpt-4o` | LLM model name |
+| `ROUTER_URL` | `http://router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |


The documented default ROUTER_URL here (http://router.agentcube.svc.cluster.local:8080) doesn’t match the service name used by the Helm chart (agentcube-router) and your provided deployment.yaml (http://agentcube-router.agentcube.svc.cluster.local:8080). This will cause confusion / misconfiguration when users follow the README defaults. Update the README (and ideally browser_agent.py defaults) to a consistent Router service address.

Suggested change

-| `ROUTER_URL` | `http://router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |
+| `ROUTER_URL` | `http://agentcube-router.agentcube.svc.cluster.local:8080` | AgentCube Router URL |

Copilot · 2026-04-03T09:55:08Z

+OPENAI_API_BASE = os.environ.get("OPENAI_API_BASE", "https://api.openai.com/v1")
+OPENAI_MODEL = os.environ.get("OPENAI_MODEL", "gpt-4o")
+
+ROUTER_URL = os.environ.get("ROUTER_URL", "http://router.agentcube.svc.cluster.local:8080")


ROUTER_URL defaults to http://router.agentcube.svc.cluster.local:8080, but the Helm chart’s Router Service is named agentcube-router (and the provided Deployment sets that value). Consider updating this default to the chart’s service DNS name so running the agent outside Kubernetes manifests behaves consistently with the documented install path.

Suggested change

-ROUTER_URL = os.environ.get("ROUTER_URL", "http://router.agentcube.svc.cluster.local:8080")
+ROUTER_URL = os.environ.get("ROUTER_URL", "http://agentcube-router.agentcube.svc.cluster.local:8080")

Copilot · 2026-04-03T09:55:08Z

+              value: "https://api.deepseek.com/v1" # Change to your LLM API base
+            - name: OPENAI_MODEL
+              value: "deepseek-chat"


This Deployment hard-codes OPENAI_API_BASE/OPENAI_MODEL to DeepSeek-specific values, while the README and code defaults describe an OpenAI-compatible base at https://api.openai.com/v1 and gpt-4o. Consider aligning the example manifests with the README defaults (or clearly documenting that the manifest is configured for DeepSeek by default) to avoid surprise misconfiguration.

Suggested change

-              value: "https://api.deepseek.com/v1" # Change to your LLM API base
-            - name: OPENAI_MODEL
-              value: "deepseek-chat"
+              value: "https://api.openai.com/v1" # Change to your LLM API base
+            - name: OPENAI_MODEL
+              value: "gpt-4o"

Copilot · 2026-04-03T09:55:08Z

+# AgentCube Architecture
+
+> A Kubernetes-native platform that treats AI agents, code interpreters, MCP servers, and other AI tool runtimes (browser-use, computer-use, etc.) as first-class, serverless workloads with microVM-based sandbox isolation.
+
+---


overview.md has been removed in favor of this new architecture.md, but other docs still link to ./architecture/overview.md (e.g., docs/agentcube/docs/getting-started.md). Please update/redirect those references to avoid broken links in the rendered documentation site.

Copilot · 2026-04-03T09:55:08Z

+| **AgentRuntime** | `runtime.agentcube.io/v1alpha1` | AgentCube | User-facing agent runtime definition |
+| **CodeInterpreter** | `runtime.agentcube.io/v1alpha1` | AgentCube | Code execution environment with warm pool support |


The CRD API group in this table (runtime.agentcube.io/v1alpha1) doesn’t match the actual CRD group used elsewhere in the repo/examples (runtime.agentcube.volcano.sh/v1alpha1). Please update the API group here to the correct value to avoid misleading users copying these manifests.

Suggested change

-| **AgentRuntime** | `runtime.agentcube.io/v1alpha1` | AgentCube | User-facing agent runtime definition |
-| **CodeInterpreter** | `runtime.agentcube.io/v1alpha1` | AgentCube | Code execution environment with warm pool support |
+| **AgentRuntime** | `runtime.agentcube.volcano.sh/v1alpha1` | AgentCube | User-facing agent runtime definition |
+| **CodeInterpreter** | `runtime.agentcube.volcano.sh/v1alpha1` | AgentCube | Code execution environment with warm pool support |

gemini-code-assist

Code Review

This pull request introduces comprehensive architecture documentation for AgentCube and adds a practical 'Browser Agent' example that utilizes the Playwright MCP tool within a sandboxed environment. It also includes security hardening in the picod daemon by enforcing a 32MB request body limit. Review feedback identifies a critical thread-safety issue where session_id is stored in a global singleton, which could lead to session leakage in concurrent environments. Additionally, there are concerns regarding the use of non-standard MCP attributes, fragile JSON extraction from LLM responses, and unpopulated fields in the chat response.

gemini-code-assist · 2026-04-03T09:55:55Z

+            f"{ROUTER_URL}/v1/namespaces/{PLAYWRIGHT_MCP_NAMESPACE}"
+            f"/agent-runtimes/{PLAYWRIGHT_MCP_NAME}/invocations/mcp"
+        )
+        self.session_id: Optional[str] = None


Storing session_id as an instance variable in a global singleton (browser_client at line 322) is not thread-safe in a FastAPI environment. Concurrent requests from different users will overwrite this value, leading to session leakage or incorrect session reuse. The session_id should be managed per-request or passed explicitly through the call stack.

gemini-code-assist · 2026-04-03T09:55:55Z

+    if getattr(result, "structuredContent", None):
+        parts.append(json.dumps(result.structuredContent, ensure_ascii=True))


The structuredContent attribute is not part of the standard mcp.types.CallToolResult in the official Model Context Protocol specification. This check will likely always return None, making the code unreachable. Please verify if this is a custom extension or if you intended to process the content list instead.

gemini-code-assist · 2026-04-03T09:55:55Z

+        if content.startswith("```"):
+            content = content.split("\n", 1)[1].rsplit("```", 1)[0].strip()
+        plan = json.loads(content)


Manual string splitting to extract JSON from markdown code blocks is fragile and will fail if the LLM output format varies slightly (e.g., missing newlines or different tick styles). Consider using a more robust approach like a regular expression or a dedicated JSON extraction utility.

Suggested change

-        if content.startswith("```"):
-            content = content.split("\n", 1)[1].rsplit("```", 1)[0].strip()
-        plan = json.loads(content)
+        import re
+        content = planning_response.content.strip()
+        json_match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", content, re.DOTALL)
+        if json_match:
+            content = json_match.group(1)
+        plan = json.loads(content)
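
As a variation on the suggestion above, the extraction could also tolerate replies with no fence at all, or with prose around the object, by letting the JSON decoder find where the object ends. This is a sketch only; `extract_json_object` is a hypothetical helper, not code from this PR.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # markdown code fence, built this way to keep the example readable
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_json_object(text: str) -> Any:
    """Extract the first JSON object from an LLM reply.

    Handles fenced blocks (with or without a language tag) and objects
    surrounded by prose. Raises ValueError if no object can be parsed.
    """
    text = text.strip()
    match = _FENCED.search(text)
    if match:
        text = match.group(1)
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one value starting at `start` and ignores trailing text.
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj
```

Because `json.JSONDecodeError` subclasses `ValueError`, a caller can handle both failure modes with a single `except ValueError`.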

gemini-code-assist · 2026-04-03T09:55:55Z

+        answer=answer,
+        success=success,
+        session_id=result.get("session_id"),
+        urls_visited=result.get("urls_visited", []),


The urls_visited field in ChatResponse is never populated because the run_task method does not track or return visited URLs. This results in an empty list being returned to the client even after successful browsing tasks.
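
One possible way to populate the field is to record navigation targets as tool calls are issued. The sketch below assumes each call is available as a dict with `name` and `arguments` keys and that navigation goes through a tool named `browser_navigate` taking a `url` argument; both are assumptions about the runtime, and `collect_urls` is a hypothetical helper.

```python
from typing import Any


def collect_urls(tool_calls: list[dict[str, Any]]) -> list[str]:
    """Return the distinct URLs passed to navigation calls, in first-seen order."""
    seen: list[str] = []
    for call in tool_calls:
        # Assumed tool name and argument key; adjust to the actual MCP tool schema.
        if call.get("name") != "browser_navigate":
            continue
        url = (call.get("arguments") or {}).get("url")
        if url and url not in seen:
            seen.append(url)
    return seen
```

run_task could then return this list alongside the answer so that `result.get("urls_visited", [])` has something to read.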

codecov-commenter · 2026-04-03T10:08:56Z

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.37%. Comparing base (845b798) to head (2ba0e12).
⚠️ Report is 163 commits behind head on main.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #255      +/-   ##
==========================================
+ Coverage   35.60%   43.37%   +7.76%     
==========================================
  Files          29       30       +1     
  Lines        2533     2610      +77     
==========================================
+ Hits          902     1132     +230     
+ Misses       1505     1355     -150     
+ Partials      126      123       -3

Flag	Coverage Δ
unittests	`43.37% <ø> (+7.76%)`	⬆️


Signed-off-by: Zhonghu Xu <[email protected]>

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Copilot · 2026-04-07T01:55:24Z

+        captured_session_id = session_id or self.session_id
+        transport_client_holder: dict[str, httpx.AsyncClient] = {}
+        tool_round_limit = max_rounds or max_steps


run_task() falls back to self.session_id when session_id isn’t provided. Since browser_client is a module-level singleton, this can cause unintended cross-request/session reuse (and potential data leakage) between different callers. Consider removing the implicit fallback (require explicit session_id to reuse) or storing session state per client/request rather than on a shared instance.

This is expected; this example demonstrates the session reuse capability.

Copilot · 2026-04-07T01:55:25Z

+        # Extract JSON from LLM response (handle markdown code blocks)
+        content = planning_response.content.strip()
+        if content.startswith("```"):
+            content = content.split("\n", 1)[1].rsplit("```", 1)[0].strip()
+        plan = json.loads(content)


planning_response.content isn’t guaranteed to be a string (LangChain message content can be str | list[...]). Calling .strip() directly can raise at runtime. Consider normalizing with _message_content_to_text(planning_response.content) before JSON extraction.
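
The body of `_message_content_to_text` is not shown in this thread, so the following is only a guess at what such a normalizer might look like. It assumes list content is made of plain strings and dicts of the form `{"type": "text", "text": ...}`, and it silently drops other block types; `message_content_to_text` is a hypothetical name.

```python
from typing import Any, Union


def message_content_to_text(content: Union[str, list[Any]]) -> str:
    """Flatten chat message content (a string or a list of blocks) to one string."""
    if isinstance(content, str):
        return content
    parts: list[str] = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(str(block.get("text", "")))
        # Non-text blocks (images, tool calls, ...) are ignored in this sketch.
    return "".join(parts)
```

Calling `.strip()` on the result is then safe regardless of which content shape the model returned.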

hzxuzhonghu · 2026-04-08T01:53:55Z

/assign @acsoto

FAUST-BENCHOU

works well for me.
lgtm.

FAUST-BENCHOU · 2026-04-09T14:36:15Z

+# Reuse the same browser session (pass session_id from previous response)
+curl -s http://localhost:8000/chat \
+  -H 'Content-Type: application/json' \
+  -d '{"message": "Now find the deprecation list from the same release", "session_id": "<SESSION_ID>"}' \
+  | python -m json.tool
+```


it works well for me

(base) zhoujinyu@zhoujinyudeMacBook-Air agentcube % curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Search for the latest news about Kubernetes 1.33 release"}' \
  | python -m json.tool
{
    "answer": "Kubernetes 1.33, codenamed \"Octarine,\" was released on April 23, 2025. The latest patch is 1.33.10 (as of March 2026).\n\n**Key Highlights:**\n* **Stable Features:** Sidecar containers, in-place pod resource resize (Beta), multiple Service CIDRs, and the nftables backend for kube-proxy.\n* **Beta Features:** User namespaces for Linux pods (default on), asynchronous scheduler preemption, and mounting OCI images as volumes.\n* **Deprecations:** The stable Endpoints API is deprecated in favor of EndpointSlices. The in-tree `gitRepo` volume driver has been removed.\n\nThe release includes 64 total enhancements from a large community of contributors, with a focus on security, performance, and resource management.",
    "success": true,
    "session_id": "7b6d29da-62df-4e03-b34a-d73e555fec5d",
    "urls_visited": [],
    "steps": 10
}
(base) zhoujinyu@zhoujinyudeMacBook-Air agentcube % curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Now find the deprecation list from the same release", "session_id": "7b6d29da-62df-4e03-b34a-d73e555fec5d"}' \
  | python -m json.tool
{
    "answer": "Based on the browser result, I cannot retrieve the specific deprecation list you requested due to technical constraints. However, here is how you can find it yourself and a summary of common deprecations in recent Kubernetes releases:\n\n**To find the exact list:**\n1. Go to the official Kubernetes release notes: [https://kubernetes.io/releases/](https://kubernetes.io/releases/)\n2. Select the specific release you need.\n3. Look for sections titled \"Deprecations\" or search for \"deprecated\" within the notes.\n\n**Common deprecation categories in recent releases typically include:**\n- **Legacy and beta APIs** being phased out in favor of stable versions.\n- **In-tree cloud provider plugins** moving to out-of-tree components.\n- **Older kubectl flags and commands** with newer alternatives.\n- **Storage and network plugins** transitioning to CSI and newer standards.\n\nFor the precise and complete list, please refer to the official release notes for your specific Kubernetes version.",
    "success": true,
    "session_id": "7b6d29da-62df-4e03-b34a-d73e555fec5d",
    "urls_visited": [],
    "steps": 4
}

But the second example may be too hard for the agent: it answered "I cannot retrieve the specific deprecation list you requested due to technical constraints".
Maybe it can be changed to

curl -s http://localhost:8000/chat \
  -H 'Content-Type: application/json' \
  -d '{"message": "Now find the Patch Releases list from the same release", "session_id": "<SESSION_ID>"}' \
  | python -m json.tool

or another easier question, since we only need to prove that the session ID works here.

good suggestion
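
If the goal is only to show that the session ID round-trips, the two curl calls can also be scripted. The sketch below uses the `/chat` endpoint and the `message` / `session_id` fields seen in the outputs above; `http_post_json` and `check_session_reuse` are hypothetical helper names, and the transport is injectable so the check can run without a live agent.

```python
import json
import urllib.request
from typing import Callable

CHAT_URL = "http://localhost:8000/chat"  # same address as in the curl examples


def http_post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_session_reuse(post: Callable[[str, dict], dict] = http_post_json) -> str:
    """Send two chat requests and confirm the second one keeps the first session."""
    first = post(CHAT_URL, {"message": "Search for the latest news about Kubernetes 1.33 release"})
    session_id = first["session_id"]
    second = post(
        CHAT_URL,
        {"message": "Now find the Patch Releases list from the same release", "session_id": session_id},
    )
    if second["session_id"] != session_id:
        raise RuntimeError("the agent did not reuse the session")
    return session_id
```

Against a running agent, calling `check_session_reuse()` with no arguments performs the same two requests as the README commands.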

acsoto

LGTM

volcano-sh-bot · 2026-04-10T03:44:58Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: acsoto
Once this PR has been reviewed and has the lgtm label, please ask for approval from hzxuzhonghu. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

example/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Zhonghu Xu <[email protected]>

Copilot AI review requested due to automatic review settings April 3, 2026 09:50

volcano-sh-bot requested review from LiZhenCheng9527, YaoZengzeng and acsoto April 3, 2026 09:50

volcano-sh-bot added the size/XXL label Apr 3, 2026

Copilot started reviewing on behalf of hzxuzhonghu April 3, 2026 09:51 View session

Copilot AI reviewed Apr 3, 2026

gemini-code-assist bot reviewed Apr 3, 2026

hzxuzhonghu force-pushed the browser-use branch from a9d6a86 to 5cf0aad Compare April 7, 2026 01:42

volcano-sh-bot added size/XL and removed size/XXL labels Apr 7, 2026

Copilot AI review requested due to automatic review settings April 7, 2026 01:50

hzxuzhonghu force-pushed the browser-use branch from 5cf0aad to 86b7954 Compare April 7, 2026 01:50

Add browser agent example with session reuse

bf56d9d

Signed-off-by: Zhonghu Xu <[email protected]>

hzxuzhonghu force-pushed the browser-use branch from 86b7954 to bf56d9d Compare April 7, 2026 01:51

Copilot started reviewing on behalf of hzxuzhonghu April 7, 2026 01:51 View session

Copilot AI reviewed Apr 7, 2026

volcano-sh-bot assigned acsoto Apr 8, 2026

hzxuzhonghu mentioned this pull request Apr 9, 2026

Add a sample on browser-use #254

Closed

FAUST-BENCHOU reviewed Apr 9, 2026

acsoto approved these changes Apr 10, 2026

address comment

2ba0e12

Signed-off-by: Zhonghu Xu <[email protected]>

hzxuzhonghu merged commit 45b3d5d into volcano-sh:main Apr 15, 2026
11 of 12 checks passed

hzxuzhonghu deleted the browser-use branch April 15, 2026 02:14

