v0.1.0 release doc #266
Open: YaoZengzeng wants to merge 7 commits into volcano-sh:main from YaoZengzeng:release-note
+301
−0
2b8f8f6 v0.1.0 release doc (YaoZengzeng)
ed24459 fix comments (YaoZengzeng)
101bba5 Update docs/release-notes/v0.1.0.md (YaoZengzeng)
81c0619 Update docs/release-notes/v0.1.0.md (YaoZengzeng)
b054f44 Update docs/release-notes/v0.1.0.md (YaoZengzeng)
09cfb52 Update docs/release-notes/v0.1.0.md (YaoZengzeng)
29b1b2b fix comments (YaoZengzeng)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,299 @@ | ||
| # v0.1.0 | ||
|
|
||
| ## Summary | ||
|
|
||
| AgentCube v0.1.0 is the **first official release** of AgentCube, a Volcano subproject that extends Kubernetes with native support for AI agent and code interpreter workloads. This release establishes the foundational architecture: a lightweight HTTP reverse proxy (Router) routes agent invocations to per-session microVM sandboxes, while a Workload Manager controls sandbox lifecycle, warm pools, and garbage collection. A minimal runtime daemon (PicoD) replaces SSH inside sandboxes, providing code execution, file operations, and JWT-based authentication over HTTP with little protocol overhead. Session state is stored in Redis or Valkey, which lets the Router scale horizontally. Two new Kubernetes CRDs, `AgentRuntime` and `CodeInterpreter`, model agent workloads as first-class Kubernetes resources. A Python SDK, LangChain integration, and Dify plugin make AgentCube usable from popular AI frameworks. | ||
|
|
||
| ## What's New | ||
|
|
||
| ### Key Features Overview | ||
|
|
||
| - **Session-Based MicroVM Agent Routing**: Stateful request routing with session affinity, backed by isolated microVM sandboxes per session | ||
| - **AgentRuntime and CodeInterpreter CRDs**: Kubernetes-native abstractions for conversational agent and secure code interpreter workloads | ||
| - **Warm Pool for Fast Cold Starts**: Pre-warmed sandbox pool support for `CodeInterpreter`, reducing invocation latency via `SandboxClaim` adoption | ||
| - **PicoD Runtime Daemon**: Lightweight HTTP daemon replacing SSH inside sandboxes — code execution, file I/O, JWT authentication | ||
| - **JWT Security Chain (Router → PicoD)**: RSA-2048 key pair generated at startup; public key distributed via Kubernetes Secret and injected into sandbox pods | ||
| - **Dual GC Policy (Idle TTL + Max Duration)**: Background garbage collector enforces both idle timeout and hard maximum session duration | ||
| - **Python SDK and AI Framework Integrations**: Out-of-the-box SDK with LangChain and Dify plugin support | ||
|
|
||
| --- | ||
|
|
||
| ### Key Feature Details | ||
|
|
||
| ### Session-Based MicroVM Agent Routing | ||
|
|
||
| **Background and Motivation:** | ||
|
|
||
| AI agent workloads are fundamentally stateful and interactive. A single agent session may span many invocations — tool calls, environment inspections, multi-step reasoning — all requiring the same isolated execution environment. Kubernetes has no native concept of persistent, identity-bound agent sessions. AgentCube fills this gap by mapping session IDs to dedicated microVM sandbox pods. | ||
|
|
||
| The Router acts as the data plane entry point. It reads the `x-agentcube-session-id` request header to look up an existing session in the store, or allocates a new sandbox via the Workload Manager when no session exists. Every response carries the `x-agentcube-session-id` header, enabling stateless clients to maintain session continuity across requests. | ||
|
|
||
| Key Capabilities: | ||
|
|
||
| - **Session affinity via header**: `x-agentcube-session-id` header maps requests to existing sandbox pods | ||
| - **Transparent sandbox allocation**: new sessions trigger automatic sandbox creation with no client-side configuration | ||
| - **Reverse proxy with path-prefix matching**: path-based routing to multiple exposed sandbox ports | ||
| - **HTTP/2 (h2c) support**: low-latency connections to sandbox endpoints | ||
| - **Configurable concurrency limit**: `MaxConcurrentRequests` prevents overload | ||
|
|
||
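The lookup-or-allocate behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not the Router's actual implementation: `SessionRouter`, `allocate_sandbox`, and the in-memory `store` are hypothetical stand-ins for the Router, the Workload Manager call, and the Redis/Valkey store. The sketch also assumes that an unknown session ID is treated like a missing one, which the release notes do not specify.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, Mapping, Tuple

SESSION_HEADER = "x-agentcube-session-id"  # header name taken from the release notes


@dataclass
class SessionRouter:
    """Toy model of the lookup-or-allocate step; not the real Router."""

    # Hypothetical stand-in for the call to the Workload Manager.
    allocate_sandbox: Callable[[], str]
    # Hypothetical stand-in for the session store: session ID -> sandbox address.
    store: Dict[str, str] = field(default_factory=dict)

    def resolve(self, headers: Mapping[str, str]) -> Tuple[str, str]:
        """Return (session_id, sandbox_address) for one request."""
        # HTTP header names are case-insensitive, so normalise before the lookup.
        normalized = {key.lower(): value for key, value in headers.items()}
        session_id = normalized.get(SESSION_HEADER)
        if session_id is not None and session_id in self.store:
            return session_id, self.store[session_id]
        # Assumption: a missing or unknown session ID triggers a new allocation.
        session_id = str(uuid.uuid4())
        self.store[session_id] = self.allocate_sandbox()
        return session_id, self.store[session_id]
```

The returned session ID is what a real response would echo back in the `x-agentcube-session-id` header so the client can send it on the next request.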
| **Router Endpoints:** | ||
|
|
||
| ``` | ||
| POST /v1/namespaces/{namespace}/agent-runtimes/{name}/invocations/*path | ||
| GET /v1/namespaces/{namespace}/agent-runtimes/{name}/invocations/*path | ||
| POST /v1/namespaces/{namespace}/code-interpreters/{name}/invocations/*path | ||
| GET /v1/namespaces/{namespace}/code-interpreters/{name}/invocations/*path | ||
| ``` | ||
|
|
||
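As a small illustration of the endpoint layout, the helper below assembles an invocation path from its parts. The function name `invocation_path` and its parameters are hypothetical and exist only for this sketch; the path shape itself is taken from the endpoint list above.

```python
from urllib.parse import quote

# The two resource segments that appear in the Router endpoint list.
KINDS = ("agent-runtimes", "code-interpreters")


def invocation_path(kind: str, namespace: str, name: str, subpath: str = "") -> str:
    """Build /v1/namespaces/{namespace}/{kind}/{name}/invocations/{subpath}."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    base = (
        f"/v1/namespaces/{quote(namespace, safe='')}"
        f"/{kind}/{quote(name, safe='')}/invocations"
    )
    # The trailing wildcard segment is forwarded to the sandbox as-is.
    return f"{base}/{subpath.lstrip('/')}"
```

For example, `invocation_path("code-interpreters", "default", "my-interpreter", "api/execute")` yields the path used in the `curl` example later in this document.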
| Related: | ||
|
|
||
| - Design Doc: [Router Proposal](../design/router-proposal.md) | ||
| - Contributors: [@volcano-sh](https://github.com/volcano-sh) | ||
|
|
||
| --- | ||
|
|
||
| ### AgentRuntime and CodeInterpreter CRDs (Alpha) | ||
|
|
||
| **Background and Motivation:** | ||
|
|
||
| Two distinct workload profiles emerge in the AI agent space: conversational/tool-using agents that need access to credentials, volumes, and custom networking; and short-lived code interpreters that require strict isolation and resource caps. Modeling both as first-class Kubernetes CRDs enables declarative configuration, RBAC integration, and GitOps-friendly workflows. | ||
|
|
||
| **AgentRuntime** (`agentruntimes.runtime.agentcube.volcano.sh`): | ||
|
|
||
| Designed for general-purpose AI agents. Accepts a full Kubernetes `PodSpec` template, allowing volume mounts, credential injection, sidecar containers, and custom resource requests. | ||
|
|
||
| - `spec.podTemplate` — full `PodSpec` for sandbox pod | ||
| - `spec.targetPort` — list of exposed ports with path prefix, port, and protocol | ||
| - `spec.sessionTimeout` — idle session expiry (default: `15m`) | ||
| - `spec.maxSessionDuration` — hard maximum session lifetime (default: `8h`) | ||
|
|
||
| **CodeInterpreter** (`codeinterpreters.runtime.agentcube.volcano.sh`): | ||
|
|
||
| Designed for secure, multi-tenant code execution. More locked-down than AgentRuntime, with a constrained sandbox template that restricts image, resources, and runtime class. | ||
|
|
||
| - `spec.template` — `CodeInterpreterSandboxTemplate` (image, imagePullPolicy, resources, runtimeClassName) | ||
| - `spec.ports` — list of exposed ports with path prefix | ||
| - `spec.sessionTimeout` / `spec.maxSessionDuration` — session lifecycle bounds | ||
| - `spec.warmPoolSize` — optional pre-warmed sandbox pool size | ||
| - `spec.authMode` — `picod` (default, RSA/JWT) or `none` (delegate auth to sandbox) | ||
|
|
||
| Alpha Feature Notice: APIs are under active development. Spec fields and default values may change in future releases. | ||
|
|
||
| Related: | ||
|
|
||
| - Design Doc: [AgentCube Proposal](../design/agentcube-proposal.md) | ||
| - Contributors: [@volcano-sh](https://github.com/volcano-sh) | ||
|
|
||
| --- | ||
|
|
||
| ### Warm Pool for Fast Cold Starts | ||
|
|
||
| **Background and Motivation:** | ||
|
|
||
| Creating a microVM sandbox from scratch on every session request incurs a cold-start penalty that is unacceptable for interactive workloads. AgentCube introduces a warm pool mechanism: the Workload Manager pre-creates a configurable number of idle `Sandbox` pods and keeps them ready. When an invocation arrives, the Router claims a pre-warmed pod via a `SandboxClaim` CR instead of waiting for a new pod to start. The pool is automatically replenished after each claim. | ||
|
|
||
| Key Capabilities: | ||
|
|
||
| - `spec.warmPoolSize` on `CodeInterpreter` controls pool depth | ||
| - `SandboxTemplate` + `SandboxClaim` pattern delegates pod adoption to the upstream `agent-sandbox` controller | ||
| - Pool refills asynchronously after each claim, keeping steady-state latency low | ||
| - Cold-start path remains available when pool is exhausted | ||
|
|
||
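The claim, refill, and cold-start fallback behaviour can be modelled with a simple queue. This is a minimal sketch under stated assumptions: `WarmPool`, `create`, and the explicit `refill()` call are hypothetical, and the real system performs adoption through `SandboxClaim` resources and refills asynchronously rather than on demand.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class WarmPool:
    """Toy warm pool: claims take an idle sandbox, falling back to a cold start."""

    def __init__(self, size: int, create: Callable[[], str]) -> None:
        self.size = size      # plays the role of spec.warmPoolSize
        self.create = create  # hypothetical sandbox factory
        self.idle: Deque[str] = deque()
        self.refill()

    def refill(self) -> None:
        """Top the pool back up to its configured depth."""
        while len(self.idle) < self.size:
            self.idle.append(self.create())

    def claim(self) -> Tuple[str, bool]:
        """Return (sandbox, was_warm)."""
        if self.idle:
            return self.idle.popleft(), True
        # Pool exhausted: the cold-start path creates a sandbox on demand.
        return self.create(), False
```

In this model the caller decides when to run `refill()`; a background task would be the closer analogue of the asynchronous replenishment described above.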
| Related: | ||
|
|
||
| - Design Doc: [AgentCube Proposal](../design/agentcube-proposal.md) | ||
| - Contributors: [@volcano-sh](https://github.com/volcano-sh) | ||
|
|
||
| --- | ||
|
|
||
| ### PicoD — Lightweight Sandbox Runtime Daemon | ||
|
|
||
| **Background and Motivation:** | ||
|
|
||
| Traditional code sandbox implementations use SSH to execute commands remotely. SSH carries significant overhead: key management, multiplexing negotiation, and a heavyweight protocol for what are essentially single-request RPCs. PicoD replaces SSH with a minimal HTTP/1.1 daemon that runs inside each sandbox pod, providing code execution, file I/O, and authentication via a small, auditable binary. | ||
|
|
||
| Key Capabilities: | ||
|
|
||
| - **Code execution** (`POST /api/execute`): runs arbitrary commands with configurable timeout, working directory, and environment variables; returns stdout, stderr, exit code, and wall-clock duration | ||
| - **File upload / write** (`POST /api/files`): supports multipart form-data and JSON/base64 content for workspace-scoped file creation and updates | ||
| - **File download / read** (`GET /api/files/*path`): streams files from the sandbox workspace using path-addressed operations | ||
| - **Health check** (`GET /health`): exposes an unauthenticated liveness endpoint | ||
| - **JWT authentication**: validates RS256 tokens from the Router; rejects unauthenticated requests | ||
| - **Path sanitization**: all paths are jailed to the configured workspace root, preventing directory traversal | ||
| - **32 MB request body limit** with configurable workspace root via `--workspace` flag | ||
|
|
||
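Path jailing of the kind listed above is commonly implemented by resolving the requested path against the workspace root and rejecting anything that lands outside it. The sketch below shows that general technique in Python; it is not PicoD's code, and `resolve_in_workspace` is a hypothetical name.

```python
import os


def resolve_in_workspace(workspace: str, requested: str) -> str:
    """Resolve `requested` under `workspace`, rejecting directory traversal."""
    root = os.path.realpath(workspace)
    # Treat the request as relative to the workspace even if it starts with "/".
    candidate = os.path.realpath(os.path.join(root, requested.lstrip("/")))
    # After symlinks and ".." are resolved, the result must still sit under root.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"path escapes workspace: {requested!r}")
    return candidate
```

Resolving with `os.path.realpath` before the containment check matters: comparing raw strings would let `..` segments or symlinks slip past the jail.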
| Related: | ||
|
|
||
| - Design Doc: [PicoD Proposal](../design/picod-proposal.md) | ||
| - Contributors: [@volcano-sh](https://github.com/volcano-sh) | ||
|
|
||
| --- | ||
|
|
||
| ### JWT Security Chain (Router → PicoD) | ||
|
|
||
| **Background and Motivation:** | ||
|
|
||
| Sandbox pods are ephemeral and may be replaced at any time; embedding a shared secret in cluster config is fragile and hard to rotate. AgentCube establishes an RSA-based trust chain instead. The Router generates an RSA-2048 key pair at startup and stores the public key in a Kubernetes Secret (`picod-router-identity`); the Workload Manager then injects that key as `PICOD_AUTH_PUBLIC_KEY` into every sandbox pod. The Router signs a short-lived (5-minute) RS256 JWT for every proxied request, and PicoD verifies these tokens entirely in-process, with no network round-trip and no shared database. | ||
|
|
||
| Key Capabilities: | ||
|
|
||
| - RSA-2048 key pair auto-generated at Router startup | ||
| - Public key distributed via `picod-router-identity` Kubernetes Secret | ||
| - Workload Manager injects public key into sandbox env at pod creation time | ||
| - 5-minute token expiry limits blast radius of token leakage | ||
| - PicoD rejects any request without a valid Router-issued JWT | ||
|
|
||
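To illustrate how the 5-minute expiry bounds a leaked token's usefulness, the sketch below models only the time-window part of the check. RS256 signing and signature verification are deliberately omitted because they need a cryptography library; the claim values (`iss`, `sub`) and the function names are assumptions made for this example, not the Router's actual token layout.

```python
import time
from typing import Dict, Optional

TOKEN_TTL_SECONDS = 5 * 60  # the 5-minute lifetime stated in the release notes


def new_claims(session_id: str, now: Optional[float] = None) -> Dict[str, object]:
    """Build a claims set with standard iat/exp fields (claim values are assumed)."""
    issued_at = int(time.time() if now is None else now)
    return {
        "iss": "agentcube-router",  # hypothetical issuer name
        "sub": session_id,          # hypothetical use of the subject claim
        "iat": issued_at,
        "exp": issued_at + TOKEN_TTL_SECONDS,
    }


def is_expired(claims: Dict[str, object], now: Optional[float] = None) -> bool:
    """A token must be rejected at or after its exp time."""
    current = time.time() if now is None else now
    return current >= int(claims["exp"])  # type: ignore[arg-type]
```

A verifier would run this check only after the signature has been validated against the injected public key.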
| Related: | ||
|
|
||
| - Design Doc: [PicoD Plain Authentication Design](../design/PicoD-Plain-Authentication-Design.md) | ||
| - Contributors: [@volcano-sh](https://github.com/volcano-sh) | ||
|
|
||
| --- | ||
|
|
||
| ### Sandbox Lifecycle Management and GC | ||
|
|
||
| **Background and Motivation:** | ||
|
|
||
| Agent sessions that complete their work or are abandoned by clients must be automatically reclaimed to avoid resource exhaustion. AgentCube implements a dual garbage collection policy enforced by a background loop in the Workload Manager: | ||
|
|
||
| - **Idle timeout**: sandboxes inactive beyond `spec.sessionTimeout` (default `15m`) are deleted | ||
| - **Hard max TTL**: sandboxes older than `spec.maxSessionDuration` (default `8h`) are deleted regardless of activity | ||
|
|
||
| Key Capabilities: | ||
|
|
||
| - Configurable GC interval in the Workload Manager | ||
| - `UpdateSessionLastActivity` store operation to reset idle timer on each invocation | ||
| - `ListExpiredSandboxes` and `ListInactiveSandboxes` store queries feed the GC loop | ||
| - Workload Manager deletes `Sandbox` / `SandboxClaim` CRs and removes store records atomically | ||
|
|
||
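The dual policy reduces to two independent time comparisons per sandbox. The sketch below is a simplified model, assuming hypothetical names (`SandboxRecord`, `gc_reason`); the defaults mirror the `15m` and `8h` values quoted above, and the real Workload Manager works from store queries rather than in-memory records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SandboxRecord:
    created_at: datetime
    last_activity: datetime


def gc_reason(
    record: SandboxRecord,
    now: datetime,
    session_timeout: timedelta = timedelta(minutes=15),
    max_session_duration: timedelta = timedelta(hours=8),
) -> Optional[str]:
    """Return why a sandbox should be collected, or None to keep it."""
    # Hard max TTL applies regardless of recent activity.
    if now - record.created_at > max_session_duration:
        return "expired"
    # Idle timeout applies when nothing has touched the session recently.
    if now - record.last_activity > session_timeout:
        return "idle"
    return None
```

Each invocation would move `last_activity` forward, which resets the idle check but never the hard maximum.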
| --- | ||
|
|
||
| ## Other Notable Changes | ||
|
|
||
| ### API Changes | ||
|
|
||
| New CRDs introduced in v0.1.0: | ||
|
|
||
| 1. **`AgentRuntime`** (`runtime.agentcube.volcano.sh/v1alpha1`) | ||
|
|
||
| Full-`PodSpec` sandbox for conversational AI agents. Supports volume mounts, sidecar containers, credential injection, and custom resource classes. | ||
|
|
||
| 2. **`CodeInterpreter`** (`runtime.agentcube.volcano.sh/v1alpha1`) | ||
|
|
||
| Restricted sandbox for secure code execution. Adds warm pool support (`spec.warmPoolSize`) and auth mode selection (`spec.authMode`). | ||
|
|
||
| Generated client-go code is under `client-go/` with versioned clientsets, shared informers, and listers for both CRDs. | ||
|
|
||
| --- | ||
|
|
||
| ### Features and Enhancements | ||
|
|
||
| - **Python SDK** (`agentcube-sdk`): `CodeInterpreterClient` with `execute_command()`, `run_code(language, code)`, `upload_file()`, `download_file()`, `write_file()`; session lifecycle managed automatically | ||
| - **LangChain integration**: `CodeInterpreterClient` can be wrapped as a `@tool` and wired into LangGraph ReAct agents — see [devguide](../devguide/code-interpreter-using-langchain.md) | ||
| - **Dify plugin**: `integrations/dify-plugin/` — AgentCube tool integration for the Dify AI application platform | ||
| - **pcap-analyzer example**: `example/pcap-analyzer/`, an end-to-end example agent that analyzes packet captures using the code interpreter | ||
| - **Redis and Valkey backends**: pluggable session store (`pkg/store`) with implementations for both Redis and Valkey; selected via configuration | ||
| - **Prometheus metrics**: metrics exported by Router and Workload Manager for operational observability | ||
| - **Health probes**: `/health/live` and `/health/ready` on the Router for Kubernetes liveness/readiness checks | ||
| - **User-scoped Kubernetes clients**: Workload Manager creates per-user dynamic clients from service account tokens, enabling per-sandbox RBAC enforcement | ||
| - **Helm chart**: `manifests/charts/base` for one-command installation of CRDs, Workload Manager, and Router | ||
|
|
||
| --- | ||
|
|
||
| ## Upgrade Instructions | ||
|
|
||
| This is the initial release of AgentCube. No upgrade path from a prior version exists. | ||
|
|
||
| **Prerequisites:** | ||
|
|
||
| 1. Kubernetes cluster v1.24+ | ||
| 2. Redis or Valkey instance accessible from the cluster | ||
| 3. `sigs.k8s.io/agent-sandbox` v0.1.1 CRDs installed | ||
|
|
||
| **Install with Helm:** | ||
|
|
||
| ```bash | ||
| helm install agentcube manifests/charts/base \ | ||
| --namespace agentcube-system --create-namespace \ | ||
| --set redis.addr=<redis-or-valkey-host>:6379 | ||
| ``` | ||
|
|
||
| **Verify installation:** | ||
|
|
||
| ```bash | ||
| kubectl get crd agentruntimes.runtime.agentcube.volcano.sh | ||
| kubectl get crd codeinterpreters.runtime.agentcube.volcano.sh | ||
| kubectl get pods -n agentcube-system | ||
| ``` | ||
|
|
||
| **Create a CodeInterpreter:** | ||
|
|
||
| ```yaml | ||
| apiVersion: runtime.agentcube.volcano.sh/v1alpha1 | ||
| kind: CodeInterpreter | ||
| metadata: | ||
|   name: my-interpreter | ||
|   namespace: default | ||
| spec: | ||
|   template: | ||
|     image: python:3.12-slim | ||
|     resources: | ||
|       requests: | ||
|         cpu: "500m" | ||
|         memory: "512Mi" | ||
|       limits: | ||
|         cpu: "2" | ||
|         memory: "2Gi" | ||
|   warmPoolSize: 2 | ||
|   sessionTimeout: 15m | ||
|   maxSessionDuration: 8h | ||
| ``` | ||
|
|
||
| **Invoke the interpreter:** | ||
|
|
||
| ```bash | ||
| curl -X POST \ | ||
| http://<router-host>/v1/namespaces/default/code-interpreters/my-interpreter/invocations/api/execute \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{"command": "python3 -c \"print(1+1)\""}' | ||
| ``` | ||
|
|
||
| The response will include an `x-agentcube-session-id` header. Pass it in subsequent requests to reuse the session. | ||
|
|
||
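The same invocation, including session reuse, can be sketched in Python with only the standard library. The helper name `build_execute_request` and the router URL are placeholders chosen for this example; the path, JSON body, and header name follow the `curl` command and the note above.

```python
import json
import urllib.request
from typing import Optional

SESSION_HEADER = "x-agentcube-session-id"  # header name from the release notes


def build_execute_request(
    router_url: str, command: str, session_id: Optional[str] = None
) -> urllib.request.Request:
    """Build the POST request; pass session_id to reuse an existing sandbox."""
    url = (
        f"{router_url.rstrip('/')}/v1/namespaces/default"
        "/code-interpreters/my-interpreter/invocations/api/execute"
    )
    headers = {"Content-Type": "application/json"}
    if session_id:
        headers[SESSION_HEADER] = session_id
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Sending the request with `urllib.request.urlopen(req)` and reading `response.headers.get(SESSION_HEADER)` would give the session ID to pass into the next call.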
| --- | ||
|
|
||
| ## Contributors | ||
|
|
||
| Thank you to all contributors who made this release possible: | ||
|
|
||
| [@YaoZengzeng](https://github.com/YaoZengzeng), | ||
| [@acsoto](https://github.com/acsoto), | ||
| [@hzxuzhonghu](https://github.com/hzxuzhonghu), | ||
| [@Sagar-6203620715](https://github.com/Sagar-6203620715), | ||
| [@mahil-2040](https://github.com/mahil-2040), | ||
| [@t2wang](https://github.com/t2wang), | ||
| [@FAUST-BENCHOU](https://github.com/FAUST-BENCHOU), | ||
| [@tjucoder](https://github.com/tjucoder), | ||
| [@LaynePeng](https://github.com/LaynePeng), | ||
| [@yashisrani](https://github.com/yashisrani), | ||
| [@katara-Jayprakash](https://github.com/katara-Jayprakash), | ||
| [@LiZhenCheng9527](https://github.com/LiZhenCheng9527), | ||
| [@Tweakzx](https://github.com/Tweakzx), | ||
| [@warjiang](https://github.com/warjiang), | ||
| [@LeslieKuo](https://github.com/LeslieKuo), | ||
| [@MahaoAlex](https://github.com/MahaoAlex), | ||
| [@VanderChen](https://github.com/VanderChen), | ||
| [@kevin-wangzefeng](https://github.com/kevin-wangzefeng), | ||
| [@ifelseend](https://github.com/ifelseend), | ||
| [@cairon-ab](https://github.com/cairon-ab), | ||
| [@RushabhMehta2005](https://github.com/RushabhMehta2005), | ||
| [@Sanchit2662](https://github.com/Sanchit2662), | ||
| [@qizha](https://github.com/qizha), | ||
| [@ssfffss](https://github.com/ssfffss), | ||
| [@wjf295004046](https://github.com/wjf295004046) | ||
|
|
||
| --- | ||
|
|
||
| ## Full Changelog | ||
|
|
||
| This is the initial v0.1.0 release of AgentCube. | ||
The heading hierarchy is inconsistent: "### Key Feature Details" is immediately followed by another level-3 heading ("### Session-Based MicroVM Agent Routing"), which makes it look like a sibling section rather than a subsection. Consider making the feature sections level-4 under "Key Feature Details" (or removing the extra "Key Feature Details" heading) so the structure is unambiguous in rendered Markdown/TOC.