diff --git a/docs/release-notes/v0.1.0.md b/docs/release-notes/v0.1.0.md
new file mode 100644
index 00000000..dad03699
--- /dev/null
+++ b/docs/release-notes/v0.1.0.md
@@ -0,0 +1,312 @@
+# v0.1.0
+
+## Summary
+
+AgentCube v0.1.0 is the **first official release** of AgentCube, a Volcano subproject that extends Kubernetes with native support for AI agent and code interpreter workloads. This release establishes the foundational architecture: a lightweight HTTP reverse proxy (Router) routes agent invocations to per-session microVM sandboxes, while a Workload Manager controls sandbox lifecycle, warm pools, and garbage collection. A minimal runtime daemon (PicoD) replaces SSH inside sandboxes, providing secure code execution, file operations, and JWT-based authentication without SSH's protocol overhead. Session state is stored in Redis/ValKey, enabling horizontal Router scaling. Two new Kubernetes CRDs, `AgentRuntime` and `CodeInterpreter`, model agent workloads as first-class Kubernetes resources. A Python SDK, LangChain integration, and Dify plugin are included to make AgentCube immediately usable from popular AI frameworks.
+
+## What's New
+
+### Key Features Overview
+
+- **Session-Based MicroVM Agent Routing**: Stateful request routing with session affinity, backed by isolated microVM sandboxes per session
+- **AgentRuntime and CodeInterpreter CRDs**: Kubernetes-native abstractions for conversational agent and secure code interpreter workloads
+- **Warm Pool for Fast Cold Starts**: Pre-warmed sandbox pool support for `CodeInterpreter`, reducing invocation latency via `SandboxClaim` adoption
+- **PicoD Runtime Daemon**: Lightweight HTTP daemon replacing SSH inside sandboxes, providing code execution, file I/O, and JWT authentication
+- **JWT Security Chain (Router → PicoD)**: RSA-2048 key pair generated at startup; public key distributed via Kubernetes Secret and injected into sandbox pods
+- **Dual GC Policy (Idle TTL + Max Duration)**: Background garbage collector enforces both idle timeout and hard maximum session duration
+- **Python SDK and AI Framework Integrations**: Out-of-the-box SDK with LangChain and Dify plugin support
+
+---
+
+### Key Feature Details
+
+### Session-Based MicroVM Agent Routing
+
+**Background and Motivation:**
+
+AI agent workloads are fundamentally stateful and interactive. A single agent session may span many invocations (tool calls, environment inspections, multi-step reasoning), all requiring the same isolated execution environment. Kubernetes has no native concept of persistent, identity-bound agent sessions. AgentCube fills this gap by mapping session IDs to dedicated microVM sandbox pods.
+
+The Router acts as the data plane entry point. It reads the `x-agentcube-session-id` request header to look up an existing session in the store, or allocates a new sandbox via the Workload Manager when no session exists. Every response carries the `x-agentcube-session-id` header, enabling stateless clients to maintain session continuity across requests.
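
The lookup-or-allocate flow described above can be sketched roughly as follows. This is a minimal illustration, not the Router's actual code: the `route_request` function, the dict standing in for the Redis/ValKey store, and the `allocate_sandbox` callback standing in for the Workload Manager are all hypothetical, and treating an unknown session ID the same as a missing one is an assumption.

```python
from typing import Callable, Dict, Optional, Tuple

SESSION_HEADER = "x-agentcube-session-id"  # header name taken from the release notes


def route_request(
    headers: Dict[str, str],
    store: Dict[str, str],
    allocate_sandbox: Callable[[], Tuple[str, str]],
) -> Tuple[str, Dict[str, str]]:
    """Return (sandbox_endpoint, response_headers) for one invocation.

    `store` stands in for the session store and `allocate_sandbox` for the
    Workload Manager call; both are hypothetical placeholders.
    """
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    session_id: Optional[str] = normalized.get(SESSION_HEADER)

    if session_id is not None and session_id in store:
        endpoint = store[session_id]  # existing session: reuse its sandbox
    else:
        # Assumption: a missing or unknown session ID triggers a new allocation.
        session_id, endpoint = allocate_sandbox()
        store[session_id] = endpoint

    # Every response echoes the session ID so the client can send it back.
    return endpoint, {SESSION_HEADER: session_id}
```

A client that copies the returned header into its next request lands on the same sandbox, while a request without the header gets a fresh one.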
+
+Key Capabilities:
+
+- **Session affinity via header**: `x-agentcube-session-id` header maps requests to existing sandbox pods
+- **Transparent sandbox allocation**: new sessions trigger automatic sandbox creation with no client-side configuration
+- **Reverse proxy with path-prefix matching**: path-based routing to multiple exposed sandbox ports
+- **HTTP/2 (h2c) support**: low-latency connections to sandbox endpoints
+- **Configurable concurrency limit**: `MaxConcurrentRequests` prevents overload
+
+**Router Endpoints:**
+
+```
+POST /v1/namespaces/{namespace}/agent-runtimes/{name}/invocations/*path
+POST /v1/namespaces/{namespace}/code-interpreters/{name}/invocations/*path
+```
+
+Related:
+
+- Design Doc: [Router Proposal](../design/router-proposal.md)
+- Contributors: [@volcano-sh](https://github.com/volcano-sh)
+
+---
+
+### AgentRuntime and CodeInterpreter CRDs (Alpha)
+
+**Background and Motivation:**
+
+Two distinct workload profiles emerge in the AI agent space: conversational/tool-using agents that need access to credentials, volumes, and custom networking; and short-lived code interpreters that require strict isolation and resource caps. Modeling both as first-class Kubernetes CRDs enables declarative configuration, RBAC integration, and GitOps-friendly workflows.
+
+**AgentRuntime** (`agentruntimes.runtime.agentcube.volcano.sh`):
+
+Designed for general-purpose AI agents. Accepts a full Kubernetes `PodSpec` template, allowing volume mounts, credential injection, sidecar containers, and custom resource requests.
+
+- `spec.podTemplate`: full `PodSpec` for sandbox pod
+- `spec.targetPort`: list of exposed ports with path prefix, port, and protocol
+- `spec.sessionTimeout`: idle session expiry (default: `15m`)
+- `spec.maxSessionDuration`: hard maximum session lifetime (default: `8h`)
+
+**CodeInterpreter** (`codeinterpreters.runtime.agentcube.volcano.sh`):
+
+Designed for secure, multi-tenant code execution.
+More locked-down than AgentRuntime, with a constrained sandbox template that restricts image, resources, and runtime class.
+
+- `spec.template`: `CodeInterpreterSandboxTemplate` (image, imagePullPolicy, resources, runtimeClassName)
+- `spec.ports`: list of exposed ports with path prefix
+- `spec.sessionTimeout` / `spec.maxSessionDuration`: session lifecycle bounds
+- `spec.warmPoolSize`: optional pre-warmed sandbox pool size
+- `spec.authMode`: `picod` (default, RSA/JWT) or `none` (delegate auth to sandbox)
+
+Alpha Feature Notice: APIs are under active development. Spec fields and default values may change in future releases.
+
+Related:
+
+- Design Doc: [AgentCube Proposal](../design/agentcube-proposal.md)
+- Contributors: [@volcano-sh](https://github.com/volcano-sh)
+
+---
+
+### Warm Pool for Fast Cold Starts
+
+**Background and Motivation:**
+
+Creating a microVM sandbox from scratch on every session request incurs a cold-start penalty that is unacceptable for interactive workloads. AgentCube introduces a warm pool mechanism: the Workload Manager pre-creates a configurable number of idle `Sandbox` pods and keeps them ready. When an invocation arrives, the Router claims a pre-warmed pod via a `SandboxClaim` CR instead of waiting for a new pod to start. The pool is automatically replenished after each claim.
+
+Key Capabilities:
+
+- `spec.warmPoolSize` on `CodeInterpreter` controls pool depth
+- `SandboxTemplate` + `SandboxClaim` pattern delegates pod adoption to the upstream `agent-sandbox` controller
+- Pool refills asynchronously after each claim, keeping steady-state latency low
+- Cold-start path remains available when pool is exhausted
+
+Related:
+
+- Design Doc: [AgentCube Proposal](../design/agentcube-proposal.md)
+- Contributors: [@volcano-sh](https://github.com/volcano-sh)
+
+---
+
+### PicoD: Lightweight Sandbox Runtime Daemon
+
+**Background and Motivation:**
+
+Traditional code sandbox implementations use SSH to execute commands remotely.
+SSH carries significant overhead: key management, multiplexing negotiation, and a heavyweight protocol for what are essentially single-request RPCs. PicoD replaces SSH with a minimal HTTP/1.1 daemon that runs inside each sandbox pod, providing code execution, file I/O, and authentication via a small, auditable binary.
+
+Key Capabilities:
+
+- **Code execution** (`POST /execute`): runs arbitrary commands with configurable timeout, working directory, and environment variables; returns stdout, stderr, exit code, and wall-clock duration
+- **File upload** (`POST /files/upload`): multipart form-data and JSON/base64 content
+- **File download** (`GET /files/download`): streams files from sandbox workspace
+- **File write / read**: direct path-addressed operations within the workspace
+- **JWT authentication**: validates RS256 tokens from the Router; rejects unauthenticated requests
+- **Path sanitization**: all paths are jailed to the configured workspace root, preventing directory traversal
+- **32 MB request body limit** with configurable workspace root via `--workspace` flag
+
+Related:
+
+- Design Doc: [PicoD Proposal](../design/picod-proposal.md)
+- Contributors: [@volcano-sh](https://github.com/volcano-sh)
+
+---
+
+### JWT Security Chain (Router → PicoD)
+
+**Background and Motivation:**
+
+Sandbox pods are ephemeral and may be replaced at any time; embedding a shared secret in cluster config is fragile and hard to rotate. AgentCube establishes an RSA-based trust chain: the Router generates an RSA-2048 key pair at startup, stores the public key in a Kubernetes Secret (`picod-router-identity`), and the Workload Manager injects it as `PICOD_AUTH_PUBLIC_KEY` into every sandbox pod. The Router signs short-lived (5-minute) RS256 JWTs for every proxied request. PicoD verifies these tokens entirely in-process, with no network round-trip and no shared database.
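
To make the 5-minute token lifetime concrete, the sketch below builds the two unsigned JWT segments an RS256 signer would sign and applies the expiry rule a verifier would enforce. It is an illustration only: the RSA signature step is deliberately omitted (the Python standard library has no RSA signing), the function names are hypothetical, and the claim set (`iat`, `exp`) is an assumption based on the standard JWT registered claims rather than the Router's actual payload.

```python
import base64
import json

TOKEN_LIFETIME_SECONDS = 5 * 60  # 5-minute expiry, per the release notes


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(now: int) -> str:
    """Return '<header>.<payload>', the string an RS256 signer would sign."""
    header = {"alg": "RS256", "typ": "JWT"}
    # Assumed claims: issued-at and expiry only; real tokens may carry more.
    claims = {"iat": now, "exp": now + TOKEN_LIFETIME_SECONDS}
    return ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )


def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment back into a dict."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(signing_input: str, now: int) -> bool:
    """Expiry check only; a real verifier must validate the signature first."""
    payload = decode_segment(signing_input.split(".")[1])
    return now >= payload["exp"]
```

Because the expiry is carried inside the token, a verifier holding only the public key can reject stale tokens without contacting the issuer, which is what allows the in-process verification described above.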
+
+Key Capabilities:
+
+- RSA-2048 key pair auto-generated at Router startup
+- Public key distributed via `picod-router-identity` Kubernetes Secret
+- Workload Manager injects public key into sandbox env at pod creation time
+- 5-minute token expiry limits blast radius of token leakage
+- PicoD rejects any request without a valid Router-issued JWT
+
+Related:
+
+- Design Doc: [PicoD Plain Authentication Design](../design/PicoD-Plain-Authentication-Design.md)
+- Contributors: [@volcano-sh](https://github.com/volcano-sh)
+
+---
+
+### Sandbox Lifecycle Management and GC
+
+**Background and Motivation:**
+
+Agent sessions that complete their work or are abandoned by clients must be automatically reclaimed to avoid resource exhaustion. AgentCube implements a dual garbage collection policy enforced by background loops in both the Workload Manager and AgentD:
+
+- **Idle timeout**: sandboxes inactive beyond `spec.sessionTimeout` (default `15m`) are deleted
+- **Hard max TTL**: sandboxes older than `spec.maxSessionDuration` (default `8h`) are deleted regardless of activity
+
+The Workload Manager GC loop scans the session store at a configurable interval. AgentD complements it by watching `Sandbox` CRDs and monitoring the `agentcube.volcano.sh/last-activity` annotation.
+
+Key Capabilities:
+
+- Configurable GC interval in the Workload Manager
+- `UpdateSessionLastActivity` store operation to reset idle timer on each invocation
+- `ListExpiredSandboxes` and `ListInactiveSandboxes` store queries feed the GC loop
+- Workload Manager deletes `Sandbox` / `SandboxClaim` CRs and removes store records atomically
+
+---
+
+## Other Notable Changes
+
+### API Changes
+
+New CRDs introduced in v0.1.0:
+
+1. **`AgentRuntime`** (`runtime.agentcube.volcano.sh/v1alpha1`)
+
+   Full-`PodSpec` sandbox for conversational AI agents. Supports volume mounts, sidecar containers, credential injection, and custom resource classes.
+
+2. **`CodeInterpreter`** (`runtime.agentcube.volcano.sh/v1alpha1`)
+
+   Restricted sandbox for secure code execution. Adds warm pool support (`spec.warmPoolSize`) and auth mode selection (`spec.authMode`).
+
+Generated client-go code is under `client-go/` with versioned clientsets, shared informers, and listers for both CRDs.
+
+---
+
+### Features and Enhancements
+
+- **Python SDK** (`agentcube-sdk`): `CodeInterpreterClient` with `execute_command()`, `run_code(language, code)`, `upload_file()`, `download_file()`, `write_file()`; session lifecycle managed automatically
+- **LangChain integration**: `CodeInterpreterClient` can be wrapped as a `@tool` and wired into LangGraph ReAct agents; see [devguide](../devguide/code-interpreter-using-langchain.md)
+- **Dify plugin**: `integrations/dify-plugin/`, an AgentCube tool integration for the Dify AI application platform
+- **pcap-analyzer example**: `example/pcap-analyzer/`, an end-to-end example agent that analyzes packet captures using the code interpreter
+- **Redis and ValKey backends**: pluggable session store (`pkg/store`) with implementations for both Redis and ValKey; selected via configuration
+- **Prometheus metrics**: metrics exported by Router and Workload Manager for operational observability
+- **Health probes**: `/healthz/live` and `/healthz/ready` on the Router for Kubernetes liveness/readiness checks
+- **User-scoped Kubernetes clients**: Workload Manager creates per-user dynamic clients from service account tokens, enabling per-sandbox RBAC enforcement
+- **Helm chart**: `manifests/charts/base` for one-command installation of CRDs, Workload Manager, and Router
+
+---
+
+### Dependencies
+
+- Go 1.24.4
+- Kubernetes v1.24+ (cluster prerequisite)
+- `k8s.io/api` + `apimachinery` + `client-go` v0.34.1
+- `sigs.k8s.io/controller-runtime` v0.22.2
+- `sigs.k8s.io/agent-sandbox` v0.1.1 (must be installed before AgentCube)
+- `gin-gonic/gin` v1.10.0
+- `golang-jwt/jwt/v5` v5.2.2
+- `redis/go-redis/v9` v9.17.1 /
+  `valkey-io/valkey-go` v1.0.69
+
+---
+
+## Upgrade Instructions
+
+This is the initial release of AgentCube. No upgrade path from a prior version exists.
+
+**Prerequisites:**
+
+1. Kubernetes cluster v1.24+
+2. Redis or ValKey instance accessible from the cluster
+3. `sigs.k8s.io/agent-sandbox` v0.1.1 CRDs installed
+
+**Install with Helm:**
+
+```bash
+helm install agentcube manifests/charts/base \
+  --namespace agentcube-system --create-namespace \
+  --set store.address=:6379
+```
+
+**Verify installation:**
+
+```bash
+kubectl get crd agentruntimes.runtime.agentcube.volcano.sh
+kubectl get crd codeinterpreters.runtime.agentcube.volcano.sh
+kubectl get pods -n agentcube-system
+```
+
+**Create a CodeInterpreter:**
+
+```yaml
+apiVersion: runtime.agentcube.volcano.sh/v1alpha1
+kind: CodeInterpreter
+metadata:
+  name: my-interpreter
+  namespace: default
+spec:
+  template:
+    image: python:3.12-slim
+    resources:
+      requests:
+        cpu: "500m"
+        memory: "512Mi"
+      limits:
+        cpu: "2"
+        memory: "2Gi"
+  warmPoolSize: 2
+  sessionTimeout: 15m
+  maxSessionDuration: 8h
+```
+
+**Invoke the interpreter:**
+
+```bash
+curl -X POST \
+  http:///v1/namespaces/default/code-interpreters/my-interpreter/invocations/execute \
+  -H "Content-Type: application/json" \
+  -d '{"command": "python3 -c \"print(1+1)\""}'
+```
+
+The response will include an `x-agentcube-session-id` header. Pass it in subsequent requests to reuse the session.
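
The same session-reuse pattern can be sketched in Python with only the standard library. This mirrors the `curl` call above rather than the `agentcube-sdk` client; the helper names `build_invocation` and `invoke` are hypothetical, and `base_url` is a placeholder for wherever the Router is reachable in a given cluster.

```python
import json
import urllib.request
from typing import Dict, Optional, Tuple

SESSION_HEADER = "x-agentcube-session-id"


def build_invocation(
    base_url: str,
    namespace: str,
    name: str,
    command: str,
    session_id: Optional[str] = None,
) -> Tuple[str, Dict[str, str], bytes]:
    """Build (url, headers, body) for a CodeInterpreter `execute` invocation."""
    url = (
        f"{base_url.rstrip('/')}/v1/namespaces/{namespace}"
        f"/code-interpreters/{name}/invocations/execute"
    )
    headers = {"Content-Type": "application/json"}
    if session_id:
        headers[SESSION_HEADER] = session_id  # reuse an existing sandbox
    body = json.dumps({"command": command}).encode("utf-8")
    return url, headers, body


def invoke(base_url, namespace, name, command, session_id=None):
    """Send the request; return (response_body, session_id_from_response)."""
    url, headers, body = build_invocation(
        base_url, namespace, name, command, session_id
    )
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        # Header lookup on the response is case-insensitive.
        return resp.read().decode("utf-8"), resp.headers.get(SESSION_HEADER)
```

A first call to `invoke` without a session ID returns the ID from the response header; passing that value as `session_id` on later calls should keep them on the same sandbox until the session times out.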
+
+---
+
+## Contributors
+
+Thank you to all contributors who made this release possible:
+
+[@cairon-ab](https://github.com/cairon-ab),
+[@guolg](https://github.com/guolg),
+[@ifelseend](https://github.com/ifelseend),
+[@katara-Jayprakash](https://github.com/katara-Jayprakash),
+[@kevinwzf0126](https://github.com/kevinwzf0126),
+[@LeslieKuo](https://github.com/LeslieKuo),
+[@LiZhenCheng9527](https://github.com/LiZhenCheng9527),
+[@lizhixuan](https://github.com/lizhixuan),
+[@MahaoAlex](https://github.com/MahaoAlex),
+[@mahilpatel0808](https://github.com/mahilpatel0808),
+[@opertw](https://github.com/opertw),
+[@RushabhMehta2005](https://github.com/RushabhMehta2005),
+[@Sagar-Choudhary](https://github.com/Sagar-Choudhary),
+[@Sanchit2662](https://github.com/Sanchit2662),
+[@senbof](https://github.com/senbof),
+[@tjucoder](https://github.com/tjucoder),
+[@VanderChen](https://github.com/VanderChen),
+[@warjiang](https://github.com/warjiang),
+[@wjf295004046](https://github.com/wjf295004046),
+[@YaoZengzeng](https://github.com/YaoZengzeng),
+[@yashisrani](https://github.com/yashisrani),
+[@zhangq](https://github.com/zhangq),
+[@ZhonghuXu](https://github.com/ZhonghuXu),
+[@zhoujinyu](https://github.com/zhoujinyu),
+[@ZhouZihang](https://github.com/ZhouZihang)
+
+---
+
+## Full Changelog
+
+This is the initial v0.1.0 release of AgentCube.