Barkland is a multi-agent simulation framework in which the simulated agents are represented as dogs. It leverages GKE (Google Kubernetes Engine) based Sandboxes for secure, isolated agent execution environments. A central Orchestrator governs the simulation loop and dashboard and dynamically spawns agent environments.
The system consists of the following components:
- Barkland Orchestrator:
  - A Python-based dashboard and API server (FastAPI + WebSockets).
  - Coordinates the simulation loop and maintains state in memory.
  - Spawns agents via `SandboxClaims` on Kubernetes using defined templates.
  - Exposed externally via a `LoadBalancer` Service for easy dashboard access.
- Agent Sandboxes:
  - Each agent (e.g., a Dog Agent) runs in a highly isolated Sandbox environment.
  - Configurable via `SandboxTemplate` specifications.
  - Utilizes the same base image as the Orchestrator, ensuring consistency in agent dependencies.
- Sandbox Router:
  - Routes workspace and communication traffic for active sandboxes.
  - Typically deployed as part of the core infrastructure.
- Sandbox Warmpools:
  - Pre-provision a configurable number of Pods to eliminate cold-start latency when agents are launched.
  - Maintained automatically so that `SandboxClaims` are satisfied in real time.
- Google ADK (Agent Development Kit): Provides the fundamental LLM agent building blocks (such as `LlmAgent` and tools) used to construct the dog agents' behaviors and personalities.
- agent-sandbox SDK: The core client framework used to programmatically provision, manage, and bridge communication with isolated Kubernetes sandboxes at scale, directly from the Python orchestrator.
- Google Kubernetes Engine (GKE): The underlying infrastructure providing scalability and workload orchestration.
- gVisor: Ensures robust security and runtime isolation for each individual agent environment.
- FastAPI & WebSockets: Powers the real-time Dashboard UI and persistent simulation loop.
- Gemini / Vertex AI: The large language models powering the behavioral logic of the simulated dog agents.
Before deploying, ensure you have set up the following:
- Google Cloud Platform (GCP) account with an active project.
- GKE Cluster:
- Workload Identity Enabled.
- GKE Sandbox (gVisor) Enabled on your node pools.
- agent-sandbox Controller:
  - The `deploy.sh` script automatically installs the core and extension components of `agent-sandbox` directly from the official GitHub releases, so no manual installation is required.
- Model Authentication (choose one):
  - Option A: Workload Identity (Vertex AI): The `barkland-orchestrator-sa` Kubernetes Service Account can be granted access to Vertex AI using Workload Identity Federation (Principal Identifiers). See the Workload Identity Federation Setup section below for step-by-step instructions.
  - Option B: Gemini API Key: Set `GEMINI_API_KEY` in your local environment. The deployment script uses it to create a Kubernetes secret that the agents consume.
- Local Tooling:
  - `gcloud` CLI initialized to your target project/cluster.
  - `kubectl` authenticated.
  - `docker` or `buildx` for building images locally.
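A quick way to confirm this tooling is on your `PATH` before deploying — the `check_tools` helper below is an illustrative snippet, not part of the repository:

```shell
# Report whether each required CLI is available on PATH.
# check_tools is a hypothetical helper, not shipped with Barkland.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

check_tools gcloud kubectl docker
```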
If you are starting from a fresh project, follow these steps to provision the required Google Cloud resources.
Create a Docker repository in Artifact Registry to store the built images.
```shell
export PROJECT_ID="your-project-id"
export REGISTRY_LOCATION="us-central1"
export REPO="barkland"

# Enable the Artifact Registry API
gcloud services enable artifactregistry.googleapis.com --project=$PROJECT_ID

# Create the repository
gcloud artifacts repositories create $REPO \
  --repository-format=docker \
  --location=$REGISTRY_LOCATION \
  --project=$PROJECT_ID \
  --description="Barkland Docker repository"

# Configure Docker to authenticate with the registry
gcloud auth configure-docker ${REGISTRY_LOCATION}-docker.pkg.dev
```

Barkland requires a GKE cluster with Workload Identity and GKE Sandbox (gVisor) support.
For a GKE Autopilot Cluster (Recommended & Simplest): Autopilot has Workload Identity enabled by default and supports GKE Sandbox automatically when requested by pods.
```shell
export CLUSTER_NAME="your-cluster-name"
export CLUSTER_LOCATION="us-central1" # Regional is recommended for Autopilot

# Enable the Kubernetes Engine API
gcloud services enable container.googleapis.com --project=$PROJECT_ID

# Create the Autopilot cluster
gcloud container clusters create-auto $CLUSTER_NAME \
  --location=$CLUSTER_LOCATION \
  --project=$PROJECT_ID
```

For a GKE Standard Cluster: If you prefer a Standard cluster, you must explicitly enable Workload Identity and GKE Sandbox.
```shell
export CLUSTER_NAME="your-cluster-name"
export CLUSTER_LOCATION="us-central1-a" # Zonal

# Enable the Kubernetes Engine API
gcloud services enable container.googleapis.com --project=$PROJECT_ID

# 1. Create the Standard cluster with Workload Identity
gcloud container clusters create $CLUSTER_NAME \
  --location=$CLUSTER_LOCATION \
  --project=$PROJECT_ID \
  --workload-pool=${PROJECT_ID}.svc.id.goog \
  --machine-type=e2-standard-4 \
  --num-nodes=1 # Default pool for system workloads

# 2. Add a node pool with GKE Sandbox (gVisor) enabled for sandboxed workloads
gcloud container node-pools create gvisor-nodepool \
  --cluster=$CLUSTER_NAME \
  --location=$CLUSTER_LOCATION \
  --project=$PROJECT_ID \
  --sandbox type=gvisor \
  --machine-type=e2-standard-4 \
  --num-nodes=10
```

If you choose Workload Identity (Option A) instead of an API key, you can use Workload Identity Federation to grant your Kubernetes workloads direct access to Google Cloud APIs (such as Vertex AI) without creating dedicated Google Cloud Service Accounts or annotating Kubernetes Service Accounts.
You just need to grant the required IAM roles directly to your Kubernetes Service Account (KSA) using its Principal Identifier.
The Orchestrator uses the following KSA:
- Name: `barkland-orchestrator-sa`
- Namespace: `barkland` (or your configured namespace)
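The principal identifier for this KSA is assembled mechanically from your project number, the project's Workload Identity pool, the namespace, and the service-account name. A quick sketch, using a made-up project number:

```shell
# Assemble the Workload Identity principal identifier for the KSA.
# PROJECT_NUMBER here is an example value; use your own project number.
PROJECT_NUMBER="123456789012"
PROJECT_ID="your-project-id"
NAMESPACE="barkland"
KSA="barkland-orchestrator-sa"

POOL="${PROJECT_ID}.svc.id.goog"
MEMBER="principal://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${POOL}/subject/ns/${NAMESPACE}/sa/${KSA}"
echo "$MEMBER"
```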
Run the following command to grant the Vertex AI User role directly to the Orchestrator KSA:
```shell
export PROJECT_NUMBER="your-project-number" # Numeric project number, not the project ID

export NAMESPACE="barkland"

gcloud projects add-iam-policy-binding projects/$PROJECT_ID \
  --role="roles/aiplatform.user" \
  --member="principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$PROJECT_ID.svc.id.goog/subject/ns/$NAMESPACE/sa/barkland-orchestrator-sa" \
  --condition=None
```

Note: Ensure you remove or comment out the `GEMINI_API_KEY` environment variables in `k8s/barkland-app.yaml` and `k8s/sandbox_template.yaml` to force the application to use the default Google credentials chain (Workload Identity).
You can run the following sed commands to automatically comment out these blocks:
```shell
# Comment out GEMINI_API_KEY in barkland-app.yaml
sed -i '/- name: GEMINI_API_KEY/,/key: GEMINI_API_KEY/ s/^/#/' k8s/barkland-app.yaml

# Comment out GEMINI_API_KEY in sandbox_template.yaml
sed -i '/- name: GEMINI_API_KEY/,/key: GEMINI_API_KEY/ s/^/#/' k8s/sandbox_template.yaml
```

A bundled script is provided for easy automated deployments.
Before deploying, ensure you have created a `.configuration` file in the root of the repository to define your environment properties. The `deploy.sh` script requires these values.
```shell
cat <<EOF > .configuration
PROJECT_ID="your-project-id"
CLUSTER_LOCATION="us-central1-a" # e.g. zone for the cluster
REGISTRY_LOCATION="us-central1"  # e.g. region for the Artifact Registry
CLUSTER_NAME="your-cluster-name"
NAMESPACE="barkland"
REPO="barkland"
WARMPOOL_REPLICAS="10"
EOF
```

If you are using a Gemini API Key rather than Vertex AI Workload Identity, you must also export it in your environment so the deployment script can create the Kubernetes secret:
```shell
export GEMINI_API_KEY="your-gemini-api-key"
```

Run the full-cycle deployment script from your repository root:
```shell
chmod +x ./scripts/deploy.sh
./scripts/deploy.sh
```

The script will:

- Sync Credentials: Authenticates `kubectl` to your target GKE cluster.
- Namespace: Checks for and creates the `barkland` namespace.
- Secrets Management: Reads `$GEMINI_API_KEY` from your local environment and creates a generic Kubernetes secret (`gemini-api-key`) in the cluster.
- Build & Push Images: Executes `./scripts/push-images` to compile your containers and push them to Artifact Registry.
- Manifest Apply: Uses `envsubst` to inject properties (like `WARMPOOL_REPLICAS` and `NAMESPACE`) from `.configuration` into the YAML definitions (e.g., `k8s/sandbox_warmpool.yaml` and `k8s/sandbox_template.yaml`) before applying them to the cluster.
- Rollout Verification: Waits for readiness confirmations for critical containers and verifies the `SandboxWarmPool` status.
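The manifest-apply step amounts to sourcing `.configuration` and piping each manifest through `envsubst`. A minimal self-contained sketch of that mechanism, using throwaway files in `/tmp` rather than the repository's actual manifests (the exact approach `deploy.sh` takes may differ):

```shell
# Sketch: load configuration values and substitute them into a manifest.
cat > /tmp/demo.configuration <<'EOF'
NAMESPACE="barkland"
WARMPOOL_REPLICAS="10"
EOF

# Export every variable defined in the file so envsubst can see them.
set -a
. /tmp/demo.configuration
set +a

# A minimal manifest fragment using the same ${VAR} placeholders.
cat > /tmp/demo-warmpool.yaml <<'EOF'
metadata:
  namespace: ${NAMESPACE}
spec:
  replicas: ${WARMPOOL_REPLICAS}
EOF

# envsubst replaces each ${VAR} with its exported value.
envsubst < /tmp/demo-warmpool.yaml
```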
If you need to build and push images separately from the full deployment, run:
```shell
# Build and push container images independently
./scripts/push-images --image-prefix=us-central1-docker.pkg.dev/your-project-id/barkland/ --extra-image-tag latest
```

Note: The image pushing script assumes an Artifact Registry path of the form:
`[REGISTRY_LOCATION]-docker.pkg.dev/[PROJECT_ID]/barkland`
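This path is simply the `.configuration` values joined together; for example:

```shell
# Assemble the image prefix from the same values used in .configuration
# (example values shown; substitute your own).
REGISTRY_LOCATION="us-central1"
PROJECT_ID="your-project-id"
REPO="barkland"

IMAGE_PREFIX="${REGISTRY_LOCATION}-docker.pkg.dev/${PROJECT_ID}/${REPO}"
echo "$IMAGE_PREFIX"
```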
Check readiness:
```shell
kubectl get pods,svc -n barkland
```

Retrieve your dashboard endpoint:
```shell
# Obtain the external IP address
kubectl get svc barkland-orchestrator -n barkland
```

Visit the reported IP in your browser to interact with the dashboard directly!