**File:** `k8s/README.md` (+205 lines)

# NemoClaw on Kubernetes

> **⚠️ Experimental**: This deployment method is intended for **trying out NemoClaw on Kubernetes**, not for production use. It requires a **privileged pod** running **Docker-in-Docker (DinD)** to create isolated sandbox environments. Operational requirements (storage, runtime, security policies) vary by cluster configuration.

Run [NemoClaw](https://github.com/NVIDIA/NemoClaw) on Kubernetes with GPU inference powered by [Dynamo](https://github.com/ai-dynamo/dynamo) or any OpenAI-compatible endpoint.

---

## Quick Start

### Prerequisites

- Kubernetes cluster with `kubectl` access
- An OpenAI-compatible inference endpoint (Dynamo vLLM, vLLM, etc.)
- Permissions to create **privileged pods** (required for Docker-in-Docker)
- Sufficient node resources (the manifest requests 8Gi memory and 2 CPUs for the DinD container, plus 4Gi and 2 CPUs for the workspace container)

### 1. Deploy NemoClaw

```bash
kubectl create namespace nemoclaw
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/NemoClaw/main/k8s/nemoclaw-k8s.yaml
```

### 2. Check Logs

```bash
kubectl logs -f nemoclaw -n nemoclaw -c workspace
```

Wait for the "Onboard complete" message.

### 3. Connect to Your Sandbox

```bash
kubectl exec -it nemoclaw -n nemoclaw -c workspace -- nemoclaw my-assistant connect
```

You are now inside an isolated sandbox with an AI agent ready to help.

---

## Configuration

Edit the environment variables in `nemoclaw-k8s.yaml` before deploying:

| Variable | Required | Description |
|----------|----------|-------------|
| `DYNAMO_HOST` | Yes | Inference endpoint for socat proxy (e.g., `vllm-frontend.dynamo.svc:8000`) |
| `NEMOCLAW_ENDPOINT_URL` | Yes | URL the sandbox uses (usually `http://host.openshell.internal:8000/v1`) |
| `COMPATIBLE_API_KEY` | Yes | API key (use `dummy` for Dynamo/vLLM) |
| `NEMOCLAW_MODEL` | Yes | Model name (e.g., `meta-llama/Llama-3.1-8B-Instruct`) |
| `NEMOCLAW_SANDBOX_NAME` | No | Sandbox name (default: `my-assistant`) |
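Before deploying, it can help to sanity-check that every required variable has a value. A minimal Python sketch, using the variable names from the table above (the helper itself is illustrative, not part of NemoClaw):

```python
# Required variables from the configuration table; the check is illustrative.
REQUIRED_VARS = ["DYNAMO_HOST", "NEMOCLAW_ENDPOINT_URL",
                 "COMPATIBLE_API_KEY", "NEMOCLAW_MODEL"]

def missing_vars(env: dict) -> list:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

env = {
    "DYNAMO_HOST": "vllm-frontend.dynamo.svc:8000",
    "NEMOCLAW_ENDPOINT_URL": "http://host.openshell.internal:8000/v1",
    "COMPATIBLE_API_KEY": "dummy",
    "NEMOCLAW_MODEL": "meta-llama/Llama-3.1-8B-Instruct",
}
print(missing_vars(env))  # → []
```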

### Example: Custom Endpoint

```yaml
env:
- name: DYNAMO_HOST
value: "my-vllm.my-namespace.svc.cluster.local:8000"
- name: NEMOCLAW_ENDPOINT_URL
value: "http://host.openshell.internal:8000/v1"
- name: COMPATIBLE_API_KEY
value: "dummy"
- name: NEMOCLAW_MODEL
value: "mistralai/Mistral-7B-Instruct-v0.3"
```

---

## Using NemoClaw

### Access the Workspace Shell

```bash
kubectl exec -it nemoclaw -n nemoclaw -c workspace -- bash
```

### Check Sandbox Status

```bash
kubectl exec nemoclaw -n nemoclaw -c workspace -- nemoclaw list
kubectl exec nemoclaw -n nemoclaw -c workspace -- nemoclaw my-assistant status
```

### Connect to Sandbox

```bash
kubectl exec -it nemoclaw -n nemoclaw -c workspace -- nemoclaw my-assistant connect
```

### Test Inference

From inside the sandbox:

```bash
curl -s https://inference.local/v1/models

curl -s https://inference.local/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"meta-llama/Llama-3.1-8B-Instruct","messages":[{"role":"user","content":"Hello!"}],"max_tokens":50}'
```
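The same chat-completions call can be driven from any OpenAI-compatible client. A minimal Python sketch that builds the request body used above (the model name is the example from this README; actually POSTing it requires network access from inside the sandbox):

```python
import json

def chat_request(model: str, prompt: str, max_tokens: int = 50) -> str:
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

body = chat_request("meta-llama/Llama-3.1-8B-Instruct", "Hello!")
# POST `body` to https://inference.local/v1/chat/completions with
# Content-Type: application/json, e.g. via urllib.request or the openai client.
```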

### Verify Local Inference

Confirm NemoClaw is using your Dynamo/vLLM endpoint:

```bash
# Check model from sandbox
kubectl exec -it nemoclaw -n nemoclaw -c workspace -- nemoclaw my-assistant connect
sandbox@my-assistant:~$ curl -s https://inference.local/v1/models
# Should show your model (e.g., meta-llama/Llama-3.1-8B-Instruct)

# Compare with Dynamo directly (from workspace)
kubectl exec nemoclaw -n nemoclaw -c workspace -- curl -s http://localhost:8000/v1/models
# Should show the same model

# Check provider configuration
kubectl exec nemoclaw -n nemoclaw -c workspace -- openshell inference get
# Shows: Provider: compatible-endpoint, Model: <your-model>

# Test the agent
sandbox@my-assistant:~$ openclaw agent --agent main -m "What is 7 times 8?"
# Should respond with 56
```
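To compare the two `/v1/models` responses programmatically rather than by eye, a small helper can extract and match the model ids (this assumes the standard OpenAI list shape, a `data` array of `{"id": ...}` entries; the payloads below are illustrative):

```python
import json

def model_ids(models_json: str) -> set:
    """Extract model ids from an OpenAI-style /v1/models response."""
    return {entry["id"] for entry in json.loads(models_json)["data"]}

# Illustrative captures of the two curl commands above
sandbox = '{"object":"list","data":[{"id":"meta-llama/Llama-3.1-8B-Instruct"}]}'
direct  = '{"object":"list","data":[{"id":"meta-llama/Llama-3.1-8B-Instruct"}]}'
print(model_ids(sandbox) == model_ids(direct))  # → True
```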

---

## Architecture

```text
┌─────────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ NemoClaw Pod │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────────────────┐ │ │
│ │ │ Docker-in-Docker│ │ Workspace Container │ │ │
│ │ │ │ │ │ │ │
│ │ │ ┌───────────┐ │ │ nemoclaw CLI │ │ │
│ │ │ │ k3s │ │◄───│ openshell CLI │ │ │
│ │ │ │ cluster │ │ │ │ │ │
│ │ │ │ │ │ │ socat proxy ───────────────│───│──┼──► Dynamo/vLLM
│ │ │ │ ┌───────┐ │ │ │ localhost:8000 │ │ │
│ │ │ │ │Sandbox│ │ │ │ │ │ │
│ │ │ │ └───────┘ │ │ │ host.openshell.internal │ │ │
│ │ │ └───────────┘ │ │ routes to socat │ │ │
│ │ └─────────────────┘ └─────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```

**How it works:**

1. NemoClaw runs in a privileged pod with Docker-in-Docker
2. OpenShell creates a nested k3s cluster for sandbox isolation
3. A socat proxy bridges K8s DNS to the nested environment
4. Inside the sandbox, `host.openshell.internal:8000` routes to the inference endpoint
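
The socat bridge in step 3 is just a TCP forwarder. A self-contained Python sketch of the same idea (illustrative only; the deployment uses the actual `socat` binary):

```python
import socket
import threading

def _pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes one way until EOF, then propagate the EOF downstream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def start_proxy(target_host: str, target_port: int) -> int:
    """Listen on an ephemeral local port and forward every connection to the
    target, like `socat TCP-LISTEN:8000,fork TCP:$DYNAMO_HOST`. Returns the port."""
    server = socket.socket()
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))
    server.listen()

    def accept_loop() -> None:
        while True:
            client, _ = server.accept()
            upstream = socket.create_connection((target_host, target_port))
            threading.Thread(target=_pipe, args=(client, upstream), daemon=True).start()
            threading.Thread(target=_pipe, args=(upstream, client), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server.getsockname()[1]
```

Each accepted connection gets its own upstream connection and a pair of one-way pipes, which is what socat's `fork` option provides.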

---

## Troubleshooting

### Pod won't start

```bash
kubectl describe pod nemoclaw -n nemoclaw
```

Common issues:

- Missing privileged security context
- Insufficient memory (needs ~8GB for DinD)

### Docker daemon not starting

```bash
kubectl logs nemoclaw -n nemoclaw -c dind
```

The daemon usually becomes ready within 30-60 seconds.

### Inference not working

Check socat is running:

```bash
kubectl exec nemoclaw -n nemoclaw -c workspace -- pgrep -a socat
```

Test endpoint directly:

```bash
kubectl exec nemoclaw -n nemoclaw -c workspace -- curl -s http://localhost:8000/v1/models
```

---

## Learn More

- [NemoClaw Documentation](https://docs.nvidia.com/nemoclaw)
- [OpenShell](https://github.com/NVIDIA/OpenShell)
- [Dynamo](https://github.com/ai-dynamo/dynamo)
- [OpenClaw](https://openclaw.ai)
**File:** `k8s/nemoclaw-k8s.yaml` (+119 lines)

```yaml
# NemoClaw on Kubernetes
# Uses the official installer with Docker-in-Docker for sandbox isolation.
# Prerequisites: kubectl create namespace nemoclaw
apiVersion: v1
kind: Pod
metadata:
  name: nemoclaw
  namespace: nemoclaw
  labels:
    app: nemoclaw
spec:
  # The pod never calls the Kubernetes API, so don't mount a token.
  automountServiceAccountToken: false
  containers:
    # Docker daemon (DinD)
    - name: dind
      image: docker:24-dind
      securityContext:
        privileged: true
      env:
        - name: DOCKER_TLS_CERTDIR
          value: ""
      command: ["dockerd", "--host=unix:///var/run/docker.sock"]
      volumeMounts:
        - name: docker-storage
          mountPath: /var/lib/docker
        - name: docker-socket
          mountPath: /var/run
        - name: docker-config
          mountPath: /etc/docker
      resources:
        requests:
          memory: "8Gi"
          cpu: "2"

    # Workspace - runs the official NemoClaw installer
    - name: workspace
      image: node:22
      command:
        - bash
        - -c
        - |
          set -e

          # Install packages
          echo "[1/4] Installing packages..."
          apt-get update -qq
          apt-get install -y -qq docker.io socat curl >/dev/null 2>&1

          # Start socat proxy for the K8s DNS bridge; fail fast if it dies,
          # since `set -e` does not cover background processes
          echo "[2/4] Starting socat proxy..."
          : "${DYNAMO_HOST:?DYNAMO_HOST must be set as host:port}"
          socat TCP-LISTEN:8000,fork,reuseaddr TCP:$DYNAMO_HOST &
          SOCAT_PID=$!
          # Add a hosts entry so validation can reach socat via host.openshell.internal
          echo "127.0.0.1 host.openshell.internal" >> /etc/hosts
          sleep 1
          kill -0 "$SOCAT_PID" 2>/dev/null || { echo "socat failed to start"; exit 1; }

          # Wait for Docker
          echo "[3/4] Waiting for Docker daemon..."
          for i in $(seq 1 30); do
            if docker info >/dev/null 2>&1; then break; fi
            sleep 2
          done
          docker info >/dev/null 2>&1 || { echo "Docker not ready"; exit 1; }
          echo "Docker ready"

          # Run the official NemoClaw installer
          echo "[4/4] Running NemoClaw installer..."
          curl -fsSL https://nvidia.com/nemoclaw.sh | bash

          # Keep running after onboarding
          echo "Onboard complete. Container staying alive."
          exec sleep infinity
      env:
        - name: DOCKER_HOST
          value: unix:///var/run/docker.sock
        # Dynamo endpoint (raw host:port for socat) - UPDATE THIS FOR YOUR CLUSTER
        - name: DYNAMO_HOST
          value: "vllm-agg-frontend.dynamo.svc.cluster.local:8000"
        # NemoClaw config (uses host.openshell.internal via socat)
        - name: NEMOCLAW_NON_INTERACTIVE
          value: "1"
        - name: NEMOCLAW_PROVIDER
          value: "custom"
        - name: NEMOCLAW_ENDPOINT_URL
          value: "http://host.openshell.internal:8000/v1"
        - name: COMPATIBLE_API_KEY
          value: "dummy"
        - name: NEMOCLAW_MODEL
          value: "meta-llama/Llama-3.1-8B-Instruct"
        - name: NEMOCLAW_SANDBOX_NAME
          value: "my-assistant"
        - name: NEMOCLAW_POLICY_MODE
          value: "skip"
      volumeMounts:
        - name: docker-socket
          mountPath: /var/run
        - name: docker-config
          mountPath: /etc/docker
      resources:
        requests:
          memory: "4Gi"
          cpu: "2"

  initContainers:
    # Configure the Docker daemon for cgroup v2
    - name: init-docker-config
      image: busybox
      command: ["sh", "-c", "echo '{\"default-cgroupns-mode\":\"host\"}' > /etc/docker/daemon.json"]
      volumeMounts:
        - name: docker-config
          mountPath: /etc/docker

  volumes:
    - name: docker-storage
      emptyDir: {}
    - name: docker-socket
      emptyDir: {}
    - name: docker-config
      emptyDir: {}

  restartPolicy: Never
```