feat(substrate): bump agent substrate to 0.0.7 #2109
Pull request overview
This PR bumps the Agent Substrate dependency to v0.0.7 and updates kagent’s controller, API, Helm chart, and UI to match upstream breaking changes (immutable ActorTemplate.spec, removal of workerPoolRef, new pause states, and snapshot proto changes).
Changes:
- Add immutable-spec handling for Substrate `ActorTemplate` reconciliation (delete golden actor + recreate template on spec drift) and wire it through the generic reconciler path.
- Switch WorkerPool targeting from `workerPoolRef` to `workerSelector` / `kagent.dev/worker-pool` label; update Helm chart to stamp the label and expose WorkerPool customization.
- Update HTTP API + UI types/views for `sandboxClass`, `workerSelector`, `workerPoolName`, and `latestSnapshotInfo`; add `PAUSING`/`PAUSED` status handling.
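To make the selector-based targeting in the second bullet concrete, here is a minimal sketch of a helper that turns a WorkerPool name into a label selector. Only the `kagent.dev/worker-pool` label key comes from this PR; the `labelSelector` type and the `workerSelectorFor` name are hypothetical stand-ins, and the real selector type in substrate may differ.

```go
package main

import "fmt"

// workerPoolLabelKey is the label the Helm chart stamps on the WorkerPool
// (taken from the PR summary).
const workerPoolLabelKey = "kagent.dev/worker-pool"

// labelSelector is a local stand-in for illustration only; the actual
// selector type used by ActorTemplate is an assumption here.
type labelSelector struct {
	MatchLabels map[string]string
}

// workerSelectorFor is a hypothetical helper: it returns a selector that
// matches the WorkerPool labelled with poolName, or nil when no pool is named.
func workerSelectorFor(poolName string) *labelSelector {
	if poolName == "" {
		return nil
	}
	return &labelSelector{
		MatchLabels: map[string]string{workerPoolLabelKey: poolName},
	}
}

func main() {
	sel := workerSelectorFor("default")
	fmt.Println(sel.MatchLabels[workerPoolLabelKey])
}
```

Returning nil for an empty name is a design choice of this sketch, so callers can leave the selector unset instead of matching an empty label value.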
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| ui/src/types/index.ts | Updates UI-facing substrate types for selector/sandbox class and latest snapshot info. |
| ui/src/components/substrate/SubstrateStatusView.tsx | Updates the substrate status table columns to show sandbox class and worker selector. |
| helm/kagent/values.yaml | Removes deprecated runsc flags and adds WorkerPool sandbox class/labels/template options. |
| helm/kagent/templates/substrate-workerpool.yaml | Stamps the kagent.dev/worker-pool label and surfaces sandboxClass/template into the WorkerPool spec. |
| helm/kagent/templates/controller-deployment.yaml | Removes deprecated substrate runsc env var wiring from the controller deployment. |
| go/go.mod | Bumps the substrate replace target to v0.0.7 (and updates an indirect dependency). |
| go/go.sum | Updates checksums for the substrate bump and indirect dependency update. |
| go/core/pkg/sandboxbackend/substrate/list.go | Adds labels for the new PAUSING/PAUSED actor statuses. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_test.go | Adds a unit test asserting ActorTemplate recreation on spec drift. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_shared.go | Introduces WorkerPool label key constant + helper for generating a worker selector. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_delete_test.go | Extends the fake ate client to satisfy new substrate client interface surface. |
| go/core/pkg/sandboxbackend/substrate/lifecycle_actortemplate.go | Implements immutable-spec reconciliation semantics for ActorTemplates; switches to worker selector + sandbox class fields. |
| go/core/pkg/sandboxbackend/substrate/delete_actor.go | Updates deletion state machine to handle PAUSING/PAUSED. |
| go/core/pkg/sandboxbackend/substrate/agents_backend.go | Adds backend entry point to reconcile ActorTemplates with immutable-spec semantics. |
| go/core/pkg/sandboxbackend/substrate/agentharness_actor.go | Treats PAUSING/PAUSED the same as suspended for resume behavior. |
| go/core/pkg/sandboxbackend/substrate/agent_lifecycle.go | Updates SandboxAgent ActorTemplate generation to use worker selector + sandbox class. |
| go/core/pkg/sandboxbackend/substrate/agent_actor.go | Treats PAUSING/PAUSED the same as suspended for resume behavior. |
| go/core/pkg/app/app.go | Removes substrate runsc configuration flags and wiring into lifecycle defaults. |
| go/core/internal/httpserver/handlers/substrate.go | Updates substrate status API payload for sandbox class/selector and latest snapshot info. |
| go/core/internal/httpserver/handlers/substrate_test.go | Updates tests for selector/sandbox class fields in the substrate status response. |
| go/core/internal/controller/reconciler/reconciler.go | Delegates ActorTemplate reconciliation to sandbox backend for immutable-spec drift handling and prevents accidental pruning. |
| go/api/httpapi/substrate.go | Updates HTTP API types to match the new status payload fields. |
| examples/substrate-openclaw/README.md | Updates installation docs and example manifests for substrate v0.0.7 and selector-based WorkerPool targeting. |
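The `agent_actor.go` and `agentharness_actor.go` rows above say that `PAUSING`/`PAUSED` are treated the same as suspended for resume behavior. A minimal sketch of that decision follows; only the `PAUSING` and `PAUSED` names come from this PR, while the `actorStatus` type, the other status values, and the `needsResume` name are assumptions made for illustration.

```go
package main

import "fmt"

// actorStatus is a hypothetical local type; the real status enum lives in
// the substrate proto definitions.
type actorStatus string

const (
	statusRunning   actorStatus = "RUNNING"   // assumed name
	statusSuspended actorStatus = "SUSPENDED" // assumed name
	statusPausing   actorStatus = "PAUSING"   // from the PR
	statusPaused    actorStatus = "PAUSED"    // from the PR
)

// needsResume reports whether an actor in status s must be resumed before
// it can serve traffic. The pause states share the suspended branch.
func needsResume(s actorStatus) bool {
	switch s {
	case statusSuspended, statusPausing, statusPaused:
		return true
	default:
		return false
	}
}

func main() {
	for _, s := range []actorStatus{statusRunning, statusSuspended, statusPausing, statusPaused} {
		fmt.Printf("%s needsResume=%t\n", s, needsResume(s))
	}
}
```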
    // Delete the golden actor since it is an external ate-api resource
    if goldenID := strings.TrimSpace(existing.Status.GoldenActorID); goldenID != "" {
        done, derr := deleteGoldenActor(ctx, ate, goldenID)
        if derr != nil {
            return fmt.Errorf("delete golden actor %q before recreating ActorTemplate %s: %w", goldenID, key, derr)
        }
        if !done {
            return nil
        }
Makes sense, added requeue logic
      Spec: atev1alpha1.ActorTemplateSpec{
    -     PauseImage: p.Defaults.PauseImage,
    -     Runsc: defaultRunscConfig(p.Defaults),
    +     PauseImage: p.Defaults.PauseImage,
    +     SandboxClass: atev1alpha1.SandboxClassGvisor,
          Containers: []atev1alpha1.Container{
The scope of this PR should be gVisor only; perhaps we can add microVM support in a follow-up PR.
Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io>
69e00b4 to 5534c1a
EItanya left a comment
We definitely need to split up these controllers ASAP. The behavior is too different
        "object_kind", desired.GetObjectKind(),
    )

    // Substrate ActorTemplate.spec is immutable, delegate to the sandbox backend to handle spec drift.
We should separate these controllers in a follow-up, this is too specific to combine with the existing behavior IMO
Adapts kagent for substrate v0.0.8's atespace-scoped ActorRef identity
model (rename of ActorId→ActorRef{Atespace,Name} on all actor RPCs). Maps
atespace 1:1 to the SandboxAgent/AgentHarness Kubernetes namespace, adds
an EnsureAtespace idempotent helper, and updates the atenet-router Host
header shape to include the atespace label.
Also fixes a pre-existing kagent bug that PR kagent-dev#2109's ActorTemplate spec
immutability change surfaced: SnapshotsConfig.{OnPause,OnCommit} were
left zero-value in kagent's desired spec but the API server defaults
them to "Full" on admission, causing apiequality.Semantic.DeepEqual to
report drift every reconcile and hot-loop delete/recreate the
ActorTemplate CR.
Verified end-to-end on colima+kind with substrate v0.0.8 published
charts: SandboxAgent (declarative Go) and AgentHarness (openclaw) both
reach Ready=True and chat round-trip works.
Signed-off-by: Jonathan Jamroga <jjamroga@gmail.com>
Changes in this version (that affect Kagent):
- `ActorTemplate` spec became immutable -> delete golden actor and delete / recreate template on spec drift for sandbox agents and agent harness reconciler
- `workerPoolRef` removed from actor template -> kagent uses worker selector and `SandboxClass` instead; helm `WorkerPool` gets the `kagent.dev/worker-pool: <name>` label automatically
- `runsc` config decoupled from actor template -> removed relevant flags, they're owned by `SandboxConfig` now
- New `PAUSED` state and `PauseActor` RPC
- `last_snapshot` proto field replaced by `latest_snapshot_info`

Installation note:
You will likely need to wipe stale actor state in Valkey due to the last change above (see below). If you still run into issues with agents not becoming ready, delete and reapply them. Existing session data is persisted in the Kagent DB, so it will not be lost.