feat: support workspace resource overrides on SandboxClaims #459
noeljackson wants to merge 4 commits into kubernetes-sigs:main
Conversation
✅ Deploy Preview for agent-sandbox ready!
Hi @noeljackson. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
(force-pushed 518b632 to 9347f1d)
codebot-robot left a comment
Overall, this PR introduces a clean and narrow way to override workspace resources at the claim level. The struct additions, deep copying, and Sandbox template modifications are solidly implemented.
However, there is a critical flaw regarding the warm pool adoption path. The PR updates the Sandbox CR's PodTemplate upon adoption, but the sandbox_controller (which manages the actual Pod) currently ignores PodTemplate.Spec changes for existing, running pods. As a result, when a claim adopts a warm sandbox, the actual Pod will continue running with the template's default resources instead of the requested overrides.
To fix this, the system must either:
- support in-place Pod resource resizing in `sandbox_controller` (requires K8s 1.27+ `InPlacePodVerticalScaling`), or
- skip warm pool adoption when the claim's requested resources differ from the warm pool's default sizing, falling back to a cold start to guarantee the correct sizing.
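The second option is the simpler guard. A minimal sketch of the adoption check, using simplified stand-in types rather than the real `corev1.ResourceRequirements` (all names here are illustrative, not the actual agent-sandbox API; note that string comparison also misses quantity equivalences like `500m` vs `0.5`, which the real types handle):

```go
package main

import (
	"fmt"
	"reflect"
)

// Simplified stand-ins for corev1.ResourceList / ResourceRequirements.
// A real controller would use k8s.io/api/core/v1 and resource.Quantity.
type ResourceList map[string]string

type ResourceRequirements struct {
	Requests ResourceList
	Limits   ResourceList
}

// adoptable reports whether a warm sandbox can be adopted for a claim:
// only when the claim requests no override, or the override exactly
// matches the warm pool's default sizing. Otherwise the controller
// falls back to a cold start so the requested sizing is guaranteed.
func adoptable(claimOverride *ResourceRequirements, warmDefault ResourceRequirements) bool {
	if claimOverride == nil {
		return true // no override requested: warm default is fine
	}
	return reflect.DeepEqual(*claimOverride, warmDefault)
}

func main() {
	warm := ResourceRequirements{
		Requests: ResourceList{"cpu": "500m", "memory": "1Gi"},
	}
	bigger := ResourceRequirements{
		Requests: ResourceList{"cpu": "2", "memory": "4Gi"},
	}
	fmt.Println(adoptable(nil, warm))     // true: adopt warm sandbox
	fmt.Println(adoptable(&bigger, warm)) // false: cold start required
}
```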
I've left detailed inline comments on this and a few minor optimizations.
(This review was generated by Overseer)
```go
for k, v := range sandbox.Spec.PodTemplate.ObjectMeta.Labels {
	if pod.Labels[k] != v {
		pod.Labels[k] = v
		changed = true
	}
}
```
This correctly propagates annotations from the Sandbox to the Pod. However, note that any changes to Sandbox.Spec.PodTemplate.Spec (like the new WorkspaceResources) are completely ignored for existing pods in this reconcilePod path.
This breaks the warm-pool adoption of WorkspaceResources. If the intention is for adopted pods to resize, Pod patching must be implemented here for the .Spec.Containers[].Resources field.
Correct. Rather than modifying the sandbox controller's reconcilePod path (which has broader implications), #487 handles this from the claim controller side by patching the pod directly when workspaceResources change. This keeps the sandbox controller's existing behavior intact.
| "agents.x-k8s.io/sandbox-name-hash": nameHash, | ||
| "custom-label": "label-val", | ||
| }, | ||
| Annotations: map[string]string{ |
There was a problem hiding this comment.
This test effectively verifies that the annotation map is propagated, which aligns with the new logic in reconcilePod. Good addition.
(force-pushed e68a2e3 to f361431)
Thanks for the thorough review.
Warm pool resource drift: The …
Redundant nil check: The …
(force-pushed f361431 to 71677e5)
(force-pushed c62e038 to b3a0c58)
Friendly bump. All review comments addressed, tests pass. Could a maintainer run `/ok-to-test`?
(force-pushed b3a0c58 to 6ce4cd1)
/ok-to-test
Thanks for this @noeljackson. I think the propagation of the annotations from the "parent" Sandbox podTemplate to the Pod is very interesting from the sig-apps domain, particularly in light of in-place pod resize.

Historically, IIUC, labels/annotations can be changed on a Pod dynamically; this does not cause a Pod restart. (The other mutable field is image, for "reasons".) On a Deployment/StatefulSet/DaemonSet, you can change the labels/annotations on a podTemplate, and this triggers new pods to be created (so it propagates, but in an expensive way). Now that we have in-place pod updates, I honestly don't know how this works with Deployment/StatefulSet/DaemonSet; I think it still causes pod restarts.

I think there's a strong case for Sandbox to lead the way here and prove out that in-place pod mutation is a good strategy for labels/annotations/resources (and maybe other fields that become mutable in future). But it's really a sig-apps question, because I don't want to set the wrong precedent. @janetkuo and @soltysh please chime in / keep me honest here. Would it be helpful to have a separate top-level issue for specifically this question?

And @noeljackson: it looks like maybe the annotation propagation from the Sandbox to the Pod is separable? If so, do you want to split it out so we can treat this PR as "only part of extensions/", i.e. SandboxClaim only? I don't want to hold up your PR on what is a fairly fundamental design question, so it might take a while (unless it's already decided and I just don't know about it, which is certainly possible!)
Good call, the annotation propagation is separable. I'll split it out so this PR stays scoped to `extensions/` (SandboxClaim only). The Sandbox→Pod annotation/resource propagation can be a follow-up once the sig-apps design question is settled. Will rebase and push a scoped version shortly.
When SandboxClaim.spec.workspaceResources is updated on a claim with an
existing running sandbox, the controller now patches the pod's container
resources in-place. On Kubernetes 1.27+ with InPlacePodVerticalScaling,
this triggers the kubelet to call UpdateContainerResources on the
container runtime, resizing the workload without restart.
Previously, resource changes on existing claims were silently ignored
('sandbox already exists, skipping update').
Remove Sandbox→Pod annotation propagation from sandbox_controller.go per maintainer feedback. The in-place annotation/resource propagation is a sig-apps design question and will be a separate follow-up. This PR now only touches extensions/ (SandboxClaim workspaceResources).
(force-pushed 6ce4cd1 to 7dd1b10)
/retest
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: noeljackson. Needs approval from an approver in each of these files: … Approvers can indicate their approval by writing `/approve` in a comment. The full list of commands accepted by this bot can be found here.
@noeljackson: The following test failed, say `/retest` to rerun all failed tests.

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Summary
Why
Today SandboxClaim cannot carry per-claim workspace sizing through the CRD/controller path. In practice that means claims requesting larger workspaces still adopt or create sandboxes using the template defaults.
This change adds a narrow claim-level resource override for the workspace container so the requested sizing survives both cold create and warm-pool adoption.
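A claim using the override might look like the fragment below. The `workspaceResources` field name comes from this PR; the `apiVersion`, metadata, and resource values are illustrative assumptions, not taken from the repository:

```yaml
apiVersion: agents.x-k8s.io/v1alpha1   # illustrative version
kind: SandboxClaim
metadata:
  name: big-workspace
spec:
  workspaceResources:        # claim-level override added by this PR
    requests:
      cpu: "2"
      memory: 4Gi
    limits:
      cpu: "2"
      memory: 4Gi
```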
Scope