
optimization: replace .Update() with .Patch() for claim updateStatus #508

Open
vicentefb wants to merge 1 commit into kubernetes-sigs:main from vicentefb:patchClaimStatus

Conversation

@vicentefb
Member

@vicentefb vicentefb commented Apr 2, 2026

In an effort to reduce "Operation cannot be fulfilled..." conflicts at scale, this PR switches from updating to patching the status of the Sandbox Claim resource.

Tests from main without this change indicate:

322 "Operation cannot be fulfilled" conflicts from sandboxclaim (log filter: protoPayload.resourceName="pods/sandboxclaim-" OR protoPayload.resourceName="sandboxclaims/")

With this change, 0 conflicts.
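
To illustrate why the switch removes these conflicts, here is a minimal in-memory sketch. It is not the controller code from this PR and not the real API server: `store`, `Update`, and `Patch` are hypothetical stand-ins, and the status keys are made up. It assumes the patch is sent without a resourceVersion precondition, so only the full update is subject to the optimistic-concurrency check.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 "Operation cannot be fulfilled" response.
var errConflict = errors.New("409 Conflict: the object has been modified")

// object is a toy resource: a resourceVersion plus a flat status map.
type object struct {
	ResourceVersion int
	Status          map[string]string
}

// store is a hypothetical in-memory stand-in for the API server.
type store struct{ obj object }

// Get returns a deep copy, the way a controller works on its own cached copy.
func (s *store) Get() object {
	cp := object{ResourceVersion: s.obj.ResourceVersion, Status: map[string]string{}}
	for k, v := range s.obj.Status {
		cp.Status[k] = v
	}
	return cp
}

// Update replaces the whole status and fails if the caller's copy is stale.
func (s *store) Update(o object) error {
	if o.ResourceVersion != s.obj.ResourceVersion {
		return errConflict
	}
	s.obj.Status = o.Status
	s.obj.ResourceVersion++
	return nil
}

// Patch merges only the given keys and does not check the resourceVersion.
func (s *store) Patch(changes map[string]string) {
	for k, v := range changes {
		s.obj.Status[k] = v
	}
	s.obj.ResourceVersion++
}

func main() {
	s := &store{obj: object{ResourceVersion: 1, Status: map[string]string{}}}

	stale := s.Get()                             // a reconcile reads the claim
	s.Patch(map[string]string{"phase": "Bound"}) // another write lands first

	stale.Status["ready"] = "true"
	fmt.Println("update from stale copy:", s.Update(stale))

	s.Patch(map[string]string{"ready": "true"})
	fmt.Println("status after patch:", s.obj.Status)
}
```

The trade-off in this sketch is that a patch without a resourceVersion is last-writer-wins for the fields it touches, so it fits best when a single controller owns those status fields.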

Test parameters:

# BURST_SIZE * TOTAL_BURSTS = Total sandbox claims created
BURST_SIZE=300
QPS=300
TOTAL_BURSTS=5
WARMPOOL_SIZE=600
RUNTIME_CLASS="" # Change to "gvisor" if your cluster supports it
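
As a quick check of the comment in the parameters above, the following sketch computes the total number of claims for this run. The function name `totalClaims` is hypothetical; the numbers are the ones listed above.

```go
package main

import "fmt"

// totalClaims mirrors the comment in the test parameters:
// BURST_SIZE * TOTAL_BURSTS = total sandbox claims created.
func totalClaims(burstSize, totalBursts int) int {
	return burstSize * totalBursts
}

func main() {
	const (
		burstSize    = 300
		totalBursts  = 5
		warmPoolSize = 600
	)
	total := totalClaims(burstSize, totalBursts)
	fmt.Println("total claims:", total)
	// The total exceeds the initial warm pool size, so the pool has to be
	// replenished while the test is running.
	fmt.Println("claims beyond initial warm pool:", total-warmPoolSize)
}
```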

Deployment args

        args:
        - "--leader-elect=true"
        - "--extensions"
        - "--enable-tracing=true"
        - --zap-log-level=debug
        - --zap-encoder=json
        - --enable-pprof-debug
        - --kube-api-qps=1000
        - --kube-api-burst=2000
        - --sandbox-concurrent-workers=400
        - --sandbox-claim-concurrent-workers=400
        - --sandbox-warm-pool-concurrent-workers=1

@netlify

netlify bot commented Apr 2, 2026

Deploy Preview for agent-sandbox canceled.

Name Link
🔨 Latest commit 203902d
🔍 Latest deploy log https://app.netlify.com/projects/agent-sandbox/deploys/69d03b71d1ecff000872322f

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vicentefb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 2, 2026
@justinsb justinsb self-assigned this Apr 2, 2026
@justinsb
Contributor

justinsb commented Apr 2, 2026

Let's have the "are there two controllers acting on the same object" discussion on #509

@vicentefb
Member Author

I ran two very small tests by having 1 claim with a warmpool of size 2.

BURST_SIZE=1
QPS=1
TOTAL_BURSTS=10
WARMPOOL_SIZE=2

From oss main:

Using Update: The claim controller updates the status at step 6. Shortly after, at step 9, it tries to update it again and hits a 409 Conflict because the ResourceVersion has changed in the meantime.

Also worth noting: there is a pod conflict at step 26, caused by an Update on the pod. The controller is trying to update the pod's labels while the Kubelet is simultaneously updating the pod's status. This is a separate issue where we could also use .Patch().

| Step | Delta (ms) | Method | Target Resource | Result / Notes |
|---|---|---|---|---|
| 1 | 0 | PATCH | sandboxclaims/agent-claim-4 | 🟢 Test Runner injects Claim |
| 2 | 17 | UPDATE | sandboxes/warmpool-0-6qkhx | 🟢 Claim Controller adopts Sandbox |
| 3 | 28 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found |
| 4 | 32 | CREATE | events/agent-claim-4... | 🟢 K8s records adoption event |
| 5 | 42 | CREATE | sandboxes | 🟢 WarmPool orders replacement |
| 6 | 45 | UPDATE | sandboxclaims/agent-claim-4/status | 🟢 Claim Controller updates status |
| 7 | 56 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Retry) |
| 8 | 58 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 9 | 67 | UPDATE | sandboxclaims/agent-claim-4/status | 🔴 409 Conflict (Claim collides w/ itself) |
| 10 | 68 | UPDATE | pods/warmpool-0-6qkhx | 🟢 Sandbox Controller updates pod |
| 11 | 71 | PATCH | sandboxes/warmpool-0-qcblt | 🟢 WarmPool configures replacement |
| 12 | 77 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Retry) |
| 13 | 85 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 14 | 90 | PATCH | sandboxes/warmpool-0-6qkhx/status | 🟢 SUCCESS: Sandbox Status patched |
| 15 | 97 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 16 | 111 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Retry) |
| 17 | 127 | CREATE | pods/warmpool-0-qcblt | 🟢 Sandbox Controller provisions pod |
| 18 | 129 | UPDATE | sandboxclaims/agent-claim-4/status | 🟢 Retry Succeeds: Claim status resolves |
| 19 | 132 | UPDATE | pods/warmpool-0-6qkhx | 🟢 Sandbox Controller updates pod |
| 20 | 147 | PATCH | sandboxes/warmpool-0-6qkhx/status | 🟢 SUCCESS: Sandbox Status patched again |
| 21 | 150 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Retry) |
| 22 | 152 | PATCH | sandboxes/warmpool-0-qcblt | 🟢 WarmPool configures replacement |
| 23 | 166 | CREATE | services/warmpool-0-qcblt | 🟢 Service provisioned |
| 24 | 191 | UPDATE | pods/warmpool-0-6qkhx | 🟢 Sandbox Controller updates pod |
| 25 | 193 | PATCH | sandboxes/warmpool-0-qcblt/status | 🟢 SUCCESS: New Sandbox Status patched |
| 26 | 244 | UPDATE | pods/warmpool-0-qcblt | 🔴 409 Conflict (Sandbox collides w/ Kubelet) |
| 27 | 264 | PATCH | sandboxes/warmpool-0-qcblt/status | 🟢 SUCCESS: New Sandbox Status patched again |
| 28 | 312 | UPDATE | pods/warmpool-0-qcblt | 🟢 Sandbox Controller updates pod |
| 29 | 361 | UPDATE | pods/warmpool-0-qcblt | 🟢 Sandbox Controller updates pod |
| 30 | 380 | PATCH | sandboxes/warmpool-0-qcblt/status | 🟢 SUCCESS: New Sandbox Status patched again |
| 31 | 715 | UPDATE | pods/warmpool-0-qcblt | 🟢 Sandbox Controller updates pod |
| 32 | 734 | PATCH | sandboxes/warmpool-0-qcblt/status | 🟢 SUCCESS: New Sandbox Status patched again |
| 33 | 758 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 34 | 796 | UPDATE | pods/warmpool-0-qcblt | 🟢 Sandbox Controller updates pod |
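
The conflict at step 9 and the successful retry at step 18 follow the usual pattern for a full update: the write is rejected because it was made from a stale copy, and it only goes through after the object is re-read. A minimal sketch of that pattern, using a hypothetical in-memory `server` and `claim` type rather than the real client or controller code (the phase values are placeholders):

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("409 Conflict")

// claim is a toy stand-in for a claim object: a version counter and one status field.
type claim struct {
	resourceVersion int
	phase           string
}

// server is a hypothetical in-memory API server holding a single claim.
type server struct{ stored claim }

func (s *server) get() claim { return s.stored }

// updateStatus rejects writes made from a stale copy, like a full Update.
func (s *server) updateStatus(c claim) error {
	if c.resourceVersion != s.stored.resourceVersion {
		return errConflict
	}
	c.resourceVersion++
	s.stored = c
	return nil
}

// updateWithRetry re-reads the claim and retries when the write conflicts.
// It returns the number of attempts made.
func updateWithRetry(s *server, cached claim, phase string, maxAttempts int) (int, error) {
	c := cached
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		c.phase = phase
		err := s.updateStatus(c)
		if err == nil {
			return attempt, nil
		}
		if !errors.Is(err, errConflict) {
			return attempt, err
		}
		c = s.get() // refresh to the latest resourceVersion before retrying
	}
	return maxAttempts, errConflict
}

func main() {
	s := &server{stored: claim{resourceVersion: 1}}
	cached := s.get() // the copy a second reconcile is still holding

	// An earlier status write succeeds and bumps the version.
	_ = s.updateStatus(claim{resourceVersion: 1, phase: "Pending"})

	attempts, err := updateWithRetry(s, cached, "Ready", 3)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

In the trace the retry happens through another reconcile rather than an in-process loop, so each conflict costs an extra round trip and roughly 60 ms of delay before the claim status settles.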

With this PR:

Using Patch: It fires five separate status patches across two resources: three to the claim (steps 6, 9, and 18) and two to the adopted sandbox (steps 14 and 21). Even though the rest of the cluster is actively changing metadata and provisioning resources, every single one of those patches succeeds with zero conflicts.

| Step | Delta (ms) | Method | Target Resource | Result / Notes |
|---|---|---|---|---|
| 1 | 0 | PATCH | sandboxclaims/agent-claim-4 | 🟢 Start: Test Runner injects Claim 4 |
| 2 | 19 | UPDATE | sandboxes/warmpool-0-8wljs | 🟢 Claim Controller adopts Sandbox |
| 3 | 31 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found |
| 4 | 37 | CREATE | events/agent-claim-4.18a... | 🟢 K8s records Sandbox adoption event |
| 5 | 45 | CREATE | sandboxes | 🟢 WarmPool orders replacement Sandbox |
| 6 | 49 | PATCH | sandboxclaims/agent-claim-4/status | 🟢 SUCCESS: Claim Status Patched |
| 7 | 60 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Reconcile retry) |
| 8 | 63 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 9 | 70 | PATCH | sandboxclaims/agent-claim-4/status | 🟢 SUCCESS: Claim Status Patched |
| 10 | 74 | PATCH | sandboxes/warmpool-0-7drtx | 🟢 WarmPool configures replacement sandbox |
| 11 | 81 | UPDATE | pods/warmpool-0-8wljs | 🟢 Sandbox Controller updates adopted pod |
| 12 | 81 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Reconcile retry) |
| 13 | 92 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 14 | 100 | PATCH | sandboxes/warmpool-0-8wljs/status | 🟢 Sandbox Controller updates status |
| 15 | 116 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Reconcile retry) |
| 16 | 116 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 17 | 126 | CREATE | pods/warmpool-0-7drtx | 🟢 Sandbox Controller provisions replacement pod |
| 18 | 136 | PATCH | sandboxclaims/agent-claim-4/status | 🟢 SUCCESS: Claim Status Patched |
| 19 | 140 | UPDATE | pods/warmpool-0-8wljs | 🟢 Sandbox Controller updates adopted pod |
| 20 | 145 | PATCH | sandboxes/warmpool-0-7drtx | 🟢 WarmPool configures replacement sandbox |
| 21 | 153 | PATCH | sandboxes/warmpool-0-8wljs/status | 🟢 Sandbox Controller updates status |
| 22 | 156 | DELETE | networkpolicies/agent-claim-4-network-policy | 🟡 404 Not Found (Reconcile retry) |
| 23 | 159 | CREATE | services/warmpool-0-7drtx | 🟢 Service provisioned for new pod |
| 24 | 177 | PATCH | sandboxes/warmpool-0-7drtx/status | 🟢 Sandbox Controller updates new pod status |
| 25 | 186 | UPDATE | pods/warmpool-0-8wljs | 🟢 Sandbox Controller updates adopted pod |
| 26 | 217 | UPDATE | pods/warmpool-0-7drtx | 🔴 409 Conflict (Sandbox Controller collides w/ Kubelet on Pod) |
| 27 | 236 | PATCH | sandboxes/warmpool-0-7drtx/status | 🟢 Sandbox Controller updates status |
| 28 | 279 | UPDATE | pods/warmpool-0-7drtx | 🟢 Retry Succeeds: Pod updated. |
| 29 | 337 | PATCH | sandboxes/warmpool-0-7drtx/status | 🟢 Sandbox Controller updates status |
| 30 | 932 | UPDATE | pods/warmpool-0-7drtx | 🟢 Sandbox Controller updates replacement pod |
| 31 | 950 | PATCH | sandboxes/warmpool-0-7drtx/status | 🟢 Sandbox Controller updates status |
| 32 | 975 | PATCH | sandboxwarmpools/warmpool-0/status | 🟢 WarmPool updates pool status |
| 33 | 999 | UPDATE | pods/warmpool-0-7drtx | 🟢 Sandbox Controller updates replacement pod |
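
One way to see why the PATCH calls above never conflict is to look at what a status patch body might contain. The sketch below assumes a JSON merge patch (RFC 7386); the actual patch type used by this PR is not shown in this conversation, and `statusMergePatch` and the `phase` field are hypothetical. The point is that the body carries only the changed status fields and no resourceVersion, so there is nothing for the server to reject as stale.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusMergePatch builds a JSON merge patch body that touches only the given
// status fields. It deliberately omits metadata.resourceVersion, so the request
// carries no optimistic-concurrency precondition.
func statusMergePatch(fields map[string]any) ([]byte, error) {
	return json.Marshal(map[string]any{"status": fields})
}

func main() {
	// "phase" is a placeholder field name used only for illustration.
	body, err := statusMergePatch(map[string]any{"phase": "Ready"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

If stale writes must still be rejected, a patch can include the resourceVersion as a precondition, which brings the 409 behavior back for that request.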


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. next-step:contributor size/S Denotes a PR that changes 10-29 lines, ignoring generated files.


3 participants