security: add network policy support for sandbox isolation#262
security: add network policy support for sandbox isolation#262sicaario wants to merge 1 commit intovolcano-sh:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Welcome @sicaario! It looks like this is your first PR to volcano-sh/agentcube 🎉 |
There was a problem hiding this comment.
Pull request overview
This PR implements Kubernetes NetworkPolicy support for sandbox isolation, addressing issue #216. It adds a new networkPolicy field to both AgentRuntime and CodeInterpreter CRDs, enabling fine-grained control over sandbox network access. When no policy is specified, a default deny-all policy is automatically applied for maximum security.
Changes:
- New
SandboxNetworkPolicytype wrapping Kubernetes NetworkPolicyIngressRule and NetworkPolicyEgressRule - Network policy lifecycle management: creation with sandbox, deletion on explicit removal or GC
- RBAC permissions added for workload manager to manage NetworkPolicies
- Complete integration into sandbox creation, deletion, and garbage collection flows
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/apis/runtime/v1alpha1/agent_type.go | Added NetworkPolicy field to AgentRuntimeSpec |
| pkg/apis/runtime/v1alpha1/codeinterpreter_types.go | Added NetworkPolicy field to CodeInterpreterSpec and defined SandboxNetworkPolicy type |
| pkg/apis/runtime/v1alpha1/zz_generated.deepcopy.go | Generated DeepCopy methods for SandboxNetworkPolicy |
| pkg/workloadmanager/network_policy.go | New file with buildNetworkPolicy, createNetworkPolicy, deleteNetworkPolicy functions |
| pkg/workloadmanager/k8s_client.go | Added NetworkPolicy field to sandboxEntry struct |
| pkg/workloadmanager/workload_builder.go | Updated to pass NetworkPolicy from CRDs to sandboxEntry |
| pkg/workloadmanager/handlers.go | Network policy creation in sandbox creation flow and deletion in explicit deletion |
| pkg/workloadmanager/garbage_collection.go | Network policy cleanup during GC |
| manifests/charts/base/templates/rbac/workloadmanager.yaml | Added networkpolicies permissions to ClusterRole |
| manifests/charts/base/crds/runtime.agentcube.volcano.sh_*.yaml | CRD schema includes full NetworkPolicy field definitions |
c6ed9bc to
f7cea82
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces network isolation for sandboxes by adding a NetworkPolicy configuration to AgentRuntime and CodeInterpreter resources. It includes the necessary CRD updates, RBAC permissions, and logic in the workload manager to create, manage, and clean up these policies. I have reviewed the changes and suggest updating the log level for network policy rollback failures from 'Infof' to 'Warningf' to ensure better visibility of potential resource leaks.
Signed-off-by: sicaario <hrmnp8@gmail.com>
434d152 to
38b6a4b
Compare
| patches.ApplyFunc(createNetworkPolicy, func(_ context.Context, _ kubernetes.Interface, _ *networkingv1.NetworkPolicy) error { | ||
| return nil | ||
| }) |
There was a problem hiding this comment.
Missing test coverage for network policy creation failure. The test mocks createNetworkPolicy to always succeed (line 223-225), but there's no test case that verifies the behavior when network policy creation fails. This scenario should trigger rollback since rollback is registered before network policy creation. Add a test case to verify that network policy creation failure properly rolls back the sandbox and sandbox claim creation.
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:default="8h" | ||
| MaxSessionDuration *metav1.Duration `json:"maxSessionDuration,omitempty" protobuf:"bytes,3,opt,name=maxSessionDuration"` | ||
|
|
||
| // NetworkPolicy defines the network access rules for the sandbox. | ||
| // If not specified, a default deny-all policy is applied to enforce isolation. | ||
| // +optional | ||
| NetworkPolicy *SandboxNetworkPolicy `json:"networkPolicy,omitempty"` |
There was a problem hiding this comment.
The agent_type.go file references SandboxNetworkPolicy (which contains networkingv1.NetworkPolicyIngressRule and networkingv1.NetworkPolicyEgressRule fields) but does not import the networkingv1 package. While this may compile since SandboxNetworkPolicy is defined in the same package, this creates an inconsistency since codeinterpreter_types.go has the import. For consistency and to support code generation tools, add: networkingv1 "k8s.io/api/networking/v1" to the imports.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #262 +/- ##
==========================================
+ Coverage 35.60% 43.58% +7.97%
==========================================
Files 29 31 +2
Lines 2533 2687 +154
==========================================
+ Hits 902 1171 +269
+ Misses 1505 1391 -114
+ Partials 126 125 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
What type of PR is this?
/kind enhancement
What this PR does / why we need it:
Sandboxes currently have no network-level isolation — a running sandbox pod can accept connections from anywhere in the cluster and reach any endpoint. This is a real problem for environments running untrusted or LLM-generated code.
This PR wires in a Kubernetes
NetworkPolicyfor every sandbox at creation time and cleans it up when the sandbox is deleted or garbage collected. If the user doesn't specify any rules, a deny-all policy is applied by default so isolation is opt-out, not opt-in.Which issue(s) this PR fixes:
Fixes #216
Special notes for your reviewer:
SandboxNetworkPolicyis defined once incodeinterpreter_types.goand reused in bothAgentRuntimeSpecandCodeInterpreterSpecNetworkPolicypod selector uses the existingruntime.agentcube.io/sandbox-namelabel — no new labeling neededmake generate;make buildpasses clean