feat: add Pod Snapshot extension for Python SDK #338
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Copyright 2026 The Kubernetes Authors. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| from .podsnapshot_client import PodSnapshotSandboxClient |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| # Agentic Sandbox Pod Snapshot Extension | ||
|
|
||
| This directory contains the Python client extension for interacting with the Agentic Sandbox to manage Pod Snapshots. This extension allows you to trigger snapshots of a running sandbox and restore a new sandbox from the recently created snapshot. | ||
|
|
||
| ## `podsnapshot_client.py` | ||
|
|
||
| This file defines the `PodSnapshotSandboxClient` class, which extends the base `SandboxClient` to provide snapshot capabilities. | ||
|
|
||
| ### `PodSnapshotSandboxClient` | ||
|
|
||
| A specialized Sandbox client for interacting with the GKE Pod Snapshot controller. | ||
|
|
||
| ### Key Features: | ||
|
|
||
| * **`PodSnapshotSandboxClient(template_name: str, podsnapshot_timeout: int = 180, server_port: int = 8080, ...)`**: | ||
| * Initializes the client with optional podsnapshot timeout and server port. | ||
| * **`snapshot_controller_ready(self) -> bool`**: | ||
| * Checks if the snapshot agent (GKE managed) is running and ready. | ||
| * **`__exit__(self, exc_type, exc_val, exc_tb)`**: | ||
| * Cleans up the `SandboxClaim` resources. | ||
|
|
||
| ## `test_podsnapshot_extension.py` | ||
|
|
||
| This file, located in the parent directory (`clients/python/agentic-sandbox-client/`), contains an integration test script for the `PodSnapshotSandboxClient` extension. It verifies the snapshot and restore functionality. | ||
|
|
||
| ### Test Phases: | ||
|
|
||
| 1. **Phase 1: Starting Counter Sandbox**: | ||
| * Starts a sandbox with a counter application. | ||
|
|
||
| ### Prerequisites | ||
|
|
||
| 1. **Python Virtual Environment**: | ||
| ```bash | ||
| python3 -m venv .venv | ||
| source .venv/bin/activate | ||
| ``` | ||
|
|
||
| 2. **Install Dependencies**: | ||
| ```bash | ||
| pip install kubernetes | ||
| pip install -e clients/python/agentic-sandbox-client/ | ||
| ``` | ||
|
|
||
| 3. **Pod Snapshot Controller**: The Pod Snapshot controller must be installed in a **GKE standard cluster** running with **gVisor**. | ||
| * For detailed setup instructions, refer to the [GKE Pod Snapshots public documentation](https://docs.cloud.google.com/kubernetes-engine/docs/how-to/pod-snapshots). | ||
| * Ensure a GCS bucket is configured to store the pod snapshot states and that the necessary IAM permissions are applied. | ||
|
|
||
| 4. **CRDs**: The `PodSnapshotStorageConfig` and `PodSnapshotPolicy` CRDs must be applied. `PodSnapshotPolicy` should specify the selector match labels. | ||
|
|
||
| 5. **Sandbox Template**: A `SandboxTemplate` (e.g., `python-counter-template`) with the gVisor runtime, an appropriate KSA, and a label that matches the selector label in `PodSnapshotPolicy` must be available in the cluster. | ||
|
|
||
| ### Running Tests: | ||
|
|
||
| To run the integration test, execute the script with the appropriate arguments: | ||
|
|
||
| ```bash | ||
| python3 clients/python/agentic-sandbox-client/test_podsnapshot_extension.py \ | ||
| --template-name python-counter-template \ | ||
| --namespace sandbox-test | ||
| ``` | ||
|
|
||
| Adjust `--namespace` and `--template-name` as needed for your environment. |
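The two flags in the command above could be parsed along these lines. This is a minimal sketch that assumes the test script uses `argparse`; the function name `build_parser`, the default values, and the help strings are hypothetical and not taken from the PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the README command."""
    parser = argparse.ArgumentParser(
        description="Pod Snapshot extension integration test (illustrative sketch)."
    )
    # Flag names come from the README; the defaults below are assumptions.
    parser.add_argument(
        "--template-name",
        default="python-counter-template",
        help="Name of the SandboxTemplate to start the sandbox from.",
    )
    parser.add_argument(
        "--namespace",
        default="default",
        help="Kubernetes namespace in which the sandbox is created.",
    )
    return parser
```

Calling `build_parser().parse_args()` in the script's entry point would then expose the values as `args.template_name` and `args.namespace`.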
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| # Copyright 2026 The Kubernetes Authors. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import logging | ||
| from kubernetes import client | ||
| from kubernetes.client import ApiException | ||
| from ..sandbox_client import SandboxClient | ||
| from ..constants import ( | ||
| PODSNAPSHOT_NAMESPACE_MANAGED, | ||
| PODSNAPSHOT_AGENT, | ||
| PODSNAPSHOT_API_GROUP, | ||
| PODSNAPSHOT_API_VERSION, | ||
| PODSNAPSHOT_API_KIND, | ||
| ) | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| class PodSnapshotSandboxClient(SandboxClient): | ||
| """ | ||
| A specialized Sandbox client for interacting with the GKE Pod Snapshot controller. | ||
| Currently supports manual triggering via PodSnapshotManualTrigger. | ||
|
||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| template_name: str, | ||
| podsnapshot_timeout: int = 180, | ||
| server_port: int = 8080, | ||
| **kwargs, | ||
|
||
| ): | ||
| super().__init__(template_name, server_port=server_port, **kwargs) | ||
|
|
||
| self.controller_ready = False | ||
| self.podsnapshot_timeout = podsnapshot_timeout | ||
|
||
| self.core_v1_api = client.CoreV1Api() | ||
|
|
||
| def __enter__(self) -> "PodSnapshotSandboxClient": | ||
| try: | ||
| self.controller_ready = self.snapshot_controller_ready() | ||
| super().__enter__() | ||
|
||
| return self | ||
| except Exception as e: | ||
| self.__exit__(None, None, None) | ||
| raise RuntimeError( | ||
| f"Failed to initialize PodSnapshotSandboxClient. Ensure that you are connected to a GKE cluster " | ||
| f"with the Pod Snapshot Controller enabled. Error details: {e}" | ||
| ) from e | ||
|
|
||
| def snapshot_controller_ready(self) -> bool: | ||
| """ | ||
| Checks if the snapshot agent pods are running in a GKE-managed pod snapshot cluster. | ||
| Falls back to checking CRD existence if pod listing is forbidden. | ||
|
||
| """ | ||
|
|
||
| if self.controller_ready: | ||
| return True | ||
|
|
||
| def check_crd_installed() -> bool: | ||
| try: | ||
| # Check directly if the API resource exists using CustomObjectsApi | ||
| resource_list = self.custom_objects_api.get_api_resources( | ||
|
||
| group=PODSNAPSHOT_API_GROUP, | ||
| version=PODSNAPSHOT_API_VERSION, | ||
| ) | ||
|
|
||
| if not resource_list or not resource_list.resources: | ||
| return False | ||
|
|
||
| for resource in resource_list.resources: | ||
| if resource.kind == PODSNAPSHOT_API_KIND: | ||
| return True | ||
| return False | ||
| except ApiException as e: | ||
| # If discovery fails with 403/404, we assume not ready/accessible | ||
| if e.status in (403, 404): | ||
| return False | ||
| raise | ||
|
|
||
| def check_pod_running(namespace: str, pod_name_substring: str) -> bool: | ||
| try: | ||
| pods = self.core_v1_api.list_namespaced_pod(namespace) | ||
| for pod in pods.items: | ||
| if ( | ||
| pod.status.phase == "Running" | ||
| and pod_name_substring in pod.metadata.name | ||
|
||
| ): | ||
| return True | ||
| return False | ||
| except ApiException as e: | ||
| if e.status == 403: | ||
| logger.info( | ||
| f"Permission denied listing pods in {namespace}. Checking CRD existence." | ||
| ) | ||
| return check_crd_installed() | ||
| # If the namespace is not found (404), we assume not ready/accessible | ||
| if e.status == 404: | ||
|
||
| return False | ||
| raise | ||
|
|
||
| # Check managed: requires only agent in gke-managed-pod-snapshots | ||
| if check_pod_running(PODSNAPSHOT_NAMESPACE_MANAGED, PODSNAPSHOT_AGENT): | ||
| return True | ||
|
|
||
| return False | ||
|
|
||
| def __exit__(self, exc_type, exc_val, exc_tb): | ||
| """ | ||
| Automatically cleans up the Sandbox. | ||
| """ | ||
| super().__exit__(exc_type, exc_val, exc_tb) | ||
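The decision flow in `snapshot_controller_ready` (list pods first, fall back to CRD discovery only on a 403, treat a 404 as not ready, re-raise anything else) can be exercised without a cluster. The sketch below is an illustration under stated assumptions, not the PR's implementation: `FakeApiException`, `agent_ready`, and `make_pod` are hypothetical names, and the fake exception only mimics the `status` attribute of `kubernetes.client.ApiException`.

```python
from types import SimpleNamespace


class FakeApiException(Exception):
    """Hypothetical stand-in for kubernetes.client.ApiException (status only)."""

    def __init__(self, status: int):
        super().__init__(f"status {status}")
        self.status = status


def agent_ready(list_pods, crd_installed, namespace: str, name_substring: str) -> bool:
    """Mirror the flow above: pod check first, CRD check only on 403."""
    try:
        pods = list_pods(namespace)
    except FakeApiException as e:
        if e.status == 403:
            # Listing pods is forbidden: fall back to CRD discovery.
            return crd_installed()
        if e.status == 404:
            # Namespace not found: treat the controller as not ready.
            return False
        raise
    # Ready if any Running pod's name contains the agent substring.
    return any(
        pod.status.phase == "Running" and name_substring in pod.metadata.name
        for pod in pods
    )


def make_pod(name: str, phase: str) -> SimpleNamespace:
    """Build a minimal object with the two fields the check reads."""
    return SimpleNamespace(
        metadata=SimpleNamespace(name=name),
        status=SimpleNamespace(phase=phase),
    )
```

Passing the pod lister and the CRD check in as callables keeps the branching testable on its own; in the real client these would be the `CoreV1Api` and `CustomObjectsApi` calls shown in the diff.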
Is the client only checking whether the controller is ready? This is different from the description in the PR: "to support manual snapshot triggering via the GKE pod snapshot controller."
I'd expect the client to modify Snapshot CRs, instead of checking the Snapshot controller itself.
I have defined the `snapshot` method here: https://github.com/kubernetes-sigs/agent-sandbox/pull/339/changes#diff-6535038b29a40cde2f558dd8bf85e28a67c1eee796fe718c04338884af9bddecR203. The method first checks whether the snapshot controller is ready, as an initialization check, before creating the snapshots.
Other methods to be added are `list` and `delete`. I had to split the logic into multiple PRs.
For the PR description, I just meant to give an overview of the purpose of the class. Will update it.
Thanks.