feat: add keepAlive support and fix RemoteSandboxSnapshot client re-injection#1819
feat: add keepAlive support and fix RemoteSandboxSnapshot client re-injection#1819Buktal wants to merge 7 commits into
Conversation
…dboxClient.resume()
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
… middleware rollback
AgentScopeJavaBot
left a comment
There was a problem hiding this comment.
🤖 AI Review
This PR introduces two features: (1) keepAlive mode — .keepAlive(true) makes release() only call stop() (snapshot) but skip shutdown(), preserving the Pod across calls; (2) RemoteSandboxSnapshot client re-injection — both DockerSandboxClient and KubernetesSandboxClient now accept SandboxSnapshotSpec at construction and rebuild the snapshot client in resume(). An archive() method and 8 new tests are included. The code design is clean and tests are comprehensive. Two recommended issues: a breaking API change in SandboxManager.release() signature that needs a @Deprecated compatibility overload, and a 560-line AI implementation plan document that should not be committed to the main repository.
| } | ||
|
|
||
| public void release(SandboxAcquireResult result) { | ||
| public void release(SandboxAcquireResult result, SandboxContext sandboxContext) { |
There was a problem hiding this comment.
[recommended] Breaking API change in release() signature: The release() method signature has changed, which is a breaking change for existing consumers. Consider adding a @Deprecated compatibility overload that delegates to the new signature, giving downstream users a migration path:
@Deprecated
public void release(String key) {
release(key, false); // or appropriate default
}| @@ -0,0 +1,560 @@ | |||
| # Sandbox keepAlive Implementation Plan | |||
There was a problem hiding this comment.
[recommended] AI implementation plan document should not be committed: This 560-line markdown file appears to be an AI-generated implementation plan/design document. While useful during development, such documents should not be committed to the main repository as they clutter the repo and become stale quickly. Consider removing this file or moving it to a design docs area outside the main source tree.
| sandbox.shutdown(); | ||
| } catch (Exception e) { | ||
| log.warn("[sandbox] Sandbox shutdown failed: {}", e.getMessage(), e); | ||
| boolean keepAlive = sandboxContext != null && sandboxContext.isKeepAlive(); |
There was a problem hiding this comment.
[nitpick] Javadoc for keepAlive: Consider enriching the Javadoc with a usage scenario description (e.g., 'useful for multi-turn conversations where the same sandbox should be reused across agent calls to avoid cold-start latency').
| } | ||
|
|
||
| // snapshotSpec used in resume() to re-inject snapshot client after deserialization | ||
| public DockerSandboxClient(ObjectMapper objectMapper, SandboxSnapshotSpec snapshotSpec) { |
There was a problem hiding this comment.
[nitpick] Missing Javadoc: The new constructor accepting SandboxSnapshotSpec should have a Javadoc explaining when to use it vs the simpler constructor.
AgentScopeJavaBot
left a comment
There was a problem hiding this comment.
🤖 AI Review
This PR introduces two features: (1) keepAlive mode — .keepAlive(true) makes release() only call stop() (snapshot) but skip shutdown(), preserving the Pod across calls; (2) RemoteSandboxSnapshot client re-injection — both DockerSandboxClient and KubernetesSandboxClient now accept SandboxSnapshotSpec at construction and rebuild the snapshot client in resume(). An archive() method and 8 new tests are included. The code design is clean and tests are comprehensive. Two recommended issues: a breaking API change in SandboxManager.release() signature that needs a @Deprecated compatibility overload, and a 560-line AI implementation plan document that should not be committed to the main repository.
| } | ||
|
|
||
| public void release(SandboxAcquireResult result) { | ||
| public void release(SandboxAcquireResult result, SandboxContext sandboxContext) { |
There was a problem hiding this comment.
[recommended] Breaking API change in release() signature: The release() method signature has changed, which is a breaking change for existing consumers. Consider adding a @Deprecated compatibility overload that delegates to the new signature, giving downstream users a migration path:
@Deprecated
public void release(String key) {
release(key, false); // or appropriate default
}| @@ -0,0 +1,560 @@ | |||
| # Sandbox keepAlive Implementation Plan | |||
There was a problem hiding this comment.
[recommended] AI implementation plan document should not be committed: This 560-line markdown file appears to be an AI-generated implementation plan/design document. While useful during development, such documents should not be committed to the main repository as they clutter the repo and become stale quickly. Consider removing this file or moving it to a design docs area outside the main source tree.
| sandbox.shutdown(); | ||
| } catch (Exception e) { | ||
| log.warn("[sandbox] Sandbox shutdown failed: {}", e.getMessage(), e); | ||
| boolean keepAlive = sandboxContext != null && sandboxContext.isKeepAlive(); |
There was a problem hiding this comment.
[nitpick] Javadoc for keepAlive: Consider enriching the Javadoc with a usage scenario description (e.g., 'useful for multi-turn conversations where the same sandbox should be reused across agent calls to avoid cold-start latency').
| } | ||
|
|
||
| // snapshotSpec used in resume() to re-inject snapshot client after deserialization | ||
| public DockerSandboxClient(ObjectMapper objectMapper, SandboxSnapshotSpec snapshotSpec) { |
There was a problem hiding this comment.
[nitpick] Missing Javadoc: The new constructor accepting SandboxSnapshotSpec should have a Javadoc explaining when to use it vs the simpler constructor.
…eStore scope-value overloads Add a public factory method SandboxIsolationKey.of(scope, value) so that out-of-band callers (restore, archive, scavenger) can build a key without a RuntimeContext or guessing the internal slotSessionId format. Add SessionSandboxStateStore.load(scope, value) / save(scope, value) convenience overloads so that application code no longer needs to construct SandboxIsolationKey objects manually.
AgentScope-Java Version
2.0.0-RC3 (based on f28e0a0)
Description
Problem
Sandbox always destroyed after each turn. Every agent call creates a fresh Docker container or Kubernetes Pod, runs the call, then shuts it down. For stateful agents (coding agent with installed dependencies, data agent with cached datasets), this is wasteful - the same workspace could be reused across turns within a session.
RemoteSnapshotClient null after deserialization. RemoteSandboxSnapshot loses its RemoteSnapshotClient when the sandbox state goes through JSON serialization/deserialization. SandboxClient.resume() was documented to re-inject it but never actually did - a latent bug that blocks any stateful resume path.
Solution
keepAlive mode (6 commits)
Add a
keepAliveboolean to SandboxFilesystemSpec, propagated through SandboxContext to SandboxManager.release(). When enabled, release() calls stop() (persists workspace snapshot) but skips shutdown() (destroys container/Pod), preserving the sandbox resource for the next call.Key files:
Snapshot client re-injection fix (1 commit)
Both DockerSandboxClient and KubernetesSandboxClient now accept a SandboxSnapshotSpec at construction time. resume() uses the spec to rebuild the RemoteSnapshotClient after deserialization, fixing the documented-but-never-implemented gap.
Public key API for out-of-band callers (1 commit)
Changes
Testing
8 new test files / additions:
All existing tests pass: harness (568 tests) + K8s sandbox extension (2 tests) = 570 tests, 0 failures.
Checklist
mvn spotless:apply