fix(python-client): harden tracing, uploads, and execution response handling#501
wllbo wants to merge 2 commits into kubernetes-sigs:main
/assign @SHRUTI6991
from k8s_agent_sandbox.trace_manager import trace_span, trace

# Maximum response size for command execution (16 MB).
MAX_EXECUTION_RESPONSE_SIZE = 16 * 1024 * 1024
Why are we enforcing this limit?
It guards against the client parsing and propagating unexpectedly large command output. 16 MB matches the Go SDK limit from #424. For larger outputs, callers should use file I/O instead.
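For readers following the thread, here is a minimal sketch of how a cap like this can be enforced while the body is being read, so an oversized response is rejected before it is fully buffered. It is not the PR's implementation: `read_capped` and `ResponseTooLargeError` are hypothetical names, and the body is assumed to arrive as an iterable of byte chunks.

```python
from typing import Iterable

# Same value as the constant in the diff (16 MB).
MAX_EXECUTION_RESPONSE_SIZE = 16 * 1024 * 1024


class ResponseTooLargeError(RuntimeError):
    """Hypothetical error type; the PR may raise something different."""


def read_capped(chunks: Iterable[bytes],
                limit: int = MAX_EXECUTION_RESPONSE_SIZE) -> bytes:
    """Join body chunks, aborting once the running total would exceed limit."""
    buf = bytearray()
    for chunk in chunks:
        # Check before appending so at most `limit` bytes are ever held.
        if len(buf) + len(chunk) > limit:
            raise ResponseTooLargeError(f"response exceeds {limit} bytes")
        buf.extend(chunk)
    return bytes(buf)
```

Checking the running total per chunk, rather than after the whole body is read, is what keeps memory use bounded.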
@trace_span("run")
def run(self, command: str, timeout: int = 60) -> ExecutionResult:
    """Executes a command. Rejects responses larger than 16 MB."""
Nit: You may want to parametrize this based on the MAX_EXECUTION_RESPONSE_SIZE value.
updated, docstring now references the constant by name
class SandboxTracerConfig(BaseModel):
    """Configuration for tracer level information"""
    model_config = {"arbitrary_types_allowed": True}
Nit: add a comment about this field.
added a comment explaining it's needed for the tracer_provider field
  try:
-     self._create_claim(claim_name, template, namespace)
+     claim_name = self._create_claim(template, namespace)
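The change above has `_create_claim` return the claim name instead of receiving one. With Kubernetes `generateName`, the client sends only a prefix, the API server appends a random suffix, and the final name comes back in the create response. A rough sketch of that shape follows; the `apiVersion`, `kind`, `spec` field, and name prefix are placeholders rather than the project's actual schema, and both helper names are hypothetical.

```python
from typing import Any, Dict


def build_claim_manifest(template: str, namespace: str,
                         prefix: str = "sandbox-claim-") -> Dict[str, Any]:
    """Build a manifest that lets the API server pick the final name."""
    return {
        "apiVersion": "example.dev/v1",   # placeholder group/version
        "kind": "SandboxClaim",           # placeholder kind
        "metadata": {
            # No "name" key: the server appends a random suffix to the prefix.
            "generateName": prefix,
            "namespace": namespace,
        },
        "spec": {"templateName": template},  # hypothetical field
    }


def created_name(response: Dict[str, Any]) -> str:
    """Read the server-assigned name from the create response."""
    return response["metadata"]["name"]
```

Because the name is only known after the create call returns, the caller has to take it from the response, which matches `_create_claim` returning `claim_name` in the diff.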
Initializes the global OpenTelemetry TracerProvider using the singleton pattern.

This function uses double-checked locking to ensure thread-safe, one-time initialization.
def create_tracer_provider(service_name: str) -> "TracerProvider | None":
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter())
)
atexit.register(provider.shutdown)
The tracer provider is shut down when the client goes out of scope, right?
Not on client GC: atexit fires at process exit. Python's __del__ is unreliable, so atexit ensures BatchSpanProcessor flushes remaining spans before the process terminates.
If the caller passes their own provider via SandboxTracerConfig.tracer_provider, create_tracer_provider is never called, so we don't register anything. They own the lifecycle.
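The ownership rule described in this reply can be sketched roughly as below. This is an illustration, not the SDK's code: `FakeProvider` stands in for an OpenTelemetry `TracerProvider` so the snippet runs without the SDK installed, and `resolve_provider` and its `register` parameter are hypothetical names used to make the behavior easy to observe.

```python
import atexit
from typing import Callable, Optional


class FakeProvider:
    """Stand-in for a TracerProvider; only shutdown() matters here."""

    def __init__(self) -> None:
        self.shut_down = False

    def shutdown(self) -> None:
        # A real provider would flush its span processors here.
        self.shut_down = True


def resolve_provider(
    user_provider: Optional[FakeProvider] = None,
    register: Callable[[Callable[[], None]], object] = atexit.register,
) -> FakeProvider:
    """Return the caller's provider untouched, or create one the SDK owns."""
    if user_provider is not None:
        # Caller owns the lifecycle: no shutdown hook is registered.
        return user_provider
    provider = FakeProvider()
    # SDK-created provider: flush at interpreter exit, not on client GC.
    register(provider.shutdown)
    return provider
```

Passing `register` explicitly is only for demonstration; in normal use the default `atexit.register` runs the hook once at interpreter shutdown.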
/ok-to-test
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: aditya-shantanu, wllbo.
Issues surfaced during the Go SDK review (#424) that also apply to the Python SDK.
- Replace initialize_tracer() with a create_tracer_provider() factory; the SDK no longer calls trace.set_tracer_provider().
- Raise ValueError on upload paths with directory separators instead of silently stripping to basename.
- Use generateName for claim creation instead of a manual uuid.uuid4() suffix.
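As an illustration of the upload-path change, a check along these lines rejects anything that is not a bare filename instead of quietly reducing it to its basename. `validate_upload_name` is a hypothetical helper, and treating both `/` and `\` as separators is an assumption rather than something this PR states.

```python
def validate_upload_name(path: str) -> str:
    """Return path unchanged if it is a bare filename, else raise ValueError."""
    # Assumption: both POSIX and Windows separators are rejected.
    if "/" in path or "\\" in path:
        raise ValueError(
            f"upload path must be a plain filename without directories, got {path!r}"
        )
    return path
```

Failing loudly means a caller who passes `data/input.csv` learns that the directory part is not honored, rather than finding the file under a different name later.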