feat: python no more #344
Open
Zygimantass wants to merge 76 commits into
gakonst approved these changes on Jun 1, 2026
Comment on lines +15 to +24
pub struct SandboxIoParts {
    pub stdin: SandboxWrite,
    pub stdout: SandboxRead,
    pub stderr: SandboxRead,
    pub guard: SandboxIoGuard,
}

pub struct SandboxIoGuard {
    _inner: Box<dyn Send>,
}
    },
};

use async_trait::async_trait;
Member
isn't async trait no longer necessary bc it's in std rust?
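For context on the question above, here is a minimal sketch of the two options. `Greeter`, `DynGreeter`, `Plain`, and `block_on` are hypothetical names for illustration and are not taken from this PR. Native `async fn` in traits has been stable since Rust 1.75, but such a trait cannot be used as `dyn Trait`, so whether `async_trait` can be dropped here probably depends on whether the trait in this file is used behind a trait object.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait (not the PR's trait). Native `async fn` in traits
/// needs Rust 1.75+ and no `#[async_trait]` attribute.
trait Greeter {
    async fn greet(&self, name: &str) -> String;
}

/// Hypothetical dyn-friendly variant: the future is boxed by hand, which is
/// roughly the shape the `async-trait` macro expands to.
trait DynGreeter {
    fn greet_boxed<'a>(
        &'a self,
        name: &'a str,
    ) -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
}

struct Plain;

impl Greeter for Plain {
    async fn greet(&self, name: &str) -> String {
        format!("hello {name}")
    }
}

impl DynGreeter for Plain {
    fn greet_boxed<'a>(
        &'a self,
        name: &'a str,
    ) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
        Box::pin(async move { format!("hello {name}") })
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch runs without a runtime; it is
/// only suitable for futures that complete without suspending.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

`block_on(Plain.greet("io"))` exercises the native form, while `Box<dyn DynGreeter>` only works with the hand-boxed variant; `Box<dyn Greeter>` does not compile.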
    sandboxes: Mutex<HashMap<SandboxId, Arc<Mutex<LocalSandbox>>>>,
}

struct LocalSandbox {
Member
should be a separate local.rs file
Comment on lines +213 to +214
pub fn empty_object() -> Value {
    Value::Object(serde_json::Map::new())
Comment on lines +127 to +202
let execution = self
    .store
    .create_execution(thread_key, default_metadata(input.metadata))
    .await?;
let execution = self
    .store
    .mark_execution_running(&execution.execution_id)
    .await?;
let sandbox_id = self
    .ensure_session_sandbox(
        thread_key,
        session.sandbox_id.as_deref(),
        &execution.execution_id,
    )
    .await?;

self.store
    .append_event(
        thread_key,
        Some(&execution.execution_id),
        "session.execution_started",
        json!({
            "execution_id": execution.execution_id,
            "thread_key": thread_key.as_str(),
            "input_line_count": input.input_lines.len(),
        }),
    )
    .await?;

let write_result = match self.ensure_session_pipe(thread_key, &sandbox_id).await {
    Ok(pipe) => write_input_lines(&pipe, &input.input_lines).await,
    Err(error) => Err(error),
};

match write_result {
    Ok(()) => {}
    Err(error) => {
        let error_message = error.to_string();
        let _ = self
            .store
            .append_event(
                thread_key,
                Some(&execution.execution_id),
                "session.execution_failed",
                json!({
                    "execution_id": execution.execution_id,
                    "thread_key": thread_key.as_str(),
                    "error": error_message,
                }),
            )
            .await;
        let _ = self
            .store
            .fail_execution(&execution.execution_id, &error_message)
            .await;
        return Err(error);
    }
}

self.store
    .append_event(
        thread_key,
        Some(&execution.execution_id),
        "session.execution_completed",
        json!({
            "execution_id": execution.execution_id,
            "thread_key": thread_key.as_str(),
            "completion_reason": "input_accepted",
        }),
    )
    .await?;

Ok(self
    .store
    .complete_execution(&execution.execution_id)
    .await?)

tokio::spawn(async move {
    let result =
        run_stdout_pump(store.clone(), thread_key.clone(), &pump_key, stdout, guard).await;
Member
wtf who has ever used the word stdout pump???
34a591a to 4d7239b
This was referenced Jun 9, 2026
* refactor(api-rs): use owned sandbox io streams
* refactor(api-rs): move session streaming into runtime
* refactor(api-rs): remove mock session runtime branch
* refactor(api-rs): move sandbox workload modes into runtime
* refactor(api-rs): clean up session API client and e2e tests
* refactor(api-rs): use library sse event types
* refactor(api-rs): address owned io review comments
Co-authored-by: Centaur AI <ai@centaur.local>
Co-authored-by: Centaur AI <ai@centaur.local>
…#404) feat(api-rs): broker credentials in iron-control for codex access-token auth

Manage iron-control broker credentials — managed OAuth refresh tokens that iron-control mints and delivers inline to proxies via a `token_broker` source — and drop the iron-proxy broker sidecar. Adds the centaur-perms `broker` subcommands, `brokered_token` tool-secret parsing/translation, and the iron-control client/models for broker credentials.

Supporting infra so it runs end-to-end:
- chart: iron-control Solid Queue worker (bin/jobs) that runs the broker OAuth refresh loop, plus its egress NetworkPolicy and the postgres ingress allowlist entry; image pullPolicy defaulted to Always for the mutable :latest tag.
- chart: create-db init waits for postgres and provisions all four logical databases (primary + Solid Cache/Queue/Cable) idempotently.
- slackbotv2: tolerate the postgres startup race with an owned pool + connect retry so a transient cold-start failure doesn't wedge the bot.
This allows clients to define persona as a part of the request.
fix(slackbotv2): pin slack stream continuation fix
* feat(api-rs): add aws_auth credential type linked to iron-control

  Reimplements the CloudWatch tool's AWS SigV4 re-signing support in the Rust api-rs / iron-control control plane instead of the Python api service (superseding #449). The tool signs requests with placeholder credentials; iron-proxy's aws_auth transform re-signs with the real read-only IAM keys resolved from iron-control, so credentials never enter the sandbox.

  - centaur-iron-control: AwsAuthSecretInput model, aws_auth_secrets endpoint + upsert client, GrantSecret/Grant/SECRET_TYPES wiring (aas_ prefix), fragment translator marks aws_auth unsupported (it is a tool secret registered via the centaur-perms CLI, like hmac_sign)
  - centaur-perms: parse type = "aws_auth" from a tool's pyproject and translate it to an AwsAuthSecretInput granted to the tool's role
  - keep the cloudwatch tool and the iron-proxy SigV4 header allowlist; drop the Python services/api changes (api-rs replaces them). AWS_REGION is non-secret and reaches the sandbox via passthrough_env.

* feat(api-rs): translate aws_auth in the iron-proxy fragment path

  Infra/harness fragments can now declare an aws_auth transform and have it registered as an iron-control aws_auth secret, instead of erroring as unsupported. Mirrors gcp_auth: access_key_id/secret_access_key (and optional session_token) are placeholder refs resolved via the source policy, allowed_regions/allowed_services scope signing, rules use the shared request-rule shape, and the foreign_id keys on the access-key placeholder (`{role}-aws-{slug}`).

* feat(api-rs): support aws_auth in tool discovery

  tool_discovery rejected the aws_auth secret type and dropped the whole tool, so the cloudwatch tool got no proxy fragment and no sandbox AWS credentials. Parse aws_auth into an aws_auth transform (matching the iron-control translator), seed the sandbox AWS SDK placeholder creds via placeholder_env, and add GITHUB_TOKEN to the infra-env bootstrap for the repo-cache.
The sync used a PID-suffixed temp dir, but Kubernetes collapses the
doubled dollar sign in a container command to a single one during its own
$(VAR) expansion, so the suffix became a constant literal and the mv
failed ("subdirectory of itself"). Use a deterministic temp name and
sweep any stale temp dirs before cloning; sync is sequential per pod so a
fixed name is safe. Self-heals existing corrupt caches on next run.
This reverts commit 2913516.
The sync used a PID-suffixed temp dir, but Kubernetes collapses the
doubled dollar sign in a container command to a single one during its own
$(VAR) expansion, so the suffix became a constant literal and the mv
failed ("subdirectory of itself"). Use a deterministic temp name and
sweep any stale temp dirs before cloning; sync is sequential per pod so a
fixed name is safe. Self-heals existing corrupt caches on next run.
* fix: grant infra role to warm pool bootstrap
* fix: keep warm pool bootstrap roleless
* refactor(api-rs): move warm pool policy into sandbox manager
* feat(api-rs): add absurd workflow runtime poc
* chore(api-rs): refresh absurd workflow staging image
* fix(api-rs): label workflow host sandboxes
* fix(api-rs): serialize workflow timestamps as rfc3339
* fix(api-rs): mount overlay workflows in workflow sandbox
* fix(api-rs): propagate sandbox image pull secrets
* fix(api-rs): extend workflow agent turns
* fix(api-rs): keep workflow host claims alive
* fix(api-rs): heartbeat all workflow tasks
* fix(workflows): call tools through sandbox shims
* fix(workflows): resolve sandbox tool shim outside login shells
* fix(workflows): bootstrap tool shims in workflow hosts
* fix(workflows): mount tools in workflow host sandboxes
* fix(workflows): grant workflow host tool secrets
* ci: speed up branch image builds
* ci: add Dockerfile package caches
* fix(api-rs): repair workflow schedule test fixture
* refactor(api-rs): rename workflows crate

---------

Co-authored-by: Centaur AI <ai@centaur.local>
Co-authored-by: Centaur AI <ai@centaur.local>
fix(api-rs): raise stdout line cap and disable service links

Co-authored-by: Centaur AI <ai@centaur.local>
e2d3db4 to eb0969b
) The sandbox entrypoint's install-tool-shims printed its success notice to stdout, which is the same pipe the session stdout pump streams to clients. slackbotv2 treated any JSON-unparseable output line as a terminal codex line, so on every fresh (non-warm) sandbox the very first bootstrap line ended the render stream before the agent produced output, finalizing the Slack reply as 'Execution completed, but no final text was captured.' while the real answer streamed afterwards and was dropped.

- install_tool_shims.py: write the notice to stderr
- slackbotv2: non-JSON output lines are noise, not terminal
- regression test: bootstrap line before codex output still delivers the answer

Amp-Thread-ID: https://ampcode.com/threads/T-019eb1eb-27c4-7169-9489-41d85f8e0614
Co-authored-by: Centaur AI <ai@centaur.local>
Co-authored-by: Amp <amp@ampcode.com>
* fix(api-rs): only drive session executions claimed by this request

  mark_execution_running treated an already-running row as a successful claim, so a concurrent request with the same idempotency key could fall into the fallback fetch, see status=running, and send the same input to the sandbox a second time. It now returns ClaimExecutionResult with a claimed flag that is true only when this call did the queued->running transition; the runtime returns the existing execution without driving it when the claim was lost.

  Amp-Thread-ID: https://ampcode.com/threads/T-019eb167-76de-7515-84f7-4265ce53ba85
  Co-authored-by: Amp <amp@ampcode.com>

* fix(api-rs): reject HTTP tool secrets without hosts

  An HTTP secret parsed with an empty hosts list (empty tool-level default, hosts = [], or malformed hosts falling back to an empty default) translated to an empty iron-control rules array, leaving the credential host-unlimited. Both manifest parsers (centaur-perms and the api-server's tool discovery mirror) now fail closed, matching the brokered_token parser. Affected tools are warn-skipped at discovery.

* fix(api-rs): guard absurd.await_event against task/run mismatch

  await_event trusted the caller-provided (task_id, run_id) pair: a mismatched call could attach one task's run to a wait/checkpoint on another task and put the wrong task to sleep. Reject mismatches like get_task_checkpoint_states already does. Shipped as migration 0009 (create or replace) because 0007 is already applied in live environments and sqlx validates migration checksums.

* fix(api-rs): only claim running warm sandboxes

  A warm sandbox observed as Created is not ready for byte I/O and means the runtime regressed after the replenisher saw it running (backends wait for readiness before returning from create). Claiming it made the session fail at open_io; mark it failed and try the next one instead.

  Amp-Thread-ID: https://ampcode.com/threads/T-019eb167-76de-7515-84f7-4265ce53ba85
  Co-authored-by: Amp <amp@ampcode.com>

* fix(api-rs): make grant_inputs_to_role idempotent

  grant_inputs_to_role documented idempotency but always POSTed a new grant after upserting each secret, so re-running centaur-perms grants or startup role registration produced duplicate grants or conflicts. It now lists the role's existing grants once and reuses the grant for an already-granted secret.

* fix(rendering): flush buffered answer in codexAppServerToRendererEvents

  The array helper never called mapper.flush(), so finite sources that end without an explicit terminal event lost buffered answer text and never received renderer.done. Flush like codexAppServerToChatSdkStream does; flush is a no-op when a terminal event already completed the stream.

* fix(slackbotv2): cap inline attachments at 100 MiB

  serializeAttachment buffered every Slack attachment in memory and base64-inlined it with no size limit, letting one large upload blow request limits or OOM the process. Skip the download when Slack's size metadata exceeds 100 MiB and re-check the actual byte count after fetching; oversized attachments degrade through the existing fetchError channel.

  Amp-Thread-ID: https://ampcode.com/threads/T-019eb167-76de-7515-84f7-4265ce53ba85
  Co-authored-by: Amp <amp@ampcode.com>

* fix(slackbotv2): isolate render obligation recovery failures

  One thread's corrupt state, lease error, or failed render propagated out of the recovery scan, so the remaining indexed obligations were never attempted until the next restart. Isolate each thread, log the failure, and count it as deferred so the capped-backoff retry loop revisits it.

* ci: skip registry cache export for fork PRs

  cache-to type=registry pushes a cache manifest to GHCR even when image push is disabled, and fork PRs run with a read-only GITHUB_TOKEN, so their builds failed at cache export. Gate the registry cache-to on the same not-a-fork predicate as push.

---------

Co-authored-by: Amp <amp@ampcode.com>
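The claim semantics in the first fix of this batch (only the caller that performs the queued->running transition drives the execution) can be sketched as follows. This is an in-memory illustration under assumptions: `MemoryStore`, `Status`, `submit`, and the fields of `ClaimExecutionResult` are hypothetical, and the real store is presumably database-backed with an atomic conditional update.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Status {
    Queued,
    Running,
}

/// Name taken from the commit message; the fields are assumed.
#[derive(Debug, PartialEq, Eq)]
struct ClaimExecutionResult {
    status: Status,
    claimed: bool,
}

/// Hypothetical in-memory stand-in for the execution store.
#[derive(Default)]
struct MemoryStore {
    executions: HashMap<String, Status>,
}

impl MemoryStore {
    fn enqueue(&mut self, id: &str) {
        self.executions.insert(id.to_string(), Status::Queued);
    }

    /// `claimed` is true only when this call moved the row from queued to
    /// running; an already-running row is reported but not claimed.
    fn mark_execution_running(&mut self, id: &str) -> Option<ClaimExecutionResult> {
        let status = self.executions.get_mut(id)?;
        let claimed = *status == Status::Queued;
        *status = Status::Running;
        Some(ClaimExecutionResult { status: *status, claimed })
    }
}

/// Sends input only when the claim was won; `sent` stands in for the
/// sandbox pipe.
fn submit(
    store: &mut MemoryStore,
    id: &str,
    sent: &mut Vec<String>,
) -> Option<ClaimExecutionResult> {
    let claim = store.mark_execution_running(id)?;
    if claim.claimed {
        sent.push(id.to_string());
    }
    Some(claim)
}
```

Two `submit` calls for the same id leave exactly one entry in `sent`: the second caller sees `status: Running` with `claimed: false` and returns without writing.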
fix: update Slack stream pagination and Postgres pool limits
* feat(api-rs): serve tools + gerard overlay to agent sandboxes

  api-rs sandboxes had no tools and no overlay. Give api-rs-spawned agents the same base + overlay tools and overlay system-prompt the chart already wires for the api-rs pod, using upstream's CLI-shim tool model rather than a sidecar.

  Upstream direction: tools are shell CLI shims, not an HTTP registry. The agent image's install-tool-shims (services/sandbox/install_tool_shims.py) scans TOOL_DIRS at entrypoint and `uvx`-installs each pyproject [project.scripts] as a CLI; the SYSTEM_PROMPT points agents at those CLIs and `centaur-tools list`. The old `call <tool>` HTTP registry is deprecated to control-plane-only.

  Tool secrets are already handled upstream: codex_app_server_env_template pushes the tool placeholder creds onto the agent env, iron-control grants the per-sandbox principal the real secrets, and Postgres rides proxied `*_DSN` env from apply_proxy_env. So the agent needs only the tool SOURCES at the right paths — no sidecar, no HMAC sandbox token, no loopback tool server.

  - tools.rs (replaces tool_server.rs): a `tools-bootstrap` init container copies /app/tools out of the shared centaur-api image into an emptyDir mounted at /app/tools in the agent, and an `overlay-bootstrap` init container copies the org overlay tree into overlay-root mounted at overlay.mountPath (the same path the api-rs Deployment uses) and stages the overlay's SYSTEM_PROMPT.md as $HOME/AGENTS_OVERLAY.md, which the sandbox entrypoint appends to the base prompt. TOOL_DIRS is set on the agent env to /app/tools (or /app/tools:<mountPath>/tools with the overlay) — identical to the value the api-rs pod computes for its own tool discovery, set deterministically in the spec builder rather than via passthrough env.
  - lib.rs: build_agent_sandbox layers the tools/overlay env over spec.env, mounts the bootstrapped sources read-only into the agent, and appends the tools-bootstrap + overlay-bootstrap init containers and their volumes. No sidecar container, no token minting.
  - args.rs: a minimal ToolsArgs (source image/pull-policy, reusing the KUBERNETES_TOOL_SERVER_IMAGE* env the chart sets from the shared api image) and OverlayArgs (image/pull-policy/source-path/mount-path) wired into AgentSandboxConfig. Explicit clap arg ids avoid id collisions with the other flattened arg structs.
  - chart apirs.yaml: render the tools source image (api.image.*, gated on toolServer.enabled) and overlay (overlay.*) onto the api-rs env, replacing the KUBERNETES_TOOL_SERVER_* sidecar block.

  Gone vs the sidecar port: tool_server.rs, the sbx1 HMAC token minting and its SANDBOX_SIGNING_KEY requirement, CENTAUR_TOOLS_URL, the sidecar pg-DSN/proxy-env collection, and the hmac/base64/sha2 dependency additions (nothing else in the agent-k8s crate uses them). Warm-pool sandboxes route through the same build_agent_sandbox path, so they get the tools/overlay init containers and volumes for free.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(api-rs): stage tools-bootstrap copy outside /app/tools

  The tools-bootstrap init container mounted the tools emptyDir at /app/tools — the same path it copies FROM. The mount shadows the source image's tools tree, so the script self-copies the empty volume and GNU cp rejects it (exit 1); every sandbox dies with 'reached terminal state before running' and no agent ever starts. Mount the volume at /tools-bootstrap instead (mirroring how overlay-bootstrap stages to a distinct target) and copy the image's /app/tools into it. The agent container keeps mounting the same volume at /app/tools, so TOOL_DIRS and the shim installer are unchanged.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: wire sandbox overlays without tools

  Gate overlay env, volumes, and mounts independently from the tools source image so overlay-only sandbox configs produce valid pod specs.

* fix: make sandbox bootstrap volumes writable

  Set an fsGroup on sandbox pods that use tools or overlays so non-root bootstrap init containers can populate their emptyDir mounts.

* fix(api-rs): source sandbox tools image from the api-rs image

  The tools-bootstrap init container copied /app/tools from .Values.api.image (centaur-api), but api-rs discovers its tools from /app/tools in its own container (.Values.apiRs.image). Sourcing from a different image risked the agent installing a different tool set than api-rs granted per-sandbox creds for. Source from the same api-rs image the Deployment runs so the two match by construction.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(api-rs): clone sandbox tools from a repo instead of baking them in

  Replumb the tools-bootstrap init container to git-clone the tools repo at a pinned ref into each sandbox's /app/tools (sparse on the tools subdir; GitHub token via askpass for private repos), instead of copying /app/tools out of the api-rs image. Mirrors the repo-cache architecture — clone a repo into a pre-provisioned directory — without sharing its node-level cache, so adding a tool is a push to the repo rather than an api-rs image rebuild. api-rs still discovers its own /app/tools to grant proxy creds, so pin toolServer.ref to the tool set the image carries to avoid drift.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(api-rs): route the tools clone through the per-sandbox iron-proxy

  The sandbox NetworkPolicy only allows egress to the sandbox's iron-proxy, api-rs, and DNS, so the tools-bootstrap init container's direct git clone to github.com is blocked whenever iron-proxy is enabled. Route the clone through the proxy like all other sandbox egress: export HTTPS_PROXY (the resolved per-sandbox proxy URL apply_proxy_env already put on the spec) and GIT_SSL_CAINFO, and mount the pod's existing firewall-ca volume into the init container. github.com/api.github.com are already in the baseline proxy allowlist, so no policy or allowlist changes are needed. Without iron-proxy the clone still goes direct.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(api-rs): quote repo/ref/subdir in the tools-bootstrap script

  These are operator config (helm values -> env -> clap), not user input, but interpolating them bare into the /bin/sh -ec script means a stray space or metacharacter breaks in the shell instead of loudly in git. Quote them at every interpolation site.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(api-rs): retry the tools clone through the proxy's startup window

  The per-sandbox iron-proxy is created in the same reconcile as the Sandbox CR and isn't accepting connections yet when the tools-bootstrap init container first runs — the clone dies with connection-refused, and an init failure is terminal for the Sandbox (no kubelet retry), so every cold spawn failed with 'reached terminal state before running'. Wrap the clone/sparse-checkout/ref fetch in a bounded retry loop (30 x 2s) so the init container rides out the proxy's startup instead of killing the sandbox.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
  (cherry picked from commit b1f274d)

* refactor(api-rs): converge sandbox overlay on the spec-level overlay-image plumbing

  The base branch grew its own overlay mechanism (SandboxSpec.overlay + overlay_json) for workflow-host sandboxes, configured by the same CENTAUR_OVERLAY_* env this branch's OverlayArgs read — so a workflow-host pod with an overlay configured got two init containers and two volumes with identical names, which Kubernetes rejects. Adopt the upstream plumbing wholesale: the backend default is now an OverlayImage from the same env helper the workflow host uses (the OverlayArgs flags are gone), a spec-level overlay takes precedence over the backend default so only one overlay-bootstrap/overlay-root pair ever exists, and agent sandboxes mount the overlay at /opt/centaur/overlay like workflow hosts do. The AGENTS_OVERLAY.md prompt staging moves into the shared overlay_json path, and the chart's duplicated CENTAUR_OVERLAY_* env block is dropped — the upstream block already feeds it.

  Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
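The bounded retry described in the "retry the tools clone" fix lives in a shell script inside the init container. The same idea is sketched here in Rust for illustration only: `retry_bounded` and its parameters are hypothetical names, not code from this PR.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Runs `op` up to `max_attempts` times, sleeping `delay` after each failed
/// attempt except the last. Returns the first success, or the last error
/// once the attempts are used up. `op` receives the 1-based attempt number.
fn retry_bounded<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "need at least one attempt");
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the final error to the caller.
            Err(error) if attempt >= max_attempts => return Err(error),
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

The 30 x 2s window from the commit message would correspond to `retry_bounded(30, Duration::from_secs(2), ...)`. Keeping the loop bounded means a proxy that never comes up still fails the init container instead of hanging it forever.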
No description provided.