[codex] fix(security): bundle sandbox, Telegram, and update hardening #1416

Open
13ernkastel wants to merge 8 commits into NVIDIA:main from
13ernkastel:codex/followup-shellquote-sandbox-hardening

Conversation

Contributor

@13ernkastel 13ernkastel commented Apr 3, 2026

Summary

Bundles the remaining sandbox command-hardening work with the Telegram fail-closed cleanup and the unsupported self-update-hint fix.

This now includes the original #1416 scope plus the changes that had temporarily been split into #1500.
#1499 remains separate on purpose.

Linked Issues

Related PRs / Issues

  • follow-up to #1392
  • folds in #1218
  • folds in #1215
  • replaces #1500
  • keeps #1499 separate
  • addresses #896

Changes

  • re-validates sandbox names at the createSandbox() boundary and removes the remaining shell-string dependency from follow-on sandbox command paths
  • adds runFile() and uses argv-style execution for setup-dns-proxy.sh
  • replaces the dashboard readiness probe with the structured OpenShell helper path
  • requires an explicit Telegram chat allowlist before the bridge forwards prompts
  • adds nemoclaw telegram subcommands and nemoclaw start --discover-chat-id
  • preserves the reserved-sandbox-name guard added during the Telegram review follow-up
  • disables unsupported OpenClaw self-update hints in the generated sandbox config
  • propagates saved Telegram allowlists into the remote deploy env so deployed bridges stay fail-closed too
  • updates focused CLI/deploy tests to match the current services-based startup path on main
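
The argv-style execution described in the bullets above can be illustrated with Node's built-in `child_process.spawnSync`. This is a minimal sketch, not the PR's `runFile()`: `echoArg` is a hypothetical helper, and the child process is just Node echoing its last argument. It suggests why passing values as separate argv elements avoids shell parsing: a hostile-looking value reaches the child verbatim.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper for illustration only (not the PR's runFile()).
// Spawns a child Node process that writes its last argument back to stdout.
function echoArg(value) {
  const result = spawnSync(
    process.execPath,
    ["-e", "process.stdout.write(process.argv[process.argv.length - 1])", "--", value],
    { shell: false, encoding: "utf8" },
  );
  return result.stdout;
}

// With no shell involved, the ";" is just another character in one argument.
const hostile = "demo; echo pwned";
console.log(JSON.stringify(echoArg(hostile)));
```

Had the same value been interpolated into a single command string and run through a shell, the `;` would have ended the first command and started a second one.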

Why

These changes all tighten the default security posture around operator-managed sandboxes:

  • sandbox creation and follow-on helper execution rely less on shell-string construction
  • Telegram bridge access now fails closed unless the operator explicitly allowlists chat IDs
  • sandbox images stop advertising an unsupported in-container self-update path
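
The fail-closed Telegram behavior in the list above can be sketched as follows. All names here (`parseAllowlist`, `isChatAllowed`) and the comma-separated allowlist format are assumptions made for illustration; the PR's bridge code is not shown in this conversation.

```javascript
// Hypothetical names throughout; the PR's actual bridge code is not shown here.
// Parses a comma-separated allowlist string into a Set of chat ID strings.
function parseAllowlist(raw) {
  return new Set(
    String(raw ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter((id) => id.length > 0),
  );
}

// Fail closed: an empty or missing allowlist permits nobody.
function isChatAllowed(chatId, allowlist) {
  if (allowlist.size === 0) return false;
  return allowlist.has(String(chatId));
}

const allowlist = parseAllowlist("12345, -100987");
console.log(isChatAllowed(12345, allowlist));
console.log(isChatAllowed(12345, parseAllowlist(undefined)));
```

The point of the design is that the absence of configuration means "deny", so a bridge started without an allowlist forwards nothing.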

Keeping them together in #1416 makes the remaining security review surface smaller while still leaving the separate immutable-hardening follow-up in #1499 alone.

Validation

  • npm run build:cli
  • npx vitest run src/lib/deploy.test.ts src/lib/onboard-session.test.ts test/onboard.test.js test/cli.test.js test/runner.test.js test/service-env.test.js test/registry.test.js test/shellquote-sandbox.test.js

Risks / Notes

  • npm run typecheck:cli still hits the repo's existing src/lib/*.test.ts -> ../../dist/lib/* type-resolution issue in this environment, so validation here relies on the targeted build plus Vitest coverage above
  • #1499 remains separate on purpose

Contributor

coderabbitai bot commented Apr 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new argv-style execution helper (runFile), tightens Brave Search credential validation (non-interactive error paths), enforces sandbox name validation at creation, switches DNS and dashboard readiness calls to argv-style execution, and exports web-search helpers; tests updated/added accordingly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Runner utility: bin/lib/runner.js | Introduces runFile(file, args, opts) and a central spawnAndHandle helper; normalizes argv execution, env merging, output suppression, and rendered-command error logging; rejects opts.shell=true. |
| Onboarding logic: bin/lib/onboard.js | Reworks Brave Search credential validation to support nonInteractive and explicit errors; no longer lowercases sandbox names; enforces validateName(...) at the createSandbox boundary; replaces shell-string runs with argv-style (runFile) for the DNS proxy and uses runCaptureOpenshell([...]) for dashboard readiness probing; exports configureWebSearch and ensureValidatedBraveSearchCredential. |
| Tests (runner & onboarding): test/runner.test.js, test/onboard.test.js, test/shellquote-sandbox.test.js, src/lib/onboard-session.test.ts, test/registry.test.js | Adds/updates tests to record runFile calls, assert argv-style command usage, harden dashboard readiness matching, assert whitespace-only sandboxName rejection, verify Brave non-interactive failures, add session persistence and registry clear tests, and introduce shell-quoting lint-style checks. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Onboard as bin/lib/onboard.js
    participant Runner as bin/lib/runner.js
    participant Child as childProcess.spawnSync
    participant CredStore as Credential / Env

    Client->>Onboard: start onboarding (may include sandboxNameOverride)
    Onboard->>CredStore: get saved API key / read env
    CredStore-->>Onboard: apiKey or null
    Onboard->>Onboard: ensureValidatedBraveSearchCredential(nonInteractive?)
    alt brave validation fails (non-interactive)
        Onboard->>Client: throw error
    else validation ok
        Onboard->>Onboard: validateName(sandboxName, "sandbox name")
        Onboard->>Runner: runFile("bash", [setup-dns-proxy.sh, GATEWAY, sandboxName], {ignoreError:true})
        Runner->>Child: spawnSync("bash", args, opts)
        Child-->>Runner: result
        Runner-->>Onboard: result
        Onboard->>Runner: runCaptureOpenshell([..., "sandbox","exec", sandboxName, "curl", "http://localhost:PORT/"])
        Runner->>Child: spawnSync(...argv..., opts)
        Child-->>Runner: result
        Runner-->>Onboard: captured output => ready
        Onboard->>Client: createSandbox complete (persist webSearchConfig if set)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A little hop, a careful check,

No sneaky strings to interject,
Argv lines neat, credentials true,
Sandbox names validated through,
I nibble bugs and say, "All set!" 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The PR title mentions 'bundle sandbox, Telegram, and update hardening' but the actual changes focus on hardening sandbox command execution, validating sandbox names, and adding argv-style script execution via runFile(). The title does not accurately reflect these primary changes. | Revise the title to clearly reflect the main objectives, such as 'harden sandbox command execution with input validation and argv-style script execution' or similar. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@13ernkastel 13ernkastel marked this pull request as ready for review April 3, 2026 13:17
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/runner.js (1)

60-75: Collapse the spawn helpers into one internal path.

run(), runInteractive(), and runFile() now repeat the same spawnSync/redaction/exit handling. A shared helper would make future hardening changes much harder to miss in one branch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/runner.js` around lines 60 - 75, The three functions run(),
runInteractive(), and runFile() duplicate spawnSync/stdio/env/redaction/exit
handling; extract that logic into a single internal helper (e.g., spawnAndHandle
or _spawnSyncWithRedaction) that takes (fileOrCmd, args, opts, stdio) and
performs spawnSync with cwd ROOT, merged env, calls writeRedactedResult(result,
stdio), logs the redacted rendered command on non-zero exit and process.exit,
and returns result; then refactor run(), runInteractive(), and runFile() to call
this helper with the appropriate stdio and ignoreError behavior, removing the
duplicated spawnSync and exit handling from each function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 2095-2098: The prompt validation and the final boundary check are
out of sync: replace the `sandboxNameOverride || (await
promptValidatedSandboxName())` expression with `sandboxNameOverride ?? (await
promptValidatedSandboxName())` to prevent empty string falling through, and
modify `promptValidatedSandboxName()` to call and return `validateName(...)`
(instead of using the RFC-1123 regex directly) so the interactive retry loop
enforces the same 63-character/lowercase rules as `validateName`; ensure
`validateName` is used for both override and prompted values so failures
re-prompt rather than abort.
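
To make the `||` versus `??` distinction in the comment above concrete, here is a small sketch. The `validateName` below is a hypothetical stand-in that approximates an RFC 1123 label check (lowercase alphanumerics and hyphens, at most 63 characters); the real function in `bin/lib/onboard.js` may differ.

```javascript
// Hypothetical validator approximating an RFC 1123 label check
// (lowercase alphanumerics and hyphens, 1 to 63 characters).
function validateName(name, label = "name") {
  if (typeof name !== "string" || !/^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/.test(name)) {
    throw new Error(`invalid ${label}: ${JSON.stringify(name)}`);
  }
  return name;
}

const override = "";           // explicitly supplied, but empty
const prompted = "my-sandbox"; // stand-in for an interactive prompt result

// "||" treats the empty override as absent and silently uses the prompt value.
console.log(JSON.stringify(override || prompted));
// "??" only falls back on null/undefined, so the empty override survives
// and a later validateName(...) call can reject it.
console.log(JSON.stringify(override ?? prompted));
```

Routing both the override and the prompted value through the same validator keeps the interactive path and the boundary check from drifting apart.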

---

Nitpick comments:
In `@bin/lib/runner.js`:
- Around line 60-75: The three functions run(), runInteractive(), and runFile()
duplicate spawnSync/stdio/env/redaction/exit handling; extract that logic into a
single internal helper (e.g., spawnAndHandle or _spawnSyncWithRedaction) that
takes (fileOrCmd, args, opts, stdio) and performs spawnSync with cwd ROOT,
merged env, calls writeRedactedResult(result, stdio), logs the redacted rendered
command on non-zero exit and process.exit, and returns result; then refactor
run(), runInteractive(), and runFile() to call this helper with the appropriate
stdio and ignoreError behavior, removing the duplicated spawnSync and exit
handling from each function.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4277283d-a5ab-4474-9d2b-b9e92316bc0b

📥 Commits

Reviewing files that changed from the base of the PR and between 494ecde and 1f1d6d5.

📒 Files selected for processing (5)
  • bin/lib/onboard.js
  • bin/lib/runner.js
  • test/onboard.test.js
  • test/runner.test.js
  • test/shellquote-sandbox.test.js

wscurran added the security, priority: high, enhancement: feature, and OpenShell labels Apr 3, 2026
Contributor

wscurran commented Apr 3, 2026

✨ Thanks for submitting this pull request, which proposes a way to improve security by hardening sandbox command execution in OpenShell.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 4049-4073: The current logic updates
onboardSession.webSearchConfig immediately after checking Brave Search
(ensureValidatedBraveSearchCredential / configureWebSearch), which can
desynchronize session state from an existing sandbox because
patchStagedDockerfile() only applies the brave preset during createSandbox();
defer mutating onboardSession.webSearchConfig until after the sandbox-reuse
decision (the code block that decides whether to recreate or reuse the sandbox),
so instead only compute a candidate webSearchConfig locally (using
ensureValidatedBraveSearchCredential and configureWebSearch) and only call
onboardSession.updateSession(...) to set webSearchConfig after the
reuse-vs-recreate branch completes and createSandbox() is invoked if needed.
- Around line 1056-1090: ensureValidatedBraveSearchCredential currently prompts
via promptBraveSearchApiKey()/promptBraveSearchRecovery() when there's no saved
key or validation fails; update it to fail fast in non-interactive runs by
detecting the non-interactive flag (either accept a new parameter like
nonInteractive or read the existing global/non-interactive indicator used by
onboard()) and when nonInteractive is true: if no saved credential from
getCredential(webSearch.BRAVE_API_KEY_ENV) or validateBraveSearchApiKey(apiKey)
fails, immediately throw a clear Error or return null (consistent with callers)
instead of calling promptBraveSearchApiKey()/promptBraveSearchRecovery(); keep
the existing saveCredential/process.env assignment path for interactive flows.
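
The fail-fast behavior requested in the second comment above might look roughly like this. It is a sketch under stated assumptions: `ensureCredential` is a hypothetical name, and `getSaved`, `validate`, and `prompt` are injected stand-ins for `getCredential`, `validateBraveSearchApiKey`, and the prompt helpers so the snippet runs on its own. For brevity it does not re-validate the prompted value.

```javascript
// Hypothetical sketch: every dependency is injected so the snippet runs alone.
// getSaved, validate, and prompt stand in for the helpers named in the review.
function ensureCredential({ getSaved, validate, prompt, nonInteractive }) {
  const saved = getSaved();
  if (saved && validate(saved)) return saved;
  if (nonInteractive) {
    // Fail fast instead of blocking on a prompt nobody can answer.
    throw new Error("credential missing or invalid in non-interactive mode");
  }
  return prompt();
}

// Interactive run with no saved key: falls through to the prompt.
const key = ensureCredential({
  getSaved: () => null,
  validate: (k) => k.length > 0,
  prompt: () => "typed-by-operator",
  nonInteractive: false,
});
console.log(key);
```

Throwing (rather than returning null) is one of the two options the comment allows; which one fits depends on what the callers already expect.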
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ecd12c58-cdeb-451f-820c-2ba6aa238d5f

📥 Commits

Reviewing files that changed from the base of the PR and between 99ee3a0 and bb1cd5f.

📒 Files selected for processing (9)
  • bin/lib/onboard.js
  • bin/lib/registry.js
  • bin/lib/runner.js
  • src/lib/onboard-session.test.ts
  • src/lib/onboard-session.ts
  • test/onboard.test.js
  • test/registry.test.js
  • test/runner.test.js
  • test/shellquote-sandbox.test.js
✅ Files skipped from review due to trivial changes (2)
  • src/lib/onboard-session.test.ts
  • test/shellquote-sandbox.test.js
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/runner.test.js
  • bin/lib/runner.js
  • test/onboard.test.js

13ernkastel force-pushed the codex/followup-shellquote-sandbox-hardening branch from bb1cd5f to f024982 April 4, 2026 03:37
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/runner.js`:
- Around line 64-72: The runFile helper currently spreads opts into spawnSync
allowing a caller to re-enable shell execution; modify runFile (the function
named runFile and the spawnSync invocation) to explicitly forbid opts.shell by
checking if opts.shell is present and throwing or rejecting, and always pass
shell: false in the options passed to spawnSync (i.e., merge {...opts, shell:
false, stdio, cwd: ROOT, env: {...}}) so callers cannot enable shell parsing.
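
A minimal sketch of the guard this comment asks for, assuming nothing about the rest of `bin/lib/runner.js`: the function below reuses the name `runFile` for readability but is illustrative only, and it omits the `cwd`, `env`, and redaction handling the real helper is described as having.

```javascript
const { spawnSync } = require("node:child_process");

// Illustrative only; not the implementation in bin/lib/runner.js.
function runFile(file, args = [], opts = {}) {
  if (opts.shell) {
    // Reject explicitly so a caller mistake is loud rather than silently ignored.
    throw new Error("runFile does not allow opts.shell");
  }
  // shell: false comes after the spread so a caller option can never override it.
  // encoding is fixed here only to keep the example's output a string.
  return spawnSync(file, args, { ...opts, shell: false, encoding: "utf8" });
}

const result = runFile(process.execPath, ["-e", "process.stdout.write('ok')"]);
console.log(result.stdout);
```

Doing both (the explicit rejection and the forced `shell: false`) is belt and braces: the check catches intent, and the option ordering covers any path the check misses.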
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 06e0bccf-c6a0-4b8d-848d-a1dfbe93698b

📥 Commits

Reviewing files that changed from the base of the PR and between bb1cd5f and f024982.

📒 Files selected for processing (7)
  • bin/lib/onboard.js
  • bin/lib/runner.js
  • src/lib/onboard-session.test.ts
  • test/onboard.test.js
  • test/registry.test.js
  • test/runner.test.js
  • test/shellquote-sandbox.test.js
✅ Files skipped from review due to trivial changes (2)
  • src/lib/onboard-session.test.ts
  • test/shellquote-sandbox.test.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/registry.test.js
  • test/runner.test.js

Contributor Author

Addressed the remaining review feedback on top of the rebased branch.

Latest follow-up commits:

  • 0e42112 hardens Brave resume handling so non-interactive resume fails fast and only persists webSearchConfig after an actual sandbox recreation
  • 3e9c149 hardens runFile() by rejecting opts.shell=true and forcing shell: false
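
The "persist only after an actual recreation" rule from the first commit can be sketched like this. Everything here is hypothetical (`finishWebSearchSetup`, the session shape, the `provider` field, the injected `createSandbox`); it only illustrates the ordering, not the code in 0e42112.

```javascript
// Hypothetical sketch; session, createSandbox, and the config shape are stand-ins.
function finishWebSearchSetup({ session, candidateConfig, reuseExisting, createSandbox }) {
  if (reuseExisting) {
    // The existing sandbox was built without this config, so leave the session alone.
    return session;
  }
  createSandbox(candidateConfig);
  // Only now does the session record the config that the new sandbox really has.
  return { ...session, webSearchConfig: candidateConfig };
}

const created = [];
const reused = finishWebSearchSetup({
  session: { name: "demo" },
  candidateConfig: { provider: "brave" },
  reuseExisting: true,
  createSandbox: (cfg) => created.push(cfg),
});
console.log(reused.webSearchConfig === undefined, created.length);
```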

Regression coverage added/updated for both areas.

Checks run:

  • npm run build:cli
  • npm run typecheck
  • npx vitest run test/onboard.test.js test/runner.test.js test/shellquote-sandbox.test.js test/registry.test.js src/lib/onboard-session.test.ts

13ernkastel force-pushed the codex/followup-shellquote-sandbox-hardening branch from ddd49d7 to a804a2b April 5, 2026 04:25
13ernkastel changed the title from "[codex] fix(security): harden sandbox command execution" to "[codex] fix(security): bundle sandbox, Telegram, and update hardening" Apr 5, 2026
Labels

  • enhancement: feature (Use this label to identify requests for new capabilities in NemoClaw.)
  • OpenShell (Support for OpenShell, a safe, private runtime for autonomous AI agents)
  • priority: high (Important issue that should be resolved in the next release)
  • security (Something isn't secure)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Openclaw update fails

2 participants