fix(daemon): grok-build — pass prompt inline as -p value, drop stdin #2259

Open
srirsiva wants to merge 2 commits into nexu-io:main from srirsiva:fix/grok-build-inline-prompt

@srirsiva srirsiva commented May 19, 2026

Why

I hit this myself while wiring Grok Build into a self-hosted Open Design install. The runtime registers fine and shows available: true in /api/agents, but every single-turn invocation fails immediately with:

Could not start Grok Build: exit 2
stderr: error: a value is required for '--single <PROMPT>' but none was supplied
For more information, try '--help'.

Grok Build CLI 0.1.212 (grok 0.1.212 (b7b8204a4)) enforces -p, --single <PROMPT> as a value-requiring flag. Bare -p no longer falls back to stdin — clap rejects it before the binary ever reads piped input. The previous runtime def relied on that fallback (promptViaStdin: true + buildArgs returning ['-p']), so every Grok agent run died before it started.
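
The change in argv shape can be sketched as follows. This is a minimal illustration rather than the actual runtime def: the `RuntimeDefSketch` type and both constant names are hypothetical, and only `promptViaStdin`, `buildArgs`, and the `-p` flag come from the description above.

```typescript
// Hypothetical, trimmed-down shape; the real RuntimeAgentDef has more fields.
interface RuntimeDefSketch {
  promptViaStdin: boolean;
  buildArgs: (prompt: string) => string[];
}

// Before: bare `-p`, with the prompt piped over stdin. Grok Build CLI 0.1.212
// rejects this during argument parsing, before stdin is read.
const grokBuildBefore: RuntimeDefSketch = {
  promptViaStdin: true,
  buildArgs: () => ["-p"],
};

// After: the prompt travels as the value of `-p` and nothing is piped.
const grokBuildAfter: RuntimeDefSketch = {
  promptViaStdin: false,
  buildArgs: (prompt) => ["-p", prompt],
};
```

Because the prompt is passed as a separate argv element rather than interpolated into a shell string, it needs no quoting of its own, but its full length now counts against the OS argv limits.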

What users will see

  • Users on Grok Build CLI ≥ 0.1.212 (the current public install) can pick Grok Build as an agent and actually get a reply, instead of an exit-2 banner on every run.
  • No behaviour change for older grok CLIs — -p <prompt> is the same canonical form they always accepted; the previous "bare -p + stdin" path was the looser/non-canonical one.
  • Oversized composed prompts (system + history + skills + design-system content + user message) now hit the same actionable AGENT_PROMPT_TOO_LARGE SSE error path DeepSeek already uses, naming the -p / --single flag and suggesting stdin-capable adapters (claude / codex / hermes) for large-context runs — instead of a generic spawn ENAMETOOLONG / E2BIG banner.
  • Plain-text streaming format unchanged.

Surface area

  • UI — none
  • Keyboard shortcut — none
  • CLI / env var — none (OD_* env vars + od subcommands unchanged)
  • API / contract — none (no RuntimeAgentDef shape change; only per-runtime config values + an extra branch in promptArgvBudgetMessage)
  • Extension point — none
  • i18n keys — none
  • New top-level dependency — none
  • Default behavior change — the Grok Build runtime now spawns with grok -p <prompt> and promptViaStdin: false. Existing Grok Build users were getting exit 2 100 % of the time, so this is fix-default-broken-state, not a silent UX rewrite.

Bug fix verification

  • Reproduction:
    1. Install Grok Build CLI 0.1.212 via curl -fsSL https://x.ai/cli/install.sh | bash. Confirm with grok --version → grok 0.1.212 (b7b8204a4).
    2. In Open Design, open any project, choose Grok Build as the agent runtime, send any prompt.
    3. Before patch: daemon exits 2 with error: a value is required for '--single <PROMPT>'.
    4. After patch: daemon returns the grok reply as plain text.
  • Regression coverage for the argv-budget guard: new vitest cases in apps/daemon/tests/runtimes/prompt-budget.test.ts mirror the DeepSeek pattern — they go red on main (no maxPromptArgBytes declared on grok-build) and green on this branch:
    • grok-build declares a conservative argv-byte budget for the prompt — pins the field is set + under the Windows CreateProcess 32 KB cap.
    • checkPromptArgvBudget flags oversized Grok Build prompts and lets short prompts through — strict-overrun + at-limit + CJK byte-count guards.
    • checkPromptArgvBudget gives Grok-Build-specific guidance for large contexts — pins the -p / --single, "xAI CLI 0.1.212+", and "stdin support" copy.
  • No spawn-mode integration test (yet): an end-to-end test that actually exec'd a grok build to prove the exit-2-vs-clean-reply behavior would need a fixture binary under apps/daemon/tests/fixtures/. Happy to add one if maintainers point me at the preferred shape (e.g. a tiny native or scripted binary that exits 2 when invoked as argv = ['-p'] and 0 when invoked as argv = ['-p', 'hi']).
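
One possible shape for the scripted fixture mentioned above, sketched as a pure function so the decision logic is easy to read. `fakeGrokExitCode` is a hypothetical name, and the function models only the two cases named in the bullet (exit 2 for a bare `-p`, exit 0 when a value follows); it makes no claim about how the real grok CLI parses anything else.

```typescript
// Hypothetical stand-in for a value-requiring `-p` / `--single` flag.
// Returns the exit code a fixture script could pass to process.exit().
function fakeGrokExitCode(argv: string[]): number {
  const flagIndex = argv.findIndex((arg) => arg === "-p" || arg === "--single");
  // Flag absent: outside the scope of this sketch, treated as success.
  if (flagIndex === -1) return 0;
  // A flag with nothing after it is a usage error, mirroring the exit 2 above.
  const value = argv[flagIndex + 1];
  return value === undefined ? 2 : 0;
}
```

A real fixture would wrap this in an executable script under the fixtures directory and print a canned reply on the success path.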

Validation

  • pnpm --filter @open-design/daemon run build → clean (tsc strict, no diagnostics).
  • pnpm --filter @open-design/daemon exec vitest run tests/runtimes/prompt-budget.test.ts → 23/23 passed, ~490 ms.
  • Compiled output apps/daemon/dist/runtimes/defs/grok-build.js contains args = ['-p', prompt], promptViaStdin: false, and maxPromptArgBytes: 30000.
  • Manual smoke against grok 0.1.212: grok -p "say hi in 5 words" → Hi, good to see you (clean exit 0, single-turn reply).
  • End-to-end via Open Design daemon (host build, OD_PORT=7456, restart user-systemd) — Grok Build agent now returns text instead of exit-2 banner. No regression on adjacent runtimes (claude / codex / opencode / hermes / deepseek / pi still spawn cleanly).
  • Not tested in this PR: Windows spawn ENAMETOOLONG on very large prompts (the new argv-budget guard plus the existing checkWindowsCmdShimCommandLineBudget / checkWindowsDirectExeCommandLineBudget helpers should cover it the same way they cover DeepSeek today, but I'm on Linux only).

Why not --prompt-file instead

Considered, but -p <prompt> is the canonical headless form documented in grok --help and matches the Claude Code pattern that runtimes/defs/claude.ts follows. With the maxPromptArgBytes: 30_000 guard in place (~2.7 KB under the Windows CreateProcess limit, identical to DeepSeek), oversized composed prompts now surface the same actionable AGENT_PROMPT_TOO_LARGE error DeepSeek emits — pointing the user at stdin-capable adapters (claude / codex / hermes) when they need to ship large local context. --prompt-file adds tempfile lifecycle (cleanup, errors mid-stream, Windows path quoting) for no behavior win until prompts actually need to exceed the argv budget — at which point the runtime can grow a shim wrapper without changing the public OD surface.
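
The headroom figure quoted above works out as follows. The constant names are illustrative only; the two numbers are the ones cited in this PR.

```typescript
// Figures quoted in this PR; the constant names are hypothetical.
const WINDOWS_CREATE_PROCESS_MAX_CHARS = 32_767;
const GROK_BUILD_MAX_PROMPT_ARG_BYTES = 30_000;

// Space left for the executable path, the `-p` flag, quoting, and other args.
const headroom = WINDOWS_CREATE_PROCESS_MAX_CHARS - GROK_BUILD_MAX_PROMPT_ARG_BYTES;
const headroomKiB = headroom / 1024;
```

Here `headroom` is 2,767, or roughly 2.7 KB. The budget is expressed in UTF-8 bytes while the Windows cap is expressed in characters; assuming that cap is counted in UTF-16 code units, a byte budget is the conservative choice, since a string's UTF-8 byte count is never smaller than its UTF-16 code unit count.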

🤖 Generated with Claude Code

…p stdin

Grok Build CLI 0.1.212 enforces `-p, --single <PROMPT>` as a value-requiring
flag — invoking with bare `-p` and piping the prompt to stdin now fails with:

  error: a value is required for '--single <PROMPT>' but none was supplied

The previous runtime def used `promptViaStdin: true` + `buildArgs` returning
`['-p']`, which only worked against earlier grok builds that read the prompt
from stdin when `-p` had no inline value.

This change inlines the prompt as the `-p` argument value and flips
`promptViaStdin: false`. Linux `MAX_ARG_STRLEN` (128 KB) is enough headroom
for typical Open Design prompts; if we ever hit `E2BIG` on a very large
brief, a follow-up could shell out to `--prompt-file <tempfile>`.

Verified against grok 0.1.212 (b7b8204a4) — single-turn invocations now
return clean text replies instead of exit 2.
@lefarcen lefarcen requested a review from mrcfps May 19, 2026 11:22
@lefarcen lefarcen added the size/XS (PR changes <20 lines), risk/high (High risk: apps/desktop, daemon, auth, migration, workflows, package deps), and type/bugfix (Bug fix) labels May 19, 2026
Contributor

@lefarcen lefarcen left a comment

Hi @srirsiva! 👋 The Summary does a clear job explaining the Grok CLI 0.1.212 flag behavior change and why the daemon needs to stop piping the prompt over stdin.

One quick PR-body tidy-up before pool review: could you add the new template’s Surface area checklist (this touches daemon runtime behavior) and a Validation section with the build/UI checks you already listed? No need to rename the existing Summary — the context there is useful as-is.

@srirsiva
Author

@lefarcen thanks for the quick read 🙏

Updated the body to follow the template:

  • Added Surface area checklist (one box ticked — Default behavior change, since every Grok Build run was exiting 2 before this).
  • Pulled the build/CLI/smoke checks into an explicit Validation section.
  • Added a Bug fix verification section with the reproduction steps and a note on why I didn't add a red-then-green test (existing runtime tests stub child_process.spawn — happy to add a fixture-binary integration test if you'd prefer that shape, just let me know).

Kept the original context (now under Why) so the Grok CLI 0.1.212 flag-behavior explanation isn't lost. Let me know if anything else is missing for pool review.

Contributor

@mrcfps mrcfps left a comment

@srirsiva Thanks for the quick Grok CLI compatibility fix — I left one follow-up about preserving the prompt-size safeguards now that this adapter moved back to argv.

🔁 Powered by Looper · runner=reviewer · agent=opencode · An autonomous AI dev team for your GitHub repos.

// Grok Build CLI v0.1.212 enforces `-p, --single <PROMPT>` as value-
// required — stdin piping no longer satisfies it. Inline the prompt.
// Linux MAX_ARG_STRLEN=128KB headroom is enough for typical OD prompts.
buildArgs: (prompt, _imagePaths, _extra = [], options = {}) => {

Switching buildArgs to ['-p', prompt] and promptViaStdin: false moves the full composed OD prompt back onto argv, but this adapter still does not declare a maxPromptArgBytes budget or any prompt-budget regression coverage. That matters because the daemon only preflights argv-bound adapters when maxPromptArgBytes is set (apps/daemon/src/runtimes/prompt-budget.ts), and those composed prompts can include system text, history, skills, and design-system content. In this shape, a large Grok run will regress from the old stdin path to a raw spawn E2BIG/ENAMETOOLONG failure instead of the actionable chat error we already emit for other argv-only adapters like DeepSeek. Please either keep the prompt off argv (for example via a temp --prompt-file shim) or add a conservative maxPromptArgBytes limit plus prompt-budget tests before we rely on inline prompts here.
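
A rough sketch of the gating described in this comment, under stated assumptions: `AdapterSketch`, `BudgetError`, and `preflightPromptSketch` are hypothetical stand-ins, not the contents of prompt-budget.ts. Only `maxPromptArgBytes`, `promptViaStdin`, and the AGENT_PROMPT_TOO_LARGE code come from the PR discussion.

```typescript
// Hypothetical, reduced adapter shape for illustration.
interface AdapterSketch {
  id: string;
  promptViaStdin: boolean;
  maxPromptArgBytes?: number;
}

interface BudgetError {
  code: "AGENT_PROMPT_TOO_LARGE";
  bytes: number;
  limit: number;
}

function preflightPromptSketch(adapter: AdapterSketch, prompt: string): BudgetError | null {
  // Adapters that pipe the prompt over stdin never put it on argv.
  if (adapter.promptViaStdin) return null;
  // No declared budget means no check at all: the gap this comment points out.
  if (adapter.maxPromptArgBytes === undefined) return null;
  // argv limits are byte-based, so measure the UTF-8 encoding, not string length.
  const bytes = new TextEncoder().encode(prompt).length;
  return bytes > adapter.maxPromptArgBytes
    ? { code: "AGENT_PROMPT_TOO_LARGE", bytes, limit: adapter.maxPromptArgBytes }
    : null;
}
```

With no `maxPromptArgBytes` on the adapter, even a very large prompt passes this check and would fail only at spawn time, which is the regression being described.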

@mrcfps' review on nexu-io#2259 flagged that moving the Grok Build adapter from
the (no-longer-working) stdin path to argv would regress oversized
composed prompts from the actionable AGENT_PROMPT_TOO_LARGE error we
already emit for DeepSeek to a raw spawn ENAMETOOLONG / E2BIG instead.
Fixed by mirroring the DeepSeek argv-budget shape:

- grok-build.ts: `maxPromptArgBytes: 30_000` (same headroom as DeepSeek,
  ~2.7 KB under the Windows CreateProcess 32_767-char cap) so
  `checkPromptArgvBudget` pre-flights composed prompts (system + history
  + skills + design-system content + user message) before spawn.
- prompt-budget.ts: Grok-Build-specific message — names the `-p /
  --single` flag, the xAI CLI 0.1.212+ behavior change, and points the
  user at stdin-capable adapters (claude / codex / hermes) when they
  need to ship large local context.
- Tests: 3 new vitest cases in prompt-budget.test.ts — pin the budget
  field, exercise the strict-overrun + at-limit + CJK byte-count guards
  exactly like the DeepSeek regression set, and assert the Grok-named
  diagnostic copy. New `grokBuild` + `grokBuildMaxPromptArgBytes`
  helpers exported alongside the existing `deepseek*` ones.

All 23 prompt-budget tests pass locally (`pnpm exec vitest run
tests/runtimes/prompt-budget.test.ts`).
@srirsiva
Author

@mrcfps thanks — sharp catch on the prompt-budget regression. Fixed in da51b59:

apps/daemon/src/runtimes/defs/grok-build.ts — declares maxPromptArgBytes: 30_000, the same headroom DeepSeek uses (~2.7 KB under the Windows CreateProcess 32 KB cap) so checkPromptArgvBudget pre-flights the composed prompt before spawn.

apps/daemon/src/runtimes/prompt-budget.ts — added a Grok-Build-specific branch in promptArgvBudgetMessage that names the -p / --single flag, the xAI CLI 0.1.212+ behavior change, and points users at stdin-capable adapters (claude / codex / hermes) when they need to ship large local context.

apps/daemon/tests/runtimes/prompt-budget.test.ts — three new cases mirroring the DeepSeek regression set:

  • pin the budget field is set and stays under the Windows CreateProcess limit
  • flag strict-overrun + at-limit + CJK byte-count edges
  • pin the Grok-named diagnostic copy

grokBuild + grokBuildMaxPromptArgBytes exported alongside the existing deepseek* helpers in tests/runtimes/helpers/test-helpers.ts.
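
The three edges those cases exercise can be illustrated with a small standalone sketch. It does not reproduce the actual vitest cases or helper signatures; `utf8Bytes` and `exceedsBudget` are hypothetical names, and the 30,000-byte limit is the figure from this PR.

```typescript
// Hypothetical helpers; the real checks live in prompt-budget.ts.
const utf8Bytes = (text: string): number => new TextEncoder().encode(text).length;
const exceedsBudget = (prompt: string, limit: number): boolean => utf8Bytes(prompt) > limit;

const limit = 30_000;

// At-limit: exactly 30,000 ASCII bytes, which a strict ">" comparison allows.
const atLimit = "a".repeat(limit);

// Strict overrun: one byte past the limit is rejected.
const overByOne = "a".repeat(limit + 1);

// CJK: only 10,001 characters, but each takes 3 bytes in UTF-8 (30,003 total),
// so a check based on string length would wrongly let it through.
const cjk = "设".repeat(10_001);
```

The CJK case is the reason for counting bytes instead of `prompt.length`: the string is about a third of the limit in characters yet still over budget once encoded.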

pnpm exec vitest run tests/runtimes/prompt-budget.test.ts → 23/23 passed. PR body now has an updated Validation block.

I deliberately stuck with inline argv + budget rather than pivoting to a --prompt-file shim — rationale in the updated PR body under "Why not --prompt-file instead" — but happy to flip to a tempfile wrapper if you'd rather keep the prompt off argv entirely.

@lefarcen lefarcen added the size/S (PR changes 20-100 lines) label and removed the size/XS (PR changes <20 lines) label May 19, 2026
@lefarcen lefarcen requested a review from mrcfps May 19, 2026 11:45
Contributor

@mrcfps mrcfps left a comment

@srirsiva I re-checked the updated Grok Build runtime changes on da51b5942bd0b38886af6813ad49bd43a172515f: the adapter now passes the prompt inline, restores the argv-budget guard with Grok-specific messaging, and adds focused regression coverage for the prompt-budget path. I also reran pnpm --filter @open-design/daemon exec vitest run tests/runtimes/prompt-budget.test.ts and pnpm --filter @open-design/daemon build, both of which passed locally. Thanks for turning around the follow-up quickly — this looks solid.

@srirsiva
Author

@mrcfps thanks for the quick re-review and approval! Local Grok Build runs cleanly against da51b59 on the host build (CLI 0.1.212, daemon active in user-systemd). Happy to add the fixture-binary integration test if maintainers want regression coverage at the spawn layer before merge — let me know.

PR is still showing BLOCKED on mergeStateStatus from the GitHub API, so it looks like CODEOWNERS / branch-protection is waiting on a second reviewer or a maintainer merge button. Standing by either way.

Labels

risk/high (High risk: apps/desktop, daemon, auth, migration, workflows, package deps), size/S (PR changes 20-100 lines), type/bugfix (Bug fix)

4 participants