
fix(onboard): force chat completions API for vLLM and NIM-local providers#980

Open
BenediktSchackenberg wants to merge 6 commits into NVIDIA:main from BenediktSchackenberg:fix/vllm-tool-call-api

Conversation

Contributor

@BenediktSchackenberg commented Mar 26, 2026

Summary

vLLM's /v1/responses endpoint does not run the --tool-call-parser, so tool calls arrive as raw text instead of populating the structured tool_calls array. During onboard, probeOpenAiLikeEndpoint() tries /v1/responses first — since vLLM accepts it, preferredInferenceApi gets set to openai-responses and baked into the sandbox config.
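To make the failure mode concrete, here is a sketch of the two message shapes involved. The structured shape follows the OpenAI-compatible chat-completions schema; the unparsed shape is only representative, since the exact raw-text wrapper depends on the model's chat template, and the function name and arguments are illustrative.

```javascript
// What a correctly parsed tool call looks like on /v1/chat/completions
// (OpenAI-compatible schema). Function name/arguments are illustrative.
const parsedMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_0",
      type: "function",
      function: { name: "read_file", arguments: '{"path":"README.md"}' },
    },
  ],
};

// Without the --tool-call-parser (as on /v1/responses), the same call
// arrives as raw model text; the exact wrapper varies by chat template,
// so this is only a representative example.
const unparsedMessage = {
  role: "assistant",
  content:
    '<tool_call>{"name":"read_file","arguments":{"path":"README.md"}}</tool_call>',
};

// A client that only inspects message.tool_calls sees no tool call at all
// in the unparsed case.
const hasStructuredToolCall = Array.isArray(parsedMessage.tool_calls);
const looksLikePlainText = unparsedMessage.tool_calls === undefined;
```

This is why the endpoint choice matters: both endpoints "work", but only one produces the `tool_calls` array that downstream tooling consumes.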

Fix

Override preferredInferenceApi to openai-completions after probe validation for both the vllm and nim-local provider paths in bin/lib/onboard.js. This routes inference through /v1/chat/completions where the tool-call-parser works.

The probe still runs (it validates that the endpoint is reachable); we simply ignore its API preference for these two providers.
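A minimal sketch of the override logic, under stated assumptions: the helper name and log text here are illustrative (the actual change lives inline in bin/lib/onboard.js after probe validation), and only `preferredInferenceApi` and the provider ids come from the PR.

```javascript
// Providers whose probed API preference must be overridden, per this PR.
const FORCE_COMPLETIONS = new Set(["vllm", "nim-local"]);

// Illustrative helper: the real change is inline in the two provider
// paths, not a standalone function.
function resolveInferenceApi(provider, probedApi) {
  if (FORCE_COMPLETIONS.has(provider) && probedApi !== "openai-completions") {
    // vLLM accepts /v1/responses, but only /v1/chat/completions runs the
    // --tool-call-parser, so the probe's preference is ignored here.
    console.log(
      `${provider}: overriding probed API "${probedApi}" with ` +
        `"openai-completions" so the tool-call parser is applied`
    );
    return "openai-completions";
  }
  return probedApi;
}
```

With this shape, `resolveInferenceApi("vllm", "openai-responses")` yields `"openai-completions"`, while unrelated providers keep whatever the probe reported.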

Fixes #976

Summary by CodeRabbit

  • Bug Fixes
    • Enforced routing of local inference and vLLM tool calls through the chat-completions endpoint, standardizing behavior and improving tool-call parsing and compatibility for local setups.
  • Tests
    • Added integration tests that verify the enforced inference API selection and expected logging to prevent regressions.


vLLM's /v1/responses endpoint does not run the --tool-call-parser,
so tool calls arrive as raw text in the response content instead of
the structured tool_calls array. The probe picks openai-responses
because vLLM accepts the request, but parsing only works on
/v1/chat/completions.

Override preferredInferenceApi to openai-completions after validation
for both vllm and nim-local provider paths.

Fixes NVIDIA#976
Copilot AI review requested due to automatic review settings March 26, 2026 19:17

Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.




coderabbitai bot commented Mar 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

The PR forces preferredInferenceApi = "openai-completions" for the NIM-local and local vLLM onboarding paths in bin/lib/onboard.js, and adds tests verifying that onboarding probes reporting openai-responses are overridden so requests use /v1/chat/completions.

Changes

  • Onboarding API Selection Override (bin/lib/onboard.js): In setupNim(gpu) and the local vLLM selection branch, override the probed preferredInferenceApi to "openai-completions" and log when the override occurs, so sandboxed inference is routed to /v1/chat/completions.
  • Tests: onboarding selection & overrides (test/onboard-selection.test.js): Adds integration tests that simulate choosing local vLLM and NIM-local while mocked probes report openai-responses; asserts preferredInferenceApi is forced to openai-completions and checks for expected vLLM/NIM tooling log output.

Sequence Diagram(s)

sequenceDiagram
  participant Onboard as Onboard script
  participant Probe as Endpoint probe
  participant Config as Sandbox config
  participant Client as Agent/Client
  participant vLLM as vLLM/NIM local
  participant ToolParser as Tool-Call Parser
  participant Tool as Tool/Executor

  Onboard->>Probe: probe /v1/responses and /v1/models
  Probe-->>Onboard: reports openai-responses available
  Onboard->>Config: set preferredInferenceApi = "openai-completions" (override)
  Client->>Config: send inference request
  Config->>vLLM: forward to /v1/chat/completions
  vLLM->>ToolParser: produce structured tool_call data
  ToolParser->>Tool: execute tool call
  Tool-->>Client: return tool result via chat response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nudged the probe, gave responses a shove,
Switched to completions for the parser I love.
Tool calls now hop out tidy and bright,
Local vLLM dances, tools work just right. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title clearly and concisely summarizes the main change: forcing the chat completions API for vLLM and NIM-local providers, which is the core fix in the changeset.
  • Linked Issues check: ✅ Passed. The PR implementation fully addresses issue #976's coding requirements: overriding preferredInferenceApi to openai-completions in both vllm and nim-local provider paths, with comprehensive test coverage validating the override behavior.
  • Out of Scope Changes check: ✅ Passed. All changes are directly scoped to issue #976: code modifications in onboard.js target vllm and nim-local API routing, and tests verify these specific overrides with no extraneous changes.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

1869-1871: Add a regression test for the responses-first probe path.

Current coverage only proves that getSandboxInferenceConfig() respects an explicit preferred API. It does not lock down the new behavior where setupNim() must return openai-completions for vllm and nim-local even when the probe reports openai-responses first. A focused test here would make this fix much harder to regress.

Also applies to: 1988-1991

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1869 - 1871, Add a regression test that
simulates the probe returning "openai-responses" first and verifies that
setupNim() (and indirectly getSandboxInferenceConfig()) forces
preferredInferenceApi to "openai-completions" for engines "vllm" and
"nim-local"; specifically, mock the probe result to return openai-responses,
call setupNim() for both engine types, and assert the returned/derived
preferredInferenceApi equals "openai-completions" to lock down the
responses-first probe path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 1869-1871: After forcing preferredInferenceApi =
"openai-completions" add an explicit message into the onboarding output so the
override is echoed to the user; locate the code that previously logged the
probed API (validateOpenAiLikeSelection()) and the onboarding message builder
that prints the selected provider, and append or update that onboarding text to
include that OpenClaw is forcing "openai-completions" (use the same variable
preferredInferenceApi). Do the same change for the other identical override site
later in the file where preferredInferenceApi is set (the second block that
mirrors this behavior) so both onboarding outputs reflect the forced override.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0342752e-7456-4fa0-b3c8-550c9219f25d

📥 Commits

Reviewing files that changed from the base of the PR and between 6d44def and 2f6dfd0.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

benedikt and others added 3 commits March 26, 2026 19:25
  • Surface the API override during onboard so users see why the responses API was not selected.
  • Verifies that setupNim() forces preferredInferenceApi to openai-completions for the vLLM provider path even when the probe detects openai-responses first. This locks down the fix for NVIDIA#976 so the tool-call-parser override cannot silently regress.
@BenediktSchackenberg (Contributor, Author)

Added a regression test as suggested by CodeRabbit — it verifies that setupNim() forces openai-completions for the vLLM path even when the probe detects openai-responses first. All 15 selection tests pass locally.

I'd appreciate it if a maintainer could trigger CI on this; since it's a first-time contributor fork, the run is waiting on approval. Thanks!


coderabbitai bot left a comment


🧹 Nitpick comments (1)
test/onboard-selection.test.js (1)

1308-1405: Add a companion test for nim-local override parity.

This PR objective covers both vllm and nim-local, but the new coverage here only guards vllm. A parallel nim-local case would prevent silent regressions in the second forced-override path.

♻️ Suggested follow-up (test skeleton)
+ it("forces openai-completions for nim-local even when probe detects openai-responses", () => {
+   // mirror the vLLM structure:
+   // - select nim-local provider
+   // - make probe return /v1/responses success
+   // - assert preferredInferenceApi === "openai-completions"
+   // - assert override log line is emitted
+ });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard-selection.test.js` around lines 1308 - 1405, Add a parallel test
in the same file that mirrors the vLLM test but exercises the nim-local
forced-override path: replicate the test that writes a fake curl and script
invoking setupNim, but change the interactive answer sequence (via
credentials.prompt) to select the nim-local option, ensure runner.runCapture
responds to model probes similarly, then assert payload.result.provider ===
"nim-local", payload.result.preferredInferenceApi === "openai-completions", and
payload.result.model matches the detected nim-local model; also assert the
captured console lines include the equivalent "Using existing ..." log for
nim-local and the "tool-call-parser requires" warning. Use the same helpers
(setupNim, credentials.prompt, runner.runCapture) and test scaffolding as the
vLLM case to guarantee parity.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f569ae45-9c39-46c7-b645-5e4c322a6f5d

📥 Commits

Reviewing files that changed from the base of the PR and between 861581e and 6c43ed6.

📒 Files selected for processing (1)
  • test/onboard-selection.test.js

@BenediktSchackenberg (Contributor, Author)

@cv Would you mind triggering CI on this one? Added a regression test for the override as well. Thanks!

Companion test for the vLLM case: verifies that setupNim() forces
openai-completions for the NIM-local path too, since NIM uses vLLM
internally and has the same tool-call-parser limitation.
@BenediktSchackenberg (Contributor, Author)

Added the companion NIM-local regression test as suggested by CodeRabbit — mocks the full NIM onboarding flow (listModels, pullNimImage, startNimContainerByName, waitForNimHealth) and verifies the same openai-completions override applies. All 16 selection tests pass.



Development

Successfully merging this pull request may close these issues.

Tool calls broken with local vLLM provider — openai-responses bypasses tool-call-parser

4 participants