fix(discord): tighten native messaging policies and diagnostics by kjw3 · Pull Request #1653 · NVIDIA/NemoClaw

kjw3 · 2026-04-09T01:59:33Z

Summary

This draft PR advances the Discord/native messaging remediation work in NemoClaw without claiming that native Discord is fully fixed end to end yet.

It also folds in the analogous Telegram doctor/config follow-up so rebuilt sandboxes match the current account-based channel schema and avoid a misleading empty group allowlist warning.

It now also addresses the repo-owned portion of #1692 by removing deprecated tls: terminate directives from the messaging policy presets.

It does five repo-owned things:

narrows and supersedes the communication preset fix from fix(presets): add missing binaries to communication presets #1084 by adding only the minimal additional Node binary coverage
fixes onboarding so messaging policy suggestions follow the current messaging selection rather than stale stored credentials
fixes the baked Discord account schema at image build time, adds build-time Discord guild workspace config, and improves troubleshooting/E2E validation so baked-config failures are separated from native Discord gateway failures
aligns baked Telegram config with the same doctor-expected account layout and makes Telegram group handling explicit so default group delivery does not silently drop messages
removes deprecated TLS-termination syntax from the Discord, Slack, and Telegram presets so NemoClaw does not keep shipping policy warnings on current OpenShell builds

Why this PR exists

The current evidence suggests the remaining Discord break is not just a missing onboarding step:

the sandbox has the Discord policy applied
the Discord provider is attached
command registration PUT requests are allowed at runtime
the bot can authenticate against discord.com
the bot can read channel history and send messages back into Discord
current NemoClaw onboarding only baked token/account config, not the Discord guild config that OpenClaw documents for server-channel autonomy

What still appears broken is the autonomous inbound side of native Discord handling. In current testing, OpenClaw can read and write Discord successfully, but incoming @mention messages do not appear to wake the agent in real time. After baking the documented guild config into the sandbox, the failure still reproduces. Direct diagnosis inside the sandbox shows that:

the baked channels.discord.guilds.<serverId> config is present and correct
Node DNS lookup for gateway.discord.gg succeeds
a plain Node WebSocket('wss://gateway.discord.gg/...') succeeds inside the same sandbox
OpenClaw's Discord gateway client still loops with AggregateError and close code 1006

That makes the remaining issue look narrower than generic connectivity: likely the live Discord gateway client path in the pinned OpenClaw runtime (2026.3.11) under the OpenShell proxy environment.

This PR tightens NemoClaw’s side of the system so:

the sandbox policy and onboarding behavior are correct
the baked Discord config matches current OpenClaw schema expectations
Discord server-workspace config can now be baked at image build time instead of being missing from NemoClaw onboarding entirely
onboarding can ask whether the bot should reply only to @mentions or to all messages in the configured server
the Discord user allowlist is optional, so leaving it blank allows any member of the configured server to message the bot
Telegram single-account config is already baked in the accounts.default layout that openclaw doctor now expects
Telegram group chats default to groupPolicy: open, so an empty DM allowlist does not produce silent group-message drops
messaging policy presets no longer pin deprecated tls: terminate directives on current OpenShell builds
users are not sent toward dead-end config repair steps
tests explicitly probe the Discord gateway path instead of only REST

Changes

bin/lib/onboard.js
- add getSuggestedPolicyPresets()
- make messaging-related policy suggestions respect explicit enabledChannels
- persist messaging channel selection through onboarding/resume flow
nemoclaw-blueprint/policies/presets/{discord,telegram,slack}.yaml
- add /usr/bin/node alongside /usr/local/bin/node
- remove deprecated tls: terminate directives and rely on current OpenShell automatic TLS handling
- intentionally do not broaden to curl or bash in these narrow presets
docs/reference/troubleshooting.md
- document that openclaw doctor --fix cannot repair baked sandbox channel config
- add guidance for distinguishing baked-config issues from native Discord gateway issues
- add guidance for the analogous Telegram doctor migration/warning path
Dockerfile
- bake messaging accounts under channels.<provider>.accounts.default instead of accounts.main
- align the image with the schema openclaw doctor currently expects, so this migration is handled at build time rather than failing inside the immutable sandbox
- add a build-time NEMOCLAW_DISCORD_GUILDS_B64 path so Discord guild allowlist config can be baked into openclaw.json
- set Telegram groupPolicy: open in the baked default account config
bin/lib/onboard.js
- prompt for Discord Server ID and User ID during messaging setup
- bake channels.discord.guilds.<serverId> config for the selected server(s)
- prompt for mention-only vs all-message replies in guild channels
- treat the Discord user allowlist as optional; leaving it blank allows any member of the configured server
- keep requireMention: true as the default so onboarding remains safe for shared servers
test/e2e/test-messaging-providers.sh
- add a classified native Discord gateway probe separate from REST reachability
- accept both accounts.default and accounts.main during the transition so older images still validate
- assert baked Telegram groupPolicy: open
test/onboard.test.js
- add coverage for policy suggestions following current messaging selection
- add coverage for the baked Discord guild workspace build arg
- add coverage for both mention-only guild config and open-to-all-members guild config
test/policies.test.js
- add coverage for /usr/bin/node in communication presets
- add coverage that messaging presets no longer ship deprecated tls: terminate
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md
- regenerated from docs

Related issues

supersedes fix(presets): add missing binaries to communication presets #1084 for the preset change
addresses [Nemoclaw][All Platform]onboard skips messaging channel selection, but policy preset step still auto-selects messaging presets from stored credentials #1610 from the NemoClaw side
addresses [NemoClaw][All Platform]Telegram messaging broken on v0.0.10: deprecated tls: terminate + OpenClaw doctor migration drops all group messages #1692 from the NemoClaw side
provides clearer handling for Discord bot token set - Error: EACCES: permission denied, copyfile '/sandbox/.openclaw/openclaw.json #606
improves investigation coverage for Discord: fetch failed on gateway connection despite discord policy preset applied (Linux/Ubuntu) #585 and [NemoClaw][All platforms] Discord channel fails with 400 — Node.js EnvHttpProxyAgent uses forward proxy instead of CONNECT tunnel #1570

Validation

Completed:

git diff --check
bash -n test/e2e/test-messaging-providers.sh
rg -n "tls: terminate" nemoclaw-blueprint/policies/presets/{telegram,discord,slack}.yaml test/policies.test.js
node -c bin/lib/onboard.js
manual host-side verification that the live sandbox has:
- the Discord policy applied
- the Discord provider attached
- allowed Discord command-registration PUT traffic at runtime
- successful Discord read/write behavior from the bot in a real server channel
- a remaining failure mode where direct @mention traffic is only observed when OpenClaw is prompted from the TUI, rather than waking autonomously in real time
- a correctly baked Discord guild config inside the sandbox after redeploy
- successful dns.lookup('gateway.discord.gg') and raw Node WebSocket connectivity from inside the sandbox
- continued OpenClaw Discord gateway failure with AggregateError and WebSocket close code 1006

Not completed in this environment:

full vitest runs were attempted but hung without producing output in this sandboxed session

Current limitation

This PR does not yet claim a full end-to-end native Discord fix.

The remaining evidence points to the live Discord gateway client path inside OpenClaw rather than to NemoClaw config. The next step is to inspect or patch the pinned OpenClaw runtime behavior and determine whether that final fix can be applied from NemoClaw or requires upstream OpenClaw changes.

Type of change

Code change for a bug fix or refactor
Doc updates
Tests updated

Summary by CodeRabbit

New Features
- Discord onboarding prompts for Server ID, reply mode (@mentions-only or all messages) and optional user allowlist; messaging selections are persisted into the sandbox image.
Documentation
- Clarified Telegram allowlist: enforced for DMs only; group chats remain open by default (prevents rebuilt sandboxes from dropping group messages).
- Added troubleshooting for baked channel configs and Discord gateway diagnostics.
Policy Changes
- Messaging network policy TLS handling adjusted and Node binary paths expanded for messaging providers.

coderabbitai · 2026-04-09T01:59:39Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds Discord guild-scoped onboarding and embeds guild configs into built sandbox images; clarifies that TELEGRAM_ALLOWED_IDS applies to Telegram DMs only (groups open by default); removes tls: terminate from messaging network policies; expands allowed Node binary paths; updates Dockerfile patching, onboarding, e2e and unit tests, and troubleshooting docs.

Changes

Cohort / File(s)	Summary
Onboarding & Core Logic `src/lib/onboard.ts`	Add Discord guild handling (server ID, require-mention, optional user allowlist), build base64 guild payload, inject `NEMOCLAW_DISCORD_GUILDS_B64` into staged Dockerfile, gate messaging allowlists by channel `allowIdsMode`, export `getSuggestedPolicyPresets()` and refactor policy suggestion flow.
Dockerfile / Build `Dockerfile`	Add `NEMOCLAW_DISCORD_GUILDS_B64` build-arg/env, decode into `_discord_guilds` when generating `openclaw.json`, switch channel account key lookup to `accounts.default`, and conditionally set `groupPolicy` and Discord `guilds`.
Messaging Policies `nemoclaw-blueprint/policies/presets/discord.yaml`, `.../slack.yaml`, `.../telegram.yaml`	Remove `tls: terminate` from HTTPS endpoint entries and add `/usr/bin/node` to permitted binaries (in addition to `/usr/local/bin/node`).
E2E & Tests `test/e2e/test-messaging-providers.sh`, `test/onboard.test.ts`, `test/policies.test.ts`	E2E: prefer `accounts.default` with fallback to `accounts.main`, add `STRICT_DISCORD_GATEWAY`, validate Telegram `groupPolicy`, add native Discord gateway probe and classification. Unit tests: cover `getSuggestedPolicyPresets`, Dockerfile guild-patch decoding (`NEMOCLAW_DISCORD_GUILDS_B64`), presence of `/usr/bin/node`, and absence of `tls: terminate`.
Documentation & Troubleshooting `.agents/skills/.../SKILL.md`, `.agents/skills/.../commands.md`, `.agents/.../troubleshooting.md`, `docs/deployment/set-up-telegram-bridge.md`, `docs/reference/*.md`	Document that `TELEGRAM_ALLOWED_IDS` is enforced only for DMs (groups open by default), add optional Discord onboarding prompts (server ID, reply mode, optional user ID) and add troubleshooting guidance about baked channel configs vs native Discord gateway failures.
Policy Tests / Presets Validation `test/policies.test.ts`, presets YAML files	Add tests to ensure messaging presets include `/usr/bin/node` and do not contain `tls: terminate`.

Sequence Diagram

sequenceDiagram
    autonumber
    participant User
    participant CLI as Onboarding CLI
    participant Session as onboardSession
    participant Patcher as Dockerfile Patcher
    participant Builder as Image Builder
    participant Sandbox as Runtime Sandbox
    participant E2E as Test Runner

    User->>CLI: run onboarding (prompts: Discord Server ID, reply mode, optional user ID)
    CLI->>Session: store selectedMessagingChannels & discordGuilds
    CLI->>Patcher: patchStagedDockerfile(NEMOCLAW_DISCORD_GUILDS_B64)
    Patcher->>Builder: build image (embedded guild config in openclaw.json)
    Builder->>Sandbox: deploy image
    Sandbox->>Sandbox: load /sandbox/.openclaw/openclaw.json (baked guilds, groupPolicy)
    E2E->>Sandbox: run native gateway probe & validate groupPolicy/account keys

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

fix(security): use providers for messaging credential injection #1081 — modifies onboarding and sandbox messaging configuration; strongly related to Discord/Telegram gating and env propagation.

Suggested labels

enhancement: testing

Suggested reviewers

ericksoa

Poem

🐇 I patched the Dockerfile, hid guilds in a file,
Telegram groups stay open — they hop for a while,
Mentions keep the order when servers ring true,
I nibble the bytes and I test what I brew.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately captures the main changes: removing deprecated TLS-termination directives from messaging policies and adding diagnostic improvements for Discord native gateway issues. It reflects the core theme across multiple files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/discord-native-remediation

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

docs/deployment/set-up-telegram-bridge.md (1)
63-63: Split into one sentence per line.

This line contains two sentences. Per the documentation style guide, each sentence should be on its own line to make diffs more readable.
-NemoClaw applies that allowlist to Telegram DMs only. Group chats stay open by default so rebuilt sandboxes do not silently drop Telegram group messages because of an empty group allowlist.
+NemoClaw applies that allowlist to Telegram DMs only.
+Group chats stay open by default so rebuilt sandboxes do not silently drop Telegram group messages because of an empty group allowlist.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/deployment/set-up-telegram-bridge.md` at line 63, The sentence "NemoClaw
applies that allowlist to Telegram DMs only. Group chats stay open by default so
rebuilt sandboxes do not silently drop Telegram group messages because of an
empty group allowlist." should be split into two separate lines—one line
containing "NemoClaw applies that allowlist to Telegram DMs only." and the next
line containing "Group chats stay open by default so rebuilt sandboxes do not
silently drop Telegram group messages because of an empty group allowlist."—so
each sentence is on its own line in the docs file.
docs/reference/troubleshooting.md (1)
292-297: Drop the bold labels in the checklist.

**Baked config and provider wiring** and **Native Discord gateway path** read like routine list labels, so the bolding is unnecessary here.

As per coding guidelines, "Unnecessary bold on routine instructions ... Bold is reserved for UI labels, parameter names, and genuine warnings."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/troubleshooting.md` around lines 292 - 297, Remove the
unnecessary bolding for the checklist labels by deleting the surrounding **
markers around "Baked config and provider wiring" and "Native Discord gateway
path" in the list entries so they render as plain text; keep the rest of the
list content unchanged and ensure only UI labels/parameter names remain bold
elsewhere per guidelines.
bin/lib/onboard.js (3)
2432-2449: LGTM — Consider optional validation for Discord snowflake IDs.

The guild parsing logic is well-structured:

Properly handles both single (DISCORD_SERVER_ID) and multi-value (DISCORD_SERVER_IDS) env vars

Secure default: requireMention defaults to true when env var is unset or any value other than "0"

User allowlist is optional as intended

Discord server/user IDs are 17-19 digit numeric strings (snowflakes). While invalid IDs would simply not match any real entities, you could optionally add a warning for malformed inputs to help users catch typos early.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 2432 - 2449, Add optional validation to the
guild parsing block to warn on malformed Discord snowflake IDs: after computing
serverIds and userIds (used to populate discordGuilds) iterate each id and
validate it matches the expected 17–19 digit numeric regex, and log a warning
(do not throw) via the existing logger/processLogger if any id is malformed,
leaving the rest unchanged; ensure you reference the serverIds and userIds
arrays and keep the existing behavior of requireMention and discordGuilds
population in the code path guarded by
enabledTokenEnvKeys.has("DISCORD_BOT_TOKEN").
3558-3574: Minor: Condition could be more explicit for readability.

The logic at line 3558 ch.requireMentionEnvKey && process.env[ch.serverIdEnvKey || ""] works correctly but relies on process.env[""] being undefined. For clarity, consider:
-    if (ch.requireMentionEnvKey && process.env[ch.serverIdEnvKey || ""]) {
+    if (ch.requireMentionEnvKey && ch.serverIdEnvKey && process.env[ch.serverIdEnvKey]) {
This makes the intent explicit: only prompt for require-mention if both the envKey exists and the server ID was set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 3558 - 3574, The condition relies on
process.env[ch.serverIdEnvKey || ""] which is implicit; change the if check to
explicitly ensure both ch.requireMentionEnvKey is set and the server ID env var
exists (e.g., verify ch.serverIdEnvKey is truthy and
process.env[ch.serverIdEnvKey] is defined/non-empty) before prompting — update
the conditional that references ch.requireMentionEnvKey and process.env[...] to
use ch.serverIdEnvKey and check its env value explicitly so the intent is clear
in the block handling require-mention logic.
3600-3622: Clean extraction; minor: consider suppressing auto-detection log in non-interactive mode.

The getSuggestedPolicyPresets function is well-designed:

Reusable across _setupPolicies and setupPoliciesWithSelection

Maintains backward compatibility with credential-based detection

However, line 3611's console.log for auto-detected credentials may produce unexpected output in non-interactive/CI contexts where the enabledChannels flow is bypassed:
console.log(`  Auto-detected: ${envKey} -> suggesting ${channel} preset`);
Consider gating this behind an interactive check or using note() for consistency with other informational messages.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 3600 - 3622, The console.log in
getSuggestedPolicyPresets (inside maybeSuggestMessagingPreset) emits
auto-detection messages even in non-interactive/CI runs; change that console.log
call to either call the existing note(...) utility used elsewhere or only emit
when running interactively (e.g., guard with process.stdout.isTTY &&
!process.env.CI). Update maybeSuggestMessagingPreset to use note(...) if
available, otherwise wrap the log in an interactive check so non-interactive
executions do not print the auto-detect line.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/test-messaging-providers.sh`:
- Around line 523-558: The current check treats any output containing "OPEN" as
success even if later output contains CLOSE/ERROR/TIMEOUT; update the logic
around the dc_gateway checks so the "OPEN" branch only passes when there are no
subsequent failure indicators (e.g., ensure before treating OPEN as success you
confirm dc_gateway does not also match "CLOSE", "ERROR", or "TIMEOUT"), or
reorder the conditional checks to test for CLOSE/ERROR/TIMEOUT (and other
failure patterns) before checking for OPEN; modify the grep condition
referencing dc_gateway and the "OPEN" branch accordingly so a sequence like
"OPEN\nCLOSE 1006\nTIMEOUT" will not be reported as success.

In `@test/onboard.test.js`:
- Around line 106-128: The test mutates process.env (TELEGRAM_BOT_TOKEN,
DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN) and currently always deletes them in the
finally block; change the test in onboard.test.js to capture the original values
before overwriting, run expectations (calling getSuggestedPolicyPresets), and in
the finally block restore each token: if the saved original is undefined delete
the env var, otherwise set it back to the saved value, so pre-existing
environment is preserved for other tests.

---

Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 2432-2449: Add optional validation to the guild parsing block to
warn on malformed Discord snowflake IDs: after computing serverIds and userIds
(used to populate discordGuilds) iterate each id and validate it matches the
expected 17–19 digit numeric regex, and log a warning (do not throw) via the
existing logger/processLogger if any id is malformed, leaving the rest
unchanged; ensure you reference the serverIds and userIds arrays and keep the
existing behavior of requireMention and discordGuilds population in the code
path guarded by enabledTokenEnvKeys.has("DISCORD_BOT_TOKEN").
- Around line 3558-3574: The condition relies on process.env[ch.serverIdEnvKey
|| ""] which is implicit; change the if check to explicitly ensure both
ch.requireMentionEnvKey is set and the server ID env var exists (e.g., verify
ch.serverIdEnvKey is truthy and process.env[ch.serverIdEnvKey] is
defined/non-empty) before prompting — update the conditional that references
ch.requireMentionEnvKey and process.env[...] to use ch.serverIdEnvKey and check
its env value explicitly so the intent is clear in the block handling
require-mention logic.
- Around line 3600-3622: The console.log in getSuggestedPolicyPresets (inside
maybeSuggestMessagingPreset) emits auto-detection messages even in
non-interactive/CI runs; change that console.log call to either call the
existing note(...) utility used elsewhere or only emit when running
interactively (e.g., guard with process.stdout.isTTY && !process.env.CI). Update
maybeSuggestMessagingPreset to use note(...) if available, otherwise wrap the
log in an interactive check so non-interactive executions do not print the
auto-detect line.

In `@docs/deployment/set-up-telegram-bridge.md`:
- Line 63: The sentence "NemoClaw applies that allowlist to Telegram DMs only.
Group chats stay open by default so rebuilt sandboxes do not silently drop
Telegram group messages because of an empty group allowlist." should be split
into two separate lines—one line containing "NemoClaw applies that allowlist to
Telegram DMs only." and the next line containing "Group chats stay open by
default so rebuilt sandboxes do not silently drop Telegram group messages
because of an empty group allowlist."—so each sentence is on its own line in the
docs file.

In `@docs/reference/troubleshooting.md`:
- Around line 292-297: Remove the unnecessary bolding for the checklist labels
by deleting the surrounding ** markers around "Baked config and provider wiring"
and "Native Discord gateway path" in the list entries so they render as plain
text; keep the rest of the list content unchanged and ensure only UI
labels/parameter names remain bold elsewhere per guidelines.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ce898708-e232-4072-9103-1fc6206302dd

📥 Commits

Reviewing files that changed from the base of the PR and between 6b74173 and 778bda9.

📒 Files selected for processing (14)

.agents/skills/nemoclaw-user-deploy-remote/SKILL.md
.agents/skills/nemoclaw-user-reference/references/commands.md
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md
Dockerfile
bin/lib/onboard.js
docs/deployment/set-up-telegram-bridge.md
docs/reference/commands.md
docs/reference/troubleshooting.md
nemoclaw-blueprint/policies/presets/discord.yaml
nemoclaw-blueprint/policies/presets/slack.yaml
nemoclaw-blueprint/policies/presets/telegram.yaml
test/e2e/test-messaging-providers.sh
test/onboard.test.js
test/policies.test.js

test/e2e/test-messaging-providers.sh

test/onboard.test.ts

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 4583-4587: The selectedMessagingChannels value coming from
setupMessagingChannels() can include channels that were marked selected but lack
a token (e.g., when the token prompt was skipped), which later lets
createSandbox()/preset application widen policy for a channel that isn't
actually configured; fix by filtering selectedMessagingChannels before
persisting and before applying presets so you only save and act on channels that
are fully configured (e.g., have a non-null token or an explicit enabled flag).
Update the code paths that call onboardSession.updateSession (the block
assigning current.messagingChannels = selectedMessagingChannels) and the code
that applies matching presets after setupMessagingChannels() to use the filtered
list (check channel.token or channel.enabled on the objects returned by
setupMessagingChannels()), and ensure setupMessagingChannels() itself does not
return channels that are considered "selected" without required config.
- Around line 4583-4587: Declare a local variable selectedMessagingChannels at
the top of the onboard() function (e.g., let selectedMessagingChannels =
undefined) and, on resume paths, hydrate it from the existing session state
(session.messagingChannels) before any use; replace the implicit global
assignment selectedMessagingChannels = await setupMessagingChannels() with an
assignment to the local variable and ensure the resume branch uses the hydrated
local value prior to calling onboardSession.updateSession or referencing it at
line ranges around setupMessagingChannels and the resume logic.

In `@test/onboard.test.ts`:
- Around line 107-129: The test mutates global env vars (TELEGRAM_BOT_TOKEN,
DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN) by unconditionally deleting them in the
finally block; update the test around getSuggestedPolicyPresets to capture the
original values before setting new ones, and in the finally block restore each
env var to its original value (set back to original string if present, or delete
only if it was originally undefined), referencing the test that calls
getSuggestedPolicyPresets so you modify the setup/cleanup around that call to
preserve pre-existing environment state.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fd31b5bc-0a54-4a21-aef9-22e441b83ffb

📥 Commits

Reviewing files that changed from the base of the PR and between 778bda9 and 1d4499e.

📒 Files selected for processing (9)

.agents/skills/nemoclaw-user-deploy-remote/SKILL.md
.agents/skills/nemoclaw-user-reference/references/commands.md
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md
bin/lib/onboard.js
docs/reference/commands.md
docs/reference/troubleshooting.md
src/lib/onboard.ts
test/onboard.test.ts
test/policies.test.ts

✅ Files skipped from review due to trivial changes (4)

.agents/skills/nemoclaw-user-deploy-remote/SKILL.md
.agents/skills/nemoclaw-user-reference/references/commands.md
docs/reference/commands.md
.agents/skills/nemoclaw-user-reference/references/troubleshooting.md

🚧 Files skipped from review as they are similar to previous changes (1)

docs/reference/troubleshooting.md

coderabbitai · 2026-04-10T00:33:42Z

src/lib/onboard.ts

+      selectedMessagingChannels = await setupMessagingChannels();
+      onboardSession.updateSession((current) => {
+        current.messagingChannels = selectedMessagingChannels;
+        return current;
+      });


⚠️ Potential issue | 🟠 Major

Only persist configured messaging channels here.

This now feeds policy suggestions from the raw toggle selection, but setupMessagingChannels() still leaves a channel selected when the token prompt is skipped. createSandbox() then drops that channel because token is null, while Line 4666 can still apply the matching preset, which unnecessarily widens policy access for a channel that is not actually enabled.

🔒 Suggested fix

const token = normalizeCredentialValue(await prompt(` ${ch.label}: `, { secret: true })); if (token) { saveCredential(ch.envKey, token); process.env[ch.envKey] = token; console.log(` ✓ ${ch.name} token saved`); } else { + enabled.delete(ch.name); console.log(` Skipped ${ch.name} (no token entered)`); continue; }

Also applies to: 4590-4598, 4658-4667

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 4583 - 4587, The selectedMessagingChannels value coming from setupMessagingChannels() can include channels that were marked selected but lack a token (e.g., when the token prompt was skipped), which later lets createSandbox()/preset application widen policy for a channel that isn't actually configured; fix by filtering selectedMessagingChannels before persisting and before applying presets so you only save and act on channels that are fully configured (e.g., have a non-null token or an explicit enabled flag). Update the code paths that call onboardSession.updateSession (the block assigning current.messagingChannels = selectedMessagingChannels) and the code that applies matching presets after setupMessagingChannels() to use the filtered list (check channel.token or channel.enabled on the objects returned by setupMessagingChannels()), and ensure setupMessagingChannels() itself does not return channels that are considered "selected" without required config.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check the code around lines 4583-4587 to see the assignment sed -n '4575,4595p' src/lib/onboard.ts

Repository: NVIDIA/NemoClaw

Length of output: 736

🏁 Script executed:

# Search for where selectedMessagingChannels is declared in the file rg 'selectedMessagingChannels' src/lib/onboard.ts -n

Repository: NVIDIA/NemoClaw

Length of output: 296

🏁 Script executed:

# Check the function scope and strict mode head -30 src/lib/onboard.ts

Repository: NVIDIA/NemoClaw

Length of output: 1329

🏁 Script executed:

# Check around line 4666 to see the usage context sed -n '4660,4675p' src/lib/onboard.ts

Repository: NVIDIA/NemoClaw

Length of output: 659

🏁 Script executed:

# Find the function scope - search for onboard function declaration rg 'function onboard|const onboard|async function onboard' src/lib/onboard.ts -n

Repository: NVIDIA/NemoClaw

Length of output: 167

🏁 Script executed:

# Search for variable declarations at the function scope to see if selectedMessagingChannels is declared # Let's look for 'let selectedMessagingChannels' or 'var selectedMessagingChannels' in the file rg '(let|var|const)\s+selectedMessagingChannels' src/lib/onboard.ts -n

Repository: NVIDIA/NemoClaw

Length of output: 41

🏁 Script executed:

# Find the start of onboard function and look for initial variable declarations # Search for 'async function onboard' or similar sed -n '4400,4450p' src/lib/onboard.ts | head -100

Repository: NVIDIA/NemoClaw

Length of output: 2194

🏁 Script executed:

# Get a broader view of the function start and variable declarations rg -B 5 'selectedMessagingChannels = await setupMessagingChannels' src/lib/onboard.ts -n

Repository: NVIDIA/NemoClaw

Length of output: 262

🏁 Script executed:

# Check if strict mode is used in the file (e.g., "use strict") rg '"use strict"|\'use strict\'' src/lib/onboard.ts

Repository: NVIDIA/NemoClaw

Length of output: 127

🏁 Script executed:

# Look at the variable declarations at the start of the onboard function (around line 4264+) sed -n '4264,4350p' src/lib/onboard.ts

Repository: NVIDIA/NemoClaw

Length of output: 3936

🏁 Script executed:

# Check if there's a declaration of selectedMessagingChannels elsewhere that we might have missed rg 'selectedMessagingChannels' src/lib/onboard.ts -B 2 -A 2

Repository: NVIDIA/NemoClaw

Length of output: 600

Declare and hydrate selectedMessagingChannels at the start of the onboard() function.

selectedMessagingChannels is assigned at line 4583 without being declared with let, const, or var, creating an implicit global variable. On resume paths that skip the messaging channels step, line 4666 will reference an uninitialized value. The variable must be declared early and hydrated from session.messagingChannels when resuming.

🐛 Proposed fix

let nimContainer = session?.nimContainer || null; let webSearchConfig = session?.webSearchConfig || null; let forceProviderSelection = false; + let selectedMessagingChannels = Array.isArray(session?.messagingChannels) + ? session.messagingChannels + : null;

Also applies to: 4658-4667

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 4583 - 4587, Declare a local variable selectedMessagingChannels at the top of the onboard() function (e.g., let selectedMessagingChannels = undefined) and, on resume paths, hydrate it from the existing session state (session.messagingChannels) before any use; replace the implicit global assignment selectedMessagingChannels = await setupMessagingChannels() with an assignment to the local variable and ensure the resume branch uses the hydrated local value prior to calling onboardSession.updateSession or referencing it at line ranges around setupMessagingChannels and the resume logic.

test/onboard.test.ts

kjw3 · 2026-04-10T03:29:33Z

Follow-up Discord diagnosis from a live sandbox run.

Summary:

This does not look like a NemoClaw onboarding/config-bake issue.
It also does not look like a wrong-mention-target issue.
The Discord bot account probes successfully, but the live gateway session does not stay connected, so no inbound mentions are ever consumed.

What was verified:

The sandbox is running with the expected Discord channel config baked into openclaw.json.
The Discord bot token probes successfully through openclaw channels status --json --probe.
The bot account identity is resolved correctly by OpenClaw.
A raw WebSocket probe from inside the sandbox to Discord gateway succeeds and receives the initial Gateway Hello payload.
This reproduces on both the originally pinned OpenClaw version and a test rebuild using OpenClaw 2026.4.2.

What fails:

openclaw channels status --json --probe continues to report the Discord account as connected: false.
openclaw channels logs shows repeated Discord gateway reconnect loops and close 1006.
The normal OpenClaw logs only surface AggregateError, which hides the underlying socket failures.

Deeper cause found:

I instantiated the same bundled Discord gateway plugin directly inside the sandbox and dumped the raw emitted error object before OpenClaw collapsed it to String(err).
The underlying error is an AggregateError containing multiple ECONNREFUSED failures while opening the Discord gateway socket.
In other words: the bundled Discord gateway client path is failing before the session reaches READY, which explains why mentions are never received.

Additional signal:

The raw Node built-in WebSocket probe can reach Discord gateway and receive Hello from the same sandbox.
That narrows the problem to the OpenClaw Discord runtime/client behavior rather than general sandbox egress or DNS.

Likely root-cause area:

OpenClaw’s Discord gateway runtime is losing the useful nested error details by logging only String(err) in the gateway supervisor.
The failing path appears to be inside the bundled Discord gateway client used by OpenClaw, not in NemoClaw policy wiring.

Implication for this PR:

The NemoClaw-side Discord remediation and diagnostics added here are still useful, but they do not by themselves resolve the current end-to-end Discord mention failure.
The remaining issue looks upstream/in-runtime: the OpenClaw Discord gateway client path needs better error surfacing and likely a fix in how the gateway socket is opened and managed in sandboxed environments.

ericksoa

Code changes look solid. The CodeRabbit findings (gateway probe ordering, env var cleanup, implicit global) have all been addressed in subsequent commits. Policy preset cleanup, guild config baking, and Telegram account alignment are well-tested.

One minor note for follow-up: on session resume paths, selectedMessagingChannels defaults to [] rather than hydrating from session.messagingChannels, so it falls back to credential-sniffing for policy suggestions. Worth considering whether to hydrate from session in a future pass.

brandonpelfrey

Passed automated inspection.

…IA#1653) ## Summary This draft PR advances the Discord/native messaging remediation work in NemoClaw without claiming that native Discord is fully fixed end to end yet. It also folds in the analogous Telegram doctor/config follow-up so rebuilt sandboxes match the current account-based channel schema and avoid a misleading empty group allowlist warning. It now also addresses the repo-owned portion of NVIDIA#1692 by removing deprecated `tls: terminate` directives from the messaging policy presets. It does five repo-owned things: 1. narrows and supersedes the communication preset fix from NVIDIA#1084 by adding only the minimal additional Node binary coverage 2. fixes onboarding so messaging policy suggestions follow the current messaging selection rather than stale stored credentials 3. fixes the baked Discord account schema at image build time, adds build-time Discord guild workspace config, and improves troubleshooting/E2E validation so baked-config failures are separated from native Discord gateway failures 4. aligns baked Telegram config with the same doctor-expected account layout and makes Telegram group handling explicit so default group delivery does not silently drop messages 5. removes deprecated TLS-termination syntax from the Discord, Slack, and Telegram presets so NemoClaw does not keep shipping policy warnings on current OpenShell builds ## Why this PR exists The current evidence suggests the remaining Discord break is not just a missing onboarding step: - the sandbox has the Discord policy applied - the Discord provider is attached - command registration `PUT` requests are allowed at runtime - the bot can authenticate against `discord.com` - the bot can read channel history and send messages back into Discord - current NemoClaw onboarding only baked token/account config, not the Discord guild config that OpenClaw documents for server-channel autonomy What still appears broken is the autonomous inbound side of native Discord handling. In current testing, OpenClaw can read and write Discord successfully, but incoming `@mention` messages do not appear to wake the agent in real time. After baking the documented guild config into the sandbox, the failure still reproduces. Direct diagnosis inside the sandbox shows that: - the baked `channels.discord.guilds.<serverId>` config is present and correct - Node DNS lookup for `gateway.discord.gg` succeeds - a plain Node `WebSocket('wss://gateway.discord.gg/...')` succeeds inside the same sandbox - OpenClaw's Discord gateway client still loops with `AggregateError` and close code `1006` That makes the remaining issue look narrower than generic connectivity: likely the live Discord gateway client path in the pinned OpenClaw runtime (`2026.3.11`) under the OpenShell proxy environment. This PR tightens NemoClaw’s side of the system so: - the sandbox policy and onboarding behavior are correct - the baked Discord config matches current OpenClaw schema expectations - Discord server-workspace config can now be baked at image build time instead of being missing from NemoClaw onboarding entirely - onboarding can ask whether the bot should reply only to `@mentions` or to all messages in the configured server - the Discord user allowlist is optional, so leaving it blank allows any member of the configured server to message the bot - Telegram single-account config is already baked in the `accounts.default` layout that `openclaw doctor` now expects - Telegram group chats default to `groupPolicy: open`, so an empty DM allowlist does not produce silent group-message drops - messaging policy presets no longer pin deprecated `tls: terminate` directives on current OpenShell builds - users are not sent toward dead-end config repair steps - tests explicitly probe the Discord gateway path instead of only REST ## Changes - `bin/lib/onboard.js` - add `getSuggestedPolicyPresets()` - make messaging-related policy suggestions respect explicit `enabledChannels` - persist messaging channel selection through onboarding/resume flow - `nemoclaw-blueprint/policies/presets/{discord,telegram,slack}.yaml` - add `/usr/bin/node` alongside `/usr/local/bin/node` - remove deprecated `tls: terminate` directives and rely on current OpenShell automatic TLS handling - intentionally do **not** broaden to `curl` or `bash` in these narrow presets - `docs/reference/troubleshooting.md` - document that `openclaw doctor --fix` cannot repair baked sandbox channel config - add guidance for distinguishing baked-config issues from native Discord gateway issues - add guidance for the analogous Telegram doctor migration/warning path - `Dockerfile` - bake messaging accounts under `channels.<provider>.accounts.default` instead of `accounts.main` - align the image with the schema `openclaw doctor` currently expects, so this migration is handled at build time rather than failing inside the immutable sandbox - add a build-time `NEMOCLAW_DISCORD_GUILDS_B64` path so Discord guild allowlist config can be baked into `openclaw.json` - set Telegram `groupPolicy: open` in the baked default account config - `bin/lib/onboard.js` - prompt for Discord Server ID and User ID during messaging setup - bake `channels.discord.guilds.<serverId>` config for the selected server(s) - prompt for mention-only vs all-message replies in guild channels - treat the Discord user allowlist as optional; leaving it blank allows any member of the configured server - keep `requireMention: true` as the default so onboarding remains safe for shared servers - `test/e2e/test-messaging-providers.sh` - add a classified native Discord gateway probe separate from REST reachability - accept both `accounts.default` and `accounts.main` during the transition so older images still validate - assert baked Telegram `groupPolicy: open` - `test/onboard.test.js` - add coverage for policy suggestions following current messaging selection - add coverage for the baked Discord guild workspace build arg - add coverage for both mention-only guild config and open-to-all-members guild config - `test/policies.test.js` - add coverage for `/usr/bin/node` in communication presets - add coverage that messaging presets no longer ship deprecated `tls: terminate` - `.agents/skills/nemoclaw-user-reference/references/troubleshooting.md` - regenerated from docs ## Related issues - supersedes NVIDIA#1084 for the preset change - addresses NVIDIA#1610 from the NemoClaw side - addresses NVIDIA#1692 from the NemoClaw side - provides clearer handling for NVIDIA#606 - improves investigation coverage for NVIDIA#585 and NVIDIA#1570 ## Validation Completed: - `git diff --check` - `bash -n test/e2e/test-messaging-providers.sh` - `rg -n "tls: terminate" nemoclaw-blueprint/policies/presets/{telegram,discord,slack}.yaml test/policies.test.js` - `node -c bin/lib/onboard.js` - manual host-side verification that the live sandbox has: - the Discord policy applied - the Discord provider attached - allowed Discord command-registration `PUT` traffic at runtime - successful Discord read/write behavior from the bot in a real server channel - a remaining failure mode where direct `@mention` traffic is only observed when OpenClaw is prompted from the TUI, rather than waking autonomously in real time - a correctly baked Discord guild config inside the sandbox after redeploy - successful `dns.lookup('gateway.discord.gg')` and raw Node WebSocket connectivity from inside the sandbox - continued OpenClaw Discord gateway failure with `AggregateError` and WebSocket close code `1006` Not completed in this environment: - full `vitest` runs were attempted but hung without producing output in this sandboxed session ## Current limitation This PR does **not** yet claim a full end-to-end native Discord fix. The remaining evidence points to the live Discord gateway client path inside OpenClaw rather than to NemoClaw config. The next step is to inspect or patch the pinned OpenClaw runtime behavior and determine whether that final fix can be applied from NemoClaw or requires upstream OpenClaw changes. ## Type of change - [x] Code change for a bug fix or refactor - [x] Doc updates - [x] Tests updated  ## Summary by CodeRabbit * **New Features** * Discord onboarding prompts for Server ID, reply mode (@mentions-only or all messages) and optional user allowlist; messaging selections are persisted into the sandbox image. * **Documentation** * Clarified Telegram allowlist: enforced for DMs only; group chats remain open by default (prevents rebuilt sandboxes from dropping group messages). * Added troubleshooting for baked channel configs and Discord gateway diagnostics. * **Policy Changes** * Messaging network policy TLS handling adjusted and Node binary paths expanded for messaging providers.  --------- Co-authored-by: Test User <[email protected]> Co-authored-by: Aaron Erickson 🦞 <[email protected]>

fix(discord): tighten native messaging policies and diagnostics

8c2f4b1

latenighthackathon mentioned this pull request Apr 9, 2026

fix(presets): add /usr/bin/node to communication preset binaries (#481) #1652

Closed

4 tasks

kjw3 added 4 commits April 8, 2026 22:17

fix(discord): bake default account config

2c196a6

fix(discord): bake guild workspace config

b274324

docs(onboard): fix brave search link

49ba96e

feat(discord): add guild reply mode prompt

fbaf372

wscurran added Integration: Discord Use this label to identify Discord bot integration issues with NemoClaw. fix labels Apr 9, 2026

kjw3 added 2 commits April 9, 2026 14:07

fix(telegram): align baked channel config with doctor

b25aba5

fix(policy): drop deprecated tls termination from messaging presets

778bda9

kjw3 marked this pull request as ready for review April 10, 2026 00:17

merge main into fix/discord-native-remediation

1d4499e

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

test/e2e/test-messaging-providers.sh Show resolved Hide resolved

test/onboard.test.ts Show resolved Hide resolved

kjw3 added 2 commits April 9, 2026 20:29

test(onboard): avoid legacy shim edits after TS migration

46b2e93

chore(onboard): restore legacy shim parity with main

676a69a

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

kjw3 added 5 commits April 9, 2026 20:36

fix(onboard): tighten discord messaging remediation

c1e1348

fix(onboard): restore messaging channel state

1a1bf04

Merge branch 'main' into fix/discord-native-remediation

053451e

Merge branch 'main' into fix/discord-native-remediation

49ac71e

Merge branch 'main' into fix/discord-native-remediation

ea00e35

Merge branch 'main' into fix/discord-native-remediation

e0154c5

ericksoa approved these changes Apr 10, 2026

View reviewed changes

brandonpelfrey approved these changes Apr 10, 2026

View reviewed changes

Test User and others added 2 commits April 9, 2026 23:43

test(onboard): use built module for policy preset helper

6977f4e

Merge branch 'main' into fix/discord-native-remediation

6e46e7c

ericksoa merged commit e53167c into main Apr 10, 2026
11 checks passed

cortexj mentioned this pull request Apr 11, 2026

fix(presets): add missing binaries to communication presets #1084

Closed

12 tasks

This was referenced Apr 13, 2026

Please remove all the "tls: terminate" from the blueprints #1686

Closed

[Nemoclaw][All Platform]onboard skips messaging channel selection, but policy preset step still auto-selects messaging presets from stored credentials #1610

Closed

Conversation

kjw3 commented Apr 9, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why this PR exists

Changes

Related issues

Validation

Current limitation

Type of change

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kjw3 commented Apr 10, 2026

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

brandonpelfrey left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

kjw3 commented Apr 9, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 9, 2026 •

edited

Loading