fix(discord): tighten native messaging policies and diagnostics#1653
fix(discord): tighten native messaging policies and diagnostics#1653
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Discord guild-scoped onboarding and embeds guild configs into built sandbox images; clarifies that Changes
Sequence DiagramsequenceDiagram
autonumber
participant User
participant CLI as Onboarding CLI
participant Session as onboardSession
participant Patcher as Dockerfile Patcher
participant Builder as Image Builder
participant Sandbox as Runtime Sandbox
participant E2E as Test Runner
User->>CLI: run onboarding (prompts: Discord Server ID, reply mode, optional user ID)
CLI->>Session: store selectedMessagingChannels & discordGuilds
CLI->>Patcher: patchStagedDockerfile(NEMOCLAW_DISCORD_GUILDS_B64)
Patcher->>Builder: build image (embedded guild config in openclaw.json)
Builder->>Sandbox: deploy image
Sandbox->>Sandbox: load /sandbox/.openclaw/openclaw.json (baked guilds, groupPolicy)
E2E->>Sandbox: run native gateway probe & validate groupPolicy/account keys
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
docs/deployment/set-up-telegram-bridge.md (1)
63-63: Split into one sentence per line.This line contains two sentences. Per the documentation style guide, each sentence should be on its own line to make diffs more readable.
-NemoClaw applies that allowlist to Telegram DMs only. Group chats stay open by default so rebuilt sandboxes do not silently drop Telegram group messages because of an empty group allowlist. +NemoClaw applies that allowlist to Telegram DMs only. +Group chats stay open by default so rebuilt sandboxes do not silently drop Telegram group messages because of an empty group allowlist.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/deployment/set-up-telegram-bridge.md` at line 63, The sentence "NemoClaw applies that allowlist to Telegram DMs only. Group chats stay open by default so rebuilt sandboxes do not silently drop Telegram group messages because of an empty group allowlist." should be split into two separate lines—one line containing "NemoClaw applies that allowlist to Telegram DMs only." and the next line containing "Group chats stay open by default so rebuilt sandboxes do not silently drop Telegram group messages because of an empty group allowlist."—so each sentence is on its own line in the docs file.docs/reference/troubleshooting.md (1)
292-297: Drop the bold labels in the checklist.
**Baked config and provider wiring**and**Native Discord gateway path**read like routine list labels, so the bolding is unnecessary here.As per coding guidelines, "Unnecessary bold on routine instructions ... Bold is reserved for UI labels, parameter names, and genuine warnings."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/reference/troubleshooting.md` around lines 292 - 297, Remove the unnecessary bolding for the checklist labels by deleting the surrounding ** markers around "Baked config and provider wiring" and "Native Discord gateway path" in the list entries so they render as plain text; keep the rest of the list content unchanged and ensure only UI labels/parameter names remain bold elsewhere per guidelines.bin/lib/onboard.js (3)
2432-2449: LGTM — Consider optional validation for Discord snowflake IDs.The guild parsing logic is well-structured:
- Properly handles both single (
DISCORD_SERVER_ID) and multi-value (DISCORD_SERVER_IDS) env vars- Secure default:
requireMentiondefaults totruewhen env var is unset or any value other than"0"- User allowlist is optional as intended
Discord server/user IDs are 17-19 digit numeric strings (snowflakes). While invalid IDs would simply not match any real entities, you could optionally add a warning for malformed inputs to help users catch typos early.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 2432 - 2449, Add optional validation to the guild parsing block to warn on malformed Discord snowflake IDs: after computing serverIds and userIds (used to populate discordGuilds) iterate each id and validate it matches the expected 17–19 digit numeric regex, and log a warning (do not throw) via the existing logger/processLogger if any id is malformed, leaving the rest unchanged; ensure you reference the serverIds and userIds arrays and keep the existing behavior of requireMention and discordGuilds population in the code path guarded by enabledTokenEnvKeys.has("DISCORD_BOT_TOKEN").
3558-3574: Minor: Condition could be more explicit for readability.The logic at line 3558
ch.requireMentionEnvKey && process.env[ch.serverIdEnvKey || ""]works correctly but relies onprocess.env[""]beingundefined. For clarity, consider:- if (ch.requireMentionEnvKey && process.env[ch.serverIdEnvKey || ""]) { + if (ch.requireMentionEnvKey && ch.serverIdEnvKey && process.env[ch.serverIdEnvKey]) {This makes the intent explicit: only prompt for require-mention if both the envKey exists and the server ID was set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 3558 - 3574, The condition relies on process.env[ch.serverIdEnvKey || ""] which is implicit; change the if check to explicitly ensure both ch.requireMentionEnvKey is set and the server ID env var exists (e.g., verify ch.serverIdEnvKey is truthy and process.env[ch.serverIdEnvKey] is defined/non-empty) before prompting — update the conditional that references ch.requireMentionEnvKey and process.env[...] to use ch.serverIdEnvKey and check its env value explicitly so the intent is clear in the block handling require-mention logic.
3600-3622: Clean extraction; minor: consider suppressing auto-detection log in non-interactive mode.The
getSuggestedPolicyPresetsfunction is well-designed:
- Reusable across
_setupPoliciesandsetupPoliciesWithSelection- Maintains backward compatibility with credential-based detection
However, line 3611's
console.logfor auto-detected credentials may produce unexpected output in non-interactive/CI contexts where the enabledChannels flow is bypassed:console.log(` Auto-detected: ${envKey} -> suggesting ${channel} preset`);Consider gating this behind an interactive check or using
note()for consistency with other informational messages.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 3600 - 3622, The console.log in getSuggestedPolicyPresets (inside maybeSuggestMessagingPreset) emits auto-detection messages even in non-interactive/CI runs; change that console.log call to either call the existing note(...) utility used elsewhere or only emit when running interactively (e.g., guard with process.stdout.isTTY && !process.env.CI). Update maybeSuggestMessagingPreset to use note(...) if available, otherwise wrap the log in an interactive check so non-interactive executions do not print the auto-detect line.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/test-messaging-providers.sh`:
- Around line 523-558: The current check treats any output containing "OPEN" as
success even if later output contains CLOSE/ERROR/TIMEOUT; update the logic
around the dc_gateway checks so the "OPEN" branch only passes when there are no
subsequent failure indicators (e.g., ensure before treating OPEN as success you
confirm dc_gateway does not also match "CLOSE", "ERROR", or "TIMEOUT"), or
reorder the conditional checks to test for CLOSE/ERROR/TIMEOUT (and other
failure patterns) before checking for OPEN; modify the grep condition
referencing dc_gateway and the "OPEN" branch accordingly so a sequence like
"OPEN\nCLOSE 1006\nTIMEOUT" will not be reported as success.
In `@test/onboard.test.js`:
- Around line 106-128: The test mutates process.env (TELEGRAM_BOT_TOKEN,
DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN) and currently always deletes them in the
finally block; change the test in onboard.test.js to capture the original values
before overwriting, run expectations (calling getSuggestedPolicyPresets), and in
the finally block restore each token: if the saved original is undefined delete
the env var, otherwise set it back to the saved value, so pre-existing
environment is preserved for other tests.
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 2432-2449: Add optional validation to the guild parsing block to
warn on malformed Discord snowflake IDs: after computing serverIds and userIds
(used to populate discordGuilds) iterate each id and validate it matches the
expected 17–19 digit numeric regex, and log a warning (do not throw) via the
existing logger/processLogger if any id is malformed, leaving the rest
unchanged; ensure you reference the serverIds and userIds arrays and keep the
existing behavior of requireMention and discordGuilds population in the code
path guarded by enabledTokenEnvKeys.has("DISCORD_BOT_TOKEN").
- Around line 3558-3574: The condition relies on process.env[ch.serverIdEnvKey
|| ""] which is implicit; change the if check to explicitly ensure both
ch.requireMentionEnvKey is set and the server ID env var exists (e.g., verify
ch.serverIdEnvKey is truthy and process.env[ch.serverIdEnvKey] is
defined/non-empty) before prompting — update the conditional that references
ch.requireMentionEnvKey and process.env[...] to use ch.serverIdEnvKey and check
its env value explicitly so the intent is clear in the block handling
require-mention logic.
- Around line 3600-3622: The console.log in getSuggestedPolicyPresets (inside
maybeSuggestMessagingPreset) emits auto-detection messages even in
non-interactive/CI runs; change that console.log call to either call the
existing note(...) utility used elsewhere or only emit when running
interactively (e.g., guard with process.stdout.isTTY && !process.env.CI). Update
maybeSuggestMessagingPreset to use note(...) if available, otherwise wrap the
log in an interactive check so non-interactive executions do not print the
auto-detect line.
In `@docs/deployment/set-up-telegram-bridge.md`:
- Line 63: The sentence "NemoClaw applies that allowlist to Telegram DMs only.
Group chats stay open by default so rebuilt sandboxes do not silently drop
Telegram group messages because of an empty group allowlist." should be split
into two separate lines—one line containing "NemoClaw applies that allowlist to
Telegram DMs only." and the next line containing "Group chats stay open by
default so rebuilt sandboxes do not silently drop Telegram group messages
because of an empty group allowlist."—so each sentence is on its own line in the
docs file.
In `@docs/reference/troubleshooting.md`:
- Around line 292-297: Remove the unnecessary bolding for the checklist labels
by deleting the surrounding ** markers around "Baked config and provider wiring"
and "Native Discord gateway path" in the list entries so they render as plain
text; keep the rest of the list content unchanged and ensure only UI
labels/parameter names remain bold elsewhere per guidelines.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ce898708-e232-4072-9103-1fc6206302dd
📒 Files selected for processing (14)
.agents/skills/nemoclaw-user-deploy-remote/SKILL.md.agents/skills/nemoclaw-user-reference/references/commands.md.agents/skills/nemoclaw-user-reference/references/troubleshooting.mdDockerfilebin/lib/onboard.jsdocs/deployment/set-up-telegram-bridge.mddocs/reference/commands.mddocs/reference/troubleshooting.mdnemoclaw-blueprint/policies/presets/discord.yamlnemoclaw-blueprint/policies/presets/slack.yamlnemoclaw-blueprint/policies/presets/telegram.yamltest/e2e/test-messaging-providers.shtest/onboard.test.jstest/policies.test.js
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 4583-4587: The selectedMessagingChannels value coming from
setupMessagingChannels() can include channels that were marked selected but lack
a token (e.g., when the token prompt was skipped), which later lets
createSandbox()/preset application widen policy for a channel that isn't
actually configured; fix by filtering selectedMessagingChannels before
persisting and before applying presets so you only save and act on channels that
are fully configured (e.g., have a non-null token or an explicit enabled flag).
Update the code paths that call onboardSession.updateSession (the block
assigning current.messagingChannels = selectedMessagingChannels) and the code
that applies matching presets after setupMessagingChannels() to use the filtered
list (check channel.token or channel.enabled on the objects returned by
setupMessagingChannels()), and ensure setupMessagingChannels() itself does not
return channels that are considered "selected" without required config.
- Around line 4583-4587: Declare a local variable selectedMessagingChannels at
the top of the onboard() function (e.g., let selectedMessagingChannels =
undefined) and, on resume paths, hydrate it from the existing session state
(session.messagingChannels) before any use; replace the implicit global
assignment selectedMessagingChannels = await setupMessagingChannels() with an
assignment to the local variable and ensure the resume branch uses the hydrated
local value prior to calling onboardSession.updateSession or referencing it at
line ranges around setupMessagingChannels and the resume logic.
In `@test/onboard.test.ts`:
- Around line 107-129: The test mutates global env vars (TELEGRAM_BOT_TOKEN,
DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN) by unconditionally deleting them in the
finally block; update the test around getSuggestedPolicyPresets to capture the
original values before setting new ones, and in the finally block restore each
env var to its original value (set back to original string if present, or delete
only if it was originally undefined), referencing the test that calls
getSuggestedPolicyPresets so you modify the setup/cleanup around that call to
preserve pre-existing environment state.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fd31b5bc-0a54-4a21-aef9-22e441b83ffb
📒 Files selected for processing (9)
.agents/skills/nemoclaw-user-deploy-remote/SKILL.md.agents/skills/nemoclaw-user-reference/references/commands.md.agents/skills/nemoclaw-user-reference/references/troubleshooting.mdbin/lib/onboard.jsdocs/reference/commands.mddocs/reference/troubleshooting.mdsrc/lib/onboard.tstest/onboard.test.tstest/policies.test.ts
✅ Files skipped from review due to trivial changes (4)
- .agents/skills/nemoclaw-user-deploy-remote/SKILL.md
- .agents/skills/nemoclaw-user-reference/references/commands.md
- docs/reference/commands.md
- .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/reference/troubleshooting.md
| selectedMessagingChannels = await setupMessagingChannels(); | ||
| onboardSession.updateSession((current) => { | ||
| current.messagingChannels = selectedMessagingChannels; | ||
| return current; | ||
| }); |
There was a problem hiding this comment.
Only persist configured messaging channels here.
This now feeds policy suggestions from the raw toggle selection, but setupMessagingChannels() still leaves a channel selected when the token prompt is skipped. createSandbox() then drops that channel because token is null, while Line 4666 can still apply the matching preset, which unnecessarily widens policy access for a channel that is not actually enabled.
🔒 Suggested fix
const token = normalizeCredentialValue(await prompt(` ${ch.label}: `, { secret: true }));
if (token) {
saveCredential(ch.envKey, token);
process.env[ch.envKey] = token;
console.log(` ✓ ${ch.name} token saved`);
} else {
+ enabled.delete(ch.name);
console.log(` Skipped ${ch.name} (no token entered)`);
continue;
}Also applies to: 4590-4598, 4658-4667
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/onboard.ts` around lines 4583 - 4587, The selectedMessagingChannels
value coming from setupMessagingChannels() can include channels that were marked
selected but lack a token (e.g., when the token prompt was skipped), which later
lets createSandbox()/preset application widen policy for a channel that isn't
actually configured; fix by filtering selectedMessagingChannels before
persisting and before applying presets so you only save and act on channels that
are fully configured (e.g., have a non-null token or an explicit enabled flag).
Update the code paths that call onboardSession.updateSession (the block
assigning current.messagingChannels = selectedMessagingChannels) and the code
that applies matching presets after setupMessagingChannels() to use the filtered
list (check channel.token or channel.enabled on the objects returned by
setupMessagingChannels()), and ensure setupMessagingChannels() itself does not
return channels that are considered "selected" without required config.
🧩 Analysis chain
🏁 Script executed:
# Check the code around lines 4583-4587 to see the assignment
sed -n '4575,4595p' src/lib/onboard.tsRepository: NVIDIA/NemoClaw
Length of output: 736
🏁 Script executed:
# Search for where selectedMessagingChannels is declared in the file
rg 'selectedMessagingChannels' src/lib/onboard.ts -nRepository: NVIDIA/NemoClaw
Length of output: 296
🏁 Script executed:
# Check the function scope and strict mode
head -30 src/lib/onboard.tsRepository: NVIDIA/NemoClaw
Length of output: 1329
🏁 Script executed:
# Check around line 4666 to see the usage context
sed -n '4660,4675p' src/lib/onboard.tsRepository: NVIDIA/NemoClaw
Length of output: 659
🏁 Script executed:
# Find the function scope - search for onboard function declaration
rg 'function onboard|const onboard|async function onboard' src/lib/onboard.ts -nRepository: NVIDIA/NemoClaw
Length of output: 167
🏁 Script executed:
# Search for variable declarations at the function scope to see if selectedMessagingChannels is declared
# Let's look for 'let selectedMessagingChannels' or 'var selectedMessagingChannels' in the file
rg '(let|var|const)\s+selectedMessagingChannels' src/lib/onboard.ts -nRepository: NVIDIA/NemoClaw
Length of output: 41
🏁 Script executed:
# Find the start of onboard function and look for initial variable declarations
# Search for 'async function onboard' or similar
sed -n '4400,4450p' src/lib/onboard.ts | head -100Repository: NVIDIA/NemoClaw
Length of output: 2194
🏁 Script executed:
# Get a broader view of the function start and variable declarations
rg -B 5 'selectedMessagingChannels = await setupMessagingChannels' src/lib/onboard.ts -nRepository: NVIDIA/NemoClaw
Length of output: 262
🏁 Script executed:
# Check if strict mode is used in the file (e.g., "use strict")
rg '"use strict"|\'use strict\'' src/lib/onboard.tsRepository: NVIDIA/NemoClaw
Length of output: 127
🏁 Script executed:
# Look at the variable declarations at the start of the onboard function (around line 4264+)
sed -n '4264,4350p' src/lib/onboard.tsRepository: NVIDIA/NemoClaw
Length of output: 3936
🏁 Script executed:
# Check if there's a declaration of selectedMessagingChannels elsewhere that we might have missed
rg 'selectedMessagingChannels' src/lib/onboard.ts -B 2 -A 2Repository: NVIDIA/NemoClaw
Length of output: 600
Declare and hydrate selectedMessagingChannels at the start of the onboard() function.
selectedMessagingChannels is assigned at line 4583 without being declared with let, const, or var, creating an implicit global variable. On resume paths that skip the messaging channels step, line 4666 will reference an uninitialized value. The variable must be declared early and hydrated from session.messagingChannels when resuming.
🐛 Proposed fix
let nimContainer = session?.nimContainer || null;
let webSearchConfig = session?.webSearchConfig || null;
let forceProviderSelection = false;
+ let selectedMessagingChannels = Array.isArray(session?.messagingChannels)
+ ? session.messagingChannels
+ : null;Also applies to: 4658-4667
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/onboard.ts` around lines 4583 - 4587, Declare a local variable
selectedMessagingChannels at the top of the onboard() function (e.g., let
selectedMessagingChannels = undefined) and, on resume paths, hydrate it from the
existing session state (session.messagingChannels) before any use; replace the
implicit global assignment selectedMessagingChannels = await
setupMessagingChannels() with an assignment to the local variable and ensure the
resume branch uses the hydrated local value prior to calling
onboardSession.updateSession or referencing it at line ranges around
setupMessagingChannels and the resume logic.
|
Follow-up Discord diagnosis from a live sandbox run. Summary:
What was verified:
What fails:
Deeper cause found:
Additional signal:
Likely root-cause area:
Implication for this PR:
|
ericksoa
left a comment
There was a problem hiding this comment.
Code changes look solid. The CodeRabbit findings (gateway probe ordering, env var cleanup, implicit global) have all been addressed in subsequent commits. Policy preset cleanup, guild config baking, and Telegram account alignment are well-tested.
One minor note for follow-up: on session resume paths, selectedMessagingChannels defaults to [] rather than hydrating from session.messagingChannels, so it falls back to credential-sniffing for policy suggestions. Worth considering whether to hydrate from session in a future pass.
brandonpelfrey
left a comment
There was a problem hiding this comment.
Passed automated inspection.
…IA#1653) ## Summary This draft PR advances the Discord/native messaging remediation work in NemoClaw without claiming that native Discord is fully fixed end to end yet. It also folds in the analogous Telegram doctor/config follow-up so rebuilt sandboxes match the current account-based channel schema and avoid a misleading empty group allowlist warning. It now also addresses the repo-owned portion of NVIDIA#1692 by removing deprecated `tls: terminate` directives from the messaging policy presets. It does five repo-owned things: 1. narrows and supersedes the communication preset fix from NVIDIA#1084 by adding only the minimal additional Node binary coverage 2. fixes onboarding so messaging policy suggestions follow the current messaging selection rather than stale stored credentials 3. fixes the baked Discord account schema at image build time, adds build-time Discord guild workspace config, and improves troubleshooting/E2E validation so baked-config failures are separated from native Discord gateway failures 4. aligns baked Telegram config with the same doctor-expected account layout and makes Telegram group handling explicit so default group delivery does not silently drop messages 5. removes deprecated TLS-termination syntax from the Discord, Slack, and Telegram presets so NemoClaw does not keep shipping policy warnings on current OpenShell builds ## Why this PR exists The current evidence suggests the remaining Discord break is not just a missing onboarding step: - the sandbox has the Discord policy applied - the Discord provider is attached - command registration `PUT` requests are allowed at runtime - the bot can authenticate against `discord.com` - the bot can read channel history and send messages back into Discord - current NemoClaw onboarding only baked token/account config, not the Discord guild config that OpenClaw documents for server-channel autonomy What still appears broken is the autonomous inbound side of native Discord handling. In current testing, OpenClaw can read and write Discord successfully, but incoming `@mention` messages do not appear to wake the agent in real time. After baking the documented guild config into the sandbox, the failure still reproduces. Direct diagnosis inside the sandbox shows that: - the baked `channels.discord.guilds.<serverId>` config is present and correct - Node DNS lookup for `gateway.discord.gg` succeeds - a plain Node `WebSocket('wss://gateway.discord.gg/...')` succeeds inside the same sandbox - OpenClaw's Discord gateway client still loops with `AggregateError` and close code `1006` That makes the remaining issue look narrower than generic connectivity: likely the live Discord gateway client path in the pinned OpenClaw runtime (`2026.3.11`) under the OpenShell proxy environment. This PR tightens NemoClaw’s side of the system so: - the sandbox policy and onboarding behavior are correct - the baked Discord config matches current OpenClaw schema expectations - Discord server-workspace config can now be baked at image build time instead of being missing from NemoClaw onboarding entirely - onboarding can ask whether the bot should reply only to `@mentions` or to all messages in the configured server - the Discord user allowlist is optional, so leaving it blank allows any member of the configured server to message the bot - Telegram single-account config is already baked in the `accounts.default` layout that `openclaw doctor` now expects - Telegram group chats default to `groupPolicy: open`, so an empty DM allowlist does not produce silent group-message drops - messaging policy presets no longer pin deprecated `tls: terminate` directives on current OpenShell builds - users are not sent toward dead-end config repair steps - tests explicitly probe the Discord gateway path instead of only REST ## Changes - `bin/lib/onboard.js` - add `getSuggestedPolicyPresets()` - make messaging-related policy suggestions respect explicit `enabledChannels` - persist messaging channel selection through onboarding/resume flow - `nemoclaw-blueprint/policies/presets/{discord,telegram,slack}.yaml` - add `/usr/bin/node` alongside `/usr/local/bin/node` - remove deprecated `tls: terminate` directives and rely on current OpenShell automatic TLS handling - intentionally do **not** broaden to `curl` or `bash` in these narrow presets - `docs/reference/troubleshooting.md` - document that `openclaw doctor --fix` cannot repair baked sandbox channel config - add guidance for distinguishing baked-config issues from native Discord gateway issues - add guidance for the analogous Telegram doctor migration/warning path - `Dockerfile` - bake messaging accounts under `channels.<provider>.accounts.default` instead of `accounts.main` - align the image with the schema `openclaw doctor` currently expects, so this migration is handled at build time rather than failing inside the immutable sandbox - add a build-time `NEMOCLAW_DISCORD_GUILDS_B64` path so Discord guild allowlist config can be baked into `openclaw.json` - set Telegram `groupPolicy: open` in the baked default account config - `bin/lib/onboard.js` - prompt for Discord Server ID and User ID during messaging setup - bake `channels.discord.guilds.<serverId>` config for the selected server(s) - prompt for mention-only vs all-message replies in guild channels - treat the Discord user allowlist as optional; leaving it blank allows any member of the configured server - keep `requireMention: true` as the default so onboarding remains safe for shared servers - `test/e2e/test-messaging-providers.sh` - add a classified native Discord gateway probe separate from REST reachability - accept both `accounts.default` and `accounts.main` during the transition so older images still validate - assert baked Telegram `groupPolicy: open` - `test/onboard.test.js` - add coverage for policy suggestions following current messaging selection - add coverage for the baked Discord guild workspace build arg - add coverage for both mention-only guild config and open-to-all-members guild config - `test/policies.test.js` - add coverage for `/usr/bin/node` in communication presets - add coverage that messaging presets no longer ship deprecated `tls: terminate` - `.agents/skills/nemoclaw-user-reference/references/troubleshooting.md` - regenerated from docs ## Related issues - supersedes NVIDIA#1084 for the preset change - addresses NVIDIA#1610 from the NemoClaw side - addresses NVIDIA#1692 from the NemoClaw side - provides clearer handling for NVIDIA#606 - improves investigation coverage for NVIDIA#585 and NVIDIA#1570 ## Validation Completed: - `git diff --check` - `bash -n test/e2e/test-messaging-providers.sh` - `rg -n "tls: terminate" nemoclaw-blueprint/policies/presets/{telegram,discord,slack}.yaml test/policies.test.js` - `node -c bin/lib/onboard.js` - manual host-side verification that the live sandbox has: - the Discord policy applied - the Discord provider attached - allowed Discord command-registration `PUT` traffic at runtime - successful Discord read/write behavior from the bot in a real server channel - a remaining failure mode where direct `@mention` traffic is only observed when OpenClaw is prompted from the TUI, rather than waking autonomously in real time - a correctly baked Discord guild config inside the sandbox after redeploy - successful `dns.lookup('gateway.discord.gg')` and raw Node WebSocket connectivity from inside the sandbox - continued OpenClaw Discord gateway failure with `AggregateError` and WebSocket close code `1006` Not completed in this environment: - full `vitest` runs were attempted but hung without producing output in this sandboxed session ## Current limitation This PR does **not** yet claim a full end-to-end native Discord fix. The remaining evidence points to the live Discord gateway client path inside OpenClaw rather than to NemoClaw config. The next step is to inspect or patch the pinned OpenClaw runtime behavior and determine whether that final fix can be applied from NemoClaw or requires upstream OpenClaw changes. ## Type of change - [x] Code change for a bug fix or refactor - [x] Doc updates - [x] Tests updated <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Discord onboarding prompts for Server ID, reply mode (@mentions-only or all messages) and optional user allowlist; messaging selections are persisted into the sandbox image. * **Documentation** * Clarified Telegram allowlist: enforced for DMs only; group chats remain open by default (prevents rebuilt sandboxes from dropping group messages). * Added troubleshooting for baked channel configs and Discord gateway diagnostics. * **Policy Changes** * Messaging network policy TLS handling adjusted and Node binary paths expanded for messaging providers. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Test User <[email protected]> Co-authored-by: Aaron Erickson 🦞 <[email protected]>
Summary
This draft PR advances the Discord/native messaging remediation work in NemoClaw without claiming that native Discord is fully fixed end to end yet.
It also folds in the analogous Telegram doctor/config follow-up so rebuilt sandboxes match the current account-based channel schema and avoid a misleading empty group allowlist warning.
It now also addresses the repo-owned portion of #1692 by removing deprecated
tls: terminatedirectives from the messaging policy presets.It does five repo-owned things:
Why this PR exists
The current evidence suggests the remaining Discord break is not just a missing onboarding step:
PUTrequests are allowed at runtimediscord.comWhat still appears broken is the autonomous inbound side of native Discord handling. In current testing, OpenClaw can read and write Discord successfully, but incoming
@mentionmessages do not appear to wake the agent in real time. After baking the documented guild config into the sandbox, the failure still reproduces. Direct diagnosis inside the sandbox shows that:channels.discord.guilds.<serverId>config is present and correctgateway.discord.ggsucceedsWebSocket('wss://gateway.discord.gg/...')succeeds inside the same sandboxAggregateErrorand close code1006That makes the remaining issue look narrower than generic connectivity: likely the live Discord gateway client path in the pinned OpenClaw runtime (
2026.3.11) under the OpenShell proxy environment.This PR tightens NemoClaw’s side of the system so:
@mentionsor to all messages in the configured serveraccounts.defaultlayout thatopenclaw doctornow expectsgroupPolicy: open, so an empty DM allowlist does not produce silent group-message dropstls: terminatedirectives on current OpenShell buildsChanges
bin/lib/onboard.jsgetSuggestedPolicyPresets()enabledChannelsnemoclaw-blueprint/policies/presets/{discord,telegram,slack}.yaml/usr/bin/nodealongside/usr/local/bin/nodetls: terminatedirectives and rely on current OpenShell automatic TLS handlingcurlorbashin these narrow presetsdocs/reference/troubleshooting.mdopenclaw doctor --fixcannot repair baked sandbox channel configDockerfilechannels.<provider>.accounts.defaultinstead ofaccounts.mainopenclaw doctorcurrently expects, so this migration is handled at build time rather than failing inside the immutable sandboxNEMOCLAW_DISCORD_GUILDS_B64path so Discord guild allowlist config can be baked intoopenclaw.jsongroupPolicy: openin the baked default account configbin/lib/onboard.jschannels.discord.guilds.<serverId>config for the selected server(s)requireMention: trueas the default so onboarding remains safe for shared serverstest/e2e/test-messaging-providers.shaccounts.defaultandaccounts.mainduring the transition so older images still validategroupPolicy: opentest/onboard.test.jstest/policies.test.js/usr/bin/nodein communication presetstls: terminate.agents/skills/nemoclaw-user-reference/references/troubleshooting.mdRelated issues
Validation
Completed:
git diff --checkbash -n test/e2e/test-messaging-providers.shrg -n "tls: terminate" nemoclaw-blueprint/policies/presets/{telegram,discord,slack}.yaml test/policies.test.jsnode -c bin/lib/onboard.jsPUTtraffic at runtime@mentiontraffic is only observed when OpenClaw is prompted from the TUI, rather than waking autonomously in real timedns.lookup('gateway.discord.gg')and raw Node WebSocket connectivity from inside the sandboxAggregateErrorand WebSocket close code1006Not completed in this environment:
vitestruns were attempted but hung without producing output in this sandboxed sessionCurrent limitation
This PR does not yet claim a full end-to-end native Discord fix.
The remaining evidence points to the live Discord gateway client path inside OpenClaw rather than to NemoClaw config. The next step is to inspect or patch the pinned OpenClaw runtime behavior and determine whether that final fix can be applied from NemoClaw or requires upstream OpenClaw changes.
Type of change
Summary by CodeRabbit
New Features
Documentation
Policy Changes