chore(sync): pull upstream NousResearch/hermes-agent@main by github-actions[bot] · Pull Request #6 · liberathio/hermes-agent

github-actions · 2026-05-01T08:26:09Z

Automated sync from NousResearch/hermes-agent@main.

Upstream HEAD: b7ad3f478f9bc24768f88e4339fc3e6e23d0292b
Fork was 926 commits behind, 11 commits ahead.

Review the diff and let CI pass before merging. If the fork has zero local commits ahead, this is effectively a fast-forward and can be merged once green.

Generated by .github/workflows/sync-upstream.yml.

…-gap fix(tui): render explicit prompt gap

Reuse one gateway detail-section list for global and per-section detail mode config handling.

…s-persist fix(tui): persist global details mode sections

* fix(tui): honor launch toolsets Carry chat --toolsets through the TUI launcher so TUI sessions use the same per-session tool scope as the classic CLI. * fix(tui): parse top-level toolsets flag Allow top-level hermes --tui --toolsets to reach the implicit chat session, matching chat subcommand behavior. * fix(tui): validate launch toolsets Filter invalid HERMES_TUI_TOOLSETS entries and fall back to configured CLI toolsets when the override contains no valid toolsets. * fix(tui): avoid config load for builtin toolsets Honor built-in HERMES_TUI_TOOLSETS values before loading config and treat all/* as the all-toolsets sentinel. * fix(cli): honor toolsets in oneshot mode Forward top-level --toolsets into oneshot agent construction so the flag is not silently ignored outside the TUI path. * fix(cli): validate oneshot toolsets Reject invalid-only oneshot toolset overrides before output redirection and clarify TUI fallback warnings. * fix(cli): preserve all-toolsets sentinel Map explicit all/* oneshot toolset overrides to the all-toolsets sentinel and replace locals() checks in TUI toolset loading. * fix(cli): warn on extra all-toolset entries Warn when all/* toolset overrides include additional ignored entries so typos are still visible. * fix(tui): honor plugin toolset overrides Discover plugin toolsets before rejecting unresolved explicit toolset overrides and read raw config for MCP name validation. * fix(tui): reuse toolset argument normalizer Share top-level TUI toolset argument parsing with the oneshot path to avoid duplicate normalization logic. * fix(cli): reject disabled mcp toolsets Validate explicit toolset overrides against enabled MCP servers only and clarify top-level toolset flag help. * fix(cli): distinguish disabled mcp from unknown toolsets Report disabled MCP servers separately from unknown toolset entries and stub plugin discovery in invalid-name tests for determinism.

* fix(tui): word-wrap composer input Wrap composer input at word boundaries and anchor the good-vibes heart to the full composer row. * test(tui): cover composer word wrap edge Add regression coverage for moving the next word instead of splitting it at the composer edge.

…ck (NousResearch#17661) * fix(tui): offload manual compaction RPC Route TUI session compression through the existing long-handler pool so slow compaction does not block other gateway RPCs. * fix(tui): show compaction progress immediately Print a local status line before the compress RPC starts so slow manual compaction does not look like a no-op. * feat(tui): rich /compress feedback parity with CLI Show pre-compaction message count and rough token estimate immediately, emit a status update so the bottom bar reflects ongoing compaction, and report a multi-line summary (headline + token delta + optional note) using the shared summarize_manual_compression helper. * fix(tui): show live compaction estimate in transcript Mirror compression progress status into the transcript so users see the backend message count and token estimate while /compress is still running. * fix(tui): single live compaction line with spinner glyph Drop the redundant local "compressing context..." placeholder and prefix the live backend status line with a braille spinner glyph so /compress reads as a single in-progress row. * fix(tui): address review nits on /compress feedback Reuse the precomputed token estimate inside _compress_session_history so the gateway does not redo the O(n) work while holding history_lock, keep the status bar pinned during long manual compactions instead of auto-restoring after 4s, and drop the redundant noop bullet that doubled with the system role glyph. * fix(tui): release history_lock during compaction LLM call Move the snapshot/commit pattern into _compress_session_history so the lock is held only across the in-memory bookkeeping, not during agent._compress_context. Also emit a final neutral status update from session.compress so the pinned compressing indicator clears even on errors. * fix(tui): rebuild prompt cleanly + sync session_key after compress Pass system_message=None so AIAgent._compress_context rebuilds the system prompt without nesting the cached identity block. Reuse the handler's pre-snapshotted history inside _compress_session_history to avoid a second O(n) copy under the lock. After compaction, when AIAgent._compress_context rotates session_id, sync the gateway session_key, migrate approval notify + yolo state, restart the slash worker, and clear the stale pending title. Mirrors HermesCLI._manual_compress. * Avoid /compress lock re-entry in slash side effects. Stop pre-locking history before _compress_session_history in slash command mirroring, keep session-key sync parity with manual compression, and add a regression test that asserts /compress is invoked without holding history_lock.

…arch#17550) check_for_updates() looked at __file__.parent.parent for a .git dir to diff against origin/main. A nix-built hermes lives in /nix/store with no .git there, so the check fell through to whatever editable-install dev checkout last populated ~/.hermes/.update_check, producing stale "X commits behind" warnings right after a fresh `nix run --refresh`. Embed the locked flake rev into the wrapper as HERMES_REVISION (only on clean builds — dirty refs don't represent any upstream commit). When set, banner.py compares it to upstream main via `git ls-remote` instead of inspecting a local checkout, and the cache key includes the rev so nix updates invalidate immediately. Without local history we can't count commits, so the message is a plain "update available" with no suggested command — nix users may install via `nix run`, profile, system flake, or home-manager, and we don't know which. Also bump web/package-lock.json npmDepsHash via `nix run .#fix-lockfiles`.

Decode Shift, Meta, and Ctrl bits from SGR and legacy X10 wheel event button bytes so TUI input handlers can distinguish modified wheel gestures from plain scrolling.

curl is a ubiquitous tool both for users running ad-hoc commands inside the container (debugging, health checks, quick HTTP probes) and for agent workflows — many bundled skills and hub skills lean on curl for HTTP calls, API exploration, and installer bootstrapping. Its absence causes silent workflow failures with "curl: command not found" until the user manually apt-installs it. Add curl to the single apt-get install layer alongside the other base utilities (build-essential, nodejs, git, openssh-client, etc.) so it ships in the image with zero extra layers and negligible size impact (~400 KB). - Dockerfile: add curl to the apt-get install list

Route Option/Alt or Ctrl wheel input through a gated precision path that scrolls at most one row per short interval, while preserving the existing accelerated behavior for plain wheel input. Keep precision active briefly after modifier release so queued wheel events from the same gesture do not jump into acceleration mid-stream.

…-precision-mod feat(tui): line-by-line scroll mode on modified mouse wheel

Detect leaked SGR mouse-report fragments in CLI input, strip them, and reset terminal modes in-place so scroll and typing recover without reopening the tab. Add regression tests for escaped, visible, and bare leak forms.

Reset sticky mouse/focus/paste terminal modes before the TUI starts and during graceful shutdown paths so stale tab state from prior crashes cannot poison the next session.

CI Tests workflow has been red on main for 40+ consecutive runs. This commit recovers every failure visible in run 25130722163 (most recent completed run prior to this PR). Root causes, by group: Test-mock drift after product landed (fix: update mocks) - test_mcp_structured_content / test_mcp_dynamic_discovery (6 tests): product added _rpc_lock (#02ae15222) and _schedule_tools_refresh (#1350d12b0) without updating sibling test files. Install a real asyncio.Lock inside the fake run-loop and patch at _schedule_tools_refresh. - test_session.py: renamed normalize_whatsapp_identifier → canonical_ whatsapp_identifier upstream; keep a local alias so the legacy tests keep working. - test_run_progress_topics Slack DM test: PR NousResearch#8006 made Slack default tool_progress=off; explicitly set it to 'all' in the test fixture so the progress-callback path still runs. Also read tool_progress_callback at call time rather than freezing it in FakeAgent.__init__ — production assigns it AFTER construction. - test_tui_gateway_server session-create/close race: session.create now defers _start_agent_build behind a 50ms timer — wait for the build thread to enter _make_agent before closing, otherwise the orphan- cleanup path never runs. - test_protocol session.resume: product get_messages_as_conversation now takes include_ancestors kwarg; accept **_kwargs in the test stub. - test_copilot_acp_client redaction: redactor is OFF by default (snapshots HERMES_REDACT_SECRETS at import); patch agent.redact._REDACT_ENABLED=True for the duration of the test. - test_minimax_provider: after NousResearch#17171, dots in non-Anthropic model names stay dots even with preserve_dots=False. Assert the new invariant rather than the old 'broken for MiniMax' behavior. - test_update_autostash: updater now scans `ps -A` for dashboard PIDs; the test's catch-all subprocess.run stub needed stdout/stderr fields. - test_accretion_caps: read_timestamps dict is populated lazily when os.path.getmtime succeeds. Use .get("read_timestamps", {}) to tolerate CI filesystems where the stat races file creation. Change-detector tests (fix: rewrite as structural invariants) - test_credential_sources_registry_has_expected_steps: was a frozen set comparison that broke when minimax-oauth was added. Rewrite as an invariant check (every step has description, no dupes, core steps present) per AGENTS.md 'don't write change-detector tests'. xdist ordering / test pollution (fix: reset state, use module-local patches) - test_setup vercel: sibling test saved VERCEL_PROJECT_ID='project' to os.environ via save_env_value() and never cleared it. monkeypatch.delenv the VERCEL_* vars in the link-file test. - test_clipboard TestIsWsl: GitHub Actions is on Azure VMs whose real /proc/version often contains 'microsoft'. Patching builtins.open with mock_open didn't reliably intercept hermes_constants.is_wsl's call in xdist workers that had already cached _wsl_detected=True from an earlier test. Patch hermes_constants.open directly and add teardown_method to reset the cache after each test. Pytest-asyncio cancellation hangs (fix: bound product await with timeout) - test_session_split_brain_11016 (3 params) + test_gateway_shutdown cancel-inflight: under pytest-asyncio 1.3.0, 'await task' and 'asyncio.gather(cancelled_tasks)' can stall for 30s when the cancelled task's finally block awaits typing-task cleanup. Bound both with asyncio.wait_for(..., timeout=5.0) and asyncio.shield — the stragglers are released from adapter tracking and allowed to finish unwinding in the background. This is also a legitimate hardening: a wedged finally shouldn't stall the caller's dispatch or a gateway shutdown. Orphan UI config (fix: merge tiny tab into messaging category) - test_web_server test_no_single_field_categories: the telegram.reactions config field lived in its own 'telegram' schema category with no siblings. Fold it under 'discord' via _CATEGORY_MERGE so the dashboard doesn't render an orphan single-field tab. Local verification: 38/38 originally-failing tests pass; 4044/4044 gateway tests pass; 684/684 targeted subset (all 16 touched test files) passes.

Handle unbounded SGR mouse report coordinates and avoid regex work on ordinary prompt-buffer edits by short-circuiting before sanitizer passes.

Keep light Terminal.app TUI colors readable by normalizing non-banner theme tokens into ANSI256-safe buckets while preserving truecolor terminals.

…arch#17719) vision_analyze used Path('./temp_vision_images') — a relative path that resolved against cwd. Under Docker the image's WORKDIR is /opt/hermes, which is root-owned and only chmoded a+rX (read + traversal). Since NousResearch#5811 landed (run as non-root hermes UID 10000, Apr 12), remote-URL vision calls fail with PermissionError on mkdir. Switch to get_hermes_dir('cache/vision', 'temp_vision_images'): resolves to $HERMES_HOME/cache/vision/ (= /opt/data/cache/vision/ in Docker — the user-owned volume mount). Existing installs with the old dir keep using it via the get_hermes_dir back-compat path; no migration needed. Only site in the codebase that stored runtime files via Path('./...'). Reported via Discord: https://juick.com/i/p/3089079.jpg → Telegram → gateway → [Errno 13] Permission denied: 'temp_vision_images'.

Pressing `d` on the highlighted row in the resume picker prompts `delete? y/n`; `y` deletes the session (DB row + on-disk transcript files), anything else cancels. The active session is excluded from deletion server-side. Adds a new `session.delete` JSON-RPC handler that wraps `SessionDB.delete_session`, forwarding the per-profile `sessions/` directory so transcripts get cleaned up alongside the row.

Single-key confirm matches how the picker already accepts 1-9 to resume — no separate y/n keymap to learn — and "press d again" is self-documenting next to the cursor.

…sessions If a concurrent RPC mutates _sessions while session.delete is iterating it (e.g. a parallel session.create on the thread pool), the bare except swallowed the RuntimeError and let the delete proceed against a row that may still be live. Snapshot via list(_sessions.values()) and return an error when even that raises, instead of treating "couldn't check" as "no active sessions."

…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.

Extract all os.execvp('hermes', ...) calls into a utility so flags like --tui, --dev, --profile, --model, --provider, et al. survive session resume and post-setup relaunch. - resolve_hermes_bin: prefers sys.argv[0] when callable, then PATH, then falls back to '${sys.executable} -m hermes_cli.main' (fixes nix run relaunches) - build_relaunch_argv: allowlists critical flags so they carry over - cmd_sessions browse now calls relaunch(['--resume', <id>]) - _apply_profile_override skips redundant work when HERMES_HOME is already set (child inherits parent profile) - setup.py replaces _resolve_hermes_chat_argv with relaunch_chat() - added comprehensive tests for flag extraction and binary resolution Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pull the top-level + chat parser construction out of main() into hermes_cli/_parser.py so relaunch.py can introspect parser._actions to discover which flags exist and whether they take values, instead of maintaining a parallel hand-rolled (flag, takes_value) tuple list. - _parser.py: build_top_level_parser() returns (parser, subparsers, chat_parser); side-effect-free import. - main.py: ~290 lines of inline parser construction collapsed to a helper call. Other subparsers stay inline (dispatch is bound to module-level cmd_* functions). - _parser._inherited_flag(parser, ...): wraps parser.add_argument and sets action.inherit_on_relaunch = True. Used in place of parser.add_argument for the 25 flags (top-level + chat) that need to carry over. - _parser.PRE_ARGPARSE_INHERITED_FLAGS: holds --profile/-p, which isn't on argparse (consumed earlier by main._apply_profile_override). - relaunch.py: drops _CRITICAL_DESTS and _PRE_ARGPARSE_FLAGS; the table builder now filters by getattr(action, 'inherit_on_relaunch', False). - test_ignore_user_config_flags.py: brittle inspect.getsource grep replaced with proper parser introspection. - test_relaunch.py: introspection sanity tests added. Salvaged from PR NousResearch#17549; added top-level -t/--toolsets flag to _parser.py so NousResearch#17623 (fix(tui): honor launch toolsets) behavior is preserved on current main. Co-authored-by: ethernet <arilotter@gmail.com>

not needed

…ples, tests The audit of v4.1 surfaced ~70 issues across the five scripts and three reference docs — most user-visible (silent file overwrites, status-error misclassified as success, X-API-Key leaked to S3 on /api/view redirect, Cloud endpoints that 404 because they were renamed). v5.0.0 fixes those and fills the gaps that previously forced users to write their own glue (WebSocket monitoring, batch/sweep, img2img upload helper, dep auto-fix, log fetch, health check, example workflows). Critical fixes - run_workflow.py: poll_status now checks status_str==error BEFORE completed:true, so a failed run no longer reports success - run_workflow.py: download_output streams to disk via safe_path_join, preserves server subfolder structure (no silent overwrites), and retries with exponential backoff - run_workflow.py: refuses to overwrite a link with a literal in inject_params (would silently break wiring) - _common.py: _StripSensitiveOnRedirectSession (subclasses requests.Session.rebuild_auth) drops X-API-Key/Cookie on cross-host redirects — fixes a real key-leak path through Cloud's signed-URL download flow. Tested - Cloud routing (verified live): /history → /history_v2, /models/<f> → /experiment/models/<f>, plus folder aliases for the unet ↔ diffusion_models and clip ↔ text_encoders rename - check_deps.py: distinguishes 200/empty vs 404 folder_not_found vs 403 free-tier; emits concrete fix_command per missing dep - extract_schema.py: prompt vs negative_prompt determined by tracing KSampler.{positive,negative} connections (incl. through Reroute / Primitive nodes) instead of meta-title heuristic; symmetric duplicate-name resolution; cycle-safe trace_to_node - hardware_check.py: multi-GPU pick-best, Apple variant detection, Rosetta detection, WSL2, ROCm --json, disk-space check, optional PyTorch probe; powershell preferred over deprecated wmic - comfyui_setup.sh: prefers pipx → uvx → pip --user (with PEP-668 fallback); idempotent — skips relaunch if server already up; configurable port/workspace; persistent log; SIGINT trap New scripts - run_batch.py — count or sweep (cartesian product), parallel up to cloud tier limit - ws_monitor.py — real-time WebSocket viewer; saves preview frames - auto_fix_deps.py — runs comfy node install / model download for whatever check_deps reports missing (with --dry-run) - health_check.py — single command that runs the verification checklist (comfy-cli + server + checkpoints + optional smoke test that cancels itself to avoid burning compute) - fetch_logs.py — pull traceback / status messages for a prompt_id Coverage expansion - Param patterns now cover Flux (BasicScheduler, BasicGuider, RandomNoise, ModelSamplingFlux), SD3, Wan/Hunyuan/LTX video, IPAdapter, rgthree, easy-use, AnimateDiff - Embedding refs in CLIPTextEncode strings extracted as model deps - ckpt_name / vae_name / lora_name / unet_name now controllable so workflows can be retargeted per run Examples - workflows/{sd15,sdxl,flux_dev}_txt2img.json - workflows/sdxl_{img2img,inpaint}.json - workflows/upscale_4x.json - workflows/{animatediff_video,wan_video_t2v}.json + README Tests - 117 tests (105 unit + 8 cloud integration + 4 cross-host security) - Cloud tests auto-skip without COMFY_CLOUD_API_KEY; verified end-to-end against live cloud API Backwards compatibility - All existing CLI flags continue to work; new behavior is opt-in (--ws, --input-image, --randomize-seed, --flat-output, etc.)

Self-review caught several errors in the previous commit: Frontmatter - Replace non-standard `requires_runtime` / `requires_tooling` fields with the documented `compatibility:` field (parsed by tools/skills_tool.py). - Drop the `audit-v5` author tag I added unnecessarily. MODEL_LOADERS catalog - Remove `IPAdapterUnifiedLoader` (input `preset` is an enum, not a file). - Remove `IPAdapterInsightFaceLoader` and `InsightFaceLoader` (input `provider` is a GPU backend selector, not a model file). These would have flagged enum values like "STANDARD" or "CUDA" as missing model files. - Add "NB:" comment explaining `BasicGuider` has no `cfg` input (the original PARAM_PATTERNS entry would never have matched). - Remove `SamplerCustomAdvanced.noise_seed` from PARAM_PATTERNS — that node takes a NOISE input from RandomNoise, not a seed field directly. NODE_TO_PACKAGE registry slugs - Verified all 18 packages against api.comfy.org and fixed: - `comfyui-essentials` → `comfyui_essentials` (underscore, not hyphen) - `comfyui-gguf` → `ComfyUI-GGUF` (case-sensitive) - `comfyui-photomaker-plus` → `ComfyUI-PhotoMaker-Plus` - `comfyui-wanvideowrapper` → `ComfyUI-WanVideoWrapper` - ComfyUI-HunyuanVideoWrapper isn't on the registry; surface a git-URL install hint via new NODE_TO_GIT_URL fallback so the user can install via ComfyUI-Manager's /manager/queue/install endpoint. Wrong class names - `Canny` → `CannyEdgePreprocessor` (controlnet-aux registers the latter, the former never appears in /object_info). - Add `Zoe_DepthAnythingPreprocessor` and `AnimalPosePreprocessor` while fixing controlnet-aux. - Remove `Reroute (rgthree)` (rgthree's Reroute is JS-only — no Python class, never appears in /object_info). - Add `Display Int (rgthree)` (sibling of Display Any). - Move `UltralyticsDetectorProvider` from `comfyui-impact-pack` to `comfyui-impact-subpack` (separate package, registered there). Tests - Update test_packages_are_safe_for_shell to accept case-mixed slugs (the registry uses both ComfyUI- and comfyui_ prefixes inconsistently). Replaced the lowercase-only assertion with a shell-safe regex check. - 117 tests still pass (105 unit + 8 cloud + 4 cross-host). Attribution - Add `SHL0MS@users.noreply.github.com` mapping to scripts/release.py AUTHOR_MAP so check-attribution CI passes.

…ges against code (NousResearch#17738) Broad drift audit against origin/main (b52b633). Reference pages (most user-visible drift): - slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer that were missing; drop non-existent /terminal-setup; fix /q footnote (resolves to /queue, not /quit); extend CLI-only list with all 24 CLI-only commands in the registry - cli-commands: add dedicated sections for hermes curator / fallback / hooks (new subcommands not previously documented); remove stale hermes honcho standalone section (the plugin registers dynamically via hermes memory); list curator/fallback/hooks in top-level table; fix completion to include fish - toolsets-reference: document the real 52-toolset count; split browser vs browser-cdp; add discord / discord_admin / spotify / yuanbao; correct hermes-cli tool count from 36 to 38; fix misleading claim that hermes-homeassistant adds tools (it's identical to hermes-cli) - tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao, 2 Discord toolsets; move browser_cdp/browser_dialog to their own browser-cdp toolset section - environment-variables: add 40+ user-facing HERMES_* vars that were undocumented (--yolo, --accept-hooks, --ignore-*, inference model override, agent/stream/checkpoint timeouts, OAuth trace, per-platform batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs, gateway restart/connect timeouts); dedupe the Cron Scheduler section; replace stale QQ_SANDBOX with QQ_PORTAL_HOST User-guide (top level): - cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20) - configuration.md: display.platforms is the canonical per-platform override key; tool_progress_overrides is deprecated and auto-migrated - profiles.md: model.default is the config key, not model.model - sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8 - checkpoints-and-rollback.md: destructive-command list now matches _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd) - docker.md: the container runs as non-root hermes (UID 10000) via gosu; fix install command (uv pip); add missing --insecure on the dashboard compose example (required for non-loopback bind) - security.md: systemctl danger pattern also matches 'restart' - index.md: built-in tool count 47 -> 68 - integrations/index.md: 6 STT providers, 8 memory providers - integrations/providers.md: drop fictional dashscope/qwen aliases Features: - overview.md: 9 image models (not 8), 9 TTS providers (not 5), 8 memory providers (Supermemory was missing) - tool-gateway.md: 9 image models - tools.md: extend common-toolsets list with search / messaging / spotify / discord / debugging / safe - fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan, tencent-tokenhub, azure-foundry) - plugins.md: Available Hooks table now includes on_session_finalize, on_session_reset, subagent_stop - built-in-plugins.md: add the 7 bundled plugins the page didn't mention (spotify, google_meet, three image_gen providers, two dashboard examples) - web-dashboard.md: add --insecure and --tui flags - cron.md: hermes cron create takes positional schedule/prompt, not flags Messaging: - telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch. - discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default is 2.0, not 0.1 - dingtalk.md: document DINGTALK_REQUIRE_MENTION / FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL / ALLOW_ALL_USERS that the adapter supports - bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env var; the setting lives in platforms.bluebubbles.extra only - qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and QQ_GROUP_ALLOWED_USERS - wecom-callback.md: replace 'hermes gateway start' (service-only) with 'hermes gateway' for first-time setup Developer-guide: - architecture.md: refresh tool/toolset counts (61/52), terminal backend count (7), line counts for run_agent.py (~13.7k), cli.py (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform adapter count 18 -> 20 - agent-loop.md: run_agent.py line count 10.7k -> 13.7k - tools-runtime.md: add vercel_sandbox backend - adding-tools.md: remove stale 'Discovery import added to model_tools.py' checklist item (registry auto-discovery) - adding-platform-adapters.md: mark send_typing / get_chat_info as concrete base methods; only connect/disconnect/send are abstract - acp-internals.md: ACP sessions now persist to SessionDB (~/.hermes/state.db); acp.run_agent call uses use_unstable_protocol=True - cron-internals.md: gateway runs scheduler in a dedicated background thread via _start_cron_ticker, not on a maintenance cycle; locking is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows) - gateway-internals.md: gateway/run.py ~12k lines - provider-runtime.md: cron DOES support fallback (run_job reads fallback_providers from config) - session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations 10 and 11 (trigram FTS, inline-mode FTS5 re-index); add api_call_count column to Sessions DDL; document messages_fts_trigram and state_meta in the architecture tree - context-compression-and-caching.md: remove the obsolete 'context pressure warnings' section (warnings were removed for causing models to give up early) - context-engine-plugin.md: compress() signature now includes focus_topic param - extending-the-cli.md: _build_tui_layout_children signature now includes model_picker_widget; add to default layout Also fixed three pre-existing broken links/anchors the build warned about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and tips#background-tasks, nix-setup.md -> #container-aware-cli). Regenerated per-skill pages via website/scripts/generate-skill-docs.py so catalog tables and sidebar are consistent with current SKILL.md frontmatter. docusaurus build: clean, no broken links or anchors.

- New /models page in left nav (after Analytics) - New /api/analytics/models endpoint with per-model token/cost/session breakdown, cache read/reasoning tokens, tool calls, avg tokens/session, and capabilities from models.dev (vision/tools/reasoning/context window) - Model cards with stacked token distribution bar, capability badges, provider badges, cost info, and relative time - Summary stats bar (models used, total tokens, est. cost, sessions) - Period selector (7d/30d/90d) with refresh - i18n support (en + zh)

…or split - SQL: add `model != ''` to both queries in /api/analytics/models so sessions with empty-string model (pre-existing data integrity, confirmed in production DB: ~107 sessions) no longer render as blank-header cards. - ModelsPage: drop the arbitrary slashIdx < 20 length gate in shortModelName / modelProvider. The gate was fragile for longer vendor prefixes (e.g. `deepseek-ai/...`). Strip on the first / unconditionally. Rename modelProvider -> modelVendor to avoid confusion with the billing provider column. - scripts/release.py: add AUTHOR_MAP entry for yatesjalex.

Adds a public reload path for the in-process skill caches so newly installed (or removed) skills become visible mid-session without a gateway restart. Mirrors the shape of /reload-mcp. Three surfaces: * /reload-skills slash command — CLI (cli.py) and gateway (gateway/run.py), with /reload_skills alias for Telegram autocomplete and an explicit Discord registration. * skills_reload agent tool (tools/skills_tool.py) — lets agents/subagents pick up freshly-installed skills via tool call. * agent.skill_commands.reload_skills() — shared helper that clears _skill_commands, _SKILLS_PROMPT_CACHE (in-process LRU), and the on-disk .skills_prompt_snapshot.json, then returns an added/removed diff plus the new total count. Tested: * tests/agent/test_skill_commands_reload.py (9 cases) * tests/cli/test_cli_reload_skills.py (3 cases) * tests/gateway/test_reload_skills_command.py (4 cases) Use case: NemoClaw / OpenShell-style sandboxed orchestrators that drop skills into ~/.hermes/skills mid-session, plus agentic flows where the agent itself installs a skill via the shell tool and needs it bound without a gateway restart. The Python helper clear_skills_system_prompt_cache(clear_snapshot=True) already exists internally — this PR just exposes it via slash command and tool.

@flockonus

…usResearch#18257) `hermes update` ran the config migration (11 → 17) successfully then crashed at `agent/skill_utils.py:340` during the post-migration skill-config prompt. User @flockonus reported this on Twitter. Root cause: `get_missing_skill_config_vars` in hermes_cli/config.py only guarded the import of `discover_all_skill_config_vars`, not the call. Any runtime exception inside the skill scan (malformed SKILL.md, unreadable external skill dir, etc.) propagated up through `migrate_config` and aborted `hermes update` after the version bump. Wrap the call in try/except so skill-config prompting — which is a post-migration nicety — can never block the migration itself.

…ch#18258) When a user types /steer <text> on an ACP session that isn't actively running a turn (and there's no interrupted-prompt salvage available), _cmd_steer silently appended to state.queued_prompts and replied "No active turn — queued for the next turn". That looks identical to /queue output even though the user never typed /queue — @EddyLeeKhane reported this as "/steer never works, gets queued instead". Rewrite the payload to a plain user prompt before the slash-intercept fires, matching the gateway's idle-/steer fallthrough in gateway/run.py ~L4898.

@codecovenant

…ates (NousResearch#18265) The user-visible /compress banner and the post-compression last_prompt_tokens writeback both counted only the raw message transcript (chars/4). With a 15KB system prompt and 30 tool schemas (~26KB), a 4-message transcript that looks like ~45 tokens to the transcript-only estimator is really ~10.5K tokens of request pressure — a 234x gap. Two user-facing consequences: - Banner shows 'Compressing … (~45 tokens)…' while compression is actually firing on 10K+ tokens of real pressure, confusing users about why compression triggered (reported by @codecovenant on X; NousResearch#6217). - Post-compression last_prompt_tokens writeback omits tool schemas, so the next should_compress() check compares real usage against a stale underestimate — compression triggers late, potentially past the model's context limit on small-context models (NousResearch#14695). Swap estimate_messages_tokens_rough() for estimate_request_tokens_rough() at every user-visible banner and at the post-compression writeback. estimate_request_tokens_rough() already existed for exactly this purpose and includes system prompt + tool schemas. Touched call sites: - run_agent.py: post-compression last_prompt_tokens writeback, post-tool call should_compress() fallback when provider usage is missing - cli.py: /compress banner + summary - gateway/run.py: gateway /compress banner + summary - tui_gateway/server.py: TUI /compress status + summary - acp_adapter/server.py: ACP /compact before/after Left intentionally alone: - Session-hygiene fallback and the 'no agent' /status path in gateway/run.py — no agent instance is in scope to query for system prompt/tools, and the existing 30-50% overestimate wobble on hygiene is safety-accepted. - Verbose-mode 'Request size' logging — informational only, already counts system prompt via api_messages[0]. Also relabels the feedback line from 'Rough transcript estimate' to 'Approx request size' so the metric label matches what it actually measures. Credits: diagnoses from @devilardis (NousResearch#14695) and @Jackten (NousResearch#6217); user report @codecovenant on X (2026-04-30). Closes NousResearch#14695 Closes NousResearch#6217

… thinking mode DeepSeek V4 Pro tightened thinking-mode validation and rejects empty-string reasoning_content with HTTP 400: The reasoning content in the thinking mode must be passed back to the API. run_agent.py injected "" at three fallback sites — the tool-call pad in _build_assistant_message and both injection branches of _copy_reasoning_content_for_api (cross-provider poison guard + unconditional thinking pad). All three now emit " " (single space), which satisfies the non-empty check on V4 Pro without leaking fabricated reasoning. Also upgrades stale empty-string placeholders on replay: sessions persisted before this change have reasoning_content="" pinned at creation time; when the active provider enforces thinking-mode echo, the replay path now rewrites "" -> " " so existing users don't 400 on their first V4 Pro turn after updating. Non-thinking providers still round-trip "" verbatim. Updates 9 existing assertions + adds 2 regression tests (stale-placeholder upgrade, non-thinking verbatim preservation). Refs NousResearch#15250, NousResearch#17400. Closes NousResearch#17341.

@tombielecki

…search#18253) When the curator consolidates skill X into umbrella Y, any cron job that listed X in its skills field would fail to load X at run time — the scheduler logs a warning and skips it, so the scheduled job runs without the instructions it was scheduled to follow. cron.jobs.rewrite_skill_refs(consolidated, pruned) now updates jobs in-place: consolidated names route to the umbrella target (dedup when umbrella is already present), pruned names are dropped. agent.curator._write_run_report calls it after classification, best-effort so a cron-side failure never breaks the curator itself. Results are recorded in run.json (counts.cron_jobs_rewritten + full cron_rewrites payload), a separate cron_rewrites.json for convenience when jobs were touched, and a section in REPORT.md. Reported by @tombielecki.

@CharlesMcDowell

…usResearch#18266) Adds opt-in auto-deletion for slash-command reply messages like "New session started!", "Restarting gateway…", "Stopped.", and YOLO toggles. After the TTL elapses the gateway calls the adapter's delete_message; on platforms without a delete API (everything except Telegram today) the TTL is silently ignored and the message stays. Requested on Twitter by @CharlesMcDowell — tool-call bubbles are useful real-time, but system notices clutter the thread once the agent finishes. Implementation: - EphemeralReply(str) sentinel in gateway/platforms/base.py. Subclasses str so existing 'X' in response / response.startswith(...) checks in tests and call sites keep working unchanged; isinstance() still distinguishes it for the send path. - _process_message_background and both busy-session bypass paths (in base.py) call _unwrap_ephemeral() on the handler return, send the unwrapped text, and schedule a detached delete task when the TTL > 0 AND the adapter class overrides delete_message. - display.ephemeral_system_ttl (default 0 = disabled) in DEFAULT_CONFIG. Handler can pass ttl_seconds explicitly to override. - Wrapped the highest-noise return sites: /new, /reset, /stop, /yolo on/off, /restart success + "already in progress". Draining notices and /help output left as plain strings — those are informational and users want to read them. Backward-compat: default TTL 0 → no scheduling, no behavior change for existing users. Platforms without delete_message silently no-op.

@murelux

…arch#18261) hermes update had two interactive [Y/n] prompts with no bypass: 1. Config migration (after new env/config options are added) 2. Autostash restore (when uncommitted work was stashed before pull) hermes uninstall already has --yes/-y; mirrors that. Under --yes: - Config-migrate prompt → auto-yes, migrate_config(interactive=False) so new config fields are applied but API-key prompts are skipped (user runs 'hermes config migrate' later for those). Matches gateway-mode semantics. - Stash-restore prompt → auto-yes, git stash apply runs automatically. Closes the 'can I hermes update -y, No ! Fix' gap reported by @murelux.

…NousResearch#18259) * docs(sidebar): collapse exploding skills tree to a single Skills node The Skills sub-tree in the left sidebar expanded to 200+ entries (22 bundled categories + 15 optional categories, every skill a page). That's most of the nav on a first visit — docs for the actual product get drowned in it. Collapse the sidebar to: Skills godmode (hand-written spotlight) google-workspace (hand-written spotlight) Bundled catalog (reference/skills-catalog — table of all bundled) Optional catalog (reference/optional-skills-catalog — table of all optional) Per-skill pages still generate and are still reachable at their URLs; they're linked from the two catalog tables and from the Skills overview page. They just don't appear in the left nav anymore. sidebars.ts goes from 649 lines to 247. generate-skill-docs.py loses the bundled/optional sidebar render helpers. Also picks up incidental generator output drift on current main (comfyui skill content refresh; 4 new skill pages for devops-kanban-orchestrator, devops-kanban-worker, productivity-here-now, productivity-shopify; two catalog refreshes). These are what the generator produces on main today — keeping them committed avoids the next docs build showing 'working tree dirty'. * docs(sidebar): drop godmode and google-workspace spotlight pages Keep the Skills sidebar node strictly principled: two catalog links, nothing else. There was no rule for which skills got spotlight pages and which got auto-generated pages — just that these two happened to be hand-written first. Both pages still build and are still reachable at /docs/user-guide/skills/godmode and /docs/user-guide/skills/google-workspace. They're linked from the catalog tables and the Skills overview page. Sidebar Skills node now: Skills ├── Bundled catalog └── Optional catalog

…18262) Add a standing-goal slash command that keeps Hermes working toward a user-stated objective across turns until it is achieved, paused, or the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI 0.128.0's /goal. After each turn, a lightweight auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, and we're under the turn budget (default 20), Hermes feeds a continuation prompt back into the same session as a normal user message. Any real user message preempts the continuation loop automatically. Judge failures fail OPEN (continue) so a flaky judge never wedges progress — the turn budget is the real backstop. ### Commands - `/goal <text>` — set a standing goal (kicks off the first turn) - `/goal` or `/goal status` — show current state - `/goal pause` — pause the continuation loop - `/goal resume` — resume (resets turn counter) - `/goal clear` — drop the goal Works on both CLI and gateway platforms via the central CommandDef registry. ### Design invariants preserved - **Prompt cache**: continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap. - **Role alternation**: continuation is a user turn, never injected mid-tool-loop. - **Session persistence**: goal state lives in SessionDB.state_meta keyed by `goal:<session_id>`, so `/resume` picks it up. - **Mid-run safety**: on the gateway, `/goal status|pause|clear` are allowed mid-run (control-plane only); setting a new goal requires `/stop` first so we don't race a second continuation prompt against the current turn. ### Files - `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state - `hermes_cli/commands.py` — CommandDef entry - `hermes_cli/config.py` — `goals.max_turns` default - `hermes_cli/web_server.py` — dashboard category merge - `cli.py` — /goal handler + post-turn continuation hook in process_loop - `gateway/run.py` — /goal handler + post-turn continuation hook wrapping _handle_message_with_agent - `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing, fail-open semantics, lifecycle, persistence, budget exhaustion - `website/docs/reference/slash-commands.md` — docs entry

…branches without explicit type When a schema node inside anyOf has enum values but no explicit 'type', Rule 3 (enum cleanup) ran before _fill_missing_type, so node_type was None and the enum was never cleaned. Moonshot then rejected the schema with 'enum value (<nil>) does not match any type in [string]'. Fix: reorder operations — fill missing type first, strip nullable, then clean enum. This ensures enum cleanup always has a type to check. Also fixes test expectation: empty string in enum is now correctly stripped (Moonshot rejects it too). Closes NousResearch#16875

The anyOf collapse in _repair_schema returned early, skipping the nullable-strip and enum-cleanup steps. When a schema had anyOf [{enum: [..., null, '']}, {type: null}] alongside a parent-level 'nullable: true', collapsing to the single non-null branch produced a merged node that still had both 'nullable' and the bad enum values — Moonshot would still 400 on it. Fix: fall through to Rules 1/3 when the collapse produces a single merged node; only return early for the multi-branch case (pure anyOf preservation) or when there was no null branch to remove. Adds a test that locks in the combined-case expectation.

Adds a proper feature page at user-guide/features/goals.md covering the /goal slash command — Hermes' take on the Ralph loop shipped in PR NousResearch#18262. The slash-commands reference table had two table rows but no narrative doc walking through the judge model, fail-open semantics, turn budget, persistence, user-message preemption, or the aux-model config override. Adds a walkthrough example showing a multi-turn goal running to completion, covers the two judge failure modes with how to recover, and credits Codex CLI 0.128.0 / Eric Traut as prior art. Also cross-links both slash-commands.md rows to the new page so readers discovering /goal from the command reference can dive in.

…NousResearch#18276) Two machine-readable entry points to the Hermes Agent docs: /llms.txt curated index of every doc page, one link per page with short descriptions. ~17 KB, safe to load into an LLM context window. /llms-full.txt every page under website/docs/ concatenated as markdown. ~1.8 MB. For one-shot ingestion by coding agents and RAG pipelines. Both files are also served from /docs/llms.txt and /docs/llms-full.txt (Docusaurus serves website/static/ under baseUrl=/docs/). Some agents and IDE plugins probe the classic site-root path; the deploy workflow now copies both files to _site root so either URL works. Conforms to the emerging llmstxt.org spec: H1 project name, blockquote summary, short install command, GitHub link, then curated sections mirroring the docs-site navigation (Getting Started, Using Hermes, Features, Messaging, Integrations, Guides, Developer Guide, Reference). Generated by website/scripts/generate-llms-txt.py. Wired into prebuild.mjs so every 'npm run build' and 'npm run start' refreshes the files alongside the existing skills.json extraction. Both outputs are gitignored (same precedent as src/data/skills.json). Descriptions in llms.txt are pulled from each page's frontmatter, so they stay current automatically. All ~80 section slugs are validated against the filesystem at generation time; an invalid slug would fail the prebuild.

@web-dev0521

Four callsites hardcoded Path.home() / '.hermes' with no HERMES_HOME check, breaking Docker deployments and profile isolation (hermes -p): - plugins/hermes-achievements/dashboard/plugin_api.py: state_path(), snapshot_path(), checkpoint_path() bare-literal paths - scripts/profile-tui.py: DEFAULT_STATE_DB and DEFAULT_LOG defaults ignored HERMES_HOME - hermes_cli/slack_cli.py: except-Exception fallback for slack-manifest.json dump - optional-skills/migration/openclaw-migration/scripts/openclaw_to_hermes.py: --target argparse default Use get_hermes_home() (with an ImportError shim for the standalone scripts) or 'os.environ.get("HERMES_HOME") or str(Path.home()/".hermes")' where importing hermes_constants is impractical. E2E-verified: with HERMES_HOME=/tmp/x all three achievements paths and both profile-tui defaults route under /tmp/x. Salvaged from NousResearch#18068 (original scope was broader mechanical cleanup claiming 23 callsites were buggy; most were already respecting HERMES_HOME via os.environ.get(key, default) — only these 4 had no env check at all). Credit: @web-dev0521.

…rch#18282) Adds a new top-of-sidebar docs page at /docs/user-stories that is a masonry-style collage of 99 real user stories sourced from X/Twitter, GitHub issues/PRs, Reddit, Hacker News, YouTube, blogs (Medium, Substack, dev.to), podcasts, LinkedIn, GitHub Gists, and Product Hunt. Every tile links to the original post/issue/video/gist where someone described a specific use case: personal assistants, dev workflows, trading bots, research briefs, family WhatsApp agents, Kubernetes deployments, legal-domain self-hosted setups, and more. - docs/user-stories.mdx: MDX entry mounting the collage component - src/components/UserStoriesCollage: React component with category + source filters, CSS-columns masonry layout, per-category accent colors - src/data/userStories.json: source-of-truth dataset (force-added; the root .gitignore's unanchored 'data/' rule would otherwise swallow it, same reason skills.json is explicitly listed in website/.gitignore) - sidebars.ts: link added at the top of the docs sidebar

The bot-owner identity check inside OwnerCommandMiddleware was commented out and replaced with a hardcoded `is_owner = True`, so any group member could trigger allowlisted privileged commands (/approve, /deny, /stop, /reset, /retry, /undo, /new, /background, /bg, /btw, /queue, /q) by sending the slash command without @-mentioning the bot. The most severe case is /approve: a non-owner could approve a dangerous tool call the bot was waiting on the owner to confirm. Re-enable the documented identity check (push.from_account == push.bot_owner_id) so only the configured owner can issue these commands.

liberathio · 2026-05-04T09:13:47Z