test(8/N): fix all 13 remaining failures — clean up 3 stubborn + 5 new regressions#13
Bartok9 wants to merge 2 commits into
Conversation
…w regressions

7 PRs (#6–#12) cut failures 41→13. This closes the gap and fixes the root causes (not symptoms) of all 13 remaining failures, including the 3 stubborn pre-existing ones the earlier PRs only claimed to fix and the new regressions they introduced.

GOOGLE_CHAT env-config KeyError (6 tests, pre-existing): _check_for_registry() gated platform *enablement* on the optional google-cloud-pubsub SDK (GOOGLE_CHAT_AVAILABLE). The plugin loader imports the adapter under a different module name (hermes_plugins.google_chat_platform.adapter) than the one the test patches, so under parallel execution the loaded copy's flag stayed False and a *configured* platform was silently dropped from cfg.platforms. Enablement now reflects configuration intent (Pub/Sub project + subscription env vars); the missing-dependency case is still surfaced with a clear error at connect() time, where it belongs.

agent_cache spillover timeout (1 test, pre-existing): test_concurrent_inserts_settle_at_cap built 160 real AIAgent instances (~0.5s each) *inside* worker threads, so the per-thread join(timeout=30) window blew past 30s on shared CI runners and tripped the bogus "possible deadlock?" assertion. Pre-build the agents before the timed section; the concurrency under test (cache insert + cap enforcement under the lock) is unchanged.

skill_provenance default origin (1 test, pre-existing; PR #10 never touched it): run_agent.process_message() binds the write-origin ContextVar to "assistant_tool" on the turn thread and never resets it, leaking into copy_context() snapshots. conftest now resets it to the fresh-process default so the ContextVar's own default ("foreground") governs untouched tests.

credential_pool env precedence (1 test, new regression from PR #10): Two sibling tests demand opposite precedence: NousResearch#18254 (OpenRouter) requires .env to win over a stale shell export, while the generic DEEPSEEK test requires os.environ to win. Resolved per-provider: OpenRouter resolves .env-first (the rotation/401 scenario), all other providers keep os.environ-first. Both files now pass.

restart_drain drain notice (1 test, new regression from PR #12): t("gateway.draining") returned the raw key because test_t_missing_key_in_non_english_falls_back_to_english points i18n._locales_dir at a temp dir and leaves a fake catalog cached in the process-global _catalog_cache. conftest now calls reset_language_cache() between tests.

goal_command status (2 tests, new regression from PR #11): hermes_state.DEFAULT_DB_PATH is frozen at import time, so a bare SessionDB() always opened the import-time state.db and ignored the per-test tmp HERMES_HOME, leaking goal state across cases. GoalManager now resolves the DB path from the live HERMES_HOME.

update_yes_flag (1 test): resolved by the shared isolation fixes above.

Lesson: the earlier PRs declared local-pass as victory and merged without waiting for full-suite CI. This PR waits for the Tests workflow to go green on the merge commit before declaring done.
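To make the skill_provenance leak concrete, here is a minimal sketch of the mechanism using only the standard library. The variable name `write_origin` and both helper functions are hypothetical stand-ins, not the project's actual identifiers, and the reset here simply sets the default value back explicitly.

```python
import contextvars

# Hypothetical stand-in for the write-origin variable; the real name may differ.
write_origin = contextvars.ContextVar("write_origin", default="foreground")


def process_message():
    # Binds the origin for the turn and, as in the bug, never resets it.
    write_origin.set("assistant_tool")


def reset_write_origin():
    # Roughly what a conftest fixture could do between tests (assumption):
    # put the fresh-process value back so later snapshots start clean.
    write_origin.set("foreground")


process_message()
# copy_context() snapshots the current context, including the leaked binding.
leaked = contextvars.copy_context().run(write_origin.get)

reset_write_origin()
clean = contextvars.copy_context().run(write_origin.get)
```

Where the setter keeps its token, `ContextVar.reset(token)` is the tighter fix; a reset in the test harness is the fallback when the setter cannot be changed.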
🔎 Lint report:
| Rule | Count |
|---|---|
| invalid-argument-type | 3 |
| unresolved-import | 1 |
First entries
run_agent.py:12539: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:6649: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:12542: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
tests/hermes_cli/test_update_yes_flag.py:17: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
✅ Fixed issues (3):
| Rule | Count |
|---|---|
| invalid-argument-type | 3 |
First entries
run_agent.py:12542: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:12539: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:6649: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
Unchanged: 4075 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Hardcoded `"state.db"` duplicates source-of-truth constant
  - Updated goals DB path construction to use `DEFAULT_DB_PATH.name` from `hermes_state` while still combining it with the live `HERMES_HOME`.
Or push these changes by commenting:
@cursor push 3232dc8fb8
Preview (3232dc8fb8)
diff --git a/hermes_cli/goals.py b/hermes_cli/goals.py
--- a/hermes_cli/goals.py
+++ b/hermes_cli/goals.py
@@ -152,7 +152,7 @@
"""
try:
from hermes_constants import get_hermes_home
- from hermes_state import SessionDB
+ from hermes_state import DEFAULT_DB_PATH, SessionDB
home = str(get_hermes_home())
except Exception as exc: # pragma: no cover
@@ -165,14 +165,14 @@
try:
# Resolve the DB path from the *live* HERMES_HOME rather than relying
# on SessionDB's import-time DEFAULT_DB_PATH. hermes_state computes
- # DEFAULT_DB_PATH = get_hermes_home() / "state.db" at module import,
+ # DEFAULT_DB_PATH with the active HERMES_HOME at module import,
# so a bare SessionDB() is permanently pinned to whatever HERMES_HOME
# was set when hermes_state first imported. That made the GoalManager
# ignore profile/HERMES_HOME switches at runtime and (in tests) leak
# goal state across cases that point HERMES_HOME at fresh temp dirs.
# Passing the path explicitly keeps the per-home cache correct.
from pathlib import Path as _Path
- db = SessionDB(db_path=_Path(home) / "state.db")
+ db = SessionDB(db_path=_Path(home) / DEFAULT_DB_PATH.name)
except Exception as exc: # pragma: no cover
logger.debug("GoalManager: SessionDB() raised (%s)", exc)
return None
Reviewed by Cursor Bugbot for commit 9fdca65.
# goal state across cases that point HERMES_HOME at fresh temp dirs.
# Passing the path explicitly keeps the per-home cache correct.
from pathlib import Path as _Path
db = SessionDB(db_path=_Path(home) / "state.db")
Hardcoded "state.db" duplicates source-of-truth constant
Low Severity
The filename "state.db" is hardcoded here, duplicating the value already defined via DEFAULT_DB_PATH = get_hermes_home() / "state.db" in hermes_state.py. If the DB filename is ever changed in hermes_state, this location would silently diverge. Since DEFAULT_DB_PATH.name always yields the filename component regardless of the prefix path that was frozen at import time, using that would keep a single source of truth while still achieving the goal of combining the live home with the correct filename.
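A small sketch of the suggestion, assuming a module-level constant computed once and a helper that reads the home directory at call time. `live_db_path`, its `env` parameter and the fallback directory are hypothetical; only `DEFAULT_DB_PATH`, `HERMES_HOME` and `state.db` come from this PR.

```python
import os
from pathlib import Path

# Frozen once, like a module-level constant computed at import time.
DEFAULT_DB_PATH = Path(os.environ.get("HERMES_HOME", "/home/user/.hermes")) / "state.db"


def live_db_path(env=None):
    # Hypothetical helper: read HERMES_HOME at call time and reuse only the
    # filename component of the frozen constant.
    env = os.environ if env is None else env
    home = Path(env.get("HERMES_HOME", "/home/user/.hermes"))
    return home / DEFAULT_DB_PATH.name


per_test_path = live_db_path({"HERMES_HOME": "/tmp/case1"})
```

`Path.name` returns just the final path component, so the directory prefix frozen at import time never reaches the resolved path.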
…ollution

CI run 2 surfaced test_yes_restores_stash_without_prompting failing with the *real* _restore_stashed_changes running (captured '→ Restoring local changes...') despite the @patch.

Root cause: sibling tests (test_env_loader, test_skills_subparser) reload/delete hermes_cli.main from sys.modules, so the top-of-file 'from hermes_cli.main import cmd_update' binding points at the old module while @patch('hermes_cli.main._restore_stashed_changes') patches the new one, and the patch becomes a no-op.

Add an autouse fixture that rebinds cmd_update to the live module, mirroring the proven fix already in test_update_stale_dashboard.py.
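The stale-binding effect can be reproduced in isolation. The sketch below builds a throwaway module named `demo_main` (hypothetical, standing in for `hermes_cli.main`) twice to simulate a reload, then shows that a name-based patch only reaches the module currently registered in `sys.modules`.

```python
import sys
import types
from unittest.mock import patch

SOURCE = '''
def _restore():
    return "real"

def cmd_update():
    return _restore()
'''


def load(name):
    # Build a fresh module object from SOURCE and register it, as a reload would.
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


cmd_update = load("demo_main").cmd_update  # binding taken at "import time"
load("demo_main")                          # a sibling test replaces the module

with patch("demo_main._restore", return_value="mocked"):
    stale = cmd_update()                          # old function, old globals
    live = sys.modules["demo_main"].cmd_update()  # looked up on the live module

del sys.modules["demo_main"]  # tidy up the throwaway module
```

The old `cmd_update` still resolves `_restore` in the globals of the module object it was defined in, which is why rebinding to the live module before each test makes the patch take effect again.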



Context
7 PRs (#6 through #12) landed earlier tonight and reduced the Tests-workflow failure count from 41 → 13. But 13 failures remained on `main` (run 26677651968, sha 6b2b3d4). This PR closes the gap by fixing the root cause of every one, including the 3 stubborn pre-existing failures the earlier PRs claimed to fix but didn't, and the regressions the earlier PRs themselves introduced.

The earlier PRs merged on local-pass alone. This PR waits for the full Tests CI to go green before merging.

The 13 failures
- `test_google_chat.py::TestEnvConfigLoading::*` (6)
- `test_agent_cache.py::...::test_concurrent_inserts_settle_at_cap`
- `test_skill_provenance.py::test_default_origin_is_foreground`
- `test_credential_pool.py::test_load_pool_prefers_dotenv_over_stale_os_environ`
- `test_restart_drain.py::test_restart_command_while_busy_requests_drain_without_interrupt`
- `test_update_yes_flag.py::...::test_yes_restores_stash_without_prompting`
- `test_goal_command.py::test_goal_status_alias_shows_status` / `test_goal_bare_shows_status_when_none_set`

Root causes (one per group)
- `_check_for_registry()` gated platform enablement on the optional `google-cloud-pubsub` SDK. The plugin loader imports the adapter under a different module name than tests patch, so under parallel execution the loaded copy's `GOOGLE_CHAT_AVAILABLE` stayed `False` and a configured platform was silently dropped. Enablement now reflects configuration intent (Pub/Sub project + subscription env vars); missing deps are still surfaced clearly at `connect()`.
- `AIAgent` builds (~0.5s each) happened inside the worker threads, so the per-thread `join(timeout=30)` window exceeded 30s on shared CI. Agents are now pre-built before the timed section; the concurrency under test is unchanged.
- `run_agent.process_message()` binds the write-origin ContextVar to `"assistant_tool"` and never resets it, leaking into `copy_context()`. conftest now resets it to the fresh-process default.
- OpenRouter resolves `.env`-first (regression NousResearch/hermes-agent#18254, "[Bug]: auth.json credential cache ignores .env changes - stale key persists"), all other providers keep `os.environ`-first. Both files pass.
- `t("gateway.draining")` returned the raw key because an i18n test leaves a fake catalog in the process-global `_catalog_cache`. conftest now calls `reset_language_cache()` between tests.
- `hermes_state.DEFAULT_DB_PATH` is frozen at import time, so bare `SessionDB()` ignored the per-test tmp `HERMES_HOME` and leaked goal state. `GoalManager` now resolves the DB path from the live `HERMES_HOME`.

Validation
- … `-n0` and `-n auto`).
- `-n auto`: the original 13 are gone; the only remaining failures are macOS-host-specific (WSL/systemd detection, ripgrep `find`, Claude-Code creds path, timing-based subprocess detection) that pass on Linux CI and were green in the baseline run.
- … `success`.

Lesson
The prior PRs merged without waiting for full-suite CI. This PR enforces the wait.
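For the agent_cache timeout, the idea of moving slow construction out of the timed window can be sketched as follows. `SlowAgent` and `CappedCache` are hypothetical stand-ins with a short sleep in place of a real build; the actual test and cache are not shown in this PR page.

```python
import threading
import time


class SlowAgent:
    # Hypothetical stand-in for an expensive object; the sleep simulates build cost.
    def __init__(self, key):
        time.sleep(0.01)
        self.key = key


class CappedCache:
    def __init__(self, cap):
        self.cap = cap
        self._items = {}
        self._lock = threading.Lock()

    def insert(self, key, value):
        with self._lock:
            self._items[key] = value
            while len(self._items) > self.cap:
                # Evict the oldest entry (dicts preserve insertion order).
                self._items.pop(next(iter(self._items)))

    def __len__(self):
        with self._lock:
            return len(self._items)


cache = CappedCache(cap=8)
agents = [SlowAgent(i) for i in range(32)]  # slow work done before the timed window

threads = [threading.Thread(target=cache.insert, args=(a.key, a)) for a in agents]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)  # now only covers insert + cap enforcement under the lock

settled = len(cache)
all_finished = not any(t.is_alive() for t in threads)
```

The threads still race on the lock exactly as before; only the construction cost has moved, so the join timeout measures the behavior the test cares about.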
Note
Medium Risk
OpenRouter env resolution order changed for one provider (intentional regression fix); Google Chat may appear enabled when only env is set without the Pub/Sub SDK until connect fails clearly.
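The Google Chat trade-off noted above (enabled from configuration, failing clearly at connect time) could look roughly like this. The env var names, the `Adapter` class and `sdk_available` are assumptions made for illustration; only the general split between enablement and `connect()` comes from the PR.

```python
import importlib.util


def is_configured(env):
    # Enablement reflects configuration intent only.
    # Both variable names are hypothetical.
    return bool(env.get("GOOGLE_CHAT_PROJECT_ID") and env.get("GOOGLE_CHAT_SUBSCRIPTION"))


def sdk_available(module_name):
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


class Adapter:
    def __init__(self, sdk_module):
        self.sdk_module = sdk_module

    def connect(self):
        # The optional dependency is checked here, not at enablement time.
        if not sdk_available(self.sdk_module):
            raise RuntimeError(f"missing optional dependency: {self.sdk_module}")
        return "connected"
```

With this split, a configured platform always shows up in the registry, and a missing SDK produces an explicit error at the point where the SDK is actually needed.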
Overview
This PR clears the last 13 Tests-workflow failures by fixing root causes and tightening test isolation—not by masking symptoms.
Runtime behavior: OpenRouter credential seeding now prefers `~/.hermes/.env` over a stale `OPENROUTER_API_KEY` in `os.environ` (other providers still use env-first). Google Chat registry enablement depends only on Pub/Sub project/subscription env vars, not optional `google-cloud-pubsub` import state. `GoalManager` opens `SessionDB` at the live `HERMES_HOME` path instead of import-time `DEFAULT_DB_PATH`.

Test harness: `conftest` resets `agent.i18n` catalog cache and `tools.skill_provenance` write-origin between tests. `test_agent_cache` pre-builds agents before the concurrent insert stress window. `test_update_yes_flag` rebinds `cmd_update` after other tests reload `hermes_cli.main`.

Reviewed by Cursor Bugbot for commit 34eaa38. Bugbot is set up for automated code reviews on this repo.
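The per-provider precedence described in the overview can be illustrated with a small lookup. `resolve_key` and `DOTENV_FIRST_PROVIDERS` are hypothetical names, and the two mappings stand in for the parsed `.env` file and `os.environ`; the real credential pool code is not shown here.

```python
# Assumption: only OpenRouter is resolved .env-first, per the PR description.
DOTENV_FIRST_PROVIDERS = {"openrouter"}


def resolve_key(provider, var_name, dotenv, environ):
    # Pick the lookup order for this provider, then return the first non-empty value.
    if provider in DOTENV_FIRST_PROVIDERS:
        sources = (dotenv, environ)
    else:
        sources = (environ, dotenv)
    for source in sources:
        value = source.get(var_name)
        if value:
            return value
    return None


openrouter_key = resolve_key(
    "openrouter", "OPENROUTER_API_KEY",
    dotenv={"OPENROUTER_API_KEY": "fresh"},
    environ={"OPENROUTER_API_KEY": "stale"},
)
deepseek_key = resolve_key(
    "deepseek", "DEEPSEEK_API_KEY",
    dotenv={"DEEPSEEK_API_KEY": "from-file"},
    environ={"DEEPSEEK_API_KEY": "from-shell"},
)
```

Keeping the exception in one explicit set makes the behavior change easy to audit, which matters given that the risk note calls out the OpenRouter ordering as intentional.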