molecule-ai/hermes-agent

fix: resolve 5 misc test failures in hermes-agent#9 #14

Merged

claude-ceo-assistant merged 5 commits from fix/misc-test-failures-issue-9 into main

2026-05-08 21:11:13 +00:00

Author	SHA1	Message	Date
dev-lead	04d5633745	test(kanban-ws-auth): patch hermes_cli.web_server attribute alongside sys.modules entry Some checks failed Nix / nix (macos-latest) (pull_request) Waiting to run Details Supply Chain Audit / Scan PR for critical supply chain risks (pull_request) Successful in 42s Details Contributor Attribution Check / check-attribution (pull_request) Failing after 43s Details Tests / e2e (pull_request) Successful in 2m13s Details Nix / nix (ubuntu-latest) (pull_request) Failing after 14m21s Details Tests / test (pull_request) Failing after 23m16s Details `monkeypatch.setitem(sys.modules, "hermes_cli.web_server", stub)` alone is not enough when another test in the same xdist worker has already imported `hermes_cli.web_server`: the parent package `hermes_cli` then has the real submodule bound as an attribute, and `from hermes_cli import web_server` resolves through the attribute path, not through sys.modules. Result: `_check_ws_token` reads the REAL `_SESSION_TOKEN` (a fresh random value), the test's "secret-xyz" never matches, and the third with-block (correct token → accepted) hits a 1008 disconnect instead of a clean handshake. Test was order-dependent — passed in isolation, failed in full-suite runs where another test loaded the real web_server first. Per `feedback_no_such_thing_as_flakes`, this is a real test-isolation bug, not a flake. Fix: also `monkeypatch.setattr(hermes_cli, "web_server", stub, raising=False)` so both lookup paths see the stub. Inline comment documents the gotcha for the next reader. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 14:08:59 -07:00
claude-ceo-assistant	3697e6cea2	fix(tui_gateway): drop pending_title on ValueError, retain on transient errors Production bug + missing test coverage. `c5b4c48` ("fix: lazy session creation — defer DB row until first message (#18370)") moved pending_title application from the eager _start_agent_build path to a post-message-complete handler. The original block had: except ValueError as e: current["pending_title"] = None logger.info("Dropping pending title for session %s: %s", sid, e) except Exception: logger.warning("Failed to apply pending title ...", exc_info=True) …differentiating "title is invalid / duplicate, retrying won't help" (ValueError, drop) from "transient DB failure, retry on next message" (other Exception, keep + log). The replacement block collapsed both into: except Exception: pass # Best effort — auto-title will handle it below …so a duplicate-title session keeps the same dud pending_title forever, hitting set_session_title with the same losing argument on every message-complete. Auto-title can't kick in because pending_title still shadows it. Fix: extract a documented _apply_pending_session_title helper that restores the three-branch semantics (success → clear, ValueError → drop, other Exception → retain). Call it from the message-complete handler instead of the inline try/except. Test rewrite: the previous test_session_create_drops_pending_title_on_valueerror exercised an obsolete code path (eager apply during session.create) that no longer existed after `c5b4c48`. Replace with four focused tests against the helper: - drops_on_valueerror — invariant from the original test name - clears_on_success — happy path - retains_on_transient_exception — guards the new "don't lose title on a flaky DB" behaviour - no_op_without_pending — most calls hit this path Mutation-tested mentally: deleting the `session["pending_title"] = None` in the ValueError branch fails drops_on_valueerror; deleting the same in the success branch fails clears_on_success; widening except ValueError to except Exception fails retains_on_transient_exception. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:57:52 -07:00
claude-ceo-assistant	c8bc7cdab5	test(teams): patch adapter's TypingActivityInput binding for test_send_typing The teams adapter imports TypingActivityInput at module load time: try: from microsoft_teams.api.activities.typing import TypingActivityInput except ImportError: TypingActivityInput = None When the real microsoft_teams package isn't installed (CI runner image doesn't bundle Microsoft Teams SDK), the import fails and the local binding stays None — even though the test file's _ensure_teams_mock fixture registers a MockTypingActivityInput in sys.modules. The test-time mock-in-sys.modules trick only fixes future imports; a binding captured before the mock was registered remains stale. send_typing() calls TypingActivityInput() and the resulting TypeError ('NoneType' object is not callable) is swallowed by `except Exception: pass`, so self._app.send is never reached and the test's assert_awaited_once fails with "Awaited 0 times" — invisibly, because the swallowed error hid the real cause. Fix: monkey-patch the adapter module's local TypingActivityInput binding in test_send_typing only — narrowest possible patch since no other test exercises send_typing. Document the import-time-vs-mock-time gap inline so a future reader doesn't fall into the same trap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:57:51 -07:00
claude-ceo-assistant	ddbb1520c9	test(credential-pool): invert obsolete os.environ-wins test for #18254 fix The stale invariant "os.environ wins over .env" was deliberately inverted in `2ef1ad2` ("fix: prefer ~/.hermes/.env over os.environ when seeding credential pool"). The fix targets the case where a parent shell (Codex CLI, harness scripts) exports a stale OPENROUTER_API_KEY, the user updates ~/.hermes/.env with a fresh value, and Hermes silently 401s because auth.json cached the stale env-var. Rename + invert this test to assert the new invariant (.env wins). The positive load_pool coverage already exists in tests/agent/test_credential_pool.py::test_load_pool_prefers_dotenv_over_stale_os_environ (added in `0a6865b` alongside the fix); this case still serves a purpose because it exercises _seed_from_env directly, which is a separate code path from load_pool. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:57:51 -07:00
claude-ceo-assistant	d5f569581e	test(acp): update commands list snapshot for steer + queue acp_adapter/server.py:_ADVERTISED_COMMANDS now includes "steer" (inject guidance into active turn) and "queue" (run prompt after current turn finishes) between "compact" and "version". Production code is the source of truth; this test was the last reader still on the pre-feature snapshot. The substantive features were added in PRs that introduced steer/queue themselves; this is purely test-snapshot follow-through. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:57:50 -07:00