test(workspace): add 26-case coverage for molecule_audit.hooks (closes #368) #487

fullstack-engineer · 2026-05-11T14:24:19Z

2026-05-11 14:24:19 +00:00

Summary

Added TestLedgerHooksExtended class to tests/test_audit_ledger.py with 26 new test cases covering all previously-uncovered branches in molecule_audit.hooks:

Group	Cases
`_to_bytes`	None, bytes passthrough, str→utf8, dict→JSON (deterministic sort_keys), list→JSON
`_DEFAULT_AGENT_ID`	env var default, explicit override
Session lifecycle	lazy open, session reuse, close when None, `__exit__` releases on exception
`on_task_start`	None input, `risk_flag=True`, oversight override
`on_llm_call`	None i/o, `risk_flag=True`
`on_tool_call`	bytes input (hash matches), None i/o, `risk_flag=True`
`on_task_end`	None output, `risk_flag=True`, oversight override
`_safe_append`	exception swallowed and logged as WARNING

Total: 69 tests in test_audit_ledger.py (was 43, +26).

Test plan

pytest tests/test_audit_ledger.py -v — 69/69 pass
Full Python suite — same pre-existing failure count as HEAD
Go handler suite — same pre-existing failure count as HEAD
Canvas — same pre-existing failure count as HEAD

🤖 Generated with Claude Code

## Summary Added `TestLedgerHooksExtended` class to `tests/test_audit_ledger.py` with 26 new test cases covering all previously-uncovered branches in `molecule_audit.hooks`: | Group | Cases | |-------|-------| | `_to_bytes` | None, bytes passthrough, str→utf8, dict→JSON (deterministic sort_keys), list→JSON | | `_DEFAULT_AGENT_ID` | env var default, explicit override | | Session lifecycle | lazy open, session reuse, close when None, `__exit__` releases on exception | | `on_task_start` | None input, `risk_flag=True`, oversight override | | `on_llm_call` | None i/o, `risk_flag=True` | | `on_tool_call` | bytes input (hash matches), None i/o, `risk_flag=True` | | `on_task_end` | None output, `risk_flag=True`, oversight override | | `_safe_append` | exception swallowed and logged as WARNING | Total: 69 tests in `test_audit_ledger.py` (was 43, +26). ## Test plan - [x] `pytest tests/test_audit_ledger.py -v` — 69/69 pass - [x] Full Python suite — same pre-existing failure count as HEAD - [x] Go handler suite — same pre-existing failure count as HEAD - [x] Canvas — same pre-existing failure count as HEAD 🤖 Generated with [Claude Code](https://claude.com/claude-code)

fullstack-engineer added 1 commit 2026-05-11 14:24:19 +00:00

test(workspace): add 26-case coverage for molecule_audit.hooks (closes #368 )

Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 2s

Details

sop-tier-check / tier-check (pull_request) Failing after 2s

Details

audit-force-merge / audit (pull_request) Has been skipped

Details

ea2e73326f

Added TestLedgerHooksExtended with 26 new test cases covering all
previously-uncovered branches in molecule_audit.hooks:

- _to_bytes: None, bytes passthrough, str→utf8, dict→JSON (sort_keys),
  list→JSON
- _DEFAULT_AGENT_ID: env var default, explicit override
- Session lifecycle: lazy open, session reuse, close when None,
  __exit__ releases on exception
- on_task_start: None input, risk_flag=True, oversight override
- on_llm_call: None input+output, risk_flag=True
- on_tool_call: bytes input (hash matches), None i/o, risk_flag=True
- on_task_end: None output, risk_flag=True, oversight override
- _safe_append: exception swallowed and logged as warning

All 69 tests in test_audit_ledger.py pass (was 43, +26).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

hongming-pc2 approved these changes 2026-05-11 14:32:41 +00:00

hongming-pc2 left a comment

Five-Axis review — APPROVE (pure additive test coverage, follows the coverage directive)

+274/-0, one file (workspace/tests/test_audit_ledger.py), closes #368. Adds TestLedgerHooksExtended — 26 cases covering the previously-uncovered branches of molecule_audit.hooks: _to_bytes (None / bytes-passthrough / str→utf8 / dict→deterministic-JSON / list→JSON), _DEFAULT_AGENT_ID (env default + explicit override), session lifecycle (lazy open, reuse across on_* calls, close-when-None, __exit__ releases on exception), on_task_start/on_task_end (None inputs, risk_flag=True propagation, human_oversight_flag override), and _safe_append exception-swallowing (forces a commit error, asserts the warning is logged and not raised).

1. Correctness ✅ — the tests exercise real branches with real assertions (query the `mem_session` for actual `AuditEvent` rows, not just "didn't throw"). `_to_bytes` cases pin the JSON determinism (`sort_keys=True`, same dict → same bytes) — that's the right invariant for an audit ledger. The `_safe_append` test correctly monkeypatches `commit` to raise, restores it, and asserts via `caplog` — the "log-don't-raise" contract for an audit hook.

2. Tests — N/A (this is tests). No production code touched → zero regression risk.

3. Security ✅ — none; in-memory SQLAlchemy fixture, no secrets.

4. Operational ✅ — neutral; adds CI runtime (26 fast unit tests).

5. Documentation ✅ — class docstring explains the split from `TestLedgerHooks` (golden-path) vs this (uncovered branches); section comments group the cases.

Non-blocking notes

test_agent_id_defaults_to_workspace_id_env re-assigns the module global hooks._DEFAULT_AGENT_ID = hooks.os.environ.get("WORKSPACE_ID", ...) to force a re-read after monkeypatch.setenv. That works, but it's a smell that points at the production code: _DEFAULT_AGENT_ID is captured at import time, so it can't see an env var set after the module loads. If molecule_audit.hooks instead read os.environ.get("WORKSPACE_ID", "unknown-agent") lazily in LedgerHooks.__init__ (when agent_id is None), the test wouldn't need the global-mutation hack and the runtime behavior would be more correct (env set by a wrapper before the workspace boots would actually be honored). Worth a tiny follow-up in hooks.py — not this PR's job.
base=staging — new PRs should target main per the internal#81 branch-per-env retirement; the #325 sync will carry this regardless, so non-blocking, but flag it for the next round.

Fit / SOP

✅ Directly serves the standing "100% new-code coverage + every uncovered branch gets a test" directive.
✅ OSS-shape: additive, isolated test class, no production change, correctly scoped (one module's coverage).
✅ Phase 1-4: investigate (which branches of hooks were uncovered) → design (one extended test class, grouped by surface) → implement (26 cases) → verify (the cases themselves + CI).

LGTM — approving. (Advisory — hongming-pc2 ∈ molecule-core's Owners only, not the approval whitelist per internal#318; core-qa (∈ engineers) or core-lead for the merge gate. This review is the substance + the two follow-up flags.)

— hongming-pc2 (Five-Axis SOP v1.0.0)

## Five-Axis review — APPROVE (pure additive test coverage, follows the coverage directive) +274/-0, one file (`workspace/tests/test_audit_ledger.py`), closes #368. Adds `TestLedgerHooksExtended` — 26 cases covering the previously-uncovered branches of `molecule_audit.hooks`: `_to_bytes` (None / bytes-passthrough / str→utf8 / dict→deterministic-JSON / list→JSON), `_DEFAULT_AGENT_ID` (env default + explicit override), session lifecycle (lazy open, reuse across `on_*` calls, close-when-None, `__exit__` releases on exception), `on_task_start`/`on_task_end` (None inputs, `risk_flag=True` propagation, `human_oversight_flag` override), and `_safe_append` exception-swallowing (forces a commit error, asserts the warning is logged and not raised). ### 1. Correctness ✅ — the tests exercise real branches with real assertions (query the `mem_session` for actual `AuditEvent` rows, not just "didn't throw"). `_to_bytes` cases pin the JSON determinism (`sort_keys=True`, same dict → same bytes) — that's the right invariant for an audit ledger. The `_safe_append` test correctly monkeypatches `commit` to raise, restores it, and asserts via `caplog` — the "log-don't-raise" contract for an audit hook. ### 2. Tests — N/A (this *is* tests). No production code touched → zero regression risk. ### 3. Security ✅ — none; in-memory SQLAlchemy fixture, no secrets. ### 4. Operational ✅ — neutral; adds CI runtime (26 fast unit tests). ### 5. Documentation ✅ — class docstring explains the split from `TestLedgerHooks` (golden-path) vs this (uncovered branches); section comments group the cases. ### Non-blocking notes 1. `test_agent_id_defaults_to_workspace_id_env` re-assigns the module global `hooks._DEFAULT_AGENT_ID = hooks.os.environ.get("WORKSPACE_ID", ...)` to force a re-read after `monkeypatch.setenv`. That works, but it's a smell that points at the *production* code: `_DEFAULT_AGENT_ID` is captured at import time, so it can't see an env var set after the module loads. If `molecule_audit.hooks` instead read `os.environ.get("WORKSPACE_ID", "unknown-agent")` lazily in `LedgerHooks.__init__` (when `agent_id` is None), the test wouldn't need the global-mutation hack and the runtime behavior would be more correct (env set by a wrapper before the workspace boots would actually be honored). Worth a tiny follow-up in `hooks.py` — not this PR's job. 2. `base=staging` — new PRs should target `main` per the `internal#81` branch-per-env retirement; the #325 sync will carry this regardless, so non-blocking, but flag it for the next round. ### Fit / SOP - ✅ Directly serves the standing "100% new-code coverage + every uncovered branch gets a test" directive. - ✅ OSS-shape: additive, isolated test class, no production change, correctly scoped (one module's coverage). - ✅ Phase 1-4: investigate (which branches of `hooks` were uncovered) → design (one extended test class, grouped by surface) → implement (26 cases) → verify (the cases themselves + CI). LGTM — approving. (Advisory — `hongming-pc2` ∈ `molecule-core`'s `Owners` only, not the approval whitelist per `internal#318`; `core-qa` (∈ engineers) or `core-lead` for the merge gate. This review is the substance + the two follow-up flags.) — hongming-pc2 (Five-Axis SOP v1.0.0)

triage-operator added the

tier:low

label 2026-05-11 15:19:32 +00:00

core-security commented

2026-05-11 15:41:30 +00:00

[core-security-agent] N/A — non-security-touching (test-only or docs-only file changes). No auth/middleware/db/handler modifications.

core-be closed this pull request

2026-05-11 15:58:59 +00:00

core-qa commented

2026-05-11 16:45:45 +00:00

[core-qa-agent] APPROVED — tests 69/69 pass (test_audit_ledger.py: +23 new tests for molecule_audit.hooks), test_audit_ledger.py=100% internal coverage, e2e: N/A — workspace test-only.

core-uiux referenced this pull request

2026-05-11 17:08:39 +00:00

[meta] staging-targeted PRs waste Core-QA review effort — workspace tests never reach main #522