hermes-agent/tests/tools/test_dockerfile_pid1_reaping.py
dev-lead bbd91bfa23
Some checks failed
Nix / nix (macos-latest) (pull_request) Waiting to run
Contributor Attribution Check / check-attribution (pull_request) Failing after 11s
Supply Chain Audit / Scan PR for critical supply chain risks (pull_request) Successful in 11s
Tests / e2e (pull_request) Successful in 32s
Nix / nix (ubuntu-latest) (pull_request) Failing after 12m25s
Tests / test (pull_request) Failing after 14m46s
test(dockerfile): align tui-resolution assertions with post-a49f4c6 design
Two contract tests in test_dockerfile_pid1_reaping.py guarded properties
of the @hermes/ink materialization dance introduced in 5f215b13 (PR
#16690): ``--prefix node_modules/@hermes/ink`` install + ``omit=dev`` +
nested-react cleanup + ``await import('@hermes/ink')`` smoke check.

That mechanism was retired in a49f4c6 ("fix: prevent tui rebuilding
assets") in favour of a simpler approach:

  1. Copy the full ``ui-tui/packages/hermes-ink/`` tree (not just
     manifests) so npm can resolve the ``file:`` workspace dep against
     real content rather than a bare package.json.
  2. Set ``ENV npm_config_install_links=false`` to force npm to install
     ``file:`` deps as symlinks on Debian's bundled npm 9.x (which
     defaults to ``install-links=true`` and installs as copies). The
     host-side lockfile is generated by npm 10+ using symlinks, so
     install-as-copy produces a hidden node_modules/.package-lock.json
     that permanently disagrees with the root lock on @hermes/ink — and
     that disagreement trips the TUI launcher's
     ``_tui_need_npm_install()`` check on every startup, triggering a
     runtime ``npm install`` that fails with EACCES.

The tests were never updated for the new design; they remained pinned
to the retired materialization step and the manifest-only COPY shape.

This commit:

- Updates ``test_dockerfile_installs_tui_dependencies`` to assert the
  full ``COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/``
  shape — catches a regression that reverts to manifest-only copies.

- Replaces ``test_dockerfile_materializes_local_tui_ink_package`` with
  ``test_dockerfile_forces_npm_install_links_false_for_workspace_resolution``,
  which scans the parsed instruction list for an ENV directive (not a
  comment) setting ``npm_config_install_links=false``. Negative-tested:
  removing only the ENV line correctly fails the assertion even with
  the explanatory comment block above it left intact.

PID-1 reaping tests (the file's primary purpose) are unmodified and
continue to assert tini install + ENTRYPOINT routing.

Partial close hermes-agent#9 — addresses 2 of the ~28 real failures
surfaced after the disk-pressure fix; does not touch the other ~19+
unrelated test failures in that issue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:44:51 -07:00

172 lines
6.7 KiB
Python

"""Contract tests for the container Dockerfile.
These tests assert invariants about how the Dockerfile composes its runtime —
they deliberately avoid snapshotting specific package versions, line numbers,
or exact flag choices. What they DO assert is that the Dockerfile maintains
the properties required for correct production behaviour:
- A PID-1 init (tini) is installed and wraps the entrypoint, so that orphaned
subprocesses (MCP stdio servers, git, bun, browser daemons) get reaped
instead of accumulating as zombies (#15012).
- Signal forwarding runs through the init so ``docker stop`` triggers
hermes's own graceful-shutdown path.
"""
from __future__ import annotations
from pathlib import Path
import pytest
REPO_ROOT = Path(__file__).resolve().parents[2]
DOCKERFILE = REPO_ROOT / "Dockerfile"
DOCKERIGNORE = REPO_ROOT / ".dockerignore"
@pytest.fixture(scope="module")
def dockerfile_text() -> str:
if not DOCKERFILE.exists():
pytest.skip("Dockerfile not present in this checkout")
return DOCKERFILE.read_text()
def _dockerfile_instructions(dockerfile_text: str) -> list[str]:
instructions: list[str] = []
current = ""
for raw_line in dockerfile_text.splitlines():
line = raw_line.strip()
if not line or line.startswith("#"):
continue
continued = line.removesuffix("\\").strip()
current = f"{current} {continued}".strip()
if not line.endswith("\\"):
instructions.append(current)
current = ""
return instructions
def _run_steps(dockerfile_text: str) -> list[str]:
return [
instruction
for instruction in _dockerfile_instructions(dockerfile_text)
if instruction.startswith("RUN ")
]
def test_dockerfile_installs_an_init_for_zombie_reaping(dockerfile_text):
"""Some init (tini, dumb-init, catatonit) must be installed.
Without a PID-1 init that handles SIGCHLD, hermes accumulates zombie
processes from MCP stdio subprocesses, git operations, browser
daemons, etc. In long-running Docker deployments this eventually
exhausts the PID table.
"""
# Accept any of the common reapers. The contract is behavioural:
# something must be installed that reaps orphans.
known_inits = ("tini", "dumb-init", "catatonit")
installed = any(name in dockerfile_text for name in known_inits)
assert installed, (
"No PID-1 init detected in Dockerfile (looked for: "
f"{', '.join(known_inits)}). Without an init process to reap "
"orphaned subprocesses, hermes accumulates zombies in Docker "
"deployments. See issue #15012."
)
def test_dockerfile_entrypoint_routes_through_the_init(dockerfile_text):
"""The ENTRYPOINT must invoke the init, not the entrypoint script directly.
Installing tini is only half the fix — the container must actually run
with tini as PID 1. If the ENTRYPOINT executes the shell script
directly, the shell becomes PID 1 and will ``exec`` into hermes,
which then runs as PID 1 without any zombie reaping.
"""
# Find the last uncommented ENTRYPOINT line — Docker honours the final one.
entrypoint_line = None
for raw_line in dockerfile_text.splitlines():
line = raw_line.strip()
if line.startswith("#"):
continue
if line.startswith("ENTRYPOINT"):
entrypoint_line = line
assert entrypoint_line is not None, "Dockerfile is missing an ENTRYPOINT directive"
known_inits = ("tini", "dumb-init", "catatonit")
routes_through_init = any(name in entrypoint_line for name in known_inits)
assert routes_through_init, (
f"ENTRYPOINT does not route through an init: {entrypoint_line!r}. "
"If tini is only installed but not wired into ENTRYPOINT, hermes "
"still runs as PID 1 and zombies will accumulate (#15012)."
)
def test_dockerfile_installs_tui_dependencies(dockerfile_text):
"""The Dockerfile must install ui-tui's npm dependencies during build,
and must copy the @hermes/ink workspace tree (not just its manifests)
so npm can resolve the ``file:`` workspace dep without falling back to
the bare manifest. See PR #16690 + a49f4c6 for the design.
"""
assert "ui-tui/package.json" in dockerfile_text
# ui-tui/packages/hermes-ink/ is referenced as a `file:` workspace dep
# from ui-tui/package.json. Copying the FULL tree (rather than just
# package.json + package-lock.json as in earlier revisions) is what lets
# npm resolve the workspace to real content. This assertion catches a
# regression that reverts to manifest-only copies.
assert "COPY ui-tui/packages/hermes-ink/ ui-tui/packages/hermes-ink/" in dockerfile_text
assert any(
"ui-tui" in step and "npm" in step and (" install" in step or " ci" in step)
for step in _run_steps(dockerfile_text)
)
def test_dockerfile_builds_tui_assets(dockerfile_text):
assert any(
"ui-tui" in step and "npm" in step and "run build" in step
for step in _run_steps(dockerfile_text)
)
def test_dockerfile_forces_npm_install_links_false_for_workspace_resolution(dockerfile_text):
"""The Dockerfile must force npm to install ``file:`` deps as symlinks
rather than copies.
Debian's bundled npm 9.x defaults to ``install-links=true`` (deps
installed as copies). The host-side ``ui-tui/package-lock.json`` is
generated by npm 10+ which uses symlinks, so an install-as-copy in the
image produces a hidden ``node_modules/.package-lock.json`` that
permanently disagrees with the root lockfile on the @hermes/ink entry.
That disagreement trips the TUI launcher's ``_tui_need_npm_install()``
check on every startup and triggers a runtime ``npm install`` that
fails with EACCES (node_modules/ is root-owned from build time).
This assertion replaces the older ``--prefix node_modules/@hermes/ink``
materialization smoke test (PR #16690), which was retired in a49f4c6
in favour of ``install-links=false`` because the materialization step
rebuilt TUI assets unnecessarily on every container start.
"""
instructions = _dockerfile_instructions(dockerfile_text)
has_env_directive = any(
instr.startswith("ENV ") and "npm_config_install_links=false" in instr
for instr in instructions
)
assert has_env_directive, (
"ENV npm_config_install_links=false missing — without it, Debian npm 9.x "
"installs `file:` deps as copies, breaking @hermes/ink workspace "
"resolution at runtime. See PR #16690 + a49f4c6."
)
def test_dockerignore_excludes_nested_dependency_dirs():
if not DOCKERIGNORE.exists():
pytest.skip(".dockerignore not present in this checkout")
text = DOCKERIGNORE.read_text()
assert "**/node_modules" in text
assert "**/.venv" in text