docs(sdk): add KI-009 — run_heartbeat_loop has no external stop mechanism

The heartbeat loop runs unbounded with no way for an external caller
(SIGTERM handler, MCP client disconnect) to signal it to exit cleanly.
This causes orphaned heartbeat API calls after the controlling client
has disconnected.

Suggested fix: add stop_event parameter (threading.Event) to
run_heartbeat_loop() so callers can achieve clean shutdown.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Molecule AI · sdk-dev 2026-05-11 04:50:19 +00:00
parent 84fc25da2a
commit 6c94ceaeee

@@ -253,3 +253,47 @@ def _is_hex(value: str) -> bool:
`tests/conftest.py` exists with the `_CaptureHandler` stub definition.
`pytest tests/test_call_peer_errors.py` runs all 12 tests cleanly.
`pytest tests/` collects all test files with no collection errors.
---
## KI-009 — `run_heartbeat_loop()` does not honour external stop signals
**File:** `molecule_agent/client.py` (`RemoteAgentClient.run_heartbeat_loop`)
**Status:** Identified
**Severity:** Low
### Symptom
`run_heartbeat_loop()` runs a `while True` loop with `sleep(heartbeat_interval)`
between iterations. Apart from the optional `max_iterations` cap, there is no mechanism
for an external caller to signal the loop to exit cleanly. If the MCP client that launched
the remote agent disconnects (e.g. via SSE stream close), the heartbeat loop keeps running
until `max_iterations` is reached or the process is killed externally. With no cap set,
it runs indefinitely.
### Impact
Orphaned heartbeat processes continue consuming platform API quota after the controlling
MCP client has disconnected. Each iteration sends a `POST /registry/heartbeat` and a
`GET /workspaces/:id/state` call, so an orphaned loop makes two unnecessary API calls
every `heartbeat_interval` for as long as it survives.
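For a rough sense of scale, the sketch below estimates the calls made by one orphaned loop. The two calls per iteration come from the description above. The 30-second interval is a hypothetical value chosen for illustration, since the actual `heartbeat_interval` is not stated here, and `orphaned_calls` is an illustrative helper, not part of the SDK.

```python
CALLS_PER_ITERATION = 2  # POST /registry/heartbeat + GET /workspaces/:id/state


def orphaned_calls(heartbeat_interval_s: float, orphaned_for_s: float) -> int:
    """Approximate API calls made by one orphaned loop, ignoring request latency."""
    iterations = int(orphaned_for_s // heartbeat_interval_s)
    return iterations * CALLS_PER_ITERATION


# Hypothetical 30 s interval, loop left running for 24 hours.
print(orphaned_calls(30, 24 * 60 * 60))  # 5760
```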
### Suggested fix
Add an optional `stop_event` parameter to `run_heartbeat_loop()`: a `threading.Event`
(or an `asyncio.Event` for an async variant of the loop) that, when set, causes the loop
to exit cleanly and return `"stopped"`:
```python
import threading  # needed for the stop_event annotation

def run_heartbeat_loop(
    self,
    max_iterations: int | None = None,
    task_supplier: "callable | None" = None,
    stop_event: threading.Event | None = None,
) -> str:
    i = 0
    while True:
        # Check for an external stop request before doing any work.
        if stop_event is not None and stop_event.is_set():
            return "stopped"
        if max_iterations is not None and i >= max_iterations:
            return "max_iterations"
        # ... rest of loop
```
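Callers such as MCP client wrappers can then call `stop_event.set()` from a
SIGTERM/SIGINT handler, or on client disconnect, to achieve a clean shutdown. Shell
scripts cannot set the event directly, but they can send SIGTERM to a process that
installs such a handler.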