Production fix:
- wait_for_message exceptions now trigger exponential backoff (1s → 60s
cap, resets to 0 on first successful poll) instead of a flat 1s retry.
During a platform outage, N daemons on a flat 1Hz retry would hammer the
endpoint unnecessarily; the cap-and-reset shape keeps the daemon
responsive while being a good citizen.
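
A minimal sketch of the cap-and-reset shape described above, not the actual bridge.py code. The doubling factor, the Backoff class, and the poll_loop signature are assumptions for illustration; only the 1s start, the 60s cap, and the reset on success come from this commit.

```python
INITIAL_DELAY_S = 1.0  # from "1s" in this commit
MAX_DELAY_S = 60.0     # from "60s cap" in this commit


class Backoff:
    """Hypothetical helper: doubling delay with a cap, cleared on success."""

    def __init__(self, initial=INITIAL_DELAY_S, cap=MAX_DELAY_S):
        self.initial = initial
        self.cap = cap
        self.current = 0.0  # 0 means no failure is pending

    def next_delay(self):
        # First failure waits `initial`; each later failure doubles
        # (assumed factor of 2), never exceeding `cap`.
        if self.current == 0:
            self.current = self.initial
        else:
            self.current = min(self.current * 2, self.cap)
        return self.current

    def reset(self):
        self.current = 0.0


def poll_loop(wait_for_message, handle, sleep, max_iterations):
    """Hypothetical loop: back off on exceptions, reset on the first good poll."""
    backoff = Backoff()
    for _ in range(max_iterations):
        try:
            message = wait_for_message()
        except Exception:
            sleep(backoff.next_delay())
            continue
        backoff.reset()
        handle(message)
```

Passing sleep in as a parameter lets a test record the delays instead of actually waiting, which is one way a backoff progression can be asserted quickly.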
Correctness gate:
- Test coverage for the error branches that operators actually hit:
the backoff progression itself, backoff reset on first success,
inbox_pop failure (codex must not re-run the same message),
peer_agent without peer_id (poison drained, not looped),
unknown message kind (poison drained, not looped),
empty codex output (placeholder reply, not silent drop),
canvas_user falling back to workspace_id when arrival_workspace_id
is absent, and four malformed-payload shapes from wait_for_message
(parametrised: invalid JSON, non-dict, timeout sentinel, missing
activity_id).
- Backoff tests verified to FAIL on the old flat-1s code by stashing
only bridge.py and re-running — pinning a real regression, not a
tautology.
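
The four malformed-payload shapes can be sketched as a small case table, roughly as follows. This is an illustration under assumptions, not the real test: parse_poll_result, the TIMEOUT_SENTINEL value, and the "return None and skip" behaviour are hypothetical, since the actual sentinel and handling live in bridge.py and are not shown here.

```python
import json

# Hypothetical sentinel; the real timeout marker is not shown in this commit.
TIMEOUT_SENTINEL = "TIMEOUT"


def parse_poll_result(raw):
    """Hypothetical validator: return the payload dict, or None to skip it."""
    if raw == TIMEOUT_SENTINEL:
        return None
    try:
        payload = json.loads(raw)
    except (TypeError, ValueError):
        # Covers invalid JSON text and non-string input.
        return None
    if not isinstance(payload, dict):
        return None
    if "activity_id" not in payload:
        return None
    return payload


# One row per malformed shape named in this commit.
MALFORMED_CASES = [
    ("invalid JSON", "{not json"),
    ("non-dict", "[1, 2, 3]"),
    ("timeout sentinel", TIMEOUT_SENTINEL),
    ("missing activity_id", '{"kind": "canvas_user"}'),
]

for label, raw in MALFORMED_CASES:
    assert parse_poll_result(raw) is None, label
```

Under pytest the same table would typically be passed to pytest.mark.parametrize so that each shape reports as its own test.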
Cleanup:
- _RealTools imports molecule_runtime.a2a_tools once at construction
instead of four times per message.
- README documents CODEX_CHANNEL_MOLECULE_STATE_DIR override.
Test: pytest -q → 28 passed (was 17).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>