forked from molecule-ai/molecule-core
Two paired fixes that together let an external operator run a single
process (molecule-mcp) and see their workspace come up online in the
canvas — the bug surfaced live when status stuck at "awaiting_agent /
OFFLINE" despite an active MCP server.
Platform side (workspace-server/internal/handlers/registry.go):
Heartbeat handler already auto-recovers offline → online and
provisioning → online, but NOT awaiting_agent → online. Healthsweep
flips stale-heartbeat external workspaces TO awaiting_agent, and
with no recovery path the workspace stays "OFFLINE — Restart" in the
canvas forever. Add the symmetric branch: if currentStatus ==
"awaiting_agent" and a heartbeat arrives, flip to online + broadcast
WORKSPACE_ONLINE. Mirrors the existing offline/provisioning patterns
exactly. Test: TestHeartbeatHandler_AwaitingAgentToOnline asserts
the SQL UPDATE fires with the awaiting_agent guard clause.
Wheel side (workspace/mcp_cli.py):
molecule-mcp was outbound-only — operators had to run a separate
SDK process to register + heartbeat. Now mcp_cli.main():
1. Calls /registry/register at startup (idempotent upsert flips
status awaiting_agent → online via the existing register path).
2. Spawns a daemon thread that POSTs /registry/heartbeat every
20s. 20s is comfortably under the healthsweep stale window so
a single missed beat doesn't cause status churn.
3. Runs the MCP stdio loop in the foreground.
Both calls set Origin: ${PLATFORM_URL} so the SaaS edge WAF accepts
them. Threaded heartbeat (not asyncio) chosen because it doesn't
need to share an event loop with the MCP stdio server — daemon=True
cleanly dies when the operator's runtime exits.
MOLECULE_MCP_DISABLE_HEARTBEAT=1 escape hatch lets in-container
callers (which have heartbeat.py running already) reuse the entry
point without double-heartbeating. Default is enabled.
End-to-end verification (live, against
hongmingwang.moleculesai.app, workspace 8dad3e29-...):
pre-fix: status=awaiting_agent → canvas shows OFFLINE forever
post-fix: ran `molecule-mcp` for 5s standalone → canvas state:
status=online runtime=external agent=molecule-mcp-8dad3e29
Test coverage: 7 new mcp_cli tests (register-at-startup, heartbeat-
thread-spawned, disable-env-skips-both, env-and-file token resolution,
register payload shape, heartbeat endpoint + headers); 1 new platform
test (awaiting_agent → online recovery). Full workspace + handlers
suites green: 1355 Python, full Go handlers passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| adapters | ||
| builtin_tools | ||
| lib | ||
| molecule_audit | ||
| platform_tools | ||
| plugins_registry | ||
| policies | ||
| scripts | ||
| skill_loader | ||
| tests | ||
| .coveragerc | ||
| a2a_cli.py | ||
| a2a_client.py | ||
| a2a_executor.py | ||
| a2a_mcp_server.py | ||
| a2a_tools.py | ||
| adapter_base.py | ||
| agent.py | ||
| agents_md.py | ||
| build-all.sh | ||
| config.py | ||
| consolidation.py | ||
| coordinator.py | ||
| Dockerfile | ||
| entrypoint.sh | ||
| events.py | ||
| executor_helpers.py | ||
| heartbeat.py | ||
| initial_prompt.py | ||
| internal_chat_uploads.py | ||
| internal_file_read.py | ||
| main.py | ||
| mcp_cli.py | ||
| molecule_ai_status.py | ||
| platform_auth.py | ||
| platform_inbound_auth.py | ||
| plugins.py | ||
| preflight.py | ||
| prompt.py | ||
| pytest.ini | ||
| rebuild-runtime-images.sh | ||
| requirements.txt | ||
| runtime_wedge.py | ||
| shared_runtime.py | ||
| transcript_auth.py | ||
| watcher.py | ||