Renames:
- platform/ → workspace-server/ (Go module path stays as "platform" for external dep compat — will update after plugin module republish)
- workspace-template/ → workspace/

Removed (moved to separate repos or deleted):
- PLAN.md — internal roadmap (move to private project board)
- HANDOFF.md, AGENTS.md — one-time internal session docs
- .claude/ — gitignored entirely (local agent config)
- infra/cloudflare-worker/ → Molecule-AI/molecule-tenant-proxy
- org-templates/molecule-dev/ → standalone template repo
- .mcp-eval/ → molecule-mcp-server repo
- test-results/ — ephemeral, gitignored

Security scrubbing:
- Cloudflare account/zone/KV IDs → placeholders
- Real EC2 IPs → <EC2_IP> in all docs
- CF token prefix, Neon project ID, Fly app names → redacted
- Langfuse dev credentials → parameterized
- Personal runner username/machine name → generic

Community files:
- CONTRIBUTING.md — build, test, branch conventions
- CODE_OF_CONDUCT.md — Contributor Covenant 2.1

All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml, README, CLAUDE.md updated for new directory names.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 — edit history
Summary — Quality + Infra Pass (PRs #1–#8, all merged)
Eight PRs landed today in a focused quality pass. No user-facing feature changes; the payoff is faster onboarding, lower merge friction, and stronger CI gates.
Brand + structural
- PR #1 `chore/branding-icons` — replaced `molecule-icon.png` across `canvas/public/`, `canvas/src/app/`, `docs/assets/branding/`; added `HANDOFF.md` at the repo root; fixed a comment typo in `.githooks/pre-commit`.
- PR #3 `chore/structural-cleanup` — deleted empty `workspace-server/plugins/`; moved `examples/remote-agent/` → `sdk/python/examples/remote-agent/` and `docs/superpowers/plans/` → `plugins/superpowers/plans/`; added READMEs to `tests/` and `docs/`; gitignored `.agents/`, `workspace-server/workspace-configs-templates/`, `backups/`, `logs/`, `test-results/`.
- LICENSE: trailing brand-migration fix — "Agent Molecule" → "Molecule AI".
MCP server refactor (PRs #2, #4, #7)
- `mcp-server/src/index.ts` shrank from 1697 → 89 lines. Tool handlers now live in per-domain modules under `mcp-server/src/tools/`: `workspaces.ts`, `agents.ts`, `secrets.ts`, `files.ts`, `memory.ts`, `plugins.ts`, `channels.ts`, `delegation.ts`, `schedules.ts`, `approvals.ts`, `discovery.ts`, `remote_agents.ts`.
- New shared HTTP layer `mcp-server/src/api.ts` exports `PLATFORM_URL`, generic `apiCall<T>`, `ApiError` type, `isApiError()` guard, `toMcpResult()`, `toMcpText()`.
- Each `tools/*.ts` exports handlers + a `registerXxxTools(srv)` function. `createServer()` in `index.ts` wires them.
- Fixed `handleGetRemoteAgentSetupCommand` — emits a valid `python3 -c "from molecule_agent import RemoteAgentClient; …"` one-liner (was an invalid `python3 -m examples.remote-agent.run`).
- MCP now reports 87 tools on startup (older logs / docs said "61" — both updated).
Canvas (PRs, shipped across session)
- Replaced native `window.confirm`/`alert` with `ConfirmDialog` in seven sites: `ChannelsTab.tsx`, `ScheduleTab.tsx`, `ChatTab.tsx`, `TemplatePalette.tsx` (×2), `ErrorBoundary.tsx` (×2 removed; buttons are self-evident).
- New `singleButton` prop on `ConfirmDialog` for info-toast usage, plus 5 new vitest cases at `canvas/src/components/__tests__/ConfirmDialog.test.tsx`.
- `ErrorBoundary` clipboard write now catches rejections and logs to `console.warn`.
- Vitest count: 352 → 357.
Platform — handler decomposition (pure refactor)
Four oversize handler functions split into private helpers — behavior unchanged, but each extracted helper is now directly unit-tested.
- `a2a_proxy.go::proxyA2ARequest` (257 → 56 lines). New helpers: `resolveAgentURL`, `normalizeA2APayload`, `dispatchA2A`, `handleA2ADispatchError`, `maybeMarkContainerDead`, `logA2AFailure`, `logA2ASuccess`; sentinel `proxyDispatchBuildError`.
- `delegation.go::Delegate` (127 → 60 lines). New helpers: `bindDelegateRequest`, `lookupIdempotentDelegation`, `insertDelegationRow`; typed `insertDelegationOutcome` enum (zero value `insertOutcomeUnknown`) replaces a positional `(bool, bool)` return.
- `discovery.go::Discover` (125 → 40 lines). New helpers: `discoverWorkspacePeer`, `writeExternalWorkspaceURL`, `discoverHostPeer`.
- `activity.go::SessionSearch` (109 → 24 lines). New helpers: `parseSessionSearchParams`, `buildSessionSearchQuery`, `scanSessionSearchRows`.
- +47 Go unit tests; `workspace-server/internal/handlers` coverage 56.1 % → 57.6 %.
Config / env documentation
`.env.example` gained 11 previously-undocumented env vars across 6 new sections: `PLATFORM_URL`, `MOLECULE_URL`, `WORKSPACE_DIR`, `MOLECULE_ENV`, `CORS_ORIGINS`, `RATE_LIMIT`, `ACTIVITY_RETENTION_DAYS`, `ACTIVITY_CLEANUP_INTERVAL_HOURS`, `MOLECULE_IN_DOCKER`, `AWARENESS_URL`, `GITHUB_WEBHOOK_SECRET`, `MOLECLI_URL`. All 21 distinct `os.Getenv`/`envx.*` keys (except HOME) are now documented.
E2E + CI (PRs #5, #7, #8)
- New shared helpers `tests/e2e/_lib.sh` and `tests/e2e/_extract_token.py`.
- `tests/e2e/test_api.sh` updated for Phase 30.1 bearer-token auth and the Phase 30.6 `X-Workspace-ID` requirement on discover/peers; added a pre-test workspace cleanup. 62/62 pass.
- `tests/e2e/test_comprehensive_e2e.sh` fixed the token race against the provisioner by registering each workspace immediately after creation. 67/67 pass.
- `tests/e2e/test_activity_e2e.sh` re-registers a detected agent to capture its bearer token.
- `tests/e2e/test_claude_code_e2e.sh` got shellcheck annotations only.
- All five scripts are shellcheck-clean.
- `.github/workflows/ci.yml` gained two new jobs:
  - `e2e-api` — Postgres + Redis service containers, migrations applied via `docker exec`, `test_api.sh` runs against a freshly-built platform binary.
  - `shellcheck` — marketplace action lints every `tests/e2e/*.sh`.
- Existing Go job got `cache: true` on `setup-go`.
- Bundle round-trip and "status online" assertions now tolerate the async provisioner flipping status, removing flaky false-negatives.
Test totals after today's sync
| Stack | Before | After |
|---|---|---|
| Go (platform) | 648 | 695 |
| Python (workspace) | 1140 | 1140 |
| Canvas (vitest) | 352 | 357 |
| SDK (pytest) | 132 | 132 |
| MCP server (Jest) | 96 | 97 |
Note: only Go (+47 direct tests for extracted handler helpers), canvas (+5 ConfirmDialog singleButton tests), and MCP (+1 createServer smoke test) gained tests today. Python workspace + SDK counts are the pre-session baseline — no pytest additions today. The earlier "1078 / 87" numbers in this session were stale CLAUDE.md baselines, not measurements.
Canvas — org template import (PLAN.md §20.3)
What: added OrgTemplatesSection to canvas/src/components/TemplatePalette.tsx.
Lists org templates from GET /org/templates, each entry shows
name + description + workspace count + an "Import org" button that
POSTs { dir } to /org/import. Renders inside the existing
template-palette sidebar, below the workspace template list.
Why: PLAN.md §20.3 had this checkbox unchecked. Platform
already exposes the endpoints (handlers/org.go); only the canvas
wiring was missing. Authors today have to curl to instantiate
multi-workspace orgs — a poor UX given we already curate
org-templates/molecule-dev, reno-stars, etc.
How tested: extracted fetchOrgTemplates() and importOrgTemplate()
as standalone exports so they're unit-testable in the existing
node-only vitest config (no jsdom). 7 new tests cover happy path,
non-2xx response, network failure, POST body shape, error
propagation, and module exports. Canvas vitest 345 → 352.
Branch: feat/canvas-org-template-import.
Platform — fix #106: plugin uninstall cleanup
Bug: DELETE /workspaces/:id/plugins/:name only removed
/configs/plugins/<name>/. Skill dirs copied out to
/configs/skills/<skill>/ and rule blocks appended to
/configs/CLAUDE.md by AgentskillsAdaptor.install were left
behind, so they reappeared after every container auto-restart.
Fix (workspace-server/internal/handlers/plugins.go::Uninstall):
before the existing plugin-dir removal, the handler now:
- Reads `/configs/plugins/<name>/plugin.yaml` from the container to learn the plugin's declared `skills:` list.
- Strips every `# Plugin: <name>` / … block from `/configs/CLAUDE.md` via an awk script that mirrors `AgentskillsAdaptor.uninstall`'s block layout (marker → blank → content → blank). Other plugins' markers and surrounding user content stay intact.
- `rm -rf` each declared skill dir under `/configs/skills/` (with `validatePluginName` defense against malformed manifest skill names).
- Then proceeds with the existing `rm -rf /configs/plugins/<name>`.
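The block layout the awk script targets (marker → blank → content → blank) is easy to mis-handle, so here is an illustrative Python re-implementation of the same stripping contract. This is a sketch only — the production path is an awk pipeline run inside the container, and the function name here is hypothetical:

```python
def strip_plugin_blocks(claude_md: str, plugin_name: str) -> str:
    """Remove every '# Plugin: <name>' block (marker line, blank line,
    content lines, trailing blank line) for one plugin, leaving other
    plugins' blocks and surrounding user content untouched."""
    marker = f"# Plugin: {plugin_name}"
    lines = claude_md.split("\n")
    out, i = [], 0
    while i < len(lines):
        if lines[i].strip() == marker:
            i += 1                                      # skip the marker line
            if i < len(lines) and lines[i].strip() == "":
                i += 1                                  # skip the leading blank
            while i < len(lines) and lines[i].strip() != "":
                i += 1                                  # skip the block content
            if i < len(lines) and lines[i].strip() == "":
                i += 1                                  # skip the trailing blank
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out)
```

A missing CLAUDE.md maps to the empty-string input here, which trivially round-trips — the same no-op property `TestStripPluginMarkers_MissingFileIsNoOp` asserts on the real code path.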
Tests (workspace-server/internal/handlers/plugins_test.go):
- `TestRegexpEscapeForAwk` — verifies `/`, `.`, `[]`, `*+?|`, `\\`, empty string all escape correctly. Caught a real bug (forgot `/`; awk treated the marker as a broken regex delimiter).
- `TestStripPluginMarkers_AwkScript` — runs the exact awk pipeline the production code uses against a fixture CLAUDE.md with two `my-plugin` blocks, a `keep-me` block, and surrounding user content. Asserts both my-plugin blocks (marker + content) are gone, keep-me + user content intact, including trailing user content after the last my-plugin block.
- `TestStripPluginMarkers_MissingFileIsNoOp` — missing CLAUDE.md must not crash uninstall.
Live E2E: ran fixed binary, installed test plugin (skill in
/configs/skills/test-skill/, rules block in CLAUDE.md), called
DELETE, confirmed all three artifacts gone, then triggered
manual restart and confirmed they stayed gone (the original bug
trigger). Other workspace state — review-loop skill,
molecule-dev plugin, surrounding CLAUDE.md content — preserved.
Branch: fix/106-plugin-uninstall-cleanup. Closes #106.
Platform — fix #110: A2A busy-response classification
Bug: When an upstream workspace agent is mid-synthesis on a
previous request (single-threaded main loop), subsequent A2A
requests time out or see the connection reset. The proxy returned
502 failed to reach workspace agent, indistinguishable from a
genuinely unreachable agent. 17 such failures recorded over 7h of
self-evol loop traffic.
Fix (workspace-server/internal/handlers/a2a_proxy.go):
proxyA2AError gains an optional Headers field so handlers can
set real response headers. After a2aClient.Do(req) errors, we
now classify via isUpstreamBusyError: context.DeadlineExceeded,
io.EOF, io.ErrUnexpectedEOF, or stdlib wrap-strings containing
"context deadline exceeded", "EOF", "connection reset". When
the container is alive and the error matches, return
503 Service Unavailable with Retry-After: 30 and a JSON body
{"busy": true, "retry_after": 30}. Fatal / unclassified errors
still fall through to the prior 502. Issue #110 Option 3.
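From the caller's side, the new 503-busy contract is worth sketching: retry after the advertised delay, and let everything else (including the prior 502 shape) propagate. A minimal illustration with an injectable `do_post` so it is testable — the function name and the `(status, headers, body)` tuple shape are assumptions for this sketch, not a platform API:

```python
import time

def send_a2a_with_busy_retry(do_post, payload, max_retries=3, sleep=time.sleep):
    """do_post(payload) -> (status_code, headers_dict, body_dict).
    On 503 {"busy": true}, honor Retry-After and retry; any other
    status (e.g. the 502 unreachable case) is returned immediately."""
    for attempt in range(max_retries + 1):
        status, headers, body = do_post(payload)
        if status == 503 and body.get("busy") and attempt < max_retries:
            # Upstream agent is mid-synthesis, not dead: back off and retry.
            sleep(int(headers.get("Retry-After", body.get("retry_after", 30))))
            continue
        return status, body
```

The point of the change is exactly this distinction: a 503-busy is retryable by policy, a 502 is not.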
Tests (workspace-server/internal/handlers/a2a_proxy_test.go):
- `TestIsUpstreamBusyError` — 10 error shapes (stdlib typed and url.Error-wrapped strings for both deadline and EOF). Includes negative cases (DNS / refused / unrelated errors).
- `TestProxyA2AError_BusyShape` — end-to-end emit contract: 503 status, `Retry-After: 30` header, JSON body with `busy=true` and `retry_after=30`.
Live verification attempted but inconclusive: redirected a workspace URL in Postgres to a hang server, but the platform's Redis URL cache shadows the DB value so the fake upstream was never hit. Unit tests cover every link in the chain (error detection → typed error struct → handler emit), so I'm confident in the change; a real 503-busy will be observable the next time an agent actually stalls under load.
Branch: fix/110-a2a-busy-response. Closes #110 (Option 3 —
clearer error + Retry-After; queueing and timeout-bump deferred).
Platform — fix #117: surface Docker image-not-found error on provision
Bug: Provisioning a workspace whose runtime image isn't built
locally silently failed. GET /workspaces/:id returned
{status: "failed", last_sample_error: ""} — no hint that the image
was missing or which build command to run. Discovered during the
MeDo hackathon smoke test; only diagnostic path was docker logs
on the platform container.
Fix (two files):
- `workspace-server/internal/provisioner/provisioner.go::Start` — when `ContainerCreate` returns "No such image", wrap the error with the resolved image tag and the exact `build-all.sh <runtime>` command the operator should run. Uses `%w` so `errors.Is`/`errors.As` chains stay intact.
- `workspace-server/internal/handlers/workspace_provision.go` — on `provisioner.Start` failure, the UPDATE now sets `last_sample_error = $2` alongside `status='failed'`. Previously the error was only logged + broadcast.
Tests (workspace-server/internal/provisioner/provisioner_test.go):
- `TestIsImageNotFoundErr` — 7 error shapes (moby's exact message, variants, unrelated errors)
- `TestRuntimeTagFromImage` — 6 image-reference shapes including fallback paths
- `TestImageNotFoundErrorIncludesBuildHint` — asserts the wrapped error string includes the image, the build command, and the underlying daemon message
Live E2E: provisioned with runtime: autogen after docker rmi workspace-template:autogen. Before: last_sample_error: "".
After: docker image "workspace-template:autogen" not found — run 'bash workspace/build-all.sh autogen' to build it (underlying error: Error response from daemon: No such image: workspace-template:autogen). Image rebuilt after test to restore
baseline.
Branch: fix/117-provisioner-surface-image-error. Closes #117.
Phase 30.1 — Workspace auth tokens (SaaS foundation)
Scope: first step of Phase 30 (cross-network federation). Per-workspace
bearer tokens so remote agents can authenticate themselves to the platform
without being spoofable. Transparent to local containers during the
transition — legacy workspaces are grandfathered on /registry/heartbeat
until their next /registry/register issues them a token.
What landed:
- `workspace-server/migrations/020_workspace_auth_tokens.{up,down}.sql` — new `workspace_auth_tokens` table storing `sha256(plaintext)` + an 8-char prefix for display. Plaintext never persisted.
- `workspace-server/internal/wsauth/` — new package: `IssueToken`, `ValidateToken`, `HasAnyLiveToken`, `RevokeAllForWorkspace`, `BearerTokenFromHeader`. Opaque 256-bit tokens (base64url), no JWT.
- `workspace-server/internal/handlers/registry.go::Register` — issues a token on first registration only (idempotent on re-register); returns it in the response body as `auth_token`.
- `registry.go::Heartbeat`, `::UpdateCard` — validate `Authorization: Bearer <token>` if the workspace has any live token on file. Legacy workspaces with no token → 200 (grandfather path).
- `workspace/platform_auth.py` — new agent-side store: reads `${CONFIGS_DIR}/.auth_token`, in-process cache, `auth_headers()` helper. File is 0600.
- `workspace/main.py` — saves the token returned by register.
- `workspace/heartbeat.py`, `a2a_tools.py`, `molecule_ai_status.py`, `executor_helpers.py` — all four heartbeat call sites now send `auth_headers()`.
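The core token contract — hash-only persistence, per-workspace binding, plaintext returned exactly once — can be sketched with an in-memory store. Everything below is illustrative; the real implementation is the Go `wsauth` package backed by the Postgres table:

```python
import hashlib
import secrets

class TokenStore:
    """In-memory sketch of the workspace_auth_tokens contract:
    only sha256(plaintext) plus an 8-char display prefix are kept,
    and validation is bound to the issuing workspace."""
    def __init__(self):
        self._rows = []  # (workspace_id, sha256_hex, display_prefix)

    def issue(self, workspace_id: str) -> str:
        token = secrets.token_urlsafe(32)               # opaque 256-bit, base64url
        digest = hashlib.sha256(token.encode()).hexdigest()
        self._rows.append((workspace_id, digest, token[:8]))
        return token                                    # plaintext never persisted

    def validate(self, workspace_id: str, token: str) -> bool:
        digest = hashlib.sha256(token.encode()).hexdigest()
        return any(ws == workspace_id and h == digest for ws, h, _ in self._rows)

    def has_any_live_token(self, workspace_id: str) -> bool:
        """Drives the grandfather path: no rows -> legacy, skip auth."""
        return any(ws == workspace_id for ws, _, _ in self._rows)
```

The `has_any_live_token` check is what lets legacy workspaces heartbeat unauthenticated until their next register issues a token.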
Tests:
- `workspace-server/internal/wsauth/tokens_test.go` — 11 cases: issuance persists only the hash, tokens unique per call, validate happy path, wrong-workspace rejected, unknown token rejected, empty inputs rejected, `HasAnyLiveToken` with 0/1/7 rows, revoke, bearer header parser with 7 inputs.
- `workspace/tests/test_platform_auth.py` — 14 cases: get/save round-trip, 0600 mode, whitespace stripping, empty-token rejection, idempotent saves (no mtime churn), rotation, header format, caching semantics, empty-file handling, CONFIGS_DIR respect + fallback.
- Fixed `tests/test_molecule_ai_status.py::_FakePost` + `exploding_post` to accept a `headers=` kwarg (test fixture API drift from the production code change).
Live E2E verified against real Postgres + running platform:
- Legacy workspace (no tokens) → heartbeat 200 (grandfathered)
- Fresh register → token returned in response body
- Heartbeat without token (token exists) → 401
- Heartbeat with valid token → 200
- Spoofing with guessed token → 401
- Cross-workspace token reuse (A's token for B) → 401
- Re-register after token issued → response has no `auth_token` (idempotent)
Test totals: Go 476 → 487, Python 1064 → 1078.
Docs:
- `docs/remote-workspaces-readiness.md` — full code audit that scopes Phase 30 (five sections: local-only assumptions, existing seams, hard problems, minimum viable remote shape, ordered next steps).
- `PLAN.md` — new Phase 30 section with eight bounded sub-steps (30.1 through 30.8), out-of-scope boundaries, success criteria.
Branch: feat/30.1-workspace-auth-tokens. First PR of Phase 30.
Fix #125 — commit_memory writes now surface in activity_logs
Bug: commit_memory MCP tool calls succeeded silently. Operators
inspecting the Canvas "Agent Comms" tab couldn't see what an agent
chose to remember during a task.
Fix (two files):
- `workspace/builtin_tools/memory.py::commit_memory` — on successful write, fire-and-forget a `POST /workspaces/:id/activity` call via new helper `_record_memory_activity(scope, content, memory_id)`. Summary format `[<SCOPE>] <80-char preview>… (id=<id>)`. The memory id is embedded in the summary (not `target_id`) because `target_id` is a UUID column scoped to workspace references; awareness memory ids are arbitrary strings.
- `workspace-server/internal/handlers/activity.go::Report` — added `memory_write` to the activity_type allowlist. Without this the handler returned 400 with the prior list `{a2a_send, a2a_receive, task_update, agent_log, skill_promotion, error}`.
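The summary format is small enough to pin down with a sketch. A hypothetical helper mirroring the described contract (the real helper is `_record_memory_activity`, which also performs the POST; this only shows the formatting):

```python
def memory_activity_summary(scope: str, content: str, memory_id: str) -> str:
    """Build the '[<SCOPE>] <80-char preview>… (id=<id>)' summary string.
    Newlines are collapsed so the activity row stays one line; the memory
    id rides in the summary because target_id is a UUID-typed column."""
    preview = " ".join(content.split())        # collapse newlines / runs of whitespace
    if len(preview) > 80:
        preview = preview[:80] + "…"
    return f"[{scope.upper()}] {preview} (id={memory_id})"
```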
Tests:
- `workspace/tests/test_memory.py` — 6 new cases: posts to the `/activity` endpoint with the right shape; truncates content >80 chars with ellipsis; strips newlines from summary; skips when `WORKSPACE_ID` or `PLATFORM_URL` is missing; swallows POST failures (must not poison the tool path); embeds id in summary regardless.
- `workspace-server/internal/handlers/activity_test.go` — 2 new cases: `memory_write` accepted (200), unknown type still 400 with the updated message including `memory_write`.
Live E2E against running platform + Postgres:
- Direct curl POST with `activity_type=memory_write` → 200 + DB row
- `_record_memory_activity` from Python → row visible via `GET /workspaces/:id/activity?type=memory_write`
- Confirmed the `target_id` UUID-typing rejection from a prior attempt (caught the bug — fix lands the id in the summary instead)
Test totals: Go 487 → 489, Python 1078 → 1084.
Branch: fix/125-commit-memory-activity-log. Closes #125.
Phase 30.2 + 30.5 — Remote secrets pull + A2A caller-token validation
Two bounded steps shipped together since they share the same
wsauth validation shape.
30.2 — GET /workspaces/:id/secrets/values
- New handler in `workspace-server/internal/handlers/secrets.go::Values`. Returns the merged decrypted global+workspace secrets as a flat `{"KEY": "value"}` JSON map. Same merge semantics as the provisioner's env-var injection, so a remote agent bootstrapping via pull sees exactly the same secrets a local container would receive via push.
- Auth: Phase 30.1 bearer token required when the workspace has any live token on file. Legacy workspaces grandfathered through. Fail-closed on the token-existence check (different from heartbeat's fail-open) because this endpoint returns plaintext secrets.
- Route wired in `workspace-server/internal/router/router.go:170`.
30.5 — A2A proxy caller-token validation
- `workspace-server/internal/handlers/a2a_proxy.go::ProxyA2A` now calls `validateCallerToken(ctx, c, callerID)` before the existing CanCommunicate hierarchy check. Three bypass paths preserved: canvas (empty `X-Workspace-ID`), system callers (`webhook:`, `system:`, `test:` prefixes), self-calls (callerID == workspaceID).
- Token binding is strict: a compromised token from workspace A cannot authenticate a caller claiming to be workspace B. Tested.
- Fail-open on DB hiccup — caller-token is defense-in-depth on top of hierarchy, not the sole gate.
Tests:
- 5 new Go tests in `secrets_test.go` (legacy grandfather, missing token, wrong token, valid token with merge precedence, invalid workspace ID).
- 5 new Go tests in `a2a_proxy_test.go::TestValidateCallerToken` (legacy grandfather, missing token, invalid token, valid token, wrong-workspace binding rejection).
Live E2E verified against real Postgres + platform:
- 30.2: no-token → 401, bad-token → 401, valid-token → 200 with correct `{"PHASE_30_DEMO":"hello-from-pull-endpoint"}`.
- 30.5: canvas bypass ✓, self-call bypass ✓, system-caller bypass ✓, cross-workspace no-token → 401 "missing caller auth token", cross-workspace wrong-token → 401 "invalid caller auth token", cross-workspace valid-token → 403 "access denied" (falls through to the hierarchy check as designed).
Phase 30 status on main: 30.1 ✅, 30.2 ✅ (this PR), 30.5 ✅ (this PR). Remaining: 30.3 (plugin tarball), 30.4 (state polling), 30.6 (sibling URL cache), 30.7 (poll-liveness), 30.8 (SDK + GA).
Branch: feat/30.2-30.5-remote-auth. PLAN.md checkboxes flipped
for 30.1, 30.2, 30.5.
Phase 30.4 + 30.8 — State polling + Remote-agent SDK (first working e2e)
Shipped together because 30.8 (the runnable example) is the proof-of-life for everything 30.1–30.5 built up to. 30.4 is the missing piece that lets a remote agent detect pause/delete without WebSocket reachability.
30.4 — GET /workspaces/:id/state
- New handler `workspace.State` at `workspace-server/internal/handlers/workspace.go`. Returns `{workspace_id, status, paused, deleted}`. Token-gated with the same Phase 30.1 shape (legacy grandfather, fail-closed on DB error). Deliberately not merged with `GET /workspaces/:id` — that path is for the canvas (unauthenticated, rich config). This is the agent-machinery polling path: tight, token-gated, cache-friendly.
- Returns 404 + `{deleted: true}` for hard-deleted rows so the SDK can distinguish deletion from transient network issues.
30.8 — sdk/python/molecule_agent/
- New `RemoteAgentClient` class (blocking, `requests`-only, no async) with methods mirroring the Phase 30 endpoints: `register()`, `pull_secrets()`, `poll_state()`, `heartbeat()`, `run_heartbeat_loop()`.
- Token cache at `~/.molecule/<workspace_id>/.auth_token` with 0600 perms. Register is idempotent — re-registering an already-tokened workspace keeps using the on-disk copy.
- Loop exits gracefully on pause/delete, returning the terminal status for the caller to log / exit on.
- Tolerates transient heartbeat + state-poll failures without crashing the loop (log and continue).
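The loop contract (heartbeat each tick, exit with a terminal status on pause/delete, log-and-continue on transient failures) can be sketched as a standalone function. Names and parameters here are illustrative, not the shipped `run_heartbeat_loop()` signature:

```python
import time

def run_heartbeat_loop(poll_state, heartbeat, interval=5.0,
                       max_iterations=None, sleep=time.sleep):
    """poll_state() -> {"paused": bool, "deleted": bool}, may raise on
    network trouble; heartbeat() posts liveness. Returns the terminal
    status string for the caller to log / exit on."""
    i = 0
    while max_iterations is None or i < max_iterations:
        i += 1
        try:
            heartbeat()
            state = poll_state()
        except Exception as exc:            # transient: keep the loop alive
            print(f"transient error, retrying: {exc}")
            sleep(interval)
            continue
        if state.get("deleted"):
            return "deleted"                # 404 path maps to deleted=True
        if state.get("paused"):
            return "paused"
        sleep(interval)
    return "max_iterations"
```

This matches the live-E2E behavior above: a mid-loop `POST /pause` surfaces as a `paused` return within one poll interval.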
examples/remote-agent/
- Runnable 100-line demo: `WORKSPACE_ID=x PLATFORM_URL=y python3 run.py`. README walks through workspace creation via `external: true`, seeding a secret, running the agent.
- Note found during live verification: `POST /registry/register` upserts `status='online'`, so re-registering an already-paused workspace reverts it. Not a bug in 30.4, but it affects the order of operations in the demo (register once, then pause takes effect on the long-running loop). Filed as a follow-up — see "Known follow-ups" below.
Tests:
- 5 new Go tests for `workspace.State` (legacy grandfather, paused, hard-delete 404, missing token, valid token).
- 22 new Python tests for `RemoteAgentClient` (token persistence with 0600 check, register issues/reuses, secrets pull, state poll, 404 = deleted, heartbeat body shape, loop exits on paused/deleted/max-iterations, transient-error continuation).
Live E2E with all of 30.1/30.2/30.4/30.5 running:
- Agent register → token issued ✓
- `received 2 secret(s): keys=['API_KEY', 'REMOTE_DEMO_KEY']` ✓
- Heartbeat loop runs, uptime advances to 10s ✓
- `POST /pause` mid-loop → `platform reports workspace paused (paused=True deleted=False) — exiting` within ~5s ✓
- Clean terminal status `paused` ✓
Known follow-ups (not this PR):
- Register's `status='online'` overwrite undoes a platform-side pause if the agent happens to re-register. Should check current status and preserve `paused`/`removed`.
- Loop currently can't receive inbound A2A — `reported_url` is `remote://no-inbound` as a placeholder. A future 30.8b will add an optional `start_a2a_server()` helper for agents behind a public URL or tunneled port.
PLAN.md: 30.4 ✅, 30.8 ✅. Phase 30 remaining: 30.3 (plugin tarball), 30.6 (sibling URL cache), 30.7 (poll-liveness monitor).
Branch: feat/30.4-state-polling (merged 30.2+30.5 PR #130 into it
mid-session for the live E2E to have all endpoints available).
Phase 30.7 — Poll-liveness for external-runtime workspaces
Why this is the missing piece: without it, a dead remote agent
stayed "online" on the canvas forever. The existing health sweep
explicitly skipped runtime='external' rows because it only knew how
to ask Docker "is the container alive?" — wrong question for a
workspace the platform never started.
Fix (workspace-server/internal/registry/healthsweep.go):
- New `sweepStaleRemoteWorkspaces` runs on the same ticker as the Docker sweep. Queries workspaces with `runtime='external'` whose `last_heartbeat_at` is older than `REMOTE_LIVENESS_STALE_AFTER` (default 90s, env-overridable). Marks them offline, clears Redis state, fires `onOffline` so the canvas sees `WORKSPACE_OFFLINE`.
- `StartHealthSweep` no longer early-returns on a nil Docker checker — a SaaS front-door deployment without local Docker still needs remote-liveness monitoring.
- Newly-registered external workspaces that haven't heartbeated yet are compared against `updated_at` (set on register), so an agent that crashes before its first heartbeat is still swept after the grace window.
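The staleness predicate, sketched in Python for clarity. The production version is SQL inside `sweepStaleRemoteWorkspaces`; the null-anchor handling here (no heartbeat and no register timestamp → not stale) is an assumption of this illustration:

```python
from datetime import datetime, timedelta, timezone

def is_stale_remote(last_heartbeat_at, updated_at, stale_after_s=90, now=None):
    """Staleness rule for runtime='external' rows: judge against
    last_heartbeat_at, falling back to updated_at (set at register)
    for agents that died before their first heartbeat."""
    now = now or datetime.now(timezone.utc)
    anchor = last_heartbeat_at or updated_at
    if anchor is None:
        return False                      # nothing to judge against yet
    return (now - anchor) > timedelta(seconds=stale_after_s)
```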
Tests (workspace-server/internal/registry/healthsweep_test.go):
- `sweepStaleRemoteWorkspaces` with 2 stale rows → UPDATE + onOffline called twice
- Nil callback → no panic
- DB outage → logged, no panic, no false offlines
- `remoteStaleAfter`: default when env unset; honors a valid integer override; falls back on garbage values (`abc`, `0`, `-10`, empty)
- `StartHealthSweep` with nil checker: still ticks and runs the remote sweep (previously would early-return)
Live E2E with REMOTE_LIVENESS_STALE_AFTER=10 for test speed:
- Agent register → heartbeat → exit → status=online (heartbeat fresh)
- Wait 30s → status=offline (platform swept at the 15s tick, saw heartbeat >10s old). Log: `Health sweep (remote): <id> heartbeat stale (>10s) — marking offline`
- Restart agent → heartbeat resumes → status=online again
- Full cycle observable on canvas via WORKSPACE_OFFLINE / WORKSPACE_ONLINE broadcasts
All Phase 30 remote-agent capabilities now demonstrable end-to-end:
| Step | Live E2E status |
|---|---|
| 30.1 Token auth | ✅ register + heartbeat bearer-auth'd |
| 30.2 Secrets pull | ✅ keys=['API_KEY','REMOTE_DEMO_KEY'] |
| 30.4 State polling | ✅ pause detected in ~5s |
| 30.5 A2A caller auth | ✅ 401/403 separation confirmed |
| 30.7 Poll-liveness | ✅ stale→offline→restart→online cycle |
| 30.8 SDK + example | ✅ examples/remote-agent/run.py |
Phase 30 remaining: 30.3 (plugin tarball), 30.6 (sibling URL cache). Neither blocks the current SaaS loop; 30.3 matters when remote agents need to install plugins with heavy deps, 30.6 is a resilience optimization for agent-to-agent direct calls.
Branch: feat/30.7-poll-liveness. PLAN.md 30.7 ✅.
Phase 30.6 — Sibling discovery auth + URL caching
Two tied fixes:
Platform side — /registry/discover/:id and /registry/:id/peers
were unauthenticated. For a SaaS front-door deployment, any internet
host that knows a workspace ID could enumerate siblings and pull
their URLs. Added validateDiscoveryCaller using the same
lazy-bootstrap Phase 30.1 token pattern. Fail-open on DB hiccup
(unlike secrets.Values which fails-closed) because discovery only
exposes URLs already behind CanCommunicate — the hierarchy check
downstream is the primary gate, auth is defense-in-depth.
SDK side — new methods on RemoteAgentClient:
- `get_peers()` → list of `PeerInfo`, seeds the URL cache
- `discover_peer(id)` → cached lookup with 5-min TTL, refreshes on expiry, returns None on 404
- `invalidate_peer_url(id)` → drop cache entry (call after a direct-call failure so the next call re-discovers)
- `call_peer(id, message, prefer_direct=True)` → sends A2A `message/send`. Direct path on cache hit; graceful fallback to platform proxy on connection error / 5xx with cache invalidation. `prefer_direct=False` forces proxy routing.
- New `PeerInfo` dataclass exported alongside `WorkspaceState`.
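The cache semantics (seed only direct http(s) URLs, 5-minute TTL, idempotent invalidation) can be sketched as a small class. Illustrative only — the SDK keeps this state inside `RemoteAgentClient` rather than a standalone class:

```python
import time

class PeerURLCache:
    """TTL cache behind discover_peer / invalidate_peer_url:
    seed only http(s) URLs, expire after ttl_s, drop on failure
    so the next call falls back to re-discovery."""
    def __init__(self, ttl_s=300, clock=time.monotonic):
        self._ttl = ttl_s
        self._clock = clock
        self._entries = {}                # peer_id -> (url, stored_at)

    def seed(self, peers):
        for peer_id, url in peers:
            if url.startswith(("http://", "https://")):   # skip remote:// placeholders
                self._entries[peer_id] = (url, self._clock())

    def get(self, peer_id):
        hit = self._entries.get(peer_id)
        if hit is None:
            return None
        url, stored = hit
        if self._clock() - stored > self._ttl:
            del self._entries[peer_id]    # expired: force re-discovery
            return None
        return url

    def invalidate(self, peer_id):
        self._entries.pop(peer_id, None)  # idempotent by design
```

Skipping non-http URLs at seed time is what kept the live E2E from ever trying to dial `remote://pm` directly.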
Tests: 12 new SDK tests (cache seeding skips non-http URLs, cache hit short-circuits, expired cache refreshes, 404 returns None, invalidate_peer_url idempotent, direct-path vs proxy-fallback vs prefer-direct=False, fresh call with no cache does discover-then-direct).
Bug caught during verification: my first discovery auth shape
fail-closed on DB errors, which broke existing TestDiscover_* and
TestPeers_* tests that didn't set up the HasAnyLiveToken
sqlmock expectation. Switched to fail-open — discovery is
hierarchy-gated anyway, and a DB hiccup shouldn't take
agent-to-agent discovery offline. 8 tests restored green.
Live E2E with a tiny Python echo server as sibling-B:
- `get_peers` returns 2 peers (echo server + parent PM)
- URL cache seeded ONLY with the `http://` entry (skips `remote://pm`)
- `call_peer` routes directly to `http://127.0.0.1:9876` — no proxy hop
- Echo server responds, SDK returns `"echoed: hello sibling over SDK"`
- Auth + hierarchy all verified: no-token → 401, wrong-token → 401, cross-workspace token → 401, out-of-hierarchy discover → 403
Phase 30 status after this: 30.1 ✅ 30.2 ✅ 30.4 ✅ 30.5 ✅ 30.6 ✅ 30.7 ✅ 30.8 ✅. Only 30.3 (plugin tarball download) remains, and I flagged that one as lower priority — the current SaaS loop doesn't need it until a real user has a heavy-deps plugin.
Branch: feat/30.6-sibling-cache.
Fix #123 — Telegram kicked/left now persists enabled=false
Bug: When the Molecule AI bot was removed from a Telegram chat, the
handler at telegram.go:594-596 only logged the event — the matching
workspace_channels row stayed enabled=true. Every subsequent
outbound message hit Telegram 403 forever.
Fix:
- New package-level callback `disableChannelByChatID` in `telegram.go`, default no-op (safe for early boot / tests).
- `manager.go::NewManager` wires it to run `UPDATE workspace_channels SET enabled=false WHERE channel_type='telegram' AND enabled=true AND config->>'chat_id'=$1`, then call `m.Reload(ctx)` if any row flipped so the in-memory poller map drops the now-disabled row.
- `onMyChatMember::case "left", "kicked"` now calls the callback immediately after the existing log line (removes the TODO).
Tests (workspace-server/internal/channels/channels_test.go):
- default-is-no-op (var safe to call pre-Manager-init)
- wired-callback fires UPDATE with exact WHERE shape + arg + triggers Reload via follow-up SELECT
- no-rows-affected skips reload (avoids SELECT storm on unrelated kicked events from other bots)
Branch: fix/123-telegram-kicked-persist. Closes #123.
Phase 30 client adaptations — MCP / molecli / Canvas / SDK
Phase 30 itself shipped the platform-side endpoints. These adaptations make those endpoints visible and usable from every client surface without requiring callers to know the new URL paths by hand.
MCP — 4 new tools in mcp-server/src/index.ts:
- `list_remote_agents` — filters the workspace list to runtime='external'
- `get_remote_agent_state` — projects {workspace_id, status, paused, deleted}
- `get_remote_agent_setup_command` — emits the `WORKSPACE_ID=... PLATFORM_URL=... python3 ...` bash one-liner an operator can paste into a remote shell
- `check_remote_agent_freshness` — compares last_heartbeat_at against a configurable threshold (default 90s); returns {fresh, seconds_since_heartbeat}
8 new MCP tests (88 → 96).
molecli — WorkspaceInfo gains a Runtime field; printWorkspaceTable
adds a RUNTIME column showing ★ external for remote agents so they pop
in a long table; detail view labels them external (Phase 30 remote agent).
Live: molecli ws list now shows the badge correctly.
Canvas — WorkspaceNode.tsx reads data.runtime (workspace row) in
preference to data.agentCard.runtime (agent-reported). Remote agents
get a distinct violet ★ REMOTE pill with a tooltip explaining the
heartbeat-based lifecycle. 352/352 vitests still pass.
SDK — pyproject.toml rebranded molecule-sdk@0.2.0 so a single
pip install molecule-sdk ships both molecule_plugin (plugin
authors) and molecule_agent (remote-agent authors). Added trove
classifiers, keywords, requires-python pin. New
sdk/python/molecule_agent/README.md quickstart.
Live verification:
- MCP: spawned a real external workspace, ran all 4 tools via node smoke script — count=1, setup_command renders, freshness=null (no heartbeat yet, returns fresh=false correctly)
- molecli: `ws list` shows the `★ external` badge on the remote workspace
- Canvas tests green; visual change is small (one badge swap)
- SDK: 121 SDK tests + 1078 workspace-template tests still pass
Branch: feat/phase30-client-adaptations.
Phase 30.3 — Plugin tarball download (external GitHub repo verified)
Platform: new GET /workspaces/:id/plugins/:name/download[?source=...]
streams the named plugin as a gzipped tarball. Reuses
resolveAndStage so all existing source schemes (local://,
github://, future clawhub://) work — the endpoint is just the
download surface for what Install was already doing internally.
Token-gated (fail-closed on DB error since the tarball can include
rule text and skill files referencing internals). Defaults source to
local://<name> when the query param is omitted. Validates that the
URL path's plugin name matches the resolved plugin's manifest name —
prevents a github source resolving to a different name from being
shipped under the requested name.
SDK RemoteAgentClient.install_plugin(name, source=None):
- Stream the download
- Atomic extract via sibling-tempdir + rename (no half-installed states)
- Run
setup.shif present (best-effort) - POST
/workspaces/:id/pluginsto register the install
_safe_extract_tar rejects path-traversal (../escape, absolute paths)
and silently skips symlinks/hardlinks — defends against tar-slip CVEs.
Tested with both adversarial inputs.
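A Python sketch of the same tar-slip defense using stdlib `tarfile`. The helper name echoes `_safe_extract_tar`, but the signature and return value here are illustrative, not the SDK's:

```python
import os
import tarfile

def safe_extract_tar(tar: tarfile.TarFile, dest: str) -> list:
    """Extract only regular files and directories whose resolved paths
    stay under dest. Symlinks/hardlinks are silently skipped; ../ escapes
    and absolute member paths raise (the tar-slip defense)."""
    extracted = []
    dest_real = os.path.realpath(dest)
    for member in tar.getmembers():
        if member.issym() or member.islnk():
            continue                                # skip link members entirely
        target = os.path.realpath(os.path.join(dest, member.name))
        if target != dest_real and not target.startswith(dest_real + os.sep):
            raise ValueError(f"path traversal rejected: {member.name}")
        if member.isdir() or member.isfile():
            tar.extract(member, dest)
            extracted.append(member.name)
    return extracted
```

`os.path.join` resolves an absolute member name to itself, so absolute paths fail the same containment check as `../` escapes.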
Tests:
- 5 new Go (auth, tarball shape, name mismatch, tar streaming relative paths, tar symlink skip)
- 11 new Python SDK (unpack location, source query param, atomic rollback on corrupt tarball, overwrite existing, setup.sh ran/skipped, platform-report skipped, 404 surfaces, _safe_extract path-traversal rejection, absolute-path rejection, symlink skip)
Live E2E with a real external GitHub repo created via gh repo create (HongmingWang-Rabbit/starfire-test-plugin):
- `local://molecule-dev` → 4612-byte tarball, plugin.yaml + skills/ present
- `github://HongmingWang-Rabbit/starfire-test-plugin` → 711-byte tarball pulled from real GitHub, unpacked locally, setup.sh ran on the agent's host machine producing `/tmp/sf-plugin-test-setup-ran`
- Auth gates: 401/401/200 confirmed
- Name-mismatch: requested `wrong-name` with `source=...starfire-test-plugin` returned 400 with `{"resolved_name":"starfire-test-plugin","requested_name":"wrong-name"}`
Phase 30 is now feature-complete: 30.1 ✅ 30.2 ✅ 30.3 ✅ 30.4 ✅ 30.5 ✅ 30.6 ✅ 30.7 ✅ 30.8 ✅
Branch: feat/30.3-plugin-tarball. Test repo:
https://github.com/HongmingWang-Rabbit/starfire-test-plugin
Bugfix #124 — Delegation idempotency
Promoted from docs/known-issues.md KI-002. When a workspace container
restarted mid-delegation (Redis TTL → liveness restart), agents could
re-issue POST /workspaces/:id/delegate and produce duplicate work
(double commits, double Telegram messages, double API calls).
Migration 021_delegation_idempotency.up.sql:
- `activity_logs.idempotency_key TEXT NULL`
- Partial unique index on `(workspace_id, idempotency_key) WHERE idempotency_key IS NOT NULL` — fully backwards compatible
Handler (workspace-server/internal/handlers/delegation.go::Delegate):
- Optional `idempotency_key` field on the request body
- On receipt: lookup `(workspace_id, key)` → if found and not `failed`, return the existing delegation_id with HTTP 200 + `idempotent_hit: true`
- If the prior row is `failed`, the slot is released so the retry can produce a fresh delegation (still 202)
- If two concurrent calls race past the lookup, the unique-constraint violation on insert is caught and the loser re-queries to surface the same idempotent response (HTTP 200) instead of a 500
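The replay / failed-slot-release behavior can be sketched against an in-memory store. Illustrative only — the real gate is the Postgres partial unique index plus the re-query on constraint violation, which a single-threaded dict cannot fully reproduce:

```python
class DelegationStore:
    """Sketch of the idempotency contract: replay of a non-failed row
    returns the existing delegation with 200 + idempotent_hit, while a
    failed prior row releases the slot for a fresh 202 delegation."""
    def __init__(self):
        self._by_key = {}     # (workspace_id, key) -> {"id": ..., "status": ...}
        self._next_id = 0

    def delegate(self, workspace_id, key=None):
        if key is not None:
            existing = self._by_key.get((workspace_id, key))
            if existing and existing["status"] != "failed":
                # Replay: surface the prior delegation instead of new work.
                return 200, {"delegation_id": existing["id"], "idempotent_hit": True}
        self._next_id += 1
        row = {"id": f"d-{self._next_id}", "status": "pending"}
        if key is not None:
            self._by_key[(workspace_id, key)] = row   # failed slot overwritten here
        return 202, {"delegation_id": row["id"], "idempotent_hit": False}
```

Omitting the key preserves the pre-fix behavior: every call creates a fresh delegation.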
Tests (3 new + 2 updated, all green under go test -race):
- `TestDelegate_IdempotentReplayReturnsExistingDelegation`
- `TestDelegate_IdempotentFailedRowIsReleasedAndReplaced`
- `TestDelegate_IdempotentRaceUniqueViolationReturnsExisting`
- Updated `TestDelegate_Success` and `TestDelegate_DBInsertFails_Still202WithWarning` to assert the new 6th INSERT arg (idempotency_key = NULL when omitted)
Branch: fix/auto-review-2026-04-13-delegation-idempotency. Closes #124.