# 2026-04-13 — edit history

## Summary — Quality + Infra Pass (PRs #1–#8, all merged)

Eight PRs landed today in a focused quality pass. No user-facing feature changes; the payoff is faster onboarding, lower merge friction, and stronger CI gates.

### Brand + structural

- **PR #1 `chore/branding-icons`** — replaced `molecule-icon.png` across `canvas/public/`, `canvas/src/app/`, `docs/assets/branding/`; added `HANDOFF.md` at the repo root; fixed a comment typo in `.githooks/pre-commit`.
- **PR #3 `chore/structural-cleanup`** — deleted empty `workspace-server/plugins/`; moved `examples/remote-agent/` → `sdk/python/examples/remote-agent/` and `docs/superpowers/plans/` → `plugins/superpowers/plans/`; added READMEs to `tests/` and `docs/`; gitignored `.agents/`, `workspace-server/workspace-configs-templates/`, `backups/`, `logs/`, `test-results/`.
- LICENSE: trailing brand-migration fix — "Agent Molecule" → "Molecule AI".

### MCP server refactor (PRs #2, #4, #7)

- `mcp-server/src/index.ts` shrank from **1697 → 89 lines**. Tool handlers now live in per-domain modules under `mcp-server/src/tools/`: `workspaces.ts`, `agents.ts`, `secrets.ts`, `files.ts`, `memory.ts`, `plugins.ts`, `channels.ts`, `delegation.ts`, `schedules.ts`, `approvals.ts`, `discovery.ts`, `remote_agents.ts`.
- New shared HTTP layer `mcp-server/src/api.ts` exports `PLATFORM_URL`, generic `apiCall`, `ApiError` type, `isApiError()` guard, `toMcpResult()`, `toMcpText()`.
- Each `tools/*.ts` exports handlers + a `registerXxxTools(srv)` function. `createServer()` in `index.ts` wires them.
- Fixed `handleGetRemoteAgentSetupCommand` — emits a valid `python3 -c "from molecule_agent import RemoteAgentClient; …"` one-liner (was an invalid `python3 -m examples.remote-agent.run`).
- MCP now reports **87 tools** on startup (older logs / docs said "61" — both updated).
### Canvas (PRs, shipped across session)

- Replaced native `window.confirm` / `alert` with `ConfirmDialog` in seven sites: `ChannelsTab.tsx`, `ScheduleTab.tsx`, `ChatTab.tsx`, `TemplatePalette.tsx` (×2), `ErrorBoundary.tsx` (×2 removed; buttons are self-evident).
- New `singleButton` prop on `ConfirmDialog` for info-toast usage, plus 5 new vitest cases at `canvas/src/components/__tests__/ConfirmDialog.test.tsx`.
- `ErrorBoundary` clipboard write now catches rejections and logs to `console.warn`.
- Vitest count: **352 → 357**.

### Platform — handler decomposition (pure refactor)

Four oversize handler functions split into private helpers — behavior unchanged, but each extracted helper is now directly unit-tested.

- `a2a_proxy.go::proxyA2ARequest` (257 → 56 lines). New helpers: `resolveAgentURL`, `normalizeA2APayload`, `dispatchA2A`, `handleA2ADispatchError`, `maybeMarkContainerDead`, `logA2AFailure`, `logA2ASuccess`; sentinel `proxyDispatchBuildError`.
- `delegation.go::Delegate` (127 → 60 lines). New helpers: `bindDelegateRequest`, `lookupIdempotentDelegation`, `insertDelegationRow`; typed `insertDelegationOutcome` enum (zero value `insertOutcomeUnknown`) replaces a positional `(bool, bool)` return.
- `discovery.go::Discover` (125 → 40 lines). New helpers: `discoverWorkspacePeer`, `writeExternalWorkspaceURL`, `discoverHostPeer`.
- `activity.go::SessionSearch` (109 → 24 lines). New helpers: `parseSessionSearchParams`, `buildSessionSearchQuery`, `scanSessionSearchRows`.

**+47 Go unit tests**; `workspace-server/internal/handlers` coverage **56.1 % → 57.6 %**.

### Config / env documentation

- `.env.example` gained **11 previously-undocumented env vars** across 6 new sections: `PLATFORM_URL`, `MOLECULE_URL`, `WORKSPACE_DIR`, `MOLECULE_ENV`, `CORS_ORIGINS`, `RATE_LIMIT`, `ACTIVITY_RETENTION_DAYS`, `ACTIVITY_CLEANUP_INTERVAL_HOURS`, `MOLECULE_IN_DOCKER`, `AWARENESS_URL`, `GITHUB_WEBHOOK_SECRET`, `MOLECLI_URL`.
All 21 distinct `os.Getenv` / `envx.*` keys (except HOME) are now documented.

### E2E + CI (PRs #5, #7, #8)

- New shared helpers `tests/e2e/_lib.sh` and `tests/e2e/_extract_token.py`.
- `tests/e2e/test_api.sh` updated for Phase 30.1 bearer-token auth and Phase 30.6 `X-Workspace-ID` requirement on discover/peers; added a pre-test workspace cleanup. **62/62 pass.**
- `tests/e2e/test_comprehensive_e2e.sh` fixed the token race against the provisioner by registering each workspace immediately after creation. **67/67 pass.**
- `tests/e2e/test_activity_e2e.sh` re-registers a detected agent to capture its bearer token.
- `tests/e2e/test_claude_code_e2e.sh` got shellcheck annotations only.
- All five scripts are shellcheck-clean.
- `.github/workflows/ci.yml` gained two new jobs:
  - **`e2e-api`** — Postgres + Redis service containers, migrations applied via `docker exec`, `test_api.sh` runs against a freshly-built platform binary.
  - **`shellcheck`** — marketplace action lints every `tests/e2e/*.sh`.
- Existing Go job got `cache: true` on `setup-go`.
- Bundle round-trip and "status online" assertions now tolerate the async provisioner flipping status, removing flaky false-negatives.

### Test totals after today's sync

| Stack | Before | After |
|-------|--------|-------|
| Go (platform) | 648 | 695 |
| Python (workspace) | 1140 | 1140 |
| Canvas (vitest) | 352 | 357 |
| SDK (pytest) | 132 | 132 |
| MCP server (Jest) | 96 | 97 |

Note: only Go (+47 direct tests for extracted handler helpers), canvas (+5 ConfirmDialog singleButton tests), and MCP (+1 createServer smoke test) gained tests today. Python workspace + SDK counts are the pre-session baseline — no pytest additions today. The earlier "1078 / 87" numbers in this session were stale CLAUDE.md baselines, not measurements.

---

## Canvas — org template import (PLAN.md §20.3)

**What:** added `OrgTemplatesSection` to `canvas/src/components/TemplatePalette.tsx`.
Lists org templates from `GET /org/templates`; each entry shows name + description + workspace count + an "Import org" button that POSTs `{ dir }` to `/org/import`. Renders inside the existing template-palette sidebar, below the workspace template list.

**Why:** PLAN.md §20.3 had this checkbox unchecked. Platform already exposes the endpoints (handlers/org.go); only the canvas wiring was missing. Authors today have to `curl` to instantiate multi-workspace orgs — a poor UX given we already curate `org-templates/molecule-dev`, `reno-stars`, etc.

**How tested:** extracted `fetchOrgTemplates()` and `importOrgTemplate()` as standalone exports so they're unit-testable in the existing node-only vitest config (no jsdom). 7 new tests cover happy path, non-2xx response, network failure, POST body shape, error propagation, and module exports. Canvas vitest 345 → 352.

Branch: `feat/canvas-org-template-import`.

## Platform — fix #106: plugin uninstall cleanup

**Bug:** `DELETE /workspaces/:id/plugins/:name` only removed `/configs/plugins//`. Skill dirs copied out to `/configs/skills//` and rule blocks appended to `/configs/CLAUDE.md` by `AgentskillsAdaptor.install` were left behind, so they reappeared after every container auto-restart.

**Fix** (`workspace-server/internal/handlers/plugins.go::Uninstall`): before the existing plugin-dir removal, the handler now:

1. Reads `/configs/plugins//plugin.yaml` from the container to learn the plugin's declared `skills:` list.
2. Strips every `# Plugin: / …` block from `/configs/CLAUDE.md` via an awk script that mirrors `AgentskillsAdaptor.uninstall`'s block layout (marker → blank → content → blank). Other plugins' markers and surrounding user content stay intact.
3. `rm -rf` each declared skill dir under `/configs/skills/` (with `validatePluginName` defense against malformed manifest skill names).
4. Then proceeds with the existing `rm -rf /configs/plugins/`.
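The block-stripping in step 2 can be pictured with a short Python sketch. This is not the production awk script. It assumes the marker line starts with `# Plugin: ` followed by the plugin name (the full marker text is not reproduced in this log, so the prefix match is a guess) and that each block follows the layout named above: marker, blank line, content, blank line. `strip_plugin_blocks` and `_is_marker` are hypothetical names.

```python
def _is_marker(line: str, prefix: str) -> bool:
    """True if `line` is a marker for exactly this plugin (not a longer name)."""
    if not line.startswith(prefix):
        return False
    rest = line[len(prefix):]
    return rest == "" or rest[0] in " /"


def strip_plugin_blocks(text: str, plugin: str) -> str:
    """Remove every block owned by `plugin`; leave all other lines untouched.

    Assumed block layout: marker line, blank line, content lines, blank line.
    """
    prefix = f"# Plugin: {plugin}"  # assumed marker shape
    lines = text.split("\n")
    kept = []
    i = 0
    while i < len(lines):
        if not _is_marker(lines[i], prefix):
            kept.append(lines[i])
            i += 1
            continue
        i += 1  # drop the marker line
        if i < len(lines) and lines[i].strip() == "":
            i += 1  # drop the blank line after the marker
        while i < len(lines) and lines[i].strip() != "":
            i += 1  # drop the content lines
        if i < len(lines) and lines[i].strip() == "":
            i += 1  # drop the blank line that closes the block
    return "\n".join(kept)
```

A file without any matching marker passes through unchanged, which is the behaviour the no-op test for a missing or marker-free CLAUDE.md relies on.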

**Tests** (`workspace-server/internal/handlers/plugins_test.go`):

- `TestRegexpEscapeForAwk` — verifies `/`, `.`, `[]`, `*+?|`, `\\`, empty string all escape correctly. Caught a real bug (forgot `/`, awk treated marker as broken regex delimiter).
- `TestStripPluginMarkers_AwkScript` — runs the exact awk pipeline the production code uses against a fixture CLAUDE.md with two `my-plugin` blocks, a `keep-me` block, and surrounding user content. Asserts both my-plugin blocks (marker + content) gone, keep-me + user content intact, including trailing user content after the last my-plugin block.
- `TestStripPluginMarkers_MissingFileIsNoOp` — missing CLAUDE.md must not crash uninstall.

**Live E2E:** ran fixed binary, installed test plugin (skill in `/configs/skills/test-skill/`, rules block in CLAUDE.md), called `DELETE`, confirmed all three artifacts gone, then triggered manual restart and confirmed they stayed gone (the original bug trigger). Other workspace state — `review-loop` skill, `molecule-dev` plugin, surrounding CLAUDE.md content — preserved.

Branch: `fix/106-plugin-uninstall-cleanup`. Closes #106.

## Platform — fix #110: A2A busy-response classification

**Bug:** When an upstream workspace agent is mid-synthesis on a previous request (single-threaded main loop), subsequent A2A requests time out or see the connection reset. The proxy returned `502 failed to reach workspace agent`, indistinguishable from a genuinely unreachable agent. 17 such failures recorded over 7h of self-evol loop traffic.

**Fix** (`workspace-server/internal/handlers/a2a_proxy.go`): `proxyA2AError` gains an optional `Headers` field so handlers can set real response headers. After `a2aClient.Do(req)` errors, we now classify via `isUpstreamBusyError`: `context.DeadlineExceeded`, `io.EOF`, `io.ErrUnexpectedEOF`, or stdlib wrap-strings containing `"context deadline exceeded"`, `"EOF"`, `"connection reset"`.
When the container is alive and the error matches, return `503 Service Unavailable` with `Retry-After: 30` and a JSON body `{"busy": true, "retry_after": 30}`. Fatal / unclassified errors still fall through to the prior 502. Issue #110 Option 3.

**Tests** (`workspace-server/internal/handlers/a2a_proxy_test.go`):

- `TestIsUpstreamBusyError` — 10 error shapes (stdlib typed and url.Error-wrapped strings for both deadline and EOF). Includes negative cases (DNS / refused / unrelated errors).
- `TestProxyA2AError_BusyShape` — end-to-end emit contract: 503 status, `Retry-After: 30` header, JSON body with `busy=true` and `retry_after=30`.

**Live verification attempted but inconclusive:** redirected a workspace URL in Postgres to a hang server, but the platform's Redis URL cache shadows the DB value so the fake upstream was never hit. Unit tests cover every link in the chain (error detection → typed error struct → handler emit), so I'm confident in the change; a real 503-busy will be observable the next time an agent actually stalls under load.

Branch: `fix/110-a2a-busy-response`. Closes #110 (Option 3 — clearer error + Retry-After; queueing and timeout-bump deferred).

## Platform — fix #117: surface Docker image-not-found error on provision

**Bug:** Provisioning a workspace whose runtime image isn't built locally silently failed. `GET /workspaces/:id` returned `{status: "failed", last_sample_error: ""}` — no hint that the image was missing or which build command to run. Discovered during the MeDo hackathon smoke test; only diagnostic path was `docker logs` on the platform container.

**Fix** (two files):

1. `workspace-server/internal/provisioner/provisioner.go::Start` — when `ContainerCreate` returns "No such image", wrap the error with the resolved image tag and the exact `build-all.sh ` command the operator should run. Uses `%w` so `errors.Is`/`errors.As` chains stay intact.
2. `workspace-server/internal/handlers/workspace_provision.go` — on `provisioner.Start` failure, the UPDATE now sets `last_sample_error = $2` alongside `status='failed'`. Previously the error was only logged + broadcast.

**Tests** (`workspace-server/internal/provisioner/provisioner_test.go`):

- `TestIsImageNotFoundErr` — 7 error shapes (moby's exact message, variants, unrelated errors)
- `TestRuntimeTagFromImage` — 6 image-reference shapes including fallback paths
- `TestImageNotFoundErrorIncludesBuildHint` — asserts the wrapped error string includes the image, the build command, and the underlying daemon message

**Live E2E:** provisioned with `runtime: autogen` after `docker rmi workspace-template:autogen`. Before: `last_sample_error: ""`. After: `docker image "workspace-template:autogen" not found — run 'bash workspace/build-all.sh autogen' to build it (underlying error: Error response from daemon: No such image: workspace-template:autogen)`. Image rebuilt after test to restore baseline.

Branch: `fix/117-provisioner-surface-image-error`. Closes #117.

## Phase 30.1 — Workspace auth tokens (SaaS foundation)

**Scope:** first step of Phase 30 (cross-network federation). Per-workspace bearer tokens so remote agents can authenticate themselves to the platform without being spoofable. Transparent to local containers during the transition — legacy workspaces are grandfathered on `/registry/heartbeat` until their next `/registry/register` issues them a token.

**What landed:**

- `workspace-server/migrations/020_workspace_auth_tokens.{up,down}.sql` — new `workspace_auth_tokens` table storing `sha256(plaintext)` + 8-char prefix for display. Plaintext never persisted.
- `workspace-server/internal/wsauth/` — new package: `IssueToken`, `ValidateToken`, `HasAnyLiveToken`, `RevokeAllForWorkspace`, `BearerTokenFromHeader`. Opaque 256-bit tokens (base64url), no JWT.
- `workspace-server/internal/handlers/registry.go::Register` — issues a token on first registration only (idempotent on re-register); returns it in the response body as `auth_token`.
- `registry.go::Heartbeat`, `::UpdateCard` — validate `Authorization: Bearer ` if the workspace has any live token on file. Legacy workspaces with no token → 200 (grandfather path).
- `workspace/platform_auth.py` — new agent-side store: reads `${CONFIGS_DIR}/.auth_token`, in-process cache, `auth_headers()` helper. File is 0600.
- `workspace/main.py` — saves the token returned by register.
- `workspace/heartbeat.py`, `a2a_tools.py`, `molecule_ai_status.py`, `executor_helpers.py` — all four heartbeat call sites now send `auth_headers()`.

**Tests:**

- `workspace-server/internal/wsauth/tokens_test.go` — 11 cases: issuance persists only hash, tokens unique per call, validate happy path, wrong-workspace rejected, unknown token rejected, empty inputs rejected, `HasAnyLiveToken` with 0/1/7 rows, revoke, bearer header parser with 7 inputs.
- `workspace/tests/test_platform_auth.py` — 14 cases: get/save round-trip, 0600 mode, whitespace stripping, empty-token rejection, idempotent saves (no mtime churn), rotation, header format, caching semantics, empty-file handling, CONFIGS_DIR respect + fallback.
- Fixed `tests/test_molecule_ai_status.py::_FakePost` + `exploding_post` to accept `headers=` kwarg (test fixture API drift from the production code change).

**Live E2E verified against real Postgres + running platform:**

- Legacy workspace (no tokens) → heartbeat 200 (grandfathered)
- Fresh register → token returned in response body
- Heartbeat without token (token exists) → 401
- Heartbeat with valid token → 200
- Spoofing with guessed token → 401
- Cross-workspace token reuse (A's token for B) → 401
- Re-register after token issued → response has no `auth_token` (idempotent)

**Test totals:** Go 476 → 487, Python 1064 → 1078.
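The token scheme above (opaque 256-bit base64url tokens, only `sha256(plaintext)` plus a display prefix persisted) can be sketched in Python as follows. The real implementation is the Go `wsauth` package; the function names here are illustrative, and taking the 8-char prefix from the plaintext is an assumption.

```python
import base64
import hashlib
import hmac
import secrets


def issue_token():
    """Return (plaintext, stored_hash, display_prefix).

    Only stored_hash and display_prefix would be persisted; the plaintext
    goes back to the caller once and is never stored.
    """
    raw = secrets.token_bytes(32)  # 256 bits of entropy
    plaintext = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    stored_hash = hashlib.sha256(plaintext.encode("ascii")).hexdigest()
    return plaintext, stored_hash, plaintext[:8]  # prefix source is assumed


def validate_token(presented, stored_hashes):
    """True if `presented` hashes to one of this workspace's stored hashes."""
    if not presented:
        return False
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    return any(hmac.compare_digest(digest, h) for h in stored_hashes)


def bearer_token_from_header(header):
    """Extract the token from an `Authorization: Bearer ...` header value."""
    scheme, _, value = header.partition(" ")
    if scheme.lower() != "bearer":
        return ""
    return value.strip()
```

Passing only one workspace's hashes into `validate_token` is what gives the cross-workspace rejection listed in the live checks: workspace A's token never hashes to anything stored for workspace B.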

**Docs:**

- `docs/remote-workspaces-readiness.md` — full code audit that scopes Phase 30 (five sections: local-only assumptions, existing seams, hard problems, minimum viable remote shape, ordered next steps).
- `PLAN.md` — new Phase 30 section with eight bounded sub-steps (30.1 through 30.8), out-of-scope boundaries, success criteria.

**Branch:** `feat/30.1-workspace-auth-tokens`. First PR of Phase 30.

## Fix #125 — `commit_memory` writes now surface in `activity_logs`

**Bug:** `commit_memory` MCP tool calls succeeded silently. Operators inspecting the Canvas "Agent Comms" tab couldn't see what an agent chose to remember during a task.

**Fix (two files):**

1. `workspace/builtin_tools/memory.py::commit_memory` — on successful write, fire-and-forget a `POST /workspaces/:id/activity` call via new helper `_record_memory_activity(scope, content, memory_id)`. Summary format `[] <80-char preview>… (id=)`. The memory id is embedded in the summary (not target_id) because `target_id` is a UUID column scoped to workspace references; awareness memory ids are arbitrary strings.
2. `workspace-server/internal/handlers/activity.go::Report` — added `memory_write` to the activity_type allowlist. Without this the handler returned 400 with the prior list `{a2a_send, a2a_receive, task_update, agent_log, skill_promotion, error}`.

**Tests:**

- `workspace/tests/test_memory.py` — 6 new cases: posts to `/activity` endpoint with right shape; truncates content >80 chars with ellipsis; strips newlines from summary; skips when `WORKSPACE_ID` or `PLATFORM_URL` is missing; swallows POST failures (must not poison tool path); embeds id in summary regardless.
- `workspace-server/internal/handlers/activity_test.go` — 2 new cases: `memory_write` accepted (200), unknown type still 400 with the updated message including `memory_write`.
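A rough sketch of the summary building and the best-effort reporting those tests describe. It assumes the bracketed slot carries the scope and the `id=` slot carries the memory id (the placeholders are not legible in the format string above), that the JSON field is called `summary`, and that whitespace runs are collapsed to single spaces. `post` is an injected callable, so nothing here touches the network; all names are hypothetical.

```python
PREVIEW_CHARS = 80


def memory_activity_summary(scope, content, memory_id):
    """Build a one-line summary: scope tag, short preview, then the id."""
    flat = " ".join(content.split())  # drops newlines and repeated spaces
    if len(flat) > PREVIEW_CHARS:
        flat = flat[:PREVIEW_CHARS] + "…"
    return f"[{scope}] {flat} (id={memory_id})"


def record_memory_activity(post, env, scope, content, memory_id):
    """Report a memory write; return True only if the POST went through.

    Never raises: a reporting failure must not break the tool call itself.
    """
    workspace_id = env.get("WORKSPACE_ID")
    platform_url = env.get("PLATFORM_URL")
    if not workspace_id or not platform_url:
        return False  # nothing to report to
    payload = {
        "activity_type": "memory_write",
        "summary": memory_activity_summary(scope, content, memory_id),
    }
    try:
        post(f"{platform_url}/workspaces/{workspace_id}/activity", payload)
    except Exception:
        return False  # swallow: activity logging is best-effort
    return True
```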

**Live E2E** against running platform + Postgres:

- Direct curl POST with `activity_type=memory_write` → 200 + DB row
- `_record_memory_activity` from Python → row visible via `GET /workspaces/:id/activity?type=memory_write`
- Confirmed `target_id` UUID-typing rejection from prior attempt (caught the bug — fix lands the id in summary instead)

**Test totals:** Go 487 → 489, Python 1078 → 1084.

Branch: `fix/125-commit-memory-activity-log`. Closes #125.

## Phase 30.2 + 30.5 — Remote secrets pull + A2A caller-token validation

Two bounded steps shipped together since they share the same `wsauth` validation shape.

**30.2 — `GET /workspaces/:id/secrets/values`**

- New handler in `workspace-server/internal/handlers/secrets.go::Values`. Returns the merged decrypted global+workspace secrets as a flat `{"KEY": "value"}` JSON map. Same merge semantics as the provisioner's env-var injection, so a remote agent bootstrapping via pull sees exactly the same secrets a local container would receive via push.
- Auth: Phase 30.1 bearer token required when the workspace has any live token on file. Legacy workspaces grandfathered through. **Fail-closed** on the token-existence check (different from heartbeat's fail-open) because this endpoint returns plaintext secrets.
- Route wired in `workspace-server/internal/router/router.go:170`.

**30.5 — A2A proxy caller-token validation**

- `workspace-server/internal/handlers/a2a_proxy.go::ProxyA2A` now calls `validateCallerToken(ctx, c, callerID)` before the existing CanCommunicate hierarchy check. Three bypass paths preserved: canvas (empty `X-Workspace-ID`), system callers (`webhook:`, `system:`, `test:` prefixes), self-calls (callerID==workspaceID).
- Token binding is strict: compromised token from workspace A cannot authenticate a caller claiming to be workspace B. Tested.
- Fail-open on DB hiccup — caller-token is defense-in-depth on top of hierarchy, not the sole gate.
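The 30.5 decision order can be laid out as a small Python sketch. It is a reading of the bullets above, not the Go handler: `check_caller` and the two injected lookups (`has_live_token`, `token_matches`) are hypothetical stand-ins for the `wsauth` calls, and treating any lookup exception as a "DB hiccup" is an assumption.

```python
SYSTEM_PREFIXES = ("webhook:", "system:", "test:")


def check_caller(caller_id, target_id, token, has_live_token, token_matches):
    """Return (allowed, reason) for the caller-token gate.

    An allowed caller still has to pass the hierarchy check afterwards.
    """
    # Bypass paths: canvas (no caller id), system callers, self-calls.
    if caller_id == "" or caller_id.startswith(SYSTEM_PREFIXES) or caller_id == target_id:
        return True, "bypass"
    try:
        if not has_live_token(caller_id):
            return True, "legacy"  # grandfathered: no token issued yet
        if not token:
            return False, "missing caller auth token"
        if not token_matches(caller_id, token):
            return False, "invalid caller auth token"
    except Exception:
        # Fail-open: this gate is defense-in-depth, hierarchy is the primary gate.
        return True, "fail-open"
    return True, "ok"
```

A fail-closed endpoint such as the secrets pull would differ only in the `except` branch, returning a denial instead of `True`.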

**Tests:**

- 5 new Go tests in `secrets_test.go` (legacy grandfather, missing token, wrong token, valid token with merge precedence, invalid workspace ID).
- 5 new Go tests in `a2a_proxy_test.go::TestValidateCallerToken` (legacy grandfather, missing token, invalid token, valid token, wrong-workspace binding rejection).

**Live E2E verified** against real Postgres + platform:

- 30.2: no-token → 401, bad-token → 401, valid-token → 200 with correct `{"PHASE_30_DEMO":"hello-from-pull-endpoint"}`.
- 30.5: canvas bypass ✓, self-call bypass ✓, system-caller bypass ✓, cross-workspace no-token → 401 "missing caller auth token", cross-workspace wrong-token → 401 "invalid caller auth token", cross-workspace valid-token → 403 "access denied" (falls through to hierarchy check as designed).

**Phase 30 status on main:** 30.1 ✅, 30.2 ✅ (this PR), 30.5 ✅ (this PR). Remaining: 30.3 (plugin tarball), 30.4 (state polling), 30.6 (sibling URL cache), 30.7 (poll-liveness), 30.8 (SDK + GA).

Branch: `feat/30.2-30.5-remote-auth`. PLAN.md checkboxes flipped for 30.1, 30.2, 30.5.

## Phase 30.4 + 30.8 — State polling + Remote-agent SDK (first working e2e)

Shipped together because 30.8 (the runnable example) is the proof-of-life for everything 30.1–30.5 built up to. 30.4 is the missing piece that lets a remote agent detect pause/delete without WebSocket reachability.

**30.4 — `GET /workspaces/:id/state`**

- New handler `workspace.State` at `workspace-server/internal/handlers/workspace.go`. Returns `{workspace_id, status, paused, deleted}`. Token-gated with the same Phase 30.1 shape (legacy grandfather, fail-closed on DB error). Deliberately not merged with `GET /workspaces/:id` — that path is for the canvas (unauthenticated, rich config). This is the agent-machinery polling path: tight, token-gated, cache-friendly.
- Returns 404 + `{deleted: true}` for hard-deleted rows so the SDK can distinguish from transient network issues.
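How a client might read the state response, as a minimal sketch. The field names come from the response shape above; the function name and the four outcome labels are hypothetical, and mapping every other status code to "retry" is an assumption about what counts as transient.

```python
def interpret_state_poll(status_code, body):
    """Map one poll result to 'running', 'paused', 'deleted', or 'retry'."""
    body = body or {}
    if status_code == 404 and body.get("deleted"):
        return "deleted"  # hard-deleted row, not a transient failure
    if status_code == 200:
        if body.get("deleted"):
            return "deleted"
        if body.get("paused"):
            return "paused"
        return "running"
    return "retry"  # anything else is treated as transient here
```

A bare 404 without the `deleted` flag falls into "retry", so a misrouted request is not mistaken for a deletion.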

**30.8 — `sdk/python/molecule_agent/`**

- New `RemoteAgentClient` class (blocking, `requests`-only, no async) with methods mirroring the Phase 30 endpoints: `register()`, `pull_secrets()`, `poll_state()`, `heartbeat()`, `run_heartbeat_loop()`.
- Token cache at `~/.molecule//.auth_token` with 0600 perms. Register is idempotent — re-registering an already-tokened workspace keeps using the on-disk copy.
- Loop exits gracefully on pause/delete, returning the terminal status for the caller to log / exit on.
- Tolerates transient heartbeat + state-poll failures without crashing the loop (log and continue).

**`examples/remote-agent/`**

- Runnable 100-line demo: `WORKSPACE_ID=x PLATFORM_URL=y python3 run.py`. README walks through workspace creation via `external: true`, seeding a secret, running the agent.
- **Note found during live verification:** `POST /registry/register` upserts `status='online'`, so re-registering an already-paused workspace reverts it. Not a bug in 30.4, but it affects the order of operations in the demo (register once, then pause takes effect on the long-running loop). Filed as follow-up — see "Known follow-ups" below.

**Tests:**

- 5 new Go tests for `workspace.State` (legacy grandfather, paused, hard-delete 404, missing token, valid token).
- 22 new Python tests for `RemoteAgentClient` (token persistence with 0600 check, register issues/reuses, secrets pull, state poll, 404 = deleted, heartbeat body shape, loop exits on paused/deleted/max-iterations, transient-error continuation).
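The loop behaviour listed above (exit on pause/delete, log and continue on transient errors, optional iteration cap) can be sketched with injected callables. This is not the SDK's `run_heartbeat_loop`: the signature, the 5-second default interval, and the `"max_iterations"` return value are assumptions for illustration.

```python
import time


def run_heartbeat_loop(send_heartbeat, poll_state, interval=5.0,
                       max_iterations=None, sleep=time.sleep, log=print):
    """Heartbeat and poll until the platform reports a terminal state.

    `poll_state` is expected to return a status string such as 'running',
    'paused', or 'deleted'. Returns the terminal status.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        try:
            send_heartbeat()
        except Exception as exc:
            log(f"heartbeat failed, continuing: {exc}")
        try:
            state = poll_state()
        except Exception as exc:
            log(f"state poll failed, continuing: {exc}")
            state = None
        if state in ("paused", "deleted"):
            return state  # terminal: let the caller log and exit
        sleep(interval)
    return "max_iterations"
```

Injecting `sleep` and `log` keeps the loop testable without real delays, which is one way the max-iterations and transient-error cases can be covered quickly.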

**Live E2E with all of 30.1/30.2/30.4/30.5 running:**

- Agent register → token issued ✓
- `received 2 secret(s): keys=['API_KEY', 'REMOTE_DEMO_KEY']` ✓
- Heartbeat loop runs, uptime advances to 10s ✓
- `POST /pause` mid-loop → `platform reports workspace paused (paused=True deleted=False) — exiting` within ~5s ✓
- Clean terminal status `paused` ✓

**Known follow-ups (not this PR):**

- Register's `status='online'` overwrite undoes platform-side pause if the agent happens to re-register. Should check current status and preserve `paused` / `removed`.
- Loop currently can't receive inbound A2A — `reported_url` is `remote://no-inbound` as a placeholder. A future 30.8b will add an optional `start_a2a_server()` helper for agents behind a public URL or tunneled port.

**PLAN.md:** 30.4 ✅, 30.8 ✅. Phase 30 remaining: 30.3 (plugin tarball), 30.6 (sibling URL cache), 30.7 (poll-liveness monitor).

Branch: `feat/30.4-state-polling` (merged 30.2+30.5 PR #130 into it mid-session for the live E2E to have all endpoints available).

## Phase 30.7 — Poll-liveness for external-runtime workspaces

**Why this is the missing piece:** without it, a dead remote agent stayed "online" on the canvas forever. The existing health sweep explicitly skipped `runtime='external'` rows because it only knew how to ask Docker "is the container alive?" — wrong question for a workspace the platform never started.

**Fix** (`workspace-server/internal/registry/healthsweep.go`):

- New `sweepStaleRemoteWorkspaces` runs on the same ticker as the Docker sweep. Queries workspaces with `runtime='external'` whose `last_heartbeat_at` is older than `REMOTE_LIVENESS_STALE_AFTER` (default 90s, env-overridable). Marks them offline, clears Redis state, fires `onOffline` so the canvas sees `WORKSPACE_OFFLINE`.
- `StartHealthSweep` no longer early-returns on nil Docker checker — a SaaS front-door deployment without local Docker still needs remote-liveness monitoring.
- Newly-registered external workspaces that haven't heartbeated yet are compared against `updated_at` (set on register), so an agent that crashes before its first heartbeat is still swept after the grace window.

**Tests** (`workspace-server/internal/registry/healthsweep_test.go`):

- `sweepStaleRemoteWorkspaces` with 2 stale rows → UPDATE + onOffline called twice
- No stale rows → onOffline never called
- Nil callback → no panic
- DB outage → logged, no panic, no false offlines
- `remoteStaleAfter`: default when env unset; honors valid integer override; falls back on garbage values (`abc`, `0`, `-10`, empty)
- `StartHealthSweep` with nil checker: still ticks and runs remote sweep (previously would early-return)

**Live E2E** with `REMOTE_LIVENESS_STALE_AFTER=10` for test speed:

- Agent register → heartbeat → exit → status=online (heartbeat fresh)
- Wait 30s → **status=offline** (platform swept at 15s tick, saw heartbeat >10s old). Log: `Health sweep (remote): heartbeat stale (>10s) — marking offline`
- Restart agent → heartbeat resumes → **status=online** again
- Full cycle observable on canvas via WORKSPACE_OFFLINE / WORKSPACE_ONLINE broadcasts

**All Phase 30 remote-agent capabilities now demonstrable end-to-end:**

| Step | Live E2E status |
|------|-----------------|
| 30.1 Token auth | ✅ register + heartbeat bearer-auth'd |
| 30.2 Secrets pull | ✅ `keys=['API_KEY','REMOTE_DEMO_KEY']` |
| 30.4 State polling | ✅ pause detected in ~5s |
| 30.5 A2A caller auth | ✅ 401/403 separation confirmed |
| 30.7 Poll-liveness | ✅ stale→offline→restart→online cycle |
| 30.8 SDK + example | ✅ `examples/remote-agent/run.py` |

**Phase 30 remaining:** 30.3 (plugin tarball), 30.6 (sibling URL cache). Neither blocks the current SaaS loop; 30.3 matters when remote agents need to install plugins with heavy deps, 30.6 is a resilience optimization for agent-to-agent direct calls.

Branch: `feat/30.7-poll-liveness`. PLAN.md 30.7 ✅.
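The threshold parsing and the staleness rule from this section, sketched in Python. The Go helper is `remoteStaleAfter`; the Python names below are illustrative, timestamps are plain epoch seconds for simplicity, and "older than" is read as a strict comparison.

```python
DEFAULT_STALE_AFTER_SECONDS = 90


def remote_stale_after(env):
    """Read REMOTE_LIVENESS_STALE_AFTER; fall back to the default on bad input."""
    raw = env.get("REMOTE_LIVENESS_STALE_AFTER", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_STALE_AFTER_SECONDS  # unset, empty, or not a number
    # Zero and negative values would mark everything stale, so reject them too.
    return value if value > 0 else DEFAULT_STALE_AFTER_SECONDS


def is_stale(now, last_heartbeat_at, updated_at, stale_after):
    """True if the workspace has been silent for longer than `stale_after`.

    A workspace that has never heartbeated is measured from `updated_at`,
    which is set on register.
    """
    reference = last_heartbeat_at if last_heartbeat_at is not None else updated_at
    return (now - reference) > stale_after
```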
## Phase 30.6 — Sibling discovery auth + URL caching

Two tied fixes:

**Platform side** — `/registry/discover/:id` and `/registry/:id/peers` were unauthenticated. For a SaaS front-door deployment, any internet host that knows a workspace ID could enumerate siblings and pull their URLs. Added `validateDiscoveryCaller` using the same lazy-bootstrap Phase 30.1 token pattern. Fail-open on DB hiccup (unlike secrets.Values, which fails closed) because discovery only exposes URLs already behind `CanCommunicate` — the hierarchy check downstream is the primary gate, auth is defense-in-depth.

**SDK side** — new methods on `RemoteAgentClient`:

- `get_peers()` → list of `PeerInfo`, seeds URL cache
- `discover_peer(id)` → cached lookup with 5-min TTL, refreshes on expiry, returns None on 404
- `invalidate_peer_url(id)` → drop cache entry (call after a direct-call failure so next call re-discovers)
- `call_peer(id, message, prefer_direct=True)` → sends A2A message/send. Direct path on cache hit; graceful fallback to platform proxy on connection error / 5xx with cache invalidation. `prefer_direct=False` forces proxy routing.
- New `PeerInfo` dataclass exported alongside `WorkspaceState`.

**Tests:** 12 new SDK tests (cache seeding skips non-http URLs, cache hit short-circuits, expired cache refreshes, 404 returns None, invalidate_peer_url idempotent, direct-path vs proxy-fallback vs prefer-direct=False, fresh call with no cache does discover-then-direct).

**Bug caught during verification:** my first discovery auth shape fail-closed on DB errors, which broke existing `TestDiscover_*` and `TestPeers_*` tests that didn't set up the `HasAnyLiveToken` sqlmock expectation. Switched to fail-open — discovery is hierarchy-gated anyway, and a DB hiccup shouldn't take agent-to-agent discovery offline. 8 tests restored green.
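The SDK-side caching and fallback can be sketched as a small TTL cache plus a routing function. This is an illustration under assumptions, not the `RemoteAgentClient` code: `PeerUrlCache` and the injected `discover`, `send_direct`, and `send_via_proxy` callables are hypothetical, accepting `https://` alongside `http://` is a guess, and only `ConnectionError` is treated as a direct-call failure here (the 5xx case is left out).

```python
import time


class PeerUrlCache:
    """Peer id -> direct URL, each entry valid for a fixed time-to-live."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # peer_id -> (url, expires_at)

    def seed(self, peer_id, url):
        """Cache only directly callable URLs; report whether it was cached."""
        if not url.startswith(("http://", "https://")):
            return False  # e.g. remote:// placeholders are skipped
        self._entries[peer_id] = (url, self._clock() + self._ttl)
        return True

    def get(self, peer_id):
        entry = self._entries.get(peer_id)
        if entry is None:
            return None
        url, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[peer_id]  # expired: force a re-discover
            return None
        return url

    def invalidate(self, peer_id):
        self._entries.pop(peer_id, None)  # safe to call repeatedly


def call_peer(cache, peer_id, message, discover, send_direct, send_via_proxy,
              prefer_direct=True):
    """Try the cached or discovered direct URL first, else use the proxy."""
    if not prefer_direct:
        return send_via_proxy(peer_id, message)
    url = cache.get(peer_id)
    if url is None:
        url = discover(peer_id)
        if url is None or not cache.seed(peer_id, url):
            return send_via_proxy(peer_id, message)  # nothing direct to call
    try:
        return send_direct(url, message)
    except ConnectionError:
        cache.invalidate(peer_id)  # next call re-discovers
        return send_via_proxy(peer_id, message)
```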

**Live E2E** with a tiny Python echo server as sibling-B:

- `get_peers` returns 2 peers (echo server + parent PM)
- URL cache seeded ONLY with `http://` entry (skips `remote://pm`)
- `call_peer` routes **directly to `http://127.0.0.1:9876`** — no proxy hop
- Echo server responds, SDK returns `"echoed: hello sibling over SDK"`
- Auth + hierarchy all verified: no-token→401, wrong-token→401, cross-workspace token→401, out-of-hierarchy discover→403

**Phase 30 status after this:** 30.1 ✅ 30.2 ✅ 30.4 ✅ 30.5 ✅ 30.6 ✅ 30.7 ✅ 30.8 ✅. Only 30.3 (plugin tarball download) remains, and I flagged that one as lower priority — the current SaaS loop doesn't need it until a real user has a heavy-deps plugin.

Branch: `feat/30.6-sibling-cache`.

## Fix #123 — Telegram `kicked`/`left` now persists `enabled=false`

**Bug:** When the Molecule AI bot was removed from a Telegram chat, the handler at `telegram.go:594-596` only logged the event — the matching `workspace_channels` row stayed `enabled=true`. Every subsequent outbound message hit Telegram 403 forever.

**Fix:**

- New package-level callback `disableChannelByChatID` in `telegram.go`, default no-op (safe for early boot / tests).
- `manager.go::NewManager` wires it to run `UPDATE workspace_channels SET enabled=false WHERE channel_type='telegram' AND enabled=true AND config->>'chat_id'=$1`, then call `m.Reload(ctx)` if any row flipped so the in-memory poller map drops the now-disabled row.
- `onMyChatMember::case "left", "kicked"` now calls the callback immediately after the existing log line (removes the TODO).

**Tests** (`workspace-server/internal/channels/channels_test.go`):

- default-is-no-op (var safe to call pre-Manager-init)
- wired-callback fires UPDATE with exact WHERE shape + arg + triggers Reload via follow-up SELECT
- no-rows-affected skips reload (avoids SELECT storm on unrelated kicked events from other bots)

Branch: `fix/123-telegram-kicked-persist`. Closes #123.
## Phase 30 client adaptations — MCP / molecli / Canvas / SDK

Phase 30 itself shipped the platform-side endpoints. These adaptations make those endpoints **visible and usable** from every client surface without requiring callers to know the new URL paths by hand.

**MCP** — 4 new tools in `mcp-server/src/index.ts`:

- `list_remote_agents` — filters workspace list to runtime='external'
- `get_remote_agent_state` — projects {workspace_id, status, paused, deleted}
- `get_remote_agent_setup_command` — emits the `WORKSPACE_ID=... PLATFORM_URL=... python3 ...` bash one-liner an operator can paste into a remote shell
- `check_remote_agent_freshness` — compares last_heartbeat_at against configurable threshold (default 90s); returns {fresh, seconds_since_heartbeat}

8 new MCP tests (88 → 96).

**molecli** — `WorkspaceInfo` gains a `Runtime` field; `printWorkspaceTable` adds a RUNTIME column showing `★ external` for remote agents so they pop in a long table; detail view labels them `external (Phase 30 remote agent)`. Live: `molecli ws list` now shows the badge correctly.

**Canvas** — `WorkspaceNode.tsx` reads `data.runtime` (workspace row) in preference to `data.agentCard.runtime` (agent-reported). Remote agents get a distinct violet `★ REMOTE` pill with a tooltip explaining the heartbeat-based lifecycle. 352/352 vitests still pass.

**SDK** — `pyproject.toml` rebranded `molecule-sdk@0.2.0` so a single `pip install molecule-sdk` ships both `molecule_plugin` (plugin authors) and `molecule_agent` (remote-agent authors). Added trove classifiers, keywords, requires-python pin. New `sdk/python/molecule_agent/README.md` quickstart.

**Live verification:**

- MCP: spawned a real external workspace, ran all 4 tools via node smoke script — count=1, setup_command renders, freshness=null (no heartbeat yet, returns fresh=false correctly)
- molecli: `ws list` shows `★ external` badge on the remote workspace
- Canvas tests green; visual change is small (one badge swap)
- SDK: 121 SDK tests + 1078 workspace-template tests still pass

Branch: `feat/phase30-client-adaptations`.

## Phase 30.3 — Plugin tarball download (external GitHub repo verified)

**Platform:** new `GET /workspaces/:id/plugins/:name/download[?source=...]` streams the named plugin as a gzipped tarball. Reuses `resolveAndStage` so all existing source schemes (`local://`, `github://`, future `clawhub://`) work — the endpoint is just the download surface for what Install was already doing internally. Token-gated (fail-closed on DB error since the tarball can include rule text and skill files referencing internals). Defaults source to `local://` when the query param is omitted. Validates that the URL path's plugin name matches the resolved plugin's manifest name — prevents a github source resolving to a different name from being shipped under the requested name.

**SDK `RemoteAgentClient.install_plugin(name, source=None)`:**

1. Stream the download
2. Atomic extract via sibling-tempdir + rename (no half-installed states)
3. Run `setup.sh` if present (best-effort)
4. POST `/workspaces/:id/plugins` to register the install

`_safe_extract_tar` rejects path-traversal (`../escape`, absolute paths) and silently skips symlinks/hardlinks — defends against tar-slip CVEs. Tested with both adversarial inputs.
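A minimal sketch of the extraction rules and the sibling-tempdir install, using only the standard library. It is not the SDK's `_safe_extract_tar`: the function names are illustrative, files are written by hand rather than through `tarfile`'s own extract call, and replacing an existing install by remove-then-rename is an assumption (that step is not strictly atomic).

```python
import io
import os
import shutil
import tarfile
import tempfile


def safe_extract_tar(tar, dest):
    """Extract regular files and directories; refuse anything escaping `dest`."""
    dest_root = os.path.realpath(dest)
    for member in tar.getmembers():
        if member.issym() or member.islnk():
            continue  # silently skip symlinks and hardlinks
        if os.path.isabs(member.name):
            raise ValueError(f"absolute path in tarball: {member.name}")
        target = os.path.realpath(os.path.join(dest_root, member.name))
        if target != dest_root and not target.startswith(dest_root + os.sep):
            raise ValueError(f"path escapes destination: {member.name}")
        if member.isdir():
            os.makedirs(target, exist_ok=True)
        elif member.isfile():
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            os.chmod(target, member.mode & 0o777)  # keep e.g. setup.sh executable


def extract_atomically(tar_bytes, final_dir):
    """Unpack into a sibling temp dir, then rename it into place."""
    parent = os.path.dirname(os.path.abspath(final_dir))
    os.makedirs(parent, exist_ok=True)
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
            safe_extract_tar(tar, staging)
        if os.path.exists(final_dir):
            shutil.rmtree(final_dir)  # overwrite an existing install
        os.rename(staging, final_dir)
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)  # leave no half-install
        raise
```

Because the staging directory sits next to the final one, the rename stays on one filesystem, and a corrupt or hostile tarball leaves the previous install untouched.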
**Tests:**
- 5 new Go (auth, tarball shape, name mismatch, tar streaming relative paths, tar symlink skip)
- 11 new Python SDK (unpack location, source query param, atomic rollback on corrupt tarball, overwrite existing, setup.sh ran/skipped, platform-report skipped, 404 surfaces, `_safe_extract` path-traversal rejection, absolute-path rejection, symlink skip)

**Live E2E** with a real external GitHub repo created via `gh repo create` (`HongmingWang-Rabbit/starfire-test-plugin`):
- `local://molecule-dev` → 4612-byte tarball, plugin.yaml + skills/ present
- `github://HongmingWang-Rabbit/starfire-test-plugin` → 711-byte tarball pulled from real GitHub, unpacked locally, **setup.sh ran on the agent's host machine** producing `/tmp/sf-plugin-test-setup-ran`
- Auth gates: 401/401/200 confirmed
- Name-mismatch: requested `wrong-name` with `source=...starfire-test-plugin` returned 400 with `{"resolved_name":"starfire-test-plugin","requested_name":"wrong-name"}`

Phase 30 is now feature-complete: 30.1 ✅ 30.2 ✅ 30.3 ✅ 30.4 ✅ 30.5 ✅ 30.6 ✅ 30.7 ✅ 30.8 ✅

Branch: `feat/30.3-plugin-tarball`. Test repo: https://github.com/HongmingWang-Rabbit/starfire-test-plugin

## Bugfix #124 — Delegation idempotency

Promoted from `docs/known-issues.md` KI-002. When a workspace container restarted mid-delegation (Redis TTL → liveness restart), agents could re-issue `POST /workspaces/:id/delegate` and produce duplicate work (double commits, double Telegram messages, double API calls).

**Migration `021_delegation_idempotency.up.sql`:**
- `activity_logs.idempotency_key TEXT NULL`
- Partial unique index on `(workspace_id, idempotency_key) WHERE idempotency_key IS NOT NULL` — fully backwards compatible

**Handler (`workspace-server/internal/handlers/delegation.go::Delegate`):**
- Optional `idempotency_key` field on the request body
- On receipt: lookup `(workspace_id, key)` → if found and not `failed`, return existing delegation_id with HTTP 200 + `idempotent_hit: true`
- If the prior row is `failed`, the slot is released so the retry can produce a fresh delegation (still 202)
- If two concurrent calls race past the lookup, the unique-constraint violation on insert is caught and the loser re-queries to surface the same idempotent response (HTTP 200) instead of a 500

**Tests** (3 new + 2 updated, all green under `go test -race`):
- `TestDelegate_IdempotentReplayReturnsExistingDelegation`
- `TestDelegate_IdempotentFailedRowIsReleasedAndReplaced`
- `TestDelegate_IdempotentRaceUniqueViolationReturnsExisting`
- Updated `TestDelegate_Success` and `TestDelegate_DBInsertFails_Still202WithWarning` to assert the new 6th INSERT arg (idempotency_key = NULL when omitted)

Branch: `fix/auto-review-2026-04-13-delegation-idempotency`. Closes #124.
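
The lookup, release, and race rules above can be sketched with an in-memory SQLite database, which also supports partial unique indexes. This is an illustration, not the Go handler: the trimmed-down table columns, the `pending` status, the helper names, and releasing the slot by nulling the old key are all assumptions.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE activity_logs (
    delegation_id   TEXT PRIMARY KEY,
    workspace_id    TEXT NOT NULL,
    status          TEXT NOT NULL,
    idempotency_key TEXT NULL
);
CREATE UNIQUE INDEX idx_delegation_idem
    ON activity_logs (workspace_id, idempotency_key)
    WHERE idempotency_key IS NOT NULL;
"""


def _lookup(db, workspace_id, key):
    """Return (delegation_id, status) for the key, or None."""
    return db.execute(
        "SELECT delegation_id, status FROM activity_logs "
        "WHERE workspace_id = ? AND idempotency_key = ?",
        (workspace_id, key),
    ).fetchone()


def delegate(db, workspace_id, idempotency_key=None):
    """Return (http_status, body) following the rules listed above."""
    if idempotency_key is not None:
        row = _lookup(db, workspace_id, idempotency_key)
        if row is not None and row[1] != "failed":
            return 200, {"delegation_id": row[0], "idempotent_hit": True}
        if row is not None:
            # Prior attempt failed: free the slot so the retry gets a new row
            # (assumed mechanism: clear the key on the failed row).
            db.execute(
                "UPDATE activity_logs SET idempotency_key = NULL "
                "WHERE delegation_id = ?",
                (row[0],),
            )
    new_id = str(uuid.uuid4())
    try:
        db.execute(
            "INSERT INTO activity_logs VALUES (?, ?, 'pending', ?)",
            (new_id, workspace_id, idempotency_key),
        )
    except sqlite3.IntegrityError:
        if idempotency_key is None:
            raise  # not an idempotency conflict
        # A concurrent caller won the race: return its row instead of a 500.
        row = _lookup(db, workspace_id, idempotency_key)
        return 200, {"delegation_id": row[0], "idempotent_hit": True}
    return 202, {"delegation_id": new_id}


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    print(delegate(db, "ws-1", "retry-key"))  # first call creates a row
    print(delegate(db, "ws-1", "retry-key"))  # replay returns the same id
```

Rows without a key never collide because the partial index ignores `NULL` keys, which is why the migration is backwards compatible with callers that omit the field.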