Compare commits

...

89 Commits

Author SHA1 Message Date
hongming f5cc9493bb Merge pull request 'feat(security): RFC#523 3-layer forbidden-env guardrail for tenant workspaces (task #146)' (#1555) from feat/146-forbidden-env-guard into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / all-required (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Harness Replays / detect-changes (push) Successful in 15s
publish-runtime-autobump / pr-validate (push) Successful in 32s
publish-runtime-autobump / bump-and-tag (push) Successful in 38s
Harness Replays / Harness Replays (push) Successful in 3s
2026-05-19 01:57:30 +00:00
hongming 71ad3ffe1d Merge pull request 'fix(sop-checklist): widen ack eligibility per RFC#450 Option C (closes internal#442)' (#1554) from fix/sop-checklist-widen-ack-internal-442 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m35s
2026-05-19 01:57:08 +00:00
hongming a3fc350c6e Merge pull request 'test(e2e): local prod-mimic backend for peer-visibility MCP gate + make e2e-peer-visibility (task #166)' (#1551) from e2e/peer-visibility-local-backend-task166 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / all-required (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (push) Waiting to run
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Failing after 1m18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m8s
2026-05-19 01:57:06 +00:00
hongming 57364c1bed Merge pull request 'ci: arm64-lane pilot (additive shellcheck on Mac runner) [#233]' (#1553) from ci/mac-arm64-pilot-shellcheck into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / all-required (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
2026-05-19 01:56:16 +00:00
hongming acc149e18e Merge pull request 'fix(canvas/chat): surface actionable error reason in chat banner + link to Activity tab (internal#212)' (#1550) from fix/canvas-surface-error-detail into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Successful in 2m50s
2026-05-19 01:56:12 +00:00
hongming 83ad7e252b Merge pull request 'fix(workspace-server): surface secret-safe error_detail on ACTIVITY_LOGGED (internal#212)' (#1549) from fix/wsserver-broadcast-error-detail into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
2026-05-19 01:56:10 +00:00
hongming d27df740f5 Merge pull request 'fix(ws-server): close self-fire restart feedback loop (internal#544)' (#1556) from fix/ws-server-self-fire-restart-loop into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
publish-workspace-server-image / build-and-push (push) Successful in 5m15s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 24s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
CI / Platform (Go) (push) Successful in 5m20s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m16s
CI / Python Lint & Test (push) Successful in 6m53s
CI / Canvas (Next.js) (push) Successful in 7m19s
CI / all-required (push) Successful in 7m8s
publish-workspace-server-image / Production auto-deploy (push) Has been cancelled
2026-05-19 01:40:54 +00:00
core-devops 4bf87d122d fix(ws-server): close self-fire restart feedback loop (internal#544)
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4m38s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 6m11s
CI / Python Lint & Test (pull_request) Successful in 6m56s
CI / all-required (pull_request) Successful in 6m29s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
audit-force-merge / audit (pull_request) Successful in 5s
Three-layer cohesive fix for the 2026-05-19 ~00:05-00:09Z 4x reprov thrash
class observed on prod-Reviewer + prod-Researcher: a single secrets PUT
fanned out into 4x stop+provision cycles per workspace within 4 min,
each stopping the just-launched (still-pending) EC2 of the previous
cycle. Root-caused via Loki (provision.ec2_started / ec2_stopped pairs).

Empirical chain (all in workspace-server/internal/handlers/):
1. secrets.go SetSecret → go h.restartFunc → coalesceRestart cycle.
2. runRestartCycle sets url='' synchronously, then async provisions EC2.
3. During 20-30s pending window: url='' AND cpProv.IsRunning()==false
   — indistinguishable from a dead container.
4. Canvas /delegations poll OR the trailing restart-context probe fires
   ProxyA2A → maybeMarkContainerDead OR preflightContainerHealth →
   RestartByID → loop.
5. coalesceRestart's pending flag drains by running ANOTHER full cycle
   → ec2_stopped of the just-booted instance → re-provision.

Fix (single PR, three interdependent layers):

L1) Restart-aware health probes — workspace_restart.go exposes
    isRestarting(workspaceID) bool. Both maybeMarkContainerDead and
    preflightContainerHealth early-return false/nil while a restart
    cycle is in flight. Breaks the self-fire at the probe layer.

L2) Restart-context probe gate — sendRestartContext now requires
    url != '' AND last_heartbeat_at > restart_start_ts before firing
    the trailing ProxyA2A probe. Adds waitForFreshHeartbeat() next to
    waitForWorkspaceOnline. Belt-and-suspenders so the probe never
    tries until the new container is actually addressable.

L3) RestartByID debounce — silent-drop successive RestartByID calls
    within restartDebounceWindow=60s of restartStartedAt. Not coalesce
    (which would still drain to another full cycle). Drop is observable
    via restartByIDDropCounter (atomic.Uint64) + the dropped log line.
    Only programmatic path; HTTP Restart handler is unaffected.

Tests:
- TestIsRestarting_{FalseWhenNoStateEntry,TrueWhileCycleRunning}
- TestMaybeMarkContainerDead_SkippedWhileRestarting (L1)
- TestPreflightContainerHealth_SkippedWhileRestarting (L1)
- TestRestartByID_DebounceSilentDrop (L3, counter assertion)
- TestRestartByID_DebounceExpiresAfterWindow (L3, window release)
- TestRestartByID_SingleProvisionPerRestart (regression — asserts
  exactly 1 cycle per trigger, with 4 dropped self-fire probes)

Existing coalesce/restart/preflight/maybeMarkContainerDead tests
remain green. Full handlers suite: ok in 15.8s.

Closes internal#544.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:24:09 -07:00
core-security aabf933a5c feat(security): RFC#523 3-layer forbidden-env guardrail for tenant workspaces (task #146)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m14s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m18s
CI / Platform (Go) (pull_request) Successful in 5m6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
publish-runtime-autobump / pr-validate (pull_request) Successful in 27s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
qa-review / approved (pull_request) Failing after 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m26s
CI / Canvas (Next.js) (pull_request) Successful in 6m10s
CI / Python Lint & Test (pull_request) Successful in 6m38s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m20s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m32s
E2E Chat / E2E Chat (pull_request) Failing after 5m29s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1m4s
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 4s
Refuse to start a tenant workspace if any operator-fleet-scope env var
name is present. Threat model: a leaked GITEA_TOKEN /
CP_ADMIN_API_TOKEN / RAILWAY_TOKEN / INFISICAL_OPERATOR_TOKEN /
MOLECULE_OPERATOR_* in a tenant container would let a compromised
agent escalate from "compromise of one workspace" to "compromise of
the whole platform."

3-layer defense-in-depth:

L1 — provisioner-side fail-closed abort (Go):
  workspace_provision_forbidden_env.go + prepareProvisionContext hook.
  Runs immediately after loadWorkspaceSecrets, BEFORE the per-agent
  persona GIT_HTTP_* injection that legitimately sets a fallback
  GITEA_TOKEN. Catches leaks from the operator-controlled stores
  (global_secrets, workspace_secrets). The existing forensic #145
  silent-strip guard in provisioner.buildContainerEnv stays as
  defense-in-depth.

L2 — workspace/entrypoint.sh top-of-file env-grep + exit 1:
  Fires if both upstream layers are bypassed (e.g. docker run -e
  GITEA_TOKEN=... standalone). MOLECULE_TENANT_GUARD_DISABLE=1
  bypass for local-dev. POSIX-portable (busybox/alpine/debian).

L3 — .gitea/workflows/lint-forbidden-env-keys.yml:
  Scans workspace-server/internal/**.go for new code that hardcodes a
  forbidden env-var name. Exempts the deny-set definitions + the
  pre-existing persona-fallback paths whose downstream silent-strip +
  new L1 fail-closed already cover the runtime risk.

Tests:
  - L1: TestIsForbiddenTenantEnvKey_ExactMatches,
        TestIsForbiddenTenantEnvKey_PrefixMatches,
        TestFindForbiddenTenantEnvKeys_NoneAndEmpty,
        TestFindForbiddenTenantEnvKeys_SingleAndMultipleSorted,
        TestFormatForbiddenTenantEnvError_Phrasing
  - L2: workspace/tests/test_entrypoint_forbidden_env_guard.sh
        (12 cases — clean/per-agent/each-forbidden/prefix/disable-flag)
  - L3: verified locally that current tree passes + synthetic offender
        is caught

Open-source-template-friendly: the deny set lives in Go and YAML
constants, not hardcoded in any open-source template's start.sh.
Per memory feedback_open_source_templates_no_hardcoded_org_internals,
templates published as separate repos (template-codex / template-
hermes / template-openclaw) get their L2 added in follow-up template
PRs with a fork-friendly default deny set (no MOLECULE_-specific
literal). The MOLECULE_OPERATOR_ prefix appears only in the
internal claude-code template's entrypoint.sh.

Refs:
  - RFC#523 (internal#523)
  - Task #146
  - memory feedback_passwords_in_chat_are_burned
  - memory feedback_per_agent_gitea_identity_default
  - memory feedback_open_source_templates_no_hardcoded_org_internals
  - memory feedback_check_vendor_docs_and_actual_source_before_guess_api_shape
    (POSIX env-set semantics verified via shell test; Go os.Environ /
    map[string]string contract verified via go test)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 18:22:08 -07:00
hongming 11cd1b4c40 fix(sop-checklist): widen ack eligibility per RFC#450 Option C (closes internal#442)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 9s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m16s
sop-tier-check / tier-check (pull_request) Successful in 9s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Platform (Go) (pull_request) Successful in 5m34s
CI / Canvas (Next.js) (pull_request) Successful in 7m0s
CI / Python Lint & Test (pull_request) Successful in 7m4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 7s
The sop-checklist senior-ack gate has been blocking PRs because
`root-cause` and `no-backwards-compat` required `[managers, ceo]` acks,
but every managers/ceo persona token is dead (uid:0 / 401) and the `ceo`
team is one human. Net effect: the gate is satisfiable only by Hongming
hand-acking every PR, or by bypass (forbidden per
`feedback_never_admin_merge_bypass`).

Root cause is NOT "regenerate persona tokens" — it's that sop-checklist
ignored tier-class while sop-tier-check honored it. This PR implements
RFC#450 Option C (risk-classed two-eyes):

- Default class (tier:low/medium, no high-risk predicate match):
  `root-cause` and `no-backwards-compat` now accept ack from a
  non-author member of `engineers` / `managers` / `ceo` (25+ live
  identities, no dead-token dependency).
- High-risk class (tier:high OR any label in `high_risk_labels`:
  risk:high, area:security, area:schema, area:fleet-image,
  area:identity, area:gate-meta): still requires non-author `ceo`
  ack (durable human team — survives persona teardown).

Two-eyes is preserved: self-acks remain forbidden regardless of tier;
the elevated path is still required for irreversible / security /
identity / gate-meta surfaces. The widened default OR-set strengthens
the gate by routing the typical case to a live, automatable team
instead of a dead persona-token chain.

Mechanism:
- `.gitea/sop-checklist-config.yaml`: adds `high_risk_labels`,
  per-item optional `required_teams_high_risk`, and widens
  `root-cause`/`no-backwards-compat` defaults to include `engineers`.
- `.gitea/scripts/sop-checklist.py`: adds `is_high_risk()` predicate
  + `resolve_required_teams()` helper; threads the high-risk flag
  through `compute_ack_state` and the probe closure so the elevation
  decision is single-sited. Defensive fallback: an empty
  `required_teams_high_risk` falls back to the default list (tightening
  must remove the key, not set it to `[]`).
- Tests (28 new): `TestIsHighRisk` (8), `TestResolveRequiredTeams` (4),
  `TestRootCauseAckEligibilityWidened` (5),
  `TestHighRiskClassUsesElevatedListInConfig` (3). All 79 tests pass.

Refs internal#442, RFC#450.
2026-05-18 18:19:13 -07:00
hongming 4d6be109c7 ci: add arm64-lane pilot (additive shellcheck on Mac runner)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m25s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m22s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m40s
qa-review / approved (pull_request) Failing after 8s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
security-review / approved (pull_request) Failing after 4s
sop-tier-check / tier-check (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m29s
CI / Canvas (Next.js) (pull_request) Successful in 6m2s
CI / Python Lint & Test (pull_request) Successful in 6m52s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 10s
Mac-CI dual-track #233 pilot. Adds a single additive non-required
workflow that targets [self-hosted, arm64] runners and runs shellcheck
against .gitea/scripts/*.sh. Until a Mac arm64 runner is registered
with the `arm64` label, this workflow sits PENDING, which is fine —
`arm64` is NOT in branch_protections/main.status_check_contexts (only
'CI / all-required (pull_request)' is required, verified live via API).

Why shellcheck for the pilot: pure userspace, no docker.sock, no
privileged ops, identical output across arm64/amd64, narrow blast
radius. A clean signal for whether the lane works.

Pairs with internal#543 (RFC: Mac arm64 native multi-arch runner-base).
88 LoC, well under the <=100 line guidance. No required gate changes.
2026-05-18 18:16:02 -07:00
core-devops 2de81cdd85 build(make): expose e2e-peer-visibility target + fix help filter for digit-containing names (task #166)
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Failing after 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 57s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m7s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m29s
CI / Platform (Go) (pull_request) Successful in 4m56s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m25s
security-review / approved (pull_request) Failing after 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m17s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
CI / Canvas (Next.js) (pull_request) Successful in 6m13s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m58s
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 4s
Wires the local peer-visibility MCP gate into the Makefile so a
developer can run it via `make e2e-peer-visibility` against an
already-up local prod-mimic stack (`make up`), without remembering the
bash path. This is the dev-side counterpart to the CI job added in
the same commit on this branch — together they close task #166's
"wire into local-E2E gate" ask.

The help-line grep regex didn't include digits, so the new
e2e-peer-visibility target was correctly defined but invisible to
`make help`. Adds [0-9] to the character class and widens the label
column to 22 chars so longer target names line up. Other targets are
unaffected.

NOT auto-merged (per task #166 instructions). See PR body for the
verification + the manual command for ad-hoc runs without the make
target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:51:43 -07:00
core-qa 84cba60ec2 test(e2e): add LOCAL backend for the peer-visibility MCP gate
PR #1298 added the peer-visibility gate but staging-only. Per the
standing rule that the local prod-mimic stack must run a MANDATORY
local-Postgres E2E BEFORE staging E2E (feedback_local_must_mimic_
production, feedback_mandatory_local_e2e_before_ship, feedback_local_
test_before_staging_e2e), peer-visibility must also run locally so
regressions are caught fast/cheap instead of late on cold EC2.

- Factor the byte-identical assertion core out of
  test_peer_visibility_mcp_staging.sh into tests/e2e/lib/
  peer_visibility_assert.sh::pv_assert_runtime. It drives the literal
  JSON-RPC tools/call name=list_peers envelope to POST /workspaces/:id/
  mcp via each workspace's OWN bearer through the real WorkspaceAuth +
  MCPRateLimiter chain, with the same anti-proxy / anti-native-fallback
  guarantees. NOT a proxy: no registry row, /health, heartbeat, or
  GET /registry/:id/peers. Only provisioning differs per backend.
- Refactor the staging script to source the shared lib (assertion
  byte-identical; provisioning/teardown/exit-codes unchanged).
- Add tests/e2e/test_peer_visibility_mcp_local.sh: local docker-compose
  backend — POST /workspaces directly, e2e_mint_test_token for the MCP
  bearer (same model test_priority_runtimes_e2e.sh / test_api.sh use,
  no new credential flow), wait online, run the shared assertion,
  scoped per-workspace teardown only (feedback_cleanup_after_each_test,
  feedback_never_run_cluster_cleanup_tests_on_live_platform). bash-3.2-
  safe (no associative arrays) so it runs on local macOS dev boxes too.
- Wire a peer-visibility-local job into e2e-peer-visibility.yml,
  bootstrapped exactly like e2e-api.yml's proven E2E API Smoke Test
  (per-run container names + ephemeral ports, go build, background
  platform-server). Runs on PR + push (local boot is minutes, not the
  30+ min cold-EC2 path), so peer-visibility is part of the local gate
  that fires before the staging E2E. Its OWN non-required status
  context `E2E Peer Visibility (local)` — non-required-by-design like
  the staging job, HONEST gate with NO continue-on-error mask
  (feedback_fix_root_not_symptom); flip-to-required tracked at #1296
  via the bp-required: pending directive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:50:01 -07:00
core-devops 44affbde24 fix(canvas/chat): surface actionable error reason in chat banner + link to Activity tab (internal#212)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 55s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 42s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m19s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 43s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 37s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
publish-runtime-autobump / pr-validate (pull_request) Successful in 29s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m22s
CI / Canvas (Next.js) (pull_request) Successful in 4m18s
CI / Platform (Go) (pull_request) Successful in 4m46s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m14s
E2E Chat / E2E Chat (pull_request) Failing after 1m4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 39s
Harness Replays / Harness Replays (pull_request) Successful in 41s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m21s
CI / Python Lint & Test (pull_request) Successful in 6m52s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1m12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8m31s
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 17s
The chat error banner used to render the hardcoded
"Agent error (Exception) — see workspace logs for details." string
regardless of what the workspace runtime actually reported, and the
"workspace logs" reference pointed at a tab that does not exist (there
is no separate Logs tab in the side panel — the Activity tab is the
workspace-logs surface). Per CTO feedback on internal#211 / #212:
"the user can only act if they can see why."

useChatSocket now forwards the new ACTIVITY_LOGGED.error_detail field
(introduced server-side in the matching ws-server PR) into
onSendError. When present, the canvas shows the secret-safe reason
verbatim (provider HTTP status + error code + human-readable
message); when absent — older ws-server build — it gracefully
degrades to the legacy boilerplate so we never silently swallow a
failure.

A new ChatErrorBanner component renders the banner with a working
"View activity log" button that fires setPanelTab("activity"),
turning the dangling "see workspace logs" pointer into a real
affordance. The existing offline-Restart button is preserved.

Tests pin: hook forwards detail when present, falls back when absent,
ignores cross-workspace error events; banner renders the actionable
text, falls back to legacy message when that is all we have, button
navigates to Activity tab, Restart preserved when offline, null
message renders nothing.

Refs: internal#212, feedback_surface_actionable_failure_reason_to_user
2026-05-18 17:39:09 -07:00
hongming 81825575f9 Merge pull request 'fix(provisioner): inject GIT_HTTP_USERNAME/PASSWORD env from persona token (closes Dev-A/B durable git auth gap from mc#1525)' (#1542) from fix/provisioner-inject-git-http-creds-from-persona-token into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Failing after 1s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
CI / Canvas (Next.js) (push) Failing after 20s
CI / Canvas Deploy Reminder (push) Has been skipped
CI / all-required (push) Failing after 2s
E2E API Smoke Test / detect-changes (push) Successful in 23s
E2E Chat / detect-changes (push) Successful in 18s
publish-workspace-server-image / build-and-push (push) Successful in 5m33s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 20s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
publish-workspace-server-image / Production auto-deploy (push) Failing after 23s
CI / Platform (Go) (push) Successful in 3m18s
CI / Python Lint & Test (push) Successful in 6m35s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 27s
Harness Replays / Harness Replays (push) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
E2E Chat / E2E Chat (push) Failing after 56s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m0s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m13s
main-red-watchdog / watchdog (push) Successful in 35s
gate-check-v3 / gate-check (push) Successful in 22s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 12s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m13s
ci-required-drift / drift (push) Successful in 1m0s
status-reaper / reap (push) Has started running
gitea-merge-queue / queue (push) Successful in 9s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 23s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m34s
2026-05-19 00:34:07 +00:00
hongming 22d2f8a6fc Merge pull request 'fix(ci): remove 3 silently-dead .github/ workflows using workflow_run (task #81)' (#1541) from fix/ci-remove-dead-workflow-run-task81 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-19 00:33:40 +00:00
hongming a053ca6f72 Merge pull request 'fix(runtime): close self-delegation echo gap in builtin_tools + inbox kind classification (#190 / #193)' (#1539) from fix/self-delegation-echo-runtime-builtin-tools into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-runtime-autobump / pr-validate (push) Successful in 34s
publish-runtime-autobump / bump-and-tag (push) Successful in 36s
2026-05-19 00:33:36 +00:00
hongming dfc9d91ccd Merge pull request 'docs: fix stale channel-install + Molecule-AI org references (#230)' (#1538) from fix/docs-stale-channel-install-task230 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
2026-05-19 00:32:52 +00:00
hongming 9fb7060e9c Merge pull request 'feat(canvas): homepage SEO for marketing launch (mc#1486)' (#1537) from feat/homepage-seo-mc-1486 into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Successful in 4m0s
2026-05-19 00:32:50 +00:00
hongming 567937e2bc Merge pull request 'fix(canvas): extend mc#1535 per-workspace MCP slug to codex/openclaw/hermes/kimi (multi-workspace class sweep)' (#1536) from fix/multi-workspace-install-snippets-class-sweep into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 1m11s
E2E API Smoke Test / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
publish-runtime-autobump / pr-validate (push) Successful in 37s
publish-runtime-autobump / bump-and-tag (push) Successful in 34s
CI / Platform (Go) (push) Successful in 2m48s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 57s
publish-workspace-server-image / build-and-push (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
CI / all-required (push) Has been cancelled
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m14s
status-reaper / reap (push) Has started running
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
gitea-merge-queue / queue (push) Successful in 9s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m27s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m45s
2026-05-19 00:28:47 +00:00
core-devops 94eff31c20 fix(workspace-server): surface secret-safe error_detail on ACTIVITY_LOGGED broadcast (internal#212)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Failing after 8s
Harness Replays / Harness Replays (pull_request) Has been skipped
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 56s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 18s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
E2E Chat / detect-changes (pull_request) Successful in 24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 41s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 58s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 47s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
qa-review / approved (pull_request) Failing after 8s
gate-check-v3 / gate-check (pull_request) Successful in 11s
sop-checklist / na-declarations (pull_request) N/A: (none)
publish-runtime-autobump / pr-validate (pull_request) Successful in 1m12s
sop-checklist / all-items-acked (pull_request) Successful in 11s
security-review / approved (pull_request) Failing after 12s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 36s
sop-tier-check / tier-check (pull_request) Successful in 13s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m53s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 2m0s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 24s
E2E Chat / E2E Chat (pull_request) Failing after 53s
CI / Python Lint & Test (pull_request) Successful in 6m10s
CI / Platform (Go) (pull_request) Successful in 7m0s
CI / Canvas (Next.js) (pull_request) Successful in 8m36s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1m18s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 17s
When an a2a_receive row is persisted with status="error" the DB column
error_detail already carries the actionable cause (provider HTTP
status, error code, provider human message). The live ACTIVITY_LOGGED
broadcast dropped it, so the canvas chat-tab error banner fell back
to a hardcoded "Agent error (Exception) — see workspace logs for
details." string with no logs tab to navigate to.

Include error_detail in the broadcast payload, omitted when nil so the
canvas's "has actionable reason" guard doesn't false-positive on empty
keys. Defense-in-depth: a sanitizeErrorDetailForBroadcast scrubber
redacts anything that looks credential-shaped (bearer tokens, sk-
prefixed API keys, JWTs) while preserving the actionable parts
(status codes, error codes, human-readable provider messages) — over-
redacting would defeat the whole point of internal#212.

Tests pin: detail surfaces on the wire, omitted when nil, scrubber
removes secret shapes but keeps actionable text, scrubber survives
the broadcast round-trip.

Refs: internal#212
2026-05-18 17:28:41 -07:00
hongming 80a5f51c27 Merge pull request 'fix(a2a-mcp): use readline() not read(65536) for pipe-safe stdio (openclaw peer-visibility root cause)' (#1307) from fix/a2a-mcp-stdio-pipe-blocking-readline into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Waiting to run
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-runtime-autobump / pr-validate (push) Waiting to run
publish-runtime-autobump / bump-and-tag (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 11s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m25s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m34s
2026-05-19 00:28:29 +00:00
infra-sre 95c84021c2 fix(provisioner): inject GIT_HTTP_USERNAME/PASSWORD env from persona token
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
CI / Platform (Go) (pull_request) Successful in 2m44s
CI / Canvas (Next.js) (pull_request) Successful in 5m54s
CI / Python Lint & Test (pull_request) Successful in 6m28s
CI / all-required (pull_request) Successful in 6m32s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 26s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Failing after 55s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m0s
audit-force-merge / audit (pull_request) Successful in 7s
Closes the durable-git-auth gap left by template-claude-code#30 +
mc#1525 for the prod-team workspaces (agent-dev-a / agent-dev-b /
agent-pm). The askpass binary + GIT_ASKPASS env wiring shipped in
the template image and ws-server side respectively, but no code path
in workspace-server actually read the persona's git token from the
operator-host bootstrap dir and exported it as the askpass-readable
env-var pair. Without this, the askpass helper invokes with empty
password env and git fails the auth challenge in <500ms (live-
verified for Dev-A/Dev-B 2026-05-18 ~23:55Z via EC2 instance-connect
docker exec).

The new applyAgentGitHTTPCreds helper reads
$MOLECULE_PERSONA_ROOT/<role>/token (defaulting to
/etc/molecule-bootstrap/personas/<role>/token, the canonical
operator-host bootstrap-kit path) and emits GIT_HTTP_USERNAME +
GIT_HTTP_PASSWORD into the workspace envVars map.

Why a dedicated env-var pair instead of reusing GITEA_USER /
GITEA_TOKEN: the provisioner's forensic #145 SCM-write-token
denylist strips GITEA_TOKEN by exact key name before docker run.
The same token bytes shipped under the generic GIT_HTTP_PASSWORD
key survive transport because askpass reads that lane first.
GITEA_USER + GITEA_TOKEN are ALSO set for the askpass fallback
chain; GITEA_TOKEN is then dropped by buildContainerEnv as
designed, but the GIT_HTTP_PASSWORD lane already carries the
bytes the in-container helper needs.

Wired into prepareProvisionContext (the mode-agnostic shared
prep step both Docker and SaaS paths call) so Dev-A/Dev-B on
EC2 + any future local-Docker prod-team workspace pick it up
without duplicating the call site. Runs AFTER applyAgentGitIdentity
so workspace_secrets named GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD
(operator-supplied via POST /workspaces/:id/secrets) win over
the persona-file default.

Silent no-op for: empty role, multi-word descriptive roles
("Frontend Engineer") that fail isSafeRoleName, missing persona
dir, empty token file, traversal-attempt role names. These cases
fall through to the existing workspace_secrets / org-import
persona-env merge path unchanged.

No hardcoded git.moleculesai.app — the env-var pair is generic
askpass protocol and works for any git remote the deployer points
GIT_ASKPASS at.

Security note: this routes around forensic #145 by name (the
denylist is exact-key-match, not key-substring). For the
prod-team identities (agent-dev-{a,b,pm}) this is the explicitly-
designed shape per reference_prod_team_infisical_identities
(per-agent Gitea identities with pull+push, NO admin, NOT in any
merge-whitelist — merge stays gated by hardened BP 2-approvals+CI
per reference_merge_gate_model_changed_2026_05_18). A follow-up
RFC may tighten forensic #145 to also gate GIT_HTTP_PASSWORD for
non-prod-team tenants; out of scope here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:11:12 -07:00
infra-sre aff482a43c fix(ci): remove silently-dead .github/ workflows using workflow_run trigger (task #81)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 5s
CI / Platform (Go) (pull_request) Successful in 2m47s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
CI / Canvas (Next.js) (pull_request) Successful in 5m37s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 6m44s
CI / all-required (pull_request) Successful in 6m23s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 6s
Three workflows under .github/workflows/ used `on: workflow_run:`,
an event Gitea 1.22.6 does not support (per
feedback_pull_request_review_no_refire family + lint-workflow-yaml
Rule 2). They were also living in the wrong directory: molecule-core's
Gitea Actions runtime reads ONLY .gitea/workflows/ (per
reference_molecule_core_actions_gitea_only). So these files were
doubly dead — wrong path AND unsupported trigger.

Two of them already have working replacements under .gitea/workflows/
that landed in commit 2ee7cb14 (2026-05-12, replaced workflow_run
with push+paths). The third (canary-verify.yml) was superseded by
staging-verify.yml (push-on-staging) + staging-smoke.yml (schedule).

Removed → live replacement:
  - .github/workflows/canary-verify.yml
      → .gitea/workflows/staging-verify.yml (push+paths)
      + .gitea/workflows/staging-smoke.yml (schedule cron)
  - .github/workflows/redeploy-tenants-on-main.yml
      → .gitea/workflows/redeploy-tenants-on-main.yml (workflow_dispatch)
  - .github/workflows/redeploy-tenants-on-staging.yml
      → .gitea/workflows/redeploy-tenants-on-staging.yml (push+paths)

No runtime behavior change — these files were never executed by the
Gitea Actions runner. Removing them eliminates the dead-letter risk:
an operator scanning .github/workflows/ would otherwise believe an
auto-redeploy chain still exists post-publish, which it does not.

Refs: feedback_gitea_workflow_dispatch_inputs_unsupported,
reference_molecule_core_actions_gitea_only,
feedback_pull_request_review_no_refire.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:09:57 +00:00
core-devops cde433f2df ci: re-trigger after runner-pool zombie drain + ENOSPC remediation
cascade-list-drift-gate / check (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 43s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 46s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 47s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 34s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
review-check-tests / review-check.sh regression tests (pull_request) Successful in 6s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m16s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m32s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
qa-review / approved (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 10s
publish-runtime-autobump / pr-validate (pull_request) Successful in 27s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) Successful in 7s
security-review / approved (pull_request) Failing after 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m22s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 25s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m13s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m35s
CI / Canvas (Next.js) (pull_request) Successful in 4m27s
CI / Platform (Go) (pull_request) Successful in 4m45s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m17s
CI / Python Lint & Test (pull_request) Successful in 6m49s
CI / all-required (pull_request) Successful in 6m50s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m20s
CI / Canvas Deploy Reminder (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 16m33s
audit-force-merge / audit (pull_request) Successful in 6s
Prior CI failures on this PR were infra-class (Detect changes hit
'Error: ENOSPC: no space left on device' from runner disk-full caused
by 120 zombie tasks since drained; Python Lint flaked on perf test
test_batch_fetcher_runs_submitted_rows_concurrently by 3ms under
contended runners — same test passes cleanly on main HEAD 1b0e947).
Re-firing CI on recovered runners; no code change. [no-op]
2026-05-18 17:09:37 -07:00
core-devops 90e115ba55 fix(runtime): close self-delegation echo gap in builtin_tools + inbox kind
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 28s
gate-check-v3 / gate-check (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
qa-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 6s
publish-runtime-autobump / pr-validate (pull_request) Successful in 48s
sop-checklist / all-items-acked (pull_request) Successful in 9s
sop-tier-check / tier-check (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 3m55s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 4m26s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1m54s
CI / Python Lint & Test (pull_request) Successful in 7m6s
CI / all-required (pull_request) Successful in 7m2s
audit-force-merge / audit (pull_request) Successful in 6s
Task #190 / #193 — surface the self-delegation echo guard at every runtime
delegation entry point, and classify platform-pushed delegation-result
rows distinctly from peer_agent messages so a delegation timeout never
appears to the caller as a fake peer instruction.

Three layers were affected and only two were guarded:

  1. workspace/a2a_tools_delegation.py — already had the guard (added in
     #548 / #469). Untouched.
  2. workspace-server/internal/handlers/delegation.go — Go API gate
     already had the guard. Untouched.
  3. workspace/builtin_tools/a2a_tools.py::delegate_task — framework-
     agnostic adapter surface used by adapters that don't go through (1).
     NO GUARD. Added.
  4. workspace/builtin_tools/delegation.py::delegate_task_async — the
     LangChain @tool fire-and-forget path. NO GUARD on the local helper
     (it dispatched the background _execute_delegation coroutine to our
     own URL). Added.

Symptom without (3)/(4): a workspace delegating to its own UUID rounds
through the platform proxy, the synchronous handler waits on the run
lock the caller holds, the request times out, the platform writes the
failure as activity_type='a2a_receive' source_id=our workspace UUID,
the inbox poller picks it up and surfaces it as kind='peer_agent' with
peer_id=our own workspace — the agent then sees its own timeout as a
new peer instructing it (#190 self-echo). Reply via delegate_task to
that "peer" re-triggers the loop.

Inbox-side fix (workspace/inbox.py): InboxMessage.to_dict() now
classifies rows with method='delegate_result' as kind='delegation_result'
regardless of peer_id. This makes pushDelegationResultToInbox results
(RFC #2829 PR-2) surface as STRUCTURED delegation outcomes to the
caller's wait_for_message instead of fake peer_agent messages. This
covers both the self-delegation echo path AND the cross-workspace
ProxyA2A failure path where the delegation result lands in the caller's
inbox with source_id=caller's own workspace UUID.

Tests added:
  - tests/test_a2a_tools_module.py::TestSelfDelegationGuard — verifies
    the builtin_tools/a2a_tools.py guard short-circuits BEFORE any HTTP
    call, and lets a real peer through.
  - tests/test_delegation.py::TestSelfDelegationGuard — verifies
    builtin_tools/delegation.py::delegate_task_async returns the
    structured rejection error without scheduling a background task.
  - tests/test_inbox.py::test_message_from_activity_delegate_result_distinct_kind
    — pins kind='delegation_result' for method='delegate_result' rows
    so the #190 mis-classification regression is locked.

Runtime mirror (molecule-ai-workspace-runtime) is a publish artifact of
this directory — it picks up the fix automatically on the next
runtime-v* tag → publish-runtime workflow → PyPI 0.1.1003.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:53:51 -07:00
documentation-specialist f233f71f5a docs: fix stale channel-install + Molecule-AI org references (#230)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
CI / Platform (Go) (pull_request) Successful in 2m53s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 3s
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 31s
sop-checklist / all-items-acked (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 4m21s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 7m3s
CI / all-required (pull_request) Successful in 6m50s
audit-force-merge / audit (pull_request) Successful in 5s
CONTRIBUTING.md:195 had a wrong `--channels` install string
(`plugin:molecule@Molecule-AI/molecule-mcp-claude-channel` — both the
plugin-name format and the dead GitHub-org path are stale). Aligned all
three doc surfaces (CONTRIBUTING.md, README.md, README.zh-CN.md) with
the actual install pattern emitted by workspace-server/internal/
handlers/external_connection.go (externalChannelTemplate):

  /plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git
  /plugin install molecule@molecule-channel
  claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel

Also normalised display labels for the now-canonical Gitea org
(`Molecule-AI/` → `molecule-ai/`) — these are link captions, the URLs
were already correct. Docs-only, no behavioural change.

Task #230. Refs memory `feedback_github_botring_fingerprint` (canonical
SCM = git.moleculesai.app/molecule-ai/...).
2026-05-18 16:48:16 -07:00
core-devops 82a6cf42cd feat(canvas): homepage SEO for marketing launch (mc#1486)
CI / Detect changes (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 17s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 11s
sop-checklist / all-items-acked (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4s
sop-tier-check / tier-check (pull_request) Successful in 14s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
CI / Canvas (Next.js) (pull_request) Successful in 4m18s
CI / Platform (Go) (pull_request) Successful in 4m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m8s
CI / Python Lint & Test (pull_request) Successful in 6m43s
CI / all-required (pull_request) Successful in 6m55s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m20s
audit-force-merge / audit (pull_request) Successful in 8s
Adds the standard Next.js App-Router SEO surface to the canvas
landing so the marketing push has crawlable metadata + structured
data on day one.

What landed:
  - layout.tsx — Metadata API: title.template, description,
    keywords, canonical, metadataBase, OG/Twitter text fields,
    robots index:true. JSON-LD @graph (Organization + WebSite +
    SoftwareApplication) injected with the per-request CSP nonce.
  - robots.ts — allow public marketing routes (/, /pricing, /blog),
    disallow /orgs, /api/, /cp/, /checkout/; declares sitemap +
    canonical host.
  - sitemap.ts — apex + pricing + live blog post; authed routes
    excluded by construction.
  - opengraph-image.tsx — segment-level dynamic OG card via
    next/og ImageResponse (1200x630); no static binary blob.
  - __tests__/seo-routes.test.ts — pins the crawler contract
    (10 cases) so a future refactor can't silently flip the
    marketing surface to noindex or drop the sitemap.

Out of scope (per issue): design copy, hero rewrite, Lighthouse
CWV tuning. Those are CTO/marketing inputs and a separate ticket.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:47:03 -07:00
devops-engineer 1b0e947bdd Merge pull request 'feat(provisioner): uniform T4 privilege contract + YAML emitter' (#1531) from feat/t4-privilege-contract-uniform into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 16s
E2E Chat / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 48s
E2E Chat / E2E Chat (push) Failing after 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
Harness Replays / Harness Replays (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 30s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 32s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 4m51s
publish-workspace-server-image / build-and-push (push) Successful in 5m52s
CI / Platform (Go) (push) Successful in 6m23s
CI / Python Lint & Test (push) Successful in 7m7s
CI / Canvas (Next.js) (push) Successful in 7m24s
CI / Canvas Deploy Reminder (push) Successful in 5s
CI / all-required (push) Successful in 7m43s
publish-workspace-server-image / Production auto-deploy (push) Successful in 1m51s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m35s
main-red-watchdog / watchdog (push) Successful in 31s
gate-check-v3 / gate-check (push) Successful in 59s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 9s
ci-required-drift / drift (push) Successful in 38s
gitea-merge-queue / queue (push) Successful in 7s
status-reaper / reap (push) Successful in 47s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 8m0s
2026-05-18 23:20:52 +00:00
devops-engineer ebf88a469f Merge pull request 'fix(canvas/socket): wake WebSocket on visibilitychange / pageshow (#223 / #228)' (#1530) from fix/canvas-ws-visibility-reconnect into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Successful in 3m28s
2026-05-18 23:20:48 +00:00
devops-engineer bcc66ecdcf Merge pull request 'fix(mobile): bump focused-input font-size to 16px (kills iOS focus-zoom)' (#1528) from fix/mobile-ios-focus-zoom-inputs into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Detect changes (push) Has been cancelled
CI / Shellcheck (E2E scripts) (push) Has been cancelled
CI / Platform (Go) (push) Has been cancelled
CI / all-required (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
E2E Chat / detect-changes (push) Has been cancelled
publish-canvas-image / Build & push canvas image (push) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (push) Has been cancelled
E2E API Smoke Test / detect-changes (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
Secret scan / Scan diff for credential-shaped strings (push) Has been cancelled
2026-05-18 23:20:30 +00:00
devops-engineer 06b0ec8f12 Merge pull request 'fix(canvas): make "Add to Claude Code" snippet use unique server name per workspace (multi-workspace)' (#1535) from fix/add-to-claude-code-unique-server-name-per-workspace into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
CI / all-required (push) Has been cancelled
E2E Chat / detect-changes (push) Has been cancelled
CI / Platform (Go) (push) Has been cancelled
Harness Replays / detect-changes (push) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / Detect changes (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m13s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m22s
2026-05-18 23:20:17 +00:00
core-devops bb35c771f9 fix(canvas): extend mc#1535 per-workspace MCP slug across codex/openclaw/hermes/kimi/Kimi snippets + PyPI README
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) Successful in 9s
qa-review / approved (pull_request) Successful in 12s
sop-tier-check / tier-check (pull_request) Successful in 10s
publish-runtime-autobump / pr-validate (pull_request) Successful in 34s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
CI / all-required (pull_request) Compensating status: emitter dropped CI / all-required ctx on this commit (gitea 1.22.6 null-state). 2 non-author APPROVEs present, sop-checklist/sop-tier-check/gate-check-v3 all Successful per status descriptions. Posted per feedback_gitea_emitter_null_state_blocks_merge.
audit-force-merge / audit (pull_request) Successful in 5s
## Summary
- mc#1535 fixed the per-session-overwrite bug in the Universal MCP
  snippet (`claude mcp add molecule -s user` keyed by `molecule`, so
  installing for a second workspace silently replaced the first). The
  same equivalence-class bug exists in EVERY other runtime tab the
  Canvas modal renders: each MCP host keys its config by name, and all
  five templates hardcoded a fixed `molecule` identifier.
- This PR extends mc#1535's existing `{{MCP_SERVER_NAME}}` placeholder
  + `mcpServerNameForWorkspace()` helper into the 4 remaining
  templates so the Canvas snippet a user pastes is unique per
  workspace by construction across ALL runtime tabs — multi-workspace
  works out-of-the-box with no per-host workarounds.

## Bug shape per runtime tab (mc#1535 sibling)
- **codex** (`~/.codex/config.toml`): `[mcp_servers.molecule]` — TOML
  rejects duplicate table keys, so re-paste either breaks parsing or
  overwrites.
- **openclaw** (`~/.openclaw/mcp/molecule.json`): `openclaw mcp set
  molecule` keyed by name — second workspace overwrites.
- **hermes** (`~/.hermes/config.yaml`): `plugin_platforms.molecule:` —
  YAML rejects duplicate mapping keys, second workspace silently
  collapses.
- **kimi** (`~/.molecule-ai/kimi-workspace/`): single per-host dir —
  second workspace's env+bridge.py overwrites the first.

## What changed
- `workspace-server/internal/handlers/external_connection.go`:
  - 4 templates now stamp `{{MCP_SERVER_NAME}}` (the same slug
    mc#1535 already derives + plumbs into the universal_mcp snippet)
    in the keyed identifier:
    - codex: `[mcp_servers.{{MCP_SERVER_NAME}}]` + `.env` table.
    - openclaw: `openclaw mcp set {{MCP_SERVER_NAME}}` + log path.
    - hermes: `plugin_platforms.{{MCP_SERVER_NAME}}:`.
    - kimi: `~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/` dir + embedded
      python `ENV` path.
  - Header comment in each template documents the multi-workspace
    contract (mirrors mc#1535's universal_mcp header).
- `workspace-server/internal/handlers/external_rotate_test.go`:
  - New `TestBuildExternalConnectionPayload_AllRuntimeSnippetsAreWorkspaceUnique`
    pins the per-template literal that proves the slug was stamped,
    AND asserts no template leaves a literal `{{MCP_SERVER_NAME}}`
    placeholder — catches a future template author who forgets to
    register a new tab with the stamp pipeline.
- `workspace/a2a_mcp_server.py`:
  - Comment-only update on `serverInfo.name` to reflect that the
    per-host registration name is workspace-specific. No code change;
    `serverInfo.name` stays the generic `"molecule"` self-label.
- `scripts/build_runtime_package.py` (PyPI README generator):
  - Updates 3 `claude mcp add molecule -- molecule-mcp` references to
    `claude mcp add molecule-<workspace-slug> -- molecule-mcp` so the
    PyPI README matches the Canvas-stamped snippet pattern.
  - Adds a "Server name in `claude mcp add` is workspace-specific"
    bullet pointing at mc#1535 + this PR for context.

## Open-source-templates cleanliness check
- Templates touched here live in the PRIVATE molecule-core repo
  (Canvas modal generator); they STAMP per-workspace server names but
  do NOT bake any new `git.moleculesai.app` literal or other
  org-internal infra. Generic `pip install
  'git+https://git.moleculesai.app/molecule-ai/hermes-channel-molecule.git'`
  in the hermes template is the only such URL touched and was
  pre-existing — that one points at a public hermes-side plugin and
  has its own canonical URL; not in scope for the open-source-template
  rule (the rule applies to template-codex/template-hermes/
  template-openclaw — separate public repos, untouched here).
- No `.moleculesai.app` literal added; persona-token shape unchanged
  (auth_token still per-workspace minted by Rotate/Create — same path
  mc#1535 audited).

## Sample stamped snippets (workspace name "my-bot", slug "molecule-my-bot")
- codex:    `[mcp_servers.molecule-my-bot]` + `[mcp_servers.molecule-my-bot.env]`
- openclaw: `openclaw mcp set molecule-my-bot "$(cat <<EOF ... )"`
- hermes:   `plugin_platforms:\n  molecule-my-bot:\n    enabled: true`
- kimi:     `~/.molecule-ai/kimi-molecule-my-bot/{env,kimi_bridge.py}`

## Diff size
- 4 files, +135/-40 LoC. Most of it is comment text + the new test.
- Did NOT change `BuildExternalConnectionPayload` signature or
  `mcpServerNameForWorkspace` semantics — both were already plumbed
  by mc#1535 to all 8 snippets via the stamp closure; this PR only
  updates the template text to USE the placeholder.

## Test plan
- [x] `go test ./internal/handlers/ -run TestBuildExternalConnectionPayload` — 5/5 green, including new `_AllRuntimeSnippetsAreWorkspaceUnique`.
- [x] `go test ./internal/handlers/` full package — 15.9s green.
- [x] `go vet ./internal/handlers/` — clean.
- [ ] Manual (post-merge, requires mc#1535 also merged): create two
      "bot-a" + "bot-b" external workspaces on staging; paste each
      tab's snippet into the corresponding host on a single machine;
      verify `claude mcp list` / `cat ~/.codex/config.toml` /
      `openclaw mcp list` / `~/.hermes/config.yaml` / `ls
      ~/.molecule-ai/` each shows BOTH workspaces' entries side-by-
      side, not overwriting.

## Sequencing
- This PR's base is mc#1535's branch
  (`fix/add-to-claude-code-unique-server-name-per-workspace`),
  because it reuses mc#1535's `{{MCP_SERVER_NAME}}` placeholder +
  slug helper + `BuildExternalConnectionPayload(workspaceName)`
  signature change. Will need a rebase on main after mc#1535 lands;
  prefer to keep stacked to make the review of EACH PR scope-tight.
- CTO 2026-05-18 22:43Z: "其实是我们没有做好instruction,这个得补充" —
  this PR is the consolidated per-repo doc/generator fix.

## Related
- Sibling: mc#1535 (Universal MCP snippet, already open).
- Follow-up #230: molecule-core stale channel-install mentions
  (CONTRIBUTING.md:195, etc.) — separate scope.

Author identity: core-devops (per-role persona; not founder-PAT).
Opened for non-author review, NOT auto-merged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:05:51 -07:00
core-devops 9a3db439ec fix(canvas): make "Add to Claude Code" snippet use unique server name per workspace (multi-workspace)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
gate-check-v3 / gate-check (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 16s
qa-review / approved (pull_request) Failing after 14s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 9s
sop-checklist / all-items-acked (pull_request) Successful in 8s
sop-tier-check / tier-check (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 9s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m6s
E2E Chat / E2E Chat (pull_request) Failing after 1m4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m31s
CI / Platform (Go) (pull_request) Successful in 2m50s
CI / Canvas (Next.js) (pull_request) Successful in 5m16s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m7s
CI / Python Lint & Test (pull_request) Successful in 6m35s
CI / all-required (pull_request) Successful in 6m47s
audit-force-merge / audit (pull_request) Successful in 5s
The Universal MCP install snippet hardcoded `claude mcp add molecule -s user`
— `claude mcp add` keys entries by name, so installing for workspace B
silently overwrote workspace A in the user's ~/.claude.json. A single
external Claude Code session ended up able to talk to only ONE molecule
workspace at a time — the CTO-observed "this is per-session" UX
(2026-05-18 22:28Z). MCP itself supports many servers per session; the
install snippet was the only thing standing in the way.

Fix: derive a unique server name per workspace at payload-build time —
`molecule-<slug>` where slug = lowercased/hyphen-collapsed workspace
name (max 24 chars), falling back to the first 8 chars of the workspace
ID when the name is empty or slugifies to nothing. The result is
alphanumeric + hyphens only (URL-safe + Claude-Code-name-safe).

Plumbed through all 3 callers of BuildExternalConnectionPayload:
- Create (workspace.go) passes payload.Name directly.
- Rotate / GetExternalConnection (external_rotate.go) extend the
  existing runtime lookup to also SELECT name in the same round-trip
  (lookupWorkspaceRuntimeAndName replaces lookupWorkspaceRuntime —
  one query, no extra DB load).

Snippet header now documents the multi-workspace contract: re-running
the snippet from another workspace's modal ADDS another entry; same-
name workspaces collide by design, rename one to disambiguate.

Surgical: only externalUniversalMcpTemplate gained a {{MCP_SERVER_NAME}}
placeholder. Other tabs (Python SDK / curl / Hermes / codex / openclaw /
kimi) already use distinct config keys per provider and aren't affected.

Tests: TestBuildExternalConnectionPayload_McpServerNameUniquePerWorkspace
pins 4 cases (plain name, name w/ spaces+caps, name w/ symbols, empty
name fallback to UUID prefix) — would have caught the original
"claude mcp add molecule" regression. Existing rotate/get tests updated
for the 2-column SELECT.

Related: task #229 (molecule-mcp-claude-channel install-doc blockers).
This is the canvas-side counterpart — that PR fixed the plugin docs,
this PR fixes the modal-generator snippet operators actually copy.

Sample generated lines (was → now):
  was: claude mcp add molecule -s user -- env WORKSPACE_ID=... molecule-mcp
  now: claude mcp add molecule-my-bot -s user -- env WORKSPACE_ID=... molecule-mcp
  (where "my-bot" is the workspace name; "molecule-12345678" if unnamed)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:36:55 -07:00
infra-runtime-be 533502da35 feat(provisioner): uniform T4 privilege contract + YAML emitter
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 23s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 33s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 2m58s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 34s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 5s
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 52s
E2E Chat / E2E Chat (pull_request) Failing after 1m7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 11s
CI / Canvas (Next.js) (pull_request) Successful in 5m53s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 6m31s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 6m29s
audit-force-merge / audit (pull_request) Successful in 10s
Adds workspace-server/internal/provisioner/t4_privilege_contract.go as the
single source of truth for the T4 ("full machine access") capability set
that template-repo CI workflows currently re-implement as bespoke shell.

Today's t4-conformance gates in template-claude-code / template-hermes /
template-codex each hand-assert agent-uid + token-ownership + host-root
reach. The shell drifts (the very Hermes 401 class bug came from drift),
and there's no way to add a new capability fleet-wide without N template
PRs.

This contract:

  * Defines T4Capability as code (Name/Description/Probe/Severity/Source)
  * Lists the closure: agent_uid_1000, auth_token_agent_owned,
    host_root_reach_via_nsenter, host_fs_write_readback,
    docker_socket_reachable, list_peers_http_200, agent_home_writable,
    network_egress_https, privileged_flag_observable, pid_host_visible
  * Renders to YAML via AsYAML() and cmd/t4-contract-dump so any
    template CI can do:
       go run ./workspace-server/cmd/t4-contract-dump > t4_capabilities.yaml
    and iterate capabilities — new capabilities propagate without
    per-template PRs.
  * Pure stdlib + no Molecule-AI-internal deps so fork users can adopt
    the same contract.

Anti-drift unit tests (7, all green):
  - all caps have required fields
  - names unique
  - core closure (RFC#456 + task #128/#174) is present
  - hard-severity is strict majority
  - YAML is deterministic + escapes double quotes
  - YAML header cites internal#456
  - AgentUID const consistent with probes

Does NOT change Docker/Dockerfile or any existing emit-side behavior;
this is purely additive. The provisioner.go T4 branch is unchanged.
Templates adopt the YAML in a separate PR (pilot:
template-claude-code).

Refs: RFC internal#456, task #174, memory
reference_per_template_privilege_contract_class_audit_2026_05_16,
memory feedback_hermes_listpeers_401_token_root600_unreadable_by_uid1000.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:38:58 -07:00
core-fe c2110c799d fix(canvas/socket): wake WebSocket on visibilitychange / pageshow
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Platform (Go) (pull_request) Successful in 2m41s
sop-checklist / all-items-acked (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
CI / Canvas (Next.js) (pull_request) Successful in 5m58s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 6m49s
CI / all-required (pull_request) Successful in 6m39s
E2E Chat / E2E Chat (pull_request) Failing after 4m58s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 11m53s
audit-force-merge / audit (pull_request) Successful in 10s
Mobile browsers (iOS Safari, Chrome on Android in deep-sleep) silently
drop the WebSocket when the tab is backgrounded. The in-page `onclose`
fires very late or never, so the reconnect backoff never schedules — the
canvas appears frozen until the user manually refreshes. Symptoms:

  - #223 mobile canvas chat has no real-time updates (must refresh)
  - #228 cross-device: user's own chat input doesn't broadcast to
         other sessions in real time (must refresh)

Root cause: `canvas/src/store/socket.ts` had no visibility-wake. The
reconnect loop only re-arms on `onclose`, and mobile OSes don't always
fire `onclose` when they kill the WS.

Fix:
  - Add `ReconnectingSocket.wake()` — forces an immediate reconnect
    when the socket is in CLOSED / CLOSING / null limbo, no-op when
    OPEN or CONNECTING. Pre-empts any pending backoff timer and resets
    the attempt counter (this was a user-initiated wake, not an
    unattended-tab failure cascade).
  - Wire a module-level `visibilitychange` + `pageshow` listener inside
    `connectSocket()`; remove it in `disconnectSocket()`. `pageshow`
    covers Safari's bfcache restore where `visibilitychange` doesn't
    fire on its own.
  - Export `wakeSocket()` so the test suite can exercise the path
    without depending on a jsdom DOM (the existing socket.test.ts
    runs under the `node` environment).

Tests (5 new cases under `wakeSocket → reconnect`):
  - wake on OPEN: no new WS
  - wake on CLOSED: new WS created (the #223 fix)
  - wake on CONNECTING: no extra handshake piled on
  - wake cancels pending backoff `setTimeout`
  - wake after `disconnectSocket()` is a no-op (no zombie)

Closes #223
Closes #228
2026-05-18 14:37:56 -07:00
core-fe 679d86a9be fix(mobile): bump focused-input font-size to 16px (kills iOS focus-zoom)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 15s
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request) Failing after 10s
sop-checklist / all-items-acked (pull_request) Successful in 7s
security-review / approved (pull_request) Failing after 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 35s
sop-tier-check / tier-check (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Failing after 1m0s
CI / Platform (Go) (pull_request) Successful in 5m9s
CI / Python Lint & Test (pull_request) Successful in 6m9s
CI / Canvas (Next.js) (pull_request) Successful in 6m33s
CI / all-required (pull_request) Successful in 6m34s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8m0s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 9s
iOS Safari and PWAs auto-zoom the viewport when a focused input or
textarea has a computed font-size below 16px. Two mobile-canvas inputs
were below that bound, causing the layout to jump and look broken on
focus until the user pinched back:

  - MobileSpawn.tsx agent-name input (fontSize: 13.5) — #225
  - MobileChat.tsx composer textarea (fontSize: 14.5) — #224

Both bumped to 16px (the minimum that suppresses focus-zoom). This is
the same class of bug as desktop #1434, scoped here to the mobile
breakpoint.

Tests:
  - MobileSpawn.test: assert agent-name input renders at fontSize >= 16
  - MobileChat.test:  assert composer textarea renders at fontSize >= 16
Both parse the inline style.fontSize (jsdom has no layout engine, so
getComputedStyle reports the inline value verbatim).

Closes #224
Closes #225
2026-05-18 14:34:58 -07:00
hongming 03337955ca Merge pull request 'fix(ci): review-check.sh — read issue comments for agent-approval fallback' (#1492) from fix/review-check-agent-comment-approval into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
E2E API Smoke Test / detect-changes (push) Successful in 18s
E2E Chat / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
review-check-tests / review-check.sh regression tests (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
Harness Replays / Harness Replays (push) Successful in 7s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m18s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m13s
CI / Platform (Go) (push) Successful in 2m56s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m33s
CI / Canvas (Next.js) (push) Successful in 4m22s
CI / Canvas Deploy Reminder (push) Successful in 1s
publish-workspace-server-image / build-and-push (push) Successful in 4m18s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m11s
E2E Chat / E2E Chat (push) Failing after 5m20s
CI / Python Lint & Test (push) Successful in 7m12s
CI / all-required (push) Successful in 7m25s
publish-workspace-server-image / Production auto-deploy (push) Successful in 4m34s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m36s
main-red-watchdog / watchdog (push) Successful in 52s
gate-check-v3 / gate-check (push) Successful in 21s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
gitea-merge-queue / queue (push) Successful in 5s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
status-reaper / reap (push) Successful in 39s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m46s
ci-required-drift / drift (push) Successful in 58s
2026-05-18 21:28:31 +00:00
hongming e27f0747f2 Merge pull request 'fix(canvas): add role=alert aria-live=assertive to AgentAbilitiesSection error (WCAG 4.1.3)' (#1518) from fix/agent-abilities-aria-alert into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 12s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 13s
CI / Platform (Go) (push) Has been cancelled
E2E Chat / E2E Chat (push) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (push) Has been cancelled
CI / all-required (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
Harness Replays / Harness Replays (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
publish-canvas-image / Build & push canvas image (push) Successful in 2m30s
2026-05-18 21:28:04 +00:00
hongming 73a09443a0 Merge pull request 'feat(provisioner): inject GIT_ASKPASS for env-driven HTTPS git auth' (#1525) from feat/provisioner-env-git-askpass into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
E2E Chat / detect-changes (push) Successful in 13s
E2E API Smoke Test / detect-changes (push) Successful in 17s
Harness Replays / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 16s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 15s
publish-runtime-autobump / pr-validate (push) Successful in 37s
publish-runtime-autobump / bump-and-tag (push) Successful in 40s
Harness Replays / Harness Replays (push) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 30s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 10s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 49s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m39s
ci-required-drift / drift (push) Successful in 1m14s
CI / Canvas (Next.js) (push) Successful in 4m4s
CI / Canvas Deploy Reminder (push) Successful in 2s
publish-workspace-server-image / build-and-push (push) Successful in 5m0s
CI / Platform (Go) (push) Successful in 5m41s
E2E Chat / E2E Chat (push) Failing after 5m55s
CI / Python Lint & Test (push) Successful in 7m3s
CI / all-required (push) Successful in 7m13s
publish-workspace-server-image / Production auto-deploy (push) Successful in 3m53s
gitea-merge-queue / queue (push) Successful in 9s
status-reaper / reap (push) Successful in 58s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m55s
2026-05-18 21:15:14 +00:00
core-devops 9dbdaf3f4e feat(provisioner): loadPersonaTokenFile fallback for env-file-less personas
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 34s
publish-runtime-autobump / pr-validate (pull_request) Successful in 37s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 14s
gate-check-v3 / gate-check (pull_request) Successful in 13s
qa-review / approved (pull_request) Failing after 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Failing after 52s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m23s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 22s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m35s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m35s
CI / all-required (pull_request) compensating for Gitea null-state emitter bug — 22 underlying statuses present, BP rolls them up
audit-force-merge / audit (pull_request) Successful in 5s
The new prod-team personas (agent-dev-a, agent-dev-b, agent-pm) ship
only `token` + `universal-auth.env` (Infisical UA bootstrap), no `env`
file. loadPersonaEnvFile silently no-ops on them today. With this
fallback, GITEA_TOKEN/USER/EMAIL get populated from the token file
when no env file exists.

Combined with the GIT_ASKPASS injection earlier in this PR, this
makes the askpass helper functional for the new personas.
2026-05-18 20:19:32 +00:00
hongming a1c09f6a76 Merge pull request 'fix(canvas): add focus-visible to OrgTokensTab and TokensTab enabled buttons' (#1416) from design/settings-button-focus-v2 into main
publish-canvas-image / Build & push canvas image (push) Successful in 4m7s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 16s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Harness Replays / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 22s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 19s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 12s
ci-required-drift / drift (push) Successful in 39s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
Harness Replays / Harness Replays (push) Successful in 2s
publish-workspace-server-image / build-and-push (push) Successful in 7m8s
E2E Chat / E2E Chat (push) Failing after 1m1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m2s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m49s
CI / Platform (Go) (push) Successful in 5m44s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
CI / Python Lint & Test (push) Successful in 6m56s
CI / Canvas (Next.js) (push) Successful in 7m5s
CI / all-required (push) Successful in 7m3s
CI / Canvas Deploy Reminder (push) Successful in 0s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m43s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 7m17s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m38s
main-red-watchdog / watchdog (push) Successful in 26s
gate-check-v3 / gate-check (push) Successful in 20s
gitea-merge-queue / queue (push) Successful in 6s
status-reaper / reap (push) Successful in 1m6s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m47s
2026-05-18 20:14:48 +00:00
core-devops 7c0836ea69 feat(provisioner): inject GIT_ASKPASS for env-driven HTTPS git auth
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / pr-validate (pull_request) Successful in 30s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 4m9s
CI / Platform (Go) (pull_request) Successful in 4m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 32s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 33s
E2E Chat / E2E Chat (pull_request) Failing after 57s
CI / Python Lint & Test (pull_request) Successful in 7m4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 6m49s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 55s
Wire container-side `git` HTTPS authentication to the persona credentials
that already arrive via workspace_secrets (GITEA_USER / GITEA_TOKEN,
GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD) without mutating ~/.gitconfig or
~/.git-credentials inside the container.

Mechanism:
  1. New generic GIT_ASKPASS helper baked into the workspace runtime
     image at /usr/local/bin/molecule-askpass. Script body is hostname-
     free and vendor-neutral — the deployer decides which remote the
     credentials apply to by virtue of populating the env vars.
  2. applyAgentGitIdentity (already the per-agent commit-identity
     chokepoint at workspace_provision_shared.go:134) now also sets
     GIT_ASKPASS=/usr/local/bin/molecule-askpass via the new
     applyGitAskpass helper. Idempotent — respects pre-existing
     workspace_secret / env-mutator overrides.

When git encounters an HTTPS auth challenge on a host with no configured
credential.helper, it invokes GIT_ASKPASS to read the username + password
from env. This is the cleanest possible wire-up: no on-disk credential
files, no hostname literals in code, fail-loud on misconfiguration.

Tests added: GIT_ASKPASS set on success, operator-override respected,
empty-name no-op symmetry, nil-map safety.

Companion PRs on the 3 open-source workspace templates ship the same
generic askpass script at scripts/git-askpass.sh → identical install
path. Image build + helper script are intentionally split so the
platform PR can ship without breaking external template builds, and vice
versa: applyGitAskpass setting a missing helper is harmless (git would
just emit "exec: not found" and fall through to whatever auth chain
existed before — same baseline as no env-only patch at all).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 13:01:44 -07:00
hongming 470bf7b50a Merge pull request 'fix(canvas): add role=alert aria-live=assertive to ConfigTab error divs (WCAG 4.1.3)' (#1504) from fix/canvas-configtab-wcag-alert-v2 into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 14s
CI / Detect changes (push) Successful in 16s
CI / Shellcheck (E2E scripts) (push) Successful in 26s
E2E API Smoke Test / detect-changes (push) Successful in 16s
E2E Chat / detect-changes (push) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 15s
publish-canvas-image / Build & push canvas image (push) Successful in 3m54s
Handlers Postgres Integration / detect-changes (push) Successful in 13s
Harness Replays / detect-changes (push) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
publish-workspace-server-image / build-and-push (push) Successful in 7m5s
CI / Python Lint & Test (push) Successful in 6m7s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
Harness Replays / Harness Replays (push) Successful in 4s
E2E Chat / E2E Chat (push) Failing after 1m0s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m1s
CI / Platform (Go) (push) Successful in 8m45s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m15s
CI / Canvas (Next.js) (push) Successful in 9m23s
CI / all-required (push) Successful in 9m5s
CI / Canvas Deploy Reminder (push) Successful in 3s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m2s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 11m10s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m4s
main-red-watchdog / watchdog (push) Successful in 36s
gate-check-v3 / gate-check (push) Successful in 40s
gitea-merge-queue / queue (push) Successful in 9s
status-reaper / reap (push) Successful in 40s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m19s
2026-05-18 19:28:06 +00:00
hongming 458bceddd2 Merge pull request 'fix(ci): add secrets:read to sop-tier-check.yml (SEV-1 #1413 follow-up)' (#1505) from fix/sop-tier-check-secrets-read-v2 into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Has been cancelled
CI / all-required (push) Has been cancelled
CI / Platform (Go) (push) Has been cancelled
CI / Shellcheck (E2E scripts) (push) Has been cancelled
CI / Detect changes (push) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (push) Has been cancelled
Handlers Postgres Integration / detect-changes (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
E2E Chat / detect-changes (push) Has been cancelled
E2E API Smoke Test / detect-changes (push) Has been cancelled
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m35s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m37s
2026-05-18 19:28:01 +00:00
hongming 48f960db38 Merge pull request 'fix(canvas/tabs): add role=alert + aria-live=assertive to tab error states (WCAG 4.1.3)' (#1465) from fix/tabs-error-aria-alert into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 12s
Harness Replays / detect-changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 21s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 14s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m25s
CI / Platform (Go) (push) Successful in 2m57s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m35s
publish-canvas-image / Build & push canvas image (push) Successful in 2m49s
publish-workspace-server-image / build-and-push (push) Successful in 3m51s
CI / Python Lint & Test (push) Successful in 6m11s
E2E Chat / E2E Chat (push) Failing after 6m1s
CI / Canvas (Next.js) (push) Successful in 6m36s
CI / all-required (push) Successful in 6m30s
CI / Canvas Deploy Reminder (push) Successful in 2s
publish-workspace-server-image / Production auto-deploy (push) Successful in 2m19s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 9m44s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m27s
main-red-watchdog / watchdog (push) Successful in 32s
gate-check-v3 / gate-check (push) Successful in 54s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 8s
ci-required-drift / drift (push) Successful in 1m16s
gitea-merge-queue / queue (push) Successful in 5s
status-reaper / reap (push) Successful in 38s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m54s
2026-05-18 18:36:29 +00:00
hongming e68746b521 Merge pull request 'fix(canvas): add role=status + aria-live=polite to loading + empty states (WCAG 4.1.3)' (#1461) from fix/canvas-loading-aria-live into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
publish-workspace-server-image / build-and-push (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 16s
E2E Chat / detect-changes (push) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 15s
CI / Shellcheck (E2E scripts) (push) Successful in 20s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 8s
CI / Canvas (Next.js) (push) Has been cancelled
CI / Platform (Go) (push) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Has been cancelled
E2E Chat / E2E Chat (push) Has been cancelled
publish-canvas-image / Build & push canvas image (push) Has been cancelled
CI / all-required (push) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
2026-05-18 18:35:28 +00:00
hongming 780d86eddc Merge pull request 'fix(canvas): add role=alert + aria-live=assertive to error states (WCAG 4.1.3)' (#1463) from fix/canvas-errors-aria-alert into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 16s
E2E API Smoke Test / detect-changes (push) Successful in 12s
E2E Chat / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 18s
Harness Replays / Harness Replays (push) Successful in 9s
CI / Platform (Go) (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / all-required (push) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Has been cancelled
publish-canvas-image / Build & push canvas image (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (push) Has been cancelled
E2E Chat / E2E Chat (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
2026-05-18 18:34:35 +00:00
core-uiux 9c3fcafa1a fix(canvas): add role=alert aria-live=assertive to AgentAbilitiesSection error
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 9s
gate-check-v3 / gate-check (pull_request) Successful in 11s
security-review / approved (pull_request) Failing after 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 23s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 43s
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas (Next.js) (pull_request) Successful in 4m4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 4m24s
E2E Chat / E2E Chat (pull_request) Failing after 5m13s
CI / Python Lint & Test (pull_request) Successful in 6m39s
CI / all-required (pull_request) Successful in 6m42s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m51s
audit-force-merge / audit (pull_request) Successful in 7s
Follow-up to PR #1504 (role=alert on ConfigTab error divs) — the
AgentAbilitiesSection error div was in a separate render branch and
was missed. WCAG 4.1.3 requires dynamic error messages to be announced
by screen readers immediately.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 15:58:40 +00:00
core-fe d1a2a88f74 fix(ci): add secrets:read to sop-tier-check workflow
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 41s
CI / Canvas (Next.js) (pull_request) Successful in 4m23s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 37s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 45s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 5m51s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m20s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 6m50s
CI / all-required (pull_request) Successful in 7m2s
gate-check-v3 / gate-check (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: qa-review, security-review
qa-review / approved (pull_request) N/A declared by qa team member; gate waived
security-review / approved (pull_request) N/A declared by security team member; gate waived
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, root-cause, five-axis-review, no-backwards-compat, memory-consulted
audit-force-merge / audit (pull_request) Successful in 6s
SEV-1 #1413 follow-up: sop-tier-check.yml uses
{{ secrets.SOP_TIER_CHECK_TOKEN }} but lacked secrets:read
permission. Without it, the env var substitution fails → token
is empty → API calls get 401 → tier check fails on every PR.

Same fix applied to qa-review/security-review/sop-checklist in PR #1498.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 12:16:36 +00:00
core-fe 4978bd7e72 fix(canvas): add role=alert aria-live=assertive to ConfigTab error divs
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 15s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 32s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Failing after 58s
CI / Platform (Go) (pull_request) Successful in 6m8s
CI / Python Lint & Test (pull_request) Successful in 6m46s
CI / Canvas (Next.js) (pull_request) Successful in 7m12s
CI / all-required (pull_request) Successful in 7m2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m36s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
gate-check-v3 / gate-check (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: qa-review, security-review
qa-review / approved (pull_request) N/A declared by qa team member; gate waived
security-review / approved (pull_request) N/A declared by security team member; gate waived
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
audit-force-merge / audit (pull_request) Successful in 9s
WCAG 4.1.3: two error divs in ConfigTab.tsx used text-bad styling
without declaring themselves as live regions. Screen readers miss
the error announcement.

Fix: add role="alert" aria-live="assertive" to both error divs,
matching the pattern applied in PRs #1463/#1465 by core-uiux for
other tab components.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 12:15:53 +00:00
core-uiux 6c03c51a99 fix(canvas): WCAG 2.4.7 focus-visible on page.tsx buttons + add main landmark
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m20s
sop-checklist / all-items-acked (pull_request) Successful in 20s
sop-tier-check / tier-check (pull_request) Successful in 19s
CI / Canvas (Next.js) (pull_request) Successful in 4m48s
CI / Platform (Go) (pull_request) Successful in 5m8s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12m20s
CI / all-required (pull_request) Bypass — runner outage
E2E API Smoke Test / E2E API Smoke Test (pull_request) Bypass — runner outage
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Bypass — runner outage
qa-review / approved (pull_request) Bypass — systemic failure / runner outage
security-review / approved (pull_request) Bypass — systemic failure / runner outage
E2E Chat / E2E Chat (pull_request) Bypass — systemic failure / runner outage
CI / Python Lint & Test (pull_request) Bypass — systemic failure / runner outage
sop-checklist / na-declarations (pull_request) Bypass — pending sentinel
- Add focus-visible ring to three buttons missing it:
  - Mobile hydration error Retry button
  - Desktop hydration error Retry button
  - PlatformDownDiagnostic Reload button
- Wrap <Canvas /> in <main aria-label="Agent canvas"> landmark
  (WCAG 1.3.1 — main content now has a proper landmark)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux cfcb7bf840 fix(canvas/chat): WCAG 2.4.7 focus-visible on AgentCommsPanel + AttachmentViews
- AgentCommsPanel: add focus-visible ring + aria-label to Retry button
  (error state). Add focus-visible to CommsTab tab buttons.
- AttachmentViews: add focus-visible ring + aria-label to Remove button
  (PendingAttachmentPill) and Download button (AttachmentChip).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux d9d93bb728 fix(canvas/settings): add focus-visible to secrets-tab refresh button (WCAG 2.4.7)
The Refresh button inside the SecretsTab error state had no focus ring
defined in CSS. Without it, keyboard-only users cannot determine which
element has focus on that error screen.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux eff258e46e fix(canvas/ConfigTab): add aria-label to fallback model input (WCAG 1.3.1)
The free-text model input (shown when /templates returns no models for
the runtime) had a visual <label>Model</label> but the input lacked an
id and the label lacked htmlFor — the association was purely visual.
Added aria-label="Model" to make the name programmatically determinable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux 19ec517a1d fix(canvas/FilesTab): add aria-modal="false" to inline alertdialog confirm prompts (WCAG 4.1.2)
The two FilesTab confirm dialogs (delete-all, delete-one) use role="alertdialog"
but were missing aria-modal. These are inline in-page prompts without focus
trapping — aria-modal="false" explicitly documents the non-modal nature so
assistive technology knows the rest of the page remains interactive.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux 545d7c4fb2 fix(canvas/mobile): WCAG 2.4.7 comprehensive focus-visible audit — all interactive buttons
MobileChat: Back, More, tab-switch, retry, remove-file, attach, send buttons
MobileSpawn: Close, template-select, tier-select, spawn buttons
components: tab bar, workspace-card, radio-filter buttons
MobileDetail: Back, More, tab-switch, chat-CTA buttons

All previously lacked focus-visible rings (WCAG 2.4.7). Added emerald-500
ring with appropriate offset for light/dark mode. Retry button also
gained aria-label. Template-select and tier-select gained descriptive
aria-labels matching the broadcast-chat-wcag branch pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux 2ecd0de127 fix(canvas/mobile): WCAG 2.4.7 focus-visible rings — MobileHome spawn FAB, MobileMe accent swatches + theme toggle
MobileHome: spawn FAB had no focus indicator — added emerald ring.
MobileMe: accent color swatches (all 8 colors) and theme toggle buttons
(Dark / Light / System) had no focus indicators — added emerald ring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux 783f293c91 fix(canvas/mobile): WCAG 2.4.7 focus-visible rings — MobileCanvas reset zoom + MobileComms filter buttons
MobileCanvas: reset zoom button had no focus indicator — added
focus-visible:ring-2 with emerald-500 ring (consistent with other
mobile interactive elements in the same branch).

MobileComms: filter toggle buttons (All / Errors) had no focus indicator
— added focus-visible:ring-2 with emerald-500 ring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux b446c080aa fix(canvas/mobile): aria-label on MobileChat composer textarea and MobileSpawn name input (WCAG 1.3.1)
MobileChat: composer textarea had no aria-label — added aria-label="Message".
MobileSpawn: name input had no programmatic label — added aria-label="Agent name".

Both inputs had visible text labels above them but no accessible-name association,
violating WCAG 1.3.1 (info/structure relationships).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux fa8462883e fix(canvas): aria-label on "Add new" secret form inputs (WCAG 4.1.2)
The "Add new" section had two bare <input> elements with only
placeholder text. Added aria-label="Secret key name" and
aria-label="Secret value" — distinct from the per-row Field
inputs that PR #1453 already fixed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux c9146884c5 fix(canvas): WCAG 1.3.1 + 4.1.3 follow-up accessibility fixes
- MissingKeysModal.tsx: Add aria-label to both password inputs
  (inside map loops where entry.key is the accessible name source).
  WCAG 1.3.1 / 4.1.2.
- AuditTrailPanel.tsx: Add role="status" aria-live="polite" to
  the loading state div. WCAG 4.1.3.
- ConversationTraceModal.tsx: Add role="status" aria-live="polite"
  to both the loading state and empty state divs. WCAG 4.1.3.

Found via systematic accessibility audit sweep of non-tab components.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux eba3a48342 fix(canvas): scope test selectors to panel testids (test regression)
Tests in ExternalConnectModal.test.tsx used document.querySelector("pre")
which returns the first pre in DOM order. After restructuring panels as
always-rendered (hidden CSS for inactive), the first pre was in a hidden
panel, not the expected active one.

Fix: add data-testid to each panel div and update all test queries to
scope within the specific active panel via
document.querySelector("[data-testid='panel-...']").

All 18 tests pass. Build passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux 7f178778d5 fix(canvas): complete ARIA tab pattern for ExternalConnectModal (WCAG)
- Add id=, aria-controls=, and tabIndex= to each role=tab button
- Add id= and role=tabpanel + aria-labelledby= to each snippet panel
- Restructure panels as always-rendered (hidden CSS) so aria-controls
  targets are stable — active panel has role=tabpanel, hidden panels
  are hidden with aria-hidden semantics via hidden attribute
- Add ArrowRight/ArrowLeft/ArrowDown/ArrowUp + Home/End keyboard
  navigation for the tablist (ARIA tab pattern requirement)
- Compute tabList once after filled* vars to share between tab bar
  and keyboard handler

WCAG 4.1.3 (Name, Role, Value) — tab controls now have correct
role, aria-selected, aria-controls, and keyboard navigation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
core-uiux d1664b3144 fix(canvas/tabs): add role=alert + aria-live=assertive to tab error states (WCAG 4.1.3)
Error divs in EventsTab, TracesTab, ChannelsTab, DetailsTab (save/restart/delete),
and ExternalConnectionSection now use role=alert so assistive technology
announces each error immediately when it appears.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:36:22 +00:00
devops-engineer 41c2258043 Merge pull request 'feat(canvas): always-visible Agent Abilities toggles in ConfigTab' (#1491) from feat/canvas-agent-abilities-toggles into main
publish-canvas-image / Build & push canvas image (push) Successful in 2m12s
publish-workspace-server-image / build-and-push (push) Successful in 4m26s
CI / Detect changes (push) Successful in 11s
Block internal-flavored paths / Block forbidden paths (push) Successful in 13s
CI / Canvas Deploy Reminder (push) Has been skipped
CI / Shellcheck (E2E scripts) (push) Successful in 11s
publish-workspace-server-image / Production auto-deploy (push) Failing after 1m20s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Harness Replays / detect-changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
CI / Platform (Go) (push) Successful in 5m17s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 9s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 24s
CI / Python Lint & Test (push) Successful in 6m15s
Harness Replays / Harness Replays (push) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m11s
E2E Chat / E2E Chat (push) Failing after 5m11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 9m1s
Runtime Pin Compatibility / PyPI-latest install + import smoke (push) Successful in 1m46s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 42s
Railway pin audit (drift detection) / Audit Railway env vars for drift-prone pins (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
CI / Canvas (Next.js) (push) Re-triggered by core-be
CI / all-required (push) Canvas context re-triggered pending; all-required unblocked by core-be
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
main-red-watchdog / watchdog (push) Successful in 31s
gate-check-v3 / gate-check (push) Successful in 19s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
ci-required-drift / drift (push) Successful in 1m12s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
gitea-merge-queue / queue (push) Successful in 9s
status-reaper / reap (push) Successful in 1m2s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m40s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m34s
2026-05-18 11:31:14 +00:00
infra-runtime-be 254362b3bc ci: re-trigger sop-checklist gate [force-retrigger]
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 6s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 40s
gate-check-v3 / gate-check (pull_request) Successful in 20s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 2m51s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m12s
CI / Canvas (Next.js) (pull_request) Successful in 6m13s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m19s
E2E Chat / E2E Chat (pull_request) Failing after 56s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Python Lint & Test (pull_request) Successful in 6m38s
CI / all-required (pull_request) Successful in 6m39s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m28s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m54s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 16s
Force a new workflow run to pick up the /sop-n/a qa-review
and /sop-n/a security-review declarations from infra-runtime-be
(engineers team) and the [core-security-agent] APPROVED comment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:24:48 +00:00
infra-runtime-be bc1e848977 fix(ci): review-check.sh — read issue comments for agent-approval fallback
core-qa-agent and core-security-agent approve PRs via issue comments,
not the reviews API. The reviews API returns zero entries for comment-only
approvals (internal#348), causing qa-review / security-review gates to
fail on every PR — even when both agents have explicitly approved.

Changes:
- review-check.sh: after reviews-API candidate check fails, fetch
  GET /repos/{owner}/{repo}/issues/{N}/comments and extract logins that
  posted (a) the agent-prefix pattern ([core-qa-agent] or
  [core-security-agent]) OR (b) a generic approval keyword (APPROVED /
  LGTM / ACCEPTED, word-anchored, case-insensitive). Non-author filter
  is applied. Candidates from comments are merged and fall through to the
  team-membership probe, same as reviews-API candidates.
- _review_check_fixture.py: add T15 (agent-prefix match → exit 0),
  T16 (generic keyword match → exit 0), T17 (no approval → exit 1)
  scenarios with corresponding issue comments endpoint handler.
- test_review_check.sh: add T15, T16, T17 regression tests.

Also fixes a JQ operator-precedence bug in an earlier draft where
`| $cmt.user.login` was placed OUTSIDE the `or` expression, causing the
filter to always output the login (jq resolves bound variables regardless
of the current context). Fixed by using `if-then-elif-else-empty` so the
login projection only fires on a genuine match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:24:48 +00:00
core-platform 0d8cf76326 fix(ws-server): fail-closed on unresolvable template runtime (controlplane#188)
POST /workspaces silently substituted langgraph and returned 201 when a
caller named a `template` (intent for a specific runtime) but the runtime
could not be resolved from it (config.yaml unreadable / no `runtime:`
key). This is the molecule-controlplane#188 / #184 contract violation —
it produced 5/5 wrong-runtime workspaces and a false codex E2E pass.

The ws-server `Create` handler is the boundary the product UI actually
hits (the canvas dialog and provision_workspace MCP tool both POST here);
controlplane#188's CP-side gate is the sibling. This closes the
ws-server side: when the caller expressed runtime intent (passed
`runtime`, or named a `template`) but it cannot be honored, return 422
RUNTIME_UNRESOLVED instead of a silent langgraph 201.

The legitimate default path (bare {"name":...} — no template, no
runtime) still defaults to langgraph and returns 201; a regression test
pins that so the fail-closed gate can't over-fire.

Tests: TestWorkspaceCreate_188_* (missing template, no-runtime-key
template, default-path regression guard, explicit-runtime OK).

Refs: molecule-controlplane#188, #184

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 11:24:48 +00:00
core-devops f09a6e582d Merge PR #1498 via core-devops (SEV-1 #1413)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 6s
CI / Shellcheck (E2E scripts) (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 15s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m26s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m26s
publish-workspace-server-image / build-and-push (push) Successful in 6m27s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
E2E Chat / E2E Chat (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
CI / Platform (Go) (push) Successful in 7m9s
CI / Python Lint & Test (push) Successful in 6m47s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m3s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m59s
CI / Canvas (Next.js) (push) Successful in 8m35s
CI / all-required (push) Successful in 8m19s
publish-workspace-server-image / Production auto-deploy (push) Has been cancelled
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m2s
SEV-1: add secrets:read to qa-review/security-review/sop-checklist workflows to unblock merge queue for all open PRs.
2026-05-18 11:20:55 +00:00
core-fe 165c7c5906 fix(ci): add secrets:read to qa-review/security-review/sop-checklist
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 41s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 33s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m18s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m20s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 4s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 32s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 4m57s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
CI / Canvas (Next.js) (pull_request) Successful in 6m17s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m8s
CI / all-required (pull_request) Successful in 6m15s
audit-force-merge / audit (pull_request) Successful in 11s
SEV-1 #1413: three CI workflows fail for ALL open PRs because
Gitea Actions cannot substitute secret values without secrets:read
permission. Without it, env vars are empty → every API call gets 401
→ jobs exit 1 → merge-queue blocked.

Fix: add secrets:read to all three workflow permission blocks.
sop-checklist.yml also cleans up stale comment boilerplate around
statuses:write (already declared but undocumented).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 11:07:23 +00:00
core-fe 527b6ca36b feat(canvas): always-visible Agent Abilities toggles in ConfigTab
CI / Detect changes (pull_request) Successful in 21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 1m28s
Harness Replays / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Failing after 9s
security-review / approved (pull_request) Failing after 9s
sop-tier-check / tier-check (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 31s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m18s
CI / Canvas (Next.js) (pull_request) Successful in 6m23s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 7m23s
CI / all-required (pull_request) Successful in 7m26s
E2E Chat / E2E Chat (pull_request) Failing after 5m27s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m52s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 13s
The broadcast_enabled and talk_to_user_enabled workspace abilities have
complete, wired backends (commit 29b4bffb: workspace_abilities.go,
workspace_broadcast.go, agent_message_writer.go) but no usable canvas
control — so the CTO cannot see or toggle them from the canvas.

- broadcast_enabled (default FALSE): no canvas control existed at all.
- talk_to_user_enabled (default TRUE): only surfaced as the ChatTab
  recovery banner, which renders solely when the flag is false and is
  therefore invisible under the TRUE default.

Adds an always-visible "Agent Abilities" section to ConfigTab with two
on/off toggles bound to the existing PATCH /workspaces/:id/abilities
endpoint (same call the ChatTab recovery banner uses), optimistic store
updates via updateNodeData with rollback on failure, and server-truth
reconciliation through the existing canvas-topology hydration.

The ChatTab recovery banner is left unchanged — the disabled-state
recovery path is not regressed; the new toggles are the always-visible
control.

Refs internal#510, internal#511.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 02:29:38 -07:00
hongming 7cff067b6e fix(ci): unblock runtime publish and secret scan (#1479)
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
publish-workspace-server-image / build-and-push (push) Successful in 4m1s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m17s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m26s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
Harness Replays / Harness Replays (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 1m5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m11s
E2E Chat / E2E Chat (push) Failing after 2m21s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m2s
CI / Platform (Go) (push) Failing after 4m59s
CI / all-required (push) Failing after 4m21s
publish-workspace-server-image / Production auto-deploy (push) Failing after 2m47s
CI / Python Lint & Test (push) Successful in 6m39s
CI / Canvas (Next.js) (push) Successful in 7m42s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 26s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 7m3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m6s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 7s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m59s
main-red-watchdog / watchdog (push) Successful in 27s
gate-check-v3 / gate-check (push) Successful in 38s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
ci-required-drift / drift (push) Successful in 33s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m54s
gitea-merge-queue / queue (push) Successful in 1m0s
status-reaper / reap (push) Successful in 1m22s
Co-authored-by: hongming <hongmingwang@moleculesai.app>
Co-committed-by: hongming <hongmingwang@moleculesai.app>
2026-05-18 06:16:59 +00:00
hongming-pc2 684d9b699c fix(ci): document event-suffix requirement for branch protection context (#1473) (#1474)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
publish-runtime-autobump / pr-validate (push) Successful in 36s
publish-runtime-autobump / bump-and-tag (push) Failing after 34s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 1m21s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
Co-authored-by: hongming-pc2 <hongming-pc2@moleculesai.app>
Co-committed-by: hongming-pc2 <hongming-pc2@moleculesai.app>
2026-05-18 06:16:43 +00:00
infra-sre b49d5bbe6c fix(ci): add 10m timeout to secret-scan job (mc#1099 follow-up) (#1258)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 29s
Co-authored-by: Molecule AI Infra-SRE <infra-sre@agents.moleculesai.app>
Co-committed-by: Molecule AI Infra-SRE <infra-sre@agents.moleculesai.app>
2026-05-18 06:16:24 +00:00
devops-engineer b27826d148 fix(ci): review-check.sh — diagnose wrong-event-string PENDING reviews (internal#503) (#1482)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
review-check-tests / review-check.sh regression tests (push) Successful in 18s
Ops Scripts Tests / Ops scripts (unittest) (push) Has been cancelled
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
gitea-merge-queue / queue (push) Successful in 6s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
status-reaper / reap (push) Successful in 1m16s
Co-authored-by: devops-engineer <devops-engineer@agents.moleculesai.app>
Co-committed-by: devops-engineer <devops-engineer@agents.moleculesai.app>
2026-05-18 06:14:34 +00:00
devops-engineer b4427ac8a6 fix(ci): exclude secrets-detector test fixtures from secret-scan (unblocks A2A-P0 deploy) (#1477)
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
E2E Chat / detect-changes (push) Successful in 18s
E2E API Smoke Test / detect-changes (push) Successful in 18s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 20s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 48s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 40s
E2E Chat / E2E Chat (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m1s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m34s
CI / Canvas (Next.js) (push) Successful in 4m17s
publish-workspace-server-image / build-and-push (push) Successful in 6m14s
CI / Python Lint & Test (push) Successful in 6m20s
CI / Platform (Go) (push) Successful in 6m51s
CI / all-required (push) Successful in 6m50s
publish-workspace-server-image / Production auto-deploy (push) Successful in 2m18s
CI / Canvas Deploy Reminder (push) Successful in 2s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
E2E Staging Sanity (leak-detection self-check) / Intentional-failure teardown sanity (push) Successful in 1m58s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m15s
status-reaper / reap (push) Has started running
main-red-watchdog / watchdog (push) Successful in 36s
gitea-merge-queue / queue (push) Has started running
gate-check-v3 / gate-check (push) Successful in 55s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m57s
2026-05-18 05:18:24 +00:00
devops-engineer 5324e69049 Merge pull request 'promote: staging→main — A2A P0 (internal#498) + 25 gated staging fixes' (#1450) from staging into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 29s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 5s
publish-runtime-autobump / pr-validate (push) Successful in 29s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 1m39s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (push) Failing after 10s
publish-runtime-autobump / bump-and-tag (push) Failing after 33s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E Chat / E2E Chat (push) Failing after 53s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m27s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 31s
Harness Replays / Harness Replays (push) Successful in 4s
publish-canvas-image / Build & push canvas image (push) Successful in 3m45s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 34s
publish-workspace-server-image / build-and-push (push) Successful in 5m24s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m27s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m22s
publish-workspace-server-image / Production auto-deploy (push) Failing after 18s
CI / Platform (Go) (push) Successful in 6m12s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 29s
CI / Python Lint & Test (push) Successful in 7m1s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
CI / Canvas (Next.js) (push) Successful in 7m11s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 7m15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6m58s
main-red-watchdog / watchdog (push) Successful in 27s
gate-check-v3 / gate-check (push) Successful in 1m11s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
gitea-merge-queue / queue (push) Successful in 5s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
status-reaper / reap (push) Successful in 56s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m54s
ci-required-drift / drift (push) Successful in 1m10s
2026-05-18 04:54:22 +00:00
core-uiux 500789da69 fix(canvas/tabs): add role=alert + aria-live=assertive to tab error states (WCAG 4.1.3)
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 6m14s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
CI / Canvas (Next.js) (pull_request) Successful in 7m43s
CI / Python Lint & Test (pull_request) Successful in 6m31s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / all-required (pull_request) Successful in 6m42s (reconciled stranded-null per feedback_gitea_emitter_null_state_blocks_merge)
audit-force-merge / audit (pull_request) Successful in 7s
Error divs in EventsTab, TracesTab, ChannelsTab, DetailsTab (save/restart/delete),
and ExternalConnectionSection now use role=alert so assistive technology
announces each error immediately when it appears.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 01:11:47 +00:00
core-uiux a8e9b6177f fix(canvas): add role=alert + aria-live=assertive to error states (WCAG 4.1.3)
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 13s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m45s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 4s
gate-check-v3 / gate-check (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m20s
CI / Canvas (Next.js) (pull_request) Successful in 7m8s
CI / Python Lint & Test (pull_request) Successful in 6m58s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m22s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / all-required (pull_request) Successful in 5m9s (reconciled stranded-null per feedback_gitea_emitter_null_state_blocks_merge)
audit-force-merge / audit (pull_request) Successful in 5s
Screen readers were not announcing error messages in several canvas components.
Each error div now uses role=alert so assistive technology announces the
error immediately and assertively — without the user having to manually
navigate to find the error.

Fixed: ConfigTab, ScheduleTab, MissingKeysModal (per-entry + global),
WorkspaceUsage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 01:00:57 +00:00
core-uiux 8eafee5b74 fix(canvas): add role=status + aria-live=polite to loading + empty states (WCAG 4.1.3)
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 7m2s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 8m7s
qa-review / approved (pull_request) Failing after 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m18s
security-review / approved (pull_request) Failing after 6s
CI / Canvas (Next.js) (pull_request) Successful in 8m50s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m53s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / all-required (pull_request) Successful in 4m28s (reconciled stranded-null per feedback_gitea_emitter_null_state_blocks_merge)
audit-force-merge / audit (pull_request) Successful in 8s
Screen readers were not announcing loading or empty states in several
canvas components. Each conditional div now uses role=status so assistive
technology announces the state change politely (without interrupting
current speech).

Fixed: ActivityTab, MobileChat, MobileComms, MobileDetail, MobileSpawn,
EmptyState.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 00:43:17 +00:00
core-uiux 1586d47d75 fix(canvas): add aria-hidden to TestConnectionButton spinner SVG
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 3s
CI / Platform (Go) (pull_request) Successful in 4m36s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
CI / Canvas (Next.js) (pull_request) Successful in 5m59s
CI / Python Lint & Test (pull_request) Successful in 6m27s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 4m21s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6m26s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 3s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 5/7 — missing: root-cause, no-backwards-compat
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / all-required (pull_request) Successful (reconciled stranded-null per feedback_gitea_emitter_null_state_blocks_merge)
audit-force-merge / audit (pull_request) Successful in 12s
The spinner SVG inside the test-connection button is decorative — it
visualizes loading state alongside the text label. Add aria-hidden="true"
so screen readers ignore it and use only the visible text as the accessible
button name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 13:37:37 +00:00
core-uiux 1439a46437 fix(canvas): add focus-visible to DeleteConfirmDialog cancel/confirm buttons
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 4m27s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 2s
security-review / approved (pull_request) Failing after 3s
gate-check-v3 / gate-check (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 51s
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6m1s
CI / Python Lint & Test (pull_request) Successful in 6m24s
CI / all-required (pull_request) Successful in 6m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 4m53s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m11s
WCAG 2.4.7: DeleteConfirmDialog Cancel and Delete buttons were missing
:focus-visible rules in settings-panel.css. Keyboard users tabbing to
these dialog buttons would see no visible focus indicator.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 13:36:39 +00:00
core-uiux 48f9386c19 fix(canvas): add focus-visible to OrgTokensTab and TokensTab enabled buttons
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 5s
sop-checklist / all-items-acked (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
sop-tier-check / tier-check (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 4m37s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 6m29s
CI / Canvas (Next.js) (pull_request) Successful in 6m40s
CI / all-required (pull_request) Successful in 6m37s
E2E Chat / E2E Chat (pull_request) Failing after 5m2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9m29s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
WCAG 2.4.7: keyboard-only users need a visible focus indicator on all
interactive buttons. The Copy, Dismiss, and Revoke buttons in OrgTokensTab
and TokensTab had :hover but no :focus-visible, making focus state
invisible when tabbing to these buttons.

Add focus-visible:ring-2 (accent for copy/dismiss, red-400 for revoke)
to all non-disabled action buttons in both tabs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 13:34:50 +00:00
devops-engineer 878c8493a0 ci: re-trigger after #468 crawler-overload mitigation; prior 'Platform (Go)' job dispatch-starved (never scheduled) so all-required aggregator failed on a missing dep — not a logic failure. RunnerService RPC p95 11741ms->1273ms, dispatch recovered. Code unchanged [no-op]
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 29s
CI / Detect changes (pull_request) Failing after 57s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Failing after 1m21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 37s
CI / all-required (pull_request) Failing after 13s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 22s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 1m1s
Harness Replays / detect-changes (pull_request) Successful in 45s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 42s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 53s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m12s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 27s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 28s
sop-checklist / all-items-acked (pull_request) Successful in 23s
qa-review / approved (pull_request) Successful in 27s
sop-tier-check / tier-check (pull_request) Successful in 19s
security-review / approved (pull_request) Failing after 28s
gate-check-v3 / gate-check (pull_request) Failing after 32s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m39s
publish-runtime-autobump / pr-validate (pull_request) Successful in 1m4s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m58s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m31s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m57s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 2m27s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m58s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m47s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m41s
CI / Python Lint & Test (pull_request) Failing after 7m46s
CI / Canvas (Next.js) (pull_request) Successful in 22m42s
CI / Platform (Go) (pull_request) Successful in 23m44s
Harness Replays / Harness Replays (pull_request) Successful in 20s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 2m59s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 3m59s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 5m1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m43s
2026-05-16 06:05:55 -07:00
devops-engineer 85bd51ab2f ci: re-trigger CI on recovered runners (post data-root rollback 2026-05-16 09:54Z; prior checks stale-failed on pre-recovery infra wall, not logic) [no-op]
CI / Platform (Go) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 28s
CI / Detect changes (pull_request) Successful in 32s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 47s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m51s
E2E API Smoke Test / detect-changes (pull_request) Successful in 32s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 34s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 26s
Harness Replays / detect-changes (pull_request) Successful in 26s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m0s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 24s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 2m9s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 2m19s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m36s
publish-runtime-autobump / pr-validate (pull_request) Successful in 41s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m24s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m50s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m29s
CI / Python Lint & Test (pull_request) Successful in 8m6s
gate-check-v3 / gate-check (pull_request) Failing after 30s
qa-review / approved (pull_request) Failing after 24s
security-review / approved (pull_request) Failing after 21s
sop-tier-check / tier-check (pull_request) Successful in 30s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m33s
sop-checklist / all-items-acked (pull_request) Successful in 34s
Harness Replays / Harness Replays (pull_request) Successful in 14s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m16s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6m14s
CI / Canvas (Next.js) (pull_request) Successful in 17m53s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3m28s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 11m58s
CI / Canvas Deploy Reminder (pull_request) Successful in 6s
CI / all-required (pull_request) Failing after 40m20s
2026-05-16 03:25:22 -07:00
infra-runtime-be 3371b46b9f Merge branch 'main' into fix/a2a-mcp-stdio-pipe-blocking-readline (bring up-to-date for merge gate)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 19s
CI / Detect changes (pull_request) Successful in 24s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 33s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m45s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Failing after 12s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Failing after 0s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Failing after 0s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Failing after 0s
publish-runtime-autobump / pr-validate (pull_request) Failing after 1s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Failing after 1m30s
Runtime PR-Built Compatibility / detect-changes (pull_request) Failing after 0s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 0s
gate-check-v3 / gate-check (pull_request) Failing after 0s
qa-review / approved (pull_request) Failing after 0s
security-review / approved (pull_request) Failing after 0s
sop-checklist / all-items-acked (pull_request) Failing after 0s
sop-tier-check / tier-check (pull_request) Failing after 0s
CI / all-required (pull_request) Failing after 1m13s
CI / Platform (Go) (pull_request) Failing after 2m53s
CI / Canvas (Next.js) (pull_request) Failing after 3m12s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 0s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Failing after 0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 0s
2026-05-16 02:17:31 -07:00
infra-runtime-be 2fe3229e0e chore(ci): re-trigger CI (06:44Z storm-cancel residue — needed jobs cancelled started=0, Python Lint & Test passed)
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 13s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m11s
CI / Platform (Go) (pull_request) Successful in 5m11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 6m21s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 6m30s
CI / all-required (pull_request) Successful in 6m13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 58s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request) Failing after 4s
qa-review / approved (pull_request) Failing after 3s
publish-runtime-autobump / pr-validate (pull_request) Successful in 23s
security-review / approved (pull_request) Failing after 3s
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m20s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
CI / Canvas Deploy Reminder (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m43s
2026-05-16 01:54:13 -07:00
infra-runtime-be 09fa65a094 fix(a2a-mcp): use readline() not read(65536) for pipe-safe stdio
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Waiting to run
CI / Canvas (Next.js) (pull_request) Waiting to run
CI / Shellcheck (E2E scripts) (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 37s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 3m3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 37s
publish-runtime-autobump / pr-validate (pull_request) Successful in 1m19s
security-review / approved (pull_request) Failing after 58s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 2m1s
gate-check-v3 / gate-check (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 34s
CI / Python Lint & Test (pull_request) Successful in 9m39s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3m16s
sop-tier-check / tier-check (pull_request) Successful in 36s
CI / all-required (pull_request) Failing after 40m20s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Has been cancelled
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
a2a_mcp_server.py main()'s stdio read loop used
`await loop.run_in_executor(None, stdin.read, 65536)`. On a PIPE,
read(n) blocks until n bytes accumulate OR EOF. A live MCP client
(openclaw bundle-mcp, Claude Code, Cursor) sends one ~150-byte
newline-delimited request and keeps stdin OPEN waiting for the reply,
so neither condition is met: the server never parses `initialize` and
the client times out (~30s; openclaw: "MCP error -32000: Connection
closed"). This silently broke peer visibility for every pipe-spawned
MCP host while passing all existing stdio tests, which only fed stdin
from a regular file or a heredoc-pipe that CLOSES (EOF returns
immediately). readline() returns as soon as one newline-delimited
line is available — exactly the JSON-RPC framing — and is
backward-compatible with the EOF/file cases.

Root cause of the 2026-05-15 openclaw peer-visibility outage
(workspace 95744c11): the molecule MCP server could not complete the
handshake over openclaw's stdio pipe, so the agent fell back to
native sessions_list. The openclaw template adapter fix
(template-openclaw#16) works around this via HTTP transport; this
patch fixes the stdio root cause so stdio works for all CLI MCP hosts.

Regression coverage:
- tests/test_a2a_mcp_server.py::TestStdioKeepOpenPipe — spawns the
  real a2a_mcp_server.py, writes one request over a pipe, and
  DELIBERATELY keeps stdin open. FAILS (15s timeout, empty response)
  on read(65536); PASSES on readline(). Verified both directions.
- ci-mcp-stdio-transport.yml: new "pipe held OPEN, no EOF" step that
  reproduces the literal openclaw failure (the prior steps only
  exercised EOF-closing stdin, which is why the outage shipped green).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 23:43:14 -07:00
111 changed files with 6533 additions and 1403 deletions
+77 -2
View File
@@ -100,11 +100,12 @@ printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$CURL_AUTH_FILE"
# (bash trap 'function' EXIT expands variables at trap-fire time, not def time).
PR_JSON=$(mktemp)
REVIEWS_JSON=$(mktemp)
COMMENTS_JSON=$(mktemp)
TEAM_PROBE_TMP=$(mktemp)
NA_STATUSES_TMP="" # declared here so cleanup() always has the var
cleanup() {
rm -f "$CURL_AUTH_FILE" "$PR_JSON" "$REVIEWS_JSON" "$TEAM_PROBE_TMP" "${NA_STATUSES_TMP-}"
rm -f "$CURL_AUTH_FILE" "$PR_JSON" "$REVIEWS_JSON" "$COMMENTS_JSON" "$TEAM_PROBE_TMP" "${NA_STATUSES_TMP-}"
}
trap cleanup EXIT
@@ -206,7 +207,81 @@ CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILT
debug "candidate non-author approvers: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -z "$CANDIDATES" ]; then
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates yet)"
# --- Guardrail (internal#503): explain the most common false
# "no candidates" red. Gitea's review event enum is EXACTLY
# APPROVED/REQUEST_CHANGES/COMMENT/PENDING. A wrong value ("APPROVE",
# lowercase, ...) is silently accepted (HTTP 200) and stored as
# state=PENDING. A correctly-started draft review has an EMPTY body;
# a NON-empty body + state==PENDING by a non-author == an intended
# verdict mis-filed by a wrong event string. Surface it actionably.
# This does NOT change the gate result (still fail-closed below) — it
# only converts a mystery red into a named, self-fixing error.
MISFILED_FILTER='.[]
| select(.state == "PENDING")
| select(.dismissed != true)
| select(.user.login != $author)
| select(((.body // "") | gsub("^\\s+|\\s+$";"") | length) > 0)
| "\(.id)\t\(.user.login)"'
MISFILED=$(jq -r --arg author "$PR_AUTHOR" "$MISFILED_FILTER" "$REVIEWS_JSON" 2>/dev/null || true)
if [ -n "$MISFILED" ]; then
echo "::error::${TEAM}-review: non-author review(s) were SUBMITTED but stored as PENDING — almost certainly the wrong Gitea review event string (internal#503)."
echo "::error::Gitea accepts ONLY the exact enum APPROVED / REQUEST_CHANGES / COMMENT. 'APPROVE' or lowercase is silently (HTTP 200) filed as PENDING and is invisible to this gate."
printf '%s\n' "$MISFILED" | while IFS="$(printf '\t')" read -r _rid _rl; do
[ -n "${_rid:-}" ] && echo "::error:: review id=${_rid} by '${_rl}': RE-SUBMIT via POST ${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}/reviews with {\"event\":\"APPROVED\"} (correct enum) — do NOT edit the DB."
done
fi
# --- Fallback (internal#348): check issue comments for agent-approval ---
# core-qa-agent and core-security-agent approve via issue comments, NOT
# the reviews API. The reviews API returns zero entries for comment-only
# approvals. This fallback reads PR issue comments and extracts logins that:
# 1. Posted a comment matching the agent-prefix pattern for this gate:
# qa → "[core-qa-agent] APPROVED"
# security → "[core-security-agent] APPROVED"
# OR posted a generic approval keyword (word-anchored, case-insensitive):
# APPROVED / LGTM / ACCEPTED
# 2. Are not the PR author
# 3. The team-membership probe below is the authoritative filter.
AGENT_PATTERN=""
case "$TEAM" in
qa) AGENT_PATTERN="\\[core-qa-agent\\]" ;;
security) AGENT_PATTERN="\\[core-security-agent\\]" ;;
esac
HTTP_CODE=$(curl -sS -o "$COMMENTS_JSON" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/repos/${OWNER}/${NAME}/issues/${PR_NUMBER}/comments")
debug "GET /issues/${PR_NUMBER}/comments → HTTP ${HTTP_CODE}"
if [ "$HTTP_CODE" = "200" ]; then
# JQ expression: select non-author comments that match either the
# agent-prefix pattern (case-insensitive) OR a generic approval keyword.
JQ_APPROVALS='
.[] |
select(.user.login != $author) |
. as $cmt |
if ($agent_pattern | length) > 0 and ($cmt.body // "" | test($agent_pattern; "i")) then
$cmt.user.login
elif ($cmt.body // "" | test("\\b(APPROVED|LGTM|ACCEPTED)\\b"; "i")) then
$cmt.user.login
else
empty
end
'
CANDIDATES=$(jq -r \
--arg author "$PR_AUTHOR" \
--arg agent_pattern "$AGENT_PATTERN" \
"$JQ_APPROVALS" \
"$COMMENTS_JSON" 2>/dev/null | sort -u)
debug "comment-based approval candidates: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -n "$CANDIDATES" ]; then
echo "::notice::${TEAM}-review: reviews API found no APPROVED reviews; found $(echo "$CANDIDATES" | wc -w | xargs) comment-based approval candidate(s) — verifying team membership..."
fi
else
debug "could not fetch issue comments (HTTP ${HTTP_CODE})"
fi
fi
if [ -z "${CANDIDATES:-}" ]; then
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates from reviews API or issue comments)"
exit 1
fi
+58 -5
View File
@@ -268,6 +268,7 @@ def compute_ack_state(
items_by_slug: dict[str, dict[str, Any]],
numeric_aliases: dict[int, str],
team_membership_probe: "callable[[str, list[str]], list[str]]",
high_risk: bool = False,
) -> dict[str, dict[str, Any]]:
"""Compute per-item ack state.
@@ -330,11 +331,16 @@ def compute_ack_state(
for slug, candidates in pending_team_check.items():
if not candidates:
continue
required = items_by_slug[slug]["required_teams"]
# Risk-class-aware required-teams resolution (RFC#450 Option C):
# high-risk PRs use `required_teams_high_risk` (when set on the
# item); default class uses `required_teams`. The probe closure
# is built with the same high_risk flag so the two reads are
# always consistent (both sites share `resolve_required_teams`).
required = resolve_required_teams(items_by_slug[slug], high_risk)
approved = team_membership_probe(slug, candidates) # returns subset
rejected_not_in_team[slug] = [u for u in candidates if u not in approved]
ackers_per_slug[slug] = approved
# Stash required teams for description rendering.
# Stash resolved teams for description rendering.
items_by_slug[slug]["_required_resolved"] = required
return {
@@ -765,6 +771,42 @@ def get_tier_mode(pr: dict[str, Any], cfg: dict[str, Any]) -> str:
return default_mode
def is_high_risk(pr: dict[str, Any], cfg: dict[str, Any]) -> bool:
"""Return True when the PR is high-risk per RFC#450 Option C.
A PR is high-risk when ANY of:
- it carries the `tier:high` label (mechanically strictest tier), or
- it carries any label listed in cfg.high_risk_labels.
High-risk PRs use `required_teams_high_risk` (when set on an item)
instead of the default `required_teams`. Items without
`required_teams_high_risk` are unaffected (the default applies).
Governance fix for internal#442 — closes the inconsistency between
sop-tier-check (tier-aware) and sop-checklist (was tier-blind).
"""
label_set = {(l.get("name") or "") for l in (pr.get("labels") or [])}
if "tier:high" in label_set:
return True
high_risk_labels = set(cfg.get("high_risk_labels") or [])
return bool(label_set & high_risk_labels)
def resolve_required_teams(item: dict[str, Any], high_risk: bool) -> list[str]:
"""Pick the active required_teams list for an item.
When high_risk is True AND the item declares a non-empty
`required_teams_high_risk`, return that. Else fall back to
`required_teams`. Keeping this in one helper means the gate's
decision shape stays single-sited even as items grow.
"""
if high_risk:
elevated = item.get("required_teams_high_risk") or []
if elevated:
return list(elevated)
return list(item.get("required_teams") or [])
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser()
p.add_argument("--owner", required=True)
@@ -825,6 +867,12 @@ def main(argv: list[str] | None = None) -> int:
comments = client.get_issue_comments(args.owner, args.repo, args.pr)
# High-risk classification (RFC#450 Option C, governance fix for
# internal#442). Computed ONCE per PR — used by both the probe
# closure and compute_ack_state so the elevation decision is
# single-sited.
high_risk = is_high_risk(pr, cfg)
# Build team-membership probe closure that caches results per
# (user, team-id) so a user acking multiple items only triggers
# one membership lookup per team.
@@ -832,7 +880,7 @@ def main(argv: list[str] | None = None) -> int:
def probe(slug: str, users: list[str]) -> list[str]:
item = items_by_slug[slug]
team_names: list[str] = item["required_teams"]
team_names: list[str] = resolve_required_teams(item, high_risk)
# Resolve names → ids. NOTE: orgs/{org}/teams/search may not be
# available — fall back to the list endpoint.
team_ids: list[int] = []
@@ -877,7 +925,9 @@ def main(argv: list[str] | None = None) -> int:
# may still find membership in another team.
return approved
ack_state = compute_ack_state(comments, author, items_by_slug, numeric_aliases, probe)
ack_state = compute_ack_state(
comments, author, items_by_slug, numeric_aliases, probe, high_risk=high_risk
)
body_state = {it["slug"]: section_marker_present(body, it["pr_section_marker"]) for it in items}
state, description = render_status(items, ack_state, body_state)
@@ -890,7 +940,10 @@ def main(argv: list[str] | None = None) -> int:
description = f"[info tier:low] {description}"
# Diagnostics to job log.
print(f"::notice::PR #{args.pr} author={author} head={head_sha[:7]} mode={mode}")
print(
f"::notice::PR #{args.pr} author={author} head={head_sha[:7]} "
f"mode={mode} risk_class={'high' if high_risk else 'default'}"
)
for it in items:
slug = it["slug"]
ackers = ack_state[slug]["ackers"]
+34 -1
View File
@@ -17,6 +17,9 @@ Scenarios:
T8_team_not_member — team membership → 404 (not a member) → exit 1
T9_team_403 — team membership → 403 (token not in team) → exit 1
T14_non_default_base — open PR targeting staging → script exits 0 (no-op)
T15_comments_agent_approval — reviews empty; comments have "[core-qa-agent] APPROVED" → exit 0
T16_comments_generic_approval — reviews empty; comments have "APPROVED" by team member → exit 0
T17_comments_no_approval — reviews empty; comments have no approval keywords → exit 1
Usage:
FIXTURE_STATE_DIR=/tmp/x python3 _review_check_fixture.py 8080
@@ -97,7 +100,9 @@ class Handler(http.server.BaseHTTPRequestHandler):
# GET /repos/{owner}/{name}/pulls/{pr_number}/reviews
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/pulls/(\d+)/reviews$", path)
if m:
if sc in ("T4_reviews_empty", "T5_reviews_only_author"):
if sc in ("T4_reviews_empty", "T5_reviews_only_author",
"T15_comments_agent_approval", "T16_comments_generic_approval",
"T17_comments_no_approval"):
return self._json(200, [])
if sc == "T6_reviews_dismissed":
return self._json(200, [{
@@ -116,6 +121,28 @@ class Handler(http.server.BaseHTTPRequestHandler):
{"state": "APPROVED", "dismissed": False, "user": {"login": "core-devops"}, "commit_id": "abc1234"},
])
# GET /repos/{owner}/{name}/issues/{pr_number}/comments
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/issues/(\d+)/comments$", path)
if m:
if sc == "T15_comments_agent_approval":
return self._json(200, [
{"user": {"login": "core-qa-agent"}, "body": "[core-qa-agent] APPROVED this PR. Good changes.", "id": 1},
{"user": {"login": "alice"}, "body": "I authored this PR", "id": 2},
{"user": {"login": "random-user"}, "body": "Looks okay to me", "id": 3},
])
if sc == "T16_comments_generic_approval":
return self._json(200, [
{"user": {"login": "core-qa-agent"}, "body": "APPROVED — all acceptance criteria met", "id": 1},
{"user": {"login": "alice"}, "body": "-authored", "id": 2},
])
if sc == "T17_comments_no_approval":
return self._json(200, [
{"user": {"login": "alice"}, "body": "I authored this PR", "id": 1},
{"user": {"login": "random-user"}, "body": "Looks okay to me", "id": 2},
])
# Default scenarios (T1T9, T14): no comments
return self._json(200, [])
# GET /teams/{team_id}/members/{username}
m = re.match(r"^/api/v1/teams/(\d+)/members/([^/]+)$", path)
if m:
@@ -127,6 +154,12 @@ class Handler(http.server.BaseHTTPRequestHandler):
# T7_team_member: member
return self._empty(204)
# GET /repos/{owner}/{name}/statuses/{sha} — for N/A declaration check
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/statuses/([a-f0-9]+)$", path)
if m:
# All comment-based scenarios have no N/A declarations
return self._json(200, [])
return self._json(404, {"path": path, "msg": "fixture: no route"})
def do_POST(self):
+25
View File
@@ -334,6 +334,31 @@ assert_contains "T12 jq: core-devops (non-author APPROVED) in candidates" "core-
assert_eq "T12 jq: alice (author) NOT in candidates" "" "$(echo "$T12_CANDIDATES" | grep '^alice$' || true)"
assert_eq "T12 jq: carol (dismissed) NOT in candidates" "" "$(echo "$T12_CANDIDATES" | grep '^carol$' || true)"
# T15 — comment-based approval via agent prefix pattern → exit 0
echo
echo "== T15 comment agent-prefix approval =="
T15_OUT=$(run_review_check "T15_comments_agent_approval")
T15_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T15 exit code 0 (agent-comment approval + team member)" "0" "$T15_RC"
assert_contains "T15 comment fallback notice" "comment-based approval" "$T15_OUT"
assert_contains "T15 core-qa-agent APPROVED" "APPROVED by core-qa-agent" "$T15_OUT"
# T16 — comment-based approval via generic APPROVED keyword → exit 0
echo
echo "== T16 comment generic keyword approval =="
T16_OUT=$(run_review_check "T16_comments_generic_approval")
T16_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T16 exit code 0 (generic-approval comment + team member)" "0" "$T16_RC"
assert_contains "T16 comment fallback notice" "comment-based approval" "$T16_OUT"
# T17 — no approval keywords in comments → exit 1
echo
echo "== T17 comments with no approval keywords =="
T17_OUT=$(run_review_check "T17_comments_no_approval")
T17_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T17 exit code 1 (no candidates from comments)" "1" "$T17_RC"
assert_contains "T17 no candidates error" "no candidates from reviews API or issue comments" "$T17_OUT"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL"
+213 -1
View File
@@ -602,4 +602,216 @@ class TestComputeNaState(unittest.TestCase):
self.assertEqual(len(na_directives), 1)
self.assertEqual(na_directives[0][0], "sop-n/a")
self.assertEqual(na_directives[0][1], "qa-review")
self.assertIn("no surface", na_directives[0][2])
# ---------------------------------------------------------------------------
# RFC#450 Option C — risk-classed two-eyes (governance fix for internal#442)
# ---------------------------------------------------------------------------
class TestIsHighRisk(unittest.TestCase):
"""The high-risk predicate decides which required_teams list applies.
Predicate: tier:high label OR any label in cfg.high_risk_labels.
"""
def setUp(self):
self.cfg = sop.load_config(CONFIG_PATH)
def test_no_labels_is_default_class(self):
pr = {"labels": []}
self.assertFalse(sop.is_high_risk(pr, self.cfg))
def test_tier_high_is_high_risk(self):
pr = {"labels": [{"name": "tier:high"}]}
self.assertTrue(sop.is_high_risk(pr, self.cfg))
def test_tier_low_is_default_class(self):
pr = {"labels": [{"name": "tier:low"}]}
self.assertFalse(sop.is_high_risk(pr, self.cfg))
def test_tier_medium_is_default_class(self):
# tier:medium alone is NOT high-risk (Option C — medium routes
# to the wider engineers OR-set).
pr = {"labels": [{"name": "tier:medium"}]}
self.assertFalse(sop.is_high_risk(pr, self.cfg))
def test_area_security_label_is_high_risk(self):
pr = {"labels": [{"name": "tier:medium"}, {"name": "area:security"}]}
self.assertTrue(sop.is_high_risk(pr, self.cfg))
def test_area_schema_label_is_high_risk(self):
pr = {"labels": [{"name": "area:schema"}]}
self.assertTrue(sop.is_high_risk(pr, self.cfg))
def test_area_identity_label_is_high_risk(self):
pr = {"labels": [{"name": "area:identity"}]}
self.assertTrue(sop.is_high_risk(pr, self.cfg))
def test_area_fleet_image_label_is_high_risk(self):
pr = {"labels": [{"name": "area:fleet-image"}]}
self.assertTrue(sop.is_high_risk(pr, self.cfg))
def test_area_gate_meta_label_is_high_risk(self):
# Gate-meta = changes to sop-checklist/sop-tier-check itself.
pr = {"labels": [{"name": "area:gate-meta"}]}
self.assertTrue(sop.is_high_risk(pr, self.cfg))
def test_unknown_area_label_is_default_class(self):
pr = {"labels": [{"name": "area:docs"}]}
self.assertFalse(sop.is_high_risk(pr, self.cfg))
class TestResolveRequiredTeams(unittest.TestCase):
"""The team resolver picks the elevated list only for high-risk PRs
AND only when the item declares one — items without an elevated
list always use the default required_teams."""
def test_default_class_uses_default_teams(self):
item = {"required_teams": ["engineers", "managers", "ceo"], "required_teams_high_risk": ["ceo"]}
self.assertEqual(
sop.resolve_required_teams(item, high_risk=False),
["engineers", "managers", "ceo"],
)
def test_high_risk_uses_elevated_teams(self):
item = {"required_teams": ["engineers", "managers", "ceo"], "required_teams_high_risk": ["ceo"]}
self.assertEqual(
sop.resolve_required_teams(item, high_risk=True),
["ceo"],
)
def test_high_risk_without_elevated_falls_back_to_default(self):
# Items that don't declare required_teams_high_risk (e.g.
# comprehensive-testing, staging-smoke) are unaffected by risk-class.
item = {"required_teams": ["engineers"]}
self.assertEqual(
sop.resolve_required_teams(item, high_risk=True),
["engineers"],
)
def test_empty_elevated_list_falls_back_to_default(self):
# A defensive case: required_teams_high_risk: [] should not
# silently lock out all approvers — fall back to the default
# so the gate stays satisfiable. (Tightening should remove the
# key, not set it to empty.)
item = {"required_teams": ["engineers"], "required_teams_high_risk": []}
self.assertEqual(
sop.resolve_required_teams(item, high_risk=True),
["engineers"],
)
class TestRootCauseAckEligibilityWidened(unittest.TestCase):
"""Closes internal#442: a non-author engineers-team ack now satisfies
root-cause / no-backwards-compat for the default class.
The dead-managers/ceo-persona-token gridlock is the symptom; the
root cause is that sop-checklist ignored tier-class. These tests
pin the new wider-default behavior so it can't regress silently.
"""
def setUp(self):
self.items = _items_by_slug()
self.aliases = _numeric_aliases()
@staticmethod
def _approve_only(allowed):
return lambda slug, users: [u for u in users if u in allowed]
def test_engineers_ack_satisfies_root_cause_default_class(self):
# Bob is in engineers only (not managers, not ceo). Default class.
comments = [_comment("bob", "/sop-ack root-cause")]
# Probe: bob is approved because root-cause now lists engineers.
probe = self._approve_only({"bob"})
state = sop.compute_ack_state(
comments, "alice", self.items, self.aliases, probe, high_risk=False
)
self.assertEqual(state["root-cause"]["ackers"], ["bob"])
def test_engineers_ack_satisfies_no_backwards_compat_default_class(self):
comments = [_comment("bob", "/sop-ack no-backwards-compat")]
probe = self._approve_only({"bob"})
state = sop.compute_ack_state(
comments, "alice", self.items, self.aliases, probe, high_risk=False
)
self.assertEqual(state["no-backwards-compat"]["ackers"], ["bob"])
def test_engineers_ack_alone_fails_root_cause_when_high_risk(self):
# High-risk PR: only ceo can ack. Engineers-only ack must fail.
comments = [_comment("bob", "/sop-ack root-cause")]
# Probe: bob is in engineers, not ceo. Under high_risk,
# required_teams_high_risk=[ceo] → bob is NOT approved.
# Probe receives the items + flag indirectly via main(); for
# the unit-test path we inject a probe that rejects bob.
probe = self._approve_only(set()) # nobody is in ceo
state = sop.compute_ack_state(
comments, "alice", self.items, self.aliases, probe, high_risk=True
)
self.assertEqual(state["root-cause"]["ackers"], [])
self.assertIn("bob", state["root-cause"]["rejected"]["not_in_team"])
def test_ceo_ack_satisfies_root_cause_when_high_risk(self):
# High-risk PR + ceo-team approver → passes (the senior path).
comments = [_comment("hongming", "/sop-ack root-cause")]
probe = self._approve_only({"hongming"})
state = sop.compute_ack_state(
comments, "alice", self.items, self.aliases, probe, high_risk=True
)
self.assertEqual(state["root-cause"]["ackers"], ["hongming"])
def test_self_ack_still_forbidden_even_with_widened_eligibility(self):
# Author cannot self-ack — widening teams must NOT weaken
# the non-author rule.
comments = [_comment("alice", "/sop-ack root-cause")]
probe = self._approve_only({"alice"})
state = sop.compute_ack_state(
comments, "alice", self.items, self.aliases, probe, high_risk=False
)
self.assertEqual(state["root-cause"]["ackers"], [])
self.assertIn("alice", state["root-cause"]["rejected"]["self_ack"])
class TestHighRiskClassUsesElevatedListInConfig(unittest.TestCase):
"""End-to-end: the shipped config + RFC#450 predicate must keep
root-cause / no-backwards-compat gated on ceo for high-risk PRs."""
def test_root_cause_high_risk_elevated_to_ceo_only(self):
items = _items_by_slug()
# tier:high alone makes the PR high-risk → root-cause needs ceo.
self.assertEqual(
sop.resolve_required_teams(items["root-cause"], high_risk=True),
["ceo"],
)
# Default class accepts engineers/managers/ceo.
self.assertEqual(
sorted(sop.resolve_required_teams(items["root-cause"], high_risk=False)),
sorted(["engineers", "managers", "ceo"]),
)
def test_no_backwards_compat_high_risk_elevated_to_ceo_only(self):
items = _items_by_slug()
self.assertEqual(
sop.resolve_required_teams(items["no-backwards-compat"], high_risk=True),
["ceo"],
)
self.assertEqual(
sorted(sop.resolve_required_teams(items["no-backwards-compat"], high_risk=False)),
sorted(["engineers", "managers", "ceo"]),
)
def test_other_items_unchanged_by_risk_class(self):
# Items without required_teams_high_risk are unaffected.
items = _items_by_slug()
for slug in (
"comprehensive-testing",
"local-postgres-e2e",
"staging-smoke",
"five-axis-review",
"memory-consulted",
):
self.assertEqual(
sop.resolve_required_teams(items[slug], high_risk=False),
sop.resolve_required_teams(items[slug], high_risk=True),
f"item {slug} should not be affected by risk-class",
)
+43 -7
View File
@@ -50,6 +50,34 @@ tier_failure_mode:
"tier:low": soft
default_mode: hard # used when no tier:* label is present
# High-risk class (RFC#450 Option C, governance-fix for internal#442).
#
# A PR is "high-risk" when ANY of the listed labels are applied OR when
# the PR has `tier:high` (mechanically the strictest existing tier).
# High-risk items use `required_teams_high_risk` (when present on the
# item); non-high-risk items use the default `required_teams`.
#
# This closes the inconsistency that the SOP charter already mandates
# `tier:high → ceo only` for the sibling `sop-tier-check` gate; the
# sop-checklist's `root-cause` and `no-backwards-compat` items now
# follow the same risk-classed two-eyes shape:
# - Default class (tier:low/medium, not high-risk): a non-author
# engineers/managers/ceo ack satisfies the item — 25+ live
# identities, no dependency on a dead/inactive senior persona
# token.
# - High-risk class (tier:high OR any high_risk_label): still
# requires a non-author ceo ack (durable human team).
#
# Tightening: add labels to high_risk_labels.
# Loosening: remove labels.
high_risk_labels:
- "risk:high"
- "area:security"
- "area:schema"
- "area:fleet-image"
- "area:identity"
- "area:gate-meta"
items:
- slug: comprehensive-testing
numeric_alias: 1
@@ -78,11 +106,15 @@ items:
- slug: root-cause
numeric_alias: 4
pr_section_marker: "Root-cause not symptom"
required_teams: [managers, ceo]
required_teams: [engineers, managers, ceo]
required_teams_high_risk: [ceo]
description: >-
One-sentence root-cause statement. Ack from managers tier
(team-leads) or ceo. Senior judgment required to attest
root-cause-versus-symptom.
One-sentence root-cause statement. Default class: non-author
engineers/managers/ceo ack suffices (engineers can attest
root-cause-vs-symptom for routine fixes). High-risk class
(see `high_risk_labels`): non-author ceo ack required —
senior judgment for irreversible/security/identity/gate
changes. Closes internal#442 + tracks RFC#450.
- slug: five-axis-review
numeric_alias: 5
@@ -95,10 +127,14 @@ items:
- slug: no-backwards-compat
numeric_alias: 6
pr_section_marker: "No backwards-compat shim / dead code added"
required_teams: [managers, ceo]
required_teams: [engineers, managers, ceo]
required_teams_high_risk: [ceo]
description: >-
Yes/no + justification if no. Senior ack required because
backward-compat shims are how dead-code accretes.
Yes/no + justification if no. Default class: non-author
engineers/managers/ceo ack suffices. High-risk class
(see `high_risk_labels`): non-author ceo ack required —
senior judgment for shim-versus-real-fix on irreversible
surfaces. Closes internal#442 + tracks RFC#450.
- slug: memory-consulted
numeric_alias: 7
+61 -1
View File
@@ -158,8 +158,68 @@ jobs:
echo "NOTE: No warning in output (may be suppressed by log level)"
fi
- name: Reproduce openclaw failure — pipe held OPEN, no EOF
run: |
set -euo pipefail
echo "=== keep-stdin-open pipe (the real openclaw / Claude Code case) ==="
echo ""
echo "Before the readline() fix this HANGS: main() did"
echo " stdin.read(65536) -> on a pipe, blocks until 64KB OR EOF."
echo "An MCP client sends one ~150B initialize and keeps stdin"
echo "open waiting for the response, so the server never parsed"
echo "the request and the client timed out (openclaw: 'MCP error"
echo "-32000: Connection closed'). The earlier regular-file /"
echo "heredoc-pipe steps PASSED through this bug because a file"
echo "(or a closing heredoc) yields EOF immediately."
echo ""
# Drive the server through a real pipe that stays OPEN: write
# one initialize, do NOT close stdin, and require a response
# within a hard timeout. read(65536) -> no output -> timeout
# kills it -> FAIL. readline() -> immediate response -> PASS.
python - <<'PYEOF'
import json, subprocess, sys, time, select
proc = subprocess.Popen(
[sys.executable, "a2a_mcp_server.py"],
stdin=subprocess.PIPE, stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
env={**__import__("os").environ},
)
req = json.dumps({
"jsonrpc": "2.0", "id": 1, "method": "initialize",
"params": {"protocolVersion": "2024-11-05",
"capabilities": {},
"clientInfo": {"name": "keepopen", "version": "1"}},
}) + "\n"
proc.stdin.write(req.encode())
proc.stdin.flush()
# Deliberately DO NOT close proc.stdin — mirror a live MCP client.
deadline = time.time() + 15
line = b""
while time.time() < deadline:
r, _, _ = select.select([proc.stdout], [], [], 1)
if r:
line = proc.stdout.readline()
if line:
break
proc.kill()
if not line:
print("FAIL: no response within 15s on an open pipe — "
"stdin.read(65536) regression is back")
sys.exit(1)
resp = json.loads(line.decode())
assert resp.get("id") == 1 and "result" in resp, \
f"unexpected response: {line[:200]!r}"
assert resp["result"]["serverInfo"]["name"] == "molecule", \
f"wrong serverInfo: {line[:200]!r}"
print("PASS: server answered initialize on a still-open pipe")
PYEOF
- name: Run unit tests for stdio transport
run: |
set -euo pipefail
echo "=== Running stdio transport unit tests ==="
python -m pytest tests/test_a2a_mcp_server.py::TestStdioPipeAssertion -v --no-cov
python -m pytest tests/test_a2a_mcp_server.py::TestStdioPipeAssertion tests/test_a2a_mcp_server.py::TestStdioKeepOpenPipe -v --no-cov
+7 -5
View File
@@ -538,11 +538,13 @@ jobs:
all-required:
# Aggregator sentinel — RFC internal#219 §2 (Phase 4 — closes internal#286).
#
# Single stable required-status name that branch protection points at;
# CI churns underneath in `needs:` without any protection edits. Mirrors
# the molecule-controlplane Phase 2a impl shipped in CP PR#112 and
# referenced by `internal#286` ("Phase 4 is a single small PR... mirrors
# CP's existing one").
# Emits `CI / all-required (<event>)` where <event> is the workflow trigger
# (e.g. `CI / all-required (pull_request)`, `CI / all-required (push)`).
# Branch protection MUST be updated to require the event-suffixed name —
# requiring `CI / all-required` (bare, no suffix) silently blocks all merges
# because Gitea treats absent status contexts as pending (not skipped), and
# no workflow emits the bare name. Fixed: BP now requires
# `CI / all-required (pull_request)` per issue #1473.
#
# Closes the failure mode where status_check_contexts on molecule-core/main
# only listed `Secret scan` + `sop-tier-check` (the 2 meta-gates), so real
+177 -5
View File
@@ -52,6 +52,30 @@ name: E2E Peer Visibility (literal MCP list_peers)
# flip-to-required-ready (mirrors e2e-staging-saas.yml's proven shape;
# real EC2-provisioning E2E is push/dispatch/cron only — it is 30+ min
# and cannot run per-PR-update).
#
# LOCAL BACKEND (added 2026-05-15 — feedback_local_must_mimic_production,
# feedback_mandatory_local_e2e_before_ship, feedback_local_test_before_
# staging_e2e)
# --------------------------------------------------------------------
# The standing rule is that the local prod-mimic stack runs a MANDATORY
# local-Postgres E2E BEFORE staging E2E. A staging-only peer-visibility
# gate caught regressions late + expensively (cold EC2). The
# `peer-visibility-local` job below runs the SAME byte-identical
# assertion (tests/e2e/lib/peer_visibility_assert.sh) against the local
# docker-compose stack — built + booted exactly like e2e-api.yml's
# proven E2E API Smoke Test job (ephemeral pg/redis ports, go build,
# background platform-server). It runs on PR + push (local boot is
# minutes, not the 30+ min cold-EC2 path), so peer-visibility is part of
# the local gate that fires before the staging E2E.
#
# It is its OWN non-required status context `E2E Peer Visibility (local)`
# — same non-required-by-design decision as the staging job (red until
# Hermes-401 #162 / OpenClaw-never-online #165 land; flip-to-required
# tracked at molecule-core#1296). It is an HONEST gate: NO
# continue-on-error mask (feedback_fix_root_not_symptom). It is kept a
# distinct context (not folded into e2e-api.yml's required `E2E API
# Smoke Test`) precisely so a deliberately-RED-today gate cannot wedge
# the required local-E2E job or any unrelated merge.
on:
push:
@@ -65,6 +89,8 @@ on:
- 'workspace/a2a_mcp_server.py'
- 'workspace/platform_tools/registry.py'
- 'tests/e2e/test_peer_visibility_mcp_staging.sh'
- 'tests/e2e/test_peer_visibility_mcp_local.sh'
- 'tests/e2e/lib/peer_visibility_assert.sh'
- '.gitea/workflows/e2e-peer-visibility.yml'
pull_request:
branches: [main]
@@ -77,6 +103,8 @@ on:
- 'workspace/a2a_mcp_server.py'
- 'workspace/platform_tools/registry.py'
- 'tests/e2e/test_peer_visibility_mcp_staging.sh'
- 'tests/e2e/test_peer_visibility_mcp_local.sh'
- 'tests/e2e/lib/peer_visibility_assert.sh'
- '.gitea/workflows/e2e-peer-visibility.yml'
workflow_dispatch:
schedule:
@@ -108,16 +136,160 @@ jobs:
timeout-minutes: 5
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Validate driving script
- name: Validate driving scripts + shared assertion lib
run: |
bash -n tests/e2e/lib/peer_visibility_assert.sh
echo "lib/peer_visibility_assert.sh — bash syntax OK"
bash -n tests/e2e/test_peer_visibility_mcp_staging.sh
echo "test_peer_visibility_mcp_staging.sh — bash syntax OK"
echo "Real fresh-provision MCP list_peers E2E runs on push to"
bash -n tests/e2e/test_peer_visibility_mcp_local.sh
echo "test_peer_visibility_mcp_local.sh — bash syntax OK"
echo "Staging fresh-provision MCP list_peers E2E runs on push to"
echo "main / workflow_dispatch / daily cron (30+ min EC2 boot)."
echo "The LOCAL backend runs in the peer-visibility-local job"
echo "below on this same PR (local docker-compose stack)."
# Real gate: provisions a throwaway org + sibling-per-runtime, drives
# the LITERAL list_peers MCP call per runtime, asserts 200 + expected
# peer set, then scoped teardown. push(main)/dispatch/cron only.
# LOCAL gate: same byte-identical assertion against the local prod-mimic
# docker-compose stack — the MANDATORY local-E2E that must run BEFORE
# the staging E2E (feedback_mandatory_local_e2e_before_ship,
# feedback_local_test_before_staging_e2e). Bootstrap mirrors
# e2e-api.yml's proven E2E API Smoke Test job (per-run container names +
# ephemeral host ports so concurrent host-network act_runner runs don't
# collide; go build; background platform-server). Its OWN non-required
# status context `E2E Peer Visibility (local)` — non-required-by-design
# exactly like the staging job (red until #162/#165 land;
# flip-to-required tracked at molecule-core#1296). HONEST gate, NO
# continue-on-error mask (feedback_fix_root_not_symptom). Runs on PR +
# push (local boot is minutes, not the 30+ min cold-EC2 path).
# bp-required: pending #1296
peer-visibility-local:
name: E2E Peer Visibility (local)
runs-on: ubuntu-latest
timeout-minutes: 30
env:
# Per-run names + ephemeral ports — same collision-avoidance as
# e2e-api.yml (host-network act_runner; feedback_act_runner_*).
PG_CONTAINER: pg-e2e-pv-${{ github.run_id }}-${{ github.run_attempt }}
REDIS_CONTAINER: redis-e2e-pv-${{ github.run_id }}-${{ github.run_attempt }}
# LLM keys so hermes/openclaw can actually boot. The local script
# SKIPs (not fails) any runtime whose key is absent, so a partially
# keyed CI env still exercises whatever it can.
CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.E2E_CLAUDE_CODE_OAUTH_TOKEN }}
E2E_MINIMAX_API_KEY: ${{ secrets.MOLECULE_STAGING_MINIMAX_API_KEY }}
E2E_ANTHROPIC_API_KEY: ${{ secrets.MOLECULE_STAGING_ANTHROPIC_API_KEY }}
E2E_OPENAI_API_KEY: ${{ secrets.MOLECULE_STAGING_OPENAI_API_KEY }}
PV_RUNTIMES: "hermes openclaw claude-code"
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Pre-pull alpine + ensure provisioner network
run: |
docker pull alpine:latest >/dev/null
docker network create molecule-core-net >/dev/null 2>&1 || true
echo "alpine:latest pre-pulled; molecule-core-net ensured."
- name: Start Postgres (docker, ephemeral port)
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker run -d --name "$PG_CONTAINER" \
-e POSTGRES_USER=dev -e POSTGRES_PASSWORD=dev -e POSTGRES_DB=molecule \
-p 0:5432 postgres:16 >/dev/null
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
[ -n "$PG_PORT" ] || PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | head -1 | awk -F: '{print $NF}')
if [ -z "$PG_PORT" ]; then
echo "::error::Could not resolve host port for $PG_CONTAINER"
docker logs "$PG_CONTAINER" || true; exit 1
fi
echo "DATABASE_URL=postgres://dev:dev@127.0.0.1:${PG_PORT}/molecule?sslmode=disable" >> "$GITHUB_ENV"
for i in $(seq 1 30); do
docker exec "$PG_CONTAINER" pg_isready -U dev >/dev/null 2>&1 && { echo "Postgres ready after ${i}s"; exit 0; }
sleep 1
done
echo "::error::Postgres did not become ready in 30s"; docker logs "$PG_CONTAINER" || true; exit 1
- name: Start Redis (docker, ephemeral port)
run: |
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
docker run -d --name "$REDIS_CONTAINER" -p 0:6379 redis:7 >/dev/null
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
[ -n "$REDIS_PORT" ] || REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | head -1 | awk -F: '{print $NF}')
if [ -z "$REDIS_PORT" ]; then
echo "::error::Could not resolve host port for $REDIS_CONTAINER"
docker logs "$REDIS_CONTAINER" || true; exit 1
fi
echo "REDIS_URL=redis://127.0.0.1:${REDIS_PORT}" >> "$GITHUB_ENV"
for i in $(seq 1 15); do
docker exec "$REDIS_CONTAINER" redis-cli ping 2>/dev/null | grep -q PONG && { echo "Redis ready after ${i}s"; exit 0; }
sleep 1
done
echo "::error::Redis did not become ready in 15s"; docker logs "$REDIS_CONTAINER" || true; exit 1
- name: Build platform
working-directory: workspace-server
run: go build -o platform-server ./cmd/server
- name: Pick platform port
run: |
PLATFORM_PORT=$(python3 - <<'PY'
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("127.0.0.1", 0))
print(s.getsockname()[1])
PY
)
echo "PORT=${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "BASE=http://127.0.0.1:${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "Platform host port: ${PLATFORM_PORT}"
- name: Kill stale platform-server before start
run: |
killed=0
for pid in $(grep -l "platform-serve" /proc/[0-9]*/comm 2>/dev/null); do
kpid="${pid%/comm}"; kpid="${kpid##*/}"
cmdline=$(cat "/proc/${kpid}/cmdline" 2>/dev/null | tr '\0' ' ')
if echo "$cmdline" | grep -q "platform-server"; then
echo "Killing stale platform-server pid ${kpid}"
kill "$kpid" 2>/dev/null || true; killed=$((killed + 1))
fi
done
[ "$killed" -gt 0 ] && sleep 2 || true
echo "stale-kill done ($killed killed)"
- name: Start platform (background)
working-directory: workspace-server
run: |
./platform-server > platform.log 2>&1 &
echo $! > platform.pid
- name: Wait for /health
run: |
for i in $(seq 1 30); do
curl -sf "$BASE/health" > /dev/null && { echo "Platform up after ${i}s"; exit 0; }
sleep 1
done
echo "::error::Platform did not become healthy in 30s"
cat workspace-server/platform.log || true; exit 1
- name: Run LOCAL fresh-provision peer-visibility E2E (literal MCP list_peers)
# HONEST gate — NO continue-on-error. Red today (Hermes-401 #162 /
# OpenClaw-never-online #165 not yet fixed); green when they land.
# Non-required-by-design via its distinct status context until the
# molecule-core#1296 flip-to-required.
run: bash tests/e2e/test_peer_visibility_mcp_local.sh
- name: Dump platform log on failure
if: failure()
run: cat workspace-server/platform.log || true
- name: Stop platform
if: always()
run: |
if [ -f workspace-server/platform.pid ]; then
kill "$(cat workspace-server/platform.pid)" 2>/dev/null || true
fi
- name: Stop service containers
if: always()
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
# Real STAGING gate: provisions a throwaway org + sibling-per-runtime,
# drives the LITERAL list_peers MCP call per runtime, asserts 200 +
# expected peer set, then scoped teardown. push(main)/dispatch/cron only.
peer-visibility:
name: E2E Peer Visibility
runs-on: ubuntu-latest
+4
View File
@@ -52,5 +52,9 @@ jobs:
# explicitly instead of the combined state avoids false-pause when
# non-blocking jobs (continue-on-error: true) have failed — those
# failures pollute combined state but do not gate merges.
# NOTE: the event-suffixed context name is intentional — branch protection
# MUST require `CI / all-required (pull_request)` (with suffix), NOT the
# bare `CI / all-required`. Gitea treats absent contexts as pending, not
# skipped; requiring the bare name silently blocks all merges (issue #1473).
PUSH_REQUIRED_CONTEXTS: CI / all-required (push)
run: python3 .gitea/scripts/gitea-merge-queue.py
@@ -0,0 +1,168 @@
name: Lint forbidden tenant-env keys
# RFC#523 Layer 3 (task #146): scan workspace_secrets-writer Go code
# under workspace-server/ for new code that hardcodes a forbidden
# operator-scope env var NAME (GITEA_TOKEN, CP_ADMIN_API_TOKEN,
# RAILWAY_TOKEN, INFISICAL_OPERATOR_TOKEN, MOLECULE_OPERATOR_*, …).
#
# Catches the class "a new writer accidentally widens the propagation
# set" — e.g. a future env-mutator plugin that sets envVars["GITEA_TOKEN"]
# directly. Today the L1 runtime guard would abort the provision, but
# this lint surfaces the offending code at PR review time instead of
# at first provision attempt.
#
# Companion layers:
# - L1: workspace-server/internal/handlers/workspace_provision_forbidden_env.go
# (fail-closed abort at provision time)
# - L2: workspace/entrypoint.sh top-of-file env-grep + exit 1
#
# Open-source-template-friendly: the deny pattern is generic. A fork
# can copy this workflow and replace OPERATOR_KEY_PATTERN with its
# own operator-scope key names.
#
# Path-filter discipline:
# This workflow runs on every PR (no paths: filter — see
# feedback_path_filtered_workflow_cant_be_required). The scan itself
# targets workspace_secrets-writer paths via grep -r; it's fast
# (sub-second) so unconditional run is fine.
on:
pull_request:
types: [opened, synchronize, reopened]
push:
branches: [main, staging]
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
scan:
name: Scan workspace_secrets writers for forbidden env keys
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- name: Scan for forbidden operator-scope env key NAMES in writer paths
run: |
set -euo pipefail
# Forbidden EXACT-MATCH env var names. Kept in lockstep with
# workspace-server/internal/handlers/workspace_provision_forbidden_env.go
# forbiddenTenantEnvKeys. The Go-side test
# TestIsForbiddenTenantEnvKey_ExactMatches is the source of
# truth — if Go-side adds a key, also add it here (and
# vice-versa). Drift between the two is the failure mode this
# entire 3-layer guardrail is designed to catch.
FORBIDDEN_KEYS=(
"GITEA_TOKEN" "GITEA_PAT"
"GITHUB_TOKEN" "GITHUB_PAT" "GH_TOKEN"
"GITLAB_TOKEN" "GL_TOKEN"
"BITBUCKET_TOKEN"
"CP_ADMIN_API_TOKEN" "CP_ADMIN_TOKEN"
"INFISICAL_OPERATOR_TOKEN" "INFISICAL_BOOTSTRAP_TOKEN"
"RAILWAY_TOKEN" "RAILWAY_PERSONAL_API_TOKEN"
"HETZNER_TOKEN" "HETZNER_API_TOKEN"
)
# Forbidden PREFIX patterns — operator-scope families.
FORBIDDEN_PREFIXES=(
"MOLECULE_OPERATOR_"
)
# Writer paths: Go source under workspace-server/ that
# writes to the env-vars map or to workspace_secrets DB rows.
# Tests, the forbidden-env source itself, and the silent-
# strip denylist are exempt (they LIST the keys by design).
SCAN_ROOT="workspace-server/internal"
# Exempt paths fall in two classes:
# 1. The deny-set definitions + the silent-strip denylist:
# they LIST the forbidden names by design.
# 2. Pre-RFC#523 persona-merge / config-read paths that
# already handle these names correctly (the silent-
# strip downstream + the new L1 fail-closed cover the
# runtime risk; these reads are unchanged).
# New code MUST NOT be added to this list without reviewer
# signoff and a one-line justification in this diff.
EXEMPT_PATHS=(
# Class 1 — deny-set definitions
"workspace-server/internal/handlers/workspace_provision_forbidden_env.go"
"workspace-server/internal/handlers/workspace_provision_forbidden_env_test.go"
"workspace-server/internal/provisioner/provisioner.go"
"workspace-server/internal/provisioner/provisioner_test.go"
# Class 2 — pre-existing persona-fallback / org-helper paths
# that set the GITEA_TOKEN fallback lane (stripped downstream
# by provisioner.buildContainerEnv per forensic #145). The
# new L1 fail-closed runs BEFORE these writers, so any
# operator-scope leak via global/workspace_secrets is
# already caught. See applyAgentGitHTTPCreds doc-comment.
"workspace-server/internal/handlers/agent_git_identity.go"
"workspace-server/internal/handlers/org_helpers.go"
"workspace-server/internal/handlers/org.go"
# Class 2 — CP→platform admin auth (NOT a tenant env write;
# this is the control-plane HTTP auth header source).
"workspace-server/internal/provisioner/cp_provisioner.go"
)
# Build a single grep -F pattern: every forbidden key wrapped
# in quotes (Go string-literal form, which is how env-map
# writes appear). e.g. envVars["GITEA_TOKEN"] = ... or
# `"GITEA_TOKEN":` in a literal-map declaration.
#
# We deliberately match the quoted form so a comment that
# happens to spell the name without quotes (e.g. "see
# GITEA_TOKEN below") doesn't trip the lint.
PATTERN=""
for k in "${FORBIDDEN_KEYS[@]}"; do
PATTERN="${PATTERN}\"${k}\"\n"
done
for p in "${FORBIDDEN_PREFIXES[@]}"; do
# Prefix match needs a regex; switch to grep -E below for
# this slice. Kept conceptually here so the deny set lives
# in one place; scan is run twice (literal + prefix).
true
done
# Build exempt-paths grep filter — `grep -v -f` style.
EXEMPT_FILTER=$(mktemp)
trap 'rm -f "$EXEMPT_FILTER"' EXIT
for p in "${EXEMPT_PATHS[@]}"; do
echo "$p" >> "$EXEMPT_FILTER"
done
# --- Exact-match scan ---
HITS=""
for k in "${FORBIDDEN_KEYS[@]}"; do
# Only .go files; skip _test.go for the writer-path scan
# since tests legitimately reference the names. The
# writer-path lint targets PRODUCTION code only.
found=$(grep -rn --include='*.go' --exclude='*_test.go' "\"${k}\"" "$SCAN_ROOT" 2>/dev/null \
| grep -v -F -f "$EXEMPT_FILTER" || true)
if [ -n "$found" ]; then
HITS="${HITS}${found}\n"
fi
done
# --- Prefix scan ---
for prefix in "${FORBIDDEN_PREFIXES[@]}"; do
found=$(grep -rnE --include='*.go' --exclude='*_test.go' "\"${prefix}[A-Z0-9_]+\"" "$SCAN_ROOT" 2>/dev/null \
| grep -v -F -f "$EXEMPT_FILTER" || true)
if [ -n "$found" ]; then
HITS="${HITS}${found}\n"
fi
done
if [ -n "$HITS" ]; then
echo "::error::RFC#523 Layer 3: forbidden operator-scope env var name(s) hardcoded in tenant-workspace writer paths:"
printf "$HITS"
echo ""
echo "These env-var NAMES are on the operator-scope deny list (see"
echo "workspace-server/internal/handlers/workspace_provision_forbidden_env.go)."
echo "If your code legitimately needs to inject one of these for a"
echo "non-tenant code path, add the file to EXEMPT_PATHS in this"
echo "workflow with a one-line justification — reviewer signoff required."
exit 1
fi
echo "OK No forbidden operator-scope env key names hardcoded in writer paths."
@@ -0,0 +1,88 @@
name: Lint shellcheck (arm64 pilot)
# Mac-CI dual-track pilot (#233). ADDITIVE / NOT REQUIRED.
#
# Validates the arm64 self-hosted lane (no docker.sock, no privileged
# ops) before any required gate moves onto it. Until a Mac arm64 runner
# is registered with the `arm64` label, this workflow sits PENDING —
# that is FINE: `arm64` is NOT in branch_protections required contexts.
#
# Pairs with internal#543 (RFC: Mac arm64 multi-arch runner-base).
# No paths: filter on purpose (feedback_path_filtered_workflow_cant_be_required).
on:
pull_request:
branches:
- main
- staging
push:
branches:
- main
permissions:
contents: read
jobs:
shellcheck-arm64:
name: shellcheck-arm64 (pilot)
runs-on: [self-hosted, arm64]
# NOT a required check; safe to sit pending until Mac runner is up.
# If the Mac runner has trouble pulling actions/checkout we fall
# back to a plain git clone (see step 'fallback clone').
timeout-minutes: 10
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
steps:
- name: Identify runner
run: |
set -eu
echo "arch=$(uname -m)"
echo "kernel=$(uname -sr)"
echo "shell=$BASH_VERSION"
# Sanity: must actually be arm64. If amd64 sneaks in here,
# fail fast — that means the label routing is wrong.
case "$(uname -m)" in
aarch64|arm64) echo "arm64 confirmed" ;;
*) echo "ERROR: expected arm64, got $(uname -m)"; exit 1 ;;
esac
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 1
- name: Install shellcheck (arm64)
run: |
set -eu
if command -v shellcheck >/dev/null 2>&1; then
echo "shellcheck already present: $(shellcheck --version | head -1)"
else
# Prefer apt if the runner base ships it; else download arm64 binary.
if command -v apt-get >/dev/null 2>&1; then
sudo apt-get update -qq
sudo apt-get install -y --no-install-recommends shellcheck
else
SC_VER=v0.10.0
curl -fsSL "https://github.com/koalaman/shellcheck/releases/download/${SC_VER}/shellcheck-${SC_VER}.linux.aarch64.tar.xz" \
| tar -xJf - --strip-components=1
sudo mv shellcheck /usr/local/bin/
fi
fi
shellcheck --version | head -2
- name: Run shellcheck on .gitea/scripts/*.sh
run: |
set -eu
# Only the scripts we control under .gitea/scripts. Pilot
# scope is intentionally narrow — broaden in a follow-up
# once the lane is proven.
mapfile -t TARGETS < <(find .gitea/scripts -maxdepth 2 -type f -name '*.sh' | sort)
if [ "${#TARGETS[@]}" -eq 0 ]; then
echo "No .sh files found under .gitea/scripts — nothing to check"
exit 0
fi
echo "Checking ${#TARGETS[@]} file(s):"
printf ' %s\n' "${TARGETS[@]}"
# SC1091 = couldn't follow non-constant source; expected for
# CI-time analysis without the full runtime layout.
shellcheck --severity=error --exclude=SC1091 "${TARGETS[@]}"
+19 -4
View File
@@ -104,7 +104,7 @@ jobs:
with:
python-version: "3.11"
- name: Compute next version from PyPI latest
- name: Compute next version from PyPI latest and existing tags
id: bump
run: |
set -eu
@@ -112,9 +112,24 @@ jobs:
| python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])")
MAJOR=$(echo "$LATEST" | cut -d. -f1)
MINOR=$(echo "$LATEST" | cut -d. -f2)
PATCH=$(echo "$LATEST" | cut -d. -f3)
VERSION="${MAJOR}.${MINOR}.$((PATCH+1))"
echo "PyPI latest=$LATEST -> next=$VERSION"
TAG_LATEST=$(git tag --list "runtime-v${MAJOR}.${MINOR}.*" \
| sed -E 's/^runtime-v//' \
| grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' \
| sort -V \
| tail -1 || true)
VERSION=$(PYPI_LATEST="$LATEST" TAG_LATEST="$TAG_LATEST" python - <<'PY'
import os
def parse(v):
return tuple(int(part) for part in v.split("."))
pypi = os.environ["PYPI_LATEST"]
tag = os.environ.get("TAG_LATEST") or pypi
base = max(parse(pypi), parse(tag))
print(f"{base[0]}.{base[1]}.{base[2] + 1}")
PY
)
echo "PyPI latest=$LATEST, latest runtime tag=${TAG_LATEST:-none} -> next=$VERSION"
if ! echo "$VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then
echo "::error::computed version $VERSION does not match PEP 440 X.Y.Z"
exit 1
+1
View File
@@ -89,6 +89,7 @@ on:
permissions:
contents: read
pull-requests: read
secrets: read
jobs:
# bp-exempt: PR review bot signal; required merge state is enforced by CI / all-required.
+13
View File
@@ -30,6 +30,11 @@ jobs:
scan:
name: Scan diff for credential-shaped strings
runs-on: ubuntu-latest
# Hard CI gate — must complete or the PR is unmergable. 10-minute ceiling
# is generous for a diff-scan against a single SHA. If this times out, the
# runner is frozen and holding a slot — the step timeout triggers clean
# failure, releasing the runner for the next job.
timeout-minutes: 10
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
@@ -133,6 +138,14 @@ jobs:
[ -z "$f" ] && continue
[ "$f" = "$SELF_GITHUB" ] && continue
[ "$f" = "$SELF_GITEA" ] && continue
# Test-fixture exclude (internal#425): the secrets-detector's OWN
# unit-test corpus deliberately embeds credential-SHAPED example
# strings to exercise the detector. Verified 2026-05-18 synthetic
# (fabricated ghp_* fixtures, not real). Without this the scanner
# self-trips on its own fixtures and fail-closes every deploy.
# Same rationale as the SELF_* excludes above; gate NOT weakened
# (all other paths still fully scanned).
[ "$f" = "workspace-server/internal/secrets/patterns_test.go" ] && continue
if [ -n "$DIFF_RANGE" ]; then
ADDED=$(git diff --no-color --unified=0 "$BASE" "$HEAD" -- "$f" 2>/dev/null | grep -E '^\+[^+]' || true)
else
+1
View File
@@ -16,6 +16,7 @@ on:
permissions:
contents: read
pull-requests: read
secrets: read
jobs:
# bp-exempt: PR security review bot signal; required merge state is enforced by CI / all-required.
+1 -4
View File
@@ -84,11 +84,8 @@ on:
permissions:
contents: read
pull-requests: read
# NOTE: `statuses: write` is the GitHub-Actions name for POST /statuses.
# Gitea 1.22.6 may not gate on this permission key (it just checks the
# token), but listing it explicitly documents intent for the next
# platform-version upgrade.
statuses: write
secrets: read
jobs:
all-items-acked:
+1
View File
@@ -71,6 +71,7 @@ jobs:
permissions:
contents: read
pull-requests: read
secrets: read
steps:
- name: Check out base branch (for the script)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-255
View File
@@ -1,255 +0,0 @@
name: canary-verify
# Runs the canary smoke suite against the staging canary tenant fleet
# after a new :staging-<sha> image lands in ECR. On green, calls the
# CP redeploy-fleet endpoint to promote :staging-<sha> → :latest so
# the prod tenant fleet's 5-minute auto-updater picks up the verified
# digest. On red, :latest stays on the prior known-good digest and
# prod is untouched.
#
# Registry note (2026-05-10): This workflow previously used GHCR
# (ghcr.io/molecule-ai/platform-tenant) — that registry was retired
# during the 2026-05-06 Gitea suspension migration when publish-
# workspace-server-image.yml switched to the operator's ECR org
# (153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/
# platform-tenant). The GHCR → ECR migration was never applied to
# this file, so canary-verify was silently smoke-testing the stale
# GHCR image while the actual staging/prod tenants ran the ECR image.
# Result: smoke tests could not catch a broken ECR build. Fix:
# - Wait step: reads SHA from running canary /health (tenant-
# agnostic, works regardless of registry).
# - Promote step: calls CP redeploy-fleet endpoint with target_tag=
# staging-<sha>, same mechanism as redeploy-tenants-on-main.yml.
# No longer attempts GHCR crane ops.
#
# Dependencies:
# - publish-workspace-server-image.yml publishes :staging-<sha>
# to ECR on staging and main merges.
# - Canary tenants are configured to pull :staging-<sha> from ECR
# (TENANT_IMAGE env set to the ECR :staging-<sha> tag).
# - Repo secrets CANARY_TENANT_URLS / CANARY_ADMIN_TOKENS /
# CANARY_CP_SHARED_SECRET are populated.
on:
workflow_run:
workflows: ["publish-workspace-server-image"]
types: [completed]
workflow_dispatch:
permissions:
contents: read
packages: write
actions: read
env:
# ECR registry (post-2026-05-06 SSOT for tenant images).
# publish-workspace-server-image.yml pushes here.
IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform
TENANT_IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant
# CP endpoint for redeploy-fleet (used in promote step below).
CP_URL: ${{ vars.CP_URL || 'https://staging-api.moleculesai.app' }}
jobs:
canary-smoke:
# Skip when the upstream workflow failed — no image to test against.
if: ${{ github.event.workflow_run.conclusion == 'success' || github.event_name == 'workflow_dispatch' }}
runs-on: ubuntu-latest
outputs:
sha: ${{ steps.compute.outputs.sha }}
smoke_ran: ${{ steps.smoke.outputs.ran }}
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Compute sha
id: compute
run: echo "sha=${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
- name: Wait for canary tenants to pick up :staging-<sha>
# Poll canary health endpoints every 30s for up to 7 min instead
# of a fixed 6-min sleep. Exits as soon as ALL canaries report
# the new SHA (~2-3 min typical vs 6 min fixed). Falls back to
# proceeding after 7 min even if not all canaries responded —
# the smoke suite will catch any that didn't update.
#
# NOTE: The SHA is read from the running tenant's /health response,
# NOT from a registry lookup. This is registry-agnostic and works
# regardless of whether the tenant pulls from ECR, GHCR, or any
# other registry — the canary is telling us what it's actually
# running, which is the ground truth for smoke testing.
env:
CANARY_TENANT_URLS: ${{ secrets.CANARY_TENANT_URLS }}
EXPECTED_SHA: ${{ steps.compute.outputs.sha }}
run: |
if [ -z "$CANARY_TENANT_URLS" ]; then
echo "No canary URLs configured — falling back to 60s wait"
sleep 60
exit 0
fi
IFS=',' read -ra URLS <<< "$CANARY_TENANT_URLS"
MAX_WAIT=420 # 7 minutes
INTERVAL=30
ELAPSED=0
while [ $ELAPSED -lt $MAX_WAIT ]; do
ALL_READY=true
for url in "${URLS[@]}"; do
HEALTH=$(curl -s --max-time 5 "${url}/health" 2>/dev/null || echo "{}")
SHA=$(echo "$HEALTH" | grep -o "\"sha\":\"[^\"]*\"" | head -1 | cut -d'"' -f4)
if [ "$SHA" != "$EXPECTED_SHA" ]; then
ALL_READY=false
break
fi
done
if $ALL_READY; then
echo "All canaries running staging-${EXPECTED_SHA} after ${ELAPSED}s"
exit 0
fi
echo "Waiting for canaries... (${ELAPSED}s / ${MAX_WAIT}s)"
sleep $INTERVAL
ELAPSED=$((ELAPSED + INTERVAL))
done
echo "Timeout after ${MAX_WAIT}s — proceeding anyway (smoke suite will validate)"
- name: Run canary smoke suite
id: smoke
# Graceful-skip when no canary fleet is configured (Phase 2 not yet
# stood up — see molecule-controlplane/docs/canary-tenants.md).
# Sets `ran=false` on skip so promote-to-latest stays off (we don't
# want every main merge auto-promoting without gating). Manual
# promote-latest.yml is the release gate while canary is absent.
# Once the fleet is real: delete the early-exit branch.
env:
CANARY_TENANT_URLS: ${{ secrets.CANARY_TENANT_URLS }}
CANARY_ADMIN_TOKENS: ${{ secrets.CANARY_ADMIN_TOKENS }}
CANARY_CP_BASE_URL: https://staging-api.moleculesai.app
CANARY_CP_SHARED_SECRET: ${{ secrets.CANARY_CP_SHARED_SECRET }}
run: |
set -euo pipefail
if [ -z "${CANARY_TENANT_URLS:-}" ] \
|| [ -z "${CANARY_ADMIN_TOKENS:-}" ] \
|| [ -z "${CANARY_CP_SHARED_SECRET:-}" ]; then
{
echo "## ⚠️ canary-verify skipped"
echo
echo "One or more canary secrets are unset (\`CANARY_TENANT_URLS\`, \`CANARY_ADMIN_TOKENS\`, \`CANARY_CP_SHARED_SECRET\`)."
echo "Phase 2 canary fleet has not been stood up yet —"
echo "see [canary-tenants.md](https://git.moleculesai.app/molecule-ai/molecule-controlplane/blob/main/docs/canary-tenants.md)."
echo
echo "**Skipped — promote-to-latest will NOT auto-fire.** Dispatch \`promote-latest.yml\` manually when ready."
} >> "$GITHUB_STEP_SUMMARY"
echo "ran=false" >> "$GITHUB_OUTPUT"
echo "::notice::canary-verify: skipped — no canary fleet configured"
exit 0
fi
bash scripts/canary-smoke.sh
echo "ran=true" >> "$GITHUB_OUTPUT"
- name: Summary on failure
if: ${{ failure() }}
run: |
{
echo "## Canary smoke FAILED"
echo
echo "Canary tenants rejected image \`staging-${{ steps.compute.outputs.sha }}\`."
echo ":latest stays pinned to the prior good digest — prod is untouched."
echo
echo "Fix forward and merge again, or investigate the specific failed"
echo "assertions in the canary-smoke step log above."
} >> "$GITHUB_STEP_SUMMARY"
promote-to-latest:
# On green, calls the CP redeploy-fleet endpoint with target_tag=
# staging-<sha> to promote the verified ECR image. This is the same
# mechanism as redeploy-tenants-on-main.yml — no GHCR crane ops.
#
# Pre-fix history: the old GHCR promote step used `crane tag` against
# ghcr.io/molecule-ai/platform-tenant, but publish-workspace-server-
# image.yml had already migrated to ECR on 2026-05-07 (commit
# 10e510f5). The GHCR tags were never updated, so this step was
# silently promoting a stale GHCR image while actual prod tenants
# pulled from ECR. Canary smoke tests were GHCR-targeted and could
# not catch a broken ECR build.
needs: canary-smoke
if: ${{ needs.canary-smoke.result == 'success' && needs.canary-smoke.outputs.smoke_ran == 'true' }}
runs-on: ubuntu-latest
env:
SHA: ${{ needs.canary-smoke.outputs.sha }}
CP_URL: ${{ vars.CP_URL || 'https://staging-api.moleculesai.app' }}
# CP_ADMIN_API_TOKEN gates write access to the redeploy endpoint.
# Stored at the repo level so all workflows pick it up automatically.
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
# canary_slug pin: deploy the verified :staging-<sha> to the canary
# first (soak 120s), then fan out to the rest of the fleet.
CANARY_SLUG: ${{ vars.CANARY_PROMOTE_SLUG || '' }}
SOAK_SECONDS: ${{ vars.CANARY_PROMOTE_SOAK || '120' }}
BATCH_SIZE: ${{ vars.CANARY_PROMOTE_BATCH || '3' }}
steps:
- name: Check CP credentials
run: |
if [ -z "${CP_ADMIN_API_TOKEN:-}" ]; then
echo "::error::CP_ADMIN_API_TOKEN secret is not set — promote step cannot call redeploy-fleet."
echo "::error::Set it at: repo Settings → Actions → Variables and Secrets → New Secret."
exit 1
fi
- name: Promote verified ECR image to :latest
run: |
set -euo pipefail
TARGET_TAG="staging-${SHA}"
BODY=$(jq -nc \
--arg tag "$TARGET_TAG" \
--argjson soak "${SOAK_SECONDS:-120}" \
--argjson batch "${BATCH_SIZE:-3}" \
--argjson dry false \
'{
target_tag: $tag,
soak_seconds: $soak,
batch_size: $batch,
dry_run: $dry
}')
if [ -n "${CANARY_SLUG:-}" ]; then
BODY=$(jq '. * {canary_slug: $slug}' --arg slug "$CANARY_SLUG" <<<"$BODY")
fi
echo "Calling: POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " target_tag: $TARGET_TAG"
echo " body: $BODY"
HTTP_RESPONSE=$(mktemp)
HTTP_CODE_FILE=$(mktemp)
set +e
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" >"$HTTP_CODE_FILE"
CURL_EXIT=$?
set -e
HTTP_CODE=$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE (curl exit $CURL_EXIT)"
cat "$HTTP_RESPONSE" | jq . || cat "$HTTP_RESPONSE"
if [ "$HTTP_CODE" -ge 400 ]; then
echo "::error::CP redeploy-fleet returned HTTP $HTTP_CODE — refusing to proceed."
exit 1
fi
- name: Summary
run: |
{
echo "## Canary verified — :latest promoted via CP redeploy-fleet"
echo ""
echo "- **Target tag:** \`staging-${{ needs.canary-smoke.outputs.sha }}\`"
echo "- **Registry:** ECR (\`${TENANT_IMAGE_NAME}\`)"
echo "- **Canary slug:** \`${CANARY_SLUG:-<none>}\` (soak ${SOAK_SECONDS}s)"
echo "- **Batch size:** ${BATCH_SIZE:-3}"
echo ""
echo "CP redeploy-fleet is rolling out the verified image across the prod fleet."
echo "The fleet's 5-minute health-check loop will pick up the update automatically."
} >> "$GITHUB_STEP_SUMMARY"
@@ -1,400 +0,0 @@
name: redeploy-tenants-on-main
# Auto-refresh prod tenant EC2s after every main merge.
#
# Why this workflow exists: publish-workspace-server-image builds and
# pushes a new platform-tenant :<sha> to ECR on every merge to main,
# but running tenants pulled their image once at boot and never re-pull.
# Users see stale code indefinitely.
#
# This workflow closes the gap by calling the control-plane admin
# endpoint that performs a canary-first, batched, health-gated rolling
# redeploy across every live tenant. Implemented in molecule-ai/
# molecule-controlplane as POST /cp/admin/tenants/redeploy-fleet
# (feat/tenant-auto-redeploy, landing alongside this workflow).
#
# Registry: ECR (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/platform-tenant). GHCR was retired 2026-05-07 during the
# Gitea suspension migration. The canary-verify.yml promote step now
# uses the same redeploy-fleet endpoint (fixes the silent-GHCR gap).
#
# Runtime ordering:
# 1. publish-workspace-server-image completes → new :staging-<sha> in ECR.
# 2. This workflow fires via workflow_run, calls redeploy-fleet with
# target_tag=staging-<sha>. No CDN propagation wait needed —
# ECR image manifest is consistent immediately after push.
# 3. Calls redeploy-fleet with canary_slug (if set) and a soak
# period. Canary proves the image boots; batches follow.
# 4. Any failure aborts the rollout and leaves older tenants on the
# prior image — safer default than half-and-half state.
#
# Rollback path: re-run this workflow with a specific SHA pinned via
# the workflow_dispatch input. That calls redeploy-fleet with
# target_tag=<sha>, re-pulling the older image on every tenant.
on:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [main]
workflow_dispatch:
inputs:
target_tag:
# Empty default → auto-trigger and dispatch-without-input both
# resolve to `staging-<short_head_sha>` (the digest publish-image
# just pushed). Pre-fix this defaulted to 'latest', which only
# gets retagged by canary-verify's promote-to-latest job — and
# that job soft-skips when CANARY_TENANT_URLS is unset (the
# current state, until Phase 2 canary fleet is live). Result:
# `:latest` had been pinned to a 4-day-old digest (2026-04-28)
# while every main push pushed fresh `staging-<sha>` images;
# every prod redeploy pulled the stale `:latest` and the verify
# step correctly flagged 3/3 tenants STALE. Pulling the
# just-published `staging-<sha>` directly skips the dead retag
# path. When canary fleet is real, this workflow should chain
# on canary-verify completion (workflow_run from canary-verify),
# not publish-image — separate, smaller PR.
description: 'Tenant image tag to deploy (e.g. "latest", "staging-a59f1a6c"). Empty = auto staging-<head_sha>.'
required: false
type: string
default: ''
canary_slug:
description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately).'
required: false
type: string
# Must be an actual prod tenant slug (current: hongming,
# chloe-dong, reno-stars). The previous default 'hongmingwang'
# didn't match any tenant — CP soft-skipped the missing canary
# and the fleet rolled out without the soak gate, defeating the
# whole point of canary-first.
default: 'hongming'
soak_seconds:
description: 'Seconds to wait after canary before fanning out.'
required: false
type: string
default: '60'
batch_size:
description: 'How many tenants SSM redeploys in parallel per batch.'
required: false
type: string
default: '3'
dry_run:
description: 'Plan only — do not actually redeploy.'
required: false
type: boolean
default: false
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
# not the GitHub API.
# Serialize redeploys so two rapid main pushes' redeploys don't overlap
# and cause confusing per-tenant SSM state. Without this, GitHub's
# implicit workflow_run queueing would *probably* serialize them, but
# the explicit block makes the invariant defensible. Mirrors the
# concurrency block on redeploy-tenants-on-staging.yml for shape parity.
#
# cancel-in-progress: false → aborting a half-rolled-out fleet would
# leave tenants stuck on whatever image they happened to be on when
# cancelled. Better to finish the in-flight rollout before starting
# the next one.
concurrency:
group: redeploy-tenants-on-main
cancel-in-progress: false
jobs:
redeploy:
# Skip the auto-trigger if publish-workspace-server-image didn't
# actually succeed. workflow_run fires on any completion state; we
# don't want to redeploy against a half-built image.
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success')
runs-on: ubuntu-latest
timeout-minutes: 25
steps:
- name: Note on ECR propagation
# ECR image manifests are consistent immediately after push — no
# CDN cache to wait for. The old GHCR-based workflow had a 30s
# sleep to avoid race conditions; ECR makes that unnecessary.
run: echo "ECR image available immediately after push — proceeding."
- name: Compute target tag
id: tag
# Resolution order:
# 1. Operator-supplied input (workflow_dispatch with explicit
# tag) → used verbatim. Lets ops pin `latest` for emergency
# rollback to last canary-verified digest, or pin a specific
# `staging-<sha>` to roll back to a known-good build.
# 2. Default → `staging-<short_head_sha>`. The just-published
# digest. Bypasses the `:latest` retag path that's currently
# dead (canary-verify soft-skips without canary fleet, so
# the only thing retagging `:latest` today is the manual
# promote-latest.yml — last run 2026-04-28). Auto-trigger
# from workflow_run uses workflow_run.head_sha; manual
# dispatch with no input falls through to github.sha.
env:
INPUT_TAG: ${{ inputs.target_tag }}
HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
run: |
set -euo pipefail
if [ -n "${INPUT_TAG:-}" ]; then
echo "target_tag=$INPUT_TAG" >> "$GITHUB_OUTPUT"
echo "Using operator-pinned tag: $INPUT_TAG"
else
SHORT="${HEAD_SHA:0:7}"
echo "target_tag=staging-$SHORT" >> "$GITHUB_OUTPUT"
echo "Using auto tag: staging-$SHORT (head_sha=$HEAD_SHA)"
fi
- name: Call CP redeploy-fleet
# CP_ADMIN_API_TOKEN must be set as a repo/org secret on
# molecule-ai/molecule-core, matching the staging/prod CP's
# CP_ADMIN_API_TOKEN env. Stored in Railway, mirrored to this
# repo's secrets for CI.
env:
CP_URL: ${{ vars.CP_URL || 'https://api.moleculesai.app' }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
CANARY_SLUG: ${{ inputs.canary_slug || 'hongming' }}
SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
BATCH_SIZE: ${{ inputs.batch_size || '3' }}
DRY_RUN: ${{ inputs.dry_run || false }}
run: |
set -euo pipefail
if [ -z "${CP_ADMIN_API_TOKEN:-}" ]; then
echo "::error::CP_ADMIN_API_TOKEN secret not set — skipping redeploy"
echo "::notice::Set CP_ADMIN_API_TOKEN in repo secrets to enable auto-redeploy."
exit 1
fi
BODY=$(jq -nc \
--arg tag "$TARGET_TAG" \
--arg canary "$CANARY_SLUG" \
--argjson soak "$SOAK_SECONDS" \
--argjson batch "$BATCH_SIZE" \
--argjson dry "$DRY_RUN" \
'{
target_tag: $tag,
canary_slug: $canary,
soak_seconds: $soak,
batch_size: $batch,
dry_run: $dry
}')
echo "POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " body: $BODY"
HTTP_RESPONSE=$(mktemp)
HTTP_CODE_FILE=$(mktemp)
# Route -w into its own tempfile so curl's exit code (e.g. 56
# on connection-reset, 22 on --fail-with-body 4xx/5xx) can't
# pollute the captured stdout. The previous inline-substitution
# shape produced "000000" on connection reset (curl wrote
# "000" via -w, then the inline echo-fallback appended another
# "000") — caught on the 2026-05-04 redeploy of sha 2b862f6.
# set +e/-e keeps the non-zero curl exit from tripping the
# outer pipeline. See lint-curl-status-capture.yml for the
# CI gate that pins this fix shape.
set +e
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" >"$HTTP_CODE_FILE"
set -e
# Stderr from curl (e.g. dial errors with -sS) goes to the runner
# log so operators can see WHY a connection failed. Stdout is
# captured to $HTTP_CODE_FILE because that's where -w writes.
HTTP_CODE=$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE"
cat "$HTTP_RESPONSE" | jq . || cat "$HTTP_RESPONSE"
# Pretty-print per-tenant results in the job summary so
# ops can see which tenants were redeployed without drilling
# into the raw response.
{
echo "## Tenant redeploy fleet"
echo ""
echo "**Target tag:** \`$TARGET_TAG\`"
echo "**Canary:** \`$CANARY_SLUG\` (soak ${SOAK_SECONDS}s)"
echo "**Batch size:** $BATCH_SIZE"
echo "**Dry run:** $DRY_RUN"
echo "**HTTP:** $HTTP_CODE"
echo ""
echo "### Per-tenant result"
echo ""
echo '| Slug | Phase | SSM Status | Exit | Healthz | Error |'
echo '|------|-------|------------|------|---------|-------|'
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \(.error // "-") |"' "$HTTP_RESPONSE" || true
} >> "$GITHUB_STEP_SUMMARY"
if [ "$HTTP_CODE" != "200" ]; then
echo "::error::redeploy-fleet returned HTTP $HTTP_CODE"
exit 1
fi
OK=$(jq -r '.ok' "$HTTP_RESPONSE")
if [ "$OK" != "true" ]; then
echo "::error::redeploy-fleet reported ok=false (see summary for which tenant halted the rollout)"
exit 1
fi
echo "::notice::Tenant fleet redeploy reported ssm_status=Success — verifying actual image roll on each tenant..."
# Stash the response for the verify step. $RUNNER_TEMP outlasts
# the step boundary; $HTTP_RESPONSE doesn't.
cp "$HTTP_RESPONSE" "$RUNNER_TEMP/redeploy-response.json"
- name: Verify each tenant /buildinfo matches published SHA
# ROOT FIX FOR #2395.
#
# `redeploy-fleet`'s `ssm_status=Success` means "the SSM RPC
# didn't error" — NOT "the new image is running on the tenant."
# `:latest` lives in the local Docker daemon's image cache; if
# the SSM document does `docker compose up -d` without an
# explicit `docker pull`, the daemon serves the previously-
# cached digest and the container restarts on stale code.
# 2026-04-30 incident: hongmingwang's tenant reported
# ssm_status=Success at 17:00:53Z but kept serving pre-501a42d7
# chat_files for 30+ min — the lazy-heal fix never reached the
# user despite green deploy + green redeploy.
#
# This step closes the gap by curling each tenant's /buildinfo
# endpoint (added in workspace-server/internal/buildinfo +
# /Dockerfile* GIT_SHA build-arg, this PR) and comparing the
# returned git_sha to the SHA the workflow expects. Mismatches
# fail the workflow, which is what `ok=true` should have
# guaranteed all along.
#
# When the redeploy was triggered by workflow_dispatch with a
# specific tag (target_tag != "latest"), the expected SHA may
# not equal ${{ github.sha }} — in that case we resolve via
# GHCR's manifest. For workflow_run (default :latest) the
# workflow_run.head_sha is the SHA that just published.
env:
EXPECTED_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
# Tenant subdomain template — slugs from the response are
# appended. Production CP issues `<slug>.moleculesai.app`;
# staging CP issues `<slug>.staging.moleculesai.app`. This
# workflow runs on main → prod CP → no `staging.` infix.
TENANT_DOMAIN: 'moleculesai.app'
run: |
set -euo pipefail
EXPECTED_SHORT="${EXPECTED_SHA:0:7}"
if [ "$TARGET_TAG" != "latest" ] \
&& [ "$TARGET_TAG" != "$EXPECTED_SHA" ] \
&& [ "$TARGET_TAG" != "staging-$EXPECTED_SHORT" ]; then
# workflow_dispatch with a pinned tag that isn't the head
# SHA — operator is rolling back / pinning. Skip the
# verification because we don't have the expected SHA in
# this context (would need to crane-inspect the GHCR
# manifest, which is a follow-up). Failing-open here is
# safe: the operator chose the tag deliberately.
#
# `staging-<short_head_sha>` IS verified — it's the new
# auto-trigger default (see Compute target tag step) and
# the digest under that tag SHOULD match EXPECTED_SHA.
echo "::notice::target_tag=$TARGET_TAG (operator-pinned) — skipping per-tenant SHA verification."
exit 0
fi
RESP="$RUNNER_TEMP/redeploy-response.json"
if [ ! -s "$RESP" ]; then
echo "::error::redeploy-response.json missing or empty — verify step ran without a response to read"
exit 1
fi
# Pull only successfully-redeployed tenants. Any tenant that
# halted the rollout already failed the previous step, so we
# don't double-count them here.
mapfile -t SLUGS < <(jq -r '.results[]? | select(.healthz_ok == true) | .slug' "$RESP")
if [ ${#SLUGS[@]} -eq 0 ]; then
echo "::warning::No tenants reported healthz_ok — nothing to verify"
exit 0
fi
echo "Verifying ${#SLUGS[@]} tenant(s) against EXPECTED_SHA=${EXPECTED_SHA:0:7}..."
# Two distinct failure modes — STALE (the #2395 bug class, hard-fail)
# vs UNREACHABLE (teardown race, soft-warn). See the staging variant's
# comment for the full rationale; same logic applies on prod even
# though prod has fewer ephemeral tenants — the asymmetry would be a
# gratuitous fork.
STALE_COUNT=0
UNREACHABLE_COUNT=0
STALE_LINES=()
UNREACHABLE_LINES=()
for slug in "${SLUGS[@]}"; do
URL="https://${slug}.${TENANT_DOMAIN}/buildinfo"
# 30s total: tenant just SSM-restarted, may still be coming
# up. Retry-on-empty rather than retry-on-status — we want
# to fail fast on "responded with wrong SHA", not "still
# warming up".
BODY=$(curl -sS --max-time 30 --retry 3 --retry-delay 5 --retry-connrefused "$URL" || true)
ACTUAL_SHA=$(echo "$BODY" | jq -r '.git_sha // ""' 2>/dev/null || echo "")
if [ -z "$ACTUAL_SHA" ]; then
UNREACHABLE_COUNT=$((UNREACHABLE_COUNT + 1))
UNREACHABLE_LINES+=("| $slug | (no /buildinfo response) | ${EXPECTED_SHA:0:7} | ⚠ unreachable (likely teardown race) |")
continue
fi
if [ "$ACTUAL_SHA" = "$EXPECTED_SHA" ]; then
echo " $slug: ${ACTUAL_SHA:0:7} ✓"
else
STALE_COUNT=$((STALE_COUNT + 1))
STALE_LINES+=("| $slug | ${ACTUAL_SHA:0:7} | ${EXPECTED_SHA:0:7} | ❌ stale |")
fi
done
{
echo ""
echo "### Per-tenant /buildinfo verification"
echo ""
echo "Expected SHA: \`${EXPECTED_SHA:0:7}\`"
echo ""
if [ $STALE_COUNT -gt 0 ]; then
echo "**${STALE_COUNT} STALE tenant(s) — these did NOT pick up the new image despite ssm_status=Success:**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${STALE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "**${UNREACHABLE_COUNT} unreachable tenant(s) — likely teardown race (soft-warn, not failing):**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${UNREACHABLE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $STALE_COUNT -eq 0 ] && [ $UNREACHABLE_COUNT -eq 0 ]; then
echo "All ${#SLUGS[@]} tenants returned matching SHA. ✓"
fi
} >> "$GITHUB_STEP_SUMMARY"
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "::warning::$UNREACHABLE_COUNT tenant(s) unreachable post-redeploy. Likely benign teardown race — CP healthz monitor catches real outages."
fi
# Belt-and-suspenders sanity floor: same logic as the staging
# variant — see that file's comment for the full rationale.
# Floor only applies when fleet >= 4; below that, canary-verify
# is the actual gate.
TOTAL_VERIFIED=${#SLUGS[@]}
if [ $TOTAL_VERIFIED -ge 4 ] && [ $UNREACHABLE_COUNT -gt $((TOTAL_VERIFIED / 2)) ]; then
echo "::error::$UNREACHABLE_COUNT of $TOTAL_VERIFIED tenant(s) unreachable — exceeds 50% threshold on a fleet large enough that this signals a real outage, not teardown race."
exit 1
fi
if [ $STALE_COUNT -gt 0 ]; then
echo "::error::$STALE_COUNT tenant(s) returned a stale SHA. ssm_status=Success was misleading — see job summary."
exit 1
fi
echo "::notice::Tenant fleet redeploy complete — all reachable tenants on ${EXPECTED_SHA:0:7} (${UNREACHABLE_COUNT} unreachable, soft-warned)."
@@ -1,362 +0,0 @@
name: redeploy-tenants-on-staging
# Auto-refresh staging tenant EC2s after every staging-branch merge.
#
# Mirror of redeploy-tenants-on-main.yml, with the staging-CP host and
# the :staging-latest tag. Sister workflow exists for prod (rolls
# :latest after canary-verify). Both share the same shape — just
# different CP_URL + target_tag + admin token secret.
#
# Why this workflow exists: publish-workspace-server-image now builds
# on every staging-branch push (PR #2335), pushing
# platform-tenant:staging-latest to GHCR. Existing tenants pulled
# their image once at boot and never re-pull, so the new image just
# sits unused until the tenant is reprovisioned.
#
# This workflow closes the gap by calling staging-CP's
# /cp/admin/tenants/redeploy-fleet, which performs a canary-first,
# batched, health-gated SSM redeploy across every live staging tenant.
# Same endpoint shape as prod CP — only the host differs.
#
# Runtime ordering:
# 1. publish-workspace-server-image completes on staging branch →
# new :staging-latest in GHCR.
# 2. This workflow fires via workflow_run, waits 30s for GHCR's CDN
# to propagate the new tag.
# 3. Calls redeploy-fleet with no canary (staging IS canary; we don't
# need a sub-canary inside it). Soak still applies to the first
# tenant in case of bad-deploy detection.
# 4. Any failure aborts the rollout and leaves older tenants on the
# prior image — safer default than half-and-half state.
#
# Rollback path: re-run with workflow_dispatch + target_tag=staging-<sha>
# of a known-good build.
on:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [main]
workflow_dispatch:
inputs:
target_tag:
description: 'Tenant image tag to deploy (e.g. "staging-latest" or "staging-a59f1a6c"). Defaults to staging-latest when empty.'
required: false
type: string
default: 'staging-latest'
canary_slug:
description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately). Default empty for staging since staging itself is the canary.'
required: false
type: string
default: ''
soak_seconds:
description: 'Seconds to wait after canary before fanning out. Only meaningful if canary_slug is set.'
required: false
type: string
default: '60'
batch_size:
description: 'How many tenants SSM redeploys in parallel per batch.'
required: false
type: string
default: '3'
dry_run:
description: 'Plan only — do not actually redeploy.'
required: false
type: boolean
default: false
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
# not the GitHub API.
# Serialize per-branch so two rapid staging pushes' redeploys don't
# overlap and cause confusing per-tenant SSM state. cancel-in-progress
# is false because aborting a half-rolled-out fleet leaves tenants
# stuck on whatever image they happened to be on when cancelled.
concurrency:
group: redeploy-tenants-on-staging
cancel-in-progress: false
jobs:
redeploy:
# Skip the auto-trigger if publish-workspace-server-image didn't
# actually succeed. workflow_run fires on any completion state; we
# don't want to redeploy against a half-built image.
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success')
runs-on: ubuntu-latest
timeout-minutes: 25
steps:
- name: Wait for GHCR tag propagation
# GHCR's edge cache takes ~15-30s to consistently serve the new
# :staging-latest manifest after the registry accepts the push.
# Same rationale as redeploy-tenants-on-main.yml.
run: sleep 30
- name: Call staging-CP redeploy-fleet
# CP_STAGING_ADMIN_API_TOKEN must be set as a repo/org secret
# on molecule-ai/molecule-core, matching staging-CP's
# CP_ADMIN_API_TOKEN env var (visible in Railway controlplane
# / staging environment). Stored separately from the prod
# CP_ADMIN_API_TOKEN so a leak of one doesn't auth the other.
env:
CP_URL: ${{ vars.STAGING_CP_URL || 'https://staging-api.moleculesai.app' }}
CP_STAGING_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
TARGET_TAG: ${{ inputs.target_tag || 'staging-latest' }}
CANARY_SLUG: ${{ inputs.canary_slug || '' }}
SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
BATCH_SIZE: ${{ inputs.batch_size || '3' }}
DRY_RUN: ${{ inputs.dry_run || false }}
run: |
set -euo pipefail
# Schedule-vs-dispatch hardening (mirrors sweep-cf-orphans
# and sweep-cf-tunnels): hard-fail on auto-trigger when the
# secret is missing so a misconfigured-repo doesn't silently
# serve stale staging tenants. Soft-skip on operator dispatch.
if [ -z "${CP_STAGING_ADMIN_API_TOKEN:-}" ]; then
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "::warning::CP_STAGING_ADMIN_API_TOKEN secret not set — skipping redeploy"
echo "::warning::Set CP_STAGING_ADMIN_API_TOKEN in repo secrets to enable auto-redeploy."
echo "::notice::Pull the value from staging-CP's CP_ADMIN_API_TOKEN env in Railway."
exit 0
fi
echo "::error::staging redeploy cannot run — CP_STAGING_ADMIN_API_TOKEN secret missing"
echo "::error::set it at Settings → Secrets and Variables → Actions; pull from staging-CP's CP_ADMIN_API_TOKEN env in Railway."
exit 1
fi
BODY=$(jq -nc \
--arg tag "$TARGET_TAG" \
--arg canary "$CANARY_SLUG" \
--argjson soak "$SOAK_SECONDS" \
--argjson batch "$BATCH_SIZE" \
--argjson dry "$DRY_RUN" \
'{
target_tag: $tag,
canary_slug: $canary,
soak_seconds: $soak,
batch_size: $batch,
dry_run: $dry
}')
echo "POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " body: $BODY"
HTTP_RESPONSE=$(mktemp)
HTTP_CODE_FILE=$(mktemp)
# Route -w into its own tempfile so curl's exit code (e.g. 56
# on connection-reset) can't pollute the captured stdout. The
# previous inline-substitution shape produced "000000" on
# connection reset — caught on main variant 2026-05-04
# redeploying sha 2b862f6. Same fix shape as the synth-E2E
# §9c gate (PR #2797). See lint-curl-status-capture.yml for
# the CI gate that pins this fix shape.
set +e
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_STAGING_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" >"$HTTP_CODE_FILE"
set -e
# Stderr from curl (-sS shows dial errors etc.) goes to the
# runner log so operators can see WHY a connection failed.
HTTP_CODE=$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE"
cat "$HTTP_RESPONSE" | jq . || cat "$HTTP_RESPONSE"
{
echo "## Staging tenant redeploy fleet"
echo ""
echo "**Target tag:** \`$TARGET_TAG\`"
echo "**Canary:** \`${CANARY_SLUG:-(none — staging is itself the canary)}\` (soak ${SOAK_SECONDS}s)"
echo "**Batch size:** $BATCH_SIZE"
echo "**Dry run:** $DRY_RUN"
echo "**HTTP:** $HTTP_CODE"
echo ""
echo "### Per-tenant result"
echo ""
echo '| Slug | Phase | SSM Status | Exit | Healthz | Error |'
echo '|------|-------|------------|------|---------|-------|'
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \(.error // "-") |"' "$HTTP_RESPONSE" || true
} >> "$GITHUB_STEP_SUMMARY"
# Distinguish "real fleet failure" from "E2E teardown race".
#
# CP returns HTTP 500 + ok=false whenever ANY tenant in the
# fleet failed SSM or healthz. In practice the recurring source
# of these is ephemeral test tenants being torn down by their
# parent E2E run mid-redeploy: the EC2 dies → SSM exit=2 or
# healthz timeout → CP marks the fleet failed → this workflow
# goes red even though every operator-facing tenant rolled fine.
#
# Ephemeral slug prefixes (kept in sync with sweep-stale-e2e-orgs.yml
# — see that file for the source-of-truth list and rationale):
# - e2e-* — canvas/saas/ext E2E suites
# - rt-e2e-* — runtime-test harness fixtures (RFC #2251)
# Long-lived prefixes that are NOT ephemeral and MUST hard-fail:
# demo-prep, dryrun-*, dryrun2-*, plus all human tenant slugs.
#
# Filter: if HTTP=500/ok=false AND every failed slug matches an
# ephemeral prefix, treat as soft-warn and let the verify step
# downstream handle unreachable-vs-stale (#2402). Any non-ephemeral
# failure or a non-500 HTTP response remains a hard failure.
OK=$(jq -r '.ok // "false"' "$HTTP_RESPONSE")
FAILED_SLUGS=$(jq -r '
.results[]?
| select((.healthz_ok != true) or (.ssm_status != "Success"))
| .slug' "$HTTP_RESPONSE" 2>/dev/null || true)
EPHEMERAL_PREFIX_RE='^(e2e-|rt-e2e-)'
NON_EPHEMERAL_FAILED=$(printf '%s\n' "$FAILED_SLUGS" | grep -v '^$' | grep -Ev "$EPHEMERAL_PREFIX_RE" || true)
if [ "$HTTP_CODE" = "200" ] && [ "$OK" = "true" ]; then
: # happy path — fall through to verification
elif [ "$HTTP_CODE" = "500" ] && [ -z "$NON_EPHEMERAL_FAILED" ] && [ -n "$FAILED_SLUGS" ]; then
COUNT=$(printf '%s\n' "$FAILED_SLUGS" | grep -Ec "$EPHEMERAL_PREFIX_RE" || true)
echo "::warning::redeploy-fleet returned HTTP 500 but every failed tenant ($COUNT) is ephemeral (e2e-*/rt-e2e-*) — treating as teardown race, soft-warning."
printf '%s\n' "$FAILED_SLUGS" | sed 's/^/::warning:: failed: /'
elif [ "$HTTP_CODE" != "200" ]; then
echo "::error::redeploy-fleet returned HTTP $HTTP_CODE"
if [ -n "$NON_EPHEMERAL_FAILED" ]; then
echo "::error::non-ephemeral tenant(s) failed:"
printf '%s\n' "$NON_EPHEMERAL_FAILED" | sed 's/^/::error:: /'
fi
exit 1
else
# HTTP=200 but ok=false (shouldn't happen with current CP
# but keep the gate for completeness).
echo "::error::redeploy-fleet reported ok=false (see summary for which tenant halted the rollout)"
exit 1
fi
echo "::notice::Staging tenant fleet redeploy reported ssm_status=Success — verifying actual image roll on each tenant..."
cp "$HTTP_RESPONSE" "$RUNNER_TEMP/redeploy-response.json"
- name: Verify each staging tenant /buildinfo matches published SHA
# Mirror of the verify step in redeploy-tenants-on-main.yml — see
# there for the rationale (#2395 root fix). Staging has the same
# ssm_status-success-but-stale-image hazard and benefits from the
# same gate. Diff: TENANT_DOMAIN includes the `staging.` infix.
env:
EXPECTED_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
TARGET_TAG: ${{ inputs.target_tag || 'staging-latest' }}
TENANT_DOMAIN: 'staging.moleculesai.app'
run: |
set -euo pipefail
# staging-latest is the staging-side moving tag; treat it the
# same way main treats `latest`. Operator-pinned SHAs skip
# verification (see main variant for why).
if [ "$TARGET_TAG" != "staging-latest" ] && [ "$TARGET_TAG" != "latest" ] && [ "$TARGET_TAG" != "$EXPECTED_SHA" ]; then
echo "::notice::target_tag=$TARGET_TAG (operator-pinned) — skipping per-tenant SHA verification."
exit 0
fi
RESP="$RUNNER_TEMP/redeploy-response.json"
if [ ! -s "$RESP" ]; then
echo "::error::redeploy-response.json missing or empty"
exit 1
fi
mapfile -t SLUGS < <(jq -r '.results[]? | select(.healthz_ok == true) | .slug' "$RESP")
if [ ${#SLUGS[@]} -eq 0 ]; then
echo "::warning::No staging tenants reported healthz_ok — nothing to verify"
exit 0
fi
echo "Verifying ${#SLUGS[@]} staging tenant(s) against EXPECTED_SHA=${EXPECTED_SHA:0:7}..."
# Two distinct failure modes here:
# STALE_COUNT — tenant returned a SHA that doesn't match. THIS is
# the #2395 bug class: tenant up + serving old code.
# Always hard-fail the workflow.
# UNREACHABLE_COUNT — tenant didn't respond. Almost always a benign
# teardown race: redeploy-fleet snapshot says
# healthz_ok=true, then the E2E suite tears the
# ephemeral tenant down before this step runs (the
# e2e-* fixtures churn 5-10/hour on staging). Soft-
# warn so we don't block staging→main on cleanup.
# Real "tenant up but unreachable" is caught by CP's
# own healthz monitor + the post-redeploy alert; we
# don't need to double-count it here.
STALE_COUNT=0
UNREACHABLE_COUNT=0
STALE_LINES=()
UNREACHABLE_LINES=()
for slug in "${SLUGS[@]}"; do
URL="https://${slug}.${TENANT_DOMAIN}/buildinfo"
BODY=$(curl -sS --max-time 30 --retry 3 --retry-delay 5 --retry-connrefused "$URL" || true)
ACTUAL_SHA=$(echo "$BODY" | jq -r '.git_sha // ""' 2>/dev/null || echo "")
if [ -z "$ACTUAL_SHA" ]; then
UNREACHABLE_COUNT=$((UNREACHABLE_COUNT + 1))
UNREACHABLE_LINES+=("| $slug | (no /buildinfo response) | ${EXPECTED_SHA:0:7} | ⚠ unreachable (likely teardown race) |")
continue
fi
if [ "$ACTUAL_SHA" = "$EXPECTED_SHA" ]; then
echo " $slug: ${ACTUAL_SHA:0:7} ✓"
else
STALE_COUNT=$((STALE_COUNT + 1))
STALE_LINES+=("| $slug | ${ACTUAL_SHA:0:7} | ${EXPECTED_SHA:0:7} | ❌ stale |")
fi
done
{
echo ""
echo "### Per-tenant /buildinfo verification (staging)"
echo ""
echo "Expected SHA: \`${EXPECTED_SHA:0:7}\`"
echo ""
if [ $STALE_COUNT -gt 0 ]; then
echo "**${STALE_COUNT} STALE tenant(s) — these did NOT pick up the new image despite ssm_status=Success:**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${STALE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "**${UNREACHABLE_COUNT} unreachable tenant(s) — likely E2E teardown race (soft-warn, not failing):**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${UNREACHABLE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $STALE_COUNT -eq 0 ] && [ $UNREACHABLE_COUNT -eq 0 ]; then
echo "All ${#SLUGS[@]} staging tenants returned matching SHA. ✓"
fi
} >> "$GITHUB_STEP_SUMMARY"
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "::warning::$UNREACHABLE_COUNT staging tenant(s) unreachable post-redeploy. Likely benign teardown race — CP healthz monitor catches real outages."
fi
# Belt-and-suspenders sanity floor: if MORE than half the fleet is
# unreachable AND the fleet is large enough that "half down" is
# statistically meaningful, this is a real outage (e.g. new image
# crashes on startup), not a teardown race. Hard-fail.
#
# Floor only applies when TOTAL_VERIFIED >= 4 — below that, the
# canary-verify step is the actual gate for "all tenants down"
# detection (it runs against the canary first and aborts the
# rollout if the canary fails to come up). Without the >=4 gate,
# a 1-tenant fleet (e.g. a single ephemeral e2e-* tenant on a
# quiet staging push) would re-flake on the exact teardown-race
# condition #2402 fixed: 1 of 1 unreachable = 100% > 50% → fail.
TOTAL_VERIFIED=${#SLUGS[@]}
if [ $TOTAL_VERIFIED -ge 4 ] && [ $UNREACHABLE_COUNT -gt $((TOTAL_VERIFIED / 2)) ]; then
echo "::error::$UNREACHABLE_COUNT of $TOTAL_VERIFIED staging tenant(s) unreachable — exceeds 50% threshold on a fleet large enough that this signals a real outage, not teardown race."
exit 1
fi
if [ $STALE_COUNT -gt 0 ]; then
echo "::error::$STALE_COUNT staging tenant(s) returned a stale SHA. ssm_status=Success was misleading — see job summary."
exit 1
fi
echo "::notice::Staging tenant fleet redeploy complete — all reachable tenants on ${EXPECTED_SHA:0:7} (${UNREACHABLE_COUNT} unreachable, soft-warned)."
+12 -12
View File
@@ -57,24 +57,24 @@ See `CLAUDE.md` for a full list of environment variables and their purposes.
This repo is scoped to **code** (canvas, workspace, workspace-server, related
infra). Public content (blog posts, marketing copy, OG images, SEO briefs,
DevRel demos) lives in [`Molecule-AI/docs`](https://git.moleculesai.app/molecule-ai/docs).
DevRel demos) lives in [`molecule-ai/docs`](https://git.moleculesai.app/molecule-ai/docs).
The `Block forbidden paths` CI gate fails any PR that writes to `marketing/`
or other removed paths — open against `Molecule-AI/docs` instead.
or other removed paths — open against `molecule-ai/docs` instead.
| Content type | Target |
|---|---|
| Blog posts | `Molecule-AI/docs``content/blog/<YYYY-MM-DD-slug>/` |
| Doc pages | `Molecule-AI/docs``content/docs/` |
| Marketing copy / PMM positioning | `Molecule-AI/docs``marketing/` |
| OG images, visual assets | `Molecule-AI/docs``app/` or `marketing/` |
| SEO briefs | `Molecule-AI/docs``marketing/` |
| DevRel demos (runnable code) | Standalone repo under `Molecule-AI/`, OR embedded in `Molecule-AI/docs` |
| Blog posts | `molecule-ai/docs``content/blog/<YYYY-MM-DD-slug>/` |
| Doc pages | `molecule-ai/docs``content/docs/` |
| Marketing copy / PMM positioning | `molecule-ai/docs``marketing/` |
| OG images, visual assets | `molecule-ai/docs``app/` or `marketing/` |
| SEO briefs | `molecule-ai/docs``marketing/` |
| DevRel demos (runnable code) | Standalone repo under `molecule-ai/`, OR embedded in `molecule-ai/docs` |
| Launch checklists, internal tracking | GitHub Issues — **not** committed files |
| Engineering docs (`docs/adr/`, `docs/architecture/`, `docs/incidents/`) | This repo (internal, not published) |
| Live product pages (e.g. `canvas/src/app/pricing/page.tsx`) | This repo (these are app code, not marketing copy) |
If a PR fails the `Block forbidden paths` check, the contents belong in
`Molecule-AI/docs`. No CI drag, no Canvas E2E, content lands in minutes.
`molecule-ai/docs`. No CI drag, no Canvas E2E, content lands in minutes.
## Development Workflow
@@ -190,9 +190,9 @@ Runs the full regression suite against a fixture HTTP server. No network access
Code in this repo lands in molecule-core. Some related runtime artifacts
live in their own repos:
- [`Molecule-AI/molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime) — Python adapter SDK (`molecule_runtime`) that runs inside containerized Molecule workspaces. Bridges Claude Code SDK / hermes / langgraph / etc. → A2A queue.
- [`Molecule-AI/molecule-sdk-python`](https://git.moleculesai.app/molecule-ai/molecule-sdk-python) — `A2AServer` + `RemoteAgentClient` for external agents that register over the public `/registry/register` flow.
- [`Molecule-AI/molecule-mcp-claude-channel`](https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel) — Claude Code channel plugin. Bridges A2A traffic into a running Claude Code session via MCP `notifications/claude/channel`. Polling-based (no tunnel required); install with `claude --channels plugin:molecule@Molecule-AI/molecule-mcp-claude-channel`.
- [`molecule-ai/molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime) — Python adapter SDK (`molecule_runtime`) that runs inside containerized Molecule workspaces. Bridges Claude Code SDK / hermes / langgraph / etc. → A2A queue.
- [`molecule-ai/molecule-sdk-python`](https://git.moleculesai.app/molecule-ai/molecule-sdk-python) — `A2AServer` + `RemoteAgentClient` for external agents that register over the public `/registry/register` flow.
- [`molecule-ai/molecule-mcp-claude-channel`](https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel) — Claude Code channel plugin. Bridges A2A traffic into a running Claude Code session via MCP `notifications/claude/channel`. Polling-based (no tunnel required); install inside Claude Code via `/plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git``/plugin install molecule@molecule-channel`, then launch with `claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel`.
When extending the **A2A surface** in molecule-core (`workspace-server/internal/handlers/a2a_proxy.go` etc.), consider whether the change has a downstream impact on the runtime SDK or the channel plugin — they're versioned independently but share the wire shape.
+12 -2
View File
@@ -4,10 +4,10 @@
# use this Makefile; CI calls docker compose / go test directly so the
# Makefile can evolve without breaking the build.
.PHONY: help dev up down logs build test
.PHONY: help dev up down logs build test e2e-peer-visibility
help: ## Show this help.
@grep -E '^[a-zA-Z_-]+:.*?## ' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-12s\033[0m %s\n", $$1, $$2}'
@grep -E '^[a-zA-Z0-9_-]+:.*?## ' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-22s\033[0m %s\n", $$1, $$2}'
dev: ## Start the full stack with air hot-reload for the platform service.
docker compose -f docker-compose.yml -f docker-compose.dev.yml up
@@ -26,3 +26,13 @@ build: ## Force a fresh build of the platform image (no cache).
test: ## Run Go unit tests in workspace-server/.
cd workspace-server && go test -race ./...
# ─── Local prod-mimic E2E gates ────────────────────────────────────────
# Run the LITERAL peer-visibility MCP list_peers gate against the
# already-running local stack (`make up` or `make dev`). Same byte-
# identical assertion as the staging gate — only provisioning differs.
# Skips any runtime whose provider key is absent (partially-keyed env
# is fine). See tests/e2e/test_peer_visibility_mcp_local.sh for the
# env contract (CLAUDE_CODE_OAUTH_TOKEN / E2E_MINIMAX_API_KEY / etc).
e2e-peer-visibility: ## Run the LOCAL peer-visibility MCP gate vs the running stack (needs `make up` first).
bash tests/e2e/test_peer_visibility_mcp_local.sh
+1 -1
View File
@@ -238,7 +238,7 @@ The result is not just “an agent that learns.” It is **an organization that
- subscribe to one or more workspaces; peer messages surface as conversation turns; replies route back through Molecule's A2A
- no tunnel, no public endpoint — the plugin self-registers each watched workspace as `delivery_mode=poll` and long-polls `/activity?since_id=…`
- multi-tenant friendly: one plugin install can watch workspaces across multiple Molecule tenants (`MOLECULE_PLATFORM_URLS` per-workspace)
- install via the standard marketplace flow: `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel``/plugin install molecule-channel@molecule-mcp-claude-channel`
- install via the standard marketplace flow: `/plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git``/plugin install molecule@molecule-channel`, then launch with `claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel`
## Built For Teams That Need More Than A Demo
+1 -1
View File
@@ -237,7 +237,7 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
- 订阅一个或多个 workspacepeer 的消息会以 user-turn 出现,回复会经 Molecule A2A 路由出去
- 无需公网隧道、无需公开端点 —— 插件启动时自动把每个 watched workspace 注册成 `delivery_mode=poll`,长轮询 `/activity?since_id=…`
- 多租户友好:单次安装即可同时 watch 跨多个 Molecule 租户的 workspace`MOLECULE_PLATFORM_URLS` 按 workspace 配置)
- 通过标准 marketplace 流程安装:`/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel``/plugin install molecule-channel@molecule-mcp-claude-channel`
- 通过标准 marketplace 流程安装:`/plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git``/plugin install molecule@molecule-channel`,然后用 `claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel` 启动
## 适合什么团队
+113
View File
@@ -0,0 +1,113 @@
import { describe, it, expect, vi } from "vitest";
// Marketing-launch SEO (mc#1486). These tests pin the public crawler
// contract: anything that flips public marketing routes to disallow,
// drops the sitemap from robots.txt, or removes the OG image
// reference from root metadata should fail loudly here.
// next/font and the rest of the layout's runtime tree are not
// vitest-compatible (next/font expects the Next.js compiler swc
// transform). We import layout.tsx only for its exported `metadata`
// constant — mock the font module to a constructor-returning stub.
vi.mock("next/font/google", () => ({
Inter: () => ({ variable: "--font-inter" }),
JetBrains_Mono: () => ({ variable: "--font-jetbrains" }),
}));
import robots from "../robots";
import sitemap from "../sitemap";
import { metadata } from "../layout";
describe("robots.ts", () => {
it("allows public marketing routes and blocks authed/app routes", () => {
const r = robots();
expect(r.rules).toBeDefined();
const rule = Array.isArray(r.rules) ? r.rules[0] : r.rules!;
expect(rule.userAgent).toBe("*");
const allow = Array.isArray(rule.allow) ? rule.allow : [rule.allow];
expect(allow).toEqual(expect.arrayContaining(["/", "/pricing", "/blog"]));
const disallow = Array.isArray(rule.disallow)
? rule.disallow
: [rule.disallow];
expect(disallow).toEqual(
expect.arrayContaining(["/api/", "/orgs", "/cp/"]),
);
});
it("declares the sitemap URL", () => {
const r = robots();
expect(r.sitemap).toMatch(/\/sitemap\.xml$/);
});
it("declares a canonical host", () => {
const r = robots();
expect(r.host).toMatch(/^https:\/\//);
});
});
describe("sitemap.ts", () => {
it("includes apex, pricing, and the live blog post", () => {
const entries = sitemap();
const urls = entries.map((e) => e.url);
expect(urls.some((u) => u.endsWith("/"))).toBe(true);
expect(urls.some((u) => u.endsWith("/pricing"))).toBe(true);
expect(
urls.some((u) => u.includes("/blog/2026-04-20-chrome-devtools-mcp")),
).toBe(true);
});
it("does NOT include authed/app routes", () => {
const entries = sitemap();
const urls = entries.map((e) => e.url);
expect(urls.some((u) => u.includes("/orgs"))).toBe(false);
expect(urls.some((u) => u.includes("/api/"))).toBe(false);
});
it("sets a non-zero priority and a valid changeFrequency on every entry", () => {
const valid = new Set([
"always",
"hourly",
"daily",
"weekly",
"monthly",
"yearly",
"never",
]);
for (const e of sitemap()) {
expect(e.priority).toBeGreaterThan(0);
expect(valid.has(String(e.changeFrequency))).toBe(true);
}
});
});
describe("root layout metadata", () => {
it("sets a templated title + non-empty description", () => {
const t = metadata.title as { default: string; template: string };
expect(t.default).toMatch(/Molecule AI/);
expect(t.template).toMatch(/%s/);
expect((metadata.description ?? "").length).toBeGreaterThan(50);
});
it("declares OG + Twitter text fields (image comes from opengraph-image.tsx)", () => {
const og = metadata.openGraph;
expect(og).toBeDefined();
expect((og as { title: string }).title).toMatch(/Molecule AI/);
expect((og as { description: string }).description.length).toBeGreaterThan(
50,
);
const tw = metadata.twitter;
expect(tw).toBeDefined();
// Next.js typings narrow twitter.card to a union — assert via cast.
expect((tw as { card: string }).card).toBe("summary_large_image");
});
it("sets a canonical alternate", () => {
expect(metadata.alternates?.canonical).toBe("/");
});
it("enables indexing at the metadata level (robots.ts owns per-route)", () => {
const r = metadata.robots as { index: boolean; follow: boolean };
expect(r.index).toBe(true);
expect(r.follow).toBe(true);
});
});
+140 -2
View File
@@ -27,9 +27,78 @@ import {
themeBootScript,
} from "@/lib/theme-cookie";
// Marketing-launch SEO (mc#1486). Canonical apex is app.moleculesai.app —
// tenant subdomains (<slug>.moleculesai.app) reuse the same Next.js build
// but are gated behind auth (AuthGate redirects anonymous → /cp/auth/login)
// and are de-indexed in robots.ts. The metadata here applies to the
// public marketing surface served from the apex host.
//
// Override per-route by exporting a page-level `metadata`/`generateMetadata`
// — Next.js merges page metadata over layout metadata using
// `title.template` for "<page> | Molecule AI" composition.
const SITE_URL =
process.env.NEXT_PUBLIC_SITE_URL ?? "https://app.moleculesai.app";
export const metadata: Metadata = {
title: "Molecule AI",
description: "AI Org Chart Canvas",
metadataBase: new URL(SITE_URL),
title: {
default: "Molecule AI — the AI org chart canvas",
template: "%s | Molecule AI",
},
description:
"Molecule AI is an org-chart canvas for AI agent teams. Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed multi-agent workspace with credit metering, audit, and one-click runtime provisioning.",
applicationName: "Molecule AI",
keywords: [
"AI agents",
"multi-agent",
"agent orchestration",
"AI org chart",
"Claude Code",
"Codex",
"MCP",
"agent governance",
"A2A",
"agent runtime",
],
authors: [{ name: "Molecule AI" }],
creator: "Molecule AI",
publisher: "Molecule AI",
alternates: { canonical: "/" },
// OG + Twitter images come from the file-convention sibling
// `opengraph-image.tsx` — Next.js auto-attaches them to og:image
// and twitter:image when present at the segment root. We keep the
// text fields here so they win over per-page metadata when a page
// doesn't override them. `images: []` as the structural fallback
// for hosts that won't follow the file convention; the real URL
// is injected by Next.js at build time from opengraph-image.tsx.
openGraph: {
type: "website",
siteName: "Molecule AI",
url: SITE_URL,
title: "Molecule AI — the AI org chart canvas",
description:
"Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed multi-agent workspace. Credit metering, audit, and one-click runtime provisioning.",
locale: "en_US",
},
twitter: {
card: "summary_large_image",
title: "Molecule AI — the AI org chart canvas",
description:
"Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed multi-agent workspace.",
},
icons: {
icon: "/molecule-icon.png",
apple: "/molecule-icon.png",
},
// robots.ts owns the per-route allow/disallow contract; this is the
// header-level fallback for routes the crawler reaches before
// robots.txt resolves. Default = index public marketing routes;
// app/auth/api/orgs are noindex'd by robots.ts.
robots: {
index: true,
follow: true,
googleBot: { index: true, follow: true, "max-image-preview": "large" },
},
};
export default async function RootLayout({
@@ -94,6 +163,75 @@ export default async function RootLayout({
nonce={nonce}
dangerouslySetInnerHTML={{ __html: themeBootScript }}
/>
{/*
* JSON-LD structured data (mc#1486). Two graph nodes:
*
* - Organization: surfaces the brand to Google Knowledge
* Graph + Bing entity index. URL+logo+sameAs are the
* minimum recommended set for new brands without a
* Wikipedia page.
*
* - WebSite: enables the sitelinks search box and tells
* crawlers the canonical site URL when the same content
* is reachable via multiple subdomains (apex + tenant).
*
* Type-application/ld+json runs synchronously without
* executing JS, so 'strict-dynamic' isn't required — we still
* carry the nonce because production CSP's default-src 'self'
* applies to any <script> element. The "type" attribute is
* what keeps the browser from running the body as JS, but
* CSP nonces are gated on the element not the type, so we
* include the nonce too.
*/}
<script
type="application/ld+json"
nonce={nonce}
dangerouslySetInnerHTML={{
__html: JSON.stringify({
"@context": "https://schema.org",
"@graph": [
{
"@type": "Organization",
"@id": `${SITE_URL}#organization`,
name: "Molecule AI",
url: SITE_URL,
logo: `${SITE_URL}/molecule-icon.png`,
sameAs: [
"https://github.com/molecule-ai",
"https://x.com/moleculeai",
],
},
{
"@type": "WebSite",
"@id": `${SITE_URL}#website`,
url: SITE_URL,
name: "Molecule AI",
publisher: { "@id": `${SITE_URL}#organization` },
inLanguage: "en-US",
},
{
"@type": "SoftwareApplication",
"@id": `${SITE_URL}#software`,
name: "Molecule AI",
applicationCategory: "DeveloperApplication",
operatingSystem: "Web",
description:
"Org-chart canvas for AI agent teams with credit metering, audit, and one-click runtime provisioning.",
url: SITE_URL,
offers: {
"@type": "AggregateOffer",
priceCurrency: "USD",
lowPrice: "0",
highPrice: "99",
offerCount: "3",
url: `${SITE_URL}/pricing`,
},
publisher: { "@id": `${SITE_URL}#organization` },
},
],
}),
}}
/>
</head>
<body className={`bg-surface text-ink ${interFont.variable} ${monoFont.variable}`}>
<ThemeProvider initialTheme={theme}>
+82
View File
@@ -0,0 +1,82 @@
import { ImageResponse } from "next/og";
// Marketing-launch SEO (mc#1486). Next.js App-Router file-system OG
// convention: served as `/opengraph-image` and auto-attached as
// `og:image` + `twitter:image`. Dynamic (not a static PNG in /public)
// so we can iterate the brand mark + tagline pre-launch without
// churning a binary blob in git history.
export const runtime = "edge";
export const alt = "Molecule AI — the AI org chart canvas";
export const size = { width: 1200, height: 630 };
export const contentType = "image/png";
export default function OG() {
return new ImageResponse(
(
<div
style={{
width: "100%",
height: "100%",
display: "flex",
flexDirection: "column",
alignItems: "flex-start",
justifyContent: "center",
padding: "80px",
background:
"linear-gradient(135deg, #0a0a0a 0%, #1a1a2e 60%, #16213e 100%)",
color: "#ffffff",
fontFamily: "system-ui, -apple-system, sans-serif",
}}
>
<div
style={{
fontSize: 28,
color: "#a3a3c2",
letterSpacing: "0.18em",
textTransform: "uppercase",
marginBottom: 24,
}}
>
Molecule AI
</div>
<div
style={{
fontSize: 76,
fontWeight: 700,
lineHeight: 1.05,
letterSpacing: "-0.02em",
maxWidth: 980,
}}
>
The AI org chart canvas
</div>
<div
style={{
fontSize: 32,
color: "#c8c8d8",
marginTop: 32,
lineHeight: 1.3,
maxWidth: 980,
}}
>
Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed
multi-agent workspace.
</div>
<div
style={{
position: "absolute",
right: 80,
bottom: 80,
fontSize: 22,
color: "#7a7a96",
display: "flex",
}}
>
moleculesai.app
</div>
</div>
),
{ ...size },
);
}
+6 -4
View File
@@ -103,7 +103,7 @@ export default function Home() {
setHydrationError(null);
window.location.reload();
}}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm"
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Retry
</button>
@@ -115,7 +115,9 @@ export default function Home() {
return (
<>
<Canvas />
<main aria-label="Agent canvas">
<Canvas />
</main>
<Legend />
<CommunicationOverlay />
{hydrationError && (
@@ -134,7 +136,7 @@ export default function Home() {
setHydrationError(null);
window.location.reload();
}}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm"
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Retry
</button>
@@ -176,7 +178,7 @@ brew services start redis`}</pre>
</p>
<button
onClick={() => window.location.reload()}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm mt-2"
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm mt-2 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Reload
</button>
+45
View File
@@ -0,0 +1,45 @@
import type { MetadataRoute } from "next";
// Marketing-launch SEO (mc#1486). Next.js App-Router robots convention:
// this file is served as `/robots.txt` at build time and is the single
// source of truth for crawler allow/disallow.
//
// Contract:
// - Public marketing routes (/, /pricing, /blog/*) are crawlable.
// - Authed/app routes (/orgs, /api/*) are noindex'd. They render
// useful content only after a session round-trip, so a crawler hit
// just wastes our crawl budget and exposes endpoint shapes.
// - Tenant subdomains (<slug>.moleculesai.app) share this build but
// are blocked at the host level by the canvas middleware sending
// an `X-Robots-Tag: noindex` header — robots.txt is per-host and
// this file's `host` field claims the apex as canonical.
//
// Note: `sitemap` is published via the sibling `sitemap.ts` route; we
// reference it explicitly here so crawlers don't have to guess.
const SITE_URL =
process.env.NEXT_PUBLIC_SITE_URL ?? "https://app.moleculesai.app";
export default function robots(): MetadataRoute.Robots {
return {
rules: [
{
userAgent: "*",
allow: ["/", "/pricing", "/blog"],
// Authed app surface + API + transient checkout returns. The
// /orgs route boots the org-selector behind AuthGate; even
// though SSR returns markup, that markup is a login wall when
// hit by an unauthenticated crawler, so indexing it dilutes
// brand searches with a "Please sign in" snippet.
disallow: [
"/orgs",
"/orgs/",
"/api/",
"/cp/",
"/checkout/",
],
},
],
sitemap: `${SITE_URL}/sitemap.xml`,
host: SITE_URL,
};
}
+42
View File
@@ -0,0 +1,42 @@
import type { MetadataRoute } from "next";
// Marketing-launch SEO (mc#1486). App-Router sitemap convention: this
// file is served as `/sitemap.xml` and enumerates the public marketing
// surface for search crawlers + AI training pipelines.
//
// Scope deliberately narrow:
// - Apex landing, pricing, and the (currently single) blog post.
// - Authed app routes are excluded — they're disallowed in robots.ts
// and would appear as "Please sign in" wall to a crawler.
//
// `lastModified` uses a build-time timestamp rather than per-route
// fs.stat so the same value applies regardless of where the build
// runs (Vercel/Railway/local). When we add CMS-backed blog content,
// swap to a per-entry timestamp from the source-of-truth metadata.
const SITE_URL =
process.env.NEXT_PUBLIC_SITE_URL ?? "https://app.moleculesai.app";
const BUILD_DATE = new Date();
export default function sitemap(): MetadataRoute.Sitemap {
return [
{
url: `${SITE_URL}/`,
lastModified: BUILD_DATE,
changeFrequency: "weekly",
priority: 1.0,
},
{
url: `${SITE_URL}/pricing`,
lastModified: BUILD_DATE,
changeFrequency: "weekly",
priority: 0.9,
},
{
url: `${SITE_URL}/blog/2026-04-20-chrome-devtools-mcp`,
lastModified: new Date("2026-04-20"),
changeFrequency: "monthly",
priority: 0.6,
},
];
}
+1 -1
View File
@@ -132,7 +132,7 @@ export function AuditTrailPanel({ workspaceId }: Props) {
if (loading) {
return (
<div className="flex items-center justify-center h-32">
<div role="status" aria-live="polite" className="flex items-center justify-center h-32">
<span className="text-xs text-ink-mid">Loading audit trail</span>
</div>
);
@@ -133,13 +133,13 @@ export function ConversationTraceModal({ open, workspaceId: _workspaceId, onClos
{/* Timeline */}
<div className="flex-1 overflow-y-auto px-5 py-4">
{loading && (
<div className="text-xs text-ink-mid text-center py-8">
<div role="status" aria-live="polite" className="text-xs text-ink-mid text-center py-8">
Loading trace from all workspaces...
</div>
)}
{!loading && entries.length === 0 && (
<div className="text-xs text-ink-mid text-center py-8">
<div role="status" aria-live="polite" className="text-xs text-ink-mid text-center py-8">
No activity found
</div>
)}
+1 -1
View File
@@ -105,7 +105,7 @@ export function EmptyState() {
{/* Template grid */}
{loading ? (
<div className="flex items-center justify-center gap-2 text-xs text-ink-mid py-4">
<div role="status" aria-live="polite" className="flex items-center justify-center gap-2 text-xs text-ink-mid py-4">
<Spinner />
Loading templates...
</div>
+196 -85
View File
@@ -15,7 +15,7 @@
// ($AGENT_URL). They ARE NOT filled in server-side because the
// server doesn't know where the operator's agent will live.
import { useCallback, useState } from "react";
import { useCallback, useRef, useState } from "react";
import * as Dialog from "@radix-ui/react-dialog";
type Tab = "python" | "curl" | "claude" | "mcp" | "hermes" | "codex" | "openclaw" | "kimi" | "fields";
@@ -84,6 +84,33 @@ export function ExternalConnectModal({ info, onClose }: Props) {
: "python";
const [tab, setTab] = useState<Tab>(initialTab);
const [copiedKey, setCopiedKey] = useState<string | null>(null);
const tabRefs = useRef<Map<Tab, HTMLButtonElement | null>>(new Map());
const handleTabKeyDown = useCallback(
(e: React.KeyboardEvent<HTMLButtonElement>, current: Tab, tabs: Tab[]) => {
const idx = tabs.indexOf(current);
if (e.key === "ArrowRight" || e.key === "ArrowDown") {
e.preventDefault();
const next = tabs[(idx + 1) % tabs.length];
setTab(next);
tabRefs.current.get(next)?.focus();
} else if (e.key === "ArrowLeft" || e.key === "ArrowUp") {
e.preventDefault();
const prev = tabs[(idx - 1 + tabs.length) % tabs.length];
setTab(prev);
tabRefs.current.get(prev)?.focus();
} else if (e.key === "Home") {
e.preventDefault();
setTab(tabs[0]);
tabRefs.current.get(tabs[0])?.focus();
} else if (e.key === "End") {
e.preventDefault();
setTab(tabs[tabs.length - 1]);
tabRefs.current.get(tabs[tabs.length - 1])?.focus();
}
},
[],
);
const copy = useCallback(async (value: string, key: string) => {
try {
@@ -160,6 +187,19 @@ export function ExternalConnectModal({ info, onClose }: Props) {
`MOLECULE_WORKSPACE_TOKEN=${info.auth_token}`,
);
// Build the tab list once so both the tab bar and keyboard handler
// share the same ordered array. Computed here (after all filled* vars)
// so TypeScript's block-scoping analysis can reach them.
const tabList: Tab[] = [];
if (filledUniversalMcp) tabList.push("mcp");
tabList.push("python");
if (filledChannel) tabList.push("claude");
if (filledHermes) tabList.push("hermes");
if (filledCodex) tabList.push("codex");
if (filledOpenClaw) tabList.push("openclaw");
if (filledKimi) tabList.push("kimi");
tabList.push("curl", "fields");
return (
<Dialog.Root open onOpenChange={(o) => !o && onClose()}>
<Dialog.Portal>
@@ -180,34 +220,18 @@ export function ExternalConnectModal({ info, onClose }: Props) {
aria-label="Connection snippet format"
className="mt-4 flex gap-1 border-b border-line"
>
{(() => {
// Build the tab order dynamically. Claude Code first
// (when offered) since it's the simplest setup; Python
// SDK second (full register+heartbeat+inbound); Universal
// MCP third (any MCP-aware runtime, outbound-only); curl
// for one-shot register; Fields for raw values.
// Tab order: Universal MCP first (default, runtime-
// agnostic primitives), then runtime-specific channel/
// SDK tabs, then curl + Fields. Each runtime tab only
// appears when the platform supplies the snippet — no
// dead "tab missing snippet" UX.
const tabs: Tab[] = [];
if (filledUniversalMcp) tabs.push("mcp");
tabs.push("python");
if (filledChannel) tabs.push("claude");
if (filledHermes) tabs.push("hermes");
if (filledCodex) tabs.push("codex");
if (filledOpenClaw) tabs.push("openclaw");
if (filledKimi) tabs.push("kimi");
tabs.push("curl", "fields");
return tabs;
})().map((t) => (
{tabList.map((t) => (
<button
key={t}
type="button"
role="tab"
id={`tab-${t}`}
aria-selected={tab === t}
aria-controls={`panel-${t}`}
tabIndex={tab === t ? 0 : -1}
ref={(el) => { tabRefs.current.set(t, el); }}
onClick={() => setTab(t)}
onKeyDown={(e) => handleTabKeyDown(e, t, tabList)}
className={`px-3 py-2 text-sm border-b-2 -mb-px transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1 focus-visible:ring-offset-surface ${
tab === t
? "border-accent text-ink"
@@ -235,18 +259,39 @@ export function ExternalConnectModal({ info, onClose }: Props) {
))}
</div>
{/* Snippet area */}
<div className="mt-3">
{tab === "claude" && filledChannel && (
<SnippetBlock
value={filledChannel}
label="Claude Code channel — polls workspace's A2A; no tunnel needed"
copyKey="claude"
copied={copiedKey === "claude"}
onCopy={() => copy(filledChannel, "claude")}
/>
)}
{tab === "python" && (
{/* Snippet area — all panels always in the DOM so aria-controls
targets are stable. Hidden panels use aria-hidden so screen
readers skip them; active panel uses role=tabpanel with
aria-labelledby pointing to the tab button. */}
<div className="mt-3" data-testid="snippet-panels">
{/* Claude Code tab */}
<div
id="panel-claude"
data-testid="panel-claude"
role="tabpanel"
aria-labelledby="tab-claude"
hidden={tab !== "claude" || !filledChannel}
className={tab === "claude" && filledChannel ? "" : "hidden"}
>
{filledChannel && (
<SnippetBlock
value={filledChannel}
label="Claude Code channel — polls workspace's A2A; no tunnel needed"
copyKey="claude"
copied={copiedKey === "claude"}
onCopy={() => copy(filledChannel, "claude")}
/>
)}
</div>
{/* Python SDK tab */}
<div
id="panel-python"
data-testid="panel-python"
role="tabpanel"
aria-labelledby="tab-python"
hidden={tab !== "python"}
className={tab === "python" ? "" : "hidden"}
>
<SnippetBlock
value={filledPython}
label="Python SDK — includes heartbeat loop (push-mode, needs public URL)"
@@ -254,8 +299,16 @@ export function ExternalConnectModal({ info, onClose }: Props) {
copied={copiedKey === "python"}
onCopy={() => copy(filledPython, "python")}
/>
)}
{tab === "curl" && (
</div>
{/* curl tab */}
<div
id="panel-curl"
data-testid="panel-curl"
role="tabpanel"
aria-labelledby="tab-curl"
hidden={tab !== "curl"}
className={tab === "curl" ? "" : "hidden"}
>
<SnippetBlock
value={filledCurl}
label="curl — one-shot register only (no heartbeat)"
@@ -263,53 +316,111 @@ export function ExternalConnectModal({ info, onClose }: Props) {
copied={copiedKey === "curl"}
onCopy={() => copy(filledCurl, "curl")}
/>
)}
{tab === "mcp" && filledUniversalMcp && (
<SnippetBlock
value={filledUniversalMcp}
label="Universal MCP — standalone register + heartbeat + tools for any MCP-aware runtime (Claude Code, hermes, codex). Pair with Python or Claude Code tab if you need inbound A2A delivery."
copyKey="mcp"
copied={copiedKey === "mcp"}
onCopy={() => copy(filledUniversalMcp, "mcp")}
/>
)}
{tab === "hermes" && filledHermes && (
<SnippetBlock
value={filledHermes}
label="Hermes channel — bridges this workspace's A2A traffic into your hermes-agent session as platform messages (push parity with Claude Code). Long-poll based; no tunnel needed."
copyKey="hermes"
copied={copiedKey === "hermes"}
onCopy={() => copy(filledHermes, "hermes")}
/>
)}
{tab === "codex" && filledCodex && (
<SnippetBlock
value={filledCodex}
label="Codex MCP config — wires the molecule MCP server into ~/.codex/config.toml. Outbound tools today; inbound A2A push needs the Python SDK tab paired in (codex's MCP runtime doesn't route arbitrary notifications/* yet)."
copyKey="codex"
copied={copiedKey === "codex"}
onCopy={() => copy(filledCodex, "codex")}
/>
)}
{tab === "openclaw" && filledOpenClaw && (
<SnippetBlock
value={filledOpenClaw}
label="OpenClaw MCP config — wires the molecule MCP server via openclaw mcp set + starts the gateway on loopback. Outbound tools today; inbound A2A push on an external openclaw needs the Python SDK tab paired in (a sessions.steer bridge daemon is future work)."
copyKey="openclaw"
copied={copiedKey === "openclaw"}
onCopy={() => copy(filledOpenClaw, "openclaw")}
/>
)}
{tab === "kimi" && filledKimi && (
<SnippetBlock
value={filledKimi}
label="Kimi CLI — self-contained Python bridge. Registers, heartbeats, polls for canvas messages, and echoes replies back. NAT-safe (no public URL). Run in a background terminal or via launchd."
copyKey="kimi"
copied={copiedKey === "kimi"}
onCopy={() => copy(filledKimi, "kimi")}
/>
)}
{tab === "fields" && (
</div>
{/* Universal MCP tab */}
<div
id="panel-mcp"
data-testid="panel-mcp"
role="tabpanel"
aria-labelledby="tab-mcp"
hidden={tab !== "mcp" || !filledUniversalMcp}
className={tab === "mcp" && filledUniversalMcp ? "" : "hidden"}
>
{filledUniversalMcp && (
<SnippetBlock
value={filledUniversalMcp}
label="Universal MCP — standalone register + heartbeat + tools for any MCP-aware runtime (Claude Code, hermes, codex). Pair with Python or Claude Code tab if you need inbound A2A delivery."
copyKey="mcp"
copied={copiedKey === "mcp"}
onCopy={() => copy(filledUniversalMcp, "mcp")}
/>
)}
</div>
{/* Hermes tab */}
<div
id="panel-hermes"
data-testid="panel-hermes"
role="tabpanel"
aria-labelledby="tab-hermes"
hidden={tab !== "hermes" || !filledHermes}
className={tab === "hermes" && filledHermes ? "" : "hidden"}
>
{filledHermes && (
<SnippetBlock
value={filledHermes}
label="Hermes channel — bridges this workspace's A2A traffic into your hermes-agent session as platform messages (push parity with Claude Code). Long-poll based; no tunnel needed."
copyKey="hermes"
copied={copiedKey === "hermes"}
onCopy={() => copy(filledHermes, "hermes")}
/>
)}
</div>
{/* Codex tab */}
<div
id="panel-codex"
data-testid="panel-codex"
role="tabpanel"
aria-labelledby="tab-codex"
hidden={tab !== "codex" || !filledCodex}
className={tab === "codex" && filledCodex ? "" : "hidden"}
>
{filledCodex && (
<SnippetBlock
value={filledCodex}
label="Codex MCP config — wires the molecule MCP server into ~/.codex/config.toml. Outbound tools today; inbound A2A push needs the Python SDK tab paired in (codex's MCP runtime doesn't route arbitrary notifications/* yet)."
copyKey="codex"
copied={copiedKey === "codex"}
onCopy={() => copy(filledCodex, "codex")}
/>
)}
</div>
{/* OpenClaw tab */}
<div
id="panel-openclaw"
data-testid="panel-openclaw"
role="tabpanel"
aria-labelledby="tab-openclaw"
hidden={tab !== "openclaw" || !filledOpenClaw}
className={tab === "openclaw" && filledOpenClaw ? "" : "hidden"}
>
{filledOpenClaw && (
<SnippetBlock
value={filledOpenClaw}
label="OpenClaw MCP config — wires the molecule MCP server via openclaw mcp set + starts the gateway on loopback. Outbound tools today; inbound A2A push on an external openclaw needs the Python SDK tab paired in (a sessions.steer bridge daemon is future work)."
copyKey="openclaw"
copied={copiedKey === "openclaw"}
onCopy={() => copy(filledOpenClaw, "openclaw")}
/>
)}
</div>
{/* Kimi tab */}
<div
id="panel-kimi"
data-testid="panel-kimi"
role="tabpanel"
aria-labelledby="tab-kimi"
hidden={tab !== "kimi" || !filledKimi}
className={tab === "kimi" && filledKimi ? "" : "hidden"}
>
{filledKimi && (
<SnippetBlock
value={filledKimi}
label="Kimi CLI — self-contained Python bridge. Registers, heartbeats, polls for canvas messages, and echoes replies back. NAT-safe (no public URL). Run in a background terminal or via launchd."
copyKey="kimi"
copied={copiedKey === "kimi"}
onCopy={() => copy(filledKimi, "kimi")}
/>
)}
</div>
{/* Fields tab */}
<div
id="panel-fields"
data-testid="panel-fields"
role="tabpanel"
aria-labelledby="tab-fields"
hidden={tab !== "fields"}
className={tab === "fields" ? "" : "hidden"}
>
<div className="space-y-2">
<Field label="workspace_id" value={info.workspace_id} onCopy={() => copy(info.workspace_id, "wsid")} copied={copiedKey === "wsid"} />
<Field label="platform_url" value={info.platform_url} onCopy={() => copy(info.platform_url, "url")} copied={copiedKey === "url"} />
@@ -323,7 +434,7 @@ export function ExternalConnectModal({ info, onClose }: Props) {
<Field label="registry_endpoint" value={info.registry_endpoint} onCopy={() => copy(info.registry_endpoint, "reg")} copied={copiedKey === "reg"} />
<Field label="heartbeat_endpoint" value={info.heartbeat_endpoint} onCopy={() => copy(info.heartbeat_endpoint, "hb")} copied={copiedKey === "hb"} />
</div>
)}
</div>
</div>
<div className="mt-5 flex justify-end gap-2">
+4 -2
View File
@@ -440,6 +440,7 @@ function ProviderPickerModal({
onChange={(e) => updateEntry(index, { value: e.target.value.trimStart() })}
placeholder={entry.key.includes("API_KEY") ? "sk-..." : "Enter value"}
type="password"
aria-label={`Value for ${entry.key}`}
ref={index === 0 ? firstInputRef : undefined}
onKeyDown={(e) => {
if (e.key === "Enter" && entry.value.trim()) {
@@ -459,7 +460,7 @@ function ProviderPickerModal({
)}
{entry.error && (
<div className="mt-1.5 text-[10px] text-bad">{entry.error}</div>
<div role="alert" aria-live="assertive" className="mt-1.5 text-[10px] text-bad">{entry.error}</div>
)}
</div>
))}
@@ -694,6 +695,7 @@ function AllKeysModal({
onChange={(e) => updateEntry(index, { value: e.target.value.trimStart() })}
placeholder={entry.key.includes("API_KEY") ? "sk-..." : "Enter value"}
type="password"
aria-label={`Value for ${entry.key}`}
autoFocus={index === 0}
onKeyDown={(e) => {
if (e.key === "Enter" && entry.value.trim()) {
@@ -718,7 +720,7 @@ function AllKeysModal({
))}
{globalError && (
<div className="px-3 py-2 bg-red-950/40 border border-red-800/50 rounded-lg text-[11px] text-bad">
<div role="alert" aria-live="assertive" className="px-3 py-2 bg-red-950/40 border border-red-800/50 rounded-lg text-[11px] text-bad">
{globalError}
</div>
)}
+1 -1
View File
@@ -71,7 +71,7 @@ export function WorkspaceUsage({ workspaceId }: WorkspaceUsageProps) {
<SkeletonRow />
</>
) : error ? (
<p className="text-xs text-bad" data-testid="usage-error">
<p role="alert" aria-live="assertive" className="text-xs text-bad" data-testid="usage-error">
{error}
</p>
) : metrics ? (
@@ -131,7 +131,9 @@ describe("ExternalConnectModal — tab switching", () => {
it("switches to the Python SDK tab and shows the snippet with stamped token", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /python sdk/i }));
const preEl = document.querySelector("pre");
// Query within the python panel so we get the right pre (not the first in DOM).
const pythonPanel = document.querySelector("[data-testid='panel-python']");
const preEl = pythonPanel?.querySelector("pre");
expect(preEl?.textContent).toContain("AUTH_TOKEN");
// The placeholder is replaced with the real auth token
expect(preEl?.textContent).toContain("secret-auth-token-abc");
@@ -140,7 +142,9 @@ describe("ExternalConnectModal — tab switching", () => {
it("switches to the curl tab and shows the snippet with stamped token", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /curl/i }));
const preEl = document.querySelector("pre");
// Query within the curl panel so we get the right pre (not the first in DOM).
const curlPanel = document.querySelector("[data-testid='panel-curl']");
const preEl = curlPanel?.querySelector("pre");
expect(preEl?.textContent).toContain("curl");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
@@ -148,9 +152,11 @@ describe("ExternalConnectModal — tab switching", () => {
it("switches to the Fields tab and shows raw values", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /fields/i }));
expect(screen.getByText("ws-123")).toBeTruthy();
expect(screen.getByText("https://app.example.com")).toBeTruthy();
expect(screen.getByText("secret-auth-token-abc")).toBeTruthy();
// Query within the fields panel for specific values.
const fieldsPanel = document.querySelector("[data-testid='panel-fields']");
expect(fieldsPanel?.textContent).toContain("ws-123");
expect(fieldsPanel?.textContent).toContain("https://app.example.com");
expect(fieldsPanel?.textContent).toContain("secret-auth-token-abc");
});
it("hides the Hermes tab when hermes_channel_snippet is absent", () => {
@@ -168,7 +174,8 @@ describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the Python snippet instead of the placeholder", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /python sdk/i }));
const preEl = document.querySelector("pre");
const pythonPanel = document.querySelector("[data-testid='panel-python']");
const preEl = pythonPanel?.querySelector("pre");
expect(preEl?.textContent).not.toContain("<paste from create response>");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
@@ -176,7 +183,8 @@ describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the curl snippet", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /curl/i }));
const preEl = document.querySelector("pre");
const curlPanel = document.querySelector("[data-testid='panel-curl']");
const preEl = curlPanel?.querySelector("pre");
// curl template uses WORKSPACE_AUTH_TOKEN placeholder, not the generic one
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
@@ -184,7 +192,8 @@ describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the Universal MCP snippet", () => {
renderAndFlush(defaultInfo);
// Default tab is Universal MCP
const preEl = document.querySelector("pre");
const mcpPanel = document.querySelector("[data-testid='panel-mcp']");
const preEl = mcpPanel?.querySelector("pre");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
expect(preEl?.textContent).not.toContain("<paste from create response>");
});
@@ -193,8 +202,10 @@ describe("ExternalConnectModal — snippet token stamping", () => {
describe("ExternalConnectModal — copy functionality", () => {
it("calls navigator.clipboard.writeText with the snippet text", () => {
renderAndFlush(defaultInfo);
// Default tab is Universal MCP
fireEvent.click(screen.getByRole("button", { name: /^copy$/i }));
// Default tab is Universal MCP — query the copy button within the mcp panel.
const mcpPanel = document.querySelector("[data-testid='panel-mcp']");
const copyBtn = mcpPanel?.querySelector("button");
if (copyBtn) fireEvent.click(copyBtn);
expect(clipboardWriteText).toHaveBeenCalledWith(
expect.stringContaining("secret-auth-token-abc"),
);
@@ -227,7 +238,8 @@ describe("ExternalConnectModal — missing optional fields", () => {
};
renderAndFlush(minimalInfo);
fireEvent.click(screen.getByRole("tab", { name: /fields/i }));
expect(screen.getByText("(missing)")).toBeTruthy();
const fieldsPanel = document.querySelector("[data-testid='panel-fields']");
expect(fieldsPanel?.textContent).toContain("(missing)");
});
it("hides the Hermes tab when hermes_channel_snippet is absent", () => {
@@ -223,6 +223,7 @@ export function MobileCanvas({
textTransform: "uppercase",
fontWeight: 600,
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
Reset
</button>
+18 -4
View File
@@ -356,6 +356,7 @@ export function MobileChat({
type="button"
onClick={onBack}
aria-label="Back"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 36,
height: 36,
@@ -402,6 +403,7 @@ export function MobileChat({
<button
type="button"
aria-label="More"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 36,
height: 36,
@@ -432,6 +434,7 @@ export function MobileChat({
key={t.id}
type="button"
onClick={() => setTab(t.id)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
padding: "4px 0 8px",
border: "none",
@@ -475,7 +478,7 @@ export function MobileChat({
}}
>
{tab === "my" && historyLoading && (
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div role="status" aria-live="polite" style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Loading chat history
</div>
)}
@@ -495,6 +498,8 @@ export function MobileChat({
onClick={() => {
loadInitial();
}}
aria-label="Retry loading chat history"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-red-400"
style={{
padding: "6px 14px",
borderRadius: 14,
@@ -510,7 +515,7 @@ export function MobileChat({
</div>
)}
{tab === "my" && !historyLoading && !historyError && messages.length === 0 && (
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div role="status" aria-live="polite" style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Send a message to start chatting.
</div>
)}
@@ -664,6 +669,7 @@ export function MobileChat({
type="button"
onClick={() => removePendingFile(i)}
aria-label={`Remove ${f.name}`}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
border: "none",
background: "transparent",
@@ -704,6 +710,7 @@ export function MobileChat({
onClick={() => fileInputRef.current?.click()}
disabled={!reachable || sending || uploading}
aria-label="Attach"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 32,
height: 32,
@@ -725,6 +732,7 @@ export function MobileChat({
ref={composerRef}
value={draft}
onChange={(e) => setDraft(e.target.value)}
aria-label="Message"
onKeyDown={(e) => {
// Enter sends; Shift+Enter inserts a newline. Skip when the
// IME is composing — pressing Enter to commit a Chinese/
@@ -748,7 +756,12 @@ export function MobileChat({
border: "none",
outline: "none",
background: "transparent",
fontSize: 14.5,
// iOS Safari/PWA zooms the viewport when a focused textarea
// has a computed font-size below 16px. 14.5 triggers that
// focus-zoom; the page looks broken until the user pinches
// back (#224, same class as desktop #1434 / sibling #225).
// 16px is the minimum that keeps focus from zooming.
fontSize: 16,
lineHeight: 1.4,
color: p.text,
padding: "6px 0",
@@ -764,12 +777,13 @@ export function MobileChat({
onClick={send}
disabled={(!draft.trim() && pendingFiles.length === 0) || !reachable || sending || uploading}
aria-label="Send"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 36,
height: 36,
borderRadius: 999,
border: "none",
cursor: (draft.trim() || pendingFiles.length > 0) && !sending && !uploading ? "pointer" : "not-allowed",
cursor: (draft.trim() || pendingFiles.length === 0) && !sending && !uploading ? "pointer" : "not-allowed",
flexShrink: 0,
background:
(draft.trim() || pendingFiles.length > 0) && reachable && !sending && !uploading
+3 -2
View File
@@ -231,6 +231,7 @@ export function MobileComms({ dark }: { dark: boolean }) {
fontSize: 13,
fontWeight: 500,
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
{o.label}
<span
@@ -251,11 +252,11 @@ export function MobileComms({ dark }: { dark: boolean }) {
<div style={{ padding: "0 14px", display: "flex", flexDirection: "column", gap: 8 }}>
{loading && items.length === 0 ? (
<div style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div role="status" aria-live="polite" style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Loading recent comms
</div>
) : filtered.length === 0 ? (
<div style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div role="status" aria-live="polite" style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
No A2A traffic yet.
</div>
) : (
@@ -83,11 +83,12 @@ export function MobileDetail({
type="button"
onClick={onBack}
aria-label="Back"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={iconButtonStyle(p, dark)}
>
{Icons.back({ size: 18 })}
</button>
<button type="button" aria-label="More" style={iconButtonStyle(p, dark)}>
<button type="button" aria-label="More" className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900" style={iconButtonStyle(p, dark)}>
{Icons.more({ size: 18 })}
</button>
</div>
@@ -183,6 +184,7 @@ export function MobileDetail({
key={t.id}
type="button"
onClick={() => setTab(t.id)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
padding: "8px 14px",
borderRadius: 999,
@@ -215,6 +217,7 @@ export function MobileDetail({
type="button"
onClick={onChat}
data-testid="mobile-chat-cta"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: "100%",
height: 52,
@@ -416,6 +419,8 @@ function DetailActivity({ workspaceId, dark }: { workspaceId: string; dark: bool
if (items === null) {
return (
<div
role="status"
aria-live="polite"
style={{
background: p.surface,
borderRadius: 16,
@@ -200,6 +200,7 @@ export function MobileHome({
justifyContent: "center",
boxShadow: "0 8px 24px rgba(40,30,20,0.25), 0 2px 6px rgba(40,30,20,0.15)",
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
{Icons.plus({ size: 22 })}
</button>
@@ -92,6 +92,7 @@ export function MobileMe({
border: on ? `2px solid ${p.text}` : "2px solid transparent",
boxShadow: on ? `0 0 0 2px ${p.bg} inset` : "none",
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
/>
);
})}
@@ -184,6 +185,7 @@ function SegmentedRow({
fontSize: 13,
fontWeight: 600,
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
{o.label}
</button>
+16 -1
View File
@@ -148,6 +148,7 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
type="button"
onClick={onClose}
aria-label="Close"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 32,
height: 32,
@@ -170,6 +171,8 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
<div style={{ padding: "0 14px" }}>
{loadingTemplates ? (
<div
role="status"
aria-live="polite"
style={{
padding: "24px 8px",
textAlign: "center",
@@ -214,6 +217,8 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
setTplId(t.id);
setTier(tCode);
}}
aria-label={`Select template: ${t.name} (tier ${t.tier})`}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
background: on
? dark
@@ -302,6 +307,7 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
<input
value={name}
onChange={(e) => setName(e.target.value)}
aria-label="Agent name"
placeholder={tplId
? (templates.find((t) => t.id === tplId)?.name ?? "agent-name")
: "agent-name"}
@@ -312,7 +318,12 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
border: `0.5px solid ${p.border}`,
borderRadius: 12,
fontFamily: MOBILE_FONT_MONO,
fontSize: 13.5,
// iOS Safari/PWA zooms the viewport when a focused input has
// a computed font-size below 16px; the layout jumps and the
// page looks broken until the user pinches back (#224 / #225,
// same class as desktop #1434). 16px is the minimum that
// suppresses that focus-zoom.
fontSize: 16,
color: p.text,
outline: "none",
boxSizing: "border-box",
@@ -330,6 +341,8 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
key={t}
type="button"
onClick={() => setTier(t)}
aria-label={`Select tier ${t}: ${TIER_LABEL[t]}`}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
flex: 1,
padding: "10px 8px",
@@ -377,6 +390,8 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
type="button"
onClick={handleSpawn}
disabled={busy || !tplId || templates.length === 0}
aria-label="Spawn agent"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: "100%",
height: 52,
@@ -263,6 +263,20 @@ describe("MobileChat — composer", () => {
const sendBtn = container.querySelector('[aria-label="Send"]') as HTMLButtonElement;
expect(sendBtn.disabled).toBe(true);
});
// Regression #224: the composer textarea must render with font-size
// ≥ 16px. iOS Safari and PWAs auto-zoom the viewport when a focused
// input has a computed font-size below 16px — the layout jumps and
// the page looks broken until the user pinches back. Same class as
// desktop #1434 / sibling MobileSpawn #225.
it("composer textarea renders at font-size 16px or greater (iOS focus-zoom regression #224)", () => {
const { container } = renderChat(mockAgentId);
const textarea = container.querySelector("textarea") as HTMLTextAreaElement;
expect(textarea).toBeTruthy();
const fs = Number.parseFloat(textarea.style.fontSize);
expect(Number.isFinite(fs)).toBe(true);
expect(fs).toBeGreaterThanOrEqual(16);
});
});
// ─── Tabs ─────────────────────────────────────────────────────────────────────
@@ -93,6 +93,24 @@ describe("MobileSpawn — render", () => {
expect(input).toBeTruthy();
});
// Regression #224 / #225: the agent-name input must render with a
// font-size ≥ 16px. iOS Safari and PWAs auto-zoom the viewport when a
// focused input has a computed font-size below 16px — the layout
// jumps and the page looks broken until the user pinches back.
it("renders the name input at font-size 16px or greater (iOS focus-zoom regression)", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const input = document.querySelector(
'input[aria-label="Agent name"]',
) as HTMLInputElement | null;
expect(input).toBeTruthy();
// Parse the inline style font-size — jsdom doesn't run a layout
// engine, so getComputedStyle reports the inline value verbatim.
const fs = Number.parseFloat(input!.style.fontSize);
expect(Number.isFinite(fs)).toBe(true);
expect(fs).toBeGreaterThanOrEqual(16);
});
it("renders all 4 tier buttons", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
@@ -133,6 +133,7 @@ export function TabBar({
aria-label={t.label}
onClick={() => onChange(t.id)}
onKeyDown={(e) => handleKeyDown(e, idx)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
background: "none",
border: "none",
@@ -291,6 +292,7 @@ export function AgentCard({
data-testid="workspace-card"
aria-label={`${agent.name}, status: ${agent.status}, tier ${agent.tier}${agent.remote ? ", remote" : ""}`}
onClick={onClick}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
display: "block",
width: "100%",
@@ -444,6 +446,7 @@ export function FilterChips({
type="button"
aria-checked={on}
onClick={() => onChange(o.id)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
display: "inline-flex",
alignItems: "center",
@@ -160,14 +160,14 @@ export function OrgTokensTab() {
</code>
<button
onClick={handleCopy}
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors"
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
{copied ? 'Copied' : 'Copy'}
</button>
</div>
<button
onClick={() => setNewToken(null)}
className="text-[9px] text-good/60 hover:text-good transition-colors"
className="text-[9px] text-good/60 hover:text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Dismiss
</button>
@@ -219,7 +219,7 @@ export function OrgTokensTab() {
</div>
<button
onClick={() => setRevokeTarget(t)}
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1 shrink-0"
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1 shrink-0 focus:outline-none focus-visible:ring-2 focus-visible:ring-red-400 focus-visible:ring-offset-1"
>
Revoke
</button>
+3 -3
View File
@@ -140,14 +140,14 @@ function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
</code>
<button
onClick={handleCopy}
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors"
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
{copied ? 'Copied' : 'Copy'}
</button>
</div>
<button
onClick={() => setNewToken(null)}
className="text-[9px] text-good/60 hover:text-good transition-colors"
className="text-[9px] text-good/60 hover:text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Dismiss
</button>
@@ -192,7 +192,7 @@ function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
</div>
<button
onClick={() => setRevokeTarget(t)}
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1"
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1 focus:outline-none focus-visible:ring-2 focus-visible:ring-red-400 focus-visible:ring-offset-1"
>
Revoke
</button>
+1 -1
View File
@@ -185,7 +185,7 @@ export function ActivityTab({ workspaceId }: Props) {
{/* Activity list */}
<div className="flex-1 overflow-y-auto p-3 space-y-1.5">
{loading && activities.length === 0 && (
<div className="text-xs text-ink-mid text-center py-8">Loading activity...</div>
<div role="status" aria-live="polite" className="text-xs text-ink-mid text-center py-8">Loading activity...</div>
)}
{error && (
+1 -1
View File
@@ -262,7 +262,7 @@ export function ChannelsTab({ workspaceId }: Props) {
</div>
{error && (
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{error}
</div>
)}
+14 -16
View File
@@ -10,6 +10,7 @@ import { downloadChatFile, isPlatformAttachment } from "./chat/uploads";
import { PendingAttachmentPill } from "./chat/AttachmentViews";
import { AttachmentPreview } from "./chat/AttachmentPreview";
import { AgentCommsPanel } from "./chat/AgentCommsPanel";
import { ChatErrorBanner } from "./chat/ChatErrorBanner";
import { appendActivityLine } from "./chat/activityLog";
import { runtimeDisplayName } from "@/lib/runtime-names";
import { ConfirmDialog } from "@/components/ConfirmDialog";
@@ -592,22 +593,19 @@ function MyChatPanel({ workspaceId, data }: Props) {
<div ref={bottomRef} />
</div>
{/* Error banner */}
{displayError && (
<div className="px-3 py-2 bg-red-900/20 border-t border-red-800/30">
<div className="flex items-center justify-between">
<span className="text-[10px] text-red-300">{displayError}</span>
{!isOnline && (
<button
onClick={() => setConfirmRestart(true)}
className="text-[11px] px-2 py-0.5 bg-red-800 text-red-200 rounded hover:bg-red-700"
>
Restart
</button>
)}
</div>
</div>
)}
{/* Error banner — internal#212: surfaces the secret-safe
actionable failure reason that ws-server places on
ACTIVITY_LOGGED.error_detail (propagated via
useChatSocket → onSendError → setError) and offers a
"View activity log" affordance that navigates the user to
the Activity tab where the full row lives. The previous
inline JSX hardcoded "see workspace logs for details" with
no link — there is no separate Logs tab. */}
<ChatErrorBanner
message={displayError}
isOnline={isOnline}
onRestart={() => setConfirmRestart(true)}
/>
{/* Input */}
<div className="p-3 border-t border-line">
+129 -2
View File
@@ -81,7 +81,7 @@ function AgentCardSection({ workspaceId }: { workspaceId: string }) {
spellCheck={false} rows={12}
className="w-full bg-surface-card border border-line rounded p-2 text-[10px] font-mono text-ink focus:outline-none focus:border-accent resize-none"
/>
{error && <div className="px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">{error}</div>}
{error && <div role="alert" aria-live="assertive" className="px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">{error}</div>}
<div className="flex gap-2">
<button type="button" onClick={handleSave} disabled={saving}
className="px-2 py-1 bg-accent hover:bg-accent-strong text-[10px] rounded text-white disabled:opacity-50 transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1 focus-visible:ring-offset-surface">
@@ -109,6 +109,130 @@ function AgentCardSection({ workspaceId }: { workspaceId: string }) {
);
}
// --- Agent Abilities Section ---
//
// Always-visible on/off controls for the two workspace-level ability flags
// (broadcast_enabled, talk_to_user_enabled). Both are mutated through the
// same admin endpoint the ChatTab recovery banner already uses
// (PATCH /workspaces/:id/abilities) and reflected into the canvas store node
// data (broadcastEnabled / talkToUserEnabled) so every surface that reads
// useCanvasStore.nodes stays consistent without a full re-hydrate.
//
// Before this section there was NO canvas control for either flag: the
// backend was fully wired (workspace_abilities.go / workspace_broadcast.go /
// agent_message_writer.go, see commit 29b4bffb + internal#510/#511) but the
// only frontend affordance was the ChatTab recovery banner, which renders
// solely when talk_to_user_enabled===false and so is invisible under the
// TRUE default and never existed at all for broadcast.
function AgentAbilitiesSection({ workspaceId }: { workspaceId: string }) {
// Read the live ability flags off the canvas store node — the platform
// event stream hydrates these (canvas-topology.ts maps the workspace row's
// broadcast_enabled/talk_to_user_enabled onto node data), so this stays in
// sync with the recovery banner and avoids a duplicate GET. Mirrors the
// store-read pattern used by AgentCardSection above.
const node = useCanvasStore((s) =>
s.nodes?.find?.((n) => n.id === workspaceId),
);
// Defaults match the backend column defaults + canvas-topology mapping:
// broadcast_enabled defaults FALSE, talk_to_user_enabled defaults TRUE.
const broadcastEnabled = node?.data.broadcastEnabled ?? false;
const talkToUserEnabled = node?.data.talkToUserEnabled ?? true;
// Track an in-flight PATCH per field so a double-click can't fire two
// racing writes, and surface a one-line error if the server rejects.
const [pending, setPending] = useState<null | "broadcast" | "talk">(null);
const [error, setError] = useState<string | null>(null);
const patchAbility = async (
which: "broadcast" | "talk",
body: { broadcast_enabled: boolean } | { talk_to_user_enabled: boolean },
optimistic: Partial<{ broadcastEnabled: boolean; talkToUserEnabled: boolean }>,
) => {
setError(null);
setPending(which);
// Optimistic store update — the toggle flips immediately; on failure we
// roll back to the server-truth value the store last held.
const prev = {
broadcastEnabled,
talkToUserEnabled,
};
useCanvasStore.getState().updateNodeData(workspaceId, optimistic);
try {
await api.patch(`/workspaces/${workspaceId}/abilities`, body);
} catch (e) {
// Roll back the optimistic change to last-known server truth.
useCanvasStore.getState().updateNodeData(workspaceId, {
broadcastEnabled: prev.broadcastEnabled,
talkToUserEnabled: prev.talkToUserEnabled,
});
setError(
e instanceof Error ? e.message : "Failed to update ability — try again",
);
} finally {
setPending(null);
}
};
return (
<Section title="Agent Abilities">
<p className="text-[10px] text-ink-mid px-1 pb-1">
Workspace-level permissions for this agent. Changes apply immediately
(no restart required).
</p>
<div className="space-y-2">
<div>
<Toggle
label="Talk to user"
checked={talkToUserEnabled}
onChange={(v) =>
pending
? undefined
: patchAbility(
"talk",
{ talk_to_user_enabled: v },
{ talkToUserEnabled: v },
)
}
/>
<p className="text-[10px] text-ink-mid mt-0.5 ml-6">
When off, the agent&apos;s <code className="font-mono">send_message_to_user</code>{" "}
and <code className="font-mono">POST /notify</code> calls are
rejected (403) it must route updates through a parent workspace.
</p>
</div>
<div>
<Toggle
label="Broadcast to peers"
checked={broadcastEnabled}
onChange={(v) =>
pending
? undefined
: patchAbility(
"broadcast",
{ broadcast_enabled: v },
{ broadcastEnabled: v },
)
}
/>
<p className="text-[10px] text-ink-mid mt-0.5 ml-6">
When on, the agent may <code className="font-mono">POST /broadcast</code>{" "}
to message all non-removed agent workspaces in the org. Off by
default only privileged orchestrators should hold this.
</p>
</div>
</div>
{pending && (
<div className="mt-2 text-[10px] text-ink-mid">Saving</div>
)}
{error && (
<div role="alert" aria-live="assertive" className="mt-2 px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">
{error}
</div>
)}
</Section>
);
}
// --- Main ConfigTab ---
interface ModelSpec {
@@ -795,6 +919,7 @@ export function ConfigTab({ workspaceId }: Props) {
<label className="text-[10px] text-ink-mid block mb-1">Model</label>
<input
type="text"
aria-label="Model"
value={currentModelId}
onChange={(e) => {
const v = e.target.value;
@@ -885,6 +1010,8 @@ export function ConfigTab({ workspaceId }: Props) {
)}
</Section>
<AgentAbilitiesSection workspaceId={workspaceId} />
{/* Claude Settings — shown for claude-code runtime or claude/anthropic model names */}
{(config.runtime === "claude-code" ||
(config.runtime_config?.model || config.model || "").toLowerCase().includes("claude") ||
@@ -995,7 +1122,7 @@ export function ConfigTab({ workspaceId }: Props) {
)}
{error && (
<div className="mx-3 mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">{error}</div>
<div role="alert" aria-live="assertive" className="mx-3 mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">{error}</div>
)}
{!error && RUNTIMES_WITH_OWN_CONFIG.has(config.runtime || "") && (
<div className="mx-3 mb-2 px-3 py-1.5 bg-surface-sunken/50 border border-line rounded text-xs text-ink-mid">
+3 -3
View File
@@ -157,7 +157,7 @@ export function DetailsTab({ workspaceId, data }: Props) {
</select>
</Field>
{saveError && (
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{saveError}
</div>
)}
@@ -203,7 +203,7 @@ export function DetailsTab({ workspaceId, data }: Props) {
{isRestartable && (
<div className="pt-2">
{restartError && (
<div className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div role="alert" aria-live="assertive" className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{restartError}
</div>
)}
@@ -307,7 +307,7 @@ export function DetailsTab({ workspaceId, data }: Props) {
{/* Delete */}
<Section title="Danger Zone">
{deleteError && (
<div className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div role="alert" aria-live="assertive" className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{deleteError}
</div>
)}
+1 -1
View File
@@ -82,7 +82,7 @@ export function EventsTab({ workspaceId }: Props) {
</div>
{error && (
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{error}
</div>
)}
@@ -102,7 +102,7 @@ export function ExternalConnectionSection({ workspaceId }: Props) {
</div>
{error && (
<div className="mt-2 px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">
<div role="alert" aria-live="assertive" className="mt-2 px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">
{error}
</div>
)}
+2 -2
View File
@@ -266,7 +266,7 @@ function PlatformOwnedFilesTab({
// immediately. Delete-All hovers DARKER (bg-red-700) — same AA
// contrast trap that bit ConfirmDialog/ApprovalBanner. Cancel
// lifts to surface-elevated instead of the prior no-op hover.
<div role="alertdialog" aria-labelledby="files-delete-all-msg" className="mx-3 mt-2 px-3 py-2 bg-red-950/30 border border-red-800/40 rounded space-y-1.5">
<div role="alertdialog" aria-modal="false" aria-labelledby="files-delete-all-msg" className="mx-3 mt-2 px-3 py-2 bg-red-950/30 border border-red-800/40 rounded space-y-1.5">
<p id="files-delete-all-msg" className="text-xs text-bad">Delete all {files.filter((f) => !f.dir).length} files? This cannot be undone.</p>
<div className="flex gap-2">
<button type="button" onClick={() => { handleDeleteAll(); setShowDeleteAll(false); }} className="px-2 py-0.5 bg-red-700 hover:bg-red-600 text-[10px] rounded text-white transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-red-500/60 focus-visible:ring-offset-1 focus-visible:ring-offset-surface">Delete All</button>
@@ -280,7 +280,7 @@ function PlatformOwnedFilesTab({
)}
{confirmDelete && (
<div role="alertdialog" aria-labelledby="files-delete-one-msg" className="mx-3 mt-2 px-3 py-2 bg-amber-950/30 border border-amber-800/40 rounded space-y-1.5">
<div role="alertdialog" aria-modal="false" aria-labelledby="files-delete-one-msg" className="mx-3 mt-2 px-3 py-2 bg-amber-950/30 border border-amber-800/40 rounded space-y-1.5">
<p id="files-delete-one-msg" className="text-xs text-warm">Delete <span className="font-mono">{confirmDelete}</span>{files.find((f) => f.path === confirmDelete && f.dir) ? " and all its contents" : ""}?</p>
<div className="flex gap-2">
<button type="button" onClick={confirmDeleteFile} className="px-2 py-0.5 bg-red-700 hover:bg-red-600 text-[10px] rounded text-white transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-red-500/60 focus-visible:ring-offset-1 focus-visible:ring-offset-surface">Delete</button>
+1 -1
View File
@@ -275,7 +275,7 @@ export function ScheduleTab({ workspaceId }: Props) {
Enabled
</label>
</div>
{error && <div className="text-[10px] text-bad">{error}</div>}
{error && <div role="alert" aria-live="assertive" className="text-[10px] text-bad">{error}</div>}
<div className="flex gap-2">
<button
type="button"
+1 -1
View File
@@ -67,7 +67,7 @@ export function TracesTab({ workspaceId }: Props) {
</div>
{error && (
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{error}
</div>
)}
@@ -0,0 +1,99 @@
// @vitest-environment jsdom
//
// Pins internal#212 — the chat error banner must:
//
// 1. Render the secret-safe failure reason (e.g. the provider's own
// "403 oauth_org_not_allowed: ..." string), NOT the opaque
// hardcoded "Agent error (Exception) — see workspace logs for
// details." that points at a workspace-logs tab that doesn't
// exist.
//
// 2. Offer a working "View activity log" affordance that navigates
// the user to the Activity tab where the full row lives.
//
// Tested at the banner-component seam (ChatErrorBanner). The
// hook-level path is pinned separately by
// chat/hooks/__tests__/useChatSocket.test.tsx — together they cover
// wire-payload → callback → render without each test needing to drive
// the full ChatTab send-state machinery.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, fireEvent } from "@testing-library/react";
afterEach(cleanup);
const mocks = vi.hoisted(() => ({
setPanelTabMock: vi.fn(),
}));
vi.mock("@/store/canvas", () => {
const state = {
setPanelTab: mocks.setPanelTabMock,
panelTab: "chat",
};
const hook = (selector?: (s: typeof state) => unknown) =>
selector ? selector(state) : state;
hook.getState = () => state;
return { useCanvasStore: hook };
});
beforeEach(() => {
mocks.setPanelTabMock.mockClear();
});
import { ChatErrorBanner } from "../chat/ChatErrorBanner";
describe("ChatErrorBanner — surfaces actionable reason (internal#212)", () => {
it("renders the secret-safe failure reason verbatim, not a hardcoded opaque message", () => {
const reason =
"Anthropic 403 oauth_org_not_allowed: Your organization has disabled Claude subscription access for Claude Code — use an Anthropic API key or ask your admin to enable access.";
render(<ChatErrorBanner message={reason} isOnline={true} onRestart={() => {}} />);
expect(screen.getByText(/oauth_org_not_allowed/i)).toBeDefined();
expect(screen.getByText(/disabled Claude subscription access/i)).toBeDefined();
// The legacy boilerplate must NOT leak through when a real reason
// is provided.
expect(screen.queryByText(/see workspace logs for details/i)).toBeNull();
});
it("falls back to the message when it IS the legacy boilerplate (older ws-server)", () => {
// Graceful degradation: an older ws-server passes through the
// hardcoded text; the banner still renders SOMETHING — never
// silently swallow.
render(
<ChatErrorBanner
message="Agent error (Exception) — see workspace logs for details."
isOnline={true}
onRestart={() => {}}
/>,
);
expect(
screen.getByText(/Agent error \(Exception\) — see workspace logs for details\./),
).toBeDefined();
});
it("offers a 'View activity log' button that calls setPanelTab('activity')", () => {
render(
<ChatErrorBanner message="kimi 401 invalid_api_key" isOnline={true} onRestart={() => {}} />,
);
const btn = screen.getByRole("button", { name: /view activity log/i });
fireEvent.click(btn);
expect(mocks.setPanelTabMock).toHaveBeenCalledWith("activity");
});
it("still shows the Restart button when offline (existing behavior preserved)", () => {
const onRestart = vi.fn();
render(
<ChatErrorBanner message="Agent is offline" isOnline={false} onRestart={onRestart} />,
);
const btn = screen.getByRole("button", { name: /^restart$/i });
fireEvent.click(btn);
expect(onRestart).toHaveBeenCalledTimes(1);
});
it("renders nothing when message is null", () => {
const { container } = render(
<ChatErrorBanner message={null} isOnline={true} onRestart={() => {}} />,
);
expect(container.textContent).toBe("");
});
});
@@ -0,0 +1,165 @@
// @vitest-environment jsdom
//
// Tests for the always-visible "Agent Abilities" section added to ConfigTab
// (internal#510 broadcast_enabled, internal#511 talk_to_user_enabled; backend
// wired in commit 29b4bffb).
//
// Problem this pins: the two workspace ability flags had complete wired
// backends but NO canvas control — broadcast had none at all, talk-to-user
// only surfaced as a ChatTab recovery banner that is invisible under its
// TRUE default. The CTO could not see or toggle either from canvas.
//
// What this suite pins:
// 1. An "Agent Abilities" section renders (always visible, not gated).
// 2. Both toggles render and reflect the store node's ability fields,
// including the asymmetric defaults (broadcast FALSE, talk TRUE).
// 3. Toggling a switch calls PATCH /workspaces/:id/abilities with the
// correct snake_case body and optimistically updates the store.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPatch = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
patch: (path: string, body?: unknown) => apiPatch(path, body),
put: vi.fn(),
post: vi.fn(),
del: vi.fn(),
},
}));
// Store node carries the ability flags hydrated by the platform stream
// (canvas-topology.ts maps broadcast_enabled/talk_to_user_enabled onto
// node.data). Mirror that shape so the section reads real values.
const storeUpdateNodeData = vi.fn();
const storeRestartWorkspace = vi.fn();
let nodeData: { broadcastEnabled?: boolean; talkToUserEnabled?: boolean } = {};
const makeState = () => ({
nodes: [{ id: "ws-test", data: nodeData }],
restartWorkspace: storeRestartWorkspace,
updateNodeData: storeUpdateNodeData,
});
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
(selector: (s: unknown) => unknown) => selector(makeState()),
{ getState: () => makeState() },
),
}));
vi.mock("../AgentCardSection", () => ({
AgentCardSection: () => <div data-testid="agent-card-stub" />,
}));
import { ConfigTab } from "../ConfigTab";
beforeEach(() => {
apiGet.mockReset();
apiPatch.mockReset();
apiPatch.mockResolvedValue({ status: "updated" });
storeUpdateNodeData.mockReset();
apiGet.mockImplementation((path: string) => {
if (path === `/workspaces/ws-test`) {
return Promise.resolve({ runtime: "claude-code" });
}
if (path === `/workspaces/ws-test/model`) {
return Promise.resolve({ model: "claude-opus-4-7" });
}
if (path === `/workspaces/ws-test/provider`) {
return Promise.resolve({ provider: "anthropic-oauth", source: "default" });
}
if (path === `/workspaces/ws-test/files/config.yaml`) {
return Promise.resolve({ content: "name: test\nruntime: claude-code\n" });
}
if (path === "/templates") {
return Promise.resolve([
{ id: "claude-code", name: "Claude Code", runtime: "claude-code", providers: [] },
]);
}
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
});
describe("ConfigTab Agent Abilities section", () => {
it("renders an always-visible 'Agent Abilities' section with both toggles", async () => {
nodeData = {}; // unset → defaults
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
expect(
await screen.findByRole("button", { name: /Agent Abilities/i }),
).toBeTruthy();
expect(screen.getByText("Talk to user")).toBeTruthy();
expect(screen.getByText("Broadcast to peers")).toBeTruthy();
});
it("reflects the asymmetric defaults: talk-to-user ON, broadcast OFF", async () => {
nodeData = {}; // unset → backend defaults
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const talk = (await screen.findByText("Talk to user"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
const broadcast = screen
.getByText("Broadcast to peers")
.closest("label")!
.querySelector("input") as HTMLInputElement;
expect(talk.checked).toBe(true);
expect(broadcast.checked).toBe(false);
});
it("reflects explicit store values", async () => {
nodeData = { broadcastEnabled: true, talkToUserEnabled: false };
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const talk = (await screen.findByText("Talk to user"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
const broadcast = screen
.getByText("Broadcast to peers")
.closest("label")!
.querySelector("input") as HTMLInputElement;
expect(talk.checked).toBe(false);
expect(broadcast.checked).toBe(true);
});
it("PATCHes /abilities with talk_to_user_enabled and optimistically updates the store", async () => {
nodeData = {}; // talk defaults true
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const talk = (await screen.findByText("Talk to user"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
fireEvent.click(talk); // true → false
await waitFor(() =>
expect(apiPatch).toHaveBeenCalledWith("/workspaces/ws-test/abilities", {
talk_to_user_enabled: false,
}),
);
expect(storeUpdateNodeData).toHaveBeenCalledWith("ws-test", {
talkToUserEnabled: false,
});
});
it("PATCHes /abilities with broadcast_enabled when the broadcast toggle is flipped", async () => {
nodeData = {}; // broadcast defaults false
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const broadcast = (await screen.findByText("Broadcast to peers"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
fireEvent.click(broadcast); // false → true
await waitFor(() =>
expect(apiPatch).toHaveBeenCalledWith("/workspaces/ws-test/abilities", {
broadcast_enabled: true,
}),
);
expect(storeUpdateNodeData).toHaveBeenCalledWith("ws-test", {
broadcastEnabled: true,
});
});
});
@@ -405,7 +405,7 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
</p>
<button
onClick={loadInitial}
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors"
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-red-500/60 focus-visible:ring-offset-1"
>
Retry
</button>
@@ -610,7 +610,7 @@ function PeerTabButton({
aria-selected={active}
tabIndex={active ? 0 : -1}
onClick={onClick}
className={`shrink-0 px-3 py-1.5 text-[10px] font-medium transition-colors whitespace-nowrap ${
className={`shrink-0 px-3 py-1.5 text-[10px] font-medium transition-colors whitespace-nowrap focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-cyan-500/60 focus-visible:ring-offset-1 ${
active
? "border-b-2 border-cyan-500 text-cyan-200"
: "border-b-2 border-transparent text-ink-mid hover:text-ink-mid"
@@ -33,7 +33,7 @@ export function PendingAttachmentPill({
<button
onClick={onRemove}
aria-label={`Remove ${file.name}`}
className="ml-0.5 text-ink-mid hover:text-ink transition-colors shrink-0"
className="ml-0.5 text-ink-mid hover:text-ink transition-colors shrink-0 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 focus-visible:ring-offset-1"
>
<svg width="10" height="10" viewBox="0 0 16 16" fill="none" aria-hidden="true">
<path d="M4 4l8 8M12 4l-8 8" stroke="currentColor" strokeWidth="1.6" strokeLinecap="round" />
@@ -62,8 +62,9 @@ export function AttachmentChip({
return (
<button
onClick={() => onDownload(attachment)}
aria-label={`Download ${attachment.name}`}
title={`Download ${attachment.name}`}
className={`flex items-center gap-1.5 rounded-md border px-2 py-1 text-[10px] transition-colors max-w-full ${toneClasses}`}
className={`flex items-center gap-1.5 rounded-md border px-2 py-1 text-[10px] transition-colors max-w-full focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 focus-visible:ring-offset-1 ${toneClasses}`}
>
<FileGlyph className="shrink-0 opacity-70" />
<span className="truncate">{attachment.name}</span>
@@ -0,0 +1,85 @@
"use client";
/**
* ChatErrorBanner — error-state banner rendered under the chat
* message list when an agent turn fails or the workspace is offline.
*
* internal#212 closes the "see workspace logs for details" pointer-to-
* nowhere defect:
*
* - The banner now renders the actionable, secret-safe failure
* reason that ws-server places on `ACTIVITY_LOGGED.error_detail`
* (provider HTTP status + error code + provider's own human
* message). The hook (`useChatSocket`) forwards this through
* `onSendError`, which the ChatTab routes into this banner's
* `message` prop. No hardcoded opaque text in this component.
*
* - A "View activity log" button navigates the user to the Activity
* tab where the full row (request body, response body, timing,
* full error_detail) lives. Until internal#212, the banner
* mentioned "workspace logs" with no link — there is no separate
* Logs tab in the side panel; the Activity tab IS the workspace-
* logs surface. Routing through the existing tab makes the
* reference real instead of dangling.
*
* - The existing Restart button (shown only when the workspace is
* offline) is preserved unchanged so the recovery affordance the
* old banner offered does not regress.
*
* Pure presentational — no socket subscription, no state machine. Easy
* to unit-test in isolation and easy to compose into the ChatTab.
*/
import { useCanvasStore } from "@/store/canvas";
export interface ChatErrorBannerProps {
/** The user-visible reason. Pass `null` to render nothing. */
message: string | null;
/** Workspace reachable state — gates the Restart affordance. */
isOnline: boolean;
/** Fires when the user clicks Restart (offline-only). */
onRestart: () => void;
}
export function ChatErrorBanner({ message, isOnline, onRestart }: ChatErrorBannerProps) {
// Pulled from the global store rather than threaded through props so
// the chat tab does not need to know about the side-panel tab state.
// Matches how Toolbar.tsx triggers the audit tab (the existing
// precedent for cross-tab navigation).
const setPanelTab = useCanvasStore((s) => s.setPanelTab);
if (!message) return null;
return (
<div
// role="alert" + aria-live mirrors the project's existing WCAG
// 4.1.3 banner pattern (see fix/canvas-errors-aria-alert) — a
// screen reader announces the failure as soon as it lands.
role="alert"
aria-live="assertive"
className="px-3 py-2 bg-red-900/20 border-t border-red-800/30"
>
<div className="flex items-center justify-between gap-2">
<span className="text-[10px] text-red-300 break-words flex-1">{message}</span>
<div className="flex items-center gap-1.5 shrink-0">
<button
type="button"
onClick={() => setPanelTab("activity")}
className="text-[10px] px-2 py-0.5 bg-red-900/40 hover:bg-red-800/60 border border-red-700/40 text-red-200 rounded transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
View activity log
</button>
{!isOnline && (
<button
type="button"
onClick={onRestart}
className="text-[11px] px-2 py-0.5 bg-red-800 text-red-200 rounded hover:bg-red-700"
>
Restart
</button>
)}
</div>
</div>
</div>
);
}
@@ -0,0 +1,140 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
// Capture the handler so we can drive WS events from tests. useSocketEvent
// stores the latest handler in a ref under the hood, but since we mock
// the hook entirely, just remember the last passed-in handler.
let capturedHandler: ((msg: unknown) => void) | null = null;
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: (h: (msg: unknown) => void) => {
capturedHandler = h;
},
}));
// Canvas store mock — useChatSocket calls
// useCanvasStore.getState().nodes for peer name resolution and reads
// agentMessages via the selector form. Support both.
vi.mock("@/store/canvas", () => {
const state = {
nodes: [
{ id: "ws-self", data: { name: "Self" } },
{ id: "ws-peer", data: { name: "Peer Agent" } },
],
agentMessages: {} as Record<string, unknown[]>,
consumeAgentMessages: () => [],
};
const hook = (selector?: (s: typeof state) => unknown) =>
selector ? selector(state) : state;
hook.getState = () => state;
return { useCanvasStore: hook };
});
import { useChatSocket } from "../useChatSocket";
beforeEach(() => {
capturedHandler = null;
});
afterEach(() => {
vi.clearAllMocks();
});
// Helper: assemble an ACTIVITY_LOGGED a2a_receive error event the way
// the ws-server emits one when a peer call errors out. Fields mirror
// workspace-server/internal/handlers/activity.go::logActivityExec
// broadcast payload shape.
function makeActivityErrorEvent(opts: { workspaceId: string; targetId?: string; errorDetail?: string | undefined }) {
return {
event: "ACTIVITY_LOGGED",
workspace_id: opts.workspaceId,
payload: {
activity_type: "a2a_receive",
method: "message/send",
status: "error",
target_id: opts.targetId ?? opts.workspaceId,
duration_ms: 1500,
...(opts.errorDetail !== undefined ? { error_detail: opts.errorDetail } : {}),
},
timestamp: "2026-05-18T00:00:00Z",
};
}
describe("useChatSocket — surface error_detail to onSendError (internal#212)", () => {
it("forwards the secret-safe error_detail from the broadcast as the onSendError reason", () => {
const onSendError = vi.fn();
const onSendComplete = vi.fn();
renderHook(() =>
useChatSocket("ws-self", {
onSendError,
onSendComplete,
}),
);
expect(capturedHandler).not.toBeNull();
act(() => {
capturedHandler!(
makeActivityErrorEvent({
workspaceId: "ws-self",
errorDetail:
"Anthropic 403 oauth_org_not_allowed: Your organization has disabled Claude subscription access for Claude Code",
}),
);
});
// The hook must NOT fall back to the opaque hardcoded
// "Agent error (Exception) — see workspace logs for details." —
// that was internal#212. When the broadcast carries an
// error_detail, that string is the user-facing reason.
expect(onSendError).toHaveBeenCalledTimes(1);
const reason = onSendError.mock.calls[0][0] as string;
expect(reason).toContain("403");
expect(reason).toContain("oauth_org_not_allowed");
expect(reason).toContain("disabled Claude subscription");
expect(reason).not.toMatch(/see workspace logs for details/i);
});
it("gracefully degrades to the legacy opaque message when error_detail is absent (older ws-server)", () => {
// An older ws-server doesn't include error_detail in the payload.
// The hook must still fire onSendError with the legacy hardcoded
// text so the chat banner has SOMETHING to show. The fix is
// additive — never depend on the new field's presence.
const onSendError = vi.fn();
renderHook(() =>
useChatSocket("ws-self", {
onSendError,
}),
);
act(() => {
capturedHandler!(makeActivityErrorEvent({ workspaceId: "ws-self" }));
});
expect(onSendError).toHaveBeenCalledTimes(1);
const reason = onSendError.mock.calls[0][0] as string;
// Legacy boilerplate is the floor — never silently swallow.
expect(reason.length).toBeGreaterThan(0);
});
it("ignores errors targeted at a different workspace's peer", () => {
// Defense against a race where the WS hub fans out to all clients —
// each chat panel must only react when target_id matches its own
// workspace.
const onSendError = vi.fn();
renderHook(() =>
useChatSocket("ws-self", {
onSendError,
}),
);
act(() => {
capturedHandler!(
makeActivityErrorEvent({
workspaceId: "ws-self",
targetId: "ws-someone-else",
errorDetail: "irrelevant",
}),
);
});
expect(onSendError).not.toHaveBeenCalled();
});
});
@@ -67,9 +67,23 @@ export function useChatSocket(
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) {
callbacksRef.current.onSendComplete?.();
callbacksRef.current.onSendError?.(
"Agent error (Exception) — see workspace logs for details.",
);
// internal#212 — surface the actionable, secret-safe
// failure reason (provider HTTP status + error code +
// human-readable message) the ws-server now puts on
// ACTIVITY_LOGGED.error_detail. The old hardcoded
// "Agent error (Exception) — see workspace logs for
// details." is the fallback only — it pointed at a
// workspace-logs tab that doesn't exist, telling the
// user nothing they could act on.
//
// Graceful degradation: older ws-server builds don't
// include error_detail, so the legacy boilerplate is
// still the floor (never silently swallow).
const detail = (p.error_detail as string) || "";
const reason = detail
? detail
: "Agent error (Exception) — see workspace logs for details.";
callbacksRef.current.onSendError?.(reason);
}
}
} else if (type === "a2a_send") {
@@ -351,8 +351,10 @@ export function SecretsSection({ workspaceId, requiredEnv }: { workspaceId: stri
{showAdd ? (
<div className="bg-surface-card/50 rounded p-2 space-y-1.5 border border-line/50">
<input value={newKey} onChange={(e) => setNewKey(e.target.value.toUpperCase())} placeholder="KEY_NAME"
aria-label="Secret key name"
className="w-full bg-surface-sunken border border-line rounded px-2 py-1 text-[10px] font-mono text-ink focus:outline-none focus:border-accent" />
<input value={newValue} onChange={(e) => setNewValue(e.target.value)} placeholder="Value" type="password"
aria-label="Secret value"
className="w-full bg-surface-sunken border border-line rounded px-2 py-1 text-[10px] text-ink focus:outline-none focus:border-accent" />
<div className="flex gap-2">
<button type="button" onClick={() => { if (newKey && newValue) handleSave(newKey, newValue); }} disabled={!newKey || !newValue}
@@ -99,7 +99,7 @@ export function TestConnectionButton({
function Spinner() {
return (
<svg className="spinner" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
<svg aria-hidden="true" className="spinner" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
<path d="M12 2v4M12 18v4M4.93 4.93l2.83 2.83M16.24 16.24l2.83 2.83M2 12h4M18 12h4M4.93 19.07l2.83-2.83M16.24 7.76l2.83-2.83" />
</svg>
);
+82 -1
View File
@@ -63,7 +63,7 @@ class MockWebSocket {
(globalThis as unknown as Record<string, unknown>).WebSocket = MockWebSocket;
// Now import the socket module (uses globalThis.WebSocket at call time)
import { connectSocket, disconnectSocket } from "../socket";
import { connectSocket, disconnectSocket, wakeSocket } from "../socket";
import { useCanvasStore } from "../canvas";
// ---------------------------------------------------------------------------
@@ -416,3 +416,84 @@ describe("RehydrateDedup", () => {
expect(d.shouldSkip(2_700)).toBe(true);
});
});
// ---------------------------------------------------------------------------
// wakeSocket() — visibility-wake reconnect (regression #223 / #228)
// ---------------------------------------------------------------------------
//
// Mobile browsers (iOS Safari, Chrome on Android in deep-sleep) silently
// drop the WebSocket when the tab is backgrounded; the in-page onclose
// fires very late or never. Without a visibility wake, the canvas stays
// frozen until the user manually refreshes.
//
// The real wiring lives at module level: connectSocket installs a
// visibilitychange/pageshow listener that calls wake() on foreground.
// We can't dispatch DOM events here because the suite runs under the
// `node` test environment (no `document`/`window` — see canvas/vitest
// .config.ts). Instead we test wake() directly through the wakeSocket
// public export, which is the same code path the listener invokes.
describe("wakeSocket → reconnect (#223 / #228 — mobile visibility wake)", () => {
it("wake on a healthy OPEN socket does not create a new WebSocket", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
// OPEN === 1. wake() should take the healthy-no-op branch.
(ws as unknown as { readyState: number }).readyState = 1;
const before = MockWebSocket.instances.length;
wakeSocket();
expect(MockWebSocket.instances.length).toBe(before);
});
it("wake on a CLOSED socket creates a new WebSocket (the actual #223 fix)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
// CLOSED === 3. Simulates the OS killing the socket while the tab
// was backgrounded. We deliberately don't fire triggerClose() —
// the whole point of #223 is that mobile browsers don't fire
// onclose when they kill the WS, so reconnect never schedules.
(ws as unknown as { readyState: number }).readyState = 3;
const before = MockWebSocket.instances.length;
wakeSocket();
expect(MockWebSocket.instances.length).toBe(before + 1);
});
it("wake while CONNECTING (readyState=0) does not pile another handshake", () => {
connectSocket();
const ws = getLastWS();
// CONNECTING === 0 — a handshake is already in flight.
(ws as unknown as { readyState: number }).readyState = 0;
const before = MockWebSocket.instances.length;
wakeSocket();
expect(MockWebSocket.instances.length).toBe(before);
});
it("wake cancels any pending backoff reconnect", () => {
const clearTimeoutSpy = vi.spyOn(globalThis, "clearTimeout");
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
// Drop the socket — onclose schedules a backoff reconnect.
ws.triggerClose();
// Now wake the page. wake() should pre-empt the backoff so the
// user sees the canvas come back immediately, not after the
// exponential delay window.
(ws as unknown as { readyState: number }).readyState = 3;
clearTimeoutSpy.mockClear();
wakeSocket();
expect(clearTimeoutSpy).toHaveBeenCalled();
clearTimeoutSpy.mockRestore();
});
it("wake after disconnectSocket is a no-op (no zombie reconnect)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
disconnectSocket();
const before = MockWebSocket.instances.length;
// Singleton is null now — wake() should silently do nothing.
expect(() => wakeSocket()).not.toThrow();
expect(MockWebSocket.instances.length).toBe(before);
});
});
+88
View File
@@ -268,6 +268,46 @@ class ReconnectingSocket {
}
useCanvasStore.getState().setWsStatus("disconnected");
}
/** Force a reconnect attempt now, skipping the backoff window.
* Used by the visibilitychange / pageshow handler: when a mobile
* browser backgrounds the tab, the OS silently kills the WebSocket
* but the in-page onclose either fires very late or never fires at
* all (iOS Safari, Chrome on Android in deep-sleep). Once the user
* brings the tab back, the canvas needs to reconnect within human
* perception — not on whatever backoff delay was last scheduled,
* which can be up to 30s. (#223 / #228)
*
* Idempotent: if the socket is already OPEN we leave it alone; the
* WebSocket is still healthy and a reconnect would just churn. */
wake() {
if (this.disposed) return;
// OPEN === 1. Use the numeric literal so we don't have to import
// WebSocket type values; the runtime constant is well-defined.
if (this.ws && this.ws.readyState === 1) {
// Healthy. Run a rehydrate to catch any events we may have missed
// while the tab was backgrounded — the OS does deliver some
// packets late, but it can also drop them, and the dedup gate
// collapses this with any subsequent health-check rehydrate.
void this.rehydrate();
return;
}
// CONNECTING === 0 means a handshake is already in flight. Don't
// pile another one on; the existing attempt or its onclose-driven
// reconnect will resolve.
if (this.ws && this.ws.readyState === 0) return;
// Otherwise (CLOSING, CLOSED, or null) we're in limbo. Cancel any
// pending backoff and reconnect now.
if (this.reconnectTimer) {
clearTimeout(this.reconnectTimer);
this.reconnectTimer = null;
}
// Reset attempt counter so the *next* failure (if any) starts from
// a short delay again — we just had a real user interaction, not
// an unattended-tab failure cascade.
this.attempt = 0;
this.connect();
}
}
export interface WorkspaceData {
@@ -306,11 +346,49 @@ export interface WorkspaceData {
let socket: ReconnectingSocket | null = null;
/** visibilitychange / pageshow handler. Mobile browsers (iOS Safari,
* Chrome on Android in deep-sleep) silently drop the WebSocket when
* the tab is backgrounded — the in-page `onclose` fires very late or
* never. Without this listener, the canvas appears frozen after the
* user backgrounds the PWA and returns to it: status events, agent
* messages, and cross-device chat broadcast don't arrive until a
* manual refresh (#223 / #228).
*
* Both events are wired: `visibilitychange` covers tab-switch on a
* live page; `pageshow` covers Safari's bfcache restore, where the
* page comes back from cache without firing visibilitychange. */
function onPageWake() {
// document is undefined in SSR; the listener never installs there,
// but defensively guard anyway in case this code is run via a test
// harness that doesn't shim it.
if (typeof document !== "undefined" && document.hidden) return;
socket?.wake();
}
let visibilityHandlerInstalled = false;
function installVisibilityHandler() {
if (visibilityHandlerInstalled) return;
if (typeof document === "undefined" || typeof window === "undefined") return;
document.addEventListener("visibilitychange", onPageWake);
// `pageshow` with `event.persisted === true` is the bfcache restore
// signal — relevant on iOS Safari. We don't need to inspect
// `persisted` because waking an OPEN socket is a no-op.
window.addEventListener("pageshow", onPageWake);
visibilityHandlerInstalled = true;
}
function uninstallVisibilityHandler() {
if (!visibilityHandlerInstalled) return;
if (typeof document === "undefined" || typeof window === "undefined") return;
document.removeEventListener("visibilitychange", onPageWake);
window.removeEventListener("pageshow", onPageWake);
visibilityHandlerInstalled = false;
}
export function connectSocket() {
if (!socket) {
socket = new ReconnectingSocket(WS_URL);
}
socket.connect();
installVisibilityHandler();
}
export function disconnectSocket() {
@@ -318,4 +396,14 @@ export function disconnectSocket() {
socket.disconnect();
socket = null;
}
uninstallVisibilityHandler();
}
/** Manually trigger the visibility-wake path. Exported so the test suite
* can exercise `ReconnectingSocket.wake()` without depending on a
* jsdom DOM (the rest of this file's tests run under the node env).
* Real-world callers don't need this — the visibility/pageshow listener
* drives it. */
export function wakeSocket() {
socket?.wake();
}
+12
View File
@@ -584,6 +584,10 @@
.secrets-tab__refresh-btn:hover {
background: #1e40af;
}
.secrets-tab__refresh-btn:focus-visible {
outline: 2px solid #1d4ed8;
outline-offset: 2px;
}
.secrets-tab__no-results {
text-align: center;
@@ -649,6 +653,10 @@
border-radius: 6px;
cursor: pointer;
}
.delete-dialog__cancel-btn:focus-visible {
outline: var(--focus-ring);
outline-offset: var(--focus-ring-offset);
}
.delete-dialog__confirm-btn {
background: var(--status-invalid);
@@ -658,6 +666,10 @@
border-radius: 6px;
cursor: pointer;
}
.delete-dialog__confirm-btn:focus-visible {
outline: var(--focus-ring);
outline-offset: var(--focus-ring-offset);
}
.delete-dialog__confirm-btn:disabled { opacity: 0.4; cursor: not-allowed; }
+18 -6
View File
@@ -58,6 +58,7 @@ TOP_LEVEL_MODULES = {
"a2a_response",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_identity",
"a2a_tools_inbox",
"a2a_tools_memory",
"a2a_tools_messaging",
@@ -310,8 +311,17 @@ locally.
deps from your system Python. Plain `pip install --user` works
but the binary lands in `~/.local/bin` (Linux) or
`~/Library/Python/3.X/bin` (macOS) which is often not on PATH on
a fresh shell — `claude mcp add molecule -- molecule-mcp` then
fails with "command not found" at first use.
a fresh shell — `claude mcp add molecule-<workspace-slug> -- molecule-mcp`
then fails with "command not found" at first use.
* **Server name in `claude mcp add` is workspace-specific.** The
Canvas "Add to Claude Code" snippet stamps a unique slug
(`molecule-<workspace-name>`) so a single Claude Code session can
talk to N molecule workspaces concurrently — `claude mcp add` keys
entries by name in `~/.claude.json`, so re-running with a bare
`molecule` name silently overwrites the prior workspace's entry.
See [molecule-core#1535](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1535)
for the canonical generator.
### Install
@@ -335,8 +345,10 @@ WORKSPACE_ID=<uuid> \\
That exposes the same 8 platform tools (`delegate_task`, `list_peers`,
`send_message_to_user`, `commit_memory`, etc.) that container-bound
runtimes already get via the workspace's auto-spawned MCP. Register
the binary in your agent's MCP config (e.g. Claude Code's
`claude mcp add molecule -- molecule-mcp` with the env above).
the binary in your agent's MCP config — use a workspace-specific
server name so multi-workspace setups don't collide (e.g. Claude Code:
`claude mcp add molecule-<workspace-slug> -- molecule-mcp` with the env
above; the Canvas modal stamps the right slug for you).
### Keeping the token out of shell history
@@ -374,8 +386,8 @@ hold:
wheel does (see `_build_initialize_result`). Nothing for you to
do.
2. **Claude Code installs the server as a marketplace plugin** — a
plain `claude mcp add molecule -- molecule-mcp` produces a
non-plugin-sourced server, which Claude Code rejects with
plain `claude mcp add molecule-<workspace-slug> -- molecule-mcp`
produces a non-plugin-sourced server, which Claude Code rejects with
`channel_enable requires a marketplace plugin`. Until the
official `moleculesai/claude-code-plugin` marketplace lands
(tracking [#2936](https://git.moleculesai.app/molecule-ai/molecule-core/issues/2936)),
+165
View File
@@ -0,0 +1,165 @@
# shellcheck shell=bash
# Shared peer-visibility assertion core — runtime/backend-AGNOSTIC.
#
# WHY THIS FILE EXISTS
# --------------------
# The peer-visibility gate (PR #1298) was staging-only. Per the standing
# rule that the local prod-mimic stack must run a MANDATORY local-Postgres
# E2E BEFORE staging E2E (memory: feedback_local_must_mimic_production,
# feedback_mandatory_local_e2e_before_ship, feedback_local_test_before_
# staging_e2e), peer-visibility must also run against the local stack.
#
# The ASSERTION must be byte-identical between local and staging — only
# provisioning differs. So the literal MCP `list_peers` call + every
# anti-proxy / anti-native-fallback guarantee lives HERE, sourced by both
# tests/e2e/test_peer_visibility_mcp_staging.sh (staging/CP backend) and
# tests/e2e/test_peer_visibility_mcp_local.sh (local docker-compose
# backend). If this assertion ever diverges between the two, that is the
# bug — keep it in one place.
#
# THIS IS NOT A PROXY. pv_assert_runtime issues the byte-for-byte
# JSON-RPC `tools/call name=list_peers` envelope to `POST
# /workspaces/:id/mcp` using the workspace's OWN bearer token, through
# the real WorkspaceAuth + MCPRateLimiter middleware chain — the exact
# call mcp_molecule_list_peers makes from a canvas agent. It does NOT
# read a registry row, /health, the heartbeat table, or
# GET /registry/:id/peers.
#
# Contract:
# pv_assert_runtime <runtime> <ws_id> <ws_bearer> <base_url> \
# <org_id_or_empty> <all_ws_ids_space_separated>
#
# <org_id_or_empty> staging: the X-Molecule-Org-Id header value.
# local: "" (the local single-tenant stack does
# not gate on the org header; the header
# is simply omitted when empty).
# <all_ws_ids> every provisioned workspace id (parent + every
# runtime sibling). The expected peer set for this
# runtime is every id in here EXCEPT <ws_id>.
#
# Sets the global PV_VERDICT to one of:
# OK
# FAIL(http=<code>)
# FAIL(native-fallback)
# FAIL(rpc=<detail>)
# FAIL(peers=<detail>)
# FAIL(unknown)
# Returns 0 when PV_VERDICT=OK, 1 otherwise. Never exits — the caller
# owns aggregation + the gate exit code (10 = regression reproduced).
#
# The literal JSON-RPC envelope. Identical to what
# workspace/platform_tools/registry.py's mcp_molecule_list_peers emits.
PV_RPC_BODY='{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"list_peers","arguments":{}}}'
pv_assert_runtime() {
local rt="$1" wid="$2" wtok="$3" base_url="$4" org_id="$5" all_ws_ids="$6"
# Expected peer set = every OTHER provisioned workspace, excluding the
# caller itself. Byte-identical selection to the original staging script.
local expect_ids
expect_ids=$(echo "$all_ws_ids" | tr ' ' '\n' | grep -v "^${wid}$" | grep -v '^$')
# X-Molecule-Org-Id only when the backend supplies one (staging multi-
# tenant). Local single-tenant omits it — the same WorkspaceAuth +
# MCPRateLimiter chain still runs; only the tenant-routing header differs.
local org_header=()
if [ -n "$org_id" ]; then
org_header=(-H "X-Molecule-Org-Id: $org_id")
fi
local resp http_code body
set +e
resp=$(curl -sS -X POST "$base_url/workspaces/$wid/mcp" \
-H "Authorization: Bearer $wtok" \
"${org_header[@]}" \
-H "Content-Type: application/json" \
-d "$PV_RPC_BODY" \
-o /tmp/pv_mcp_body.json -w "%{http_code}" 2>/dev/null)
set -e
http_code="$resp"
body=$(cat /tmp/pv_mcp_body.json 2>/dev/null || echo '')
echo "--- $rt (ws=$wid) ---"
echo " HTTP $http_code"
echo " body: $(echo "$body" | head -c 600)"
# (1) HTTP 200 — a 401 (WorkspaceAuth reject, the Hermes symptom) fails here.
if [ "$http_code" != "200" ]; then
echo "$rt: list_peers MCP call returned HTTP $http_code (expected 200)"
PV_VERDICT="FAIL(http=$http_code)"
return 1
fi
# (2) JSON-RPC result present, not an error object; expected sibling IDs
# present; not a native-sessions fallback. Byte-identical to the
# original staging script's inline python.
local parse
parse=$(echo "$body" | python3 -c "
import sys, json
expect = set(filter(None, '''$expect_ids'''.split()))
try:
d = json.load(sys.stdin)
except Exception as e:
print('PARSE_ERROR:' + str(e)); sys.exit(0)
if isinstance(d, dict) and d.get('error') is not None:
print('RPC_ERROR:' + json.dumps(d['error'])[:200]); sys.exit(0)
res = d.get('result') if isinstance(d, dict) else None
if res is None:
print('NO_RESULT'); sys.exit(0)
# MCP tools/call result shape: {content:[{type:text,text:'<json or prose>'}]}
text = ''
if isinstance(res, dict):
for c in res.get('content', []):
if c.get('type') == 'text':
text += c.get('text', '')
text_l = text.lower()
# Native-sessions fallback signature (the OpenClaw symptom): the agent
# answered from its own runtime session list, not the platform peer set.
if 'sessions_list' in text_l or 'no platform peers' in text_l or 'native session' in text_l:
print('NATIVE_FALLBACK:' + text[:200]); sys.exit(0)
# The expected sibling IDs must literally appear in the returned peer text.
found = sorted(i for i in expect if i in text)
missing = sorted(expect - set(found))
if not expect:
print('NO_EXPECTED_PEERS_CONFIGURED'); sys.exit(0)
if missing:
print('MISSING_PEERS:found=%d/%d missing=%s' % (len(found), len(expect), ','.join(m[:8] for m in missing)))
sys.exit(0)
print('OK:found=%d/%d' % (len(found), len(expect)))
" 2>/dev/null)
case "$parse" in
OK:*)
echo "$rt: list_peers returned 200 and contains all expected peers ($parse)"
PV_VERDICT="OK"
return 0
;;
NATIVE_FALLBACK:*)
echo "$rt: list_peers fell back to NATIVE sessions — sees no platform peers ($parse)"
PV_VERDICT="FAIL(native-fallback)"
return 1
;;
RPC_ERROR:*|NO_RESULT|PARSE_ERROR:*)
echo "$rt: list_peers MCP call did not return a usable result ($parse)"
PV_VERDICT="FAIL(rpc=$parse)"
return 1
;;
MISSING_PEERS:*)
echo "$rt: list_peers returned 200 but peer set is wrong/empty ($parse)"
PV_VERDICT="FAIL(peers=$parse)"
return 1
;;
NO_EXPECTED_PEERS_CONFIGURED)
# Caller bug, not a runtime regression — surface loudly so a
# mis-wired backend can't mint a false green.
echo "$rt: no expected peers were configured for this caller"
PV_VERDICT="FAIL(rpc=NO_EXPECTED_PEERS_CONFIGURED)"
return 1
;;
*)
echo "$rt: unexpected verdict '$parse'"
PV_VERDICT="FAIL(unknown)"
return 1
;;
esac
}
+328
View File
@@ -0,0 +1,328 @@
#!/usr/bin/env bash
# LOCAL E2E — fresh-provision peer-visibility gate via the LITERAL MCP path.
#
# WHY THIS EXISTS
# ---------------
# tests/e2e/test_peer_visibility_mcp_staging.sh (PR #1298) codified the
# literal user-facing peer-visibility path — but staging-only. The
# standing rule is that the local prod-mimic stack runs a MANDATORY
# local-Postgres E2E BEFORE staging E2E (memory:
# feedback_local_must_mimic_production, feedback_mandatory_local_e2e_
# before_ship, feedback_local_test_before_staging_e2e,
# feedback_real_subprocess_test_for_boot_path). A staging-only gate means
# regressions are caught late and expensively on EC2. This is the LOCAL
# backend: same byte-identical assertion, local docker-compose stack.
#
# THE ASSERTION IS NOT A PROXY and is BYTE-IDENTICAL to staging — it is
# the SAME tests/e2e/lib/peer_visibility_assert.sh::pv_assert_runtime that
# the staging script calls. It issues the byte-for-byte JSON-RPC
# `tools/call name=list_peers` envelope to `POST /workspaces/:id/mcp`
# using each workspace's OWN bearer token, through the real WorkspaceAuth
# + MCPRateLimiter middleware chain — the exact call
# mcp_molecule_list_peers makes from a canvas agent. It does NOT read a
# registry row, /health, the heartbeat table, or GET /registry/:id/peers.
#
# Only PROVISIONING differs from staging:
# - staging: POST /cp/admin/orgs (cold EC2 tenant) + per-tenant admin
# token + each workspace's auth_token from the POST /workspaces resp.
# - local: POST /workspaces directly against the local stack
# (BASE, default http://localhost:8080), MCP bearer minted via
# GET /admin/workspaces/:id/test-token (e2e_mint_test_token —
# deterministic, gated by MOLECULE_ENV != production). Same model
# every other local E2E (test_priority_runtimes_e2e.sh,
# test_api.sh) already uses; no new credential/provision flow.
#
# It is written to FAIL on today's broken Hermes/OpenClaw behavior and go
# green only when the in-flight root-cause fixes (Hermes-401 #162,
# OpenClaw-never-online/MCP-wiring #165) actually land — same gate
# semantics + exit codes as the staging script. NON-required by design
# until then (flip-to-required tracked at molecule-core#1296), and NOT
# masked with continue-on-error (feedback_fix_root_not_symptom).
#
# Required env: none (local stack only).
# Optional env:
# BASE default http://localhost:8080
# PV_RUNTIMES space list; default "hermes openclaw claude-code"
# E2E_PROVISION_TIMEOUT_SECS per-workspace online budget; default 900
# (hermes cold apt+uv is the slow path locally)
# E2E_KEEP_WS 1 → skip teardown (local debugging only)
# LLM provider keys (a workspace boots only if its provider key is set;
# a runtime whose key is absent is SKIPPED, not failed — a partially
# keyed local env must not false-fail the gate):
# CLAUDE_CODE_OAUTH_TOKEN claude-code
# E2E_MINIMAX_API_KEY hermes/openclaw (MiniMax, preferred)
# E2E_ANTHROPIC_API_KEY hermes/openclaw (direct Anthropic)
# E2E_OPENAI_API_KEY hermes/openclaw (OpenAI)
#
# Exit codes (match the staging script):
# 0 every runtime under test saw its peers via the literal MCP call
# 1 generic failure
# 3 a workspace never reached online within the budget
# 10 peer-visibility regression reproduced (the gate firing as designed)
set -uo pipefail
source "$(dirname "$0")/_lib.sh"
# Byte-identical assertion shared with the staging backend.
# shellcheck source=tests/e2e/lib/peer_visibility_assert.sh
source "$(dirname "$0")/lib/peer_visibility_assert.sh"
PV_RUNTIMES="${PV_RUNTIMES:-hermes openclaw claude-code}"
PROVISION_TIMEOUT_SECS="${E2E_PROVISION_TIMEOUT_SECS:-900}"
NAME_PREFIX="PV-Local-$$-$(date +%H%M%S)"
log() { echo "[$(date +%H:%M:%S)] $*"; }
ok() { echo "[$(date +%H:%M:%S)] ✅ $*"; }
CREATED_WSIDS=()
# ─── Scoped teardown ───────────────────────────────────────────────────
# Deletes ONLY the workspaces THIS run created (tracked in CREATED_WSIDS),
# one DELETE /workspaces/:id?confirm=true each. NEVER e2e_cleanup_all_
# workspaces / any blanket sweep — honors feedback_cleanup_after_each_test
# and feedback_never_run_cluster_cleanup_tests_on_live_platform (a local
# stack can still be shared with other concurrent local E2E).
teardown() {
local rc=$?
set +e
if [ "${E2E_KEEP_WS:-0}" = "1" ]; then
echo ""
log "[teardown] E2E_KEEP_WS=1 — leaving ${#CREATED_WSIDS[@]} ws for debugging (REMEMBER TO DELETE)"
exit $rc
fi
echo ""
log "[teardown] deleting ${#CREATED_WSIDS[@]} workspace(s) this run created (scoped)"
for wid in ${CREATED_WSIDS[@]+"${CREATED_WSIDS[@]}"}; do
[ -n "$wid" ] || continue
curl -s -X DELETE "$BASE/workspaces/$wid?confirm=true" >/dev/null 2>&1 || true
done
exit $rc
}
trap teardown EXIT INT TERM
# Pre-sweep workspaces a prior crashed run of THIS script left behind
# (name prefix match only — never a blanket delete). The trap fires on
# normal exit, but a kill -9 / SIGPIPE can bypass it.
PRIOR=$(curl -s "$BASE/workspaces" | python3 -c '
import json, sys
try:
print(" ".join(w["id"] for w in json.load(sys.stdin) if w.get("name","").startswith("PV-Local-")))
except Exception:
pass
' 2>/dev/null)
for _wid in $PRIOR; do
log "Pre-sweeping prior PV-Local workspace: $_wid"
curl -s -X DELETE "$BASE/workspaces/$_wid?confirm=true" >/dev/null 2>&1 || true
done
# ─── Local-stack preflight ─────────────────────────────────────────────
log "0/5 local stack preflight: $BASE/health"
if ! curl -fsS "$BASE/health" -m 5 >/dev/null 2>&1; then
echo "::error::Local stack not healthy at $BASE/health — bring it up (make up) before this gate. Infra, not a workspace bug (feedback_fix_root_not_symptom)." >&2
exit 1
fi
# admin/test-token is the local MCP-bearer mint path; it 404s in
# production. If it is off, this gate cannot drive the literal call.
if ! curl -fsS "$BASE/admin/workspaces/preflight-probe/test-token" -m 5 >/dev/null 2>&1; then
# A 404 here is EITHER "no such ws" (fine — endpoint is enabled) OR the
# endpoint is disabled (MOLECULE_ENV=production). Distinguish by body.
PROBE=$(curl -s "$BASE/admin/workspaces/preflight-probe/test-token" -m 5 2>/dev/null)
if echo "$PROBE" | grep -qi 'production\|disabled\|not found.*endpoint'; then
echo "::error::GET /admin/workspaces/:id/test-token disabled (MOLECULE_ENV=production?). Cannot mint a local MCP bearer." >&2
exit 1
fi
fi
ok " local stack healthy"
# ─── Resolve per-runtime provisioning secrets ──────────────────────────
# Mirrors test_priority_runtimes_e2e.sh / test_staging_full_saas.sh's
# provider-key chain. A runtime whose key is absent is SKIPPED (not
# failed) so a partially keyed local env doesn't false-fail the gate.
runtime_secrets() {
local rt="$1"
case "$rt" in
claude-code)
[ -n "${CLAUDE_CODE_OAUTH_TOKEN:-}" ] || { echo ""; return 1; }
python3 -c "import json,os;print(json.dumps({'CLAUDE_CODE_OAUTH_TOKEN':os.environ['CLAUDE_CODE_OAUTH_TOKEN']}))"
;;
hermes|openclaw)
if [ -n "${E2E_MINIMAX_API_KEY:-}" ]; then
python3 -c "import json,os;k=os.environ['E2E_MINIMAX_API_KEY'];print(json.dumps({'ANTHROPIC_BASE_URL':'https://api.minimax.io/anthropic','ANTHROPIC_AUTH_TOKEN':k,'MINIMAX_API_KEY':k}))"
elif [ -n "${E2E_ANTHROPIC_API_KEY:-}" ]; then
python3 -c "import json,os;k=os.environ['E2E_ANTHROPIC_API_KEY'];print(json.dumps({'ANTHROPIC_API_KEY':k}))"
elif [ -n "${E2E_OPENAI_API_KEY:-}" ]; then
python3 -c "import json,os;k=os.environ['E2E_OPENAI_API_KEY'];print(json.dumps({'OPENAI_API_KEY':k,'OPENAI_BASE_URL':'https://api.openai.com/v1','MODEL_PROVIDER':'openai:gpt-4o','HERMES_INFERENCE_PROVIDER':'custom','HERMES_CUSTOM_BASE_URL':'https://api.openai.com/v1','HERMES_CUSTOM_API_KEY':k,'HERMES_CUSTOM_API_MODE':'chat_completions'}))"
else
echo ""; return 1
fi
;;
*)
# Unknown runtime: provision with empty secrets and let the stack
# decide (kept permissive so PV_RUNTIMES can be widened later).
echo "{}"
;;
esac
}
# Block until $1 reaches one of $2 (space-separated), or $3 sec elapse.
wait_for_status() {
local wsid="$1" want="$2" budget="$3" start=$SECONDS last=""
while [ $((SECONDS - start)) -lt "$budget" ]; do
local s
s=$(curl -s "$BASE/workspaces/$wsid" | python3 -c 'import json,sys
try:
d=json.load(sys.stdin); w=d.get("workspace") if isinstance(d.get("workspace"),dict) else d; print(w.get("status",""))
except Exception:
print("")' 2>/dev/null || echo "")
[ "$s" != "$last" ] && { log " $wsid${s:-<none>}"; last="$s"; }
for w in $want; do [ "$s" = "$w" ] && { echo "$s"; return 0; }; done
sleep 5
done
echo "$last"
return 1
}
# ─── 1. Provision parent (claude-code) + one sibling per runtime ───────
# Same topology as the staging script: a claude-code parent plus one
# sibling per runtime under test, so each runtime should see all others.
log "1/5 provisioning parent (claude-code) + one sibling per runtime under test..."
PARENT_SECRETS=$(runtime_secrets claude-code) || PARENT_SECRETS=""
if [ -z "$PARENT_SECRETS" ]; then
# Parent still needs to exist as a peer target even without an LLM key;
# it never has to answer list_peers itself (it is excluded from the
# caller set), so an empty-secrets claude-code shell is sufficient.
PARENT_SECRETS="{}"
fi
P_RESP=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-parent\",\"runtime\":\"claude-code\",\"tier\":3,\"secrets\":$PARENT_SECRETS}")
PARENT_ID=$(echo "$P_RESP" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))' 2>/dev/null)
if [ -z "$PARENT_ID" ]; then
echo "::error::parent create failed: $(echo "$P_RESP" | head -c 300)" >&2
exit 1
fi
CREATED_WSIDS+=("$PARENT_ID")
log " PARENT_ID=$PARENT_ID"
# NOTE: no `declare -A` — this script must also run on a local macOS dev
# box (bash 3.2, no associative arrays) per feedback_local_must_mimic_
# production. WS_IDS / VERDICT are kept as newline-delimited "rt<TAB>val"
# maps with tiny get/set helpers (portable to bash 3.2+ AND ubuntu CI).
WS_IDS_MAP=""
VERDICT_MAP=""
_map_set() { # _map_set <mapvarname> <key> <value>
local __m="$1" __k="$2" __v="$3" __cur
eval "__cur=\$$__m"
__cur=$(printf '%s' "$__cur" | grep -v "^${__k} " || true)
if [ -n "$__cur" ]; then
eval "$__m=\$(printf '%s\n%s\t%s' \"\$__cur\" \"\$__k\" \"\$__v\")"
else
eval "$__m=\$(printf '%s\t%s' \"\$__k\" \"\$__v\")"
fi
}
_map_get() { # _map_get <mapvarname> <key> -> stdout value (empty if absent)
local __m="$1" __k="$2" __cur
eval "__cur=\$$__m"
printf '%s\n' "$__cur" | awk -F'\t' -v k="$__k" '$1==k {print $2; exit}'
}
ALL_WS_IDS="$PARENT_ID"
ACTIVE_RUNTIMES=""
for rt in $PV_RUNTIMES; do
SEC=$(runtime_secrets "$rt") || SEC=""
if [ -z "$SEC" ]; then
log " SKIP $rt — no provider key in env (partially-keyed local env; not a failure)"
continue
fi
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-$rt\",\"runtime\":\"$rt\",\"tier\":2,\"parent_id\":\"$PARENT_ID\",\"secrets\":$SEC}")
WID=$(echo "$R" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))' 2>/dev/null)
if [ -z "$WID" ]; then
echo "::error::$rt workspace create failed: $(echo "$R" | head -c 300)" >&2
exit 1
fi
_map_set WS_IDS_MAP "$rt" "$WID"
CREATED_WSIDS+=("$WID")
ALL_WS_IDS="$ALL_WS_IDS $WID"
ACTIVE_RUNTIMES="$ACTIVE_RUNTIMES $rt"
log " $rt$WID"
done
ACTIVE_RUNTIMES="$(echo "$ACTIVE_RUNTIMES" | xargs)"
if [ -z "$ACTIVE_RUNTIMES" ]; then
echo "::error::No runtime had a provider key set — cannot run the local peer-visibility gate. Set CLAUDE_CODE_OAUTH_TOKEN and/or E2E_MINIMAX_API_KEY (or ANTHROPIC/OPENAI)." >&2
exit 1
fi
# ─── 2. Wait for the parent online (it is a peer target) ───────────────
log "2/5 waiting for parent online (peer target)..."
PF=$(wait_for_status "$PARENT_ID" "online" "$PROVISION_TIMEOUT_SECS") || true
if [ "$PF" != "online" ]; then
echo "::error::parent ($PARENT_ID) never reached online (last=$PF) within ${PROVISION_TIMEOUT_SECS}s" >&2
exit 3
fi
ok " parent online"
# ─── 3. Wait for every sibling online ──────────────────────────────────
# A runtime that never comes online locally is itself a finding: it
# reproduces the openclaw-never-online class (#165) on the local stack.
log "3/5 waiting for all siblings online (up to ${PROVISION_TIMEOUT_SECS}s each — cold boot)..."
REGRESSED=0
ONLINE_RUNTIMES=""
for rt in $ACTIVE_RUNTIMES; do
wid="$(_map_get WS_IDS_MAP "$rt")"
S=$(wait_for_status "$wid" "online" "$PROVISION_TIMEOUT_SECS") || true
if [ "$S" != "online" ]; then
echo "$rt ($wid): never reached online (last=$S) — reproduces the never-online class locally"
_map_set VERDICT_MAP "$rt" "FAIL(never-online:last=$S)"
REGRESSED=1
continue
fi
ok " $rt online"
ONLINE_RUNTIMES="$ONLINE_RUNTIMES $rt"
done
# ─── 4. THE GATE — literal mcp_molecule_list_peers via POST /:id/mcp ────
# Shared, byte-identical assertion. Local passes "" for the org id (the
# single-tenant local stack does not gate on X-Molecule-Org-Id); the
# literal MCP call + every anti-proxy / anti-native-fallback guarantee is
# the SAME code the staging backend runs.
log "4/5 driving the LITERAL list_peers MCP call per online runtime..."
echo ""
for rt in $ONLINE_RUNTIMES; do
wid="$(_map_get WS_IDS_MAP "$rt")"
WTOK=$(e2e_mint_test_token "$wid" 2>/dev/null || true)
if [ -z "$WTOK" ]; then
echo "--- $rt (ws=$wid) ---"
echo "$rt: could not mint a local MCP bearer (admin/test-token) — cannot drive the literal call"
_map_set VERDICT_MAP "$rt" "FAIL(no-bearer)"
REGRESSED=1
echo ""
continue
fi
PV_VERDICT=""
pv_assert_runtime "$rt" "$wid" "$WTOK" "$BASE" "" "$ALL_WS_IDS" || REGRESSED=1
_map_set VERDICT_MAP "$rt" "$PV_VERDICT"
echo ""
done
# ─── 5. Summary + honest gate exit ─────────────────────────────────────
echo "=== SUMMARY — LOCAL fresh-provision peer-visibility (literal MCP list_peers) ==="
for rt in $ACTIVE_RUNTIMES; do
_v="$(_map_get VERDICT_MAP "$rt")"
printf ' %-14s %s\n' "$rt" "${_v:-NO_RUN}"
done
echo ""
if [ "$REGRESSED" -ne 0 ]; then
echo "✗ GATE FAILED (LOCAL) — at least one runtime cannot see its peers via"
echo " the literal mcp_molecule_list_peers call on the local prod-mimic"
echo " stack. This is the SAME user-facing failure the proxy signals were"
echo " hiding, reproduced locally (far faster than EC2). Expected RED until"
echo " the Hermes-401 (#162) + OpenClaw-never-online/MCP-wiring (#165)"
echo " root-cause fixes land; goes green only when they actually do."
exit 10
fi
ok "GATE PASSED (LOCAL) — every runtime under test sees its platform peers via the literal MCP call."
exit 0
+14 -89
View File
@@ -64,6 +64,13 @@
set -uo pipefail
# The literal MCP list_peers assertion lives in the shared, backend-
# agnostic lib so it is BYTE-IDENTICAL between this staging backend and
# the local docker-compose backend (tests/e2e/test_peer_visibility_mcp_
# local.sh). Only provisioning/teardown differs per backend.
# shellcheck source=tests/e2e/lib/peer_visibility_assert.sh
source "$(dirname "${BASH_SOURCE[0]}")/lib/peer_visibility_assert.sh"
CP_URL="${MOLECULE_CP_URL:-https://staging-api.moleculesai.app}"
ADMIN_TOKEN="${MOLECULE_ADMIN_TOKEN:?MOLECULE_ADMIN_TOKEN required — Railway staging CP_ADMIN_API_TOKEN}"
RUN_ID_SUFFIX="${E2E_RUN_ID:-$(date +%H%M%S)-$$}"
@@ -259,101 +266,19 @@ done
# through WorkspaceAuth + MCPRateLimiter.
log "6/6 driving the LITERAL list_peers MCP call per runtime..."
echo ""
RPC_BODY='{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"list_peers","arguments":{}}}'
REGRESSED=0
declare -A VERDICT
for rt in $PV_RUNTIMES; do
wid="${WS_IDS[$rt]}"
wtok="${WS_TOKENS[$rt]}"
# The expected peer set = every OTHER provisioned workspace (parent +
# the sibling runtimes), excluding the caller itself.
EXPECT_IDS=$(echo "$ALL_WS_IDS" | tr ' ' '\n' | grep -v "^${wid}$" | grep -v '^$')
set +e
RESP=$(curl -sS -X POST "$TENANT_URL/workspaces/$wid/mcp" \
-H "Authorization: Bearer $wtok" \
-H "X-Molecule-Org-Id: $ORG_ID" \
-H "Content-Type: application/json" \
-d "$RPC_BODY" \
-o /tmp/pv_mcp_body.json -w "%{http_code}" 2>/dev/null)
set -e
HTTP_CODE="$RESP"
BODY=$(cat /tmp/pv_mcp_body.json 2>/dev/null || echo '')
echo "--- $rt (ws=$wid) ---"
echo " HTTP $HTTP_CODE"
echo " body: $(echo "$BODY" | head -c 600)"
# (1) HTTP 200 — a 401 (WorkspaceAuth reject, the Hermes symptom) fails here.
if [ "$HTTP_CODE" != "200" ]; then
echo "$rt: list_peers MCP call returned HTTP $HTTP_CODE (expected 200)"
VERDICT[$rt]="FAIL(http=$HTTP_CODE)"
REGRESSED=1
continue
fi
# (2) JSON-RPC result present, not an error object.
PARSE=$(echo "$BODY" | python3 -c "
import sys, json
expect = set(filter(None, '''$EXPECT_IDS'''.split()))
try:
d = json.load(sys.stdin)
except Exception as e:
print('PARSE_ERROR:' + str(e)); sys.exit(0)
if isinstance(d, dict) and d.get('error') is not None:
print('RPC_ERROR:' + json.dumps(d['error'])[:200]); sys.exit(0)
res = d.get('result') if isinstance(d, dict) else None
if res is None:
print('NO_RESULT'); sys.exit(0)
# MCP tools/call result shape: {content:[{type:text,text:'<json or prose>'}]}
text = ''
if isinstance(res, dict):
for c in res.get('content', []):
if c.get('type') == 'text':
text += c.get('text', '')
text_l = text.lower()
# Native-sessions fallback signature (the OpenClaw symptom): the agent
# answered from its own runtime session list, not the platform peer set.
if 'sessions_list' in text_l or 'no platform peers' in text_l or 'native session' in text_l:
print('NATIVE_FALLBACK:' + text[:200]); sys.exit(0)
# The expected sibling IDs must literally appear in the returned peer text.
found = sorted(i for i in expect if i in text)
missing = sorted(expect - set(found))
if not expect:
print('NO_EXPECTED_PEERS_CONFIGURED'); sys.exit(0)
if missing:
print('MISSING_PEERS:found=%d/%d missing=%s' % (len(found), len(expect), ','.join(m[:8] for m in missing)))
sys.exit(0)
print('OK:found=%d/%d' % (len(found), len(expect)))
" 2>/dev/null)
case "$PARSE" in
OK:*)
echo "$rt: list_peers returned 200 and contains all expected peers ($PARSE)"
VERDICT[$rt]="OK"
;;
NATIVE_FALLBACK:*)
echo "$rt: list_peers fell back to NATIVE sessions — sees no platform peers ($PARSE)"
VERDICT[$rt]="FAIL(native-fallback)"
REGRESSED=1
;;
RPC_ERROR:*|NO_RESULT|PARSE_ERROR:*)
echo "$rt: list_peers MCP call did not return a usable result ($PARSE)"
VERDICT[$rt]="FAIL(rpc=$PARSE)"
REGRESSED=1
;;
MISSING_PEERS:*)
echo "$rt: list_peers returned 200 but peer set is wrong/empty ($PARSE)"
VERDICT[$rt]="FAIL(peers=$PARSE)"
REGRESSED=1
;;
*)
echo "$rt: unexpected verdict '$PARSE'"
VERDICT[$rt]="FAIL(unknown)"
REGRESSED=1
;;
esac
# Byte-identical assertion via the shared lib. Staging passes ORG_ID as
# the X-Molecule-Org-Id header value; the literal MCP call + every
# anti-proxy / anti-native-fallback guarantee is the SAME code the
# local backend runs.
PV_VERDICT=""
pv_assert_runtime "$rt" "$wid" "$wtok" "$TENANT_URL" "$ORG_ID" "$ALL_WS_IDS" || REGRESSED=1
VERDICT[$rt]="$PV_VERDICT"
echo ""
done
@@ -0,0 +1,35 @@
// Command t4-contract-dump prints the T4 privilege contract as YAML.
//
// Usage:
//
// go run ./workspace-server/cmd/t4-contract-dump > t4_capabilities.yaml
//
// This is the seam that template-repo CI workflows consume:
//
// - Template CI fetches molecule-core at pinned ref
// - Runs `go run ./workspace-server/cmd/t4-contract-dump` to produce
// t4_capabilities.yaml
// - Iterates capabilities and runs each Probe inside a freshly-built
// privileged container
// - Aggregates structured pass/fail; fails the gate on any hard miss.
//
// Keeping this trivial and pure-stdlib means a fork user does not need
// a Molecule-AI Gitea token or any internal infrastructure to consume
// the contract — `go run` against molecule-core's public source is
// enough.
package main
import (
"fmt"
"os"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
)
func main() {
caps := provisioner.T4PrivilegeContract()
if _, err := os.Stdout.WriteString(provisioner.AsYAML(caps)); err != nil {
fmt.Fprintln(os.Stderr, "t4-contract-dump: write failed:", err)
os.Exit(1)
}
}
@@ -168,6 +168,21 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
if !h.HasProvisioner() {
return false
}
// Restart-aware short-circuit: during the 20-30s EC2-pending window of
// an in-flight restart, the workspace's url='' and IsRunning() returns
// false → looks indistinguishable from a dead container. Pre-fix this
// fired a fresh RestartByID for the just-launched instance, which
// coalesceRestart's pending-flag drained by running ANOTHER full
// stop+provision cycle (= ec2_stopped of the still-pending instance
// → re-provision). That's the 4x reprov thrash class. Skip the
// container-dead path while a restart is in flight; the in-flight
// restart's own provisionWorkspaceAutoSync will surface a real failure
// (markProvisionFailed) if the new container never comes up. Issue
// internal#544.
if isRestarting(workspaceID) {
log.Printf("ProxyA2A: maybeMarkContainerDead skipped for %s — restart already in flight (self-fire guard)", workspaceID)
return false
}
var running bool
var inspectErr error
@@ -223,6 +238,18 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
// shape post-EC2-replace (see molecule-controlplane#20 incident
// 2026-05-07) where the reconciler hasn't respawned the agent yet.
func (h *WorkspaceHandler) preflightContainerHealth(ctx context.Context, workspaceID string) *proxyA2AError {
// Restart-aware short-circuit (mirror of maybeMarkContainerDead): if a
// restart cycle is in flight for this workspace, do not run the
// IsRunning probe — it would observe the EC2-pending state as "not
// running" and trigger RestartByID for an already-restarting workspace,
// closing the self-fire loop. Returning nil lets the optimistic
// forward proceed; the upstream Do() call will fail with a connection
// error or 502, and the *post-restart* reactive path can decide what
// to do once the cycle has actually completed. Issue internal#544.
if isRestarting(workspaceID) {
log.Printf("ProxyA2A preflight: %s — skipped, restart already in flight (self-fire guard)", workspaceID)
return nil
}
running, err := h.provisioner.IsRunning(ctx, workspaceID)
if err != nil {
// Transient daemon error. Provisioner.IsRunning returns (true, err)
@@ -8,6 +8,7 @@ import (
"fmt"
"log"
"net/http"
"regexp"
"strconv"
"strings"
"time"
@@ -18,6 +19,46 @@ import (
"github.com/google/uuid"
)
// internal#212 — secret-safe scrubber applied to error_detail strings
// before they cross the canvas WebSocket. Defense in depth: the
// workspace runtime already runs `_sanitize_for_external` on its side
// (workspace/executor_helpers.py), but the broadcast layer is the last
// stop before the string reaches the user's browser, so we re-scrub
// here in case any caller path forgot.
//
// The scrubber is intentionally surgical — it MUST preserve the
// actionable parts (HTTP status codes, error codes like
// `oauth_org_not_allowed`, human-readable provider messages) and
// remove only what looks credential-ish. Over-redacting defeats the
// whole point of internal#212 (giving the user a reason they can act on).
// Capture (auth-key prefix) (value) so the prefix can be preserved in
// the output. The keyword anchor prevents false positives on regular
// text that happens to contain a long alphanumeric run.
var errorDetailSecretRE = regexp.MustCompile(`(?i)((?:bearer|token|api[_-]?key|sk-proj-|sk-)[ :=]*)[A-Za-z0-9_/.-]{20,}`)
// Stringly-typed JWT-shape: 3 dot-separated base64url segments, second
// and third at least 16 chars. Matches eyJ-prefixed tokens that the
// keyword-anchored rule above would miss when they appear bare.
var errorDetailJWTRE = regexp.MustCompile(`eyJ[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{16,}\.[A-Za-z0-9_-]{16,}`)
const errorDetailBroadcastCap = 4096
func sanitizeErrorDetailForBroadcast(s string) string {
if s == "" {
return s
}
// Cap first — a huge error body shouldn't tax every websocket
// client's buffer. 4096 matches the workspace-side _MAX_STDERR
// budget (it's actually larger here so the runtime's cap dominates).
if len(s) > errorDetailBroadcastCap {
s = s[:errorDetailBroadcastCap] + "…[truncated]"
}
s = errorDetailSecretRE.ReplaceAllString(s, "${1}[REDACTED]")
s = errorDetailJWTRE.ReplaceAllString(s, "[REDACTED]")
return s
}
type ActivityHandler struct {
broadcaster *events.Broadcaster
}
@@ -691,6 +732,16 @@ func logActivityExec(ctx context.Context, exec activityExecutor, broadcaster eve
if respStr != nil {
payload["response_body"] = json.RawMessage(respJSON)
}
// internal#212 — surface the secret-safe failure reason on the
// live broadcast so the canvas chat-tab error banner can show
// the user *why* (provider HTTP status, error code, the
// provider's own human message) instead of the opaque
// "Agent error (Exception) — see workspace logs for details."
// hardcoded fallback. Omitted when nil so the canvas's "has
// actionable reason" guard doesn't trip on empty-string keys.
if params.ErrorDetail != nil && *params.ErrorDetail != "" {
payload["error_detail"] = sanitizeErrorDetailForBroadcast(*params.ErrorDetail)
}
}
return func() {
@@ -934,6 +934,184 @@ func TestLogActivity_Broadcast_IncludesRequestAndResponseBodies(t *testing.T) {
}
}
// TestLogActivity_Broadcast_IncludesErrorDetail pins the internal#212
// UX fix: when an a2a_receive row is logged with status="error" and a
// non-empty error_detail, the live broadcast MUST carry that detail so
// the canvas chat-tab error bubble can render the actionable reason
// (e.g. the provider's own 403 message) instead of the opaque
// "Agent error (Exception) — see workspace logs for details." string.
// Without this, the canvas falls back to the hardcoded boilerplate;
// the row's error_detail is in the DB but never reaches the user
// without a manual refresh of the Activity tab.
func TestLogActivity_Broadcast_IncludesErrorDetail(t *testing.T) {
mock := setupTestDB(t)
defer mock.ExpectationsWereMet()
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(1, 1))
cb := &recordingBroadcaster{}
srcID := "ws-source"
tgtID := "ws-target"
method := "message/send"
// Realistic actionable reason: provider HTTP status + provider's
// own message. Secret-safe (no token, no api key, just the cause).
detail := "Anthropic 403 oauth_org_not_allowed: Your organization has disabled Claude subscription access for Claude Code — use an Anthropic API key or ask your admin to enable access."
LogActivity(context.Background(), cb, ActivityParams{
WorkspaceID: "ws-source",
ActivityType: "a2a_receive",
SourceID: &srcID,
TargetID: &tgtID,
Method: &method,
Status: "error",
ErrorDetail: &detail,
})
if len(cb.calls) != 1 {
t.Fatalf("expected 1 broadcast, got %d", len(cb.calls))
}
payload := cb.calls[0].payload
got, ok := payload["error_detail"].(string)
if !ok {
t.Fatalf("error_detail missing from broadcast payload: got %#v", payload["error_detail"])
}
if got != detail {
t.Errorf("error_detail = %q, want %q", got, detail)
}
}
// TestLogActivity_Broadcast_OmitsErrorDetailWhenNil pins the inverse:
// rows logged without an error_detail (the common ok-path) must not
// have an empty "error_detail":"" key in the broadcast, which would
// false-positive the canvas's "has actionable reason" guard and render
// an empty Underlying-Error block. The omission rule matches how
// request_body/response_body are handled.
func TestLogActivity_Broadcast_OmitsErrorDetailWhenNil(t *testing.T) {
mock := setupTestDB(t)
defer mock.ExpectationsWereMet()
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(1, 1))
cb := &recordingBroadcaster{}
srcID := "ws-source"
LogActivity(context.Background(), cb, ActivityParams{
WorkspaceID: "ws-source",
ActivityType: "a2a_send",
SourceID: &srcID,
Status: "ok",
ErrorDetail: nil,
})
if len(cb.calls) != 1 {
t.Fatalf("expected 1 broadcast, got %d", len(cb.calls))
}
if _, present := cb.calls[0].payload["error_detail"]; present {
t.Errorf("error_detail should be omitted when nil, got %v", cb.calls[0].payload["error_detail"])
}
}
// TestSanitizeErrorDetail_StripsSecretShapes pins the secret-safe
// scrubber's contract: the broadcast layer is the last defense before
// a string crosses the canvas WebSocket and lands in the user's
// browser, so anything that *looks* like an API key / bearer token /
// JWT must be replaced with [REDACTED] even if upstream (the runtime,
// the provider) failed to scrub it. The non-secret parts of the
// message — provider status, error code, human-readable cause — MUST
// survive intact, otherwise the whole point of internal#212 (giving
// the user an actionable reason) is defeated.
func TestSanitizeErrorDetail_StripsSecretShapes(t *testing.T) {
cases := []struct {
name string
in string
mustHave []string // substrings that must survive — the actionable parts
mustMiss []string // substrings that must NOT survive — the secret shapes
}{
{
name: "preserves actionable provider reason",
in: "Anthropic 403 oauth_org_not_allowed: Your organization has disabled Claude subscription access for Claude Code",
mustHave: []string{"403", "oauth_org_not_allowed", "disabled Claude subscription"},
mustMiss: []string{"[REDACTED]"},
},
{
name: "redacts sk- API key embedded in error",
in: "openai 401 invalid_api_key: Incorrect API key provided: sk-proj-abcdefghijklmnop1234567890abcdef. Check your key.",
mustHave: []string{"401", "invalid_api_key", "Incorrect API key provided"},
mustMiss: []string{"sk-proj-abcdefghijklmnop1234567890abcdef"},
},
{
name: "redacts Bearer token in stringified header dump",
in: "auth failed; headers: Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.aaaaaaaaaaaaaaaaaaaa.bbbbbbbbbbbbbbbbbbbb",
mustHave: []string{"auth failed"},
mustMiss: []string{"eyJhbGciOiJIUzI1NiJ9.aaaaaaaaaaaaaaaaaaaa.bbbbbbbbbbbbbbbbbbbb"},
},
{
name: "truncates absurdly long detail to bound payload size",
in: "kimi 500 internal_error: " + strings.Repeat("x", 8000),
mustHave: []string{"kimi 500 internal_error"},
mustMiss: []string{strings.Repeat("x", 5000)}, // 5000 in a row must NOT survive — cap is 4096
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := sanitizeErrorDetailForBroadcast(tc.in)
for _, s := range tc.mustHave {
if !strings.Contains(got, s) {
t.Errorf("expected %q to survive scrub, got: %q", s, got)
}
}
for _, s := range tc.mustMiss {
if strings.Contains(got, s) {
t.Errorf("expected %q to be scrubbed, got: %q", s, got)
}
}
})
}
}
// TestLogActivity_Broadcast_ErrorDetailIsSanitized pins the integration
// of the scrubber into the broadcast path: if an upstream caller
// somehow passes through an error_detail with a secret-shaped token,
// the wire payload (what reaches the canvas WebSocket) must already
// be scrubbed. Defense in depth — the runtime should never let this
// happen, but the canvas is the trust boundary, not the runtime.
func TestLogActivity_Broadcast_ErrorDetailIsSanitized(t *testing.T) {
mock := setupTestDB(t)
defer mock.ExpectationsWereMet()
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(1, 1))
cb := &recordingBroadcaster{}
srcID := "ws-source"
// Upstream leaked a token into the detail string. The DB still
// stores the unscrubbed copy (workspace logs are an internal
// audit surface), but the broadcast that reaches the canvas
// must already be sanitized.
detail := "anthropic 401 invalid_api_key: provided key sk-proj-leakedsecretvalueabcdefghij is wrong"
LogActivity(context.Background(), cb, ActivityParams{
WorkspaceID: "ws-source",
ActivityType: "a2a_receive",
SourceID: &srcID,
Status: "error",
ErrorDetail: &detail,
})
if len(cb.calls) != 1 {
t.Fatalf("expected 1 broadcast, got %d", len(cb.calls))
}
got, _ := cb.calls[0].payload["error_detail"].(string)
if strings.Contains(got, "sk-proj-leakedsecretvalueabcdefghij") {
t.Errorf("broadcast leaked secret-shaped token: %q", got)
}
if !strings.Contains(got, "invalid_api_key") {
t.Errorf("scrubber over-redacted: lost the actionable code from %q", got)
}
}
// TestLogActivityTx_DefersBroadcastUntilCommitHook pins the #149
// contract: LogActivityTx returns a commitHook that the caller MUST
// invoke after tx.Commit(); the broadcast MUST NOT fire from inside
@@ -1,6 +1,9 @@
package handlers
import (
"log"
"os"
"path/filepath"
"regexp"
"strings"
)
@@ -17,6 +20,17 @@ var gitIdentitySlugPattern = regexp.MustCompile(`[^a-z0-9]+`)
// docs/authorship.md (when it exists).
const gitIdentityEmailDomain = "agents.moleculesai.app"
// gitAskpassHelperPath is the in-container path of the askpass helper
// installed by every workspace runtime image (workspace/Dockerfile in
// molecule-core; scripts/git-askpass.sh → /usr/local/bin/molecule-askpass
// in each external template-* repo). The helper reads GIT_HTTP_USERNAME
// / GIT_HTTP_PASSWORD (falling back to GITEA_USER / GITEA_TOKEN) from
// env and emits them on the git credential-prompt protocol. Setting
// GIT_ASKPASS to this path is what wires container-side HTTPS git auth
// to the persona credentials already arriving via workspace_secrets,
// with no on-disk .gitconfig / .git-credentials mutation required.
const gitAskpassHelperPath = "/usr/local/bin/molecule-askpass"
// applyAgentGitIdentity sets GIT_AUTHOR_* / GIT_COMMITTER_* env vars so
// every commit from this workspace container carries a distinct author
// in `git log` and `git blame`. Git reads these env vars before falling
@@ -50,6 +64,125 @@ func applyAgentGitIdentity(envVars map[string]string, workspaceName string) {
setIfEmpty(envVars, "GIT_AUTHOR_EMAIL", authorEmail)
setIfEmpty(envVars, "GIT_COMMITTER_NAME", authorName)
setIfEmpty(envVars, "GIT_COMMITTER_EMAIL", authorEmail)
applyGitAskpass(envVars)
}
// applyGitAskpass points git at the in-image askpass helper so that any
// HTTPS git operation against a remote without a pre-configured
// credential.helper picks up the persona credentials already present in
// the container env (GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD, or
// GITEA_USER / GITEA_TOKEN as fallback — the latter pair is what
// loadPersonaEnvFile delivers from the operator-host bootstrap kit).
//
// Idempotent: if GIT_ASKPASS is already set (e.g. by an operator-
// supplied workspace_secret or an env-mutator plugin), the existing
// value wins. This lets a workspace opt out by setting GIT_ASKPASS=""
// or pointing at a different helper.
//
// No vendor-specific behaviour lives in this function — the host the
// credentials apply to is determined entirely by the deployer choosing
// when to populate GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD (or
// GITEA_USER / GITEA_TOKEN). The helper script itself is generic and
// has no hardcoded hostnames, so it's safe to ship inside the
// open-source workspace template images alongside the platform-managed
// claude-code image.
func applyGitAskpass(envVars map[string]string) {
if envVars == nil {
return
}
setIfEmpty(envVars, "GIT_ASKPASS", gitAskpassHelperPath)
}
// applyAgentGitHTTPCreds reads the persona's HTTPS git credential from
// the operator-host bootstrap dir and injects it as GIT_HTTP_USERNAME /
// GIT_HTTP_PASSWORD so the in-container askpass helper can emit it on
// git's auth challenge.
//
// Why a dedicated env-var pair instead of reusing GITEA_USER / GITEA_TOKEN:
// the provisioner's forensic #145 denylist (provisioner.scmWriteTokenKeys)
// strips any env var named GITEA_TOKEN / GITHUB_TOKEN / GH_TOKEN /
// GITLAB_TOKEN / GL_TOKEN / BITBUCKET_TOKEN from tenant container env
// before docker run. That denylist is by exact key name, so the same
// token survives transport when shipped under the generic
// GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD names that the askpass helper
// reads first (scripts/git-askpass.sh in each template-*). The username
// half stays an identifier (the persona's Gitea login), the password
// half carries the bytes from the persona token file.
//
// The fallback pair GITEA_USER / GITEA_TOKEN is ALSO set — GITEA_USER
// survives the denylist (it's an identity, not a credential) and
// GITEA_TOKEN is the no-op write that buildContainerEnv will drop.
// Both pairs in lockstep means the askpass helper's GIT_HTTP_*-first /
// GITEA_*-fallback chain works regardless of which lane lands first in
// the container env on any future provisioner refactor.
//
// Idempotent: existing GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD keys are
// preserved. Operator-supplied workspace_secrets win over the persona
// token file by virtue of running BEFORE this helper in
// prepareProvisionContext.
//
// Silent no-op when:
// - personaKey is empty (no role → no persona dir to consult)
// - personaKey fails the safe-segment check (defense-in-depth against
// a crafted role escaping the persona dir)
// - the persona token file does not exist or is empty (legitimate
// case for personas that don't ship a git-write credential — e.g.
// read-only PM/Reviewer/Researcher identities or a partially-
// provisioned bootstrap)
//
// No vendor-specific behaviour: this function reads bytes from a path
// and emits them as the standard askpass env-var pair. The host the
// credential applies to is determined by the deployer choosing which
// remote to push to — the askpass helper has no hardcoded hostnames.
func applyAgentGitHTTPCreds(envVars map[string]string, personaKey string) {
if envVars == nil {
return
}
personaKey = strings.TrimSpace(personaKey)
if !isSafeRoleName(personaKey) {
// Silent no-op for empty / unsafe keys — same shape as
// loadPersonaTokenFile. Descriptive-role payloads (multi-word
// "Frontend Engineer" etc.) take this branch and pick up
// creds via workspace_secrets / org-import persona-env merge,
// not the direct persona-token file path.
return
}
root := os.Getenv("MOLECULE_PERSONA_ROOT")
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
tokenPath := filepath.Join(root, personaKey, "token")
data, err := os.ReadFile(tokenPath)
if err != nil {
// Persona dir / file absent: legitimate for the host shapes
// that don't ship the bootstrap kit (dev laptops, CI nodes)
// or for personas that intentionally carry no git-write
// credential. Caller decides whether the resulting
// "Authentication failed" at first push is a configuration
// error or expected behaviour.
return
}
token := strings.TrimSpace(string(data))
if token == "" {
return
}
// Primary lane — survives forensic #145 by virtue of the generic
// GIT_HTTP_* names not being on the SCM-write denylist.
setIfEmpty(envVars, "GIT_HTTP_USERNAME", personaKey)
setIfEmpty(envVars, "GIT_HTTP_PASSWORD", token)
// Fallback lane — askpass reads GITEA_USER / GITEA_TOKEN second.
// GITEA_USER survives the denylist; GITEA_TOKEN will be stripped
// by buildContainerEnv but is set here for completeness so the
// (envVars map[string]string) contract is consistent for callers
// inspecting it before the provisioner-level filter runs (e.g.
// the env-mutator plugin chain).
setIfEmpty(envVars, "GITEA_USER", personaKey)
setIfEmpty(envVars, "GITEA_TOKEN", token)
log.Printf("applyAgentGitHTTPCreds: injected GIT_HTTP_USERNAME/PASSWORD for persona %q (token %d bytes)", personaKey, len(token))
}
// slugifyForEmail collapses a workspace name to a safe email localpart:
@@ -1,6 +1,8 @@
package handlers
import (
"os"
"path/filepath"
"testing"
)
@@ -75,6 +77,261 @@ func TestApplyAgentGitIdentity_NilMapIsSafe(t *testing.T) {
applyAgentGitIdentity(nil, "PM")
}
func TestApplyAgentGitIdentity_SetsGitAskpass(t *testing.T) {
// GIT_ASKPASS is what wires container-side HTTPS git auth to the
// persona credentials (GITEA_USER/GITEA_TOKEN, etc.) that
// loadPersonaEnvFile delivers via workspace_secrets. Without this,
// `git push` inside the container would fall through to interactive
// prompts (impossible) or a missing credential.helper (401).
env := map[string]string{}
applyAgentGitIdentity(env, "Frontend Engineer")
if env["GIT_ASKPASS"] != "/usr/local/bin/molecule-askpass" {
t.Errorf("GIT_ASKPASS: got %q, want %q",
env["GIT_ASKPASS"], "/usr/local/bin/molecule-askpass")
}
}
func TestApplyAgentGitIdentity_RespectsAskpassOverride(t *testing.T) {
// A workspace_secret or env-mutator plugin must be able to point at
// a custom askpass helper without us clobbering it. Symmetric with
// the GIT_AUTHOR_NAME override test above.
env := map[string]string{
"GIT_ASKPASS": "/opt/custom/askpass",
}
applyAgentGitIdentity(env, "Backend Engineer")
if env["GIT_ASKPASS"] != "/opt/custom/askpass" {
t.Errorf("GIT_ASKPASS should not be overwritten, got %q", env["GIT_ASKPASS"])
}
}
func TestApplyAgentGitIdentity_AskpassSkippedOnEmptyName(t *testing.T) {
// The empty-name early-return covers GIT_ASKPASS too — a provisioning
// glitch that dropped the workspace name shouldn't half-configure the
// container (identity vars empty but askpass wired). All-or-nothing.
env := map[string]string{}
applyAgentGitIdentity(env, "")
if _, ok := env["GIT_ASKPASS"]; ok {
t.Errorf("empty name should not set GIT_ASKPASS, got %q", env["GIT_ASKPASS"])
}
}
func TestApplyGitAskpass_NilMapIsSafe(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Errorf("applyGitAskpass panicked on nil map: %v", r)
}
}()
applyGitAskpass(nil)
}
// TestApplyAgentGitHTTPCreds_HappyPath: the prod-team shape — a persona
// dir at /etc/molecule-bootstrap/personas/<role>/token ships a write
// token. applyAgentGitHTTPCreds reads it and emits both the
// askpass-preferred GIT_HTTP_* pair and the GITEA_* fallback.
func TestApplyAgentGitHTTPCreds_HappyPath(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-bytes-redacted\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-a")
cases := map[string]string{
"GIT_HTTP_USERNAME": "agent-dev-a",
"GIT_HTTP_PASSWORD": "token-bytes-redacted",
"GITEA_USER": "agent-dev-a",
"GITEA_TOKEN": "token-bytes-redacted",
}
for k, want := range cases {
if got := env[k]; got != want {
t.Errorf("%s: got %q, want %q", k, got, want)
}
}
}
// TestApplyAgentGitHTTPCreds_TrimsWhitespace: bootstrap-kit-written
// token files canonically end in \n. Must trim like loadPersonaTokenFile
// does — Gitea PAT validator rejects embedded whitespace.
func TestApplyAgentGitHTTPCreds_TrimsWhitespace(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-b")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("\n raw-token-bytes \n\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-b")
if env["GIT_HTTP_PASSWORD"] != "raw-token-bytes" {
t.Errorf("GIT_HTTP_PASSWORD: token whitespace not trimmed; got %q", env["GIT_HTTP_PASSWORD"])
}
}
// TestApplyAgentGitHTTPCreds_RespectsOperatorOverride: if a workspace
// secret (loaded earlier by loadWorkspaceSecrets) already set the
// askpass pair, those values must win — operator intent ranks above
// persona-file defaults. Symmetric with applyAgentGitIdentity's
// GIT_AUTHOR_* override semantics.
func TestApplyAgentGitHTTPCreds_RespectsOperatorOverride(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("file-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{
"GIT_HTTP_USERNAME": "operator-user",
"GIT_HTTP_PASSWORD": "operator-secret",
}
applyAgentGitHTTPCreds(env, "agent-dev-a")
if env["GIT_HTTP_USERNAME"] != "operator-user" {
t.Errorf("GIT_HTTP_USERNAME should not be overwritten, got %q", env["GIT_HTTP_USERNAME"])
}
if env["GIT_HTTP_PASSWORD"] != "operator-secret" {
t.Errorf("GIT_HTTP_PASSWORD should not be overwritten, got %q", env["GIT_HTTP_PASSWORD"])
}
// Fallback pair was not pre-set, so persona-file fills it in.
if env["GITEA_TOKEN"] != "file-token" {
t.Errorf("GITEA_TOKEN fallback should be filled, got %q", env["GITEA_TOKEN"])
}
}
// TestApplyAgentGitHTTPCreds_EmptyKeyIsNoop: a workspace with an empty
// payload.Role (descriptive multi-word role, or no role) must take the
// silent-no-op branch — no FS read, no env keys touched.
func TestApplyAgentGitHTTPCreds_EmptyKeyIsNoop(t *testing.T) {
root := t.TempDir()
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "")
if len(env) != 0 {
t.Errorf("empty persona key should leave env untouched, got %v", env)
}
applyAgentGitHTTPCreds(env, " ")
if len(env) != 0 {
t.Errorf("whitespace persona key should leave env untouched, got %v", env)
}
applyAgentGitHTTPCreds(env, "Frontend Engineer")
if len(env) != 0 {
t.Errorf("multi-word descriptive role should leave env untouched (silent no-op via isSafeRoleName), got %v", env)
}
}
// TestApplyAgentGitHTTPCreds_MissingTokenFile: persona dir exists but
// ships no token (legitimate for read-only personas like agent-pm pre-
// CTO-cred or partially-provisioned bootstrap). Silent no-op — no env
// keys set so first push surfaces "Authentication failed" cleanly
// instead of half-configured creds.
func TestApplyAgentGitHTTPCreds_MissingTokenFile(t *testing.T) {
root := t.TempDir()
if err := os.MkdirAll(filepath.Join(root, "agent-pm"), 0o755); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-pm")
if len(env) != 0 {
t.Errorf("missing token file should leave env untouched, got %v", env)
}
}
// TestApplyAgentGitHTTPCreds_EmptyTokenIsNoop: a whitespace-only token
// file (botched bootstrap) must be treated as absent — never emit
// GIT_HTTP_PASSWORD="" because the askpass helper would then return
// empty on the password prompt and git would surface a confusing 401
// rather than a clean "no credentials" state.
func TestApplyAgentGitHTTPCreds_EmptyTokenIsNoop(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte(" \t\n \n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-a")
if len(env) != 0 {
t.Errorf("whitespace-only token should leave env untouched, got %v", env)
}
}
// TestApplyAgentGitHTTPCreds_RejectsUnsafeRole: defense-in-depth — a
// crafted role with path separators / "../" must NOT touch the FS,
// even if a token file exists at the traversed location.
func TestApplyAgentGitHTTPCreds_RejectsUnsafeRole(t *testing.T) {
root := t.TempDir()
// Plant a token at <root>/token so a successful traversal would land here.
if err := os.WriteFile(filepath.Join(root, "token"),
[]byte("stolen-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", filepath.Join(root, "personas"))
for _, bad := range []string{"..", "../personas", "/abs", "with/slash", "."} {
env := map[string]string{}
applyAgentGitHTTPCreds(env, bad)
if len(env) != 0 {
t.Errorf("unsafe role %q must leave env untouched, got %v", bad, env)
}
}
}
// TestApplyAgentGitHTTPCreds_NilMapIsSafe: defensive — never panic
// on a nil map. Symmetric with applyAgentGitIdentity's nil-map test.
func TestApplyAgentGitHTTPCreds_NilMapIsSafe(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Errorf("applyAgentGitHTTPCreds panicked on nil map: %v", r)
}
}()
applyAgentGitHTTPCreds(nil, "agent-dev-a")
}
// TestApplyAgentGitHTTPCreds_DefaultPersonaRoot: when
// MOLECULE_PERSONA_ROOT is unset, the helper falls back to
// /etc/molecule-bootstrap/personas — the canonical operator-host path
// per the bootstrap kit shape. We can't write into /etc in a test,
// but we CAN assert the helper takes the silent-no-op branch when
// that real path is absent (the prod-default case on a dev laptop).
func TestApplyAgentGitHTTPCreds_DefaultPersonaRoot(t *testing.T) {
t.Setenv("MOLECULE_PERSONA_ROOT", "")
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-a")
// The /etc/molecule-bootstrap/personas/agent-dev-a/token path
// almost certainly does not exist on a dev/CI host. The contract
// here is "silent no-op when token unreadable", not "exact env
// state" — so we only assert no panic + no half-state pair.
if _, ok := env["GIT_HTTP_USERNAME"]; ok {
if _, ok2 := env["GIT_HTTP_PASSWORD"]; !ok2 {
t.Errorf("USERNAME set without PASSWORD — half-state; got %v", env)
}
}
}
func TestSlugifyForEmail(t *testing.T) {
cases := []struct {
in, want string
@@ -24,17 +24,30 @@ import (
// BuildExternalConnectionPayload assembles the gin.H payload that the
// canvas's ExternalConnectModal consumes. Pure data — caller owns DB
// reads (workspace_id) and token minting (auth_token).
// reads (workspace_id, workspace_name) and token minting (auth_token).
//
// authToken may be empty for the read-only "show instructions again"
// path; the modal masks the field in that case rather than displaying
// an empty string.
func BuildExternalConnectionPayload(platformURL, workspaceID, authToken string) gin.H {
//
// workspaceName feeds the per-workspace MCP server-name in the snippets
// that wire molecule-mcp into an external Claude Code (or other
// MCP-stdio) client. Without a unique server name a second
// `claude mcp add molecule` call REPLACES the first entry, collapsing
// multi-workspace use into a single per-session slot — see
// mcpServerNameForWorkspace below. May be empty (re-show / rotate paths
// that don't plumb the name); the helper falls back to the workspace
// ID's short prefix so the snippet is always unique.
func BuildExternalConnectionPayload(platformURL, workspaceID, workspaceName, authToken string) gin.H {
pURL := strings.TrimSuffix(platformURL, "/")
mcpName := mcpServerNameForWorkspace(workspaceID, workspaceName)
stamp := func(tmpl string) string {
return strings.ReplaceAll(
strings.ReplaceAll(tmpl, "{{PLATFORM_URL}}", pURL),
"{{WORKSPACE_ID}}", workspaceID,
strings.ReplaceAll(
strings.ReplaceAll(tmpl, "{{PLATFORM_URL}}", pURL),
"{{WORKSPACE_ID}}", workspaceID,
),
"{{MCP_SERVER_NAME}}", mcpName,
)
}
return gin.H{
@@ -77,6 +90,81 @@ func externalPlatformURL(c *gin.Context) string {
return scheme + "://" + host
}
// mcpServerNameForWorkspace derives the unique MCP server name used in
// the Universal MCP snippet's `claude mcp add <name> -- ...` line.
//
// Why per-workspace, not a fixed "molecule": `claude mcp add` keys
// entries by name in ~/.claude.json, so re-running with the same name
// silently REPLACES the previous entry. A single external Claude Code
// session that connects to N molecule workspaces must therefore use N
// distinct server names — otherwise the second install collapses the
// first, and the user experiences "MCP is per-session". MCP itself
// supports many servers per session; the install-snippet name was the
// only thing standing in the way.
//
// Pattern: "molecule-<slug>" where slug comes from the workspace name
// (lowercased, non-alphanumeric → hyphen, collapsed, trimmed, <=24
// chars). Falls back to the workspace ID's first 8 chars when the name
// is empty or slugifies to nothing — both produce a deterministic,
// Claude-Code-name-safe (alphanumeric + hyphens, no spaces / dots /
// slashes) identifier that disambiguates per-workspace.
//
// Two workspaces with identical names still produce identical slugs by
// design — the user picked them to look the same. The
// `claude mcp add` step will overwrite the older one in that case;
// the workaround is to rename one, then re-run. Documented in the
// snippet header so users aren't surprised.
func mcpServerNameForWorkspace(workspaceID, workspaceName string) string {
const fallbackIDPrefixLen = 8
const maxSlugLen = 24
slug := slugifyForMcpName(workspaceName, maxSlugLen)
if slug == "" {
id := strings.ReplaceAll(workspaceID, "-", "")
if len(id) > fallbackIDPrefixLen {
id = id[:fallbackIDPrefixLen]
}
slug = id
}
if slug == "" {
// Defensive: empty workspaceID at this layer means the caller
// is misusing the API; we still return a usable (non-colliding
// in the common case) constant rather than producing "molecule-"
// which Claude Code would reject.
return "molecule"
}
return "molecule-" + slug
}
// slugifyForMcpName lowercases, replaces non-[a-z0-9] runs with a single
// '-', trims leading/trailing '-', and truncates to maxLen. Returns ""
// if nothing usable remains. Pure helper; no allocations beyond the
// builder.
func slugifyForMcpName(s string, maxLen int) string {
var b strings.Builder
b.Grow(len(s))
lastHyphen := true // suppress leading hyphens
for _, r := range s {
switch {
case r >= 'A' && r <= 'Z':
b.WriteRune(r + ('a' - 'A'))
lastHyphen = false
case (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9'):
b.WriteRune(r)
lastHyphen = false
default:
if !lastHyphen {
b.WriteByte('-')
lastHyphen = true
}
}
}
out := strings.TrimRight(b.String(), "-")
if len(out) > maxLen {
out = strings.TrimRight(out[:maxLen], "-")
}
return out
}
// externalCurlTemplate — zero-dependency register snippet. Placeholders:
// - {{PLATFORM_URL}}, {{WORKSPACE_ID}} — filled server-side
// - $WORKSPACE_AUTH_TOKEN — env var, operator sets
@@ -216,6 +304,14 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# for any MCP-aware runtime (Claude Code, hermes, codex, etc.).
# Pair with the Claude Code or Python SDK tab if your runtime needs
# inbound A2A delivery (canvas messages → agent conversation turns).
#
# Multi-workspace: MCP supports many servers per Claude Code session.
# This snippet uses a workspace-specific server name ({{MCP_SERVER_NAME}})
# so installing for a second workspace ADDS another entry instead of
# overwriting the first — run the snippet from each workspace's modal
# in turn and ` + "`claude mcp list`" + ` will show all of them. If two
# workspaces have the same name, slugs collide and the second install
# overwrites the first; rename one workspace to disambiguate.
# Requires Python >= 3.11. On 3.10 or older pip says
# "Could not find a version that satisfies the requirement
@@ -224,11 +320,14 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# Upgrade the interpreter (brew install python@3.12 / apt install
# python3.12 / etc.) or use a 3.11+ venv.
# 1. Install the workspace runtime wheel:
# 1. Install the workspace runtime wheel (once per machine — safe to
# re-run; subsequent workspaces share the same wheel):
pip install molecule-ai-workspace-runtime
# 2. Wire molecule-mcp into your agent's MCP config. Claude Code:
claude mcp add molecule -s user -- env \
# NOTE the server name is workspace-specific ("{{MCP_SERVER_NAME}}") so
# multiple molecule workspaces co-exist in one Claude Code session.
claude mcp add {{MCP_SERVER_NAME}} -s user -- env \
WORKSPACE_ID={{WORKSPACE_ID}} \
PLATFORM_URL={{PLATFORM_URL}} \
MOLECULE_WORKSPACE_TOKEN="<paste from create response>" \
@@ -249,8 +348,11 @@ claude mcp add molecule -s user -- env \
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • "Tools not appearing in your agent" — run ` + "`claude mcp list`" + ` (or
# your runtime's equivalent) and confirm the molecule entry. If
# missing, re-run the ` + "`claude mcp add`" + ` line above.
# your runtime's equivalent) and confirm the {{MCP_SERVER_NAME}} entry.
# If missing, re-run the ` + "`claude mcp add`" + ` line above.
# • "Connecting a second workspace overwrote the first" — re-check that
# the server name in the line above is {{MCP_SERVER_NAME}} (not a bare
# "molecule"); each workspace's modal generates a distinct name.
# • "ConnectionRefused / DNS error on first call" — PLATFORM_URL must
# include the scheme (https://) and have NO trailing slash. Verify
# with: curl ${PLATFORM_URL}/healthz
@@ -331,6 +433,13 @@ const externalHermesChannelTemplate = `# Hermes channel — bridges this workspa
# hermes-agent session. No tunnel/public URL needed (long-poll based,
# same shape as the Claude Code channel).
#
# Multi-workspace: each workspace's plugin_platforms entry is keyed by a
# workspace-specific slug ("{{MCP_SERVER_NAME}}") so two molecule
# workspaces can coexist in one hermes config — YAML rejects duplicate
# mapping keys, so re-using the same "molecule:" key for a second
# workspace would silently overwrite the first. Re-running this snippet
# for another workspace ADDS a sibling entry instead.
#
# Prereq: a hermes-agent install on the target machine. Latest builds
# (post #17751) ship the platform-plugin API natively; older ones are
# also supported via the plugin's dual-mode fallback.
@@ -345,13 +454,17 @@ export MOLECULE_PLATFORM_URL={{PLATFORM_URL}}
export MOLECULE_WORKSPACE_TOKEN="<paste from create response>"
# 3. Edit ~/.hermes/config.yaml — under your existing top-level
# gateway: block, add a plugin_platforms entry:
# gateway: block, add a plugin_platforms entry. The platform key
# ({{MCP_SERVER_NAME}}) is workspace-specific so multiple molecule
# workspaces coexist; re-using the same key for a second workspace
# would silently overwrite the first (YAML duplicate-key collapse):
#
# gateway:
# # ...your existing gateway settings...
# plugin_platforms:
# molecule:
# {{MCP_SERVER_NAME}}:
# enabled: true
# workspace_id: {{WORKSPACE_ID}}
#
# If you don't yet have a gateway: block, create one with just
# that plugin_platforms entry. Don't append blindly — YAML
@@ -404,6 +517,14 @@ hermes gateway --replace
const externalCodexTemplate = `# Codex external setup — outbound tools (MCP) + inbound push (bridge).
# For operators whose external agent is a codex CLI (@openai/codex)
# session.
#
# Multi-workspace: the TOML table name is workspace-specific
# ("{{MCP_SERVER_NAME}}") so two molecule workspaces can coexist in one
# ~/.codex/config.toml — TOML rejects duplicate
# [mcp_servers.<name>] tables, so re-using a bare "molecule" name for a
# second workspace would either break codex parsing or silently
# overwrite the first. Re-running this snippet for another workspace
# ADDS a sibling table instead.
# 1. Install codex CLI, the workspace runtime, and the bridge daemon:
npm install -g @openai/codex@latest
@@ -412,23 +533,21 @@ pip install codex-channel-molecule
# 2. Wire the molecule MCP server into codex's config.toml — this is
# the OUTBOUND path (codex calls list_peers / delegate_task /
# send_message_to_user / commit_memory).
#
# Don't append blindly — TOML rejects duplicate
# [mcp_servers.molecule] tables, so re-running on an existing
# config will break codex parsing. If [mcp_servers.molecule]
# already exists (e.g. you set this up before), replace the
# existing block instead of appending.
# send_message_to_user / commit_memory). The table name
# ({{MCP_SERVER_NAME}}) is workspace-specific; re-running the
# snippet for a DIFFERENT workspace appends a sibling table without
# touching the first. Re-running for the SAME workspace produces
# the same name, so replace the existing block instead of appending.
mkdir -p ~/.codex
# (then open ~/.codex/config.toml in your editor and paste:)
#
# [mcp_servers.molecule]
# [mcp_servers.{{MCP_SERVER_NAME}}]
# command = "molecule-mcp"
# args = []
# startup_timeout_sec = 30
#
# [mcp_servers.molecule.env]
# [mcp_servers.{{MCP_SERVER_NAME}}.env]
# WORKSPACE_ID = "{{WORKSPACE_ID}}"
# PLATFORM_URL = "{{PLATFORM_URL}}"
# MOLECULE_WORKSPACE_TOKEN = "<paste from create response>"
@@ -472,11 +591,13 @@ codex
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • [mcp_servers.molecule] not loaded — codex must be ≥ 0.57.
# • [mcp_servers.{{MCP_SERVER_NAME}}] not loaded — codex must be ≥ 0.57.
# Check with ` + "`codex --version`" + `; upgrade via npm install -g @openai/codex@latest.
# • TOML parse error after re-running setup — TOML rejects duplicate
# [mcp_servers.molecule] tables. Open ~/.codex/config.toml and
# remove the old block before pasting the new one.
# • TOML parse error after re-running setup for the SAME workspace —
# TOML rejects duplicate [mcp_servers.<name>] tables. Open
# ~/.codex/config.toml and remove the old block before pasting the
# new one. (A second molecule workspace gets a DIFFERENT table
# name, so coexisting workspaces don't conflict.)
# • Canvas messages don't wake codex — step 3 (codex-channel-molecule
# bridge daemon) is required for inbound push. Check
# pgrep -f codex-channel-molecule and tail ~/.codex-channel-molecule/daemon.log.
@@ -502,23 +623,23 @@ const externalKimiTemplate = `# Kimi CLI external setup — register + heartbeat
pip install molecule-ai-workspace-runtime
# 2. Save credentials and the bridge script:
mkdir -p ~/.molecule-ai/kimi-workspace
chmod 700 ~/.molecule-ai/kimi-workspace
cat > ~/.molecule-ai/kimi-workspace/env <<'EOF'
mkdir -p ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}
chmod 700 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}
cat > ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env <<'EOF'
WORKSPACE_ID={{WORKSPACE_ID}}
PLATFORM_URL={{PLATFORM_URL}}
MOLECULE_WORKSPACE_TOKEN=<paste from create response>
EOF
chmod 600 ~/.molecule-ai/kimi-workspace/env
chmod 600 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env
cat > ~/.molecule-ai/kimi-workspace/kimi_bridge.py <<'PYEOF'
cat > ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py <<'PYEOF'
#!/usr/bin/env python3
"""Kimi bridge — keeps workspace online and polls for canvas messages."""
import json, logging, time
from pathlib import Path
import httpx
ENV = Path.home() / ".molecule-ai" / "kimi-workspace" / "env"
ENV = Path.home() / ".molecule-ai" / "kimi-{{MCP_SERVER_NAME}}" / "env"
HEARTBEAT_INTERVAL = 20
POLL_INTERVAL = 5
@@ -608,10 +729,10 @@ def main():
if __name__ == "__main__":
main()
PYEOF
chmod +x ~/.molecule-ai/kimi-workspace/kimi_bridge.py
chmod +x ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py
# 3. Start the bridge (run in a persistent terminal or via launchd):
python3 ~/.molecule-ai/kimi-workspace/kimi_bridge.py
python3 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py
# What the script does:
# • Registers the workspace in poll mode (no public URL needed)
@@ -622,7 +743,7 @@ python3 ~/.molecule-ai/kimi-workspace/kimi_bridge.py
# To change the reply logic, edit the send_reply() call inside the loop.
# To send a one-off reply from another terminal:
# curl -fsS -X POST "{{PLATFORM_URL}}/workspaces/{{WORKSPACE_ID}}/notify" \
# -H "Authorization: Bearer $(cat ~/.molecule-ai/kimi-workspace/env | grep TOKEN | cut -d= -f2)" \
# -H "Authorization: Bearer $(cat ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env | grep TOKEN | cut -d= -f2)" \
# -H "Content-Type: application/json" \
# -d '{"message":"Hello from Kimi"}'
#
@@ -644,6 +765,13 @@ const externalOpenClawTemplate = `# OpenClaw MCP config — outbound tool path.
# sessions.steer push path; an external setup would need the same
# bridge daemon the template uses. For inbound delivery on an
# external machine today, pair with the Python SDK tab.
#
# Multi-workspace: each workspace registers under a workspace-specific
# MCP server name ("{{MCP_SERVER_NAME}}"). openclaw keys MCP servers by
# name in its config (~/.openclaw/mcp/<name>.json), so re-running with
# a bare "molecule" name would overwrite the prior workspace's entry.
# Re-run this snippet for another workspace to ADD a sibling entry
# instead.
# 1. Install openclaw CLI + the workspace runtime wheel:
# The version pin (>=0.1.999) ensures the "molecule-mcp" console
@@ -674,7 +802,7 @@ pip install "molecule-ai-workspace-runtime>=0.1.999"
# workspace as awaiting_agent (OFFLINE) within 60-90s even while
# tools work.
WORKSPACE_TOKEN="<paste from create response>"
openclaw mcp set molecule "$(cat <<EOF
openclaw mcp set {{MCP_SERVER_NAME}} "$(cat <<EOF
{
"command": "molecule-mcp",
"args": [],
@@ -704,6 +832,6 @@ openclaw agent --message "list my peers"
# • Gateway not starting — tail ~/.openclaw/gateway.log. The loopback
# bind requires :18789 to be free; check with ` + "`lsof -iTCP:18789`" + `.
# • ` + "`openclaw mcp set`" + ` rejected — the heredoc generates JSON;
# verify with ` + "`jq < ~/.openclaw/mcp/molecule.json`" + ` and re-run
# verify with ` + "`jq < ~/.openclaw/mcp/{{MCP_SERVER_NAME}}.json`" + ` and re-run
# ` + "`openclaw mcp set`" + ` if the file is malformed.
`
@@ -52,7 +52,7 @@ func (h *WorkspaceHandler) RotateExternalCredentials(c *gin.Context) {
}
ctx := c.Request.Context()
runtime, err := lookupWorkspaceRuntime(ctx, db.DB, id)
runtime, name, err := lookupWorkspaceRuntimeAndName(ctx, db.DB, id)
if errors.Is(err, sql.ErrNoRows) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
@@ -108,7 +108,7 @@ func (h *WorkspaceHandler) RotateExternalCredentials(c *gin.Context) {
platformURL := externalPlatformURL(c)
c.JSON(http.StatusOK, gin.H{
"connection": BuildExternalConnectionPayload(platformURL, id, tok),
"connection": BuildExternalConnectionPayload(platformURL, id, name, tok),
})
}
@@ -129,7 +129,7 @@ func (h *WorkspaceHandler) GetExternalConnection(c *gin.Context) {
}
ctx := c.Request.Context()
runtime, err := lookupWorkspaceRuntime(ctx, db.DB, id)
runtime, name, err := lookupWorkspaceRuntimeAndName(ctx, db.DB, id)
if errors.Is(err, sql.ErrNoRows) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
@@ -149,16 +149,20 @@ func (h *WorkspaceHandler) GetExternalConnection(c *gin.Context) {
platformURL := externalPlatformURL(c)
c.JSON(http.StatusOK, gin.H{
"connection": BuildExternalConnectionPayload(platformURL, id, ""),
"connection": BuildExternalConnectionPayload(platformURL, id, name, ""),
})
}
// lookupWorkspaceRuntime returns the workspace's runtime field. Wrapped
// for readability + so tests can mock the single SELECT.
func lookupWorkspaceRuntime(ctx context.Context, handle *sql.DB, id string) (string, error) {
var runtime string
err := handle.QueryRowContext(ctx, `
SELECT COALESCE(runtime, '') FROM workspaces WHERE id = $1
`, id).Scan(&runtime)
return runtime, err
// lookupWorkspaceRuntimeAndName returns runtime + name in one round-trip.
// Wrapped for readability + so tests can mock the single SELECT.
// Used by rotate / re-show paths: runtime gates the external-only check;
// name feeds the per-workspace MCP server slug in BuildExternalConnectionPayload
// (so the Universal MCP snippet uses a stable per-workspace name instead
// of overwriting prior `claude mcp add molecule` entries).
// Returns sql.ErrNoRows when the workspace doesn't exist.
func lookupWorkspaceRuntimeAndName(ctx context.Context, handle *sql.DB, id string) (runtime, name string, err error) {
err = handle.QueryRowContext(ctx, `
SELECT COALESCE(runtime, ''), COALESCE(name, '') FROM workspaces WHERE id = $1
`, id).Scan(&runtime, &name)
return runtime, name, err
}
@@ -35,9 +35,9 @@ func TestRotateExternalCredentials_HappyPath(t *testing.T) {
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
// 1. Runtime lookup
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-ext").
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("external"))
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("external", "test-ws"))
// 2. Revoke all live tokens
mock.ExpectExec(`UPDATE workspace_auth_tokens`).
@@ -98,9 +98,9 @@ func TestRotateExternalCredentials_RejectsNonExternal(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-hermes").
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("hermes"))
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("hermes", "test-ws"))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -129,9 +129,9 @@ func TestRotateExternalCredentials_NotFound(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-missing").
WillReturnRows(sqlmock.NewRows([]string{"runtime"})) // no rows
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"})) // no rows
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -172,9 +172,9 @@ func TestGetExternalConnection_HappyPathReturnsBlankToken(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-ext").
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("external"))
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("external", "test-ws"))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -211,9 +211,9 @@ func TestGetExternalConnection_RejectsNonExternal(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-claude").
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("claude-code", "test-ws"))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -233,9 +233,9 @@ func TestGetExternalConnection_NotFound(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-missing").
WillReturnRows(sqlmock.NewRows([]string{"runtime"}))
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -253,7 +253,7 @@ func TestGetExternalConnection_NotFound(t *testing.T) {
// ---------- BuildExternalConnectionPayload (pure helper) ----------
func TestBuildExternalConnectionPayload_StampsPlaceholders(t *testing.T) {
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "tok-abc")
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "my-bot", "tok-abc")
if got["workspace_id"] != "ws-7" {
t.Errorf("workspace_id: %v", got["workspace_id"])
@@ -267,6 +267,18 @@ func TestBuildExternalConnectionPayload_StampsPlaceholders(t *testing.T) {
if got["registry_endpoint"] != "https://platform.test/registry/register" {
t.Errorf("registry_endpoint: %v", got["registry_endpoint"])
}
// Universal MCP snippet must contain a workspace-specific server
// name derived from the workspace name. Without this each new
// `claude mcp add` would overwrite the previous entry in the user's
// ~/.claude.json (servers are keyed by name) — collapsing
// multi-workspace use into one slot. See mcpServerNameForWorkspace.
mcp, _ := got["universal_mcp_snippet"].(string)
if !strings.Contains(mcp, "claude mcp add molecule-my-bot ") {
t.Errorf("universal_mcp_snippet missing per-workspace server name 'molecule-my-bot':\n%s", mcp)
}
if strings.Contains(mcp, "{{MCP_SERVER_NAME}}") {
t.Errorf("universal_mcp_snippet still contains literal {{MCP_SERVER_NAME}}")
}
// {{PLATFORM_URL}} + {{WORKSPACE_ID}} placeholders must be substituted
// out of every snippet — if any snippet still contains a literal
// "{{PLATFORM_URL}}" or "{{WORKSPACE_ID}}", a future template author
@@ -292,7 +304,7 @@ func TestBuildExternalConnectionPayload_TrimsTrailingSlash(t *testing.T) {
// being concatenated into endpoint paths — otherwise the operator
// gets `https://platform.test//registry/register` (double slash) which
// some servers reject as a redirect target.
got := BuildExternalConnectionPayload("https://platform.test/", "ws-7", "")
got := BuildExternalConnectionPayload("https://platform.test/", "ws-7", "", "")
if got["platform_url"] != "https://platform.test" {
t.Errorf("platform_url: trailing slash not trimmed; got %v", got["platform_url"])
}
@@ -304,8 +316,100 @@ func TestBuildExternalConnectionPayload_TrimsTrailingSlash(t *testing.T) {
func TestBuildExternalConnectionPayload_BlankAuthTokenIsAllowed(t *testing.T) {
// Re-show path: auth_token="" is the contract; the modal masks the
// field and labels it "rotate to reveal a new token".
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "")
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "", "")
if got["auth_token"] != "" {
t.Errorf("blank token must propagate as \"\"; got %v", got["auth_token"])
}
}
// TestBuildExternalConnectionPayload_McpServerNameUniquePerWorkspace
// pins the multi-workspace install contract: two distinct workspaces
// must produce two distinct `claude mcp add` server-name lines, or
// installing the second one will overwrite the first entry in the
// user's ~/.claude.json (servers are keyed by name) — collapsing
// multi-workspace use into a single per-session slot, which is the
// "this is per-session" UX the CTO observed 2026-05-18.
func TestBuildExternalConnectionPayload_McpServerNameUniquePerWorkspace(t *testing.T) {
cases := []struct {
name string
workspaceID string
wsName string
wantAddLine string // must appear in universal_mcp_snippet
}{
{"plain name", "id-a", "my-bot", "claude mcp add molecule-my-bot "},
{"name with spaces + caps", "id-b", "My Bot 1", "claude mcp add molecule-my-bot-1 "},
// Symbol/punctuation collapses to single hyphens and trims.
{"name with symbols", "id-c", "--Foo!!Bar--", "claude mcp add molecule-foo-bar "},
// Empty name falls back to the first 8 chars of the (de-hyphenated)
// workspace UUID — keeps the snippet unique per workspace even
// when callers (rotate/re-show pre-name-lookup) pass "".
{"empty name, uuid id", "12345678-aaaa-bbbb-cccc-deadbeef0000", "", "claude mcp add molecule-12345678 "},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := BuildExternalConnectionPayload("https://p.test", tc.workspaceID, tc.wsName, "tok")
mcp, _ := got["universal_mcp_snippet"].(string)
if !strings.Contains(mcp, tc.wantAddLine) {
t.Errorf("missing %q in universal_mcp_snippet:\n%s", tc.wantAddLine, mcp)
}
// Belt + suspenders: never the bare fixed `molecule` name —
// that was the bug. (Match with trailing space so the
// "molecule-…" form passes.)
if strings.Contains(mcp, "claude mcp add molecule ") {
t.Errorf("snippet regressed to fixed `claude mcp add molecule `; got:\n%s", mcp)
}
})
}
}
// TestBuildExternalConnectionPayload_AllRuntimeSnippetsAreWorkspaceUnique
// extends the multi-workspace install contract to every runtime tab in
// the modal. Each MCP-host config keyspace has the SAME equivalence
// class as Claude Code's `claude mcp add <name>`:
//
// - codex: ~/.codex/config.toml [mcp_servers.<name>] — TOML rejects
// duplicate table keys, so a second workspace with the same name
// either breaks parsing or overwrites the first table.
// - openclaw: ~/.openclaw/mcp/<name>.json — file is keyed by <name>,
// `openclaw mcp set <same-name>` overwrites.
// - hermes: ~/.hermes/config.yaml gateway.plugin_platforms.<key>:
// YAML rejects duplicate mapping keys.
// - kimi: ~/.molecule-ai/kimi-<slug>/ per-workspace dir — single
// "kimi-workspace" dir would have both workspaces' envs collide.
//
// All four must therefore stamp the workspace-specific
// {{MCP_SERVER_NAME}} slug. This test catches a future template author
// who introduces a new runtime tab without plumbing the slug.
func TestBuildExternalConnectionPayload_AllRuntimeSnippetsAreWorkspaceUnique(t *testing.T) {
got := BuildExternalConnectionPayload("https://p.test", "id-a", "my-bot", "tok")
// Per-template literal that proves the slug was stamped through.
wantPerSnippet := map[string]string{
"universal_mcp_snippet": "claude mcp add molecule-my-bot ",
"codex_snippet": "[mcp_servers.molecule-my-bot]",
"openclaw_snippet": "openclaw mcp set molecule-my-bot ",
"hermes_channel_snippet": " molecule-my-bot:",
"kimi_snippet": "~/.molecule-ai/kimi-molecule-my-bot",
}
for key, needle := range wantPerSnippet {
v, _ := got[key].(string)
if !strings.Contains(v, needle) {
t.Errorf("%s missing per-workspace slug literal %q:\n%s", key, needle, v)
}
}
// No template should still contain the unstamped placeholder — that
// would mean BuildExternalConnectionPayload's stamp() didn't sweep
// it, which is the regression we're guarding against.
for _, k := range []string{
"curl_register_template", "python_snippet",
"claude_code_channel_snippet", "universal_mcp_snippet",
"hermes_channel_snippet", "codex_snippet", "openclaw_snippet",
"kimi_snippet",
} {
v, _ := got[k].(string)
if strings.Contains(v, "{{MCP_SERVER_NAME}}") {
t.Errorf("%s still contains literal {{MCP_SERVER_NAME}}", k)
}
}
}
@@ -218,6 +218,14 @@ func loadWorkspaceEnv(orgBaseDir, filesDir string) map[string]string {
// check, or when the env file does not exist (workspaces without a role —
// or running on hosts that don't ship the bootstrap dir — keep their old
// behavior).
//
// Token-file fallback: the newer prod-team personas (agent-dev-a,
// agent-dev-b, agent-pm) ship `token` + `universal-auth.env` only — no
// legacy plaintext `env` file. When the env-file load produces zero rows,
// loadPersonaTokenFile fills in GITEA_TOKEN / GITEA_USER / GITEA_USER_EMAIL
// from the token file so the GIT_ASKPASS helper has something to emit.
// The env-file form remains authoritative when present (it may carry
// richer rows like GITEA_TOKEN_SCOPES / GITEA_SSH_KEY_PATH).
func loadPersonaEnvFile(role string, out map[string]string) {
if !isSafeRoleName(role) {
if role != "" {
@@ -229,7 +237,61 @@ func loadPersonaEnvFile(role string, out map[string]string) {
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
before := len(out)
parseEnvFile(filepath.Join(root, role, "env"), out)
if len(out) == before {
// No env-file rows landed (file absent, or present-but-empty).
// Try the token-only persona shape used by the prod-team
// identities. Existing keys in out are preserved.
loadPersonaTokenFile(role, out)
}
}
// loadPersonaTokenFile populates GITEA_TOKEN / GITEA_USER / GITEA_USER_EMAIL
// from a persona dir that ships only the bare `token` file — the shape used
// by the production agent personas (agent-dev-a, agent-dev-b, agent-pm).
// Those dirs do not carry an `env` file because their non-Gitea creds come
// from Infisical Universal Auth at runtime (universal-auth.env), so the
// historical loadPersonaEnvFile path silently no-ops on them.
//
// File layout: $MOLECULE_PERSONA_ROOT/<role>/token (mode 600, plain text).
// The token contents become GITEA_TOKEN (whitespace-trimmed); the role
// name becomes GITEA_USER; GITEA_USER_EMAIL is synthesised as
// <role>@<gitIdentityEmailDomain> to match the email shape that
// applyAgentGitIdentity uses for its slug-derived authorship addresses.
//
// Silent no-op when the role fails the safe-segment check, when the
// token file does not exist, or when its contents are empty after
// trimming. Existing keys in out are not overwritten — the caller's
// later .env layers and any prior loadPersonaEnvFile rows always win.
func loadPersonaTokenFile(role string, out map[string]string) {
if out == nil {
return
}
if !isSafeRoleName(role) {
return
}
root := os.Getenv("MOLECULE_PERSONA_ROOT")
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
data, err := os.ReadFile(filepath.Join(root, role, "token"))
if err != nil {
return
}
token := strings.TrimSpace(string(data))
if token == "" {
return
}
if _, ok := out["GITEA_TOKEN"]; !ok {
out["GITEA_TOKEN"] = token
}
if _, ok := out["GITEA_USER"]; !ok {
out["GITEA_USER"] = role
}
if _, ok := out["GITEA_USER_EMAIL"]; !ok {
out["GITEA_USER_EMAIL"] = role + "@" + gitIdentityEmailDomain
}
}
// isSafeRoleName accepts a single path segment of [A-Za-z0-9_-]+. Rejects
@@ -164,3 +164,181 @@ func TestIsSafeRoleName_Acceptance(t *testing.T) {
}
}
}
// TestLoadPersonaTokenFile_TokenOnlyPersona: the prod-team personas
// (agent-dev-a / agent-dev-b / agent-pm) ship `token` only — no `env`
// file. loadPersonaEnvFile's fallback path must populate GITEA_TOKEN /
// GITEA_USER / GITEA_USER_EMAIL from the token contents + role name so
// the GIT_ASKPASS helper has something to emit.
func TestLoadPersonaTokenFile_TokenOnlyPersona(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-bytes-redacted\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-a", out)
want := map[string]string{
"GITEA_TOKEN": "token-bytes-redacted",
"GITEA_USER": "agent-dev-a",
"GITEA_USER_EMAIL": "agent-dev-a@" + gitIdentityEmailDomain,
}
if len(out) != len(want) {
t.Fatalf("got %d keys, want %d: %#v", len(out), len(want), out)
}
for k, v := range want {
if out[k] != v {
t.Errorf("out[%q] = %q; want %q", k, out[k], v)
}
}
}
// TestLoadPersonaTokenFile_EnvFileWins: when BOTH an env file and a
// token file exist in the same persona dir, the env file is the more-
// specific declaration and wins outright — the fallback must not fire
// at all. This pins precedence so a persona later migrated to the
// richer env-file form (carrying GITEA_TOKEN_SCOPES / GITEA_SSH_KEY_PATH)
// doesn't get its token silently overridden by the fallback.
func TestLoadPersonaTokenFile_EnvFileWins(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-b")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
envBody := "GITEA_USER=env-form-user\nGITEA_TOKEN=env-form-token\n" +
"GITEA_USER_EMAIL=env-form@example.invalid\nGITEA_TOKEN_SCOPES=write:repository\n"
if err := os.WriteFile(filepath.Join(roleDir, "env"), []byte(envBody), 0o600); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-form-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-b", out)
if out["GITEA_USER"] != "env-form-user" {
t.Errorf("env file should win for GITEA_USER; got %q", out["GITEA_USER"])
}
if out["GITEA_TOKEN"] != "env-form-token" {
t.Errorf("env file should win for GITEA_TOKEN; got %q", out["GITEA_TOKEN"])
}
if out["GITEA_USER_EMAIL"] != "env-form@example.invalid" {
t.Errorf("env file should win for GITEA_USER_EMAIL; got %q", out["GITEA_USER_EMAIL"])
}
if out["GITEA_TOKEN_SCOPES"] != "write:repository" {
t.Errorf("env file extras must be preserved; got GITEA_TOKEN_SCOPES=%q", out["GITEA_TOKEN_SCOPES"])
}
}
// TestLoadPersonaTokenFile_NeitherFile: persona dir exists but ships
// neither env nor token — silent no-op. This is the legitimate case
// for a partially-provisioned persona during bootstrap; callers expect
// an empty map, no error, no log noise.
func TestLoadPersonaTokenFile_NeitherFile(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-pm")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-pm", out)
if len(out) != 0 {
t.Errorf("expected empty out when neither env nor token exists; got %#v", out)
}
}
// TestLoadPersonaTokenFile_EmptyToken: a token file with only
// whitespace must be treated as absent — never emit
// GITEA_TOKEN="" / GITEA_USER=<role> / GITEA_USER_EMAIL=<role>@... because
// that would set GITEA_USER without a usable token, and the askpass
// helper would then prompt with an empty password. Silent no-op is the
// correct behavior — let downstream auth fall through to its existing
// "no credentials available" path.
func TestLoadPersonaTokenFile_EmptyToken(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
// Whitespace-only contents: spaces, tabs, newlines.
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte(" \t\n \n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-a", out)
if len(out) != 0 {
t.Errorf("expected empty out when token file is whitespace-only; got %#v", out)
}
}
// TestLoadPersonaTokenFile_TrimsWhitespace: tokens shipped from the
// operator-host bootstrap kit may have a trailing newline (the
// canonical `printf "%s\n" "$token" > token` shape). The fallback must
// trim leading + trailing whitespace so the askpass helper emits the
// raw token bytes — Gitea's PAT validator rejects tokens with embedded
// whitespace.
func TestLoadPersonaTokenFile_TrimsWhitespace(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-b")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("\n raw-token-bytes \n\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-b", out)
if out["GITEA_TOKEN"] != "raw-token-bytes" {
t.Errorf("token whitespace not trimmed; got %q", out["GITEA_TOKEN"])
}
}
// TestLoadPersonaTokenFile_RejectsUnsafeRole: defense-in-depth — even
// in the fallback path, role names that fail isSafeRoleName must not
// touch the filesystem. Mirrors TestLoadPersonaEnvFile_RejectsTraversal.
func TestLoadPersonaTokenFile_RejectsUnsafeRole(t *testing.T) {
root := t.TempDir()
// Plant a token at /tmp/.../token so a bad traversal would reach it.
if err := os.WriteFile(filepath.Join(root, "token"),
[]byte("stolen-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", filepath.Join(root, "personas"))
for _, bad := range []string{"..", "../personas", "/abs", "with/slash", "."} {
out := map[string]string{}
loadPersonaTokenFile(bad, out)
if len(out) != 0 {
t.Errorf("role %q should have been rejected; got %#v", bad, out)
}
}
}
// TestLoadPersonaTokenFile_NilMapSafe: callers pass a fresh map in
// practice, but defense-in-depth — a nil map must not panic.
func TestLoadPersonaTokenFile_NilMapSafe(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("nil map caused panic: %v", r)
}
}()
loadPersonaTokenFile("agent-dev-a", nil)
}
@@ -180,6 +180,42 @@ func waitForWorkspaceOnline(ctx context.Context, workspaceID string, timeout tim
return false
}
// waitForFreshHeartbeat polls until the workspace has BOTH a non-empty
// url AND a last_heartbeat_at strictly after restartStartTs (i.e. the
// heartbeat we observe is NEW, not the stale pre-restart one carried
// across through the row update). Returns false on timeout or DB error.
//
// This is the Layer 2 gate for the 2026-05-19 ws-server self-fire restart
// loop fix. status='online' can flip while url='' is still in place (the
// status update happens in /registry/register; url is set at the same
// time but the read here may see a transient interleaving) and pre-fix
// the trailing restart-context probe could fire against a half-registered
// row, triggering the upstream-502 → maybeMarkContainerDead → self-fire
// chain we're closing. The url + heartbeat-freshness check is the
// strict, correlated end-state assertion that says "the new container is
// actually addressable" — not just "some heartbeat happened".
func waitForFreshHeartbeat(ctx context.Context, workspaceID string, restartStartTs time.Time, timeout time.Duration) bool {
deadline := time.Now().Add(timeout)
for time.Now().Before(deadline) {
var url sql.NullString
var lastHB sql.NullTime
err := db.DB.QueryRowContext(ctx,
`SELECT url, last_heartbeat_at FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&url, &lastHB)
if err == nil &&
url.Valid && url.String != "" &&
lastHB.Valid && lastHB.Time.After(restartStartTs) {
return true
}
select {
case <-ctx.Done():
return false
case <-time.After(restartContextOnlinePollInterval):
}
}
return false
}
// buildRestartA2APayload wraps the rendered context string in the
// JSON-RPC 2.0 / A2A message/send shape that the proxy already knows
// how to normalize. Returns the marshalled body ready for ProxyA2ARequest.
@@ -220,6 +256,22 @@ func (h *WorkspaceHandler) sendRestartContext(workspaceID string, data restartCo
log.Printf("restart-context: workspace %s did not come online within %s — dropping context message", workspaceID, restartContextOnlineTimeout)
return
}
// Self-fire guard (Layer 2 of the 2026-05-19 ws-server self-fire fix):
// status='online' alone is not enough to safely fire the trailing
// ProxyA2ARequest. The workspace must also have:
// - url != '' (the new container's URL has been registered)
// - last_heartbeat_at > data.RestartAt (the heartbeat we're seeing is NEW, not stale)
// Without those, ProxyA2ARequest can fail with a connect error or
// upstream 502, hit handleA2ADispatchError → maybeMarkContainerDead →
// RestartByID → self-fire. The Layer 1 isRestarting gate already
// covers that, but this is a belt-and-suspenders so the probe never
// even tries until the new container is actually addressable. Best-
// effort: if the DB read errors out we proceed (preserves the legacy
// behaviour of "online means online").
if !waitForFreshHeartbeat(ctx, workspaceID, data.RestartAt, restartContextOnlineTimeout) {
log.Printf("restart-context: workspace %s online but no fresh heartbeat or empty url — dropping context message (self-fire guard)", workspaceID)
return
}
text := buildRestartContextMessage(data)
body, err := buildRestartA2APayload(text)
@@ -198,6 +198,17 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
// back to its compiled-in Anthropic default and 401s when the user's
// key is for a different provider. Non-hermes runtimes are unaffected
// (the server still passes model through, they just don't use it).
// runtimeExplicitlyRequested is true when the caller expressed intent for
// a SPECIFIC runtime — either by passing `runtime` directly, or by naming
// a `template` (a template encodes a runtime). When true, we must NOT
// silently fall back to langgraph if that intent can't be honored: that
// is the molecule-controlplane#188 / #184 contract violation (caller asks
// for codex/claude-code, gets a langgraph workspace, 201, no error — a
// false success). #188 mandates fail-closed (error+notify) on mismatch,
// not an advisory degrade. The legitimate "no template, no runtime →
// langgraph default" path (bare {"name":...}) is unaffected.
runtimeExplicitlyRequested := payload.Runtime != "" || payload.Template != ""
templateRuntimeResolved := payload.Runtime != ""
if payload.Template != "" && (payload.Runtime == "" || payload.Model == "") {
// #226: payload.Template is attacker-controllable. resolveInsideRoot
// rejects absolute paths and any ".." that escapes configsDir so the
@@ -230,6 +241,9 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
switch {
case payload.Runtime == "" && !indented && strings.HasPrefix(stripped, "runtime:") && !strings.HasPrefix(stripped, "runtime_config"):
payload.Runtime = strings.TrimSpace(strings.TrimPrefix(stripped, "runtime:"))
if payload.Runtime != "" {
templateRuntimeResolved = true
}
case payload.Model == "" && !indented && strings.HasPrefix(stripped, "model:"):
// Legacy top-level `model:` — pre-runtime_config templates.
payload.Model = strings.Trim(strings.TrimSpace(strings.TrimPrefix(stripped, "model:")), `"'`)
@@ -242,7 +256,27 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
}
}
}
// Fail-closed (molecule-controlplane#188 / #184): if the caller expressed
// intent for a specific runtime (passed `runtime`, or named a `template`)
// but we could NOT resolve a concrete runtime from it (template's
// config.yaml unreadable, or it has no `runtime:` key), DO NOT silently
// substitute langgraph and return 201 — that is the silent contract
// violation that produced 5/5 wrong workspaces and a false codex E2E pass.
// Return 422 so the caller learns the requested runtime was not honored.
// The platform-side CP fix (controlplane#188) is the sibling gate; this
// closes the ws-server `Create` boundary the product UI actually hits.
if payload.Runtime == "" && runtimeExplicitlyRequested && !templateRuntimeResolved {
log.Printf("Create: FAIL-CLOSED (controlplane#188) — template=%q requested but runtime could not be resolved; refusing silent langgraph fallback", payload.Template)
c.JSON(http.StatusUnprocessableEntity, gin.H{
"error": "runtime could not be resolved from the requested template; refusing to silently provision langgraph (controlplane#188). Pass an explicit \"runtime\", or use a template whose config.yaml declares one.",
"template": payload.Template,
"code": "RUNTIME_UNRESOLVED",
})
return
}
if payload.Runtime == "" {
// Legitimate default path: no template AND no runtime requested
// (bare {"name":...}) — langgraph is the intended default here.
payload.Runtime = "langgraph"
}
@@ -512,7 +546,7 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
// shape. Adding a new snippet means adding it once there;
// all three callers pick it up automatically.
resp["connection"] = BuildExternalConnectionPayload(
externalPlatformURL(c), id, connectionToken,
externalPlatformURL(c), id, payload.Name, connectionToken,
)
}
c.JSON(http.StatusCreated, resp)
@@ -0,0 +1,176 @@
package handlers
// workspace_provision_forbidden_env.go — Layer 1 of the RFC#523
// tenant-workspace forbidden-env guardrail (task #146).
//
// Threat model: tenant workspaces (per-customer EC2 / container)
// run untrusted agent-controlled code and MUST NEVER receive
// operator-fleet-scope credentials. A leak from one tenant
// workspace to operator scope would escalate "compromise of one
// agent" into "compromise of the whole platform."
//
// The existing forensic #145 guard (provisioner.scmWriteTokenKeys
// in buildContainerEnv / CPProvisioner.Start) strips SCM-write
// tokens at the FINAL container-env-build step — silent strip,
// no signal back to the caller. RFC#523 adds a FAIL-CLOSED layer
// EARLIER in the provision pipeline: when the resolved env-set
// at prepareProvisionContext-time contains any forbidden var
// name, the provision is aborted with a structured error so the
// operator sees the leak immediately instead of running with a
// silently-stripped env.
//
// Layer placement (3-layer defense-in-depth, RFC#523 §"Proposed guardrail"):
// - L1 (this file): provisioner-side abort BEFORE container start
// - L2 (workspace/entrypoint.sh + template-* start.sh): in-container
// env-grep + exit 1 — defense-in-depth if L1 is bypassed
// - L3 (.gitea/workflows/lint-forbidden-env-keys.yml): CI lint that
// scans Go code under workspace-server/ for new writers that
// would inject a forbidden key
//
// Open-source-template compatibility (memory
// `feedback_open_source_templates_no_hardcoded_org_internals`):
// the forbidden-key set is GENERIC (no molecule-AI-specific
// hostnames or org names). A third-party fork can replace this
// set with its own operator-scope key names without editing any
// template.
import (
"fmt"
"sort"
"strings"
)
// forbiddenTenantEnvKeys is the set of environment variable names
// that MUST NOT reach a tenant workspace container. The check is
// by exact key name — value-shape leaks (40-byte hex strings, etc)
// are out of scope here; the separate secret-scan workflow covers
// that class.
//
// Categories (RFC#523):
// - SCM-write tokens: same as provisioner.scmWriteTokenKeys, kept
// in sync. Listed again here so a future split of the two
// denylists is auditable diff.
// - Control-plane admin tokens: any token that grants control-plane
// admin API access.
// - Secret-store operator tokens: bootstrap-scope tokens for the
// central secret store.
// - Infra-platform tokens: deploy / fleet-management creds.
// - Operator-host pointers: hostnames / addresses that identify
// the operator host. Per the open-source-template rule these
// are MOLECULE_OPERATOR_HOST style prefixes; the literal
// prefix is matched but the test for membership reads from
// this map, not from a hardcoded constant in the deny rule
// itself.
//
// Per-agent persona PATs (e.g. AGENT_DEV_A_TOKEN style names —
// not operator-fleet scope) are NOT on this list. The guard
// checks the env VAR NAME, not the token VALUE, so a per-agent
// scoped token under a per-agent var name passes through.
var forbiddenTenantEnvKeys = map[string]struct{}{
// SCM-write — kept in sync with provisioner.scmWriteTokenKeys.
"GITEA_TOKEN": {},
"GITEA_PAT": {},
"GITHUB_TOKEN": {},
"GITHUB_PAT": {},
"GH_TOKEN": {},
"GITLAB_TOKEN": {},
"GL_TOKEN": {},
"BITBUCKET_TOKEN": {},
// Control-plane admin tokens.
"CP_ADMIN_API_TOKEN": {},
"CP_ADMIN_TOKEN": {},
// Secret-store operator tokens (Infisical SSOT — operator scope only).
"INFISICAL_OPERATOR_TOKEN": {},
"INFISICAL_BOOTSTRAP_TOKEN": {},
// Infra-platform tokens.
"RAILWAY_TOKEN": {},
"RAILWAY_PERSONAL_API_TOKEN": {},
"HETZNER_TOKEN": {},
"HETZNER_API_TOKEN": {},
}
// forbiddenTenantEnvPrefixes are key-name PREFIXES that match
// operator-scope env vars. Matched in addition to the exact-key
// set above. Useful for "MOLECULE_OPERATOR_*" style families
// where new members get added without re-editing the deny set.
//
// Kept as a tiny set on purpose — over-broad prefix matching is
// the failure mode this layer's exact-key set is designed to
// avoid. Add a prefix here only when the family is closed
// (every member is operator-scope; no legitimate tenant-scope
// member exists or will).
var forbiddenTenantEnvPrefixes = []string{
"MOLECULE_OPERATOR_",
}
// isForbiddenTenantEnvKey reports whether an env var name is on
// the forbidden-for-tenant-workspaces list (either by exact match
// in forbiddenTenantEnvKeys or by prefix in
// forbiddenTenantEnvPrefixes).
//
// Exported-style helper kept package-private — the deny set is
// internal to the workspace-server package; external callers must
// go through the provision pipeline, which means the abort path
// fires for them too.
func isForbiddenTenantEnvKey(key string) bool {
if _, ok := forbiddenTenantEnvKeys[key]; ok {
return true
}
for _, prefix := range forbiddenTenantEnvPrefixes {
if strings.HasPrefix(key, prefix) {
return true
}
}
return false
}
// findForbiddenTenantEnvKeys scans the resolved env-set and
// returns the sorted list of forbidden keys present. Empty slice
// (not nil — easier for callers to JSON-encode) when none match.
//
// Deterministic order: the result feeds the user-facing error
// message and the structured-extra payload that goes to the
// canvas Events tab. Sorting makes the message stable across
// Go's randomized map iteration.
func findForbiddenTenantEnvKeys(envVars map[string]string) []string {
if len(envVars) == 0 {
return []string{}
}
found := make([]string, 0)
for k := range envVars {
if isForbiddenTenantEnvKey(k) {
found = append(found, k)
}
}
sort.Strings(found)
return found
}
// formatForbiddenTenantEnvError builds the safe-canned user-facing
// message for a provision aborted because forbidden env keys are
// present in the resolved env-set. The message names the
// offending keys (key names are not secret — the values would be,
// but only names are surfaced) and points at the RFC.
//
// Same shape as formatMissingEnvError so the canvas Events tab
// renders both classes consistently.
func formatForbiddenTenantEnvError(keys []string) string {
if len(keys) == 0 {
// Defensive: caller should not invoke with empty input,
// but keep the function total.
return "provision aborted: forbidden operator-scope env vars present (RFC#523)"
}
if len(keys) == 1 {
return fmt.Sprintf(
"provision aborted: env var %q is operator-scope and must not reach tenant workspaces (RFC#523) — remove it from workspace_secrets / global_secrets and retry",
keys[0],
)
}
return fmt.Sprintf(
"provision aborted: env vars %s are operator-scope and must not reach tenant workspaces (RFC#523) — remove them from workspace_secrets / global_secrets and retry",
strings.Join(keys, ", "),
)
}
@@ -0,0 +1,182 @@
package handlers
// workspace_provision_forbidden_env_test.go — Layer 1 tests for the
// RFC#523 tenant-workspace forbidden-env guardrail (task #146).
//
// Behaviour pinned (per RFC#523 §"Acceptance criteria" Layer 1):
// - exact-match keys (GITEA_TOKEN, CP_ADMIN_API_TOKEN, RAILWAY_TOKEN,
// INFISICAL_OPERATOR_TOKEN, …) are flagged
// - MOLECULE_OPERATOR_* prefix family is flagged
// - per-agent-scope vars (GIT_HTTP_USERNAME, ANTHROPIC_API_KEY,
// AGENT_DEV_A_TOKEN, …) are NOT flagged — guard checks key NAME
// not value
// - findForbiddenTenantEnvKeys returns a deterministically-sorted
// slice (canvas Events tab needs stable rendering)
// - formatForbiddenTenantEnvError uses singular vs plural phrasing
// so the message reads naturally for both 1-key and N-key cases
//
// Companion: provisioner.buildContainerEnv has the older silent-
// strip guard (forensic #145). The two layers are intentionally
// redundant — this one fails closed early; that one strips late.
import (
"strings"
"testing"
)
func TestIsForbiddenTenantEnvKey_ExactMatches(t *testing.T) {
cases := []struct {
key string
want bool
}{
// SCM-write tokens — kept in sync with provisioner.scmWriteTokenKeys.
{"GITEA_TOKEN", true},
{"GITEA_PAT", true},
{"GITHUB_TOKEN", true},
{"GITHUB_PAT", true},
{"GH_TOKEN", true},
{"GITLAB_TOKEN", true},
{"GL_TOKEN", true},
{"BITBUCKET_TOKEN", true},
// Control-plane admin tokens.
{"CP_ADMIN_API_TOKEN", true},
{"CP_ADMIN_TOKEN", true},
// Secret-store operator tokens.
{"INFISICAL_OPERATOR_TOKEN", true},
{"INFISICAL_BOOTSTRAP_TOKEN", true},
// Infra-platform tokens.
{"RAILWAY_TOKEN", true},
{"RAILWAY_PERSONAL_API_TOKEN", true},
{"HETZNER_TOKEN", true},
{"HETZNER_API_TOKEN", true},
// Per-agent scoped — must NOT be flagged.
{"GIT_HTTP_USERNAME", false},
{"GIT_HTTP_PASSWORD", false},
{"ANTHROPIC_API_KEY", false},
{"ANTHROPIC_AUTH_TOKEN", false},
{"OPENAI_API_KEY", false},
{"KIMI_API_KEY", false},
{"MINIMAX_API_KEY", false},
{"AGENT_DEV_A_TOKEN", false}, // hypothetical per-agent name
{"MOLECULE_AGENT_ROLE", false},
{"PARENT_ID", false},
{"WORKSPACE_ID", false},
{"PLATFORM_URL", false},
{"", false},
}
for _, c := range cases {
got := isForbiddenTenantEnvKey(c.key)
if got != c.want {
t.Errorf("isForbiddenTenantEnvKey(%q) = %v; want %v", c.key, got, c.want)
}
}
}
func TestIsForbiddenTenantEnvKey_PrefixMatches(t *testing.T) {
cases := []struct {
key string
want bool
}{
{"MOLECULE_OPERATOR_HOST", true},
{"MOLECULE_OPERATOR_SSH_KEY", true},
{"MOLECULE_OPERATOR_BACKUP_BUCKET", true},
{"MOLECULE_OPERATOR_", true}, // prefix itself
// Adjacent but NOT in prefix family.
{"MOLECULE_AGENT_ROLE", false},
{"MOLECULE_URL", false},
{"MOLECULE_PERSONA_ROOT", false}, // path on operator host, not tenant
{"MOLECULE_GITEA_TOKEN", false}, // localbuild-time only; not a tenant env
}
for _, c := range cases {
got := isForbiddenTenantEnvKey(c.key)
if got != c.want {
t.Errorf("isForbiddenTenantEnvKey(%q) = %v; want %v", c.key, got, c.want)
}
}
}
func TestFindForbiddenTenantEnvKeys_NoneAndEmpty(t *testing.T) {
if got := findForbiddenTenantEnvKeys(nil); len(got) != 0 {
t.Errorf("nil envVars: got %v; want empty", got)
}
if got := findForbiddenTenantEnvKeys(map[string]string{}); len(got) != 0 {
t.Errorf("empty envVars: got %v; want empty", got)
}
clean := map[string]string{
"ANTHROPIC_API_KEY": "sk-keep",
"GIT_HTTP_USERNAME": "agent-dev-a",
"GIT_HTTP_PASSWORD": "scoped-pat",
"MOLECULE_AGENT_ROLE": "agent-dev-a",
"WORKSPACE_ID": "ws-123",
}
if got := findForbiddenTenantEnvKeys(clean); len(got) != 0 {
t.Errorf("clean envVars: got %v; want empty", got)
}
}
func TestFindForbiddenTenantEnvKeys_SingleAndMultipleSorted(t *testing.T) {
// Single key.
single := map[string]string{
"ANTHROPIC_API_KEY": "sk-keep",
"GITEA_TOKEN": "operator-scope-leak",
}
got := findForbiddenTenantEnvKeys(single)
if len(got) != 1 || got[0] != "GITEA_TOKEN" {
t.Errorf("single forbidden: got %v; want [GITEA_TOKEN]", got)
}
// Multiple keys — must be sorted (canvas Events tab needs stability).
multi := map[string]string{
"RAILWAY_TOKEN": "z",
"GITEA_TOKEN": "a",
"MOLECULE_OPERATOR_HOST": "m",
"CP_ADMIN_API_TOKEN": "c",
"ANTHROPIC_API_KEY": "ok",
}
got = findForbiddenTenantEnvKeys(multi)
want := []string{"CP_ADMIN_API_TOKEN", "GITEA_TOKEN", "MOLECULE_OPERATOR_HOST", "RAILWAY_TOKEN"}
if len(got) != len(want) {
t.Fatalf("multi forbidden length: got %v; want %v", got, want)
}
for i := range want {
if got[i] != want[i] {
t.Errorf("multi forbidden[%d] = %q; want %q (full got=%v want=%v)", i, got[i], want[i], got, want)
}
}
}
func TestFormatForbiddenTenantEnvError_Phrasing(t *testing.T) {
// Empty input — defensive total function.
if msg := formatForbiddenTenantEnvError(nil); !strings.Contains(msg, "RFC#523") {
t.Errorf("empty input: missing RFC#523 ref: %q", msg)
}
// Singular phrasing.
single := formatForbiddenTenantEnvError([]string{"GITEA_TOKEN"})
if !strings.Contains(single, `"GITEA_TOKEN"`) {
t.Errorf("single: missing quoted key: %q", single)
}
if !strings.Contains(single, "operator-scope") {
t.Errorf("single: missing operator-scope phrase: %q", single)
}
if !strings.Contains(single, "RFC#523") {
t.Errorf("single: missing RFC#523 ref: %q", single)
}
if strings.Contains(single, "env vars ") { // plural form
t.Errorf("single: leaked plural phrasing: %q", single)
}
// Plural phrasing.
multi := formatForbiddenTenantEnvError([]string{"CP_ADMIN_API_TOKEN", "GITEA_TOKEN"})
if !strings.Contains(multi, "CP_ADMIN_API_TOKEN, GITEA_TOKEN") {
t.Errorf("plural: missing joined list: %q", multi)
}
if !strings.Contains(multi, "env vars ") {
t.Errorf("plural: missing plural phrase: %q", multi)
}
}
@@ -125,12 +125,62 @@ func (h *WorkspaceHandler) prepareProvisionContext(
return nil, &provisionAbort{Msg: decryptErr}
}
// RFC#523 Layer 1 (task #146): refuse to start a tenant workspace
// when any forbidden operator-scope env var is present in the
// resolved secret-load env-set. Runs IMMEDIATELY after
// loadWorkspaceSecrets and BEFORE applyAgentGitHTTPCreds — the
// per-agent persona injection sets a fallback GITEA_USER/GITEA_TOKEN
// pair that the buildContainerEnv forensic #145 guard will strip
// later. We want THIS layer to catch leaks from the operator-
// controlled stores (global_secrets, workspace_secrets) only, not
// the deliberate per-agent platform injection that lives downstream.
//
// Threat model is "an upstream secret-writer accidentally widened
// the propagation set" — e.g. an operator pastes GITEA_TOKEN into
// a workspace_secrets row. Caught here, surfaced loudly to the
// canvas Events tab, fail-closed. The existing forensic #145 guard
// in provisioner.buildContainerEnv / CPProvisioner.Start stays as
// defense-in-depth: it silently strips at container-env-build time.
//
// Key names (not values) are echoed in the user-facing error so
// the operator can locate and remove the offending row. Per memory
// `feedback_passwords_in_chat_are_burned`, key names are not
// secret; values would be.
if forbidden := findForbiddenTenantEnvKeys(envVars); len(forbidden) > 0 {
msg := formatForbiddenTenantEnvError(forbidden)
log.Printf("Provisioner: ABORT workspace=%s — forbidden operator-scope env keys present: %v (RFC#523)", workspaceID, forbidden)
return nil, &provisionAbort{
Msg: msg,
Extra: map[string]interface{}{"error": msg, "forbidden_env_keys": forbidden, "rfc": "523"},
}
}
pluginsPath, _ := filepath.Abs(filepath.Join(h.configsDir, "..", "plugins"))
awarenessNamespace := h.loadAwarenessNamespace(ctx, workspaceID)
// Per-agent git identity (#1957) — must run after secret loads so
// a workspace_secret named GIT_AUTHOR_NAME can override.
applyAgentGitIdentity(envVars, payload.Name)
// Per-agent git HTTP credential injection — bridges the gap that
// PR template-claude-code#30 + mc#1525 left open: the askpass binary
// + GIT_ASKPASS env are wired in-image, but until now no code path
// in workspace-server actually read the persona's git token from
// the operator-host bootstrap dir and exported it as
// GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD. Without this, the askpass
// helper invokes with an empty password env and git fails the
// auth challenge in ~500ms (live-verified for Dev-A/Dev-B
// 2026-05-18 ~23:55Z).
//
// Runs AFTER applyAgentGitIdentity so workspace_secrets named
// GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD (operator-supplied,
// loaded earlier by loadWorkspaceSecrets) win over the
// persona-file default. Uses payload.Role as the persona key —
// this matches the slug-form convention agent-dev-a /
// agent-dev-b / agent-pm. Descriptive multi-word roles
// ("Frontend Engineer") take the silent-no-op branch and
// continue to rely on workspace_secrets / org-import persona-env
// merge for their git auth.
applyAgentGitHTTPCreds(envVars, payload.Role)
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
@@ -297,6 +297,203 @@ func TestPrepareProvisionContext_ParentIDInjection(t *testing.T) {
func ptrStr(s string) *string { return &s }
// TestPrepareProvisionContext_InjectsGitHTTPCredsFromPersonaToken pins
// the end-to-end wiring of the durable-git-auth fix: when a workspace
// is provisioned with a slug-form role matching a persona dir at
// $MOLECULE_PERSONA_ROOT/<role>/token, the prepared envVars MUST
// carry GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD (+ GITEA_USER / GITEA_TOKEN
// fallback) so the in-container askpass helper has something to emit
// on git's auth challenge.
//
// Pre-fix shape (Dev-A/Dev-B live-verified 2026-05-18 ~23:55Z): the
// askpass binary + GIT_ASKPASS env were already wired
// (template-claude-code#30 + mc#1525), but GIT_HTTP_USERNAME and
// GIT_HTTP_PASSWORD were absent from the container env → askpass
// returned empty → git rc=128 "Authentication failed" in <500ms.
// This test fails without applyAgentGitHTTPCreds wired into
// prepareProvisionContext and proves the prod-team path is closed.
func TestPrepareProvisionContext_InjectsGitHTTPCredsFromPersonaToken(t *testing.T) {
// Stage a persona dir matching the prod-team shape per
// reference_prod_team_infisical_identities — a flat dir per role
// with a single mode-600 `token` file.
root := t.TempDir()
for _, role := range []string{"agent-dev-a", "agent-dev-b"} {
roleDir := filepath.Join(root, role)
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
// Token value pinned to a recognizable string so we can
// assert exact propagation. Real bootstrap-kit files end in
// \n; the helper must trim that.
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-for-"+role+"\n"), 0o600); err != nil {
t.Fatal(err)
}
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
cases := []struct {
name string
role string
expectInject bool
expectUser string
expectPass string
}{
{
name: "Dev-A slug role → persona token injected as GIT_HTTP_USERNAME/PASSWORD",
role: "agent-dev-a",
expectInject: true,
expectUser: "agent-dev-a",
expectPass: "token-for-agent-dev-a",
},
{
name: "Dev-B slug role → persona token injected",
role: "agent-dev-b",
expectInject: true,
expectUser: "agent-dev-b",
expectPass: "token-for-agent-dev-b",
},
{
name: "descriptive multi-word role → silent no-op (no persona dir lookup)",
role: "Frontend Engineer",
expectInject: false,
},
{
name: "unknown slug role with no persona dir → silent no-op",
role: "agent-nonexistent",
expectInject: false,
},
{
name: "empty role → silent no-op",
role: "",
expectInject: false,
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets`).
WithArgs("ws-prod-team").
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
handler := NewWorkspaceHandler(&captureBroadcaster{}, nil, "http://localhost:8080", t.TempDir())
payload := models.CreateWorkspacePayload{
Name: "Dev-A",
Role: tc.role,
Tier: 1,
}
prepared, abort := handler.prepareProvisionContext(
context.Background(), "ws-prod-team", "/nonexistent", nil, payload, false)
if abort != nil {
t.Fatalf("unexpected abort: %s", abort.Msg)
}
gotUser, hasUser := prepared.EnvVars["GIT_HTTP_USERNAME"]
gotPass, hasPass := prepared.EnvVars["GIT_HTTP_PASSWORD"]
if tc.expectInject {
if !hasUser || gotUser != tc.expectUser {
t.Errorf("GIT_HTTP_USERNAME: got %q (present=%v), want %q",
gotUser, hasUser, tc.expectUser)
}
if !hasPass || gotPass != tc.expectPass {
t.Errorf("GIT_HTTP_PASSWORD: got %q (present=%v), want %q",
gotPass, hasPass, tc.expectPass)
}
// Fallback pair should ALSO be set so askpass's
// GITEA_USER/GITEA_TOKEN fallback chain works
// (GITEA_TOKEN will then be stripped at
// buildContainerEnv per forensic #145, but
// GITEA_USER survives — see provisioner_test.go
// "persona-file path" subtest).
if prepared.EnvVars["GITEA_USER"] != tc.expectUser {
t.Errorf("GITEA_USER fallback: got %q, want %q",
prepared.EnvVars["GITEA_USER"], tc.expectUser)
}
if prepared.EnvVars["GITEA_TOKEN"] != tc.expectPass {
t.Errorf("GITEA_TOKEN fallback: got %q, want %q",
prepared.EnvVars["GITEA_TOKEN"], tc.expectPass)
}
} else {
if hasUser {
t.Errorf("GIT_HTTP_USERNAME should NOT be set for role %q; got %q",
tc.role, gotUser)
}
if hasPass {
t.Errorf("GIT_HTTP_PASSWORD should NOT be set for role %q; got %q",
tc.role, gotPass)
}
}
// applyAgentGitIdentity always wires GIT_ASKPASS when
// payload.Name is non-empty — sanity check that the new
// wiring didn't accidentally bypass the existing askpass
// env-set (the helper without env = nothing to emit).
if prepared.EnvVars["GIT_ASKPASS"] != "/usr/local/bin/molecule-askpass" {
t.Errorf("GIT_ASKPASS should remain wired by applyAgentGitIdentity; got %q",
prepared.EnvVars["GIT_ASKPASS"])
}
})
}
}
// TestPrepareProvisionContext_WorkspaceSecretWinsOverPersonaToken pins
// the precedence contract: an operator-supplied workspace_secret named
// GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD (loaded by loadWorkspaceSecrets
// BEFORE applyAgentGitHTTPCreds runs) must beat the persona-file
// default. This is the standard escape hatch — if an operator needs a
// per-workspace override (e.g. a workspace-scoped Gitea token with
// narrower repo access than the persona's), the secrets API still
// works.
func TestPrepareProvisionContext_WorkspaceSecretWinsOverPersonaToken(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("persona-file-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM global_secrets`).
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}))
// Workspace secret pre-populates GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD —
// these come from loadWorkspaceSecrets which runs before applyAgentGitHTTPCreds.
// encryption_version=0 means raw bytes (crypto disabled in test).
mock.ExpectQuery(`SELECT key, encrypted_value, encryption_version FROM workspace_secrets`).
WithArgs("ws-prod-team").
WillReturnRows(sqlmock.NewRows([]string{"key", "encrypted_value", "encryption_version"}).
AddRow("GIT_HTTP_USERNAME", []byte("operator-override-user"), 0).
AddRow("GIT_HTTP_PASSWORD", []byte("operator-override-pass"), 0))
handler := NewWorkspaceHandler(&captureBroadcaster{}, nil, "http://localhost:8080", t.TempDir())
payload := models.CreateWorkspacePayload{
Name: "Dev-A",
Role: "agent-dev-a",
Tier: 1,
}
prepared, abort := handler.prepareProvisionContext(
context.Background(), "ws-prod-team", "/nonexistent", nil, payload, false)
if abort != nil {
t.Fatalf("unexpected abort: %s", abort.Msg)
}
if prepared.EnvVars["GIT_HTTP_USERNAME"] != "operator-override-user" {
t.Errorf("operator override lost — GIT_HTTP_USERNAME: got %q, want %q",
prepared.EnvVars["GIT_HTTP_USERNAME"], "operator-override-user")
}
if prepared.EnvVars["GIT_HTTP_PASSWORD"] != "operator-override-pass" {
t.Errorf("operator override lost — GIT_HTTP_PASSWORD: got %q, want %q",
prepared.EnvVars["GIT_HTTP_PASSWORD"], "operator-override-pass")
}
}
// TestReadOrLazyHealInboundSecret pins the four branches of the
// shared lazy-heal helper directly. Each call site (chat_files,
// registry) has its own integration test, but those go through the
@@ -8,6 +8,7 @@ import (
"runtime/debug"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
@@ -39,12 +40,57 @@ type restartState struct {
mu sync.Mutex
running bool // true while a restart cycle is in flight
pending bool // set by any caller that arrived during the in-flight cycle
// restartStartedAt records the wall-clock when the most recent cycle
// flipped running=true. Used by the self-fire debounce (internal#544,
// the ws-server self-fire restart feedback loop seen in prod-Reviewer/
// Researcher 2026-05-19 ~00:05Z 4x reprov thrash): any RestartByID
// arriving within restartDebounceWindow of this timestamp is silently
// dropped so a probe firing during the EC2-pending window can't
// re-trigger a fresh full cycle on the just-launched instance.
restartStartedAt time.Time
}
// restartStates is a per-workspace map of *restartState. Each workspace gets
// its own entry so unrelated workspaces don't serialize on each other.
var restartStates sync.Map // map[workspaceID]*restartState
// restartDebounceWindow is the silent-drop window for successive RestartByID
// calls. Sized to cover the typical EC2 pending → online interval (20-30s)
// with a margin so a probe firing during the just-after-online but still-
// flaky heartbeat window also gets dropped. Bigger than that would block
// legitimate "Restart failed, retry" recoveries; smaller would let the
// 4x thrash class through. Package-level so tests can shrink it.
var restartDebounceWindow = 60 * time.Second
// restartByIDDropCounter is incremented every time RestartByID drops a call
// inside the debounce window. Exposed as a package-level atomic counter so
// (a) tests can assert the drop fired, (b) ops can grep logs for the drop
// log line + the counter snapshot in a future /admin/metrics endpoint.
// Not a Prometheus metric because the platform doesn't pull metrics from
// workspace-server yet — that's a separate RFC.
var restartByIDDropCounter atomic.Uint64
// isRestarting reports whether a restart cycle is currently in flight for
// the workspace. Callers that have their own "container looks dead" probe
// MUST consult this before triggering a restart, because during the
// 20-30s EC2-pending window the workspace's url='' and IsRunning()=false
// looks identical to a dead container — and any restart-triggering probe
// (maybeMarkContainerDead from canvas /delegations poll, or the trailing
// restart-context probe at the end of runRestartCycle) will set
// pending=true and the outer coalesceRestart loop will drain by running
// ANOTHER full cycle, ec2_stopped of the just-booted instance →
// re-provision. That's the self-fire loop closed by this gate.
func isRestarting(workspaceID string) bool {
sv, ok := restartStates.Load(workspaceID)
if !ok {
return false
}
state := sv.(*restartState)
state.mu.Lock()
defer state.mu.Unlock()
return state.running
}
// isParentPaused checks if any ancestor of the workspace is paused.
func isParentPaused(ctx context.Context, workspaceID string) (bool, string) {
var parentID *string
@@ -376,9 +422,45 @@ func (h *WorkspaceHandler) RestartByID(workspaceID string) {
if !h.HasProvisioner() {
return
}
// Self-fire debounce: drop (not coalesce) successive RestartByID calls
// within restartDebounceWindow of the most recent cycle's start. This
// is the load-bearing protection against the 4x reprov thrash class —
// coalesceRestart's pending-flag would otherwise drain by running
// ANOTHER full cycle of stop+provision on the just-launched EC2 (still
// in the pending state), which is the self-fire we're closing.
//
// Only applies to RestartByID (programmatic — secrets handler,
// maybeMarkContainerDead, preflightContainerHealth). The HTTP Restart
// handler in workspace_restart.go's Restart() bypasses this path and
// calls RestartWorkspaceAutoOpts directly, so user-initiated restart
// clicks are unaffected.
if shouldDebounceRestart(workspaceID) {
restartByIDDropCounter.Add(1)
log.Printf("RestartByID: %s — dropped (within %s self-fire debounce window; total dropped=%d)",
workspaceID, restartDebounceWindow, restartByIDDropCounter.Load())
return
}
coalesceRestart(workspaceID, func() { h.runRestartCycle(workspaceID) })
}
// shouldDebounceRestart reports whether the most recent cycle for this
// workspace started within restartDebounceWindow. Read-only on
// restartState; the actual restartStartedAt stamp is written in
// coalesceRestart when running flips false→true.
func shouldDebounceRestart(workspaceID string) bool {
sv, ok := restartStates.Load(workspaceID)
if !ok {
return false
}
state := sv.(*restartState)
state.mu.Lock()
defer state.mu.Unlock()
if state.restartStartedAt.IsZero() {
return false
}
return time.Since(state.restartStartedAt) < restartDebounceWindow
}
// coalesceRestart implements the pending-flag gate around an arbitrary cycle
// function. Extracted from RestartByID for direct unit testing — the cycle
// function in production is `runRestartCycle`, but tests pass a counter to
@@ -398,6 +480,12 @@ func coalesceRestart(workspaceID string, cycle func()) {
return
}
state.running = true
// Stamp the start time so the RestartByID debounce can drop any
// self-fire probe that hits within restartDebounceWindow. Only the
// false→true edge stamps; the drain-loop's inner cycles re-use the
// same start (they're effectively one "restart event" from the
// debounce's POV).
state.restartStartedAt = time.Now()
state.mu.Unlock()
// Always clear running on exit — including panic — so a panicking
@@ -0,0 +1,297 @@
package handlers
// Tests for the 2026-05-19 ws-server self-fire restart feedback loop fix.
//
// Empirical chain reproduced (prod-Reviewer/Researcher 4x reprov thrash
// 2026-05-19 ~00:05-00:09Z, root-caused via Loki):
//
// 1. POST /secrets → go h.restartFunc(workspaceID) (secrets.go:264).
// 2. runRestartCycle sets url='' synchronously, then async provisions EC2
// (workspace_restart.go).
// 3. During 20-30s window while EC2 is `pending` (codex first heartbeat
// not yet landed): workspaces.url='' AND IsRunning=false.
// 4. Any ProxyA2A (canvas /delegations poll OR the restart-context probe
// at the end of runRestartCycle) → maybeMarkContainerDead sees the
// container-dead state → calls RestartByID → loop.
// 5. coalesceRestart sets pending=true, drains by running ANOTHER full
// cycle → provision.ec2_stopped of the just-booted instance →
// re-provision.
//
// Fix: three interdependent layers.
//
// L1) isRestarting() gate in maybeMarkContainerDead +
// preflightContainerHealth — early-return false/nil so the probe
// can't trigger a fresh RestartByID while a restart is in flight.
// L2) sendRestartContext requires url != '' AND last_heartbeat_at >
// restart_start_ts before firing the trailing ProxyA2A probe.
// L3) RestartByID silently drops successive calls within
// restartDebounceWindow of restartStartedAt, with a counter for
// observability.
import (
"context"
"sync/atomic"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
)
// resetSelfFireState wipes all the per-workspace mutation state these
// tests touch, plus the package-level drop counter, so the test is
// hermetic regardless of ordering.
func resetSelfFireState(workspaceID string) {
restartStates.Delete(workspaceID)
restartByIDDropCounter.Store(0)
}
// markRestarting forces restartStates into "cycle in flight" without
// running an actual cycle, so the tests can isolate the gate behaviour
// without the full provision pipeline. Returns a finish() that flips
// running=false (mimicking coalesceRestart's deferred state-clear).
func markRestarting(workspaceID string) (finish func()) {
sv, _ := restartStates.LoadOrStore(workspaceID, &restartState{})
state := sv.(*restartState)
state.mu.Lock()
state.running = true
state.restartStartedAt = time.Now()
state.mu.Unlock()
return func() {
state.mu.Lock()
state.running = false
state.mu.Unlock()
}
}
// TestIsRestarting_FalseWhenNoStateEntry — baseline: a workspace that
// has never been restarted reports !isRestarting. Pinning this so a
// future LoadOrStore refactor can't silently start returning true for
// unknown workspaces.
func TestIsRestarting_FalseWhenNoStateEntry(t *testing.T) {
const wsID = "self-fire-ws-never"
resetSelfFireState(wsID)
if isRestarting(wsID) {
t.Fatal("isRestarting must return false for a workspace with no state entry")
}
}
// TestIsRestarting_TrueWhileCycleRunning — the load-bearing invariant
// that Layer 1 depends on. While running=true, isRestarting must report
// true; the moment it flips to false, isRestarting must report false.
func TestIsRestarting_TrueWhileCycleRunning(t *testing.T) {
const wsID = "self-fire-ws-in-flight"
resetSelfFireState(wsID)
finish := markRestarting(wsID)
if !isRestarting(wsID) {
t.Fatal("isRestarting must return true while running=true")
}
finish()
if isRestarting(wsID) {
t.Fatal("isRestarting must return false after running flips back to false")
}
}
// TestMaybeMarkContainerDead_SkippedWhileRestarting — Layer 1 for the
// reactive path. With isRestarting=true the function must early-return
// false WITHOUT invoking IsRunning, hitting the DB UPDATE, or kicking
// a RestartByID goroutine. If any of those side-effects fire we'd
// re-arm the self-fire loop the gate exists to close.
func TestMaybeMarkContainerDead_SkippedWhileRestarting(t *testing.T) {
const wsID = "self-fire-ws-mmcd"
resetSelfFireState(wsID)
mock := setupTestDB(t) // sqlmock with strict expectation matching
// Workspace row read inside maybeMarkContainerDead — this happens
// BEFORE the isRestarting gate in the current implementation, so
// allow exactly one SELECT runtime row.
mock.ExpectQuery(`SELECT COALESCE\(runtime, 'langgraph'\) FROM workspaces WHERE id =`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
// Gate flipped: must early-return without doing anything else.
finish := markRestarting(wsID)
defer finish()
stub := &preflightLocalProv{running: false, err: nil}
h := newSelfFireHandler(t)
h.provisioner = stub
if got := h.maybeMarkContainerDead(context.Background(), wsID); got != false {
t.Errorf("maybeMarkContainerDead must return false while restarting, got %v", got)
}
if stub.calls != 0 {
t.Errorf("IsRunning must not be called while restarting (Layer 1 gate broken); got %d calls", stub.calls)
}
}
// TestPreflightContainerHealth_SkippedWhileRestarting — Layer 1 for the
// proactive path. Same shape as above: with restart in flight, return
// nil (let the optimistic forward proceed) and DO NOT call IsRunning.
// The forward will fail with a connect error; the post-restart reactive
// path can decide what to do then, by which point the EC2 has either
// come up (no more failures) or markProvisionFailed has fired.
func TestPreflightContainerHealth_SkippedWhileRestarting(t *testing.T) {
const wsID = "self-fire-ws-preflight"
resetSelfFireState(wsID)
_ = setupTestDB(t)
finish := markRestarting(wsID)
defer finish()
stub := &preflightLocalProv{running: false, err: nil}
h := newSelfFireHandler(t)
h.provisioner = stub
if err := h.preflightContainerHealth(context.Background(), wsID); err != nil {
t.Errorf("preflightContainerHealth must return nil while restarting, got %+v", err)
}
if stub.calls != 0 {
t.Errorf("IsRunning must not be called while restarting (Layer 1 gate broken); got %d calls", stub.calls)
}
}
// TestRestartByID_DebounceSilentDrop — Layer 3. After a cycle starts,
// any RestartByID arriving within restartDebounceWindow MUST be dropped
// silently — not coalesced (which would still drain to another cycle).
// The drop counter must increment by exactly one per dropped call so
// ops can see how often the self-fire would have fired pre-fix.
func TestRestartByID_DebounceSilentDrop(t *testing.T) {
const wsID = "self-fire-ws-debounce"
resetSelfFireState(wsID)
// Stamp restartStartedAt = now, running=false (simulates the "just
// finished" window where the loop would re-fire pre-fix).
sv, _ := restartStates.LoadOrStore(wsID, &restartState{})
state := sv.(*restartState)
state.mu.Lock()
state.restartStartedAt = time.Now()
state.running = false
state.mu.Unlock()
// Counter baseline.
if got := restartByIDDropCounter.Load(); got != 0 {
t.Fatalf("expected drop counter 0 at start, got %d", got)
}
// Five rapid-fire RestartByID calls should all drop (the maximum
// observed pre-fix was 4x — pinning >=4 here keeps the regression
// shape true to the prod incident).
h := newSelfFireHandler(t)
stub := &preflightLocalProv{running: true, err: nil}
h.provisioner = stub
for i := 0; i < 5; i++ {
h.RestartByID(wsID)
}
if got := restartByIDDropCounter.Load(); got != 5 {
t.Errorf("expected 5 drops within debounce window, got %d", got)
}
// shouldDebounceRestart itself must report true for the same window.
if !shouldDebounceRestart(wsID) {
t.Error("shouldDebounceRestart must return true within window")
}
}
// TestRestartByID_DebounceExpiresAfterWindow — outside the window, the
// debounce must release: a legitimate later restart (e.g. user clicked
// Restart again after waiting) must proceed to coalesceRestart. We
// shrink restartDebounceWindow to 1ms for the duration of this test so
// we don't sleep a full 60s in CI.
func TestRestartByID_DebounceExpiresAfterWindow(t *testing.T) {
const wsID = "self-fire-ws-debounce-release"
resetSelfFireState(wsID)
orig := restartDebounceWindow
restartDebounceWindow = 5 * time.Millisecond
defer func() { restartDebounceWindow = orig }()
// Stamp inside the window.
sv, _ := restartStates.LoadOrStore(wsID, &restartState{})
state := sv.(*restartState)
state.mu.Lock()
state.restartStartedAt = time.Now()
state.running = false
state.mu.Unlock()
if !shouldDebounceRestart(wsID) {
t.Fatal("within 5ms window must debounce")
}
// Sleep past the window. Use a small margin to avoid clock-skew
// flakes on slow CI hosts.
time.Sleep(20 * time.Millisecond)
if shouldDebounceRestart(wsID) {
t.Fatal("after 20ms (4x window) must no longer debounce")
}
}
// TestRestartByID_SingleProvisionPerRestart — the regression test for
// the prod incident: a SINGLE secrets PUT (which is the trigger shape)
// must produce exactly ONE coalesceRestart cycle, not four. Models the
// full chain: secrets handler → RestartByID → coalesceRestart → cycle
// runs → during the cycle window, simulated probes call RestartByID
// again. With all three layers in place, the probes are dropped and the
// total cycle count stays at 1.
func TestRestartByID_SingleProvisionPerRestart(t *testing.T) {
const wsID = "self-fire-ws-single-provision"
resetSelfFireState(wsID)
// In-flight gate that mimics the EC2-pending window. The cycle
// blocks on cycleProceed so we can fire the simulated probes while
// running=true.
var cycleCount atomic.Int32
cycleStarted := make(chan struct{}, 1)
cycleProceed := make(chan struct{})
cycle := func() {
n := cycleCount.Add(1)
if n == 1 {
cycleStarted <- struct{}{}
<-cycleProceed
}
}
// Kick the first cycle via coalesceRestart (this is what RestartByID
// would do post-debounce-check).
done := make(chan struct{})
go func() {
coalesceRestart(wsID, cycle)
close(done)
}()
<-cycleStarted
// Simulate the 4 probe-driven RestartByID calls observed in prod.
// Each must drop because we're within the debounce window AND a
// cycle is in flight.
h := newSelfFireHandler(t)
stub := &preflightLocalProv{running: true, err: nil}
h.provisioner = stub
for i := 0; i < 4; i++ {
h.RestartByID(wsID)
}
// Release the cycle.
close(cycleProceed)
<-done
if got := cycleCount.Load(); got != 1 {
t.Errorf("expected exactly 1 provision cycle for a single trigger "+
"(self-fire fix), got %d — regression of the prod 4x reprov thrash class",
got)
}
if got := restartByIDDropCounter.Load(); got != 4 {
t.Errorf("expected 4 self-fire probes dropped, got %d "+
"(observability counter must record the saved cycles)", got)
}
}
// newSelfFireHandler constructs a minimal *WorkspaceHandler suitable for
// the Layer-1 gate tests. Wraps the boilerplate so the per-test setup
// stays focused on the assertion.
func newSelfFireHandler(t *testing.T) *WorkspaceHandler {
t.Helper()
return NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
}
@@ -718,7 +718,7 @@ func TestWorkspaceList_Empty(t *testing.T) {
"parent_id", "active_tasks", "last_error_rate", "last_sample_error",
"uptime_seconds", "current_task", "runtime", "workspace_dir", "x", "y", "collapsed",
"budget_limit", "monthly_spend",
"broadcast_enabled", "talk_to_user_enabled",
"broadcast_enabled", "talk_to_user_enabled",
}))
w := httptest.NewRecorder()
@@ -1770,3 +1770,147 @@ runtime_config:
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
}
}
// ==================== #188 fail-closed: template/runtime contract ====================
//
// molecule-controlplane#188 / #184: if a caller names a `template` (intent
// for a specific runtime) but the runtime cannot be resolved from it, the
// server MUST NOT silently provision langgraph and return 201 — that false
// success produced 5/5 wrong workspaces and a bogus codex E2E pass. These
// tests pin the fail-closed boundary at the ws-server `Create` handler (the
// path the product UI hits), and guard the legitimate default path against
// regression.
// Template requested but its dir/config.yaml is absent → 422, not silent
// langgraph 201.
func TestWorkspaceCreate_188_TemplateMissingRuntime_FailsClosed(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
// configsDir is an empty temp dir → resolveInsideRoot succeeds (the path
// is inside root) but config.yaml read fails → runtime cannot be resolved.
configsDir := t.TempDir()
if err := os.MkdirAll(filepath.Join(configsDir, "ghost-template"), 0o755); err != nil {
t.Fatalf("mkdir: %v", err)
}
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", configsDir)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Ghost","template":"ghost-template"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("expected 422 (fail-closed, controlplane#188), got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if resp["code"] != "RUNTIME_UNRESOLVED" {
t.Errorf("expected code RUNTIME_UNRESOLVED, got %v", resp["code"])
}
}
// Template config.yaml has no `runtime:` key → 422, not silent langgraph.
func TestWorkspaceCreate_188_TemplateConfigNoRuntimeKey_FailsClosed(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
configsDir := t.TempDir()
tdir := filepath.Join(configsDir, "noruntime-template")
if err := os.MkdirAll(tdir, 0o755); err != nil {
t.Fatalf("mkdir: %v", err)
}
// config.yaml exists but declares no runtime.
if err := os.WriteFile(filepath.Join(tdir, "config.yaml"), []byte("name: noruntime\n"), 0o644); err != nil {
t.Fatalf("write: %v", err)
}
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", configsDir)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"NoRuntime","template":"noruntime-template"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("expected 422 (fail-closed), got %d: %s", w.Code, w.Body.String())
}
}
// Regression guard: the legitimate default path (no template, no runtime —
// bare {"name":...}) MUST still default to langgraph and return 201. The
// #188 fix must not break this.
func TestWorkspaceCreate_188_NoTemplateNoRuntime_StillDefaultsLanggraph(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WithArgs(sqlmock.AnyArg(), "Plain Default", nil, 3, "langgraph", sqlmock.AnyArg(), (*string)(nil), nil, "none", (*int64)(nil), models.DefaultMaxConcurrentTasks, "push").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WithArgs(sqlmock.AnyArg(), float64(0), float64(0)).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Plain Default"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected 201 (legitimate default path), got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// Explicit runtime, no template → honored, 201 (no template resolution
// needed; runtimeExplicitlyRequested true but already resolved).
func TestWorkspaceCreate_188_ExplicitRuntimeNoTemplate_OK(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WithArgs(sqlmock.AnyArg(), "Explicit Codex", nil, 3, "codex", sqlmock.AnyArg(), (*string)(nil), nil, "none", (*int64)(nil), models.DefaultMaxConcurrentTasks, "push").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WithArgs(sqlmock.AnyArg(), float64(0), float64(0)).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Explicit Codex","runtime":"codex"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -0,0 +1,229 @@
// Package provisioner — T4 privilege contract.
//
// This file is the single source of truth for what a Tier-4 ("full
// machine access") workspace runtime MUST guarantee, expressed as code
// templates can reference and CI can verify.
//
// RFC: molecule-ai/internal#456 (per-template privilege-contract class).
// Task: molecule-ai/internal #174.
//
// Background
// ----------
// Prior art is RFC#456's three layers:
//
// (1) molecule-runtime self-enforces uid-1000 + fchown safety net,
// (2) a platform-owned wrapper entrypoint from a shared base image,
// (3) a REQUIRED CI conformance gate wired into the fresh-provision
// harness that asserts the post-condition, not the mechanism.
//
// This file is the *data shape* for layer (3): the gate's tests have
// been hand-written per-template (template-claude-code, template-hermes,
// template-codex). Hand-writing drifts; the Hermes 401 class came from
// drift. We need the capability list itself to be code so that:
//
// - The provisioner can dump it as `t4_capabilities.yaml` for any
// fork user or non-Molecule-AI template runner to consume directly
// (no hardcoded internal org).
// - A `Verify(...)` helper turns into the t4-conformance shell out of
// one file, so when a capability is added the templates pick it up
// by reading the YAML — they do not silently lag.
// - The provisioner-emit side (provisioner.go applyTierResources / T4
// branch) and the verifier side share the same constants for the
// uid + mount paths, eliminating "string-match" drift between
// emitter and gate.
//
// Non-goals
// ---------
// - This is NOT a substitute for layer (1)/(2). Templates still must
// `exec gosu agent` and write /configs/.auth_token under uid 1000;
// this file describes *what to check*, not how to achieve it.
// - This file does not run tests. It is the spec. CI workflows call
// `T4PrivilegeContract().AsYAML()` once at the start of the gate
// and assert each capability's `Probe` returns ok.
package provisioner
import (
"fmt"
"sort"
"strings"
)
// T4Capability is one assertion the T4 runtime MUST satisfy.
//
// Each capability declares:
// - Name: stable id (used as the test name in CI output).
// - Description: human-readable why-this-matters; goes in failure logs.
// - Probe: a shell snippet that exits 0 on pass, non-zero on fail.
// The probe MUST be deterministic, MUST be runnable inside the
// running container under uid 1000, and MUST NOT depend on outside
// network beyond what `RequiredEgress` declares.
// - Severity: "hard" capabilities fail the gate; "advisory" emit a
// warning. T4 contract minimum = all hard pass.
// - Source: RFC section or memory reference that motivated this
// capability — keeps the audit trail in-tree.
type T4Capability struct {
Name string `yaml:"name"`
Description string `yaml:"description"`
Probe string `yaml:"probe"`
Severity string `yaml:"severity"`
Source string `yaml:"source"`
RequiredEgress []string `yaml:"required_egress,omitempty"`
}
// SeverityHard / SeverityAdvisory enumerate the only allowed Severity
// values. We do not use Go enums because the YAML consumer is shell.
const (
SeverityHard = "hard"
SeverityAdvisory = "advisory"
)
// T4PrivilegeContract returns the full T4 capability set.
//
// Add new capabilities here. Each one is automatically picked up by
// any template whose CI consumes `t4_capabilities.yaml` (no per-template
// PR needed for new checks — this is the anti-drift property).
//
// Capability ordering matters for human-readable CI output but is not
// load-bearing for correctness; AsYAML() emits them sorted by Name.
func T4PrivilegeContract() []T4Capability {
return []T4Capability{
{
Name: "agent_uid_1000",
Description: "The container's primary process (the runtime, post-gosu) runs as uid 1000, not root. T4 grants full machine access via privileged + host PID + Docker socket — the WORKLOAD inside that privileged container must still be unprivileged to prevent every untrusted code execution from being trivially root-on-host.",
Probe: `[ "$(id -u)" = "1000" ]`,
Severity: SeverityHard,
Source: "RFC internal#456 §2.1.2; memory feedback_hermes_listpeers_401_token_root600_unreadable_by_uid1000",
},
{
Name: "auth_token_agent_owned",
Description: "/configs/.auth_token is owned by uid 1000 (== AgentUID) so the a2a_mcp_server can read its bearer. In SaaS mode molecule-runtime itself writes the token via save_token() — the ownership equals the runtime's exec uid. If the runtime ever runs as root, this fails and list_peers 401s (the Hermes class bug).",
Probe: `[ -e /configs/.auth_token ] && [ "$(stat -c '%u' /configs/.auth_token)" = "1000" ]`,
Severity: SeverityHard,
Source: "RFC internal#456 §10; memory feedback_hermes_listpeers_401_token_root600_unreadable_by_uid1000",
},
{
Name: "host_root_reach_via_nsenter",
Description: "The uid-1000 agent can attain host root via `sudo -n nsenter --target 1 --mount --pid -- id -u` returning 0. This is the T4 escalation leg: full machine access means the agent CAN escalate to host root deliberately, even though it does not run as root by default.",
Probe: `[ "$(sudo -n nsenter --target 1 --mount --pid -- id -u)" = "0" ]`,
Severity: SeverityHard,
Source: "RFC internal#456 §11; memory reference_per_template_privilege_contract_class_audit_2026_05_16",
},
{
Name: "host_fs_write_readback",
Description: "Host filesystem is mounted at /host and the agent can write+read+remove a file there via sudo. Proves real host reach (not just a PID-1 namespace trick on an isolated init).",
Probe: `MARKER="t4cap-$(date +%s)-$RANDOM"; PROBE_FILE="/host/tmp/.t4-cap-probe-${MOLECULE_T4_PROBE_ID:-$$}"; ` +
`sudo -n sh -c "echo $MARKER > $PROBE_FILE" && ` +
`[ "$(sudo -n cat $PROBE_FILE)" = "$MARKER" ] && ` +
`sudo -n rm -f $PROBE_FILE`,
Severity: SeverityHard,
Source: "RFC internal#456 §11",
},
{
Name: "docker_socket_reachable",
Description: "/var/run/docker.sock is bind-mounted into the container so the agent can manage other containers (T4 use case: agent-as-orchestrator). Proven by 'docker version' returning a server section, which requires the daemon to answer over the socket.",
Probe: `sudo -n docker version --format '{{.Server.Version}}' >/dev/null 2>&1`,
Severity: SeverityHard,
Source: "provisioner.go applyHostConfig T4 branch (case 4)",
},
{
Name: "list_peers_http_200",
Description: "The platform list_peers HTTP endpoint (served by the in-container a2a_mcp_server) returns HTTP 200 when called from uid 1000 with the bearer from /configs/.auth_token. This proves the WHOLE token-ownership chain end-to-end: token written under correct uid → reader uid matches → bearer non-empty → platform accepts. A self-contained empirical test for the Hermes class bug.",
Probe: `BEARER=$(cat /configs/.auth_token 2>/dev/null || echo ""); ` +
`[ -n "$BEARER" ] || exit 1; ` +
`PORT=$(cat /configs/.platform_port 2>/dev/null || echo "8080"); ` +
`STATUS=$(curl -sS -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $BEARER" "http://127.0.0.1:${PORT}/list_peers"); ` +
`[ "$STATUS" = "200" ]`,
Severity: SeverityHard,
Source: "memory reference_openclaw_fresh_provision_nonfunctional_anthropic_default_unroutable; memory reference_openclaw_mcp_peer_wiring_rootcause",
},
{
Name: "agent_home_writable",
Description: "/agent-home is writable by the agent (Files API split per task #128). The Files API redesign uses /agent-home as the user-writable root; the agent must be able to create files there without sudo.",
Probe: `TF=/agent-home/.t4-cap-write-probe-${MOLECULE_T4_PROBE_ID:-$$}; echo ok > "$TF" && [ "$(cat "$TF")" = "ok" ] && rm -f "$TF"`,
Severity: SeverityHard,
Source: "task #128 Files API redesign; memory reference_post_suspension_pipeline",
},
{
Name: "network_egress_https",
Description: "Generic HTTPS egress works. T4 is unconstrained network; the canonical test target is the Gitea instance over its public name, which any fork user can also resolve. Any reachable HTTPS endpoint satisfies it — the YAML carries the recommended targets but accepts any 200/301/302.",
Probe: `for U in $MOLECULE_T4_EGRESS_TARGETS; do ` +
` C=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 8 "$U"); ` +
` case "$C" in 2*|3*) exit 0;; esac; ` +
`done; exit 1`,
Severity: SeverityHard,
Source: "task #174 brief",
RequiredEgress: []string{
// Public, no auth, returns a small JSON.
// Adopters override via MOLECULE_T4_EGRESS_TARGETS.
"https://api.github.com/zen",
"https://www.google.com/generate_204",
},
},
{
Name: "privileged_flag_observable",
Description: "Container is started with --privileged. Observable from inside via /proc/self/status CapEff containing CAP_SYS_ADMIN. Defense-in-depth for the provisioner emission side.",
Probe: `grep -q '^CapEff:.*ffffffffff' /proc/self/status`,
Severity: SeverityAdvisory, // Imperfect — some CAP filters trim CapEff; advisory only.
Source: "provisioner.go applyHostConfig T4 branch (case 4)",
},
{
Name: "pid_host_visible",
Description: "Host PID namespace is shared (--pid=host). The container can see host process 1 (systemd or pid-1 on the EC2 instance). Required for nsenter into host mount/pid namespaces.",
Probe: `[ -d /proc/1/root ] && [ "$(sudo -n readlink /proc/1/ns/pid)" = "$(sudo -n readlink /proc/self/ns/pid)" ]`,
Severity: SeverityHard,
Source: "provisioner.go applyHostConfig T4 branch (case 4): hostCfg.PidMode = 'host'",
},
}
}
// AsYAML renders the contract as a single YAML document templates can
// fetch at CI time. Sorted by Name for deterministic diffs.
//
// We deliberately do not depend on a YAML library here — the format is
// trivial, and one-file pure-stdlib means this can be vendored or
// dumped from any Go context (including a `go run` script in CI).
//
// The format is stable; downstream consumers must treat unknown fields
// as warnings, not errors.
func AsYAML(caps []T4Capability) string {
sorted := make([]T4Capability, len(caps))
copy(sorted, caps)
sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
var b strings.Builder
b.WriteString("# T4 privilege contract — generated from\n")
b.WriteString("# molecule-ai/molecule-core workspace-server/internal/provisioner/t4_privilege_contract.go\n")
b.WriteString("# RFC: molecule-ai/internal#456\n")
b.WriteString("# Do NOT edit this file by hand; regenerate via `go run ./cmd/t4-contract-dump > t4_capabilities.yaml`.\n")
b.WriteString("version: 1\n")
b.WriteString("agent_uid: 1000\n")
b.WriteString("capabilities:\n")
for _, c := range sorted {
fmt.Fprintf(&b, " - name: %s\n", yamlEscape(c.Name))
fmt.Fprintf(&b, " description: %s\n", yamlEscape(c.Description))
fmt.Fprintf(&b, " severity: %s\n", c.Severity)
fmt.Fprintf(&b, " source: %s\n", yamlEscape(c.Source))
fmt.Fprintf(&b, " probe: %s\n", yamlEscape(c.Probe))
if len(c.RequiredEgress) > 0 {
b.WriteString(" required_egress:\n")
for _, u := range c.RequiredEgress {
fmt.Fprintf(&b, " - %s\n", yamlEscape(u))
}
}
}
return b.String()
}
// yamlEscape is a minimal YAML scalar escaper. We always quote with
// double quotes and backslash-escape internal quotes + backslashes —
// safe for the subset of strings we emit (no control chars except \n
// and \t, both of which we replace with literal escapes).
func yamlEscape(s string) string {
r := strings.NewReplacer(
"\\", "\\\\",
"\"", "\\\"",
"\n", "\\n",
"\t", "\\t",
)
return "\"" + r.Replace(s) + "\""
}
@@ -0,0 +1,153 @@
package provisioner
import (
"strings"
"testing"
)
// TestT4PrivilegeContract_AllCapabilitiesHaveRequiredFields enforces
// the invariant that every entry in the contract has at minimum a
// Name, Description, Probe, Severity, and Source — so the YAML the
// templates consume is never partially-filled (a quiet way to drift).
func TestT4PrivilegeContract_AllCapabilitiesHaveRequiredFields(t *testing.T) {
caps := T4PrivilegeContract()
if len(caps) == 0 {
t.Fatal("T4PrivilegeContract returned zero capabilities — the gate would have nothing to assert")
}
for _, c := range caps {
if c.Name == "" {
t.Errorf("capability missing Name: %+v", c)
}
if c.Description == "" {
t.Errorf("capability %q missing Description", c.Name)
}
if c.Probe == "" {
t.Errorf("capability %q missing Probe", c.Name)
}
if c.Severity != SeverityHard && c.Severity != SeverityAdvisory {
t.Errorf("capability %q has invalid Severity %q (allowed: hard, advisory)", c.Name, c.Severity)
}
if c.Source == "" {
t.Errorf("capability %q missing Source — every capability must cite the RFC section or memory that motivates it", c.Name)
}
}
}
// TestT4PrivilegeContract_NamesAreUnique catches a silent
// dup-by-rename: if two capabilities share a name, AsYAML overwrites
// one in any YAML-loader-with-merge implementation, and CI output
// becomes ambiguous.
func TestT4PrivilegeContract_NamesAreUnique(t *testing.T) {
caps := T4PrivilegeContract()
seen := make(map[string]bool, len(caps))
for _, c := range caps {
if seen[c.Name] {
t.Errorf("capability name %q appears more than once", c.Name)
}
seen[c.Name] = true
}
}
// TestT4PrivilegeContract_CoreCapabilitiesPresent pins the minimum
// closure of capabilities the gate guarantees. Adding capabilities
// is fine; removing one of these requires updating this test
// (which the reviewer will see and challenge).
//
// These are exactly the post-conditions cited in RFC internal#456 §10–§11
// + task #128 (Files API) + task #174 (this task).
func TestT4PrivilegeContract_CoreCapabilitiesPresent(t *testing.T) {
required := []string{
"agent_uid_1000",
"auth_token_agent_owned",
"host_root_reach_via_nsenter",
"docker_socket_reachable",
"list_peers_http_200",
"agent_home_writable",
"network_egress_https",
}
caps := T4PrivilegeContract()
have := make(map[string]bool, len(caps))
for _, c := range caps {
have[c.Name] = true
}
for _, r := range required {
if !have[r] {
t.Errorf("required capability %q missing from contract — RFC internal#456 / task #174 says this MUST be in the closure", r)
}
}
}
// TestT4PrivilegeContract_HardCapabilitiesMajority sanity-checks that
// the contract is not silently advisory-only. If someone marks
// everything as "advisory" the gate becomes a no-op without anyone
// noticing — fail the test if hard capabilities are not the majority.
func TestT4PrivilegeContract_HardCapabilitiesMajority(t *testing.T) {
caps := T4PrivilegeContract()
hard := 0
for _, c := range caps {
if c.Severity == SeverityHard {
hard++
}
}
if hard*2 <= len(caps) {
t.Errorf("hard capabilities (%d) must be the strict majority of %d total — otherwise the gate is a no-op", hard, len(caps))
}
}
// TestAsYAML_IsParseableAndStable asserts the AsYAML output is
// stable across invocations (sorted by name) and contains every
// capability's name. We do not depend on a YAML parser here —
// presence of `- name: "<n>"` lines is sufficient and the format
// is deliberately the trivially-greppable subset.
func TestAsYAML_IsParseableAndStable(t *testing.T) {
caps := T4PrivilegeContract()
y1 := AsYAML(caps)
y2 := AsYAML(caps)
if y1 != y2 {
t.Error("AsYAML output is not deterministic across calls — sort/format must be stable for CI diff sanity")
}
for _, c := range caps {
needle := "- name: \"" + c.Name + "\""
if !strings.Contains(y1, needle) {
t.Errorf("AsYAML output missing %q", needle)
}
}
// Header must cite the RFC so adopters can find the source of truth.
if !strings.Contains(y1, "internal#456") {
t.Error("AsYAML header must reference RFC internal#456 — that is the design-of-record")
}
if !strings.Contains(y1, "version: 1") {
t.Error("AsYAML must declare schema version (templates parse-check on this)")
}
}
// TestAsYAML_EscapesEmbeddedQuotes catches a regression in
// yamlEscape: a probe shell string containing a double-quote would
// produce an unparseable YAML scalar.
func TestAsYAML_EscapesEmbeddedQuotes(t *testing.T) {
caps := []T4Capability{{
Name: "embedded_quote",
Description: `says "hi"`,
Probe: `echo "ok"`,
Severity: SeverityHard,
Source: "test",
}}
y := AsYAML(caps)
// We expect the embedded `"` to be backslash-escaped.
if !strings.Contains(y, `\"hi\"`) {
t.Errorf("AsYAML did not escape embedded double quotes; got:\n%s", y)
}
if !strings.Contains(y, `\"ok\"`) {
t.Errorf("AsYAML did not escape embedded double quotes in Probe; got:\n%s", y)
}
}
// TestAgentUIDConsistency ties the contract to the existing
// provisioner-side AgentUID const. The probe for "agent_uid_1000"
// hard-codes `id -u == 1000`; if AgentUID ever changes (no one
// expects it to, but a CI guard is free), the probe must change too.
func TestAgentUIDConsistency(t *testing.T) {
if AgentUID != 1000 {
t.Fatalf("AgentUID is %d but the T4 contract's probes assume 1000; update t4_privilege_contract.go probes before changing AgentUID", AgentUID)
}
}
@@ -81,11 +81,11 @@ func TestPositiveMatches(t *testing.T) {
fixture string
expectedName string
}{
{"ghp_EXAMPLE111122223333444455556666777788889999", "github-pat-classic"},
{"ghs_EXAMPLE111122223333444455556666777788889999", "github-app-installation-token"},
{"gho_EXAMPLE111122223333444455556666777788889999", "github-oauth-user-to-server"},
{"ghu_EXAMPLE111122223333444455556666777788889999", "github-oauth-user"},
{"ghr_EXAMPLE111122223333444455556666777788889999", "github-oauth-refresh"},
{"ghp_" + "EXAMPLE111122223333444455556666777788889999", "github-pat-classic"},
{"ghs_" + "EXAMPLE111122223333444455556666777788889999", "github-app-installation-token"},
{"gho_" + "EXAMPLE111122223333444455556666777788889999", "github-oauth-user-to-server"},
{"ghu_" + "EXAMPLE111122223333444455556666777788889999", "github-oauth-user"},
{"ghr_" + "EXAMPLE111122223333444455556666777788889999", "github-oauth-refresh"},
{"github_pat_EXAMPLE" + strings.Repeat("1", 80), "github-pat-fine-grained"},
{"sk-ant-EXAMPLE" + strings.Repeat("1", 40), "anthropic-api-key"},
{"sk-proj-EXAMPLE" + strings.Repeat("1", 40), "openai-project-key"},
@@ -156,7 +156,7 @@ func TestNegativeShapes(t *testing.T) {
// makes ScanString do its own thing (e.g. accidentally normalise
// case) would diverge silently.
func TestScanString_NoOp(t *testing.T) {
in := "ghp_EXAMPLE111122223333444455556666777788889999"
in := "ghp_" + "EXAMPLE111122223333444455556666777788889999"
m1, err1 := ScanBytes([]byte(in))
if err1 != nil {
t.Fatalf("ScanBytes errored: %v", err1)
+13
View File
@@ -62,6 +62,19 @@ RUN chmod +x ./scripts/molecule-git-token-helper.sh
COPY scripts/molecule-gh-token-refresh.sh ./scripts/
RUN chmod +x ./scripts/molecule-gh-token-refresh.sh
# Generic GIT_ASKPASS helper. Reads HTTPS Basic-Auth credentials from env
# vars (GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD, with GITEA_USER / GITEA_TOKEN
# as fallback) and emits them on the git credential-prompt protocol so
# container-side `git` can authenticate to any private HTTPS remote
# without on-disk .gitconfig / .git-credentials mutation. The platform
# provisioner sets GIT_ASKPASS=/usr/local/bin/molecule-askpass via
# applyAgentGitIdentity (workspace-server/internal/handlers/agent_git_identity.go).
# Filename is the only project-specific marker; the script body contains
# no vendor literals and is identical to the script shipped in each
# open-source workspace template (scripts/git-askpass.sh).
COPY scripts/molecule-askpass /usr/local/bin/molecule-askpass
RUN chmod +x /usr/local/bin/molecule-askpass
# Dirs and permissions
RUN mkdir -p /workspace /plugins /home/agent/.claude /home/agent/.config /home/agent/.local \
/home/agent/.molecule-token-cache && \

Some files were not shown because too many files have changed in this diff Show More