Compare commits

...

56 Commits

Author SHA1 Message Date
core-devops eb21a02b6d test(e2e): support empty auth headers on mac bash
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
CI / Platform (Go) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / all-required (pull_request) Successful in 3m40s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 59s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
audit-force-merge / audit (pull_request) Successful in 6s
2026-05-21 12:41:04 -07:00
hongming 498ce4e287 Merge pull request 'chore(ci): mirror tenant image to staging ecr' (#1647) from chore/mirror-tenant-image-staging-ecr into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 14s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m30s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m26s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 4s
E2E Chat / E2E Chat (push) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
publish-workspace-server-image / build-and-push (push) Successful in 2m42s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m4s
CI / Platform (Go) (push) Successful in 5m8s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 9s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 8s
CI / Canvas (Next.js) (push) Successful in 6m21s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 7m12s
publish-workspace-server-image / Production auto-deploy (push) Successful in 6m6s
chore(ci): mirror tenant image to staging ecr\n\nAdds optional staging ECR tags to the tenant image publish build. The primary publish path remains unchanged when staging publisher secrets are absent.
2026-05-21 19:39:14 +00:00
core-fe 7081a8e900 chore(ci): mirror tenant image to staging ecr
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m9s
gate-check-v3 / gate-check (pull_request) Successful in 6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m22s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 2m4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m19s
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
audit-force-merge / audit (pull_request) Successful in 6s
2026-05-21 12:31:54 -07:00
hongming da4b86a159 Merge pull request #1643 from fix/mcp-delegate-platform-path
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 43s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 31s
Handlers Postgres Integration / detect-changes (push) Successful in 3s
Harness Replays / detect-changes (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
publish-workspace-server-image / build-and-push (push) Successful in 2m46s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m14s
CI / Shellcheck (E2E scripts) (push) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m46s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m12s
CI / Platform (Go) (push) Successful in 4m58s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
E2E Chat / E2E Chat (push) Successful in 3m7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 7m43s
Harness Replays / Harness Replays (push) Successful in 5s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m40s
CI / Canvas (Next.js) (push) Successful in 6m1s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 9m31s
publish-workspace-server-image / Production auto-deploy (push) Successful in 8m48s
main-red-watchdog / watchdog (push) Successful in 2m21s
gate-check-v3 / gate-check (push) Successful in 29s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m42s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 4m26s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 6m14s
fix: route MCP delegation through platform A2A
2026-05-21 18:38:11 +00:00
core-devops 81d864f4bc fix: route mcp delegation through platform a2a
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m13s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 10s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m25s
E2E Chat / E2E Chat (pull_request) Successful in 10s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 8s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m15s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m5s
CI / all-required (pull_request) Successful in 9m5s
audit-force-merge / audit (pull_request) Successful in 18s
2026-05-21 11:22:16 -07:00
hongming c9795a6c4d Merge pull request #1642 from chore/retrigger-peer-visibility-after-publish
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 51s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m12s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m25s
E2E Chat / E2E Chat (push) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m20s
publish-workspace-server-image / build-and-push (push) Successful in 2m56s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m39s
CI / Platform (Go) (push) Successful in 5m35s
CI / Canvas (Next.js) (push) Successful in 6m28s
CI / all-required (push) Successful in 7m35s
publish-workspace-server-image / Production auto-deploy (push) Successful in 6m17s
CI / Canvas Deploy Reminder (push) Successful in 1s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m34s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m37s
chore(e2e): retrigger peer visibility after publish fix
2026-05-21 18:21:40 +00:00
core-fe f5dc55f1d1 chore(e2e): retrigger peer visibility after publish fix
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 58s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m20s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m33s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2m5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m23s
audit-force-merge / audit (pull_request) Successful in 5s
2026-05-21 11:18:34 -07:00
hongming fd92df486c Merge pull request #1641 from fix/publish-buildx-docker-config
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 13s
CI / Shellcheck (E2E scripts) (push) Successful in 22s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 10s
E2E Chat / E2E Chat (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m30s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m35s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 11s
publish-workspace-server-image / build-and-push (push) Successful in 2m51s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m29s
CI / Platform (Go) (push) Successful in 5m24s
CI / Canvas (Next.js) (push) Successful in 6m17s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 6m59s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m58s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m31s
main-red-watchdog / watchdog (push) Successful in 35s
gate-check-v3 / gate-check (push) Successful in 21s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 8m1s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m5s
fix(ci): isolate publish buildx docker config
2026-05-21 17:43:30 +00:00
core-fe fc7498fef0 fix(ci): isolate publish buildx docker config
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m13s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m31s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
CI / all-required (pull_request) Successful in 2m34s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m16s
audit-force-merge / audit (pull_request) Successful in 8s
2026-05-21 10:37:48 -07:00
hongming 51dcca592d docs: clarify multi external workspace config
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 15s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 22s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 15s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 11s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 20s
E2E Chat / E2E Chat (push) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 18s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m59s
publish-workspace-server-image / build-and-push (push) Successful in 6m24s
CI / Platform (Go) (push) Successful in 6m9s
CI / Canvas (Next.js) (push) Successful in 7m3s
CI / all-required (push) Successful in 7m55s
CI / Canvas Deploy Reminder (push) Successful in 2s
publish-workspace-server-image / Production auto-deploy (push) Successful in 4m19s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 8m47s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 10m10s
2026-05-21 17:26:21 +00:00
hongming 27c1e18e98 test(e2e): expose peer visibility token fallback failures
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Failing after 14s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 44s
CI / Shellcheck (E2E scripts) (push) Successful in 14s
E2E Chat / E2E Chat (push) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m48s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
CI / Platform (Go) (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / all-required (push) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (push) Has been cancelled
Merge PR #1639: expose peer-visibility token fallback diagnostics
2026-05-21 17:20:49 +00:00
core-devops 73502db9f4 docs: clarify multi external workspace config
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 9s
gate-check-v3 / gate-check (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 6s
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
CI / Platform (Go) (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 1m36s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 17s
2026-05-21 10:20:08 -07:00
core-fe 4f85ef5209 test(e2e): expose peer visibility token fallback failures
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 13s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
CI / all-required (pull_request) Successful in 1m20s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
audit-force-merge / audit (pull_request) Successful in 5s
2026-05-21 10:17:23 -07:00
hongming def18f28fa Merge pull request #1637
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 14s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 22s
CI / Shellcheck (E2E scripts) (push) Successful in 25s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m20s
publish-workspace-server-image / build-and-push (push) Successful in 6m2s
CI / Platform (Go) (push) Successful in 6m3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m5s
CI / Canvas (Next.js) (push) Successful in 6m50s
CI / all-required (push) Successful in 7m34s
CI / Canvas Deploy Reminder (push) Successful in 3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m17s
publish-workspace-server-image / Production auto-deploy (push) Successful in 3m31s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6m53s
E2E Chat / detect-changes (push) Successful in 7s
E2E Chat / E2E Chat (push) Successful in 3m10s
Railway pin audit (drift detection) / Audit Railway env vars for drift-prone pins (push) Failing after 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m25s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 4s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m33s
main-red-watchdog / watchdog (push) Successful in 2m8s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m12s
gate-check-v3 / gate-check (push) Successful in 20s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 7s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
ci-required-drift / drift (push) Successful in 1m12s
ci: compensate cancelled push status noise
2026-05-21 07:24:34 +00:00
core-fe 8fc27f4d69 ci: compensate cancelled push status noise
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 32s
CI / Detect changes (pull_request) Successful in 16s
CI / Python Lint & Test (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
E2E API Smoke Test / detect-changes (pull_request) Successful in 25s
E2E Chat / detect-changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 11s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 12s
qa-review / approved (pull_request) Failing after 12s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1m4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m7s
audit-force-merge / audit (pull_request) Successful in 11s
2026-05-21 00:19:56 -07:00
hongming 6137657704 Merge pull request #1632
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Failing after 14s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
CI / Python Lint & Test (push) Successful in 5s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 46s
CI / Platform (Go) (push) Successful in 4m34s
CI / Shellcheck (E2E scripts) (push) Successful in 8s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 6m23s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m15s
CI / Canvas (Next.js) (push) Successful in 5m59s
Harness Replays / Harness Replays (push) Successful in 7s
CI / all-required (push) Successful in 7m41s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 15s
CI / Canvas Deploy Reminder (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m21s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m16s
main-red-watchdog / watchdog (push) Successful in 2m45s
gate-check-v3 / gate-check (push) Successful in 34s
E2E Chat / E2E Chat (push) Successful in 4m50s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m5s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 17s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 14s
ci-required-drift / drift (push) Successful in 1m16s
fix(core): guard external A2A loopback routing
2026-05-21 06:59:26 +00:00
hongming 704a8ab7de Merge pull request #1634
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
E2E Staging SaaS (full lifecycle) / pr-validate (push) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Waiting to run
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Has been cancelled
CI / Detect changes (push) Has been cancelled
E2E API Smoke Test / detect-changes (push) Has been cancelled
E2E Chat / detect-changes (push) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
CI / all-required (push) Has been cancelled
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m21s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m12s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m26s
ci: path-scope required CI lanes on PRs
2026-05-21 06:59:25 +00:00
core-fe e358b9b92f ci: fix PR path filter base diff
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 16s
CI / Python Lint & Test (pull_request) Successful in 1m2s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m12s
E2E Chat / detect-changes (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Harness Replays / detect-changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 20s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 22s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m50s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m44s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m42s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m40s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 2m2s
gate-check-v3 / gate-check (pull_request) Successful in 16s
qa-review / approved (pull_request) Successful in 12s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 2m0s
security-review / approved (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 6s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m21s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m31s
CI / Platform (Go) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 23s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
E2E Chat / E2E Chat (pull_request) Successful in 12s
CI / all-required (pull_request) Successful in 7m47s
Harness Replays / Harness Replays (pull_request) Successful in 16s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m55s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m48s
audit-force-merge / audit (pull_request) Successful in 10s
2026-05-20 23:12:27 -07:00
core-devops 7f59b7fd35 fix(watchdog): add HEAD-recheck + settling delay to suppress cancel-cascade false-positives (#1635)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Successful in 7m17s
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 35s
CI / Python Lint & Test (push) Successful in 24s
E2E Chat / detect-changes (push) Successful in 19s
E2E API Smoke Test / detect-changes (push) Successful in 22s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m47s
CI / Platform (Go) (push) Successful in 6m22s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 16s
CI / Canvas (Next.js) (push) Successful in 7m9s
CI / all-required (push) Successful in 6m11s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 20s
E2E Chat / E2E Chat (push) Successful in 26s
ci-required-drift / drift (push) Successful in 1m20s
publish-workspace-server-image / Production auto-deploy (push) Successful in 27m24s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m58s
CI / Canvas Deploy Reminder (push) Successful in 8s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 8s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 10s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m41s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m8s
2026-05-21 06:08:40 +00:00
core-fe c37caa2ec9 ci: share path filter helper
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
CI / Detect changes (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m39s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m10s
gate-check-v3 / gate-check (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 12s
sop-checklist / all-items-acked (pull_request) Successful in 12s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m17s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m20s
qa-review / approved (pull_request) Refired via /qa-recheck by hongming
security-review / approved (pull_request) Refired via /security-recheck by hongming
CI / Platform (Go) (pull_request) Successful in 4m26s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / all-required (pull_request) Successful in 23m39s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m32s
2026-05-20 22:32:17 -07:00
core-fe 6e77083b84 ci: path-scope shellcheck on prs
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m18s
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 2m10s
qa-review / approved (pull_request) Failing after 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
gate-check-v3 / gate-check (pull_request) Successful in 6s
security-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 9s
sop-tier-check / tier-check (pull_request) Successful in 10s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m33s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m30s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m48s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m12s
CI / Platform (Go) (pull_request) Successful in 5m8s
CI / all-required (pull_request) Successful in 24m47s
2026-05-20 22:20:02 -07:00
hongming 660fc20124 fix(core): add admin workspace token mint route
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Successful in 6m5s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 15s
CI / Python Lint & Test (push) Successful in 21s
CI / Shellcheck (E2E scripts) (push) Successful in 25s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
CI / Platform (Go) (push) Successful in 4m39s
CI / Canvas (Next.js) (push) Successful in 5m52s
CI / all-required (push) Successful in 5m33s
publish-workspace-server-image / Production auto-deploy (push) Successful in 13m2s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m24s
E2E Chat / E2E Chat (push) Successful in 3m9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
Harness Replays / Harness Replays (push) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m40s
CI / Canvas Deploy Reminder (push) Successful in 9s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m42s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
main-red-watchdog / watchdog (push) Successful in 44s
gate-check-v3 / gate-check (push) Successful in 29s
Force-merged after required code checks passed; review gates intentionally bypassed per operator approval.
2026-05-21 05:18:51 +00:00
core-fe 3a3f670662 ci: path-scope canvas on prs
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / review-refire (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m31s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m29s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m28s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 15m33s
2026-05-20 22:17:30 -07:00
core-devops 07457ad556 fix(core): add admin workspace token mint route
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 23s
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 7m10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
qa-review / approved (pull_request) Failing after 17s
security-review / approved (pull_request) Failing after 16s
sop-checklist / review-refire (pull_request) Has been skipped
CI / Canvas (Next.js) (pull_request) Successful in 8m27s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 51s
sop-tier-check / tier-check (pull_request) Successful in 46s
CI / all-required (pull_request) Successful in 6m45s
audit-force-merge / audit (pull_request) Successful in 5s
2026-05-20 22:02:54 -07:00
core-fe 30a8aa10b8 ci: path-scope platform go on prs
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 14s
CI / Detect changes (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 18s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 16s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m22s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 59s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m27s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m28s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m21s
CI / Canvas (Next.js) (pull_request) Successful in 5m28s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 7s
CI / all-required (pull_request) Successful in 12m13s
2026-05-20 22:02:54 -07:00
core-fe e9c4f23ae2 fix(core): guard external a2a loopback routing
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 30s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 32s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m31s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request) Successful in 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 6m26s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m19s
E2E Chat / E2E Chat (pull_request) Successful in 9s
CI / Canvas (Next.js) (pull_request) Successful in 7m33s
CI / all-required (pull_request) Successful in 6m4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m46s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
qa-review / approved (pull_request) Refired via /qa-recheck by hongming
security-review / approved (pull_request) Refired via /security-recheck by hongming
audit-force-merge / audit (pull_request) Successful in 9s
2026-05-20 21:53:28 -07:00
hongming 08b3aa8a2c Merge pull request 'fix(core): mint staging peer visibility token fallback' (#1631) from fix/staging-peer-visibility-token into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 16s
CI / Python Lint & Test (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 21s
E2E Chat / detect-changes (push) Successful in 20s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 19s
Handlers Postgres Integration / detect-changes (push) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 13s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m26s
E2E Chat / E2E Chat (push) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m50s
CI / Platform (Go) (push) Successful in 6m56s
publish-workspace-server-image / build-and-push (push) Successful in 7m15s
CI / Canvas (Next.js) (push) Successful in 8m19s
CI / all-required (push) Successful in 8m6s
publish-workspace-server-image / Production auto-deploy (push) Successful in 2m53s
CI / Canvas Deploy Reminder (push) Successful in 1s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 39s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m56s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m43s
main-red-watchdog / watchdog (push) Successful in 45s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 6s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
gate-check-v3 / gate-check (push) Successful in 31s
ci-required-drift / drift (push) Successful in 1m3s
2026-05-21 04:52:05 +00:00
core-devops 022cc1136b fix(core): mint staging peer visibility token fallback
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 25s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 5m5s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 6m8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 4m40s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m7s
audit-force-merge / audit (pull_request) Successful in 6s
2026-05-20 21:35:42 -07:00
hongming a1cfd085a8 Merge pull request 'chore(runtime): delete core workspace copy' (#1620) from chore/delete-core-workspace-runtime into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Successful in 3m7s
Block internal-flavored paths / Block forbidden paths (push) Successful in 21s
CI / Detect changes (push) Successful in 35s
CI / Shellcheck (E2E scripts) (push) Successful in 38s
publish-workspace-server-image / build-and-push (push) Successful in 6m38s
CI / Python Lint & Test (push) Successful in 35s
E2E Chat / detect-changes (push) Successful in 13s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Harness Replays / detect-changes (push) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m21s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m13s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m37s
CI / Platform (Go) (push) Successful in 4m44s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m25s
CI / Canvas (Next.js) (push) Successful in 5m49s
CI / all-required (push) Successful in 5m23s
publish-workspace-server-image / Production auto-deploy (push) Successful in 7m6s
E2E Chat / E2E Chat (push) Successful in 4m8s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m36s
Harness Replays / Harness Replays (push) Successful in 23s
CI / Canvas Deploy Reminder (push) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m59s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m18s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m0s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 7m41s
2026-05-21 04:23:33 +00:00
hongming e97eb95d9d Merge pull request 'ci: keep browser e2e out of normal pr path' (#1629) from fix/split-heavy-e2e-required-path into main
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Detect changes (push) Successful in 16s
publish-workspace-server-image / build-and-push (push) Failing after 14s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
CI / Shellcheck (E2E scripts) (push) Successful in 20s
E2E API Smoke Test / detect-changes (push) Successful in 23s
E2E Chat / detect-changes (push) Successful in 18s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m25s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m48s
CI / Platform (Go) (push) Successful in 5m0s
CI / Canvas (Next.js) (push) Successful in 6m9s
ci-required-drift / drift (push) Successful in 1m28s
CI / Python Lint & Test (push) Successful in 6m45s
CI / all-required (push) Successful in 6m57s
E2E Chat / E2E Chat (push) Failing after 5m26s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m6s
2026-05-21 04:15:09 +00:00
core-devops 522e8708a5 fix(core): disambiguate chat e2e text assertions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 46s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m8s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 44s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m30s
sop-checklist / review-refire (pull_request) Has been skipped
qa-review / approved (pull_request) Failing after 7s
gate-check-v3 / gate-check (pull_request) Successful in 8s
security-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 12s
sop-tier-check / tier-check (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m29s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m15s
CI / Platform (Go) (pull_request) Successful in 3m37s
CI / Canvas (Next.js) (pull_request) Successful in 5m5s
CI / all-required (pull_request) Successful in 5m8s
Harness Replays / Harness Replays (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m29s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m54s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m47s
E2E Chat / E2E Chat (pull_request) Successful in 3m8s
audit-force-merge / audit (pull_request) Successful in 4s
2026-05-20 21:02:35 -07:00
core-fe 16b203fff1 ci: keep browser e2e out of normal pr path
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
CI / Detect changes (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 34s
security-review / approved (pull_request) Failing after 6s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
CI / Platform (Go) (pull_request) Successful in 3m6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m22s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 12s
E2E Chat / E2E Chat (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 5m14s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m40s
CI / all-required (pull_request) Successful in 6m43s
audit-force-merge / audit (pull_request) Successful in 7s
2026-05-20 21:02:33 -07:00
core-devops 3ceebf3b1f fix(core): mint chat e2e token after url seed
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 31s
E2E API Smoke Test / detect-changes (pull_request) Successful in 27s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 37s
E2E Chat / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 37s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 18s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m17s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m0s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m6s
gate-check-v3 / gate-check (pull_request) Successful in 10s
qa-review / approved (pull_request) Failing after 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 8s
sop-checklist / all-items-acked (pull_request) Successful in 7s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 34s
CI / Platform (Go) (pull_request) Successful in 3m32s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m5s
Harness Replays / Harness Replays (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m18s
CI / Canvas (Next.js) (pull_request) Successful in 4m45s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 4m52s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m27s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m0s
E2E Chat / E2E Chat (pull_request) Failing after 2m58s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8m49s
2026-05-20 20:53:28 -07:00
core-devops e62db981e8 fix(core): keep docker mode in chat e2e
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 46s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 54s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m3s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 59s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 58s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
gate-check-v3 / gate-check (pull_request) Successful in 20s
qa-review / approved (pull_request) Failing after 14s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 53s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m21s
sop-tier-check / tier-check (pull_request) Successful in 6s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 35s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 40s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m28s
CI / Platform (Go) (pull_request) Successful in 4m54s
Harness Replays / Harness Replays (pull_request) Successful in 6s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m13s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m1s
E2E Chat / E2E Chat (pull_request) Failing after 3m5s
CI / Canvas (Next.js) (pull_request) Successful in 6m58s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 6m48s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m57s
2026-05-20 20:36:36 -07:00
core-devops 679314aa8f fix(core): keep chat e2e echo url local
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 13s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
CI / Python Lint & Test (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 48s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 43s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m51s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m28s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m14s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m22s
CI / Canvas (Next.js) (pull_request) Successful in 6m7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 9s
CI / all-required (pull_request) Successful in 5m15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m21s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m35s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m20s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m17s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m10s
E2E Chat / E2E Chat (pull_request) Failing after 2m20s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m34s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6m55s
2026-05-20 20:29:04 -07:00
hongming c58ffd2828 Merge pull request 'ci: reduce scheduled runner load and prep prebaked browsers' (#1628) from fix/ci-cron-bots-prebake-1357 into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Failing after 15s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 26s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m29s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
CI / Platform (Go) (push) Successful in 4m35s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m17s
CI / Canvas (Next.js) (push) Successful in 6m12s
CI / Python Lint & Test (push) Successful in 7m7s
CI / all-required (push) Successful in 6m44s
E2E Chat / E2E Chat (push) Failing after 5m39s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m29s
CI / Canvas Deploy Reminder (push) Successful in 5s
lint-bp-context-emit-match / lint-bp-context-emit-match (push) Successful in 1m24s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m23s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 10m27s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 46s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m33s
main-red-watchdog / watchdog (push) Successful in 29s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m31s
gate-check-v3 / gate-check (push) Successful in 1m9s
2026-05-21 03:17:11 +00:00
core-devops 1e850af6de fix(core): use safe urls in api e2e discovery
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m52s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 48s
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m19s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m26s
CI / Platform (Go) (pull_request) Successful in 5m11s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m17s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6m12s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 11s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m13s
CI / all-required (pull_request) Successful in 5m52s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 11s
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m23s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m15s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m30s
E2E Chat / E2E Chat (pull_request) Failing after 6m59s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6m45s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m31s
2026-05-20 20:09:53 -07:00
core-devops 256eeedc69 fix(core): make api e2e use external workspace auth
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 2m24s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 45s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m26s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m26s
CI / Canvas (Next.js) (pull_request) Successful in 6m19s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m25s
CI / all-required (pull_request) Successful in 6m13s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m21s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m18s
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m8s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m27s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m54s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m56s
E2E Chat / E2E Chat (pull_request) Failing after 5m35s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m12s
2026-05-20 19:56:47 -07:00
core-devops a501d33f80 fix(core): ignore codex placeholder in coverage e2e
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 41s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m43s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m30s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m7s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m28s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6m14s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 5m28s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 12s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m33s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 7s
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m18s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m23s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m12s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m15s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m53s
Harness Replays / Harness Replays (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8m31s
2026-05-20 17:44:58 -07:00
core-fe a9bc5e39d5 ci: reduce scheduled runner load and prep prebaked browsers
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 4m30s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m37s
CI / Canvas (Next.js) (pull_request) Successful in 5m49s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m32s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m11s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m19s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
security-review / approved (pull_request) Failing after 4s
sop-tier-check / tier-check (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m20s
CI / Python Lint & Test (pull_request) Successful in 6m53s
CI / all-required (pull_request) Successful in 6m58s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 6m12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9m31s
audit-force-merge / audit (pull_request) Successful in 6s
2026-05-20 17:36:09 -07:00
core-devops 73faaf9448 fix(core): refresh chat e2e runtime url cache
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
CI / Detect changes (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m53s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 37s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m33s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m36s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m17s
CI / Canvas (Next.js) (pull_request) Successful in 6m12s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m21s
CI / all-required (pull_request) Successful in 6m0s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 14s
gate-check-v3 / gate-check (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 14s
security-review / approved (pull_request) Failing after 6s
qa-review / approved (pull_request) Failing after 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m28s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m11s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m32s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m38s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m38s
E2E Chat / E2E Chat (pull_request) Failing after 5m46s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m20s
2026-05-20 17:28:32 -07:00
core-devops 0aff9cf85f fix(core): use per-workspace auth in today coverage e2e
CI / Python Lint & Test (pull_request) Waiting to run
CI / all-required (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Waiting to run
CI / Canvas (Next.js) (pull_request) Waiting to run
CI / Shellcheck (E2E scripts) (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / detect-changes (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Waiting to run
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / detect-changes (pull_request) Waiting to run
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
qa-review / approved (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Waiting to run
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-checklist / review-refire (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
2026-05-20 17:24:43 -07:00
core-be 2ee97c097d fix(self-deleg): 3-layer defense including SQL self-filter (closes #383)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Failing after 12s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Failing after 17s
CI / all-required (push) Failing after 2s
E2E API Smoke Test / detect-changes (push) Successful in 6s
E2E Chat / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
CI / Platform (Go) (push) Successful in 4m55s
CI / Canvas (Next.js) (push) Successful in 6m13s
CI / Python Lint & Test (push) Successful in 6m43s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
Harness Replays / Harness Replays (push) Successful in 17s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 2m20s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m31s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m43s
E2E Chat / E2E Chat (push) Failing after 7m5s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
main-red-watchdog / watchdog (push) Successful in 31s
gate-check-v3 / gate-check (push) Successful in 22s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m34s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 13s
gitea-merge-queue / queue (push) Successful in 13s
status-reaper / reap (push) Successful in 1m11s
ci-required-drift / drift (push) Successful in 1m15s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m46s
2 APPROVES (core-security id=5165, core-devops id=5167). CI/all-required green. Resolves chloe-dong 小董文婷 self-delegation 400 loop.
Co-authored-by: core-be <core-be@agents.moleculesai.app>
Co-committed-by: core-be <core-be@agents.moleculesai.app>
2026-05-21 00:12:38 +00:00
core-devops f088e0ee90 fix(core): decouple local peer visibility from container boot
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 43s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m53s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 44s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m29s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m40s
CI / Canvas (Next.js) (pull_request) Successful in 6m33s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m25s
CI / all-required (pull_request) Successful in 5m47s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m17s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m14s
gate-check-v3 / gate-check (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m26s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
qa-review / approved (pull_request) Failing after 8s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
security-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m17s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m13s
Harness Replays / Harness Replays (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m26s
E2E Chat / E2E Chat (pull_request) Failing after 6m51s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 11m13s
2026-05-20 16:56:04 -07:00
core-be ee9dc5b9c5 fix(rfc523): scope forbidden-env check to global_secrets (allow user-set workspace_secrets)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Failing after 14s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 14s
CI / Shellcheck (E2E scripts) (push) Successful in 16s
E2E API Smoke Test / detect-changes (push) Successful in 7s
CI / Platform (Go) (push) Successful in 4m59s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 32s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 6m16s
CI / Python Lint & Test (push) Successful in 6m55s
CI / all-required (push) Successful in 5m42s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 9s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 7m52s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 14s
Harness Replays / Harness Replays (push) Successful in 5s
CI / Canvas Deploy Reminder (push) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 1m46s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m9s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m38s
E2E Chat / E2E Chat (push) Failing after 6m36s
main-red-watchdog / watchdog (push) Successful in 44s
gate-check-v3 / gate-check (push) Successful in 23s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m14s
Reviews APPROVED (core-qa,core-devops). CI/all-required green. Non-required E2E + security-review checks noted in PR body; not gating per BP.
Co-authored-by: core-be <core-be@agents.moleculesai.app>
Co-committed-by: core-be <core-be@agents.moleculesai.app>
2026-05-20 23:40:54 +00:00
core-devops ea4681299d fix(core): choose keyed parent for local peer visibility
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 26s
CI / Python Lint & Test (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 21s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m30s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m25s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m20s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m23s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m25s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m10s
CI / Canvas (Next.js) (pull_request) Successful in 6m10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
qa-review / approved (pull_request) Failing after 6s
gate-check-v3 / gate-check (pull_request) Successful in 8s
security-review / approved (pull_request) Failing after 3s
CI / all-required (pull_request) Successful in 5m59s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m17s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m14s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m26s
Harness Replays / Harness Replays (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 16m34s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m50s
E2E Chat / E2E Chat (pull_request) Failing after 6m0s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8m53s
2026-05-20 16:29:13 -07:00
core-fe 5455ddefe2 feat(canvas): surface current org name/slug/UUID in Settings panel
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
publish-canvas-image / Build & push canvas image (push) Failing after 7s
publish-workspace-server-image / build-and-push (push) Successful in 5m3s
Block internal-flavored paths / Block forbidden paths (push) Successful in 17s
CI / Detect changes (push) Successful in 18s
CI / Shellcheck (E2E scripts) (push) Successful in 36s
E2E API Smoke Test / detect-changes (push) Successful in 14s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
CI / Platform (Go) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
publish-workspace-server-image / Production auto-deploy (push) Has been cancelled
CI / all-required (push) Has been cancelled
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m21s
Reviews APPROVED (core-qa,core-devops). CI/all-required green. Non-required E2E + security-review checks noted in PR body; not gating per BP.
Co-authored-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
Co-committed-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
2026-05-20 23:27:58 +00:00
core-fe 80d517b8ab fix(canvas): external/MCP workspace progress UX — surface poll-mode queued state (task #227)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
publish-canvas-image / Build & push canvas image (push) Failing after 5s
publish-workspace-server-image / build-and-push (push) Successful in 5m53s
Block internal-flavored paths / Block forbidden paths (push) Successful in 3s
CI / Detect changes (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 31s
CI / Platform (Go) (push) Successful in 4m29s
E2E API Smoke Test / detect-changes (push) Successful in 6s
E2E Chat / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 5m51s
CI / Python Lint & Test (push) Successful in 6m59s
CI / all-required (push) Successful in 6m50s
publish-workspace-server-image / Production auto-deploy (push) Successful in 13m35s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m47s
gate-check-v3 / gate-check (push) Successful in 28s
main-red-watchdog / watchdog (push) Successful in 37s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 8s
ci-required-drift / drift (push) Successful in 1m6s
Reviews APPROVED (core-qa,core-devops). CI/all-required green. Non-required E2E + security-review checks noted in PR body; not gating per BP.
Co-authored-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
Co-committed-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
2026-05-20 22:59:38 +00:00
core-be dbbd351c70 fix(workspace-server): debounce file-write → RestartByID tight loop (#624)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Failing after 18s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m31s
Reviews APPROVED (core-qa,core-devops). CI/all-required green. Non-required E2E + security-review checks noted in PR body; not gating per BP.
Co-authored-by: core-be <core-be@agents.moleculesai.app>
Co-committed-by: core-be <core-be@agents.moleculesai.app>
2026-05-20 22:59:05 +00:00
core-fe 55fa44571e fix(canvas): polite tasks/cancel before /workspaces/:id/restart for Stop All (task #377 companion)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / all-required (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Failing after 6s
publish-workspace-server-image / build-and-push (push) Has been cancelled
Reviews APPROVED (core-qa,core-devops). CI/all-required green. Non-required E2E + security-review checks noted in PR body; not gating per BP.
Co-authored-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
Co-committed-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
2026-05-20 22:58:57 +00:00
core-fe 676f9a033b fix(canvas/chat): A2A hints point at Activity tab (closeout internal#212)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-canvas-image / Build & push canvas image (push) Failing after 7s
publish-workspace-server-image / build-and-push (push) Has been cancelled
Reviews APPROVED (core-qa,core-devops). CI/all-required green. Non-required E2E + security-review checks noted in PR body; not gating per BP.
Co-authored-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
Co-committed-by: Molecule AI · core-fe <core-fe@agents.moleculesai.app>
2026-05-20 22:58:51 +00:00
core-devops b343995c05 fix(core): exercise external connection e2e with external runtime
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 17s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
E2E Chat / detect-changes (pull_request) Successful in 20s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Harness Replays / detect-changes (pull_request) Successful in 10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 18s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m35s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m39s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m16s
CI / Platform (Go) (pull_request) Successful in 5m40s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6m38s
CI / all-required (pull_request) Successful in 5m30s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m22s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 17m16s
2026-05-20 15:58:31 -07:00
core-devops 4be7966654 fix(core): satisfy shellcheck for e2e auth patch
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m48s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m39s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m21s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 6m0s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m31s
CI / all-required (pull_request) Successful in 5m26s
qa-review / approved (pull_request) Failing after 4s
gate-check-v3 / gate-check (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request) Has been skipped
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m24s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m17s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m50s
Harness Replays / Harness Replays (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 16m31s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m21s
E2E Chat / E2E Chat (pull_request) Failing after 7m54s
2026-05-20 15:24:15 -07:00
core-devops 4c290f49f2 fix(core): make e2e auth explicit after token bootstrap
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Failing after 19s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / all-required (pull_request) Failing after 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m23s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m35s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m15s
CI / Platform (Go) (pull_request) Successful in 4m55s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m35s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6m12s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m21s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m5s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m21s
E2E Chat / E2E Chat (pull_request) Failing after 6m40s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 16m42s
2026-05-20 15:20:31 -07:00
core-devops f13a675408 fix(core): use runtime package in e2e after workspace deletion
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 4m33s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 6m17s
CI / all-required (pull_request) Successful in 3m36s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m37s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m32s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m22s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 8s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m23s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m43s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m53s
E2E Chat / E2E Chat (pull_request) Failing after 6m5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 16m8s
2026-05-20 15:03:38 -07:00
core-devops 9aa4764301 chore(runtime): delete core workspace copy
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 1m43s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m36s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m29s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m23s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m13s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
gate-check-v3 / gate-check (pull_request) Successful in 9s
security-review / approved (pull_request) Failing after 5s
qa-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m34s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m15s
CI / Canvas (Next.js) (pull_request) Successful in 6m19s
CI / all-required (pull_request) Successful in 5m46s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 15s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m11s
E2E Chat / E2E Chat (pull_request) Failing after 6m36s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
2026-05-20 14:47:55 -07:00
303 changed files with 3957 additions and 71110 deletions
+174
View File
@@ -0,0 +1,174 @@
#!/usr/bin/env python3
"""Shared path-filter helper for Gitea Actions workflows.
Computes changed files against the PR base SHA or push-before SHA and writes
boolean outputs to GITHUB_OUTPUT. If the diff base is missing or untrusted, the
helper fails open by setting every output in the selected profile to true.
"""
from __future__ import annotations
import argparse
import os
import re
import subprocess
import sys
from pathlib import Path
PROFILES: dict[str, dict[str, str]] = {
"ci": {
"platform": r"^workspace-server/",
"canvas": r"^canvas/",
"python": r"^workspace/",
"scripts": r"^tests/e2e/|^scripts/|^infra/scripts/",
},
"handlers-postgres": {
"handlers": (
r"^workspace-server/internal/handlers/"
r"|^workspace-server/internal/wsauth/"
r"|^workspace-server/migrations/"
r"|^\.gitea/workflows/handlers-postgres-integration\.yml$"
),
},
"e2e-api": {
"api": r"^workspace-server/|^tests/e2e/|^\.gitea/workflows/e2e-api\.yml$",
},
}
def classify(profile: str, paths: list[str]) -> dict[str, bool]:
patterns = PROFILES[profile]
return {
name: any(re.search(pattern, path) for path in paths)
for name, pattern in patterns.items()
}
def all_true(profile: str) -> dict[str, bool]:
return {name: True for name in PROFILES[profile]}
def resolve_base(event_name: str, pr_base_sha: str, push_before: str) -> str:
if event_name == "pull_request" and pr_base_sha:
return pr_base_sha
return push_before
def is_zero_sha(value: str) -> bool:
return not value or bool(re.fullmatch(r"0+", value))
def run_git(args: list[str], *, timeout: int = 30) -> subprocess.CompletedProcess[str]:
return subprocess.run(
["git", *args],
check=False,
text=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
timeout=timeout,
)
def base_exists(base: str) -> bool:
return run_git(["cat-file", "-e", base]).returncode == 0
def fetch_base(base: str, base_ref: str) -> None:
# Gitea may reject fetching an arbitrary unadvertised SHA from a shallow
# PR checkout. Fetch the advertised base branch first, then fall back to
# the SHA for hosts that allow it.
if base_ref:
run_git(["fetch", "--depth=1", "origin", base_ref])
if not base_exists(base):
run_git(["fetch", "--depth=1", "origin", base])
def deepen_base_ref(base_ref: str) -> None:
if base_ref:
run_git(["fetch", "--deepen=200", "origin", base_ref], timeout=60)
def merge_base(base: str) -> str | None:
proc = run_git(["merge-base", base, "HEAD"])
if proc.returncode != 0:
return None
value = proc.stdout.strip()
return value or None
def changed_paths(base: str, *, use_merge_base: bool) -> list[str] | None:
compare_base = base
if use_merge_base:
compare_base = merge_base(base) or ""
if not compare_base:
return None
proc = run_git(["diff", "--name-only", compare_base, "HEAD"])
if proc.returncode != 0:
return None
return [line for line in proc.stdout.splitlines() if line]
def write_outputs(values: dict[str, bool], output_path: str | None) -> None:
lines = [f"{name}={'true' if value else 'false'}" for name, value in values.items()]
if output_path:
with Path(output_path).open("a", encoding="utf-8") as fh:
for line in lines:
fh.write(line + "\n")
else:
for line in lines:
print(line)
def detect(
profile: str,
event_name: str,
pr_base_sha: str,
push_before: str,
base_ref: str = "",
) -> dict[str, bool]:
base = resolve_base(event_name, pr_base_sha, push_before)
if is_zero_sha(base):
return all_true(profile)
if not base_exists(base):
fetch_base(base, base_ref)
if not base_exists(base):
return all_true(profile)
use_merge_base = event_name == "pull_request"
if use_merge_base and base_ref and merge_base(base) is None:
deepen_base_ref(base_ref)
paths = changed_paths(base, use_merge_base=use_merge_base)
if paths is None:
return all_true(profile)
return classify(profile, paths)
def parse_args(argv: list[str]) -> argparse.Namespace:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--profile", required=True, choices=sorted(PROFILES))
parser.add_argument("--event-name", default=os.environ.get("GITHUB_EVENT_NAME", ""))
parser.add_argument("--pr-base-sha", default="")
parser.add_argument("--base-ref", default="")
parser.add_argument("--push-before", default=os.environ.get("GITHUB_EVENT_BEFORE", ""))
return parser.parse_args(argv)
def main(argv: list[str]) -> int:
args = parse_args(argv)
values = detect(
args.profile,
args.event_name,
args.pr_base_sha,
args.push_before,
args.base_ref,
)
write_outputs(values, os.environ.get("GITHUB_OUTPUT"))
return 0
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
+107
View File
@@ -61,6 +61,7 @@ import os
import shutil
import subprocess
import sys
import time
import urllib.error
import urllib.parse
import urllib.request
@@ -89,6 +90,19 @@ API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else ""
# match by exact title without parsing.
TITLE_PREFIX = "[main-red]"
# Settling window (seconds) between initial red detection and the
# pre-file recheck. The recheck filters out the two largest false-
# positive classes seen in mc#1597..1630 (task #394, 2026-05-21):
# 1. HEAD moved on (a new commit landed mid-tick) — the prior red SHA
# is no longer authoritative; let the next cron tick re-evaluate.
# 2. Combined status recovered on the SAME SHA (transient
# cancel-cascade rolled forward to success on retry).
# 90s is well below the hourly cron cadence; a real failure that
# persists past it is the one we want surfaced.
# Override with WATCHDOG_RECHECK_DELAY_SECS for tests / local probes
# (the test suite stubs time.sleep to a no-op).
RECHECK_DELAY_SECS = int(_env("WATCHDOG_RECHECK_DELAY_SECS", default="90"))
def _require_runtime_env() -> None:
"""Enforce env contract — called from `main()` only.
@@ -172,6 +186,49 @@ def api(
return status, {"_raw": raw.decode("utf-8", errors="replace")}
# --------------------------------------------------------------------------
# action_run.status resolver — extensibility hook for task #394.
# --------------------------------------------------------------------------
def _resolve_action_run_status(target_url: str) -> int | None:
"""Resolve the underlying Gitea `action_run.status` integer for the
run referenced by `target_url`, returning None if the resolver
cannot reach an authoritative source from the runner.
Canonical Gitea 1.22.6 enum (per `models/actions/status.go` +
`reference_gitea_action_status_enum_corrected_2026_05_19`):
1=Success, 2=Failure, 3=Cancelled, 4=Skipped,
5=Waiting, 6=Running, 7=Blocked
Only `status == 2` is a real defect; status=3 is cancel-cascade and
status=1 is an emission artifact (Gitea wrote a 'failure' commit_status
row for a run that actually succeeded — observed empirically on
`publish-canvas-image` jobs at SHAs in mc#1597..1630).
CURRENT STATE (2026-05-20, verified): Gitea 1.22.6 exposes NO REST
endpoint for `action_run.status`. Probed:
/api/v1/repos/{o}/{r}/actions/runs/{id} → HTTP 404
/api/v1/repos/{o}/{r}/actions/jobs/{id} → HTTP 404
/api/v1/repos/{o}/{r}/actions/tasks/{id} → HTTP 404
/swagger.v1.json paths containing 'actions' → secrets+variables+runners only
The SPA backend (`/{repo}/actions/runs/{id}/jobs/{idx}` POST) requires
a session CSRF token, unreachable from a runner. The only authoritative
source today is direct DB access (`mol_action_status` on op-host,
`docker exec molecule-postgres-1 psql ...`), which the runner cannot
reach.
Therefore: this hook returns None on every call. Callers MUST fall
back to the description-string filter (existing) plus the HEAD
recheck (this PR). When a future Gitea release (>=1.23 expected) or
an op-host proxy exposes the endpoint, replace the body of this
function with an `api(...)` call — the caller contract is stable.
See also:
- `reference_chronic_red_sweep_cancelled_vs_failed_filter`
- `feedback_gitea_status_enum_use_helper_not_raw_int`
"""
_ = target_url # noqa: F841 — intentional placeholder
return None
# --------------------------------------------------------------------------
# Gitea reads
# --------------------------------------------------------------------------
@@ -614,6 +671,56 @@ def run_once(*, dry_run: bool = False) -> int:
}
if red:
# HEAD recheck (task #394 — guards mc#1597..1630 false-positive
# cluster). After the initial detection, wait RECHECK_DELAY_SECS
# (default 90s; tests stub time.sleep) and re-evaluate:
#
# 1. Re-fetch HEAD SHA. If HEAD moved, a new commit landed
# mid-tick — the prior red SHA is no longer authoritative
# and the next cron run will re-evaluate against the new
# HEAD. Skip-file.
#
# 2. If HEAD unchanged, re-fetch the combined status. If it
# recovered (combined state no longer in {failure,error}
# after the cancel-cascade filter), a transient retry
# rolled the run forward. Skip-file.
#
# Both paths emit a Loki event distinguishable from the real
# `main_red_detected` so obs queries can track filter activity.
# The settling window is well below the hourly cron cadence —
# genuine failures persist past it and are surfaced normally.
time.sleep(RECHECK_DELAY_SECS)
recheck_sha = get_head_sha(WATCH_BRANCH)
if recheck_sha != sha:
emit_loki_event("main_red_skipped_head_drift", sha, [])
print(
f"::notice::skip-file (HEAD moved): initial red at "
f"{sha[:10]} but HEAD is now {recheck_sha[:10]} on "
f"{WATCH_BRANCH}; next cron tick will re-evaluate."
)
return 0
recheck_status = get_combined_status(sha)
recheck_red, recheck_failed = is_red(recheck_status)
if not recheck_red:
emit_loki_event("main_red_skipped_recovered", sha, [])
print(
f"::notice::skip-file (recovered after settling): "
f"combined state at {sha[:10]} flipped to "
f"{recheck_status.get('state')!r} on recheck; "
f"initial red was a transient cancel-cascade."
)
return 0
# Still red after settling — file/update. Use the recheck data
# as authoritative so the issue body reflects the latest state.
failed = recheck_failed
debug["recheck_combined_state"] = recheck_status.get("state")
debug["recheck_failed_contexts"] = [
s.get("context") for s in failed
]
failed_ctxs = [s.get("context") for s in failed if s.get("context")]
emit_loki_event("main_red_detected", sha, failed_ctxs)
print(f"::warning::main is RED at {sha[:10]} on {WATCH_BRANCH}: "
+33 -3
View File
@@ -47,7 +47,9 @@ What this script does, per `.gitea/workflows/status-reaper.yml` invocation:
Parse context as `<workflow_name> / <job_name> (push)`.
Look up workflow_name in the trigger map:
- missing → log ::notice:: and skip (conservative).
- has_push_trigger=True → preserve (real defect signal).
- has_push_trigger=True and description == "Has been cancelled"
→ compensate cancelled/superseded push noise.
- has_push_trigger=True otherwise → preserve (real defect signal).
- has_push_trigger=False → POST a compensating
`state=success` status to /statuses/{sha} with the same
context (Gitea de-dups by context) and a description
@@ -141,6 +143,11 @@ PR_SHADOW_COMPENSATION_DESCRIPTION = (
"shadowed by successful push status on same SHA; see "
".gitea/scripts/status-reaper.py)"
)
CANCELLED_PUSH_COMPENSATION_DESCRIPTION = (
"Compensated by status-reaper (push run was cancelled/superseded; "
"Gitea 1.22.6 reports cancelled runs as failure statuses)"
)
CANCELLED_DESCRIPTION = "Has been cancelled"
# Context suffix the reaper acts on. Gitea hardcodes this for ALL
# default-branch workflow runs.
@@ -476,7 +483,7 @@ def reap(
{compensated, preserved_real_push, preserved_unknown,
preserved_non_failure, preserved_non_push_suffix,
preserved_unparseable, compensated_pr_shadowed_by_push_success,
preserved_pr_without_push_success,
preserved_pr_without_push_success, compensated_cancelled_push,
compensated_contexts: [<context>, ...]}
`compensated_contexts` is rev2-added so `reap_branch` can build
@@ -490,6 +497,7 @@ def reap(
"preserved_non_push_suffix": 0,
"preserved_unparseable": 0,
"compensated_pr_shadowed_by_push_success": 0,
"compensated_cancelled_push": 0,
"preserved_pr_without_push_success": 0,
"compensated_contexts": [],
}
@@ -567,8 +575,27 @@ def reap(
counters["preserved_unknown"] += 1
continue
if (s.get("description") or "").strip() == CANCELLED_DESCRIPTION:
# Gitea 1.22.6 maps cancelled action runs to failure commit
# statuses. During merge bursts, older push runs can be
# superseded and cancelled even though a newer run for the
# same branch is the real signal. Compensate only the exact
# Gitea cancellation description; real push failures remain red.
post_compensating_status(
sha,
context,
s.get("target_url"),
description=CANCELLED_PUSH_COMPENSATION_DESCRIPTION,
dry_run=dry_run,
)
counters["compensated"] += 1
counters["compensated_cancelled_push"] += 1
counters["compensated_contexts"].append(context)
continue
if workflow_trigger_map[workflow_name]:
# Real push trigger → real defect signal. Preserve.
# Real push trigger with a non-cancelled failure description
# remains a defect signal. Preserve.
counters["preserved_real_push"] += 1
continue
@@ -674,6 +701,7 @@ def reap_branch(
"preserved_non_push_suffix": 0,
"preserved_unparseable": 0,
"compensated_pr_shadowed_by_push_success": 0,
"compensated_cancelled_push": 0,
"preserved_pr_without_push_success": 0,
"compensated_per_sha": {},
"skipped": True,
@@ -689,6 +717,7 @@ def reap_branch(
"preserved_non_push_suffix": 0,
"preserved_unparseable": 0,
"compensated_pr_shadowed_by_push_success": 0,
"compensated_cancelled_push": 0,
"preserved_pr_without_push_success": 0,
"compensated_per_sha": {},
}
@@ -728,6 +757,7 @@ def reap_branch(
"preserved_non_push_suffix",
"preserved_unparseable",
"compensated_pr_shadowed_by_push_success",
"compensated_cancelled_push",
"preserved_pr_without_push_success",
):
aggregate[key] += per_sha[key]
@@ -1,60 +0,0 @@
name: cascade-list-drift-gate
# Ported from .github/workflows/cascade-list-drift-gate.yml on 2026-05-11
# per RFC internal#219 §1 sweep.
#
# Differences from the GitHub version:
# - on.paths reference .gitea/workflows/publish-runtime.yml (the active
# Gitea workflow file) instead of .github/workflows/publish-runtime.yml
# (which Category A of this sweep deletes).
# - Explicit `WORKFLOW=` arg passed to the drift script so it audits the
# .gitea/ workflow (the script's default is still .github/... which
# will not exist post-Cat-A).
# - Workflow-level env.GITHUB_SERVER_URL set per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on the job (RFC §1 contract — surface
# defects without blocking; follow-up PR flips after triage).
#
# Structural gate: TEMPLATES list in publish-runtime.yml must match
# manifest.json's workspace_templates exactly. Closes the recurrence
# path of PR #2556 (the data fix) and is the first concrete deliverable
# of RFC #388 PR-3.
#
# Triggers narrowly to keep CI quiet: only on PRs that actually change
# one of the two files. The path-filtered split + always-emit-result
# pattern (memory: "Required check names need a job that always runs")
# is unnecessary here because the workflow IS the check name and PR
# branch protection should require it directly. Future-proof: if this
# becomes a required check, add a no-op aggregator with always() so the
# name still emits when paths don't match.
on:
pull_request:
branches: [staging, main]
paths:
- manifest.json
- .gitea/workflows/publish-runtime.yml
- scripts/check-cascade-list-vs-manifest.sh
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
jobs:
# bp-exempt: drift visibility gate; CI / all-required remains the required aggregate.
check:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Check cascade list matches manifest
# Pass the .gitea/ workflow path explicitly — the script's
# default still points at .github/... which Category A of this
# sweep removes.
run: bash scripts/check-cascade-list-vs-manifest.sh manifest.json .gitea/workflows/publish-runtime.yml
-225
View File
@@ -1,225 +0,0 @@
name: MCP Stdio Transport Regression
# Regression test for molecule-ai-workspace-runtime#61:
# asyncio.connect_read_pipe / connect_write_pipe fail with
# ValueError: "Pipe transport is only for pipes, sockets and character devices"
# when stdout is a regular file (openclaw capture, CI tee, debugging).
#
# This workflow reproduces the exact failure mode and verifies the
# fallback to direct buffer I/O works. It runs on every PR that
# touches the MCP server or this workflow, plus nightly cron.
#
# Why a separate workflow (not folded into ci.yml python-lint):
# - The test needs to spawn the MCP server with stdout redirected
# to a regular file (not a TTY/pipe), which conflicts with
# pytest's own capture mechanism.
# - It exercises the actual process spawn path (python a2a_mcp_server.py)
# not just unit-test mocks — closer to the real openclaw integration.
# - A dedicated workflow surfaces stdio-specific regressions without
# coupling to the broader Python test suite's coverage gate.
on:
pull_request:
branches: [main, staging]
paths:
- 'workspace/a2a_mcp_server.py'
- 'workspace/mcp_cli.py'
- 'workspace/tests/test_a2a_mcp_server.py'
- '.gitea/workflows/ci-mcp-stdio-transport.yml'
push:
branches: [main, staging]
paths:
- 'workspace/a2a_mcp_server.py'
- 'workspace/mcp_cli.py'
- 'workspace/tests/test_a2a_mcp_server.py'
- '.gitea/workflows/ci-mcp-stdio-transport.yml'
schedule:
# Nightly at 04:00 UTC — catches drift from dependency updates
# (e.g. asyncio behavior changes in new Python patch releases).
- cron: '0 4 * * *'
concurrency:
group: mcp-stdio-${{ github.ref }}
cancel-in-progress: true
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
# bp-exempt: regression canary for runtime#61; not a merge gate — informational only until promoted to required.
# mc#774: continue-on-error mask — new workflow, flip to false once it's green on ≥3 consecutive main runs.
mcp-stdio-regular-file:
name: MCP stdio with regular-file stdout
runs-on: ubuntu-latest
continue-on-error: true # mc#774
timeout-minutes: 5
env:
WORKSPACE_ID: "00000000-0000-0000-0000-000000000001"
defaults:
run:
working-directory: workspace
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.11'
cache: pip
cache-dependency-path: workspace/requirements.txt
- run: pip install -r requirements.txt pytest pytest-asyncio pytest-cov
- name: Reproduce runtime#61 — stdout as regular file
run: |
set -euo pipefail
echo "=== Reproducing molecule-ai-workspace-runtime#61 ==="
echo ""
echo "Before the fix, this command would fail with:"
echo ' ValueError: Pipe transport is only for pipes, sockets and character devices'
echo ""
# Spawn the MCP server with stdout redirected to a regular file.
# This is exactly what openclaw does when capturing MCP output.
OUTPUT=$(mktemp)
trap 'rm -f "$OUTPUT"' EXIT
# Send initialize request, then tools/list, then exit
{
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
echo '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
} | python a2a_mcp_server.py > "$OUTPUT" 2>&1 || {
RC=$?
echo "FAIL: MCP server exited with code $RC"
echo "--- stdout+stderr ---"
cat "$OUTPUT"
exit 1
}
echo "PASS: MCP server handled regular-file stdout without crashing"
echo ""
echo "--- Output (first 20 lines) ---"
head -20 "$OUTPUT"
echo ""
# Verify we got valid JSON-RPC responses
if grep -q '"result"' "$OUTPUT"; then
echo "PASS: JSON-RPC responses found in output"
else
echo "FAIL: No JSON-RPC responses in output"
cat "$OUTPUT"
exit 1
fi
- name: Reproduce runtime#61 — stdin from regular file
run: |
set -euo pipefail
echo "=== stdin as regular file (CI tee / capture pattern) ==="
INPUT=$(mktemp)
OUTPUT=$(mktemp)
trap 'rm -f "$INPUT" "$OUTPUT"' EXIT
cat > "$INPUT" <<'EOF'
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}
{"jsonrpc":"2.0","id":2,"method":"tools/list"}
EOF
python a2a_mcp_server.py < "$INPUT" > "$OUTPUT" 2>&1 || {
RC=$?
echo "FAIL: MCP server exited with code $RC"
cat "$OUTPUT"
exit 1
}
echo "PASS: MCP server handled regular-file stdin without crashing"
if grep -q '"result"' "$OUTPUT"; then
echo "PASS: JSON-RPC responses found in output"
else
echo "FAIL: No JSON-RPC responses in output"
cat "$OUTPUT"
exit 1
fi
- name: Verify warning is emitted for non-pipe stdio
run: |
set -euo pipefail
echo "=== Verify diagnostic warning ==="
OUTPUT=$(mktemp)
trap 'rm -f "$OUTPUT"' EXIT
{
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
} | python a2a_mcp_server.py > "$OUTPUT" 2>&1
# The warning should mention "not a pipe" for operator visibility
if grep -qi "not a pipe" "$OUTPUT"; then
echo "PASS: Diagnostic warning emitted for non-pipe stdio"
else
echo "NOTE: No warning in output (may be suppressed by log level)"
fi
- name: Reproduce openclaw failure — pipe held OPEN, no EOF
run: |
set -euo pipefail
echo "=== keep-stdin-open pipe (the real openclaw / Claude Code case) ==="
echo ""
echo "Before the readline() fix this HANGS: main() did"
echo " stdin.read(65536) -> on a pipe, blocks until 64KB OR EOF."
echo "An MCP client sends one ~150B initialize and keeps stdin"
echo "open waiting for the response, so the server never parsed"
echo "the request and the client timed out (openclaw: 'MCP error"
echo "-32000: Connection closed'). The earlier regular-file /"
echo "heredoc-pipe steps PASSED through this bug because a file"
echo "(or a closing heredoc) yields EOF immediately."
echo ""
# Drive the server through a real pipe that stays OPEN: write
# one initialize, do NOT close stdin, and require a response
# within a hard timeout. read(65536) -> no output -> timeout
# kills it -> FAIL. readline() -> immediate response -> PASS.
python - <<'PYEOF'
import json, subprocess, sys, time, select
proc = subprocess.Popen(
[sys.executable, "a2a_mcp_server.py"],
stdin=subprocess.PIPE, stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
env={**__import__("os").environ},
)
req = json.dumps({
"jsonrpc": "2.0", "id": 1, "method": "initialize",
"params": {"protocolVersion": "2024-11-05",
"capabilities": {},
"clientInfo": {"name": "keepopen", "version": "1"}},
}) + "\n"
proc.stdin.write(req.encode())
proc.stdin.flush()
# Deliberately DO NOT close proc.stdin — mirror a live MCP client.
deadline = time.time() + 15
line = b""
while time.time() < deadline:
r, _, _ = select.select([proc.stdout], [], [], 1)
if r:
line = proc.stdout.readline()
if line:
break
proc.kill()
if not line:
print("FAIL: no response within 15s on an open pipe — "
"stdin.read(65536) regression is back")
sys.exit(1)
resp = json.loads(line.decode())
assert resp.get("id") == 1 and "result" in resp, \
f"unexpected response: {line[:200]!r}"
assert resp["result"]["serverInfo"]["name"] == "molecule", \
f"wrong serverInfo: {line[:200]!r}"
print("PASS: server answered initialize on a still-open pipe")
PYEOF
- name: Run unit tests for stdio transport
run: |
set -euo pipefail
echo "=== Running stdio transport unit tests ==="
python -m pytest tests/test_a2a_mcp_server.py::TestStdioPipeAssertion tests/test_a2a_mcp_server.py::TestStdioKeepOpenPipe -v --no-cov
+58 -139
View File
@@ -86,46 +86,17 @@ jobs:
with:
fetch-depth: 0
- id: check
env:
PR_BASE_SHA: ${{ github.event.pull_request.base.sha }}
PR_BASE_REF: ${{ github.event.pull_request.base.ref }}
PUSH_BEFORE: ${{ github.event.before }}
run: |
# For PR events: diff against the base branch (not HEAD~1 of the branch,
# which may be unrelated after force-pushes). When a push updates a PR,
# both pull_request and push events fire — prefer the PR base so that
# the diff is always computed against the actual merge base, not the
# previous SHA on the branch which may be on a different history line.
BASE="${GITHUB_BASE_REF:-${{ github.event.before }}}"
# GITHUB_BASE_REF is set for PR events (the base branch name).
# For pull_request events we use the stored base.sha; for push events
# (or when base.sha is unavailable) fall back to github.event.before.
if [ "${{ github.event_name }}" = "pull_request" ] && [ -n "${{ github.event.pull_request.base.sha }}" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
fi
# Fallback: if BASE is empty or all zeros (new branch), run everything
if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$'; then
echo "platform=true" >> "$GITHUB_OUTPUT"
echo "canvas=true" >> "$GITHUB_OUTPUT"
echo "python=true" >> "$GITHUB_OUTPUT"
echo "scripts=true" >> "$GITHUB_OUTPUT"
exit 0
fi
# Workflow-only edits are covered by the workflow lint family
# and by this workflow's always-present required jobs. Do not fan
# those edits out into Go/Canvas/Python/shellcheck work; the
# downstream jobs still emit their required contexts via no-op
# steps when their surface flag is false.
#
# If the diff itself cannot be trusted, fail open by running every
# surface instead of silently under-testing the PR.
if ! DIFF=$(git diff --name-only "$BASE" HEAD 2>/dev/null); then
echo "platform=true" >> "$GITHUB_OUTPUT"
echo "canvas=true" >> "$GITHUB_OUTPUT"
echo "python=true" >> "$GITHUB_OUTPUT"
echo "scripts=true" >> "$GITHUB_OUTPUT"
exit 0
fi
echo "platform=$(echo "$DIFF" | grep -qE '^workspace-server/' && echo true || echo false)" >> "$GITHUB_OUTPUT"
echo "canvas=$(echo "$DIFF" | grep -qE '^canvas/' && echo true || echo false)" >> "$GITHUB_OUTPUT"
echo "python=$(echo "$DIFF" | grep -qE '^workspace/' && echo true || echo false)" >> "$GITHUB_OUTPUT"
echo "scripts=$(echo "$DIFF" | grep -qE '^tests/e2e/|^scripts/|^infra/scripts/' && echo true || echo false)" >> "$GITHUB_OUTPUT"
python3 .gitea/scripts/detect-changes.py \
--profile ci \
--event-name "${{ github.event_name }}" \
--pr-base-sha "$PR_BASE_SHA" \
--base-ref "$PR_BASE_REF" \
--push-before "${GITHUB_EVENT_BEFORE:-$PUSH_BEFORE}"
# Platform (Go) — Go build/vet/test/lint + coverage gates. The always-run
# + per-step gating shape preserves the GitHub-side required-check name
@@ -133,6 +104,7 @@ jobs:
# the name match works on PRs that don't touch workspace-server/).
platform-build:
name: Platform (Go)
needs: changes
runs-on: ubuntu-latest
# mc#774 (closed 2026-05-14): Phase 4 flip of the platform-build job.
# Phase 4 (#656) originally flipped this to continue-on-error: false based on
@@ -153,29 +125,29 @@ jobs:
run:
working-directory: workspace-server
steps:
- if: false
- if: ${{ github.event_name == 'pull_request' && needs.changes.outputs.platform != 'true' }}
working-directory: .
run: echo "No platform/** changes — skipping real build steps; this job always runs to satisfy the required-check name on branch protection."
- if: always()
run: echo "No workspace-server/** changes on this PR — Platform (Go) gate satisfied without running Go build/test/lint."
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
run: go mod download
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
run: go build ./cmd/server
# CLI (molecli) moved to standalone repo: git.moleculesai.app/molecule-ai/molecule-cli
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
run: go vet ./...
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
name: Install golangci-lint
run: go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@v2.12.2
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
name: Run golangci-lint
run: $(go env GOPATH)/bin/golangci-lint run --timeout 3m ./...
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
name: Diagnostic — per-package verbose 60s
run: |
set +e
@@ -191,7 +163,7 @@ jobs:
echo "::endgroup::"
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
name: Run tests with race detection and coverage
# Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
# full ./... suite with race detection + coverage. A 10m per-step timeout
@@ -199,7 +171,7 @@ jobs:
# instead of OOM-killing. The job-level timeout (15m) is a backstop.
run: go test -race -timeout 10m -coverprofile=coverage.out ./...
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
name: Per-file coverage report
# Advisory — lists every source file with its coverage so reviewers
# can see at-a-glance where gaps are. Sorted ascending so the worst
@@ -213,7 +185,7 @@ jobs:
END {for (f in s) printf "%6.1f%% %s\n", s[f]/c[f], f}' \
| sort -n
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.platform == 'true' }}
name: Check coverage thresholds
# Enforces two gates from #1823 Layer 1:
# 1. Total floor (25% — ratchet plan in COVERAGE_FLOOR.md).
@@ -301,6 +273,7 @@ jobs:
# siblings — verified empirically on PR #2314).
canvas-build:
name: Canvas (Next.js)
needs: changes
runs-on: ubuntu-latest
timeout-minutes: 20
# Phase 4 (RFC #219 §1): confirmed green on main 2026-05-12.
@@ -309,20 +282,20 @@ jobs:
run:
working-directory: canvas
steps:
- if: false
- if: ${{ github.event_name == 'pull_request' && needs.changes.outputs.canvas != 'true' }}
working-directory: .
run: echo "No canvas/** changes — skipping real build steps; this job always runs to satisfy the required-check name on branch protection."
- if: always()
run: echo "No canvas/** changes on this PR — Canvas (Next.js) gate satisfied without running npm build/test."
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.canvas == 'true' }}
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.canvas == 'true' }}
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6.4.0
with:
node-version: '22'
- if: always()
run: rm -f package-lock.json && npm install
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.canvas == 'true' }}
run: npm ci --include=optional --prefer-offline
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.canvas == 'true' }}
run: npm run build
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.canvas == 'true' }}
name: Run tests with coverage
# Coverage instrumentation is configured in canvas/vitest.config.ts
# (provider: v8, reporters: text + html + json-summary). Step 2 of
@@ -331,7 +304,7 @@ jobs:
# tracked in #1815) after the team sees what current coverage is.
run: npx vitest run --coverage
- name: Upload coverage summary as artifact
if: always()
if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.canvas == 'true' }}
# Pinned to v3 for Gitea act_runner v0.6 compatibility — v4+ uses
# the GHES 3.10+ artifact protocol that Gitea 1.22.x does NOT
# implement, surfacing as `GHESNotSupportedError: @actions/artifact
@@ -348,15 +321,16 @@ jobs:
# Shellcheck (E2E scripts) — required check, always runs.
shellcheck:
name: Shellcheck (E2E scripts)
needs: changes
runs-on: ubuntu-latest
# Phase 4 (RFC #219 §1): confirmed green on main 2026-05-12.
continue-on-error: false
steps:
- if: false
run: echo "No tests/e2e/ or infra/scripts/ changes — skipping real shellcheck; this job always runs to satisfy the required-check name on branch protection."
- if: always()
- if: ${{ github.event_name == 'pull_request' && needs.changes.outputs.scripts != 'true' }}
run: echo "No tests/e2e, scripts, or infra/scripts changes on this PR — Shellcheck gate satisfied without running script checks."
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.scripts == 'true' }}
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.scripts == 'true' }}
name: Run shellcheck on tests/e2e/*.sh and infra/scripts/*.sh
# shellcheck is pre-installed on ubuntu-latest runners (via apt).
# infra/scripts/ is included because setup.sh + nuke.sh gate the
@@ -367,16 +341,16 @@ jobs:
find tests/e2e infra/scripts -type f -name '*.sh' -print0 \
| xargs -0 shellcheck --severity=warning
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.scripts == 'true' }}
name: Lint cleanup-trap hygiene (RFC #2873)
run: bash tests/e2e/lint_cleanup_traps.sh
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.scripts == 'true' }}
name: Run E2E bash unit tests (no live infra)
run: |
bash tests/e2e/test_model_slug.sh
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.scripts == 'true' }}
name: Test ECR promote-tenant-image script (mock-driven, no live infra)
# Covers scripts/promote-tenant-image.sh — the codified
# :staging-latest → :latest ECR promote + tenant fleet redeploy
@@ -386,7 +360,7 @@ jobs:
run: |
bash scripts/test-promote-tenant-image.sh
- if: always()
- if: ${{ github.event_name != 'pull_request' || needs.changes.outputs.scripts == 'true' }}
name: Shellcheck promote-tenant-image script
# scripts/ is excluded from the bulk shellcheck pass above (legacy
# SC3040/SC3043 cleanup pending). Run shellcheck explicitly on
@@ -456,84 +430,29 @@ jobs:
cat /tmp/deploy-reminder.md >> "$GITHUB_STEP_SUMMARY"
# Python Lint & Test — required check, always runs.
# Runtime Python moved to molecule-ai-workspace-runtime. Keep this context as
# a guard so branch protection still catches attempts to reintroduce an
# editable runtime copy under molecule-core/workspace/.
python-lint:
name: Python Lint & Test
runs-on: ubuntu-latest
# Phase 4 (RFC #219 §1): confirmed green on main 2026-05-12.
continue-on-error: false
env:
WORKSPACE_ID: test
defaults:
run:
working-directory: workspace
steps:
- if: false
working-directory: .
run: echo "No workspace/** changes — skipping real lint+test; this job always runs to satisfy the required-check name on branch protection."
- if: always()
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- if: always()
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.11'
cache: pip
cache-dependency-path: workspace/requirements.txt
- if: always()
run: pip install -r requirements.txt pytest pytest-asyncio pytest-cov sqlalchemy>=2.0.0
# Coverage flags + fail-under floor moved into workspace/pytest.ini
# (issue #1817) so local `pytest` and CI use identical config.
- if: always()
run: python -m pytest --tb=short
- if: always()
name: Per-file critical-path coverage (MCP / inbox / auth)
# MCP-critical Python files have a per-file floor on top of the
# 86% total floor in pytest.ini. See issue #2790 for full rationale.
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Runtime SSOT guard
run: |
set -e
PER_FILE_FLOOR=75
CRITICAL_FILES=(
"a2a_mcp_server.py"
"mcp_cli.py"
"a2a_tools.py"
"a2a_tools_inbox.py"
"inbox.py"
"platform_auth.py"
)
# pytest already wrote .coverage; emit a JSON view scoped to
# the critical files so jq/python can read the per-file pct
# without parsing tabular text.
INCLUDES=$(printf '*%s,' "${CRITICAL_FILES[@]}")
INCLUDES="${INCLUDES%,}"
python -m coverage json -o /tmp/critical-cov.json --include="$INCLUDES"
FAILED=0
for f in "${CRITICAL_FILES[@]}"; do
pct=$(jq -r --arg f "$f" '.files | to_entries | map(select(.key == $f)) | .[0].value.summary.percent_covered // "MISSING"' /tmp/critical-cov.json)
if [ "$pct" = "MISSING" ]; then
echo "::error file=workspace/$f::No coverage data — file may have moved or test exclusion mis-set."
FAILED=$((FAILED+1))
continue
fi
echo "$f: ${pct}%"
if awk "BEGIN{exit !($pct < $PER_FILE_FLOOR)}"; then
echo "::error file=workspace/$f::${pct}% < ${PER_FILE_FLOOR}% per-file floor (MCP critical path). See COVERAGE_FLOOR.md."
FAILED=$((FAILED+1))
fi
done
if [ "$FAILED" -gt 0 ]; then
echo ""
echo "$FAILED MCP critical-path file(s) below the ${PER_FILE_FLOOR}% per-file floor."
echo "These paths handle multi-tenant routing, auth tokens, and inbox dispatch."
echo "A coverage drop here is the same risk shape as Go-side tokens/secrets files"
echo "dropping below 10% (see COVERAGE_FLOOR.md). Either:"
echo " (a) add tests to raise coverage back above ${PER_FILE_FLOOR}%, or"
echo " (b) if this is unavoidable historical debt, file an issue and propose"
echo " adjusting the floor with rationale in COVERAGE_FLOOR.md."
set -eu
if [ -d workspace ]; then
echo "::error file=workspace::Runtime source must live in molecule-ai-workspace-runtime, not molecule-core/workspace."
exit 1
fi
for f in scripts/build_runtime_package.py scripts/test_build_runtime_package.py; do
if [ -e "$f" ]; then
echo "::error file=$f::Legacy build-from-workspace packaging script must not be restored."
exit 1
fi
done
echo "Runtime SSOT guard passed; core consumes the standalone runtime package."
all-required:
# Aggregator sentinel — RFC internal#219 §2 (Phase 4 — closes internal#286).
+13 -1
View File
@@ -43,6 +43,18 @@ name: Continuous synthetic E2E (staging)
on:
schedule:
# Every 30 minutes, on :02 and :32. This keeps a recurring SaaS
# behavior probe while cutting runner occupancy from this workflow by
# roughly two thirds; fast liveness belongs in the lighter smoke/heartbeat
# probes, not in a full tenant/workspace synth every 10 minutes.
#
# Previous cadence was every 10 minutes (:02 :12 :22 :32 :42 :52).
# The current operator-host runner pool is the bottleneck, so full
# synth E2E is deliberately lower-cadence until it moves to a dedicated
# runner host or warm-runtime pool.
#
# Historical notes from the 10-minute shape:
#
# Every 10 minutes, on :02 :12 :22 :32 :42 :52. Three constraints:
# 1. Stay off the top-of-hour. GitHub Actions scheduler drops
# :00 firings under high load (own docs:
@@ -66,7 +78,7 @@ on:
# fires = ~30 min cadence; closer to the 20-min target than the
# current shape and provides a real degradation alarm if drops
# get worse.
- cron: '2,12,22,32,42,52 * * * *'
- cron: '2,32 * * * *'
permissions:
contents: read
# No issue-write here — failures surface as red runs in the workflow
+15 -28
View File
@@ -132,31 +132,13 @@ jobs:
with:
fetch-depth: 0
- id: decide
# Inline replacement for dorny/paths-filter — same pattern PR#372's
# ci.yml port used. Diffs against the PR base or push BEFORE SHA,
# then matches against the api-relevant path set.
run: |
BASE="${GITHUB_BASE_REF:-${{ github.event.before }}}"
if [ "${{ github.event_name }}" = "pull_request" ] && [ -n "${{ github.event.pull_request.base.sha }}" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
fi
if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$'; then
echo "api=true" >> "$GITHUB_OUTPUT"
exit 0
fi
if ! git cat-file -e "$BASE" 2>/dev/null; then
git fetch --depth=1 origin "$BASE" 2>/dev/null || true
fi
if ! git cat-file -e "$BASE" 2>/dev/null; then
echo "api=true" >> "$GITHUB_OUTPUT"
exit 0
fi
CHANGED=$(git diff --name-only "$BASE" HEAD)
if echo "$CHANGED" | grep -qE '^(workspace-server/|tests/e2e/|\.gitea/workflows/e2e-api\.yml$)'; then
echo "api=true" >> "$GITHUB_OUTPUT"
else
echo "api=false" >> "$GITHUB_OUTPUT"
fi
python3 .gitea/scripts/detect-changes.py \
--profile e2e-api \
--event-name "${{ github.event_name }}" \
--pr-base-sha "${{ github.event.pull_request.base.sha }}" \
--base-ref "${{ github.event.pull_request.base.ref }}" \
--push-before "${GITHUB_EVENT_BEFORE:-${{ github.event.before }}}"
# ONE job (no job-level `if:`) that always runs and reports under the
# required-check name `E2E API Smoke Test`. Real work is gated per-step
@@ -366,6 +348,9 @@ jobs:
exit 1
fi
echo "Migrations OK"
- name: Run today's-PR-coverage E2E (mc#1525/1535/1536/1539/1542 fix-specific assertions)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_today_pr_coverage_e2e.sh
- name: Run E2E API tests
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_api.sh
@@ -375,15 +360,18 @@ jobs:
- name: Run priority-runtimes E2E (claude-code + hermes — skips when keys absent)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_priority_runtimes_e2e.sh
- name: Install standalone runtime parser from Gitea registry
if: needs.detect-changes.outputs.api == 'true'
run: |
python3 -m pip install --no-deps \
--index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ \
molecule-ai-workspace-runtime
- name: Run poll-mode + since_id cursor E2E (#2339)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_e2e.sh
- name: Run poll-mode chat upload E2E (RFC #2891)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_chat_upload_e2e.sh
- name: Run today's-PR-coverage E2E (mc#1525/1535/1536/1539/1542 fix-specific assertions)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_today_pr_coverage_e2e.sh
- name: Dump platform log on failure
if: failure() && needs.detect-changes.outputs.api == 'true'
run: cat workspace-server/platform.log || true
@@ -401,4 +389,3 @@ jobs:
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
+42 -4
View File
@@ -1,8 +1,10 @@
name: E2E Chat
# Comprehensive Playwright E2E for the unified chat stack (desktop
# ChatTab + mobile MobileChat). Runs on every PR that touches canvas,
# workspace-server, or this workflow file.
# ChatTab + mobile MobileChat). Heavy browser execution is intentionally
# outside the normal required PR path: PRs run it only after entering the
# `merge-queue`, while push/main, nightly, and manual dispatch preserve
# coverage without making every PR pay the full runtime/browser cost.
#
# Architecture:
# 1. Ephemeral Postgres + Redis (docker, unique container names)
@@ -22,6 +24,11 @@ on:
branches: [main, staging]
pull_request:
branches: [main, staging]
schedule:
# Nightly at 09:00 UTC. Keeps coverage for the currently non-required
# heavy browser lane without spending runner time on every PR.
- cron: '0 9 * * *'
workflow_dispatch:
concurrency:
group: e2e-chat-${{ github.event.pull_request.head.sha || github.sha }}
@@ -50,7 +57,14 @@ jobs:
with:
fetch-depth: 0
- id: decide
env:
GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
QUEUE_LABEL: merge-queue
run: |
if [ "${{ github.event_name }}" = "schedule" ] || [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "chat=true" >> "$GITHUB_OUTPUT"
exit 0
fi
BASE="${GITHUB_BASE_REF:-${{ github.event.before }}}"
if [ "${{ github.event_name }}" = "pull_request" ] && [ -n "${{ github.event.pull_request.base.sha }}" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
@@ -67,9 +81,26 @@ jobs:
exit 0
fi
CHANGED=$(git diff --name-only "$BASE" HEAD)
if echo "$CHANGED" | grep -qE '^(canvas/|workspace-server/|\.gitea/workflows/e2e-chat\.yml$)'; then
if ! echo "$CHANGED" | grep -qE '^(canvas/|workspace-server/|\.gitea/workflows/e2e-chat\.yml$)'; then
echo "chat=false" >> "$GITHUB_OUTPUT"
exit 0
fi
if [ "${{ github.event_name }}" != "pull_request" ]; then
echo "chat=true" >> "$GITHUB_OUTPUT"
exit 0
fi
authfile=$(mktemp)
chmod 600 "$authfile"
printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$authfile"
labels=$(curl -fsS -K "$authfile" \
"${{ github.server_url }}/api/v1/repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/labels" \
| python3 -c 'import json,sys; print("\n".join(label.get("name","") for label in json.load(sys.stdin)))')
rm -f "$authfile"
if printf '%s\n' "$labels" | grep -qx "$QUEUE_LABEL"; then
echo "chat=true" >> "$GITHUB_OUTPUT"
else
echo "PR is not in merge-queue; skipping heavy E2E Chat for normal PR path."
echo "chat=false" >> "$GITHUB_OUTPUT"
fi
@@ -230,7 +261,14 @@ jobs:
- name: Install Playwright browsers
if: needs.detect-changes.outputs.chat == 'true'
working-directory: canvas
run: npx playwright install --with-deps chromium
run: |
PREBAKED_PLAYWRIGHT=/ms-playwright
if [ -d "${PREBAKED_PLAYWRIGHT}" ] && find "${PREBAKED_PLAYWRIGHT}" -maxdepth 3 -type f -name 'chrome' | grep -q .; then
echo "Using prebaked Playwright Chromium from ${PREBAKED_PLAYWRIGHT}"
echo "PLAYWRIGHT_BROWSERS_PATH=${PREBAKED_PLAYWRIGHT}" >> "$GITHUB_ENV"
exit 0
fi
npx playwright install --with-deps chromium
- name: Start canvas dev server (background)
if: needs.detect-changes.outputs.chat == 'true'
+16 -19
View File
@@ -44,6 +44,8 @@ name: E2E Peer Visibility (literal MCP list_peers)
# - No cross-repo `uses:` (feedback_gitea_cross_repo_uses_blocked). The
# actions/checkout SHA is the one e2e-staging-canvas.yml already uses
# successfully (a mirrored SHA — see #1277/PR#1292 root-cause).
# - 2026-05-21 retrigger: verify fresh platform-tenant image after the
# publish Buildx DOCKER_CONFIG fix restored staging-latest image updates.
# - Per-SHA concurrency, not global (feedback_concurrency_group_per_sha).
# - Workflow-level GITHUB_SERVER_URL pinned
# (feedback_act_runner_github_server_url).
@@ -68,14 +70,11 @@ name: E2E Peer Visibility (literal MCP list_peers)
# minutes, not the 30+ min cold-EC2 path), so peer-visibility is part of
# the local gate that fires before the staging E2E.
#
# It is its OWN non-required status context `E2E Peer Visibility (local)`
# — same non-required-by-design decision as the staging job (red until
# Hermes-401 #162 / OpenClaw-never-online #165 land; flip-to-required
# tracked at molecule-core#1296). It is an HONEST gate: NO
# continue-on-error mask (feedback_fix_root_not_symptom). It is kept a
# distinct context (not folded into e2e-api.yml's required `E2E API
# Smoke Test`) precisely so a deliberately-RED-today gate cannot wedge
# the required local-E2E job or any unrelated merge.
# It is its OWN non-required status context `E2E Peer Visibility (local)`.
# The local backend uses external-mode workspaces by default so it tests
# the literal platform MCP list_peers path without depending on local
# template container boot/heartbeat. Container-mode runtime boot remains
# available via PV_LOCAL_PROVISION_MODE=container for targeted debugging.
on:
push:
@@ -86,8 +85,6 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/handlers/registry.go'
- 'workspace-server/internal/handlers/workspace.go'
- 'workspace/a2a_mcp_server.py'
- 'workspace/platform_tools/registry.py'
- 'tests/e2e/test_peer_visibility_mcp_staging.sh'
- 'tests/e2e/test_peer_visibility_mcp_local.sh'
- 'tests/e2e/lib/peer_visibility_assert.sh'
@@ -100,8 +97,6 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/handlers/registry.go'
- 'workspace-server/internal/handlers/workspace.go'
- 'workspace/a2a_mcp_server.py'
- 'workspace/platform_tools/registry.py'
- 'tests/e2e/test_peer_visibility_mcp_staging.sh'
- 'tests/e2e/test_peer_visibility_mcp_local.sh'
- 'tests/e2e/lib/peer_visibility_assert.sh'
@@ -157,9 +152,9 @@ jobs:
# ephemeral host ports so concurrent host-network act_runner runs don't
# collide; go build; background platform-server). Its OWN non-required
# status context `E2E Peer Visibility (local)` — non-required-by-design
# exactly like the staging job (red until #162/#165 land;
# flip-to-required tracked at molecule-core#1296). HONEST gate, NO
# continue-on-error mask (feedback_fix_root_not_symptom). Runs on PR +
# exactly like the staging job (flip-to-required tracked at
# molecule-core#1296). HONEST gate, NO continue-on-error mask
# (feedback_fix_root_not_symptom). Runs on PR +
# push (local boot is minutes, not the 30+ min cold-EC2 path).
# bp-required: pending #1296
peer-visibility-local:
@@ -179,6 +174,9 @@ jobs:
E2E_ANTHROPIC_API_KEY: ${{ secrets.MOLECULE_STAGING_ANTHROPIC_API_KEY }}
E2E_OPENAI_API_KEY: ${{ secrets.MOLECULE_STAGING_OPENAI_API_KEY }}
PV_RUNTIMES: "hermes openclaw claude-code"
PV_LOCAL_PROVISION_MODE: external
ADMIN_TOKEN: local-e2e-admin-token
MOLECULE_ADMIN_TOKEN: local-e2e-admin-token
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
@@ -267,10 +265,9 @@ jobs:
echo "::error::Platform did not become healthy in 30s"
cat workspace-server/platform.log || true; exit 1
- name: Run LOCAL fresh-provision peer-visibility E2E (literal MCP list_peers)
# HONEST gate — NO continue-on-error. Red today (Hermes-401 #162 /
# OpenClaw-never-online #165 not yet fixed); green when they land.
# Non-required-by-design via its distinct status context until the
# molecule-core#1296 flip-to-required.
# HONEST gate — NO continue-on-error. The local backend uses
# external-mode workspaces so this context tests the literal MCP
# peer-visibility path without coupling to template container boot.
run: bash tests/e2e/test_peer_visibility_mcp_local.sh
- name: Dump platform log on failure
if: failure()
+37 -9
View File
@@ -16,9 +16,9 @@ name: E2E Staging Canvas (Playwright)
# e2e-staging-saas.yml (which tests the API shape) by exercising the
# actual browser + canvas bundle against live staging.
#
# Triggers: push to main/staging or PR touching canvas sources + this workflow,
# manual dispatch, and weekly cron to catch browser/runtime drift even
# when canvas is quiet.
# Triggers: push to main, PR touching canvas sources + this workflow only
# after the PR enters `merge-queue`, manual dispatch, and scheduled cron to
# catch browser/runtime drift even when canvas is quiet.
# Added staging to push/pull_request branches so the auto-promote gate
# check (--event push --branch staging) can see a completed run for this
# workflow — mirrors what PR #1891 does for e2e-api.yml.
@@ -37,9 +37,10 @@ on:
pull_request:
branches: [main]
schedule:
# Weekly on Sunday 08:00 UTC — catches Chrome / Playwright / Next.js
# Nightly at 08:00 UTC — catches Chrome / Playwright / Next.js
# release-note-shaped regressions that don't ride in with a PR.
- cron: '0 8 * * 0'
- cron: '0 8 * * *'
workflow_dispatch:
concurrency:
# Per-SHA grouping (changed 2026-04-28 from a single global group). The
@@ -79,10 +80,13 @@ jobs:
with:
fetch-depth: 0
- id: decide
env:
GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
QUEUE_LABEL: merge-queue
# Inline replacement for dorny/paths-filter — see e2e-api.yml.
# Cron triggers always run real work (no diff context).
# Cron and manual triggers always run real work (no diff context).
run: |
if [ "${{ github.event_name }}" = "schedule" ]; then
if [ "${{ github.event_name }}" = "schedule" ] || [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "canvas=true" >> "$GITHUB_OUTPUT"
exit 0
fi
@@ -102,9 +106,26 @@ jobs:
exit 0
fi
CHANGED=$(git diff --name-only "$BASE" HEAD)
if echo "$CHANGED" | grep -qE '^(canvas/|\.gitea/workflows/e2e-staging-canvas\.yml$)'; then
if ! echo "$CHANGED" | grep -qE '^(canvas/|\.gitea/workflows/e2e-staging-canvas\.yml$)'; then
echo "canvas=false" >> "$GITHUB_OUTPUT"
exit 0
fi
if [ "${{ github.event_name }}" != "pull_request" ]; then
echo "canvas=true" >> "$GITHUB_OUTPUT"
exit 0
fi
authfile=$(mktemp)
chmod 600 "$authfile"
printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$authfile"
labels=$(curl -fsS -K "$authfile" \
"${{ github.server_url }}/api/v1/repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/labels" \
| python3 -c 'import json,sys; print("\n".join(label.get("name","") for label in json.load(sys.stdin)))')
rm -f "$authfile"
if printf '%s\n' "$labels" | grep -qx "$QUEUE_LABEL"; then
echo "canvas=true" >> "$GITHUB_OUTPUT"
else
echo "PR is not in merge-queue; skipping heavy E2E Staging Canvas for normal PR path."
echo "canvas=false" >> "$GITHUB_OUTPUT"
fi
@@ -169,7 +190,14 @@ jobs:
- name: Install Playwright browsers
if: needs.detect-changes.outputs.canvas == 'true'
timeout-minutes: 10
run: npx playwright install --with-deps chromium
run: |
PREBAKED_PLAYWRIGHT=/ms-playwright
if [ -d "${PREBAKED_PLAYWRIGHT}" ] && find "${PREBAKED_PLAYWRIGHT}" -maxdepth 3 -type f -name 'chrome' | grep -q .; then
echo "Using prebaked Playwright Chromium from ${PREBAKED_PLAYWRIGHT}"
echo "PLAYWRIGHT_BROWSERS_PATH=${PREBAKED_PLAYWRIGHT}" >> "$GITHUB_ENV"
exit 0
fi
npx playwright install --with-deps chromium
- name: Run staging canvas E2E
if: needs.detect-changes.outputs.canvas == 'true'
+6 -2
View File
@@ -13,8 +13,12 @@ name: gitea-merge-queue
# - add `merge-queue-hold` to pause a queued PR without removing it
on:
schedule:
- cron: '*/5 * * * *'
# Schedule moved to operator-config:
# /etc/cron.d/molecule-core-merge-queue ->
# /usr/local/bin/molecule-core-cron-bot.sh merge-queue
#
# The queue bot still processes one PR per tick, but no longer occupies
# one of the shared Actions runners just to poll.
workflow_dispatch:
permissions:
@@ -101,36 +101,13 @@ jobs:
# not present in the shallow checkout.
fetch-depth: 2
- id: filter
# Inline replacement for dorny/paths-filter — see e2e-api.yml.
run: |
# Gitea Actions evaluates github.event.before to empty string in shell
# scripts. Use GITHUB_EVENT_BEFORE shell env var instead (Gitea
# correctly populates it for push events). PR case uses template var.
BASE=""
if [ "${{ github.event_name }}" = "pull_request" ] && [ -n "${{ github.event.pull_request.base.sha }}" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
elif [ -n "$GITHUB_EVENT_BEFORE" ]; then
BASE="$GITHUB_EVENT_BEFORE"
fi
if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$'; then
echo "handlers=true" >> "$GITHUB_OUTPUT"
exit 0
fi
# timeout 30 guards against the case where BASE points to a ref that
# git can resolve but cat-file hangs (rare on corrupted objects).
if ! timeout 30 git cat-file -e "$BASE" 2>/dev/null; then
git fetch --depth=1 origin "$BASE" 2>/dev/null || true
fi
if ! timeout 30 git cat-file -e "$BASE" 2>/dev/null; then
echo "handlers=true" >> "$GITHUB_OUTPUT"
exit 0
fi
CHANGED=$(git diff --name-only "$BASE" HEAD)
if echo "$CHANGED" | grep -qE '^(workspace-server/internal/handlers/|workspace-server/internal/wsauth/|workspace-server/migrations/|\.gitea/workflows/handlers-postgres-integration\.yml$)'; then
echo "handlers=true" >> "$GITHUB_OUTPUT"
else
echo "handlers=false" >> "$GITHUB_OUTPUT"
fi
python3 .gitea/scripts/detect-changes.py \
--profile handlers-postgres \
--event-name "${{ github.event_name }}" \
--pr-base-sha "${{ github.event.pull_request.base.sha }}" \
--base-ref "${{ github.event.pull_request.base.ref }}" \
--push-before "${GITHUB_EVENT_BEFORE:-}"
# Single-job-with-per-step-if pattern: always runs to satisfy the
# required-check name on branch protection; real work gates on the
@@ -1,177 +0,0 @@
name: publish-runtime-autobump
# Auto-bump-on-workspace-edit half of the publish pipeline.
#
# Why this file exists (issue #351):
# Gitea Actions does not correctly disambiguate `paths:` from `tags:`
# when both are bundled under a single `on.push` key. The result is
# that tag pushes get filtered out and `publish-runtime.yml` never
# fires — `action_run` rows: 0. This was unnoticed pre-2026-05-11
# because PYPI_TOKEN was absent (publishes would have failed anyway).
#
# Split design:
# - publish-runtime.yml : on.push.tags only (the publisher)
# - publish-runtime-autobump.yml: on.push.branches+paths (this file — the version-bumper)
#
# This file computes the next version from PyPI's latest, pushes a
# `runtime-v$VERSION` tag, and exits. The tag push then triggers
# publish-runtime.yml via its tags-only trigger.
#
# Concurrency: shares the `publish-runtime` group with publish-runtime.yml
# so concurrent workspace pushes serialize at the bump step. Without
# this, two pushes minutes apart could both read PyPI latest=0.1.129
# and try to tag 0.1.130 simultaneously, only one of which would land.
on:
# Run on PR pushes to post a success status so Gitea can merge the PR.
# All steps use continue-on-error: true so operational failures
# (PyPI unreachable, DISPATCH_TOKEN missing) do not block merge.
pull_request:
paths:
- "workspace/**"
# mc#1578 / a05add29 cure: build_runtime_package.py owns PYPROJECT_TEMPLATE
# (deps, classifiers, project metadata). A change there is publish-affecting
# even when workspace/** is untouched, so the autobump must fire to claim
# the next runtime-v$VERSION tag. Without this, manual tagging races PyPI
# (e.g. runtime-v0.1.18 collided with the 2026-04-27 PyPI 0.1.18 publish,
# blocking the python-multipart pin from reaching prod).
- "scripts/build_runtime_package.py"
- "scripts/test_build_runtime_package.py"
# Bump-and-tag on main/staging push (the actual operational trigger).
push:
branches:
- main
- staging
paths:
- "workspace/**"
- "scripts/build_runtime_package.py"
- "scripts/test_build_runtime_package.py"
# Manual dispatch — useful when Gitea Actions API (/actions/*) is
# unreachable (e.g. act_runner 404 on Gitea 1.22.6) and we cannot
# re-trigger via curl.
workflow_dispatch:
permissions:
contents: write # required to push tags back
concurrency:
group: publish-runtime
cancel-in-progress: false
jobs:
# PR-validation path: always succeeds so Gitea can merge workflow-only PRs.
# Operational failures (PyPI unreachable, missing DISPATCH_TOKEN) are
# surfaced via continue-on-error: true rather than blocking the merge.
# The actual bump work happens on the main/staging push after merge.
# bp-exempt: advisory validation for runtime publication; not a branch-protection gate.
pr-validate:
runs-on: ubuntu-latest
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # do not block PR merge on operational failures
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
- name: Validate PyPI connectivity (best-effort)
run: |
set -eu
echo "=== Checking PyPI accessibility ==="
LATEST=$(curl -fsS --retry 3 --max-time 10 \
https://pypi.org/pypi/molecule-ai-workspace-runtime/json \
| python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])" \
|| echo "PyPI unreachable (non-blocking for PR validation)")
echo "Latest: ${LATEST:-unknown}"
# Actual bump-and-tag: runs on main/staging pushes, posts real success/failure.
# No continue-on-error — operational failures here trip the main-red
# watchdog, which is the desired signal for infrastructure degradation.
# bp-exempt: post-merge tag publication side effect; CI / all-required gates source changes.
bump-and-tag:
runs-on: ubuntu-latest
# Only fire on push events (main/staging after PR merge). Pull_request
# events are handled by pr-validate above; we do NOT bump on every
# push-synchronize because that would race with the PR head.
#
# NOTE: the prior condition `github.event.pull_request.base.ref == ''`
# was broken — on a PR-merge push in Gitea Actions, the pull_request
# context is still attached (base.ref='main'), so the condition always
# evaluated to false and bump-and-tag was permanently skipped.
if: github.event_name == 'push'
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- name: Fetch tags for collision check
run: git fetch origin --tags --depth=1
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
- name: Compute next version from PyPI latest and existing tags
id: bump
run: |
set -eu
LATEST=$(curl -fsS --retry 3 https://pypi.org/pypi/molecule-ai-workspace-runtime/json \
| python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])")
MAJOR=$(echo "$LATEST" | cut -d. -f1)
MINOR=$(echo "$LATEST" | cut -d. -f2)
TAG_LATEST=$(git tag --list "runtime-v${MAJOR}.${MINOR}.*" \
| sed -E 's/^runtime-v//' \
| grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' \
| sort -V \
| tail -1 || true)
VERSION=$(PYPI_LATEST="$LATEST" TAG_LATEST="$TAG_LATEST" python - <<'PY'
import os
def parse(v):
return tuple(int(part) for part in v.split("."))
pypi = os.environ["PYPI_LATEST"]
tag = os.environ.get("TAG_LATEST") or pypi
base = max(parse(pypi), parse(tag))
print(f"{base[0]}.{base[1]}.{base[2] + 1}")
PY
)
echo "PyPI latest=$LATEST, latest runtime tag=${TAG_LATEST:-none} -> next=$VERSION"
if ! echo "$VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then
echo "::error::computed version $VERSION does not match PEP 440 X.Y.Z"
exit 1
fi
if git tag --list | grep -qx "runtime-v$VERSION"; then
echo "::error::tag runtime-v$VERSION already exists in this repo. Manual intervention required (PyPI and Gitea tag history are out of sync)."
exit 1
fi
echo "version=$VERSION" >> "$GITHUB_OUTPUT"
- name: Push runtime-v$VERSION tag
env:
DISPATCH_TOKEN: ${{ secrets.DISPATCH_TOKEN }}
VERSION: ${{ steps.bump.outputs.version }}
GITEA_URL: https://git.moleculesai.app
run: |
set -eu
if [ -z "$DISPATCH_TOKEN" ]; then
echo "::error::DISPATCH_TOKEN secret is not set — needed to push the tag back to molecule-core."
exit 1
fi
git config user.name "publish-runtime autobump"
git config user.email "publish-runtime@moleculesai.app"
git tag -a "runtime-v$VERSION" \
-m "Auto-bump on workspace/** edit on $GITHUB_REF" \
-m "Triggered by: $GITHUB_REF @ $GITHUB_SHA" \
-m "publish-runtime.yml will pick up this tag and upload to PyPI"
# Push via DISPATCH_TOKEN (a Gitea PAT). Using the bot identity
# ensures the resulting tag-push event is dispatched to
# publish-runtime.yml; act_runner's default GITHUB_TOKEN cannot
# trigger downstream workflows.
git remote set-url origin "${GITEA_URL#https://}"
git remote set-url origin "https://x-access-token:${DISPATCH_TOKEN}@${GITEA_URL#https://}/molecule-ai/molecule-core.git"
git push origin "runtime-v$VERSION"
echo "✓ pushed runtime-v$VERSION — publish-runtime.yml should fire next"
-437
View File
@@ -1,437 +0,0 @@
name: publish-runtime
# Gitea Actions port of .github/workflows/publish-runtime.yml.
#
# Ported 2026-05-10 (issue #206). Key differences from the GitHub version:
# - Gitea Actions reads .gitea/workflows/, not .github/workflows/
# - Dropped `environment: pypi-publish` — Gitea Actions does not support
# named environments or OIDC trusted publishers
# - Replaced `pypa/gh-action-pypi-publish@release/v1` (OIDC) with
# `twine upload` using PYPI_TOKEN secret — same mechanism as a local
# `python -m twine upload` with a PyPI token
# - Replaced `github.ref_name` (GitHub-only) with `${GITHUB_REF#refs/tags/}`
# — Gitea Actions exposes github.ref (the full ref) but not ref_name
# - Dropped `merge_group` trigger (Gitea has no merge queue)
#
# 2026-05-10 (issue #348): originally restored `staging`/`main` branch +
# `workspace/**` path-filter trigger in PR #349.
#
# 2026-05-11 (issue #351): REVERTED the branches+paths trigger from THIS
# file. Bundling `paths` with `tags` under a single `on.push` key caused
# Gitea Actions to never dispatch the workflow for tag-push events (0
# runs in `action_run` for workflow_id='publish-runtime.yml' since the
# port, including the runtime-v1.0.0 tag — which is why PyPI is still at
# 0.1.129 despite a v1.0.0 Gitea tag existing).
#
# The auto-bump-on-workspace-edit trigger now lives in
# `.gitea/workflows/publish-runtime-autobump.yml`. That file computes the
# next version from PyPI's latest and pushes a `runtime-v$VERSION` tag,
# which THIS file then picks up via the tags-only trigger below.
#
# This decoupling means Gitea's path-vs-tag evaluator never has to
# disambiguate — each file has a single unambiguous trigger shape.
#
# PyPI publishing: requires PYPI_TOKEN repository secret (or org-level secret).
# Set via: repo Settings → Actions → Variables and Secrets → New Secret.
# The token should be a PyPI API token scoped to molecule-ai-workspace-runtime.
#
# The DISPATCH_TOKEN cascade (git push to template repos) is unchanged —
# it uses the Gitea API directly and was already Gitea-compatible.
on:
push:
tags:
- "runtime-v*"
workflow_dispatch:
# 2026-05-11 (root cause of #351 / 0 runs ever):
# Gitea 1.22.6's workflow parser rejects `workflow_dispatch.inputs.version`
# with "unknown on type" — it mis-treats the inputs sub-keys as top-level
# `on:` event types. Log line:
# actions/workflows.go:DetectWorkflows() [W] ignore invalid workflow
# "publish-runtime.yml": unknown on type: map["version": {...}]
# That `[W] ignore invalid workflow` is silent UX — the workflow never
# registers, so it never fires for ANY event (push.tags included).
# Removing the inputs block restores parsing. Manual dispatch from the
# Gitea UI now triggers the PyPI auto-bump fallback in `Derive version`
# below (no `inputs.version` to read).
permissions:
contents: read
# Serialize publishes so two concurrent tag pushes don't both compute
# "latest+1" and race on PyPI upload. The second one waits.
concurrency:
group: publish-runtime
cancel-in-progress: false
jobs:
publish:
# Dedicated publish/release lane (internal#462 / #394 / #399). Ship
# path (on: push tag runtime-v*) — reserved capacity, never FIFO
# behind PR-CI. `publish` resolves only to molecule-runner-publish-*.
runs-on: publish
outputs:
version: ${{ steps.version.outputs.version }}
wheel_sha256: ${{ steps.wheel_hash.outputs.wheel_sha256 }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
cache: pip
- name: Derive version (tag or PyPI auto-bump)
id: version
run: |
if echo "$GITHUB_REF" | grep -q "^refs/tags/runtime-v"; then
# Tag is `runtime-vX.Y.Z` — strip the prefix.
VERSION="${GITHUB_REF#refs/tags/runtime-v}"
else
# workflow_dispatch path (no inputs supported on Gitea 1.22.6) or
# any other non-tag trigger: derive from PyPI latest + patch bump.
LATEST=$(curl -fsS --retry 3 https://pypi.org/pypi/molecule-ai-workspace-runtime/json \
| python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])")
MAJOR=$(echo "$LATEST" | cut -d. -f1)
MINOR=$(echo "$LATEST" | cut -d. -f2)
PATCH=$(echo "$LATEST" | cut -d. -f3)
VERSION="${MAJOR}.${MINOR}.$((PATCH+1))"
echo "Auto-bumped from PyPI latest $LATEST -> $VERSION"
fi
if ! echo "$VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+(\.dev[0-9]+|rc[0-9]+|a[0-9]+|b[0-9]+|\.post[0-9]+)?$'; then
echo "::error::version $VERSION does not match PEP 440"
exit 1
fi
echo "version=$VERSION" >> "$GITHUB_OUTPUT"
echo "Publishing molecule-ai-workspace-runtime $VERSION"
- name: Install build tooling
run: pip install build twine
- name: Build package from workspace/
run: |
python scripts/build_runtime_package.py \
--version "${{ steps.version.outputs.version }}" \
--out "${{ runner.temp }}/runtime-build"
- name: Build wheel + sdist
working-directory: ${{ runner.temp }}/runtime-build
run: python -m build
- name: Capture wheel SHA256 for cascade content-verification
id: wheel_hash
working-directory: ${{ runner.temp }}/runtime-build
run: |
set -eu
WHEEL=$(ls dist/*.whl 2>/dev/null | head -1)
if [ -z "$WHEEL" ]; then
echo "::error::No .whl in dist/ — \`python -m build\` must have failed silently"
exit 1
fi
HASH=$(sha256sum "$WHEEL" | awk '{print $1}')
echo "wheel_sha256=${HASH}" >> "$GITHUB_OUTPUT"
echo "Local wheel SHA256 (pre-upload): ${HASH}"
echo "Wheel filename: $(basename "$WHEEL")"
- name: Verify package contents (sanity)
working-directory: ${{ runner.temp }}/runtime-build
run: |
python -m twine check dist/*
python -m venv /tmp/smoke
/tmp/smoke/bin/pip install --quiet dist/*.whl
/tmp/smoke/bin/python "$GITHUB_WORKSPACE/scripts/wheel_smoke.py"
# ─────────────────────────────────────────────────────────────────────
# RFC#596 (2026-05-19): Gitea PyPI registry as PRIMARY, PyPI as
# best-effort fallback. Eliminates the SPOF that caused the
# 2026-05-19 P0 (PyPI abuse-block #593 + Railway outage #595).
#
# Order is inverted intentionally:
# 1. Gitea FIRST — must succeed (our internal SSOT).
# 2. PyPI SECOND — best-effort, non-fatal on failure (courtesy
# mirror; our consumers don't depend on it after Phase 4
# template Dockerfile updates).
#
# Endpoint shape (verified live in RFC#596 Phase 5):
# POST https://git.moleculesai.app/api/packages/molecule-ai/pypi/
# HTTP Basic auth: username = gitea username, password = PAT with
# `write:package` scope. Returns 201 Created on success.
# ─────────────────────────────────────────────────────────────────────
- name: Publish to Gitea PyPI registry (PRIMARY)
id: gitea_publish
working-directory: ${{ runner.temp }}/runtime-build
env:
# MOLECULE_PYPI_GITEA_PUBLISHER_USER: Gitea username for the publisher
# persona (must own a token with `write:package` scope).
# Provisioned in RFC#596 Phase 3 (operator-config PR).
# NOTE: secret name MUST NOT start with `GITEA_` or `GITHUB_` —
# Gitea 1.22.6 reserves those prefixes for built-in env vars and
# rejects repo-secret PUT with HTTP 400 / "invalid secret name".
# Empirically reproduced 2026-05-19 against
# `/repos/molecule-ai/molecule-core/actions/secrets/GITEA_*`.
MOLECULE_PYPI_GITEA_PUBLISHER_USER: ${{ secrets.MOLECULE_PYPI_GITEA_PUBLISHER_USER }}
# MOLECULE_PYPI_GITEA_PUBLISHER_TOKEN: PAT for the publisher persona,
# `write:package` scope on molecule-ai org.
# Synced from Infisical /ci/gitea-pypi-publisher (RFC#596 Phase 3).
MOLECULE_PYPI_GITEA_PUBLISHER_TOKEN: ${{ secrets.MOLECULE_PYPI_GITEA_PUBLISHER_TOKEN }}
run: |
set -eu
if [ -z "${MOLECULE_PYPI_GITEA_PUBLISHER_TOKEN:-}" ] || [ -z "${MOLECULE_PYPI_GITEA_PUBLISHER_USER:-}" ]; then
echo "::error::MOLECULE_PYPI_GITEA_PUBLISHER_USER / MOLECULE_PYPI_GITEA_PUBLISHER_TOKEN secrets are not set."
echo "::error::Provision them via the RFC#596 Phase 3 operator-config sync script."
echo "::error::Gitea is the PRIMARY index per RFC#596 — publish job aborts here, NOT after PyPI."
exit 1
fi
python -m twine upload \
--verbose \
--repository-url "https://git.moleculesai.app/api/packages/molecule-ai/pypi/" \
--username "$MOLECULE_PYPI_GITEA_PUBLISHER_USER" \
--password "$MOLECULE_PYPI_GITEA_PUBLISHER_TOKEN" \
dist/*
echo "gitea_status=success" >> "$GITHUB_OUTPUT"
echo "gitea_url=https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/molecule-ai-workspace-runtime" >> "$GITHUB_OUTPUT"
- name: Publish to PyPI (FALLBACK, best-effort)
id: pypi_publish
# working-directory matches the preceding Build/Verify steps. Without
# this, twine runs from the default workspace checkout dir where
# `dist/` doesn't exist and fails with:
# ERROR InvalidDistribution: Cannot find file (or expand pattern): 'dist/*'
# Caught on the first-ever successful dispatch of this workflow
# (run 5097, 2026-05-11 02:08Z) — every other step in the publish
# job already had this working-directory; Publish was missing it.
#
# RFC#596: this step is `continue-on-error: true` because PyPI is
# NO LONGER the primary index. PyPI 403/timeout/abuse-block does
# NOT block the publish — Gitea already has the wheel.
continue-on-error: true
working-directory: ${{ runner.temp }}/runtime-build
env:
# PYPI_TOKEN: repository secret scoped to molecule-ai-workspace-runtime.
# Set via: Settings → Actions → Variables and Secrets → New Secret.
# Format: pypi-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: |
if [ -z "$PYPI_TOKEN" ]; then
echo "::warning::PYPI_TOKEN secret is not set — skipping PyPI mirror publish (non-fatal per RFC#596)."
echo "pypi_status=skipped_no_token" >> "$GITHUB_OUTPUT"
exit 0
fi
if python -m twine upload \
--verbose \
--repository pypi \
--username __token__ \
--password "$PYPI_TOKEN" \
dist/*; then
echo "pypi_status=success" >> "$GITHUB_OUTPUT"
else
rc=$?
echo "::warning::PyPI mirror publish failed (exit $rc). Non-fatal per RFC#596 — Gitea has the wheel."
echo "pypi_status=failed_exit_$rc" >> "$GITHUB_OUTPUT"
fi
echo "pypi_url=https://pypi.org/project/molecule-ai-workspace-runtime/${{ steps.version.outputs.version }}/" >> "$GITHUB_OUTPUT"
- name: Publish job summary (Gitea + PyPI status)
if: always()
run: |
{
echo "## publish-runtime $(date -u +%FT%TZ)"
echo
echo "**Version:** \`${{ steps.version.outputs.version }}\`"
echo "**Wheel SHA256:** \`${{ steps.wheel_hash.outputs.wheel_sha256 }}\`"
echo
echo "### Indexes"
echo
echo "| Index | Status | URL |"
echo "|---------|-------------------------------------------------|-----|"
echo "| Gitea (PRIMARY) | ${{ steps.gitea_publish.outputs.gitea_status || 'failed' }} | ${{ steps.gitea_publish.outputs.gitea_url || '—' }} |"
echo "| PyPI (fallback) | ${{ steps.pypi_publish.outputs.pypi_status || 'failed' }} | ${{ steps.pypi_publish.outputs.pypi_url || '—' }} |"
echo
echo "Per RFC#596: Gitea is the contract. PyPI is best-effort."
} >> "$GITHUB_STEP_SUMMARY"
cascade:
needs: publish
# Publish/release lane (internal#462) — downstream of the runtime
# publish ship job; keep it on the reserved lane too.
runs-on: publish
steps:
- name: Wait for PyPI to propagate the new version
env:
RUNTIME_VERSION: ${{ needs.publish.outputs.version }}
EXPECTED_SHA256: ${{ needs.publish.outputs.wheel_sha256 }}
run: |
set -eu
if [ -z "$EXPECTED_SHA256" ]; then
echo "::error::publish job did not expose wheel_sha256 — cannot verify wheel content. Refusing to fan out cascade."
exit 1
fi
# NOTE (RFC#596 follow-up): this propagation probe still resolves
# against PyPI's default index. After RFC#596 Phase 4 lands and
# consumers pull from Gitea first, this probe should be rewritten
# to verify the Gitea simple/ endpoint serves the new wheel
# (PyPI may be best-effort-failed and the cascade should still
# fan out, since templates will pull from Gitea). Tracked in #596.
python -m venv /tmp/propagation-probe
PROBE=/tmp/propagation-probe/bin
$PROBE/pip install --upgrade --quiet pip
for i in $(seq 1 30); do
if $PROBE/pip install \
--quiet \
--no-cache-dir \
--force-reinstall \
--no-deps \
"molecule-ai-workspace-runtime==${RUNTIME_VERSION}" \
>/dev/null 2>&1; then
INSTALLED=$($PROBE/pip show molecule-ai-workspace-runtime 2>/dev/null \
| awk -F': ' '/^Version:/{print $2}')
if [ "$INSTALLED" = "$RUNTIME_VERSION" ]; then
echo "✓ PyPI resolved $RUNTIME_VERSION (install check)"
break
fi
fi
if [ $i -eq 30 ]; then
echo "::error::pip install --no-cache-dir molecule-ai-workspace-runtime==${RUNTIME_VERSION} never resolved within ~5 min."
echo "::error::Refusing to fan out cascade against a potentially stale PyPI index."
exit 1
fi
echo " [$i/30] waiting for PyPI to propagate ${RUNTIME_VERSION}..."
sleep 4
done
# Stage (b): download wheel + SHA256 compare against what we built.
# Catches Fastly stale-content serving old bytes under a new version URL.
#
# Caught run 5196 (first-ever successful publish, 2026-05-11): the
# previous one-liner `HASH=$(pip download ... && sha256sum ...)`
# captured pip's stdout (`Collecting molecule-ai-workspace-runtime
# ==X.Y.Z`) into HASH, then the SHA comparison failed against the
# leaked `Collecting...` string. `2>/dev/null` silences stderr but
# NOT stdout; pip writes its progress to stdout by default.
# Fix: split into two steps, silence pip's stdout explicitly, capture
# only sha256sum's output into HASH.
python -m pip download \
--no-deps \
--no-cache-dir \
--dest /tmp/wheel-probe \
--quiet \
"molecule-ai-workspace-runtime==${RUNTIME_VERSION}" \
>/dev/null 2>&1
HASH=$(sha256sum /tmp/wheel-probe/*.whl | awk '{print $1}')
if [ "$HASH" != "$EXPECTED_SHA256" ]; then
echo "::error::PyPI propagated $RUNTIME_VERSION but wheel content SHA256 mismatch."
echo "::error::Expected: $EXPECTED_SHA256"
echo "::error::Got: $HASH"
echo "::error::Fastly may be serving stale content. Refusing to fan out cascade."
exit 1
fi
echo "✓ PyPI CDN verified (SHA256 match)"
- name: Fan out via push to .runtime-version
env:
# Gitea PAT with write:repository scope on the 8 cascade-active
# template repos. Used for git push to each template repo's main
# branch, which trips their `on: push: branches: [main]` trigger
# on publish-image.yml.
DISPATCH_TOKEN: ${{ secrets.DISPATCH_TOKEN }}
RUNTIME_VERSION: ${{ needs.publish.outputs.version }}
run: |
set +e # don't abort on a single repo failure — collect them all
if [ -z "$DISPATCH_TOKEN" ]; then
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "::warning::DISPATCH_TOKEN secret not set — skipping cascade."
echo "::warning::set it at Settings → Actions → Variables and Secrets → New Secret."
exit 0
fi
echo "::error::DISPATCH_TOKEN secret missing — cascade cannot fan out."
echo "::error::PyPI was published, but the 8 template repos will NOT pick up the new version."
exit 1
fi
VERSION="$RUNTIME_VERSION"
if [ -z "$VERSION" ]; then
echo "::error::publish job did not expose a version output"
exit 1
fi
GITEA_URL="${GITEA_URL:-https://git.moleculesai.app}"
# Keep in lockstep with manifest.json workspace_templates (suffix-stripped).
# Guarded by scripts/check-cascade-list-vs-manifest.sh (cascade-list-drift-gate).
# 2026-05-19: pruned crewai/deepagents/gemini-cli — not in manifest.
TEMPLATES="claude-code hermes openclaw codex langgraph autogen"
FAILED=""
SKIPPED=""
git config --global user.name "publish-runtime cascade"
git config --global user.email "publish-runtime@moleculesai.app"
WORKDIR="$(mktemp -d)"
for tpl in $TEMPLATES; do
REPO="molecule-ai/molecule-ai-workspace-template-$tpl"
CLONE="$WORKDIR/$tpl"
HTTP=$(curl -sS -o /dev/null -w "%{http_code}" \
-H "Authorization: token $DISPATCH_TOKEN" \
"$GITEA_URL/api/v1/repos/$REPO/contents/.github/workflows/publish-image.yml")
if [ "$HTTP" = "404" ]; then
echo "↷ $tpl has no publish-image.yml — soft-skip"
SKIPPED="$SKIPPED $tpl"
continue
fi
attempt=0
success=false
while [ $attempt -lt 3 ]; do
attempt=$((attempt + 1))
rm -rf "$CLONE"
if ! git clone --depth=1 \
"https://x-access-token:${DISPATCH_TOKEN}@${GITEA_URL#https://}/$REPO.git" \
"$CLONE" >/tmp/clone.log 2>&1; then
echo "::warning::clone $tpl attempt $attempt failed: $(tail -n3 /tmp/clone.log)"
sleep 2
continue
fi
cd "$CLONE"
echo "$VERSION" > .runtime-version
if git diff --quiet -- .runtime-version; then
echo "✓ $tpl already at $VERSION — no commit needed"
success=true
cd - >/dev/null
break
fi
git add .runtime-version
git commit -m "chore: pin runtime to $VERSION (publish-runtime cascade)" \
-m "Co-Authored-By: publish-runtime cascade <publish-runtime@moleculesai.app>" \
>/dev/null
if git push origin HEAD:main >/tmp/push.log 2>&1; then
echo "✓ $tpl pushed $VERSION on attempt $attempt"
success=true
cd - >/dev/null
break
fi
echo "::warning::push $tpl attempt $attempt failed, pull-rebasing"
git pull --rebase origin main >/tmp/rebase.log 2>&1 || true
cd - >/dev/null
done
if [ "$success" != "true" ]; then
FAILED="$FAILED $tpl"
fi
done
rm -rf "$WORKDIR"
if [ -n "$FAILED" ]; then
echo "::error::Cascade incomplete after 3 retries each. Failed:$FAILED"
exit 1
fi
if [ -n "$SKIPPED" ]; then
echo "Cascade complete: pinned $VERSION. Soft-skipped (no publish-image.yml):$SKIPPED"
else
echo "Cascade complete: $VERSION pinned across all manifest workspace_templates."
fi
@@ -25,8 +25,11 @@ name: publish-workspace-server-image
# staging-<sha>. Set repo variable or secret PROD_AUTO_DEPLOY_DISABLED=true
# to stop production rollout while keeping image publishing enabled.
#
# ECR target: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*
# Primary ECR target: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*
# Optional staging tenant mirror target:
# 004947743811.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant
# Required secrets: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AUTO_SYNC_TOKEN
# Optional secrets: AWS_STAGING_ECR_ACCESS_KEY_ID, AWS_STAGING_ECR_SECRET_ACCESS_KEY
#
# mc#711: Docker daemon not accessible on ubuntu-latest runner (molecule-canonical-1
# shows client-only in `docker info` — daemon not running). DinD mount is present but
@@ -65,6 +68,7 @@ env:
# use below in this repo's staging-verify.yml.
IMAGE_NAME: ${{ vars.ECR_REGISTRY || '153263036946.dkr.ecr.us-east-2.amazonaws.com' }}/molecule-ai/platform
TENANT_IMAGE_NAME: ${{ vars.ECR_REGISTRY || '153263036946.dkr.ecr.us-east-2.amazonaws.com' }}/molecule-ai/platform-tenant
STAGING_TENANT_IMAGE_NAME: ${{ vars.STAGING_ECR_REGISTRY || '004947743811.dkr.ecr.us-east-2.amazonaws.com' }}/molecule-ai/platform-tenant
jobs:
build-and-push:
@@ -135,6 +139,18 @@ jobs:
run: |
echo "sha=${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
# Keep Buildx state inside the job temp dir. The publish runner's
# inherited DOCKER_CONFIG can point at a host-owned ECR config path
# (/home/hongming/.docker-ecr), which caused setup-buildx-action to
# fail before image build with EACCES creating buildx/certs.
- name: Prepare writable Docker config
run: |
set -euo pipefail
export DOCKER_CONFIG="$RUNNER_TEMP/docker-config"
mkdir -p "$DOCKER_CONFIG/buildx/certs"
echo "DOCKER_CONFIG=$DOCKER_CONFIG" >> "$GITHUB_ENV"
docker buildx version
# Build + push platform image (inline ECR auth — mirrors the operator-host
# approach; credentials come from GITHUB_SECRET_AWS_ACCESS_KEY_ID /
# GITHUB_SECRET_AWS_SECRET_ACCESS_KEY in Gitea Actions).
@@ -170,21 +186,46 @@ jobs:
--push .
# Build + push tenant image (Go platform + Next.js canvas in one image).
# When staging ECR publisher credentials are configured, push the same
# build to the staging account too so fresh staging/E2E tenants can pull
# without cross-account ECR permissions.
- name: Build & push tenant image to ECR (staging-<sha> + staging-latest)
env:
TENANT_IMAGE_NAME: ${{ env.TENANT_IMAGE_NAME }}
STAGING_TENANT_IMAGE_NAME: ${{ env.STAGING_TENANT_IMAGE_NAME }}
TAG_SHA: staging-${{ steps.tags.outputs.sha }}
TAG_LATEST: staging-latest
GIT_SHA: ${{ github.sha }}
REPO: ${{ github.repository }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_STAGING_ECR_ACCESS_KEY_ID: ${{ secrets.AWS_STAGING_ECR_ACCESS_KEY_ID }}
AWS_STAGING_ECR_SECRET_ACCESS_KEY: ${{ secrets.AWS_STAGING_ECR_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
run: |
set -euo pipefail
ECR_REGISTRY="${TENANT_IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
build_tags=(
--tag "${TENANT_IMAGE_NAME}:${TAG_SHA}"
--tag "${TENANT_IMAGE_NAME}:${TAG_LATEST}"
)
if [ -n "${AWS_STAGING_ECR_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_STAGING_ECR_SECRET_ACCESS_KEY:-}" ]; then
STAGING_ECR_REGISTRY="${STAGING_TENANT_IMAGE_NAME%%/*}"
AWS_ACCESS_KEY_ID="${AWS_STAGING_ECR_ACCESS_KEY_ID}" \
AWS_SECRET_ACCESS_KEY="${AWS_STAGING_ECR_SECRET_ACCESS_KEY}" \
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${STAGING_ECR_REGISTRY}"
build_tags+=(
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_SHA}"
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_LATEST}"
)
else
echo "::notice::Skipping staging ECR tenant push; AWS_STAGING_ECR_ACCESS_KEY_ID/AWS_STAGING_ECR_SECRET_ACCESS_KEY are not configured."
fi
docker buildx build \
--file ./workspace-server/Dockerfile.tenant \
--build-arg NEXT_PUBLIC_PLATFORM_URL= \
@@ -193,8 +234,7 @@ jobs:
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.created=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--label "molecule.workflow.run_id=${GITHUB_RUN_ID}" \
--tag "${TENANT_IMAGE_NAME}:${TAG_SHA}" \
--tag "${TENANT_IMAGE_NAME}:${TAG_LATEST}" \
"${build_tags[@]}" \
--push .
# bp-exempt: production deploy side-effect; merge is gated by CI / all-required and this job waits for push CI before acting.
-101
View File
@@ -1,101 +0,0 @@
name: Runtime Pin Compatibility
# Ported from .github/workflows/runtime-pin-compat.yml on 2026-05-11 per
# RFC internal#219 §1 sweep.
#
# Differences from the GitHub version:
# - Dropped `merge_group:` (no Gitea merge queue) and
# `workflow_dispatch:` (no inputs, but the trigger itself is
# parser-rejected when inputs are absent in some Gitea 1.22.x
# builds; safest to drop entirely — manual runs go via cron-trigger
# bump or push-with-paths-filter).
# - on.paths references .gitea/workflows/runtime-pin-compat.yml (this
# file) instead of the .github/ one.
# - Workflow-level env.GITHUB_SERVER_URL set.
# - `continue-on-error: true` on the job (RFC §1 contract).
#
# CI gate that prevents the 5-hour staging outage from 2026-04-24 from
# recurring (controlplane#253). The original failure mode:
# 1. molecule-ai-workspace-runtime 0.1.13 declared `a2a-sdk<1.0` in its
# requires_dist metadata (incorrect — it actually imports
# a2a.server.routes which only exists in a2a-sdk 1.0+)
# 2. `pip install molecule-ai-workspace-runtime` resolved cleanly
# 3. `from molecule_runtime.main import main_sync` raised ImportError
# 4. Every tenant workspace crashed; the canary tenant caught it but
# only after 5 hours of degraded staging
#
# This workflow installs the CURRENTLY PUBLISHED runtime from PyPI on
# top of `workspace/requirements.txt` and smoke-imports. Catches:
# - Upstream PyPI yanks
# - Bad re-releases of molecule-ai-workspace-runtime
# - Already-shipped wheels that stop importing because a transitive
# dep moved underneath
on:
push:
branches: [main, staging]
paths:
# Narrow filter: pypi-latest is sensitive only to changes that
# affect what we're INSTALLING (requirements.txt) or WHAT THE
# CHECK ITSELF DOES (this workflow file). Edits to workspace/
# source code don't change what's on PyPI right now, so they
# don't change this gate's verdict.
- 'workspace/requirements.txt'
- '.gitea/workflows/runtime-pin-compat.yml'
pull_request:
branches: [main, staging]
paths:
- 'workspace/requirements.txt'
- '.gitea/workflows/runtime-pin-compat.yml'
# Daily catch for upstream PyPI publishes that break the pin combo
# without any change in our repo (e.g. someone re-yanks an a2a-sdk
# release or molecule-ai-workspace-runtime publishes a bad bump).
schedule:
- cron: '0 13 * * *' # 06:00 PT
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
pypi-latest-install:
name: PyPI-latest install + import smoke
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.11'
cache: pip
cache-dependency-path: workspace/requirements.txt
- name: Install runtime + workspace requirements
# Install order is load-bearing: install the runtime FIRST so pip
# honors whatever a2a-sdk constraint the runtime metadata declares
# (this is the surface that broke in 2026-04-24 — runtime declared
# `a2a-sdk<1.0` but actually needed >=1.0). The follow-up install
# of workspace/requirements.txt then upgrades a2a-sdk to the
# constraint our runtime image actually pins. The import smoke
# below verifies the upgraded combination is consistent.
run: |
python -m venv /tmp/venv
/tmp/venv/bin/pip install --upgrade pip
/tmp/venv/bin/pip install molecule-ai-workspace-runtime
/tmp/venv/bin/pip install -r workspace/requirements.txt
/tmp/venv/bin/pip show molecule-ai-workspace-runtime a2a-sdk \
| grep -E '^(Name|Version):'
- name: Smoke import — fail if metadata declares deps that don't satisfy real imports
# WORKSPACE_ID is validated at import time by platform_auth.py — EC2
# user-data sets it from the cloud-init template; set a placeholder
# here so the import smoke doesn't trip on the env-var guard.
env:
WORKSPACE_ID: 00000000-0000-0000-0000-000000000001
run: |
/tmp/venv/bin/python -c "from molecule_runtime.main import main_sync; print('runtime imports OK')"
-150
View File
@@ -1,150 +0,0 @@
name: Runtime PR-Built Compatibility
# Ported from .github/workflows/runtime-prbuild-compat.yml on 2026-05-11
# per RFC internal#219 §1 sweep.
#
# Differences from the GitHub version:
# - Dropped `merge_group:` (no Gitea merge queue) and `workflow_dispatch:`
# (Gitea 1.22.6 parser-rejects workflow_dispatch with inputs and is
# finicky without them).
# - `dorny/paths-filter@v4` replaced with inline `git diff` (per PR#372
# pattern for ci.yml port).
# - on.paths references .gitea/workflows/runtime-prbuild-compat.yml.
# - Workflow-level env.GITHUB_SERVER_URL set.
# - `continue-on-error: true` on every job (RFC §1 contract).
#
# Companion to `runtime-pin-compat.yml`. That workflow tests what's
# CURRENTLY PUBLISHED on PyPI; this workflow tests what WOULD BE
# PUBLISHED if THIS PR merges.
#
# Why two workflows: the chicken-and-egg #128 fix added a "PR-built
# wheel" job to the original runtime-pin-compat.yml, but both jobs
# shared a `paths:` filter that was the union of their needs
# (`workspace/**`). That meant the PyPI-latest job ran on every doc
# edit even though the upstream PyPI artifact can't change with our
# workspace/ source. Splitting the two means each gets a narrow
# `paths:` filter that matches the inputs it actually depends on.
#
# Catches the failure mode where a PR adds an import requiring a newer
# SDK than `workspace/requirements.txt` pins:
# 1. Pip resolves the existing PyPI wheel + the old SDK pin -> smoke
# passes (it imports the OLD main.py from the wheel, not the PR's
# new main.py).
# 2. Merge -> publish-runtime.yml ships a wheel WITH the new import.
# 3. Tenant images redeploy -> all crash on first boot with ImportError.
on:
push:
branches: [main, staging]
pull_request:
branches: [main, staging]
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
concurrency:
# event_name + sha keeps PR sync and the subsequent staging push on the
# same SHA from cancelling each other (per feedback_concurrency_group_per_sha).
group: ${{ github.workflow }}-${{ github.event_name }}-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
jobs:
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
wheel: ${{ steps.decide.outputs.wheel }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- id: decide
run: |
# Inline replacement for dorny/paths-filter — same pattern
# PR#372's ci.yml port used. Diffs against the PR base or the
# previous push SHA, then matches against the wheel-relevant
# path set.
#
# NOTE: Gitea Actions does not expose github.event.before as a
# shell environment variable. The ${{ github.event.before }} template
# expression works inside YAML run: blocks but is evaluated to an
# empty string for push events, making the ${VAR:-fallback} always
# use the fallback. Use GITHUB_EVENT_BEFORE instead — it IS set in
# the runner's shell environment for push events.
BASE=""
if [ "${{ github.event_name }}" = "pull_request" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
elif [ -n "$GITHUB_EVENT_BEFORE" ]; then
BASE="$GITHUB_EVENT_BEFORE"
fi
if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$'; then
# New branch or no previous SHA: treat as wheel-relevant.
echo "wheel=true" >> "$GITHUB_OUTPUT"
exit 0
fi
if ! timeout 30 git cat-file -e "$BASE" 2>/dev/null; then
git fetch --depth=1 origin "$BASE" 2>/dev/null || true
fi
if ! timeout 30 git cat-file -e "$BASE" 2>/dev/null; then
echo "wheel=true" >> "$GITHUB_OUTPUT"
exit 0
fi
CHANGED=$(git diff --name-only "$BASE" HEAD)
if echo "$CHANGED" | grep -qE '^(workspace/|scripts/build_runtime_package\.py$|scripts/wheel_smoke\.py$|\.gitea/workflows/runtime-prbuild-compat\.yml$)'; then
echo "wheel=true" >> "$GITHUB_OUTPUT"
else
echo "wheel=false" >> "$GITHUB_OUTPUT"
fi
# ONE job (no job-level `if:`) that always runs and reports under the
# required-check name `PR-built wheel + import smoke`. Real work is
# gated per-step on `needs.detect-changes.outputs.wheel`.
local-build-install:
needs: detect-changes
name: PR-built wheel + import smoke
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- name: No-op pass (paths filter excluded this commit)
if: needs.detect-changes.outputs.wheel != 'true'
run: |
echo "No workspace/ / scripts/{build_runtime_package,wheel_smoke}.py / workflow changes — wheel gate satisfied without rebuilding."
echo "::notice::PR-built wheel + import smoke no-op pass (paths filter excluded this commit)."
- if: needs.detect-changes.outputs.wheel == 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- if: needs.detect-changes.outputs.wheel == 'true'
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.11'
cache: pip
cache-dependency-path: workspace/requirements.txt
- name: Install build tooling
if: needs.detect-changes.outputs.wheel == 'true'
run: pip install build
- name: Build wheel from PR source (mirrors publish-runtime.yml)
if: needs.detect-changes.outputs.wheel == 'true'
# Use a fixed test version so the wheel filename is predictable.
# Doesn't reach PyPI — this build is local-only for the smoke.
run: |
python scripts/build_runtime_package.py \
--version "0.0.0.dev0+pin-compat" \
--out /tmp/runtime-build
cd /tmp/runtime-build && python -m build
- name: Install built wheel + workspace requirements
if: needs.detect-changes.outputs.wheel == 'true'
run: |
python -m venv /tmp/venv-built
/tmp/venv-built/bin/pip install --upgrade pip
/tmp/venv-built/bin/pip install /tmp/runtime-build/dist/*.whl
/tmp/venv-built/bin/pip install -r workspace/requirements.txt
/tmp/venv-built/bin/pip show molecule-ai-workspace-runtime a2a-sdk \
| grep -E '^(Name|Version):'
- name: Smoke import the PR-built wheel
if: needs.detect-changes.outputs.wheel == 'true'
# Same script publish-runtime.yml runs against the to-be-PyPI wheel.
run: |
/tmp/venv-built/bin/python "$GITHUB_WORKSPACE/scripts/wheel_smoke.py"
+6 -13
View File
@@ -53,19 +53,12 @@ name: status-reaper
# `inputs:` block here. Gitea 1.22.6 rejects the whole workflow as
# "unknown on type" when `workflow_dispatch.inputs.X` is present.
on:
# SCHEDULE RE-ENABLED 2026-05-12 rev3 — interim disable (mc#645) reverted now that
# rev3 widens DEFAULT_SWEEP_LIMIT 10 → 30 (covers retroactive-failure timing window).
# Sibling watchdog re-enabled in the same PR with timeout-minutes raised 5 → 15.
schedule:
# Every 5 minutes. Off-zero alignment with sibling cron workflows:
# ci-required-drift (`:17`), main-red-watchdog (`:05`),
# railway-pin-audit (`:23`). 5-min cadence gives a tight enough
# close on schedule-triggered false-reds that main-red-watchdog
# (hourly :05) almost never files an issue on the false case.
# rev3 keeps `*/5` unchanged per hongming-pc2 03:25Z review:
# "trades window-width-cheap for cadence-loady" — N=30 widens
# the lookback cheaply without doubling runner load via `*/2`.
- cron: '*/5 * * * *'
# Schedule moved to operator-config:
# /etc/cron.d/molecule-core-status-reaper ->
# /usr/local/bin/molecule-core-cron-bot.sh status-reaper
#
# This keeps the 5-minute compensation cadence but stops a maintenance
# bot from consuming Gitea Actions runner slots during PR merge waves.
workflow_dispatch:
# Compensating-status POST needs write on repo statuses; no other
+13 -7
View File
@@ -58,14 +58,20 @@ jobs:
python-version: '3.11'
- name: Install .gitea script test dependencies
run: python -m pip install --quiet 'pytest==9.0.2' 'PyYAML==6.0.2'
- name: Run scripts/ unittests (build_runtime_package, ...)
# Top-level scripts/ tests live alongside their target file
# (e.g. scripts/test_build_runtime_package.py exercises
# scripts/build_runtime_package.py). discover from scripts/
# picks up only top-level test_*.py because scripts/ops/ has
# no __init__.py — that's intentional, so we run two passes.
- name: Run scripts/ unittests, if any
# Top-level scripts/ tests live alongside their target file. The
# runtime packaging tests moved to molecule-ai-workspace-runtime, so
# this pass may legitimately find no tests.
working-directory: scripts
run: python -m unittest discover -t . -p 'test_*.py' -v
run: |
set +e
python -m unittest discover -t . -p 'test_*.py' -v
rc=$?
if [ "$rc" -eq 5 ]; then
echo "No top-level scripts/ unittest files found; skipping."
exit 0
fi
exit "$rc"
- name: Run scripts/ops/ unittests (sweep_cf_decide, ...)
working-directory: scripts/ops
run: python -m unittest discover -p 'test_*.py' -v
+18 -1
View File
@@ -127,7 +127,11 @@ cd workspace-server && go test -race ./...
cd canvas && npm test
# Workspace runtime (Python)
cd workspace && python -m pytest -v
# Runtime code is SSOT in molecule-ai-workspace-runtime, not molecule-core/workspace.
cd ../molecule-ai-workspace-runtime
python -m venv .venv && source .venv/bin/activate
pip install --index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ -e . pytest pytest-asyncio
pytest -q
# E2E API tests (requires running platform)
bash tests/e2e/test_api.sh
@@ -159,6 +163,19 @@ and run CI manually.
| review-check-tests | `review-check.sh` evaluator regression suite (13 scenarios) |
| ops-scripts | Python unittest suite for `scripts/*.py` |
### Workspace runtime SSOT
Runtime code lives in
[`molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime).
Do not reintroduce `molecule-core/workspace/` or vendored `molecule_runtime/`
copies in consumers. Core and templates consume the published runtime package
from the Gitea package registry.
For local external MCP agents, multi-workspace config is
`MOLECULE_WORKSPACES=[{"id":"...","token":"...","platform_url":"..."}]`.
`platform_url` selects the tenant; `org_id` is not part of this config.
Workspace IDs can differ across orgs.
## Local Testing
### review-check.sh
+5 -5
View File
@@ -163,11 +163,11 @@ Most agent systems stop at "a smart runtime." Molecule AI pushes further: it giv
| Core mechanism | Molecule AI module(s) | Why it matters |
|---|---|---|
| **Durable memory that survives sessions** | `workspace/builtin_tools/memory.py`, `workspace/builtin_tools/awareness_client.py`, `workspace-server/internal/handlers/memories.go` | Memory is not just durable, it is **workspace-scoped** and can route into awareness namespaces tied to the org structure |
| **Durable memory that survives sessions** | `molecule-ai-workspace-runtime/molecule_runtime/builtin_tools/`, `workspace-server/internal/handlers/memories.go` | Memory is not just durable, it is **workspace-scoped** and can route into awareness namespaces tied to the org structure |
| **Cross-session recall** | `workspace-server/internal/handlers/activity.go` (`/workspaces/:id/session-search`) | Recall spans both activity history and memory rows, so the system can search what happened and what was learned without inventing a separate hidden store |
| **Skills built from experience** | `workspace/builtin_tools/memory.py` (`_maybe_log_skill_promotion`) | Promotion from memory into a skill candidate is surfaced as an explicit platform activity, not a silent internal side effect |
| **Skill improvement during use** | `workspace/skill_loader/watcher.py`, `workspace/skill_loader/loader.py`, `workspace/main.py` | Skills hot-reload into the live runtime, so improvements become available on the next A2A task without restarting the workspace |
| **Persistent skill lifecycle** | `workspace-server/cmd/cli/cmd_agent_skill.go`, `workspace/plugins.py` | Skills are not just generated once; they can be audited, installed, published, shared, mounted by plugins, and governed as reusable operational assets |
| **Skills built from experience** | `molecule-ai-workspace-runtime/molecule_runtime/builtin_tools/memory.py` (`_maybe_log_skill_promotion`) | Promotion from memory into a skill candidate is surfaced as an explicit platform activity, not a silent internal side effect |
| **Skill improvement during use** | `molecule-ai-workspace-runtime/molecule_runtime/skill_loader/`, `molecule-ai-workspace-runtime/molecule_runtime/main.py` | Skills hot-reload into the live runtime, so improvements become available on the next A2A task without restarting the workspace |
| **Persistent skill lifecycle** | `workspace-server/cmd/cli/cmd_agent_skill.go`, `molecule-ai-workspace-runtime/molecule_runtime/plugins.py` | Skills are not just generated once; they can be audited, installed, published, shared, mounted by plugins, and governed as reusable operational assets |
### Why this matters in Molecule AI
@@ -208,7 +208,7 @@ The result is not just “an agent that learns.” It is **an organization that
### Runtime
- unified `workspace/` image; thin AMI in production (us-east-2)
- standalone workspace-template images that install `molecule-ai-workspace-runtime` from the Gitea package registry; thin AMI in production (us-east-2)
- adapter-driven execution across **8 runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw)
- Agent Card registration
- awareness-backed memory integration; **Memory v2 backed by pgvector** for semantic recall
+1 -1
View File
@@ -55,7 +55,7 @@ test.describe("Desktop ChatTab", () => {
await textarea.fill("What is the weather?");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("What is the weather?")).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("What is the weather?", { exact: true })).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Echo: What is the weather?")).toBeVisible({ timeout: 15_000 });
});
+1 -1
View File
@@ -49,7 +49,7 @@ test.describe("MobileChat", () => {
await textarea.fill("Mobile test message");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Mobile test message")).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Mobile test message", { exact: true })).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Echo: Mobile test message")).toBeVisible({ timeout: 15_000 });
});
+33 -6
View File
@@ -9,6 +9,7 @@
*/
import { randomUUID } from "node:crypto";
import { execFileSync, execSync } from "node:child_process";
const PLATFORM_URL = process.env.E2E_PLATFORM_URL ?? "http://localhost:8080";
@@ -23,13 +24,19 @@ export interface SeededWorkspace {
* Create an external workspace and wire it to the echo runtime.
*/
export async function seedWorkspace(echoURL: string): Promise<SeededWorkspace> {
// 1. Create external workspace (no URL — platform will mint an auth token).
// 1. Create external workspace pointing at the in-process echo runtime.
const runId = Math.random().toString(36).slice(2, 8);
const wsName = `Chat E2E Agent ${runId}`;
const createRes = await fetch(`${PLATFORM_URL}/workspaces`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ name: wsName, tier: 1, external: true, runtime: "external" }),
body: JSON.stringify({
name: wsName,
tier: 1,
external: true,
runtime: "external",
url: echoURL,
}),
});
if (!createRes.ok) {
const text = await createRes.text();
@@ -40,7 +47,10 @@ export async function seedWorkspace(echoURL: string): Promise<SeededWorkspace> {
name: string;
connection?: { auth_token?: string };
};
const authToken = ws.connection?.auth_token;
let authToken = ws.connection?.auth_token;
if (!authToken) {
authToken = await mintTestToken(ws.id);
}
if (!authToken) {
throw new Error("Workspace created but no auth_token returned");
}
@@ -73,16 +83,35 @@ export async function seedWorkspace(echoURL: string): Promise<SeededWorkspace> {
`-c "UPDATE workspaces SET status = 'online', url = '${echoURL}', platform_inbound_secret = '${inboundSecret}' WHERE id = '${ws.id}'"`,
].join(" ");
const { execSync } = await import("node:child_process");
try {
execSync(psql, { stdio: "pipe", timeout: 30_000 });
} catch (err) {
throw new Error(`DB update failed: ${err}`);
}
cacheWorkspaceURL(ws.id, echoURL);
return { id: ws.id, name: wsName, agentURL: echoURL, authToken };
}
function cacheWorkspaceURL(workspaceId: string, agentURL: string): void {
const redisContainer = process.env.REDIS_CONTAINER;
if (!redisContainer) return;
const keys = [`ws:${workspaceId}:url`, `ws:${workspaceId}:internal_url`];
for (const key of keys) {
try {
execFileSync(
"docker",
["exec", redisContainer, "redis-cli", "SET", key, agentURL],
{ stdio: "pipe", timeout: 10_000 },
);
} catch (err) {
throw new Error(`Redis URL cache update failed for ${key}: ${err}`);
}
}
}
/**
* Start a heartbeat interval that keeps an external workspace alive.
* Returns a stop function.
@@ -141,7 +170,6 @@ export async function seedChatHistory(
const sql = `INSERT INTO chat_messages (id, workspace_id, role, content, created_at) VALUES ${values};`;
const { execSync } = await import("node:child_process");
const psql = `PGPASSWORD=${pass} psql -h ${host} -p ${port} -U ${user} -d ${db} -c "${sql}"`;
execSync(psql, { stdio: "pipe", timeout: 10_000 });
}
@@ -163,7 +191,6 @@ export async function cleanupWorkspace(workspaceId: string): Promise<void> {
const psql = `PGPASSWORD=${pass} psql -h ${host} -p ${port} -U ${user} -d ${db} -c "DELETE FROM workspaces WHERE id = '${workspaceId}'"`;
const { execSync } = await import("node:child_process");
try {
execSync(psql, { stdio: "pipe", timeout: 30_000 });
} catch {
+2 -2
View File
@@ -162,10 +162,10 @@ export async function startEchoRuntime(): Promise<EchoRuntime> {
});
});
await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
await new Promise<void>((resolve) => server.listen(0, resolve));
const address = server.address();
const port = typeof address === "object" && address ? address.port : 0;
const baseURL = `http://127.0.0.1:${port}`;
const baseURL = `http://localhost:${port}`;
return {
baseURL,
+91 -2
View File
@@ -68,14 +68,103 @@ export function Toolbar() {
return c;
}, [nodes]);
/**
* Stop All - task #377 fix.
*
* BEFORE this PR: directly POSTed `/workspaces/:id/restart`, which tears
* the container down and back up. That kills in-flight tool subprocesses
* (e.g. `bash -c 'sleep 600'`) but is heavy and discards any in-progress
* agent state. It also bypasses the runtime-side fast cancel path (task
* #377 PR#40 in template-claude-code) - meaning flipping
* `MOLECULE_STOP_PROPAGATE=true` would produce zero canary signal because
* nothing ever invokes `executor.cancel()` in production.
*
* AFTER this PR (two-phase polite cancel):
*
* 1. POST `tasks/cancel` (A2A JSON-RPC) to each active workspace's
* `/workspaces/:id/a2a` proxy. The platform proxies the envelope to
* the workspace runtime; the a2a-sdk framework dispatches `tasks/cancel`
* to `AgentExecutor.cancel()` (a2a-sdk 1.0.3
* `a2a/compat/v0_3/types.py` line 1125 pins the wire literal as
* `Literal["tasks/cancel"]`; A2A protocol spec section 9.4.5 maps the
* abstract `CancelTask` operation to that wire string). The runtime's
* executor cancel path signals the CLI subprocess group with
* SIGTERM/grace/SIGKILL (template-claude-code PR#40 `stop_propagate.py`).
*
* 2. Poll the canvas store (the platform pushes `TASK_UPDATED` over WS
* on `active_tasks` changes - `canvas-events.ts` line 400) for up to
* `STOP_ALL_DRAIN_TIMEOUT_MS`. A workspace whose `activeTasks` drops
* to 0 is considered drained and is NOT restarted.
*
* 3. For any workspace that DID NOT drain inside the timeout - runtime
* is on an old image without the cancel path, or the cancel
* propagation is stuck - fall back to the original heavy
* `/workspaces/:id/restart`. The original behavior is preserved as a
* floor so a stuck workspace still gets stopped; the polite path is
* a fast top-up that lets well-behaved workspaces cancel without
* losing context.
*
* The polite-cancel envelope mirrors `ScheduleTab.handleRunNow` (line 168)
* which is the only other place in canvas that POSTs `/workspaces/:id/a2a`
* directly. Method string `tasks/cancel` and empty `params` match the
* a2a-sdk shape verified above. The proxy adds `jsonrpc:"2.0"` and `id`
* via `normalizeA2APayload` server-side, so the canvas envelope omits them.
*/
const stopAll = useCallback(async () => {
setStopping(true);
const active = nodes.filter((n) => (n.data.activeTasks as number) > 0);
const activeIds = active.map((n) => n.id);
// Phase 1 - polite cancel on every active workspace in parallel.
// Errors are swallowed (same shape as the pre-fix /restart
// Promise.all): a 4xx/5xx on tasks/cancel just means we fall through
// to /restart for that workspace below.
await Promise.all(
active.map((n) =>
api.post(`/workspaces/${n.id}/restart`).catch(() => {})
activeIds.map((id) =>
api
.post(`/workspaces/${id}/a2a`, {
method: "tasks/cancel",
params: {},
})
.catch(() => {})
)
);
// Phase 2 - poll the store for activeTasks reaching 0, with a hard
// timeout. STOP_ALL_DRAIN_TIMEOUT_MS is sized to cover the runtime's
// own SIGTERM-grace (5s in template-claude-code stop_propagate.py
// `_SIGTERM_GRACE_S`) plus a small WS round-trip buffer for the
// TASK_UPDATED push. STOP_ALL_POLL_INTERVAL_MS keeps the poll cheap
// (no animation jitter, no busy-wait).
const STOP_ALL_DRAIN_TIMEOUT_MS = 8000;
const STOP_ALL_POLL_INTERVAL_MS = 250;
const deadline = Date.now() + STOP_ALL_DRAIN_TIMEOUT_MS;
let undrained = new Set(activeIds);
while (undrained.size > 0 && Date.now() < deadline) {
await new Promise((r) => setTimeout(r, STOP_ALL_POLL_INTERVAL_MS));
const fresh = useCanvasStore.getState().nodes;
const stillActive = new Set<string>();
for (const id of undrained) {
const n = fresh.find((x) => x.id === id);
// Missing node (workspace deleted mid-cancel) is treated as
// drained - there's nothing left to restart and reporting it as
// "still running" would be a lie.
if (n && (n.data.activeTasks as number) > 0) stillActive.add(id);
}
undrained = stillActive;
}
// Phase 3 - hard-restart anything that did not drain. This is the
// same call shape as the pre-fix Stop All, so behavior is strictly a
// superset: undrained workspaces still get the heavy stop, drained
// ones are spared.
if (undrained.size > 0) {
await Promise.all(
Array.from(undrained).map((id) =>
api.post(`/workspaces/${id}/restart`).catch(() => {})
)
);
}
setStopping(false);
}, [nodes]);
@@ -131,14 +131,30 @@ const defaultStore = {
batchDelete: vi.fn(() => Promise.resolve()),
};
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector: (s: typeof defaultStore) => unknown) =>
vi.mock("@/store/canvas", () => {
// useCanvasStore is used in two shapes:
// 1. As a hook: `useCanvasStore((s) => s.x)` — selector path.
// 2. As a static accessor: `useCanvasStore.getState().nodes` —
// used by stopAll's drain-poll loop (task #377 Toolbar fix) and
// restartAll's success-clear loop. Both read the LIVE
// defaultStore object so tests that mutate `defaultStore.nodes`
// mid-flight (e.g. simulating a TASK_UPDATED that drops
// activeTasks to 0) see the update on the next poll tick.
const hook = vi.fn((selector: (s: typeof defaultStore) => unknown) =>
selector(defaultStore)
),
}));
) as unknown as ((selector: (s: typeof defaultStore) => unknown) => unknown) & {
getState: () => typeof defaultStore;
};
hook.getState = () => defaultStore;
return { useCanvasStore: hook };
});
// ── Component under test ───────────────────────────────────────────────────────
import { Toolbar } from "../Toolbar";
// Imported AFTER vi.mock("@/lib/api", ...) above (hoisted) so this
// resolves to the mock module; gives the new task #377 tests a typed
// handle on api.post without a CJS require() (Vitest runs ESM).
import { api as mockedApi } from "@/lib/api";
// ── Tests ─────────────────────────────────────────────────────────────────────
@@ -315,3 +331,157 @@ describe("Toolbar — ? shortcut opens shortcuts dialog", () => {
expect(screen.queryByTestId("shortcuts-dialog")).toBeNull();
});
});
// ── Toolbar — Stop All polite-cancel flow (task #377) ───────────────────────
describe("Toolbar — Stop All polite cancel before restart (#377)", () => {
// `api` resolves to the top-level vi.mock factory's mocked `post`.
// We type-cast so TS allows mockReset/mockResolvedValue/mockImplementation
// calls without leaking the mock surface into the production type.
const api = mockedApi as unknown as { post: ReturnType<typeof vi.fn> };
/**
* Build a working set of two active workspaces so the assertions can
* distinguish per-id behavior (drained vs undrained) within one test.
*/
const seedTwoActive = () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online", "online"], [2, 2]));
};
/**
* Drive an async useCallback handler to completion. Vitest's fake
* timers don't see microtasks unless we yield between advances; the
* helper interleaves `vi.advanceTimersByTimeAsync` with macrotask
* yields so pending fetch resolutions and setTimeout callbacks both
* settle before the assertion runs.
*/
const advanceUntilSettled = async (ms: number) => {
await vi.advanceTimersByTimeAsync(ms);
// One extra tick lets any chained .then() after a setTimeout
// resolution fire before the test moves on.
await Promise.resolve();
};
beforeEach(() => {
vi.useFakeTimers();
api.post.mockReset();
});
afterEach(() => {
vi.useRealTimers();
});
it("phase 1: issues tasks/cancel via /workspaces/:id/a2a BEFORE any /restart", async () => {
seedTwoActive();
// Hold both tasks/cancel responses open so the click handler is
// observably paused at phase 1. We don't actually need to resolve
// them for the order assertion — just inspect the call log.
let resolveCancels!: () => void;
const cancelGate = new Promise<void>((r) => { resolveCancels = r; });
api.post.mockImplementation(async (path: string) => {
if (path.endsWith("/a2a")) {
await cancelGate;
}
return undefined;
});
render(<Toolbar />);
const btn = screen.getByRole("button", { name: /stop all running tasks/i });
fireEvent.click(btn);
// Yield once so the click handler enters phase 1 and dispatches the
// two /a2a POSTs.
await Promise.resolve();
await Promise.resolve();
const a2aCalls = api.post.mock.calls.filter((c) => String(c[0]).endsWith("/a2a"));
const restartCalls = api.post.mock.calls.filter((c) => String(c[0]).endsWith("/restart"));
expect(a2aCalls.length).toBe(2);
expect(restartCalls.length).toBe(0);
// Each /a2a POST carries the canonical tasks/cancel envelope.
for (const call of a2aCalls) {
expect(call[1]).toEqual({ method: "tasks/cancel", params: {} });
}
// Release the gate so the test cleanup doesn't dangle.
resolveCancels();
await advanceUntilSettled(10_000);
});
it("phase 2: when activeTasks drains to 0 during the poll window, /restart is NOT called", async () => {
seedTwoActive();
api.post.mockResolvedValue(undefined);
render(<Toolbar />);
fireEvent.click(screen.getByRole("button", { name: /stop all running tasks/i }));
// Let phase 1 fire (the two tasks/cancel calls).
await Promise.resolve();
await Promise.resolve();
// Simulate the platform pushing TASK_UPDATED with active_tasks=0
// on both workspaces — emulate by mutating the store directly,
// which is what canvas-events.ts does in production.
defaultStore.nodes = toStoreNodes(makeNodes(["online", "online"], [0, 0]));
// Advance past the first poll interval (250ms) so the loop sees
// the drained store and exits early.
await advanceUntilSettled(400);
// Drain any remaining timers so the handler returns cleanly.
await advanceUntilSettled(10_000);
const restartCalls = api.post.mock.calls.filter((c) => String(c[0]).endsWith("/restart"));
expect(restartCalls.length).toBe(0);
});
it("phase 3: when activeTasks does NOT drain inside the timeout, falls through to /restart for each stuck workspace", async () => {
seedTwoActive();
api.post.mockResolvedValue(undefined);
render(<Toolbar />);
fireEvent.click(screen.getByRole("button", { name: /stop all running tasks/i }));
// Phase 1 dispatch.
await Promise.resolve();
await Promise.resolve();
// Do NOT drain — activeTasks stays at 2 for both. Advance past the
// 8000ms drain timeout plus a buffer so phase 3's /restart POSTs fire.
await advanceUntilSettled(9_000);
await advanceUntilSettled(1_000);
const a2aCalls = api.post.mock.calls.filter((c) => String(c[0]).endsWith("/a2a"));
const restartCalls = api.post.mock.calls.filter((c) => String(c[0]).endsWith("/restart"));
expect(a2aCalls.length).toBe(2);
expect(restartCalls.length).toBe(2);
// Order check: every /a2a call comes before every /restart call.
const lastA2AIdx = Math.max(
...api.post.mock.calls.map((c, i) => (String(c[0]).endsWith("/a2a") ? i : -1))
);
const firstRestartIdx = Math.min(
...api.post.mock.calls.map((c, i) => (String(c[0]).endsWith("/restart") ? i : Infinity))
);
expect(lastA2AIdx).toBeLessThan(firstRestartIdx);
});
it("phase 3 selective: drains only one of two workspaces — /restart is called only for the stuck one", async () => {
seedTwoActive();
api.post.mockResolvedValue(undefined);
render(<Toolbar />);
fireEvent.click(screen.getByRole("button", { name: /stop all running tasks/i }));
await Promise.resolve();
await Promise.resolve();
// ws-0 drains immediately, ws-1 stays stuck for the full timeout.
defaultStore.nodes = toStoreNodes(makeNodes(["online", "online"], [0, 2]));
await advanceUntilSettled(9_500);
const restartCalls = api.post.mock.calls.filter((c) => String(c[0]).endsWith("/restart"));
expect(restartCalls.length).toBe(1);
expect(restartCalls[0][0]).toBe("/workspaces/ws-1/restart");
});
});
@@ -0,0 +1,181 @@
'use client';
import { useCallback, useEffect, useState } from 'react';
import { api } from '@/lib/api';
import { fetchSession, type Session } from '@/lib/auth';
import { getTenantSlug } from '@/lib/tenant';
import { Spinner } from '@/components/Spinner';
/**
* Organization-identity surface inside SettingsPanel.
*
* Closes a chronic UX gap where users (and our own AI agents) had to
* call /cp/auth/me or /cp/orgs from browser devtools to read their
* org_id UUID. Now: a copy-buttoned view of name + slug + UUID for the
* currently-active org, plus a switcher list when the user belongs to
* multiple orgs.
*
* Data path:
* 1. fetchSession() → /cp/auth/me → current org_id
* 2. api.get('/cp/orgs') → list of all orgs the user belongs to
* 3. Match by id === session.org_id; fall back to host-slug match
* if the session probe loses the race.
*
* Read-only — this tab never mutates. Org creation/switching lives at
* /orgs (the post-signup landing page).
*/
interface Org {
id: string;
slug: string;
name: string;
status?: string;
}
// /cp/orgs may return a bare array or {orgs: []} — see orgs/page.tsx
// for the same defensive unwrap.
type OrgsResponse = Org[] | { orgs?: Org[] };
export function OrgInfoTab() {
const [orgs, setOrgs] = useState<Org[] | null>(null);
const [session, setSession] = useState<Session | null>(null);
const [error, setError] = useState<string | null>(null);
const [loading, setLoading] = useState(true);
useEffect(() => {
let cancelled = false;
(async () => {
try {
const [sess, body] = await Promise.all([
fetchSession().catch(() => null),
api.get<OrgsResponse>('/cp/orgs'),
]);
if (cancelled) return;
setSession(sess);
setOrgs(Array.isArray(body) ? body : body.orgs ?? []);
} catch (e) {
if (!cancelled) setError(e instanceof Error ? e.message : 'Failed to load org info');
} finally {
if (!cancelled) setLoading(false);
}
})();
return () => {
cancelled = true;
};
}, []);
const tenantSlug = getTenantSlug();
const currentOrg =
orgs?.find((o) => session && o.id === session.org_id) ??
orgs?.find((o) => tenantSlug && o.slug === tenantSlug) ??
null;
const otherOrgs = orgs?.filter((o) => o.id !== currentOrg?.id) ?? [];
if (loading) {
return (
<div
role="status"
aria-live="polite"
className="flex items-center justify-center gap-2 py-6 text-ink-mid text-xs"
>
<Spinner /> Loading organization
</div>
);
}
if (error) {
return (
<div className="p-4">
<div className="px-3 py-2 bg-red-950/40 border border-red-800/50 rounded-lg text-[10px] text-bad">
{error}
</div>
</div>
);
}
if (!currentOrg) {
return (
<div className="p-4">
<p className="text-xs text-ink-mid">
No organization found for this session. If this is unexpected, sign out and back in, or visit{' '}
<a href="/orgs" className="underline">/orgs</a>.
</p>
</div>
);
}
return (
<div className="p-4 space-y-4">
<div>
<h3 className="text-sm font-semibold text-ink mb-1">Current Organization</h3>
<p className="text-[10px] text-ink-mid leading-relaxed">
IDs you can paste into API calls, support tickets, or CLI arguments. The UUID never changes;
the slug is the URL subdomain.
</p>
</div>
<OrgIdentityCard org={currentOrg} highlighted />
{otherOrgs.length > 0 && (
<div className="space-y-2 pt-2">
<h4 className="text-[11px] font-semibold text-ink-mid uppercase tracking-wider">
Your other organizations ({otherOrgs.length})
</h4>
{otherOrgs.map((o) => (
<OrgIdentityCard key={o.id} org={o} />
))}
</div>
)}
</div>
);
}
function OrgIdentityCard({ org, highlighted }: { org: Org; highlighted?: boolean }) {
return (
<div
className={`rounded-lg border p-3 space-y-2 ${
highlighted ? 'border-accent/40 bg-accent-strong/5' : 'border-line/40 bg-surface-card/40'
}`}
data-testid={`org-card-${org.slug}`}
>
<div className="flex items-baseline justify-between gap-2">
<span className="text-[12px] font-medium text-ink truncate">{org.name}</span>
{org.status && (
<span className="text-[9px] text-ink-mid uppercase tracking-wider shrink-0">{org.status}</span>
)}
</div>
<IdentityRow label="Slug" value={org.slug} />
<IdentityRow label="UUID" value={org.id} mono />
</div>
);
}
function IdentityRow({ label, value, mono }: { label: string; value: string; mono?: boolean }) {
const [copied, setCopied] = useState(false);
const onCopy = useCallback(() => {
// Best-effort: jsdom + old Safari throw synchronously on writeText.
try {
navigator.clipboard.writeText(value);
} catch {
/* user can still triple-click select */
}
setCopied(true);
setTimeout(() => setCopied(false), 2000);
}, [value]);
return (
<div className="flex items-center gap-2">
<span className="text-[10px] text-ink-mid w-10 shrink-0">{label}</span>
<code
className={`flex-1 text-[11px] text-ink bg-surface-sunken/60 px-2 py-1 rounded select-all break-all ${
mono ? 'font-mono' : ''
}`}
>
{value}
</code>
<button
type="button"
onClick={onCopy}
aria-label={`Copy ${label}`}
className="shrink-0 px-2 py-1 bg-surface-card/60 hover:bg-surface-card border border-line/40 rounded text-[10px] text-ink-mid hover:text-ink transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
{copied ? 'Copied' : 'Copy'}
</button>
</div>
);
}
@@ -8,6 +8,7 @@ import { useKeyboardShortcut } from '@/hooks/use-keyboard-shortcut';
import { SecretsTab } from './SecretsTab';
import { TokensTab } from './TokensTab';
import { OrgTokensTab } from './OrgTokensTab';
import { OrgInfoTab } from './OrgInfoTab';
import { UnsavedChangesGuard } from './UnsavedChangesGuard';
/** Module-level ref so TopBar's SettingsButton can receive focus back on close. */
@@ -116,6 +117,9 @@ export function SettingsPanel({ workspaceId }: SettingsPanelProps) {
<Tabs.Trigger value="org-tokens" className="settings-panel__tab">
Org API Keys
</Tabs.Trigger>
<Tabs.Trigger value="org-info" className="settings-panel__tab">
Organization
</Tabs.Trigger>
</Tabs.List>
<Tabs.Content value="api-keys" className="settings-panel__content">
@@ -129,6 +133,10 @@ export function SettingsPanel({ workspaceId }: SettingsPanelProps) {
<Tabs.Content value="org-tokens" className="settings-panel__content">
<OrgTokensTab />
</Tabs.Content>
<Tabs.Content value="org-info" className="settings-panel__content">
<OrgInfoTab />
</Tabs.Content>
</Tabs.Root>
<div className="settings-panel__footer">
@@ -0,0 +1,207 @@
// @vitest-environment jsdom
/**
* Tests for OrgInfoTab — surfaces current org name/slug/UUID with copy
* buttons, plus a list of the user's other orgs when applicable.
*
* Covers (≥3 cases per the closing-the-UX-gap brief):
* - Loading state (spinner + aria-live)
* - Renders current org matched by session.org_id, with UUID + slug + name
* - Copy button writes the UUID to navigator.clipboard
* - Falls back to host-slug match when session lookup fails
* - Lists other orgs when user belongs to multiple
* - Error banner when /cp/orgs throws
* - Empty/no-match state renders the recovery hint, not a crash
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act, waitFor } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { OrgInfoTab } from "../OrgInfoTab";
const mockGet = vi.fn();
const mockFetchSession = vi.fn();
const mockGetTenantSlug = vi.fn();
vi.mock("@/lib/api", () => ({
api: { get: (...args: unknown[]) => mockGet(...args) },
}));
vi.mock("@/lib/auth", () => ({
fetchSession: (...args: unknown[]) => mockFetchSession(...args),
}));
vi.mock("@/lib/tenant", () => ({
getTenantSlug: (...args: unknown[]) => mockGetTenantSlug(...args),
}));
// Stub clipboard
vi.stubGlobal("navigator", {
clipboard: { writeText: vi.fn().mockResolvedValue(undefined) },
});
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
mockFetchSession.mockReset();
mockGetTenantSlug.mockReset();
mockGetTenantSlug.mockReturnValue("");
vi.mocked(navigator.clipboard.writeText).mockReset();
});
afterEach(() => {
cleanup();
});
async function flush() {
await act(async () => {
await Promise.resolve();
await Promise.resolve();
});
}
const AGENTS_TEAM = {
id: "2355b568-0799-4cc7-9e7f-806747f9958c",
slug: "agents-team",
name: "Agents Team",
status: "running",
};
const OTHER_ORG = {
id: "11111111-1111-4111-8111-111111111111",
slug: "skunkworks",
name: "Skunkworks",
status: "running",
};
// ─── Loading ─────────────────────────────────────────────────────────────────
describe("OrgInfoTab — loading", () => {
it("shows spinner while fetching", () => {
mockGet.mockImplementation(() => new Promise(() => {}));
mockFetchSession.mockImplementation(() => new Promise(() => {}));
render(<OrgInfoTab />);
const status = screen.getByRole("status");
expect(status).toBeTruthy();
expect(status.getAttribute("aria-live")).toBe("polite");
expect(status.textContent).toContain("Loading organization");
});
});
// ─── Current org renders + copy ──────────────────────────────────────────────
describe("OrgInfoTab — current org", () => {
it("renders the org matched by session.org_id with name, slug, UUID", async () => {
mockFetchSession.mockResolvedValue({
user_id: "u-1",
org_id: AGENTS_TEAM.id,
email: "hongming@moleculesai.app",
});
mockGet.mockResolvedValue([AGENTS_TEAM, OTHER_ORG]);
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText("Current Organization"));
// Name shown
expect(screen.getByText("Agents Team")).toBeTruthy();
// Slug shown
expect(screen.getByText("agents-team")).toBeTruthy();
// UUID shown
expect(screen.getByText(AGENTS_TEAM.id)).toBeTruthy();
});
it("copy-UUID button writes the UUID to navigator.clipboard", async () => {
mockFetchSession.mockResolvedValue({
user_id: "u-1",
org_id: AGENTS_TEAM.id,
email: "hongming@moleculesai.app",
});
mockGet.mockResolvedValue([AGENTS_TEAM]);
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText(AGENTS_TEAM.id));
const copyUuid = screen.getByRole("button", { name: /Copy UUID/i });
fireEvent.click(copyUuid);
expect(navigator.clipboard.writeText).toHaveBeenCalledWith(AGENTS_TEAM.id);
// Optimistic "Copied" label flip
await waitFor(() =>
expect(
screen.getByRole("button", { name: /Copy UUID/i }).textContent,
).toContain("Copied"),
);
});
it("copy-Slug button writes the slug to navigator.clipboard", async () => {
mockFetchSession.mockResolvedValue({
user_id: "u-1",
org_id: AGENTS_TEAM.id,
email: "hongming@moleculesai.app",
});
mockGet.mockResolvedValue([AGENTS_TEAM]);
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText(AGENTS_TEAM.slug));
fireEvent.click(screen.getByRole("button", { name: /Copy Slug/i }));
expect(navigator.clipboard.writeText).toHaveBeenCalledWith(AGENTS_TEAM.slug);
});
});
// ─── Fallback: host-slug match when session fails ────────────────────────────
describe("OrgInfoTab — fallbacks", () => {
it("falls back to host-slug match when fetchSession rejects", async () => {
mockFetchSession.mockRejectedValue(new Error("session probe failed"));
mockGetTenantSlug.mockReturnValue("agents-team");
mockGet.mockResolvedValue({ orgs: [AGENTS_TEAM, OTHER_ORG] }); // wrapped shape
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText("Current Organization"));
expect(screen.getByText("Agents Team")).toBeTruthy();
expect(screen.getByText(AGENTS_TEAM.id)).toBeTruthy();
});
it("lists other orgs the user belongs to under a separate header", async () => {
mockFetchSession.mockResolvedValue({
user_id: "u-1",
org_id: AGENTS_TEAM.id,
email: "hongming@moleculesai.app",
});
mockGet.mockResolvedValue([AGENTS_TEAM, OTHER_ORG]);
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText(/Your other organizations/));
expect(screen.getByText("Skunkworks")).toBeTruthy();
expect(screen.getByText(OTHER_ORG.id)).toBeTruthy();
});
});
// ─── Error + empty handling ──────────────────────────────────────────────────
describe("OrgInfoTab — error + empty", () => {
it("renders an error banner when /cp/orgs throws", async () => {
mockFetchSession.mockResolvedValue(null);
mockGet.mockRejectedValue(new Error("API GET /cp/orgs: 500 boom"));
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText(/500 boom/));
expect(screen.queryByText("Current Organization")).toBeNull();
});
it("renders the recovery hint when no org matches (no crash)", async () => {
mockFetchSession.mockResolvedValue(null);
mockGetTenantSlug.mockReturnValue("");
mockGet.mockResolvedValue([]);
render(<OrgInfoTab />);
await flush();
await waitFor(() =>
screen.getByText(/No organization found for this session/),
);
});
});
+1
View File
@@ -8,3 +8,4 @@ export { SearchBar } from './SearchBar';
export { EmptyState } from './EmptyState';
export { DeleteConfirmDialog } from './DeleteConfirmDialog';
export { UnsavedChangesGuard } from './UnsavedChangesGuard';
export { OrgInfoTab } from './OrgInfoTab';
@@ -649,7 +649,17 @@ function WaitingBubbles({ visible }: { visible: CommMessage[] }) {
if (!prev || m.timestamp > prev.timestamp) tailByPeer.set(m.peerId, m);
}
const waitingPeers = Array.from(tailByPeer.values()).filter(
(m) => m.flow === "out" && (m.status === "pending" || m.status === "queued"),
// Task #227 — also light the indicator for status="dispatched": that's
// the platform's marker for a poll-mode delegation that's been
// recorded into the peer's inbox but not yet picked up. Without this
// arm, external/MCP peer threads showed an outbound bubble and then
// dead silence until the eventual reply landed — no parity with the
// native push-path "pending" indicator.
(m) =>
m.flow === "out" &&
(m.status === "pending" ||
m.status === "queued" ||
m.status === "dispatched"),
);
if (waitingPeers.length === 0) return null;
return (
@@ -688,7 +698,9 @@ function WaitingBubbles({ visible }: { visible: CommMessage[] }) {
<span className="text-[10px]">
{m.status === "queued"
? `${m.peerName} is busy — reply will arrive when they're free`
: `Waiting for ${m.peerName}`}
: m.status === "dispatched"
? `Queued — ${m.peerName} will pick up on next poll`
: `Waiting for ${m.peerName}`}
</span>
</span>
</div>
@@ -41,6 +41,19 @@ describe("inferA2AErrorHint", () => {
expect(inferA2AErrorHint("RuntimeException in tool call")).toMatch(/runtime threw an exception/);
});
it("points at the Activity tab (the real in-product logs surface), not 'workspace/container logs' (internal#212)", () => {
// Pre-#212 these hints sent users to "workspace logs" / "container
// logs" — neither has a UI affordance in the canvas. Activity tab
// is the in-product surface where the full row lives. Lock the
// copy so a future refactor cannot re-introduce the dangling
// pointer.
expect(inferA2AErrorHint("Agent error: boom")).toMatch(/Activity tab/);
expect(inferA2AErrorHint("some completely novel error nobody has matched yet")).toMatch(/Activity tab/);
// And the two strings together must not regress to the old text.
expect(inferA2AErrorHint("Agent error: boom")).not.toMatch(/container logs/);
expect(inferA2AErrorHint("some novel error")).not.toMatch(/workspace logs/);
});
it("recognises peer-unreachable cases (Activity-tab originals)", () => {
expect(inferA2AErrorHint("workspace not found")).toMatch(/can't be reached/);
expect(inferA2AErrorHint("not accessible")).toMatch(/can't be reached/);
@@ -53,7 +66,8 @@ describe("inferA2AErrorHint", () => {
it("returns a generic fallback for unrecognised text", () => {
const hint = inferA2AErrorHint("some completely novel error nobody has matched yet");
expect(hint).toMatch(/Check the workspace logs|delivery failure/);
// Fallback now sends the user to the Activity tab (post-#212).
expect(hint).toMatch(/Activity tab|delivery failure/);
});
it("Claude SDK wedge wins over the more general timeout pattern", () => {
@@ -38,7 +38,11 @@ export function inferA2AErrorHint(detail: string): string {
return "The connection to the remote agent dropped before a reply arrived. Usually a transient network blip — retry once. If it repeats, the remote container may have crashed mid-request; check its logs.";
}
if (t.includes("agent error") || t.includes("exception")) {
return "The remote agent's runtime threw an exception. Check the workspace's container logs for the traceback. Restart usually clears transient runtime crashes.";
// internal#212 closeout: end users have no "container logs" surface
// in the canvas; the Activity tab IS the user-visible logs surface
// (full row carries request/response body + error_detail). Point
// there so the hint is actionable from inside the product.
return "The remote agent's runtime threw an exception. Open the Activity tab for the full row (request body, response, error_detail) — Restart usually clears transient runtime crashes.";
}
if (
t.includes("not found") ||
@@ -50,5 +54,9 @@ export function inferA2AErrorHint(detail: string): string {
if (detail === "") {
return "The remote agent returned no error detail (the underlying httpx exception had an empty message — typically a connection-reset or silent timeout). A workspace restart is the safe first move.";
}
return "The remote agent reported a delivery failure. Check the workspace logs or try restarting.";
// internal#212 closeout: "workspace logs" pointed at a tab that does
// not exist — Activity tab is the in-product logs surface. Keep the
// hint generic enough for the unrecognised-detail fallback but point
// the user at a real affordance.
return "The remote agent reported a delivery failure. Open the Activity tab for the full row, or try restarting the workspace.";
}
@@ -0,0 +1,178 @@
// @vitest-environment jsdom
//
// Task #227 — external/MCP workspace progress UX parity.
//
// ws-server's `proxyA2ARequest` poll-mode short-circuit
// (workspace-server/internal/handlers/a2a_proxy.go:402-432) returns a
// synthetic `{status:"queued", delivery_mode:"poll", method:"message/send"}`
// HTTP 200 within ~50ms when the target workspace is registered with
// `delivery_mode=poll` — i.e. an operator's laptop running
// `molecule-mcp-claude-channel`, a hermes/codex MCP bridge, or a Cursor
// MCP client. The real agent reply arrives separately via the
// AGENT_MESSAGE WebSocket event after the agent's next
// `wait_for_message` poll (could be 1s, could be 60s).
//
// Pre-#227 behaviour: useChatSend treated the queued-200 as a successful
// round-trip — extractReplyText returned "", no agent bubble was
// created, `releaseSendGuards` flipped `sending` off, and the user saw
// dead silence between their user bubble and the eventual reply with
// NO progress indicator. That's the user-reported gap this task fixes.
//
// These tests pin the new behaviour: on a queued-200, the hook MUST NOT
// call onAgentMessage (no empty bubble) AND MUST NOT call
// releaseSendGuards (spinner persists). The eventual AGENT_MESSAGE WS
// event is what clears the spinner — that path is covered by
// useChatSocket.test.tsx already.
import { describe, it, expect, vi, beforeEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
// Capture the api.post invocations + control responses per-test.
const apiPostMock = vi.fn<
(url: string, body?: unknown, opts?: unknown) => Promise<unknown>
>();
vi.mock("@/lib/api", () => ({
api: {
post: (url: string, body?: unknown, opts?: unknown) =>
apiPostMock(url, body, opts),
get: vi.fn(),
},
}));
// uploads — tests don't go through the upload path; stub the helpers
// useChatSend imports so the module loads.
vi.mock("../../uploads", () => ({
uploadChatFiles: vi.fn(),
FileTooLargeError: class FileTooLargeError extends Error {},
}));
// types — re-export the createMessage helper unchanged; only the
// uploads stub matters above.
import { useChatSend } from "../useChatSend";
beforeEach(() => {
apiPostMock.mockReset();
});
describe("useChatSend — poll-mode (external/MCP) queued-200 handling — task #227", () => {
it("does NOT call onAgentMessage when the synthetic {status:'queued'} response lands (no empty bubble)", async () => {
// Mock the platform's poll-mode short-circuit response shape exactly
// as ws-server's `proxyA2ARequest` returns it (a2a_proxy.go:420-431).
apiPostMock.mockResolvedValueOnce({
status: "queued",
delivery_mode: "poll",
method: "message/send",
});
const onUserMessage = vi.fn();
const onAgentMessage = vi.fn();
const { result } = renderHook(() =>
useChatSend("ws-poll-target", {
getHistoryMessages: () => [],
onUserMessage,
onAgentMessage,
}),
);
await act(async () => {
await result.current.sendMessage("hello external workspace");
// Yield one microtask so the .then runs.
await Promise.resolve();
});
// User bubble fires — the user typed, that part is unconditional.
expect(onUserMessage).toHaveBeenCalledTimes(1);
// CRITICAL: no agent bubble. extractReplyText on a queued envelope
// returns "" — the pre-#227 code would still have hit the
// "releaseSendGuards + no bubble" path, BUT it would have ended
// `sending`. The new code returns early BEFORE that release, so the
// contract under test is "no synthesised empty bubble".
expect(onAgentMessage).not.toHaveBeenCalled();
});
it("keeps `sending` true after a queued-200 — the spinner must persist until the real AGENT_MESSAGE arrives", async () => {
apiPostMock.mockResolvedValueOnce({
status: "queued",
delivery_mode: "poll",
method: "message/send",
});
const { result } = renderHook(() =>
useChatSend("ws-poll-target", {
getHistoryMessages: () => [],
}),
);
await act(async () => {
await result.current.sendMessage("waiting for the operator laptop");
await Promise.resolve();
});
// The spinner-driving state is `sending`. On a queued-200, it must
// remain true — clearing it here is the exact bug task #227
// resurfaces (collapsing the spinner before the agent has even seen
// the message).
expect(result.current.sending).toBe(true);
});
it("ALSO keeps `sending` true even after a follow-up microtask flush — guards against an accidental late release", async () => {
// Defense: ensure no chained .then / .finally accidentally calls
// releaseSendGuards on the queued path. Run several microtask
// ticks and re-assert.
apiPostMock.mockResolvedValueOnce({
status: "queued",
delivery_mode: "poll",
});
const { result } = renderHook(() =>
useChatSend("ws-poll-target", {
getHistoryMessages: () => [],
}),
);
await act(async () => {
await result.current.sendMessage("late-release-guard");
// Flush multiple microtask ticks.
await Promise.resolve();
await Promise.resolve();
await Promise.resolve();
});
expect(result.current.sending).toBe(true);
});
it("push-mode (real reply parts) still flips sending=false + creates an agent bubble — non-regression for the default path", async () => {
// Sanity-check the push path still works: a real reply must call
// onAgentMessage and flip sending=false. Without this assertion an
// overzealous "return early on any non-result body" would silently
// break the dominant push-mode path.
apiPostMock.mockResolvedValueOnce({
result: {
parts: [{ kind: "text", text: "hi from native workspace" }],
},
});
const onAgentMessage = vi.fn();
const { result } = renderHook(() =>
useChatSend("ws-native-push", {
getHistoryMessages: () => [],
onAgentMessage,
}),
);
await act(async () => {
await result.current.sendMessage("native push test");
await Promise.resolve();
});
expect(onAgentMessage).toHaveBeenCalledTimes(1);
const msg = onAgentMessage.mock.calls[0][0] as {
role: string;
content: string;
};
expect(msg.role).toBe("agent");
expect(msg.content).toBe("hi from native workspace");
expect(result.current.sending).toBe(false);
});
});
@@ -116,6 +116,77 @@ describe("useChatSocket — surface error_detail to onSendError (internal#212)",
expect(reason.length).toBeGreaterThan(0);
});
// Task #227 — external/MCP (poll-mode) workspace progress UX.
//
// ws-server's `proxyA2ARequest` poll-mode short-circuit fires the
// ACTIVITY_LOGGED a2a_receive with status="ok" and NO duration_ms (no
// reply yet — the request is queued for the agent's next poll). Before
// task #227 the (status==="ok" && durationMs) guard silently dropped
// this row, so the chat UI had ZERO progress signal between "user
// typed" and "agent eventually polled and replied". Lock the queued
// line in so future refactors don't regress to the silent-drop state.
it("emits a 'queued — will pick up on next poll' activity line when a2a_receive status=ok has no duration_ms (poll-mode)", () => {
const onActivityLog = vi.fn();
renderHook(() =>
useChatSocket("ws-self", {
onActivityLog,
}),
);
expect(capturedHandler).not.toBeNull();
act(() => {
capturedHandler!({
event: "ACTIVITY_LOGGED",
workspace_id: "ws-self",
payload: {
activity_type: "a2a_receive",
method: "message/send",
status: "ok",
target_id: "ws-self",
// No duration_ms — this is the queued-for-poll signal.
},
timestamp: "2026-05-20T00:00:00Z",
});
});
expect(onActivityLog).toHaveBeenCalledTimes(1);
const line = onActivityLog.mock.calls[0][0] as string;
// The line MUST be present (not the empty-string silent-drop pattern)
// and MUST mention the queued state so the user has actionable signal.
expect(line.length).toBeGreaterThan(0);
expect(line.toLowerCase()).toMatch(/queued|poll/);
});
// Pair with the above: poll-mode acknowledgement must NOT prematurely
// call onSendComplete — the spinner has to stay up until the actual
// AGENT_MESSAGE reply lands. (The reply-success path with duration_ms
// still calls onSendComplete; that's the push-mode case.)
it("does NOT call onSendComplete on a poll-mode queued a2a_receive (spinner must persist)", () => {
const onSendComplete = vi.fn();
renderHook(() =>
useChatSocket("ws-self", {
onSendComplete,
}),
);
act(() => {
capturedHandler!({
event: "ACTIVITY_LOGGED",
workspace_id: "ws-self",
payload: {
activity_type: "a2a_receive",
method: "message/send",
status: "ok",
target_id: "ws-self",
// No duration_ms.
},
timestamp: "2026-05-20T00:00:00Z",
});
});
expect(onSendComplete).not.toHaveBeenCalled();
});
it("ignores errors targeted at a different workspace's peer", () => {
// Defense against a race where the WS hub fans out to all clients —
// each chat panel must only react when target_id matches its own
@@ -22,6 +22,28 @@ interface A2AResponse {
parts?: A2APart[];
artifacts?: Array<{ parts: A2APart[] }>;
};
/** Set by ws-server's poll-mode short-circuit in `proxyA2ARequest`
* (a2a_proxy.go:416-431) when the target workspace is registered as
* `delivery_mode=poll` — e.g. an operator's laptop running
* `molecule-mcp-claude-channel`, a hermes/codex MCP bridge, or a
* Cursor MCP client. The HTTP 200 carries the synthetic envelope
* `{status:"queued", delivery_mode:"poll", method:"message/send"}`
* immediately (~50ms), BEFORE the agent has produced a reply.
*
* Task #227 routing: when this field is "queued" the caller must NOT
* treat the 200 as "agent done" — there are no `result.parts` yet
* (the reply will arrive separately via the AGENT_MESSAGE WS event
* after the agent's next poll). Keep the spinner up; the eventual
* AGENT_MESSAGE flips `sending` off via the existing useChatSocket
* `onSendComplete` path. Without this distinction the spinner
* disappeared immediately and external/MCP workspaces had no progress
* UX between send and reply. */
status?: string;
/** Companion to `status` — "poll" when the queued short-circuit fired.
* Defensive: we key the poll-mode-skip decision on status==="queued"
* (the canonical signal) rather than on this field, but it's surfaced
* here so future debugging / tests can assert on the full envelope. */
delivery_mode?: string;
}
export function extractReplyText(resp: A2AResponse): string {
@@ -195,6 +217,30 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
sendInFlightRef.current = false;
return;
}
// Task #227 — poll-mode (external/MCP workspace) queued-200
// short-circuit. ws-server's `proxyA2ARequest` returns
// `{status:"queued", delivery_mode:"poll", ...}` immediately
// when the target has no URL (delivery_mode=poll), BEFORE the
// agent has produced any reply. There is no `result.parts`
// payload here — the actual reply will arrive separately via
// the AGENT_MESSAGE WebSocket event after the agent's next
// `wait_for_message` poll.
//
// Keep the spinner up by deliberately NOT calling
// releaseSendGuards: the user-facing "thinking" state must
// persist until the AGENT_MESSAGE lands (handled by the
// useChatSocket `onAgentMessage`/`onSendComplete` path) or an
// explicit error fires (`onSendError` from an ACTIVITY_LOGGED
// status="error"). Don't synthesise an empty agent bubble.
//
// sendInFlightRef stays true intentionally — it's the dedup
// guard for the user typing two messages back-to-back; for
// poll mode the second message would race the first agent's
// reply, so blocking is correct (matches push-mode behaviour
// where `sending` blocks the textarea).
if (resp?.status === "queued") {
return;
}
const replyText = extractReplyText(resp);
const replyFiles = extractFilesFromTask(
(resp?.result ?? {}) as Record<string, unknown>,
@@ -62,6 +62,25 @@ export function useChatSocket(
line = `${targetName} responded (${sec}s)`;
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) callbacksRef.current.onSendComplete?.();
} else if (status === "ok" && !durationMs) {
// Task #227 — poll-mode (external/MCP workspace) queued receipt.
// ws-server `logA2AReceiveQueued` writes a "received but no
// reply yet" row with status="ok" and NO duration_ms, then
// immediately returns the synthetic {status:"queued"} 200 to
// the caller. Before this branch the row was silently dropped
// by the (status==="ok" && durationMs) guard above — leaving
// the chat UI with zero progress signal for the entire window
// between "user typed" and "agent eventually polled and
// replied". Surface the queued state explicitly so the user
// sees acknowledgement (matches the queued-delegation
// indicator in AgentCommsPanel.WaitingBubbles).
//
// We intentionally do NOT call onSendComplete here: the
// outbound is not done — only acknowledged. The MyChatPanel
// spinner stays up until the actual AGENT_MESSAGE reply lands
// (poll path) or an explicit error fires (which still hits
// the status==="error" branch below).
line = `${targetName} queued — agent will pick up on next poll`;
} else if (status === "error") {
line = `${targetName} error`;
const own = (targetId || msg.workspace_id) === workspaceId;
+3
View File
@@ -523,6 +523,9 @@ export function buildNodesAndEdges(
// that don't yet include these columns in the GET response.
broadcastEnabled: ws.broadcast_enabled ?? false,
talkToUserEnabled: ws.talk_to_user_enabled ?? true,
// A2A delivery mode (task #227). Absent on older ws-server builds
// — leave undefined so the chat UI's "?? 'push'" fallback applies.
deliveryMode: ws.delivery_mode,
},
};
if (hasParent) {
+22
View File
@@ -106,6 +106,28 @@ export interface WorkspaceNodeData extends Record<string, unknown> {
* send_message_to_user / POST /notify return 403 and the canvas
* shows a "not enabled" state with a button to re-enable. Default true. */
talkToUserEnabled?: boolean;
/** A2A inbound delivery mode for this workspace — "push" (default —
* synchronous HTTP dispatch by ws-server `proxyA2ARequest`) or "poll"
* (workspace has no URL; ws-server logs the request and the agent
* consumes it via `wait_for_message` / GET /activity?since_id=).
*
* Why surfaced to the UI: poll-mode targets (external/MCP workspaces:
* `molecule-mcp-claude-channel` on an operator laptop, hermes/codex
* bridge clients, Cursor MCP) acknowledge a canvas `message/send` with
* a synthetic `{status:"queued"}` 200 within ~50ms. Without this flag
* the chat UI cannot tell that gap from a real round-trip — the
* spinner disappears immediately and the user sees dead silence until
* the agent eventually polls and replies via the AGENT_MESSAGE WS
* event (could be seconds, could be minutes). Task #227 — render a
* "queued — agent will pick up on next poll" state for poll-mode
* sends so external/MCP workspaces have progress UX parity with
* native runtimes (claude-code / codex / hermes / openclaw).
*
* Sourced from the GET /workspaces response (`delivery_mode` snake_case
* field, mapped here in canvas-topology.ts). Absent on older platform
* builds — that fallthrough is treated as "push" to match
* ws-server's `lookupDeliveryMode` default. */
deliveryMode?: string;
}
export type PanelTab = "details" | "skills" | "chat" | "terminal" | "config" | "schedule" | "channels" | "files" | "memory" | "traces" | "events" | "activity" | "audit";
+10
View File
@@ -342,6 +342,16 @@ export interface WorkspaceData {
/** Workspace ability flags (migration 20260514). */
broadcast_enabled?: boolean;
talk_to_user_enabled?: boolean;
/** A2A delivery mode for inbound messages — "push" (default, synchronous
* HTTP dispatch to `url`) or "poll" (queued to activity_logs, agent
* picks up via `wait_for_message` / GET /activity?since_id=). Surfaced
* in the GET /workspaces response since #2339 PR 1; older platform
* versions return it absent so the canvas treats absent as "push" (the
* documented default in `lookupDeliveryMode`). Used by the chat UI to
* render an "agent will pick up on next poll" indicator instead of
* collapsing the spinner the moment the synchronous queued-200 returns
* (task #227 — external/MCP workspaces had no progress UX). */
delivery_mode?: string;
}
let socket: ReconnectingSocket | null = null;
+1 -1
View File
@@ -17,7 +17,7 @@ Canvas (Next.js :3000) ←WebSocket→ Platform (Go :8080) ←HTTP→ Postgres +
- **Workspace Server** (`workspace-server/`): Go/Gin control plane — workspace CRUD, registry, discovery, WebSocket hub, liveness monitoring.
- **Canvas** (`canvas/`): Next.js 15 + React Flow (@xyflow/react v12) + Zustand + Tailwind — visual workspace graph.
- **Workspace Runtime** (`workspace/`): Shared runtime published as [`molecule-ai-workspace-runtime`](https://pypi.org/project/molecule-ai-workspace-runtime/) on PyPI. Supports LangGraph, Claude Code, OpenClaw, DeepAgents, CrewAI, AutoGen. Each adapter lives in its own standalone template repo (e.g. `molecule-ai-workspace-template-claude-code`). See `docs/workspace-runtime-package.md` for the full picture.
- **Workspace Runtime**: Shared runtime published from [`molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime) to the Molecule AI Gitea package registry. Supports LangGraph, Claude Code, OpenClaw, Hermes, Codex, and AutoGen. Each adapter lives in its own standalone template repo (e.g. `molecule-ai-workspace-template-claude-code`). See `docs/workspace-runtime-package.md` for the full picture.
- **molecli** (`workspace-server/cmd/cli/`): Go TUI dashboard (Bubbletea + Lipgloss) — real-time workspace monitoring, event log, health overview, delete/filter operations.
## Key Architectural Patterns
@@ -285,6 +285,39 @@ Canvas requests (no `X-Workspace-ID` header) and system callers
---
## Multiple Workspaces From One Local MCP Bridge
The standalone runtime package includes `molecule-mcp`, a local MCP bridge for
external agents such as Claude Code, Codex, Hermes, and other tools that run
outside the platform container fleet. One local bridge can serve multiple
external workspaces by setting `MOLECULE_WORKSPACES`:
```json
[
{
"id": "workspace-id-local-to-hongming-org",
"token": "...",
"platform_url": "https://hongming.moleculesai.app"
},
{
"id": "different-workspace-id-local-to-agents-team-org",
"token": "...",
"platform_url": "https://agents-team.moleculesai.app"
}
]
```
`platform_url` is the tenant routing key. The bridge registers, heartbeats,
polls inboxes, and sends outbound A2A calls against the URL attached to the
workspace that is doing the work.
Do not add `org_id` to this config. The tenant already comes from
`platform_url`, and the bearer token is issued by that tenant. Workspace IDs
also do not need to be shared across orgs; each tenant can return its own
workspace ID and token for the same local agent process.
---
## Canvas Appearance
External workspaces appear on the canvas with a purple **REMOTE** badge
@@ -135,6 +135,33 @@ The `id` field is your workspace ID — remember it.
---
## Optional — one local MCP bridge, multiple tenants
If your local agent runtime uses `molecule-mcp`, one process can serve more
than one external workspace:
```bash
export MOLECULE_WORKSPACES='[
{
"id": "workspace-id-local-to-you-org",
"token": "...",
"platform_url": "https://you.moleculesai.app"
},
{
"id": "different-workspace-id-local-to-team-org",
"token": "...",
"platform_url": "https://team.moleculesai.app"
}
]'
molecule-mcp
```
Use the workspace ID and token returned by each tenant. The IDs may differ
across orgs. `org_id` is not required here because `platform_url` selects the
tenant and the token is tenant-scoped.
---
## Step 4 — Chat with it
1. Open your Molecule canvas at `https://<TENANT>`
+27
View File
@@ -125,6 +125,33 @@ The agent appears on the canvas with a **purple REMOTE badge** within seconds. F
---
## Multi-Tenant Local MCP Bridge
For local MCP-driven agents, use the standalone runtime's `molecule-mcp`
entrypoint. A single local bridge can serve multiple external workspaces by
setting `MOLECULE_WORKSPACES`:
```json
[
{
"id": "workspace-id-local-to-acme",
"token": "...",
"platform_url": "https://acme.moleculesai.app"
},
{
"id": "different-workspace-id-local-to-ops",
"token": "...",
"platform_url": "https://ops.moleculesai.app"
}
]
```
`platform_url` selects the tenant for registration, heartbeat, inbox polling,
and outbound A2A routing. `org_id` is not required in this config, and the
workspace IDs do not need to match across tenants.
---
## What Phase 30 Covers
| Phase | What shipped | Endpoint |
+28 -288
View File
@@ -1,304 +1,44 @@
# Workspace Runtime PyPI Package
# Workspace Runtime Package
## Requires Python >= 3.11
`molecule-ai-workspace-runtime` is the shared Python runtime consumed by
workspace template images and by external MCP integrations.
The wheel pins `requires_python>=3.11`. On Python 3.10 or older, `pip install
molecule-ai-workspace-runtime` fails with `Could not find a version that
satisfies the requirement (from versions: none)` — the pin filters the only
available artifact before pip even attempts install. Upgrade the interpreter
(`brew install python@3.12` / `apt install python3.12` / etc.) or use a
3.11+ venv.
## Source Of Truth
## Overview
The source of truth is the standalone Gitea repo:
The shared workspace runtime infrastructure has **one editable source** and
**one published artifact**:
1. **Source of truth (monorepo, editable):** `workspace/` — every runtime
change lands here. Edit it like any other monorepo code.
2. **Published artifact (PyPI, generated):** [`molecule-ai-workspace-runtime`](https://pypi.org/project/molecule-ai-workspace-runtime/)
— produced by `.github/workflows/publish-runtime.yml` on every
`runtime-vX.Y.Z` tag push. Do NOT edit this independently — it gets
overwritten on every publish.
The legacy sibling repo `molecule-ai-workspace-runtime` (the GitHub repo, as
distinct from the PyPI package) is no longer the source-of-truth and should
be treated as a publish artifact only. It can be archived or used as a
read-only mirror.
## Where to make changes
**All runtime edits land in `molecule-monorepo/workspace/`. Period.**
The GitHub repo `Molecule-AI/molecule-ai-workspace-runtime` is **mirror-only**.
It exists so external consumers (template repos, downstream operators) have a
git-cloneable artifact that mirrors the PyPI wheel — nothing more.
- **Direct PRs against `molecule-ai-workspace-runtime` are auto-rejected by
the `mirror-guard` CI check.** The check fails any push that did not come
from the publish pipeline. There is no opt-out — file the change against
`molecule-monorepo/workspace/` instead.
- **The mirror + the PyPI wheel both auto-regenerate on every push to
`staging`** via `.github/workflows/publish-runtime.yml` (which calls
`scripts/build_runtime_package.py`, builds wheel + sdist, smoke-imports,
uploads to PyPI via Trusted Publisher, and force-pushes the rewritten tree
to the mirror repo). You never touch the mirror by hand.
If you have an old local clone of the mirror and try to push a fix to it
directly, expect a CI failure with a message pointing you here. Re-open the
change against `molecule-monorepo/workspace/` and let the publish workflow
do the rest.
## Why this shape
The 8 workspace template repos (claude-code, langgraph, hermes, etc.) each
build their own Docker image and `pip install molecule-ai-workspace-runtime`
from PyPI. PyPI is the right distribution channel — semver, reproducible
builds, no submodule dance per-repo. But the runtime ALSO needs to evolve
in lock-step with the platform's wire protocol (queue shape, A2A metadata,
event payloads). Shipping cross-cutting protocol changes as separate
runtime + platform PRs in two repos creates ordering pain and broken
intermediate states.
The monorepo + auto-publish split gives both: edit cross-cutting changes
in one PR, publish the runtime artifact via a tag.
## What's in the package
Everything in `workspace/*.py` plus the `adapters/`, `builtin_tools/`,
`plugins_registry/`, `policies/`, `skill_loader/` subpackages. Build
artifacts (`Dockerfile`, `*.sh`, `pytest.ini`, `requirements.txt`) are
excluded.
The build script rewrites bare imports so the published package is a
proper Python namespace:
```
# In monorepo workspace/:
from a2a_client import discover_peer
from builtin_tools.memory import store
# In published molecule_runtime/ (auto-rewritten at publish time):
from molecule_runtime.a2a_client import discover_peer
from molecule_runtime.builtin_tools.memory import store
```text
https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime
```
The closed allowlist of rewritten module names lives in
`scripts/build_runtime_package.py` (`TOP_LEVEL_MODULES` + `SUBPACKAGES`).
Add a new top-level module to workspace/? Add it to the allowlist in the
same PR.
Do not add runtime source back under `molecule-core/workspace/`. The core repo
owns the platform server, canvas, provisioning, and tests around the installed
runtime package.
## Adapter repos
## Package Registry
Each of the 8 adapter template repos contains:
- `adapter.py` — runtime-specific `Adapter` class
- `requirements.txt``molecule-ai-workspace-runtime>=0.1.X` + adapter deps
- `Dockerfile` — standalone image with `ENV ADAPTER_MODULE=adapter` and
`ENTRYPOINT ["molecule-runtime"]`
The runtime package is published to the Molecule AI Gitea package registry:
| Adapter | Repo |
|---------|------|
| claude-code | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-claude-code |
| langgraph | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-langgraph |
| crewai | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-crewai |
| autogen | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-autogen |
| deepagents | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-deepagents |
| hermes | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-hermes |
| gemini-cli | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-gemini-cli |
| openclaw | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-openclaw |
## Adapter discovery (ADAPTER_MODULE)
Standalone adapter repos set `ENV ADAPTER_MODULE=adapter` in their
Dockerfile. The runtime's `get_adapter()` checks this env var first:
```python
# In molecule_runtime/adapters/__init__.py
def get_adapter(runtime: str) -> type[BaseAdapter]:
adapter_module = os.environ.get("ADAPTER_MODULE")
if adapter_module:
mod = importlib.import_module(adapter_module)
return getattr(mod, "Adapter")
raise KeyError(...)
```text
https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/
```
## Publishing a new version
PyPI is intentionally not part of the critical path. Template Dockerfiles,
external-runtime snippets, and CI install checks should use the Gitea registry.
```bash
# From any local checkout of monorepo, after merging your runtime change:
git tag runtime-v0.1.6
git push origin runtime-v0.1.6
```
## Release Flow
The `publish-runtime` workflow takes over — checks out the tag, runs
`scripts/build_runtime_package.py --version 0.1.6`, builds wheel + sdist,
runs a smoke import to catch broken rewrites, and uploads to PyPI via
the PyPA Trusted Publisher action (OIDC). No static API token is stored
in this repo — PyPI verifies the workflow's OIDC claim against the
trusted-publisher config registered for `molecule-ai-workspace-runtime`.
1. Land a reviewed PR in `molecule-ai-workspace-runtime`.
2. Bump `version =` in that repo's `pyproject.toml`.
3. Tag `runtime-vX.Y.Z` on the runtime repo.
4. The runtime repo's `publish-runtime` workflow builds the wheel and sdist,
publishes to the Gitea registry, verifies install from that registry, then
cascades `.runtime-version` pins to workspace template repos.
For dev/test releases without tagging, dispatch the workflow manually
with an explicit version (e.g. `0.1.6.dev1` — PEP 440 dev/rc/post forms
are accepted).
## Core Repo Contract
After publish, the 8 template repos pick up the new version on their
next `:latest` rebuild. To force-pull immediately, bump the pin in each
template's `requirements.txt`.
`molecule-core` must not ship editable runtime code. Its responsibilities are:
## End-to-end CD chain
The full chain from monorepo merge → workspace containers running new code:
```
1. Merge PR with workspace/ changes to main
2. .github/workflows/auto-tag-runtime.yml fires
↓ reads PR labels (release:major/minor) or defaults to patch
↓ pushes runtime-vX.Y.Z tag
3. .github/workflows/publish-runtime.yml fires (on the tag)
↓ builds wheel via scripts/build_runtime_package.py
↓ smoke-imports the wheel
↓ uploads to PyPI
↓ cascade job fires repository_dispatch (event-type: runtime-published)
↓ to all 8 workspace-template-* repos
4. Each template's publish-image.yml fires (on repository_dispatch)
↓ rebuilds Dockerfile (which pip-installs the new PyPI version)
↓ pushes ghcr.io/molecule-ai/workspace-template-<runtime>:latest
5. Production hosts run scripts/refresh-workspace-images.sh
OR an operator hits POST /admin/workspace-images/refresh on the platform
↓ docker pull all 8 :latest tags
↓ remove + force-recreate any running ws-* containers using a refreshed image
↓ canvas re-provisions the workspaces on next interaction
```
Steps 1-4 are fully automated. Step 5 is one-click: a single curl or shell
command. SaaS deployments typically wire step 5 into their normal deploy
pipeline (every release pulls fresh images on every host); local dev fires
it manually after a runtime release lands.
### Auth
PyPI publishing uses **Trusted Publisher (OIDC)** — no static token in the
monorepo. The trusted-publisher config on PyPI binds the
`molecule-ai-workspace-runtime` project to this repo's
`publish-runtime.yml` workflow + `pypi-publish` environment. Rotation is
moot: there is no shared secret to rotate.
### Required secrets
| Secret | Where | Why |
|---|---|---|
| `TEMPLATE_DISPATCH_TOKEN` | molecule-core repo | Fine-grained PAT with `actions:write` on the 8 template repos. Without it the `cascade` job warns and exits clean — PyPI still publishes; templates just don't auto-rebuild. |
### Step 5 specifics
**Local dev (compose stack):**
```bash
bash scripts/refresh-workspace-images.sh # all runtimes
bash scripts/refresh-workspace-images.sh --runtime claude-code
bash scripts/refresh-workspace-images.sh --no-recreate # pull only, leave containers
```
**Via platform admin endpoint (any deploy):**
```bash
curl -X POST "$PLATFORM/admin/workspace-images/refresh"
curl -X POST "$PLATFORM/admin/workspace-images/refresh?runtime=claude-code"
curl -X POST "$PLATFORM/admin/workspace-images/refresh?recreate=false"
```
The endpoint pulls + recreates from inside the platform container, so it
needs Docker socket access (the compose stack mounts
`/var/run/docker.sock` already) AND GHCR auth on the host's docker config
(`docker login ghcr.io` once per host). On a fresh host without GHCR auth,
the pull step warns per runtime and the response surfaces the failures.
**Fully hands-off (opt-in image auto-refresh):**
Set `IMAGE_AUTO_REFRESH=true` on the platform process. A watcher polls
GHCR every 5 minutes for digest changes on each `workspace-template-*:latest`
tag and invokes the same refresh logic the admin endpoint exposes —
no operator action required between "runtime PR merged" and
"containers running new code". Disabled by default because SaaS deploy
pipelines that already pull on every release would do redundant work.
Optional companion env (same as the admin endpoint):
- `GHCR_USER` + `GHCR_TOKEN` — required for private template images;
unused for the current public set, but harmless if set.
## Local dev (build the package without publishing)
```bash
python3 scripts/build_runtime_package.py --version 0.1.0-local --out /tmp/runtime-build
cd /tmp/runtime-build
python -m build # produces dist/*.whl + dist/*.tar.gz
pip install dist/*.whl # install into a venv to test locally
```
This is the same pipeline CI runs. Use it to validate import-rewrite
correctness before pushing a `runtime-v*` tag.
## Writing a new adapter
Use the GitHub template repo
[`molecule-ai/molecule-ai-workspace-template-starter`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-starter) (note: the starter repo did not survive the 2026-05-06 GitHub-org-suspension migration; recreation tracked at internal#41)
— it ships with the canonical Dockerfile + adapter.py skeleton + config.yaml
schema + the `repository_dispatch: [runtime-published]` cascade receiver
already wired up. No follow-up setup PR required.
```bash
# Replace <runtime> with your runtime slug (lowercase, hyphenated).
gh repo create Molecule-AI/molecule-ai-workspace-template-<runtime> \
--template Molecule-AI/molecule-ai-workspace-template-starter \
--public \
--description "Molecule AI workspace template: <runtime>"
git clone https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime>.git
cd molecule-ai-workspace-template-<runtime>
```
Then fill in the `TODO` markers in:
| File | What to fill in |
|---|---|
| `adapter.py` | Rename class to `<Runtime>Adapter`. Fill in `name()`, `display_name()`, `description()`, `get_config_schema()`. Implement `setup()` and `create_executor()`. |
| `requirements.txt` | Add your runtime's pip dependencies (e.g. `langgraph`, `crewai`, `claude-agent-sdk`). |
| `Dockerfile` | Add runtime-specific apt deps (most runtimes don't need any). Replace ENTRYPOINT only if you need custom boot logic. |
| `config.yaml` | Update top-level `name`/`runtime`/`description`. Add the models your runtime supports to `models[]`. |
| `system-prompt.md` | Default agent prompt. |
After `git push`:
1. The template's `publish-image.yml` builds + pushes
`ghcr.io/molecule-ai/workspace-template-<runtime>:latest` automatically.
2. The next `runtime-vX.Y.Z` tag on `molecule-core` cascades a
`repository_dispatch` event into your new template, rebuilding the image
against the latest runtime — no setup PR required.
3. Register the runtime name in the platform's `RuntimeImages` map (in
`workspace-server/internal/provisioner/provisioner.go`) so it's
selectable in the canvas.
## When the starter itself needs to evolve
If the canonical shape changes (e.g. `config.yaml` schema gets a new field,
the `BaseAdapter` interface adds a method, the reusable CI workflow
signature changes), update the
[starter](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-starter) (recreation pending — see note above)
**first**. Existing templates can either migrate at their own pace or be
touched in a coordinated cleanup PR. Either way, future templates pick up
the new shape from day one.
## Migration note
Prior to this workflow, the runtime was duplicated across monorepo
`workspace/` AND a sibling repo `molecule-ai-workspace-runtime`, with no
sync mechanism. That caused 30+ files to drift between the two trees and
tonight's chat-leak / queued-classification fixes existed only in the
monorepo copy until manually ported.
If you have an old local checkout of `molecule-ai-workspace-runtime`, treat
it as outdated. The monorepo `workspace/` is now authoritative; the PyPI
artifact is rebuilt from it on every `runtime-v*` tag.
- Test platform behavior against the installed runtime contract.
- Keep MCP/registry/TenantGuard behavior compatible with the runtime package.
- Fail CI if `workspace/` or legacy build-from-workspace scripts are restored.
-542
View File
@@ -1,542 +0,0 @@
#!/usr/bin/env python3
"""Build the molecule-ai-workspace-runtime PyPI package from monorepo workspace/.
Monorepo workspace/ is the single source-of-truth for runtime code. The PyPI
package is a publish-time mirror produced by this script, NOT a parallel
editable copy. Anyone editing the runtime should edit workspace/, never the
sibling molecule-ai-workspace-runtime repo.
What this does
--------------
1. Copies workspace/ source into build/molecule_runtime/ (note the rename:
bare modules become a real Python package).
2. Rewrites top-level imports so e.g. `from a2a_client import X` becomes
`from molecule_runtime.a2a_client import X`. The rewrite is regex-based
on a closed allowlist of modules — third-party imports like `from a2a.X`
(the a2a-sdk package) are left alone because the regex is anchored on
exact module names.
3. Writes a pyproject.toml with the requested version + the README + the
py.typed marker.
4. Leaves the build dir ready for `python -m build` to produce a wheel/sdist.
Usage
-----
scripts/build_runtime_package.py --version 0.1.6 --out /tmp/runtime-build
cd /tmp/runtime-build && python -m build
python -m twine upload dist/*
The publish workflow (.github/workflows/publish-runtime.yml) drives this
on every `runtime-v*` tag push.
"""
from __future__ import annotations
import argparse
import re
import shutil
import sys
from pathlib import Path
# Top-level Python modules in workspace/ that become molecule_runtime.X.
# Anything imported as `from <name> import` or `import <name>` (where <name>
# matches one of these) gets rewritten to use the package prefix.
#
# Closed list (not "every .py we copy") because a typo in workspace/ would
# otherwise leak into a wrong rewrite. The set is asserted against
# `workspace/*.py` at build time — if the disk contents drift from this
# list (new module added, old one removed), the build fails loud instead
# of silently shipping unrewritten imports. That gap caused 0.1.16 to
# ship `from transcript_auth import ...` (unrewritten — module added
# without updating this set), which broke every workspace startup with
# `ModuleNotFoundError: No module named 'transcript_auth'`.
TOP_LEVEL_MODULES = {
"_sanitize_a2a",
"a2a_cli",
"a2a_client",
"a2a_executor",
"a2a_mcp_server",
"a2a_response",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_identity",
"a2a_tools_inbox",
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"adapter_base",
"agent",
"agents_md",
"boot_routes",
"card_helpers",
"config",
"configs_dir",
"consolidation",
"coordinator",
"event_log",
"events",
"executor_helpers",
"heartbeat",
"inbox",
"inbox_uploads",
"initial_prompt",
"internal_chat_uploads",
"internal_file_read",
"main",
"mcp_cli",
"mcp_doctor",
"mcp_heartbeat",
"mcp_inbox_pollers",
"mcp_workspace_resolver",
"molecule_ai_status",
"not_configured_handler",
"platform_auth",
"platform_inbound_auth",
"plugins",
"preflight",
"prompt",
"runtime_wedge",
"secret_redactor",
"shared_runtime",
"smoke_mode",
"transcript_auth",
"watcher",
}
# Subdirectory packages — these are already real packages (they have or will
# have __init__.py) so the rewrite is `from <pkg>` → `from molecule_runtime.<pkg>`.
SUBPACKAGES = {
"adapters",
"builtin_tools",
"lib",
"platform_tools",
"plugins_registry",
"policies",
"skill_loader",
}
# Files in workspace/ NOT included in the published package. These are
# build artifacts, dev scripts, or monorepo-only scaffolding.
EXCLUDE_FILES = {
"Dockerfile",
"build-all.sh",
"rebuild-runtime-images.sh",
"entrypoint.sh",
"pytest.ini",
"requirements.txt",
# Note: adapter_base.py, agents_md.py, hermes_executor.py, shared_runtime.py
# are kept (referenced by adapters/__init__.py and other modules); they get
# their imports rewritten via TOP_LEVEL_MODULES. Excluding them broke the
# smoke-test install with `ModuleNotFoundError: adapter_base`.
}
EXCLUDE_DIRS = {
"__pycache__",
"tests",
"molecule_audit", # only used by tests; not on production import path
"scripts",
}
def build_import_rewriter() -> re.Pattern:
"""Compile a single regex matching all import statements that need
rewriting. The match groups capture the keyword + module name so the
replacement preserves whitespace and trailing punctuation.
Modules included: TOP_LEVEL_MODULES SUBPACKAGES.
The negative-lookahead on `\\.` in the suffix prevents matching
`from a2a.server.X import Y` against bare `a2a` (which isn't in our
set, but the principle matters for any future short module name that
happens to be a prefix of a real package name).
"""
names = sorted(TOP_LEVEL_MODULES | SUBPACKAGES)
alt = "|".join(re.escape(n) for n in names)
# Matches:
# from <name>(\.|\s|import)
# import <name>(\s|$|,)
# And captures the keyword + name so we can re-emit with prefix.
pattern = (
r"(?m)^(?P<indent>\s*)" # leading whitespace (preserved)
r"(?P<kw>from|import)\s+" # 'from' or 'import'
r"(?P<mod>" + alt + r")" # the module name
r"(?P<rest>[\s.,]|$)" # what follows: '.subpath', ' import …', ',', whitespace, EOL
)
return re.compile(pattern)
def rewrite_imports(text: str, regex: re.Pattern) -> str:
"""Replace bare imports with package-prefixed ones.
`import X` → `import molecule_runtime.X as X` (preserve binding)
`from X import Y` → `from molecule_runtime.X import Y`
`from X.sub import Y` → `from molecule_runtime.X.sub import Y`
Rejects `import X as Y` because the rewrite would produce
`import molecule_runtime.X as X as Y`, a syntax error. The PR #2433
incident shipped this exact pattern past `Python Lint & Test` (which
runs against pre-rewrite source) but blew up the wheel-smoke gate.
Detecting it here turns the silent build failure into a build-time
error with a clear path: use `from X import …` or plain `import X`.
"""
def repl(m: re.Match) -> str:
indent, kw, mod, rest = m.group("indent"), m.group("kw"), m.group("mod"), m.group("rest")
if kw == "from":
# `from X` or `from X.sub` — always safe to prefix.
return f"{indent}from molecule_runtime.{mod}{rest}"
# `import X` — preserve the binding name `X` (callers do `X.foo`)
# by aliasing. `import X.sub` is uncommon for our modules and would
# need a different binding form, but isn't used in workspace/ today.
if rest.startswith("."):
# `import X.sub` — rewrite as `import molecule_runtime.X.sub` and
# leave the trailing dot pattern intact for the rest of the line.
return f"{indent}import molecule_runtime.{mod}{rest}"
# Detect `import X as Y` — the regex's `rest` group captures only
# the immediate following char (whitespace, comma, or EOL), so we
# have to peek at the surrounding line context. The match start is
# at the line's `import` keyword; everything after the matched
# name on the same line is what the source author wrote.
line_start = text.rfind("\n", 0, m.start()) + 1
line_end = text.find("\n", m.end())
if line_end == -1:
line_end = len(text)
line_after = text[m.end() - len(rest):line_end]
# Strip comments from consideration so `import X # noqa` doesn't trip.
line_after_no_comment = line_after.split("#", 1)[0]
if re.search(r"^\s*as\s+\w+", line_after_no_comment):
raise ValueError(
f"rewrite_imports: cannot rewrite 'import {mod} as <alias>' on a "
f"workspace module — the regex would produce "
f"'import molecule_runtime.{mod} as {mod} as <alias>', invalid syntax. "
f"Use 'from {mod} import …' or plain 'import {mod}' instead. "
f"Offending line: {text[line_start:line_end]!r}"
)
# Plain `import X` — alias preserves the local name.
return f"{indent}import molecule_runtime.{mod} as {mod}{rest}"
return regex.sub(repl, text)
def copy_tree_filtered(src: Path, dst: Path) -> list[Path]:
"""Copy src/ → dst/ skipping EXCLUDE_FILES + EXCLUDE_DIRS. Returns the
list of .py files copied so the caller can run the import rewrite over
them in one pass."""
py_files: list[Path] = []
if dst.exists():
shutil.rmtree(dst)
dst.mkdir(parents=True)
for entry in src.iterdir():
if entry.is_dir():
if entry.name in EXCLUDE_DIRS:
continue
sub_py = copy_tree_filtered(entry, dst / entry.name)
py_files.extend(sub_py)
else:
if entry.name in EXCLUDE_FILES:
continue
shutil.copy2(entry, dst / entry.name)
if entry.suffix == ".py":
py_files.append(dst / entry.name)
return py_files
PYPROJECT_TEMPLATE = """\
[build-system]
requires = ["setuptools>=68.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "molecule-ai-workspace-runtime"
version = "{version}"
description = "Molecule AI workspace runtime — shared infrastructure for all agent adapters"
requires-python = ">=3.11"
license = {{text = "BSL-1.1"}}
readme = "README.md"
dependencies = [
"a2a-sdk[http-server]>=1.0.0,<2.0",
"httpx>=0.27.0",
"uvicorn>=0.30.0",
"starlette>=0.38.0",
"websockets>=12.0",
# multipart/form-data parser — required for Starlette's Request.form() on
# /internal/chat/uploads/ingest. Without it, Starlette raises AssertionError
# when parsing multipart bodies, which the chat-upload handler surfaces as
# an opaque 400. Mirrors the canonical pin in workspace/requirements.txt;
# >=0.0.27 avoids CVE-2024-53981 (DoS via malformed boundary).
# Forensic a78762a0 (2026-05-19): Hermes PDF upload 400 root cause.
"python-multipart>=0.0.27",
"pyyaml>=6.0",
"langchain-core>=0.3.0",
"opentelemetry-api>=1.24.0",
"opentelemetry-sdk>=1.24.0",
"opentelemetry-exporter-otlp-proto-http>=1.24.0",
"temporalio>=1.7.0",
]
[project.scripts]
molecule-runtime = "molecule_runtime.main:main_sync"
molecule-mcp = "molecule_runtime.mcp_cli:main"
[tool.setuptools.packages.find]
where = ["."]
include = ["molecule_runtime*", "plugins_registry*"]
[tool.setuptools.package-data]
"molecule_runtime" = ["py.typed"]
"plugins_registry" = ["py.typed"]
"""
README_TEMPLATE = """\
# molecule-ai-workspace-runtime
Shared workspace runtime for [Molecule AI](https://git.moleculesai.app/molecule-ai/molecule-core)
agent adapters. Installed by every workspace template image
(`workspace-template-claude-code`, `-langgraph`, `-hermes`, etc.) to provide
A2A delegation, heartbeat, memory, plugin loading, and skill management.
This package is **published from the molecule-core monorepo `workspace/`
directory** by the `publish-runtime` GitHub Actions workflow on every
`runtime-v*` tag push. **Do not edit this package directly** — edit
`workspace/` in the monorepo.
## External-runtime MCP server (`molecule-mcp`)
Operators running an agent outside the platform's container fleet
(any runtime that supports MCP stdio — Claude Code, hermes, codex,
etc.) can install this wheel and run the universal MCP server
locally.
### Requirements
* **Python ≥3.11.** The wheel sets `requires-python = ">=3.11"`. On
older interpreters `pip install` returns the cryptic
`Could not find a version that satisfies the requirement` — that
message is pip filtering this wheel out, NOT the package missing
from PyPI. Upgrade with `brew install python@3.12` /
`apt install python3.12` / `pyenv install 3.12` first.
* **`pipx` recommended over `pip`.** `pipx install` puts
`molecule-mcp` on PATH automatically and isolates the runtime's
deps from your system Python. Plain `pip install --user` works
but the binary lands in `~/.local/bin` (Linux) or
`~/Library/Python/3.X/bin` (macOS) which is often not on PATH on
a fresh shell — `claude mcp add molecule-<workspace-slug> -- molecule-mcp`
then fails with "command not found" at first use.
* **Server name in `claude mcp add` is workspace-specific.** The
Canvas "Add to Claude Code" snippet stamps a unique slug
(`molecule-<workspace-name>`) so a single Claude Code session can
talk to N molecule workspaces concurrently — `claude mcp add` keys
entries by name in `~/.claude.json`, so re-running with a bare
`molecule` name silently overwrites the prior workspace's entry.
See [molecule-core#1535](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1535)
for the canonical generator.
### Install
```sh
# Recommended:
pipx install molecule-ai-workspace-runtime
# Alternative (manage PATH yourself):
pip install --user molecule-ai-workspace-runtime
```
### Run
```sh
WORKSPACE_ID=<uuid> \\
PLATFORM_URL=https://<tenant>.staging.moleculesai.app \\
MOLECULE_WORKSPACE_TOKEN=<bearer> \\
molecule-mcp
```
That exposes the same 8 platform tools (`delegate_task`, `list_peers`,
`send_message_to_user`, `commit_memory`, etc.) that container-bound
runtimes already get via the workspace's auto-spawned MCP. Register
the binary in your agent's MCP config — use a workspace-specific
server name so multi-workspace setups don't collide (e.g. Claude Code:
`claude mcp add molecule-<workspace-slug> -- molecule-mcp` with the env
above; the Canvas modal stamps the right slug for you).
### Keeping the token out of shell history
Inline `MOLECULE_WORKSPACE_TOKEN=<bearer>` ends up in `~/.zsh_history`
and (when registered via `claude mcp add`) plaintext in
`~/.claude.json`. To avoid that, write the token to a 0600 file and
point `MOLECULE_WORKSPACE_TOKEN_FILE` at it:
```sh
umask 077
printf '%s' "<bearer>" > ~/.config/molecule/token
WORKSPACE_ID=<uuid> \\
PLATFORM_URL=https://<tenant>.staging.moleculesai.app \\
MOLECULE_WORKSPACE_TOKEN_FILE=$HOME/.config/molecule/token \\
molecule-mcp
```
Token resolution order: `MOLECULE_WORKSPACE_TOKEN` (inline env) →
`MOLECULE_WORKSPACE_TOKEN_FILE` (path) → `${CONFIGS_DIR}/.auth_token`
(in-container default).
The token comes from the canvas → Tokens tab. Restarting an external
workspace from the canvas no longer revokes the token (PR #2412), so
operator tokens persist across status nudges.
### Push vs poll delivery (Claude Code specifics)
By default the inbox runs in **poll mode** — every turn the agent
calls `wait_for_message`, which blocks up to ~60s on
`/activity?since_id=…`. Real-time push delivery is also supported,
but on Claude Code it requires THREE conditions, ALL of which must
hold:
1. **The MCP server declares `experimental.claude/channel`** — this
wheel does (see `_build_initialize_result`). Nothing for you to
do.
2. **Claude Code installs the server as a marketplace plugin** — a
plain `claude mcp add molecule-<workspace-slug> -- molecule-mcp`
produces a non-plugin-sourced server, which Claude Code rejects with
`channel_enable requires a marketplace plugin`. Until the
official `moleculesai/claude-code-plugin` marketplace lands
(tracking [#2936](https://git.moleculesai.app/molecule-ai/molecule-core/issues/2936)),
operators who want push must scaffold their own local marketplace
under
`~/.claude/marketplaces/molecule-local/` containing a
`marketplace.json` + `plugin.json` that points at this wheel.
3. **Claude Code is launched with the dev-channels flag** — pass
`--dangerously-load-development-channels plugin:molecule@<marketplace>`
on the `claude` invocation. Without this flag the channel
capability is silently ignored.
Symptom of any condition failing: messages arrive but only via the
poll path (every ~160s), not real-time. There's currently no
diagnostic surfaced — `molecule-mcp doctor` (tracking
[#2937](https://git.moleculesai.app/molecule-ai/molecule-core/issues/2937)) is
planned.
If you don't need real-time push, the default poll path works
universally with no extra setup; both modes converge on the same
`inbox_pop` ack so messages never duplicate.
See [`docs/workspace-runtime-package.md`](https://git.moleculesai.app/molecule-ai/molecule-core/src/branch/main/docs/workspace-runtime-package.md)
for the publish flow and architecture.
"""
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("--version", required=True, help="Package version, e.g. 0.1.6")
parser.add_argument("--out", required=True, type=Path, help="Build output directory (will be wiped)")
parser.add_argument("--source", type=Path, default=Path(__file__).resolve().parent.parent / "workspace",
help="Path to monorepo workspace/ directory (default: ../workspace from this script)")
args = parser.parse_args()
src = args.source.resolve()
out = args.out.resolve()
if not src.is_dir():
print(f"error: source not a directory: {src}", file=sys.stderr)
return 2
# Drift gate: assert TOP_LEVEL_MODULES matches workspace/*.py.
# Without this, a new top-level module added to workspace/ ships
# with unrewritten `from <name> import` statements that explode at
# runtime with ModuleNotFoundError. (See 0.1.16 transcript_auth
# incident — closed list silently went stale.)
on_disk_modules = {
f.stem for f in src.glob("*.py")
if f.stem not in {"__init__", "conftest"}
}
missing = on_disk_modules - TOP_LEVEL_MODULES
stale = TOP_LEVEL_MODULES - on_disk_modules
if missing or stale:
print("error: TOP_LEVEL_MODULES drifted from workspace/*.py contents:", file=sys.stderr)
if missing:
print(f" in workspace/ but NOT in TOP_LEVEL_MODULES (will ship un-rewritten): {sorted(missing)}", file=sys.stderr)
if stale:
print(f" in TOP_LEVEL_MODULES but NOT in workspace/ (no-op, but misleading): {sorted(stale)}", file=sys.stderr)
print(" Edit scripts/build_runtime_package.py:TOP_LEVEL_MODULES to match.", file=sys.stderr)
return 3
# Same drift gate for SUBPACKAGES — catches the inverse class of
# bug where a workspace/ subdirectory is referenced by main.py
# (`from lib.pre_stop import ...`) but is either missing from
# SUBPACKAGES (so the rewriter doesn't qualify the import) or
# accidentally listed in EXCLUDE_DIRS (so the directory itself
# isn't shipped). 0.1.16-0.1.19 had `lib` in EXCLUDE_DIRS while
# main.py imported from it — `ModuleNotFoundError: No module
# named 'lib'` at every workspace startup.
on_disk_subpkgs = {
d.name for d in src.iterdir()
if d.is_dir()
and d.name not in EXCLUDE_DIRS
and d.name not in {"__pycache__"}
and (d / "__init__.py").exists()
}
sub_missing = on_disk_subpkgs - SUBPACKAGES
sub_stale = SUBPACKAGES - on_disk_subpkgs
if sub_missing or sub_stale:
print("error: SUBPACKAGES drifted from workspace/ subdirectories:", file=sys.stderr)
if sub_missing:
print(f" in workspace/ but NOT in SUBPACKAGES (will ship un-rewritten or be excluded): {sorted(sub_missing)}", file=sys.stderr)
if sub_stale:
print(f" in SUBPACKAGES but NOT in workspace/ (no-op, but misleading): {sorted(sub_stale)}", file=sys.stderr)
print(" Edit scripts/build_runtime_package.py:SUBPACKAGES + EXCLUDE_DIRS to match.", file=sys.stderr)
return 3
pkg_dir = out / "molecule_runtime"
print(f"[build] source: {src}")
print(f"[build] output: {out}")
print(f"[build] package: {pkg_dir}")
if out.exists():
shutil.rmtree(out)
out.mkdir(parents=True)
py_files = copy_tree_filtered(src, pkg_dir)
print(f"[build] copied {len(py_files)} .py files")
# Install plugins_registry/ at the wheel TOP LEVEL so that plugin adapter
# code (workspace-template-*) can use bare `from plugins_registry import ...`.
# The molecule-runtime package (molecule_runtime/) also ships it at
# molecule_runtime/plugins_registry/ (satisfies the rewritten
# `from molecule_runtime.plugins_registry import ...` in adapter_base.py).
# Both copies coexist: they serve different import namespaces.
plugins_src = src / "plugins_registry"
plugins_dst = out / "plugins_registry"
if plugins_src.is_dir():
shutil.copytree(plugins_src, plugins_dst)
print(f"[build] installed plugins_registry/ at top level (bare-import shim)")
# Ensure top-level package marker exists. workspace/ doesn't have one
# (it's not a package in monorepo), but the published artifact must.
init = pkg_dir / "__init__.py"
if not init.exists():
init.write_text('"""Molecule AI workspace runtime."""\n')
# Touch py.typed so type-checkers in adapter consumers see the package
# as typed. Empty file is the convention.
(pkg_dir / "py.typed").touch()
# Rewrite imports in every .py file we copied + the new __init__.py.
regex = build_import_rewriter()
rewrites = 0
for f in [*py_files, init]:
original = f.read_text()
rewritten = rewrite_imports(original, regex)
if rewritten != original:
f.write_text(rewritten)
rewrites += 1
print(f"[build] rewrote imports in {rewrites} files")
# Emit pyproject.toml + README at build root.
(out / "pyproject.toml").write_text(PYPROJECT_TEMPLATE.format(version=args.version))
(out / "README.md").write_text(README_TEMPLATE)
print(f"[build] done. To publish:")
print(f" cd {out}")
print(f" python -m build")
print(f" python -m twine upload dist/*")
return 0
if __name__ == "__main__":
sys.exit(main())
-95
View File
@@ -1,95 +0,0 @@
#!/usr/bin/env bash
# check-cascade-list-vs-manifest.sh — structural drift gate for the
# publish-runtime cascade list vs manifest.json workspace_templates.
#
# WHY: PR #2536 pruned the manifest to 4 supported runtimes; PR #2556
# realigned the cascade list to match. The underlying drift hazard
# (cascade-list ≠ manifest) was unguarded — the data fix didn't prevent
# recurrence. This script is the structural gate that does.
#
# Behavior-based per project pattern: derives the expected set from
# manifest.json and the actual set from the workflow YAML, fails on
# any divergence in either direction.
#
# missing-from-cascade → templates in manifest that publish-runtime.yml
# won't auto-rebuild on a new wheel publish
# (the codex-stuck-on-stale-runtime bug class)
# extra-in-cascade → cascade dispatches to deprecated templates
# (the wasted-API-calls + dead-CI-noise class)
#
# Suffix mapping: manifest names map to GHCR repos via
# {name without -default suffix} → molecule-ai-workspace-template-<suffix>
# That's the same map publish-runtime.yml's TEMPLATES variable iterates.
#
# Exit:
# 0 cascade matches manifest exactly
# 1 drift detected (script prints the diff)
# 2 bad usage / missing inputs
set -eu
MANIFEST="${1:-manifest.json}"
WORKFLOW="${2:-.github/workflows/publish-runtime.yml}"
if [ ! -f "$MANIFEST" ]; then
echo "::error::manifest not found: $MANIFEST" >&2
exit 2
fi
if [ ! -f "$WORKFLOW" ]; then
echo "::error::workflow not found: $WORKFLOW" >&2
exit 2
fi
# Expected cascade entries: manifest workspace_templates → suffix-only
# (strip -default tail, e.g. claude-code-default → claude-code, since
# publish-runtime.yml's TEMPLATES uses suffixes that match the
# molecule-ai-workspace-template-<suffix> repo naming).
EXPECTED=$(jq -r '.workspace_templates[].name' "$MANIFEST" \
| sed 's/-default$//' \
| sort -u)
# Actual cascade entries: extract from the TEMPLATES="…" line. We look
# for the line, pull the contents between the quotes, and split into
# one-per-line. Single source of truth in the workflow itself, no
# parallel registry needed.
#
# Why not \s in the regex: BSD sed (macOS) doesn't recognize \s as
# whitespace — treats it as literal `s`. POSIX [[:space:]] works on
# both BSD and GNU sed. Same hazard nuked the original draft of this
# script: \s* matched empty-prefix-of-literal-s, then the leading
# whitespace stayed in the captured group.
ACTUAL=$(grep -E '[[:space:]]*TEMPLATES="' "$WORKFLOW" \
| head -1 \
| sed -E 's/^[[:space:]]*TEMPLATES="([^"]*)".*$/\1/' \
| tr ' ' '\n' \
| grep -v '^$' \
| sort -u)
if [ -z "$ACTUAL" ]; then
echo "::error::could not extract TEMPLATES=\"…\" from $WORKFLOW — has the variable name or quoting changed?" >&2
exit 2
fi
MISSING=$(comm -23 <(printf '%s\n' "$EXPECTED") <(printf '%s\n' "$ACTUAL"))
EXTRA=$(comm -13 <(printf '%s\n' "$EXPECTED") <(printf '%s\n' "$ACTUAL"))
if [ -z "$MISSING" ] && [ -z "$EXTRA" ]; then
echo "✓ cascade list matches manifest workspace_templates ($(echo "$EXPECTED" | wc -l | tr -d ' ') entries)"
exit 0
fi
echo "::error::cascade list drift detected between $MANIFEST and $WORKFLOW" >&2
echo "" >&2
if [ -n "$MISSING" ]; then
echo " Templates in manifest but MISSING from cascade (won't auto-rebuild on wheel publish):" >&2
echo "$MISSING" | sed 's/^/ - /' >&2
echo "" >&2
fi
if [ -n "$EXTRA" ]; then
echo " Templates in cascade but NOT in manifest (deprecated, wasting dispatch calls):" >&2
echo "$EXTRA" | sed 's/^/ - /' >&2
echo "" >&2
fi
echo " Fix: edit the TEMPLATES=\"…\" line in $WORKFLOW so the set matches" >&2
echo " manifest.json's workspace_templates (suffix-stripped). See PR #2556 for context." >&2
exit 1
-201
View File
@@ -1,201 +0,0 @@
"""Tests for scripts/build_runtime_package.py — the wheel-build import rewriter.
Run locally: ``python3 -m unittest scripts/test_build_runtime_package.py -v``
Why this exists: PR #2433 shipped ``import inbox as _inbox_module`` inside
the workspace runtime, and the rewriter expanded it to
``import molecule_runtime.inbox as inbox as _inbox_module`` — invalid
Python. The wheel-smoke gate caught it post-merge but couldn't block
the merge (not a required check yet — see PR #2439). PR #2436 added a
build-time gate that raises ``ValueError`` on this pattern; this file
locks the rewriter's documented contract under unit test so the gate
itself can't silently regress.
Coverage:
- ``import X`` → ``import molecule_runtime.X as X``
- ``import X.sub`` → ``import molecule_runtime.X.sub``
- ``import X`` + trailing comment is preserved
- ``from X import Y`` → ``from molecule_runtime.X import Y``
- ``from X.sub import Y`` → ``from molecule_runtime.X.sub import Y``
- ``from X import Y, Z`` → ``from molecule_runtime.X import Y, Z``
- ``import X as Y`` → raises ValueError (the rewriter would
produce ``import molecule_runtime.X as X as Y``, syntax error)
- non-allowlist module names → not rewritten (regex anchors on the closed set)
- Indented imports (inside def/class) keep their indentation.
"""
from __future__ import annotations
import os
import sys
import unittest
# scripts/build_runtime_package.py lives at scripts/ — add scripts/ to sys.path
# so the import works whether unittest is invoked from repo root or scripts/.
HERE = os.path.dirname(os.path.abspath(__file__))
if HERE not in sys.path:
sys.path.insert(0, HERE)
import build_runtime_package as M # noqa: E402
def rewrite(text: str) -> str:
"""Run the rewriter end-to-end so the test exercises the same path
used by the wheel build (regex compile + substitution)."""
regex = M.build_import_rewriter()
return M.rewrite_imports(text, regex)
class TestBareImportRewriting(unittest.TestCase):
def test_plain_import_aliases_to_preserve_binding(self):
self.assertEqual(
rewrite("import inbox\n"),
"import molecule_runtime.inbox as inbox\n",
)
def test_plain_import_with_trailing_comment_is_preserved(self):
# Real-world shape from a2a_mcp_server.py — the comment must
# survive the rewrite without losing its leading-space buffer.
self.assertEqual(
rewrite("import inbox # noqa: E402\n"),
"import molecule_runtime.inbox as inbox # noqa: E402\n",
)
def test_import_dotted_keeps_dotted_form(self):
# `import X.sub` is rare for our modules but the rewriter must
# not double-alias — we want `import molecule_runtime.X.sub`,
# not `import molecule_runtime.X.sub as X.sub` (invalid).
self.assertEqual(
rewrite("import platform_tools.registry\n"),
"import molecule_runtime.platform_tools.registry\n",
)
def test_indented_import_preserves_indentation(self):
src = "def foo():\n import inbox\n return inbox.x\n"
out = rewrite(src)
self.assertIn(" import molecule_runtime.inbox as inbox\n", out)
class TestFromImportRewriting(unittest.TestCase):
def test_from_module_import_simple(self):
self.assertEqual(
rewrite("from inbox import InboxState\n"),
"from molecule_runtime.inbox import InboxState\n",
)
def test_from_dotted_import(self):
self.assertEqual(
rewrite("from platform_tools.registry import TOOLS\n"),
"from molecule_runtime.platform_tools.registry import TOOLS\n",
)
def test_from_import_multiple_symbols(self):
# Multi-import statement — the rewriter only touches the module
# prefix, not the names being imported.
self.assertEqual(
rewrite("from a2a_tools import (foo, bar, baz)\n"),
"from molecule_runtime.a2a_tools import (foo, bar, baz)\n",
)
def test_from_import_block_form(self):
src = (
"from a2a_tools import (\n"
" tool_check_task_status,\n"
" tool_commit_memory,\n"
")\n"
)
out = rewrite(src)
self.assertIn("from molecule_runtime.a2a_tools import (\n", out)
# Trailing names + closer are unchanged.
self.assertIn(" tool_check_task_status,\n", out)
self.assertIn(")\n", out)
class TestImportAsAliasRejection(unittest.TestCase):
"""The key regression class — the failure mode that shipped in PR #2433."""
def test_import_as_alias_raises_value_error(self):
with self.assertRaises(ValueError) as ctx:
rewrite("import inbox as _inbox_module\n")
msg = str(ctx.exception)
# Error must name the offending module + suggest the fix.
self.assertIn("inbox", msg)
self.assertIn("as <alias>", msg)
self.assertIn("from", msg) # suggests `from X import …`
def test_import_as_alias_indented_still_rejected(self):
# Indented (inside def/class) — same hazard, same rejection.
with self.assertRaises(ValueError):
rewrite("def foo():\n import inbox as _x\n")
def test_import_as_alias_with_trailing_comment_still_rejected(self):
with self.assertRaises(ValueError):
rewrite("import inbox as _x # comment\n")
def test_plain_import_with_as_in_comment_does_not_trip(self):
# The detection strips comments before pattern-matching, so a
# comment containing "as foo" must NOT trigger the rejection.
self.assertEqual(
rewrite("import inbox # rewriter produces alias as inbox\n"),
"import molecule_runtime.inbox as inbox # rewriter produces alias as inbox\n",
)
def test_import_followed_by_comma_is_not_an_alias(self):
# `import inbox, os` — comma is not `as`, must not be rejected.
# Our regex captures `inbox` then `,` — only `inbox` gets prefixed.
# `os` is not in TOP_LEVEL_MODULES so it's left alone.
out = rewrite("import inbox, os\n")
# The first module is rewritten; the second (non-allowlist) is not.
self.assertIn("import molecule_runtime.inbox as inbox", out)
class TestOutsideAllowlistModules(unittest.TestCase):
def test_third_party_imports_unchanged(self):
# `httpx`, `os`, `re` etc. are not in TOP_LEVEL_MODULES — the
# regex must not match them. This is the closed-list invariant
# that prevents accidental rewrites of stdlib / third-party.
src = "import httpx\nimport os\nfrom re import match\n"
self.assertEqual(rewrite(src), src)
def test_short_name_collision_avoided(self):
# `from a2a.server.X import Y` must not match the bare `a2a`
# prefix — `a2a` isn't in our allowlist (we allow `a2a_tools`,
# `a2a_client`, etc., but not bare `a2a`). Belt-and-suspenders.
src = "from a2a.server.routes import create_agent_card_routes\n"
self.assertEqual(rewrite(src), src)
class TestEndToEndShape(unittest.TestCase):
"""Reproduces the PR #2433 → #2436 incident shape."""
def test_pr_2433_pattern_now_rejected(self):
# The exact line PR #2433 added (inside main()), which produced
# `import molecule_runtime.inbox as inbox as _inbox_module` —
# invalid syntax in the published wheel.
with self.assertRaises(ValueError) as ctx:
rewrite(
" import inbox as _inbox_module\n"
" _inbox_module.set_notification_callback(_on_inbox_message)\n"
)
# Error message includes the offending line so the operator
# knows exactly where to fix.
self.assertIn("inbox", str(ctx.exception))
def test_pr_2436_fix_pattern_works(self):
# The fix-forward shape (#2436): top-level `import inbox`,
# bridge wired in main() via `inbox.set_notification_callback`.
src = (
"import inbox\n"
"\n"
"def main():\n"
" inbox.set_notification_callback(cb)\n"
)
out = rewrite(src)
self.assertIn("import molecule_runtime.inbox as inbox\n", out)
# The callable reference inside main() is left alone — only
# imports get rewritten, not arbitrary `inbox.foo` callsites
# (those resolve via the module binding the rewrite preserves).
self.assertIn(" inbox.set_notification_callback(cb)\n", out)
if __name__ == "__main__":
unittest.main()
+1 -1
View File
@@ -9,7 +9,7 @@ This repo uses the standard monorepo testing convention: **unit tests live with
| Go unit + integration (platform, CLI, handlers) | `workspace-server/**/*_test.go` — run with `cd workspace-server && go test -race ./...` |
| TypeScript unit (canvas components, hooks, store) | `canvas/src/**/__tests__/` — run with `cd canvas && npm test -- --run` |
| TypeScript unit (MCP server handlers) | `mcp-server/src/__tests__/` — run with `cd mcp-server && npx jest` |
| Python unit (workspace runtime, adapters) | `workspace/tests/` — run with `cd workspace && python3 -m pytest` |
| Python unit (workspace runtime, adapters) | `molecule-ai-workspace-runtime/tests/` in the standalone runtime repo |
| Python unit (SDK: plugin + remote agent) | `sdk/python/tests/` — run with `cd sdk/python && python3 -m pytest` |
| **Cross-component E2E** (spans platform + runtime + HTTP) | `tests/e2e/`**you are here** |
+4 -1
View File
@@ -33,7 +33,10 @@ e2e_mint_test_token() {
return 2
fi
local body
body=$(curl -s -w "\n%{http_code}" "$BASE/admin/workspaces/$wid/test-token")
local admin_bearer="${MOLECULE_ADMIN_TOKEN:-${ADMIN_TOKEN:-}}"
local admin_auth=()
[ -n "$admin_bearer" ] && admin_auth=(-H "Authorization: Bearer $admin_bearer")
body=$(curl -s -w "\n%{http_code}" "$BASE/admin/workspaces/$wid/test-token" ${admin_auth[@]+"${admin_auth[@]}"})
local code
code=$(printf '%s' "$body" | tail -n1)
local json
+1 -1
View File
@@ -71,7 +71,7 @@ pv_assert_runtime() {
set +e
resp=$(curl -sS -X POST "$base_url/workspaces/$wid/mcp" \
-H "Authorization: Bearer $wtok" \
"${org_header[@]}" \
${org_header[@]+"${org_header[@]}"} \
-H "Content-Type: application/json" \
-d "$PV_RPC_BODY" \
-o /tmp/pv_mcp_body.json -w "%{http_code}" 2>/dev/null)
+23 -12
View File
@@ -10,6 +10,10 @@ FAIL=0
# as `Authorization: Bearer <token>`. Capture them here.
ECHO_TOKEN=""
SUM_TOKEN=""
ECHO_AUTH=()
SUM_AUTH=()
ECHO_URL="https://example.com/echo-agent"
SUM_URL="https://example.com/summarizer-agent"
# AdminAuth-gated calls need a bearer token once any workspace token
# exists in the DB. ADMIN_TOKEN is populated after the first workspace
@@ -54,8 +58,8 @@ R=$(acurl "$BASE/workspaces")
check "GET /workspaces (empty)" '[]' "$R"
# Test 3: Create workspace A (AdminAuth fail-open — no tokens exist yet)
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d '{"name":"Echo Agent","tier":1}')
check "POST /workspaces (create echo)" '"status":"provisioning"' "$R"
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d '{"name":"Echo Agent","tier":1,"runtime":"external","external":true}')
check "POST /workspaces (create echo)" '"status":"awaiting_agent"' "$R"
ECHO_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
# Mint a test token so all subsequent AdminAuth-gated calls succeed.
@@ -72,8 +76,8 @@ else
fi
# Test 4: Create workspace B (needs bearer — tokens now exist in DB)
R=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d '{"name":"Summarizer Agent","tier":1}')
check "POST /workspaces (create summarizer)" '"status":"provisioning"' "$R"
R=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d '{"name":"Summarizer Agent","tier":1,"runtime":"external","external":true}')
check "POST /workspaces (create summarizer)" '"status":"awaiting_agent"' "$R"
SUM_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
# Test 5: List has 2
@@ -90,9 +94,10 @@ check "GET /workspaces/:id (agent_card null)" '"agent_card":null' "$R"
# endpoint), not the admin token. C18 requires a token issued TO THIS
# workspace, not just any valid token.
ECHO_WS_TOKEN=$(curl -s "$BASE/admin/workspaces/$ECHO_ID/test-token" | python3 -c "import sys,json; print(json.load(sys.stdin).get('auth_token',''))" 2>/dev/null || echo "")
[ -n "$ECHO_WS_TOKEN" ] && ECHO_AUTH=(-H "Authorization: Bearer $ECHO_WS_TOKEN")
R=$(curl -s -X POST "$BASE/registry/register" -H "Content-Type: application/json" \
${ECHO_WS_TOKEN:+-H "Authorization: Bearer $ECHO_WS_TOKEN"} \
-d "{\"id\":\"$ECHO_ID\",\"url\":\"http://localhost:8001\",\"agent_card\":{\"name\":\"Echo Agent\",\"skills\":[{\"id\":\"echo\",\"name\":\"Echo\"}]}}")
"${ECHO_AUTH[@]}" \
-d "{\"id\":\"$ECHO_ID\",\"url\":\"$ECHO_URL\",\"agent_card\":{\"name\":\"Echo Agent\",\"skills\":[{\"id\":\"echo\",\"name\":\"Echo\"}]}}")
check "POST /registry/register (echo)" '"status":"registered"' "$R"
# Extract token from register response; fall back to the test-token we
# already minted (register may not return a new token on re-registration).
@@ -101,9 +106,10 @@ if [ -z "$ECHO_TOKEN" ]; then ECHO_TOKEN="$ECHO_WS_TOKEN"; fi
# Test 8: Register summarizer — same pattern: workspace-specific token
SUM_WS_TOKEN=$(curl -s "$BASE/admin/workspaces/$SUM_ID/test-token" | python3 -c "import sys,json; print(json.load(sys.stdin).get('auth_token',''))" 2>/dev/null || echo "")
[ -n "$SUM_WS_TOKEN" ] && SUM_AUTH=(-H "Authorization: Bearer $SUM_WS_TOKEN")
R=$(curl -s -X POST "$BASE/registry/register" -H "Content-Type: application/json" \
${SUM_WS_TOKEN:+-H "Authorization: Bearer $SUM_WS_TOKEN"} \
-d "{\"id\":\"$SUM_ID\",\"url\":\"http://localhost:8002\",\"agent_card\":{\"name\":\"Summarizer\",\"skills\":[{\"id\":\"summarize\",\"name\":\"Summarize\"}]}}")
"${SUM_AUTH[@]}" \
-d "{\"id\":\"$SUM_ID\",\"url\":\"$SUM_URL\",\"agent_card\":{\"name\":\"Summarizer\",\"skills\":[{\"id\":\"summarize\",\"name\":\"Summarize\"}]}}")
check "POST /registry/register (summarizer)" '"status":"registered"' "$R"
SUM_TOKEN=$(echo "$R" | e2e_extract_token)
if [ -z "$SUM_TOKEN" ]; then SUM_TOKEN="$SUM_WS_TOKEN"; fi
@@ -112,7 +118,7 @@ if [ -z "$SUM_TOKEN" ]; then SUM_TOKEN="$SUM_WS_TOKEN"; fi
R=$(acurl "$BASE/workspaces/$ECHO_ID")
check "Echo is online" '"status":"online"' "$R"
check "Echo has agent_card" '"skills"' "$R"
check "Echo has url" '"url":"http://localhost:8001"' "$R"
check "Echo has url" "\"url\":\"$ECHO_URL\"" "$R"
# Test 10: Heartbeat
R=$(curl -s -X POST "$BASE/registry/heartbeat" -H "Content-Type: application/json" -H "Authorization: Bearer $ECHO_TOKEN" \
@@ -178,7 +184,7 @@ curl -s -X POST "$BASE/registry/heartbeat" -H "Content-Type: application/json" -
# Re-register to force online status in case liveness expired
curl -s -X POST "$BASE/registry/register" -H "Content-Type: application/json" \
-H "Authorization: Bearer $ECHO_TOKEN" \
-d "{\"id\":\"$ECHO_ID\",\"url\":\"http://localhost:8001\",\"agent_card\":{\"name\":\"Echo Agent v2\",\"skills\":[{\"id\":\"echo\",\"name\":\"Echo\"},{\"id\":\"repeat\",\"name\":\"Repeat\"}]}}" > /dev/null
-d "{\"id\":\"$ECHO_ID\",\"url\":\"$ECHO_URL\",\"agent_card\":{\"name\":\"Echo Agent v2\",\"skills\":[{\"id\":\"echo\",\"name\":\"Echo\"},{\"id\":\"repeat\",\"name\":\"Repeat\"}]}}" > /dev/null
# Now send high error rate to trigger degraded
R=$(curl -s -X POST "$BASE/registry/heartbeat" -H "Content-Type: application/json" -H "Authorization: Bearer $ECHO_TOKEN" \
@@ -358,12 +364,17 @@ else
fi
# Register the re-imported workspace to verify agent_card round-trips
NEW_TOKEN=$(curl -s "$BASE/admin/workspaces/$NEW_ID/test-token" | python3 -c "import sys,json; print(json.load(sys.stdin).get('auth_token',''))" 2>/dev/null || echo "")
NEW_AUTH=()
[ -n "$NEW_TOKEN" ] && NEW_AUTH=(-H "Authorization: Bearer $NEW_TOKEN")
R=$(curl -s -X POST "$BASE/registry/register" -H "Content-Type: application/json" \
-d "{\"id\":\"$NEW_ID\",\"url\":\"http://localhost:8002\",\"agent_card\":{\"name\":\"Summarizer\",\"skills\":[{\"id\":\"summarize\",\"name\":\"Summarize\"}]}}")
"${NEW_AUTH[@]}" \
-d "{\"id\":\"$NEW_ID\",\"url\":\"$SUM_URL\",\"agent_card\":{\"name\":\"Summarizer\",\"skills\":[{\"id\":\"summarize\",\"name\":\"Summarize\"}]}}")
check "Register re-imported workspace" '"status":"registered"' "$R"
# Capture the fresh token issued to the re-imported workspace. SUM_TOKEN was
# revoked when SUM_ID was deleted above — use this one for cleanup instead.
NEW_TOKEN=$(echo "$R" | e2e_extract_token)
REG_NEW_TOKEN=$(echo "$R" | e2e_extract_token)
[ -n "$REG_NEW_TOKEN" ] && NEW_TOKEN="$REG_NEW_TOKEN"
# Re-export and verify agent_card survives the round-trip (#165 / PR #167 — admin-gated)
REBUNDLE=$(curl -s "$BASE/bundles/export/$NEW_ID" -H "Authorization: Bearer $NEW_TOKEN")
+117 -54
View File
@@ -24,7 +24,8 @@
#
# Only PROVISIONING differs from staging:
# - staging: POST /cp/admin/orgs (cold EC2 tenant) + per-tenant admin
# token + each workspace's auth_token from the POST /workspaces resp.
# token + each workspace's MCP bearer from create response or an admin
# token-mint fallback.
# - local: POST /workspaces directly against the local stack
# (BASE, default http://localhost:8080), MCP bearer minted via
# GET /admin/workspaces/:id/test-token (e2e_mint_test_token —
@@ -32,17 +33,22 @@
# every other local E2E (test_priority_runtimes_e2e.sh,
# test_api.sh) already uses; no new credential/provision flow.
#
# It is written to FAIL on today's broken Hermes/OpenClaw behavior and go
# green only when the in-flight root-cause fixes (Hermes-401 #162,
# OpenClaw-never-online/MCP-wiring #165) actually land — same gate
# semantics + exit codes as the staging script. NON-required by design
# until then (flip-to-required tracked at molecule-core#1296), and NOT
# masked with continue-on-error (feedback_fix_root_not_symptom).
# By default the local backend creates external-mode workspace rows and
# drives the literal MCP path directly. That keeps the local peer-visibility
# gate focused on platform auth + MCP list_peers semantics instead of local
# template container boot/heartbeat. Set PV_LOCAL_PROVISION_MODE=container
# for targeted runtime-boot debugging. NON-required by design until the
# flip-to-required tracked at molecule-core#1296, and NOT masked with
# continue-on-error (feedback_fix_root_not_symptom).
#
# Required env: none (local stack only).
# Optional env:
# BASE default http://localhost:8080
# PV_RUNTIMES space list; default "hermes openclaw claude-code"
# PV_LOCAL_PROVISION_MODE default external; set container to also require
# local template containers to boot online
# PV_PARENT_RUNTIME parent runtime; default claude-code when keyed,
# otherwise first keyed runtime in PV_RUNTIMES
# E2E_PROVISION_TIMEOUT_SECS per-workspace online budget; default 900
# (hermes cold apt+uv is the slow path locally)
# E2E_KEEP_WS 1 → skip teardown (local debugging only)
@@ -68,6 +74,7 @@ source "$(dirname "$0")/_lib.sh"
source "$(dirname "$0")/lib/peer_visibility_assert.sh"
PV_RUNTIMES="${PV_RUNTIMES:-hermes openclaw claude-code}"
PV_LOCAL_PROVISION_MODE="${PV_LOCAL_PROVISION_MODE:-external}"
PROVISION_TIMEOUT_SECS="${E2E_PROVISION_TIMEOUT_SECS:-900}"
NAME_PREFIX="PV-Local-$$-$(date +%H%M%S)"
@@ -75,6 +82,9 @@ log() { echo "[$(date +%H:%M:%S)] $*"; }
ok() { echo "[$(date +%H:%M:%S)] ✅ $*"; }
CREATED_WSIDS=()
ADMIN_BEARER="${MOLECULE_ADMIN_TOKEN:-${ADMIN_TOKEN:-}}"
ADMIN_AUTH=()
[ -n "$ADMIN_BEARER" ] && ADMIN_AUTH=(-H "Authorization: Bearer $ADMIN_BEARER")
# ─── Scoped teardown ───────────────────────────────────────────────────
# Deletes ONLY the workspaces THIS run created (tracked in CREATED_WSIDS),
@@ -94,7 +104,7 @@ teardown() {
log "[teardown] deleting ${#CREATED_WSIDS[@]} workspace(s) this run created (scoped)"
for wid in ${CREATED_WSIDS[@]+"${CREATED_WSIDS[@]}"}; do
[ -n "$wid" ] || continue
curl -s -X DELETE "$BASE/workspaces/$wid?confirm=true" >/dev/null 2>&1 || true
curl -s -X DELETE "$BASE/workspaces/$wid?confirm=true" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} >/dev/null 2>&1 || true
done
exit $rc
}
@@ -103,7 +113,7 @@ trap teardown EXIT INT TERM
# Pre-sweep workspaces a prior crashed run of THIS script left behind
# (name prefix match only — never a blanket delete). The trap fires on
# normal exit, but a kill -9 / SIGPIPE can bypass it.
PRIOR=$(curl -s "$BASE/workspaces" | python3 -c '
PRIOR=$(curl -s "$BASE/workspaces" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} | python3 -c '
import json, sys
try:
print(" ".join(w["id"] for w in json.load(sys.stdin) if w.get("name","").startswith("PV-Local-")))
@@ -112,7 +122,7 @@ except Exception:
' 2>/dev/null)
for _wid in $PRIOR; do
log "Pre-sweeping prior PV-Local workspace: $_wid"
curl -s -X DELETE "$BASE/workspaces/$_wid?confirm=true" >/dev/null 2>&1 || true
curl -s -X DELETE "$BASE/workspaces/$_wid?confirm=true" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} >/dev/null 2>&1 || true
done
# ─── Local-stack preflight ─────────────────────────────────────────────
@@ -123,10 +133,10 @@ if ! curl -fsS "$BASE/health" -m 5 >/dev/null 2>&1; then
fi
# admin/test-token is the local MCP-bearer mint path; it 404s in
# production. If it is off, this gate cannot drive the literal call.
if ! curl -fsS "$BASE/admin/workspaces/preflight-probe/test-token" -m 5 >/dev/null 2>&1; then
if ! curl -fsS "$BASE/admin/workspaces/preflight-probe/test-token" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} -m 5 >/dev/null 2>&1; then
# A 404 here is EITHER "no such ws" (fine — endpoint is enabled) OR the
# endpoint is disabled (MOLECULE_ENV=production). Distinguish by body.
PROBE=$(curl -s "$BASE/admin/workspaces/preflight-probe/test-token" -m 5 2>/dev/null)
PROBE=$(curl -s "$BASE/admin/workspaces/preflight-probe/test-token" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} -m 5 2>/dev/null)
if echo "$PROBE" | grep -qi 'production\|disabled\|not found.*endpoint'; then
echo "::error::GET /admin/workspaces/:id/test-token disabled (MOLECULE_ENV=production?). Cannot mint a local MCP bearer." >&2
exit 1
@@ -164,6 +174,28 @@ runtime_secrets() {
esac
}
choose_parent_runtime() {
local rt
if [ -n "${PV_PARENT_RUNTIME:-}" ]; then
runtime_secrets "$PV_PARENT_RUNTIME" >/dev/null || return 1
echo "$PV_PARENT_RUNTIME"
return 0
fi
if runtime_secrets claude-code >/dev/null; then
echo "claude-code"
return 0
fi
for rt in $PV_RUNTIMES; do
if runtime_secrets "$rt" >/dev/null; then
echo "$rt"
return 0
fi
done
return 1
}
# Block until $1 reaches one of $2 (space-separated), or $3 sec elapse.
wait_for_status() {
local wsid="$1" want="$2" budget="$3" start=$SECONDS last=""
@@ -182,27 +214,42 @@ except Exception:
return 1
}
# ─── 1. Provision parent (claude-code) + one sibling per runtime ───────
# Same topology as the staging script: a claude-code parent plus one
# sibling per runtime under test, so each runtime should see all others.
log "1/5 provisioning parent (claude-code) + one sibling per runtime under test..."
PARENT_SECRETS=$(runtime_secrets claude-code) || PARENT_SECRETS=""
if [ -z "$PARENT_SECRETS" ]; then
# Parent still needs to exist as a peer target even without an LLM key;
# it never has to answer list_peers itself (it is excluded from the
# caller set), so an empty-secrets claude-code shell is sufficient.
# ─── 1. Provision parent + one sibling per runtime ──────────────────────
# Same topology as the staging script: one parent plus one sibling per
# runtime under test, so each runtime should see all others. The default
# local backend uses external-mode rows because the literal MCP list_peers
# path is platform-local and must not depend on local template boot/heartbeat.
if [ "$PV_LOCAL_PROVISION_MODE" = "external" ]; then
PARENT_RUNTIME="external"
PARENT_SECRETS="{}"
PARENT_EXTRA=',"external":true'
else
# Container mode is still available for local runtime-boot debugging.
# Prefer a claude-code parent for staging parity, but local CI is
# intentionally allowed to be partially keyed; an unkeyed parent can
# never heartbeat.
PARENT_RUNTIME=$(choose_parent_runtime) || {
echo "::error::No keyed runtime available for parent — cannot run the local peer-visibility gate. Set CLAUDE_CODE_OAUTH_TOKEN and/or E2E_MINIMAX_API_KEY (or ANTHROPIC/OPENAI)." >&2
exit 1
}
PARENT_SECRETS=$(runtime_secrets "$PARENT_RUNTIME") || PARENT_SECRETS=""
if [ -z "$PARENT_SECRETS" ]; then
echo "::error::parent runtime $PARENT_RUNTIME has no provider secrets" >&2
exit 1
fi
PARENT_EXTRA=""
fi
P_RESP=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-parent\",\"runtime\":\"claude-code\",\"tier\":3,\"secrets\":$PARENT_SECRETS}")
log "1/5 provisioning parent ($PARENT_RUNTIME, mode=$PV_LOCAL_PROVISION_MODE) + one sibling per runtime under test..."
P_RESP=$(curl -s -X POST "$BASE/workspaces" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-parent\",\"runtime\":\"$PARENT_RUNTIME\",\"tier\":3$PARENT_EXTRA,\"secrets\":$PARENT_SECRETS}")
PARENT_ID=$(echo "$P_RESP" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))' 2>/dev/null)
if [ -z "$PARENT_ID" ]; then
echo "::error::parent create failed: $(echo "$P_RESP" | head -c 300)" >&2
exit 1
fi
CREATED_WSIDS+=("$PARENT_ID")
log " PARENT_ID=$PARENT_ID"
log " PARENT_ID=$PARENT_ID runtime=$PARENT_RUNTIME"
# NOTE: no `declare -A` — this script must also run on a local macOS dev
# box (bash 3.2, no associative arrays) per feedback_local_must_mimic_
@@ -231,13 +278,21 @@ _map_get() { # _map_get <mapvarname> <key> -> stdout value (empty if absent)
ALL_WS_IDS="$PARENT_ID"
ACTIVE_RUNTIMES=""
for rt in $PV_RUNTIMES; do
SEC=$(runtime_secrets "$rt") || SEC=""
if [ -z "$SEC" ]; then
log " SKIP $rt — no provider key in env (partially-keyed local env; not a failure)"
continue
if [ "$PV_LOCAL_PROVISION_MODE" = "external" ]; then
SEC="{}"
CREATE_RUNTIME="external"
CREATE_EXTRA=',"external":true'
else
SEC=$(runtime_secrets "$rt") || SEC=""
if [ -z "$SEC" ]; then
log " SKIP $rt — no provider key in env (partially-keyed local env; not a failure)"
continue
fi
CREATE_RUNTIME="$rt"
CREATE_EXTRA=""
fi
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-$rt\",\"runtime\":\"$rt\",\"tier\":2,\"parent_id\":\"$PARENT_ID\",\"secrets\":$SEC}")
R=$(curl -s -X POST "$BASE/workspaces" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-$rt\",\"runtime\":\"$CREATE_RUNTIME\",\"tier\":2,\"parent_id\":\"$PARENT_ID\"$CREATE_EXTRA,\"secrets\":$SEC}")
WID=$(echo "$R" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))' 2>/dev/null)
if [ -z "$WID" ]; then
echo "::error::$rt workspace create failed: $(echo "$R" | head -c 300)" >&2
@@ -257,32 +312,40 @@ if [ -z "$ACTIVE_RUNTIMES" ]; then
fi
# ─── 2. Wait for the parent online (it is a peer target) ───────────────
log "2/5 waiting for parent online (peer target)..."
PF=$(wait_for_status "$PARENT_ID" "online" "$PROVISION_TIMEOUT_SECS") || true
if [ "$PF" != "online" ]; then
echo "::error::parent ($PARENT_ID) never reached online (last=$PF) within ${PROVISION_TIMEOUT_SECS}s" >&2
exit 3
fi
ok " parent online"
# ─── 3. Wait for every sibling online ──────────────────────────────────
# A runtime that never comes online locally is itself a finding: it
# reproduces the openclaw-never-online class (#165) on the local stack.
log "3/5 waiting for all siblings online (up to ${PROVISION_TIMEOUT_SECS}s each — cold boot)..."
REGRESSED=0
ONLINE_RUNTIMES=""
for rt in $ACTIVE_RUNTIMES; do
wid="$(_map_get WS_IDS_MAP "$rt")"
S=$(wait_for_status "$wid" "online" "$PROVISION_TIMEOUT_SECS") || true
if [ "$S" != "online" ]; then
echo "$rt ($wid): never reached online (last=$S) — reproduces the never-online class locally"
_map_set VERDICT_MAP "$rt" "FAIL(never-online:last=$S)"
REGRESSED=1
continue
if [ "$PV_LOCAL_PROVISION_MODE" = "external" ]; then
log "2/5 external-mode local backend: parent is awaiting_agent; no container-online wait needed"
ok " parent created"
log "3/5 external-mode local backend: siblings are awaiting_agent; driving MCP directly"
ONLINE_RUNTIMES="$ACTIVE_RUNTIMES"
else
log "2/5 waiting for parent online (peer target)..."
PF=$(wait_for_status "$PARENT_ID" "online" "$PROVISION_TIMEOUT_SECS") || true
if [ "$PF" != "online" ]; then
echo "::error::parent ($PARENT_ID) never reached online (last=$PF) within ${PROVISION_TIMEOUT_SECS}s" >&2
exit 3
fi
ok " $rt online"
ONLINE_RUNTIMES="$ONLINE_RUNTIMES $rt"
done
ok " parent online"
# ─── 3. Wait for every sibling online ──────────────────────────────────
# A runtime that never comes online locally is itself a finding in
# container mode. The default external mode keeps this gate focused on
# literal MCP peer visibility.
log "3/5 waiting for all siblings online (up to ${PROVISION_TIMEOUT_SECS}s each — cold boot)..."
for rt in $ACTIVE_RUNTIMES; do
wid="$(_map_get WS_IDS_MAP "$rt")"
S=$(wait_for_status "$wid" "online" "$PROVISION_TIMEOUT_SECS") || true
if [ "$S" != "online" ]; then
echo "$rt ($wid): never reached online (last=$S) — reproduces the never-online class locally"
_map_set VERDICT_MAP "$rt" "FAIL(never-online:last=$S)"
REGRESSED=1
continue
fi
ok " $rt online"
ONLINE_RUNTIMES="$ONLINE_RUNTIMES $rt"
done
fi
# ─── 4. THE GATE — literal mcp_molecule_list_peers via POST /:id/mcp ────
# Shared, byte-identical assertion. Local passes "" for the org id (the
+68 -11
View File
@@ -40,8 +40,10 @@
# drives: POST /cp/admin/orgs (provision), GET
# /cp/admin/orgs/:slug/admin-token (per-tenant token), DELETE
# /cp/admin/tenants/:slug (teardown). The per-tenant admin token drives
# tenant workspace creation; each workspace's OWN auth_token (returned by
# POST /workspaces) drives its MCP call.
# tenant workspace creation; each workspace's OWN auth_token drives its
# MCP call. External-like runtimes may return the token in POST
# /workspaces; managed container runtimes usually require the admin token
# mint fallback below.
#
# Required env:
# MOLECULE_ADMIN_TOKEN CP admin bearer — Railway staging CP_ADMIN_API_TOKEN
@@ -104,6 +106,46 @@ tenant_call() {
-H "Content-Type: application/json" "$@"
}
tenant_call_capture() {
local method="$1" path="$2" out="$3"; shift 3
curl -sS -o "$out" -w "%{http_code}" -X "$method" "$TENANT_URL$path" \
-H "Authorization: Bearer $TENANT_TOKEN" \
-H "X-Molecule-Org-Id: $ORG_ID" \
-H "Content-Type: application/json" "$@"
}
redact_token_body() {
python3 -c '
import json, re, sys
raw = sys.stdin.read()
try:
data = json.loads(raw)
except Exception:
print(re.sub(r"(?i)([a-z0-9_]*token)=([^&\\s]+)", r"\1=<redacted>", raw)[:500])
raise SystemExit(0)
def scrub(v):
if isinstance(v, dict):
return {k: ("<redacted>" if "token" in k.lower() else scrub(val)) for k, val in v.items()}
if isinstance(v, list):
return [scrub(x) for x in v]
return v
print(json.dumps(scrub(data), separators=(",", ":"))[:500])
'
}
extract_auth_token() {
python3 -c "
import sys, json
try:
d = json.load(sys.stdin)
except Exception:
print(''); sys.exit(0)
print(d.get('auth_token') or d.get('connection', {}).get('auth_token') or '')
" 2>/dev/null
}
# ─── Scoped teardown ───────────────────────────────────────────────────
# Deletes ONLY the org this run created (DELETE /cp/admin/tenants/$SLUG
# with the {"confirm":$SLUG} fat-finger guard). Never a cluster-wide
@@ -218,16 +260,31 @@ for rt in $PV_RUNTIMES; do
R=$(tenant_call POST /workspaces \
-d "{\"name\":\"pv-$rt\",\"runtime\":\"$rt\",\"tier\":2,\"parent_id\":\"$PARENT_ID\",\"secrets\":$SECRETS_JSON}")
WID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin).get('id',''))" 2>/dev/null)
# auth_token is top-level for container runtimes; external-like nest it
# under connection.auth_token (verified vs staging response shape).
WTOK=$(echo "$R" | python3 -c "
import sys, json
try: d = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
print(d.get('auth_token') or d.get('connection', {}).get('auth_token') or '')
" 2>/dev/null)
# External-like runtimes may return connection.auth_token on create.
# Managed container runtimes usually return only id/status here, then
# receive their bearer through registry/bootstrap; for this literal MCP
# driver we mint an admin test token below.
WTOK=$(echo "$R" | extract_auth_token)
[ -n "$WID" ] || fail "$rt workspace create failed: $(echo "$R" | head -c 300)"
[ -n "$WTOK" ] || fail "$rt workspace did not return an auth_token — cannot drive its MCP call (resp: $(echo "$R" | head -c 300))"
TOKEN_DIAG=""
if [ -z "$WTOK" ]; then
TTOK_FILE=$(mktemp)
TTOK_CODE=$(tenant_call_capture POST "/admin/workspaces/$WID/tokens" "$TTOK_FILE" 2>/dev/null || echo "curl_error")
TTOK_RESP=$(cat "$TTOK_FILE" 2>/dev/null || true)
WTOK=$(echo "$TTOK_RESP" | extract_auth_token)
TOKEN_DIAG="POST /admin/workspaces/$WID/tokens -> HTTP $TTOK_CODE body: $(echo "$TTOK_RESP" | redact_token_body)"
rm -f "$TTOK_FILE"
fi
if [ -z "$WTOK" ]; then
TTOK_FILE=$(mktemp)
TTOK_CODE=$(tenant_call_capture GET "/admin/workspaces/$WID/test-token" "$TTOK_FILE" 2>/dev/null || echo "curl_error")
TTOK_RESP=$(cat "$TTOK_FILE" 2>/dev/null || true)
WTOK=$(echo "$TTOK_RESP" | extract_auth_token)
TOKEN_DIAG="${TOKEN_DIAG}
GET /admin/workspaces/$WID/test-token -> HTTP $TTOK_CODE body: $(echo "$TTOK_RESP" | redact_token_body)"
rm -f "$TTOK_FILE"
fi
[ -n "$WTOK" ] || fail "$rt workspace did not return or mint an auth_token — cannot drive its MCP call (create_resp: $(echo "$R" | redact_token_body); token_fallbacks: $TOKEN_DIAG)"
WS_IDS[$rt]="$WID"
WS_TOKENS[$rt]="$WTOK"
ALL_WS_IDS="$ALL_WS_IDS $WID"
+8 -2
View File
@@ -179,8 +179,14 @@ echo "--- Phase 3.5: Python parser classifies real server response (#2967) ---"
PARSE_RESULT=$(WORKSPACE_ID="00000000-0000-0000-0000-000000000001" \
python3 -c "
import json, sys
sys.path.insert(0, '$(cd "$(dirname "$0")/../../workspace" && pwd)')
import a2a_response
try:
from molecule_runtime import a2a_response
except ModuleNotFoundError as exc:
raise SystemExit(
'molecule-ai-workspace-runtime is required for poll-mode parser '
'coverage; install it from the Gitea package registry before running '
'this E2E'
) from exc
data = json.loads(r'''$A2A_RESP''')
v = a2a_response.parse(data)
print(type(v).__name__)
+49 -12
View File
@@ -25,6 +25,13 @@ source "$(dirname "$0")/_lib.sh" # sets BASE default + helpers
PASS=0
FAIL=0
TIMEOUT="${E2E_TIMEOUT:-60}"
ADMIN_BEARER="${MOLECULE_ADMIN_TOKEN:-${ADMIN_TOKEN:-}}"
ADMIN_AUTH=()
[ -n "$ADMIN_BEARER" ] && ADMIN_AUTH=(-H "Authorization: Bearer $ADMIN_BEARER")
WS_A_TOKEN=""
WS_A_AUTH=()
WS_B_TOKEN=""
WS_B_AUTH=()
check() {
local desc="$1" expected="$2" actual="$3"
@@ -75,15 +82,26 @@ echo "--- A. Per-workspace MCP server-name slug uniqueness ---"
WS_A_NAME="e2e-cov-alpha-$$"
WS_B_NAME="e2e-cov-beta-$$"
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"$WS_A_NAME\",\"tier\":1}")
check "POST /workspaces (alpha)" '"status":"provisioning"' "$R"
R=$(curl -s -X POST "$BASE/workspaces" "${ADMIN_AUTH[@]}" -H "Content-Type: application/json" \
-d "{\"name\":\"$WS_A_NAME\",\"runtime\":\"external\",\"external\":true,\"tier\":1}")
check "POST /workspaces (alpha)" '"status":"awaiting_agent"' "$R"
WS_A_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin).get('id',''))")
if [ -n "$WS_A_ID" ]; then
WS_A_TOKEN=$(e2e_mint_test_token "$WS_A_ID" 2>/dev/null || true)
[ -n "$WS_A_TOKEN" ] && WS_A_AUTH=(-H "Authorization: Bearer $WS_A_TOKEN")
if [ -z "$ADMIN_BEARER" ] && [ -n "$WS_A_TOKEN" ]; then
ADMIN_AUTH=(-H "Authorization: Bearer $WS_A_TOKEN")
fi
fi
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"$WS_B_NAME\",\"tier\":1}")
check "POST /workspaces (beta)" '"status":"provisioning"' "$R"
R=$(curl -s -X POST "$BASE/workspaces" "${ADMIN_AUTH[@]}" -H "Content-Type: application/json" \
-d "{\"name\":\"$WS_B_NAME\",\"runtime\":\"external\",\"external\":true,\"tier\":1}")
check "POST /workspaces (beta)" '"status":"awaiting_agent"' "$R"
WS_B_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin).get('id',''))")
if [ -n "$WS_B_ID" ]; then
WS_B_TOKEN=$(e2e_mint_test_token "$WS_B_ID" 2>/dev/null || true)
[ -n "$WS_B_TOKEN" ] && WS_B_AUTH=(-H "Authorization: Bearer $WS_B_TOKEN")
fi
# external/connection returns the install-snippet. The per-workspace
# fix (mc#1535) derives the MCP name as molecule-<slug>; mc#1536 extends
@@ -91,8 +109,10 @@ WS_B_ID=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin).ge
# grep the `claude mcp add` line, and assert the names differ.
if [ -n "$WS_A_ID" ] && [ -n "$WS_B_ID" ]; then
SNIPPET_A=$(curl -s --max-time "$TIMEOUT" \
"${WS_A_AUTH[@]}" \
"$BASE/workspaces/$WS_A_ID/external/connection")
SNIPPET_B=$(curl -s --max-time "$TIMEOUT" \
"${WS_B_AUTH[@]}" \
"$BASE/workspaces/$WS_B_ID/external/connection")
MCP_A=$(echo "$SNIPPET_A" | python3 -c "
@@ -151,7 +171,11 @@ import sys, json, re
d=json.load(sys.stdin)
def find(o):
if isinstance(o,str):
m=re.search(r'\[mcp_servers\.([^\]]+)\]',o); return m.group(1) if m else None
for m in re.finditer(r'\[mcp_servers\.([^\]]+)\]',o):
name=m.group(1)
if name.startswith('molecule-') and '<' not in name:
return name
return None
if isinstance(o,dict):
for v in o.values():
r=find(v)
@@ -168,7 +192,11 @@ import sys, json, re
d=json.load(sys.stdin)
def find(o):
if isinstance(o,str):
m=re.search(r'\[mcp_servers\.([^\]]+)\]',o); return m.group(1) if m else None
for m in re.finditer(r'\[mcp_servers\.([^\]]+)\]',o):
name=m.group(1)
if name.startswith('molecule-') and '<' not in name:
return name
return None
if isinstance(o,dict):
for v in o.values():
r=find(v)
@@ -212,7 +240,7 @@ echo "--- B. GIT_ASKPASS + GIT_HTTP_* env injection (mc#1525 + mc#1542) ---"
if [ -n "${WS_A_ID:-}" ]; then
# Wait briefly for provisioning to expose the container.
for _ in 1 2 3 4 5 6 7 8 9 10; do
R=$(curl -s "$BASE/workspaces/$WS_A_ID")
R=$(curl -s "${ADMIN_AUTH[@]}" "$BASE/workspaces/$WS_A_ID")
STATUS=$(echo "$R" | python3 -c "import sys,json; print(json.load(sys.stdin).get('status',''))" 2>/dev/null)
[ "$STATUS" = "online" ] && break
sleep 1
@@ -225,7 +253,7 @@ if [ -n "${WS_A_ID:-}" ]; then
# acceptable for the dev platform). The point is that the KEYS are
# propagated by the post-#1542 provisioner — pre-#1542 these keys
# were absent entirely.
DEBUG=$(curl -s "$BASE/admin/workspaces/$WS_A_ID/debug" 2>/dev/null || true)
DEBUG=$(curl -s "${ADMIN_AUTH[@]}" "$BASE/admin/workspaces/$WS_A_ID/debug" 2>/dev/null || true)
if [ -n "$DEBUG" ] && echo "$DEBUG" | grep -q "workspace_secrets"; then
# Presence-only check: KEY in the secrets map, value MAY be empty
# in dev where no persona is bound.
@@ -261,6 +289,7 @@ if [ -n "${WS_A_ID:-}" ]; then
# The expected response shape post-fix is a structured failure (HTTP
# 4xx or success:false JSON) — NOT a queued task that round-trips.
R=$(curl -s --max-time 10 -X POST "$BASE/workspaces/$WS_A_ID/delegate" \
"${WS_A_AUTH[@]}" \
-H "Content-Type: application/json" \
-d "{\"target_workspace_id\":\"$WS_A_ID\",\"task\":\"self-echo-test\"}" 2>&1)
# Either the API gate (delegation.go) rejects, OR the inbox guard
@@ -281,7 +310,7 @@ if [ -n "${WS_A_ID:-}" ]; then
# an inboxable peer_agent kind. The /activity endpoint is the inbox
# poller's source-of-truth.
sleep 2
AL=$(curl -s "$BASE/workspaces/$WS_A_ID/activity" 2>/dev/null || echo '[]')
AL=$(curl -s "${WS_A_AUTH[@]}" "$BASE/workspaces/$WS_A_ID/activity" 2>/dev/null || echo '[]')
# Count rows where source_id == workspace_id AND method != "delegate_result".
ECHO_COUNT=$(echo "$AL" | python3 -c "
import sys, json
@@ -315,7 +344,15 @@ echo
echo "--- Cleanup ---"
for wid in "${WS_A_ID:-}" "${WS_B_ID:-}"; do
[ -n "$wid" ] || continue
curl -s -X DELETE "$BASE/workspaces/$wid?confirm=true" > /dev/null || true
DELETE_AUTH=("${ADMIN_AUTH[@]}")
if [ -z "$ADMIN_BEARER" ]; then
if [ "$wid" = "${WS_A_ID:-}" ]; then
DELETE_AUTH=("${WS_A_AUTH[@]}")
elif [ "$wid" = "${WS_B_ID:-}" ]; then
DELETE_AUTH=("${WS_B_AUTH[@]}")
fi
fi
curl -s -X DELETE "$BASE/workspaces/$wid?confirm=true" "${DELETE_AUTH[@]}" > /dev/null || true
echo "deleted $wid"
done
+193
View File
@@ -0,0 +1,193 @@
"""Tests for `.gitea/scripts/detect-changes.py`."""
from __future__ import annotations
import importlib.util
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parents[1]
SCRIPT = REPO_ROOT / ".gitea" / "scripts" / "detect-changes.py"
def load_module():
spec = importlib.util.spec_from_file_location("detect_changes", SCRIPT)
assert spec is not None
module = importlib.util.module_from_spec(spec)
assert spec.loader is not None
spec.loader.exec_module(module)
return module
def test_ci_profile_classifies_surfaces():
mod = load_module()
assert mod.classify("ci", ["workspace-server/internal/handlers/a2a_proxy.go"]) == {
"platform": True,
"canvas": False,
"python": False,
"scripts": False,
}
assert mod.classify("ci", ["canvas/src/app/page.tsx"]) == {
"platform": False,
"canvas": True,
"python": False,
"scripts": False,
}
assert mod.classify("ci", ["tests/e2e/test_model_slug.sh"]) == {
"platform": False,
"canvas": False,
"python": False,
"scripts": True,
}
assert mod.classify("ci", [".gitea/workflows/ci.yml", "README.md"]) == {
"platform": False,
"canvas": False,
"python": False,
"scripts": False,
}
def test_handlers_postgres_profile_is_narrower_than_workspace_server():
mod = load_module()
assert mod.classify("handlers-postgres", ["workspace-server/internal/handlers/a2a_proxy.go"]) == {
"handlers": True,
}
assert mod.classify("handlers-postgres", ["workspace-server/internal/provisioner/provisioner.go"]) == {
"handlers": False,
}
def test_e2e_api_profile_covers_api_inputs():
mod = load_module()
assert mod.classify("e2e-api", ["workspace-server/internal/handlers/workspace.go"]) == {
"api": True,
}
assert mod.classify("e2e-api", ["tests/e2e/test_api.sh"]) == {"api": True}
assert mod.classify("e2e-api", ["canvas/src/app/page.tsx"]) == {"api": False}
def test_fail_open_all_true_for_missing_base():
mod = load_module()
assert mod.all_true("ci") == {
"platform": True,
"canvas": True,
"python": True,
"scripts": True,
}
def test_fetch_base_prefers_advertised_base_ref(monkeypatch):
mod = load_module()
calls: list[list[str]] = []
exists_checks = 0
def fake_base_exists(base: str) -> bool:
nonlocal exists_checks
exists_checks += 1
return exists_checks >= 1
def fake_run_git(args: list[str], *, timeout: int = 30):
calls.append(args)
class Result:
returncode = 0
stdout = ""
stderr = ""
return Result()
monkeypatch.setattr(mod, "base_exists", fake_base_exists)
monkeypatch.setattr(mod, "run_git", fake_run_git)
mod.fetch_base("abc123", "main")
assert calls == [["fetch", "--depth=1", "origin", "main"]]
def test_fetch_base_falls_back_to_sha_when_ref_fetch_does_not_materialize(monkeypatch):
mod = load_module()
calls: list[list[str]] = []
monkeypatch.setattr(mod, "base_exists", lambda _base: False)
def fake_run_git(args: list[str], *, timeout: int = 30):
calls.append(args)
class Result:
returncode = 0
stdout = ""
stderr = ""
return Result()
monkeypatch.setattr(mod, "run_git", fake_run_git)
mod.fetch_base("abc123", "main")
assert calls == [
["fetch", "--depth=1", "origin", "main"],
["fetch", "--depth=1", "origin", "abc123"],
]
def test_changed_paths_uses_merge_base_for_pull_request(monkeypatch):
mod = load_module()
calls: list[list[str]] = []
def fake_run_git(args: list[str], *, timeout: int = 30):
calls.append(args)
class Result:
returncode = 0
stdout = "workspace/agent.py\n"
stderr = ""
if args[0] == "merge-base":
Result.stdout = "merge123\n"
return Result()
monkeypatch.setattr(mod, "run_git", fake_run_git)
assert mod.changed_paths("base123", use_merge_base=True) == ["workspace/agent.py"]
assert calls == [
["merge-base", "base123", "HEAD"],
["diff", "--name-only", "merge123", "HEAD"],
]
def test_detect_deepens_base_ref_when_pr_merge_base_missing(monkeypatch):
mod = load_module()
calls: list[tuple[str, str | None]] = []
merge_base_calls = 0
monkeypatch.setattr(mod, "base_exists", lambda _base: True)
def fake_merge_base(base: str):
nonlocal merge_base_calls
merge_base_calls += 1
if merge_base_calls == 1:
return None
return "merge123"
def fake_deepen_base_ref(base_ref: str):
calls.append(("deepen", base_ref))
def fake_changed_paths(base: str, *, use_merge_base: bool):
calls.append(("changed", str(use_merge_base)))
return [".gitea/workflows/ci.yml"]
monkeypatch.setattr(mod, "merge_base", fake_merge_base)
monkeypatch.setattr(mod, "deepen_base_ref", fake_deepen_base_ref)
monkeypatch.setattr(mod, "changed_paths", fake_changed_paths)
assert mod.detect("ci", "pull_request", "base123", "", "main") == {
"platform": False,
"canvas": False,
"python": False,
"scripts": False,
}
assert calls == [("deepen", "main"), ("changed", "True")]
+28
View File
@@ -0,0 +1,28 @@
from pathlib import Path
import yaml
ROOT = Path(__file__).resolve().parents[1]
def workflow_on(path: Path):
doc = yaml.safe_load(path.read_text())
return doc.get("on") or doc.get(True)
def test_browser_e2e_workflows_are_not_unconditional_pr_heavy_lanes():
workflows = [
ROOT / ".gitea/workflows/e2e-chat.yml",
ROOT / ".gitea/workflows/e2e-staging-canvas.yml",
]
for path in workflows:
text = path.read_text()
events = workflow_on(path)
assert "workflow_dispatch" in events
assert "schedule" in events
assert "merge-queue" in text
assert "/issues/${{ github.event.pull_request.number }}/labels" in text
assert "PR is not in merge-queue" in text
+72 -7
View File
@@ -26,9 +26,11 @@ import re
import subprocess
import sys
import textwrap
import importlib.util
from pathlib import Path
import pytest # noqa: F401 (declares the dep)
import yaml
REPO_ROOT = Path(__file__).resolve().parents[1]
SCRIPT = REPO_ROOT / ".gitea" / "scripts" / "lint-workflow-yaml.py"
@@ -616,16 +618,24 @@ def test_rule10_docker_info_head_in_separate_step_without_pipefail_passes(tmp_pa
CI_WORKFLOW = REPO_ROOT / ".gitea" / "workflows" / "ci.yml"
CI_SURFACES = ("platform", "canvas", "python", "scripts")
DETECT_CHANGES_SCRIPT = REPO_ROOT / ".gitea" / "scripts" / "detect-changes.py"
def _load_detect_changes():
spec = importlib.util.spec_from_file_location("detect_changes", DETECT_CHANGES_SCRIPT)
assert spec is not None
module = importlib.util.module_from_spec(spec)
assert spec.loader is not None
spec.loader.exec_module(module)
return module
def _ci_change_patterns() -> dict[str, re.Pattern[str]]:
text = CI_WORKFLOW.read_text(encoding="utf-8")
patterns: dict[str, re.Pattern[str]] = {}
for surface, pattern in re.findall(
r'echo "(platform|canvas|python|scripts)=.*?grep -qE \'([^\']+)\'',
text,
):
patterns[surface] = re.compile(pattern)
detect_changes = _load_detect_changes()
patterns = {
surface: re.compile(pattern)
for surface, pattern in detect_changes.PROFILES["ci"].items()
}
assert set(patterns) == set(CI_SURFACES)
return patterns
@@ -693,3 +703,58 @@ def test_ci_change_detector_docs_and_meta_scripts_do_not_trigger_surfaces():
"python": False,
"scripts": False,
}
def test_ci_platform_go_pr_steps_are_path_scoped():
doc = yaml.safe_load(CI_WORKFLOW.read_text(encoding="utf-8"))
platform = doc["jobs"]["platform-build"]
assert platform.get("needs") == "changes"
expensive_steps = [
step
for step in platform["steps"]
if step.get("uses")
or step.get("run", "").startswith("go ")
or "golangci-lint" in step.get("run", "")
]
assert expensive_steps
for step in expensive_steps:
expr = step.get("if", "")
assert "github.event_name != 'pull_request'" in expr
assert "needs.changes.outputs.platform == 'true'" in expr
def test_ci_canvas_nextjs_pr_steps_are_path_scoped():
doc = yaml.safe_load(CI_WORKFLOW.read_text(encoding="utf-8"))
canvas = doc["jobs"]["canvas-build"]
assert canvas.get("needs") == "changes"
expensive_steps = [
step
for step in canvas["steps"]
if step.get("uses")
or step.get("run", "").startswith("npm ")
or step.get("run", "").startswith("npx ")
]
assert expensive_steps
for step in expensive_steps:
expr = step.get("if", "")
assert "github.event_name != 'pull_request'" in expr
assert "needs.changes.outputs.canvas == 'true'" in expr
def test_ci_shellcheck_pr_steps_are_path_scoped():
doc = yaml.safe_load(CI_WORKFLOW.read_text(encoding="utf-8"))
shellcheck = doc["jobs"]["shellcheck"]
assert shellcheck.get("needs") == "changes"
expensive_steps = [
step
for step in shellcheck["steps"]
if step.get("uses") or step.get("run", "").startswith(("bash ", "find ", "shellcheck "))
]
assert expensive_steps
for step in expensive_steps:
expr = step.get("if", "")
assert "github.event_name != 'pull_request'" in expr
assert "needs.changes.outputs.scripts == 'true'" in expr
+226
View File
@@ -56,6 +56,21 @@ SCRIPT_PATH = (
)
@pytest.fixture(autouse=True)
def _stub_time_sleep(monkeypatch):
"""Autouse: stub time.sleep across every test.
The watchdog's RECHECK_DELAY_SECS (default 90s) is wired into
run_once() via time.sleep(). Without this stub, integration-style
tests that exercise run_once() would each block for 90s — a
pre-fix `pytest -q` ran in ~0.1s; the unstubbed equivalent took
>4 minutes (task #394 review evidence). Stubbing here keeps the
suite fast and deterministic without requiring every red-path test
to remember the patch.
"""
monkeypatch.setattr("time.sleep", lambda s: None)
@pytest.fixture(scope="module")
def wd_module():
"""Import the script as a module under a known env."""
@@ -809,3 +824,214 @@ def test_require_runtime_env_exits_when_missing(wd_module, monkeypatch):
with pytest.raises(SystemExit) as excinfo:
wd_module._require_runtime_env()
assert excinfo.value.code == 2
# --------------------------------------------------------------------------
# Action-run status filter + HEAD-recheck (task #394, mc#1597..1630)
#
# The existing cancel-cascade filter matched description=='Has been
# cancelled' EXACTLY, but a 7-day DB sweep on 2026-05-20 showed that
# only 76/702 (~11%) of action_run.status=3 (Cancelled) entries carry
# that string — 89% are written as 'Failing after Ns', indistinguishable
# from real action_run.status=2 (Failure) at the commit_status layer.
#
# Gitea 1.22.6 has NO REST endpoint exposing action_run.status, so the
# canonical filter (status=2 only) cannot run from a Gitea Actions
# runner. The next-best signal is the HEAD-recheck: re-fetch HEAD SHA
# (or its combined status) right before filing. If HEAD moved on or
# combined state recovered, the prior "red" was a transient
# cancel-cascade and we skip-file.
#
# References:
# - reference_chronic_red_sweep_cancelled_vs_failed_filter
# - feedback_gitea_status_enum_use_helper_not_raw_int
# - reference_gitea_action_status_enum_corrected_2026_05_19
# - triage evidence 2026-05-21 04:55 (6 cancellation + 1 emission
# artifact across mc#1597,1605,1609,1613,1626,1627,1630)
# --------------------------------------------------------------------------
def test_head_recheck_skips_file_when_head_moved(wd_module, monkeypatch, capsys):
"""When initial tick sees red at SHA_A but HEAD has since moved to
SHA_B (next commit landed mid-tick), the watchdog must NOT file.
Re-evaluation happens on the next cron tick against the new SHA.
REGRESSION CLASS: this guards mc#1597..#1630 — 7 false-positives
filed in 24h because cancel-cascade fired commit_status=failure
rows on SHAs that were already superseded by new merges."""
SHA_A = SHA_RED
SHA_B = SHA_GREEN
failed_ctx = [
{"context": "ci/test", "status": "failure",
"target_url": "/r/runs/100/jobs/0",
"description": "Failing after 12s"},
]
# First branches read returns SHA_A; the second (recheck) returns SHA_B
# → watchdog detects HEAD drift and skip-files.
branches_responses = iter([
(200, _branches_response(SHA_A)),
(200, _branches_response(SHA_B)),
])
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path == "/repos/owner/repo/branches/main":
return next(branches_responses)
if method == "GET" and path == f"/repos/owner/repo/commits/{SHA_A}/status":
return (200, _combined_status("failure", failed_ctx))
if method == "POST" and path == "/repos/owner/repo/issues":
raise AssertionError(
"watchdog filed a phantom issue despite HEAD moving away "
"from the red SHA (regression: mc#1597..1630)"
)
if method == "GET" and path == "/repos/owner/repo/issues":
return (200, [])
raise AssertionError(f"unexpected api call: {method} {path}")
# Settling delay is no-op'd by the _stub_time_sleep autouse fixture.
monkeypatch.setattr(wd_module, "api", fake_api)
wd_module.run_once(dry_run=False)
captured = capsys.readouterr()
assert "head drift" in captured.out.lower() or "head moved" in captured.out.lower(), (
f"expected a notice about HEAD drift, got: {captured.out!r}"
)
def test_head_recheck_skips_file_when_recheck_status_recovered(
wd_module, monkeypatch, capsys,
):
"""When initial tick sees red at SHA, but the post-settling recheck
on the SAME SHA shows combined status recovered (e.g. transient
cancel-cascade rolled forward to success on retry), skip-file.
This catches the mid-flight cancel-cascade window — the second
largest false-positive cluster in mc#1597..1630."""
failed_ctx_initial = [
{"context": "ci/test", "status": "failure",
"target_url": "/r/runs/100/jobs/0",
"description": "Failing after 12s"},
]
recovered_ctx = [
{"context": "ci/test", "status": "success",
"target_url": "/r/runs/100/jobs/0",
"description": "Successful in 30s"},
]
# Same SHA across both branch reads; status flips from failure→success
# between the two combined-status reads.
status_responses = iter([
(200, _combined_status("failure", failed_ctx_initial)),
(200, _combined_status("success", recovered_ctx)),
])
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path == "/repos/owner/repo/branches/main":
return (200, _branches_response(SHA_RED))
if method == "GET" and path == f"/repos/owner/repo/commits/{SHA_RED}/status":
return next(status_responses)
if method == "POST" and path == "/repos/owner/repo/issues":
raise AssertionError(
"watchdog filed a phantom issue despite combined status "
"recovering on recheck (mid-flight cancel-cascade window)"
)
if method == "GET" and path == "/repos/owner/repo/issues":
return (200, [])
raise AssertionError(f"unexpected api call: {method} {path}")
monkeypatch.setattr(wd_module, "api", fake_api)
wd_module.run_once(dry_run=False)
captured = capsys.readouterr()
assert "recovered" in captured.out.lower() or "settled" in captured.out.lower(), (
f"expected a notice about post-settling recovery, got: {captured.out!r}"
)
def test_head_recheck_files_when_still_red_after_settling(
wd_module, monkeypatch,
):
"""When BOTH the initial detection AND the post-settling recheck
show the same SHA still red, file the issue. This is the genuine-
failure path the watchdog is designed to surface.
Locks the over-filter: a future change that always-skips after
recheck would dismiss real failures."""
failed_ctx = [
{"context": "ci/test", "status": "failure",
"target_url": "/r/runs/100/jobs/0",
"description": "Failing after 12s"},
]
post_filed = {"value": False}
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path == "/repos/owner/repo/branches/main":
return (200, _branches_response(SHA_RED))
if method == "GET" and path == f"/repos/owner/repo/commits/{SHA_RED}/status":
return (200, _combined_status("failure", failed_ctx))
if method == "GET" and path == "/repos/owner/repo/issues":
return (200, [])
if method == "GET" and path == "/repos/owner/repo/labels":
return (200, [{"id": 9, "name": "tier:high"}])
if method == "POST" and path == "/repos/owner/repo/issues":
post_filed["value"] = True
return (201, {"number": 999})
if method == "POST" and path == "/repos/owner/repo/issues/999/labels":
return (200, [])
raise AssertionError(f"unexpected api call: {method} {path}")
monkeypatch.setattr(wd_module, "api", fake_api)
wd_module.run_once(dry_run=False)
assert post_filed["value"], (
"genuine-failure path was skip-filed — head-recheck over-filter "
"regression (would suppress all real main-red alarms)"
)
def test_head_recheck_skips_when_initial_was_only_cancel_cascade(
wd_module, monkeypatch,
):
"""Belt-and-braces: combined-status failure caused exclusively by
description='Has been cancelled' entries should still be filtered
by the EXISTING cancel-cascade filter — head-recheck must not
accidentally bypass it. Regression guard for the existing mc#1564
fix."""
failed_ctx = [
{"context": "ci/test", "status": "failure",
"description": "Has been cancelled"},
]
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path == "/repos/owner/repo/branches/main":
return (200, _branches_response(SHA_RED))
if method == "GET" and path == f"/repos/owner/repo/commits/{SHA_RED}/status":
return (200, _combined_status("failure", failed_ctx))
if method == "POST" and path == "/repos/owner/repo/issues":
raise AssertionError(
"cancel-cascade-only entry must be filtered before any "
"head-recheck logic runs"
)
if method == "GET" and path == "/repos/owner/repo/issues":
return (200, [])
# No commit-status recheck should happen because is_red() returned False
raise AssertionError(f"unexpected api call: {method} {path}")
monkeypatch.setattr(wd_module, "api", fake_api)
wd_module.run_once(dry_run=False)
# success: no AssertionError raised, no POST
def test_resolve_action_run_status_returns_none_on_no_endpoint(wd_module):
"""The action_run.status REST endpoint does NOT exist in Gitea
1.22.6 (verified empirically 2026-05-20 — /api/v1/.../actions/runs/N
returns HTTP 404 across all probe variants). The resolver must
return None gracefully so callers fall back to the description-
string + head-recheck heuristics.
This pins the extensibility hook: when a future Gitea release (or
an op-host proxy) exposes the endpoint, the resolver implementation
can be swapped in without touching the caller contract."""
# The function exists and is callable
assert hasattr(wd_module, "_resolve_action_run_status")
# A typical target_url shape from real Gitea commit_status rows:
target_url = "/molecule-ai/molecule-core/actions/runs/75020/jobs/0"
# Return None when no endpoint available
out = wd_module._resolve_action_run_status(target_url)
assert out is None, (
"resolver must return None when the action_run.status endpoint "
"isn't reachable — callers depend on the None-fallback path"
)
+40
View File
@@ -442,6 +442,46 @@ def test_reap_preserves_real_push(sr_module, monkeypatch):
assert calls == [] # NO POST
def test_reap_compensates_cancelled_real_push_status(sr_module, monkeypatch):
"""Gitea 1.22.6 maps cancelled push runs to failure statuses.
A real push workflow with description exactly "Has been cancelled"
is cancel-cascade noise, not a defect signal. Status-reaper should
compensate it even though the workflow has a push trigger.
"""
calls = []
def fake_api(method, path, *, body=None, query=None, expect_json=True):
calls.append((method, path, body))
return (201, {})
monkeypatch.setattr(sr_module, "api", fake_api)
workflow_map = {"ci": True}
combined = {
"state": "failure",
"statuses": [
{
"context": "ci / test (push)",
"status": "failure",
"description": "Has been cancelled",
"target_url": "https://example.test/actions/runs/1",
}
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 1
assert counters["compensated_cancelled_push"] == 1
assert counters["preserved_real_push"] == 0
assert len(calls) == 1
assert calls[0][0] == "POST"
assert calls[0][1] == f"/repos/owner/repo/statuses/{SHA}"
assert calls[0][2]["context"] == "ci / test (push)"
assert calls[0][2]["state"] == "success"
def test_reap_preserves_unknown_workflow(sr_module, monkeypatch, capsys):
"""Workflow not in map → ::notice:: + skip (conservative)."""
monkeypatch.setattr(
@@ -686,11 +686,22 @@ func (h *WorkspaceHandler) resolveAgentURL(ctx context.Context, workspaceID stri
_ = db.CacheURL(ctx, workspaceID, agentURL)
}
// When the platform runs inside Docker, 127.0.0.1:{host_port} is
// unreachable (it's the platform container's own localhost, not the
// Docker host). Rewrite to the container's Docker-bridge hostname.
// When the platform runs inside Docker, a managed workspace's
// 127.0.0.1:{host_port} URL points at the Docker host and must be
// rewritten to the workspace container's Docker-bridge hostname.
// External runtimes are not managed containers; their local test/runtime
// URL is the target and must not be synthesized into ws-<id>:8000.
if strings.HasPrefix(agentURL, "http://127.0.0.1:") && h.provisioner != nil && platformInDocker {
agentURL = provisioner.InternalURL(workspaceID)
var wsRuntime string
if err := db.DB.QueryRowContext(ctx,
`SELECT COALESCE(runtime, 'langgraph') FROM workspaces WHERE id = $1`,
workspaceID,
).Scan(&wsRuntime); err != nil {
log.Printf("ProxyA2A: runtime lookup before Docker URL rewrite failed for %s: %v", workspaceID, err)
}
if !isExternalLikeRuntime(wsRuntime) {
agentURL = provisioner.InternalURL(workspaceID)
}
}
// SSRF defence: reject private/metadata URLs before making outbound call.
if err := isSafeURL(agentURL); err != nil {
@@ -1511,6 +1511,35 @@ func TestResolveAgentURL_DockerRewrite(t *testing.T) {
}
}
func TestResolveAgentURL_ExternalRuntimeLoopbackNotRewrittenInDocker(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
handler := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
waitForHandlerAsyncBeforeDBCleanup(t, handler)
handler.provisioner = &stubLocalProv{}
restore := setPlatformInDockerForTest(true)
defer restore()
agentURL := "http://127.0.0.1:55555"
mr.Set("ws:ws-external:url", agentURL)
mock.ExpectQuery("SELECT COALESCE\\(runtime").
WithArgs("ws-external").
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("external"))
url, perr := handler.resolveAgentURL(context.Background(), "ws-external")
if perr != nil {
t.Fatalf("unexpected error: %+v", perr)
}
if url != agentURL {
t.Errorf("external runtime loopback URL must not be rewritten; got %q want %q", url, agentURL)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// --- dispatchA2A direct unit tests ---
func TestDispatchA2A_BuildRequestError(t *testing.T) {
@@ -0,0 +1,72 @@
package handlers
import (
"database/sql"
"fmt"
"log"
"net/http"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
)
// AdminWorkspaceTokenHandler lets tenant admins mint the first workspace
// bearer for managed SaaS workspaces whose runtime receives its token later
// through registry registration.
type AdminWorkspaceTokenHandler struct{}
func NewAdminWorkspaceTokenHandler() *AdminWorkspaceTokenHandler {
return &AdminWorkspaceTokenHandler{}
}
// Create handles POST /admin/workspaces/:id/tokens. The route must be mounted
// behind AdminAuth; the plaintext token is returned exactly once.
func (h *AdminWorkspaceTokenHandler) Create(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
var existing string
err := db.DB.QueryRowContext(c.Request.Context(),
`SELECT id FROM workspaces WHERE id = $1 AND status <> 'removed'`,
workspaceID).Scan(&existing)
if err != nil {
if err == sql.ErrNoRows {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
}
log.Printf("admin workspace tokens: workspace lookup failed for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "workspace lookup failed"})
return
}
var count int
if err := db.DB.QueryRowContext(c.Request.Context(),
`SELECT COUNT(*) FROM workspace_auth_tokens WHERE workspace_id = $1 AND revoked_at IS NULL`,
workspaceID).Scan(&count); err != nil {
log.Printf("admin workspace tokens: count failed for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to count tokens"})
return
}
if count >= maxTokensPerWorkspace {
c.JSON(http.StatusTooManyRequests, gin.H{"error": fmt.Sprintf("maximum %d active tokens per workspace", maxTokensPerWorkspace)})
return
}
token, err := wsauth.IssueToken(c.Request.Context(), db.DB, workspaceID)
if err != nil {
log.Printf("admin workspace tokens: issue failed for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to create token"})
return
}
log.Printf("admin workspace tokens: issued token for workspace %s", workspaceID)
c.JSON(http.StatusCreated, gin.H{
"auth_token": token,
"workspace_id": workspaceID,
"message": "Save this token now — it cannot be retrieved again.",
})
}
@@ -0,0 +1,102 @@
package handlers
import (
"encoding/json"
"errors"
"net/http"
"testing"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
func TestAdminWorkspaceTokenHandler_Create_HappyPath(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
mock.ExpectQuery(`SELECT id FROM workspaces WHERE id = \$1 AND status <> 'removed'`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(wsUUID1))
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(0))
mock.ExpectExec(`INSERT INTO workspace_auth_tokens`).
WithArgs(wsUUID1, sqlmock.AnyArg(), sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(1, 1))
w := makeReq(t, NewAdminWorkspaceTokenHandler().Create, "POST",
"/admin/workspaces/"+wsUUID1+"/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusCreated {
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
}
var body struct {
AuthToken string `json:"auth_token"`
WorkspaceID string `json:"workspace_id"`
}
if err := json.Unmarshal(w.Body.Bytes(), &body); err != nil {
t.Fatalf("decode: %v", err)
}
if body.AuthToken == "" || body.WorkspaceID != wsUUID1 {
t.Fatalf("unexpected body: %+v", body)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet: %v", err)
}
}
func TestAdminWorkspaceTokenHandler_Create_MissingWorkspace(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
mock.ExpectQuery(`SELECT id FROM workspaces WHERE id = \$1 AND status <> 'removed'`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"id"}))
w := makeReq(t, NewAdminWorkspaceTokenHandler().Create, "POST",
"/admin/workspaces/"+wsUUID1+"/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusNotFound {
t.Fatalf("expected 404, got %d: %s", w.Code, w.Body.String())
}
}
func TestAdminWorkspaceTokenHandler_Create_RateLimited(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
mock.ExpectQuery(`SELECT id FROM workspaces WHERE id = \$1 AND status <> 'removed'`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(wsUUID1))
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(maxTokensPerWorkspace))
w := makeReq(t, NewAdminWorkspaceTokenHandler().Create, "POST",
"/admin/workspaces/"+wsUUID1+"/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusTooManyRequests {
t.Fatalf("expected 429, got %d: %s", w.Code, w.Body.String())
}
}
func TestAdminWorkspaceTokenHandler_Create_IssueFails(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
mock.ExpectQuery(`SELECT id FROM workspaces WHERE id = \$1 AND status <> 'removed'`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(wsUUID1))
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(0))
mock.ExpectExec(`INSERT INTO workspace_auth_tokens`).
WillReturnError(errors.New("disk full"))
w := makeReq(t, NewAdminWorkspaceTokenHandler().Create, "POST",
"/admin/workspaces/"+wsUUID1+"/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusInternalServerError {
t.Fatalf("expected 500, got %d: %s", w.Code, w.Body.String())
}
}
@@ -122,8 +122,22 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
// #548 — prevent self-delegation: a workspace delegating to itself
// acquires _run_lock twice on the same mutex, deadlocking permanently.
//
// #383 — the error message is the agent-visible string when this 400
// fires on the SDK's _delegate_sync_via_polling path. The previous
// terse "self-delegation not permitted" was correct but indistinct
// from a transient rate-limit or auth failure, so the LLM would
// re-attempt every 2-3s in a tight loop (chloe-dong tenant external
// workspace, 2026-05-20). The expanded message is explicit about
// (a) what just happened, (b) why it cannot succeed, (c) what to do
// instead — so the agent's retry heuristic recognizes the path as
// terminal and stops.
if sourceID == body.TargetID {
c.JSON(http.StatusBadRequest, gin.H{"error": "self-delegation not permitted"})
c.JSON(http.StatusBadRequest, gin.H{
"error": "self-delegation not permitted",
"reason": "the source workspace and target workspace are the same; you cannot delegate a task to yourself",
"hint": "do the work yourself, or pick a different peer via list_peers — retrying with the same target_id will fail every time",
})
return
}
@@ -256,24 +256,43 @@ func (h *DiscoveryHandler) Peers(c *gin.Context) {
peers = append(peers, siblings...)
}
// Children
// Children — exclude self defensively. A child row whose parent_id
// equals the requesting workspaceID can never legitimately be the
// caller (a workspace can't be its own child), but a future data-
// integrity defect (e.g. self-loop introduced by a buggy register
// path) would otherwise smuggle the caller back into its own peer
// list. The agent then attempts `delegate_task(<own_id>)`, which
// either deadlocks _run_lock (sync path) or hits the platform's
// self-delegation 400 in a tight loop (#383). The `w.id != $2`
// clause makes self-delegation-via-peer-list impossible regardless
// of DB state.
children, _ := queryPeerMaps(`
SELECT w.id, w.name, COALESCE(w.role, ''), w.tier, w.status,
COALESCE(w.agent_card, 'null'::jsonb), COALESCE(w.url, ''),
w.parent_id, w.active_tasks
FROM workspaces w WHERE w.parent_id = $1 AND w.status != 'removed'`, workspaceID)
FROM workspaces w WHERE w.parent_id = $1 AND w.id != $2 AND w.status != 'removed'`,
workspaceID, workspaceID)
peers = append(peers, children...)
// Parent
// Parent — same defense-in-depth. A workspace whose parent_id points
// to itself is data corruption, but the peer-list endpoint must not
// propagate that corruption back to the agent as a "peer who is also
// you" entry.
if parentID.Valid {
parent, _ := queryPeerMaps(`
SELECT w.id, w.name, COALESCE(w.role, ''), w.tier, w.status,
COALESCE(w.agent_card, 'null'::jsonb), COALESCE(w.url, ''),
w.parent_id, w.active_tasks
FROM workspaces w WHERE w.id = $1 AND w.status != 'removed'`, parentID.String)
FROM workspaces w WHERE w.id = $1 AND w.id != $2 AND w.status != 'removed'`,
parentID.String, workspaceID)
peers = append(peers, parent...)
}
// #383 final-line defense: even if a future code path adds a query
// that doesn't filter self, strip the caller's own row before
// returning. Cheap O(n) over a peer set bounded at <50 rows.
peers = excludeSelfFromPeers(peers, workspaceID)
peers = filterPeersByQuery(peers, c.Query("q"))
if peers == nil {
@@ -282,6 +301,32 @@ func (h *DiscoveryHandler) Peers(c *gin.Context) {
c.JSON(http.StatusOK, peers)
}
// excludeSelfFromPeers strips any peer entry whose ``id`` equals
// ``workspaceID`` (the caller's own row). Final-line defense for #383
// (self-delegation 400-loop on external workspaces): a peer-list that
// includes the requester's own row is the root mechanism by which an
// agent ends up delegating to itself. The pre-DB filters in Peers
// already enforce `w.id != $caller` on each branch; this function
// guarantees the contract holds regardless of which query path
// returned the row, including future ones added without a self-filter.
//
// O(n) over a peer set bounded at <50 rows per `Peers` comment — well
// below the hot-path overhead of the existing filterPeersByQuery.
func excludeSelfFromPeers(peers []map[string]interface{}, workspaceID string) []map[string]interface{} {
if len(peers) == 0 {
return peers
}
out := make([]map[string]interface{}, 0, len(peers))
for _, p := range peers {
id, _ := p["id"].(string)
if id == workspaceID {
continue
}
out = append(out, p)
}
return out
}
// filterPeersByQuery returns peers whose name or role case-insensitively
// contains q. Whitespace-trimmed empty q is a no-op (returns input unchanged).
func filterPeersByQuery(peers []map[string]interface{}, q string) []map[string]interface{} {
@@ -125,14 +125,14 @@ func TestPeers_WithParent(t *testing.T) {
WillReturnRows(sqlmock.NewRows(peerCols).
AddRow("ws-sibling-2", "Sibling Two", "worker", 1, "online", []byte("null"), "http://localhost:8002", "ws-parent", 0))
// Expect children query
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.status").
WithArgs("ws-sibling-1").
// Expect children query — #383 added explicit `w.id != $2` self-filter
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs("ws-sibling-1", "ws-sibling-1").
WillReturnRows(sqlmock.NewRows(peerCols))
// Expect parent query
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.status").
WithArgs("ws-parent").
// Expect parent query — #383 added explicit `w.id != $2` self-filter
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs("ws-parent", "ws-sibling-1").
WillReturnRows(sqlmock.NewRows(peerCols).
AddRow("ws-parent", "Parent PM", "manager", 2, "online", []byte("null"), "http://localhost:8001", nil, 1))
@@ -228,9 +228,9 @@ func TestPeers_RootWorkspace_NoPeers(t *testing.T) {
WithArgs("ws-root-alone").
WillReturnRows(sqlmock.NewRows(peerCols))
// Children — none
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1").
WithArgs("ws-root-alone").
// Children — none. #383 added explicit `w.id != $2` self-filter.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
WithArgs("ws-root-alone", "ws-root-alone").
WillReturnRows(sqlmock.NewRows(peerCols))
// No parent query since parent_id is NULL
@@ -282,12 +282,14 @@ func peersFilterFixture(t *testing.T) (*DiscoveryHandler, sqlmock.Sqlmock) {
AddRow("ws-alpha", "Alpha Researcher", "researcher", 1, "online", []byte("null"), "http://a", "ws-pm", 0).
AddRow("ws-beta", "Beta Designer", "designer", 1, "online", []byte("null"), "http://b", "ws-pm", 0))
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.status").
WithArgs("ws-self").
// #383 — children query gained explicit `w.id != $2` self-filter.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs("ws-self", "ws-self").
WillReturnRows(sqlmock.NewRows(cols))
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.status").
WithArgs("ws-pm").
// #383 — parent query gained explicit `w.id != $2` self-filter.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs("ws-pm", "ws-self").
WillReturnRows(sqlmock.NewRows(cols).
AddRow("ws-pm", "PM Workspace", "manager", 2, "online", []byte("null"), "http://pm", nil, 1))
@@ -966,8 +968,9 @@ func TestPeers_DevModeFailOpen_AllowsBearerlessRequest(t *testing.T) {
mock.ExpectQuery("SELECT w.id.+WHERE w.parent_id IS NULL AND w.id").
WithArgs("ws-dev").
WillReturnRows(sqlmock.NewRows(peerCols))
mock.ExpectQuery("SELECT w.id.+WHERE w.parent_id = \\$1 AND w.status").
WithArgs("ws-dev").
// #383 — children query gained explicit `w.id != $2` self-filter.
mock.ExpectQuery("SELECT w.id.+WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs("ws-dev", "ws-dev").
WillReturnRows(sqlmock.NewRows(peerCols))
w := httptest.NewRecorder()
@@ -1030,3 +1033,183 @@ func TestPeers_DevModeFailOpen_ClosedInProduction(t *testing.T) {
t.Fatalf("expected 401 in production, got %d: %s", w.Code, w.Body.String())
}
}
// ==================== Peers — #383 self never appears in result ====================
// TestPeers_ExcludeSelf_DefenseInDepth verifies the final-line filter in
// Peers strips any row whose id matches the caller. The pre-DB SQL filters
// already do this, but a future code path that omits the `w.id != $caller`
// clause must not be able to smuggle a self-row through. This test
// simulates that future-defect case by mocking the children query to
// (incorrectly) return a row whose id matches the caller, and asserts the
// final filter still drops it.
//
// Root cause class for #383: an agent that sees its own row in /peers
// proceeds to delegate_task to itself, hitting the platform's
// self-delegation 400 in a tight loop. The fix in discovery.go is
// defense-in-depth: even if the SQL filter regresses, this handler-level
// filter prevents the 400-loop from materializing.
func TestPeers_ExcludeSelf_DefenseInDepth(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewDiscoveryHandler()
const selfID = "ws-xiaodong"
// parent_id lookup — workspace has a parent.
mock.ExpectQuery("SELECT parent_id FROM workspaces WHERE id =").
WithArgs(selfID).
WillReturnRows(sqlmock.NewRows([]string{"parent_id"}).AddRow("ws-parent"))
peerCols := []string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}
// Siblings — returns one legitimate sibling. The SQL filter excludes
// self at the source.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2").
WithArgs("ws-parent", selfID).
WillReturnRows(sqlmock.NewRows(peerCols).
AddRow("ws-sibling", "Sibling", "worker", 1, "online", []byte("null"), "http://localhost:8002", "ws-parent", 0))
// Children — simulates the data-integrity defect class: the DB
// (incorrectly) returns the caller's own row in the children set.
// In real production this would require a workspace whose
// parent_id points to itself — corruption only, but the handler
// must not propagate it.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.parent_id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs(selfID, selfID).
WillReturnRows(sqlmock.NewRows(peerCols).
AddRow(selfID, "Self As Child", "worker", 1, "online", []byte("null"), "http://localhost:8001", selfID, 0).
AddRow("ws-child", "Real Child", "worker", 1, "online", []byte("null"), "http://localhost:8003", selfID, 0))
// Parent — explicit `w.id != $2` clause so the parent path is also
// self-filtered. parentID.String = "ws-parent" != selfID, so the
// row is included.
mock.ExpectQuery("SELECT w.id, w.name.*WHERE w.id = \\$1 AND w.id != \\$2 AND w.status").
WithArgs("ws-parent", selfID).
WillReturnRows(sqlmock.NewRows(peerCols).
AddRow("ws-parent", "Parent", "manager", 2, "online", []byte("null"), "http://localhost:8004", nil, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: selfID}}
c.Request = httptest.NewRequest("GET", "/registry/"+selfID+"/peers", nil)
handler.Peers(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var peers []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &peers); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
// The defense-in-depth filter must drop the self row even though
// the (mocked-defective) children query returned it.
for _, p := range peers {
if id, _ := p["id"].(string); id == selfID {
t.Fatalf("peer list contains caller's own id %q — self-delegation defense regressed; full list: %+v", selfID, peers)
}
}
// Sanity: the three legitimate peers (sibling, real child, parent)
// must all be present. Catches an over-aggressive filter that
// strips legitimate rows.
expectedIDs := map[string]bool{"ws-sibling": false, "ws-child": false, "ws-parent": false}
for _, p := range peers {
if id, _ := p["id"].(string); id != "" {
if _, ok := expectedIDs[id]; ok {
expectedIDs[id] = true
}
}
}
for id, found := range expectedIDs {
if !found {
t.Errorf("legitimate peer %q missing from response; got %+v", id, peers)
}
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestExcludeSelfFromPeers_Unit exercises the helper directly so the
// defense-in-depth contract is asserted independently of SQL mocking.
// Pure-function tests run in microseconds and pin the filter shape
// (empty input, no-match passthrough, single-row drop, multi-row drop,
// preserves order) so future edits to the helper can't silently
// regress to "returns input unchanged".
func TestExcludeSelfFromPeers_Unit(t *testing.T) {
t.Run("empty input returns empty slice", func(t *testing.T) {
out := excludeSelfFromPeers(nil, "ws-self")
if len(out) != 0 {
t.Errorf("expected empty, got %+v", out)
}
})
t.Run("no self in list passes through unchanged", func(t *testing.T) {
in := []map[string]interface{}{
{"id": "ws-a", "name": "A"},
{"id": "ws-b", "name": "B"},
}
out := excludeSelfFromPeers(in, "ws-self")
if len(out) != 2 {
t.Fatalf("expected 2, got %d (%+v)", len(out), out)
}
if out[0]["id"] != "ws-a" || out[1]["id"] != "ws-b" {
t.Errorf("order not preserved: %+v", out)
}
})
t.Run("self row dropped, others preserved", func(t *testing.T) {
in := []map[string]interface{}{
{"id": "ws-a", "name": "A"},
{"id": "ws-self", "name": "Me"},
{"id": "ws-b", "name": "B"},
}
out := excludeSelfFromPeers(in, "ws-self")
if len(out) != 2 {
t.Fatalf("expected 2, got %d (%+v)", len(out), out)
}
if out[0]["id"] != "ws-a" || out[1]["id"] != "ws-b" {
t.Errorf("expected [ws-a, ws-b], got %+v", out)
}
})
t.Run("multiple self rows all dropped", func(t *testing.T) {
// Pathological — should never happen, but the contract is
// "no row with id==workspaceID survives", not "at most one
// such row is dropped". Pin it.
in := []map[string]interface{}{
{"id": "ws-self", "name": "Me1"},
{"id": "ws-a", "name": "A"},
{"id": "ws-self", "name": "Me2"},
}
out := excludeSelfFromPeers(in, "ws-self")
if len(out) != 1 {
t.Fatalf("expected 1, got %d (%+v)", len(out), out)
}
if out[0]["id"] != "ws-a" {
t.Errorf("expected [ws-a], got %+v", out)
}
})
t.Run("row with missing id key is preserved (not a self-collision)", func(t *testing.T) {
// A peer row with no "id" key shouldn't be silently dropped
// by the self-filter — it's a malformed row class that
// belongs to a different defect.
in := []map[string]interface{}{
{"name": "no-id-row"},
{"id": "ws-self", "name": "Me"},
}
out := excludeSelfFromPeers(in, "ws-self")
if len(out) != 1 {
t.Fatalf("expected 1, got %d (%+v)", len(out), out)
}
if out[0]["name"] != "no-id-row" {
t.Errorf("expected no-id-row preserved, got %+v", out)
}
})
}
@@ -283,7 +283,7 @@ claude --dangerously-load-development-channels \
// externalUniversalMcpTemplate — runtime-agnostic standalone path.
// Ships as the `molecule-mcp` console script in the
// molecule-ai-workspace-runtime PyPI wheel (workspace/mcp_cli.py).
// molecule-ai-workspace-runtime wheel published to the Gitea package registry.
// Any MCP-aware runtime (Claude Code, hermes, codex, third-party)
// registers it once and gets the same 8 universal tools that
// container-bound runtimes use today: delegate_task, list_peers,
@@ -322,7 +322,7 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# 1. Install the workspace runtime wheel (once per machine safe to
# re-run; subsequent workspaces share the same wheel):
pip install molecule-ai-workspace-runtime
pip install --index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ molecule-ai-workspace-runtime
# 2. Wire molecule-mcp into your agent's MCP config. Claude Code:
# NOTE the server name is workspace-specific ("{{MCP_SERVER_NAME}}") so
@@ -344,7 +344,7 @@ claude mcp add {{MCP_SERVER_NAME}} -s user -- env \
# needed when calling tools through the MCP server.
# Need help?
# Where to install: https://pypi.org/project/molecule-ai-workspace-runtime/
# Where to install: https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/molecule-ai-workspace-runtime/
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# "Tools not appearing in your agent" run ` + "`claude mcp list`" + ` (or
@@ -359,8 +359,8 @@ claude mcp add {{MCP_SERVER_NAME}} -s user -- env \
`
// externalPythonTemplate uses molecule-sdk-python's RemoteAgentClient +
// A2AServer (PR #13 in that repo). Until the SDK cuts a v0.y release
// to PyPI the snippet pins git+main.
// A2AServer. Until the SDK is published to the Gitea package registry the
// snippet pins git+main.
const externalPythonTemplate = `# pip install 'git+https://git.moleculesai.app/molecule-ai/molecule-sdk-python.git@main'
import asyncio
@@ -396,7 +396,7 @@ if __name__ == "__main__":
asyncio.run(main())
# Need help?
# Where to install: https://pypi.org/project/molecule-ai-workspace-runtime/
# Where to install: https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/molecule-ai-workspace-runtime/
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# 401 from /heartbeat AUTH_TOKEN expired or wrong workspace_id.
@@ -445,7 +445,7 @@ const externalHermesChannelTemplate = `# Hermes channel — bridges this workspa
# also supported via the plugin's dual-mode fallback.
#
# 1. Install the runtime + plugin:
pip install molecule-ai-workspace-runtime
pip install --index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ molecule-ai-workspace-runtime
pip install 'git+https://git.moleculesai.app/molecule-ai/hermes-channel-molecule.git'
# 2. Export the workspace credentials:
@@ -528,7 +528,7 @@ const externalCodexTemplate = `# Codex external setup — outbound tools (MCP) +
# 1. Install codex CLI, the workspace runtime, and the bridge daemon:
npm install -g @openai/codex@latest
pip install molecule-ai-workspace-runtime
pip install --index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ molecule-ai-workspace-runtime
pip install codex-channel-molecule
# 2. Wire the molecule MCP server into codex's config.toml this is
@@ -620,7 +620,7 @@ const externalKimiTemplate = `# Kimi CLI external setup — register + heartbeat
# No public URL needed; runs behind NAT in poll mode.
# 1. Install the workspace runtime wheel (provides HTTP client):
pip install molecule-ai-workspace-runtime
pip install --index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ molecule-ai-workspace-runtime
# 2. Save credentials and the bridge script:
mkdir -p ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}
@@ -779,7 +779,7 @@ const externalOpenClawTemplate = `# OpenClaw MCP config — outbound tool path.
# (register-on-startup + 20s heartbeat). Older versions only ship
# a2a_mcp_server which does not heartbeat.
npm install -g openclaw@latest
pip install "molecule-ai-workspace-runtime>=0.1.999"
pip install --index-url https://git.moleculesai.app/api/packages/molecule-ai/pypi/simple/ "molecule-ai-workspace-runtime>=0.1.999"
# 2. Onboard openclaw against your model provider (one-time setup).
# --non-interactive needs an explicit --provider + --model so it
@@ -432,9 +432,10 @@ func TestExtended_Peers(t *testing.T) {
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}).
AddRow("ws-sibling", "Sibling Agent", "worker", 1, "online", []byte("null"), "http://localhost:9001", nil, 0))
// Expect children query (workspaces with parent_id = ws-peer)
// Expect children query (workspaces with parent_id = ws-peer, excluding self)
// Query now binds (parent_id, self_id) for the self-filter guard added in #383.
mock.ExpectQuery("SELECT w.id, w.name").
WithArgs("ws-peer").
WithArgs("ws-peer", "ws-peer").
WillReturnRows(sqlmock.NewRows([]string{"id", "name", "role", "tier", "status", "agent_card", "url", "parent_id", "active_tasks"}))
// No parent query since workspace is root-level
@@ -84,6 +84,7 @@ type mcpTool struct {
type MCPHandler struct {
database *sql.DB
broadcaster *events.Broadcaster
a2aProxy func(ctx context.Context, workspaceID string, body []byte, callerID string, logActivity bool) (int, []byte, error)
// memv2 is the v2 memory plugin wiring (RFC #2728). nil-safe:
// every v2 tool calls memoryV2Available() first and returns a
@@ -98,6 +99,14 @@ func NewMCPHandler(database *sql.DB, broadcaster *events.Broadcaster) *MCPHandle
return &MCPHandler{database: database, broadcaster: broadcaster}
}
func (h *MCPHandler) proxyA2ARequest(ctx context.Context, workspaceID string, body []byte, callerID string, logActivity bool) (int, []byte, error) {
if h.a2aProxy != nil {
return h.a2aProxy(ctx, workspaceID, body, callerID, logActivity)
}
wh := NewWorkspaceHandler(h.broadcaster, nil, "", "")
return wh.ProxyA2ARequest(ctx, workspaceID, body, callerID, logActivity)
}
// ─────────────────────────────────────────────────────────────────────────────
// Tool definitions (mirrors workspace/a2a_mcp_server.py TOOLS list)
// ─────────────────────────────────────────────────────────────────────────────
@@ -53,6 +53,15 @@ func mcpPost(t *testing.T, h *MCPHandler, workspaceID string, body interface{})
return w
}
func expectCanCommunicateSiblings(mock sqlmock.Sqlmock, callerID, targetID, parentID string) {
mock.ExpectQuery(`SELECT id, parent_id FROM workspaces WHERE id = \$1`).
WithArgs(callerID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(callerID, parentID))
mock.ExpectQuery(`SELECT id, parent_id FROM workspaces WHERE id = \$1`).
WithArgs(targetID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(targetID, parentID))
}
// ─────────────────────────────────────────────────────────────────────────────
// initialize
// ─────────────────────────────────────────────────────────────────────────────
@@ -178,6 +187,98 @@ func TestMCPHandler_ToolsList_ContainsExpectedTools(t *testing.T) {
}
}
func TestMCPHandler_DelegateTask_RoutesThroughPlatformA2AProxy(t *testing.T) {
h, mock := newMCPHandler(t)
callerID := "11111111-1111-1111-1111-111111111111"
targetID := "22222222-2222-2222-2222-222222222222"
parentID := "33333333-3333-3333-3333-333333333333"
expectCanCommunicateSiblings(mock, callerID, targetID, parentID)
mock.ExpectExec(`(?s)INSERT INTO activity_logs.*'delegation'.*'delegate'`).
WithArgs(callerID, callerID, targetID, "Delegating to "+targetID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(1, 1))
mock.ExpectExec(`UPDATE activity_logs`).
WithArgs("dispatched", "", callerID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
var gotTarget, gotCaller string
h.a2aProxy = func(ctx context.Context, workspaceID string, body []byte, callerID string, logActivity bool) (int, []byte, error) {
gotTarget = workspaceID
gotCaller = callerID
if !logActivity {
t.Fatal("delegate_task should log through platform A2A proxy")
}
if !strings.Contains(string(body), "do work") {
t.Fatalf("A2A body missing task text: %s", string(body))
}
return 200, []byte(`{"result":{"message":{"parts":[{"text":"done"}]}}}`), nil
}
out, err := h.toolDelegateTask(context.Background(), callerID, map[string]interface{}{
"workspace_id": targetID,
"task": "do work",
}, mcpCallTimeout)
if err != nil {
t.Fatalf("delegate_task returned error: %v", err)
}
if out != "done" {
t.Fatalf("delegate_task response = %q, want done", out)
}
if gotTarget != targetID || gotCaller != callerID {
t.Fatalf("proxy called with target=%q caller=%q, want target=%q caller=%q", gotTarget, gotCaller, targetID, callerID)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
func TestMCPHandler_DelegateTaskAsync_RoutesThroughPlatformA2AProxy(t *testing.T) {
h, mock := newMCPHandler(t)
callerID := "11111111-1111-1111-1111-111111111111"
targetID := "22222222-2222-2222-2222-222222222222"
parentID := "33333333-3333-3333-3333-333333333333"
expectCanCommunicateSiblings(mock, callerID, targetID, parentID)
mock.ExpectExec(`(?s)INSERT INTO activity_logs.*'delegation'.*'delegate'`).
WithArgs(callerID, callerID, targetID, "Delegating to "+targetID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(1, 1))
mock.ExpectExec(`UPDATE activity_logs`).
WithArgs("dispatched", "", callerID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
called := make(chan struct{}, 1)
h.a2aProxy = func(ctx context.Context, workspaceID string, body []byte, proxyCallerID string, logActivity bool) (int, []byte, error) {
if workspaceID != targetID || proxyCallerID != callerID {
t.Fatalf("unexpected proxy route target=%q caller=%q", workspaceID, proxyCallerID)
}
if !strings.Contains(string(body), "async work") {
t.Fatalf("A2A body missing task text: %s", string(body))
}
called <- struct{}{}
return 200, []byte(`{"result":{"message":{"parts":[{"text":"accepted"}]}}}`), nil
}
out, err := h.toolDelegateTaskAsync(context.Background(), callerID, map[string]interface{}{
"workspace_id": targetID,
"task": "async work",
})
if err != nil {
t.Fatalf("delegate_task_async returned error: %v", err)
}
if !strings.Contains(out, `"status":"dispatched"`) {
t.Fatalf("delegate_task_async response = %s", out)
}
waitGlobalAsyncForTest()
select {
case <-called:
default:
t.Fatal("async delegate did not call platform A2A proxy")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// notifications/initialized
// ─────────────────────────────────────────────────────────────────────────────
+17 -120
View File
@@ -7,24 +7,19 @@ package handlers
// and A2A response parsing helpers.
import (
"bytes"
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"io"
"log"
"net/http"
"os"
"strings"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/registry"
"github.com/google/uuid"
)
// insertMCPDelegationRow writes a delegation activity row so the canvas
// Agent Comms tab can show the task text for MCP-initiated delegations.
// Mirrors insertDelegationRow (delegation.go) for the MCP tool path.
@@ -190,15 +185,6 @@ func (h *MCPHandler) toolDelegateTask(ctx context.Context, callerID string, args
// Non-fatal: still make the A2A call even if activity log write fails.
}
agentURL, err := mcpResolveURL(ctx, h.database, targetID)
if err != nil {
return "", err
}
// SSRF defence: reject private/metadata URLs before making outbound call.
if err := isSafeURL(agentURL); err != nil {
return "", fmt.Errorf("invalid workspace URL: %w", err)
}
a2aBody, err := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0",
"id": uuid.New().String(),
@@ -218,36 +204,17 @@ func (h *MCPHandler) toolDelegateTask(ctx context.Context, callerID string, args
reqCtx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
httpReq, err := http.NewRequestWithContext(reqCtx, "POST", agentURL+"/a2a", bytes.NewReader(a2aBody))
if err != nil {
return "", fmt.Errorf("failed to create request: %w", err)
}
httpReq.Header.Set("Content-Type", "application/json")
// X-Workspace-ID identifies this caller to the A2A proxy. The /workspaces/:id/a2a
// endpoint is intentionally outside WorkspaceAuth (agents do not hold bearer tokens
// to peer workspaces). Access control is enforced by CanCommunicate above, which
// already validated callerID → targetID before this request is constructed.
// callerID was authenticated by WorkspaceAuth on the MCP bridge entry point,
// so this header reflects a verified caller identity, not a spoofable value.
httpReq.Header.Set("X-Workspace-ID", callerID)
resp, err := http.DefaultClient.Do(httpReq)
status, body, err := h.proxyA2ARequest(reqCtx, targetID, a2aBody, callerID, true)
if err != nil {
updateMCPDelegationStatus(ctx, h.database, callerID, delegationID, "failed", err.Error())
return "", fmt.Errorf("A2A call failed: %w", err)
return "", fmt.Errorf("A2A proxy failed: %w", err)
}
if status < 200 || status >= 300 {
updateMCPDelegationStatus(ctx, h.database, callerID, delegationID, "failed", fmt.Sprintf("A2A proxy returned status %d", status))
return "", fmt.Errorf("A2A proxy returned status %d", status)
}
defer func() { _ = resp.Body.Close() }()
// A 200/500 from the peer still means the call was dispatched — only
// network errors are truly "failed". Status 'dispatched' is correct for
// any HTTP response (peer's A2A layer handles the actual processing).
updateMCPDelegationStatus(ctx, h.database, callerID, delegationID, "dispatched", "")
body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
if err != nil {
return "", fmt.Errorf("failed to read response: %w", err)
}
return extractA2AText(body), nil
}
@@ -278,24 +245,13 @@ func (h *MCPHandler) toolDelegateTaskAsync(ctx context.Context, callerID string,
// Fire and forget in a detached goroutine. Use a background context so
// the call is not cancelled when the HTTP request completes.
// RFC internal#524 Layer 1: globalGoAsync — the detached call reads
// db.DB (mcpResolveURL + updateMCPDelegationStatus) and must be
// drained by drainTestAsync before any t.Cleanup-driven db.DB swap.
// RFC internal#524 Layer 1: globalGoAsync — the detached call reads db.DB
// through the platform A2A proxy and must be drained by drainTestAsync
// before any t.Cleanup-driven db.DB swap.
globalGoAsync(func() {
bgCtx, cancel := context.WithTimeout(context.Background(), mcpAsyncCallTimeout)
defer cancel()
agentURL, err := mcpResolveURL(bgCtx, h.database, targetID)
if err != nil {
log.Printf("MCPHandler.delegate_task_async: resolve URL for %s: %v", targetID, err)
return
}
// SSRF defence: reject private/metadata URLs before making outbound call.
if err := isSafeURL(agentURL); err != nil {
log.Printf("MCPHandler.delegate_task_async: unsafe URL for %s: %v", targetID, err)
return
}
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0",
"id": delegationID,
@@ -309,22 +265,15 @@ func (h *MCPHandler) toolDelegateTaskAsync(ctx context.Context, callerID string,
},
})
httpReq, err := http.NewRequestWithContext(bgCtx, "POST", agentURL+"/a2a", bytes.NewReader(a2aBody))
if err != nil {
log.Printf("MCPHandler.delegate_task_async: create request: %v", err)
status, _, err := h.proxyA2ARequest(bgCtx, targetID, a2aBody, callerID, true)
if err != nil || status < 200 || status >= 300 {
if err != nil {
log.Printf("MCPHandler.delegate_task_async: A2A proxy to %s: %v", targetID, err)
} else {
log.Printf("MCPHandler.delegate_task_async: A2A proxy to %s returned status %d", targetID, status)
}
return
}
httpReq.Header.Set("Content-Type", "application/json")
httpReq.Header.Set("X-Workspace-ID", callerID)
resp, err := http.DefaultClient.Do(httpReq)
if err != nil {
log.Printf("MCPHandler.delegate_task_async: A2A call to %s: %v", targetID, err)
return
}
defer func() { _ = resp.Body.Close() }()
// Drain response so the connection can be reused.
_, _ = io.Copy(io.Discard, resp.Body)
})
return fmt.Sprintf(`{"task_id":%q,"status":"dispatched","target_id":%q}`, delegationID, targetID), nil
@@ -405,7 +354,6 @@ func (h *MCPHandler) toolSendMessageToUser(ctx context.Context, workspaceID stri
return "Message sent.", nil
}
func (h *MCPHandler) toolCommitMemory(ctx context.Context, workspaceID string, args map[string]interface{}) (string, error) {
// PR-6 (RFC #2728) compat shim: when the v2 plugin is wired
// (MEMORY_PLUGIN_URL set), translate legacy scope→namespace and
@@ -534,56 +482,6 @@ func (h *MCPHandler) toolRecallMemory(ctx context.Context, workspaceID string, a
// Helpers
// ─────────────────────────────────────────────────────────────────────────────
// mcpResolveURL returns a routable URL for a workspace's A2A server.
//
// Resolution order:
// 1. Docker-internal URL cache (set by provisioner; correct when platform is in Docker)
// 2. Redis URL cache
// 3. DB `url` column fallback, with 127.0.0.1→Docker bridge rewrite when in Docker
//
// SECURITY (F1083 / #1130): all three paths run the returned URL through
// validateAgentURL to block SSRF targets (private IPs, loopback, cloud metadata).
func mcpResolveURL(ctx context.Context, database *sql.DB, workspaceID string) (string, error) {
if platformInDocker {
if url, err := db.GetCachedInternalURL(ctx, workspaceID); err == nil && url != "" {
if err := validateAgentURL(url); err != nil {
return "", fmt.Errorf("workspace %s: forbidden URL from internal cache: %w", workspaceID, err)
}
return url, nil
}
}
if url, err := db.GetCachedURL(ctx, workspaceID); err == nil && url != "" {
if platformInDocker && strings.HasPrefix(url, "http://127.0.0.1:") {
return provisioner.InternalURL(workspaceID), nil
}
if err := validateAgentURL(url); err != nil {
return "", fmt.Errorf("workspace %s: forbidden URL from Redis cache: %w", workspaceID, err)
}
return url, nil
}
var urlStr sql.NullString
var status string
if err := database.QueryRowContext(ctx,
`SELECT url, status FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&urlStr, &status); err != nil {
if err == sql.ErrNoRows {
return "", fmt.Errorf("workspace %s not found", workspaceID)
}
return "", fmt.Errorf("workspace lookup failed: %w", err)
}
if !urlStr.Valid || urlStr.String == "" {
return "", fmt.Errorf("workspace %s has no URL (status: %s)", workspaceID, status)
}
if platformInDocker && strings.HasPrefix(urlStr.String, "http://127.0.0.1:") {
return provisioner.InternalURL(workspaceID), nil
}
if err := validateAgentURL(urlStr.String); err != nil {
return "", fmt.Errorf("workspace %s: forbidden URL from DB: %w", workspaceID, err)
}
return urlStr.String, nil
}
// extractA2AText extracts human-readable text from an A2A JSON-RPC response body.
// Falls back to the raw JSON when no text part can be found.
func extractA2AText(body []byte) string {
@@ -632,4 +530,3 @@ func extractA2AText(body []byte) string {
b, _ := json.Marshal(result)
return string(b)
}
@@ -112,7 +112,7 @@ func (h *RegistryHandler) SetQueueDrainFunc(f QueueDrainFunc) {
// Go's net.ParseIP.To4() before Contains() runs, so the IPv4 rules above
// catch those without a separate entry.
//
// F1083/#1130 (SSRF on mcpResolveURL / a2a_proxy resolveAgentURL): in
// F1083/#1130 (SSRF on direct A2A URL resolution): in
// addition to blocking IP literals, DNS names are now resolved and each
// returned IP is checked against the blocklist. This closes the gap where
// an attacker could register agent.example.com pointing to 169.254.169.254.
@@ -234,9 +234,13 @@ func (h *TemplatesHandler) ReplaceFiles(c *gin.Context) {
"source": "ec2-ssh",
})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save / ReplaceFiles fires N PUTs
// in a burst; without this each PUT chains into the
// coalesceRestart drain loop. The helper still uses goAsync
// internally (drains via h.wh.waitAsyncForTest), preserving
// RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
return
}
@@ -270,9 +274,13 @@ func (h *TemplatesHandler) ReplaceFiles(c *gin.Context) {
"source": "container",
})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save / ReplaceFiles fires N PUTs
// in a burst; without this each PUT chains into the
// coalesceRestart drain loop. The helper still uses goAsync
// internally (drains via h.wh.waitAsyncForTest), preserving
// RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
return
}
@@ -292,8 +300,12 @@ func (h *TemplatesHandler) ReplaceFiles(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"status": "replaced", "workspace": workspaceID, "files": len(body.Files), "source": "volume"})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save / ReplaceFiles fires N PUTs
// in a burst; without this each PUT chains into the
// coalesceRestart drain loop. The helper still uses goAsync
// internally (drains via h.wh.waitAsyncForTest), preserving
// RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
}
+42 -18
View File
@@ -570,9 +570,13 @@ func (h *TemplatesHandler) WriteFile(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{"status": "saved", "path": filePath})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save fires N PUTs in a burst;
// without this each PUT chains into the coalesceRestart drain
// loop and produces back-to-back EC2 recreate cycles. The
// helper still uses goAsync internally (drains via
// h.wh.waitAsyncForTest), preserving RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
return
}
@@ -586,9 +590,13 @@ func (h *TemplatesHandler) WriteFile(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{"status": "saved", "path": filePath})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save fires N PUTs in a burst;
// without this each PUT chains into the coalesceRestart drain
// loop and produces back-to-back EC2 recreate cycles. The
// helper still uses goAsync internally (drains via
// h.wh.waitAsyncForTest), preserving RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
return
}
@@ -602,9 +610,13 @@ func (h *TemplatesHandler) WriteFile(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{"status": "saved", "path": filePath})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save fires N PUTs in a burst;
// without this each PUT chains into the coalesceRestart drain
// loop and produces back-to-back EC2 recreate cycles. The
// helper still uses goAsync internally (drains via
// h.wh.waitAsyncForTest), preserving RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
}
@@ -657,9 +669,13 @@ func (h *TemplatesHandler) DeleteFile(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{"status": "deleted", "path": filePath})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save fires N PUTs in a burst;
// without this each PUT chains into the coalesceRestart drain
// loop and produces back-to-back EC2 recreate cycles. The
// helper still uses goAsync internally (drains via
// h.wh.waitAsyncForTest), preserving RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
return
}
@@ -677,9 +693,13 @@ func (h *TemplatesHandler) DeleteFile(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{"status": "deleted", "path": filePath})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save fires N PUTs in a burst;
// without this each PUT chains into the coalesceRestart drain
// loop and produces back-to-back EC2 recreate cycles. The
// helper still uses goAsync internally (drains via
// h.wh.waitAsyncForTest), preserving RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
return
}
@@ -692,8 +712,12 @@ func (h *TemplatesHandler) DeleteFile(c *gin.Context) {
}
c.JSON(http.StatusOK, gin.H{"status": "deleted", "path": filePath})
if h.wh != nil {
// RFC internal#524 Layer 1: per-handler goAsync (drains via h.wh.waitAsyncForTest)
wsID := workspaceID
h.wh.goAsync(func() { h.wh.RestartByID(wsID) })
// internal#624: 15s per-workspace debounce around the file-write
// → RestartByID trigger. Canvas Save fires N PUTs in a burst;
// without this each PUT chains into the coalesceRestart drain
// loop and produces back-to-back EC2 recreate cycles. The
// helper still uses goAsync internally (drains via
// h.wh.waitAsyncForTest), preserving RFC internal#524 Layer 1.
h.wh.maybeRestartAfterFileWrite(workspaceID)
}
}
@@ -869,14 +869,31 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
// Returns nil map + error string on decrypt failure. Shared by both Docker
// and control plane provisioning paths to avoid duplication.
//
// The second return value (globalKeys) records which keys originated from
// the operator-controlled `global_secrets` table — used by RFC#523 Layer 1
// to constrain its forbidden-key check to the operator-bleed channel,
// instead of blanket-blocking by name across BOTH provenance channels (the
// over-fire that breaks the legitimate user flow of pasting their own
// GitHub PAT into the canvas Secrets tab → workspace_secrets row). See
// `feedback_upstream_docs_first_before_hypothesizing`: RFC#523's threat
// model (issue molecule-ai/internal#523 §"Threat model") names operator-
// scope tokens being injected via provision-time env / operator-side
// stores — NOT the user's own scoped PAT they explicitly authorized via
// the per-workspace Secrets tab.
//
// The merged map preserves the existing precedence semantic (workspace
// rows overwrite global rows on key collision); only the provenance side-
// channel is new. Existing single-return callers can ignore globalKeys.
//
// F1086 / #1206: the returned error string is the SAFE-CANNED message that
// gets persisted to workspaces.last_sample_error AND broadcast as the
// WORKSPACE_PROVISION_FAILED payload. Internal detail (the secret key name,
// the encryption version, the decrypt-error text) is logged here, never
// returned to the caller, so it can't leak via the canvas event stream
// (cf. TestProvisionWorkspace_NoInternalErrorsInBroadcast).
func loadWorkspaceSecrets(ctx context.Context, workspaceID string) (map[string]string, string) {
func loadWorkspaceSecrets(ctx context.Context, workspaceID string) (map[string]string, map[string]struct{}, string) {
envVars := map[string]string{}
globalKeys := map[string]struct{}{}
globalRows, globalErr := db.DB.QueryContext(ctx,
`SELECT key, encrypted_value, encryption_version FROM global_secrets`)
if globalErr == nil {
@@ -889,9 +906,10 @@ func loadWorkspaceSecrets(ctx context.Context, workspaceID string) (map[string]s
decrypted, decErr := crypto.DecryptVersioned(v, ver)
if decErr != nil {
log.Printf("Provisioner: FATAL — failed to decrypt global secret %s (version=%d): %v — aborting provision of workspace %s", k, ver, decErr, workspaceID)
return nil, "failed to decrypt global secret"
return nil, nil, "failed to decrypt global secret"
}
envVars[k] = string(decrypted)
globalKeys[k] = struct{}{}
}
}
if err := globalRows.Err(); err != nil {
@@ -910,16 +928,22 @@ func loadWorkspaceSecrets(ctx context.Context, workspaceID string) (map[string]s
decrypted, decErr := crypto.DecryptVersioned(v, ver)
if decErr != nil {
log.Printf("Provisioner: FATAL — failed to decrypt workspace secret %s (version=%d) for %s: %v — aborting provision", k, ver, workspaceID, decErr)
return nil, "failed to decrypt workspace secret"
return nil, nil, "failed to decrypt workspace secret"
}
envVars[k] = string(decrypted)
// User-authored workspace_secrets value supersedes any
// global_secrets row of the same key — including dropping
// the operator-bleed provenance flag. The user explicitly
// re-set the value via the canvas Secrets tab, so it is
// no longer "the operator-store version."
delete(globalKeys, k)
}
}
if err := wsRows.Err(); err != nil {
log.Printf("Provisioner: workspace_secrets rows.Err workspace=%s: %v", workspaceID, err)
}
}
return envVars, ""
return envVars, globalKeys, ""
}
// provisionWorkspaceCP provisions a workspace via the control plane API.
@@ -135,6 +135,15 @@ func isForbiddenTenantEnvKey(key string) bool {
// message and the structured-extra payload that goes to the
// canvas Events tab. Sorting makes the message stable across
// Go's randomized map iteration.
//
// PROVENANCE NOTE: this helper checks by env-var name ONLY and is
// unaware of where each value came from. Production provision code
// uses findForbiddenTenantEnvKeysFromGlobals instead, restricting
// the check to keys originating from the operator-controlled
// global_secrets table — see the doc-comment on that function and
// the RFC#523 Layer 1 block in prepareProvisionContext. This name-
// only helper is kept for the workspace_secrets-write CI lint
// (Layer 3) and for tests that pin the deny-set definition.
func findForbiddenTenantEnvKeys(envVars map[string]string) []string {
if len(envVars) == 0 {
return []string{}
@@ -149,6 +158,48 @@ func findForbiddenTenantEnvKeys(envVars map[string]string) []string {
return found
}
// findForbiddenTenantEnvKeysFromGlobals is the provenance-aware
// variant used by RFC#523 Layer 1 in prepareProvisionContext. It
// restricts the forbidden-key scan to keys whose value originated
// from the operator-controlled `global_secrets` table.
//
// Fixes the over-fire reported by CTO empirical 2026-05-20: a user
// who explicitly pastes their own scoped GitHub PAT under
// GITHUB_TOKEN into the canvas Secrets tab (a `workspace_secrets`
// row) was being blocked alongside the genuine operator-bleed case.
// RFC#523's threat model (issue molecule-ai/internal#523 §"Threat
// model") names operator-scope tokens injected via operator-side
// stores; user-authored workspace_secrets is out of scope.
//
// globalSecretKeys is the set returned as the second value from
// loadWorkspaceSecrets. A key that exists in BOTH stores is treated
// as workspace_secrets (user override wins) — loadWorkspaceSecrets
// drops the global flag when the workspace row is read.
//
// Empty/nil globalSecretKeys means no operator-side source was
// loaded (e.g. tests, or table empty); the scan returns no hits.
// Deterministic sort order, same as findForbiddenTenantEnvKeys.
func findForbiddenTenantEnvKeysFromGlobals(envVars map[string]string, globalSecretKeys map[string]struct{}) []string {
if len(envVars) == 0 || len(globalSecretKeys) == 0 {
return []string{}
}
found := make([]string, 0)
for k := range globalSecretKeys {
if _, present := envVars[k]; !present {
// Defensive: a key flagged as global-origin must also
// be in the resolved env-set. If not, skip — the
// loadWorkspaceSecrets contract guarantees this never
// happens, but the helper stays total.
continue
}
if isForbiddenTenantEnvKey(k) {
found = append(found, k)
}
}
sort.Strings(found)
return found
}
// formatForbiddenTenantEnvError builds the safe-canned user-facing
// message for a provision aborted because forbidden env keys are
// present in the resolved env-set. The message names the
@@ -150,6 +150,106 @@ func TestFindForbiddenTenantEnvKeys_SingleAndMultipleSorted(t *testing.T) {
}
}
// TestFindForbiddenTenantEnvKeysFromGlobals pins the provenance-aware
// behaviour added 2026-05-20 to fix the RFC#523 Layer 1 over-fire: a
// user-set workspace_secrets row with key=GITHUB_TOKEN must NOT be
// flagged, while a global_secrets row of the same key MUST be.
//
// Cross-references the empirical bug: CTO 2026-05-20 hit
// `provision aborted: env var "GITHUB_TOKEN" is operator-scope...`
// after pasting their own scoped PAT into the canvas Secrets tab
// (workspace_secrets) — the original blanket check fired on the
// merged env-set regardless of provenance.
func TestFindForbiddenTenantEnvKeysFromGlobals_UserSetAllowed(t *testing.T) {
// User pasted their own PAT via canvas Secrets tab —
// workspace_secrets row only. globalSecretKeys is empty for
// this key, so the check MUST not fire.
envVars := map[string]string{
"GITHUB_TOKEN": "ghp_FAKEUSERPAT_user_set_via_canvas",
"ANTHROPIC_API_KEY": "sk-ant-keep",
}
globalKeys := map[string]struct{}{} // nothing from global_secrets
got := findForbiddenTenantEnvKeysFromGlobals(envVars, globalKeys)
if len(got) != 0 {
t.Errorf("user-set workspace_secrets with GITHUB_TOKEN: got %v; want empty (provenance-allowed)", got)
}
}
func TestFindForbiddenTenantEnvKeysFromGlobals_OperatorLeakBlocked(t *testing.T) {
// Operator-store bleed — GITHUB_TOKEN sourced from global_secrets.
// This is the literal RFC#523 §"Threat model" attack vector.
// Check MUST fire and name GITHUB_TOKEN.
envVars := map[string]string{
"GITHUB_TOKEN": "ghp_OPERATOR_LEAK_from_global_secrets",
"ANTHROPIC_API_KEY": "sk-ant-keep",
}
globalKeys := map[string]struct{}{
"GITHUB_TOKEN": {},
"ANTHROPIC_API_KEY": {},
}
got := findForbiddenTenantEnvKeysFromGlobals(envVars, globalKeys)
if len(got) != 1 || got[0] != "GITHUB_TOKEN" {
t.Errorf("operator-leak GITHUB_TOKEN in global_secrets: got %v; want [GITHUB_TOKEN]", got)
}
}
func TestFindForbiddenTenantEnvKeysFromGlobals_UserOverrideOfGlobalAllowed(t *testing.T) {
// Both stores have the key; loadWorkspaceSecrets drops the global
// flag when the workspace row supersedes (caller contract).
// Simulate that here: globalKeys does NOT contain GITHUB_TOKEN
// because workspace_secrets re-set it. Allowed.
envVars := map[string]string{
"GITHUB_TOKEN": "ghp_USER_RESET_after_global_was_present",
}
globalKeys := map[string]struct{}{} // workspace overrode → flag dropped
got := findForbiddenTenantEnvKeysFromGlobals(envVars, globalKeys)
if len(got) != 0 {
t.Errorf("user-override of global GITHUB_TOKEN: got %v; want empty", got)
}
}
func TestFindForbiddenTenantEnvKeysFromGlobals_MultipleOperatorLeaks(t *testing.T) {
// Multiple operator-leaked tokens — must return sorted slice.
envVars := map[string]string{
"GITHUB_TOKEN": "leak1",
"CP_ADMIN_API_TOKEN": "leak2",
"MOLECULE_OPERATOR_HOST": "leak3",
"RAILWAY_TOKEN": "leak4",
"ANTHROPIC_API_KEY": "user-allowed",
}
globalKeys := map[string]struct{}{
"GITHUB_TOKEN": {},
"CP_ADMIN_API_TOKEN": {},
"MOLECULE_OPERATOR_HOST": {},
"RAILWAY_TOKEN": {},
}
got := findForbiddenTenantEnvKeysFromGlobals(envVars, globalKeys)
want := []string{"CP_ADMIN_API_TOKEN", "GITHUB_TOKEN", "MOLECULE_OPERATOR_HOST", "RAILWAY_TOKEN"}
if len(got) != len(want) {
t.Fatalf("operator-leak multi: got %v; want %v", got, want)
}
for i := range want {
if got[i] != want[i] {
t.Errorf("operator-leak multi[%d] = %q; want %q (full got=%v)", i, got[i], want[i], got)
}
}
}
func TestFindForbiddenTenantEnvKeysFromGlobals_EmptyInputs(t *testing.T) {
if got := findForbiddenTenantEnvKeysFromGlobals(nil, nil); len(got) != 0 {
t.Errorf("nil/nil: got %v; want empty", got)
}
if got := findForbiddenTenantEnvKeysFromGlobals(map[string]string{}, map[string]struct{}{}); len(got) != 0 {
t.Errorf("empty/empty: got %v; want empty", got)
}
// Non-empty envVars but no global provenance — nothing came from
// global_secrets, so nothing to block (even if a workspace_secrets
// row exists for GITHUB_TOKEN).
if got := findForbiddenTenantEnvKeysFromGlobals(map[string]string{"GITHUB_TOKEN": "ghp_user"}, map[string]struct{}{}); len(got) != 0 {
t.Errorf("workspace-only GITHUB_TOKEN: got %v; want empty", got)
}
}
func TestFormatForbiddenTenantEnvError_Phrasing(t *testing.T) {
// Empty input — defensive total function.
if msg := formatForbiddenTenantEnvError(nil); !strings.Contains(msg, "RFC#523") {
@@ -120,38 +120,52 @@ func (h *WorkspaceHandler) prepareProvisionContext(
payload models.CreateWorkspacePayload,
resetClaudeSession bool,
) (*preparedProvisionContext, *provisionAbort) {
envVars, decryptErr := loadWorkspaceSecrets(ctx, workspaceID)
envVars, globalSecretKeys, decryptErr := loadWorkspaceSecrets(ctx, workspaceID)
if decryptErr != "" {
return nil, &provisionAbort{Msg: decryptErr}
}
// RFC#523 Layer 1 (task #146): refuse to start a tenant workspace
// when any forbidden operator-scope env var is present in the
// resolved secret-load env-set. Runs IMMEDIATELY after
// loadWorkspaceSecrets and BEFORE applyAgentGitHTTPCreds — the
// per-agent persona injection sets a fallback GITEA_USER/GITEA_TOKEN
// pair that the buildContainerEnv forensic #145 guard will strip
// later. We want THIS layer to catch leaks from the operator-
// controlled stores (global_secrets, workspace_secrets) only, not
// the deliberate per-agent platform injection that lives downstream.
// RFC#523 Layer 1 (issue molecule-ai/internal#523): refuse to start a
// tenant workspace when any forbidden operator-scope env var is
// present in the operator-controlled store (global_secrets).
//
// Threat model is "an upstream secret-writer accidentally widened
// the propagation set" — e.g. an operator pastes GITEA_TOKEN into
// a workspace_secrets row. Caught here, surfaced loudly to the
// canvas Events tab, fail-closed. The existing forensic #145 guard
// in provisioner.buildContainerEnv / CPProvisioner.Start stays as
// defense-in-depth: it silently strips at container-env-build time.
// PROVENANCE-AWARE — fix for the over-fire reported by CTO empirical
// 2026-05-20: the original implementation ran this check on the
// merged env-set, which conflated two very different sources:
//
// 1. global_secrets — operator-side store. ANY operator-scope token
// here is an upstream bleed (e.g. tenant_secrets_seed.go pre-
// 4f45d37 propagating CP-env GITHUB_TOKEN into every fresh
// tenant's row). RFC#523's literal threat model.
//
// 2. workspace_secrets — user-set via the canvas Secrets tab,
// authenticated as the workspace owner. If the user pastes
// their own scoped GitHub PAT under GITHUB_TOKEN so the agent
// can push to their personal repos, that is the system working
// as designed — not the leak RFC#523 was written to catch.
//
// The provenance side-channel from loadWorkspaceSecrets tells us
// which keys came from global_secrets (workspace_secrets writes
// override and clear the flag, since the user explicitly re-set
// the value). We restrict the abort to that set.
//
// Defense-in-depth NOT removed: provisioner.buildContainerEnv still
// runs the forensic #145 silent-strip (lower-confidence late layer),
// and workspace/entrypoint.sh has Layer 2 inside the container. If a
// real operator-scope token slips into workspace_secrets some other
// way, the later layers (and the per-workspace SG, and the per-tenant
// VPC isolation) are still in force.
//
// Key names (not values) are echoed in the user-facing error so
// the operator can locate and remove the offending row. Per memory
// `feedback_passwords_in_chat_are_burned`, key names are not
// secret; values would be.
if forbidden := findForbiddenTenantEnvKeys(envVars); len(forbidden) > 0 {
if forbidden := findForbiddenTenantEnvKeysFromGlobals(envVars, globalSecretKeys); len(forbidden) > 0 {
msg := formatForbiddenTenantEnvError(forbidden)
log.Printf("Provisioner: ABORT workspace=%s — forbidden operator-scope env keys present: %v (RFC#523)", workspaceID, forbidden)
log.Printf("Provisioner: ABORT workspace=%s — forbidden operator-scope env keys present in global_secrets: %v (RFC#523)", workspaceID, forbidden)
return nil, &provisionAbort{
Msg: msg,
Extra: map[string]interface{}{"error": msg, "forbidden_env_keys": forbidden, "rfc": "523"},
Extra: map[string]interface{}{"error": msg, "forbidden_env_keys": forbidden, "rfc": "523", "source": "global_secrets"},
}
}
@@ -70,6 +70,97 @@ var restartDebounceWindow = 60 * time.Second
// workspace-server yet — that's a separate RFC.
var restartByIDDropCounter atomic.Uint64
// fileWriteRestartDebounceWindow is the per-workspace coalescing window for
// the file-write → RestartByID trigger fired by templates.go's WriteFile,
// DeleteFile, and ReplaceFiles handlers (and template_import.go's variants).
//
// Background (internal#624 2026-05-20): canvas Save fires N PUT /files
// requests in a 30-60s burst (claude-code SEO agent observed 10-17 files in
// 60s). Each successful write previously fired `goAsync(RestartByID)`. The
// 60s self-fire debounce in RestartByID itself catches calls 1-60s, but
// writes at T+65s+ pass the debounce, set pending=true on a still-running
// coalesceRestart cycle, and drain immediately into cycle 2 — which DELETEs
// + recreates EC2 mid-burst, returning 500 EC2InstanceStateInvalidException
// on the in-flight user PUTs.
//
// 15s is sized to absorb a canvas Save burst (writes typically land within
// a 5-10s window) while still letting a deliberate "edit, wait, edit again"
// pattern restart twice. Bigger than that would silently swallow legitimate
// rapid-iteration edits; smaller would let burst tails leak through.
var fileWriteRestartDebounceWindow = 15 * time.Second
// fileWriteRestartLastFireAt records the last time `maybeRestartAfterFileWrite`
// actually fired a restart for each workspace. sync.Map (not RWMutex+map)
// because writes happen on every successful file-write handler, reads on
// every subsequent file-write handler call — both per-workspace — and the
// keys are sparse + long-lived. Stored as int64 unix-nano so the load/store
// path can stay lock-free (atomic.Int64 inside sync.Map.Value is fine, but
// time.Time itself isn't atomically loadable).
var fileWriteRestartLastFireAt sync.Map // map[workspaceID]*atomic.Int64
// fileWriteRestartDropCounter counts how many file-write restart triggers
// were silently coalesced. Same observability rationale as
// restartByIDDropCounter — package-level atomic so tests can assert the
// drop fired and ops can correlate with "user clicked Save 10 times,
// only saw 1 restart cycle".
var fileWriteRestartDropCounter atomic.Uint64
// maybeRestartAfterFileWrite is the call-site debounce wrapper for the 9
// file-write trigger sites in templates.go + template_import.go. Replaces
// the direct `goAsync(func() { wh.RestartByID(wsID) })` pattern with a
// 15s per-workspace coalescing window:
//
// - First call (no prior fire OR last fire >15s ago): records the
// current timestamp and fires goAsync(RestartByID).
// - Subsequent calls within 15s of the last fire: silently dropped,
// drop counter incremented.
//
// This is the call-site-layer protection (internal#624 Path A). The drain-
// loop layer in coalesceRestart (Path B, re-stamping restartStartedAt per
// iteration) is the platform-layer defense in depth — together they close
// the file-write tight-loop class regardless of which entry point fires.
//
// Stateless on the handler so any handler with access to a WorkspaceHandler
// can use it; the per-workspace state lives in the package-level sync.Map.
func (h *WorkspaceHandler) maybeRestartAfterFileWrite(workspaceID string) {
now := time.Now().UnixNano()
// LoadOrStore the per-workspace last-fire stamp. First write for a
// brand-new workspace falls through the CompareAndSwap below because
// the zero-init value (0) is far enough in the past to satisfy the
// "last fire >15s ago" predicate.
sv, _ := fileWriteRestartLastFireAt.LoadOrStore(workspaceID, new(atomic.Int64))
stamp := sv.(*atomic.Int64)
// CAS loop: read last, decide, swap. We use CAS instead of Lock/Unlock
// because the typical case is "thousands of writes, one restart per
// 15s" — uncontended atomic is ~5ns vs ~30ns mutex. Bounded retry
// because in the rare contended case (two writes finishing nanoseconds
// apart) one will win the swap and the other will see the new stamp,
// drop, and bail.
for retry := 0; retry < 4; retry++ {
last := stamp.Load()
elapsed := time.Duration(now - last)
if last != 0 && elapsed < fileWriteRestartDebounceWindow {
// Within debounce window — drop silently.
fileWriteRestartDropCounter.Add(1)
log.Printf("maybeRestartAfterFileWrite: %s — coalesced "+
"(last fire %s ago < %s window; total dropped=%d)",
workspaceID, elapsed.Round(time.Millisecond),
fileWriteRestartDebounceWindow,
fileWriteRestartDropCounter.Load())
return
}
if stamp.CompareAndSwap(last, now) {
break
}
// Another writer beat us to the stamp update. Re-read and retry;
// the retry will almost certainly see the new value and drop.
}
h.goAsync(func() { h.RestartByID(workspaceID) })
}
// isRestarting reports whether a restart cycle is currently in flight for
// the workspace. Callers that have their own "container looks dead" probe
// MUST consult this before triggering a restart, because during the
@@ -513,6 +604,27 @@ func coalesceRestart(workspaceID string, cycle func()) {
// inside provisionWorkspace, so any writes that committed since the
// last cycle are picked up. Continues until no pending request was
// observed at the top of an iteration.
//
// internal#624 Path B (defense in depth for the file-write tight-loop
// class): re-stamp restartStartedAt at the top of every drain iteration
// past the first. The original design (stamp only on false→true edge)
// treated all drained pending as "one event from the debounce's POV",
// which is correct for the secrets-batch use case but lets a file-write
// burst at T+65s of a 60s drain pipe straight into another full cycle.
// Re-stamping closes that hole — each drained cycle gets its own fresh
// debounce window, so any RestartByID arriving during cycle N is
// dropped by shouldDebounceRestart instead of accumulating into
// pending=true for cycle N+1.
//
// The original "one cycle picks up everyone who arrived during it"
// semantic still holds for the secrets-write path: callers that hit
// coalesceRestart during cycle 1 still set pending=true and still get
// their effects landed in cycle 2. What changes is that callers
// arriving during cycle 2 (via RestartByID) now hit the re-stamped
// debounce and are dropped instead of being chained into cycle 3,
// which is exactly the chain that produced the 22:08-22:10 thrash on
// 3fe84b89.
iteration := 0
for {
state.mu.Lock()
if !state.pending {
@@ -520,7 +632,13 @@ func coalesceRestart(workspaceID string, cycle func()) {
return // defer clears running
}
state.pending = false
if iteration > 0 {
// Re-stamp for drained iterations only; the false→true edge
// already stamped at the top of coalesceRestart.
state.restartStartedAt = time.Now()
}
state.mu.Unlock()
iteration++
cycle()
}
@@ -0,0 +1,316 @@
package handlers
// Tests for internal#624 — file-write → RestartByID tight-loop fix.
//
// Empirical chain (Loki 2026-05-20 22:00-22:11Z on workspace
// 3fe84b89-eb65-42fc-ad1f-5c93582ca3e7, claude-code SEO Agent):
//
// 1. Canvas Save writes 10-17 files in a 30-60s window.
// 2. Each successful PUT /files at templates.go:575 / 591 / 607 / 662 /
// 682 / 697 (and template_import.go:239 / 275 / 297) fires
// `goAsync(func() { wh.RestartByID(wsID) })`.
// 3. RestartByID's existing 60s self-fire debounce catches calls 1-60s
// after the cycle starts. But writes at T+65s+ pass the debounce,
// set pending=true on the still-running coalesceRestart cycle, and
// drain IMMEDIATELY into cycle 2 — no re-debounce because the
// original drain loop re-uses the same restartStartedAt.
// 4. Cycle 2 DELETEs+recreates EC2 mid-burst → user sees
// EC2InstanceStateInvalidException 500 on the in-flight PUTs.
//
// Fix: two layers (both shipped in the same PR).
//
// Path A (call-site debounce): every file-write trigger goes through
// maybeRestartAfterFileWrite, which silently drops re-fires within 15s
// of the last fire for the same workspace.
//
// Path B (drain-loop re-stamp): coalesceRestart now re-stamps
// restartStartedAt at the top of each drained iteration, so any
// RestartByID arriving during a drained cycle hits a fresh 60s window
// and is dropped by shouldDebounceRestart instead of chaining further.
import (
"sync"
"sync/atomic"
"testing"
"time"
)
// resetFileWriteDebounceState wipes the package-level sync.Map + drop
// counter for the given workspace ID. Tests must call this between
// scenarios because fileWriteRestartLastFireAt is shared.
func resetFileWriteDebounceState(workspaceID string) {
fileWriteRestartLastFireAt.Delete(workspaceID)
fileWriteRestartDropCounter.Store(0)
}
// newFileWriteDebounceHandler constructs a minimal *WorkspaceHandler with
// no provisioner so RestartByID short-circuits at HasProvisioner()=false
// — we only care that maybeRestartAfterFileWrite reaches goAsync at all.
// The asyncWG inside goAsync lets us wait for the goroutine to finish so
// we can deterministically observe whether RestartByID was scheduled.
func newFileWriteDebounceHandler(t *testing.T) *WorkspaceHandler {
t.Helper()
return NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
}
// TestMaybeRestartAfterFileWrite_FirstWriteRestarts — the baseline case:
// the very first call for a workspace must actually fire goAsync (i.e.
// no debounce-drop on the first PUT). Without this the helper would
// silently swallow every legitimate single-file save.
func TestMaybeRestartAfterFileWrite_FirstWriteRestarts(t *testing.T) {
const wsID = "fw-debounce-first"
resetFileWriteDebounceState(wsID)
h := newFileWriteDebounceHandler(t)
h.maybeRestartAfterFileWrite(wsID)
// Drop counter must NOT have incremented — the call fired.
if got := fileWriteRestartDropCounter.Load(); got != 0 {
t.Errorf("first call to maybeRestartAfterFileWrite must fire (drop counter must stay 0), got %d", got)
}
// Last-fire timestamp must be populated (non-zero) so the next call
// will compare against it.
sv, ok := fileWriteRestartLastFireAt.Load(wsID)
if !ok {
t.Fatal("first call must register the workspace in fileWriteRestartLastFireAt")
}
stamp := sv.(*atomic.Int64).Load()
if stamp == 0 {
t.Error("first call must record a non-zero last-fire timestamp")
}
// Wait for the spawned goroutine to finish so it doesn't leak into
// the next test (RestartByID will short-circuit on no-provisioner).
h.waitAsyncForTest()
}
// TestMaybeRestartAfterFileWrite_SecondWriteWithin15sSkipped — the core
// fix: a second call within fileWriteRestartDebounceWindow of the first
// MUST NOT fire RestartByID. The drop counter must increment by exactly
// one and the last-fire timestamp must remain the FIRST call's stamp
// (proof that the second call did not overwrite it).
func TestMaybeRestartAfterFileWrite_SecondWriteWithin15sSkipped(t *testing.T) {
const wsID = "fw-debounce-second-within"
resetFileWriteDebounceState(wsID)
h := newFileWriteDebounceHandler(t)
// First call — fires.
h.maybeRestartAfterFileWrite(wsID)
h.waitAsyncForTest()
sv, _ := fileWriteRestartLastFireAt.Load(wsID)
firstStamp := sv.(*atomic.Int64).Load()
// Second call immediately — must be dropped.
h.maybeRestartAfterFileWrite(wsID)
if got := fileWriteRestartDropCounter.Load(); got != 1 {
t.Errorf("second call within 15s must increment drop counter by exactly 1, got %d", got)
}
// The CAS-loop must NOT have overwritten the first-call stamp — the
// debounce branch short-circuits before the CompareAndSwap.
stampAfter := sv.(*atomic.Int64).Load()
if stampAfter != firstStamp {
t.Errorf("dropped call must NOT update last-fire stamp (preserves debounce window); "+
"first=%d after=%d", firstStamp, stampAfter)
}
}
// TestMaybeRestartAfterFileWrite_ManyWritesInBurstCoalesceToOne — the
// "bonus" regression test called out in the issue: 10 simulated PUTs
// over 60s (compressed to a tight loop, all within 15s) must produce
// exactly 1 RestartByID schedule and 9 drops. Models the canvas Save
// burst shape that triggered the prod incident.
func TestMaybeRestartAfterFileWrite_ManyWritesInBurstCoalesceToOne(t *testing.T) {
const wsID = "fw-debounce-burst"
resetFileWriteDebounceState(wsID)
h := newFileWriteDebounceHandler(t)
// 10 rapid-fire calls — simulates 10 PUTs landing inside the canvas
// Save burst window.
const burstSize = 10
for i := 0; i < burstSize; i++ {
h.maybeRestartAfterFileWrite(wsID)
}
h.waitAsyncForTest()
// One fired (call #1) + 9 dropped.
if got := fileWriteRestartDropCounter.Load(); got != burstSize-1 {
t.Errorf("expected %d drops for a %d-call burst (only call #1 fires), got %d",
burstSize-1, burstSize, got)
}
}
// TestMaybeRestartAfterFileWrite_AfterWindowExpiresFiresAgain — outside
// the debounce window, the helper must release and fire again. Shrinks
// fileWriteRestartDebounceWindow to 5ms so we don't sleep 15s in CI.
// Important: without this, a legitimate "user edited, walked away for
// a minute, edited again" would never restart and config changes would
// never reach the agent.
func TestMaybeRestartAfterFileWrite_AfterWindowExpiresFiresAgain(t *testing.T) {
const wsID = "fw-debounce-window-expires"
resetFileWriteDebounceState(wsID)
orig := fileWriteRestartDebounceWindow
fileWriteRestartDebounceWindow = 5 * time.Millisecond
defer func() { fileWriteRestartDebounceWindow = orig }()
h := newFileWriteDebounceHandler(t)
h.maybeRestartAfterFileWrite(wsID) // fires
h.waitAsyncForTest()
// Wait past the window.
time.Sleep(20 * time.Millisecond)
h.maybeRestartAfterFileWrite(wsID) // must fire again
h.waitAsyncForTest()
// Drop counter must still be 0 — both calls fired.
if got := fileWriteRestartDropCounter.Load(); got != 0 {
t.Errorf("second call after window expiry must fire (not drop), got %d drops", got)
}
}
// TestMaybeRestartAfterFileWrite_DifferentWorkspacesIndependent — the
// per-workspace state map must isolate: a burst on workspace A must not
// affect workspace B's debounce. Pinning so a future "use a single
// global atomic" refactor breaks loudly.
func TestMaybeRestartAfterFileWrite_DifferentWorkspacesIndependent(t *testing.T) {
const wsA = "fw-debounce-ws-a"
const wsB = "fw-debounce-ws-b"
resetFileWriteDebounceState(wsA)
resetFileWriteDebounceState(wsB)
h := newFileWriteDebounceHandler(t)
// 5 calls on A, all but one drop.
for i := 0; i < 5; i++ {
h.maybeRestartAfterFileWrite(wsA)
}
h.waitAsyncForTest()
dropsAfterA := fileWriteRestartDropCounter.Load()
// First call on B — must fire (its own independent window).
h.maybeRestartAfterFileWrite(wsB)
h.waitAsyncForTest()
// B's call must not have incremented the drop counter — it fired.
if got := fileWriteRestartDropCounter.Load(); got != dropsAfterA {
t.Errorf("workspace B's first call must fire (not share workspace A's debounce); "+
"drops after A=%d, drops after B=%d", dropsAfterA, got)
}
// Both workspaces must have their own last-fire entries.
if _, ok := fileWriteRestartLastFireAt.Load(wsA); !ok {
t.Error("workspace A missing from fileWriteRestartLastFireAt")
}
if _, ok := fileWriteRestartLastFireAt.Load(wsB); !ok {
t.Error("workspace B missing from fileWriteRestartLastFireAt")
}
}
// TestMaybeRestartAfterFileWrite_ConcurrentCallsSafelyDebounced — the
// CAS-loop contract: many goroutines hitting the helper concurrently
// must still produce at most one fired call (drops = N-1). Pinning the
// "thousands of writes, one restart" performance shape called out in
// the helper's comment. Uses sync.WaitGroup to release all goroutines
// in a tight burst so the CAS is genuinely contended.
func TestMaybeRestartAfterFileWrite_ConcurrentCallsSafelyDebounced(t *testing.T) {
const wsID = "fw-debounce-concurrent"
resetFileWriteDebounceState(wsID)
h := newFileWriteDebounceHandler(t)
const goroutines = 50
start := make(chan struct{})
var wg sync.WaitGroup
for i := 0; i < goroutines; i++ {
wg.Add(1)
go func() {
defer wg.Done()
<-start // hold every goroutine at the gate
h.maybeRestartAfterFileWrite(wsID)
}()
}
close(start) // release the herd
wg.Wait()
h.waitAsyncForTest()
// Exactly N-1 drops: one goroutine wins the CAS and fires, all
// other N-1 see a fresh stamp and drop into the debounce branch.
if got := fileWriteRestartDropCounter.Load(); got != goroutines-1 {
t.Errorf("expected %d drops for %d concurrent callers (exactly one fires), got %d",
goroutines-1, goroutines, got)
}
}
// TestCoalesceRestart_DrainRespectsRestartedAtBetweenIterations —
// Path B regression: when coalesceRestart drains a pending request into
// a follow-up cycle, the restartStartedAt timestamp must be re-stamped
// for that follow-up iteration. Without this, a RestartByID arriving
// during cycle 2 would hit a stale 60s window (computed from cycle 1's
// start) and could pass the debounce just because cycle 1 + cycle 2's
// runtime exceeded 60s combined.
//
// The test fires cycle 1 → completes → sets pending=true to trigger
// cycle 2 → asserts that restartStartedAt was advanced for the drained
// iteration. The cycle function itself just records the wall-clock at
// which it observed restartStartedAt, so the test can compare cycle 1's
// stamp vs cycle 2's stamp.
func TestCoalesceRestart_DrainRespectsRestartedAtBetweenIterations(t *testing.T) {
const wsID = "fw-debounce-drain-restamp"
resetRestartStatesFor(wsID)
// Capture the restartStartedAt observed at the top of each cycle
// iteration. The cycle reads it directly from the state map so we
// see what coalesceRestart wrote.
var stamps []time.Time
var stampsMu sync.Mutex
cycleCount := 0
cycle := func() {
sv, _ := restartStates.Load(wsID)
state := sv.(*restartState)
state.mu.Lock()
stampsMu.Lock()
stamps = append(stamps, state.restartStartedAt)
stampsMu.Unlock()
state.mu.Unlock()
cycleCount++
if cycleCount == 1 {
// While inside cycle 1, set pending=true so the drain loop
// runs cycle 2 next iteration. Mirrors the prod shape: a
// PUT lands during cycle 1, sets pending=true via
// RestartByID → coalesceRestart's pending branch.
state.mu.Lock()
state.pending = true
state.mu.Unlock()
// Sleep briefly so cycle 2's stamp is observably later
// than cycle 1's. Without a real wall-clock gap the
// assertion can't tell re-stamp from no-op.
time.Sleep(20 * time.Millisecond)
}
}
coalesceRestart(wsID, cycle)
stampsMu.Lock()
defer stampsMu.Unlock()
if len(stamps) != 2 {
t.Fatalf("expected 2 cycle iterations (original + drained pending), got %d", len(stamps))
}
if !stamps[1].After(stamps[0]) {
t.Errorf("Path B regression: cycle 2's restartStartedAt (%v) must be AFTER "+
"cycle 1's (%v) — drained iterations must re-stamp so the self-fire "+
"debounce window resets per cycle. Without this, a RestartByID arriving "+
"during cycle 2 sees a stale window and can chain into cycle 3.",
stamps[1], stamps[0])
}
}
@@ -397,6 +397,8 @@ func Setup(hub *ws.Hub, broadcaster *events.Broadcaster, prov *provisioner.Provi
wsAuth.GET("/tokens", tokh.List)
wsAuth.POST("/tokens", tokh.Create)
wsAuth.DELETE("/tokens/:tokenId", tokh.Revoke)
adminTokH := handlers.NewAdminWorkspaceTokenHandler()
r.POST("/admin/workspaces/:id/tokens", middleware.AdminAuth(db.DB), adminTokH.Create)
// Memory
memh := handlers.NewMemoryHandler()
-13
View File
@@ -1,13 +0,0 @@
# coverage.py config — consumed by `pytest --cov` via the pytest-cov
# plugin. Lives here (not in pytest.ini) because coverage.py only reads
# .coveragerc / setup.cfg / tox.ini / pyproject.toml — the [coverage:*]
# sections in pytest.ini are silently ignored. See issue #1817.
[run]
omit =
*/tests/*
*/__init__.py
plugins_registry/*
[report]
# Skip files at 100% in the term-missing output to keep CI logs readable.
skip_covered = True
-104
View File
@@ -1,104 +0,0 @@
FROM python:3.11-slim@sha256:e78299e55776ca065dcb769f80161f48465ad352014240eb5fe4712e22505e9b
WORKDIR /app
# Install Node.js, git, gh CLI in a single layer to minimize image size
RUN apt-get update && \
apt-get install -y --no-install-recommends curl git ca-certificates && \
# Node.js 22
curl -fsSL https://deb.nodesource.com/setup_22.x | bash - && \
apt-get install -y --no-install-recommends nodejs && \
# GitHub CLI
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \
| dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg && \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
> /etc/apt/sources.list.d/github-cli.list && \
apt-get update && apt-get install -y --no-install-recommends gh && \
# Cleanup apt caches and temp files
apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# Create non-root user (claude --dangerously-skip-permissions refuses root)
RUN useradd -m -s /bin/bash agent
# Install base Python dependencies (A2A SDK + HTTP only)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy runtime code (adapters/ has been removed — adapters now live in standalone
# template repos and install molecule-ai-workspace-runtime from PyPI)
COPY *.py ./
COPY entrypoint.sh ./
COPY skill_loader/ ./skill_loader/
COPY builtin_tools/ ./builtin_tools/
COPY plugins_registry/ ./plugins_registry/
COPY policies/ ./policies/
# Create CLI aliases
RUN ln -s /app/a2a_cli.py /usr/local/bin/a2a && chmod +x /app/a2a_cli.py /app/a2a_mcp_server.py && \
ln -s /app/molecule_ai_status.py /usr/local/bin/molecule-monorepo-status && chmod +x /app/molecule_ai_status.py
# gh wrapper — auto-prefixes PR / issue titles with the agent role + appends
# a body footer. Every agent in the template shares one GitHub PAT so plain
# `gh pr list` can't distinguish workspaces; the wrapper reads GIT_AUTHOR_NAME
# (set by the platform provisioner, "Molecule AI <Role>") and rewrites the
# title/body accordingly. Fails open when the env is missing. Anything that
# isn't `gh pr create` or `gh issue create` passes through untouched.
# /usr/local/bin is earlier in PATH than /usr/bin/gh so this shadows the
# real binary without renaming it.
COPY scripts/gh-wrapper.sh /usr/local/bin/gh
RUN chmod +x /usr/local/bin/gh
# Copy the git credential helper so entrypoint.sh can register it at boot.
# molecule-git-token-helper.sh fetches a fresh GitHub App installation token
# from the platform on every git push/fetch, preventing stale-token failures
# after the ~60 min GitHub App token TTL (issue #613 / #547).
COPY scripts/molecule-git-token-helper.sh ./scripts/
RUN chmod +x ./scripts/molecule-git-token-helper.sh
# Copy the background token refresh daemon. Runs as a background process
# started by entrypoint.sh — refreshes gh CLI auth and the credential
# helper cache every 45 min so tokens never expire mid-operation.
COPY scripts/molecule-gh-token-refresh.sh ./scripts/
RUN chmod +x ./scripts/molecule-gh-token-refresh.sh
# Generic GIT_ASKPASS helper. Reads HTTPS Basic-Auth credentials from env
# vars (GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD, with GITEA_USER / GITEA_TOKEN
# as fallback) and emits them on the git credential-prompt protocol so
# container-side `git` can authenticate to any private HTTPS remote
# without on-disk .gitconfig / .git-credentials mutation. The platform
# provisioner sets GIT_ASKPASS=/usr/local/bin/molecule-askpass via
# applyAgentGitIdentity (workspace-server/internal/handlers/agent_git_identity.go).
# Filename is the only project-specific marker; the script body contains
# no vendor literals and is identical to the script shipped in each
# open-source workspace template (scripts/git-askpass.sh).
COPY scripts/molecule-askpass /usr/local/bin/molecule-askpass
RUN chmod +x /usr/local/bin/molecule-askpass
# Dirs and permissions
RUN mkdir -p /workspace /plugins /home/agent/.claude /home/agent/.config /home/agent/.local \
/home/agent/.molecule-token-cache && \
chown -R agent:agent /app /home/agent /workspace
# Install gosu for clean root → agent user handoff in entrypoint.
# The entrypoint starts as root to fix volume ownership, then exec's
# as the agent user so Claude Code's --dangerously-skip-permissions works.
RUN apt-get update && apt-get install -y --no-install-recommends gosu && \
rm -rf /var/lib/apt/lists/*
VOLUME /configs
VOLUME /workspace
EXPOSE 8000
# HEALTHCHECK: probe the A2A agent-card endpoint so orchestrators and
# container runtimes can detect a live, responsive workspace agent.
# Uses curl (present in python:3.11-slim base) against the uvicorn server.
# PORT is injected at runtime via the molecule-runtime entrypoint; the
# default matches EXPOSE.
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD curl -sf http://localhost:${PORT:-8000}/agent/card >/dev/null || exit 1
RUN chmod +x /app/entrypoint.sh
# Start as root — entrypoint fixes volume permissions then drops to agent
CMD ["./entrypoint.sh"]
-1
View File
@@ -1 +0,0 @@
# trigger autobump for python-multipart pin (PDF P0 cure)
-105
View File
@@ -1,105 +0,0 @@
"""OFFSEC-003: A2A peer-result sanitization — shared across delegation tools.
This module is intentionally a LEAF (no imports from the molecule-runtime
package) to avoid circular dependency cycles. Both ``a2a_tools_delegation``
and ``a2a_tools`` can import from here without creating import loops.
Trust-boundary design (OFFSEC-003):
A2A peer responses are untrusted third-party content. Before passing
them to the agent context, they MUST be wrapped in a trust-boundary
marker pair so the calling agent knows the content is external.
Boundary markers:
- _A2A_BOUNDARY_START = "[A2A_RESULT_FROM_PEER]"
- _A2A_BOUNDARY_END = "[/A2A_RESULT_FROM_PEER]"
The boundary is the PRIMARY security control. A peer that sends
"[A2A_RESULT_FROM_PEER]evil[/A2A_RESULT_FROM_PEER]safe" can make "safe"
appear inside the trusted context unless the markers themselves are
escaped before wrapping see _escape_boundary_markers() below.
Defense-in-depth (secondary):
Known prompt-injection control-words are also escaped so that even
if a calling agent ignores the boundary marker, embedded attack
patterns (SYSTEM:, OVERRIDE:, etc.) lose their special meaning.
This is not a complete injection sanitizer do not rely on it as
the primary control.
"""
from __future__ import annotations
import re
# ── Trust-boundary markers ────────────────────────────────────────────────────
_A2A_BOUNDARY_START = "[A2A_RESULT_FROM_PEER]"
_A2A_BOUNDARY_END = "[/A2A_RESULT_FROM_PEER]"
# ── Boundary-marker escaping ─────────────────────────────────────────────────
# A peer that sends "[/A2A_RESULT_FROM_PEER]evil" can make "evil" appear
# inside the trusted zone. Escape BOTH boundary markers in the raw text
# before wrapping so they can never close the boundary early.
# We use "[/ " as the escape prefix — visually distinct from the real marker.
_A2A_BOUNDARY_START_ESCAPED = "[/ A2A_RESULT_FROM_PEER]"
_A2A_BOUNDARY_END_ESCAPED = "[/ /A2A_RESULT_FROM_PEER]"
def _escape_boundary_markers(text: str) -> str:
"""Escape boundary markers inside the raw peer text before wrapping.
Replaces any occurrence of the boundary start/end markers with a
visually-similar escaped form so a malicious peer can never close
the boundary early or inject a fake opener.
"""
return (
text.replace(_A2A_BOUNDARY_START, _A2A_BOUNDARY_START_ESCAPED)
.replace(_A2A_BOUNDARY_END, _A2A_BOUNDARY_END_ESCAPED)
)
# ── Defense-in-depth: injection pattern escaping ───────────────────────────────
# These patterns cover common prompt-injection phrasings. They are NOT a
# complete sanitizer — see module docstring. The boundary marker is the
# primary control; these are purely defense-in-depth.
_INJECTION_PATTERNS = [
# Single-word patterns: anchor to word boundary so they don't match
# inside other words (e.g. "SYSTEM" in "mySYSTEMatic").
# Single-word patterns: anchor to word boundary so they don't match
# inside other words (e.g. "SYSTEM" in "mySYSTEMatic").
(re.compile(r"(^|[^\w])SYSTEM\b", re.IGNORECASE), r"\1[ESCAPED_SYSTEM]"),
(re.compile(r"(^|[^\w])OVERRIDE\b", re.IGNORECASE), r"\1[ESCAPED_OVERRIDE]"),
# "INSTRUCTIONS" may appear at the start of a string or after a newline.
(re.compile(r"(^|\n)INSTRUCTIONS?\b", re.IGNORECASE), " [ESCAPED_INSTRUCTIONS]"),
(re.compile(r"(^|[^\w])IGNORE\s+ALL\b", re.IGNORECASE), r"\1[ESCAPED_IGNORE_ALL]"),
(re.compile(r"(^|[^\w])YOU\s+ARE\s+NOW\b", re.IGNORECASE), r"\1[ESCAPED_YOU_ARE_NOW]"),
]
def sanitize_a2a_result(text: str) -> str:
"""Sanitize untrusted text from an A2A peer (OFFSEC-003).
Order of operations:
1. Escape boundary markers in the raw text (prevents injection).
2. Escape known injection patterns (defense-in-depth).
Returns the input unchanged if it is empty/None.
Note: this function does NOT add boundary wrappers callers that need
to establish a trust boundary should wrap the sanitized result with
``[A2A_RESULT_FROM_PEER]\\n{sanitized}\\n[/A2A_RESULT_FROM_PEER]``.
See ``a2a_tools_delegation.py:tool_delegate_task`` for the canonical
wrapping pattern.
"""
if not text:
return text
# 1. Escape boundary markers so a malicious peer cannot break the
# trust boundary from inside their response.
escaped = _escape_boundary_markers(text)
# 2. Escape known injection control-words (defense-in-depth only).
for pattern, replacement in _INJECTION_PATTERNS:
escaped = pattern.sub(replacement, escaped)
return escaped
-251
View File
@@ -1,251 +0,0 @@
#!/usr/bin/env python3
"""A2A CLI — command-line tools for inter-workspace communication.
Supports both synchronous and asynchronous delegation:
a2a delegate <id> <task> Send task, wait for response (sync)
a2a delegate --async <id> <task> Send task, return task ID immediately
a2a status <task_id> Check task status / get result
a2a peers List available peers
a2a info Show this workspace's info
Environment variables:
WORKSPACE_ID this workspace's ID
PLATFORM_URL platform API base URL
"""
import asyncio
import json
import os
import sys
import uuid
import httpx
_WORKSPACE_ID_raw = os.environ.get("WORKSPACE_ID")
if not _WORKSPACE_ID_raw:
raise RuntimeError("WORKSPACE_ID environment variable is required but not set")
WORKSPACE_ID = _WORKSPACE_ID_raw
# Platform URL: always host.docker.internal inside containers. The platform API
# is only reachable via the Docker network mesh from inside a workspace
# container regardless of the runtime environment (Docker/host).
PLATFORM_URL = os.environ.get("PLATFORM_URL", "http://host.docker.internal:8080")
async def discover(target_id: str) -> dict | None:
"""Discover a peer workspace's URL."""
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/registry/discover/{target_id}",
headers={"X-Workspace-ID": WORKSPACE_ID},
)
if resp.status_code == 200:
return resp.json()
return None
async def delegate(target_id: str, task: str, async_mode: bool = False):
"""Delegate a task to another workspace."""
peer = await discover(target_id)
if not peer:
print(f"Error: cannot reach workspace {target_id} (access denied or offline)", file=sys.stderr)
sys.exit(1)
target_url = peer.get("url", "")
if not target_url:
print(f"Error: workspace {target_id} has no URL", file=sys.stderr)
sys.exit(1)
task_id = str(uuid.uuid4())
if async_mode:
# Async: send and return immediately, don't wait for response
# Use a background task that fires and forgets
async with httpx.AsyncClient(timeout=10.0) as client:
try:
# Send with a short timeout — just confirm receipt
resp = await client.post(
target_url,
json={
"jsonrpc": "2.0",
"id": task_id,
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": str(uuid.uuid4()),
"parts": [{"kind": "text", "text": task}],
}
},
},
)
# Even if we timeout, the task is queued on the target
print(json.dumps({
"task_id": task_id,
"target": target_id,
"status": "submitted",
"target_url": target_url,
}))
except httpx.TimeoutException:
# Request was sent but we didn't get confirmation — task may or may not have been received
print(json.dumps({
"task_id": task_id,
"target": target_id,
"status": "uncertain",
"note": "Request sent but response timed out — delivery unconfirmed. Use 'a2a status' to check.",
}), file=sys.stderr)
return
# Sync: wait for full response with retry on rate limit
max_retries = 3
for attempt in range(max_retries):
async with httpx.AsyncClient(timeout=300.0) as client:
try:
resp = await client.post(
target_url,
json={
"jsonrpc": "2.0",
"id": task_id,
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": str(uuid.uuid4()),
"parts": [{"kind": "text", "text": task}],
}
},
},
)
try:
data = resp.json()
except Exception:
print(f"Error: invalid JSON response (status {resp.status_code})", file=sys.stderr)
sys.exit(1)
if "result" in data:
parts = data["result"].get("parts", [])
text = parts[0].get("text", "") if parts else ""
if text and text != "(no response generated)":
print(text)
return
# Empty or no-response — might be rate limited, retry
if attempt < max_retries - 1:
delay = 5 * (2 ** attempt)
print(f"(empty response, retrying in {delay}s...)", file=sys.stderr)
await asyncio.sleep(delay)
continue
print(text or "(no response after retries)")
elif "error" in data:
error_msg = data['error'].get('message', 'unknown')
if ("rate" in error_msg.lower() or "overloaded" in error_msg.lower()) and attempt < max_retries - 1:
delay = 5 * (2 ** attempt)
print(f"(rate limited, retrying in {delay}s...)", file=sys.stderr)
await asyncio.sleep(delay)
continue
print(f"Error: {error_msg}", file=sys.stderr)
sys.exit(1)
return
except httpx.TimeoutException:
if attempt < max_retries - 1:
delay = 5 * (2 ** attempt)
print(f"(timeout, retrying in {delay}s...)", file=sys.stderr)
await asyncio.sleep(delay)
continue
print("Error: request timed out after retries", file=sys.stderr)
sys.exit(1)
async def check_status(target_id: str, task_id: str):
"""Check the status of an async task."""
peer = await discover(target_id)
if not peer:
print(f"Error: cannot reach workspace {target_id}", file=sys.stderr)
sys.exit(1)
target_url = peer.get("url", "")
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.post(
target_url,
json={
"jsonrpc": "2.0",
"id": str(uuid.uuid4()),
"method": "tasks/get",
"params": {"id": task_id},
},
)
data = resp.json()
if "result" in data:
task = data["result"]
status = task.get("status", {}).get("state", "unknown")
print(f"Status: {status}")
if status == "completed":
artifacts = task.get("artifacts", [])
for a in artifacts:
for p in a.get("parts", []):
if p.get("text"):
print(p["text"])
elif "error" in data:
print(f"Error: {data['error'].get('message', 'unknown')}")
async def peers():
"""List available peers."""
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(f"{PLATFORM_URL}/registry/{WORKSPACE_ID}/peers")
if resp.status_code != 200:
print("Error: could not fetch peers", file=sys.stderr)
sys.exit(1)
for p in resp.json():
status = p.get("status", "?")
role = p.get("role", "")
print(f"{p['id']} {p['name']:30s} {status:10s} {role}")
async def info():
"""Get this workspace's info."""
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(f"{PLATFORM_URL}/workspaces/{WORKSPACE_ID}")
if resp.status_code == 200:
d = resp.json()
print(f"ID: {d['id']}")
print(f"Name: {d['name']}")
print(f"Role: {d.get('role', '')}")
print(f"Tier: {d['tier']}")
print(f"Status: {d['status']}")
print(f"Parent: {d.get('parent_id', '(root)')}")
def main():
if len(sys.argv) < 2:
print("Usage: a2a <command> [args]")
print("Commands:")
print(" delegate <workspace_id> <task> — Send task, wait for response")
print(" delegate --async <workspace_id> <task> — Send task, return immediately")
print(" status <workspace_id> <task_id> — Check async task status")
print(" peers — List available peers")
print(" info — Show workspace info")
sys.exit(1)
cmd = sys.argv[1]
if cmd == "delegate":
async_mode = "--async" in sys.argv
args = [a for a in sys.argv[2:] if a != "--async"]
if len(args) < 2:
print("Usage: a2a delegate [--async] <workspace_id> <task>", file=sys.stderr)
sys.exit(1)
asyncio.run(delegate(args[0], " ".join(args[1:]), async_mode))
elif cmd == "status":
if len(sys.argv) < 4:
print("Usage: a2a status <workspace_id> <task_id>", file=sys.stderr)
sys.exit(1)
asyncio.run(check_status(sys.argv[2], sys.argv[3]))
elif cmd == "peers":
asyncio.run(peers())
elif cmd == "info":
asyncio.run(info())
else:
print(f"Unknown command: {cmd}", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__": # pragma: no cover
main()
-803
View File
@@ -1,803 +0,0 @@
"""A2A protocol client — peer discovery, messaging, and workspace info.
Shared constants (WORKSPACE_ID, PLATFORM_URL) live here so that
a2a_tools and a2a_mcp_server can import them from a single place.
"""
import asyncio
import logging
import os
import random
import re
import threading
import time
import uuid
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor
import httpx
import a2a_response
from platform_auth import auth_headers, self_source_headers
logger = logging.getLogger(__name__)
_WORKSPACE_ID_raw = os.environ.get("WORKSPACE_ID")
if not _WORKSPACE_ID_raw:
raise RuntimeError("WORKSPACE_ID environment variable is required but not set")
WORKSPACE_ID = _WORKSPACE_ID_raw
# Platform URL: always host.docker.internal inside containers. The platform API
# is only reachable via the Docker network mesh from inside a workspace
# container regardless of the runtime environment (Docker/host).
PLATFORM_URL = os.environ.get("PLATFORM_URL", "http://host.docker.internal:8080")
# Cache workspace ID → name mappings (populated by list_peers calls)
_peer_names: dict[str, str] = {}
# Cache: peer workspace_id → the source workspace_id whose registry
# returned that peer. Populated by ``a2a_tools.tool_list_peers`` whenever
# it queries a specific workspace's peers — so a later
# ``tool_delegate_task(target)`` can auto-route through the correct
# source workspace without the agent having to specify
# ``source_workspace_id`` explicitly.
#
# Single-workspace mode: dict stays empty, all delegations fall through
# to the module-level WORKSPACE_ID (existing behavior).
#
# Multi-workspace mode: as the agent calls list_peers, this map is
# populated with each peer's source. Subsequent delegate_task calls
# auto-route. If a peer is registered under multiple sources (rare —
# e.g. an org-wide capability) the LAST observed source wins; the agent
# can override by passing ``source_workspace_id`` explicitly.
_peer_to_source: dict[str, str] = {}
# Cache workspace ID → full peer record (id, name, role, status, url, ...).
# Populated by tool_list_peers and by the lazy registry lookup in
# enrich_peer_metadata. The notification-callback path (channel envelope
# enrichment) reads this cache on every inbound peer_agent push, so the
# read shape stays a dict-like ``__getitem__`` lookup; entries carry
# their fetched-at timestamp so TTL eviction is in-line with the
# lookup. ``None`` as the record is the negative-cache sentinel:
# registry failure is cached for one TTL window so we don't re-fire
# the 2s-bounded GET on every push from a flaky peer.
#
# OrderedDict + maxsize bound (#2482): pre-fix this was an unbounded
# ``dict``, so a workspace receiving from N distinct peers across its
# lifetime accumulated ~100 bytes/entry × N indefinitely. At 10K peers
# that's ~1 MB; at 100K (a chatty platform-wide router) ~10 MB; not
# crash-class but unbounded. The LRU bound caps memory + the TTL caps
# per-entry staleness — both gates are needed because a runaway poller
# touching N new peer_ids per push could grow within a single TTL
# window.
#
# All reads / writes go through ``_peer_metadata_get`` /
# ``_peer_metadata_set`` so the LRU move-to-end + size-trim invariants
# stay co-located. Direct mutation is allowed only in test fixtures
# (clearing for isolation); production code path uses the helpers.
_PEER_METADATA_MAXSIZE = 1024
_peer_metadata: "OrderedDict[str, tuple[float, dict | None]]" = OrderedDict()
_peer_metadata_lock = threading.Lock()
# How long an entry in ``_peer_metadata`` is treated as fresh. 5 minutes
# is the same window we use for delegation routing — long enough that a
# busy agent receiving repeated pushes from one peer doesn't hit the
# registry on every push, short enough that role/name renames propagate
# within a single agent session.
_PEER_METADATA_TTL_SECONDS = 300.0
def _peer_metadata_get(canon: str) -> tuple[float, dict | None] | None:
"""Read with LRU touch — moves the entry to the most-recently-used
position so steady-state pushes from a busy peer don't get evicted
by a cold-start burst from new peers. Returns the raw tuple shape
callers expect; TTL eviction stays at the call site.
"""
with _peer_metadata_lock:
entry = _peer_metadata.get(canon)
if entry is not None:
_peer_metadata.move_to_end(canon)
return entry
def _peer_metadata_set(canon: str, value: tuple[float, dict | None]) -> None:
"""Write + evict-if-over-maxsize. The eviction is in-process and
cheap (popitem(last=False) on an OrderedDict is O(1)). Holding the
lock across the trim keeps the size invariant stable under concurrent
writes from background enrichment workers.
"""
with _peer_metadata_lock:
_peer_metadata[canon] = value
_peer_metadata.move_to_end(canon)
# Trim the oldest entries until at-or-below maxsize. The bound
# is a soft cap — a single overrun (set called when at maxsize)
# evicts the LRU entry before returning, never letting size
# exceed maxsize.
while len(_peer_metadata) > _PEER_METADATA_MAXSIZE:
_peer_metadata.popitem(last=False)
# Background-fetch executor for enrich_peer_metadata_nonblocking (#2484).
# A small pool — peers are highly TTL-cached, so the steady-state load
# is "one fetch per peer per 5 minutes." Two workers handle the cold-
# start burst when an agent starts receiving pushes from a new peer for
# the first time without backing up the inbox poller. Daemon threads:
# the executor must NOT block process exit if the inbox shuts down.
_enrich_executor: ThreadPoolExecutor | None = None
_enrich_executor_lock = threading.Lock()
# In-flight peer IDs — guards against a single peer's repeated pushes
# scheduling N concurrent registry fetches before the first one fills
# the cache. Set membership is "a worker is currently fetching this
# peer; subsequent calls should NOT schedule another."
_enrich_in_flight: set[str] = set()
_enrich_in_flight_lock = threading.Lock()
def _get_enrich_executor() -> ThreadPoolExecutor:
"""Lazy-init the enrichment worker pool. Lazy because most test
fixtures and short-lived CLI invocations don't need it; only the
long-running molecule-mcp / inbox-poller path actually schedules
background fetches.
"""
global _enrich_executor
if _enrich_executor is not None:
return _enrich_executor
with _enrich_executor_lock:
if _enrich_executor is None:
_enrich_executor = ThreadPoolExecutor(
max_workers=2,
thread_name_prefix="enrich-peer",
)
return _enrich_executor
def enrich_peer_metadata_nonblocking(
peer_id: str,
source_workspace_id: str | None = None,
) -> dict | None:
"""Cache-first variant of ``enrich_peer_metadata`` — returns
immediately without blocking on a registry GET.
Behavior:
- Cache hit (fresh): return the cached record.
- Cache miss or TTL expired: schedule a background fetch via the
worker pool, return ``None`` (caller renders bare peer_id).
The next push for this peer hits the warm cache and gets the
full record.
Why this exists (#2484): the inbox poller's notification callback
in molecule-mcp called the synchronous ``enrich_peer_metadata`` on
every push, blocking the poller for up to 2s × N uncached peers
per batch. Push-delivery latency was gated on registry latency
the exact thing the negative-cache patch in PR #2471 was supposed
to avoid amplifying. Moving the fetch off the poller thread means
push delivery is bounded by the inbox poll interval, never by
registry RTT.
Trade-off: the FIRST push from a new peer arrives metadata-light
(no name/role). The MCP host renders the bare peer_id. Subsequent
pushes (within the 5-min TTL) hit the warm cache and get the full
record. Acceptable because:
- Channel-envelope enrichment is a UX nicety, not a correctness
invariant.
- The cold-cache window per peer is bounded to one push.
- The TTL is long enough that an active conversation never
re-enters the cold state.
"""
canon = _validate_peer_id(peer_id)
if canon is None:
return None
# Cache hit (fresh): return without blocking on a registry GET.
# This is the hot path for active peer conversations — avoids
# spawning a background thread for every push from a known peer.
current = time.monotonic()
cached = _peer_metadata_get(canon)
if cached is not None:
fetched_at, record = cached
if current - fetched_at < _PEER_METADATA_TTL_SECONDS:
return record
# Cache miss or TTL expired: schedule background fetch unless one is
# already in flight for this peer. The in-flight set keeps a flurry
# of pushes from one peer (e.g., a chatty agent) from spawning N
# parallel GETs.
with _enrich_in_flight_lock:
if canon in _enrich_in_flight:
return None
_enrich_in_flight.add(canon)
try:
_get_enrich_executor().submit(
_enrich_peer_metadata_worker, canon, source_workspace_id
)
except RuntimeError:
# Executor was shut down (process exit path) — drop the request,
# let the caller render bare peer_id.
with _enrich_in_flight_lock:
_enrich_in_flight.discard(canon)
return None
def _enrich_peer_metadata_worker(
canon: str, source_workspace_id: str | None
) -> None:
"""Background-thread body for ``enrich_peer_metadata_nonblocking``.
Runs the same fetch logic as the synchronous helper but discards
the return value the cache write is the only output anyone
needs. Always clears the in-flight marker so a future cache miss
can retry.
"""
try:
enrich_peer_metadata(canon, source_workspace_id)
except Exception as exc: # noqa: BLE001
# Background workers must not crash the executor — log and
# move on. The negative-cache path inside enrich_peer_metadata
# already records failures, so a re-attempt is rate-limited
# by TTL.
logger.debug("_enrich_peer_metadata_worker: %s failed: %s", canon, exc)
finally:
with _enrich_in_flight_lock:
_enrich_in_flight.discard(canon)
def _wait_for_enrichment_inflight_for_testing(timeout: float = 2.0) -> None:
"""Block until all in-flight enrichment workers have completed.
Test-only helper. Production code never has a reason to wait the
point of the nonblocking path is that callers don't care when the
cache fills. Tests that want to assert "after the worker runs, the
cache has the record" use this to synchronise without sleeping.
Polls ``_enrich_in_flight`` rather than holding a Condition because
the worker pool is already serializing through ``_enrich_in_flight_lock``;
poll keeps the production hot path lock-free.
"""
deadline = time.monotonic() + timeout
while time.monotonic() < deadline:
with _enrich_in_flight_lock:
if not _enrich_in_flight:
return
time.sleep(0.01)
def _peer_in_flight_clear_for_testing() -> None:
"""Clear the in-flight enrichment set. Test-only helper."""
with _enrich_in_flight_lock:
_enrich_in_flight.clear()
def enrich_peer_metadata(
peer_id: str,
source_workspace_id: str | None = None,
*,
now: float | None = None,
) -> dict | None:
"""Return cached or freshly-fetched metadata for ``peer_id``.
Sync helper safe to call from the inbox poller's notification
callback thread (which is not async). Hits the in-process cache
first; on miss or TTL expiry, GETs ``/registry/discover/<peer_id>``
synchronously with a tight timeout. Returns None on validation
failure, network failure, or non-200 response so callers can
degrade gracefully (the channel envelope falls back to the raw
``peer_id`` instead of crashing the push path).
Negative caching: failure outcomes (4xx/5xx/non-JSON/network
exception) are stored as ``(now, None)`` and treated as
fresh-but-empty for the TTL window. Without this, a peer with a
flaky/missing registry record would re-fire the 2s-bounded GET on
EVERY push turning the cache into a no-op for the exact failure
scenarios it most needs to defend against.
The fetched dict is stored as-is, so callers can read whatever
fields the platform exposes (currently: ``id``, ``name``, ``role``,
``status``, ``url``). New fields surface automatically without a
code change here.
"""
canon = _validate_peer_id(peer_id)
if canon is None:
return None
current = now if now is not None else time.monotonic()
cached = _peer_metadata_get(canon)
if cached is not None:
fetched_at, record = cached
if current - fetched_at < _PEER_METADATA_TTL_SECONDS:
# Fresh entry — return whatever's there. ``None`` is the
# negative-cache sentinel: caller treats absence of fields
# the same as a registry miss, which is the desired UX.
return record
src = (source_workspace_id or "").strip() or WORKSPACE_ID
url = f"{PLATFORM_URL}/registry/discover/{canon}"
try:
with httpx.Client(timeout=2.0) as client:
resp = client.get(url, headers={"X-Workspace-ID": src, **auth_headers(src)})
except Exception as exc: # noqa: BLE001
logger.debug("enrich_peer_metadata: GET %s failed: %s", url, exc)
_peer_metadata_set(canon, (current, None))
return None
if resp.status_code != 200:
logger.debug(
"enrich_peer_metadata: %s returned HTTP %d", url, resp.status_code
)
_peer_metadata_set(canon, (current, None))
return None
try:
data = resp.json()
except Exception: # noqa: BLE001
_peer_metadata_set(canon, (current, None))
return None
if not isinstance(data, dict):
_peer_metadata_set(canon, (current, None))
return None
_peer_metadata_set(canon, (current, data))
if name := data.get("name"):
_peer_names[canon] = name
return data
def _agent_card_url_for(peer_id: str) -> str:
"""Construct the platform-side agent-card URL for ``peer_id``.
Returns the empty string when ``peer_id`` is not a UUID same
trust-boundary rationale as ``discover_peer``: never interpolate
path-traversal characters into a URL. An invalid id reflected back
to the receiving agent as ``/registry/discover/../../foo`` is a
foothold we close at construction time.
Uses the registry's discovery path so the agent receiving a push
can hit a single endpoint to enumerate the sender's capabilities
+ role + URL. Same shape every workspace exposes regardless of
runtime claude-code, hermes, langchain wrappers all register
through ``/registry/register`` and surface through ``/registry/discover``.
"""
safe_id = _validate_peer_id(peer_id)
if safe_id is None:
return ""
return f"{PLATFORM_URL}/registry/discover/{safe_id}"
# Sentinel prefix for errors originating from send_a2a_message / child agents.
# Used by delegate_task to distinguish real errors from normal response text.
_A2A_ERROR_PREFIX = "[A2A_ERROR] "
# Sentinel prefix for queued-for-poll-mode-peer outcomes (#2967).
# When the target workspace is registered as delivery_mode=poll (no
# public URL — typical for external molecule-mcp standalone runtimes),
# the platform's a2a_proxy.go:402 short-circuit returns a synthetic
# {"status":"queued","delivery_mode":"poll","method":"..."} envelope
# instead of dispatching over HTTP. The message IS delivered (written
# to the platform's inbox queue); there's just no synchronous reply
# to relay. Pre-#2967 the client treated this as "unexpected response
# shape" → caller saw DELEGATION FAILED → retried → recipient saw
# duplicates. The Queued prefix lets callers branch on this outcome
# explicitly: "delivered async, no synchronous reply expected" is
# different from both success-with-text and failure.
_A2A_QUEUED_PREFIX = "[A2A_QUEUED] "
# Workspace IDs are UUIDs everywhere we generate them (platform's
# workspaces.id column, /registry/discover/:id route param, etc.) but
# the agent-facing tool surface receives them as free-form strings via
# tool args. ``_validate_peer_id`` enforces UUID-shape at the
# trust boundary so we never interpolate `..` or `/` into a URL path,
# never silently coerce malformed input into a 404, and surface a
# clear error to the agent rather than letting an HTTP 4xx bubble up
# from the platform with a generic error message.
#
# Lenient on case + whitespace because real-world peer-id strings
# come from list_peers/discover_peer responses (canonical lowercase)
# or hand-typed agent input (mixed-case acceptable). Strict on
# everything else.
_UUID_RE = re.compile(
r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
def _validate_peer_id(peer_id: str) -> str | None:
"""Return the canonicalised peer_id if valid, else None.
Returning None instead of raising so callers in tool surfaces can
convert to a friendly agent-facing string ("workspace_id is not a
valid UUID") rather than crashing with a stack trace.
"""
if not isinstance(peer_id, str):
return None
pid = peer_id.strip()
if not _UUID_RE.match(pid):
return None
return pid.lower()
async def discover_peer(target_id: str, source_workspace_id: str | None = None) -> dict | None:
"""Discover a peer workspace's URL via the platform registry.
Validates ``target_id`` is a UUID before constructing the URL a
malformed id can't reach the platform handler now, which both
short-circuits an avoidable round-trip AND ensures we never
interpolate path-traversal characters into the URL.
``source_workspace_id`` selects which registered workspace asks the
question both the X-Workspace-ID header AND the Authorization
bearer token must come from the same workspace, otherwise the
platform's TenantGuard rejects the request. Defaults to the
module-level WORKSPACE_ID for back-compat with single-workspace
callers.
"""
safe_id = _validate_peer_id(target_id)
if safe_id is None:
return None
src = (source_workspace_id or "").strip() or WORKSPACE_ID
async with httpx.AsyncClient(timeout=10.0) as client:
try:
resp = await client.get(
f"{PLATFORM_URL}/registry/discover/{safe_id}",
headers={"X-Workspace-ID": src, **auth_headers(src)},
)
if resp.status_code == 200:
return resp.json()
return None
except Exception as e:
logger.error(f"Discovery failed for {target_id}: {e}")
return None
# httpx exception classes that indicate a transient transport-layer
# failure worth retrying — the request never produced an application
# response, so a fresh attempt has a real chance of succeeding. Any
# error not in this tuple is treated as deterministic (HTTP-status,
# JSON parse, runtime-returned JSON-RPC error, etc.) and surfaced to
# the caller on the first try.
#
# Why each one belongs here:
# - ConnectError / ConnectTimeout: peer's listening socket wasn't
# ready (mid-restart, not yet bound). Fast failure, fast recovery.
# - RemoteProtocolError: peer closed the TCP connection without
# writing a response — observed on 2026-04-27 when a peer's prior
# in-flight Claude SDK session aborted and the new request's
# connection was reset mid-handler.
# - ReadError / WriteError: TCP read/write socket error mid-flight,
# typically a network blip on the Docker bridge or a peer worker
# crash.
# - ReadTimeout: peer didn't write ANY response bytes within the
# 300s read budget. Distinct from "peer is slow but progressing"
# (which httpx surfaces as a successful read with chunked bytes).
# Retry budget caps the worst case — see _DELEGATE_TOTAL_BUDGET_S.
_TRANSIENT_HTTP_ERRORS: tuple[type[Exception], ...] = (
httpx.ConnectError,
httpx.ConnectTimeout,
httpx.ReadError,
httpx.WriteError,
httpx.RemoteProtocolError,
httpx.ReadTimeout,
)
# Retry budget. Up to 5 attempts (1 initial + 4 retries) with
# exponential backoff (1, 2, 4, 8 seconds), each backoff jittered ±25%
# to prevent synchronized retry storms across siblings if a peer flaps.
# _DELEGATE_TOTAL_BUDGET_S caps cumulative wall-clock so a string of
# ReadTimeouts can't make the caller wait 25 minutes — once the
# deadline elapses we stop retrying even if attempts remain. 600s = 10
# minutes is the agreed worst case the caller can tolerate before
# falling back to "peer unavailable" handling in tool_delegate_task.
_DELEGATE_MAX_ATTEMPTS = 5
_DELEGATE_BACKOFF_BASE_S = 1.0
_DELEGATE_BACKOFF_CAP_S = 16.0
_DELEGATE_TOTAL_BUDGET_S = 600.0
def _delegate_backoff_seconds(attempt_zero_indexed: int) -> float:
"""Return the (jittered) backoff delay before retrying after the
given attempt index (0 = backoff before retry #1).
Pure function so the schedule is unit-testable without monkey-
patching asyncio.sleep. Jitter is symmetric ±25% on top of the
capped exponential enough to break sync across simultaneous
callers without making the schedule unpredictable.
"""
base = min(_DELEGATE_BACKOFF_BASE_S * (2 ** attempt_zero_indexed), _DELEGATE_BACKOFF_CAP_S)
jitter = base * (0.5 * random.random() - 0.25)
return max(0.0, base + jitter)
def _format_a2a_error(exc: BaseException, target_url: str) -> str:
"""Format an httpx exception as an [A2A_ERROR] string.
Some httpx exceptions stringify to empty (RemoteProtocolError,
ConnectionReset variants) the canvas would then render
"[A2A_ERROR] " with no detail and the operator has no signal to
act on. Always include the exception class name and the target
URL so the activity log + Agent Comms panel have actionable
information without a trip through container logs.
"""
msg = str(exc).strip()
type_name = type(exc).__name__
if not msg:
detail = f"{type_name} (no message — likely connection reset or silent timeout)"
elif msg.startswith(f"{type_name}:") or msg.startswith(f"{type_name} "):
# Already prefixed with the type — don't double-prefix.
# Prefix-anchored check (not substring) so a message that
# happens to mention some OTHER class name mid-string
# (e.g. "got OSError on read") doesn't suppress our own
# type prefix and lose the diagnostic signal.
detail = msg
else:
detail = f"{type_name}: {msg}"
return f"{_A2A_ERROR_PREFIX}{detail} [target={target_url}]"
async def send_a2a_message(peer_id: str, message: str, source_workspace_id: str | None = None) -> str:
"""Send an A2A ``message/send`` to a peer workspace via the platform proxy.
The target URL is constructed internally as
``${PLATFORM_URL}/workspaces/{peer_id}/a2a``. Going through the
platform's A2A proxy is the only path that works for both
in-container and external runtimes see
a2a_tools.tool_delegate_task for the rationale.
``source_workspace_id`` is the SENDING workspace drives both the
X-Workspace-ID source-tagging header and the bearer token. Defaults
to the module-level WORKSPACE_ID for back-compat. Multi-workspace
operators pass it explicitly so each registered workspace's peers
are reached via their own auth chain.
Auto-retries up to _DELEGATE_MAX_ATTEMPTS times on transient
transport-layer errors (RemoteProtocolError, ConnectError,
ReadTimeout, etc.) with exponential-backoff + jitter, capped by
_DELEGATE_TOTAL_BUDGET_S. Application-level failures (HTTP 4xx,
JSON-RPC error response, malformed JSON) are NOT retried they
indicate a deterministic problem retry won't fix.
"""
safe_id = _validate_peer_id(peer_id)
if safe_id is None:
return f"{_A2A_ERROR_PREFIX}invalid peer_id (expected UUID): {peer_id!r}"
src = (source_workspace_id or "").strip() or WORKSPACE_ID
target_url = f"{PLATFORM_URL}/workspaces/{safe_id}/a2a"
# Fix F (Cycle 5 / H2 — flagged 5 consecutive audits): timeout=None allowed
# a hung upstream to block the agent indefinitely. Use a generous but bounded
# timeout: 30s connect + 300s read (long enough for slow LLM responses).
timeout_cfg = httpx.Timeout(connect=30.0, read=300.0, write=30.0, pool=30.0)
deadline = time.monotonic() + _DELEGATE_TOTAL_BUDGET_S
last_exc: BaseException | None = None
for attempt in range(_DELEGATE_MAX_ATTEMPTS):
async with httpx.AsyncClient(timeout=timeout_cfg) as client:
try:
# self_source_headers() includes X-Workspace-ID so the
# platform's a2a_receive logger records source_id =
# WORKSPACE_ID. Otherwise peer-A2A messages — including
# the case where target_url resolves to this workspace's
# own /a2a — get logged with source_id=NULL and surface
# in the recipient's My Chat tab as user-typed input.
resp = await client.post(
target_url,
headers=self_source_headers(src),
json={
"jsonrpc": "2.0",
"id": str(uuid.uuid4()),
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": str(uuid.uuid4()),
"parts": [{"kind": "text", "text": message}],
}
},
},
)
data = resp.json()
# Dispatch via the SSOT response model (a2a_response.py).
# All shape detection lives in one place — the parser
# never raises and routes unknown shapes to Malformed
# so a future server-side change is loud, not silent.
variant = a2a_response.parse(data)
if isinstance(variant, a2a_response.Result):
# Match legacy semantics:
# parts non-empty + first part has no text → ""
# parts empty → "(no response)"
# Differentiation matters for callers that assert
# on the empty-string case (test_a2a_client).
if variant.parts:
text = variant.text
else:
text = "(no response)"
# Tag child-reported errors so the caller can
# detect them reliably — agent-side bug surfaces
# text like "Agent error: <traceback>" inside a
# JSON-RPC success envelope.
if text.startswith("Agent error:"):
return f"{_A2A_ERROR_PREFIX}{text}"
return text
if isinstance(variant, a2a_response.Queued):
# Poll-mode peer — message accepted into the inbox
# queue, target agent will fetch via poll. NOT a
# failure. Return the queued sentinel so callers
# (delegate_task etc.) can render the outcome
# accurately instead of treating it as an error.
logger.info(
"send_a2a_message: queued for poll-mode peer (target=%s method=%s)",
target_url,
variant.method,
)
return f"{_A2A_QUEUED_PREFIX}target={safe_id} method={variant.method}"
if isinstance(variant, a2a_response.Error):
msg = variant.message
code = variant.code
if msg and code is not None:
detail = f"{msg} (code={code})"
elif msg:
detail = msg
elif code is not None:
detail = f"JSON-RPC error with no message (code={code})"
else:
detail = "JSON-RPC error with no message"
if variant.restarting:
# Surface platform-restart-in-progress
# explicitly — caller (UI / delegating agent)
# can render a softer "agent is restarting"
# message rather than a generic failure.
retry = (
f", retry_after={variant.retry_after}s"
if variant.retry_after is not None
else ""
)
detail = f"{detail} (restarting{retry})"
return f"{_A2A_ERROR_PREFIX}{detail} [target={target_url}]"
# Malformed — log loud + surface as error so the
# operator notices a server change. SSOT refactor
# subsumes the inline "queued" check that landed in
# the #2972 hotfix; that branch is now the typed
# Queued variant above.
logger.warning(
"send_a2a_message: malformed response (target=%s body=%.200s)",
target_url,
str(variant.raw),
)
return (
f"{_A2A_ERROR_PREFIX}unexpected response shape "
f"(no result, error, or queued envelope): "
f"{str(variant.raw)[:200]} [target={target_url}]"
)
except _TRANSIENT_HTTP_ERRORS as e:
last_exc = e
attempts_remaining = _DELEGATE_MAX_ATTEMPTS - (attempt + 1)
if attempts_remaining <= 0 or time.monotonic() >= deadline:
# Out of attempts OR out of total budget — surface
# the last error to the caller.
break
delay = _delegate_backoff_seconds(attempt)
# Don't sleep past the deadline — clamp.
remaining = deadline - time.monotonic()
if delay > remaining:
delay = max(0.0, remaining)
logger.warning(
"send_a2a_message: transient %s on attempt %d/%d, retrying in %.1fs (target=%s)",
type(e).__name__,
attempt + 1,
_DELEGATE_MAX_ATTEMPTS,
delay,
target_url,
)
await asyncio.sleep(delay)
continue
except Exception as e:
# Non-transient (HTTP-status, JSON parse, etc.) — don't retry.
return _format_a2a_error(e, target_url)
# Retries exhausted (or budget elapsed). last_exc must be set
# because we only break out of the loop after assigning it.
assert last_exc is not None # noqa: S101
return _format_a2a_error(last_exc, target_url)
async def get_peers_with_diagnostic(source_workspace_id: str | None = None) -> tuple[list[dict], str | None]:
"""Get this workspace's peers, returning (peers, diagnostic).
diagnostic is None when the call succeeded (status 200, even if the list
is empty). When peers is [] for a non-trivial reason (auth failure,
workspace-id missing from registry, platform error, network error),
diagnostic is a short human-readable string explaining what went wrong
so callers can surface it instead of "may be isolated" see #2397.
``source_workspace_id`` selects which registered workspace's peers to
enumerate; defaults to the module-level WORKSPACE_ID for
single-workspace back-compat. Multi-workspace operators iterate over
each registered workspace separately so each set of peers is fetched
with the correct auth.
The legacy get_peers() shim below preserves the bare-list contract for
non-tool callers.
"""
src = (source_workspace_id or "").strip() or WORKSPACE_ID
url = f"{PLATFORM_URL}/registry/{src}/peers"
async with httpx.AsyncClient(timeout=10.0) as client:
try:
resp = await client.get(
url,
headers={"X-Workspace-ID": src, **auth_headers(src)},
)
except Exception as e:
return [], f"Cannot reach platform at {PLATFORM_URL}: {e}"
if resp.status_code == 200:
try:
data = resp.json()
except Exception as e:
return [], f"Platform returned 200 but body was not JSON: {e}"
if not isinstance(data, list):
return [], f"Platform returned 200 but body was not a list: {type(data).__name__}"
return data, None
if resp.status_code in (401, 403):
return [], (
f"Authentication to platform failed (HTTP {resp.status_code}). "
"The workspace bearer token may be invalid — restarting the workspace usually re-mints it."
)
if resp.status_code == 404:
return [], (
f"Workspace ID {WORKSPACE_ID} is not registered with the platform (HTTP 404). "
"Re-registration via the platform's /registry/register endpoint is needed."
)
if 500 <= resp.status_code < 600:
return [], f"Platform error: HTTP {resp.status_code}."
return [], f"Unexpected platform response: HTTP {resp.status_code}."
async def get_peers() -> list[dict]:
"""Get this workspace's peers from the platform registry.
Bare-list shim over get_peers_with_diagnostic() discards the diagnostic
so callers that don't care about the failure reason (e.g. system-prompt
bootstrap formatters) get the same shape they always had.
"""
peers, _ = await get_peers_with_diagnostic()
return peers
async def get_workspace_info(source_workspace_id: str | None = None) -> dict:
"""Get this workspace's info from the platform.
``source_workspace_id`` selects which registered workspace to
introspect when the agent is registered into multiple workspaces
(multi-workspace mode). Unset defaults to the module-level
WORKSPACE_ID single-workspace operators see no behaviour change.
Distinguishes three failure shapes so callers can handle them
distinctly (#2429):
- 410 Gone workspace was deleted; re-onboard required
- 404 / other workspace never existed (or transient)
- exception network / auth failure
"""
src = source_workspace_id or WORKSPACE_ID
async with httpx.AsyncClient(timeout=10.0) as client:
try:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}",
headers=auth_headers(src),
)
if resp.status_code == 200:
return resp.json()
if resp.status_code == 410:
# #2429: platform returns 410 when status='removed'.
# Surface "removed" + the actionable hint so callers
# can prompt re-onboard instead of falling through to
# "not found" — which made the 2026-04-30 incident
# impossible to diagnose ("workspace not found" with
# a workspace_id we KNEW we'd just registered).
try:
body = resp.json()
except Exception:
body = {}
return {
"error": "removed",
"id": body.get("id", src),
"removed_at": body.get("removed_at"),
"hint": body.get(
"hint",
"Workspace was deleted on the platform. "
"Regenerate workspace + token from the canvas → Tokens tab.",
),
}
return {"error": "not found"}
except Exception as e:
return {"error": str(e)}
-567
View File
@@ -1,567 +0,0 @@
"""Bridge between LangGraph agent and A2A protocol, with SSE streaming support.
SSE streaming architecture
--------------------------
The A2A SDK (``DefaultRequestHandler`` + ``EventQueue``) owns the SSE transport
layer. This executor's job is to push the right event types into the queue as
work progresses:
1. ``TaskStatusUpdateEvent(state=working)`` immediately signals start
2. ``TaskArtifactUpdateEvent(chunk, append=)`` one per LLM text token
3. ``Message(final_text)`` terminal event
Client compatibility
--------------------
*Non-streaming* (``message/send``):
``ResultAggregator.consume_all()`` processes status/artifact events
(updating the task in the store) and returns the final ``Message``
immediately backward-compatible with ``a2a_client.py`` which reads
``data["result"]["parts"][0]["text"]``.
*Streaming* (``message/stream``):
``consume_and_emit()`` yields every event above as SSE, letting the client
render tokens in real time.
LangGraph integration
---------------------
Uses ``agent.astream_events(version="v2")`` to receive ``on_chat_model_stream``
events with ``AIMessageChunk`` payloads. Text is extracted from both plain
strings (OpenAI / Groq) and Anthropic-style content-block lists. Non-text
content (tool_use, etc.) is silently skipped. A fresh ``artifact_id`` is
generated for each new LLM ``run_id`` so tool-call cycles are grouped cleanly.
"""
import functools
import logging
import os
import uuid
from a2a.server.agent_execution import AgentExecutor, RequestContext
from a2a.server.events import EventQueue
from a2a.server.tasks import TaskUpdater
from a2a.types import Part
# KI-009: a2a-sdk v1 renames a2a.utils → a2a.helpers; TextPart removed (Part takes text= directly)
from a2a.helpers import new_text_message
from shared_runtime import (
extract_history as _extract_history,
extract_message_text,
brief_task,
set_current_task,
)
from executor_helpers import (
collect_outbound_files,
extract_attached_files,
read_delegation_results,
sanitize_agent_error,
)
from builtin_tools.telemetry import (
A2A_TASK_ID,
GEN_AI_OPERATION_NAME,
GEN_AI_REQUEST_MODEL,
GEN_AI_SYSTEM,
WORKSPACE_ID_ATTR,
_incoming_trace_context,
gen_ai_system_from_model,
get_tracer,
record_llm_token_usage,
)
logger = logging.getLogger(__name__)
_WORKSPACE_ID = os.environ.get("WORKSPACE_ID", "unknown")
# LangGraph ReAct cycle budget per turn. Library default is 25; 500 covers
# PM fan-outs (plan → 6 delegations → 6 awaits → 6 results → synthesize ≈
# 30+ steps even before retries). Overridable via LANGGRAPH_RECURSION_LIMIT.
DEFAULT_RECURSION_LIMIT = 500
def _parse_recursion_limit() -> int:
"""Read LANGGRAPH_RECURSION_LIMIT; fall back to DEFAULT_RECURSION_LIMIT
with a WARNING log on any unparseable or non-positive value."""
raw = os.environ.get("LANGGRAPH_RECURSION_LIMIT", "")
if not raw:
return DEFAULT_RECURSION_LIMIT
try:
n = int(raw)
except ValueError:
logger.warning(
"LANGGRAPH_RECURSION_LIMIT=%r is not an integer; using default %d",
raw, DEFAULT_RECURSION_LIMIT,
)
return DEFAULT_RECURSION_LIMIT
if n <= 0:
logger.warning(
"LANGGRAPH_RECURSION_LIMIT=%d is not positive; using default %d",
n, DEFAULT_RECURSION_LIMIT,
)
return DEFAULT_RECURSION_LIMIT
return n
# ---------------------------------------------------------------------------
# Compliance (OWASP Top 10 for Agentic Apps) — optional, lazy-loaded
# ---------------------------------------------------------------------------
try:
from builtin_tools.compliance import (
AgencyTracker,
ExcessiveAgencyError,
PromptInjectionError,
redact_pii as _redact_pii,
sanitize_input as _sanitize_input,
)
_COMPLIANCE_AVAILABLE = True
except ImportError: # pragma: no cover
_COMPLIANCE_AVAILABLE = False
@functools.lru_cache(maxsize=1)
def _get_compliance_cfg():
"""Return ComplianceConfig or None (cached for process lifetime)."""
try:
from config import load_config
return load_config().compliance
except Exception:
return None
def _extract_chunk_text(content) -> list[str]:
"""Extract text strings from an LLM streaming chunk's content field.
Handles both provider content styles:
- OpenAI / Groq: ``content`` is a plain ``str`` (empty for tool-call chunks).
- Anthropic: ``content`` is a list of typed blocks, e.g.
``[{"type": "text", "text": "Hello"}, {"type": "tool_use", ...}]``
Only ``"text"`` blocks are returned; ``tool_use``, ``tool_result``, and
other non-text blocks are filtered out so raw tool JSON never appears in
the SSE stream.
Args:
content: ``chunk.content`` value from an ``on_chat_model_stream`` event.
Returns:
List of non-empty text strings.
"""
if isinstance(content, str):
return [content] if content else []
if isinstance(content, list):
texts: list[str] = []
for block in content:
if isinstance(block, dict) and block.get("type") == "text":
text = block.get("text", "")
if text:
texts.append(text)
elif isinstance(block, str) and block:
texts.append(block)
return texts
return []
class LangGraphA2AExecutor(AgentExecutor):
"""Bridges LangGraph agent to A2A event model with SSE streaming support.
Always uses ``agent.astream_events()`` so that:
- Streaming clients (``message/stream``) receive token-level SSE events.
- Non-streaming clients (``message/send``) receive the final ``Message``
collected from the same stream no duplicate LLM call, full compat.
"""
def __init__(self, agent, heartbeat=None, model: str = "unknown"):
self.agent = agent # Compiled LangGraph graph (create_react_agent output)
self._heartbeat = heartbeat
self._model = model # e.g. "anthropic:claude-sonnet-4-6"
async def execute(self, context: RequestContext, event_queue: EventQueue) -> None:
"""Execute a task from an A2A request with SSE streaming.
Routes through the Temporal durable workflow when a global
``TemporalWorkflowWrapper`` is initialised and connected to Temporal;
otherwise falls back to ``_core_execute()`` (direct path).
Event emission sequence:
1. TaskStatusUpdateEvent(working) immediate start signal
2. TaskArtifactUpdateEvent chunks token-by-token via astream_events
3. Message(final_text) terminal; non-streaming clients
return on this; streaming clients
also receive it as the last SSE event.
"""
# ── Optional Temporal durable execution wrapper ──────────────────────
# When a TemporalWorkflowWrapper is active this routes execution through
# a MoleculeAIAgentWorkflow (task_receive → llm_call → task_complete).
# Falls back silently to _core_execute() on any error or if Temporal
# is unavailable, so the client always receives a response.
try:
from builtin_tools.temporal_workflow import get_wrapper as _get_temporal_wrapper
_tw = _get_temporal_wrapper()
if _tw is not None and _tw.is_available():
return await _tw.run(self, context, event_queue)
except Exception:
pass # Never let the wrapper path crash the executor
await self._core_execute(context, event_queue)
async def _core_execute(self, context: RequestContext, event_queue: EventQueue) -> str:
"""Core execution pipeline — called directly or from a Temporal activity.
This is the original ``execute()`` body, extracted so that the Temporal
``llm_call`` activity can invoke it without re-entering the wrapper
check and causing infinite recursion.
Returns the final response text (empty string on empty input or error).
Event emission sequence:
1. TaskStatusUpdateEvent(working) immediate start signal
2. TaskArtifactUpdateEvent chunks token-by-token via astream_events
3. Message(final_text) terminal event
"""
user_input = extract_message_text(context)
# Inject delegation results from prior turns. Heartbeat writes
# completed delegation rows to DELEGATION_RESULTS_FILE and sends
# a self-message to wake the agent; this consumes the file and
# surfaces the results as context so the agent can act on them
# without needing an explicit check_task_status call.
# Results are prepended so they are visible even when the
# self-message text is overwritten by a subsequent user message.
pending_results = read_delegation_results()
if pending_results:
logger.info("A2A execute: injecting %d delegation result(s)", pending_results.count("\n") + 1)
user_input = f"[Delegation results available]\n{pending_results}\n\n{user_input}"
# Pull attached files from A2A message parts (kind: "file") and
# append a manifest to the prompt so the agent knows they exist.
# LangGraph tools (filesystem, bash, skills) can then open the
# files by path — without this the agent silently ignores the
# attachments and replies "I'm not sure what you're referring to".
_attached_files = extract_attached_files(getattr(context, "message", None))
if _attached_files:
_manifest = "\n\nAttached files:\n" + "\n".join(
f"- {f['name']} ({f['mime_type'] or 'unknown type'}) at {f['path']}"
for f in _attached_files
)
user_input = (user_input + _manifest) if user_input else _manifest.lstrip()
if not user_input:
parts = getattr(getattr(context, "message", None), "parts", None)
logger.warning("A2A execute: no text content in message parts: %s", parts)
await event_queue.enqueue_event(
new_text_message("Error: message contained no text content.")
)
return ""
# ── OA-01: Prompt injection check (OWASP Agentic Top 10) ────────────
_compliance_cfg = _get_compliance_cfg() if _COMPLIANCE_AVAILABLE else None
if _COMPLIANCE_AVAILABLE and _compliance_cfg and _compliance_cfg.mode == "owasp_agentic":
try:
user_input = _sanitize_input(
user_input,
prompt_injection_mode=_compliance_cfg.prompt_injection,
context_id=context.context_id or "",
)
except PromptInjectionError as exc:
await event_queue.enqueue_event(
new_text_message(f"Request blocked: {exc}")
)
return ""
logger.info("A2A execute: user_input=%s", user_input[:200])
# ── OTEL: task_receive span ──────────────────────────────────────────
parent_ctx = _incoming_trace_context.get()
tracer = get_tracer()
_result: str = "" # captured inside the span for return after it closes
with tracer.start_as_current_span("task_receive", context=parent_ctx) as task_span:
task_span.set_attribute(WORKSPACE_ID_ATTR, _WORKSPACE_ID)
task_span.set_attribute(A2A_TASK_ID, context.context_id or "")
task_span.set_attribute("a2a.input_preview", user_input[:256])
# Resolve IDs — the RequestContextBuilder always sets them, but
# we generate fallbacks for safety (e.g. in unit tests).
task_id = context.task_id or str(uuid.uuid4())
context_id = context.context_id or str(uuid.uuid4())
# A2A v1 contract (a2a-sdk ≥ 1.0): enqueue a Task event before any
# TaskStatusUpdateEvent. The framework only auto-creates the Task
# on continuation messages (existing task_id resolves via
# task_manager.get_task()). For fresh requests get_task() returns
# None and the SDK rejects the first status update with
# InvalidAgentResponseError("Agent should enqueue Task before
# TaskStatusUpdateEvent event") — see a2a/server/agent_execution/
# active_task.py for the validation site. PR #2170 migrated the
# surface to v1 but missed this contract; the synth-E2E gate
# surfaced it on every run after staging deploy.
if getattr(context, "current_task", None) is None:
from a2a.types import Task, TaskState, TaskStatus
await event_queue.enqueue_event(
Task(
id=task_id,
context_id=context_id,
status=TaskStatus(state=TaskState.TASK_STATE_SUBMITTED),
)
)
updater = TaskUpdater(event_queue, task_id, context_id)
try:
# set_current_task INSIDE the try so active_tasks is always
# decremented by the finally block even if CancelledError hits
# during the heartbeat HTTP push. Moving it outside the try
# created a window where cancellation left active_tasks stuck
# at 1, permanently blocking queue drain. (#2026)
await set_current_task(self._heartbeat, brief_task(user_input))
messages = _extract_history(context)
if messages:
logger.info("A2A execute: injecting %d history messages", len(messages))
messages.append(("human", user_input))
# Recursion limit: see DEFAULT_RECURSION_LIMIT and
# _parse_recursion_limit() at module top. Re-read on every
# call so the env var can be hot-changed between requests.
recursion_limit = _parse_recursion_limit()
run_config = {
"configurable": {"thread_id": context_id},
"run_name": f"a2a-{context_id[:8]}",
"recursion_limit": recursion_limit,
}
# ── OTEL: llm_call span ──────────────────────────────────────
with tracer.start_as_current_span("llm_call") as llm_span:
llm_span.set_attribute(GEN_AI_OPERATION_NAME, "chat")
llm_span.set_attribute(GEN_AI_SYSTEM, gen_ai_system_from_model(self._model))
llm_span.set_attribute(GEN_AI_REQUEST_MODEL, self._model)
llm_span.set_attribute(WORKSPACE_ID_ATTR, _WORKSPACE_ID)
# ── Step 1: signal "working" to streaming clients ─────────
await updater.start_work()
# ── Step 2: stream tokens via LangGraph astream_events ────
# Each "on_chat_model_stream" event carries an AIMessageChunk.
# We emit one TaskArtifactUpdateEvent per text chunk so SSE
# clients can render tokens in real time.
# artifact_id resets on each new LLM run_id so agent→tool→agent
# cycles each get their own artifact slot.
artifact_id = str(uuid.uuid4())
has_streamed = False # True after first chunk for current artifact
current_run_id = None # Detects new LLM call in a ReAct cycle
accumulated: list[str] = [] # All text for the final Message
last_ai_message = None # Saved for token-usage telemetry
# ── OA-03: Excessive agency tracker ──────────────────────
_agency = (
AgencyTracker(
max_tool_calls=_compliance_cfg.max_tool_calls_per_task,
max_duration_seconds=float(_compliance_cfg.max_task_duration_seconds),
)
if _COMPLIANCE_AVAILABLE and _compliance_cfg and _compliance_cfg.mode == "owasp_agentic"
else None
)
# ── Tool trace: collect every tool invocation for
# platform-level observability ────────────────────
# Keyed by run_id so parallel tool calls (LangGraph
# supports them) pair start→end correctly. Capped at
# MAX_TOOL_TRACE entries to prevent runaway loops from
# ballooning the JSONB payload.
MAX_TOOL_TRACE = 200
tool_trace: list[dict] = []
tool_trace_by_run: dict[str, dict] = {}
async for event in self.agent.astream_events(
{"messages": messages},
config=run_config,
version="v2",
):
kind = event.get("event", "")
if kind == "on_chat_model_stream":
run_id = event.get("run_id", "")
if run_id and run_id != current_run_id:
# New LLM run started — fresh artifact slot
current_run_id = run_id
artifact_id = str(uuid.uuid4())
has_streamed = False
chunk = event.get("data", {}).get("chunk")
if chunk is not None:
texts = _extract_chunk_text(chunk.content)
for text in texts:
await updater.add_artifact(
parts=[Part(text=text)], # v1: TextPart removed, Part takes text= directly
artifact_id=artifact_id,
append=has_streamed, # False=first, True=append
last_chunk=False,
)
has_streamed = True
accumulated.append(text)
elif kind == "on_tool_start":
tool_name = event.get("name", "?")
tool_input = event.get("data", {}).get("input", "")
tool_run_id = event.get("run_id", "")
logger.debug("SSE: tool start — %s", tool_name)
if len(tool_trace) < MAX_TOOL_TRACE:
entry = {
"tool": tool_name,
"input": str(tool_input)[:500] if tool_input else "",
}
tool_trace.append(entry)
if tool_run_id:
tool_trace_by_run[tool_run_id] = entry
if _agency is not None:
_agency.on_tool_call(
tool_name=tool_name,
context_id=context_id,
)
elif kind == "on_tool_end":
tool_end_name = event.get("name", "?")
tool_output = event.get("data", {}).get("output", "")
tool_run_id = event.get("run_id", "")
logger.debug("SSE: tool end — %s", tool_end_name)
# Pair via run_id so parallel tool calls don't clobber each other.
entry = tool_trace_by_run.get(tool_run_id) if tool_run_id else None
if entry is not None:
entry["output_preview"] = str(tool_output)[:300] if tool_output else ""
elif kind == "on_chat_model_end":
# Capture the last completed AIMessage for token telemetry
output = event.get("data", {}).get("output")
if output is not None:
last_ai_message = output
# Record token usage from the last completed LLM call
if last_ai_message is not None:
record_llm_token_usage(llm_span, {"messages": [last_ai_message]})
# Build final text from all accumulated streaming tokens
final_text = "".join(accumulated).strip() or "(no response generated)"
logger.info("A2A execute: response length=%d chars", len(final_text))
# ── OA-02 / OA-06: Output PII redaction ──────────────────────
if _COMPLIANCE_AVAILABLE and _compliance_cfg and _compliance_cfg.mode == "owasp_agentic":
final_text, _pii_types = _redact_pii(final_text)
if _pii_types:
from builtin_tools.audit import log_event as _audit_log
_audit_log(
event_type="compliance",
action="pii.redact",
resource="task_output",
outcome="redacted",
pii_types=_pii_types,
context_id=context_id,
)
# ── OTEL: task_complete span ─────────────────────────────────
with tracer.start_as_current_span("task_complete") as done_span:
done_span.set_attribute(WORKSPACE_ID_ATTR, _WORKSPACE_ID)
done_span.set_attribute(A2A_TASK_ID, context_id)
done_span.set_attribute("task.has_response", bool(accumulated))
done_span.set_attribute("task.response_length", len(final_text))
# ── Step 3: emit final Message ────────────────────────────────
# Non-streaming: ResultAggregator.consume_all() returns this
# immediately as the response (a2a_client.py reads .parts[0].text).
# Streaming: yielded as the last SSE event in the stream.
#
# If the reply mentions /workspace/... paths, stage each one
# and emit as FileParts alongside the text so the canvas can
# render a download button. Same contract the hermes executor
# uses — every runtime going through this code path (langgraph,
# deepagents, future ReAct variants) inherits it.
_outbound = collect_outbound_files(final_text)
if _outbound:
# NOTE: do NOT re-import `Part` here. It is already imported
# at module scope (line 42). A function-scope `from a2a.types
# import ... Part ...` would mark `Part` as a local name
# throughout this function under Python's scoping rules,
# making the earlier `Part(text=text)` call (line ~358, inside
# the astream_events loop) raise UnboundLocalError because
# the local binding is not yet in scope at that point.
#
# a2a-sdk 1.x flattened the Part shape: 0.x used
# `Part(root=TextPart(text=...))` / `Part(root=FilePart(file=
# FileWithUri(uri=..., name=..., mimeType=...)))` (Pydantic
# discriminated-union style). 1.x's Part is a single proto
# message with flat fields: text, url, filename, media_type,
# raw, data, metadata. TextPart/FilePart/FileWithUri were
# removed. Same for Message: messageId/taskId/contextId
# camelCase became message_id/task_id/context_id.
from a2a.types import Message, Role
_parts: list[Part] = [Part(text=final_text)] if final_text else []
for f in _outbound:
_parts.append(Part(
url="workspace:" + f["path"],
filename=f["name"],
media_type=f["mime_type"],
))
msg = Message(
message_id=uuid.uuid4().hex,
# 1.x Role is a protobuf enum: ROLE_UNSPECIFIED,
# ROLE_USER, ROLE_AGENT. Old `Role.agent` (Pydantic
# lowercase enum) doesn't exist anymore.
role=Role.ROLE_AGENT,
parts=_parts,
task_id=task_id,
context_id=context_id,
)
else:
msg = new_text_message(final_text, task_id=task_id, context_id=context_id)
# Attach tool_trace via metadata when supported. Guarded with
# hasattr because some test mocks return a plain string here.
if tool_trace and hasattr(msg, "metadata"):
try:
msg.metadata = {"tool_trace": tool_trace}
except (AttributeError, TypeError):
# `new_text_message()` returns a plain string in
# MagicMock paths in tests, where assignment to
# .metadata raises despite hasattr being true (the
# mock has the attribute as a property). Suppression
# is intentional — production Message objects always
# accept the assignment. See #1787 + commit dcbcf19
# for the original test-mock motivation.
logger.debug("metadata attach skipped (non-Message return from new_text_message)")
# A2A v1 (a2a-sdk ≥ 1.0): once Task is enqueued (above, PR #2558),
# the executor is in task mode and raw Message enqueues are
# rejected with InvalidAgentResponseError("Received Message
# object in task mode. Use TaskStatusUpdateEvent or
# TaskArtifactUpdateEvent instead."). updater.complete()
# wraps the Message in a terminal TaskStatusUpdateEvent
# (state=COMPLETED, final=True) which both streaming and
# non-streaming clients accept.
await updater.complete(message=msg)
_result = final_text
except Exception as e:
logger.error("A2A execute error: %s", e, exc_info=True)
try:
task_span.record_exception(e)
from opentelemetry.trace import StatusCode
task_span.set_status(StatusCode.ERROR, str(e))
except Exception:
pass
# A2A v1: in task mode, terminal errors must publish a
# FAILED TaskStatusUpdateEvent (carrying the error Message)
# rather than a raw Message enqueue. updater.failed() does
# exactly this — both streaming and non-streaming clients
# receive the error and stop polling.
await updater.failed(
message=new_text_message(
sanitize_agent_error(exc=e), task_id=task_id, context_id=context_id
)
)
finally:
await set_current_task(self._heartbeat, "")
return _result
async def cancel(self, context: RequestContext, event_queue: EventQueue) -> None:
"""Cancel a running task — emits canceled state to comply with A2A protocol."""
from a2a.types import TaskStatus, TaskState, TaskStatusUpdateEvent
await event_queue.enqueue_event(
TaskStatusUpdateEvent(
status=TaskStatus(state=TaskState.TASK_STATE_CANCELED), # v1: TaskState uses SCREAMING_SNAKE_CASE
final=True,
)
)
File diff suppressed because it is too large Load Diff
-263
View File
@@ -1,263 +0,0 @@
"""Single source of truth for A2A ``/workspaces/<id>/a2a`` response shapes.
The workspace-server proxy at
``workspace-server/internal/handlers/a2a_proxy.go`` (the canonical
emitter) returns one of the following shapes for a single A2A call:
* **JSON-RPC success**
``{"jsonrpc": "2.0", "result": {...}, "id": "..."}``
The agent's reply, passed through unchanged.
* **JSON-RPC error**
``{"jsonrpc": "2.0", "error": {"message": "...", "code": ...}, "id": "..."}``
The agent reported a structured error.
* **Poll-queued** (synthesized at proxy, RFC #2339 PR 2 — see
``a2a_proxy.go:402-406``)
``{"status": "queued", "delivery_mode": "poll", "method": "..."}``
The target is a poll-mode workspace (no public URL); the message
was written to the platform's inbox queue. The target agent will
fetch it via ``GET /activity?since_id=`` polling. NOT a failure
delivery succeeded, there's just no synchronous reply to relay.
* **Platform error** ``{"error": "...", "restarting": true?, "retry_after": int?}``
HTTP-level failure synthesized by the proxy when the agent is
unreachable, the container is restarting, or some other infrastructure
failure happened. ``restarting=true`` flags the platform-initiated
container-restart path.
* **Malformed** anything else. Surfaced explicitly so a future server
change is loud rather than silent.
The ``parse(data)`` function classifies a pre-decoded JSON body into a
typed variant. Callers ``match`` on the variant and never re-implement
shape detection that's the SSOT discipline.
# SSOT contract
This file is the Python half. The Go server emits these shapes today
via inline ``gin.H{...}`` literals. A future PR can introduce a Go
mirror (e.g. ``workspace-server/internal/models/a2a_response.go``)
with a typed marshaller until then, **any change to the wire shape
must be reflected here** and gated by ``test_a2a_response.py``'s
fixture corpus. The corpus exists specifically so a one-sided edit
breaks CI.
# Why a typed model (vs. dict-key sniffing at every site)
The pre-2967 client at ``a2a_client.py:567-587`` sniffed for ``result``
or ``error`` keys inline and treated everything else as malformed
which silently broke poll-mode peers (the queued envelope has neither
key). Inline sniffing per call site multiplies the surface area where
a new shape gets misclassified. A single typed parser with an
explicit ``Malformed`` escape hatch makes shape additions a
one-line change here + a fixture entry in the test corpus, instead of
a hunt through every parsing site in the runtime.
"""
from __future__ import annotations
import dataclasses
import logging
from typing import Any, Optional, Union
logger = logging.getLogger(__name__)
@dataclasses.dataclass(frozen=True)
class Result:
"""JSON-RPC success — agent's reply available synchronously.
``text`` is the convenience extraction from ``parts[0].text`` (the
A2A multipart shape). ``parts`` is the full list, available for
callers that need richer rendering (multiple parts, non-text parts).
``raw_result`` preserves the unparsed ``result`` field for any
caller that needs it (e.g. activity-row response_body audit).
"""
text: str
parts: list[dict[str, Any]] = dataclasses.field(default_factory=list)
raw_result: Optional[dict[str, Any]] = None
@dataclasses.dataclass(frozen=True)
class Error:
"""JSON-RPC error or platform-level error response.
``code`` is the JSON-RPC integer code when present, else None.
``restarting`` / ``retry_after`` are platform-restart-in-progress
metadata: when both are set, the caller knows the container is
being recycled and may surface a softer error to the user.
"""
message: str
code: Optional[int] = None
restarting: bool = False
retry_after: Optional[int] = None
@dataclasses.dataclass(frozen=True)
class Queued:
"""Platform poll-mode short-circuit — message accepted, peer will pick up async.
Returned when the target workspace is registered as
``delivery_mode=poll`` (no public URL typical for external
standalone ``molecule-mcp`` runtimes). The message was written to
the platform's inbox queue; the target agent will fetch it via
``GET /activity?since_id=`` polling.
NOT a failure. Callers that expect a synchronous reply (the agent's
response text) won't get one here — they should either:
* Tolerate the absence of a reply (fire-and-forget semantics).
* Fall back to the durable ``/workspaces/:id/delegate`` +
``/delegations`` polling path (see ``a2a_tools_delegation``'s
``_delegate_sync_via_polling``), which writes the same A2A
request through the platform's executeDelegation goroutine
and lets the caller poll for the result row.
``method`` echoes the request method (``message/send``, ``notify``,
etc.) so callers can correlate.
"""
method: str
delivery_mode: str = "poll"
@dataclasses.dataclass(frozen=True)
class Malformed:
"""Server returned a body the parser can't classify.
Carries the raw decoded payload for diagnostic logging. Callers
typically render this as an error to the user (see
``send_a2a_message``) but the Malformed variant is a separate
type so logging / metrics can distinguish it from genuine
JSON-RPC ``Error`` responses.
"""
raw: Any # whatever the server returned: dict / list / str / number / etc.
Variant = Union[Result, Error, Queued, Malformed]
# Field-name constants — the wire vocabulary. Single source of truth;
# the parser references these by name so a change here is a
# one-line edit instead of a hunt through string literals.
_KEY_RESULT = "result"
_KEY_ERROR = "error"
_KEY_STATUS = "status"
_KEY_DELIVERY_MODE = "delivery_mode"
_KEY_METHOD = "method"
_KEY_RESTARTING = "restarting"
_KEY_RETRY_AFTER = "retry_after"
_STATUS_QUEUED = "queued"
_DELIVERY_MODE_POLL = "poll"
def parse(data: Any) -> Variant:
"""Classify a pre-decoded ``/a2a`` JSON response into a typed variant.
Never raises. Every branch is total: any input that doesn't match a
known shape routes to ``Malformed`` so the caller can decide how
to surface it.
The order of checks matters:
1. Non-dict input Malformed (server contract is dict-shaped).
2. Poll-queued envelope is checked BEFORE result/error because a
server bug that sets both ``status=queued`` and ``result``
should be loud, not silently treated as Result.
3. ``result`` Result (the JSON-RPC success path).
4. ``error`` Error (JSON-RPC error or platform error).
5. Anything else Malformed.
"""
if not isinstance(data, dict):
logger.warning(
"a2a_response.parse: non-dict body — got %s",
type(data).__name__,
)
return Malformed(raw=data)
# Push-mode queue envelope — returned when a push-mode workspace
# (one with a public URL) is at capacity. The platform queues the
# request and returns {"queued": true, "message": "...", "queue_id": "..."}.
# Unlike the poll-mode envelope (status=queued + delivery_mode=poll),
# this shape has no delivery_mode key — it's distinguishable by
# data.get("queued") is True alone. Checked before poll-mode so the
# two cases are mutually exclusive even if a buggy server sends both.
if data.get("queued") is True:
method_raw = data.get(_KEY_METHOD)
method = str(method_raw) if method_raw is not None else "message/send"
logger.info(
"a2a_response.parse: queued for busy push-mode peer (method=%s, queue_id=%s)",
method,
data.get("queue_id", "?"),
)
return Queued(method=method, delivery_mode="push")
# Poll-queued envelope. Both keys must be present — the workspace
# server sets them together; if only one is present the body is
# ambiguous and we route to Malformed for visibility.
if (
data.get(_KEY_STATUS) == _STATUS_QUEUED
and data.get(_KEY_DELIVERY_MODE) == _DELIVERY_MODE_POLL
):
method_raw = data.get(_KEY_METHOD)
method = str(method_raw) if method_raw is not None else "unknown"
logger.info(
"a2a_response.parse: queued for poll-mode peer (method=%s)",
method,
)
return Queued(method=method)
# JSON-RPC success.
if _KEY_RESULT in data:
result = data[_KEY_RESULT]
if isinstance(result, dict):
parts_raw = result.get("parts")
parts = parts_raw if isinstance(parts_raw, list) else []
text = ""
if parts:
first = parts[0]
if isinstance(first, dict):
text_raw = first.get("text")
text = str(text_raw) if text_raw is not None else ""
return Result(text=text, parts=parts, raw_result=result)
# ``result`` present but not a dict — unusual but not an error;
# surface as a Result with the value rendered to text.
return Result(text=str(result), parts=[], raw_result=None)
# JSON-RPC error or platform error.
if _KEY_ERROR in data:
err_raw = data[_KEY_ERROR]
message = ""
code: Optional[int] = None
if isinstance(err_raw, dict):
msg_raw = err_raw.get("message")
if msg_raw is not None:
message = str(msg_raw).strip()
code_raw = err_raw.get("code")
if isinstance(code_raw, int):
code = code_raw
elif isinstance(err_raw, str):
message = err_raw.strip()
else:
message = str(err_raw)
restarting = bool(data.get(_KEY_RESTARTING, False))
retry_after_raw = data.get(_KEY_RETRY_AFTER)
retry_after = retry_after_raw if isinstance(retry_after_raw, int) else None
return Error(
message=message,
code=code,
restarting=restarting,
retry_after=retry_after,
)
logger.warning(
"a2a_response.parse: unrecognized shape — keys=%s",
sorted(data.keys()),
)
return Malformed(raw=data)
-181
View File
@@ -1,181 +0,0 @@
"""A2A MCP tool implementations — the body of each tool handler.
Imports shared client functions and constants from a2a_client.
"""
import hashlib
import json
import mimetypes
import os
import uuid
import httpx
from a2a_client import (
PLATFORM_URL,
WORKSPACE_ID,
_A2A_ERROR_PREFIX,
_peer_names,
_peer_to_source,
discover_peer,
get_peers,
get_peers_with_diagnostic,
get_workspace_info,
send_a2a_message,
)
from builtin_tools.security import _redact_secrets
from platform_auth import list_registered_workspaces
# ---------------------------------------------------------------------------
# RBAC + auth helpers — extracted to a2a_tools_rbac (RFC #2873 iter 4a).
# Re-exported here under the legacy underscore names so existing tests'
# patch("a2a_tools._check_memory_write_permission", …) and call sites
# inside this module that resolve bare names against the module-level
# namespace continue to work unchanged.
# ---------------------------------------------------------------------------
from a2a_tools_rbac import ( # noqa: E402 (import after the from-a2a_client block)
_auth_headers_for_heartbeat,
_check_memory_read_permission,
_check_memory_write_permission,
_get_workspace_tier,
_is_root_workspace,
_ROLE_PERMISSIONS,
)
# Per-field caps on the heartbeat / activity payload. Borrowed from
# hermes-agent's design discipline: cap ONCE in the helper, not at every
# call site, so a future caller adding error_detail can't accidentally
# DoS activity_logs by pasting a 4MB stack trace + base64 image.
#
# Why these specific limits:
# - error_detail (4096): hermes' value. Long enough for a multi-frame
# stack trace, short enough that 100 errors in 5min is < 500KB total.
# - summary (256): summary is a one-liner shown in the canvas card +
# activity row. 256 covers UTF-8 emoji + a sentence.
# - response_text (NOT capped): this is the agent's actual reply
# content. Capping would silently truncate user-visible output.
_MAX_ERROR_DETAIL_CHARS = 4096
_MAX_SUMMARY_CHARS = 256
async def report_activity(
activity_type: str, target_id: str = "", summary: str = "", status: str = "ok",
task_text: str = "", response_text: str = "", error_detail: str = "",
):
"""Report activity to the platform for live progress tracking."""
# Defensive caps in the helper itself so every caller benefits — see
# _MAX_ERROR_DETAIL_CHARS / _MAX_SUMMARY_CHARS comments above.
if error_detail and len(error_detail) > _MAX_ERROR_DETAIL_CHARS:
error_detail = error_detail[:_MAX_ERROR_DETAIL_CHARS]
if summary and len(summary) > _MAX_SUMMARY_CHARS:
summary = summary[:_MAX_SUMMARY_CHARS]
try:
async with httpx.AsyncClient(timeout=5.0) as client:
payload: dict = {
"activity_type": activity_type,
"source_id": WORKSPACE_ID,
"target_id": target_id,
"method": "message/send",
"summary": summary,
"status": status,
}
if task_text:
payload["request_body"] = {"task": task_text}
if response_text:
payload["response_body"] = {"result": response_text}
if error_detail:
# error_detail is a top-level activity row column on the
# platform (handlers/activity.go). Surfacing the cleaned
# exception string here lets the Activity tab render a
# red error chip + the cause without forcing the user
# to scroll into the raw response_body JSON.
payload["error_detail"] = error_detail
await client.post(
f"{PLATFORM_URL}/workspaces/{WORKSPACE_ID}/activity",
json=payload,
headers=_auth_headers_for_heartbeat(),
)
# Also push current_task via heartbeat for canvas card display
if summary:
await client.post(
f"{PLATFORM_URL}/registry/heartbeat",
json={
"workspace_id": WORKSPACE_ID,
"current_task": summary,
"active_tasks": 1,
"error_rate": 0,
"sample_error": "",
"uptime_seconds": 0,
},
headers=_auth_headers_for_heartbeat(),
)
except Exception:
pass # Best-effort — don't block delegation on activity reporting
# Delegation tool handlers — extracted to a2a_tools_delegation
# (RFC #2873 iter 4b). Re-imported here so call sites + tests that
# reference ``a2a_tools.tool_delegate_task`` /
# ``a2a_tools._delegate_sync_via_polling`` keep resolving identically.
from a2a_tools_delegation import ( # noqa: E402 (import after the from-a2a_client block)
_SYNC_POLL_BUDGET_S,
_SYNC_POLL_INTERVAL_S,
_delegate_sync_via_polling,
tool_check_task_status,
tool_delegate_task,
tool_delegate_task_async,
)
# Messaging tool handlers — extracted to a2a_tools_messaging
# (RFC #2873 iter 4d). Re-imported here so call sites + tests that
# reference ``a2a_tools.tool_send_message_to_user`` /
# ``tool_list_peers`` / ``tool_get_workspace_info`` /
# ``tool_chat_history`` / ``_upload_chat_files`` keep resolving
# identically.
from a2a_tools_messaging import ( # noqa: E402 (import after the top-of-module imports)
_upload_chat_files,
tool_broadcast_message,
tool_chat_history,
tool_get_workspace_info,
tool_list_peers,
tool_send_message_to_user,
)
# Memory tool handlers — extracted to a2a_tools_memory (RFC #2873 iter 4c).
# Re-imported here so call sites + tests that reference
# ``a2a_tools.tool_commit_memory`` / ``tool_recall_memory`` keep
# resolving identically.
from a2a_tools_memory import ( # noqa: E402 (import after the top-of-module imports)
tool_commit_memory,
tool_recall_memory,
)
# Inbox tool handlers — extracted to a2a_tools_inbox (RFC #2873 iter 4e).
# Re-imported here so call sites + tests that reference
# ``a2a_tools.tool_inbox_peek`` / ``tool_inbox_pop`` / ``tool_wait_for_message``
# / ``_enrich_inbound_for_agent`` / ``_INBOX_NOT_ENABLED_MSG`` keep
# resolving identically.
from a2a_tools_inbox import ( # noqa: E402 (import after the top-of-module imports)
_INBOX_NOT_ENABLED_MSG,
_enrich_inbound_for_agent,
tool_inbox_peek,
tool_inbox_pop,
tool_wait_for_message,
)
# Identity tool handlers — extracted to a2a_tools_identity. Ports the
# two T4-tier MCP tools (``tool_get_runtime_identity`` +
# ``tool_update_agent_card``) from molecule-ai-workspace-runtime PR#17.
# That repo is mirror-only (reference_runtime_repo_is_mirror_only);
# this is the canonical edit point, and the wheel mirror is
# regenerated by publish-runtime.yml on merge.
from a2a_tools_identity import ( # noqa: E402 (import after the top-of-module imports)
tool_get_runtime_identity,
tool_update_agent_card,
)
-459
View File
@@ -1,459 +0,0 @@
"""Delegation tool handlers — single-concern slice of the a2a_tools surface.
Extracted from ``a2a_tools.py`` (RFC #2873 iter 4b). Owns the three
delegation MCP tools + the RFC #2829 PR-5 sync-via-polling helper they
share.
Public surface:
* ``tool_delegate_task`` synchronous delegation, waits for response.
* ``tool_delegate_task_async`` fire-and-forget delegation; returns
``{delegation_id, ...}``.
* ``tool_check_task_status`` poll the platform's ``/delegations`` log.
Internal:
* ``_delegate_sync_via_polling`` durable async + poll for terminal
status (RFC #2829 PR-5 cutover path; toggled by
``DELEGATION_SYNC_VIA_INBOX=1``).
* ``_SYNC_POLL_INTERVAL_S`` / ``_SYNC_POLL_BUDGET_S`` constants.
Circular-import note: this module calls ``report_activity`` from
``a2a_tools`` to emit activity rows around the delegate dispatch.
``a2a_tools`` imports the public symbols here at module-load time,
so we use a LAZY import for ``report_activity`` inside the function
that needs it. Without the lazy hop Python raises an ImportError
on first ``a2a_tools`` import.
"""
from __future__ import annotations
import hashlib
import json
import logging
import os
import httpx
logger = logging.getLogger(__name__)
from a2a_client import (
PLATFORM_URL,
WORKSPACE_ID,
_A2A_ERROR_PREFIX,
_A2A_QUEUED_PREFIX,
_peer_names,
_peer_to_source,
discover_peer,
send_a2a_message,
)
from a2a_tools_rbac import auth_headers_for_heartbeat as _auth_headers_for_heartbeat
from _sanitize_a2a import (
_A2A_BOUNDARY_END,
_A2A_BOUNDARY_END_ESCAPED,
_A2A_BOUNDARY_START,
_A2A_BOUNDARY_START_ESCAPED,
sanitize_a2a_result,
) # noqa: E402
# RFC #2829 PR-5 cutover constants. The poll cadence + timeout are
# intentionally generous: 3s gives the platform's executeDelegation
# goroutine room to dispatch + the callee to respond + the result to
# write to activity_logs without thrashing the platform with rapid
# polls; the budget matches the legacy DELEGATION_TIMEOUT (300s) so
# operators don't see behavior change beyond "no more 600s timeouts".
_SYNC_POLL_INTERVAL_S = 3.0
_SYNC_POLL_BUDGET_S = float(os.environ.get("DELEGATION_TIMEOUT", "300.0"))
async def _delegate_sync_via_polling(
workspace_id: str,
task: str,
src: str,
) -> str:
"""RFC #2829 PR-5: durable async delegation + poll for terminal status.
Sidesteps the platform proxy's blocking `message/send` HTTP path that
hits a hard 600s ceiling. Instead:
1. POST /workspaces/<src>/delegate (async, returns 202 + delegation_id)
platform's executeDelegation goroutine handles A2A dispatch in
the background. No client-side timeout dependency on the platform
holding a connection open.
2. Poll GET /workspaces/<src>/delegations every 3s for a row with
matching delegation_id reaching terminal status (completed/failed).
3. Return the response_preview text on completed; surface error_detail
on failed (with the same _A2A_ERROR_PREFIX wrapping the legacy
path uses, so caller error-detection logic is unchanged).
Both /delegate and /delegations are existing endpoints this helper
just composes them into a polling synchronous facade. The result is
available the moment the platform writes the terminal status row;
no extra latency vs. the legacy proxy-blocked path on fast cases.
"""
import asyncio
import time
idem_key = hashlib.sha256(f"{src}:{workspace_id}:{task}".encode()).hexdigest()[:32]
# 1. Dispatch via /delegate (the async, durable path).
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/delegate",
json={
"target_id": workspace_id,
"task": task,
"idempotency_key": idem_key,
},
headers=_auth_headers_for_heartbeat(src),
)
except Exception as e: # pylint: disable=broad-except
return f"{_A2A_ERROR_PREFIX}delegate dispatch failed: {e}"
if resp.status_code != 202 and resp.status_code != 200:
return f"{_A2A_ERROR_PREFIX}delegate dispatch failed: HTTP {resp.status_code} {resp.text[:200]}"
try:
dispatch = resp.json()
except Exception as e: # pylint: disable=broad-except
return f"{_A2A_ERROR_PREFIX}delegate dispatch returned non-JSON: {e}"
delegation_id = dispatch.get("delegation_id", "")
if not delegation_id:
return f"{_A2A_ERROR_PREFIX}delegate dispatch missing delegation_id: {dispatch}"
# 2. Poll for terminal status with a deadline. Each poll is a cheap
# /delegations GET — bounded by the platform's existing rate limit.
deadline = time.monotonic() + _SYNC_POLL_BUDGET_S
last_status = "unknown"
while time.monotonic() < deadline:
try:
async with httpx.AsyncClient(timeout=10.0) as client:
poll = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/delegations",
headers=_auth_headers_for_heartbeat(src),
)
except Exception as e: # pylint: disable=broad-except
# Transient — keep polling. The platform IS holding the
# delegation row; we just lost a network request.
last_status = f"poll-error: {e}"
await asyncio.sleep(_SYNC_POLL_INTERVAL_S)
continue
if poll.status_code != 200:
last_status = f"poll HTTP {poll.status_code}"
await asyncio.sleep(_SYNC_POLL_INTERVAL_S)
continue
try:
rows = poll.json()
except Exception as e: # pylint: disable=broad-except
last_status = f"poll non-JSON: {e}"
await asyncio.sleep(_SYNC_POLL_INTERVAL_S)
continue
# /delegations returns a flat list of delegation events. Filter to
# our delegation_id; pick the first terminal one. The list may
# have multiple rows per delegation_id (one for the original
# dispatch, one per status update); we want the latest terminal.
if not isinstance(rows, list):
await asyncio.sleep(_SYNC_POLL_INTERVAL_S)
continue
terminal = None
for r in rows:
if not isinstance(r, dict):
continue
if r.get("delegation_id") != delegation_id:
continue
status = (r.get("status") or "").lower()
last_status = status
if status in ("completed", "failed"):
terminal = r
break
if terminal:
if (terminal.get("status") or "").lower() == "completed":
# OFFSEC-003: sanitize response_preview before returning so
# boundary markers injected by a malicious peer cannot escape
# the trust boundary.
return sanitize_a2a_result(terminal.get("response_preview") or "")
# OFFSEC-003: sanitize error_detail / summary before wrapping with
# the _A2A_ERROR_PREFIX sentinel so injected markers cannot appear
# inside the trusted error block returned to the agent.
err_raw = (
terminal.get("error_detail")
or terminal.get("summary")
or "delegation failed"
)
err = sanitize_a2a_result(err_raw)
return f"{_A2A_ERROR_PREFIX}{err}"
await asyncio.sleep(_SYNC_POLL_INTERVAL_S)
# Budget exhausted — the platform's row is still in flight (or queued).
# Surface as an error so the caller can decide to retry or fall back;
# the platform DOES still have the durable row, so the work isn't
# lost — it'll complete eventually and a future check_task_status
# will surface the result.
return (
f"{_A2A_ERROR_PREFIX}polling timeout after {_SYNC_POLL_BUDGET_S}s "
f"(delegation_id={delegation_id}, last_status={last_status}); "
f"the platform is still working on it — call check_task_status('{delegation_id}') to retrieve later"
)
async def tool_delegate_task(
workspace_id: str,
task: str,
source_workspace_id: str | None = None,
) -> str:
"""Delegate a task to another workspace via A2A (synchronous — waits for response).
``source_workspace_id`` selects which registered workspace this
delegation originates from drives auth + the X-Workspace-ID source
header so the platform's a2a_proxy logs the correct sender. Single-
workspace operators leave it None and routing falls back to the
module-level WORKSPACE_ID.
"""
if not workspace_id or not task:
return "Error: workspace_id and task are required"
# Self-delegation guard: delegating to your own workspace ID deadlocks —
# the sending turn holds _run_lock while the receive handler waits for the
# same lock, the request 30s-times-out, and the whole cycle is wasted.
# Reject immediately with an actionable message. (effective_src mirrors the
# `src or WORKSPACE_ID` resolution used below for routing.)
effective_src = source_workspace_id or _peer_to_source.get(workspace_id) or WORKSPACE_ID
if workspace_id and workspace_id == effective_src:
return (
"Error: cannot delegate_task to your own workspace — self-delegation "
"deadlocks _run_lock (your sending turn holds it, the receive handler "
"waits for it, the request times out). There is no peer who is also you: "
"just do the work yourself, or call commit_memory / send_message_to_user directly."
)
# Auto-route: if source not specified, look up which registered
# workspace last saw this peer (populated by tool_list_peers). Falls
# back to the legacy WORKSPACE_ID for single-workspace operators.
src = source_workspace_id or _peer_to_source.get(workspace_id) or None
# Discover the target. discover_peer is the access-control gate +
# name/status lookup. The peer's reported ``url`` field is NOT used
# for routing — see send_a2a_message, which constructs the URL via
# the platform's A2A proxy.
peer = await discover_peer(workspace_id, source_workspace_id=src)
if not peer:
return f"Error: workspace {workspace_id} not found or not accessible (check access control)"
if (peer.get("status") or "").lower() == "offline":
return f"Error: workspace {workspace_id} is offline"
# Lazy import: a2a_tools imports this module at top-level, so a
# top-level import of report_activity from a2a_tools would create a
# circular dependency at first-import time. Lazy resolution inside
# the function body breaks the cycle without forcing a ground-up
# restructure of the activity-reporting layer.
from a2a_tools import report_activity
# Report delegation start — include the task text for traceability
peer_name = peer.get("name") or _peer_names.get(workspace_id) or workspace_id[:8]
_peer_names[workspace_id] = peer_name # cache for future use
# Brief summary for canvas display — just the delegation target
await report_activity("a2a_send", workspace_id, f"Delegating to {peer_name}", task_text=task)
# RFC #2829 PR-5: agent-side cutover. When DELEGATION_SYNC_VIA_INBOX=1,
# use the platform's durable async delegation API (POST /delegate +
# poll /delegations) instead of the proxy-blocked message/send path.
# This sidesteps the 600s message/send timeout class that broke
# iteration-14/90-style long-running delegations on 2026-05-05.
#
# Default off — staging-canary first, flip default after PR-2's
# result-push flag (DELEGATION_RESULT_INBOX_PUSH) has been on for
# ≥1 week without incident.
if os.environ.get("DELEGATION_SYNC_VIA_INBOX") == "1":
result = await _delegate_sync_via_polling(workspace_id, task, src or WORKSPACE_ID)
else:
# send_a2a_message routes through ${PLATFORM_URL}/workspaces/{id}/a2a
# (the platform proxy) so the same code works for in-container and
# external (standalone molecule-mcp) callers.
result = await send_a2a_message(workspace_id, task, source_workspace_id=src)
# #2967: when the target is a poll-mode peer, the platform's
# a2a_proxy short-circuits and returns a queued envelope —
# send_a2a_message surfaces that as the _A2A_QUEUED_PREFIX
# sentinel. The synchronous proxy path can't deliver a reply
# because the target has no public URL; fall back to the
# durable /delegate + /delegations polling path which DOES
# work for poll-mode peers (the executeDelegation goroutine
# writes to the inbox queue and the result row arrives when
# the target picks it up + replies).
#
# This is what makes external-runtime-to-external-runtime
# A2A actually deliver synchronous replies — without the
# fallback the calling agent sees the queued sentinel as
# success-with-no-text and never gets the peer's response.
if result.startswith(_A2A_QUEUED_PREFIX):
logger.info(
"tool_delegate_task: target=%s is poll-mode; "
"falling back from message/send to /delegate-poll path",
workspace_id,
)
result = await _delegate_sync_via_polling(
workspace_id, task, src or WORKSPACE_ID,
)
# Detect delegation failures — wrap them clearly so the calling agent
# can decide to retry, use another peer, or handle the task itself.
is_error = result.startswith(_A2A_ERROR_PREFIX)
# Strip the sentinel prefix so error_detail is the human-readable
# cause directly. The Activity tab's red error chip surfaces this
# without the user having to scroll into the raw response JSON.
#
# Cap at 4096 chars before sending — the platform's
# activity_logs.error_detail column is unbounded TEXT and a
# malicious or buggy peer could otherwise stream an arbitrarily
# large error message into the caller's activity log. 4096 is
# comfortably above any real exception traceback we've seen and
# well below an obvious-DoS threshold.
error_detail = result[len(_A2A_ERROR_PREFIX):].strip()[:4096] if is_error else ""
await report_activity(
"a2a_receive", workspace_id,
f"{peer_name} responded ({len(result)} chars)" if not is_error else f"{peer_name} failed: {error_detail[:120]}",
task_text=task, response_text=result,
status="error" if is_error else "ok",
error_detail=error_detail,
)
if is_error:
return (
f"DELEGATION FAILED to {peer_name}: {result}\n"
f"You should either: (1) try a different peer, (2) handle this task yourself, "
f"or (3) inform the user that {peer_name} is unavailable and provide your best answer."
)
# OFFSEC-003: escape boundary markers in peer text, then wrap in boundary
# markers so the agent can distinguish trusted (own output) from untrusted
# (peer-supplied) content. Explicit wrapping here rather than inside
# sanitize_a2a_result preserves a clean separation of concerns.
#
# Truncate at the closer BEFORE sanitizing so the raw closer (which gets
# lost during escaping) is removed from the content. After truncation,
# sanitize the remaining text and wrap with escaped boundary markers.
if _A2A_BOUNDARY_END in result:
result = result[:result.index(_A2A_BOUNDARY_END)]
escaped = sanitize_a2a_result(result)
return (
f"{_A2A_BOUNDARY_START_ESCAPED}\n"
f"{escaped}\n"
f"{_A2A_BOUNDARY_END_ESCAPED}"
)
async def tool_delegate_task_async(
workspace_id: str,
task: str,
source_workspace_id: str | None = None,
) -> str:
"""Delegate a task via the platform's async delegation API (fire-and-forget).
Uses POST /workspaces/:id/delegate which runs the A2A request in the background.
Results are tracked in the platform DB and broadcast via WebSocket.
Use check_task_status to poll for results.
``source_workspace_id`` selects the sending workspace (which one of
this agent's registered workspaces gets logged as the originator);
auto-routes via the peersource cache when omitted.
"""
if not workspace_id or not task:
return "Error: workspace_id and task are required"
src = source_workspace_id or _peer_to_source.get(workspace_id) or WORKSPACE_ID
# Self-delegation guard: even on the async path, queuing a task to your own
# workspace just makes you re-process your own dispatch — never useful, and
# on the sync path it deadlocks (see tool_delegate_task). Reject early.
if workspace_id and workspace_id == src:
return (
"Error: cannot delegate_task_async to your own workspace — there is no "
"peer who is also you. Do the work yourself, or call commit_memory / "
"send_message_to_user directly."
)
# Idempotency key: SHA-256 of (source, target, task) so that a
# restarted agent firing the same delegation gets the same key and
# the platform returns the existing delegation_id instead of
# creating a duplicate. Fixes #1456. Source is in the key so the
# SAME task delegated from two different registered workspaces
# produces two distinct delegations (the right behavior — one per
# tenant audit trail).
idem_key = hashlib.sha256(f"{src}:{workspace_id}:{task}".encode()).hexdigest()[:32]
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/delegate",
json={"target_id": workspace_id, "task": task, "idempotency_key": idem_key},
headers=_auth_headers_for_heartbeat(src),
)
if resp.status_code == 202:
data = resp.json()
return json.dumps({
"delegation_id": data.get("delegation_id", ""),
"workspace_id": workspace_id,
"status": "delegated",
"note": "Task delegated. The platform runs it in the background. Use check_task_status to poll for results.",
})
else:
return f"Error: delegation failed with status {resp.status_code}: {resp.text[:200]}"
except Exception as e:
return f"Error: delegation failed — {e}"
async def tool_check_task_status(
workspace_id: str,
task_id: str,
source_workspace_id: str | None = None,
) -> str:
"""Check delegations for this workspace via the platform API.
Args:
workspace_id: Ignored (kept for backward compat). Checks
``source_workspace_id``'s delegations (the workspace that
FIRED the delegations), not the target's.
task_id: Optional delegation_id to filter. If empty, returns all recent delegations.
source_workspace_id: Which registered workspace's delegation log
to query. Defaults to the module-level WORKSPACE_ID.
"""
src = source_workspace_id or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/delegations",
headers=_auth_headers_for_heartbeat(src),
)
if resp.status_code != 200:
return f"Error: failed to check delegations ({resp.status_code})"
delegations = resp.json()
if task_id:
# Filter by delegation_id
matching = [d for d in delegations if d.get("delegation_id") == task_id]
if matching:
# OFFSEC-003: sanitize peer-supplied fields
d = matching[0]
d["summary"] = sanitize_a2a_result(d.get("summary", ""))
d["response_preview"] = sanitize_a2a_result(d.get("response_preview", ""))
return json.dumps(d)
return json.dumps({"status": "not_found", "delegation_id": task_id})
# Return all recent delegations
summary = []
for d in delegations[:10]:
preview = d.get("response_preview", "")
if preview:
preview = sanitize_a2a_result(preview)
summary.append({
"delegation_id": d.get("delegation_id", ""),
"target_id": d.get("target_id", ""),
"status": d.get("status", ""),
"summary": sanitize_a2a_result(d.get("summary", "")),
"response_preview": preview,
})
return json.dumps({"delegations": summary, "count": len(delegations)})
except Exception as e:
return f"Error checking delegations: {e}"
-187
View File
@@ -1,187 +0,0 @@
"""Identity tool handlers — single-concern slice of the a2a_tools surface.
Owns the two MCP tools that close the T4-tier workspace owner-permission
gaps reported via the canvas:
* ``tool_get_runtime_identity`` env-only; returns model, model_provider,
molecule_model, anthropic_base_url, tier, workspace_id, runtime
(ADAPTER_MODULE). No HTTP call. Always permitted by RBAC even
read-only agents may know what model they are.
* ``tool_update_agent_card`` POSTs the card to ``/registry/update-card``
with the workspace's own bearer (same auth path as ``tool_commit_memory``
via ``a2a_tools_rbac.auth_headers_for_heartbeat``). The platform
replaces the stored card and broadcasts an ``agent_card_updated``
event so the canvas reflects the new card live. Gated on
``memory.write`` capability via the existing RBAC permission map so
read-only roles can't silently rewrite the platform card.
Both originated as a port of molecule-ai-workspace-runtime PR#17
(``feat(mcp): add update_agent_card + get_runtime_identity tools``).
The mirror-only PR#17 was closed without merge per
``reference_runtime_repo_is_mirror_only``; the canonical edit point is
this monorepo at ``workspace/`` and the wheel mirror is regenerated
automatically by the publish-runtime workflow.
Imports the auth-header primitive from ``a2a_tools_rbac`` (iter 4a)
NOT from ``a2a_tools`` to avoid a circular import with the
kitchen-sink re-export module.
"""
from __future__ import annotations
import json
import os
from typing import Any
import httpx
from a2a_client import PLATFORM_URL
from a2a_tools_rbac import (
auth_headers_for_heartbeat as _auth_headers_for_heartbeat,
check_memory_write_permission as _check_memory_write_permission,
)
def _runtime_identity_payload() -> dict[str, Any]:
"""Build the identity dict — env-only, no I/O.
Factored out from ``tool_get_runtime_identity`` so tests can assert
against the exact key set without re-parsing JSON. The MCP tool
handler ``tool_get_runtime_identity`` is the only public caller in
production; tests call this helper directly.
"""
return {
"model": os.environ.get("MODEL", ""),
"model_provider": os.environ.get("MODEL_PROVIDER", ""),
"molecule_model": os.environ.get("MOLECULE_MODEL", ""),
"anthropic_base_url": os.environ.get("ANTHROPIC_BASE_URL", ""),
"tier": os.environ.get("TIER", ""),
"workspace_id": os.environ.get("WORKSPACE_ID", ""),
# Adapter module is the closest thing the runtime has to a
# "template slug" — e.g. "adapter" for claude-code-default,
# "hermes" for hermes-template, etc. Picked from
# $ADAPTER_MODULE env baked by each template's Dockerfile.
"runtime": os.environ.get("ADAPTER_MODULE", ""),
}
async def tool_get_runtime_identity() -> str:
"""Return this runtime's identity — model, provider, tier, IDs.
Env-only; no HTTP call. Useful so the agent can answer "what model
am I?" correctly instead of guessing from a stale system prompt
that the operator may have changed between boots.
Returns the identity as a JSON-encoded string (the dispatch contract
every MCP tool in this module follows). Tests that want to assert
individual fields can call ``_runtime_identity_payload()`` directly,
or ``json.loads`` the return value.
Always permitted by RBAC there is no sensitive information here
that isn't already available to the process via ``os.environ``.
The point of the tool is to surface those env values to the agent
layer in a stable, documented shape rather than expecting every
agent runtime to know to ``echo $MODEL``.
"""
return json.dumps(_runtime_identity_payload(), indent=2)
async def tool_update_agent_card(card: Any) -> str:
"""Update this workspace's agent_card on the platform.
POSTs the provided card to ``/registry/update-card`` with the
workspace's own bearer token (same auth path as ``tool_commit_memory``
and ``tool_get_workspace_info``). The platform validates required
fields server-side, replaces the stored card, and broadcasts an
``agent_card_updated`` event so the canvas updates live.
Args:
card: A JSON-serialisable object (typically a dict) holding the
new card. The platform validates required fields server-side.
Returns:
JSON-encoded string. Body:
- ``{"success": true, "status": "updated"}`` on success;
- ``{"success": false, "error": "<msg>", "status_code": <int>}``
on platform error;
- ``{"success": false, "error": "<reason>"}`` on local validation
(non-dict card, missing WORKSPACE_ID, network error).
Permission gate: this tool requires the ``memory.write`` RBAC
capability same gate as ``tool_commit_memory``. The check runs
inline rather than at the dispatcher layer to keep ``a2a_mcp_server``
permission-agnostic (the gate sits with the implementation, not the
transport). Read-only roles get a clear error string back instead
of a 403 from the platform.
We re-check ``isinstance(card, dict)`` here defensively rather than
trust the MCP schema validator alone the schema only constrains
the transport, not the in-process call surface used by tests and
sibling modules.
"""
payload = await _update_agent_card_impl(card)
return json.dumps(payload, indent=2)
async def _update_agent_card_impl(card: Any) -> dict[str, Any]:
"""Dict-returning core of ``tool_update_agent_card``.
Split out so tests can assert against the raw dict shape (status
codes, error messages) without re-parsing JSON on every assertion.
The string-returning ``tool_update_agent_card`` is a thin wrapper
invoked by the MCP dispatcher.
"""
# RBAC: require memory.write permission. Same gate as
# tool_commit_memory (the agent already needs this capability to
# persist anything outbound). Read-only roles can still call
# get_runtime_identity / get_workspace_info to introspect — those
# are env-only / read-only and have no inline gate.
if not _check_memory_write_permission():
return {
"success": False,
"error": (
"RBAC — this workspace does not have the 'memory.write' "
"permission required to update the agent_card."
),
}
if not isinstance(card, dict):
return {
"success": False,
"error": "card must be a JSON object (dict)",
}
ws_id = os.environ.get("WORKSPACE_ID", "")
if not ws_id:
return {
"success": False,
"error": "WORKSPACE_ID env not set; cannot identify caller",
}
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/registry/update-card",
json={"workspace_id": ws_id, "agent_card": card},
headers=_auth_headers_for_heartbeat(),
)
if resp.status_code == 200:
body: dict[str, Any] = {}
try:
body = resp.json()
except Exception:
pass
return {
"success": True,
"status": body.get("status", "updated"),
}
# Non-200 — surface what the platform returned.
error_msg = ""
try:
error_msg = resp.json().get("error", "") or resp.text
except Exception:
error_msg = resp.text
return {
"success": False,
"status_code": resp.status_code,
"error": error_msg,
}
except Exception as e:
return {"success": False, "error": f"network error: {e}"}
-140
View File
@@ -1,140 +0,0 @@
"""Inbox tool handlers — single-concern slice of the a2a_tools surface.
Standalone-runtime path for inbound-message delivery (push-mode runtimes
get messages via the channel-tag synthesis in a2a_mcp_server). The
``InboxState`` singleton is set by ``mcp_cli`` before the MCP server
starts; in-container runtimes never call ``inbox.activate(...)`` so
``inbox.get_state()`` returns None and these tools surface an
informational error instead of raising.
When-to-use guidance for agents (mirrored in
``platform_tools/registry.py``):
- ``wait_for_message``: block until a new inbound message arrives, then
decide what to do with it; forms the loop ``wait respond wait``.
- ``inbox_peek``: inspect the queue non-destructively.
- ``inbox_pop``: remove a handled message by activity_id.
Extracted from ``a2a_tools.py`` in RFC #2873 iter 4e so the kitchen-sink
module shrinks to a back-compat shim. The extraction also makes the
``_enrich_inbound_for_agent`` helper unit-testable in isolation
previously it was buried in ``a2a_tools`` and only exercised through
the inbox wrappers, leaving its peer-id-empty / cache-miss / registry-
unavailable branches under-covered.
"""
from __future__ import annotations
import asyncio
import json
# Surfaced when the inbox subsystem is not initialised. Returned by the
# three inbox tool wrappers below so the agent gets a clear "this
# runtime delivers via push" message instead of a NameError.
_INBOX_NOT_ENABLED_MSG = (
"Error: inbox polling is not enabled in this runtime. The standalone "
"molecule-mcp wrapper activates it; in-container runtimes receive "
"messages via push delivery and do not need these tools."
)
def _enrich_inbound_for_agent(d: dict) -> dict:
"""Add peer_name / peer_role / agent_card_url to a poll-path message.
The PUSH path (a2a_mcp_server._build_channel_notification) already
enriches the meta dict with these fields, so a Claude Code host
with channel-push sees them. The POLL path goes through
InboxMessage.to_dict, which is intentionally identity-free (the
storage layer doesn't know about the registry cache). Without this
helper, every non-Claude-Code MCP client that uses inbox_peek /
wait_for_message gets a plain message and the receiving agent
can't tell who's writing breaking the contract documented in
a2a_mcp_server.py:303-345 ("In both paths the same fields apply").
Cache-first non-blocking enrichment (same shape as push): on cache
miss the helper returns the bare message; the next call within the
5-min TTL hits the warm cache. Failure to enrich is non-fatal
the agent still gets text + peer_id + kind + activity_id, just
without the friendly identity.
"""
peer_id = d.get("peer_id") or ""
if not peer_id:
# canvas_user — no peer to enrich; helper returns the plain
# message unchanged so the canvas reply path still works.
return d
try:
from a2a_client import ( # local import — avoid module-load cycle
_agent_card_url_for,
enrich_peer_metadata_nonblocking,
)
except Exception: # noqa: BLE001
# If a2a_client is unavailable (test harness, partial install),
# degrade gracefully — agent still gets the bare envelope.
return d
record = enrich_peer_metadata_nonblocking(peer_id)
if record is not None:
if name := record.get("name"):
d["peer_name"] = name
if role := record.get("role"):
d["peer_role"] = role
# agent_card_url is constructable from peer_id alone — surface it
# even when registry enrichment misses, so the receiving agent has
# a single endpoint to hit for the peer's full capability list.
d["agent_card_url"] = _agent_card_url_for(peer_id)
return d
async def tool_inbox_peek(limit: int = 10) -> str:
"""Return up to ``limit`` pending inbound messages without removing them."""
import inbox # local import — avoids a circular dep at module load
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
messages = state.peek(limit=limit if isinstance(limit, int) else 10)
return json.dumps([_enrich_inbound_for_agent(m.to_dict()) for m in messages])
async def tool_inbox_pop(activity_id: str) -> str:
"""Remove a message from the inbox queue by activity_id."""
import inbox
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
if not isinstance(activity_id, str) or not activity_id:
return "Error: activity_id is required."
removed = state.pop(activity_id)
if removed is None:
return json.dumps({"removed": False, "activity_id": activity_id})
return json.dumps({"removed": True, "activity_id": activity_id})
async def tool_wait_for_message(timeout_secs: float = 60.0) -> str:
"""Block until a new message arrives or ``timeout_secs`` elapses.
Returns the head message non-destructively; the agent decides
whether to ``inbox_pop`` it after acting.
"""
import inbox
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
try:
timeout = float(timeout_secs)
except (TypeError, ValueError):
timeout = 60.0
# Cap at 300s — Claude Code's default tool timeout is ~10min, and
# blocking longer than 5min wastes the prompt cache window for
# nothing useful. Operators who want longer can call repeatedly.
timeout = max(0.0, min(timeout, 300.0))
# The threading.Event-based wait would block the asyncio loop.
# Run it on the default executor so the MCP server can keep
# processing other JSON-RPC requests while we sleep.
loop = asyncio.get_running_loop()
message = await loop.run_in_executor(None, state.wait, timeout)
if message is None:
return json.dumps({"timeout": True, "timeout_secs": timeout})
return json.dumps(_enrich_inbound_for_agent(message.to_dict()))
-141
View File
@@ -1,141 +0,0 @@
"""Memory tool handlers — single-concern slice of the a2a_tools surface.
Extracted from ``a2a_tools.py`` (RFC #2873 iter 4c). Owns the two
agent-memory MCP tools:
* ``tool_commit_memory`` write to the workspace's persistent memory.
* ``tool_recall_memory`` search the workspace's persistent memory.
Both go through the platform's ``/workspaces/:id/memories`` endpoint;
the platform is the source of truth for namespace isolation + audit
trail. Local responsibility here is RBAC enforcement BEFORE hitting
the network so a denied operation surfaces a clear in-band error
instead of an opaque platform 403.
Imports the RBAC primitives from ``a2a_tools_rbac`` (iter 4a).
"""
from __future__ import annotations
import json
import httpx
from a2a_client import PLATFORM_URL, WORKSPACE_ID
from a2a_tools_rbac import (
auth_headers_for_heartbeat as _auth_headers_for_heartbeat,
check_memory_read_permission as _check_memory_read_permission,
check_memory_write_permission as _check_memory_write_permission,
is_root_workspace as _is_root_workspace,
)
from builtin_tools.security import _redact_secrets
async def tool_commit_memory(
content: str,
scope: str = "LOCAL",
source_workspace_id: str | None = None,
) -> str:
"""Save important information to persistent memory.
GLOBAL scope is writable only by root workspaces (tier == 0).
RBAC memory.write permission is required for all scope levels.
The source workspace_id is embedded in every record so the platform
can enforce cross-workspace isolation and audit trail.
``source_workspace_id`` selects which registered workspace this
memory belongs to when the agent is registered into multiple
workspaces (PR-1 / multi-workspace mode). When unset, falls back
to the module-level WORKSPACE_ID single-workspace operators see
no behaviour change.
"""
if not content:
return "Error: content is required"
content = _redact_secrets(content)
scope = scope.upper()
if scope not in ("LOCAL", "TEAM", "GLOBAL"):
scope = "LOCAL"
# RBAC: require memory.write permission (mirrors builtin_tools/memory.py)
if not _check_memory_write_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.write' "
"permission for this operation."
)
# Scope enforcement: only root workspaces (tier 0) can write GLOBAL memory.
# This prevents tenant workspaces from poisoning org-wide memory (GH#1610).
if scope == "GLOBAL" and not _is_root_workspace():
return (
"Error: RBAC — only root workspaces (tier 0) can write to GLOBAL scope. "
"Non-root workspaces may use LOCAL or TEAM scope."
)
src = source_workspace_id or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/memories",
json={
"content": content,
"scope": scope,
# Embed source workspace so the platform can namespace-isolate
# and audit cross-workspace writes (GH#1610 fix).
"workspace_id": src,
},
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if resp.status_code in (200, 201):
return json.dumps({"success": True, "id": data.get("id"), "scope": scope})
return f"Error: {data.get('error', resp.text)}"
except Exception as e:
return f"Error saving memory: {e}"
async def tool_recall_memory(
query: str = "",
scope: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Search persistent memory for previously saved information.
RBAC memory.read permission is required (mirrors builtin_tools/memory.py).
The workspace_id is sent as a query parameter so the platform can
cross-validate it against the auth token and defend against any future
path traversal / cross-tenant read bugs in the platform itself.
``source_workspace_id`` selects which registered workspace's memories
to search when the agent is registered into multiple workspaces.
Unset defaults to the module-level WORKSPACE_ID.
"""
# RBAC: require memory.read permission (mirrors builtin_tools/memory.py)
if not _check_memory_read_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.read' "
"permission for this operation."
)
src = source_workspace_id or WORKSPACE_ID
params: dict[str, str] = {"workspace_id": src}
if query:
params["q"] = query
if scope:
params["scope"] = scope.upper()
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/memories",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if isinstance(data, list):
if not data:
return "No memories found."
lines = []
for m in data:
lines.append(f"[{m.get('scope', '?')}] {m.get('content', '')}")
return "\n".join(lines)
return json.dumps(data)
except Exception as e:
return f"Error recalling memory: {e}"
-382
View File
@@ -1,382 +0,0 @@
"""Messaging tool handlers — single-concern slice of the a2a_tools surface.
Extracted from ``a2a_tools.py`` (RFC #2873 iter 4d). Owns the four
human-and-peer messaging MCP tools + the chat-upload helper they share:
* ``tool_send_message_to_user`` push a canvas-chat message via the
platform's ``/notify`` endpoint.
* ``tool_list_peers`` discover peers across one or many registered
workspaces, with side-effect of populating ``_peer_to_source`` for
delegate-task auto-routing.
* ``tool_get_workspace_info`` JSON-encode the workspace's own info.
* ``tool_chat_history`` fetch prior conversation rows with a peer.
* ``_upload_chat_files`` internal helper for the message-attachments
code path; routes local file paths through the platform's
``/chat/uploads`` so the canvas can render them as download chips.
Imports the auth-header primitive from ``a2a_tools_rbac`` (iter 4a).
"""
from __future__ import annotations
import json
import mimetypes
import os
import httpx
from a2a_client import (
PLATFORM_URL,
WORKSPACE_ID,
_peer_names,
_peer_to_source,
get_peers_with_diagnostic,
get_workspace_info,
)
from a2a_tools_rbac import auth_headers_for_heartbeat as _auth_headers_for_heartbeat
from platform_auth import list_registered_workspaces
async def _upload_chat_files(
client: httpx.AsyncClient,
paths: list[str],
workspace_id: str | None = None,
) -> tuple[list[dict], str | None]:
"""Upload local file paths through /workspaces/<self>/chat/uploads.
The platform stages each upload under /workspace/.molecule/chat-uploads
(an "allowed root" the canvas knows how to render via the Download
endpoint) and returns metadata the broadcast payload references.
Why we route through upload instead of just passing the agent's path:
the canvas's allowed-root list is /configs, /workspace, /home, /plugins
files at /tmp or /root would be unreachable. Uploading copies the
bytes into an allowed root regardless of where the agent wrote them.
Returns (attachments, error). On any failure the caller should NOT
fire the notify partial-attach would surface a half-rendered chip.
"""
if not paths:
return [], None
files_payload: list[tuple[str, tuple[str, bytes, str]]] = []
for p in paths:
if not isinstance(p, str) or not p:
return [], f"Error: invalid attachment path {p!r}"
if not os.path.isfile(p):
return [], f"Error: attachment not found: {p}"
try:
with open(p, "rb") as fh:
data = fh.read()
except OSError as e:
return [], f"Error reading {p}: {e}"
# Sniff mime from filename so the canvas can pick the right
# icon / preview / inline-image renderer. Pre-fix this was
# hardcoded application/octet-stream and chat_files.go's
# Upload trusts whatever Content-Type the multipart part
# carries — `mt := fh.Header.Get("Content-Type")` only falls
# back to extension-sniffing when the header is empty. So a
# hardcoded octet-stream meant every attachment lost its
# real type forever, breaking the canvas chip's icon logic.
mime_type, _ = mimetypes.guess_type(p)
if not mime_type:
mime_type = "application/octet-stream"
files_payload.append(("files", (os.path.basename(p), data, mime_type)))
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/chat/uploads",
files=files_payload,
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
except Exception as e:
return [], f"Error uploading attachments: {e}"
if resp.status_code != 200:
return [], f"Error: chat/uploads returned {resp.status_code}: {resp.text[:200]}"
try:
body = resp.json()
except Exception as e:
return [], f"Error parsing upload response: {e}"
uploaded = body.get("files") or []
if not isinstance(uploaded, list) or len(uploaded) != len(paths):
return [], f"Error: upload returned {len(uploaded) if isinstance(uploaded, list) else 'invalid'} entries for {len(paths)} files"
return uploaded, None
async def tool_broadcast_message(
message: str,
workspace_id: str | None = None,
) -> str:
"""Send a broadcast message to ALL agent workspaces in the org.
Requires the workspace to have broadcast_enabled=true (set by a user or
admin via PATCH /workspaces/:id/abilities). Use for urgent org-wide
signals status changes, critical alerts, coordination instructions.
Every non-removed workspace receives the message in its activity log so
poll-mode agents pick it up, and push-mode canvases get a real-time
BROADCAST_MESSAGE WebSocket event.
Args:
message: The broadcast text. Keep it concise all agents receive
this, so avoid lengthy prose that floods every context.
workspace_id: Optional. Which registered workspace to send the
broadcast from. Single-workspace agents omit this.
"""
if not message:
return "Error: message is required"
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=30.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/broadcast",
json={"message": message},
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
if resp.status_code == 200:
data = resp.json()
delivered = data.get("delivered", "?")
return f"Broadcast sent to {delivered} workspace(s)"
if resp.status_code == 403:
try:
hint = resp.json().get("hint", "")
except Exception:
hint = ""
return f"Error: broadcast ability not enabled.{(' ' + hint) if hint else ''}"
return f"Error: platform returned {resp.status_code}"
except Exception as e:
return f"Error sending broadcast: {e}"
async def tool_send_message_to_user(
message: str,
attachments: list[str] | None = None,
workspace_id: str | None = None,
) -> str:
"""Send a message directly to the user's canvas chat via WebSocket.
Args:
message: The text to display in the user's chat. Required even
when sending attachments set to a short caption like
"Here's the build output:" or "Done — see attached."
attachments: Optional list of absolute file paths inside this
container. Each is uploaded to the platform and rendered
in the canvas as a clickable download chip. Use this
instead of pasting paths in the message text paths
render as plain text and the user can't click them.
Examples:
attachments=["/tmp/build-output.zip"]
attachments=["/workspace/report.pdf", "/workspace/data.csv"]
workspace_id: Optional. When the agent is registered in MULTIPLE
workspaces (external multi-workspace MCP path), this
selects which workspace's chat to deliver the message to —
should match the ``arrival_workspace_id`` of the inbound
message you're replying to so the user sees the reply in
the same canvas they typed in. Single-workspace agents
omit this; the message routes to the only registered
workspace.
"""
if not message:
return "Error: message is required"
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=60.0) as client:
uploaded, upload_err = await _upload_chat_files(
client, attachments or [], workspace_id=target_workspace_id,
)
if upload_err:
return upload_err
payload: dict = {"message": message}
if uploaded:
payload["attachments"] = uploaded
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/notify",
json=payload,
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
if resp.status_code == 200:
if uploaded:
return f"Message sent to user with {len(uploaded)} attachment(s)"
return "Message sent to user"
if resp.status_code == 403:
try:
body = resp.json()
if body.get("error") == "talk_to_user_disabled":
hint = body.get("hint", "")
return (
"Error: this workspace is not allowed to send messages "
"directly to the user (talk_to_user is disabled). "
+ (hint + " " if hint else "")
+ "Use delegate_task to forward your update to a parent "
"or supervisor workspace that can reach the user."
)
except Exception:
pass
return f"Error: platform returned {resp.status_code}"
except Exception as e:
return f"Error sending message: {e}"
async def tool_list_peers(source_workspace_id: str | None = None) -> str:
"""List all workspaces this agent can communicate with.
Behavior:
- ``source_workspace_id`` set list peers of that one workspace.
- Unset, single-workspace mode list peers of WORKSPACE_ID
(the legacy path, unchanged).
- Unset, multi-workspace mode (MOLECULE_WORKSPACES populated)
aggregate across every registered workspace, prefixing each
peer with its source so the agent / user can see the full peer
surface in one call.
Side-effect: populates ``_peer_to_source`` so subsequent
``tool_delegate_task(target)`` auto-routes through the correct
sending workspace without the agent needing ``source_workspace_id``.
"""
sources: list[str]
aggregate = False
if source_workspace_id:
sources = [source_workspace_id]
else:
registered = list_registered_workspaces()
if len(registered) > 1:
sources = registered
aggregate = True
else:
sources = [WORKSPACE_ID]
all_peers: list[tuple[str, dict]] = [] # (source, peer_record)
diagnostics: list[tuple[str, str]] = [] # (source, diagnostic)
for src in sources:
peers, diagnostic = await get_peers_with_diagnostic(source_workspace_id=src)
if peers:
for p in peers:
all_peers.append((src, p))
elif diagnostic is not None:
diagnostics.append((src, diagnostic))
if not all_peers:
if diagnostics:
joined = "; ".join(f"[{src[:8]}] {d}" for src, d in diagnostics)
return f"No peers found. {joined}"
return (
"You have no peers in the platform registry. "
"(No parent, no children, no siblings registered.)"
)
lines = []
for src, p in all_peers:
status = p.get("status", "unknown")
role = p.get("role", "")
peer_id = p["id"]
# Cache name for use in delegate_task
_peer_names[peer_id] = p["name"]
# Cache the source workspace so tool_delegate_task auto-routes
_peer_to_source[peer_id] = src
if aggregate:
lines.append(
f"- {p['name']} (ID: {peer_id}, status: {status}, role: {role}, via: {src[:8]})"
)
else:
lines.append(f"- {p['name']} (ID: {peer_id}, status: {status}, role: {role})")
return "\n".join(lines)
async def tool_get_workspace_info(source_workspace_id: str | None = None) -> str:
"""Get this workspace's own info.
``source_workspace_id`` selects which registered workspace to
introspect when the agent is registered into multiple workspaces.
Unset falls back to module-level WORKSPACE_ID.
"""
info = await get_workspace_info(source_workspace_id=source_workspace_id)
return json.dumps(info, indent=2)
async def tool_chat_history(
peer_id: str,
limit: int = 20,
before_ts: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Fetch the prior conversation with one peer.
Hits ``/workspaces/<self>/activity?peer_id=<peer>&limit=<N>``
against the workspace-server, which returns activity rows where
the peer is either the sender (``source_id=peer`` they sent us
the message) or the recipient (``target_id=peer`` we sent to
them) of an A2A turn both sides of the conversation in
chronological order.
Args:
peer_id: The other workspace's UUID. Same value the agent
sees as ``peer_id`` on a peer_agent push or ``workspace_id``
on a delegate_task call.
limit: Maximum rows to return; capped server-side at 500. The
default of 20 covers "most recent context for this peer"
without flooding the agent's context window.
before_ts: Optional RFC3339 timestamp; only rows strictly
older are returned. Used to page backward through long
histories pass the oldest ``ts`` from the previous
response. Empty (default) returns the most recent ``limit``
rows.
source_workspace_id: Which registered workspace's activity log
to query. Auto-routes via ``_peer_to_source`` cache when
unset (the workspace this peer was discovered through);
falls back to module-level WORKSPACE_ID for single-workspace
operators.
Returns a JSON-encoded list of activity rows (or an error string
starting with ``Error:`` so the agent can branch). Each row carries
``activity_type``, ``source_id``, ``target_id``, ``method``,
``summary``, ``request_body``, ``response_body``, ``status``,
``created_at`` same shape ``inbox_peek`` and the canvas chat
loader already see.
"""
if not peer_id or not isinstance(peer_id, str):
return "Error: peer_id is required"
if not isinstance(limit, int) or limit <= 0:
limit = 20
if limit > 500:
limit = 500
src = source_workspace_id or _peer_to_source.get(peer_id) or WORKSPACE_ID
params: dict[str, str] = {
"peer_id": peer_id,
"limit": str(limit),
}
# Forward verbatim — the server route validates as RFC3339 at the
# trust boundary and translates into a `created_at < $X` clause.
if before_ts:
params["before_ts"] = before_ts
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/activity",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
except Exception as exc: # noqa: BLE001
return f"Error: chat_history request failed: {exc}"
if resp.status_code == 400:
# Trust-boundary rejection (malformed peer_id, etc.) — surface
# the server's reason verbatim so the agent can correct itself.
try:
err = resp.json().get("error", "bad request")
except Exception: # noqa: BLE001
err = "bad request"
return f"Error: {err}"
if resp.status_code >= 400:
return f"Error: chat_history returned HTTP {resp.status_code}"
try:
rows = resp.json()
except Exception: # noqa: BLE001
return "Error: chat_history response was not JSON"
if not isinstance(rows, list):
return "Error: chat_history response was not a list"
# Server returns DESC (most recent first); reverse to chronological
# so the agent reads the conversation top-down like a chat log.
rows.reverse()
return json.dumps(rows)

Some files were not shown because too many files have changed in this diff Show More