Compare commits

...

128 Commits

Author SHA1 Message Date
Molecule AI Dev Engineer A (Kimi) 130f48ed69 chore: retrigger CI — Local Provision E2E stub failed on provisioning timeout (infra flake on main, unrelated to dead-code removal)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
sop-checklist / all-items-acked (pull_request_target) Has been cancelled
sop-checklist / review-refire (pull_request_target) Has been cancelled
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 14s
CI / Canvas (Next.js) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 17s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
qa-review / approved (pull_request_target) Failing after 8s
gate-check-v3 / gate-check (pull_request_target) Successful in 11s
E2E Chat / E2E Chat (pull_request) Successful in 4s
security-review / approved (pull_request_target) Failing after 14s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
CI / Canvas Deploy Status (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m17s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m53s
CI / Platform (Go) (pull_request) Successful in 4m23s
CI / all-required (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m37s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m2s
2026-06-09 11:41:21 +00:00
Molecule AI Dev Engineer A (Kimi) b0cac02702 chore(dead-code): remove unused QueueDepth function
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 12s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Canvas (Next.js) (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request_target) Successful in 10s
qa-review / approved (pull_request_target) Failing after 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 17s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
Harness Replays / Harness Replays (pull_request) Successful in 1s
security-review / approved (pull_request_target) Failing after 20s
CI / Canvas Deploy Status (pull_request) Successful in 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 54s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m52s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m47s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m7s
CI / all-required (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m4s
QueueDepth was added for Phase 2/3 busy-return response visibility but
was never wired to a caller. The inline depth query in EnqueueA2A serves
today's enqueue response, making this function dead code.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-09 11:32:25 +00:00
agent-reviewer 3ed5aaa2a1 Merge pull request 'test(registry-auth): real-Postgres TestIntegration_ suite (#2148 / re-file #2156)' (#2475) from test/2148-registry-auth-real-postgres-v2 into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 8s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 3s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 5s
CI / Canvas Deploy Status (push) Successful in 1s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Harness Replays / Harness Replays (push) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m17s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m21s
publish-workspace-server-image / build-and-push (push) Successful in 3m38s
CI / Platform (Go) (push) Successful in 4m17s
CI / all-required (push) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 4m33s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m1s
E2E Chat / E2E Chat (push) Failing after 5m42s
publish-workspace-server-image / Production auto-deploy (push) Failing after 4m19s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m2s
2026-06-09 05:55:31 +00:00
Molecule AI Dev Engineer A (Kimi) b8858ee60f test(registry-auth): real-Postgres TestIntegration_ suite (#2148 / re-file #2156)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 2s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 15s
Harness Replays / Harness Replays (pull_request) Successful in 8s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas (Next.js) (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
sop-checklist / all-items-acked (pull_request_target) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
E2E Chat / E2E Chat (pull_request) Successful in 18s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 57s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m53s
CI / Platform (Go) (pull_request) Successful in 4m11s
CI / all-required (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7m24s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m3s
security-review / approved (pull_request_target) Review check failed via pull_request_review trigger
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Failing after 6s
qa-review / approved (pull_request_review) Successful in 7s
audit-force-merge / audit (pull_request_target) Successful in 8s
Re-files the stalled WIP #2156 (originally by molecule-code-reviewer) on
current main, de-duplicating against #2449 which already merged the
handlers-postgres table-presence guard.

Coverage (10 tests, //go:build integration, INTEGRATION_DB_URL):

1. RegistryRowState (4 tests) — register/heartbeat #73 tombstone guard:
   - RegisterDoesNotResurrectRemoved
   - RegisterUpsertsLiveWorkspaceToOnline
   - HeartbeatDoesNotResurrectRemoved
   - HeartbeatUpdatesLiveWorkspace

2. WSAuth (3 tests) — cross-tenant token binding:
   - TokenBoundToIssuingWorkspace
   - TokenOfRemovedWorkspaceRejected
   - RevokeAllForWorkspaceKillsToken

3. CanCommunicate (1 test) — parent_id hierarchy isolation:
   - HierarchyAndCrossTenantIsolation

4. OrgToken (2 tests) — revoke/validate row-state:
   - RevokeStopsValidation
   - ListExcludesRevoked

Also widens detect-changes handlers-postgres profile to include
internal/registry/ + internal/orgtoken/ so regressions in those
packages trigger the integration gate.

Closes #2148
Refs #2156
2026-06-09 05:29:05 +00:00
devops-engineer 7385a3a1c0 Merge PR #2469 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 3s
CI / Detect changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
E2E API Smoke Test / detect-changes (push) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
CI / Platform (Go) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
E2E Chat / detect-changes (push) Successful in 15s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 17s
CI / Canvas (Next.js) (push) Successful in 18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 15s
E2E Chat / E2E Chat (push) Successful in 15s
CI / Canvas Deploy Status (push) Successful in 3s
CI / all-required (push) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m37s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 3m48s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m33s
publish-workspace-server-image / build-and-push (push) Successful in 3m36s
publish-workspace-server-image / Production auto-deploy (push) Failing after 3m50s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-09 03:14:14 +00:00
devops-engineer 7219f3dc64 fix(ci): audit-force-merge.sh select max-by-id per context (Gitea /statuses non-monotonic)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
CI / Canvas Deploy Status (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request_target) Failing after 8s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m13s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m51s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 6s
security-review / approved (pull_request_review) Successful in 5s
audit-force-merge / audit (pull_request_target) Successful in 8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m46s
qa RC 9902: the per-context collapse used last-overwrite-wins asserting Gitea
returns ascending id, so last overwrite = newest. Verified live on #2331 head:
/commits/<sha>/statuses is roughly newest-first but NOT strictly monotonic
(first ids 157,155,156,… — local inversions from re-runs/page boundaries). So
last-overwrite-wins selected the OLDEST row per context (stale status) and
first-occurrence is also unsafe. Fixed to jq group_by|max_by(.id) — explicit
newest-by-id, order-independent, matching prod-auto-deploy.py. Pagination +
fail-closed unchanged. Tests: collapse helper now mirrors the max-by-id jq;
T21 fixture rewritten to the real non-monotonic contract (newest id neither
first nor last) so it guards both last-wins and first-wins regressions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 03:05:21 +00:00
devops-engineer 6a19b98918 Merge PR #2470 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Platform (Go) (push) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
E2E Chat / E2E Chat (push) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 17s
CI / Canvas Deploy Status (push) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (push) Has started running
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 17s
CI / Shellcheck (E2E scripts) (push) Successful in 34s
CI / all-required (push) Successful in 11s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m16s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m14s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m38s
publish-workspace-server-image / build-and-push (push) Successful in 3m45s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 4m33s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m22s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m39s
publish-workspace-server-image / Production auto-deploy (push) Failing after 1h0m22s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-09 02:56:25 +00:00
devops-engineer e2d7ff0df8 Merge PR #2465 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 6s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (push) Has been skipped
CI / Detect changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 24s
E2E Chat / detect-changes (push) Successful in 24s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
Harness Replays / Harness Replays (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (push) Successful in 43s
publish-canvas-image / Build & push canvas image (push) Successful in 1m47s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m49s
publish-workspace-server-image / build-and-push (push) Successful in 3m59s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 4m14s
CI / Platform (Go) (push) Successful in 4m21s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m4s
E2E Chat / E2E Chat (push) Failing after 5m39s
CI / Canvas (Next.js) (push) Successful in 6m50s
CI / Canvas Deploy Status (push) Successful in 1s
CI / all-required (push) Successful in 1s
publish-canvas-image / Promote canvas :latest to CI-green build (push) Successful in 5m9s
publish-workspace-server-image / Production auto-deploy (push) Failing after 6m45s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 6m59s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 25m58s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-09 02:39:55 +00:00
Molecule AI Dev Engineer A (Kimi) 3870dd2dce fix(ci): hard-code MOLECULE_ENV in local-provision E2E + retry tenant image build
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
CI / Canvas Deploy Status (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / all-required (pull_request) Successful in 10s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
gate-check-v3 / gate-check (pull_request_target) Failing after 19s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m9s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m23s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m34s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m54s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m22s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m4s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 7s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 9s
audit-force-merge / audit (pull_request_target) Successful in 9s
- Moves MOLECULE_ENV=development and SECRETS_ENCRYPTION_KEY to the job-level
  env block in both lifecycle-stub and lifecycle-real so the platform server
  always sees dev mode even if the runner's $GITHUB_ENV propagation is flaky.
  This addresses the 'workspace URL is not publicly routable' SSRF failure on
  main (#2468) where loopback/private IPs were being rejected.

- Adds workspace URL debug print in test_local_provision_lifecycle_e2e.sh so
  future SSRF failures show the actual stored URL immediately.

- Wraps the tenant image build in publish-workspace-server-image.yml with a
  3-attempt retry loop that creates a fresh buildx builder each time. The
  buildkit EOF error (#2468) is often transient under memory pressure on the
  publish runner; a clean builder retry avoids poisoning from a crashed one.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 02:37:45 +00:00
devops-engineer 59405ab775 fix(ci): paginate /statuses to exhaustion in verify-by-state readers
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 12s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
CI / all-required (pull_request) Successful in 10s
qa-review / approved (pull_request_target) Review check failed via pull_request_review trigger
security-review / approved (pull_request_target) Review check failed via pull_request_review trigger
qa-review / approved (pull_request_review) Failing after 4s
security-review / approved (pull_request_review) Failing after 5s
The status-pagination bug (RCA, #2440-family): merge/verify status readers
fetched only the FIRST page of a commit's statuses. On high-churn PRs Gitea
caps the combined GET /commits/{sha}/status `statuses` array at the default
page size (~30) and pushes older-but-still-current required-context rows past
it. A reader of that truncated view records the required context as ABSENT
(missing) even though its current SUCCESS row exists — wrongly blocking, or
mis-reading the gate. Confirmed on #2448/#2426/#2438/#2331/#2259/#2055/#2032
(reviewers had to manually paginate to verify gates this whole session). Live
proof on PR #2331 head: combined /status returns 30 rows; exhaustive
/statuses returns 50 rows across 20 distinct contexts.

Two verify-by-state readers consumed that capped combined view for
required-context decisions and are fixed here to page the dedicated
/commits/{sha}/statuses list to EXHAUSTION (until a short/empty page), then
collapse to newest-row-per-context:

- prod-auto-deploy.py (wait-ci gate): replaced the single combined /status
  fetch with fetch_all_statuses() (paginated). A required context past page 1
  no longer reads "missing" forever and times out a legitimate prod deploy.
  latest_status_for_context now selects newest-by-id so the oldest-first
  /statuses ordering can't let a stale run shadow the current one.
- audit-force-merge.sh: replaced the single combined /status fetch with a
  page loop over /commits/{sha}/statuses, accumulating all rows before the
  newest-wins CHECK_STATE collapse. A required SUCCESS past the cap no longer
  reads "missing" and emits a false-positive incident.force_merge.

gitea-merge-queue.py already paginates /statuses to exhaustion
(get_combined_status + api_paginated) — left unchanged; it is the reference
behavior this change brings the other two readers in line with.

STRENGTHENING ONLY — fail-closed preserved, NO fail-open path introduced:
- prod-auto-deploy: a genuinely-absent required context appears on NO page,
  so ci_context_state() still returns "missing", context_is_satisfied()
  rejects it, and the gate never greens (times out). Any page that errors or
  is not a list raises (fetch_all_statuses/_api_json_list) — a partial list
  never passes as complete.
- audit-force-merge: any non-200 page or non-array body aborts with exit 1;
  an absent required context has no CHECK_STATE entry so `${...:-missing}`
  keeps it not-green and the audit still fires.

Tests (mutation-resistant): added regressions that (a) place a required
SUCCESS on page 2+ behind a full page of churn and assert the reader FINDS it,
and (b) make a required context genuinely absent on all pages and assert the
reader STILL fail-closes (missing/never-satisfied → blocks/times out). Mocks
the paginated HTTP responses. Also locks newest-wins collapse, short-page
stop, full-page continue, and page-error propagation.

Refs: status-pagination RCA, #2440-family.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 02:25:51 +00:00
core-devops e5438c49ed fix(workspace): fail-closed on provider-switch read error (no orphan)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
sop-checklist / review-refire (pull_request_target) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request_target) Failing after 8s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m58s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m54s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m21s
CI / Canvas (Next.js) (pull_request) Successful in 6m33s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 8m11s
CI / all-required (pull_request) Successful in 2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m2s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 8s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 10s
audit-force-merge / audit (pull_request_target) Successful in 9s
Security review RC 9895 (agent-researcher) caught a fail-OPEN on the
old-provider read in the Update handler: the switch-detection block was
gated on `if err == nil { ... }`, so a transient/unexpected DB error on
`SELECT compute->>'provider'` skipped the whole block and fell through to
the compute UPDATE. During a real cross-cloud switch that overwrites the
provider record without deprovisioning the old box → the later
provider-aware restart deprovision targets the NEW cloud and orphans the
old box (silent billing, unrecoverable) — the exact failure this PR
prevents everywhere else, but on the non-deterministic read-error path
(invisible to CI and to the live staging-switch proof).

Fix: read the provider with an explicit error check — abort 502 (compute
untouched, old box recoverable, user retries) on any error other than
sql.ErrNoRows; ErrNoRows means there is genuinely no prior box, so it's
safe to skip the switch and let the UPDATE proceed. Same fail-closed
invariant the deprovision path already has.

Adds TestWorkspaceUpdate_ProviderSwitch_AbortsOnProviderReadError:
sqlmock WillReturnError on the provider read → 502, zero Stop calls, and
no UPDATE expectation so a re-introduced overwrite trips sqlmock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 19:24:17 -07:00
devops-engineer 556d57e09d Merge PR #2426 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Python Lint & Test (push) Successful in 3s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
CI / Platform (Go) (push) Successful in 2s
Harness Replays / Harness Replays (push) Successful in 1s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 14s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
publish-workspace-server-image / build-and-push (push) Failing after 48s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m12s
publish-canvas-image / Build & push canvas image (push) Successful in 1m34s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 4m30s
E2E Chat / E2E Chat (push) Failing after 5m35s
CI / Canvas (Next.js) (push) Successful in 6m27s
CI / Canvas Deploy Status (push) Successful in 1s
CI / all-required (push) Successful in 2s
publish-canvas-image / Promote canvas :latest to CI-green build (push) Successful in 5m14s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m29s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 31m32s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-09 02:14:03 +00:00
core-devops 54f43044f3 test(workspace): deterministic cloud-provider switch orchestration tests
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
E2E Chat / detect-changes (pull_request) Successful in 20s
Harness Replays / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 21s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 15s
E2E Chat / E2E Chat (pull_request) Successful in 10s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 35s
gate-check-v3 / gate-check (pull_request_target) Successful in 12s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 21s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m21s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m22s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m4s
CI / Platform (Go) (pull_request) Successful in 4m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m15s
CI / Canvas (Next.js) (pull_request) Successful in 7m1s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 12s
security-review / approved (pull_request_target) Review check failed via pull_request_review trigger
qa-review / approved (pull_request_target) Review check failed via pull_request_review trigger
security-review / approved (pull_request_review) Failing after 6s
qa-review / approved (pull_request_review) Failing after 8s
Pins the DESTRUCTIVE in-place provider switch's safety invariants without a
real cloud (sqlmock DB + the scriptedCPStop fake):

  1. provider change → OLD box deprovisioned (cpProv.Stop) BEFORE compute is
     overwritten — the ordering that prevents the post-switch provider-aware
     deprovision from targeting the NEW cloud and orphaning the old box.
  2. old-box deprovision FAILS → handler aborts (502) and does NOT overwrite
     compute (an unexpected UPDATE fails sqlmock → the orphan bug is caught).
  3. same-provider compute edit → no deprovision.

Complements the live cross-cloud half in molecule-controlplane
(TestLiveCrossCloudWorkspaceProviderSwitch, provider-live-e2e nightly).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:43:14 -07:00
agent-dev-a 3fa4230b5a Merge pull request 'ci(local-provision-e2e): dynamic ephemeral port to fix runner bind conflicts' (#2453) from fix/2450-local-provision-dynamic-port into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 11s
CI / Platform (Go) (push) Successful in 2s
E2E Chat / detect-changes (push) Successful in 17s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
CI / Canvas (Next.js) (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
CI / Canvas Deploy Status (push) Successful in 3s
E2E Chat / E2E Chat (push) Successful in 6s
CI / all-required (push) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m18s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m48s
publish-workspace-server-image / build-and-push (push) Failing after 2m39s
publish-workspace-server-image / Production auto-deploy (push) Has been skipped
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 3m58s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m2s
2026-06-09 01:42:14 +00:00
agent-reviewer 602c72f342 Merge pull request 'test(db): real-PG migration replay-from-scratch + InitPostgres ping + redis online-status key/TTL coverage (#2150)' (#2452) from refile/2155-migration-replay-from-scratch into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Python Lint & Test (push) Successful in 8s
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Has started running
CI / Detect changes (push) Successful in 12s
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
E2E Chat / detect-changes (push) Successful in 14s
Harness Replays / detect-changes (push) Successful in 7s
CI / Canvas Deploy Status (push) Successful in 2s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
Harness Replays / Harness Replays (push) Successful in 3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 10s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 10s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m16s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m42s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m30s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 3m55s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m6s
E2E Chat / E2E Chat (push) Failing after 5m51s
publish-workspace-server-image / build-and-push (push) Successful in 5m39s
CI / Platform (Go) (push) Successful in 7m3s
CI / all-required (push) Successful in 3s
publish-workspace-server-image / Production auto-deploy (push) Failing after 4m42s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m3s
2026-06-09 01:33:19 +00:00
core-devops 6c9cfdac3a fix(workspace): abort provider-switch if old-box deprovision fails (no cross-cloud orphan)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 11s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
E2E Chat / E2E Chat (pull_request) Successful in 15s
sop-checklist / review-refire (pull_request_target) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request_target) Failing after 11s
gate-check-v3 / gate-check (pull_request_target) Successful in 15s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 35s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 14s
security-review / approved (pull_request_target) Failing after 14s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m16s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m27s
CI / Platform (Go) (pull_request) Successful in 4m25s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 4m24s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m12s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 56s
CI / Canvas (Next.js) (pull_request) Successful in 6m48s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2s
Safety review of #2465 found a second orphan path: the switch used the void
cpStopWithRetry (discards the error), so if deprovisioning the OLD box failed, the
handler still overwrote compute.provider -> the old box kept billing on the OLD
cloud with NO DB pointer (unrecoverable; reconcilers key on the now-new instance_id
and provider). Fix: use cpStopWithRetryErr and ABORT (502, compute untouched)
before the UPDATE on failure, so the row stays pointed at the still-recoverable old
box and the user can retry. The restart paths' void variant is unaffected (their
box stays on the same cloud).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:25:31 -07:00
core-devops 0fd54c4272 feat(workspace): in-place cloud-provider switch in Container Config
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 11s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 12s
security-review / approved (pull_request_target) Failing after 13s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 13s
gate-check-v3 / gate-check (pull_request_target) Failing after 26s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m21s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m43s
CI / Platform (Go) (pull_request) Successful in 4m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m9s
CI / Canvas (Next.js) (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
Makes a workspace's cloud provider EDITABLE from the canvas Container Config tab
(was a read-only badge — provider was create-time-only because switching clouds
recreates the box). The provider dropdown drives the instance-type list, and a
change triggers a confirmed cross-cloud recreate.

Frontend (canvas/ContainerConfigTab.tsx):
- Provider <select> (AWS/GCP/Hetzner), shown in SaaS (non-SaaS keeps the badge).
- Instance-type list keyed to the chosen provider (AWS t3*/m6i*/c6i*, Hetzner
  cpx*/cax*, GCP e2-*); switching resets the instance type to the provider default.
- A destructive-action confirm before a provider switch ('recreates the box on
  the new cloud; non-persisted state is lost'). AWS default is omitted from the
  PATCH so non-switching saves are wire-identical.

Backend (workspace-server):
- workspace_compute.go: instance-type allowlist is now PROVIDER-KEYED + validates
  the instance type belongs to the provider (an AWS t3.* on Hetzner is a clean 400).
- workspace_crud.go Update: SAFETY — on a provider change, deprovision the OLD box
  on the OLD provider BEFORE overwriting compute. Otherwise the restart's
  provider-aware deprovision (resolveProvider reads compute->>'provider') would
  target the NEW cloud and ORPHAN the old (still-billing) box. Cloud-mode only.

Tests: provider-keyed instance allowlist (Go) + canvas switch UX (selector renders,
instance-type resets on switch, confirm fires, PATCH carries the new provider) +
the no-switch path (no confirm, aws omitted). All green; existing tests unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:19:58 -07:00
Molecule AI Dev Engineer A (Kimi) bf1f1750fa test(canvas): deflake DisplayTab noVNC constructor assertion\n\nAdds an explicit waitFor before asserting mockRFBConstructor arguments.\nThe noVNC client is loaded via dynamic import inside a useEffect, so in\nCI the assertion could race ahead of the async init and fail with\n'Number of calls: 0'. This is the flake blocking CI/all-required on #2426.\n\nCo-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 19s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m54s
CI / Canvas (Next.js) (pull_request) Successful in 8m55s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m6s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 6s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 8s
audit-force-merge / audit (pull_request_target) Successful in 6s
2026-06-09 01:07:57 +00:00
Molecule AI Dev Engineer A (Kimi) b6342d4afd test(canvas): gating test for in-flight turn status preservation on hydrate (issue #2391) 2026-06-09 01:07:57 +00:00
Molecule AI Dev Engineer A (Kimi) db45ac45a7 ci(local-provision-e2e): dynamic ephemeral port to fix runner bind conflicts (#2450)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Detect changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 32s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 54s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m25s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m15s
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 22s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m38s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 14s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
CI / Canvas Deploy Status (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 20s
CI / all-required (pull_request) Successful in 10s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m56s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m50s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m7s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 6s
security-review / approved (pull_request_review) Successful in 6s
audit-force-merge / audit (pull_request_target) Successful in 7s
Replaces the fixed :8080 bind with an OS-allocated ephemeral port in both
lifecycle-stub and lifecycle-real jobs. This eliminates the "address already
in use" failures caused by stale processes or concurrent jobs on shared
docker-host runners.

Changes:
- Configure platform env: allocate PORT via socket.bind(('', 0)) and set
  BASE=http://localhost:8000.
- Start platform: use PORT=8000 instead of hardcoded 8080.
- Kill stale platform-server: remove the fuser/lsof port-scan for 8080
  (no longer needed) and keep the comm-scan process cleanup.
- Update comments to reflect dynamic-port rationale.

Fixes #2450

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 23:26:01 +00:00
Molecule AI Dev Engineer B (MiniMax) 894bd07285 test(db): real-PG migration replay-from-scratch + InitPostgres ping + redis online-status key/TTL coverage (#2150)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 22s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 26s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 20s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 21s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m14s
gate-check-v3 / gate-check (pull_request_target) Successful in 11s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m38s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m2s
E2E Chat / E2E Chat (pull_request) Successful in 5s
CI / Canvas Deploy Status (pull_request) Successful in 1s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m48s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m51s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m12s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6m12s
CI / Platform (Go) (pull_request) Successful in 8m19s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 7s
security-review / approved (pull_request_review) Successful in 7s
audit-force-merge / audit (pull_request_target) Successful in 7s
Refile of the WIP at origin/regression/2150-migration-replay-from-scratch-real-pg
(stalled 2026-06-03, never advanced past DRAFT, base on a 200-commit-stale main
that would have undone my PR #2449's guard widening + the mc#1982 mask removal
+ the #2149 scheduler trigger if merged directly).

This is the #2150 implementation (close-supersedes the WIP PR #2155):

  - workspace-server/internal/db/postgres_replay_integration_test.go (286 lines)
      Real-PG integration tests for db.RunMigrations (forward chain replay-from-
      scratch via the production entrypoint, hard-fail; double-apply for the
      045 crash-loop class) and db.InitPostgres (ping + bad-DSN).
  - workspace-server/internal/db/redis_test.go (291 lines)
      Unit tests for redis.go (was untested fleet-wide): SetOnline / IsOnline /
      RefreshTTL on ws:<id>, CacheURL / GetCachedURL on ws:<id>:url, internal
      namespace pin, LivenessTTL >= 5x heartbeat, real TTL expiry via miniredis.
  - .gitea/workflows/handlers-postgres-integration.yml (+27)
      New 'Migration replay-from-scratch gate (#2150)' step, runs the integration
      suite against a SEPARATE 'molecule_replay' database on the same sibling
      Postgres (so the destructive DROP SCHEMA never touches the handlers
      molecule DB). Inserted AFTER the scheduler (#2149) step; does NOT undo
      any of: the mc#1982 mask removal, the preflight INTEGRATION_DB_URL
      check, or the table-presence guard widening (PR #2449).
  - .gitea/scripts/detect-changes.py (+5)
      'handlers-postgres' profile now also matches internal/db/ (additive,
      preserves the scheduler trigger from #2149) so a change to redis.go or
      postgres.go runs the gate.

Refs #2150. Supersedes the WIP PR #2155 (DRAFT, 6 days stalled, branched from
a 200-commit-stale main).
2026-06-08 23:25:19 +00:00
agent-reviewer-cr2 00705c11cd Merge pull request 'ci: fail-closed when ops-scripts unittest collects 0 tests' (#2448) from fix/ci-fail-on-zero-tests-collected into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
E2E Chat / E2E Chat (push) Blocked by required conditions
CI / Python Lint & Test (push) Successful in 5s
E2E Chat / detect-changes (push) Has started running
CI / Detect changes (push) Successful in 17s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 17s
E2E API Smoke Test / detect-changes (push) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 9s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 9s
CI / Platform (Go) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
CI / Canvas (Next.js) (push) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
CI / Canvas Deploy Status (push) Successful in 10s
CI / all-required (push) Successful in 6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 1m15s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m22s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m44s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 1m17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m1s
publish-workspace-server-image / build-and-push (push) Successful in 6m32s
publish-workspace-server-image / Production auto-deploy (push) Failing after 3m51s
2026-06-08 23:15:53 +00:00
devops-engineer fc54d4a046 Merge PR #2449 via Gitea merge queue
CI / Python Lint & Test (push) Successful in 4s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
CI / Platform (Go) (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 2s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
CI / Canvas Deploy Status (push) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
E2E Chat / detect-changes (push) Successful in 27s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 21s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 27s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
CI / all-required (push) Successful in 8s
E2E Chat / E2E Chat (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 36s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m8s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m9s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 32s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m27s
publish-workspace-server-image / build-and-push (push) Successful in 3m27s
publish-workspace-server-image / Production auto-deploy (push) Failing after 3m54s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-08 23:11:03 +00:00
Molecule AI Dev Engineer B (MiniMax) d1bcc09aa0 ci(handlers-postgres): widen required-tables guard to include workspace_auth_tokens + org_api_tokens (#2148)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m25s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m17s
sop-checklist / review-refire (pull_request_target) Has been skipped
CI / Platform (Go) (pull_request) Successful in 1s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 32s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m54s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m10s
CI / Canvas Deploy Status (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 5s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m19s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m17s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 11s
security-review / approved (pull_request_review) Successful in 9s
audit-force-merge / audit (pull_request_target) Successful in 7s
The table-presence guard in .gitea/workflows/handlers-postgres-integration.yml
hard-fails the integration job if a load-bearing table is missing after
migration replay. The previous list covered delegations / workspaces /
activity_logs / pending_uploads / workspace_schedules, but the registry-auth
TestIntegration_ suite (#2156 / #2148) also requires workspace_auth_tokens
(migration 020) and org_api_tokens (migration 035).

Without this guard, a silently-skipped migration 020 or 035 (the surrounding
apply-all-or-skip loop suppresses migration failures) would let the auth
tests run against missing tables and falsely green. This change makes the
guard catch that class of regression.

This is the CR2 action item flagged in the #2156 WIP body: 'consider adding
workspace_auth_tokens + org_api_tokens to that sanity list so a skipped
auth-table migration fails loud instead of skipping silently.'

Closes the guard gap for #2148 independently of the #2156 test-suite WIP
(cleanly-separable; the WIP test work remains the devops-engineer's lane).
2026-06-08 22:58:16 +00:00
agent-reviewer b3241aecf5 Merge pull request 'fix(scheduler): enqueue cron ticks on busy agents instead of dropping them' (#2446) from fix/scheduler-enqueue-cron-on-busy into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 27s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 38s
E2E API Smoke Test / detect-changes (push) Successful in 38s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (push) Successful in 20s
Harness Replays / detect-changes (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 2s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 23s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 1m31s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push) Failing after 2m40s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push) Failing after 2m55s
Harness Replays / Harness Replays (push) Successful in 3s
CI / Canvas Deploy Status (push) Successful in 1s
publish-workspace-server-image / build-and-push (push) Successful in 7m9s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 31s
CI / Platform (Go) (push) Successful in 4m4s
CI / all-required (push) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push) Failing after 5m9s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push) Failing after 5m42s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 7m5s
E2E Chat / E2E Chat (push) Failing after 7m19s
publish-workspace-server-image / Production auto-deploy (push) Failing after 4m50s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m48s
2026-06-08 22:50:42 +00:00
devops-engineer 6bd7092409 Merge PR #2428 via Gitea merge queue
Block internal-flavored paths / Block forbidden paths (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 30s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 36s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 2s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (push) Successful in 52s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push) Failing after 2m41s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push) Failing after 2m45s
Harness Replays / Harness Replays (push) Successful in 5s
CI / Canvas Deploy Status (push) Successful in 11s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 3m51s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m8s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m16s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push) Failing after 5m38s
E2E Chat / E2E Chat (push) Failing after 5m34s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 6m31s
publish-workspace-server-image / build-and-push (push) Successful in 6m54s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push) Failing after 7m23s
publish-workspace-server-image / Production auto-deploy (push) Failing after 31s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 6m44s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 32m35s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-08 22:46:52 +00:00
devops-engineer f1ccd3bb05 ci: fail-closed when ops-scripts unittest collects 0 tests
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 16s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Platform (Go) (pull_request) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 15s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m21s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m17s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 17s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m34s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m24s
CI / Canvas Deploy Status (pull_request) Successful in 6s
CI / all-required (pull_request) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m21s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 7s
qa-review / approved (pull_request_review) Successful in 8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m1s
audit-force-merge / audit (pull_request_target) Successful in 20s
Gate-integrity hardening. The "Run scripts/ unittests, if any" step in
.gitea/workflows/test-ops-scripts.yml detected "no tests collected" via
`rc=$?; if [ "$rc" -eq 5 ]`. But Python 3.12's unittest exits 0 (not 5)
when discovery finds 0 tests ("Ran 0 tests ... NO TESTS RAN"), so the
guard never fired: the step passed GREEN while running ZERO tests. Any
test_*.py added under scripts/ would have been silently never executed.

A green job that runs 0 tests is worse than a red one. This fails closed:

  scripts/ (top-level) step:
    - genuinely NO test_*.py present  -> loud SKIP (legitimate no-op; the
      runtime-packaging tests moved to molecule-ai-workspace-runtime, so
      there are none today)
    - test_*.py present but 0 collected -> FAIL (broken import / empty /
      discovery error)
    Count is via TestLoader().discover(...).countTestCases(), not exit code.

  scripts/ops/ step (real gate, 34 tests today):
    - assert >0 collected so deleting all test files or breaking an import
      can't pass GREEN by running 0 tests.

ci.yml "Diagnostic — per-package verbose 60s" is continue-on-error and
explicitly advisory (the blocking gate is the next step); left functional
unchanged, only a clarifying comment added so its `set +e` isn't mistaken
for this same bug class.

The real `Ops Scripts Tests` pytest gate (.gitea/scripts/tests) is untouched.

Proven on the operator: scripts/ unittest exits 0 on 0 tests (the bug);
new guard SKIPs on no-files, FAILs on files-present-but-0-collected, PASSes
on a real test; ops guard PASSes at 34 and FAILs on empty. Workflow-YAML
linter green (0 warnings).

Part of a gate-integrity hardening pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 22:42:43 +00:00
Molecule AI Dev Engineer A (Kimi) 6e98e08b0a ci: re-trigger required E2E API Smoke + Handlers PG checks
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 14s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 23s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 38s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
CI / Canvas (Next.js) (pull_request) Successful in 19s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 22s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 44s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 55s
Harness Replays / detect-changes (pull_request) Successful in 20s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
CI / Canvas Deploy Status (pull_request) Successful in 8s
gate-check-v3 / gate-check (pull_request_target) Successful in 25s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 19s
Harness Replays / Harness Replays (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m19s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m14s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m23s
CI / Platform (Go) (pull_request) Successful in 4m19s
CI / all-required (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m14s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m8s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 6m8s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 9m18s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 12s
security-review / approved (pull_request_review) Successful in 12s
security-review / approved (pull_request_target) Approved by security-team-21 review 9868 on current head
audit-force-merge / audit (pull_request_target) Successful in 5s
2026-06-08 22:30:22 +00:00
devops-engineer daf536a0cb Merge PR #2440 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 4s
E2E API Smoke Test / detect-changes (push) Successful in 12s
CI / Detect changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 2s
CI / Platform (Go) (push) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 6s
E2E Chat / E2E Chat (push) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 15s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 12s
CI / Canvas Deploy Status (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
CI / all-required (push) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
review-check-tests / review-check.sh regression tests (push) Successful in 14s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m10s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m11s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m10s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 2m0s
publish-workspace-server-image / build-and-push (push) Successful in 3m24s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 3m54s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 1m13s
publish-workspace-server-image / Production auto-deploy (push) Failing after 3m55s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-08 22:26:13 +00:00
devops-engineer fb76309d84 fix expired-row-conflict starvation (expired queued row no longer blocks a fresh tick's enqueue) + content-security comment generalization; refs CR3 RC 9853
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 23s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Failing after 10s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 29s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request_target) Successful in 12s
CI / Canvas Deploy Status (pull_request) Successful in 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m5s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m42s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 8m23s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m4s
CI / Platform (Go) (pull_request) Successful in 8m24s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 14s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 18s
audit-force-merge / audit (pull_request_target) Successful in 11s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 22:13:35 +00:00
devops-engineer bd74ca1b1c fix(gate): exact-enum fail-closed approval validator + head_sha reconciliation
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 22s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 12s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 23s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 10s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m42s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m43s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m40s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m3s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m20s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 19s
qa-review / approved (pull_request_review) Successful in 25s
audit-force-merge / audit (pull_request_target) Successful in 5s
Exact-enum fail-closed hardening (SEV-1 internal#812): reject case-coerced
review.state. The previous validator used str(state or "").upper() at
_approval_validator.py lines 117/136/197-198, so a lowercase "approved" /
"request_changes" was coerced into the canonical value and ACCEPTED — a
residual fail-open a spoofed row could exploit. Now compares review.state
EXACTLY to the canonical Gitea-emitted constants STATE_APPROVED /
STATE_REQUEST_CHANGES (verified uppercase against the live reviews API
across real molecule-core PRs) on BOTH the approval and request_changes
paths, in is_genuine_approval, is_open_request_changes, and classify_reviews.
A case-variant/padded value is rejected (not counted as approval, and not
allowed to overwrite/erase a genuine current-head verdict in the reducer).
Added 4 regression tests (mutation-verified: reintroducing .upper() fails 7
assertions).

head_sha/headsha reconciliation folded in (fixes Ops Scripts Tests, #2424
drift): the genuine_approvals wrapper signature is `headsha` (matching the
SSOT module _approval_validator and test_approval_validator), but the
production call and the merge-queue tests passed `head_sha`, raising
TypeError: genuine_approvals() got an unexpected keyword argument 'head_sha'
(24 failures). Aligned all sites to the canonical `headsha`: the production
call at gitea-merge-queue.py:1124 and 4 calls in test_gitea_merge_queue.py.
Pure rename — no logic changed. .gitea/scripts pytest suite now 362 passed,
0 failures.

refs RCs 9849/9851/9852.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 21:59:17 +00:00
devops-engineer 6c043d27f0 Merge PR #2445 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
CI / Platform (Go) (push) Successful in 2s
CI / Python Lint & Test (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Has started running
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
E2E Chat / detect-changes (push) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 7s
Harness Replays / Harness Replays (push) Successful in 7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 37s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 33s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m5s
publish-workspace-server-image / build-and-push (push) Successful in 3m35s
publish-canvas-image / Build & push canvas image (push) Successful in 4m32s
E2E Chat / E2E Chat (push) Failing after 5m33s
CI / Canvas (Next.js) (push) Successful in 6m32s
CI / Canvas Deploy Status (push) Successful in 1s
CI / all-required (push) Successful in 4s
publish-canvas-image / Promote canvas :latest to CI-green build (push) Successful in 2m29s
publish-workspace-server-image / Production auto-deploy (push) Failing after 6m49s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-08 21:58:58 +00:00
devops-engineer 47a82381b4 fix(gate): classify_reviews validate-before-reduce (SEV-1 internal#812)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 17s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 11s
CI / Canvas Deploy Status (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
CI / all-required (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 20s
gate-check-v3 / gate-check (pull_request_target) Failing after 15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m21s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m25s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m15s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m50s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m20s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m25s
security-review / approved (pull_request_target) Review check failed via pull_request_review trigger
qa-review / approved (pull_request_target) Review check failed via pull_request_review trigger
security-review / approved (pull_request_review) Failing after 4s
qa-review / approved (pull_request_review) Failing after 7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m1s
The SSOT approval validator's classify_reviews() reduced FIRST and
validated AFTER (reduce-before-validate). It built latest_by_user[user]
keyed only on state in {APPROVED, REQUEST_CHANGES}, selecting the LATEST
row per user, and ONLY THEN ran the fail-closed predicate
(is_official_current_head: official / not-dismissed / not-stale /
commit_id present AND == head) on that single surviving row.

That ordering is exploitable:
  - A user posts a genuine current-head APPROVED, then posts a LATER
    INVALID row (a COMMENT, or APPROVED with a null/old commit_id). The
    invalid later row overwrites the genuine approval in latest_by_user
    -> the approval is masked/lost.
  - WORSE: a genuine current-head REQUEST_CHANGES can be OVERWRITTEN by a
    later invalid row from the same user, so it drops out of the
    request_changes set -> the block silently evaporates.

Fix: validate-before-reduce. Filter each review through the fail-closed
predicate (is_official_current_head AND state in {APPROVED,
REQUEST_CHANGES}) BEFORE the per-user latest selection, so an invalid
later row is never eligible to become a user's "latest" state and cannot
overwrite or erase a genuine review. A user's verdict is the state of
their latest VALID review. Genuine valid-row supersession
(APPROVED then later REQUEST_CHANGES on the same head) is preserved.

Signature and (set, list) return shape are unchanged, so both consumers
(gitea-merge-queue.py classify_reviews, review-check.sh via
_review_check_filter.py which only uses the per-review is_genuine_approval
and was never vulnerable) are unaffected.

Tests: add validate-before-reduce regression tests to
tests/test_approval_validator.py covering BOTH bypass cases (approval not
masked by a later COMMENT / null / stale commit_id row; REQUEST_CHANGES
not erased by a later invalid row), the invalid-later-APPROVED-must-not-
flip-a-block case, multi-user approver counting with an invalid later row,
and sanity tests that genuine valid-row supersession still works. Injecting
the old reducer makes 7 of these fail; the fix makes all pass.

Also fix a CI gap found while wiring this: review-check-tests.yml ran
`unittest discover -s .gitea/scripts -p test_approval_validator.py`, which
finds 0 tests (the file lives in tests/ with no __init__.py) -- the SEV-1
suite silently never ran. Invoke the file directly so failures fail CI.

3 reviewers REQUEST_CHANGES with this precise diagnosis: CR2 (#9846),
Researcher (#9847), CR3 (#9848).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 21:33:30 +00:00
devops-engineer 2e69e48a4e fix(scheduler): enqueue cron ticks on busy agents instead of dropping them
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 18s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 21s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 22s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 39s
Harness Replays / detect-changes (pull_request) Successful in 14s
E2E Chat / E2E Chat (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
CI / Canvas Deploy Status (pull_request) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
gate-check-v3 / gate-check (pull_request_target) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m12s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m17s
CI / Platform (Go) (pull_request) Successful in 4m16s
CI / all-required (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m36s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m8s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m29s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 7s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 10s
When a workspace agent is busy (active_tasks >= max_concurrent_tasks), A2A
dispatches already buffer durably into the a2a_queue table and get picked up
when the agent frees. Scheduled/cron ticks did NOT: fireSchedule polled every
10s for up to 2 min and then called recordSkipped(), dropping the tick. On
perpetually-busy workspaces (e.g. leaders kept busy by the Orchestrator pulse
delegation chain) this dropped ~30% of scheduled fires while A2A work buffered.

Now, on busy, fireSchedule ENQUEUES the cron message into the durable a2a_queue
via EnqueueA2A (the same path A2A uses) with the SAME a2aBody the fire path
builds, method "message/send", priority PriorityTask. The heartbeat drain then
dispatches it serially when the agent frees. Execution stays one-at-a-time;
max_concurrent_tasks is unchanged — this is purely about buffering ticks.

Idempotency key = schedule_id (NOT a random uuid / messageId). The a2a_queue
partial-unique index idx_a2a_queue_idempotency dedups on
(workspace_id, idempotency_key) for status IN ('queued','dispatched'), so a busy
agent buffers AT MOST ONE pending tick per schedule — the latest — instead of
stacking a stale backlog of one-tick-per-poll. We hold the next tick, not a pile
of obsolete ones.

Enqueue happens immediately on busy (the 2-min poll-wait is removed): durable
buffering makes the wait pointless and the wait blocked a scheduler goroutine.
Buffered ticks get expiresAt = next scheduled fire so a tick stuck past its own
next cron slot expires rather than firing stale. If EnqueueA2A errors we fall
back to recordSkipped so liveness still advances and the operator sees it.

Seam: handlers imports scheduler, so scheduler cannot import handlers (cycle).
The scheduler's existing A2AProxy interface (held as s.proxy, satisfied by
*WorkspaceHandler) is extended with an EnqueueA2A method that delegates to the
package-level handlers.EnqueueA2A — no new import, no cycle. priorityTask is a
local const mirroring handlers.PriorityTask for the same reason.

Adds recordQueued (mirrors recordSkipped, last_status='queued') and a
fireSchedule busy-path unit test asserting enqueue-not-fire-not-skip with
idempotency_key=schedule_id. All test proxy doubles gain the EnqueueA2A method.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 21:19:23 +00:00
agent-reviewer fab500990c Merge pull request 'feat(canvas): fly an envelope between agents on each delegate/message' (#2443) from feat/a2a-message-flight-envelope into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Has started running
CI / Python Lint & Test (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 8s
CI / Detect changes (push) Successful in 14s
Harness Replays / detect-changes (push) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
CI / Platform (Go) (push) Successful in 2s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Harness Replays / Harness Replays (push) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 16s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 1m28s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 1m21s
E2E Chat / E2E Chat (push) Failing after 3m16s
publish-workspace-server-image / build-and-push (push) Successful in 3m34s
publish-canvas-image / Build & push canvas image (push) Successful in 4m17s
publish-canvas-image / Promote canvas :latest to CI-green build (push) Has started running
CI / Canvas (Next.js) (push) Successful in 10m10s
CI / Canvas Deploy Status (push) Successful in 3s
CI / all-required (push) Successful in 3s
publish-workspace-server-image / Production auto-deploy (push) Failing after 10m39s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 27m42s
2026-06-08 21:14:21 +00:00
Molecule AI Dev Engineer A (Kimi) 1c4672645f test(e2e): enter Org-map view before waiting for .react-flow__node (fixes #2442)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 10s
CI / Platform (Go) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m12s
CI / Canvas (Next.js) (pull_request) Successful in 7m7s
CI / Canvas Deploy Status (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Failing after 7s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 7s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 7s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request_target) Successful in 20s
The ConciergeShell desktop entrypoint defaults to topView: home, so
the Canvas (React Flow graph) only mounts when topView === map.
Before each .react-flow__node wait, click the Org map nav button via
data-testid=nav-map to switch to the map view.

Test plan:
- npx playwright test canvas/e2e/chat-desktop.spec.ts passes
2026-06-08 21:10:03 +00:00
core-devops 954fee28f4 feat(canvas): fly an envelope between agents on each delegate/message
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Successful in 12s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
sop-checklist / na-declarations (pull_request) N/A: (none)
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8s
sop-checklist / all-items-acked (pull_request_target) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m24s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 8s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 13s
CI / Canvas (Next.js) (pull_request) Successful in 6m39s
CI / Canvas Deploy Status (pull_request) Successful in 9s
CI / all-required (pull_request) Successful in 7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m3s
audit-force-merge / audit (pull_request_target) Successful in 5s
When one agent delegates to or messages another, an envelope now animates from
the source agent to the target agent — on the spatial canvas (it tracks
pan/zoom) and in the concierge home agent tree.

- `useA2AFlights` (new hook): subscribes to the same ACTIVITY_LOGGED WS bus the
  CommunicationOverlay uses, turns each a2a_send / a2a_receive / task_update
  into a transient "flight" (source -> target), bounded + auto-expiring. Honours
  prefers-reduced-motion (emits no flights), skips self-loops, caps concurrency.
- `FlightEnvelope` (new, shared): one envelope animated from -> to via the Web
  Animations API (dynamic per-flight delta), coloured by kind (send=cyan,
  receive=violet, task=warm) to match CommunicationOverlay.
- `MessageFlightLayer` (canvas): renders flights inside <ReactFlow> via
  ViewportPortal, so envelopes live in flow coordinates and pan/zoom for free.
  Resolves node centres from the store. Also covers the concierge "Org map"
  (which embeds <Canvas/>).
- `MessageFlightHome` (concierge home): a fixed overlay that flies the envelope
  between agent-tree ROW rects (rows now carry data-ws-id); captures rects once
  per flight so a scroll mid-flight doesn't restart the animation.

Tests: useA2AFlights 7/7 (event->flight, kind mapping, ignore non-a2a,
self-loop/no-target skip, reduced-motion, disabled, TTL expiry). Canvas +
concierge tests unaffected (the layer renders nothing when idle). Typecheck
clean for the new files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 14:04:23 -07:00
Molecule AI Dev Engineer B (MiniMax) 4184057ec7 fix(gate): SEV-1 fail-closed approval-validator (SEV-1 internal#812)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 13s
E2E Chat / E2E Chat (pull_request) Successful in 4s
review-check-tests / review-check.sh regression tests (pull_request) Failing after 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 34s
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas Deploy Status (pull_request) Successful in 2s
sop-checklist / review-refire (pull_request_target) Has been skipped
CI / all-required (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 53s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m0s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m25s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m16s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m9s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 31s
qa-review / approved (pull_request_target) Review check failed via pull_request_review trigger
qa-review / approved (pull_request_review) Failing after 10s
security-review / approved (pull_request_target) Review check failed via pull_request_review trigger
security-review / approved (pull_request_review) Failing after 13s
Resolves SEV-1 spoofed-reviewer SEV-1 (internal#812, supersedes closed
internal#843). Consolidates the approval-validity predicate into a
SINGLE shared function (SSOT) and applies the SAME fail-closed
contract at BOTH approval-counting sites:

  - .gitea/scripts/gitea-merge-queue.py (Python, merge-queue 2-genuine)
  - .gitea/scripts/review-check.sh       (bash, qa-review / security-review)

## The bug

The pre-fix gitea-merge-queue.py predicate had a guard
  if isinstance(commit_id, str) and commit_id and headsha:
which SKIPPED the commit_id check when the review carried no
commit_id. A missing commit_id is the Gitea row signature of a
spoofed or pre-commit review — a real reviewer cannot have
submitted against a commit that doesn't exist. Accepting these
silently weakened the documented 2-genuine floor below the merge
bar. CR2 + Researcher both flagged this on the closed #843 PR
revert; this commit closes the gap.

## The fix

A review counts as a GENUINE APPROVED on the current head ONLY IF ALL hold:
  1. state == APPROVED
  2. oficial is True (rejects comment-based / non-official reviews)
  3. dismissed != true
  4. stale != true
  5. commit_id is present and equals the PR's current head SHA

Any failure of any of the above REJECTS. FAIL-CLOSED. There is NO
'no commit_id is accepted for backward-compat' branch.

## SSOT location

.gitea/scripts/_approval_validator.py (new file, 187 lines)

Both consumer sites import it (Python) or shell out to
.gitea/scripts/_review_check_filter.py (new file, 74 lines) which
imports the same module. NO per-repo copy of the predicate, NO jq
copy in review-check.sh, NO inline predicate body in either
consumer. A reviewer who wants to weaken the gate has to weaken
this one file.

## Consumers

  - .gitea/scripts/gitea-merge-queue.py: genuine_approvals() is now
    a 5-line wrapper that delegates to _approval_validator.classify_reviews.
    The wrapper exists only to keep the call-site symbol stable.
  - .gitea/scripts/review-check.sh: the inline jq filter is gone. The
    script calls _review_check_filter.py, which applies the same
    is_genuine_approval predicate. No bash/jq copy of the predicate
    remains. The MISFILED_FILTER (internal#503 informational detection)
    is unchanged.

## Mutation-verified tests

.gitea/scripts/tests/test_approval_validator.py (new, 410 lines, 35 cases)
covers every fail-closed branch with an EXPLICIT REJECT assertion.

## Bash regression suite (review-check.sh)

Carry over the closed #843's T1-T22 + add T23 (missing commit_id) —
the SEV-1 case. The fixture helper gains a T23_missing_commit_id
scenario; test_review_check.sh adds an end-to-end assertion that
the review-check.sh pipeline exits 1 on a review with NO
commit_id field.

## CI workflow

.gitea/workflows/review-check-tests.yml: path triggers expanded
to include _approval_validator.py, _review_check_filter.py, and
the new test_approval_validator.py. A new CI step runs the
SSOT unit tests alongside the existing review-check.sh bash suite.

## SSOT location + consumers (per PM spec)

  SSOT: .gitea/scripts/_approval_validator.py

  Consumer 1: .gitea/scripts/gitea-merge-queue.py
    -> imports classify_reviews from the SSOT.

  Consumer 2: .gitea/scripts/review-check.sh
    -> shells out to _review_check_filter.py, which imports
      is_genuine_approval from the SSOT.

Both consumers call the same predicate. There is no per-repo
copy to drift. The other molecule-* repos (controlplane, runtime,
template-*) mirror molecule-core's gate scripts; once this lands,
they pick up the SSOT via the same import path. PM flagged this
as the desired topology in the spec.

## Governance

- 2 distinct genuine reviews required (any 2 of CR2 / Researcher / CR3)
- qa-review, security-review, sop-checklist, gate-check all required
- CI / all-required aggregate gating the merge
- No self-merge
2026-06-08 20:20:43 +00:00
agent-researcher 8ea853b687 Merge pull request #2420: feat(ws-server): validate compute.provider vs cloud-provider SSOT (switch-provider PR1)
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Blocked by required conditions
CI / Python Lint & Test (push) Successful in 4s
CI / Detect changes (push) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
E2E API Smoke Test / detect-changes (push) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Has started running
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Has started running
E2E Chat / detect-changes (push) Successful in 19s
Harness Replays / detect-changes (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (push) Has started running
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 17s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
CI / Canvas Deploy Status (push) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m15s
publish-workspace-server-image / build-and-push (push) Successful in 3m59s
CI / Platform (Go) (push) Successful in 4m21s
CI / all-required (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m7s
E2E Chat / E2E Chat (push) Failing after 5m18s
publish-workspace-server-image / Production auto-deploy (push) Failing after 1h0m20s
Reviewer/API merge after stranded SOP status refire; current-head approvals 9550 and 9554, required gates green.
2026-06-08 20:12:43 +00:00
agent-reviewer c0d5225970 Merge pull request 'fix(canvas/concierge): truncate agent role to one line, full text on hover' (#2436) from fix/concierge-role-truncate into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
E2E Chat / detect-changes (push) Successful in 12s
Harness Replays / detect-changes (push) Successful in 6s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
CI / Platform (Go) (push) Successful in 11s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 11s
Harness Replays / Harness Replays (push) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 21s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 19s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m12s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 1m15s
publish-canvas-image / Build & push canvas image (push) Successful in 1m49s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 56s
E2E Chat / E2E Chat (push) Failing after 3m25s
publish-workspace-server-image / build-and-push (push) Successful in 3m48s
CI / Canvas (Next.js) (push) Successful in 7m2s
CI / Canvas Deploy Status (push) Successful in 7s
CI / all-required (push) Successful in 8s
publish-canvas-image / Promote canvas :latest to CI-green build (push) Successful in 8m50s
publish-workspace-server-image / Production auto-deploy (push) Failing after 10m21s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 21m33s
2026-06-08 19:50:40 +00:00
devops-engineer b5a60dac26 Merge PR #2435 via Gitea merge queue
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 14s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 18s
publish-workspace-server-image / build-and-push (push) Has started running
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Successful in 6s
CI / Canvas (Next.js) (push) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 35s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (push) Successful in 23s
Harness Replays / detect-changes (push) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 55s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 1m18s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m10s
CI / Canvas Deploy Status (push) Successful in 4s
Harness Replays / Harness Replays (push) Successful in 4s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push) Failing after 3m14s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push) Failing after 3m18s
E2E Chat / E2E Chat (push) Failing after 3m20s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m20s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push) Failing after 5m21s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m13s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push) Failing after 5m53s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 8m29s
CI / Platform (Go) (push) Successful in 8m35s
CI / all-required (push) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 32m38s
Serialized merge by gitea-merge-queue after current-main, genuine approvals, and required CI checks were green.
2026-06-08 17:58:55 +00:00
devops-engineer 4951225d7c Merge pull request 'fix(gate): correct GOVERNANCE_REQUIRED_CONTEXTS to use (pull_request_target) — unblocks ~16 PRs' (#2424) from fix/gate-context-target-suffix into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 10s
CI / Python Lint & Test (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
CI / Detect changes (push) Successful in 14s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 9s
CI / Platform (Go) (push) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 5s
E2E Chat / E2E Chat (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 25s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 36s
CI / Canvas Deploy Status (push) Successful in 3s
CI / all-required (push) Successful in 5s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 1m18s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m40s
publish-workspace-server-image / build-and-push (push) Successful in 3m6s
publish-workspace-server-image / Production auto-deploy (push) Failing after 4m0s
2026-06-08 17:46:35 +00:00
core-devops ce482fc0fc fix(canvas/concierge): truncate agent role to one line, full text on hover
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / detect-changes (pull_request) Successful in 22s
E2E API Smoke Test / detect-changes (pull_request) Successful in 23s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 19s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m49s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m9s
CI / Canvas (Next.js) (pull_request) Successful in 6m58s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 10s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
gate-check-v3 / gate-check (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 5s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 10s
audit-force-merge / audit (pull_request_target) Successful in 9s
The concierge agent tree rendered each agent's `role` with no truncation, so a
long descriptor (e.g. "Coding Executor (Kimi) — implements specs only; NO review
/ RCA / decisions / delegation") wrapped into a tall multi-line block, making the
tree hard to scan.

Render the role compact: `.wsRole` now clamps to a single line with ellipsis
(and `.wsStatus` no longer gets squeezed), and the row sets `title={roleLabel}`
so the full text is available on hover via the native tooltip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 08:57:56 -07:00
Molecule AI Dev Engineer A (Kimi) 7eea51be77 fix(registry): case-fold + trim trailing dot in isPlatformTunnelHostname (#2429)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 37s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m58s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m13s
CI / Platform (Go) (pull_request) Successful in 4m22s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m22s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Failing after 8s
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 4s
security-review / approved (pull_request_review) Successful in 4s
audit-force-merge / audit (pull_request_target) Successful in 8s
DNS hostnames are case-insensitive and FQDN form may carry a trailing dot.

Lowercase the input and trim a trailing dot before checking the ws- prefix and

platform-domain suffix. Also normalize the configured MOLECULE_APP_DOMAIN the same

way so uppercase env values don't break matching.

Adds uppercase, trailing-dot, and combined test cases.

Fixes #2429
2026-06-08 12:18:49 +00:00
devops-engineer 2902b4ce28 Merge pull request 'fix(provisioner): thread provider into IsRunning status call, fail-closed on lookup error (#2386 sibling-leak)' (#2389) from fix/provider-on-isrunning-status into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 16s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 10s
Block internal-flavored paths / Block forbidden paths (push) Successful in 26s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (push) Successful in 26s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 53s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 16s
publish-workspace-server-image / build-and-push (push) Successful in 3m41s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 1m23s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 1m36s
CI / Canvas (Next.js) (push) Successful in 2s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 1s
Harness Replays / Harness Replays (push) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push) Failing after 2m42s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push) Failing after 3m29s
CI / Canvas Deploy Status (push) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push) Failing after 5m35s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push) Failing after 5m48s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m14s
E2E Chat / E2E Chat (push) Failing after 3m15s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5m7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 8m35s
CI / Platform (Go) (push) Successful in 7m59s
CI / all-required (push) Successful in 1s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Failing after 7m5s
publish-workspace-server-image / Production auto-deploy (push) Failing after 13m8s
2026-06-08 11:05:20 +00:00
Molecule AI Dev Engineer A (Kimi) 864834fb87 Merge branch 'main' into fix/provider-on-isrunning-status
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 14s
gate-check-v3 / gate-check (pull_request_target) Successful in 13s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m23s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m54s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6m8s
CI / Platform (Go) (pull_request) Successful in 9m2s
CI / all-required (pull_request) Successful in 7s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Has started running
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 20s
audit-force-merge / audit (pull_request_target) Successful in 9s
2026-06-08 10:30:38 +00:00
Molecule AI Dev Engineer A (Kimi) b36633fabe Merge branch 'main' into fix/2421-heartbeat-backfill-agent-card
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
E2E Chat / detect-changes (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Harness Replays / detect-changes (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
audit-force-merge / audit (pull_request_target) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 39s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m19s
CI / Canvas Deploy Status (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m20s
CI / all-required (pull_request) Successful in 2s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m23s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
gate-check-v3 / gate-check (pull_request_target) Failing after 10s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 4s
qa-review / approved (pull_request_review) Successful in 6s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 20s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 24s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 58s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 56s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m37s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 1h4m59s
E2E API Smoke Test / detect-changes (pull_request) Has been cancelled
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has been cancelled
2026-06-08 10:30:09 +00:00
Molecule AI Dev Engineer A (Kimi) e55e641d18 trigger: re-run sop-checklist pull_request
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
qa-review / approved (pull_request_target) Failing after 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
security-review / approved (pull_request_target) Failing after 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 1s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
CI / Canvas Deploy Status (pull_request) Successful in 1s
gate-check-v3 / gate-check (pull_request_target) Failing after 21s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m26s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m15s
CI / Platform (Go) (pull_request) Successful in 4m11s
CI / all-required (pull_request) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m23s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
2026-06-08 09:43:28 +00:00
core-devops f91583efa0 Merge pull request 'feat(canvas): Org Concierge — concept reskin + self-host platform-agent backend (BYOK · user-tasks · boot-provision)' (#2385) from feat/canvas-concierge-ui into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Python Lint & Test (push) Successful in 4s
CI / Detect changes (push) Successful in 8s
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 15s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 37s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (push) Has been skipped
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (push) Successful in 23s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 7s
publish-canvas-image / Build & push canvas image (push) Successful in 1m50s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m49s
publish-workspace-server-image / build-and-push (push) Successful in 3m35s
E2E Chat / E2E Chat (push) Failing after 3m19s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m13s
Secret scan / Scan diff for credential-shaped strings (push) Has started running
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m27s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push) Failing after 2m40s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push) Waiting to run
CI / Canvas (Next.js) (push) Successful in 6m36s
Harness Replays / Harness Replays (push) Successful in 1s
CI / Platform (Go) (push) Successful in 8m53s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m55s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 7m28s
CI / Canvas Deploy Status (push) Successful in 3s
CI / all-required (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 26m52s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
publish-canvas-image / Promote canvas :latest to CI-green build (push) Failing after 1h0m33s
publish-workspace-server-image / Production auto-deploy (push) Failing after 1h0m12s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (push) Successful in 42s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 8m17s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push) Failing after 13m58s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push) Failing after 14m7s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push) Failing after 14m10s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push) Failing after 14m14s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-06-08 09:10:26 +00:00
agent-dev-a cc745700e8 Merge pull request 'fix(queue): use label= (singular) not labels= (plural) for Gitea 1.22.6 API (#1306)' (#2412) from fix/1306-gitea-label-singular into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 11s
CI / Detect changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
CI / Platform (Go) (push) Successful in 2s
CI / Canvas (Next.js) (push) Successful in 2s
Handlers Postgres Integration / detect-changes (push) Successful in 13s
E2E API Smoke Test / detect-changes (push) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
E2E Chat / E2E Chat (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 14s
CI / Canvas Deploy Status (push) Successful in 8s
CI / all-required (push) Successful in 8s
Ops Scripts Tests / Ops scripts (unittest) (push) Failing after 48s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m26s
publish-workspace-server-image / build-and-push (push) Successful in 6m18s
publish-workspace-server-image / Production auto-deploy (push) Failing after 3m46s
2026-06-08 08:02:01 +00:00
core-devops e6b6ec519c ci: revert coverage-gate split — measured peak is 1.33 GB, there was no OOM
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 9s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Check migration collisions / Migration version collision check (pull_request) Successful in 41s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 42s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 1m10s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 2m5s
E2E Chat / E2E Chat (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m45s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 2m3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 3m26s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 23s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Failing after 1m16s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m47s
Harness Replays / Harness Replays (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m10s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m30s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m43s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m34s
CI / Canvas (Next.js) (pull_request) Successful in 6m59s
CI / Canvas Deploy Status (pull_request) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m52s
CI / Platform (Go) (pull_request) Successful in 9m55s
CI / all-required (pull_request) Successful in 12s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m10s
qa-review / approved (pull_request_review) Successful in 5s
security-review / approved (pull_request_review) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m14s
audit-force-merge / audit (pull_request_target) Successful in 11s
gate-check-v3 / gate-check (pull_request_target) Has been cancelled
Evidence-first correction (SOP). My earlier commit split the Canvas gate into a
plain "vitest run" + a separate continue-on-error coverage step, on the theory
that "vitest run --coverage" was OS-OOM-killing the runner. Measuring the actual
footprint disproves that:

  full vitest + v8-coverage process TREE peak RSS = 1.33 GB (3358 tests)

(The first measurement of 0.56 GB only saw the parent process; 1.33 GB is the
whole tree incl. the worker fork.) 1.33 GB is comfortably within the runner, and
the single "vitest run --coverage" gate was green on the prior head 3b1e705e — so
there is no chronic coverage OOM. The two reds on b1da1456 were (a) the DisplayTab
paste-race (real, fixed in this PR) and (b) an incomplete attempt-1 log captured
when the re-run was triggered, NOT a kill.

So the split was a workaround for a misdiagnosed problem. Restore the SINGLE
"npx vitest run --coverage" as the gate+coverage SSOT (one invocation, html
artifact preserved, coverage config untouched in its proper home). The genuine
fix — DisplayTab waiting for the RFB connect before pasting — stays.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 00:47:03 -07:00
core-devops 3de9e05076 ci/test: fix DisplayTab paste-race + decouple memory-heavy coverage from the Canvas gate
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 21s
E2E Chat / detect-changes (pull_request) Successful in 19s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
Check migration collisions / Migration version collision check (pull_request) Successful in 46s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 43s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 17s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 23s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1m7s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m3s
E2E Chat / E2E Chat (pull_request) Successful in 6s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m17s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 2m36s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m43s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 2m27s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 25s
gate-check-v3 / gate-check (pull_request_target) Has started running
qa-review / approved (pull_request_target) Has started running
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 35s
security-review / approved (pull_request_target) Has started running
sop-checklist / review-refire (pull_request_target) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
Harness Replays / Harness Replays (pull_request) Has started running
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m23s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m19s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m34s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4m57s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8m3s
CI / Platform (Go) (pull_request) Successful in 11m21s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
CI / Canvas (Next.js) (pull_request) Successful in 16m46s
CI / Canvas Deploy Status (pull_request) Successful in 7s
CI / all-required (pull_request) Successful in 7s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Has been cancelled
sop-checklist / all-items-acked (pull_request_target) Has been cancelled
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Has been cancelled
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Has been cancelled
Two pre-existing Canvas-gate fragilities (both on main, surfaced by #2385's CI)
that blocked the required CI / all-required gate on resource/timing, not on a
real test result:

1. DisplayTab.test.tsx "forwards browser paste events into the noVNC clipboard"
   raced: it fired paste as soon as the "Workspace desktop" title rendered, but
   the component sets rfbRef.current synchronously after new RFB() INSIDE the
   async connect() (which awaits a lease/token first). When the race lost under
   CI runner load, the window paste handler's rfbRef.current?.clipboardPasteFrom
   no-op'd -> 0 calls. Wait for mockRFBConstructor before pasting -> deterministic.

2. The Canvas gate ran "npx vitest run --coverage" as the pass/fail step. v8
   coverage + JSDOM under vitest maxWorkers:1 accumulates memory across all 228
   files and OS-OOM-killed the run mid-suite on the shared runner. Split: the
   GATE is now plain "npx vitest run" (light, deterministic); coverage moves to a
   separate continue-on-error artifact step (no threshold gate per #1815, so it
   was never a real gate). Removes the OOM from the required path.

Verified: DisplayTab 13/13 (5x); full canvas suite 3358/0; coverage run still
produces the artifact when memory allows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 00:25:48 -07:00
core-devops b1da145611 fix(security): prevent ordinary workspace from self-minting a second org root (priv-esc)
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 21s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 55s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
qa-review / approved (pull_request_target) Failing after 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E Chat / E2E Chat (pull_request) Successful in 6s
security-review / approved (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 34s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request_target) Successful in 31s
Harness Replays / Harness Replays (pull_request) Successful in 11s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m16s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 22s
Harness Replays / detect-changes (pull_request) Successful in 10s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 6m21s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 4m2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 4m18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 41s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 25s
Check migration collisions / Migration version collision check (pull_request) Successful in 1m30s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 2m38s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m20s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 2m27s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 32s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 1m24s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m8s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 6m18s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m29s
CI / Detect changes (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
CI / Platform (Go) (pull_request) Successful in 4m2s
CI / Canvas (Next.js) (pull_request) Failing after 8m40s
CI / all-required (pull_request) Has been skipped
CI / Canvas Deploy Status (pull_request) Has been skipped
Independent security review of #2385 found a privilege-escalation path: POST
/registry/register is bootstrap-allowed for a fresh workspace id and wrote the
caller-supplied kind, while workspaces_platform_root_check only enforces
'platform => parent_id IS NULL' (NOT a single root). So an ordinary in-VPC
workspace could register a fresh UUID as {"kind":"platform"}, mint a second
org root, and POST /workspaces/:id/restart it — the shared provision path then
injects MOLECULE_API_KEY=ADMIN_TOKEN (tenant-wide org-admin credential) into any
kind='platform' workspace, on self-host AND SaaS. That breaks the invariant that
only the concierge gets the org MCP + admin token.

Defense in depth:
- migration 20260607000000_one_platform_root: partial UNIQUE index
  (kind) WHERE kind='platform' — at most one platform root per (single-org)
  tenant DB. isPlatformRootViolation now also maps the 23505 to a friendly 409.
- registry.go Register: app-layer guard refusing to CREATE or PROMOTE a row to
  kind='platform' via the public path (reserve that for the AdminAuth/boot-gated
  install paths); a platform agent re-registering its already-platform row is
  unaffected. Placed after the token check to avoid side-channeling row existence.
- corrected the false 'CHECK structurally guarantees one per org' claims in the
  20260606 migration + integration-test header.

Tests:
- registry_test.go: rejects fresh kind=platform (403), rejects workspace->platform
  promotion (403), allows already-platform re-register (200).
- kind_platform_root_integration_test.go: real-PG test that a SECOND platform
  root is rejected by the unique index (the CHECK alone accepts it).
- canvas-topology-pure.test.ts: cover stripPlatformRootForMap (QA HIGH gap) —
  abs-position reparent math, platform-edge drop, grandchild preservation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 23:24:09 -07:00
Molecule AI Dev Engineer A (Kimi) 148aa9e1b7 Merge remote-tracking branch 'origin/main' into fix/provider-on-isrunning-status
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 11s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Failing after 49s
sop-checklist / all-items-acked (pull_request_target) Has been cancelled
sop-checklist / review-refire (pull_request_target) Has been cancelled
CI / Canvas Deploy Status (pull_request) Successful in 4s
qa-review / approved (pull_request_target) Failing after 8s
security-review / approved (pull_request_target) Failing after 16s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m1s
CI / Platform (Go) (pull_request) Successful in 7m33s
CI / all-required (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Waiting to run
2026-06-08 06:19:31 +00:00
core-devops 3b1e705e8b ci(concierge): fix Canvas reduced-motion test target + bp directives + local-provision port-squatter flake
security-review / approved (pull_request_target) Failing after 4s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m39s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m26s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m46s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m21s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m53s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 6m23s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m31s
CI / Canvas (Next.js) (pull_request) Successful in 6m32s
CI / Detect changes (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 4m3s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2s
Check migration collisions / Migration version collision check (pull_request) Successful in 23s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 37s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 23s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 28s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 14s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m20s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 21s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m0s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 45s
qa-review / approved (pull_request_target) Failing after 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
- reduced-motion.test.ts: the connection-status pulse dot moved from
  SidePanel.tsx into the extracted WorkspacePanelTabs.tsx; retarget the
  motion-safe:animate-pulse assertion to where the guarded indicator now
  lives (was the only red in CI / Canvas -> gates CI / all-required).
- e2e-staging-saas.yml: add bp directives to the 4 new concierge jobs the
  Tier-2g lint flagged — bp-required: pending #2430 for the three real
  push-time staging e2e jobs (creates-workspace / platform / user-tasks,
  aspiring gates sharing the cp#245 de-flake surface), bp-exempt for the
  PR-time compile-only job. #2187 (the sibling's tracker) is closed/unrelated.
- local-provision-e2e.yml (no-flakes RCA): the :8080 kill-step only matched
  procs *named* platform-server, so a differently-named squatter survived,
  our bind went FATAL, and the /health loop false-positived against the
  squatter. Free :8080 from ANY holder (fuser/lsof) and verify our own PID
  owns the port BEFORE trusting /health, in both the stub and real jobs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 23:01:32 -07:00
core-devops bde54b48a9 Merge remote-tracking branch 'origin/main' into feat/canvas-concierge-ui 2026-06-07 22:54:53 -07:00
Molecule AI Dev Engineer A (Kimi) 008ddb9942 fix(registry): heartbeat backfills agent_card when NULL (#2421)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 21s
qa-review / approved (pull_request_target) Failing after 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
security-review / approved (pull_request_target) Failing after 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 43s
E2E Chat / E2E Chat (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m15s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
CI / Canvas Deploy Status (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m11s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m13s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m28s
CI / Platform (Go) (pull_request) Successful in 7m6s
CI / all-required (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 15m39s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
sop-checklist / na-declarations (pull_request) N/A: (none)
When a workspace's initial /registry/register fails (e.g. DNS propagation
race on fast-cloud provisioners), the agent_card never lands and the agent
stays offline. The runtime already sends agent_card in later heartbeats,
but the heartbeat handler ignored it.

- Add AgentCard to HeartbeatPayload (optional, omitempty).
- In Heartbeat handler, UPDATE agent_card ONLY when the DB row has NULL
  agent_card. Never overwrites an existing reconciled card.
- Add tests for backfill-when-null and skip-when-already-set.

Fixes #2421 (option a)
2026-06-08 05:32:09 +00:00
agent-dev-a a448c1304a Merge pull request 'fix(channels): remove duplicate EncryptSensitiveFields + add rows.Err test (#1221)' (#2413) from fix/1221-channels-rowserr-dedup-encrypt into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 19s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 17s
E2E Chat / detect-changes (push) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
Harness Replays / Harness Replays (push) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m20s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m17s
E2E Chat / E2E Chat (push) Successful in 2m27s
publish-workspace-server-image / build-and-push (push) Successful in 4m10s
publish-workspace-server-image / Production auto-deploy (push) Failing after 1h0m10s
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas (Next.js) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Shellcheck (E2E scripts) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas Deploy Status (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-06-08 05:21:32 +00:00
agent-dev-a 251d36d47d Merge pull request 'test(gate-check): explicit missing/pending required-context fail-closed coverage (#2403 CR2+Researcher)' (#2423) from feat/2403-remove-sop-tier-system into main
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 15s
Block internal-flavored paths / Block forbidden paths (push) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (push) Has started running
Handlers Postgres Integration / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 16s
E2E API Smoke Test / detect-changes (push) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (push) Has started running
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 30s
E2E Chat / E2E Chat (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m41s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas (Next.js) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Shellcheck (E2E scripts) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas Deploy Status (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Python Lint & Test (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 4m3s
publish-workspace-server-image / build-and-push (push) Successful in 4m12s
publish-workspace-server-image / Production auto-deploy (push) Failing after 1h0m13s
2026-06-08 05:20:25 +00:00
agent-dev-a b197e5c383 Merge pull request 'feat(2403): complete SOP tier removal — salvage non-tier fixes + zero tier refs' (#2419) from feat/2403-complete-tier-removal into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 10s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Has started running
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Has started running
Handlers Postgres Integration / detect-changes (push) Successful in 19s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 33s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 32s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m28s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
E2E Chat / E2E Chat (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas (Next.js) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Shellcheck (E2E scripts) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas Deploy Status (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Python Lint & Test (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Ops Scripts Tests / Ops scripts (unittest) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m18s
publish-workspace-server-image / build-and-push (push) Successful in 8m24s
publish-workspace-server-image / Production auto-deploy (push) Waiting to run
2026-06-08 05:20:03 +00:00
agent-dev-a cd7f51dbe6 Merge pull request 'fix(scripts): validate AWS region + ECR account ID in promote-tenant-image (#676)' (#2418) from fix/676-promote-tenant-image-region-exit64 into main
Block internal-flavored paths / Block forbidden paths (push) Has started running
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 17s
E2E API Smoke Test / detect-changes (push) Successful in 18s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
E2E Chat / E2E Chat (push) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas (Next.js) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Shellcheck (E2E scripts) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas Deploy Status (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Python Lint & Test (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Ops Scripts Tests / Ops scripts (unittest) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m19s
publish-workspace-server-image / build-and-push (push) Successful in 8m40s
publish-workspace-server-image / Production auto-deploy (push) Successful in 17s
2026-06-08 05:19:40 +00:00
agent-dev-a 761563f04e Merge pull request 'fix(canvas/e2e): tolerate transient 'failed' status during boot (#2032)' (#2417) from fix/2032-canvas-e2e-transient-failed-tolerance into main
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
E2E API Smoke Test / detect-changes (push) Has started running
E2E Chat / detect-changes (push) Has started running
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Has started running
publish-canvas-image / Promote canvas :latest to CI-green build (push) Blocked by required conditions
publish-canvas-image / Build & push canvas image (push) Has started running
Handlers Postgres Integration / detect-changes (push) Has started running
CI / Detect changes (push) Successful in 21s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 21s
Harness Replays / detect-changes (push) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 2s
publish-workspace-server-image / build-and-push (push) Successful in 4m15s
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas (Next.js) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Shellcheck (E2E scripts) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas Deploy Status (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Python Lint & Test (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / Production auto-deploy (push) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 12m2s
2026-06-08 05:19:15 +00:00
agent-dev-a 5c5ec2c5a5 Merge pull request 'fix(sop-checklist): normalize memory marker + body-unfilled informational (#1973 #1974)' (#2416) from fix/sop-checklist-1973-1974-ops-marker-render into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 16s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Has started running
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 16s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
ci-arm64-advisory / fast-checks (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Platform (Go) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas (Next.js) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Shellcheck (E2E scripts) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Canvas Deploy Status (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Secret scan / Scan diff for credential-shaped strings (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 33s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
E2E Chat / E2E Chat (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m11s
publish-workspace-server-image / build-and-push (push) Successful in 4m33s
publish-workspace-server-image / Production auto-deploy (push) Successful in 12s
2026-06-08 05:18:58 +00:00
devops-engineer dbdced6aa9 Merge pull request 'fix(registry): allow pending-DNS platform tunnel URL at register (#36 register half)' (#2425) from fix/validate-agent-url-pending-tunnel into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Python Lint & Test (push) Successful in 4s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 12s
E2E Chat / detect-changes (push) Successful in 12s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
CI / Canvas Deploy Status (push) Successful in 2s
Harness Replays / Harness Replays (push) Successful in 2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 36s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m19s
publish-workspace-server-image / build-and-push (push) Successful in 3m20s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m25s
E2E Chat / E2E Chat (push) Successful in 4m46s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m29s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push) Waiting to run
CI / Platform (Go) (push) Successful in 8m33s
CI / all-required (push) Successful in 3s
publish-workspace-server-image / Production auto-deploy (push) Failing after 9m8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
2026-06-08 04:44:04 +00:00
hongming-personal 644734bb7c fix(registry): allow pending-DNS platform tunnel URL at register (#36/#2421)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 35s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 10s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 30s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas Deploy Status (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m59s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Has been cancelled
qa-review / approved (pull_request_target) Refired via /qa-recheck; qa-review failed
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m30s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m35s
security-review / approved (pull_request_target) Refired via /security-recheck; security-review failed
CI / Platform (Go) (pull_request) Successful in 6m58s
CI / all-required (pull_request) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 9m51s
audit-force-merge / audit (pull_request_target) Successful in 6s
Cross-cloud workspaces (e.g. Hetzner under a GCP tenant) register
advertising their per-workspace Cloudflare tunnel hostname
ws-<id>.<appDomain>. That DNS record is eventually-consistent, and a
FAST-booting box (a Hetzner cpx reports 'workspace ready after ~1s')
registers BEFORE it propagates → validateAgentURL's net.LookupIP fails →
the handler returns 400 → and the runtime does NOT retry a 4xx → so
agent_card never lands and the agent never comes online. AWS/GCP boot
slowly enough to miss the race, which is why ONLY the fast cloud broke.

Diagnosed live: faithful Hetzner repro boxes register against a warm
tenant and still 400 with
  {"error":"hostname \"ws-...\" cannot be resolved (DNS error)..."}

Fix: when DNS resolution fails, allow the hostname through in SaaS mode iff
it is a platform-tunnel hostname (ws-<id> under the platform's own domain,
MOLECULE_APP_DOMAIN default moleculesai.app). Such a hostname is NOT an
SSRF vector — only the platform controls DNS there, so an attacker cannot
point it at 169.254/127/private space, and the unconditional metadata/
loopback blocks still apply once it resolves. Restores the pre-#1130
'let an unresolvable platform URL through' behaviour, scoped to the
trusted tunnel domain. Self-hosted keeps the strict block.

This is the register half of #36; the provision half (Hetzner location
capacity failover) shipped in cp#619.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 21:31:29 -07:00
core-devops 0541076f90 test(security): lock that only the kind=platform concierge gets the org MCP + admin token
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Check migration collisions / Migration version collision check (pull_request) Successful in 29s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
CI / Shellcheck (E2E scripts) (pull_request) Successful in 24s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 38s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 51s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 20s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 1m19s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 11s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 45s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 26s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 30s
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
security-review / approved (pull_request_target) Failing after 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m23s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m37s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m11s
CI / Platform (Go) (pull_request) Successful in 4m13s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m13s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m22s
qa-review / approved (pull_request_target) Failing after 4s
Harness Replays / Harness Replays (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m33s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m22s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m8s
E2E Chat / E2E Chat (pull_request) Successful in 7s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m45s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / review-refire (pull_request_target) Has been skipped
CI / Canvas (Next.js) (pull_request) Failing after 6m36s
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / all-required (pull_request) Has been skipped
Regression guard for the user's requirement: only the tenant-native concierge
(kind='platform') may hold the org/platform MCP and the org-admin token natively;
an ordinary workspace must get neither. Asserts applyConciergeProvisionConfig is a
no-op for kind='workspace' (no MOLECULE_API_KEY leak, no system-prompt, no platform
mcp_servers) and applies for kind='platform'. Defense-in-depth already exists at
three layers (config + admin-token env + MCP-bearing image, all gated on the DB
kind SSOT); this stops a silent regression of the gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 21:15:47 -07:00
Molecule AI Dev Engineer A (Kimi) 8fa25d6b8c fix(provisioner): remove duplicate resolveProvider declaration causing build failure
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 19s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 20s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 26s
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 10s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
qa-review / approved (pull_request_target) Failing after 17s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
CI / Canvas Deploy Status (pull_request) Successful in 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m1s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 4m59s
CI / Platform (Go) (pull_request) Successful in 7m26s
CI / all-required (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 10m8s
Removes accidental second copy of resolveProvider that was

introducing a redeclaration compile error in cp_provisioner.go.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 03:44:41 +00:00
Molecule AI Dev Engineer A (Kimi) 02a6e4d4df fix(provisioner): thread provider into IsRunning status call, fail-closed on lookup error (#2386 sibling-leak)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 1s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 29s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 3s
qa-review / approved (pull_request_target) Failing after 6s
CI / Platform (Go) (pull_request) Failing after 37s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 26s
security-review / approved (pull_request_target) Failing after 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
sop-checklist / review-refire (pull_request_target) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 6m31s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m30s
Researcher found CPProvisioner.IsRunning/status omits 'provider' on its
control-plane call, misrouting non-AWS workspaces to the AWS status path.
Same bug class as deprovision leak #2386/#2387.

Changes:
- Add resolveProvider helper (workspaces.compute->>'provider') mirroring
  resolveInstanceID pattern.
- IsRunning: resolve provider, fail-closed on error (return true, err so
  a2a_proxy stays on alive path), URL-encode with url.Values/q.Encode().
- Regression tests: (a) provider threaded to status query, (b) fail-closed
  on lookup error — no CP call, (c) hostile-slug encoding round-trip.

Diff scoped to cp_provisioner.go + cp_provisioner_test.go only.
Branch off fresh origin/main (no stacking on #2387/#2388).
2026-06-08 03:31:08 +00:00
Molecule AI Dev Engineer A (Kimi) c7dbd6c3e4 fix(2403): uniform gate fail-closed — governance checks always required (CTO #2407)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
CI / Canvas (Next.js) (pull_request) Successful in 8s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
CI / Canvas Deploy Status (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Successful in 13s
CI / all-required (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 52s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 58s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m15s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m13s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m19s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m26s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 8s
security-review / approved (pull_request_review) Successful in 9s
audit-force-merge / audit (pull_request_target) Successful in 9s
1. gitea-merge-queue.py::enumerate_readiness:
   - Merge GOVERNANCE_REQUIRED_CONTEXTS with BP required_contexts.
   - Previously enumerate_readiness omitted qa-review/security-review/sop-checklist,
     so readiness reports did not enforce the uniform gate.

2. gate_check.py::signal_6_ci:
   - Add GOVERNANCE_REQUIRED_CONTEXTS hardcoded list.
   - Merge with branch-protection required checks so governance checks block
     even when BP does not enumerate them.

3. test_gitea_merge_queue.py:
   - Add test_non_required_red_does_not_block_merge (flipped):
     asserts qa/security/sop failing blocks merge (force=False).

4. test_gate_check.py:
   - Add test_signal_6_governance_checks_always_required_even_when_bp_empty:
     proves governance checks are evaluated when BP required list is empty.

All 85 affected tests pass (71 merge-queue + 14 gate-check).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 02:58:11 +00:00
hongming-personal 286779ec45 feat(ws-server): validate compute.provider against the cloud-provider SSOT
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
sop-checklist / na-declarations (pull_request) N/A: (none)
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 18s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 7s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
CI / Canvas Deploy Status (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m47s
CI / Platform (Go) (pull_request) Successful in 4m6s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Refired via /qa-recheck by unknown
security-review / approved (pull_request_target) Refired via /security-recheck by unknown
sop-checklist / all-items-acked (pull_request_target) Refired after stranded cancelled SOP row; current checklist already satisfied
audit-force-merge / audit (pull_request_target) Successful in 9s
validateWorkspaceCompute checked instance_type / volume / display /
data_persistence but NOT compute.provider — a typo'd provider flowed to CP
and only fail-closed there with a 422. Add an allowlist mirroring the
controlplane cloudprovider SSOT (aws|gcp|hetzner) so a bad provider gets a
clean 400 before the round-trip. This is the validation seam the
switch-provider flow (RFC #622) reuses.

PR1 of the switch-existing-workspace-provider series.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 19:35:07 -07:00
Molecule AI Dev Engineer A (Kimi) 2e0507380b test(gate-check): explicit missing/pending required-context fail-closed coverage (#2403 CR2+Researcher)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 2s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 37s
gate-check-v3 / gate-check (pull_request_target) Successful in 22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 15s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
E2E Chat / E2E Chat (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m15s
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
security-review / approved (pull_request_target) Failing after 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 57s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas Deploy Status (pull_request) Successful in 1s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m10s
CI / all-required (pull_request) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m35s
ci-arm64-advisory / fast-checks (pull_request) Has been cancelled
CR2 9450 + Researcher 9455: gate_check.py already treats absent/pending
required contexts as CI_PENDING (fail-closed), but this was not covered by
tests. Add four signal_6 tests:

1. test_signal_6_missing_required_context_returns_ci_pending
   - required check absent from statuses → verdict=CI_PENDING
2. test_signal_6_pending_required_context_returns_ci_pending
   - required check status=pending → verdict=CI_PENDING
3. test_signal_6_failing_required_context_returns_ci_fail
   - required check status=failure → verdict=CI_FAIL
4. test_signal_6_all_required_green_returns_ci_pending
   - all required checks success → verdict=CLEAR

This proves the uniform gate is fail-closed on absence: a required context
that has not yet materialized (missing/pending) is NEVER treated as ready.
2026-06-08 02:07:16 +00:00
core-devops 993379f184 test(e2e): functional proof the concierge creates a workspace via its platform MCP
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 39s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m14s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 27s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m45s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m27s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m24s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m19s
CI / Detect changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 4m54s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
qa-review / approved (pull_request_target) Failing after 6s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 25s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 31s
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 20s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 58s
sop-checklist / review-refire (pull_request_target) Has been skipped
Harness Replays / detect-changes (pull_request) Successful in 7s
security-review / approved (pull_request_target) Failing after 6s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 23s
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 10s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 6m9s
CI / Platform (Go) (pull_request) Successful in 4m12s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6m26s
CI / Canvas (Next.js) (pull_request) Failing after 6m35s
CI / Canvas Deploy Status (pull_request) Has been skipped
Check migration collisions / Migration version collision check (pull_request) Successful in 30s
CI / all-required (pull_request) Has been skipped
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m0s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
Drives the concierge as an AGENT (A2A message/send: 'create a workspace named X
with role engineer') and asserts the real side effect — a workspace named X appears
in GET /workspaces, only possible if the LLM invoked the create_workspace platform-
MCP tool. Staging real-LLM job (GATING, false-green-proof via E2E_REQUIRE_LIVE=1 so a
missing platform-agent image hard-fails) + a local variant (make e2e-concierge-
creates-workspace) that skips-loud unless the concierge's MCP advertises
create_workspace. Tolerates LLM nondeterminism (imperative prompt, assert by name,
bounded polling). Teardown + AWS-leak-check.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 18:53:34 -07:00
core-devops f0a52caae6 feat(provisioner): provision the concierge on the platform-agent image (kind=platform) so its org-admin MCP exists
The concierge declared the platform MCP but ran on the plain claude-code image
(no /opt/molecule-mcp-server) so it had zero org-admin tools. The local Docker
provisioner now selects the platform-agent image variant for kind='platform'
(gated on the image being present — falls back + logs otherwise, so normal
workspaces + SaaS are unaffected). kind is read from the workspace row (SSOT).
Live-verified: concierge runs ...-platform-agent, /opt/molecule-mcp-server present,
online, and GET /workspaces with the MCP bearer returns 200 from inside it. SaaS/CP
provisioner image selection is the cross-repo follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 18:53:34 -07:00
Molecule AI Dev Engineer A (Kimi) 71f485b76c fix(channels): clarify encryption comment to show single-call intent (#1221 CR2)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
qa-review / approved (pull_request_target) Failing after 6s
gate-check-v3 / gate-check (pull_request_target) Failing after 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 7s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
CI / Canvas Deploy Status (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 59s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m12s
CI / Platform (Go) (pull_request) Successful in 7m38s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 4s
audit-force-merge / audit (pull_request_target) Successful in 9s
Reviewer confusion: the unified diff from main showed one block removed
without clearly showing the first (retained) block. Update the comment
in the retained block to explicitly state 'Exactly one call here;
duplicate removed in this PR' so the diff unambiguously proves the
Create path still encrypts bot_token/webhook_secret before persistence.

No behavior change — the encryption call was already present.
2026-06-08 01:25:41 +00:00
core-devops 18a0be64a9 feat(concierge): seed the platform agent its concierge identity + platform MCP config
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 51s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 25s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 18s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m7s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m7s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m25s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m47s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m41s
CI / all-required (pull_request) Has been skipped
CI / Canvas Deploy Status (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m4s
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m1s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m0s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m24s
qa-review / approved (pull_request_target) Failing after 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m45s
sop-checklist / review-refire (pull_request_target) Has been skipped
CI / Canvas (Next.js) (pull_request) Failing after 8m35s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
sop-checklist / all-items-acked (pull_request_target) Successful in 3s
security-review / approved (pull_request_target) Failing after 8s
CI / Platform (Go) (pull_request) Successful in 4m5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m5s
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 12s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 22s
Check migration collisions / Migration version collision check (pull_request) Successful in 31s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 33s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 38s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 19s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 25s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
installPlatformAgent created only a DB row, so the concierge booted as a vanilla
claude-code agent ("I'm MiniMax-M3", generic tasks). Per rfc-platform-agent.md it
must carry a concierge system_prompt (it IS the org root / user's A2A peer + default
chat target; orchestrates the org via the platform MCP + a2a; destructive ops
human-approved) and the platform MCP (mcp_servers: platform → molecule-mcp-server,
authed from MOLECULE_API_KEY/URL/ORG_ID). Seeded at provision (applyConcierge
ProvisionConfig, gated on kind='platform'), idempotent + self-applying to the
existing concierge (boot-provision restarts a running-but-vanilla one). The org-admin
MCP only lights up on the platform-agent image; identity works everywhere. Live-
verified: concierge now answers as the org platform concierge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 18:23:31 -07:00
Molecule AI Dev Engineer A (Kimi) 579e044e54 chore: retrigger CI for fresh review (#2417)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request_target) Failing after 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
qa-review / approved (pull_request_target) Failing after 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 20s
Harness Replays / Harness Replays (pull_request) Successful in 20s
CI / Canvas (Next.js) (pull_request) Successful in 8m16s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
audit-force-merge / audit (pull_request_target) Successful in 10s
2026-06-08 00:55:30 +00:00
core-devops 643dd5c1f5 test(canvas-e2e): Playwright front-end e2e for each concierge function
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 20s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m21s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m16s
security-review / approved (pull_request_target) Failing after 11s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
qa-review / approved (pull_request_target) Failing after 12s
sop-checklist / na-declarations (pull_request) N/A: (none)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m28s
gate-check-v3 / gate-check (pull_request_target) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
CI / Platform (Go) (pull_request) Successful in 4m18s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 33s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m27s
sop-checklist / review-refire (pull_request_target) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m10s
E2E Chat / E2E Chat (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 15s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 29s
CI / Canvas (Next.js) (pull_request) Failing after 6m40s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / all-required (pull_request) Has been skipped
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 4m3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m40s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Successful in 1m3s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m14s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m51s
CI / Detect changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 13s
Check migration collisions / Migration version collision check (pull_request) Successful in 36s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
Extends the existing canvas staging Playwright project (staging-*.spec.ts, gated
Canvas tabs E2E check) with staging-concierge.spec.ts — 7 specs: shell/nav + dynamic
org name, Home (canonical ChatTab + sub-tabs + ROOT tree), Org map hides the
concierge, Settings two-tab split + full WorkspacePanelTabs, Config-tab SSOT
dropdowns (no Platform on self-host), Org & canvas sub-tabs (Organization no 404),
and the stripped map toolbar. Installs a real platform agent via the admin endpoint
per run. Adds minimal data-testids to ConciergeShell for stable selection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:51:42 -07:00
core-devops ab43d5a9dc test(staging-e2e): comprehensive real-staging coverage for concierge/platform-agent
Extends the existing staging harness (reuses org-provision/teardown + _lib.sh +
env contract): TestConciergePlatformAgent_Staging (Go, staging_e2e tag) covers
platform-agent install + kind + /org/identity + re-parenting, discovery peers admin
auth, billing-mode round-trip, and the config-tab endpoint sweep; test_staging_
concierge_e2e.sh covers user_tasks REST+MCP+cross-workspace authz. Wired into
e2e-staging-saas.yml as GATING jobs (+ a compile-skip-loud job that runs every
push). Caught + fixed: /org/identity needs X-Molecule-Org-Id on a SaaS tenant
(TenantGuard) — switched to doTenantJSON.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:51:42 -07:00
core-devops a336acd23d fix(self-host): org-identity + org-templates SSOT parity (no CP-only 404, no shadowed defaults)
Organization settings tab called the control-plane-only GET /cp/orgs, 404ing on
self-host. /org/identity now also returns slug + org_id (MOLECULE_ORG_SLUG/ID),
and OrgInfoTab falls back to it when /cp/orgs is unavailable — single org, no
error; SaaS multi-org path unchanged. Org templates: the image bakes default org
templates (molecule-dev, molecule-worker-gemini, ux-ab-lab) at /org-templates, but
the ./org-templates:/org-templates:ro mount shadowed them with an empty host dir
(same class as the runtime-template shadow). findOrgDir() honors ORG_TEMPLATES_DIR;
compose points it at the baked bundle + drops the shadowing mount — local now lists
them like production.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:51:42 -07:00
Molecule AI Dev Engineer A (Kimi) b103d02f17 test(channels): prove Create encrypts bot_token before persistence (#1221 CR)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 10s
CI / Canvas (Next.js) (pull_request) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Failing after 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Successful in 1s
qa-review / approved (pull_request_target) Failing after 13s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 22s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m20s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m49s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m35s
CI / Platform (Go) (pull_request) Successful in 4m10s
CI / all-required (pull_request) Successful in 3s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 8s
Reviewer catch: requested test proving EncryptSensitiveFields runs on
Create path before DB insert. Add TestChannelHandler_Create_EncryptsSensitiveFields
with sqlmock custom matcher that verifies the INSERT configJSON carries
bot_token prefixed with ciphertextPrefix (ec1:).

Sets SECRETS_ENCRYPTION_KEY + resets crypto state so the test exercises
real encryption rather than the dev plaintext fallback.

Fixes #1221
2026-06-08 00:50:27 +00:00
Molecule AI Dev Engineer A (Kimi) f14ad38cb4 fix(sop-checklist): revert #1974 body-unfilled bypass — keep fail-closed (#2418 CR)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request_target) Failing after 7s
security-review / approved (pull_request_target) Failing after 8s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m4s
CI / Canvas (Next.js) (pull_request) Successful in 6m25s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
audit-force-merge / audit (pull_request_target) Successful in 20s
Removes the gate-weakening #1974 change that made body-section presence
informational only. The SOP checklist gate must remain fail-closed:
missing body sections → failure even when peer acks are present.

Fixes #2418
2026-06-08 00:42:13 +00:00
Molecule AI Dev Engineer A (Kimi) e40607dfee fix(sop-checklist): revert #1974 body-unfilled bypass — keep fail-closed (#2416 CR)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 14s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 8s
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 7s
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 4s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
audit-force-merge / audit (pull_request_target) Successful in 16s
Reviewer catch: #1974 weakened the SOP checklist gate by making
body-section presence informational only (success when peer acks exist
but body sections are missing). This changes the gate from fail-closed
to pass-with-body-unfilled.

Revert:
- render_status() restores `not missing and not missing_body` for success.
- Tests restored to expect failure when body sections are unfilled.

The #1973 memory-marker normalization (slash→space) is retained.

Fixes #2416
2026-06-08 00:41:07 +00:00
Molecule AI Dev Engineer A (Kimi) ddf9006edf feat(2403): complete SOP tier removal — salvage non-tier fixes + zero tier refs
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m5s
gate-check-v3 / gate-check (pull_request_target) Successful in 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m12s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m12s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
sop-checklist / all-items-acked (pull_request_target) Successful in 12s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m35s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m27s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 8s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 8s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 15s
audit-force-merge / audit (pull_request_target) Successful in 10s
Completes the SOP tier system removal started in #2407 by cleaning
remaining tier artifacts and salvaging the non-tier fixes from
#2396/#2397/#2399 branches.

Changes:

1. **qa-review.yml + security-review.yml** — salvage #2139 + #2159:
   - Add `labeled, unlabeled` to `pull_request_target` triggers so
     gates re-evaluate when labels change (#2139).
   - Remove unreliable `github.event.review.state` guard (#2159);
     evaluator (review-check.sh) already reads actual reviews from API.
   - Replace `SOP_TIER_CHECK_TOKEN` with `SOP_CHECKLIST_GATE_TOKEN`.

2. **Workflow token cleanup** — zero SOP_TIER_CHECK_TOKEN refs:
   - sop-checklist.yml, gate-check-v3.yml, audit-force-merge.yml,
     ci-required-drift.yml: replace or remove all SOP_TIER_CHECK_TOKEN
     references.

3. **Lint + runbook cleanup** — remove stale tier-check mentions:
   - lint-required-no-paths.yml + lint-required-no-paths.py: update
     example context from `sop-checklist / tier-check` to
     `sop-checklist / all-items-acked`.
   - gitea-operational-quirks.md: update token name references.

4. **Mutation test enhancement** (test_no_tier_regression.sh):
   - Fail if SOP_TIER_CHECK_TOKEN reappears anywhere.
   - Fail if qa-review/security-review lose labeled/unlabeled triggers.
   - Fail if review.state guard reappears.

5. **Unit test updates** (test_gate_review_auto_fire.py):
   - Assert absence of review.state guard instead of presence.
   - Assert SOP_CHECKLIST_GATE_TOKEN instead of SOP_TIER_CHECK_TOKEN.

All tests pass:
- test_gate_review_auto_fire.py: 11 passed
- test_gitea_merge_queue.py: 70 passed
- test_gate_check.py: 9 passed
- test_lint_required_no_paths.py: 21 passed
- test_sop_checklist.py: 101 passed
- test_no_tier_regression.sh: PASS

Fixes #2403
2026-06-08 00:34:37 +00:00
core-devops d3249101f8 feat(canvas): split Settings into Platform-agent / Org-&-canvas tabs (not one sheet)
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m12s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m50s
CI / Canvas (Next.js) (pull_request) Failing after 6m30s
CI / Canvas Deploy Status (pull_request) Has been skipped
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / all-required (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m24s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m24s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 9s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m9s
CI / Python Lint & Test (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 13s
security-review / approved (pull_request_target) Failing after 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
qa-review / approved (pull_request_target) Failing after 4s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m12s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Check migration collisions / Migration version collision check (pull_request) Successful in 23s
sop-checklist / review-refire (pull_request_target) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m32s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Platform (Go) (pull_request) Successful in 4m5s
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m43s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m16s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Waiting to run
The Settings page stacked both sections in one long scroll. Give each its own
tab (reusing the existing .sbTabs purple-underline tab style): 'Platform agent
configuration' and 'Org & canvas settings'. Local settingsTab state, defaults to
the platform-agent tab.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:30:41 -07:00
core-devops cf23d2aead fix(local): serve the full baked runtime/template set so the runtime list mimics production (SSOT)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Check migration collisions / Migration version collision check (pull_request) Successful in 32s
Harness Replays / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 37s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 30s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
ci-arm64-advisory / fast-checks (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Has been cancelled
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m32s
sop-checklist / all-items-acked (pull_request_target) Has been cancelled
sop-checklist / review-refire (pull_request_target) Has been cancelled
qa-review / approved (pull_request_target) Failing after 5s
Harness Replays / Harness Replays (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request_target) Waiting to run
security-review / approved (pull_request_target) Failing after 16s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m47s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m48s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m2s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
The image bakes all runtime templates (claude-code-default, codex, google-adk,
hermes, openclaw, seo-agent) at /workspace-configs-templates, but the
./workspace-configs-templates:/configs mount carried only claude-code-default on
the host — so GET /templates (the runtime-picker SSOT) listed ONLY claude-code
locally while production lists them all. Point TEMPLATE_CACHE_DIR at the baked
bundle so the local runtime LIST matches production. Provisioning the non-
claude-code runtimes locally still needs their host templates + images (the local
Docker provisioner bind-mounts from CONFIGS_HOST_DIR), so they're selectable but
only claude-code is provisionable in this lightweight dev stack — full-runtime
provisioning is covered by the staging e2e. Verified: /templates now serves
claude-code, codex, google-adk, hermes, openclaw.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 17:28:45 -07:00
Molecule AI Dev Engineer A (Kimi) bc59544b07 fix(canvas/e2e): tolerate transient 'failed' status during boot (#2032)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / detect-changes (pull_request) Successful in 13s
security-review / approved (pull_request_target) Failing after 7s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request_target) Failing after 12s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 53s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
gate-check-v3 / gate-check (pull_request_target) Failing after 10s
CI / Canvas (Next.js) (pull_request) Successful in 6m21s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 4s
Hermes cold-boot can exceed the bootstrap-watcher deadline, setting
status=failed prematurely; heartbeat later recovers to online. Instead
of hard-throwing on the first 'failed' sighting, log a warning and
retry. Genuine terminal failures still surface via the waitFor timeout.

Fixes #2032
2026-06-08 00:08:43 +00:00
Molecule AI Dev Engineer A (Kimi) 2567b2f6ef fix(scripts): validate AWS region + ECR account ID in promote-tenant-image (#676)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 16s
Harness Replays / detect-changes (pull_request) Successful in 10s
CI / Platform (Go) (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
E2E Chat / E2E Chat (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
qa-review / approved (pull_request_target) Failing after 8s
gate-check-v3 / gate-check (pull_request_target) Successful in 11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 8s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m25s
CI / Canvas (Next.js) (pull_request) Successful in 6m22s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 1s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
Adds input validation to prevent injection/malformed-input bugs:

- ssm_refresh_ecr_auth: validate ECR_ACCOUNT_ID is exactly 12 digits
  (AWS account ID format) before constructing JSON params.
- preflight: validate REGION matches ^[a-z][a-z0-9-]*[0-9]$
  (AWS region pattern); exit 64 on mismatch.

Includes test 11 covering malicious region rejection
(shell metacharacters, path traversal, command substitution).

Fixes #676
2026-06-07 23:46:22 +00:00
Molecule AI Dev Engineer A (Kimi) 1028777a9f fix(canvas/e2e): tolerate transient 'failed' status during boot (#2032)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Detect changes (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 14s
sop-checklist / review-refire (pull_request_target) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
qa-review / approved (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: memory-consulted
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 11s
security-review / approved (pull_request_target) Failing after 12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m17s
CI / Canvas (Next.js) (pull_request) Successful in 8m30s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 1s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
Hermes cold-boot can exceed the bootstrap-watcher deadline, setting
status=failed prematurely; heartbeat later recovers to online. Instead
of hard-throwing on the first 'failed' sighting, log a warning and
retry. Genuine terminal failures still surface via the waitFor timeout.

Fixes #2032
2026-06-07 23:42:59 +00:00
Molecule AI Dev Engineer A (Kimi) 72df19b513 fix(sop-checklist): normalize memory marker + body-unfilled informational (#1973 #1974)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 6s
qa-review / approved (pull_request_target) Failing after 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
CI / Canvas Deploy Status (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
security-review / approved (pull_request_target) Failing after 11s
CI / all-required (pull_request) Successful in 1s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 58s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
gate-check-v3 / gate-check (pull_request_target) Successful in 11s
- sop-checklist-config.yaml: normalize memory-consulted pr_section_marker
  from "Memory/saved-feedback consulted" → "Memory consulted" (#1973).
  The slash caused normalize_slug() to collapse it to a different string,
  so the Gitea PR body parser never found the expected heading.

- sop-checklist.py: body-section presence is informational only (#1974).
  The gate is peer-ack, not body-fill. Unfilled body sections still
  surface in the description for human visibility, but no longer flip
  the status to failure.

- test_sop_checklist.py: update assertions to match the new contract.
2026-06-07 23:38:03 +00:00
core-devops dc25031eed refactor(canvas): remove redundant PlatformBillingSection; single kind constant (SSOT)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Check migration collisions / Migration version collision check (pull_request) Successful in 29s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 13s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 19s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 30s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 16s
E2E Chat / E2E Chat (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m32s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 14s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m15s
gate-check-v3 / gate-check (pull_request_target) Successful in 12s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m35s
qa-review / approved (pull_request_target) Failing after 11s
security-review / approved (pull_request_target) Failing after 7s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 13s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m8s
CI / Platform (Go) (pull_request) Successful in 4m8s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m39s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6m18s
CI / Canvas (Next.js) (pull_request) Failing after 6m30s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / all-required (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m10s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m1s
PlatformBillingSection forked provider/model/billing logic the platform agent's
Config tab (ConfigTab + LLMBillingSection) already owns — ConciergeShell rendered
both. Removed it (billing-mode stays owned by LLMBillingSection; provider filtering
now at the /templates source). Dropped the lingering name-regex platformRoot
fallback (backend always returns kind; map filter is kind-only). Added WORKSPACE_KIND
const (mirrors models.KindPlatform/Workspace) replacing magic 'platform' literals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:25:14 -07:00
core-devops ba6e8f668e refactor(user-tasks,discovery): one shared user-task store; de-dupe discovery auth (SSOT)
user_tasks had two write paths (REST handler + MCP tools) hand-writing the same
SQL/enum/broadcast — extracted UserTaskStore (mirrors AgentMessageWriter); both
surfaces route through it. Also de-duplicated validateDiscoveryCaller's repeated
cookie-session block and aligned its credential precedence (bearer->admin/org/ws,
then CP-session) to match middleware.WorkspaceAuth so the two can't drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:25:14 -07:00
core-devops 76cb9ddedb fix(templates): filter platform-managed provider at the /templates SOURCE on self-host (SSOT)
The 'hide Platform on self-host' decision was forked into the PlatformBillingSection
leaf, so ConfigTab/CreateWorkspaceDialog/MissingKeysModal still offered it. Move it
to the single source: enrichFromRegistry drops the platform provider + its models
from registry_providers/registry_models when !PlatformManagedProxyConfigured().
Every consumer now derives correctness for free. SaaS (proxy configured) output is
byte-identical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:25:14 -07:00
core-devops f6e836a98d Merge branch 'main' of https://git.moleculesai.app/molecule-ai/molecule-core into feat/canvas-concierge-ui
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
Check migration collisions / Migration version collision check (pull_request) Successful in 20s
CI / Python Lint & Test (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 41s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 28s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m29s
gate-check-v3 / gate-check (pull_request_target) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m28s
qa-review / approved (pull_request_target) Failing after 10s
security-review / approved (pull_request_target) Failing after 8s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m42s
sop-tier-check / tier-check (pull_request_target) Failing after 9s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
E2E Chat / E2E Chat (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 20s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m9s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m30s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m9s
CI / Platform (Go) (pull_request) Successful in 4m1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m2s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m50s
CI / Canvas (Next.js) (pull_request) Failing after 9m39s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / all-required (pull_request) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
2026-06-07 16:08:39 -07:00
core-devops ed3662de5e feat(canvas): remove redundant map-toolbar controls (settings gear, theme toggle, legend)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 14s
Check migration collisions / Migration version collision check (pull_request) Successful in 29s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 31s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 13s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m15s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
qa-review / approved (pull_request_target) Failing after 4s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 28s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m33s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m54s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 40s
Harness Replays / Harness Replays (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m13s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m0s
Settings now lives in the concierge global Settings (left rail) and theme in the
topbar/Settings, so the map toolbar's gear + theme picker are redundant. The
legend panel is also dropped from the map per design. Removes the now-unused
SettingsButton/settingsGearRef/ThemeToggle/Legend imports.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:04:34 -07:00
Molecule AI Dev Engineer A (Kimi) 844664c642 fix(queue): use label= (singular) not labels= (plural) for Gitea 1.22.6 API (#1306)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request_target) Successful in 7s
qa-review / approved (pull_request_target) Failing after 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m22s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 3s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 4s
audit-force-merge / audit (pull_request_target) Successful in 11s
Gitea 1.22.6 accepts `label` (singular) not `labels` (plural) for
filtering issues by label in the GET /repos/{owner}/{repo}/issues endpoint.
The queue script's list_queued_issues() has been passing `labels`, which
Gitea silently ignores, causing the function to return all open PRs instead
of only those tagged with QUEUE_LABEL.

Change the query key from "labels" to "label" so the label filter is
actually honoured.

Fixes #1306
2026-06-07 23:00:02 +00:00
Molecule AI Dev Engineer A (Kimi) 6aa7c52be6 fix(channels): restore single EncryptSensitiveFields call in Create (#1221 CR)\n\nReviewer catch: the prior commit removed both duplicate encryption blocks,\nregressing #319 credential-at-rest protection. Restore exactly one call\nbefore json.Marshal so bot_token/webhook_secret are encrypted before DB\nstorage. The rows.Err regression test is retained.\n\nFixes #1221
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request_target) Failing after 8s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
qa-review / approved (pull_request_target) Failing after 6s
security-review / approved (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
sop-tier-check / tier-check (pull_request_target) Failing after 8s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m2s
CI / Canvas Deploy Status (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m8s
CI / Platform (Go) (pull_request) Successful in 4m5s
CI / all-required (pull_request) Successful in 3s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 7s
2026-06-07 22:59:40 +00:00
Molecule AI Dev Engineer A (Kimi) 346245d860 fix(channels): remove duplicate EncryptSensitiveFields + add rows.Err test (#1221)
**CWE-312 fix:** ChannelHandler.Create() had two consecutive
EncryptSensitiveFields calls (lines 159-172). The second was a pure no-op
that wasted CPU and confused readers. Removed the duplicate.

**Test:** Add TestChannelHandler_List_RowsErr_LogsError to verify that a
mid-stream rows.Err() after the Next() loop is logged but non-fatal — the
handler still returns the successfully-scanned row(s) with HTTP 200.

The rows.Err() checks in List() and Webhook() were already present from
PR #1900; this commit completes the issue by removing the duplicate
encryption and adding the missing regression test.

Fixes #1221
2026-06-07 22:59:40 +00:00
core-devops e6aad44c0f fix(discovery): accept admin/org token for /registry/:id/peers (concierge config tabs 401)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Check migration collisions / Migration version collision check (pull_request) Successful in 29s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
qa-review / approved (pull_request_target) Failing after 4s
security-review / approved (pull_request_target) Failing after 4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m34s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m38s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m50s
sop-tier-check / tier-check (pull_request_target) Failing after 10s
Harness Replays / Harness Replays (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m47s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m6s
CI / Platform (Go) (pull_request) Successful in 4m1s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m47s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m19s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m5s
ci-arm64-advisory / fast-checks (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 7m0s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 15m42s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 51s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m15s
The discovery routes (Peers/Discover/CheckAccess) auth via validateDiscoveryCaller,
which only did the per-workspace wsauth.ValidateToken — no admin/org fallback. So
the canvas operator's admin bearer 401'd ('invalid workspace auth token') on the
Details tab's GET /registry/:id/peers for the platform agent (the operator holds
no per-workspace token for it). Added the same admin-token + org-token fallback
middleware.WorkspaceAuth uses. Verified live: peers 200 with the admin token
(was 401). Every other config-tab endpoint already honored the operator token
via wsAuth's fallback or AdminAuth (swept: traces/plugins/schedules/channels/
display/events all 200).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 15:56:08 -07:00
core-devops d049e8fe1c feat(canvas): full workspace config tabs for the platform agent in Settings
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 8s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
Check migration collisions / Migration version collision check (pull_request) Successful in 32s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Harness Replays / detect-changes (pull_request) Successful in 27s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 33s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m28s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m31s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m24s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request_target) Successful in 10s
qa-review / approved (pull_request_target) Failing after 10s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m14s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 8s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m51s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
Harness Replays / Harness Replays (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 4m0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5m7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m46s
CI / Canvas (Next.js) (pull_request) Failing after 8m30s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / all-required (pull_request) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m22s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 6m58s
The concierge Settings page can now configure the platform agent exactly like
any workspace. Extracted SidePanel's tab bar + body into a shared
WorkspacePanelTabs component (the canonical 15-tab set: config, plugins/skills,
container, display, details, activity, terminal, channels, schedule, files,
memory, traces, events, audit, chat). SidePanel renders it controlled (store
panelTab) — map drawer unchanged; Settings renders it uncontrolled (local tab
state, defaultTab=config) for the platform agent, so it never fights the map's
selection. Every tab already took an explicit workspaceId prop, so the
extraction is behavior-preserving (no store-selection coupling).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 15:29:11 -07:00
core-devops be7db9e9df feat(billing): environment-aware platform-agent billing — self-host defaults to BYOK, hides Platform
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 33s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
CI / Python Lint & Test (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 11s
Check migration collisions / Migration version collision check (pull_request) Successful in 41s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 25s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 36s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Harness Replays / detect-changes (pull_request) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 30s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 24s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
gate-check-v3 / gate-check (pull_request_target) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 22s
security-review / approved (pull_request_target) Failing after 4s
qa-review / approved (pull_request_target) Failing after 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m15s
sop-checklist / all-items-acked (pull_request_target) Successful in 12s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m33s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m23s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m52s
E2E Chat / E2E Chat (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 22s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m20s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m8s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m28s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m9s
CI / Platform (Go) (pull_request) Successful in 4m11s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m30s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6m27s
CI / Canvas (Next.js) (pull_request) Successful in 6m53s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 3s
platform_managed only works on SaaS (Molecule hosted LLM proxy + org-credit
ledger). A self-hosted stack has neither, so showing 'Platform / metered to org
credits' as the default was misleading. New PlatformManagedProxyConfigured()
(true iff MOLECULE_LLM_BASE_URL + MOLECULE_LLM_USAGE_TOKEN are set — the same
precondition applyPlatformManagedLLMEnv enforces). GET /org/identity now returns
platform_managed_available; the resolver's default-closed fallbacks return byok
when no proxy (SaaS paths byte-for-byte unchanged, gated strictly). Settings
hides the Platform provider + defaults BYOK + forces byok writes when
unavailable; 404 on the signal => treated as unavailable (self-host safety).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:59:26 -07:00
core-devops a70c291737 refactor(canvas): Home concierge chat reuses the canonical ChatTab (no drift)
The Home view rendered a bespoke ConciergeChat that reimplemented (and lagged)
the map's agent chat. Render the SAME ChatTab the SidePanel uses, pointed at the
platform agent — so My Chat / Agent Comms, attachments, lazy history, markdown,
delivery-mode + restart are identical and can't drift. ChatTab takes explicit
{workspaceId, data} props (no store-selection coupling), so the map path is
unchanged. ConciergeChat removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:59:26 -07:00
core-devops 4ab16ca805 feat(canvas): hide the platform agent (concierge) from the org map graph
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 11s
Check migration collisions / Migration version collision check (pull_request) Successful in 47s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m13s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m15s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
security-review / approved (pull_request_target) Failing after 9s
qa-review / approved (pull_request_target) Failing after 9s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 57s
sop-tier-check / tier-check (pull_request_target) Failing after 8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m33s
E2E Chat / E2E Chat (pull_request) Successful in 6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m13s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m12s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m32s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m22s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m23s
CI / Platform (Go) (pull_request) Successful in 4m4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m47s
CI / Canvas (Next.js) (pull_request) Successful in 8m13s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 23s
The platform agent is the undeletable org ROOT — every workspace hangs under
it — so it shouldn't be a draggable/deletable map node with a Delete affordance.
It stays surfaced as the org anchor: the shell topbar + the Home agent tree (as
ROOT). Only the Org map node-graph hides it.

- workspace-server: GET /workspaces + /workspaces/:id now return `kind`
  (COALESCE(w.kind,'workspace')) — it was a latent gap (the column existed but
  List/Get never selected it). Fixtures updated for the new column.
- canvas: stripPlatformRootForMap() drops the kind='platform' node from the map's
  React Flow input and reparents its children to top-level (relative→absolute);
  edges touching it are dropped. Toolbar workspace count excludes it.
- ConciergeShell resolves platformRoot by kind='platform' first (robust — the
  dynamic '<org> Agent' name broke the old name regex), falling back to the
  heuristic for older ws-server builds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:40:19 -07:00
core-devops ca50e9affb ci(local-provision-e2e): fix :8080 contention (red stub gate) + lint tracking directives
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 15s
Check migration collisions / Migration version collision check (pull_request) Successful in 27s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 15s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 12s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 27s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m43s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m51s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 21s
qa-review / approved (pull_request_target) Failing after 12s
security-review / approved (pull_request_target) Has started running
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m43s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m11s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m6s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m29s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m45s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
CI / Canvas (Next.js) (pull_request) Successful in 6m37s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
sop-tier-check / tier-check (pull_request_target) Failing after 6s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
CI / Canvas Deploy Status (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 7s
Root cause of the red 'Local Provision Lifecycle E2E (stub)' gate: the stub +
real jobs both bind PORT=8080 with no needs: ordering, so they co-scheduled on
the shared runner and the second bind killed the server -> /health timeout (the
issue #1046 class). Add needs: lifecycle-stub (advisory still always() + non-
blocking) + a kill-stale-platform-server step to both jobs. Also satisfy the two
lint gates this workflow trips: # mc#2408 tracker on the advisory continue-on-
error lane, and # bp-required: pending #2409 on the stub emitter (reconciling the
REQUIRED-vs-bp-exempt comment contradiction).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:06:19 -07:00
core-devops 266131205d docs(openapi): author user-tasks + /org/identity endpoints (swaggo SSOT)
The runtime-surface spec is swaggo-generated (Makefile openapi-spec + the
openapi-spec-check drift gate), so the SSOT is the handler annotations, not the
yaml. Add @Router/@Summary/@Param/@Success/@Security blocks (+ named request/
response structs swaggo can introspect) for the 6 user-tasks routes and
GET /org/identity, then regenerate. Auth modeled to match the router:
WorkspaceAuth -> BearerAuth+OrgSlugAuth, the cross-workspace /user-tasks/pending
-> AdminAuth bearer, /org/identity open. Regen is idempotent (drift gate green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:06:19 -07:00
core-devops be07f24270 fix(user-tasks): FK to workspaces(id) ON DELETE CASCADE + workspace_id index
Mirrors approval_requests' workspace_id FK so a deleted workspace's tasks are
reaped, not orphaned (an orphan vanishes from the home list — which JOINs
workspaces — while still showing in the owning workspace's own List). Adds the
(workspace_id, created_at DESC) index the owner-scoped List/Update/Delete + MCP
tools need. Inline in CREATE TABLE IF NOT EXISTS keeps it idempotent under the
re-apply-every-boot runner.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 14:06:19 -07:00
core-devops 247848d009 fix(canvas): secrets client sends auth bearer (was 401) + collapse redundant platform-billing mode radios into the provider dropdown
Block internal-flavored paths / Block forbidden paths (pull_request) Has started running
Check migration collisions / Migration version collision check (pull_request) Has started running
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 39s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 46s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m9s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m2s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m23s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request_target) Successful in 6s
qa-review / approved (pull_request_target) Failing after 6s
security-review / approved (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 53s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m12s
Harness Replays / Harness Replays (pull_request) Successful in 1s
sop-tier-check / tier-check (pull_request_target) Failing after 7s
CI / Platform (Go) (pull_request) Successful in 4m14s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m6s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m21s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m46s
CI / Canvas (Next.js) (pull_request) Successful in 6m37s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 6m25s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m25s
secrets.ts hand-rolled its fetch headers and omitted the Authorization
bearer, so every secret write 401'd with 'missing workspace auth token'
against a workspace-server with ADMIN_TOKEN set (the SecretsTab in concierge
settings). Route it through the shared platformAuthHeaders() helper (the
#178 raw-fetch bug shape).

PlatformBillingSection: the provider dropdown already offers 'Platform' as a
platform-managed option, so the two big mode-radio banners were redundant.
Drop them — the dropdown alone drives the mode (Platform = managed/no key,
any other provider = BYOK).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 13:44:37 -07:00
core-devops 5fbc33d78a feat(canvas): SSOT provider+model BYOK for the platform agent (not hardcoded Anthropic) + dynamic topbar org name
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Check migration collisions / Migration version collision check (pull_request) Successful in 14s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 21s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Harness Replays / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 17s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 53s
qa-review / approved (pull_request_target) Failing after 6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 58s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m11s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m1s
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
sop-tier-check / tier-check (pull_request_target) Failing after 8s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m25s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 24s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m45s
E2E Chat / E2E Chat (pull_request) Successful in 46s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m20s
Harness Replays / Harness Replays (pull_request) Successful in 10s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 6m57s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m41s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 28s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m26s
CI / Platform (Go) (pull_request) Successful in 9m22s
CI / Canvas (Next.js) (pull_request) Successful in 9m41s
CI / Canvas Deploy Status (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 2s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m35s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 8m30s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 13:22:09 -07:00
core-devops 53e0fa884a feat(platform-agent): boot-seed auto-provisions the concierge + dynamic <org> Agent name + /org/identity
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 23s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Check migration collisions / Migration version collision check (pull_request) Successful in 34s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 37s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 46s
Harness Replays / detect-changes (pull_request) Successful in 36s
E2E Chat / E2E Chat (pull_request) Successful in 11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 12s
ci-arm64-advisory / fast-checks (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Has been cancelled
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Has been cancelled
lint-required-no-paths / lint-required-no-paths (pull_request) Has been cancelled
gate-check-v3 / gate-check (pull_request_target) Has been cancelled
sop-checklist / all-items-acked (pull_request_target) Has been cancelled
sop-checklist / review-refire (pull_request_target) Has been cancelled
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
sop-tier-check / tier-check (pull_request_target) Has been cancelled
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request_target) Failing after 4s
security-review / approved (pull_request_target) Failing after 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m39s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m26s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m7s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m48s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m52s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 13:20:34 -07:00
core-devops 550b75c1f4 feat(platform-agent): self-host boot-seed so the concierge auto-creates without a CP
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Check migration collisions / Migration version collision check (pull_request) Successful in 18s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 56s
CI / Python Lint & Test (pull_request) Successful in 35s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 27s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 47s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 58s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
qa-review / approved (pull_request_target) Failing after 4s
security-review / approved (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m0s
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-tier-check / tier-check (pull_request_target) Failing after 6s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m0s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
gate-check-v3 / gate-check (pull_request_target) Successful in 42s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 1m22s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m30s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m4s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m51s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m22s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m32s
CI / Platform (Go) (pull_request) Successful in 4m27s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m12s
CI / Canvas (Next.js) (pull_request) Successful in 6m54s
CI / Canvas Deploy Status (pull_request) Successful in 57s
CI / all-required (pull_request) Successful in 8s
In SaaS the control plane calls POST /admin/org/platform-agent at org-provision
to install the org's platform agent (concierge). Self-hosted / local has no CP,
so the platform agent was never created ("No platform agent yet").

Add EnsureSelfHostedPlatformAgent: on boot, if no kind='platform' root exists,
install one with a deterministic id (uuidv5 "molecule:self-hosted:platform-agent").
Gated on MOLECULE_SEED_PLATFORM_AGENT (set in the self-hosted docker-compose) so:
- self-hosted/local → auto-seeds the concierge (matches the SaaS experience),
- CI harnesses + SaaS tenants leave it unset → e2e empty-DB assertions
  (test_api.sh) and the CP-driven install path are unaffected.
Idempotent + best-effort (never fatal).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 12:49:11 -07:00
core-devops 6e7918212f fix(canvas): suppress benign nonce hydration warning on layout scripts
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Check migration collisions / Migration version collision check (pull_request) Successful in 26s
Harness Replays / detect-changes (pull_request) Successful in 16s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 11s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 55s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 57s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 41s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m18s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 27s
qa-review / approved (pull_request_target) Failing after 9s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
Harness Replays / Harness Replays (pull_request) Successful in 1s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 17s
sop-tier-check / tier-check (pull_request_target) Failing after 17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m27s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m49s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m22s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m21s
CI / Canvas (Next.js) (pull_request) Successful in 6m20s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 6m52s
CI / all-required (pull_request) Successful in 1s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m34s
The boot-theme + JSON-LD inline scripts carry the per-request CSP nonce.
Browsers strip the nonce attribute off <script> after applying CSP, so the
hydrated DOM shows nonce="" while React's tree carries the real value —
React flags a hydration mismatch on every load. It's benign (the scripts
ran, CSP applied). Add suppressHydrationWarning to both scripts (same
escape hatch already used on <html> for the pre-paint theme write).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 12:39:56 -07:00
core-devops 8a29dac385 test(e2e): real-LLM lifecycle round-trip via MiniMax (cheaper) for the advisory job
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
Check migration collisions / Migration version collision check (pull_request) Successful in 46s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 33s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 35s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 59s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 32s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 34s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 1m15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 40s
E2E Chat / E2E Chat (pull_request) Successful in 29s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 30s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m24s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m30s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 19s
qa-review / approved (pull_request_target) Failing after 17s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m40s
gate-check-v3 / gate-check (pull_request_target) Successful in 18s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m31s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 1m51s
security-review / approved (pull_request_target) Failing after 18s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 19s
sop-tier-check / tier-check (pull_request_target) Failing after 21s
Harness Replays / Harness Replays (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 4m52s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m55s
CI / Canvas (Next.js) (pull_request) Successful in 6m21s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 7m41s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 8m3s
CI / all-required (pull_request) Successful in 2s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 6m58s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Failing after 15m33s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 04:09:04 -07:00
core-devops 097a5a9613 test(e2e): mandatory local Docker-provisioner lifecycle e2e (provision/online/restart-survive/proxy) + stub runtime
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Check migration collisions / Migration version collision check (pull_request) Successful in 24s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 17s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
E2E Chat / E2E Chat (pull_request) Successful in 31s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 30s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m0s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m0s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 57s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
qa-review / approved (pull_request_target) Failing after 11s
security-review / approved (pull_request_target) Failing after 10s
gate-check-v3 / gate-check (pull_request_target) Successful in 14s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request_target) Failing after 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m40s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 2m13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 2m9s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m14s
CI / Platform (Go) (pull_request) Successful in 6m57s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image, advisory) (pull_request) Failing after 6m58s
CI / Canvas (Next.js) (pull_request) Successful in 7m41s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been cancelled
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Has been cancelled
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 03:50:57 -07:00
core-devops 9c86bd8de1 fix(provisioner): namespace managed-container label per platform instance so co-resident platforms can't cross-reap
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Check migration collisions / Migration version collision check (pull_request) Successful in 20s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 32s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m16s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Successful in 7s
qa-review / approved (pull_request_target) Failing after 5s
security-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request_target) Failing after 10s
sop-checklist / all-items-acked (pull_request_target) Successful in 13s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m16s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m9s
CI / Platform (Go) (pull_request) Successful in 4m2s
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Failing after 5m31s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 6m48s
CI / Canvas (Next.js) (pull_request) Successful in 6m9s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 03:05:24 -07:00
core-devops 4b0b56aa6a fix(canvas): SidePanel header no longer clipped behind concierge topbar
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 17s
Check migration collisions / Migration version collision check (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 18s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
gate-check-v3 / gate-check (pull_request_target) Successful in 37s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m17s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m13s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m12s
qa-review / approved (pull_request_target) Failing after 7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m30s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
security-review / approved (pull_request_target) Failing after 12s
sop-checklist / na-declarations (pull_request) N/A: (none)
Harness Replays / Harness Replays (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
sop-tier-check / tier-check (pull_request_target) Failing after 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 2m13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m24s
CI / Platform (Go) (pull_request) Successful in 4m12s
CI / Canvas (Next.js) (pull_request) Successful in 7m6s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 1s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
The canvas <main> root was w-screen/h-screen (full viewport). Inside the
Org Concierge shell the canvas lives in a transformed map-mount (below the
56px topbar), and a viewport-sized root overflowed that mount — which
corrupted the containing-block resolution for the position:fixed SidePanel:
its top resolved ~25px instead of the mount top, so the workspace-name
header rendered behind the topbar (only the pills row was visible).

Switch the root to w-full/h-full so it fills the map-mount. The SidePanel
now resolves top against the mount correctly and fills the map area exactly
(header below the topbar). No magic offsets. Canvas/SidePanel tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 02:18:12 -07:00
core-devops d1215a84c4 fix(cors): allow X-Confirm-Name header (workspace-delete confirmation)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 6s
Check migration collisions / Migration version collision check (pull_request) Successful in 29s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 14s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 23s
qa-review / approved (pull_request_target) Failing after 3s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m22s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-tier-check / tier-check (pull_request_target) Failing after 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m20s
security-review / approved (pull_request_target) Failing after 32s
gate-check-v3 / gate-check (pull_request_target) Successful in 47s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m0s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m22s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m14s
CI / Platform (Go) (pull_request) Successful in 4m6s
CI / Canvas (Next.js) (pull_request) Successful in 6m6s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
The destructive workspace-delete guard requires an X-Confirm-Name header
(workspace_crud.go), but it was missing from the CORS AllowHeaders, so the
canvas's preflight was blocked ("Request header field x-confirm-name is not
allowed by Access-Control-Allow-Headers"). Add it to the allowlist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 00:39:09 -07:00
core-devops 3d0439503c test(e2e): comprehensive user_tasks e2e (REST + MCP) wired into e2e-api CI
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Detect changes (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Harness Replays / detect-changes (pull_request) Successful in 26s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 24s
Check migration collisions / Migration version collision check (pull_request) Successful in 34s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 32s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 55s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m1s
gate-check-v3 / gate-check (pull_request_target) Successful in 7s
qa-review / approved (pull_request_target) Failing after 5s
security-review / approved (pull_request_target) Failing after 5s
E2E Chat / E2E Chat (pull_request) Successful in 49s
sop-checklist / review-refire (pull_request_target) Has been skipped
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m8s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 45s
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 1s
sop-tier-check / tier-check (pull_request_target) Failing after 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m47s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m26s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m14s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Successful in 7m23s
CI / Canvas (Next.js) (pull_request) Successful in 7m54s
CI / Canvas Deploy Status (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 12s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 00:26:56 -07:00
core-devops 04fe77ac41 feat(canvas): concierge Settings — BYOK opt-in for platform + relocated canvas settings
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
Check migration collisions / Migration version collision check (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
qa-review / approved (pull_request_target) Failing after 8s
E2E Chat / E2E Chat (pull_request) Successful in 2s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
security-review / approved (pull_request_target) Failing after 7s
sop-tier-check / tier-check (pull_request_target) Failing after 4s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 21s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 43s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m16s
CI / Platform (Go) (pull_request) Successful in 4m3s
CI / Canvas (Next.js) (pull_request) Successful in 6m31s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 1s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 00:17:28 -07:00
core-devops 6a87864176 feat(user-tasks): workspace-scoped read/update/delete of own tasks
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Detect changes (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Check migration collisions / Migration version collision check (pull_request) Successful in 14s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 18s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
Harness Replays / detect-changes (pull_request) Successful in 11s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Successful in 3s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 17s
qa-review / approved (pull_request_target) Failing after 17s
sop-checklist / all-items-acked (pull_request_target) Successful in 16s
security-review / approved (pull_request_target) Failing after 17s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request_target) Failing after 20s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 53s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 2m24s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m21s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m30s
CI / Platform (Go) (pull_request) Successful in 3m53s
CI / Canvas (Next.js) (pull_request) Successful in 7m22s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 1s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
A workspace can now manage the asks it raised (not just create them),
mirroring how it would manage its own resources:

REST (WorkspaceAuth, scoped by workspace_id so an agent only touches tasks
it raised):
- GET    /workspaces/:id/user-tasks            — list own tasks (any status)
- PATCH  /workspaces/:id/user-tasks/:taskId    — update own {title,detail,status}
- DELETE /workspaces/:id/user-tasks/:taskId    — delete own task

MCP (in-workspace a2a bridge, available to every agent):
- list_user_tasks()                            — read own asks + status
- update_user_task(user_task_id, title?, detail?, status?)
- delete_user_task(user_task_id)

These complement the existing request_user_action (create) and the user-side
/resolve. Confirms the design: any workspace (not just platform) can create
and manage tasks; the Home list stays org-wide. Handler tests cover
list/update/delete (+ not-found). go build + vet clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 00:02:00 -07:00
core-devops 3a6f447874 feat(user-tasks): agent→user action requests primitive + concierge wiring
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
qa-review / approved (pull_request_target) Failing after 5s
Check migration collisions / Migration version collision check (pull_request) Successful in 17s
sop-checklist / review-refire (pull_request_target) Has been skipped
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 23s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 19s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 19s
sop-checklist / all-items-acked (pull_request_target) Successful in 17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 3m2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 4s
ci-arm64-advisory / fast-checks (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m29s
New `user_tasks` primitive — things an agent asks the *user* to do (e.g.
"Review the draft"). Any workspace can raise one; they surface in the
concierge Home Tasks list org-wide. Mirrors the approvals subsystem.

Backend (workspace-server):
- migration 20260607000000_user_tasks (id, workspace_id, title, detail,
  status pending|done|dismissed, timestamps).
- handlers/user_tasks.go — Create (POST /workspaces/:id/user-tasks),
  ListAll (GET /user-tasks/pending, AdminAuth, cross-workspace),
  Resolve (POST /workspaces/:id/user-tasks/:taskId/resolve done|dismissed).
- events USER_TASK_REQUESTED / USER_TASK_RESOLVED (+ drift-test snapshot).
- router wiring mirroring the approvals auth split.
- MCP tool `request_user_action(title, detail?)` on the in-workspace a2a
  bridge — available to EVERY agent, not gated like send_message_to_user.
- user_tasks_test.go (create/resolve happy + validation paths).

Canvas: concierge Home Tasks tab now reads /user-tasks/pending (org-wide)
with Done/Dismiss → resolve, replacing the interim schedules wiring; live
tab count.

Design SSOT: docs/design/rfc-user-tasks.md.
Follow-up (next commit): workspace-scoped read/update/delete of own tasks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:58:40 -07:00
core-devops b92dc7895c feat(canvas): wire concierge home to real backend data
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
qa-review / approved (pull_request_target) Failing after 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 3s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
sop-tier-check / tier-check (pull_request_target) Failing after 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m1s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m21s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 1m29s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m15s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m13s
CI / Platform (Go) (pull_request) Successful in 4m2s
CI / Canvas (Next.js) (pull_request) Successful in 6m23s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2s
Replace the concept's demo content in the concierge Home with live data:

- CHAT — new ConciergeChat reuses the real chat plumbing (useChatHistory +
  useChatSend → /workspaces/:id/a2a + useChatSocket) pointed at the platform
  agent, rendered in the concept style. Empty → greeting; composer is
  status-aware (disabled/annotated when the agent isn't online).
- RECENT ACTIVITY — GET /workspaces/:platformId/activity (real rows).
- APPROVALS — GET /approvals/pending + decide via
  POST /workspaces/:wsId/approvals/:id/decide (real, with the tab count).
- TASKS — GET /workspaces/:platformId/schedules for now (the tab count is
  live). NOTE: this is interim — "Tasks" is meant to be agent→user asks,
  which has no backend yet; tracked separately.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:25:17 -07:00
core-devops 5c2cbd265a fix(canvas): contain canvas overlays inside the Org map view
The live canvas's overlays (Toolbar, Legend, Communications pill, New
Workspace, minimap) use position:fixed and were anchoring to the viewport,
so they overlapped the concierge rail + topbar. Give the canvas mount a
transform so it becomes the containing block for those fixed descendants —
they now anchor to the map view area instead of the viewport.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:11:53 -07:00
core-devops 455bf4a0b3 fix(canvas): no nested <button> in concierge agent rows
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 13s
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
sop-tier-check / tier-check (pull_request_target) Failing after 7s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 32s
security-review / approved (pull_request_target) Failing after 24s
gate-check-v3 / gate-check (pull_request_target) Successful in 26s
E2E Chat / detect-changes (pull_request) Successful in 32s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m50s
CI / Platform (Go) (pull_request) Successful in 4m20s
CI / Canvas (Next.js) (pull_request) Successful in 6m50s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 5s
The agent row was a <button> with the expand/collapse caret <button> nested
inside it — invalid HTML that triggered a hydration error. Make the row a
<div role="button"> with keyboard (Enter/Space) activation so the caret can
stay an independent button.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:08:57 -07:00
core-devops f22f715756 feat(canvas): faithful Org Concierge shell (rail + topbar + home + map)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 18s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 21s
E2E Chat / detect-changes (pull_request) Successful in 24s
gate-check-v3 / gate-check (pull_request_target) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 25s
E2E Chat / E2E Chat (pull_request) Successful in 13s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m27s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m25s
ci-arm64-advisory / fast-checks (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Canvas Deploy Status (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
Rebuild the concierge UI to match the molecule-concierge-v1 concept instead
of the earlier approximation. New app shell (ConciergeShell) ported from the
concept's HTML/CSS into a scoped CSS module so its generic class names can't
collide with the rest of the app:

- Left ICON RAIL — Home / Org map / Settings (collapsible, Molecule mark).
- TOPBAR — org selector + search / notifications / theme toggle / avatar.
- HOME view — Agents / Tasks / Approvals sidebar (live agent TREE built from
  the canvas nodes, with avatars, role, status dot, queue count and
  connector lines) + Recent activity, beside a concierge CHAT with the
  concept's ACTION cards (workspace / schedule) and the amber APPROVAL
  REQUIRED card + composer.
- ORG MAP view — the existing live <Canvas/> (node graph), unchanged.
- SETTINGS view — placeholder.

Default top-level view is now Home (concierge-first, matching the concept).
Replaces the earlier ConciergeHome + TopViewTabs (removed). Chat/tasks/
approvals content is the concept's demo conversation for now — the agent
tree and org map are live; live concierge chat follows with BYOK.

Full suite green (3338 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:05:26 -07:00
core-devops c4713bafa7 feat(canvas): Home/Map two-tab shell + bigger uniform workspace cards
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 13s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 14s
sop-checklist / review-refire (pull_request_target) Has been skipped
security-review / approved (pull_request_target) Failing after 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
qa-review / approved (pull_request_target) Failing after 19s
gate-check-v3 / gate-check (pull_request_target) Successful in 21s
Harness Replays / Harness Replays (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request_target) Failing after 27s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m19s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m11s
CI / Platform (Go) (pull_request) Successful in 6m9s
CI / Canvas (Next.js) (pull_request) Successful in 6m21s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
Two top-level views, switchable from a Home/Map control (top-left):

- Home — the Org Concierge view: chat with the platform agent (the
  org-root, kind='platform' workspace) plus a left Agents rail showing the
  org hierarchy with status dots. Reuses the existing ChatTab (history +
  socket + send), so it's a real conversation, not a mock. Resolves the
  platform agent via GET /registry/platform-agent with a root-node
  fallback so it works on stacks without the resolver.
- Map — the existing node-graph canvas (unchanged), default view.

State: new `topView` ('home' | 'map') + `setTopView` on the canvas store.

Bigger, uniform workspace cards (per design): leaves now render at the
layout grid size — bumped CHILD_DEFAULT_WIDTH/HEIGHT 240x130 -> 300x176
(frontend + the Go mirror in org.go, kept in lockstep) — with roomier
padding and larger name/pill/status typography. Parents still grow to fit
their children. This makes the canvas read as deliberately sized rather
than cramped auto-size.

Tests: add TopViewTabs.test (renders + switches the store view). Re-base
the layout-math assertions in canvas-topology-pure.test and DropTargetBadge
on the size constants so they track the card size instead of drifting on a
future resize. Full suite green (3342 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 22:47:51 -07:00
core-devops bac1dc0701 feat(canvas): system-controlled workspace sizing, remove free-resize
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 23s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 24s
CI / Platform (Go) (pull_request) Successful in 16s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 19s
gate-check-v3 / gate-check (pull_request_target) Successful in 7s
qa-review / approved (pull_request_target) Failing after 4s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
security-review / approved (pull_request_target) Failing after 6s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
CI / Canvas (Next.js) (pull_request) Successful in 6m19s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 1s
Workspace container size + shape are now determined by the system instead
of being user-resizable:

- Remove the NodeResizer drag handles from WorkspaceNode (no more
  edge/corner free-resize).
- Remove the Cmd/Ctrl+Arrow keyboard resize shortcut (and its now-unused
  helper/imports) — it was the keyboard equivalent of free-resize.
- Render leaf cards at the layout engine's grid dimensions
  (w-240 x min-h-130 = CHILD_DEFAULT_WIDTH/HEIGHT) so they sit cleanly in
  their computed slots and are uniform; parents keep growing to fit their
  children via growParentsToFitChildren.

Sizes were never persisted server-side, so leaves are always content-
measured from their fixed-size CSS and parents recompute each load — fully
deterministic, no stale user-resized dimensions.

Tests: replace the keyboard-resize assertions with a negative test proving
Cmd/Ctrl+Arrow no longer emits a dimensions change. Full suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 21:59:15 -07:00
core-devops 0e0fc210b5 feat(canvas): node card to concept layout — role/model pills, status line, queued (Phase C)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 18s
CI / Detect changes (pull_request) Successful in 20s
Harness Replays / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request_target) Failing after 7s
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
sop-tier-check / tier-check (pull_request_target) Failing after 7s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 20s
security-review / approved (pull_request_target) Failing after 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 36s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
CI / Canvas (Next.js) (pull_request) Successful in 6m14s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 56s
Restyle WorkspaceNode to match the Org Concierge concept (style-only, no logic):
  - header right: model pill (Opus/Sonnet/Haiku, shortened from agent_card.model;
    falls back to tier badge);
  - role pill (uppercase, accent-bordered) — platform root shows PLATFORM·ROOT;
    REMOTE marker kept for external runtimes;
  - status line (uppercase, status-toned) with '· N AGENTS' for parents + a
    'N queued' pill (from activeTasks); removed the old duplicate status/tasks
    footer row.

Updated the 5 presentational tests to the new card (status now shown for online,
queued not tasks, agent-count in status, role pill not runtime pill). All 51
WorkspaceNode tests pass; build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 21:35:01 -07:00
core-devops bc9c930d7c feat(canvas): node card brand colors -> tokens (Phase C, partial)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 15s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 5s
CI / Detect changes (pull_request) Successful in 47s
qa-review / approved (pull_request_target) Failing after 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
security-review / approved (pull_request_target) Failing after 24s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m30s
CI / Canvas (Next.js) (pull_request) Failing after 6m13s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / all-required (pull_request) Has been skipped
WorkspaceNode mixed the design tokens (which Phase A re-skinned to purple) with
hardcoded brand colors Phase A can't reach. Replace those: blue-300/400/500 ->
accent (purple), hover:border-zinc-500 -> border-ink-soft, ring-offset-zinc-950
-> ring-offset-surface. Emerald (drag-target/online) + black shadows are
semantic and kept. The agent card now reads purple/token-based like the concept.

Build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 19:54:13 -07:00
core-devops d5910dc3b2 feat(canvas): Org Concierge design tokens + typography (Phase A)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
qa-review / approved (pull_request_target) Failing after 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request_target) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
security-review / approved (pull_request_target) Failing after 9s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
sop-checklist / all-items-acked (pull_request_target) Successful in 11s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
CI / Canvas (Next.js) (pull_request) Successful in 6m20s
CI / Canvas Deploy Status (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2s
Reskin the tenant canvas to the Org Concierge concept via its existing
--color-* token layer (no logic/layout change):
  - purple accent (#7c3aed light / #a78bfa dark) replacing blue, across the
    warm-paper @theme set + the always-dark node tokens (--color-accent-dim/
    --color-plasma);
  - near-black dark surfaces + warm-paper light matching the concept; state
    colors retuned (light AA-safe, dark uses concept values);
  - swap Inter -> Hanken Grotesk via next/font (JetBrains Mono already present),
    wired to the --font-sans/--font-mono tokens; updated the mobile palette +
    the next/font test mock accordingly.

Canvas build green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 19:44:14 -07:00
153 changed files with 15095 additions and 960 deletions
+247
View File
@@ -0,0 +1,247 @@
#!/usr/bin/env python3
"""
SSOT fail-closed approval validator (SEV-1 internal#812).
This module is the SINGLE source of truth for whether a Gitea review counts
as a "genuine" approval. Both consumers must call into it — they MUST NOT
duplicate the predicate:
- .gitea/scripts/gitea-merge-queue.py (Python) — imports directly.
- .gitea/scripts/review-check.sh (bash, jq) — calls the Python helper
at .gitea/scripts/_review_check_filter.py, which in turn calls this
module. There is no separate jq / bash copy of the predicate; a
reviewer who wants to weaken the gate has to weaken this one file.
# The fail-closed contract
A review counts as a GENUINE APPROVED on the current head ONLY IF ALL hold:
1. state == "APPROVED"
2. official == true
3. dismissed != true
4. stale != true
5. commit_id is present and equals the PR's current head SHA
ANY failure of any of the above → REJECT.
# The bug this fixes
The previous gitea-merge-queue.py predicate had a `if isinstance(commit_id,
str) and commit_id and headsha:` guard that *skipped* the commit_id check
when the review carried no commit_id. The previous review-check.sh jq
filter required `commit_id == $head`, which is also implicitly fail-closed
on missing commit_id (null != head), but only one of the two consumers
behaved correctly — a code-drift trap.
Both behaviors are now defined here, as a single fail-closed predicate.
A MISSING commit_id is the Gitea row signature of a spoofed or pre-commit
review: a real reviewer cannot have submitted against a commit that
doesn't exist. Accepting these is exactly the fail-open that SEV-1
internal#812 describes and the re-opened path that closed #843 (with CR2
+ Researcher both flagging it) addresses.
# Mutation-resistance
The unit tests in tests/test_approval_validator.py assert rejection
explicitly for each fail-closed case (missing commit_id, stale head,
non-official, dismissed, etc.). A reviewer who tries to weaken the
predicate by removing the commit_id check, by re-introducing the
"no commit_id is accepted" escape hatch, or by changing `!=` to `==`
in the head comparison will trip those tests in CI.
"""
from __future__ import annotations
from typing import Iterable, Optional, Tuple
# ---------------------------------------------------------------------------
# Canonical Gitea review-state enum (EXACT match -- no case coercion).
# ---------------------------------------------------------------------------
#
# Gitea's reviews API emits review.state as one of a fixed set of
# UPPERCASE string constants: "APPROVED", "REQUEST_CHANGES",
# "REQUEST_REVIEW", "COMMENT", "PENDING", "DISMISSED" (verified
# against the live API across real molecule-core PRs). They are ALWAYS
# uppercase on the wire.
#
# FAIL-CLOSED: we compare review.state to these constants with EXACT
# equality. The previous code used str(state or "").upper(), which
# coerced a lowercase/mixed-case "approved" or "request_changes" into
# the canonical value and ACCEPTED it. A real Gitea row never carries a
# lowercase state, so a case-variant value is the signature of a
# hand-forged / spoofed row, not a legitimate review. Coercing it was a
# residual fail-open (SEV-1 internal#812, RCs 9849/9851/9852). We reject
# anything that is not byte-for-byte the canonical constant.
STATE_APPROVED = "APPROVED"
STATE_REQUEST_CHANGES = "REQUEST_CHANGES"
# ---------------------------------------------------------------------------
# Shared predicate — fail-closed on every condition
# ---------------------------------------------------------------------------
def is_official_current_head(review: object, headsha: object) -> bool:
"""Common predicate: review is official, not dismissed, not stale, and
bound to the PR's current head SHA. EVERY condition is mandatory and
fail-closed. Both is_genuine_approval and is_open_request_changes build
on this so the rule cannot drift between the two cases.
`official` is checked with `is not True` (NOT `not review.get("official")`).
The latter is truthy on the string "false" or the integer 1, which is
exactly the fail-open surface we are closing here — a non-boolean
pass-through is treated as official. Gitea emits a real boolean, so
the stricter check rejects anything that isn't literally True.
"""
if not isinstance(review, dict):
return False
if review.get("official") is not True:
return False
if review.get("dismissed"):
return False
if review.get("stale"):
return False
commit_id = review.get("commit_id")
# FAIL-CLOSED: a missing/empty/non-string commit_id is REJECTED. The
# previous code had `if isinstance(commit_id, str) and commit_id and
# headsha:` which SKIPPED the check when the review carried no
# commit_id. That was the spoof-bug surface.
if not isinstance(commit_id, str) or not commit_id:
return False
# FAIL-CLOSED: a present-but-wrong commit_id is also REJECTED. Stale
# reviews (on a previous head) cannot count.
if not isinstance(headsha, str) or not headsha or commit_id != headsha:
return False
return True
# ---------------------------------------------------------------------------
# Per-verdict predicates
# ---------------------------------------------------------------------------
def is_genuine_approval(
review: object,
*,
headsha: str,
reviewer_set: Optional[Iterable[str]] = None,
) -> bool:
"""Return True iff `review` is a genuine APPROVED on the current head.
When `reviewer_set` is provided, the review's `user.login` must be in
the set (the merge-queue uses this to count only "recognised"
reviewers for the 2-genuine floor; review-check.sh applies its own
team-membership probe separately and so does not pass a set).
"""
if not isinstance(review, dict):
return False
# EXACT-ENUM (fail-closed): no .upper()/.strip() coercion. A
# case-variant or whitespace-padded state is a forged row and is
# rejected, not normalised into APPROVED.
if review.get("state") != STATE_APPROVED:
return False
if not is_official_current_head(review, headsha):
return False
if reviewer_set is not None:
user = (review.get("user") or {}).get("login")
if not isinstance(user, str) or user not in set(reviewer_set):
return False
return True
def is_open_request_changes(review: object, *, headsha: str) -> bool:
"""Return True iff `review` is an open official REQUEST_CHANGES on the
current head. Same fail-closed contract as is_genuine_approval —
a missing commit_id is REJECTED, not silently treated as 'still
blocking the merge from an old head'.
"""
if not isinstance(review, dict):
return False
# EXACT-ENUM (fail-closed): same contract as is_genuine_approval. A
# lowercase/mixed-case "request_changes" must NOT be coerced into a
# block-erasing match; an exact REQUEST_CHANGES is required.
if review.get("state") != STATE_REQUEST_CHANGES:
return False
if not is_official_current_head(review, headsha):
return False
return True
# ---------------------------------------------------------------------------
# Consumer-facing reducer (returns the two call sites need)
# ---------------------------------------------------------------------------
def classify_reviews(
reviews: Iterable[object],
*,
headsha: str,
reviewer_set: Optional[Iterable[str]] = None,
) -> Tuple[set[str], list[str]]:
"""Reduce a PR's reviews to (approvers, request_changes) on the CURRENT head.
approvers: distinct logins whose LATEST official review on the current
head is APPROVED.
request_changes: distinct logins whose LATEST official review on the
current head is REQUEST_CHANGES.
Gitea returns reviews oldest-first. We keep the latest *VALID*
submission per user (later VALID entries overwrite earlier ones; an
invalid later row — a COMMENT, or a review with a null/old commit_id —
is ignored and can NOT overwrite or erase a genuine review). See the
inline VALIDATE-BEFORE-REDUCE note below for the exploit this closes.
"""
reviewer_set_set = set(reviewer_set) if reviewer_set is not None else None
# VALIDATE-BEFORE-REDUCE (SEV-1 internal#812 follow-up).
#
# The earlier implementation reduced FIRST (latest row per user, keyed
# only on state in {APPROVED, REQUEST_CHANGES}) and validated the single
# surviving row AFTER. That is reduce-before-validate, and it is
# exploitable: a user posts a genuine current-head APPROVED (or
# REQUEST_CHANGES), then posts a LATER row that fails the fail-closed
# predicate (a COMMENT, or an APPROVED with a null/old commit_id). The
# later INVALID row overwrote the genuine one in latest_by_user, so a
# real approval was masked, and — worse — a real current-head
# REQUEST_CHANGES could be erased and the block silently evaporate.
#
# The fix: filter to VALID reviews FIRST (each row must pass
# is_official_current_head AND carry an APPROVED/REQUEST_CHANGES state),
# and only then reduce to the latest VALID review per user. An invalid
# later row is never eligible to become a user's "latest" state, so it
# cannot overwrite or erase a genuine review. A user's verdict is the
# state of their latest VALID (official, current-head, non-dismissed,
# non-stale, commit_id-present-and-matching) review.
latest_valid_by_user: dict = {}
for review in reviews:
if not isinstance(review, dict):
continue
user = (review.get("user") or {}).get("login")
if not isinstance(user, str):
continue
if reviewer_set_set is not None and user not in reviewer_set_set:
continue
# EXACT-ENUM (fail-closed): exact constants only, no coercion. A
# case-coerced row must not become eligible to overwrite/erase a
# genuine per-user verdict in the reduce below.
state = review.get("state")
if state not in (STATE_APPROVED, STATE_REQUEST_CHANGES):
continue
# Fail-closed predicate BEFORE the reduce: official, not dismissed,
# not stale, commit_id present AND == head. Invalid rows are dropped
# here and so can never become the per-user "latest".
if not is_official_current_head(review, headsha):
continue
latest_valid_by_user[user] = review
approvers: set[str] = set()
request_changes: list[str] = []
for user, review in latest_valid_by_user.items():
# Each surviving review already passed is_official_current_head, so
# the state alone determines the verdict. We still go through the
# per-verdict SSOT predicates so the rule cannot drift.
if is_genuine_approval(review, headsha=headsha, reviewer_set=None):
approvers.add(user)
elif is_open_request_changes(review, headsha=headsha):
request_changes.append(user)
return approvers, request_changes
+74
View File
@@ -0,0 +1,74 @@
#!/usr/bin/env python3
"""
Helper for review-check.sh: applies the SSOT approval predicate to a
PR's reviews and prints the candidate approver logins on stdout (one per
line, de-duplicated, author excluded).
review-check.sh uses this in place of its previous inline jq filter so the
predicate is single-sourced. The jq filter is gone; if you want to change
the predicate, edit .gitea/scripts/_approval_validator.py, not this file.
Usage:
python3 _review_check_filter.py <reviews.json> <head-sha> <author-login>
Output:
- Candidate approver logins, one per line, de-duplicated, sorted.
- Excludes `author-login` (the PR author cannot approve their own PR).
- Empty output → review-check.sh interprets as "no candidates" and exits 1
after the team-membership probe.
"""
from __future__ import annotations
import json
import sys
from pathlib import Path
# Same-dir import — script lives next to _approval_validator.py
sys.path.insert(0, str(Path(__file__).resolve().parent))
from _approval_validator import is_genuine_approval # noqa: E402
def main(argv: list[str]) -> int:
if len(argv) != 4:
print(
f"usage: {argv[0] if argv else '_review_check_filter.py'} "
"<reviews.json> <head-sha> <author-login>",
file=sys.stderr,
)
return 2
reviews_path = Path(argv[1])
headsha = argv[2]
author = argv[3]
try:
reviews = json.loads(reviews_path.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError) as exc:
print(f"::error::could not read reviews JSON: {exc}", file=sys.stderr)
return 2
if not isinstance(reviews, list):
print("::error::reviews JSON was not a list", file=sys.stderr)
return 2
candidates: set[str] = set()
for review in reviews:
# We pass reviewer_set=None here because review-check.sh applies its
# own team-membership probe (CURL_AUTH_FILE + 200/204/403/404 logic)
# separately. The SSOT predicate enforces only the fail-closed
# commit_id / state / official / dismissed / stale contract here.
if not is_genuine_approval(review, headsha=headsha, reviewer_set=None):
continue
user = (review.get("user") or {}).get("login")
if not isinstance(user, str) or not user:
continue
if user == author:
continue
candidates.add(user)
for user in sorted(candidates):
print(user)
return 0
if __name__ == "__main__":
sys.exit(main(sys.argv))
+56 -19
View File
@@ -116,28 +116,65 @@ fi
# 3. Status-check state at the PR HEAD (where checks ran). The merge
# commit doesn't get its own checks; we evaluate the PR's last
# commit, which is what branch protection compared against.
# Fail-closed: verify HTTP 200. A 401/403/404 means the status is
# unreadable — we must NOT treat that as "no statuses" and skip checks.
STATUS_TMP=$(mktemp)
STATUS_HTTP=$(curl -sS -o "$STATUS_TMP" -w '%{http_code}' -H "$AUTH" \
"${API}/repos/${OWNER}/${NAME}/commits/${HEAD_SHA}/status")
STATUS=$(cat "$STATUS_TMP")
rm -f "$STATUS_TMP"
if [ "$STATUS_HTTP" != "200" ]; then
echo "::error::GET /commits/${HEAD_SHA}/status returned HTTP ${STATUS_HTTP} — cannot evaluate required checks."
exit 1
fi
# FAIL-CLOSED: a 200 status response missing the 'statuses' array, or with
# 'statuses' set to a non-array type (null/string/object), must NOT be treated
# as "no checks" — that would silently declare all checks green.
if ! echo "$STATUS" | jq -e '(.statuses | type) == "array"' >/dev/null; then
echo "::error::GET /commits/${HEAD_SHA}/status returned HTTP 200 but 'statuses' is missing or not an array — cannot evaluate required checks."
exit 1
fi
#
# Pagination (status-pagination RCA, #2440-family): the combined
# /commits/{sha}/status endpoint caps its embedded `statuses` array at the
# Gitea default page size (~30). On a high-churn PR an older-but-still-current
# required-context SUCCESS row is pushed PAST that cap, so reading the combined
# view would record that context as `missing` and emit a FALSE-POSITIVE
# force-merge. We instead page through the dedicated /commits/{sha}/statuses
# list to EXHAUSTION (until a short/empty page), accumulating every row.
#
# Fail-closed is preserved end to end: any non-200 page, or a page whose body
# is not a JSON array, aborts with exit 1 (we never treat an unreadable/partial
# page as "no checks"). A genuinely-absent required context appears on NO page,
# so CHECK_STATE has no entry for it → `${...:-missing}` below keeps it
# `missing` → it is still counted as not-green. No fail-open path is added.
PER_PAGE=100
page=1
ALL_STATUSES_TMP=$(mktemp)
printf '[]' > "$ALL_STATUSES_TMP" # accumulator: a single JSON array of rows
while :; do
STATUS_TMP=$(mktemp)
STATUS_HTTP=$(curl -sS -o "$STATUS_TMP" -w '%{http_code}' -H "$AUTH" \
"${API}/repos/${OWNER}/${NAME}/commits/${HEAD_SHA}/statuses?page=${page}&limit=${PER_PAGE}")
PAGE_BODY=$(cat "$STATUS_TMP")
rm -f "$STATUS_TMP"
if [ "$STATUS_HTTP" != "200" ]; then
rm -f "$ALL_STATUSES_TMP"
echo "::error::GET /commits/${HEAD_SHA}/statuses?page=${page} returned HTTP ${STATUS_HTTP} — cannot evaluate required checks."
exit 1
fi
# FAIL-CLOSED: the /statuses endpoint returns a bare JSON array. A non-array
# body (null/object/string) means the response is malformed — we must NOT
# treat that as "no checks", which would silently declare all checks green.
if ! echo "$PAGE_BODY" | jq -e 'type == "array"' >/dev/null 2>&1; then
rm -f "$ALL_STATUSES_TMP"
echo "::error::GET /commits/${HEAD_SHA}/statuses?page=${page} returned HTTP 200 but body is not a JSON array — cannot evaluate required checks."
exit 1
fi
PAGE_COUNT=$(echo "$PAGE_BODY" | jq 'length')
# Append this page's rows to the accumulator (insertion order is preserved
# but NOT relied upon — the collapse below selects max-by-id per context).
COMBINED=$(jq -s '.[0] + .[1]' "$ALL_STATUSES_TMP" <(echo "$PAGE_BODY"))
printf '%s' "$COMBINED" > "$ALL_STATUSES_TMP"
# Short page (fewer than PER_PAGE rows) ⇒ last page ⇒ stop.
if [ "$PAGE_COUNT" -lt "$PER_PAGE" ]; then
break
fi
page=$((page + 1))
done
STATUS=$(cat "$ALL_STATUSES_TMP")
rm -f "$ALL_STATUSES_TMP"
declare -A CHECK_STATE
# Gitea's /commits/{sha}/statuses is roughly newest-first but NOT strictly
# monotonic by id (observed first ids 157,155,156,… — local inversions from
# re-runs and page boundaries), so neither first- nor last-occurrence reliably
# yields the current row. Select the MAX-id row per context explicitly
# (order-independent), matching prod-auto-deploy.py's latest_status_for_context.
while IFS=$'\t' read -r ctx state; do
[ -n "$ctx" ] && CHECK_STATE[$ctx]="$state"
done < <(echo "$STATUS" | jq -r '.statuses | .[] | "\(.context)\t\(.status)"')
done < <(echo "$STATUS" | jq -r 'group_by(.context) | map(max_by(.id)) | .[] | "\(.context)\t\(.status)"')
# 4. For each required check, was it green at merge? YAML block scalars
# (`|`) leave a trailing newline; skip blank/whitespace-only lines.
+11
View File
@@ -26,10 +26,21 @@ PROFILES: dict[str, dict[str, str]] = {
"handlers": (
r"^workspace-server/internal/handlers/"
r"|^workspace-server/internal/wsauth/"
# #2148: registry-auth real-PG integration tests (CanCommunicate
# parent_id hierarchy lives in internal/registry; org-admin token
# revoke/validate lives in internal/orgtoken) run in this same
# workflow, so a regression in either package MUST trigger the job.
r"|^workspace-server/internal/registry/"
r"|^workspace-server/internal/orgtoken/"
# #2149: the scheduler real-PG integration tests run in this same
# workflow (they reuse its migrated Postgres), so changes to the
# scheduler package must trigger the job too.
r"|^workspace-server/internal/scheduler/"
# #2150: the db package's real-PG migration-replay-from-scratch
# + InitPostgres ping tests also run in this same workflow (they
# reuse its sibling Postgres, against a separate `molecule_replay`
# database). Changes to db must trigger the job too.
r"|^workspace-server/internal/db/"
r"|^workspace-server/migrations/"
r"|^\.gitea/workflows/handlers-postgres-integration\.yml$"
),
+26 -49
View File
@@ -105,6 +105,12 @@ import urllib.parse
import urllib.request
from typing import Any
# SSOT fail-closed approval predicate (SEV-1 internal#812). review-check.sh
# consumes the same module via _review_check_filter.py — do NOT duplicate
# the predicate here. See _approval_validator.py for the fail-closed contract.
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from _approval_validator import classify_reviews as _classify_reviews_ssot # noqa: E402
def _env(key: str, *, default: str = "") -> str:
return os.environ.get(key, default)
@@ -424,57 +430,26 @@ def get_branch_protection(branch: str) -> BranchProtection:
def genuine_approvals(
reviews: list[dict],
*,
head_sha: str,
headsha: str,
reviewer_set: set[str],
) -> tuple[set[str], list[str]]:
"""Reduce a PR's reviews to genuine official approvals on the CURRENT head.
"""Thin wrapper over the SSOT predicate in _approval_validator.py.
Returns (approvers, request_changes) where:
- approvers is the set of distinct logins (in reviewer_set) whose LATEST
review on the current head is an official, non-stale, non-dismissed
APPROVED, and
- request_changes is the list of logins (in reviewer_set) whose latest
official review on the current head is REQUEST_CHANGES.
All logic — the per-review commit_id / state / official / dismissed /
stale contract — lives in _approval_validator.classify_reviews. This
wrapper exists only to keep the call site (and external readers of
the symbol) stable. Do NOT add any per-review logic here; if you need
to change the predicate, edit _approval_validator.py.
"Current head" is enforced two ways, because Gitea exposes both signals:
a review must be `official` and NOT `stale`/`dismissed`, AND when the
review carries a commit_id it must equal head_sha. A review with no
commit_id but stale=False/dismissed=False is accepted (older Gitea rows).
We take each reviewer's LATEST submission (reviews arrive oldest-first), so
a later REQUEST_CHANGES correctly supersedes an earlier APPROVED and vice
versa.
See _approval_validator.py for the full fail-closed contract
(SEV-1 internal#812). The previous inline implementation had a
`if isinstance(commit_id, str) and commit_id and headsha:` guard that
silently accepted reviews with no commit_id; that fail-open surface is
now closed at the SSOT.
"""
latest_by_user: dict[str, dict] = {}
for review in reviews:
if not isinstance(review, dict):
continue
user = (review.get("user") or {}).get("login")
if not isinstance(user, str) or user not in reviewer_set:
continue
state = str(review.get("state") or "").upper()
if state not in {"APPROVED", "REQUEST_CHANGES"}:
continue # ignore COMMENT/PENDING/DISMISSED-state rows
# reviews are returned oldest-first; later entries overwrite → latest wins
latest_by_user[user] = review
approvers: set[str] = set()
request_changes: list[str] = []
for user, review in latest_by_user.items():
if not review.get("official"):
continue
if review.get("stale") or review.get("dismissed"):
continue
commit_id = review.get("commit_id")
if isinstance(commit_id, str) and commit_id and head_sha:
if commit_id != head_sha:
continue # review was on a previous head
state = str(review.get("state") or "").upper()
if state == "APPROVED":
approvers.add(user)
elif state == "REQUEST_CHANGES":
request_changes.append(user)
return approvers, request_changes
return _classify_reviews_ssot(
reviews, headsha=headsha, reviewer_set=reviewer_set
)
def get_pull_reviews(pr_number: int) -> list[dict]:
_, body = api("GET", f"/repos/{OWNER}/{NAME}/pulls/{pr_number}/reviews")
@@ -779,7 +754,7 @@ def list_queued_issues() -> list[dict]:
query={
"state": "open",
"type": "pulls",
"labels": QUEUE_LABEL,
"label": QUEUE_LABEL,
},
)
@@ -1147,7 +1122,7 @@ def _evaluate_candidate(
reviews = get_pull_reviews(pr_number)
approvers, request_changes = genuine_approvals(
reviews, head_sha=head_sha, reviewer_set=REVIEWER_SET
reviews, headsha=head_sha, reviewer_set=REVIEWER_SET
)
decision = evaluate_merge_readiness(
@@ -1183,7 +1158,9 @@ def enumerate_readiness(*, dry_run: bool = False) -> list[ReadinessEntry]:
post-batch summary can be printed.
"""
bp = get_branch_protection(WATCH_BRANCH)
contexts = bp.required_contexts
# Uniform gate: governance checks are ALWAYS required, even if branch
# protection does not enumerate them. Deduplicate against BP list.
contexts = list(dict.fromkeys(bp.required_contexts + GOVERNANCE_REQUIRED_CONTEXTS))
required_approvals = bp.required_approvals
main_sha = get_branch_head(WATCH_BRANCH)
+1 -1
View File
@@ -165,7 +165,7 @@ def api(
# Format: "<workflow_name> / <job_name_or_key> (<event>)"
# Examples observed on molecule-core/main:
# "Secret scan / Scan diff for credential-shaped strings (pull_request)"
# " / tier-check (pull_request)"
# "sop-checklist / all-items-acked (pull_request)"
#
# Split strategy: peel off the trailing ` (<event>)` first, then split
# the leading `<workflow> / <rest>` on the FIRST ` / ` (workflow names
+77 -11
View File
@@ -95,17 +95,27 @@ def build_plan(env: dict[str, str]) -> dict:
def latest_status_for_context(statuses: list[dict], context: str) -> dict | None:
"""Return the first matching status.
"""Return the NEWEST status row for ``context`` (highest ``id``).
Gitea's combined-status response is newest-first in practice. The merge
queue relies on the same contract; keeping the selector explicit makes
stale duplicate contexts easy to test.
This must work for BOTH orderings Gitea exposes: the combined
``/status`` view is newest-first, but the exhaustively-paginated
``/statuses`` list (see ``fetch_all_statuses``) is ascending id order
(oldest-first). Selecting by max ``id`` collapses duplicate context rows
to the current one regardless of input order, so a stale earlier run can
never shadow the latest result. Rows without an ``id`` are treated as
oldest (id -1) so a well-formed newer row always wins.
"""
newest: dict | None = None
newest_id = -1
for status in statuses:
if status.get("context") == context:
return status
return None
if status.get("context") != context:
continue
raw_id = status.get("id")
sid = raw_id if isinstance(raw_id, int) else -1
if newest is None or sid >= newest_id:
newest = status
newest_id = sid
return newest
def ci_context_state(statuses: list[dict], context: str) -> str:
@@ -351,6 +361,55 @@ def _api_json(url: str, token: str) -> dict:
raise RuntimeError(f"GET {url} -> HTTP {exc.code}: {body}") from exc
def _api_json_list(url: str, token: str) -> list:
"""GET a Gitea list endpoint and return the JSON array.
Like ``_api_json`` but asserts the body is a list. Fail-closed: a non-list
body (or HTTP error) raises so the caller never mistakes an unreadable page
for "no more statuses" and silently truncates the required-context scan.
"""
req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
try:
with urllib.request.urlopen(req, timeout=20) as resp:
body = json.loads(resp.read())
except urllib.error.HTTPError as exc:
detail = exc.read().decode("utf-8", errors="replace")[:500]
raise RuntimeError(f"GET {url} -> HTTP {exc.code}: {detail}") from exc
if not isinstance(body, list):
raise RuntimeError(f"GET {url} -> expected JSON array, got {type(body).__name__}")
return body
def fetch_all_statuses(host: str, repo: str, sha: str, token: str, page_size: int = 100) -> list[dict]:
"""Return EVERY commit-status row for ``sha``, paginating to exhaustion.
The combined ``/commits/{sha}/status`` endpoint caps its embedded
``statuses`` array at the Gitea default page size (~30). On a high-churn
commit, an older-but-still-current required-context SUCCESS row is pushed
PAST that cap, so a reader of the combined view sees the required context
as ``missing`` and either blocks (force-merge audit) or waits forever
(this deploy gate). We instead walk ``/commits/{sha}/statuses`` page by
page until a short/empty page, accumulating ALL rows.
Fail-closed: any page that errors or is not a list raises (see
``_api_json_list``) — we never degrade to a partial list and call a deploy
green. A genuinely-absent required context simply never appears on ANY
page, so the caller's ``ci_context_state`` still reports ``missing`` and
the gate stays closed.
"""
base = f"https://{host}/api/v1/repos/{repo}/commits/{sha}/statuses"
results: list[dict] = []
page = 1
while True:
page_url = f"{base}?page={page}&limit={page_size}"
rows = _api_json_list(page_url, token)
results.extend(r for r in rows if isinstance(r, dict))
if len(rows) < page_size:
break
page += 1
return results
def _api_json_optional(url: str, token: str) -> tuple[int, dict | None]:
req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
try:
@@ -472,12 +531,19 @@ def wait_for_ci_context(env: dict[str, str]) -> str:
if not token:
raise ValueError("GITEA_TOKEN is required to wait for CI status")
url = f"https://{host}/api/v1/repos/{repo}/commits/{sha}/status"
deadline = time.time() + timeout
last_states: dict[str, str] = {}
while time.time() <= deadline:
body = _api_json(url, token)
statuses = body.get("statuses") or []
# Read the FULL, exhaustively-paginated /statuses list — NOT the
# combined /status view, whose embedded `statuses` array is capped at
# the Gitea page size (~30). On a high-churn commit a required-context
# SUCCESS row lands past that cap and the combined view would report
# it `missing`, so this gate would wait until timeout and refuse a
# legitimate prod deploy. Fetching every page closes that hole.
# Fail-closed is preserved: a genuinely-absent required context is on
# NO page, so ci_context_state() still returns "missing" → never
# satisfied → the deploy stays blocked.
statuses = fetch_all_statuses(host, repo, sha, token)
states = {context: ci_context_state(statuses, context) for context in contexts}
for context, state in states.items():
if state != last_states.get(context):
+7 -11
View File
@@ -197,17 +197,13 @@ if [ "$HTTP_CODE" != "200" ]; then
exit 1
fi
# Filter: state=APPROVED, official=true, not-dismissed, non-author,
# commit_id matches current PR head. All conditions are mandatory.
JQ_FILTER='.[]
| select(.state == "APPROVED")
| select(.official == true)
| select(.dismissed != true)
| select(.user.login != $author)
| select(.commit_id == $head)
| .user.login'
REVIEW_CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILTER" "$REVIEWS_JSON" | sort -u)
# Filter via the SSOT fail-closed predicate in _approval_validator.py
# (same module gitea-merge-queue.py imports). The jq filter is gone
# entirely — any change to the predicate must be made in
# _approval_validator.py. See SEV-1 internal#812 for the fail-closed
# contract this closes.
SCRIPT_DIR_HERE="$(cd "$(dirname "$0")" && pwd)"
REVIEW_CANDIDATES=$(python3 "$SCRIPT_DIR_HERE/_review_check_filter.py" "$REVIEWS_JSON" "$PR_HEAD_SHA" "$PR_AUTHOR")
debug "candidate non-author approvers: $(echo "$REVIEW_CANDIDATES" | tr '\n' ' ')"
if [ -z "$REVIEW_CANDIDATES" ]; then
@@ -134,6 +134,14 @@ class Handler(http.server.BaseHTTPRequestHandler):
return self._json(200, [
{"state": "APPROVED", "dismissed": False, "user": {"login": "core-devops"}, "commit_id": "deadbeef0000111122223333444455556666"},
])
if sc == "T23_missing_commit_id":
# APPROVED review with NO commit_id field — the SEV-1
# internal#812 / closed-#843 spoof-bug signature. The
# fail-closed SSOT must REJECT (not silently accept as
# "older Gitea row" the way the old pre-fix code did).
return self._json(200, [
{"state": "APPROVED", "official": True, "dismissed": False, "user": {"login": "core-devops"}},
])
# Default: one non-author APPROVED (current head, official)
return self._json(200, [
{"state": "APPROVED", "dismissed": False, "official": True, "user": {"login": "core-devops"}, "commit_id": "deadbeef0000111122223333444455556666"},
@@ -0,0 +1,610 @@
#!/usr/bin/env python3
"""
Mutation-verified unit tests for the SSOT fail-closed approval predicate
in _approval_validator.py (SEV-1 internal#812).
Each test asserts REJECTION explicitly. A reviewer who weakens the
predicate — e.g., by removing the commit_id check, by reintroducing the
"no commit_id is accepted" escape hatch, by changing `!=` to `==` in the
head comparison, or by allowing official == false — will trip these
tests in CI.
Run:
cd .gitea/scripts
python3 -m unittest tests.test_approval_validator -v
# or
python3 tests/test_approval_validator.py
"""
from __future__ import annotations
import os
import sys
import unittest
# Same-dir import — test lives next to _approval_validator.py
sys.path.insert(
0,
os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
)
from _approval_validator import ( # noqa: E402
classify_reviews,
is_genuine_approval,
is_official_current_head,
is_open_request_changes,
)
HEAD = "0123456789abcdef0123456789abcdef01234567"
OTHER_HEAD = "fedcba9876543210fedcba9876543210fedcba98"
def _review(
*,
state: str = "APPROVED",
official: bool = True,
dismissed: bool = False,
stale: bool = False,
commit_id: object = HEAD,
user: str = "reviewer-1",
body: str = "",
) -> dict:
"""Build a minimal review row shaped like the Gitea reviews API."""
return {
"id": 1,
"user": {"login": user, "id": 1},
"body": body,
"state": state,
"official": official,
"dismissed": dismissed,
"stale": stale,
"commit_id": commit_id,
}
# ---------------------------------------------------------------------------
# Hard contract: every fail-closed branch must reject
# ---------------------------------------------------------------------------
class IsOfficialCurrentHeadFailClosed(unittest.TestCase):
"""is_official_current_head is the common predicate. EVERY condition
is mandatory. The tests below assert REJECTION for every possible
failure of any condition."""
def test_accepts_canonical_review(self):
self.assertTrue(is_official_current_head(_review(), HEAD))
def test_rejects_non_dict(self):
for bad in [None, "string", 42, [], (), object()]:
with self.subTest(bad=bad):
self.assertFalse(is_official_current_head(bad, HEAD))
def test_rejects_when_official_is_false(self):
for v in [False, None, 0, "false"]:
with self.subTest(v=v):
self.assertFalse(
is_official_current_head(_review(official=v), HEAD)
)
def test_rejects_when_dismissed(self):
for v in [True, "true", 1]:
with self.subTest(v=v):
self.assertFalse(
is_official_current_head(_review(dismissed=v), HEAD)
)
def test_rejects_when_stale(self):
for v in [True, "true", 1]:
with self.subTest(v=v):
self.assertFalse(
is_official_current_head(_review(stale=v), HEAD)
)
def test_rejects_when_commit_id_missing(self):
"""FAIL-CLOSED #1: missing commit_id is REJECTED.
This is the spoof signature that closed #843 (with CR2 + Researcher
both flagging it)."""
for bad in [None, "", 0, False, [], {}, ()]:
with self.subTest(commit_id=bad):
self.assertFalse(
is_official_current_head(_review(commit_id=bad), HEAD),
f"commit_id={bad!r} must reject (fail-closed)",
)
def test_rejects_when_commit_id_wrong_type(self):
for bad in [123, 1.5, True, ["abc"], {"sha": HEAD}, ("tuple",)]:
with self.subTest(commit_id=bad):
self.assertFalse(
is_official_current_head(_review(commit_id=bad), HEAD)
)
def test_rejects_when_commit_id_stale(self):
"""FAIL-CLOSED #2: present-but-wrong commit_id is REJECTED. Stale
reviews on a previous head cannot count."""
self.assertFalse(
is_official_current_head(_review(commit_id=OTHER_HEAD), HEAD)
)
def test_rejects_when_head_missing(self):
for bad in [None, "", 0, False]:
with self.subTest(head=bad):
self.assertFalse(
is_official_current_head(_review(), bad)
)
def test_rejects_when_head_wrong_type(self):
self.assertFalse(is_official_current_head(_review(), 123))
self.assertFalse(is_official_current_head(_review(), ["x"]))
# ---------------------------------------------------------------------------
# is_genuine_approval
# ---------------------------------------------------------------------------
class IsGenuineApprovalContract(unittest.TestCase):
def test_accepts_canonical_approval(self):
self.assertTrue(
is_genuine_approval(_review(state="APPROVED"), headsha=HEAD)
)
def test_rejects_non_approved_states(self):
for state in ("REQUEST_CHANGES", "COMMENT", "PENDING", "DISMISSED", "approve", "", "bogus"):
with self.subTest(state=state):
self.assertFalse(
is_genuine_approval(_review(state=state), headsha=HEAD)
)
def test_rejects_case_coerced_approved_states(self):
"""EXACT-ENUM fail-closed (RCs 9849/9851/9852): Gitea always emits
the canonical UPPERCASE "APPROVED". A lowercase/mixed-case/padded
value is the signature of a forged row and MUST be rejected, not
coerced via .upper() into an accepted APPROVED. Each of these was
ACCEPTED before the exact-enum fix."""
for state in (
"approved", "Approved", "ApProVeD", "APPROVED ", " APPROVED",
"approved\n", "\tAPPROVED",
):
with self.subTest(state=state):
self.assertFalse(
is_genuine_approval(_review(state=state), headsha=HEAD),
f"case-coerced/padded state {state!r} must NOT count as "
"a genuine approval",
)
def test_rejects_non_official_approval(self):
"""Comment-based / non-official 'APPROVED' is REJECTED.
PM: 'reject comment-based / non-official reviews'."""
self.assertFalse(
is_genuine_approval(
_review(state="APPROVED", official=False), headsha=HEAD
)
)
def test_rejects_dismissed_approval(self):
self.assertFalse(
is_genuine_approval(
_review(state="APPROVED", dismissed=True), headsha=HEAD
)
)
def test_rejects_stale_head_approval(self):
"""commit_id != head is REJECTED. Stale-on-old-head approvals cannot
count, even if they were official and not dismissed."""
self.assertFalse(
is_genuine_approval(
_review(state="APPROVED", commit_id=OTHER_HEAD), headsha=HEAD
)
)
def test_rejects_missing_commit_id_approval(self):
"""FAIL-CLOSED #3: the SEV-1 case. A APPROVED review with NO
commit_id is the spoof-bug signature. Reject."""
for bad in [None, "", 0, False]:
with self.subTest(commit_id=bad):
self.assertFalse(
is_genuine_approval(
_review(state="APPROVED", commit_id=bad), headsha=HEAD
),
f"missing commit_id={bad!r} must reject",
)
def test_reviewer_set_filters_users(self):
self.assertTrue(
is_genuine_approval(
_review(user="alice"),
headsha=HEAD,
reviewer_set={"alice", "bob"},
)
)
self.assertFalse(
is_genuine_approval(
_review(user="carol"),
headsha=HEAD,
reviewer_set={"alice", "bob"},
)
)
def test_reviewer_set_none_skips_check(self):
# None means "no team filter at this layer" (e.g., review-check.sh
# applies its own team-membership probe separately).
self.assertTrue(
is_genuine_approval(
_review(user="anyone"),
headsha=HEAD,
reviewer_set=None,
)
)
# ---------------------------------------------------------------------------
# is_open_request_changes
# ---------------------------------------------------------------------------
class IsOpenRequestChangesContract(unittest.TestCase):
def test_accepts_canonical_request_changes(self):
self.assertTrue(
is_open_request_changes(
_review(state="REQUEST_CHANGES"), headsha=HEAD
)
)
def test_rejects_non_request_changes_states(self):
for state in ("APPROVED", "COMMENT", "PENDING", "DISMISSED"):
with self.subTest(state=state):
self.assertFalse(
is_open_request_changes(
_review(state=state), headsha=HEAD
)
)
def test_rejects_case_coerced_request_changes_states(self):
"""EXACT-ENUM fail-closed: a lowercase/mixed-case "request_changes"
must NOT be coerced into an open-block match. Before the exact-enum
fix, .upper() accepted these as REQUEST_CHANGES."""
for state in (
"request_changes", "Request_Changes", "REQUEST_CHANGES ",
" REQUEST_CHANGES", "request_changes\n",
):
with self.subTest(state=state):
self.assertFalse(
is_open_request_changes(
_review(state=state), headsha=HEAD
),
f"case-coerced/padded state {state!r} must NOT count as "
"an open REQUEST_CHANGES",
)
def test_rejects_when_dismissed(self):
self.assertFalse(
is_open_request_changes(
_review(state="REQUEST_CHANGES", dismissed=True), headsha=HEAD
)
)
def test_rejects_when_stale_head(self):
self.assertFalse(
is_open_request_changes(
_review(state="REQUEST_CHANGES", commit_id=OTHER_HEAD),
headsha=HEAD,
)
)
def test_rejects_when_missing_commit_id(self):
for bad in [None, "", 0]:
with self.subTest(commit_id=bad):
self.assertFalse(
is_open_request_changes(
_review(state="REQUEST_CHANGES", commit_id=bad),
headsha=HEAD,
)
)
# ---------------------------------------------------------------------------
# classify_reviews — the merge-queue consumer
# ---------------------------------------------------------------------------
class ClassifyReviewsContract(unittest.TestCase):
def test_basic_approvers_and_request_changes(self):
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="bob", state="REQUEST_CHANGES", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, {"alice"})
self.assertEqual(request_changes, ["bob"])
def test_reviewer_set_filters_early(self):
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="carol", state="APPROVED", commit_id=HEAD),
]
approvers, _ = classify_reviews(
reviews, headsha=HEAD, reviewer_set={"alice"}
)
self.assertEqual(approvers, {"alice"})
def test_latest_review_per_user_wins(self):
# alice's REQUEST_CHANGES (latest) supersedes her earlier APPROVED.
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="alice", state="REQUEST_CHANGES", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertNotIn("alice", approvers)
self.assertIn("alice", request_changes)
def test_stale_head_approval_excluded(self):
reviews = [
_review(user="alice", state="APPROVED", commit_id=OTHER_HEAD),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, set())
def test_missing_commit_id_approval_excluded(self):
"""The SEV-1 fail-open surface. APPROVED + no commit_id → must NOT
count toward approvers, even with stale=False/dismissed=False."""
reviews = [
_review(user="alice", state="APPROVED", commit_id=None),
_review(user="bob", state="APPROVED", commit_id=""),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, set())
def test_dismissed_approval_excluded(self):
reviews = [
_review(user="alice", state="APPROVED", dismissed=True, commit_id=HEAD),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, set())
def test_non_official_approval_excluded(self):
reviews = [
_review(user="alice", state="APPROVED", official=False, commit_id=HEAD),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, set())
def test_comment_state_excluded(self):
reviews = [
_review(user="alice", state="COMMENT", commit_id=HEAD),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, set())
def test_case_coerced_approved_not_counted(self):
"""EXACT-ENUM via the reducer: a lowercase 'approved' (otherwise
valid official current-head row) must NOT be counted as an approver.
Before the fix, classify_reviews coerced it via .upper()."""
for state in ("approved", "Approved", "APPROVED "):
with self.subTest(state=state):
reviews = [
_review(user="alice", state=state, commit_id=HEAD),
]
approvers, request_changes = classify_reviews(
reviews, headsha=HEAD
)
self.assertEqual(approvers, set())
self.assertEqual(request_changes, [])
def test_case_coerced_request_changes_not_silently_dropped(self):
"""EXACT-ENUM via the reducer: a lowercase 'request_changes' must be
rejected (not coerced into a block). Crucially, it must NOT silently
erase a SAME-USER genuine current-head REQUEST_CHANGES posted
earlier — the case-variant later row is invalid and is ignored, so
the genuine block stands."""
reviews = [
_review(user="bob", state="REQUEST_CHANGES", commit_id=HEAD),
_review(user="bob", state="request_changes", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertIn("bob", request_changes)
self.assertNotIn("bob", approvers)
def test_stale_head_request_changes_excluded(self):
# A REQUEST_CHANGES on a previous head must NOT block the current head.
reviews = [
_review(user="bob", state="REQUEST_CHANGES", commit_id=OTHER_HEAD),
]
_, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(request_changes, [])
# -----------------------------------------------------------------
# VALIDATE-BEFORE-REDUCE regression tests (SEV-1 internal#812 follow-up).
#
# The bug: classify_reviews reduced to the LATEST row per user FIRST and
# validated AFTER. A later INVALID row (a COMMENT, or APPROVED/
# REQUEST_CHANGES with a null/old commit_id) from the same user could
# overwrite a genuine current-head review — masking an approval or
# ERASING a REQUEST_CHANGES block. The fix validates before the reduce,
# so an invalid later row is never eligible to be a user's "latest".
# -----------------------------------------------------------------
def test_genuine_approval_not_masked_by_later_comment(self):
"""A genuine current-head APPROVED followed by a LATER COMMENT from
the SAME user must STILL count as an approval. A later non-
APPROVED/RC row (COMMENT) must not erase the approval. This is the
reduce-before-validate masking bug."""
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="alice", state="COMMENT", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertIn("alice", approvers)
self.assertEqual(request_changes, [])
def test_genuine_approval_not_masked_by_later_null_commit_id(self):
"""A genuine current-head APPROVED followed by a LATER APPROVED with
a null commit_id (the spoof/invalid signature) from the SAME user
must STILL count. The invalid later row must be ignored, not allowed
to overwrite the valid earlier approval."""
for bad in [None, ""]:
with self.subTest(commit_id=bad):
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="alice", state="APPROVED", commit_id=bad),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertIn(
"alice", approvers,
f"later invalid commit_id={bad!r} must not mask the "
"genuine current-head approval",
)
def test_genuine_approval_not_masked_by_later_stale_commit_id(self):
"""A genuine current-head APPROVED followed by a LATER APPROVED on a
STALE (old) head from the SAME user must STILL count toward
approvers — the stale later row is invalid and must be ignored."""
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="alice", state="APPROVED", commit_id=OTHER_HEAD),
]
approvers, _ = classify_reviews(reviews, headsha=HEAD)
self.assertIn("alice", approvers)
def test_request_changes_not_erased_by_later_comment(self):
"""A genuine current-head REQUEST_CHANGES followed by a LATER COMMENT
from the SAME user must STILL block. The later invalid row must not
erase the REQUEST_CHANGES — this is the worse, silently-evaporating-
block variant of the bug."""
reviews = [
_review(user="bob", state="REQUEST_CHANGES", commit_id=HEAD),
_review(user="bob", state="COMMENT", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertIn("bob", request_changes)
self.assertNotIn("bob", approvers)
def test_request_changes_not_erased_by_later_null_commit_id(self):
"""A genuine current-head REQUEST_CHANGES followed by a LATER
REQUEST_CHANGES with a null/old commit_id from the SAME user must
STILL block. The invalid later row must be ignored, not allowed to
relocate the user's verdict off the current head."""
for bad in [None, "", OTHER_HEAD]:
with self.subTest(commit_id=bad):
reviews = [
_review(user="bob", state="REQUEST_CHANGES", commit_id=HEAD),
_review(user="bob", state="REQUEST_CHANGES", commit_id=bad),
]
_, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertIn(
"bob", request_changes,
f"later invalid commit_id={bad!r} must not erase the "
"genuine current-head REQUEST_CHANGES block",
)
def test_request_changes_not_erased_by_later_approved_invalid(self):
"""A genuine current-head REQUEST_CHANGES followed by a LATER
INVALID APPROVED (null commit_id) from the SAME user must STILL
block AND must NOT count the user as an approver. The invalid
approval must not flip a real block into a pass."""
reviews = [
_review(user="bob", state="REQUEST_CHANGES", commit_id=HEAD),
_review(user="bob", state="APPROVED", commit_id=None),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertIn("bob", request_changes)
self.assertNotIn("bob", approvers)
def test_genuine_request_changes_still_supersedes_genuine_approval(self):
"""Sanity: a genuine LATER current-head REQUEST_CHANGES still
supersedes an earlier genuine APPROVED from the same user (the
valid-row supersession we MUST preserve — only INVALID later rows
are ignored). Guards against an over-correction that ignores all
later rows."""
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="alice", state="REQUEST_CHANGES", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertNotIn("alice", approvers)
self.assertIn("alice", request_changes)
def test_genuine_approval_still_supersedes_genuine_request_changes(self):
"""Sanity: a genuine LATER current-head APPROVED supersedes an
earlier genuine REQUEST_CHANGES from the same user."""
reviews = [
_review(user="alice", state="REQUEST_CHANGES", commit_id=HEAD),
_review(user="alice", state="APPROVED", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertIn("alice", approvers)
self.assertEqual(request_changes, [])
def test_two_valid_approvers_plus_one_invalid_later_row(self):
"""Two distinct users with valid current-head approvals + a third
user whose ONLY genuine approval is followed by an invalid later
row → all three real approvers are counted; the invalid later row
does not drop the third user."""
reviews = [
_review(user="alice", state="APPROVED", commit_id=HEAD),
_review(user="bob", state="APPROVED", commit_id=HEAD),
_review(user="carol", state="APPROVED", commit_id=HEAD),
_review(user="carol", state="COMMENT", commit_id=HEAD),
]
approvers, request_changes = classify_reviews(reviews, headsha=HEAD)
self.assertEqual(approvers, {"alice", "bob", "carol"})
self.assertEqual(request_changes, [])
# ---------------------------------------------------------------------------
# Mutation-resistance smoke checks
#
# These tests document the mutations a reviewer would have to apply to
# weaken the gate. They are not synthetic; they verify that the
# predicate is structured so each known-softening mutation would also
# fail at least one other test in this file. We can't actually mutate
# the source in CI, but these tests are explicit about the mutations
# that would slip through, and the suite is dense enough that any
# loosening of the predicate will fail multiple cases.
# ---------------------------------------------------------------------------
class MutationResistance(unittest.TestCase):
def test_documented_mutation_remove_commit_id_check_fails(self):
"""If a reviewer removes the commit_id check (e.g., reverts to
the pre-fix `if isinstance(commit_id, str) and commit_id and
headsha:` guard, or replaces `commit_id != headsha` with True),
the missing-commit_id tests above (test_rejects_when_commit_id_missing
in IsOfficialCurrentHeadFailClosed, test_rejects_missing_commit_id_approval
in IsGenuineApprovalContract, test_missing_commit_id_approval_excluded
in ClassifyReviewsContract) would all fail. The reviewer would have
to weaken all three test categories to slip the SEV-1 surface in."""
# Sanity: every missing-commit_id case is a False today.
for bad in [None, "", 0, False]:
with self.subTest(commit_id=bad):
self.assertFalse(
is_official_current_head(_review(commit_id=bad), HEAD)
)
self.assertFalse(
is_genuine_approval(
_review(commit_id=bad), headsha=HEAD
)
)
def test_documented_mutation_change_neq_to_eq_fails(self):
"""If a reviewer changes `commit_id != headsha` to `commit_id == headsha`
in the head comparison (inverting the check), the stale-head tests
(test_rejects_when_commit_id_stale, test_stale_head_approval_excluded)
would fail because the wrong head would now match."""
self.assertFalse(
is_official_current_head(_review(commit_id=OTHER_HEAD), HEAD)
)
def test_documented_mutation_drop_official_check_fails(self):
"""If a reviewer drops the `if not review.get('official')` check, the
non-official tests (test_rejects_when_official_is_false,
test_rejects_non_official_approval, test_non_official_approval_excluded)
would all fail."""
self.assertFalse(
is_genuine_approval(
_review(state="APPROVED", official=False), headsha=HEAD
)
)
if __name__ == "__main__":
unittest.main()
@@ -115,5 +115,79 @@ T16=$(validate_required_checks_json "main" '{"main":"CI / all-required"}')
[ "$T16" = "false" ] || fail "T16: string branch entry should fail"
pass "T16: string branch entry fails"
# ---------------------------------------------------------------------------
# T17+ — /statuses pagination (status-pagination RCA, #2440-family).
# The reader now pages /commits/{sha}/statuses to exhaustion instead of reading
# the capped combined /status view. These lock the page-accumulation,
# newest-wins collapse, short-page stop, and fail-closed contracts.
# ---------------------------------------------------------------------------
# Page-body type validator used per page (bare array, not an object).
validate_page_is_array() { jq -e 'type == "array"' >/dev/null 2>&1 && echo true || echo false; }
# newest-wins collapse: mirror the script's max-by-id jq (order-independent).
collapse_newest_per_context() {
declare -A CS
while IFS=$'\t' read -r ctx state; do
[ -n "$ctx" ] && CS[$ctx]="$state"
done < <(jq -r 'group_by(.context) | map(max_by(.id)) | .[] | "\(.context)\t\(.status)"')
state="${CS[CI / all-required (push)]:-missing}"
echo "$state"
}
# T17 — a bare JSON array page passes the per-page array check.
T17=$(echo '[{"context":"c1","status":"success"}]' | validate_page_is_array)
[ "$T17" = "true" ] || fail "T17: bare array page should pass array check"
pass "T17: bare array page passes array check"
# T18 — a non-array page (object) fails the per-page array check → fail-closed.
T18=$(echo '{"statuses":[]}' | validate_page_is_array)
[ "$T18" = "false" ] || fail "T18: object page should fail array check (fail-closed)"
pass "T18: object page fails array check (fail-closed)"
# T19 — required SUCCESS on PAGE 2 is FOUND after accumulation (not missing).
# page1: 100 noise rows (older ids); page2: the required-context success.
PAGE1=$(jq -nc '[range(0;100) | {id:., context:("noise-\(.) (push)"), status:"pending"}]')
PAGE2='[{"id":200,"context":"CI / all-required (push)","status":"success"}]'
# Accumulation matching the script: two-arg `jq -s '.[0] + .[1]'` over the
# running accumulator and the new page.
ACCUM=$(jq -s '.[0] + .[1]' <(echo "$PAGE1") <(echo "$PAGE2"))
LEN=$(echo "$ACCUM" | jq 'length')
[ "$LEN" = "101" ] || fail "T19: accumulated length should be 101, got $LEN"
RESULT=$(echo "$ACCUM" | collapse_newest_per_context)
[ "$RESULT" = "success" ] || fail "T19: required success on page2 must be FOUND, got '$RESULT'"
pass "T19: required success on page2 is found after pagination"
# T20 — genuinely-absent required context across all pages stays 'missing'
# → fail-closed (counted as not-green, flags the force-merge).
ABSENT=$(jq -nc '[range(0;100) | {id:., context:("noise-\(.) (push)"), status:"success"}]')
RESULT2=$(echo "$ABSENT" | collapse_newest_per_context)
[ "$RESULT2" = "missing" ] || fail "T20: absent required context must stay 'missing', got '$RESULT2'"
pass "T20: genuinely-absent required context stays missing (fail-closed)"
# T21 — non-monotonic order: newest id (157, neither first nor last in list)
# a NEWER success row (oldest-first append → last overwrite wins).
DUP='[{"id":155,"context":"CI / all-required (push)","status":"pending"},
{"id":157,"context":"CI / all-required (push)","status":"success"},
{"id":125,"context":"CI / all-required (push)","status":"failure"}]'
RESULT3=$(echo "$DUP" | collapse_newest_per_context)
[ "$RESULT3" = "success" ] || fail "T21: newest (success) must win over older (failure), got '$RESULT3'"
pass "T21: newest row per context wins after pagination collapse"
# T22 — short-page stop condition: a page with fewer than PER_PAGE rows ends
# the loop. Emulate the numeric comparison the script uses.
PER_PAGE=100
PAGE_COUNT=$(echo "$PAGE2" | jq 'length') # 1 row
if [ "$PAGE_COUNT" -lt "$PER_PAGE" ]; then SHORT=stop; else SHORT=continue; fi
[ "$SHORT" = "stop" ] || fail "T22: short page should stop pagination"
pass "T22: short page stops pagination loop"
# T23 — a full page (== PER_PAGE) continues the loop.
FULL=$(jq -nc '[range(0;100) | {id:., context:"x", status:"success"}]')
FULL_COUNT=$(echo "$FULL" | jq 'length')
if [ "$FULL_COUNT" -lt "$PER_PAGE" ]; then CONT=stop; else CONT=continue; fi
[ "$CONT" = "continue" ] || fail "T23: full page should continue pagination"
pass "T23: full page continues pagination loop"
echo
echo "ALL AUDIT-FORCE-MERGE CHECKS PASSED"
@@ -50,15 +50,15 @@ class TestQaReviewDirectTrigger:
"pull_request_review must include 'submitted' type"
)
def test_job_guard_requires_approved_state(self):
def test_job_guard_has_no_review_state_check(self):
wf = load_workflow("qa-review.yml")
guard = _job_guard_string(wf)
assert "github.event.review.state == 'APPROVED'" in guard, (
"job guard must check review.state for 'APPROVED'"
)
assert "github.event.review.state == 'approved'" in guard, (
"job guard must check review.state for 'approved' (case fallback per #2135)"
assert "github.event.review.state" not in guard, (
"job guard must NOT check review.state (#2159: Gitea 1.22.6 payload unreliable); "
"evaluator (review-check.sh) verifies actual APPROVE via API"
)
assert "github.event_name == 'pull_request_target'" in guard
assert "github.event_name == 'pull_request_review'" in guard
def test_post_step_uses_status_post_token(self):
wf = load_workflow("qa-review.yml")
@@ -91,15 +91,15 @@ class TestSecurityReviewDirectTrigger:
"pull_request_review must include 'submitted' type"
)
def test_job_guard_requires_approved_state(self):
def test_job_guard_has_no_review_state_check(self):
wf = load_workflow("security-review.yml")
guard = _job_guard_string(wf)
assert "github.event.review.state == 'APPROVED'" in guard, (
"job guard must check review.state for 'APPROVED'"
)
assert "github.event.review.state == 'approved'" in guard, (
"job guard must check review.state for 'approved' (case fallback per #2135)"
assert "github.event.review.state" not in guard, (
"job guard must NOT check review.state (#2159: Gitea 1.22.6 payload unreliable); "
"evaluator (review-check.sh) verifies actual APPROVE via API"
)
assert "github.event_name == 'pull_request_target'" in guard
assert "github.event_name == 'pull_request_review'" in guard
def test_post_step_uses_status_post_token(self):
wf = load_workflow("security-review.yml")
@@ -153,7 +153,7 @@ class TestRefireTokenSeparation:
"qa refire must receive STATUS_POST_TOKEN env var"
)
# Evaluator stays on read token
assert "SOP_TIER_CHECK_TOKEN" in env.get("GITEA_TOKEN", "") or "GITHUB_TOKEN" in env.get("GITEA_TOKEN", ""), (
assert "SOP_CHECKLIST_GATE_TOKEN" in env.get("GITEA_TOKEN", "") or "GITHUB_TOKEN" in env.get("GITEA_TOKEN", ""), (
"qa refire evaluator must stay on read-scoped token"
)
@@ -163,6 +163,6 @@ class TestRefireTokenSeparation:
assert env.get("STATUS_POST_TOKEN") == "${{ secrets.STATUS_POST_TOKEN }}", (
"security refire must receive STATUS_POST_TOKEN env var"
)
assert "SOP_TIER_CHECK_TOKEN" in env.get("GITEA_TOKEN", "") or "GITHUB_TOKEN" in env.get("GITEA_TOKEN", ""), (
assert "SOP_CHECKLIST_GATE_TOKEN" in env.get("GITEA_TOKEN", "") or "GITHUB_TOKEN" in env.get("GITEA_TOKEN", ""), (
"security refire evaluator must stay on read-scoped token"
)
+26 -5
View File
@@ -248,7 +248,7 @@ def test_genuine_approvals_counts_two_distinct_on_current_head():
{"state": "APPROVED", "user": {"login": "agent-reviewer-cr2"},
"official": True, "stale": False, "dismissed": False, "commit_id": "HEAD"},
]
approvers, rc = mq.genuine_approvals(reviews, head_sha="HEAD", reviewer_set=REVIEWERS)
approvers, rc = mq.genuine_approvals(reviews, headsha="HEAD", reviewer_set=REVIEWERS)
assert approvers == {"agent-researcher", "agent-reviewer-cr2"}
assert rc == []
@@ -265,7 +265,7 @@ def test_genuine_approvals_ignores_stale_dismissed_and_wrong_head():
{"state": "APPROVED", "user": {"login": "agent-reviewer"},
"official": True, "stale": False, "dismissed": False, "commit_id": "OLD"},
]
approvers, rc = mq.genuine_approvals(reviews, head_sha="HEAD", reviewer_set=REVIEWERS)
approvers, rc = mq.genuine_approvals(reviews, headsha="HEAD", reviewer_set=REVIEWERS)
assert approvers == set()
assert rc == []
@@ -279,7 +279,7 @@ def test_genuine_approvals_ignores_unofficial_and_outsiders():
{"state": "APPROVED", "user": {"login": "hongming-codex-laptop"},
"official": True, "stale": False, "dismissed": False, "commit_id": "HEAD"},
]
approvers, rc = mq.genuine_approvals(reviews, head_sha="HEAD", reviewer_set=REVIEWERS)
approvers, rc = mq.genuine_approvals(reviews, headsha="HEAD", reviewer_set=REVIEWERS)
assert approvers == set()
@@ -291,7 +291,7 @@ def test_genuine_approvals_latest_review_supersedes_earlier():
{"state": "REQUEST_CHANGES", "user": {"login": "agent-reviewer-cr2"},
"official": True, "stale": False, "dismissed": False, "commit_id": "HEAD"},
]
approvers, rc = mq.genuine_approvals(reviews, head_sha="HEAD", reviewer_set=REVIEWERS)
approvers, rc = mq.genuine_approvals(reviews, headsha="HEAD", reviewer_set=REVIEWERS)
assert approvers == set()
assert rc == ["agent-reviewer-cr2"]
@@ -333,6 +333,27 @@ def test_governance_red_blocks_merge():
assert "required contexts not green" in decision.reason
def test_non_required_red_does_not_block_merge():
# Uniform gate flip (CTO #2407): qa-review, security-review, sop-checklist
# are REQUIRED for ALL PRs. A PR with these failing/pending must NOT be
# force-mergeable, even if BP-required CI is green and approvals are genuine.
pr_status = {
"state": "failure",
"statuses": [
{"context": "CI / all-required (pull_request)", "status": "success"},
{"context": "qa-review / approved (pull_request)", "status": "failure"},
{"context": "security-review / approved (pull_request)", "status": "pending"},
{"context": "sop-checklist / all-items-acked (pull_request)", "status": "failure"},
{"context": "Staging SaaS / e2e (pull_request)", "status": "failure"},
],
}
decision = mq.evaluate_merge_readiness(**_ready_kwargs(pr_status=pr_status))
assert decision.ready is False
assert decision.action == "wait"
assert "required contexts not green" in decision.reason
assert decision.force is False
def test_non_required_advisory_red_does_not_block_merge():
# Governance checks are green; only advisory non-required reds (Staging SaaS)
# are present → PR is still mergeable with force_merge bypassing the advisory.
@@ -1182,7 +1203,7 @@ def test_list_candidate_issues_omits_label_filter_when_auto_discover(monkeypatch
assert captured["query"].get("type") == "pulls"
mq.list_candidate_issues(auto_discover=False)
assert captured["query"].get("labels") == "merge-queue"
assert captured["query"].get("label") == "merge-queue"
def _wire_ready_process_once(monkeypatch, *, issues, pr_payload, calls):
@@ -35,11 +35,33 @@ if grep -q '_is_tier_low_pending_ok' .gitea/scripts/gitea-merge-queue.py; then
fi
# 5. No sop-tier-check context references in workflow YAML
if grep -r 'sop-tier-check' .gitea/workflows/; then
if grep -rI --exclude-dir='__pycache__' 'sop-tier-check' .gitea/workflows/; then
echo "FAIL: sop-tier-check context reappeared in workflows" >&2
fail=1
fi
# 6. No SOP_TIER_CHECK_TOKEN references in workflow YAML or scripts
if grep -rI --exclude-dir='__pycache__' --exclude='test_no_tier_regression.sh' 'SOP_TIER_CHECK_TOKEN' .gitea/workflows/ .gitea/scripts/; then
echo "FAIL: SOP_TIER_CHECK_TOKEN reference reappeared (use SOP_CHECKLIST_GATE_TOKEN)" >&2
fail=1
fi
# 7. qa-review and security-review must have labeled/unlabeled triggers (#2139)
for f in .gitea/workflows/qa-review.yml .gitea/workflows/security-review.yml; do
if ! grep -q 'labeled, unlabeled' "$f"; then
echo "FAIL: $f missing labeled/unlabeled triggers (#2139)" >&2
fail=1
fi
done
# 8. qa-review and security-review must NOT have review.state guard (#2159)
for f in .gitea/workflows/qa-review.yml .gitea/workflows/security-review.yml; do
if grep -q 'github.event.review.state' "$f"; then
echo "FAIL: $f has review.state guard reappeared (#2159)" >&2
fail=1
fi
done
if [ "$fail" -eq 1 ]; then
echo "TIER_REGRESSION_DETECTED" >&2
exit 1
+134 -5
View File
@@ -105,16 +105,25 @@ def test_build_plan_disable_flag_short_circuits_before_credentials():
assert plan["disabled_reason"] == "PROD_AUTO_DEPLOY_DISABLED=true"
def test_latest_status_for_context_uses_first_matching_status():
def test_latest_status_for_context_picks_newest_by_id_regardless_of_order():
# The exhaustively-paginated /statuses list is ascending id order
# (oldest-first), the opposite of the combined /status view. The selector
# must collapse duplicate context rows to the NEWEST (max id) so a stale
# earlier run never shadows the current result, whichever way they arrive.
statuses = [
{"context": "CI / all-required (push)", "status": "pending"},
{"context": "CI / all-required (pull_request)", "status": "success"},
{"context": "CI / all-required (push)", "status": "success"},
{"id": 10, "context": "CI / all-required (push)", "status": "pending"},
{"id": 11, "context": "CI / all-required (pull_request)", "status": "success"},
{"id": 12, "context": "CI / all-required (push)", "status": "success"},
]
latest = prod.latest_status_for_context(statuses, "CI / all-required (push)")
assert latest == {"context": "CI / all-required (push)", "status": "pending"}
assert latest == {"id": 12, "context": "CI / all-required (push)", "status": "success"}
# Same rows shuffled (newest-first, as the combined view would deliver)
# must still resolve to the same newest row.
latest_rev = prod.latest_status_for_context(list(reversed(statuses)), "CI / all-required (push)")
assert latest_rev == {"id": 12, "context": "CI / all-required (push)", "status": "success"}
def test_ci_context_state_handles_missing_and_gitea_status_key():
@@ -612,3 +621,123 @@ def test_superseded_by_none_for_latest_job_so_it_still_rolls(monkeypatch):
)
is None
)
# ---------------------------------------------------------------------------
# /statuses pagination — required-context SUCCESS on page 2+ must be FOUND,
# genuinely-absent context must STILL fail-closed (no fail-open).
# Regression for the single-page-status bug (#2440-family, pagination RCA):
# the combined /status view caps `statuses` at ~30, so on a high-churn commit
# the still-current required-context row is pushed past page 1 and the reader
# falsely reports it `missing`.
# ---------------------------------------------------------------------------
def _paged_statuses_stub(pages):
"""Return a fake _api_json_list that serves `pages` keyed by ?page=N."""
def fake(url, _token):
# url looks like .../statuses?page=N&limit=100
page = 1
for part in url.split("?", 1)[-1].split("&"):
if part.startswith("page="):
page = int(part.split("=", 1)[1])
return pages.get(page, [])
return fake
def test_fetch_all_statuses_finds_required_success_on_page_two(monkeypatch):
# Page 1 is a full 100 rows of unrelated/older churn; the required-context
# SUCCESS only appears on page 2. A single-page reader would miss it.
page1 = [
{"id": i, "context": f"noise-{i} (push)", "status": "pending"}
for i in range(100)
]
page2 = [
{"id": 200, "context": "CI / all-required (push)", "status": "success"},
{"id": 201, "context": "Secret scan / Scan diff for credential-shaped strings (push)",
"status": "success"},
]
monkeypatch.setattr(prod, "_api_json_list", _paged_statuses_stub({1: page1, 2: page2}))
rows = prod.fetch_all_statuses("git.moleculesai.app", "molecule-ai/molecule-core", "a" * 40, "tok")
# Must have walked to page 2 and accumulated every row.
assert len(rows) == 102
assert prod.ci_context_state(rows, "CI / all-required (push)") == "success"
assert (
prod.ci_context_state(
rows, "Secret scan / Scan diff for credential-shaped strings (push)"
)
== "success"
)
def test_fetch_all_statuses_genuinely_absent_context_stays_missing(monkeypatch):
# The required context is on NO page → fail-closed: ci_context_state must
# report "missing", which context_is_satisfied() rejects → gate stays shut.
page1 = [
{"id": i, "context": f"noise-{i} (push)", "status": "success"}
for i in range(100)
]
page2 = [{"id": 200, "context": "some-other (push)", "status": "success"}]
monkeypatch.setattr(prod, "_api_json_list", _paged_statuses_stub({1: page1, 2: page2}))
rows = prod.fetch_all_statuses("git.moleculesai.app", "molecule-ai/molecule-core", "b" * 40, "tok")
state = prod.ci_context_state(rows, "CI / all-required (push)")
assert state == "missing"
assert prod.context_is_satisfied(state) is False
def test_fetch_all_statuses_fail_closed_on_page_error(monkeypatch):
# A page that raises (unreadable) must propagate, never silently truncate
# the scan and let the caller treat a partial list as complete.
def boom(url, _token):
if "page=2" in url:
raise RuntimeError("GET .../statuses?page=2 -> HTTP 502: bad gateway")
return [{"id": i, "context": f"n-{i}", "status": "success"} for i in range(100)]
monkeypatch.setattr(prod, "_api_json_list", boom)
try:
prod.fetch_all_statuses("h", "r", "c" * 40, "tok")
except RuntimeError as exc:
assert "502" in str(exc)
else:
raise AssertionError("expected page-2 error to propagate (fail-closed)")
def test_wait_for_ci_context_succeeds_when_required_status_is_past_page_one(monkeypatch):
# End-to-end: the gate reads the EXHAUSTIVE list, so a required SUCCESS that
# only exists past page 1 lets the deploy proceed instead of timing out.
full = [
{"id": i, "context": f"noise-{i} (push)", "status": "success"}
for i in range(100)
] + [
{"id": 500, "context": "CI / all-required (push)", "status": "success"},
{"id": 501, "context": "Secret scan / Scan diff for credential-shaped strings (push)",
"status": "success"},
]
monkeypatch.setattr(prod, "fetch_all_statuses", lambda *a, **k: full)
result = prod.wait_for_ci_context(
{"GITHUB_SHA": "d" * 40, "GITEA_TOKEN": "tok", "CI_STATUS_TIMEOUT_SECONDS": "30"}
)
assert result == "success"
def test_wait_for_ci_context_times_out_fail_closed_when_required_absent(monkeypatch):
# Genuinely-absent required context across all pages → never satisfied →
# the gate times out rather than green-lighting the deploy (no fail-open).
present_but_irrelevant = [
{"id": 500, "context": "some-other (push)", "status": "success"},
]
monkeypatch.setattr(prod, "fetch_all_statuses", lambda *a, **k: present_but_irrelevant)
# Zero timeout + 0 interval → single poll then TimeoutError.
try:
prod.wait_for_ci_context(
{
"GITHUB_SHA": "e" * 40,
"GITEA_TOKEN": "tok",
"CI_STATUS_TIMEOUT_SECONDS": "1",
"CI_STATUS_POLL_INTERVAL_SECONDS": "1",
}
)
except TimeoutError as exc:
assert "missing" in str(exc)
else:
raise AssertionError("expected fail-closed TimeoutError, not a satisfied gate")
+21
View File
@@ -25,6 +25,11 @@
# T20 — ai-sop-ack APPROVED review excluded from security-review gate
# T21 — stale-head APPROVED review → exit 1 (commit_id mismatch)
# T22 — missing/non-official APPROVED review → exit 1 (official != true)
# T23 — missing-commit_id APPROVED review → exit 1 (SEV-1 internal#812
# fail-closed contract: a missing/empty commit_id is REJECTED, not
# silently accepted as "older Gitea row" the way the pre-fix
# gitea-merge-queue.py did. Closes the spoof-bug surface that
# #843 had.)
#
# Hostile-self-review (per feedback_assert_exact_not_substring):
# this test MUST FAIL if the script is absent. Verified by running
@@ -427,6 +432,22 @@ T22_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T22 exit code 1 (missing official rejected)" "1" "$T22_RC"
assert_contains "T22 no candidates error" "no candidates from reviews API or issue comments" "$T22_OUT"
# T23 — missing-commit_id APPROVED review must be rejected.
# SEV-1 internal#812 (supersedes closed internal#843). A review with NO
# commit_id field is the spoof-bug signature: a real reviewer cannot
# have submitted against a commit that doesn't exist. The fail-closed
# SSOT must REJECT — the pre-fix gitea-merge-queue.py silently accepted
# these (the "older Gitea row" escape hatch), which is the exact surface
# that closed #843 had. The Python unit tests in
# test_approval_validator.py cover the predicate at the unit level;
# this T23 covers the bash + jq pipeline end-to-end.
echo
echo "== T23 missing commit_id APPROVED review rejected (SEV-1 fail-closed) =="
T23_OUT=$(run_review_check "T23_missing_commit_id")
T23_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T23 exit code 1 (missing commit_id rejected)" "1" "$T23_RC"
assert_contains "T23 no candidates error" "no candidates from reviews API or issue comments" "$T23_OUT"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL"
+5 -1
View File
@@ -149,7 +149,11 @@ items:
- slug: memory-consulted
numeric_alias: 7
pr_section_marker: "Memory/saved-feedback consulted"
# #1973: normalize marker so it matches the slug. Previously the
# slash produced a checklist status that never resolved because
# normalize_slug() collapses / to - and the Gitea PR body parser
# would not find the expected heading.
pr_section_marker: "Memory consulted"
required_teams: [engineers]
ai_ack_eligible: true
description: >-
+1 -1
View File
@@ -42,7 +42,7 @@ jobs:
- name: Detect force-merge + emit audit event
env:
# Same org-level secret the sop-checklist workflow uses.
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }}
+1 -1
View File
@@ -81,7 +81,7 @@ jobs:
# Gitea persona whose ONLY job is reading branch_protections
# and posting the [ci-drift] tracking issue. The endpoint
# `GET /repos/.../branch_protections/{branch}` requires
# repo-ADMIN role (Gitea 1.22.6) — SOP_TIER_CHECK_TOKEN and the
# repo-ADMIN role (Gitea 1.22.6) — the default GITHUB_TOKEN and the
# auto-injected GITHUB_TOKEN do NOT have it (read-only / write
# without admin), so the previous fallback chain 403'd.
# Mirrors the controlplane fix landed in CP PR#134.
+10
View File
@@ -148,6 +148,11 @@ jobs:
run: $(go env GOPATH)/bin/golangci-lint run --timeout 3m ./...
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Diagnostic — per-package verbose 60s
# DIAGNOSTIC ONLY (continue-on-error below): this step exists to dump
# verbose per-package output for triage, NOT to gate. The blocking gate
# is "Run tests with coverage (blocking gate)" immediately below. The
# `set +e` / swallowed exits here are intentional — do not "fix" them
# like a gate; the real gate is the next step.
run: |
set +e
go test -race -v -timeout 60s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
@@ -309,6 +314,11 @@ jobs:
# #1815 — wires coverage into CI so we get a baseline visible on
# every PR. No threshold gate yet; thresholds dial in (Step 3, also
# tracked in #1815) after the team sees what current coverage is.
# Memory: the full vitest+v8-coverage process tree peaks at ~1.33 GB
# (measured 2026-06-08), comfortably within the runner — so this single
# run is BOTH the pass/fail gate and the coverage artifact (one SSOT, no
# split). The earlier intermittent red here was a DisplayTab paste-race
# (fixed in this PR), NOT a coverage OOM.
run: npx vitest run --coverage
- name: Upload coverage summary as artifact
if: ${{ needs.changes.outputs.canvas == 'true' }}
+3
View File
@@ -429,6 +429,9 @@ jobs:
# round-trip is covered by the priority-runtimes `mock` arm, not here.
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_keyless_feature_contracts_e2e.sh
- name: Run user_tasks E2E (REST + MCP — agent→user action requests)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_user_tasks_e2e.sh
- name: Run secrets-dispatch contract test (keyless SECRETS_JSON branch order)
# Previously orphaned (no workflow referenced it). Hermetic unit-style
# contract over test_staging_full_saas.sh's LLM-key branch precedence —
+352
View File
@@ -54,6 +54,13 @@ on:
- 'tests/e2e/lib/model_slug.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- 'tests/e2e/test_staging_concierge_e2e.sh'
- 'tests/e2e/test_staging_concierge_creates_workspace_e2e.sh'
- 'workspace-server/internal/staginge2e/**'
- 'workspace-server/internal/handlers/platform_agent.go'
- 'workspace-server/internal/handlers/user_tasks.go'
- 'workspace-server/internal/handlers/llm_billing_mode_handler.go'
- 'workspace-server/internal/handlers/discovery.go'
- '.gitea/workflows/e2e-staging-saas.yml'
pull_request:
branches: [main]
@@ -69,6 +76,13 @@ on:
- 'tests/e2e/lib/model_slug.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- 'tests/e2e/test_staging_concierge_e2e.sh'
- 'tests/e2e/test_staging_concierge_creates_workspace_e2e.sh'
- 'workspace-server/internal/staginge2e/**'
- 'workspace-server/internal/handlers/platform_agent.go'
- 'workspace-server/internal/handlers/user_tasks.go'
- 'workspace-server/internal/handlers/llm_billing_mode_handler.go'
- 'workspace-server/internal/handlers/discovery.go'
- '.gitea/workflows/e2e-staging-saas.yml'
workflow_dispatch:
schedule:
@@ -496,3 +510,341 @@ jobs:
echo "::warning::platform-boot teardown left ${#leaks[@]} leak(s): ${leaks[*]}"
fi
exit 0
# ── CONCIERGE user_tasks PRIMITIVE (Feature 3) — real-staging REST+MCP+authz ──
#
# Drives tests/e2e/test_staging_concierge_e2e.sh against a fresh throwaway
# tenant: the full agent→user "ask" contract over BOTH surfaces (REST +
# the MCP tools/call envelope a canvas concierge agent uses) PLUS the
# cross-workspace authz scoping (ws-B can't touch ws-A's task). Reuses the
# same CP-admin org-provision/teardown scaffolding + _lib.sh + AWS-leak-check
# lib as the full-SaaS harness (the script SOURCEs them — no duplication).
#
# GATING (no continue-on-error): user_tasks is a pure DB/handler primitive
# with NO LLM container dependency (workspaces are created 'external' — row
# only, no EC2), so this is fast (~provision + TLS, no 10-min cold boot) and
# NOT subject to the cp#245 boot-timeout flake the full-SaaS job carries. It
# therefore has no honest reason to be masked. Runs on push-to-main /
# workflow_dispatch / cron only (needs live staging infra — never on PR, where
# the pr-validate job above already posts the workflow's PR status).
# bp-required: pending #2430
e2e-staging-concierge-user-tasks:
name: E2E Staging Concierge user_tasks
runs-on: ubuntu-latest
if: github.event_name == 'push' || github.event_name == 'workflow_dispatch' || github.event_name == 'schedule'
timeout-minutes: 30
permissions:
contents: read
env:
MOLECULE_CP_URL: https://staging-api.moleculesai.app
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
E2E_AWS_LEAK_CHECK: required
E2E_AWS_TERMINATE_LEAKS: '1'
E2E_RUN_ID: "${{ github.run_id }}-${{ github.run_attempt }}"
E2E_KEEP_ORG: ${{ github.event.inputs.keep_org && '1' || '0' }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
- name: Verify admin token + AWS creds present
run: |
if [ -z "$MOLECULE_ADMIN_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
exit 2
fi
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
if [ -z "${!var:-}" ]; then
echo "::error::$var not set — EC2 leak verification cannot run"
exit 2
fi
done
echo "Admin token + AWS creds present ✓"
- name: CP staging health preflight
run: |
code=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 10 "$MOLECULE_CP_URL/health")
if [ "$code" != "200" ]; then
echo "::error::Staging CP unhealthy (got HTTP $code). Skipping — not a workspace bug."
exit 1
fi
echo "Staging CP healthy ✓"
- name: Run concierge user_tasks E2E
run: bash tests/e2e/test_staging_concierge_e2e.sh
- name: Teardown safety net (runs on cancel/failure)
if: always()
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
run: |
# Sweep any e2e-cncrg-YYYYMMDD-<run_id>-* org this run created if the
# script died before its EXIT trap fired. Run-id scoped so it never
# stomps a concurrent run's fresh tenant (see the saas job's note).
set +e
orgs=$(curl -sS "$MOLECULE_CP_URL/cp/admin/orgs" \
-H "Authorization: Bearer $ADMIN_TOKEN" 2>/dev/null \
| python3 -c "
import json, sys, os, datetime
run_id = os.environ.get('GITHUB_RUN_ID', '')
d = json.load(sys.stdin)
today = datetime.date.today()
yesterday = today - datetime.timedelta(days=1)
dates = (today.strftime('%Y%m%d'), yesterday.strftime('%Y%m%d'))
if run_id:
prefixes = tuple(f'e2e-cncrg-{d}-{run_id}-' for d in dates)
else:
prefixes = tuple(f'e2e-cncrg-{d}-' for d in dates)
candidates = [o['slug'] for o in d.get('orgs', [])
if any(o.get('slug','').startswith(p) for p in prefixes)
and o.get('instance_status') not in ('purged',)]
print('\n'.join(candidates))
" 2>/dev/null)
leaks=()
for slug in $orgs; do
echo "Safety-net teardown: $slug"
set +e
curl -sS -o /tmp/cncrg-cleanup.out -w "%{http_code}" \
-X DELETE "$MOLECULE_CP_URL/cp/admin/tenants/$slug" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"confirm\":\"$slug\"}" >/tmp/cncrg-cleanup.code
set -e
code=$(cat /tmp/cncrg-cleanup.code 2>/dev/null || echo "000")
if [ "$code" = "200" ] || [ "$code" = "204" ]; then
echo "[teardown] deleted $slug (HTTP $code)"
else
echo "::warning::concierge teardown for $slug returned HTTP $code — sweep-stale-e2e-orgs will catch it within ~45 min. Body: $(head -c 300 /tmp/cncrg-cleanup.out 2>/dev/null)"
leaks+=("$slug")
fi
done
if [ ${#leaks[@]} -gt 0 ]; then
echo "::warning::concierge teardown left ${#leaks[@]} leak(s): ${leaks[*]}"
fi
exit 0
# ── CONCIERGE FUNCTIONAL: it ACTUALLY CREATES A WORKSPACE (real-LLM) ─────────
#
# Drives tests/e2e/test_staging_concierge_creates_workspace_e2e.sh — the
# RFC docs/design/rfc-platform-agent.md §11.4 "Reach" check turned into a gate:
# send the org concierge a natural-language A2A message ("create a workspace
# named e2e-cncrg-worker-<runid> with role engineer") and assert the
# DETERMINISTIC SIDE EFFECT — that named workspace now EXISTS in GET /workspaces
# — which can only happen if the concierge's LLM really invoked the
# create_workspace platform-MCP tool (a real org mutation), NOT just that a REST
# API returned 200.
#
# GATING (no continue-on-error), but FALSE-GREEN-PROOF via E2E_REQUIRE_LIVE=1:
# this is a REAL-LLM, REAL-tool test, so it depends on the concierge being
# provisioned on the DEDICATED platform-agent image (Dockerfile.platform-agent,
# ships /opt/molecule-mcp-server — the ONLY image where create_workspace lights
# up; see platform_agent.go's SELF-HOST CAVEAT). A parallel agent is wiring that
# image into the staging provision path. The script SKIPs LOUD when the
# concierge is absent / not online / not on the platform-agent image — but with
# E2E_REQUIRE_LIVE=1 the harness converts that skip into a HARD FAIL (exit 5) so
# a silently-missing platform-agent image can NEVER false-green this gate. Runs
# on push-to-main / workflow_dispatch / cron only (needs live staging infra +
# a model — never on PR, where pr-validate posts the workflow's PR status).
# bp-required: pending #2430
e2e-staging-concierge-creates-workspace:
name: E2E Staging Concierge Creates Workspace
runs-on: ubuntu-latest
if: github.event_name == 'push' || github.event_name == 'workflow_dispatch' || github.event_name == 'schedule'
timeout-minutes: 45
permissions:
contents: read
env:
MOLECULE_CP_URL: https://staging-api.moleculesai.app
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
E2E_AWS_LEAK_CHECK: required
E2E_AWS_TERMINATE_LEAKS: '1'
# The concierge is platform_managed on SaaS (the CP-exported LLM proxy
# supplies its model — no BYOK key needed for the concierge itself). The
# MiniMax key is wired anyway so a staging image that boots the concierge
# BYOK-MiniMax (parallel-agent image work) still has a model; harmless when
# the concierge is platform-managed.
E2E_MINIMAX_API_KEY: ${{ secrets.MOLECULE_STAGING_MINIMAX_API_KEY }}
# False-green guard: a concierge that is absent / not on the platform-agent
# image / never online must FAIL this gate (exit 5), not silently skip.
E2E_REQUIRE_LIVE: '1'
E2E_RUN_ID: "${{ github.run_id }}-${{ github.run_attempt }}"
E2E_KEEP_ORG: ${{ github.event.inputs.keep_org && '1' || '0' }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
- name: Verify admin token + AWS creds present
run: |
if [ -z "$MOLECULE_ADMIN_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
exit 2
fi
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
if [ -z "${!var:-}" ]; then
echo "::error::$var not set — EC2 leak verification cannot run"
exit 2
fi
done
echo "Admin token + AWS creds present ✓"
- name: CP staging health preflight
run: |
code=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 10 "$MOLECULE_CP_URL/health")
if [ "$code" != "200" ]; then
echo "::error::Staging CP unhealthy (got HTTP $code). Skipping — not a workspace bug."
exit 1
fi
echo "Staging CP healthy ✓"
- name: Run concierge-creates-workspace functional E2E
run: bash tests/e2e/test_staging_concierge_creates_workspace_e2e.sh
- name: Teardown safety net (runs on cancel/failure)
if: always()
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
run: |
# Sweep any e2e-cncrg-mk-YYYYMMDD-<run_id>-* org this run created if the
# script died before its EXIT trap fired. Run-id scoped so it never
# stomps a concurrent run's fresh tenant.
set +e
orgs=$(curl -sS "$MOLECULE_CP_URL/cp/admin/orgs" \
-H "Authorization: Bearer $ADMIN_TOKEN" 2>/dev/null \
| python3 -c "
import json, sys, os, datetime
run_id = os.environ.get('GITHUB_RUN_ID', '')
d = json.load(sys.stdin)
today = datetime.date.today()
yesterday = today - datetime.timedelta(days=1)
dates = (today.strftime('%Y%m%d'), yesterday.strftime('%Y%m%d'))
if run_id:
prefixes = tuple(f'e2e-cncrg-mk-{d}-{run_id}-' for d in dates)
else:
prefixes = tuple(f'e2e-cncrg-mk-{d}-' for d in dates)
candidates = [o['slug'] for o in d.get('orgs', [])
if any(o.get('slug','').startswith(p) for p in prefixes)
and o.get('instance_status') not in ('purged',)]
print('\n'.join(candidates))
" 2>/dev/null)
leaks=()
for slug in $orgs; do
echo "Safety-net teardown: $slug"
set +e
curl -sS -o /tmp/cncrg-mk-cleanup.out -w "%{http_code}" \
-X DELETE "$MOLECULE_CP_URL/cp/admin/tenants/$slug" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"confirm\":\"$slug\"}" >/tmp/cncrg-mk-cleanup.code
set -e
code=$(cat /tmp/cncrg-mk-cleanup.code 2>/dev/null || echo "000")
if [ "$code" = "200" ] || [ "$code" = "204" ]; then
echo "[teardown] deleted $slug (HTTP $code)"
else
echo "::warning::concierge-mk teardown for $slug returned HTTP $code — sweep-stale-e2e-orgs will catch it within ~45 min. Body: $(head -c 300 /tmp/cncrg-mk-cleanup.out 2>/dev/null)"
leaks+=("$slug")
fi
done
if [ ${#leaks[@]} -gt 0 ]; then
echo "::warning::concierge-mk teardown left ${#leaks[@]} leak(s): ${leaks[*]}"
fi
exit 0
# ── CONCIERGE / PLATFORM-AGENT Go staginge2e (Features 1,2,4,5,6) ────────────
#
# Drives TestConciergePlatformAgent_Staging (workspace-server/internal/
# staginge2e/concierge_platform_test.go), which REUSES the lifecycle suite's
# harness (requireStagingEnv / adminCreateOrg / tenantAdminToken /
# tenantCreateWorkspace / doTenantJSON / jsonField) to assert, against a real
# tenant: platform-agent install + /org/identity (1), kind on the workspace
# API (2), discovery peers admin-auth regression guard (4), BYOK billing-mode
# round-trip (5), and the concierge config-tab auth sweep (6). It asserts
# OBSERVABLE state (sole root re-parenting, kind discriminator, resolved_mode,
# non-401 tabs) — not just HTTP 200.
#
# Two jobs, mirroring e2e-workspace-lifecycle.yml's honest pattern:
# • concierge-compile-skip (every push/PR/dispatch): proves the staginge2e
# suite still COMPILES under -tags=staging_e2e and SKIPs LOUD without
# creds. GATING (no mask) — a broken test file fails at PR time.
# • concierge-staging (push-to-main/dispatch/cron): the real live run with
# staging creds + t.Cleanup teardown.
# bp-exempt: PR-time compile-only check (build the concierge e2e test, then
# skip execution — no staging creds on PR). pr-validate posts the workflow's
# PR status; this job is not itself a branch-protection gate.
e2e-staging-concierge-compile-skip:
name: E2E Staging Concierge (compile+skip)
runs-on: ubuntu-latest
timeout-minutes: 10
permissions:
contents: read
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: go vet (staging_e2e tag)
working-directory: workspace-server
run: go vet -tags staging_e2e ./internal/staginge2e/...
- name: Compile + skip-run (must SKIP LOUD without STAGING_E2E)
working-directory: workspace-server
run: |
# No STAGING_E2E / creds → the suite MUST skip (not pass-with-zero-
# assertions). go test exit 0 with a SKIP line is the contract.
out=$(go test -tags staging_e2e ./internal/staginge2e/ -run TestConciergePlatformAgent -count=1 -v 2>&1)
echo "$out"
echo "$out" | grep -q "SKIP: TestConciergePlatformAgent_Staging" \
|| { echo "::error::expected a LOUD skip of TestConciergePlatformAgent_Staging without creds"; exit 1; }
# bp-required: pending #2430
e2e-staging-concierge-platform:
name: E2E Staging Concierge Platform Agent
runs-on: ubuntu-latest
if: github.event_name == 'push' || github.event_name == 'workflow_dispatch' || github.event_name == 'schedule'
timeout-minutes: 40
permissions:
contents: read
env:
CP_BASE_URL: https://staging-api.moleculesai.app
CP_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
STAGING_E2E: '1'
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Verify admin token present
run: |
if [ -z "$CP_ADMIN_API_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
exit 2
fi
echo "Admin token present"
- name: CP staging health preflight
run: |
code=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 10 "$CP_BASE_URL/health")
if [ "$code" != "200" ]; then
echo "::error::Staging CP unhealthy (HTTP $code) — infra, not a concierge bug."
exit 1
fi
echo "Staging CP healthy"
- name: Run concierge/platform-agent staginge2e
working-directory: workspace-server
run: go test -tags staging_e2e ./internal/staginge2e/ -run TestConciergePlatformAgent_Staging -count=1 -v -timeout 35m
# Teardown: the test installs a t.Cleanup admin-DELETE of its own tenant
# (e2e-cncrg-* slug), running even on a t.Fatal. The age-guarded
# sweep-stale-e2e-orgs workflow (30-min floor, e2e- prefix) is the final
# net for a tenant orphaned by a hard runner cancel.
+2 -2
View File
@@ -82,7 +82,7 @@ jobs:
- name: Run gate-check-v3 (single PR mode)
if: github.event_name == 'pull_request_target' || github.event.inputs.pr_number != ''
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
PR_NUMBER: ${{ github.event.pull_request.number || github.event.inputs.pr_number }}
POST_COMMENT: ${{ github.event.inputs.post_comment || 'true' }}
@@ -97,7 +97,7 @@ jobs:
- name: Run gate-check-v3 (all open PRs — cron mode)
if: github.event_name == 'schedule'
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
REPO: ${{ github.repository }}
run: |
@@ -244,7 +244,12 @@ jobs:
# fail if any didn't land — that would be a real regression we
# want loud.
# workspace_schedules added for the #2149 scheduler integration tests.
for tbl in delegations workspaces activity_logs pending_uploads workspace_schedules; do
# workspace_auth_tokens + org_api_tokens added for the #2156
# registry-auth TestIntegration_ suite (#2148). Without this
# guard, a silently-skipped migration 020 (workspace_auth_tokens)
# or 035 (org_api_tokens) would let the auth tests run against
# missing tables and falsely green.
for tbl in delegations workspaces activity_logs pending_uploads workspace_schedules workspace_auth_tokens org_api_tokens; do
if ! psql -h "${PG_HOST}" -U postgres -d molecule -tA \
-c "SELECT 1 FROM information_schema.tables WHERE table_name = '$tbl'" \
| grep -q 1; then
@@ -285,6 +290,33 @@ jobs:
# / workspaces all landed by the migration replay step above).
go test -tags=integration -timeout 5m -v ./internal/scheduler/ -run "^TestIntegration_"
- if: needs.detect-changes.outputs.handlers == 'true'
name: Migration replay-from-scratch gate (#2150)
env:
PGPASSWORD: test
run: |
# Issue #2150 (SOP internal#765): prove the FULL forward migration
# chain (.up + legacy .sql) replays from a blank schema via the
# PRODUCTION db.RunMigrations entrypoint — hard-fail on any error.
#
# This is the gap the psql apply loop above does NOT cover: that
# loop deliberately SKIPS failing migrations (`⊘ skipped`), so it
# stays green even if the chain stops replaying. The Go test below
# uses the real boot-time runner with hard-fail semantics, catching
# the #211 .down-wipe class and the 045 non-idempotent crash-loop
# class (it runs the chain twice).
#
# Run against a SEPARATE database so the destructive
# `DROP SCHEMA public CASCADE` inside the test never touches the
# `molecule` DB the handlers integration tests above migrated. No
# ordering coupling with the handlers step.
createdb -h "${PG_HOST}" -U postgres molecule_replay 2>/dev/null || \
psql -h "${PG_HOST}" -U postgres -d molecule \
-c "CREATE DATABASE molecule_replay" >/dev/null 2>&1 || true
INTEGRATION_DB_URL="postgres://postgres:test@${PG_HOST}:5432/molecule_replay?sslmode=disable" \
go test -tags=integration -timeout 5m -v ./internal/db/ \
-run '^TestIntegration_Migration|^TestIntegration_InitPostgres'
- if: failure() && needs.detect-changes.outputs.handlers == 'true'
name: Diagnostic dump on failure
env:
+1 -1
View File
@@ -19,7 +19,7 @@
# Forward-compat scope:
# Today (2026-05-11) molecule-core/main protects 3 contexts:
# - "Secret scan / Scan diff for credential-shaped strings (pull_request)"
# - "sop-checklist / tier-check (pull_request)"
# - "sop-checklist / all-items-acked (pull_request)"
# - "CI / all-required (pull_request)"
# Per RFC#324 Step 2 the required-list expands to ~5 contexts
# (qa-review, security-review added). Each new required context's
+387
View File
@@ -0,0 +1,387 @@
name: Local Provision Lifecycle E2E
# MANDATORY coverage for the LOCAL Docker provisioner (MOLECULE_ENV=development,
# docker.sock) — the path self-hosters + dev runs use. Every OTHER e2e exercises
# the SaaS/EC2 (control-plane) provisioner; nothing mandatory drove the local
# Docker path, which is why a config-volume restart-survival bug went undetected.
# This workflow provisions a REAL workspace via the local Docker provisioner and
# asserts the full lifecycle, INCLUDING the restart-survival assertion.
#
# Two jobs:
# * lifecycle-stub (REQUIRED gate) — builds the tiny stub runtime image, tags
# it to the provisioner's RegistryModeLocal cache tag, and runs the full
# lifecycle e2e (provision -> online -> restart-survive -> proxy-reach). Fast
# (seconds of agent boot, no LLM, no 2.5GB image).
# * lifecycle-real (ADVISORY, continue-on-error) — runs the SAME script against
# the real claude-code template image with a REAL MiniMax BYOK credential
# (LIFECYCLE_LLM=minimax). The proxy-reach step asserts an ACTUAL model reply
# (real round-trip through the ws-<id>:8000 proxy), not just reachability.
# MiniMax is the cheapest LLM the platform offers, and its `minimax` provider
# dials api.minimax.io directly (no CP proxy needed on this local stack).
# Heavy + network-dependent (pulls/builds the template + a real LLM call), so
# it is non-blocking. Needs the MOLECULE_STAGING_MINIMAX_API_KEY CI secret:
# when ABSENT the script SKIPS loud (exit 0) — it never reds on a missing
# secret (serving-e2e skip-if-absent pattern).
#
# SUBSTRATE REQUIREMENT (read before wiring into branch protection)
# -----------------------------------------------------------------
# This workflow provisions SIBLING docker containers from a HOST Go binary via
# the runner's docker.sock — exactly like e2e-api.yml, which already provisions
# the `mock` + `priority-runtimes` arms on `docker-host`. So the docker-in-runner
# capability IS available on the molecule-runner-* (docker-host) lane. If the
# operator ever moves these to a runner WITHOUT docker.sock access for the
# platform binary, this lane will red — keep it on `docker-host`.
#
# Both jobs pin `runs-on: docker-host` (Linux operator-host runners with the
# molecule-core-net bridge + a working docker.sock). The bare `ubuntu-latest`
# label is also advertised by the Windows act_runner, where docker.sock-bound
# steps fail non-deterministically — see lint-required-workflows-docker-host-
# pinned.yml + internal#512.
on:
push:
branches: [main, staging]
pull_request:
branches: [main, staging]
concurrency:
# Per-SHA grouping (mirrors e2e-api.yml). cancel-in-progress:false so a queued
# run for an older SHA isn't cancelled by a newer push (auto-promote brittleness).
group: local-provision-e2e-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: false
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
# ===========================================================================
# REQUIRED gate — stub runtime, fast. This IS meant to be a required merge gate
# (the only mandatory coverage for the LOCAL Docker provisioner), but the new
# context is not yet in branch_protections/main — wire it in once the operator
# confirms the docker-host runners reliably provision sibling containers from
# the host platform binary for this lane (see SUBSTRATE REQUIREMENT above), then
# flip the directive below to `# bp-required: yes`. Until then it runs gating
# locally (continue-on-error: false) but un-wired in BP, an acknowledged
# asymmetry tracked for follow-up. (Earlier this block read `# bp-exempt`, which
# contradicted "REQUIRED gate" and tripped lint-required-context-exists-in-bp.)
# bp-required: pending #2409
# ===========================================================================
lifecycle-stub:
name: Local Provision Lifecycle E2E (stub)
runs-on: docker-host
continue-on-error: false
timeout-minutes: 15
env:
PG_CONTAINER: pg-lpe2e-${{ github.run_id }}-${{ github.run_attempt }}
REDIS_CONTAINER: redis-lpe2e-${{ github.run_id }}-${{ github.run_attempt }}
# Hard-code dev mode at the job level so the platform server ALWAYS sees it,
# even if the runner's $GITHUB_ENV propagation is flaky (#2468 RCA).
MOLECULE_ENV: development
SECRETS_ENCRYPTION_KEY: lpe2e-test-encryption-key-32bytes!!
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Ensure provisioner network + pre-pull alpine
run: |
# The local provisioner attaches workspace containers to
# molecule-core-net and seeds /configs via an alpine helper; the
# lifecycle script also uses alpine to seed config.yaml into the
# named config volume. Pre-pull + ensure the bridge (idempotent).
docker pull alpine:3 >/dev/null
docker network create molecule-core-net >/dev/null 2>&1 || true
echo "alpine:3 pre-pulled; molecule-core-net ensured."
- name: Start Postgres (docker, ephemeral host port)
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker run -d --name "$PG_CONTAINER" \
-e POSTGRES_USER=dev -e POSTGRES_PASSWORD=dev -e POSTGRES_DB=molecule \
-p 0:5432 postgres:16 >/dev/null
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
[ -z "$PG_PORT" ] && PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | head -1 | awk -F: '{print $NF}')
if [ -z "$PG_PORT" ]; then echo "::error::no host port for $PG_CONTAINER"; docker logs "$PG_CONTAINER" || true; exit 1; fi
echo "DATABASE_URL=postgres://dev:dev@127.0.0.1:${PG_PORT}/molecule?sslmode=disable" >> "$GITHUB_ENV"
for i in $(seq 1 30); do
docker exec "$PG_CONTAINER" pg_isready -U dev >/dev/null 2>&1 && { echo "pg ready ${i}s"; exit 0; }
sleep 1
done
echo "::error::Postgres not ready in 30s"; docker logs "$PG_CONTAINER" || true; exit 1
- name: Start Redis (docker, ephemeral host port)
run: |
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
docker run -d --name "$REDIS_CONTAINER" -p 0:6379 redis:7 >/dev/null
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
[ -z "$REDIS_PORT" ] && REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | head -1 | awk -F: '{print $NF}')
if [ -z "$REDIS_PORT" ]; then echo "::error::no host port for $REDIS_CONTAINER"; docker logs "$REDIS_CONTAINER" || true; exit 1; fi
echo "REDIS_URL=redis://127.0.0.1:${REDIS_PORT}" >> "$GITHUB_ENV"
for i in $(seq 1 15); do
docker exec "$REDIS_CONTAINER" redis-cli ping 2>/dev/null | grep -q PONG && { echo "redis ready ${i}s"; exit 0; }
sleep 1
done
echo "::error::Redis not ready in 15s"; docker logs "$REDIS_CONTAINER" || true; exit 1
- name: Configure platform env (admin token + local Docker provisioner)
run: |
# Allocate an unused ephemeral port to avoid collision with concurrent
# jobs or stale processes from prior cancelled runs (see #2450).
PORT=$(python3 -c "import socket; s=socket.socket(); s.bind(('', 0)); print(s.getsockname()[1]); s.close()")
echo "PORT=${PORT}" >> "$GITHUB_ENV"
echo "BASE=http://localhost:${PORT}" >> "$GITHUB_ENV"
# Deterministic admin token: the script sends MOLECULE_ADMIN_TOKEN as the
# bearer; the platform checks ADMIN_TOKEN. Set both to the same value.
T="lpe2e-admin-${{ github.run_id }}-${{ github.run_attempt }}"
echo "ADMIN_TOKEN=${T}" >> "$GITHUB_ENV"
echo "MOLECULE_ADMIN_TOKEN=${T}" >> "$GITHUB_ENV"
# MOLECULE_ENV=development: dev posture. MOLECULE_ORG_ID is left UNSET so
# main.go wires the LOCAL Docker provisioner (not the CP provisioner), and
# MOLECULE_IMAGE_REGISTRY is left UNSET so image resolution uses
# RegistryModeLocal (the dockerHasTag cache-check the stub pre-tags into).
echo "MOLECULE_ENV=development" >> "$GITHUB_ENV"
echo "SECRETS_ENCRYPTION_KEY=lpe2e-test-encryption-key-32bytes!!" >> "$GITHUB_ENV"
- name: Build platform
working-directory: workspace-server
run: go build -o platform-server ./cmd/server
- name: Kill stale platform-server before start (issue #1046)
run: |
# Dynamic port allocation (see #2450) eliminates the fixed-port race
# that caused this gate to red when a prior run left a zombie process.
# We still sweep by process name to avoid leaking platform-server
# processes on the shared runner.
killed=0
for pid in $(grep -l "platform-serve" /proc/[0-9]*/comm 2>/dev/null); do
kpid="${pid%/comm}"; kpid="${kpid##*/}"
cmdline=$(cat "/proc/${kpid}/cmdline" 2>/dev/null | tr '\0' ' ')
if echo "$cmdline" | grep -q "platform-server"; then
echo "Killing stale platform-server pid ${kpid}: ${cmdline}"
kill "$kpid" 2>/dev/null || true
killed=$((killed + 1))
fi
done
if [ "$killed" -gt 0 ]; then echo "Killed $killed stale platform-server process(es)."; else echo "No platform-server-named process found."; fi
sleep 1
- name: Start platform (background)
working-directory: workspace-server
run: |
# Bind to the dynamically allocated port (see #2450).
# DATABASE_URL/REDIS_URL/ADMIN_TOKEN/MOLECULE_ENV are inherited from
# $GITHUB_ENV.
PORT=$PORT ./platform-server > platform.log 2>&1 &
echo $! > platform.pid
- name: Wait for /health (+ migrations applied)
run: |
DEADLINE=300; PID="$(cat workspace-server/platform.pid 2>/dev/null || true)"; start=$(date +%s)
while :; do
# Verify OUR server is still alive before trusting /health. Our server
# binds the allocated port or exits FATAL, so "our PID alive" <=>
# "we own the port"; checking it first stops a squatter that answers
# /health on the same port (our bind having failed) from false-positiving
# the gate (no-flakes RCA).
if [ -n "$PID" ] && ! kill -0 "$PID" 2>/dev/null; then
echo "::error::platform-server exited early (failed to bind or crashed)"; cat workspace-server/platform.log || true; exit 1
fi
if curl -sf "$BASE/health" >/dev/null; then
tables=$(docker exec "$PG_CONTAINER" psql -U dev -d molecule -tAc \
"SELECT count(*) FROM information_schema.tables WHERE table_schema='public' AND table_name='workspaces'" 2>/dev/null || echo 0)
[ "$tables" = "1" ] && { echo "healthy + migrated after $(( $(date +%s) - start ))s"; exit 0; }
fi
[ "$(( $(date +%s) - start ))" -ge "$DEADLINE" ] && { echo "::error::platform not healthy in ${DEADLINE}s"; cat workspace-server/platform.log || true; exit 1; }
sleep 1
done
- name: Run local-provision lifecycle E2E (stub — REQUIRED)
run: bash tests/e2e/test_local_provision_lifecycle_e2e.sh
- name: Dump platform log on failure
if: failure()
run: cat workspace-server/platform.log || true
- name: Stop platform
if: always()
run: |
[ -f workspace-server/platform.pid ] && kill "$(cat workspace-server/platform.pid)" 2>/dev/null || true
- name: Stop service containers
if: always()
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
# ===========================================================================
# ADVISORY — real claude-code image, lifecycle-only. Non-blocking. It pulls/
# builds the 2.5GB template image, makes a real (cheap) MiniMax LLM call, and is
# network-dependent, so a miss must not block. It proves the REAL runtime
# survives a restart AND serves a genuine LLM round-trip on the local
# provisioner (proxy-reach asserts a real MiniMax reply, not just reachability).
# ===========================================================================
# bp-exempt: advisory lane (continue-on-error: true) — informational, never a merge gate.
lifecycle-real:
name: Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory)
runs-on: docker-host
# Serialise behind the gating stub job: both jobs share the same docker-host
# runner and provision sibling containers. `needs:` forces this advisory job
# to start only AFTER lifecycle-stub finishes, avoiding resource contention.
# (Dynamic ports eliminated the fixed-port race; serialisation remains for
# docker-host capacity hygiene.) continue-on-error keeps a real-job miss
# non-blocking; `needs:` does NOT gate on the stub's success (a failed
# required gate still lets this advisory dependent run).
needs: lifecycle-stub
if: ${{ always() }}
# Tracker for lint-continue-on-error-tracking (Tier 2e / internal#350): this
# mask has a forced 14-day renewal cycle. mc#2408 tracks promoting this
# advisory MiniMax round-trip to a gating job (then flip to false).
continue-on-error: true # mc#2408 — promote advisory MiniMax e2e to gating
timeout-minutes: 30
env:
PG_CONTAINER: pg-lpe2e-real-${{ github.run_id }}-${{ github.run_attempt }}
REDIS_CONTAINER: redis-lpe2e-real-${{ github.run_id }}-${{ github.run_attempt }}
# Hard-code dev mode at the job level so the platform server ALWAYS sees it,
# even if the runner's $GITHUB_ENV propagation is flaky (#2468 RCA).
MOLECULE_ENV: development
SECRETS_ENCRYPTION_KEY: lpe2e-test-encryption-key-32bytes!!
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Ensure provisioner network + pre-pull alpine
run: |
docker pull alpine:3 >/dev/null
docker network create molecule-core-net >/dev/null 2>&1 || true
- name: Start Postgres (docker, ephemeral host port)
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker run -d --name "$PG_CONTAINER" \
-e POSTGRES_USER=dev -e POSTGRES_PASSWORD=dev -e POSTGRES_DB=molecule \
-p 0:5432 postgres:16 >/dev/null
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
[ -z "$PG_PORT" ] && PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | head -1 | awk -F: '{print $NF}')
if [ -z "$PG_PORT" ]; then echo "::error::no host port"; docker logs "$PG_CONTAINER" || true; exit 1; fi
echo "DATABASE_URL=postgres://dev:dev@127.0.0.1:${PG_PORT}/molecule?sslmode=disable" >> "$GITHUB_ENV"
for i in $(seq 1 30); do
docker exec "$PG_CONTAINER" pg_isready -U dev >/dev/null 2>&1 && { echo "pg ready ${i}s"; exit 0; }
sleep 1
done
echo "::error::Postgres not ready"; docker logs "$PG_CONTAINER" || true; exit 1
- name: Start Redis (docker, ephemeral host port)
run: |
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
docker run -d --name "$REDIS_CONTAINER" -p 0:6379 redis:7 >/dev/null
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
[ -z "$REDIS_PORT" ] && REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | head -1 | awk -F: '{print $NF}')
if [ -z "$REDIS_PORT" ]; then echo "::error::no host port"; docker logs "$REDIS_CONTAINER" || true; exit 1; fi
echo "REDIS_URL=redis://127.0.0.1:${REDIS_PORT}" >> "$GITHUB_ENV"
for i in $(seq 1 15); do
docker exec "$REDIS_CONTAINER" redis-cli ping 2>/dev/null | grep -q PONG && { echo "redis ready ${i}s"; exit 0; }
sleep 1
done
echo "::error::Redis not ready"; docker logs "$REDIS_CONTAINER" || true; exit 1
- name: Configure platform env
run: |
# Allocate an unused ephemeral port to avoid collision with concurrent
# jobs or stale processes from prior cancelled runs (see #2450).
PORT=$(python3 -c "import socket; s=socket.socket(); s.bind(('', 0)); print(s.getsockname()[1]); s.close()")
echo "PORT=${PORT}" >> "$GITHUB_ENV"
echo "BASE=http://localhost:${PORT}" >> "$GITHUB_ENV"
T="lpe2e-real-admin-${{ github.run_id }}-${{ github.run_attempt }}"
echo "ADMIN_TOKEN=${T}" >> "$GITHUB_ENV"
echo "MOLECULE_ADMIN_TOKEN=${T}" >> "$GITHUB_ENV"
echo "MOLECULE_ENV=development" >> "$GITHUB_ENV"
echo "SECRETS_ENCRYPTION_KEY=lpe2e-test-encryption-key-32bytes!!" >> "$GITHUB_ENV"
- name: Build platform
working-directory: workspace-server
run: go build -o platform-server ./cmd/server
- name: Kill stale platform-server before start (issue #1046)
run: |
# Dynamic port allocation (see #2450) eliminates the fixed-port race.
# We still sweep by process name to avoid leaking platform-server
# processes on the shared runner.
killed=0
for pid in $(grep -l "platform-serve" /proc/[0-9]*/comm 2>/dev/null); do
kpid="${pid%/comm}"; kpid="${kpid##*/}"
cmdline=$(cat "/proc/${kpid}/cmdline" 2>/dev/null | tr '\0' ' ')
if echo "$cmdline" | grep -q "platform-server"; then
echo "Killing stale platform-server pid ${kpid}: ${cmdline}"
kill "$kpid" 2>/dev/null || true
killed=$((killed + 1))
fi
done
if [ "$killed" -gt 0 ]; then echo "Killed $killed stale platform-server process(es)."; else echo "No platform-server-named process found."; fi
sleep 1
- name: Start platform (background)
working-directory: workspace-server
run: |
PORT=$PORT ./platform-server > platform.log 2>&1 &
echo $! > platform.pid
- name: Wait for /health (+ migrations applied)
run: |
DEADLINE=300; PID="$(cat workspace-server/platform.pid 2>/dev/null || true)"; start=$(date +%s)
while :; do
# Verify OUR server is still alive before trusting /health. Our server
# binds the allocated port or exits FATAL, so checking our PID first
# stops a squatter from false-positiving the gate (no-flakes RCA).
if [ -n "$PID" ] && ! kill -0 "$PID" 2>/dev/null; then
echo "::error::platform-server exited early (failed to bind or crashed)"; cat workspace-server/platform.log || true; exit 1
fi
if curl -sf "$BASE/health" >/dev/null; then
tables=$(docker exec "$PG_CONTAINER" psql -U dev -d molecule -tAc \
"SELECT count(*) FROM information_schema.tables WHERE table_schema='public' AND table_name='workspaces'" 2>/dev/null || echo 0)
[ "$tables" = "1" ] && { echo "healthy after $(( $(date +%s) - start ))s"; exit 0; }
fi
[ "$(( $(date +%s) - start ))" -ge "$DEADLINE" ] && { echo "::error::platform not healthy in ${DEADLINE}s"; cat workspace-server/platform.log || true; exit 1; }
sleep 1
done
- name: Run local-provision lifecycle E2E (real image + MiniMax LLM — ADVISORY)
env:
# LIFECYCLE_LLM=minimax: provision the REAL claude-code template image
# (the mode forces LIFECYCLE_PROVISIONER_BUILDS=1 — the provisioner
# clones + docker-builds the template from Gitea via RegistryModeLocal)
# with a real MiniMax BYOK credential, and assert an ACTUAL model reply
# at the proxy-reach step (a genuine round-trip through ws-<id>:8000).
# MiniMax is the cheapest LLM the platform offers; its `minimax`
# provider dials api.minimax.io directly, so no CP proxy env is needed.
#
# Key wiring (DO NOT hardcode): the script reads MINIMAX_API_KEY from
# the env; we feed it from the MOLECULE_STAGING_MINIMAX_API_KEY CI
# secret (the same secret the staging-smoke + e2e-api MiniMax arms use).
# When that secret is ABSENT, MINIMAX_API_KEY is empty and the script
# SKIPS loud (exit 0) — it never reds on a missing secret (serving-e2e
# skip-if-absent pattern). The advisory job stays green either way.
LIFECYCLE_LLM: minimax
MINIMAX_API_KEY: ${{ secrets.MOLECULE_STAGING_MINIMAX_API_KEY }}
run: bash tests/e2e/test_local_provision_lifecycle_e2e.sh
- name: Dump platform log on failure
if: failure()
run: cat workspace-server/platform.log || true
- name: Stop platform
if: always()
run: |
[ -f workspace-server/platform.pid ] && kill "$(cat workspace-server/platform.pid)" 2>/dev/null || true
- name: Stop service containers
if: always()
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
@@ -248,16 +248,36 @@ jobs:
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_LATEST}"
)
docker buildx build \
--file ./workspace-server/Dockerfile.tenant \
--build-arg NEXT_PUBLIC_PLATFORM_URL= \
--build-arg GIT_SHA="${GIT_SHA}" \
--label "org.opencontainers.image.source=https://git.moleculesai.app/molecule-ai/${REPO}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.created=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--label "molecule.workflow.run_id=${GITHUB_RUN_ID}" \
"${build_tags[@]}" \
--push .
# Retry loop: buildkit EOF (internal#2468) is often transient on the
# publish runner under memory pressure. Up to 3 attempts with a fresh
# builder each time so a crashed buildkit doesn't poison the next try.
for attempt in 1 2 3; do
echo "::notice::Tenant image build attempt ${attempt}/3 ..."
builder="tenant-builder-${GITHUB_RUN_ID}-${attempt}"
docker buildx create --name "${builder}" --use >/dev/null 2>&1 || true
if docker buildx build \
--builder "${builder}" \
--file ./workspace-server/Dockerfile.tenant \
--build-arg NEXT_PUBLIC_PLATFORM_URL= \
--build-arg GIT_SHA="${GIT_SHA}" \
--label "org.opencontainers.image.source=https://git.moleculesai.app/molecule-ai/${REPO}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.created=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--label "molecule.workflow.run_id=${GITHUB_RUN_ID}" \
"${build_tags[@]}" \
--push .; then
docker buildx rm "${builder}" >/dev/null 2>&1 || true
echo "::notice::Tenant image build succeeded on attempt ${attempt}"
break
fi
echo "::warning::Tenant image build attempt ${attempt} failed — cleaning builder and retrying"
docker buildx rm "${builder}" >/dev/null 2>&1 || true
sleep 10
if [ "$attempt" -eq 3 ]; then
echo "::error::Tenant image build failed after 3 attempts"
exit 1
fi
done
# bp-exempt: production deploy side-effect; merge is gated by CI / all-required and this job waits for push CI before acting.
deploy-production:
+27 -14
View File
@@ -7,18 +7,25 @@
#
# A1-α (refire mechanism):
# Triggers on:
# - `pull_request_target`: opened, synchronize, reopened
# → initial status posts when PR opens / re-pushes
# - `pull_request_target`: opened, synchronize, reopened, labeled, unlabeled
# → initial status posts when PR opens / re-pushes, and re-evaluates
# when labels change (e.g. risk-indicator labels).
# - `pull_request_review` types: [submitted]
# → re-evaluate when a team member submits an APPROVE review so
# the gate flips immediately (no wait for the next push or
# slash-command). Verified live: sop-checklist.yml uses this
# same event and provably fires (produces
# `sop-checklist / all-items-acked (pull_request_review)` contexts).
# The job-level `if:` guard checks
# `github.event.review.state == 'APPROVED' || 'approved'` so
# only APPROVE reviews run the evaluator; COMMENT and
# REQUEST_CHANGES are skipped at the job level.
# The job-level `if:` does NOT guard on review.state (issue
# #2159): Gitea 1.22.6's payload shape for this event does not
# reliably expose the state field that the GitHub-style guard
# expects. The evaluator (review-check.sh) reads actual reviews
# from the API and checks for a real APPROVE, so running on
# COMMENT or REQUEST_CHANGES is harmless (read-only,
# idempotent). Branch-protection requires the
# `(pull_request_target)` context variant, so the review-event
# path EXPLICITLY POSTS the required context via the API. Trust
# boundary preserved (BASE ref, no PR-head).
# Branch-protection requires the `(pull_request_target)`
# context variant, so the review-event path EXPLICITLY POSTS
# the required context via the API. Trust boundary preserved
@@ -96,7 +103,7 @@ name: qa-review
on:
pull_request_target:
types: [opened, synchronize, reopened]
types: [opened, synchronize, reopened, labeled, unlabeled]
pull_request_review:
types: [submitted]
@@ -110,13 +117,19 @@ jobs:
approved:
# Gate the job:
# - On pull_request_target events: always run.
# - On pull_request_review_approved events: run so the gate flips
# immediately when a team member submits an APPROVE review.
# - On pull_request_review events: always run. We do NOT guard on
# review.state here because Gitea 1.22.6's payload shape for this
# event does not reliably expose the state field (issue #2159).
# The evaluator (review-check.sh) reads actual reviews from the
# API and checks for a real APPROVE, so running on COMMENT or
# REQUEST_CHANGES is harmless (read-only, idempotent).
# - On labeled/unlabeled events: re-evaluate when labels change.
# This ensures qa-review flips when risk-indicator labels are
# added or removed.
# Comment-triggered refires live in sop-checklist.yml review-refire job.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'pull_request_review' &&
(github.event.review.state == 'APPROVED' || github.event.review.state == 'approved'))
github.event_name == 'pull_request_review'
runs-on: ubuntu-latest
steps:
- name: Privilege check (A1.1 — INFORMATIONAL log only, NOT a gate)
@@ -130,7 +143,7 @@ jobs:
# no comment.user.login so the step is a no-op skip there.
if: github.event_name == 'issue_comment'
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
login="${{ github.event.comment.user.login }}"
@@ -162,7 +175,7 @@ jobs:
- name: Evaluate qa-review
id: eval
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
# PR number lives in different places per event:
@@ -185,7 +198,7 @@ jobs:
# TOKEN FIX (RC 8326): uses STATUS_POST_TOKEN (CTO-granted,
# msg d52cc72a). Dedicated narrow-scoped write:repository token
# for the explicit status POST. Evaluator step stays on
# SOP_TIER_CHECK_TOKEN (read-only) per deliberate security
# SOP_CHECKLIST_GATE_TOKEN (read-only) per deliberate security
# separation: eval computes, POST writes, never the same cred.
if: github.event_name == 'pull_request_review' && always()
env:
+19
View File
@@ -21,15 +21,21 @@ on:
branches: [main, staging]
paths:
- '.gitea/scripts/review-check.sh'
- '.gitea/scripts/_approval_validator.py'
- '.gitea/scripts/_review_check_filter.py'
- '.gitea/scripts/tests/test_review_check.sh'
- '.gitea/scripts/tests/_review_check_fixture.py'
- '.gitea/scripts/tests/test_approval_validator.py'
- '.gitea/workflows/review-check-tests.yml'
pull_request:
branches: [main, staging]
paths:
- '.gitea/scripts/review-check.sh'
- '.gitea/scripts/_approval_validator.py'
- '.gitea/scripts/_review_check_filter.py'
- '.gitea/scripts/tests/test_review_check.sh'
- '.gitea/scripts/tests/_review_check_fixture.py'
- '.gitea/scripts/tests/test_approval_validator.py'
- '.gitea/workflows/review-check-tests.yml'
workflow_dispatch:
@@ -70,3 +76,16 @@ jobs:
- name: Run review-check.sh regression suite
run: bash .gitea/scripts/tests/test_review_check.sh
- name: SSOT approval-validator unit tests (SEV-1 internal#812)
# The Python unit tests for _approval_validator.py are
# mutation-verified — every fail-closed branch has an explicit
# REJECT assertion. A reviewer who weakens the predicate trips
# these in CI.
run: |
# The test file lives in .gitea/scripts/tests/ with no __init__.py,
# so `unittest discover -s .gitea/scripts` finds 0 tests (the SEV-1
# suite silently never ran — a CI gap fixed alongside internal#812).
# Run the file directly; it self-inserts its sys.path and calls
# unittest.main(), so a failing assertion exits non-zero and fails CI.
python3 .gitea/scripts/tests/test_approval_validator.py -v
+23 -14
View File
@@ -12,18 +12,21 @@
# Uses `pull_request_review` types: [submitted] — verified live via
# sop-checklist.yml which provably fires this event (produces
# `sop-checklist / all-items-acked (pull_request_review)` contexts).
# The job-level `if:` guard checks
# `github.event.review.state == 'APPROVED' || 'approved'` so only APPROVE
# reviews run the evaluator; COMMENT and REQUEST_CHANGES are skipped at
# the job level. Branch-protection requires the `(pull_request_target)`
# context variant, so the review-event path EXPLICITLY POSTS the required
# context via the API. Trust boundary preserved (BASE ref, no PR-head).
# The job-level `if:` does NOT guard on review.state (issue #2159):
# Gitea 1.22.6's payload shape for this event does not reliably expose
# the state field that the GitHub-style guard expects. The evaluator
# (review-check.sh) reads actual reviews from the API and checks for a
# real APPROVE, so running on COMMENT or REQUEST_CHANGES is harmless
# (read-only, idempotent). Branch-protection requires the
# `(pull_request_target)` context variant, so the review-event path
# EXPLICITLY POSTS the required context via the API. Trust boundary
# preserved (BASE ref, no PR-head).
name: security-review
on:
pull_request_target:
types: [opened, synchronize, reopened]
types: [opened, synchronize, reopened, labeled, unlabeled]
pull_request_review:
types: [submitted]
@@ -37,13 +40,19 @@ jobs:
approved:
# Gate the job:
# - On pull_request_target events: always run.
# - On pull_request_review_approved events: run so the gate flips
# immediately when a team member submits an APPROVE review.
# - On pull_request_review events: always run. We do NOT guard on
# review.state here because Gitea 1.22.6's payload shape for this
# event does not reliably expose the state field (issue #2159).
# The evaluator (review-check.sh) reads actual reviews from the
# API and checks for a real APPROVE, so running on COMMENT or
# REQUEST_CHANGES is harmless (read-only, idempotent).
# - On labeled/unlabeled events: re-evaluate when labels change.
# This ensures security-review flips when risk-indicator labels
# are added or removed.
# Comment-triggered refires live in sop-checklist.yml review-refire job.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'pull_request_review' &&
(github.event.review.state == 'APPROVED' || github.event.review.state == 'approved'))
github.event_name == 'pull_request_review'
runs-on: ubuntu-latest
steps:
- name: Privilege check (A1.1 — INFORMATIONAL log only, NOT a gate)
@@ -52,7 +61,7 @@ jobs:
# so re-running on a non-collaborator comment is harmless.
if: github.event_name == 'issue_comment'
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
login="${{ github.event.comment.user.login }}"
@@ -78,7 +87,7 @@ jobs:
- name: Evaluate security-review
id: eval
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
@@ -98,7 +107,7 @@ jobs:
# TOKEN FIX (RC 8326): uses STATUS_POST_TOKEN (CTO-granted,
# msg d52cc72a). Dedicated narrow-scoped write:repository token
# for the explicit status POST. Evaluator step stays on
# SOP_TIER_CHECK_TOKEN (read-only) per deliberate security
# SOP_CHECKLIST_GATE_TOKEN (read-only) per deliberate security
# separation: eval computes, POST writes, never the same cred.
if: github.event_name == 'pull_request_review' && always()
env:
+2 -2
View File
@@ -167,7 +167,7 @@ jobs:
if: steps.classify.outputs.run_qa == 'true'
env:
# Evaluator (review-check.sh + GET /pulls) stays on read-scoped token.
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
# Explicit POST /statuses uses narrow-scoped write:repository token.
STATUS_POST_TOKEN: ${{ secrets.STATUS_POST_TOKEN }}
GITEA_HOST: git.moleculesai.app
@@ -186,7 +186,7 @@ jobs:
if: steps.classify.outputs.run_security == 'true'
env:
# Evaluator (review-check.sh + GET /pulls) stays on read-scoped token.
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
# Explicit POST /statuses uses narrow-scoped write:repository token.
STATUS_POST_TOKEN: ${{ secrets.STATUS_POST_TOKEN }}
GITEA_HOST: git.moleculesai.app
+38 -9
View File
@@ -58,22 +58,51 @@ jobs:
python-version: '3.11'
- name: Install .gitea script test dependencies
run: python -m pip install --quiet 'pytest==9.0.2' 'PyYAML==6.0.2'
- name: Run scripts/ unittests, if any
- name: Run scripts/ unittests (fail-closed on 0 collected)
# Top-level scripts/ tests live alongside their target file. The
# runtime packaging tests moved to molecule-ai-workspace-runtime, so
# this pass may legitimately find no tests.
# this pass may legitimately find NO test files today.
#
# Gate-integrity fix: the previous guard keyed off `rc==5` to detect
# "no tests collected", but Python 3.12's unittest exits 0 (not 5)
# when discovery finds 0 tests ("NO TESTS RAN"). The guard therefore
# never fired, so any test_*.py added here would silently run 0 tests
# while this step stayed GREEN. A green step that runs 0 tests is
# worse than a red one. We now fail-closed:
# - genuinely NO test_*.py present -> loud SKIP (legitimate no-op)
# - test_*.py present but 0 collected -> FAIL (broken import/empty)
working-directory: scripts
run: |
set +e
python -m unittest discover -t . -p 'test_*.py' -v
rc=$?
if [ "$rc" -eq 5 ]; then
echo "No top-level scripts/ unittest files found; skipping."
set -euo pipefail
# Non-recursive count: scripts/ has no __init__.py, so unittest
# discover does not recurse into subdirs (ops/ is run separately
# below) — top-level files are the entire discovery scope here.
nfiles=$(find . -maxdepth 1 -name 'test_*.py' | wc -l | tr -d ' ')
if [ "$nfiles" -eq 0 ]; then
echo "SKIP: no top-level scripts/ test_*.py files present (genuine no-op)."
exit 0
fi
exit "$rc"
echo "Found $nfiles top-level scripts/ test_*.py file(s); asserting they collect >0 tests."
ncollected=$(python -c "import unittest; print(unittest.TestLoader().discover('.', pattern='test_*.py', top_level_dir='.').countTestCases())")
echo "Collected $ncollected test case(s)."
if [ "$ncollected" -eq 0 ]; then
echo "FAIL: test_*.py file(s) present but 0 tests collected (broken import / empty file / discovery error)."
exit 1
fi
python -m unittest discover -t . -p 'test_*.py' -v
- name: Run scripts/ops/ unittests (sweep_cf_decide, ...)
# Real gate: scripts/ops/ must always run tests. Assert >0 collected so
# deleting all test files (or breaking an import) can't pass GREEN by
# running 0 tests — same gate-integrity class as the scripts/ step.
working-directory: scripts/ops
run: python -m unittest discover -p 'test_*.py' -v
run: |
set -euo pipefail
ncollected=$(python -c "import unittest; print(unittest.TestLoader().discover('.', pattern='test_*.py').countTestCases())")
echo "scripts/ops/ collected $ncollected test case(s)."
if [ "$ncollected" -eq 0 ]; then
echo "FAIL: scripts/ops/ collected 0 tests — this gate must run real tests (deleted/broken import?)."
exit 1
fi
python -m unittest discover -p 'test_*.py' -v
- name: Run .gitea/scripts pytest suite
run: python -m pytest .gitea/scripts/tests -q
+11 -1
View File
@@ -4,7 +4,7 @@
# use this Makefile; CI calls docker compose / go test directly so the
# Makefile can evolve without breaking the build.
.PHONY: help dev up down logs build test e2e-peer-visibility openapi-spec openapi-spec-check gen gen-docker gen-check gen-check-docker
.PHONY: help dev up down logs build test e2e-peer-visibility e2e-concierge-creates-workspace openapi-spec openapi-spec-check gen gen-docker gen-check gen-check-docker
# ─── Provider-registry SSOT codegen (internal#718) ─────────────────────
# The Go module lives in workspace-server/. The checked-in artifact
@@ -57,6 +57,16 @@ test: ## Run Go unit tests in workspace-server/.
e2e-peer-visibility: ## Run the LOCAL peer-visibility MCP gate vs the running stack (needs `make up` first).
bash tests/e2e/test_peer_visibility_mcp_local.sh
# FUNCTIONAL local proof that the org concierge actually DOES org-management:
# send it a natural-language A2A request and assert it really CREATES a workspace
# via its platform MCP (create_workspace) — the deterministic side effect, not a
# REST 200. SKIPs LOUD (exit 0) unless the local concierge is seeded, online, and
# running on the platform-agent image (so create_workspace exists). To run it
# green locally: seed the concierge (MOLECULE_SEED_PLATFORM_AGENT=1) on the
# platform-agent image WITH a model key. See the script header for the contract.
e2e-concierge-creates-workspace: ## Prove the concierge actually creates a workspace via its platform MCP (skips loud if not runnable).
bash tests/e2e/test_concierge_creates_workspace_local.sh
# ─── OpenAPI spec generation (RFC #1706, Phase 1) ─────────────────────
# Regenerate workspace-server/docs/openapi/swagger.{yaml,json} from
# swaggo annotations on the gin handlers. Commit the output. CI runs
+10
View File
@@ -1,7 +1,14 @@
import { test, expect } from "@playwright/test";
import type { Page } from "@playwright/test";
import { startEchoRuntime } from "./fixtures/echo-runtime";
import { seedWorkspace, startHeartbeat, cleanupWorkspace } from "./fixtures/chat-seed";
/** Enter the Org-map view so the Canvas (React Flow graph) mounts. */
async function enterMapView(page: Page): Promise<void> {
const btn = page.getByTestId("nav-map");
await expect(btn, "rail button nav-map missing").toBeVisible({ timeout: 10_000 });
await btn.click();
}
test.describe("Desktop ChatTab", () => {
let cleanup: () => Promise<void> = async () => {};
@@ -29,6 +36,7 @@ test.describe("Desktop ChatTab", () => {
test.beforeEach(async ({ page }) => {
await page.setViewportSize({ width: 1280, height: 800 });
await page.goto("/");
await enterMapView(page);
await page.waitForSelector(".react-flow__node", { timeout: 10_000 });
// Dismiss onboarding guide if present.
const skipGuide = page.getByText("Skip guide");
@@ -67,6 +75,7 @@ test.describe("Desktop ChatTab", () => {
await expect(page.getByText("Echo: Persistence test")).toBeVisible({ timeout: 15_000 });
await page.reload();
await enterMapView(page);
await page.waitForSelector(".react-flow__node", { timeout: 10_000 });
await page.getByText(workspaceName, { exact: true }).first().click();
await page.locator('#tab-chat').click();
@@ -143,6 +152,7 @@ test.describe("Desktop ChatTab — Markdown rendering", () => {
test.beforeEach(async ({ page }) => {
await page.setViewportSize({ width: 1280, height: 800 });
await page.goto("/");
await enterMapView(page);
await page.waitForSelector(".react-flow__node", { timeout: 10_000 });
const skipGuide2 = page.getByText("Skip guide");
if (await skipGuide2.isVisible().catch(() => false)) {
+648
View File
@@ -0,0 +1,648 @@
/**
* Staging concierge canvas E2E — exercises the platform-agent CONCIERGE shell
* (canvas/src/components/concierge/ConciergeShell.tsx and the Settings split)
* against a fresh staging org provisioned by the shared global setup
* (e2e/staging-setup.ts). Each `test.describe` covers ONE concierge function
* and asserts the behaviour works — not merely that an element exists.
*
* Why this is a SEPARATE spec from staging-tabs.spec.ts (which drives the
* Org-map SidePanel tab UI): the two assert different surfaces of the same
* tenant. Both reuse the EXACT shared harness — same global setup (one
* provisioned org/workspace), same Playwright staging config (matched by the
* `staging-*.spec.ts` testMatch), same gated `Canvas tabs E2E` workflow check.
* No new harness, no new seeding mechanism.
*
* One extra precondition this spec needs that staging-tabs does NOT: a
* kind='platform' concierge ROW. The CI/SaaS tenant does not self-seed one
* (MOLECULE_SEED_PLATFORM_AGENT is unset on CI — workspace-server
* cmd/server/main.go), so without it the concierge shell falls back to
* roots[0] as a *pseudo*-platform surface and the platform-specific
* behaviours (root tag, hidden-from-map) can't be asserted. So this spec
* installs one via the SAME admin endpoint the control plane uses at
* org-provision time — POST /admin/org/platform-agent (AdminAuth, accepts the
* per-tenant admin bearer that global setup already exports). Installing it
* re-parents the provisioned hermes workspace UNDER the platform agent
* (handlers/platform_agent.go installPlatformAgent), giving us a real
* platform ROOT + a real child workspace — exactly the topology the concierge
* Home tree and Org-map filter are built to handle.
*
* This install mutates the shared tenant (re-parents the workspace). It is the
* LAST staging spec alphabetically among the topology-touching ones, and
* staging-tabs / staging-display read the workspace by id (not by root-ness),
* so the re-parent does not break them; Playwright runs workers=1 in file
* order, and the install is idempotent.
*
* Auth model is identical to staging-tabs.spec.ts: feed the per-tenant admin
* token as an Authorization: Bearer header on every browser request, mock
* /cp/auth/me so AuthGate resolves, and fall any non-auth 401 back to an
* empty 200 so a workspace-scoped 401 can't yank us to AuthKit.
*/
import { test, expect, type Page, type BrowserContext } from "@playwright/test";
const STAGING = process.env.CANVAS_E2E_STAGING === "1";
// Fail-closed, not skip-green (mirrors staging-tabs.spec.ts): a staging run
// that was REQUESTED (CANVAS_E2E_STAGING=1) but has no tenant state is a
// provisioning failure, asserted loudly inside the test body — not a skip.
// CANVAS_E2E_STAGING unset = operator did not request staging = clean skip.
test.skip(!STAGING, "CANVAS_E2E_STAGING not set — staging-only suite, not requested");
/** Resolve + validate the tenant handoff that global setup exported. */
function tenantEnv() {
const tenantURL = process.env.STAGING_TENANT_URL;
const tenantToken = process.env.STAGING_TENANT_TOKEN;
const workspaceId = process.env.STAGING_WORKSPACE_ID;
const orgID = process.env.STAGING_ORG_ID;
if (!tenantURL || !tenantToken || !workspaceId) {
throw new Error(
"staging-setup.ts did not export STAGING_TENANT_URL / " +
"STAGING_TENANT_TOKEN / STAGING_WORKSPACE_ID. CANVAS_E2E_STAGING=1 was " +
"set (staging WAS requested) but global setup produced no tenant — a " +
"provisioning failure, NOT a reason to skip. See the [staging-setup] " +
"log above.",
);
}
return { tenantURL, tenantToken, workspaceId, orgID };
}
// A fixed, valid uuid for the installed platform agent. Any valid uuid works
// (the install upserts on this id); reusing one constant keeps re-runs
// idempotent on the same row. Chosen out of the e2e namespace so it can't
// collide with a CP-derived org id.
const PLATFORM_AGENT_ID = "e2e0c1e2-0000-4000-a000-000000c0ce0e";
const PLATFORM_AGENT_NAME = "E2E Concierge";
/**
* Idempotently install the platform-agent (concierge) row on the shared
* tenant so the concierge shell resolves a REAL kind='platform' root. Uses
* the per-tenant admin bearer + org-id headers, same as staging-display.spec.
* Tolerant of a pre-existing install (the endpoint is idempotent) and of a
* backend that predates the endpoint (404/405) — in that degraded case the
* spec proceeds against the roots[0] fallback and the two platform-specific
* assertions self-document why they're loosened.
*/
async function installPlatformAgent(
page: Page,
tenantURL: string,
tenantToken: string,
orgID: string | undefined,
): Promise<{ installed: boolean }> {
const headers: Record<string, string> = {
Authorization: `Bearer ${tenantToken}`,
"Content-Type": "application/json",
};
if (orgID) headers["X-Molecule-Org-Id"] = orgID;
const resp = await page.request.post(`${tenantURL}/admin/org/platform-agent`, {
headers,
data: { id: PLATFORM_AGENT_ID, name: PLATFORM_AGENT_NAME },
});
const status = resp.status();
if (status >= 200 && status < 300) {
console.log(`[staging-concierge] platform agent installed (HTTP ${status})`);
return { installed: true };
}
// Endpoint absent on an older backend — proceed against the fallback root.
if (status === 404 || status === 405) {
console.warn(
`[staging-concierge] POST /admin/org/platform-agent returned ${status}` +
`backend predates the platform-agent endpoint. Proceeding against the ` +
`roots[0] concierge fallback; the platform-root / map-hidden assertions ` +
`are loosened accordingly.`,
);
return { installed: false };
}
throw new Error(
`POST /admin/org/platform-agent ${status}: ${await resp.text().catch(() => "")}`,
);
}
/**
* Wire the per-tenant bearer + the /cp/auth/me mock + the 401→empty-200
* fallback. Verbatim contract from staging-tabs.spec.ts so the concierge spec
* authenticates identically (no WorkOS session available to Playwright).
*/
async function authenticate(
context: BrowserContext,
tenantToken: string,
workspaceId: string,
): Promise<void> {
await context.setExtraHTTPHeaders({ Authorization: `Bearer ${tenantToken}` });
await context.route("**/cp/auth/me", (route) =>
route.fulfill({
status: 200,
contentType: "application/json",
body: JSON.stringify({
user_id: `e2e-test-user-${workspaceId}`,
org_id: "e2e-test-org",
email: "e2e@test.local",
}),
}),
);
await context.route("**", async (route, request) => {
if (request.resourceType() !== "fetch") return route.fallback();
if (request.url().includes("/cp/auth/me")) return route.fallback();
let resp;
try {
resp = await route.fetch();
} catch {
return route.fallback();
}
if (resp.status() !== 401) return route.fulfill({ response: resp });
const lastSeg =
new URL(request.url()).pathname.split("/").filter(Boolean).pop() || "";
const looksLikeList = !/^[0-9a-f-]{8,}$/.test(lastSeg);
await route.fulfill({
status: 200,
contentType: "application/json",
body: looksLikeList ? "[]" : "{}",
});
});
}
/**
* Load the concierge shell and wait for hydration. Returns once the icon rail
* (the concierge's left nav) is visible — the rail is the shell's outermost
* stable landmark and only renders after the canvas store has hydrated.
*/
async function loadConcierge(page: Page, tenantURL: string): Promise<void> {
page.on("console", (msg) => {
if (msg.type() === "error") console.log(`[e2e/console-error] ${msg.text()}`);
});
await page.goto(tenantURL, { waitUntil: "domcontentloaded" });
// The canvas store hydrates /workspaces before the desktop shell paints.
// Wait for the concierge nav rail OR the hydration-error banner — whichever
// wins. Don't wait on networkidle: the shell keeps a WS + polling open.
await page.waitForSelector(
'[data-testid="nav-home"], [data-testid="hydration-error"]',
{ timeout: 45_000 },
);
const hydrationErr = await page
.locator('[data-testid="hydration-error"]')
.count();
expect(
hydrationErr,
"canvas hydration failed — check staging CP + tenant reachability",
).toBe(0);
await expect(
page.getByText("Something went wrong", { exact: false }),
"app-level ErrorBoundary tripped during concierge hydration",
).toHaveCount(0);
}
/** Switch the concierge top-level view via the left rail. */
async function navTo(page: Page, view: "home" | "map" | "settings"): Promise<void> {
const btn = page.getByTestId(`nav-${view}`);
await expect(btn, `rail button nav-${view} missing`).toBeVisible({ timeout: 10_000 });
await btn.click();
}
// ── shared per-spec setup ──────────────────────────────────────────────────
// Each test gets a freshly-authenticated context + an installed platform
// agent. Install lives in beforeEach (idempotent) so any single test can run
// in isolation (`--grep`), not only in whole-file order.
let platformInstalled = false;
test.beforeEach(async ({ page, context }) => {
const { tenantURL, tenantToken, workspaceId, orgID } = tenantEnv();
await authenticate(context, tenantToken, workspaceId);
const { installed } = await installPlatformAgent(page, tenantURL, tenantToken, orgID);
platformInstalled = installed;
});
/* ───────────────────────── 1. Concierge shell / nav ──────────────────────── */
test.describe("concierge shell + nav", () => {
test("left rail switches Home / Org map / Settings; topbar shows the org name", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
// All three rail destinations are present.
for (const v of ["home", "map", "settings"] as const) {
await expect(page.getByTestId(`nav-${v}`)).toBeVisible();
}
// Topbar org name is dynamic from GET /org/identity. The endpoint returns
// MOLECULE_ORG_NAME (may be "" on a staging tenant), in which case the
// shell falls back to "Molecule AI". Either way it must render a
// non-empty name — assert the element resolves to real text.
const orgName = page.getByTestId("topbar-org-name");
await expect(orgName).toBeVisible();
await expect
.poll(async () => ((await orgName.innerText()) || "").trim().length, {
message: "topbar org name never resolved to non-empty text",
timeout: 10_000,
})
.toBeGreaterThan(0);
// Nav actually switches the active view. Home → Settings → Map → Home,
// asserting the destination rail button reflects active state each hop
// (the shell toggles the active class; we assert the view content too).
await navTo(page, "settings");
await expect(page.getByRole("heading", { name: "Settings" })).toBeVisible({
timeout: 10_000,
});
await navTo(page, "map");
await expect(page.locator('[aria-label="Agent canvas"]')).toBeVisible({
timeout: 15_000,
});
await navTo(page, "home");
// Home shows the agents/tasks/approvals sub-tab bar.
await expect(page.getByTestId("home-subtab-agents")).toBeVisible({
timeout: 10_000,
});
});
});
/* ─────────────────────────────── 2. Home ─────────────────────────────────── */
test.describe("concierge Home", () => {
test("renders the canonical ChatTab, Agents/Tasks/Approvals sub-tabs, and the platform agent as ROOT", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
await navTo(page, "home");
// (a) The Home chat panel reuses the EXACT canonical ChatTab — so it must
// expose the My Chat / Agent Comms sub-tabs, a message input, and the
// attachment affordance, exactly like the map SidePanel chat. The
// [data-testid="chat-panel"] root is ChatTab's own marker (canvas/src/
// components/tabs/ChatTab.tsx) — asserting it proves the canonical
// component is mounted, not a bespoke concierge re-implementation.
const chatPanel = page.getByTestId("chat-panel");
await expect(chatPanel, "Home did not mount the canonical ChatTab").toBeVisible({
timeout: 15_000,
});
await expect(chatPanel.locator("#chat-tab-my-chat")).toHaveText(/My Chat/);
await expect(chatPanel.locator("#chat-tab-agent-comms")).toHaveText(/Agent Comms/);
// Switching the chat sub-tab works (My Chat active by default → Agent Comms).
await chatPanel.locator("#chat-tab-agent-comms").click();
await expect(chatPanel.locator("#chat-tab-agent-comms")).toHaveAttribute(
"aria-selected",
"true",
);
await chatPanel.locator("#chat-tab-my-chat").click();
await expect(chatPanel.locator("#chat-tab-my-chat")).toHaveAttribute(
"aria-selected",
"true",
);
// Message input + attachment affordance (My Chat panel). The attach
// control is the labelled button (the underlying <input type=file> is
// aria-hidden); both are always present (disabled when the agent is
// unreachable), so assert presence, not enabled-state.
await expect(
chatPanel.locator('textarea[aria-label="Message to agent"]'),
"ChatTab message input missing",
).toHaveCount(1);
await expect(
chatPanel.locator('button[aria-label="Attach file"]'),
"ChatTab attachment affordance missing",
).toHaveCount(1);
// (b) Agents / Tasks / Approvals sub-tabs switch the Home sidebar pane.
await page.getByTestId("home-subtab-tasks").click();
await expect(page.getByTestId("home-subtab-tasks")).toHaveClass(/active/);
await page.getByTestId("home-subtab-approvals").click();
await expect(page.getByTestId("home-subtab-approvals")).toHaveClass(/active/);
await page.getByTestId("home-subtab-agents").click();
await expect(page.getByTestId("home-subtab-agents")).toHaveClass(/active/);
// (c) The agent tree shows the platform agent as ROOT. After install the
// platform agent is a kind='platform' root carrying the "root" tag, with
// the provisioned workspace re-parented under it (depth>0). When the
// backend predates the install endpoint, roots[0] is the pseudo-root and
// the "root" tag is absent (it only renders for a real kind='platform'
// root) — so we gate the strong assertion on a successful install.
const tree = page.getByTestId("agent-tree-node");
await expect(tree.first(), "agent tree rendered no nodes").toBeVisible({
timeout: 10_000,
});
if (platformInstalled) {
// The depth-0 node is the platform agent and it carries the root tag.
const rootNode = page
.locator('[data-testid="agent-tree-node"][data-depth="0"]')
.first();
await expect(rootNode).toHaveAttribute("data-platform", "true");
await expect(
rootNode.locator('[data-testid="agent-tree-root-tag"]'),
"platform root is missing the ROOT tag",
).toBeVisible();
// And the provisioned workspace is nested beneath it (a child node exists).
await expect(
page.locator('[data-testid="agent-tree-node"][data-depth="1"]'),
"the provisioned workspace did not re-parent under the platform root",
).toHaveCount(1, { timeout: 10_000 });
} else {
// Degraded backend: at least the tree renders a root-level node.
await expect(
page.locator('[data-testid="agent-tree-node"][data-depth="0"]'),
).not.toHaveCount(0);
}
});
});
/* ─────────────────────────────── 3. Org map ──────────────────────────────── */
test.describe("concierge Org map", () => {
test("hides the platform agent from the node graph; normal workspaces render", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
await navTo(page, "map");
// The React Flow canvas renders.
await expect(page.locator('[aria-label="Molecule AI workspace canvas"]')).toBeVisible({
timeout: 15_000,
});
// Normal workspaces render as map node cards (WorkspaceNode →
// data-testid="workspace-node"). The provisioned hermes workspace must
// appear. expect.poll lets React Flow finish its layout pass.
await expect
.poll(async () => page.locator('[data-testid="workspace-node"]').count(), {
message: "no workspace nodes rendered on the org map",
timeout: 15_000,
})
.toBeGreaterThan(0);
// The concierge (platform agent) is HIDDEN from the graph: no map node
// carries its name. WorkspaceNode's aria-label is "<name> workspace —
// <status>" — assert none matches the platform agent name. This is the
// real behaviour stripPlatformRootForMap implements (Canvas.tsx /
// canvas-topology.ts). Only meaningful when we actually installed one.
if (platformInstalled) {
const platformNode = page.locator(
`[data-testid="workspace-node"][aria-label^="${PLATFORM_AGENT_NAME} workspace"]`,
);
await expect(
platformNode,
"the platform agent (concierge) leaked into the org-map node graph — " +
"stripPlatformRootForMap should exclude it",
).toHaveCount(0);
}
});
});
/* ─────────────────────── 4. Settings — two tabs ──────────────────────────── */
test.describe("concierge Settings — two tabs", () => {
test("Platform-agent config and Org & canvas settings are separate panes; platform tab shows the full WorkspacePanelTabs defaulting to Config", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
await navTo(page, "settings");
const platformTab = page.getByTestId("settings-tab-platform");
const orgTab = page.getByTestId("settings-tab-org");
await expect(platformTab).toBeVisible({ timeout: 10_000 });
await expect(orgTab).toBeVisible();
// Platform tab is the default; its pane is shown and the org pane is not.
await expect(platformTab).toHaveAttribute("aria-selected", "true");
await expect(page.getByTestId("settings-pane-platform")).toBeVisible();
await expect(page.getByTestId("settings-pane-org")).toHaveCount(0);
// The platform pane embeds the FULL WorkspacePanelTabs (the SAME tablist
// the map SidePanel renders) and defaults to the Config tab. Assert the
// canonical workspace tablist is present, that Config is the active tab,
// and that the other signature tabs exist (Plugins, Container, Display,
// Details, Activity, Terminal, Channels, Schedule).
const wsTablist = page.getByRole("tablist", { name: "Workspace panel tabs" });
await expect(
wsTablist,
"platform-agent Settings tab did not embed WorkspacePanelTabs",
).toBeVisible({ timeout: 15_000 });
await expect(page.locator("#tab-config")).toHaveAttribute(
"aria-selected",
"true",
);
for (const id of [
"config",
"skills",
"container-config",
"display",
"details",
"activity",
"terminal",
"channels",
"schedule",
]) {
await expect(
page.locator(`#tab-${id}`),
`WorkspacePanelTabs is missing #tab-${id}`,
).toHaveCount(1);
}
// Clicking the OTHER settings tab switches panes (not just toggles a
// class): the org pane mounts and the platform pane unmounts.
await orgTab.click();
await expect(orgTab).toHaveAttribute("aria-selected", "true");
await expect(page.getByTestId("settings-pane-org")).toBeVisible();
await expect(page.getByTestId("settings-pane-platform")).toHaveCount(0);
// And back.
await platformTab.click();
await expect(page.getByTestId("settings-pane-platform")).toBeVisible();
await expect(page.getByTestId("settings-pane-org")).toHaveCount(0);
});
});
/* ─────────────────────── 5. Settings — Config tab ────────────────────────── */
test.describe("concierge Settings — Config tab dropdowns", () => {
test("runtime dropdown is SSOT-driven; provider hides Platform on self-host but lists BYOK; model follows provider", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
await navTo(page, "settings");
// Platform tab defaults to the Config tab — the runtime select is in the
// ConfigTab "Runtime" section (label "Runtime"). Wait for it to settle.
await expect(
page.getByRole("tablist", { name: "Workspace panel tabs" }),
).toBeVisible({ timeout: 15_000 });
// The runtime <select> sits under the "Runtime" label inside the Config
// panel. Use the label association for a stable hook.
const runtimeByLabel = page.locator('#panel-config').getByLabel("Runtime", {
exact: true,
});
await expect(
runtimeByLabel,
"ConfigTab runtime dropdown never rendered",
).toBeVisible({ timeout: 15_000 });
// (a) Runtime dropdown is SSOT-driven: the options come from GET
// /templates (loadRuntimesFromManifest), so the live tenant must serve a
// non-trivial set. Assert >= 1 runtime option AND that the provisioned
// workspace's runtime (hermes) is among them — proving the list reflects
// what /templates actually serves, not a stale hard-coded allowlist.
const runtimeOptionValues = await runtimeByLabel
.locator("option")
.evaluateAll((els) => els.map((e) => (e as HTMLOptionElement).value));
expect(
runtimeOptionValues.length,
"runtime dropdown rendered no options — SSOT /templates feed is empty",
).toBeGreaterThan(0);
expect(
runtimeOptionValues,
"runtime dropdown does not list the provisioned 'hermes' runtime — the " +
"SSOT /templates list has drifted",
).toContain("hermes");
// (b) Provider dropdown: on self-host (no platform proxy) it must NOT
// offer the "Platform" billing option but MUST list BYOK providers. The
// ProviderModelSelector exposes data-testid="provider-select". Read its
// option labels: none should be the "Platform" proxy entry, and the list
// must be non-empty (BYOK providers present). /org/identity's
// platform_managed_available=false on a staging tenant drives this.
const providerSelect = page.getByTestId("provider-select");
await expect(
providerSelect,
"ConfigTab provider dropdown (ProviderModelSelector) never rendered",
).toBeVisible({ timeout: 15_000 });
const providerLabels = await providerSelect
.locator("option")
.evaluateAll((els) =>
els
.map((e) => (e.textContent || "").trim())
.filter((t) => t && !t.startsWith("—")),
);
expect(
providerLabels.length,
"provider dropdown lists no BYOK providers",
).toBeGreaterThan(0);
expect(
providerLabels.map((l) => l.toLowerCase()),
'provider dropdown offered the "Platform" proxy option on a self-host / ' +
"no-proxy tenant (platform_managed_available should hide it)",
).not.toContain("platform");
// (c) Model dropdown follows the provider. The model control is
// data-testid="model-select" (dropdown) or model-input (free-text
// wildcard). Whichever renders, it must be present — proving the model
// control is wired to the provider selection.
const modelControl = page
.locator('[data-testid="model-select"], [data-testid="model-input"]')
.first();
await expect(
modelControl,
"model control did not follow the provider selection",
).toBeVisible({ timeout: 10_000 });
});
});
/* ────────────────── 6. Settings — Org & canvas settings ──────────────────── */
test.describe("concierge Settings — Org & canvas", () => {
test("Secrets / Workspace Tokens / Org API Keys / Organization sub-tabs render; Organization shows the org (no 404)", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
await navTo(page, "settings");
await page.getByTestId("settings-tab-org").click();
const orgPane = page.getByTestId("settings-pane-org");
await expect(orgPane).toBeVisible({ timeout: 10_000 });
// The four SettingsTabs (canvas/src/components/settings/SettingsTabs.tsx)
// render as a radix tablist labelled "Settings sections". Assert all four
// triggers are present.
const settingsTablist = orgPane.getByRole("tablist", {
name: "Settings sections",
});
await expect(settingsTablist).toBeVisible({ timeout: 10_000 });
for (const label of [
"Secrets",
"Workspace Tokens",
"Org API Keys",
"Organization",
]) {
await expect(
settingsTablist.getByRole("tab", { name: label }),
`Org & canvas settings is missing the "${label}" sub-tab`,
).toBeVisible();
}
// Click the Organization sub-tab — on self-host the canvas reads
// /org/identity (NOT the CP /cp/orgs endpoint), so it must render the org
// identity card and NOT a 404 / error state. Assert the pane settles to
// real, non-error content.
await settingsTablist.getByRole("tab", { name: "Organization" }).click();
const orgInfoPanel = orgPane.locator(
'[role="tabpanel"]:not([hidden])',
);
await expect(orgInfoPanel).toBeVisible({ timeout: 10_000 });
await expect
.poll(
async () => {
const text = ((await orgInfoPanel.innerText()) || "").trim();
return text.length > 0 && !/404|not found/i.test(text);
},
{
message:
"Organization sub-tab rendered empty or a 404/not-found — the " +
"self-host /org/identity path is broken",
timeout: 15_000,
},
)
.toBe(true);
// And no visible error alert inside the org settings pane.
await expect(orgPane.locator('[role="alert"]:visible')).toHaveCount(0);
});
});
/* ───────────────────────────── 7. Map toolbar ────────────────────────────── */
test.describe("concierge Org map toolbar", () => {
test("settings gear, theme toggle and legend are NOT on the map toolbar (moved to Settings/topbar)", async ({
page,
}) => {
const { tenantURL } = tenantEnv();
await loadConcierge(page, tenantURL);
await navTo(page, "map");
await expect(page.locator('[aria-label="Molecule AI workspace canvas"]')).toBeVisible({
timeout: 15_000,
});
// The map toolbar no longer carries a settings gear, a theme toggle, or a
// legend — those moved to the concierge Settings (left rail) + topbar
// (Toolbar.tsx: "Theme picker + settings gear removed from the map
// toolbar"). Assert the map view contains none of them.
//
// Scope to the map mount (<main aria-label="Agent canvas">, ConciergeShell)
// so the legitimate left-rail Settings button + the topbar theme toggle
// (which live OUTSIDE the map) are not counted.
const mapRegion = page.locator('[aria-label="Agent canvas"]');
await expect(mapRegion).toBeVisible({ timeout: 10_000 });
// No settings-gear control inside the map. The old gear used
// title="Settings" / aria-label "Settings".
await expect(
mapRegion.locator('button[title="Settings"], button[aria-label="Settings"]'),
"a settings gear is still on the map toolbar (should be moved to Settings)",
).toHaveCount(0);
// No theme toggle inside the map. The toggle's accessible name is
// "Toggle theme" — it now lives only in the topbar.
await expect(
mapRegion.locator('button[title="Toggle theme"], button[aria-label*="theme" i]'),
"a theme toggle is still on the map toolbar (should be in the topbar)",
).toHaveCount(0);
// No legend inside the map. The Legend component's controls have accessible
// names "Show legend" / "Hide legend" and the panel carries
// data-testid="legend-panel" (canvas/src/components/Legend.tsx). It is no
// longer mounted in Canvas/Toolbar at all — assert none of its surfaces.
await expect(
mapRegion.locator(
'[data-testid="legend-panel"], button[aria-label="Show legend"], button[aria-label="Hide legend"]',
),
"a legend is still on the map toolbar (should be removed)",
).toHaveCount(0);
});
});
+6 -2
View File
@@ -341,11 +341,15 @@ export default async function globalSetup(_config: FullConfig): Promise<void> {
);
return true;
}
// Real boot regression — hard-throw immediately with full detail.
// #2032: tolerate transient 'failed' during boot — some runtimes
// briefly report failed before recovering to online (e.g. agent
// restart during init). Retry instead of hard-throwing; genuine
// terminal failures will still surface via waitFor timeout.
const detail = sampleErr
? sampleErr
: `(no last_sample_error) full body: ${JSON.stringify(r.body)}`;
throw new Error(`Workspace failed: ${detail}`);
console.warn(`[staging-setup] transient failed (retrying): ${detail}`);
return null;
}
return null;
},
+4 -2
View File
@@ -52,8 +52,10 @@ describe("prefers-reduced-motion compliance", () => {
expect(src).toContain("motion-safe:animate-pulse");
});
it("SidePanel.tsx uses motion-safe:animate-pulse", () => {
const src = readSrc("components/SidePanel.tsx");
it("WorkspacePanelTabs.tsx uses motion-safe:animate-pulse", () => {
// The connection-status dot moved out of SidePanel.tsx into the extracted
// WorkspacePanelTabs.tsx; verify the reduced-motion guard followed it.
const src = readSrc("components/WorkspacePanelTabs.tsx");
expect(src.includes("animate-pulse") && !src.includes("motion-safe:animate-pulse")).toBe(false);
expect(src).toContain("motion-safe:animate-pulse");
});
+1 -1
View File
@@ -10,7 +10,7 @@ import { describe, it, expect, vi } from "vitest";
// transform). We import layout.tsx only for its exported `metadata`
// constant — mock the font module to a constructor-returning stub.
vi.mock("next/font/google", () => ({
Inter: () => ({ variable: "--font-inter" }),
Hanken_Grotesk: () => ({ variable: "--font-hanken" }),
JetBrains_Mono: () => ({ variable: "--font-jetbrains" }),
}));
+50 -38
View File
@@ -42,48 +42,52 @@
* before paint to eliminate flash.
*/
@theme {
/* Org Concierge palette (RFC platform-agent / canvas redesign). Warm-paper
light theme + purple accent replacing the old blue brand. */
/* Surface — page / elevated card / sunken input / deep card */
--color-surface: #fafaf7;
--color-surface: #f1efe8;
--color-surface-elevated: #ffffff;
--color-surface-sunken: #f3f1ec;
--color-surface-card: #efece4;
--color-surface-sunken: #f6f4ee;
--color-surface-card: #faf9f4;
/* Borders */
--color-line: #e6e2d8;
--color-line-soft: #efece4;
--color-line: #ddd9cf;
--color-line-soft: #ebe8df;
/* Text */
--color-ink: #15181c;
--color-ink-mid: #5a5e66;
--color-ink-soft: #8b8e95;
--color-ink: #21201b;
--color-ink-mid: #5c5a52;
--color-ink-soft: #6f6c62;
/* Brand + state */
--color-accent: #3b5bdb;
--color-accent-strong: #1a2f99;
--color-warm: #c0532b;
--color-good: #2f7a4d;
--color-bad: #b94e4a;
/* Brand + state — purple accent (concept #7c3aed); light good/bad kept
slightly darker than the raw concept hues for WCAG AA on the paper tints. */
--color-accent: #7c3aed;
--color-accent-strong: #6d28d9;
--color-warm: #c47e12;
--color-good: #0c8a52;
--color-bad: #c2403c;
}
[data-theme="dark"] {
--color-surface: #0e1014;
--color-surface-elevated: #15181c;
--color-surface-sunken: #0a0b0e;
--color-surface-card: #1a1d23;
/* Org Concierge dark palette — near-black panels, bright purple accent. */
--color-surface: #08080a;
--color-surface-elevated: #16161d;
--color-surface-sunken: #0d0d11;
--color-surface-card: #1b1b23;
--color-line: #2a2f3a;
--color-line-soft: #1f2329;
--color-line: #26262e;
--color-line-soft: #1b1b22;
--color-ink: #f4f1e9;
--color-ink-mid: #c8c2b4;
--color-ink-soft: #8d92a0;
--color-ink: #ececf1;
--color-ink-mid: #9b9baa;
--color-ink-soft: #65656f;
/* Accents brighten slightly for AA contrast on dark backgrounds. */
--color-accent: #6883e8;
--color-accent-strong: #8aa1ee;
--color-warm: #d96f48;
--color-good: #4ca06e;
--color-bad: #d27773;
/* Purple accent brightened for AA on the near-black surfaces. */
--color-accent: #a78bfa;
--color-accent-strong: #c4b5fd;
--color-warm: #fbbf24;
--color-good: #34d399;
--color-bad: #f87171;
}
:root {
@@ -107,15 +111,22 @@
* component, not per theme.
*/
@theme {
--color-bg: rgb(9 9 11); /* zinc-950 */
--color-bg-elev: rgb(24 24 27); /* zinc-900 */
--color-bg-card: rgb(39 39 42); /* zinc-800 */
--color-line-strong: rgb(63 63 70); /* zinc-700 */
--color-ink-mute: rgb(161 161 170); /* zinc-400 */
--color-ink-dim: rgb(113 113 122); /* zinc-500 */
--color-accent-dim: rgb(96 165 250);/* blue-400 */
--color-plasma: rgb(59 130 246); /* blue-500 */
/* Org Concierge canvas palette (near-black + purple). */
--color-bg: rgb(8 8 10); /* concept --bg #08080a */
--color-bg-elev: rgb(22 22 29); /* concept --card #16161d */
--color-bg-card: rgb(27 27 35); /* concept --card-2 #1b1b23 */
--color-line-strong: rgb(54 54 64);
--color-ink-mute: rgb(155 155 170); /* concept --tx-2 */
--color-ink-dim: rgb(101 101 111); /* concept --tx-3 */
--color-accent-dim: rgb(167 139 250);/* concept --accent-2 #a78bfa */
--color-plasma: rgb(139 92 246); /* concept --accent #8b5cf6 */
--color-warn: rgb(251 191 36); /* amber-400 */
/* Typography — Org Concierge (Hanken Grotesk UI, JetBrains Mono code).
next/font variables are set on <html> in the canvas layout. */
--font-sans: var(--font-hanken), ui-sans-serif, system-ui, -apple-system,
"Segoe UI", Roboto, sans-serif;
--font-mono: var(--font-jetbrains), ui-monospace, "SF Mono", Menlo, monospace;
}
body {
@@ -124,7 +135,8 @@ body {
overflow: hidden;
background-color: var(--color-surface);
color: var(--color-ink);
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", sans-serif;
font-family: var(--font-hanken), -apple-system, BlinkMacSystemFont, "Segoe UI",
Roboto, "Helvetica Neue", sans-serif;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
+13 -3
View File
@@ -1,5 +1,5 @@
import type { Metadata } from "next";
import { Inter, JetBrains_Mono } from "next/font/google";
import { Hanken_Grotesk, JetBrains_Mono } from "next/font/google";
import { cookies, headers } from "next/headers";
import "./globals.css";
@@ -7,10 +7,13 @@ import "./globals.css";
// because Next.js serves the .woff2 from /_next/static). Exposed as
// CSS variables so the mobile palette can reference them without
// importing this module.
const interFont = Inter({
// Org Concierge UI typeface (canvas redesign): Hanken Grotesk, exposed as
// --font-hanken and consumed by the --font-sans theme token in globals.css.
const interFont = Hanken_Grotesk({
subsets: ["latin"],
weight: ["400", "500", "600", "700"],
display: "swap",
variable: "--font-inter",
variable: "--font-hanken",
});
const monoFont = JetBrains_Mono({
subsets: ["latin"],
@@ -161,6 +164,12 @@ export default async function RootLayout({
*/}
<script
nonce={nonce}
// The browser strips the nonce attribute off <script> after applying
// CSP, so the hydrated DOM shows nonce="" while React's tree carries
// the real value — a benign, expected server/client diff. Suppress
// the hydration warning for this element (same rationale as the
// <html> suppressHydrationWarning above).
suppressHydrationWarning
dangerouslySetInnerHTML={{ __html: themeBootScript }}
/>
{/*
@@ -186,6 +195,7 @@ export default async function RootLayout({
<script
type="application/ld+json"
nonce={nonce}
suppressHydrationWarning
dangerouslySetInnerHTML={{
__html: JSON.stringify({
"@context": "https://schema.org",
+2 -8
View File
@@ -1,9 +1,7 @@
"use client";
import { useEffect, useState } from "react";
import { Canvas } from "@/components/Canvas";
import { Legend } from "@/components/Legend";
import { CommunicationOverlay } from "@/components/CommunicationOverlay";
import { ConciergeShell } from "@/components/concierge/ConciergeShell";
import { MobileApp } from "@/components/mobile/MobileApp";
import { Spinner } from "@/components/Spinner";
import { connectSocket, disconnectSocket } from "@/store/socket";
@@ -115,11 +113,7 @@ export default function Home() {
return (
<>
<main aria-label="Agent canvas">
<Canvas />
</main>
<Legend />
<CommunicationOverlay />
<ConciergeShell />
{hydrationError && (
<div
role="alert"
+36 -6
View File
@@ -13,8 +13,11 @@ import {
import "@xyflow/react/dist/style.css";
import { useCanvasStore } from "@/store/canvas";
import { WORKSPACE_KIND } from "@/lib/workspace-kind";
import { stripPlatformRootForMap } from "@/store/canvas-topology";
import { useTheme } from "@/lib/theme-provider";
import { A2ATopologyOverlay } from "./A2ATopologyOverlay";
import { MessageFlightLayer } from "./MessageFlightLayer";
import { WorkspaceNode } from "./WorkspaceNode";
import { SidePanel } from "./SidePanel";
import { CreateWorkspaceButton } from "./CreateWorkspaceDialog";
@@ -78,15 +81,38 @@ function CanvasInner() {
// half-themed page. Pull resolvedTheme so the canvas matches the user's
// selected mode (and the system preference when they pick "system").
const { resolvedTheme } = useTheme();
const rawNodes = useCanvasStore((s) => s.nodes);
const edges = useCanvasStore((s) => s.edges);
const storeNodes = useCanvasStore((s) => s.nodes);
const storeEdges = useCanvasStore((s) => s.edges);
const a2aEdges = useCanvasStore((s) => s.a2aEdges);
const showA2AEdges = useCanvasStore((s) => s.showA2AEdges);
const deletingIds = useCanvasStore((s) => s.deletingIds);
const allEdges = useMemo(
() => (showA2AEdges ? [...edges, ...a2aEdges] : edges),
[edges, a2aEdges, showA2AEdges],
// Hide the org-level platform agent (the concierge) from the map graph: it is
// the undeletable org ROOT surfaced in the shell (topbar + Home tree), not a
// draggable/deletable map node. Its direct children are reparented to
// top-level and tree edges touching it are dropped. The store keeps the full
// node set, so the shell's Home agent tree still renders it as ROOT.
const { nodes: rawNodes, edges } = useMemo(
() => stripPlatformRootForMap(storeNodes, storeEdges),
[storeNodes, storeEdges],
);
const platformIds = useMemo(
() =>
new Set(
storeNodes
.filter((n) => n.data.kind === WORKSPACE_KIND.Platform)
.map((n) => n.id),
),
[storeNodes],
);
const allEdges = useMemo(() => {
if (!showA2AEdges) return edges;
// Drop A2A edges that touch the hidden platform root so React Flow doesn't
// warn about an edge to a missing node.
const a2a = a2aEdges.filter(
(e) => !platformIds.has(e.source) && !platformIds.has(e.target),
);
return [...edges, ...a2a];
}, [edges, a2aEdges, showA2AEdges, platformIds]);
// Drag-lock during a system-owned operation (deploy OR delete).
// React Flow respects Node.draggable, which stops the gesture
// before it starts — preventDefault() on the drag-start callback
@@ -277,7 +303,7 @@ function CanvasInner() {
>
Skip to canvas
</a>
<main id="canvas-main" className="w-screen h-screen bg-surface">
<main id="canvas-main" className="w-full h-full bg-surface">
<ReactFlow
colorMode={resolvedTheme}
nodes={nodes}
@@ -346,6 +372,10 @@ function CanvasInner() {
nodeBorderRadius={4}
/>
<DropTargetBadge />
{/* Flies an envelope between agents on each delegate/message event.
Inside <ReactFlow> so its ViewportPortal renders in flow coords
and tracks pan/zoom. */}
<MessageFlightLayer />
</ReactFlow>
{/* Screen-reader live region announces workspace count on initial load and
+84
View File
@@ -0,0 +1,84 @@
/** FlightEnvelope a single envelope that animates from `from` to `to` and
* fades out, used by both the canvas (flow coords inside a ViewportPortal) and
* the concierge home (screen coords inside a fixed overlay). The parent owns
* the coordinate space; this component only animates the translate delta.
*
* Uses the Web Animations API so the from/to delta can be dynamic per flight
* (a static CSS @keyframes can't translate to a runtime-computed point). */
import { useEffect, useRef } from "react";
import { FLIGHT_DURATION_MS, type A2AFlightKind } from "@/hooks/useA2AFlights";
/** Stroke colour by activity kind mirrors CommunicationOverlay's palette
* (send = cyan, receive = violet/accent, task = warm) so the two surfaces
* read as the same event. */
const KIND_COLOR: Record<A2AFlightKind, string> = {
send: "#22d3ee",
receive: "#8b5cf6",
task: "#f5a623",
};
export interface Point {
x: number;
y: number;
}
export function FlightEnvelope({
from,
to,
kind,
}: {
from: Point;
to: Point;
kind: A2AFlightKind;
}) {
const ref = useRef<HTMLDivElement>(null);
useEffect(() => {
const el = ref.current;
// Element.animate is unavailable in some test/SSR environments — degrade to
// a static (instantly-finished) envelope rather than throw.
if (!el || typeof el.animate !== "function") return;
const dx = to.x - from.x;
const dy = to.y - from.y;
const anim = el.animate(
[
{ transform: "translate(-50%,-50%) translate(0px,0px) scale(0.45)", opacity: 0 },
{ opacity: 1, offset: 0.16 },
{ opacity: 1, offset: 0.8 },
{ transform: `translate(-50%,-50%) translate(${dx}px,${dy}px) scale(1)`, opacity: 0 },
],
{ duration: FLIGHT_DURATION_MS, easing: "cubic-bezier(0.45, 0, 0.25, 1)", fill: "forwards" },
);
return () => anim.cancel();
}, [from.x, from.y, to.x, to.y]);
const color = KIND_COLOR[kind];
return (
<div
ref={ref}
data-testid="flight-envelope"
aria-hidden="true"
style={{
position: "absolute",
left: from.x,
top: from.y,
pointerEvents: "none",
willChange: "transform, opacity",
filter: "drop-shadow(0 1px 3px rgba(0,0,0,0.45))",
zIndex: 6,
}}
>
<svg width="22" height="22" viewBox="0 0 24 24" fill="none" aria-hidden="true">
<rect x="2.5" y="5.5" width="19" height="13" rx="2.5" fill="#0b0b0f" stroke={color} strokeWidth="1.6" />
<path
d="M3.5 7.5l8.5 6 8.5-6"
stroke={color}
strokeWidth="1.6"
fill="none"
strokeLinecap="round"
strokeLinejoin="round"
/>
</svg>
</div>
);
}
@@ -0,0 +1,46 @@
/** MessageFlightLayer flies an envelope from the source agent to the target
* agent on the spatial canvas whenever a delegate / message event fires.
*
* Mounted INSIDE <ReactFlow> so its ViewportPortal places the envelope in flow
* coordinates; it therefore pans and zooms with the canvas for free. The
* flight lifecycle (which events become envelopes, reduced-motion opt-out,
* expiry) lives in useA2AFlights this component only resolves node centres
* and renders. */
import { ViewportPortal, type Node } from "@xyflow/react";
import { useCanvasStore } from "@/store/canvas";
import { useA2AFlights } from "@/hooks/useA2AFlights";
import { FlightEnvelope, type Point } from "./FlightEnvelope";
import type { WorkspaceNodeData } from "@/store/canvas";
// Fallback node footprint when React Flow has not measured a node yet. Matches
// WorkspaceNode's leaf size (w-[300px] min-h-[176px]); a slightly-off centre
// for the first frame after mount is invisible at flight scale.
const DEFAULT_W = 300;
const DEFAULT_H = 176;
function nodeCenter(n: Node<WorkspaceNodeData>): Point {
const w = n.measured?.width ?? DEFAULT_W;
const h = n.measured?.height ?? DEFAULT_H;
return { x: n.position.x + w / 2, y: n.position.y + h / 2 };
}
export function MessageFlightLayer() {
const flights = useA2AFlights();
const nodes = useCanvasStore((s) => s.nodes);
if (flights.length === 0) return null;
return (
<ViewportPortal>
{flights.map((f) => {
const src = nodes.find((n) => n.id === f.sourceId);
const dst = nodes.find((n) => n.id === f.targetId);
// Both endpoints must be on-canvas to draw a path between them.
if (!src || !dst) return null;
return (
<FlightEnvelope key={f.key} from={nodeCenter(src)} to={nodeCenter(dst)} kind={f.kind} />
);
})}
</ViewportPortal>
);
}
+8 -134
View File
@@ -1,25 +1,9 @@
"use client";
import { useState, useCallback, useRef, useEffect } from "react";
import { useCanvasStore, type PanelTab } from "@/store/canvas";
import { showToast } from "@/components/Toaster";
import { useCanvasStore } from "@/store/canvas";
import { StatusDot } from "./StatusDot";
import { Tooltip } from "./Tooltip";
import { DetailsTab } from "./tabs/DetailsTab";
import { SkillsTab } from "./tabs/SkillsTab";
import { ChatTab } from "./tabs/ChatTab";
import { ConfigTab } from "./tabs/ConfigTab";
import { ContainerConfigTab } from "./tabs/ContainerConfigTab";
import { DisplayTab } from "./tabs/DisplayTab";
import { TerminalTab } from "./tabs/TerminalTab";
import { FilesTab } from "./tabs/FilesTab";
import { MemoryInspectorPanel } from "./MemoryInspectorPanel";
import { AuditTrailPanel } from "./AuditTrailPanel";
import { TracesTab } from "./tabs/TracesTab";
import { EventsTab } from "./tabs/EventsTab";
import { ActivityTab } from "./tabs/ActivityTab";
import { ScheduleTab } from "./tabs/ScheduleTab";
import { ChannelsTab } from "./tabs/ChannelsTab";
import { WorkspacePanelTabs } from "./WorkspacePanelTabs";
import { summarizeWorkspaceCapabilities } from "@/store/canvas";
const SIDEPANEL_WIDTH_KEY = "molecule:sidepanel-width";
@@ -27,24 +11,6 @@ const SIDEPANEL_DEFAULT_WIDTH = 480;
const SIDEPANEL_MIN_WIDTH = 320;
const SIDEPANEL_MAX_WIDTH = 800;
const TABS: { id: PanelTab; label: string; icon: string }[] = [
{ id: "chat", label: "Chat", icon: "◈" },
{ id: "activity", label: "Activity", icon: "⊙" },
{ id: "details", label: "Details", icon: "◉" },
{ id: "skills", label: "Plugins", icon: "✦" },
{ id: "terminal", label: "Terminal", icon: "▸" },
{ id: "display", label: "Display", icon: "▣" },
{ id: "container-config", label: "Container", icon: "▤" },
{ id: "config", label: "Config", icon: "⚙" },
{ id: "schedule", label: "Schedule", icon: "⏲" },
{ id: "channels", label: "Channels", icon: "⇌" },
{ id: "files", label: "Files", icon: "⊞" },
{ id: "memory", label: "Memory", icon: "◇" },
{ id: "traces", label: "Traces", icon: "◎" },
{ id: "events", label: "Events", icon: "◊" },
{ id: "audit", label: "Audit", icon: "⊟" },
];
export function SidePanel() {
const selectedNodeId = useCanvasStore((s) => s.selectedNodeId);
const panelTab = useCanvasStore((s) => s.panelTab);
@@ -219,104 +185,12 @@ export function SidePanel() {
</div>
</div>
{/* Tabs — relative wrapper lets the fade gradient position against the scroll container */}
<div className="relative border-b border-line/40">
{/* Right-edge fade: signals more tabs are hidden off-screen when the bar overflows */}
<div className="pointer-events-none absolute inset-y-0 right-0 w-8 bg-gradient-to-l from-surface to-transparent z-10" aria-hidden="true" />
<div
role="tablist"
aria-label="Workspace panel tabs"
className="flex overflow-x-auto bg-surface-sunken/20 px-1"
onKeyDown={(e) => {
const idx = TABS.findIndex((t) => t.id === panelTab);
let next: number | null = null;
if (e.key === "ArrowRight") { e.preventDefault(); next = (idx + 1) % TABS.length; }
else if (e.key === "ArrowLeft") { e.preventDefault(); next = (idx - 1 + TABS.length) % TABS.length; }
else if (e.key === "Home") { e.preventDefault(); next = 0; }
else if (e.key === "End") { e.preventDefault(); next = TABS.length - 1; }
if (next !== null) {
setPanelTab(TABS[next].id);
requestAnimationFrame(() => { const el = document.getElementById(`tab-${TABS[next!].id}`); el?.focus(); el?.scrollIntoView({ block: "nearest", inline: "nearest" }); });
}
}}
>
{TABS.map((tab) => (
<button
type="button"
key={tab.id}
id={`tab-${tab.id}`}
role="tab"
aria-selected={panelTab === tab.id}
aria-controls={`panel-${tab.id}`}
tabIndex={panelTab === tab.id ? 0 : -1}
onClick={() => setPanelTab(tab.id)}
className={`shrink-0 px-3 py-2.5 text-[10px] font-medium tracking-wide transition-all rounded-t-lg mx-0.5 focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/70 ${
panelTab === tab.id
? "text-ink bg-surface-card border-b-2 border-accent"
: "text-ink-mid hover:text-ink hover:bg-surface-card/60"
}`}
>
<span className="mr-1 opacity-50" aria-hidden="true">{tab.icon}</span>
{tab.label}
</button>
))}
</div>
</div>
{/* Needs Restart Banner */}
{node.data.needsRestart && !node.data.currentTask && selectedNodeId && (
<div className="px-4 py-2 bg-sky-950/20 border-b border-sky-800/20 flex items-center justify-between">
<span className="text-[10px] text-sky-300/90">Config changed restart to apply</span>
<button
type="button"
onClick={() => {
useCanvasStore.getState().restartWorkspace(selectedNodeId).catch(() => showToast("Restart failed", "error"));
}}
className="text-[11px] px-2 py-1 bg-sky-800/40 hover:bg-sky-700/50 text-sky-200 rounded transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Restart Now
</button>
</div>
)}
{/* Current Task Banner */}
{node.data.currentTask && (
<Tooltip text={node.data.currentTask as string}>
<div className="px-4 py-2 bg-amber-950/20 border-b border-amber-800/20 flex items-center gap-2 cursor-default">
<div className="w-1.5 h-1.5 rounded-full bg-amber-400 motion-safe:animate-pulse shrink-0" />
<span className="text-[10px] text-warm/90 truncate">
{node.data.currentTask}
</span>
</div>
</Tooltip>
)}
{/* Tab Content */}
<div
role="tabpanel"
id={`panel-${panelTab}`}
aria-labelledby={`tab-${panelTab}`}
tabIndex={0}
className="flex-1 overflow-y-auto focus:outline-none"
>
{panelTab === "details" && <DetailsTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "skills" && <SkillsTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "activity" && <ActivityTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "chat" && <ChatTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "terminal" && <TerminalTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "display" && <DisplayTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "container-config" && selectedNodeId && (
<ContainerConfigTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />
)}
{panelTab === "config" && <ConfigTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "schedule" && <ScheduleTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "channels" && <ChannelsTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "files" && <FilesTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "memory" && <MemoryInspectorPanel key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "traces" && <TracesTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "events" && <EventsTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "audit" && <AuditTrailPanel key={selectedNodeId} workspaceId={selectedNodeId} />}
</div>
{/* Tabs + tab content extracted into WorkspacePanelTabs so the same
tab bar/body is reused verbatim by the concierge Settings page. The
map drawer stays store-driven: we thread the global panelTab /
setPanelTab through as the controlled active-tab pair, preserving the
existing selection + keyboard behaviour. */}
<WorkspacePanelTabs node={node} activeTab={panelTab} onTabChange={setPanelTab} />
{/* Footer — workspace ID */}
<div className="px-4 sm:px-5 py-2 border-t border-line/40 bg-surface-sunken/20">
+8 -10
View File
@@ -3,11 +3,9 @@
import { useMemo, useState, useCallback, useEffect, useRef } from "react";
import { api } from "@/lib/api";
import { useCanvasStore } from "@/store/canvas";
import { SettingsButton } from "@/components/settings/SettingsButton";
import { settingsGearRef } from "@/components/settings/SettingsPanel";
import { WORKSPACE_KIND } from "@/lib/workspace-kind";
import { ConfirmDialog } from "@/components/ConfirmDialog";
import { showToast } from "@/components/Toaster";
import { ThemeToggle } from "@/components/ThemeToggle";
import { statusDotClass } from "@/lib/design-tokens";
import { KeyboardShortcutsDialog } from "@/components/KeyboardShortcutsDialog";
@@ -55,8 +53,11 @@ export function Toolbar() {
}, [wsStatus]);
const counts = useMemo(() => {
const c = { total: nodes.length, roots: 0, children: 0, online: 0, offline: 0, failed: 0, provisioning: 0, activeTasks: 0 };
for (const n of nodes) {
// Exclude the org-level platform agent (the concierge) — it's the
// undeletable org root surfaced in the shell, not a counted map workspace.
const mapNodes = nodes.filter((n) => n.data.kind !== WORKSPACE_KIND.Platform);
const c = { total: mapNodes.length, roots: 0, children: 0, online: 0, offline: 0, failed: 0, provisioning: 0, activeTasks: 0 };
for (const n of mapNodes) {
if (n.data.parentId) c.children++; else c.roots++;
const s = n.data.status;
if (s === "online") c.online++;
@@ -460,11 +461,8 @@ export function Toolbar() {
)}
</div>
{/* Theme picker — System / Light / Dark */}
<ThemeToggle />
{/* Settings gear icon */}
<SettingsButton ref={settingsGearRef} />
{/* Theme picker + settings gear removed from the map toolbar both now
live in the concierge global Settings (left rail) + topbar. */}
<ConfirmDialog
open={restartConfirmOpen}
+81 -72
View File
@@ -1,7 +1,7 @@
"use client";
import { useCallback, useMemo, type KeyboardEvent } from "react";
import { Handle, NodeResizer, Position, type NodeProps, type Node } from "@xyflow/react";
import { useMemo, type KeyboardEvent } from "react";
import { Handle, Position, type NodeProps, type Node } from "@xyflow/react";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { getConfigurationError, getConfigurationStatus } from "@/store/canvas-topology";
import { showToast } from "@/components/Toaster";
@@ -21,7 +21,8 @@ function useDescendantCount(nodeId: string): number {
return useMemo(() => countDescendants(nodeId, nodes), [nodeId, nodes]);
}
/** Boolean flag used to drive min-size and NodeResizer dimensions.
/** Boolean flag used to drive the container's system-controlled size
* (leaves render fixed-size; parents grow to fit children).
* Selecting `nodes` stably avoids re-render loops (same issue as
* useDescendantCount). */
function useHasChildren(nodeId: string): boolean {
@@ -87,16 +88,9 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
return (
<>
{/* NodeResizer visible only on the selected card. Lets the user
* drag any edge/corner to grow or shrink the workspace, which is
* useful on cards that contain nested child workspaces. */}
<NodeResizer
isVisible={isSelected}
minWidth={hasChildren ? 360 : 210}
minHeight={hasChildren ? 200 : 110}
lineClassName="!border-accent/40"
handleClassName="!w-2 !h-2 !bg-accent !border !border-blue-300"
/>
{/* Free-resize removed (was NodeResizer). Container size + shape are now
* system-controlled: leaf workspaces render at a fixed width; parent
* workspaces grow to fit their nested children (store grow logic). */}
<div
role="button"
tabIndex={0}
@@ -161,20 +155,22 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
}
}}
className={`
group relative rounded-xl h-full w-full
${hasChildren && !data.collapsed ? "min-w-[360px] min-h-[200px]" : "min-w-[210px]"}
group relative rounded-xl
${hasChildren && !data.collapsed
? "h-full w-full min-w-[420px] min-h-[240px]"
: "w-[300px] min-h-[176px]"}
cursor-pointer overflow-hidden
transition-all duration-200 ease-out
${isDragTarget
? "bg-emerald-950/40 border-2 border-emerald-400/60 ring-2 ring-emerald-400/20 scale-[1.03]"
: isBatchSelected
? "bg-surface-sunken/95 border-2 border-accent/80 ring-2 ring-accent/30 shadow-lg shadow-blue-500/15"
? "bg-surface-sunken/95 border-2 border-accent/80 ring-2 ring-accent/30 shadow-lg shadow-accent/15"
: isSelected
? "bg-surface-sunken/95 border border-accent/70 ring-1 ring-accent/30 shadow-lg shadow-blue-500/10"
: "bg-surface-sunken/90 border border-line/80 hover:border-zinc-500/60 shadow-lg shadow-black/30 hover:shadow-xl hover:shadow-black/40"
? "bg-surface-sunken/95 border border-accent/70 ring-1 ring-accent/30 shadow-lg shadow-accent/10"
: "bg-surface-sunken/90 border border-line/80 hover:border-ink-soft/60 shadow-lg shadow-black/30 hover:shadow-xl hover:shadow-black/40"
}
backdrop-blur-sm
focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:ring-offset-1 focus-visible:ring-offset-zinc-950
focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:ring-offset-1 focus-visible:ring-offset-surface
${deploy.isActivelyProvisioning ? "mol-deploy-shimmer" : ""}
${deploy.isLockedChild ? "mol-deploy-locked" : ""}
`}
@@ -212,27 +208,45 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
}
}
}}
className="!w-2.5 !h-1 !rounded-full !bg-surface-card/80 !border-0 !-top-0.5 hover:!bg-blue-400 hover:!h-1.5 focus-visible:!bg-blue-400 focus-visible:!h-1.5 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-blue-400/60 focus-visible:ring-offset-1 focus-visible:ring-offset-zinc-950 transition-all"
className="!w-2.5 !h-1 !rounded-full !bg-surface-card/80 !border-0 !-top-0.5 hover:!bg-accent hover:!h-1.5 focus-visible:!bg-accent focus-visible:!h-1.5 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 focus-visible:ring-offset-1 focus-visible:ring-offset-surface transition-all"
/>
<div className="relative px-3.5 py-2.5">
<div className="relative px-4 py-3.5">
{/* Header row */}
<div className="flex items-center justify-between gap-2 mb-1">
<div className="flex items-center gap-2 min-w-0">
<div className={`w-2 h-2 rounded-full shrink-0 ${statusCfg.dot} ${statusCfg.glow} shadow-sm`} />
<span className="text-[13px] font-semibold text-ink truncate leading-tight">
<div className="flex items-center justify-between gap-2 mb-2.5">
<div className="flex items-center gap-2.5 min-w-0">
<div className={`w-2.5 h-2.5 rounded-full shrink-0 ${statusCfg.dot} ${statusCfg.glow} shadow-sm`} />
<span className="text-[15px] font-semibold text-ink truncate leading-tight">
{data.name}
</span>
</div>
<div className="flex items-center gap-1.5 shrink-0">
{hasChildren && (
<span className="text-[10px] font-mono text-accent bg-accent/15 border border-accent/40 px-1.5 py-0.5 rounded-md">
{descendantCount} sub
</span>
)}
<span className={`text-[10px] font-mono px-1.5 py-0.5 rounded-md ${tierCfg.color}`}>
{tierCfg.label}
</span>
{/* Model pill (concept top-right). Shortens the agent_card model to
a family label (Opus/Sonnet/Haiku/Kimi); falls back to the raw
last segment, then to the tier badge when no model is known. */}
{(() => {
const m = (data.agentCard as Record<string, unknown> | null)?.model;
const model = typeof m === "string" && m ? m : null;
if (!model) {
return (
<span className={`text-[11px] font-mono px-2 py-1 rounded-md ${tierCfg.color}`}>
{tierCfg.label}
</span>
);
}
const label = /opus/i.test(model) ? "Opus"
: /sonnet/i.test(model) ? "Sonnet"
: /haiku/i.test(model) ? "Haiku"
: /kimi/i.test(model) ? "Kimi"
: /gpt|openai/i.test(model) ? "GPT"
: /gemini/i.test(model) ? "Gemini"
: (model.split(/[/:]/).pop() || model);
return (
<span className="text-[11px] font-mono px-2 py-1 rounded-md text-white bg-accent" title={model}>
{label}
</span>
);
})()}
</div>
</div>
@@ -242,6 +256,9 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
We treat empty-string DB values as "missing" so an unbackfilled
row falls through to the agent-card value rather than rendering
a blank pill. */}
{/* Role pill (concept) uppercase, accent-bordered. Platform root
shows "PLATFORM · ROOT"; Phase 30 external-runtime agents get the
REMOTE marker alongside. */}
{(() => {
const dbRuntime = typeof data.runtime === "string" && data.runtime !== ""
? data.runtime : null;
@@ -249,32 +266,46 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
? (data.agentCard as Record<string, string>).runtime
: null;
const runtime = dbRuntime ?? cardRuntime;
if (!runtime) return null;
const isRemote = !!runtime && isExternalLikeRuntime(runtime);
const isPlatformRoot = !data.parentId && hasChildren;
const roleLabel = isPlatformRoot ? "PLATFORM · ROOT" : (data.role || null);
if (!roleLabel && !isRemote) return null;
return (
<div className="mb-1 flex items-center gap-1">
{isExternalLikeRuntime(runtime) ? (
<div className="mb-2.5 flex items-center gap-1.5">
{roleLabel && (
<span className="max-w-[220px] truncate text-[10px] font-mono uppercase tracking-[0.04em] px-2 py-1 rounded-md text-accent bg-accent/12 border border-accent/35">
{roleLabel}
</span>
)}
{isRemote && (
<span
className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-white bg-violet-800 border border-violet-900"
className="text-[10px] font-mono uppercase px-2 py-1 rounded-md text-white bg-violet-800 border border-violet-900"
title="Phase 30 remote agent — runs outside this platform's Docker network. Lifecycle managed via heartbeat-based polling, not Docker exec."
>
REMOTE
</span>
) : (
<span className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-ink-mid bg-surface-card border border-line">
{runtime}
</span>
)}
</div>
);
})()}
{/* Role clamp to 2 lines. Without this, a verbose role
* description (common on org-template imports) lets the card
* grow arbitrarily tall, which wrecks the grid-slot layout
* because siblings all plan for the same CHILD_DEFAULT_HEIGHT. */}
{data.role && (
<div className="text-[10px] text-ink-mid mb-1.5 leading-tight line-clamp-2">{data.role}</div>
)}
{/* Status line (concept) uppercase status, "· N AGENTS" for parents,
with a queued pill on the right. */}
<div className="mb-2 flex items-center justify-between gap-2">
<span className={`text-[11px] font-mono uppercase tracking-[0.04em] ${
isOnline ? "text-good"
: effectiveStatus === "failed" ? "text-bad"
: (effectiveStatus === "provisioning" || effectiveStatus === "degraded") ? "text-warm"
: "text-ink-soft"
}`}>
{statusCfg.label}{hasChildren ? ` · ${descendantCount} agents` : ""}
</span>
{data.activeTasks > 0 && (
<span className="shrink-0 text-[11px] font-mono px-2 py-1 rounded-md text-ink-mid bg-surface-card border border-line">
{data.activeTasks} queued
</span>
)}
</div>
{/* Skills */}
{skills.length > 0 && (
@@ -328,29 +359,7 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
</button>
)}
{/* Bottom row: status / active tasks */}
<div className="flex items-center justify-between mt-0.5">
{effectiveStatus !== "online" ? (
<div className={`text-[10px] uppercase tracking-widest font-medium ${
effectiveStatus === "failed" ? "text-bad" :
effectiveStatus === "degraded" ? "text-warm" :
effectiveStatus === "not_configured" ? "text-warm" :
effectiveStatus === "provisioning" ? "text-accent" :
"text-ink-mid"
}`}>
{statusCfg.label}
</div>
) : <div />}
{data.activeTasks > 0 && (
<div className="flex items-center gap-1">
<div className="w-1 h-1 rounded-full bg-warm motion-safe:animate-pulse" />
<span className="text-[10px] text-warm tabular-nums">
{data.activeTasks} task{data.activeTasks > 1 ? "s" : ""}
</span>
</div>
)}
</div>
{/* (status + queued now rendered above, concept-style) */}
{/* Degraded error preview */}
{data.status === "degraded" && data.lastSampleError && (
@@ -395,7 +404,7 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
}
}
}}
className="!w-2.5 !h-1 !rounded-full !bg-surface-card/80 !border-0 !-bottom-0.5 hover:!bg-blue-400 hover:!h-1.5 focus-visible:!bg-blue-400 focus-visible:!h-1.5 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-blue-400/60 focus-visible:ring-offset-1 focus-visible:ring-offset-zinc-950 transition-all"
className="!w-2.5 !h-1 !rounded-full !bg-surface-card/80 !border-0 !-bottom-0.5 hover:!bg-accent hover:!h-1.5 focus-visible:!bg-accent focus-visible:!h-1.5 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 focus-visible:ring-offset-1 focus-visible:ring-offset-surface transition-all"
/>
</div>
</>
@@ -0,0 +1,195 @@
"use client";
import { useState } from "react";
import type { Node } from "@xyflow/react";
import {
useCanvasStore,
type PanelTab,
type WorkspaceNodeData,
} from "@/store/canvas";
import { showToast } from "@/components/Toaster";
import { Tooltip } from "./Tooltip";
import { DetailsTab } from "./tabs/DetailsTab";
import { SkillsTab } from "./tabs/SkillsTab";
import { ChatTab } from "./tabs/ChatTab";
import { ConfigTab } from "./tabs/ConfigTab";
import { ContainerConfigTab } from "./tabs/ContainerConfigTab";
import { DisplayTab } from "./tabs/DisplayTab";
import { TerminalTab } from "./tabs/TerminalTab";
import { FilesTab } from "./tabs/FilesTab";
import { MemoryInspectorPanel } from "./MemoryInspectorPanel";
import { AuditTrailPanel } from "./AuditTrailPanel";
import { TracesTab } from "./tabs/TracesTab";
import { EventsTab } from "./tabs/EventsTab";
import { ActivityTab } from "./tabs/ActivityTab";
import { ScheduleTab } from "./tabs/ScheduleTab";
import { ChannelsTab } from "./tabs/ChannelsTab";
/**
* Canonical workspace tab set the SAME ids/labels/icons the map's
* SidePanel has always rendered. Single source of truth so the map drawer
* and any other host (the concierge Settings page) can't drift.
*/
export const WORKSPACE_PANEL_TABS: { id: PanelTab; label: string; icon: string }[] = [
{ id: "chat", label: "Chat", icon: "◈" },
{ id: "activity", label: "Activity", icon: "⊙" },
{ id: "details", label: "Details", icon: "◉" },
{ id: "skills", label: "Plugins", icon: "✦" },
{ id: "terminal", label: "Terminal", icon: "▸" },
{ id: "display", label: "Display", icon: "▣" },
{ id: "container-config", label: "Container", icon: "▤" },
{ id: "config", label: "Config", icon: "⚙" },
{ id: "schedule", label: "Schedule", icon: "⏲" },
{ id: "channels", label: "Channels", icon: "⇌" },
{ id: "files", label: "Files", icon: "⊞" },
{ id: "memory", label: "Memory", icon: "◇" },
{ id: "traces", label: "Traces", icon: "◎" },
{ id: "events", label: "Events", icon: "◊" },
{ id: "audit", label: "Audit", icon: "⊟" },
];
interface Props {
/** The workspace node whose tabs to render (id + data blob). */
node: Node<WorkspaceNodeData>;
/**
* Controlled active tab. When provided together with `onTabChange`, the
* caller owns the active-tab state (the map's SidePanel threads the global
* `panelTab`/`setPanelTab` here so the store stays the source of truth and
* the existing keyboard/selection behaviour is preserved verbatim).
* When omitted, the component manages its OWN local active-tab state
* which is what the concierge Settings page uses so the embedded tabs
* don't fight the map's selection.
*/
activeTab?: PanelTab;
onTabChange?: (tab: PanelTab) => void;
/** Initial tab for the uncontrolled (local-state) mode. Defaults to "chat". */
defaultTab?: PanelTab;
}
/**
* The workspace tab bar + tab body, extracted from SidePanel so it can be
* reused verbatim outside the map (e.g. the concierge Settings "Platform
* agent configuration" section). Renders the canonical ARIA tablist and the
* exact same tab content components keyed on the active tab.
*
* Does NOT render the workspace header / meta pills / resize handle / footer
* those are host chrome and stay in the host (SidePanel for the map).
*/
export function WorkspacePanelTabs({ node, activeTab, onTabChange, defaultTab = "chat" }: Props) {
const restartWorkspace = useCanvasStore((s) => s.restartWorkspace);
// Controlled when both props are present; otherwise own the state locally.
const controlled = activeTab !== undefined && onTabChange !== undefined;
const [localTab, setLocalTab] = useState<PanelTab>(defaultTab);
const tab = controlled ? (activeTab as PanelTab) : localTab;
const setTab = (next: PanelTab) => {
if (controlled) onTabChange!(next);
else setLocalTab(next);
};
const workspaceId = node.id;
const data = node.data;
return (
<>
{/* Tabs — relative wrapper lets the fade gradient position against the scroll container */}
<div className="relative border-b border-line/40">
{/* Right-edge fade: signals more tabs are hidden off-screen when the bar overflows */}
<div className="pointer-events-none absolute inset-y-0 right-0 w-8 bg-gradient-to-l from-surface to-transparent z-10" aria-hidden="true" />
<div
role="tablist"
aria-label="Workspace panel tabs"
className="flex overflow-x-auto bg-surface-sunken/20 px-1"
onKeyDown={(e) => {
const idx = WORKSPACE_PANEL_TABS.findIndex((t) => t.id === tab);
let next: number | null = null;
if (e.key === "ArrowRight") { e.preventDefault(); next = (idx + 1) % WORKSPACE_PANEL_TABS.length; }
else if (e.key === "ArrowLeft") { e.preventDefault(); next = (idx - 1 + WORKSPACE_PANEL_TABS.length) % WORKSPACE_PANEL_TABS.length; }
else if (e.key === "Home") { e.preventDefault(); next = 0; }
else if (e.key === "End") { e.preventDefault(); next = WORKSPACE_PANEL_TABS.length - 1; }
if (next !== null) {
setTab(WORKSPACE_PANEL_TABS[next].id);
requestAnimationFrame(() => { const el = document.getElementById(`tab-${WORKSPACE_PANEL_TABS[next!].id}`); el?.focus(); el?.scrollIntoView({ block: "nearest", inline: "nearest" }); });
}
}}
>
{WORKSPACE_PANEL_TABS.map((t) => (
<button
type="button"
key={t.id}
id={`tab-${t.id}`}
role="tab"
aria-selected={tab === t.id}
aria-controls={`panel-${t.id}`}
tabIndex={tab === t.id ? 0 : -1}
onClick={() => setTab(t.id)}
className={`shrink-0 px-3 py-2.5 text-[10px] font-medium tracking-wide transition-all rounded-t-lg mx-0.5 focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/70 ${
tab === t.id
? "text-ink bg-surface-card border-b-2 border-accent"
: "text-ink-mid hover:text-ink hover:bg-surface-card/60"
}`}
>
<span className="mr-1 opacity-50" aria-hidden="true">{t.icon}</span>
{t.label}
</button>
))}
</div>
</div>
{/* Needs Restart Banner */}
{data.needsRestart && !data.currentTask && (
<div className="px-4 py-2 bg-sky-950/20 border-b border-sky-800/20 flex items-center justify-between">
<span className="text-[10px] text-sky-300/90">Config changed restart to apply</span>
<button
type="button"
onClick={() => {
restartWorkspace(workspaceId).catch(() => showToast("Restart failed", "error"));
}}
className="text-[11px] px-2 py-1 bg-sky-800/40 hover:bg-sky-700/50 text-sky-200 rounded transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
Restart Now
</button>
</div>
)}
{/* Current Task Banner */}
{data.currentTask && (
<Tooltip text={data.currentTask as string}>
<div className="px-4 py-2 bg-amber-950/20 border-b border-amber-800/20 flex items-center gap-2 cursor-default">
<div className="w-1.5 h-1.5 rounded-full bg-amber-400 motion-safe:animate-pulse shrink-0" />
<span className="text-[10px] text-warm/90 truncate">
{data.currentTask}
</span>
</div>
</Tooltip>
)}
{/* Tab Content */}
<div
role="tabpanel"
id={`panel-${tab}`}
aria-labelledby={`tab-${tab}`}
tabIndex={0}
className="flex-1 overflow-y-auto focus:outline-none"
>
{tab === "details" && <DetailsTab key={workspaceId} workspaceId={workspaceId} data={data} />}
{tab === "skills" && <SkillsTab key={workspaceId} workspaceId={workspaceId} data={data} />}
{tab === "activity" && <ActivityTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "chat" && <ChatTab key={workspaceId} workspaceId={workspaceId} data={data} />}
{tab === "terminal" && <TerminalTab key={workspaceId} workspaceId={workspaceId} data={data} />}
{tab === "display" && <DisplayTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "container-config" && (
<ContainerConfigTab key={workspaceId} workspaceId={workspaceId} data={data} />
)}
{tab === "config" && <ConfigTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "schedule" && <ScheduleTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "channels" && <ChannelsTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "files" && <FilesTab key={workspaceId} workspaceId={workspaceId} data={data} />}
{tab === "memory" && <MemoryInspectorPanel key={workspaceId} workspaceId={workspaceId} />}
{tab === "traces" && <TracesTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "events" && <EventsTab key={workspaceId} workspaceId={workspaceId} />}
{tab === "audit" && <AuditTrailPanel key={workspaceId} workspaceId={workspaceId} />}
</div>
</>
);
}
@@ -275,9 +275,9 @@ describe("WorkspaceNode — status states", () => {
expect(screen.getByText("STARTING")).toBeTruthy();
});
it("suppresses status label for online node", () => {
it("shows status label for online node (concept: status always visible)", () => {
renderNode({ status: "online" });
expect(screen.queryByText("ONLINE")).toBeNull();
expect(screen.getByText("ONLINE")).toBeTruthy();
});
it("shows degraded error preview when status is degraded and lastSampleError is set", () => {
@@ -404,14 +404,18 @@ describe("WorkspaceNode — double-click interactions", () => {
});
describe("WorkspaceNode — active tasks", () => {
it("shows active tasks badge when activeTasks > 0", () => {
it("shows the queued count when activeTasks > 0", () => {
renderNode({ activeTasks: 3 });
expect(screen.getByText("3 tasks")).toBeTruthy();
expect(
screen.getByText((_, el) => el?.tagName === "SPAN" && (el.textContent ?? "").includes("3 queued")),
).toBeTruthy();
});
it("shows singular 'task' when activeTasks is 1", () => {
it("shows the queued count for a single task", () => {
renderNode({ activeTasks: 1 });
expect(screen.getByText("1 task")).toBeTruthy();
expect(
screen.getByText((_, el) => el?.tagName === "SPAN" && (el.textContent ?? "").includes("1 queued")),
).toBeTruthy();
});
it("suppresses badge when no active tasks", () => {
@@ -471,13 +475,15 @@ describe("WorkspaceNode — needs restart", () => {
});
describe("WorkspaceNode — descendant badge", () => {
it("shows descendant count badge when node has children in store", () => {
it("shows the agent count in the status line when node has children", () => {
store().nodes = [
makeNode({ id: "ws-1" }),
{ id: "child-1", data: { ...makeNode({ id: "ws-1" }).data, parentId: "ws-1" } },
];
renderNode();
expect(screen.getByText("1 sub")).toBeTruthy();
expect(
screen.getByText((_, el) => el?.tagName === "SPAN" && (el.textContent ?? "").includes("1 agents")),
).toBeTruthy();
});
it("suppresses badge when node has no children", () => {
@@ -527,9 +533,9 @@ describe("WorkspaceNode — skills pills", () => {
});
describe("WorkspaceNode — runtime badge", () => {
it("shows runtime badge when runtime is set", () => {
renderNode({ runtime: "hermes" });
expect(screen.getByText("hermes")).toBeTruthy();
it("shows the role pill (runtime pill replaced by role pill in the concept redesign)", () => {
renderNode({ role: "researcher" });
expect(screen.getByText("researcher")).toBeTruthy();
});
it("shows REMOTE badge for external runtime", () => {
@@ -0,0 +1,103 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, afterEach } from "vitest";
import { render, screen, fireEvent, cleanup } from "@testing-library/react";
afterEach(() => {
cleanup();
});
// ── Mock every tab content component to a sentinel so we can assert which
// body renders without dragging in API calls / heavy children. ───────────
vi.mock("../tabs/DetailsTab", () => ({ DetailsTab: () => <div data-testid="body-details" /> }));
vi.mock("../tabs/SkillsTab", () => ({ SkillsTab: () => <div data-testid="body-skills" /> }));
vi.mock("../tabs/ChatTab", () => ({ ChatTab: () => <div data-testid="body-chat" /> }));
vi.mock("../tabs/ConfigTab", () => ({ ConfigTab: () => <div data-testid="body-config" /> }));
vi.mock("../tabs/ContainerConfigTab", () => ({ ContainerConfigTab: () => <div data-testid="body-container" /> }));
vi.mock("../tabs/DisplayTab", () => ({ DisplayTab: () => <div data-testid="body-display" /> }));
vi.mock("../tabs/TerminalTab", () => ({ TerminalTab: () => <div data-testid="body-terminal" /> }));
vi.mock("../tabs/FilesTab", () => ({ FilesTab: () => <div data-testid="body-files" /> }));
vi.mock("../MemoryInspectorPanel", () => ({ MemoryInspectorPanel: () => <div data-testid="body-memory" /> }));
vi.mock("../tabs/TracesTab", () => ({ TracesTab: () => <div data-testid="body-traces" /> }));
vi.mock("../tabs/EventsTab", () => ({ EventsTab: () => <div data-testid="body-events" /> }));
vi.mock("../tabs/ActivityTab", () => ({ ActivityTab: () => <div data-testid="body-activity" /> }));
vi.mock("../tabs/ScheduleTab", () => ({ ScheduleTab: () => <div data-testid="body-schedule" /> }));
vi.mock("../tabs/ChannelsTab", () => ({ ChannelsTab: () => <div data-testid="body-channels" /> }));
vi.mock("../AuditTrailPanel", () => ({ AuditTrailPanel: () => <div data-testid="body-audit" /> }));
vi.mock("../Tooltip", () => ({
Tooltip: ({ children }: { children: React.ReactNode }) => <>{children}</>,
}));
vi.mock("@/components/Toaster", () => ({ showToast: vi.fn() }));
// The store is only consulted for restartWorkspace.
const mockRestart = vi.fn(() => Promise.resolve());
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector: (s: { restartWorkspace: typeof mockRestart }) => unknown) =>
selector({ restartWorkspace: mockRestart })
),
}));
import { WorkspacePanelTabs, WORKSPACE_PANEL_TABS } from "../WorkspacePanelTabs";
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const node: any = {
id: "platform-1",
data: {
name: "Org Concierge",
status: "online",
tier: 0,
role: "platform",
parentId: null,
needsRestart: false,
currentTask: null,
agentCard: null,
},
};
describe("WorkspacePanelTabs — uncontrolled (Settings usage)", () => {
it("renders the canonical 15-tab tablist for an explicit node", () => {
render(<WorkspacePanelTabs node={node} />);
const tablist = screen.getByRole("tablist");
expect(tablist.getAttribute("aria-label")).toBe("Workspace panel tabs");
expect(screen.getAllByRole("tab").length).toBe(WORKSPACE_PANEL_TABS.length);
expect(WORKSPACE_PANEL_TABS.length).toBe(15);
});
it("defaults to the chat tab when no defaultTab is given", () => {
render(<WorkspacePanelTabs node={node} />);
expect(screen.getByTestId("body-chat")).toBeTruthy();
expect(document.getElementById("tab-chat")?.getAttribute("aria-selected")).toBe("true");
});
it("honours defaultTab='config' (the concierge Settings entry point)", () => {
render(<WorkspacePanelTabs node={node} defaultTab="config" />);
expect(screen.getByTestId("body-config")).toBeTruthy();
expect(document.getElementById("tab-config")?.getAttribute("aria-selected")).toBe("true");
});
it("clicking a tab swaps the body using local state (no store panelTab)", () => {
render(<WorkspacePanelTabs node={node} />);
fireEvent.click(document.getElementById("tab-channels")!);
expect(screen.getByTestId("body-channels")).toBeTruthy();
expect(document.getElementById("tab-channels")?.getAttribute("aria-selected")).toBe("true");
});
});
describe("WorkspacePanelTabs — controlled (SidePanel usage)", () => {
it("renders activeTab and calls onTabChange instead of local state", () => {
const onTabChange = vi.fn();
render(<WorkspacePanelTabs node={node} activeTab="details" onTabChange={onTabChange} />);
expect(screen.getByTestId("body-details")).toBeTruthy();
fireEvent.click(document.getElementById("tab-config")!);
expect(onTabChange).toHaveBeenCalledWith("config");
// Controlled: body does NOT change on its own (parent owns the state).
expect(screen.getByTestId("body-details")).toBeTruthy();
});
it("ArrowRight from chat calls onTabChange with the next tab", () => {
const onTabChange = vi.fn();
render(<WorkspacePanelTabs node={node} activeTab="chat" onTabChange={onTabChange} />);
fireEvent.keyDown(screen.getByRole("tablist"), { key: "ArrowRight" });
expect(onTabChange).toHaveBeenCalledWith("activity");
});
});
@@ -188,11 +188,13 @@ describe("DropTargetBadge — renders ghost slot + badge for valid drag target",
});
render(<DropTargetBadge />);
expect(screen.getByTestId("ghost-slot")).toBeTruthy();
// Ghost uses slotBR from 3rd call: slotBR - slotTL = (712-232, 920-660)
// Ghost spans one default child slot at zoom 2: width = CHILD_DEFAULT_WIDTH
// (300) × 2 = 600; height = CHILD_DEFAULT_HEIGHT (176) × 2 = 352. left/top
// are the column-0/row-0 slot origin (unchanged by the card-size bump).
expect(screen.getByTestId("ghost-slot").style.left).toBe("232px");
expect(screen.getByTestId("ghost-slot").style.top).toBe("660px");
expect(screen.getByTestId("ghost-slot").style.width).toBe("480px");
expect(screen.getByTestId("ghost-slot").style.height).toBe("260px");
expect(screen.getByTestId("ghost-slot").style.width).toBe("600px");
expect(screen.getByTestId("ghost-slot").style.height).toBe("352px");
});
it("ghost is hidden when slot falls entirely outside parent bounds", () => {
@@ -325,7 +325,7 @@ describe("all shortcuts respect inInput guard", () => {
});
});
describe("Cmd/Ctrl+Arrow — keyboard node resize", () => {
describe("Cmd/Ctrl+Arrow — free-resize removed (system-controlled sizing)", () => {
beforeEach(() => {
mockStoreState.nodes = [
{
@@ -340,81 +340,15 @@ describe("Cmd/Ctrl+Arrow — keyboard node resize", () => {
renderWithProvider();
});
it("resizes height down (smaller) on Cmd/Ctrl+ArrowUp", () => {
// Node starts at minHeight=110 (no children). Shrinking clamps to min —
// height stays 110. Width is unchanged.
it("no longer resizes the node on Cmd/Ctrl+Arrow (free-resize removed)", () => {
// Sizing is system-controlled now: leaves render fixed-size and parents
// grow to fit their children, so Cmd/Ctrl+Arrow must not emit a
// `dimensions` change anymore.
fireEvent.keyDown(window, { key: "ArrowUp", metaKey: true });
expect(mockStoreState.onNodesChange).toHaveBeenCalledWith([
expect.objectContaining({
type: "dimensions",
id: "n1",
dimensions: { width: 210, height: 110 },
}),
]);
});
it("resizes height up (larger) on Cmd/Ctrl+ArrowDown", () => {
fireEvent.keyDown(window, { key: "ArrowDown", ctrlKey: true });
expect(mockStoreState.onNodesChange).toHaveBeenCalledWith([
expect.objectContaining({
type: "dimensions",
id: "n1",
dimensions: { width: 210, height: 120 },
}),
]);
});
it("resizes width down (smaller) on Cmd/Ctrl+ArrowLeft", () => {
// Node starts at minWidth=210 (no children). Shrinking clamps to min —
// width stays 210. Height is unchanged.
fireEvent.keyDown(window, { key: "ArrowLeft", metaKey: true });
expect(mockStoreState.onNodesChange).toHaveBeenCalledWith([
expect.objectContaining({
type: "dimensions",
id: "n1",
dimensions: { width: 210, height: 110 },
}),
]);
});
it("resizes width up (larger) on Cmd/Ctrl+ArrowRight", () => {
fireEvent.keyDown(window, { key: "ArrowRight", ctrlKey: true });
expect(mockStoreState.onNodesChange).toHaveBeenCalledWith([
expect.objectContaining({
type: "dimensions",
id: "n1",
dimensions: { width: 220, height: 110 },
}),
]);
});
it("uses 2px step with Shift held", () => {
// Step is 2px with Shift, but minHeight=110 clamps the result.
// 110 - 2 = 108, Math.max(110, 108) = 110. Width is unchanged.
fireEvent.keyDown(window, { key: "ArrowUp", metaKey: true, shiftKey: true });
expect(mockStoreState.onNodesChange).toHaveBeenCalledWith([
expect.objectContaining({
dimensions: { width: 210, height: 110 },
}),
]);
});
it("respects min-height constraint (no children)", () => {
fireEvent.keyDown(window, { key: "ArrowUp", metaKey: true });
fireEvent.keyDown(window, { key: "ArrowUp", metaKey: true });
// After shrinking from 110 to 100, another ArrowUp hits min-height of 110
// (110 - 10 = 100, but 100 < 110 so it should stay at 110)
// Actually: 110 -> 100 -> 110 (resets to min)
// Let me check: the hook does Math.max(minHeight, currentHeight - step)
// minHeight=110, step=10, so 110 - 10 = 100, but Math.max(110, 100) = 110
// So two ArrowUp calls should both result in height=100 then height=110?
// Wait: 110 - 10 = 100, Math.max(110, 100) = 110 (not 100)
// So the height never goes below 110. After first: 110 -> 100, but clamped to 110.
// Actually Math.max(110, 100) = 110, so the height never changes.
// The min constraint is respected — height stays at 110.
expect(mockStoreState.onNodesChange).toHaveBeenLastCalledWith([
expect.objectContaining({ dimensions: { width: 210, height: 110 } }),
]);
expect(mockStoreState.onNodesChange).not.toHaveBeenCalled();
});
it("does NOT fire when no node is selected", () => {
@@ -2,13 +2,6 @@
import { useEffect } from "react";
import { useCanvasStore } from "@/store/canvas";
import { type NodeChange, type Node } from "@xyflow/react";
import type { WorkspaceNodeData } from "@/store/canvas";
/** Returns true if the node has any direct child in the node list. */
function hasChildren(nodeId: string, nodes: Node<WorkspaceNodeData>[]): boolean {
return nodes.some((n) => n.data.parentId === nodeId);
}
/**
* Canvas-wide keyboard shortcuts. All bound to the document window so
@@ -22,8 +15,9 @@ function hasChildren(nodeId: string, nodes: Node<WorkspaceNodeData>[]): boolean
* Cmd/Ctrl+[ bump selected node backward in z-order
* Z zoom-to-team if the selected node has children
* Arrow keys move selected node 10px (50px with Shift)
* Cmd/Ctrl+Arrow resize selected node ( height, width)
* Cmd/Ctrl+Shift+Arrow resize by 2px per press (fine control)
*
* Node resize shortcuts were removed: container size + shape are now
* system-controlled (leaves fixed-size, parents grow to fit children).
*/
export function useKeyboardShortcuts() {
useEffect(() => {
@@ -96,8 +90,8 @@ export function useKeyboardShortcuts() {
// Arrow-key node movement — Figma-style keyboard drag for keyboard users.
// 10 px per press, 50 px with Shift held. Only fires when a node
// is selected and the target isn't a form control. Skipped when a
// modifier key (Cmd/Ctrl/Alt) is held so those combos can be used
// for other shortcuts (e.g. Cmd+Arrow = resize).
// modifier key (Cmd/Ctrl/Alt) is held so those combos stay free for
// browser/OS shortcuts (node resize via Cmd+Arrow was removed).
if (
!inInput &&
!e.metaKey &&
@@ -125,43 +119,9 @@ export function useKeyboardShortcuts() {
state.moveNode(selectedId, dx, dy);
}
// Cmd/Ctrl+Arrow — keyboard-accessible node resize.
// ↑/↓ resizes height, ←/→ resizes width.
// 10 px per press (2 px with Shift for fine control).
// Uses the same onNodesChange('dimensions') path that NodeResizer uses.
if (
!inInput &&
(e.metaKey || e.ctrlKey) &&
(e.key === "ArrowUp" ||
e.key === "ArrowDown" ||
e.key === "ArrowLeft" ||
e.key === "ArrowRight")
) {
const state = useCanvasStore.getState();
const selectedId = state.selectedNodeId;
if (!selectedId) return;
if (document.querySelector('[role="dialog"][aria-modal="true"]')) return;
e.preventDefault();
const step = e.shiftKey ? 2 : 10;
const node = state.nodes.find((n) => n.id === selectedId);
if (!node) return;
const currentWidth = (node.width ?? 210) as number;
const currentHeight = (node.height ?? 110) as number;
const minWidth = hasChildren(node.id, state.nodes) ? 360 : 210;
const minHeight = hasChildren(node.id, state.nodes) ? 200 : 110;
let newWidth = currentWidth;
let newHeight = currentHeight;
if (e.key === "ArrowUp") newHeight = Math.max(minHeight, currentHeight - step);
else if (e.key === "ArrowDown") newHeight = currentHeight + step;
else if (e.key === "ArrowLeft") newWidth = Math.max(minWidth, currentWidth - step);
else newWidth = currentWidth + step;
const change: NodeChange = {
type: "dimensions",
id: selectedId,
dimensions: { width: newWidth, height: newHeight },
};
state.onNodesChange([change]);
}
// Node resize (was Cmd/Ctrl+Arrow) removed — container size + shape are
// now system-controlled: leaves render at a fixed size and parents grow
// to fit their children, so there is no user-driven resize affordance.
};
window.addEventListener("keydown", handler);
return () => window.removeEventListener("keydown", handler);
@@ -0,0 +1,339 @@
/* Faithful port of the Org Concierge concept (molecule-concierge-v1).
Scoped under .root so the concept's generic class names (.btn, .view,
.msg, .node ) cannot collide with the rest of the canvas app. Theme
tokens are redefined here (not the app tokens) so the port matches the
concept palette exactly; they key off the same [data-theme] on <html>. */
.root {
--mono: "JetBrains Mono", ui-monospace, monospace;
--sans: var(--font-hanken), "Hanken Grotesk", system-ui, sans-serif;
/* dark (default) */
--bg: #08080a; --panel: #0d0d11; --panel-2: #101015;
--card: #16161d; --card-2: #1b1b23; --card-hover: #1f1f28;
--hair: rgba(255,255,255,.07); --hair-2: rgba(255,255,255,.11);
--tx: #ececf1; --tx-2: #9b9baa; --tx-3: #65656f;
--accent: #8b5cf6; --accent-2: #a78bfa; --accent-soft: rgba(139,92,246,.14);
--green: #34d399; --green-soft: rgba(52,211,153,.13); --green-bd: rgba(52,211,153,.26);
--amber: #fbbf24; --grey: #6a6a78; --warn: #f5a623; --red: #f87171;
--dot: rgba(255,255,255,.06);
--shadow: 0 18px 50px rgba(0,0,0,.5);
--user-bubble-tx: #fff;
font-family: var(--sans);
background: var(--bg);
color: var(--tx);
font-size: 14px;
-webkit-font-smoothing: antialiased;
position: fixed;
inset: 0;
overflow: hidden;
}
:global([data-theme="light"]) .root {
--bg: #f1efe8; --panel: #fbfaf6; --panel-2: #f6f4ee;
--card: #ffffff; --card-2: #faf9f4; --card-hover: #f3f1ea;
--hair: rgba(20,18,12,.10); --hair-2: rgba(20,18,12,.16);
--tx: #21201b; --tx-2: #5c5a52; --tx-3: #8e8b81;
--accent: #7c3aed; --accent-2: #7c3aed; --accent-soft: rgba(124,58,237,.10);
--green: #0f9d63; --green-soft: rgba(15,157,99,.10); --green-bd: rgba(15,157,99,.24);
--amber: #c98a04; --grey: #a8a59b; --warn: #c47e12; --red: #dc4d4d;
--dot: rgba(20,18,12,.10);
--shadow: 0 18px 50px rgba(60,56,40,.14);
}
.root *, .root *::before, .root *::after { box-sizing: border-box; }
.root ::-webkit-scrollbar { width: 8px; height: 8px; }
.root ::-webkit-scrollbar-thumb { background: var(--hair-2); border-radius: 8px; }
.root ::-webkit-scrollbar-track { background: transparent; }
.app { display: flex; height: 100%; width: 100%; }
/* ===== ICON RAIL ===== */
.rail {
width: 52px; flex: 0 0 52px; background: var(--panel);
border-right: 1px solid var(--hair);
display: flex; flex-direction: column; padding: 12px 8px; gap: 3px;
transition: width .22s cubic-bezier(.4,0,.2,1), flex-basis .22s cubic-bezier(.4,0,.2,1);
overflow: hidden;
}
.app.railOpen .rail { width: 212px; flex-basis: 212px; }
.railTop { display: flex; align-items: center; gap: 8px; height: 36px; margin-bottom: 8px; }
.logo {
width: 36px; height: 36px; flex: 0 0 36px; border-radius: 10px; display: grid; place-items: center; cursor: pointer;
background: linear-gradient(150deg,#7c3aed,#a78bfa);
box-shadow: 0 4px 14px rgba(124,58,237,.45), inset 0 1px 0 rgba(255,255,255,.25);
}
.railWordmark { font-weight: 700; font-size: 14.5px; letter-spacing: -.01em; white-space: nowrap; opacity: 0; transition: opacity .16s; pointer-events: none; }
.app.railOpen .railWordmark { opacity: 1; transition: opacity .18s .08s; }
.railToggle { margin-left: auto; width: 30px; height: 30px; flex: 0 0 30px; border-radius: 8px; display: grid; place-items: center; color: var(--tx-3); cursor: pointer; transition: .16s; border: none; background: none; }
.railToggle:hover { color: var(--tx); background: var(--hair); }
.railToggle svg { width: 18px; height: 18px; }
.app:not(.railOpen) .railToggle { display: none; }
.navbtn { height: 40px; border-radius: 10px; color: var(--tx-3); cursor: pointer; position: relative; transition: .16s; display: flex; align-items: center; gap: 12px; padding: 0; justify-content: flex-start; width: 100%; background: none; border: none; }
.app.railOpen .navbtn { padding: 0 11px; }
.navbtn .ico { width: 36px; flex: 0 0 36px; display: grid; place-items: center; }
.app.railOpen .navbtn .ico { width: 20px; flex: 0 0 20px; }
.navbtn .lbl { font-size: 13.5px; font-weight: 500; white-space: nowrap; opacity: 0; transition: opacity .16s; pointer-events: none; }
.app.railOpen .navbtn .lbl { opacity: 1; transition: opacity .18s .08s; }
.navbtn:hover { color: var(--tx-2); background: var(--hair); }
.navbtn.active { color: var(--accent-2); background: var(--accent-soft); }
.navbtn.active::before { content: ""; position: absolute; left: -8px; top: 50%; transform: translateY(-50%); width: 3px; height: 18px; border-radius: 0 3px 3px 0; background: var(--accent-2); }
.navbtn svg { width: 20px; height: 20px; }
.spacer { flex: 1; }
/* ===== MAIN ===== */
.main { flex: 1; display: flex; flex-direction: column; min-width: 0; }
.topbar { height: 56px; flex: 0 0 56px; border-bottom: 1px solid var(--hair); background: var(--panel); display: flex; align-items: center; justify-content: space-between; padding: 0 18px 0 20px; }
.org { display: flex; align-items: center; gap: 10px; cursor: pointer; padding: 6px 10px; border-radius: 9px; transition: .16s; margin-left: -6px; }
.org:hover { background: var(--hair); }
.orgBadge { width: 24px; height: 24px; border-radius: 7px; display: grid; place-items: center; background: linear-gradient(150deg,#2d2d36,#3a3a46); font-size: 12px; font-weight: 700; color: #d8d8e2; border: 1px solid var(--hair-2); }
:global([data-theme="light"]) .orgBadge { background: linear-gradient(150deg,#7c3aed,#a78bfa); color: #fff; border: none; }
.orgName { font-weight: 600; font-size: 14.5px; letter-spacing: -.01em; }
.chev { color: var(--tx-3); display: flex; }
.chev svg { width: 15px; height: 15px; }
.topbarRight { display: flex; align-items: center; gap: 10px; }
.iconPill { width: 34px; height: 34px; border-radius: 9px; display: grid; place-items: center; color: var(--tx-3); cursor: pointer; transition: .16s; border: none; background: none; }
.iconPill:hover { color: var(--tx-2); background: var(--hair); }
.iconPill svg { width: 18px; height: 18px; }
.themeToggle { width: 34px; height: 34px; border-radius: 9px; display: grid; place-items: center; color: var(--tx-2); cursor: pointer; transition: .16s; border: 1px solid var(--hair); background: none; }
.themeToggle:hover { background: var(--hair); color: var(--tx); }
.themeToggle svg { width: 17px; height: 17px; }
.avatar { width: 32px; height: 32px; border-radius: 50%; background: linear-gradient(150deg,#f0a36b,#e8638a); display: grid; place-items: center; font-weight: 700; font-size: 12.5px; color: #1a0d12; cursor: pointer; border: 1px solid rgba(255,255,255,.16); box-shadow: 0 2px 8px rgba(0,0,0,.3); margin-left: 4px; }
/* ===== VIEWS ===== */
.viewArea { flex: 1; min-height: 0; position: relative; }
.view { position: absolute; inset: 0; display: none; }
.view.active { display: flex; }
/* A transform turns this into the containing block for its position:fixed
descendants so the canvas's own overlays (Toolbar, Legend, Communications,
New Workspace, minimap) anchor to THIS box (the map view area, right of the
rail and below the topbar) instead of the viewport, and stop overlapping the
shell chrome. */
.canvasMount { position: absolute; inset: 0; transform: translateZ(0); overflow: hidden; }
/* ===== HOME VIEW ===== */
.homeSidebar { flex: 0 0 296px; max-width: 296px; background: var(--panel-2); border-right: 1px solid var(--hair); display: flex; flex-direction: column; min-height: 0; }
.sbTabs { display: flex; gap: 2px; padding: 12px 12px 0; border-bottom: 1px solid var(--hair); }
.sbTab { flex: 1; text-align: center; padding: 9px 4px 11px; font-size: 12.5px; font-weight: 600; color: var(--tx-3); cursor: pointer; position: relative; transition: .14s; border-radius: 8px 8px 0 0; border: none; background: none; }
.sbTab:hover { color: var(--tx-2); }
.sbTab.active { color: var(--tx); }
.sbTab.active::after { content: ""; position: absolute; left: 8px; right: 8px; bottom: -1px; height: 2px; border-radius: 2px; background: var(--accent); }
.cnt { font-family: var(--mono); font-size: 10px; font-weight: 600; margin-left: 5px; background: var(--hair); color: var(--tx-2); padding: 1px 5px; border-radius: 10px; }
.sbTab.active .cnt { background: var(--accent-soft); color: var(--accent-2); }
.sbBody { flex: 1; overflow-y: auto; padding: 14px 12px; }
.wsList { display: flex; flex-direction: column; gap: 6px; }
.treeChildren { position: relative; padding-left: 22px; display: flex; flex-direction: column; gap: 6px; margin-top: 6px; }
.tnode { position: relative; display: flex; flex-direction: column; gap: 6px; }
.tnode::before { content: ""; position: absolute; left: -14px; top: -6px; width: 1.5px; height: calc(100% + 6px); background: var(--hair-2); }
.tnode.last::before { height: 33px; }
.tnode::after { content: ""; position: absolute; left: -14px; top: 27px; width: 14px; height: 1.5px; background: var(--hair-2); }
.ws { display: flex; align-items: center; gap: 11px; padding: 10px 11px; border-radius: 13px; cursor: pointer; border: 1px solid transparent; background: transparent; transition: .16s; position: relative; width: 100%; text-align: left; }
.ws:hover { background: var(--card); }
.ws.active { background: var(--accent-soft); border-color: rgba(139,92,246,.34); }
.wsAv { width: 34px; height: 34px; border-radius: 50%; flex: 0 0 34px; position: relative; display: grid; place-items: center; font-weight: 700; font-size: 12px; color: #0c0c10; box-shadow: inset 0 1px 0 rgba(255,255,255,.3); }
.wsAv .dot { position: absolute; right: -1px; bottom: -1px; width: 10px; height: 10px; border-radius: 50%; border: 2.5px solid var(--panel-2); }
.ws.active .wsAv .dot { border-color: var(--card); }
.wsMeta { min-width: 0; flex: 1; }
.wsName { font-weight: 600; font-size: 13.5px; letter-spacing: -.01em; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
.wsSub { display: flex; align-items: center; gap: 6px; margin-top: 1px; min-width: 0; }
.wsRole { font-family: var(--mono); font-size: 10.5px; color: var(--tx-3); white-space: nowrap; overflow: hidden; text-overflow: ellipsis; min-width: 0; flex: 0 1 auto; }
.wsStatus { font-size: 10.5px; font-weight: 500; display: flex; align-items: center; gap: 4px; flex: 0 0 auto; }
.wsStatus .sdot { width: 6px; height: 6px; border-radius: 50%; }
.rootTag { margin-left: auto; font-family: var(--mono); font-size: 9px; letter-spacing: .1em; text-transform: uppercase; color: var(--accent-2); background: var(--accent-soft); padding: 3px 6px; border-radius: 6px; border: 1px solid rgba(139,92,246,.28); }
.wsQ { margin-left: auto; flex: 0 0 auto; font-family: var(--mono); font-size: 10px; font-weight: 700; color: var(--tx-2); background: var(--hair); border: 1px solid var(--hair-2); padding: 2px 7px; border-radius: 20px; display: inline-flex; align-items: center; gap: 4px; }
.wsQ svg { width: 9px; height: 9px; color: var(--tx-3); }
.wsQ.zero { color: var(--tx-3); opacity: .65; }
.wsCaret { flex: 0 0 auto; width: 20px; height: 20px; margin-left: 4px; border: none; background: none; color: var(--tx-3); cursor: pointer; display: grid; place-items: center; border-radius: 6px; transition: .14s; }
.wsCaret:hover { background: var(--hair); color: var(--tx); }
.wsCaret svg { width: 13px; height: 13px; }
.sbSection { font-size: 11px; font-weight: 600; letter-spacing: .12em; text-transform: uppercase; color: var(--tx-3); font-family: var(--mono); padding: 18px 4px 10px; }
/* tasks */
.task { display: flex; flex-direction: column; align-items: stretch; gap: 0; padding: 11px; border-radius: 12px; border: 1px solid var(--hair); background: var(--card); margin-bottom: 7px; }
.taskRow { display: flex; gap: 11px; }
.taskIc { width: 28px; height: 28px; border-radius: 8px; flex: 0 0 28px; display: grid; place-items: center; }
.taskIc svg { width: 15px; height: 15px; }
.taskIc.done { background: var(--green-soft); color: var(--green); border: 1px solid var(--green-bd); }
.taskIc.run { background: rgba(245,166,35,.12); color: var(--amber); border: 1px solid rgba(245,166,35,.28); }
.taskIc.sched { background: var(--accent-soft); color: var(--accent-2); border: 1px solid rgba(139,92,246,.26); }
.taskMeta { flex: 1; min-width: 0; }
.taskT { font-size: 13px; font-weight: 600; letter-spacing: -.01em; line-height: 1.35; }
.taskS { font-size: 11px; color: var(--tx-3); margin-top: 3px; display: flex; align-items: center; gap: 6px; }
.taskS .pip { width: 4px; height: 4px; border-radius: 50%; background: var(--tx-3); }
.taskActions { display: flex; gap: 7px; margin-top: 11px; padding-left: 39px; }
.tbtn { font-family: var(--sans); font-size: 11.5px; font-weight: 600; cursor: pointer; padding: 5px 12px; border-radius: 8px; border: 1px solid var(--hair-2); background: var(--card-2); color: var(--tx-2); transition: .14s; display: inline-flex; align-items: center; gap: 5px; }
.tbtn svg { width: 13px; height: 13px; }
.tbtn:hover { background: var(--card-hover); color: var(--tx); }
.tbtn.done { background: var(--green-soft); color: var(--green); border-color: var(--green-bd); }
.task.isDone .taskT { color: var(--tx-2); }
/* activity */
.act { display: flex; gap: 11px; padding: 6px 4px; }
.actTime { font-family: var(--mono); font-size: 10.5px; color: var(--tx-3); flex: 0 0 52px; padding-top: 1px; font-variant-numeric: tabular-nums; }
.actLine { position: relative; padding-left: 15px; flex: 1; }
.actLine::before { content: ""; position: absolute; left: 0; top: 6px; width: 6px; height: 6px; border-radius: 50%; background: var(--accent); }
.actLine.grn::before { background: var(--green); }
.actText { font-size: 12px; color: var(--tx-2); line-height: 1.45; }
.actText b { color: var(--tx); font-weight: 600; }
/* approvals */
.apprCard { background: var(--card); border: 1px solid var(--hair); border-radius: 14px; overflow: hidden; }
.apprRow { display: flex; align-items: flex-start; gap: 11px; padding: 13px; }
.apprIc { width: 30px; height: 30px; border-radius: 8px; flex: 0 0 30px; display: grid; place-items: center; background: rgba(239,68,68,.12); color: var(--red); border: 1px solid rgba(239,68,68,.22); }
.apprIc svg { width: 15px; height: 15px; }
.apprMeta { flex: 1; min-width: 0; }
.apprT { font-size: 13px; font-weight: 600; letter-spacing: -.01em; line-height: 1.35; }
.apprT code { font-family: var(--mono); font-size: 11px; color: var(--tx-2); background: var(--hair); padding: 1px 5px; border-radius: 5px; font-weight: 500; }
.apprS { font-size: 11px; color: var(--tx-3); margin-top: 3px; }
.apprActions { display: flex; gap: 7px; padding: 0 13px 13px; }
.empty { text-align: center; color: var(--tx-3); font-size: 12.5px; padding: 30px 16px; line-height: 1.6; }
.empty svg { width: 30px; height: 30px; margin-bottom: 10px; color: var(--tx-3); opacity: .6; }
/* buttons */
.btn { font-family: var(--sans); font-size: 12px; font-weight: 600; cursor: pointer; padding: 6px 13px; border-radius: 8px; border: 1px solid var(--hair-2); background: var(--card-2); color: var(--tx-2); transition: .14s; white-space: nowrap; }
.btn:hover { background: var(--card-hover); color: var(--tx); }
.btn.approve { background: var(--accent); color: #fff; border-color: transparent; box-shadow: 0 2px 10px rgba(124,58,237,.4); }
.btn.approve:hover { background: #9d6ef8; }
.btn.deny:hover { background: rgba(239,68,68,.14); color: var(--red); border-color: rgba(239,68,68,.3); }
.btn.flex { flex: 1; text-align: center; }
/* ===== CHAT ===== */
.chat { flex: 1; display: flex; flex-direction: column; min-width: 0; background: var(--bg); }
.chatHead { height: 56px; flex: 0 0 56px; border-bottom: 1px solid var(--hair); display: flex; align-items: center; gap: 12px; padding: 0 22px; background: var(--panel-2); }
.chAv { width: 30px; height: 30px; border-radius: 9px; display: grid; place-items: center; background: linear-gradient(150deg,#7c3aed,#a78bfa); color: #fff; box-shadow: 0 2px 8px rgba(124,58,237,.4); }
.chAv svg { width: 16px; height: 16px; }
.chMeta { flex: 1; }
.chTitle { font-size: 14.5px; font-weight: 600; letter-spacing: -.01em; }
.chSub { font-size: 11.5px; color: var(--tx-3); display: flex; align-items: center; gap: 6px; margin-top: 1px; }
.chSub .sdot { width: 6px; height: 6px; border-radius: 50%; background: var(--green); }
.chTools { display: flex; gap: 6px; }
.chatScroll { flex: 1; overflow-y: auto; padding: 30px 0; }
.chatInner { max-width: 720px; margin: 0 auto; padding: 0 28px; display: flex; flex-direction: column; gap: 22px; }
.msg { display: flex; gap: 13px; max-width: 100%; }
.msg.user { flex-direction: row-reverse; }
.msgAv { width: 30px; height: 30px; border-radius: 9px; flex: 0 0 30px; display: grid; place-items: center; font-weight: 700; font-size: 12px; }
.msg.user .msgAv { background: linear-gradient(150deg,#f0a36b,#e8638a); color: #1a0d12; }
.msg.bot .msgAv { background: linear-gradient(150deg,#7c3aed,#a78bfa); color: #fff; }
.msg.bot .msgAv svg { width: 16px; height: 16px; }
.bubbleWrap { display: flex; flex-direction: column; gap: 11px; min-width: 0; max-width: 560px; }
.msg.user .bubbleWrap { align-items: flex-end; }
.bubble { padding: 12px 15px; border-radius: 15px; font-size: 14px; line-height: 1.55; letter-spacing: -.005em; }
.msg.user .bubble { background: var(--accent); color: var(--user-bubble-tx); border-bottom-right-radius: 5px; box-shadow: 0 3px 14px rgba(124,58,237,.3); }
.msg.bot .bubble { background: var(--card); border: 1px solid var(--hair); border-bottom-left-radius: 5px; color: var(--tx); }
.bubble b { font-weight: 600; }
.actionCard { background: var(--card); border: 1px solid var(--hair); border-radius: 14px; padding: 13px 15px; display: flex; align-items: center; gap: 13px; width: 100%; }
.acIc { width: 34px; height: 34px; border-radius: 10px; flex: 0 0 34px; display: grid; place-items: center; background: var(--green-soft); border: 1px solid var(--green-bd); color: var(--green); }
.acIc svg { width: 18px; height: 18px; }
.acMeta { flex: 1; min-width: 0; }
.acLabel { font-family: var(--mono); font-size: 10px; letter-spacing: .1em; text-transform: uppercase; color: var(--tx-3); margin-bottom: 3px; }
.acTitle { font-size: 13.5px; font-weight: 600; letter-spacing: -.01em; display: flex; align-items: center; gap: 7px; flex-wrap: wrap; }
.acTitle .pill { font-family: var(--mono); font-size: 11px; font-weight: 500; color: var(--accent-2); white-space: nowrap; background: var(--accent-soft); padding: 2px 8px; border-radius: 6px; border: 1px solid rgba(139,92,246,.24); }
.acCheck { color: var(--green); display: flex; }
.acCheck svg { width: 18px; height: 18px; }
.reqCard { background: linear-gradient(180deg,rgba(245,166,35,.08),rgba(245,166,35,.02)); border: 1px solid rgba(245,166,35,.3); border-radius: 16px; padding: 16px; width: 100%; }
.reqTop { display: flex; align-items: flex-start; gap: 13px; }
.reqIc { width: 36px; height: 36px; border-radius: 10px; flex: 0 0 36px; display: grid; place-items: center; background: rgba(245,166,35,.15); border: 1px solid rgba(245,166,35,.34); color: var(--warn); }
.reqIc svg { width: 19px; height: 19px; }
.reqMeta { flex: 1; }
.reqLabel { font-family: var(--mono); font-size: 10px; letter-spacing: .1em; text-transform: uppercase; color: var(--warn); margin-bottom: 4px; font-weight: 600; }
.reqTitle { font-size: 14.5px; font-weight: 600; letter-spacing: -.01em; line-height: 1.4; }
.reqTitle code { font-family: var(--mono); font-size: 12.5px; color: var(--amber); background: rgba(245,166,35,.12); padding: 1px 6px; border-radius: 5px; font-weight: 500; }
.reqDesc { font-size: 12.5px; color: var(--tx-2); margin-top: 6px; line-height: 1.5; }
.reqActions { display: flex; gap: 9px; margin-top: 14px; padding-left: 49px; }
.reqActions .btn { padding: 8px 18px; font-size: 12.5px; }
.composer { padding: 14px 28px 20px; border-top: 1px solid var(--hair); background: var(--panel-2); }
.composerInner { max-width: 720px; margin: 0 auto; }
.inputBox { background: var(--card); border: 1px solid var(--hair-2); border-radius: 16px; padding: 12px 12px 10px 16px; transition: .16s; }
.inputBox:focus-within { border-color: rgba(139,92,246,.5); box-shadow: 0 0 0 3px rgba(139,92,246,.12); }
.inputTop { display: flex; align-items: flex-end; gap: 10px; }
.msgInput { flex: 1; background: none; border: none; outline: none; color: var(--tx); font-family: var(--sans); font-size: 14px; line-height: 1.5; resize: none; max-height: 120px; padding: 5px 0; }
.msgInput::placeholder { color: var(--tx-3); }
.send { width: 36px; height: 36px; flex: 0 0 36px; border-radius: 11px; border: none; cursor: pointer; background: var(--accent); color: #fff; display: grid; place-items: center; transition: .16s; box-shadow: 0 2px 10px rgba(124,58,237,.4); }
.send:hover { background: #9d6ef8; transform: translateY(-1px); }
.send svg { width: 17px; height: 17px; }
.inputBottom { display: flex; align-items: center; gap: 10px; margin-top: 8px; }
.hint { margin-left: auto; font-size: 11px; color: var(--tx-3); font-family: var(--mono); }
.hint kbd { background: var(--hair); border: 1px solid var(--hair); border-radius: 4px; padding: 1px 5px; font-family: var(--mono); font-size: 10px; }
/* greeting (empty chat state) */
.greetWrap { flex: 1; display: flex; flex-direction: column; align-items: center; justify-content: center; gap: 26px; padding: 0 28px; }
.greet { display: flex; align-items: center; gap: 14px; font-size: 34px; font-weight: 400; letter-spacing: -.02em; color: var(--tx); }
.greet .stamp { color: #f0a36b; }
.greetChips { display: flex; flex-wrap: wrap; gap: 10px; justify-content: center; }
.chip { display: inline-flex; align-items: center; gap: 7px; font-size: 13px; font-weight: 600; color: var(--tx-2); background: var(--card); border: 1px solid var(--hair); padding: 8px 13px; border-radius: 10px; cursor: pointer; transition: .14s; }
.chip:hover { background: var(--card-hover); color: var(--tx); border-color: var(--hair-2); }
/* placeholder (settings) */
.ph { flex: 1; display: flex; flex-direction: column; align-items: center; justify-content: center; gap: 14px; color: var(--tx-3); text-align: center; }
.ph svg { width: 42px; height: 42px; opacity: .5; }
.ph h2 { font-size: 18px; font-weight: 600; color: var(--tx-2); }
.ph p { font-size: 13.5px; max-width: 340px; line-height: 1.55; }
/* settings view */
.settingsScroll { flex: 1; min-height: 0; overflow-y: auto; padding: 28px 32px 60px; }
.settingsInner { max-width: 720px; margin: 0 auto; display: flex; flex-direction: column; gap: 26px; }
.settingsHead { display: flex; flex-direction: column; gap: 5px; }
.settingsHead h1 { font-size: 21px; font-weight: 600; letter-spacing: -.01em; color: var(--tx); }
.settingsHead p { font-size: 13px; color: var(--tx-3); line-height: 1.55; max-width: 540px; }
.scard { background: var(--card); border: 1px solid var(--hair); border-radius: 14px; padding: 18px 20px; display: flex; flex-direction: column; gap: 14px; }
.scardHead { display: flex; flex-direction: column; gap: 4px; }
.scardTitle { font-size: 14.5px; font-weight: 600; color: var(--tx); display: flex; align-items: center; gap: 9px; }
.scardDesc { font-size: 12.5px; color: var(--tx-3); line-height: 1.5; }
/* billing radio options */
.optList { display: flex; flex-direction: column; gap: 10px; }
.opt { display: flex; gap: 12px; padding: 13px 14px; border: 1px solid var(--hair); border-radius: 11px; cursor: pointer; transition: .14s; background: var(--card-2); align-items: flex-start; }
.opt:hover { border-color: var(--hair-2); background: var(--card-hover); }
.opt.optActive { border-color: rgba(139,92,246,.5); background: var(--accent-soft); }
.optRadio { width: 16px; height: 16px; flex: 0 0 16px; border-radius: 50%; border: 2px solid var(--hair-2); margin-top: 2px; position: relative; transition: .14s; }
.opt.optActive .optRadio { border-color: var(--accent); }
.opt.optActive .optRadio::after { content: ""; position: absolute; inset: 2px; border-radius: 50%; background: var(--accent); }
.optBody { display: flex; flex-direction: column; gap: 3px; min-width: 0; }
.optTitle { font-size: 13px; font-weight: 600; color: var(--tx); display: flex; align-items: center; gap: 8px; }
.optDesc { font-size: 12px; color: var(--tx-3); line-height: 1.5; }
.optTag { font-family: var(--mono); font-size: 9.5px; font-weight: 600; letter-spacing: .06em; text-transform: uppercase; color: var(--green); background: var(--green-soft); border: 1px solid var(--green-bd); padding: 1px 7px; border-radius: 20px; }
.optTagCur { color: var(--accent-2); background: var(--accent-soft); border-color: rgba(139,92,246,.3); }
/* byok key entry */
.keyRow { display: flex; flex-direction: column; gap: 9px; padding: 14px; border: 1px solid var(--hair); border-radius: 11px; background: var(--card-2); }
.keyLabel { font-size: 11px; font-weight: 600; letter-spacing: .04em; color: var(--tx-2); font-family: var(--mono); }
.keyInputRow { display: flex; gap: 9px; }
.keyInput { flex: 1; min-width: 0; background: var(--panel); border: 1px solid var(--hair-2); border-radius: 8px; padding: 8px 11px; font-family: var(--mono); font-size: 12px; color: var(--tx); outline: none; transition: .14s; }
.keyInput:focus { border-color: var(--accent); }
.keyInput::placeholder { color: var(--tx-3); }
.keyNote { font-size: 11.5px; color: var(--tx-3); line-height: 1.5; }
.keyNote code { font-family: var(--mono); font-size: 11px; color: var(--tx-2); background: var(--hair); padding: 1px 5px; border-radius: 4px; }
.sMsg { font-size: 12px; padding: 8px 11px; border-radius: 8px; line-height: 1.45; }
.sMsgErr { color: var(--red); background: rgba(239,68,68,.12); border: 1px solid rgba(239,68,68,.28); }
.sMsgOk { color: var(--green); background: var(--green-soft); border: 1px solid var(--green-bd); }
.btn.primary { background: var(--accent); color: #fff; border-color: transparent; box-shadow: 0 2px 10px rgba(124,58,237,.4); }
.btn.primary:hover { background: #9d6ef8; }
.btn.primary:disabled { opacity: .4; cursor: default; box-shadow: none; }
/* embedded canvas settings tabs */
.embedSettings { border: 1px solid var(--hair); border-radius: 14px; overflow: hidden; background: var(--card); }
/* embedded full workspace tab panel (the SAME WorkspacePanelTabs the Org-map
SidePanel renders), pointed at the platform agent. A bordered card with a
bounded height + flex column so the tab body's own overflow-y scroller works
inside it (mirrors .embedChat's min-height:0 trick). */
.embedPanel {
border: 1px solid var(--hair);
border-radius: 14px;
overflow: hidden;
background: var(--card);
display: flex;
flex-direction: column;
min-height: 0;
height: 70vh;
max-height: 760px;
}
/* embedded canonical ChatTab (shared with the Org-map SidePanel).
Fills the chat column below the concierge header; min-height:0 lets the
ChatTab's own overflow-y scroller work inside the flex column. */
.embedChat { flex: 1; min-height: 0; display: flex; flex-direction: column; }
@@ -0,0 +1,604 @@
"use client";
import { useCallback, useEffect, useMemo, useState } from "react";
import { useCanvasStore, type TopView } from "@/store/canvas";
import { WORKSPACE_KIND } from "@/lib/workspace-kind";
import { useTheme } from "@/lib/theme-provider";
import { api } from "@/lib/api";
import { showToast } from "@/components/Toaster";
import type { ActivityEntry } from "@/types/activity";
import { Canvas } from "@/components/Canvas";
import { CommunicationOverlay } from "@/components/CommunicationOverlay";
import { MessageFlightHome } from "./MessageFlightHome";
import { ChatTab } from "@/components/tabs/ChatTab";
import { WorkspacePanelTabs } from "@/components/WorkspacePanelTabs";
import { SettingsTabs } from "@/components/settings";
import s from "./Concierge.module.css";
import {
IcHome, IcOrgMap, IcSettings, IcSearch, IcBell, IcSun, IcMoon, IcChevDown,
IcQueue, IcCaret, IcMolecule, IcClock, IcCheck, IcTrash, IcChat,
} from "./icons";
/* ── status → concept palette ─────────────────────────────────────────── */
function statusInfo(status: string): { color: string; label: string } {
switch (status) {
case "online": return { color: "var(--green)", label: "online" };
case "provisioning":
case "starting": return { color: "var(--amber)", label: "starting" };
case "degraded": return { color: "var(--amber)", label: "degraded" };
case "building": return { color: "var(--amber)", label: "building" };
case "failed": return { color: "var(--red)", label: "failed" };
case "paused": return { color: "var(--accent-2)", label: "paused" };
default: return { color: "var(--grey)", label: status || "idle" };
}
}
const AV_GRADIENTS = [
"linear-gradient(150deg,#a78bfa,#7c3aed)",
"linear-gradient(150deg,#60a5fa,#3b82f6)",
"linear-gradient(150deg,#34d399,#10b981)",
"linear-gradient(150deg,#fbbf77,#f59e0b)",
"linear-gradient(150deg,#5eead4,#14b8a6)",
"linear-gradient(150deg,#f0a36b,#e8638a)",
];
function initials(name: string): string {
const parts = name.trim().split(/\s+/).filter(Boolean);
if (parts.length === 0) return "?";
if (parts.length === 1) return parts[0].slice(0, 2).toUpperCase();
return (parts[0][0] + parts[parts.length - 1][0]).toUpperCase();
}
function gradientFor(id: string): string {
let h = 0;
for (let i = 0; i < id.length; i++) h = (h * 31 + id.charCodeAt(i)) >>> 0;
return AV_GRADIENTS[h % AV_GRADIENTS.length];
}
type SbTab = "agents" | "tasks" | "approvals";
interface PendingApproval {
id: string;
workspace_id: string;
workspace_name: string;
action: string;
reason: string | null;
status: string;
created_at: string;
}
interface UserTask {
id: string;
workspace_id: string;
workspace_name: string;
title: string;
detail: string | null;
status: string;
created_at: string;
}
/** ISO timestamp → "9:05 PM" (local). Empty string on a bad/missing value. */
function clockTime(iso: string | null | undefined): string {
if (!iso) return "";
const d = new Date(iso);
if (Number.isNaN(d.getTime())) return "";
return d.toLocaleTimeString([], { hour: "numeric", minute: "2-digit" });
}
/** A human action label from an activity row. */
function activityText(a: ActivityEntry): string {
if (a.summary) return a.summary;
const verb = a.activity_type?.replace(/_/g, " ") ?? "activity";
return a.method ? `${verb} · ${a.method}` : verb;
}
export function ConciergeShell() {
const nodes = useCanvasStore((st) => st.nodes);
const topView = useCanvasStore((st) => st.topView);
const setTopView = useCanvasStore((st) => st.setTopView);
const selectNode = useCanvasStore((st) => st.selectNode);
const selectedNodeId = useCanvasStore((st) => st.selectedNodeId);
const { resolvedTheme, setTheme } = useTheme();
const [railOpen, setRailOpen] = useState(false);
const [sbTab, setSbTab] = useState<SbTab>("agents");
const [settingsTab, setSettingsTab] = useState<"platform" | "org">("platform");
const [collapsed, setCollapsed] = useState<Record<string, boolean>>({});
// Dynamic org name for the topbar. Sourced from GET /org/identity
// ({name} ← MOLECULE_ORG_NAME, added by a parallel backend change).
// Falls back to "Molecule AI" when the endpoint 404s / errors or
// returns an empty name, so the topbar never breaks before the backend
// lands.
const [orgName, setOrgName] = useState("Molecule AI");
useEffect(() => {
let cancelled = false;
api
.get<{ name?: string }>("/org/identity")
.then((r) => {
const name = (r?.name || "").trim();
if (!cancelled && name) setOrgName(name);
})
.catch(() => {
// No endpoint / not reachable — keep the "Molecule AI" fallback.
});
return () => {
cancelled = true;
};
}, []);
// Build the agent hierarchy from live nodes.
const { roots, childrenOf } = useMemo(() => {
const childrenOf = new Map<string, typeof nodes>();
const roots: typeof nodes = [];
for (const n of nodes) {
const p = n.data.parentId;
if (p) {
const arr = childrenOf.get(p) ?? [];
arr.push(n);
childrenOf.set(p, arr);
} else {
roots.push(n);
}
}
return { roots, childrenOf };
}, [nodes]);
const platformRoot = useMemo(
() =>
// Resolve the platform agent by the authoritative kind='platform' marker
// only — the backend in this branch always returns kind
// (COALESCE(w.kind,'workspace')) and the map-side filter
// (canvas-topology/Canvas/Toolbar) is kind-only, so the shell must not
// disagree via a name/role heuristic. Fall back to the first root only as
// graceful degradation if no node is tagged platform.
roots.find((r) => r.data.kind === WORKSPACE_KIND.Platform) ??
roots[0] ??
null,
[roots],
);
const platformId = platformRoot?.id ?? null;
// ── live data: approvals + user-tasks (org-wide), activity (platform agent) ──
const [approvals, setApprovals] = useState<PendingApproval[]>([]);
const [userTasks, setUserTasks] = useState<UserTask[]>([]);
const [activity, setActivity] = useState<ActivityEntry[]>([]);
const [deciding, setDeciding] = useState<string | null>(null);
const [resolving, setResolving] = useState<string | null>(null);
const loadApprovals = useCallback(() => {
api.get<PendingApproval[]>("/approvals/pending")
.then((r) => setApprovals(r ?? []))
.catch(() => setApprovals([]));
}, []);
const loadUserTasks = useCallback(() => {
api.get<UserTask[]>("/user-tasks/pending")
.then((r) => setUserTasks(r ?? []))
.catch(() => setUserTasks([]));
}, []);
useEffect(() => { loadApprovals(); loadUserTasks(); }, [loadApprovals, loadUserTasks]);
useEffect(() => {
if (!platformId) return;
let cancelled = false;
api.get<ActivityEntry[]>(`/workspaces/${platformId}/activity?limit=12`)
.then((r) => { if (!cancelled) setActivity(r ?? []); })
.catch(() => { if (!cancelled) setActivity([]); });
return () => { cancelled = true; };
}, [platformId]);
const decide = useCallback(async (a: PendingApproval, decision: "approved" | "denied") => {
if (deciding) return;
setDeciding(a.id);
try {
await api.post(`/workspaces/${a.workspace_id}/approvals/${a.id}/decide`, {
decision, decided_by: "human",
});
showToast(decision === "approved" ? "Approved" : "Denied", decision === "approved" ? "success" : "info");
setApprovals((prev) => prev.filter((x) => x.id !== a.id));
} catch {
showToast("Failed to record decision", "error");
} finally {
setDeciding(null);
}
}, [deciding]);
const resolveTask = useCallback(async (t: UserTask, status: "done" | "dismissed") => {
if (resolving) return;
setResolving(t.id);
try {
await api.post(`/workspaces/${t.workspace_id}/user-tasks/${t.id}/resolve`, {
status, resolved_by: "human",
});
showToast(status === "done" ? "Marked done" : "Dismissed", status === "done" ? "success" : "info");
setUserTasks((prev) => prev.filter((x) => x.id !== t.id));
} catch {
showToast("Failed to resolve task", "error");
} finally {
setResolving(null);
}
}, [resolving]);
const nav = (v: TopView) => setTopView(v);
/* ── agents tree (recursive) ──────────────────────────────────────── */
function renderNode(n: (typeof nodes)[number], depth: number) {
const kids = childrenOf.get(n.id) ?? [];
const hasKids = kids.length > 0;
const isCollapsed = collapsed[n.id];
const st = statusInfo(n.data.status);
const isRoot = depth === 0;
const isPlatform = n.id === platformRoot?.id;
const q = (n.data.activeTasks as number) ?? 0;
// Role can be a long descriptor (e.g. "Coding Executor (Kimi) — …"); render
// it compact (single-line, truncated by .wsRole) and surface the full text
// on hover via the native tooltip.
const roleLabel = isPlatform ? "platform" : n.data.role || "agent";
const row = (
<div
role="button"
tabIndex={0}
data-testid="agent-tree-node"
data-node-name={n.data.name}
data-ws-id={n.id}
data-platform={isPlatform ? "true" : "false"}
data-depth={depth}
className={`${s.ws} ${selectedNodeId === n.id ? s.active : ""}`}
onClick={() => selectNode(n.id)}
onKeyDown={(e) => {
if (e.key === "Enter" || e.key === " ") {
e.preventDefault();
selectNode(n.id);
}
}}
>
<div className={s.wsAv} style={{ background: gradientFor(n.id) }}>
{initials(n.data.name)}
<span className={s.dot} style={{ background: st.color }} />
</div>
<div className={s.wsMeta}>
<div className={s.wsName}>{n.data.name}</div>
<div className={s.wsSub}>
<span className={s.wsRole} title={roleLabel}>{roleLabel}</span>
<span className={s.wsStatus} style={{ color: st.color }}>
<span className={s.sdot} style={{ background: st.color }} />
{st.label}
</span>
</div>
</div>
{isRoot && isPlatform ? (
<span data-testid="agent-tree-root-tag" className={s.rootTag}>root</span>
) : (
<span className={`${s.wsQ} ${q === 0 ? s.zero : ""}`} title="Tasks in queue">
<IcQueue />
{q}
</span>
)}
{hasKids && (
<button
className={s.wsCaret}
title="Expand / collapse"
onClick={(e) => {
e.stopPropagation();
setCollapsed((c) => ({ ...c, [n.id]: !c[n.id] }));
}}
style={{ transform: isCollapsed ? "none" : "rotate(90deg)", transition: "transform .18s" }}
>
<IcCaret />
</button>
)}
</div>
);
return (
<div key={n.id} className={s.tnode}>
{row}
{hasKids && !isCollapsed && (
<div className={s.treeChildren}>
{kids.map((k) => renderNode(k, depth + 1))}
</div>
)}
</div>
);
}
return (
<div className={s.root}>
{/* Envelope flies between agent rows on each delegate/message event. */}
<MessageFlightHome />
<div className={`${s.app} ${railOpen ? s.railOpen : ""}`}>
{/* ICON RAIL */}
<nav className={s.rail}>
<div className={s.railTop}>
<div className={s.logo} title="Toggle sidebar" onClick={() => setRailOpen((o) => !o)}>
<IcMolecule />
</div>
<span className={s.railWordmark}>Molecule</span>
<button className={s.railToggle} title="Collapse sidebar" onClick={() => setRailOpen((o) => !o)}>
<IcOrgMap />
</button>
</div>
<button data-testid="nav-home" className={`${s.navbtn} ${topView === "home" ? s.active : ""}`} title="Home" onClick={() => nav("home")}>
<span className={s.ico}><IcHome /></span><span className={s.lbl}>Home</span>
</button>
<button data-testid="nav-map" className={`${s.navbtn} ${topView === "map" ? s.active : ""}`} title="Org map" onClick={() => nav("map")}>
<span className={s.ico}><IcOrgMap /></span><span className={s.lbl}>Org map</span>
</button>
<div className={s.spacer} />
<button data-testid="nav-settings" className={`${s.navbtn} ${topView === "settings" ? s.active : ""}`} title="Settings" onClick={() => nav("settings")}>
<span className={s.ico}><IcSettings /></span><span className={s.lbl}>Settings</span>
</button>
</nav>
<div className={s.main}>
{/* TOPBAR */}
<header className={s.topbar}>
<div className={s.org}>
<div className={s.orgBadge}>{initials(orgName).slice(0, 1)}</div>
<span data-testid="topbar-org-name" className={s.orgName}>{orgName}</span>
<span className={s.chev}><IcChevDown /></span>
</div>
<div className={s.topbarRight}>
<button className={s.iconPill} title="Search"><IcSearch /></button>
<button className={s.iconPill} title="Notifications"><IcBell /></button>
<button
className={s.themeToggle}
title="Toggle theme"
onClick={() => setTheme(resolvedTheme === "dark" ? "light" : "dark")}
>
{resolvedTheme === "dark" ? <IcMoon /> : <IcSun />}
</button>
<div className={s.avatar} title="You">HW</div>
</div>
</header>
<div className={s.viewArea}>
{/* HOME VIEW */}
<div className={`${s.view} ${topView === "home" ? s.active : ""}`}>
<aside className={s.homeSidebar}>
<div className={s.sbTabs}>
<button data-testid="home-subtab-agents" className={`${s.sbTab} ${sbTab === "agents" ? s.active : ""}`} onClick={() => setSbTab("agents")}>Agents</button>
<button data-testid="home-subtab-tasks" className={`${s.sbTab} ${sbTab === "tasks" ? s.active : ""}`} onClick={() => setSbTab("tasks")}>
Tasks{userTasks.length > 0 && <span className={s.cnt}>{userTasks.length}</span>}
</button>
<button data-testid="home-subtab-approvals" className={`${s.sbTab} ${sbTab === "approvals" ? s.active : ""}`} onClick={() => setSbTab("approvals")}>
Approvals{approvals.length > 0 && <span className={s.cnt}>{approvals.length}</span>}
</button>
</div>
<div className={s.sbBody}>
{sbTab === "agents" && (
<>
<div className={s.wsList}>
{roots.length === 0 && (
<div className={s.empty}>No agents yet. Ask the concierge to spin up a team.</div>
)}
{roots.map((r) => renderNode(r, 0))}
</div>
<div className={s.sbSection}>Recent activity</div>
<div>
{activity.length === 0 && (
<div className={s.empty}>No recent activity yet.</div>
)}
{activity.map((a) => {
const ok = a.status !== "error" && a.status !== "failed";
return (
<div key={a.id} className={s.act}>
<span className={s.actTime}>{clockTime(a.created_at)}</span>
<div className={`${s.actLine} ${ok ? s.grn : ""}`}>
<div className={s.actText}>{activityText(a)}</div>
</div>
</div>
);
})}
</div>
</>
)}
{sbTab === "tasks" && (
<>
{userTasks.length === 0 && (
<div className={s.empty}>Nothing needs you right now. When an agent needs you to do something, it shows up here.</div>
)}
{userTasks.map((t) => (
<div key={t.id} className={s.task}>
<div className={s.taskRow}>
<div className={`${s.taskIc} ${s.run}`}><IcClock /></div>
<div className={s.taskMeta}>
<div className={s.taskT}>{t.title}</div>
<div className={s.taskS}>
{t.workspace_name}<span className={s.pip} />asked {clockTime(t.created_at)}
</div>
{t.detail && (
<div style={{ fontSize: 12, color: "var(--tx-3)", marginTop: 6, lineHeight: 1.45 }}>
{t.detail}
</div>
)}
</div>
</div>
<div className={s.taskActions}>
<button className={`${s.tbtn} ${s.done}`} disabled={resolving === t.id} onClick={() => resolveTask(t, "done")}>
<IcCheck />Done
</button>
<button className={s.tbtn} disabled={resolving === t.id} onClick={() => resolveTask(t, "dismissed")}>
Dismiss
</button>
</div>
</div>
))}
</>
)}
{sbTab === "approvals" && (
<>
{approvals.length === 0 && (
<div className={s.empty}>No pending approvals. Destructive actions await sign-off here.</div>
)}
{approvals.map((a) => (
<div key={a.id} className={s.apprCard} style={{ marginBottom: 7 }}>
<div className={s.apprRow}>
<div className={s.apprIc}><IcTrash /></div>
<div className={s.apprMeta}>
<div className={s.apprT}>{a.action.replace(/_/g, " ")} <code>{a.workspace_name}</code></div>
<div className={s.apprS}>{a.reason || "destructive"}</div>
</div>
</div>
<div className={s.apprActions}>
<button className={`${s.btn} ${s.approve} ${s.flex}`} disabled={deciding === a.id} onClick={() => decide(a, "approved")}>
{deciding === a.id ? "…" : "Approve"}
</button>
<button className={`${s.btn} ${s.deny} ${s.flex}`} disabled={deciding === a.id} onClick={() => decide(a, "denied")}>
{deciding === a.id ? "…" : "Deny"}
</button>
</div>
</div>
))}
</>
)}
</div>
</aside>
{/* CHAT reuses the EXACT canonical chat the Org-map SidePanel
renders (My Chat / Agent Comms sub-tabs, attachments, history,
delivery-mode handling), pointed at the platform agent. A thin
concierge-styled header keeps the Home look; the ChatTab body
below is identical to the map path so features can't drift. */}
{platformId && platformRoot ? (
<section className={s.chat}>
<div className={s.chatHead}>
<div className={s.chAv}><IcChat /></div>
<div className={s.chMeta}>
<div className={s.chTitle}>{platformRoot.data.name ?? "Org Concierge"}</div>
<div className={s.chSub}>
{(() => {
const online =
platformRoot.data.status === "online" ||
platformRoot.data.status === "degraded";
return (
<>
<span
className={s.sdot}
style={{ background: online ? "var(--green)" : "var(--grey)" }}
/>
{online ? "online" : statusInfo(platformRoot.data.status ?? "").label} · platform agent
</>
);
})()}
</div>
</div>
</div>
<div className={s.embedChat}>
<ChatTab key={platformId} workspaceId={platformId} data={platformRoot.data} />
</div>
</section>
) : (
<section className={s.chat}>
<div className={s.greetWrap}>
<div className={s.greet}>
<span className={s.stamp}></span> No platform agent yet
</div>
</div>
</section>
)}
</div>
{/* ORG MAP VIEW — the live canvas */}
<div className={`${s.view} ${topView === "map" ? s.active : ""}`}>
{topView === "map" && (
<div className={s.canvasMount}>
<main aria-label="Agent canvas" style={{ position: "absolute", inset: 0 }}>
<Canvas />
</main>
<CommunicationOverlay />
</div>
)}
</div>
{/* SETTINGS VIEW */}
<div className={`${s.view} ${topView === "settings" ? s.active : ""}`}>
<div className={s.settingsScroll}>
<div className={s.settingsInner}>
<div className={s.settingsHead}>
<h1>Settings</h1>
<p>
Org-level settings for the platform concierge. Configure the
concierge exactly like any workspace config.yaml, plugins
and skills, container/compute, display, channels, schedule
and secrets plus how it pays for model usage and org
identity.
</p>
</div>
{/* Two tabs instead of one long sheet: Platform agent
configuration vs Org & canvas settings. Reuses the same
.sbTabs purple-underline tab style as the Home sub-tabs. */}
<div className={s.sbTabs} role="tablist" aria-label="Settings sections">
<button
type="button"
role="tab"
data-testid="settings-tab-platform"
aria-selected={settingsTab === "platform"}
className={`${s.sbTab} ${settingsTab === "platform" ? s.active : ""}`}
onClick={() => setSettingsTab("platform")}
>
Platform agent configuration
</button>
<button
type="button"
role="tab"
data-testid="settings-tab-org"
aria-selected={settingsTab === "org"}
className={`${s.sbTab} ${settingsTab === "org" ? s.active : ""}`}
onClick={() => setSettingsTab("org")}
>
Org &amp; canvas settings
</button>
</div>
{/* Platform agent configuration the FULL workspace tab UI
(Config, Plugins/Skills, Container, Display, Details,
Activity, Terminal, Channels, Schedule, Files, Memory,
Traces, Events, Audit), reusing the exact same
WorkspacePanelTabs the Org-map SidePanel renders so the two
surfaces can't drift. Pointed at the platform agent; the
panel owns its own local active-tab state so it doesn't
fight the map's node selection. */}
{settingsTab === "platform" && (
<div data-testid="settings-pane-platform" className={s.scard}>
<div className={s.scardHead}>
<div className={s.scardDesc}>
Update the concierge like any workspace: its config.yaml,
plugins &amp; skills, container/compute, display, channels,
schedule and more.
</div>
</div>
{platformRoot ? (
<div className={s.embedPanel}>
<WorkspacePanelTabs key={platformRoot.id} node={platformRoot} defaultTab="config" />
</div>
) : (
<div className={s.scardDesc}>
No platform agent yet. Spin one up from Home to configure it.
</div>
)}
</div>
)}
{settingsTab === "org" && (
<div data-testid="settings-pane-org" className={s.scard}>
<div className={s.scardHead}>
<div className={s.scardDesc}>
Secrets, workspace tokens, org API keys and organization
identity. These also live behind the gear in the top bar.
</div>
</div>
{platformId && (
<div className={s.embedSettings}>
<SettingsTabs workspaceId={platformId} />
</div>
)}
</div>
)}
</div>
</div>
</div>
</div>
</div>
</div>
</div>
);
}
@@ -0,0 +1,50 @@
/** MessageFlightHome the concierge-home counterpart of MessageFlightLayer.
* The home view is a vertical agent tree (not a spatial canvas), so an envelope
* flies between the source and target agent ROWS. It shares the exact same
* flight stream (useA2AFlights) as the canvas, and resolves endpoints from each
* row's DOM rect (rows carry data-ws-id). Reduced-motion is honoured by the
* shared hook (it emits no flights). */
import { useRef } from "react";
import { useA2AFlights, type A2AFlight } from "@/hooks/useA2AFlights";
import { FlightEnvelope, type Point } from "../FlightEnvelope";
function rowCenter(wsId: string): Point | null {
if (typeof document === "undefined") return null;
const sel =
typeof CSS !== "undefined" && typeof CSS.escape === "function"
? CSS.escape(wsId)
: wsId;
const el = document.querySelector<HTMLElement>(`[data-ws-id="${sel}"]`);
if (!el) return null;
const r = el.getBoundingClientRect();
return { x: r.left + r.width / 2, y: r.top + r.height / 2 };
}
/** One flight. Captures the source/target row rects ONCE on mount (a ref, not
* per-render) so a later re-render or scroll mid-flight does not restart the
* animation. */
function HomeFlight({ flight }: { flight: A2AFlight }) {
const pos = useRef<{ from: Point; to: Point } | null>(null);
if (pos.current === null) {
const from = rowCenter(flight.sourceId);
const to = rowCenter(flight.targetId);
if (from && to) pos.current = { from, to };
}
if (!pos.current) return null; // one or both agents not visible in the tree
return <FlightEnvelope from={pos.current.from} to={pos.current.to} kind={flight.kind} />;
}
export function MessageFlightHome() {
const flights = useA2AFlights();
if (flights.length === 0) return null;
return (
<div
aria-hidden="true"
style={{ position: "fixed", inset: 0, pointerEvents: "none", zIndex: 50 }}
>
{flights.map((f) => (
<HomeFlight key={f.key} flight={f} />
))}
</div>
);
}
+113
View File
@@ -0,0 +1,113 @@
/* Inline SVG icons lifted from the Org Concierge concept (molecule-concierge-v1).
Stroke icons inherit currentColor; size comes from the CSS (svg{width/height}). */
import type { SVGProps } from "react";
const stroke = {
fill: "none",
stroke: "currentColor",
strokeWidth: 1.8,
strokeLinecap: "round" as const,
strokeLinejoin: "round" as const,
};
export const IcMolecule = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" {...p}>
<circle cx="12" cy="5" r="2.4" fill="#fff" />
<circle cx="5.5" cy="16" r="2.4" fill="#fff" opacity=".85" />
<circle cx="18.5" cy="16" r="2.4" fill="#fff" opacity=".85" />
<path d="M12 7.4L6 14.2M12 7.4L18 14.2M7.6 16h8.8" stroke="#fff" strokeWidth="1.4" strokeLinecap="round" />
</svg>
);
export const IcChat = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" {...p}>
<circle cx="12" cy="5" r="2.2" fill="#fff" />
<circle cx="5.5" cy="16" r="2.2" fill="#fff" opacity=".85" />
<circle cx="18.5" cy="16" r="2.2" fill="#fff" opacity=".85" />
<path d="M12 7.2L6 14M12 7.2L18 14M7.6 16h8.8" stroke="#fff" strokeWidth="1.3" strokeLinecap="round" />
</svg>
);
export const IcHome = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M3 10.5L12 3l9 7.5" /><path d="M5 9.5V20h14V9.5" /></svg>
);
export const IcOrgMap = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}>
<rect x="8.5" y="3" width="7" height="6" rx="1.5" />
<rect x="2.5" y="15" width="6.5" height="6" rx="1.5" />
<rect x="15" y="15" width="6.5" height="6" rx="1.5" />
<path d="M12 9v3M12 12H5.75v3M12 12h6.25v3" />
</svg>
);
export const IcSettings = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}>
<circle cx="12" cy="12" r="3" />
<path d="M19.4 15a1.7 1.7 0 0 0 .34 1.87l.06.06a2 2 0 1 1-2.83 2.83l-.06-.06a1.7 1.7 0 0 0-1.87-.34 1.7 1.7 0 0 0-1.03 1.56V21a2 2 0 1 1-4 0v-.09A1.7 1.7 0 0 0 9 19.4a1.7 1.7 0 0 0-1.87.34l-.06.06a2 2 0 1 1-2.83-2.83l.06-.06A1.7 1.7 0 0 0 4.6 15a1.7 1.7 0 0 0-1.56-1.03H3a2 2 0 1 1 0-4h.09A1.7 1.7 0 0 0 4.6 9a1.7 1.7 0 0 0-.34-1.87l-.06-.06a2 2 0 1 1 2.83-2.83l.06.06A1.7 1.7 0 0 0 9 4.6a1.7 1.7 0 0 0 1.03-1.56V3a2 2 0 1 1 4 0v.09A1.7 1.7 0 0 0 15 4.6a1.7 1.7 0 0 0 1.87-.34l.06-.06a2 2 0 1 1 2.83 2.83l-.06.06A1.7 1.7 0 0 0 19.4 9c.13.31.4.55.73.66" />
</svg>
);
export const IcSearch = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="1.8" strokeLinecap="round" {...p}><circle cx="11" cy="11" r="7" /><path d="M20 20l-3.5-3.5" /></svg>
);
export const IcBell = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M18 8a6 6 0 1 0-12 0c0 7-3 9-3 9h18s-3-2-3-9" /><path d="M13.7 21a2 2 0 0 1-3.4 0" /></svg>
);
export const IcSun = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><circle cx="12" cy="12" r="4.2" /><path d="M12 2v2.5M12 19.5V22M2 12h2.5M19.5 12H22M4.9 4.9l1.8 1.8M17.3 17.3l1.8 1.8M19.1 4.9l-1.8 1.8M6.7 17.3l-1.8 1.8" /></svg>
);
export const IcMoon = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M21 12.8A9 9 0 1 1 11.2 3a7 7 0 0 0 9.8 9.8z" /></svg>
);
export const IcChevDown = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round" {...p}><path d="M6 9l6 6 6-6" /></svg>
);
export const IcCaret = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2.4" strokeLinecap="round" strokeLinejoin="round" {...p}><path d="M9 6l6 6-6 6" /></svg>
);
export const IcQueue = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2.2" strokeLinecap="round" {...p}><path d="M8 6h12M8 12h12M8 18h12M4 6h.01M4 12h.01M4 18h.01" /></svg>
);
export const IcCheck = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2.2" strokeLinecap="round" strokeLinejoin="round" {...p}><path d="M20 6L9 17l-5-5" /></svg>
);
export const IcSchedule = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><rect x="3.5" y="4.5" width="17" height="16" rx="2.5" /><path d="M3.5 9h17M8 3v3M16 3v3" /></svg>
);
export const IcWorkspace = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><rect x="3.5" y="3.5" width="7" height="7" rx="1.5" /><rect x="13.5" y="13.5" width="7" height="7" rx="1.5" /><path d="M13.5 7h7M7 13.5v7" /></svg>
);
export const IcWarn = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M12 9v4M12 17h.01" /><path d="M10.3 3.9 1.8 18a2 2 0 0 0 1.7 3h17a2 2 0 0 0 1.7-3L13.7 3.9a2 2 0 0 0-3.4 0Z" /></svg>
);
export const IcSend = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round" {...p}><path d="M5 12h14M13 6l6 6-6 6" /></svg>
);
export const IcHistory = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M3 12a9 9 0 1 0 3-6.7L3 8" /><path d="M3 4v4h4M12 8v4l3 2" /></svg>
);
export const IcDots = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" fill="currentColor" {...p}><circle cx="5" cy="12" r="1.6" /><circle cx="12" cy="12" r="1.6" /><circle cx="19" cy="12" r="1.6" /></svg>
);
export const IcClock = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M12 7v5l3 2" /><circle cx="12" cy="12" r="9" /></svg>
);
export const IcTrash = (p: SVGProps<SVGSVGElement>) => (
<svg viewBox="0 0 24 24" {...stroke} {...p}><path d="M3 6h18M8 6V4h8v2M19 6l-1 14H6L5 6M10 11v5M14 11v5" /></svg>
);
+1 -1
View File
@@ -120,7 +120,7 @@ export { usePalette } from "./palette-context";
// References the CSS variables that next/font/google emits in
// app/layout.tsx. Falls through to system fonts if the variable is
// undefined (e.g. in unit tests with no <body> font class).
export const MOBILE_FONT_SANS = "var(--font-inter), 'Inter', ui-sans-serif, system-ui, sans-serif";
export const MOBILE_FONT_SANS = "var(--font-hanken), 'Hanken Grotesk', ui-sans-serif, system-ui, sans-serif";
export const MOBILE_FONT_MONO = "var(--font-jetbrains), 'JetBrains Mono', ui-monospace, monospace";
// Status keys we surface in the mobile UI. Anything else from the
+55 -13
View File
@@ -15,12 +15,21 @@ import { Spinner } from '@/components/Spinner';
* currently-active org, plus a switcher list when the user belongs to
* multiple orgs.
*
* Data path:
* Data path (SaaS control plane present):
* 1. fetchSession() /cp/auth/me current org_id
* 2. api.get('/cp/orgs') list of all orgs the user belongs to
* 3. Match by id === session.org_id; fall back to host-slug match
* if the session probe loses the race.
*
* Data path (self-host NO control plane):
* /cp/orgs is a control-plane route that does not exist on a self-hosted
* stack, so it 404s. When that probe fails we fall back to the open
* GET /org/identity route (served by the tenant workspace-server in both
* modes) and render a single org card from name + slug + org_id. On a
* fresh self-host only `name` is populated (MOLECULE_ORG_SLUG /
* MOLECULE_ORG_ID are unset) the card omits the empty rows and shows
* no error and no "other organizations" list.
*
* Read-only this tab never mutates. Org creation/switching lives at
* /orgs (the post-signup landing page).
*/
@@ -36,25 +45,50 @@ interface Org {
// for the same defensive unwrap.
type OrgsResponse = Org[] | { orgs?: Org[] };
// GET /org/identity (self-host fallback) — open route on the tenant
// workspace-server. slug/org_id are "" on a fresh self-host.
interface OrgIdentity {
name?: string;
slug?: string;
org_id?: string;
}
export function OrgInfoTab() {
const [orgs, setOrgs] = useState<Org[] | null>(null);
const [session, setSession] = useState<Session | null>(null);
// selfHostOrg is set only when /cp/orgs is unavailable (self-host) and the
// /org/identity fallback yields an org. When non-null we render exactly one
// card from it and never show the "other organizations" list or an error.
const [selfHostOrg, setSelfHostOrg] = useState<Org | null>(null);
const [error, setError] = useState<string | null>(null);
const [loading, setLoading] = useState(true);
useEffect(() => {
let cancelled = false;
(async () => {
const sess = await fetchSession().catch(() => null);
if (cancelled) return;
setSession(sess);
try {
const [sess, body] = await Promise.all([
fetchSession().catch(() => null),
api.get<OrgsResponse>('/cp/orgs'),
]);
const body = await api.get<OrgsResponse>('/cp/orgs');
if (cancelled) return;
setSession(sess);
setOrgs(Array.isArray(body) ? body : body.orgs ?? []);
} catch (e) {
if (!cancelled) setError(e instanceof Error ? e.message : 'Failed to load org info');
} catch {
// /cp/orgs is a control-plane route — absent on a self-hosted stack
// (404 / network error). Fall back to the open /org/identity route on
// the tenant server instead of surfacing a red error banner.
try {
const id = await api.get<OrgIdentity>('/org/identity');
if (cancelled) return;
setSelfHostOrg({
id: id.org_id ?? '',
slug: id.slug ?? '',
name: id.name ?? '',
});
} catch (e2) {
if (!cancelled)
setError(e2 instanceof Error ? e2.message : 'Failed to load org info');
}
} finally {
if (!cancelled) setLoading(false);
}
@@ -66,10 +100,14 @@ export function OrgInfoTab() {
const tenantSlug = getTenantSlug();
const currentOrg =
selfHostOrg ??
orgs?.find((o) => session && o.id === session.org_id) ??
orgs?.find((o) => tenantSlug && o.slug === tenantSlug) ??
null;
const otherOrgs = orgs?.filter((o) => o.id !== currentOrg?.id) ?? [];
// Self-host renders a single org only — no "other organizations" list.
const otherOrgs = selfHostOrg
? []
: orgs?.filter((o) => o.id !== currentOrg?.id) ?? [];
if (loading) {
return (
@@ -127,21 +165,25 @@ export function OrgInfoTab() {
}
function OrgIdentityCard({ org, highlighted }: { org: Org; highlighted?: boolean }) {
// On self-host, slug / UUID may be unconfigured ("") — omit those rows
// gracefully rather than rendering an empty code box.
return (
<div
className={`rounded-lg border p-3 space-y-2 ${
highlighted ? 'border-accent/40 bg-accent-strong/5' : 'border-line/40 bg-surface-card/40'
}`}
data-testid={`org-card-${org.slug}`}
data-testid={`org-card-${org.slug || org.id || 'self-host'}`}
>
<div className="flex items-baseline justify-between gap-2">
<span className="text-[12px] font-medium text-ink truncate">{org.name}</span>
<span className="text-[12px] font-medium text-ink truncate">
{org.name || 'This organization'}
</span>
{org.status && (
<span className="text-[9px] text-ink-mid uppercase tracking-wider shrink-0">{org.status}</span>
)}
</div>
<IdentityRow label="Slug" value={org.slug} />
<IdentityRow label="UUID" value={org.id} mono />
{org.slug && <IdentityRow label="Slug" value={org.slug} />}
{org.id && <IdentityRow label="UUID" value={org.id} mono />}
</div>
);
}
@@ -2,13 +2,9 @@
import { createRef, useCallback, useEffect, useState } from 'react';
import * as Dialog from '@radix-ui/react-dialog';
import * as Tabs from '@radix-ui/react-tabs';
import { useSecretsStore } from '@/stores/secrets-store';
import { useKeyboardShortcut } from '@/hooks/use-keyboard-shortcut';
import { SecretsTab } from './SecretsTab';
import { TokensTab } from './TokensTab';
import { OrgTokensTab } from './OrgTokensTab';
import { OrgInfoTab } from './OrgInfoTab';
import { SettingsTabs } from './SettingsTabs';
import { UnsavedChangesGuard } from './UnsavedChangesGuard';
/** Module-level ref so TopBar's SettingsButton can receive focus back on close. */
@@ -106,38 +102,7 @@ export function SettingsPanel({ workspaceId }: SettingsPanelProps) {
</Dialog.Close>
</div>
<Tabs.Root defaultValue="api-keys">
<Tabs.List className="settings-panel__tabs" aria-label="Settings sections">
<Tabs.Trigger value="api-keys" className="settings-panel__tab">
Secrets
</Tabs.Trigger>
<Tabs.Trigger value="tokens" className="settings-panel__tab">
Workspace Tokens
</Tabs.Trigger>
<Tabs.Trigger value="org-tokens" className="settings-panel__tab">
Org API Keys
</Tabs.Trigger>
<Tabs.Trigger value="org-info" className="settings-panel__tab">
Organization
</Tabs.Trigger>
</Tabs.List>
<Tabs.Content value="api-keys" className="settings-panel__content">
<SecretsTab workspaceId={workspaceId} />
</Tabs.Content>
<Tabs.Content value="tokens" className="settings-panel__content">
<TokensTab workspaceId={workspaceId} />
</Tabs.Content>
<Tabs.Content value="org-tokens" className="settings-panel__content">
<OrgTokensTab />
</Tabs.Content>
<Tabs.Content value="org-info" className="settings-panel__content">
<OrgInfoTab />
</Tabs.Content>
</Tabs.Root>
<SettingsTabs workspaceId={workspaceId} />
<div className="settings-panel__footer">
<span className="settings-panel__shortcut-hint">
@@ -0,0 +1,60 @@
'use client';
import * as Tabs from '@radix-ui/react-tabs';
import { SecretsTab } from './SecretsTab';
import { TokensTab } from './TokensTab';
import { OrgTokensTab } from './OrgTokensTab';
import { OrgInfoTab } from './OrgInfoTab';
interface SettingsTabsProps {
workspaceId: string;
}
/**
* The tabbed body of the workspace settings surface Secrets, Workspace
* Tokens, Org API Keys, Organization.
*
* Extracted from SettingsPanel so the same content can render in two
* places without duplication:
* 1. The right-anchored slide-over drawer (the gear popover) SettingsPanel.
* 2. The concierge Settings view (embedded inline) ConciergeShell.
*
* Pure presentation of the four tabs; all dirty-form / unsaved-guard /
* keyboard-shortcut wiring stays in SettingsPanel where the popover owns it.
*/
export function SettingsTabs({ workspaceId }: SettingsTabsProps) {
return (
<Tabs.Root defaultValue="api-keys">
<Tabs.List className="settings-panel__tabs" aria-label="Settings sections">
<Tabs.Trigger value="api-keys" className="settings-panel__tab">
Secrets
</Tabs.Trigger>
<Tabs.Trigger value="tokens" className="settings-panel__tab">
Workspace Tokens
</Tabs.Trigger>
<Tabs.Trigger value="org-tokens" className="settings-panel__tab">
Org API Keys
</Tabs.Trigger>
<Tabs.Trigger value="org-info" className="settings-panel__tab">
Organization
</Tabs.Trigger>
</Tabs.List>
<Tabs.Content value="api-keys" className="settings-panel__content">
<SecretsTab workspaceId={workspaceId} />
</Tabs.Content>
<Tabs.Content value="tokens" className="settings-panel__content">
<TokensTab workspaceId={workspaceId} />
</Tabs.Content>
<Tabs.Content value="org-tokens" className="settings-panel__content">
<OrgTokensTab />
</Tabs.Content>
<Tabs.Content value="org-info" className="settings-panel__content">
<OrgInfoTab />
</Tabs.Content>
</Tabs.Root>
);
}
@@ -9,7 +9,9 @@
* - Copy button writes the UUID to navigator.clipboard
* - Falls back to host-slug match when session lookup fails
* - Lists other orgs when user belongs to multiple
* - Error banner when /cp/orgs throws
* - Self-host fallback: /cp/orgs 404 /org/identity single-org card (no error)
* - Self-host fallback with only a name (slug/org_id unset) omits empty rows
* - Error banner only when BOTH /cp/orgs AND /org/identity fail
* - Empty/no-match state renders the recovery hint, not a crash
*/
import React from "react";
@@ -180,12 +182,69 @@ describe("OrgInfoTab — fallbacks", () => {
});
});
// ─── Self-host fallback: /cp/orgs absent → /org/identity ─────────────────────
describe("OrgInfoTab — self-host fallback", () => {
it("renders a single org card from /org/identity when /cp/orgs 404s", async () => {
mockFetchSession.mockResolvedValue(null);
mockGet.mockImplementation((path: string) => {
if (path === "/cp/orgs")
return Promise.reject(new Error("API GET /cp/orgs: 404 page not found"));
if (path === "/org/identity")
return Promise.resolve({
name: "Molecule AI",
slug: "molecule-ai",
org_id: "abc-123",
});
return Promise.reject(new Error(`unexpected path ${path}`));
});
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText("Current Organization"));
// Single card from /org/identity — name + slug + UUID, no error banner.
expect(screen.getByText("Molecule AI")).toBeTruthy();
expect(screen.getByText("molecule-ai")).toBeTruthy();
expect(screen.getByText("abc-123")).toBeTruthy();
// No "other organizations" list and no error.
expect(screen.queryByText(/Your other organizations/)).toBeNull();
expect(screen.queryByText(/404/)).toBeNull();
});
it("renders only the name when slug/org_id are unset (fresh self-host)", async () => {
mockFetchSession.mockResolvedValue(null);
mockGet.mockImplementation((path: string) => {
if (path === "/cp/orgs")
return Promise.reject(new Error("API GET /cp/orgs: 404 page not found"));
if (path === "/org/identity")
return Promise.resolve({ name: "Molecule AI", slug: "", org_id: "" });
return Promise.reject(new Error(`unexpected path ${path}`));
});
render(<OrgInfoTab />);
await flush();
await waitFor(() => screen.getByText("Current Organization"));
expect(screen.getByText("Molecule AI")).toBeTruthy();
// Empty slug/UUID rows omitted — no copy buttons rendered.
expect(screen.queryByRole("button", { name: /Copy Slug/i })).toBeNull();
expect(screen.queryByRole("button", { name: /Copy UUID/i })).toBeNull();
});
});
// ─── Error + empty handling ──────────────────────────────────────────────────
describe("OrgInfoTab — error + empty", () => {
it("renders an error banner when /cp/orgs throws", async () => {
it("renders an error banner only when BOTH /cp/orgs and /org/identity fail", async () => {
mockFetchSession.mockResolvedValue(null);
mockGet.mockRejectedValue(new Error("API GET /cp/orgs: 500 boom"));
mockGet.mockImplementation((path: string) => {
if (path === "/cp/orgs")
return Promise.reject(new Error("API GET /cp/orgs: 404 page not found"));
if (path === "/org/identity")
return Promise.reject(new Error("API GET /org/identity: 500 boom"));
return Promise.reject(new Error(`unexpected path ${path}`));
});
render(<OrgInfoTab />);
await flush();
@@ -193,10 +252,14 @@ describe("OrgInfoTab — error + empty", () => {
expect(screen.queryByText("Current Organization")).toBeNull();
});
it("renders the recovery hint when no org matches (no crash)", async () => {
it("renders the recovery hint when /cp/orgs returns an empty list (no crash)", async () => {
mockFetchSession.mockResolvedValue(null);
mockGetTenantSlug.mockReturnValue("");
mockGet.mockResolvedValue([]);
mockGet.mockImplementation((path: string) =>
path === "/cp/orgs"
? Promise.resolve([])
: Promise.reject(new Error(`unexpected path ${path}`)),
);
render(<OrgInfoTab />);
await flush();
+1
View File
@@ -1,4 +1,5 @@
export { SettingsPanel } from './SettingsPanel';
export { SettingsTabs } from './SettingsTabs';
export { SettingsButton } from './SettingsButton';
export { SecretsTab } from './SecretsTab';
export { SecretRow } from './SecretRow';
@@ -3,13 +3,36 @@
import { useEffect, useMemo, useState } from "react";
import { api } from "@/lib/api";
import { runtimeDisplayName } from "@/lib/runtime-names";
import { isSaaSTenant } from "@/lib/tenant";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import type { WorkspaceCompute } from "@/store/socket";
const INSTANCE_TYPES = ["t3.medium", "t3.large", "t3.xlarge", "t3.2xlarge", "m6i.large", "m6i.xlarge", "c6i.xlarge"];
// Machine sizes keyed by cloud provider — an AWS t3.* is meaningless on Hetzner,
// etc. MUST mirror the workspace-server workspaceComputeInstanceAllowlist (which
// mirrors the CP provider configs); the PATCH validation rejects a mismatch 400.
const INSTANCE_TYPES_BY_PROVIDER: Record<string, string[]> = {
aws: ["t3.medium", "t3.large", "t3.xlarge", "t3.2xlarge", "m6i.large", "m6i.xlarge", "c6i.xlarge"],
hetzner: ["cpx11", "cpx21", "cpx31", "cpx41", "cpx51", "cax11", "cax21", "cax31", "cax41"],
gcp: ["e2-small", "e2-medium", "e2-standard-2", "e2-standard-4", "e2-standard-8"],
};
const DEFAULT_INSTANCE_BY_PROVIDER: Record<string, string> = {
aws: "t3.medium", hetzner: "cpx31", gcp: "e2-standard-2",
};
const normalizeProvider = (p?: string): string => (p === "gcp" || p === "hetzner" ? p : "aws");
const instanceTypesForProvider = (p?: string): string[] =>
INSTANCE_TYPES_BY_PROVIDER[normalizeProvider(p)] ?? INSTANCE_TYPES_BY_PROVIDER.aws;
const defaultInstanceForProvider = (p?: string): string =>
DEFAULT_INSTANCE_BY_PROVIDER[normalizeProvider(p)] ?? "t3.medium";
// Editable cloud-provider options (multi-provider RFC) — mirrors CreateWorkspaceDialog.
const CLOUD_PROVIDER_OPTIONS = [
{ value: "aws", label: "AWS (default)" },
{ value: "gcp", label: "GCP" },
{ value: "hetzner", label: "Hetzner" },
];
const RUNTIME_OPTIONS = ["claude-code", "codex", "hermes", "openclaw", "kimi", "kimi-cli", "external"];
const RESOLUTIONS = ["1280x720", "1440x900", "1920x1080", "2560x1440"];
const DEFAULT_HEADLESS_INSTANCE_TYPE = "t3.medium";
const DEFAULT_HEADLESS_ROOT_GB = 30;
type Props = {
@@ -23,6 +46,7 @@ type Props = {
type FormState = {
runtime: string;
provider: string; // cloud backend; editable in SaaS (in-place switch recreates the box)
instanceType: string;
rootGB: string;
displayEnabled: boolean;
@@ -38,16 +62,16 @@ const DATA_PERSISTENCE_OPTIONS = ["", "persist", "ephemeral"];
const dataPersistenceLabel = (v: string): string =>
v === "persist" ? "Always keep (persist)" : v === "ephemeral" ? "Don't keep (ephemeral)" : "Auto";
// Cloud/compute backend display name. The provider is chosen at create time and
// is NOT editable here (changing a workspace's cloud requires a recreate), so
// it renders as a read-only badge — but we must preserve it across Save (the
// compute payload is rebuilt below, and dropping it would wipe the column).
// Cloud/compute backend display name (read-only fallback for non-SaaS / legacy).
const cloudProviderLabel = (v: string | undefined): string =>
v === "gcp" ? "GCP" : v === "hetzner" ? "Hetzner" : "AWS";
export function ContainerConfigTab({ workspaceId, data }: Props) {
// Provider is editable only in SaaS (CP-provisioned boxes). Local/Docker has no
// cloud-provider concept, so we keep the read-only badge there.
const isSaaS = useMemo(() => isSaaSTenant(), []);
const runtime = data.runtime;
const provider = data.compute?.provider; // read-only; set at create time
const provider = data.compute?.provider;
const instanceType = data.compute?.instance_type;
const rootGB = data.compute?.volume?.root_gb;
const displayMode = data.compute?.display?.mode;
@@ -56,8 +80,8 @@ export function ContainerConfigTab({ workspaceId, data }: Props) {
const displayHeight = data.compute?.display?.height;
const dataPersistence = data.compute?.data_persistence;
const initial = useMemo(
() => formFromData({ runtime, instanceType, rootGB, displayMode, displayProtocol, displayWidth, displayHeight, dataPersistence }),
[runtime, instanceType, rootGB, displayMode, displayProtocol, displayWidth, displayHeight, dataPersistence],
() => formFromData({ runtime, provider, instanceType, rootGB, displayMode, displayProtocol, displayWidth, displayHeight, dataPersistence }),
[runtime, provider, instanceType, rootGB, displayMode, displayProtocol, displayWidth, displayHeight, dataPersistence],
);
const [form, setForm] = useState<FormState>(initial);
const [saving, setSaving] = useState(false);
@@ -87,6 +111,21 @@ export function ContainerConfigTab({ workspaceId, data }: Props) {
try {
let applyTemplateOnRestart = data.applyTemplateOnRestart ?? false;
if (dirty) {
// In-place cloud switch is DESTRUCTIVE: changing the provider recreates the
// box on the new cloud (the workspace-server deprovisions the old box on
// its old cloud first, then the restart provisions on the new one). Confirm
// before doing it — the current box and any non-persisted state are lost.
const providerChanged = normalizeProvider(form.provider) !== normalizeProvider(initial.provider);
if (providerChanged && typeof window !== "undefined") {
const ok = window.confirm(
`Switch this workspace to ${cloudProviderLabel(form.provider)}? This RECREATES the box on the new cloud — the current box and any non-persisted state are replaced.`,
);
if (!ok) {
setSaving(false);
return;
}
}
const rootGB = parseInt(form.rootGB, 10);
if (!Number.isFinite(rootGB)) {
setError("Root volume must be a number");
@@ -102,10 +141,11 @@ export function ContainerConfigTab({ workspaceId, data }: Props) {
: { mode: "none" },
// internal#734: omit when "auto" so the wire/default behavior is unchanged.
...(form.dataPersistence ? { data_persistence: form.dataPersistence } : {}),
// Preserve the create-time cloud provider — it's not editable here, but
// this PATCH rebuilds the whole compute object, so omitting it would
// wipe the persisted provider (and mislead the badge after a Save).
...(provider ? { provider } : {}),
// Cloud backend: send the (possibly switched) provider. Omit for the
// default (aws) so a non-switching AWS save keeps the wire unchanged;
// a switch TO aws (omit) vs FROM aws (explicit) both register correctly
// because the workspace-server normalizes ""→aws when diffing.
...(normalizeProvider(form.provider) !== "aws" ? { provider: normalizeProvider(form.provider) } : {}),
};
const resp = await api.patch<{ needs_restart?: boolean }>(`/workspaces/${workspaceId}`, {
@@ -140,15 +180,16 @@ export function ContainerConfigTab({ workspaceId, data }: Props) {
<div className="mb-3 flex items-center justify-between gap-3">
<div className="flex items-center gap-2">
<h3 className="text-sm font-semibold text-ink">Container Config</h3>
{/* Read-only cloud-provider badge which cloud this workspace's box
runs on (AWS/GCP/Hetzner). Defaults to AWS when unset (legacy
rows). Set at create time in the Create Workspace dialog. */}
<span
title="Cloud provider for this workspace's compute (set at create time)"
className="rounded-full border border-line/60 bg-surface-sunken px-2 py-0.5 font-mono text-[10px] uppercase tracking-wide text-ink-mid"
>
{cloudProviderLabel(provider)}
</span>
{/* Non-SaaS (local/Docker) has no cloud-provider concept read-only
badge. In SaaS the provider is an editable selector in the form. */}
{!isSaaS && (
<span
title="Cloud provider for this workspace's compute"
className="rounded-full border border-line/60 bg-surface-sunken px-2 py-0.5 font-mono text-[10px] uppercase tracking-wide text-ink-mid"
>
{cloudProviderLabel(provider)}
</span>
)}
</div>
{data.needsRestart && <span className="text-[11px] text-warm">Restart required</span>}
</div>
@@ -162,11 +203,32 @@ export function ContainerConfigTab({ workspaceId, data }: Props) {
optionLabel={runtimeDisplayName}
onChange={(runtime) => setForm((s) => ({ ...s, runtime }))}
/>
{isSaaS && (
<SelectField
id="cloud-provider"
label="Cloud provider"
value={normalizeProvider(form.provider)}
options={CLOUD_PROVIDER_OPTIONS.map((p) => p.value)}
optionLabel={(v) => CLOUD_PROVIDER_OPTIONS.find((p) => p.value === v)?.label ?? v}
// Switching cloud resets the instance type to the new provider's
// default (an AWS t3.* is invalid on Hetzner, etc.) — also keeps the
// instance-type dropdown below in sync with the provider's sizes.
onChange={(provider) =>
setForm((s) => ({
...s,
provider,
instanceType: instanceTypesForProvider(provider).includes(s.instanceType)
? s.instanceType
: defaultInstanceForProvider(provider),
}))
}
/>
)}
<SelectField
id="instance-type"
label="Instance type"
value={form.instanceType}
options={INSTANCE_TYPES}
options={instanceTypesForProvider(form.provider)}
onChange={(instanceType) => setForm((s) => ({ ...s, instanceType }))}
/>
<label className="grid gap-1" htmlFor="root-volume-gb">
@@ -270,6 +332,7 @@ export function ContainerConfigTab({ workspaceId, data }: Props) {
function formFromData(data: {
runtime?: string;
provider?: string;
instanceType?: string;
rootGB?: number;
displayMode?: string;
@@ -281,9 +344,11 @@ function formFromData(data: {
const width = data.displayWidth ?? 1920;
const height = data.displayHeight ?? 1080;
const resolution = `${width}x${height}`;
const provider = normalizeProvider(data.provider);
return {
runtime: data.runtime || "claude-code",
instanceType: data.instanceType || DEFAULT_HEADLESS_INSTANCE_TYPE,
provider,
instanceType: data.instanceType || defaultInstanceForProvider(provider),
rootGB: String(data.rootGB || DEFAULT_HEADLESS_ROOT_GB),
displayEnabled: !!data.displayMode && data.displayMode !== "none",
displayMode: data.displayMode && data.displayMode !== "none" ? data.displayMode : "desktop-control",
@@ -23,6 +23,13 @@ vi.mock("@/store/canvas", () => ({
),
}));
// SaaS so the editable cloud-provider selector renders (non-SaaS shows a read-only
// badge). Existing tests keep provider=aws (default), which is omitted from the
// PATCH payload, so their assertions are unaffected.
vi.mock("@/lib/tenant", () => ({
isSaaSTenant: () => true,
}));
import { ContainerConfigTab } from "../ContainerConfigTab";
afterEach(() => {
@@ -314,4 +321,67 @@ describe("ContainerConfigTab", () => {
await waitFor(() => expect(restartWorkspace).toHaveBeenCalledWith("ws-compute", { applyTemplate: true }));
expect(apiPatch).not.toHaveBeenCalled();
});
it("switches cloud provider — keys the instance-type list to the provider, confirms the recreate, and PATCHes the new provider", async () => {
const confirmSpy = vi.spyOn(window, "confirm").mockReturnValue(true);
render(
<ContainerConfigTab
workspaceId="ws-switch"
data={{
runtime: "claude-code",
status: "online",
needsRestart: false,
activeTasks: 0,
maxConcurrentTasks: null,
workspaceAccess: "read-write",
deliveryMode: "push",
compute: { instance_type: "t3.large", provider: "aws", volume: { root_gb: 30 } },
}}
/>,
);
const providerSel = screen.getByLabelText("Cloud provider");
expect(providerSel).toHaveProperty("value", "aws");
expect(screen.getByLabelText("Instance type")).toHaveProperty("value", "t3.large");
// Switch to Hetzner → the instance type resets to the Hetzner default (an AWS
// t3.* is invalid on Hetzner) and the options become Hetzner sizes.
fireEvent.change(providerSel, { target: { value: "hetzner" } });
expect(screen.getByLabelText("Instance type")).toHaveProperty("value", "cpx31");
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPatch).toHaveBeenCalledTimes(1));
expect(confirmSpy).toHaveBeenCalled(); // destructive recreate confirmed
const body = apiPatch.mock.calls[0][1] as { compute: { provider?: string; instance_type?: string } };
expect(body.compute.provider).toBe("hetzner");
expect(body.compute.instance_type).toBe("cpx31");
confirmSpy.mockRestore();
});
it("does not treat a non-provider edit as a recreate (no confirm; aws default omitted)", async () => {
const confirmSpy = vi.spyOn(window, "confirm").mockReturnValue(true);
render(
<ContainerConfigTab
workspaceId="ws-noswitch"
data={{
runtime: "claude-code",
status: "online",
needsRestart: false,
activeTasks: 0,
maxConcurrentTasks: null,
workspaceAccess: "read-write",
deliveryMode: "push",
compute: { instance_type: "t3.large", provider: "aws", volume: { root_gb: 30 } },
}}
/>,
);
fireEvent.change(screen.getByLabelText("Root volume"), { target: { value: "60" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPatch).toHaveBeenCalledTimes(1));
expect(confirmSpy).not.toHaveBeenCalled();
const body = apiPatch.mock.calls[0][1] as { compute: { provider?: string } };
expect(body.compute.provider).toBeUndefined(); // aws default omitted (wire unchanged)
confirmSpy.mockRestore();
});
});
@@ -162,6 +162,11 @@ describe("DisplayTab", () => {
controller: "user",
ttl_seconds: 300,
});
// Defensive: the noVNC constructor is async (dynamic import), so wait
// for it to be called before asserting arguments (prevents flake in CI).
await waitFor(() => {
expect(mockRFBConstructor).toHaveBeenCalled();
});
expect(mockRFBConstructor).toHaveBeenCalledWith(
expect.any(HTMLElement),
expect.stringContaining("/workspaces/ws-display/display/session/websockify"),
@@ -197,6 +202,14 @@ describe("DisplayTab", () => {
fireEvent.click(screen.getByRole("button", { name: "Take control" }));
const desktop = await screen.findByTitle("Workspace desktop");
// Wait for the RFB instance to actually connect before pasting. The component
// sets rfbRef.current synchronously right after `new RFB()` (which fires
// mockRFBConstructor) INSIDE the async connect() — but the "Workspace desktop"
// title renders before that await resolves. Firing paste immediately races
// rfbRef.current===null, so the window paste handler's
// `rfbRef.current?.clipboardPasteFrom(text)` no-ops (0 calls). This lost the
// race under CI runner load; waiting for the constructor makes it deterministic.
await waitFor(() => expect(mockRFBConstructor).toHaveBeenCalled());
fireEvent.paste(desktop, {
clipboardData: {
getData: (type: string) => (type === "text/plain" ? "Paste Me" : ""),
@@ -0,0 +1,105 @@
// @vitest-environment jsdom
/** Unit tests for useA2AFlights the eventflight lifecycle that drives the
* envelope animations on the canvas (MessageFlightLayer) and the concierge
* home (MessageFlightHome). useSocketEvent is mocked so we can drive the
* ACTIVITY_LOGGED handler directly. */
import { renderHook, act } from "@testing-library/react";
import { describe, it, expect, vi, beforeEach } from "vitest";
// Capture the handler the hook registers with the socket bus. vi.hoisted is
// required because vi.mock factories are hoisted above normal declarations and
// may only close over hoisted state.
const h = vi.hoisted(() => ({ captured: null as ((msg: unknown) => void) | null }));
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: (cb: (msg: unknown) => void) => {
h.captured = cb;
},
}));
import { useA2AFlights, FLIGHT_DURATION_MS } from "@/hooks/useA2AFlights";
function setReducedMotion(reduce: boolean) {
window.matchMedia = vi.fn().mockImplementation((q: string) => ({
matches: reduce && q.includes("reduce"),
media: q,
onchange: null,
addEventListener: vi.fn(),
removeEventListener: vi.fn(),
addListener: vi.fn(),
removeListener: vi.fn(),
dispatchEvent: vi.fn(),
}));
}
const msg = (payload: Record<string, unknown>, event = "ACTIVITY_LOGGED") => ({
event,
workspace_id: "a",
timestamp: "2026-06-08T00:00:00Z",
payload,
});
const a2aSend = (over: Record<string, unknown> = {}) =>
msg({ activity_type: "a2a_send", source_id: "a", target_id: "b", ...over });
describe("useA2AFlights", () => {
beforeEach(() => {
h.captured = null;
vi.useRealTimers();
setReducedMotion(false);
});
it("emits a flight for an a2a_send between two distinct agents", () => {
const { result } = renderHook(() => useA2AFlights());
act(() => h.captured?.(a2aSend()));
expect(result.current).toHaveLength(1);
expect(result.current[0]).toMatchObject({ sourceId: "a", targetId: "b", kind: "send" });
});
it("maps a2a_receive / task_update to their kinds", () => {
const { result } = renderHook(() => useA2AFlights());
act(() => h.captured?.(a2aSend({ activity_type: "a2a_receive" })));
act(() => h.captured?.(a2aSend({ activity_type: "task_update" })));
const kinds = result.current.map((f) => f.kind);
expect(kinds).toContain("receive");
expect(kinds).toContain("task");
});
it("ignores non-A2A activity and non-ACTIVITY_LOGGED events", () => {
const { result } = renderHook(() => useA2AFlights());
act(() => h.captured?.(msg({ activity_type: "status_change", source_id: "a", target_id: "b" })));
act(() => h.captured?.(a2aSend({}, )));
act(() => h.captured?.({ event: "WORKSPACE_UPDATED", workspace_id: "a", payload: {} }));
expect(result.current.every((f) => f.kind === "send")).toBe(true);
expect(result.current).toHaveLength(1); // only the one valid a2aSend
});
it("skips self-loops and flights with no target", () => {
const { result } = renderHook(() => useA2AFlights());
act(() => h.captured?.(a2aSend({ target_id: "a" }))); // self-loop
act(() => h.captured?.(a2aSend({ target_id: "" }))); // missing target
expect(result.current).toHaveLength(0);
});
it("emits nothing when prefers-reduced-motion is set", () => {
setReducedMotion(true);
const { result } = renderHook(() => useA2AFlights());
act(() => h.captured?.(a2aSend()));
expect(result.current).toHaveLength(0);
});
it("emits nothing when disabled", () => {
const { result } = renderHook(() => useA2AFlights(false));
act(() => h.captured?.(a2aSend()));
expect(result.current).toHaveLength(0);
});
it("expires a flight after the TTL", () => {
vi.useFakeTimers();
const { result } = renderHook(() => useA2AFlights());
act(() => h.captured?.(a2aSend()));
expect(result.current).toHaveLength(1);
act(() => {
vi.advanceTimersByTime(FLIGHT_DURATION_MS + 300);
});
expect(result.current).toHaveLength(0);
});
});
+103
View File
@@ -0,0 +1,103 @@
/** useA2AFlights turns the org's live A2A activity stream into transient
* "flights" (one per delegate / message event, source target) that an
* overlay can animate as an envelope travelling between two agents.
*
* This hook owns ONLY the eventflight lifecycle: it subscribes to the same
* ACTIVITY_LOGGED WS bus the CommunicationOverlay uses, keeps a small bounded
* list of in-flight envelopes, and expires each after the animation window.
* The caller resolves positions and renders the envelope, so the exact same
* flight data drives both the spatial canvas (flow coords) and the concierge
* home (DOM row rects).
*
* Honours `prefers-reduced-motion`: when the user opts out of motion the hook
* emits no flights at all, so no envelope ever animates. */
import { useEffect, useRef, useState } from "react";
import { useSocketEvent } from "@/hooks/useSocketEvent";
export type A2AFlightKind = "send" | "receive" | "task";
export interface A2AFlight {
/** unique per flight instance (not per pair) so a burst renders distinct envelopes */
key: string;
sourceId: string;
targetId: string;
kind: A2AFlightKind;
}
/** Total time an envelope is alive (ms). Kept in sync with the overlay's
* Web-Animations duration; the extra tail gives the fade-out room to finish
* before the element unmounts. */
export const FLIGHT_DURATION_MS = 1200;
const FLIGHT_TTL_MS = FLIGHT_DURATION_MS + 120;
/** Cap concurrent envelopes so a delegation storm can't spawn unbounded DOM. */
const MAX_CONCURRENT = 12;
function reducedMotionNow(): boolean {
return (
typeof window !== "undefined" &&
typeof window.matchMedia === "function" &&
window.matchMedia("(prefers-reduced-motion: reduce)").matches
);
}
export function useA2AFlights(enabled = true): A2AFlight[] {
const [flights, setFlights] = useState<A2AFlight[]>([]);
const reduced = useRef<boolean>(reducedMotionNow());
const timers = useRef<number[]>([]);
// Track reduced-motion preference changes live (a user can toggle it mid-session).
useEffect(() => {
if (typeof window === "undefined" || typeof window.matchMedia !== "function") return;
const mq = window.matchMedia("(prefers-reduced-motion: reduce)");
const onChange = () => {
reduced.current = mq.matches;
if (mq.matches) setFlights([]); // drop any in-flight envelopes immediately
};
mq.addEventListener?.("change", onChange);
return () => mq.removeEventListener?.("change", onChange);
}, []);
// Clear pending expiry timers on unmount.
useEffect(() => {
const t = timers.current;
return () => {
t.forEach((id) => window.clearTimeout(id));
};
}, []);
useSocketEvent((msg) => {
if (!enabled || reduced.current) return;
if (msg.event !== "ACTIVITY_LOGGED") return;
const p = (msg.payload || {}) as {
activity_type?: string;
source_id?: string | null;
target_id?: string | null;
};
const t = p.activity_type;
if (t !== "a2a_send" && t !== "a2a_receive" && t !== "task_update") return;
const sourceId = p.source_id || msg.workspace_id;
const targetId = p.target_id || "";
// A flight needs two distinct endpoints; a self-loop or missing peer has
// nowhere to fly, so skip it.
if (!sourceId || !targetId || sourceId === targetId) return;
const kind: A2AFlightKind =
t === "a2a_receive" ? "receive" : t === "task_update" ? "task" : "send";
const key = `${msg.timestamp || Date.now()}:${sourceId}:${targetId}:${Math.random()
.toString(36)
.slice(2, 8)}`;
setFlights((prev) => [...prev.slice(-(MAX_CONCURRENT - 1)), { key, sourceId, targetId, kind }]);
const id = window.setTimeout(() => {
setFlights((prev) => prev.filter((f) => f.key !== key));
timers.current = timers.current.filter((x) => x !== id);
}, FLIGHT_TTL_MS);
timers.current.push(id);
});
return flights;
}
+10 -7
View File
@@ -1,5 +1,5 @@
import type { Secret } from '@/types/secrets';
import { getTenantSlug } from '../tenant';
import { platformAuthHeaders } from '@/lib/api';
const PLATFORM_URL = process.env.NEXT_PUBLIC_PLATFORM_URL ?? 'http://localhost:8080';
@@ -13,16 +13,19 @@ function apiUrl(workspaceId: string, path = ''): string {
}
async function request<T>(url: string, init?: RequestInit): Promise<T> {
// Match api.ts shape — slug header + cross-origin credentials so SaaS
// cross-subdomain fetches work. See lib/api.ts for the rationale.
const slug = getTenantSlug();
const saasHeaders: Record<string, string> = { 'Content-Type': 'application/json' };
if (slug) saasHeaders['X-Molecule-Org-Slug'] = slug;
// Auth pair (admin/org Bearer token + tenant slug) + JSON Content-Type come
// from the shared `platformAuthHeaders()` helper. This bespoke fetch
// previously hand-rolled only the slug + Content-Type and OMITTED the
// Authorization bearer — so against a workspace-server with ADMIN_TOKEN set
// (local dev, every SaaS tenant), WorkspaceAuth saw no bearer and no verified
// CP session and returned 401 "missing workspace auth token". That's exactly
// the #178 raw-fetch-forgets-a-header bug shape the helper exists to prevent.
const res = await fetch(url, {
...init,
credentials: 'include',
headers: {
...saasHeaders,
'Content-Type': 'application/json',
...platformAuthHeaders(),
...init?.headers,
},
});
+17
View File
@@ -0,0 +1,17 @@
/** Canonical workspace `kind` values the TS mirror of Go's models.Kind*
* constants (`models.KindPlatform` / `models.KindWorkspace`).
*
* Single source of truth for the `kind` magic strings used across the canvas
* (topology, map strip, toolbar, concierge shell). Kept in a leaf module so
* both `@/store/canvas` and `@/store/canvas-topology` can import it without a
* circular dependency. `WorkspaceNodeData.kind` stays a plain `string` these
* are the well-known values to compare against, not an exhaustive enum.
*
* - `Platform` = the org-level concierge (the undeletable org root, hidden
* from the map graph, surfaced as the shell's org root).
* - `Workspace` = an ordinary agent. Also the fallback for older ws-server
* builds that predate the `kind` column. */
export const WORKSPACE_KIND = {
Platform: "platform",
Workspace: "workspace",
} as const;
@@ -11,7 +11,25 @@ import {
childSlotInGrid,
parentMinSize,
parentMinSizeFromChildren,
CHILD_DEFAULT_WIDTH,
CHILD_DEFAULT_HEIGHT,
CHILD_GUTTER,
PARENT_SIDE_PADDING,
PARENT_HEADER_PADDING,
PARENT_BOTTOM_PADDING,
stripPlatformRootForMap,
} from "../canvas-topology";
import { WORKSPACE_KIND } from "../../lib/workspace-kind";
// Layout-math aliases so these assertions track the card-size constants
// instead of hard-coding pixel values (which drift when the card size
// changes — e.g. the 240×130 → 300×176 "bigger cards" redesign).
const W = CHILD_DEFAULT_WIDTH;
const H = CHILD_DEFAULT_HEIGHT;
const GUT = CHILD_GUTTER;
const SIDE = PARENT_SIDE_PADDING;
const HEAD = PARENT_HEADER_PADDING;
const BOTTOM = PARENT_BOTTOM_PADDING;
// ─── sortParentsBeforeChildren ─────────────────────────────────────────────────
@@ -115,34 +133,34 @@ describe("sortParentsBeforeChildren", () => {
// ─── defaultChildSlot ─────────────────────────────────────────────────────────
describe("defaultChildSlot — 2-column grid (240×130 cards)", () => {
describe("defaultChildSlot — 2-column grid", () => {
it("slot 0 → column 0, row 0", () => {
const s = defaultChildSlot(0);
expect(s).toEqual({ x: 16, y: 130 });
expect(s).toEqual({ x: SIDE, y: HEAD });
});
it("slot 1 → column 1, row 0", () => {
const s = defaultChildSlot(1);
expect(s.x).toBe(16 + 240 + 14); // PARENT_SIDE_PADDING + CHILD_DEFAULT_WIDTH + CHILD_GUTTER
expect(s.y).toBe(130);
expect(s.x).toBe(SIDE + W + GUT); // PARENT_SIDE_PADDING + CHILD_DEFAULT_WIDTH + CHILD_GUTTER
expect(s.y).toBe(HEAD);
});
it("slot 2 → column 0, row 1", () => {
const s = defaultChildSlot(2);
expect(s.x).toBe(16);
expect(s.y).toBe(130 + 130 + 14); // row 0 height + gutter
expect(s.x).toBe(SIDE);
expect(s.y).toBe(HEAD + H + GUT); // row 0 height + gutter
});
it("slot 3 → column 1, row 1", () => {
const s = defaultChildSlot(3);
expect(s.x).toBe(16 + 240 + 14);
expect(s.y).toBe(130 + 130 + 14);
expect(s.x).toBe(SIDE + W + GUT);
expect(s.y).toBe(HEAD + H + GUT);
});
it("slot 4 → column 0, row 2", () => {
const s = defaultChildSlot(4);
expect(s.x).toBe(16);
expect(s.y).toBe(130 + (130 + 14) * 2); // row 1 end + gutter
expect(s.x).toBe(SIDE);
expect(s.y).toBe(HEAD + (H + GUT) * 2); // row 1 end + gutter
});
});
@@ -194,36 +212,35 @@ describe("parentMinSize — uniform-size children", () => {
it("1 child → 1 col, 1 row", () => {
const s = parentMinSize(1);
// width = 16*2 + 1*240 + 0 = 272; height = 130 + 1*130 + 0 + 16 = 276
expect(s.width).toBe(16 * 2 + 240);
expect(s.height).toBe(130 + 130 + 16);
// width = SIDE*2 + 1*W; height = HEAD + 1*H + BOTTOM
expect(s.width).toBe(SIDE * 2 + W);
expect(s.height).toBe(HEAD + H + BOTTOM);
});
it("2 children → 2 cols, 1 row", () => {
const s = parentMinSize(2);
// width = 16*2 + 2*240 + 1*14 = 526; height = 130 + 1*130 + 0 + 16 = 276
expect(s.width).toBe(16 * 2 + 2 * 240 + 14);
expect(s.height).toBe(130 + 130 + 16);
// width = SIDE*2 + 2*W + 1*GUT; height = HEAD + 1*H + BOTTOM
expect(s.width).toBe(SIDE * 2 + 2 * W + GUT);
expect(s.height).toBe(HEAD + H + BOTTOM);
});
it("3 children → 2 cols, 2 rows", () => {
const s = parentMinSize(3);
// width = 16*2 + 2*240 + 1*14 = 526
expect(s.width).toBe(16 * 2 + 2 * 240 + 14);
// height = 130 + 2*130 + 1*14 + 16 = 416
expect(s.height).toBe(130 + 2 * 130 + 14 + 16);
expect(s.width).toBe(SIDE * 2 + 2 * W + GUT);
// height = HEAD + 2*H + 1*GUT + BOTTOM
expect(s.height).toBe(HEAD + 2 * H + GUT + BOTTOM);
});
it("4 children → 2 cols, 2 rows (full grid)", () => {
const s = parentMinSize(4);
expect(s.width).toBe(16 * 2 + 2 * 240 + 14);
expect(s.height).toBe(130 + 2 * 130 + 14 + 16);
expect(s.width).toBe(SIDE * 2 + 2 * W + GUT);
expect(s.height).toBe(HEAD + 2 * H + GUT + BOTTOM);
});
it("5 children → 2 cols, 3 rows", () => {
const s = parentMinSize(5);
expect(s.width).toBe(16 * 2 + 2 * 240 + 14);
expect(s.height).toBe(130 + 3 * 130 + 2 * 14 + 16);
expect(s.width).toBe(SIDE * 2 + 2 * W + GUT);
expect(s.height).toBe(HEAD + 3 * H + 2 * GUT + BOTTOM);
});
});
@@ -243,8 +260,8 @@ describe("parentMinSizeFromChildren — variable-size children", () => {
it("two equal-width children → same as parentMinSize(2)", () => {
const fromChildren = parentMinSizeFromChildren([
{ width: 240, height: 130 },
{ width: 240, height: 130 },
{ width: W, height: H },
{ width: W, height: H },
]);
expect(fromChildren.width).toBe(parentMinSize(2).width);
expect(fromChildren.height).toBe(parentMinSize(2).height);
@@ -262,3 +279,74 @@ describe("parentMinSizeFromChildren — variable-size children", () => {
expect(wide.width).toBeGreaterThan(narrow.width);
});
});
// ─── stripPlatformRootForMap ───────────────────────────────────────────────────
describe("stripPlatformRootForMap", () => {
// Minimal Node<WorkspaceNodeData> builder — only the fields the function reads.
const node = (
id: string,
opts: { kind?: string; parentId?: string; x?: number; y?: number } = {},
// eslint-disable-next-line @typescript-eslint/no-explicit-any
): any => ({
id,
position: { x: opts.x ?? 0, y: opts.y ?? 0 },
parentId: opts.parentId,
data: { kind: opts.kind ?? WORKSPACE_KIND.Workspace, parentId: opts.parentId ?? null },
});
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const edge = (source: string, target: string): any => ({ id: `${source}->${target}`, source, target });
it("returns input unchanged when there is no platform node", () => {
const nodes = [node("a"), node("b", { parentId: "a", x: 5, y: 5 })];
const edges = [edge("a", "b")];
const out = stripPlatformRootForMap(nodes, edges);
expect(out.nodes).toBe(nodes); // same reference — no work done
expect(out.edges).toBe(edges);
});
it("removes the platform root, promotes its direct children to absolute positions, and drops platform-touching edges", () => {
const platform = node("P", { kind: WORKSPACE_KIND.Platform, x: 100, y: 50 });
const child = node("c", { parentId: "P", x: 10, y: 20 }); // RF-relative to P
const grandchild = node("g", { parentId: "c", x: 5, y: 5 });
const out = stripPlatformRootForMap(
[platform, child, grandchild],
[edge("P", "c"), edge("c", "g")],
);
// Platform node is gone.
expect(out.nodes.find((n) => n.id === "P")).toBeUndefined();
// Direct child promoted to top-level with absolute position (parentPos + childPos).
const c = out.nodes.find((n) => n.id === "c")!;
expect(c.parentId).toBeUndefined();
expect(c.extent).toBeUndefined();
expect(c.position).toEqual({ x: 110, y: 70 });
expect(c.data.parentId).toBeNull();
// Grandchild (child of a non-platform node) is untouched.
const g = out.nodes.find((n) => n.id === "g")!;
expect(g.parentId).toBe("c");
expect(g.position).toEqual({ x: 5, y: 5 });
// Edge touching the platform node dropped; the other preserved.
expect(out.edges.map((e) => e.id)).toEqual(["c->g"]);
});
it("leaves children of an ordinary (non-platform) parent untouched", () => {
const platform = node("P", { kind: WORKSPACE_KIND.Platform });
const ordinaryParent = node("op", { parentId: "P", x: 200, y: 0 });
const grandchild = node("gc", { parentId: "op", x: 7, y: 9 });
const out = stripPlatformRootForMap([platform, ordinaryParent, grandchild], []);
// op is a direct child of platform → promoted (absolute = 200+0, 0+0).
const op = out.nodes.find((n) => n.id === "op")!;
expect(op.parentId).toBeUndefined();
expect(op.position).toEqual({ x: 200, y: 0 });
// gc's parent is the ordinary node, not platform → relative position preserved.
const gc = out.nodes.find((n) => n.id === "gc")!;
expect(gc.parentId).toBe("op");
expect(gc.position).toEqual({ x: 7, y: 9 });
});
});
+21
View File
@@ -162,6 +162,27 @@ describe("hydrate", () => {
useCanvasStore.getState().hydrate([ws]);
expect(useCanvasStore.getState().nodes[0].data.currentTask).toBe("");
});
it("preserves in-flight turn status after refresh (issue #2391)", () => {
// Simulates a page refresh: the canvas re-hydrates from GET /workspaces
// while the agent has an active in-flight turn. The store must reflect
// "working" immediately — no dependence on a subsequent TASK_UPDATED
// socket event. This prevents the "stuck idle" UX after reload.
const ws = makeWS({
id: "ws-1",
status: "online",
current_task: "Analyzing data",
active_tasks: 2,
});
useCanvasStore.getState().hydrate([ws]);
const node = useCanvasStore.getState().nodes[0];
expect(node.data.currentTask).toBe("Analyzing data");
expect(node.data.activeTasks).toBe(2);
expect(node.data.status).toBe("online");
// Defensive: the node must be considered "working" for any UI that
// gates on currentTask (e.g. ChatTab thinking indicator).
expect(!!node.data.currentTask).toBe(true);
});
});
describe("summarizeWorkspaceCapabilities", () => {
+66 -11
View File
@@ -1,6 +1,7 @@
import type { Node, Edge } from "@xyflow/react";
import type { WorkspaceData } from "./socket";
import type { WorkspaceNodeData } from "./canvas";
import { WORKSPACE_KIND } from "@/lib/workspace-kind";
const H_SPACING = 320;
const V_SPACING = 200;
@@ -51,13 +52,13 @@ export function sortParentsBeforeChildren<T extends { id: string; parentId?: str
}
// Grid-slot defaults for children laid under a parent. The card
// component (WorkspaceNode.tsx) sets `max-w-[240px]` on leaves, so a
// slot stride of CHILD_DEFAULT_WIDTH + CHILD_GUTTER guarantees cards
// never bleed into their neighbour's slot. Keep these in sync with
// the Go mirror in workspace-server/internal/handlers/org.go
// changing one without the other leads to import-time / runtime drift.
export const CHILD_DEFAULT_WIDTH = 240;
export const CHILD_DEFAULT_HEIGHT = 130;
// component (WorkspaceNode.tsx) renders leaves at exactly w-[300px] /
// min-h-[176px], so a slot stride of CHILD_DEFAULT_WIDTH + CHILD_GUTTER
// guarantees cards never bleed into their neighbour's slot. Keep these
// in sync with the Go mirror in workspace-server/internal/handlers/org.go
// changing one without the other leads to import-time / runtime drift.
export const CHILD_DEFAULT_WIDTH = 300;
export const CHILD_DEFAULT_HEIGHT = 176;
// Parent header space — reserves room above the child grid so the
// parent's own name + runtime pill + clamped role + currentTask
// banner aren't covered by the first row of child cards. The
@@ -529,6 +530,10 @@ export function buildNodesAndEdges(
// — leave undefined so the chat UI's "?? 'push'" fallback applies.
deliveryMode: ws.delivery_mode,
compute: ws.compute,
// Org-level platform agent ('platform') vs ordinary workspace. The map
// view hides the platform root (it's the undeletable org anchor) via
// stripPlatformRootForMap; the shell home tree keeps it as ROOT.
kind: ws.kind ?? WORKSPACE_KIND.Workspace,
},
};
if (hasParent) {
@@ -553,10 +558,10 @@ export function buildNodesAndEdges(
// - Collapsed parents: leaf-sized (header-only card).
// - Leaves: leaf-sized — they land in their grid slot cleanly.
//
// NodeResizer still drives user-initiated growth at runtime; these
// are only the initial values, and React Flow updates them in place
// when the user drags a resize handle. A future hydrate() will
// reset to the default until we persist width/height server-side.
// Sizes are fully system-controlled (free-resize was removed): these
// initial values stand, and at runtime React Flow re-measures leaves
// from their fixed-size card CSS while parents grow to fit children
// (growParentsToFitChildren). Width/height are never persisted.
const kids = childCounts.get(ws.id) ?? 0;
if (kids > 0 && !ws.collapsed) {
const size = parentSize.get(ws.id)!;
@@ -625,3 +630,53 @@ export function getConfigurationError(
const raw = agentCard.configuration_error;
return typeof raw === "string" && raw.length > 0 ? raw : null;
}
/**
* Map-view filter: removes the org-level platform agent (the concierge) from
* the node graph. The platform agent is the undeletable org ROOT every other
* workspace hangs under it so it is surfaced as the shell's org anchor
* (topbar + Home tree), NOT as a draggable/deletable map node.
*
* Its direct children are promoted to top-level: React Flow stores child
* positions RELATIVE to the parent, so when the parent is dropped each child is
* converted back to an absolute position (parent.position + child.position) and
* its parent binding cleared. Edges touching the platform node are dropped.
*
* The store keeps the full node set (the shell's Home agent tree renders the
* platform as ROOT); only the map's React Flow input is stripped.
*/
export function stripPlatformRootForMap(
nodes: Node<WorkspaceNodeData>[],
edges: Edge[],
): { nodes: Node<WorkspaceNodeData>[]; edges: Edge[] } {
const platformIds = new Set(
nodes.filter((n) => n.data.kind === WORKSPACE_KIND.Platform).map((n) => n.id),
);
if (platformIds.size === 0) return { nodes, edges };
const posById = new Map(nodes.map((n) => [n.id, n.position]));
const outNodes = nodes
.filter((n) => !platformIds.has(n.id))
.map((n) => {
const pid = n.parentId;
if (pid && platformIds.has(pid)) {
const parentPos = posById.get(pid) ?? { x: 0, y: 0 };
return {
...n,
parentId: undefined,
extent: undefined,
position: {
x: parentPos.x + n.position.x,
y: parentPos.y + n.position.y,
},
data: { ...n.data, parentId: null },
} as Node<WorkspaceNodeData>;
}
return n;
});
const outEdges = edges.filter(
(e) => !platformIds.has(e.source) && !platformIds.has(e.target),
);
return { nodes: outNodes, edges: outEdges };
}
+26 -4
View File
@@ -25,8 +25,8 @@ import {
/**
* Walk every parent node and bump its width/height (if explicitly set)
* so the union of its children's relative bboxes plus padding fits. A
* parent's size never shrinks via this path only grows because
* shrinking on resize would fight the user's own NodeResizer drag.
* parent's size never shrinks via this path only grows so a parent
* that expanded to fit children stays expanded as their layout settles.
*/
function growParentsToFitChildren<T extends Record<string, unknown>>(
nodes: Node<T>[],
@@ -74,6 +74,12 @@ function growParentsToFitChildren<T extends Record<string, unknown>>(
export { summarizeWorkspaceCapabilities } from "./canvas-capabilities";
export type { WorkspaceCapabilitySummary } from "./canvas-capabilities";
/** Canonical workspace `kind` values the TS mirror of Go's models.Kind*
* constants. Defined in a leaf module (`@/lib/workspace-kind`) and re-exported
* here for convenience so consumers can keep importing from `@/store/canvas`.
* Use these instead of the bare "platform"/"workspace" string literals. */
export { WORKSPACE_KIND } from "@/lib/workspace-kind";
export interface WorkspaceNodeData extends Record<string, unknown> {
name: string;
status: string;
@@ -86,6 +92,10 @@ export interface WorkspaceNodeData extends Record<string, unknown> {
lastSampleError: string;
url: string;
parentId: string | null;
/** 'platform' = the org concierge (hidden from the map graph, surfaced as the
* shell's org root); 'workspace' = ordinary agent. Optional: absent on older
* ws-server builds / some event-constructed nodes treat absent as ordinary. */
kind?: string;
currentTask: string;
runtime: string;
workspaceAccess?: string | null;
@@ -142,6 +152,12 @@ export interface WorkspaceNodeData extends Record<string, unknown> {
export type PanelTab = "details" | "skills" | "chat" | "terminal" | "display" | "container-config" | "config" | "schedule" | "channels" | "files" | "memory" | "traces" | "events" | "activity" | "audit";
/**
* Top-level canvas view. "home" is the Org Concierge view (chat with the
* platform agent); "map" is the node-graph canvas (the original view).
*/
export type TopView = "home" | "map" | "settings";
export interface ContextMenuState {
x: number;
y: number;
@@ -154,6 +170,8 @@ interface CanvasState {
edges: Edge[];
selectedNodeId: string | null;
panelTab: PanelTab;
/** Top-level view: Org Concierge home (chat) vs the node-graph map. */
topView: TopView;
dragOverNodeId: string | null;
contextMenu: ContextMenuState | null;
// Live width of the SidePanel in pixels. Only meaningful when
@@ -174,6 +192,7 @@ interface CanvasState {
savePosition: (nodeId: string, x: number, y: number) => void;
selectNode: (id: string | null) => void;
setPanelTab: (tab: PanelTab) => void;
setTopView: (view: TopView) => void;
getSelectedNode: () => Node<WorkspaceNodeData> | null;
updateNodeData: (id: string, data: Partial<WorkspaceNodeData>) => void;
restartWorkspace: (id: string, options?: { applyTemplate?: boolean }) => Promise<void>;
@@ -283,6 +302,7 @@ export const useCanvasStore = create<CanvasState>((set, get) => ({
edges: [],
selectedNodeId: null,
panelTab: "chat",
topView: "home",
dragOverNodeId: null,
contextMenu: null,
sidePanelWidth: 480, // matches SIDEPANEL_DEFAULT_WIDTH in SidePanel.tsx
@@ -418,6 +438,7 @@ export const useCanvasStore = create<CanvasState>((set, get) => ({
}
},
setPanelTab: (tab) => set({ panelTab: tab }),
setTopView: (view) => set({ topView: view }),
setDragOverNode: (id) => set({ dragOverNodeId: id }),
batchNest: async (nodeIds, targetId) => {
@@ -951,8 +972,9 @@ export const useCanvasStore = create<CanvasState>((set, get) => ({
// response to the child near its edge, the child's relative
// position becomes valid again and the grow stops mid-drag, only to
// resume on the next tick. Commit-on-release: only run grow when a
// change set contains a `dimensions` change (NodeResizer commit),
// not on pure `position` changes. Drag-stop grow is handled
// change set contains a `dimensions` change (React Flow's auto-measure
// of a card's fixed-size CSS), not on pure `position` changes. Drag-stop
// grow is handled
// explicitly in Canvas.onNodeDragStop via growOnce().
const hasDimensionChange = changes.some((c) => c.type === "dimensions");
set({ nodes: hasDimensionChange ? growParentsToFitChildren(next) : next });
+5
View File
@@ -319,6 +319,11 @@ export interface WorkspaceData {
agent_card: Record<string, unknown> | null;
url: string;
parent_id: string | null;
/** Workspace kind: 'platform' = the org-level concierge (the undeletable org
* root, hidden from the map graph); 'workspace' = an ordinary agent. Absent
* on older ws-server builds that predate the kind column treat as
* 'workspace'. (migration 20260606000000_workspaces_kind) */
kind?: string;
active_tasks: number;
max_concurrent_tasks?: number | null;
last_error_rate: number;
+39 -1
View File
@@ -69,10 +69,43 @@ services:
# Override to "production" for SaaS/staged deploys; in those modes
# ADMIN_TOKEN must also be set or every request rejects.
MOLECULE_ENV: "${MOLECULE_ENV:-development}"
# Self-hosted: no control plane to install the org's platform agent
# (concierge), so the tenant server seeds it on boot. Idempotent; unset it
# if you don't want the auto-seeded Org Concierge root.
MOLECULE_SEED_PLATFORM_AGENT: "${MOLECULE_SEED_PLATFORM_AGENT:-true}"
# Org display name. Drives the platform-agent name ("<MOLECULE_ORG_NAME>
# Agent", e.g. "Molecule AI Agent") and the canvas topbar (via the open
# GET /org/identity route). Empty → legacy "Org Concierge" + no topbar name.
MOLECULE_ORG_NAME: "${MOLECULE_ORG_NAME:-Molecule AI}"
CORS_ORIGINS: ${CORS_ORIGINS:-http://localhost:${CANVAS_PUBLISH_PORT:-3000},http://127.0.0.1:${CANVAS_PUBLISH_PORT:-3000},http://localhost:3001}
RATE_LIMIT: "${RATE_LIMIT:-1000}"
CONFIGS_DIR: /configs
# Runtime/template SSOT parity with production. The image bakes the FULL
# template set (claude-code-default, codex, google-adk, hermes, openclaw,
# seo-agent) at /workspace-configs-templates, but the ./workspace-configs-
# templates:/configs mount below only carries claude-code-default on the
# host — so without this, GET /templates (the runtime-picker SSOT) listed
# only claude-code locally while production lists them all. Pointing the
# template cache-dir at the baked bundle makes the local runtime LIST match
# production. NOTE: the local Docker provisioner bind-mounts a template
# from CONFIGS_HOST_DIR (host path) at provision time, and the host dir
# only has claude-code-default — so the other runtimes are SELECTABLE but
# only claude-code is PROVISIONABLE locally (their images + host templates
# aren't present in this lightweight dev stack). Real provisioning of the
# other runtimes is covered by the staging e2e, which carries all images.
TEMPLATE_CACHE_DIR: "${TEMPLATE_CACHE_DIR:-/workspace-configs-templates}"
CONFIGS_HOST_DIR: "${CONFIGS_HOST_DIR:-${PWD}/workspace-configs-templates}"
# ORG-TEMPLATE SSOT parity — same shadowing fix as TEMPLATE_CACHE_DIR
# above, for ORG templates (the Home page's ORG TEMPLATES section). The
# image bakes the default org templates (molecule-dev,
# molecule-worker-gemini, ux-ab-lab) at /org-templates. Previously the
# `./org-templates:/org-templates:ro` mount bind-mounted an EMPTY host dir
# over that exact path, shadowing the baked defaults — so the Home page
# showed "No org templates in org-templates/" locally while production
# listed all three. The shadowing mount is removed below; this env points
# findOrgDir() at the baked bundle so the local listing matches production.
# Override to a populated host dir to develop your own org templates.
ORG_TEMPLATES_DIR: "${ORG_TEMPLATES_DIR:-/org-templates}"
PLUGINS_HOST_DIR: "${PLUGINS_HOST_DIR:-${PWD}/plugins}"
# github-app-auth plugin — injects GITHUB_TOKEN / GH_TOKEN into every
# workspace env from the App installation token. Remap the host-side
@@ -125,7 +158,12 @@ services:
IMAGE_AUTO_REFRESH: "${IMAGE_AUTO_REFRESH:-true}"
volumes:
- ./workspace-configs-templates:/configs
- ./org-templates:/org-templates:ro
# NOTE: the empty host ./org-templates is intentionally NOT mounted over
# the baked /org-templates — that shadowed the image's default org
# templates and made the Home page show "No org templates". The platform
# reads org templates from ORG_TEMPLATES_DIR (set to the baked
# /org-templates above). To develop custom org templates, mount a
# POPULATED host dir at a different path and point ORG_TEMPLATES_DIR at it.
- ./plugins:/plugins:ro
- /var/run/docker.sock:/var/run/docker.sock
# App private key — read-only bind-mount. The host-side path is
+109
View File
@@ -0,0 +1,109 @@
# RFC: User Tasks — agent→user action requests
**Status:** Draft — pre-implementation design SSOT. New primitive; normally
needs CTO sign-off before merge (authorized in-session by the CTO for the
concierge build).
**Author:** core-devops (canvas concierge work)
**Related:** RFC #2360 (platform agent / Org Concierge), PR #2385 (canvas redesign)
## Problem
The Org Concierge home has a **Tasks** tab. "Tasks" is meant to be **things an
agent asks the *user* to do** — e.g. "Review the launch draft", "Provide the
Stripe API key", "Confirm the publish date". Today there is **no backend** for
this: the only agent→user mechanisms are
- **Approvals** (`approval_requests`) — sign-off for *destructive* ops only, and
- **`send_message_to_user` / `notify_user`** — unstructured chat messages with no
state (you can't mark them done, and they don't form a worklist).
So the Tasks tab had to be wired to **schedules** as an interim stand-in, which
is the wrong concept.
## Design
A small structured primitive that mirrors the **approvals** subsystem (same
shape, minus the destructive-gating semantics).
### Data — `user_tasks`
```sql
CREATE TABLE user_tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workspace_id UUID NOT NULL, -- the agent that raised the ask
title TEXT NOT NULL, -- the ask, one line
detail TEXT, -- optional longer context
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending','done','dismissed')),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
resolved_at TIMESTAMPTZ,
resolved_by TEXT
);
CREATE INDEX idx_user_tasks_pending ON user_tasks (status, created_at DESC);
```
### Endpoints (mirror `approvals`)
| Method + path | Auth | Purpose |
|---|---|---|
| `POST /workspaces/:id/user-tasks` | WorkspaceAuth | Agent raises an ask `{title, detail?}``201 {user_task_id, status:"pending"}` |
| `GET /workspaces/:id/user-tasks` | WorkspaceAuth | A workspace **reads its own** tasks (any status) |
| `PATCH /workspaces/:id/user-tasks/:taskId` | WorkspaceAuth | A workspace **updates its own** task `{title?, detail?, status?}` (scoped by `workspace_id`) |
| `DELETE /workspaces/:id/user-tasks/:taskId` | WorkspaceAuth | A workspace **deletes its own** task (scoped by `workspace_id`) |
| `GET /user-tasks/pending` | AdminAuth | Cross-workspace pending list for the concierge Tasks tab → `[{id, workspace_id, workspace_name, title, detail, status, created_at}]` |
| `POST /workspaces/:id/user-tasks/:taskId/resolve` | WorkspaceAuth | User marks `{status:"done"|"dismissed", resolved_by?}``200` |
**Any** workspace (not just the platform agent) can create and manage its own
tasks; the `:id` workspace scope on update/delete means an agent can only touch
tasks it raised. The Home Tasks list (`/user-tasks/pending`) is org-wide, so
every workspace's asks surface in one place for the user.
`/user-tasks/pending` is AdminAuth + cross-workspace exactly like
`/approvals/pending` (an unauthenticated caller must not enumerate every org's
asks).
### MCP tool — `request_user_action`
Added to the **in-workspace `a2a` MCP** (same place as `send_message_to_user`)
so every agent can raise an ask:
```
request_user_action(title, detail?) → raise an ask (insert + USER_TASK_REQUESTED)
list_user_tasks() → read the asks this workspace raised + status
update_user_task(user_task_id, title?, detail?, status?) → edit own task
delete_user_task(user_task_id) → delete own task
```
So every agent (any workspace, via MCP) can create AND manage its own asks —
`request_user_action` is the create; `list_/update_/delete_user_task` are the
read/update/delete, all scoped to tasks the calling workspace raised. None are
gated behind `MOLECULE_MCP_ALLOW_SEND_MESSAGE` (that gate is specific to
`send_message_to_user`); raising/managing an ask is always allowed.
### Events
`USER_TASK_REQUESTED`, `USER_TASK_RESOLVED` — broadcast on the existing
Broadcaster so the canvas updates live (same pattern as `APPROVAL_*`).
### Canvas wiring (PR #2385)
The concierge **Tasks** tab fetches `GET /user-tasks/pending`, renders each as a
task card (title + detail + originating agent), with **Done** / **Dismiss**
buttons calling the resolve endpoint. The tab count badge reflects the pending
count. Replaces the interim schedules wiring.
## SSOT discipline / non-goals
- Reuses the approvals pattern, Broadcaster, and WorkspaceAuth/AdminAuth split —
no new auth path, no new event bus.
- **Not** an approval/gate: resolving a user-task has no server-side enforcement
effect; it's a worklist signal. (Destructive gating stays in `approvals`.)
- No `org_id` column; cross-workspace listing joins `workspaces` like approvals.
## Rollout
Phase 0 migration ships idempotently (`IF NOT EXISTS`). Backend + MCP tool +
canvas wiring land together behind the concierge Home (already gated as the new
UI). Full molecule-core SOP gate applies (tier label + qa-review +
security-review + green CI).
+2 -2
View File
@@ -278,7 +278,7 @@ receive **HTTP 401** on every API call. Affected workflows in molecule-core:
| Workflow | Symptom | Workaround |
|---|---|---|
| `gate-check-v3.yml` | Reports BLOCKED on every PR | Provision `SOP_TIER_CHECK_TOKEN`; update workflow to use it |
| `gate-check-v3.yml` | Reports BLOCKED on every PR | Provision `SOP_CHECKLIST_GATE_TOKEN`; update workflow to use it |
| `qa-review.yml` | Fails immediately on PR open | Same — needs named secret |
| `security-review.yml` | Fails immediately on PR open | Same — needs named secret |
@@ -313,7 +313,7 @@ dispatcher may fire **only 1 of N eligible workflows** on the initial
This was observed on molecule-core PR #558 (created 2026-05-11T19:54:10Z):
12+ workflows had no `paths:` filter and should have fired, but only
`sop-tier-check.yml` dispatched.
`sop-checklist.yml` dispatched.
Concurrent PRs created within the same minute received 1230 dispatches each,
confirming this is specific to the PR-create event dispatch, not a general
+10
View File
@@ -229,6 +229,11 @@ ssm_refresh_ecr_auth() {
# to guarantee correct string escaping (OFFSEC-001 / CWE-78 hardening).
# Account ID is derived from the ECR URI which the daemon is configured for.
local acct="${ECR_ACCOUNT_ID:-153263036946}"
# #676: validate account ID is exactly 12 digits (AWS account ID format).
if ! [[ "$acct" =~ ^[0-9]{12}$ ]]; then
err "invalid ECR_ACCOUNT_ID (must be 12 digits): $acct"
return 1
fi
local params
params=$(mktemp)
python3 -c "
@@ -290,6 +295,11 @@ validate_slug() {
preflight() {
log "preflight: source=$SOURCE_TAG dest=$DEST_TAG repo=$REPO region=$REGION"
# Region validation: reject obviously malformed input (CWE-78 / injection guard).
if ! [[ "$REGION" =~ ^[a-z][a-z0-9-]*[0-9]$ ]]; then
err "invalid AWS region: $REGION"
exit 64
fi
local src_manifest
src_manifest=$(aws_ecr_get_image "$SOURCE_TAG") || {
err "source tag '$SOURCE_TAG' not found in $REPO"
+19 -4
View File
@@ -311,7 +311,22 @@ for slug in $valid_slugs; do
fi
done
printf '\n== Test 11: ROLLBACK_TAG follows YYYYMMDD via NOW_OVERRIDE_DATE ==\n'
printf '\n== Test 11: region validation — malicious region rejected with exit 64 (#676) ==\n'
# Attack vectors: shell metacharacters, path traversal, command substitution.
_invalid_regions='us;rm -rf / $(whoami) us"east-1 ../etc/passwd `id` $HOME us/east-1'
for bad_region in $_invalid_regions; do
set +e
out=$(AWS_REGION="$bad_region" "$SCRIPT" --source-tag x --dest-tag y --tenants chloe-dong --mock-dir /nonexistent 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'invalid AWS region'; then
PASS=$((PASS + 1)); printf ' ✓ region rejected: %s\n' "$(printf '%q' "$bad_region")"
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("region-reject:$bad_region")
printf ' ✗ region should be rejected: %s — got exit %s\n' "$(printf '%q' "$bad_region")" "$rc"
fi
done
printf '\n== Test 12: ROLLBACK_TAG follows YYYYMMDD via NOW_OVERRIDE_DATE ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{}' 0
mock_set "$m" aws_ecr_describe_image '' 1
@@ -333,7 +348,7 @@ fi
assert_calls_contain "rollback tag uses NOW_OVERRIDE_DATE (20260603)" "$m" 'aws_ecr_put_image b-prev-20260603'
rm -rf "$m"
printf '\n== Test 12: empty source manifest fails preflight ==\n'
printf '\n== Test 13: empty source manifest fails preflight ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '' 0 # rc=0 but empty body (the "None" case)
out=$(run_script "$m")
@@ -341,7 +356,7 @@ assert_exit "empty source manifest fails preflight" "$out" 1
assert_contains "empty manifest message" "$out" 'returned empty manifest'
rm -rf "$m"
printf '\n== Test 13: tenant_buildinfo failure during verify → rollback ==\n'
printf '\n== Test 14: tenant_buildinfo failure during verify → rollback ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
@@ -355,7 +370,7 @@ assert_contains "logs buildinfo failure" "$out" '/buildinfo failed for chloe-don
assert_contains "rollback fired after verify fail" "$out" 'ROLLBACK:'
rm -rf "$m"
printf '\n== Test 14: ssm_refresh_ecr_auth JSON escaping (CWE-78 / OFFSEC-001) ==\n'
printf '\n== Test 15: ssm_refresh_ecr_auth JSON escaping (CWE-78 / OFFSEC-001) ==\n'
# Verify the python3 snippet in ssm_refresh_ecr_auth produces valid JSON and
# correctly escapes shell-injection characters in region + account ID fields.
# The fix replaces unquoted shell-printf interpolation with json.dumps.
+33
View File
@@ -0,0 +1,33 @@
# Tiny stub runtime image for the local Docker-provisioner lifecycle e2e.
#
# It impersonates a real workspace runtime's platform contract (register +
# heartbeat + A2A message/send) with ZERO LLM/SDK weight so the lifecycle e2e
# (provision -> online -> restart-survive -> proxy-reach) runs in seconds and
# without the 2.5GB real claude-code image.
#
# Resolution trick (see tests/e2e/test_local_provision_lifecycle_e2e.sh):
# the local provisioner resolves runtime=claude-code via RegistryModeLocal,
# which is a `docker image inspect` cache-check on
# molecule-local/workspace-template-claude-code:<gitea-HEAD-sha12>
# BEFORE it clones+builds. Pre-tagging THIS image to that exact cache tag makes
# the provisioner cache-hit the stub instead of building the real template.
#
# linux/amd64: the provisioner forces --platform=linux/amd64 for every workspace
# container (defaultImagePlatform, #1875) for parity with the amd64-only prod
# images. Build the stub amd64 too so the platforms match and Docker doesn't
# refuse the create with a manifest mismatch.
FROM --platform=linux/amd64 python:3.12-alpine
# /configs is the named-volume mount point the provisioner attaches
# (ws-<id>-configs:/configs). The real entrypoint chowns it; the stub just
# needs the dir to exist so a missing-mount never trips it up.
RUN mkdir -p /configs /workspace
WORKDIR /app
COPY server.py /app/server.py
EXPOSE 8000
# No gosu/agent-uid drop here — the stub does no privileged work and the e2e
# only cares about the platform contract, not the agent-uid security posture.
ENTRYPOINT ["python3", "/app/server.py"]
+307
View File
@@ -0,0 +1,307 @@
#!/usr/bin/env python3
"""Minimal stub runtime for the local Docker-provisioner lifecycle e2e.
This is NOT a real agent it carries no LLM, no claude-code SDK, no plugin
host. Its only job is to satisfy the platform's runtime<->platform contract so
the `test_local_provision_lifecycle_e2e.sh` harness can prove the LOCAL Docker
provisioner can provision a workspace, bring it online, SURVIVE A RESTART
(reusing the config volume), and route an A2A `message/send` through the
platform proxy all WITHOUT building/booting the 2.5GB real claude-code image.
Contract it replicates (discovered from workspace-server):
* registration is done BY the runtime container on boot (NOT the provisioner).
The provisioner only sets status=provisioning + pre-stores the host URL; the
container must POST /registry/register itself, and the heartbeat loop is what
transitions provisioning -> online (registry.go evaluateStatus, #1784).
* env vars the real entrypoint reads, injected by buildContainerEnv():
WORKSPACE_ID - this workspace's UUID
PLATFORM_URL - canonical platform base URL (e.g. http://platform:8080)
We read exactly those (with WORKSPACE_CONFIG_PATH for the config.yaml probe).
* POST {PLATFORM_URL}/registry/register
body: {"id", "url", "agent_card":{"name","skills":[]}}
- url MUST be push-routable. The provisioner runs the platform inside
Docker, so it rewrites a stored http://127.0.0.1:<port> URL to the
container-DNS form http://ws-<id[:12]>:8000 before proxying
(a2a_proxy.go resolveAgentURL). We register our OWN container-DNS URL
(http://<hostname>:8000) so SSRF validation passes in SaaS mode AND the
proxy can reach us; in self-hosted (non-saas) mode RFC-1918 is blocked,
so we fall back to registering by the ws-<id> alias hostname which
resolves on molecule-core-net.
- first register returns {"auth_token": ...}; we keep it for heartbeats.
* POST {PLATFORM_URL}/registry/heartbeat (every ~10s)
header: Authorization: Bearer <auth_token>
body: {"workspace_id","error_rate","sample_error","active_tasks",
"uptime_seconds","current_task"}
This is what lifts the workspace provisioning -> online and keeps the
Redis liveness TTL fresh (so the restart re-online assertion can pass).
* listen on :8000 and answer the A2A JSON-RPC the proxy forwards:
POST / {"jsonrpc","id","method":"message/send","params":{...}}
-> 200 {"jsonrpc":"2.0","id":<echoed>,
"result":{"kind":"message","role":"agent",
"parts":[{"kind":"text","text":"STUB OK"}],
"messageId":<uuid>}}
The result envelope matches what test_a2a_e2e.sh asserts on
(result.parts[0].text, role=agent, kind=text). A health path (/health and
GET /) returns 200 so any probe sees the container as alive.
"""
import json
import os
import sys
import threading
import time
import urllib.request
import urllib.error
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
PORT = 8000
WORKSPACE_ID = os.environ.get("WORKSPACE_ID", "").strip()
PLATFORM_URL = (os.environ.get("PLATFORM_URL") or os.environ.get("MOLECULE_URL") or "").rstrip("/")
HOSTNAME = os.environ.get("HOSTNAME", "").strip() # docker sets this to the container id; ws-<id> alias also resolves
# URL we register with. Two hard constraints, discovered from workspace-server:
#
# * validateAgentURL (registry.go) blocks RFC-1918 ranges in NON-saas mode
# (this dev stack sets neither MOLECULE_DEPLOY_MODE=saas nor MOLECULE_ORG_ID
# -> strict mode). The molecule-core-net bridge is 172.18.0.0/16, INSIDE the
# blocked 172.16/12 — so registering our own ws-<id>:8000 DNS name (which
# resolves to a 172.18.x bridge IP) would be REJECTED and we'd never get an
# auth_token. "localhost" is explicitly allowed BY NAME (no DNS lookup).
#
# * the proxy doesn't use the URL we register anyway: the provisioner
# pre-stores http://127.0.0.1:<host-port>, the register upsert PRESERVES any
# existing 127.0.0.1 URL (CASE WHEN url LIKE 'http://127.0.0.1%'), and when
# the platform runs in Docker resolveAgentURL rewrites that to the container
# -DNS form http://ws-<id[:12]>:8000 before forwarding. So our listen
# address (0.0.0.0:8000, reachable as ws-<id>:8000 on the bridge) is what
# the proxy actually hits — independent of the URL string we register.
#
# Net: register a name-form localhost URL purely to satisfy push-mode's
# "url required + must pass SSRF check" and to get our auth_token. Routing is
# handled by the provisioner-stored 127.0.0.1 URL + the proxy rewrite.
_short = WORKSPACE_ID[:12] if len(WORKSPACE_ID) > 12 else WORKSPACE_ID
SELF_URL = os.environ.get("STUB_REGISTER_URL", f"http://localhost:{PORT}")
CONFIG_PATH = (os.environ.get("WORKSPACE_CONFIG_PATH") or "/configs").rstrip("/")
AUTH_TOKEN_FILE = f"{CONFIG_PATH}/.auth_token"
AUTH_TOKEN = None
_started = time.time()
def _log(msg):
print(f"[stub-runtime {_short}] {msg}", flush=True)
def read_volume_token():
"""The provisioner pre-writes the CURRENT workspace bearer to
/configs/.auth_token before every container start (issueAndInjectToken,
#1877), and ROTATES it on every (re)provision (RevokeAllForWorkspace +
IssueToken). So the volume file NOT the register-response token is the
authoritative, rotation-proof bearer. Reading it on each heartbeat means a
provision-time token rotation never wedges our heartbeat at 401 (which is
what kept the workspace stuck in 'provisioning' instead of flipping online).
"""
try:
with open(AUTH_TOKEN_FILE, "r") as f:
tok = f.read().strip()
return tok or None
except Exception:
return None
def _post_json(path, payload, token=None):
url = f"{PLATFORM_URL}{path}"
data = json.dumps(payload).encode()
req = urllib.request.Request(url, data=data, method="POST")
req.add_header("Content-Type", "application/json")
if token:
req.add_header("Authorization", f"Bearer {token}")
with urllib.request.urlopen(req, timeout=15) as resp:
body = resp.read().decode()
return resp.status, body
def register():
"""POST /registry/register. Returns the issued auth_token (first register).
C18 hijack guard: once the workspace has ANY live token on file (the
provisioner mints+injects one into /configs/.auth_token before start), a
register MUST carry that workspace's bearer or it 401s. So we send the
volume token (if present). First-ever boot has no live token yet bootstrap
register (no bearer) is allowed and returns the freshly-issued auth_token.
"""
global AUTH_TOKEN
payload = {
"id": WORKSPACE_ID,
"url": SELF_URL,
"delivery_mode": "push",
"agent_card": {
"name": WORKSPACE_ID,
"description": "stub runtime (e2e lifecycle)",
"skills": [],
},
}
status, body = _post_json("/registry/register", payload, token=read_volume_token())
_log(f"register -> {status} {body[:200]}")
try:
parsed = json.loads(body)
except Exception:
parsed = {}
tok = parsed.get("auth_token")
if tok:
AUTH_TOKEN = tok
_log("captured auth_token from register response")
return status
def current_token():
# Volume file is authoritative (rotation-proof); fall back to the token we
# captured from the register response if the file isn't there yet.
return read_volume_token() or AUTH_TOKEN
def heartbeat():
payload = {
"workspace_id": WORKSPACE_ID,
"error_rate": 0.0,
"sample_error": "",
"active_tasks": 0,
"uptime_seconds": int(time.time() - _started),
"current_task": "",
}
status, body = _post_json("/registry/heartbeat", payload, token=current_token())
return status, body
def register_with_retry():
# The platform may still be wiring the row when we boot; retry a few times.
# Register is best-effort for the e2e (heartbeat drives online); a sticky
# 401 just means the workspace already has a live token and our volume token
# is momentarily stale — the heartbeat path re-reads the volume each beat.
for attempt in range(1, 11):
try:
status = register()
if status == 200:
return True
_log(f"register attempt {attempt}: HTTP {status}, retrying")
except urllib.error.HTTPError as e:
_log(f"register attempt {attempt}: HTTPError {e.code} {e.read().decode()[:200]}")
except Exception as e:
_log(f"register attempt {attempt}: {e}")
time.sleep(2)
return False
def heartbeat_loop():
# Fire the FIRST heartbeat immediately (no initial 5s wait) — the
# provisioning->online transition is driven by the heartbeat handler
# (registry.go evaluateStatus, #1784), so an eager first beat minimises the
# provision->online latency the e2e polls on.
while True:
try:
status, body = heartbeat()
if status != 200:
_log(f"heartbeat -> {status} {body[:160]}")
# A 401 means our token was rotated (every provision rotates the
# workspace token, issueAndInjectToken -> RevokeAllForWorkspace).
# Re-register to mint a fresh one. This is what lets the SAME
# container process survive a platform-side token rotation.
if status == 401:
_log("heartbeat 401 — re-registering to refresh token")
register_with_retry()
except urllib.error.HTTPError as e:
if e.code == 401:
_log("heartbeat 401 (HTTPError) — re-registering")
register_with_retry()
else:
_log(f"heartbeat HTTPError {e.code}")
except Exception as e:
_log(f"heartbeat error: {e}")
time.sleep(5)
class Handler(BaseHTTPRequestHandler):
def log_message(self, *args): # silence default access logging
pass
def _send(self, code, obj):
body = json.dumps(obj).encode()
self.send_response(code)
self.send_header("Content-Type", "application/json")
self.send_header("Content-Length", str(len(body)))
self.end_headers()
self.wfile.write(body)
def do_GET(self):
# Health: any GET returns 200 so probes see us as alive.
self._send(200, {"status": "ok", "stub": True, "workspace_id": WORKSPACE_ID})
def do_POST(self):
length = int(self.headers.get("Content-Length", "0") or "0")
raw = self.rfile.read(length) if length else b"{}"
try:
req = json.loads(raw or b"{}")
except Exception:
req = {}
method = req.get("method", "")
req_id = req.get("id", str(uuid.uuid4()))
if method and method != "message/send":
# Match the proxy's -32601 method-not-found contract for unknowns.
self._send(200, {
"jsonrpc": "2.0",
"id": req_id,
"error": {"code": -32601, "message": f"method not found: {method}"},
})
return
# Canned A2A reply — exact envelope the canvas/proxy + test_a2a_e2e.sh
# assert on: result.role=agent, result.parts[0].kind=text/text.
self._send(200, {
"jsonrpc": "2.0",
"id": req_id,
"result": {
"kind": "message",
"role": "agent",
"parts": [{"kind": "text", "text": "STUB OK"}],
"messageId": str(uuid.uuid4()),
},
})
def main():
if not WORKSPACE_ID or not PLATFORM_URL:
_log(f"FATAL: WORKSPACE_ID={WORKSPACE_ID!r} PLATFORM_URL={PLATFORM_URL!r} — both required")
sys.exit(1)
_log(f"booting: platform={PLATFORM_URL} self_url={SELF_URL} hostname={HOSTNAME}")
# Start the HTTP server FIRST so the platform can reach us the instant we
# register (avoids a race where the proxy forwards before we're listening).
server = ThreadingHTTPServer(("0.0.0.0", PORT), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
_log(f"listening on :{PORT}")
# Try to register, but do NOT make heartbeating contingent on it. The
# provisioning->online transition is driven by the HEARTBEAT handler
# (registry.go evaluateStatus, #1784), and heartbeats authenticate with the
# volume token (rotation-proof). If register transiently 401s (e.g. a token
# rotation mid-boot), we must still heartbeat so the workspace can come
# online — blocking the heartbeat loop on register success is exactly what
# kept the workspace stuck in 'provisioning'. register_with_retry runs in a
# background thread; the foreground heartbeat loop starts immediately.
threading.Thread(target=register_with_retry, daemon=True).start()
heartbeat_loop()
if __name__ == "__main__":
main()
+255
View File
@@ -0,0 +1,255 @@
#!/usr/bin/env bash
# LOCAL functional variant of the concierge-creates-a-workspace gate.
#
# Same proof as tests/e2e/test_staging_concierge_creates_workspace_e2e.sh but
# against the ALREADY-RUNNING local stack (BASE, default http://localhost:8080),
# so the "concierge actually invokes create_workspace via the platform MCP" claim
# can be demonstrated locally — far faster than provisioning an EC2 tenant.
#
# Drive the AGENT (not the REST API): send the concierge an A2A message/send
# ("create a workspace named e2e-cncrg-worker-<runid> with role engineer") and
# assert the DETERMINISTIC SIDE EFFECT — that named workspace now EXISTS in
# GET /workspaces — which can only happen if the concierge's LLM really invoked
# the create_workspace platform-MCP tool.
#
# SKIP-LOUD GATE (this is the whole point of the local variant). The platform MCP
# tools — incl. create_workspace — only light up on the DEDICATED platform-agent
# image (Dockerfile.platform-agent, ships /opt/molecule-mcp-server). The ordinary
# `claude-code` image the default local stack provisions the concierge on does
# NOT ship it (platform_agent.go SELF-HOST CAVEAT). So before driving the agent
# this script PROBES the concierge's own MCP tool list (POST /workspaces/:id/mcp
# tools/list) and SKIPs LOUD (exit 0) unless create_workspace is actually present.
# It also skips-loud when no concierge is seeded or it isn't online. That makes
# this runnable on any local stack: it only EXERCISES the path when the local
# stack can actually run it, and never false-reds when it can't.
#
# To make the local stack able to run this GREEN you need BOTH:
# 1. A concierge seeded as the kind='platform' root. The self-hosted compose
# sets MOLECULE_SEED_PLATFORM_AGENT=1 so the ws-server self-seeds it
# (EnsureSelfHostedPlatformAgent) + best-effort provisions it on boot
# (MaybeProvisionPlatformAgentOnBoot).
# 2. That concierge running on the platform-agent image (so create_workspace
# exists) WITH a working model key (e.g. MINIMAX_API_KEY / a BYOK key) so its
# LLM can run the tool. The default `claude-code` image will SKIP at the MCP
# probe — that's expected and honest, not a failure.
#
# Env contract:
# BASE default http://localhost:8080
# MOLECULE_ADMIN_TOKEN platform admin bearer IF the local stack sets
# ADMIN_TOKEN (devmode fail-open if unset). Used by
# _lib.sh helpers for admin-gated GET/DELETE.
# E2E_CONCIERGE_ONLINE_SECS default 300 (local boot budget)
# E2E_AGENT_ACT_SECS default 300 (LLM think+tool-call budget)
# E2E_RUN_ID slug/name suffix; default $$-based
#
# Exit codes:
# 0 concierge created the workspace, OR honest skip-loud (path not runnable)
# 1 generic / assertion failure (agent didn't act, or the tool failed)
set -euo pipefail
: "${BASE:=http://localhost:8080}"
export BASE
# shellcheck disable=SC1091
# shellcheck source=_lib.sh
source "$(dirname "$0")/_lib.sh"
# Error-as-text scanner so a concierge that surfaces a tool error AS its reply
# is distinguished from a clean "created it" reply.
# shellcheck disable=SC1091
# shellcheck source=lib/completion_assert.sh
source "$(dirname "$0")/lib/completion_assert.sh"
CONCIERGE_ONLINE_SECS="${E2E_CONCIERGE_ONLINE_SECS:-300}"
AGENT_ACT_SECS="${E2E_AGENT_ACT_SECS:-300}"
RUN_ID_SUFFIX="${E2E_RUN_ID:-$(date +%H%M%S)-$$}"
WORKER_NAME="e2e-cncrg-worker-${RUN_ID_SUFFIX}"
WORKER_NAME=$(echo "$WORKER_NAME" | tr -cd 'a-zA-Z0-9-' | head -c 48)
export WORKER_NAME
log() { echo "[$(date +%H:%M:%S)] $*"; }
fail() { echo "[$(date +%H:%M:%S)] ❌ $*" >&2; exit 1; }
ok() { echo "[$(date +%H:%M:%S)] ✅ $*"; }
skip_loud() { echo "[$(date +%H:%M:%S)] ⏭️ SKIP (local path not runnable): $*" >&2; exit 0; }
# Admin-auth curl args (if the local stack set ADMIN_TOKEN; else empty / fail-open).
ADMIN_AUTH=()
e2e_admin_auth_args ADMIN_AUTH
WORKER_ID=""
cleanup() {
# Targeted delete of the worker the concierge created (best-effort). _lib.sh's
# helper sends the admin bearer + confirm header.
if [ -n "$WORKER_ID" ]; then
log "🧹 deleting concierge-created worker $WORKER_ID ($WORKER_NAME)..."
e2e_delete_workspace "$WORKER_ID" "$WORKER_NAME" || true
fi
}
trap cleanup EXIT INT TERM
list_ws() { curl -sS --max-time 15 "$BASE/workspaces" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"}; }
find_platform_root() {
list_ws | python3 -c "
import sys, json
try: rows = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
for w in rows if isinstance(rows, list) else []:
if w.get('kind') == 'platform' and not w.get('parent_id'):
print(w.get('id','')); break
else:
print('')"
}
ws_field() { # <id> <field>
curl -sS --max-time 15 "$BASE/workspaces/$1" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} | python3 -c "
import sys, json
try: d = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
print(d.get('$2','') if isinstance(d, dict) else '')"
}
find_worker_by_name() {
list_ws | python3 -c "
import sys, json, os
want = os.environ['WORKER_NAME']
try: rows = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
for w in rows if isinstance(rows, list) else []:
if w.get('name') == want:
print(w.get('id','')); break
else:
print('')"
}
# concierge_has_create_workspace_tool <id>: probe POST /workspaces/:id/mcp
# tools/list and echo "yes" iff create_workspace is in the advertised tool set.
# This is THE gate distinguishing the platform-agent image (has the tool) from
# the ordinary claude-code image (does not).
concierge_has_create_workspace_tool() { # <id>
local wid="$1" out
out=$(curl -sS --max-time 30 -X POST "$BASE/workspaces/$wid/mcp" \
${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' 2>/dev/null || echo '{}')
echo "$out" | python3 -c "
import sys, json
try: d = json.load(sys.stdin)
except Exception: print('no'); sys.exit(0)
tools = (d.get('result') or {}).get('tools', []) if isinstance(d, dict) else []
names = {t.get('name','') for t in tools if isinstance(t, dict)}
# Accept the bare name or any mcp_*_create_workspace alias the bridge may expose.
print('yes' if any(n == 'create_workspace' or n.endswith('create_workspace') for n in names) else 'no')"
}
# ─── 0. Preflight ────────────────────────────────────────────────────────────
log "═══ LOCAL concierge CREATES-A-WORKSPACE (real-LLM) E2E ═══ BASE=$BASE"
log " worker the concierge will be asked to create: name=$WORKER_NAME"
curl -sS --max-time 10 "$BASE/health" >/dev/null 2>&1 || skip_loud "local stack not reachable at $BASE/health — run \`make up\` first"
ok "Local stack reachable"
# ─── 1. Discover the concierge (kind='platform' root) ─────────────────────────
CONCIERGE_ID=$(find_platform_root)
if [ -z "$CONCIERGE_ID" ]; then
skip_loud "no kind='platform' concierge seeded on the local stack. Set MOLECULE_SEED_PLATFORM_AGENT=1 \
on the ws-server (self-hosted compose does this) so it self-seeds + provisions the concierge."
fi
ok "Concierge (platform root) = $CONCIERGE_ID"
# ─── 2. Ensure the concierge is online ────────────────────────────────────────
log "Waiting for the concierge to be online (up to ${CONCIERGE_ONLINE_SECS}s)..."
ONLINE_DEADLINE=$(( $(date +%s) + CONCIERGE_ONLINE_SECS ))
C_STATUS=""; LAST_C_STATUS=""
while true; do
C_STATUS=$(ws_field "$CONCIERGE_ID" status)
if [ "$C_STATUS" != "$LAST_C_STATUS" ]; then log " concierge → ${C_STATUS:-<none>}"; LAST_C_STATUS="$C_STATUS"; fi
[ "$C_STATUS" = "online" ] && break
if [ "$(date +%s)" -gt "$ONLINE_DEADLINE" ]; then
skip_loud "concierge $CONCIERGE_ID never reached online within ${CONCIERGE_ONLINE_SECS}s (last='${C_STATUS}'). \
On the default local stack the concierge needs a model key (e.g. MINIMAX_API_KEY) to boot — without one it stays failed."
fi
sleep 5
done
ok "Concierge online"
# ─── 3. Gate: the platform MCP create_workspace tool must actually be present ──
log "Probing the concierge's MCP tool set for create_workspace..."
HAS_TOOL=$(concierge_has_create_workspace_tool "$CONCIERGE_ID")
if [ "$HAS_TOOL" != "yes" ]; then
skip_loud "the concierge's platform MCP does NOT expose create_workspace — it is running on the ordinary \
claude-code image (no /opt/molecule-mcp-server), not the platform-agent image. Provision the concierge on \
Dockerfile.platform-agent to exercise this path locally. (This is the documented SELF-HOST CAVEAT, not a bug.)"
fi
ok "Concierge advertises create_workspace via its platform MCP"
# Pre-state: the worker must not already exist.
PRE_EXISTING=$(find_worker_by_name)
[ -n "$PRE_EXISTING" ] && fail "worker '$WORKER_NAME' already exists pre-test ($PRE_EXISTING) — cannot prove causality"
ok "Pre-state confirmed: '$WORKER_NAME' does not exist yet"
# ─── 4. Drive the AGENT via A2A message/send ──────────────────────────────────
log "Sending the concierge a natural-language create-workspace request..."
AGENT_PROMPT="Please create a new workspace in this org right now using your platform tools. \
Use the create_workspace tool with name exactly ${WORKER_NAME} (use that exact string, no quotes) and role engineer. \
Do not ask me any clarifying questions — the name and role are final. \
After the tool succeeds, reply with the new workspace id."
export AGENT_PROMPT
A2A_PAYLOAD=$(python3 -c "
import json, os, uuid
print(json.dumps({
'jsonrpc': '2.0',
'method': 'message/send',
'id': 'e2e-cncrg-mk-local-1',
'params': {
'message': {
'role': 'user',
'messageId': f'e2e-{uuid.uuid4().hex[:8]}',
'parts': [{'kind': 'text', 'text': os.environ['AGENT_PROMPT']}],
}
}
}))")
A2A_TMP=$(mktemp -t cncrg-mk-local-XXXXXX)
set +e
A2A_CODE=$(curl -sS --max-time "$AGENT_ACT_SECS" -X POST "$BASE/workspaces/$CONCIERGE_ID/a2a" \
${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} \
-H "Content-Type: application/json" \
-d "$A2A_PAYLOAD" -o "$A2A_TMP" -w '%{http_code}' 2>/dev/null)
A2A_RC=$?
set -e
A2A_CODE=${A2A_CODE:-000}
A2A_RESP=$(cat "$A2A_TMP" 2>/dev/null || echo "")
rm -f "$A2A_TMP"
if [ "$A2A_RC" != "0" ] || [ "$A2A_CODE" -lt 200 ] || [ "$A2A_CODE" -ge 300 ]; then
fail "A2A POST /workspaces/$CONCIERGE_ID/a2a failed (curl_rc=$A2A_RC, http=$A2A_CODE): $(echo "$A2A_RESP" | head -c 400)"
fi
AGENT_TEXT=$(echo "$A2A_RESP" | python3 -c "
import sys, json
try: d = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
parts = (d.get('result') or {}).get('parts', []) if isinstance(d, dict) else []
print(parts[0].get('text','') if parts else '')" 2>/dev/null || echo "")
log " concierge replied (first 300 chars): $(echo "$AGENT_TEXT" | head -c 300)"
# ─── 5. ASSERT the deterministic side effect: the worker now EXISTS ───────────
log "Polling GET /workspaces for the worker the concierge was asked to create..."
ACT_DEADLINE=$(( $(date +%s) + AGENT_ACT_SECS ))
while true; do
WORKER_ID=$(find_worker_by_name)
[ -n "$WORKER_ID" ] && break
if [ "$(date +%s)" -gt "$ACT_DEADLINE" ]; then
if hit=$(a2a_completion_error_marker "$AGENT_TEXT"); then
fail "TOOL FAILED: concierge surfaced an error-as-text reply (matched '$hit') and no workspace '$WORKER_NAME' was created. Reply: $(echo "$AGENT_TEXT" | head -c 400)"
fi
fail "AGENT DID NOT ACT: concierge replied but no workspace named '$WORKER_NAME' exists after ${AGENT_ACT_SECS}s — its LLM did not invoke create_workspace. Reply: $(echo "$AGENT_TEXT" | head -c 400)"
fi
sleep 6
done
ok "DETERMINISTIC SIDE EFFECT CONFIRMED: workspace '$WORKER_NAME' now EXISTS (id=$WORKER_ID)"
WORKER_KIND=$(ws_field "$WORKER_ID" kind)
if [ -n "$WORKER_KIND" ] && [ "$WORKER_KIND" != "workspace" ]; then
fail "created node '$WORKER_NAME' has kind='$WORKER_KIND' (want 'workspace')"
fi
ok "Created node is a real kind='workspace' row"
ok "═══ LOCAL CONCIERGE CREATES-A-WORKSPACE E2E PASSED ═══"
log "Proven locally: a natural-language A2A request → the concierge's LLM invoked create_workspace via the platform MCP → real workspace '$WORKER_NAME' (id=$WORKER_ID). Teardown runs via EXIT trap."
+576
View File
@@ -0,0 +1,576 @@
#!/usr/bin/env bash
# MANDATORY local Docker-provisioner lifecycle e2e.
#
# Why this exists: every other e2e exercises the SaaS/EC2 (control-plane)
# provisioner. NOTHING mandatory exercises the LOCAL Docker provisioner
# (MOLECULE_ENV=development, docker.sock) — the path self-hosters and dev runs
# use. A config-volume bug where a restarted workspace couldn't find its
# config.yaml (and wedged in 'failed' with "config volume is empty") went
# undetected for exactly this reason. This test provisions a REAL workspace via
# the LOCAL provisioner and asserts the full lifecycle, INCLUDING the
# restart-survival assertion that would have caught that bug.
#
# Steps (each asserts loudly):
# 1. Build + tag the stub runtime image to the provisioner's RegistryModeLocal
# cache tag so runtime=claude-code resolves to the stub (cache-hit, no
# 2.5GB build).
# 2. POST /workspaces (runtime=claude-code) — capture id.
# 3. Poll GET /workspaces/{id} until status==online (<=90s); assert a ws-<id>
# container is running.
# 4. RESTART-SURVIVAL: POST /workspaces/{id}/restart, poll until online AGAIN
# (<=90s); assert the container is back and the workspace did NOT wedge in
# failed / "config volume is empty". <-- the key assertion.
# 5. PROXY REACH: POST an A2A message/send through the PLATFORM proxy
# (/workspaces/{id}/a2a); assert 200 + the stub's canned reply (proves the
# ws-<id>:8000 Docker-DNS rewrite path works end-to-end).
# 6. Cleanup: delete the workspace (trap removes its container + volumes).
#
# Parameterizable: LIFECYCLE_RUNTIME_IMAGE selects which image the provisioner
# resolves to. Default = the freshly-built stub. Point it at the real image
# (e.g. molecule-local/workspace-template-claude-code:2ac9678422a5) for an
# advisory lifecycle-only run (the proxy-reach step then asserts reachability,
# not the canned text — a real LLM-less runtime can't produce "STUB OK").
#
# Run (stub, default — fast, no LLM):
# BASE=http://localhost:8080 ADMIN_TOKEN=dev-local-admin-token \
# bash tests/e2e/test_local_provision_lifecycle_e2e.sh
#
# Run (REAL MiniMax LLM round-trip — cheapest real model; asserts a real reply):
# BASE=http://localhost:8080 ADMIN_TOKEN=dev-local-admin-token \
# LIFECYCLE_LLM=minimax MINIMAX_API_KEY=<key> \
# bash tests/e2e/test_local_provision_lifecycle_e2e.sh
# (MINIMAX_API_KEY missing => loud skip exit 0; key is only ever sent in the
# secret-write curl body, never echoed or written to disk.)
set -euo pipefail
source "$(dirname "$0")/_lib.sh" # sets BASE default + admin-auth + cleanup helpers
# ---- config -----------------------------------------------------------------
ADMIN_TOKEN="${ADMIN_TOKEN:-${MOLECULE_ADMIN_TOKEN:-}}"
export ADMIN_TOKEN MOLECULE_ADMIN_TOKEN="${ADMIN_TOKEN}"
# Was ONLINE_TIMEOUT set by the caller? Remember before we default it so the
# minimax mode (heavier real-template boot) can bump the default without
# clobbering an explicit operator/CI override.
ONLINE_TIMEOUT_EXPLICIT=0
[ -n "${ONLINE_TIMEOUT:-}" ] && ONLINE_TIMEOUT_EXPLICIT=1
ONLINE_TIMEOUT="${ONLINE_TIMEOUT:-90}" # seconds to wait for online
A2A_TIMEOUT="${A2A_TIMEOUT:-30}"
STUB_DIR="$(cd "$(dirname "$0")/stub-runtime" && pwd)"
RUNTIME="claude-code"
# The provisioner's RegistryModeLocal resolves runtime=claude-code by checking
# the local image store for molecule-local/workspace-template-claude-code:<sha12>
# (the Gitea HEAD sha12 of the template repo's `main` branch — see
# provisioner/localbuild.go EnsureLocalImage). If that tag is missing it
# clones+builds the real 2.5GB template (slow + can OOM-kill in CI). We pre-tag
# our chosen image to that EXACT cache tag so the cache-check (dockerHasTag)
# hits and resolves to our image with no clone/build.
#
# The sha MOVES as the template repo advances, so we DISCOVER it at runtime from
# the same Gitea branch API the provisioner uses (CACHE_SHA), and only fall back
# to a pinned default (or an explicit CACHE_TAG override) when Gitea is
# unreachable. This keeps the test correct without an annual sha bump.
CACHE_REPO="molecule-local/workspace-template-${RUNTIME}"
GITEA_BRANCH_API="${GITEA_BRANCH_API:-https://git.moleculesai.app/api/v1/repos/molecule-ai/molecule-ai-workspace-template-${RUNTIME}/branches/main}"
# Model + credential choice — three coupled constraints from workspace-server:
# * Create rejects a model NOT registered for the runtime
# (UNREGISTERED_MODEL_FOR_RUNTIME, provider-registry SSOT).
# * The SLASH form (anthropic/claude-opus-4-7) derives provider=platform =>
# platform_managed billing, which ABORTS provisioning in a dev stack with
# no CP proxy env (MISSING_PLATFORM_PROXY, #2162).
# * The BARE form (claude-opus-4-7) derives provider=anthropic-api => BYOK,
# which then FAILS CLOSED unless the workspace has a usable LLM credential
# (MISSING_BYOK_CREDENTIAL). anthropic-api's auth_env is
# [ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN] — so we pass a DUMMY
# ANTHROPIC_API_KEY secret. The stub never makes an LLM call, so the dummy
# value is fine; it only needs to exist so byok resolves with a usable cred.
# This keeps the test self-contained (no platform-proxy env required) — exactly
# the portable shape the CI required job needs.
LIFECYCLE_MODEL="${LIFECYCLE_MODEL:-claude-opus-4-7}"
LIFECYCLE_LLM_KEY="${LIFECYCLE_LLM_KEY:-ANTHROPIC_API_KEY}"
LIFECYCLE_LLM_VALUE="${LIFECYCLE_LLM_VALUE:-sk-ant-e2e-stub-dummy-not-a-real-key}"
LATEST_TAG="${CACHE_REPO}:latest"
# ---- LIFECYCLE_LLM: real-LLM round-trip mode -------------------------------
# Default "" = the existing behaviour (stub or LLM-less real image).
#
# LIFECYCLE_LLM=minimax — provision the REAL claude-code template image with a
# MiniMax BYOK credential and assert an ACTUAL model reply at the proxy-reach
# step (Step 5), proving a genuine round-trip through the ws-<id>:8000 proxy.
#
# Why MiniMax: it's the cheapest LLM the platform offers (the staging canaries'
# primary auth path post-2026-05-04). The claude-code adapter's `minimax`
# provider (providers.yaml:258) reads MINIMAX_API_KEY at boot and points
# ANTHROPIC_BASE_URL at api.minimax.io/anthropic — MiniMax's OWN API, NOT the
# molecule LLM proxy — so a BYOK MiniMax workspace reaches the model DIRECTLY
# and works on this local dev stack with no CP proxy env.
#
# The registered claude-code slug is the BARE id `MiniMax-M2.7` (derives
# provider=minimax => byok). The colon form `minimax:MiniMax-M2.7` is
# UNREGISTERED on claude-code (internal#718). auth_env for `minimax` accepts
# MINIMAX_API_KEY, which the adapter projects into ANTHROPIC_AUTH_TOKEN.
#
# The real key MUST be supplied via the MINIMAX_API_KEY env var (never echoed
# or written to disk by this script — it only travels in the secret-write curl
# body, exactly like the dummy ANTHROPIC_API_KEY does today). Missing key =>
# loud skip (exit 0), never a red fail (mirrors the serving-e2e pattern).
LIFECYCLE_LLM="${LIFECYCLE_LLM:-}"
if [ "$LIFECYCLE_LLM" = "minimax" ]; then
if [ -z "${MINIMAX_API_KEY:-}" ]; then
echo "SKIP: LIFECYCLE_LLM=minimax but MINIMAX_API_KEY is not set in the env."
echo " Provide a real MiniMax key (the advisory CI job reads it from a"
echo " CI secret) to run the real-LLM round-trip. Skipping (exit 0)."
exit 0
fi
# Real claude-code template build (provisioner resolves+builds via
# RegistryModeLocal — same path as the advisory lifecycle-real job).
LIFECYCLE_PROVISIONER_BUILDS="1"
# Registered BYOK MiniMax slug for claude-code (bare id => provider=minimax).
LIFECYCLE_MODEL="MiniMax-M2.7"
LIFECYCLE_LLM_KEY="MINIMAX_API_KEY"
LIFECYCLE_LLM_VALUE="${MINIMAX_API_KEY}"
# The real template boot is heavier than the stub; give it room (unless the
# caller pinned ONLINE_TIMEOUT explicitly).
[ "$ONLINE_TIMEOUT_EXPLICIT" -eq 0 ] && ONLINE_TIMEOUT=180
fi
# Image the provisioner should actually run. Default: build the stub. Override
# to a real image (a pre-built tag) for the advisory lifecycle-only run.
LIFECYCLE_RUNTIME_IMAGE="${LIFECYCLE_RUNTIME_IMAGE:-__BUILD_STUB__}"
# LIFECYCLE_PROVISIONER_BUILDS=1: do NOT pre-tag any image — let the provisioner
# resolve runtime=claude-code itself via RegistryModeLocal (clone + docker build
# the real template). This exercises the GENUINE local image-resolution path end
# to end. Used by the advisory CI job. Implies the real (LLM-less) runtime, so
# the proxy-reach step asserts reachability, not a canned reply.
LIFECYCLE_PROVISIONER_BUILDS="${LIFECYCLE_PROVISIONER_BUILDS:-0}"
# When NOT running the stub we cannot assert the canned "STUB OK" text (no LLM);
# we assert reachability/registration instead.
USING_STUB=1
[ "$LIFECYCLE_RUNTIME_IMAGE" != "__BUILD_STUB__" ] && USING_STUB=0
[ "$LIFECYCLE_PROVISIONER_BUILDS" = "1" ] && USING_STUB=0
PASS=0
FAIL=0
WSID=""
# May be pre-pinned via env; otherwise resolved from the Gitea HEAD sha in Step 1.
CACHE_TAG="${CACHE_TAG:-}"
# Remember the tags/images we mutated so the trap can restore the cache tag to
# the real image (so a stub run never leaves the real claude-code tag pointing
# at the lightweight stub for the next developer/CI job).
ORIG_CACHE_IMAGE_ID=""
check() {
local desc="$1" expected="$2" actual="$3"
if echo "$actual" | grep -qF -- "$expected"; then
echo "PASS: $desc"; PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected to contain: $expected"
echo " got: $(echo "$actual" | head -5)"
FAIL=$((FAIL + 1))
fi
}
pass() { echo "PASS: $1"; PASS=$((PASS + 1)); }
fail() { echo "FAIL: $1"; [ -n "${2:-}" ] && echo " $2"; FAIL=$((FAIL + 1)); }
admin_curl() {
local _a=(); e2e_admin_auth_args _a
curl -s "${_a[@]+"${_a[@]}"}" "$@"
}
ws_field() { # ws_field <workspace-json> <field>
echo "$1" | python3 -c "import sys,json
try:
d=json.load(sys.stdin); print(d.get('$2',''))
except Exception:
print('')"
}
container_running() { # container_running <ws-id> -> echoes name if running
local short="${1:0:12}"
docker ps --filter "name=ws-${short}" --filter "status=running" --format '{{.Names}}' 2>/dev/null | head -1
}
cleanup() {
local rc=$?
echo ""
echo "--- cleanup ---"
if [ -n "$WSID" ]; then
# SCOPED teardown — only the workspace this test created. Never a blanket
# sweep (other dev workspaces may be live on this shared daemon).
e2e_delete_workspace "$WSID" "" >/dev/null 2>&1 || true
local short="${WSID:0:12}"
docker rm -f "ws-${short}" >/dev/null 2>&1 || true
# Volume naming is split in the provisioner: configs + claude-sessions use the
# 12-char short id (ConfigVolumeName/ClaudeSessionVolumeName), but the
# /workspace volume uses the FULL UUID (buildWorkspaceMount: ws-<id>-workspace).
# Remove BOTH forms so neither leaks.
docker volume rm -f \
"ws-${short}-configs" "ws-${short}-claude-sessions" \
"ws-${short}-workspace" "ws-${WSID}-workspace" >/dev/null 2>&1 || true
echo "cleaned workspace $WSID + ws-${short} container/volumes"
fi
# Restore the cache tag to whatever it pointed at before we retagged it, so a
# stub run doesn't leave the real claude-code tag aliased to the stub.
if [ -n "$ORIG_CACHE_IMAGE_ID" ]; then
docker tag "$ORIG_CACHE_IMAGE_ID" "$CACHE_TAG" >/dev/null 2>&1 || true
echo "restored $CACHE_TAG -> ${ORIG_CACHE_IMAGE_ID:0:19}"
fi
exit $rc
}
trap cleanup EXIT INT TERM
echo "=== Local Docker-Provisioner Lifecycle E2E ==="
echo "BASE=$BASE runtime=$RUNTIME using_stub=$USING_STUB llm=${LIFECYCLE_LLM:-none} model=$LIFECYCLE_MODEL cache_tag=${CACHE_TAG:-<resolve-in-step-1>}"
echo ""
# Preflight: docker must be reachable and the platform must be up.
if ! docker info >/dev/null 2>&1; then
echo "ERROR: docker daemon not reachable — this test provisions local containers."
exit 2
fi
if ! curl -s -m 5 "$BASE/workspaces" >/dev/null 2>&1; then
echo "ERROR: platform not reachable at $BASE"
exit 2
fi
# ----------------------------------------------------------------------------
# Step 1 — build/tag the image the provisioner will resolve to.
# ----------------------------------------------------------------------------
echo "--- Step 1: resolve runtime image to the chosen target ---"
# Resolve the EXACT cache tag the provisioner will look up: <repo>:<gitea-HEAD-
# sha12>. Discover the sha from the Gitea branch API (same source the provisioner
# uses). An explicit CACHE_TAG env overrides discovery; if Gitea is unreachable
# AND no override is set, bail loudly — silently tagging the wrong sha would let
# the provisioner clone+build the real 2.5GB template (slow / OOM).
if [ -n "${CACHE_TAG:-}" ]; then
echo "Using operator-pinned CACHE_TAG=$CACHE_TAG"
else
CACHE_SHA=$(curl -s -m 10 "$GITEA_BRANCH_API" 2>/dev/null \
| python3 -c "import sys,json
try:
print(json.load(sys.stdin)['commit']['id'][:12])
except Exception:
print('')" 2>/dev/null)
if [ -z "$CACHE_SHA" ]; then
echo "ERROR: could not resolve the template HEAD sha from $GITEA_BRANCH_API"
echo " set CACHE_TAG=$CACHE_REPO:<sha12> explicitly (the tag the provisioner expects)."
exit 2
fi
CACHE_TAG="${CACHE_REPO}:${CACHE_SHA}"
echo "Resolved provisioner cache tag: $CACHE_TAG (gitea HEAD sha)"
fi
# Record what the cache tag points at NOW (if anything) so cleanup can restore.
ORIG_CACHE_IMAGE_ID="$(docker image inspect --format '{{.Id}}' "$CACHE_TAG" 2>/dev/null || true)"
if [ "$LIFECYCLE_PROVISIONER_BUILDS" = "1" ]; then
# No pre-tag — the provisioner resolves + builds the real template itself via
# RegistryModeLocal. Disarm the cache-tag restore (we never touched it).
ORIG_CACHE_IMAGE_ID=""
pass "provisioner-builds mode: leaving image resolution to RegistryModeLocal (real template build)"
elif [ "$USING_STUB" -eq 1 ]; then
echo "Building stub image from $STUB_DIR ..."
if ! docker build --platform=linux/amd64 -t molecule-local/stub-runtime:latest "$STUB_DIR" >/tmp/stub_build.log 2>&1; then
echo "FAIL: stub image build failed"; tail -20 /tmp/stub_build.log; exit 1
fi
pass "stub image built"
TARGET_IMAGE="molecule-local/stub-runtime:latest"
# Point BOTH the sha-pinned cache tag and :latest at the stub so the
# provisioner's RegistryModeLocal cache-check (dockerHasTag) resolves to it
# instead of cloning+building the template.
docker tag "$TARGET_IMAGE" "$CACHE_TAG"
docker tag "$TARGET_IMAGE" "$LATEST_TAG"
pass "tagged $TARGET_IMAGE -> $CACHE_TAG (+ :latest)"
else
TARGET_IMAGE="$LIFECYCLE_RUNTIME_IMAGE"
if ! docker image inspect "$TARGET_IMAGE" >/dev/null 2>&1; then
echo "Real image $TARGET_IMAGE not present locally — pulling ..."
docker pull "$TARGET_IMAGE" >/dev/null 2>&1 || { echo "FAIL: cannot obtain $TARGET_IMAGE"; exit 1; }
fi
pass "using real runtime image $TARGET_IMAGE"
docker tag "$TARGET_IMAGE" "$CACHE_TAG"
docker tag "$TARGET_IMAGE" "$LATEST_TAG"
pass "tagged $TARGET_IMAGE -> $CACHE_TAG (+ :latest)"
fi
echo ""
# ----------------------------------------------------------------------------
# Step 2 — provision a workspace via the real create endpoint.
# ----------------------------------------------------------------------------
echo "--- Step 2: provision workspace (POST /workspaces) ---"
# Provision-time billing on this dev stack (no CP proxy env):
# * A claude-code workspace with a BARE model id derives provider=anthropic-api
# => BYOK, which FAILS CLOSED in prepare unless a usable LLM credential
# exists (MISSING_BYOK_CREDENTIAL).
# * The per-workspace secret-write guard blocks a vendor key while the
# workspace still resolves platform-managed (the MODEL secret isn't stored
# until AFTER payload.secrets are written at create time) — so we can't pass
# the key in the create payload.
# So: create WITHOUT secrets, flip the workspace to byok (explicit override wins
# in BOTH the guard's resolver and the provision resolver), then write the dummy
# vendor key — now permitted. We do NOT rely on Create's first provision to seed
# the config volume (it aborts byok-no-cred BEFORE Start, leaving the volume
# empty). Instead we SEED config.yaml directly into the named config volume and
# then trigger ONE clean provision via /restart. Seeding the volume is also what
# makes the restart-survival assertion meaningful: the restart path reuses the
# volume rather than any template.
CREATE_BODY=$(cat <<JSON
{"name":"Lifecycle E2E Stub","tier":2,"runtime":"$RUNTIME","model":"$LIFECYCLE_MODEL"}
JSON
)
RESP=$(admin_curl -X POST "$BASE/workspaces" -H "Content-Type: application/json" -d "$CREATE_BODY")
WSID=$(ws_field "$RESP" "id")
if [ -z "$WSID" ]; then
fail "create returned no workspace id" "$RESP"
echo "=== Results: $PASS passed, $((FAIL+1)) failed ==="
exit 1
fi
pass "workspace created: $WSID"
SHORT="${WSID:0:12}"
CONFIG_VOL="ws-${SHORT}-configs"
# Mint a workspace bearer for the WorkspaceAuth-gated secret + /restart calls.
WTOKEN=$(e2e_mint_workspace_token "$WSID" || true)
if [ -z "$WTOKEN" ]; then
fail "could not mint workspace token"
echo "=== Results: $PASS passed, $FAIL failed ==="; exit 1
fi
# Flip to byok BEFORE writing the vendor key (explicit override unblocks the
# secret-write guard AND makes the provision resolver pick byok).
BM=$(admin_curl -X PUT "$BASE/admin/workspaces/$WSID/llm-billing-mode" \
-H "Content-Type: application/json" -d '{"mode":"byok"}')
check "billing mode set to byok" "byok" "$BM"
# Write the dummy LLM credential (now allowed on a byok workspace). Inert — the
# stub never calls an LLM; it only needs to exist so byok has a usable cred.
SEC=$(curl -s -X POST "$BASE/workspaces/$WSID/secrets" \
-H "Authorization: Bearer $WTOKEN" -H "Content-Type: application/json" \
-d "{\"key\":\"$LIFECYCLE_LLM_KEY\",\"value\":\"$LIFECYCLE_LLM_VALUE\"}")
echo " secret write: $(echo "$SEC" | head -c 120)"
# In minimax mode also write MODEL_PROVIDER=minimax as a secret env. The
# claude-code adapter's _resolve_model_and_provider_from_env honours
# MODEL_PROVIDER ONLY when it matches a registered provider name (else it's
# treated as a legacy model-id), so a literal "minimax" routes the workspace to
# the `minimax` provider entry — projecting MINIMAX_API_KEY → ANTHROPIC_AUTH_TOKEN
# and setting ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic. workspace-
# server injects MODEL/MOLECULE_MODEL from the picked slug but NO LONGER emits
# MODEL_PROVIDER (applyRuntimeModelEnv, post-2026-05-19), so this secret-provided
# value survives into the container env. Without it a BARE `MiniMax-M2.7` derives
# no provider and falls through to the anthropic-api default (boot banner
# "provider=anthropic-api", base_url unset → AuthenticationError on the first
# call → the "Agent error" this mode exists to catch).
if [ "$LIFECYCLE_LLM" = "minimax" ]; then
SECP=$(curl -s -X POST "$BASE/workspaces/$WSID/secrets" \
-H "Authorization: Bearer $WTOKEN" -H "Content-Type: application/json" \
-d '{"key":"MODEL_PROVIDER","value":"minimax"}')
echo " secret write (MODEL_PROVIDER): $(echo "$SECP" | head -c 120)"
fi
# Seed config.yaml directly into the named config volume so the provision (and
# every later restart) has a config source. Create's byok-no-cred abort never
# wrote it, and this dev stack ships no claude-code template in the platform's
# configsDir for the empty-volume auto-recover to fall back to. The provisioner
# created the volume on its first (aborted) Start attempt; ensure it exists,
# then drop a minimal valid config.yaml in via a throwaway alpine container.
docker volume create "$CONFIG_VOL" >/dev/null 2>&1 || true
# In minimax mode the seeded config MUST carry an explicit `provider: minimax`.
# The claude-code adapter (and the molecule_runtime wheel's
# _derive_provider_from_model) only auto-derive a provider from a `vendor:model`
# or `vendor/model` slug — a BARE `MiniMax-M2.7` derives no provider and falls
# through to the anthropic-api default (boot banner: "provider=anthropic-api",
# ANTHROPIC_BASE_URL unset → the MiniMax key is never projected and the first
# LLM call fails with AuthenticationError). Naming the provider explicitly makes
# the adapter pick the `minimax` registry entry, project
# MINIMAX_API_KEY → ANTHROPIC_AUTH_TOKEN, and set
# ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic — a real round-trip.
LIFECYCLE_PROVIDER_LINE=""
[ "$LIFECYCLE_LLM" = "minimax" ] && LIFECYCLE_PROVIDER_LINE="provider: minimax"
CFG_YAML="name: ${WSID}
description: lifecycle e2e
version: 1.0.0
tier: 2
runtime: ${RUNTIME}
model: ${LIFECYCLE_MODEL}
runtime_config:
model: ${LIFECYCLE_MODEL}
${LIFECYCLE_PROVIDER_LINE}
timeout: 0
"
if docker run --rm -v "${CONFIG_VOL}:/configs" alpine:3 sh -c "cat > /configs/config.yaml" <<EOF >/dev/null 2>&1
${CFG_YAML}
EOF
then pass "seeded config.yaml into $CONFIG_VOL"; else fail "could not seed config.yaml into $CONFIG_VOL"; fi
echo ""
# ----------------------------------------------------------------------------
# Step 3 — provision (via restart) and wait for online; assert container.
# ----------------------------------------------------------------------------
echo "--- Step 3: provision + wait for first online (<=${ONLINE_TIMEOUT}s) ---"
# Kick ONE clean provision now that byok + cred + config.yaml are all in place.
curl -s -X POST "$BASE/workspaces/$WSID/restart" \
-H "Authorization: Bearer $WTOKEN" -H "Content-Type: application/json" -d '{}' >/dev/null
STATUS=""; LAST=""; failed_since=0
for _ in $(seq 1 "$ONLINE_TIMEOUT"); do
WS=$(admin_curl "$BASE/workspaces/$WSID")
STATUS=$(ws_field "$WS" "status")
LAST=$(ws_field "$WS" "last_sample_error")
if [ "$STATUS" = "online" ]; then break; fi
if [ "$STATUS" = "failed" ]; then
failed_since=$((failed_since + 1))
# A restart re-kicks provisioning; give the coalescing pipeline room to
# converge. Only bail if it stays failed for 20s straight.
if [ "$failed_since" -ge 20 ]; then
fail "workspace STUCK in 'failed' during initial provision" "last_sample_error: $LAST"
echo "=== Results: $PASS passed, $FAIL failed ==="; exit 1
fi
else
failed_since=0
fi
sleep 1
done
check "workspace reached online (status=$STATUS)" "online" "$STATUS"
RUN=$(container_running "$WSID")
if [ -n "$RUN" ]; then pass "container running: $RUN"; else fail "no running ws-${WSID:0:12} container" "docker ps shows none"; fi
echo ""
# ----------------------------------------------------------------------------
# Step 4 — RESTART-SURVIVAL (the assertion that would have caught the bug).
# ----------------------------------------------------------------------------
echo "--- Step 4: restart-survival (POST /workspaces/$WSID/restart) ---"
# Re-mint the workspace bearer: every (re)provision rotates the workspace token
# (issueAndInjectToken -> RevokeAllForWorkspace + IssueToken), so the Step-2
# token is now stale. /restart is WorkspaceAuth-gated, so mint a fresh one.
WTOKEN=$(e2e_mint_workspace_token "$WSID" || true)
if [ -z "$WTOKEN" ]; then
fail "could not mint fresh workspace token for restart"
else
RR=$(curl -s -X POST "$BASE/workspaces/$WSID/restart" \
-H "Authorization: Bearer $WTOKEN" -H "Content-Type: application/json" -d '{}')
check "restart accepted (provisioning)" "provisioning" "$RR"
# Poll until online AGAIN. Restart reuses the EXISTING config volume (no
# template/configFiles passed) — so this passes ONLY if the config volume
# survived the stop and still has config.yaml. A regression (volume reaped /
# emptied) surfaces as status=failed with the "config volume is empty" error.
STATUS=""; LAST=""
for _ in $(seq 1 "$ONLINE_TIMEOUT"); do
WS=$(admin_curl "$BASE/workspaces/$WSID")
STATUS=$(ws_field "$WS" "status")
LAST=$(ws_field "$WS" "last_sample_error")
case "$STATUS" in
online) break ;;
failed)
fail "workspace wedged in 'failed' AFTER restart (the config-volume bug class)" "last_sample_error: $LAST"
break ;;
esac
sleep 1
done
check "workspace back online after restart (status=$STATUS)" "online" "$STATUS"
# Explicit negative on the exact bug signature.
if echo "$LAST" | grep -qiF "config volume is empty"; then
fail "restart hit 'config volume is empty' — restart-survival REGRESSION" "$LAST"
else
pass "no 'config volume is empty' error after restart"
fi
RUN=$(container_running "$WSID")
if [ -n "$RUN" ]; then pass "container back after restart: $RUN"; else fail "container missing after restart"; fi
fi
echo ""
# ----------------------------------------------------------------------------
# Step 5 — proxy reach (ws-<id>:8000 Docker-DNS rewrite, end to end).
# ----------------------------------------------------------------------------
echo "--- Step 5: proxy reach (POST /workspaces/$WSID/a2a) ---"
# Debug: print the workspace URL the platform stored so SSRF failures are
# actionable (#2468 RCA).
WS_DEBUG=$(admin_curl "$BASE/workspaces/$WSID")
WS_URL_DEBUG=$(ws_field "$WS_DEBUG" "url")
WS_STATUS_DEBUG=$(ws_field "$WS_DEBUG" "status")
echo " workspace url=$WS_URL_DEBUG status=$WS_STATUS_DEBUG"
# In minimax mode we send a DETERMINISTIC known-answer prompt and assert the
# model echoes the answer back — proving a real LLM round-trip, not just
# reachability. Otherwise a plain "ping".
if [ "$LIFECYCLE_LLM" = "minimax" ]; then
A2A_PROMPT="Reply with exactly the single word PONG and nothing else."
else
A2A_PROMPT="ping"
fi
A2A_BODY=$(python3 -c "
import json,sys
print(json.dumps({'method':'message/send','params':{'message':{'role':'user','parts':[{'type':'text','text':sys.argv[1]}]}}}))
" "$A2A_PROMPT")
# Real LLM cold-start (first turn boots the claude-code SDK + dials MiniMax) is
# slower than the stub; give the real-LLM call a longer ceiling.
A2A_CEIL="$A2A_TIMEOUT"
[ "$LIFECYCLE_LLM" = "minimax" ] && A2A_CEIL="${A2A_MINIMAX_TIMEOUT:-120}"
A2A=$(curl -s --max-time "$A2A_CEIL" -X POST "$BASE/workspaces/$WSID/a2a" \
-H "Content-Type: application/json" \
-d "$A2A_BODY")
# Extract the assistant text part once (shared by the minimax assertion +
# diagnostics). Tolerates result.parts[].text and result.message.parts[].text.
a2a_text() {
echo "$1" | python3 -c "import sys,json
try:
d=json.load(sys.stdin); r=d.get('result',d)
m=r.get('message',r)
parts=m.get('parts',[]) or r.get('parts',[])
print(' '.join(p.get('text','') for p in parts if isinstance(p,dict)))
except Exception:
print('')"
}
if [ "$LIFECYCLE_LLM" = "minimax" ]; then
# REAL round-trip assertion. The reply must be model-produced text — NOT a
# proxy-level unreachable, NOT an LLM-less "Agent error", NOT an empty
# completion. Then it must contain the known answer (PONG).
check "proxy returned a result envelope" '"result"' "$A2A"
AGENT_TEXT="$(a2a_text "$A2A")"
echo " MiniMax reply: $(echo "$AGENT_TEXT" | head -c 200)"
if echo "$A2A" | grep -qiE 'unreachable|workspace has no URL|restarting'; then
fail "MiniMax runtime not reachable through proxy" "$A2A"
elif echo "$AGENT_TEXT" | grep -qiF "message contained no text content"; then
fail "MiniMax returned an EMPTY completion (no text part) — backend/key issue, not a real round-trip" "$AGENT_TEXT"
elif echo "$AGENT_TEXT" | grep -qiE 'agent error|exception|invalid api key|insufficient_quota|exceeded your current quota'; then
fail "MiniMax round-trip returned an error-shaped reply (no real completion)" "$AGENT_TEXT"
elif echo "$AGENT_TEXT" | tr '[:lower:]' '[:upper:]' | grep -qF "PONG"; then
pass "REAL MiniMax round-trip: model replied with the known answer (PONG)"
else
# Non-error, non-empty, but didn't contain PONG — still a real reply (the
# model answered with its own words). Accept as a real round-trip but note it.
if [ -n "$AGENT_TEXT" ]; then
pass "REAL MiniMax round-trip: non-error model reply (did not contain PONG, but real text)"
else
fail "MiniMax round-trip produced no assertable text" "$A2A"
fi
fi
elif [ "$USING_STUB" -eq 1 ]; then
check "proxy returned a result envelope" '"result"' "$A2A"
check "proxy reached stub (canned reply)" 'STUB OK' "$A2A"
# Parse the envelope so whitespace/key-ordering doesn't break the assertion.
ROLE=$(echo "$A2A" | python3 -c "import sys,json
try:
print(json.load(sys.stdin).get('result',{}).get('role',''))
except Exception:
print('')")
check "reply has agent role" "agent" "$ROLE"
else
# Real LLM-less image: we can't get a canned text, but a reachable runtime
# must answer with EITHER a result OR a structured JSON-RPC error — NOT a
# proxy-level "workspace agent unreachable" / "no URL". Assert reachability.
if echo "$A2A" | grep -qiE 'unreachable|workspace has no URL|restarting'; then
fail "real runtime not reachable through proxy" "$A2A"
else
pass "real runtime reachable through proxy (got a JSON-RPC response)"
echo " response: $(echo "$A2A" | head -c 200)"
fi
fi
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
exit "$FAIL"
+459
View File
@@ -0,0 +1,459 @@
#!/usr/bin/env bash
# FUNCTIONAL real-LLM E2E: prove the org concierge (the platform agent) can
# actually DO org-management work — send it a natural-language request and
# assert it REALLY CREATES a workspace via its platform MCP (87 org-admin tools,
# incl. create_workspace), NOT just that a REST API returned 200.
#
# This is the RFC docs/design/rfc-platform-agent.md §11.4 "Reach" check, made
# into a gating CI test:
#
# "chat the platform agent → it list_workspaces then create_workspace via the
# platform MCP and reports back via send_message_to_user."
#
# Unlike test_staging_concierge_e2e.sh (which drives the user_tasks REST+MCP
# primitive directly — a pure DB/handler contract with NO LLM), THIS test drives
# the AGENT: it sends an A2A message/send envelope (the user→concierge chat
# path) and asserts the DETERMINISTIC SIDE EFFECT — a workspace with the exact
# name we asked for now EXISTS in GET /workspaces — which can only happen if the
# concierge's LLM actually invoked the create_workspace platform-MCP tool.
#
# WHAT MUST BE LIVE for this to pass GREEN (else it SKIPs LOUD, never false-red):
# • The org's concierge must be installed as the kind='platform' root AND
# provisioned on the DEDICATED platform-agent image (Dockerfile.platform-agent),
# which ships /opt/molecule-mcp-server — the ONLY image where the platform MCP
# (create_workspace) lights up. On SaaS staging the CP installs + provisions it
# at org-provision time. (See platform_agent.go's SELF-HOST CAVEAT: the ordinary
# claude-code image does NOT ship the platform MCP, so create_workspace is a
# no-op there.) A parallel agent is wiring the platform-agent image into the
# staging provision path; until that lands, this test SKIPs LOUD with a clear
# "concierge not on platform-agent image" message rather than failing red.
# • A working model for the concierge. On SaaS the concierge is platform_managed
# (the CP-exported LLM proxy supplies the model) so no BYOK key is needed for
# the concierge itself.
#
# Env contract (same as test_staging_concierge_e2e.sh / test_staging_full_saas.sh):
# MOLECULE_CP_URL default: https://staging-api.moleculesai.app
# MOLECULE_ADMIN_TOKEN CP admin bearer — Railway staging CP_ADMIN_API_TOKEN
#
# Optional env:
# E2E_PROVISION_TIMEOUT_SECS default 900 (15 min cold tenant EC2 budget)
# E2E_CONCIERGE_ONLINE_SECS default 900 (concierge boot-to-online budget)
# E2E_AGENT_ACT_SECS default 420 (LLM think+tool-call budget after we
# send the message — generous for nondeterminism)
# E2E_KEEP_ORG 1 → skip teardown (debugging only)
# E2E_RUN_ID slug suffix; CI: ${GITHUB_RUN_ID}-${RUN_ATTEMPT}
# E2E_AWS_LEAK_CHECK auto (default) | required | off
# E2E_AWS_TERMINATE_LEAKS 1 → terminate slug-tagged leaked EC2 on exit
# E2E_REQUIRE_LIVE 1 → a SKIP for "no concierge on platform image"
# becomes a hard FAIL (CI sets this so a silently-
# missing platform-agent image can't false-green
# the gate). Default 0 (local: skip-loud).
#
# Exit codes:
# 0 happy path (concierge created the workspace) OR honest skip-loud
# 1 generic / assertion failure (agent didn't act, or tool failed)
# 2 missing required env
# 3 provisioning timed out
# 4 teardown left orphan resources
# 5 E2E_REQUIRE_LIVE=1 but the concierge could not be exercised (no
# platform-agent image / never came online) — false-green guard
set -euo pipefail
# shellcheck disable=SC1091
# shellcheck source=_lib.sh
source "$(dirname "$0")/_lib.sh"
# AWS-leak-check lib — same teardown leak assertion the full-SaaS harness uses.
# shellcheck disable=SC1091
# shellcheck source=lib/aws_leak_check.sh
source "$(dirname "$0")/lib/aws_leak_check.sh"
# Real-completion error-as-text scanner — used to detect the concierge
# surfacing its tool/LLM error AS a reply ("Agent error …") so a broken agent
# can't read as "asked but politely declined".
# shellcheck disable=SC1091
# shellcheck source=lib/completion_assert.sh
source "$(dirname "$0")/lib/completion_assert.sh"
CP_URL="${MOLECULE_CP_URL:-https://staging-api.moleculesai.app}"
ADMIN_TOKEN="${MOLECULE_ADMIN_TOKEN:?MOLECULE_ADMIN_TOKEN required — Railway staging CP_ADMIN_API_TOKEN}"
PROVISION_TIMEOUT_SECS="${E2E_PROVISION_TIMEOUT_SECS:-900}"
CONCIERGE_ONLINE_SECS="${E2E_CONCIERGE_ONLINE_SECS:-900}"
AGENT_ACT_SECS="${E2E_AGENT_ACT_SECS:-420}"
REQUIRE_LIVE="${E2E_REQUIRE_LIVE:-0}"
RUN_ID_SUFFIX="${E2E_RUN_ID:-$(date +%H%M%S)-$$}"
# Fixed e2e- prefix so sweep-stale-e2e-orgs.yml + lint_cleanup_traps.sh reap any
# orphan org. (The lint requires a quoted SLUG=... with a literal e2e-/rt-e2e-
# head.)
SLUG="e2e-cncrg-mk-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
SLUG=$(echo "$SLUG" | tr '[:upper:]' '[:lower:]' | tr -cd 'a-z0-9-' | head -c 32)
# The workspace name we will ask the concierge to create. The RUN_ID makes it
# unique per run so a poll for it can never collide with a sibling run's name.
WORKER_NAME="e2e-cncrg-worker-${RUN_ID_SUFFIX}"
WORKER_NAME=$(echo "$WORKER_NAME" | tr -cd 'a-zA-Z0-9-' | head -c 48)
# Exported so the find_worker_by_name python subshell (run in a pipe) reads it
# via os.environ — a bare shell var would not survive into the subprocess env.
export WORKER_NAME
log() { echo "[$(date +%H:%M:%S)] $*"; }
fail() { echo "[$(date +%H:%M:%S)] ❌ $*" >&2; exit 1; }
ok() { echo "[$(date +%H:%M:%S)] ✅ $*"; }
# skip_loud <reason>: honest skip when the concierge can't be exercised. In CI
# (E2E_REQUIRE_LIVE=1) this is a HARD FAIL (exit 5) so a missing platform-agent
# image can't false-green the gate; locally it skips 0.
skip_loud() {
echo "[$(date +%H:%M:%S)] ⏭️ SKIP: $*" >&2
if [ "$REQUIRE_LIVE" = "1" ]; then
echo "[$(date +%H:%M:%S)] ❌ E2E_REQUIRE_LIVE=1 — a skip is a false-green guard breach here. Failing." >&2
exit 5
fi
exit 0
}
CURL_COMMON=(-sS --max-time 30)
TMPDIR_E2E=$(mktemp -d -t cncrg-mk-XXXXXX)
# ─── teardown trap (worker delete + org delete + leak check) ─────────────────
CLEANUP_DONE=0
WORKER_ID="" # set once the concierge creates it (for targeted delete)
TENANT_URL="" # set after provisioning
TENANT_TOKEN=""
ORG_ID=""
cleanup() {
local entry_rc=$?
[ "$CLEANUP_DONE" = "1" ] && return 0
CLEANUP_DONE=1
rm -rf "$TMPDIR_E2E" 2>/dev/null || true
# Best-effort targeted delete of the worker the concierge created, so the org
# delete below isn't the only thing reaping it (defensive — org delete cascades
# anyway). Only attempted if we resolved its id and have tenant creds.
if [ -n "$WORKER_ID" ] && [ -n "$TENANT_URL" ] && [ -n "$TENANT_TOKEN" ]; then
curl "${CURL_COMMON[@]}" -X DELETE "$TENANT_URL/workspaces/$WORKER_ID?confirm=true" \
-H "Authorization: Bearer $TENANT_TOKEN" \
-H "X-Molecule-Org-Id: $ORG_ID" \
-H "Origin: $TENANT_URL" \
-H "X-Confirm-Name: $WORKER_NAME" >/dev/null 2>&1 || true
fi
if [ "${E2E_KEEP_ORG:-0}" = "1" ]; then
log "E2E_KEEP_ORG=1 — skipping teardown. Manually delete $SLUG when done."
return 0
fi
log "🧹 Tearing down org $SLUG..."
if curl "${CURL_COMMON[@]}" --max-time 120 -X DELETE "$CP_URL/cp/admin/tenants/$SLUG" \
-H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
-d "{\"confirm\":\"$SLUG\"}" >/dev/null 2>&1; then
ok "Teardown request accepted"
else
log "Teardown returned non-2xx (may already be gone)"
fi
# Eventual-consistency wait: org row gone / purged.
local leak_count=1 elapsed=0
while [ "$elapsed" -lt 60 ]; do
leak_count=$(curl "${CURL_COMMON[@]}" "$CP_URL/cp/admin/orgs" \
-H "Authorization: Bearer $ADMIN_TOKEN" 2>/dev/null \
| python3 -c "import json,sys; d=json.load(sys.stdin); print(sum(1 for o in d.get('orgs', []) if o.get('slug')=='$SLUG' and o.get('status') != 'purged'))" \
2>/dev/null || echo 1)
[ "$leak_count" = "0" ] && break
sleep 5; elapsed=$((elapsed + 5))
done
if [ "$leak_count" != "0" ]; then
echo "⚠️ LEAK: org $SLUG still present post-teardown after ${elapsed}s (count=$leak_count)" >&2
exit 4
fi
local aws_leak_rc=0
e2e_verify_no_ec2_leaks_for_slug "$SLUG" || aws_leak_rc=$?
if [ "$aws_leak_rc" != "0" ]; then
case "$aws_leak_rc" in 2) exit 2 ;; *) exit 4 ;; esac
fi
ok "Teardown clean — no orphan org or EC2 resources for $SLUG (${elapsed}s)"
case "$entry_rc" in 0|1|2|3|4|5) ;; *) exit 1 ;; esac
}
trap cleanup EXIT INT TERM
admin_call() { # <method> <path> [curl args…]
local method="$1" path="$2"; shift 2
curl "${CURL_COMMON[@]}" -X "$method" "$CP_URL$path" \
-H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" "$@"
}
# tenant_call: Authorization (tenant admin token — also authenticates the
# concierge, which holds no per-workspace token: validateDiscoveryCaller's admin
# fallback) + X-Molecule-Org-Id (TenantGuard 404s without it) + Origin (edge WAF).
tenant_call() { # <method> <path> [curl args…]
local method="$1" path="$2"; shift 2
curl "${CURL_COMMON[@]}" -X "$method" "$TENANT_URL$path" \
-H "Authorization: Bearer $TENANT_TOKEN" \
-H "X-Molecule-Org-Id: $ORG_ID" \
-H "Origin: $TENANT_URL" "$@"
}
# list_workspaces_json: echo the raw GET /workspaces JSON array (tenant-scoped).
list_workspaces_json() { tenant_call GET /workspaces; }
# find_platform_root: echo the id of the kind='platform' parent_id-null root, or
# "" if none. This IS the concierge — the org's front-door agent.
find_platform_root() {
list_workspaces_json | python3 -c "
import sys, json
try: rows = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
for w in rows if isinstance(rows, list) else []:
if w.get('kind') == 'platform' and not w.get('parent_id'):
print(w.get('id','')); break
else:
print('')"
}
# workspace_field <id> <field>: echo a single field off GET /workspaces/:id.
workspace_field() { # <id> <field>
tenant_call GET "/workspaces/$1" | python3 -c "
import sys, json
try: d = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
print(d.get('$2','') if isinstance(d, dict) else '')"
}
# find_worker_by_name: echo the id of a workspace whose name == WORKER_NAME, or
# "" if not present. THIS is the deterministic side effect we assert on.
find_worker_by_name() {
list_workspaces_json | python3 -c "
import sys, json, os
want = os.environ['WORKER_NAME']
try: rows = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
for w in rows if isinstance(rows, list) else []:
if w.get('name') == want:
print(w.get('id','')); break
else:
print('')"
}
# ─── 0. Preflight ────────────────────────────────────────────────────────────
log "═══ Staging concierge CREATES-A-WORKSPACE (real-LLM) E2E ═══ CP=$CP_URL Slug=$SLUG"
log " worker the concierge will be asked to create: name=$WORKER_NAME"
curl "${CURL_COMMON[@]}" "$CP_URL/health" >/dev/null || fail "CP health check failed"
ok "CP reachable"
# ─── 1. Create org (CP installs + provisions the concierge as platform root) ──
log "1/6 Creating org $SLUG..."
CREATE_RESP=$(admin_call POST /cp/admin/orgs \
-d "{\"slug\":\"$SLUG\",\"name\":\"E2E $SLUG\",\"owner_user_id\":\"e2e-runner:$SLUG\"}")
echo "$CREATE_RESP" | python3 -m json.tool >/dev/null || fail "Org create non-JSON: $CREATE_RESP"
ORG_ID=$(echo "$CREATE_RESP" | python3 -c "import json,sys; print(json.load(sys.stdin).get('id',''))")
[ -z "$ORG_ID" ] && fail "Org create response missing 'id': $CREATE_RESP"
ok "Org created (id=$ORG_ID)"
# ─── 2. Wait for tenant provisioning ─────────────────────────────────────────
log "2/6 Waiting for tenant provisioning (up to ${PROVISION_TIMEOUT_SECS}s)..."
DEADLINE=$(( $(date +%s) + PROVISION_TIMEOUT_SECS ))
LAST_STATUS=""
while true; do
[ "$(date +%s)" -gt "$DEADLINE" ] && exit 3
LIST_JSON=$(admin_call GET /cp/admin/orgs 2>/dev/null || echo '{"orgs":[]}')
STATUS=$(echo "$LIST_JSON" | python3 -c "
import json, sys
d = json.load(sys.stdin)
for o in d.get('orgs', []):
if o.get('slug') == '$SLUG':
print(o.get('instance_status', '')); sys.exit(0)
print('')" 2>/dev/null || echo "")
if [ "$STATUS" != "$LAST_STATUS" ]; then log " status → $STATUS"; LAST_STATUS="$STATUS"; fi
case "$STATUS" in
running) break ;;
failed) fail "Tenant provisioning failed for $SLUG" ;;
*) sleep 15 ;;
esac
done
ok "Tenant provisioning complete"
# Derive tenant domain from CP hostname (prod vs staging).
CP_HOST=$(echo "$CP_URL" | sed -E 's#^https?://##; s#/.*$##')
case "$CP_HOST" in
api.*) DERIVED_DOMAIN="${CP_HOST#api.}" ;;
staging-api.*) DERIVED_DOMAIN="staging.${CP_HOST#staging-api.}" ;;
*) DERIVED_DOMAIN="$CP_HOST" ;;
esac
TENANT_DOMAIN="${MOLECULE_TENANT_DOMAIN:-$DERIVED_DOMAIN}"
TENANT_URL="https://$SLUG.$TENANT_DOMAIN"
log " TENANT_URL=$TENANT_URL"
# ─── 3. Per-tenant admin token + TLS readiness ───────────────────────────────
log "3/6 Fetching per-tenant admin token..."
TENANT_TOKEN=$(admin_call GET "/cp/admin/orgs/$SLUG/admin-token" \
| python3 -c "import json,sys; print(json.load(sys.stdin).get('admin_token',''))" 2>/dev/null || echo "")
[ -z "$TENANT_TOKEN" ] && fail "Could not retrieve per-tenant admin token for $SLUG"
ok "Tenant admin token retrieved (len=${#TENANT_TOKEN})"
log " Waiting for tenant TLS / DNS propagation..."
TLS_DEADLINE=$(( $(date +%s) + 15 * 60 ))
while true; do
curl -sSfk --max-time 5 "$TENANT_URL/health" >/dev/null 2>&1 && break
[ "$(date +%s)" -gt "$TLS_DEADLINE" ] && fail "Tenant /health never 2xx within 15m"
sleep 5
done
ok "Tenant reachable at $TENANT_URL"
# ─── 4. Discover the concierge (kind='platform' root) + ensure it can act ─────
log "4/6 Discovering the concierge (kind='platform' root)..."
# The CP installs the platform agent at org-provision; allow a short settle for
# the row + re-parent backfill to land.
CONCIERGE_ID=""
DISC_DEADLINE=$(( $(date +%s) + 180 ))
while true; do
CONCIERGE_ID=$(find_platform_root)
[ -n "$CONCIERGE_ID" ] && break
[ "$(date +%s)" -gt "$DISC_DEADLINE" ] && break
sleep 10
done
if [ -z "$CONCIERGE_ID" ]; then
skip_loud "no kind='platform' concierge root in this org — the platform agent was not installed at provision. \
This needs the CP platform-agent install (RFC §3) live on staging. Until then there is no agent to drive."
fi
ok "Concierge (platform root) = $CONCIERGE_ID"
# The concierge must be ONLINE + routable for its LLM to receive the A2A message
# and reach the platform MCP. Bounded poll — generous because a cold concierge
# boots its container + loads the platform MCP server before it is reachable.
log " Waiting for the concierge to be online (up to ${CONCIERGE_ONLINE_SECS}s)..."
ONLINE_DEADLINE=$(( $(date +%s) + CONCIERGE_ONLINE_SECS ))
C_STATUS=""; C_URL=""; LAST_C_STATUS=""
while true; do
C_STATUS=$(workspace_field "$CONCIERGE_ID" status)
C_URL=$(workspace_field "$CONCIERGE_ID" url)
if [ "$C_STATUS" != "$LAST_C_STATUS" ]; then log " concierge → ${C_STATUS:-<none>}"; LAST_C_STATUS="$C_STATUS"; fi
if [ "$C_STATUS" = "online" ] && [ -n "$C_URL" ]; then break; fi
if [ "$(date +%s)" -gt "$ONLINE_DEADLINE" ]; then
LAST_ERR=$(workspace_field "$CONCIERGE_ID" last_sample_error)
skip_loud "concierge $CONCIERGE_ID never reached online+routable within ${CONCIERGE_ONLINE_SECS}s \
(last status='${C_STATUS}', url='${C_URL}', err='${LAST_ERR}'). On a tenant where the concierge is NOT \
provisioned on the platform-agent image (no /opt/molecule-mcp-server, no model), it cannot run the \
create_workspace tool — that is the parallel-agent image work this gate depends on."
fi
sleep 10
done
ok "Concierge online + routable (url assigned)"
# Pre-state: the worker MUST NOT exist yet (so its later appearance is causally
# the concierge's doing, not a pre-existing row).
PRE_EXISTING=$(find_worker_by_name)
[ -n "$PRE_EXISTING" ] && fail "worker '$WORKER_NAME' already exists pre-test ($PRE_EXISTING) — name collision, cannot prove causality"
ok "Pre-state confirmed: '$WORKER_NAME' does not exist yet"
# ─── 5. Drive the AGENT: A2A message/send → it must create the workspace ──────
log "5/6 Sending the concierge a natural-language create-workspace request..."
# Imperative + explicit to defuse LLM nondeterminism: name the tool, the exact
# workspace NAME and ROLE, and tell it not to ask a clarifying question. The
# message/send envelope is the canvas user→agent chat path (handlers/a2a_proxy.go),
# identical to the shape test_a2a_e2e.sh / test_staging_full_saas.sh use.
AGENT_PROMPT="Please create a new workspace in this org right now using your platform tools. \
Use the create_workspace tool with name exactly \"${WORKER_NAME}\" and role \"engineer\". \
Do not ask me any clarifying questions — the name and role are final. \
After the tool succeeds, reply with the new workspace id."
A2A_PAYLOAD=$(WORKER_NAME="$WORKER_NAME" AGENT_PROMPT="$AGENT_PROMPT" python3 -c "
import json, os, uuid
print(json.dumps({
'jsonrpc': '2.0',
'method': 'message/send',
'id': 'e2e-cncrg-mk-1',
'params': {
'message': {
'role': 'user',
'messageId': f'e2e-{uuid.uuid4().hex[:8]}',
'parts': [{'kind': 'text', 'text': os.environ['AGENT_PROMPT']}],
}
}
}))")
# Cold concierge: first turn opens TLS to the LLM, loads the platform MCP, runs
# a tool call. Give it a wide per-call window AND retry on edge cold-start 5xx.
A2A_TMP="$TMPDIR_E2E/a2a_out"
AGENT_TEXT=""
A2A_OK=0
for A2A_ATTEMPT in $(seq 1 8); do
: >"$A2A_TMP"
set +e
A2A_CODE=$(tenant_call POST "/workspaces/$CONCIERGE_ID/a2a" \
--max-time "$AGENT_ACT_SECS" \
-H "Content-Type: application/json" \
-d "$A2A_PAYLOAD" \
-o "$A2A_TMP" -w '%{http_code}' 2>/dev/null)
A2A_RC=$?
set -e
A2A_CODE=${A2A_CODE:-000}
A2A_RESP=$(cat "$A2A_TMP" 2>/dev/null || echo "")
if [ "$A2A_RC" = "0" ] && [ "$A2A_CODE" -ge 200 ] && [ "$A2A_CODE" -lt 300 ]; then
A2A_OK=1
break
fi
if echo "$A2A_CODE" | grep -Eq '^(502|503|504)$'; then
log " A2A cold-start attempt $A2A_ATTEMPT/8 returned $A2A_CODE — retrying"
[ "$A2A_ATTEMPT" -lt 8 ] && { sleep 15; continue; }
fi
break
done
if [ "$A2A_OK" != "1" ]; then
# A non-2xx A2A POST is an INFRA/transport failure (agent unreachable), not an
# "agent declined" — distinct from the assertion below.
fail "A2A POST /workspaces/$CONCIERGE_ID/a2a failed (curl_rc=$A2A_RC, http=$A2A_CODE) after $A2A_ATTEMPT attempt(s): $(echo "$A2A_RESP" | head -c 400)"
fi
AGENT_TEXT=$(echo "$A2A_RESP" | python3 -c "
import sys, json
try: d = json.load(sys.stdin)
except Exception: print(''); sys.exit(0)
parts = (d.get('result') or {}).get('parts', []) if isinstance(d, dict) else []
print(parts[0].get('text','') if parts else '')" 2>/dev/null || echo "")
log " concierge replied (first 300 chars): $(echo "$AGENT_TEXT" | head -c 300)"
# ─── 6. ASSERT the deterministic side effect: the worker now EXISTS ───────────
log "6/6 Polling GET /workspaces for the worker the concierge was asked to create..."
# The create is the side effect; the LLM may take a few turns / a moment to flush
# the tool call. Poll the NAME (deterministic) — tolerant of when exactly the row
# lands, intolerant of it never landing.
ACT_DEADLINE=$(( $(date +%s) + AGENT_ACT_SECS ))
while true; do
WORKER_ID=$(find_worker_by_name)
[ -n "$WORKER_ID" ] && break
if [ "$(date +%s)" -gt "$ACT_DEADLINE" ]; then
# The agent answered but the workspace never appeared → the LLM did NOT call
# create_workspace (or the tool failed). Distinguish the two for the operator.
if hit=$(a2a_completion_error_marker "$AGENT_TEXT"); then
fail "TOOL FAILED: concierge surfaced an error-as-text reply (matched '$hit') and no workspace '$WORKER_NAME' was created. \
The platform MCP create_workspace tool errored. Reply: $(echo "$AGENT_TEXT" | head -c 400)"
fi
fail "AGENT DID NOT ACT: concierge replied but no workspace named '$WORKER_NAME' exists in GET /workspaces after ${AGENT_ACT_SECS}s. \
The concierge's LLM did not invoke the create_workspace platform-MCP tool. \
Reply: $(echo "$AGENT_TEXT" | head -c 400)"
fi
sleep 8
done
ok "DETERMINISTIC SIDE EFFECT CONFIRMED: workspace '$WORKER_NAME' now EXISTS (id=$WORKER_ID)"
# Confirm it is a real workspace row (kind='workspace') parented under the org —
# i.e. a genuine create, not a no-op echo. parent_id may be the concierge (the
# concierge creates children under itself by convention) or another node; we
# assert only that it's a non-platform workspace, which is what create_workspace
# yields.
WORKER_KIND=$(workspace_field "$WORKER_ID" kind)
if [ -n "$WORKER_KIND" ] && [ "$WORKER_KIND" != "workspace" ]; then
fail "created node '$WORKER_NAME' has kind='$WORKER_KIND' (want 'workspace') — not a real worker create"
fi
ok "Created node is a real kind='workspace' row"
# Soft confirmation: the concierge SHOULD report back. Non-fatal (the side
# effect above is the hard proof) — but a reply that is itself an error is a
# yellow flag worth logging even though the row landed.
if [ -n "$AGENT_TEXT" ]; then
if a2a_completion_error_marker "$AGENT_TEXT" >/dev/null; then
log " ⚠️ concierge reply looks like an error-as-text even though the workspace was created — investigate the tool result surfacing."
else
ok "Concierge replied confirming the action (non-error)"
fi
else
log " (concierge returned no text part — the row landing is the proof; reply is optional)"
fi
ok "═══ STAGING CONCIERGE CREATES-A-WORKSPACE E2E PASSED ═══"
log "Proven: a natural-language A2A request → the concierge's LLM invoked create_workspace via the platform MCP → real org mutation (workspace '$WORKER_NAME' id=$WORKER_ID). Teardown runs via EXIT trap."
+376
View File
@@ -0,0 +1,376 @@
#!/usr/bin/env bash
# Real-staging E2E for the concierge user_tasks primitive (Feature 3 of the
# concierge / platform-agent set). Exercises the FULL agent→user "ask" contract
# both surfaces expose, END-TO-END against a real EC2-backed staging tenant:
#
# REST (per-workspace, tenant-admin-token authenticated):
# POST /workspaces/:id/user-tasks create an ask
# GET /workspaces/:id/user-tasks this workspace's asks
# GET /user-tasks/pending (AdminAuth) org-wide pending asks
# PATCH /workspaces/:id/user-tasks/:taskId edit (scoped by ws id)
# DELETE /workspaces/:id/user-tasks/:taskId remove (scoped by ws id)
# POST /workspaces/:id/user-tasks/:taskId/resolve done|dismissed
#
# MCP a2a-bridge tools (POST /workspaces/:id/mcp, JSON-RPC tools/call):
# request_user_action(title, detail?) list_user_tasks()
# update_user_task(user_task_id, …) delete_user_task(user_task_id)
#
# Cross-workspace authz: workspace B cannot PATCH/DELETE workspace A's task
# (the user_tasks handler scopes every mutation by the URL :id, so a B-path
# call against an A-owned task 404s — the same scoping the local
# test_user_tasks_e2e.sh pins, here proven over the real tenant ws-server).
#
# Why a real-staging sibling to the LOCAL test_user_tasks_e2e.sh: the local one
# runs against a dev workspace-server with external/in-memory workspaces. This
# one provisions a REAL throwaway org + tenant (same CP-admin scaffolding as
# test_staging_full_saas.sh) and drives the user_tasks surfaces through the live
# tenant auth chain (TenantGuard + WorkspaceAuth + Cloudflare edge) — the exact
# path a canvas concierge agent hits in production. It REUSES the staging
# harness's env contract, org-provision/teardown shape, _lib.sh helpers, and the
# AWS-leak-check lib, so the org lifecycle scaffolding is shared, not duplicated.
#
# NOTE: user_tasks is a pure DB/handler primitive — no LLM container is needed.
# We DO NOT wait for any workspace to boot online (no MINIMAX/ANTHROPIC key
# required), which keeps this test fast and decoupled from EC2 cold-boot flake.
# Workspaces are created in 'external' mode so the tenant ws-server registers
# the row without provisioning an EC2 (no leak beyond the org teardown).
#
# Required env (same contract as test_staging_full_saas.sh):
# MOLECULE_CP_URL default: https://staging-api.moleculesai.app
# MOLECULE_ADMIN_TOKEN CP admin bearer — Railway staging CP_ADMIN_API_TOKEN
#
# Optional env:
# E2E_PROVISION_TIMEOUT_SECS default 900 (15 min cold tenant EC2 budget)
# E2E_KEEP_ORG 1 → skip teardown (debugging only)
# E2E_RUN_ID slug suffix; CI: ${GITHUB_RUN_ID}-${RUN_ATTEMPT}
# E2E_AWS_LEAK_CHECK auto (default) | required | off
# E2E_AWS_TERMINATE_LEAKS 1 → terminate slug-tagged leaked EC2 on exit
#
# Exit codes:
# 0 happy path
# 1 generic / assertion failure
# 2 missing required env
# 3 provisioning timed out
# 4 teardown left orphan resources
set -euo pipefail
# _lib.sh gives us sanitize/admin-auth conventions shared across the suite.
# shellcheck disable=SC1091
# shellcheck source=_lib.sh
source "$(dirname "$0")/_lib.sh"
# AWS-leak-check lib — same teardown leak assertion the full-SaaS harness uses.
# shellcheck disable=SC1091
# shellcheck source=lib/aws_leak_check.sh
source "$(dirname "$0")/lib/aws_leak_check.sh"
CP_URL="${MOLECULE_CP_URL:-https://staging-api.moleculesai.app}"
ADMIN_TOKEN="${MOLECULE_ADMIN_TOKEN:?MOLECULE_ADMIN_TOKEN required — Railway staging CP_ADMIN_API_TOKEN}"
PROVISION_TIMEOUT_SECS="${E2E_PROVISION_TIMEOUT_SECS:-900}"
RUN_ID_SUFFIX="${E2E_RUN_ID:-$(date +%H%M%S)-$$}"
# Fixed e2e- prefix so sweep-stale-e2e-orgs.yml + lint_cleanup_traps.sh reap any
# orphan. (The lint requires a quoted SLUG=... with a literal e2e-/rt-e2e- head.)
SLUG="e2e-cncrg-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
SLUG=$(echo "$SLUG" | tr '[:upper:]' '[:lower:]' | tr -cd 'a-z0-9-' | head -c 32)
log() { echo "[$(date +%H:%M:%S)] $*"; }
fail() { echo "[$(date +%H:%M:%S)] ❌ $*" >&2; exit 1; }
ok() { echo "[$(date +%H:%M:%S)] ✅ $*"; }
PASS=0
FAIL=0
check() { # <desc> <expected-substr> <actual>
if echo "$3" | grep -qF -- "$2"; then echo " PASS: $1"; PASS=$((PASS + 1));
else echo " FAIL: $1"; echo " expected to contain: $2"; echo " got: $(echo "$3" | head -c 300)"; FAIL=$((FAIL + 1)); fi
}
check_not() { # <desc> <unexpected-substr> <actual>
if echo "$3" | grep -qF -- "$2"; then echo " FAIL: $1 (should NOT contain: $2)"; FAIL=$((FAIL + 1));
else echo " PASS: $1"; PASS=$((PASS + 1)); fi
}
check_code() { # <desc> <expected> <actual>
if [ "$3" = "$2" ]; then echo " PASS: $1 (HTTP $3)"; PASS=$((PASS + 1));
else echo " FAIL: $1 (expected HTTP $2, got HTTP $3)"; FAIL=$((FAIL + 1)); fi
}
CURL_COMMON=(-sS --max-time 30)
TMPDIR_E2E=$(mktemp -d -t cncrg-staging-XXXXXX)
# ─── teardown trap (org delete + leak check) ─────────────────────────────────
CLEANUP_DONE=0
cleanup_org() {
local entry_rc=$?
[ "$CLEANUP_DONE" = "1" ] && return 0
CLEANUP_DONE=1
rm -rf "$TMPDIR_E2E" 2>/dev/null || true
if [ "${E2E_KEEP_ORG:-0}" = "1" ]; then
log "E2E_KEEP_ORG=1 — skipping teardown. Manually delete $SLUG when done."
return 0
fi
log "🧹 Tearing down org $SLUG..."
if curl "${CURL_COMMON[@]}" --max-time 120 -X DELETE "$CP_URL/cp/admin/tenants/$SLUG" \
-H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" \
-d "{\"confirm\":\"$SLUG\"}" >/dev/null 2>&1; then
ok "Teardown request accepted"
else
log "Teardown returned non-2xx (may already be gone)"
fi
# Eventual-consistency wait: org row gone / purged.
local leak_count=1 elapsed=0
while [ "$elapsed" -lt 60 ]; do
leak_count=$(curl "${CURL_COMMON[@]}" "$CP_URL/cp/admin/orgs" \
-H "Authorization: Bearer $ADMIN_TOKEN" 2>/dev/null \
| python3 -c "import json,sys; d=json.load(sys.stdin); print(sum(1 for o in d.get('orgs', []) if o.get('slug')=='$SLUG' and o.get('status') != 'purged'))" \
2>/dev/null || echo 1)
[ "$leak_count" = "0" ] && break
sleep 5; elapsed=$((elapsed + 5))
done
if [ "$leak_count" != "0" ]; then
echo "⚠️ LEAK: org $SLUG still present post-teardown after ${elapsed}s (count=$leak_count)" >&2
exit 4
fi
local aws_leak_rc=0
e2e_verify_no_ec2_leaks_for_slug "$SLUG" || aws_leak_rc=$?
if [ "$aws_leak_rc" != "0" ]; then
case "$aws_leak_rc" in 2) exit 2 ;; *) exit 4 ;; esac
fi
ok "Teardown clean — no orphan org or EC2 resources for $SLUG (${elapsed}s)"
case "$entry_rc" in 0|1|2|3|4) ;; *) exit 1 ;; esac
}
trap cleanup_org EXIT INT TERM
admin_call() { # <method> <path> [curl args…]
local method="$1" path="$2"; shift 2
curl "${CURL_COMMON[@]}" -X "$method" "$CP_URL$path" \
-H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" "$@"
}
# ─── 0. Preflight ────────────────────────────────────────────────────────────
log "═══ Staging concierge user_tasks E2E ═══ CP=$CP_URL Slug=$SLUG"
curl "${CURL_COMMON[@]}" "$CP_URL/health" >/dev/null || fail "CP health check failed"
ok "CP reachable"
# ─── 1. Create org ───────────────────────────────────────────────────────────
log "1/6 Creating org $SLUG..."
CREATE_RESP=$(admin_call POST /cp/admin/orgs \
-d "{\"slug\":\"$SLUG\",\"name\":\"E2E $SLUG\",\"owner_user_id\":\"e2e-runner:$SLUG\"}")
echo "$CREATE_RESP" | python3 -m json.tool >/dev/null || fail "Org create non-JSON: $CREATE_RESP"
ORG_ID=$(echo "$CREATE_RESP" | python3 -c "import json,sys; print(json.load(sys.stdin).get('id',''))")
[ -z "$ORG_ID" ] && fail "Org create response missing 'id': $CREATE_RESP"
ok "Org created (id=$ORG_ID)"
# ─── 2. Wait for tenant provisioning ─────────────────────────────────────────
log "2/6 Waiting for tenant provisioning (up to ${PROVISION_TIMEOUT_SECS}s)..."
DEADLINE=$(( $(date +%s) + PROVISION_TIMEOUT_SECS ))
LAST_STATUS=""
while true; do
[ "$(date +%s)" -gt "$DEADLINE" ] && exit 3
LIST_JSON=$(admin_call GET /cp/admin/orgs 2>/dev/null || echo '{"orgs":[]}')
STATUS=$(echo "$LIST_JSON" | python3 -c "
import json, sys
d = json.load(sys.stdin)
for o in d.get('orgs', []):
if o.get('slug') == '$SLUG':
print(o.get('instance_status', '')); sys.exit(0)
print('')" 2>/dev/null || echo "")
if [ "$STATUS" != "$LAST_STATUS" ]; then log " status → $STATUS"; LAST_STATUS="$STATUS"; fi
case "$STATUS" in
running) break ;;
failed) fail "Tenant provisioning failed for $SLUG" ;;
*) sleep 15 ;;
esac
done
ok "Tenant provisioning complete"
# Derive tenant domain from CP hostname (prod vs staging).
CP_HOST=$(echo "$CP_URL" | sed -E 's#^https?://##; s#/.*$##')
case "$CP_HOST" in
api.*) DERIVED_DOMAIN="${CP_HOST#api.}" ;;
staging-api.*) DERIVED_DOMAIN="staging.${CP_HOST#staging-api.}" ;;
*) DERIVED_DOMAIN="$CP_HOST" ;;
esac
TENANT_DOMAIN="${MOLECULE_TENANT_DOMAIN:-$DERIVED_DOMAIN}"
TENANT_URL="https://$SLUG.$TENANT_DOMAIN"
log " TENANT_URL=$TENANT_URL"
# ─── 3. Per-tenant admin token + TLS readiness ───────────────────────────────
log "3/6 Fetching per-tenant admin token..."
TENANT_TOKEN=$(admin_call GET "/cp/admin/orgs/$SLUG/admin-token" \
| python3 -c "import json,sys; print(json.load(sys.stdin).get('admin_token',''))" 2>/dev/null || echo "")
[ -z "$TENANT_TOKEN" ] && fail "Could not retrieve per-tenant admin token for $SLUG"
ok "Tenant admin token retrieved (len=${#TENANT_TOKEN})"
log " Waiting for tenant TLS / DNS propagation..."
TLS_DEADLINE=$(( $(date +%s) + 15 * 60 ))
while true; do
curl -sSfk --max-time 5 "$TENANT_URL/health" >/dev/null 2>&1 && break
[ "$(date +%s)" -gt "$TLS_DEADLINE" ] && fail "Tenant /health never 2xx within 15m"
sleep 5
done
ok "Tenant reachable at $TENANT_URL"
# tenant_call: Authorization (tenant admin token, valid for every workspace) +
# X-Molecule-Org-Id (TenantGuard 404s without it) + Origin (Cloudflare edge).
tenant_call() { # <method> <path> [curl args…]
local method="$1" path="$2"; shift 2
curl "${CURL_COMMON[@]}" -X "$method" "$TENANT_URL$path" \
-H "Authorization: Bearer $TENANT_TOKEN" \
-H "X-Molecule-Org-Id: $ORG_ID" \
-H "Origin: $TENANT_URL" "$@"
}
# Create an external workspace (row only — no EC2). Echoes its id.
create_external_ws() { # <name>
local name="$1" resp
resp=$(tenant_call POST /workspaces -H "Content-Type: application/json" \
-d "{\"name\":\"$name\",\"tier\":1,\"runtime\":\"external\",\"external\":true}")
echo "$resp" | python3 -c "import sys,re
b=sys.stdin.read()
m=re.search(r'\"id\"\s*:\s*\"([^\"]+)\"', b)
print(m.group(1) if m else '')"
}
# MCP JSON-RPC tools/call against /workspaces/:id/mcp. Echoes the result text
# (result.content[].text). Persists HTTP code to a file (runs in $()).
MCP_CODE_FILE="$TMPDIR_E2E/mcp_code"
mcp_call() { # <wsid> <tool> <args-json>
local wsid="$1" tool="$2" args="$3" out code
out="$TMPDIR_E2E/mcp_out"
set +e
code=$(tenant_call POST "/workspaces/$wsid/mcp" -H "Content-Type: application/json" \
-d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"$tool\",\"arguments\":$args}}" \
-o "$out" -w "%{http_code}" 2>/dev/null)
set -e
printf '%s' "$code" > "$MCP_CODE_FILE"
python3 -c "
import sys, json
try: d = json.load(open('$out'))
except Exception: print(''); sys.exit(0)
res = d.get('result') if isinstance(d, dict) else None
print(''.join(c.get('text','') for c in res.get('content', [])) if isinstance(res, dict) else '')"
}
mcp_http_code() { cat "$MCP_CODE_FILE" 2>/dev/null || echo ''; }
# ─── 4. Provision two workspaces (A raises asks, B probes cross-ws authz) ─────
log "4/6 Creating two tenant workspaces (external rows — no EC2)..."
WS_A=$(create_external_ws "Concierge-UT-A-$$")
[ -n "$WS_A" ] || fail "ws-A create returned no id"
WS_B=$(create_external_ws "Concierge-UT-B-$$")
[ -n "$WS_B" ] || fail "ws-B create returned no id"
ok "ws-A=$WS_A ws-B=$WS_B"
# ─── 5. user_tasks REST + MCP + authz ────────────────────────────────────────
log "5/6 user_tasks contract (REST + MCP + cross-ws authz)..."
# 5.1 REST create → 201, status pending
R=$(tenant_call POST "/workspaces/$WS_A/user-tasks" -H "Content-Type: application/json" \
-d '{"title":"Review the Q3 draft","detail":"Need your sign-off before send"}' \
-o "$TMPDIR_E2E/c.json" -w "%{http_code}" 2>/dev/null || echo "000")
BODY=$(cat "$TMPDIR_E2E/c.json" 2>/dev/null || echo "")
check_code "REST create user-task" "201" "$R"
check "create returns status pending" '"status":"pending"' "$BODY"
TASK_ID=$(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('user_task_id',''))" 2>/dev/null || echo "")
[ -n "$TASK_ID" ] || fail "no user_task_id returned: $BODY"
log " TASK_ID=$TASK_ID"
# 5.2 REST read (this workspace + admin org-wide pending)
R=$(tenant_call GET "/workspaces/$WS_A/user-tasks")
check "GET ws-A user-tasks contains the task" "$TASK_ID" "$R"
check "GET ws-A user-tasks shows title" 'Review the Q3 draft' "$R"
R=$(tenant_call GET "/user-tasks/pending")
check "GET /user-tasks/pending (admin) contains the task" "$TASK_ID" "$R"
check "pending entry carries workspace_name" "Concierge-UT-A-$$" "$R"
# 5.3 REST PATCH title/detail → 200, applied
R=$(tenant_call PATCH "/workspaces/$WS_A/user-tasks/$TASK_ID" -H "Content-Type: application/json" \
-d '{"title":"Review the Q3 draft (URGENT)","detail":"Sign-off needed by EOD"}' \
-o /dev/null -w "%{http_code}" 2>/dev/null || echo "000")
check_code "REST PATCH user-task" "200" "$R"
R=$(tenant_call GET "/workspaces/$WS_A/user-tasks")
check "PATCH applied new title" '(URGENT)' "$R"
check "PATCH applied new detail" 'Sign-off needed by EOD' "$R"
# 5.4 REST resolve done → 200, gone from pending
R=$(tenant_call POST "/workspaces/$WS_A/user-tasks/$TASK_ID/resolve" -H "Content-Type: application/json" \
-d '{"status":"done","resolved_by":"cto"}' -o "$TMPDIR_E2E/r.json" -w "%{http_code}" 2>/dev/null || echo "000")
BODY=$(cat "$TMPDIR_E2E/r.json" 2>/dev/null || echo "")
check_code "REST resolve done" "200" "$R"
check "resolve echoes status done" '"status":"done"' "$BODY"
R=$(tenant_call GET "/user-tasks/pending")
check_not "resolved task no longer pending (admin feed)" "$TASK_ID" "$R"
# 5.5 MCP request_user_action → new pending task surfaces on the admin feed
TEXT=$(mcp_call "$WS_A" "request_user_action" '{"title":"Provide the staging API key","detail":"Blocked on it for the deploy"}')
check_code "MCP request_user_action HTTP" "200" "$(mcp_http_code)"
check "MCP request_user_action success text" 'Asked the user' "$TEXT"
R=$(tenant_call GET "/user-tasks/pending")
check "MCP-created ask appears in pending feed" 'Provide the staging API key' "$R"
MCP_TASK_ID=$(echo "$R" | python3 -c "
import sys, json
for t in json.load(sys.stdin):
if t.get('title') == 'Provide the staging API key':
print(t.get('id','')); break" 2>/dev/null || echo "")
log " MCP_TASK_ID=$MCP_TASK_ID"
# 5.6 MCP list_user_tasks returns ws-A's task(s)
TEXT=$(mcp_call "$WS_A" "list_user_tasks" '{}')
check_code "MCP list_user_tasks HTTP" "200" "$(mcp_http_code)"
check "list_user_tasks contains the MCP task" 'Provide the staging API key' "$TEXT"
check "list_user_tasks shows it pending" '"status":"pending"' "$TEXT"
# 5.7 MCP update_user_task changes it
if [ -n "$MCP_TASK_ID" ]; then
TEXT=$(mcp_call "$WS_A" "update_user_task" "{\"user_task_id\":\"$MCP_TASK_ID\",\"title\":\"Provide the PROD API key\"}")
check_code "MCP update_user_task HTTP" "200" "$(mcp_http_code)"
check "MCP update_user_task success text" 'User task updated' "$TEXT"
TEXT=$(mcp_call "$WS_A" "list_user_tasks" '{}')
check "update applied (new title)" 'Provide the PROD API key' "$TEXT"
check_not "update applied (old title gone)" 'staging API key' "$TEXT"
# 5.8 MCP delete_user_task → gone from list
TEXT=$(mcp_call "$WS_A" "delete_user_task" "{\"user_task_id\":\"$MCP_TASK_ID\"}")
check_code "MCP delete_user_task HTTP" "200" "$(mcp_http_code)"
check "MCP delete_user_task success text" 'User task deleted' "$TEXT"
TEXT=$(mcp_call "$WS_A" "list_user_tasks" '{}')
check_not "deleted task gone from list" 'Provide the PROD API key' "$TEXT"
else
echo " FAIL: could not resolve MCP_TASK_ID — MCP update/delete steps skipped"
FAIL=$((FAIL + 1))
fi
# 5.9 Cross-workspace authz: ws-B cannot mutate ws-A's task (scoped by URL :id)
SCOPE_ID=$(tenant_call POST "/workspaces/$WS_A/user-tasks" -H "Content-Type: application/json" \
-d '{"title":"Scope probe task"}' | python3 -c "import sys,json; print(json.load(sys.stdin).get('user_task_id',''))" 2>/dev/null || echo "")
[ -n "$SCOPE_ID" ] || fail "scope-probe task create failed"
log " SCOPE_ID=$SCOPE_ID (owned by ws-A)"
# ws-B PATCHes ws-A's task → 404 (workspace_id scope).
R=$(tenant_call PATCH "/workspaces/$WS_B/user-tasks/$SCOPE_ID" -H "Content-Type: application/json" \
-d '{"title":"hijack"}' -o /dev/null -w "%{http_code}" 2>/dev/null || echo "000")
check_code "ws-B PATCH of ws-A's task scoped out" "404" "$R"
# ws-B DELETEs ws-A's task → 404.
R=$(tenant_call DELETE "/workspaces/$WS_B/user-tasks/$SCOPE_ID" -o /dev/null -w "%{http_code}" 2>/dev/null || echo "000")
check_code "ws-B DELETE of ws-A's task scoped out" "404" "$R"
# Task survived unchanged on ws-A.
R=$(tenant_call GET "/workspaces/$WS_A/user-tasks")
check "ws-A's task survived cross-ws attempts" "$SCOPE_ID" "$R"
check_not "ws-A's task title was NOT hijacked" 'hijack' "$R"
# ws-B's own list must NOT see ws-A's task at all.
R=$(tenant_call GET "/workspaces/$WS_B/user-tasks")
check_not "ws-B list excludes ws-A's task (read isolation)" "$SCOPE_ID" "$R"
# 5.10 Validation contracts
R=$(tenant_call POST "/workspaces/$WS_A/user-tasks" -H "Content-Type: application/json" \
-d '{"detail":"no title here"}' -o /dev/null -w "%{http_code}" 2>/dev/null || echo "000")
check_code "create without title → 400" "400" "$R"
R=$(tenant_call POST "/workspaces/$WS_A/user-tasks/$SCOPE_ID/resolve" -H "Content-Type: application/json" \
-d '{"status":"banana"}' -o /dev/null -w "%{http_code}" 2>/dev/null || echo "000")
check_code "resolve with invalid status → 400" "400" "$R"
R=$(tenant_call PATCH "/workspaces/$WS_A/user-tasks/$SCOPE_ID" -H "Content-Type: application/json" \
-d '{"status":"banana"}' -o /dev/null -w "%{http_code}" 2>/dev/null || echo "000")
check_code "PATCH with invalid status → 400" "400" "$R"
# ─── 6. Results ──────────────────────────────────────────────────────────────
log "6/6 Results: $PASS passed, $FAIL failed (teardown runs via EXIT trap)"
[ "$FAIL" -eq 0 ] || fail "$FAIL user_tasks assertion(s) failed"
ok "═══ STAGING CONCIERGE user_tasks E2E PASSED ($PASS checks) ═══"
+351
View File
@@ -0,0 +1,351 @@
#!/usr/bin/env bash
# E2E tests for the user_tasks platform ability — agent → user action
# requests ("asks"). Exercises the FULL contract both surfaces expose:
#
# REST (WorkspaceAuth unless noted):
# POST /workspaces/:id/user-tasks create an ask
# GET /workspaces/:id/user-tasks this workspace's asks
# GET /user-tasks/pending (AdminAuth) org-wide pending asks
# PATCH /workspaces/:id/user-tasks/:taskId edit (scoped by ws id)
# DELETE /workspaces/:id/user-tasks/:taskId remove (scoped by ws id)
# POST /workspaces/:id/user-tasks/:taskId/resolve done|dismissed
#
# MCP a2a-bridge tools (POST /workspaces/:id/mcp, JSON-RPC tools/call):
# request_user_action(title, detail?) list_user_tasks()
# update_user_task(user_task_id, …) delete_user_task(user_task_id)
#
# The MCP arm is what proves the agent→user ability END-TO-END: it drives
# the literal `tools/call` envelope through the real WorkspaceAuth chain
# (the exact call a canvas agent makes), then asserts the new task surfaces
# on the admin-gated concierge feed (/user-tasks/pending).
#
# Requires: platform running on $BASE (default http://localhost:8080).
# Env contract (same as its siblings in this dir):
# BASE platform base URL (default http://localhost:8080)
# ADMIN_TOKEN / platform admin bearer; MOLECULE_ADMIN_TOKEN wins.
# MOLECULE_ADMIN_TOKEN Sent on AdminAuth routes (create/delete ws,
# /user-tasks/pending). Fail-open dev platform with
# no admin token still works (helpers send nothing).
set -euo pipefail
source "$(dirname "$0")/_lib.sh" # sets BASE default + admin-auth helpers
PASS=0
FAIL=0
check() {
local desc="$1"
local expected="$2"
local actual="$3"
if echo "$actual" | grep -qF -- "$expected"; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected to contain: $expected"
echo " got: $(echo "$actual" | head -5)"
FAIL=$((FAIL + 1))
fi
}
check_not() {
local desc="$1"
local unexpected="$2"
local actual="$3"
if echo "$actual" | grep -qF -- "$unexpected"; then
echo "FAIL: $desc"
echo " should NOT contain: $unexpected"
FAIL=$((FAIL + 1))
else
echo "PASS: $desc"
PASS=$((PASS + 1))
fi
}
# Assert an exact HTTP status. $1 desc, $2 expected code, $3 actual code.
check_code() {
local desc="$1"
local expected="$2"
local actual="$3"
if [ "$actual" = "$expected" ]; then
echo "PASS: $desc (HTTP $actual)"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected HTTP $expected, got HTTP $actual"
FAIL=$((FAIL + 1))
fi
}
# Admin bearer for AdminAuth routes (create/delete workspace, pending feed).
ADMIN_AUTH=()
e2e_admin_auth_args ADMIN_AUTH
acurl() { curl -s ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} "$@"; }
# The local create-workspace response embeds a claude_code_channel_snippet
# whose raw newlines/escapes make the body un-loadable by strict json.load
# (the same reason _extract_token.py can emit empty here). So pull id +
# auth_token with tolerant regexes that don't parse the whole envelope.
extract_field_regex() { # <field> ; reads body on stdin
local field="$1"
python3 -c "
import sys, re
body = sys.stdin.read()
m = re.search(r'\"$field\"\s*:\s*\"([^\"]+)\"', body)
print(m.group(1) if m else '')
"
}
extract_ws_id() { extract_field_regex "id"; }
extract_ws_token() { extract_field_regex "auth_token"; }
# Create an external workspace; echo "<id>\t<token>". Caller registers ids
# in CREATED_WSIDS for the scoped teardown.
create_workspace() { # <name>
local name="$1" resp wid tok
resp=$(acurl -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"$name\",\"tier\":1,\"runtime\":\"external\",\"external\":true}")
wid=$(printf '%s' "$resp" | extract_ws_id)
tok=$(printf '%s' "$resp" | extract_ws_token)
if [ -z "$wid" ]; then
echo "FATAL: create workspace '$name' returned no id: $(printf '%s' "$resp" | head -c 200)" >&2
return 1
fi
if [ -z "$tok" ]; then
# External create did not echo a token — mint one via the admin endpoint.
tok=$(e2e_mint_workspace_token "$wid" 2>/dev/null || echo "")
fi
if [ -z "$tok" ]; then
echo "FATAL: no workspace bearer for '$name' ($wid)" >&2
return 1
fi
printf '%s\t%s\n' "$wid" "$tok"
}
# Issue a JSON-RPC tools/call to a workspace MCP endpoint. Echoes the raw
# HTTP body on stdout and persists the HTTP status to $MCP_CODE_FILE (mcp_call
# runs in a command substitution, so a plain var would be lost in the
# subshell — read the code back via mcp_http_code after the call).
# <wsid> <bearer> <tool> <args-json>
MCP_CODE_FILE="$(mktemp -t ut_mcp_code.XXXXXX)"
MCP_BODY_FILE="$(mktemp -t ut_mcp_body.XXXXXX)"
mcp_call() {
local wsid="$1" bearer="$2" tool="$3" args="$4" code
set +e
code=$(curl -sS -X POST "$BASE/workspaces/$wsid/mcp" \
-H "Authorization: Bearer $bearer" \
-H "Content-Type: application/json" \
-d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"$tool\",\"arguments\":$args}}" \
-o "$MCP_BODY_FILE" -w "%{http_code}" 2>/dev/null)
set -e
printf '%s' "$code" > "$MCP_CODE_FILE"
cat "$MCP_BODY_FILE" 2>/dev/null || echo ''
}
mcp_http_code() { cat "$MCP_CODE_FILE" 2>/dev/null || echo ''; }
# Extract the `result.content[].text` from an MCP tools/call response.
mcp_result_text() { # reads body on stdin
python3 -c "
import sys, json
try:
d = json.load(sys.stdin)
except Exception:
print(''); sys.exit(0)
res = d.get('result') if isinstance(d, dict) else None
if not isinstance(res, dict):
print(''); sys.exit(0)
print(''.join(c.get('text','') for c in res.get('content', []) if c.get('type') == 'text'))
"
}
# ─── Scoped teardown ───────────────────────────────────────────────────
# Deletes ONLY the workspaces THIS run created (CREATED_WSIDS). Deleting a
# workspace cascades its user_tasks rows, so no separate task cleanup is
# needed. NEVER a blanket sweep — a local stack can be shared with other
# concurrent E2E runs.
CREATED_WSIDS=()
teardown() {
local rc=$?
set +e
echo ""
echo "[teardown] deleting ${#CREATED_WSIDS[@]} workspace(s) this run created (scoped)"
for wid in ${CREATED_WSIDS[@]+"${CREATED_WSIDS[@]}"}; do
[ -n "$wid" ] || continue
e2e_delete_workspace "$wid" "" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"}
done
exit $rc
}
trap teardown EXIT INT TERM
echo "=== user_tasks E2E (REST + MCP) ==="
echo ""
# ─── Setup: two sibling workspaces (A raises asks; B probes scoping) ────
IFS=$'\t' read -r WS_A A_TOK < <(create_workspace "UserTasks-A-$$") || true
[ -n "${WS_A:-}" ] || { echo "FATAL: ws-A setup failed"; exit 1; }
CREATED_WSIDS+=("$WS_A")
IFS=$'\t' read -r WS_B B_TOK < <(create_workspace "UserTasks-B-$$") || true
[ -n "${WS_B:-}" ] || { echo "FATAL: ws-B setup failed"; exit 1; }
CREATED_WSIDS+=("$WS_B")
echo "ws-A=$WS_A ws-B=$WS_B"
echo ""
# ─── 1. Create (REST) on ws-A → 201, status pending ────────────────────
echo "--- 1. Create (REST) ---"
R=$(curl -s -w "\n%{http_code}" -X POST "$BASE/workspaces/$WS_A/user-tasks" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" \
-d '{"title":"Review the Q3 draft","detail":"Need your sign-off before send"}')
CODE=$(printf '%s' "$R" | tail -n1)
BODY=$(printf '%s' "$R" | sed '$d')
check_code "POST create user-task" "201" "$CODE"
check "create returns status pending" '"status":"pending"' "$BODY"
TASK_ID=$(printf '%s' "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin)['user_task_id'])")
echo " TASK_ID=$TASK_ID"
[ -n "$TASK_ID" ] || { echo "FATAL: no user_task_id returned"; }
# ─── 2. Read (REST workspace + admin pending) ──────────────────────────
echo ""
echo "--- 2. Read ---"
R=$(curl -s "$BASE/workspaces/$WS_A/user-tasks" -H "Authorization: Bearer $A_TOK")
check "GET ws-A user-tasks contains the task id" "$TASK_ID" "$R"
check "GET ws-A user-tasks shows title" 'Review the Q3 draft' "$R"
R=$(acurl "$BASE/user-tasks/pending")
check "GET /user-tasks/pending (admin) contains the task" "$TASK_ID" "$R"
check "pending entry carries workspace_name" "UserTasks-A-$$" "$R"
# ─── 3. Update (REST) PATCH title/detail → 200, change applied ─────────
echo ""
echo "--- 3. Update (REST PATCH) ---"
R=$(curl -s -w "\n%{http_code}" -X PATCH "$BASE/workspaces/$WS_A/user-tasks/$TASK_ID" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" \
-d '{"title":"Review the Q3 draft (URGENT)","detail":"Sign-off needed by EOD"}')
CODE=$(printf '%s' "$R" | tail -n1)
check_code "PATCH update user-task" "200" "$CODE"
R=$(curl -s "$BASE/workspaces/$WS_A/user-tasks" -H "Authorization: Bearer $A_TOK")
check "PATCH applied new title" '(URGENT)' "$R"
check "PATCH applied new detail" 'Sign-off needed by EOD' "$R"
# ─── 4. Resolve (REST) done → 200, gone from pending ───────────────────
echo ""
echo "--- 4. Resolve (REST done) ---"
R=$(curl -s -w "\n%{http_code}" -X POST "$BASE/workspaces/$WS_A/user-tasks/$TASK_ID/resolve" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" \
-d '{"status":"done","resolved_by":"cto"}')
CODE=$(printf '%s' "$R" | tail -n1)
BODY=$(printf '%s' "$R" | sed '$d')
check_code "POST resolve done" "200" "$CODE"
check "resolve echoes status done" '"status":"done"' "$BODY"
R=$(acurl "$BASE/user-tasks/pending")
check_not "resolved task no longer pending (admin feed)" "$TASK_ID" "$R"
# ─── 5. Create via MCP tool request_user_action → new pending task ─────
# This is the agent→user ability proven end-to-end: the literal tools/call
# the canvas agent makes, surfacing on the admin concierge feed.
echo ""
echo "--- 5. Create via MCP (request_user_action) ---"
BODY=$(mcp_call "$WS_A" "$A_TOK" "request_user_action" '{"title":"Provide the staging API key","detail":"Blocked on it for the deploy"}')
check_code "MCP request_user_action HTTP" "200" "$(mcp_http_code)"
TEXT=$(printf '%s' "$BODY" | mcp_result_text)
check "MCP request_user_action success text" 'Asked the user' "$TEXT"
# A NEW pending task must appear on the admin feed.
R=$(acurl "$BASE/user-tasks/pending")
check "MCP-created ask appears in pending feed" 'Provide the staging API key' "$R"
MCP_TASK_ID=$(printf '%s' "$R" | python3 -c "
import sys, json
d = json.load(sys.stdin)
for t in d:
if t.get('title') == 'Provide the staging API key':
print(t['id']); break
")
echo " MCP_TASK_ID=$MCP_TASK_ID"
[ -n "$MCP_TASK_ID" ] || echo " (note: could not resolve MCP_TASK_ID — later MCP steps assert by title)"
# ─── 6. list_user_tasks (MCP) returns ws-A's task(s) ───────────────────
echo ""
echo "--- 6. list_user_tasks (MCP) ---"
BODY=$(mcp_call "$WS_A" "$A_TOK" "list_user_tasks" '{}')
check_code "MCP list_user_tasks HTTP" "200" "$(mcp_http_code)"
TEXT=$(printf '%s' "$BODY" | mcp_result_text)
check "list_user_tasks contains the MCP task" 'Provide the staging API key' "$TEXT"
check "list_user_tasks shows it pending" '"status":"pending"' "$TEXT"
# ─── 7. update_user_task (MCP) changes it → verify ─────────────────────
echo ""
echo "--- 7. update_user_task (MCP) ---"
BODY=$(mcp_call "$WS_A" "$A_TOK" "update_user_task" \
"{\"user_task_id\":\"$MCP_TASK_ID\",\"title\":\"Provide the PROD API key\"}")
check_code "MCP update_user_task HTTP" "200" "$(mcp_http_code)"
TEXT=$(printf '%s' "$BODY" | mcp_result_text)
check "MCP update_user_task success text" 'User task updated' "$TEXT"
BODY=$(mcp_call "$WS_A" "$A_TOK" "list_user_tasks" '{}')
TEXT=$(printf '%s' "$BODY" | mcp_result_text)
check "update applied (new title visible)" 'Provide the PROD API key' "$TEXT"
check_not "update applied (old title gone)" 'staging API key' "$TEXT"
# ─── 8. delete_user_task (MCP) → gone from list ────────────────────────
echo ""
echo "--- 8. delete_user_task (MCP) ---"
BODY=$(mcp_call "$WS_A" "$A_TOK" "delete_user_task" "{\"user_task_id\":\"$MCP_TASK_ID\"}")
check_code "MCP delete_user_task HTTP" "200" "$(mcp_http_code)"
TEXT=$(printf '%s' "$BODY" | mcp_result_text)
check "MCP delete_user_task success text" 'User task deleted' "$TEXT"
BODY=$(mcp_call "$WS_A" "$A_TOK" "list_user_tasks" '{}')
TEXT=$(printf '%s' "$BODY" | mcp_result_text)
check_not "deleted task gone from list" 'Provide the PROD API key' "$TEXT"
# ─── 9. Scoping / authz ────────────────────────────────────────────────
echo ""
echo "--- 9. Scoping / authz ---"
# A fresh ws-A task to attempt cross-workspace mutation against.
SCOPE_ID=$(curl -s -X POST "$BASE/workspaces/$WS_A/user-tasks" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" \
-d '{"title":"Scope probe task"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['user_task_id'])")
echo " SCOPE_ID=$SCOPE_ID (owned by ws-A)"
# ws-B PATCHes ws-A's task → 404 (workspace_id scope).
CODE=$(curl -s -o /dev/null -w "%{http_code}" -X PATCH "$BASE/workspaces/$WS_B/user-tasks/$SCOPE_ID" \
-H "Authorization: Bearer $B_TOK" -H "Content-Type: application/json" -d '{"title":"hijack"}')
check_code "ws-B PATCH of ws-A's task is scoped out" "404" "$CODE"
# ws-B DELETEs ws-A's task → 404.
CODE=$(curl -s -o /dev/null -w "%{http_code}" -X DELETE "$BASE/workspaces/$WS_B/user-tasks/$SCOPE_ID" \
-H "Authorization: Bearer $B_TOK")
check_code "ws-B DELETE of ws-A's task is scoped out" "404" "$CODE"
# Task survived the cross-workspace attempts (still on ws-A, unchanged).
R=$(curl -s "$BASE/workspaces/$WS_A/user-tasks" -H "Authorization: Bearer $A_TOK")
check "ws-A's task survived cross-ws attempts" "$SCOPE_ID" "$R"
check_not "ws-A's task title was NOT hijacked" 'hijack' "$R"
# /user-tasks/pending is AdminAuth — a workspace bearer must be rejected.
CODE=$(curl -s -o /dev/null -w "%{http_code}" "$BASE/user-tasks/pending" -H "Authorization: Bearer $A_TOK")
if [ "$CODE" = "401" ] || [ "$CODE" = "403" ]; then
echo "PASS: /user-tasks/pending rejects a workspace token (HTTP $CODE)"
PASS=$((PASS + 1))
else
echo "FAIL: /user-tasks/pending should reject a workspace token, got HTTP $CODE"
FAIL=$((FAIL + 1))
fi
# …and reject no auth at all.
CODE=$(curl -s -o /dev/null -w "%{http_code}" "$BASE/user-tasks/pending")
if [ "$CODE" = "401" ] || [ "$CODE" = "403" ]; then
echo "PASS: /user-tasks/pending rejects an unauthenticated caller (HTTP $CODE)"
PASS=$((PASS + 1))
else
echo "FAIL: /user-tasks/pending should reject no auth, got HTTP $CODE"
FAIL=$((FAIL + 1))
fi
# ─── 10. Validation ────────────────────────────────────────────────────
echo ""
echo "--- 10. Validation ---"
# Missing title → 400.
CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST "$BASE/workspaces/$WS_A/user-tasks" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" -d '{"detail":"no title here"}')
check_code "create without title → 400" "400" "$CODE"
# Resolve with an invalid status → 400.
CODE=$(curl -s -o /dev/null -w "%{http_code}" -X POST "$BASE/workspaces/$WS_A/user-tasks/$SCOPE_ID/resolve" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" -d '{"status":"banana"}')
check_code "resolve with invalid status → 400" "400" "$CODE"
# PATCH with an invalid status → 400.
CODE=$(curl -s -o /dev/null -w "%{http_code}" -X PATCH "$BASE/workspaces/$WS_A/user-tasks/$SCOPE_ID" \
-H "Authorization: Bearer $A_TOK" -H "Content-Type: application/json" -d '{"status":"banana"}')
check_code "PATCH with invalid status → 400" "400" "$CODE"
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
exit $FAIL
+14
View File
@@ -433,6 +433,17 @@ def signal_4_branch_divergence(
# ── Signal 6: CI required-checks awareness ───────────────────────────────────
# Governance checks that are ALWAYS required for every PR, regardless of
# branch-protection configuration. These are the uniform-gate checks that
# must pass before any PR can merge (SOP tier removal makes them mandatory
# for all PRs, not just tier:medium/tier:high).
GOVERNANCE_REQUIRED_CONTEXTS = [
"qa-review / approved (pull_request)",
"security-review / approved (pull_request)",
"sop-checklist / all-items-acked (pull_request)",
]
def signal_6_ci(pr_number: int, repo: str, branch: str | None = None, pr_data: dict | None = None) -> dict:
"""
Query combined CI status for PR head commit.
@@ -470,6 +481,9 @@ def signal_6_ci(pr_number: int, repo: str, branch: str | None = None, pr_data: d
required_checks.append(check["context"])
except GiteaError:
pass # No protection or no read access
# Uniform gate: governance checks are ALWAYS required, even if branch
# protection does not enumerate them. Deduplicate against BP list.
required_checks = list(dict.fromkeys(required_checks + GOVERNANCE_REQUIRED_CONTEXTS))
failing_required = []
passing_required = []
+130
View File
@@ -354,3 +354,133 @@ def test_signal_4_branch_api_error_returns_na(monkeypatch):
assert result["verdict"] == "N/A"
assert "error" in result
# ── Signal 6: CI required checks ────────────────────────────────────────────
def _signal_6_api_get(required_checks, statuses):
"""Return a fake_api_get closure for signal_6 tests."""
def fake_api_get(path):
if path == "/repos/molecule-ai/molecule-core/pulls/200":
return {"base": {"sha": "base000", "ref": "main"}, "head": {"sha": "pr222"}}
if path == "/repos/molecule-ai/molecule-core/commits/pr222/status":
return {"state": "failure", "statuses": statuses}
if path == "/repos/molecule-ai/molecule-core/branches/main/protection":
return {"required_status_checks": {"checks": [{"context": c} for c in required_checks]}}
raise AssertionError(f"unexpected api_get: {path}")
return fake_api_get
def test_signal_6_missing_required_context_returns_ci_pending(monkeypatch):
"""A required check that is ABSENT from the status list is treated as missing,
which is fail-closed CI_PENDING (never ready-by-absence)."""
mod = load_gate_check()
monkeypatch.setattr(
mod, "api_get",
_signal_6_api_get(
required_checks=["qa-review / approved (pull_request)", "security-review / approved (pull_request)"],
statuses=[
{"context": "qa-review / approved (pull_request)", "status": "success"},
# security-review is completely missing
],
),
)
result = mod.signal_6_ci(200, "molecule-ai/molecule-core")
assert result["verdict"] == "CI_PENDING"
assert "security-review / approved (pull_request)" in result["pending_required"]
def test_signal_6_pending_required_context_returns_ci_pending(monkeypatch):
"""A required check with status 'pending' blocks the gate with CI_PENDING."""
mod = load_gate_check()
monkeypatch.setattr(
mod, "api_get",
_signal_6_api_get(
required_checks=[
"qa-review / approved (pull_request)",
"security-review / approved (pull_request)",
"sop-checklist / all-items-acked (pull_request)",
],
statuses=[
{"context": "qa-review / approved (pull_request)", "status": "success"},
{"context": "security-review / approved (pull_request)", "status": "pending"},
{"context": "sop-checklist / all-items-acked (pull_request)", "status": "success"},
],
),
)
result = mod.signal_6_ci(200, "molecule-ai/molecule-core")
assert result["verdict"] == "CI_PENDING"
assert "security-review / approved (pull_request)" in result["pending_required"]
def test_signal_6_failing_required_context_returns_ci_fail(monkeypatch):
"""A required check with status 'failure' blocks the gate with CI_FAIL."""
mod = load_gate_check()
monkeypatch.setattr(
mod, "api_get",
_signal_6_api_get(
required_checks=[
"qa-review / approved (pull_request)",
"security-review / approved (pull_request)",
"sop-checklist / all-items-acked (pull_request)",
"CI / all-required (pull_request)",
],
statuses=[
{"context": "qa-review / approved (pull_request)", "status": "failure"},
{"context": "security-review / approved (pull_request)", "status": "success"},
{"context": "sop-checklist / all-items-acked (pull_request)", "status": "success"},
{"context": "CI / all-required (pull_request)", "status": "success"},
],
),
)
result = mod.signal_6_ci(200, "molecule-ai/molecule-core")
assert result["verdict"] == "CI_FAIL"
assert "qa-review / approved (pull_request)" in result["failing_required"]
def test_signal_6_all_required_green_returns_clear(monkeypatch):
"""When every required check is success/neutral, the gate is CLEAR."""
mod = load_gate_check()
monkeypatch.setattr(
mod, "api_get",
_signal_6_api_get(
required_checks=[
"qa-review / approved (pull_request)",
"security-review / approved (pull_request)",
"sop-checklist / all-items-acked (pull_request)",
"CI / all-required (pull_request)",
],
statuses=[
{"context": "qa-review / approved (pull_request)", "status": "success"},
{"context": "security-review / approved (pull_request)", "status": "success"},
{"context": "sop-checklist / all-items-acked (pull_request)", "status": "success"},
{"context": "CI / all-required (pull_request)", "status": "success"},
],
),
)
result = mod.signal_6_ci(200, "molecule-ai/molecule-core")
assert result["verdict"] == "CLEAR"
assert result["pending_required"] == []
assert result["failing_required"] == []
def test_signal_6_governance_checks_always_required_even_when_bp_empty(monkeypatch):
"""Uniform gate: qa/security/sop are REQUIRED even if branch protection
does not enumerate them. A PR with only CI/all-required green but missing
governance contexts must be CI_PENDING (fail-closed)."""
mod = load_gate_check()
monkeypatch.setattr(
mod, "api_get",
_signal_6_api_get(
required_checks=[], # BP lists nothing
statuses=[
{"context": "CI / all-required (pull_request)", "status": "success"},
],
),
)
result = mod.signal_6_ci(200, "molecule-ai/molecule-core")
assert result["verdict"] == "CI_PENDING"
assert "qa-review / approved (pull_request)" in result["pending_required"]
assert "security-review / approved (pull_request)" in result["pending_required"]
assert "sop-checklist / all-items-acked (pull_request)" in result["pending_required"]
+31
View File
@@ -119,6 +119,18 @@ func main() {
}
}
// Self-hosted platform-agent seed. With no control plane present to install
// the org's concierge (SaaS leaves it to the CP at org-provision time), the
// tenant server seeds it itself when MOLECULE_SEED_PLATFORM_AGENT is set —
// the self-hosted docker-compose sets it, while CI harnesses + SaaS tenants
// leave it unset (so e2e empty-DB assertions and the CP path are unaffected).
// Idempotent + best-effort — never fatal.
if v := os.Getenv("MOLECULE_SEED_PLATFORM_AGENT"); v == "true" || v == "1" {
if err := handlers.EnsureSelfHostedPlatformAgent(context.Background(), db.DB); err != nil {
log.Printf("boot: platform-agent self-seed failed (non-fatal): %v", err)
}
}
// Redis
redisURL := envOr("REDIS_URL", "redis://localhost:6379")
if err := db.InitRedis(redisURL); err != nil {
@@ -237,6 +249,25 @@ func main() {
wh.SetCPProvisioner(cpProv)
}
// Self-hosted platform-agent boot-provision (Change 1). The line-128 seed
// only creates the concierge DB ROW; on a fresh self-host that leaves it
// with no container (status='failed'/'online' but nothing running). Now that
// the local Docker provisioner (prov) and WorkspaceHandler (RestartByID)
// exist, kick off a best-effort provision so a self-hosted concierge comes
// online automatically once LLM creds exist.
//
// Guarded to self-host ONLY: same MOLECULE_SEED_PLATFORM_AGENT flag as the
// seed AND prov != nil (local Docker active ⇒ MOLECULE_ORG_ID unset). The
// SaaS path (cpProv != nil ⇒ prov == nil) never triggers — the CP owns
// concierge provisioning there. Best-effort + non-fatal + runs once: on a
// fresh self-host with no creds the provision fails and the agent stays
// 'failed' until BYOK is configured via Settings; RestartByID is itself
// debounced so this can't loop. Runs in a goroutine inside the helper so a
// slow image pull never delays the HTTP server.
if v := os.Getenv("MOLECULE_SEED_PLATFORM_AGENT"); (v == "true" || v == "1") && prov != nil {
handlers.MaybeProvisionPlatformAgentOnBoot(context.Background(), db.DB, prov, wh.RestartByID)
}
// Memory v2 plugin (RFC #2728): build the dependency bundle once
// here so all three handlers (MCPHandler, AdminMemoriesHandler,
// WorkspaceHandler) get the same plugin/resolver pair. memBundle
+498 -12
View File
@@ -12,12 +12,63 @@
"host": "api.moleculesai.app",
"basePath": "/",
"paths": {
"/org/identity": {
"get": {
"produces": [
"application/json"
],
"tags": [
"org"
],
"summary": "Get the org's display name",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/handlers.OrgIdentityResponse"
}
}
}
}
},
"/user-tasks/pending": {
"get": {
"security": [
{
"BearerAuth": []
}
],
"produces": [
"application/json"
],
"tags": [
"user-tasks"
],
"summary": "List pending user tasks across all workspaces",
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/handlers.PendingUserTask"
}
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
}
}
}
},
"/workspaces/{id}/schedules": {
"get": {
"security": [
{
"BearerAuth": [],
"OrgSlugAuth": []
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"produces": [
@@ -57,8 +108,7 @@
"post": {
"security": [
{
"BearerAuth": [],
"OrgSlugAuth": []
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"consumes": [
@@ -115,8 +165,7 @@
"delete": {
"security": [
{
"BearerAuth": [],
"OrgSlugAuth": []
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"produces": [
@@ -166,8 +215,7 @@
"patch": {
"security": [
{
"BearerAuth": [],
"OrgSlugAuth": []
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"consumes": [
@@ -237,8 +285,7 @@
"get": {
"security": [
{
"BearerAuth": [],
"OrgSlugAuth": []
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"produces": [
@@ -287,8 +334,7 @@
"post": {
"security": [
{
"BearerAuth": [],
"OrgSlugAuth": []
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"produces": [
@@ -335,6 +381,293 @@
}
}
}
},
"/workspaces/{id}/user-tasks": {
"get": {
"security": [
{
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"produces": [
"application/json"
],
"tags": [
"user-tasks"
],
"summary": "List a workspace's own user tasks",
"parameters": [
{
"type": "string",
"description": "Workspace ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/handlers.UserTask"
}
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
}
}
},
"post": {
"security": [
{
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"user-tasks"
],
"summary": "Raise a user task",
"parameters": [
{
"type": "string",
"description": "Workspace ID",
"name": "id",
"in": "path",
"required": true
},
{
"description": "Task fields",
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/handlers.CreateUserTaskRequest"
}
}
],
"responses": {
"201": {
"description": "Created",
"schema": {
"$ref": "#/definitions/handlers.CreateUserTaskResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
}
}
}
},
"/workspaces/{id}/user-tasks/{taskId}": {
"delete": {
"security": [
{
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"produces": [
"application/json"
],
"tags": [
"user-tasks"
],
"summary": "Delete a workspace's own user task",
"parameters": [
{
"type": "string",
"description": "Workspace ID",
"name": "id",
"in": "path",
"required": true
},
{
"type": "string",
"description": "User task ID",
"name": "taskId",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/handlers.UserTaskMutationResponse"
}
},
"404": {
"description": "Not Found",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
}
}
},
"patch": {
"security": [
{
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"user-tasks"
],
"summary": "Update a workspace's own user task",
"parameters": [
{
"type": "string",
"description": "Workspace ID",
"name": "id",
"in": "path",
"required": true
},
{
"type": "string",
"description": "User task ID",
"name": "taskId",
"in": "path",
"required": true
},
{
"description": "Partial task fields (only provided keys are updated)",
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/handlers.UpdateUserTaskRequest"
}
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/handlers.UserTaskMutationResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
},
"404": {
"description": "Not Found",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
}
}
}
},
"/workspaces/{id}/user-tasks/{taskId}/resolve": {
"post": {
"security": [
{
"BearerAuth \u0026\u0026 OrgSlugAuth": []
}
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"user-tasks"
],
"summary": "Resolve a user task",
"parameters": [
{
"type": "string",
"description": "Workspace ID",
"name": "id",
"in": "path",
"required": true
},
{
"type": "string",
"description": "User task ID",
"name": "taskId",
"in": "path",
"required": true
},
{
"description": "Resolution",
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/handlers.ResolveUserTaskRequest"
}
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/handlers.ResolveUserTaskResponse"
}
},
"400": {
"description": "Bad Request",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
},
"404": {
"description": "Not Found",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/handlers.ErrorResponse"
}
}
}
}
}
},
"definitions": {
@@ -376,6 +709,31 @@
}
}
},
"handlers.CreateUserTaskRequest": {
"type": "object",
"required": [
"title"
],
"properties": {
"detail": {
"type": "string"
},
"title": {
"type": "string"
}
}
},
"handlers.CreateUserTaskResponse": {
"type": "object",
"properties": {
"status": {
"type": "string"
},
"user_task_id": {
"type": "string"
}
}
},
"handlers.ErrorResponse": {
"type": "object",
"properties": {
@@ -404,6 +762,73 @@
}
}
},
"handlers.OrgIdentityResponse": {
"type": "object",
"properties": {
"name": {
"description": "Name is the org's display name (MOLECULE_ORG_NAME, \"\" when unset).",
"type": "string"
}
}
},
"handlers.PendingUserTask": {
"type": "object",
"properties": {
"created_at": {
"type": "string"
},
"detail": {
"type": "string"
},
"id": {
"type": "string"
},
"status": {
"type": "string",
"enum": [
"pending"
]
},
"title": {
"type": "string"
},
"workspace_id": {
"type": "string"
},
"workspace_name": {
"type": "string"
}
}
},
"handlers.ResolveUserTaskRequest": {
"type": "object",
"required": [
"status"
],
"properties": {
"resolved_by": {
"type": "string"
},
"status": {
"type": "string",
"enum": [
"done",
"dismissed"
]
}
}
},
"handlers.ResolveUserTaskResponse": {
"type": "object",
"properties": {
"status": {
"type": "string"
},
"user_task_id": {
"type": "string"
}
}
},
"handlers.RunNowResponse": {
"type": "object",
"properties": {
@@ -496,6 +921,67 @@
"type": "string"
}
}
},
"handlers.UpdateUserTaskRequest": {
"type": "object",
"properties": {
"detail": {
"type": "string"
},
"status": {
"type": "string",
"enum": [
"pending",
"done",
"dismissed"
]
},
"title": {
"type": "string"
}
}
},
"handlers.UserTask": {
"type": "object",
"properties": {
"created_at": {
"type": "string"
},
"detail": {
"type": "string"
},
"id": {
"type": "string"
},
"resolved_at": {
"type": "string"
},
"resolved_by": {
"type": "string"
},
"status": {
"type": "string",
"enum": [
"pending",
"done",
"dismissed"
]
},
"title": {
"type": "string"
}
}
},
"handlers.UserTaskMutationResponse": {
"type": "object",
"properties": {
"status": {
"type": "string"
},
"user_task_id": {
"type": "string"
}
}
}
},
"securityDefinitions": {
+322 -12
View File
@@ -25,6 +25,22 @@ definitions:
status:
type: string
type: object
handlers.CreateUserTaskRequest:
properties:
detail:
type: string
title:
type: string
required:
- title
type: object
handlers.CreateUserTaskResponse:
properties:
status:
type: string
user_task_id:
type: string
type: object
handlers.ErrorResponse:
properties:
error:
@@ -43,6 +59,50 @@ definitions:
timestamp:
type: string
type: object
handlers.OrgIdentityResponse:
properties:
name:
description: Name is the org's display name (MOLECULE_ORG_NAME, "" when unset).
type: string
type: object
handlers.PendingUserTask:
properties:
created_at:
type: string
detail:
type: string
id:
type: string
status:
enum:
- pending
type: string
title:
type: string
workspace_id:
type: string
workspace_name:
type: string
type: object
handlers.ResolveUserTaskRequest:
properties:
resolved_by:
type: string
status:
enum:
- done
- dismissed
type: string
required:
- status
type: object
handlers.ResolveUserTaskResponse:
properties:
status:
type: string
user_task_id:
type: string
type: object
handlers.RunNowResponse:
properties:
prompt:
@@ -105,6 +165,47 @@ definitions:
timezone:
type: string
type: object
handlers.UpdateUserTaskRequest:
properties:
detail:
type: string
status:
enum:
- pending
- done
- dismissed
type: string
title:
type: string
type: object
handlers.UserTask:
properties:
created_at:
type: string
detail:
type: string
id:
type: string
resolved_at:
type: string
resolved_by:
type: string
status:
enum:
- pending
- done
- dismissed
type: string
title:
type: string
type: object
handlers.UserTaskMutationResponse:
properties:
status:
type: string
user_task_id:
type: string
type: object
host: api.moleculesai.app
info:
contact: {}
@@ -115,6 +216,38 @@ info:
title: Molecule AI Workspace Server API
version: "1.0"
paths:
/org/identity:
get:
produces:
- application/json
responses:
"200":
description: OK
schema:
$ref: '#/definitions/handlers.OrgIdentityResponse'
summary: Get the org's display name
tags:
- org
/user-tasks/pending:
get:
produces:
- application/json
responses:
"200":
description: OK
schema:
items:
$ref: '#/definitions/handlers.PendingUserTask'
type: array
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
summary: List pending user tasks across all workspaces
tags:
- user-tasks
/workspaces/{id}/schedules:
get:
parameters:
@@ -137,8 +270,7 @@ paths:
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
OrgSlugAuth: []
- BearerAuth && OrgSlugAuth: []
summary: List schedules for a workspace
tags:
- schedules
@@ -173,8 +305,7 @@ paths:
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
OrgSlugAuth: []
- BearerAuth && OrgSlugAuth: []
summary: Create a schedule
tags:
- schedules
@@ -207,8 +338,7 @@ paths:
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
OrgSlugAuth: []
- BearerAuth && OrgSlugAuth: []
summary: Delete a schedule
tags:
- schedules
@@ -252,8 +382,7 @@ paths:
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
OrgSlugAuth: []
- BearerAuth && OrgSlugAuth: []
summary: Update a schedule
tags:
- schedules
@@ -284,8 +413,7 @@ paths:
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
OrgSlugAuth: []
- BearerAuth && OrgSlugAuth: []
summary: Get past runs of a schedule
tags:
- schedules
@@ -318,11 +446,193 @@ paths:
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth: []
OrgSlugAuth: []
- BearerAuth && OrgSlugAuth: []
summary: Fire a schedule manually
tags:
- schedules
/workspaces/{id}/user-tasks:
get:
parameters:
- description: Workspace ID
in: path
name: id
required: true
type: string
produces:
- application/json
responses:
"200":
description: OK
schema:
items:
$ref: '#/definitions/handlers.UserTask'
type: array
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth && OrgSlugAuth: []
summary: List a workspace's own user tasks
tags:
- user-tasks
post:
consumes:
- application/json
parameters:
- description: Workspace ID
in: path
name: id
required: true
type: string
- description: Task fields
in: body
name: body
required: true
schema:
$ref: '#/definitions/handlers.CreateUserTaskRequest'
produces:
- application/json
responses:
"201":
description: Created
schema:
$ref: '#/definitions/handlers.CreateUserTaskResponse'
"400":
description: Bad Request
schema:
$ref: '#/definitions/handlers.ErrorResponse'
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth && OrgSlugAuth: []
summary: Raise a user task
tags:
- user-tasks
/workspaces/{id}/user-tasks/{taskId}:
delete:
parameters:
- description: Workspace ID
in: path
name: id
required: true
type: string
- description: User task ID
in: path
name: taskId
required: true
type: string
produces:
- application/json
responses:
"200":
description: OK
schema:
$ref: '#/definitions/handlers.UserTaskMutationResponse'
"404":
description: Not Found
schema:
$ref: '#/definitions/handlers.ErrorResponse'
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth && OrgSlugAuth: []
summary: Delete a workspace's own user task
tags:
- user-tasks
patch:
consumes:
- application/json
parameters:
- description: Workspace ID
in: path
name: id
required: true
type: string
- description: User task ID
in: path
name: taskId
required: true
type: string
- description: Partial task fields (only provided keys are updated)
in: body
name: body
required: true
schema:
$ref: '#/definitions/handlers.UpdateUserTaskRequest'
produces:
- application/json
responses:
"200":
description: OK
schema:
$ref: '#/definitions/handlers.UserTaskMutationResponse'
"400":
description: Bad Request
schema:
$ref: '#/definitions/handlers.ErrorResponse'
"404":
description: Not Found
schema:
$ref: '#/definitions/handlers.ErrorResponse'
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth && OrgSlugAuth: []
summary: Update a workspace's own user task
tags:
- user-tasks
/workspaces/{id}/user-tasks/{taskId}/resolve:
post:
consumes:
- application/json
parameters:
- description: Workspace ID
in: path
name: id
required: true
type: string
- description: User task ID
in: path
name: taskId
required: true
type: string
- description: Resolution
in: body
name: body
required: true
schema:
$ref: '#/definitions/handlers.ResolveUserTaskRequest'
produces:
- application/json
responses:
"200":
description: OK
schema:
$ref: '#/definitions/handlers.ResolveUserTaskResponse'
"400":
description: Bad Request
schema:
$ref: '#/definitions/handlers.ErrorResponse'
"404":
description: Not Found
schema:
$ref: '#/definitions/handlers.ErrorResponse'
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/handlers.ErrorResponse'
security:
- BearerAuth && OrgSlugAuth: []
summary: Resolve a user task
tags:
- user-tasks
schemes:
- https
securityDefinitions:
@@ -0,0 +1,286 @@
//go:build integration
// +build integration
// postgres_replay_integration_test.go — REAL Postgres integration tests for
// the boot-time migration runner (db.RunMigrations) and the connection
// bootstrap (db.InitPostgres).
//
// Issue #2150 (SOP rule internal#765 regression-coverage). test_layer:
// real-postgres.
//
// Run locally with:
//
// docker run --rm -d --name pg-replay \
// -e POSTGRES_PASSWORD=test -e POSTGRES_DB=molecule \
// -p 55432:5432 postgres:15-alpine
// sleep 4
// cd workspace-server
// INTEGRATION_DB_URL="postgres://postgres:test@localhost:55432/molecule?sslmode=disable" \
// go test -tags=integration ./internal/db/ -run '^TestIntegration_Migration|^TestIntegration_InitPostgres'
//
// In CI these run on .gitea/workflows/handlers-postgres-integration.yml,
// which already provisions a real Postgres on the operator-host bridge and
// triggers on workspace-server/migrations/** changes — the exact blast
// radius this gate must cover.
//
// WHY A REAL DATABASE — and why the existing coverage is NOT enough
// -----------------------------------------------------------------
// postgres_migrate_test.go and postgres_schema_migrations_test.go are
// sqlmock-only: they pin which SQL *statements* fire, but a mock cannot
// execute SQL, so it cannot prove the 118-file (.up + legacy .sql) chain
// actually REPLAYS FROM SCRATCH against a real Postgres. The CI psql loop
// in handlers-postgres-integration.yml deliberately *skips* failing
// migrations (`⊘ skipped`), so it would stay green even if the chain
// stopped replaying — it is not a replay gate.
//
// This file closes that gap. It boots a Postgres, resets the public schema
// to a blank slate, and runs the PRODUCTION db.RunMigrations entrypoint —
// the same function platform boot calls — with hard-fail semantics. It
// would FAIL (watch-fail intent) against:
//
// - Issue #211: if RunMigrations regresses to globbing `*.sql` and
// sorting `.down.sql` before `.up.sql`, the rollback runs before the
// forward for any pair (020_workspace_auth_tokens was the canary),
// either erroring on the DROP or wiping the just-created table.
//
// - The 045 crash-loop class (cp#429 / project_cp_migration_045_*): the
// runner re-applies every recorded-absent file every boot, so a
// non-idempotent migration (bare CREATE / INSERT without IF NOT EXISTS
// / ON CONFLICT) replays cleanly the first time and FAILS the second.
// TestIntegration_MigrationReplay_IsIdempotent_DoubleApply runs the
// full chain twice against the same DB to catch that at PR time.
//
// - A new migration that depends on a table a later migration drops, or
// is mis-ordered in the lexicographic chain — it simply will not apply
// from scratch and the replay errors.
//
// All assertions key off the OBSERVABLE database state after the real run,
// not a proxy for "a statement fired".
package db
import (
"database/sql"
"os"
"path/filepath"
"testing"
_ "github.com/lib/pq"
)
// migrationsDir is the on-disk path to the forward+legacy migration chain
// relative to this test file (workspace-server/internal/db → ../../migrations).
const migrationsDir = "../../migrations"
// freshIntegrationDB opens $INTEGRATION_DB_URL (skipping the test if unset),
// resets the `public` schema to an empty slate so the run is a true
// replay-from-scratch regardless of what an earlier CI step applied, and
// registers a Cleanup that closes the connection.
//
// It also points the package-global db.DB at this connection, because
// RunMigrations operates on db.DB. NOT SAFE for t.Parallel() — it owns the
// schema for the duration of the test.
func freshIntegrationDB(t *testing.T) *sql.DB {
t.Helper()
url := os.Getenv("INTEGRATION_DB_URL")
if url == "" {
t.Skip("INTEGRATION_DB_URL not set; skipping real-PG replay test (local devs: see file header)")
}
conn, err := sql.Open("postgres", url)
if err != nil {
t.Fatalf("open: %v", err)
}
if err := conn.Ping(); err != nil {
t.Fatalf("ping: %v", err)
}
// True from-scratch: blow away any schema a prior CI step (e.g. the
// handlers psql apply-all loop) left behind, then start clean. This is
// what makes the test a *replay-from-scratch* gate rather than a
// re-apply-onto-existing test.
if _, err := conn.Exec(`DROP SCHEMA public CASCADE; CREATE SCHEMA public`); err != nil {
t.Fatalf("reset public schema: %v", err)
}
// gen_random_uuid() (used by 001_workspaces.sql et al.) lives in
// pgcrypto on PG < 13 and core on PG 13+. postgres:15-alpine has it in
// core, but create the extension defensively so the test does not pin a
// specific PG minor.
if _, err := conn.Exec(`CREATE EXTENSION IF NOT EXISTS pgcrypto`); err != nil {
t.Fatalf("create pgcrypto: %v", err)
}
t.Cleanup(func() { conn.Close() })
return conn
}
// forwardMigrationCount counts the files RunMigrations is expected to apply:
// every *.sql that is NOT a *.down.sql. This is derived from the real
// directory so the gate auto-tracks new migrations without an edit here.
func forwardMigrationCount(t *testing.T) int {
t.Helper()
all, err := filepath.Glob(filepath.Join(migrationsDir, "*.sql"))
if err != nil {
t.Fatalf("glob migrations: %v", err)
}
n := 0
for _, f := range all {
if len(f) >= len(".down.sql") && f[len(f)-len(".down.sql"):] == ".down.sql" {
continue
}
n++
}
if n == 0 {
t.Fatalf("found zero forward migrations under %s — wrong path?", migrationsDir)
}
return n
}
// TestIntegration_InitPostgres_PingSucceeds proves the production connection
// bootstrap actually establishes a usable pool against a real server. A
// sqlmock test can never exercise the real DB.Ping() inside InitPostgres,
// which is the line that turns a bad DSN / unreachable host into a boot
// failure instead of a silently-broken pool.
func TestIntegration_InitPostgres_PingSucceeds(t *testing.T) {
url := os.Getenv("INTEGRATION_DB_URL")
if url == "" {
t.Skip("INTEGRATION_DB_URL not set; skipping")
}
if err := InitPostgres(url); err != nil {
t.Fatalf("InitPostgres against real PG failed: %v", err)
}
if DB == nil {
t.Fatal("InitPostgres returned nil error but db.DB is nil")
}
// The pool must be live, not just opened.
if err := DB.Ping(); err != nil {
t.Fatalf("db.DB.Ping after InitPostgres: %v", err)
}
// Round-trip a trivial query to prove the connection actually serves.
var one int
if err := DB.QueryRow("SELECT 1").Scan(&one); err != nil {
t.Fatalf("SELECT 1 round-trip: %v", err)
}
if one != 1 {
t.Fatalf("SELECT 1 returned %d", one)
}
}
// TestIntegration_InitPostgres_BadDSNFails proves InitPostgres surfaces an
// unreachable/garbage DSN as an error (the ping path), rather than handing
// back a half-open pool. Watch-fail: if someone drops the DB.Ping() check
// from InitPostgres, this stops returning an error and fails.
func TestIntegration_InitPostgres_BadDSNFails(t *testing.T) {
if os.Getenv("INTEGRATION_DB_URL") == "" {
t.Skip("INTEGRATION_DB_URL not set; skipping")
}
// Valid DSN shape, but nothing is listening on this port.
err := InitPostgres("postgres://postgres:test@127.0.0.1:1/does_not_exist?sslmode=disable&connect_timeout=2")
if err == nil {
t.Fatal("expected InitPostgres to fail against an unreachable DSN, got nil (DB.Ping check removed?)")
}
}
// TestIntegration_MigrationReplay_FromScratch is the core gate: run the
// PRODUCTION RunMigrations over a blank public schema and assert the full
// forward chain applies cleanly with zero skips.
//
// Watch-fail intent:
// - #211 .down-wipe: a `.down.sql` leaking into the forward set would
// run a DROP before its CREATE → error here (hard fail), or wipe a
// table → the schema_migrations / table-presence assertions catch it.
// - mis-ordered / dangling-dependency migration → RunMigrations returns
// a non-nil error and this test fails.
func TestIntegration_MigrationReplay_FromScratch(t *testing.T) {
conn := freshIntegrationDB(t)
DB = conn // RunMigrations operates on the package-global DB.
if err := RunMigrations(migrationsDir); err != nil {
t.Fatalf("full-chain replay-from-scratch failed: %v", err)
}
// Every forward migration must be recorded as applied — proves none was
// silently skipped (the failure mode the CI psql loop tolerates).
want := forwardMigrationCount(t)
var got int
if err := DB.QueryRow("SELECT COUNT(*) FROM schema_migrations").Scan(&got); err != nil {
t.Fatalf("count schema_migrations: %v", err)
}
if got != want {
t.Errorf("schema_migrations recorded %d migrations, expected %d (the full forward chain)", got, want)
}
// No `.down.sql` may ever be recorded — that is the #211 signature.
var downRecorded int
if err := DB.QueryRow(
"SELECT COUNT(*) FROM schema_migrations WHERE filename LIKE '%.down.sql'",
).Scan(&downRecorded); err != nil {
t.Fatalf("count down migrations: %v", err)
}
if downRecorded != 0 {
t.Errorf("a .down.sql migration was applied (#211 regression): %d recorded", downRecorded)
}
// Spot-check load-bearing tables that survive to HEAD of the chain.
// workspaces is the root table; workspace_auth_tokens was the #211
// canary (its data wipe regressed AdminAuth to fail-open).
for _, tbl := range []string{"workspaces", "workspace_auth_tokens", "delegations", "activity_logs"} {
var exists bool
if err := DB.QueryRow(
"SELECT EXISTS(SELECT 1 FROM information_schema.tables WHERE table_schema='public' AND table_name=$1)",
tbl,
).Scan(&exists); err != nil {
t.Fatalf("check table %s: %v", tbl, err)
}
if !exists {
t.Errorf("table %q missing after full replay — chain did not land it", tbl)
}
}
// agent_memories is CREATEd at 008 and DROPped at the end of the chain
// (20260524110000_drop_agent_memories). Its absence proves the late
// drop migration actually ran AFTER the early create — i.e. ordering
// held. If the chain ever runs a drop before its create, this flips.
var legacyExists bool
if err := DB.QueryRow(
"SELECT EXISTS(SELECT 1 FROM information_schema.tables WHERE table_schema='public' AND table_name='agent_memories')",
).Scan(&legacyExists); err != nil {
t.Fatalf("check agent_memories: %v", err)
}
if legacyExists {
t.Error("agent_memories still present at HEAD — the late drop migration did not replay in order")
}
}
// TestIntegration_MigrationReplay_IsIdempotent_DoubleApply guards the 045
// crash-loop class (cp#429 / project_cp_migration_045_crashloop_idempotency_guard):
// the runner re-checks every file on every boot, so a non-idempotent
// migration replays fine once and FAILS on the second pass. Here we run the
// full chain twice. The second pass must apply ZERO new files (all recorded)
// and must not error.
//
// NOTE: this runs against the SAME populated schema, so it also exercises
// the "skip already-applied" tracking path end-to-end against real PG, which
// the sqlmock tests only simulate.
func TestIntegration_MigrationReplay_IsIdempotent_DoubleApply(t *testing.T) {
conn := freshIntegrationDB(t)
DB = conn
if err := RunMigrations(migrationsDir); err != nil {
t.Fatalf("first replay failed: %v", err)
}
var afterFirst int
if err := DB.QueryRow("SELECT COUNT(*) FROM schema_migrations").Scan(&afterFirst); err != nil {
t.Fatalf("count after first: %v", err)
}
// Second boot: nothing new should apply, and it must not error even
// though the runner re-evaluates every file (the 045 failure mode).
if err := RunMigrations(migrationsDir); err != nil {
t.Fatalf("second replay failed (non-idempotent migration / 045 crash-loop class): %v", err)
}
var afterSecond int
if err := DB.QueryRow("SELECT COUNT(*) FROM schema_migrations").Scan(&afterSecond); err != nil {
t.Fatalf("count after second: %v", err)
}
if afterSecond != afterFirst {
t.Errorf("second boot changed schema_migrations from %d to %d — re-application is not a clean no-op", afterFirst, afterSecond)
}
}
+291
View File
@@ -0,0 +1,291 @@
package db
// redis_test.go — regression coverage for the workspace online-status and
// URL-resolution Redis layer (redis.go), which previously had NO test.
//
// Issue #2150 (SOP rule internal#765). redis.go drives two fleet-wide
// behaviours that break silently if a key name or TTL drifts:
//
// - online detection: SetOnline / RefreshTTL / IsOnline on `ws:<id>`.
// A wrong key prefix or a TTL shorter than the heartbeat interval makes
// live workspaces flap to "unreachable — restart" (the exact failure
// LivenessTTL=180s was tuned to avoid). A TTL too long hides real
// crashes.
// - proxy URL resolution: CacheURL / GetCachedURL / CacheInternalURL /
// GetCachedInternalURL on `ws:<id>:url` and `ws:<id>:internal_url`.
// A2A forwarding resolves the target workspace through these keys; a
// prefix collision (e.g. the liveness key overlapping the URL key)
// would serve the wrong URL or a literal "online" string as a URL.
//
// These tests run against miniredis — an in-process Redis that speaks the
// real RESP protocol and enforces real TTL/expiry semantics — so they
// exercise the actual go-redis client calls and key/TTL behaviour, not a
// mock that rubber-stamps them. miniredis is already a module dependency.
//
// Watch-fail intent: change any `ws:%s...` format string in redis.go, or
// regress LivenessTTL below the heartbeat window, and a test here fails.
import (
"context"
"testing"
"time"
"github.com/alicebob/miniredis/v2"
"github.com/redis/go-redis/v9"
)
// withMiniRedis spins up an in-process Redis, points the package-global RDB
// at it, and registers Cleanup. Returns the server handle so tests can drive
// the clock (FastForward) to exercise TTL expiry deterministically.
func withMiniRedis(t *testing.T) *miniredis.Miniredis {
t.Helper()
mr, err := miniredis.Run()
if err != nil {
t.Fatalf("miniredis.Run: %v", err)
}
RDB = redis.NewClient(&redis.Options{Addr: mr.Addr()})
t.Cleanup(func() {
RDB.Close()
mr.Close()
})
return mr
}
// TestLivenessTTL_ExceedsHeartbeatWindow pins the tuned TTL. The heartbeat
// loop fires every 30s; LivenessTTL must allow several missed beats (the
// comment in redis.go targets ~5) so a busy leader starved for 60-120s is
// not falsely declared dead. 180s = 6×30s. Regressing this toward the old
// 60s value reintroduces the false-positive restart cycle.
func TestLivenessTTL_ExceedsHeartbeatWindow(t *testing.T) {
const heartbeatInterval = 30 * time.Second
const minMissedBeats = 5
if LivenessTTL < heartbeatInterval*minMissedBeats {
t.Errorf("LivenessTTL=%s is too short: must tolerate >=%d missed %s heartbeats (>= %s) to avoid false-positive restarts",
LivenessTTL, minMissedBeats, heartbeatInterval, heartbeatInterval*minMissedBeats)
}
}
// TestSetOnline_KeyAndTTL verifies SetOnline writes the canonical `ws:<id>`
// key with the value "online" and the LivenessTTL — the exact contract
// IsOnline and the a2a_proxy reactive check rely on.
func TestSetOnline_KeyAndTTL(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-abc-123"
if err := SetOnline(ctx, ws); err != nil {
t.Fatalf("SetOnline: %v", err)
}
// Key name must be exactly ws:<id> — not, say, ws:<id>:online.
if !mr.Exists("ws:" + ws) {
t.Fatalf("expected key %q to exist; keys present: %v", "ws:"+ws, mr.Keys())
}
got, err := mr.Get("ws:" + ws)
if err != nil {
t.Fatalf("mr.Get: %v", err)
}
if got != "online" {
t.Errorf("liveness value = %q, want %q", got, "online")
}
// TTL must be the tuned LivenessTTL (allow miniredis's whole-second
// granularity).
ttl := mr.TTL("ws:" + ws)
if ttl != LivenessTTL {
t.Errorf("TTL = %s, want %s", ttl, LivenessTTL)
}
}
// TestIsOnline_TrueThenExpires drives the real TTL clock: a freshly-set
// workspace is online; after the TTL elapses it is offline. This is the
// behaviour online-detection depends on — proven against real expiry, not
// asserted from a mock.
func TestIsOnline_TrueThenExpires(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-expiry"
if err := SetOnline(ctx, ws); err != nil {
t.Fatalf("SetOnline: %v", err)
}
online, err := IsOnline(ctx, ws)
if err != nil {
t.Fatalf("IsOnline: %v", err)
}
if !online {
t.Fatal("expected workspace online immediately after SetOnline")
}
// Fast-forward just past the TTL; the liveness key must expire.
mr.FastForward(LivenessTTL + time.Second)
online, err = IsOnline(ctx, ws)
if err != nil {
t.Fatalf("IsOnline after expiry: %v", err)
}
if online {
t.Error("expected workspace offline after TTL elapsed")
}
}
// TestRefreshTTL_ExtendsLiveness proves a heartbeat (RefreshTTL) keeps a
// workspace alive across what would otherwise be an expiry. Without the
// refresh the key expires; with it, IsOnline stays true. Watch-fail: if
// RefreshTTL targets the wrong key, the refresh is a no-op and this fails.
func TestRefreshTTL_ExtendsLiveness(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-refresh"
if err := SetOnline(ctx, ws); err != nil {
t.Fatalf("SetOnline: %v", err)
}
// Advance most of the way to expiry, then heartbeat.
mr.FastForward(LivenessTTL - 5*time.Second)
if err := RefreshTTL(ctx, ws); err != nil {
t.Fatalf("RefreshTTL: %v", err)
}
// Advance past where the ORIGINAL TTL would have expired. Still online.
mr.FastForward(10 * time.Second)
online, err := IsOnline(ctx, ws)
if err != nil {
t.Fatalf("IsOnline: %v", err)
}
if !online {
t.Error("expected workspace still online after RefreshTTL heartbeat")
}
}
// TestIsOnline_UnknownWorkspace returns false (and no error) for a workspace
// that was never set — the default for a never-registered / long-dead agent.
func TestIsOnline_UnknownWorkspace(t *testing.T) {
withMiniRedis(t)
ctx := context.Background()
online, err := IsOnline(ctx, "never-seen")
if err != nil {
t.Fatalf("IsOnline: %v", err)
}
if online {
t.Error("expected unknown workspace to be offline")
}
}
// TestURLCache_RoundTrip pins the `ws:<id>:url` key and its 5-minute TTL,
// and proves the value round-trips. A2A push resolves the target through
// this key.
func TestURLCache_RoundTrip(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-url"
const url = "https://ws-url.workspaces.moleculesai.app"
if err := CacheURL(ctx, ws, url); err != nil {
t.Fatalf("CacheURL: %v", err)
}
got, err := GetCachedURL(ctx, ws)
if err != nil {
t.Fatalf("GetCachedURL: %v", err)
}
if got != url {
t.Errorf("GetCachedURL = %q, want %q", got, url)
}
if !mr.Exists("ws:" + ws + ":url") {
t.Errorf("expected key %q; present: %v", "ws:"+ws+":url", mr.Keys())
}
if ttl := mr.TTL("ws:" + ws + ":url"); ttl != 5*time.Minute {
t.Errorf("url cache TTL = %s, want 5m", ttl)
}
}
// TestInternalURLCache_RoundTrip pins the `ws:<id>:internal_url` key (the
// Docker-internal address used for workspace-to-workspace discovery) and its
// 5-minute TTL.
func TestInternalURLCache_RoundTrip(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-int"
const url = "http://ws-int:8080"
if err := CacheInternalURL(ctx, ws, url); err != nil {
t.Fatalf("CacheInternalURL: %v", err)
}
got, err := GetCachedInternalURL(ctx, ws)
if err != nil {
t.Fatalf("GetCachedInternalURL: %v", err)
}
if got != url {
t.Errorf("GetCachedInternalURL = %q, want %q", got, url)
}
if ttl := mr.TTL("ws:" + ws + ":internal_url"); ttl != 5*time.Minute {
t.Errorf("internal url cache TTL = %s, want 5m", ttl)
}
}
// TestKeyNamespacesDoNotCollide is the prefix-collision regression: the
// liveness key (ws:<id>), the URL key (ws:<id>:url), and the internal-URL
// key (ws:<id>:internal_url) must be three DISTINCT keys for the same
// workspace. If a future edit collapses the format strings, IsOnline would
// read a URL as liveness (or vice versa) and online-detection / proxy
// resolution would corrupt each other fleet-wide.
func TestKeyNamespacesDoNotCollide(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-collide"
if err := SetOnline(ctx, ws); err != nil {
t.Fatalf("SetOnline: %v", err)
}
if err := CacheURL(ctx, ws, "https://public"); err != nil {
t.Fatalf("CacheURL: %v", err)
}
if err := CacheInternalURL(ctx, ws, "http://internal:8080"); err != nil {
t.Fatalf("CacheInternalURL: %v", err)
}
// Liveness value must still be "online", NOT a URL.
if v, _ := mr.Get("ws:" + ws); v != "online" {
t.Errorf("liveness key clobbered by a URL write: got %q", v)
}
if v, _ := mr.Get("ws:" + ws + ":url"); v != "https://public" {
t.Errorf("url key = %q, want https://public", v)
}
if v, _ := mr.Get("ws:" + ws + ":internal_url"); v != "http://internal:8080" {
t.Errorf("internal_url key = %q, want http://internal:8080", v)
}
}
// TestClearWorkspaceKeys_RemovesAllThree proves teardown removes the
// liveness, URL, and internal-URL keys together — a leaked liveness key
// after deletion would keep a dead workspace looking online; a leaked URL
// key would let the proxy forward to a recycled address.
func TestClearWorkspaceKeys_RemovesAllThree(t *testing.T) {
mr := withMiniRedis(t)
ctx := context.Background()
const ws = "ws-clear"
if err := SetOnline(ctx, ws); err != nil {
t.Fatalf("SetOnline: %v", err)
}
if err := CacheURL(ctx, ws, "https://x"); err != nil {
t.Fatalf("CacheURL: %v", err)
}
if err := CacheInternalURL(ctx, ws, "http://x:8080"); err != nil {
t.Fatalf("CacheInternalURL: %v", err)
}
ClearWorkspaceKeys(ctx, ws)
for _, k := range []string{"ws:" + ws, "ws:" + ws + ":url", "ws:" + ws + ":internal_url"} {
if mr.Exists(k) {
t.Errorf("key %q survived ClearWorkspaceKeys", k)
}
}
online, err := IsOnline(ctx, ws)
if err != nil {
t.Fatalf("IsOnline: %v", err)
}
if online {
t.Error("workspace still online after ClearWorkspaceKeys")
}
}
@@ -80,6 +80,10 @@ const (
EventApprovalRequested EventType = "APPROVAL_REQUESTED"
EventApprovalEscalated EventType = "APPROVAL_ESCALATED"
// User tasks (agent → user asks).
EventUserTaskRequested EventType = "USER_TASK_REQUESTED"
EventUserTaskResolved EventType = "USER_TASK_RESOLVED"
// Auth / credentials.
EventExternalCredentialsRotated EventType = "EXTERNAL_CREDENTIALS_ROTATED"
)
@@ -112,6 +116,8 @@ var AllEventTypes = []EventType{
EventDelegationStatus,
EventExternalCredentialsRotated,
EventTaskUpdated,
EventUserTaskRequested,
EventUserTaskResolved,
EventWorkspaceAwaitingAgent,
EventWorkspaceDegraded,
EventWorkspaceHeartbeat,
@@ -41,6 +41,8 @@ func TestAllEventTypes_IsSnapshot(t *testing.T) {
"DELEGATION_STATUS",
"EXTERNAL_CREDENTIALS_ROTATED",
"TASK_UPDATED",
"USER_TASK_REQUESTED",
"USER_TASK_RESOLVED",
"WORKSPACE_AWAITING_AGENT",
"WORKSPACE_DEGRADED",
"WORKSPACE_HEARTBEAT",
@@ -225,6 +225,16 @@ func (e *proxyA2AError) Error() string {
return "proxy a2a error"
}
// EnqueueA2A is a method wrapper around the package-level EnqueueA2A function so
// that *WorkspaceHandler satisfies the scheduler's A2AProxy interface. The
// scheduler cannot call the package function directly (it would have to import
// internal/handlers, but handlers already imports internal/scheduler → import
// cycle), so it goes through this method on the proxy it already holds. Used by
// the cron scheduler to durably buffer a tick when the target workspace is busy.
func (h *WorkspaceHandler) EnqueueA2A(ctx context.Context, workspaceID, callerID string, priority int, body []byte, method, idempotencyKey string, expiresAt *time.Time) (string, int, error) {
return EnqueueA2A(ctx, workspaceID, callerID, priority, body, method, idempotencyKey, expiresAt)
}
// ProxyA2ARequest is the public wrapper for proxyA2ARequest, used by the
// cron scheduler and other internal callers that need to send A2A messages
// to workspaces programmatically (not from an HTTP handler).
+30 -18
View File
@@ -97,10 +97,10 @@ type QueuedItem struct {
// returns the new row ID + current queue depth. Caller MUST have already
// determined the target is busy — this function does not check.
//
// Idempotency: when idempotencyKey is non-empty, the partial unique index
// `idx_a2a_queue_idempotency` prevents duplicate active rows for the same
// (workspace_id, idempotency_key). On conflict this returns the existing
// row's ID so the caller's log still points at the live queue entry.
// Idempotency: when idempotencyKey is non-empty, a duplicate active enqueue
// for the same (workspace, key) is collapsed rather than double-buffered. On
// a duplicate this returns the existing row's ID so the caller's log still
// points at the live queue entry.
func EnqueueA2A(
ctx context.Context,
workspaceID, callerID string,
@@ -129,6 +129,32 @@ func EnqueueA2A(
expiresAtArg = *expiresAt
}
// Supersede any already-expired pending row for this same key before we
// insert. The drain path skips expired pending rows, so such a row never
// completes on its own — it lingers in the active set and would block the
// conflict check below, silently swallowing this fresh enqueue. Retiring
// it here (a) frees the active set so the insert below proceeds and (b)
// cleans the stale row up so expired rows don't accumulate. Scoped to the
// idempotency key so unrelated traffic is untouched.
if idempotencyKey != "" {
if _, supErr := db.DB.ExecContext(ctx, `
UPDATE a2a_queue
SET status = 'dropped',
last_error = 'superseded: expired before drain; replaced by a fresh enqueue'
WHERE workspace_id = $1
AND idempotency_key = $2
AND status = 'queued'
AND expires_at IS NOT NULL
AND expires_at <= now()
`, workspaceID, idempotencyKey); supErr != nil {
// Non-fatal: if the cleanup fails we still attempt the insert. Worst
// case the conflict path returns the (stale) existing row's id, which
// is the pre-fix behaviour — no new breakage introduced here.
log.Printf("A2AQueue: supersede-expired cleanup failed for workspace %s key %s: %v",
workspaceID, idempotencyKey, supErr)
}
}
// INSERT ... ON CONFLICT DO NOTHING RETURNING id. The conflict target
// must reference the partial unique INDEX columns + WHERE clause directly
// (Postgres can't reference partial unique indexes by name in
@@ -246,20 +272,6 @@ func MarkQueueItemFailed(ctx context.Context, id, errMsg string) {
}
}
// QueueDepth returns the number of currently-queued (not dispatched/completed)
// items for a workspace. Used by the busy-return response body so callers
// can see how many ahead of them.
func QueueDepth(ctx context.Context, workspaceID string) int {
var n int
if err := db.DB.QueryRowContext(ctx,
`SELECT COUNT(*) FROM a2a_queue WHERE workspace_id = $1 AND status = 'queued'`,
workspaceID,
).Scan(&n); err != nil {
log.Printf("A2AQueue: QueueDepth query failed for workspace %s: %v", workspaceID, err)
}
return n
}
// DropStaleQueueItems marks queued items older than maxAge as 'dropped' with a
// system-generated reason so PM agents stop processing stale post-incident noise.
// Called with a workspaceID to scope cleanup to one workspace, or empty to sweep
@@ -0,0 +1,160 @@
package handlers
// a2a_queue_enqueue_expired_test.go — regression for CR3 RC 9853.
//
// Bug: a pending buffered tick that expires before the drain reaches it is
// skipped by the drain (it filters out expired pending rows) yet still occupies
// the active set the idempotency check guards. A later tick for the SAME key
// would then collapse onto that dead row and be silently swallowed — the exact
// drop the busy-buffer path was built to prevent.
//
// Fix: EnqueueA2A retires any already-expired pending row for the key BEFORE the
// insert, so the fresh tick buffers (and the stale row is cleaned up) instead of
// being dropped.
//
// These tests use the QueryMatcherEqual mock (setupTestDBForQueueTests) so the
// SQL strings below must match the handler's queries verbatim.
import (
"context"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
)
const (
enqWorkspaceID = "ws-enq-expired"
enqKey = "sched-aaaa-bbbb" // schedule_id used as idempotency key
enqBody = `{"method":"message/send"}`
enqMethod = "message/send"
)
// expectSupersedeExpired registers the cleanup UPDATE EnqueueA2A issues before
// the insert when an idempotency key is present. rowsRetired is how many expired
// pending rows the UPDATE claims to have dropped.
func expectSupersedeExpired(mock sqlmock.Sqlmock, workspaceID, key string, rowsRetired int64) {
mock.ExpectExec(`
UPDATE a2a_queue
SET status = 'dropped',
last_error = 'superseded: expired before drain; replaced by a fresh enqueue'
WHERE workspace_id = $1
AND idempotency_key = $2
AND status = 'queued'
AND expires_at IS NOT NULL
AND expires_at <= now()
`).
WithArgs(workspaceID, key).
WillReturnResult(sqlmock.NewResult(0, rowsRetired))
}
// expectInsert registers the INSERT ... ON CONFLICT DO NOTHING RETURNING id.
// newID is the id the insert returns (non-conflict / fresh enqueue path).
func expectInsert(mock sqlmock.Sqlmock, newID string) {
mock.ExpectQuery(`
INSERT INTO a2a_queue (workspace_id, caller_id, priority, body, method, idempotency_key, expires_at)
VALUES ($1, $2, $3, $4::jsonb, $5, $6, $7)
ON CONFLICT (workspace_id, idempotency_key)
WHERE idempotency_key IS NOT NULL AND status IN ('queued','dispatched')
DO NOTHING
RETURNING id
`).WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(newID))
}
// expectDepth registers the trailing queue-depth count query.
func expectDepth(mock sqlmock.Sqlmock, workspaceID string, depth int) {
mock.ExpectQuery(`
SELECT COUNT(*) FROM a2a_queue
WHERE workspace_id = $1 AND status = 'queued'
`).WithArgs(workspaceID).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(depth))
}
// TestEnqueueA2A_ExpiredRowDoesNotBlockFreshTick is the core CR3 regression:
// an existing expired pending row for a schedule's key must NOT cause the next
// tick's enqueue to be dropped. The expired row is retired first, then the
// fresh tick inserts and returns a NEW id.
func TestEnqueueA2A_ExpiredRowDoesNotBlockFreshTick(t *testing.T) {
mock := setupTestDBForQueueTests(t)
// One expired pending row exists for this key and gets retired.
expectSupersedeExpired(mock, enqWorkspaceID, enqKey, 1)
// With the active set cleared, the insert proceeds (no conflict) → new id.
const freshID = "fresh-tick-id"
expectInsert(mock, freshID)
expectDepth(mock, enqWorkspaceID, 1)
nextRun := time.Now().Add(30 * time.Second)
id, depth, err := EnqueueA2A(
context.Background(), enqWorkspaceID, "", PriorityTask,
[]byte(enqBody), enqMethod, enqKey, &nextRun,
)
if err != nil {
t.Fatalf("EnqueueA2A returned error: %v", err)
}
if id != freshID {
t.Errorf("expected the fresh tick to enqueue with a new id %q, got %q "+
"(an expired row must not swallow the new tick)", freshID, id)
}
if depth != 1 {
t.Errorf("expected depth 1, got %d", depth)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestEnqueueA2A_NoExpiredRow_NormalEnqueue: when no expired row exists the
// supersede UPDATE simply affects zero rows and the enqueue proceeds normally.
func TestEnqueueA2A_NoExpiredRow_NormalEnqueue(t *testing.T) {
mock := setupTestDBForQueueTests(t)
expectSupersedeExpired(mock, enqWorkspaceID, enqKey, 0) // nothing to retire
const newID = "new-id"
expectInsert(mock, newID)
expectDepth(mock, enqWorkspaceID, 2)
nextRun := time.Now().Add(30 * time.Second)
id, depth, err := EnqueueA2A(
context.Background(), enqWorkspaceID, "", PriorityTask,
[]byte(enqBody), enqMethod, enqKey, &nextRun,
)
if err != nil {
t.Fatalf("EnqueueA2A returned error: %v", err)
}
if id != newID {
t.Errorf("expected id %q, got %q", newID, id)
}
if depth != 2 {
t.Errorf("expected depth 2, got %d", depth)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestEnqueueA2A_NoKey_SkipsSupersede: with no idempotency key there is no
// active-set conflict to guard, so the supersede cleanup is skipped entirely
// and only the insert + depth queries run.
func TestEnqueueA2A_NoKey_SkipsSupersede(t *testing.T) {
mock := setupTestDBForQueueTests(t)
// No expectSupersedeExpired — it must NOT be issued when key is empty.
const newID = "no-key-id"
expectInsert(mock, newID)
expectDepth(mock, enqWorkspaceID, 1)
id, _, err := EnqueueA2A(
context.Background(), enqWorkspaceID, "", PriorityTask,
[]byte(enqBody), enqMethod, "", nil,
)
if err != nil {
t.Fatalf("EnqueueA2A returned error: %v", err)
}
if id != newID {
t.Errorf("expected id %q, got %q", newID, id)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
+1 -10
View File
@@ -154,16 +154,7 @@ func (h *ChannelHandler) Create(c *gin.Context) {
}
// #319: encrypt sensitive fields (bot_token, webhook_secret) before
// persisting so a DB read/backup leak can't recover the credentials.
// Validation above ran against plaintext; storage is ciphertext.
if err := channels.EncryptSensitiveFields(body.Config); err != nil {
log.Printf("Channels: encrypt config failed for workspace %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "encrypt failed"})
return
}
// #319: encrypt sensitive fields (bot_token, webhook_secret) before
// persisting so a DB read/backup leak can't recover the credentials.
// persisting. Exactly one call here; duplicate removed in this PR.
// Validation above ran against plaintext; storage is ciphertext.
if err := channels.EncryptSensitiveFields(body.Config); err != nil {
log.Printf("Channels: encrypt config failed for workspace %s: %v", workspaceID, err)

Some files were not shown because too many files have changed in this diff Show More