Compare commits

..

28 Commits

Author SHA1 Message Date
cp-be 680434a8e6 test(e2e): patch 3 more non-external POST /workspaces sites for MODEL_REQUIRED contract
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 11s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 50s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
Harness Replays / Harness Replays (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m28s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m17s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m28s
CI / Platform (Go) (pull_request) Successful in 5m4s
CI / all-required (pull_request) Compensating status: required jobs SUCCESS individually; aggregate hit 40m timeout.
audit-force-merge / audit (pull_request) Successful in 5s
Follow-up to the prior commits in this PR — E2E API Smoke run on commit
a3c15bc9 surfaced 3 remaining E2E scripts that POST /workspaces without
a model field AND without the external-runtime exemption, so they 422
under the new MODEL_REQUIRED gate.

- tests/e2e/test_notify_attachments_e2e.sh:96 — bare {"name":"Notify E2E","tier":1}
  (no runtime → defaults to langgraph). Adds "model":"anthropic:claude-opus-4-7"
  to match the deleted DefaultModel("") return value.

- tests/e2e/test_priority_runtimes_e2e.sh:192 — runtime:claude-code without
  model. Adds "model":"sonnet" to match the deleted DefaultModel("claude-code")
  return value.

- tests/e2e/test_priority_runtimes_e2e.sh:384 — runtime:gemini-cli without
  model. Adds "model":"gemini-2.0-flash" — gemini-cli routes via the gemini
  provider (per derive-provider.sh), so a gemini:* slug picks the right
  provider chain.

Scripts inspected but NOT patched (already external-exempt):
- test_api.sh — both POSTs use runtime:external + external:true
- test_today_pr_coverage_e2e.sh — both POSTs use runtime:external + external:true
- test_priority_runtimes_e2e.sh:255 — runtime:hermes ALREADY had "model":"openai/gpt-4o"
- test_priority_runtimes_e2e.sh:326 — already had "model":"openai/gpt-4o-mini"

Other scripts that POST without model (test_a2a_e2e.sh:133,
test_activity_e2e.sh:218, test_dev_mode.sh:72, test_workspace_abilities_e2e.sh,
test_comprehensive_e2e.sh, test_mcp_stdio_staging.sh, test_chat_upload_e2e.sh,
tests/harness/seed.sh) are NOT triggered by the e2e-api.yml workflow that
this PR's CI runs — they're tracked for a follow-up sweep once #1667 lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:16:11 -07:00
cp-be a3c15bc9be feat(workspace-server): exempt external runtime from MODEL_REQUIRED + E2E updates
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 18s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 46s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 33s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
CI / Canvas (Next.js) (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 19s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m15s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m38s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m15s
CI / Platform (Go) (pull_request) Successful in 4m31s
CI / all-required (pull_request) Successful in 7m29s
Follow-up to the kill-DefaultModel + Create-handler-gate commits. PR#1667
CI surfaced two real failures:
  - E2E Peer Visibility (local)
  - (test_claude_code_e2e.sh, depending on which workflow)

Root cause: the MODEL_REQUIRED gate as initially written 422s every
external workspace too. External workspaces intentionally do NOT spawn
a Docker container or run an adapter (workspace_provision.go:497-498:
"external is a first-class runtime that intentionally does NOT spawn
a Docker container"). They delegate to a registered URL — model has
no meaning for them; the URL is the contract. Requiring it would 422
every legitimate "register my agent at https://..." flow.

Changes:

- `handlers.WorkspaceHandler.Create` (workspace.go): exempt external
  workspaces from the MODEL_REQUIRED gate. Two spellings count as
  external — `payload.External == true` and
  `isExternalLikeRuntime(payload.Runtime)`. The helper is already used
  throughout the codebase for the same "is this a URL-delegated
  workspace" decision (discovery.go, registry.go, a2a_proxy_helpers.go,
  plugins.go, workspace_restart.go).

- New test `TestCreate_ExternalRuntime_NoModel_OK`: pins the exemption.
  Uses the test_api.sh Echo Agent body shape verbatim — empty model,
  `runtime:"external","external":true` → 201.

- Rewrote `TestWorkspaceCreate_188_NoTemplateNoRuntime_StillDefaultsLanggraph`
  to `_NowMODEL_REQUIRED` — the bare-body langgraph 201 path no
  longer exists; the new test asserts the 422 shape includes
  `runtime:"langgraph"` so the diagnostic still surfaces what runtime
  WOULD have been used (the gate fires AFTER the langgraph-default
  assignment). Bare-body-with-explicit-model 201 path is covered by
  the existing `TestWorkspaceCreate` (which I patched in the prior
  commit) — no duplicate needed here.

- `TestWorkspaceCreate_FirstDeploy_NoModel_Returns422` body now uses
  `runtime:"hermes"` without `external:true` to ensure the gate
  actually fires (hermes spawns a real adapter; the previous
  `runtime:"hermes","external":true` shape would now exempt and 201).

- `tests/e2e/test_peer_visibility_mcp_local.sh`: added
  `_model_for_runtime` helper at both create sites (PARENT + sibling)
  so the parent + per-runtime sibling workspaces declare an explicit
  model. Mapping mirrors what the deleted DefaultModel returned plus
  the gpt-5.5 / kimi / minimax slugs used in production.

- `tests/e2e/test_claude_code_e2e.sh`: added `"model":"sonnet"` to
  both POST bodies (ROOT + CHILD). Both use `runtime:"claude-code"`
  which spawns a real adapter, not exempt.

Other E2E scripts inspected:
  - `test_api.sh` — already uses `runtime:"external","external":true`,
    exempt by the new gate. No change.
  - `test_a2a_e2e.sh`:133, `test_activity_e2e.sh`:218,
    `test_dev_mode.sh`:72, `test_workspace_abilities_e2e.sh`:93,99,
    `test_comprehensive_e2e.sh`:78,105,134,139,144,
    `test_notify_attachments_e2e.sh`:95, `test_mcp_stdio_staging.sh`:76 —
    these create workspaces without an explicit runtime field. They
    fall through to the langgraph default in the Create handler and
    are NOT external. They will still 422 under the new contract;
    held for a separate cleanup commit so the diff stays scoped to
    the PR#1667 failing CI surface (Peer Visibility + claude-code
    flows). Tracked for follow-up.

Verified:
  go test -count=1 -short -timeout 300s ./internal/handlers/... \
                                        ./internal/models/...   → OK

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:42:49 -07:00
cp-be 1ece444ea2 test(workspace-server): cascade test updates for MODEL_REQUIRED contract
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 45s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 26s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 54s
CI / Canvas (Next.js) (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m57s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m22s
CI / Platform (Go) (pull_request) Successful in 4m38s
CI / all-required (pull_request) Successful in 6m45s
Follow-up to the prior commit (kill DefaultModel + 422 at Create when
model missing). Fixes the two test buckets called out in the DRAFT PR
description.

(a) Mechanical: add `"model":"..."` to request bodies so tests can
still reach the 201 path. These tests exercise unrelated invariants
(parent_id wiring, claude-code runtime label, max_concurrent_tasks
override, DB insert error rollback, defaults, SaaS tier-4 force, with-
and-without secrets persistence/rollback, external SSRF gates, Kimi
runtime label preservation, controlplane#188 explicit-runtime path,
budget limit). Model strings picked from the same set the prior
DefaultModel(runtime) would have returned (`sonnet` for claude-code,
`anthropic:claude-opus-4-7` elsewhere) plus `gpt-5.5` for codex,
`kimi-coding/kimi-k2-coding-6` for kimi, `external:custom` for the
external runtime — so the tests exercise realistic shapes, not just a
single placeholder.

(b) Semantic — the three tests that pinned the OLD contract directly:

  - `TestEnsureDefaultConfig_Hermes` / `_ClaudeCode` /
    `_EmptyRuntimeDefaultsToClaudeCode` — the function no longer
    sources the default; it just renders what Create supplied. Each
    test now passes the model that the prior implicit DefaultModel
    branch would have returned, so the YAML assertion remains
    meaningful. Comment updated to explain the shift.

  - `TestWorkspaceCreate_FirstDeploy_NoModel_NoSecretWritten` was
    renamed `TestWorkspaceCreate_FirstDeploy_NoModel_Returns422` and
    its premise INVERTED. Pre-fix it asserted that empty model goes
    through to 201 with no workspace_secrets writes; post-fix it
    asserts the request 422s at the Create boundary with NO DB writes
    at all. The sqlmock has zero ExpectExec calls so a late-firing
    gate surfaces as "unexpected call to ExecQuery 'INSERT INTO
    workspaces'". Comment block calls out the inversion explicitly.

  - `TestWorkspaceCreate_188_NoTemplateNoRuntime_StillDefaultsLanggraph`
    — was "bare {name} routes to langgraph 201". Now split into two
    halves: bare body MUST 422 with MODEL_REQUIRED, and (with model
    supplied) the original langgraph default invariant still holds.
    The 422 body must show `runtime:"langgraph"` to prove the gate
    fires AFTER the langgraph-default branch — that ordering is
    load-bearing for the diagnostic clarity of the error.

Verified:

  go test -count=1 -short -timeout 300s ./internal/handlers/... \
                                        ./internal/models/...   → OK

The Create gate test (`TestCreate_ModelRequired_Returns422`) from the
prior commit remains green; it covers the three trigger shapes from
the empirical incident (bare name, explicit codex+no-model, explicit
hermes+no-model).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:12:24 -07:00
cp-be f8e031a971 feat(workspace-server): kill DefaultModel + require model at Create (CTO 2026-05-22 SSOT)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 45s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 42s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 59s
Harness Replays / Harness Replays (pull_request) Successful in 12s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m6s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m8s
CI / Platform (Go) (pull_request) Failing after 4m51s
CI / all-required (pull_request) Failing after 6m50s
Empirical trigger: Code Reviewer 5ba15d7e was created with
`{"name":"Code Reviewer","runtime":"codex","tier":4,...}` — no model
field. The legacy `models.DefaultModel("codex")` returned
`"anthropic:claude-opus-4-7"`. Codex adapter only supports openai-*
providers, so the workspace wedged forever in `not_configured` with:

    ValueError: codex adapter: workspace config picks provider='anthropic'
    but it is not in the providers registry

PATCH /workspaces/:id explicitly disallows updating `model`
(workspace_crud.go:138 — `model not patchable`), so the only recovery
path was SQL UPDATE or delete+recreate.

CTO 2026-05-22T03:42Z SSOT directive: model is REQUIRED user input at
create time. The platform must not provide a default. The runtime must
not fall back. The decision belongs to the user (or to the agent acting
on the user's behalf), never to the platform. Soft fallbacks hide
contract bugs — this is the canonical example.

Changes:

- DELETE `models.DefaultModel` (runtime_defaults.go). The function is
  the bug magnet — every callsite that previously fell back to it now
  fails closed.

- `handlers.WorkspaceHandler.Create` (workspace.go): add MODEL_REQUIRED
  gate AFTER template-resolution and AFTER the runtime-default-to-langgraph
  path. Caller gets 422 with `code=MODEL_REQUIRED` and a hint message
  pointing at the contract. Same fail-closed shape as the existing
  controlplane#188 runtime-unresolved gate.

- `ensureDefaultConfig` (workspace_provision.go): drop the DefaultModel
  call. If we ever reach this path with empty model, log loudly and let
  the workspace boot into not_configured — defence-in-depth assertion;
  the Create gate should have rejected the request upstream.

- `org_import.go createWorkspaceTree`: drop the DefaultModel call. Org
  templates MUST declare `model:` per-workspace or under `defaults:` —
  return an error otherwise. Stops malformed templates from silently
  provisioning runtime-incompatible workspaces.

- Tests: add `TestCreate_ModelRequired_Returns422` covering the three
  trigger shapes (bare name; explicit codex+no-model; explicit hermes+no-model).
  Update `TestWorkspaceCreate` to pass an explicit model (was a bare-defaults
  201 test — now requires explicit model to express the new contract).
  Reduce `runtime_defaults_test.go` to a doc-comment file (function
  doesn't exist anymore; compile-time check is the contract).

Known blast radius — DELIBERATELY NOT FIXED IN THIS COMMIT, surfaced
for triage:

The CTO directive inverts the prior contract; ~20 existing tests pin
the OLD behaviour and now fail. They fall into two buckets:

  (a) Mechanical: add `"model":"..."` to request body. These tests
      exercise unrelated invariants (parent_id, SSRF, secrets, tier,
      runtime-label preservation, etc.) and would still pass once
      they supply an explicit model. ~15 tests in handlers_additional_test.go,
      registry_test.go, handlers_test.go, etc.

  (b) Semantic: these tests pinned the OLD contract directly and need
      rewriting or deletion:
        - `TestEnsureDefaultConfig_Hermes` / `_ClaudeCode` /
          `_EmptyRuntimeDefaultsToClaudeCode` — assert the YAML output
          contains the DefaultModel-derived string. Either pass an
          explicit model in the payload, or drop the YAML model
          assertion (it now reflects whatever the caller supplied).
        - `TestWorkspaceCreate_FirstDeploy_NoModel_NoSecretWritten` —
          literally asserts that empty model is a 201, just with no
          secrets written. Under the new contract empty model is a
          422 with no DB writes at all. Test should be inverted to
          assert the 422 + no-DB-write outcome.
        - `TestWorkspaceCreate_188_NoTemplateNoRuntime_StillDefaultsLanggraph`
          — premise was "bare {name} should default to langgraph 201".
          Now bare {name} is a 422 because model is missing. Premise
          changes to "bare {name} now fails MODEL_REQUIRED before
          runtime resolution".

Holding the test surgery for a follow-up so reviewers can sanity-check
the contract change against the inventory above before we churn 20+
tests. Marking DRAFT until that follow-up lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:06:07 -07:00
hongming 992ccfbd5e Clarify EIC diagnose SG guidance (#1664)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
CI / Python Lint & Test (push) Successful in 6s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 15s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 38s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
E2E Chat / E2E Chat (push) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m42s
publish-workspace-server-image / build-and-push (push) Successful in 3m6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m15s
CI / Platform (Go) (push) Successful in 5m19s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 6m9s
CI / Canvas (Next.js) (push) Successful in 6m11s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 6m45s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m14s
lint-bp-context-emit-match / lint-bp-context-emit-match (push) Successful in 1m19s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 3s
main-red-watchdog / watchdog (push) Successful in 2m6s
gate-check-v3 / gate-check (push) Successful in 57s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 15s
ci-required-drift / drift (push) Successful in 1m10s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 5s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 4m27s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 4m29s
2026-05-22 02:47:28 +00:00
core-fe 086b479dca Clarify EIC diagnose SG guidance
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
CI / all-required (pull_request) Successful in 51s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m2s
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
sop-checklist / review-refire (pull_request) Has been skipped
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 6s
2026-05-21 19:38:29 -07:00
hongming 51284546d2 PR_TITLE
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
main-red-watchdog / watchdog (push) Successful in 2m22s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 5s
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
gate-check-v3 / gate-check (push) Successful in 21s
CI / Python Lint & Test (push) Successful in 6s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
ci-required-drift / drift (push) Successful in 1m13s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 4m38s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Harness Replays / detect-changes (push) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 28s
CI / Shellcheck (E2E scripts) (push) Successful in 24s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
publish-workspace-server-image / build-and-push (push) Successful in 3m0s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 4m38s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m14s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 4m37s
E2E Chat / E2E Chat (push) Successful in 4m1s
Harness Replays / Harness Replays (push) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m57s
CI / Platform (Go) (push) Successful in 5m18s
CI / Canvas (Next.js) (push) Successful in 6m19s
CI / all-required (push) Successful in 6m58s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m30s
CI / Canvas Deploy Reminder (push) Successful in 1s
PR_BODY
2026-05-22 01:25:08 +00:00
infra-sre 9b36c9eb7a fix: make T4 pid probe agent-safe
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 42s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 3s
security-review / approved (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
E2E Chat / E2E Chat (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m29s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4m2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 8m28s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: qa-review, security-review
sop-checklist / all-items-acked (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 5s
audit-force-merge / audit (pull_request) Successful in 5s
2026-05-21 18:11:33 -07:00
hongming adaaa2a1f8 PR_TITLE
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 4s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 33s
Harness Replays / detect-changes (push) Successful in 3s
Handlers Postgres Integration / detect-changes (push) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
publish-workspace-server-image / build-and-push (push) Successful in 3m4s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m22s
CI / Shellcheck (E2E scripts) (push) Successful in 18s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m34s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 4m43s
CI / Platform (Go) (push) Successful in 5m6s
E2E Chat / E2E Chat (push) Successful in 3m30s
Harness Replays / Harness Replays (push) Successful in 7s
CI / Canvas (Next.js) (push) Successful in 6m14s
CI / all-required (push) Successful in 8m44s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m31s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
ci-required-drift / drift (push) Successful in 1m12s
publish-workspace-server-image / Production auto-deploy (push) Successful in 7m33s
CI / Canvas Deploy Reminder (push) Successful in 2s
PR_BODY
2026-05-22 01:10:09 +00:00
infra-sre 37739e3dd8 fix: probe T4 docker reach via host namespace
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m21s
CI / Platform (Go) (pull_request) Successful in 4m13s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 9m40s
audit-force-merge / audit (pull_request) Successful in 5s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 42s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m18s
qa-review / approved (pull_request) Successful in 8s
sop-checklist / na-declarations (pull_request) N/A: qa-review, security-review
sop-checklist / all-items-acked (pull_request) Successful in 3s
security-review / approved (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 11s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m12s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m21s
2026-05-21 17:50:30 -07:00
infra-sre 1c76713d71 fix: align tier refire with canonical SOP gate
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Blocked by required conditions
CI / Canvas (Next.js) (pull_request) Blocked by required conditions
CI / Shellcheck (E2E scripts) (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / all-required (pull_request) Failing after 25s
E2E API Smoke Test / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 17s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 59s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Failing after 29s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 31s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Failing after 29s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 12s
qa-review / approved (pull_request) Successful in 11s
sop-checklist / review-refire (pull_request) Has been skipped
2026-05-21 17:47:08 -07:00
plugin-dev e92468db13 docs(onboarding): fix Claude Code channel template + Kimi bridge peer_info opt-in
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 9s
publish-workspace-server-image / build-and-push (push) Successful in 5m2s
CI / Shellcheck (E2E scripts) (push) Successful in 21s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m51s
Harness Replays / Harness Replays (push) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m32s
E2E Chat / E2E Chat (push) Successful in 4m11s
CI / Platform (Go) (push) Successful in 6m0s
CI / Canvas (Next.js) (push) Successful in 6m52s
CI / all-required (push) Successful in 14m9s
publish-workspace-server-image / Production auto-deploy (push) Successful in 14m2s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 7s
CI / Canvas Deploy Reminder (push) Successful in 3s
main-red-watchdog / watchdog (push) Successful in 31s
gate-check-v3 / gate-check (push) Successful in 25s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 9m25s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 9m42s
Canvas-served onboarding templates audit. Two real fixes:

1. Claude Code channel template (externalChannelTemplate): broken --channels two-flag launch form errored with "entries must be tagged: --channels" on 2.1.143+ — rewritten to single-flag form. Legacy single-platform .env shape replaced with canonical MOLECULE_WORKSPACES_JSON (post-PR#15 SSOT in molecule-mcp-claude-channel). Misleading "claude.ai admin settings" text replaced with explicit per-OS managed-settings.json paths matching the channel plugin README.

2. Kimi bridge poll loop (externalKimiTemplate): added ?include=peer_info to /activity poll URL so Kimi operators receive Layer 1 enrichment (peer_name / peer_role / agent_card_url / attachments[]) inline on polled rows.

Audited remaining 7 templates — curl / UniversalMCP / Python / Hermes / Codex / OpenClaw all use per-invocation env or workspace-specific runtime config, not affected by SSOT shape changes. Doc URLs (doc.moleculesai.app/docs/guides/*) verified 200.

Adds 3 regression gates:
- TestExternalChannelTemplate_LaunchFlagShape (bans broken --channels form)
- TestExternalChannelTemplate_CanonicalEnvShape (pins MOLECULE_WORKSPACES_JSON + placeholders)
- TestPollingTemplates_OptIntoPeerInfo (universal invariant for any template polling /activity)

6/6 tests pass locally. CI status at force-merge time: 23/60 contexts green (actual technical checks); 33 pending (slow aggregator + non-required pilots); 4 failed contexts are all non-required review-gates (qa-review / security-review / sop-checklist) expected to fail on a not-yet-reviewed PR.

Merged with CTO explicit skip-review GO 2026-05-21 (docs-only PR, no Go code semantics change, regression tests pin the load-bearing invariants, no security/runtime surface). Standard 2-approve gate remains for future PRs.
Co-authored-by: plugin-dev <plugin-dev@agents.moleculesai.app>
Co-committed-by: plugin-dev <plugin-dev@agents.moleculesai.app>
2026-05-22 00:44:00 +00:00
hongming be8424c350 Merge pull request 'fix(e2e): fail teardown on leaked EC2' (#1660) from fix/e2e-aws-leak-verification into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Successful in 3m6s
publish-workspace-server-image / Production auto-deploy (push) Has been cancelled
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 1m14s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m15s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m19s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Failing after 4m38s
2026-05-22 00:36:08 +00:00
infra-sre a7caaa6bd0 fix: use Gitea for T4 egress contract
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
sop-checklist / review-refire (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 31s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
gate-check-v3 / gate-check (pull_request) Successful in 7s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: qa-review, security-review
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m39s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m49s
CI / all-required (pull_request) Successful in 8m2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
sop-tier-check / tier-check (pull_request) Refired via /refire-tier-check; tier-check failed (see workflow log)
2026-05-21 17:24:32 -07:00
core-fe 3e28bf5943 Fail E2E teardown on leaked EC2
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 32s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m28s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 7s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m28s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m19s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
CI / all-required (pull_request) Successful in 7m25s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
security-review / approved (pull_request) Refired via /security-recheck by manual-refire
qa-review / approved (pull_request) Refired via /qa-recheck by manual-refire
CI / Canvas Deploy Reminder (pull_request) Has been skipped
sop-checklist / review-refire (pull_request) Has been skipped
gate-check-v3 / gate-check (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) acked: 7/7
audit-force-merge / audit (pull_request) Successful in 7s
2026-05-21 17:13:42 -07:00
hongming-pc2 a356bc94f3 feat(activity): chat_upload_receive flat-upload-manifest arm for attachments projection
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 13s
publish-workspace-server-image / build-and-push (push) Successful in 2m59s
CI / Platform (Go) (push) Successful in 5m7s
CI / Canvas (Next.js) (push) Successful in 6m7s
CI / all-required (push) Successful in 6m56s
Harness Replays / Harness Replays (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m14s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m27s
publish-workspace-server-image / Production auto-deploy (push) Successful in 7m44s
gate-check-v3 / gate-check (push) Successful in 47s
main-red-watchdog / watchdog (push) Successful in 2m16s
E2E Chat / E2E Chat (push) Successful in 3m47s
CI / Canvas Deploy Reminder (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m8s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 6s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 8s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 8m4s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 15m34s
Extends extractAttachmentsFromRequestBody to handle canvas-direct file pastes / drag-drops (method=chat_upload_receive) where request_body is a flat upload manifest {uri, name, size, file_id, mimeType} with no parts[] wrapper.

- New extractAttachmentFromFlatUploadManifest fallback arm (after the existing message-parts arm)
- mimeType (canvas camelCase) -> mime_type (snake_case) normalization
- kindFromMimeType: image/audio/video prefix -> matching kind, else file
- Min-info skip when neither uri nor name present
- message-parts arm takes precedence (pinned by test)

Approved by core-be + core-qa on b6f2b90e9d, gated on required-context CI / all-required (pull_request) == success (per fixed dispatch hygiene that reads BP required_status_check_contexts rather than combined_state).

Downstream: workspace-runtime#37 follow-up arm mirrors this for Python pre-L1 platforms; channel-plugin one-liner adds ?include=peer_info to pollWorkspace so the adapter actually receives the L1 projection.
Co-authored-by: hongming-pc2 <hongming-pc2@moleculesai.app>
Co-committed-by: hongming-pc2 <hongming-pc2@moleculesai.app>
2026-05-22 00:01:48 +00:00
hongming 9981a5099a Use literal region for AWS secrets janitor (#1655)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 22s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 10s
E2E Chat / E2E Chat (push) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m28s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m25s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
publish-workspace-server-image / build-and-push (push) Successful in 3m2s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m8s
CI / Platform (Go) (push) Successful in 4m51s
CI / Canvas (Next.js) (push) Successful in 5m57s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 6m27s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m10s
gate-check-v3 / gate-check (push) Successful in 22s
main-red-watchdog / watchdog (push) Successful in 2m9s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m2s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 5s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 7m48s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 4s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 4m47s
Avoid Gitea secret-expression rendering for the scheduled AWS secrets janitor region; use the fixed staging/canary us-east-2 region directly.
2026-05-21 21:52:07 +00:00
core-fe 07d3dcd988 Use literal region for AWS secrets janitor
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 17s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m34s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m17s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m20s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m21s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 5m43s
audit-force-merge / audit (pull_request) Successful in 4s
2026-05-21 14:42:23 -07:00
hongming-pc2 3ff613e3ad feat(activity): peer_info enrichment + attachments projection (L1/3)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 6s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 26s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m42s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
Harness Replays / Harness Replays (push) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m55s
E2E Chat / E2E Chat (push) Successful in 5m3s
publish-workspace-server-image / build-and-push (push) Successful in 6m0s
CI / Platform (Go) (push) Successful in 6m34s
CI / Canvas (Next.js) (push) Successful in 7m24s
CI / all-required (push) Successful in 8m3s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
CI / Canvas Deploy Reminder (push) Successful in 1s
publish-workspace-server-image / Production auto-deploy (push) Successful in 3m52s
Layer 1 of three-layer activity-feed enrichment.

LEFT JOIN workspaces on source_id to project peer_name/peer_role/agent_card_url; flat attachments[] from request_body.params.message.parts[].file. Gated behind ?include=peer_info (additive, back-compat).

Approved by core-be + core-qa.

Canvas-user identity follow-up tracked at internal#637 (CTO direction: CP IAM scope).
Co-authored-by: hongming-pc2 <hongming-pc2@moleculesai.app>
Co-committed-by: hongming-pc2 <hongming-pc2@moleculesai.app>
2026-05-21 21:41:18 +00:00
hongming 96c37cb098 Make AWS secrets janitor fail loud (#1652)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 15s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m19s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m23s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 16s
E2E Chat / E2E Chat (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m56s
publish-workspace-server-image / build-and-push (push) Successful in 6m42s
CI / Platform (Go) (push) Successful in 6m51s
CI / Canvas (Next.js) (push) Successful in 7m16s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 8m10s
publish-workspace-server-image / Production auto-deploy (push) Successful in 3m14s
main-red-watchdog / watchdog (push) Successful in 2m15s
gate-check-v3 / gate-check (push) Successful in 22s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 12s
ci-required-drift / drift (push) Successful in 1m11s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Failing after 7s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 4m11s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 7m58s
Remove the continue-on-error mask now that the AWS secrets janitor is scheduled, and emit a clear failure marker for ops.
2026-05-21 20:56:00 +00:00
core-fe e123d07898 Make AWS secrets janitor fail loud
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m30s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m25s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
qa-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 3m20s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m13s
audit-force-merge / audit (pull_request) Successful in 4s
2026-05-21 13:50:18 -07:00
hongming 22fbf43580 Restore AWS secrets janitor schedule (#1651)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 8s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 14s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m21s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m23s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 11s
E2E Chat / E2E Chat (push) Successful in 4s
CI / Shellcheck (E2E scripts) (push) Successful in 22s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m4s
publish-workspace-server-image / build-and-push (push) Successful in 5m4s
CI / Platform (Go) (push) Successful in 5m10s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 5m57s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 7m43s
publish-workspace-server-image / Production auto-deploy (push) Successful in 4m20s
Restore the hourly AWS Secrets Manager janitor after provisioning the dedicated staging janitor IAM key and mirroring it through Infisical/Gitea secrets.
2026-05-21 20:39:31 +00:00
core-fe a47307969c Restore AWS secrets janitor schedule
CI / Python Lint & Test (pull_request) Successful in 4s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 42s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m13s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m2s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 2m50s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m21s
audit-force-merge / audit (pull_request) Successful in 5s
2026-05-21 13:35:16 -07:00
hongming ff2557d899 Merge pull request 'test(e2e): forbid dev token path in staging peer visibility' (#1650) from fix/staging-token-diagnostic into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 13s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 14s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 57s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m30s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m24s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
publish-workspace-server-image / build-and-push (push) Successful in 3m10s
E2E Chat / E2E Chat (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m37s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m47s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 6s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 24s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 23s
CI / Platform (Go) (push) Successful in 5m8s
CI / Canvas (Next.js) (push) Successful in 5m55s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 7m48s
publish-workspace-server-image / Production auto-deploy (push) Successful in 6m12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
2026-05-21 20:26:33 +00:00
core-devops 119743d0de test(e2e): forbid dev token path in staging peer visibility
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 10s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 52s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m21s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 2m21s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m11s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
audit-force-merge / audit (pull_request) Successful in 6s
2026-05-21 13:21:45 -07:00
hongming c3806cd890 Merge pull request 'chore(ci): publish tenant image to staging ecr via ssot publisher' (#1649) from chore/publish-staging-ecr-with-ssot-publisher into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 13s
E2E Chat / detect-changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m9s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m33s
CI / Shellcheck (E2E scripts) (push) Successful in 21s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 8s
E2E Chat / E2E Chat (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
publish-workspace-server-image / build-and-push (push) Successful in 3m5s
gate-check-v3 / gate-check (push) Successful in 59s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m3s
CI / Platform (Go) (push) Successful in 5m13s
CI / Canvas (Next.js) (push) Successful in 6m17s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 7m43s
publish-workspace-server-image / Production auto-deploy (push) Successful in 6m24s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m3s
chore(ci): publish tenant image to staging ecr via ssot publisher\n\nUses the SSOT-managed primary publisher identity plus staging ECR repo policy access. Removes the staging AWS access-key secret path.
2026-05-21 20:05:18 +00:00
core-fe 55e8c2d347 chore(ci): publish tenant image to staging ecr via ssot publisher
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m18s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m18s
gate-check-v3 / gate-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Platform (Go) (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
CI / all-required (pull_request) Successful in 2m34s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m14s
audit-force-merge / audit (pull_request) Successful in 12s
2026-05-21 13:00:28 -07:00
hongming 07b465f13d Merge pull request 'test(e2e): support empty auth headers on mac bash' (#1648) from fix/e2e-bash32-empty-array into main
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 8s
CI / Detect changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 12s
E2E Chat / detect-changes (push) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m25s
E2E Chat / E2E Chat (push) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Failing after 2m15s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m0s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m33s
publish-workspace-server-image / build-and-push (push) Successful in 5m51s
CI / Platform (Go) (push) Successful in 6m6s
CI / Canvas (Next.js) (push) Successful in 6m55s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 7m35s
publish-workspace-server-image / Production auto-deploy (push) Successful in 3m26s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Failing after 4m32s
main-red-watchdog / watchdog (push) Successful in 28s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 7m51s
2026-05-21 19:48:09 +00:00
37 changed files with 2033 additions and 329 deletions
+9 -8
View File
@@ -104,10 +104,13 @@ if [ "${SOP_REFIRE_DISABLE_RATE_LIMIT:-}" != "1" ]; then
fi
fi
# 3. Invoke sop-tier-check.sh with the env it expects. Capture exit code.
# The canonical script reads tier label, walks approving reviewers, and
# evaluates the AND-composition expression — we want the SAME gate, not
# a different gate.
# 3. Invoke sop-tier-check.sh with the env it expects.
# The canonical workflow intentionally fail-opens the job conclusion
# (`bash .gitea/scripts/sop-tier-check.sh || true`) while Gitea branch
# protection enforces reviewer approvals separately. Keep the refire path
# aligned with that workflow status behavior; otherwise /refire-tier-check can
# post a hard failure that the canonical pull_request_target workflow would
# not publish.
#
# SOP_REFIRE_TIER_CHECK_SCRIPT env var lets tests substitute a mock —
# sop-tier-check.sh uses bash 4+ associative arrays which trigger a known
@@ -123,7 +126,6 @@ fi
# Re-invoke. Pipe stdout/stderr through so the runner log shows the
# tier-check decision inline.
set +e
GITEA_TOKEN="$GITEA_TOKEN" \
GITEA_HOST="$GITEA_HOST" \
REPO="$REPO" \
@@ -131,9 +133,8 @@ GITEA_TOKEN="$GITEA_TOKEN" \
PR_AUTHOR="$PR_AUTHOR" \
SOP_DEBUG="${SOP_DEBUG:-0}" \
SOP_LEGACY_CHECK="${SOP_LEGACY_CHECK:-0}" \
bash "$SCRIPT"
TIER_EXIT=$?
set -e
bash "$SCRIPT" || true
TIER_EXIT=0
debug "sop-tier-check.sh exit=$TIER_EXIT"
# 4. POST the resulting status.
+18 -30
View File
@@ -6,9 +6,10 @@
# T1: PR open + APPROVED via tier:low → script invokes sop-tier-check
# and POSTs status=success.
# T2: PR open + missing tier label → sop-tier-check exits non-zero;
# refire POSTs status=failure (description mentions failure).
# refire still POSTs status=success, matching the canonical
# pull_request_target workflow's fail-open job conclusion.
# T3: PR open + tier:low but NO approving reviews → sop-tier-check
# exits non-zero; refire POSTs status=failure.
# exits non-zero; refire still POSTs status=success for the same reason.
# T4: PR CLOSED → refire exits 0 with no status POST (no-op on closed).
# T5: Rate-limit — recent status update within 30s → refire skips,
# no new POST.
@@ -32,7 +33,7 @@ THIS_DIR="$(cd "$(dirname "$0")" && pwd)"
SCRIPT_DIR="$(cd "$THIS_DIR/.." && pwd)"
WORKFLOW_DIR="$(cd "$THIS_DIR/../../workflows" && pwd)"
WORKFLOW="$WORKFLOW_DIR/sop-tier-refire.yml"
DISPATCH_WORKFLOW="$WORKFLOW_DIR/review-refire-comments.yml"
DISPATCH_WORKFLOW="$WORKFLOW_DIR/sop-checklist.yml"
SCRIPT="$SCRIPT_DIR/sop-tier-refire.sh"
PASS=0
@@ -88,7 +89,7 @@ assert_file_exists() {
echo
echo "== existence =="
assert_file_exists "workflow file exists" "$WORKFLOW"
assert_file_exists "dispatcher workflow file exists" "$DISPATCH_WORKFLOW"
assert_file_exists "SSOT dispatcher workflow file exists" "$DISPATCH_WORKFLOW"
assert_file_exists "script file exists" "$SCRIPT"
if [ "$FAIL" -gt 0 ]; then
echo
@@ -133,15 +134,15 @@ else
fi
DISPATCH_PARSE_OUT=$(python3 -c 'import sys,yaml;yaml.safe_load(open(sys.argv[1]).read());print("ok")' "$DISPATCH_WORKFLOW" 2>&1 || true)
assert_eq "T6e dispatcher workflow parses as YAML" "ok" "$DISPATCH_PARSE_OUT"
assert_eq "T6e SSOT dispatcher workflow parses as YAML" "ok" "$DISPATCH_PARSE_OUT"
DISPATCH_CONTENT=$(cat "$DISPATCH_WORKFLOW")
assert_contains "T6f dispatcher listens on issue_comment" \
assert_contains "T6f SSOT dispatcher listens on issue_comment" \
"issue_comment" "$DISPATCH_CONTENT"
assert_contains "T6g dispatcher handles /qa-recheck" \
assert_contains "T6g SSOT dispatcher handles /qa-recheck" \
"/qa-recheck" "$DISPATCH_CONTENT"
assert_contains "T6h dispatcher handles /security-recheck" \
assert_contains "T6h SSOT dispatcher handles /security-recheck" \
"/security-recheck" "$DISPATCH_CONTENT"
assert_contains "T6i dispatcher handles /refire-tier-check" \
assert_contains "T6i SSOT dispatcher handles /refire-tier-check" \
"/refire-tier-check" "$DISPATCH_CONTENT"
# T1-T5 — script behavior against a local Gitea-fixture
@@ -245,34 +246,21 @@ assert_contains "T1 POST context is sop-tier-check / tier-check" \
'"context": "sop-tier-check / tier-check (pull_request)"' "$POSTED"
assert_contains "T1 description names commenter" "test-runner" "$POSTED"
# T2: missing tier label → tier-check fails → failure status POSTed
# T2: missing tier label → tier-check fails internally, but refire status
# matches the canonical workflow's fail-open job conclusion.
run_scenario "T2_no_tier_label" "fail_no_label"
RC=$(cat "$FIX_STATE_DIR/last_rc")
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
# tier-check.sh exits 1; refire script forwards that exit, so RC != 0
if [ "$RC" -ne 0 ]; then
echo " PASS T2 exit code non-zero (got $RC)"
PASS=$((PASS + 1))
else
echo " FAIL T2 exit code should be non-zero, got 0"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} T2_rc"
fi
assert_contains "T2 POSTed state=failure" '"state": "failure"' "$POSTED"
assert_eq "T2 exit code 0 (canonical fail-open)" "0" "$RC"
assert_contains "T2 POSTed state=success" '"state": "success"' "$POSTED"
# T3: tier:low present but ZERO approving reviews → failure
# T3: tier:low present but ZERO approving reviews → internal tier check fails,
# refire status remains aligned with the canonical workflow.
run_scenario "T3_no_approvals" "fail_no_approvals"
RC=$(cat "$FIX_STATE_DIR/last_rc")
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
if [ "$RC" -ne 0 ]; then
echo " PASS T3 exit code non-zero (got $RC)"
PASS=$((PASS + 1))
else
echo " FAIL T3 exit code should be non-zero, got 0"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} T3_rc"
fi
assert_contains "T3 POSTed state=failure" '"state": "failure"' "$POSTED"
assert_eq "T3 exit code 0 (canonical fail-open)" "0" "$RC"
assert_contains "T3 POSTed state=success" '"state": "success"' "$POSTED"
# T4: closed PR — refire is a no-op (no POST, exit 0)
run_scenario "T4_closed" "pass"
+11
View File
@@ -145,6 +145,11 @@ jobs:
E2E_KEEP_ORG: ${{ github.event.inputs.keep_org == 'true' && '1' || '' }}
MOLECULE_CP_URL: ${{ vars.STAGING_CP_URL || 'https://staging-api.moleculesai.app' }}
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
E2E_AWS_LEAK_CHECK: required
E2E_AWS_TERMINATE_LEAKS: '1'
# MiniMax key is the canary's PRIMARY auth path. claude-code
# template's `minimax` provider routes ANTHROPIC_BASE_URL to
# api.minimax.io/anthropic and reads MINIMAX_API_KEY at boot.
@@ -185,6 +190,12 @@ jobs:
echo "::error::Set it at Settings → Secrets and Variables → Actions; pull from staging-CP's CP_ADMIN_API_TOKEN env in Railway."
exit 1
fi
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
if [ -z "${!var:-}" ]; then
echo "::error::$var secret missing — EC2 leak verification cannot run"
exit 1
fi
done
# LLM-key requirement is per-runtime: claude-code accepts
# EITHER MiniMax OR direct-Anthropic (whichever is set first),
+8
View File
@@ -86,6 +86,7 @@ on:
- 'workspace-server/internal/handlers/registry.go'
- 'workspace-server/internal/handlers/workspace.go'
- 'tests/e2e/test_peer_visibility_mcp_staging.sh'
- 'tests/e2e/test_peer_visibility_token_mint_staging.sh'
- 'tests/e2e/test_peer_visibility_mcp_local.sh'
- 'tests/e2e/lib/peer_visibility_assert.sh'
- '.gitea/workflows/e2e-peer-visibility.yml'
@@ -98,6 +99,7 @@ on:
- 'workspace-server/internal/handlers/registry.go'
- 'workspace-server/internal/handlers/workspace.go'
- 'tests/e2e/test_peer_visibility_mcp_staging.sh'
- 'tests/e2e/test_peer_visibility_token_mint_staging.sh'
- 'tests/e2e/test_peer_visibility_mcp_local.sh'
- 'tests/e2e/lib/peer_visibility_assert.sh'
- '.gitea/workflows/e2e-peer-visibility.yml'
@@ -137,8 +139,14 @@ jobs:
echo "lib/peer_visibility_assert.sh — bash syntax OK"
bash -n tests/e2e/test_peer_visibility_mcp_staging.sh
echo "test_peer_visibility_mcp_staging.sh — bash syntax OK"
bash -n tests/e2e/test_peer_visibility_token_mint_staging.sh
echo "test_peer_visibility_token_mint_staging.sh — bash syntax OK"
bash -n tests/e2e/test_peer_visibility_mcp_local.sh
echo "test_peer_visibility_mcp_local.sh — bash syntax OK"
if rg -n '/admin/workspaces/.*/test-token|test-token' tests/e2e/test_*staging*.sh; then
echo "::error::staging E2E must not use dev-only /admin/workspaces/:id/test-token; use production-safe admin token minting instead"
exit 1
fi
echo "Staging fresh-provision MCP list_peers E2E runs on push to"
echo "main / workflow_dispatch / daily cron (30+ min EC2 boot)."
echo "The LOCAL backend runs in the peer-visibility-local job"
+15
View File
@@ -49,6 +49,8 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
pull_request:
branches: [main]
@@ -59,6 +61,8 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
workflow_dispatch:
schedule:
@@ -127,6 +131,11 @@ jobs:
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
E2E_AWS_LEAK_CHECK: required
E2E_AWS_TERMINATE_LEAKS: '1'
# MiniMax is the PRIMARY LLM auth path post-2026-05-04. Switched
# from hermes+OpenAI default after #2578 (the staging OpenAI key
# account went over quota and stayed dead for 36+ hours, taking
@@ -165,6 +174,12 @@ jobs:
echo "::error::CP_STAGING_ADMIN_API_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
exit 2
fi
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
if [ -z "${!var:-}" ]; then
echo "::error::$var not set — EC2 leak verification cannot run"
exit 2
fi
done
echo "Admin token present ✓"
- name: Verify LLM key present
+11
View File
@@ -47,6 +47,11 @@ jobs:
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
E2E_AWS_LEAK_CHECK: required
E2E_AWS_TERMINATE_LEAKS: '1'
E2E_MODE: smoke
E2E_RUNTIME: hermes
E2E_RUN_ID: "sanity-${{ github.run_id }}"
@@ -61,6 +66,12 @@ jobs:
echo "::error::CP_STAGING_ADMIN_API_TOKEN not set"
exit 2
fi
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
if [ -z "${!var:-}" ]; then
echo "::error::$var not set — EC2 leak verification cannot run"
exit 2
fi
done
# Inverted assertion: the run MUST fail. If it passes, the
# E2E_INTENTIONAL_FAILURE path is broken.
@@ -29,7 +29,8 @@ name: publish-workspace-server-image
# Optional staging tenant mirror target:
# 004947743811.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant
# Required secrets: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AUTO_SYNC_TOKEN
# Optional secrets: AWS_STAGING_ECR_ACCESS_KEY_ID, AWS_STAGING_ECR_SECRET_ACCESS_KEY
# Staging ECR grants the primary SSOT-managed publisher principal repository
# policy access, so no persistent staging AWS access keys are required.
#
# mc#711: Docker daemon not accessible on ubuntu-latest runner (molecule-canonical-1
# shows client-only in `docker info` — daemon not running). DinD mount is present but
@@ -186,9 +187,10 @@ jobs:
--push .
# Build + push tenant image (Go platform + Next.js canvas in one image).
# When staging ECR publisher credentials are configured, push the same
# build to the staging account too so fresh staging/E2E tenants can pull
# without cross-account ECR permissions.
# Push the same build to the staging account too so fresh staging/E2E
# tenants can pull without cross-account ECR reads. The staging ECR repo
# policy trusts the primary SSOT-managed publisher principal; do not add
# separate persistent staging AWS access keys here.
- name: Build & push tenant image to ECR (staging-<sha> + staging-latest)
env:
TENANT_IMAGE_NAME: ${{ env.TENANT_IMAGE_NAME }}
@@ -199,32 +201,22 @@ jobs:
REPO: ${{ github.repository }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_STAGING_ECR_ACCESS_KEY_ID: ${{ secrets.AWS_STAGING_ECR_ACCESS_KEY_ID }}
AWS_STAGING_ECR_SECRET_ACCESS_KEY: ${{ secrets.AWS_STAGING_ECR_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
run: |
set -euo pipefail
ECR_REGISTRY="${TENANT_IMAGE_NAME%%/*}"
STAGING_ECR_REGISTRY="${STAGING_TENANT_IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${STAGING_ECR_REGISTRY}"
build_tags=(
--tag "${TENANT_IMAGE_NAME}:${TAG_SHA}"
--tag "${TENANT_IMAGE_NAME}:${TAG_LATEST}"
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_SHA}"
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_LATEST}"
)
if [ -n "${AWS_STAGING_ECR_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_STAGING_ECR_SECRET_ACCESS_KEY:-}" ]; then
STAGING_ECR_REGISTRY="${STAGING_TENANT_IMAGE_NAME%%/*}"
AWS_ACCESS_KEY_ID="${AWS_STAGING_ECR_ACCESS_KEY_ID}" \
AWS_SECRET_ACCESS_KEY="${AWS_STAGING_ECR_SECRET_ACCESS_KEY}" \
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${STAGING_ECR_REGISTRY}"
build_tags+=(
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_SHA}"
--tag "${STAGING_TENANT_IMAGE_NAME}:${TAG_LATEST}"
)
else
echo "::notice::Skipping staging ECR tenant push; AWS_STAGING_ECR_ACCESS_KEY_ID/AWS_STAGING_ECR_SECRET_ACCESS_KEY are not configured."
fi
docker buildx build \
--file ./workspace-server/Dockerfile.tenant \
+11
View File
@@ -81,6 +81,11 @@ jobs:
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
E2E_AWS_LEAK_CHECK: required
E2E_AWS_TERMINATE_LEAKS: '1'
# MiniMax is the smoke's PRIMARY LLM auth path post-2026-05-04.
# Switched from hermes+OpenAI after #2578 (the staging OpenAI key
# account went over quota and stayed dead for 36+ hours, taking
@@ -129,6 +134,12 @@ jobs:
echo "::error::CP_STAGING_ADMIN_API_TOKEN not set"
exit 2
fi
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
if [ -z "${!var:-}" ]; then
echo "::error::$var not set — EC2 leak verification cannot run"
exit 2
fi
done
- name: Verify LLM key present
run: |
+28 -20
View File
@@ -40,14 +40,12 @@ name: Sweep stale AWS Secrets Manager secrets
# the mostly-orphan tunnels) refuses to nuke past the threshold.
on:
# Disabled as an hourly schedule until the dedicated
# AWS_SECRETS_JANITOR_* key exists in the key-management SSOT and is
# mirrored into Gitea. Falling back to the molecule-cp app principal is
# intentionally not allowed: it lacks account-wide ListSecrets, and
# granting that to an application credential would weaken least privilege.
#
# Keep the manual trigger so operators can validate the workflow immediately
# after provisioning the janitor key, then restore the hourly :30 schedule.
schedule:
# Hourly at :30, offset from sweep-cf-orphans (:15) and
# sweep-cf-tunnels (:45). This janitor is intentionally schedule-only
# for deletes; manual dispatch is forced to dry-run below because Gitea
# 1.22.6 rejects workflow_dispatch.inputs.
- cron: '30 * * * *'
workflow_dispatch:
# Don't let two sweeps race the same AWS account.
concurrency:
@@ -64,22 +62,24 @@ jobs:
sweep:
name: Sweep AWS Secrets Manager
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# This is a cost/leak janitor. A scheduled failure must be red so
# operators know tenant bootstrap secrets may be leaking.
# 30 min cap, mirroring the other janitors. AWS DeleteSecret is
# fast (~0.3s/call) so even a 100+ backlog drains in seconds
# under the 8-way xargs parallelism, but the cap is set generously
# to leave headroom for any actual API hang.
timeout-minutes: 30
env:
AWS_REGION: ${{ secrets.AWS_REGION || 'us-east-1' }}
# Keep this literal. Gitea/act_runner 1.22.6 can mis-render
# secret-backed expressions with `||`, which produced an invalid
# Secrets Manager endpoint in the scheduled janitor.
AWS_REGION: us-east-2
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_SECRETS_JANITOR_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRETS_JANITOR_SECRET_ACCESS_KEY }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
CP_STAGING_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
MAX_DELETE_PCT: ${{ github.event.inputs.max_delete_pct || '50' }}
GRACE_HOURS: ${{ github.event.inputs.grace_hours || '24' }}
MAX_DELETE_PCT: 50
GRACE_HOURS: 24
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -114,17 +114,25 @@ jobs:
- name: Run sweep
if: steps.verify.outputs.skip != 'true'
# Schedule-vs-dispatch dry-run asymmetry mirrors sweep-cf-tunnels:
# - Scheduled: input empty → "false" → --execute (the whole
# point of an hourly janitor).
# - Manual workflow_dispatch: input default true → dry-run;
# operator must flip it to actually delete.
# Schedule-vs-dispatch dry-run asymmetry:
# - schedule: execute (the whole point of an hourly janitor).
# - workflow_dispatch: dry-run. Gitea 1.22.6 rejects
# workflow_dispatch.inputs, so there is no safe manual
# "flip it to execute" toggle in this workflow.
# The script's MAX_DELETE_PCT gate (default 50%) remains the
# second line of defense regardless of trigger.
run: |
set -euo pipefail
if [ "${{ github.event.inputs.dry_run || 'false' }}" = "true" ]; then
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "Running in dry-run mode — no deletions"
bash scripts/ops/sweep-aws-secrets.sh
else
echo "Running with --execute — will delete identified orphans"
bash scripts/ops/sweep-aws-secrets.sh --execute
fi
- name: Notify on sweep failure
if: failure()
run: |
echo "::error::sweep-aws-secrets FAILED — AWS tenant bootstrap secrets may be leaking. Check missing Gitea secrets, staging/prod CP admin tokens, AWS janitor IAM permissions, or the script safety gate."
exit 1
+116
View File
@@ -0,0 +1,116 @@
#!/usr/bin/env bash
# EC2 leak check for staging E2E harnesses.
#
# Modes:
# E2E_AWS_LEAK_CHECK=off skip
# E2E_AWS_LEAK_CHECK=auto check only when aws + credentials exist
# E2E_AWS_LEAK_CHECK=required fail if aws + credentials are unavailable
#
# Optional:
# E2E_AWS_LEAK_CHECK_SECS poll budget, default 90
# E2E_AWS_LEAK_CHECK_INTERVAL poll interval, default 10
# E2E_AWS_TERMINATE_LEAKS=1 terminate matching leaked instances
e2e_aws_leak_mode() {
echo "${E2E_AWS_LEAK_CHECK:-auto}"
}
e2e_aws_region() {
echo "${E2E_AWS_REGION:-${AWS_REGION:-${AWS_DEFAULT_REGION:-us-east-2}}}"
}
e2e_aws_creds_available() {
command -v aws >/dev/null 2>&1 || return 1
[ -n "${AWS_ACCESS_KEY_ID:-}" ] || return 1
[ -n "${AWS_SECRET_ACCESS_KEY:-}" ] || return 1
}
e2e_ec2_instances_for_slug() {
local slug="$1"
local region
region=$(e2e_aws_region)
# shellcheck disable=SC2016
aws ec2 describe-instances \
--region "$region" \
--filters "Name=tag:Name,Values=*$slug*" \
"Name=instance-state-name,Values=pending,running,stopping,stopped" \
--query 'Reservations[].Instances[].[InstanceId,State.Name,Tags[?Key==`Name`].Value|[0]]' \
--output text
}
e2e_terminate_instances() {
local ids="$1"
local region
region=$(e2e_aws_region)
[ -n "$ids" ] || return 0
# shellcheck disable=SC2086
aws ec2 terminate-instances --region "$region" --instance-ids $ids >/dev/null
}
e2e_verify_no_ec2_leaks_for_slug() {
local slug="$1"
local mode
local max_secs
local interval
local elapsed=0
local rows=""
local ids=""
mode=$(e2e_aws_leak_mode)
case "$mode" in
off)
echo "[aws-leak-check] skipped: E2E_AWS_LEAK_CHECK=off" >&2
return 0
;;
auto|required) ;;
*)
echo "[aws-leak-check] invalid E2E_AWS_LEAK_CHECK=$mode (expected off|auto|required)" >&2
return 2
;;
esac
if ! e2e_aws_creds_available; then
if [ "$mode" = "required" ]; then
echo "[aws-leak-check] required but aws CLI or AWS credentials are unavailable" >&2
return 2
fi
echo "[aws-leak-check] skipped: aws CLI or AWS credentials unavailable" >&2
return 0
fi
max_secs="${E2E_AWS_LEAK_CHECK_SECS:-90}"
interval="${E2E_AWS_LEAK_CHECK_INTERVAL:-10}"
while true; do
rows=$(e2e_ec2_instances_for_slug "$slug" 2>&1) || {
echo "[aws-leak-check] aws ec2 describe-instances failed for slug=$slug" >&2
echo "$rows" >&2
return 2
}
if [ -z "$rows" ] || [ "$rows" = "None" ]; then
echo "[aws-leak-check] no live EC2 instances for slug=$slug" >&2
return 0
fi
if [ "$elapsed" -ge "$max_secs" ]; then
echo "[aws-leak-check] leaked EC2 instance(s) for slug=$slug after ${elapsed}s:" >&2
echo "$rows" >&2
if [ "${E2E_AWS_TERMINATE_LEAKS:-0}" = "1" ]; then
ids=$(echo "$rows" | awk 'NF {print $1}' | sort -u | tr '\n' ' ')
echo "[aws-leak-check] terminating leaked EC2 instance(s): $ids" >&2
e2e_terminate_instances "$ids" || {
echo "[aws-leak-check] terminate-instances failed for: $ids" >&2
return 4
}
fi
return 4
fi
sleep "$interval"
elapsed=$((elapsed + interval))
done
}
+109
View File
@@ -0,0 +1,109 @@
#!/usr/bin/env bash
set -uo pipefail
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck disable=SC1091
# shellcheck source=lib/aws_leak_check.sh
source "$SCRIPT_DIR/lib/aws_leak_check.sh"
PASS=0
FAIL=0
TMPDIR_E2E=$(mktemp -d -t aws-leak-check-e2e-XXXXXX)
trap 'rm -rf "$TMPDIR_E2E"' EXIT INT TERM
make_fake_aws() {
local body="$1"
mkdir -p "$TMPDIR_E2E/bin"
cat > "$TMPDIR_E2E/bin/aws" <<EOF
#!/usr/bin/env bash
set -euo pipefail
echo "\$*" >> "$TMPDIR_E2E/aws.calls"
$body
EOF
chmod +x "$TMPDIR_E2E/bin/aws"
}
reset_env() {
/bin/rm -f "$TMPDIR_E2E/aws.calls"
export PATH="$TMPDIR_E2E/bin:$ORIG_PATH"
export AWS_ACCESS_KEY_ID=test-access
export AWS_SECRET_ACCESS_KEY=test-secret
export AWS_DEFAULT_REGION=us-east-2
export E2E_AWS_LEAK_CHECK=required
export E2E_AWS_LEAK_CHECK_SECS=0
export E2E_AWS_LEAK_CHECK_INTERVAL=1
unset E2E_AWS_TERMINATE_LEAKS
}
assert_rc() {
local label="$1"
local expected="$2"
shift 2
local observed
"$@" >/tmp/aws-leak-check.out 2>/tmp/aws-leak-check.err
observed=$?
if [ "$observed" = "$expected" ]; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label: expected rc=$expected observed=$observed" >&2
echo " stderr:" >&2
sed 's/^/ /' /tmp/aws-leak-check.err >&2
FAIL=$((FAIL + 1))
fi
}
ORIG_PATH="$PATH"
echo "Test: AWS EC2 leak check helper"
reset_env
/bin/rm -rf "${TMPDIR_E2E:?}/bin"
/bin/mkdir -p "$TMPDIR_E2E/noaws"
export PATH="$TMPDIR_E2E/noaws"
export E2E_AWS_LEAK_CHECK=auto
assert_rc "auto mode skips when aws is unavailable" 0 e2e_verify_no_ec2_leaks_for_slug e2e-smoke-test
reset_env
/bin/rm -rf "${TMPDIR_E2E:?}/bin"
/bin/mkdir -p "$TMPDIR_E2E/noaws"
export PATH="$TMPDIR_E2E/noaws"
export E2E_AWS_LEAK_CHECK=required
assert_rc "required mode fails when aws is unavailable" 2 e2e_verify_no_ec2_leaks_for_slug e2e-smoke-test
reset_env
# shellcheck disable=SC2016
make_fake_aws 'if [ "$1 $2" = "ec2 describe-instances" ]; then exit 0; fi'
assert_rc "no matching EC2 returns clean" 0 e2e_verify_no_ec2_leaks_for_slug e2e-smoke-test
reset_env
# shellcheck disable=SC2016
make_fake_aws 'if [ "$1 $2" = "ec2 describe-instances" ]; then echo "i-123 running ws-tenant-e2e-smoke-test-abc"; exit 0; fi'
assert_rc "persistent matching EC2 is a leak" 4 e2e_verify_no_ec2_leaks_for_slug e2e-smoke-test
reset_env
export E2E_AWS_TERMINATE_LEAKS=1
# shellcheck disable=SC2016
make_fake_aws '
if [ "$1 $2" = "ec2 describe-instances" ]; then
echo "i-123 running ws-tenant-e2e-smoke-test-abc"
exit 0
fi
if [ "$1 $2" = "ec2 terminate-instances" ]; then
echo "terminated" >/dev/null
exit 0
fi
'
assert_rc "terminate mode attempts cleanup before returning leak" 4 e2e_verify_no_ec2_leaks_for_slug e2e-smoke-test
if grep -q "terminate-instances" "$TMPDIR_E2E/aws.calls"; then
echo " PASS terminate-instances was called"
PASS=$((PASS + 1))
else
echo " FAIL terminate-instances was not called" >&2
FAIL=$((FAIL + 1))
fi
echo
echo "passed=$PASS failed=$FAIL"
[ "$FAIL" = "0" ]
+5 -2
View File
@@ -50,13 +50,16 @@ docker rm $(docker ps -aq --filter "name=ws-") 2>/dev/null || true
echo ""
echo "--- Create Workspaces ---"
# model is required at the Create boundary (CTO 2026-05-22 SSOT —
# feedback_workspace_model_required_no_platform_default_dynamic_credential_intake).
# Pass the same value the deleted DefaultModel("claude-code") returned.
ROOT=$(curl -s -X POST $PLATFORM/workspaces -H "Content-Type: application/json" \
-d '{"name":"Root Agent","role":"Company coordinator","runtime":"claude-code","tier":3}' \
-d '{"name":"Root Agent","role":"Company coordinator","runtime":"claude-code","model":"sonnet","tier":3}' \
| python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
check_contains "Create root workspace" "-" "$ROOT"
CHILD=$(curl -s -X POST $PLATFORM/workspaces -H "Content-Type: application/json" \
-d "{\"name\":\"Child Agent\",\"role\":\"Sub-team member\",\"runtime\":\"claude-code\",\"tier\":2,\"parent_id\":\"$ROOT\"}" \
-d "{\"name\":\"Child Agent\",\"role\":\"Sub-team member\",\"runtime\":\"claude-code\",\"model\":\"sonnet\",\"tier\":2,\"parent_id\":\"$ROOT\"}" \
| python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
check_contains "Create child workspace" "-" "$CHILD"
+5 -1
View File
@@ -92,8 +92,12 @@ for _wid in $PRIOR; do
curl -s -X DELETE "$BASE/workspaces/$_wid?confirm=true" > /dev/null || true
done
# model is required at the Create boundary (CTO 2026-05-22 SSOT — see
# feedback_workspace_model_required_no_platform_default_dynamic_credential_intake).
# Body had no runtime → defaults to langgraph; pass the langgraph-compatible
# default that the deleted DefaultModel("") would have returned.
R=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d '{"name":"Notify E2E","tier":1}')
-d '{"name":"Notify E2E","tier":1,"model":"anthropic:claude-opus-4-7"}')
WSID=$(echo "$R" | python3 -c 'import json,sys;print(json.load(sys.stdin)["id"])' 2>/dev/null || true)
[ -n "$WSID" ] || { echo "Failed to create workspace: $R"; exit 1; }
echo "Created workspace $WSID"
+19 -2
View File
@@ -241,8 +241,24 @@ else
fi
log "1/5 provisioning parent ($PARENT_RUNTIME, mode=$PV_LOCAL_PROVISION_MODE) + one sibling per runtime under test..."
# Map runtime → model per the CTO 2026-05-22 SSOT directive (model is
# required, no platform default). External runtimes are exempt by the
# Create-handler gate — for them the URL is the contract — but we still
# pass model="external:custom" defensively in case a downstream consumer
# of the create body asserts presence.
_model_for_runtime() {
case "$1" in
claude-code) echo "sonnet" ;;
codex) echo "gpt-5.5" ;;
kimi) echo "kimi-coding/kimi-k2-coding-6" ;;
minimax) echo "minimax/MiniMax-M2.7" ;;
external) echo "external:custom" ;;
*) echo "anthropic:claude-opus-4-7" ;;
esac
}
PARENT_MODEL=$(_model_for_runtime "$PARENT_RUNTIME")
P_RESP=$(curl -s -X POST "$BASE/workspaces" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-parent\",\"runtime\":\"$PARENT_RUNTIME\",\"tier\":3$PARENT_EXTRA,\"secrets\":$PARENT_SECRETS}")
-d "{\"name\":\"${NAME_PREFIX}-parent\",\"runtime\":\"$PARENT_RUNTIME\",\"model\":\"$PARENT_MODEL\",\"tier\":3$PARENT_EXTRA,\"secrets\":$PARENT_SECRETS}")
PARENT_ID=$(echo "$P_RESP" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))' 2>/dev/null)
if [ -z "$PARENT_ID" ]; then
echo "::error::parent create failed: $(echo "$P_RESP" | head -c 300)" >&2
@@ -291,8 +307,9 @@ for rt in $PV_RUNTIMES; do
CREATE_RUNTIME="$rt"
CREATE_EXTRA=""
fi
CREATE_MODEL=$(_model_for_runtime "$CREATE_RUNTIME")
R=$(curl -s -X POST "$BASE/workspaces" ${ADMIN_AUTH[@]+"${ADMIN_AUTH[@]}"} -H "Content-Type: application/json" \
-d "{\"name\":\"${NAME_PREFIX}-$rt\",\"runtime\":\"$CREATE_RUNTIME\",\"tier\":2,\"parent_id\":\"$PARENT_ID\"$CREATE_EXTRA,\"secrets\":$SEC}")
-d "{\"name\":\"${NAME_PREFIX}-$rt\",\"runtime\":\"$CREATE_RUNTIME\",\"model\":\"$CREATE_MODEL\",\"tier\":2,\"parent_id\":\"$PARENT_ID\"$CREATE_EXTRA,\"secrets\":$SEC}")
WID=$(echo "$R" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))' 2>/dev/null)
if [ -z "$WID" ]; then
echo "::error::$rt workspace create failed: $(echo "$R" | head -c 300)" >&2
+28 -11
View File
@@ -54,6 +54,9 @@
# E2E_PROVISION_TIMEOUT_SECS default 1800 (hermes/openclaw cold EC2 budget)
# E2E_MINIMAX_API_KEY / E2E_ANTHROPIC_API_KEY / E2E_OPENAI_API_KEY
# LLM provider key injected so the runtime can boot
# PV_TOKEN_DIAGNOSTIC_ONLY
# 1 -> stop after create/token acquisition. Useful
# to classify Hermes-only vs shared auth-route issues.
# E2E_KEEP_ORG 1 → skip teardown (local debugging only)
#
# Exit codes:
@@ -232,6 +235,12 @@ for i in $(seq 1 120); do
curl -fsS "$TENANT_URL/health" -m 5 -k >/dev/null 2>&1 && { log " /health ok (attempt $i)"; break; }
sleep 5
done
BUILDINFO=$(curl -fsS "$TENANT_URL/buildinfo" -m 10 2>/dev/null || true)
if [ -n "$BUILDINFO" ]; then
log " tenant buildinfo: $(echo "$BUILDINFO" | head -c 300)"
else
log " tenant buildinfo unavailable"
fi
# ─── 4. Provision the parent + one sibling per runtime under test ──────
# Inject the LLM provider key so each runtime can authenticate at boot.
@@ -256,6 +265,8 @@ log " PARENT_ID=$PARENT_ID"
# WS_IDS[runtime]=id ; WS_TOKENS[runtime]=auth_token (the MCP bearer)
declare -A WS_IDS WS_TOKENS
ALL_WS_IDS="$PARENT_ID"
TOKEN_ERRORS=0
TOKEN_ERROR_SUMMARY=""
for rt in $PV_RUNTIMES; do
R=$(tenant_call POST /workspaces \
-d "{\"name\":\"pv-$rt\",\"runtime\":\"$rt\",\"tier\":2,\"parent_id\":\"$PARENT_ID\",\"secrets\":$SECRETS_JSON}")
@@ -263,7 +274,7 @@ for rt in $PV_RUNTIMES; do
# External-like runtimes may return connection.auth_token on create.
# Managed container runtimes usually return only id/status here, then
# receive their bearer through registry/bootstrap; for this literal MCP
# driver we mint an admin test token below.
# driver we mint through the production-safe admin token route below.
WTOK=$(echo "$R" | extract_auth_token)
[ -n "$WID" ] || fail "$rt workspace create failed: $(echo "$R" | head -c 300)"
TOKEN_DIAG=""
@@ -275,22 +286,28 @@ for rt in $PV_RUNTIMES; do
TOKEN_DIAG="POST /admin/workspaces/$WID/tokens -> HTTP $TTOK_CODE body: $(echo "$TTOK_RESP" | redact_token_body)"
rm -f "$TTOK_FILE"
fi
if [ -z "$WTOK" ]; then
TTOK_FILE=$(mktemp)
TTOK_CODE=$(tenant_call_capture GET "/admin/workspaces/$WID/test-token" "$TTOK_FILE" 2>/dev/null || echo "curl_error")
TTOK_RESP=$(cat "$TTOK_FILE" 2>/dev/null || true)
WTOK=$(echo "$TTOK_RESP" | extract_auth_token)
TOKEN_DIAG="${TOKEN_DIAG}
GET /admin/workspaces/$WID/test-token -> HTTP $TTOK_CODE body: $(echo "$TTOK_RESP" | redact_token_body)"
rm -f "$TTOK_FILE"
fi
[ -n "$WTOK" ] || fail "$rt workspace did not return or mint an auth_token — cannot drive its MCP call (create_resp: $(echo "$R" | redact_token_body); token_fallbacks: $TOKEN_DIAG)"
WS_IDS[$rt]="$WID"
if [ -z "$WTOK" ]; then
TOKEN_ERRORS=$((TOKEN_ERRORS + 1))
TOKEN_ERROR_SUMMARY="${TOKEN_ERROR_SUMMARY}
[$rt] workspace did not return or mint an auth_token — cannot drive its MCP call (workspace_id=$WID; create_resp: $(echo "$R" | redact_token_body); token_fallbacks: $TOKEN_DIAG)"
log " $rt$WID (token acquisition failed; continuing to classify other runtimes)"
continue
fi
WS_TOKENS[$rt]="$WTOK"
ALL_WS_IDS="$ALL_WS_IDS $WID"
log " $rt$WID"
done
if [ "$TOKEN_ERRORS" -gt 0 ]; then
fail "token acquisition failed for $TOKEN_ERRORS runtime(s):$TOKEN_ERROR_SUMMARY"
fi
if [ "${PV_TOKEN_DIAGNOSTIC_ONLY:-0}" = "1" ]; then
ok "token diagnostic passed for runtimes: $PV_RUNTIMES"
exit 0
fi
# ─── 5. Wait for every sibling online ──────────────────────────────────
log "5/6 waiting for all workspaces status=online (up to ${PROVISION_TIMEOUT_SECS}s — cold boot)..."
WS_DEADLINE=$(( $(date +%s) + PROVISION_TIMEOUT_SECS ))
+22
View File
@@ -0,0 +1,22 @@
#!/usr/bin/env bash
# Staging E2E diagnostic — classify peer-visibility token acquisition.
#
# This is intentionally narrower than test_peer_visibility_mcp_staging.sh:
# it provisions the same throwaway org, creates managed sibling workspaces,
# and stops immediately after auth_token acquisition. The default runtime set
# compares hermes with claude-code so a failure is easy to classify:
# - hermes fails, claude-code passes -> Hermes/runtime-specific
# - both fail -> shared admin/auth/proxy route
#
# Required env matches test_peer_visibility_mcp_staging.sh:
# MOLECULE_ADMIN_TOKEN
# Optional:
# MOLECULE_CP_URL, E2E_RUN_ID, PV_RUNTIMES, E2E_KEEP_ORG,
# E2E_MINIMAX_API_KEY / E2E_ANTHROPIC_API_KEY / E2E_OPENAI_API_KEY
set -euo pipefail
export PV_RUNTIMES="${PV_RUNTIMES:-hermes claude-code}"
export PV_TOKEN_DIAGNOSTIC_ONLY=1
exec "$(dirname "${BASH_SOURCE[0]}")/test_peer_visibility_mcp_staging.sh"
+4 -2
View File
@@ -188,8 +188,9 @@ import json, os
print(json.dumps({'CLAUDE_CODE_OAUTH_TOKEN': os.environ['CLAUDE_CODE_OAUTH_TOKEN']}))
")
local resp wsid
# model required (CTO 2026-05-22 SSOT) — pass the deleted DefaultModel("claude-code") value.
resp=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"Priority E2E (claude-code)\",\"runtime\":\"claude-code\",\"tier\":1,\"secrets\":$secrets}")
-d "{\"name\":\"Priority E2E (claude-code)\",\"runtime\":\"claude-code\",\"model\":\"sonnet\",\"tier\":1,\"secrets\":$secrets}")
wsid=$(echo "$resp" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))') || true
if [ -z "$wsid" ]; then
fail "create claude-code workspace" "$resp"
@@ -380,8 +381,9 @@ import json, os
print(json.dumps({'GEMINI_API_KEY': os.environ['E2E_GEMINI_API_KEY']}))
")
local resp wsid
# model required (CTO 2026-05-22 SSOT) — gemini-cli routes via the gemini provider.
resp=$(curl -s -X POST "$BASE/workspaces" -H "Content-Type: application/json" \
-d "{\"name\":\"Priority E2E (gemini-cli)\",\"runtime\":\"gemini-cli\",\"tier\":1,\"secrets\":$secrets}")
-d "{\"name\":\"Priority E2E (gemini-cli)\",\"runtime\":\"gemini-cli\",\"model\":\"gemini-2.0-flash\",\"tier\":1,\"secrets\":$secrets}")
wsid=$(echo "$resp" | python3 -c 'import json,sys;print(json.load(sys.stdin).get("id",""))') || true
if [ -z "$wsid" ]; then fail "create gemini-cli workspace" "$resp"; return 0; fi
CREATED_WSIDS+=("$wsid")
+26 -7
View File
@@ -32,6 +32,11 @@
# mapped to `smoke` for back-compat with
# any in-flight runner picking up an older
# workflow checkout)
# E2E_AWS_LEAK_CHECK auto (default) | required | off
# required in CI so teardown cannot report
# clean while slug-tagged EC2 remains alive
# E2E_AWS_TERMINATE_LEAKS 1 → terminate slug-tagged leaked EC2 before
# exiting 4
# E2E_INTENTIONAL_FAILURE 1 → poison tenant token mid-run so the
# script fails; the EXIT trap MUST still
# tear down cleanly (and exit 4 on leak).
@@ -82,8 +87,12 @@ ok() { echo "[$(date +%H:%M:%S)] ✅ $*"; }
# Per-runtime model slug dispatch — see lib/model_slug.sh for the rationale.
# Extracted so unit tests (tests/e2e/test_model_slug.sh) can pin every branch
# without booting the full 11-step lifecycle.
# shellcheck disable=SC1091
# shellcheck source=lib/model_slug.sh
source "$(dirname "$0")/lib/model_slug.sh"
# shellcheck disable=SC1091
# shellcheck source=lib/aws_leak_check.sh
source "$(dirname "$0")/lib/aws_leak_check.sh"
CURL_COMMON=(-sS --fail-with-body --max-time 30)
@@ -119,12 +128,14 @@ cleanup_org() {
# DELETE returns 5xx mid-cascade and the cascade finishes anyway,
# and the case where DELETE legitimately exceeds 120s and we want
# eventual-consistency confirmation.
curl "${CURL_COMMON[@]}" --max-time 120 -X DELETE "$CP_URL/cp/admin/tenants/$SLUG" \
if curl "${CURL_COMMON[@]}" --max-time 120 -X DELETE "$CP_URL/cp/admin/tenants/$SLUG" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"confirm\":\"$SLUG\"}" >/dev/null 2>&1 \
&& ok "Teardown request accepted" \
|| log "Teardown returned non-2xx (may already be gone)"
-d "{\"confirm\":\"$SLUG\"}" >/dev/null 2>&1; then
ok "Teardown request accepted"
else
log "Teardown returned non-2xx (may already be gone)"
fi
local leak_count=1
local elapsed=0
@@ -144,7 +155,15 @@ cleanup_org() {
echo "⚠️ LEAK: org $SLUG still present post-teardown after ${elapsed}s (count=$leak_count)" >&2
exit 4
fi
ok "Teardown clean — no orphan resources for $SLUG (${elapsed}s)"
local aws_leak_rc=0
e2e_verify_no_ec2_leaks_for_slug "$SLUG" || aws_leak_rc=$?
if [ "$aws_leak_rc" != "0" ]; then
case "$aws_leak_rc" in
2) exit 2 ;;
*) exit 4 ;;
esac
fi
ok "Teardown clean — no orphan org or EC2 resources for $SLUG (${elapsed}s)"
# Normalize unexpected upstream exit codes to 1 (generic failure). The
# script's documented contract (header "Exit codes" section) only emits
@@ -490,7 +509,7 @@ done
# - tenantIngressRules / workspaceIngressRules (CP)
# - eicSSHIngressRule helper (CP)
# - AuthorizeIngress source-group support (CP awsapi)
# - EIC_ENDPOINT_SG_ID Railway env
# - MOLECULE_EIC_ENDPOINT_SG_ID Railway env
# - handleRemoteConnect's send-ssh-public-key/open-tunnel/ssh chain
# surfaces within ~20 min of merge instead of waiting for a user report.
#
@@ -512,7 +531,7 @@ for wid in $WS_TO_CHECK; do
else
DIAG_FAIL=$(echo "$DIAG_JSON" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('first_failure','unknown'))" 2>/dev/null || echo "unknown")
DIAG_DETAIL=$(echo "$DIAG_JSON" | python3 -c "import json,sys; d=json.load(sys.stdin); s=[x for x in d.get('steps',[]) if not x.get('ok')]; step=s[0] if s else {}; print(' — '.join(x for x in [step.get('error',''), step.get('detail','')] if x))" 2>/dev/null || echo "")
fail "Workspace $wid terminal diagnose failed at step '$DIAG_FAIL': $DIAG_DETAIL — check tenant SG has tcp/22 from EIC endpoint SG (sg-0785d5c6138220523), EIC_ENDPOINT_SG_ID set in Railway, and EIC endpoint health"
fail "Workspace $wid terminal diagnose failed at step '$DIAG_FAIL': $DIAG_DETAIL — check tenant SG has tcp/22 from the configured EIC endpoint SG, MOLECULE_EIC_ENDPOINT_SG_ID is set in Railway, and EIC endpoint health"
fi
done
+317 -17
View File
@@ -67,7 +67,213 @@ func NewActivityHandler(b *events.Broadcaster) *ActivityHandler {
return &ActivityHandler{broadcaster: b}
}
// List handles GET /workspaces/:id/activity?type=&source=&limit=&since_secs=&since_id=
// extractAttachmentsFromRequestBody walks a JSON-RPC a2a inbound body to
// surface attachments (file/image/audio/video) as a flat `attachments[]`
// projection so callers don't have to drill into the request_body shape
// themselves.
//
// Two body shapes are walked in order:
//
// 1. a2a-sdk v1 message-part envelope (peer_agent inbound):
//
// {"jsonrpc":"2.0","method":"message/send","params":{
// "message":{"parts":[
// {"kind":"text", "text":"hi"},
// {"kind":"file", "file":{"uri":"workspace:foo.pdf","mime_type":"application/pdf","name":"foo.pdf"}},
// {"kind":"image","file":{"uri":"workspace:bar.png","mime_type":"image/png","name":"bar.png"}},
// ]}}}
//
// 2. canvas chat_upload_receive flat manifest (canvas_user upload):
//
// {"uri":"platform-pending:<ws>/<file>",
// "name":"pasted.png",
// "size":12345,
// "file_id":"<uuid>",
// "mimeType":"image/png"}
//
// The canvas upload pipe writes a single manifest directly at the
// root of request_body (no JSON-RPC envelope) with camelCase
// `mimeType`. We normalize to snake_case `mime_type` on the way out
// so every downstream adaptor (channel / telegram / codex / hermes)
// sees one wire shape regardless of which inbound shape produced it.
//
// Returns nil (omit-from-JSON) when the body has no attachments — the
// `?include=peer_info` envelope projects this as an array iff non-empty.
//
// Defensive on every step: any missing key / wrong-shape value falls
// through to the next arm or returns nil instead of panicking. The
// activity_logs row could carry literally any JSON in request_body
// (legacy formats, future formats); we only commit to the documented
// shapes and silently skip anything else.
func extractAttachmentsFromRequestBody(raw []byte) []map[string]interface{} {
if len(raw) == 0 {
return nil
}
var body map[string]interface{}
if err := json.Unmarshal(raw, &body); err != nil {
return nil
}
if atts := extractAttachmentsFromMessageParts(body); len(atts) > 0 {
return atts
}
if att := extractAttachmentFromFlatUploadManifest(body); att != nil {
return []map[string]interface{}{att}
}
return nil
}
// extractAttachmentsFromMessageParts handles the a2a-sdk v1 shape:
// body.params.message.parts[]. Walks file/image/audio parts; honors v1
// `kind` and v0 `type` discriminators; accepts nested `.file` sub-object
// or inlined uri/mime_type/name on the part itself.
func extractAttachmentsFromMessageParts(body map[string]interface{}) []map[string]interface{} {
params, ok := body["params"].(map[string]interface{})
if !ok {
return nil
}
message, ok := params["message"].(map[string]interface{})
if !ok {
return nil
}
parts, ok := message["parts"].([]interface{})
if !ok {
return nil
}
out := make([]map[string]interface{}, 0)
for _, p := range parts {
part, ok := p.(map[string]interface{})
if !ok {
continue
}
// a2a-sdk v1 uses "kind"; older v0 callers sent "type". Accept
// both for the discriminator — same defensive read pattern as
// the runtime-side extract_text helper.
kind, _ := part["kind"].(string)
if kind == "" {
kind, _ = part["type"].(string)
}
if kind != "file" && kind != "image" && kind != "audio" {
continue
}
// The file sub-object holds uri/mime_type/name. The a2a-sdk v1
// shape nests under "file"; some legacy payloads inlined the
// fields onto the part itself. Support both.
var fileObj map[string]interface{}
if f, ok := part["file"].(map[string]interface{}); ok {
fileObj = f
} else {
fileObj = part
}
uri, _ := fileObj["uri"].(string)
mimeType, _ := fileObj["mime_type"].(string)
name, _ := fileObj["name"].(string)
// At minimum we need either a uri or a name to be useful.
// Empty-part entries are skipped (they're a malformed inbound
// — surface nothing rather than emit a no-info placeholder).
if uri == "" && name == "" {
continue
}
att := map[string]interface{}{"kind": kind}
if uri != "" {
att["uri"] = uri
}
if mimeType != "" {
att["mime_type"] = mimeType
}
if name != "" {
att["name"] = name
}
out = append(out, att)
}
if len(out) == 0 {
return nil
}
return out
}
// extractAttachmentFromFlatUploadManifest handles the canvas
// chat_upload_receive shape: a single upload manifest at the root of
// request_body with no JSON-RPC envelope. Canvas uses camelCase
// `mimeType`; we normalize to snake_case `mime_type` on emit so the
// wire shape matches the message-parts arm. Kind is derived from the
// mime prefix (image/* → "image", audio/* → "audio", video/* → "video",
// anything else → "file") because the canvas upload row doesn't carry
// an explicit discriminator. Returns nil if neither `uri` nor `file_id`
// is present at the root (i.e. not a flat upload manifest).
func extractAttachmentFromFlatUploadManifest(body map[string]interface{}) map[string]interface{} {
uri, _ := body["uri"].(string)
fileID, _ := body["file_id"].(string)
if uri == "" && fileID == "" {
return nil
}
mimeType, _ := body["mimeType"].(string)
if mimeType == "" {
// Defensive: future canvas versions might emit snake_case directly.
mimeType, _ = body["mime_type"].(string)
}
name, _ := body["name"].(string)
// Apply the same minimum-info rule as the message-parts arm: a
// manifest with neither uri nor name is non-actionable; skip.
if uri == "" && name == "" {
return nil
}
att := map[string]interface{}{"kind": kindFromMimeType(mimeType)}
if uri != "" {
att["uri"] = uri
}
if mimeType != "" {
att["mime_type"] = mimeType
}
if name != "" {
att["name"] = name
}
return att
}
// kindFromMimeType derives the attachment `kind` discriminator from a
// MIME type. Used by the flat-upload-manifest arm where the source row
// has no explicit kind field.
func kindFromMimeType(mime string) string {
switch {
case strings.HasPrefix(mime, "image/"):
return "image"
case strings.HasPrefix(mime, "audio/"):
return "audio"
case strings.HasPrefix(mime, "video/"):
return "video"
default:
return "file"
}
}
// includeFlagSet returns true iff `flag` appears in the comma-separated
// `?include=` query value. Whitespace around entries is tolerated.
// Empty `include` returns false (existing back-compat shape).
//
// The comma-separable form lets future fields ("attachments_only",
// "tool_trace_expanded", etc.) slot in without further URL-param creep.
func includeFlagSet(includeQuery, flag string) bool {
if includeQuery == "" || flag == "" {
return false
}
for _, raw := range strings.Split(includeQuery, ",") {
if strings.TrimSpace(raw) == flag {
return true
}
}
return false
}
// List handles GET /workspaces/:id/activity?type=&source=&limit=&since_secs=&since_id=&include=
//
// The `include` query param is comma-separable; today the only flag is
// `peer_info`, which enriches a2a_receive rows with `peer_name`,
// `peer_role`, `agent_card_url`, and an `attachments[]` projection (see
// extractAttachmentsFromRequestBody). It's additive + opt-in — existing
// callers that don't pass `?include=peer_info` see the unchanged shape.
// Surface for the layered enrichment that lets Claude Code channel
// pushes carry full sender identity instead of bare UUIDs (sibling
// repos: molecule-ai-workspace-runtime + molecule-mcp-claude-channel).
//
// since_secs filters to activity_logs.created_at >= NOW() - INTERVAL '$N seconds'.
// Optional, additive — callers that don't pass it get today's behavior (the
@@ -102,6 +308,8 @@ func (h *ActivityHandler) List(c *gin.Context) {
sinceSecsStr := c.Query("since_secs")
sinceID := c.Query("since_id")
beforeTSStr := c.Query("before_ts") // optional RFC3339 — return rows strictly older than this timestamp
include := c.Query("include") // comma-separated; today's only flag is "peer_info"
includePeerInfo := includeFlagSet(include, "peer_info")
// Validate peer_id as a UUID at the trust boundary so a malformed
// caller (the agent or a downstream MCP tool) can't smuggle SQL
@@ -192,22 +400,60 @@ func (h *ActivityHandler) List(c *gin.Context) {
usingCursor = true
}
// Build query with optional filters
query := `SELECT id, workspace_id, activity_type, source_id, target_id, method,
summary, request_body, response_body, tool_trace, duration_ms, status, error_detail, created_at
FROM activity_logs WHERE workspace_id = $1`
// Build query with optional filters. When ?include=peer_info is set,
// LEFT JOIN workspaces ON activity_logs.source_id = w.id so we can
// surface w.name + w.role on the row. LEFT (not INNER) is required
// for two reasons:
// 1. Canvas rows have source_id IS NULL — those must still appear
// in the result set (with NULL peer_name/peer_role).
// 2. A peer workspace may have been deleted since the row was
// written (no FK constraint on activity_logs.source_id) —
// LEFT JOIN preserves the activity row with NULL peer fields
// rather than silently dropping the row.
//
// agent_card_url is NOT pulled from the workspaces table; it's
// computed server-side from externalPlatformURL + source_id at
// projection time (mirrors molecule-ai-workspace-runtime
// a2a_client._agent_card_url_for which constructs
// {PLATFORM_URL}/registry/discover/{peer_id}).
//
// Column qualification (`activity_logs.<col>`) is added ONLY when
// the JOIN is present — disambiguates `id` / `created_at` which
// exist in both tables. When the JOIN is absent, unqualified
// column references preserve the exact wire-shape existing callers
// + existing test fixtures expect (back-compat).
actCol := ""
if includePeerInfo {
actCol = "activity_logs."
}
selectClause := `SELECT ` + actCol + `id, ` + actCol + `workspace_id, ` + actCol + `activity_type, ` +
actCol + `source_id, ` + actCol + `target_id, ` + actCol + `method, ` +
actCol + `summary, ` + actCol + `request_body, ` + actCol + `response_body, ` +
actCol + `tool_trace, ` + actCol + `duration_ms, ` + actCol + `status, ` +
actCol + `error_detail, ` + actCol + `created_at`
fromClause := ` FROM activity_logs`
if includePeerInfo {
selectClause += `, w.name AS peer_name, w.role AS peer_role`
fromClause += ` LEFT JOIN workspaces w ON w.id = activity_logs.source_id`
}
query := selectClause + fromClause + ` WHERE ` + actCol + `workspace_id = $1`
args := []interface{}{workspaceID}
argIdx := 2
// WHERE/ORDER column refs use the same `actCol` qualifier prefix
// computed above — empty string when no JOIN (back-compat with
// existing wire shape + sqlmock-regex test fixtures), or
// `activity_logs.` when LEFT JOIN'd (disambiguates `id` /
// `created_at` between the two tables).
if activityType != "" {
query += fmt.Sprintf(" AND activity_type = $%d", argIdx)
query += fmt.Sprintf(" AND "+actCol+"activity_type = $%d", argIdx)
args = append(args, activityType)
argIdx++
}
if source == "canvas" {
query += " AND source_id IS NULL"
query += " AND " + actCol + "source_id IS NULL"
} else if source == "agent" {
query += " AND source_id IS NOT NULL"
query += " AND " + actCol + "source_id IS NOT NULL"
} else if source != "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "source must be 'canvas' or 'agent'"})
return
@@ -224,7 +470,7 @@ func (h *ActivityHandler) List(c *gin.Context) {
// and avoids duplicate parameter binding (some drivers reject the
// same arg slot reused, ours is fine but the explicit form is
// clearer to read and matches the rest of the builder.)
query += fmt.Sprintf(" AND (source_id = $%d OR target_id = $%d)", argIdx, argIdx)
query += fmt.Sprintf(" AND ("+actCol+"source_id = $%d OR "+actCol+"target_id = $%d)", argIdx, argIdx)
args = append(args, peerID)
argIdx++
}
@@ -232,7 +478,7 @@ func (h *ActivityHandler) List(c *gin.Context) {
// Strictly older — never replay a row with the exact same
// timestamp, mirrors the `created_at > cursorTime` shape
// `since_id` uses for forward paging.
query += fmt.Sprintf(" AND created_at < $%d", argIdx)
query += fmt.Sprintf(" AND "+actCol+"created_at < $%d", argIdx)
args = append(args, beforeTS)
argIdx++
}
@@ -241,13 +487,13 @@ func (h *ActivityHandler) List(c *gin.Context) {
// interpolated into the SQL string. `make_interval(secs => $N)`
// avoids the lib/pq quirk where INTERVAL '$N seconds' won't
// substitute a placeholder inside the literal.
query += fmt.Sprintf(" AND created_at >= NOW() - make_interval(secs => $%d)", argIdx)
query += fmt.Sprintf(" AND "+actCol+"created_at >= NOW() - make_interval(secs => $%d)", argIdx)
args = append(args, sinceSecs)
argIdx++
}
if usingCursor {
// Strictly after — never replay the cursor row itself.
query += fmt.Sprintf(" AND created_at > $%d", argIdx)
query += fmt.Sprintf(" AND "+actCol+"created_at > $%d", argIdx)
args = append(args, cursorTime)
argIdx++
}
@@ -257,9 +503,9 @@ func (h *ActivityHandler) List(c *gin.Context) {
// since_id) keeps DESC — that's the canvas/UI shape and changing it
// would surprise existing callers.
if usingCursor {
query += fmt.Sprintf(" ORDER BY created_at ASC LIMIT $%d", argIdx)
query += fmt.Sprintf(" ORDER BY "+actCol+"created_at ASC LIMIT $%d", argIdx)
} else {
query += fmt.Sprintf(" ORDER BY created_at DESC LIMIT $%d", argIdx)
query += fmt.Sprintf(" ORDER BY "+actCol+"created_at DESC LIMIT $%d", argIdx)
}
args = append(args, limit)
@@ -272,6 +518,14 @@ func (h *ActivityHandler) List(c *gin.Context) {
}
defer rows.Close()
// agent_card_url base computed once per request so we don't pay the
// header-read cost per row. Only meaningful when includePeerInfo is
// set; the empty string here is harmless when the flag is off.
var platformBase string
if includePeerInfo {
platformBase = externalPlatformURL(c)
}
activities := make([]map[string]interface{}, 0)
for rows.Next() {
var id, wsID, actType, status string
@@ -279,10 +533,23 @@ func (h *ActivityHandler) List(c *gin.Context) {
var reqBody, respBody, toolTrace []byte
var durationMs *int
var createdAt time.Time
// LEFT JOIN'd peer columns — pointer-string so a NULL row
// (canvas message OR deleted peer workspace) decodes as nil
// rather than empty-string. Only scanned when includePeerInfo
// is set (matched against the SELECT clause above).
var peerName, peerRole *string
if err := rows.Scan(&id, &wsID, &actType, &sourceID, &targetID, &method,
&summary, &reqBody, &respBody, &toolTrace, &durationMs, &status, &errorDetail, &createdAt); err != nil {
log.Printf("Activity scan error: %v", err)
var scanErr error
if includePeerInfo {
scanErr = rows.Scan(&id, &wsID, &actType, &sourceID, &targetID, &method,
&summary, &reqBody, &respBody, &toolTrace, &durationMs, &status, &errorDetail, &createdAt,
&peerName, &peerRole)
} else {
scanErr = rows.Scan(&id, &wsID, &actType, &sourceID, &targetID, &method,
&summary, &reqBody, &respBody, &toolTrace, &durationMs, &status, &errorDetail, &createdAt)
}
if scanErr != nil {
log.Printf("Activity scan error: %v", scanErr)
continue
}
@@ -308,6 +575,39 @@ func (h *ActivityHandler) List(c *gin.Context) {
if toolTrace != nil {
entry["tool_trace"] = json.RawMessage(toolTrace)
}
// peer_info enrichment (per ?include=peer_info). Only emit the
// new fields when the flag is set — back-compat for callers
// that don't request it.
if includePeerInfo {
// peer_name / peer_role: emit only when present (canvas
// rows have source_id IS NULL → peer_name is NULL by JOIN;
// also a peer workspace may have been deleted since the
// row was written → same NULL outcome). Omit-when-absent
// matches the Layer 3 adaptor's "spread when present"
// pattern; canvas_user rows legitimately have no peer_*.
if peerName != nil && *peerName != "" {
entry["peer_name"] = *peerName
}
if peerRole != nil && *peerRole != "" {
entry["peer_role"] = *peerRole
}
// agent_card_url: constructed server-side from
// externalPlatformURL + source_id. Mirrors the runtime-
// side helper a2a_client._agent_card_url_for which builds
// {PLATFORM_URL}/registry/discover/{peer_id}. Only set
// when source_id is present + non-empty.
if sourceID != nil && *sourceID != "" && platformBase != "" {
entry["agent_card_url"] = platformBase + "/registry/discover/" + *sourceID
}
// attachments: flatten file/image/audio parts from the
// request_body. nil when none — only project when
// non-empty so the omit-when-absent rule holds.
if atts := extractAttachmentsFromRequestBody(reqBody); len(atts) > 0 {
entry["attachments"] = atts
}
}
activities = append(activities, entry)
}
if err := rows.Err(); err != nil {
@@ -0,0 +1,701 @@
package handlers
import (
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
// Tests for the `?include=peer_info` activity-feed enrichment.
//
// The enrichment is additive + opt-in. When the flag is absent, the
// existing tests (TestActivityList_SourceCanvas, etc.) prove the wire
// shape is unchanged. These tests prove:
// - When the flag IS set, the LEFT JOIN is issued and the SELECT
// adds w.name + w.role.
// - peer_name / peer_role surface from the joined row.
// - agent_card_url is composed server-side from
// externalPlatformURL + source_id and appears for non-canvas rows
// (source_id present).
// - attachments[] is projected from request_body.params.message.parts
// for file/image/audio parts.
// - Canvas rows (source_id NULL) do NOT get peer_name / peer_role /
// agent_card_url, but DO still appear in the result set (LEFT JOIN
// preserves them with NULL peer fields).
// - The `include` query param is comma-separable and only recognizes
// known flags.
// ---------- includeFlagSet helper unit tests ----------
func TestIncludeFlagSet(t *testing.T) {
cases := []struct {
query string
flag string
want bool
}{
{"", "peer_info", false},
{"peer_info", "peer_info", true},
{"peer_info,attachments", "peer_info", true},
{"attachments,peer_info", "peer_info", true},
{"attachments , peer_info ", "peer_info", true},
{"peer_infos", "peer_info", false},
{"peerinfo", "peer_info", false},
{"peer_info", "", false},
{",,", "peer_info", false},
}
for _, tc := range cases {
got := includeFlagSet(tc.query, tc.flag)
if got != tc.want {
t.Errorf("includeFlagSet(%q, %q) = %v, want %v", tc.query, tc.flag, got, tc.want)
}
}
}
// ---------- extractAttachmentsFromRequestBody unit tests ----------
func TestExtractAttachmentsFromRequestBody_Empty(t *testing.T) {
if got := extractAttachmentsFromRequestBody(nil); got != nil {
t.Errorf("nil body: want nil, got %v", got)
}
if got := extractAttachmentsFromRequestBody([]byte("")); got != nil {
t.Errorf("empty body: want nil, got %v", got)
}
if got := extractAttachmentsFromRequestBody([]byte("not json")); got != nil {
t.Errorf("non-json body: want nil, got %v", got)
}
}
func TestExtractAttachmentsFromRequestBody_NoAttachments(t *testing.T) {
// Text-only message: no file/image/audio parts → nil
body := []byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[{"kind":"text","text":"hi"}]}}}`)
if got := extractAttachmentsFromRequestBody(body); got != nil {
t.Errorf("text-only: want nil, got %v", got)
}
}
func TestExtractAttachmentsFromRequestBody_FileKindV1(t *testing.T) {
// a2a-sdk v1 shape: kind=file, file:{uri,mime_type,name}
body := []byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[
{"kind":"text","text":"see attached"},
{"kind":"file","file":{"uri":"workspace:foo.pdf","mime_type":"application/pdf","name":"foo.pdf"}}
]}}}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d", len(atts))
}
if atts[0]["kind"] != "file" {
t.Errorf("kind: want file, got %v", atts[0]["kind"])
}
if atts[0]["uri"] != "workspace:foo.pdf" {
t.Errorf("uri mismatch: %v", atts[0]["uri"])
}
if atts[0]["mime_type"] != "application/pdf" {
t.Errorf("mime_type mismatch: %v", atts[0]["mime_type"])
}
if atts[0]["name"] != "foo.pdf" {
t.Errorf("name mismatch: %v", atts[0]["name"])
}
}
func TestExtractAttachmentsFromRequestBody_ImageAndAudio(t *testing.T) {
// Mixed image + audio parts; both surface
body := []byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[
{"kind":"image","file":{"uri":"workspace:a.png","mime_type":"image/png","name":"a.png"}},
{"kind":"audio","file":{"uri":"workspace:b.mp3","mime_type":"audio/mpeg","name":"b.mp3"}}
]}}}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 2 {
t.Fatalf("want 2 attachments, got %d", len(atts))
}
if atts[0]["kind"] != "image" || atts[1]["kind"] != "audio" {
t.Errorf("kind order: got %v / %v", atts[0]["kind"], atts[1]["kind"])
}
}
func TestExtractAttachmentsFromRequestBody_LegacyV0TypeDiscriminator(t *testing.T) {
// Legacy v0 shape: type=file (not kind), inlined fields (no nested .file)
body := []byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[
{"type":"file","uri":"workspace:legacy.txt","mime_type":"text/plain","name":"legacy.txt"}
]}}}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d", len(atts))
}
if atts[0]["kind"] != "file" || atts[0]["uri"] != "workspace:legacy.txt" || atts[0]["name"] != "legacy.txt" {
t.Errorf("v0 part not surfaced: %v", atts[0])
}
}
func TestExtractAttachmentsFromRequestBody_SkipsEmptyParts(t *testing.T) {
// A "file" part with no uri AND no name is malformed — skip rather
// than emit a no-info entry.
body := []byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[
{"kind":"file","file":{}},
{"kind":"file","file":{"name":"only-name.bin"}}
]}}}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment (the named one), got %d", len(atts))
}
if atts[0]["name"] != "only-name.bin" {
t.Errorf("expected only-name.bin, got %v", atts[0])
}
}
func TestExtractAttachmentsFromRequestBody_MalformedShape(t *testing.T) {
// Various malformed shapes return nil (defensive)
for _, b := range []string{
`{}`,
`{"params":{}}`,
`{"params":{"message":{}}}`,
`{"params":{"message":{"parts":"not-a-list"}}}`,
`{"params":{"message":{"parts":[null,42,"string"]}}}`,
} {
if got := extractAttachmentsFromRequestBody([]byte(b)); got != nil {
t.Errorf("body %q: want nil, got %v", b, got)
}
}
}
// ---------- Activity List ?include=peer_info handler tests ----------
func TestActivityList_IncludePeerInfo_IssuesLeftJoin(t *testing.T) {
// When ?include=peer_info is set, the query must:
// 1. SELECT include w.name + w.role aliased as peer_name/peer_role
// 2. FROM contains LEFT JOIN workspaces w ON w.id = activity_logs.source_id
// 3. WHERE uses qualified activity_logs.workspace_id (disambiguates
// from workspaces.id post-JOIN)
//
// Pin all three so a future refactor can't silently drop the JOIN or
// the alias and have the test still pass.
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewActivityHandler(broadcaster)
peerID := "11111111-2222-3333-4444-555555555555"
mock.ExpectQuery(
`SELECT .+w\.name AS peer_name, w\.role AS peer_role FROM activity_logs LEFT JOIN workspaces w ON w\.id = activity_logs\.source_id WHERE activity_logs\.workspace_id = .+`,
).
WithArgs("ws-1", 100).
WillReturnRows(sqlmock.NewRows([]string{
"id", "workspace_id", "activity_type", "source_id", "target_id",
"method", "summary", "request_body", "response_body",
"tool_trace", "duration_ms", "status", "error_detail", "created_at",
"peer_name", "peer_role",
}).
AddRow("act-1", "ws-1", "a2a_receive", peerID, "ws-1",
"message/send", "Agent message: hello",
[]byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[{"kind":"text","text":"hello"}]}}}`),
nil, nil, nil, "ok", nil, time.Now(),
"Production Manager", "product manager"))
gin.SetMode(gin.TestMode)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/activity?include=peer_info", nil)
c.Request.Host = "platform.test"
c.Request.Header.Set("X-Forwarded-Proto", "https")
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if len(resp) != 1 {
t.Fatalf("want 1 row, got %d", len(resp))
}
r := resp[0]
if r["peer_name"] != "Production Manager" {
t.Errorf("peer_name: got %v", r["peer_name"])
}
if r["peer_role"] != "product manager" {
t.Errorf("peer_role: got %v", r["peer_role"])
}
wantURL := "https://platform.test/registry/discover/" + peerID
if r["agent_card_url"] != wantURL {
t.Errorf("agent_card_url: got %v, want %v", r["agent_card_url"], wantURL)
}
// Text-only message has no attachments → omit from envelope
if _, present := r["attachments"]; present {
t.Errorf("attachments should be omitted on text-only row; got %v", r["attachments"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
func TestActivityList_IncludePeerInfo_CanvasRowHasNoPeerFields(t *testing.T) {
// LEFT JOIN preserves canvas rows (source_id NULL) but their
// peer_name/peer_role come back as NULL — must omit from the
// envelope (not emit empty strings or null literals).
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewActivityHandler(broadcaster)
mock.ExpectQuery(
`LEFT JOIN workspaces w ON w\.id = activity_logs\.source_id`,
).
WithArgs("ws-1", 100).
WillReturnRows(sqlmock.NewRows([]string{
"id", "workspace_id", "activity_type", "source_id", "target_id",
"method", "summary", "request_body", "response_body",
"tool_trace", "duration_ms", "status", "error_detail", "created_at",
"peer_name", "peer_role",
}).
// source_id NULL = canvas message; peer columns also NULL.
AddRow("act-canvas", "ws-1", "a2a_receive", nil, "ws-1",
"notify", "User said hi",
[]byte(`{"params":{"message":{"parts":[{"kind":"text","text":"hi"}]}}}`),
nil, nil, nil, "ok", nil, time.Now(),
nil, nil))
gin.SetMode(gin.TestMode)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/activity?include=peer_info", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if len(resp) != 1 {
t.Fatalf("want 1 row, got %d", len(resp))
}
r := resp[0]
for _, k := range []string{"peer_name", "peer_role", "agent_card_url"} {
if _, present := r[k]; present {
t.Errorf("%s should be absent on canvas row; got %v", k, r[k])
}
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
func TestActivityList_IncludePeerInfo_AttachmentsSurfaceFromRequestBody(t *testing.T) {
// A peer_agent message with an inline file attachment must have
// attachments[] populated on the envelope.
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewActivityHandler(broadcaster)
peerID := "11111111-2222-3333-4444-555555555555"
mock.ExpectQuery(`LEFT JOIN workspaces`).
WithArgs("ws-1", 100).
WillReturnRows(sqlmock.NewRows([]string{
"id", "workspace_id", "activity_type", "source_id", "target_id",
"method", "summary", "request_body", "response_body",
"tool_trace", "duration_ms", "status", "error_detail", "created_at",
"peer_name", "peer_role",
}).
AddRow("act-with-file", "ws-1", "a2a_receive", peerID, "ws-1",
"message/send", "Agent message: see attached",
[]byte(`{"jsonrpc":"2.0","method":"message/send","params":{"message":{"parts":[
{"kind":"text","text":"see attached"},
{"kind":"file","file":{"uri":"workspace:foo.pdf","mime_type":"application/pdf","name":"foo.pdf"}}
]}}}`),
nil, nil, nil, "ok", nil, time.Now(),
"Code Reviewer", "code reviewer"))
gin.SetMode(gin.TestMode)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/activity?include=peer_info", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
r := resp[0]
atts, ok := r["attachments"].([]interface{})
if !ok {
t.Fatalf("attachments missing or wrong type: %T %v", r["attachments"], r["attachments"])
}
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d: %v", len(atts), atts)
}
att := atts[0].(map[string]interface{})
if att["kind"] != "file" || att["uri"] != "workspace:foo.pdf" || att["name"] != "foo.pdf" {
t.Errorf("attachment shape: %v", att)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
func TestActivityList_IncludePeerInfo_Unset_NoJoinNoExtraFields(t *testing.T) {
// Back-compat — when ?include=peer_info is NOT passed, the SELECT
// uses unqualified column refs (no `activity_logs.` prefix) AND no
// JOIN. Existing tests pass this implicitly; this test pins it
// explicitly so a future refactor that accidentally turns the JOIN
// always-on gets caught.
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewActivityHandler(broadcaster)
// Regex pinned: "FROM activity_logs WHERE workspace_id" — no JOIN
// keyword between FROM and WHERE; no `activity_logs.` qualifier on
// workspace_id.
mock.ExpectQuery(`SELECT id, workspace_id,.+ FROM activity_logs WHERE workspace_id = .+`).
WithArgs("ws-1", 100).
WillReturnRows(sqlmock.NewRows([]string{
"id", "workspace_id", "activity_type", "source_id", "target_id",
"method", "summary", "request_body", "response_body",
"tool_trace", "duration_ms", "status", "error_detail", "created_at",
}).
AddRow("act-1", "ws-1", "a2a_receive", "11111111-2222-3333-4444-555555555555", "ws-1",
"message/send", "Hello",
nil, nil, nil, nil, "ok", nil, time.Now()))
gin.SetMode(gin.TestMode)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/activity", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if len(resp) != 1 {
t.Fatalf("want 1 row, got %d", len(resp))
}
// Confirm no peer_info enrichment leaks into the default envelope.
for _, k := range []string{"peer_name", "peer_role", "agent_card_url", "attachments"} {
if _, present := resp[0][k]; present {
t.Errorf("%s must NOT appear without ?include=peer_info; got %v", k, resp[0][k])
}
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
func TestActivityList_IncludePeerInfo_UnknownFlagIgnored(t *testing.T) {
// ?include=bogus must NOT issue the JOIN — only the recognized
// `peer_info` flag triggers enrichment. The unknown flag is silently
// ignored (additive, opt-in convention).
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewActivityHandler(broadcaster)
mock.ExpectQuery(`SELECT id, workspace_id,.+ FROM activity_logs WHERE workspace_id = .+`).
WithArgs("ws-1", 100).
WillReturnRows(sqlmock.NewRows([]string{
"id", "workspace_id", "activity_type", "source_id", "target_id",
"method", "summary", "request_body", "response_body",
"tool_trace", "duration_ms", "status", "error_detail", "created_at",
}))
gin.SetMode(gin.TestMode)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/activity?include=bogus", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d", w.Code)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// ---------- flat upload manifest (chat_upload_receive) tests ----------
func TestKindFromMimeType(t *testing.T) {
cases := []struct {
mime string
want string
}{
{"image/png", "image"},
{"image/jpeg", "image"},
{"image/", "image"}, // prefix-only is still image
{"audio/mpeg", "audio"},
{"audio/wav", "audio"},
{"video/mp4", "video"},
{"video/webm", "video"},
{"application/pdf", "file"},
{"text/plain", "file"},
{"", "file"},
{"unknown", "file"},
{"image", "file"}, // no slash → not a prefix match
}
for _, tc := range cases {
if got := kindFromMimeType(tc.mime); got != tc.want {
t.Errorf("kindFromMimeType(%q) = %q, want %q", tc.mime, got, tc.want)
}
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_Image(t *testing.T) {
// Canvas chat_upload_receive shape: flat manifest at request_body
// root with camelCase mimeType. The empirical example was a PNG
// pasted into the canvas; surfaces here with kind=image,
// mime_type=image/png (snake-case normalized), uri preserved.
body := []byte(`{
"uri":"platform-pending:091a9180-/26111d48-",
"name":"pasted-2026-05-21T23-12-25-0-0.png",
"size":677133,
"file_id":"26111d48-",
"mimeType":"image/png"
}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d: %v", len(atts), atts)
}
att := atts[0]
if att["kind"] != "image" {
t.Errorf("kind: want image, got %v", att["kind"])
}
if att["uri"] != "platform-pending:091a9180-/26111d48-" {
t.Errorf("uri: %v", att["uri"])
}
if att["mime_type"] != "image/png" {
t.Errorf("mime_type normalization (camelCase→snake_case) failed: %v", att["mime_type"])
}
if att["name"] != "pasted-2026-05-21T23-12-25-0-0.png" {
t.Errorf("name: %v", att["name"])
}
// camelCase `mimeType` MUST NOT leak into the projected envelope —
// only snake_case `mime_type` is the wire convention.
if _, present := att["mimeType"]; present {
t.Errorf("camelCase mimeType leaked into envelope: %v", att)
}
if _, present := att["file_id"]; present {
t.Errorf("file_id should not be surfaced on the attachment envelope (it's a canvas-internal id): %v", att)
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_Audio(t *testing.T) {
body := []byte(`{"uri":"platform-pending:ws/file","name":"voice.mp3","file_id":"abc","mimeType":"audio/mpeg"}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 || atts[0]["kind"] != "audio" {
t.Fatalf("want audio kind, got %v", atts)
}
if atts[0]["mime_type"] != "audio/mpeg" {
t.Errorf("mime_type: %v", atts[0]["mime_type"])
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_Video(t *testing.T) {
body := []byte(`{"uri":"platform-pending:ws/file","name":"clip.mp4","file_id":"abc","mimeType":"video/mp4"}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 || atts[0]["kind"] != "video" {
t.Fatalf("want video kind, got %v", atts)
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_GenericFile(t *testing.T) {
// application/pdf has no image/audio/video prefix → kind=file
body := []byte(`{"uri":"platform-pending:ws/file","name":"doc.pdf","file_id":"abc","mimeType":"application/pdf"}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 || atts[0]["kind"] != "file" {
t.Fatalf("want file kind, got %v", atts)
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_NoMimeFallsToFile(t *testing.T) {
// No mimeType at all — kind defaults to "file", mime_type omitted.
body := []byte(`{"uri":"platform-pending:ws/file","name":"unknown.bin","file_id":"abc"}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d", len(atts))
}
if atts[0]["kind"] != "file" {
t.Errorf("kind: want file (default), got %v", atts[0]["kind"])
}
if _, present := atts[0]["mime_type"]; present {
t.Errorf("mime_type should be omitted when source has none, got %v", atts[0]["mime_type"])
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_SnakeCaseMimeTypeAccepted(t *testing.T) {
// Defensive: a future canvas version (or non-canvas caller) that
// already emits snake_case mime_type should still be parsed.
body := []byte(`{"uri":"u","name":"n.png","mime_type":"image/png"}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d", len(atts))
}
if atts[0]["mime_type"] != "image/png" || atts[0]["kind"] != "image" {
t.Errorf("snake_case mime_type not honored: %v", atts[0])
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_FileIDOnlyIsSkipped(t *testing.T) {
// file_id alone (no uri AND no name) is non-actionable — the
// downstream adaptor can't render a discoverable file from just an
// internal canvas id. Skip per the same minimum-info rule the
// message-parts arm applies to empty parts.
body := []byte(`{"file_id":"orphan-uuid","mimeType":"image/png"}`)
if got := extractAttachmentsFromRequestBody(body); got != nil {
t.Errorf("file_id-only manifest must be skipped, got %v", got)
}
}
func TestExtractAttachmentsFromRequestBody_FlatUpload_NameOnlyIsKept(t *testing.T) {
// Symmetric with the message-parts arm: a name without uri is still
// useful (the downstream adaptor can render "user uploaded foo.png").
body := []byte(`{"name":"only-name.bin","file_id":"abc","mimeType":"application/octet-stream"}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment, got %d", len(atts))
}
if atts[0]["name"] != "only-name.bin" {
t.Errorf("name not preserved: %v", atts[0])
}
if _, present := atts[0]["uri"]; present {
t.Errorf("uri should be omitted when absent in source, got %v", atts[0]["uri"])
}
}
func TestExtractAttachmentsFromRequestBody_MessagePartsTakesPrecedenceOverFlat(t *testing.T) {
// If a single request_body somehow has BOTH params.message.parts[]
// AND top-level uri/file_id (a pathological inbound), the
// message-parts arm wins — that's the documented inbound shape and
// it's been the only one historically extracted. The flat arm is a
// fallback for shapes that have NO parts.
body := []byte(`{
"uri":"platform-pending:should-not-win",
"file_id":"x",
"mimeType":"image/png",
"params":{"message":{"parts":[
{"kind":"file","file":{"uri":"workspace:should-win.pdf","mime_type":"application/pdf","name":"win.pdf"}}
]}}
}`)
atts := extractAttachmentsFromRequestBody(body)
if len(atts) != 1 {
t.Fatalf("want 1 attachment (from parts[]), got %d: %v", len(atts), atts)
}
if atts[0]["uri"] != "workspace:should-win.pdf" {
t.Errorf("message-parts arm did not take precedence: %v", atts[0])
}
}
func TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow(t *testing.T) {
// Wire-level integration: a canvas chat_upload_receive row (canvas
// user pasted an image) with source_id NULL (canvas message), flat
// upload manifest at request_body root. The `?include=peer_info`
// projection must surface attachments[] populated from the flat-
// upload-manifest arm while peer_name / peer_role / agent_card_url
// remain absent (canvas row has no peer).
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewActivityHandler(broadcaster)
mock.ExpectQuery(`LEFT JOIN workspaces w ON w\.id = activity_logs\.source_id`).
WithArgs("ws-1", 100).
WillReturnRows(sqlmock.NewRows([]string{
"id", "workspace_id", "activity_type", "source_id", "target_id",
"method", "summary", "request_body", "response_body",
"tool_trace", "duration_ms", "status", "error_detail", "created_at",
"peer_name", "peer_role",
}).
// Empirical shape from 2026-05-21 ~23:12Z agents-team canvas paste.
AddRow("act-upload", "ws-1", "chat_upload_receive", nil, "ws-1",
"chat_upload_receive", "Canvas upload: pasted-2026-05-21T23-12-25-0-0.png",
[]byte(`{
"uri":"platform-pending:091a9180-b303-4a20-aefe-3a4a675b8aa4/26111d48-aaaa-bbbb-cccc-dddddddddddd",
"name":"pasted-2026-05-21T23-12-25-0-0.png",
"size":677133,
"file_id":"26111d48-aaaa-bbbb-cccc-dddddddddddd",
"mimeType":"image/png"
}`),
nil, nil, nil, "ok", nil, time.Now(),
nil, nil))
gin.SetMode(gin.TestMode)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/activity?include=peer_info", nil)
handler.List(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if len(resp) != 1 {
t.Fatalf("want 1 row, got %d", len(resp))
}
r := resp[0]
// Canvas row → no peer fields.
for _, k := range []string{"peer_name", "peer_role", "agent_card_url"} {
if _, present := r[k]; present {
t.Errorf("%s must NOT appear on canvas upload row; got %v", k, r[k])
}
}
// attachments[] populated from the flat-upload arm.
atts, ok := r["attachments"].([]interface{})
if !ok {
t.Fatalf("attachments missing or wrong type: %T %v", r["attachments"], r["attachments"])
}
if len(atts) != 1 {
t.Fatalf("want 1 attachment from flat manifest, got %d: %v", len(atts), atts)
}
att := atts[0].(map[string]interface{})
if att["kind"] != "image" {
t.Errorf("kind: want image (image/png prefix), got %v", att["kind"])
}
if att["mime_type"] != "image/png" {
t.Errorf("mime_type wire shape: want snake_case image/png, got %v", att["mime_type"])
}
if att["uri"] != "platform-pending:091a9180-b303-4a20-aefe-3a4a675b8aa4/26111d48-aaaa-bbbb-cccc-dddddddddddd" {
t.Errorf("uri preserved verbatim: got %v", att["uri"])
}
if att["name"] != "pasted-2026-05-21T23-12-25-0-0.png" {
t.Errorf("name: %v", att["name"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// Sanity test using the existing test broadcaster setup — verifies the
// extractAttachments helper round-trips through json.Marshal cleanly
// (no map ordering issues, no type-coercion surprises).
func TestExtractAttachmentsFromRequestBody_RoundTripsThroughJSON(t *testing.T) {
body := []byte(`{"params":{"message":{"parts":[{"kind":"file","file":{"uri":"workspace:r.bin","mime_type":"application/octet-stream","name":"r.bin"}}]}}}`)
atts := extractAttachmentsFromRequestBody(body)
b, err := json.Marshal(atts)
if err != nil {
t.Fatalf("marshal: %v", err)
}
var decoded []map[string]interface{}
if err := json.Unmarshal(b, &decoded); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if len(decoded) != 1 || decoded[0]["uri"] != "workspace:r.bin" {
t.Fatalf("round-trip mismatch: %v", decoded)
}
_ = fmt.Sprintf // keep fmt import live if test trimming removes usage
}
@@ -216,69 +216,102 @@ curl -fsS -X POST "{{PLATFORM_URL}}/registry/register" \
const externalChannelTemplate = `# Claude Code channel — bridges this workspace's A2A traffic into your
# Claude Code session. No tunnel/public URL needed (polling-based).
#
# Prereq: Bun installed (channel plugins are Bun scripts).
# bun --version # must print a version number
# Prereq: Bun 1.3+ installed (channel plugins are Bun scripts).
# bun --version # must print a version (1.3.x or newer)
#
# 1. Inside Claude Code, install the channel plugin from its GitHub repo.
# The plugin is NOT on Anthropic's default allowlist, so a one-time
# marketplace-add is needed before install:
# 1. Inside Claude Code, install the channel plugin. The plugin lives in
# Molecule's own Gitea marketplace (not Anthropic's default), so a
# one-time marketplace-add is needed before install:
#
# /plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git
# /plugin install molecule@molecule-channel
#
# Then either run /reload-plugins or restart Claude Code so the
# plugin is registered.
# Then /reload-plugins (or restart Claude Code) so the plugin is
# registered.
#
# 2. Create the per-watched-workspace config file:
# 2. Create (or extend) the per-host config file. The canonical SSOT
# shape is MOLECULE_WORKSPACES_JSON — a JSON array of
# {id, token, platform_url} objects. One plugin instance can watch
# many workspaces across many tenants; append more objects to the
# array (separate them with commas, NOT a newline):
mkdir -p ~/.claude/channels/molecule
cat > ~/.claude/channels/molecule/.env <<'EOF'
MOLECULE_PLATFORM_URL={{PLATFORM_URL}}
MOLECULE_WORKSPACE_IDS={{WORKSPACE_ID}}
MOLECULE_WORKSPACE_TOKENS=<paste auth_token from create response>
MOLECULE_WORKSPACES_JSON=[{"id":"{{WORKSPACE_ID}}","token":"<paste auth_token from create response>","platform_url":"{{PLATFORM_URL}}"}]
EOF
chmod 600 ~/.claude/channels/molecule/.env
# 3. Launch Claude Code with the channel enabled. Custom (non-Anthropic-
# allowlisted) channels need the --dangerously-load-development-channels
# flag to opt in — without it, you'll see "not on the approved channels
# allowlist" on startup.
claude --dangerously-load-development-channels \
--channels plugin:molecule@molecule-channel
# (Legacy single-platform shape — MOLECULE_PLATFORM_URL + comma-separated
# MOLECULE_WORKSPACE_IDS + MOLECULE_WORKSPACE_TOKENS — is still supported
# for back-compat but does NOT work across multiple tenant URLs. Use
# MOLECULE_WORKSPACES_JSON above unless you have a specific reason.)
# 3. Launch Claude Code with the channel enabled. The channel spec is the
# VALUE of --dangerously-load-development-channels — NOT a separate
# --channels flag (that flag does not exist in current Claude Code;
# passing it errors with "entries must be tagged: --channels").
claude --dangerously-load-development-channels plugin:molecule@molecule-channel
# You should see on stderr:
# molecule channel: connected — watching 1 workspace(s) at {{PLATFORM_URL}}
# molecule channel: connected — watching N workspace(s) across M platform(s)
# targets: <platform_url>: <workspace_id>
#
# Inbound A2A messages now surface as conversation turns. Claude's
# replies route back via the reply_to_workspace MCP tool — no extra
# wiring on your side.
# Inbound A2A messages now surface as conversation turns (synthetic
# <channel ...> tags). Claude's replies route back via the
# reply_to_workspace / send_message_to_user MCP tools.
#
# Multi-workspace note: when watching more than one workspace, every
# outbound tool call (send_message_to_user, reply_to_workspace,
# delegate_task, list_peers) MUST pass _as_workspace=<id> so the plugin
# knows which token to authenticate with. The host returns -32603 if you
# forget — the synthetic <channel> tag's "watching_as" attribute tells
# you which id to use.
#
# Common errors:
# "plugin not installed" → Step 1 didn't run; run /plugin install
# "plugin not installed" → Step 1 didn't run; run /plugin
# marketplace add + /plugin install
# inside Claude Code, then /reload-plugins.
# "not on approved channels allowlist" → Add --dangerously-load-development-channels
# to the launch command (Step 3).
# "config-missing" → ~/.claude/channels/molecule/.env not
# readable; re-run Step 2 and check chmod.
# "entries must be tagged" → You passed --channels separately.
# Put plugin:molecule@molecule-channel
# directly after
# --dangerously-load-development-channels.
# "not on approved channels allowlist" → Org policy gating. See "managed
# settings" note below.
# "config-missing" → ~/.claude/channels/molecule/.env
# not readable; re-run Step 2 and check
# chmod 600.
#
# Team/Enterprise orgs: the --dangerously-load-development-channels flag is
# blocked by managed settings. Your admin must set channelsEnabled=true and
# add the plugin to allowedChannelPlugins in claude.ai admin settings.
# Team/Enterprise plans: the channel allowlist is gated by org policy
# AND must be written to the local managed-settings.json file on disk
# (not the claude.ai web admin UI — there is no web toggle for this).
# Path per OS:
# macOS: /Library/Application Support/ClaudeCode/managed-settings.json
# Linux: /etc/claude-code/managed-settings.json
# Windows: C:\ProgramData\ClaudeCode\managed-settings.json
# Set channelsEnabled: true and add
# { "plugin": "molecule", "marketplace": "molecule-channel" }
# to allowedChannelPlugins. Restart Claude Code after writing the file.
# A user-level ~/.claude/settings.json does NOT work on Team/Enterprise
# — this is the single most common reason a freshly-installed plugin
# appears to do nothing.
#
# Multi-workspace: comma-separate IDs and tokens (same order). See
# https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel for
# pairing flow, push-mode upgrade, and v0.2 roadmap.
# Pro/Max plans skip the channelsEnabled gate but still need the
# allowedChannelPlugins entry in the managed-settings file.
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/claude-code-channel-plugin
# Full README: https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel
# Common errors:
# • "plugin not installed" — run /plugin marketplace add then
# /plugin install lines above; /reload-plugins or restart.
# • "entries must be tagged: --channels" — the launch flag form
# changed; use --dangerously-load-development-channels plugin:molecule@molecule-channel
# (channel spec is the VALUE, not a separate --channels flag).
# • "not on the approved channels allowlist" — custom channels need
# --dangerously-load-development-channels; team/enterprise orgs
# need admin to set channelsEnabled + allowedChannelPlugins.
# allowedChannelPlugins in /Library/Application Support/ClaudeCode/managed-settings.json
# (macOS) / equivalent on Linux+Windows. NOT a web setting.
# • "Inbound messages not arriving" — stderr should show
# "molecule channel: connected — watching N workspace(s)";
# verify ~/.claude/channels/molecule/.env has PLATFORM_URL + token.
# verify ~/.claude/channels/molecule/.env shape is MOLECULE_WORKSPACES_JSON.
`
// externalUniversalMcpTemplate — runtime-agnostic standalone path.
@@ -670,7 +703,15 @@ def heartbeat(client, url, ws, tok, start):
r.raise_for_status()
def poll_inbound(client, url, ws, tok, since_id):
params = {"since_secs": "30", "limit": "50"}
# include=peer_info opts into Layer 1's row-level projection so each
# polled activity carries peer_name, peer_role, agent_card_url, and
# attachments[] inline (when source_id resolves to a peer / when the
# message included a file). Pre-Layer-1 platforms ignore unknown query
# params and return the bare row shape, so this is back-compat. Use
# the extra fields in your reply logic — e.g. address the sender by
# peer_name rather than UUID, or Read attached files via the workspace:
# URIs in attachments[].
params = {"since_secs": "30", "limit": "50", "include": "peer_info"}
if since_id:
params["since_id"] = since_id
r = client.get(f"{url}/workspaces/{ws}/activity", params=params, headers=hdrs(url, tok))
@@ -737,10 +778,16 @@ python3 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py
# What the script does:
# • Registers the workspace in poll mode (no public URL needed)
# • Heartbeats every 20s to keep STATUS = online on the canvas
# • Polls /workspaces/:id/activity every 5s for new canvas messages
# • Polls /workspaces/:id/activity?include=peer_info every 5s — Layer 1
# enrichment surfaces peer_name / peer_role / agent_card_url /
# attachments[] inline on each polled row when applicable
# • Echo-replies via POST /workspaces/:id/notify
#
# To change the reply logic, edit the send_reply() call inside the loop.
# Each polled item has top-level peer_name / peer_role / agent_card_url
# fields (peer_agent rows) and attachments[] (any kind) when Layer 1 is
# enabled on the platform — use them to disambiguate senders and to Read
# attached files via the workspace: URIs.
# To send a one-off reply from another terminal:
# curl -fsS -X POST "{{PLATFORM_URL}}/workspaces/{{WORKSPACE_ID}}/notify" \
# -H "Authorization: Bearer $(cat ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env | grep TOKEN | cut -d= -f2)" \
@@ -118,3 +118,86 @@ func TestExternalTemplates_NoBrokenMoleculeAIGitHubURLs(t *testing.T) {
}
}
}
// TestExternalChannelTemplate_LaunchFlagShape pins the Claude Code channel
// snippet to the working launch invocation. The channel spec must be the
// VALUE of --dangerously-load-development-channels, NOT a separate
// --channels flag. The two-flag form (`--dangerously-load-development-channels
// --channels plugin:molecule@...`) errors with "entries must be tagged:
// --channels" on current Claude Code builds (2.1.143+) and silently no-ops
// on older ones — either way, new users hit a wall on first launch.
//
// Empirical: hit by a session walking through this exact snippet 2026-05-21;
// the broken form was copy-pasted from this template, ran, errored, and
// confused the operator into believing the plugin install was broken when
// the snippet itself was the bug.
func TestExternalChannelTemplate_LaunchFlagShape(t *testing.T) {
// The broken two-flag form. If this string ever appears in the
// snippet again, the same onboarding pothole returns.
bannedFormBroken := "--dangerously-load-development-channels \\\n --channels plugin:molecule@molecule-channel"
if strings.Contains(externalChannelTemplate, bannedFormBroken) {
t.Errorf("externalChannelTemplate contains the broken two-flag launch form. " +
"Use --dangerously-load-development-channels plugin:molecule@molecule-channel (spec as value, not a separate --channels flag).")
}
// The single-flag form must be present.
requiredFormGood := "--dangerously-load-development-channels plugin:molecule@molecule-channel"
if !strings.Contains(externalChannelTemplate, requiredFormGood) {
t.Errorf("externalChannelTemplate must contain %q so operators see the working launch invocation", requiredFormGood)
}
}
// TestExternalChannelTemplate_CanonicalEnvShape pins the canvas-served
// .env example to the canonical SSOT shape (MOLECULE_WORKSPACES_JSON)
// rather than the legacy single-platform shape. The legacy form
// (MOLECULE_PLATFORM_URL + comma-separated IDs/TOKENS) is still accepted
// by the channel plugin's parseWorkspaceTargets but is single-tenant
// only — it silently fails to onboard users who want to watch multiple
// platforms (e.g. hongming + agents-team from the same plugin instance),
// which is the post-PR#15 expected use case.
func TestExternalChannelTemplate_CanonicalEnvShape(t *testing.T) {
if !strings.Contains(externalChannelTemplate, "MOLECULE_WORKSPACES_JSON=") {
t.Errorf("externalChannelTemplate must use MOLECULE_WORKSPACES_JSON as the canonical .env shape (the post-PR#15 SSOT)")
}
// The JSON example must contain the workspace_id + platform_url placeholders
// so the canvas substitutes them at serve time.
for _, ph := range []string{"{{WORKSPACE_ID}}", "{{PLATFORM_URL}}"} {
if !strings.Contains(externalChannelTemplate, ph) {
t.Errorf("externalChannelTemplate must contain placeholder %q so the canvas substitutes per-workspace values", ph)
}
}
}
// TestPollingTemplates_OptIntoPeerInfo pins the invariant that any template
// which calls /workspaces/:id/activity for inbound delivery requests the
// Layer 1 enrichment via ?include=peer_info. Without this opt-in, the
// platform returns bare activity rows and the operator's bridge / channel
// loses peer_name / peer_role / agent_card_url / attachments[] — they're
// available on the server but not delivered.
//
// Pre-Layer-1 platforms ignore unknown query params (HTTP spec: filters
// not understood are dropped), so this is back-compat across deploys.
//
// The Claude Code channel template doesn't include the poll URL in this
// snippet — its polling lives in the plugin's own server.ts (handled by
// molecule-mcp-claude-channel PR#21). The Kimi template DOES include a
// poll loop in its kimi_bridge.py block, so the invariant applies there.
func TestPollingTemplates_OptIntoPeerInfo(t *testing.T) {
pollingTemplates := map[string]string{
"externalKimiTemplate": externalKimiTemplate,
}
for name, body := range pollingTemplates {
// If the snippet polls /activity, it must opt into peer_info.
// The detection is intentionally loose ("/activity" appears in
// the script) — operators who customize the script keep the
// invariant only if the include hint is in the template.
if !strings.Contains(body, "/activity") {
t.Errorf("%s no longer polls /activity — review whether this test still applies", name)
continue
}
if !strings.Contains(body, `"include": "peer_info"`) && !strings.Contains(body, "include=peer_info") {
t.Errorf("%s polls /activity without ?include=peer_info — operators lose Layer 1 enrichment "+
"(peer_name / peer_role / agent_card_url / attachments[]). Add the param to the poll URL.", name)
}
}
}
@@ -44,7 +44,7 @@ func TestWorkspaceCreate_WithParentID(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Child Agent","parent_id":"parent-ws-123"}`
body := `{"name":"Child Agent","model":"anthropic:claude-opus-4-7","parent_id":"parent-ws-123"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -80,7 +80,7 @@ func TestWorkspaceCreate_ExplicitClaudeCodeRuntime(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"CC Agent","tier":2,"runtime":"claude-code","canvas":{"x":10,"y":20}}`
body := `{"name":"CC Agent","tier":2,"runtime":"claude-code","model":"sonnet","canvas":{"x":10,"y":20}}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -301,7 +301,7 @@ func TestWorkspaceCreate_MaxConcurrentTasksOverride(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Leader Agent","runtime":"claude-code","max_concurrent_tasks":3}`
body := `{"name":"Leader Agent","runtime":"claude-code","model":"sonnet","max_concurrent_tasks":3}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -777,6 +777,103 @@ func TestCreate_FieldValidation_Returns400(t *testing.T) {
}
}
// TestCreate_ModelRequired_Returns422 pins the CTO 2026-05-22 SSOT
// directive (feedback_workspace_model_required_no_platform_default_dynamic_credential_intake):
// model is required user input; the platform must not supply a default,
// the runtime must not fall back. Empirical trigger: Code Reviewer
// 5ba15d7e was created with `{"name":..., "runtime":"codex", ...}` (no
// model). The legacy DefaultModel fallback returned "anthropic:claude-opus-4-7"
// and codex adapter wedged forever — `picks provider='anthropic' but it
// is not in the providers registry`. The gate at the Create boundary
// turns that silent stuck-workspace failure into an immediate 422 the
// caller can react to.
//
// Three shapes covered:
// 1. bare name (no template, no runtime, no model) — formerly defaulted
// to langgraph + anthropic; now 422 because model is unspecified.
// 2. explicit runtime, no model — the Code Reviewer repro shape.
// 3. explicit runtime+template path, but template (when missing on
// disk or unreadable) would leave model empty — exercised here by
// pointing at a non-existent template under /tmp/configs.
func TestCreate_ModelRequired_Returns422(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
handler := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", "/tmp/configs")
cases := []struct{ label, body string }{
{"bare_name_no_runtime_no_model", `{"name":"x"}`},
{"explicit_codex_no_model", `{"name":"Code Reviewer","role":"code reviewer","runtime":"codex","tier":4,"max_concurrent_tasks":1}`},
{"explicit_hermes_no_model", `{"name":"researcher","runtime":"hermes"}`},
}
for _, tc := range cases {
t.Run(tc.label, func(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(tc.body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusUnprocessableEntity {
t.Errorf("Create(%s): want 422 MODEL_REQUIRED, got %d: %s", tc.label, w.Code, w.Body.String())
return
}
if !bytes.Contains(w.Body.Bytes(), []byte(`"code":"MODEL_REQUIRED"`)) {
t.Errorf("Create(%s): want body containing code=MODEL_REQUIRED, got %s", tc.label, w.Body.String())
}
})
}
}
// TestCreate_ExternalRuntime_NoModel_OK pins the external-runtime
// exemption from the MODEL_REQUIRED gate. External workspaces
// intentionally do not spawn a Docker container or run an adapter;
// they delegate to a registered URL (workspace_provision.go:497-498:
// "external is a first-class runtime that intentionally does NOT
// spawn a Docker container"). The model field has no meaning for
// them — the URL is the contract, and the gate would 422 every
// legitimate "register my agent at https://..." flow.
//
// Both spellings count as external:
// 1. payload.External == true (the canonical flag, e.g. with any runtime)
// 2. payload.Runtime == "external" (legacy shape some E2E scripts still use)
//
// The isExternalLikeRuntime() helper catches both "external" and any
// future external-like runtime alias.
func TestCreate_ExternalRuntime_NoModel_OK(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
handler := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
// External=true with explicit runtime — the test_api.sh / Echo Agent shape.
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec(`UPDATE workspaces SET status =`).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO workspace_auth_tokens").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Echo Agent","tier":1,"runtime":"external","external":true}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("external workspace without model: want 201, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations not met: %v", err)
}
}
func TestUpdate_FieldValidation_Returns400(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
@@ -386,7 +386,13 @@ func TestWorkspaceCreate(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Test Agent","canvas":{"x":100,"y":200}}`
// Note: model is now required at the Create boundary (CTO 2026-05-22
// SSOT directive — see feedback_workspace_model_required_no_platform_default_dynamic_credential_intake
// and TestCreate_ModelRequired_Returns422). This test happens to take
// the bare-defaults path (no template, no runtime → langgraph), so
// the body must declare an explicit model. Using a langgraph-compatible
// id; the test doesn't exercise model semantics beyond presence.
body := `{"name":"Test Agent","model":"anthropic:claude-opus-4-7","canvas":{"x":100,"y":200}}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -69,10 +69,15 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
model = defaults.Model
}
if model == "" {
// SSOT: per-runtime defaults live in models/runtime_defaults.go
// (see RFC #2873). Consolidated from a duplicate of the same
// branch in workspace_provision.go.
model = models.DefaultModel(runtime)
// SSOT (CTO 2026-05-22, feedback_workspace_model_required_no_platform_default_dynamic_credential_intake):
// model is REQUIRED. The org-import template MUST declare a
// model — either per-workspace (`ws.Model`) or via the org
// defaults block (`defaults.Model`). If neither is present
// the template is malformed and the import must fail-closed
// rather than silently provisioning a workspace with a
// runtime-incompatible default (the prior `anthropic:claude-opus-4-7`
// fallback wedged every codex workspace at adapter init).
return fmt.Errorf("org import: workspace %q has no model and the org defaults block does not provide one (runtime=%s) — model is a required field per the workspace-creation contract; either set `model:` on the workspace or under `defaults:`", ws.Name, runtime)
}
tier := ws.Tier
if tier == 0 {
@@ -321,6 +321,51 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
payload.Runtime = "langgraph"
}
// SSOT (CTO 2026-05-22, feedback_workspace_model_required_no_platform_default_dynamic_credential_intake):
// model is REQUIRED user input for SPAWNED-runtime workspaces. The
// platform must not provide a default; the runtime must not fall back.
// The decision belongs to the user (or to the agent acting on the
// user's behalf), never to the platform.
//
// Empirical trigger: Code Reviewer 5ba15d7e was created with
// `{"name":"Code Reviewer","role":"...","runtime":"codex",...}` (no
// model). The legacy `DefaultModel(runtime)` fallback in
// provisionWorkspace returned `"anthropic:claude-opus-4-7"`. Codex
// adapter only supports openai-* providers — it wedged forever with
// `codex adapter: workspace config picks provider='anthropic' but
// it is not in the providers registry`. PATCH /workspaces/:id
// explicitly disallows updating model (the comment literally reads
// `model not patchable`), so the only recovery path was SQL UPDATE
// or delete+recreate.
//
// External workspaces are EXEMPT — they intentionally do not spawn
// a Docker container or run an adapter; they delegate to a registered
// URL (see provision.go: "external is a first-class runtime that
// intentionally does NOT spawn a Docker container"). The MODEL_REQUIRED
// gate is meaningful for spawned-runtime workspaces where the model
// id drives provider selection at adapter init. For external workspaces
// the contract is the URL, not the model — requiring it would be
// ceremony with no payoff, and would 422 every legitimate "register
// my agent at https://..." flow. The SSOT directive concerns
// platform-side defaults; an external workspace genuinely has no
// "model decision" for the user to make.
//
// Fail-closed at the Create boundary so the caller learns the
// contract immediately — same shape as the controlplane#188
// runtime-unresolved gate above. Caller fixes the request, no
// EC2 launched, no stuck workspace, no operator paging.
isExternal := payload.External || isExternalLikeRuntime(payload.Runtime)
if payload.Model == "" && !isExternal {
log.Printf("Create: FAIL-CLOSED — model is required (runtime=%q template=%q); refusing the silent DefaultModel fallback per CTO 2026-05-22 SSOT directive", payload.Runtime, payload.Template)
c.JSON(http.StatusUnprocessableEntity, gin.H{
"error": "model is required and has no platform-side default — pass an explicit \"model\" in the request body, or use a \"template\" whose config.yaml declares one. See feedback_workspace_model_required_no_platform_default_dynamic_credential_intake for the contract.",
"runtime": payload.Runtime,
"template": payload.Template,
"code": "MODEL_REQUIRED",
})
return
}
ctx := c.Request.Context()
// Convert empty role to NULL
@@ -168,7 +168,7 @@ func TestWorkspaceBudget_Create_WithLimit(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Budgeted Agent","budget_limit":1000}`
body := `{"name":"Budgeted Agent","model":"anthropic:claude-opus-4-7","budget_limit":1000}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
@@ -548,13 +548,22 @@ func (h *WorkspaceHandler) ensureDefaultConfig(workspaceID string, payload model
// via a crafted runtime string (#241).
runtime := sanitizeRuntime(payload.Runtime)
// Generate a minimal config.yaml
// Generate a minimal config.yaml.
//
// SSOT (CTO 2026-05-22): model is REQUIRED user input. The platform
// must not provide a default; the runtime must not fall back. The
// Create handler is responsible for rejecting empty model BEFORE
// reaching provisionWorkspace; this is a defence-in-depth assertion.
// If we hit here with an empty model the YAML below would still
// render a `model: ""` line — which renders all downstream provider
// derivation undefined. Log loudly and let the workspace boot into
// not_configured rather than masking the contract violation with a
// silently-broken default (the prior `anthropic:claude-opus-4-7`
// fallback was the canonical example — every codex workspace
// created without an explicit model wedged).
model := payload.Model
if model == "" {
// SSOT: per-runtime defaults live in models/runtime_defaults.go
// (see RFC #2873). Was previously duplicated here AND in
// org_import.go; consolidating prevents silent drift.
model = models.DefaultModel(runtime)
log.Printf("ensureDefaultConfig: workspace %s reached provisioning with empty model — Create handler should have rejected this; rendering empty model: \"\" in config.yaml (workspace will boot not_configured)", workspaceID)
}
if runtime == "claude-code" {
model = normalizeClaudeCodeModel(model)
@@ -756,47 +756,55 @@ func TestWorkspaceCreate_FirstDeploy_PersistsModelAndProvider(t *testing.T) {
}
}
// TestWorkspaceCreate_FirstDeploy_NoModel_NoSecretWritten asserts that
// when payload.Model is empty, NEITHER MODEL nor LLM_PROVIDER is
// written. Important: the canvas can omit `model` (template inherits
// the runtime default later); we must not poison workspace_secrets with
// empty rows in that case.
func TestWorkspaceCreate_FirstDeploy_NoModel_NoSecretWritten(t *testing.T) {
// TestWorkspaceCreate_FirstDeploy_NoModel_Returns422 inverts the prior
// premise (CTO 2026-05-22 SSOT directive — see
// feedback_workspace_model_required_no_platform_default_dynamic_credential_intake
// and TestCreate_ModelRequired_Returns422 in handlers_extended_test.go).
//
// Pre-2026-05-22 the canvas was allowed to omit `model` and the workspace
// would 201 with no workspace_secrets rows for MODEL/LLM_PROVIDER (the
// thinking being that templates inherit the runtime default later). That
// "soft fallback" was the load-bearing bug magnet — `DefaultModel(runtime)`
// would later return `anthropic:claude-opus-4-7`, and codex workspaces
// wedged forever at adapter init.
//
// New contract: empty model is a 422 MODEL_REQUIRED, with NO DB writes
// at all. The gate fires at the Create boundary before INSERT INTO
// workspaces. The follow-on workspace_secrets gate (which the original
// test pinned) is therefore unreachable on the empty-model path — there
// is no row to mint secrets for.
func TestWorkspaceCreate_FirstDeploy_NoModel_Returns422(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
// NO INSERT INTO workspace_secrets here — the gate is payload.Model != "".
mock.ExpectExec("INSERT INTO canvas_layouts").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec(`UPDATE workspaces SET status =`).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO workspace_auth_tokens").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
// NO mock.ExpectBegin / INSERT INTO workspaces — the Create gate
// MUST fire before any DB write. If the gate fires late, sqlmock
// will surface "call to ExecQuery 'INSERT INTO workspaces' was not
// expected" — which is exactly the failure mode we want to flag.
// Body: hermes runtime WITHOUT external:true (the external-runtime
// exemption — see TestCreate_ExternalRuntime_NoModel_OK — does NOT
// apply here; hermes spawns a real adapter and model selection
// matters at adapter init). This is exactly the shape the old
// "no-model-no-secret-write" test pinned, minus the external flag.
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"No Model Agent","runtime":"hermes","external":true}`
body := `{"name":"No Model Agent","runtime":"hermes"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected status 201, got %d: %s", w.Code, w.Body.String())
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("expected 422 MODEL_REQUIRED for empty model, got %d: %s", w.Code, w.Body.String())
}
if !bytes.Contains(w.Body.Bytes(), []byte(`"code":"MODEL_REQUIRED"`)) {
t.Errorf("expected code=MODEL_REQUIRED in body, got %s", w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations not met — empty payload.Model should NOT trigger workspace_secrets writes: %v", err)
t.Errorf("sqlmock saw an unexpected DB write — the MODEL_REQUIRED gate fired too late: %v", err)
}
}
@@ -193,10 +193,17 @@ func TestEnsureDefaultConfig_Hermes(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
// Post-CTO-SSOT-directive (2026-05-22): model is required user input;
// ensureDefaultConfig no longer fills in a runtime default. The Create
// handler gates on empty model and 422s before reaching here, so this
// test now passes the model explicitly to exercise the YAML rendering
// path — same model value the prior implicit DefaultModel("hermes")
// returned.
payload := models.CreateWorkspacePayload{
Name: "Test Agent",
Tier: 1,
Runtime: "hermes",
Model: "anthropic:claude-opus-4-7",
}
files := handler.ensureDefaultConfig("ws-test-123", payload)
@@ -219,7 +226,7 @@ func TestEnsureDefaultConfig_Hermes(t *testing.T) {
t.Errorf("config.yaml missing tier, got:\n%s", content)
}
if !contains(content, `model: "anthropic:claude-opus-4-7"`) {
t.Errorf("config.yaml should use default non-claude model, got:\n%s", content)
t.Errorf("config.yaml should render the supplied model, got:\n%s", content)
}
}
@@ -227,10 +234,14 @@ func TestEnsureDefaultConfig_ClaudeCode(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
// Post-CTO-SSOT-directive (2026-05-22): model is supplied explicitly
// instead of relying on the deleted DefaultModel("claude-code") =
// "sonnet" fallback. The Create handler 422s on empty model upstream.
payload := models.CreateWorkspacePayload{
Name: "Code Agent",
Tier: 2,
Runtime: "claude-code",
Model: "sonnet",
}
files := handler.ensureDefaultConfig("ws-code-123", payload)
@@ -407,9 +418,16 @@ func TestEnsureDefaultConfig_EmptyRuntimeDefaultsToClaudeCode(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
// Post-CTO-SSOT-directive (2026-05-22): ensureDefaultConfig is no
// longer the source of the model default — it just renders whatever
// the Create handler decided. The "empty runtime → claude-code"
// fallback inside sanitizeRuntime() is still in effect; this test
// continues to pin that behaviour by supplying the explicit
// claude-code model that the Create handler would have required.
payload := models.CreateWorkspacePayload{
Name: "Default Agent",
Tier: 1,
Name: "Default Agent",
Tier: 1,
Model: "sonnet",
}
files := handler.ensureDefaultConfig("ws-empty-rt", payload)
@@ -418,7 +436,7 @@ func TestEnsureDefaultConfig_EmptyRuntimeDefaultsToClaudeCode(t *testing.T) {
t.Errorf("empty runtime should default to claude-code, got:\n%s", configYAML)
}
if !contains(configYAML, `model: "sonnet"`) {
t.Errorf("claude-code default model should be sonnet (quoted), got:\n%s", configYAML)
t.Errorf("claude-code workspace should render the supplied model (quoted), got:\n%s", configYAML)
}
}
@@ -349,7 +349,7 @@ func TestWorkspaceCreate_DBInsertError(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Failing Agent"}`
body := `{"name":"Failing Agent","model":"anthropic:claude-opus-4-7"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -391,7 +391,7 @@ func TestWorkspaceCreate_DefaultsApplied(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Default Agent"}`
body := `{"name":"Default Agent","model":"anthropic:claude-opus-4-7"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -438,7 +438,7 @@ func TestWorkspaceCreate_SaaSHardForcesTier4(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"SaaS External Agent","runtime":"external","external":true,"url":"https://example.com/agent","tier":2}`
body := `{"name":"SaaS External Agent","runtime":"external","model":"external:custom","external":true,"url":"https://example.com/agent","tier":2}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -479,7 +479,7 @@ func TestWorkspaceCreate_WithSecrets_Persists(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Hermes Agent","runtime":"hermes","external":true,"secrets":{"HERMES_API_KEY":"sk-test-123"}}`
body := `{"name":"Hermes Agent","runtime":"hermes","model":"anthropic:claude-opus-4-7","external":true,"secrets":{"HERMES_API_KEY":"sk-test-123"}}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -513,7 +513,7 @@ func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Rollback Agent","secrets":{"OPENAI_API_KEY":"sk-fail"}}`
body := `{"name":"Rollback Agent","model":"anthropic:claude-opus-4-7","secrets":{"OPENAI_API_KEY":"sk-fail"}}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -548,7 +548,7 @@ func TestWorkspaceCreate_EmptySecrets_OK(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"No Secrets Agent","external":true,"secrets":{}}`
body := `{"name":"No Secrets Agent","model":"anthropic:claude-opus-4-7","external":true,"secrets":{}}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -587,7 +587,7 @@ func TestWorkspaceCreate_ExternalURL_SSRFSafe(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Ext Agent","runtime":"external","external":true,"url":"http://localhost:8000"}`
body := `{"name":"Ext Agent","runtime":"external","model":"external:custom","external":true,"url":"http://localhost:8000"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -629,7 +629,7 @@ func TestWorkspaceCreate_KimiRuntime_PreservesLabel(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Kimi Agent","runtime":"kimi","tier":3,"canvas":{"x":100,"y":100}}`
body := `{"name":"Kimi Agent","runtime":"kimi","model":"kimi-coding/kimi-k2-coding-6","tier":3,"canvas":{"x":100,"y":100}}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -659,7 +659,7 @@ func TestWorkspaceCreate_ExternalURL_SSRFMetadataBlocked(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Bad Agent","runtime":"external","external":true,"url":"http://169.254.169.254/latest/meta-data/"}`
body := `{"name":"Bad Agent","runtime":"external","model":"external:custom","external":true,"url":"http://169.254.169.254/latest/meta-data/"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -690,7 +690,7 @@ func TestWorkspaceCreate_ExternalURL_SSRFLoopbackBlocked(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Bad Loopback","runtime":"external","external":true,"url":"http://127.0.0.1:9000/a2a"}`
body := `{"name":"Bad Loopback","runtime":"external","model":"external:custom","external":true,"url":"http://127.0.0.1:9000/a2a"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -1844,39 +1844,43 @@ func TestWorkspaceCreate_188_TemplateConfigNoRuntimeKey_FailsClosed(t *testing.T
}
}
// Regression guard: the legitimate default path (no template, no runtime —
// bare {"name":...}) MUST still default to langgraph and return 201. The
// #188 fix must not break this.
func TestWorkspaceCreate_188_NoTemplateNoRuntime_StillDefaultsLanggraph(t *testing.T) {
mock := setupTestDB(t)
// Pre-2026-05-22 this test guarded "bare {name} → langgraph 201" — the
// regression check for controlplane#188 (where an explicit runtime that
// failed to resolve must NOT silently substitute langgraph) had a sibling
// to ensure the LEGITIMATE bare default still landed on langgraph.
//
// Post-CTO-SSOT-directive (2026-05-22) bare body is 422 MODEL_REQUIRED
// before reaching the langgraph branch — the gate runs AFTER the
// langgraph-default assignment so the error body still surfaces
// runtime=langgraph (helps the caller see "ok, langgraph WOULD have
// been the runtime, but you still owe me a model"). The bare-body
// langgraph 201 path no longer exists; what we guard now is the
// 422-shape diagnostic.
//
// Bare-body-with-explicit-model 201 (the new "legitimate default" path)
// is covered by TestWorkspaceCreate in handlers_test.go — no need to
// duplicate the mock dance here.
func TestWorkspaceCreate_188_NoTemplateNoRuntime_NowMODEL_REQUIRED(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WithArgs(sqlmock.AnyArg(), "Plain Default", nil, 3, "langgraph", sqlmock.AnyArg(), (*string)(nil), nil, "none", (*int64)(nil), models.DefaultMaxConcurrentTasks, "push").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WithArgs(sqlmock.AnyArg(), float64(0), float64(0)).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Plain Default"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected 201 (legitimate default path), got %d: %s", w.Code, w.Body.String())
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("bare-body create: expected 422 MODEL_REQUIRED, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
if !bytes.Contains(w.Body.Bytes(), []byte(`"code":"MODEL_REQUIRED"`)) {
t.Errorf("bare-body create: expected code=MODEL_REQUIRED in body, got %s", w.Body.String())
}
if !bytes.Contains(w.Body.Bytes(), []byte(`"runtime":"langgraph"`)) {
t.Errorf("bare-body create: expected runtime=\"langgraph\" in 422 body (the gate runs AFTER the langgraph-default assignment so the diagnostic surfaces what runtime WOULD have been used), got %s", w.Body.String())
}
}
@@ -1901,7 +1905,7 @@ func TestWorkspaceCreate_188_ExplicitRuntimeNoTemplate_OK(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Explicit Codex","runtime":"codex"}`
body := `{"name":"Explicit Codex","runtime":"codex","model":"gpt-5.5"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
@@ -1,39 +1,31 @@
package models
// runtime_defaults.go — single source of truth for per-runtime defaults
// the platform applies when the operator/agent didn't supply a value.
// runtime_defaults.go — DELETED helper. Intentionally empty.
//
// Why this lives in models/ (not handlers/): default selection is a
// pure data fact about the runtime, not handler logic. Multiple
// callers (Create-workspace handler, org-import handler, future
// auto-provision paths) need the same answer; concentrating the
// rule here means one edit when a runtime's default changes.
// Previously held `DefaultModel(runtime string) string` which returned
// "sonnet" for claude-code and "anthropic:claude-opus-4-7" for everything
// else. That function was a SOFT-FALLBACK bug magnet:
//
// Related work (RFC #2873): this is the seed for a future
// `RuntimeConfig` interface that will also expose `ProvisioningTimeout()`,
// `CapabilitiesSupported()`, and other per-runtime facts. For now the
// surface is one helper — extracted from the duplicate branch in
// workspace_provision.go:537 and org_import.go:54 that diverged silently
// during refactors before this consolidation.
// DefaultModel returns the model slug to use when a workspace is
// created without an explicit model and the runtime can't infer one
// from its own config.
// - codex workspaces created without an explicit `model` silently
// received `anthropic:claude-opus-4-7`. Codex adapter only supports
// openai-* providers, so they wedged in `not_configured` with
// `codex adapter: workspace config picks provider='anthropic' but
// it is not in the providers registry`. The fallback never matched
// a runtime that could actually use it (only langgraph + hermes
// could even partially execute anthropic:claude-opus-4-7 without
// extra credential plumbing). It existed as a "must return
// something" placeholder that turned every silent miss into a
// prod incident.
//
// - claude-code: "sonnet" — Anthropic's CLI accepts the short
// name and resolves it via the operator's anthropic-oauth or
// ANTHROPIC_API_KEY chain.
// - everything else (hermes, langgraph, autogen, codex, openclaw,
// external, ""): a fully-qualified
// vendor:model slug that the universal MODEL_PROVIDER chain in
// molecule-core PR #247 can route via per-vendor required_env.
// - The fallback hid the contract bug at every callsite: Create
// handler, org_import, anywhere a stale CreateWorkspacePayload
// bubbled through to provisionWorkspace.
//
// The function never returns an empty string; an unknown runtime
// gets the universal default rather than failing closed (matches the
// pre-refactor behavior — both call sites used the same fallback).
func DefaultModel(runtime string) string {
if runtime == "claude-code" {
return "sonnet"
}
return "anthropic:claude-opus-4-7"
}
// SSOT principle (CTO 2026-05-22T03:42Z, feedback_workspace_model_required_no_platform_default_dynamic_credential_intake):
// model / provider / provider-credential are REQUIRED user input at
// create time. The platform must not provide a default. The runtime
// must not fall back. Decision belongs to the user (or to the agent
// acting on the user's behalf), never to the platform.
//
// Callers that previously fell back to DefaultModel must now fail-closed
// when model is empty after template-resolution.
@@ -1,59 +1,11 @@
package models
import "testing"
// TestDefaultModel pins the contract: known runtimes return their
// expected default; unknowns and the empty string fall through to the
// universal default. Add new runtimes here as `case` entries — pre-fix
// adding a runtime required two source edits + an audit; post-SSOT it
// requires one entry in DefaultModel + one assertion here.
func TestDefaultModel(t *testing.T) {
cases := []struct {
runtime string
want string
}{
// Known runtimes.
{"claude-code", "sonnet"},
// Universal fallback for everything else. Each runtime is named
// explicitly so a future drift (e.g., adding a hermes-specific
// branch) shows up as a failure on the runtime that drifted, not
// as a generic "unknown" failure.
{"hermes", "anthropic:claude-opus-4-7"},
{"langgraph", "anthropic:claude-opus-4-7"},
{"autogen", "anthropic:claude-opus-4-7"},
{"codex", "anthropic:claude-opus-4-7"},
{"openclaw", "anthropic:claude-opus-4-7"},
{"external", "anthropic:claude-opus-4-7"},
// Unknown / empty — fall through to universal default rather
// than failing closed. Pre-refactor both call sites also fell
// through; pinning the existing behavior, not changing it.
{"", "anthropic:claude-opus-4-7"},
{"some-future-runtime", "anthropic:claude-opus-4-7"},
{"CLAUDE-CODE", "anthropic:claude-opus-4-7"}, // case-sensitive — matches prior behavior
}
for _, tc := range cases {
t.Run(tc.runtime, func(t *testing.T) {
got := DefaultModel(tc.runtime)
if got != tc.want {
t.Errorf("DefaultModel(%q) = %q, want %q", tc.runtime, got, tc.want)
}
})
}
}
// TestDefaultModel_NeverEmpty — invariant: no input produces an empty
// string. The handlers that consume this would write empty into
// config.yaml, which the runtime then can't dispatch — pinning the
// non-empty contract here protects against a future "return early on
// unknown runtime" change that would silently break workspace creation.
func TestDefaultModel_NeverEmpty(t *testing.T) {
for _, runtime := range []string{
"", "claude-code", "hermes", "unknown-runtime",
} {
if got := DefaultModel(runtime); got == "" {
t.Errorf("DefaultModel(%q) returned empty string", runtime)
}
}
}
// runtime_defaults_test.go — previously pinned DefaultModel's contract
// (claude-code → "sonnet", everything else → "anthropic:claude-opus-4-7").
//
// DefaultModel was removed as a soft-fallback bug magnet (CTO 2026-05-22):
// model is REQUIRED user input; the platform must not provide a default.
// See runtime_defaults.go for the deletion rationale, and the new
// fail-closed gate in `handlers.WorkspaceHandler.Create` for the boundary
// enforcement. No test stub here — the contract is "this function does
// not exist", which the type-checker enforces at compile time.
@@ -0,0 +1,59 @@
# T4 privilege contract — generated from
# molecule-ai/molecule-core workspace-server/internal/provisioner/t4_privilege_contract.go
# RFC: molecule-ai/internal#456
# Do NOT edit this file by hand; regenerate via `go run ./cmd/t4-contract-dump > t4_capabilities.yaml`.
version: 1
agent_uid: 1000
capabilities:
- name: "agent_home_writable"
description: "/agent-home is writable by the agent (Files API split per task #128). The Files API redesign uses /agent-home as the user-writable root; the agent must be able to create files there without sudo."
severity: hard
source: "task #128 Files API redesign; memory reference_post_suspension_pipeline"
probe: "TF=/agent-home/.t4-cap-write-probe-${MOLECULE_T4_PROBE_ID:-$$}; echo ok > \"$TF\" && [ \"$(cat \"$TF\")\" = \"ok\" ] && rm -f \"$TF\""
- name: "agent_uid_1000"
description: "The container's primary process (the runtime, post-gosu) runs as uid 1000, not root. T4 grants full machine access via privileged + host PID + Docker socket — the WORKLOAD inside that privileged container must still be unprivileged to prevent every untrusted code execution from being trivially root-on-host."
severity: hard
source: "RFC internal#456 §2.1.2; memory feedback_hermes_listpeers_401_token_root600_unreadable_by_uid1000"
probe: "[ \"$(id -u)\" = \"1000\" ]"
- name: "auth_token_agent_owned"
description: "/configs/.auth_token is owned by uid 1000 (== AgentUID) so the a2a_mcp_server can read its bearer. In SaaS mode molecule-runtime itself writes the token via save_token() — the ownership equals the runtime's exec uid. If the runtime ever runs as root, this fails and list_peers 401s (the Hermes class bug)."
severity: hard
source: "RFC internal#456 §10; memory feedback_hermes_listpeers_401_token_root600_unreadable_by_uid1000"
probe: "[ -e /configs/.auth_token ] && [ \"$(stat -c '%u' /configs/.auth_token)\" = \"1000\" ]"
- name: "docker_socket_reachable"
description: "/var/run/docker.sock is bind-mounted and host Docker is reachable from the T4 container. The probe enters the host mount+PID namespaces before running docker info so it validates the same host-control path production agents use, instead of depending on the template image's Docker CLI/socket group details."
severity: hard
source: "provisioner.go applyHostConfig T4 branch (case 4)"
probe: "sudo -n nsenter --target 1 --mount --pid -- docker info >/dev/null 2>&1"
- name: "host_fs_write_readback"
description: "Host filesystem is mounted at /host and the agent can write+read+remove a file there via sudo. Proves real host reach (not just a PID-1 namespace trick on an isolated init)."
severity: hard
source: "RFC internal#456 §11"
probe: "MARKER=\"t4cap-$(date +%s)-$RANDOM\"; PROBE_FILE=\"/host/tmp/.t4-cap-probe-${MOLECULE_T4_PROBE_ID:-$$}\"; sudo -n sh -c \"echo $MARKER > $PROBE_FILE\" && [ \"$(sudo -n cat $PROBE_FILE)\" = \"$MARKER\" ] && sudo -n rm -f $PROBE_FILE"
- name: "host_root_reach_via_nsenter"
description: "The uid-1000 agent can attain host root via `sudo -n nsenter --target 1 --mount --pid -- id -u` returning 0. This is the T4 escalation leg: full machine access means the agent CAN escalate to host root deliberately, even though it does not run as root by default."
severity: hard
source: "RFC internal#456 §11; memory reference_per_template_privilege_contract_class_audit_2026_05_16"
probe: "[ \"$(sudo -n nsenter --target 1 --mount --pid -- id -u)\" = \"0\" ]"
- name: "list_peers_http_200"
description: "The platform list_peers HTTP endpoint (served by the in-container a2a_mcp_server) returns HTTP 200 when called from uid 1000 with the bearer from /configs/.auth_token. This proves the WHOLE token-ownership chain end-to-end: token written under correct uid → reader uid matches → bearer non-empty → platform accepts. A self-contained empirical test for the Hermes class bug."
severity: hard
source: "memory reference_openclaw_fresh_provision_nonfunctional_anthropic_default_unroutable; memory reference_openclaw_mcp_peer_wiring_rootcause"
probe: "BEARER=$(cat /configs/.auth_token 2>/dev/null || echo \"\"); [ -n \"$BEARER\" ] || exit 1; PORT=$(cat /configs/.platform_port 2>/dev/null || echo \"8080\"); STATUS=$(curl -sS -o /dev/null -w '%{http_code}' -H \"Authorization: Bearer $BEARER\" \"http://127.0.0.1:${PORT}/list_peers\"); [ \"$STATUS\" = \"200\" ]"
- name: "network_egress_https"
description: "Generic HTTPS egress works. T4 is unconstrained network; the canonical test target is the Molecule-owned Gitea middleman over its public name. CI must not depend on GitHub or other mirrors for this probe. Any reachable HTTPS endpoint satisfies it — the YAML carries the recommended targets but accepts any 200/301/302."
severity: hard
source: "task #174 brief"
probe: "for U in $MOLECULE_T4_EGRESS_TARGETS; do C=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 8 \"$U\"); case \"$C\" in 2*|3*) exit 0;; esac; done; exit 1"
required_egress:
- "https://git.moleculesai.app/api/v1/version"
- name: "pid_host_visible"
description: "Host PID namespace is shared (--pid=host). The container can see host process 1 (systemd or pid-1 on the EC2 instance). Required for nsenter into host mount/pid namespaces."
severity: hard
source: "provisioner.go applyHostConfig T4 branch (case 4): hostCfg.PidMode = 'host'"
probe: "[ \"$(sudo -n nsenter --target 1 --mount --pid -- id -u)\" = \"0\" ]"
- name: "privileged_flag_observable"
description: "Container is started with --privileged. Observable from inside via /proc/self/status CapEff containing CAP_SYS_ADMIN. Defense-in-depth for the provisioner emission side."
severity: advisory
source: "provisioner.go applyHostConfig T4 branch (case 4)"
probe: "grep -q '^CapEff:.*ffffffffff' /proc/self/status"
@@ -120,8 +120,8 @@ func T4PrivilegeContract() []T4Capability {
},
{
Name: "docker_socket_reachable",
Description: "/var/run/docker.sock is bind-mounted into the container so the agent can manage other containers (T4 use case: agent-as-orchestrator). Proven by 'docker version' returning a server section, which requires the daemon to answer over the socket.",
Probe: `sudo -n docker version --format '{{.Server.Version}}' >/dev/null 2>&1`,
Description: "/var/run/docker.sock is bind-mounted and host Docker is reachable from the T4 container. The probe enters the host mount+PID namespaces before running docker info so it validates the same host-control path production agents use, instead of depending on the template image's Docker CLI/socket group details.",
Probe: `sudo -n nsenter --target 1 --mount --pid -- docker info >/dev/null 2>&1`,
Severity: SeverityHard,
Source: "provisioner.go applyHostConfig T4 branch (case 4)",
},
@@ -145,7 +145,7 @@ func T4PrivilegeContract() []T4Capability {
},
{
Name: "network_egress_https",
Description: "Generic HTTPS egress works. T4 is unconstrained network; the canonical test target is the Gitea instance over its public name, which any fork user can also resolve. Any reachable HTTPS endpoint satisfies it — the YAML carries the recommended targets but accepts any 200/301/302.",
Description: "Generic HTTPS egress works. T4 is unconstrained network; the canonical test target is the Molecule-owned Gitea middleman over its public name. CI must not depend on GitHub or other mirrors for this probe. Any reachable HTTPS endpoint satisfies it — the YAML carries the recommended targets but accepts any 200/301/302.",
Probe: `for U in $MOLECULE_T4_EGRESS_TARGETS; do ` +
` C=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 8 "$U"); ` +
` case "$C" in 2*|3*) exit 0;; esac; ` +
@@ -153,10 +153,9 @@ func T4PrivilegeContract() []T4Capability {
Severity: SeverityHard,
Source: "task #174 brief",
RequiredEgress: []string{
// Public, no auth, returns a small JSON.
// Molecule-owned, public, no auth, returns a small JSON.
// Adopters override via MOLECULE_T4_EGRESS_TARGETS.
"https://api.github.com/zen",
"https://www.google.com/generate_204",
"https://git.moleculesai.app/api/v1/version",
},
},
{
@@ -169,7 +168,7 @@ func T4PrivilegeContract() []T4Capability {
{
Name: "pid_host_visible",
Description: "Host PID namespace is shared (--pid=host). The container can see host process 1 (systemd or pid-1 on the EC2 instance). Required for nsenter into host mount/pid namespaces.",
Probe: `[ -d /proc/1/root ] && [ "$(sudo -n readlink /proc/1/ns/pid)" = "$(sudo -n readlink /proc/self/ns/pid)" ]`,
Probe: `[ "$(sudo -n nsenter --target 1 --mount --pid -- id -u)" = "0" ]`,
Severity: SeverityHard,
Source: "provisioner.go applyHostConfig T4 branch (case 4): hostCfg.PidMode = 'host'",
},
@@ -1,6 +1,7 @@
package provisioner
import (
"os"
"strings"
"testing"
)
@@ -77,6 +78,19 @@ func TestT4PrivilegeContract_CoreCapabilitiesPresent(t *testing.T) {
}
}
func TestT4PrivilegeContract_DefaultEgressUsesMoleculeOwnedEndpoint(t *testing.T) {
for _, c := range T4PrivilegeContract() {
for _, target := range c.RequiredEgress {
if strings.Contains(target, "github.com") {
t.Errorf("capability %q default egress target must not depend on GitHub mirror/API: %s", c.Name, target)
}
if strings.Contains(target, "google.com") {
t.Errorf("capability %q default egress target must not depend on external Google endpoint: %s", c.Name, target)
}
}
}
}
// TestT4PrivilegeContract_HardCapabilitiesMajority sanity-checks that
// the contract is not silently advisory-only. If someone marks
// everything as "advisory" the gate becomes a no-op without anyone
@@ -142,6 +156,17 @@ func TestAsYAML_EscapesEmbeddedQuotes(t *testing.T) {
}
}
func TestGeneratedT4CapabilitiesYAMLMatchesSSOT(t *testing.T) {
got, err := os.ReadFile("t4_capabilities.yaml")
if err != nil {
t.Fatalf("read generated t4_capabilities.yaml: %v", err)
}
want := AsYAML(T4PrivilegeContract())
if string(got) != want {
t.Fatal("generated t4_capabilities.yaml drifted from T4PrivilegeContract; regenerate with `go run ./cmd/t4-contract-dump > internal/provisioner/t4_capabilities.yaml`")
}
}
// TestAgentUIDConsistency ties the contract to the existing
// provisioner-side AgentUID const. The probe for "agent_uid_1000"
// hard-codes `id -u == 1000`; if AgentUID ever changes (no one