Compare commits

10 Commits

Author SHA1 Message Date
core-devops fd5a830370 infra(ci): pin upload-artifact to SHA in e2e-chat workflow
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m16s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 56s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 55s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m5s
gate-check-v3 / gate-check (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4m18s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 52s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 5m40s
CI / Python Lint & Test (pull_request) Successful in 6m37s
CI / all-required (pull_request) Successful in 6m53s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m13s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 5/7 — missing: root-cause, no-backwards-compat — body-unfilled: comprehensive-testing, local-postgres-e2e, staging-sm
sop-checklist / na-declarations (pull_request) N/A: (none)
Aligns with the SHA-pinning standard applied to all other Gitea
Actions workflows (ci.yml, e2e-staging-canvas.yml, etc.).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 11:20:54 +00:00
core-devops 8399e8b525 fix(queue): correct status deduplication order so newest entry wins
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Platform (Go) (pull_request) Successful in 4m21s
CI / Canvas (Next.js) (pull_request) Successful in 5m43s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m19s
CI / all-required (pull_request) Successful in 6m22s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 5/7 — missing: root-cause, no-backwards-compat
sop-checklist / na-declarations (pull_request) N/A: (none)
The queue was incorrectly seeing main's CI/all-required (push) as
"pending" instead of "success". Two bugs interacting:

1. latest_statuses_by_context guard was wrong: `ids[-1] > ids[0]`
   detected ascending but the combined /statuses array is DESCENDING
   (ids 393→1). Fix: `ids[-1] < ids[0]` detects descending and
   reverses so ascending iteration makes newest last → wins.

2. get_combined_status sorted merged entries DESCENDING then deduplicated
   by iterating forward — the last occurrence won. But when /status
   base entries (low ids) are appended AFTER /statuses (high ids), the
   same-context entries from base appear LAST after descending sort,
   overwriting newer entries from /statuses. Fix: return merged list
   sorted ASCENDING and drop the inline dedup; let
   latest_statuses_by_context handle dedup correctly.

Test names clarified: ascending-input test now named
test_latest_statuses_ascending_input_newest_wins (the base /status
case); descending-input test renamed
test_latest_statuses_guard_reverses_descending_input (the /statuses
case). Both verify newest (largest id) wins.
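
A minimal sketch of the corrected flow described above, assuming each status is a dict with an integer `id` and a string `context`. The helper names `merge_ascending` and `latest_by_context` and the sample entries are illustrative, not taken from the script:

```python
def merge_ascending(base: list[dict], extended: list[dict]) -> list[dict]:
    """Concatenate both sources and sort by id so the newest entry comes last."""
    return sorted(base + extended, key=lambda s: s.get("id") or 0)


def latest_by_context(statuses: list[dict]) -> dict[str, dict]:
    """Last-write-wins dedup: on ascending input the largest id per context survives."""
    latest: dict[str, dict] = {}
    for status in statuses:
        context = status.get("context")
        if isinstance(context, str):
            latest[context] = status
    return latest


# Made-up data: base (/status) holds a stale low id, extended (/statuses) is newest-first.
base = [{"id": 1, "context": "CI / all-required (push)", "status": "pending"}]
extended = [
    {"id": 393, "context": "CI / all-required (push)", "status": "success"},
    {"id": 392, "context": "Secret scan", "status": "success"},
]

latest = latest_by_context(merge_ascending(base, extended))
print(latest["CI / all-required (push)"]["status"])
```

Because the merge step no longer deduplicates, there is only one place where the "which entry wins" rule lives.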

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:29:57 +00:00
core-devops 5e47d2e385 fix(queue): query merge-queue label by name not resolved ID
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
gate-check-v3 / gate-check (pull_request) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 53s
qa-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 3s
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4m34s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 6m14s
E2E Chat / E2E Chat (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Python Lint & Test (pull_request) Successful in 6m28s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 5m52s
Gitea orders /issues?labels=<id> by PR number ascending with limit
applied before PR #1233 appears — the 50-result page starts at PR #1309
and misses #1233 entirely. Querying by label name returns #1233
correctly. Drop the _ensure_label_ids() startup call (one less API
round-trip per tick) and the now-dead _QUEUE_LABEL_ID/_HOLD_LABEL_ID
globals. Resolves the queue label query bug root-causing SEV-1 #487.
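
The commit fixes the miss by changing the query, not by paginating. Separately from that fix, the sketch below shows how a single 50-result page can hide an item and how walking every page avoids it. `fetch_page` is a hypothetical callable standing in for the API call, and the PR numbers in the fake backend are made up:

```python
from typing import Callable, Iterator


def iter_all_pages(
    fetch_page: Callable[[int, int], list[dict]],
    limit: int = 50,
) -> Iterator[dict]:
    """Yield items page by page until a short (or empty) page is returned."""
    page = 1
    while True:
        items = fetch_page(page, limit)
        yield from items
        if len(items) < limit:
            return
        page += 1


# Fake backend: 120 made-up PRs, numbered 1309 down to 1190.
ALL_PRS = [{"number": n} for n in range(1309, 1189, -1)]


def fake_fetch(page: int, limit: int) -> list[dict]:
    start = (page - 1) * limit
    return ALL_PRS[start:start + limit]


first_page = {pr["number"] for pr in fake_fetch(1, 50)}
everything = {pr["number"] for pr in iter_all_pages(fake_fetch)}
```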

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 09:14:36 +00:00
core-devops 6c06227871 fix(queue): correct latest_statuses_by_context guard for descending input
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 4m18s
CI / Canvas (Next.js) (pull_request) Successful in 5m33s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m23s
CI / all-required (pull_request) Successful in 6m36s
Gitea /statuses returns newest-first (descending id order). After
get_combined_status sorts by id descending, the combined list is also
descending. The old guard `ids[-1] > ids[0]` detected ascending input
but not descending input, so for main (130+ statuses) the guard did not
fire and forward iteration over the descending list let the oldest entry
for each context overwrite the newest. The fix inverts the comparison to
`ids[-1] < ids[0]`, so that descending input triggers reversal and the
newest (largest id) entry per context is seen last and wins. Ascending
test fixtures work unchanged.
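
To illustrate the comparison, here is a small sketch assuming last-write-wins accumulation into a dict. `needs_reversal` and `latest_by_context` are hypothetical names and the fixture is invented:

```python
def needs_reversal(statuses: list[dict]) -> bool:
    """True when ids run newest-first, i.e. the last id is smaller than the first."""
    ids = [s["id"] for s in statuses if isinstance(s.get("id"), int)]
    return bool(ids) and ids[-1] < ids[0]


def latest_by_context(statuses: list[dict]) -> dict[str, dict]:
    """Normalise to oldest-to-newest order, then let later entries overwrite earlier ones."""
    if needs_reversal(statuses):
        statuses = list(reversed(statuses))
    latest: dict[str, dict] = {}
    for status in statuses:
        latest[status["context"]] = status
    return latest


# Made-up fixture: the same context twice, newest first.
CONTEXT = "CI / all-required (push)"
descending = [
    {"id": 393, "context": CONTEXT, "status": "success"},
    {"id": 1, "context": CONTEXT, "status": "pending"},
]
ascending = list(reversed(descending))

# The earlier comparison (last > first) does not fire on descending input.
old_check_fires = descending[-1]["id"] > descending[0]["id"]
new_check_fires = needs_reversal(descending)
```

Both input orders resolve to the entry with id 393, which is the property the two test cases are meant to pin down.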

Also adds explicit-id test fixture for the ascending-guard case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 08:41:36 +00:00
core-devops f6abdb9dc1 fix(queue): proper merge of base + extended statuses by id sort
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 4m4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 3s
E2E Chat / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 57s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 5m25s
qa-review / approved (pull_request) Failing after 2s
security-review / approved (pull_request) Failing after 2s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 56s
CI / Python Lint & Test (pull_request) Successful in 6m29s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 6m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
gate-check-v3 / gate-check (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 13s
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 5/7 — missing: root-cause, no-backwards-compat
sop-checklist / na-declarations (pull_request) N/A: (none)
The previous supplement logic only added contexts MISSING from base, but didn't
overwrite base entries with newer statuses from /statuses. Result: stale
"failure" entries from base (id=27) overwrote newer "pending" entries from
/statuses (id=25) because supplement only filled gaps.

Fix: collect all entries from both /status (base) and /statuses (extended),
sort by id descending (highest = newest), and iterate in that order so the
newest entry for each context wins regardless of source.

The combined statuses[] is now correct for all cases:
- Newest in base only: wins (from sorted iteration)
- Newest in extended only: wins (supplements base)
- Newest in base, older in extended: wins (base entry processed later in sort)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 08:11:58 +00:00
core-devops ec79a6bb20 fix(queue): supplement statuses overwrite base, not just fill gaps
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 4m4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 50s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5m25s
gate-check-v3 / gate-check (pull_request) Successful in 2s
qa-review / approved (pull_request) Failing after 2s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 3s
sop-checklist / all-items-acked (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 56s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 6m27s
CI / all-required (pull_request) Successful in 6m22s
The base /status endpoint returns only 26-30 entries; newer statuses for
the same context may not be in the base array. The supplement logic
was only adding contexts MISSING from base, but the base already contained
an old "pending" entry for CI/all-required while the newer "success" entry
was beyond the base array's cutoff. Now the supplement OVERWRITES base
entries for the same context so newer statuses always win.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 08:02:52 +00:00
core-devops f8d4512e1f fix(queue): correct status ordering and supplement missing contexts
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 4m42s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request) Successful in 2s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 2s
sop-checklist / all-items-acked (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 52s
CI / Canvas (Next.js) (pull_request) Successful in 6m13s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 53s
CI / Python Lint & Test (pull_request) Successful in 6m27s
CI / all-required (pull_request) Successful in 6m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Two related fixes to get_combined_status() + latest_statuses_by_context():

1. Ordering: Gitea /statuses returns entries in DESCENDING id order
   (newest first). The script was reversing, treating it as ascending,
   which made the OLDEST entry win instead of the newest. Now iterate
   forward so newer entries overwrite older ones (newest wins).

2. Context gaps: The /status endpoint returns only 30 statuses in its
   statuses[] array. The /statuses endpoint (limit=100) may not include
   all contexts from /status. Now merge: start with /status's statuses[]
   (authoritative, ascending), supplement missing contexts from
   /statuses (descending, reversed for correct iteration order).

Also fixes test_latest_statuses_dedupes_by_context_newest_first to
assert the correct "newest wins" semantics.

PR #1403 now correctly shows ready=True action=merge with this fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 07:59:39 +00:00
core-devops b0ec931595 fix(queue): resolve merge-queue label by ID not name
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 5m10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m1s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 6m34s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 6m54s
CI / all-required (pull_request) Successful in 6m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 55s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
Gitea allows multiple repo labels with the same name but different
colours. The /issues endpoint with labels=<name> matches at most one
of them — not reliably the canonical colour. This caused
list_queued_issues() to miss PRs that only had the canonical
merge-queue label (id=27, colour 1f883d) when duplicates with a
different colour existed in the repo.

Fix: _resolve_label_id() looks up the label's numeric id at startup
and list_queued_issues() queries by that id instead of the name.
This is stable regardless of how many duplicate labels exist.
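
A rough sketch of the lookup idea, assuming the repo label list is already available as dicts with `id`, `name`, and `color` keys. `resolve_label_id` here is a hypothetical stand-in with its own signature, and every label other than id=27 is invented:

```python
from typing import Optional


def resolve_label_id(
    labels: list[dict], name: str, colour: Optional[str] = None
) -> Optional[int]:
    """Pick the numeric id for `name`, preferring the given colour among duplicates."""
    same_name = [label for label in labels if label.get("name") == name]
    if colour is not None:
        same_colour = [label for label in same_name if label.get("color") == colour]
        if same_colour:
            same_name = same_colour
    return same_name[0]["id"] if same_name else None


# Made-up label list containing a duplicate name.
repo_labels = [
    {"id": 41, "name": "merge-queue", "color": "ededed"},
    {"id": 27, "name": "merge-queue", "color": "1f883d"},
    {"id": 30, "name": "tier:low", "color": "0e8a16"},
]

queue_label_id = resolve_label_id(repo_labels, "merge-queue", colour="1f883d")
```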

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 07:40:42 +00:00
core-devops 8ccf3a844c fix(queue): surface merge API errors instead of silent catch
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 3s
sop-tier-check / tier-check (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4m11s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas (Next.js) (pull_request) Successful in 5m33s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m33s
CI / all-required (pull_request) Successful in 5m9s
audit-force-merge / audit (pull_request) Has been skipped
When the merge API returns a non-transient error (HTTP 405 permission
denied, HTTP 422 pre-receive hook block, etc.), the queue was catching
ApiError in the generic main-loop handler and exiting 0 — indistinguishable
from a successful-no-op tick.

Fix: catch ApiError specifically around merge_pull(), post a PR comment
with the error detail and a reference to SEV-1 internal#487, and return
exit code 2 so the workflow run is marked failed.

Exit codes:
  0 — success (merged, updated, or nothing to do)
  2 — merge API error (permission/hook issue, non-transient)
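
The exit-code contract can be sketched as follows. This is an illustration under assumptions: `run_tick`, the injected `merge_pull` and `post_comment` callables, and the comment wording are hypothetical, and only the `ApiError` name and the 0/2 codes come from the message above:

```python
import sys

EXIT_OK = 0
EXIT_MERGE_API_ERROR = 2


class ApiError(RuntimeError):
    pass


def run_tick(merge_pull, post_comment, pr_number: int) -> int:
    """Attempt one merge; report API errors on the PR and via the exit code."""
    try:
        merge_pull(pr_number)
    except ApiError as exc:
        post_comment(pr_number, f"merge-queue: merge failed: {exc}")
        sys.stderr.write(f"::error::merge of PR #{pr_number} failed: {exc}\n")
        return EXIT_MERGE_API_ERROR
    return EXIT_OK


def failing_merge(pr_number: int) -> None:
    raise ApiError("HTTP 405")


comments: list[tuple[int, str]] = []
failed_code = run_tick(failing_merge, lambda n, body: comments.append((n, body)), 1)
ok_code = run_tick(lambda n: None, lambda n, body: None, 1)
```

An entry point would pass the returned code to `sys.exit()` so that the workflow run is marked failed.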

Fixes: SEV-1 internal#487 — queue silently failing to merge while
reporting success; merge permission error invisible without workflow
log inspection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 06:24:11 +00:00
core-devops 9ede993f3d fix(sop-checklist): probe() KeyError for gate names in compute_na_state
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 58s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m0s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 4m52s
CI / Canvas (Next.js) (pull_request) Successful in 6m35s
CI / Python Lint & Test (pull_request) Successful in 6m38s
CI / all-required (pull_request) Successful in 6m39s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 2/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +2 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Has been skipped
compute_na_state() calls probe(gate_name, [user]) where gate_name is a gate
name like 'qa-review' or 'security-review' — these are not checklist item
slugs and are not in items_by_slug. probe() was doing:

    item = items_by_slug[slug]   # KeyError for 'qa-review'

This caused the sop-checklist workflow to crash on any PR that has N/A gates
configured (all 7 checklist items with /sop-n/a), producing a 30-minute
Failing status before Gitea kills the job.

Fix: add _required_teams_for() helper that falls back to na_gates lookup
when slug is not in items_by_slug. Gate names resolve to their
required_teams from the n/a_gates config section.
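
A minimal sketch of the fallback, assuming both lookups are plain dicts whose values carry a `required_teams` list. The function signature, the config shape, and the team names are assumptions for illustration:

```python
def required_teams_for(slug: str, items_by_slug: dict, na_gates: dict) -> list[str]:
    """Look up required teams for a checklist slug, falling back to gate names."""
    entry = items_by_slug.get(slug)
    if entry is None:
        # Not a checklist item: try the N/A gate configuration instead.
        entry = na_gates.get(slug)
    if entry is None:
        raise KeyError(f"unknown checklist item or gate: {slug}")
    return list(entry.get("required_teams", []))


# Made-up configuration fragments.
items_by_slug = {"root-cause": {"required_teams": ["engineering"]}}
na_gates = {
    "qa-review": {"required_teams": ["qa"]},
    "security-review": {"required_teams": ["security"]},
}

qa_teams = required_teams_for("qa-review", items_by_slug, na_gates)
item_teams = required_teams_for("root-cause", items_by_slug, na_gates)
```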

Adds TestProbeNaGateFallback regression test (58/58 passing).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 05:09:35 +00:00
128 changed files with 1597 additions and 6786 deletions
+64 -83
@@ -65,11 +65,6 @@ class ApiError(RuntimeError):
     pass
-class MergePermissionError(ApiError):
-    """Merge failed with a permanent permission error (403/404/405).
-    The queue should skip this PR and move to the next one."""
 @dataclasses.dataclass(frozen=True)
 class MergeDecision:
     ready: bool
@@ -142,49 +137,37 @@ def status_state(status: dict) -> str:
 def latest_statuses_by_context(statuses: list[dict]) -> dict[str, dict]:
-    # Gitea /statuses endpoint returns entries in ascending id order (oldest
-    # first). We need the LAST occurrence of each context, so iterate in
-    # reverse to prefer newer entries.
+    # Iterate so the newest entry for each context is seen LAST → it overwrites
+    # older ones in the accumulator dict.
+    # - Ascending input (oldest first, e.g. Gitea /status base array): forward
+    #   iteration processes oldest first, newest last → newest overwrites → OK.
+    # - Descending input (newest first, e.g. Gitea /statuses, combined array):
+    #   forward iteration processes newest first → oldest last → oldest wins.
+    #   Must REVERSE so iteration is oldest→newest → newest wins.
+    # Guard: detect descending input by checking last_id < first_id.
+    if not statuses:
+        return {}
+    ids = [s.get("id", 0) for s in statuses if isinstance(s.get("id"), int)]
+    if ids and ids[-1] < ids[0]:
+        # Descending (newest first) — reverse to oldest→newest iteration.
+        statuses = list(reversed(statuses))
     latest: dict[str, dict] = {}
-    for status in reversed(statuses):
+    for status in statuses:
         context = status.get("context")
         if isinstance(context, str):
-            latest[context] = status  # overwrite: reverse order → newest wins
+            latest[context] = status
     return latest
-def _is_tier_low_pending_ok(
-    latest_statuses: dict[str, dict],
-    context: str,
-    pr_labels: set[str],
-) -> bool:
-    """Return True if tier:low PR can tolerate sop-checklist pending state.
-    Per sop-checklist-config.yaml tier_failure_mode, tier:low uses soft-fail:
-    sop-checklist posts state=pending when acks are satisfied (missing
-    manager/ceo acks are informational only). The queue should accept
-    pending instead of waiting for success.
-    """
-    if "tier:low" not in pr_labels:
-        return False
-    if "sop-checklist" not in context:
-        return False
-    status = latest_statuses.get(context) or {}
-    return status_state(status) == "pending"
 def required_contexts_green(
     latest_statuses: dict[str, dict],
     contexts: list[str],
-    pr_labels: set[str] | None = None,
 ) -> tuple[bool, list[str]]:
     missing_or_bad: list[str] = []
     for context in contexts:
         status = latest_statuses.get(context)
         state = status_state(status or {})
         if state != "success":
-            if pr_labels and _is_tier_low_pending_ok(latest_statuses, context, pr_labels):
-                continue  # tier:low soft-fail: accept pending sop-checklist
             missing_or_bad.append(f"{context}={state or 'missing'}")
     return not missing_or_bad, missing_or_bad
@@ -237,7 +220,6 @@ def evaluate_merge_readiness(
     pr_status: dict,
     required_contexts: list[str],
     pr_has_current_base: bool,
-    pr_labels: set[str] | None = None,
 ) -> MergeDecision:
     # Check push-required contexts explicitly instead of combined state.
     # Combined state can be "failure" due to non-blocking jobs
@@ -257,7 +239,7 @@ def evaluate_merge_readiness(
     # The required_contexts list is the authoritative gate — it includes only
     # the checks that actually block merges.
     latest = latest_statuses_by_context(pr_status.get("statuses") or [])
-    ok, missing_or_bad = required_contexts_green(latest, required_contexts, pr_labels)
+    ok, missing_or_bad = required_contexts_green(latest, required_contexts)
     if not ok:
         return MergeDecision(False, "wait", "required contexts not green: " + ", ".join(missing_or_bad))
     return MergeDecision(True, "merge", "ready")
@@ -275,42 +257,54 @@ def get_branch_head(branch: str) -> str:
 def get_combined_status(sha: str) -> dict:
     """Combined status + all individual statuses for `sha`.
-    The /status endpoint caps the `statuses` array at 30 entries (Gitea
-    default page size), so we fetch the full list via /statuses with a
-    higher limit. The combined `state` still comes from /status.
+    The /status endpoint returns a `statuses` array capped at 30 entries.
+    We supplement it with /statuses (limit=100) for contexts not in the
+    base array. The combined `state` always comes from /status.
+    Returns the merged list sorted ASCENDING by id. Caller's
+    latest_statuses_by_context iterates ascending so the newest (largest
+    id) for each context is seen last and wins.
     """
     _, combined = api("GET", f"/repos/{OWNER}/{NAME}/commits/{sha}/status")
     if not isinstance(combined, dict):
         raise ApiError(f"status for {sha} response not object")
-    combined_statuses: list[dict] = combined.get("statuses") or []
+    base_statuses: list[dict] = combined.get("statuses") or []
+    all_entries: list[dict] = list(base_statuses)
     try:
-        _, all_statuses_raw = api(
+        _, statuses_list = api(
             "GET",
             f"/repos/{OWNER}/{NAME}/commits/{sha}/statuses",
-            query={"limit": "50"},
+            query={"limit": "100"},
         )
-        if isinstance(all_statuses_raw, list):
-            all_statuses: list[dict] = list(all_statuses_raw)
-        else:
-            all_statuses = []
+        if isinstance(statuses_list, list):
+            all_entries.extend(statuses_list)
     except (ApiError, urllib.error.URLError, TimeoutError, OSError) as exc:
         sys.stderr.write(f"::warning::could not fetch full statuses list for {sha[:8]}: {exc}\n")
-        all_statuses = []
-    # Build latest per context: process combined (ascending→reverse=newest
-    # first), then fill gaps from all_statuses (already newest-first).
-    latest: dict[str, dict] = {}
-    for status in reversed(sorted(combined_statuses, key=lambda s: s.get("id") or 0)):
-        ctx = status.get("context")
-        if isinstance(ctx, str) and ctx not in latest:
-            latest[ctx] = status
-    for status in all_statuses:
-        ctx = status.get("context")
-        if isinstance(ctx, str) and ctx not in latest:
-            latest[ctx] = status
-    combined["statuses"] = list(latest.values())
+    # Sort ascending by id. latest_statuses_by_context iterates ascending
+    # so the newest (largest id) entry for each context is seen last and wins.
+    all_entries.sort(key=lambda s: s.get("id") or 0)
+    combined["statuses"] = all_entries
     return combined
def _resolve_label_id(name: str) -> str | None:
"""Return the repo label ID for `name`, or None if not found.
Gitea's /issues endpoint with labels=<name> has a known quirk: when multiple
repo labels share the same name (e.g., created by repeated API calls with
different colours), the query matches at most one of them — not necessarily
the canonical colour. Resolving to ID sidesteps the ambiguity.
"""
_, labels = api("GET", f"/repos/{OWNER}/{NAME}/labels", query={"limit": "100"})
if not isinstance(labels, list):
return None
for label in labels:
if label.get("name") == name:
return str(label["id"])
return None
def list_queued_issues() -> list[dict]:
_, body = api(
"GET",
@@ -372,16 +366,7 @@ def merge_pull(pr_number: int, *, dry_run: bool) -> None:
print(f"::notice::merging PR #{pr_number}")
if dry_run:
return
try:
api("POST", f"/repos/{OWNER}/{NAME}/pulls/{pr_number}/merge", body=payload, expect_json=False)
except ApiError as exc:
# Re-raise permission-like errors so process_once can skip this PR.
# 403 = no push access, 404 = repo/pr not found, 405 = not allowed.
msg = str(exc)
for code in ("403", "404", "405"):
if code in msg:
raise MergePermissionError(msg) from exc
raise # re-raise other ApiErrors unchanged
api("POST", f"/repos/{OWNER}/{NAME}/pulls/{pr_number}/merge", body=payload, expect_json=False)
def process_once(*, dry_run: bool = False) -> int:
@@ -423,13 +408,11 @@ def process_once(*, dry_run: bool = False) -> int:
commits = get_pull_commits(pr_number)
current_base = pr_has_current_base(pr, commits, main_sha)
pr_status = get_combined_status(head_sha)
pr_labels = label_names(pr)
decision = evaluate_merge_readiness(
main_status=main_status,
pr_status=pr_status,
required_contexts=contexts,
pr_has_current_base=current_base,
pr_labels=pr_labels,
)
print(f"::notice::PR #{pr_number} decision={decision.action}: {decision.reason}")
@@ -454,23 +437,21 @@ def process_once(*, dry_run: bool = False) -> int:
return 0
try:
merge_pull(pr_number, dry_run=dry_run)
except MergePermissionError as exc:
# Permanent merge failure (HTTP 403/404/405). Post a comment so
# maintainers know why, then return 0 so this tick is done.
# The PR stays in the queue; future ticks can retry after the
# permission issue is resolved.
sys.stderr.write(f"::error::merge permission error for PR #{pr_number}: {exc}\n")
except ApiError as exc:
# Merge API errors (405 permission denied, 422 hook block, etc.)
# are NOT transient — retrying will not help. Surface the error
# on the PR immediately so it is visible without digging into
# workflow logs, and fail the workflow so it is distinguishable
# from a successful-no-op tick.
post_comment(
pr_number,
(
"merge-queue: merge failed with HTTP 405 'User not allowed to merge PR'. "
"No available token has Can-merge permission on this repo. "
"Fix: grant Can-merge to a token, or add a maintain/admin collaborator. "
"Skipping to next queued PR on next tick."
),
f"merge-queue: MERGE FAILED — {exc}. "
"This is a non-transient error (permission or hook issue). "
"See SEV-1 internal#487.",
dry_run=dry_run,
)
return 0
sys.stderr.write(f"::error::PR #{pr_number} merge failed: {exc}\n")
return 2 # distinct exit code so workflow run shows failure
return 0
return 0
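Reviewer note: the `get_combined_status` docstring above relies on one ordering rule. The merged list is sorted ascending by id, so a forward pass lets the newest entry for each context overwrite the older ones. A minimal sketch of that rule follows, assuming every status dict carries a numeric `id` and a string `context`. The name `latest_by_context` is hypothetical and is not the repo's `latest_statuses_by_context`.

```python
def latest_by_context(statuses):
    """Return the newest status per context, regardless of input order.

    Hypothetical helper for illustration only; it assumes each status is a
    dict with a numeric "id" and a string "context".
    """
    # Sort ascending by id so the newest entry for a context comes last.
    ordered = sorted(statuses, key=lambda s: s.get("id") or 0)
    latest = {}
    for status in ordered:
        ctx = status.get("context")
        if isinstance(ctx, str):
            # Later (larger id) entries overwrite earlier ones.
            latest[ctx] = status
    return latest


if __name__ == "__main__":
    sample = [
        {"id": 54, "context": "CI / all-required (pull_request)", "status": "success"},
        {"id": 18, "context": "CI / all-required (pull_request)", "status": "failure"},
    ]
    print(latest_by_context(sample)["CI / all-required (pull_request)"]["id"])
```

Sorting inside the helper makes the result independent of whether the caller passes the ascending /status array or the descending /statuses list, which is the property the two `test_latest_statuses_*` cases check.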
+2 -77
@@ -100,12 +100,11 @@ printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$CURL_AUTH_FILE"
# (bash trap 'function' EXIT expands variables at trap-fire time, not def time).
PR_JSON=$(mktemp)
REVIEWS_JSON=$(mktemp)
COMMENTS_JSON=$(mktemp)
TEAM_PROBE_TMP=$(mktemp)
NA_STATUSES_TMP="" # declared here so cleanup() always has the var
cleanup() {
rm -f "$CURL_AUTH_FILE" "$PR_JSON" "$REVIEWS_JSON" "$COMMENTS_JSON" "$TEAM_PROBE_TMP" "${NA_STATUSES_TMP-}"
rm -f "$CURL_AUTH_FILE" "$PR_JSON" "$REVIEWS_JSON" "$TEAM_PROBE_TMP" "${NA_STATUSES_TMP-}"
}
trap cleanup EXIT
@@ -207,81 +206,7 @@ CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILT
debug "candidate non-author approvers: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -z "$CANDIDATES" ]; then
# --- Guardrail (internal#503): explain the most common false
# "no candidates" red. Gitea's review event enum is EXACTLY
# APPROVED/REQUEST_CHANGES/COMMENT/PENDING. A wrong value ("APPROVE",
# lowercase, ...) is silently accepted (HTTP 200) and stored as
# state=PENDING. A correctly-started draft review has an EMPTY body;
# a NON-empty body + state==PENDING by a non-author == an intended
# verdict mis-filed by a wrong event string. Surface it actionably.
# This does NOT change the gate result (still fail-closed below) — it
# only converts a mystery red into a named, self-fixing error.
MISFILED_FILTER='.[]
| select(.state == "PENDING")
| select(.dismissed != true)
| select(.user.login != $author)
| select(((.body // "") | gsub("^\\s+|\\s+$";"") | length) > 0)
| "\(.id)\t\(.user.login)"'
MISFILED=$(jq -r --arg author "$PR_AUTHOR" "$MISFILED_FILTER" "$REVIEWS_JSON" 2>/dev/null || true)
if [ -n "$MISFILED" ]; then
echo "::error::${TEAM}-review: non-author review(s) were SUBMITTED but stored as PENDING — almost certainly the wrong Gitea review event string (internal#503)."
echo "::error::Gitea accepts ONLY the exact enum APPROVED / REQUEST_CHANGES / COMMENT. 'APPROVE' or lowercase is silently (HTTP 200) filed as PENDING and is invisible to this gate."
printf '%s\n' "$MISFILED" | while IFS="$(printf '\t')" read -r _rid _rl; do
[ -n "${_rid:-}" ] && echo "::error:: review id=${_rid} by '${_rl}': RE-SUBMIT via POST ${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}/reviews with {\"event\":\"APPROVED\"} (correct enum) — do NOT edit the DB."
done
fi
# --- Fallback (internal#348): check issue comments for agent-approval ---
# core-qa-agent and core-security-agent approve via issue comments, NOT
# the reviews API. The reviews API returns zero entries for comment-only
# approvals. This fallback reads PR issue comments and extracts logins that:
# 1. Posted a comment matching the agent-prefix pattern for this gate:
# qa → "[core-qa-agent] APPROVED"
# security → "[core-security-agent] APPROVED"
# OR posted a generic approval keyword (word-anchored, case-insensitive):
# APPROVED / LGTM / ACCEPTED
# 2. Are not the PR author
# 3. The team-membership probe below is the authoritative filter.
AGENT_PATTERN=""
case "$TEAM" in
qa) AGENT_PATTERN="\\[core-qa-agent\\]" ;;
security) AGENT_PATTERN="\\[core-security-agent\\]" ;;
esac
HTTP_CODE=$(curl -sS -o "$COMMENTS_JSON" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/repos/${OWNER}/${NAME}/issues/${PR_NUMBER}/comments")
debug "GET /issues/${PR_NUMBER}/comments → HTTP ${HTTP_CODE}"
if [ "$HTTP_CODE" = "200" ]; then
# JQ expression: select non-author comments that match either the
# agent-prefix pattern (case-insensitive) OR a generic approval keyword.
JQ_APPROVALS='
.[] |
select(.user.login != $author) |
. as $cmt |
if ($agent_pattern | length) > 0 and ($cmt.body // "" | test($agent_pattern; "i")) then
$cmt.user.login
elif ($cmt.body // "" | test("\\b(APPROVED|LGTM|ACCEPTED)\\b"; "i")) then
$cmt.user.login
else
empty
end
'
CANDIDATES=$(jq -r \
--arg author "$PR_AUTHOR" \
--arg agent_pattern "$AGENT_PATTERN" \
"$JQ_APPROVALS" \
"$COMMENTS_JSON" 2>/dev/null | sort -u)
debug "comment-based approval candidates: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -n "$CANDIDATES" ]; then
echo "::notice::${TEAM}-review: reviews API found no APPROVED reviews; found $(echo "$CANDIDATES" | wc -w | xargs) comment-based approval candidate(s) — verifying team membership..."
fi
else
debug "could not fetch issue comments (HTTP ${HTTP_CODE})"
fi
fi
if [ -z "${CANDIDATES:-}" ]; then
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates from reviews API or issue comments)"
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates yet)"
exit 1
fi
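Reviewer note: the guardrail removed in this hunk (internal#503) flagged reviews that a non-author submitted with a non-empty body but that Gitea stored as `PENDING`. The same selection can be sketched in Python as below, assuming the reviews payload is a list of dicts shaped like the fixture entries (`state`, `dismissed`, `user.login`, `body`, `id`). The function name `misfiled_pending_reviews` is hypothetical and does not exist in the repo.

```python
def misfiled_pending_reviews(reviews, author):
    """Return (review_id, login) pairs for likely mis-filed reviews.

    Hypothetical helper mirroring the jq filter in the diff: state is
    PENDING, the review is not dismissed, the reviewer is not the PR
    author, and the body is non-empty after trimming whitespace.
    """
    found = []
    for review in reviews:
        if review.get("state") != "PENDING":
            continue
        if review.get("dismissed") is True:
            continue
        login = (review.get("user") or {}).get("login")
        if login == author:
            continue
        # A correctly started draft review has an empty body; skip those.
        if not (review.get("body") or "").strip():
            continue
        found.append((review.get("id"), login))
    return found


if __name__ == "__main__":
    sample = [
        {"id": 7, "state": "PENDING", "dismissed": False,
         "user": {"login": "core-devops"}, "body": "Looks good"},
        {"id": 8, "state": "PENDING", "dismissed": False,
         "user": {"login": "alice"}, "body": "my own draft"},
    ]
    print(misfiled_pending_reviews(sample, author="alice"))
```

As in the original shell guardrail, a check like this would only explain a red result; it would not change the gate's outcome.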
+11 -2
@@ -830,9 +830,18 @@ def main(argv: list[str] | None = None) -> int:
# one membership lookup per team.
team_member_cache: dict[tuple[str, int], bool | None] = {}
def _required_teams_for(slug: str) -> list[str] | None:
"""Look up required_teams for a slug from checklist items OR N/A gates."""
if slug in items_by_slug:
return items_by_slug[slug]["required_teams"]
if slug in na_gates:
return na_gates[slug].get("required_teams", [])
return None
def probe(slug: str, users: list[str]) -> list[str]:
item = items_by_slug[slug]
team_names: list[str] = item["required_teams"]
team_names = _required_teams_for(slug)
if team_names is None:
raise KeyError(f"slug '{slug}' not found in items or N/A gates")
# Resolve names → ids. NOTE: orgs/{org}/teams/search may not be
# available — fall back to the list endpoint.
team_ids: list[int] = []
+1 -34
@@ -17,9 +17,6 @@ Scenarios:
T8_team_not_member — team membership → 404 (not a member) → exit 1
T9_team_403 — team membership → 403 (token not in team) → exit 1
T14_non_default_base — open PR targeting staging → script exits 0 (no-op)
T15_comments_agent_approval — reviews empty; comments have "[core-qa-agent] APPROVED" → exit 0
T16_comments_generic_approval — reviews empty; comments have "APPROVED" by team member → exit 0
T17_comments_no_approval — reviews empty; comments have no approval keywords → exit 1
Usage:
FIXTURE_STATE_DIR=/tmp/x python3 _review_check_fixture.py 8080
@@ -100,9 +97,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
# GET /repos/{owner}/{name}/pulls/{pr_number}/reviews
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/pulls/(\d+)/reviews$", path)
if m:
if sc in ("T4_reviews_empty", "T5_reviews_only_author",
"T15_comments_agent_approval", "T16_comments_generic_approval",
"T17_comments_no_approval"):
if sc in ("T4_reviews_empty", "T5_reviews_only_author"):
return self._json(200, [])
if sc == "T6_reviews_dismissed":
return self._json(200, [{
@@ -121,28 +116,6 @@ class Handler(http.server.BaseHTTPRequestHandler):
{"state": "APPROVED", "dismissed": False, "user": {"login": "core-devops"}, "commit_id": "abc1234"},
])
# GET /repos/{owner}/{name}/issues/{pr_number}/comments
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/issues/(\d+)/comments$", path)
if m:
if sc == "T15_comments_agent_approval":
return self._json(200, [
{"user": {"login": "core-qa-agent"}, "body": "[core-qa-agent] APPROVED this PR. Good changes.", "id": 1},
{"user": {"login": "alice"}, "body": "I authored this PR", "id": 2},
{"user": {"login": "random-user"}, "body": "Looks okay to me", "id": 3},
])
if sc == "T16_comments_generic_approval":
return self._json(200, [
{"user": {"login": "core-qa-agent"}, "body": "APPROVED — all acceptance criteria met", "id": 1},
{"user": {"login": "alice"}, "body": "-authored", "id": 2},
])
if sc == "T17_comments_no_approval":
return self._json(200, [
{"user": {"login": "alice"}, "body": "I authored this PR", "id": 1},
{"user": {"login": "random-user"}, "body": "Looks okay to me", "id": 2},
])
# Default scenarios (T1-T9, T14): no comments
return self._json(200, [])
# GET /teams/{team_id}/members/{username}
m = re.match(r"^/api/v1/teams/(\d+)/members/([^/]+)$", path)
if m:
@@ -154,12 +127,6 @@ class Handler(http.server.BaseHTTPRequestHandler):
# T7_team_member: member
return self._empty(204)
# GET /repos/{owner}/{name}/statuses/{sha} — for N/A declaration check
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/statuses/([a-f0-9]+)$", path)
if m:
# All comment-based scenarios have no N/A declarations
return self._json(200, [])
return self._json(404, {"path": path, "msg": "fixture: no route"})
def do_POST(self):
+74 -11
@@ -1,6 +1,7 @@
import importlib.util
import sys
from pathlib import Path
from unittest.mock import patch
SCRIPT = Path(__file__).resolve().parents[1] / "gitea-merge-queue.py"
@@ -10,16 +11,37 @@ sys.modules[spec.name] = mq
spec.loader.exec_module(mq)
def test_latest_statuses_dedupes_by_context_newest_first():
def test_latest_statuses_ascending_input_newest_wins():
# Gitea /status (base array) returns ascending id order (oldest first).
# Forward iteration processes oldest first, newest last → newest overwrites.
statuses = [
{"context": "CI / all-required (pull_request)", "status": "failure"},
{"context": "sop-checklist / all-items-acked (pull_request)", "state": "success"},
{"context": "CI / all-required (pull_request)", "status": "success"},
{"id": 18, "context": "CI / all-required (pull_request)", "status": "failure"}, # oldest
{"id": 27, "context": "sop-checklist / all-items-acked (pull_request)", "state": "success"},
{"id": 54, "context": "CI / all-required (pull_request)", "status": "success"}, # newest
]
latest = mq.latest_statuses_by_context(statuses)
assert latest["CI / all-required (pull_request)"]["status"] == "failure"
assert latest["CI / all-required (pull_request)"]["status"] == "success"
assert latest["CI / all-required (pull_request)"]["id"] == 54
assert latest["sop-checklist / all-items-acked (pull_request)"]["state"] == "success"
def test_latest_statuses_guard_reverses_descending_input():
# Gitea /statuses returns descending id order (newest first: id=54 → id=1).
# Guard detects descending and reverses so we iterate ascending.
# Forward on reversed = newest (id=54) is last → overwrites oldest.
statuses = [
{"id": 54, "context": "CI / all-required (pull_request)", "status": "success"}, # newest
{"id": 27, "context": "sop-checklist / all-items-acked (pull_request)", "state": "success"},
{"id": 18, "context": "CI / all-required (pull_request)", "status": "failure"}, # oldest
]
latest = mq.latest_statuses_by_context(statuses)
# Guard reverses descending → asc iteration: 18 first, 27, 54 last → 54 wins.
assert latest["CI / all-required (pull_request)"]["status"] == "success"
assert latest["CI / all-required (pull_request)"]["id"] == 54
assert latest["sop-checklist / all-items-acked (pull_request)"]["state"] == "success"
@@ -120,11 +142,52 @@ def test_merge_decision_updates_stale_pr_before_merge():
assert decision.action == "update"
def test_MergePermissionError_inherits_from_ApiError():
assert issubclass(mq.MergePermissionError, mq.ApiError)
def test_merge_failure_returns_nonzero_and_posts_comment(monkeypatch):
"""When merge_pull raises ApiError (e.g. HTTP 405 permission denied),
process_once returns exit code 2 (non-zero) and posts a comment on the PR.
This distinguishes merge-permission errors from successful-no-op ticks."""
captured_comment = {}
def fake_post_comment(pr_number, body, *, dry_run):
captured_comment["pr_number"] = pr_number
captured_comment["body"] = body
def test_MergePermissionError_message_preserved():
exc = mq.MergePermissionError("POST /merge -> HTTP 405: User not allowed")
assert "405" in str(exc)
assert "User not allowed" in str(exc)
# Replace functions directly on the module object so process_once()
# (which looks them up by name at call time) picks up the fakes.
mq.list_queued_issues = lambda: [{
"number": 42,
"created_at": "2026-05-17T00:00:00Z",
"labels": [{"name": "merge-queue"}],
"pull_request": {},
}]
mq.get_pull = lambda n: {
"state": "open",
"base": {"ref": "main", "repo_id": 1},
"head": {"sha": "headsha", "repo_id": 1},
"merge_base": "abc123def",
}
mq.get_pull_commits = lambda n: [{"sha": "headsha"}]
mq.get_branch_head = lambda branch: "abc123def"
mq.get_combined_status = lambda sha: {
"state": "success",
"statuses": [{"context": "CI / all-required (push)", "status": "success"}],
}
mq.latest_statuses_by_context = lambda s: {
"CI / all-required (pull_request)": {"status": "success"},
"sop-checklist / all-items-acked (pull_request)": {"status": "success"},
}
mq.required_contexts_green = lambda statuses, contexts: (True, [])
mq.post_comment = fake_post_comment
# Simulate merge failing with HTTP 405 (permission denied).
# The ApiError raised by api() is caught inside process_once().
merge_error = mq.ApiError(
"POST /repos/x/y/pulls/42/merge -> HTTP 405: User not allowed to merge PR"
)
with patch.object(mq, "merge_pull", side_effect=merge_error):
exit_code = mq.process_once(dry_run=False)
assert exit_code == 2, f"Expected exit code 2, got {exit_code}"
assert captured_comment["pr_number"] == 42
assert "MERGE FAILED" in captured_comment["body"]
assert "405" in captured_comment["body"]
-25
@@ -334,31 +334,6 @@ assert_contains "T12 jq: core-devops (non-author APPROVED) in candidates" "core-
assert_eq "T12 jq: alice (author) NOT in candidates" "" "$(echo "$T12_CANDIDATES" | grep '^alice$' || true)"
assert_eq "T12 jq: carol (dismissed) NOT in candidates" "" "$(echo "$T12_CANDIDATES" | grep '^carol$' || true)"
# T15 — comment-based approval via agent prefix pattern → exit 0
echo
echo "== T15 comment agent-prefix approval =="
T15_OUT=$(run_review_check "T15_comments_agent_approval")
T15_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T15 exit code 0 (agent-comment approval + team member)" "0" "$T15_RC"
assert_contains "T15 comment fallback notice" "comment-based approval" "$T15_OUT"
assert_contains "T15 core-qa-agent APPROVED" "APPROVED by core-qa-agent" "$T15_OUT"
# T16 — comment-based approval via generic APPROVED keyword → exit 0
echo
echo "== T16 comment generic keyword approval =="
T16_OUT=$(run_review_check "T16_comments_generic_approval")
T16_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T16 exit code 0 (generic-approval comment + team member)" "0" "$T16_RC"
assert_contains "T16 comment fallback notice" "comment-based approval" "$T16_OUT"
# T17 — no approval keywords in comments → exit 1
echo
echo "== T17 comments with no approval keywords =="
T17_OUT=$(run_review_check "T17_comments_no_approval")
T17_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T17 exit code 1 (no candidates from comments)" "1" "$T17_RC"
assert_contains "T17 no candidates error" "no candidates from reviews API or issue comments" "$T17_OUT"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL"
@@ -603,3 +603,51 @@ class TestComputeNaState(unittest.TestCase):
self.assertEqual(na_directives[0][0], "sop-n/a")
self.assertEqual(na_directives[0][1], "qa-review")
self.assertIn("no surface", na_directives[0][2])
class TestProbeNaGateFallback(unittest.TestCase):
"""Regression test: probe() must handle gate names (qa-review, security-review)
from N/A gates without raising KeyError.
mc#1389: compute_na_state calls probe(gate_name, [user]) where gate_name is
a gate name like 'qa-review' — NOT a checklist item slug. The probe must
resolve the gate's required_teams from na_gates, not raise KeyError from
items_by_slug lookup.
"""
def test_probe_resolves_gate_name_from_na_gates(self):
cfg = sop.load_config(CONFIG_PATH)
items = cfg["items"]
items_by_slug = {it["slug"]: it for it in items}
na_gates = cfg.get("n/a_gates", {})
# Reconstruct the _required_teams_for helper from sop-checklist.py
def _required_teams_for(slug):
if slug in items_by_slug:
return items_by_slug[slug]["required_teams"]
if slug in na_gates:
return na_gates[slug].get("required_teams", [])
return None
# Gate names should resolve from na_gates
self.assertEqual(
_required_teams_for("qa-review"),
["qa", "security", "engineers"],
)
self.assertEqual(
_required_teams_for("security-review"),
["security", "managers", "ceo"],
)
# Checklist item slugs should still resolve from items_by_slug
self.assertEqual(
_required_teams_for("comprehensive-testing"),
["qa", "engineers"],
)
self.assertEqual(
_required_teams_for("root-cause"),
["managers", "ceo"],
)
# Unknown slug should return None (not raise KeyError)
self.assertIsNone(_required_teams_for("nonexistent-slug"))
+1 -61
@@ -158,68 +158,8 @@ jobs:
echo "NOTE: No warning in output (may be suppressed by log level)"
fi
- name: Reproduce openclaw failure — pipe held OPEN, no EOF
run: |
set -euo pipefail
echo "=== keep-stdin-open pipe (the real openclaw / Claude Code case) ==="
echo ""
echo "Before the readline() fix this HANGS: main() did"
echo " stdin.read(65536) -> on a pipe, blocks until 64KB OR EOF."
echo "An MCP client sends one ~150B initialize and keeps stdin"
echo "open waiting for the response, so the server never parsed"
echo "the request and the client timed out (openclaw: 'MCP error"
echo "-32000: Connection closed'). The earlier regular-file /"
echo "heredoc-pipe steps PASSED through this bug because a file"
echo "(or a closing heredoc) yields EOF immediately."
echo ""
# Drive the server through a real pipe that stays OPEN: write
# one initialize, do NOT close stdin, and require a response
# within a hard timeout. read(65536) -> no output -> timeout
# kills it -> FAIL. readline() -> immediate response -> PASS.
python - <<'PYEOF'
import json, subprocess, sys, time, select
proc = subprocess.Popen(
[sys.executable, "a2a_mcp_server.py"],
stdin=subprocess.PIPE, stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
env={**__import__("os").environ},
)
req = json.dumps({
"jsonrpc": "2.0", "id": 1, "method": "initialize",
"params": {"protocolVersion": "2024-11-05",
"capabilities": {},
"clientInfo": {"name": "keepopen", "version": "1"}},
}) + "\n"
proc.stdin.write(req.encode())
proc.stdin.flush()
# Deliberately DO NOT close proc.stdin — mirror a live MCP client.
deadline = time.time() + 15
line = b""
while time.time() < deadline:
r, _, _ = select.select([proc.stdout], [], [], 1)
if r:
line = proc.stdout.readline()
if line:
break
proc.kill()
if not line:
print("FAIL: no response within 15s on an open pipe — "
"stdin.read(65536) regression is back")
sys.exit(1)
resp = json.loads(line.decode())
assert resp.get("id") == 1 and "result" in resp, \
f"unexpected response: {line[:200]!r}"
assert resp["result"]["serverInfo"]["name"] == "molecule", \
f"wrong serverInfo: {line[:200]!r}"
print("PASS: server answered initialize on a still-open pipe")
PYEOF
- name: Run unit tests for stdio transport
run: |
set -euo pipefail
echo "=== Running stdio transport unit tests ==="
python -m pytest tests/test_a2a_mcp_server.py::TestStdioPipeAssertion tests/test_a2a_mcp_server.py::TestStdioKeepOpenPipe -v --no-cov
python -m pytest tests/test_a2a_mcp_server.py::TestStdioPipeAssertion -v --no-cov
+5 -7
@@ -538,13 +538,11 @@ jobs:
all-required:
# Aggregator sentinel — RFC internal#219 §2 (Phase 4 — closes internal#286).
#
# Emits `CI / all-required (<event>)` where <event> is the workflow trigger
# (e.g. `CI / all-required (pull_request)`, `CI / all-required (push)`).
# Branch protection MUST be updated to require the event-suffixed name —
# requiring `CI / all-required` (bare, no suffix) silently blocks all merges
# because Gitea treats absent status contexts as pending (not skipped), and
# no workflow emits the bare name. Fixed: BP now requires
# `CI / all-required (pull_request)` per issue #1473.
# Single stable required-status name that branch protection points at;
# CI churns underneath in `needs:` without any protection edits. Mirrors
# the molecule-controlplane Phase 2a impl shipped in CP PR#112 and
# referenced by `internal#286` ("Phase 4 is a single small PR... mirrors
# CP's existing one").
#
# Closes the failure mode where status_check_contexts on molecule-core/main
# only listed `Secret scan` + `sop-tier-check` (the 2 meta-gates), so real
+1 -1
@@ -262,7 +262,7 @@ jobs:
- name: Upload Playwright report
if: failure() && needs.detect-changes.outputs.chat == 'true'
uses: actions/upload-artifact@v3.2.2
uses: actions/upload-artifact@c6a366c94c3e0affe28c06c8df20a878f24da3cf # v3.2.2
with:
name: playwright-report-chat
path: canvas/playwright-report/
-4
@@ -52,9 +52,5 @@ jobs:
# explicitly instead of the combined state avoids false-pause when
# non-blocking jobs (continue-on-error: true) have failed — those
# failures pollute combined state but do not gate merges.
# NOTE: the event-suffixed context name is intentional — branch protection
# MUST require `CI / all-required (pull_request)` (with suffix), NOT the
# bare `CI / all-required`. Gitea treats absent contexts as pending, not
# skipped; requiring the bare name silently blocks all merges (issue #1473).
PUSH_REQUIRED_CONTEXTS: CI / all-required (push)
run: python3 .gitea/scripts/gitea-merge-queue.py
@@ -1,88 +0,0 @@
name: Lint shellcheck (arm64 pilot)
# Mac-CI dual-track pilot (#233). ADDITIVE / NOT REQUIRED.
#
# Validates the arm64 self-hosted lane (no docker.sock, no privileged
# ops) before any required gate moves onto it. Until a Mac arm64 runner
# is registered with the `arm64` label, this workflow sits PENDING —
# that is FINE: `arm64` is NOT in branch_protections required contexts.
#
# Pairs with internal#543 (RFC: Mac arm64 multi-arch runner-base).
# No paths: filter on purpose (feedback_path_filtered_workflow_cant_be_required).
on:
pull_request:
branches:
- main
- staging
push:
branches:
- main
permissions:
contents: read
jobs:
shellcheck-arm64:
name: shellcheck-arm64 (pilot)
runs-on: [self-hosted, arm64]
# NOT a required check; safe to sit pending until Mac runner is up.
# If the Mac runner has trouble pulling actions/checkout we fall
# back to a plain git clone (see step 'fallback clone').
timeout-minutes: 10
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
steps:
- name: Identify runner
run: |
set -eu
echo "arch=$(uname -m)"
echo "kernel=$(uname -sr)"
echo "shell=$BASH_VERSION"
# Sanity: must actually be arm64. If amd64 sneaks in here,
# fail fast — that means the label routing is wrong.
case "$(uname -m)" in
aarch64|arm64) echo "arm64 confirmed" ;;
*) echo "ERROR: expected arm64, got $(uname -m)"; exit 1 ;;
esac
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 1
- name: Install shellcheck (arm64)
run: |
set -eu
if command -v shellcheck >/dev/null 2>&1; then
echo "shellcheck already present: $(shellcheck --version | head -1)"
else
# Prefer apt if the runner base ships it; else download arm64 binary.
if command -v apt-get >/dev/null 2>&1; then
sudo apt-get update -qq
sudo apt-get install -y --no-install-recommends shellcheck
else
SC_VER=v0.10.0
curl -fsSL "https://github.com/koalaman/shellcheck/releases/download/${SC_VER}/shellcheck-${SC_VER}.linux.aarch64.tar.xz" \
| tar -xJf - --strip-components=1
sudo mv shellcheck /usr/local/bin/
fi
fi
shellcheck --version | head -2
- name: Run shellcheck on .gitea/scripts/*.sh
run: |
set -eu
# Only the scripts we control under .gitea/scripts. Pilot
# scope is intentionally narrow — broaden in a follow-up
# once the lane is proven.
mapfile -t TARGETS < <(find .gitea/scripts -maxdepth 2 -type f -name '*.sh' | sort)
if [ "${#TARGETS[@]}" -eq 0 ]; then
echo "No .sh files found under .gitea/scripts — nothing to check"
exit 0
fi
echo "Checking ${#TARGETS[@]} file(s):"
printf ' %s\n' "${TARGETS[@]}"
# SC1091 = couldn't follow non-constant source; expected for
# CI-time analysis without the full runtime layout.
shellcheck --severity=error --exclude=SC1091 "${TARGETS[@]}"
+4 -19
@@ -104,7 +104,7 @@ jobs:
with:
python-version: "3.11"
- name: Compute next version from PyPI latest and existing tags
- name: Compute next version from PyPI latest
id: bump
run: |
set -eu
@@ -112,24 +112,9 @@ jobs:
| python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])")
MAJOR=$(echo "$LATEST" | cut -d. -f1)
MINOR=$(echo "$LATEST" | cut -d. -f2)
TAG_LATEST=$(git tag --list "runtime-v${MAJOR}.${MINOR}.*" \
| sed -E 's/^runtime-v//' \
| grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' \
| sort -V \
| tail -1 || true)
VERSION=$(PYPI_LATEST="$LATEST" TAG_LATEST="$TAG_LATEST" python - <<'PY'
import os
def parse(v):
return tuple(int(part) for part in v.split("."))
pypi = os.environ["PYPI_LATEST"]
tag = os.environ.get("TAG_LATEST") or pypi
base = max(parse(pypi), parse(tag))
print(f"{base[0]}.{base[1]}.{base[2] + 1}")
PY
)
echo "PyPI latest=$LATEST, latest runtime tag=${TAG_LATEST:-none} -> next=$VERSION"
PATCH=$(echo "$LATEST" | cut -d. -f3)
VERSION="${MAJOR}.${MINOR}.$((PATCH+1))"
echo "PyPI latest=$LATEST -> next=$VERSION"
if ! echo "$VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then
echo "::error::computed version $VERSION does not match PEP 440 X.Y.Z"
exit 1
-1
@@ -89,7 +89,6 @@ on:
permissions:
contents: read
pull-requests: read
secrets: read
jobs:
# bp-exempt: PR review bot signal; required merge state is enforced by CI / all-required.
-13
@@ -30,11 +30,6 @@ jobs:
scan:
name: Scan diff for credential-shaped strings
runs-on: ubuntu-latest
# Hard CI gate — must complete or the PR is unmergable. 10-minute ceiling
# is generous for a diff-scan against a single SHA. If this times out, the
# runner is frozen and holding a slot — the step timeout triggers clean
# failure, releasing the runner for the next job.
timeout-minutes: 10
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
@@ -138,14 +133,6 @@ jobs:
[ -z "$f" ] && continue
[ "$f" = "$SELF_GITHUB" ] && continue
[ "$f" = "$SELF_GITEA" ] && continue
# Test-fixture exclude (internal#425): the secrets-detector's OWN
# unit-test corpus deliberately embeds credential-SHAPED example
# strings to exercise the detector. Verified 2026-05-18 synthetic
# (fabricated ghp_* fixtures, not real). Without this the scanner
# self-trips on its own fixtures and fail-closes every deploy.
# Same rationale as the SELF_* excludes above; gate NOT weakened
# (all other paths still fully scanned).
[ "$f" = "workspace-server/internal/secrets/patterns_test.go" ] && continue
if [ -n "$DIFF_RANGE" ]; then
ADDED=$(git diff --no-color --unified=0 "$BASE" "$HEAD" -- "$f" 2>/dev/null | grep -E '^\+[^+]' || true)
else
-1
@@ -16,7 +16,6 @@ on:
permissions:
contents: read
pull-requests: read
secrets: read
jobs:
# bp-exempt: PR security review bot signal; required merge state is enforced by CI / all-required.
+4 -1
@@ -84,8 +84,11 @@ on:
permissions:
contents: read
pull-requests: read
# NOTE: `statuses: write` is the GitHub-Actions name for POST /statuses.
# Gitea 1.22.6 may not gate on this permission key (it just checks the
# token), but listing it explicitly documents intent for the next
# platform-version upgrade.
statuses: write
secrets: read
jobs:
all-items-acked:
-1
@@ -71,7 +71,6 @@ jobs:
permissions:
contents: read
pull-requests: read
secrets: read
steps:
- name: Check out base branch (for the script)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+255
@@ -0,0 +1,255 @@
name: canary-verify
# Runs the canary smoke suite against the staging canary tenant fleet
# after a new :staging-<sha> image lands in ECR. On green, calls the
# CP redeploy-fleet endpoint to promote :staging-<sha> → :latest so
# the prod tenant fleet's 5-minute auto-updater picks up the verified
# digest. On red, :latest stays on the prior known-good digest and
# prod is untouched.
#
# Registry note (2026-05-10): This workflow previously used GHCR
# (ghcr.io/molecule-ai/platform-tenant) — that registry was retired
# during the 2026-05-06 Gitea suspension migration when publish-
# workspace-server-image.yml switched to the operator's ECR org
# (153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/
# platform-tenant). The GHCR → ECR migration was never applied to
# this file, so canary-verify was silently smoke-testing the stale
# GHCR image while the actual staging/prod tenants ran the ECR image.
# Result: smoke tests could not catch a broken ECR build. Fix:
# - Wait step: reads SHA from running canary /health (tenant-
# agnostic, works regardless of registry).
# - Promote step: calls CP redeploy-fleet endpoint with target_tag=
# staging-<sha>, same mechanism as redeploy-tenants-on-main.yml.
# No longer attempts GHCR crane ops.
#
# Dependencies:
# - publish-workspace-server-image.yml publishes :staging-<sha>
# to ECR on staging and main merges.
# - Canary tenants are configured to pull :staging-<sha> from ECR
# (TENANT_IMAGE env set to the ECR :staging-<sha> tag).
# - Repo secrets CANARY_TENANT_URLS / CANARY_ADMIN_TOKENS /
# CANARY_CP_SHARED_SECRET are populated.
on:
workflow_run:
workflows: ["publish-workspace-server-image"]
types: [completed]
workflow_dispatch:
permissions:
contents: read
packages: write
actions: read
env:
# ECR registry (post-2026-05-06 SSOT for tenant images).
# publish-workspace-server-image.yml pushes here.
IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform
TENANT_IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant
# CP endpoint for redeploy-fleet (used in promote step below).
CP_URL: ${{ vars.CP_URL || 'https://staging-api.moleculesai.app' }}
jobs:
canary-smoke:
# Skip when the upstream workflow failed — no image to test against.
if: ${{ github.event.workflow_run.conclusion == 'success' || github.event_name == 'workflow_dispatch' }}
runs-on: ubuntu-latest
outputs:
sha: ${{ steps.compute.outputs.sha }}
smoke_ran: ${{ steps.smoke.outputs.ran }}
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Compute sha
id: compute
run: echo "sha=${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
- name: Wait for canary tenants to pick up :staging-<sha>
# Poll canary health endpoints every 30s for up to 7 min instead
# of a fixed 6-min sleep. Exits as soon as ALL canaries report
# the new SHA (~2-3 min typical vs 6 min fixed). Falls back to
# proceeding after 7 min even if not all canaries responded —
# the smoke suite will catch any that didn't update.
#
# NOTE: The SHA is read from the running tenant's /health response,
# NOT from a registry lookup. This is registry-agnostic and works
# regardless of whether the tenant pulls from ECR, GHCR, or any
# other registry — the canary is telling us what it's actually
# running, which is the ground truth for smoke testing.
env:
CANARY_TENANT_URLS: ${{ secrets.CANARY_TENANT_URLS }}
EXPECTED_SHA: ${{ steps.compute.outputs.sha }}
run: |
if [ -z "$CANARY_TENANT_URLS" ]; then
echo "No canary URLs configured — falling back to 60s wait"
sleep 60
exit 0
fi
IFS=',' read -ra URLS <<< "$CANARY_TENANT_URLS"
MAX_WAIT=420 # 7 minutes
INTERVAL=30
ELAPSED=0
while [ $ELAPSED -lt $MAX_WAIT ]; do
ALL_READY=true
for url in "${URLS[@]}"; do
HEALTH=$(curl -s --max-time 5 "${url}/health" 2>/dev/null || echo "{}")
SHA=$(echo "$HEALTH" | grep -o "\"sha\":\"[^\"]*\"" | head -1 | cut -d'"' -f4)
if [ "$SHA" != "$EXPECTED_SHA" ]; then
ALL_READY=false
break
fi
done
if $ALL_READY; then
echo "All canaries running staging-${EXPECTED_SHA} after ${ELAPSED}s"
exit 0
fi
echo "Waiting for canaries... (${ELAPSED}s / ${MAX_WAIT}s)"
sleep $INTERVAL
ELAPSED=$((ELAPSED + INTERVAL))
done
echo "Timeout after ${MAX_WAIT}s — proceeding anyway (smoke suite will validate)"
- name: Run canary smoke suite
id: smoke
# Graceful-skip when no canary fleet is configured (Phase 2 not yet
# stood up — see molecule-controlplane/docs/canary-tenants.md).
# Sets `ran=false` on skip so promote-to-latest stays off (we don't
# want every main merge auto-promoting without gating). Manual
# promote-latest.yml is the release gate while canary is absent.
# Once the fleet is real: delete the early-exit branch.
env:
CANARY_TENANT_URLS: ${{ secrets.CANARY_TENANT_URLS }}
CANARY_ADMIN_TOKENS: ${{ secrets.CANARY_ADMIN_TOKENS }}
CANARY_CP_BASE_URL: https://staging-api.moleculesai.app
CANARY_CP_SHARED_SECRET: ${{ secrets.CANARY_CP_SHARED_SECRET }}
run: |
set -euo pipefail
if [ -z "${CANARY_TENANT_URLS:-}" ] \
|| [ -z "${CANARY_ADMIN_TOKENS:-}" ] \
|| [ -z "${CANARY_CP_SHARED_SECRET:-}" ]; then
{
echo "## ⚠️ canary-verify skipped"
echo
echo "One or more canary secrets are unset (\`CANARY_TENANT_URLS\`, \`CANARY_ADMIN_TOKENS\`, \`CANARY_CP_SHARED_SECRET\`)."
echo "Phase 2 canary fleet has not been stood up yet —"
echo "see [canary-tenants.md](https://git.moleculesai.app/molecule-ai/molecule-controlplane/blob/main/docs/canary-tenants.md)."
echo
echo "**Skipped — promote-to-latest will NOT auto-fire.** Dispatch \`promote-latest.yml\` manually when ready."
} >> "$GITHUB_STEP_SUMMARY"
echo "ran=false" >> "$GITHUB_OUTPUT"
echo "::notice::canary-verify: skipped — no canary fleet configured"
exit 0
fi
bash scripts/canary-smoke.sh
echo "ran=true" >> "$GITHUB_OUTPUT"
- name: Summary on failure
if: ${{ failure() }}
run: |
{
echo "## Canary smoke FAILED"
echo
echo "Canary tenants rejected image \`staging-${{ steps.compute.outputs.sha }}\`."
echo ":latest stays pinned to the prior good digest — prod is untouched."
echo
echo "Fix forward and merge again, or investigate the specific failed"
echo "assertions in the canary-smoke step log above."
} >> "$GITHUB_STEP_SUMMARY"
promote-to-latest:
# On green, calls the CP redeploy-fleet endpoint with target_tag=
# staging-<sha> to promote the verified ECR image. This is the same
# mechanism as redeploy-tenants-on-main.yml — no GHCR crane ops.
#
# Pre-fix history: the old GHCR promote step used `crane tag` against
# ghcr.io/molecule-ai/platform-tenant, but publish-workspace-server-
# image.yml had already migrated to ECR on 2026-05-07 (commit
# 10e510f5). The GHCR tags were never updated, so this step was
# silently promoting a stale GHCR image while actual prod tenants
# pulled from ECR. Canary smoke tests were GHCR-targeted and could
# not catch a broken ECR build.
needs: canary-smoke
if: ${{ needs.canary-smoke.result == 'success' && needs.canary-smoke.outputs.smoke_ran == 'true' }}
runs-on: ubuntu-latest
env:
SHA: ${{ needs.canary-smoke.outputs.sha }}
CP_URL: ${{ vars.CP_URL || 'https://staging-api.moleculesai.app' }}
# CP_ADMIN_API_TOKEN gates write access to the redeploy endpoint.
# Stored at the repo level so all workflows pick it up automatically.
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
# canary_slug pin: deploy the verified :staging-<sha> to the canary
# first (soak 120s), then fan out to the rest of the fleet.
CANARY_SLUG: ${{ vars.CANARY_PROMOTE_SLUG || '' }}
SOAK_SECONDS: ${{ vars.CANARY_PROMOTE_SOAK || '120' }}
BATCH_SIZE: ${{ vars.CANARY_PROMOTE_BATCH || '3' }}
steps:
- name: Check CP credentials
run: |
if [ -z "${CP_ADMIN_API_TOKEN:-}" ]; then
echo "::error::CP_ADMIN_API_TOKEN secret is not set — promote step cannot call redeploy-fleet."
echo "::error::Set it at: repo Settings → Actions → Variables and Secrets → New Secret."
exit 1
fi
- name: Promote verified ECR image to :latest
run: |
set -euo pipefail
TARGET_TAG="staging-${SHA}"
BODY=$(jq -nc \
--arg tag "$TARGET_TAG" \
--argjson soak "${SOAK_SECONDS:-120}" \
--argjson batch "${BATCH_SIZE:-3}" \
--argjson dry false \
'{
target_tag: $tag,
soak_seconds: $soak,
batch_size: $batch,
dry_run: $dry
}')
if [ -n "${CANARY_SLUG:-}" ]; then
BODY=$(jq '. * {canary_slug: $slug}' --arg slug "$CANARY_SLUG" <<<"$BODY")
fi
echo "Calling: POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " target_tag: $TARGET_TAG"
echo " body: $BODY"
HTTP_RESPONSE=$(mktemp)
HTTP_CODE_FILE=$(mktemp)
set +e
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" >"$HTTP_CODE_FILE"
CURL_EXIT=$?
set -e
HTTP_CODE=$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE (curl exit $CURL_EXIT)"
cat "$HTTP_RESPONSE" | jq . || cat "$HTTP_RESPONSE"
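# Treat a curl transport failure (non-zero exit, HTTP "000") as a
# failure too; "000" is numerically below 400 and would otherwise pass.
if [ "$CURL_EXIT" -ne 0 ] || [ "$HTTP_CODE" -lt 200 ] || [ "$HTTP_CODE" -ge 400 ]; then
echo "::error::CP redeploy-fleet returned HTTP $HTTP_CODE (curl exit $CURL_EXIT); refusing to proceed."
exit 1
fi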
- name: Summary
run: |
{
echo "## Canary verified — :latest promoted via CP redeploy-fleet"
echo ""
echo "- **Target tag:** \`staging-${{ needs.canary-smoke.outputs.sha }}\`"
echo "- **Registry:** ECR (\`${TENANT_IMAGE_NAME}\`)"
echo "- **Canary slug:** \`${CANARY_SLUG:-<none>}\` (soak ${SOAK_SECONDS}s)"
echo "- **Batch size:** ${BATCH_SIZE:-3}"
echo ""
echo "CP redeploy-fleet is rolling out the verified image across the prod fleet."
echo "The fleet's 5-minute health-check loop will pick up the update automatically."
} >> "$GITHUB_STEP_SUMMARY"
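The wait and promote steps of canary-verify above can be sketched as two small shell functions. This is a minimal sketch under stated assumptions, not the workflow's own code: `extract_sha` and `redeploy_call_ok` are hypothetical names, the `/health` body is assumed to be compact JSON with a top-level `"sha"` string, and the stricter success rule (curl exit 0 plus a 2xx/3xx status) is an assumption about what the promote gate is meant to enforce.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the canary-verify logic; these names
# are illustrative and do not exist in the workflow.

# extract_sha BODY
# Pull the first "sha":"..." value out of a compact JSON health body,
# using the same grep/cut shape as the wait step. Prints nothing when
# the field is absent.
extract_sha() {
  printf '%s' "$1" | grep -o '"sha":"[^"]*"' | head -1 | cut -d'"' -f4 || true
}

# redeploy_call_ok CURL_EXIT HTTP_CODE
# Succeeds only when curl itself exited 0 and the status is 200-399.
# A connection failure leaves HTTP_CODE at "000", which is below 400, so
# a plain ">= 400" check on its own would let it through.
redeploy_call_ok() {
  local curl_exit="$1" http_code="$2"
  [ "$curl_exit" -eq 0 ] && [ "$http_code" -ge 200 ] && [ "$http_code" -lt 400 ]
}
```

A step could then gate on `redeploy_call_ok "$CURL_EXIT" "$HTTP_CODE" || exit 1` after reading the status code from its tempfile.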
@@ -0,0 +1,400 @@
name: redeploy-tenants-on-main
# Auto-refresh prod tenant EC2s after every main merge.
#
# Why this workflow exists: publish-workspace-server-image builds and
# pushes a new platform-tenant :<sha> to ECR on every merge to main,
# but running tenants pulled their image once at boot and never re-pull.
# Users see stale code indefinitely.
#
# This workflow closes the gap by calling the control-plane admin
# endpoint that performs a canary-first, batched, health-gated rolling
# redeploy across every live tenant. Implemented in molecule-ai/
# molecule-controlplane as POST /cp/admin/tenants/redeploy-fleet
# (feat/tenant-auto-redeploy, landing alongside this workflow).
#
# Registry: ECR (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/platform-tenant). GHCR was retired 2026-05-07 during the
# Gitea suspension migration. The canary-verify.yml promote step now
# uses the same redeploy-fleet endpoint (fixes the silent-GHCR gap).
#
# Runtime ordering:
# 1. publish-workspace-server-image completes → new :staging-<sha> in ECR.
# 2. This workflow fires via workflow_run and resolves
# target_tag=staging-<sha>. No CDN propagation wait is needed:
# the ECR image manifest is consistent immediately after push.
# 3. Calls redeploy-fleet with canary_slug (if set) and a soak
# period. Canary proves the image boots; batches follow.
# 4. Any failure aborts the rollout and leaves older tenants on the
# prior image — safer default than half-and-half state.
#
# Rollback path: re-run this workflow with a specific SHA pinned via
# the workflow_dispatch input. That calls redeploy-fleet with
# target_tag=<sha>, re-pulling the older image on every tenant.
on:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [main]
workflow_dispatch:
inputs:
target_tag:
# Empty default → auto-trigger and dispatch-without-input both
# resolve to `staging-<short_head_sha>` (the digest publish-image
# just pushed). Pre-fix this defaulted to 'latest', which only
# gets retagged by canary-verify's promote-to-latest job — and
# that job soft-skips when CANARY_TENANT_URLS is unset (the
# current state, until Phase 2 canary fleet is live). Result:
# `:latest` had been pinned to a 4-day-old digest (2026-04-28)
# while every main push pushed fresh `staging-<sha>` images;
# every prod redeploy pulled the stale `:latest` and the verify
# step correctly flagged 3/3 tenants STALE. Pulling the
# just-published `staging-<sha>` directly skips the dead retag
# path. When canary fleet is real, this workflow should chain
# on canary-verify completion (workflow_run from canary-verify),
# not publish-image — separate, smaller PR.
description: 'Tenant image tag to deploy (e.g. "latest", "staging-a59f1a6c"). Empty = auto staging-<head_sha>.'
required: false
type: string
default: ''
canary_slug:
description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately).'
required: false
type: string
# Must be an actual prod tenant slug (current: hongming,
# chloe-dong, reno-stars). The previous default 'hongmingwang'
# didn't match any tenant — CP soft-skipped the missing canary
# and the fleet rolled out without the soak gate, defeating the
# whole point of canary-first.
default: 'hongming'
soak_seconds:
description: 'Seconds to wait after canary before fanning out.'
required: false
type: string
default: '60'
batch_size:
description: 'How many tenants SSM redeploys in parallel per batch.'
required: false
type: string
default: '3'
dry_run:
description: 'Plan only — do not actually redeploy.'
required: false
type: boolean
default: false
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
# not the GitHub API.
# Serialize redeploys so two rapid main pushes' redeploys don't overlap
# and cause confusing per-tenant SSM state. Without this, GitHub's
# implicit workflow_run queueing would *probably* serialize them, but
# the explicit block makes the invariant defensible. Mirrors the
# concurrency block on redeploy-tenants-on-staging.yml for shape parity.
#
# cancel-in-progress: false → aborting a half-rolled-out fleet would
# leave tenants stuck on whatever image they happened to be on when
# cancelled. Better to finish the in-flight rollout before starting
# the next one.
concurrency:
group: redeploy-tenants-on-main
cancel-in-progress: false
jobs:
redeploy:
# Skip the auto-trigger if publish-workspace-server-image didn't
# actually succeed. workflow_run fires on any completion state; we
# don't want to redeploy against a half-built image.
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success')
runs-on: ubuntu-latest
timeout-minutes: 25
steps:
- name: Note on ECR propagation
# ECR image manifests are consistent immediately after push — no
# CDN cache to wait for. The old GHCR-based workflow had a 30s
# sleep to avoid race conditions; ECR makes that unnecessary.
run: echo "ECR image available immediately after push — proceeding."
- name: Compute target tag
id: tag
# Resolution order:
# 1. Operator-supplied input (workflow_dispatch with explicit
# tag) → used verbatim. Lets ops pin `latest` for emergency
# rollback to last canary-verified digest, or pin a specific
# `staging-<sha>` to roll back to a known-good build.
# 2. Default → `staging-<short_head_sha>`. The just-published
# digest. Bypasses the `:latest` retag path that's currently
# dead (canary-verify soft-skips without canary fleet, so
# the only thing retagging `:latest` today is the manual
# promote-latest.yml — last run 2026-04-28). Auto-trigger
# from workflow_run uses workflow_run.head_sha; manual
# dispatch with no input falls through to github.sha.
env:
INPUT_TAG: ${{ inputs.target_tag }}
HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
run: |
set -euo pipefail
if [ -n "${INPUT_TAG:-}" ]; then
echo "target_tag=$INPUT_TAG" >> "$GITHUB_OUTPUT"
echo "Using operator-pinned tag: $INPUT_TAG"
else
SHORT="${HEAD_SHA:0:7}"
echo "target_tag=staging-$SHORT" >> "$GITHUB_OUTPUT"
echo "Using auto tag: staging-$SHORT (head_sha=$HEAD_SHA)"
fi
- name: Call CP redeploy-fleet
# CP_ADMIN_API_TOKEN must be set as a repo/org secret on
# molecule-ai/molecule-core, matching the staging/prod CP's
# CP_ADMIN_API_TOKEN env. Stored in Railway, mirrored to this
# repo's secrets for CI.
env:
CP_URL: ${{ vars.CP_URL || 'https://api.moleculesai.app' }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
CANARY_SLUG: ${{ inputs.canary_slug || 'hongming' }}
SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
BATCH_SIZE: ${{ inputs.batch_size || '3' }}
DRY_RUN: ${{ inputs.dry_run || false }}
run: |
set -euo pipefail
if [ -z "${CP_ADMIN_API_TOKEN:-}" ]; then
echo "::error::CP_ADMIN_API_TOKEN secret not set; cannot redeploy"
echo "::notice::Set CP_ADMIN_API_TOKEN in repo secrets to enable auto-redeploy."
exit 1
fi
BODY=$(jq -nc \
--arg tag "$TARGET_TAG" \
--arg canary "$CANARY_SLUG" \
--argjson soak "$SOAK_SECONDS" \
--argjson batch "$BATCH_SIZE" \
--argjson dry "$DRY_RUN" \
'{
target_tag: $tag,
canary_slug: $canary,
soak_seconds: $soak,
batch_size: $batch,
dry_run: $dry
}')
echo "POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " body: $BODY"
HTTP_RESPONSE=$(mktemp)
HTTP_CODE_FILE=$(mktemp)
# Route -w into its own tempfile so curl's exit code (e.g. 56
# on connection-reset, 22 on --fail-with-body 4xx/5xx) can't
# pollute the captured stdout. The previous inline-substitution
# shape produced "000000" on connection reset (curl wrote
# "000" via -w, then the inline echo-fallback appended another
# "000") — caught on the 2026-05-04 redeploy of sha 2b862f6.
# set +e/-e keeps the non-zero curl exit from tripping the
# outer pipeline. See lint-curl-status-capture.yml for the
# CI gate that pins this fix shape.
set +e
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" >"$HTTP_CODE_FILE"
set -e
# Stderr from curl (e.g. dial errors with -sS) goes to the runner
# log so operators can see WHY a connection failed. Stdout is
# captured to $HTTP_CODE_FILE because that's where -w writes.
HTTP_CODE=$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE"
cat "$HTTP_RESPONSE" | jq . || cat "$HTTP_RESPONSE"
# Pretty-print per-tenant results in the job summary so
# ops can see which tenants were redeployed without drilling
# into the raw response.
{
echo "## Tenant redeploy fleet"
echo ""
echo "**Target tag:** \`$TARGET_TAG\`"
echo "**Canary:** \`$CANARY_SLUG\` (soak ${SOAK_SECONDS}s)"
echo "**Batch size:** $BATCH_SIZE"
echo "**Dry run:** $DRY_RUN"
echo "**HTTP:** $HTTP_CODE"
echo ""
echo "### Per-tenant result"
echo ""
echo '| Slug | Phase | SSM Status | Exit | Healthz | Error |'
echo '|------|-------|------------|------|---------|-------|'
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \(.error // "-") |"' "$HTTP_RESPONSE" || true
} >> "$GITHUB_STEP_SUMMARY"
if [ "$HTTP_CODE" != "200" ]; then
echo "::error::redeploy-fleet returned HTTP $HTTP_CODE"
exit 1
fi
OK=$(jq -r '.ok' "$HTTP_RESPONSE")
if [ "$OK" != "true" ]; then
echo "::error::redeploy-fleet reported ok=false (see summary for which tenant halted the rollout)"
exit 1
fi
echo "::notice::Tenant fleet redeploy reported ssm_status=Success — verifying actual image roll on each tenant..."
# Stash the response for the verify step. $RUNNER_TEMP outlasts
# the step boundary; $HTTP_RESPONSE doesn't.
cp "$HTTP_RESPONSE" "$RUNNER_TEMP/redeploy-response.json"
- name: Verify each tenant /buildinfo matches published SHA
# ROOT FIX FOR #2395.
#
# `redeploy-fleet`'s `ssm_status=Success` means "the SSM RPC
# didn't error" — NOT "the new image is running on the tenant."
# `:latest` lives in the local Docker daemon's image cache; if
# the SSM document does `docker compose up -d` without an
# explicit `docker pull`, the daemon serves the previously-
# cached digest and the container restarts on stale code.
# 2026-04-30 incident: hongmingwang's tenant reported
# ssm_status=Success at 17:00:53Z but kept serving pre-501a42d7
# chat_files for 30+ min — the lazy-heal fix never reached the
# user despite green deploy + green redeploy.
#
# This step closes the gap by curling each tenant's /buildinfo
# endpoint (added in workspace-server/internal/buildinfo +
# /Dockerfile* GIT_SHA build-arg, this PR) and comparing the
# returned git_sha to the SHA the workflow expects. Mismatches
# fail the workflow, which is what `ok=true` should have
# guaranteed all along.
#
# When the redeploy was triggered by workflow_dispatch with a
# specific tag (anything other than "latest", the full head SHA,
# or staging-<short_head_sha>), the expected SHA may not equal
# ${{ github.sha }}. In that case verification is skipped (see
# the guard at the top of the script). For workflow_run (default
# staging-<short_head_sha>), workflow_run.head_sha is the SHA
# that was just published.
env:
EXPECTED_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
# Tenant subdomain template — slugs from the response are
# appended. Production CP issues `<slug>.moleculesai.app`;
# staging CP issues `<slug>.staging.moleculesai.app`. This
# workflow runs on main → prod CP → no `staging.` infix.
TENANT_DOMAIN: 'moleculesai.app'
run: |
set -euo pipefail
EXPECTED_SHORT="${EXPECTED_SHA:0:7}"
if [ "$TARGET_TAG" != "latest" ] \
&& [ "$TARGET_TAG" != "$EXPECTED_SHA" ] \
&& [ "$TARGET_TAG" != "staging-$EXPECTED_SHORT" ]; then
# workflow_dispatch with a pinned tag that isn't the head
# SHA: the operator is rolling back or pinning. Skip the
# verification because we don't have the expected SHA in
# this context (resolving it would need a registry manifest
# lookup for the tag, which is a follow-up). Failing open
# here is safe: the operator chose the tag deliberately.
#
# `staging-<short_head_sha>` IS verified — it's the new
# auto-trigger default (see Compute target tag step) and
# the digest under that tag SHOULD match EXPECTED_SHA.
echo "::notice::target_tag=$TARGET_TAG (operator-pinned) — skipping per-tenant SHA verification."
exit 0
fi
RESP="$RUNNER_TEMP/redeploy-response.json"
if [ ! -s "$RESP" ]; then
echo "::error::redeploy-response.json missing or empty — verify step ran without a response to read"
exit 1
fi
# Pull only successfully-redeployed tenants. Any tenant that
# halted the rollout already failed the previous step, so we
# don't double-count them here.
mapfile -t SLUGS < <(jq -r '.results[]? | select(.healthz_ok == true) | .slug' "$RESP")
if [ ${#SLUGS[@]} -eq 0 ]; then
echo "::warning::No tenants reported healthz_ok — nothing to verify"
exit 0
fi
echo "Verifying ${#SLUGS[@]} tenant(s) against EXPECTED_SHA=${EXPECTED_SHA:0:7}..."
# Two distinct failure modes — STALE (the #2395 bug class, hard-fail)
# vs UNREACHABLE (teardown race, soft-warn). See the staging variant's
# comment for the full rationale; same logic applies on prod even
# though prod has fewer ephemeral tenants — the asymmetry would be a
# gratuitous fork.
STALE_COUNT=0
UNREACHABLE_COUNT=0
STALE_LINES=()
UNREACHABLE_LINES=()
for slug in "${SLUGS[@]}"; do
URL="https://${slug}.${TENANT_DOMAIN}/buildinfo"
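# Up to 30s per attempt (curl applies --max-time to each try, with
# up to 3 retries 5s apart): the tenant just SSM-restarted and may
# still be coming up. An empty response counts as unreachable
# rather than stale; we want to fail fast on "responded with wrong
# SHA", not on "still warming up".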
BODY=$(curl -sS --max-time 30 --retry 3 --retry-delay 5 --retry-connrefused "$URL" || true)
ACTUAL_SHA=$(echo "$BODY" | jq -r '.git_sha // ""' 2>/dev/null || echo "")
if [ -z "$ACTUAL_SHA" ]; then
UNREACHABLE_COUNT=$((UNREACHABLE_COUNT + 1))
UNREACHABLE_LINES+=("| $slug | (no /buildinfo response) | ${EXPECTED_SHA:0:7} | ⚠ unreachable (likely teardown race) |")
continue
fi
if [ "$ACTUAL_SHA" = "$EXPECTED_SHA" ]; then
echo " $slug: ${ACTUAL_SHA:0:7} ✓"
else
STALE_COUNT=$((STALE_COUNT + 1))
STALE_LINES+=("| $slug | ${ACTUAL_SHA:0:7} | ${EXPECTED_SHA:0:7} | ❌ stale |")
fi
done
{
echo ""
echo "### Per-tenant /buildinfo verification"
echo ""
echo "Expected SHA: \`${EXPECTED_SHA:0:7}\`"
echo ""
if [ $STALE_COUNT -gt 0 ]; then
echo "**${STALE_COUNT} STALE tenant(s) — these did NOT pick up the new image despite ssm_status=Success:**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${STALE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "**${UNREACHABLE_COUNT} unreachable tenant(s) — likely teardown race (soft-warn, not failing):**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${UNREACHABLE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $STALE_COUNT -eq 0 ] && [ $UNREACHABLE_COUNT -eq 0 ]; then
echo "All ${#SLUGS[@]} tenants returned matching SHA. ✓"
fi
} >> "$GITHUB_STEP_SUMMARY"
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "::warning::$UNREACHABLE_COUNT tenant(s) unreachable post-redeploy. Likely benign teardown race — CP healthz monitor catches real outages."
fi
# Belt-and-suspenders sanity floor: same logic as the staging
# variant — see that file's comment for the full rationale.
# Floor only applies when fleet >= 4; below that, canary-verify
# is the actual gate.
TOTAL_VERIFIED=${#SLUGS[@]}
if [ $TOTAL_VERIFIED -ge 4 ] && [ $UNREACHABLE_COUNT -gt $((TOTAL_VERIFIED / 2)) ]; then
echo "::error::$UNREACHABLE_COUNT of $TOTAL_VERIFIED tenant(s) unreachable — exceeds 50% threshold on a fleet large enough that this signals a real outage, not teardown race."
exit 1
fi
if [ $STALE_COUNT -gt 0 ]; then
echo "::error::$STALE_COUNT tenant(s) returned a stale SHA. ssm_status=Success was misleading — see job summary."
exit 1
fi
echo "::notice::Tenant fleet redeploy complete — all reachable tenants on ${EXPECTED_SHA:0:7} (${UNREACHABLE_COUNT} unreachable, soft-warned)."
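The tag resolution and the STALE / UNREACHABLE split in redeploy-tenants-on-main above can be condensed into three pure functions. This is a minimal sketch: the function names `resolve_target_tag`, `classify_tenant`, and `unreachable_floor_tripped` are hypothetical and do not appear in the workflow. It only mirrors the rules stated in the comments (empty input falls back to `staging-<short_head_sha>`, an empty `/buildinfo` SHA means unreachable, and the 50% floor applies only to fleets of 4 or more).

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the redeploy workflow's decision rules;
# the names are illustrative only.

# resolve_target_tag INPUT_TAG HEAD_SHA
# An operator-supplied tag is used verbatim; otherwise fall back to
# staging-<first 7 chars of HEAD_SHA>.
resolve_target_tag() {
  local input_tag="$1" head_sha="$2"
  if [ -n "$input_tag" ]; then
    printf '%s\n' "$input_tag"
  else
    printf 'staging-%s\n' "${head_sha:0:7}"
  fi
}

# classify_tenant ACTUAL_SHA EXPECTED_SHA
# Prints "unreachable" (soft-warn), "ok", or "stale" (hard-fail).
classify_tenant() {
  local actual="$1" expected="$2"
  if [ -z "$actual" ]; then
    echo unreachable
  elif [ "$actual" = "$expected" ]; then
    echo ok
  else
    echo stale
  fi
}

# unreachable_floor_tripped TOTAL UNREACHABLE
# Succeeds when the fleet has at least 4 tenants and more than half of
# them are unreachable.
unreachable_floor_tripped() {
  local total="$1" unreachable="$2"
  [ "$total" -ge 4 ] && [ "$unreachable" -gt $((total / 2)) ]
}
```

Keeping these rules as pure functions would let them be unit-tested without a control plane or a tenant fleet, which the inline workflow script does not allow.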
@@ -0,0 +1,362 @@
name: redeploy-tenants-on-staging
# Auto-refresh staging tenant EC2s after every staging-branch merge.
#
# Mirror of redeploy-tenants-on-main.yml, with the staging-CP host and
# the :staging-latest tag. Sister workflow exists for prod (rolls
# :latest after canary-verify). Both share the same shape — just
# different CP_URL + target_tag + admin token secret.
#
# Why this workflow exists: publish-workspace-server-image now builds
# on every staging-branch push (PR #2335), pushing
# platform-tenant:staging-latest to GHCR. Existing tenants pulled
# their image once at boot and never re-pull, so the new image just
# sits unused until the tenant is reprovisioned.
#
# This workflow closes the gap by calling staging-CP's
# /cp/admin/tenants/redeploy-fleet, which performs a canary-first,
# batched, health-gated SSM redeploy across every live staging tenant.
# Same endpoint shape as prod CP — only the host differs.
#
# Runtime ordering:
# 1. publish-workspace-server-image completes on staging branch →
# new :staging-latest in GHCR.
# 2. This workflow fires via workflow_run, waits 30s for GHCR's CDN
# to propagate the new tag.
# 3. Calls redeploy-fleet with no canary (staging IS canary; we don't
# need a sub-canary inside it). Soak still applies to the first
# tenant in case of bad-deploy detection.
# 4. Any failure aborts the rollout and leaves older tenants on the
# prior image — safer default than half-and-half state.
#
# Rollback path: re-run with workflow_dispatch + target_tag=staging-<sha>
# of a known-good build.
on:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [main]
workflow_dispatch:
inputs:
target_tag:
description: 'Tenant image tag to deploy (e.g. "staging-latest" or "staging-a59f1a6c"). Defaults to staging-latest when empty.'
required: false
type: string
default: 'staging-latest'
canary_slug:
description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately). Default empty for staging since staging itself is the canary.'
required: false
type: string
default: ''
soak_seconds:
description: 'Seconds to wait after canary before fanning out. Only meaningful if canary_slug is set.'
required: false
type: string
default: '60'
batch_size:
description: 'How many tenants SSM redeploys in parallel per batch.'
required: false
type: string
default: '3'
dry_run:
description: 'Plan only — do not actually redeploy.'
required: false
type: boolean
default: false
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
# not the GitHub API.
# Serialize per-branch so two rapid staging pushes' redeploys don't
# overlap and cause confusing per-tenant SSM state. cancel-in-progress
# is false because aborting a half-rolled-out fleet leaves tenants
# stuck on whatever image they happened to be on when cancelled.
concurrency:
group: redeploy-tenants-on-staging
cancel-in-progress: false
jobs:
redeploy:
# Skip the auto-trigger if publish-workspace-server-image didn't
# actually succeed. workflow_run fires on any completion state; we
# don't want to redeploy against a half-built image.
if: |
github.event_name == 'workflow_dispatch' ||
(github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success')
runs-on: ubuntu-latest
timeout-minutes: 25
steps:
- name: Wait for GHCR tag propagation
# GHCR's edge cache takes ~15-30s to consistently serve the new
# :staging-latest manifest after the registry accepts the push.
# Same rationale as redeploy-tenants-on-main.yml.
run: sleep 30
- name: Call staging-CP redeploy-fleet
# CP_STAGING_ADMIN_API_TOKEN must be set as a repo/org secret
# on molecule-ai/molecule-core, matching staging-CP's
# CP_ADMIN_API_TOKEN env var (visible in Railway controlplane
# / staging environment). Stored separately from the prod
# CP_ADMIN_API_TOKEN so a leak of one doesn't auth the other.
env:
CP_URL: ${{ vars.STAGING_CP_URL || 'https://staging-api.moleculesai.app' }}
CP_STAGING_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
TARGET_TAG: ${{ inputs.target_tag || 'staging-latest' }}
CANARY_SLUG: ${{ inputs.canary_slug || '' }}
SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
BATCH_SIZE: ${{ inputs.batch_size || '3' }}
DRY_RUN: ${{ inputs.dry_run || false }}
run: |
set -euo pipefail
# Schedule-vs-dispatch hardening (mirrors sweep-cf-orphans
# and sweep-cf-tunnels): hard-fail on auto-trigger when the
# secret is missing so a misconfigured-repo doesn't silently
# serve stale staging tenants. Soft-skip on operator dispatch.
if [ -z "${CP_STAGING_ADMIN_API_TOKEN:-}" ]; then
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "::warning::CP_STAGING_ADMIN_API_TOKEN secret not set — skipping redeploy"
echo "::warning::Set CP_STAGING_ADMIN_API_TOKEN in repo secrets to enable auto-redeploy."
echo "::notice::Pull the value from staging-CP's CP_ADMIN_API_TOKEN env in Railway."
exit 0
fi
echo "::error::staging redeploy cannot run — CP_STAGING_ADMIN_API_TOKEN secret missing"
echo "::error::set it at Settings → Secrets and Variables → Actions; pull from staging-CP's CP_ADMIN_API_TOKEN env in Railway."
exit 1
fi
BODY=$(jq -nc \
--arg tag "$TARGET_TAG" \
--arg canary "$CANARY_SLUG" \
--argjson soak "$SOAK_SECONDS" \
--argjson batch "$BATCH_SIZE" \
--argjson dry "$DRY_RUN" \
'{
target_tag: $tag,
canary_slug: $canary,
soak_seconds: $soak,
batch_size: $batch,
dry_run: $dry
}')
echo "POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " body: $BODY"
HTTP_RESPONSE=$(mktemp)
HTTP_CODE_FILE=$(mktemp)
# Route -w into its own tempfile so curl's exit code (e.g. 56
# on connection-reset) can't pollute the captured stdout. The
# previous inline-substitution shape produced "000000" on
# connection reset — caught on main variant 2026-05-04
# redeploying sha 2b862f6. Same fix shape as the synth-E2E
# §9c gate (PR #2797). See lint-curl-status-capture.yml for
# the CI gate that pins this fix shape.
set +e
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_STAGING_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" >"$HTTP_CODE_FILE"
set -e
# Stderr from curl (-sS shows dial errors etc.) goes to the
# runner log so operators can see WHY a connection failed.
HTTP_CODE=$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE"
cat "$HTTP_RESPONSE" | jq . || cat "$HTTP_RESPONSE"
{
echo "## Staging tenant redeploy fleet"
echo ""
echo "**Target tag:** \`$TARGET_TAG\`"
echo "**Canary:** \`${CANARY_SLUG:-(none — staging is itself the canary)}\` (soak ${SOAK_SECONDS}s)"
echo "**Batch size:** $BATCH_SIZE"
echo "**Dry run:** $DRY_RUN"
echo "**HTTP:** $HTTP_CODE"
echo ""
echo "### Per-tenant result"
echo ""
echo '| Slug | Phase | SSM Status | Exit | Healthz | Error |'
echo '|------|-------|------------|------|---------|-------|'
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \(.error // "-") |"' "$HTTP_RESPONSE" || true
} >> "$GITHUB_STEP_SUMMARY"
# Distinguish "real fleet failure" from "E2E teardown race".
#
# CP returns HTTP 500 + ok=false whenever ANY tenant in the
# fleet failed SSM or healthz. In practice the recurring source
# of these is ephemeral test tenants being torn down by their
# parent E2E run mid-redeploy: the EC2 dies → SSM exit=2 or
# healthz timeout → CP marks the fleet failed → this workflow
# goes red even though every operator-facing tenant rolled fine.
#
# Ephemeral slug prefixes (kept in sync with sweep-stale-e2e-orgs.yml
# — see that file for the source-of-truth list and rationale):
# - e2e-* — canvas/saas/ext E2E suites
# - rt-e2e-* — runtime-test harness fixtures (RFC #2251)
# Long-lived prefixes that are NOT ephemeral and MUST hard-fail:
# demo-prep, dryrun-*, dryrun2-*, plus all human tenant slugs.
#
# Filter: if HTTP=500/ok=false AND every failed slug matches an
# ephemeral prefix, treat as soft-warn and let the verify step
# downstream handle unreachable-vs-stale (#2402). Any non-ephemeral
# failure or a non-500 HTTP response remains a hard failure.
OK=$(jq -r '.ok // "false"' "$HTTP_RESPONSE")
FAILED_SLUGS=$(jq -r '
.results[]?
| select((.healthz_ok != true) or (.ssm_status != "Success"))
| .slug' "$HTTP_RESPONSE" 2>/dev/null || true)
EPHEMERAL_PREFIX_RE='^(e2e-|rt-e2e-)'
NON_EPHEMERAL_FAILED=$(printf '%s\n' "$FAILED_SLUGS" | grep -v '^$' | grep -Ev "$EPHEMERAL_PREFIX_RE" || true)
if [ "$HTTP_CODE" = "200" ] && [ "$OK" = "true" ]; then
: # happy path — fall through to verification
elif [ "$HTTP_CODE" = "500" ] && [ -z "$NON_EPHEMERAL_FAILED" ] && [ -n "$FAILED_SLUGS" ]; then
COUNT=$(printf '%s\n' "$FAILED_SLUGS" | grep -Ec "$EPHEMERAL_PREFIX_RE" || true)
echo "::warning::redeploy-fleet returned HTTP 500 but every failed tenant ($COUNT) is ephemeral (e2e-*/rt-e2e-*) — treating as teardown race, soft-warning."
printf '%s\n' "$FAILED_SLUGS" | sed 's/^/::warning:: failed: /'
elif [ "$HTTP_CODE" != "200" ]; then
echo "::error::redeploy-fleet returned HTTP $HTTP_CODE"
if [ -n "$NON_EPHEMERAL_FAILED" ]; then
echo "::error::non-ephemeral tenant(s) failed:"
printf '%s\n' "$NON_EPHEMERAL_FAILED" | sed 's/^/::error:: /'
fi
exit 1
else
# HTTP=200 but ok=false (shouldn't happen with current CP
# but keep the gate for completeness).
echo "::error::redeploy-fleet reported ok=false (see summary for which tenant halted the rollout)"
exit 1
fi
echo "::notice::Staging tenant fleet redeploy reported ssm_status=Success — verifying actual image roll on each tenant..."
cp "$HTTP_RESPONSE" "$RUNNER_TEMP/redeploy-response.json"
- name: Verify each staging tenant /buildinfo matches published SHA
# Mirror of the verify step in redeploy-tenants-on-main.yml — see
# there for the rationale (#2395 root fix). Staging has the same
# ssm_status-success-but-stale-image hazard and benefits from the
# same gate. Diff: TENANT_DOMAIN includes the `staging.` infix.
env:
EXPECTED_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
TARGET_TAG: ${{ inputs.target_tag || 'staging-latest' }}
TENANT_DOMAIN: 'staging.moleculesai.app'
run: |
set -euo pipefail
# staging-latest is the staging-side moving tag; treat it the
# same way main treats `latest`. Operator-pinned SHAs skip
# verification (see main variant for why).
if [ "$TARGET_TAG" != "staging-latest" ] && [ "$TARGET_TAG" != "latest" ] && [ "$TARGET_TAG" != "$EXPECTED_SHA" ]; then
echo "::notice::target_tag=$TARGET_TAG (operator-pinned) — skipping per-tenant SHA verification."
exit 0
fi
RESP="$RUNNER_TEMP/redeploy-response.json"
if [ ! -s "$RESP" ]; then
echo "::error::redeploy-response.json missing or empty"
exit 1
fi
mapfile -t SLUGS < <(jq -r '.results[]? | select(.healthz_ok == true) | .slug' "$RESP")
if [ ${#SLUGS[@]} -eq 0 ]; then
echo "::warning::No staging tenants reported healthz_ok — nothing to verify"
exit 0
fi
echo "Verifying ${#SLUGS[@]} staging tenant(s) against EXPECTED_SHA=${EXPECTED_SHA:0:7}..."
# Two distinct failure modes here:
# STALE_COUNT — tenant returned a SHA that doesn't match. THIS is
# the #2395 bug class: tenant up + serving old code.
# Always hard-fail the workflow.
# UNREACHABLE_COUNT — tenant didn't respond. Almost always a benign
# teardown race: redeploy-fleet snapshot says
# healthz_ok=true, then the E2E suite tears the
# ephemeral tenant down before this step runs (the
# e2e-* fixtures churn 5-10/hour on staging). Soft-
# warn so we don't block staging→main on cleanup.
# Real "tenant up but unreachable" is caught by CP's
# own healthz monitor + the post-redeploy alert; we
# don't need to double-count it here.
STALE_COUNT=0
UNREACHABLE_COUNT=0
STALE_LINES=()
UNREACHABLE_LINES=()
for slug in "${SLUGS[@]}"; do
URL="https://${slug}.${TENANT_DOMAIN}/buildinfo"
BODY=$(curl -sS --max-time 30 --retry 3 --retry-delay 5 --retry-connrefused "$URL" || true)
ACTUAL_SHA=$(echo "$BODY" | jq -r '.git_sha // ""' 2>/dev/null || echo "")
if [ -z "$ACTUAL_SHA" ]; then
UNREACHABLE_COUNT=$((UNREACHABLE_COUNT + 1))
UNREACHABLE_LINES+=("| $slug | (no /buildinfo response) | ${EXPECTED_SHA:0:7} | ⚠ unreachable (likely teardown race) |")
continue
fi
if [ "$ACTUAL_SHA" = "$EXPECTED_SHA" ]; then
echo " $slug: ${ACTUAL_SHA:0:7} ✓"
else
STALE_COUNT=$((STALE_COUNT + 1))
STALE_LINES+=("| $slug | ${ACTUAL_SHA:0:7} | ${EXPECTED_SHA:0:7} | ❌ stale |")
fi
done
{
echo ""
echo "### Per-tenant /buildinfo verification (staging)"
echo ""
echo "Expected SHA: \`${EXPECTED_SHA:0:7}\`"
echo ""
if [ $STALE_COUNT -gt 0 ]; then
echo "**${STALE_COUNT} STALE tenant(s) — these did NOT pick up the new image despite ssm_status=Success:**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${STALE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "**${UNREACHABLE_COUNT} unreachable tenant(s) — likely E2E teardown race (soft-warn, not failing):**"
echo ""
echo "| Slug | Actual /buildinfo SHA | Expected | Status |"
echo "|------|----------------------|----------|--------|"
for line in "${UNREACHABLE_LINES[@]}"; do echo "$line"; done
echo ""
fi
if [ $STALE_COUNT -eq 0 ] && [ $UNREACHABLE_COUNT -eq 0 ]; then
echo "All ${#SLUGS[@]} staging tenants returned matching SHA. ✓"
fi
} >> "$GITHUB_STEP_SUMMARY"
if [ $UNREACHABLE_COUNT -gt 0 ]; then
echo "::warning::$UNREACHABLE_COUNT staging tenant(s) unreachable post-redeploy. Likely benign teardown race — CP healthz monitor catches real outages."
fi
# Belt-and-suspenders sanity floor: if MORE than half the fleet is
# unreachable AND the fleet is large enough that "half down" is
# statistically meaningful, this is a real outage (e.g. new image
# crashes on startup), not a teardown race. Hard-fail.
#
# Floor only applies when TOTAL_VERIFIED >= 4 — below that, the
# canary-verify step is the actual gate for "all tenants down"
# detection (it runs against the canary first and aborts the
# rollout if the canary fails to come up). Without the >=4 gate,
# a 1-tenant fleet (e.g. a single ephemeral e2e-* tenant on a
# quiet staging push) would re-flake on the exact teardown-race
# condition #2402 fixed: 1 of 1 unreachable = 100% > 50% → fail.
TOTAL_VERIFIED=${#SLUGS[@]}
if [ $TOTAL_VERIFIED -ge 4 ] && [ $UNREACHABLE_COUNT -gt $((TOTAL_VERIFIED / 2)) ]; then
echo "::error::$UNREACHABLE_COUNT of $TOTAL_VERIFIED staging tenant(s) unreachable — exceeds 50% threshold on a fleet large enough that this signals a real outage, not teardown race."
exit 1
fi
if [ $STALE_COUNT -gt 0 ]; then
echo "::error::$STALE_COUNT staging tenant(s) returned a stale SHA. ssm_status=Success was misleading — see job summary."
exit 1
fi
echo "::notice::Staging tenant fleet redeploy complete — all reachable tenants on ${EXPECTED_SHA:0:7} (${UNREACHABLE_COUNT} unreachable, soft-warned)."
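Reviewer note: the two gating rules described in the workflow comments above (the ephemeral-slug filter in the redeploy step and the unreachable-count sanity floor in the verify step) can be isolated as small shell functions for local experimentation. This is a minimal sketch, not code from the workflow: the function names `non_ephemeral_failed` and `floor_tripped` are hypothetical, and only the regex and the `>= 4` / "more than half" thresholds are taken from the script itself.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Function names are hypothetical and do not
# appear in the workflow; the regex and thresholds mirror the script above.

# Same pattern the workflow uses to recognise ephemeral E2E tenants.
EPHEMERAL_PREFIX_RE='^(e2e-|rt-e2e-)'

# Read failed slugs from stdin (one per line) and print only those that
# are NOT ephemeral. Blank lines are dropped first. The trailing
# `|| true` hides grep's exit code 1 ("no lines selected") so a caller
# running under `set -e` does not abort when every slug is ephemeral.
non_ephemeral_failed() {
  grep -v '^$' | grep -Ev "$EPHEMERAL_PREFIX_RE" || true
}

# Return 0 when the sanity floor is tripped: at least 4 tenants were
# verified AND strictly more than half of them were unreachable.
# Return 1 otherwise (the soft-warn case).
#   $1 = total tenants verified
#   $2 = unreachable tenant count
floor_tripped() {
  local total=$1 unreachable=$2
  [ "$total" -ge 4 ] && [ "$unreachable" -gt $((total / 2)) ]
}
```

For example, `printf '%s\n' e2e-abc acme rt-e2e-1 | non_ephemeral_failed` prints only `acme`, which is the hard-fail case. `floor_tripped 1 1` returns non-zero, so a single unreachable tenant on a 1-tenant fleet stays a soft warning, while `floor_tripped 4 3` returns zero and would be a hard failure.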
+12 -12
@@ -57,24 +57,24 @@ See `CLAUDE.md` for a full list of environment variables and their purposes.
This repo is scoped to **code** (canvas, workspace, workspace-server, related
infra). Public content (blog posts, marketing copy, OG images, SEO briefs,
DevRel demos) lives in [`molecule-ai/docs`](https://git.moleculesai.app/molecule-ai/docs).
DevRel demos) lives in [`Molecule-AI/docs`](https://git.moleculesai.app/molecule-ai/docs).
The `Block forbidden paths` CI gate fails any PR that writes to `marketing/`
or other removed paths — open against `molecule-ai/docs` instead.
or other removed paths — open against `Molecule-AI/docs` instead.
| Content type | Target |
|---|---|
| Blog posts | `molecule-ai/docs` → `content/blog/<YYYY-MM-DD-slug>/` |
| Doc pages | `molecule-ai/docs` → `content/docs/` |
| Marketing copy / PMM positioning | `molecule-ai/docs` → `marketing/` |
| OG images, visual assets | `molecule-ai/docs` → `app/` or `marketing/` |
| SEO briefs | `molecule-ai/docs` → `marketing/` |
| DevRel demos (runnable code) | Standalone repo under `molecule-ai/`, OR embedded in `molecule-ai/docs` |
| Blog posts | `Molecule-AI/docs` → `content/blog/<YYYY-MM-DD-slug>/` |
| Doc pages | `Molecule-AI/docs` → `content/docs/` |
| Marketing copy / PMM positioning | `Molecule-AI/docs` → `marketing/` |
| OG images, visual assets | `Molecule-AI/docs` → `app/` or `marketing/` |
| SEO briefs | `Molecule-AI/docs` → `marketing/` |
| DevRel demos (runnable code) | Standalone repo under `Molecule-AI/`, OR embedded in `Molecule-AI/docs` |
| Launch checklists, internal tracking | GitHub Issues — **not** committed files |
| Engineering docs (`docs/adr/`, `docs/architecture/`, `docs/incidents/`) | This repo (internal, not published) |
| Live product pages (e.g. `canvas/src/app/pricing/page.tsx`) | This repo (these are app code, not marketing copy) |
If a PR fails the `Block forbidden paths` check, the contents belong in
`molecule-ai/docs`. No CI drag, no Canvas E2E, content lands in minutes.
`Molecule-AI/docs`. No CI drag, no Canvas E2E, content lands in minutes.
## Development Workflow
@@ -190,9 +190,9 @@ Runs the full regression suite against a fixture HTTP server. No network access
Code in this repo lands in molecule-core. Some related runtime artifacts
live in their own repos:
- [`molecule-ai/molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime) — Python adapter SDK (`molecule_runtime`) that runs inside containerized Molecule workspaces. Bridges Claude Code SDK / hermes / langgraph / etc. → A2A queue.
- [`molecule-ai/molecule-sdk-python`](https://git.moleculesai.app/molecule-ai/molecule-sdk-python) — `A2AServer` + `RemoteAgentClient` for external agents that register over the public `/registry/register` flow.
- [`molecule-ai/molecule-mcp-claude-channel`](https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel) — Claude Code channel plugin. Bridges A2A traffic into a running Claude Code session via MCP `notifications/claude/channel`. Polling-based (no tunnel required); install inside Claude Code via `/plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git` → `/plugin install molecule@molecule-channel`, then launch with `claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel`.
- [`Molecule-AI/molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime) — Python adapter SDK (`molecule_runtime`) that runs inside containerized Molecule workspaces. Bridges Claude Code SDK / hermes / langgraph / etc. → A2A queue.
- [`Molecule-AI/molecule-sdk-python`](https://git.moleculesai.app/molecule-ai/molecule-sdk-python) — `A2AServer` + `RemoteAgentClient` for external agents that register over the public `/registry/register` flow.
- [`Molecule-AI/molecule-mcp-claude-channel`](https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel) — Claude Code channel plugin. Bridges A2A traffic into a running Claude Code session via MCP `notifications/claude/channel`. Polling-based (no tunnel required); install with `claude --channels plugin:molecule@Molecule-AI/molecule-mcp-claude-channel`.
When extending the **A2A surface** in molecule-core (`workspace-server/internal/handlers/a2a_proxy.go` etc.), consider whether the change has a downstream impact on the runtime SDK or the channel plugin — they're versioned independently but share the wire shape.
+1 -1
@@ -238,7 +238,7 @@ The result is not just “an agent that learns.” It is **an organization that
- subscribe to one or more workspaces; peer messages surface as conversation turns; replies route back through Molecule's A2A
- no tunnel, no public endpoint — the plugin self-registers each watched workspace as `delivery_mode=poll` and long-polls `/activity?since_id=…`
- multi-tenant friendly: one plugin install can watch workspaces across multiple Molecule tenants (`MOLECULE_PLATFORM_URLS` per-workspace)
- install via the standard marketplace flow: `/plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git` → `/plugin install molecule@molecule-channel`, then launch with `claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel`
- install via the standard marketplace flow: `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel` → `/plugin install molecule-channel@molecule-mcp-claude-channel`
## Built For Teams That Need More Than A Demo
+1 -1
@@ -237,7 +237,7 @@ Molecule AI is not meant to replace the frameworks below, but to bring them into a more
- subscribe to one or more workspaces; peer messages appear as user turns, and replies are routed back out through Molecule A2A
- no public tunnel and no public endpoint: on startup the plugin automatically registers each watched workspace as `delivery_mode=poll` and long-polls `/activity?since_id=…`
- multi-tenant friendly: a single install can watch workspaces across multiple Molecule tenants (`MOLECULE_PLATFORM_URLS` is configured per workspace)
- install via the standard marketplace flow: `/plugin marketplace add https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel.git` → `/plugin install molecule@molecule-channel`, then launch with `claude --dangerously-load-development-channels --channels plugin:molecule@molecule-channel`
- install via the standard marketplace flow: `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel` → `/plugin install molecule-channel@molecule-mcp-claude-channel`
## Which Teams It Is For
-113
@@ -1,113 +0,0 @@
import { describe, it, expect, vi } from "vitest";
// Marketing-launch SEO (mc#1486). These tests pin the public crawler
// contract: anything that flips public marketing routes to disallow,
// drops the sitemap from robots.txt, or removes the OG image
// reference from root metadata should fail loudly here.
// next/font and the rest of the layout's runtime tree are not
// vitest-compatible (next/font expects the Next.js compiler swc
// transform). We import layout.tsx only for its exported `metadata`
// constant — mock the font module to a constructor-returning stub.
vi.mock("next/font/google", () => ({
Inter: () => ({ variable: "--font-inter" }),
JetBrains_Mono: () => ({ variable: "--font-jetbrains" }),
}));
import robots from "../robots";
import sitemap from "../sitemap";
import { metadata } from "../layout";
describe("robots.ts", () => {
it("allows public marketing routes and blocks authed/app routes", () => {
const r = robots();
expect(r.rules).toBeDefined();
const rule = Array.isArray(r.rules) ? r.rules[0] : r.rules!;
expect(rule.userAgent).toBe("*");
const allow = Array.isArray(rule.allow) ? rule.allow : [rule.allow];
expect(allow).toEqual(expect.arrayContaining(["/", "/pricing", "/blog"]));
const disallow = Array.isArray(rule.disallow)
? rule.disallow
: [rule.disallow];
expect(disallow).toEqual(
expect.arrayContaining(["/api/", "/orgs", "/cp/"]),
);
});
it("declares the sitemap URL", () => {
const r = robots();
expect(r.sitemap).toMatch(/\/sitemap\.xml$/);
});
it("declares a canonical host", () => {
const r = robots();
expect(r.host).toMatch(/^https:\/\//);
});
});
describe("sitemap.ts", () => {
it("includes apex, pricing, and the live blog post", () => {
const entries = sitemap();
const urls = entries.map((e) => e.url);
expect(urls.some((u) => u.endsWith("/"))).toBe(true);
expect(urls.some((u) => u.endsWith("/pricing"))).toBe(true);
expect(
urls.some((u) => u.includes("/blog/2026-04-20-chrome-devtools-mcp")),
).toBe(true);
});
it("does NOT include authed/app routes", () => {
const entries = sitemap();
const urls = entries.map((e) => e.url);
expect(urls.some((u) => u.includes("/orgs"))).toBe(false);
expect(urls.some((u) => u.includes("/api/"))).toBe(false);
});
it("sets a non-zero priority and a valid changeFrequency on every entry", () => {
const valid = new Set([
"always",
"hourly",
"daily",
"weekly",
"monthly",
"yearly",
"never",
]);
for (const e of sitemap()) {
expect(e.priority).toBeGreaterThan(0);
expect(valid.has(String(e.changeFrequency))).toBe(true);
}
});
});
describe("root layout metadata", () => {
it("sets a templated title + non-empty description", () => {
const t = metadata.title as { default: string; template: string };
expect(t.default).toMatch(/Molecule AI/);
expect(t.template).toMatch(/%s/);
expect((metadata.description ?? "").length).toBeGreaterThan(50);
});
it("declares OG + Twitter text fields (image comes from opengraph-image.tsx)", () => {
const og = metadata.openGraph;
expect(og).toBeDefined();
expect((og as { title: string }).title).toMatch(/Molecule AI/);
expect((og as { description: string }).description.length).toBeGreaterThan(
50,
);
const tw = metadata.twitter;
expect(tw).toBeDefined();
// Next.js typings narrow twitter.card to a union — assert via cast.
expect((tw as { card: string }).card).toBe("summary_large_image");
});
it("sets a canonical alternate", () => {
expect(metadata.alternates?.canonical).toBe("/");
});
it("enables indexing at the metadata level (robots.ts owns per-route)", () => {
const r = metadata.robots as { index: boolean; follow: boolean };
expect(r.index).toBe(true);
expect(r.follow).toBe(true);
});
});
+2 -140
@@ -27,78 +27,9 @@ import {
themeBootScript,
} from "@/lib/theme-cookie";
// Marketing-launch SEO (mc#1486). Canonical apex is app.moleculesai.app —
// tenant subdomains (<slug>.moleculesai.app) reuse the same Next.js build
// but are gated behind auth (AuthGate redirects anonymous → /cp/auth/login)
// and are de-indexed in robots.ts. The metadata here applies to the
// public marketing surface served from the apex host.
//
// Override per-route by exporting a page-level `metadata`/`generateMetadata`
// — Next.js merges page metadata over layout metadata using
// `title.template` for "<page> | Molecule AI" composition.
const SITE_URL =
process.env.NEXT_PUBLIC_SITE_URL ?? "https://app.moleculesai.app";
export const metadata: Metadata = {
metadataBase: new URL(SITE_URL),
title: {
default: "Molecule AI — the AI org chart canvas",
template: "%s | Molecule AI",
},
description:
"Molecule AI is an org-chart canvas for AI agent teams. Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed multi-agent workspace with credit metering, audit, and one-click runtime provisioning.",
applicationName: "Molecule AI",
keywords: [
"AI agents",
"multi-agent",
"agent orchestration",
"AI org chart",
"Claude Code",
"Codex",
"MCP",
"agent governance",
"A2A",
"agent runtime",
],
authors: [{ name: "Molecule AI" }],
creator: "Molecule AI",
publisher: "Molecule AI",
alternates: { canonical: "/" },
// OG + Twitter images come from the file-convention sibling
// `opengraph-image.tsx` — Next.js auto-attaches them to og:image
// and twitter:image when present at the segment root. We keep the
// text fields here so they win over per-page metadata when a page
// doesn't override them. `images: []` as the structural fallback
// for hosts that won't follow the file convention; the real URL
// is injected by Next.js at build time from opengraph-image.tsx.
openGraph: {
type: "website",
siteName: "Molecule AI",
url: SITE_URL,
title: "Molecule AI — the AI org chart canvas",
description:
"Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed multi-agent workspace. Credit metering, audit, and one-click runtime provisioning.",
locale: "en_US",
},
twitter: {
card: "summary_large_image",
title: "Molecule AI — the AI org chart canvas",
description:
"Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed multi-agent workspace.",
},
icons: {
icon: "/molecule-icon.png",
apple: "/molecule-icon.png",
},
// robots.ts owns the per-route allow/disallow contract; this is the
// header-level fallback for routes the crawler reaches before
// robots.txt resolves. Default = index public marketing routes;
// app/auth/api/orgs are noindex'd by robots.ts.
robots: {
index: true,
follow: true,
googleBot: { index: true, follow: true, "max-image-preview": "large" },
},
title: "Molecule AI",
description: "AI Org Chart Canvas",
};
export default async function RootLayout({
@@ -163,75 +94,6 @@ export default async function RootLayout({
nonce={nonce}
dangerouslySetInnerHTML={{ __html: themeBootScript }}
/>
{/*
* JSON-LD structured data (mc#1486). Two graph nodes:
*
* - Organization: surfaces the brand to Google Knowledge
* Graph + Bing entity index. URL+logo+sameAs are the
* minimum recommended set for new brands without a
* Wikipedia page.
*
* - WebSite: enables the sitelinks search box and tells
* crawlers the canonical site URL when the same content
* is reachable via multiple subdomains (apex + tenant).
*
* Type-application/ld+json runs synchronously without
* executing JS, so 'strict-dynamic' isn't required — we still
* carry the nonce because production CSP's default-src 'self'
* applies to any <script> element. The "type" attribute is
* what keeps the browser from running the body as JS, but
* CSP nonces are gated on the element not the type, so we
* include the nonce too.
*/}
<script
type="application/ld+json"
nonce={nonce}
dangerouslySetInnerHTML={{
__html: JSON.stringify({
"@context": "https://schema.org",
"@graph": [
{
"@type": "Organization",
"@id": `${SITE_URL}#organization`,
name: "Molecule AI",
url: SITE_URL,
logo: `${SITE_URL}/molecule-icon.png`,
sameAs: [
"https://github.com/molecule-ai",
"https://x.com/moleculeai",
],
},
{
"@type": "WebSite",
"@id": `${SITE_URL}#website`,
url: SITE_URL,
name: "Molecule AI",
publisher: { "@id": `${SITE_URL}#organization` },
inLanguage: "en-US",
},
{
"@type": "SoftwareApplication",
"@id": `${SITE_URL}#software`,
name: "Molecule AI",
applicationCategory: "DeveloperApplication",
operatingSystem: "Web",
description:
"Org-chart canvas for AI agent teams with credit metering, audit, and one-click runtime provisioning.",
url: SITE_URL,
offers: {
"@type": "AggregateOffer",
priceCurrency: "USD",
lowPrice: "0",
highPrice: "99",
offerCount: "3",
url: `${SITE_URL}/pricing`,
},
publisher: { "@id": `${SITE_URL}#organization` },
},
],
}),
}}
/>
</head>
<body className={`bg-surface text-ink ${interFont.variable} ${monoFont.variable}`}>
<ThemeProvider initialTheme={theme}>
-82
@@ -1,82 +0,0 @@
import { ImageResponse } from "next/og";
// Marketing-launch SEO (mc#1486). Next.js App-Router file-system OG
// convention: served as `/opengraph-image` and auto-attached as
// `og:image` + `twitter:image`. Dynamic (not a static PNG in /public)
// so we can iterate the brand mark + tagline pre-launch without
// churning a binary blob in git history.
export const runtime = "edge";
export const alt = "Molecule AI — the AI org chart canvas";
export const size = { width: 1200, height: 630 };
export const contentType = "image/png";
export default function OG() {
return new ImageResponse(
(
<div
style={{
width: "100%",
height: "100%",
display: "flex",
flexDirection: "column",
alignItems: "flex-start",
justifyContent: "center",
padding: "80px",
background:
"linear-gradient(135deg, #0a0a0a 0%, #1a1a2e 60%, #16213e 100%)",
color: "#ffffff",
fontFamily: "system-ui, -apple-system, sans-serif",
}}
>
<div
style={{
fontSize: 28,
color: "#a3a3c2",
letterSpacing: "0.18em",
textTransform: "uppercase",
marginBottom: 24,
}}
>
Molecule AI
</div>
<div
style={{
fontSize: 76,
fontWeight: 700,
lineHeight: 1.05,
letterSpacing: "-0.02em",
maxWidth: 980,
}}
>
The AI org chart canvas
</div>
<div
style={{
fontSize: 32,
color: "#c8c8d8",
marginTop: 32,
lineHeight: 1.3,
maxWidth: 980,
}}
>
Wire Claude Code, Codex, Hermes, and OpenClaw agents into a governed
multi-agent workspace.
</div>
<div
style={{
position: "absolute",
right: 80,
bottom: 80,
fontSize: 22,
color: "#7a7a96",
display: "flex",
}}
>
moleculesai.app
</div>
</div>
),
{ ...size },
);
}
+4 -6
@@ -103,7 +103,7 @@ export default function Home() {
setHydrationError(null);
window.location.reload();
}}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm"
>
Retry
</button>
@@ -115,9 +115,7 @@ export default function Home() {
return (
<>
<main aria-label="Agent canvas">
<Canvas />
</main>
<Canvas />
<Legend />
<CommunicationOverlay />
{hydrationError && (
@@ -136,7 +134,7 @@ export default function Home() {
setHydrationError(null);
window.location.reload();
}}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm"
>
Retry
</button>
@@ -178,7 +176,7 @@ brew services start redis`}</pre>
</p>
<button
onClick={() => window.location.reload()}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm mt-2 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="px-4 py-2 bg-accent-strong hover:bg-accent text-white rounded-md text-sm mt-2"
>
Reload
</button>
-45
@@ -1,45 +0,0 @@
import type { MetadataRoute } from "next";
// Marketing-launch SEO (mc#1486). Next.js App-Router robots convention:
// this file is served as `/robots.txt` at build time and is the single
// source of truth for crawler allow/disallow.
//
// Contract:
// - Public marketing routes (/, /pricing, /blog/*) are crawlable.
// - Authed/app routes (/orgs, /api/*) are noindex'd. They render
// useful content only after a session round-trip, so a crawler hit
// just wastes our crawl budget and exposes endpoint shapes.
// - Tenant subdomains (<slug>.moleculesai.app) share this build but
// are blocked at the host level by the canvas middleware sending
// an `X-Robots-Tag: noindex` header — robots.txt is per-host and
// this file's `host` field claims the apex as canonical.
//
// Note: `sitemap` is published via the sibling `sitemap.ts` route; we
// reference it explicitly here so crawlers don't have to guess.
const SITE_URL =
process.env.NEXT_PUBLIC_SITE_URL ?? "https://app.moleculesai.app";
export default function robots(): MetadataRoute.Robots {
return {
rules: [
{
userAgent: "*",
allow: ["/", "/pricing", "/blog"],
// Authed app surface + API + transient checkout returns. The
// /orgs route boots the org-selector behind AuthGate; even
// though SSR returns markup, that markup is a login wall when
// hit by an unauthenticated crawler, so indexing it dilutes
// brand searches with a "Please sign in" snippet.
disallow: [
"/orgs",
"/orgs/",
"/api/",
"/cp/",
"/checkout/",
],
},
],
sitemap: `${SITE_URL}/sitemap.xml`,
host: SITE_URL,
};
}
-42
@@ -1,42 +0,0 @@
import type { MetadataRoute } from "next";
// Marketing-launch SEO (mc#1486). App-Router sitemap convention: this
// file is served as `/sitemap.xml` and enumerates the public marketing
// surface for search crawlers + AI training pipelines.
//
// Scope deliberately narrow:
// - Apex landing, pricing, and the (currently single) blog post.
// - Authed app routes are excluded — they're disallowed in robots.ts
// and would appear as "Please sign in" wall to a crawler.
//
// `lastModified` uses a build-time timestamp rather than per-route
// fs.stat so the same value applies regardless of where the build
// runs (Vercel/Railway/local). When we add CMS-backed blog content,
// swap to a per-entry timestamp from the source-of-truth metadata.
const SITE_URL =
process.env.NEXT_PUBLIC_SITE_URL ?? "https://app.moleculesai.app";
const BUILD_DATE = new Date();
export default function sitemap(): MetadataRoute.Sitemap {
return [
{
url: `${SITE_URL}/`,
lastModified: BUILD_DATE,
changeFrequency: "weekly",
priority: 1.0,
},
{
url: `${SITE_URL}/pricing`,
lastModified: BUILD_DATE,
changeFrequency: "weekly",
priority: 0.9,
},
{
url: `${SITE_URL}/blog/2026-04-20-chrome-devtools-mcp`,
lastModified: new Date("2026-04-20"),
changeFrequency: "monthly",
priority: 0.6,
},
];
}
+1 -1
@@ -132,7 +132,7 @@ export function AuditTrailPanel({ workspaceId }: Props) {
if (loading) {
return (
<div role="status" aria-live="polite" className="flex items-center justify-center h-32">
<div className="flex items-center justify-center h-32">
<span className="text-xs text-ink-mid">Loading audit trail</span>
</div>
);
@@ -133,13 +133,13 @@ export function ConversationTraceModal({ open, workspaceId: _workspaceId, onClos
{/* Timeline */}
<div className="flex-1 overflow-y-auto px-5 py-4">
{loading && (
<div role="status" aria-live="polite" className="text-xs text-ink-mid text-center py-8">
<div className="text-xs text-ink-mid text-center py-8">
Loading trace from all workspaces...
</div>
)}
{!loading && entries.length === 0 && (
<div role="status" aria-live="polite" className="text-xs text-ink-mid text-center py-8">
<div className="text-xs text-ink-mid text-center py-8">
No activity found
</div>
)}
+1 -1
@@ -105,7 +105,7 @@ export function EmptyState() {
{/* Template grid */}
{loading ? (
<div role="status" aria-live="polite" className="flex items-center justify-center gap-2 text-xs text-ink-mid py-4">
<div className="flex items-center justify-center gap-2 text-xs text-ink-mid py-4">
<Spinner />
Loading templates...
</div>
+85 -196
@@ -15,7 +15,7 @@
// ($AGENT_URL). They ARE NOT filled in server-side because the
// server doesn't know where the operator's agent will live.
import { useCallback, useRef, useState } from "react";
import { useCallback, useState } from "react";
import * as Dialog from "@radix-ui/react-dialog";
type Tab = "python" | "curl" | "claude" | "mcp" | "hermes" | "codex" | "openclaw" | "kimi" | "fields";
@@ -84,33 +84,6 @@ export function ExternalConnectModal({ info, onClose }: Props) {
: "python";
const [tab, setTab] = useState<Tab>(initialTab);
const [copiedKey, setCopiedKey] = useState<string | null>(null);
const tabRefs = useRef<Map<Tab, HTMLButtonElement | null>>(new Map());
const handleTabKeyDown = useCallback(
(e: React.KeyboardEvent<HTMLButtonElement>, current: Tab, tabs: Tab[]) => {
const idx = tabs.indexOf(current);
if (e.key === "ArrowRight" || e.key === "ArrowDown") {
e.preventDefault();
const next = tabs[(idx + 1) % tabs.length];
setTab(next);
tabRefs.current.get(next)?.focus();
} else if (e.key === "ArrowLeft" || e.key === "ArrowUp") {
e.preventDefault();
const prev = tabs[(idx - 1 + tabs.length) % tabs.length];
setTab(prev);
tabRefs.current.get(prev)?.focus();
} else if (e.key === "Home") {
e.preventDefault();
setTab(tabs[0]);
tabRefs.current.get(tabs[0])?.focus();
} else if (e.key === "End") {
e.preventDefault();
setTab(tabs[tabs.length - 1]);
tabRefs.current.get(tabs[tabs.length - 1])?.focus();
}
},
[],
);
const copy = useCallback(async (value: string, key: string) => {
try {
@@ -187,19 +160,6 @@ export function ExternalConnectModal({ info, onClose }: Props) {
`MOLECULE_WORKSPACE_TOKEN=${info.auth_token}`,
);
// Build the tab list once so both the tab bar and keyboard handler
// share the same ordered array. Computed here (after all filled* vars)
// so TypeScript's block-scoping analysis can reach them.
const tabList: Tab[] = [];
if (filledUniversalMcp) tabList.push("mcp");
tabList.push("python");
if (filledChannel) tabList.push("claude");
if (filledHermes) tabList.push("hermes");
if (filledCodex) tabList.push("codex");
if (filledOpenClaw) tabList.push("openclaw");
if (filledKimi) tabList.push("kimi");
tabList.push("curl", "fields");
return (
<Dialog.Root open onOpenChange={(o) => !o && onClose()}>
<Dialog.Portal>
@@ -220,18 +180,34 @@ export function ExternalConnectModal({ info, onClose }: Props) {
aria-label="Connection snippet format"
className="mt-4 flex gap-1 border-b border-line"
>
{tabList.map((t) => (
{(() => {
// Build the tab order dynamically. Claude Code first
// (when offered) since it's the simplest setup; Python
// SDK second (full register+heartbeat+inbound); Universal
// MCP third (any MCP-aware runtime, outbound-only); curl
// for one-shot register; Fields for raw values.
// Tab order: Universal MCP first (default, runtime-
// agnostic primitives), then runtime-specific channel/
// SDK tabs, then curl + Fields. Each runtime tab only
// appears when the platform supplies the snippet — no
// dead "tab missing snippet" UX.
const tabs: Tab[] = [];
if (filledUniversalMcp) tabs.push("mcp");
tabs.push("python");
if (filledChannel) tabs.push("claude");
if (filledHermes) tabs.push("hermes");
if (filledCodex) tabs.push("codex");
if (filledOpenClaw) tabs.push("openclaw");
if (filledKimi) tabs.push("kimi");
tabs.push("curl", "fields");
return tabs;
})().map((t) => (
<button
key={t}
type="button"
role="tab"
id={`tab-${t}`}
aria-selected={tab === t}
aria-controls={`panel-${t}`}
tabIndex={tab === t ? 0 : -1}
ref={(el) => { tabRefs.current.set(t, el); }}
onClick={() => setTab(t)}
onKeyDown={(e) => handleTabKeyDown(e, t, tabList)}
className={`px-3 py-2 text-sm border-b-2 -mb-px transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1 focus-visible:ring-offset-surface ${
tab === t
? "border-accent text-ink"
@@ -259,39 +235,18 @@ export function ExternalConnectModal({ info, onClose }: Props) {
))}
</div>
{/* Snippet area — all panels always in the DOM so aria-controls
targets are stable. Hidden panels use aria-hidden so screen
readers skip them; active panel uses role=tabpanel with
aria-labelledby pointing to the tab button. */}
<div className="mt-3" data-testid="snippet-panels">
{/* Claude Code tab */}
<div
id="panel-claude"
data-testid="panel-claude"
role="tabpanel"
aria-labelledby="tab-claude"
hidden={tab !== "claude" || !filledChannel}
className={tab === "claude" && filledChannel ? "" : "hidden"}
>
{filledChannel && (
<SnippetBlock
value={filledChannel}
label="Claude Code channel — polls workspace's A2A; no tunnel needed"
copyKey="claude"
copied={copiedKey === "claude"}
onCopy={() => copy(filledChannel, "claude")}
/>
)}
</div>
{/* Python SDK tab */}
<div
id="panel-python"
data-testid="panel-python"
role="tabpanel"
aria-labelledby="tab-python"
hidden={tab !== "python"}
className={tab === "python" ? "" : "hidden"}
>
{/* Snippet area */}
<div className="mt-3">
{tab === "claude" && filledChannel && (
<SnippetBlock
value={filledChannel}
label="Claude Code channel — polls workspace's A2A; no tunnel needed"
copyKey="claude"
copied={copiedKey === "claude"}
onCopy={() => copy(filledChannel, "claude")}
/>
)}
{tab === "python" && (
<SnippetBlock
value={filledPython}
label="Python SDK — includes heartbeat loop (push-mode, needs public URL)"
@@ -299,16 +254,8 @@ export function ExternalConnectModal({ info, onClose }: Props) {
copied={copiedKey === "python"}
onCopy={() => copy(filledPython, "python")}
/>
</div>
{/* curl tab */}
<div
id="panel-curl"
data-testid="panel-curl"
role="tabpanel"
aria-labelledby="tab-curl"
hidden={tab !== "curl"}
className={tab === "curl" ? "" : "hidden"}
>
)}
{tab === "curl" && (
<SnippetBlock
value={filledCurl}
label="curl — one-shot register only (no heartbeat)"
@@ -316,111 +263,53 @@ export function ExternalConnectModal({ info, onClose }: Props) {
copied={copiedKey === "curl"}
onCopy={() => copy(filledCurl, "curl")}
/>
</div>
{/* Universal MCP tab */}
<div
id="panel-mcp"
data-testid="panel-mcp"
role="tabpanel"
aria-labelledby="tab-mcp"
hidden={tab !== "mcp" || !filledUniversalMcp}
className={tab === "mcp" && filledUniversalMcp ? "" : "hidden"}
>
{filledUniversalMcp && (
<SnippetBlock
value={filledUniversalMcp}
label="Universal MCP — standalone register + heartbeat + tools for any MCP-aware runtime (Claude Code, hermes, codex). Pair with Python or Claude Code tab if you need inbound A2A delivery."
copyKey="mcp"
copied={copiedKey === "mcp"}
onCopy={() => copy(filledUniversalMcp, "mcp")}
/>
)}
</div>
{/* Hermes tab */}
<div
id="panel-hermes"
data-testid="panel-hermes"
role="tabpanel"
aria-labelledby="tab-hermes"
hidden={tab !== "hermes" || !filledHermes}
className={tab === "hermes" && filledHermes ? "" : "hidden"}
>
{filledHermes && (
<SnippetBlock
value={filledHermes}
label="Hermes channel — bridges this workspace's A2A traffic into your hermes-agent session as platform messages (push parity with Claude Code). Long-poll based; no tunnel needed."
copyKey="hermes"
copied={copiedKey === "hermes"}
onCopy={() => copy(filledHermes, "hermes")}
/>
)}
</div>
{/* Codex tab */}
<div
id="panel-codex"
data-testid="panel-codex"
role="tabpanel"
aria-labelledby="tab-codex"
hidden={tab !== "codex" || !filledCodex}
className={tab === "codex" && filledCodex ? "" : "hidden"}
>
{filledCodex && (
<SnippetBlock
value={filledCodex}
label="Codex MCP config — wires the molecule MCP server into ~/.codex/config.toml. Outbound tools today; inbound A2A push needs the Python SDK tab paired in (codex's MCP runtime doesn't route arbitrary notifications/* yet)."
copyKey="codex"
copied={copiedKey === "codex"}
onCopy={() => copy(filledCodex, "codex")}
/>
)}
</div>
{/* OpenClaw tab */}
<div
id="panel-openclaw"
data-testid="panel-openclaw"
role="tabpanel"
aria-labelledby="tab-openclaw"
hidden={tab !== "openclaw" || !filledOpenClaw}
className={tab === "openclaw" && filledOpenClaw ? "" : "hidden"}
>
{filledOpenClaw && (
<SnippetBlock
value={filledOpenClaw}
label="OpenClaw MCP config — wires the molecule MCP server via openclaw mcp set + starts the gateway on loopback. Outbound tools today; inbound A2A push on an external openclaw needs the Python SDK tab paired in (a sessions.steer bridge daemon is future work)."
copyKey="openclaw"
copied={copiedKey === "openclaw"}
onCopy={() => copy(filledOpenClaw, "openclaw")}
/>
)}
</div>
{/* Kimi tab */}
<div
id="panel-kimi"
data-testid="panel-kimi"
role="tabpanel"
aria-labelledby="tab-kimi"
hidden={tab !== "kimi" || !filledKimi}
className={tab === "kimi" && filledKimi ? "" : "hidden"}
>
{filledKimi && (
<SnippetBlock
value={filledKimi}
label="Kimi CLI — self-contained Python bridge. Registers, heartbeats, polls for canvas messages, and echoes replies back. NAT-safe (no public URL). Run in a background terminal or via launchd."
copyKey="kimi"
copied={copiedKey === "kimi"}
onCopy={() => copy(filledKimi, "kimi")}
/>
)}
</div>
{/* Fields tab */}
<div
id="panel-fields"
data-testid="panel-fields"
role="tabpanel"
aria-labelledby="tab-fields"
hidden={tab !== "fields"}
className={tab === "fields" ? "" : "hidden"}
>
)}
{tab === "mcp" && filledUniversalMcp && (
<SnippetBlock
value={filledUniversalMcp}
label="Universal MCP — standalone register + heartbeat + tools for any MCP-aware runtime (Claude Code, hermes, codex). Pair with Python or Claude Code tab if you need inbound A2A delivery."
copyKey="mcp"
copied={copiedKey === "mcp"}
onCopy={() => copy(filledUniversalMcp, "mcp")}
/>
)}
{tab === "hermes" && filledHermes && (
<SnippetBlock
value={filledHermes}
label="Hermes channel — bridges this workspace's A2A traffic into your hermes-agent session as platform messages (push parity with Claude Code). Long-poll based; no tunnel needed."
copyKey="hermes"
copied={copiedKey === "hermes"}
onCopy={() => copy(filledHermes, "hermes")}
/>
)}
{tab === "codex" && filledCodex && (
<SnippetBlock
value={filledCodex}
label="Codex MCP config — wires the molecule MCP server into ~/.codex/config.toml. Outbound tools today; inbound A2A push needs the Python SDK tab paired in (codex's MCP runtime doesn't route arbitrary notifications/* yet)."
copyKey="codex"
copied={copiedKey === "codex"}
onCopy={() => copy(filledCodex, "codex")}
/>
)}
{tab === "openclaw" && filledOpenClaw && (
<SnippetBlock
value={filledOpenClaw}
label="OpenClaw MCP config — wires the molecule MCP server via openclaw mcp set + starts the gateway on loopback. Outbound tools today; inbound A2A push on an external openclaw needs the Python SDK tab paired in (a sessions.steer bridge daemon is future work)."
copyKey="openclaw"
copied={copiedKey === "openclaw"}
onCopy={() => copy(filledOpenClaw, "openclaw")}
/>
)}
{tab === "kimi" && filledKimi && (
<SnippetBlock
value={filledKimi}
label="Kimi CLI — self-contained Python bridge. Registers, heartbeats, polls for canvas messages, and echoes replies back. NAT-safe (no public URL). Run in a background terminal or via launchd."
copyKey="kimi"
copied={copiedKey === "kimi"}
onCopy={() => copy(filledKimi, "kimi")}
/>
)}
{tab === "fields" && (
<div className="space-y-2">
<Field label="workspace_id" value={info.workspace_id} onCopy={() => copy(info.workspace_id, "wsid")} copied={copiedKey === "wsid"} />
<Field label="platform_url" value={info.platform_url} onCopy={() => copy(info.platform_url, "url")} copied={copiedKey === "url"} />
@@ -434,7 +323,7 @@ export function ExternalConnectModal({ info, onClose }: Props) {
<Field label="registry_endpoint" value={info.registry_endpoint} onCopy={() => copy(info.registry_endpoint, "reg")} copied={copiedKey === "reg"} />
<Field label="heartbeat_endpoint" value={info.heartbeat_endpoint} onCopy={() => copy(info.heartbeat_endpoint, "hb")} copied={copiedKey === "hb"} />
</div>
</div>
)}
</div>
<div className="mt-5 flex justify-end gap-2">
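Review note: the `handleTabKeyDown` callback that appears in this file's hunks moves between tabs with wrap-around arrow keys plus Home/End. A minimal sketch of that index arithmetic as a pure function follows. The helper name `nextTabIndex` is hypothetical and is not part of the codebase; it only isolates the modulo logic so it can be checked without rendering the modal.

```typescript
// Hypothetical helper (not in the diff): mirrors the index arithmetic of
// handleTabKeyDown. Returns the index of the tab that should receive focus,
// or null when the key is not a navigation key.
function nextTabIndex(key: string, idx: number, length: number): number | null {
  if (length <= 0) return null;
  switch (key) {
    case "ArrowRight":
    case "ArrowDown":
      // Step forward, wrapping from the last tab to the first.
      return (idx + 1) % length;
    case "ArrowLeft":
    case "ArrowUp":
      // Step backward; adding length keeps the operand non-negative.
      return (idx - 1 + length) % length;
    case "Home":
      return 0;
    case "End":
      return length - 1;
    default:
      return null;
  }
}
```

A caller would map the returned index back into the ordered tab array, call `setTab`, and focus the matching button ref, as the callback in the hunk does.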
+2 -4
@@ -440,7 +440,6 @@ function ProviderPickerModal({
onChange={(e) => updateEntry(index, { value: e.target.value.trimStart() })}
placeholder={entry.key.includes("API_KEY") ? "sk-..." : "Enter value"}
type="password"
aria-label={`Value for ${entry.key}`}
ref={index === 0 ? firstInputRef : undefined}
onKeyDown={(e) => {
if (e.key === "Enter" && entry.value.trim()) {
@@ -460,7 +459,7 @@ function ProviderPickerModal({
)}
{entry.error && (
<div role="alert" aria-live="assertive" className="mt-1.5 text-[10px] text-bad">{entry.error}</div>
<div className="mt-1.5 text-[10px] text-bad">{entry.error}</div>
)}
</div>
))}
@@ -695,7 +694,6 @@ function AllKeysModal({
onChange={(e) => updateEntry(index, { value: e.target.value.trimStart() })}
placeholder={entry.key.includes("API_KEY") ? "sk-..." : "Enter value"}
type="password"
aria-label={`Value for ${entry.key}`}
autoFocus={index === 0}
onKeyDown={(e) => {
if (e.key === "Enter" && entry.value.trim()) {
@@ -720,7 +718,7 @@ function AllKeysModal({
))}
{globalError && (
<div role="alert" aria-live="assertive" className="px-3 py-2 bg-red-950/40 border border-red-800/50 rounded-lg text-[11px] text-bad">
<div className="px-3 py-2 bg-red-950/40 border border-red-800/50 rounded-lg text-[11px] text-bad">
{globalError}
</div>
)}
+1 -1
@@ -71,7 +71,7 @@ export function WorkspaceUsage({ workspaceId }: WorkspaceUsageProps) {
<SkeletonRow />
</>
) : error ? (
<p role="alert" aria-live="assertive" className="text-xs text-bad" data-testid="usage-error">
<p className="text-xs text-bad" data-testid="usage-error">
{error}
</p>
) : metrics ? (
@@ -131,9 +131,7 @@ describe("ExternalConnectModal — tab switching", () => {
it("switches to the Python SDK tab and shows the snippet with stamped token", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /python sdk/i }));
// Query within the python panel so we get the right pre (not the first in DOM).
const pythonPanel = document.querySelector("[data-testid='panel-python']");
const preEl = pythonPanel?.querySelector("pre");
const preEl = document.querySelector("pre");
expect(preEl?.textContent).toContain("AUTH_TOKEN");
// The placeholder is replaced with the real auth token
expect(preEl?.textContent).toContain("secret-auth-token-abc");
@@ -142,9 +140,7 @@ describe("ExternalConnectModal — tab switching", () => {
it("switches to the curl tab and shows the snippet with stamped token", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /curl/i }));
// Query within the curl panel so we get the right pre (not the first in DOM).
const curlPanel = document.querySelector("[data-testid='panel-curl']");
const preEl = curlPanel?.querySelector("pre");
const preEl = document.querySelector("pre");
expect(preEl?.textContent).toContain("curl");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
@@ -152,11 +148,9 @@ describe("ExternalConnectModal — tab switching", () => {
it("switches to the Fields tab and shows raw values", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /fields/i }));
// Query within the fields panel for specific values.
const fieldsPanel = document.querySelector("[data-testid='panel-fields']");
expect(fieldsPanel?.textContent).toContain("ws-123");
expect(fieldsPanel?.textContent).toContain("https://app.example.com");
expect(fieldsPanel?.textContent).toContain("secret-auth-token-abc");
expect(screen.getByText("ws-123")).toBeTruthy();
expect(screen.getByText("https://app.example.com")).toBeTruthy();
expect(screen.getByText("secret-auth-token-abc")).toBeTruthy();
});
it("hides the Hermes tab when hermes_channel_snippet is absent", () => {
@@ -174,8 +168,7 @@ describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the Python snippet instead of the placeholder", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /python sdk/i }));
const pythonPanel = document.querySelector("[data-testid='panel-python']");
const preEl = pythonPanel?.querySelector("pre");
const preEl = document.querySelector("pre");
expect(preEl?.textContent).not.toContain("<paste from create response>");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
@@ -183,8 +176,7 @@ describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the curl snippet", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /curl/i }));
const curlPanel = document.querySelector("[data-testid='panel-curl']");
const preEl = curlPanel?.querySelector("pre");
const preEl = document.querySelector("pre");
// curl template uses WORKSPACE_AUTH_TOKEN placeholder, not the generic one
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
@@ -192,8 +184,7 @@ describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the Universal MCP snippet", () => {
renderAndFlush(defaultInfo);
// Default tab is Universal MCP
const mcpPanel = document.querySelector("[data-testid='panel-mcp']");
const preEl = mcpPanel?.querySelector("pre");
const preEl = document.querySelector("pre");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
expect(preEl?.textContent).not.toContain("<paste from create response>");
});
@@ -202,10 +193,8 @@ describe("ExternalConnectModal — snippet token stamping", () => {
describe("ExternalConnectModal — copy functionality", () => {
it("calls navigator.clipboard.writeText with the snippet text", () => {
renderAndFlush(defaultInfo);
// Default tab is Universal MCP — query the copy button within the mcp panel.
const mcpPanel = document.querySelector("[data-testid='panel-mcp']");
const copyBtn = mcpPanel?.querySelector("button");
if (copyBtn) fireEvent.click(copyBtn);
// Default tab is Universal MCP
fireEvent.click(screen.getByRole("button", { name: /^copy$/i }));
expect(clipboardWriteText).toHaveBeenCalledWith(
expect.stringContaining("secret-auth-token-abc"),
);
@@ -238,8 +227,7 @@ describe("ExternalConnectModal — missing optional fields", () => {
};
renderAndFlush(minimalInfo);
fireEvent.click(screen.getByRole("tab", { name: /fields/i }));
const fieldsPanel = document.querySelector("[data-testid='panel-fields']");
expect(fieldsPanel?.textContent).toContain("(missing)");
expect(screen.getByText("(missing)")).toBeTruthy();
});
it("hides the Hermes tab when hermes_channel_snippet is absent", () => {
@@ -11,21 +11,13 @@ import { render, screen, fireEvent, cleanup, act } from "@testing-library/react"
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { TestConnectionButton } from "../ui/TestConnectionButton";
import type { SecretGroup } from "@/types/secrets";
import { validateSecret, ApiError } from "@/lib/api/secrets";
import { validateSecret } from "@/lib/api/secrets";
// ─── Mock validateSecret ──────────────────────────────────────────────────────
// vi.mock is hoisted, so validateSecret (imported above) refers to the mocked
// namespace value once vi.mock runs. Use vi.mocked() to access it in tests.
vi.mock("@/lib/api/secrets", () => ({
validateSecret: vi.fn(),
ApiError: class ApiError extends Error {
status: number;
constructor(status: number, message: string) {
super(message);
this.name = "ApiError";
this.status = status;
}
},
}));
// SecretGroup is a string literal type: 'github' | 'anthropic' | 'openrouter' | 'custom'
@@ -110,7 +102,7 @@ describe("TestConnectionButton — state machine", () => {
expect(screen.getByText("Permission denied")).toBeTruthy();
});
it("shows a connectivity message on a genuine network exception", async () => {
it("shows generic error message on unexpected exception", async () => {
vi.mocked(validateSecret).mockRejectedValue(new Error("timeout"));
render(<TestConnectionButton provider={toGroup("anthropic")} secretValue="sk-..." />);
@@ -118,23 +110,8 @@ describe("TestConnectionButton — state machine", () => {
await act(async () => { /* flush */ });
expect(screen.getByRole("alert")).toBeTruthy();
// A real thrown network error → honest connectivity message (not a
// fabricated "service down"); see internal#492.
expect(document.body.querySelector('[role="alert"]')?.textContent).toMatch(
/could not reach the validation service/i,
);
});
it("does not claim a timeout when the validate endpoint 404s (internal#492)", async () => {
vi.mocked(validateSecret).mockRejectedValue(new ApiError(404, "Not Found"));
render(<TestConnectionButton provider={toGroup("anthropic")} secretValue="sk-..." />);
fireEvent.click(screen.getByRole("button"));
await act(async () => { /* flush */ });
const alert = document.body.querySelector('[role="alert"]')?.textContent ?? "";
expect(alert).not.toMatch(/timed out/i);
expect(alert).toMatch(/not available/i);
// The error detail is hardcoded to "Connection timed out. Service may be down."
expect(document.body.querySelector('[role="alert"]')?.textContent).toMatch(/timed out/i);
});
});
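Review note: the two test variants in this hunk disagree on what the alert should say. One expects a connectivity message for a thrown network error and a "not available" message for a 404; the other expects the hardcoded "Connection timed out. Service may be down." text. The sketch below shows one way the first behaviour could be expressed. The function name `validationErrorMessage` and the exact message strings are assumptions chosen only to satisfy the regexes in the tests; the `ApiError` class copies the shape of the mock in this test file, not the real implementation in `@/lib/api/secrets`.

```typescript
// Same shape as the ApiError mock in the test file (status + message).
class ApiError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

// Duck-typed check so the mapping does not depend on instanceof.
function hasStatus(err: unknown): err is { status: number } {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { status?: unknown }).status === "number"
  );
}

// Hypothetical mapping; the message wording is illustrative only.
function validationErrorMessage(err: unknown): string {
  if (hasStatus(err)) {
    if (err.status === 404) {
      return "Validation is not available for this provider.";
    }
    return `Validation failed (HTTP ${err.status}).`;
  }
  // No HTTP status: the request never got a response.
  return "Could not reach the validation service.";
}
```

The design point is that an HTTP error response and a failed request are reported differently, so a 404 is never described as a timeout.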
@@ -223,7 +223,6 @@ export function MobileCanvas({
textTransform: "uppercase",
fontWeight: 600,
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
Reset
</button>
+21 -82
@@ -2,11 +2,8 @@
// 04 · Chat — message thread + composer + sub-tabs.
// Wired to the same /workspaces/:id/a2a (method message/send) endpoint
// that the desktop ChatTab uses. Render parity with desktop ChatTab is
// achieved by reusing its renderers rather than forking a reduced
// mobile path: the Agent Comms sub-tab mounts the same AgentCommsPanel,
// and message attachments route through the same AttachmentPreview
// dispatch the desktop My-Chat bubble uses (#231/#232).
// that the desktop ChatTab uses, but with a slimmer surface: no
// attachments, no A2A topology overlay, no conversation tracing.
import { useEffect, useMemo, useRef, useState } from "react";
import ReactMarkdown from "react-markdown";
@@ -19,9 +16,6 @@ import {
useChatSend,
useChatSocket,
} from "@/components/tabs/chat/hooks";
import { AgentCommsPanel } from "@/components/tabs/chat/AgentCommsPanel";
import { AttachmentPreview } from "@/components/tabs/chat/AttachmentPreview";
import { downloadChatFile } from "@/components/tabs/chat/uploads";
import { toMobileAgent } from "./components";
import { MOBILE_FONT_MONO, MOBILE_FONT_SANS, usePalette } from "./palette";
@@ -310,17 +304,6 @@ export function MobileChat({
const removePendingFile = (index: number) =>
setPendingFiles((prev) => prev.filter((_, i) => i !== index));
// Route attachment downloads through the same authenticated helper
// the desktop ChatTab uses (downloadChatFile) so platform-scheme
// URIs get a real Blob with auth headers instead of about:blank.
const downloadAttachment = (att: ChatAttachment) => {
downloadChatFile(agentId, att).catch(() => {
// AttachmentPreview's own error affordance covers the in-bubble
// failure state; matches ChatTab's behaviour of not double-
// reporting a download failure.
});
};
const send = async () => {
const text = draft.trim();
if ((!text && pendingFiles.length === 0) || sending || !reachable) return;
@@ -356,7 +339,6 @@ export function MobileChat({
type="button"
onClick={onBack}
aria-label="Back"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 36,
height: 36,
@@ -403,7 +385,6 @@ export function MobileChat({
<button
type="button"
aria-label="More"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 36,
height: 36,
@@ -434,7 +415,6 @@ export function MobileChat({
key={t.id}
type="button"
onClick={() => setTab(t.id)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
padding: "4px 0 8px",
border: "none",
@@ -453,19 +433,7 @@ export function MobileChat({
</div>
</div>
{/* Agent Comms — reuse the desktop AgentCommsPanel verbatim so
mobile renders the identical peer/A2A + delegation feed
(history GET + live socket events) instead of a placeholder
(#231). The panel owns its own scroll/load/error/empty
states, matching ChatTab's agent-comms tabpanel. */}
{tab === "a2a" && (
<div style={{ flex: 1, minHeight: 0, overflow: "hidden" }}>
<AgentCommsPanel workspaceId={agentId} />
</div>
)}
{/* Messages */}
{tab === "my" && (
<div
ref={scrollRef}
style={{
@@ -477,8 +445,20 @@ export function MobileChat({
gap: 8,
}}
>
{tab === "a2a" && (
<div
style={{
padding: "20px 4px",
textAlign: "center",
color: p.text3,
fontSize: 13,
}}
>
Agent Comms peer-to-peer A2A traffic surfaces in the Comms tab.
</div>
)}
{tab === "my" && historyLoading && (
<div role="status" aria-live="polite" style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Loading chat history
</div>
)}
@@ -498,8 +478,6 @@ export function MobileChat({
onClick={() => {
loadInitial();
}}
aria-label="Retry loading chat history"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-red-400"
style={{
padding: "6px 14px",
borderRadius: 14,
@@ -515,7 +493,7 @@ export function MobileChat({
</div>
)}
{tab === "my" && !historyLoading && !historyError && messages.length === 0 && (
<div role="status" aria-live="polite" style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Send a message to start chatting.
</div>
)}
@@ -543,31 +521,9 @@ export function MobileChat({
overflowWrap: "anywhere",
}}
>
{m.content && (
<MarkdownBubble dark={dark} accent={p.accent}>
{m.content}
</MarkdownBubble>
)}
{m.attachments && m.attachments.length > 0 && (
<div
style={{
display: "flex",
flexWrap: "wrap",
gap: 4,
marginTop: m.content ? 6 : 0,
}}
>
{m.attachments.map((att, i) => (
<AttachmentPreview
key={`${m.id}-${i}`}
workspaceId={agentId}
attachment={att}
onDownload={downloadAttachment}
tone={mine ? "user" : "agent"}
/>
))}
</div>
)}
<MarkdownBubble dark={dark} accent={p.accent}>
{m.content}
</MarkdownBubble>
<div
style={{
fontSize: 10,
@@ -598,13 +554,7 @@ export function MobileChat({
</div>
)}
</div>
)}
{/* Footer ID + composer belong to My Chat only. The Agent Comms
tab is a read-only peer/A2A feed (parity with desktop
ChatTab, where the agent-comms tabpanel has no composer). */}
{tab === "my" && (
<>
{/* Footer ID */}
<div
style={{
@@ -669,7 +619,6 @@ export function MobileChat({
type="button"
onClick={() => removePendingFile(i)}
aria-label={`Remove ${f.name}`}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
border: "none",
background: "transparent",
@@ -710,7 +659,6 @@ export function MobileChat({
onClick={() => fileInputRef.current?.click()}
disabled={!reachable || sending || uploading}
aria-label="Attach"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 32,
height: 32,
@@ -732,7 +680,6 @@ export function MobileChat({
ref={composerRef}
value={draft}
onChange={(e) => setDraft(e.target.value)}
aria-label="Message"
onKeyDown={(e) => {
// Enter sends; Shift+Enter inserts a newline. Skip when the
// IME is composing — pressing Enter to commit a Chinese/
@@ -756,12 +703,7 @@ export function MobileChat({
border: "none",
outline: "none",
background: "transparent",
// iOS Safari/PWA zooms the viewport when a focused textarea
// has a computed font-size below 16px. 14.5 triggers that
// focus-zoom; the page looks broken until the user pinches
// back (#224, same class as desktop #1434 / sibling #225).
// 16px is the minimum that keeps focus from zooming.
fontSize: 16,
fontSize: 14.5,
lineHeight: 1.4,
color: p.text,
padding: "6px 0",
@@ -777,13 +719,12 @@ export function MobileChat({
onClick={send}
disabled={(!draft.trim() && pendingFiles.length === 0) || !reachable || sending || uploading}
aria-label="Send"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 36,
height: 36,
borderRadius: 999,
border: "none",
cursor: (draft.trim() || pendingFiles.length === 0) && !sending && !uploading ? "pointer" : "not-allowed",
cursor: (draft.trim() || pendingFiles.length > 0) && !sending && !uploading ? "pointer" : "not-allowed",
flexShrink: 0,
background:
(draft.trim() || pendingFiles.length > 0) && reachable && !sending && !uploading
@@ -805,8 +746,6 @@ export function MobileChat({
</button>
</div>
</div>
</>
)}
</div>
);
}
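Review note: the Send button hunk in this file carries two versions of the `cursor` condition, one testing `pendingFiles.length === 0` and one testing `pendingFiles.length > 0`, while `disabled` and `background` both use the "has text or has files" form. A sketch of deriving all three from a single predicate is shown below. The `ComposerState` interface and the `canSend` name are hypothetical, and including `reachable` in the cursor decision is an assumption (the cursor expression in the hunk omits it).

```typescript
// Hypothetical shape; field names follow the variables used in MobileChat.
interface ComposerState {
  draft: string;
  pendingFileCount: number;
  reachable: boolean;
  sending: boolean;
  uploading: boolean;
}

// True only when there is something to send and nothing blocks sending.
function canSend(s: ComposerState): boolean {
  const hasContent = s.draft.trim().length > 0 || s.pendingFileCount > 0;
  return hasContent && s.reachable && !s.sending && !s.uploading;
}

// Derive every visual state from the same boolean so they cannot drift apart.
function sendButtonProps(s: ComposerState): { disabled: boolean; cursor: string } {
  const ok = canSend(s);
  return { disabled: !ok, cursor: ok ? "pointer" : "not-allowed" };
}
```

With this shape, a file-only message (empty draft, one pending file) yields an enabled button with a pointer cursor, and an empty composer yields a disabled button with `not-allowed`.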
+2 -3
@@ -231,7 +231,6 @@ export function MobileComms({ dark }: { dark: boolean }) {
fontSize: 13,
fontWeight: 500,
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
{o.label}
<span
@@ -252,11 +251,11 @@ export function MobileComms({ dark }: { dark: boolean }) {
<div style={{ padding: "0 14px", display: "flex", flexDirection: "column", gap: 8 }}>
{loading && items.length === 0 ? (
<div role="status" aria-live="polite" style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Loading recent comms
</div>
) : filtered.length === 0 ? (
<div role="status" aria-live="polite" style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div style={{ padding: "30px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
No A2A traffic yet.
</div>
) : (
@@ -83,12 +83,11 @@ export function MobileDetail({
type="button"
onClick={onBack}
aria-label="Back"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={iconButtonStyle(p, dark)}
>
{Icons.back({ size: 18 })}
</button>
<button type="button" aria-label="More" className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900" style={iconButtonStyle(p, dark)}>
<button type="button" aria-label="More" style={iconButtonStyle(p, dark)}>
{Icons.more({ size: 18 })}
</button>
</div>
@@ -184,7 +183,6 @@ export function MobileDetail({
key={t.id}
type="button"
onClick={() => setTab(t.id)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
padding: "8px 14px",
borderRadius: 999,
@@ -217,7 +215,6 @@ export function MobileDetail({
type="button"
onClick={onChat}
data-testid="mobile-chat-cta"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: "100%",
height: 52,
@@ -419,8 +416,6 @@ function DetailActivity({ workspaceId, dark }: { workspaceId: string; dark: bool
if (items === null) {
return (
<div
role="status"
aria-live="polite"
style={{
background: p.surface,
borderRadius: 16,
@@ -200,7 +200,6 @@ export function MobileHome({
justifyContent: "center",
boxShadow: "0 8px 24px rgba(40,30,20,0.25), 0 2px 6px rgba(40,30,20,0.15)",
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
{Icons.plus({ size: 22 })}
</button>
@@ -92,7 +92,6 @@ export function MobileMe({
border: on ? `2px solid ${p.text}` : "2px solid transparent",
boxShadow: on ? `0 0 0 2px ${p.bg} inset` : "none",
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
/>
);
})}
@@ -185,7 +184,6 @@ function SegmentedRow({
fontSize: 13,
fontWeight: 600,
}}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
>
{o.label}
</button>
+1 -16
@@ -148,7 +148,6 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
type="button"
onClick={onClose}
aria-label="Close"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: 32,
height: 32,
@@ -171,8 +170,6 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
<div style={{ padding: "0 14px" }}>
{loadingTemplates ? (
<div
role="status"
aria-live="polite"
style={{
padding: "24px 8px",
textAlign: "center",
@@ -217,8 +214,6 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
setTplId(t.id);
setTier(tCode);
}}
aria-label={`Select template: ${t.name} (tier ${t.tier})`}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
background: on
? dark
@@ -307,7 +302,6 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
<input
value={name}
onChange={(e) => setName(e.target.value)}
aria-label="Agent name"
placeholder={tplId
? (templates.find((t) => t.id === tplId)?.name ?? "agent-name")
: "agent-name"}
@@ -318,12 +312,7 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
border: `0.5px solid ${p.border}`,
borderRadius: 12,
fontFamily: MOBILE_FONT_MONO,
// iOS Safari/PWA zooms the viewport when a focused input has
// a computed font-size below 16px; the layout jumps and the
// page looks broken until the user pinches back (#224 / #225,
// same class as desktop #1434). 16px is the minimum that
// suppresses that focus-zoom.
fontSize: 16,
fontSize: 13.5,
color: p.text,
outline: "none",
boxSizing: "border-box",
@@ -341,8 +330,6 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
key={t}
type="button"
onClick={() => setTier(t)}
aria-label={`Select tier ${t}: ${TIER_LABEL[t]}`}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
flex: 1,
padding: "10px 8px",
@@ -390,8 +377,6 @@ export function MobileSpawn({ dark, onClose }: { dark: boolean; onClose: () => v
type="button"
onClick={handleSpawn}
disabled={busy || !tplId || templates.length === 0}
aria-label="Spawn agent"
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
width: "100%",
height: 52,
@@ -21,14 +21,6 @@ import { MobileChat } from "../MobileChat";
vi.mock("@/lib/api");
import { api } from "@/lib/api";
// AgentCommsPanel (mounted by the Agent Comms sub-tab, #231) subscribes
// to the global socket via useSocketEvent. Stub it to a no-op so the
// panel mounts without the real ReconnectingSocket — the parity tests
// only assert the panel renders (vs the old static placeholder).
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: vi.fn(),
}));
// ─── Mock store ───────────────────────────────────────────────────────────────
const mockAgentId = "ws-chat-test";
@@ -163,12 +155,6 @@ beforeEach(() => {
mockOnBack.mockClear();
mockStoreState.nodes = [];
mockStoreState.agentMessages = {};
// jsdom doesn't implement scrollIntoView. The Agent Comms tab now
// mounts AgentCommsPanel (#231), which scrolls its feed to bottom on
// arrival; a no-op stub keeps the panel from throwing under jsdom
// (same stub AgentCommsPanel's own render test installs).
Element.prototype.scrollIntoView =
vi.fn() as unknown as Element["scrollIntoView"];
// Set up spies on the real api methods. Tests override these per-call.
const getSpy = vi.spyOn(api, "get");
const postSpy = vi.spyOn(api, "post");
@@ -263,20 +249,6 @@ describe("MobileChat — composer", () => {
const sendBtn = container.querySelector('[aria-label="Send"]') as HTMLButtonElement;
expect(sendBtn.disabled).toBe(true);
});
// Regression #224: the composer textarea must render with font-size
// ≥ 16px. iOS Safari and PWAs auto-zoom the viewport when a focused
// input has a computed font-size below 16px — the layout jumps and
// the page looks broken until the user pinches back. Same class as
// desktop #1434 / sibling MobileSpawn #225.
it("composer textarea renders at font-size 16px or greater (iOS focus-zoom regression #224)", () => {
const { container } = renderChat(mockAgentId);
const textarea = container.querySelector("textarea") as HTMLTextAreaElement;
expect(textarea).toBeTruthy();
const fs = Number.parseFloat(textarea.style.fontSize);
expect(Number.isFinite(fs)).toBe(true);
expect(fs).toBeGreaterThanOrEqual(16);
});
});
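The removed regression test above encodes the rule that a focused input must carry a font-size of at least 16px to avoid the iOS focus-zoom. A minimal sketch of that check as a standalone helper follows; `IOS_MIN_INPUT_FONT_PX`, `parseFontSizePx`, and `avoidsIosFocusZoom` are hypothetical names that do not appear in the diff, and the sketch only handles pixel values, as the inline styles in these components do.

```typescript
// Hypothetical helper, not part of this diff. Only the 16px threshold
// comes from the comments in the changed files.
const IOS_MIN_INPUT_FONT_PX = 16;

/** Parses an inline font-size such as 16, "16px" or "13.5px" (px units assumed). */
function parseFontSizePx(value: string | number | undefined): number {
  if (typeof value === "number") return value;
  if (value === undefined) return NaN;
  return parseFloat(value);
}

/** True when the size is a finite number at or above the 16px threshold. */
function avoidsIosFocusZoom(value: string | number | undefined): boolean {
  const px = parseFontSizePx(value);
  return isFinite(px) && px >= IOS_MIN_INPUT_FONT_PX;
}

const previousNameInput = avoidsIosFocusZoom(16); // true
const revertedNameInput = avoidsIosFocusZoom("13.5px"); // false
```

A check like this mirrors what the deleted tests asserted against `style.fontSize`; it says nothing about sizes set through CSS classes, which jsdom would not resolve.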
// ─── Tabs ─────────────────────────────────────────────────────────────────────
@@ -502,146 +474,3 @@ describe("MobileChat — chat history", () => {
expect(getSpy).toHaveBeenCalledTimes(2);
});
});
// ─── #232 · Attachment render parity with desktop ChatTab ────────────────────
//
// Regression for the CTO-reported mobile bug: MobileChat used to render
// only m.content (no attachment surface), so files sent/received in a
// conversation were invisible on mobile while desktop showed them. The
// fix routes m.attachments through the same AttachmentPreview the
// desktop ChatTab bubble uses.
describe("MobileChat — attachment render parity (#232)", () => {
beforeEach(() => {
mockStoreState.nodes = [onlineNode];
});
it("renders an attachment from a history message via AttachmentPreview", async () => {
const getSpy = vi.spyOn(api, "get");
// useChatHistory reads { messages, reached_end }.
getSpy.mockResolvedValueOnce({
messages: [
{
id: "m-att-1",
role: "agent",
content: "Here is the report",
attachments: [
{
name: "report.csv",
uri: "workspace://out/report.csv",
mimeType: "text/csv",
size: 2048,
},
],
timestamp: new Date().toISOString(),
},
],
reached_end: true,
});
let rr: ReturnType<typeof renderChat>;
await act(async () => {
rr = renderChat(mockAgentId);
});
const { container } = rr!;
// A non-image attachment renders the AttachmentChip download button
// with title="Download <name>" — same component the desktop bubble
// dispatches through AttachmentPreview.
await waitFor(() => {
const chip = container.querySelector('[title="Download report.csv"]');
expect(chip).toBeTruthy();
});
expect(container.textContent ?? "").toContain("report.csv");
});
});
// ─── #231 · Agent Comms (A2A/peer) render parity with desktop ChatTab ────────
//
// Regression for the CTO-reported mobile bug: the Agent Comms sub-tab
// rendered a static placeholder string ("peer-to-peer A2A traffic
// surfaces in the Comms tab") instead of the real feed. The fix mounts
// the same AgentCommsPanel the desktop ChatTab agent-comms tabpanel
// uses, so peer/A2A + delegation activity is visible on mobile.
describe("MobileChat — Agent Comms render parity (#231)", () => {
beforeEach(() => {
mockStoreState.nodes = [onlineNode];
});
it("mounts AgentCommsPanel on the Agent Comms tab (not the old placeholder)", async () => {
const getSpy = vi.spyOn(api, "get");
// 1st GET: useChatHistory (My Chat) on mount.
getSpy.mockResolvedValueOnce({ messages: [], reached_end: true });
// 2nd GET: AgentCommsPanel's activity load when the tab is shown.
// Empty list → panel renders its own empty state, which still
// proves AgentCommsPanel mounted (vs. the removed placeholder).
getSpy.mockResolvedValueOnce([]);
let rr: ReturnType<typeof renderChat>;
await act(async () => {
rr = renderChat(mockAgentId);
});
const { container } = rr!;
const commsTab = Array.from(container.querySelectorAll("button")).find(
(b) => b.textContent?.trim() === "Agent Comms",
);
expect(commsTab).toBeTruthy();
await act(async () => {
commsTab!.click();
});
await waitFor(() => {
const text = container.textContent ?? "";
// The panel's empty state — proves AgentCommsPanel mounted.
expect(text).toContain("No agent-to-agent communications yet.");
});
// The old hard-coded placeholder must be gone.
expect(container.textContent ?? "").not.toContain(
"peer-to-peer A2A traffic surfaces in the Comms tab",
);
// The panel hit its activity endpoint.
expect(getSpy).toHaveBeenCalledWith(
expect.stringContaining(`/workspaces/${mockAgentId}/activity`),
);
});
it("renders a peer message on the Agent Comms tab", async () => {
const getSpy = vi.spyOn(api, "get");
getSpy.mockResolvedValueOnce({ messages: [], reached_end: true });
// a2a_receive from a peer → AgentCommsPanel.toCommMessage maps it
// to an inbound bubble with the request text.
getSpy.mockResolvedValueOnce([
{
id: "act-1",
activity_type: "a2a_receive",
source_id: "peer-ws-uuid",
target_id: mockAgentId,
method: "message/send",
summary: "peer asked something",
request_body: { task: "Please review PR 42" },
response_body: null,
status: "ok",
created_at: new Date().toISOString(),
},
]);
let rr: ReturnType<typeof renderChat>;
await act(async () => {
rr = renderChat(mockAgentId);
});
const { container } = rr!;
const commsTab = Array.from(container.querySelectorAll("button")).find(
(b) => b.textContent?.trim() === "Agent Comms",
);
await act(async () => {
commsTab!.click();
});
await waitFor(() => {
expect(container.textContent ?? "").toContain("Please review PR 42");
});
});
});
@@ -93,24 +93,6 @@ describe("MobileSpawn — render", () => {
expect(input).toBeTruthy();
});
// Regression #224 / #225: the agent-name input must render with a
// font-size ≥ 16px. iOS Safari and PWAs auto-zoom the viewport when a
// focused input has a computed font-size below 16px — the layout
// jumps and the page looks broken until the user pinches back.
it("renders the name input at font-size 16px or greater (iOS focus-zoom regression)", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const input = document.querySelector(
'input[aria-label="Agent name"]',
) as HTMLInputElement | null;
expect(input).toBeTruthy();
// Parse the inline style font-size — jsdom doesn't run a layout
// engine, so getComputedStyle reports the inline value verbatim.
const fs = Number.parseFloat(input!.style.fontSize);
expect(Number.isFinite(fs)).toBe(true);
expect(fs).toBeGreaterThanOrEqual(16);
});
it("renders all 4 tier buttons", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
@@ -133,7 +133,6 @@ export function TabBar({
aria-label={t.label}
onClick={() => onChange(t.id)}
onKeyDown={(e) => handleKeyDown(e, idx)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
background: "none",
border: "none",
@@ -292,7 +291,6 @@ export function AgentCard({
data-testid="workspace-card"
aria-label={`${agent.name}, status: ${agent.status}, tier ${agent.tier}${agent.remote ? ", remote" : ""}`}
onClick={onClick}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
display: "block",
width: "100%",
@@ -446,7 +444,6 @@ export function FilterChips({
type="button"
aria-checked={on}
onClick={() => onChange(o.id)}
className="focus:outline-none focus-visible:ring-2 focus-visible:ring-emerald-500 focus-visible:ring-offset-2 focus-visible:ring-offset-zinc-100 dark:focus-visible:ring-offset-zinc-900"
style={{
display: "inline-flex",
alignItems: "center",
@@ -160,14 +160,14 @@ export function OrgTokensTab() {
</code>
<button
onClick={handleCopy}
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors"
>
{copied ? 'Copied' : 'Copy'}
</button>
</div>
<button
onClick={() => setNewToken(null)}
className="text-[9px] text-good/60 hover:text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="text-[9px] text-good/60 hover:text-good transition-colors"
>
Dismiss
</button>
@@ -219,7 +219,7 @@ export function OrgTokensTab() {
</div>
<button
onClick={() => setRevokeTarget(t)}
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1 shrink-0 focus:outline-none focus-visible:ring-2 focus-visible:ring-red-400 focus-visible:ring-offset-1"
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1 shrink-0"
>
Revoke
</button>
+23 -19
@@ -3,24 +3,16 @@ import { useState, useCallback, useRef, useEffect } from 'react';
import type { Secret, SecretGroup } from '@/types/secrets';
import { useSecretsStore } from '@/stores/secrets-store';
import { StatusBadge } from '@/components/ui/StatusBadge';
import { RevealToggle } from '@/components/ui/RevealToggle';
import { KeyValueField } from '@/components/ui/KeyValueField';
import { ValidationHint } from '@/components/ui/ValidationHint';
import { TestConnectionButton } from '@/components/ui/TestConnectionButton';
import { validateSecretValue } from '@/lib/validation/secret-formats';
import { SERVICES } from '@/lib/services';
const AUTO_HIDE_MS = 30_000;
const VALIDATION_DEBOUNCE_MS = 400;
// Secret values are write-only from the browser: the server List endpoint
// "Never exposes values", there is no per-secret decrypt route, and the
// only decrypted path (GET /secrets/values) is bulk + token-gated for
// remote agents. The old eye/RevealToggle was a dead affordance — it
// flipped its own icon but could never reveal anything, which read as
// "this doesn't work" (esp. once clicked → eye-with-slash). We show an
// honest static indicator instead; rotation is via Edit.
const WRITE_ONLY_TITLE =
'Value is write-only and cannot be revealed — use Edit to replace/rotate it';
interface SecretRowProps {
secret: Secret;
workspaceId: string;
@@ -39,12 +31,28 @@ export function SecretRow({ secret, workspaceId }: SecretRowProps) {
const setSecretStatus = useSecretsStore((s) => s.setSecretStatus);
const isEditing = editingKey === secret.name;
const [revealed, setRevealed] = useState(false);
const [editValue, setEditValue] = useState('');
const [validationError, setValidationError] = useState<string | null>(null);
const [isSaving, setIsSaving] = useState(false);
const [saveError, setSaveError] = useState<string | null>(null);
const debounceRef = useRef<ReturnType<typeof setTimeout>>(undefined);
const editBtnRef = useRef<HTMLButtonElement>(null);
const revealTimerRef = useRef<ReturnType<typeof setTimeout>>(undefined);
// Auto-hide revealed value after 30s
useEffect(() => {
if (revealed) {
clearTimeout(revealTimerRef.current);
revealTimerRef.current = setTimeout(() => setRevealed(false), AUTO_HIDE_MS);
return () => clearTimeout(revealTimerRef.current);
}
}, [revealed]);
// Reset revealed state when panel closes (session-only)
useEffect(() => {
return () => setRevealed(false);
}, []);
// Debounced validation
useEffect(() => {
@@ -125,15 +133,11 @@ export function SecretRow({ secret, workspaceId }: SecretRowProps) {
{secret.masked_value}
</span>
<div className="secret-row__actions">
<span
data-testid="write-only-indicator"
className="secret-row__write-only"
role="img"
aria-label={`${secret.name} value is write-only and cannot be revealed; use Edit to replace it`}
title={WRITE_ONLY_TITLE}
>
🔒
</span>
<RevealToggle
revealed={revealed}
onToggle={() => setRevealed((r) => !r)}
label={`Toggle reveal ${secret.name}`}
/>
<StatusBadge status={secret.status} />
<button
type="button"
+3 -36
@@ -16,40 +16,7 @@ interface TokensTabProps {
workspaceId: string;
}
// The settings panel passes the literal sentinel "global" when no canvas
// node is selected. Workspace tokens are inherently per-workspace — there
// is no /workspaces/global/tokens endpoint (querying the uuid column with
// "global" 500s on Postgres). The org-wide equivalent lives in the
// separate "Org API Keys" tab. Mirrors the sentinel-awareness that
// api/secrets.ts already has (workspaceId === 'global' → /settings/secrets).
const GLOBAL_WORKSPACE_ID = 'global';
export function TokensTab({ workspaceId }: TokensTabProps) {
if (workspaceId === GLOBAL_WORKSPACE_ID) {
return (
<div className="p-4 space-y-4">
<div>
<h3 className="text-sm font-semibold text-ink">API Tokens</h3>
<p className="text-[10px] text-ink-mid mt-0.5">
Bearer tokens for authenticating API calls to this workspace.
</p>
</div>
<div className="text-center py-6">
<p className="text-xs text-ink-mid">Select a workspace node first</p>
<p className="text-[10px] text-ink-mid mt-1">
Workspace tokens are scoped to a single workspace. Select a node
on the canvas to manage its tokens, or use the{' '}
<span className="text-accent font-medium">Org API Keys</span> tab
for org-wide API keys.
</p>
</div>
</div>
);
}
return <WorkspaceTokensTab workspaceId={workspaceId} />;
}
function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
const [tokens, setTokens] = useState<Token[]>([]);
const [loading, setLoading] = useState(true);
const [creating, setCreating] = useState(false);
@@ -140,14 +107,14 @@ function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
</code>
<button
onClick={handleCopy}
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="shrink-0 px-2 py-1.5 bg-emerald-800/40 hover:bg-emerald-700/50 border border-emerald-700/40 rounded text-[10px] text-good transition-colors"
>
{copied ? 'Copied' : 'Copy'}
</button>
</div>
<button
onClick={() => setNewToken(null)}
className="text-[9px] text-good/60 hover:text-good transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
className="text-[9px] text-good/60 hover:text-good transition-colors"
>
Dismiss
</button>
@@ -192,7 +159,7 @@ function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
</div>
<button
onClick={() => setRevokeTarget(t)}
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1 focus:outline-none focus-visible:ring-2 focus-visible:ring-red-400 focus-visible:ring-offset-1"
className="text-[10px] text-bad/70 hover:text-bad transition-colors px-2 py-1"
>
Revoke
</button>
@@ -138,54 +138,14 @@ describe("SecretRow — display mode", () => {
expect(document.querySelector('[role="row"]')).toBeTruthy();
});
it("has Copy, Edit, Delete buttons", () => {
it("has Reveal, Copy, Edit, Delete buttons", () => {
render(<SecretRow secret={GITHUB_SECRET} workspaceId="ws-1" />);
expect(screen.getByTestId("reveal-toggle")).toBeTruthy();
expect(screen.getByRole("button", { name: /copy/i })).toBeTruthy();
expect(screen.getByRole("button", { name: /edit/i })).toBeTruthy();
expect(screen.getByRole("button", { name: /delete/i })).toBeTruthy();
});
// Regression: the reveal/eye control was a dead affordance. Clicking it
// flipped its own icon (eye → eye-with-slash) but never revealed the value,
// because secret values are write-only from the browser (server List
// "Never exposes values"; there is no per-secret decrypt endpoint and the
// client has no plaintext-fetch function). The honest fix removes the
// toggle and shows a static "write-only / cannot be revealed" indicator.
// See internal tracking issue + internal#210/#211.
it("does NOT render a reveal/eye toggle (values are write-only)", () => {
render(<SecretRow secret={GITHUB_SECRET} workspaceId="ws-1" />);
expect(screen.queryByTestId("reveal-toggle")).toBeNull();
expect(
screen.queryByRole("button", { name: /toggle reveal/i }),
).toBeNull();
});
it("shows a write-only indicator explaining the value cannot be revealed", () => {
render(<SecretRow secret={ANTHROPIC_SECRET} workspaceId="ws-1" />);
const indicator = screen.getByTestId("write-only-indicator");
expect(indicator).toBeTruthy();
// Affordance must be honest: explain it cannot be revealed and that
// Edit is the rotate path. It must not be a clickable button.
const title = indicator.getAttribute("title") ?? "";
expect(title.toLowerCase()).toMatch(/write-only|cannot be revealed/);
expect(indicator.tagName).not.toBe("BUTTON");
});
it("write-only indicator is present for the Anthropic/OAuth-token row too", () => {
// The reported bug singled out CLAUDE_CODE_OAUTH_TOKEN (anthropic group);
// the fix is group-agnostic — every row gets the same honest affordance.
const OAUTH_SECRET = {
name: "CLAUDE_CODE_OAUTH_TOKEN",
masked_value: "••••••••••••••••9d2a",
group: "anthropic" as const,
status: "unverified" as const,
updated_at: "2024-01-04",
};
render(<SecretRow secret={OAUTH_SECRET} workspaceId="ws-1" />);
expect(screen.queryByTestId("reveal-toggle")).toBeNull();
expect(screen.getByTestId("write-only-indicator")).toBeTruthy();
});
it("shows invalid status correctly", () => {
render(<SecretRow secret={CUSTOM_SECRET} workspaceId="ws-1" />);
expect(screen.getByTestId("status-badge").getAttribute("data-status")).toBe("invalid");
@@ -302,35 +302,3 @@ describe("TokensTab — error", () => {
expect(document.querySelector('[role="status"]')).toBeNull();
});
});
// ─── "global" sentinel (no node selected) ────────────────────────────────────
//
// Regression: SettingsPanel passes the literal "global" when no canvas
// node is selected. Workspace tokens are per-workspace and there is no
// /workspaces/global/tokens endpoint — calling it 500'd
// ("invalid input syntax for type uuid: global"). The tab must NOT call
// the API in that state and must point the user at the Org API Keys tab.
describe("TokensTab — global sentinel (no node selected)", () => {
beforeEach(() => {
mockApiGet.mockReset();
mockApiPost.mockReset();
mockApiGet.mockRejectedValue(new Error("should not be called"));
});
it("does not call the API and shows a pointer to Org API Keys", async () => {
render(<TokensTab workspaceId="global" />);
await flush();
expect(mockApiGet).not.toHaveBeenCalled();
expect(mockApiPost).not.toHaveBeenCalled();
expect(document.body.textContent).toContain("Select a workspace node");
expect(document.body.textContent).toContain("Org API Keys");
// No error banner, no scary 500 surfacing.
expect(document.querySelector(".text-bad")).toBeNull();
});
it("has no create button in the global state", async () => {
render(<TokensTab workspaceId="global" />);
await flush();
expect(document.body.textContent).not.toContain("New Token");
});
});
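The sentinel handling that this hunk removes can be read as a small routing decision made before any request is issued. The sketch below is an illustration under assumptions: `TokensView` and `resolveTokensView` are hypothetical names, and the endpoint shape is inferred from the `/workspaces/global/tokens` path mentioned in the comments rather than confirmed by the diff.

```typescript
// Hypothetical sketch, not code from this diff. Only the "global" sentinel
// and the /workspaces/<id>/tokens path shape are taken from the comments above.
const GLOBAL_WORKSPACE_ID = "global";

type TokensView =
  | { kind: "select-workspace" } // no canvas node selected: show the pointer text
  | { kind: "workspace"; endpoint: string }; // a real workspace id: safe to query

function resolveTokensView(workspaceId: string): TokensView {
  if (workspaceId === GLOBAL_WORKSPACE_ID) {
    // Never build a URL from the sentinel; per the comments above, the
    // server would try to read it as a uuid and fail.
    return { kind: "select-workspace" };
  }
  return {
    kind: "workspace",
    endpoint: `/workspaces/${encodeURIComponent(workspaceId)}/tokens`,
  };
}

const noSelection = resolveTokensView("global");
const selected = resolveTokensView("ws-1");
```

Returning a tagged union keeps the "no API call in the global state" guarantee visible in the types: a caller cannot reach an endpoint without first checking `kind`.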
+1 -1
@@ -185,7 +185,7 @@ export function ActivityTab({ workspaceId }: Props) {
{/* Activity list */}
<div className="flex-1 overflow-y-auto p-3 space-y-1.5">
{loading && activities.length === 0 && (
<div role="status" aria-live="polite" className="text-xs text-ink-mid text-center py-8">Loading activity...</div>
<div className="text-xs text-ink-mid text-center py-8">Loading activity...</div>
)}
{error && (
+1 -1
@@ -262,7 +262,7 @@ export function ChannelsTab({ workspaceId }: Props) {
</div>
{error && (
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{error}
</div>
)}
+2 -129
@@ -81,7 +81,7 @@ function AgentCardSection({ workspaceId }: { workspaceId: string }) {
spellCheck={false} rows={12}
className="w-full bg-surface-card border border-line rounded p-2 text-[10px] font-mono text-ink focus:outline-none focus:border-accent resize-none"
/>
{error && <div role="alert" aria-live="assertive" className="px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">{error}</div>}
{error && <div className="px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">{error}</div>}
<div className="flex gap-2">
<button type="button" onClick={handleSave} disabled={saving}
className="px-2 py-1 bg-accent hover:bg-accent-strong text-[10px] rounded text-white disabled:opacity-50 transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1 focus-visible:ring-offset-surface">
@@ -109,130 +109,6 @@ function AgentCardSection({ workspaceId }: { workspaceId: string }) {
);
}
// --- Agent Abilities Section ---
//
// Always-visible on/off controls for the two workspace-level ability flags
// (broadcast_enabled, talk_to_user_enabled). Both are mutated through the
// same admin endpoint the ChatTab recovery banner already uses
// (PATCH /workspaces/:id/abilities) and reflected into the canvas store node
// data (broadcastEnabled / talkToUserEnabled) so every surface that reads
// useCanvasStore.nodes stays consistent without a full re-hydrate.
//
// Before this section there was NO canvas control for either flag: the
// backend was fully wired (workspace_abilities.go / workspace_broadcast.go /
// agent_message_writer.go, see commit 29b4bffb + internal#510/#511) but the
// only frontend affordance was the ChatTab recovery banner, which renders
// solely when talk_to_user_enabled===false and so is invisible under the
// TRUE default and never existed at all for broadcast.
function AgentAbilitiesSection({ workspaceId }: { workspaceId: string }) {
// Read the live ability flags off the canvas store node — the platform
// event stream hydrates these (canvas-topology.ts maps the workspace row's
// broadcast_enabled/talk_to_user_enabled onto node data), so this stays in
// sync with the recovery banner and avoids a duplicate GET. Mirrors the
// store-read pattern used by AgentCardSection above.
const node = useCanvasStore((s) =>
s.nodes?.find?.((n) => n.id === workspaceId),
);
// Defaults match the backend column defaults + canvas-topology mapping:
// broadcast_enabled defaults FALSE, talk_to_user_enabled defaults TRUE.
const broadcastEnabled = node?.data.broadcastEnabled ?? false;
const talkToUserEnabled = node?.data.talkToUserEnabled ?? true;
// Track an in-flight PATCH per field so a double-click can't fire two
// racing writes, and surface a one-line error if the server rejects.
const [pending, setPending] = useState<null | "broadcast" | "talk">(null);
const [error, setError] = useState<string | null>(null);
const patchAbility = async (
which: "broadcast" | "talk",
body: { broadcast_enabled: boolean } | { talk_to_user_enabled: boolean },
optimistic: Partial<{ broadcastEnabled: boolean; talkToUserEnabled: boolean }>,
) => {
setError(null);
setPending(which);
// Optimistic store update — the toggle flips immediately; on failure we
// roll back to the server-truth value the store last held.
const prev = {
broadcastEnabled,
talkToUserEnabled,
};
useCanvasStore.getState().updateNodeData(workspaceId, optimistic);
try {
await api.patch(`/workspaces/${workspaceId}/abilities`, body);
} catch (e) {
// Roll back the optimistic change to last-known server truth.
useCanvasStore.getState().updateNodeData(workspaceId, {
broadcastEnabled: prev.broadcastEnabled,
talkToUserEnabled: prev.talkToUserEnabled,
});
setError(
e instanceof Error ? e.message : "Failed to update ability — try again",
);
} finally {
setPending(null);
}
};
return (
<Section title="Agent Abilities">
<p className="text-[10px] text-ink-mid px-1 pb-1">
Workspace-level permissions for this agent. Changes apply immediately
(no restart required).
</p>
<div className="space-y-2">
<div>
<Toggle
label="Talk to user"
checked={talkToUserEnabled}
onChange={(v) =>
pending
? undefined
: patchAbility(
"talk",
{ talk_to_user_enabled: v },
{ talkToUserEnabled: v },
)
}
/>
<p className="text-[10px] text-ink-mid mt-0.5 ml-6">
When off, the agent&apos;s <code className="font-mono">send_message_to_user</code>{" "}
and <code className="font-mono">POST /notify</code> calls are
rejected (403); it must route updates through a parent workspace.
</p>
</div>
<div>
<Toggle
label="Broadcast to peers"
checked={broadcastEnabled}
onChange={(v) =>
pending
? undefined
: patchAbility(
"broadcast",
{ broadcast_enabled: v },
{ broadcastEnabled: v },
)
}
/>
<p className="text-[10px] text-ink-mid mt-0.5 ml-6">
When on, the agent may <code className="font-mono">POST /broadcast</code>{" "}
to message all non-removed agent workspaces in the org. Off by
default; only privileged orchestrators should hold this.
</p>
</div>
</div>
{pending && (
<div className="mt-2 text-[10px] text-ink-mid">Saving</div>
)}
{error && (
<div role="alert" aria-live="assertive" className="mt-2 px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">
{error}
</div>
)}
</Section>
);
}
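The removed `patchAbility` follows an optimistic-update pattern: snapshot the current flags, apply the change locally, then restore the snapshot if the write fails. A minimal sketch of that pattern with in-memory stand-ins is shown here; `AbilityStore` and `patchWithRollback` are hypothetical names, and the `save` callback stands in for the `api.patch` call, so none of this is the component's actual store or client.

```typescript
// Hypothetical stand-ins for the canvas store and API client in the diff.
interface Abilities {
  broadcastEnabled: boolean;
  talkToUserEnabled: boolean;
}

class AbilityStore {
  // Defaults follow the comments in the diff: broadcast off, talk-to-user on.
  private state: Abilities = { broadcastEnabled: false, talkToUserEnabled: true };

  get(): Abilities {
    return { ...this.state };
  }

  update(patch: Partial<Abilities>): void {
    this.state = { ...this.state, ...patch };
  }
}

/** Applies `patch` immediately, then rolls back if `save` rejects.
 *  Resolves to null on success or to an error message on failure. */
async function patchWithRollback(
  store: AbilityStore,
  patch: Partial<Abilities>,
  save: (patch: Partial<Abilities>) => Promise<void>,
): Promise<string | null> {
  const previous = store.get(); // snapshot of last-known values
  store.update(patch); // optimistic: the UI flips right away
  try {
    await save(patch);
    return null;
  } catch (e) {
    store.update(previous); // restore the snapshot on failure
    return e instanceof Error ? e.message : "Failed to update ability";
  }
}
```

The sketch leaves out the per-field `pending` guard that the removed component used to block a second write while one was in flight; a caller would need an equivalent guard to avoid racing rollbacks.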
// --- Main ConfigTab ---
interface ModelSpec {
@@ -919,7 +795,6 @@ export function ConfigTab({ workspaceId }: Props) {
<label className="text-[10px] text-ink-mid block mb-1">Model</label>
<input
type="text"
aria-label="Model"
value={currentModelId}
onChange={(e) => {
const v = e.target.value;
@@ -1010,8 +885,6 @@ export function ConfigTab({ workspaceId }: Props) {
)}
</Section>
<AgentAbilitiesSection workspaceId={workspaceId} />
{/* Claude Settings — shown for claude-code runtime or claude/anthropic model names */}
{(config.runtime === "claude-code" ||
(config.runtime_config?.model || config.model || "").toLowerCase().includes("claude") ||
@@ -1122,7 +995,7 @@ export function ConfigTab({ workspaceId }: Props) {
)}
{error && (
<div role="alert" aria-live="assertive" className="mx-3 mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">{error}</div>
<div className="mx-3 mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">{error}</div>
)}
{!error && RUNTIMES_WITH_OWN_CONFIG.has(config.runtime || "") && (
<div className="mx-3 mb-2 px-3 py-1.5 bg-surface-sunken/50 border border-line rounded text-xs text-ink-mid">
+3 -3
@@ -157,7 +157,7 @@ export function DetailsTab({ workspaceId, data }: Props) {
</select>
</Field>
{saveError && (
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{saveError}
</div>
)}
@@ -203,7 +203,7 @@ export function DetailsTab({ workspaceId, data }: Props) {
{isRestartable && (
<div className="pt-2">
{restartError && (
<div role="alert" aria-live="assertive" className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{restartError}
</div>
)}
@@ -307,7 +307,7 @@ export function DetailsTab({ workspaceId, data }: Props) {
{/* Delete */}
<Section title="Danger Zone">
{deleteError && (
<div role="alert" aria-live="assertive" className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div className="mb-2 px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{deleteError}
</div>
)}
+1 -1
@@ -82,7 +82,7 @@ export function EventsTab({ workspaceId }: Props) {
</div>
{error && (
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{error}
</div>
)}
@@ -102,7 +102,7 @@ export function ExternalConnectionSection({ workspaceId }: Props) {
</div>
{error && (
<div role="alert" aria-live="assertive" className="mt-2 px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">
<div className="mt-2 px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad">
{error}
</div>
)}
+5 -48
@@ -45,54 +45,11 @@ export function FilesTab({ workspaceId, data }: Props) {
if (data && isExternalLikeRuntime(data.runtime)) {
return <NotAvailablePanel runtime={data.runtime} />;
}
return <PlatformOwnedFilesTab workspaceId={workspaceId} runtime={data?.runtime} />;
return <PlatformOwnedFilesTab workspaceId={workspaceId} />;
}
/** Picks the initial root for the FilesTab dropdown based on the
* workspace's runtime. Decision: per-runtime default (Hongming
* 2026-05-15, internal#425 Decisions §2).
*
* - openclaw → `/agent-home` (the agent's identity/state — the
* user-facing interesting files for that runtime live in
* `~/.openclaw/` inside the container, which `/agent-home` maps to
* via the Phase 2b docker-exec backend).
* - everything else (claude-code, hermes, external-like, undefined)
* → `/configs` (the legacy default — managed config that flows
* through the per-runtime indirection in
* workspace-server/internal/handlers/template_files_eic.go).
*
* When the runtime is undefined (legacy callers that don't thread
* `data` through, or a workspace whose runtime field hasn't loaded
* yet) the default is `/configs` — matches today's behaviour, no
* surprise.
*
* Note on `/agent-home` pre-Phase-2b: the backend short-circuits
* with HTTP 501 and the canonical "implementation pending" body.
* The tab renders empty + the error banner explains. This is by
* design — lets us land the canvas UX before the backend ships,
* per the RFC's phased rollout. The 501 is graceful: it doesn't
* poison error toasts or generate "workspace not found" noise.
*
* Adding a new runtime that should default to `/agent-home`: add it
* to the agentHomeDefaultRuntimes set below. Adding a runtime that
* should default to a different root: extend this function. */
const agentHomeDefaultRuntimes = new Set(["openclaw"]);
function defaultRootForRuntime(runtime: string | undefined): string {
if (runtime && agentHomeDefaultRuntimes.has(runtime)) {
return "/agent-home";
}
return "/configs";
}
function PlatformOwnedFilesTab({
workspaceId,
runtime,
}: {
workspaceId: string;
runtime?: string;
}) {
const [root, setRoot] = useState(() => defaultRootForRuntime(runtime));
function PlatformOwnedFilesTab({ workspaceId }: { workspaceId: string }) {
const [root, setRoot] = useState("/configs");
const [selectedFile, setSelectedFile] = useState<string | null>(null);
const [fileContent, setFileContent] = useState("");
const [editContent, setEditContent] = useState("");
@@ -266,7 +223,7 @@ function PlatformOwnedFilesTab({
// immediately. Delete-All hovers DARKER (bg-red-700) — same AA
// contrast trap that bit ConfirmDialog/ApprovalBanner. Cancel
// lifts to surface-elevated instead of the prior no-op hover.
<div role="alertdialog" aria-modal="false" aria-labelledby="files-delete-all-msg" className="mx-3 mt-2 px-3 py-2 bg-red-950/30 border border-red-800/40 rounded space-y-1.5">
<div role="alertdialog" aria-labelledby="files-delete-all-msg" className="mx-3 mt-2 px-3 py-2 bg-red-950/30 border border-red-800/40 rounded space-y-1.5">
<p id="files-delete-all-msg" className="text-xs text-bad">Delete all {files.filter((f) => !f.dir).length} files? This cannot be undone.</p>
<div className="flex gap-2">
<button type="button" onClick={() => { handleDeleteAll(); setShowDeleteAll(false); }} className="px-2 py-0.5 bg-red-700 hover:bg-red-600 text-[10px] rounded text-white transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-red-500/60 focus-visible:ring-offset-1 focus-visible:ring-offset-surface">Delete All</button>
@@ -280,7 +237,7 @@ function PlatformOwnedFilesTab({
)}
{confirmDelete && (
<div role="alertdialog" aria-modal="false" aria-labelledby="files-delete-one-msg" className="mx-3 mt-2 px-3 py-2 bg-amber-950/30 border border-amber-800/40 rounded space-y-1.5">
<div role="alertdialog" aria-labelledby="files-delete-one-msg" className="mx-3 mt-2 px-3 py-2 bg-amber-950/30 border border-amber-800/40 rounded space-y-1.5">
<p id="files-delete-one-msg" className="text-xs text-warm">Delete <span className="font-mono">{confirmDelete}</span>{files.find((f) => f.path === confirmDelete && f.dir) ? " and all its contents" : ""}?</p>
<div className="flex gap-2">
<button type="button" onClick={confirmDeleteFile} className="px-2 py-0.5 bg-red-700 hover:bg-red-600 text-[10px] rounded text-white transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-red-500/60 focus-visible:ring-offset-1 focus-visible:ring-offset-surface">Delete</button>
@@ -3,22 +3,6 @@
import { useRef } from "react";
import { getIcon } from "./tree";
// secretShapeMarker is the canonical body the workspace-server Files
// API returns when a file's path OR content matched a credential
// regex (internal#425 RFC, Phase 2b — backed by
// workspace-server/internal/secrets.ScanBytes). The marker is a
// fixed prefix so the canvas can detect it without parsing JSON and
// without round-tripping the matched bytes through the editor (which
// would defeat the purpose — clipboard, browser history, log
// surfaces would all see them).
//
// Today (Phase 1 / before 2b ships) the backend returns 501 for the
// only root that uses this path, so the marker is dead code until
// 2b lands. Wiring it in now keeps the canvas + backend contracts
// aligned in one PR rather than a follow-up. The constant is
// importable so a future test can pin the exact string.
export const SECRET_SHAPE_DENIED_MARKER = "<denied: secret-shape>";
interface Props {
selectedFile: string | null;
fileContent: string;
@@ -47,22 +31,6 @@ export function FileEditor({
const editorRef = useRef<HTMLTextAreaElement>(null);
const isDirty = editContent !== fileContent;
// internal#425 Phase 3: detect the secret-shape denial marker and
// render a placeholder instead of the editor. The marker comes
// from workspace-server Phase 2b (secrets.ScanBytes) which refuses
// to surface the file's bytes. We deliberately don't expose
// the matched pattern's Name here — the canvas just shows the
// generic denial. The Files API log surface has the Pattern.Name
// for operators who need to debug a false positive.
const isSecretShapeDenied = fileContent === SECRET_SHAPE_DENIED_MARKER;
// /agent-home is read-only from the canvas (Phase 2b ships read +
// delete; Phase-2b-followup may add write). Edits to /configs are
// unchanged. Until 2b ships, /agent-home returns 501 so this
// read-only gate is also dead code, but wiring it in now keeps
// the UI honest the moment 2b lands without a follow-up canvas PR.
const isReadOnlyRoot = root !== "/configs";
if (!selectedFile) {
return (
<div className="flex-1 flex items-center justify-center">
@@ -107,42 +75,11 @@ export function FileEditor({
{/* Editor area */}
{loadingFile ? (
<div className="p-4 text-xs text-ink-mid">Loading...</div>
) : isSecretShapeDenied ? (
// Files API refused to surface this file's bytes because its
// path or content matched a credential regex
// (workspace-server/internal/secrets, internal#425 Phase 2b).
// We render a placeholder INSTEAD OF the textarea so the
// matched bytes never enter the DOM. Clipboard / view-source
// / element-inspector all see the placeholder, not the
// credential.
<div
role="region"
aria-label="File content denied"
className="flex-1 flex items-center justify-center p-6 bg-surface"
>
<div className="max-w-md text-center space-y-2">
<div className="text-2xl opacity-40">🛡</div>
<p className="text-[11px] font-mono text-warm">
{SECRET_SHAPE_DENIED_MARKER}
</p>
<p className="text-[10px] text-ink-mid leading-relaxed">
The platform refused to surface this file because its
path or content matched a credential-shape pattern.
The bytes never left the workspace container.
</p>
<p className="text-[10px] text-ink-mid leading-relaxed">
If this is a false positive (test fixture, docs example,
or content that happens to share a credential's shape),
rename the file or adjust the content via the workspace
terminal so the regex no longer matches, then refresh.
</p>
</div>
</div>
) : (
<textarea
ref={editorRef}
value={editContent}
readOnly={isReadOnlyRoot}
readOnly={root !== "/configs"}
onChange={(e) => setEditContent(e.target.value)}
onKeyDown={(e) => {
if ((e.metaKey || e.ctrlKey) && e.key === "s") {
@@ -38,15 +38,6 @@ export function FilesToolbar({
<option value="/home">/home</option>
<option value="/workspace">/workspace</option>
<option value="/plugins">/plugins</option>
{/* internal#425 Phase 1+3: container-internal $HOME root.
Backend lands the docker-exec dispatch in Phase 2b. Until
then the stub returns 501 with a canonical
"implementation pending" message — the dropdown renders
the option so the canvas affordance is design-frozen
even before the backend ships.
Runtime-default selection logic in FilesTab.tsx picks
this as the initial value for openclaw workspaces. */}
<option value="/agent-home">/agent-home</option>
</select>
<span className="text-[10px] text-ink-mid">{fileCount} files</span>
</div>
@@ -1,181 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for the /agent-home root selector + per-runtime default-root
* + secret-shape denial placeholder (internal#425 Phase 3).
*
* Separate file so the diff is reviewable as a unit and the existing
* FilesToolbar / FileEditor / FilesTab tests don't have to grow
* agent-home-specific cases. Once Phase 2b lands, the read-only +
* 501-stub assertions here can be tightened (or moved into the main
* test file as the agent-home root becomes a first-class affordance).
*/
import React from "react";
import { render, screen, cleanup } from "@testing-library/react";
import { afterEach, describe, expect, it, vi } from "vitest";
import { FilesToolbar } from "../FilesToolbar";
import {
FileEditor,
SECRET_SHAPE_DENIED_MARKER,
} from "../FileEditor";
afterEach(cleanup);
describe("internal#425 Phase 3 — /agent-home root selector", () => {
it("dropdown includes /agent-home as an option", () => {
// Pins that the affordance is in the DOM even pre-Phase-2b: the
// canvas design freezes today, the backend lands the dispatch
// later. Without this, a future refactor that drops the option
// would silently regress the RFC's Phase 1 contract (canvas
// visibility) without breaking any other test.
render(
<FilesToolbar
root="/configs"
setRoot={vi.fn()}
fileCount={0}
onNewFile={vi.fn()}
onUpload={vi.fn()}
onDownloadAll={vi.fn()}
onClearAll={vi.fn()}
onRefresh={vi.fn()}
/>,
);
const select = screen.getByRole("combobox", {
name: /file root directory/i,
}) as HTMLSelectElement;
const values = Array.from(select.options).map((o) => o.value);
expect(values).toContain("/agent-home");
});
it("dropdown shows /agent-home as the SELECTED root when prop is /agent-home", () => {
render(
<FilesToolbar
root="/agent-home"
setRoot={vi.fn()}
fileCount={0}
onNewFile={vi.fn()}
onUpload={vi.fn()}
onDownloadAll={vi.fn()}
onClearAll={vi.fn()}
onRefresh={vi.fn()}
/>,
);
const select = screen.getByRole("combobox", {
name: /file root directory/i,
}) as HTMLSelectElement;
expect(select.value).toBe("/agent-home");
});
});
describe("internal#425 Phase 3 — secret-shape denial placeholder", () => {
// Files API Phase 2b returns SECRET_SHAPE_DENIED_MARKER as the file
// body when the file's path or content matched a credential regex.
// The editor MUST render the marker as a placeholder, not pump it
// through the textarea — that would put the marker (and any future
// matched bytes if the backend contract changes) into the DOM
// value, clipboard, and inspector.
it("renders the denial placeholder INSTEAD of the textarea when fileContent is the marker", () => {
render(
<FileEditor
selectedFile="agent/.openclaw/secrets.env"
fileContent={SECRET_SHAPE_DENIED_MARKER}
editContent={SECRET_SHAPE_DENIED_MARKER}
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/agent-home"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
// Placeholder region present
expect(
screen.getByRole("region", { name: /file content denied/i }),
).toBeTruthy();
// Marker text visible (so a debugging operator sees the canonical
// contract string without having to dig into the source).
expect(screen.getByText(SECRET_SHAPE_DENIED_MARKER)).toBeTruthy();
// Critically: NO textarea — the bytes never reach a controlled
// input. A regression that re-introduces the textarea path would
// make the matched marker (and any future content) selectable +
// copyable.
expect(screen.queryByRole("textbox")).toBeNull();
});
it("renders the textarea normally when fileContent is regular content", () => {
render(
<FileEditor
selectedFile="config.yaml"
fileContent="name: openclaw\n"
editContent="name: openclaw\n"
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/configs"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
expect(screen.getByRole("textbox")).toBeTruthy();
expect(screen.queryByRole("region", { name: /file content denied/i }))
.toBeNull();
});
it("/agent-home renders textarea READ-ONLY for non-denied content", () => {
// Phase 2b ships read + delete on /agent-home; write semantics
// are decided later. Until then, the canvas presents the editor
// as read-only so a user can't type into a buffer that the
// backend will refuse to PUT. Without this gate, the user would
// edit, hit Save, get a 501, and lose their context for why.
render(
<FileEditor
selectedFile=".openclaw/agent-card.json"
fileContent='{"name":"openclaw"}'
editContent='{"name":"openclaw"}'
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/agent-home"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
const textarea = screen.getByRole("textbox") as HTMLTextAreaElement;
expect(textarea.readOnly).toBe(true);
});
it("/configs renders textarea WRITABLE (regression guard for the read-only gate)", () => {
render(
<FileEditor
selectedFile="config.yaml"
fileContent="name: x\n"
editContent="name: x\n"
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/configs"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
const textarea = screen.getByRole("textbox") as HTMLTextAreaElement;
expect(textarea.readOnly).toBe(false);
});
});
describe("internal#425 Phase 3 — marker constant is the canonical string", () => {
// The marker string is part of the canvas <-> workspace-server
// contract. The workspace-server emits this exact body; the canvas
// detects it by exact-equality. A typo on either side would
// silently break detection — the canvas would render the literal
// string in the textarea instead of the placeholder. Pin the
// contract value here.
it("matches the contract value '<denied: secret-shape>'", () => {
expect(SECRET_SHAPE_DENIED_MARKER).toBe("<denied: secret-shape>");
});
});
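Reviewer sketch, not part of the diff: the three-way branch that the FileEditor hunks above remove (loading, secret-shape placeholder, textarea with a read-only gate) can be restated as a pure function. The names `EditorView` and `resolveEditorView` are hypothetical; only the marker string, the exact-equality check, and the `root !== "/configs"` gate come from the diff.

```typescript
// Hypothetical helper: restates the FileEditor branching as a pure function.
// The marker value is the contract string pinned by the removed test.
const SECRET_SHAPE_DENIED_MARKER = "<denied: secret-shape>";

type EditorView =
  | { kind: "loading" }
  | { kind: "denied" }
  | { kind: "editor"; readOnly: boolean };

function resolveEditorView(
  loadingFile: boolean,
  fileContent: string,
  root: string,
): EditorView {
  if (loadingFile) return { kind: "loading" };
  // Exact equality, as in the diff: the marker is the entire file body.
  if (fileContent === SECRET_SHAPE_DENIED_MARKER) return { kind: "denied" };
  // Only /configs is writable from the canvas; every other root is read-only.
  return { kind: "editor", readOnly: root !== "/configs" };
}
```

Keeping the decision in one function would let a test pin the denied and read-only cases without rendering the component, though the removed tests chose to assert on the DOM instead.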
+1 -1
@@ -275,7 +275,7 @@ export function ScheduleTab({ workspaceId }: Props) {
Enabled
</label>
</div>
{error && <div role="alert" aria-live="assertive" className="text-[10px] text-bad">{error}</div>}
{error && <div className="text-[10px] text-bad">{error}</div>}
<div className="flex gap-2">
<button
type="button"
+1 -1
@@ -67,7 +67,7 @@ export function TracesTab({ workspaceId }: Props) {
</div>
{error && (
<div role="alert" aria-live="assertive" className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
<div className="px-3 py-1.5 bg-red-900/30 border border-red-800 rounded text-xs text-bad">
{error}
</div>
)}
@@ -1,165 +0,0 @@
// @vitest-environment jsdom
//
// Tests for the always-visible "Agent Abilities" section added to ConfigTab
// (internal#510 broadcast_enabled, internal#511 talk_to_user_enabled; backend
// wired in commit 29b4bffb).
//
// Problem this pins: the two workspace ability flags had complete wired
// backends but NO canvas control — broadcast had none at all, talk-to-user
// only surfaced as a ChatTab recovery banner that is invisible under its
// TRUE default. The CTO could not see or toggle either from canvas.
//
// What this suite pins:
// 1. An "Agent Abilities" section renders (always visible, not gated).
// 2. Both toggles render and reflect the store node's ability fields,
// including the asymmetric defaults (broadcast FALSE, talk TRUE).
// 3. Toggling a switch calls PATCH /workspaces/:id/abilities with the
// correct snake_case body and optimistically updates the store.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPatch = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
patch: (path: string, body?: unknown) => apiPatch(path, body),
put: vi.fn(),
post: vi.fn(),
del: vi.fn(),
},
}));
// Store node carries the ability flags hydrated by the platform stream
// (canvas-topology.ts maps broadcast_enabled/talk_to_user_enabled onto
// node.data). Mirror that shape so the section reads real values.
const storeUpdateNodeData = vi.fn();
const storeRestartWorkspace = vi.fn();
let nodeData: { broadcastEnabled?: boolean; talkToUserEnabled?: boolean } = {};
const makeState = () => ({
nodes: [{ id: "ws-test", data: nodeData }],
restartWorkspace: storeRestartWorkspace,
updateNodeData: storeUpdateNodeData,
});
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
(selector: (s: unknown) => unknown) => selector(makeState()),
{ getState: () => makeState() },
),
}));
vi.mock("../AgentCardSection", () => ({
AgentCardSection: () => <div data-testid="agent-card-stub" />,
}));
import { ConfigTab } from "../ConfigTab";
beforeEach(() => {
apiGet.mockReset();
apiPatch.mockReset();
apiPatch.mockResolvedValue({ status: "updated" });
storeUpdateNodeData.mockReset();
apiGet.mockImplementation((path: string) => {
if (path === `/workspaces/ws-test`) {
return Promise.resolve({ runtime: "claude-code" });
}
if (path === `/workspaces/ws-test/model`) {
return Promise.resolve({ model: "claude-opus-4-7" });
}
if (path === `/workspaces/ws-test/provider`) {
return Promise.resolve({ provider: "anthropic-oauth", source: "default" });
}
if (path === `/workspaces/ws-test/files/config.yaml`) {
return Promise.resolve({ content: "name: test\nruntime: claude-code\n" });
}
if (path === "/templates") {
return Promise.resolve([
{ id: "claude-code", name: "Claude Code", runtime: "claude-code", providers: [] },
]);
}
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
});
describe("ConfigTab Agent Abilities section", () => {
it("renders an always-visible 'Agent Abilities' section with both toggles", async () => {
nodeData = {}; // unset → defaults
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
expect(
await screen.findByRole("button", { name: /Agent Abilities/i }),
).toBeTruthy();
expect(screen.getByText("Talk to user")).toBeTruthy();
expect(screen.getByText("Broadcast to peers")).toBeTruthy();
});
it("reflects the asymmetric defaults: talk-to-user ON, broadcast OFF", async () => {
nodeData = {}; // unset → backend defaults
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const talk = (await screen.findByText("Talk to user"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
const broadcast = screen
.getByText("Broadcast to peers")
.closest("label")!
.querySelector("input") as HTMLInputElement;
expect(talk.checked).toBe(true);
expect(broadcast.checked).toBe(false);
});
it("reflects explicit store values", async () => {
nodeData = { broadcastEnabled: true, talkToUserEnabled: false };
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const talk = (await screen.findByText("Talk to user"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
const broadcast = screen
.getByText("Broadcast to peers")
.closest("label")!
.querySelector("input") as HTMLInputElement;
expect(talk.checked).toBe(false);
expect(broadcast.checked).toBe(true);
});
it("PATCHes /abilities with talk_to_user_enabled and optimistically updates the store", async () => {
nodeData = {}; // talk defaults true
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const talk = (await screen.findByText("Talk to user"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
fireEvent.click(talk); // true → false
await waitFor(() =>
expect(apiPatch).toHaveBeenCalledWith("/workspaces/ws-test/abilities", {
talk_to_user_enabled: false,
}),
);
expect(storeUpdateNodeData).toHaveBeenCalledWith("ws-test", {
talkToUserEnabled: false,
});
});
it("PATCHes /abilities with broadcast_enabled when the broadcast toggle is flipped", async () => {
nodeData = {}; // broadcast defaults false
render(<ConfigTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const broadcast = (await screen.findByText("Broadcast to peers"))
.closest("label")!
.querySelector("input") as HTMLInputElement;
fireEvent.click(broadcast); // false → true
await waitFor(() =>
expect(apiPatch).toHaveBeenCalledWith("/workspaces/ws-test/abilities", {
broadcast_enabled: true,
}),
);
expect(storeUpdateNodeData).toHaveBeenCalledWith("ws-test", {
broadcastEnabled: true,
});
});
});
@@ -405,7 +405,7 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
</p>
<button
onClick={loadInitial}
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-red-500/60 focus-visible:ring-offset-1"
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors"
>
Retry
</button>
@@ -610,7 +610,7 @@ function PeerTabButton({
aria-selected={active}
tabIndex={active ? 0 : -1}
onClick={onClick}
className={`shrink-0 px-3 py-1.5 text-[10px] font-medium transition-colors whitespace-nowrap focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-cyan-500/60 focus-visible:ring-offset-1 ${
className={`shrink-0 px-3 py-1.5 text-[10px] font-medium transition-colors whitespace-nowrap ${
active
? "border-b-2 border-cyan-500 text-cyan-200"
: "border-b-2 border-transparent text-ink-mid hover:text-ink-mid"
@@ -33,7 +33,7 @@ export function PendingAttachmentPill({
<button
onClick={onRemove}
aria-label={`Remove ${file.name}`}
className="ml-0.5 text-ink-mid hover:text-ink transition-colors shrink-0 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 focus-visible:ring-offset-1"
className="ml-0.5 text-ink-mid hover:text-ink transition-colors shrink-0"
>
<svg width="10" height="10" viewBox="0 0 16 16" fill="none" aria-hidden="true">
<path d="M4 4l8 8M12 4l-8 8" stroke="currentColor" strokeWidth="1.6" strokeLinecap="round" />
@@ -62,9 +62,8 @@ export function AttachmentChip({
return (
<button
onClick={() => onDownload(attachment)}
aria-label={`Download ${attachment.name}`}
title={`Download ${attachment.name}`}
className={`flex items-center gap-1.5 rounded-md border px-2 py-1 text-[10px] transition-colors max-w-full focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 focus-visible:ring-offset-1 ${toneClasses}`}
className={`flex items-center gap-1.5 rounded-md border px-2 py-1 text-[10px] transition-colors max-w-full ${toneClasses}`}
>
<FileGlyph className="shrink-0 opacity-70" />
<span className="truncate">{attachment.name}</span>
@@ -351,10 +351,8 @@ export function SecretsSection({ workspaceId, requiredEnv }: { workspaceId: stri
{showAdd ? (
<div className="bg-surface-card/50 rounded p-2 space-y-1.5 border border-line/50">
<input value={newKey} onChange={(e) => setNewKey(e.target.value.toUpperCase())} placeholder="KEY_NAME"
aria-label="Secret key name"
className="w-full bg-surface-sunken border border-line rounded px-2 py-1 text-[10px] font-mono text-ink focus:outline-none focus:border-accent" />
<input value={newValue} onChange={(e) => setNewValue(e.target.value)} placeholder="Value" type="password"
aria-label="Secret value"
className="w-full bg-surface-sunken border border-line rounded px-2 py-1 text-[10px] text-ink focus:outline-none focus:border-accent" />
<div className="flex gap-2">
<button type="button" onClick={() => { if (newKey && newValue) handleSave(newKey, newValue); }} disabled={!newKey || !newValue}
@@ -2,7 +2,7 @@
import { useState, useCallback, useRef, useEffect } from 'react';
import type { TestConnectionState, SecretGroup } from '@/types/secrets';
import { validateSecret, ApiError } from '@/lib/api/secrets';
import { validateSecret } from '@/lib/api/secrets';
interface TestConnectionButtonProps {
provider: SecretGroup;
@@ -55,23 +55,9 @@ export function TestConnectionButton({
}
onResult?.(result.valid);
resetTimerRef.current = setTimeout(() => setState('idle'), RESET_DELAYS[nextState]!);
} catch (err) {
// Distinguish a real failure shape rather than always claiming a
// timeout. A reachable server that answered with an HTTP status
// (ApiError) did NOT time out — most commonly the validation route
// is not available (404/501), which must not masquerade as
// "service down". Only an actual thrown network/abort error is a
// connectivity failure.
} catch {
setState('failure');
if (err instanceof ApiError) {
setErrorDetail(
err.status === 404 || err.status === 501
? 'Key validation is not available for this service yet. The key was not tested.'
: `Could not verify key (server returned ${err.status}). Saving is unaffected.`,
);
} else {
setErrorDetail('Could not reach the validation service. Check your connection and try again.');
}
setErrorDetail('Connection timed out. Service may be down.');
onResult?.(false);
resetTimerRef.current = setTimeout(() => setState('idle'), RESET_DELAYS.failure);
}
@@ -99,7 +85,7 @@ export function TestConnectionButton({
function Spinner() {
return (
<svg aria-hidden="true" className="spinner" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
<svg className="spinner" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
<path d="M12 2v4M12 18v4M4.93 4.93l2.83 2.83M16.24 16.24l2.83 2.83M2 12h4M18 12h4M4.93 19.07l2.83-2.83M16.24 7.76l2.83-2.83" />
</svg>
);
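Reviewer sketch, not part of the diff: the catch-block logic in the TestConnectionButton hunk above separates "the server answered with a status" from "nothing answered". A minimal standalone version, assuming an `ApiError` shaped like the one mocked in the test file (a `status` number on an `Error` subclass), could look like this. `describeValidationFailure` is a hypothetical name; the message strings are copied from the diff.

```typescript
// Hypothetical stand-in for the ApiError exported by '@/lib/api/secrets';
// the real class may carry more fields.
class ApiError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    // Keeps `instanceof` working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, ApiError.prototype);
    this.name = "ApiError";
    this.status = status;
  }
}

// Hypothetical helper: maps a thrown value to the user-facing detail text.
function describeValidationFailure(err: unknown): string {
  if (err instanceof ApiError) {
    // The server responded, so this is not a timeout.
    return err.status === 404 || err.status === 501
      ? "Key validation is not available for this service yet. The key was not tested."
      : `Could not verify key (server returned ${err.status}). Saving is unaffected.`;
  }
  // Anything else (network failure, abort) never produced an HTTP status.
  return "Could not reach the validation service. Check your connection and try again.";
}
```

The single hardcoded "Connection timed out" message on the other side of this diff collapses all three cases into one, which is the behaviour the internal#492 regression test was guarding against.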
@@ -28,20 +28,8 @@ const mockValidateSecret = vi.fn();
vi.mock("@/lib/api/secrets", () => ({
validateSecret: (...args: unknown[]) => mockValidateSecret(...args),
ApiError: class ApiError extends Error {
status: number;
constructor(status: number, message: string) {
super(message);
this.name = "ApiError";
this.status = status;
}
},
}));
// Re-import the mocked ApiError so test cases construct the same class the
// component's `instanceof` check sees.
import { ApiError } from "@/lib/api/secrets";
beforeEach(() => {
vi.useFakeTimers();
vi.clearAllMocks();
@@ -213,27 +201,8 @@ describe("TestConnectionButton — failure path", () => {
});
describe("TestConnectionButton — catch path", () => {
it("does NOT claim a timeout when the validate endpoint 404s (regression: internal#492)", async () => {
// The validate route is unimplemented on the server and returns a fast
// 404. Before the fix this rendered the misleading hardcoded string
// "Connection timed out. Service may be down." It must instead state
// honestly that validation isn't available and the key was not tested.
mockValidateSecret.mockRejectedValue(new ApiError(404, "Not Found"));
render(
<TestConnectionButton provider="anthropic" secretValue="sk-ant-xxx" />,
);
fireEvent.click(document.querySelector('button[type="button"]')!);
await act(async () => {
await vi.advanceTimersByTimeAsync(0);
});
expect(document.body.textContent).not.toContain("Connection timed out");
expect(document.body.textContent).not.toContain("Service may be down");
expect(document.body.textContent).toContain("not available");
expect(document.body.textContent).toContain("not tested");
});
it("reports a non-404 server error with its status, not a timeout", async () => {
mockValidateSecret.mockRejectedValue(new ApiError(500, "Internal Server Error"));
it("shows 'Connection timed out' on network error", async () => {
mockValidateSecret.mockRejectedValue(new Error("timeout"));
render(
<TestConnectionButton provider="github" secretValue="ghp_xxx" />,
);
@@ -241,20 +210,7 @@ describe("TestConnectionButton — catch path", () => {
await act(async () => {
await vi.advanceTimersByTimeAsync(0);
});
expect(document.body.textContent).toContain("500");
expect(document.body.textContent).not.toContain("Connection timed out");
});
it("shows a connectivity message on a genuine network error", async () => {
mockValidateSecret.mockRejectedValue(new Error("network down"));
render(
<TestConnectionButton provider="github" secretValue="ghp_xxx" />,
);
fireEvent.click(document.querySelector('button[type="button"]')!);
await act(async () => {
await vi.advanceTimersByTimeAsync(0);
});
expect(document.body.textContent).toContain("Could not reach the validation service");
expect(document.body.textContent).toContain("Connection timed out");
});
it("calls onResult(false) on network error", async () => {
+1 -82
@@ -63,7 +63,7 @@ class MockWebSocket {
(globalThis as unknown as Record<string, unknown>).WebSocket = MockWebSocket;
// Now import the socket module (uses globalThis.WebSocket at call time)
import { connectSocket, disconnectSocket, wakeSocket } from "../socket";
import { connectSocket, disconnectSocket } from "../socket";
import { useCanvasStore } from "../canvas";
// ---------------------------------------------------------------------------
@@ -416,84 +416,3 @@ describe("RehydrateDedup", () => {
expect(d.shouldSkip(2_700)).toBe(true);
});
});
// ---------------------------------------------------------------------------
// wakeSocket() — visibility-wake reconnect (regression #223 / #228)
// ---------------------------------------------------------------------------
//
// Mobile browsers (iOS Safari, Chrome on Android in deep-sleep) silently
// drop the WebSocket when the tab is backgrounded; the in-page onclose
// fires very late or never. Without a visibility wake, the canvas stays
// frozen until the user manually refreshes.
//
// The real wiring lives at module level: connectSocket installs a
// visibilitychange/pageshow listener that calls wake() on foreground.
// We can't dispatch DOM events here because the suite runs under the
// `node` test environment (no `document`/`window`; see
// canvas/vitest.config.ts). Instead we test wake() directly through the
// wakeSocket public export, which is the same code path the listener invokes.
describe("wakeSocket → reconnect (#223 / #228 — mobile visibility wake)", () => {
it("wake on a healthy OPEN socket does not create a new WebSocket", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
// OPEN === 1. wake() should take the healthy-no-op branch.
(ws as unknown as { readyState: number }).readyState = 1;
const before = MockWebSocket.instances.length;
wakeSocket();
expect(MockWebSocket.instances.length).toBe(before);
});
it("wake on a CLOSED socket creates a new WebSocket (the actual #223 fix)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
// CLOSED === 3. Simulates the OS killing the socket while the tab
// was backgrounded. We deliberately don't fire triggerClose() —
// the whole point of #223 is that mobile browsers don't fire
// onclose when they kill the WS, so reconnect never schedules.
(ws as unknown as { readyState: number }).readyState = 3;
const before = MockWebSocket.instances.length;
wakeSocket();
expect(MockWebSocket.instances.length).toBe(before + 1);
});
it("wake while CONNECTING (readyState=0) does not pile on another handshake", () => {
connectSocket();
const ws = getLastWS();
// CONNECTING === 0 — a handshake is already in flight.
(ws as unknown as { readyState: number }).readyState = 0;
const before = MockWebSocket.instances.length;
wakeSocket();
expect(MockWebSocket.instances.length).toBe(before);
});
it("wake cancels any pending backoff reconnect", () => {
const clearTimeoutSpy = vi.spyOn(globalThis, "clearTimeout");
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
// Drop the socket — onclose schedules a backoff reconnect.
ws.triggerClose();
// Now wake the page. wake() should pre-empt the backoff so the
// user sees the canvas come back immediately, not after the
// exponential delay window.
(ws as unknown as { readyState: number }).readyState = 3;
clearTimeoutSpy.mockClear();
wakeSocket();
expect(clearTimeoutSpy).toHaveBeenCalled();
clearTimeoutSpy.mockRestore();
});
it("wake after disconnectSocket is a no-op (no zombie reconnect)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
disconnectSocket();
const before = MockWebSocket.instances.length;
// Singleton is null now — wake() should silently do nothing.
expect(() => wakeSocket()).not.toThrow();
expect(MockWebSocket.instances.length).toBe(before);
});
});
-88
@@ -268,46 +268,6 @@ class ReconnectingSocket {
}
useCanvasStore.getState().setWsStatus("disconnected");
}
/** Force a reconnect attempt now, skipping the backoff window.
* Used by the visibilitychange / pageshow handler: when a mobile
* browser backgrounds the tab, the OS silently kills the WebSocket
* but the in-page onclose either fires very late or never fires at
* all (iOS Safari, Chrome on Android in deep-sleep). Once the user
* brings the tab back, the canvas needs to reconnect within human
* perception — not on whatever backoff delay was last scheduled,
* which can be up to 30s. (#223 / #228)
*
* Idempotent: if the socket is already OPEN we do not reconnect (the
* WebSocket is still healthy and a reconnect would just churn); we
* only rehydrate to catch events missed while backgrounded. */
wake() {
if (this.disposed) return;
// OPEN === 1. Use the numeric literal so we don't have to import
// WebSocket type values; the runtime constant is well-defined.
if (this.ws && this.ws.readyState === 1) {
// Healthy. Run a rehydrate to catch any events we may have missed
// while the tab was backgrounded — the OS does deliver some
// packets late, but it can also drop them, and the dedup gate
// collapses this with any subsequent health-check rehydrate.
void this.rehydrate();
return;
}
// CONNECTING === 0 means a handshake is already in flight. Don't
// pile another one on; the existing attempt or its onclose-driven
// reconnect will resolve.
if (this.ws && this.ws.readyState === 0) return;
// Otherwise (CLOSING, CLOSED, or null) we're in limbo. Cancel any
// pending backoff and reconnect now.
if (this.reconnectTimer) {
clearTimeout(this.reconnectTimer);
this.reconnectTimer = null;
}
// Reset attempt counter so the *next* failure (if any) starts from
// a short delay again — we just had a real user interaction, not
// an unattended-tab failure cascade.
this.attempt = 0;
this.connect();
}
}
export interface WorkspaceData {
@@ -346,49 +306,11 @@ export interface WorkspaceData {
let socket: ReconnectingSocket | null = null;
/** visibilitychange / pageshow handler. Mobile browsers (iOS Safari,
* Chrome on Android in deep-sleep) silently drop the WebSocket when
* the tab is backgrounded — the in-page `onclose` fires very late or
* never. Without this listener, the canvas appears frozen after the
* user backgrounds the PWA and returns to it: status events, agent
* messages, and cross-device chat broadcast don't arrive until a
* manual refresh (#223 / #228).
*
* Both events are wired: `visibilitychange` covers tab-switch on a
* live page; `pageshow` covers Safari's bfcache restore, where the
* page comes back from cache without firing visibilitychange. */
function onPageWake() {
// document is undefined in SSR; the listener never installs there,
// but defensively guard anyway in case this code is run via a test
// harness that doesn't shim it.
if (typeof document !== "undefined" && document.hidden) return;
socket?.wake();
}
let visibilityHandlerInstalled = false;
function installVisibilityHandler() {
if (visibilityHandlerInstalled) return;
if (typeof document === "undefined" || typeof window === "undefined") return;
document.addEventListener("visibilitychange", onPageWake);
// `pageshow` with `event.persisted === true` is the bfcache restore
// signal (relevant on iOS Safari). We don't need to inspect
// `persisted`: waking an OPEN socket never reconnects, it only runs a
// deduplicated rehydrate.
window.addEventListener("pageshow", onPageWake);
visibilityHandlerInstalled = true;
}
function uninstallVisibilityHandler() {
if (!visibilityHandlerInstalled) return;
if (typeof document === "undefined" || typeof window === "undefined") return;
document.removeEventListener("visibilitychange", onPageWake);
window.removeEventListener("pageshow", onPageWake);
visibilityHandlerInstalled = false;
}
export function connectSocket() {
if (!socket) {
socket = new ReconnectingSocket(WS_URL);
}
socket.connect();
installVisibilityHandler();
}
export function disconnectSocket() {
@@ -396,14 +318,4 @@ export function disconnectSocket() {
socket.disconnect();
socket = null;
}
uninstallVisibilityHandler();
}
/** Manually trigger the visibility-wake path. Exported so the test suite
* can exercise `ReconnectingSocket.wake()` without depending on a
* jsdom DOM (the rest of this file's tests run under the node env).
* Real-world callers don't need this — the visibility/pageshow listener
* drives it. */
export function wakeSocket() {
socket?.wake();
}
-12
@@ -584,10 +584,6 @@
.secrets-tab__refresh-btn:hover {
background: #1e40af;
}
.secrets-tab__refresh-btn:focus-visible {
outline: 2px solid #1d4ed8;
outline-offset: 2px;
}
.secrets-tab__no-results {
text-align: center;
@@ -653,10 +649,6 @@
border-radius: 6px;
cursor: pointer;
}
.delete-dialog__cancel-btn:focus-visible {
outline: var(--focus-ring);
outline-offset: var(--focus-ring-offset);
}
.delete-dialog__confirm-btn {
background: var(--status-invalid);
@@ -666,10 +658,6 @@
border-radius: 6px;
cursor: pointer;
}
.delete-dialog__confirm-btn:focus-visible {
outline: var(--focus-ring);
outline-offset: var(--focus-ring-offset);
}
.delete-dialog__confirm-btn:disabled { opacity: 0.4; cursor: not-allowed; }
+6 -18
@@ -58,7 +58,6 @@ TOP_LEVEL_MODULES = {
"a2a_response",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_identity",
"a2a_tools_inbox",
"a2a_tools_memory",
"a2a_tools_messaging",
@@ -311,17 +310,8 @@ locally.
deps from your system Python. Plain `pip install --user` works
but the binary lands in `~/.local/bin` (Linux) or
`~/Library/Python/3.X/bin` (macOS) which is often not on PATH on
a fresh shell — `claude mcp add molecule-<workspace-slug> -- molecule-mcp`
then fails with "command not found" at first use.
* **Server name in `claude mcp add` is workspace-specific.** The
Canvas "Add to Claude Code" snippet stamps a unique slug
(`molecule-<workspace-name>`) so a single Claude Code session can
talk to N molecule workspaces concurrently — `claude mcp add` keys
entries by name in `~/.claude.json`, so re-running with a bare
`molecule` name silently overwrites the prior workspace's entry.
See [molecule-core#1535](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1535)
for the canonical generator.
a fresh shell — `claude mcp add molecule -- molecule-mcp` then
fails with "command not found" at first use.
### Install
@@ -345,10 +335,8 @@ WORKSPACE_ID=<uuid> \\
That exposes the same 8 platform tools (`delegate_task`, `list_peers`,
`send_message_to_user`, `commit_memory`, etc.) that container-bound
runtimes already get via the workspace's auto-spawned MCP. Register
the binary in your agent's MCP config — use a workspace-specific
server name so multi-workspace setups don't collide (e.g. Claude Code:
`claude mcp add molecule-<workspace-slug> -- molecule-mcp` with the env
above; the Canvas modal stamps the right slug for you).
the binary in your agent's MCP config (e.g. Claude Code's
`claude mcp add molecule -- molecule-mcp` with the env above).
### Keeping the token out of shell history
@@ -386,8 +374,8 @@ hold:
wheel does (see `_build_initialize_result`). Nothing for you to
do.
2. **Claude Code installs the server as a marketplace plugin** — a
plain `claude mcp add molecule-<workspace-slug> -- molecule-mcp`
produces a non-plugin-sourced server, which Claude Code rejects with
plain `claude mcp add molecule -- molecule-mcp` produces a
non-plugin-sourced server, which Claude Code rejects with
`channel_enable requires a marketplace plugin`. Until the
official `moleculesai/claude-code-plugin` marketplace lands
(tracking [#2936](https://git.moleculesai.app/molecule-ai/molecule-core/issues/2936)),
@@ -1,35 +0,0 @@
// Command t4-contract-dump prints the T4 privilege contract as YAML.
//
// Usage:
//
// go run ./workspace-server/cmd/t4-contract-dump > t4_capabilities.yaml
//
// This is the seam that template-repo CI workflows consume:
//
// - Template CI fetches molecule-core at pinned ref
// - Runs `go run ./workspace-server/cmd/t4-contract-dump` to produce
// t4_capabilities.yaml
// - Iterates capabilities and runs each Probe inside a freshly-built
// privileged container
// - Aggregates structured pass/fail; fails the gate on any hard miss.
//
// Keeping this trivial and pure-stdlib means a fork user does not need
// a Molecule-AI Gitea token or any internal infrastructure to consume
// the contract — `go run` against molecule-core's public source is
// enough.
package main
import (
"fmt"
"os"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
)
func main() {
caps := provisioner.T4PrivilegeContract()
if _, err := os.Stdout.WriteString(provisioner.AsYAML(caps)); err != nil {
fmt.Fprintln(os.Stderr, "t4-contract-dump: write failed:", err)
os.Exit(1)
}
}
@@ -399,21 +399,7 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
// (no Do(), no maybeMarkContainerDead). The response is a synthetic
// {status:"queued"} envelope so the caller (canvas, another workspace)
// knows delivery is acknowledged but pending consumption.
deliveryMode, deliveryModeErr := lookupDeliveryMode(ctx, workspaceID)
if deliveryModeErr != nil {
// internal#497 fail-closed: a real DB/context error on the
// delivery-mode read MUST NOT silently fall through to the push
// dispatch path — that is exactly what silently misrouted every
// poll-mode peer for 5 days under the ce2db75f regression. Surface
// a structured error so the delegation is marked failed (loud +
// retryable) instead of dispatched to the wrong path.
log.Printf("ProxyA2A: delivery-mode lookup failed for %s: %v — failing closed", workspaceID, deliveryModeErr)
return 0, nil, &proxyA2AError{
Status: http.StatusServiceUnavailable,
Response: gin.H{"error": "delivery-mode lookup failed; refusing to dispatch to avoid silent misrouting"},
}
}
if deliveryMode == models.DeliveryModePoll {
if lookupDeliveryMode(ctx, workspaceID) == models.DeliveryModePoll {
if logActivity {
h.logA2AReceiveQueued(ctx, workspaceID, callerID, body, a2aMethod)
}
@@ -194,11 +194,6 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
// Tracked via goAsync (not bare `go`) so the asyncWG can be drained
// before a test swaps the global db.DB. runRestartCycle reads db.DB
// before its provisioner gate, so an untracked detached goroutine
// races setupTestDB's t.Cleanup db.DB restore. Matches the already-
// correct site at a2a_proxy.go:648.
h.goAsync(func() { h.RestartByID(workspaceID) })
return true
}
@@ -246,9 +241,6 @@ func (h *WorkspaceHandler) preflightContainerHealth(ctx context.Context, workspa
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
// Tracked via goAsync (see maybeMarkContainerDead): preflight's
// detached restart must be drainable so it doesn't race the global
// db.DB swap in test cleanup.
h.goAsync(func() { h.RestartByID(workspaceID) })
return &proxyA2AError{
Status: http.StatusServiceUnavailable,
@@ -270,9 +262,8 @@ func (h *WorkspaceHandler) logA2AFailure(ctx context.Context, workspaceID, calle
errWsName = workspaceID
}
summary := "A2A request to " + errWsName + " failed: " + errMsg
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
logCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
WorkspaceID: workspaceID,
@@ -318,9 +309,8 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
}
summary := a2aMethod + " → " + wsNameForLog
toolTrace := extractToolTrace(respBody)
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
logCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
WorkspaceID: workspaceID,
@@ -468,64 +458,40 @@ func parseUsageFromA2AResponse(body []byte) (inputTokens, outputTokens int64) {
return 0, 0
}
// lookupDeliveryMode returns the workspace's delivery_mode.
//
// internal#497 / RFC#497 fail-closed (SURGICAL scope): the *specific*
// failure mode that hid the ce2db75f regression for 5 days is now
// propagated instead of silently swallowed — a CONTEXT error
// (context.Canceled / context.DeadlineExceeded). Under ce2db75f the
// detached delegation goroutine ran on a cancelled request context, every
// `SELECT delivery_mode` failed `context canceled`, this function returned
// push, the poll-mode short-circuit in proxyA2ARequest was skipped, and
// poll-mode peers (e.g. an operator laptop on molecule-mcp-claude-channel)
// silently never got their a2a_receive inbox row. A transient,
// systematic-once-triggered context cancellation became permanent
// invisible misrouting. Returning that error lets the caller fail loud
// (mark the delegation failed) instead of mis-dispatching.
//
// Scope is deliberately narrow: only ctx errors propagate. Other DB
// errors retain the long-standing documented "fall back to push (today's
// synchronous behavior)" contract — that path is loud + recoverable
// (502 / SSRF reject / restart), unlike the silent poll-mode drop, and
// the surrounding proxy (incl. the sibling checkWorkspaceBudget) is
// intentionally built around that fail-open-to-push behavior. Widening
// further is an RFC#497 follow-up, not part of this P0 fix.
//
// A genuinely *absent* configuration is NOT an error and still resolves to
// push (the safe synchronous default): sql.ErrNoRows, a NULL/empty column,
// or an unrecognised value all return (push, nil).
// lookupDeliveryMode returns the workspace's delivery_mode. On any DB
// error or missing row it returns DeliveryModePush — the fail-closed
// default. "Closed" here means "fall back to today's behavior (synchronous
// dispatch)" rather than "fall back to drop the request silently into
// activity_logs where the agent might never see it." A poll-mode workspace
// that briefly reads as push will get its A2A request dispatched to the
// stored URL (or a 502 if no URL); a push-mode workspace that briefly
// reads as poll would get its request silently queued with no dispatch.
// The first failure is loud + recoverable; the second is silent.
//
// The function is intentionally lookup-only — it never mutates the row.
// The register handler (registry.go) is the only writer for delivery_mode.
//
// See #2339 PR 1 for the column + register-flow side; this is the
// proxy-side read used for the short-circuit in proxyA2ARequest.
func lookupDeliveryMode(ctx context.Context, workspaceID string) (string, error) {
func lookupDeliveryMode(ctx context.Context, workspaceID string) string {
var mode sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT delivery_mode FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&mode)
if err != nil {
// internal#497: a context cancellation/deadline MUST NOT be
// swallowed into a silent push default — that is the exact 5-day
// silent-misrouting vector. Propagate so the caller fails closed.
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
log.Printf("ProxyA2A: lookupDeliveryMode(%s) context error (%v) — failing closed (NOT defaulting to push)", workspaceID, err)
return "", err
}
if !errors.Is(err, sql.ErrNoRows) {
log.Printf("ProxyA2A: lookupDeliveryMode(%s) failed (%v) — defaulting to push (non-ctx DB error; legacy fail-open-to-push contract)", workspaceID, err)
log.Printf("ProxyA2A: lookupDeliveryMode(%s) failed (%v) — defaulting to push", workspaceID, err)
}
return models.DeliveryModePush, nil
return models.DeliveryModePush
}
if !mode.Valid || mode.String == "" {
return models.DeliveryModePush, nil
return models.DeliveryModePush
}
if !models.IsValidDeliveryMode(mode.String) {
log.Printf("ProxyA2A: workspace %s has invalid delivery_mode=%q — defaulting to push", workspaceID, mode.String)
return models.DeliveryModePush, nil
return models.DeliveryModePush
}
return mode.String, nil
return mode.String
}
// logA2AReceiveQueued records a poll-mode "queued" A2A receive into
@@ -2235,18 +2235,12 @@ func TestProxyA2A_PushMode_NoShortCircuit(t *testing.T) {
}
}
// TestProxyA2A_PollMode_FailsClosedToPush verifies the LEGACY safety
// contract is PRESERVED for non-context DB errors: a generic DB error
// reading delivery_mode still defaults to push (today's behavior), NOT
// poll. Failing to push means a poll-mode workspace briefly attempts a
// real dispatch — visible failure (502 / SSRF rejection / restart
// cascade), not a silent drop into activity_logs where the agent might
// never look. Loud > silent, recoverable > lost.
//
// internal#497 narrows the fail-closed change to *context* errors only
// (the actual ce2db75f regression vector); generic DB errors keep this
// long-standing fail-open-to-push contract. The ctx-error fail-closed is
// covered by TestLookupDeliveryMode_ContextCanceled_FailsClosed.
// TestProxyA2A_PollMode_FailsClosedToPush verifies the safety contract:
// a DB error reading delivery_mode must default to push (the existing
// behavior), NOT poll. Failing to push means a poll-mode workspace
// briefly attempts a real dispatch — visible failure (502 / SSRF
// rejection / restart cascade), not a silent drop into activity_logs
// where the agent might never look. Loud > silent, recoverable > lost.
func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t) // empty Redis — forces resolveAgentURL DB lookup
@@ -2257,8 +2251,7 @@ func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
expectBudgetCheck(mock, wsID)
// lookupDeliveryMode hits a generic (non-context) DB error → must
// still default push (legacy contract preserved by internal#497).
// lookupDeliveryMode hits a transient DB error → must default push.
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WithArgs(wsID).
WillReturnError(sql.ErrConnDone)
@@ -2282,7 +2275,7 @@ func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
var resp map[string]interface{}
_ = json.Unmarshal(w.Body.Bytes(), &resp)
if resp["status"] == "queued" {
t.Errorf("generic DB error on delivery_mode lookup silently queued the request — must fail-open-to-push, got body: %s", w.Body.String())
t.Errorf("DB error on delivery_mode lookup silently queued the request — must fail-closed-to-push, got body: %s", w.Body.String())
}
}
@@ -2291,37 +2284,6 @@ func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
}
}
// TestLookupDeliveryMode_ContextCanceled_FailsClosed is the internal#497
// regression test for the SECONDARY defect. It pins the exact invariant
// that hid the ce2db75f regression for 5 days: when the delivery_mode read
// fails because the context was cancelled (precisely what happened in the
// detached delegation goroutine running on a returned request context),
// lookupDeliveryMode MUST return an error and MUST NOT silently return
// "push". Returning push there is what skipped the poll-mode short-circuit
// and silently dropped 100% of poll-mode peer deliveries.
//
// A pre-cancelled context makes QueryRowContext fail with
// context.Canceled deterministically — no DB rows are mocked because the
// query never reaches a result.
func TestLookupDeliveryMode_ContextCanceled_FailsClosed(t *testing.T) {
mock := setupTestDB(t)
// The query fails on the cancelled ctx before matching; provide a
// permissive expectation so sqlmock doesn't complain about the attempt.
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WillReturnError(context.Canceled)
ctx, cancel := context.WithCancel(context.Background())
cancel() // simulate the HTTP handler having returned (request ctx dead)
mode, err := lookupDeliveryMode(ctx, "ws-poll-peer")
if err == nil {
t.Fatalf("internal#497 regression: lookupDeliveryMode swallowed a context error and returned mode=%q with nil err — this is the exact 5-day silent-misrouting vector", mode)
}
if mode == models.DeliveryModePush {
t.Errorf("internal#497 regression: context error must NOT default to push (got mode=%q)", mode)
}
}
// ==================== a2aClient ResponseHeaderTimeout config ====================
func TestA2AClientResponseHeaderTimeout(t *testing.T) {
@@ -1,113 +0,0 @@
package handlers
import "encoding/json"
// agent_card_reconcile.go — server-side repair for the fleet-wide
// agent-card identity gap.
//
// Root cause: the runtime builds its AgentCard from config.name
// (workspace/main.py:198), and config.name is read from the
// CP-regenerated /configs/config.yaml whose `name:` field is the raw
// workspace UUID — NOT the friendly name the operator sees. The friendly
// name IS captured: POST /workspaces and PATCH /workspaces/:id (the
// canvas Details tab) write it to the trusted workspaces.name DB column.
// But /registry/register stores the runtime-supplied card verbatim
// (registry.go: `agent_card = EXCLUDED.agent_card`), so the stored card
// served at /.well-known/agent-card.json and returned to peers via
// agent_card_url ends up with name = UUID, description = "", role = null.
//
// Fix shape (deliberately minimal, no contract weakening): when the
// runtime-supplied card's `name` is empty or equals the workspace UUID
// (the placeholder the runtime had no better value for), the PLATFORM —
// not the agent — substitutes the friendly value from the trusted
// workspaces row. Identity stays platform-controlled: the agent never
// gains the ability to self-set its own name/role; the platform sources
// it from the operator-controlled DB column. We only ever FILL gaps
// (empty / UUID-placeholder); a card that already carries a real
// friendly name is never downgraded.
//
// list_peers / the /registry/:id/peers endpoint already resolve display
// names from workspaces.name directly (discovery.go / mcp_tools.go
// `SELECT w.id, w.name, ...`), so peer_name in delivered message tags
// was already correct — this fix closes the remaining surface: the
// agent_card blob itself (canvas Agent Card / Skills view, peer
// agent_card_url fetches, the well-known card).
//
// description / role degrade discovery the same way: an empty
// description and null role give peers nothing to reason about. We
// default description from the (now reconciled) name when blank and
// role from workspaces.role when the operator set one.
// reconcileAgentCardIdentity patches identity gaps in a runtime-supplied
// agent card from the trusted workspace DB row. It returns the
// (possibly rewritten) card bytes and whether anything changed. On any
// failure (malformed JSON, nothing to fill) it returns the input bytes
// unchanged with changed=false so the caller can store them verbatim —
// this is strictly no-worse-than-before, never a regression.
//
// Pure function: no DB / HTTP / globals, so it is exhaustively
// unit-testable (agent_card_reconcile_test.go) without booting the
// handler or a sqlmock.
func reconcileAgentCardIdentity(card json.RawMessage, workspaceID, dbName, dbRole string) (json.RawMessage, bool) {
var m map[string]any
if err := json.Unmarshal(card, &m); err != nil || m == nil {
// Malformed card — not this function's job to reject it (the
// upsert stores it as-is and downstream readers handle bad
// JSON). Return verbatim so byte-for-byte behaviour is
// preserved on the failure path.
return card, false
}
changed := false
// name: fill only when empty or the UUID placeholder. A dbName that
// is itself the UUID is a placeholder row (registry.go INSERT seeds
// name = id before the canvas sets a friendly one) — not a friendly
// name, so it is not an eligible source.
cardName, _ := m["name"].(string)
if (cardName == "" || cardName == workspaceID) &&
dbName != "" && dbName != workspaceID {
m["name"] = dbName
changed = true
}
// description: when blank, default to the (reconciled) name so peers
// and the canvas Agent Card view have a non-empty human label
// instead of "". Mirrors the runtime's own
// `config.description or config.name` fallback (main.py:199) but
// applied to the registry copy where the runtime's fallback was the
// UUID.
if desc, _ := m["description"].(string); desc == "" {
if n, _ := m["name"].(string); n != "" && n != workspaceID {
m["description"] = n
changed = true
}
}
// role: surface the operator-set workspaces.role when the card
// carries none. Discovery (peer_role) and the canvas Role row read
// workspaces.role directly; this just makes the standalone card
// self-describing too. Never overwrite a role the card already has.
if dbRole != "" {
if r, ok := m["role"].(string); !ok || r == "" {
m["role"] = dbRole
changed = true
}
}
if !changed {
// No-op: return the original bytes untouched so callers that
// compare/store get byte-identical input (re-marshalling would
// reorder keys for no reason).
return card, false
}
out, err := json.Marshal(m)
if err != nil {
// Re-marshal of a map we just unmarshalled should never fail;
// if it somehow does, fall back to the verbatim input rather
// than storing nothing.
return card, false
}
return out, true
}
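The no-op branch above returns the original bytes because a decode/encode round trip through `map[string]any` does not preserve key order: `encoding/json` emits map keys sorted. A small sketch of that effect; `roundTrip` is a hypothetical helper used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip decodes a JSON object into a map and re-encodes it.
// encoding/json writes map keys in sorted order, so the output can
// differ byte-for-byte from the input even when nothing changed.
func roundTrip(in []byte) ([]byte, error) {
	var m map[string]any
	if err := json.Unmarshal(in, &m); err != nil {
		return nil, err
	}
	return json.Marshal(m)
}

func main() {
	in := []byte(`{"name":"Reviewer","description":"Senior reviewer"}`)
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(string(out)) // {"description":"Senior reviewer","name":"Reviewer"}
}
```

A caller that compares stored and incoming card bytes would therefore see a spurious difference if the unchanged path re-marshalled.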
@@ -1,166 +0,0 @@
package handlers
import (
"encoding/json"
"testing"
)
// TestReconcileAgentCardIdentity covers the server-side backfill that
// repairs the fleet-wide agent-card identity gap (internal#XXX): the
// runtime POSTs /registry/register with agent_card.name = the workspace
// UUID (because the CP-regenerated /configs/config.yaml sets name: <uuid>)
// while the trusted workspaces.name DB column — the value the canvas
// Details tab shows and lets the operator edit — holds the friendly
// name ("Claude Code Agent"). The platform reconciles them from the DB
// row (NOT from the agent — identity stays platform-controlled, not
// self-mutable).
func TestReconcileAgentCardIdentity(t *testing.T) {
const wsID = "3b81321b-1ec7-488c-96f7-72c42a968da6"
tests := []struct {
name string
card string
dbName string
dbRole string
wantName string
wantDesc string
wantRole string
wantChanged bool
}{
{
name: "name is the workspace UUID — backfill from DB",
card: `{"name":"3b81321b-1ec7-488c-96f7-72c42a968da6","description":"","capabilities":{"streaming":true}}`,
dbName: "Claude Code Agent",
dbRole: "",
wantName: "Claude Code Agent",
wantDesc: "Claude Code Agent",
wantRole: "",
wantChanged: true,
},
{
name: "empty name — backfill from DB",
card: `{"name":"","description":"x"}`,
dbName: "ops-agent",
dbRole: "sre",
wantName: "ops-agent",
wantDesc: "x",
wantRole: "sre",
wantChanged: true,
},
{
name: "role null in card, DB has role — backfill role only",
card: `{"name":"Reviewer","description":"Senior reviewer"}`,
dbName: "Reviewer",
dbRole: "code-reviewer",
wantName: "Reviewer",
wantDesc: "Senior reviewer",
wantRole: "code-reviewer",
wantChanged: true,
},
{
name: "card already has a real friendly name — do NOT clobber it",
// A richer card (e.g. an external channel agent) must win;
// the platform only fills gaps, never downgrades.
card: `{"name":"Claude Code (channel)","description":"Local Claude Code session bridged","role":"assistant"}`,
dbName: "hongming-pc",
dbRole: "operator",
wantName: "Claude Code (channel)",
wantDesc: "Local Claude Code session bridged",
wantRole: "assistant",
wantChanged: false,
},
{
name: "no DB name available — leave UUID name untouched (no worse than before)",
card: `{"name":"3b81321b-1ec7-488c-96f7-72c42a968da6","description":""}`,
dbName: "",
dbRole: "",
wantName: "3b81321b-1ec7-488c-96f7-72c42a968da6",
wantDesc: "",
wantRole: "",
wantChanged: false,
},
{
name: "dbName equals UUID (placeholder row) — not a friendly name, leave untouched",
card: `{"name":"3b81321b-1ec7-488c-96f7-72c42a968da6"}`,
dbName: "3b81321b-1ec7-488c-96f7-72c42a968da6",
dbRole: "",
wantName: "3b81321b-1ec7-488c-96f7-72c42a968da6",
wantDesc: "",
wantRole: "",
wantChanged: false,
},
{
name: "malformed card JSON — return unchanged, no panic",
card: `{not json`,
dbName: "Claude Code Agent",
dbRole: "",
wantChanged: false,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
out, changed := reconcileAgentCardIdentity(
json.RawMessage(tc.card), wsID, tc.dbName, tc.dbRole,
)
if changed != tc.wantChanged {
t.Fatalf("changed = %v, want %v", changed, tc.wantChanged)
}
if !tc.wantChanged {
// Unchanged path must return the input bytes verbatim.
if string(out) != tc.card {
t.Fatalf("unchanged path mutated bytes:\n got %s\n want %s", out, tc.card)
}
return
}
var got map[string]any
if err := json.Unmarshal(out, &got); err != nil {
t.Fatalf("output not valid JSON: %v (%s)", err, out)
}
if g, _ := got["name"].(string); g != tc.wantName {
t.Errorf("name = %q, want %q", g, tc.wantName)
}
if g, _ := got["description"].(string); g != tc.wantDesc {
t.Errorf("description = %q, want %q", g, tc.wantDesc)
}
if tc.wantRole != "" {
if g, _ := got["role"].(string); g != tc.wantRole {
t.Errorf("role = %q, want %q", g, tc.wantRole)
}
}
})
}
}
// TestReconcileAgentCardIdentity_PreservesOtherFields ensures the
// reconcile is a minimal in-place patch — capabilities, version,
// skills and any unknown future fields survive untouched.
func TestReconcileAgentCardIdentity_PreservesOtherFields(t *testing.T) {
card := `{"name":"ws-uuid","description":"","version":"1.0.0",` +
`"capabilities":{"streaming":true,"pushNotifications":true},` +
`"skills":[{"id":"a","name":"a"}],"configuration_status":"ready"}`
out, changed := reconcileAgentCardIdentity(
json.RawMessage(card), "ws-uuid", "Friendly Name", "",
)
if !changed {
t.Fatal("expected changed = true")
}
var got map[string]any
if err := json.Unmarshal(out, &got); err != nil {
t.Fatalf("invalid JSON: %v", err)
}
if got["version"] != "1.0.0" {
t.Errorf("version not preserved: %v", got["version"])
}
if got["configuration_status"] != "ready" {
t.Errorf("configuration_status not preserved: %v", got["configuration_status"])
}
caps, ok := got["capabilities"].(map[string]any)
if !ok || caps["streaming"] != true {
t.Errorf("capabilities not preserved: %v", got["capabilities"])
}
skills, ok := got["skills"].([]any)
if !ok || len(skills) != 1 {
t.Errorf("skills not preserved: %v", got["skills"])
}
}
@@ -1,9 +1,6 @@
package handlers
import (
"log"
"os"
"path/filepath"
"regexp"
"strings"
)
@@ -20,17 +17,6 @@ var gitIdentitySlugPattern = regexp.MustCompile(`[^a-z0-9]+`)
// docs/authorship.md (when it exists).
const gitIdentityEmailDomain = "agents.moleculesai.app"
// gitAskpassHelperPath is the in-container path of the askpass helper
// installed by every workspace runtime image (workspace/Dockerfile in
// molecule-core; scripts/git-askpass.sh → /usr/local/bin/molecule-askpass
// in each external template-* repo). The helper reads GIT_HTTP_USERNAME
// / GIT_HTTP_PASSWORD (falling back to GITEA_USER / GITEA_TOKEN) from
// env and emits them on the git credential-prompt protocol. Setting
// GIT_ASKPASS to this path is what wires container-side HTTPS git auth
// to the persona credentials already arriving via workspace_secrets,
// with no on-disk .gitconfig / .git-credentials mutation required.
const gitAskpassHelperPath = "/usr/local/bin/molecule-askpass"
// applyAgentGitIdentity sets GIT_AUTHOR_* / GIT_COMMITTER_* env vars so
// every commit from this workspace container carries a distinct author
// in `git log` and `git blame`. Git reads these env vars before falling
@@ -64,125 +50,6 @@ func applyAgentGitIdentity(envVars map[string]string, workspaceName string) {
setIfEmpty(envVars, "GIT_AUTHOR_EMAIL", authorEmail)
setIfEmpty(envVars, "GIT_COMMITTER_NAME", authorName)
setIfEmpty(envVars, "GIT_COMMITTER_EMAIL", authorEmail)
applyGitAskpass(envVars)
}
// applyGitAskpass points git at the in-image askpass helper so that any
// HTTPS git operation against a remote without a pre-configured
// credential.helper picks up the persona credentials already present in
// the container env (GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD, or
// GITEA_USER / GITEA_TOKEN as fallback — the latter pair is what
// loadPersonaEnvFile delivers from the operator-host bootstrap kit).
//
// Idempotent: if GIT_ASKPASS is already set (e.g. by an operator-
// supplied workspace_secret or an env-mutator plugin), the existing
// value wins. This lets a workspace opt out by setting GIT_ASKPASS=""
// or pointing at a different helper.
//
// No vendor-specific behaviour lives in this function — the host the
// credentials apply to is determined entirely by the deployer choosing
// when to populate GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD (or
// GITEA_USER / GITEA_TOKEN). The helper script itself is generic and
// has no hardcoded hostnames, so it's safe to ship inside the
// open-source workspace template images alongside the platform-managed
// claude-code image.
func applyGitAskpass(envVars map[string]string) {
if envVars == nil {
return
}
setIfEmpty(envVars, "GIT_ASKPASS", gitAskpassHelperPath)
}
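Review note: the askpass helper itself is a shell script that is not part of this diff. As a rough illustration of the lookup order the comments describe (GIT_HTTP_* first, GITEA_* as fallback), here is a minimal Go sketch. The function name `answerPrompt` and the injected `getenv` parameter are hypothetical; the only Git behaviour relied on is that Git passes the prompt text (starting with "Username for" or "Password for") as the first argument and reads the answer from stdout.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// answerPrompt is a hypothetical Go rendering of the lookup chain described
// in the comments above. It is NOT the shipped helper (that is a shell
// script). getenv is injected so the logic can be exercised without
// touching the real process environment.
func answerPrompt(prompt string, getenv func(string) string) string {
	// first returns the first non-empty value among the given keys.
	first := func(keys ...string) string {
		for _, k := range keys {
			if v := getenv(k); v != "" {
				return v
			}
		}
		return ""
	}
	// Git's prompts look like "Username for 'https://host': " and
	// "Password for 'https://user@host': ".
	p := strings.ToLower(prompt)
	switch {
	case strings.HasPrefix(p, "username"):
		return first("GIT_HTTP_USERNAME", "GITEA_USER")
	case strings.HasPrefix(p, "password"):
		return first("GIT_HTTP_PASSWORD", "GITEA_TOKEN")
	}
	return "" // unknown prompt: answer nothing
}

func main() {
	if len(os.Args) < 2 {
		return
	}
	fmt.Println(answerPrompt(os.Args[1], os.Getenv))
}
```

Because the generic pair is consulted first, the fallback pair only matters when GIT_HTTP_* is absent from the container env.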
// applyAgentGitHTTPCreds reads the persona's HTTPS git credential from
// the operator-host bootstrap dir and injects it as GIT_HTTP_USERNAME /
// GIT_HTTP_PASSWORD so the in-container askpass helper can emit it on
// git's auth challenge.
//
// Why a dedicated env-var pair instead of reusing GITEA_USER / GITEA_TOKEN:
// the provisioner's forensic #145 denylist (provisioner.scmWriteTokenKeys)
// strips any env var named GITEA_TOKEN / GITHUB_TOKEN / GH_TOKEN /
// GITLAB_TOKEN / GL_TOKEN / BITBUCKET_TOKEN from tenant container env
// before docker run. That denylist is by exact key name, so the same
// token survives transport when shipped under the generic
// GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD names that the askpass helper
// reads first (scripts/git-askpass.sh in each template-*). The username
// half stays an identifier (the persona's Gitea login), the password
// half carries the bytes from the persona token file.
//
// The fallback pair GITEA_USER / GITEA_TOKEN is ALSO set — GITEA_USER
// survives the denylist (it's an identity, not a credential) and
// GITEA_TOKEN is the no-op write that buildContainerEnv will drop.
// Both pairs in lockstep means the askpass helper's GIT_HTTP_*-first /
// GITEA_*-fallback chain works regardless of which lane lands first in
// the container env on any future provisioner refactor.
//
// Idempotent: existing GIT_HTTP_USERNAME / GIT_HTTP_PASSWORD keys are
// preserved. Operator-supplied workspace_secrets win over the persona
// token file by virtue of running BEFORE this helper in
// prepareProvisionContext.
//
// Silent no-op when:
// - personaKey is empty (no role → no persona dir to consult)
// - personaKey fails the safe-segment check (defense-in-depth against
// a crafted role escaping the persona dir)
// - the persona token file does not exist or is empty (legitimate
// case for personas that don't ship a git-write credential — e.g.
// read-only PM/Reviewer/Researcher identities or a partially-
// provisioned bootstrap)
//
// No vendor-specific behaviour: this function reads bytes from a path
// and emits them as the standard askpass env-var pair. The host the
// credential applies to is determined by the deployer choosing which
// remote to push to — the askpass helper has no hardcoded hostnames.
func applyAgentGitHTTPCreds(envVars map[string]string, personaKey string) {
if envVars == nil {
return
}
personaKey = strings.TrimSpace(personaKey)
if !isSafeRoleName(personaKey) {
// Silent no-op for empty / unsafe keys — same shape as
// loadPersonaTokenFile. Descriptive-role payloads (multi-word
// "Frontend Engineer" etc.) take this branch and pick up
// creds via workspace_secrets / org-import persona-env merge,
// not the direct persona-token file path.
return
}
root := os.Getenv("MOLECULE_PERSONA_ROOT")
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
tokenPath := filepath.Join(root, personaKey, "token")
data, err := os.ReadFile(tokenPath)
if err != nil {
// Persona dir / file absent: legitimate for the host shapes
// that don't ship the bootstrap kit (dev laptops, CI nodes)
// or for personas that intentionally carry no git-write
// credential. Caller decides whether the resulting
// "Authentication failed" at first push is a configuration
// error or expected behaviour.
return
}
token := strings.TrimSpace(string(data))
if token == "" {
return
}
// Primary lane — survives forensic #145 by virtue of the generic
// GIT_HTTP_* names not being on the SCM-write denylist.
setIfEmpty(envVars, "GIT_HTTP_USERNAME", personaKey)
setIfEmpty(envVars, "GIT_HTTP_PASSWORD", token)
// Fallback lane — askpass reads GITEA_USER / GITEA_TOKEN second.
// GITEA_USER survives the denylist; GITEA_TOKEN will be stripped
// by buildContainerEnv but is set here for completeness so the
// (envVars map[string]string) contract is consistent for callers
// inspecting it before the provisioner-level filter runs (e.g.
// the env-mutator plugin chain).
setIfEmpty(envVars, "GITEA_USER", personaKey)
setIfEmpty(envVars, "GITEA_TOKEN", token)
log.Printf("applyAgentGitHTTPCreds: injected GIT_HTTP_USERNAME/PASSWORD for persona %q (token %d bytes)", personaKey, len(token))
}
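Review note: the reason for shipping the token under two names is easier to see with the denylist applied. The sketch below is an assumption-laden illustration, not the provisioner's code: the key set copies the names listed in the comment above, and `filterDenylisted` is a hypothetical stand-in for whatever buildContainerEnv actually does.

```go
package main

import (
	"fmt"
	"sort"
)

// scmWriteTokenKeys copies the key names listed in the
// applyAgentGitHTTPCreds comment; the real set lives in the provisioner
// and may differ.
var scmWriteTokenKeys = map[string]bool{
	"GITEA_TOKEN":     true,
	"GITHUB_TOKEN":    true,
	"GH_TOKEN":        true,
	"GITLAB_TOKEN":    true,
	"GL_TOKEN":        true,
	"BITBUCKET_TOKEN": true,
}

// filterDenylisted is a hypothetical stand-in for the provisioner-level
// filter: it drops keys by exact name and keeps everything else.
func filterDenylisted(env map[string]string) map[string]string {
	out := make(map[string]string, len(env))
	for k, v := range env {
		if scmWriteTokenKeys[k] {
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// The four keys applyAgentGitHTTPCreds sets in the happy path.
	env := map[string]string{
		"GIT_HTTP_USERNAME": "agent-dev-a",
		"GIT_HTTP_PASSWORD": "token-bytes-redacted",
		"GITEA_USER":        "agent-dev-a",
		"GITEA_TOKEN":       "token-bytes-redacted",
	}
	kept := filterDenylisted(env)
	keys := make([]string, 0, len(kept))
	for k := range kept {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println(keys) // [GITEA_USER GIT_HTTP_PASSWORD GIT_HTTP_USERNAME]
}
```

Only GITEA_TOKEN is dropped, so the askpass helper still finds a username and a password under the generic names.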
// slugifyForEmail collapses a workspace name to a safe email localpart:
@@ -1,8 +1,6 @@
package handlers
import (
"os"
"path/filepath"
"testing"
)
@@ -77,261 +75,6 @@ func TestApplyAgentGitIdentity_NilMapIsSafe(t *testing.T) {
applyAgentGitIdentity(nil, "PM")
}
func TestApplyAgentGitIdentity_SetsGitAskpass(t *testing.T) {
// GIT_ASKPASS is what wires container-side HTTPS git auth to the
// persona credentials (GITEA_USER/GITEA_TOKEN, etc.) that
// loadPersonaEnvFile delivers via workspace_secrets. Without this,
// `git push` inside the container would fall through to interactive
// prompts (impossible) or a missing credential.helper (401).
env := map[string]string{}
applyAgentGitIdentity(env, "Frontend Engineer")
if env["GIT_ASKPASS"] != "/usr/local/bin/molecule-askpass" {
t.Errorf("GIT_ASKPASS: got %q, want %q",
env["GIT_ASKPASS"], "/usr/local/bin/molecule-askpass")
}
}
func TestApplyAgentGitIdentity_RespectsAskpassOverride(t *testing.T) {
// A workspace_secret or env-mutator plugin must be able to point at
// a custom askpass helper without us clobbering it. Symmetric with
// the GIT_AUTHOR_NAME override test above.
env := map[string]string{
"GIT_ASKPASS": "/opt/custom/askpass",
}
applyAgentGitIdentity(env, "Backend Engineer")
if env["GIT_ASKPASS"] != "/opt/custom/askpass" {
t.Errorf("GIT_ASKPASS should not be overwritten, got %q", env["GIT_ASKPASS"])
}
}
func TestApplyAgentGitIdentity_AskpassSkippedOnEmptyName(t *testing.T) {
// The empty-name early-return covers GIT_ASKPASS too — a provisioning
// glitch that dropped the workspace name shouldn't half-configure the
// container (identity vars empty but askpass wired). All-or-nothing.
env := map[string]string{}
applyAgentGitIdentity(env, "")
if _, ok := env["GIT_ASKPASS"]; ok {
t.Errorf("empty name should not set GIT_ASKPASS, got %q", env["GIT_ASKPASS"])
}
}
func TestApplyGitAskpass_NilMapIsSafe(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Errorf("applyGitAskpass panicked on nil map: %v", r)
}
}()
applyGitAskpass(nil)
}
// TestApplyAgentGitHTTPCreds_HappyPath: the prod-team shape — a persona
// dir at /etc/molecule-bootstrap/personas/<role>/token ships a write
// token. applyAgentGitHTTPCreds reads it and emits both the
// askpass-preferred GIT_HTTP_* pair and the GITEA_* fallback.
func TestApplyAgentGitHTTPCreds_HappyPath(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-bytes-redacted\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-a")
cases := map[string]string{
"GIT_HTTP_USERNAME": "agent-dev-a",
"GIT_HTTP_PASSWORD": "token-bytes-redacted",
"GITEA_USER": "agent-dev-a",
"GITEA_TOKEN": "token-bytes-redacted",
}
for k, want := range cases {
if got := env[k]; got != want {
t.Errorf("%s: got %q, want %q", k, got, want)
}
}
}
// TestApplyAgentGitHTTPCreds_TrimsWhitespace: bootstrap-kit-written
// token files canonically end in \n. Must trim like loadPersonaTokenFile
// does — Gitea PAT validator rejects embedded whitespace.
func TestApplyAgentGitHTTPCreds_TrimsWhitespace(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-b")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("\n raw-token-bytes \n\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-b")
if env["GIT_HTTP_PASSWORD"] != "raw-token-bytes" {
t.Errorf("GIT_HTTP_PASSWORD: token whitespace not trimmed; got %q", env["GIT_HTTP_PASSWORD"])
}
}
// TestApplyAgentGitHTTPCreds_RespectsOperatorOverride: if a workspace
// secret (loaded earlier by loadWorkspaceSecrets) already set the
// askpass pair, those values must win — operator intent ranks above
// persona-file defaults. Symmetric with applyAgentGitIdentity's
// GIT_AUTHOR_* override semantics.
func TestApplyAgentGitHTTPCreds_RespectsOperatorOverride(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("file-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{
"GIT_HTTP_USERNAME": "operator-user",
"GIT_HTTP_PASSWORD": "operator-secret",
}
applyAgentGitHTTPCreds(env, "agent-dev-a")
if env["GIT_HTTP_USERNAME"] != "operator-user" {
t.Errorf("GIT_HTTP_USERNAME should not be overwritten, got %q", env["GIT_HTTP_USERNAME"])
}
if env["GIT_HTTP_PASSWORD"] != "operator-secret" {
t.Errorf("GIT_HTTP_PASSWORD should not be overwritten, got %q", env["GIT_HTTP_PASSWORD"])
}
// Fallback pair was not pre-set, so persona-file fills it in.
if env["GITEA_TOKEN"] != "file-token" {
t.Errorf("GITEA_TOKEN fallback should be filled, got %q", env["GITEA_TOKEN"])
}
}
// TestApplyAgentGitHTTPCreds_EmptyKeyIsNoop: a persona key that is
// empty, whitespace-only, or a descriptive multi-word role must take
// the silent-no-op branch: no FS read, no env keys touched.
func TestApplyAgentGitHTTPCreds_EmptyKeyIsNoop(t *testing.T) {
root := t.TempDir()
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "")
if len(env) != 0 {
t.Errorf("empty persona key should leave env untouched, got %v", env)
}
applyAgentGitHTTPCreds(env, " ")
if len(env) != 0 {
t.Errorf("whitespace persona key should leave env untouched, got %v", env)
}
applyAgentGitHTTPCreds(env, "Frontend Engineer")
if len(env) != 0 {
t.Errorf("multi-word descriptive role should leave env untouched (silent no-op via isSafeRoleName), got %v", env)
}
}
// TestApplyAgentGitHTTPCreds_MissingTokenFile: persona dir exists but
// ships no token (legitimate for read-only personas like agent-pm pre-
// CTO-cred or partially-provisioned bootstrap). Silent no-op — no env
// keys set so first push surfaces "Authentication failed" cleanly
// instead of half-configured creds.
func TestApplyAgentGitHTTPCreds_MissingTokenFile(t *testing.T) {
root := t.TempDir()
if err := os.MkdirAll(filepath.Join(root, "agent-pm"), 0o755); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-pm")
if len(env) != 0 {
t.Errorf("missing token file should leave env untouched, got %v", env)
}
}
// TestApplyAgentGitHTTPCreds_EmptyTokenIsNoop: a whitespace-only token
// file (botched bootstrap) must be treated as absent — never emit
// GIT_HTTP_PASSWORD="" because the askpass helper would then return
// empty on the password prompt and git would surface a confusing 401
// rather than a clean "no credentials" state.
func TestApplyAgentGitHTTPCreds_EmptyTokenIsNoop(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte(" \t\n \n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-a")
if len(env) != 0 {
t.Errorf("whitespace-only token should leave env untouched, got %v", env)
}
}
// TestApplyAgentGitHTTPCreds_RejectsUnsafeRole: defense-in-depth — a
// crafted role with path separators / "../" must NOT touch the FS,
// even if a token file exists at the traversed location.
func TestApplyAgentGitHTTPCreds_RejectsUnsafeRole(t *testing.T) {
root := t.TempDir()
// Plant a token at <root>/token so a successful traversal would land here.
if err := os.WriteFile(filepath.Join(root, "token"),
[]byte("stolen-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", filepath.Join(root, "personas"))
for _, bad := range []string{"..", "../personas", "/abs", "with/slash", "."} {
env := map[string]string{}
applyAgentGitHTTPCreds(env, bad)
if len(env) != 0 {
t.Errorf("unsafe role %q must leave env untouched, got %v", bad, env)
}
}
}
// TestApplyAgentGitHTTPCreds_NilMapIsSafe: defensive — never panic
// on a nil map. Symmetric with applyAgentGitIdentity's nil-map test.
func TestApplyAgentGitHTTPCreds_NilMapIsSafe(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Errorf("applyAgentGitHTTPCreds panicked on nil map: %v", r)
}
}()
applyAgentGitHTTPCreds(nil, "agent-dev-a")
}
// TestApplyAgentGitHTTPCreds_DefaultPersonaRoot: when
// MOLECULE_PERSONA_ROOT is unset, the helper falls back to
// /etc/molecule-bootstrap/personas — the canonical operator-host path
// per the bootstrap kit shape. We can't write into /etc in a test,
// but we CAN assert the helper takes the silent-no-op branch when
// that real path is absent (the prod-default case on a dev laptop).
func TestApplyAgentGitHTTPCreds_DefaultPersonaRoot(t *testing.T) {
t.Setenv("MOLECULE_PERSONA_ROOT", "")
env := map[string]string{}
applyAgentGitHTTPCreds(env, "agent-dev-a")
// The /etc/molecule-bootstrap/personas/agent-dev-a/token path
// almost certainly does not exist on a dev/CI host. The contract
// here is "silent no-op when token unreadable", not "exact env
// state" — so we only assert no panic + no half-state pair.
if _, ok := env["GIT_HTTP_USERNAME"]; ok {
if _, ok2 := env["GIT_HTTP_PASSWORD"]; !ok2 {
t.Errorf("USERNAME set without PASSWORD — half-state; got %v", env)
}
}
}
func TestSlugifyForEmail(t *testing.T) {
cases := []struct {
in, want string
@@ -163,32 +163,8 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
},
})
// Fire-and-forget: send A2A in a background goroutine.
//
// internal#497 — the goroutine MUST NOT inherit the HTTP request's
// cancellation. `ctx` here is c.Request.Context(); the handler returns
// 202 a few lines below, which cancels that context immediately. Before
// this fix (regression ce2db75f) executeDelegation ran on the
// request-scoped ctx, so every DB op + proxy call in the detached
// goroutine failed `context canceled` the instant the 202 was written.
// That silently broke 100% of A2A peer delegations fleet-wide since
// 2026-05-12 (poll-mode peers never got their a2a_receive inbox row;
// lookupDeliveryMode swallowed the ctx error and defaulted to push).
//
// context.WithoutCancel detaches cancellation/deadline while PRESERVING
// all context values (trace/correlation/tenant ids that proxyA2ARequest
// and the broadcaster read off ctx) — this is the established pattern in
// this package (a2a_proxy.go:850, a2a_proxy_helpers.go:525,
// registry.go:822). The 30-minute ceiling matches the prior internal
// budget executeDelegation used before ce2db75f and the proxy's own
// absolute agent-dispatch ceiling (a2a_proxy.go forwardCtx).
delegationCtx, cancelDelegation := context.WithTimeout(
context.WithoutCancel(ctx), 30*time.Minute,
)
go func() {
defer cancelDelegation()
h.executeDelegation(delegationCtx, sourceID, body.TargetID, delegationID, a2aBody)
}()
// Fire-and-forget: send A2A in background goroutine
go h.executeDelegation(ctx, sourceID, body.TargetID, delegationID, a2aBody)
// Broadcast event so canvas shows delegation in real-time
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationSent), sourceID, map[string]interface{}{
@@ -16,65 +16,6 @@ import (
"github.com/gin-gonic/gin"
)
// ---------- internal#497 regression: detached goroutine ctx must outlive the handler ----------
// TestDelegate_DetachedContext_SurvivesRequestCancellation pins the
// load-bearing invariant that regression ce2db75f violated: the context
// handed to executeDelegation in the fire-and-forget goroutine must NOT be
// cancelled when the HTTP handler returns 202 (which cancels
// c.Request.Context()). Before the fix, executeDelegation ran on the
// request-scoped ctx, so every DB op + proxy call failed `context
// canceled` the instant the 202 was written — silently breaking 100% of
// A2A peer delegations fleet-wide since 2026-05-12.
//
// This test asserts the exact ctx-derivation contract used by Delegate
// (context.WithoutCancel(parent) + a timeout budget): the derived context
// (a) stays alive after the parent is cancelled, and (b) still carries
// parent values (trace/correlation/tenant ids the downstream proxy +
// broadcaster read off ctx). It is intentionally DB-free and fast.
func TestDelegate_DetachedContext_SurvivesRequestCancellation(t *testing.T) {
type ctxKey string
const traceKey ctxKey = "trace-id"
// Simulate c.Request.Context() carrying a correlation value.
parent, cancelParent := context.WithCancel(
context.WithValue(context.Background(), traceKey, "trace-abc-123"),
)
// Exact derivation Delegate uses for the detached goroutine.
delegationCtx, cancelDelegation := context.WithTimeout(
context.WithoutCancel(parent), 30*time.Minute,
)
defer cancelDelegation()
// The HTTP handler "returns 202" → request context is cancelled.
cancelParent()
if err := parent.Err(); err == nil {
t.Fatal("precondition: parent context should be cancelled after the handler returns")
}
// (a) Cancellation MUST NOT propagate to the detached context.
select {
case <-delegationCtx.Done():
t.Fatalf("regression: detached delegation ctx was cancelled by the handler returning (err=%v) — executeDelegation would fail every DB op with `context canceled`", delegationCtx.Err())
default:
// alive — correct
}
// (b) Parent values MUST still be readable (WithoutCancel preserves
// values; trace/correlation/tenant ids the proxy + broadcaster use).
if got, _ := delegationCtx.Value(traceKey).(string); got != "trace-abc-123" {
t.Errorf("detached ctx lost the parent trace value: got %q, want %q", got, "trace-abc-123")
}
// And it still has a real deadline (the 30m budget), so it is not an
// unbounded background context.
if _, hasDeadline := delegationCtx.Deadline(); !hasDeadline {
t.Error("detached ctx must carry the 30-minute timeout budget, but has no deadline")
}
}
// ---------- Delegate: missing target_id → 400 ----------
func TestDelegate_MissingTargetID(t *testing.T) {
@@ -24,30 +24,17 @@ import (
// BuildExternalConnectionPayload assembles the gin.H payload that the
// canvas's ExternalConnectModal consumes. Pure data — caller owns DB
// reads (workspace_id, workspace_name) and token minting (auth_token).
// reads (workspace_id) and token minting (auth_token).
//
// authToken may be empty for the read-only "show instructions again"
// path; the modal masks the field in that case rather than displaying
// an empty string.
//
// workspaceName feeds the per-workspace MCP server-name in the snippets
// that wire molecule-mcp into an external Claude Code (or other
// MCP-stdio) client. Without a unique server name a second
// `claude mcp add molecule` call REPLACES the first entry, collapsing
// multi-workspace use into a single per-session slot — see
// mcpServerNameForWorkspace below. May be empty (re-show / rotate paths
// that don't plumb the name); the helper falls back to the workspace
// ID's short prefix so the snippet is always unique.
func BuildExternalConnectionPayload(platformURL, workspaceID, workspaceName, authToken string) gin.H {
func BuildExternalConnectionPayload(platformURL, workspaceID, authToken string) gin.H {
pURL := strings.TrimSuffix(platformURL, "/")
mcpName := mcpServerNameForWorkspace(workspaceID, workspaceName)
stamp := func(tmpl string) string {
return strings.ReplaceAll(
strings.ReplaceAll(
strings.ReplaceAll(tmpl, "{{PLATFORM_URL}}", pURL),
"{{WORKSPACE_ID}}", workspaceID,
),
"{{MCP_SERVER_NAME}}", mcpName,
strings.ReplaceAll(tmpl, "{{PLATFORM_URL}}", pURL),
"{{WORKSPACE_ID}}", workspaceID,
)
}
return gin.H{
@@ -90,81 +77,6 @@ func externalPlatformURL(c *gin.Context) string {
return scheme + "://" + host
}
// mcpServerNameForWorkspace derives the unique MCP server name used in
// the Universal MCP snippet's `claude mcp add <name> -- ...` line.
//
// Why per-workspace, not a fixed "molecule": `claude mcp add` keys
// entries by name in ~/.claude.json, so re-running with the same name
// silently REPLACES the previous entry. A single external Claude Code
// session that connects to N molecule workspaces must therefore use N
// distinct server names — otherwise the second install collapses the
// first, and the user experiences "MCP is per-session". MCP itself
// supports many servers per session; the install-snippet name was the
// only thing standing in the way.
//
// Pattern: "molecule-<slug>" where slug comes from the workspace name
// (lowercased, non-alphanumeric → hyphen, collapsed, trimmed, <=24
// chars). Falls back to the workspace ID's first 8 chars when the name
// is empty or slugifies to nothing — both produce a deterministic,
// Claude-Code-name-safe (alphanumeric + hyphens, no spaces / dots /
// slashes) identifier that disambiguates per-workspace.
//
// Two workspaces with identical names still produce identical slugs by
// design — the user picked them to look the same. The
// `claude mcp add` step will overwrite the older one in that case;
// the workaround is to rename one, then re-run. Documented in the
// snippet header so users aren't surprised.
func mcpServerNameForWorkspace(workspaceID, workspaceName string) string {
const fallbackIDPrefixLen = 8
const maxSlugLen = 24
slug := slugifyForMcpName(workspaceName, maxSlugLen)
if slug == "" {
id := strings.ReplaceAll(workspaceID, "-", "")
if len(id) > fallbackIDPrefixLen {
id = id[:fallbackIDPrefixLen]
}
slug = id
}
if slug == "" {
// Defensive: empty workspaceID at this layer means the caller
// is misusing the API; we still return a usable (non-colliding
// in the common case) constant rather than producing "molecule-"
// which Claude Code would reject.
return "molecule"
}
return "molecule-" + slug
}
// slugifyForMcpName lowercases, replaces non-[a-z0-9] runs with a single
// '-', trims leading/trailing '-', and truncates to maxLen. Returns ""
// if nothing usable remains. Pure helper; no allocations beyond the
// builder.
func slugifyForMcpName(s string, maxLen int) string {
var b strings.Builder
b.Grow(len(s))
lastHyphen := true // suppress leading hyphens
for _, r := range s {
switch {
case r >= 'A' && r <= 'Z':
b.WriteRune(r + ('a' - 'A'))
lastHyphen = false
case (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9'):
b.WriteRune(r)
lastHyphen = false
default:
if !lastHyphen {
b.WriteByte('-')
lastHyphen = true
}
}
}
out := strings.TrimRight(b.String(), "-")
if len(out) > maxLen {
out = strings.TrimRight(out[:maxLen], "-")
}
return out
}
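Review note: a few worked inputs make the naming rules concrete. The two functions below are copied from the diff above (comments trimmed) so the block runs on its own; the workspace ID and names in `main` are made-up sample values.

```go
package main

import (
	"fmt"
	"strings"
)

// Copied from the diff above so this example is self-contained.
func mcpServerNameForWorkspace(workspaceID, workspaceName string) string {
	const fallbackIDPrefixLen = 8
	const maxSlugLen = 24
	slug := slugifyForMcpName(workspaceName, maxSlugLen)
	if slug == "" {
		// Name unusable: fall back to the ID's first 8 non-hyphen chars.
		id := strings.ReplaceAll(workspaceID, "-", "")
		if len(id) > fallbackIDPrefixLen {
			id = id[:fallbackIDPrefixLen]
		}
		slug = id
	}
	if slug == "" {
		return "molecule"
	}
	return "molecule-" + slug
}

// Copied from the diff above so this example is self-contained.
func slugifyForMcpName(s string, maxLen int) string {
	var b strings.Builder
	b.Grow(len(s))
	lastHyphen := true // suppress leading hyphens
	for _, r := range s {
		switch {
		case r >= 'A' && r <= 'Z':
			b.WriteRune(r + ('a' - 'A'))
			lastHyphen = false
		case (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9'):
			b.WriteRune(r)
			lastHyphen = false
		default:
			if !lastHyphen {
				b.WriteByte('-')
				lastHyphen = true
			}
		}
	}
	out := strings.TrimRight(b.String(), "-")
	if len(out) > maxLen {
		out = strings.TrimRight(out[:maxLen], "-")
	}
	return out
}

func main() {
	const id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890" // sample ID, not a real workspace
	fmt.Println(mcpServerNameForWorkspace(id, "My Workspace"))
	// molecule-my-workspace
	fmt.Println(mcpServerNameForWorkspace(id, "Frontend Engineering Team (EU West)"))
	// molecule-frontend-engineering-tea (slug truncated to 24 chars)
	fmt.Println(mcpServerNameForWorkspace(id, "!!!"))
	// molecule-a1b2c3d4 (name slugifies to nothing, ID prefix used)
	fmt.Println(mcpServerNameForWorkspace("", ""))
	// molecule
}
```

Note that the ID fallback is not passed through the slugifier, so the "name-safe" guarantee for that branch rests on workspace IDs already being lowercase hex with hyphens.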
// externalCurlTemplate — zero-dependency register snippet. Placeholders:
// - {{PLATFORM_URL}}, {{WORKSPACE_ID}} — filled server-side
// - $WORKSPACE_AUTH_TOKEN — env var, operator sets
@@ -304,14 +216,6 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# for any MCP-aware runtime (Claude Code, hermes, codex, etc.).
# Pair with the Claude Code or Python SDK tab if your runtime needs
# inbound A2A delivery (canvas messages -> agent conversation turns).
#
# Multi-workspace: MCP supports many servers per Claude Code session.
# This snippet uses a workspace-specific server name ({{MCP_SERVER_NAME}})
# so installing for a second workspace ADDS another entry instead of
# overwriting the first; run the snippet from each workspace's modal
# in turn and ` + "`claude mcp list`" + ` will show all of them. If two
# workspaces have the same name, slugs collide and the second install
# overwrites the first; rename one workspace to disambiguate.
# Requires Python >= 3.11. On 3.10 or older pip says
# "Could not find a version that satisfies the requirement
@@ -320,14 +224,11 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# Upgrade the interpreter (brew install python@3.12 / apt install
# python3.12 / etc.) or use a 3.11+ venv.
# 1. Install the workspace runtime wheel (once per machine; safe to
# re-run; subsequent workspaces share the same wheel):
# 1. Install the workspace runtime wheel:
pip install molecule-ai-workspace-runtime
# 2. Wire molecule-mcp into your agent's MCP config. Claude Code:
# NOTE: the server name is workspace-specific ("{{MCP_SERVER_NAME}}") so
# multiple molecule workspaces co-exist in one Claude Code session.
claude mcp add {{MCP_SERVER_NAME}} -s user -- env \
claude mcp add molecule -s user -- env \
WORKSPACE_ID={{WORKSPACE_ID}} \
PLATFORM_URL={{PLATFORM_URL}} \
MOLECULE_WORKSPACE_TOKEN="<paste from create response>" \
@@ -348,11 +249,8 @@ claude mcp add {{MCP_SERVER_NAME}} -s user -- env \
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# "Tools not appearing in your agent": run ` + "`claude mcp list`" + ` (or
# your runtime's equivalent) and confirm the {{MCP_SERVER_NAME}} entry.
# If missing, re-run the ` + "`claude mcp add`" + ` line above.
# "Connecting a second workspace overwrote the first": re-check that
# the server name in the line above is {{MCP_SERVER_NAME}} (not a bare
# "molecule"); each workspace's modal generates a distinct name.
# your runtime's equivalent) and confirm the molecule entry. If
# missing, re-run the ` + "`claude mcp add`" + ` line above.
# "ConnectionRefused / DNS error on first call": PLATFORM_URL must
# include the scheme (https://) and have NO trailing slash. Verify
# with: curl ${PLATFORM_URL}/healthz
@@ -433,13 +331,6 @@ const externalHermesChannelTemplate = `# Hermes channel — bridges this workspa
# hermes-agent session. No tunnel/public URL needed (long-poll based,
# same shape as the Claude Code channel).
#
# Multi-workspace: each workspace's plugin_platforms entry is keyed by a
# workspace-specific slug ("{{MCP_SERVER_NAME}}") so two molecule
# workspaces can coexist in one hermes config. YAML rejects duplicate
# mapping keys, so re-using the same "molecule:" key for a second
# workspace would silently overwrite the first. Re-running this snippet
# for another workspace ADDS a sibling entry instead.
#
# Prereq: a hermes-agent install on the target machine. Latest builds
# (post #17751) ship the platform-plugin API natively; older ones are
# also supported via the plugin's dual-mode fallback.
@@ -454,17 +345,13 @@ export MOLECULE_PLATFORM_URL={{PLATFORM_URL}}
export MOLECULE_WORKSPACE_TOKEN="<paste from create response>"
# 3. Edit ~/.hermes/config.yaml: under your existing top-level
# gateway: block, add a plugin_platforms entry. The platform key
# ({{MCP_SERVER_NAME}}) is workspace-specific so multiple molecule
# workspaces coexist; re-using the same key for a second workspace
# would silently overwrite the first (YAML duplicate-key collapse):
# gateway: block, add a plugin_platforms entry:
#
# gateway:
# # ...your existing gateway settings...
# plugin_platforms:
# {{MCP_SERVER_NAME}}:
# molecule:
# enabled: true
# workspace_id: {{WORKSPACE_ID}}
#
# If you don't yet have a gateway: block, create one with just
# that plugin_platforms entry. Don't append blindly: YAML
@@ -517,14 +404,6 @@ hermes gateway --replace
const externalCodexTemplate = `# Codex external setup: outbound tools (MCP) + inbound push (bridge).
# For operators whose external agent is a codex CLI (@openai/codex)
# session.
#
# Multi-workspace: the TOML table name is workspace-specific
# ("{{MCP_SERVER_NAME}}") so two molecule workspaces can coexist in one
# ~/.codex/config.toml. TOML rejects duplicate
# [mcp_servers.<name>] tables, so re-using a bare "molecule" name for a
# second workspace would either break codex parsing or silently
# overwrite the first. Re-running this snippet for another workspace
# ADDS a sibling table instead.
# 1. Install codex CLI, the workspace runtime, and the bridge daemon:
npm install -g @openai/codex@latest
@@ -533,21 +412,23 @@ pip install codex-channel-molecule
# 2. Wire the molecule MCP server into codex's config.toml; this is
# the OUTBOUND path (codex calls list_peers / delegate_task /
# send_message_to_user / commit_memory). The table name
# ({{MCP_SERVER_NAME}}) is workspace-specific; re-running the
# snippet for a DIFFERENT workspace appends a sibling table without
# touching the first. Re-running for the SAME workspace produces
# the same name, so replace the existing block instead of appending.
# send_message_to_user / commit_memory).
#
# Don't append blindly: TOML rejects duplicate
# [mcp_servers.molecule] tables, so re-running on an existing
# config will break codex parsing. If [mcp_servers.molecule]
# already exists (e.g. you set this up before), replace the
# existing block instead of appending.
mkdir -p ~/.codex
# (then open ~/.codex/config.toml in your editor and paste:)
#
# [mcp_servers.{{MCP_SERVER_NAME}}]
# [mcp_servers.molecule]
# command = "molecule-mcp"
# args = []
# startup_timeout_sec = 30
#
# [mcp_servers.{{MCP_SERVER_NAME}}.env]
# [mcp_servers.molecule.env]
# WORKSPACE_ID = "{{WORKSPACE_ID}}"
# PLATFORM_URL = "{{PLATFORM_URL}}"
# MOLECULE_WORKSPACE_TOKEN = "<paste from create response>"
@@ -591,13 +472,11 @@ codex
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# [mcp_servers.{{MCP_SERVER_NAME}}] not loaded: codex must be >= 0.57.
# [mcp_servers.molecule] not loaded: codex must be >= 0.57.
# Check with ` + "`codex --version`" + `; upgrade via npm install -g @openai/codex@latest.
# TOML parse error after re-running setup for the SAME workspace:
# TOML rejects duplicate [mcp_servers.<name>] tables. Open
# ~/.codex/config.toml and remove the old block before pasting the
# new one. (A second molecule workspace gets a DIFFERENT table
# name, so coexisting workspaces don't conflict.)
# TOML parse error after re-running setup: TOML rejects duplicate
# [mcp_servers.molecule] tables. Open ~/.codex/config.toml and
# remove the old block before pasting the new one.
# Canvas messages don't wake codex: step 3 (codex-channel-molecule
# bridge daemon) is required for inbound push. Check
# pgrep -f codex-channel-molecule and tail ~/.codex-channel-molecule/daemon.log.
@@ -623,23 +502,23 @@ const externalKimiTemplate = `# Kimi CLI external setup — register + heartbeat
pip install molecule-ai-workspace-runtime
# 2. Save credentials and the bridge script:
mkdir -p ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}
chmod 700 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}
cat > ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env <<'EOF'
mkdir -p ~/.molecule-ai/kimi-workspace
chmod 700 ~/.molecule-ai/kimi-workspace
cat > ~/.molecule-ai/kimi-workspace/env <<'EOF'
WORKSPACE_ID={{WORKSPACE_ID}}
PLATFORM_URL={{PLATFORM_URL}}
MOLECULE_WORKSPACE_TOKEN=<paste from create response>
EOF
chmod 600 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env
chmod 600 ~/.molecule-ai/kimi-workspace/env
cat > ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py <<'PYEOF'
cat > ~/.molecule-ai/kimi-workspace/kimi_bridge.py <<'PYEOF'
#!/usr/bin/env python3
"""Kimi bridge — keeps workspace online and polls for canvas messages."""
import json, logging, time
from pathlib import Path
import httpx
ENV = Path.home() / ".molecule-ai" / "kimi-{{MCP_SERVER_NAME}}" / "env"
ENV = Path.home() / ".molecule-ai" / "kimi-workspace" / "env"
HEARTBEAT_INTERVAL = 20
POLL_INTERVAL = 5
@@ -729,10 +608,10 @@ def main():
if __name__ == "__main__":
main()
PYEOF
chmod +x ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py
chmod +x ~/.molecule-ai/kimi-workspace/kimi_bridge.py
# 3. Start the bridge (run in a persistent terminal or via launchd):
python3 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py
python3 ~/.molecule-ai/kimi-workspace/kimi_bridge.py
# What the script does:
# Registers the workspace in poll mode (no public URL needed)
@@ -743,7 +622,7 @@ python3 ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/kimi_bridge.py
# To change the reply logic, edit the send_reply() call inside the loop.
# To send a one-off reply from another terminal:
# curl -fsS -X POST "{{PLATFORM_URL}}/workspaces/{{WORKSPACE_ID}}/notify" \
# -H "Authorization: Bearer $(cat ~/.molecule-ai/kimi-{{MCP_SERVER_NAME}}/env | grep TOKEN | cut -d= -f2)" \
# -H "Authorization: Bearer $(cat ~/.molecule-ai/kimi-workspace/env | grep TOKEN | cut -d= -f2)" \
# -H "Content-Type: application/json" \
# -d '{"message":"Hello from Kimi"}'
#
@@ -765,13 +644,6 @@ const externalOpenClawTemplate = `# OpenClaw MCP config — outbound tool path.
# sessions.steer push path; an external setup would need the same
# bridge daemon the template uses. For inbound delivery on an
# external machine today, pair with the Python SDK tab.
#
# Multi-workspace: each workspace registers under a workspace-specific
# MCP server name ("{{MCP_SERVER_NAME}}"). openclaw keys MCP servers by
# name in its config (~/.openclaw/mcp/<name>.json), so re-running with
# a bare "molecule" name would overwrite the prior workspace's entry.
# Re-run this snippet for another workspace to ADD a sibling entry
# instead.
# 1. Install openclaw CLI + the workspace runtime wheel:
# The version pin (>=0.1.999) ensures the "molecule-mcp" console
@@ -802,7 +674,7 @@ pip install "molecule-ai-workspace-runtime>=0.1.999"
# workspace as awaiting_agent (OFFLINE) within 60-90s even while
# tools work.
WORKSPACE_TOKEN="<paste from create response>"
openclaw mcp set {{MCP_SERVER_NAME}} "$(cat <<EOF
openclaw mcp set molecule "$(cat <<EOF
{
"command": "molecule-mcp",
"args": [],
@@ -832,6 +704,6 @@ openclaw agent --message "list my peers"
# Gateway not starting: tail ~/.openclaw/gateway.log. The loopback
# bind requires :18789 to be free; check with ` + "`lsof -iTCP:18789`" + `.
# ` + "`openclaw mcp set`" + ` rejected: the heredoc generates JSON;
# verify with ` + "`jq < ~/.openclaw/mcp/{{MCP_SERVER_NAME}}.json`" + ` and re-run
# verify with ` + "`jq < ~/.openclaw/mcp/molecule.json`" + ` and re-run
# ` + "`openclaw mcp set`" + ` if the file is malformed.
`
@@ -52,7 +52,7 @@ func (h *WorkspaceHandler) RotateExternalCredentials(c *gin.Context) {
}
ctx := c.Request.Context()
runtime, name, err := lookupWorkspaceRuntimeAndName(ctx, db.DB, id)
runtime, err := lookupWorkspaceRuntime(ctx, db.DB, id)
if errors.Is(err, sql.ErrNoRows) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
@@ -108,7 +108,7 @@ func (h *WorkspaceHandler) RotateExternalCredentials(c *gin.Context) {
platformURL := externalPlatformURL(c)
c.JSON(http.StatusOK, gin.H{
"connection": BuildExternalConnectionPayload(platformURL, id, name, tok),
"connection": BuildExternalConnectionPayload(platformURL, id, tok),
})
}
@@ -129,7 +129,7 @@ func (h *WorkspaceHandler) GetExternalConnection(c *gin.Context) {
}
ctx := c.Request.Context()
runtime, name, err := lookupWorkspaceRuntimeAndName(ctx, db.DB, id)
runtime, err := lookupWorkspaceRuntime(ctx, db.DB, id)
if errors.Is(err, sql.ErrNoRows) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
@@ -149,20 +149,16 @@ func (h *WorkspaceHandler) GetExternalConnection(c *gin.Context) {
platformURL := externalPlatformURL(c)
c.JSON(http.StatusOK, gin.H{
"connection": BuildExternalConnectionPayload(platformURL, id, name, ""),
"connection": BuildExternalConnectionPayload(platformURL, id, ""),
})
}
// lookupWorkspaceRuntimeAndName returns runtime + name in one round-trip.
// Wrapped for readability + so tests can mock the single SELECT.
// Used by rotate / re-show paths: runtime gates the external-only check;
// name feeds the per-workspace MCP server slug in BuildExternalConnectionPayload
// (so the Universal MCP snippet uses a stable per-workspace name instead
// of overwriting prior `claude mcp add molecule` entries).
// Returns sql.ErrNoRows when the workspace doesn't exist.
func lookupWorkspaceRuntimeAndName(ctx context.Context, handle *sql.DB, id string) (runtime, name string, err error) {
err = handle.QueryRowContext(ctx, `
SELECT COALESCE(runtime, ''), COALESCE(name, '') FROM workspaces WHERE id = $1
`, id).Scan(&runtime, &name)
return runtime, name, err
// lookupWorkspaceRuntime returns the workspace's runtime field. Wrapped
// for readability + so tests can mock the single SELECT.
func lookupWorkspaceRuntime(ctx context.Context, handle *sql.DB, id string) (string, error) {
var runtime string
err := handle.QueryRowContext(ctx, `
SELECT COALESCE(runtime, '') FROM workspaces WHERE id = $1
`, id).Scan(&runtime)
return runtime, err
}
@@ -35,9 +35,9 @@ func TestRotateExternalCredentials_HappyPath(t *testing.T) {
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
// 1. Runtime lookup
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-ext").
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("external", "test-ws"))
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("external"))
// 2. Revoke all live tokens
mock.ExpectExec(`UPDATE workspace_auth_tokens`).
@@ -98,9 +98,9 @@ func TestRotateExternalCredentials_RejectsNonExternal(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-hermes").
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("hermes", "test-ws"))
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("hermes"))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -129,9 +129,9 @@ func TestRotateExternalCredentials_NotFound(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-missing").
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"})) // no rows
WillReturnRows(sqlmock.NewRows([]string{"runtime"})) // no rows
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -172,9 +172,9 @@ func TestGetExternalConnection_HappyPathReturnsBlankToken(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-ext").
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("external", "test-ws"))
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("external"))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -211,9 +211,9 @@ func TestGetExternalConnection_RejectsNonExternal(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-claude").
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}).AddRow("claude-code", "test-ws"))
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -233,9 +233,9 @@ func TestGetExternalConnection_NotFound(t *testing.T) {
setupTestRedis(t)
wh := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\), COALESCE\(name, ''\) FROM workspaces WHERE id = \$1`).
mock.ExpectQuery(`SELECT COALESCE\(runtime, ''\) FROM workspaces WHERE id = \$1`).
WithArgs("ws-missing").
WillReturnRows(sqlmock.NewRows([]string{"runtime", "name"}))
WillReturnRows(sqlmock.NewRows([]string{"runtime"}))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
@@ -253,7 +253,7 @@ func TestGetExternalConnection_NotFound(t *testing.T) {
// ---------- BuildExternalConnectionPayload (pure helper) ----------
func TestBuildExternalConnectionPayload_StampsPlaceholders(t *testing.T) {
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "my-bot", "tok-abc")
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "tok-abc")
if got["workspace_id"] != "ws-7" {
t.Errorf("workspace_id: %v", got["workspace_id"])
@@ -267,18 +267,6 @@ func TestBuildExternalConnectionPayload_StampsPlaceholders(t *testing.T) {
if got["registry_endpoint"] != "https://platform.test/registry/register" {
t.Errorf("registry_endpoint: %v", got["registry_endpoint"])
}
// Universal MCP snippet must contain a workspace-specific server
// name derived from the workspace name. Without this each new
// `claude mcp add` would overwrite the previous entry in the user's
// ~/.claude.json (servers are keyed by name) — collapsing
// multi-workspace use into one slot. See mcpServerNameForWorkspace.
mcp, _ := got["universal_mcp_snippet"].(string)
if !strings.Contains(mcp, "claude mcp add molecule-my-bot ") {
t.Errorf("universal_mcp_snippet missing per-workspace server name 'molecule-my-bot':\n%s", mcp)
}
if strings.Contains(mcp, "{{MCP_SERVER_NAME}}") {
t.Errorf("universal_mcp_snippet still contains literal {{MCP_SERVER_NAME}}")
}
// {{PLATFORM_URL}} + {{WORKSPACE_ID}} placeholders must be substituted
// out of every snippet — if any snippet still contains a literal
// "{{PLATFORM_URL}}" or "{{WORKSPACE_ID}}", a future template author
@@ -304,7 +292,7 @@ func TestBuildExternalConnectionPayload_TrimsTrailingSlash(t *testing.T) {
// being concatenated into endpoint paths — otherwise the operator
// gets `https://platform.test//registry/register` (double slash) which
// some servers reject as a redirect target.
got := BuildExternalConnectionPayload("https://platform.test/", "ws-7", "", "")
got := BuildExternalConnectionPayload("https://platform.test/", "ws-7", "")
if got["platform_url"] != "https://platform.test" {
t.Errorf("platform_url: trailing slash not trimmed; got %v", got["platform_url"])
}
@@ -316,100 +304,8 @@ func TestBuildExternalConnectionPayload_TrimsTrailingSlash(t *testing.T) {
func TestBuildExternalConnectionPayload_BlankAuthTokenIsAllowed(t *testing.T) {
// Re-show path: auth_token="" is the contract; the modal masks the
// field and labels it "rotate to reveal a new token".
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "", "")
got := BuildExternalConnectionPayload("https://platform.test", "ws-7", "")
if got["auth_token"] != "" {
t.Errorf("blank token must propagate as \"\"; got %v", got["auth_token"])
}
}
// TestBuildExternalConnectionPayload_McpServerNameUniquePerWorkspace
// pins the multi-workspace install contract: two distinct workspaces
// must produce two distinct `claude mcp add` server-name lines, or
// installing the second one will overwrite the first entry in the
// user's ~/.claude.json (servers are keyed by name) — collapsing
// multi-workspace use into a single per-session slot, which is the
// "this is per-session" UX the CTO observed 2026-05-18.
func TestBuildExternalConnectionPayload_McpServerNameUniquePerWorkspace(t *testing.T) {
cases := []struct {
name string
workspaceID string
wsName string
wantAddLine string // must appear in universal_mcp_snippet
}{
{"plain name", "id-a", "my-bot", "claude mcp add molecule-my-bot "},
{"name with spaces + caps", "id-b", "My Bot 1", "claude mcp add molecule-my-bot-1 "},
// Symbol/punctuation collapses to single hyphens and trims.
{"name with symbols", "id-c", "--Foo!!Bar--", "claude mcp add molecule-foo-bar "},
// Empty name falls back to the first 8 chars of the (de-hyphenated)
// workspace UUID — keeps the snippet unique per workspace even
// when callers (rotate/re-show pre-name-lookup) pass "".
{"empty name, uuid id", "12345678-aaaa-bbbb-cccc-deadbeef0000", "", "claude mcp add molecule-12345678 "},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := BuildExternalConnectionPayload("https://p.test", tc.workspaceID, tc.wsName, "tok")
mcp, _ := got["universal_mcp_snippet"].(string)
if !strings.Contains(mcp, tc.wantAddLine) {
t.Errorf("missing %q in universal_mcp_snippet:\n%s", tc.wantAddLine, mcp)
}
// Belt + suspenders: never the bare fixed `molecule` name —
// that was the bug. (Match with trailing space so the
// "molecule-…" form passes.)
if strings.Contains(mcp, "claude mcp add molecule ") {
t.Errorf("snippet regressed to fixed `claude mcp add molecule `; got:\n%s", mcp)
}
})
}
}
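The naming rules pinned by the test table above can be sketched as a small helper. This is only an illustration, assuming the slug is produced by lower-casing the name, collapsing each run of non-alphanumeric characters into a single hyphen, trimming hyphens at both ends, and falling back to the first 8 characters of the de-hyphenated workspace ID when the name is empty. `slugSketch` is a hypothetical name; the real `mcpServerNameForWorkspace` is not shown in this diff and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// slugSketch is a hypothetical stand-in for mcpServerNameForWorkspace.
// It derives a per-workspace MCP server name from the workspace name,
// falling back to the workspace ID when the name yields no usable slug.
func slugSketch(name, workspaceID string) string {
	var b strings.Builder
	prevHyphen := true // true at start so leading separators are dropped
	for _, r := range strings.ToLower(name) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			prevHyphen = false
		} else if !prevHyphen {
			// Collapse any run of other characters into one hyphen.
			b.WriteByte('-')
			prevHyphen = true
		}
	}
	slug := strings.TrimRight(b.String(), "-")
	if slug == "" {
		// Assumed fallback: first 8 chars of the de-hyphenated ID.
		id := strings.ReplaceAll(workspaceID, "-", "")
		if len(id) > 8 {
			id = id[:8]
		}
		slug = id
	}
	return "molecule-" + slug
}

func main() {
	fmt.Println(slugSketch("my-bot", "id-a"))       // molecule-my-bot
	fmt.Println(slugSketch("My Bot 1", "id-b"))     // molecule-my-bot-1
	fmt.Println(slugSketch("--Foo!!Bar--", "id-c")) // molecule-foo-bar
	fmt.Println(slugSketch("", "12345678-aaaa-bbbb-cccc-deadbeef0000")) // molecule-12345678
}
```

The four calls in `main` mirror the four rows of the test table, so the sketch can be checked against the expected `claude mcp add` lines directly.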
// TestBuildExternalConnectionPayload_AllRuntimeSnippetsAreWorkspaceUnique
// extends the multi-workspace install contract to every runtime tab in
// the modal. Each MCP-host config keyspace has the SAME equivalence
// class as Claude Code's `claude mcp add <name>`:
//
// - codex: ~/.codex/config.toml [mcp_servers.<name>] — TOML rejects
// duplicate table keys, so a second workspace with the same name
// either breaks parsing or overwrites the first table.
// - openclaw: ~/.openclaw/mcp/<name>.json — file is keyed by <name>,
// `openclaw mcp set <same-name>` overwrites.
// - hermes: ~/.hermes/config.yaml gateway.plugin_platforms.<key>:
// YAML rejects duplicate mapping keys.
// - kimi: ~/.molecule-ai/kimi-<slug>/ per-workspace dir — single
// "kimi-workspace" dir would have both workspaces' envs collide.
//
// All four must therefore stamp the workspace-specific
// {{MCP_SERVER_NAME}} slug. This test catches a future template author
// who introduces a new runtime tab without plumbing the slug.
func TestBuildExternalConnectionPayload_AllRuntimeSnippetsAreWorkspaceUnique(t *testing.T) {
got := BuildExternalConnectionPayload("https://p.test", "id-a", "my-bot", "tok")
// Per-template literal that proves the slug was stamped through.
wantPerSnippet := map[string]string{
"universal_mcp_snippet": "claude mcp add molecule-my-bot ",
"codex_snippet": "[mcp_servers.molecule-my-bot]",
"openclaw_snippet": "openclaw mcp set molecule-my-bot ",
"hermes_channel_snippet": " molecule-my-bot:",
"kimi_snippet": "~/.molecule-ai/kimi-molecule-my-bot",
}
for key, needle := range wantPerSnippet {
v, _ := got[key].(string)
if !strings.Contains(v, needle) {
t.Errorf("%s missing per-workspace slug literal %q:\n%s", key, needle, v)
}
}
// No template should still contain the unstamped placeholder — that
// would mean BuildExternalConnectionPayload's stamp() didn't sweep
// it, which is the regression we're guarding against.
for _, k := range []string{
"curl_register_template", "python_snippet",
"claude_code_channel_snippet", "universal_mcp_snippet",
"hermes_channel_snippet", "codex_snippet", "openclaw_snippet",
"kimi_snippet",
} {
v, _ := got[k].(string)
if strings.Contains(v, "{{MCP_SERVER_NAME}}") {
t.Errorf("%s still contains literal {{MCP_SERVER_NAME}}", k)
}
}
}
@@ -8,7 +8,6 @@ import (
"fmt"
"net/http"
"net/http/httptest"
"sync"
"testing"
"time"
@@ -23,39 +22,8 @@ import (
"github.com/redis/go-redis/v9"
)
// liveTestHandlers tracks every WorkspaceHandler built during the test
// binary's lifetime so setupTestDB can drain their in-flight goAsync
// goroutines (notably the detached RestartByID restart cycle, which
// reads the global db.DB) BEFORE restoring db.DB. Without this drain a
// fire-and-forget restart goroutine spawned by one test outlives that
// test and races the db.DB swap in a later test's t.Cleanup — the
// 0x...d548 data race on platform/internal/db.DB.
var (
liveTestHandlersMu sync.Mutex
liveTestHandlers []*WorkspaceHandler
)
func init() {
gin.SetMode(gin.TestMode)
newHandlerHook = func(h *WorkspaceHandler) {
liveTestHandlersMu.Lock()
liveTestHandlers = append(liveTestHandlers, h)
liveTestHandlersMu.Unlock()
}
}
// drainTestAsync waits for every tracked handler's goAsync goroutines to
// finish. Called from setupTestDB's cleanup before db.DB is restored so
// no detached restart/provision goroutine is mid-read of db.DB when the
// pointer is swapped.
func drainTestAsync() {
liveTestHandlersMu.Lock()
handlers := make([]*WorkspaceHandler, len(liveTestHandlers))
copy(handlers, liveTestHandlers)
liveTestHandlersMu.Unlock()
for _, h := range handlers {
h.waitAsyncForTest()
}
}
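The drain pattern described above depends on the handler tracking its detached goroutines. A minimal sketch of that idea follows, assuming a `sync.WaitGroup` sits behind the tracking. `asyncTracker`, its `goAsync`, and its `wait` are hypothetical names for illustration; the real `goAsync` and `waitAsyncForTest` implementations are not part of this diff.

```go
package main

import (
	"fmt"
	"sync"
)

// asyncTracker is a hypothetical illustration of a handler that tracks
// its fire-and-forget goroutines so a test harness can drain them.
type asyncTracker struct {
	wg sync.WaitGroup
}

// goAsync starts fn in a goroutine and registers it with the WaitGroup.
// Add must happen before the goroutine starts so wait cannot miss it.
func (a *asyncTracker) goAsync(fn func()) {
	a.wg.Add(1)
	go func() {
		defer a.wg.Done()
		fn()
	}()
}

// wait blocks until every goroutine started through goAsync has returned.
func (a *asyncTracker) wait() {
	a.wg.Wait()
}

func main() {
	var tr asyncTracker
	results := make(chan int, 3) // buffered so the goroutines never block
	for i := 0; i < 3; i++ {
		n := i
		tr.goAsync(func() { results <- n })
	}
	// Drain first; only after this is it safe to swap shared state
	// (in the diff's case, the global db.DB pointer).
	tr.wait()
	fmt.Println("finished goroutines:", len(results)) // finished goroutines: 3
}
```

Calling `wait` before restoring the shared pointer is what orders the goroutines' reads ahead of the cleanup's write.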
// setupTestDB creates a sqlmock DB and assigns it to the global db.DB.
@@ -74,16 +42,7 @@ func setupTestDB(t *testing.T) sqlmock.Sqlmock {
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() {
// Drain detached async goroutines (e.g. goAsync(RestartByID),
// which reads db.DB in runRestartCycle before its provisioner
// gate) BEFORE swapping db.DB back. Doing the restore first
// would let an in-flight restart goroutine read db.DB while
// this line writes it — the data race this guards against.
drainTestAsync()
db.DB = prevDB
mockDB.Close()
})
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
// Disable SSRF checks for the duration of this test only. Restore
// the previous state via t.Cleanup so that TestIsSafeURL_* tests
@@ -218,14 +218,6 @@ func loadWorkspaceEnv(orgBaseDir, filesDir string) map[string]string {
// check, or when the env file does not exist (workspaces without a role —
// or running on hosts that don't ship the bootstrap dir — keep their old
// behavior).
//
// Token-file fallback: the newer prod-team personas (agent-dev-a,
// agent-dev-b, agent-pm) ship `token` + `universal-auth.env` only — no
// legacy plaintext `env` file. When the env-file load produces zero rows,
// loadPersonaTokenFile fills in GITEA_TOKEN / GITEA_USER / GITEA_USER_EMAIL
// from the token file so the GIT_ASKPASS helper has something to emit.
// The env-file form remains authoritative when present (it may carry
// richer rows like GITEA_TOKEN_SCOPES / GITEA_SSH_KEY_PATH).
func loadPersonaEnvFile(role string, out map[string]string) {
if !isSafeRoleName(role) {
if role != "" {
@@ -237,61 +229,7 @@ func loadPersonaEnvFile(role string, out map[string]string) {
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
before := len(out)
parseEnvFile(filepath.Join(root, role, "env"), out)
if len(out) == before {
// No env-file rows landed (file absent, or present-but-empty).
// Try the token-only persona shape used by the prod-team
// identities. Existing keys in out are preserved.
loadPersonaTokenFile(role, out)
}
}
// loadPersonaTokenFile populates GITEA_TOKEN / GITEA_USER / GITEA_USER_EMAIL
// from a persona dir that ships only the bare `token` file — the shape used
// by the production agent personas (agent-dev-a, agent-dev-b, agent-pm).
// Those dirs do not carry an `env` file because their non-Gitea creds come
// from Infisical Universal Auth at runtime (universal-auth.env), so the
// historical loadPersonaEnvFile path silently no-ops on them.
//
// File layout: $MOLECULE_PERSONA_ROOT/<role>/token (mode 600, plain text).
// The token contents become GITEA_TOKEN (whitespace-trimmed); the role
// name becomes GITEA_USER; GITEA_USER_EMAIL is synthesised as
// <role>@<gitIdentityEmailDomain> to match the email shape that
// applyAgentGitIdentity uses for its slug-derived authorship addresses.
//
// Silent no-op when the role fails the safe-segment check, when the
// token file does not exist, or when its contents are empty after
// trimming. Existing keys in out are not overwritten — the caller's
// later .env layers and any prior loadPersonaEnvFile rows always win.
func loadPersonaTokenFile(role string, out map[string]string) {
if out == nil {
return
}
if !isSafeRoleName(role) {
return
}
root := os.Getenv("MOLECULE_PERSONA_ROOT")
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
data, err := os.ReadFile(filepath.Join(root, role, "token"))
if err != nil {
return
}
token := strings.TrimSpace(string(data))
if token == "" {
return
}
if _, ok := out["GITEA_TOKEN"]; !ok {
out["GITEA_TOKEN"] = token
}
if _, ok := out["GITEA_USER"]; !ok {
out["GITEA_USER"] = role
}
if _, ok := out["GITEA_USER_EMAIL"]; !ok {
out["GITEA_USER_EMAIL"] = role + "@" + gitIdentityEmailDomain
}
}
// isSafeRoleName accepts a single path segment of [A-Za-z0-9_-]+. Rejects
@@ -164,181 +164,3 @@ func TestIsSafeRoleName_Acceptance(t *testing.T) {
}
}
}
// TestLoadPersonaTokenFile_TokenOnlyPersona: the prod-team personas
// (agent-dev-a / agent-dev-b / agent-pm) ship `token` only — no `env`
// file. loadPersonaEnvFile's fallback path must populate GITEA_TOKEN /
// GITEA_USER / GITEA_USER_EMAIL from the token contents + role name so
// the GIT_ASKPASS helper has something to emit.
func TestLoadPersonaTokenFile_TokenOnlyPersona(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-bytes-redacted\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-a", out)
want := map[string]string{
"GITEA_TOKEN": "token-bytes-redacted",
"GITEA_USER": "agent-dev-a",
"GITEA_USER_EMAIL": "agent-dev-a@" + gitIdentityEmailDomain,
}
if len(out) != len(want) {
t.Fatalf("got %d keys, want %d: %#v", len(out), len(want), out)
}
for k, v := range want {
if out[k] != v {
t.Errorf("out[%q] = %q; want %q", k, out[k], v)
}
}
}
// TestLoadPersonaTokenFile_EnvFileWins: when BOTH an env file and a
// token file exist in the same persona dir, the env file is the more-
// specific declaration and wins outright — the fallback must not fire
// at all. This pins precedence so a persona later migrated to the
// richer env-file form (carrying GITEA_TOKEN_SCOPES / GITEA_SSH_KEY_PATH)
// doesn't get its token silently overridden by the fallback.
func TestLoadPersonaTokenFile_EnvFileWins(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-b")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
envBody := "GITEA_USER=env-form-user\nGITEA_TOKEN=env-form-token\n" +
"GITEA_USER_EMAIL=env-form@example.invalid\nGITEA_TOKEN_SCOPES=write:repository\n"
if err := os.WriteFile(filepath.Join(roleDir, "env"), []byte(envBody), 0o600); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("token-form-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-b", out)
if out["GITEA_USER"] != "env-form-user" {
t.Errorf("env file should win for GITEA_USER; got %q", out["GITEA_USER"])
}
if out["GITEA_TOKEN"] != "env-form-token" {
t.Errorf("env file should win for GITEA_TOKEN; got %q", out["GITEA_TOKEN"])
}
if out["GITEA_USER_EMAIL"] != "env-form@example.invalid" {
t.Errorf("env file should win for GITEA_USER_EMAIL; got %q", out["GITEA_USER_EMAIL"])
}
if out["GITEA_TOKEN_SCOPES"] != "write:repository" {
t.Errorf("env file extras must be preserved; got GITEA_TOKEN_SCOPES=%q", out["GITEA_TOKEN_SCOPES"])
}
}
// TestLoadPersonaTokenFile_NeitherFile: persona dir exists but ships
// neither env nor token — silent no-op. This is the legitimate case
// for a partially-provisioned persona during bootstrap; callers expect
// an empty map, no error, no log noise.
func TestLoadPersonaTokenFile_NeitherFile(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-pm")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-pm", out)
if len(out) != 0 {
t.Errorf("expected empty out when neither env nor token exists; got %#v", out)
}
}
// TestLoadPersonaTokenFile_EmptyToken: a token file with only
// whitespace must be treated as absent — never emit
// GITEA_TOKEN="" / GITEA_USER=<role> / GITEA_USER_EMAIL=<role>@... because
// that would set GITEA_USER without a usable token, and the askpass
// helper would then prompt with an empty password. Silent no-op is the
// correct behavior — let downstream auth fall through to its existing
// "no credentials available" path.
func TestLoadPersonaTokenFile_EmptyToken(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-a")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
// Whitespace-only contents: spaces, tabs, newlines.
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte(" \t\n \n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-a", out)
if len(out) != 0 {
t.Errorf("expected empty out when token file is whitespace-only; got %#v", out)
}
}
// TestLoadPersonaTokenFile_TrimsWhitespace: tokens shipped from the
// operator-host bootstrap kit may have a trailing newline (the
// canonical `printf "%s\n" "$token" > token` shape). The fallback must
// trim leading + trailing whitespace so the askpass helper emits the
// raw token bytes — Gitea's PAT validator rejects tokens with embedded
// whitespace.
func TestLoadPersonaTokenFile_TrimsWhitespace(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "agent-dev-b")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "token"),
[]byte("\n raw-token-bytes \n\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("agent-dev-b", out)
if out["GITEA_TOKEN"] != "raw-token-bytes" {
t.Errorf("token whitespace not trimmed; got %q", out["GITEA_TOKEN"])
}
}
// TestLoadPersonaTokenFile_RejectsUnsafeRole: defense-in-depth — even
// in the fallback path, role names that fail isSafeRoleName must not
// touch the filesystem. Mirrors TestLoadPersonaEnvFile_RejectsTraversal.
func TestLoadPersonaTokenFile_RejectsUnsafeRole(t *testing.T) {
root := t.TempDir()
// Plant a token at /tmp/.../token so a bad traversal would reach it.
if err := os.WriteFile(filepath.Join(root, "token"),
[]byte("stolen-token\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", filepath.Join(root, "personas"))
for _, bad := range []string{"..", "../personas", "/abs", "with/slash", "."} {
out := map[string]string{}
loadPersonaTokenFile(bad, out)
if len(out) != 0 {
t.Errorf("role %q should have been rejected; got %#v", bad, out)
}
}
}
// TestLoadPersonaTokenFile_NilMapSafe: callers pass a fresh map in
// practice, but defense-in-depth — a nil map must not panic.
func TestLoadPersonaTokenFile_NilMapSafe(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("nil map caused panic: %v", r)
}
}()
loadPersonaTokenFile("agent-dev-a", nil)
}
+3 -31
@@ -327,33 +327,7 @@ func (h *RegistryHandler) Register(c *gin.Context) {
}
}
// Reconcile the runtime-supplied card's identity fields against the
// trusted workspaces row before storing. The runtime builds its card
// from config.name, which the CP-regenerated /configs/config.yaml
// sets to the workspace UUID — so without this the stored card
// served at /.well-known/agent-card.json and returned to peers via
// agent_card_url has name = UUID, description = "", role = null even
// though the operator-controlled workspaces.name holds the friendly
// name the canvas shows. We only FILL gaps from the DB (never
// downgrade a card that already carries a real name); identity stays
// platform-controlled — the agent cannot self-set these. Best-effort:
// a lookup failure leaves the card exactly as the runtime sent it
// (no-worse-than-before). See agent_card_reconcile.go.
reconciledCard := payload.AgentCard
{
var dbName, dbRole sql.NullString
if qErr := db.DB.QueryRowContext(ctx,
`SELECT name, role FROM workspaces WHERE id = $1`, payload.ID,
).Scan(&dbName, &dbRole); qErr == nil {
if rc, did := reconcileAgentCardIdentity(
payload.AgentCard, payload.ID, dbName.String, dbRole.String,
); did {
reconciledCard = rc
log.Printf("Registry register: reconciled agent_card identity for %s from workspaces row", payload.ID)
}
}
}
agentCardStr := string(reconciledCard)
agentCardStr := string(payload.AgentCard)
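The gap-filling behaviour described in the removed comment can be sketched as follows. This is an assumption-laden illustration, not the code from agent_card_reconcile.go (which this diff does not show): it assumes the card is a JSON object, that a name equal to the workspace ID counts as a gap, and that only `name` and `role` are filled. `reconcileSketch` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reconcileSketch is a hypothetical stand-in for reconcileAgentCardIdentity.
// It fills name/role gaps in the card from trusted DB values and never
// overwrites a value the card already carries. On any error the card is
// returned unchanged, matching the "no worse than before" intent.
func reconcileSketch(card json.RawMessage, workspaceID, dbName, dbRole string) (json.RawMessage, bool) {
	var fields map[string]interface{}
	if err := json.Unmarshal(card, &fields); err != nil || fields == nil {
		return card, false
	}
	changed := false
	// Assumed gap rule: an empty name or a name equal to the workspace ID.
	name, _ := fields["name"].(string)
	if dbName != "" && (name == "" || name == workspaceID) {
		fields["name"] = dbName
		changed = true
	}
	role, _ := fields["role"].(string)
	if dbRole != "" && role == "" {
		fields["role"] = dbRole
		changed = true
	}
	if !changed {
		return card, false
	}
	out, err := json.Marshal(fields)
	if err != nil {
		return card, false
	}
	return out, true
}

func main() {
	card := json.RawMessage(`{"name":"ws-1"}`)
	out, changed := reconcileSketch(card, "ws-1", "Friendly", "pm")
	fmt.Println(string(out), changed) // {"name":"Friendly","role":"pm"} true
}
```

Returning the original bytes together with a `changed` flag lets a caller log only when the stored card actually differs from what the runtime sent.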
// urlForUpsert: poll-mode workspaces don't need a URL. Empty input
// becomes NULL via sql.NullString so the row's URL stays clean (the
@@ -439,12 +413,10 @@ func (h *RegistryHandler) Register(c *gin.Context) {
}
}
// Broadcast WORKSPACE_ONLINE — use the reconciled card so the canvas
// Agent Card view live-updates with the friendly name, matching what
// was just persisted (not the runtime's raw UUID-name card).
// Broadcast WORKSPACE_ONLINE
if err := h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.ID, map[string]interface{}{
"url": cachedURL,
"agent_card": reconciledCard,
"agent_card": payload.AgentCard,
"delivery_mode": effectiveMode,
}); err != nil {
log.Printf("Registry broadcast error: %v", err)
@@ -56,10 +56,8 @@ const (
// (an externally routable address) is used directly.
func (h *WorkspaceHandler) gracefulPreRestart(ctx context.Context, workspaceID string) {
// Non-blocking send — don't stall the restart cycle.
// Run in a tracked async goroutine (goAsync, not bare `go`) so the
// caller (runRestartCycle) can proceed to stopForRestart without
// waiting, while the test harness can still drain it before swapping
// the global db.DB (resolveAgentURLForRestartSignal reads db.DB).
// Run in a detached goroutine so the caller (runRestartCycle) can
// proceed to stopForRestart without waiting.
h.goAsync(func() {
signalCtx, cancel := context.WithTimeout(context.Background(), restartSignalTimeout)
defer cancel()
@@ -1,117 +0,0 @@
package handlers
// template_files_agent_home_stub_test.go — pins the Phase-1 stub
// contract for the /agent-home root added by internal#425 RFC.
//
// Today (pre-Phase-2b), every Files API verb against `?root=/agent-home`
// must return HTTP 501 with the canonical pending-message body. The
// stub MUST NOT:
// 1. Hit the DB (the workspace might not even exist yet from the
// canvas's POV — the root selector is testable without one).
// 2. Touch the EIC tunnel / Docker / template-dir paths — those
// would 500/404/[] depending on the env and confuse the canvas.
// 3. Accept writes/deletes that the future docker-exec backend
// would reject — fail closed.
//
// When Phase 2b lands, this file gets replaced by a real
// docker-exec dispatch test; the stub-message constant in
// templates.go disappears.
import (
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
)
// TestAgentHomeAllowedRoot pins that /agent-home is in the allowedRoots
// set. Without this, a future refactor that drops the key would
// silently degrade the canvas root selector to a 400 instead of the
// stub 501.
func TestAgentHomeAllowedRoot(t *testing.T) {
if !allowedRoots["/agent-home"] {
t.Fatal("/agent-home must be in allowedRoots — RFC #425 contract")
}
}
// TestAgentHomeStub_AllVerbs_Return501 pins the canonical stub
// response across all four verbs. Each must:
//
// - status 501
// - body contains the canonical "/agent-home not implemented" prefix
// - NOT contain "workspace not found" (proves we short-circuit before
// the DB lookup)
//
// Driven as a table to keep symmetry — adding a fifth verb in the
// future means adding one row here.
func TestAgentHomeStub_AllVerbs_Return501(t *testing.T) {
cases := []struct {
name string
method string
invoke func(c *gin.Context)
}{
{
name: "ListFiles",
method: "GET",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).ListFiles(c) },
},
{
name: "ReadFile",
method: "GET",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).ReadFile(c) },
},
{
name: "WriteFile",
method: "PUT",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).WriteFile(c) },
},
{
name: "DeleteFile",
method: "DELETE",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).DeleteFile(c) },
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{
{Key: "id", Value: "ws-stub"},
// Path param without leading slash so DeleteFile's
// filepath.IsAbs guard doesn't 400 before the root
// dispatch runs. The List/Read/Write paths strip the
// leading slash themselves and accept either form.
{Key: "path", Value: "notes.md"},
}
// WriteFile binds JSON; provide a minimal valid body so the
// short-circuit isn't masked by the bind-error path.
var body string
if tc.method == "PUT" {
body = `{"content":"x"}`
}
c.Request = httptest.NewRequest(
tc.method,
"/workspaces/ws-stub/files/notes.md?root=/agent-home",
strings.NewReader(body),
)
if body != "" {
c.Request.Header.Set("Content-Type", "application/json")
}
tc.invoke(c)
if w.Code != http.StatusNotImplemented {
t.Fatalf("expected 501, got %d: %s", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "/agent-home not implemented") {
t.Errorf("body should contain canonical stub message; got %s", w.Body.String())
}
if strings.Contains(w.Body.String(), "workspace not found") {
t.Errorf("stub leaked through to DB lookup; body=%s", w.Body.String())
}
})
}
}
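The ordering this test pins (allow-list check, then the stub short-circuit, then the workspace lookup) can be sketched with the standard library alone. This is a minimal illustration, not the handler code from templates.go: `filesHandler`, `statusFor` and the shortened `stubMessage` are hypothetical names, and the trailing 404 stands in for the real DB lookup.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// allowedRoots mirrors the allow-list shape shown in the diff.
var allowedRoots = map[string]bool{
	"/configs":    true,
	"/workspace":  true,
	"/home":       true,
	"/plugins":    true,
	"/agent-home": true,
}

// stubMessage is a shortened, hypothetical stand-in for agentHomeStubMessage.
const stubMessage = "/agent-home not implemented yet"

// filesHandler is a hypothetical stdlib stand-in for the gin handlers.
// Every verb takes the same path through the two guards.
func filesHandler(w http.ResponseWriter, r *http.Request) {
	root := r.URL.Query().Get("root")
	if root == "" {
		root = "/configs" // same default the handlers use
	}
	// Guard 1: unknown roots are a client error.
	if !allowedRoots[root] {
		http.Error(w, "root not allowed", http.StatusBadRequest)
		return
	}
	// Guard 2: the stubbed root answers 501 before any lookup runs.
	if root == "/agent-home" {
		http.Error(w, stubMessage, http.StatusNotImplemented)
		return
	}
	// Placeholder for the workspace lookup a real handler would do here.
	http.Error(w, "workspace not found", http.StatusNotFound)
}

// statusFor drives filesHandler once and returns the response status.
func statusFor(method, target string) int {
	w := httptest.NewRecorder()
	filesHandler(w, httptest.NewRequest(method, target, nil))
	return w.Code
}

func main() {
	for _, m := range []string{"GET", "PUT", "DELETE"} {
		fmt.Println(m, statusFor(m, "/files/notes.md?root=/agent-home"))
	}
	fmt.Println("GET", statusFor("GET", "/files/notes.md?root=/etc"))
}
```

Keeping the stub check after the allow-list check means dropping the key from `allowedRoots` degrades the response to a 400, which is the regression `TestAgentHomeAllowedRoot` guards against.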
@@ -19,7 +19,6 @@ package handlers
import (
"bytes"
"context"
"errors"
"fmt"
"log"
"os"
@@ -358,28 +357,6 @@ func writeFileViaEIC(ctx context.Context, instanceID, runtime, root, relPath str
var stderr bytes.Buffer
sshCmd.Stderr = &stderr
if err := sshCmd.Run(); err != nil {
// When the per-op context deadline (eicFileOpTimeout) fires,
// exec.CommandContext SIGKILLs the ssh subprocess and Run()
// returns the bare "signal: killed" with empty stderr. That
// surfaced to the canvas as an opaque
// `500 {"error":"ssh install: signal: killed ()"}` which gave
// the operator no idea the workspace was simply mid-provision
// with a slow/unready EIC tunnel (internal#423). Detect the
// deadline explicitly and return an actionable message instead
// — the EIC mechanism, timeout value, and success path are all
// unchanged; this only improves the error a stuck write emits.
if cerr := ctx.Err(); cerr != nil {
reason := "timed out after " + eicFileOpTimeout.String()
if errors.Is(cerr, context.Canceled) && !errors.Is(cerr, context.DeadlineExceeded) {
reason = "was cancelled"
}
return fmt.Errorf(
"ssh install: EIC tunnel to workspace %s — "+
"the workspace may still be provisioning (slow/unready SSH); "+
"retry once it is online, or apply provider credentials via "+
"Settings → Secrets (encrypted, does not use this file-write path)",
reason)
}
return fmt.Errorf("ssh install: %w (%s)", err, strings.TrimSpace(stderr.String()))
}
log.Printf("writeFileViaEIC: ws instance=%s runtime=%s root=%s wrote %d bytes → %s",
@@ -1,71 +0,0 @@
package handlers
// template_files_eic_write_timeout_test.go — pins the actionable-error
// behavior added for internal#423.
//
// When the per-op context deadline (eicFileOpTimeout) fires,
// exec.CommandContext SIGKILLs the ssh subprocess and Run() returns the
// bare "signal: killed" with empty stderr. Before the fix that surfaced
// to the canvas as an opaque `500 {"error":"ssh install: signal:
// killed ()"}` — useless to an operator whose workspace was simply
// mid-provision with a slow/unready EIC tunnel. The fix detects the
// deadline explicitly (errors.Is(ctx.Err(), context.DeadlineExceeded))
// and returns a message that names the cause and the
// Settings → Secrets workaround.
import (
"context"
"strings"
"testing"
"time"
)
// TestWriteFileViaEIC_DeadlineExceeded_ActionableError stubs
// withEICTunnel so the *real* inner closure runs against a context that
// has already exceeded its deadline. The ssh subprocess fails (no real
// sshd on the fake port) and ctx.Err() == DeadlineExceeded, so the new
// branch must fire and produce an actionable message — NOT the opaque
// "signal: killed ()" string the canvas used to show.
func TestWriteFileViaEIC_DeadlineExceeded_ActionableError(t *testing.T) {
prev := withEICTunnel
withEICTunnel = func(_ context.Context, instanceID string, fn func(s eicSSHSession) error) error {
// Run the real inner closure. It closes over the ctx that
// writeFileViaEIC derived from our already-expired parent, so
// the ssh subprocess is killed immediately and ctx.Err()
// resolves — exactly the eicFileOpTimeout-expiry shape.
return fn(eicSSHSession{
instanceID: instanceID,
osUser: "ubuntu",
localPort: 1, // nothing listening → ssh fails fast
keyPath: "/nonexistent/key",
})
}
t.Cleanup(func() { withEICTunnel = prev })
// Drive the real writeFileViaEIC. Pass a parent whose deadline has
// already passed: the context.WithTimeout(ctx, eicFileOpTimeout)
// derived inside writeFileViaEIC inherits the expired parent
// deadline, so ctx.Err() == context.DeadlineExceeded by the time
// the killed ssh subprocess returns — the exact production shape
// (eicFileOpTimeout expiry), exercised deterministically.
parent, cancel := context.WithDeadline(context.Background(), time.Now().Add(-time.Second))
defer cancel()
err := writeFileViaEIC(parent, "i-test", "claude-code", "/configs", "config.yaml", []byte("model: sonnet\n"))
if err == nil {
t.Fatalf("expected an error from a killed ssh subprocess, got nil")
}
msg := err.Error()
// Must NOT leak the opaque bare-signal string to the operator.
if strings.Contains(msg, "signal: killed ()") {
t.Fatalf("error still surfaces the opaque %q form: %q", "signal: killed ()", msg)
}
// Must name the cause and the Secrets workaround so the canvas
// shows something actionable.
for _, want := range []string{"timed out", "provisioning", "Settings", "Secrets"} {
if !strings.Contains(msg, want) {
t.Errorf("actionable error missing %q; got: %q", want, msg)
}
}
}
@@ -18,35 +18,11 @@ import (
)
// allowedRoots are the container paths that the Files API can browse.
//
// `/agent-home` (added 2026-05-15, internal#425 RFC) is the container's
// own $HOME — `/root` for openclaw, `/home/agent` for claude-code/hermes
// — browsed via `docker exec` rather than host-side `find`. The
// dispatch is stubbed today (returns 501); full implementation lands in
// Phase 2b of the RFC. The allowedRoots key is added now so the canvas
// can design its root-selector UI against the final shape and the
// stub-vs-full transition is server-side only.
var allowedRoots = map[string]bool{
"/configs": true,
"/workspace": true,
"/home": true,
"/plugins": true,
"/agent-home": true,
}
// agentHomeStubMessage is the body returned by every Files API verb
// when `?root=/agent-home` is requested before Phase 2b lands. Keep the
// status code 501 (Not Implemented) — the route exists, the verb is
// understood, but the handler is unimplemented. Distinguishes from
// 400/404 so a canvas behind a less-current server can render a clean
// "feature pending" state instead of a generic error.
const agentHomeStubMessage = "/agent-home not implemented yet (internal#425 RFC Phase 2b — docker-exec backend pending)"
// isAgentHomeStubRequest returns true when the request targets the
// stubbed /agent-home root. Centralised so every verb in this file
// short-circuits with the same response shape.
func isAgentHomeStubRequest(rootPath string) bool {
return rootPath == "/agent-home"
"/configs": true,
"/workspace": true,
"/home": true,
"/plugins": true,
}
// maxUploadFiles limits the number of files in a single import/replace.
@@ -248,14 +224,7 @@ func (h *TemplatesHandler) ListFiles(c *gin.Context) {
// ?depth= — max depth to recurse (default: 1, max: 5)
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
// /agent-home dispatch is stubbed pre-Phase-2b. Short-circuit before
// the DB lookup + EIC dance so a canvas exercising the new root key
// gets a clean 501 instead of a half-effort response.
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
return
}
subPath := c.DefaultQuery("path", "")
@@ -424,11 +393,7 @@ func (h *TemplatesHandler) ReadFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
return
}
@@ -541,11 +506,7 @@ func (h *TemplatesHandler) WriteFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
return
}
var wsName, instanceID, runtime string
@@ -622,11 +583,7 @@ func (h *TemplatesHandler) DeleteFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
return
}
var wsName, instanceID, runtime string
@@ -10,20 +10,8 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
// validWorkspaceID returns true when id is a syntactically valid UUID.
// workspace_id is a `uuid` column; passing a non-UUID (e.g. the canvas
// "global" sentinel sent when no node is selected) makes Postgres raise
// `invalid input syntax for type uuid`, which previously leaked as an
// opaque 500. Reject up front with a clean 400 instead. Mirrors the
// uuid.Parse guard already used in handlers/activity.go.
func validWorkspaceID(id string) bool {
_, err := uuid.Parse(id)
return err == nil
}
// TokenHandler exposes user-facing token management for workspaces.
// Routes: GET/POST/DELETE /workspaces/:id/tokens (behind WorkspaceAuth).
type TokenHandler struct{}
@@ -43,10 +31,6 @@ type tokenListItem struct {
// never the plaintext or hash).
func (h *TokenHandler) List(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
limit := 50
if v := c.Query("limit"); v != "" {
@@ -69,7 +53,6 @@ func (h *TokenHandler) List(c *gin.Context) {
LIMIT $2 OFFSET $3
`, workspaceID, limit, offset)
if err != nil {
log.Printf("tokens: list query failed for workspace %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to list tokens"})
return
}
@@ -102,10 +85,6 @@ const maxTokensPerWorkspace = 50
// exactly once in the response — it cannot be recovered afterwards.
func (h *TokenHandler) Create(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
// Rate limit: max active tokens per workspace
var count int
@@ -138,10 +117,6 @@ func (h *TokenHandler) Create(c *gin.Context) {
func (h *TokenHandler) Revoke(c *gin.Context) {
workspaceID := c.Param("id")
tokenID := c.Param("tokenId")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
result, err := db.DB.ExecContext(c.Request.Context(), `
UPDATE workspace_auth_tokens
@@ -41,15 +41,6 @@ import (
func init() { gin.SetMode(gin.TestMode) }
// Workspace IDs are validated as UUIDs up front (tokens.go validWorkspaceID),
// so handler tests must pass syntactically valid UUIDs. Fixed values keep
// sqlmock WithArgs assertions deterministic.
const (
wsUUID1 = "11111111-1111-1111-1111-111111111111"
wsUUID2 = "22222222-2222-2222-2222-222222222222"
wsUUID3 = "33333333-3333-3333-3333-333333333333"
)
// withMockDB swaps `db.DB` for a sqlmock and returns the mock plus a
// restore func. Tests use this in place of setupTokenTestDB which
// skips on a missing real DB.
@@ -90,13 +81,13 @@ func TestTokenHandler_List_HappyPath(t *testing.T) {
created := time.Date(2026, 4, 1, 12, 0, 0, 0, time.UTC)
last := created.Add(time.Hour)
mock.ExpectQuery(`SELECT id, prefix, created_at, last_used_at\s+FROM workspace_auth_tokens`).
WithArgs(wsUUID1, 50, 0).
WithArgs("ws-1", 50, 0).
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}).
AddRow("tok-1", "abc12345", created, last).
AddRow("tok-2", "def67890", created, nil))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
@@ -130,7 +121,7 @@ func TestTokenHandler_List_EmptyResult(t *testing.T) {
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-2/tokens", gin.Params{{Key: "id", Value: wsUUID2}})
"/workspaces/ws-2/tokens", gin.Params{{Key: "id", Value: "ws-2"}})
if w.Code != http.StatusOK {
t.Fatalf("expected 200 on empty list, got %d", w.Code)
@@ -155,7 +146,7 @@ func TestTokenHandler_List_QueryError(t *testing.T) {
WillReturnError(errors.New("connection refused"))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-3/tokens", gin.Params{{Key: "id", Value: wsUUID3}})
"/workspaces/ws-3/tokens", gin.Params{{Key: "id", Value: "ws-3"}})
if w.Code != http.StatusInternalServerError {
t.Errorf("query error must surface as 500, got %d", w.Code)
@@ -167,13 +158,13 @@ func TestTokenHandler_List_RespectsLimit(t *testing.T) {
defer cleanup()
mock.ExpectQuery(`SELECT id, prefix, created_at, last_used_at`).
WithArgs(wsUUID1, 10, 5).
WithArgs("ws-1", 10, 5).
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/tokens?limit=10&offset=5", nil)
c.Params = gin.Params{{Key: "id", Value: wsUUID1}}
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
NewTokenHandler().List(c)
if w.Code != http.StatusOK {
@@ -195,7 +186,7 @@ func TestTokenHandler_List_ScanError(t *testing.T) {
AddRow("tok-1", "abc", "not-a-timestamp", nil))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
if w.Code != http.StatusInternalServerError {
t.Errorf("scan error must surface as 500, got %d: %s", w.Code, w.Body.String())
@@ -210,11 +201,11 @@ func TestTokenHandler_Create_RateLimited(t *testing.T) {
// Count query returns 50 (== max) → 429.
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(wsUUID1).
WithArgs("ws-1").
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(50))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
if w.Code != http.StatusTooManyRequests {
t.Errorf("max active tokens should 429, got %d", w.Code)
@@ -234,7 +225,7 @@ func TestTokenHandler_Create_IssueFails(t *testing.T) {
WillReturnError(errors.New("disk full"))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
if w.Code != http.StatusInternalServerError {
t.Errorf("IssueToken DB error must 500, got %d", w.Code)
@@ -251,7 +242,7 @@ func TestTokenHandler_Create_HappyPath(t *testing.T) {
WillReturnResult(sqlmock.NewResult(1, 1))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
if w.Code != http.StatusCreated {
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
@@ -266,7 +257,7 @@ func TestTokenHandler_Create_HappyPath(t *testing.T) {
if body.AuthToken == "" {
t.Errorf("auth_token must be present and non-empty in response")
}
if body.WorkspaceID != wsUUID1 {
if body.WorkspaceID != "ws-1" {
t.Errorf("workspace_id mismatch: %q", body.WorkspaceID)
}
}
@@ -278,12 +269,12 @@ func TestTokenHandler_Revoke_HappyPath(t *testing.T) {
defer cleanup()
mock.ExpectExec(`UPDATE workspace_auth_tokens\s+SET revoked_at = now\(\)`).
WithArgs("tok-1", wsUUID1).
WithArgs("tok-1", "ws-1").
WillReturnResult(sqlmock.NewResult(0, 1))
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-1", gin.Params{
{Key: "id", Value: wsUUID1},
{Key: "id", Value: "ws-1"},
{Key: "tokenId", Value: "tok-1"},
})
@@ -298,12 +289,12 @@ func TestTokenHandler_Revoke_NotFound(t *testing.T) {
// 0 rows affected → token not found OR already revoked.
mock.ExpectExec(`UPDATE workspace_auth_tokens`).
WithArgs("tok-ghost", wsUUID1).
WithArgs("tok-ghost", "ws-1").
WillReturnResult(sqlmock.NewResult(0, 0))
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-ghost", gin.Params{
{Key: "id", Value: wsUUID1},
{Key: "id", Value: "ws-1"},
{Key: "tokenId", Value: "tok-ghost"},
})
@@ -321,7 +312,7 @@ func TestTokenHandler_Revoke_DBError(t *testing.T) {
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-1", gin.Params{
{Key: "id", Value: wsUUID1},
{Key: "id", Value: "ws-1"},
{Key: "tokenId", Value: "tok-1"},
})
@@ -330,59 +321,6 @@ func TestTokenHandler_Revoke_DBError(t *testing.T) {
}
}
// ---- UUID validation (regression: "global" sentinel 500) ------------
// The canvas Settings → Workspace Tokens tab sent the literal sentinel
// "global" as the workspace id when no node was selected. workspace_id
// is a `uuid` column, so the query raised
// `invalid input syntax for type uuid: "global"` which leaked as an
// opaque 500. List/Create/Revoke now reject any non-UUID id with a
// clean 400 before touching the DB. No DB expectation is set on the
// mock — a DB hit would fail ExpectationsWereMet, proving short-circuit.
func TestTokenHandler_RejectsNonUUIDWorkspaceID(t *testing.T) {
h := NewTokenHandler()
cases := []struct {
name string
run func(c *gin.Context)
method string
params gin.Params
}{
{"List", h.List, "GET", gin.Params{{Key: "id", Value: "global"}}},
{"Create", h.Create, "POST", gin.Params{{Key: "id", Value: "global"}}},
{"Revoke", h.Revoke, "DELETE", gin.Params{
{Key: "id", Value: "global"},
{Key: "tokenId", Value: "tok-1"},
}},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
w := makeReq(t, tc.run, tc.method,
"/workspaces/global/tokens", tc.params)
if w.Code != http.StatusBadRequest {
t.Fatalf("%s with non-UUID id must 400, got %d: %s",
tc.name, w.Code, w.Body.String())
}
var body struct {
Error string `json:"error"`
}
_ = json.Unmarshal(w.Body.Bytes(), &body)
if body.Error != "invalid workspace id" {
t.Errorf("%s: want error=%q, got %q",
tc.name, "invalid workspace id", body.Error)
}
// No query/exec was expected → if the handler hit the DB
// this fails, proving the guard short-circuits before SQL.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("%s leaked a DB call past the uuid guard: %v", tc.name, err)
}
})
}
}
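The guard under test rejects a non-UUID workspace id before any SQL runs. A rough stdlib-only sketch of the same idea follows. It is a stand-in, not the `validWorkspaceID` from tokens.go: that function calls `uuid.Parse`, which also accepts some non-canonical encodings, whereas the hypothetical `looksLikeUUID` below accepts only the canonical 36-character form.

```go
package main

import (
	"fmt"
	"regexp"
)

// canonicalUUID matches only the 8-4-4-4-12 hex form. This is stricter than
// uuid.Parse and is an assumption made to keep the sketch dependency-free.
var canonicalUUID = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// looksLikeUUID is a hypothetical stand-in for validWorkspaceID.
func looksLikeUUID(id string) bool {
	return canonicalUUID.MatchString(id)
}

func main() {
	ids := []string{
		"global", // the canvas sentinel that used to reach Postgres
		"ws-1",   // the short ids used by the older tests
		"11111111-1111-1111-1111-111111111111",
	}
	for _, id := range ids {
		fmt.Printf("%q valid=%v\n", id, looksLikeUUID(id))
	}
}
```

A handler would call the check first and return a 400 on failure, so the `uuid` column never sees a malformed value.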
// Compile-time noise removal: the imports list pulls in the sql /
// driver packages and the silenced ctx so a future scenario that
// needs them doesn't have to re-add the import. Documented here so
@@ -11,7 +11,6 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
func init() { gin.SetMode(gin.TestMode) }
@@ -168,14 +167,11 @@ func TestTokenHandler_RevokeWrongWorkspace(t *testing.T) {
h := NewTokenHandler()
// Try to revoke with a different (valid-UUID) workspace ID that does
// not own the token — should 404. A valid UUID is required so this
// exercises the ownership branch, not the up-front uuid-shape 400.
otherWS := uuid.NewString()
// Try to revoke with a different workspace ID — should 404
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: otherWS}, {Key: "tokenId", Value: tokenID}}
c.Request = httptest.NewRequest("DELETE", "/workspaces/"+otherWS+"/tokens/"+tokenID, nil)
c.Params = gin.Params{{Key: "id", Value: "wrong-workspace-id"}, {Key: "tokenId", Value: tokenID}}
c.Request = httptest.NewRequest("DELETE", "/workspaces/wrong/tokens/"+tokenID, nil)
h.Revoke(c)
if w.Code != http.StatusNotFound {

Some files were not shown because too many files have changed in this diff