Compare commits

...

36 Commits

Author SHA1 Message Date
Molecule AI Dev Engineer A (Kimi) bf276bc25d fix(ci): add explicit utf-8 encoding to Python open() calls
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
CI / all-required (pull_request) Successful in 14s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
review-check-tests / review-check.sh regression tests (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 13s
security-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m14s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
audit-force-merge / audit (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-checklist / review-refire (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Python 3's open() default encoding is platform-dependent (PEP 597).
On CI runners it happens to be UTF-8, but being explicit avoids
surprises on Windows dev boxes or custom runner images.

Files touched:
- sop-checklist.py: config loading (YAML + minimal parser)
- tests/_review_check_fixture.py: test fixture scenario loader
- tests/_refire_fixture.py: test fixture scenario loader

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 15:35:36 +00:00
hongming 18fa084510 Merge pull request 'fix(canvas): link provider selection to llm_billing_mode (internal#703 Gap 2)' (#1935) from fix/703-provider-billing-mode-ui into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
publish-canvas-image / Build & push canvas image (push) Successful in 2m51s
publish-workspace-server-image / build-and-push (push) Successful in 3m6s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 4s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 7s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
CI / Platform (Go) (push) Successful in 20s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 19s
E2E Chat / E2E Chat (push) Successful in 3m52s
CI / Canvas (Next.js) (push) Successful in 6m25s
CI / all-required (push) Successful in 24m52s
Harness Replays / Harness Replays (push) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m51s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6m53s
publish-workspace-server-image / Production auto-deploy (push) Successful in 45m18s
main-red-watchdog / watchdog (push) Successful in 40s
gate-check-v3 / gate-check (push) Successful in 38s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 23s
ci-required-drift / drift (push) Successful in 1m16s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 9s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m33s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m59s
2026-05-27 15:33:17 +00:00
hongming 46012b965c Merge pull request 'fix(llm): byok honors workspace own provider env — emit resolved billing_mode (internal#703)' (#1934) from fix/internal-703-byok-billing-mode-env into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
publish-workspace-server-image / build-and-push (push) Successful in 8m4s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 36s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 4m53s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Has started running
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 11s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
publish-workspace-server-image / Production auto-deploy (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 15:24:34 +00:00
hongming 1828d15d4f Merge pull request 'fix(handlers): nil-safe scans + validation hardening (from #1933)' (#1950) from fix/nil-safe-scans-validation-hardening into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 4s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 28s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 59s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 47s
publish-workspace-server-image / build-and-push (push) Successful in 4m12s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 6m33s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 6m29s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m28s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 8m20s
CI / Canvas (Next.js) (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m46s
Harness Replays / Harness Replays (push) Successful in 5s
main-red-watchdog / watchdog (push) Successful in 40s
CI / Platform (Go) (push) Successful in 5m24s
E2E Chat / E2E Chat (push) Successful in 4m14s
CI / all-required (push) Successful in 14m33s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m58s
publish-workspace-server-image / Production auto-deploy (push) Successful in 12m30s
CI / Canvas Deploy Reminder (push) Successful in 2s
gate-check-v3 / gate-check (push) Successful in 33s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 8s
ci-required-drift / drift (push) Successful in 1m16s
2026-05-27 15:00:24 +00:00
core-be ea70447599 fix(handlers): nil-safe scans + validation hardening (from #1933)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 14s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
E2E Chat / detect-changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 54s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m47s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m18s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 5m39s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m55s
CI / Platform (Go) (pull_request) Successful in 5m21s
CI / all-required (pull_request) Successful in 15m12s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 5s
Resubmits the independent nil-safe / validation-hardening hunks
extracted from closed PR #1933 (closed for scope-creep). Each hunk is
self-contained and does not overlap the already-merged #1938/#1939/#1940;
the a2a_proxy*, channels, delegation, restart and scheduler hunks from

- a2a_queue_status.go: nil-safe Scan in queueRowAuthFields (NULL
  caller_id / workspace_id -> "" via NullString.Valid checks).
- github_token.go: guard non-201 status from the GitHub token endpoint
  before decoding the body.
- mcp_tools.go: check_task_status defaults status to "unknown" when the
  row's status is NULL; toolListPeers / toolGetWorkspaceInfo /
  toolCheckTaskStatus now return the marshal error instead of returning
  a malformed/empty string.
- mcp_tools_memory_legacy_shim.go / mcp_tools_memory_v2.go: return the
  marshal error from the memory tool responses.
- registry.go: nil name/role guard before reconcileAgentCardIdentity.
- schedules.go: compute next run in the validated location
  (time.Now().In(loc)) for Create and Update.
- workspace_provision.go: case/whitespace-insensitive runtime match via
  strings.EqualFold.

Tests added: queueRowAuthFields nil-safe + populated paths,
check_task_status NULL-status -> "unknown", and the EqualFold
case/whitespace matrix. Full internal/handlers package passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 07:38:26 -07:00
hongming 658e033638 Merge pull request 'fix(handlers): return after marshal failure in toolDelegateTaskAsync' (#1949) from fix/delegate-async-return-after-marshal-fail into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
publish-workspace-server-image / build-and-push (push) Successful in 3m2s
Block internal-flavored paths / Block forbidden paths (push) Successful in 25s
CI / Detect changes (push) Successful in 13s
CI / Python Lint & Test (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m0s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 4s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m37s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m38s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 7m51s
Harness Replays / Harness Replays (push) Successful in 3s
CI / Platform (Go) (push) Successful in 5m14s
CI / all-required (push) Successful in 10m17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m54s
E2E Chat / E2E Chat (push) Successful in 3m29s
CI / Canvas Deploy Reminder (push) Successful in 2s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
publish-workspace-server-image / Production auto-deploy (push) Successful in 15m12s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 15s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m21s
2026-05-27 14:30:11 +00:00
hongming f70384d375 Merge pull request 'fix(a2a): canvas-user identity bypass without cross-workspace escalation (#1673)' (#1948) from fix/canvas-user-verified-session-1673 into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / all-required (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 11s
publish-workspace-server-image / build-and-push (push) Successful in 3m15s
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Python Lint & Test (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 11s
CI / Detect changes (push) Successful in 19s
E2E Chat / detect-changes (push) Successful in 18s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 16s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 51s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 38s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m45s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 7s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m27s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / Production auto-deploy (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 14:19:48 +00:00
core-be 1735f28ca9 fix(handlers): return after marshal failure in toolDelegateTaskAsync
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 48s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m36s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m32s
CI / Platform (Go) (pull_request) Successful in 5m22s
CI / all-required (pull_request) Successful in 8m38s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 11s
The detached goroutine in toolDelegateTaskAsync logged a json.Marshal
failure for the A2A body but then fell through and called
proxyA2ARequest with a nil/empty body, dispatching a malformed A2A
request. Add the missing return so the goroutine bails out on marshal
failure.

Extracted as the real titled fix from closed PR #1933 (the rest of
that PR was scope-creep and is being resubmitted separately).

A package-level marshalA2ABody seam is added so the otherwise
near-impossible marshal-failure path can be exercised by a focused
unit test (TestMCPHandler_DelegateTaskAsync_MarshalFailureDoesNotCallProxy),
which fails without the return and passes with it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 07:00:07 -07:00
core-be 121eb64f24 fix(a2a): canvas-user identity bypass without cross-workspace escalation (#1673)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 23s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 37s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 51s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m52s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m43s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m14s
CI / Platform (Go) (pull_request) Successful in 6m13s
CI / all-required (pull_request) Successful in 18m26s
audit-force-merge / audit (pull_request) Successful in 7s
#1673: validateCallerToken checked HasAnyLiveToken BEFORE the canvas
classification. Once an RFC#637 canvas-user identity workspace acquired
live tokens, canvas requests fell into the hasLive=true branch, which
demands a bearer the canvas frontend never sends → silent 401 → the
message was dropped before logA2AReceiveQueued wrote the activity_logs
row, breaking canvas chat (and chat-history) for poll-mode workspaces.

Safe mechanism (supersedes #1944): classify canvas users by the HUMAN's
NON-FORGEABLE credential, evaluated BEFORE the peer-token contract:
  - middleware.IsVerifiedCanvasSession — the WorkOS session cookie
    confirmed upstream as a member of THIS tenant's org
    (/cp/auth/tenant-member). The production SaaS canvas path.
  - ADMIN_TOKEN bearer / live org_api_tokens row.
A bare same-origin Host/Referer (middleware.IsSameOriginCanvas, documented
in-repo as forgeable / cosmetic-only) is honored ONLY as a self-hosted/dev
fallback when CP session verification is NOT configured — never in a SaaS
combined-tenant image, where a forged Referer + arbitrary X-Workspace-ID
would otherwise bypass registry.CanCommunicate and reach cross-workspace
A2A. That is the privilege escalation #1944 introduced.

Classification keys on the human's credential, not the caller's
X-Workspace-ID, so it never trusts an attacker-supplied caller ID and is
independent of whether the identity workspace holds peer tokens. Genuine
token-holding peer workspaces are unaffected: with no cookie/admin/org
credential they fall through to the existing bearer/ValidateToken gate.

Tests:
  - TestProxyA2A_PollMode_CanvasUserWithVerifiedSession — the #1673
    regression: poll-mode canvas-user identity WITH live tokens + a
    CP-verified session → 200 queued + activity_logs row written, with NO
    SELECT COUNT(*) (proving the canvas check precedes HasAnyLiveToken).
    Subprocess test with CANVAS_PROXY_URL set at init.
  - TestProxyA2A_ForgedSameOrigin_CannotBypassCanCommunicate — the
    security crux: combined-tenant image, forged same-origin Host/Referer
    + arbitrary X-Workspace-ID, no verified session → must fall through to
    CanCommunicate, which DENIES (403). Proves the escalation is closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 06:59:05 -07:00
hongming 38671a35d1 Merge pull request 'fix(handlers): clean up time.After timer in delegation retry on ctx cancel' (#1940) from fix/time-after-single-retry-delegation into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 10s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 10s
CI / Canvas (Next.js) (push) Successful in 3s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 18s
publish-workspace-server-image / build-and-push (push) Successful in 5m37s
CI / Canvas Deploy Reminder (push) Successful in 4s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m23s
E2E Chat / E2E Chat (push) Successful in 4m48s
CI / Platform (Go) (push) Successful in 6m5s
CI / all-required (push) Successful in 8m21s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m2s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 6s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m5s
main-red-watchdog / watchdog (push) Successful in 45s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m33s
gate-check-v3 / gate-check (push) Successful in 52s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 15s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 18s
ci-required-drift / drift (push) Successful in 1m19s
2026-05-27 13:24:44 +00:00
hongming e5a39df664 Merge pull request 'fix(handlers): prevent invalid JSONB inserts on json.Marshal failure (2nd pass)' (#1938) from fix/json-marshal-log-continue-2nd-pass into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / build-and-push (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 13:24:27 +00:00
hongming 2fb8f2fd40 Merge pull request 'fix(workspace-server): prevent time.After goroutine leaks in long-running loops' (#1939) from fix/time-after-goroutine-leaks into main
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m15s
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / build-and-push (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 13:24:17 +00:00
hongming 8291a95060 Merge pull request 'watchdog: close stale [main-red] issues when contexts recover on red (mc#1789)' (#1943) from fix/watchdog-close-stale-contexts-on-red into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m21s
Block internal-flavored paths / Block forbidden paths (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Python Lint & Test (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
E2E Staging Canvas (Playwright) / detect-changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Secret scan / Scan diff for credential-shaped strings (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / build-and-push (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 13:22:51 +00:00
hongming 58b098c676 Merge pull request 'fix(ci): remove -race from blocking Platform (Go) gate, add advisory race step (#1184)' (#1945) from fix/ci-remove-race-from-blocking-gate-1184 into main
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (push) Successful in 10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m29s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m28s
Block internal-flavored paths / Block forbidden paths (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Python Lint & Test (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
E2E Chat / detect-changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / all-required (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
CI / Detect changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
publish-workspace-server-image / build-and-push (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
E2E API Smoke Test / detect-changes (push) Compensated by status-reaper (push run was cancelled/superseded; Gitea 1.22.6 reports cancelled runs as failure statuses)
2026-05-27 13:22:44 +00:00
Molecule AI Dev Engineer A (Kimi) 0a1426e311 fix(ci): remove -race from blocking Platform (Go) gate, add advisory race step (#1184)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
CI / all-required (pull_request) Successful in 25s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 13s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m15s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m18s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m21s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m26s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m40s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 8s
Cold runners compile race-instrumented code in 13-25 min, exceeding the
10m step timeout and causing false failures on unrelated PRs. The
blocking gate now runs without -race (reliable on cold runners), while
a new non-blocking advisory step still surfaces race conditions on every
PR without blocking merge.

Fixes #1184
2026-05-27 12:44:26 +00:00
Molecule AI Dev Engineer A (Kimi) 5f0a772f67 main-red-watchdog: add missing close_stale_red_issues mock in test
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / all-required (pull_request) Successful in 1m31s
E2E API Smoke Test / detect-changes (pull_request) Successful in 16s
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 5s
security-review / approved (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m13s
audit-force-merge / audit (pull_request) Successful in 10s
test_run_once_failure_does_not_close was not monkeypatching the new
close_stale_red_issues function, causing it to hit the real api()
helper and fail with URLError in CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 11:50:09 +00:00
Molecule AI Dev Engineer A (Kimi) c272eeae94 watchdog: close stale [main-red] issues when contexts recover on red (mc#1789)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
CI / all-required (pull_request) Successful in 1m30s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 4s
gate-check-v3 / gate-check (pull_request) Successful in 9s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
When main stays red across consecutive SHAs for *different* causes,
close_open_red_issues_for_other_shas never fires (it only runs when
main is green). This leaves stale issues open indefinitely — e.g.
#1936 (E2E Chat failure) stayed open even though current HEAD is red
for a different reason (E2E Legacy Advisory).

Add close_stale_red_issues():
  1. List all open [main-red] issues.
  2. For each issue on an OLD SHA, query that SHA's commit status.
  3. Compare the old failed contexts against current HEAD.
  4. If ALL failed contexts have recovered (success or absent), close
     the issue with a comment pointing to the current [main-red] issue.
  5. If the old SHA is itself now green, close it too.
  6. Skip issues with combined-red-no-detail (can't verify recovery).

Called from run_once() after file_or_update_red() on the red path.
Emits a main_red_stale_closed Loki event when issues are closed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 11:06:06 +00:00
Molecule AI Dev Engineer A (Kimi) 2335156ad3 fix(handlers): clean up time.After timer in delegation retry on ctx cancel
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 10s
qa-review / approved (pull_request) Failing after 8s
security-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m36s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m59s
CI / Platform (Go) (pull_request) Successful in 5m25s
CI / all-required (pull_request) Successful in 7m13s
audit-force-merge / audit (pull_request) Successful in 6s
Even though this is a bounded single-retry per request, using
time.NewTimer + timer.Stop() on ctx.Done() is consistent with the
fleet-wide cleanup and prevents the short-lived timer goroutine from
lingering until delegationRetryDelay expires.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 09:49:22 +00:00
Molecule AI Dev Engineer A (Kimi) 02a3de7c0e fix(workspace-server): replace time.After with time.NewTimer to prevent goroutine leaks
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
CI / Detect changes (pull_request) Successful in 13s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m13s
E2E Chat / E2E Chat (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m37s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m16s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m4s
CI / Platform (Go) (pull_request) Successful in 5m20s
CI / all-required (pull_request) Successful in 6m5s
audit-force-merge / audit (pull_request) Successful in 10s
Inside loops, time.After creates a new timer goroutine each iteration
that cannot be GC'd until it fires. In long-running loops (supervisor
restart backoff, Telegram polling, restart-context polling, CP stop
retry) this leaks goroutines proportional to iteration count.

Replace with time.NewTimer + timer.Stop() on ctx cancellation so the
timer is cleaned up immediately when the goroutine exits.

Affected files:
- supervised/supervised.go (RunWithRecover backoff)
- channels/telegram.go (429 retry + poll error sleep)
- handlers/restart_context.go (online + heartbeat polling)
- handlers/workspace_restart.go (cpStop retry backoff)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 09:45:31 +00:00
Molecule AI Dev Engineer A (Kimi) f1beec8767 fix(channels,scheduler): prevent nil/empty payloads on json.Marshal failure
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 4s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Failing after 9s
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 7s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m37s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m9s
CI / Platform (Go) (pull_request) Successful in 5m11s
CI / all-required (pull_request) Successful in 6m16s
audit-force-merge / audit (pull_request) Successful in 9s
Second sweep found additional log-and-continue instances in channels and
scheduler where a marshal error was logged but the nil result was still
used downstream:

- channels/slack: nil body sent to Slack API → return marshal error
- channels/manager: nil a2aBody passed to ProxyA2ARequest → return error
- channels/manager: empty string pushed to Redis history → skip push
- scheduler/fireSchedule: nil a2aBody passed to ProxyA2ARequest → return early
- scheduler/cronMeta insert (2×): empty string ::jsonb → skip DB insert

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 09:25:38 +00:00
Molecule AI Dev Engineer A (Kimi) 94ca997d43 fix(handlers): prevent invalid JSONB inserts on json.Marshal failure (2nd pass)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
CI / Detect changes (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 8s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m35s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m9s
CI / Platform (Go) (pull_request) Successful in 5m1s
CI / all-required (pull_request) Successful in 8m7s
PR #1933 fixed the fleet-wide json.Marshal error-log-but-continue pattern
in the first pass. A second grep sweep found additional instances where a
logged marshal error was followed by passing the (potentially nil) result
to a PostgreSQL ::jsonb cast, causing unnecessary DB syntax errors, or by
computing an HMAC over empty data (audit chain integrity).

Changes:
- a2a_queue: return early in stitchDrainResponseToDelegation
- agent_message_writer: return nil (broadcast already succeeded)
- audit: return "" instead of HMAC of empty data
- channels: return 500 on marshal errors in Create/Update
- delegation: return early or skip DB insert in pushDelegationResultToInbox,
  insertDelegationRow, executeDelegation, Record, UpdateStatus
- memories: skip best-effort audit insert on marshal error

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 09:14:37 +00:00
hongming bad9a52aac Merge pull request 'fix(workspace-server): retire 12288-byte config-files user-data cap (cp#329)' (#1937) from fix/cp329-retire-config-files-userdata-cap into main
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 11s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
publish-workspace-server-image / build-and-push (push) Successful in 3m14s
CI / Detect changes (push) Successful in 14s
CI / Python Lint & Test (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 47s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 6s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 6m18s
CI / Platform (Go) (push) Successful in 4m29s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 11s
Harness Replays / Harness Replays (push) Successful in 15s
CI / all-required (push) Successful in 11m33s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m32s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m55s
CI / Canvas Deploy Reminder (push) Successful in 2s
publish-workspace-server-image / Production auto-deploy (push) Successful in 10m24s
E2E Chat / detect-changes (push) Successful in 21s
E2E Chat / E2E Chat (push) Successful in 3m29s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 12s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m12s
main-red-watchdog / watchdog (push) Successful in 2m19s
gate-check-v3 / gate-check (push) Successful in 33s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m30s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m23s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 17s
ci-required-drift / drift (push) Successful in 1m13s
E2E Legacy Advisory / Legacy local-platform E2E (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
Railway pin audit (drift detection) / Audit Railway env vars for drift-prone pins (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
2026-05-27 08:31:10 +00:00
hongming-ceo-delegated 8c48bc9474 fix(workspace-server): retire 12288-byte config-files user-data cap (cp#329)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 15s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
E2E Chat / detect-changes (pull_request) Successful in 18s
Harness Replays / detect-changes (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 36s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m39s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 6m23s
CI / Platform (Go) (pull_request) Successful in 4m34s
CI / all-required (pull_request) Successful in 9m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 7s
CPProvisioner.collectCPConfigFiles hard-capped the config bundle (config.yaml
+ prompts/*) at 12 KiB because the control plane embedded it in EC2 user-data
(16 KiB AWS ceiling). That failed a paying customer: the jrs-auto SEO Agent's
config exceeds 12 KiB, so Start() rejected it client-side with
"cp provisioner: collect config files: config files exceed 12288 bytes" — the
workspace could never provision.

The control plane now delivers config OFF user-data (stages to Secrets
Manager, the workspace fetches it into /configs at boot — see
molecule-controlplane cp#329). The bundle travels here only inside the JSON
HTTP body to CP, which has no 16 KiB limit, so the 12 KiB ceiling is obsolete.

Raise cpConfigFilesMaxBytes from 12 KiB to 256 KiB: it becomes a pure
transport-DoS guard (a buggy/hostile tenant can't stream an unbounded body
and OOM the CP provision path), not the old user-data ceiling. Legitimate
growth — more schedules, longer prompts, more skills — never re-hits a wall.

TDD: TestStart_OversizedConfigBundleProvisions reproduces the exact failure
(>12288-byte SEO-shaped bundle) and proves it now reaches the CP request body
intact; TestCollectCPConfigFiles_DoSGuardStillBounds proves the guard still
rejects an oversized (>256 KiB) bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 00:52:17 -07:00
hongming-ceo-delegated 46bb1eb7b4 fix(canvas): link provider selection to llm_billing_mode (internal#703 Gap 2)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m10s
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
CI / Platform (Go) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Successful in 9s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 5m13s
CI / all-required (pull_request) Successful in 30m1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 4s
Selecting a non-Platform provider in the workspace Config tab previously
wrote only the credential env (CLAUDE_CODE_OAUTH_TOKEN / vendor key) and
left llm_billing_mode at its resolved default (platform_managed). CP's
tenant_config then kept injecting the platform proxy base URLs, so the
OAuth token / vendor key was never used and BYOK silently no-op'd (the
live jrs-auto SEO-Agent symptom in #703). The workspace-server even
hard-blocks vendor-key writes on platform_managed workspaces, pointing
the user at this exact billing-mode switch.

ConfigTab.handleSave now derives the implied billing_mode from the
selected provider (Platform / empty -> platform_managed; any other
vendor -> byok) and, when the provider changed and the implied mode
differs, PUTs it to /admin/workspaces/:id/llm-billing-mode (the same
per-tenant endpoint the LLM Billing section uses). The write is gated
on the provider PUT succeeding and on the mode actually changing, so a
BYOK->BYOK vendor swap or an unrelated Save does not issue a redundant
PUT or trigger a needless restart. A failed billing-mode write is
surfaced as a partial-save warning so the user knows BYOK may not take.

This is the UI half of #703; the CP/workspace-server env-injection half
(Gap 1) lands in parallel (workspace_provision.go), composing cleanly.

Refs: internal#703, internal#691.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 06:28:21 +00:00
hongming-ceo-delegated b11d2b6d90 fix(llm): emit resolved per-workspace billing_mode into container env (internal#703)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
sop-checklist / review-refire (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 27s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 30s
E2E Chat / detect-changes (pull_request) Successful in 33s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 45s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m17s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m18s
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
CI / Canvas (Next.js) (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 4m29s
CI / all-required (pull_request) Successful in 32m44s
E2E Chat / E2E Chat (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m42s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m43s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 15m36s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 21s
byok end-to-end fix. The per-workspace resolver (internal#691) already
skips proxy injection + key-strip for byok/disabled, but applyPlatformManagedLLMEnv
only emitted MOLECULE_LLM_BILLING_MODE on the platform_managed strip path,
hardcoded to the literal "platform_managed". A byok/disabled container
therefore never carried a truthful MOLECULE_LLM_BILLING_MODE value — only
MOLECULE_LLM_BILLING_MODE_RESOLVED.

Emit MOLECULE_LLM_BILLING_MODE = res.ResolvedMode (resolver-driven, not a
hardcode) for every resolved mode, alongside the existing _RESOLVED emit.
On the platform_managed path the value is identical to before; on the
byok/disabled early-return path the container now reflects the real mode.
No vendor strings; the proxy-skip / no-strip byok behavior is unchanged.

Tests:
- TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv: a
  per-workspace byok override (org floor = platform_managed) keeps its own
  CLAUDE_CODE_OAUTH_TOKEN, gets NO proxy ANTHROPIC_BASE_URL/key, and reads
  MOLECULE_LLM_BILLING_MODE=byok. Verified failing without the fix.
- TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode:
  no-regression — platform_managed still strips + forces proxy + emits
  MOLECULE_LLM_BILLING_MODE=platform_managed.

Refs internal#703, internal#691.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 06:20:53 +00:00
hongming fdd3f52bc8 fix(workspace-server): retry EC2 terminate on delete (#1932)
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 18s
Block internal-flavored paths / Block forbidden paths (push) Successful in 29s
CI / Python Lint & Test (push) Successful in 7s
CI / Detect changes (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 21s
E2E Chat / detect-changes (push) Successful in 16s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
publish-workspace-server-image / build-and-push (push) Successful in 11m47s
CI / Canvas (Next.js) (push) Successful in 38s
CI / Shellcheck (E2E scripts) (push) Successful in 35s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m47s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3m21s
Harness Replays / Harness Replays (push) Successful in 5s
CI / Platform (Go) (push) Successful in 5m44s
CI / all-required (push) Successful in 21m3s
publish-workspace-server-image / Production auto-deploy (push) Successful in 11m46s
E2E Chat / E2E Chat (push) Failing after 17m44s
CI / Canvas Deploy Reminder (push) Successful in 1s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 44s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 6m2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m12s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
main-red-watchdog / watchdog (push) Successful in 2m3s
gate-check-v3 / gate-check (push) Successful in 29s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 11m42s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 10s
ci-required-drift / drift (push) Successful in 1m20s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 8s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 7m18s
Delete-path cpProv.Stop now uses bounded retry (cpStopWithRetryErr) like the restart path; durable workspace.delete.terminate_retry_exhausted event on exhaustion so the cp-orphan-sweeper/reaper backstop has a signal. Closes the un-retried single-shot Stop that leaked EC2s. Approved by core-qa + core-security.
2026-05-27 06:14:20 +00:00
hongming e058137fbf fix(workspace-server): bounded retry on delete-path EC2 stop + durable leak event
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 5s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
gate-check-v3 / gate-check (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
CI / Canvas (Next.js) (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m19s
CI / Platform (Go) (pull_request) Successful in 6m53s
CI / all-required (pull_request) Successful in 29m0s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
audit-force-merge / audit (pull_request) Successful in 14s
The DELETE path's StopWorkspaceAuto → cpProv.Stop had no retry, while the
restart path used cpStopWithRetry (bounded exp backoff). A transient CP/AWS
hiccup on delete left the workspace row at status='removed' with instance_id
populated, returned a 500, and relied entirely on the 60s CP-orphan-sweeper
to re-drive the terminate. For a cascade *descendant* the "client retries →
replays terminate" recovery is defeated by CascadeDelete's status != 'removed'
CTE filter — so the only inline recovery is a bounded retry.

This extracts the retry loop into cpStopWithRetryErr (cpStopWithRetry keeps
its void contract for the restart paths) and adds stopWorkspaceForDelete,
which retries the CP terminate and, on exhaustion, persists a durable
workspace.delete.terminate_retry_exhausted row to structure_events (the
§Persistent structured logging gate) so the leak/pending decision is
queryable. The row deliberately stays status='removed' + instance_id so the
existing CP-orphan-sweeper backstop still re-drives it; the error is still
returned so the HTTP Delete surfaces the retryable 500.

Test-first, fail-direction proof: CPRetriesTransientThenSucceeds (3 calls, no
event) vs CPExhausts (event + error) discriminate the new behavior from the
pre-fix bare Stop. AST gate updated to recognize cpStopWithRetryErr as the
relocated home of the retry loop.

Refs task #15 (workspace-ec2-leak). Paired with the controlplane workspace-
EC2 reaper PR for the row-gone leak class.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 21:50:48 -07:00
hongming 9bcf9d1dfe feat(workspace-server): seed schedules from workspace-template config.yaml (#1929)
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 3s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 45s
Handlers Postgres Integration / detect-changes (push) Successful in 6s
Harness Replays / detect-changes (push) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 9s
publish-workspace-server-image / build-and-push (push) Successful in 3m10s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m26s
Harness Replays / Harness Replays (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m48s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 4m50s
E2E Chat / E2E Chat (push) Successful in 4m36s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m32s
CI / Platform (Go) (push) Successful in 5m26s
CI / all-required (push) Successful in 6m53s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m45s
lint-bp-context-emit-match / lint-bp-context-emit-match (push) Successful in 1m39s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 10s
CI / all-required (pull_request) Successful in 6m8s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Successful in 9s
security-review / approved (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 32s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 4s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 21s
ci-required-drift / drift (push) Successful in 1m9s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 23s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 13s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m18s
main-red-watchdog / watchdog (push) Successful in 39s
gate-check-v3 / gate-check (push) Successful in 56s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 8m0s
audit-force-merge / audit (pull_request) Waiting to run
Adds template_schedules.go helper: parses workspace-template config.yaml schedules: block + INSERTs into workspace_schedules with source='template'. Hooked into WorkspaceHandler.Create AFTER provisionWorkspaceAuto succeeds (so failed-backend workspaces never end up with orphan schedules). Reuses canonical orgImportScheduleSQL — Issue #24 contract (additive/idempotent/preserve-runtime-rows/never-DELETE) preserved.

Hostile-template defenses: 1 MiB LimitReader on config.yaml; maxTemplateSchedules=100; cron_expr ≤ 128; resolved prompt ≤ 16 KiB; ctx.Err() check inside seed loop; %q on schedule names in logs. 7 parser unit tests; full handlers suite green locally.

Reviews: APPROVED by core-qa + core-security after a two-axis subagent review + fix cycle (REQUEST_CHANGES → fixes in b9a3ef4 → APPROVE).

Companion SEO template PR: molecule-ai-workspace-template-seo-agent#12.
2026-05-26 23:50:42 +00:00
hongming b9a3ef4294 fix(template-schedules): apply review findings (ordering + bounds)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
gate-check-v3 / gate-check (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m14s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
CI / Platform (Go) (pull_request) Successful in 4m57s
CI / all-required (pull_request) Successful in 6m14s
audit-force-merge / audit (pull_request) Successful in 5s
Addresses both review subagents' REQUEST_CHANGES verdicts on
PR #1929:

Code review (correctness)
- #1: Move schedule seeding to AFTER provisionWorkspaceAuto
  succeeds so the scheduler never fires cron rows against a
  workspace whose backend never wired. Failed-backend workspaces
  no longer end up with orphan template_schedules rows.
- #2: seedTemplateSchedules now returns (seeded, skipped int) so
  the caller can observe partial-seed states; workspace.go Create
  logs the (seeded, skipped) pair when skipped > 0, surfacing
  silent partial-loss that the prior (int) return masked.

Security review (hostile-template defenses)
- #3 / #4: parseTemplateSchedules reads config.yaml through an
  io.LimitReader bounded by maxTemplateConfigYAMLBytes (1 MiB)
  and rejects files over the cap before yaml.Unmarshal runs.
  Defends against billion-laughs / anchor-explosion DoS.
- #3: schedules slice length capped at maxTemplateSchedules (100,
  10x the largest current production grid). Hostile template with
  50k schedules now rejected at parse time, not after 50k inserts.
- #3: cron_expr length capped at maxScheduleCronExprLen (128) per
  schedule; resolved prompt body capped at maxSchedulePromptBytes
  (16 KiB) per schedule. Oversized entries are skipped (counted
  as `skipped`) so one bad row doesn't break the rest.
- #3: Seed loop honours ctx.Err() so an aborted Create request
  stops further inserts rather than running to completion on a
  dead goroutine.
- #8: Schedule names quoted via %q in all log lines so CRLF in a
  hostile name can't injection-pollute stdout/Loki.

Tests
- TestParseTemplateSchedules_RejectsOversizeFile — gate against
  the LimitReader cap (1 MiB + 1 byte of '#').
- TestParseTemplateSchedules_RejectsTooManySchedules — gate
  against the schedule-count cap (maxTemplateSchedules + 1
  minimal entries).
- Full handlers test suite still green (17.4s).

Non-fix surface
- Code-review #3 (runtime-default fallback also seeds): runtime-
  default templates do not currently ship a schedules: block so
  this is benign in practice; documented behavior in the comment.
- Code-review #4 (files_dir in workspace-template config.yaml):
  not part of the current template_registry schema; flagged for
  follow-up if templates start declaring files_dir.
- Security-review #7 (cron prompt as agent self-message escalation
  vector): out of scope per security reviewer's own note; tracked
  separately. Will file an issue.

Verified locally:
  go vet ./...                 → clean
  go build ./...               → clean
  gofmt -d <changed files>     → clean
  go test ./internal/handlers/ → PASS (7 unit tests for parser,
                                  full suite 17.4s)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 16:31:52 -07:00
hongming d8b409f1bc feat(workspace-server): seed schedules from workspace-template config.yaml
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
CI / Canvas (Next.js) (pull_request) Successful in 22s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m29s
E2E Chat / E2E Chat (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m10s
Harness Replays / Harness Replays (pull_request) Successful in 14s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m20s
CI / Platform (Go) (pull_request) Successful in 7m16s
CI / all-required (pull_request) Successful in 16m2s
Adds a new template_schedules.go helper that parses a workspace
template's config.yaml for its `schedules:` block and INSERTs each
entry into workspace_schedules with source='template'. Mirrors the
org/import path (org_import.go) so a workspace created directly from
a workspace template lands with the same schedule grid the
org/import path would have produced.

Closes the gap the SEO Agent template hit: the template documented
a "comprehensive schedule" (in source/.../recommended-schedule.md
and the matching config.yaml schedules: block from
molecule-ai-workspace-template-seo-agent#12), but the workspace-
server provisioner never consumed `schedules:` from a workspace
template — only the org/import path seeded workspace_schedules.

Wiring:
- New: handlers/template_schedules.go
  * parseTemplateSchedules(templatePath) — minimal YAML parse of
    `schedules:` only; returns (nil, nil) when config.yaml is
    absent or the block is empty. Errors only on read/parse
    failure of a present file.
  * seedTemplateSchedules(ctx, workspaceID, templatePath, schedules)
    — per-entry resolve+insert via the canonical
    orgImportScheduleSQL constant from org.go. Reuses the existing
    resolvePromptRef + scheduler.ComputeNextRun helpers. Per-row
    failures are logged and skipped; never fatal.
- Modified: handlers/workspace.go
  * WorkspaceHandler.Create calls parseTemplateSchedules +
    seedTemplateSchedules after the templatePath resolves and
    before provisionWorkspaceAuto runs. Non-fatal — a broken
    schedules: block can never block workspace provisioning.
  * Schedules are seeded once at workspace creation; Restart
    does not re-seed (so user-deleted template rows stay deleted).
- New: handlers/template_schedules_test.go
  * Table-driven coverage for parseTemplateSchedules: absent file,
    empty path, no schedules block, happy path (3 entries inc.
    explicit enabled: false), malformed YAML.

Issue #24 contract preserved (additive, idempotent, runtime-row
preservation, never-DELETE) because both the new path and the
existing org/import path execute the same orgImportScheduleSQL
constant — and TestImport_OrgScheduleSQLShape already gates that
SQL's shape against regression.

Verified locally:
  go vet ./...                 → clean
  go build ./...               → clean
  gofmt -d <changed files>     → clean
  go test ./internal/handlers/ → PASS (incl. 5 new tests)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 16:10:29 -07:00
hongming 821ccffeb0 feat(canvas): LLM Billing section in Config tab (internal#691) (#1928)
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
publish-canvas-image / Build & push canvas image (push) Successful in 1m52s
publish-workspace-server-image / build-and-push (push) Successful in 3m19s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 5s
CI / Detect changes (push) Successful in 8s
E2E Chat / detect-changes (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 15s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
CI / Platform (Go) (push) Successful in 6s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
Harness Replays / Harness Replays (push) Successful in 4s
main-red-watchdog / watchdog (push) Successful in 48s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m37s
CI / Canvas (Next.js) (push) Successful in 6m6s
E2E Chat / E2E Chat (push) Successful in 4m54s
CI / all-required (push) Successful in 15m1s
publish-workspace-server-image / Production auto-deploy (push) Failing after 11m38s
gate-check-v3 / gate-check (push) Successful in 1m0s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8m5s
CI / Canvas Deploy Reminder (push) Successful in 1s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 7s
ci-required-drift / drift (push) Successful in 1m22s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 6s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m43s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m41s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Co-authored-by: hongming <hongmingwang@moleculesai.app>
Co-committed-by: hongming <hongmingwang@moleculesai.app>
2026-05-26 22:57:31 +00:00
hongming db5ffed2b5 feat(workspace-server): per-workspace llm_billing_mode override (internal#691) (#1927)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
ci-arm64-advisory / fast-checks (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Waiting to run
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
CI / all-required (push) Has been cancelled
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 50s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m20s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 6m34s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 6m2s
Co-authored-by: hongming <hongmingwang@moleculesai.app>
Co-committed-by: hongming <hongmingwang@moleculesai.app>
2026-05-26 22:57:22 +00:00
hongming cffe4bec43 fix(canvas): derive create-dialog provider models from templates (#1926)
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 13s
Block internal-flavored paths / Block forbidden paths (push) Successful in 22s
CI / Python Lint & Test (push) Successful in 15s
CI / Detect changes (push) Successful in 24s
publish-canvas-image / Build & push canvas image (push) Successful in 1m32s
Handlers Postgres Integration / detect-changes (push) Successful in 16s
Harness Replays / detect-changes (push) Successful in 15s
E2E Chat / detect-changes (push) Successful in 25s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 23s
E2E API Smoke Test / detect-changes (push) Successful in 28s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 14s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
CI / Platform (Go) (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 22s
Harness Replays / Harness Replays (push) Successful in 31s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 16s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m1s
publish-workspace-server-image / build-and-push (push) Successful in 7m15s
E2E Chat / E2E Chat (push) Successful in 5m11s
CI / Canvas (Next.js) (push) Successful in 7m7s
CI / all-required (push) Successful in 10m12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8m9s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m9s
CI / Canvas Deploy Reminder (push) Successful in 1s
main-red-watchdog / watchdog (push) Successful in 34s
gate-check-v3 / gate-check (push) Successful in 42s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 12s
ci-required-drift / drift (push) Successful in 59s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 13s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m48s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 7m2s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 4s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
2026-05-26 21:54:18 +00:00
hongming 12319f1ffd Fix workspace auth forged same-origin bypass
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 14s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 4s
CI / Detect changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 23s
E2E Chat / detect-changes (push) Successful in 23s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 18s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 36s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (push) Successful in 1m7s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
publish-workspace-server-image / build-and-push (push) Successful in 6m20s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 10s
Harness Replays / Harness Replays (push) Successful in 4s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m16s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m10s
CI / Platform (Go) (push) Successful in 5m16s
CI / all-required (push) Successful in 9m32s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Chat / E2E Chat (push) Successful in 4m35s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Successful in 8m21s
publish-workspace-server-image / Production auto-deploy (push) Successful in 5m13s
main-red-watchdog / watchdog (push) Successful in 48s
gate-check-v3 / gate-check (push) Successful in 31s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 35s
ci-required-drift / drift (push) Successful in 1m5s
Sweep stale AWS Secrets Manager secrets / Sweep AWS Secrets Manager (push) Successful in 19s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m27s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Failing after 7m59s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 6s
2026-05-26 18:20:16 +00:00
claude-ceo-assistant 51ca06447b Fix workspace auth referer bypass
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 14s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 16s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 9s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 37s
gate-check-v3 / gate-check (pull_request) Successful in 11s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 1m4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
CI / Canvas (Next.js) (pull_request) Successful in 13s
E2E Chat / E2E Chat (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 14s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
qa-review / approved (pull_request) Refired via /qa-recheck by claude-ceo-assistant
security-review / approved (pull_request) Refired via /security-recheck by claude-ceo-assistant
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m43s
CI / Platform (Go) (pull_request) Successful in 4m43s
CI / all-required (pull_request) Successful in 5m55s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Successful in 8m2s
audit-force-merge / audit (pull_request) Successful in 4s
2026-05-26 11:12:35 -07:00
hongming c2a08f6a6d fix(workspace): strip provider keys in platform-managed LLM mode (#1922)
ci-arm64-advisory / fast-checks (push) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (push) Successful in 12s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 4s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 19s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Harness Replays / detect-changes (push) Successful in 19s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (push) Successful in 16s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push) Successful in 19s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 41s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
CI / Shellcheck (E2E scripts) (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m49s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 5s
CI / Canvas Deploy Reminder (push) Successful in 3s
publish-workspace-server-image / build-and-push (push) Successful in 6m31s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m36s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 6m32s
E2E Chat / E2E Chat (push) Successful in 5m57s
CI / Platform (Go) (push) Successful in 6m41s
CI / all-required (push) Successful in 9m20s
publish-workspace-server-image / Production auto-deploy (push) Successful in 4m58s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m0s
main-red-watchdog / watchdog (push) Successful in 33s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 5m16s
gate-check-v3 / gate-check (push) Successful in 40s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 11s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 20s
ci-required-drift / drift (push) Successful in 1m12s
2026-05-26 17:51:51 +00:00
62 changed files with 3622 additions and 594 deletions
+152
View File
@@ -605,6 +605,151 @@ def file_or_update_red(
sys.stderr.write(f"::warning::label '{RED_LABEL}' not found on repo\n")
def close_stale_red_issues(
current_sha: str,
current_status: dict,
*,
dry_run: bool = False,
) -> int:
"""Close open [main-red] issues whose specific failing contexts have
all recovered on `current_sha`, even though `main` is still red for
other reasons (mc#1789).
When main stays red across consecutive SHAs for *different* causes,
`close_open_red_issues_for_other_shas` never fires (it only runs when
main is green). This function prevents stale issues from accumulating
indefinitely by comparing per-context recovery across SHAs.
An issue is considered stale when every context that was in a failed
state on the issue's SHA is now either `success` on the current HEAD
or absent (workflow removed / renamed). Issues whose original SHA had
a combined-red-with-no-detail (empty statuses list) are skipped — we
cannot verify recovery without per-context data.
Returns the number of issues closed.
"""
open_red = list_open_red_issues()
if not open_red:
return 0
current_statuses = current_status.get("statuses") or []
closed = 0
for issue in open_red:
title = issue.get("title", "")
prefix = f"{TITLE_PREFIX} {REPO}: "
if not title.startswith(prefix):
continue
short_sha = title[len(prefix):]
if short_sha == current_sha[:10]:
continue
# Query status for the old SHA. Short SHA should resolve; if it
# doesn't (GC'd, force-pushed, ambiguous), skip conservatively.
try:
old_status = get_combined_status(short_sha)
except ApiError:
continue
old_red, old_failed = is_red(old_status)
if not old_red:
# Open issue for a now-green SHA — close it via the normal path.
num = issue.get("number")
if isinstance(num, int):
comment = (
f"Commit `{short_sha}` is no longer red. Closing as the "
f"failure context has recovered or expired."
)
if dry_run:
print(
f"::notice::[dry-run] would close issue #{num} "
f"({title}) — old SHA is now green"
)
closed += 1
continue
api(
"POST",
f"/repos/{OWNER}/{NAME}/issues/{num}/comments",
body={"body": comment},
)
api(
"PATCH",
f"/repos/{OWNER}/{NAME}/issues/{num}",
body={"state": "closed"},
)
print(
f"::notice::Closed stale main-red issue #{num} "
f"(old SHA {short_sha} is now green)"
)
closed += 1
continue
if not old_failed:
# Combined red with no per-context detail — can't verify recovery.
continue
# Verify every failed context from the old SHA has recovered.
all_recovered = True
recovered_ctxs: list[str] = []
still_failing_ctxs: list[str] = []
for s in old_failed:
ctx = s.get("context", "")
if not ctx:
continue
current_match = None
for cs in current_statuses:
if isinstance(cs, dict) and cs.get("context") == ctx:
current_match = cs
break
if current_match is None:
recovered_ctxs.append(ctx)
elif _entry_state(current_match) == "success":
recovered_ctxs.append(ctx)
else:
all_recovered = False
still_failing_ctxs.append(ctx)
if not all_recovered:
continue
num = issue.get("number")
if not isinstance(num, int):
continue
comment = (
f"The failing contexts from this SHA (`{short_sha}`) have "
f"recovered on current HEAD `{current_sha[:10]}`: "
f"{', '.join(recovered_ctxs)}. "
f"Main is still red for other reasons; see the current "
f"`[main-red]` issue for `{current_sha[:10]}`."
)
if dry_run:
print(
f"::notice::[dry-run] would close stale issue #{num} "
f"({title}) — contexts recovered"
)
closed += 1
continue
api(
"POST",
f"/repos/{OWNER}/{NAME}/issues/{num}/comments",
body={"body": comment},
)
api(
"PATCH",
f"/repos/{OWNER}/{NAME}/issues/{num}",
body={"state": "closed"},
)
print(
f"::notice::Closed stale main-red issue #{num} "
f"(contexts recovered at {current_sha[:10]})"
)
closed += 1
return closed
def close_open_red_issues_for_other_shas(
current_sha: str,
*,
@@ -775,6 +920,13 @@ def run_once(*, dry_run: bool = False) -> int:
print(f"::warning::main is RED at {sha[:10]} on {WATCH_BRANCH}: "
f"{len(failed)} failed context(s)")
file_or_update_red(sha, failed, debug, dry_run=dry_run)
stale_closed = close_stale_red_issues(sha, recheck_status, dry_run=dry_run)
if stale_closed:
emit_loki_event("main_red_stale_closed", sha, [])
print(
f"::notice::Closed {stale_closed} stale main-red issue(s) "
f"whose contexts recovered at {sha[:10]}"
)
else:
# Green or pending-with-no-real-failures. Close stale issues
# from earlier SHAs when required CI has recovered.
+2 -2
View File
@@ -642,7 +642,7 @@ def load_config(path: str) -> dict[str, Any]:
# requiring the dep, so the ignore is safe: if yaml loads, we use it;
# otherwise we fall back silently.
import yaml # type: ignore[import-not-found]
with open(path) as f:
with open(path, encoding="utf-8") as f:
return yaml.safe_load(f)
except ImportError:
return _load_config_minimal(path)
@@ -656,7 +656,7 @@ def _load_config_minimal(path: str) -> dict[str, Any]:
item map: scalars + lists of scalars. Does NOT support nested lists,
YAML anchors, multi-doc, or flow style.
"""
with open(path) as f:
with open(path, encoding="utf-8") as f:
lines = f.readlines()
return _parse_minimal_yaml(lines)
+1 -1
View File
@@ -33,7 +33,7 @@ def scenario() -> str:
p = os.path.join(STATE_DIR, "scenario")
if not os.path.isfile(p):
return "T1_success"
with open(p) as f:
with open(p, encoding="utf-8") as f:
return f.read().strip()
@@ -40,7 +40,7 @@ def scenario() -> str:
p = os.path.join(STATE_DIR, "scenario")
if not os.path.isfile(p):
return "T1_pr_open"
with open(p) as f:
with open(p, encoding="utf-8") as f:
return f.read().strip()
@@ -258,6 +258,7 @@ def test_run_once_failure_does_not_close(monkeypatch):
monkeypatch.setattr(wd, "file_or_update_red", capture_file)
monkeypatch.setattr(wd, "close_open_red_issues_for_other_shas", lambda *a, **k: 0)
monkeypatch.setattr(wd, "close_stale_red_issues", lambda *a, **k: 0)
assert wd.run_once(dry_run=True) == 0
assert filed == ["abc123"]
+14 -6
View File
@@ -164,12 +164,20 @@ jobs:
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Run tests with race detection and coverage
# Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
# full ./... suite with race detection + coverage. A 10m per-step timeout
# lets the suite complete on cold cache (~5-7m) while failing cleanly
# instead of OOM-killing. The job-level timeout (15m) is a backstop.
run: go test -race -timeout 10m -coverprofile=coverage.out ./...
name: Run tests with coverage (blocking gate)
# Removed -race from the blocking gate per #1184: cold runners
# take 13-25 min to compile with race instrumentation, exceeding
# the 10m step timeout and causing false failures. Race detection
# now runs as a non-blocking advisory step below.
run: go test -timeout 10m -coverprofile=coverage.out ./...
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Race detection (advisory, non-blocking)
# mc#1184: runs race detector as an advisory check so cold-runner
# compile-time spikes don't block merges. Failures here surface in
# the run log but do not fail the build.
run: go test -race -timeout 10m ./...
continue-on-error: true
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Per-file coverage report
+32 -261
View File
@@ -34,22 +34,6 @@ interface TemplateSpec {
providers?: string[];
}
interface HermesProvider {
id: string;
label: string;
envVar: string;
defaultModel: string;
models: string[];
}
const DEFAULT_LLM_MODELS: SelectorModel[] = [
{ id: "moonshot/kimi-k2.6", name: "Kimi K2.6", provider: "platform", required_env: [] },
{ id: "MiniMax-M2.7", name: "MiniMax M2.7", required_env: ["MINIMAX_API_KEY"] },
{ id: "kimi-k2-turbo-preview", name: "Kimi K2 Turbo Preview", required_env: ["KIMI_API_KEY"] },
{ id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "sonnet", name: "Claude Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
];
const DEFAULT_PLATFORM_MODEL = DEFAULT_LLM_MODELS[0];
const DEFAULT_RUNTIME = "claude-code";
const RUNTIME_OPTIONS = [
{ value: "claude-code", label: "Claude Code" },
@@ -63,31 +47,6 @@ const DEFAULT_HEADLESS_ROOT_GB = 30;
const DEFAULT_DISPLAY_INSTANCE_TYPE = "t3.xlarge";
const DEFAULT_DISPLAY_ROOT_GB = 80;
// All providers supported by Hermes runtime via providers.resolve_provider().
// `defaultModel` is the slug injected into the workspace provision request
// when the user picks this provider — template-hermes's derive-provider.sh
// maps the prefix back to the provider name at install time, so this is
// the canonical handshake. `models` are additional suggestions surfaced in
// the datalist so the user can pick a different size without typing the
// whole slug.
export const HERMES_PROVIDERS: HermesProvider[] = [
{ id: "anthropic", label: "Anthropic (Claude)", envVar: "ANTHROPIC_API_KEY", defaultModel: "anthropic/claude-sonnet-4-5", models: ["anthropic/claude-opus-4-5", "anthropic/claude-sonnet-4-5", "anthropic/claude-haiku-4-5"] },
{ id: "openai", label: "OpenAI", envVar: "OPENAI_API_KEY", defaultModel: "openai/gpt-4o", models: ["openai/gpt-4o", "openai/gpt-4o-mini", "openai/o3-mini"] },
{ id: "openrouter", label: "OpenRouter", envVar: "OPENROUTER_API_KEY", defaultModel: "openrouter/auto", models: ["openrouter/auto", "openrouter/anthropic/claude-sonnet-4", "openrouter/meta-llama/llama-3.3-70b"] },
{ id: "xai", label: "xAI (Grok)", envVar: "XAI_API_KEY", defaultModel: "xai/grok-4", models: ["xai/grok-4", "xai/grok-4-mini"] },
{ id: "gemini", label: "Google Gemini", envVar: "GEMINI_API_KEY", defaultModel: "gemini/gemini-2.5-pro", models: ["gemini/gemini-2.5-pro", "gemini/gemini-2.5-flash"] },
{ id: "qwen", label: "Qwen (Alibaba)", envVar: "QWEN_API_KEY", defaultModel: "alibaba/qwen3-max", models: ["alibaba/qwen3-max", "alibaba/qwen3-coder"] },
{ id: "glm", label: "GLM (Zhipu AI)", envVar: "GLM_API_KEY", defaultModel: "zai/glm-4.6", models: ["zai/glm-4.6", "zai/glm-4.5-air"] },
{ id: "kimi", label: "Kimi (Moonshot)", envVar: "KIMI_API_KEY", defaultModel: "kimi-coding/kimi-k2", models: ["kimi-coding/kimi-k2", "kimi-coding/kimi-k1.5"] },
{ id: "minimax", label: "MiniMax", envVar: "MINIMAX_API_KEY", defaultModel: "minimax/MiniMax-M2.7", models: ["minimax/MiniMax-M2.7", "minimax/MiniMax-M2.7-highspeed", "minimax/MiniMax-M1"] },
{ id: "deepseek", label: "DeepSeek", envVar: "DEEPSEEK_API_KEY", defaultModel: "deepseek/deepseek-chat", models: ["deepseek/deepseek-chat", "deepseek/deepseek-reasoner"] },
{ id: "groq", label: "Groq", envVar: "GROQ_API_KEY", defaultModel: "openrouter/groq/llama-3.3-70b", models: ["openrouter/groq/llama-3.3-70b"] },
{ id: "mistral", label: "Mistral", envVar: "MISTRAL_API_KEY", defaultModel: "openrouter/mistralai/mistral-large", models: ["openrouter/mistralai/mistral-large"] },
{ id: "together", label: "Together AI", envVar: "TOGETHER_API_KEY", defaultModel: "openrouter/meta-llama/llama-3.3-70b", models: ["openrouter/meta-llama/llama-3.3-70b"] },
{ id: "fireworks", label: "Fireworks AI", envVar: "FIREWORKS_API_KEY", defaultModel: "openrouter/meta-llama/llama-3.3-70b", models: ["openrouter/meta-llama/llama-3.3-70b"] },
{ id: "hermes", label: "Hermes / Nous (legacy)", envVar: "HERMES_API_KEY", defaultModel: "nousresearch/Hermes-3-Llama-3.1-405B", models: ["nousresearch/Hermes-3-Llama-3.1-405B", "nousresearch/Hermes-4-14B"] },
];
export function CreateWorkspaceButton() {
const [open, setOpen] = useState(false);
const [name, setName] = useState("");
@@ -107,32 +66,20 @@ export function CreateWorkspaceButton() {
// filter below. Same data source ConfigTab uses (PR #2454). When the
// selected template declares `runtime_config.providers` in its
// config.yaml, the modal surfaces only those providers in the
// <select>. Empty/missing list falls back to the full HERMES_PROVIDERS
// catalog so older templates without the field keep working.
// <select>. Provider/model options are derived from template models.
const [templateSpecs, setTemplateSpecs] = useState<TemplateSpec[]>([]);
// External-runtime path: skip docker provision, mint a workspace_auth_token,
// and surface the connection snippet in a modal after create. When
// isExternal is true the template / model / hermes-provider fields are
// hidden (they're meaningless for BYO-compute agents).
// isExternal is true the template and model fields are hidden (they're
// meaningless for BYO-compute agents).
const [isExternal, setIsExternal] = useState(false);
const [externalRuntime, setExternalRuntime] = useState("external");
const [externalConnection, setExternalConnection] =
useState<ExternalConnectionInfo | null>(null);
// Hermes-specific state
const [hermesProvider, setHermesProvider] = useState("anthropic");
const [hermesApiKey, setHermesApiKey] = useState("");
// Model slug is sent to CP as `model` and plumbed to the workspace EC2
// as HERMES_DEFAULT_MODEL env var. template-hermes's derive-provider.sh
// reads the prefix (`minimax/…`, `anthropic/…`) to set
// HERMES_INFERENCE_PROVIDER at install time. Missing model → provider
// falls back to "auto" and hermes picks its compiled-in default
// (Anthropic), which 401s if the user's key is for a different
// provider. Hence: require model when template=hermes.
const [hermesModel, setHermesModel] = useState("");
const [llmSelection, setLLMSelection] = useState<SelectorValue>({
providerId: "platform|",
model: "moonshot/kimi-k2.6",
providerId: "",
model: "",
envVars: [],
});
const [llmSecret, setLLMSecret] = useState("");
@@ -194,10 +141,7 @@ export function CreateWorkspaceButton() {
const handleRuntimeChange = useCallback((nextRuntime: string) => {
setRuntime(nextRuntime);
setTemplate("");
setHermesProvider("anthropic");
setHermesApiKey("");
setHermesModel("");
setLLMSelection({ providerId: "platform|", model: DEFAULT_PLATFORM_MODEL.id, envVars: [] });
setLLMSelection({ providerId: "", model: "", envVars: [] });
setLLMSecret("");
}, []);
@@ -209,9 +153,12 @@ export function CreateWorkspaceButton() {
return templateSpecs.find((s) => s.id === template) ?? null;
}, [template, templateSpecs]);
const selectedRuntimeTemplateSpec = useMemo<TemplateSpec | null>(() => (
templateSpecs.find((s) => s.id === runtime && BASE_RUNTIME_TEMPLATE_IDS.has(s.id)) ?? null
templateSpecs.find((s) => {
if (!BASE_RUNTIME_TEMPLATE_IDS.has(s.id)) return false;
const specRuntime = (s.runtime ?? s.id).trim().toLowerCase();
return s.id === runtime || specRuntime === runtime;
}) ?? null
), [runtime, templateSpecs]);
const isHermes = runtime === "hermes";
const visibleTemplateSpecs = useMemo(
() => templateSpecs.filter((spec) => {
if (BASE_RUNTIME_TEMPLATE_IDS.has(spec.id)) return false;
@@ -222,28 +169,11 @@ export function CreateWorkspaceButton() {
);
const llmModels = useMemo(
() => {
if (!selectedTemplateSpec?.models?.length) return DEFAULT_LLM_MODELS;
if (isHermes) {
return selectedTemplateSpec.models;
}
if (selectedTemplateSpec.models.some((model) => model.provider === "platform")) {
return selectedTemplateSpec.models;
}
const templateDefault = selectedTemplateSpec.model?.trim();
const defaultModelSpec = templateDefault
? selectedTemplateSpec.models.find((model) => model.id === templateDefault)
: undefined;
return [
{
id: templateDefault || DEFAULT_PLATFORM_MODEL.id,
name: defaultModelSpec?.name ?? DEFAULT_PLATFORM_MODEL.name,
provider: "platform",
required_env: [],
},
...selectedTemplateSpec.models,
];
const sourceSpec = selectedTemplateSpec ?? selectedRuntimeTemplateSpec;
if (!sourceSpec?.models?.length) return [];
return sourceSpec.models;
},
[isHermes, selectedTemplateSpec],
[selectedRuntimeTemplateSpec, selectedTemplateSpec],
);
const llmCatalog = useMemo(() => buildProviderCatalog(llmModels), [llmModels]);
const selectedLLMProvider = useMemo(
@@ -251,67 +181,22 @@ export function CreateWorkspaceButton() {
[llmCatalog, llmSelection.providerId],
);
// Filter HERMES_PROVIDERS by what the template declares it supports.
// Empty/missing declared list → fall back to the full catalog so
// templates that haven't migrated to the explicit `providers:` field
// (and self-hosted setups without /templates) keep working unchanged.
const availableProviders = useMemo<HermesProvider[]>(() => {
const declared = selectedTemplateSpec?.providers ?? selectedRuntimeTemplateSpec?.providers;
if (!declared || declared.length === 0) return HERMES_PROVIDERS;
const allowed = new Set(declared.map((p) => p.toLowerCase()));
const filtered = HERMES_PROVIDERS.filter((p) => allowed.has(p.id.toLowerCase()));
// Defensive: if the template's declared list doesn't match anything
// in our static catalog (e.g. brand-new provider id we don't have
// metadata for yet), fall back to the full list rather than render
// an empty <select>. Better to over-show than to lock the user out.
return filtered.length > 0 ? filtered : HERMES_PROVIDERS;
}, [selectedRuntimeTemplateSpec, selectedTemplateSpec]);
// If the currently-selected provider is filtered out by a template
// change, snap back to the first available. Without this, the
// hermesProvider state could refer to a provider not in the dropdown
// — confusing UI + the API key field's envVar would be wrong.
useEffect(() => {
if (!isHermes) return;
if (availableProviders.length === 0) return;
if (!availableProviders.some((p) => p.id === hermesProvider)) {
setHermesProvider(availableProviders[0].id);
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [availableProviders, isHermes]);
useEffect(() => {
if (isHermes || llmCatalog.length === 0) return;
const templateDefault = selectedTemplateSpec?.model?.trim();
const matched = templateDefault ? findProviderForModel(llmCatalog, templateDefault) : null;
const next = matched ?? llmCatalog[0];
if (llmCatalog.length === 0) return;
const sourceDefault = (selectedTemplateSpec ?? selectedRuntimeTemplateSpec)?.model?.trim();
const platformProvider = llmCatalog.find((p) => p.vendor === "platform");
const matched = sourceDefault ? findProviderForModel(llmCatalog, sourceDefault) : null;
const next = platformProvider ?? matched ?? llmCatalog[0];
const defaultModel = next.models.find((model) => model.id === sourceDefault)?.id
?? next.models[0]?.id
?? "";
setLLMSelection({
providerId: next.id,
model: matched && templateDefault
? templateDefault
: next.wildcard
? ""
: next.models[0]?.id ?? "",
model: next.wildcard ? "" : defaultModel,
envVars: next.envVars,
});
setLLMSecret("");
}, [isHermes, llmCatalog, selectedTemplateSpec?.model]);
// Auto-fill hermesModel with the provider's defaultModel whenever the
// provider changes, but only if the user hasn't already typed their own
// slug. Prevents the empty-model → "auto" → Anthropic-default 401 trap.
useEffect(() => {
if (!isHermes) return;
const p = HERMES_PROVIDERS.find((x) => x.id === hermesProvider);
if (!p) return;
// Replace model only if current value matches another provider's
// default (user hasn't customized it) OR is empty.
const isUntouched =
hermesModel === "" ||
HERMES_PROVIDERS.some((x) => x.defaultModel === hermesModel);
if (isUntouched) setHermesModel(p.defaultModel);
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [hermesProvider, isHermes]);
}, [llmCatalog, selectedRuntimeTemplateSpec, selectedTemplateSpec]);
// Reset form and load workspaces whenever dialog opens
useEffect(() => {
@@ -328,11 +213,8 @@ export function CreateWorkspaceButton() {
setDisplayInstanceType(DEFAULT_DISPLAY_INSTANCE_TYPE);
setDisplayRootGB(String(DEFAULT_DISPLAY_ROOT_GB));
setDisplayResolution("1920x1080");
setHermesProvider("anthropic");
setExternalRuntime("external");
setHermesApiKey("");
setHermesModel("");
setLLMSelection({ providerId: "platform|", model: "moonshot/kimi-k2.6", envVars: [] });
setLLMSelection({ providerId: "", model: "", envVars: [] });
setLLMSecret("");
api
.get<WorkspaceOption[]>("/workspaces")
@@ -341,7 +223,7 @@ export function CreateWorkspaceButton() {
api
.get<TemplateSpec[]>("/templates")
.then((rows) => setTemplateSpecs(Array.isArray(rows) ? rows : []))
.catch(() => { /* keep empty — HERMES_PROVIDERS fallback below */ });
.catch(() => { /* keep empty; create stays blocked until the catalog loads */ });
// defaultTier is stable for the session (derived from window.location),
// safe to omit from deps.
// eslint-disable-next-line react-hooks/exhaustive-deps
@@ -352,29 +234,18 @@ export function CreateWorkspaceButton() {
setError("Name is required");
return;
}
if (isHermes && !hermesApiKey.trim()) {
setError("API key is required for Hermes workspaces");
return;
}
if (isHermes && !hermesModel.trim()) {
setError("Model is required for Hermes workspaces — provider routing depends on the model slug prefix");
return;
}
if (!isExternal && !isHermes && !llmSelection.model.trim()) {
if (!isExternal && !llmSelection.model.trim()) {
setError("Model is required");
return;
}
if (!isExternal && !isHermes && selectedLLMProvider?.envVars.length && !llmSecret.trim()) {
if (!isExternal && selectedLLMProvider?.envVars.length && !llmSecret.trim()) {
setError("Provider credential is required");
return;
}
setCreating(true);
setError(null);
const provider = isHermes
? HERMES_PROVIDERS.find((p) => p.id === hermesProvider)
: undefined;
const nativeProvider = !isHermes ? selectedLLMProvider : undefined;
const nativeProvider = selectedLLMProvider;
try {
const parsedBudget = budgetLimit.trim()
@@ -398,7 +269,7 @@ export function CreateWorkspaceButton() {
tier,
parent_id: parentId || undefined,
budget_limit: parsedBudget,
...(!isExternal && !isHermes && nativeProvider
...(!isExternal && nativeProvider
? {
model: llmSelection.model.trim(),
llm_provider: nativeProvider.vendor,
@@ -432,12 +303,6 @@ export function CreateWorkspaceButton() {
// no container provisioning, token minted, connection payload
// returned in the response for the modal below.
...(isExternal ? { runtime: externalRuntime } : { runtime }),
...(!isExternal && isHermes && provider
? {
secrets: { [provider.envVar]: hermesApiKey.trim() },
model: hermesModel.trim(),
}
: {}),
});
// External path: keep the create dialog open just long enough to
// hand control to the connect modal, then close. The connect
@@ -588,7 +453,7 @@ export function CreateWorkspaceButton() {
</div>
)}
{!isExternal && !isHermes && selectedLLMProvider && (
{!isExternal && selectedLLMProvider && (
<div className="rounded-lg border border-line/50 bg-surface-card/40 p-3 space-y-3">
<div className="text-[11px] font-medium text-ink-mid">
LLM
@@ -744,100 +609,6 @@ export function CreateWorkspaceButton() {
</div>
</div>
{/* Hermes provider configuration — shown only for the Hermes runtime. */}
{isHermes && (
<div
className="mt-4 rounded-xl border border-violet-700/40 bg-violet-950/20 p-4 space-y-3"
data-testid="hermes-provider-section"
>
<p className="text-[11px] font-semibold text-violet-400 uppercase tracking-wide">
Hermes Provider
</p>
<p className="text-[11px] text-ink-mid -mt-1">
Choose the AI provider and paste your API key. The key is
stored as an encrypted workspace secret.
</p>
<div>
<label
htmlFor="hermes-provider-select"
className="text-[11px] text-ink-mid block mb-1"
>
Provider
</label>
<select
id="hermes-provider-select"
value={hermesProvider}
onChange={(e) => setHermesProvider(e.target.value)}
aria-label="Hermes provider"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink focus:outline-none focus:border-violet-500/60 focus:ring-1 focus:ring-violet-500/20 transition-colors"
>
{availableProviders.map((p) => (
<option key={p.id} value={p.id}>
{p.label}
</option>
))}
</select>
</div>
<div>
<label
htmlFor="hermes-api-key-input"
className="text-[11px] text-ink-mid block mb-1"
>
API Key{" "}
<span aria-hidden="true" className="text-bad">
*
</span>
<span className="sr-only"> (required)</span>
</label>
<input
id="hermes-api-key-input"
type="password"
value={hermesApiKey}
onChange={(e) => setHermesApiKey(e.target.value)}
placeholder="sk-…"
aria-label="Hermes API key"
autoComplete="off"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink placeholder-ink-soft focus:outline-none focus:border-violet-500/60 focus:ring-1 focus:ring-violet-500/20 transition-colors font-mono"
/>
</div>
<div>
<label
htmlFor="hermes-model-input"
className="text-[11px] text-ink-mid block mb-1"
>
Model{" "}
<span aria-hidden="true" className="text-bad">
*
</span>
<span className="sr-only"> (required)</span>
</label>
<input
id="hermes-model-input"
type="text"
value={hermesModel}
onChange={(e) => setHermesModel(e.target.value)}
placeholder="e.g. minimax/MiniMax-M2.7"
aria-label="Hermes model slug"
autoComplete="off"
spellCheck={false}
list="hermes-model-suggestions"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink placeholder-ink-soft focus:outline-none focus:border-violet-500/60 focus:ring-1 focus:ring-violet-500/20 transition-colors font-mono"
/>
<datalist id="hermes-model-suggestions">
{HERMES_PROVIDERS.find((p) => p.id === hermesProvider)?.models.map(
(m) => <option key={m} value={m} />,
)}
</datalist>
<p className="text-[10px] text-ink-mid mt-1">
Slug determines which provider hermes routes to at install time.
</p>
</div>
</div>
)}
{error && (
<div
role="alert"
@@ -1,7 +1,7 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor, cleanup } from "@testing-library/react";
import { CreateWorkspaceButton, HERMES_PROVIDERS } from "../CreateWorkspaceDialog";
import { CreateWorkspaceButton } from "../CreateWorkspaceDialog";
vi.mock("@/lib/api", () => ({
api: {
@@ -21,6 +21,22 @@ const SAMPLE_WORKSPACES = [
];
const SAMPLE_TEMPLATES = [
{
id: "claude-code-default",
name: "Claude Code Agent",
runtime: "claude-code",
model: "moonshot/kimi-k2.6",
providers: ["platform", "minimax", "kimi-coding", "anthropic", "anthropic-oauth"],
models: [
{ id: "moonshot/kimi-k2.6", name: "Kimi K2.6", provider: "platform", required_env: [] },
{ id: "MiniMax-M2.7", name: "MiniMax M2.7", required_env: ["MINIMAX_API_KEY"] },
{ id: "kimi-k2-turbo-preview", name: "Kimi K2 Turbo Preview", required_env: ["KIMI_API_KEY"] },
{ id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "sonnet", name: "Claude Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "opus", name: "Claude Opus", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "haiku", name: "Claude Haiku", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
],
},
{
id: "seo-agent",
name: "SEO Agent",
@@ -33,9 +49,22 @@ const SAMPLE_TEMPLATES = [
{ id: "kimi-k2-turbo-preview", name: "Kimi K2 Turbo Preview", required_env: ["KIMI_API_KEY"] },
{ id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "sonnet", name: "Claude Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "opus", name: "Claude Opus", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "haiku", name: "Claude Haiku", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
],
},
{
id: "hermes",
name: "Hermes",
runtime: "hermes",
model: "openai/gpt-4o",
providers: ["openai", "anthropic", "platform"],
models: [
{ id: "openai/gpt-4o", name: "GPT-4o", required_env: ["OPENAI_API_KEY"] },
{ id: "anthropic/claude-sonnet-4-5", name: "Claude Sonnet 4.5", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "moonshot/kimi-k2.6", name: "Kimi K2.6", provider: "platform", required_env: [] },
],
},
{ id: "hermes", name: "Hermes", runtime: "hermes" },
];
beforeEach(() => {
@@ -269,6 +298,9 @@ describe("CreateWorkspaceDialog", () => {
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "anthropic-oauth|CLAUDE_CODE_OAUTH_TOKEN" },
});
fireEvent.change(document.querySelector("[data-testid='model-select']") as HTMLSelectElement, {
target: { value: "sonnet" },
});
fireEvent.change(document.getElementById("llm-secret-input") as HTMLInputElement, {
target: { value: "oauth-token" },
});
@@ -283,6 +315,18 @@ describe("CreateWorkspaceDialog", () => {
expect(body.secrets).toEqual({ CLAUDE_CODE_OAUTH_TOKEN: "oauth-token" });
});
it("lists all Claude Code subscription aliases for blank workspaces", async () => {
await openDialog();
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "anthropic-oauth|CLAUDE_CODE_OAUTH_TOKEN" },
});
const modelSelect = document.querySelector("[data-testid='model-select']") as HTMLSelectElement;
const optionValues = Array.from(modelSelect.options).map((option) => option.value);
expect(optionValues).toEqual(expect.arrayContaining(["sonnet", "opus", "haiku"]));
});
it("renders gracefully when GET /workspaces fails", async () => {
mockGet.mockRejectedValueOnce(new Error("Network error"));
await openDialog();
@@ -297,226 +341,103 @@ describe("CreateWorkspaceDialog", () => {
});
// ---------------------------------------------------------------------------
// Hermes provider picker tests
// Dynamic runtime provider picker tests
// ---------------------------------------------------------------------------
describe("CreateWorkspaceDialog — Hermes provider picker", () => {
it("does NOT show hermes provider section for non-hermes templates", async () => {
describe("CreateWorkspaceDialog — dynamic runtime provider picker", () => {
it("does not render the old Hermes-only provider section", async () => {
await openDialog();
await setTemplate("seo-agent");
await setRuntime("hermes");
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeNull();
});
it("shows hermes provider section when runtime is 'hermes'", async () => {
it("derives Hermes provider and model options from the /templates runtime row", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.querySelector("[data-testid='provider-select']") as HTMLSelectElement;
await waitFor(() => expect(providerSelect.options.length).toBe(4));
const providerValues = Array.from(providerSelect.options).map((option) => option.value);
expect(providerValues).toEqual(expect.arrayContaining([
"platform|",
"openai|OPENAI_API_KEY",
"anthropic|ANTHROPIC_API_KEY",
]));
expect(providerValues).not.toContain("gemini|GEMINI_API_KEY");
});
it("shows hermes provider section for the Hermes runtime preset", async () => {
it("uses the template-declared default provider/model for Hermes", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
await waitFor(() => {
const providerSelect = document.querySelector("[data-testid='provider-select']") as HTMLSelectElement;
expect(providerSelect.value).toBe("platform|");
});
const modelSelect = document.querySelector("[data-testid='model-select']") as HTMLSelectElement;
expect(modelSelect.value).toBe("moonshot/kimi-k2.6");
});
it("hermes provider dropdown defaults to 'anthropic'", async () => {
it("prompts for the provider credential required by the selected Hermes model", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
expect(providerSelect).toBeTruthy();
expect(providerSelect.value).toBe("anthropic");
});
it("hermes provider dropdown lists all 15 providers", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
expect(providerSelect.options.length).toBe(HERMES_PROVIDERS.length);
const ids = Array.from(providerSelect.options).map((o) => o.value);
expect(ids).toContain("anthropic");
expect(ids).toContain("openai");
expect(ids).toContain("gemini");
expect(ids).toContain("deepseek");
expect(ids).toContain("hermes");
});
// Pins the dynamic-providers behavior: when the matched template's
// /templates row declares `providers`, the dropdown filters to that
// subset instead of showing the full HERMES_PROVIDERS catalog. Same
// data source ConfigTab uses (PR #2454) — keeps the modal and the
// settings tab honest about which providers a template supports.
it("hermes provider dropdown filters to template-declared providers when /templates ships them", async () => {
// Per-URL mock: /workspaces returns the existing fixture, /templates
// returns a hermes row that only allows anthropic + minimax + openai.
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
return [
{ id: "hermes", name: "Hermes", runtime: "hermes", providers: ["anthropic", "minimax", "openai"] },
// eslint-disable-next-line @typescript-eslint/no-explicit-any
] as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "openai|OPENAI_API_KEY" },
});
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
// Filtered list arrives async after /templates fetch resolves —
// keep waiting until the dropdown shrinks below the full catalog.
await waitFor(() => expect(providerSelect.options.length).toBe(3));
const ids = Array.from(providerSelect.options).map((o) => o.value);
expect(ids).toEqual(expect.arrayContaining(["anthropic", "minimax", "openai"]));
expect(ids).not.toContain("gemini");
expect(ids).not.toContain("deepseek");
});
// Back-compat: a template that hasn't migrated to runtime_config.providers
// (older templates, self-hosted setups without /templates server) keeps
// showing the full provider catalog. Operators picking from those
// templates can't be locked out of providers we know hermes supports.
it("hermes provider dropdown falls back to all providers when template declares no providers list", async () => {
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
// No `providers` field — empty/missing → fall back to full catalog.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return [{ id: "hermes", name: "Hermes", runtime: "hermes" }] as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
});
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
expect(providerSelect.options.length).toBe(HERMES_PROVIDERS.length);
});
// Defensive: a template's declared list with NO matches against our
// static catalog (e.g. a brand-new provider id we don't have label/
// envVar metadata for yet) must not render an empty <select> — the
// operator can't pick a provider, the form locks. Component falls
// back to the full catalog so the user can still proceed.
it("hermes provider dropdown falls back to all providers when template declares only unknown providers", async () => {
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
return [
{ id: "hermes", name: "Hermes", runtime: "hermes", providers: ["totally-new-provider-2030"] },
// eslint-disable-next-line @typescript-eslint/no-explicit-any
] as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
});
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
// Stays at full catalog length — no flapping to 0 then back.
expect(providerSelect.options.length).toBe(HERMES_PROVIDERS.length);
});
it("hermes API key field is a password input (masked)", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const keyInput = document.getElementById("hermes-api-key-input") as HTMLInputElement;
const keyInput = document.getElementById("llm-secret-input") as HTMLInputElement;
expect(keyInput).toBeTruthy();
expect(keyInput.type).toBe("password");
});
it("shows an error if hermes template is set but API key is empty on submit", async () => {
it("shows an error if the selected runtime provider requires a credential", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Hermes Agent" },
});
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "openai|OPENAI_API_KEY" },
});
// Submit without API key
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => {
const alert = screen.getByRole("alert");
expect(alert.textContent).toContain("API key");
expect(alert.textContent).toContain("Provider credential");
});
expect(mockPost).not.toHaveBeenCalled();
});
it("includes secrets in POST body with correct env var for selected provider", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Hermes Agent" },
});
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Fill in the API key
const keyInput = document.getElementById("hermes-api-key-input") as HTMLInputElement;
fireEvent.change(keyInput, { target: { value: "sk-test-anthropic-key" } });
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.secrets).toEqual({ ANTHROPIC_API_KEY: "sk-test-anthropic-key" });
expect(body.runtime).toBe("hermes");
expect(body.template).toBeUndefined();
});
it("uses the correct env var when a non-default provider is selected", async () => {
it("includes runtime-derived provider/model/secrets in POST body", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Hermes OpenAI" },
});
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Switch to openai
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
fireEvent.change(providerSelect, { target: { value: "openai" } });
const keyInput = document.getElementById("hermes-api-key-input") as HTMLInputElement;
fireEvent.change(keyInput, { target: { value: "sk-openai-test" } });
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "openai|OPENAI_API_KEY" },
});
fireEvent.change(document.getElementById("llm-secret-input") as HTMLInputElement, {
target: { value: "sk-openai-test" },
});
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.runtime).toBe("hermes");
expect(body.template).toBeUndefined();
expect(body.model).toBe("openai/gpt-4o");
expect(body.llm_provider).toBe("openai");
expect(body.secrets).toEqual({ OPENAI_API_KEY: "sk-openai-test" });
});
it("does NOT include secrets field when template is not hermes", async () => {
it("does NOT include secrets field when provider is platform-managed", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Normal Agent" },
@@ -530,20 +451,6 @@ describe("CreateWorkspaceDialog — Hermes provider picker", () => {
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.secrets).toBeUndefined();
});
it("hides hermes section and resets state when template is cleared", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Switch back to a non-Hermes runtime.
await setRuntime("claude-code");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeNull()
);
});
});
// ---------------------------------------------------------------------------
+81 -8
View File
@@ -6,6 +6,7 @@ import { useCanvasStore } from "@/store/canvas";
import { type ConfigData, DEFAULT_CONFIG, TextInput, NumberInput, Toggle, TagList, Section } from "./config/form-inputs";
import { parseYaml, toYaml } from "./config/yaml-utils";
import { SecretsSection } from "./config/secrets-section";
import { LLMBillingSection } from "./config/llm-billing-section";
import { ExternalConnectionSection } from "./ExternalConnectionSection";
import {
ProviderModelSelector,
@@ -287,6 +288,40 @@ export function deriveProvidersFromModels(models: ModelSpec[]): string[] {
return out;
}
// billingModeForProvider — maps a selected PROVIDER (vendor key) to the
// LLM billing_mode it implies (internal#703 Gap 2).
//
// Today, picking a non-Platform provider in the Config tab writes the
// credential env (CLAUDE_CODE_OAUTH_TOKEN / vendor key) but leaves
// llm_billing_mode at its resolved default (`platform_managed`). The CP
// tenant_config endpoint then keeps injecting the platform proxy base
// URLs, so the OAuth token / vendor key is never actually used — BYOK
// silently no-ops (the live SEO-Agent symptom in #703). The workspace-
// server even hard-blocks vendor-key writes on platform_managed
// workspaces (secrets.go:87), pointing the user at this exact billing-
// mode switch. Wiring the provider change to also set billing_mode is
// the UI half that makes BYOK take (the CP/workspace-server backend half
// is being fixed in parallel — internal#703 Gap 1).
//
// Mapping:
// - "platform" (the Platform-managed proxy) OR "" (no explicit
// provider override → inherit, defaults to platform) → "platform_managed".
// - any other vendor key ("anthropic-oauth" = Claude Code subscription
// OAuth, "anthropic" = Anthropic API key, "minimax", "openrouter",
// etc.) → "byok".
//
// Returns the billing_mode string the PUT body should carry. The valid
// set is fixed by workspace-server's recognizer (platform_managed | byok
// | disabled); "disabled" is never auto-selected by a provider choice —
// it's an explicit operator action via the LLM Billing section.
export type LLMBillingMode = "platform_managed" | "byok";
export function billingModeForProvider(provider: string): LLMBillingMode {
const v = provider.trim().toLowerCase();
if (v === "" || v === "platform") return "platform_managed";
return "byok";
}
// Fallback used when /templates can't be fetched (offline, older backend).
// Keep in sync with manifest.json workspace_templates as a defensive default.
// Model + env suggestions only flow when the backend is reachable.
@@ -701,6 +736,36 @@ export function ConfigTab({ workspaceId }: Props) {
}
}
// Provider → billing_mode linkage (internal#703 Gap 2). When the
// provider actually changed AND its implied billing_mode differs
// from the previously-selected provider's, push the new mode to
// the per-tenant llm-billing-mode endpoint (same path the LLM
// Billing section uses). Without this, selecting a non-Platform
// provider leaves billing_mode=platform_managed → CP keeps
// injecting the platform proxy → BYOK never takes.
//
// Gated on (a) the provider PUT having succeeded — no point setting
// byok if the credential write failed — and (b) the mode actually
// changing, so an unrelated provider tweak between two BYOK vendors
// (e.g. minimax → openrouter) doesn't re-issue a redundant
// platform_managed→byok PUT and trigger a needless restart.
let billingModeSaveError: string | null = null;
if (providerChanged && !providerSaveError) {
const nextMode = billingModeForProvider(provider);
const prevMode = billingModeForProvider(originalProvider);
if (nextMode !== prevMode) {
try {
await api.put(
`/admin/workspaces/${workspaceId}/llm-billing-mode`,
{ mode: nextMode },
);
} catch (e) {
billingModeSaveError =
e instanceof Error ? e.message : "Billing mode update was rejected";
}
}
}
setOriginalYaml(content);
if (rawMode) {
const parsed = parseYaml(content);
@@ -720,16 +785,22 @@ export function ConfigTab({ workspaceId }: Props) {
} else if (!restart) {
useCanvasStore.getState().updateNodeData(workspaceId, { needsRestart: !providerWillAutoRestart });
}
// Aggregate partial-save errors. Both modelSaveError and
// providerSaveError describe rejected updates from independent
// endpoints — show whichever fired so the user knows which
// field reverts on next reload (otherwise they'd see "Saved" and
// be confused why Provider snapped back).
// Aggregate partial-save errors. modelSaveError, providerSaveError,
// and billingModeSaveError describe rejected updates from
// independent endpoints — show whichever fired so the user knows
// which field reverts on next reload (otherwise they'd see "Saved"
// and be confused why Provider snapped back). The billing-mode case
// is the most important to surface: the provider credential saved
// but BYOK won't actually take until billing_mode flips, so a
// silent failure here is exactly the #703 "selecting a provider has
// no effect" symptom.
const partialError = providerSaveError
? `Other fields saved, but provider update failed: ${providerSaveError}`
: modelSaveError
? `Other fields saved, but model update failed: ${modelSaveError}`
: null;
: billingModeSaveError
? `Provider saved, but switching billing mode failed — your own provider key/OAuth may not take effect until billing mode is set: ${billingModeSaveError}`
: modelSaveError
? `Other fields saved, but model update failed: ${modelSaveError}`
: null;
if (partialError) {
setError(partialError);
} else {
@@ -1108,6 +1179,8 @@ export function ConfigTab({ workspaceId }: Props) {
</div>
</Section>
<LLMBillingSection workspaceId={workspaceId} />
<SecretsSection
workspaceId={workspaceId}
requiredEnv={config.runtime_config?.required_env}
@@ -0,0 +1,255 @@
// @vitest-environment jsdom
//
// Tests for the provider → llm_billing_mode linkage (internal#703 Gap 2).
//
// What this pins: when the operator changes the PROVIDER in the Config
// tab, the workspace's llm_billing_mode must follow — a non-Platform
// provider sets billing_mode=byok; Platform sets platform_managed. Before
// this wiring, selecting "Claude Code subscription (OAuth)" or any vendor
// key wrote the credential env but left billing_mode=platform_managed, so
// CP kept injecting the platform proxy base URL and the OAuth token /
// vendor key was never used — BYOK silently no-op'd (the live jrs-auto
// SEO-Agent symptom in #703).
//
// The billing-mode PUT targets the same per-tenant endpoint the LLM
// Billing section uses: PUT /admin/workspaces/:id/llm-billing-mode with
// body {mode: "byok" | "platform_managed"}.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPatch = vi.fn();
const apiPut = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
patch: (path: string, body: unknown) => apiPatch(path, body),
put: (path: string, body: unknown) => apiPut(path, body),
post: vi.fn(),
del: vi.fn(),
},
}));
const storeUpdateNodeData = vi.fn();
const storeRestartWorkspace = vi.fn();
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
(selector: (s: unknown) => unknown) =>
selector({ restartWorkspace: storeRestartWorkspace, updateNodeData: storeUpdateNodeData }),
{
getState: () => ({
restartWorkspace: storeRestartWorkspace,
updateNodeData: storeUpdateNodeData,
}),
},
),
}));
vi.mock("../AgentCardSection", () => ({
AgentCardSection: () => <div data-testid="agent-card-stub" />,
}));
import { ConfigTab, billingModeForProvider } from "../ConfigTab";
function wireApi(opts: { providerValue?: string | "missing" }) {
apiGet.mockImplementation((path: string) => {
if (path === `/workspaces/ws-test`) {
return Promise.resolve({ runtime: "hermes" });
}
if (path === `/workspaces/ws-test/model`) {
return Promise.resolve({ model: "nousresearch/hermes-4-70b" });
}
if (path === `/workspaces/ws-test/provider`) {
if (opts.providerValue === "missing") return Promise.reject(new Error("404"));
return Promise.resolve({
provider: opts.providerValue ?? "",
source: opts.providerValue ? "workspace_secrets" : "default",
});
}
if (path === `/workspaces/ws-test/files/config.yaml`) {
return Promise.resolve({ content: "name: ws\nruntime: hermes\n" });
}
if (path === "/templates") return Promise.resolve([]);
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
}
function billingModeCalls() {
return apiPut.mock.calls.filter(
([path]) => path === "/admin/workspaces/ws-test/llm-billing-mode",
);
}
beforeEach(() => {
apiGet.mockReset();
apiPatch.mockReset();
apiPut.mockReset();
storeUpdateNodeData.mockReset();
storeRestartWorkspace.mockReset();
});
describe("billingModeForProvider — pure mapping (internal#703 Gap 2)", () => {
// Platform / empty → platform_managed. Empty means "no explicit
// override → inherit", which resolves to platform on the backend, so
// it must NOT flip the workspace into byok.
it("maps Platform and empty to platform_managed", () => {
expect(billingModeForProvider("platform")).toBe("platform_managed");
expect(billingModeForProvider("")).toBe("platform_managed");
expect(billingModeForProvider(" ")).toBe("platform_managed");
expect(billingModeForProvider("PLATFORM")).toBe("platform_managed");
});
// Every non-Platform provider → byok. If this regresses to returning
// platform_managed for a vendor, BYOK silently no-ops again (#703).
it("maps non-Platform providers to byok", () => {
expect(billingModeForProvider("anthropic-oauth")).toBe("byok"); // Claude Code subscription
expect(billingModeForProvider("anthropic")).toBe("byok"); // Anthropic API key
expect(billingModeForProvider("minimax")).toBe("byok");
expect(billingModeForProvider("openrouter")).toBe("byok");
expect(billingModeForProvider("openai")).toBe("byok");
});
});
describe("ConfigTab — provider change drives billing_mode (internal#703 Gap 2)", () => {
// The core fix: picking a non-Platform provider (here "anthropic-oauth"
// = Claude Code subscription OAuth) from a fresh/empty provider must
// PUT mode=byok to the per-tenant llm-billing-mode endpoint. This is
// the exact path that was missing — the credential env saved but the
// billing mode never followed, so the proxy stayed engaged.
it("PUTs mode=byok when switching to a non-Platform provider", async () => {
wireApi({ providerValue: "" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic-oauth" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
const calls = billingModeCalls();
expect(calls.length).toBe(1);
expect(calls[0][1]).toEqual({ mode: "byok" });
});
// Provider credential PUT still happens too (independent endpoint).
expect(
apiPut.mock.calls.some(([path]) => path === "/workspaces/ws-test/provider"),
).toBe(true);
});
// Switching FROM a byok provider back TO Platform must PUT
// mode=platform_managed so the workspace re-engages the proxy and stops
// expecting a (now-absent) vendor key.
it("PUTs mode=platform_managed when switching back to Platform", async () => {
wireApi({ providerValue: "anthropic-oauth" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
await waitFor(() => expect((input as HTMLInputElement).value).toBe("anthropic-oauth"));
fireEvent.change(input, { target: { value: "platform" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
const calls = billingModeCalls();
expect(calls.length).toBe(1);
expect(calls[0][1]).toEqual({ mode: "platform_managed" });
});
});
// Changing between two BYOK vendors (minimax → openrouter) keeps
// billing_mode=byok — the implied mode is unchanged, so re-PUTing it
// would be a wasteful no-op that risks an extra restart. Must NOT fire.
it("does NOT PUT billing-mode when the implied mode is unchanged", async () => {
wireApi({ providerValue: "minimax" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
await waitFor(() => expect((input as HTMLInputElement).value).toBe("minimax"));
fireEvent.change(input, { target: { value: "openrouter" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
// Provider PUT fires (vendor changed)...
expect(
apiPut.mock.calls.some(([path]) => path === "/workspaces/ws-test/provider"),
).toBe(true);
});
// ...but billing-mode does NOT (byok → byok is a no-op).
expect(billingModeCalls().length).toBe(0);
});
// A Save that doesn't touch the provider must not PUT billing-mode —
// editing tier/name shouldn't disturb the workspace's billing mode.
it("does NOT PUT billing-mode on a Save that leaves provider unchanged", async () => {
wireApi({ providerValue: "anthropic-oauth" });
apiPut.mockResolvedValue({ status: "saved" });
render(<ConfigTab workspaceId="ws-test" />);
await screen.findByTestId("provider-input");
// Dirty an unrelated field so Save is enabled.
const tierSelect = screen.getByLabelText(/tier/i) as HTMLSelectElement;
fireEvent.change(tierSelect, { target: { value: "3" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
// Some PUT may fire (e.g. /model); just assert billing-mode did not.
expect(billingModeCalls().length).toBe(0);
});
});
// If the provider credential PUT itself fails, we must NOT set byok —
// flipping billing_mode while the credential write failed would leave
// the workspace expecting a key it doesn't have (worse than no-op).
it("does NOT PUT billing-mode when the provider PUT fails", async () => {
wireApi({ providerValue: "" });
apiPut.mockImplementation((path: string) => {
if (path === "/workspaces/ws-test/provider") return Promise.reject(new Error("boom"));
return Promise.resolve({ status: "saved" });
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic-oauth" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
// The provider-failure error is surfaced (getByText throws if absent).
expect(screen.getByText(/provider update failed/i)).toBeTruthy();
});
expect(billingModeCalls().length).toBe(0);
});
// If the credential saved but the billing-mode PUT is rejected, the
// user must be warned that BYOK may not take — a silent failure here
// is precisely the #703 symptom we're fixing.
it("surfaces an error when billing-mode PUT fails after a successful provider save", async () => {
wireApi({ providerValue: "" });
apiPut.mockImplementation((path: string) => {
if (path === "/admin/workspaces/ws-test/llm-billing-mode") {
return Promise.reject(new Error("403 forbidden"));
}
return Promise.resolve({ status: "saved" });
});
render(<ConfigTab workspaceId="ws-test" />);
const input = await screen.findByTestId("provider-input");
fireEvent.change(input, { target: { value: "anthropic-oauth" } });
fireEvent.click(screen.getByRole("button", { name: /^save$/i }));
await waitFor(() => {
expect(screen.getByText(/switching billing mode failed/i)).toBeTruthy();
});
});
});
@@ -0,0 +1,176 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import {
render,
screen,
waitFor,
cleanup,
fireEvent,
} from "@testing-library/react";
import { LLMBillingSection } from "../llm-billing-section";
// Tests for LLMBillingSection (internal#691). Locks in:
// - the section renders the resolved mode + source label
// - the dropdown maps "inherit" → PUT {mode: null}
// - the dropdown maps "byok" → PUT {mode: "byok"}
// - a garbled override surfaces the warning banner
// - the post-write resolution updates the UI without a refetch
const apiGet = vi.fn();
const apiPut = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (...args: unknown[]) => apiGet(...args),
put: (...args: unknown[]) => apiPut(...args),
post: vi.fn().mockResolvedValue({}),
del: vi.fn().mockResolvedValue({}),
patch: vi.fn().mockResolvedValue({}),
},
}));
// Collapsed-by-default Section wrapper would hide the content; replace
// it with a passthrough so the dropdown is reachable in the test DOM.
vi.mock("../form-inputs", async () => {
const actual = await vi.importActual<typeof import("../form-inputs")>(
"../form-inputs",
);
return {
...actual,
Section: ({ children }: { children: React.ReactNode }) => (
<div>{children}</div>
),
};
});
beforeEach(() => {
vi.clearAllMocks();
});
afterEach(() => {
cleanup();
});
describe("LLMBillingSection — internal#691", () => {
it("renders the resolved mode + source for an inherited workspace", async () => {
apiGet.mockResolvedValueOnce({
workspace_id: "ws-1",
resolved_mode: "platform_managed",
workspace_override: null,
org_default: "platform_managed",
source: "org_default",
});
render(<LLMBillingSection workspaceId="ws-1" />);
await waitFor(() => {
expect(apiGet).toHaveBeenCalledWith(
"/admin/workspaces/ws-1/llm-billing-mode",
);
});
// Resolved mode appears.
expect(screen.getByText(/Resolved mode:/i).textContent).toMatch(/platform_managed/);
// Source label appears.
expect(
screen.getByText(/inherited from org default/i),
).toBeTruthy();
});
it('PUTs {mode: "byok"} when user picks BYOK and reflects the new resolution', async () => {
apiGet.mockResolvedValueOnce({
workspace_id: "ws-2",
resolved_mode: "platform_managed",
workspace_override: null,
org_default: "platform_managed",
source: "org_default",
});
apiPut.mockResolvedValueOnce({
workspace_id: "ws-2",
resolved_mode: "byok",
workspace_override: "byok",
org_default: "platform_managed",
source: "workspace_override",
});
render(<LLMBillingSection workspaceId="ws-2" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const select = (await screen.findByLabelText(
/llm billing mode override/i,
)) as HTMLSelectElement;
fireEvent.change(select, { target: { value: "byok" } });
await waitFor(() => {
expect(apiPut).toHaveBeenCalledWith(
"/admin/workspaces/ws-2/llm-billing-mode",
{ mode: "byok" },
);
});
// Post-write resolution propagated to UI.
await waitFor(() => {
expect(
screen.getByText(/explicit override on this workspace/i),
).toBeTruthy();
});
});
it("PUTs {mode: null} when user picks Inherit (clears the override)", async () => {
apiGet.mockResolvedValueOnce({
workspace_id: "ws-3",
resolved_mode: "byok",
workspace_override: "byok",
org_default: "platform_managed",
source: "workspace_override",
});
apiPut.mockResolvedValueOnce({
workspace_id: "ws-3",
resolved_mode: "platform_managed",
workspace_override: null,
org_default: "platform_managed",
source: "org_default",
});
render(<LLMBillingSection workspaceId="ws-3" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
const select = (await screen.findByLabelText(
/llm billing mode override/i,
)) as HTMLSelectElement;
fireEvent.change(select, { target: { value: "inherit" } });
await waitFor(() => {
expect(apiPut).toHaveBeenCalledWith(
"/admin/workspaces/ws-3/llm-billing-mode",
{ mode: null },
);
});
});
it("surfaces a warning banner when the override value is garbled", async () => {
apiGet.mockResolvedValueOnce({
workspace_id: "ws-4",
resolved_mode: "platform_managed", // resolver fell through, default-closed
workspace_override: "byokk", // typo persisted somehow
org_default: "platform_managed",
source: "org_default",
});
render(<LLMBillingSection workspaceId="ws-4" />);
await waitFor(() => {
expect(
screen.getByText(/non-standard value/i),
).toBeTruthy();
});
});
it("renders an error banner when the GET fails", async () => {
apiGet.mockRejectedValueOnce(new Error("network down"));
render(<LLMBillingSection workspaceId="ws-5" />);
await waitFor(() => {
expect(screen.getByText(/network down/i)).toBeTruthy();
});
});
});
@@ -1,3 +1,4 @@
export { type ConfigData, DEFAULT_CONFIG, TextInput, NumberInput, Toggle, TagList, Section } from "./form-inputs";
export { parseYaml, toYaml } from "./yaml-utils";
export { SecretsSection } from "./secrets-section";
export { LLMBillingSection } from "./llm-billing-section";
@@ -0,0 +1,219 @@
"use client";
// llm-billing-section.tsx — Config-tab section for the per-workspace
// llm_billing_mode override (internal#691).
//
// Surfaces:
// - The currently RESOLVED mode for this workspace (the mode the
// workspace-server's strip gate will use at next provision).
// - The org-level default (so the user sees what they're inheriting).
// - A dropdown to set / clear the workspace-level override.
// - A "source" line so operators can answer "is this inherited or
// explicit?" without DB archeology (RFC Observability hot-spot).
//
// Hits:
// GET /admin/workspaces/:id/llm-billing-mode — read resolution
// PUT /admin/workspaces/:id/llm-billing-mode — write {mode: "..."|null}
//
// Both routes are on the per-tenant workspace-server (same origin as the
// other canvas /admin calls). CP's proxy at /cp/admin/workspaces/:id/
// llm-billing-mode exists for ops use; the canvas uses the per-tenant
// path directly to keep the round-trip cheap.
import { useState, useEffect, useCallback } from "react";
import { api } from "@/lib/api";
import { Section } from "./form-inputs";
// Mirrors workspace-server/internal/handlers/llm_billing_mode.go::BillingModeResolution.
// Kept as a literal shape (not imported) because canvas has no Go-type bridge.
export interface BillingModeResolution {
workspace_id: string;
resolved_mode: "platform_managed" | "byok" | "disabled";
// Pointer-typed on the Go side: nil = inherit, non-nil = the raw
// workspace-level override (even if garbled and falling through).
workspace_override: string | null;
org_default: "platform_managed" | "byok" | "disabled";
source: "workspace_override" | "org_default" | "constant_fallback";
}
// The dropdown emits one of these values. "inherit" is the UX-only label
// that maps to a `null` body in the PUT request.
type DropdownChoice = "inherit" | "platform_managed" | "byok" | "disabled";
interface Props {
workspaceId: string;
}
const MODE_LABELS: Record<DropdownChoice, string> = {
inherit: "Inherit from org default",
platform_managed: "Platform-managed (uses Molecule credits)",
byok: "BYOK (your own OAuth / vendor keys)",
disabled: "Disabled (no LLM access)",
};
const MODE_DESCRIPTIONS: Record<DropdownChoice, string> = {
inherit:
"Use whichever mode is set at the organization level. Recommended unless this specific workspace needs a different billing source.",
platform_managed:
"Strip CLAUDE_CODE_OAUTH_TOKEN and vendor API keys from the workspace; route all LLM traffic through Molecule's proxy and bill your org credits.",
byok:
"Keep CLAUDE_CODE_OAUTH_TOKEN / vendor API keys in the workspace; LLM traffic goes directly to your provider and is billed to your OAuth subscription or API account.",
disabled:
"Block all LLM access for this workspace. Useful for sandbox workspaces that should not consume credits or hit external providers.",
};
const SOURCE_LABELS: Record<BillingModeResolution["source"], string> = {
workspace_override: "explicit override on this workspace",
org_default: "inherited from org default",
constant_fallback:
"fallback (workspace + org defaults missing or unrecognized — defaulted to platform_managed)",
};
export function LLMBillingSection({ workspaceId }: Props) {
const [resolution, setResolution] = useState<BillingModeResolution | null>(
null,
);
const [loading, setLoading] = useState(true);
const [saving, setSaving] = useState(false);
const [error, setError] = useState<string | null>(null);
const [success, setSuccess] = useState(false);
const load = useCallback(async () => {
setLoading(true);
setError(null);
try {
const res = await api.get<BillingModeResolution>(
`/admin/workspaces/${workspaceId}/llm-billing-mode`,
);
setResolution(res);
} catch (e) {
setError(e instanceof Error ? e.message : "Failed to load billing mode");
} finally {
setLoading(false);
}
}, [workspaceId]);
useEffect(() => {
void load();
}, [load]);
// Current dropdown selection is derived from the resolution. If the
// override is null, we show "inherit"; otherwise we mirror the raw
// workspace_override (NOT resolved_mode — that would conflate "explicit
// platform_managed override" with "inherit while org happens to be
// platform_managed", which has different semantics on the write side).
const currentChoice: DropdownChoice = (() => {
if (!resolution) return "inherit";
if (resolution.workspace_override == null) return "inherit";
const raw = resolution.workspace_override;
if (raw === "platform_managed" || raw === "byok" || raw === "disabled") {
return raw;
}
// Garbled value persisted via some external write. Show inherit so
// the user can pick a clean value; on save they'll either clear it
// (PUT null) or overwrite it with a valid one.
return "inherit";
})();
const handleChange = async (choice: DropdownChoice) => {
if (!resolution) return;
setSaving(true);
setError(null);
setSuccess(false);
try {
// "inherit" → PUT {mode: null}; otherwise → PUT {mode: choice}.
const body = choice === "inherit" ? { mode: null } : { mode: choice };
const updated = await api.put<BillingModeResolution>(
`/admin/workspaces/${workspaceId}/llm-billing-mode`,
body,
);
setResolution(updated);
setSuccess(true);
setTimeout(() => setSuccess(false), 2000);
} catch (e) {
setError(e instanceof Error ? e.message : "Failed to update billing mode");
} finally {
setSaving(false);
}
};
return (
<Section title="LLM Billing" defaultOpen={false}>
{loading && (
<div className="text-[10px] text-ink-mid">Loading billing mode</div>
)}
{error && (
<div
role="alert"
aria-live="assertive"
className="px-2 py-1 bg-red-900/30 border border-red-800 rounded text-[10px] text-bad mb-2"
>
{error}
</div>
)}
{resolution && (
<div className="space-y-2">
<div className="text-[10px] text-ink-mid">
Resolved mode: <strong className="text-ink">{resolution.resolved_mode}</strong>{" "}
<span className="text-ink-mid">
({SOURCE_LABELS[resolution.source]})
</span>
</div>
<div className="text-[10px] text-ink-mid">
Org default: <span className="text-ink">{resolution.org_default}</span>
</div>
<label
className="block text-[10px] text-ink-mid"
htmlFor={`llm-billing-mode-${workspaceId}`}
>
Override
</label>
<select
id={`llm-billing-mode-${workspaceId}`}
aria-label="LLM billing mode override"
value={currentChoice}
disabled={saving}
onChange={(e) => void handleChange(e.target.value as DropdownChoice)}
className="w-full bg-surface-card border border-line rounded p-1 text-[10px] text-ink focus:outline-none focus:border-accent disabled:opacity-50"
>
{(Object.keys(MODE_LABELS) as DropdownChoice[]).map((m) => (
<option key={m} value={m}>
{MODE_LABELS[m]}
</option>
))}
</select>
<div
className="text-[10px] text-ink-mid leading-snug"
aria-live="polite"
>
{MODE_DESCRIPTIONS[currentChoice]}
</div>
{success && (
<div className="mt-1 px-2 py-1 bg-green-900/30 border border-green-800 rounded text-[10px] text-good">
Updated. Restart the workspace to apply.
</div>
)}
{resolution.workspace_override != null &&
!["platform_managed", "byok", "disabled"].includes(
resolution.workspace_override,
) && (
<div
role="alert"
className="mt-1 px-2 py-1 bg-yellow-900/30 border border-yellow-800 rounded text-[10px] text-warning"
>
Workspace override has a non-standard value (
<code>{resolution.workspace_override}</code>) and is being
ignored. Pick a valid mode above to clear the corrupt value.
</div>
)}
</div>
)}
</Section>
);
}
@@ -335,6 +335,7 @@ func (m *Manager) HandleInbound(ctx context.Context, ch ChannelRow, msg *Inbound
})
if marshalErr != nil {
log.Printf("Channels %s: json.Marshal a2aBody failed: %v", ch.ChannelType, marshalErr)
return fmt.Errorf("marshal a2a body: %w", marshalErr)
}
callerID := "channel:" + ch.ChannelType
@@ -676,6 +677,7 @@ func (m *Manager) appendHistory(ctx context.Context, key string, username, userM
})
if marshalErr != nil {
log.Printf("appendHistory %s: json.Marshal entry failed: %v", key, marshalErr)
return
}
db.RDB.LPush(ctx, key, string(entry))
db.RDB.LTrim(ctx, key, 0, int64(maxHistoryEntries-1))
@@ -163,6 +163,7 @@ func (s *SlackAdapter) sendBotMessage(ctx context.Context, config map[string]int
body, marshalErr := json.Marshal(payload)
if marshalErr != nil {
log.Printf("slack SendMessage: json.Marshal payload failed: %v", marshalErr)
return fmt.Errorf("slack: marshal payload: %w", marshalErr)
}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, "https://slack.com/api/chat.postMessage", bytes.NewReader(body))
if err != nil {
@@ -482,12 +482,14 @@ func (t *TelegramAdapter) StartPolling(ctx context.Context, config map[string]in
if apiErr.Code == 429 {
retryAfter := time.Duration(apiErr.RetryAfter) * time.Second
log.Printf("Channels: Telegram poll rate-limited, sleeping %s", retryAfter)
timer := time.NewTimer(retryAfter)
select {
case <-ctx.Done():
timer.Stop()
return nil
case <-time.After(retryAfter):
continue
case <-timer.C:
}
continue
}
if apiErr.Code == 401 {
invalidateBot(token)
@@ -495,12 +497,14 @@ func (t *TelegramAdapter) StartPolling(ctx context.Context, config map[string]in
}
}
log.Printf("Channels: Telegram poll error: %v", err)
timer := time.NewTimer(telegramPollInterval)
select {
case <-ctx.Done():
timer.Stop()
return nil
case <-time.After(telegramPollInterval):
continue
case <-timer.C:
}
continue
}
for _, update := range updates {
@@ -426,16 +426,34 @@ func nilIfEmpty(s string) *string {
// (their next /registry/register will mint their first token, after
// which this branch never fires again for them).
//
// Post-RFC#637 addition: when the tokenless workspace is accompanied by
// canvas or admin auth (same-origin request, admin bearer, or org-level
// token), the caller is identified as a canvas-user identity rather than
// a legacy peer agent. The returned isCanvasUser flag lets the A2A proxy
// bypass CanCommunicate for human users, who sit outside the workspace
// hierarchy.
// Post-RFC#637 addition: a request may instead be carrying a HUMAN's
// canvas-user identity (e.g. the 344a2623-… identity workspace from the
// RFC#637 rollout). That human sits OUTSIDE the workspace org hierarchy, so
// the returned isCanvasUser flag lets the A2A proxy bypass CanCommunicate for
// it. Canvas-user classification is decided by isGenuineCanvasUser using
// NON-FORGEABLE credentials only (see that function) — never by the caller's
// X-Workspace-ID alone, and never by a bare same-origin Host/Referer in a
// SaaS image (those are forgeable; see middleware.IsSameOriginCanvas).
//
// #1673: this canvas-user check is now evaluated BEFORE the HasAnyLiveToken
// peer-token contract. Previously it lived only in the !hasLive branch, so a
// canvas-user identity workspace that had acquired live tokens fell into the
// hasLive=true branch, which demands a bearer the canvas frontend never sends
// → silent 401 → the message was dropped before logA2AReceiveQueued wrote the
// activity_logs row, breaking canvas chat for poll-mode workspaces. A genuine
// canvas user is identified by the human's session/admin/org credential, which
// is independent of whether the identity workspace happens to hold peer tokens.
//
// On auth failure this writes the 401 via c and returns an error so the
// handler aborts without running the proxy.
func validateCallerToken(ctx context.Context, c *gin.Context, callerID string) (isCanvasUser bool, err error) {
// Genuine canvas-user identity? Decided independently of the caller
// workspace's token state (the #1673 fix) and using only non-forgeable
// signals (the #1944 escalation guard).
if isGenuineCanvasUser(ctx, c) {
return true, nil
}
hasLive, dbErr := wsauth.HasAnyLiveToken(ctx, db.DB, callerID)
if dbErr != nil {
// Fail-open here matches the heartbeat path — A2A caller auth is
@@ -446,22 +464,10 @@ func validateCallerToken(ctx context.Context, c *gin.Context, callerID string) (
return false, nil
}
if !hasLive {
// Tokenless workspace — could be legacy/pre-upgrade caller or
// canvas-user identity. Distinguish by request auth signals.
if middleware.IsSameOriginCanvas(c) {
return true, nil
}
tok := wsauth.BearerTokenFromHeader(c.GetHeader("Authorization"))
if tok != "" {
adminSecret := os.Getenv("ADMIN_TOKEN")
if adminSecret != "" && subtle.ConstantTimeCompare([]byte(tok), []byte(adminSecret)) == 1 {
return true, nil
}
if _, _, _, err := orgtoken.Validate(ctx, db.DB, tok); err == nil {
return true, nil
}
}
return false, nil // legacy / pre-upgrade caller
// Tokenless, non-canvas-user workspace — legacy / pre-upgrade peer.
// Grandfather it through (its next /registry/register mints its
// first token, after which it lands in the hasLive=true branch).
return false, nil
}
tok := wsauth.BearerTokenFromHeader(c.GetHeader("Authorization"))
if tok == "" {
@@ -475,6 +481,61 @@ func validateCallerToken(ctx context.Context, c *gin.Context, callerID string) (
return false, nil
}
// isGenuineCanvasUser reports whether the request is a real human acting
// through the canvas UI (RFC#637 canvas-user identity), as opposed to a peer
// workspace agent. A true result lets the A2A proxy bypass CanCommunicate, so
// it MUST only accept signals an attacker on the platform network cannot forge:
//
// - A control-plane-verified canvas session: the WorkOS session cookie is
// confirmed upstream to belong to a MEMBER of THIS tenant's org
// (middleware.IsVerifiedCanvasSession → /cp/auth/tenant-member). This is
// the production SaaS canvas path.
// - An Authorization: Bearer matching ADMIN_TOKEN (break-glass / molecli).
// - An Authorization: Bearer matching a live org_api_tokens row (user-minted
// org-scoped API token).
//
// Deliberately NOT accepted as a canvas-user signal in a SaaS image:
//
// - A bare same-origin Host/Referer/Origin (middleware.IsSameOriginCanvas).
// Those headers are trivially forgeable by any container on the Docker
// network, and the combined-tenant image (CANVAS_PROXY_URL set) is exactly
// where a forged Referer + an arbitrary X-Workspace-ID could otherwise
// bypass CanCommunicate and reach cross-workspace A2A — the PR #1944
// privilege escalation. Same-origin is only honored as a fallback when CP
// session verification is NOT configured (self-hosted / dev), a
// single-tenant topology with no cross-tenant boundary to escalate across;
// even there the org hierarchy still owns intra-org routing.
//
// Note this classification is about the human's credential, not the caller
// workspace's X-Workspace-ID — so it never trusts an attacker-supplied caller
// ID, and it is independent of whether that workspace holds peer tokens.
func isGenuineCanvasUser(ctx context.Context, c *gin.Context) bool {
// Production SaaS: control-plane-verified org-member session cookie.
if middleware.IsVerifiedCanvasSession(c) {
return true
}
if tok := wsauth.BearerTokenFromHeader(c.GetHeader("Authorization")); tok != "" {
adminSecret := os.Getenv("ADMIN_TOKEN")
if adminSecret != "" && subtle.ConstantTimeCompare([]byte(tok), []byte(adminSecret)) == 1 {
return true
}
if _, _, _, err := orgtoken.Validate(ctx, db.DB, tok); err == nil {
return true
}
}
// Self-hosted / dev fallback ONLY: when upstream session verification is
// not configured there is no verified-cookie signal to use, and the
// deployment is single-tenant, so the forgeable same-origin check is an
// acceptable canvas signal. In SaaS (CP session configured) this branch is
// skipped, closing the forged-same-origin escalation.
if !middleware.CPSessionConfigured() && middleware.IsSameOriginCanvas(c) {
return true
}
return false
}
// errInvalidCallerToken is a sentinel for validateCallerToken's "missing
// token" branch so the handler-level guard can detect it without string
// matching (the wsauth errors are typed for the invalid case).
@@ -11,6 +11,7 @@ import (
"net/http"
"net/http/httptest"
"os"
"os/exec"
"strings"
"testing"
"time"
@@ -1244,13 +1245,12 @@ func TestValidateCallerToken_WrongWorkspaceBindingRejected(t *testing.T) {
}
func TestValidateCallerToken_CanvasUser_AdminToken(t *testing.T) {
mock := setupTestDB(t)
setupTestDB(t)
setupTestRedis(t)
// Tokenless workspace
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs("ws-canvas-admin").
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(0))
// #1673/#1944: the genuine-canvas-user check (admin bearer here) now runs
// BEFORE HasAnyLiveToken, so no SELECT COUNT(*) is issued — the human's
// credential, not the caller workspace's token state, decides canvas-user.
t.Setenv("ADMIN_TOKEN", "admin-secret-42")
@@ -1276,10 +1276,9 @@ func TestValidateCallerToken_CanvasUser_OrgToken(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
// Tokenless workspace
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs("ws-canvas-org").
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(0))
// #1673/#1944: the genuine-canvas-user check (org token here) now runs
// BEFORE HasAnyLiveToken, so the first DB query is orgtoken.Validate's
// lookup — there is no SELECT COUNT(*) expectation anymore.
// orgtoken.Validate lookup
mock.ExpectQuery(`SELECT id, prefix, org_id FROM org_api_tokens WHERE token_hash = .* AND revoked_at IS NULL`).
@@ -2341,6 +2340,197 @@ func TestProxyA2A_PollMode_ShortCircuits_NoSSRF_NoDispatch(t *testing.T) {
}
}
// stubVerifiedCPSession points VerifiedCPSession at a stub control-plane that
// confirms the given cookie belongs to a tenant-member, so tests can exercise
// the genuine (non-forgeable) canvas-session path end-to-end without a live CP.
// It sets CP_UPSTREAM_URL + MOLECULE_ORG_SLUG for the test's lifetime; the
// real middleware.VerifiedCPSession HTTP+cache code path runs unchanged.
func stubVerifiedCPSession(t *testing.T, member bool) {
t.Helper()
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
if member {
fmt.Fprint(w, `{"member":true,"user_id":"user-canvas-1"}`)
} else {
w.WriteHeader(http.StatusForbidden)
fmt.Fprint(w, `{"member":false}`)
}
}))
t.Cleanup(srv.Close)
t.Setenv("CP_UPSTREAM_URL", srv.URL)
t.Setenv("MOLECULE_ORG_SLUG", "test-tenant")
}
// TestProxyA2A_PollMode_CanvasUserWithVerifiedSession is the #1673 regression
// guard. A poll-mode canvas-user identity workspace that HAS acquired live
// tokens (the exact condition that made #1673 fire) sends a canvas message
// carrying a control-plane-verified session cookie but no bearer token. The
// fix must classify it as a canvas user BEFORE the HasAnyLiveToken peer-token
// contract, so the request is queued (200) and logA2AReceiveQueued writes the
// activity_logs row — instead of the pre-fix silent 401 that dropped the
// message before any row landed (breaking canvas chat + chat-history).
//
// Runs in a subprocess with CANVAS_PROXY_URL set so middleware.canvasProxyActive
// is true at package-init time (matching the combined-tenant image), proving the
// fix does not depend on disabling same-origin detection.
func TestProxyA2A_PollMode_CanvasUserWithVerifiedSession(t *testing.T) {
if os.Getenv("CANVAS_PROXY_URL") == "" {
cmd := exec.Command(os.Args[0], "-test.run=^TestProxyA2A_PollMode_CanvasUserWithVerifiedSession$", "-test.v")
cmd.Env = append(os.Environ(), "CANVAS_PROXY_URL=http://localhost")
out, err := cmd.CombinedOutput()
if err != nil {
t.Fatalf("subprocess test failed: %v\n%s", err, out)
}
return
}
stubVerifiedCPSession(t, true)
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
const wsTarget = "ws-poll-canvas-target"
const wsCanvasUser = "ws-canvas-user-344a"
// CRUCIAL: no SELECT COUNT(*) FROM workspace_auth_tokens expectation. The
// genuine-canvas-user check (verified session) must short-circuit BEFORE
// HasAnyLiveToken — that is the #1673 regression path. An identity
// workspace that already holds live tokens must NOT fall into the
// hasLive=true bearer-required branch.
// isCanvasUser=true → CanCommunicate is skipped (no parent_id lookups).
expectBudgetCheck(mock, wsTarget)
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WithArgs(wsTarget).
WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("poll"))
// logA2AReceiveQueued must fire synchronously and write the row.
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: wsTarget}}
body := `{"jsonrpc":"2.0","id":"canvas-1","method":"message/send","params":{"message":{"role":"user","parts":[{"text":"hello from canvas"}]}}}`
req := httptest.NewRequest("POST", "/workspaces/"+wsTarget+"/a2a", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req.Header.Set("X-Workspace-ID", wsCanvasUser)
// Verified canvas session cookie (the genuine, non-forgeable signal).
req.Header.Set("Cookie", "wos-session=valid-canvas-session-cookie")
// Same-origin headers, present as a real canvas request would send them —
// but they are NOT what authorizes the bypass here (the verified session is).
req.Host = "localhost"
req.Header.Set("Referer", "https://localhost/")
c.Request = req
handler.ProxyA2A(c)
time.Sleep(50 * time.Millisecond)
if w.Code != http.StatusOK {
t.Fatalf("expected 200 (queued) for canvas-user with verified session, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response is not valid JSON: %v", err)
}
if resp["status"] != "queued" {
t.Errorf("response.status = %v, want %q", resp["status"], "queued")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations (activity_logs row must be written): %v", err)
}
}
// TestProxyA2A_ForgedSameOrigin_CannotBypassCanCommunicate is the security
// crux of the #1673 fix and the reason PR #1944 was held. In the combined-
// tenant SaaS image (CANVAS_PROXY_URL set, CP session verification configured),
// an attacker forges a same-origin request — correct Host + a matching
// `Referer: https://<host>/` — and supplies an arbitrary X-Workspace-ID naming
// a workspace it does not control, targeting a workspace it is NOT authorized
// to reach. It presents NO verified session cookie, NO admin token, NO org
// token.
//
// PR #1944's same-origin bypass would have classified this as a canvas user and
// skipped CanCommunicate, granting cross-workspace A2A — a privilege
// escalation. The safe fix must instead fall through to the standard
// peer-token contract and CanCommunicate, which rejects the cross-hierarchy
// call with 403. This test proves the escalation is closed.
func TestProxyA2A_ForgedSameOrigin_CannotBypassCanCommunicate(t *testing.T) {
if os.Getenv("CANVAS_PROXY_URL") == "" {
cmd := exec.Command(os.Args[0], "-test.run=^TestProxyA2A_ForgedSameOrigin_CannotBypassCanCommunicate$", "-test.v")
cmd.Env = append(os.Environ(), "CANVAS_PROXY_URL=http://localhost")
out, err := cmd.CombinedOutput()
if err != nil {
t.Fatalf("subprocess test failed: %v\n%s", err, out)
}
return
}
// SaaS image with CP session verification configured. The stub CP rejects
// any cookie as a non-member; the attacker sends none anyway. This asserts
// that with verification configured, same-origin alone is NOT a canvas
// signal (CPSessionConfigured()==true disables the dev fallback).
stubVerifiedCPSession(t, false)
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
const wsTarget = "ws-victim-target"
const wsForgedCaller = "ws-attacker-caller"
// validateCallerToken: not a genuine canvas user (no verified session, no
// admin/org token, and the dev same-origin fallback is disabled in SaaS).
// So it consults the peer-token contract: HasAnyLiveToken for the forged
// caller. Return 0 → tokenless legacy peer → grandfathered through token
// validation (isCanvasUser stays false). The request must then still be
// gated by CanCommunicate.
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs(wsForgedCaller).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(0))
// CanCommunicate MUST run (the escalation guard) and DENY: caller and
// target sit under different parents.
mockCanCommunicate(mock, wsForgedCaller, wsTarget, false)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: wsTarget}}
body := `{"jsonrpc":"2.0","id":"exploit-1","method":"message/send","params":{"message":{"role":"user","parts":[{"text":"cross-workspace exploit"}]}}}`
req := httptest.NewRequest("POST", "/workspaces/"+wsTarget+"/a2a", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
// Arbitrary caller workspace the attacker does not own.
req.Header.Set("X-Workspace-ID", wsForgedCaller)
// Forged same-origin signals (the #1944 bypass vector).
req.Host = "localhost"
req.Header.Set("Referer", "https://localhost/")
req.Header.Set("Origin", "https://localhost")
// No Cookie / Authorization — no genuine canvas credential.
c.Request = req
handler.ProxyA2A(c)
if w.Code != http.StatusForbidden {
t.Fatalf("ESCALATION NOT CLOSED: forged same-origin + arbitrary X-Workspace-ID "+
"reached an unauthorized target with status %d (want 403): %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("body not JSON: %v", err)
}
if !strings.Contains(fmt.Sprint(resp["error"]), "access denied") {
t.Errorf("expected an access-denied error from CanCommunicate, got %v", resp["error"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations — CanCommunicate must have been consulted: %v", err)
}
}
// TestProxyA2A_PushMode_NoShortCircuit verifies the symmetric contract:
// a push-mode workspace (default) is NOT affected by the new short-circuit.
// It still proceeds to resolveAgentURL + dispatch. Without this guard, a
@@ -425,6 +425,7 @@ func (h *WorkspaceHandler) stitchDrainResponseToDelegation(ctx context.Context,
})
if marshalErr != nil {
log.Printf("a2aQueue stitch %s: json.Marshal respJSON failed: %v", delegationID, marshalErr)
return
}
res, err := db.DB.ExecContext(ctx, `
UPDATE activity_logs
@@ -153,7 +153,15 @@ func queueRowAuthFields(ctx context.Context, queueID string) (callerID, workspac
if err != nil {
return "", "", err
}
return callerNS.String, workspaceNS.String, nil
callerID = ""
if callerNS.Valid {
callerID = callerNS.String
}
workspaceID = ""
if workspaceNS.Valid {
workspaceID = workspaceNS.String
}
return callerID, workspaceID, nil
}
// GetA2AQueueStatus handles GET /workspaces/:id/a2a/queue/:queue_id.
@@ -1,9 +1,62 @@
package handlers
import (
"context"
"testing"
"github.com/DATA-DOG/go-sqlmock"
)
// TestQueueRowAuthFields_NilSafeScan proves queueRowAuthFields returns empty
// strings (not a panic / garbage) when the a2a_queue row has NULL caller_id
// or workspace_id. Before the fix it dereferenced NullString.String directly,
// which is only the zero value when Valid is false but masked the NULL-vs-""
// distinction; the guard makes the intent explicit and safe.
func TestQueueRowAuthFields_NilSafeScan(t *testing.T) {
mock := setupTestDB(t)
queueID := "queue-123"
mock.ExpectQuery(`SELECT caller_id, workspace_id FROM a2a_queue WHERE id = \$1`).
WithArgs(queueID).
WillReturnRows(sqlmock.NewRows([]string{"caller_id", "workspace_id"}).AddRow(nil, nil))
caller, workspace, err := queueRowAuthFields(context.Background(), queueID)
if err != nil {
t.Fatalf("queueRowAuthFields returned error: %v", err)
}
if caller != "" {
t.Errorf("callerID = %q, want empty string for NULL caller_id", caller)
}
if workspace != "" {
t.Errorf("workspaceID = %q, want empty string for NULL workspace_id", workspace)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestQueueRowAuthFields_PopulatedRow confirms the non-NULL path still returns
// the scanned values unchanged.
func TestQueueRowAuthFields_PopulatedRow(t *testing.T) {
mock := setupTestDB(t)
queueID := "queue-456"
mock.ExpectQuery(`SELECT caller_id, workspace_id FROM a2a_queue WHERE id = \$1`).
WithArgs(queueID).
WillReturnRows(sqlmock.NewRows([]string{"caller_id", "workspace_id"}).AddRow("caller-x", "ws-y"))
caller, workspace, err := queueRowAuthFields(context.Background(), queueID)
if err != nil {
t.Fatalf("queueRowAuthFields returned error: %v", err)
}
if caller != "caller-x" || workspace != "ws-y" {
t.Fatalf("got caller=%q workspace=%q, want caller-x / ws-y", caller, workspace)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestExtractExpiresInSeconds covers the JSON parser used at enqueue time
// to honor a caller-specified TTL. Zero return = "no TTL" — caller leaves
// expires_at NULL on the queue row.
@@ -167,6 +167,7 @@ func (w *AgentMessageWriter) Send(
respJSON, marshalErr := json.Marshal(respPayload)
if marshalErr != nil {
log.Printf("AgentMessageWriter %s: json.Marshal respPayload failed: %v", workspaceID, marshalErr)
return nil
}
preview := textutil.TruncateRunes(message, 80)
if _, err := w.db.ExecContext(ctx, `
@@ -347,6 +347,7 @@ func computeAuditHMAC(key []byte, ev *auditEventRow) string {
payload, marshalErr := json.Marshal(canonical) // compact, sorted keys
if marshalErr != nil {
log.Printf("auditChainHash: json.Marshal canonical failed: %v", marshalErr)
return ""
}
mac := hmac.New(sha256.New, key)
mac.Write(payload)
@@ -172,10 +172,14 @@ func (h *ChannelHandler) Create(c *gin.Context) {
configJSON, marshalErr := json.Marshal(body.Config)
if marshalErr != nil {
log.Printf("Channels create %s: json.Marshal config failed: %v", workspaceID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "marshal config failed"})
return
}
allowedJSON, marshalErr := json.Marshal(body.AllowedUsers)
if marshalErr != nil {
log.Printf("Channels create %s: json.Marshal allowed_users failed: %v", workspaceID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "marshal allowed_users failed"})
return
}
enabled := true
if body.Enabled != nil {
@@ -234,6 +238,8 @@ func (h *ChannelHandler) Update(c *gin.Context) {
j, marshalErr := json.Marshal(body.Config)
if marshalErr != nil {
log.Printf("Channels update %s: json.Marshal config failed: %v", workspaceID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "marshal config failed"})
return
}
configArg = string(j)
}
@@ -241,6 +247,8 @@ func (h *ChannelHandler) Update(c *gin.Context) {
j, marshalErr := json.Marshal(body.AllowedUsers)
if marshalErr != nil {
log.Printf("Channels update %s: json.Marshal allowed_users failed: %v", workspaceID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "marshal allowed_users failed"})
return
}
allowedArg = string(j)
}
@@ -60,12 +60,14 @@ func pushDelegationResultToInbox(ctx context.Context, sourceID, delegationID, st
respJSON, marshalErr := json.Marshal(respPayload)
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal respPayload failed: %v", delegationID, marshalErr)
return
}
reqJSON, marshalErr := json.Marshal(map[string]interface{}{
"delegation_id": delegationID,
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal reqPayload failed: %v", delegationID, marshalErr)
return
}
logStatus := "ok"
if status == "failed" {
@@ -319,6 +321,7 @@ func insertDelegationRow(ctx context.Context, c *gin.Context, sourceID string, b
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal taskJSON failed: %v", delegationID, marshalErr)
return insertTrackingUnavailable
}
// Store delegation_id in response_body so agent check_delegation_status
// (which reads response_body->>delegation_id) can locate this row even
@@ -328,6 +331,7 @@ func insertDelegationRow(ctx context.Context, c *gin.Context, sourceID string, b
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal respJSON failed: %v", delegationID, marshalErr)
return insertTrackingUnavailable
}
var idemArg interface{}
if body.IdempotencyKey != "" {
@@ -431,10 +435,12 @@ func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, tar
if proxyErr != nil && isTransientProxyError(proxyErr) && len(respBody) == 0 {
log.Printf("Delegation %s: first attempt failed (%s) — retrying in %s after reactive URL refresh",
delegationID, proxyErr.Error(), delegationRetryDelay)
timer := time.NewTimer(delegationRetryDelay)
select {
case <-ctx.Done():
timer.Stop()
// outer timeout hit before retry window elapsed
case <-time.After(delegationRetryDelay):
case <-timer.C:
status, respBody, proxyErr = h.workspace.proxyA2ARequest(ctx, targetID, a2aBody, sourceID, true, false)
}
}
@@ -505,12 +511,13 @@ handleSuccess:
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal queuedJSON failed: %v", delegationID, marshalErr)
}
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, $5::jsonb, 'queued')
`, sourceID, sourceID, targetID, "Delegation queued — target at capacity", string(queuedJSON)); err != nil {
log.Printf("Delegation %s: failed to insert queued log: %v", delegationID, err)
} else {
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, $5::jsonb, 'queued')
`, sourceID, sourceID, targetID, "Delegation queued — target at capacity", string(queuedJSON)); err != nil {
log.Printf("Delegation %s: failed to insert queued log: %v", delegationID, err)
}
}
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationStatus), sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "status": "queued",
@@ -531,12 +538,13 @@ handleSuccess:
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal respJSON failed: %v", delegationID, marshalErr)
}
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, $5::jsonb, 'completed')
`, sourceID, sourceID, targetID, "Delegation completed ("+textutil.TruncateBytes(responseText, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation %s: failed to insert success log: %v", delegationID, err)
} else {
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, $5::jsonb, 'completed')
`, sourceID, sourceID, targetID, "Delegation completed ("+textutil.TruncateBytes(responseText, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation %s: failed to insert success log: %v", delegationID, err)
}
}
log.Printf("Delegation %s: step=recording_ledger_completed", delegationID)
@@ -619,6 +627,8 @@ func (h *DelegationHandler) Record(c *gin.Context) {
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal taskJSON failed: %v", body.DelegationID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to marshal task"})
return
}
// Store delegation_id in response_body so agent check_delegation_status
// can locate this row. Fixes mc#984.
@@ -627,6 +637,8 @@ func (h *DelegationHandler) Record(c *gin.Context) {
})
if marshalErr != nil {
log.Printf("Delegation %s: json.Marshal respJSON failed: %v", body.DelegationID, marshalErr)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to marshal response"})
return
}
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, request_body, response_body, status)
@@ -697,12 +709,13 @@ func (h *DelegationHandler) UpdateStatus(c *gin.Context) {
})
if marshalErr != nil {
log.Printf("Delegation UpdateStatus %s: json.Marshal respJSON failed: %v", delegationID, marshalErr)
}
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4::jsonb, 'completed')
`, sourceID, sourceID, "Delegation completed ("+textutil.TruncateBytes(body.ResponsePreview, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation UpdateStatus: result insert failed for %s: %v", delegationID, err)
} else {
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4::jsonb, 'completed')
`, sourceID, sourceID, "Delegation completed ("+textutil.TruncateBytes(body.ResponsePreview, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation UpdateStatus: result insert failed for %s: %v", delegationID, err)
}
}
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationComplete), sourceID, map[string]interface{}{
"delegation_id": delegationID,
@@ -167,6 +167,9 @@ func generateAppInstallationToken() (string, time.Time, error) {
return "", time.Time{}, err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusCreated {
return "", time.Time{}, fmt.Errorf("github token endpoint returned status %d", resp.StatusCode)
}
var result struct {
Token string `json:"token"`
ExpiresAt time.Time `json:"expires_at"`
@@ -255,9 +255,23 @@ func TestExtended_SecretsListEmpty(t *testing.T) {
// ---------- TestSecretsSet (Extended) ----------
func TestExtended_SecretsSet(t *testing.T) {
// internal#691: the per-workspace strip gate now defaults to platform_managed
// on empty MOLECULE_LLM_BILLING_MODE (closed default). This test's intent is
// the happy path of persisting a vendor key, so put the org into byok which
// matches the pre-#691 implicit behavior of an unset env.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
mock := setupTestDB(t)
handler := NewSecretsHandler(nil)
// internal#691: secrets.Set now consults ResolveLLMBillingMode before the
// strip gate. Mock returns no row → resolver falls through to the org
// default (byok, set via t.Setenv above) → bypass-list check is skipped
// and the write proceeds. This pattern is the test-side mirror of the
// real-prod fall-through behavior for a fresh workspace with no override.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs("22222222-2222-2222-2222-222222222222").
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
// Expect INSERT (encrypted value is dynamic, use AnyArg)
mock.ExpectExec("INSERT INTO workspace_secrets").
WithArgs("22222222-2222-2222-2222-222222222222", "OPENAI_API_KEY", sqlmock.AnyArg(), sqlmock.AnyArg()).
@@ -0,0 +1,234 @@
package handlers
// llm_billing_mode.go — per-workspace LLM billing mode resolution (internal#691).
//
// The resolver answers a single question at provision time:
// "Should we strip CLAUDE_CODE_OAUTH_TOKEN + every vendor key from this
// workspace's env, force-route to the CP proxy, and bill org credits?"
//
// That question used to be a single env-var read inside applyPlatformManagedLLMEnv:
//
// os.Getenv("MOLECULE_LLM_BILLING_MODE") == "platform_managed" → strip
//
// where MOLECULE_LLM_BILLING_MODE was an ORG-level value, fetched from CP's
// tenant_config and exported into the workspace-server process at boot. That
// shape made it impossible to mix billing modes across workspaces in the same
// org: turning the org dial to `byok` so one workspace could keep its OAuth
// stops the strip for EVERY workspace in the org. Turning it to `platform_managed`
// blocks every workspace's own OAuth/vendor keys.
//
// The resolver replaces the env-var read with a per-workspace lookup:
//
// workspaces.llm_billing_mode (per-workspace override, NULLABLE)
// ?? organizations.llm_billing_mode (org default, fetched via tenant_config)
// ?? "platform_managed" (closed default — the existing implicit default)
//
// Default-closed contract — non-negotiable per the RFC Safety axis:
//
// - workspace row missing (sql.ErrNoRows) → fall through to org default
// - DB error on the lookup → "platform_managed" + propagated error
// - workspace override = NULL → fall through to org default
// - workspace override = unknown string → "platform_managed" (default-closed)
// - org default = NULL / empty / unknown string → "platform_managed" (closed default)
// - org default = recognized non-pm string + ws null → org default (byok/disabled honored)
//
// The ONLY way to resolve to "byok" or "disabled" is an explicit, recognized
// string in the workspace override OR the org default. A NULL JOIN, transient
// resolver error, or garbled enum value MUST NOT silently flip a workspace
// off of platform_managed — that would shadow the org's billing policy and
// is the exact failure mode the RFC's Safety hot-spot calls out.
import (
"context"
"database/sql"
"errors"
"fmt"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
)
// Constants mirror molecule-controlplane/internal/credits/llm_billing.go.
// Kept as string literals (not imports) because workspace-server has no
// build-time dependency on the CP module; the values are stable wire
// strings used in the tenant_config response, the workspaces.llm_billing_mode
// column check constraint, and the CP route bodies.
const (
LLMBillingModePlatformManaged = "platform_managed"
LLMBillingModeBYOK = "byok"
LLMBillingModeDisabled = "disabled"
)
// BillingModeSource describes which layer of the resolution stack supplied
// the final mode. Surfaced via the admin route for operator debug
// ("why is this workspace being stripped?") per the RFC Observability axis.
type BillingModeSource string
const (
BillingModeSourceWorkspaceOverride BillingModeSource = "workspace_override"
BillingModeSourceOrgDefault BillingModeSource = "org_default"
BillingModeSourceConstantFallback BillingModeSource = "constant_fallback"
)
// BillingModeResolution is the structured answer the admin GET route returns
// and the strip gate logs at INFO. The same struct is the unit-test fixture
// shape, so the resolver test asserts both the mode AND the source per case
// (catches a bug where the right mode is returned via the wrong layer).
type BillingModeResolution struct {
WorkspaceID string `json:"workspace_id"`
ResolvedMode string `json:"resolved_mode"`
WorkspaceOverride *string `json:"workspace_override"` // nil = inherit
OrgDefault string `json:"org_default"` // already default-closed by CP
Source BillingModeSource `json:"source"`
}
// isKnownBillingMode is the enum-recognizer for the resolver's default-closed
// branch. Returning false for an unknown string forces the resolver to fall
// through to the next layer (or the constant fallback) — NEVER to honor a
// garbled value as if it were valid. This is what makes a row with mode='byokk'
// (typo) resolve to platform_managed instead of accidentally to byok.
func isKnownBillingMode(s string) bool {
switch s {
case LLMBillingModePlatformManaged, LLMBillingModeBYOK, LLMBillingModeDisabled:
return true
default:
return false
}
}
// normalizeOrgDefault applies the same default-closed contract to the
// org-level input as the workspace override gets. The org_default arrives
// from tenant_config which already COALESCEs NULL → platform_managed at the
// CP SQL layer, but we DO NOT trust that contract here — if CP regresses or
// the tenant_config env wasn't populated (race on boot), we still default-
// close. Same principle: never honor a garbled value.
func normalizeOrgDefault(orgMode string) string {
if isKnownBillingMode(orgMode) {
return orgMode
}
return LLMBillingModePlatformManaged
}
// ResolveLLMBillingMode is the canonical resolver. Every code path that
// previously gated on `os.Getenv("MOLECULE_LLM_BILLING_MODE") == "platform_managed"`
// must call this instead and gate on the returned mode. The architectural
// test (resolver_ast_test.go) asserts there is no remaining call site of
// the old shape outside the resolver-input wiring.
//
// Returning an error does NOT prevent the caller from making a decision —
// the returned mode is always a valid enum value (default-closed to
// platform_managed) so the caller can proceed without a separate fail-closed
// branch. The error is informational: log it, surface it to operators, but
// the strip-gate decision is already safe.
func ResolveLLMBillingMode(ctx context.Context, workspaceID, orgMode string) (BillingModeResolution, error) {
res := BillingModeResolution{
WorkspaceID: workspaceID,
OrgDefault: normalizeOrgDefault(orgMode),
}
if workspaceID == "" {
// No workspace ID = pre-provision context (templating, validation).
// Resolve against the org default only, no DB read.
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
// Org default was garbled/NULL and we clamped to platform_managed.
// Mark the source as constant_fallback so the operator can see
// the clamp happened, not that the org "really" said platform_managed.
res.Source = BillingModeSourceConstantFallback
}
return res, nil
}
var wsOverride sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT llm_billing_mode FROM workspaces WHERE id = $1`,
workspaceID,
).Scan(&wsOverride)
switch {
case errors.Is(err, sql.ErrNoRows):
// Workspace row missing — concurrent delete, or pre-create call. Don't
// silently flip; fall through to org default. Source stays org_default
// so operators can see the row-missing case is being handled as a
// fallback, not a workspace-explicit decision.
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
res.Source = BillingModeSourceConstantFallback
}
return res, nil
case err != nil:
// DB error — default-closed to platform_managed AND propagate the
// error so operators get a structured log line. The caller is
// expected to log and continue with the safe default.
res.ResolvedMode = LLMBillingModePlatformManaged
res.Source = BillingModeSourceConstantFallback
return res, fmt.Errorf("resolve workspace llm_billing_mode for %s: %w", workspaceID, err)
}
if wsOverride.Valid && isKnownBillingMode(wsOverride.String) {
mode := wsOverride.String
res.WorkspaceOverride = &mode
res.ResolvedMode = mode
res.Source = BillingModeSourceWorkspaceOverride
return res, nil
}
// Override row present but the value is NULL or garbled. Fall through.
// If the value was non-NULL but garbled (CHECK constraint should prevent
// this, but defense in depth — a future migration could relax the check
// or another path could write the column directly), surface the raw
// override value so operators can spot the corrupt row.
if wsOverride.Valid {
raw := wsOverride.String
res.WorkspaceOverride = &raw
}
res.ResolvedMode = res.OrgDefault
res.Source = BillingModeSourceOrgDefault
if !isKnownBillingMode(orgMode) {
res.Source = BillingModeSourceConstantFallback
}
return res, nil
}
// SetWorkspaceLLMBillingMode writes the override column. Pass mode=="" to
// clear (set to NULL = inherit). Validates the mode against the enum set
// so the route handler doesn't have to duplicate validation; a garbled
// mode round-trips as an explicit 400 from the caller, not a CHECK-
// constraint error from the DB driver.
func SetWorkspaceLLMBillingMode(ctx context.Context, workspaceID, mode string) error {
if workspaceID == "" {
return errors.New("SetWorkspaceLLMBillingMode: workspace id required")
}
if mode == "" {
// NULL = inherit. Caller asked to clear the override.
res, err := db.DB.ExecContext(ctx,
`UPDATE workspaces SET llm_billing_mode = NULL WHERE id = $1`,
workspaceID,
)
if err != nil {
return fmt.Errorf("clear workspace llm_billing_mode for %s: %w", workspaceID, err)
}
n, _ := res.RowsAffected()
if n == 0 {
return sql.ErrNoRows
}
return nil
}
if !isKnownBillingMode(mode) {
return fmt.Errorf("unknown billing mode %q (allowed: %s, %s, %s)",
mode, LLMBillingModePlatformManaged, LLMBillingModeBYOK, LLMBillingModeDisabled)
}
res, err := db.DB.ExecContext(ctx,
`UPDATE workspaces SET llm_billing_mode = $1 WHERE id = $2`,
mode, workspaceID,
)
if err != nil {
return fmt.Errorf("set workspace llm_billing_mode for %s: %w", workspaceID, err)
}
n, _ := res.RowsAffected()
if n == 0 {
return sql.ErrNoRows
}
return nil
}
@@ -0,0 +1,154 @@
package handlers
// llm_billing_mode_handler.go — workspace-server admin routes that read /
// write the per-workspace billing mode override (internal#691). These are
// the per-tenant routes that CP's new /cp/admin/workspaces/:id/llm-billing-mode
// proxies to; the canvas hits them via the CP route, not directly.
//
// Route shape:
//
// GET /admin/workspaces/:id/llm-billing-mode
// -> 200 BillingModeResolution
// -> 400 on malformed UUID
// -> 500 on DB error (response still includes a safe_default the caller
// can fall through to — the resolver always returns a valid mode
// even on error, per the default-closed contract)
//
// PUT /admin/workspaces/:id/llm-billing-mode
// body: {"mode": "byok" | "platform_managed" | "disabled" | null}
// -> 200 BillingModeResolution (post-write)
// -> 400 on bad UUID / unknown mode / malformed body / missing "mode" key
// -> 404 when the workspace row doesn't exist
//
// Auth: mounted under wsAdmin (middleware.AdminAuth) — admin_token required.
import (
"database/sql"
"encoding/json"
"errors"
"io"
"net/http"
"os"
"strings"
"github.com/gin-gonic/gin"
)
// GetWorkspaceLLMBillingMode handles GET /admin/workspaces/:id/llm-billing-mode.
//
// Reads the workspace override + the org-level default (from the same
// MOLECULE_LLM_BILLING_MODE env var the provisioner reads at strip-gate time —
// keeps the two paths consistent so the GET result matches what the strip
// gate would compute) and returns the structured resolution.
func GetWorkspaceLLMBillingMode(c *gin.Context) {
workspaceID := strings.TrimSpace(c.Param("id"))
if !uuidRegex.MatchString(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, err := ResolveLLMBillingMode(c.Request.Context(), workspaceID, orgMode)
if err != nil {
// Resolver returns a safe default-closed mode alongside the error;
// surface the error so the operator sees the DB issue, but the
// response still has a usable mode field for the caller to fall
// through to without a separate fail-closed branch.
c.JSON(http.StatusInternalServerError, gin.H{
"error": "resolve workspace billing mode failed",
"detail": err.Error(),
"safe_default": res.ResolvedMode,
"workspace_id": res.WorkspaceID,
})
return
}
c.JSON(http.StatusOK, res)
}
// PutWorkspaceLLMBillingMode handles PUT /admin/workspaces/:id/llm-billing-mode.
//
// Body shape: {"mode": "byok" | "platform_managed" | "disabled" | null}
// where null clears the override (workspace inherits the org default again).
// Omitting "mode" entirely is a 400 — callers must be explicit about whether
// they want to set or clear, so a typo'd field name can't silently no-op.
//
// On success returns the post-write resolution so the canvas can re-render
// without a follow-up GET.
func PutWorkspaceLLMBillingMode(c *gin.Context) {
workspaceID := strings.TrimSpace(c.Param("id"))
if !uuidRegex.MatchString(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
// Read raw body so we can distinguish three cases:
// {"mode": "byok"} → set override
// {"mode": null} → clear override
// {} → 400 (caller must be explicit)
// json.RawMessage zero length ⇔ key absent; raw "null" ⇔ explicit clear;
// raw quoted string ⇔ set.
raw, readErr := io.ReadAll(c.Request.Body)
if readErr != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "read body", "detail": readErr.Error()})
return
}
var body struct {
Mode json.RawMessage `json:"mode"`
}
if err := json.Unmarshal(raw, &body); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid json", "detail": err.Error()})
return
}
if len(body.Mode) == 0 {
c.JSON(http.StatusBadRequest, gin.H{"error": "missing required field 'mode' (use null to clear override)"})
return
}
var writeErr error
if string(body.Mode) == "null" {
writeErr = SetWorkspaceLLMBillingMode(c.Request.Context(), workspaceID, "")
} else {
var modeStr string
if err := json.Unmarshal(body.Mode, &modeStr); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "mode must be a string or null", "detail": err.Error()})
return
}
modeStr = strings.TrimSpace(modeStr)
if modeStr == "" {
// Empty string is ambiguous (could be "clear" or "user error");
// reject as 400 so the caller picks null explicitly.
c.JSON(http.StatusBadRequest, gin.H{"error": "mode must be one of platform_managed, byok, disabled, or null to clear"})
return
}
writeErr = SetWorkspaceLLMBillingMode(c.Request.Context(), workspaceID, modeStr)
}
if errors.Is(writeErr, sql.ErrNoRows) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
}
if writeErr != nil {
// Validation errors from SetWorkspaceLLMBillingMode (unknown mode
// string) come back as a plain error; map to 400.
if strings.HasPrefix(writeErr.Error(), "unknown billing mode") {
c.JSON(http.StatusBadRequest, gin.H{"error": writeErr.Error()})
return
}
c.JSON(http.StatusInternalServerError, gin.H{"error": "set workspace billing mode failed", "detail": writeErr.Error()})
return
}
// Read back the resolution so the response reflects post-write state.
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(c.Request.Context(), workspaceID, orgMode)
if resolveErr != nil {
// Write succeeded but readback failed — still return 200 with the
// best-effort resolution; the safe default is set even on error.
c.JSON(http.StatusOK, gin.H{
"workspace_id": workspaceID,
"resolved_mode": res.ResolvedMode,
"readback_error": resolveErr.Error(),
})
return
}
c.JSON(http.StatusOK, res)
}
@@ -0,0 +1,205 @@
package handlers
// llm_billing_mode_handler_test.go — admin route coverage for the per-workspace
// LLM billing mode endpoint (internal#691).
//
// What this guards:
// - GET path validates UUID + returns the BillingModeResolution shape
// - PUT distinguishes "omitted mode" (400) from "explicit null" (clear)
// from "string value" (set), so a typo'd field name can't silently no-op
// - Unknown mode strings 400 from the validator, not from a PG CHECK
// constraint round-trip (matters because the error message must be
// actionable to a canvas user)
// - 404 propagates when the workspace row is missing on a set/clear
import (
"bytes"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
func init() {
gin.SetMode(gin.TestMode)
}
const testWSID = "44444444-4444-4444-4444-444444444444"
func TestGetWorkspaceLLMBillingMode_HappyPath_InheritsOrgDefault(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModeBYOK)
mock := setupTestDB(t)
// Workspace has no override → resolver returns org_default = byok.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWSID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: testWSID}}
c.Request = httptest.NewRequest("GET", "/admin/workspaces/"+testWSID+"/llm-billing-mode", nil)
GetWorkspaceLLMBillingMode(c)
if w.Code != http.StatusOK {
t.Fatalf("status: got %d want 200, body=%s", w.Code, w.Body.String())
}
var res BillingModeResolution
if err := json.Unmarshal(w.Body.Bytes(), &res); err != nil {
t.Fatalf("decode: %v", err)
}
if res.ResolvedMode != LLMBillingModeBYOK {
t.Errorf("resolved mode: got %q want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if res.Source != BillingModeSourceOrgDefault {
t.Errorf("source: got %q want %q", res.Source, BillingModeSourceOrgDefault)
}
if res.WorkspaceOverride != nil {
t.Errorf("expected nil override, got %v", *res.WorkspaceOverride)
}
}
func TestGetWorkspaceLLMBillingMode_BadUUID_400(t *testing.T) {
setupTestDB(t)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "not-a-uuid"}}
c.Request = httptest.NewRequest("GET", "/admin/workspaces/not-a-uuid/llm-billing-mode", nil)
GetWorkspaceLLMBillingMode(c)
if w.Code != http.StatusBadRequest {
t.Fatalf("status: got %d want 400", w.Code)
}
}
func TestPutWorkspaceLLMBillingMode_SetByok(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
mock := setupTestDB(t)
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = \$1 WHERE id = \$2`).
WithArgs(LLMBillingModeBYOK, testWSID).
WillReturnResult(sqlmock.NewResult(0, 1))
// Readback after write.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWSID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: testWSID}}
body := `{"mode":"byok"}`
c.Request = httptest.NewRequest("PUT",
"/admin/workspaces/"+testWSID+"/llm-billing-mode",
bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
PutWorkspaceLLMBillingMode(c)
if w.Code != http.StatusOK {
t.Fatalf("status: got %d want 200, body=%s", w.Code, w.Body.String())
}
var res BillingModeResolution
if err := json.Unmarshal(w.Body.Bytes(), &res); err != nil {
t.Fatalf("decode: %v", err)
}
if res.ResolvedMode != LLMBillingModeBYOK {
t.Errorf("post-write resolved: got %q want %q", res.ResolvedMode, LLMBillingModeBYOK)
}
if res.Source != BillingModeSourceWorkspaceOverride {
t.Errorf("post-write source: got %q want %q", res.Source, BillingModeSourceWorkspaceOverride)
}
}
func TestPutWorkspaceLLMBillingMode_ExplicitNullClearsOverride(t *testing.T) {
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
mock := setupTestDB(t)
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = NULL WHERE id = \$1`).
WithArgs(testWSID).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(testWSID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: testWSID}}
body := `{"mode":null}`
c.Request = httptest.NewRequest("PUT",
"/admin/workspaces/"+testWSID+"/llm-billing-mode",
bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
PutWorkspaceLLMBillingMode(c)
if w.Code != http.StatusOK {
t.Fatalf("status: got %d want 200, body=%s", w.Code, w.Body.String())
}
var res BillingModeResolution
if err := json.Unmarshal(w.Body.Bytes(), &res); err != nil {
t.Fatalf("decode: %v", err)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("post-clear resolved: got %q want %q", res.ResolvedMode, LLMBillingModePlatformManaged)
}
if res.Source != BillingModeSourceOrgDefault {
t.Errorf("post-clear source: got %q want %q", res.Source, BillingModeSourceOrgDefault)
}
if res.WorkspaceOverride != nil {
t.Errorf("post-clear override should be nil, got %v", *res.WorkspaceOverride)
}
}
func TestPutWorkspaceLLMBillingMode_MissingModeField_400(t *testing.T) {
setupTestDB(t)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: testWSID}}
body := `{}`
c.Request = httptest.NewRequest("PUT",
"/admin/workspaces/"+testWSID+"/llm-billing-mode",
bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
PutWorkspaceLLMBillingMode(c)
if w.Code != http.StatusBadRequest {
t.Fatalf("status: got %d want 400, body=%s", w.Code, w.Body.String())
}
}
func TestPutWorkspaceLLMBillingMode_UnknownMode_400(t *testing.T) {
setupTestDB(t)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: testWSID}}
body := `{"mode":"totally-bogus"}`
c.Request = httptest.NewRequest("PUT",
"/admin/workspaces/"+testWSID+"/llm-billing-mode",
bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
PutWorkspaceLLMBillingMode(c)
if w.Code != http.StatusBadRequest {
t.Fatalf("status: got %d want 400, body=%s", w.Code, w.Body.String())
}
}
func TestPutWorkspaceLLMBillingMode_NoSuchWorkspace_404(t *testing.T) {
mock := setupTestDB(t)
// SET path: rows affected = 0 → SetWorkspaceLLMBillingMode returns sql.ErrNoRows
// → handler maps to 404.
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = \$1 WHERE id = \$2`).
WithArgs(LLMBillingModeBYOK, testWSID).
WillReturnResult(sqlmock.NewResult(0, 0))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: testWSID}}
body := `{"mode":"byok"}`
c.Request = httptest.NewRequest("PUT",
"/admin/workspaces/"+testWSID+"/llm-billing-mode",
bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
PutWorkspaceLLMBillingMode(c)
if w.Code != http.StatusNotFound {
t.Fatalf("status: got %d want 404, body=%s", w.Code, w.Body.String())
}
}
@@ -0,0 +1,261 @@
package handlers
// llm_billing_mode_test.go — table-driven tests for the per-workspace
// resolver (internal#691). The cases below enumerate every documented
// branch in the default-closed contract; if one of them flips behavior
// later the test names will tell the reviewer exactly which RFC clause
// regressed.
import (
"context"
"errors"
"testing"
"github.com/DATA-DOG/go-sqlmock"
)
func TestResolveLLMBillingMode_TableDriven(t *testing.T) {
ctx := context.Background()
const wsID = "11111111-1111-1111-1111-111111111111"
type want struct {
mode string
source BillingModeSource
// hasOverride asserts whether the resolver surfaced the override
// value in the result (nil pointer = clean inherit, non-nil = the
// row was present even if it ultimately fell through because it
// was garbled). Lets us distinguish "row missing, fell through"
// from "row present but garbled, fell through" — both resolve to
// the same mode but the resolver tells operators which case it was.
hasOverride bool
}
type tc struct {
name string
workspaceID string
orgMode string
setupMock func(m sqlmock.Sqlmock)
want want
wantErr bool
}
cases := []tc{
{
name: "workspace_override_byok_overrides_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceWorkspaceOverride, hasOverride: true},
},
{
name: "workspace_override_disabled_overrides_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeDisabled))
},
want: want{mode: LLMBillingModeDisabled, source: BillingModeSourceWorkspaceOverride, hasOverride: true},
},
{
name: "workspace_override_null_inherits_byok_org",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_override_null_inherits_pm_org",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(nil))
},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_override_garbled_falls_through_to_pm_org_DEFAULT_CLOSED",
workspaceID: wsID,
orgMode: LLMBillingModePlatformManaged,
setupMock: func(m sqlmock.Sqlmock) {
// CHECK constraint would normally prevent this but if a future
// migration loosens it (or a direct UPDATE bypasses it on a
// non-PG driver in a test stub), a garbled value MUST NOT
// be honored as if it were valid. This is the default-closed
// safety axis the RFC calls out.
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("byokk"))
},
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceOrgDefault, hasOverride: true},
},
{
name: "workspace_override_garbled_org_garbled_constant_fallback",
workspaceID: wsID,
orgMode: "garbled-or-empty",
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("nonsense"))
},
// Both layers garbled → constant fallback. Source is constant_fallback
// so operators can see the org-default-was-also-bad case explicitly.
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: true},
},
{
name: "workspace_row_missing_falls_through_to_org_byok",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
},
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_id_empty_pre_provision_org_only",
workspaceID: "",
orgMode: LLMBillingModeBYOK,
setupMock: func(m sqlmock.Sqlmock) { /* no DB read expected — empty ws id short-circuits */ },
want: want{mode: LLMBillingModeBYOK, source: BillingModeSourceOrgDefault, hasOverride: false},
},
{
name: "workspace_id_empty_org_garbled_constant_fallback",
workspaceID: "",
orgMode: "",
setupMock: func(m sqlmock.Sqlmock) { /* no DB read */ },
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: false},
},
{
name: "db_error_default_closed_to_pm_with_error",
workspaceID: wsID,
orgMode: LLMBillingModeBYOK, // org says byok but DB errored — DO NOT honor org
setupMock: func(m sqlmock.Sqlmock) {
m.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnError(errors.New("connection refused"))
},
// Critical: even though orgMode=byok, a DB error means we can't
// confirm the workspace doesn't have an override, so we default
// to the closed mode. This is the safer of the two failures —
// silently flipping to org-byok on a DB error would leak the
// OAuth-keeping behavior to workspaces whose row says NULL.
want: want{mode: LLMBillingModePlatformManaged, source: BillingModeSourceConstantFallback, hasOverride: false},
wantErr: true,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
mock := setupTestDB(t)
c.setupMock(mock)
res, err := ResolveLLMBillingMode(ctx, c.workspaceID, c.orgMode)
if (err != nil) != c.wantErr {
t.Fatalf("err: got %v wantErr=%v", err, c.wantErr)
}
if res.ResolvedMode != c.want.mode {
t.Errorf("mode: got %q want %q", res.ResolvedMode, c.want.mode)
}
if res.Source != c.want.source {
t.Errorf("source: got %q want %q", res.Source, c.want.source)
}
if (res.WorkspaceOverride != nil) != c.want.hasOverride {
t.Errorf("hasOverride: got %v want %v (override=%v)",
res.WorkspaceOverride != nil, c.want.hasOverride, res.WorkspaceOverride)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
})
}
}
// TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid asserts the resolver's
// post-condition: the returned mode is ALWAYS one of the three known enum
// values, never an empty string and never a garbled passthrough. The strip
// gate downstream relies on this so it can switch on res.ResolvedMode
// without a separate is-valid check on every call site.
func TestResolveLLMBillingMode_ResolvedModeIsAlwaysValid(t *testing.T) {
ctx := context.Background()
const wsID = "22222222-2222-2222-2222-222222222222"
// Throw a pathological row at the resolver: garbled override + garbled
// org default. Resolved mode must still be a recognized enum.
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow("totally-bogus"))
res, err := ResolveLLMBillingMode(ctx, wsID, "also-bogus")
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
if !isKnownBillingMode(res.ResolvedMode) {
t.Errorf("post-condition violated: resolved mode %q is not a known enum value", res.ResolvedMode)
}
if res.ResolvedMode != LLMBillingModePlatformManaged {
t.Errorf("default-closed contract: garbled-x-garbled must resolve to platform_managed, got %q", res.ResolvedMode)
}
}
// TestSetWorkspaceLLMBillingMode_Validation guards the SET path. The CHECK
// constraint at the DB layer is the second line of defense; the route
// handler relies on this function rejecting unknown modes with a clean
// error (so it can map to 400) instead of letting them hit Postgres and
// surfacing as a sql-driver error string.
func TestSetWorkspaceLLMBillingMode_Validation(t *testing.T) {
ctx := context.Background()
const wsID = "33333333-3333-3333-3333-333333333333"
t.Run("rejects_unknown_mode_without_db_call", func(t *testing.T) {
setupTestDB(t) // mock expects nothing — the function must short-circuit
if err := SetWorkspaceLLMBillingMode(ctx, wsID, "totally-bogus"); err == nil {
t.Fatal("expected error for unknown mode, got nil")
}
})
t.Run("rejects_empty_workspace_id", func(t *testing.T) {
setupTestDB(t)
if err := SetWorkspaceLLMBillingMode(ctx, "", LLMBillingModeBYOK); err == nil {
t.Fatal("expected error for empty workspace id, got nil")
}
})
t.Run("clear_uses_NULL_update", func(t *testing.T) {
mock := setupTestDB(t)
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = NULL WHERE id = \$1`).
WithArgs(wsID).
WillReturnResult(sqlmock.NewResult(0, 1))
if err := SetWorkspaceLLMBillingMode(ctx, wsID, ""); err != nil {
t.Fatalf("unexpected err: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatal(err)
}
})
t.Run("set_byok_uses_value_update", func(t *testing.T) {
mock := setupTestDB(t)
mock.ExpectExec(`UPDATE workspaces SET llm_billing_mode = \$1 WHERE id = \$2`).
WithArgs(LLMBillingModeBYOK, wsID).
WillReturnResult(sqlmock.NewResult(0, 1))
if err := SetWorkspaceLLMBillingMode(ctx, wsID, LLMBillingModeBYOK); err != nil {
t.Fatalf("unexpected err: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatal(err)
}
})
}
@@ -280,6 +280,92 @@ func TestMCPHandler_DelegateTaskAsync_RoutesThroughPlatformA2AProxy(t *testing.T
}
}
// TestMCPHandler_DelegateTaskAsync_MarshalFailureDoesNotCallProxy proves the
// extracted #1933 fix: when the A2A body fails to marshal, the detached
// goroutine returns early and never calls proxyA2ARequest with a nil/empty
// body. Before the fix the goroutine logged the error and fell through,
// dispatching a malformed A2A request.
func TestMCPHandler_DelegateTaskAsync_MarshalFailureDoesNotCallProxy(t *testing.T) {
h, mock := newMCPHandler(t)
callerID := "11111111-1111-1111-1111-111111111111"
targetID := "22222222-2222-2222-2222-222222222222"
parentID := "33333333-3333-3333-3333-333333333333"
expectCanCommunicateSiblings(mock, callerID, targetID, parentID)
mock.ExpectExec(`(?s)INSERT INTO activity_logs.*'delegation'.*'delegate'`).
WithArgs(callerID, callerID, targetID, "Delegating to "+targetID, sqlmock.AnyArg(), "pending").
WillReturnResult(sqlmock.NewResult(1, 1))
mock.ExpectExec(`UPDATE activity_logs`).
WithArgs("dispatched", "", callerID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Force the (otherwise near-impossible) marshal failure for the A2A body.
origMarshal := marshalA2ABody
marshalA2ABody = func(any) ([]byte, error) {
return nil, errors.New("forced marshal failure")
}
t.Cleanup(func() { marshalA2ABody = origMarshal })
proxyCalled := make(chan struct{}, 1)
h.a2aProxy = func(ctx context.Context, workspaceID string, body []byte, proxyCallerID string, logActivity bool) (int, []byte, error) {
proxyCalled <- struct{}{}
return 200, []byte(`{}`), nil
}
out, err := h.toolDelegateTaskAsync(context.Background(), callerID, map[string]interface{}{
"workspace_id": targetID,
"task": "async work",
})
if err != nil {
t.Fatalf("delegate_task_async returned error: %v", err)
}
if !strings.Contains(out, `"status":"dispatched"`) {
t.Fatalf("delegate_task_async response = %s", out)
}
// Wait for the detached goroutine to finish, then assert the proxy was
// never reached because of the early return on marshal failure.
waitGlobalAsyncForTest()
select {
case <-proxyCalled:
t.Fatal("proxyA2ARequest was called after marshal failure; expected early return")
default:
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestMCPHandler_CheckTaskStatus_NullStatusDefaultsToUnknown proves the
// extracted #1933 hardening: when the activity_logs row has a NULL status,
// check_task_status reports "unknown" instead of an empty string (the old
// status.String zero value).
func TestMCPHandler_CheckTaskStatus_NullStatusDefaultsToUnknown(t *testing.T) {
h, mock := newMCPHandler(t)
callerID := "11111111-1111-1111-1111-111111111111"
targetID := "22222222-2222-2222-2222-222222222222"
taskID := "task-abc"
mock.ExpectQuery(`(?s)SELECT status, error_detail, response_body.*FROM activity_logs`).
WithArgs(callerID, targetID, taskID).
WillReturnRows(sqlmock.NewRows([]string{"status", "error_detail", "response_body"}).
AddRow(nil, nil, nil))
out, err := h.toolCheckTaskStatus(context.Background(), callerID, map[string]interface{}{
"workspace_id": targetID,
"task_id": taskID,
})
if err != nil {
t.Fatalf("check_task_status returned error: %v", err)
}
if !strings.Contains(out, `"status": "unknown"`) {
t.Fatalf("expected status \"unknown\" for NULL status row, got: %s", out)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// notifications/initialized
// ─────────────────────────────────────────────────────────────────────────────
@@ -20,6 +20,11 @@ import (
"github.com/google/uuid"
)
// marshalA2ABody marshals the JSON-RPC body for an async A2A dispatch.
// Indirected through a package var so tests can force the (otherwise
// near-impossible) marshal-failure path and assert the early return.
var marshalA2ABody = json.Marshal
// insertMCPDelegationRow writes a delegation activity row so the canvas
// Agent Comms tab can show the task text for MCP-initiated delegations.
// Mirrors insertDelegationRow (delegation.go) for the MCP tool path.
@@ -144,6 +149,7 @@ func (h *MCPHandler) toolListPeers(ctx context.Context, workspaceID string) (str
b, marshalErr := json.MarshalIndent(peers, "", " ")
if marshalErr != nil {
log.Printf("toolListPeers: json.MarshalIndent peers failed: %v", marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(b), nil
}
@@ -177,6 +183,7 @@ func (h *MCPHandler) toolGetWorkspaceInfo(ctx context.Context, workspaceID strin
b, marshalErr := json.MarshalIndent(info, "", " ")
if marshalErr != nil {
log.Printf("toolGetWorkspaceInfo %s: json.MarshalIndent info failed: %v", workspaceID, marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(b), nil
}
@@ -269,7 +276,7 @@ func (h *MCPHandler) toolDelegateTaskAsync(ctx context.Context, callerID string,
bgCtx, cancel := context.WithTimeout(context.Background(), mcpAsyncCallTimeout)
defer cancel()
a2aBody, marshalErr := json.Marshal(map[string]interface{}{
a2aBody, marshalErr := marshalA2ABody(map[string]interface{}{
"jsonrpc": "2.0",
"id": delegationID,
"method": "message/send",
@@ -283,6 +290,9 @@ func (h *MCPHandler) toolDelegateTaskAsync(ctx context.Context, callerID string,
})
if marshalErr != nil {
log.Printf("toolDelegateTask %s: json.Marshal a2aBody failed: %v", delegationID, marshalErr)
// Bail out: proceeding would call proxyA2ARequest with a
// nil/empty body, dispatching a malformed A2A request.
return
}
status, _, err := h.proxyA2ARequest(bgCtx, targetID, a2aBody, callerID, true)
@@ -330,9 +340,13 @@ func (h *MCPHandler) toolCheckTaskStatus(ctx context.Context, callerID string, a
result := map[string]interface{}{
"task_id": taskID,
"status": status.String,
"target_id": targetID,
}
if status.Valid {
result["status"] = status.String
} else {
result["status"] = "unknown"
}
if errorDetail.Valid && errorDetail.String != "" {
result["error"] = errorDetail.String
}
@@ -342,6 +356,7 @@ func (h *MCPHandler) toolCheckTaskStatus(ctx context.Context, callerID string, a
b, marshalErr := json.MarshalIndent(result, "", " ")
if marshalErr != nil {
log.Printf("toolCheckTaskStatus: json.MarshalIndent result failed: %v", marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(b), nil
}
@@ -194,6 +194,7 @@ func (h *MCPHandler) recallMemoryLegacyShim(ctx context.Context, workspaceID str
b, marshalErr := json.MarshalIndent(out, "", " ")
if marshalErr != nil {
log.Printf("toolRecallMemory: json.MarshalIndent out failed: %v", marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(b), nil
}
@@ -166,6 +166,7 @@ func (h *MCPHandler) toolCommitMemoryV2(ctx context.Context, workspaceID string,
out, marshalErr := json.Marshal(resp)
if marshalErr != nil {
log.Printf("toolCommitMemoryV2 %s: json.Marshal resp failed: %v", workspaceID, marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(out), nil
}
@@ -223,6 +224,7 @@ func (h *MCPHandler) toolSearchMemory(ctx context.Context, workspaceID string, a
out, marshalErr := json.Marshal(resp)
if marshalErr != nil {
log.Printf("toolSearchMemory %s: json.Marshal resp failed: %v", workspaceID, marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(out), nil
}
@@ -281,6 +283,7 @@ func (h *MCPHandler) toolCommitSummary(ctx context.Context, workspaceID string,
out, marshalErr := json.Marshal(resp)
if marshalErr != nil {
log.Printf("toolCommitSummary %s: json.Marshal resp failed: %v", workspaceID, marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(out), nil
}
@@ -300,6 +303,7 @@ func (h *MCPHandler) toolListWritableNamespaces(ctx context.Context, workspaceID
b, marshalErr := json.MarshalIndent(ns, "", " ")
if marshalErr != nil {
log.Printf("toolListWritableNamespaces %s: json.MarshalIndent ns failed: %v", workspaceID, marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(b), nil
}
@@ -315,6 +319,7 @@ func (h *MCPHandler) toolListReadableNamespaces(ctx context.Context, workspaceID
b, marshalErr := json.MarshalIndent(ns, "", " ")
if marshalErr != nil {
log.Printf("toolListReadableNamespaces %s: json.MarshalIndent ns failed: %v", workspaceID, marshalErr)
return "", fmt.Errorf("marshal response: %w", marshalErr)
}
return string(b), nil
}
@@ -247,13 +247,14 @@ func (h *MemoriesHandler) Commit(c *gin.Context) {
})
if marshalErr != nil {
log.Printf("Commit %s: json.Marshal auditBody failed: %v", workspaceID, marshalErr)
}
summary := "GLOBAL memory written: id=" + memoryID + " namespace=" + nsName
if _, auditErr := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, summary, request_body, status)
VALUES ($1, $2, $3, $4, $5::jsonb, $6)
`, workspaceID, "memory_write_global", workspaceID, summary, string(auditBody), "ok"); auditErr != nil {
log.Printf("Commit: GLOBAL memory audit log failed for %s/%s: %v", workspaceID, memoryID, auditErr)
} else {
summary := "GLOBAL memory written: id=" + memoryID + " namespace=" + nsName
if _, auditErr := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, summary, request_body, status)
VALUES ($1, $2, $3, $4, $5::jsonb, $6)
`, workspaceID, "memory_write_global", workspaceID, summary, string(auditBody), "ok"); auditErr != nil {
log.Printf("Commit: GLOBAL memory audit log failed for %s/%s: %v", workspaceID, memoryID, auditErr)
}
}
}
@@ -345,8 +345,16 @@ func (h *RegistryHandler) Register(c *gin.Context) {
if qErr := db.DB.QueryRowContext(ctx,
`SELECT name, role FROM workspaces WHERE id = $1`, payload.ID,
).Scan(&dbName, &dbRole); qErr == nil {
name := ""
if dbName.Valid {
name = dbName.String
}
role := ""
if dbRole.Valid {
role = dbRole.String
}
if rc, did := reconcileAgentCardIdentity(
payload.AgentCard, payload.ID, dbName.String, dbRole.String,
payload.AgentCard, payload.ID, name, role,
); did {
reconciledCard = rc
log.Printf("Registry register: reconciled agent_card identity for %s from workspaces row", payload.ID)
@@ -177,10 +177,12 @@ func waitForWorkspaceOnline(ctx context.Context, workspaceID string, timeout tim
).Scan(&status); err == nil && status == "online" {
return true
}
timer := time.NewTimer(restartContextOnlinePollInterval)
select {
case <-ctx.Done():
timer.Stop()
return false
case <-time.After(restartContextOnlinePollInterval):
case <-timer.C:
}
}
return false
@@ -213,10 +215,12 @@ func waitForFreshHeartbeat(ctx context.Context, workspaceID string, restartStart
lastHB.Valid && lastHB.Time.After(restartStartTs) {
return true
}
timer := time.NewTimer(restartContextOnlinePollInterval)
select {
case <-ctx.Done():
timer.Stop()
return false
case <-time.After(restartContextOnlinePollInterval):
case <-timer.C:
}
}
return false
@@ -160,13 +160,14 @@ func (h *ScheduleHandler) Create(c *gin.Context) {
}
// Validate timezone
if _, err := time.LoadLocation(body.Timezone); err != nil {
loc, err := time.LoadLocation(body.Timezone)
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid timezone: " + body.Timezone})
return
}
// Validate and compute next run
nextRun, err := scheduler.ComputeNextRun(body.CronExpr, body.Timezone, time.Now())
nextRun, err := scheduler.ComputeNextRun(body.CronExpr, body.Timezone, time.Now().In(loc))
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request body"})
return
@@ -260,11 +261,12 @@ func (h *ScheduleHandler) Update(c *gin.Context) {
if body.Timezone != nil {
tz = *body.Timezone
}
if _, err := time.LoadLocation(tz); err != nil {
loc, err := time.LoadLocation(tz)
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid timezone: " + tz})
return
}
nextRun, err := scheduler.ComputeNextRun(cronExpr, tz, time.Now())
nextRun, err := scheduler.ComputeNextRun(cronExpr, tz, time.Now().In(loc))
if err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request body"})
return
+45 -1
View File
@@ -48,10 +48,54 @@ func isPlatformManagedDirectLLMBypassKey(key string) bool {
return ok
}
// platformManagedLLMModeForWorkspace replaces the org-level platformManagedLLMMode
// gate with a per-workspace resolved-mode check (internal#691). The strip-list
// is enforced ONLY when this specific workspace's resolved mode is
// platform_managed — a workspace with a byok override is allowed to write its
// own CLAUDE_CODE_OAUTH_TOKEN / vendor key via the canvas Secrets tab.
//
// Default-closed: if the resolver hits a DB error, falls back to
// platform_managed (the safe-default behavior), so a transient DB failure
// during a secret write still rejects the bypass-list keys — fail safer not
// freer. This matches the resolver's documented contract.
func platformManagedLLMModeForWorkspace(c *gin.Context, workspaceID string) bool {
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, err := ResolveLLMBillingMode(c.Request.Context(), workspaceID, orgMode)
if err != nil {
log.Printf("secrets: resolve billing mode for workspace=%s failed: %v (defaulting to platform_managed for safety)", workspaceID, err)
}
return strings.EqualFold(res.ResolvedMode, LLMBillingModePlatformManaged)
}
// platformManagedLLMMode is the legacy org-level gate retained for any test
// harness still asserting the env-var-only behavior. Production code paths
// must call platformManagedLLMModeForWorkspace instead so a workspace-level
// byok override actually takes effect on the secrets-write path.
func platformManagedLLMMode() bool {
return strings.EqualFold(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")), "platform_managed")
}
// rejectPlatformManagedDirectLLMBypassForWorkspace is the per-workspace
// successor to rejectPlatformManagedDirectLLMBypass (internal#691). The
// strip-list ONLY applies when this specific workspace resolves to
// platform_managed; byok/disabled workspaces can write their own vendor keys.
func rejectPlatformManagedDirectLLMBypassForWorkspace(c *gin.Context, workspaceID, key string) bool {
if !platformManagedLLMModeForWorkspace(c, workspaceID) || !isPlatformManagedDirectLLMBypassKey(key) {
return false
}
c.JSON(http.StatusBadRequest, gin.H{
"error": "direct vendor key writes are blocked for platform-managed workspaces; use MODEL/LLM_PROVIDER or the platform LLM proxy env instead, or set this workspace's billing mode to 'byok' via /admin/workspaces/:id/llm-billing-mode",
"key": key,
"workspace_id": workspaceID,
})
return true
}
// rejectPlatformManagedDirectLLMBypass is the legacy org-level shim. Retained
// only for backwards compatibility with any external/test caller still on the
// old shape; new code MUST use the per-workspace variant above. Production
// code paths (the secrets.go handlers + workspace.go create-secret path) all
// switched in internal#691.
func rejectPlatformManagedDirectLLMBypass(c *gin.Context, key string) bool {
if !platformManagedLLMMode() || !isPlatformManagedDirectLLMBypassKey(key) {
return false
@@ -285,7 +329,7 @@ func (h *SecretsHandler) Set(c *gin.Context) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request body"})
return
}
if rejectPlatformManagedDirectLLMBypass(c, body.Key) {
if rejectPlatformManagedDirectLLMBypassForWorkspace(c, workspaceID, body.Key) {
return
}
@@ -0,0 +1,180 @@
package handlers
// template_schedules.go — read a workspace template's `schedules:`
// block and seed workspace_schedules with source='template'. Mirrors
// the org/import flow (org_import.go) so a workspace created directly
// from a workspace template (e.g. via WorkspaceHandler.Create) lands
// with the same schedule grid the org/import path would have produced.
//
// Issue #24 contract (also enforced by org_import + schedules.go):
// - INSERT new rows with source='template'
// - On (workspace_id, name) collision, only refresh template-source
// rows; runtime-added rows survive re-provisioning untouched
// - Never DELETE (additive only)
//
// The actual INSERT statement is the canonical orgImportScheduleSQL
// defined in org.go — reused here verbatim so the four guarantees
// stay in one place.
//
// Hostile-template defenses (a tenant can upload a config.yaml via
// POST /templates/import or webhook-sync a repo they control):
// - config.yaml is loaded through a 1 MiB LimitReader so a YAML
// anchor-bomb / billion-laughs cannot pre-explode memory before
// unmarshal returns.
// - len(schedules), per-schedule cron length, and resolved prompt
// body length are all bounded; over-sized entries are skipped
// rather than committed.
// - Per-row insert errors and ctx cancellation surface to the
// caller via the returned counts so partial-seed states are
// observable (workspace.go Create logs the (seeded, skipped)
// pair when skipped > 0).
import (
"context"
"errors"
"fmt"
"io"
"log"
"os"
"path/filepath"
"time"
"gopkg.in/yaml.v3"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/scheduler"
)
// Bounds protecting the seeder against hostile or buggy templates.
// All chosen with generous headroom relative to legitimate use
// (reno-stars org template — the largest production schedule grid —
// runs ~10 entries per workspace, each prompt body well under 1 KiB).
const (
maxTemplateConfigYAMLBytes int64 = 1 << 20 // 1 MiB — hard cap on config.yaml size
maxTemplateSchedules = 100 // 10x current largest grid
maxScheduleCronExprLen = 128 // cron-spec syntax is short by construction
maxSchedulePromptBytes = 16 << 10 // 16 KiB after prompt_file resolution
)
// templateConfigSchedules is the minimal shape parsed from a workspace
// template's config.yaml. Only the `schedules:` block is modelled;
// the rest of the file (providers, runtime_config, …) is opaque to
// this loader and continues to flow through the existing pass-through
// in workspace_provision.go.
type templateConfigSchedules struct {
Schedules []OrgSchedule `yaml:"schedules"`
}
// parseTemplateSchedules reads `<templatePath>/config.yaml` and
// returns its `schedules:` block (nil + nil error when the file is
// absent or the block is empty).
//
// The file is read through a 1 MiB LimitReader so a billion-laughs
// or anchor-explosion YAML cannot pre-explode memory before
// Unmarshal returns. Returns an error only when a present
// config.yaml fails to read or parse — callers should treat that as
// a template-author bug rather than a runtime fault. The Create
// handler logs the error and continues so a broken schedules block
// can never block workspace provisioning.
func parseTemplateSchedules(templatePath string) ([]OrgSchedule, error) {
if templatePath == "" {
return nil, nil
}
f, err := os.Open(filepath.Join(templatePath, "config.yaml"))
if err != nil {
if errors.Is(err, os.ErrNotExist) {
return nil, nil
}
return nil, fmt.Errorf("open template config.yaml: %w", err)
}
defer f.Close()
// Read maxTemplateConfigYAMLBytes+1 — if we filled the buffer the
// underlying file exceeded the cap and we refuse to unmarshal.
data, err := io.ReadAll(io.LimitReader(f, maxTemplateConfigYAMLBytes+1))
if err != nil {
return nil, fmt.Errorf("read template config.yaml: %w", err)
}
if int64(len(data)) > maxTemplateConfigYAMLBytes {
return nil, fmt.Errorf("template config.yaml exceeds %d-byte cap", maxTemplateConfigYAMLBytes)
}
var cfg templateConfigSchedules
if err := yaml.Unmarshal(data, &cfg); err != nil {
return nil, fmt.Errorf("parse template config.yaml schedules: %w", err)
}
if len(cfg.Schedules) > maxTemplateSchedules {
return nil, fmt.Errorf("template declares %d schedules; cap is %d", len(cfg.Schedules), maxTemplateSchedules)
}
return cfg.Schedules, nil
}
// seedTemplateSchedules INSERTs (or refreshes) each schedule into
// workspace_schedules with source='template'. Returns (seeded,
// skipped) counts so the caller can observe partial-seed states.
//
// Prompt body resolution mirrors org_import.go: inline `prompt:` wins,
// else `prompt_file:` is resolved relative to templatePath via
// resolvePromptRef. Per-schedule failures (bad cron, missing prompt
// file, DB error, oversize input) are logged with the schedule name
// quoted via %q (CRLF-safe) and skipped so one bad row never breaks
// the rest of the grid. A cancelled ctx breaks the loop early.
//
// Timezone defaults to "UTC" when unset. Env-var expansion in the
// timezone field is intentionally not performed — that mirrors the
// org/import behavior; template authors should pick a literal IANA
// zone (or rely on UTC + operator overrides per-tenant).
func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string, schedules []OrgSchedule) (seeded, skipped int) {
for _, sched := range schedules {
// Honour caller cancellation — protects against long seed
// loops on a request whose client already gave up.
if err := ctx.Err(); err != nil {
log.Printf("Template schedule seed: ctx cancelled after %d/%d on %s: %v", seeded, len(schedules), workspaceID, err)
skipped += len(schedules) - seeded - skipped
return
}
if len(sched.CronExpr) > maxScheduleCronExprLen {
log.Printf("Template schedule seed: cron_expr too long (%d > %d) for %q on %s — skipping", len(sched.CronExpr), maxScheduleCronExprLen, sched.Name, workspaceID)
skipped++
continue
}
tz := sched.Timezone
if tz == "" {
tz = "UTC"
}
enabled := true
if sched.Enabled != nil {
enabled = *sched.Enabled
}
prompt, promptErr := resolvePromptRef(sched.Prompt, sched.PromptFile, templatePath, "")
if promptErr != nil {
log.Printf("Template schedule seed: failed to resolve prompt for %q on %s: %v — skipping", sched.Name, workspaceID, promptErr)
skipped++
continue
}
if prompt == "" {
log.Printf("Template schedule seed: schedule %q on %s has empty prompt — skipping", sched.Name, workspaceID)
skipped++
continue
}
if len(prompt) > maxSchedulePromptBytes {
log.Printf("Template schedule seed: prompt too long (%d > %d bytes) for %q on %s — skipping", len(prompt), maxSchedulePromptBytes, sched.Name, workspaceID)
skipped++
continue
}
nextRun, nextRunErr := scheduler.ComputeNextRun(sched.CronExpr, tz, time.Now())
if nextRunErr != nil {
log.Printf("Template schedule seed: invalid cron for %q on %s: %v — skipping", sched.Name, workspaceID, nextRunErr)
skipped++
continue
}
if _, err := db.DB.ExecContext(ctx, orgImportScheduleSQL,
workspaceID, sched.Name, sched.CronExpr, tz, prompt, enabled, nextRun); err != nil {
log.Printf("Template schedule seed: failed to upsert %q on %s: %v", sched.Name, workspaceID, err)
skipped++
continue
}
seeded++
log.Printf("Template schedule seed: %q (%s, %d chars) upserted on %s (source=template)", sched.Name, sched.CronExpr, len(prompt), workspaceID)
}
return
}
@@ -0,0 +1,141 @@
package handlers
// template_schedules_test.go — unit tests for parseTemplateSchedules.
//
// seedTemplateSchedules' DB INSERT path is already covered indirectly
// by TestImport_OrgScheduleSQLShape (schedules_test.go) since both
// code paths share the canonical orgImportScheduleSQL constant; the
// loop logic (default tz, default enabled, prompt resolution, cron
// validation) is exercised at the parser level here and at the
// orgImportScheduleSQL level there.
import (
"path/filepath"
"testing"
)
func TestParseTemplateSchedules_AbsentFile(t *testing.T) {
dir := t.TempDir()
// No config.yaml in dir.
got, err := parseTemplateSchedules(dir)
if err != nil {
t.Fatalf("expected nil error for absent config.yaml, got %v", err)
}
if got != nil {
t.Fatalf("expected nil slice, got %#v", got)
}
}
func TestParseTemplateSchedules_EmptyTemplatePath(t *testing.T) {
got, err := parseTemplateSchedules("")
if err != nil {
t.Fatalf("expected nil error for empty path, got %v", err)
}
if got != nil {
t.Fatalf("expected nil slice for empty path, got %#v", got)
}
}
func TestParseTemplateSchedules_NoSchedulesBlock(t *testing.T) {
dir := t.TempDir()
mustWriteFile(t, filepath.Join(dir, "config.yaml"), `
name: Some Template
runtime: claude-code
model: foo/bar
`)
got, err := parseTemplateSchedules(dir)
if err != nil {
t.Fatalf("expected nil error when schedules: absent, got %v", err)
}
if len(got) != 0 {
t.Fatalf("expected zero schedules, got %d", len(got))
}
}
func TestParseTemplateSchedules_HappyPath(t *testing.T) {
dir := t.TempDir()
mustWriteFile(t, filepath.Join(dir, "config.yaml"), `
name: SEO Agent
schedules:
- name: Continuous tick
cron_expr: "*/30 * * * *"
timezone: America/Vancouver
prompt: |
Run one SEO tick.
- name: Monday GSC
cron_expr: "0 8 * * 1"
timezone: America/Vancouver
prompt: /seo google
enabled: true
- name: Disabled placeholder
cron_expr: "0 0 1 1 *"
prompt: noop
enabled: false
`)
got, err := parseTemplateSchedules(dir)
if err != nil {
t.Fatalf("parse: %v", err)
}
if len(got) != 3 {
t.Fatalf("expected 3 schedules, got %d", len(got))
}
if got[0].Name != "Continuous tick" || got[0].CronExpr != "*/30 * * * *" {
t.Errorf("schedule[0] mismatch: %+v", got[0])
}
if got[1].Timezone != "America/Vancouver" {
t.Errorf("schedule[1].Timezone = %q, want America/Vancouver", got[1].Timezone)
}
// Enabled is *bool: nil means "default true" at seed time, false is
// explicit opt-out and must survive the YAML round-trip.
if got[2].Enabled == nil {
t.Errorf("schedule[2].Enabled = nil, want *false")
} else if *got[2].Enabled {
t.Errorf("schedule[2].Enabled = true, want false")
}
}
func TestParseTemplateSchedules_MalformedYAML(t *testing.T) {
dir := t.TempDir()
mustWriteFile(t, filepath.Join(dir, "config.yaml"), `
name: Broken
schedules:
- this is: [not, a, valid
`)
_, err := parseTemplateSchedules(dir)
if err == nil {
t.Fatal("expected parse error on malformed YAML, got nil")
}
}
// TestParseTemplateSchedules_RejectsOversizeFile gates against the
// billion-laughs / anchor-bomb DoS class: a hostile config.yaml over
// the 1 MiB cap must be refused before yaml.Unmarshal runs.
func TestParseTemplateSchedules_RejectsOversizeFile(t *testing.T) {
dir := t.TempDir()
// One byte over the cap — fastest path to the gate.
pad := make([]byte, maxTemplateConfigYAMLBytes+1)
for i := range pad {
pad[i] = '#'
}
mustWriteFile(t, filepath.Join(dir, "config.yaml"), string(pad))
if _, err := parseTemplateSchedules(dir); err == nil {
t.Fatal("expected oversize-file error, got nil")
}
}
// TestParseTemplateSchedules_RejectsTooManySchedules gates against a
// hostile config.yaml that flips one row into a 10k-row insert storm.
func TestParseTemplateSchedules_RejectsTooManySchedules(t *testing.T) {
dir := t.TempDir()
var b []byte
b = append(b, []byte("schedules:\n")...)
// maxTemplateSchedules+1 minimal entries — they don't have to be
// valid as schedules because the gate trips before resolution.
for i := 0; i <= maxTemplateSchedules; i++ {
b = append(b, []byte(" - name: s\n cron_expr: \"* * * * *\"\n prompt: x\n")...)
}
mustWriteFile(t, filepath.Join(dir, "config.yaml"), string(b))
if _, err := parseTemplateSchedules(dir); err == nil {
t.Fatal("expected schedule-count error, got nil")
}
}
@@ -568,7 +568,7 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
// nil/empty map is a no-op. Any failure rolls back the workspace insert
// so we never have a workspace row without its intended secrets.
for k, v := range payload.Secrets {
if rejectPlatformManagedDirectLLMBypass(c, k) {
if rejectPlatformManagedDirectLLMBypassForWorkspace(c, id, k) {
tx.Rollback() //nolint:errcheck
return
}
@@ -769,7 +769,8 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
// runtime/model/tier as JSON — the Config tab needs that to render
// even on failed workspaces, so Create owns this Create-only side
// effect rather than coupling Auto to a UI concern.
if !h.provisionWorkspaceAuto(id, templatePath, configFiles, payload) {
provisionOK := h.provisionWorkspaceAuto(id, templatePath, configFiles, payload)
if !provisionOK {
cfgJSON := fmt.Sprintf(`{"name":%q,"runtime":%q,"tier":%d,"template":%q}`,
payload.Name, payload.Runtime, payload.Tier, payload.Template)
if _, err := db.DB.ExecContext(ctx, `
@@ -780,6 +781,32 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
}
}
// Seed schedules declared in the workspace template's config.yaml
// AFTER provisionWorkspaceAuto succeeds so the scheduler never
// fires cron rows against a workspace whose backend never wired
// (review feedback PR #1929#1). Async EC2 provisioning may still
// fail downstream; scheduler.go is expected to handle non-online
// status as a no-op tick. Idempotent across re-creates via
// orgImportScheduleSQL's ON CONFLICT clause; runtime-added rows
// are preserved (Issue #24 contract). Restart does not re-seed
// (so user-deleted template rows stay deleted).
//
// Non-fatal: a broken schedules: block must never block workspace
// provisioning — the workspace row is already live and the grid
// is recoverable via POST /workspaces/{id}/schedules.
if provisionOK && templatePath != "" {
if templateScheds, parseErr := parseTemplateSchedules(templatePath); parseErr != nil {
log.Printf("Create %s: parsing template schedules: %v (continuing)", id, parseErr)
} else if len(templateScheds) > 0 {
seeded, skipped := seedTemplateSchedules(ctx, id, templatePath, templateScheds)
if skipped > 0 {
log.Printf("Create %s: template schedule partial-seed: seeded=%d skipped=%d total=%d", id, seeded, skipped, len(templateScheds))
} else {
log.Printf("Create %s: seeded %d/%d template schedules", id, seeded, len(templateScheds))
}
}
}
c.JSON(http.StatusCreated, gin.H{
"id": id,
"status": "provisioning",
@@ -574,7 +574,12 @@ func (h *WorkspaceHandler) CascadeDelete(ctx context.Context, id string) ([]stri
var stopErrs []error
stopAndRemove := func(wsID string) {
if err := h.StopWorkspaceAuto(cleanupCtx, wsID); err != nil {
// Delete-path stop uses bounded retry (matches the restart path) and
// records a durable structure_events row on exhaustion so a leaked /
// pending EC2 is queryable and handed off to the CP-orphan-sweeper —
// rather than the bare one-shot StopWorkspaceAuto that produced the
// silent-leak class (task #15 / workspace-ec2-leak).
if err := h.stopWorkspaceForDelete(cleanupCtx, wsID); err != nil {
log.Printf("CascadeDelete %s stop failed: %v — leaving cleanup for orphan sweeper", wsID, err)
stopErrs = append(stopErrs, fmt.Errorf("stop %s: %w", wsID, err))
return
@@ -0,0 +1,102 @@
package handlers
// workspace_delete_stop_retry_test.go — pins the contract of the
// delete-path EC2 stop retry (task #15 / workspace-ec2-leak).
//
// Background (Phase 1 evidence): the DELETE path's StopWorkspaceAuto →
// cpProv.Stop had NO retry, while the restart path used cpStopWithRetry
// (bounded exponential backoff). A transient CP/AWS hiccup on delete left
// the workspace row at status='removed' with instance_id still populated,
// returned a 500, and relied entirely on the 60s CP-orphan-sweeper to
// re-drive the terminate. For a cascade *descendant* whose own row is
// already 'removed', the inline retry-via-client-replay is defeated by
// CascadeDelete's `status != 'removed'` CTE filter — so the only inline
// recovery is this bounded retry.
//
// Contract of stopWorkspaceForDelete:
// - CP path: bounded retry (cpStopRetryAttempts, exp backoff) on
// cpProv.Stop; returns nil on eventual success.
// - On retry exhaustion: returns the terminal error AND emits a
// `workspace.delete.terminate_retry_exhausted` structure_events row so
// the leak decision is queryable (structured-logging gate), not just a
// log.Printf. The row is the durable pending-terminate signal: the row
// stays status='removed' with instance_id populated, which is exactly
// what the CP-orphan-sweeper (registry/cp_orphan_sweeper.go) re-drives.
// - Docker path: single Stop, no retry (local daemon failure won't heal
// on retry — matches RestartWorkspaceAuto's Docker rationale).
// - No backend wired: nil (nothing to stop).
import (
"context"
"errors"
"strings"
"testing"
"github.com/DATA-DOG/go-sqlmock"
)
func TestStopWorkspaceForDelete_CPRetriesTransientThenSucceeds(t *testing.T) {
shrinkRetryBackoff(t)
buf := captureLog(t)
// 2 transient failures then success — within the 3-attempt budget.
stub := &scriptedCPStop{errs: []error{
errors.New("cp 503 attempt 1"),
errors.New("cp 503 attempt 2"),
}}
h := &WorkspaceHandler{cpProv: stub}
err := h.stopWorkspaceForDelete(context.Background(), "ws-del-1")
if err != nil {
t.Fatalf("expected nil error on eventual success, got %v", err)
}
if stub.calls != 3 {
t.Errorf("expected 3 Stop calls (2 fails + 1 success), got %d", stub.calls)
}
if strings.Contains(buf.String(), "terminate_retry_exhausted") {
t.Errorf("eventual success must NOT log retry-exhausted; got %q", buf.String())
}
}
func TestStopWorkspaceForDelete_CPExhaustsEmitsDurableEventAndReturnsError(t *testing.T) {
shrinkRetryBackoff(t)
mock := setupTestDB(t)
buf := captureLog(t)
stub := &scriptedCPStop{errs: []error{
errors.New("cp 502 attempt 1"),
errors.New("cp 502 attempt 2"),
errors.New("cp 502 final"),
}}
h := &WorkspaceHandler{cpProv: stub}
// On exhaustion the helper persists a durable pending-terminate row so
// the leak decision is queryable. structure_events is the audit-of-record.
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
err := h.stopWorkspaceForDelete(context.Background(), "ws-doomed")
if err == nil {
t.Fatal("expected terminal error on retry exhaustion, got nil")
}
if stub.calls != cpStopRetryAttempts {
t.Errorf("expected %d Stop calls when all fail, got %d", cpStopRetryAttempts, stub.calls)
}
if !strings.Contains(err.Error(), "cp 502 final") {
t.Errorf("returned error should wrap the LAST attempt's error, got %v", err)
}
if e := mock.ExpectationsWereMet(); e != nil {
t.Fatalf("expected structure_events INSERT on exhaustion: %v", e)
}
// The LEAK-SUSPECT line stays the operator-facing prose bridge to the
// orphan reconciler; assert it carries the delete source so triage can
// distinguish delete-leaks from restart-leaks.
if !strings.Contains(buf.String(), "LEAK-SUSPECT") {
t.Errorf("expected LEAK-SUSPECT log on exhaustion, got %q", buf.String())
}
}
func TestStopWorkspaceForDelete_NoBackendIsNoOp(t *testing.T) {
h := &WorkspaceHandler{} // cpProv nil, provisioner nil
if err := h.stopWorkspaceForDelete(context.Background(), "ws-x"); err != nil {
t.Errorf("expected nil no-op with no backend, got %v", err)
}
}
@@ -31,9 +31,11 @@ package handlers
import (
"context"
"encoding/json"
"log"
"time"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/db"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/models"
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/provlog"
)
@@ -207,6 +209,86 @@ func (h *WorkspaceHandler) StopWorkspaceAuto(ctx context.Context, workspaceID st
return nil
}
// stopWorkspaceForDelete is the DELETE-path stop dispatcher. It differs
// from StopWorkspaceAuto in exactly one way: the CP (EC2) path gets the
// same bounded retry the restart path uses (cpStopWithRetryErr), and on
// retry exhaustion it persists a durable `workspace.delete.terminate_retry_exhausted`
// event to structure_events (the structured-logging gate) so the leak
// decision is queryable, not just stdout prose.
//
// Why retry here (task #15 / workspace-ec2-leak): the bare cpProv.Stop on
// delete left a transient CP/AWS hiccup as an immediate 500 with no inline
// recovery. For a cascade *descendant* the "client retries → replays
// terminate" recovery is defeated by CascadeDelete's `status != 'removed'`
// CTE filter (the descendant's row is already 'removed', so a retry walks
// zero descendant rows). Bounded retry absorbs the transient class inline;
// the durable event + the row staying status='removed'+instance_id is the
// hand-off to the 60s CP-orphan-sweeper (registry/cp_orphan_sweeper.go) for
// the (rarer) sustained-outage case.
//
// We deliberately do NOT clear status='removed' on exhaustion — the
// CP-orphan-sweeper's recovery query keys on exactly that state, so
// reverting it would break the existing backstop. The error is still
// returned so the HTTP Delete handler surfaces the retryable 500.
//
// Docker path: single Stop, no retry — a local daemon that fails to stop a
// container won't heal on retry (matches RestartWorkspaceAuto's Docker
// rationale); the orphan-container sweeper (registry/orphan_sweeper.go) is
// the Docker-side backstop.
func (h *WorkspaceHandler) stopWorkspaceForDelete(ctx context.Context, workspaceID string) error {
if h.cpProv != nil {
if err := h.cpStopWithRetryErr(ctx, workspaceID, "Delete"); err != nil {
h.emitDeleteTerminateRetryExhausted(ctx, workspaceID, err)
return err
}
return nil
}
if h.provisioner != nil {
return h.provisioner.Stop(ctx, workspaceID)
}
return nil
}
// emitDeleteTerminateRetryExhausted persists a durable record that the
// delete-path EC2 terminate could not be completed inline after the full
// retry budget. Per the §Persistent structured logging gate: a
// state-mutating decision (we are leaving a known-leaked-or-pending EC2 for
// the orphan sweeper) must land in structure_events, not just log.Printf.
//
// Event-type taxonomy (append-only; never rename):
//
// workspace.delete.terminate_retry_exhausted — delete-path cpProv.Stop
// exhausted its retry budget; row stays status='removed' with
// instance_id populated for the CP-orphan-sweeper to re-drive.
//
// Telemetry never blocks the request path: marshal / INSERT failures are
// logged and swallowed.
func (h *WorkspaceHandler) emitDeleteTerminateRetryExhausted(ctx context.Context, workspaceID string, cause error) {
payload := map[string]any{
"workspace_id": workspaceID,
"attempts": cpStopRetryAttempts,
"last_error": cause.Error(),
// recovery_path documents WHO is expected to finish the terminate,
// so a reader of the audit row doesn't have to grep the code to
// know the EC2 isn't simply abandoned.
"recovery_path": "cp_orphan_sweeper",
}
payloadJSON, err := json.Marshal(payload)
if err != nil {
log.Printf("emitDeleteTerminateRetryExhausted: marshal payload failed for %s: %v", workspaceID, err)
return
}
if db.DB == nil {
return
}
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO structure_events (event_type, workspace_id, payload, created_at)
VALUES ($1, $2, $3, now())
`, "workspace.delete.terminate_retry_exhausted", workspaceID, payloadJSON); err != nil {
log.Printf("emitDeleteTerminateRetryExhausted: insert failed for %s: %v", workspaceID, err)
}
}
// RestartWorkspaceAuto stops the running workload (with retry semantics
// tuned for the restart hot path) then starts provisioning again, in a
// detached goroutine. Returns true when a backend was kicked off, false
@@ -922,11 +922,55 @@ func applyRuntimeModelEnv(envVars map[string]string, runtime, model string) {
}
// applyPlatformManagedLLMEnv wires the control-plane LLM proxy into a
// workspace only when the org is in platform-managed mode. Provider keys
// never enter the tenant; provider SDK API-key envs receive the tenant token
// for the CP proxy only when the workspace has not supplied BYOK/OAuth auth.
func applyPlatformManagedLLMEnv(envVars map[string]string, runtime string, model string) {
if strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE"))) != "platform_managed" {
// workspace only when the RESOLVED billing mode for this workspace is
// platform_managed. "Resolved" means: the workspace-level override (if any)
// wins over the org default (delivered via tenant_config in MOLECULE_LLM_BILLING_MODE).
//
// Pre-internal#691 this gate read the org-level env var directly, which made
// it impossible to mix billing modes across workspaces in the same org. The
// resolver (ResolveLLMBillingMode) is the single source of truth now; the
// architectural test asserts no remaining code path gates on os.Getenv
// ("MOLECULE_LLM_BILLING_MODE") for strip-decision purposes — that env value
// is still read INTO the resolver as the org-default input, but it is never
// the final decision.
//
// Default-closed: any resolver error / NULL JOIN / garbled enum value
// collapses to platform_managed (see llm_billing_mode.go for the contract).
// This preserves the existing implicit default exactly while making the
// per-workspace opt-out path safe.
//
// The resolved mode is exported into the workspace container as
// MOLECULE_LLM_BILLING_MODE_RESOLVED so an in-container debug check can
// answer "what mode is this workspace running under" without DB queries
// (RFC Observability hot-spot).
func applyPlatformManagedLLMEnv(ctx context.Context, envVars map[string]string, workspaceID, runtime, model string) {
orgMode := strings.ToLower(strings.TrimSpace(os.Getenv("MOLECULE_LLM_BILLING_MODE")))
res, resolveErr := ResolveLLMBillingMode(ctx, workspaceID, orgMode)
if resolveErr != nil {
// resolveErr != nil ⇒ resolver hit a DB error AND already defaulted
// res.ResolvedMode to platform_managed. Log + proceed; the safe default
// is already in place, no early return needed.
log.Printf("workspace_provision: resolve billing mode workspace=%s err=%v (defaulting to platform_managed)", workspaceID, resolveErr)
}
log.Printf("workspace_provision: billing mode workspace=%s resolved=%s source=%s org_default=%s", workspaceID, res.ResolvedMode, res.Source, res.OrgDefault)
// internal#703: MOLECULE_LLM_BILLING_MODE in the container must reflect the
// RESOLVED per-workspace mode, not a hardcoded literal. Pre-fix this var was
// only emitted (hardcoded "platform_managed") on the strip path below, so a
// byok/disabled container never carried a truthful billing-mode value — only
// MOLECULE_LLM_BILLING_MODE_RESOLVED. Emit both here, resolver-driven, for
// every mode so the value is correct on the byok/disabled early-return path
// too (and downstream consumers / debug shells see byok, not platform_managed).
envVars["MOLECULE_LLM_BILLING_MODE"] = res.ResolvedMode
// Observability: surface the resolved mode in the container env so the
// agent / debug shell can answer "why is my key being stripped" without
// pulling logs or hitting the admin route.
envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"] = res.ResolvedMode
if res.ResolvedMode != LLMBillingModePlatformManaged {
// byok or disabled — DO NOT strip vendor keys, DO NOT force-route to CP,
// DO NOT override the workspace own ANTHROPIC_BASE_URL / OAuth token.
// Leave envVars alone so CLAUDE_CODE_OAUTH_TOKEN / vendor API keys
// pulled from workspace_secrets survive into the container, and the
// workspace talks to its own provider directly (internal#703).
return
}
baseURL := firstNonEmptyEnv("MOLECULE_LLM_BASE_URL", "OPENAI_BASE_URL")
@@ -937,7 +981,8 @@ func applyPlatformManagedLLMEnv(envVars map[string]string, runtime string, model
}
stripPlatformManagedLLMBypassEnv(envVars)
envVars["MOLECULE_LLM_BILLING_MODE"] = "platform_managed"
// MOLECULE_LLM_BILLING_MODE is already set to res.ResolvedMode (==
// platform_managed on this path) above (internal#703); no hardcode here.
envVars["MOLECULE_LLM_BASE_URL"] = baseURL
envVars["MOLECULE_LLM_USAGE_TOKEN"] = token
if anthropicBaseURL != "" {
@@ -970,7 +1015,7 @@ func stripPlatformManagedLLMBypassEnv(envVars map[string]string) {
}
func runtimeUsesAnthropicNativeProxy(runtime string) bool {
return strings.TrimSpace(strings.ToLower(runtime)) == "claude-code"
return strings.EqualFold(strings.TrimSpace(runtime), "claude-code")
}
func firstNonEmptyEnv(names ...string) string {
@@ -193,7 +193,7 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// continue to rely on workspace_secrets / org-import persona-env
// merge for their git auth.
applyAgentGitHTTPCreds(envVars, payload.Role)
applyPlatformManagedLLMEnv(envVars, payload.Runtime, payload.Model)
applyPlatformManagedLLMEnv(ctx, envVars, workspaceID, payload.Runtime, payload.Model)
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
@@ -972,7 +972,7 @@ func TestApplyPlatformManagedLLMEnv_NonClaudeRuntimeDefaultsOpenAIProxyWhenNoWor
t.Setenv("MOLECULE_LLM_DEFAULT_MODEL", "moonshot/kimi-k2.6")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(envVars, "codex", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "codex", "")
applyRuntimeModelEnv(envVars, "codex", "")
if got := envVars["OPENAI_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/openai/v1" {
@@ -1002,7 +1002,7 @@ func TestApplyPlatformManagedLLMEnv_StripsWorkspaceOpenAIKeyForClaudeCode(t *tes
"OPENAI_BASE_URL": "https://api.openai.com/v1",
"MODEL": "openai/gpt-5.5",
}
applyPlatformManagedLLMEnv(envVars, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should be stripped for claude-code platform-managed mode")
@@ -1028,7 +1028,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeUsesAnthropicProxyOverOAuth(t *tes
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(envVars, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped in platform-managed mode")
@@ -1051,7 +1051,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeInjectsAnthropicProxyWhenNoWorkspa
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(envVars, "claude-code", "minimax/MiniMax-M2.7")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "minimax/MiniMax-M2.7")
if got := envVars["ANTHROPIC_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/anthropic/v1" {
t.Fatalf("ANTHROPIC_BASE_URL = %q", got)
@@ -1074,7 +1074,7 @@ func TestApplyPlatformManagedLLMEnv_ClaudeCodeStripsVendorBYOK(t *testing.T) {
"MINIMAX_API_KEY": "user-minimax-key",
"MODEL": "MiniMax-M2.7",
}
applyPlatformManagedLLMEnv(envVars, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
if _, ok := envVars["MINIMAX_API_KEY"]; ok {
t.Fatalf("MINIMAX_API_KEY should be stripped in platform-managed mode")
@@ -1096,7 +1096,7 @@ func TestApplyPlatformManagedLLMEnv_NoopsOutsidePlatformManaged(t *testing.T) {
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{}
applyPlatformManagedLLMEnv(envVars, "claude-code", "")
applyPlatformManagedLLMEnv(context.Background(), envVars, "", "claude-code", "")
if _, ok := envVars["OPENAI_API_KEY"]; ok {
t.Fatalf("OPENAI_API_KEY should not be set outside platform-managed mode")
@@ -1106,6 +1106,112 @@ func TestApplyPlatformManagedLLMEnv_NoopsOutsidePlatformManaged(t *testing.T) {
}
}
// TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv is the
// internal#703 regression guard: a per-workspace byok override (org-level
// MOLECULE_LLM_BILLING_MODE left at the platform_managed bootstrap floor)
// must resolve to byok and leave the workspace own provider env intact —
// the CP-injected proxy ANTHROPIC_BASE_URL / usage token must NOT be forced,
// the OAuth token must NOT be stripped, and MOLECULE_LLM_BILLING_MODE in the
// container must read the RESOLVED mode (byok), not the hardcoded literal.
//
// This is the discriminating test for the byok end-to-end fix: pre-fix the
// strip path was the only emitter of MOLECULE_LLM_BILLING_MODE (hardcoded
// "platform_managed"), so a byok container carried no truthful billing mode.
func TestApplyPlatformManagedLLMEnv_ClaudeCodeByokKeepsOwnProviderEnv(t *testing.T) {
const wsID = "77777777-7777-7777-7777-777777777777"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModeBYOK))
// Org-level env left at the bootstrap floor — the per-workspace override
// is what must flip this workspace to byok (the realistic prod shape).
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
// The workspace brought its own Claude Code OAuth token (BYOK via the
// subscription provider). It must survive untouched.
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "")
// 1. OAuth token intact — not stripped.
if got := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; got != "user-oauth-token" {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN = %q, want it left intact for byok", got)
}
// 2. No CP proxy base URL / usage token forced onto the workspace.
if got, ok := envVars["ANTHROPIC_BASE_URL"]; ok {
t.Fatalf("ANTHROPIC_BASE_URL must NOT be injected for byok, got %q", got)
}
if got, ok := envVars["ANTHROPIC_API_KEY"]; ok {
t.Fatalf("ANTHROPIC_API_KEY must NOT be injected for byok, got %q", got)
}
if got, ok := envVars["MOLECULE_LLM_ANTHROPIC_BASE_URL"]; ok {
t.Fatalf("MOLECULE_LLM_ANTHROPIC_BASE_URL must NOT be injected for byok, got %q", got)
}
if got, ok := envVars["MOLECULE_LLM_USAGE_TOKEN"]; ok {
t.Fatalf("MOLECULE_LLM_USAGE_TOKEN must NOT be injected for byok, got %q", got)
}
// 3. Billing mode in the container reflects the RESOLVED mode (byok).
if got := envVars["MOLECULE_LLM_BILLING_MODE"]; got != LLMBillingModeBYOK {
t.Fatalf("MOLECULE_LLM_BILLING_MODE = %q, want %q (resolver-driven, not hardcoded)", got, LLMBillingModeBYOK)
}
if got := envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"]; got != LLMBillingModeBYOK {
t.Fatalf("MOLECULE_LLM_BILLING_MODE_RESOLVED = %q, want %q", got, LLMBillingModeBYOK)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode is the
// no-regression companion: a workspace that resolves to platform_managed must
// still strip + force the proxy AND emit MOLECULE_LLM_BILLING_MODE=
// platform_managed (now resolver-driven, internal#703). Proves the byok fix
// did not alter the platform_managed contract.
func TestApplyPlatformManagedLLMEnv_PlatformManagedStillEmitsResolvedMode(t *testing.T) {
const wsID = "88888888-8888-8888-8888-888888888888"
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}).AddRow(LLMBillingModePlatformManaged))
t.Setenv("MOLECULE_LLM_BILLING_MODE", LLMBillingModePlatformManaged)
t.Setenv("MOLECULE_LLM_BASE_URL", "https://api.example.test/api/v1/internal/llm/openai/v1")
t.Setenv("MOLECULE_LLM_ANTHROPIC_BASE_URL", "https://api.example.test/api/v1/internal/llm/anthropic")
t.Setenv("MOLECULE_LLM_USAGE_TOKEN", "tenant-admin-token")
envVars := map[string]string{
"CLAUDE_CODE_OAUTH_TOKEN": "user-oauth-token",
"MODEL": "sonnet",
}
applyPlatformManagedLLMEnv(context.Background(), envVars, wsID, "claude-code", "")
// OAuth stripped, proxy forced — unchanged platform_managed contract.
if _, ok := envVars["CLAUDE_CODE_OAUTH_TOKEN"]; ok {
t.Fatalf("CLAUDE_CODE_OAUTH_TOKEN should be stripped for platform_managed")
}
if got := envVars["ANTHROPIC_BASE_URL"]; got != "https://api.example.test/api/v1/internal/llm/anthropic" {
t.Fatalf("ANTHROPIC_BASE_URL = %q, want proxy forced for platform_managed", got)
}
if got := envVars["ANTHROPIC_API_KEY"]; got != "tenant-admin-token" {
t.Fatalf("ANTHROPIC_API_KEY = %q, want usage token for platform_managed", got)
}
if got := envVars["MOLECULE_LLM_BILLING_MODE"]; got != LLMBillingModePlatformManaged {
t.Fatalf("MOLECULE_LLM_BILLING_MODE = %q, want %q", got, LLMBillingModePlatformManaged)
}
if got := envVars["MOLECULE_LLM_BILLING_MODE_RESOLVED"]; got != LLMBillingModePlatformManaged {
t.Fatalf("MOLECULE_LLM_BILLING_MODE_RESOLVED = %q, want %q", got, LLMBillingModePlatformManaged)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestApplyRuntimeModelEnv_PersonaEnvMODELSecretPreserved locks in the
// 2026-05-08 fix that prevents the MODEL_PROVIDER-as-slug fallback from
// silently overwriting a per-persona MODEL workspace_secret on restart,
@@ -1616,3 +1616,28 @@ func (*mockResolver) Scheme() string { return "" }
func (m *mockResolver) Fetch(_ context.Context, _, _ string) (string, error) {
return m.fetchName, m.fetchErr
}
// TestRuntimeUsesAnthropicNativeProxy_CaseAndWhitespace proves the
// strings.EqualFold hardening: the runtime check now matches "claude-code"
// case-insensitively (and after trimming whitespace) instead of relying on
// a lowercased exact compare.
func TestRuntimeUsesAnthropicNativeProxy_CaseAndWhitespace(t *testing.T) {
cases := []struct {
runtime string
want bool
}{
{"claude-code", true},
{"Claude-Code", true},
{"CLAUDE-CODE", true},
{" claude-code ", true},
{"\tClaude-Code\n", true},
{"claude-code-x", false},
{"codex", false},
{"", false},
}
for _, c := range cases {
if got := runtimeUsesAnthropicNativeProxy(c.runtime); got != c.want {
t.Errorf("runtimeUsesAnthropicNativeProxy(%q) = %v, want %v", c.runtime, got, c.want)
}
}
}
@@ -721,8 +721,31 @@ var cpStopRetryBaseDelay = 1 * time.Second
//
// Returns nothing — caller's contract is unchanged.
func (h *WorkspaceHandler) cpStopWithRetry(ctx context.Context, workspaceID, source string) {
// Restart's contract is "make the workspace alive again": it proceeds
// with reprovision regardless of the Stop outcome, so it discards the
// terminal error. The delete path needs the error (it must keep the
// row recoverable for the orphan-sweeper + emit a durable event), so
// the actual retry loop lives in cpStopWithRetryErr below.
_ = h.cpStopWithRetryErr(ctx, workspaceID, source)
}
// cpStopWithRetryErr is the shared bounded-retry core for cpProv.Stop.
// It returns the terminal error so callers that need to react to a leak
// (the DELETE path's stopWorkspaceForDelete) can do so, while
// cpStopWithRetry keeps its void contract for the restart paths.
//
// Behaviour (unchanged from the original cpStopWithRetry loop):
// - cpProv nil → nil (no-op; nothing to stop).
// - success on attempt N → nil; logs a retry-success line when N > 1.
// - ctx cancelled mid-retry → returns ctx.Err(); logs an "abandoned"
// line and deliberately does NOT emit LEAK-SUSPECT (operator-initiated
// drain is a different signal than "we tried hard and failed").
// - all attempts fail → returns the LAST attempt's error and emits the
// stable `LEAK-SUSPECT cpProv.Stop ...` log line so the CP-side orphan
// reconciler can correlate by workspace_id.
func (h *WorkspaceHandler) cpStopWithRetryErr(ctx context.Context, workspaceID, source string) error {
if h.cpProv == nil {
return
return nil
}
var lastErr error
delay := cpStopRetryBaseDelay
@@ -732,7 +755,7 @@ func (h *WorkspaceHandler) cpStopWithRetry(ctx context.Context, workspaceID, sou
if attempt > 1 {
log.Printf("%s: cpProv.Stop(%s) succeeded on attempt %d", source, workspaceID, attempt)
}
return
return nil
}
lastErr = err
if attempt == cpStopRetryAttempts {
@@ -740,12 +763,14 @@ func (h *WorkspaceHandler) cpStopWithRetry(ctx context.Context, workspaceID, sou
}
// Sleep with ctx awareness so a cancelled ctx exits early instead
// of stalling the goroutine through the remaining backoff.
timer := time.NewTimer(delay)
select {
case <-ctx.Done():
timer.Stop()
log.Printf("%s: cpProv.Stop(%s) abandoned mid-retry: ctx cancelled (last_err=%v)",
source, workspaceID, lastErr)
return
case <-time.After(delay):
return ctx.Err()
case <-timer.C:
}
delay *= 2
}
@@ -753,6 +778,7 @@ func (h *WorkspaceHandler) cpStopWithRetry(ctx context.Context, workspaceID, sou
// so logs are greppable / parseable for the CP-side orphan reconciler.
log.Printf("LEAK-SUSPECT cpProv.Stop workspace_id=%s source=%s attempts=%d last_err=%q",
workspaceID, source, cpStopRetryAttempts, lastErr.Error())
return lastErr
}
// runRestartCycle does the actual stop+provision work for one restart
@@ -248,8 +248,13 @@ func TestRestart_CPStopOnlyInsideRetryHelper(t *testing.T) {
if !ok || fn.Body == nil || fn.Recv == nil {
continue
}
// cpStopWithRetry is the ONE allowed home for h.cpProv.Stop.
if fn.Name.Name == "cpStopWithRetry" {
// cpStopWithRetryErr is the ONE allowed home for h.cpProv.Stop
// the bounded-retry loop. cpStopWithRetry is the void-returning
// wrapper (restart path) that delegates to it; the delete path uses
// cpStopWithRetryErr directly via stopWorkspaceForDelete to capture
// the terminal error (task #15). Both wrappers are exempt from this
// gate; any OTHER direct cpProv.Stop is the silent-leak regression.
if fn.Name.Name == "cpStopWithRetry" || fn.Name.Name == "cpStopWithRetryErr" {
continue
}
ast.Inspect(fn.Body, func(n ast.Node) bool {
@@ -501,6 +501,10 @@ func TestWorkspaceCreate_WithSecrets_Persists(t *testing.T) {
// while persisting a secret causes the entire transaction to roll back and
// the handler to return 500. The workspace row must NOT be committed.
func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
// internal#691: see TestExtended_SecretsSet — same default-closed reasoning.
// This test is asserting the rollback path on DB failure, not the strip gate;
// keep the org in byok so the OPENAI_API_KEY write reaches the INSERT.
t.Setenv("MOLECULE_LLM_BILLING_MODE", "byok")
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
@@ -509,6 +513,14 @@ func TestWorkspaceCreate_SecretPersistFails_RollsBack(t *testing.T) {
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WillReturnResult(sqlmock.NewResult(0, 1))
// internal#691: Create() now resolves billing mode per-workspace before
// the secret-strip gate. The workspace row was just inserted in the same
// transaction so it isn't readable from a separate query yet; the
// resolver expects the SELECT and the mock returns no row → falls back
// to the org default (byok, set above) so the OPENAI_API_KEY write
// reaches the INSERT-and-fail path this test exercises.
mock.ExpectQuery(`SELECT llm_billing_mode FROM workspaces WHERE id = \$1`).
WillReturnRows(sqlmock.NewRows([]string{"llm_billing_mode"}))
mock.ExpectExec("INSERT INTO workspace_secrets").
WillReturnError(sql.ErrConnDone) // DB failure while writing secret
mock.ExpectRollback() // workspace insert must be rolled back
@@ -110,10 +110,15 @@ func WorkspaceAuth(database *sql.DB) gin.HandlerFunc {
c.Next()
return
}
// Same-origin canvas on tenant image — Referer matches Host.
if isSameOriginCanvas(c) {
c.Next()
return
// SaaS-canvas path: a browser cookie is acceptable only after the
// control plane confirms membership in this tenant. Referer/Origin
// are forgeable and must never authenticate workspace data routes.
if cookieHeader := c.GetHeader("Cookie"); cookieHeader != "" {
if ok, _ := VerifiedCPSession(cookieHeader); ok {
c.Set("cp_session_actor", cpSessionActor(cookieHeader))
c.Next()
return
}
}
// Local-dev escape hatch — see devmode.go. Unreachable on SaaS
// (hosted tenants always have ADMIN_TOKEN + MOLECULE_ENV=production).
@@ -407,3 +412,40 @@ func isSameOriginCanvas(c *gin.Context) bool {
origin := c.GetHeader("Origin")
return origin == "https://"+host || origin == "http://"+host
}
// cpSessionConfigured reports whether this platform is wired for upstream
// session-cookie verification — i.e. it runs as a SaaS tenant image with
// both CP_UPSTREAM_URL and MOLECULE_ORG_SLUG set. When false (self-hosted /
// dev), VerifiedCPSession can never succeed, so callers that want a
// non-forgeable canvas signal in SaaS while still working in dev can use
// this to decide whether the forgeable same-origin fallback is acceptable.
func cpSessionConfigured() bool {
return os.Getenv("CP_UPSTREAM_URL") != "" && tenantSlug() != ""
}
// CPSessionConfigured is the exported form of cpSessionConfigured for callers
// outside this package (e.g. the A2A proxy's canvas-user classification).
func CPSessionConfigured() bool {
return cpSessionConfigured()
}
// IsVerifiedCanvasSession returns true ONLY when the request carries a WorkOS
// session cookie that the control plane confirms belongs to a member of THIS
// tenant's org (via /cp/auth/tenant-member). Unlike IsSameOriginCanvas — whose
// Host/Referer/Origin inputs are trivially forgeable by any container on the
// Docker network and which is therefore documented as cosmetic-only (see
// AdminAuth / CanvasOrBearer comments above, #623/#194) — this is a real,
// upstream-verified authentication boundary. It is the correct gate for
// non-cosmetic actions such as A2A dispatch on behalf of a canvas user.
//
// Returns false (no network call) in self-hosted / dev deployments where
// CP_UPSTREAM_URL / MOLECULE_ORG_SLUG are unset; callers should treat that as
// "no verified canvas session available" and fall back accordingly.
func IsVerifiedCanvasSession(c *gin.Context) bool {
cookie := c.GetHeader("Cookie")
if cookie == "" {
return false
}
valid, _ := VerifiedCPSession(cookie)
return valid
}
@@ -75,6 +75,90 @@ func TestWorkspaceAuth_351_NoBearer_Returns401_NoDBCalls(t *testing.T) {
}
}
// TestWorkspaceAuth_ForgedSameOriginHeaders_Returns401 pins the production
// boundary for combined tenant images: Referer/Origin are forgeable request
// headers and must not authenticate workspace data routes.
func TestWorkspaceAuth_ForgedSameOriginHeaders_Returns401(t *testing.T) {
t.Setenv("MOLECULE_ENV", "production")
t.Setenv("ADMIN_TOKEN", "admin-secret")
prev := canvasProxyActive
canvasProxyActive = true
defer func() { canvasProxyActive = prev }()
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("sqlmock.New: %v", err)
}
defer mockDB.Close()
r := gin.New()
r.GET("/workspaces/:id/secrets", WorkspaceAuth(mockDB), func(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"ok": true})
})
r.DELETE("/workspaces/:id/secrets/:key", WorkspaceAuth(mockDB), func(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"ok": true})
})
for _, tt := range []struct {
name string
method string
path string
}{
{"list secrets", http.MethodGet, "/workspaces/ws-forged/secrets"},
{"delete secret", http.MethodDelete, "/workspaces/ws-forged/secrets/HERMES_CUSTOM_API_KEY"},
} {
t.Run(tt.name, func(t *testing.T) {
w := httptest.NewRecorder()
req, _ := http.NewRequest(tt.method, tt.path, nil)
req.Host = "tenant.example.test"
req.Header.Set("Referer", "https://tenant.example.test/")
req.Header.Set("Origin", "https://tenant.example.test")
r.ServeHTTP(w, req)
if w.Code != http.StatusUnauthorized {
t.Fatalf("forged same-origin headers: expected 401, got %d: %s", w.Code, w.Body.String())
}
})
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestWorkspaceAuth_VerifiedTenantSessionCookie_AllowsCanvas(t *testing.T) {
resetSessionCache()
t.Setenv("MOLECULE_ENV", "production")
t.Setenv("ADMIN_TOKEN", "admin-secret")
t.Setenv("MOLECULE_ORG_SLUG", "tenant-a")
srv, _ := mockCPServer(t, http.StatusOK, `{"member":true,"user_id":"u_1","role":"owner","org_id":"org_1"}`)
t.Setenv("CP_UPSTREAM_URL", srv.URL)
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("sqlmock.New: %v", err)
}
defer mockDB.Close()
r := gin.New()
r.GET("/workspaces/:id/secrets", WorkspaceAuth(mockDB), func(c *gin.Context) {
if _, ok := c.Get("cp_session_actor"); !ok {
t.Errorf("cp_session_actor was not set")
}
c.JSON(http.StatusOK, gin.H{"ok": true})
})
w := httptest.NewRecorder()
req, _ := http.NewRequest(http.MethodGet, "/workspaces/ws-session/secrets", nil)
req.Header.Set("Cookie", "molecule_session=valid")
r.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Fatalf("verified tenant session: expected 200, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestWorkspaceAuth_C4_C8_NoBearer_Returns401 — C4/C8 critical path:
// when a workspace has live tokens and the caller sends NO bearer token,
// the middleware must return 401. This was the confirmed attack vector —
@@ -202,7 +202,9 @@ func (p *CPProvisioner) Start(ctx context.Context, cfg WorkspaceConfig) (string,
// - Rejects symlinks at the template root (prevents bypass via symlink traversal)
// - Skips symlinks during WalkDir (prevents /etc/passwd etc. inclusion)
// - Validates all paths are relative and non-escaping
// - Caps total size at 12 KiB to prevent payload bloat
// - Caps total size at cpConfigFilesMaxBytes (a transport-DoS guard,
// not the retired 12 KiB user-data ceiling — config now ships off
// user-data via the CP's Secrets-Manager seeding path)
configFiles, err := collectCPConfigFiles(cfg)
if err != nil {
return "", fmt.Errorf("cp provisioner: collect config files: %w", err)
@@ -277,7 +279,27 @@ func (p *CPProvisioner) Start(ctx context.Context, cfg WorkspaceConfig) (string,
return result.InstanceID, nil
}
const cpConfigFilesMaxBytes = 12 << 10
// cpConfigFilesMaxBytes bounds the aggregate config bundle this tenant
// ships to the control plane. It is a transport-DoS guard, NOT the old
// EC2-user-data ceiling.
//
// History: this was 12 KiB (12<<10) because the CP embedded the bundle in
// EC2 user-data, which AWS caps at 16 KiB (the cap left ~4 KiB for bootstrap
// overhead). That ceiling failed real customers — the jrs-auto SEO Agent's
// config (long SEO system prompt + SERVICES_REPO_WEBSITE + a 12-schedule
// block baked into config.yaml) exceeds 12 KiB, so Start() rejected it
// client-side with "config files exceed 12288 bytes" and the workspace
// could never provision.
//
// Config delivery now goes OFF user-data: the CP stages the bundle to AWS
// Secrets Manager (molecule/workspace/<id>/config) at provision time and the
// workspace fetches it into /configs at boot (mirrors the proven tenant
// bootstrap-secrets pattern). The bundle travels here only inside the JSON
// HTTP request body to the CP, which has no 16 KiB limit. The remaining
// bound exists purely so a buggy/hostile tenant can't stream an unbounded
// body and OOM the CP provision path — set generous (256 KiB) so legitimate
// growth (more schedules, longer prompts, more skills) never re-hits a wall.
const cpConfigFilesMaxBytes = 256 << 10
// isCPTemplateConfigFile restricts which files from a template directory are
// eligible for transport to the control plane. Only config.yaml (the runtime
@@ -0,0 +1,151 @@
package provisioner
import (
"context"
"encoding/base64"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
"testing"
)
// TestStart_OversizedConfigBundleProvisions is the Prove-It reproduction for
// the jrs-auto SEO Agent provisioning failure:
//
// CPProvisioner: workspace start failed: cp provisioner: collect config
// files: config files exceed 12288 bytes
//
// Root cause: collectCPConfigFiles hard-capped the *eligible* config bundle
// (config.yaml + prompts/*) at 12 KiB because the controlplane embedded it in
// EC2 user-data (16 KiB AWS ceiling bootstrap overhead). The SEO agent's
// config (long SEO system prompt + SERVICES_REPO_WEBSITE + the 12-schedule
// block baked into config.yaml) exceeds 12 KiB, so Start() failed before it
// ever reached the wire — blocking a paying customer from provisioning.
//
// After moving config delivery OFF user-data and onto the persistent
// secondary volume (CP stages the bundle to Secrets Manager; the workspace
// fetches it at boot into /configs), the 12 KiB ceiling is obsolete: the
// bundle travels in the JSON HTTP body to CP, which has no 16 KiB limit. This
// test pins that a realistically-oversized (>12288 B) config bundle now
// reaches the CP request body intact instead of being rejected client-side.
func TestStart_OversizedConfigBundleProvisions(t *testing.T) {
// SEO-sized config.yaml: a 12-schedule block + SERVICES_REPO_WEBSITE +
// a long system prompt, comfortably over the retired 12 KiB cap.
var sb strings.Builder
sb.WriteString("name: jrs-auto-seo\nruntime: claude-code\n")
sb.WriteString("env:\n SERVICES_REPO_WEBSITE: https://example.com/jrs-auto/website-repo\n")
sb.WriteString("schedules:\n")
for i := 0; i < 12; i++ {
sb.WriteString(" - id: seo-task-")
sb.WriteString(strings.Repeat("x", 8))
sb.WriteString("\n cron: \"0 */2 * * *\"\n prompt: |\n")
sb.WriteString(" Run the SEO audit pass, refresh keyword rankings, regenerate the\n")
sb.WriteString(" sitemap, and publish the digest to the marketing channel.\n")
}
configYAML := sb.String()
seoPrompt := strings.Repeat(
"You are an expert SEO agent. Audit pages, find ranking gaps, and act. ", 200)
cfg := map[string][]byte{
"config.yaml": []byte(configYAML),
"prompts/system.md": []byte(seoPrompt),
}
total := len(configYAML) + len(seoPrompt)
if total <= 12<<10 {
t.Fatalf("fixture not representative: bundle is %d bytes, must exceed 12288 to reproduce the failure", total)
}
t.Logf("oversized config bundle: %d bytes (> old 12288 cap)", total)
var body cpProvisionRequest
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
t.Errorf("decode request: %v", err)
}
w.WriteHeader(http.StatusCreated)
_, _ = io.WriteString(w, `{"instance_id":"i-seo","state":"pending"}`)
}))
defer srv.Close()
p := &CPProvisioner{baseURL: srv.URL, orgID: "org-seo", httpClient: srv.Client()}
_, err := p.Start(context.Background(), WorkspaceConfig{
WorkspaceID: "ws-seo",
Runtime: "claude-code",
Tier: 4,
PlatformURL: "http://tenant",
ConfigFiles: cfg,
})
if err != nil {
t.Fatalf("Start with oversized config bundle failed: %v — the 12288-byte cap must be gone now config delivery is off user-data", err)
}
// The full bundle must have reached the CP request body intact.
wantCfg := base64.StdEncoding.EncodeToString([]byte(configYAML))
if got := body.ConfigFiles["config.yaml"]; got != wantCfg {
t.Errorf("config.yaml not delivered intact to CP (len got=%d want=%d)", len(got), len(wantCfg))
}
wantPrompt := base64.StdEncoding.EncodeToString([]byte(seoPrompt))
if got := body.ConfigFiles["prompts/system.md"]; got != wantPrompt {
t.Errorf("prompts/system.md not delivered intact to CP (len got=%d want=%d)", len(got), len(wantPrompt))
}
}
// TestCollectCPConfigFiles_DoSGuardStillBounds pins that retiring the 12 KiB
// cap did NOT remove the bound entirely — an absurdly large bundle (a buggy
// or hostile tenant) is still rejected so a compromised workspace-server
// can't OOM the CP request path. The guard just moved from a 12 KiB
// user-data ceiling to a generous transport-DoS ceiling.
func TestCollectCPConfigFiles_DoSGuardStillBounds(t *testing.T) {
huge := make([]byte, cpConfigFilesMaxBytes+1)
for i := range huge {
huge[i] = 'a'
}
_, err := collectCPConfigFiles(WorkspaceConfig{
ConfigFiles: map[string][]byte{"config.yaml": huge},
})
if err == nil {
t.Fatalf("expected the DoS guard to reject a %d-byte bundle, got nil", len(huge))
}
if !strings.Contains(err.Error(), "config files exceed") {
t.Errorf("unexpected error %q, want the size-guard message", err.Error())
}
}
// TestCollectCPConfigFiles_AcceptsSEOSizedBundle is the unit-level companion:
// collectCPConfigFiles itself (not just Start) must accept the SEO-sized
// bundle. Guards the exact constant that caused the outage.
func TestCollectCPConfigFiles_AcceptsSEOSizedBundle(t *testing.T) {
// 30 KiB of eligible config — far over the retired 12288 cap, far under
// the new DoS guard.
cfgBlob := make([]byte, 18<<10)
for i := range cfgBlob {
cfgBlob[i] = 'c'
}
promptBlob := make([]byte, 12<<10)
for i := range promptBlob {
promptBlob[i] = 'p'
}
files, err := collectCPConfigFiles(WorkspaceConfig{
ConfigFiles: map[string][]byte{
"config.yaml": cfgBlob,
"prompts/system.md": promptBlob,
},
})
if err != nil {
t.Fatalf("collectCPConfigFiles rejected a %d-byte SEO-sized bundle: %v", len(cfgBlob)+len(promptBlob), err)
}
if len(files) != 2 {
t.Fatalf("expected 2 files collected, got %d", len(files))
}
// Also confirm a template-dir path stays size-bounded the same way.
tmpl := t.TempDir()
if err := os.WriteFile(filepath.Join(tmpl, "config.yaml"), cfgBlob, 0o600); err != nil {
t.Fatal(err)
}
if _, err := collectCPConfigFiles(WorkspaceConfig{TemplatePath: tmpl}); err != nil {
t.Fatalf("collectCPConfigFiles rejected an SEO-sized template config.yaml: %v", err)
}
}
+7 -3
View File
@@ -173,6 +173,12 @@ func Setup(hub *ws.Hub, broadcaster *events.Broadcaster, prov *provisioner.Provi
// so the canvas flips to failed in seconds instead of waiting
// for the 10-minute provision-timeout sweeper.
wsAdmin.POST("/admin/workspaces/:id/bootstrap-failed", wh.BootstrapFailed)
// Per-workspace LLM billing mode override (internal#691). Used by
// CP's /cp/admin/workspaces/:id/llm-billing-mode proxy + (via that
// proxy) by the canvas Config-tab "LLM Billing" section. Default-
// closed resolver lives in handlers/llm_billing_mode.go.
wsAdmin.GET("/admin/workspaces/:id/llm-billing-mode", handlers.GetWorkspaceLLMBillingMode)
wsAdmin.PUT("/admin/workspaces/:id/llm-billing-mode", handlers.PutWorkspaceLLMBillingMode)
// Proxy to CP's serial-console endpoint so the canvas's "View
// Logs" button can render the actual boot trace without handing
// the tenant AWS credentials. Admin-gated because console output
@@ -217,9 +223,7 @@ func Setup(hub *ws.Hub, broadcaster *events.Broadcaster, prov *provisioner.Provi
{
// #680: PATCH /workspaces/:id moved under WorkspaceAuth (#680 IDOR fix).
// WorkspaceAuth enforces that the caller holds a valid bearer token for
// this specific workspace — both auth AND ownership in one check. Cosmetic
// updates (x/y drag-reposition, inline rename) from the combined tenant
// image canvas still pass via the isSameOriginCanvas bypass in WorkspaceAuth.
// this specific workspace, or a control-plane-verified tenant session.
wsAuth.PATCH("", wh.Update)
// Lifecycle
@@ -418,6 +418,7 @@ func (s *Scheduler) fireSchedule(ctx context.Context, sched scheduleRow) {
})
if marshalErr != nil {
log.Printf("Scheduler '%s': json.Marshal a2aBody failed: %v", sched.Name, marshalErr)
return
}
log.Printf("Scheduler: firing '%s' → workspace %s", sched.Name, short(sched.WorkspaceID, 12))
@@ -603,23 +604,24 @@ func (s *Scheduler) fireSchedule(ctx context.Context, sched scheduleRow) {
})
if marshalErr != nil {
log.Printf("Scheduler '%s': json.Marshal cronMeta failed: %v", sched.Name, marshalErr)
} else {
// #152: persist lastError into error_detail on the activity_logs row
// so GET /workspaces/:id/schedules/:id/history can surface why a run
// failed (previously dropped — history returned status without any
// error context, making root-cause debugging impossible).
// #2026: bounded Background() context — this INSERT was observed wedging
// indefinitely on invalid-UTF-8 jsonb payloads, blocking wg.Wait() in
// tick() and stalling the whole scheduler. Now: 10s deadline, survives
// outer ctx cancellation, and every string is UTF-8 sanitized.
insertCtx, insertCancel := context.WithTimeout(context.Background(), dbQueryTimeout)
if _, insErr := db.DB.ExecContext(insertCtx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, method, summary, request_body, status, error_detail, created_at)
VALUES ($1, 'cron_run', NULL, 'cron', $2, $3::jsonb, $4, $5, now())
`, sched.WorkspaceID, sanitizeUTF8("Cron: "+sched.Name), string(cronMeta), lastStatus, sanitizeUTF8(lastError)); insErr != nil {
log.Printf("Scheduler: activity_logs insert failed for '%s' (%s): %v", sched.Name, sched.ID, insErr)
}
insertCancel()
}
// #152: persist lastError into error_detail on the activity_logs row
// so GET /workspaces/:id/schedules/:id/history can surface why a run
// failed (previously dropped — history returned status without any
// error context, making root-cause debugging impossible).
// #2026: bounded Background() context — this INSERT was observed wedging
// indefinitely on invalid-UTF-8 jsonb payloads, blocking wg.Wait() in
// tick() and stalling the whole scheduler. Now: 10s deadline, survives
// outer ctx cancellation, and every string is UTF-8 sanitized.
insertCtx, insertCancel := context.WithTimeout(context.Background(), dbQueryTimeout)
if _, insErr := db.DB.ExecContext(insertCtx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, method, summary, request_body, status, error_detail, created_at)
VALUES ($1, 'cron_run', NULL, 'cron', $2, $3::jsonb, $4, $5, now())
`, sched.WorkspaceID, sanitizeUTF8("Cron: "+sched.Name), string(cronMeta), lastStatus, sanitizeUTF8(lastError)); insErr != nil {
log.Printf("Scheduler: activity_logs insert failed for '%s' (%s): %v", sched.Name, sched.ID, insErr)
}
insertCancel()
if s.broadcaster != nil {
s.broadcaster.RecordAndBroadcast(ctx, string(events.EventCronExecuted), sched.WorkspaceID, map[string]interface{}{
@@ -693,17 +695,18 @@ func (s *Scheduler) recordSkipped(ctx context.Context, sched scheduleRow, active
})
if marshalErr != nil {
log.Printf("Scheduler '%s': json.Marshal cronMeta failed: %v", sched.Name, marshalErr)
} else {
// #2026: bounded Background() context on the skipped activity log INSERT
// for the same reason as the fireSchedule activity_logs INSERT above.
skipInsCtx, skipInsCancel := context.WithTimeout(context.Background(), dbQueryTimeout)
if _, err := db.DB.ExecContext(skipInsCtx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, method, summary, request_body, status, error_detail, created_at)
VALUES ($1, 'cron_run', NULL, 'cron', $2, $3::jsonb, 'skipped', $4, now())
`, sched.WorkspaceID, sanitizeUTF8("Cron skipped: "+sched.Name), string(cronMeta), sanitizeUTF8(reason)); err != nil {
log.Printf("Scheduler: '%s' skip activity log failed: %v", sched.Name, err)
}
skipInsCancel()
}
// #2026: bounded Background() context on the skipped activity log INSERT
// for the same reason as the fireSchedule activity_logs INSERT above.
skipInsCtx, skipInsCancel := context.WithTimeout(context.Background(), dbQueryTimeout)
if _, err := db.DB.ExecContext(skipInsCtx, `
INSERT INTO activity_logs (workspace_id, activity_type, source_id, method, summary, request_body, status, error_detail, created_at)
VALUES ($1, 'cron_run', NULL, 'cron', $2, $3::jsonb, 'skipped', $4, now())
`, sched.WorkspaceID, sanitizeUTF8("Cron skipped: "+sched.Name), string(cronMeta), sanitizeUTF8(reason)); err != nil {
log.Printf("Scheduler: '%s' skip activity log failed: %v", sched.Name, err)
}
skipInsCancel()
if s.broadcaster != nil {
_ = s.broadcaster.RecordAndBroadcast(ctx, string(events.EventCronSkipped), sched.WorkspaceID, map[string]interface{}{
@@ -60,10 +60,12 @@ func RunWithRecover(ctx context.Context, name string, fn func(context.Context))
}
// Panic → back off and restart.
timer := time.NewTimer(backoff)
select {
case <-ctx.Done():
timer.Stop()
return
case <-time.After(backoff):
case <-timer.C:
}
if backoff < maxBackoff {
backoff *= 2
@@ -0,0 +1,4 @@
-- Reverse internal#691 per-workspace billing mode column.
-- The column is nullable + check-constrained; dropping it is non-destructive
-- to org-level behavior (workspaces fall back to the org default again).
ALTER TABLE workspaces DROP COLUMN IF EXISTS llm_billing_mode;
@@ -0,0 +1,17 @@
-- Per-workspace llm_billing_mode override (internal#691).
--
-- NULL = inherit the org-level default (organizations.llm_billing_mode on CP,
-- propagated to workspace-server via tenant_config as MOLECULE_LLM_BILLING_MODE).
-- A non-NULL value overrides the org default for this workspace only.
--
-- Resolver contract: workspaces.llm_billing_mode ?? org_default ?? 'platform_managed'.
-- Default-closed: any NULL, error, unknown enum, or JOIN miss resolves to
-- 'platform_managed' (the existing implicit default — see internal#691
-- spec sketch + Phase 1 design comment).
--
-- The check constraint mirrors the CP-side credits.LLMBillingMode* constants
-- (molecule-controlplane/internal/credits/llm_billing.go). Keep in sync if
-- a new mode is ever added; the resolver also enumerates them explicitly.
ALTER TABLE workspaces
ADD COLUMN IF NOT EXISTS llm_billing_mode TEXT
CHECK (llm_billing_mode IS NULL OR llm_billing_mode IN ('platform_managed', 'byok', 'disabled'));