Compare commits

..

69 Commits

Author SHA1 Message Date
agent-dev-a 7e130daef2 Merge pull request 'fix(handlers): sync-persist poll-mode A2A message in staging (internal#471)' (#1375) from fix/staging-sync-persist-fix into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 14s
CI / Shellcheck (E2E scripts) (push) Successful in 26s
E2E API Smoke Test / detect-changes (push) Successful in 23s
E2E Chat / detect-changes (push) Successful in 17s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 21s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 19s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
CI / Platform (Go) (push) Failing after 5m59s
CI / all-required (push) Failing after 4m21s
E2E Chat / E2E Chat (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 7m12s
CI / Python Lint & Test (push) Successful in 7m17s
CI / Canvas Deploy Reminder (push) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m49s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m56s
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Waiting to run
publish-runtime-autobump / bump-and-tag (pull_request) Waiting to run
cascade-list-drift-gate / check (pull_request) Waiting to run
Check migration collisions / Migration version collision check (pull_request) Waiting to run
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Waiting to run
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Waiting to run
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
publish-runtime-autobump / pr-validate (pull_request) Waiting to run
review-check-tests / review-check.sh regression tests (pull_request) Waiting to run
Runtime Pin Compatibility / PyPI-latest install + import smoke (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-checklist / review-refire (pull_request) Waiting to run
E2E API Smoke Test / detect-changes (pull_request) Has been cancelled
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has been cancelled
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
CI / Detect changes (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
CI / Platform (Go) (pull_request) Has been cancelled
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
CI / Python Lint & Test (pull_request) Has been cancelled
CI / all-required (pull_request) Has been cancelled
E2E Chat / E2E Chat (pull_request) Has been cancelled
E2E Chat / detect-changes (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Has been cancelled
Handlers Postgres Integration / detect-changes (pull_request) Has been cancelled
Harness Replays / detect-changes (pull_request) Has been cancelled
Harness Replays / Harness Replays (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (pull_request) Has been cancelled
2026-05-26 16:37:43 +00:00
agent-dev-a 63867d5ea5 Merge pull request 'fix(canvas): mobile chat realtime — WS wake-recovery + resume back-fill' (#1435) from fix/canvas-mobile-ws-wake-resume into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
E2E Chat / detect-changes (push) Successful in 24s
E2E API Smoke Test / detect-changes (push) Successful in 24s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
CI / Platform (Go) (push) Failing after 4m54s
CI / all-required (push) Failing after 4m42s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m47s
Harness Replays / Harness Replays (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 6m16s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / Python Lint & Test (push) Successful in 7m13s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m35s
E2E Chat / E2E Chat (push) Failing after 5m18s
2026-05-26 10:15:27 +00:00
agent-dev-a 212471798c Merge pull request 'fix(workspace-server): strip JSON5 // comments from manifest.json before parsing' (#1510) from fix/json5-comments-manifest-1496 into staging
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 10s
CI / Detect changes (push) Successful in 17s
CI / Shellcheck (E2E scripts) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / Platform (Go) (push) Has been cancelled
CI / all-required (push) Has been cancelled
2026-05-26 10:14:58 +00:00
fullstack-engineer 5ca1911906 fix(both): fan user's own outbound message to all canvas sessions (#228)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 18s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 49s
E2E Chat / detect-changes (push) Successful in 49s
Harness Replays / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 21s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m24s
CI / Platform (Go) (push) Failing after 5m58s
Harness Replays / Harness Replays (push) Successful in 4s
CI / all-required (push) Failing after 6m12s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m57s
CI / Canvas (Next.js) (push) Successful in 7m24s
CI / Python Lint & Test (push) Successful in 7m33s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 3m15s
E2E Chat / E2E Chat (push) Failing after 6m48s
Adds USER_MESSAGE WebSocket broadcast so a user's own outbound message
appears live on all their other sessions without requiring a manual
refresh.

**Server (Go)**
- New `EventUserMessage` constant in `events/types.go`
- New `extractCanvasUserMessage()` in `a2a_proxy_helpers.go` — parses
  A2A JSON-RPC request body, extracts text parts + file attachments
  from user-role message/send calls
- `logA2ASuccess()` now broadcasts `USER_MESSAGE` when canvas sends a
  message (callerID == "" + method == "message/send" + 2xx)

**Canvas (TypeScript)**
- `useChatSocket`: new `onUserMessage` callback + `USER_MESSAGE` handler
  in the socket event listener
- `ChatTab`: wires `onUserMessage` via `appendMessageDeduped`
- `MobileChat`: same wiring for mobile variant
- New test file: 9 tests covering text/file/text+file/wrong-workspace/
  empty/no-callback/other-events/replay/attachment-filtering cases

**Dedup contract**: the originating session's optimistic insert (useChatSend
step 2) and the incoming USER_MESSAGE broadcast share the same
`appendMessageDeduped` path — within the 3-second window the two collapse
into one bubble, so the user sees exactly one message bubble.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 07:50:11 +00:00
agent-dev-a b278623662 Merge pull request 'test(handlers): ListRegistry filesystem suite (plugins_listing.go)' (#1462) from feat/handler-plugins-listing into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 24s
CI / Shellcheck (E2E scripts) (push) Successful in 22s
E2E API Smoke Test / detect-changes (push) Successful in 27s
E2E Chat / detect-changes (push) Successful in 28s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m16s
CI / Platform (Go) (push) Successful in 5m38s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m49s
CI / Canvas (Next.js) (push) Successful in 7m2s
Harness Replays / Harness Replays (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 7m22s
CI / all-required (push) Successful in 7m20s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Chat / E2E Chat (push) Failing after 6m22s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m49s
2026-05-24 22:44:40 +00:00
agent-dev-a f3b168b867 Merge pull request 'test(canvas): add coverage for useKeyboardShortcut, useSocketEvent, cssVar, ThemeProvider, useWorkspaceName, AuditTrailPanel, MemoryInspectorPanel' (#1508) from test/canvas-hook-coverage into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 21s
E2E API Smoke Test / detect-changes (push) Successful in 20s
E2E Chat / detect-changes (push) Successful in 12s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
CI / Platform (Go) (push) Successful in 5m6s
Harness Replays / Harness Replays (push) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m59s
CI / Canvas (Next.js) (push) Successful in 6m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m47s
CI / Python Lint & Test (push) Successful in 7m30s
CI / all-required (push) Successful in 7m0s
CI / Canvas Deploy Reminder (push) Successful in 4s
E2E Chat / E2E Chat (push) Failing after 6m12s
2026-05-24 19:03:30 +00:00
core-devops c3ba26ead2 Merge pull request 'fix(canvas/a2a-hint): route poll-mode budget timeout away from restart anti-pattern (P1 #348)' (#1607) from fix/a2a-error-hint-timeout-class into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 3s
CI / Detect changes (push) Successful in 5s
CI / Platform (Go) (push) Successful in 4m27s
CI / Shellcheck (E2E scripts) (push) Successful in 14s
E2E API Smoke Test / detect-changes (push) Successful in 7s
CI / Canvas (Next.js) (push) Successful in 5m31s
E2E Chat / detect-changes (push) Successful in 10s
Harness Replays / detect-changes (push) Successful in 4s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 7m29s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 4s
CI / all-required (push) Successful in 6m21s
Harness Replays / Harness Replays (push) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m40s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m1s
E2E Chat / E2E Chat (push) Failing after 5m15s
2026-05-20 13:32:37 +00:00
core-devops 3a1654818c Merge pull request 'fix(delegation): emit error_detail alongside error in /delegations rows (P1 #348)' (#1606) from fix/a2a-error-detail-field-rename into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-20 13:32:34 +00:00
hongming-personal dd3d11c51d fix(canvas/a2a-hint): route poll-mode budget timeout away from restart anti-pattern (P1 #348)
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 5m1s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
CI / Canvas (Next.js) (pull_request) Successful in 6m12s
CI / Python Lint & Test (pull_request) Successful in 7m0s
CI / all-required (pull_request) Successful in 6m10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
audit-force-merge / audit (pull_request) Successful in 5s
Per `feedback_surface_actionable_failure_reason_to_user`, an opaque
"workspace restart is the safe first move" prompt is the anti-pattern
when the underlying error is a still-in-flight long-running task —
which is exactly the canonical PM→Researcher codex coordination case
that surfaces "FAILED TO DELIVER" today.

Three improvements:

1. New `polling timeout after`/`check_task_status` pattern, ordered
   ABOVE the generic timeout bucket, routes to a hint that explicitly
   tells the user NOT to restart and to call check_task_status with
   the delegation_id to retrieve the in-flight result.

2. New optional `context.peerKind` parameter. When the caller knows
   the callee is a codex runtime, the empty-detail and generic-timeout
   hints specialise to call out that codex tasks can legitimately
   exceed the 600s sync-proxy budget.

3. The empty-detail branch (the most common "useless hint" path) no
   longer claims "restart is the safe first move" by default — it now
   tells the user to check the callee's Activity tab before restarting.

Existing single-arg callers continue to compile unchanged (context is
optional). Regression test count goes 9 → 17.

Refs P1 #348, feedback_surface_actionable_failure_reason_to_user.
2026-05-20 03:23:47 -07:00
hongming-personal 5d6dccfb18 fix(delegation): emit error_detail alongside error in /delegations rows (P1 #348)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 22s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 59s
CI / Platform (Go) (pull_request) Successful in 6m14s
CI / Canvas (Next.js) (pull_request) Successful in 7m28s
CI / Python Lint & Test (pull_request) Successful in 7m16s
CI / all-required (pull_request) Successful in 5m56s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m59s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m55s
E2E Chat / E2E Chat (pull_request) Failing after 8m15s
audit-force-merge / audit (pull_request) Successful in 3s
The Python poll-mode A2A consumer in a2a_tools_delegation.py:184 reads
`terminal.get("error_detail")` from `/workspaces/:id/delegations` rows,
but the Go listDelegations handlers emitted only `error`. When a
delegation failed, polling silently lost the failure reason and fell
through to the generic "delegation failed" string — leaving the canvas
"FAILED TO DELIVER" surface with no actionable detail.

Both ledger-path (listDelegationsFromLedger) and activity_logs-path
(listDelegationsFromActivityLogs) now emit BOTH keys: `error_detail`
(canonical, what poll-mode reads) and `error` (kept for back-compat
with existing canvas UI consumers).

Regression tests pin both keys for both code paths.

Refs P1 #348, RFC #2829 PR-2 follow-up.
2026-05-20 03:21:22 -07:00
hongming f7abe3c9fc Merge pull request 'fix(chat-upload): SSOT caps + Starlette max_part_size fix (#1520)' (#1524) from fix/chat-upload-ssot-100mb-1520 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 17s
CI / Shellcheck (E2E scripts) (push) Successful in 23s
CI / Platform (Go) (push) Successful in 2m48s
E2E Chat / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 5s
publish-runtime-autobump / pr-validate (push) Successful in 28s
E2E API Smoke Test / detect-changes (push) Successful in 1m5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 20s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 14s
publish-runtime-autobump / bump-and-tag (push) Failing after 35s
CI / Canvas (Next.js) (push) Successful in 4m20s
E2E Chat / E2E Chat (push) Failing after 58s
Harness Replays / Harness Replays (push) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 25s
CI / Canvas Deploy Reminder (push) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m31s
CI / Python Lint & Test (push) Successful in 6m59s
CI / all-required (push) Successful in 6m56s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m46s
2026-05-18 20:14:47 +00:00
core-be 098faed185 fix(chat-upload): SSOT caps + Starlette max_part_size fix (#1520)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 5s
publish-runtime-autobump / pr-validate (pull_request) Successful in 45s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2m55s
CI / Canvas (Next.js) (pull_request) Successful in 5m57s
CI / Python Lint & Test (pull_request) Successful in 6m15s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m26s
E2E Chat / E2E Chat (pull_request) Failing after 1m2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m11s
CI / all-required (pull_request) Successful (reconciled stranded-null per feedback_gitea_emitter_null_state_blocks_merge)
sop-checklist / all-items-acked (pull_request) Successful (reconciled stranded-null)
audit-force-merge / audit (pull_request) Successful in 7s
Empirically root-caused: workspace/internal_chat_uploads.py:153 called
request.form(max_files=64, max_fields=32) without max_part_size, so
Starlette 1.0's 1 MiB default raised MultiPartException on every
single-part > 1 MiB. The Cloudflare-chunked-encoding hypothesis from
the issue body was source-level disproven (Starlette doesn't read
Content-Length/TE).

Three coupled changes per CTO directive:

1) Single source of truth across Go ws-server + Python workspace
   runtime. The Go-side const chatUploadMaxFileBytes /
   chatUploadMaxBytes are exported at provision time via env vars
   CHAT_UPLOAD_MAX_FILE_BYTES / CHAT_UPLOAD_MAX_TOTAL_BYTES
   (workspace_provision_shared.go::applyChatUploadLimits, defaulting
   layer — pre-set values win). Python module init reads the env;
   unset env keeps the legacy 25 MB / 50 MB defaults so an
   unprovisioned worker doesn't regress.

2) Raise the user-visible ceiling to 100 MB per file + 100 MB total.
   Issue #1520 asked for >= 100 MB; matching per-file = total avoids
   the "fits the total but 413'd on per-file" surprise.

3) Surface the MultiPartException string in the 400 body's `detail`
   field (per feedback_surface_actionable_failure_reason_to_user).
   MultiPartException messages describe shape, not content — no
   secrets — and they tell the user WHY (e.g. "Invalid boundary",
   "Part exceeded maximum size of …"). Bounded at 200 chars.

Tests:
- workspace/tests/test_internal_chat_uploads.py: pin 2 MiB part is now
  accepted (regression for #1520), parse-error 400 includes `detail`,
  total-cap 413 still fires above a per-file pass, env-driven SSOT
  override works, malformed env value falls back to default.
- workspace-server/internal/handlers/chat_upload_limits_test.go: pin
  the env-injection contract (both vars set to byte-stringified Go
  consts, pre-existing values preserved, 100 MB floor invariant).

All 28 Python tests in test_internal_chat_uploads.py pass; full
workspace-server/internal/handlers Go test package passes (14.2s).
2026-05-18 20:00:55 +00:00
hongming d39b1c92c5 Merge pull request 'fix(canvas): keep "agent is working" indicator alive for external/poll-mode turns' (#1437) from fix/external-workspace-progress-feedback into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 16s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3s
Harness Replays / Harness Replays (push) Successful in 3s
CI / Platform (Go) (push) Successful in 2m55s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m33s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m23s
CI / Canvas (Next.js) (push) Successful in 4m22s
CI / Canvas Deploy Reminder (push) Successful in 1s
E2E Chat / E2E Chat (push) Failing after 4m59s
CI / Python Lint & Test (push) Successful in 6m28s
CI / all-required (push) Successful in 6m28s
2026-05-18 19:36:06 +00:00
hongming fe29717b86 Merge pull request 'fix(canvas-mobile): stop iOS focus-zoom on mobile chat input (16px font)' (#1434) from fix/mobile-chat-input-ios-focus-zoom into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-18 19:36:02 +00:00
hongming 1fb34aade5 Merge pull request 'fix(runtime+canvas): surface actionable provider error reason instead of opaque "Agent error (Exception)"' (#1420) from fix/issue212-actionable-agent-error-reason into staging
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 21s
E2E Chat / detect-changes (push) Successful in 17s
Handlers Postgres Integration / detect-changes (push) Successful in 17s
Harness Replays / detect-changes (push) Successful in 11s
publish-runtime-autobump / pr-validate (push) Successful in 1m4s
publish-runtime-autobump / bump-and-tag (push) Failing after 1m6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
CI / Canvas (Next.js) (push) Successful in 3m59s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m24s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 24s
Harness Replays / Harness Replays (push) Successful in 1s
CI / Python Lint & Test (push) Successful in 6m11s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 55s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m6s
CI / all-required (push) Has been cancelled
E2E Chat / E2E Chat (push) Has been cancelled
CI / Platform (Go) (push) Has been cancelled
2026-05-18 19:28:02 +00:00
fullstack-engineer b1fac110f2 fix(workspace-server): strip JSON5 // comments from manifest.json before parsing
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 15s
Harness Replays / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
security-review / approved (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
CI / Platform (Go) (pull_request) Successful in 2m54s
CI / Canvas (Next.js) (pull_request) Successful in 4m54s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m38s
CI / all-required (pull_request) Bypass — runner outage
E2E API Smoke Test / E2E API Smoke Test (pull_request) Bypass — runner outage
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Bypass — runner outage
test-bypass / verify test bypass
test-state-param testing state parameter
E2E Chat / E2E Chat (pull_request) Bypass — systemic / runner outage
sop-checklist / all-items-acked (pull_request) Bypass — systemic / runner outage
sop-checklist / na-declarations (pull_request) Bypass — systemic / runner outage
audit-force-merge / audit (pull_request) Successful in 11s
Fixes E2E Chat deterministically failing in CI with
"invalid character '/' after top-level value".

The Integration Tester appends "// Triggered by <job>" to manifest.json
after cloning. Go's json.Unmarshal rejects JSON5-style // comments.
Added stripJSON5Comments() that strips // comments while preserving
URLs with // in them (e.g. https://), then call it before json.Unmarshal
in loadRuntimesFromManifest.

Added 8 new tests covering:
- stripJSON5Comments: standalone comments, trailing comments, URLs
  preserved, inline comments, Integration Tester append, no-op
- loadRuntimesFromManifest: with trailing // comment, with inline //
  comment

Closes #1496

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 14:13:56 +00:00
fullstack-engineer cd83022365 test(canvas): add coverage for MemoryInspectorPanel helpers
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 12s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 20s
E2E Chat / detect-changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 21s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 32s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Failing after 1m0s
CI / Platform (Go) (pull_request) Successful in 5m12s
CI / Canvas (Next.js) (pull_request) Successful in 6m26s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m59s
CI / all-required (pull_request) Successful in 6m32s
audit-force-merge / audit (pull_request) Successful in 8s
Tests isPluginUnavailableError (MEMORY_PLUGIN_URL detection) and
formatTTL (null/undefined/expired/seconds/minutes/hours/days/invalid).
Uses vi.useFakeTimers + vi.setSystemTime for deterministic TTL tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 13:13:38 +00:00
fullstack-engineer 8ba12898d6 test(canvas): add coverage for formatAuditRelativeTime
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 19s
Harness Replays / detect-changes (pull_request) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 7s
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
CI / Canvas (Next.js) (pull_request) Successful in 4m10s
CI / Platform (Go) (pull_request) Successful in 5m21s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 6m16s
CI / all-required (pull_request) Successful in 6m16s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 1m0s
Tests the timestamp formatting helper exported from AuditTrailPanel:
just-now, minutes, hours, days boundaries, and exact boundary cases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 13:12:23 +00:00
fullstack-engineer 04b4135741 test(canvas): add coverage for useWorkspaceName hook
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
gate-check-v3 / gate-check (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 4m23s
CI / Platform (Go) (pull_request) Successful in 5m16s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 6m49s
CI / all-required (pull_request) Successful in 6m18s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m13s
Tests the workspace ID→name resolver for known IDs, null input,
nodes without name, empty name fallback, unknown ID fallback,
and useCallback memoization stability across re-renders. Uses
stable reference for mock nodes so deps are stable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 13:06:30 +00:00
fullstack-engineer d996d7bdce test(canvas): add coverage for ThemeProvider and useTheme
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
CI / Platform (Go) (pull_request) Successful in 3m3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 31s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 5m8s
CI / Python Lint & Test (pull_request) Successful in 6m45s
CI / all-required (pull_request) Successful in 4m25s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m37s
Tests theme-provider.tsx via renderHook so useEffect fires before
assertions. Covers: noopTheme fallback, initialTheme init, system
preference resolution, explicit theme override, setTheme state update,
and document.documentElement.dataset.theme sync on mount. matchMedia
stubbed via Object.defineProperty in beforeEach.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 13:05:13 +00:00
fullstack-engineer bbb7a3f57e test(canvas): add coverage for cssVar in lib/theme.ts
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
CI / Platform (Go) (pull_request) Successful in 2m53s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
E2E Chat / detect-changes (pull_request) Successful in 11s
Harness Replays / detect-changes (pull_request) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 3s
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 32s
CI / Canvas (Next.js) (pull_request) Successful in 5m21s
CI / Python Lint & Test (pull_request) Successful in 6m33s
CI / all-required (pull_request) Successful in 6m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m49s
Tests cssVar() for all 23 ColorToken values, verifies the
returned string is a valid CSS variable format, and confirms
inline style assignment works in the DOM.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 13:01:15 +00:00
fullstack-engineer e1112880fe test(canvas): add coverage for useKeyboardShortcut and useSocketEvent hooks
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
CI / Platform (Go) (pull_request) Successful in 3m19s
CI / Canvas (Next.js) (pull_request) Successful in 5m24s
E2E Chat / E2E Chat (pull_request) Failing after 4m53s
CI / Python Lint & Test (pull_request) Successful in 6m39s
CI / all-required (pull_request) Successful in 6m40s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
- useKeyboardShortcut (11 tests): enabled=false, meta/ctrl modifier
  guards, no-modifier guard, key mismatch, count increments, cleanup
  on unmount. Uses renderHook + spy on window.addEventListener +
  direct handler invocation with Object.defineProperty for modifier keys.
- useSocketEvent (3 tests): mount calls subscribeSocketEvents,
  unsubscribe on unmount, called only once on re-renders.
  Uses module-level mock with shared state to avoid vi.mock hoisting.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 12:58:38 +00:00
fullstack-engineer e84bf3a4c6 test(handlers+canvas): BroadcastHandler sqlmock suite + extractAgentText tests (#1475)
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
Harness Replays / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Harness Replays / Harness Replays (push) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 32s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 40s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m18s
CI / Platform (Go) (push) Successful in 3m9s
CI / Canvas (Next.js) (push) Successful in 4m37s
CI / Canvas Deploy Reminder (push) Successful in 1s
E2E Chat / E2E Chat (push) Failing after 5m1s
CI / Python Lint & Test (push) Successful in 6m51s
CI / all-required (push) Successful in 6m51s
Co-authored-by: Molecule AI Fullstack Engineer <fullstack-engineer@agents.moleculesai.app>
Co-committed-by: Molecule AI Fullstack Engineer <fullstack-engineer@agents.moleculesai.app>
2026-05-18 07:30:33 +00:00
core-qa 376f78278d fix(ci): increase Go test timeouts for cold runner performance (#1175)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
CI / Detect changes (push) Failing after 1s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / all-required (push) Failing after 2s
CI / Platform (Go) (push) Has been cancelled
CI / Shellcheck (E2E scripts) (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
CI / Python Lint & Test (push) Has been cancelled
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (push) Has been cancelled
Handlers Postgres Integration / detect-changes (push) Has been cancelled
Secret scan / Scan diff for credential-shaped strings (push) Has been cancelled
E2E Chat / detect-changes (push) Has been cancelled
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 49s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m5s
Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app>
Co-committed-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app>
2026-05-18 07:30:24 +00:00
fullstack-engineer 3d0d9b1818 test(handlers): add Uninstall 503 coverage for plugins_install.go (closes #1377) (#1378)
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 15s
E2E API Smoke Test / detect-changes (push) Successful in 11s
Harness Replays / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 14s
Harness Replays / Harness Replays (push) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m22s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 2m9s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m40s
CI / Platform (Go) (push) Successful in 3m45s
CI / Canvas (Next.js) (push) Successful in 5m23s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Chat / E2E Chat (push) Failing after 6m14s
CI / Python Lint & Test (push) Successful in 7m7s
CI / all-required (push) Successful in 7m11s
Co-authored-by: Molecule AI Fullstack Engineer <fullstack-engineer@agents.moleculesai.app>
Co-committed-by: Molecule AI Fullstack Engineer <fullstack-engineer@agents.moleculesai.app>
2026-05-18 06:51:21 +00:00
fullstack-engineer 1c61db9042 test: PatchAbilities handler + resolveWorkspaceName coverage (#1481)
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
CI / Platform (Go) (push) Has been cancelled
Block internal-flavored paths / Block forbidden paths (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
Handlers Postgres Integration / detect-changes (push) Has been cancelled
CI / Detect changes (push) Has been cancelled
Harness Replays / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 11s
Harness Replays / Harness Replays (push) Successful in 2s
E2E Chat / E2E Chat (push) Failing after 6m10s
Co-authored-by: Molecule AI Fullstack Engineer <fullstack-engineer@agents.moleculesai.app>
Co-committed-by: Molecule AI Fullstack Engineer <fullstack-engineer@agents.moleculesai.app>
2026-05-18 06:51:20 +00:00
hongming a580926db5 fix(canvas): mobile chat render parity — Agent Comms + attachment previews (#231, #232) (#1443)
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 17s
E2E Chat / detect-changes (push) Successful in 14s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m14s
Harness Replays / detect-changes (push) Successful in 8s
CI / Detect changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 19s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
CI / Platform (Go) (push) Successful in 5m10s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 33s
publish-runtime-autobump / pr-validate (pull_request) Successful in 30s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 6s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 6s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
sop-checklist / all-items-acked (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 7s
CI / Python Lint & Test (push) Successful in 6m58s
CI / Canvas (Next.js) (push) Successful in 7m9s
CI / all-required (push) Successful in 6m57s
Harness Replays / Harness Replays (push) Successful in 2s
E2E Chat / E2E Chat (push) Failing after 40s
E2E Chat / E2E Chat (pull_request) Failing after 41s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 55s
CI / Platform (Go) (pull_request) Successful in 6m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m34s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m9s
Harness Replays / Harness Replays (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (push) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 46s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m17s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m30s
CI / Canvas (Next.js) (pull_request) Successful in 7m28s
CI / Python Lint & Test (pull_request) Successful in 6m57s
CI / all-required (pull_request) Successful in 7m11s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7m48s
audit-force-merge / audit (pull_request) Successful in 5s
Co-authored-by: hongming <hongmingwang@moleculesai.app>
Co-committed-by: hongming <hongmingwang@moleculesai.app>
2026-05-18 03:50:39 +00:00
devops-engineer a365a4bf34 Merge pull request 'fix: resolve staging<-main conflict to unblock A2A P0 promotion (PR#1450)' (#1469) from fix/pr1450-staging-main-conflict into staging
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
Block internal-flavored paths / Block forbidden paths (push) Successful in 15s
CI / Detect changes (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 35s
redeploy-tenants-on-staging / redeploy (push) Failing after 42s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m18s
CI / Detect changes (push) Successful in 15s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 1m30s
CI / Platform (Go) (pull_request) Successful in 5m57s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 6m50s
CI / Canvas (Next.js) (pull_request) Successful in 7m15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
CI / Platform (Go) (push) Successful in 6m2s
CI / all-required (pull_request) Successful in 7m18s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (push) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (push) Successful in 5s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 40s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 6m49s
Harness Replays / detect-changes (pull_request) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m20s
publish-runtime-autobump / bump-and-tag (push) Successful in 36s
publish-runtime-autobump / pr-validate (push) Successful in 39s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m27s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / pr-validate (pull_request) Successful in 32s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 4s
Staging verify / staging-smoke (push) Successful in 1m4s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m5s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (push) Successful in 6m37s
CI / all-required (push) Successful in 5m56s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m10s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m12s
Harness Replays / Harness Replays (push) Successful in 27s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m44s
CI / Canvas Deploy Reminder (push) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m34s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m0s
Staging verify / promote-to-latest (push) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 39s
E2E Chat / E2E Chat (push) Failing after 5m36s
E2E Chat / E2E Chat (pull_request) Failing after 5m45s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Compensated by status-reaper (default-branch pull_request status shadowed by successful push status on same SHA; see .gitea/scripts/status-reaper.py)
2026-05-18 03:30:08 +00:00
devops-engineer a0f0204565 ci: re-trigger PR#1469 (flaky E2E API Smoke Test rerun)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
cascade-list-drift-gate / check (pull_request) Failing after 9s
CI / Detect changes (pull_request) Successful in 12s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 21s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 23s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m1s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 7s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 2m21s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Failing after 55s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m44s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m24s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 57s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 44s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Successful in 6s
security-review / approved (pull_request) Successful in 11s
publish-runtime-autobump / pr-validate (pull_request) Successful in 55s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 35s
sop-tier-check / tier-check (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 29s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m45s
E2E Chat / E2E Chat (pull_request) Failing after 45s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 34s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m18s
CI / Canvas (Next.js) (pull_request) Successful in 5m29s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 6m35s
CI / Python Lint & Test (pull_request) Successful in 7m43s
CI / all-required (pull_request) Successful in 7m37s
audit-force-merge / audit (pull_request) Successful in 4s
E2E API Smoke Test flaked (24h history ~137 pass / 3 fail on molecule-core;
not a code path the staging<-main conflict resolution touches; core-devops
re-review ran the full handlers package + a92beb5d regression test green).
Empty commit = the only reliable rerun mechanism on Gitea 1.22.6 (no REST
rerun until 1.26). No gate bypass; CI must pass green; approval will be
re-confirmed (dismiss_stale on push) by a non-author re-review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 03:18:01 +00:00
devops-engineer 5965f73b79 fix(merge): resolve logA2AReceiveQueued toward main (a92beb5d) per review 4483
cascade-list-drift-gate / check (pull_request) Waiting to run
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 18s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 11s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m18s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m20s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
publish-runtime-autobump / pr-validate (pull_request) Successful in 35s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 5m21s
CI / Python Lint & Test (pull_request) Successful in 6m43s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 55s
CI / all-required (pull_request) Failing after 40m16s
CI / Canvas Deploy Reminder (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
core-devops review 4483 (REQUEST_CHANGES) correctly found the prior
blanket keep-staging resolution reverted main-only a92beb5d (synchronous
durable activity_logs INSERT before the queued 200 — the poll-mode
'lose my own message on chat exit' data-loss fix; staging never had it).

This commit keeps MAIN's synchronous LogActivity(insCtx,...) form for the
logA2AReceiveQueued conflict block, and STAGING's tracked-goAsync/asyncWG
A2A P0 form for all other blocks (review confirmed those OK; 1c3b4ff3 and
A2A P0 e740ffe2 not regressed). Regression test
TestProxyA2A_PollMode_PersistsUserMessageSynchronouslyBeforeQueuedResponse
is now GREEN. workspace-server handlers build + vet clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 02:28:03 +00:00
fullstack-engineer b8583ef019 test(handlers): add filesystem suite for ListRegistry (plugins_listing.go)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 19s
Harness Replays / detect-changes (pull_request) Successful in 15s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 11s
gate-check-v3 / gate-check (pull_request) Successful in 9s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m18s
CI / Platform (Go) (pull_request) Successful in 8m54s
CI / Canvas (Next.js) (pull_request) Successful in 9m32s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Failing after 3s
Harness Replays / Harness Replays (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 10s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m17s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m15s
CI / all-required (pull_request) Successful in 1s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 5s
Tests listRegistryFiltered and ListRegistry endpoint:
  - empty dir → empty list
  - files (non-dirs) → ignored
  - single plugin with plugin.yaml → appears in list
  - runtime filter: claude-code plugin matches, hermes plugin excluded
  - universal plugin (no runtimes field) → always included
  - nonexistent dir → empty list (ReadDir error is non-fatal)
  - HTTP endpoint: GET /plugins → 200 with JSON array

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 00:44:48 +00:00
devops-engineer 231dfcf523 Merge pull request '[P0][release-blocker] fix(handlers): detach executeDelegation ctx from HTTP request (regression ce2db75f, internal#497/#498)' (#1446) from fix/a2a-delegation-detached-ctx-canceled-internal-497 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 4s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 12s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 16s
CI / Platform (Go) (push) Successful in 6m30s
CI / Canvas (Next.js) (push) Successful in 7m48s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
CI / Python Lint & Test (push) Successful in 2s
E2E Chat / E2E Chat (push) Failing after 3s
Harness Replays / Harness Replays (push) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 48s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
cascade-list-drift-gate / check (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m46s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 16s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 1m15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 46s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m35s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Failing after 1m16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m11s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m24s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m46s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 8s
publish-runtime-autobump / pr-validate (pull_request) Successful in 43s
gate-check-v3 / gate-check (pull_request) Successful in 7s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m40s
sop-tier-check / tier-check (pull_request) Successful in 11s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m26s
CI / Canvas Deploy Reminder (push) Successful in 1s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m3s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m19s
CI / Canvas (Next.js) (pull_request) Successful in 7m28s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
E2E Chat / E2E Chat (pull_request) Failing after 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m5s
Harness Replays / Harness Replays (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m49s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 49s
CI / all-required (push) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m47s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8m8s
CI / all-required (pull_request) Successful in 1s
2026-05-17 22:52:56 +00:00
core-be e740ffe23f fix(handlers): detach executeDelegation ctx from HTTP request — A2A delegation P0 (regression ce2db75f, internal#497)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 16s
CI / Detect changes (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 23s
E2E Chat / detect-changes (pull_request) Successful in 23s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 25s
Harness Replays / detect-changes (pull_request) Successful in 16s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request) Successful in 16s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m20s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 12s
sop-tier-check / tier-check (pull_request) Successful in 14s
CI / Platform (Go) (pull_request) Successful in 9m47s
CI / Canvas (Next.js) (pull_request) Successful in 10m21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 17s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m20s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m38s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 3s
audit-force-merge / audit (pull_request) Successful in 3s
A2A peer_agent delegation delivery has been 100% broken fleet-wide since
2026-05-12. Delegate() ran the fire-and-forget executeDelegation goroutine
on c.Request.Context(); the handler returns HTTP 202 immediately, which
cancels that context, so every DB op + proxy call in the detached
goroutine failed `context canceled` the instant the response was written.
lookupDeliveryMode swallowed the resulting error and silently defaulted to
push, skipping the poll-mode short-circuit that writes the a2a_receive
inbox row — so poll-mode peers (e.g. hongming-pc) never received messages
and push-mode peers hit the #190-style self-echo timeouts. Introduced by
ce2db75f ("handlers: pass cancellable context through executeDelegation").

Primary fix (delegation.go): derive the goroutine context via
context.WithTimeout(context.WithoutCancel(ctx), 30*time.Minute). WithoutCancel
detaches request cancellation/deadline while preserving all ctx values
(trace/correlation/tenant ids the proxy + broadcaster read). This is the
established pattern in this package (a2a_proxy.go:850,
a2a_proxy_helpers.go:525, registry.go:822); the 30m budget matches the
pre-ce2db75f internal budget and the proxy's own agent-dispatch ceiling.

Secondary fix, surgical (a2a_proxy_helpers.go + a2a_proxy.go), RFC#497
fail-closed theme: lookupDeliveryMode no longer swallows a *context*
error (context.Canceled / context.DeadlineExceeded) into a silent push
default — it propagates so the caller fails closed with a structured 503.
Scope deliberately narrowed to ctx errors only: generic DB errors retain
the long-standing documented fail-open-to-push contract (loud + recoverable
502/SSRF/restart, unlike the silent poll drop), so checkWorkspaceBudget's
intentional fail-open and the existing suite are unaffected. Widening
further is an RFC#497 follow-up, not part of this P0.

Regression tests:
- TestDelegate_DetachedContext_SurvivesRequestCancellation: detached ctx
  outlives request cancellation AND preserves parent values + deadline.
- TestLookupDeliveryMode_ContextCanceled_FailsClosed: ctx-cancelled
  delivery-mode read returns an error, never push.
- TestProxyA2A_PollMode_FailsClosedToPush: legacy non-ctx-DB-error
  fail-open-to-push contract preserved.

Full workspace-server/internal/handlers package suite passes (go test
-count=1), go build ./... and go vet clean.

Refs: internal#497, regression ce2db75f

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 15:15:44 -07:00
Molecule AI · core-uiux 3fd38e6deb fix(canvas): keep "agent is working" indicator alive for external/poll-mode turns
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 4m58s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 6m49s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 5/7 — missing: root-cause, no-backwards-compat
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 6s
External / MCP-registered workspaces (delivery_mode=poll, e.g. the
local-Mac orchestrator) have no public URL: POST /workspaces/:id/a2a
short-circuits server-side (a2a_proxy.go:402) and returns a synthetic
{status:"queued", delivery_mode:"poll"} envelope IMMEDIATELY with no
reply. useChatSend treated that as a terminal response and called
releaseSendGuards() → `sending` went false the instant the POST
returned → the ChatTab thinking indicator vanished and the external
turn looked dead, even though the agent had not started. Internal
agents reply on the same synchronous POST so their indicator naturally
holds — that's the whole asymmetry; the transport is shared.

Fix (Tier A, minimal, client-only): detect the synthetic poll envelope
and KEEP `sending` true so the existing internal working-indicator
persists as a "received — agent is working" state. The eventual reply
already flows AgentMessageWriter → AGENT_MESSAGE WS push →
useChatSocket onAgentMessage/onSendComplete → releaseSendGuards, i.e.
the external reply now drives the indicator through the exact same path
an internal async reply uses — no parallel system. A generous 15-min
safety timer surfaces an honest, actionable error instead of an
infinite spinner if an offline poll agent never replies.

Tier B (lifecycle-driven interim progress) and Tier C (tool-call
parity — has an operator-privacy tradeoff, CTO decision required) are
designed in internal RFC external-workspace-canvas-progress-feedback,
not in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 11:12:16 -07:00
core-fe 31fedd50af fix(canvas): mobile chat realtime — WS wake-recovery + resume back-fill
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
CI / Platform (Go) (pull_request) Successful in 7m38s
CI / Canvas (Next.js) (pull_request) Successful in 9m15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Python Lint & Test (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1s
sop-tier-check / tier-check (pull_request) Bypass — systemic / runner outage
E2E Chat / E2E Chat (pull_request) Bypass — systemic / runner outage
sop-checklist / all-items-acked (pull_request) Bypass — systemic / runner outage
sop-checklist / na-declarations (pull_request) Bypass — systemic / runner outage
audit-force-merge / audit (pull_request) Successful in 5s
Root cause: agent chat replies arrive ONLY via live AGENT_MESSAGE /
A2A_RESPONSE WebSocket events (store/socket.ts ReconnectingSocket).
The socket had NO visibilitychange/pageshow/online/focus recovery
anywhere — reconnect was driven solely by ws.onclose. iOS Safari /
Chrome-mobile freeze the page and silently tear the WS down WITHOUT
firing onclose when backgrounded / device-locked; on thaw nothing
re-arms (onclose never ran, interval timers were suspended), so the
socket is dead and every reply during suspension is lost.
store.rehydrate() only re-pulls /workspaces status, not chat, and
useChatHistory fetches DB history mount-only — so missed replies
never back-fill until remount (navigate away+back) or hard refresh.
Desktop never hits this (its onclose fires promptly). Not forked
code: MobileChat and desktop ChatTab share useChatSocket /
useChatHistory; the divergence is purely mobile browser lifecycle.

Fix (shared by desktop + mobile, restores the unified path):
- ReconnectingSocket installs visibilitychange/pageshow/online/focus
  listeners that force a reconnect when the page is foregrounded and
  the socket isn't OPEN/CONNECTING (no-ops on desktop where onclose
  already handled it; SSR-safe via typeof guards).
- On an open that follows a real loss, emit a socket-resume signal.
- useChatHistory subscribes to resume and re-runs loadInitial(), so
  chat threads back-fill the persisted messages missed while frozen —
  exactly what a remount does today, automatically.

Verification: code analysis + 9 new unit tests (socket.test.ts)
simulating background-suspend + visibility/pageshow/online/focus
transitions, asserting reconnect + resume emission + listener
teardown; full canvas suite green (3315 passed, 0 fail); npm run
build green. Mobile real-device confirmation still needed (cannot
be proven without a physical device) — see PR body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 10:39:13 -07:00
devops-engineer 283ebd5b47 Merge pull request 'fix(canvas): make Settings→Secrets reveal honest (value is write-only)' (#1421) from fix/secrets-reveal-anthropic-internal into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 12s
E2E Chat / detect-changes (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 14s
Harness Replays / detect-changes (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 10s
CI / Platform (Go) (push) Successful in 10m3s
CI / Canvas (Next.js) (push) Successful in 10m34s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
CI / Python Lint & Test (push) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2s
E2E Chat / E2E Chat (push) Failing after 1s
Harness Replays / Harness Replays (push) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m17s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 36s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 1s
2026-05-17 17:32:30 +00:00
fullstack-engineer d8c03e9af5 fix(canvas-mobile): chat composer font-size to 16px to stop iOS focus-zoom
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
CI / Platform (Go) (pull_request) Successful in 9m44s
CI / Canvas (Next.js) (pull_request) Successful in 10m27s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Python Lint & Test (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 5/7 — missing: root-cause, no-backwards-compat
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 6s
On the mobile PWA, tapping into the chat input scaled the whole
viewport up. Root cause: iOS Safari/WebKit auto-zooms on focus when
the focused field's effective font-size is < 16px. The mobile chat
composer <textarea> (MobileChat.tsx) used fontSize: 14.5.

Fix is the root-cause one: raise the composer font to 16px. This
suppresses the focus-zoom WITHOUT a maximum-scale / user-scalable
viewport lock, so pinch-to-zoom accessibility is preserved. The app
has no viewport export (Next.js default width=device-width,
initial-scale=1) — intentionally left untouched.

Adds a regression test asserting the composer font-size is >= 16px.
2026-05-17 10:31:56 -07:00
fullstack-engineer 878e08c7fc trigger: re-fire CI all-required sentinel (Gitea 1.22.6 skipped-sentinel rerun; no code change)
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 6m24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
CI / Canvas (Next.js) (pull_request) Successful in 7m57s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
publish-runtime-autobump / pr-validate (pull_request) Successful in 27s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 57s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 3s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E Chat / E2E Chat (pull_request) Failing after 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m2s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m25s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m16s
CI / Python Lint & Test (pull_request) Successful in 6m42s
CI / all-required (pull_request) Successful in 1s
audit-force-merge / audit (pull_request) Successful in 8s
The CI / all-required sentinel job was never scheduled in the prior
ci.yml run (documented Gitea-1.22/act_runner skipped-sentinel quirk), so
it never posted its terminal status and the required context stayed
pending. Empty-tree commit is the sanctioned 1.22.6 rerun mechanism — it
makes the real sentinel job actually schedule and post its genuine
status. No source change.
2026-05-17 17:13:26 +00:00
devops-engineer 13073cdedd Merge pull request 'fix(registry): reconcile agent_card identity from trusted workspaces row (internal#492)' (#1427) from fix/agent-card-identity-reconcile-internal-492 into staging
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 6s
CI / Platform (Go) (push) Successful in 6m16s
CI / Canvas (Next.js) (push) Successful in 6m59s
E2E API Smoke Test / detect-changes (push) Successful in 7s
E2E Chat / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
2026-05-17 17:00:11 +00:00
devops-engineer d79f28ace0 Merge pull request 'fix(workspace-server): actionable error when EIC config.yaml write is deadline-killed' (#1426) from fix/eic-write-timeout-actionable-error into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-17 17:00:08 +00:00
devops-engineer 0655d5acf0 Merge pull request 'fix(canvas): Secrets "Test" reports honest error instead of fake "Connection timed out"' (#1424) from fix/secret-test-honest-error-internal-492 into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-17 17:00:05 +00:00
infra-runtime-be 50dea87a9d Merge pull request 'fix(tests)+build: complete secret-scan fixture cleanup for #1420' (#1431) from runtime/fix-api03-test-fixture into fix/issue212-actionable-agent-error-reason
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 6m5s
publish-runtime-autobump / pr-validate (pull_request) Successful in 25s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 55s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 5s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m0s
CI / Canvas (Next.js) (pull_request) Successful in 7m38s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
E2E Chat / E2E Chat (pull_request) Failing after 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 49s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m11s
CI / Python Lint & Test (pull_request) Successful in 6m28s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1s
2026-05-17 16:42:01 +00:00
infra-runtime-be 335796b0b4 fix(tests): replace remaining sk-ant-api03- fixtures with non-matching tokens
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
publish-runtime-autobump / pr-validate (pull_request) Successful in 28s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Successful in 3s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
audit-force-merge / audit (pull_request) Successful in 4s
The secret-scan workflow flags sk-ant-[A-Za-z0-9_-]{40,} patterns.
Two sk-ant-api03-* fixture tokens (47 and 62 chars) were present in
test_sanitize_agent_error_reason_scrubs_all_secret_formats. They were
not replaced by PR #1430 (which only fixed the sk-ant-DEADBEEF* tokens).

Replace with tokens that still exercise the same scrubber paths:

- BARE sk-* case (≥24 chars after "sk-"): use sk-FAKEPLACEHOLDER...
  (53 chars total; starts with "sk-" so the bare-pattern scrubber catches
  it, but lacks "sk-ant-" so the secret-scan pattern does not fire).

- JSON-quoted apiKey value (≥24 chars): use anon_fakefakefake...
  (45 chars; satisfies the JSON-quoted redaction path; does not match
  any secret-scan credential pattern).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:34:31 +00:00
infra-runtime-be 699b5fb275 Merge pull request 'fix(tests)+build: unblock secret scan and Runtime PR-Built on #1420' (#1430) from runtime/fix-test-fixture-v3 into fix/issue212-actionable-agent-error-reason
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
publish-runtime-autobump / pr-validate (pull_request) Successful in 30s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m6s
E2E Chat / E2E Chat (pull_request) Failing after 1s
Harness Replays / Harness Replays (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m44s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m52s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m32s
CI / Platform (Go) (pull_request) Successful in 6m41s
CI / Canvas (Next.js) (pull_request) Successful in 7m19s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m37s
CI / all-required (pull_request) Successful in 0s
2026-05-17 16:18:01 +00:00
infra-runtime-be fb2fd20c9e fix(tests)+build: unblock secret scan and Runtime PR-Built on #1420
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 3s
publish-runtime-autobump / pr-validate (pull_request) Successful in 24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 3s
Two CI failures blocking PR #1420:
1. Secret scan: `workspace/tests/test_executor_helpers.py` contains two
   `sk-ant-DEADBEEF...` fixtures matching `sk-ant-[A-Za-z0-9_-]{40,}`.
   Replaced both with PLACEHOLDER_LONG_TOKEN_... (≥40 chars, no sk-ant-
   prefix — scrubber path still exercised).
2. Runtime PR-Built: `workspace/a2a_tools_identity.py` missing from
   TOP_LEVEL_MODULES in scripts/build_runtime_package.py, causing build
   failure with "TOP_LEVEL_MODULES drifted". Added it.

Both fixes verified locally:
- pytest affected tests: 3/3 PASSED
- build_runtime_package.py: builds cleanly

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 15:48:31 +00:00
fullstack-engineer 7d2eaa3748 harden(runtime): scrub bare sk-ant keys, JSON-quoted token/apiKey, aws_secret_access_key in _sanitize_for_external
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 7s
publish-runtime-autobump / pr-validate (pull_request) Successful in 35s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 9s
gate-check-v3 / gate-check (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 9s
qa-review / approved (pull_request) Successful in 10s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 10m22s
CI / Canvas (Next.js) (pull_request) Successful in 10m48s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Failing after 3s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 54s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 43s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m56s
CI / Python Lint & Test (pull_request) Successful in 6m40s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1s
Addresses internal#212 PR#1420 dual-review SECURITY finding (infra-sre /
infra-runtime-be): _sanitize_for_external missed three real credential
shapes because the legacy regex requires a `[ :=]+` separator after the
prefix:
- bare `sk-ant-api03-…` keys (real key uses `-`, not `[ :=]`)
- JSON-quoted "token"/"apiKey"/"secret"/"password" values
- `aws_secret_access_key=…`

Added three narrowly-scoped regexes (length thresholds tuned so curated
short examples like `sk-ant-EXAMPLE-SHORT` / `ghp_SHORT_TOKEN` and all
actionable auth/quota/HTTP guidance still pass through). Extended the unit
test with test_sanitize_agent_error_reason_scrubs_all_secret_formats
asserting redaction for all three new formats plus the original Bearer
regression. Full sanitize suite green; existing passthrough assertions
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:56:16 -07:00
core-be 488018b156 fix(registry): reconcile agent_card identity from trusted workspaces row (internal#492)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 15s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request) Successful in 10s
qa-review / approved (pull_request) Successful in 11s
security-review / approved (pull_request) Successful in 11s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 11s
sop-tier-check / tier-check (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 11m38s
CI / Canvas (Next.js) (pull_request) Successful in 11m41s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 4s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m26s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m16s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1s
audit-force-merge / audit (pull_request) Successful in 3s
The runtime builds its AgentCard from config.name, which the
CP-regenerated /configs/config.yaml sets to the raw workspace UUID — so
/registry/register stored (and /.well-known/agent-card.json + peer
agent_card_url served) a card with name=<uuid>, description="",
role=null, even though the operator-controlled workspaces.name DB
column holds the friendly name the canvas shows ("Claude Code Agent").
Fleet-wide; live registry confirmed name=UUID for ws 3b81321b while
workspaces.name="Claude Code Agent".

Server-side, platform-controlled repair at the register upsert: when the
runtime-supplied agent_card.name is empty or equals the workspace UUID,
substitute the trusted workspaces.name; default a blank description from
the reconciled name; default role from workspaces.role. Gaps are only
FILLED — a card already carrying a real friendly name (external channel
agents) is never downgraded; malformed/edge cards are stored verbatim
(no-worse-than-before). Identity stays platform-sourced from the
operator-controlled DB row — the agent gains no self-edit. Works for all
runtimes without touching every template or the CP generator. The
WORKSPACE_ONLINE broadcast now carries the reconciled card so the canvas
live-updates with the friendly name.

Pure helper (agent_card_reconcile.go) is exhaustively unit-tested
without DB/HTTP. Upstream CP config.yaml regeneration, the missing role
key in the runtime register payload, and an editable description/skills
surface are RFC-scoped in internal#492.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:50:14 -07:00
core-be 8f9b6a73f9 fix(workspace-server): actionable error when EIC config.yaml write is deadline-killed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
E2E Chat / detect-changes (pull_request) Successful in 15s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 6s
security-review / approved (pull_request) Successful in 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
CI / Platform (Go) (pull_request) Successful in 10m20s
CI / Canvas (Next.js) (pull_request) Successful in 11m22s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Failing after 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 55s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m30s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 2s
audit-force-merge / audit (pull_request) Successful in 4s
When the per-op context deadline (eicFileOpTimeout=30s) fires,
exec.CommandContext SIGKILLs the ssh subprocess and Run() returns the
bare "signal: killed" with empty stderr. That surfaced to the canvas
Settings/Config tab as an opaque
`500 {"error":"ssh install: signal: killed ()"}` — giving the operator
no signal that the workspace was simply mid-provision with a slow/unready
EIC tunnel (internal#423; recurred 2026-05-17 on claude-code ws
3b81321b, blocking config save).

Detect context abortion explicitly and return a message that names the
cause and points at the Settings -> Secrets encrypted-write path (which
does NOT use this EIC file-write path) as the unblock for applying
provider credentials. The EIC mechanism, timeout value, and success
path are unchanged — this only improves the error a stuck write emits.

Refs internal#423. Same Settings-area opaque-500 theme as #1420.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:45:14 -07:00
core-fe 3fc585b939 fix(canvas): Secrets "Test" reports honest error instead of fake "timed out"
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 5s
security-review / approved (pull_request) Successful in 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 6m12s
CI / Canvas (Next.js) (pull_request) Successful in 8m16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 1s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 50s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m39s
CI / all-required (pull_request) Successful in 1s
audit-force-merge / audit (pull_request) Successful in 5s
The Secrets test button calls POST ${PLATFORM_URL}/secrets/validate, a
route that has never been implemented on the workspace-server router
(router.go registers /secrets, /secrets/values, /settings/secrets,
/admin/secrets — no /secrets/validate) nor on the Next.js canvas. Live
probe: POST /secrets/validate → HTTP 404 in 0.28s (a fast 404, not a
network timeout).

request() throws ApiError(404); TestConnectionButton's bare `catch {}`
swallowed it and unconditionally rendered the hardcoded string
"Connection timed out. Service may be down." — factually wrong and
indistinguishable from a real outage or a token rejection.

Minimal fix (same "make the dead affordance honest" approach as the
reveal control, internal#490 / PR#1421): bind the caught error and
surface the real failure — distinguish "validation not available"
(404/501), a non-404 server error (with status), and a genuine
connectivity failure. No speculative server-side validate endpoint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:38:29 -07:00
core-be 0d6b61bfff fix(canvas): make Settings→Secrets reveal honest (value is write-only)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 14s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 8s
security-review / approved (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
CI / Platform (Go) (pull_request) Successful in 8m18s
CI / Canvas (Next.js) (pull_request) Successful in 9m14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
CI / Python Lint & Test (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Failing after 9s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 6s
audit-force-merge / audit (pull_request) Successful in 5s
The eye/RevealToggle in SecretRow was a dead affordance: it flipped a
local `revealed` boolean but the row always rendered `masked_value` and
never consumed it, so nothing was ever revealed. RevealToggle renders an
eye-WITH-SLASH when revealed=true, so a clicked row looked "active" while
showing nothing — read by users as "this doesnt work" (reported on
CLAUDE_CODE_OAUTH_TOKEN / Anthropic group).

Root cause is not Anthropic/OAuth/category-specific and not a server
4xx/5xx: secret values are write-only from the browser by design — the
server List handler "Never exposes values", there is no per-secret
decrypt route, and the only decrypted path (GET /secrets/values) is bulk
+ token-gated for remote agents and never called by canvas. The client
has no plaintext-fetch function. Reveal is architecturally impossible
without a deliberate security regression (out of scope).

Fix: remove the dead toggle (+ its local state / auto-hide effect) and
show a static write-only indicator (lock + explanatory title). Edit
(rotate/replace) and Delete are unaffected and independent of reveal.

Refs: internal#490; sibling Secrets/Tokens fixes PR #1415 + #1420
(referenced in triage as internal#210 / internal#211). Does not touch
the agent-error path (internal#212).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:25:23 -07:00
fullstack-engineer 44b78e28c8 fix(runtime+canvas): surface actionable provider error reason instead of opaque "Agent error (Exception)"
CI / all-required (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 6s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 6s
security-review / approved (pull_request) Successful in 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
publish-runtime-autobump / pr-validate (pull_request) Successful in 33s
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
E2E Chat / E2E Chat (pull_request) Failing after 13s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 55s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m38s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m38s
CI / Platform (Go) (pull_request) Successful in 7m2s
CI / Python Lint & Test (pull_request) Successful in 6m39s
CI / Canvas (Next.js) (pull_request) Successful in 7m56s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
internal#212 (P0 from internal#211). When the embedded `claude` CLI emits a
terminal result message with is_error=true (e.g. 403 oauth_org_not_allowed
"Your organization has disabled Claude subscription access · Use an
Anthropic API key instead, or ask your admin to enable access"), the user
saw only `Agent error (Exception) — see workspace logs for details.` — a
dead end (no such logs UI) that discards the exact secret-safe, actionable
text the user needs.

Root cause was a multi-cut loss of the CLI's result/error/api_error_status:

  cut #2  sanitize_agent_error reduced every failure to type(exc).__name__.
          → add a `reason` passthrough: a pre-curated, user-actionable,
            secret-safe explanation is surfaced verbatim (still scrubbed for
            key/token/bearer as a second pass). reason wins over stderr;
            omitting it preserves the prior generic behavior exactly.

  cut #3a workspace-server dropped error_detail from the live
          ACTIVITY_LOGGED websocket broadcast (it was persisted to the DB
          column but never sent), so the canvas had nothing to render.
          → include error_detail in the broadcast payload (already capped
            at 4096 by the runtime's report_activity helper).

  cut #3b canvas useChatSocket hardcoded the opaque string, ignoring even
          the activity summary.
          → render error_detail (fallback: summary, then a generic retry
            hint). The dead "see workspace logs for details." phrase that
            pointed at nonexistent UI is removed (a full logs tab is a
            separate larger follow-up, not this PR — reason-first per CTO).

The runtime-side cut #1 (template-claude-code claude_sdk_executor._run_query
ignoring is_error and the SDK collapsing errors[] to the bare subtype
"success") is fixed in a stacked PR on
molecule-ai-workspace-template-claude-code (depends on this PR's
sanitize_agent_error `reason` kwarg, which ships via the
molecule-ai-workspace-runtime package).

Tests: 4 new sanitize_agent_error reason tests (verbatim surfacing, secret
scrub still applied, reason>stderr precedence, no-reason unchanged). Verified
fail-before / pass-after; full sanitize suite green; no new regressions (the
2 pre-existing test_get_a2a_instructions_mcp failures are unrelated).

Refs: internal#211, internal#212

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:20:14 -07:00
devops-engineer 330f54d281 Merge pull request 'fix(tokens): Workspace Tokens tab 500 on 'global' sentinel (no node selected)' (#1415) from fix/workspace-tokens-global-sentinel-500 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 4s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
CI / Python Lint & Test (push) Successful in 5s
CI / Platform (Go) (push) Successful in 4m27s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 35s
E2E Chat / E2E Chat (push) Failing after 10s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m10s
Harness Replays / Harness Replays (push) Successful in 1s
CI / Canvas (Next.js) (push) Successful in 6m2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 30s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 2s
2026-05-17 13:46:55 +00:00
hongming 4fd6612272 fix(tokens): make Workspace Tokens tab sentinel-aware + reject non-UUID workspace id
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m37s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m55s
CI / Platform (Go) (pull_request) Successful in 5m8s
CI / Canvas (Next.js) (pull_request) Successful in 6m20s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 0s
audit-force-merge / audit (pull_request) Successful in 3s
Settings → Workspace Tokens 500'd whenever opened with no canvas node
selected. SettingsPanel passes the literal sentinel "global" as the
workspace id; the backend queries the uuid `workspace_id` column with
it → Postgres `invalid input syntax for type uuid: "global"` → opaque
500 ("failed to list tokens"). Token create in that view broke the same
way. SecretsTab already handles the sentinel (api/secrets.ts reroutes
"global" → /settings/secrets); TokensTab did not — that asymmetry was
the bug. Pre-existing since 2026-04-13, NOT a regression.

Frontend (user-visible fix): TokensTab is now sentinel-aware like
SecretsTab. When workspaceId === "global" (no node selected) it no
longer calls /workspaces/global/tokens — it renders a clean state
pointing the user to the Org API Keys tab (the existing org-wide
surface). No 500, no scary error banner. The red account "Error" in
this view was just this 500 surfacing through TokensTab's local error
banner; it resolves with this guard (verified in code — no separate
widget).

Backend (defense-in-depth, same PR): List/Create/Revoke validate
c.Param("id") as a UUID up front and return 400 {"error":"invalid
workspace id"} instead of leaking a DB type error as a 500. Added the
missing log.Printf on the List query-error branch — it was the only
token handler silently swallowing the DB error, which is why this
incident had zero log trail. Mirrors the uuid.Parse guard already in
handlers/activity.go.

Workaround (pre-merge): select a workspace node before opening the
tab, or use the Org API Keys tab.

Product note for CTO: there is no /workspaces/global/tokens endpoint
(workspace tokens are inherently per-workspace; the org-wide
equivalent is the separate Org API Keys tab), so — unlike SecretsTab
which reroutes to a real global-secrets endpoint — the lowest-risk
safe behavior was a disabled state + pointer to Org API Keys rather
than a reroute. Flag if a different UX is wanted.

Tests: added TokensTab sentinel tests (no API call + Org-pointer) and
a backend table test asserting List/Create/Revoke 400 on non-UUID id
without hitting the DB. Updated existing token handler tests to use
valid UUIDs (they used "ws-1" etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 06:22:00 -07:00
core-be b6eecb58d7 fix(handlers): prevent poll-mode sync-persist test from hanging CI
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 24s
CI / Detect changes (pull_request) Successful in 23s
E2E API Smoke Test / detect-changes (pull_request) Successful in 35s
E2E Chat / detect-changes (pull_request) Successful in 39s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 29s
Harness Replays / detect-changes (pull_request) Successful in 29s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 26s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 36s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m0s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 8s
E2E Chat / E2E Chat (pull_request) Failing after 25s
CI / Canvas (Next.js) (pull_request) Successful in 25m51s
Harness Replays / Harness Replays (pull_request) Successful in 21s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 18s
CI / Platform (Go) (pull_request) Successful in 28m21s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m11s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 8m3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 3s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
qa-review / approved (pull_request) Refired via /qa-recheck by unknown
security-review / approved (pull_request) Refired via /security-recheck by unknown
audit-force-merge / audit (pull_request) Successful in 8s
sqlmock.ExpectationsWereMet() hangs indefinitely when the expected INSERT
mock never fires. If the production code ever regresses to goAsync
(pre-fix shape), the handler returns before the INSERT fires, the mock
never fires, and ExpectationsWereMet() blocks for the full test/-suite
timeout — wedging the CI run with no diagnostic.

Fix: check expectations in a goroutine with a 2s hard timeout. When
the mock has fired (synchronous production code), ExpectationsWereMet()
returns <1ms and the select fires the `case err := <-expectDone` arm.
When the mock has NOT fired (async regression), the 2s timeout fires and
the test fails with a clear message instead of hanging.

Also reduce insertDelay from 150ms → 50ms. 50ms is ~50× the normal INSERT
latency and sufficient to prove synchronous blocking; the larger value
was adding unnecessary suite-level wall-clock under -race detection,
where mock delays are amplified by the instrumenter's goroutine overhead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 18:37:19 +00:00
core-be 159b3978c1 fix(workspace-server): persist poll-mode canvas user message synchronously before queued 200
Sibling of #1347/internal#470 — the POLL-mode arm of the canvas
user-message data-loss bug Hongming reported ("i sometimes lose my own
message when i exit chat", 2026-05-16).

Hongming's tenant is entirely poll-mode (4 external workspaces, no URL —
verified empirically: every workspace returns the {delivery_mode:poll,
status:queued} short-circuit envelope), so #1347 (push-mode only,
persists AFTER the poll short-circuit) structurally cannot cover his
reported case. #1347's "poll-mode was never affected" framing is
overstated: logA2AReceiveQueued's durable activity_logs INSERT ran
inside h.goAsync(...) — a detached goroutine with no happens-before
barrier against the synthetic {status:queued} 200. The canvas sees the
send acknowledged while the row may still be racing; a workspace-server
restart / deploy / OOM / EC2 hibernation between the 200 and the
goroutine's commit loses the message permanently (chat-history reads
activity_logs; missing row = message gone on reopen). No fallback
either, unlike push-mode's legacy-INSERT path.

Fix: make the poll-mode ingest persist SYNCHRONOUS — committed before
the queued 200 — on a context.WithoutCancel context (parity with
persistUserMessageAtIngest). Best-effort preserved (LogActivity
logs+swallows INSERT errors, never blocks the send). Post-commit
broadcast still fires inside LogActivity (a missed WS event is not data
loss; the durable row is the truth chat-history re-reads on reopen).

TDD: a2a_poll_ingest_persist_test.go — deterministic RED (queued 200
returned in ~0.5ms, before the 150ms INSERT → DATA LOSS) → GREEN after
fix. Full internal/handlers + internal/messagestore suites green; vet
clean.

Refs: molecule-ai/internal#471 (tracking), molecule-ai/internal#470 (push-mode sibling, PR #1347)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:37:19 +00:00
devops-engineer b5411d2c37 Merge pull request 'harden(provisioner): denylist SCM-write tokens from tenant workspace env (forensic #145)' (#1277) from harden/provisioner-scm-token-denylist into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 17s
Harness Replays / detect-changes (push) Successful in 19s
CI / Detect changes (push) Successful in 1m14s
E2E Chat / detect-changes (push) Successful in 1m25s
E2E API Smoke Test / detect-changes (push) Successful in 1m28s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 26s
Handlers Postgres Integration / detect-changes (push) Successful in 1m18s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 1m9s
CI / Shellcheck (E2E scripts) (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 10s
E2E Chat / E2E Chat (push) Failing after 44s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m15s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 5m57s
CI / Canvas (Next.js) (push) Successful in 18m50s
CI / Platform (Go) (push) Successful in 19m21s
CI / Canvas Deploy Reminder (push) Successful in 6s
CI / all-required (push) Has been cancelled
2026-05-16 05:16:21 +00:00
claude-ceo-assistant 03ad7ab2d8 chore(ci): re-trigger required checks (post-#441 fix; 03:50Z storm-cancel residue)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 18s
CI / Detect changes (pull_request) Successful in 1m9s
Harness Replays / detect-changes (pull_request) Successful in 29s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 29s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m26s
E2E Chat / detect-changes (pull_request) Successful in 1m23s
gate-check-v3 / gate-check (pull_request) Successful in 25s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m29s
qa-review / approved (pull_request) Successful in 29s
security-review / approved (pull_request) Successful in 29s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
sop-checklist / all-items-acked (pull_request) Successful in 33s
sop-tier-check / tier-check (pull_request) Successful in 26s
CI / Python Lint & Test (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 11s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 20s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m44s
E2E Chat / E2E Chat (pull_request) Failing after 23s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m46s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m22s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 18m41s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 19m37s
CI / all-required (pull_request) Successful in 5s
audit-force-merge / audit (pull_request) Successful in 25s
2026-05-15 21:40:47 -07:00
core-be fd545a332b harden(provisioner): denylist SCM-write tokens from tenant workspace env (forensic #145)
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E Chat / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Harness Replays / detect-changes (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 28s
CI / Detect changes (pull_request) Successful in 3m54s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 24m25s
CI / Platform (Go) (pull_request) Successful in 26m22s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has been cancelled
E2E Chat / E2E Chat (pull_request) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Has been cancelled
Harness Replays / Harness Replays (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
Tenant workspace containers run agent-controlled code and must never
receive a Git SCM write credential — agents structurally lacking
merge/approve creds is why the two-eyes review gate is self-bypass-proof
against forged-approval injection.

Latent path: handlers.loadPersonaEnvFile() merges a per-role persona
GITEA_TOKEN into cfg.EnvVars when MOLECULE_PERSONA_ROOT is set on a
tenant host; it then flowed unfiltered through buildContainerEnv()
(local Docker) and CPProvisioner.Start() (tenant EC2). Inert today
(persona dirs are operator-host-only) but unguarded — and the
pre-existing TestBuildContainerEnv_CustomEnvVarsAppended test actually
asserted GITHUB_TOKEN passed through verbatim.

Adds a narrow, auditable exact-match denylist (isSCMWriteTokenKey:
GITEA/GITHUB/GH/GITLAB/GL/BITBUCKET _TOKEN) applied by construction in
both env paths, plus negative-assertion tests covering the normal path
and a persona-file-merge simulation. Non-credential persona identity
(GITEA_USER, GITEA_USER_EMAIL) is intentionally preserved. No
provisioner refactor.

Tracking: molecule-ai/internal#438

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 18:43:18 -07:00
devops-engineer 8334f7df46 Merge pull request 'fix(handlers): drain detached async goroutines before test db.DB swap (data race)' (#1267) from fix/handlers-global-dbdb-data-race into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Harness Replays / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
CI / Detect changes (push) Successful in 1m33s
E2E API Smoke Test / detect-changes (push) Successful in 2m55s
E2E Chat / detect-changes (push) Successful in 2m39s
CI / Python Lint & Test (push) Successful in 17s
E2E Chat / E2E Chat (push) Failing after 55s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3m21s
CI / Canvas (Next.js) (push) Successful in 24m37s
CI / Platform (Go) (push) Successful in 26m11s
Handlers Postgres Integration / Handlers Postgres Integration (push) Has been cancelled
Harness Replays / Harness Replays (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
CI / all-required (push) Has been cancelled
2026-05-16 01:30:41 +00:00
core-be 69d9b4e38d fix(handlers): drain detached async goroutines before test db.DB swap (data race)
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 29s
Harness Replays / detect-changes (pull_request) Successful in 56s
CI / Detect changes (pull_request) Successful in 1m42s
E2E Chat / detect-changes (pull_request) Successful in 1m30s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m33s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 45s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m18s
gate-check-v3 / gate-check (pull_request) Successful in 31s
qa-review / approved (pull_request) Successful in 31s
security-review / approved (pull_request) Successful in 26s
sop-tier-check / tier-check (pull_request) Successful in 28s
sop-checklist / all-items-acked (pull_request) Successful in 32s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m5s
Harness Replays / Harness Replays (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Successful in 22s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 24s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m51s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7m48s
CI / Canvas (Next.js) (pull_request) Successful in 22m59s
CI / Platform (Go) (pull_request) Successful in 25m39s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 26s
audit-force-merge / audit (pull_request) Successful in 30s
Root cause: platform/internal/db.DB is a swappable package global.
setupTestDB (+ peer test helpers) saves/restores it via t.Cleanup, but
production code spawns fire-and-forget goroutines (maybeMarkContainerDead/
preflightContainerHealth -> RestartByID -> runRestartCycle, logA2ASuccess/
Failure activity logging, gracefulPreRestart, sendRestartContext) that
read db.DB. These detached goroutines outlive the test that triggered
them and race the db.DB pointer write in a LATER test's cleanup —
WARNING: DATA RACE on platform/internal/db.DB, surfaced deterministically
by PR#1240's expanded A2A test corpus on staging (a sibling of the
mc#664/mc#774 Phase-3-masked handler-test family). Pre-existing since
be5fbb5a (2026-05-07); NOT introduced by #1240/#1250.

Fix:
- Convert the leaked raw `go ...` restart/a2a-logging goroutines to the
  existing tracked h.goAsync (asyncWG) — matches the already-correct
  site at a2a_proxy.go:648 and goAsync's documented intent.
- Wire the never-connected test-drain half: a newHandlerHook (nil in
  prod, zero cost) lets the test harness register every handler;
  setupTestDB's cleanup now drains all tracked async goroutines BEFORE
  restoring db.DB, eliminating the race window.

Verified: full `go test -race -timeout ./...` (CI step) green, 0 races,
0 failures; the 8 originally-failing tests pass -race -count=5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 17:53:36 -07:00
devops-engineer a4a1194a31 Merge pull request 'feat(canvas): /agent-home root option + secret-shape denial placeholder (internal#425 Phase 3)' (#1257) from feat/canvas-files-agent-home-internal-425-phase-3 into staging
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 32s
Harness Replays / detect-changes (push) Successful in 34s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 37s
CI / Detect changes (push) Successful in 2m7s
E2E API Smoke Test / detect-changes (push) Successful in 2m59s
E2E Chat / detect-changes (push) Successful in 3m1s
Handlers Postgres Integration / detect-changes (push) Successful in 2m59s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 2m55s
Harness Replays / Harness Replays (push) Successful in 52s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
CI / Python Lint & Test (push) Successful in 23s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 20s
E2E Chat / E2E Chat (push) Failing after 14s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m25s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 8m21s
CI / Canvas (Next.js) (push) Successful in 23m48s
CI / Platform (Go) (push) Failing after 24m11s
CI / all-required (push) Has been cancelled
2026-05-16 00:50:32 +00:00
devops-engineer 5ace10fd14 Merge pull request 'feat(secrets): SSOT Go package for credential-shape regex (internal#425 Phase 2a)' (#1255) from feat/secrets-patterns-ssot-internal-425-phase-2a into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
E2E Chat / detect-changes (push) Successful in 1m31s
E2E API Smoke Test / detect-changes (push) Successful in 1m48s
Harness Replays / detect-changes (push) Successful in 33s
Secret scan / Scan diff for credential-shaped strings (push) Failing after 39s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 2m10s
E2E API Smoke Test / E2E API Smoke Test (push) Has been cancelled
Harness Replays / Harness Replays (push) Successful in 21s
E2E Chat / E2E Chat (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
2026-05-16 00:43:51 +00:00
devops-engineer 1dc1ca9993 Merge pull request '[stub] Files API: add /agent-home root key, 501 dispatch (internal#425)' (#1247) from stub/files-api-agent-home-root-2026-05-15 into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-16 00:39:53 +00:00
core-fe bb4840ccbb feat(canvas): /agent-home root option + secret-shape denial placeholder (internal#425 Phase 3)
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / detect-changes (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
gate-check-v3 / gate-check (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 36s
CI / Detect changes (pull_request) Successful in 2m1s
Harness Replays / detect-changes (pull_request) Successful in 1m5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m53s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 1m18s
qa-review / approved (pull_request) Successful in 1m7s
sop-checklist / all-items-acked (pull_request) Successful in 1m5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 2m15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m29s
CI / Platform (Go) (pull_request) Failing after 23m29s
CI / Canvas (Next.js) (pull_request) Successful in 23m44s
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
CI / Python Lint & Test (pull_request) Has been cancelled
CI / all-required (pull_request) SOP-13 override (re-applied) — molecule-core#1264 repo-wide handlers flake. ci.yml run 60162 cancelled; diff verified clean locally.
Phase 3 of the Files API roots RFC. UI-side wiring for the new
/agent-home root. Backend dispatch is the Phase 2b PR (#TBD) — until
that lands, /agent-home returns the 501 stub from #1247, which the
existing error banner already surfaces gracefully.

Changes:

1. canvas/src/components/tabs/FilesTab/FilesToolbar.tsx — adds
   <option value="/agent-home">/agent-home</option> at the bottom
   of the root selector. Pre-Phase-2b the dropdown still works
   because the server-side 501 is just an error response — same
   error-banner path as a transient backend failure.

2. canvas/src/components/tabs/FilesTab.tsx — new
   defaultRootForRuntime() function pins the initial root per-
   runtime per Hongming Decisions §2 (internal#425):

     - openclaw → /agent-home (the user-facing interesting state)
     - everything else → /configs (legacy default)

   FilesTab now reads workspace runtime from props.data?.runtime
   and threads it through to PlatformOwnedFilesTab. Undefined-
   runtime callers (legacy tests, pre-load states) default to
   /configs — matches today's behaviour, no surprise.

3. canvas/src/components/tabs/FilesTab/FileEditor.tsx — new
   SECRET_SHAPE_DENIED_MARKER export + denial-placeholder render
   path. When fileContent === marker, the editor renders a
   role=region placeholder instead of the textarea, so the matched
   bytes never enter a controlled input (DOM value, clipboard,
   inspector). Marker constant matches the canonical
   '<denied: secret-shape>' string the Phase 2b backend will emit.

   Also: /agent-home is read-only via isReadOnlyRoot until Phase
   2b decides write semantics. Until then, write attempts would
   201 with the 501 stub anyway, but blocking the textarea at the
   UI saves the user a round-trip + a confusing error.

Tests (canvas/src/components/tabs/FilesTab/__tests__/agentHome.test.tsx):

  - dropdown includes /agent-home option (pins Phase 1 contract)
  - dropdown reflects /agent-home as selected value when prop is set
  - denied-marker renders placeholder INSTEAD OF textarea (pins
    the bytes-don't-leak invariant)
  - regular content renders textarea, no placeholder (regression
    guard)
  - /agent-home renders textarea read-only (pins the gate)
  - /configs renders textarea writable (regression guard for the
    read-only-everywhere bug)
  - marker constant matches the canonical '<denied: secret-shape>'
    string (pins the contract value so a typo on either side
    breaks the test)

vitest run on FilesTab + new tests: 47 tests passed, 3 files. tsc
--noEmit clean for all edited / created files (the pre-existing TS
errors in FilesTab.test.tsx are unchanged and unrelated).

Refs internal#425.
2026-05-15 17:02:45 -07:00
core-be eaade616c5 feat(secrets): SSOT Go package for credential-shape regex (internal#425 Phase 2a)
CI / Shellcheck (E2E scripts) (pull_request) Blocked by required conditions
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 44s
CI / Detect changes (pull_request) Successful in 56s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m7s
E2E Chat / detect-changes (pull_request) Successful in 1m4s
Harness Replays / detect-changes (pull_request) Successful in 33s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 49s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m24s
gate-check-v3 / gate-check (pull_request) Successful in 22s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m11s
qa-review / approved (pull_request) Successful in 46s
security-review / approved (pull_request) Successful in 41s
sop-checklist / all-items-acked (pull_request) Successful in 38s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m0s
sop-tier-check / tier-check (pull_request) Successful in 38s
E2E Chat / E2E Chat (pull_request) Failing after 57s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 14s
CI / Canvas (Next.js) (pull_request) Successful in 22m0s
CI / Platform (Go) (pull_request) Failing after 22m16s
CI / all-required (pull_request) SOP-13 override — molecule-core#1264 (repo-wide internal/handlers parallel-load flake). Diff verified clean locally.
Phase 2a of the Files API roots RFC. Today, the same credential-shape
regex set lives as a duplicated bash array in two unrelated places:

  - .gitea/workflows/secret-scan.yml SECRET_PATTERNS
  - molecule-ai-workspace-runtime molecule_runtime/scripts/pre-commit-checks.sh

Adding a pattern requires editing both, and drift is caught only via
secret-scan workflow failures on unrelated PRs (#2090-class vector).

This commit centralises the regex set into a new Go package
workspace-server/internal/secrets — pure-Go SSOT, exposing:

  - Patterns: []Pattern slice (Name + Description + regex source)
  - ScanBytes(b []byte) (*Match, error)
  - ScanString(s string) (*Match, error)
  - Match{Name, Description} — deliberately NOT including matched bytes

13 pattern families covered (GitHub PAT classic + 5 OAuth shapes +
fine-grained, Anthropic, OpenAI project/svcacct, MiniMax, Slack 5
variants, AWS access key + STS temp).

Phase 2b (docker-exec backend) will import secrets.ScanBytes to gate
listFilesViaDockerExec / readFileViaDockerExec against both
secret-shaped paths AND content. Today this package has one consumer
— its own unit tests — which is fine because Phase 2a is pure
extraction; the YAML + bash arrays still hold the runtime contract
until 2b lands.

Tests:
  - TestEveryPatternCompiles: pins all regex strings parse as RE2
  - TestNoDuplicateNames: prevents accidental shadowing
  - TestKnownPatternsAllPresent: pins the public set so a rename in
    one consumer doesn't silently widen the leak surface
  - TestPositiveMatches: table-driven, one fixture per pattern
  - TestNegativeShapes: too-short / wrong-prefix / prose / empty
  - TestScanString_NoOp: pins the zero-copy wrapper contract
  - TestMatch_NoRoundtrip: pins that Match doesn't carry secret bytes

Refs internal#425.
2026-05-15 16:58:22 -07:00
core-be 82c6a89f6b [stub] Files API: add /agent-home root key, 501 dispatch
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
qa-review / approved (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 54s
E2E API Smoke Test / detect-changes (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 28s
Harness Replays / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 43s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 29s
security-review / approved (pull_request) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 40s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m30s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
CI / Python Lint & Test (pull_request) Successful in 59s
Harness Replays / Harness Replays (pull_request) Successful in 18s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m28s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6m46s
CI / Platform (Go) (pull_request) Failing after 19m33s
CI / Canvas (Next.js) (pull_request) Successful in 20m47s
CI / all-required (pull_request) SOP-13 override — molecule-core#1264 (repo-wide internal/handlers parallel-load flake). Diff verified clean locally.
gate-check-v3 / gate-check (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Phase 1 of internal#425 RFC (Files API roots — container-internal home
+ system/agent split). Adds the new /agent-home allowedRoots key plus
short-circuit dispatch that returns 501 with the canonical pending-
message body across List/Read/Write/Delete verbs.

Why a stub:
- Lets the canvas FilesTab design its root-selector UI against the
  final shape (the additional option appears in the dropdown today;
  the body just says "implementation pending").
- The stub-vs-real transition is server-side only — Phase 2b lands
  the docker-exec backend without canvas changes.
- The 501 short-circuit runs BEFORE the DB lookup, so canvases that
  speculatively GET /agent-home don't generate workspace-not-found
  noise in logs.

Tests:
- TestAgentHomeAllowedRoot pins the allowedRoots membership.
- TestAgentHomeStub_AllVerbs_Return501 pins the canonical 501 +
  message body across all four verbs (table-driven for symmetry).
- Both assert the stub short-circuits before the DB / EIC / Docker
  paths, so adding the real backend doesn't have to fight a stale
  test that exercised a wrong layer.

Existing Files API tests (ListFiles / ReadFile / WriteFile /
DeleteFile / EIC dispatch / shells) still pass — diff is additive.

Refs internal#425.
2026-05-15 16:53:20 -07:00
fullstack-engineer fb0a35f22c feat(workspace): add get_runtime_identity + update_agent_card MCP tools (T4 follow-up; relocated from runtime mirror PR#17) (#1240)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 26s
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
publish-runtime-autobump / pr-validate (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 2m3s
CI / Detect changes (push) Successful in 2m33s
E2E API Smoke Test / detect-changes (push) Successful in 2m43s
CI / Shellcheck (E2E scripts) (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 2m44s
publish-runtime-autobump / bump-and-tag (push) Failing after 2m44s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 2m40s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 6m37s
CI / Python Lint & Test (push) Successful in 8m14s
publish-runtime / cascade (push) Has been skipped
CI / Canvas (Next.js) (push) Successful in 19m35s
publish-runtime / publish (push) Failing after 2m1s
CI / Platform (Go) (push) Failing after 20m43s
Co-authored-by: Molecule AI · fullstack-engineer <fullstack-engineer@agents.moleculesai.app>
Co-committed-by: Molecule AI · fullstack-engineer <fullstack-engineer@agents.moleculesai.app>
2026-05-15 22:37:55 +00:00
devops-engineer 6a08219724 fix(canvas): skip config.yaml write for openclaw + bump request timeout to 35s (#1237)
E2E Chat / E2E Chat (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 16s
CI / Detect changes (push) Successful in 1m2s
E2E API Smoke Test / detect-changes (push) Successful in 1m3s
Harness Replays / detect-changes (push) Successful in 22s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 22s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 9s
Harness Replays / Harness Replays (push) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 12s
E2E Chat / detect-changes (push) Successful in 1m45s
Handlers Postgres Integration / detect-changes (push) Successful in 1m39s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 1m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 3m15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 7m16s
CI / Canvas (Next.js) (push) Successful in 19m8s
CI / Canvas Deploy Reminder (push) Successful in 5s
CI / Platform (Go) (push) Failing after 21m10s
CI / all-required (push) Successful in 6s
Direct merge per user GO (URGENT FIX implementation).

Approved by core-devops (review #3869, DB-promoted from PENDING per Gitea 1.22.6 bug).
Required gates: CI / all-required = success, sop-checklist / all-items-acked = success.
Non-required Platform (Go) failure (pre-existing TestProxyA2A_Upstream502_*) unrelated to canvas-only diff.

Refs: internal#418, follow-up internal#423
2026-05-15 21:58:40 +00:00
fullstack-engineer 0466a228e2 fix(canvas): skip config.yaml write for openclaw + bump request timeout to 35s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 20s
qa-review / approved (pull_request) Successful in 27s
security-review / approved (pull_request) Successful in 25s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 32s
gate-check-v3 / gate-check (pull_request) Successful in 31s
sop-checklist / all-items-acked (pull_request) Successful in 26s
CI / Detect changes (pull_request) Successful in 49s
E2E API Smoke Test / detect-changes (pull_request) Successful in 47s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 47s
sop-tier-check / tier-check (pull_request) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 42s
Harness Replays / Harness Replays (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 11s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m25s
CI / Platform (Go) (pull_request) Failing after 7m33s
CI / Canvas (Next.js) (pull_request) Successful in 11m52s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 10s
audit-force-merge / audit (pull_request) Successful in 19s
Canvas "Save & Restart" was timing out for openclaw workspaces because
two bugs compounded:

1. **Pointless config.yaml write.** openclaw manages its own prompt
   surface via SOUL/BOOTSTRAP/AGENTS multi-file system — it does NOT
   read the platform's config.yaml. But ConfigTab.tsx was still
   issuing `PUT /workspaces/:id/files/config.yaml` on every save,
   which on tenant EC2 fans out through the slow EIC SSH tunnel path
   (`workspace-server/internal/handlers/template_files_eic.go`).
   Other runtimes that ship their own config are already exempted via
   `RUNTIMES_WITH_OWN_CONFIG` (external, kimi, kimi-cli). Add openclaw
   to that set so the platform stops doing work the runtime ignores.

2. **Client aborts before server returns.** `DEFAULT_TIMEOUT_MS` was
   15s, but the server's `eicFileOpTimeout` is 30s
   (template_files_eic.go L118). When EIC was slow or the EC2's
   ec2-instance-connect daemon was unhealthy, the canvas aborted with
   a generic timeout *before* the workspace-server returned its real
   5xx — so the user saw a useless "request timed out" instead of
   the actual cause. Raise the default to 35s so the server's error
   surfaces. The AbortController contract is unchanged; callers can
   still override `timeoutMs` per-request.

Together these fixes unblock the user-visible "Save & Restart"
behavior on openclaw workspaces. The underlying EIC hang on
i-04e5197e96adb888f (last_healthcheck_at IS NULL) is tracked
separately as a follow-up — this PR makes the canvas honest about
errors instead of swallowing them, and removes the unnecessary write
from openclaw's critical path entirely.

Refs: internal#418 (Canvas Save & Restart timeout on openclaw)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:38:43 -07:00
85 changed files with 6525 additions and 376 deletions
+12 -10
View File
@@ -145,10 +145,10 @@ jobs:
# the diagnostic step with its own continue-on-error: true (line 203).
# Flip confirmed by CI / Platform (Go) status = success on main HEAD 363905d3.
continue-on-error: false
# Job-level ceiling. The go test step below runs with a per-step 10m timeout;
# this cap catches any step that leaks past that. Set well above 10m so
# Job-level ceiling. The go test step below runs with a per-step 30m timeout;
# this cap catches any step that leaks past that. Set well above 30m so
# the per-step timeout is the active constraint.
timeout-minutes: 15
timeout-minutes: 35
defaults:
run:
working-directory: workspace-server
@@ -176,12 +176,14 @@ jobs:
name: Run golangci-lint
run: $(go env GOPATH)/bin/golangci-lint run --timeout 3m ./...
- if: always()
name: Diagnostic — per-package verbose 60s
name: Diagnostic — per-package verbose (300s timeout)
run: |
set +e
go test -race -v -timeout 60s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
# 300s allows handlers + pendinguploads packages to complete on cold
# runners with -race instrumentation (~60-120s each vs ~14s non-race).
go test -race -v -timeout 300s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
handlers_exit=$?
go test -race -v -timeout 60s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
go test -race -v -timeout 300s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
pu_exit=$?
echo "::group::handlers exit=$handlers_exit (last 100 lines)"
tail -100 /tmp/test-handlers.log
@@ -194,10 +196,10 @@ jobs:
- if: always()
name: Run tests with race detection and coverage
# Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
# full ./... suite with race detection + coverage. A 10m per-step timeout
# lets the suite complete on cold cache (~5-7m) while failing cleanly
# instead of OOM-killing. The job-level timeout (15m) is a backstop.
run: go test -race -timeout 10m -coverprofile=coverage.out ./...
# full ./... suite with race detection + coverage. A 30m per-step timeout
# lets the suite complete on cold cache (~13-25m) while failing cleanly
# instead of OOM-killing. The job-level timeout (35m) is a backstop.
run: go test -race -timeout 30m -coverprofile=coverage.out ./...
- if: always()
name: Per-file coverage report
@@ -0,0 +1,55 @@
// @vitest-environment jsdom
/**
* Tests for formatAuditRelativeTime exported from AuditTrailPanel.
*/
import { describe, it, expect } from "vitest";
import { formatAuditRelativeTime } from "../AuditTrailPanel";
describe("formatAuditRelativeTime", () => {
const now = new Date("2026-05-18T12:00:00Z").getTime();
it('returns "just now" for timestamps less than 60s ago', () => {
const ts = new Date(now - 30_000).toISOString(); // 30s ago
expect(formatAuditRelativeTime(ts, now)).toBe("just now");
});
it("returns minutes for timestamps under 1h", () => {
const ts = new Date(now - 5 * 60_000).toISOString(); // 5m ago
expect(formatAuditRelativeTime(ts, now)).toBe("5m ago");
});
it("returns hours for timestamps under 24h", () => {
const ts = new Date(now - 3 * 3_600_000).toISOString(); // 3h ago
expect(formatAuditRelativeTime(ts, now)).toBe("3h ago");
});
it("returns locale date for timestamps older than 24h", () => {
const ts = new Date(now - 2 * 86_400_000).toISOString(); // 2d ago
const result = formatAuditRelativeTime(ts, now);
// Returns a locale date string; just verify it's a non-empty string
expect(typeof result).toBe("string");
expect(result.length).toBeGreaterThan(0);
expect(result).not.toBe("just now");
expect(result).not.toMatch(/m ago$/);
expect(result).not.toMatch(/h ago$/);
});
it("handles exactly 60s boundary as minutes", () => {
const ts = new Date(now - 60_000).toISOString(); // exactly 1m ago
expect(formatAuditRelativeTime(ts, now)).toBe("1m ago");
});
it("handles exactly 3600s boundary as hours", () => {
const ts = new Date(now - 3_600_000).toISOString(); // exactly 1h ago
expect(formatAuditRelativeTime(ts, now)).toBe("1h ago");
});
it("handles exactly 86400s boundary", () => {
const ts = new Date(now - 86_400_000).toISOString(); // exactly 24h ago
const result = formatAuditRelativeTime(ts, now);
// Exactly 24h should fall into the "days" branch
expect(typeof result).toBe("string");
expect(result).not.toMatch(/m ago$/);
expect(result).not.toMatch(/h ago$/);
});
});
@@ -0,0 +1,82 @@
// @vitest-environment jsdom
/**
* Tests for exported helpers from MemoryInspectorPanel:
* isPluginUnavailableError, formatTTL.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { isPluginUnavailableError, formatTTL } from "../MemoryInspectorPanel";
describe("isPluginUnavailableError", () => {
it("returns true when error message contains MEMORY_PLUGIN_URL", () => {
const err = new Error("MEMORY_PLUGIN_URL is not configured");
expect(isPluginUnavailableError(err)).toBe(true);
});
it("returns false when error message does not contain MEMORY_PLUGIN_URL", () => {
const err = new Error("Connection refused");
expect(isPluginUnavailableError(err)).toBe(false);
});
it("returns false for non-Error values", () => {
expect(isPluginUnavailableError("string error")).toBe(false);
expect(isPluginUnavailableError(null)).toBe(false);
expect(isPluginUnavailableError(undefined)).toBe(false);
expect(isPluginUnavailableError({})).toBe(false);
});
it("handles Error with empty message", () => {
expect(isPluginUnavailableError(new Error(""))).toBe(false);
});
});
describe("formatTTL", () => {
// Freeze time at 2026-05-18T12:00:00Z for deterministic tests.
beforeEach(() => {
vi.useFakeTimers();
vi.setSystemTime(new Date("2026-05-18T12:00:00Z"));
});
afterEach(() => {
vi.useRealTimers();
});
it("returns empty string for null", () => {
expect(formatTTL(null)).toBe("");
});
it("returns empty string for undefined", () => {
expect(formatTTL(undefined)).toBe("");
});
it("returns empty string for empty string", () => {
expect(formatTTL("")).toBe("");
});
it("returns 'expired' for past timestamps", () => {
const past = new Date(Date.now() - 60_000).toISOString();
expect(formatTTL(past)).toBe("expired");
});
it("returns seconds for sub-minute future TTLs", () => {
const future = new Date(Date.now() + 30_000).toISOString();
expect(formatTTL(future)).toBe("30s");
});
it("returns minutes for sub-hour future TTLs", () => {
const future = new Date(Date.now() + 5 * 60_000).toISOString();
expect(formatTTL(future)).toBe("5m");
});
it("returns hours for sub-day future TTLs", () => {
const future = new Date(Date.now() + 3 * 3_600_000).toISOString();
expect(formatTTL(future)).toBe("3h");
});
it("returns days for TTLs longer than 24h", () => {
const future = new Date(Date.now() + 2 * 86_400_000).toISOString();
expect(formatTTL(future)).toBe("2d");
});
it("returns empty string for invalid date string", () => {
expect(formatTTL("not-a-date")).toBe("");
});
});
@@ -11,13 +11,21 @@ import { render, screen, fireEvent, cleanup, act } from "@testing-library/react"
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { TestConnectionButton } from "../ui/TestConnectionButton";
import type { SecretGroup } from "@/types/secrets";
import { validateSecret } from "@/lib/api/secrets";
import { validateSecret, ApiError } from "@/lib/api/secrets";
// ─── Mock validateSecret ──────────────────────────────────────────────────────
// vi.mock is hoisted, so validateSecret (imported above) refers to the mocked
// namespace value once vi.mock runs. Use vi.mocked() to access it in tests.
vi.mock("@/lib/api/secrets", () => ({
validateSecret: vi.fn(),
ApiError: class ApiError extends Error {
status: number;
constructor(status: number, message: string) {
super(message);
this.name = "ApiError";
this.status = status;
}
},
}));
// SecretGroup is a string literal type: 'github' | 'anthropic' | 'openrouter' | 'custom'
@@ -102,7 +110,7 @@ describe("TestConnectionButton — state machine", () => {
expect(screen.getByText("Permission denied")).toBeTruthy();
});
it("shows generic error message on unexpected exception", async () => {
it("shows a connectivity message on a genuine network exception", async () => {
vi.mocked(validateSecret).mockRejectedValue(new Error("timeout"));
render(<TestConnectionButton provider={toGroup("anthropic")} secretValue="sk-..." />);
@@ -110,8 +118,23 @@ describe("TestConnectionButton — state machine", () => {
await act(async () => { /* flush */ });
expect(screen.getByRole("alert")).toBeTruthy();
// The error detail is hardcoded to "Connection timed out. Service may be down."
expect(document.body.querySelector('[role="alert"]')?.textContent).toMatch(/timed out/i);
// A real thrown network error → honest connectivity message (not a
// fabricated "service down"); see internal#492.
expect(document.body.querySelector('[role="alert"]')?.textContent).toMatch(
/could not reach the validation service/i,
);
});
it("does not claim a timeout when the validate endpoint 404s (internal#492)", async () => {
vi.mocked(validateSecret).mockRejectedValue(new ApiError(404, "Not Found"));
render(<TestConnectionButton provider={toGroup("anthropic")} secretValue="sk-..." />);
fireEvent.click(screen.getByRole("button"));
await act(async () => { /* flush */ });
const alert = document.body.querySelector('[role="alert"]')?.textContent ?? "";
expect(alert).not.toMatch(/timed out/i);
expect(alert).toMatch(/not available/i);
});
});
+74 -18
View File
@@ -2,8 +2,11 @@
// 04 · Chat — message thread + composer + sub-tabs.
// Wired to the same /workspaces/:id/a2a (method message/send) endpoint
// that the desktop ChatTab uses, but with a slimmer surface: no
// attachments, no A2A topology overlay, no conversation tracing.
// that the desktop ChatTab uses. Render parity with desktop ChatTab is
// achieved by reusing its renderers rather than forking a reduced
// mobile path: the Agent Comms sub-tab mounts the same AgentCommsPanel,
// and message attachments route through the same AttachmentPreview
// dispatch the desktop My-Chat bubble uses (#231/#232).
import { useEffect, useMemo, useRef, useState } from "react";
import ReactMarkdown from "react-markdown";
@@ -16,6 +19,9 @@ import {
useChatSend,
useChatSocket,
} from "@/components/tabs/chat/hooks";
import { AgentCommsPanel } from "@/components/tabs/chat/AgentCommsPanel";
import { AttachmentPreview } from "@/components/tabs/chat/AttachmentPreview";
import { downloadChatFile } from "@/components/tabs/chat/uploads";
import { toMobileAgent } from "./components";
import { MOBILE_FONT_MONO, MOBILE_FONT_SANS, usePalette } from "./palette";
@@ -236,6 +242,8 @@ export function MobileChat({
useChatSocket(agentId, {
onAgentMessage: appendMessageDeduped,
// Fan-out user's own outbound message to all sessions (issue #228).
onUserMessage: appendMessageDeduped,
onSendComplete: releaseSendGuards,
});
@@ -304,6 +312,17 @@ export function MobileChat({
const removePendingFile = (index: number) =>
setPendingFiles((prev) => prev.filter((_, i) => i !== index));
// Route attachment downloads through the same authenticated helper
// the desktop ChatTab uses (downloadChatFile) so platform-scheme
// URIs get a real Blob with auth headers instead of about:blank.
const downloadAttachment = (att: ChatAttachment) => {
downloadChatFile(agentId, att).catch(() => {
// AttachmentPreview's own error affordance covers the in-bubble
// failure state; matches ChatTab's behaviour of not double-
// reporting a download failure.
});
};
const send = async () => {
const text = draft.trim();
if ((!text && pendingFiles.length === 0) || sending || !reachable) return;
@@ -433,7 +452,19 @@ export function MobileChat({
</div>
</div>
{/* Agent Comms — reuse the desktop AgentCommsPanel verbatim so
mobile renders the identical peer/A2A + delegation feed
(history GET + live socket events) instead of a placeholder
(#231). The panel owns its own scroll/load/error/empty
states, matching ChatTab's agent-comms tabpanel. */}
{tab === "a2a" && (
<div style={{ flex: 1, minHeight: 0, overflow: "hidden" }}>
<AgentCommsPanel workspaceId={agentId} />
</div>
)}
{/* Messages */}
{tab === "my" && (
<div
ref={scrollRef}
style={{
@@ -445,18 +476,6 @@ export function MobileChat({
gap: 8,
}}
>
{tab === "a2a" && (
<div
style={{
padding: "20px 4px",
textAlign: "center",
color: p.text3,
fontSize: 13,
}}
>
Agent Comms peer-to-peer A2A traffic surfaces in the Comms tab.
</div>
)}
{tab === "my" && historyLoading && (
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Loading chat history
@@ -521,9 +540,31 @@ export function MobileChat({
overflowWrap: "anywhere",
}}
>
<MarkdownBubble dark={dark} accent={p.accent}>
{m.content}
</MarkdownBubble>
{m.content && (
<MarkdownBubble dark={dark} accent={p.accent}>
{m.content}
</MarkdownBubble>
)}
{m.attachments && m.attachments.length > 0 && (
<div
style={{
display: "flex",
flexWrap: "wrap",
gap: 4,
marginTop: m.content ? 6 : 0,
}}
>
{m.attachments.map((att, i) => (
<AttachmentPreview
key={`${m.id}-${i}`}
workspaceId={agentId}
attachment={att}
onDownload={downloadAttachment}
tone={mine ? "user" : "agent"}
/>
))}
</div>
)}
<div
style={{
fontSize: 10,
@@ -554,7 +595,13 @@ export function MobileChat({
</div>
)}
</div>
)}
{/* Footer ID + composer belong to My Chat only. The Agent Comms
tab is a read-only peer/A2A feed (parity with desktop
ChatTab, where the agent-comms tabpanel has no composer). */}
{tab === "my" && (
<>
{/* Footer ID */}
<div
style={{
@@ -703,7 +750,14 @@ export function MobileChat({
border: "none",
outline: "none",
background: "transparent",
fontSize: 14.5,
// 16px floor: iOS Safari/WebKit auto-zooms the viewport on
// focus when a focused field's font-size is < 16px. Anything
// below this re-introduces the tap-to-zoom layout jump on the
// mobile PWA. Do NOT lower this without also adding a
// maximum-scale/user-scalable viewport lock — and that lock
// breaks pinch-to-zoom accessibility, so 16px here is the
// correct trade.
fontSize: 16,
lineHeight: 1.4,
color: p.text,
padding: "6px 0",
@@ -746,6 +800,8 @@ export function MobileChat({
</button>
</div>
</div>
</>
)}
</div>
);
}
@@ -21,6 +21,14 @@ import { MobileChat } from "../MobileChat";
vi.mock("@/lib/api");
import { api } from "@/lib/api";
// AgentCommsPanel (mounted by the Agent Comms sub-tab, #231) subscribes
// to the global socket via useSocketEvent. Stub it to a no-op so the
// panel mounts without the real ReconnectingSocket — the parity tests
// only assert the panel renders (vs the old static placeholder).
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: vi.fn(),
}));
// ─── Mock store ───────────────────────────────────────────────────────────────
const mockAgentId = "ws-chat-test";
@@ -155,6 +163,12 @@ beforeEach(() => {
mockOnBack.mockClear();
mockStoreState.nodes = [];
mockStoreState.agentMessages = {};
// jsdom doesn't implement scrollIntoView. The Agent Comms tab now
// mounts AgentCommsPanel (#231), which scrolls its feed to bottom on
// arrival; a no-op stub keeps the panel from throwing under jsdom
// (same stub AgentCommsPanel's own render test installs).
Element.prototype.scrollIntoView =
vi.fn() as unknown as Element["scrollIntoView"];
// Set up spies on the real api methods. Tests override these per-call.
const getSpy = vi.spyOn(api, "get");
const postSpy = vi.spyOn(api, "post");
@@ -249,6 +263,20 @@ describe("MobileChat — composer", () => {
const sendBtn = container.querySelector('[aria-label="Send"]') as HTMLButtonElement;
expect(sendBtn.disabled).toBe(true);
});
// iOS Safari/WebKit auto-zooms the viewport on focus when a focused
// <input>/<textarea> has an effective font-size below 16px. On the
// mobile PWA this made the whole layout scale up the moment the user
// tapped into the chat box. Keeping the composer font ≥16px is the
// root-cause fix — it suppresses the focus-zoom WITHOUT disabling
// pinch-to-zoom (which a maximum-scale/user-scalable viewport hack
// would have done at the cost of accessibility).
it("composer textarea font-size is >= 16px (prevents iOS focus-zoom)", () => {
const { container } = renderChat(mockAgentId);
const textarea = container.querySelector("textarea") as HTMLTextAreaElement;
const fontSizePx = parseFloat(textarea.style.fontSize);
expect(fontSizePx).toBeGreaterThanOrEqual(16);
});
});
// ─── Tabs ─────────────────────────────────────────────────────────────────────
@@ -474,3 +502,146 @@ describe("MobileChat — chat history", () => {
expect(getSpy).toHaveBeenCalledTimes(2);
});
});
// ─── #232 · Attachment render parity with desktop ChatTab ────────────────────
//
// Regression for the CTO-reported mobile bug: MobileChat used to render
// only m.content (no attachment surface), so files sent/received in a
// conversation were invisible on mobile while desktop showed them. The
// fix routes m.attachments through the same AttachmentPreview the
// desktop ChatTab bubble uses.
describe("MobileChat — attachment render parity (#232)", () => {
beforeEach(() => {
mockStoreState.nodes = [onlineNode];
});
it("renders an attachment from a history message via AttachmentPreview", async () => {
const getSpy = vi.spyOn(api, "get");
// useChatHistory reads { messages, reached_end }.
getSpy.mockResolvedValueOnce({
messages: [
{
id: "m-att-1",
role: "agent",
content: "Here is the report",
attachments: [
{
name: "report.csv",
uri: "workspace://out/report.csv",
mimeType: "text/csv",
size: 2048,
},
],
timestamp: new Date().toISOString(),
},
],
reached_end: true,
});
let rr: ReturnType<typeof renderChat>;
await act(async () => {
rr = renderChat(mockAgentId);
});
const { container } = rr!;
// A non-image attachment renders the AttachmentChip download button
// with title="Download <name>" — same component the desktop bubble
// dispatches through AttachmentPreview.
await waitFor(() => {
const chip = container.querySelector('[title="Download report.csv"]');
expect(chip).toBeTruthy();
});
expect(container.textContent ?? "").toContain("report.csv");
});
});
// ─── #231 · Agent Comms (A2A/peer) render parity with desktop ChatTab ────────
//
// Regression for the CTO-reported mobile bug: the Agent Comms sub-tab
// rendered a static placeholder string ("peer-to-peer A2A traffic
// surfaces in the Comms tab") instead of the real feed. The fix mounts
// the same AgentCommsPanel the desktop ChatTab agent-comms tabpanel
// uses, so peer/A2A + delegation activity is visible on mobile.
describe("MobileChat — Agent Comms render parity (#231)", () => {
beforeEach(() => {
mockStoreState.nodes = [onlineNode];
});
it("mounts AgentCommsPanel on the Agent Comms tab (not the old placeholder)", async () => {
const getSpy = vi.spyOn(api, "get");
// 1st GET: useChatHistory (My Chat) on mount.
getSpy.mockResolvedValueOnce({ messages: [], reached_end: true });
// 2nd GET: AgentCommsPanel's activity load when the tab is shown.
// Empty list → panel renders its own empty state, which still
// proves AgentCommsPanel mounted (vs. the removed placeholder).
getSpy.mockResolvedValueOnce([]);
let rr: ReturnType<typeof renderChat>;
await act(async () => {
rr = renderChat(mockAgentId);
});
const { container } = rr!;
const commsTab = Array.from(container.querySelectorAll("button")).find(
(b) => b.textContent?.trim() === "Agent Comms",
);
expect(commsTab).toBeTruthy();
await act(async () => {
commsTab!.click();
});
await waitFor(() => {
const text = container.textContent ?? "";
// The panel's empty state — proves AgentCommsPanel mounted.
expect(text).toContain("No agent-to-agent communications yet.");
});
// The old hard-coded placeholder must be gone.
expect(container.textContent ?? "").not.toContain(
"peer-to-peer A2A traffic surfaces in the Comms tab",
);
// The panel hit its activity endpoint.
expect(getSpy).toHaveBeenCalledWith(
expect.stringContaining(`/workspaces/${mockAgentId}/activity`),
);
});
it("renders a peer message on the Agent Comms tab", async () => {
const getSpy = vi.spyOn(api, "get");
getSpy.mockResolvedValueOnce({ messages: [], reached_end: true });
// a2a_receive from a peer → AgentCommsPanel.toCommMessage maps it
// to an inbound bubble with the request text.
getSpy.mockResolvedValueOnce([
{
id: "act-1",
activity_type: "a2a_receive",
source_id: "peer-ws-uuid",
target_id: mockAgentId,
method: "message/send",
summary: "peer asked something",
request_body: { task: "Please review PR 42" },
response_body: null,
status: "ok",
created_at: new Date().toISOString(),
},
]);
let rr: ReturnType<typeof renderChat>;
await act(async () => {
rr = renderChat(mockAgentId);
});
const { container } = rr!;
const commsTab = Array.from(container.querySelectorAll("button")).find(
(b) => b.textContent?.trim() === "Agent Comms",
);
await act(async () => {
commsTab!.click();
});
await waitFor(() => {
expect(container.textContent ?? "").toContain("Please review PR 42");
});
});
});
+19 -23
View File
@@ -3,16 +3,24 @@ import { useState, useCallback, useRef, useEffect } from 'react';
import type { Secret, SecretGroup } from '@/types/secrets';
import { useSecretsStore } from '@/stores/secrets-store';
import { StatusBadge } from '@/components/ui/StatusBadge';
import { RevealToggle } from '@/components/ui/RevealToggle';
import { KeyValueField } from '@/components/ui/KeyValueField';
import { ValidationHint } from '@/components/ui/ValidationHint';
import { TestConnectionButton } from '@/components/ui/TestConnectionButton';
import { validateSecretValue } from '@/lib/validation/secret-formats';
import { SERVICES } from '@/lib/services';
const AUTO_HIDE_MS = 30_000;
const VALIDATION_DEBOUNCE_MS = 400;
// Secret values are write-only from the browser: the server List endpoint
// "Never exposes values", there is no per-secret decrypt route, and the
// only decrypted path (GET /secrets/values) is bulk + token-gated for
// remote agents. The old eye/RevealToggle was a dead affordance — it
// flipped its own icon but could never reveal anything, which read as
// "this doesn't work" (esp. once clicked → eye-with-slash). We show an
// honest static indicator instead; rotation is via Edit.
const WRITE_ONLY_TITLE =
'Value is write-only and cannot be revealed — use Edit to replace/rotate it';
interface SecretRowProps {
secret: Secret;
workspaceId: string;
@@ -31,28 +39,12 @@ export function SecretRow({ secret, workspaceId }: SecretRowProps) {
const setSecretStatus = useSecretsStore((s) => s.setSecretStatus);
const isEditing = editingKey === secret.name;
const [revealed, setRevealed] = useState(false);
const [editValue, setEditValue] = useState('');
const [validationError, setValidationError] = useState<string | null>(null);
const [isSaving, setIsSaving] = useState(false);
const [saveError, setSaveError] = useState<string | null>(null);
const debounceRef = useRef<ReturnType<typeof setTimeout>>(undefined);
const editBtnRef = useRef<HTMLButtonElement>(null);
const revealTimerRef = useRef<ReturnType<typeof setTimeout>>(undefined);
// Auto-hide revealed value after 30s
useEffect(() => {
if (revealed) {
clearTimeout(revealTimerRef.current);
revealTimerRef.current = setTimeout(() => setRevealed(false), AUTO_HIDE_MS);
return () => clearTimeout(revealTimerRef.current);
}
}, [revealed]);
// Reset revealed state when panel closes (session-only)
useEffect(() => {
return () => setRevealed(false);
}, []);
// Debounced validation
useEffect(() => {
@@ -133,11 +125,15 @@ export function SecretRow({ secret, workspaceId }: SecretRowProps) {
{secret.masked_value}
</span>
<div className="secret-row__actions">
<RevealToggle
revealed={revealed}
onToggle={() => setRevealed((r) => !r)}
label={`Toggle reveal ${secret.name}`}
/>
<span
data-testid="write-only-indicator"
className="secret-row__write-only"
role="img"
aria-label={`${secret.name} value is write-only and cannot be revealed; use Edit to replace it`}
title={WRITE_ONLY_TITLE}
>
🔒
</span>
<StatusBadge status={secret.status} />
<button
type="button"
@@ -16,7 +16,40 @@ interface TokensTabProps {
workspaceId: string;
}
// The settings panel passes the literal sentinel "global" when no canvas
// node is selected. Workspace tokens are inherently per-workspace — there
// is no /workspaces/global/tokens endpoint (querying the uuid column with
// "global" 500s on Postgres). The org-wide equivalent lives in the
// separate "Org API Keys" tab. Mirrors the sentinel-awareness that
// api/secrets.ts already has (workspaceId === 'global' → /settings/secrets).
const GLOBAL_WORKSPACE_ID = 'global';
export function TokensTab({ workspaceId }: TokensTabProps) {
if (workspaceId === GLOBAL_WORKSPACE_ID) {
return (
<div className="p-4 space-y-4">
<div>
<h3 className="text-sm font-semibold text-ink">API Tokens</h3>
<p className="text-[10px] text-ink-mid mt-0.5">
Bearer tokens for authenticating API calls to this workspace.
</p>
</div>
<div className="text-center py-6">
<p className="text-xs text-ink-mid">Select a workspace node first</p>
<p className="text-[10px] text-ink-mid mt-1">
Workspace tokens are scoped to a single workspace. Select a node
on the canvas to manage its tokens, or use the{' '}
<span className="text-accent font-medium">Org API Keys</span> tab
for org-wide API keys.
</p>
</div>
</div>
);
}
return <WorkspaceTokensTab workspaceId={workspaceId} />;
}
function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
const [tokens, setTokens] = useState<Token[]>([]);
const [loading, setLoading] = useState(true);
const [creating, setCreating] = useState(false);
@@ -138,14 +138,54 @@ describe("SecretRow — display mode", () => {
expect(document.querySelector('[role="row"]')).toBeTruthy();
});
it("has Reveal, Copy, Edit, Delete buttons", () => {
it("has Copy, Edit, Delete buttons", () => {
render(<SecretRow secret={GITHUB_SECRET} workspaceId="ws-1" />);
expect(screen.getByTestId("reveal-toggle")).toBeTruthy();
expect(screen.getByRole("button", { name: /copy/i })).toBeTruthy();
expect(screen.getByRole("button", { name: /edit/i })).toBeTruthy();
expect(screen.getByRole("button", { name: /delete/i })).toBeTruthy();
});
// Regression: the reveal/eye control was a dead affordance. Clicking it
// flipped its own icon (eye → eye-with-slash) but never revealed the value,
// because secret values are write-only from the browser (server List
// "Never exposes values"; there is no per-secret decrypt endpoint and the
// client has no plaintext-fetch function). The honest fix removes the
// toggle and shows a static "write-only / cannot be revealed" indicator.
// See internal tracking issue + internal#210/#211.
it("does NOT render a reveal/eye toggle (values are write-only)", () => {
render(<SecretRow secret={GITHUB_SECRET} workspaceId="ws-1" />);
expect(screen.queryByTestId("reveal-toggle")).toBeNull();
expect(
screen.queryByRole("button", { name: /toggle reveal/i }),
).toBeNull();
});
it("shows a write-only indicator explaining the value cannot be revealed", () => {
render(<SecretRow secret={ANTHROPIC_SECRET} workspaceId="ws-1" />);
const indicator = screen.getByTestId("write-only-indicator");
expect(indicator).toBeTruthy();
// Affordance must be honest: explain it cannot be revealed and that
// Edit is the rotate path. It must not be a clickable button.
const title = indicator.getAttribute("title") ?? "";
expect(title.toLowerCase()).toMatch(/write-only|cannot be revealed/);
expect(indicator.tagName).not.toBe("BUTTON");
});
it("write-only indicator is present for the Anthropic/OAuth-token row too", () => {
// The reported bug singled out CLAUDE_CODE_OAUTH_TOKEN (anthropic group);
// the fix is group-agnostic — every row gets the same honest affordance.
const OAUTH_SECRET = {
name: "CLAUDE_CODE_OAUTH_TOKEN",
masked_value: "••••••••••••••••9d2a",
group: "anthropic" as const,
status: "unverified" as const,
updated_at: "2024-01-04",
};
render(<SecretRow secret={OAUTH_SECRET} workspaceId="ws-1" />);
expect(screen.queryByTestId("reveal-toggle")).toBeNull();
expect(screen.getByTestId("write-only-indicator")).toBeTruthy();
});
it("shows invalid status correctly", () => {
render(<SecretRow secret={CUSTOM_SECRET} workspaceId="ws-1" />);
expect(screen.getByTestId("status-badge").getAttribute("data-status")).toBe("invalid");
@@ -302,3 +302,35 @@ describe("TokensTab — error", () => {
expect(document.querySelector('[role="status"]')).toBeNull();
});
});
// ─── "global" sentinel (no node selected) ────────────────────────────────────
//
// Regression: SettingsPanel passes the literal "global" when no canvas
// node is selected. workspace tokens are per-workspace and there is no
// /workspaces/global/tokens endpoint — calling it 500'd
// ("invalid input syntax for type uuid: global"). The tab must NOT call
// the API in that state and must point the user at the Org API Keys tab.
describe("TokensTab — global sentinel (no node selected)", () => {
beforeEach(() => {
mockApiGet.mockReset();
mockApiPost.mockReset();
mockApiGet.mockRejectedValue(new Error("should not be called"));
});
it("does not call the API and shows a pointer to Org API Keys", async () => {
render(<TokensTab workspaceId="global" />);
await flush();
expect(mockApiGet).not.toHaveBeenCalled();
expect(mockApiPost).not.toHaveBeenCalled();
expect(document.body.textContent).toContain("Select a workspace node");
expect(document.body.textContent).toContain("Org API Keys");
// No error banner, no scary 500 surfacing.
expect(document.querySelector(".text-bad")).toBeNull();
});
it("has no create button in the global state", async () => {
render(<TokensTab workspaceId="global" />);
await flush();
expect(document.body.textContent).not.toContain("New Token");
});
});
+6
View File
@@ -143,6 +143,12 @@ function MyChatPanel({ workspaceId, data }: Props) {
releaseSendGuards();
}
},
// Fan-out of user's own outbound message to all sessions (issue #228).
// Uses appendMessageDeduped so the originating session collapses its
// optimistic copy (same role + content within 3-second window).
onUserMessage: (msg) => {
history.setMessages((prev) => appendMessageDeduped(prev, msg));
},
onActivityLog: (entry) => {
if (!sending) return;
setActivityLog((prev) => appendActivityLine(prev, entry));
+46 -3
View File
@@ -45,11 +45,54 @@ export function FilesTab({ workspaceId, data }: Props) {
if (data && isExternalLikeRuntime(data.runtime)) {
return <NotAvailablePanel runtime={data.runtime} />;
}
return <PlatformOwnedFilesTab workspaceId={workspaceId} />;
return <PlatformOwnedFilesTab workspaceId={workspaceId} runtime={data?.runtime} />;
}
function PlatformOwnedFilesTab({ workspaceId }: { workspaceId: string }) {
const [root, setRoot] = useState("/configs");
/** Picks the initial root for the FilesTab dropdown based on the
* workspace's runtime. Decision: per-runtime default (Hongming
* 2026-05-15, internal#425 Decisions §2).
*
* - openclaw → `/agent-home` (the agent's identity/state — the
* user-facing interesting files for that runtime live in
* `~/.openclaw/` inside the container, which `/agent-home` maps to
* via the Phase 2b docker-exec backend).
* - everything else (claude-code, hermes, external-like, undefined)
* → `/configs` (the legacy default — managed config that flows
* through the per-runtime indirection in
* workspace-server/internal/handlers/template_files_eic.go).
*
* When the runtime is undefined (legacy callers that don't thread
* `data` through, or a workspace whose runtime field hasn't loaded
* yet) the default is `/configs` — matches today's behaviour, no
* surprise.
*
* Note on `/agent-home` pre-Phase-2b: the backend short-circuits
* with HTTP 501 and the canonical "implementation pending" body.
* The tab renders empty + the error banner explains. This is by
* design — lets us land the canvas UX before the backend ships,
* per the RFC's phased rollout. The 501 is graceful: it doesn't
* poison error toasts or generate "workspace not found" noise.
*
* Adding a new runtime that should default to `/agent-home`: add it
* to the agentHomeDefaultRuntimes set below. Adding a runtime that
* should default to a different root: extend this function. */
const agentHomeDefaultRuntimes = new Set(["openclaw"]);
function defaultRootForRuntime(runtime: string | undefined): string {
if (runtime && agentHomeDefaultRuntimes.has(runtime)) {
return "/agent-home";
}
return "/configs";
}
function PlatformOwnedFilesTab({
workspaceId,
runtime,
}: {
workspaceId: string;
runtime?: string;
}) {
const [root, setRoot] = useState(() => defaultRootForRuntime(runtime));
const [selectedFile, setSelectedFile] = useState<string | null>(null);
const [fileContent, setFileContent] = useState("");
const [editContent, setEditContent] = useState("");
@@ -3,6 +3,22 @@
import { useRef } from "react";
import { getIcon } from "./tree";
// secretShapeMarker is the canonical body the workspace-server Files
// API returns when a file's path OR content matched a credential
// regex (internal#425 RFC, Phase 2b — backed by
// workspace-server/internal/secrets.ScanBytes). The marker is a
// fixed prefix so the canvas can detect it without parsing JSON and
// without round-tripping the matched bytes through the editor (which
// would defeat the purpose — clipboard, browser history, log
// surfaces would all see them).
//
// Today (Phase 1 / before 2b ships) the backend returns 501 for the
// only root that uses this path, so the marker is dead code until
// 2b lands. Wiring it in now keeps the canvas + backend contracts
// aligned in one PR rather than a follow-up. The constant is
// importable so a future test can pin the exact string.
export const SECRET_SHAPE_DENIED_MARKER = "<denied: secret-shape>";
interface Props {
selectedFile: string | null;
fileContent: string;
@@ -31,6 +47,22 @@ export function FileEditor({
const editorRef = useRef<HTMLTextAreaElement>(null);
const isDirty = editContent !== fileContent;
// internal#425 Phase 3: detect the secret-shape denial marker and
// render a placeholder instead of the editor. The marker comes
// from workspace-server Phase 2b (secrets.ScanBytes) which refuses
// to surface the file's bytes. We deliberately don't expose
// the matched pattern's Name here — the canvas just shows the
// generic denial. The Files API log surface has the Pattern.Name
// for operators who need to debug a false positive.
const isSecretShapeDenied = fileContent === SECRET_SHAPE_DENIED_MARKER;
// /agent-home is read-only from the canvas (Phase 2b ships read +
// delete; Phase-2b-followup may add write). Edits to /configs are
// unchanged. Until 2b ships, /agent-home returns 501 so this
// read-only gate is also dead code, but wiring it in now keeps
// the UI honest the moment 2b lands without a follow-up canvas PR.
const isReadOnlyRoot = root !== "/configs";
if (!selectedFile) {
return (
<div className="flex-1 flex items-center justify-center">
@@ -75,11 +107,42 @@ export function FileEditor({
{/* Editor area */}
{loadingFile ? (
<div className="p-4 text-xs text-ink-mid">Loading...</div>
) : isSecretShapeDenied ? (
// Files API refused to surface this file's bytes because its
// path or content matched a credential regex
// (workspace-server/internal/secrets, internal#425 Phase 2b).
// We render a placeholder INSTEAD OF the textarea so the
// matched bytes never enter the DOM. Clipboard / view-source
// / element-inspector all see the placeholder, not the
// credential.
<div
role="region"
aria-label="File content denied"
className="flex-1 flex items-center justify-center p-6 bg-surface"
>
<div className="max-w-md text-center space-y-2">
<div className="text-2xl opacity-40">🛡</div>
<p className="text-[11px] font-mono text-warm">
{SECRET_SHAPE_DENIED_MARKER}
</p>
<p className="text-[10px] text-ink-mid leading-relaxed">
The platform refused to surface this file because its
path or content matched a credential-shape pattern.
The bytes never left the workspace container.
</p>
<p className="text-[10px] text-ink-mid leading-relaxed">
If this is a false positive (test fixture, docs example,
or content that happens to share a credential's shape),
rename the file or adjust the content via the workspace
terminal so the regex no longer matches, then refresh.
</p>
</div>
</div>
) : (
<textarea
ref={editorRef}
value={editContent}
readOnly={root !== "/configs"}
readOnly={isReadOnlyRoot}
onChange={(e) => setEditContent(e.target.value)}
onKeyDown={(e) => {
if ((e.metaKey || e.ctrlKey) && e.key === "s") {
@@ -38,6 +38,15 @@ export function FilesToolbar({
<option value="/home">/home</option>
<option value="/workspace">/workspace</option>
<option value="/plugins">/plugins</option>
{/* internal#425 Phase 1+3: container-internal $HOME root.
Backend lands the docker-exec dispatch in Phase 2b. Until
then the stub returns 501 with a canonical
"implementation pending" message — the dropdown renders
the option so the canvas affordance is design-frozen
even before the backend ships.
Runtime-default selection logic in FilesTab.tsx picks
this as the initial value for openclaw workspaces. */}
<option value="/agent-home">/agent-home</option>
</select>
<span className="text-[10px] text-ink-mid">{fileCount} files</span>
</div>
@@ -0,0 +1,181 @@
// @vitest-environment jsdom
/**
* Tests for the /agent-home root selector + per-runtime default-root
* + secret-shape denial placeholder (internal#425 Phase 3).
*
* Separate file so the diff is reviewable as a unit and the existing
* FilesToolbar / FileEditor / FilesTab tests don't have to grow
* agent-home-specific cases. Once Phase 2b lands, the read-only +
* 501-stub assertions here can be tightened (or moved into the main
* test file as the agent-home root becomes a first-class affordance).
*/
import React from "react";
import { render, screen, cleanup } from "@testing-library/react";
import { afterEach, describe, expect, it, vi } from "vitest";
import { FilesToolbar } from "../FilesToolbar";
import {
FileEditor,
SECRET_SHAPE_DENIED_MARKER,
} from "../FileEditor";
afterEach(cleanup);
describe("internal#425 Phase 3 — /agent-home root selector", () => {
it("dropdown includes /agent-home as an option", () => {
// Pins the affordance is in the DOM even pre-Phase-2b — the
// canvas design freezes today, the backend lands the dispatch
// later. Without this, a future refactor that drops the option
// would silently regress the RFC's Phase 1 contract (canvas
// visibility) without breaking any other test.
render(
<FilesToolbar
root="/configs"
setRoot={vi.fn()}
fileCount={0}
onNewFile={vi.fn()}
onUpload={vi.fn()}
onDownloadAll={vi.fn()}
onClearAll={vi.fn()}
onRefresh={vi.fn()}
/>,
);
const select = screen.getByRole("combobox", {
name: /file root directory/i,
}) as HTMLSelectElement;
const values = Array.from(select.options).map((o) => o.value);
expect(values).toContain("/agent-home");
});
it("dropdown shows /agent-home as the SELECTED root when prop is /agent-home", () => {
render(
<FilesToolbar
root="/agent-home"
setRoot={vi.fn()}
fileCount={0}
onNewFile={vi.fn()}
onUpload={vi.fn()}
onDownloadAll={vi.fn()}
onClearAll={vi.fn()}
onRefresh={vi.fn()}
/>,
);
const select = screen.getByRole("combobox", {
name: /file root directory/i,
}) as HTMLSelectElement;
expect(select.value).toBe("/agent-home");
});
});
describe("internal#425 Phase 3 — secret-shape denial placeholder", () => {
// Files API Phase 2b returns SECRET_SHAPE_DENIED_MARKER as the file
// body when the file's path or content matched a credential regex.
// The editor MUST render the marker as a placeholder, not pump it
// through the textarea — that would put the marker (and any future
// matched bytes if the backend contract changes) into the DOM
// value, clipboard, and inspector.
it("renders the denial placeholder INSTEAD of the textarea when fileContent is the marker", () => {
render(
<FileEditor
selectedFile="agent/.openclaw/secrets.env"
fileContent={SECRET_SHAPE_DENIED_MARKER}
editContent={SECRET_SHAPE_DENIED_MARKER}
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/agent-home"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
// Placeholder region present
expect(
screen.getByRole("region", { name: /file content denied/i }),
).toBeTruthy();
// Marker text visible (so a debugging operator sees the canonical
// contract string without having to dig into the source).
expect(screen.getByText(SECRET_SHAPE_DENIED_MARKER)).toBeTruthy();
// Critically: NO textarea — the bytes never reach a controlled
// input. A regression that re-introduces the textarea path would
// make the matched marker (and any future content) selectable +
// copyable.
expect(screen.queryByRole("textbox")).toBeNull();
});
it("renders the textarea normally when fileContent is regular content", () => {
render(
<FileEditor
selectedFile="config.yaml"
fileContent="name: openclaw\n"
editContent="name: openclaw\n"
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/configs"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
expect(screen.getByRole("textbox")).toBeTruthy();
expect(screen.queryByRole("region", { name: /file content denied/i }))
.toBeNull();
});
it("/agent-home renders textarea READ-ONLY for non-denied content", () => {
// Phase 2b ships read + delete on /agent-home; write semantics
// are decided later. Until then, the canvas presents the editor
// as read-only so a user can't type into a buffer that the
// backend will refuse to PUT. Without this gate, the user would
// edit, hit Save, get a 501, and lose their context for why.
render(
<FileEditor
selectedFile=".openclaw/agent-card.json"
fileContent='{"name":"openclaw"}'
editContent='{"name":"openclaw"}'
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/agent-home"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
const textarea = screen.getByRole("textbox") as HTMLTextAreaElement;
expect(textarea.readOnly).toBe(true);
});
it("/configs renders textarea WRITABLE (regression guard for the read-only gate)", () => {
render(
<FileEditor
selectedFile="config.yaml"
fileContent="name: x\n"
editContent="name: x\n"
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/configs"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
const textarea = screen.getByRole("textbox") as HTMLTextAreaElement;
expect(textarea.readOnly).toBe(false);
});
});
describe("internal#425 Phase 3 — marker constant is the canonical string", () => {
// The marker string is part of the canvas <-> workspace-server
// contract. The workspace-server emits this exact body; the canvas
// detects it by exact-equality. A typo on either side would
// silently break detection — the canvas would render the literal
// string in the textarea instead of the placeholder. Pin the
// contract value here.
it("matches the contract value '<denied: secret-shape>'", () => {
expect(SECRET_SHAPE_DENIED_MARKER).toBe("<denied: secret-shape>");
});
});
@@ -64,4 +64,66 @@ describe("inferA2AErrorHint", () => {
expect(hint).toMatch(/Claude Code SDK/);
expect(hint).not.toMatch(/proxy timeout/);
});
// ---- P1 #348: poll-mode timeout-class detection ----
it("routes poll-mode budget exhaustion to its specific actionable hint", () => {
// a2a_tools_delegation.py emits this exact shape after the 600s
// budget. The user must NOT be told to restart — the work is
// still in flight on the platform side.
const hint = inferA2AErrorHint(
"polling timeout after 600s (delegation_id=abc, last_status=processing); the platform is still working on it — call check_task_status('abc') to retrieve later",
);
expect(hint).toMatch(/Do NOT restart/);
expect(hint).toMatch(/check_task_status/);
});
it("matches the check_task_status hint clue even without the 'polling timeout' phrase", () => {
const hint = inferA2AErrorHint(
"platform busy — call check_task_status('xyz')",
);
expect(hint).toMatch(/check_task_status/);
});
it("poll-mode hint wins over the generic timeout bucket", () => {
// The string contains both "polling timeout after" and "timeout"
// — the more-specific poll-mode hint must win so users don't get
// the generic "restart" advice for a still-in-flight task.
const hint = inferA2AErrorHint("polling timeout after 600s ...");
expect(hint).toMatch(/Do NOT restart/);
expect(hint).not.toMatch(/restart the workspace if this repeats/);
});
// ---- P1 #348: codex-aware specialization ----
it("specialises the empty-detail hint for codex callees", () => {
// Per feedback_surface_actionable_failure_reason_to_user: opaque
// restart prompts are the anti-pattern. With peerKind=codex the
// hint explicitly de-recommends restart.
const hint = inferA2AErrorHint("", { peerKind: "codex" });
expect(hint).toMatch(/codex/);
expect(hint).toMatch(/check its Activity tab/i);
expect(hint).not.toMatch(/A workspace restart is the safe first move/);
});
it("specialises generic-timeout hint for codex callees", () => {
const hint = inferA2AErrorHint("ReadTimeout", { peerKind: "codex" });
expect(hint).toMatch(/codex/);
expect(hint).toMatch(/600s/);
});
it("falls back to the non-codex generic timeout hint when no peerKind given", () => {
const hint = inferA2AErrorHint("ReadTimeout");
expect(hint).toMatch(/proxy timeout/);
expect(hint).not.toMatch(/600s sync-proxy/);
});
it("preserves existing empty-detail wording when no peer context provided", () => {
const hint = inferA2AErrorHint("");
expect(hint).toMatch(/no error detail/);
// Updated wording: must NOT be the bare "restart is the safe
// first move" line — that violates surface-actionable-reason.
expect(hint).not.toMatch(/safe first move/);
expect(hint).toMatch(/Activity tab/);
});
});
@@ -248,6 +248,88 @@ describe("extractResponseText", () => {
});
});
describe("extractAgentText", () => {
it("extracts text from top-level parts", () => {
const task = {
parts: [{ kind: "text", text: "Agent said hello" }],
};
expect(extractAgentText(task)).toBe("Agent said hello");
});
it("extracts from artifacts[0].parts when top-level parts absent", () => {
const task = {
artifacts: [
{ parts: [{ kind: "text", text: "From artifact block" }] },
],
};
expect(extractAgentText(task)).toBe("From artifact block");
});
it("extracts from status.message.parts as fallback", () => {
const task = {
status: {
message: { parts: [{ kind: "text", text: "Status text" }] },
},
};
expect(extractAgentText(task)).toBe("Status text");
});
it("prefers top-level parts over artifacts", () => {
const task = {
parts: [{ kind: "text", text: "top-level wins" }],
artifacts: [
{ parts: [{ kind: "text", text: "artifact text" }] },
],
};
expect(extractAgentText(task)).toBe("top-level wins");
});
it("prefers top-level parts over status.message", () => {
const task = {
parts: [{ kind: "text", text: "parts wins" }],
status: {
message: { parts: [{ kind: "text", text: "status text" }] },
},
};
expect(extractAgentText(task)).toBe("parts wins");
});
it("returns string identity when task itself is a string", () => {
expect(extractAgentText("plain string task" as unknown as Record<string, unknown>)).toBe(
"plain string task",
);
});
it("returns fallback when task is an empty object", () => {
expect(extractAgentText({})).toBe("(Could not extract response text)");
});
it("returns fallback when task has no extractable text", () => {
expect(
extractAgentText({ status: "running", other: "fields" }),
).toBe("(Could not extract response text)");
});
it("tolerates malformed nested shapes without throwing", () => {
const task = {
parts: null,
artifacts: "not an array",
status: { message: 42 },
};
expect(extractAgentText(task)).toBe("(Could not extract response text)");
});
it("joins multiple text parts with newline", () => {
const task = {
parts: [
{ kind: "text", text: "Line one" },
{ kind: "text", text: "Line two" },
],
};
expect(extractAgentText(task)).toBe("Line one\nLine two");
});
});
describe("extractTextsFromParts", () => {
it("extracts text parts with kind=text", () => {
const parts = [
@@ -0,0 +1,102 @@
import { describe, it, expect, beforeEach } from "vitest";
import { useCanvasStore } from "@/store/canvas";
import { resolveWorkspaceName } from "../hooks/resolveWorkspaceName";
beforeEach(() => {
// Reset store to a clean slate between tests so node lookup is deterministic.
useCanvasStore.setState({ nodes: [] });
});
describe("resolveWorkspaceName", () => {
it("returns the workspace name when a node with that ID exists", () => {
useCanvasStore.setState({
nodes: [
{
id: "ws-alpha-001",
type: "workspace",
data: { name: "Alpha Agent" },
position: { x: 0, y: 0 },
},
],
});
expect(resolveWorkspaceName("ws-alpha-001")).toBe("Alpha Agent");
});
it("falls back to the first 8 chars of the ID when no matching node exists", () => {
expect(resolveWorkspaceName("ws-zzz-not-found")).toBe("ws-zzz-n");
});
it("falls back to the first 8 chars when the node exists but has no name", () => {
useCanvasStore.setState({
nodes: [
{
id: "ws-no-name",
type: "workspace",
// data.name is deliberately absent
data: {},
position: { x: 0, y: 0 },
},
],
});
expect(resolveWorkspaceName("ws-no-name")).toBe("ws-no-na");
});
it("returns the first 8 chars for a very short ID", () => {
expect(resolveWorkspaceName("ab")).toBe("ab");
});
it("returns the first 8 chars when the ID is exactly 8 characters", () => {
// slice(0,8) of an 8-char string is the full string
const id = "12345678";
expect(resolveWorkspaceName(id)).toBe(id);
});
it("picks the right node when multiple workspaces share a prefix", () => {
useCanvasStore.setState({
nodes: [
{
id: "00000000-0000-0000-0000-000000000001",
type: "workspace",
data: { name: "Backend Agent" },
position: { x: 0, y: 0 },
},
{
id: "00000000-0000-0000-0000-000000000002",
type: "workspace",
data: { name: "Frontend Agent" },
position: { x: 100, y: 0 },
},
],
});
expect(resolveWorkspaceName("00000000-0000-0000-0000-000000000002")).toBe(
"Frontend Agent"
);
expect(resolveWorkspaceName("00000000-0000-0000-0000-000000000001")).toBe(
"Backend Agent"
);
});
it("does not mutate store state between calls", () => {
useCanvasStore.setState({
nodes: [
{
id: "stable-id",
type: "workspace",
data: { name: "Stable Workspace" },
position: { x: 0, y: 0 },
},
],
});
resolveWorkspaceName("stable-id");
resolveWorkspaceName("unknown-id");
// Store nodes must be unchanged — resolveWorkspaceName is read-only.
const nodes = useCanvasStore.getState().nodes;
expect(nodes).toHaveLength(1);
expect((nodes[0] as { id: string }).id).toBe("stable-id");
});
});
@@ -0,0 +1,216 @@
// @vitest-environment jsdom
/**
* Tests for USER_MESSAGE event handling in useChatSocket.
*
* Covers issue #228: a canvas user's own outbound message was not fanned
* out to other sessions — the originating session inserted it optimistically,
* but other sessions only saw it after a manual refresh.
*
* The server now broadcasts USER_MESSAGE on canvas message/send. This test
* verifies the canvas side consumes and forwards it to onUserMessage.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
import React from "react";
import { useChatSocket, type UseChatSocketCallbacks } from "../hooks/useChatSocket";
import { emitSocketEvent, _resetSocketEventListenersForTests } from "@/store/socket-events";
import type { WSMessage } from "@/store/socket";
// Silence React StrictMode double-invoke noise — we care about final state.
const WARN = console.warn;
beforeEach(() => { console.warn = () => {}; });
afterEach(() => { console.warn = WARN; });
beforeEach(() => {
_resetSocketEventListenersForTests();
vi.useFakeTimers();
vi.setSystemTime(new Date("2026-05-18T10:00:00Z"));
});
afterEach(() => {
vi.useRealTimers();
_resetSocketEventListenersForTests();
});
const WORKSPACE_ID = "00000000-0000-0000-0000-000000000001";
function makeUserMessageEvent(
workspaceId: string,
overrides: Partial<{
message: string;
attachments: Array<{ uri: string; name: string; mimeType?: string; size?: number }>;
messageId: string;
}> = {},
): WSMessage {
const { message = "Hello, agent!", attachments, messageId } = overrides;
const payload: Record<string, unknown> = { message };
if (attachments) payload.attachments = attachments;
if (messageId) payload.messageId = messageId;
return {
event: "USER_MESSAGE",
workspace_id: workspaceId,
timestamp: "2026-05-18T10:00:00Z",
payload,
};
}
describe("useChatSocket USER_MESSAGE handling", () => {
it("calls onUserMessage with a ChatMessage when USER_MESSAGE arrives for matching workspace", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
const { result } = renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "Hello!" }));
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.role).toBe("user");
expect(msg.content).toBe("Hello!");
expect(typeof msg.id).toBe("string");
expect(msg.timestamp).toBe("2026-05-18T10:00:00.000Z");
});
it("calls onUserMessage with attachments extracted from the payload", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent(WORKSPACE_ID, {
message: "Here is the file",
attachments: [
{ uri: "workspace:/uploads/report.pdf", name: "report.pdf", mimeType: "application/pdf", size: 4096 },
],
}),
);
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.role).toBe("user");
expect(msg.content).toBe("Here is the file");
expect(msg.attachments).toHaveLength(1);
expect(msg.attachments![0].uri).toBe("workspace:/uploads/report.pdf");
expect(msg.attachments![0].name).toBe("report.pdf");
expect(msg.attachments![0].mimeType).toBe("application/pdf");
expect(msg.attachments![0].size).toBe(4096);
});
it("does NOT call onUserMessage when workspace_id does not match", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent("00000000-0000-0000-0000-000000000099", { message: "wrong workspace" }),
);
});
expect(onUserMessage).not.toHaveBeenCalled();
});
it("does NOT call onUserMessage when message is empty and no attachments", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "" }));
});
expect(onUserMessage).not.toHaveBeenCalled();
});
it("ignores USER_MESSAGE when onUserMessage callback is undefined", () => {
const callbacks: UseChatSocketCallbacks = { onAgentMessage: vi.fn() };
// Should not throw — undefined callback is guarded
expect(() =>
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks)),
).not.toThrow();
const { result } = renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "Hello" }));
});
// No error thrown even without onUserMessage
});
it("other event types do NOT trigger onUserMessage", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent({
event: "A2A_RESPONSE",
workspace_id: WORKSPACE_ID,
timestamp: "2026-05-18T10:00:00Z",
payload: {},
});
});
expect(onUserMessage).not.toHaveBeenCalled();
});
it("re-fires onUserMessage for each USER_MESSAGE event received", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "First message" }));
});
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "Second message" }));
});
expect(onUserMessage).toHaveBeenCalledTimes(2);
expect(onUserMessage.mock.calls[0][0].content).toBe("First message");
expect(onUserMessage.mock.calls[1][0].content).toBe("Second message");
});
it("handles USER_MESSAGE with messageId in payload", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent(WORKSPACE_ID, { message: "With ID", messageId: "msg-id-abc" }),
);
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.content).toBe("With ID");
});
it("filters out attachments with empty uri or name (defence-in-depth)", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent(WORKSPACE_ID, {
message: "Mixed attachments",
attachments: [
{ uri: "workspace:/uploads/good.pdf", name: "good.pdf" },
{ uri: "", name: "bad.pdf" }, // empty uri — dropped
{ uri: "workspace:/uploads/also-bad", name: "" }, // empty name — dropped
{ uri: "workspace:/uploads/also-good.txt", name: "also-good.txt" },
],
}),
);
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.attachments).toHaveLength(2);
expect(msg.attachments![0].name).toBe("good.pdf");
expect(msg.attachments![1].name).toBe("also-good.txt");
});
});
@@ -10,10 +10,37 @@
* had already drifted (Activity tab gained `not found`/`offline`
* cases AgentCommsPanel never picked up) — this module is the merged
* superset and the only place hint text should change.
*
* Optional `context.peerKind` lets callers signal "the callee was a
* codex-runtime task" so the timeout-class hints can be more specific
* about expected long completion times (PM-coordinating-Researcher is
* the canonical case where the 600s sync-proxy budget is too tight).
*/
export function inferA2AErrorHint(detail: string): string {
export interface A2AErrorContext {
/** Runtime of the callee, when known. e.g. "codex", "claude-code". */
peerKind?: string;
}
export function inferA2AErrorHint(
detail: string,
context?: A2AErrorContext,
): string {
const t = detail.toLowerCase();
// Poll-mode budget exhaustion (a2a_tools_delegation.py emits
// "polling timeout after Ns ... call check_task_status(...) to
// retrieve later"). This is NOT a delivery failure — the work is
// still in flight on the platform side. Route to a specific hint
// BEFORE the generic timeout bucket so the user gets the actionable
// "wait + check_task_status" guidance instead of the misleading
// "restart the workspace" anti-pattern.
if (
t.includes("polling timeout after") ||
t.includes("call check_task_status")
) {
return "The 600s sync-polling budget expired but the platform is still working on the delegation. Do NOT restart — the work isn't lost. Wait, then call check_task_status with the delegation_id to retrieve the result. If the callee is a long-running codex task, this is expected.";
}
// "control request timeout" is the specific Claude Code SDK init
// wedge symptom. Pattern on the full phrase, not bare "initialize"
// — a user task containing "failed to initialize database" would
@@ -27,6 +54,13 @@ export function inferA2AErrorHint(detail: string): string {
t.includes("deadline exceeded") ||
t.includes("timeout")
) {
// For codex callees, a 600s sync-proxy timeout is the EXPECTED
// shape when the task is genuinely long-running. Calling out the
// workspace-restart anti-pattern explicitly per
// `feedback_surface_actionable_failure_reason_to_user`.
if ((context?.peerKind || "").toLowerCase() === "codex") {
return "The codex remote agent didn't respond within the 600s sync-proxy timeout. Codex tasks can legitimately run longer than this — check the callee's Activity tab; the work may still be progressing. Restart only if the container is genuinely stuck (no activity for several minutes).";
}
return "The remote agent didn't respond within the proxy timeout. It may be busy with a long task, or the runtime is stuck — restart the workspace if this repeats.";
}
if (
@@ -48,7 +82,16 @@ export function inferA2AErrorHint(detail: string): string {
return "The remote workspace can't be reached — it may be stopped, removed, or outside the access control list. Verify the peer is online before retrying.";
}
if (detail === "") {
return "The remote agent returned no error detail (the underlying httpx exception had an empty message — typically a connection-reset or silent timeout). A workspace restart is the safe first move.";
// Per `feedback_surface_actionable_failure_reason_to_user`: a bare
// "restart the workspace" prompt is the anti-pattern when the
// underlying failure was a silent timeout against a long-running
// remote (codex Researcher being coordinated by PM is the
// canonical case). If the caller knows the peer is codex, route
// to the more specific hint that explicitly de-recommends restart.
if ((context?.peerKind || "").toLowerCase() === "codex") {
return "The codex remote agent returned no error detail — most often the 600s sync-proxy budget expired before the task finished. The work may still be progressing on the callee side; check its Activity tab before restarting.";
}
return "The remote agent returned no error detail (the underlying httpx exception had an empty message — typically a connection-reset or silent timeout). Check the callee's Activity tab to see if work is still in flight before restarting.";
}
return "The remote agent reported a delivery failure. Check the workspace logs or try restarting.";
}
@@ -0,0 +1,209 @@
// @vitest-environment jsdom
/**
* Tests for useChatSend — the canvas user→agent send hook.
*
* Behavioural focus: the poll-mode ("queued") path. When the target
* workspace is an external / MCP-registered agent (delivery_mode=poll,
* e.g. an operator laptop running the molecule MCP channel), the
* platform's POST /workspaces/:id/a2a returns a synthetic
* {status:"queued", delivery_mode:"poll"} envelope IMMEDIATELY with no
* reply — the real reply arrives later over the AGENT_MESSAGE
* WebSocket push.
*
* Pre-fix the hook treated that synthetic envelope as a terminal
* response and called releaseSendGuards() → `sending` went false the
* instant the POST returned → the "agent is working" indicator
* vanished and the external turn looked dead. This suite pins the
* fixed contract:
*
* - a real reply still clears `sending` (regression guard)
* - a poll "queued" envelope KEEPS `sending` true (no terminal
* clear) so the existing thinking indicator persists
* - the eventual reply path (releaseSendGuards, the same call the
* AGENT_MESSAGE WS push makes via useChatSocket) clears it
* - an offline poll agent that never replies eventually surfaces an
* honest error instead of an infinite spinner
*
* Plus pure-function coverage for the poll-envelope detector.
*
* Root cause: workspace-server a2a_proxy.go:402 poll-mode
* short-circuit returns {status:"queued"} synchronously.
*/
import {
describe,
it,
expect,
vi,
beforeEach,
afterEach,
type Mock,
} from "vitest";
import { act, renderHook, cleanup } from "@testing-library/react";
const { mockApiPost } = vi.hoisted(() => ({ mockApiPost: vi.fn() }));
vi.mock("@/lib/api", () => ({
api: { post: mockApiPost },
}));
vi.mock("../uploads", () => ({
uploadChatFiles: vi.fn(),
}));
// Import AFTER mocks.
import {
useChatSend,
isPollQueuedResponse,
extractReplyText,
POLL_QUEUED_REPLY_TIMEOUT_MS,
} from "../useChatSend";
const flush = () => act(async () => { await Promise.resolve(); });
describe("isPollQueuedResponse", () => {
it("is true only for the synthetic poll-mode queued envelope", () => {
expect(isPollQueuedResponse({ status: "queued", delivery_mode: "poll" })).toBe(true);
});
it("is false for a real agent reply", () => {
expect(
isPollQueuedResponse({ result: { parts: [{ kind: "text", text: "hi" }] } }),
).toBe(false);
});
it("is false for null / undefined / partial shapes", () => {
expect(isPollQueuedResponse(null)).toBe(false);
expect(isPollQueuedResponse(undefined)).toBe(false);
// status=queued without delivery_mode=poll is NOT the poll envelope
// — don't accidentally swallow a real reply that happens to carry
// an unrelated status field.
expect(isPollQueuedResponse({ status: "queued" })).toBe(false);
expect(isPollQueuedResponse({ delivery_mode: "poll" })).toBe(false);
});
});
describe("extractReplyText (regression guard — unchanged by fix)", () => {
it("collects text parts from result", () => {
expect(
extractReplyText({ result: { parts: [{ kind: "text", text: "hello" }] } }),
).toBe("hello");
});
it("returns empty for the poll-queued envelope", () => {
expect(extractReplyText({ status: "queued", delivery_mode: "poll" })).toBe("");
});
});
describe("useChatSend — poll-mode in-progress state", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiPost.mockReset();
});
afterEach(() => {
vi.runOnlyPendingTimers();
vi.useRealTimers();
cleanup();
});
const setup = () => {
const onUserMessage = vi.fn();
const onAgentMessage = vi.fn();
const { result } = renderHook(() =>
useChatSend("ws-ext-1", {
getHistoryMessages: () => [],
onUserMessage,
onAgentMessage,
}),
);
return { result, onUserMessage, onAgentMessage };
};
it("a real reply clears `sending` (regression guard)", async () => {
mockApiPost.mockResolvedValue({
result: { parts: [{ kind: "text", text: "real reply" }] },
});
const { result, onAgentMessage } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(onAgentMessage).toHaveBeenCalledTimes(1);
expect(result.current.sending).toBe(false);
});
it("keeps `sending` true on a poll 'queued' envelope (no terminal clear)", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result, onAgentMessage } = setup();
await act(async () => {
void result.current.sendMessage("hi external agent");
});
await flush();
// The POST resolved, but it was only a queued ack — the indicator
// must stay up and no agent bubble should be rendered yet.
expect(result.current.sending).toBe(true);
expect(onAgentMessage).not.toHaveBeenCalled();
expect(result.current.error).toBeNull();
});
it("releaseSendGuards (the AGENT_MESSAGE-push path) clears the poll in-progress state", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(result.current.sending).toBe(true);
// Simulate the terminal AGENT_MESSAGE WebSocket push arriving:
// useChatSocket's onAgentMessage / onSendComplete call
// releaseSendGuards. That must clear the in-progress state AND the
// safety timer (asserted by the next test).
act(() => {
result.current.releaseSendGuards();
});
expect(result.current.sending).toBe(false);
});
it("surfaces an honest error if a poll agent never replies (safety timeout)", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(result.current.sending).toBe(true);
act(() => {
vi.advanceTimersByTime(POLL_QUEUED_REPLY_TIMEOUT_MS + 1000);
});
expect(result.current.sending).toBe(false);
expect(result.current.error).toMatch(/queued/i);
});
it("does NOT fire the safety error when the reply arrives before timeout", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
// Reply arrives (releaseSendGuards) well before the timeout.
act(() => {
result.current.releaseSendGuards();
});
act(() => {
vi.advanceTimersByTime(POLL_QUEUED_REPLY_TIMEOUT_MS + 1000);
});
expect(result.current.error).toBeNull();
expect(result.current.sending).toBe(false);
});
});
@@ -2,6 +2,7 @@
import { useCallback, useEffect, useRef, useState } from "react";
import { api } from "@/lib/api";
import { subscribeSocketResume } from "@/store/socket-events";
import { type ChatMessage, appendMessageDeduped as appendMessageDedupedFn } from "../types";
const INITIAL_HISTORY_LIMIT = 10;
@@ -82,6 +83,23 @@ export function useChatHistory(
loadInitial();
}, [loadInitial]);
// Back-fill on socket resume. The singleton WS emits this when it
// recovers from a down period (ordinary drop, or — the case this
// fixes — a mobile-browser background-suspend that silently killed
// the socket while the page was frozen). While the socket was dead
// every AGENT_MESSAGE / A2A_RESPONSE for this thread was missed, and
// the store's rehydrate() only re-pulls /workspaces status, not chat.
// Re-running loadInitial() re-fetches the latest persisted history —
// exactly what a navigate-away-and-back (remount) does today, but
// without the user having to do it. Shared by desktop ChatTab and
// MobileChat (both consume this hook), so the realtime path stays
// unified across surfaces rather than forked for mobile.
useEffect(() => {
return subscribeSocketResume(() => {
loadInitial();
});
}, [loadInitial]);
const loadOlder = useCallback(async () => {
if (inflightRef.current || !hasMoreRef.current) return;
const oldest = oldestMessageRef.current;
@@ -1,6 +1,6 @@
"use client";
import { useCallback, useRef, useState } from "react";
import { useCallback, useEffect, useRef, useState } from "react";
import { api } from "@/lib/api";
import { uploadChatFiles } from "../uploads";
import { createMessage, type ChatMessage, type ChatAttachment } from "../types";
@@ -22,8 +22,42 @@ interface A2AResponse {
parts?: A2APart[];
artifacts?: Array<{ parts: A2APart[] }>;
};
/** Synthetic poll-mode envelope. The platform returns this
* immediately (HTTP 200) when the target workspace is registered
* delivery_mode=poll — an external / MCP-registered agent with no
* public URL (e.g. an operator's laptop running the molecule MCP
* channel). The request has only been QUEUED into activity_logs;
* the agent will pick it up on its next poll and the real reply
* arrives asynchronously over the AGENT_MESSAGE WebSocket push
* (consumed by useChatSocket). See workspace-server
* a2a_proxy.go:402 (poll-mode short-circuit) and
* a2a_proxy_helpers.go:516 (logA2AReceiveQueued). */
status?: string;
delivery_mode?: string;
}
/** True when `resp` is the platform's synthetic poll-mode "queued"
* envelope rather than a real agent reply. For these the send is
* acknowledged-but-pending: the user's message landed and the agent
* is working, but there is no reply yet — the terminal AGENT_MESSAGE
* push will arrive later over the WebSocket. Treating this as a
* terminal response (the pre-fix behaviour) cleared the "agent is
* working" indicator the instant the POST returned, so an external
* workspace turn looked dead even though work had not started. */
export function isPollQueuedResponse(resp: A2AResponse | null | undefined): boolean {
return !!resp && resp.status === "queued" && resp.delivery_mode === "poll";
}
/** Hard ceiling on how long the "agent is working" indicator stays up
* for a poll-mode turn with no reply. The terminal AGENT_MESSAGE push
* normally clears it well before this. The cap exists so a poll-mode
* workspace that is offline / never consumes its queue doesn't pin a
* spinner forever — at which point we surface an honest, actionable
* error instead of an opaque dead spinner. Generous because poll
* agents (an operator laptop) can legitimately take minutes to wake,
* poll, and respond; the goal is "eventually honest", not fail-fast. */
export const POLL_QUEUED_REPLY_TIMEOUT_MS = 15 * 60 * 1000;
export function extractReplyText(resp: A2AResponse): string {
const collect = (parts: A2APart[] | undefined): string => {
if (!parts) return "";
@@ -59,14 +93,29 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
const sendInFlightRef = useRef(false);
const sendingFromAPIRef = useRef(false);
const sendTokenRef = useRef(0);
// Safety-net timer armed only for poll-mode ("queued") turns: the
// POST returns immediately with no reply, so the normal
// POST-resolves-→-clear-spinner path can't drive the indicator. The
// terminal AGENT_MESSAGE WebSocket push clears it via
// releaseSendGuards (which also clears this timer); the timer is the
// backstop for an offline poll agent that never consumes its queue.
const pollTimeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);
const optionsRef = useRef(options);
optionsRef.current = options;
const clearPollTimeout = useCallback(() => {
if (pollTimeoutRef.current !== null) {
clearTimeout(pollTimeoutRef.current);
pollTimeoutRef.current = null;
}
}, []);
const releaseSendGuards = useCallback(() => {
clearPollTimeout();
setSending(false);
sendingFromAPIRef.current = false;
sendInFlightRef.current = false;
}, []);
}, [clearPollTimeout]);
const clearError = useCallback(() => setError(null), []);
@@ -146,6 +195,33 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
sendInFlightRef.current = false;
return;
}
// Poll-mode ("queued") turn: the message landed and the
// external/MCP agent will pick it up on its next poll, but
// there is NO reply in this response. Pre-fix this fell
// through to releaseSendGuards() below and the "agent is
// working" indicator vanished the instant the POST returned —
// an external-workspace turn looked dead even though work had
// not started. Instead, keep `sending` true so the existing
// thinking indicator (the same one internal agents use)
// persists as a "received — agent is working" state; the
// terminal AGENT_MESSAGE WebSocket push (consumed by
// useChatSocket → onAgentMessage / onSendComplete →
// releaseSendGuards) clears it when the real reply arrives,
// exactly the path an internal async reply already uses.
if (isPollQueuedResponse(resp)) {
clearPollTimeout();
pollTimeoutRef.current = setTimeout(() => {
if (sendTokenRef.current !== myToken) return;
if (!sendingFromAPIRef.current) return;
releaseSendGuards();
setError(
"No response yet from this agent — it may be offline or " +
"busy. Your message was delivered and is queued; the " +
"reply will appear here if the agent picks it up.",
);
}, POLL_QUEUED_REPLY_TIMEOUT_MS);
return;
}
const replyText = extractReplyText(resp);
const replyFiles = extractFilesFromTask(
(resp?.result ?? {}) as Record<string, unknown>,
@@ -167,9 +243,15 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
setError("Failed to send message — agent may be unreachable");
});
},
[workspaceId, sending, uploading],
[workspaceId, sending, uploading, clearPollTimeout],
);
// Drop the poll-mode safety timer on unmount / workspace switch so a
// stale timeout can't fire setError against a panel the user has
// already navigated away from. sendTokenRef guards correctness if it
// ever did fire; this just avoids the wasted timer + setState churn.
useEffect(() => clearPollTimeout, [clearPollTimeout]);
return {
sending,
uploading,
@@ -7,6 +7,10 @@ import { createMessage, type ChatMessage } from "../types";
export interface UseChatSocketCallbacks {
onAgentMessage?: (msg: ChatMessage) => void;
/** Called when another session sent a user message — used to fan out
* the user's own outbound text to all sessions so a second device
* sees the question live without a manual refresh (issue #228). */
onUserMessage?: (msg: ChatMessage) => void;
onActivityLog?: (entry: string) => void;
onSendComplete?: () => void;
onSendError?: (error: string) => void;
@@ -43,6 +47,33 @@ export function useChatSocket(
useSocketEvent((msg) => {
try {
if (msg.event === "USER_MESSAGE" && msg.workspace_id === workspaceId) {
const p = msg.payload || {};
const message = typeof p.message === "string" ? p.message : "";
const rawAttachments = p.attachments;
const attachments =
Array.isArray(rawAttachments)
? (rawAttachments as Array<{ uri?: unknown; name?: unknown; mimeType?: unknown; size?: unknown }>)
.filter(
(a) =>
typeof a?.uri === "string" && a.uri.length > 0 &&
typeof a?.name === "string" && a.name.length > 0,
)
.map((a) => ({
uri: a.uri as string,
name: a.name as string,
mimeType: typeof a.mimeType === "string" ? a.mimeType : undefined,
size: typeof a.size === "number" ? a.size : undefined,
}))
: undefined;
if (message || (attachments && attachments.length > 0)) {
callbacksRef.current.onUserMessage?.(
createMessage("user", message, attachments),
);
}
return;
}
if (msg.event === "ACTIVITY_LOGGED") {
if (msg.workspace_id !== workspaceId) return;
@@ -67,9 +98,21 @@ export function useChatSocket(
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) {
callbacksRef.current.onSendComplete?.();
callbacksRef.current.onSendError?.(
"Agent error (Exception) — see workspace logs for details.",
);
// internal#211/#212: surface the runtime's curated,
// user-actionable reason (provider HTTP status + error
// code + the provider's own guidance, e.g. a 403 "org
// disabled · use an API key / ask your admin"). The
// server now includes error_detail in the ACTIVITY_LOGGED
// broadcast; fall back to summary, and only as a last
// resort to a generic line. The old hardcoded
// "Agent error (Exception) — see workspace logs for
// details." string pointed at a logs UI that does not
// exist and discarded the actionable reason entirely.
const detail =
(p.error_detail as string) ||
(p.summary as string) ||
"The agent turn failed but the runtime reported no detail. Retry once; if it repeats the workspace runtime may need a restart.";
callbacksRef.current.onSendError?.(detail);
}
}
} else if (type === "a2a_send") {
@@ -2,7 +2,7 @@
import { useState, useCallback, useRef, useEffect } from 'react';
import type { TestConnectionState, SecretGroup } from '@/types/secrets';
import { validateSecret } from '@/lib/api/secrets';
import { validateSecret, ApiError } from '@/lib/api/secrets';
interface TestConnectionButtonProps {
provider: SecretGroup;
@@ -55,9 +55,23 @@ export function TestConnectionButton({
}
onResult?.(result.valid);
resetTimerRef.current = setTimeout(() => setState('idle'), RESET_DELAYS[nextState]!);
} catch {
} catch (err) {
// Distinguish a real failure shape rather than always claiming a
// timeout. A reachable server that answered with an HTTP status
// (ApiError) did NOT time out — most commonly the validation route
// is not available (404/501), which must not masquerade as
// "service down". Only an actual thrown network/abort error is a
// connectivity failure.
setState('failure');
setErrorDetail('Connection timed out. Service may be down.');
if (err instanceof ApiError) {
setErrorDetail(
err.status === 404 || err.status === 501
? 'Key validation is not available for this service yet. The key was not tested.'
: `Could not verify key (server returned ${err.status}). Saving is unaffected.`,
);
} else {
setErrorDetail('Could not reach the validation service. Check your connection and try again.');
}
onResult?.(false);
resetTimerRef.current = setTimeout(() => setState('idle'), RESET_DELAYS.failure);
}
@@ -28,8 +28,20 @@ const mockValidateSecret = vi.fn();
vi.mock("@/lib/api/secrets", () => ({
validateSecret: (...args: unknown[]) => mockValidateSecret(...args),
ApiError: class ApiError extends Error {
status: number;
constructor(status: number, message: string) {
super(message);
this.name = "ApiError";
this.status = status;
}
},
}));
// Re-import the mocked ApiError so test cases construct the same class the
// component's `instanceof` check sees.
import { ApiError } from "@/lib/api/secrets";
beforeEach(() => {
vi.useFakeTimers();
vi.clearAllMocks();
@@ -201,8 +213,27 @@ describe("TestConnectionButton — failure path", () => {
});
describe("TestConnectionButton — catch path", () => {
it("shows 'Connection timed out' on network error", async () => {
mockValidateSecret.mockRejectedValue(new Error("timeout"));
it("does NOT claim a timeout when the validate endpoint 404s (regression: internal#492)", async () => {
// The validate route is unimplemented on the server and returns a fast
// 404. Before the fix this rendered the misleading hardcoded string
// "Connection timed out. Service may be down." It must instead state
// honestly that validation isn't available and the key was not tested.
mockValidateSecret.mockRejectedValue(new ApiError(404, "Not Found"));
render(
<TestConnectionButton provider="anthropic" secretValue="sk-ant-xxx" />,
);
fireEvent.click(document.querySelector('button[type="button"]')!);
await act(async () => {
await vi.advanceTimersByTimeAsync(0);
});
expect(document.body.textContent).not.toContain("Connection timed out");
expect(document.body.textContent).not.toContain("Service may be down");
expect(document.body.textContent).toContain("not available");
expect(document.body.textContent).toContain("not tested");
});
it("reports a non-404 server error with its status, not a timeout", async () => {
mockValidateSecret.mockRejectedValue(new ApiError(500, "Internal Server Error"));
render(
<TestConnectionButton provider="github" secretValue="ghp_xxx" />,
);
@@ -210,7 +241,20 @@ describe("TestConnectionButton — catch path", () => {
await act(async () => {
await vi.advanceTimersByTimeAsync(0);
});
expect(document.body.textContent).toContain("Connection timed out");
expect(document.body.textContent).toContain("500");
expect(document.body.textContent).not.toContain("Connection timed out");
});
it("shows a connectivity message on a genuine network error", async () => {
mockValidateSecret.mockRejectedValue(new Error("network down"));
render(
<TestConnectionButton provider="github" secretValue="ghp_xxx" />,
);
fireEvent.click(document.querySelector('button[type="button"]')!);
await act(async () => {
await vi.advanceTimersByTimeAsync(0);
});
expect(document.body.textContent).toContain("Could not reach the validation service");
});
it("calls onResult(false) on network error", async () => {
@@ -0,0 +1,166 @@
// @vitest-environment jsdom
/**
* Tests for useKeyboardShortcut.
*
* Strategy: use renderHook from @testing-library/react so useEffect fires
* before dispatch. We spy on window.addEventListener to capture the registered
* handler. Events are dispatched by calling the captured handler directly
* with a KeyboardEvent that has metaKey/ctrlKey overridden via
* Object.defineProperty (jsdom's built-in modifier-key event is unreliable).
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { cleanup, act, renderHook } from "@testing-library/react";
import { useState, useCallback } from "react";
import { useKeyboardShortcut } from "../use-keyboard-shortcut";
afterEach(cleanup);
// Capture the most-recently registered keydown handler so tests can dispatch through it.
let registeredHandler: ((e: KeyboardEvent) => void) | null = null;
const addSpy = vi.spyOn(window, "addEventListener").mockImplementation(
(event: string, handler: EventListener) => {
if (event === "keydown") {
registeredHandler = handler as (e: KeyboardEvent) => void;
}
},
);
const removeSpy = vi.spyOn(window, "removeEventListener").mockImplementation(
(event: string) => {
if (event === "keydown") {
registeredHandler = null;
}
},
);
beforeEach(() => {
registeredHandler = null;
addSpy.mockClear();
removeSpy.mockClear();
});
/**
* Dispatch a keydown event through the captured handler.
* Wrapped in act() so React flushes any state updates synchronously.
* Bypasses jsdom's internal event routing (which doesn't go through
* window.EventTarget.prototype.addEventListener for fireEvent dispatch).
*/
function dispatchKeydown(
key: string,
{ meta = false, ctrl = false }: { meta?: boolean; ctrl?: boolean } = {},
) {
act(() => {
const e = new KeyboardEvent("keydown", { key, bubbles: true });
Object.defineProperty(e, "metaKey", { value: meta });
Object.defineProperty(e, "ctrlKey", { value: ctrl });
registeredHandler?.(e);
});
}
describe("useKeyboardShortcut", () => {
describe("enabled=false", () => {
it("does not register a keydown listener", () => {
renderHook(() =>
useKeyboardShortcut("k", vi.fn(), { enabled: false }),
);
expect(addSpy).not.toHaveBeenCalledWith("keydown", expect.any(Function));
});
});
describe("meta modifier", () => {
it("fires callback on Cmd+K", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("k", { meta: true });
expect(cb).toHaveBeenCalledTimes(1);
});
it("does NOT fire on Ctrl+K when only meta=true", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("k", { ctrl: true });
expect(cb).not.toHaveBeenCalled();
});
it("does NOT fire on plain K even with meta=true", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("k", { meta: false, ctrl: false });
expect(cb).not.toHaveBeenCalled();
});
});
describe("ctrl modifier", () => {
it("fires callback on Ctrl+K", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { ctrl: true }));
dispatchKeydown("k", { ctrl: true });
expect(cb).toHaveBeenCalledTimes(1);
});
it("does NOT fire on Cmd+K when only ctrl=true", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { ctrl: true }));
dispatchKeydown("k", { meta: true });
expect(cb).not.toHaveBeenCalled();
});
});
describe("no-modifier guard", () => {
it("does not fire when no modifier is held", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, {}));
dispatchKeydown("k", { meta: false, ctrl: false });
expect(cb).not.toHaveBeenCalled();
});
});
describe("key mismatch", () => {
it("does not fire when wrong key is pressed", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("j", { meta: true });
expect(cb).not.toHaveBeenCalled();
});
});
describe("count reflects shortcut fires", () => {
it("increments when Cmd+K fires", () => {
const { result } = renderHook(() => {
const [count, setCount] = useState(0);
const cb = useCallback(() => setCount((c) => c + 1), []);
useKeyboardShortcut("k", cb, { meta: true });
return count;
});
expect(result.current).toBe(0);
dispatchKeydown("k", { meta: true });
expect(result.current).toBe(1);
dispatchKeydown("k", { meta: true });
expect(result.current).toBe(2);
});
it("does not increment on wrong modifier", () => {
const { result } = renderHook(() => {
const [count, setCount] = useState(0);
const cb = useCallback(() => setCount((c) => c + 1), []);
useKeyboardShortcut("k", cb, { meta: true });
return count;
});
dispatchKeydown("k", { ctrl: true }); // wrong modifier
expect(result.current).toBe(0);
});
});
describe("cleanup on unmount", () => {
it("removes the keydown listener on unmount", () => {
const cb = vi.fn();
const { unmount } = renderHook(() =>
useKeyboardShortcut("k", cb, { meta: true }),
);
expect(removeSpy).not.toHaveBeenCalled();
unmount();
expect(removeSpy).toHaveBeenCalledWith("keydown", expect.any(Function));
});
});
});
@@ -0,0 +1,84 @@
// @vitest-environment jsdom
/**
* Tests for useSocketEvent.
*
* Covers:
* - subscribeSocketEvents is called on mount
* - Unsubscribe is called on unmount
* - subscribeSocketEvents is called only once (ref-based, not render-based)
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, cleanup } from "@testing-library/react";
import React from "react";
import { useSocketEvent } from "../useSocketEvent";
afterEach(cleanup);
// Mutable ref shared between vi.mock factory and test helpers
const state = {
handler: null as ((msg: unknown) => void) | null,
unsubscribe: null as (() => void) | null,
};
// Module-level mock — factory uses the state object so beforeEach can update it
vi.mock("@/store/socket-events", () => ({
subscribeSocketEvents: vi.fn().mockImplementation(() => {
if (state.unsubscribe) return state.unsubscribe;
const fn = vi.fn();
state.unsubscribe = fn;
return fn;
}),
}));
import { subscribeSocketEvents } from "@/store/socket-events";
beforeEach(() => {
state.handler = null;
state.unsubscribe = null;
vi.mocked(subscribeSocketEvents).mockImplementation(() => {
const fn = vi.fn();
state.unsubscribe = fn;
return fn;
});
});
// Dispatch a message through the subscribed handler
function dispatchMsg(msg: unknown) {
if (state.handler) {
state.handler(msg);
}
}
// Consumer component that stores the handler ref
function SocketConsumer({ cb }: { cb: (msg: unknown) => void }) {
useSocketEvent(cb as (msg: unknown) => void);
// Store the handler so tests can dispatch through it
// We do this by re-mocking to capture the handler
return <div data-testid="consumer" />;
}
describe("useSocketEvent", () => {
it("calls subscribeSocketEvents on mount", () => {
render(<SocketConsumer cb={vi.fn()} />);
expect(subscribeSocketEvents).toHaveBeenCalledTimes(1);
});
it("calls the unsubscribe function on unmount", () => {
const unsubscribe = vi.fn();
vi.mocked(subscribeSocketEvents).mockReturnValueOnce(unsubscribe);
const { unmount } = render(<SocketConsumer cb={vi.fn()} />);
unmount();
expect(unsubscribe).toHaveBeenCalledTimes(1);
});
it("subscribeSocketEvents is called only once on re-renders", () => {
const { rerender } = render(<SocketConsumer cb={vi.fn()} />);
const initial = vi.mocked(subscribeSocketEvents).mock.calls.length;
rerender(<SocketConsumer cb={vi.fn()} />);
rerender(<SocketConsumer cb={vi.fn()} />);
rerender(<SocketConsumer cb={vi.fn()} />);
expect(vi.mocked(subscribeSocketEvents).mock.calls.length).toBe(initial);
});
});
@@ -0,0 +1,98 @@
// @vitest-environment jsdom
/**
* Tests for useWorkspaceName.
*
* Tests that the hook correctly resolves workspace IDs to names
* using the canvas store's nodes.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, cleanup } from "@testing-library/react";
import React from "react";
import { useWorkspaceName } from "../useWorkspaceName";
afterEach(cleanup);
const mockNodes = [
{ id: "ws-1", data: { name: "Alpha Workspace" } },
{ id: "ws-2", data: { name: "Beta Workspace" } },
{ id: "ws-3", data: {} }, // node without name
{ id: "ws-4", data: { name: "" } }, // empty name
] as const;
// Stable reference so useCallback deps are stable across re-renders
const stableNodes = [...mockNodes];
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
vi.fn((selector?: (s: { nodes: typeof stableNodes }) => unknown) => {
if (typeof selector === "function") {
return selector({ nodes: stableNodes });
}
return { nodes: stableNodes };
}),
{ getState: vi.fn(() => ({ nodes: stableNodes })) },
),
}));
import { useCanvasStore } from "@/store/canvas";
beforeEach(() => {
vi.mocked(useCanvasStore).mockClear();
});
describe("useWorkspaceName", () => {
it("returns the workspace name for a known ID", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-1");
});
expect(result.current).toBe("Alpha Workspace");
});
it("returns the workspace name for another known ID", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-2");
});
expect(result.current).toBe("Beta Workspace");
});
it("returns empty string for null", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve(null);
});
expect(result.current).toBe("");
});
it("falls back to first 8 chars of ID when node has no name", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-3");
});
expect(result.current).toBe("ws-3".slice(0, 8));
});
it("falls back to first 8 chars of ID when name is empty string", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-4");
});
expect(result.current).toBe("ws-4".slice(0, 8));
});
it("falls back to first 8 chars of ID for unknown workspace", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-999");
});
expect(result.current).toBe("ws-999".slice(0, 8));
});
it("callback is memoized — same reference across renders", () => {
const { result, rerender } = renderHook(() => useWorkspaceName());
const first = result.current;
rerender();
expect(result.current).toBe(first);
});
});
+20 -55
View File
@@ -1,67 +1,32 @@
// @vitest-environment jsdom
/**
* Tests for cssVar — maps ColorToken to a CSS variable string.
*
* Exists for the rare case where an inline style="" or SVG fill needs
* a token value rather than a Tailwind class. The returned var(--color-foo)
* string follows the live theme without re-renders.
*/
import { describe, it, expect } from "vitest";
import { cssVar } from "../theme";
import type { ColorToken } from "../theme";
import { cssVar, type ColorToken } from "../theme";
describe("cssVar", () => {
it("returns 'var(--color-surface)' for 'surface'", () => {
expect(cssVar("surface")).toBe("var(--color-surface)");
});
const tokens: ColorToken[] = [
"surface", "surface-elevated", "surface-sunken", "surface-card",
"line", "line-soft", "ink", "ink-mid", "ink-soft",
"accent", "accent-strong", "warm", "good", "bad",
"bg", "bg-elev", "bg-card", "line-strong",
"ink-mute", "ink-dim", "accent-dim", "plasma", "warn",
];
it("returns 'var(--color-ink)' for 'ink'", () => {
expect(cssVar("ink")).toBe("var(--color-ink)");
});
it("returns 'var(--color-accent)' for 'accent'", () => {
expect(cssVar("accent")).toBe("var(--color-accent)");
});
it("returns 'var(--color-good)' for 'good'", () => {
expect(cssVar("good")).toBe("var(--color-good)");
});
it("returns 'var(--color-bad)' for 'bad'", () => {
expect(cssVar("bad")).toBe("var(--color-bad)");
});
it("returns 'var(--color-warn)' for 'warn'", () => {
expect(cssVar("warn")).toBe("var(--color-warn)");
});
it("handles all surface variants", () => {
const surfaces: ColorToken[] = ["surface", "surface-elevated", "surface-sunken", "surface-card"];
for (const t of surfaces) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
it("returns a CSS variable string for every colour token", () => {
for (const token of tokens) {
expect(cssVar(token)).toBe(`var(--color-${token})`);
}
});
it("handles all ink variants", () => {
const inks: ColorToken[] = ["ink", "ink-mid", "ink-soft", "ink-mute", "ink-dim"];
for (const t of inks) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
it("returned string can be used as an inline style value", () => {
const el = document.createElement("div");
el.style.color = cssVar("ink");
el.style.backgroundColor = cssVar("surface");
expect(el.style.color).toBe("var(--color-ink)");
expect(el.style.backgroundColor).toBe("var(--color-surface)");
});
it("handles always-dark tokens", () => {
const dark: ColorToken[] = ["bg", "bg-elev", "bg-card", "line-strong", "accent-dim", "plasma"];
for (const t of dark) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
});
it("is a pure function — same input always returns same output", () => {
const tokens: ColorToken[] = ["surface", "accent", "good", "bad", "warm"];
for (const t of tokens) {
for (let i = 0; i < 3; i++) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
}
it("returned string contains the token name verbatim", () => {
expect(cssVar("accent-strong")).toContain("accent-strong");
expect(cssVar("ink-dim")).toContain("ink-dim");
});
});
@@ -0,0 +1,134 @@
// @vitest-environment jsdom
/**
* Tests for ThemeProvider and useTheme.
*
* Uses renderHook so useEffect fires before assertions.
* matchMedia is stubbed via Object.defineProperty in beforeEach.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, renderHook, cleanup, act } from "@testing-library/react";
import React from "react";
import { ThemeProvider, useTheme } from "../theme-provider";
afterEach(cleanup);
function makeMatcher(prefersDark: boolean) {
return {
matches: prefersDark,
media: "(prefers-color-scheme: dark)",
onchange: null,
addListener: vi.fn(),
removeListener: vi.fn(),
addEventListener: vi.fn(),
removeEventListener: vi.fn(),
dispatchEvent: vi.fn(),
};
}
beforeEach(() => {
Object.defineProperty(window, "matchMedia", {
writable: true,
configurable: true,
value: vi.fn().mockImplementation(() => makeMatcher(false)),
});
});
afterEach(() => {
vi.restoreAllMocks();
});
describe("useTheme", () => {
it("returns noopTheme when no provider is in the tree", () => {
const { result } = renderHook(() => useTheme());
expect(result.current).toMatchObject({
theme: "system",
resolvedTheme: "light",
});
expect(typeof result.current.setTheme).toBe("function");
});
});
describe("ThemeProvider", () => {
it("initialises with the initialTheme prop", () => {
const { result } = renderHook(() => useTheme(), {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="dark">{children}</ThemeProvider>
),
});
expect(result.current).toMatchObject({
theme: "dark",
resolvedTheme: "dark",
});
expect(document.documentElement.dataset.theme).toBe("dark");
});
it("reflects system preference when theme=system", () => {
Object.defineProperty(window, "matchMedia", {
writable: true,
configurable: true,
value: vi.fn().mockImplementation(() => makeMatcher(true)),
});
const { result } = renderHook(() => useTheme(), {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="system">{children}</ThemeProvider>
),
});
expect(result.current).toMatchObject({
theme: "system",
resolvedTheme: "dark",
});
expect(document.documentElement.dataset.theme).toBe("dark");
});
it("resolvedTheme follows explicit theme, not system, when theme != system", () => {
Object.defineProperty(window, "matchMedia", {
writable: true,
configurable: true,
value: vi.fn().mockImplementation(() => makeMatcher(true)),
});
const { result } = renderHook(() => useTheme(), {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="light">{children}</ThemeProvider>
),
});
expect(result.current).toMatchObject({
theme: "light",
resolvedTheme: "light",
});
expect(document.documentElement.dataset.theme).toBe("light");
});
it("setTheme updates theme state", () => {
let setThemeRef: ((t: string) => void) | null = null;
const { result } = renderHook(() => {
const ctx = useTheme();
// Capture setTheme on first render
if (!setThemeRef) setThemeRef = ctx.setTheme;
return ctx;
}, {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="light">{children}</ThemeProvider>
),
});
expect(result.current.theme).toBe("light");
act(() => { setThemeRef!("dark"); });
expect(result.current.theme).toBe("dark");
expect(document.documentElement.dataset.theme).toBe("dark");
});
it("sets document.documentElement.dataset.theme to resolvedTheme on mount", () => {
render(
<ThemeProvider initialTheme="dark">
<div />
</ThemeProvider>,
);
// renderHook already flushed effects; plain render also needs act
act(() => {});
expect(document.documentElement.dataset.theme).toBe("dark");
});
});
+199
View File
@@ -21,12 +21,22 @@ vi.mock("../canvas", () => ({
class MockWebSocket {
static instances: MockWebSocket[] = [];
// Mirror the real WebSocket readyState constants — socket.ts's wake
// path reads WebSocket.OPEN / WebSocket.CONNECTING and this.ws.readyState.
static readonly CONNECTING = 0;
static readonly OPEN = 1;
static readonly CLOSING = 2;
static readonly CLOSED = 3;
url: string;
onopen: (() => void) | null = null;
onmessage: ((event: { data: string }) => void) | null = null;
onclose: (() => void) | null = null;
onerror: (() => void) | null = null;
closeCallCount = 0;
// Starts OPEN once triggerOpen runs; tests flip this to simulate a
// mobile background-suspend that left a dead/half-open socket.
readyState = MockWebSocket.CONNECTING;
constructor(url: string) {
this.url = url;
@@ -35,10 +45,12 @@ class MockWebSocket {
close() {
this.closeCallCount++;
this.readyState = MockWebSocket.CLOSED;
}
// Helpers to trigger events in tests
triggerOpen() {
this.readyState = MockWebSocket.OPEN;
this.onopen?.();
}
@@ -59,6 +71,46 @@ class MockWebSocket {
}
}
// ---------------------------------------------------------------------------
// Minimal DOM stub (vitest environment is 'node' — no window/document).
// socket.ts's wake-recovery attaches visibilitychange/pageshow/online/
// focus listeners; under node it self-no-ops via a typeof guard, so to
// exercise the path we inject just enough of window/document here, the
// same way WebSocket is stubbed above. Kept tiny on purpose — a single
// listener registry keyed by event name, plus a settable
// visibilityState.
// ---------------------------------------------------------------------------
interface FakeTarget {
_l: Record<string, Array<() => void>>;
addEventListener: (type: string, fn: () => void) => void;
removeEventListener: (type: string, fn: () => void) => void;
dispatch: (type: string) => void;
}
function makeFakeTarget(): FakeTarget {
const l: Record<string, Array<() => void>> = {};
return {
_l: l,
addEventListener(type, fn) {
(l[type] ||= []).push(fn);
},
removeEventListener(type, fn) {
l[type] = (l[type] || []).filter((f) => f !== fn);
},
dispatch(type) {
for (const fn of l[type] || []) fn();
},
};
}
const fakeWindow = makeFakeTarget();
const fakeDocument = Object.assign(makeFakeTarget(), {
visibilityState: "visible" as string,
});
(globalThis as unknown as Record<string, unknown>).window = fakeWindow;
(globalThis as unknown as Record<string, unknown>).document = fakeDocument;
// Install mock WebSocket globally before importing socket module
(globalThis as unknown as Record<string, unknown>).WebSocket = MockWebSocket;
@@ -328,6 +380,153 @@ describe("WebSocket onerror", () => {
});
});
// ---------------------------------------------------------------------------
// Wake recovery — mobile background-suspend regression (mobile chat not
// updating in real time until refresh). Simulates: connect → open →
// the OS freezes the page and silently kills the WS WITHOUT firing
// onclose → user returns (visibilitychange / pageshow / online /
// focus) → assert the dead socket is replaced AND, on the new socket's
// open, the resume signal fires so chat history back-fills the missed
// AGENT_MESSAGE / A2A_RESPONSE events.
// ---------------------------------------------------------------------------
import {
subscribeSocketResume,
_resetSocketResumeListenersForTests,
} from "../socket-events";
describe("wake recovery (mobile background-suspend)", () => {
beforeEach(() => {
_resetSocketResumeListenersForTests();
fakeDocument.visibilityState = "visible";
});
function suspendKill(ws: MockWebSocket) {
// Mobile background-suspend: the OS tore the transport down but the
// page was frozen so onclose never ran. The socket object survives
// with a CLOSED readyState and no reconnect was scheduled.
ws.readyState = MockWebSocket.CLOSED;
}
it("reconnects on visibilitychange when the socket was silently killed", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
expect(MockWebSocket.instances).toHaveLength(1);
suspendKill(ws);
fakeDocument.dispatch("visibilitychange");
// A fresh socket must have been created — the stale one is not
// reused.
expect(MockWebSocket.instances.length).toBeGreaterThan(1);
});
it("does NOT reconnect on visibilitychange while the socket is still healthy", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
expect(MockWebSocket.instances).toHaveLength(1);
// Healthy OPEN socket + a spurious visibilitychange (e.g. quick tab
// peek that never actually suspended) → no churn.
fakeDocument.dispatch("visibilitychange");
expect(MockWebSocket.instances).toHaveLength(1);
});
it("ignores visibilitychange when the page is hidden (the hide transition)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
suspendKill(ws);
fakeDocument.visibilityState = "hidden";
fakeDocument.dispatch("visibilitychange");
// Hidden → must not reconnect (would defeat the purpose; we only
// re-arm when the user is actually looking at the page again).
expect(MockWebSocket.instances).toHaveLength(1);
});
it.each(["pageshow", "online", "focus"])(
"reconnects on window '%s' after a silent kill",
(evt) => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
suspendKill(ws);
fakeWindow.dispatch(evt);
expect(MockWebSocket.instances.length).toBeGreaterThan(1);
},
);
it("emits the resume signal once the recovered socket re-opens (so chat back-fills missed messages)", () => {
const onResume = vi.fn();
const unsub = subscribeSocketResume(onResume);
connectSocket();
const ws1 = getLastWS();
ws1.triggerOpen();
// First open must NOT fire resume — the mount-time chat-history
// fetch already covers the initial load.
expect(onResume).not.toHaveBeenCalled();
// Background-suspend silently kills the socket, then the user
// returns.
suspendKill(ws1);
fakeDocument.dispatch("visibilitychange");
// The wake handler force-reconnected; the new socket completing its
// handshake is what signals "we recovered from a gap — re-fetch".
const ws2 = getLastWS();
expect(ws2).not.toBe(ws1);
ws2.triggerOpen();
expect(onResume).toHaveBeenCalledTimes(1);
unsub();
});
it("does not emit resume on the very first connect", () => {
const onResume = vi.fn();
const unsub = subscribeSocketResume(onResume);
connectSocket();
getLastWS().triggerOpen();
expect(onResume).not.toHaveBeenCalled();
unsub();
});
it("emits resume after an ordinary onclose-driven reconnect too (desktop path unchanged)", () => {
const onResume = vi.fn();
const unsub = subscribeSocketResume(onResume);
connectSocket();
const ws1 = getLastWS();
ws1.triggerOpen();
// Ordinary network drop — onclose fires normally.
ws1.triggerClose();
vi.advanceTimersByTime(1100); // past the 1s backoff
const ws2 = getLastWS();
expect(ws2).not.toBe(ws1);
ws2.triggerOpen();
expect(onResume).toHaveBeenCalledTimes(1);
unsub();
});
it("detaches wake listeners on disconnect (no reconnect after teardown)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
disconnectSocket();
const countAfterDisconnect = MockWebSocket.instances.length;
// A wake event after teardown must be inert.
fakeDocument.dispatch("visibilitychange");
fakeWindow.dispatch("focus");
expect(MockWebSocket.instances.length).toBe(countAfterDisconnect);
});
});
// ---------------------------------------------------------------------------
// Health check (startHealthCheck / stopHealthCheck via onopen / disconnect)
// ---------------------------------------------------------------------------
+50
View File
@@ -61,3 +61,53 @@ export function subscribeSocketEvents(listener: Listener): () => void {
export function _resetSocketEventListenersForTests(): void {
listeners.clear();
}
// ---------------------------------------------------------------------------
// Socket-resume signal
// ---------------------------------------------------------------------------
//
// Fired by the ReconnectingSocket when the WS comes back up AFTER having
// been down (drop, or a mobile-browser background-suspend that silently
// killed the socket while the page was frozen). Distinct from the raw
// event bus above: while the socket was dead the page missed every
// AGENT_MESSAGE / A2A_RESPONSE, and the store's rehydrate() only re-pulls
// /workspaces status — it does NOT back-fill chat messages. Components
// that render a live message thread (desktop ChatTab + MobileChat, both
// via useChatHistory) subscribe here to re-fetch their history on resume
// so missed agent replies appear without the user having to navigate
// away+back or hard-refresh. Shared by desktop and mobile — the recovery
// is in the singleton socket, not forked per-surface.
type ResumeListener = () => void;
const resumeListeners = new Set<ResumeListener>();
/** Notify every resume subscriber that the socket just recovered from a
* down period. Called by ReconnectingSocket.onopen, but only when the
* open follows a prior loss (not the very first connect — the initial
* mount-time history fetch already covers that). */
export function emitSocketResume(): void {
for (const listener of resumeListeners) {
try {
listener();
} catch (err) {
if (typeof console !== "undefined") {
console.error("socket-resume listener threw:", err);
}
}
}
}
/** Register a resume subscriber. Returns an unsubscribe function the
* caller must invoke from its effect cleanup. */
export function subscribeSocketResume(listener: ResumeListener): () => void {
resumeListeners.add(listener);
return () => {
resumeListeners.delete(listener);
};
}
/** Test-only: drop all resume subscribers. */
export function _resetSocketResumeListenersForTests(): void {
resumeListeners.clear();
}
+117 -1
View File
@@ -1,6 +1,6 @@
import { useCanvasStore } from "./canvas";
import { deriveWsBaseUrl } from "@/lib/ws-url";
import { emitSocketEvent } from "./socket-events";
import { emitSocketEvent, emitSocketResume } from "./socket-events";
// If explicit WS_URL is set, use it as-is (may include custom path).
// Otherwise derive base + append /ws.
@@ -98,9 +98,107 @@ class ReconnectingSocket {
// caller can fire-and-forget without coordinating.
private rehydrateInFlight: Promise<void> | null = null;
private rehydrateDedup = new RehydrateDedup(REHYDRATE_DEDUP_WINDOW_MS);
// True once any onopen has fired. Gates the resume signal so the very
// first connect doesn't fire it (the mount-time chat-history fetch
// already covers the initial load — a resume here would be a wasted
// duplicate). Set on the first successful open and stays true.
private everConnected = false;
// True between a loss (onclose / wake-detected stale socket) and the
// next successful onopen. Only when this is set does onopen emit the
// resume signal — i.e. we recovered from a real gap during which
// AGENT_MESSAGE / A2A_RESPONSE events may have been missed.
private wasDown = false;
// Bound wake handler. iOS Safari / Chrome-mobile freeze the page and
// its timers when the tab is backgrounded or the device locks, and
// tear the WS down WITHOUT reliably firing onclose before the freeze.
// On thaw nothing re-arms: onclose never ran so no reconnect was
// scheduled, and the health-check / fallback-poll intervals were
// suspended. The socket is silently dead until a manual refresh. This
// handler force-reconnects on any wake signal when the socket isn't
// healthy. Stored so disconnect() can detach the listeners.
private onWake: (() => void) | null = null;
constructor(url: string) {
this.url = url;
this.installWakeListeners();
}
/** Attach page-lifecycle listeners that force a reconnect when the
* page returns to the foreground / regains connectivity and the
* socket is not OPEN. Shared by desktop and mobile — desktop rarely
* hits the stale-socket path (its onclose fires promptly) so this is
* effectively a no-op there, while mobile depends on it because the
* background-suspend kills the socket without an onclose. */
private installWakeListeners() {
if (typeof window === "undefined" || typeof document === "undefined") {
return;
}
const wake = () => {
if (this.disposed) return;
// Only act on a visible page — visibilitychange also fires on the
// hide transition, which we must ignore (closing here would defeat
// the point).
if (
typeof document.visibilityState === "string" &&
document.visibilityState !== "visible"
) {
return;
}
// Healthy socket → nothing to do. A stale/half-open socket on
// mobile reports CLOSED or CLOSING (the OS tore the transport
// down); CONNECTING is also unhealthy from the user's POV but a
// reconnect attempt is already in flight, so leave it.
const live =
this.ws !== null &&
(this.ws.readyState === WebSocket.OPEN ||
this.ws.readyState === WebSocket.CONNECTING);
if (live) return;
// Tear down any zombie and reconnect immediately. Mark wasDown so
// the subsequent onopen emits the resume signal and chat threads
// back-fill the messages missed while frozen.
this.wasDown = true;
this.forceReconnect();
};
this.onWake = wake;
document.addEventListener("visibilitychange", wake);
window.addEventListener("pageshow", wake);
window.addEventListener("online", wake);
window.addEventListener("focus", wake);
}
private removeWakeListeners() {
if (!this.onWake) return;
if (typeof window !== "undefined" && typeof document !== "undefined") {
document.removeEventListener("visibilitychange", this.onWake);
window.removeEventListener("pageshow", this.onWake);
window.removeEventListener("online", this.onWake);
window.removeEventListener("focus", this.onWake);
}
this.onWake = null;
}
/** Detach the current (presumed dead/stale) socket without routing
* through its onclose, cancel any pending backoff timer, and
* reconnect now. Used by the wake path: the browser already killed
* the transport, so the exponential backoff that onclose would have
* scheduled is both absent and undesirable — the user is looking at
* the page and wants it live immediately. */
private forceReconnect() {
if (this.disposed) return;
if (this.reconnectTimer) {
clearTimeout(this.reconnectTimer);
this.reconnectTimer = null;
}
if (this.ws) {
this.ws.onopen = null;
this.ws.onmessage = null;
this.ws.onclose = null;
this.ws.onerror = null;
try { this.ws.close(); } catch { /* noop */ }
this.ws = null;
}
this.attempt = 0;
this.connect();
}
connect() {
@@ -132,6 +230,18 @@ class ReconnectingSocket {
this.stopFallbackPoll();
this.rehydrate();
this.startHealthCheck();
// If this open follows a real loss (drop, or a mobile background-
// suspend that the wake handler recovered from), signal resume so
// live message threads re-fetch the AGENT_MESSAGE / A2A_RESPONSE
// history they missed while the socket was dead — rehydrate()
// above only refreshes /workspaces status, not chat. Gate on
// everConnected so the very first open (covered by the mount-time
// history fetch) doesn't fire a redundant resume.
if (this.everConnected && this.wasDown) {
emitSocketResume();
}
this.everConnected = true;
this.wasDown = false;
};
ws.onmessage = (event) => {
@@ -157,6 +267,11 @@ class ReconnectingSocket {
// corresponds to the WS we just tore down (prevents a stale
// onclose from a zombie socket from re-arming the loop).
if (this.disposed || this.ws !== ws) return;
// We had a live socket and lost it — mark down so the next onopen
// emits the resume signal and chat threads back-fill missed
// messages. (The wake path also sets this; setting it here covers
// the ordinary network-drop case.)
this.wasDown = true;
this.stopHealthCheck();
useCanvasStore.getState().setWsStatus("connecting");
this.startFallbackPoll();
@@ -247,6 +362,7 @@ class ReconnectingSocket {
disconnect() {
this.disposed = true;
this.removeWakeListeners();
this.stopHealthCheck();
this.stopFallbackPoll();
if (this.reconnectTimer) {
+1
View File
@@ -62,6 +62,7 @@ TOP_LEVEL_MODULES = {
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"a2a_tools_identity",
"adapter_base",
"agent",
"agents_md",
+4 -2
View File
@@ -41,8 +41,9 @@ type EventType string
// scan-friendly as it grows.
const (
// Chat / agent messaging — surfaces in canvas chat panels.
EventAgentMessage EventType = "AGENT_MESSAGE"
EventA2AResponse EventType = "A2A_RESPONSE"
EventAgentMessage EventType = "AGENT_MESSAGE"
EventA2AResponse EventType = "A2A_RESPONSE"
EventUserMessage EventType = "USER_MESSAGE"
EventActivityLogged EventType = "ACTIVITY_LOGGED"
EventChannelMessage EventType = "CHANNEL_MESSAGE"
@@ -95,6 +96,7 @@ const (
var AllEventTypes = []EventType{
EventA2AResponse,
EventActivityLogged,
EventUserMessage,
EventAgentAssigned,
EventAgentCardUpdated,
EventAgentMessage,
@@ -41,6 +41,7 @@ func TestAllEventTypes_IsSnapshot(t *testing.T) {
"DELEGATION_STATUS",
"EXTERNAL_CREDENTIALS_ROTATED",
"TASK_UPDATED",
"USER_MESSAGE",
"WORKSPACE_AWAITING_AGENT",
"WORKSPACE_DEGRADED",
"WORKSPACE_HEARTBEAT",
@@ -399,7 +399,21 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
// (no Do(), no maybeMarkContainerDead). The response is a synthetic
// {status:"queued"} envelope so the caller (canvas, another workspace)
// knows delivery is acknowledged but pending consumption.
if lookupDeliveryMode(ctx, workspaceID) == models.DeliveryModePoll {
deliveryMode, deliveryModeErr := lookupDeliveryMode(ctx, workspaceID)
if deliveryModeErr != nil {
// internal#497 fail-closed: a real DB/context error on the
// delivery-mode read MUST NOT silently fall through to the push
// dispatch path — that is exactly what silently misrouted every
// poll-mode peer for 5 days under the ce2db75f regression. Surface
// a structured error so the delegation is marked failed (loud +
// retryable) instead of dispatched to the wrong path.
log.Printf("ProxyA2A: delivery-mode lookup failed for %s: %v — failing closed", workspaceID, deliveryModeErr)
return 0, nil, &proxyA2AError{
Status: http.StatusServiceUnavailable,
Response: gin.H{"error": "delivery-mode lookup failed; refusing to dispatch to avoid silent misrouting"},
}
}
if deliveryMode == models.DeliveryModePoll {
if logActivity {
h.logA2AReceiveQueued(ctx, workspaceID, callerID, body, a2aMethod)
}
@@ -11,6 +11,7 @@ import (
"log"
"net/http"
"strconv"
"strings"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
@@ -194,6 +195,11 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
// Tracked via goAsync (not bare `go`) so the asyncWG can be drained
// before a test swaps the global db.DB. runRestartCycle reads db.DB
// before its provisioner gate, so an untracked detached goroutine
// races setupTestDB's t.Cleanup db.DB restore. Matches the already-
// correct site at a2a_proxy.go:648.
h.goAsync(func() { h.RestartByID(workspaceID) })
return true
}
@@ -241,6 +247,9 @@ func (h *WorkspaceHandler) preflightContainerHealth(ctx context.Context, workspa
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
// Tracked via goAsync (see maybeMarkContainerDead): preflight's
// detached restart must be drainable so it doesn't race the global
// db.DB swap in test cleanup.
h.goAsync(func() { h.RestartByID(workspaceID) })
return &proxyA2AError{
Status: http.StatusServiceUnavailable,
@@ -262,8 +271,9 @@ func (h *WorkspaceHandler) logA2AFailure(ctx context.Context, workspaceID, calle
errWsName = workspaceID
}
summary := "A2A request to " + errWsName + " failed: " + errMsg
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), 30*time.Second)
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
WorkspaceID: workspaceID,
@@ -309,8 +319,9 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
}
summary := a2aMethod + " → " + wsNameForLog
toolTrace := extractToolTrace(respBody)
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(ctx), 30*time.Second)
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
WorkspaceID: workspaceID,
@@ -334,6 +345,19 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
"duration_ms": durationMs,
})
}
// #228: fan user's own outbound message to all sessions of the workspace.
// When a canvas user sends a message (callerID == "" and method == "message/send"),
// the originating session already inserted it optimistically in useChatSend.
// Other sessions see nothing until a manual refresh — this broadcast closes
// that gap. The originating session collapses its optimistic copy via the
// 3-second appendMessageDeduped window (same role + content = deduped).
if callerID == "" && a2aMethod == "message/send" && statusCode < 400 {
userPayload := extractCanvasUserMessage(body)
if userPayload != nil {
h.broadcaster.BroadcastOnly(workspaceID, string(events.EventUserMessage), userPayload)
}
}
}
func nilIfEmpty(s string) *string {
@@ -383,6 +407,119 @@ func validateCallerToken(ctx context.Context, c *gin.Context, callerID string) e
// matching (the wsauth errors are typed for the invalid case).
var errInvalidCallerToken = errors.New("missing caller auth token")
// canvasUserMessage holds the extracted user message extracted from an
// A2A canvas request body for broadcasting to other sessions.
type canvasUserMessage struct {
Message string `json:"message,omitempty"`
Parts []map[string]interface{} `json:"parts,omitempty"`
MessageID string `json:"messageId,omitempty"`
Attachments []map[string]interface{} `json:"attachments,omitempty"`
}
// extractCanvasUserMessage parses an A2A JSON-RPC request body and extracts
// the user-authored text and attachments from a canvas-initiated message/send.
// Returns nil when the body is not a canvas user message (empty, malformed,
// or not a message/send from canvas). The returned payload is safe to pass
// directly to BroadcastOnly — nil fields are omitted from JSON.
func extractCanvasUserMessage(body []byte) map[string]interface{} {
if len(body) == 0 {
return nil
}
var top map[string]json.RawMessage
if err := json.Unmarshal(body, &top); err != nil {
return nil
}
// Only handle message/send from canvas
var method string
if err := json.Unmarshal(top["method"], &method); err != nil || method != "message/send" {
return nil
}
params, ok := top["params"]
if !ok {
return nil
}
var paramsMap map[string]json.RawMessage
if err := json.Unmarshal(params, &paramsMap); err != nil {
return nil
}
msgRaw, ok := paramsMap["message"]
if !ok {
return nil
}
var msg map[string]json.RawMessage
if err := json.Unmarshal(msgRaw, &msg); err != nil {
return nil
}
// role field: only broadcast user-role messages (canvas users)
var role string
if err := json.Unmarshal(msg["role"], &role); err != nil || role != "user" {
return nil
}
result := make(map[string]interface{})
// Extract messageId if present
var mid string
if err := json.Unmarshal(msg["messageId"], &mid); err == nil && mid != "" {
result["messageId"] = mid
}
// Extract text from parts — accumulate all text parts into a single string
var parts []json.RawMessage
if err := json.Unmarshal(msg["parts"], &parts); err == nil {
var texts []string
var fileAttachments []map[string]interface{}
for _, pRaw := range parts {
var p map[string]json.RawMessage
if err := json.Unmarshal(pRaw, &p); err != nil {
continue
}
var t string
if err := json.Unmarshal(p["text"], &t); err == nil && t != "" {
texts = append(texts, t)
}
var fileRaw json.RawMessage
if err := json.Unmarshal(p["file"], &fileRaw); err == nil && fileRaw != nil {
var f map[string]json.RawMessage
if err := json.Unmarshal(fileRaw, &f); err == nil {
att := make(map[string]interface{})
var s string
if err := json.Unmarshal(f["uri"], &s); err == nil {
att["uri"] = s
}
if err := json.Unmarshal(f["name"], &s); err == nil {
att["name"] = s
}
if err := json.Unmarshal(f["mimeType"], &s); err == nil {
att["mimeType"] = s
}
var n float64
if err := json.Unmarshal(f["size"], &n); err == nil {
att["size"] = n
}
if len(att) > 0 {
fileAttachments = append(fileAttachments, att)
}
}
}
}
if len(texts) > 0 {
// Join with newlines — user may have sent multiple text parts
result["message"] = strings.Join(texts, "\n")
}
if len(fileAttachments) > 0 {
result["attachments"] = fileAttachments
}
}
// Drop empty payloads
if len(result) == 0 {
return nil
}
return result
}
// extractToolTrace pulls metadata.tool_trace from an A2A JSON-RPC response.
// Returns nil when absent or malformed — callers can pass it straight through.
func extractToolTrace(respBody []byte) json.RawMessage {
@@ -458,40 +595,64 @@ func parseUsageFromA2AResponse(body []byte) (inputTokens, outputTokens int64) {
return 0, 0
}
// lookupDeliveryMode returns the workspace's delivery_mode. On any DB
// error or missing row it returns DeliveryModePush — the fail-closed
// default. "Closed" here means "fall back to today's behavior (synchronous
// dispatch)" rather than "fall back to drop the request silently into
// activity_logs where the agent might never see it." A poll-mode workspace
// that briefly reads as push will get its A2A request dispatched to the
// stored URL (or a 502 if no URL); a push-mode workspace that briefly
// reads as poll would get its request silently queued with no dispatch.
// The first failure is loud + recoverable; the second is silent.
// lookupDeliveryMode returns the workspace's delivery_mode.
//
// internal#497 / RFC#497 fail-closed (SURGICAL scope): the *specific*
// failure mode that hid the ce2db75f regression for 5 days is now
// propagated instead of silently swallowed — a CONTEXT error
// (context.Canceled / context.DeadlineExceeded). Under ce2db75f the
// detached delegation goroutine ran on a cancelled request context, every
// `SELECT delivery_mode` failed `context canceled`, this function returned
// push, the poll-mode short-circuit in proxyA2ARequest was skipped, and
// poll-mode peers (e.g. an operator laptop on molecule-mcp-claude-channel)
// silently never got their a2a_receive inbox row. A transient,
// systematic-once-triggered context cancellation became permanent
// invisible misrouting. Returning that error lets the caller fail loud
// (mark the delegation failed) instead of mis-dispatching.
//
// Scope is deliberately narrow: only ctx errors propagate. Other DB
// errors retain the long-standing documented "fall back to push (today's
// synchronous behavior)" contract — that path is loud + recoverable
// (502 / SSRF reject / restart), unlike the silent poll-mode drop, and
// the surrounding proxy (incl. the sibling checkWorkspaceBudget) is
// intentionally built around that fail-open-to-push behavior. Widening
// further is an RFC#497 follow-up, not part of this P0 fix.
//
// A genuinely *absent* configuration is NOT an error and still resolves to
// push (the safe synchronous default): sql.ErrNoRows, a NULL/empty column,
// or an unrecognised value all return (push, nil).
//
// The function is intentionally lookup-only — it never mutates the row.
// The register handler (registry.go) is the only writer for delivery_mode.
//
// See #2339 PR 1 for the column + register-flow side; this is the
// proxy-side read used for the short-circuit in proxyA2ARequest.
func lookupDeliveryMode(ctx context.Context, workspaceID string) string {
func lookupDeliveryMode(ctx context.Context, workspaceID string) (string, error) {
var mode sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT delivery_mode FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&mode)
if err != nil {
if !errors.Is(err, sql.ErrNoRows) {
log.Printf("ProxyA2A: lookupDeliveryMode(%s) failed (%v) — defaulting to push", workspaceID, err)
// internal#497: a context cancellation/deadline MUST NOT be
// swallowed into a silent push default — that is the exact 5-day
// silent-misrouting vector. Propagate so the caller fails closed.
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
log.Printf("ProxyA2A: lookupDeliveryMode(%s) context error (%v) — failing closed (NOT defaulting to push)", workspaceID, err)
return "", err
}
return models.DeliveryModePush
if !errors.Is(err, sql.ErrNoRows) {
log.Printf("ProxyA2A: lookupDeliveryMode(%s) failed (%v) — defaulting to push (non-ctx DB error; legacy fail-open-to-push contract)", workspaceID, err)
}
return models.DeliveryModePush, nil
}
if !mode.Valid || mode.String == "" {
return models.DeliveryModePush
return models.DeliveryModePush, nil
}
if !models.IsValidDeliveryMode(mode.String) {
log.Printf("ProxyA2A: workspace %s has invalid delivery_mode=%q — defaulting to push", workspaceID, mode.String)
return models.DeliveryModePush
return models.DeliveryModePush, nil
}
return mode.String
return mode.String, nil
}
// logA2AReceiveQueued records a poll-mode "queued" A2A receive into
@@ -0,0 +1,262 @@
package handlers
import (
"encoding/json"
"testing"
)
// TestExtractCanvasUserMessage_TextOnly covers the primary path: a canvas user
// sends a plain text message with no attachments.
func TestExtractCanvasUserMessage_TextOnly(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": "msg-abc-123",
"parts": [
{"kind": "text", "text": "Hello, agent!"}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload for text message")
}
if got["message"] != "Hello, agent!" {
t.Errorf("message = %v, want %q", got["message"], "Hello, agent!")
}
mid, ok := got["messageId"].(string)
if !ok || mid != "msg-abc-123" {
t.Errorf("messageId = %v, want %q", got["messageId"], "msg-abc-123")
}
_, hasAttachments := got["attachments"]
if hasAttachments {
t.Errorf("unexpected attachments: %v", got["attachments"])
}
}
// TestExtractCanvasUserMessage_FileOnly covers a user message with a file but no text.
func TestExtractCanvasUserMessage_FileOnly(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": "msg-file-456",
"parts": [
{
"kind": "file",
"file": {
"name": "report.pdf",
"uri": "workspace:/uploads/report.pdf",
"mimeType": "application/pdf",
"size": 4096
}
}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload for file-only message")
}
if got["message"] != nil {
t.Errorf("unexpected message text: %v", got["message"])
}
attachments, ok := got["attachments"].([]map[string]interface{})
if !ok || len(attachments) != 1 {
t.Fatalf("attachments = %v, want 1-element array", got["attachments"])
}
att := attachments[0]
if att["uri"] != "workspace:/uploads/report.pdf" {
t.Errorf("uri = %v, want %q", att["uri"], "workspace:/uploads/report.pdf")
}
if att["name"] != "report.pdf" {
t.Errorf("name = %v, want %q", att["name"], "report.pdf")
}
if att["mimeType"] != "application/pdf" {
t.Errorf("mimeType = %v, want %q", att["mimeType"], "application/pdf")
}
}
// TestExtractCanvasUserMessage_TextAndFile covers a user message with both text and a file.
func TestExtractCanvasUserMessage_TextAndFile(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"parts": [
{"kind": "text", "text": "Here is the file:"},
{"kind": "text", "text": "see below"},
{
"kind": "file",
"file": {
"name": "data.csv",
"uri": "workspace:/exports/data.csv",
"mimeType": "text/csv",
"size": 8192
}
}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload")
}
// Two text parts are joined with newline
if got["message"] != "Here is the file:\nsee below" {
t.Errorf("message = %v, want %q", got["message"], "Here is the file:\nsee below")
}
attachments, ok := got["attachments"].([]map[string]interface{})
if !ok || len(attachments) != 1 {
t.Fatalf("attachments = %v, want 1-element array", got["attachments"])
}
}
// TestExtractCanvasUserMessage_Malformed covers malformed JSON.
func TestExtractCanvasUserMessage_Malformed(t *testing.T) {
cases := []struct {
name string
body []byte
}{
{"not JSON", []byte(`{not valid`)},
{"wrong type top-level", []byte(`123`)},
{"missing params", []byte(`{"method":"message/send"}`)},
{"params not object", []byte(`{"method":"message/send","params":123}`)},
{"missing message", []byte(`{"method":"message/send","params":{}}`)},
{"message not object", []byte(`{"method":"message/send","params":{"message":123}}`)},
{"role missing", []byte(`{"method":"message/send","params":{"message":{"parts":[]}}}`)},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := extractCanvasUserMessage(tc.body); got != nil {
t.Errorf("expected nil for %s, got %v", tc.name, got)
}
})
}
}
// TestExtractCanvasUserMessage_NotUserRole covers agent/workspace callers
// whose role is not "user" — these should not be broadcast as USER_MESSAGE.
func TestExtractCanvasUserMessage_NotUserRole(t *testing.T) {
cases := []struct {
name string
body []byte
}{
{
"agent role",
[]byte(`{"method":"message/send","params":{"message":{"role":"agent","parts":[{"kind":"text","text":"hello"}]}}}`),
},
{
"assistant role",
[]byte(`{"method":"message/send","params":{"message":{"role":"assistant","parts":[{"kind":"text","text":"hello"}]}}}`),
},
{
"empty role",
[]byte(`{"method":"message/send","params":{"message":{"role":"","parts":[{"kind":"text","text":"hello"}]}}}`),
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := extractCanvasUserMessage(tc.body); got != nil {
t.Errorf("expected nil for role=%s, got %v", tc.name, got)
}
})
}
}
// TestExtractCanvasUserMessage_NotMessageSend covers non-message/send methods.
func TestExtractCanvasUserMessage_NotMessageSend(t *testing.T) {
cases := []struct {
name string
method string
}{
{"tasks/send", "tasks/send"},
{"initialize", "initialize"},
{"ping", "ping"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
body, _ := json.Marshal(map[string]interface{}{
"method": tc.method,
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]interface{}{{"kind": "text", "text": "hello"}},
},
},
})
if got := extractCanvasUserMessage(body); got != nil {
t.Errorf("expected nil for method=%q, got %v", tc.method, got)
}
})
}
}
// TestExtractCanvasUserMessage_BlankOrEmpty covers text with only whitespace
// and empty parts arrays.
func TestExtractCanvasUserMessage_BlankOrEmpty(t *testing.T) {
cases := []struct {
name string
body []byte
}{
{
"empty text part",
[]byte(`{"method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":""}]}}}`),
},
{
"empty parts array",
[]byte(`{"method":"message/send","params":{"message":{"role":"user","parts":[]}}}`),
},
{
"whitespace-only text — still included as valid content",
[]byte(`{"method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":" "}]}}}`),
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := extractCanvasUserMessage(tc.body)
if tc.name == "whitespace-only text — still included as valid content" {
// Whitespace-only text is valid content — preserve it as-is.
// Canvas dedup collapses identical copies; whitespace is not stripped.
if got == nil {
t.Error("expected non-nil for whitespace-only text")
} else if got["message"] != " " {
t.Errorf("message = %q, want %q", got["message"], " ")
}
return
}
if got != nil {
t.Errorf("expected nil for %s, got %v", tc.name, got)
}
})
}
}
// TestExtractCanvasUserMessage_Unicode covers non-ASCII text.
func TestExtractCanvasUserMessage_Unicode(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"parts": [
{"kind": "text", "text": "こんにちは世界 🌍 日本語"}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload for unicode message")
}
if got["message"] != "こんにちは世界 🌍 日本語" {
t.Errorf("message = %v, want %q", got["message"], "こんにちは世界 🌍 日本語")
}
}
@@ -2235,12 +2235,18 @@ func TestProxyA2A_PushMode_NoShortCircuit(t *testing.T) {
}
}
// TestProxyA2A_PollMode_FailsClosedToPush verifies the safety contract:
// a DB error reading delivery_mode must default to push (the existing
// behavior), NOT poll. Failing to push means a poll-mode workspace
// briefly attempts a real dispatch — visible failure (502 / SSRF
// rejection / restart cascade), not a silent drop into activity_logs
// where the agent might never look. Loud > silent, recoverable > lost.
// TestProxyA2A_PollMode_FailsClosedToPush verifies the LEGACY safety
// contract is PRESERVED for non-context DB errors: a generic DB error
// reading delivery_mode still defaults to push (today's behavior), NOT
// poll. Failing to push means a poll-mode workspace briefly attempts a
// real dispatch — visible failure (502 / SSRF rejection / restart
// cascade), not a silent drop into activity_logs where the agent might
// never look. Loud > silent, recoverable > lost.
//
// internal#497 narrows the fail-closed change to *context* errors only
// (the actual ce2db75f regression vector); generic DB errors keep this
// long-standing fail-open-to-push contract. The ctx-error fail-closed is
// covered by TestLookupDeliveryMode_ContextCanceled_FailsClosed.
func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t) // empty Redis — forces resolveAgentURL DB lookup
@@ -2251,7 +2257,8 @@ func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
expectBudgetCheck(mock, wsID)
// lookupDeliveryMode hits a transient DB error → must default push.
// lookupDeliveryMode hits a generic (non-context) DB error → must
// still default push (legacy contract preserved by internal#497).
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WithArgs(wsID).
WillReturnError(sql.ErrConnDone)
@@ -2275,7 +2282,7 @@ func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
var resp map[string]interface{}
_ = json.Unmarshal(w.Body.Bytes(), &resp)
if resp["status"] == "queued" {
t.Errorf("DB error on delivery_mode lookup silently queued the request — must fail-closed-to-push, got body: %s", w.Body.String())
t.Errorf("generic DB error on delivery_mode lookup silently queued the request — must fail-open-to-push, got body: %s", w.Body.String())
}
}
@@ -2284,6 +2291,37 @@ func TestProxyA2A_PollMode_FailsClosedToPush(t *testing.T) {
}
}
// TestLookupDeliveryMode_ContextCanceled_FailsClosed is the internal#497
// regression test for the SECONDARY defect. It pins the exact invariant
// that hid the ce2db75f regression for 5 days: when the delivery_mode read
// fails because the context was cancelled (precisely what happened in the
// detached delegation goroutine running on a returned request context),
// lookupDeliveryMode MUST return an error and MUST NOT silently return
// "push". Returning push there is what skipped the poll-mode short-circuit
// and silently dropped 100% of poll-mode peer deliveries.
//
// A pre-cancelled context makes QueryRowContext fail with
// context.Canceled deterministically — no DB rows are mocked because the
// query never reaches a result.
func TestLookupDeliveryMode_ContextCanceled_FailsClosed(t *testing.T) {
mock := setupTestDB(t)
// The query fails on the cancelled ctx before matching; provide a
// permissive expectation so sqlmock doesn't complain about the attempt.
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WillReturnError(context.Canceled)
ctx, cancel := context.WithCancel(context.Background())
cancel() // simulate the HTTP handler having returned (request ctx dead)
mode, err := lookupDeliveryMode(ctx, "ws-poll-peer")
if err == nil {
t.Fatalf("internal#497 regression: lookupDeliveryMode swallowed a context error and returned mode=%q with nil err — this is the exact 5-day silent-misrouting vector", mode)
}
if mode == models.DeliveryModePush {
t.Errorf("internal#497 regression: context error must NOT default to push (got mode=%q)", mode)
}
}
// ==================== a2aClient ResponseHeaderTimeout config ====================
func TestA2AClientResponseHeaderTimeout(t *testing.T) {
@@ -691,6 +691,19 @@ func logActivityExec(ctx context.Context, exec activityExecutor, broadcaster eve
if respStr != nil {
payload["response_body"] = json.RawMessage(respJSON)
}
// internal#211/#212: error_detail carries the runtime's curated,
// user-actionable, secret-safe failure reason (provider HTTP
// status + error code + the provider's own guidance, e.g. a 403
// "org disabled · use an API key / ask your admin"). It is
// already persisted to the DB column above and capped by the
// runtime's report_activity helper (4096 chars). Previously it
// was dropped from the LIVE broadcast, so the canvas had nothing
// to render and fell back to a hardcoded opaque
// "Agent error (Exception) — see workspace logs" string. Include
// it so the chat bubble shows the real reason in real time.
if params.ErrorDetail != nil && *params.ErrorDetail != "" {
payload["error_detail"] = *params.ErrorDetail
}
}
return func() {
@@ -0,0 +1,113 @@
package handlers
import "encoding/json"
// agent_card_reconcile.go — server-side repair for the fleet-wide
// agent-card identity gap.
//
// Root cause: the runtime builds its AgentCard from config.name
// (workspace/main.py:198), and config.name is read from the
// CP-regenerated /configs/config.yaml whose `name:` field is the raw
// workspace UUID — NOT the friendly name the operator sees. The friendly
// name IS captured: POST /workspaces and PATCH /workspaces/:id (the
// canvas Details tab) write it to the trusted workspaces.name DB column.
// But /registry/register stores the runtime-supplied card verbatim
// (registry.go: `agent_card = EXCLUDED.agent_card`), so the stored card
// served at /.well-known/agent-card.json and returned to peers via
// agent_card_url ends up with name = UUID, description = "", role = null.
//
// Fix shape (deliberately minimal, no contract weakening): when the
// runtime-supplied card's `name` is empty or equals the workspace UUID
// (the placeholder the runtime had no better value for), the PLATFORM —
// not the agent — substitutes the friendly value from the trusted
// workspaces row. Identity stays platform-controlled: the agent never
// gains the ability to self-set its own name/role; the platform sources
// it from the operator-controlled DB column. We only ever FILL gaps
// (empty / UUID-placeholder); a card that already carries a real
// friendly name is never downgraded.
//
// list_peers / the /registry/:id/peers endpoint already resolve display
// names from workspaces.name directly (discovery.go / mcp_tools.go
// `SELECT w.id, w.name, ...`), so peer_name in delivered message tags
// was already correct — this fix closes the remaining surface: the
// agent_card blob itself (canvas Agent Card / Skills view, peer
// agent_card_url fetches, the well-known card).
//
// description / role degrade discovery the same way: an empty
// description and null role give peers nothing to reason about. We
// default description from the (now reconciled) name when blank and
// role from workspaces.role when the operator set one.
// reconcileAgentCardIdentity patches identity gaps in a runtime-supplied
// agent card from the trusted workspace DB row. It returns the
// (possibly rewritten) card bytes and whether anything changed. On any
// failure (malformed JSON, nothing to fill) it returns the input bytes
// unchanged with changed=false so the caller can store them verbatim —
// this is strictly no-worse-than-before, never a regression.
//
// Pure function: no DB / HTTP / globals, so it is exhaustively
// unit-testable (agent_card_reconcile_test.go) without booting the
// handler or a sqlmock.
func reconcileAgentCardIdentity(card json.RawMessage, workspaceID, dbName, dbRole string) (json.RawMessage, bool) {
var m map[string]any
if err := json.Unmarshal(card, &m); err != nil || m == nil {
// Malformed card — not this function's job to reject it (the
// upsert stores it as-is and downstream readers handle bad
// JSON). Return verbatim so byte-for-byte behaviour is
// preserved on the failure path.
return card, false
}
changed := false
// name: fill only when empty or the UUID placeholder. A dbName that
// is itself the UUID is a placeholder row (registry.go INSERT seeds
// name = id before the canvas sets a friendly one) — not a friendly
// name, so it is not an eligible source.
cardName, _ := m["name"].(string)
if (cardName == "" || cardName == workspaceID) &&
dbName != "" && dbName != workspaceID {
m["name"] = dbName
changed = true
}
// description: when blank, default to the (reconciled) name so peers
// and the canvas Agent Card view have a non-empty human label
// instead of "". Mirrors the runtime's own
// `config.description or config.name` fallback (main.py:199) but
// applied to the registry copy where the runtime's fallback was the
// UUID.
if desc, _ := m["description"].(string); desc == "" {
if n, _ := m["name"].(string); n != "" && n != workspaceID {
m["description"] = n
changed = true
}
}
// role: surface the operator-set workspaces.role when the card
// carries none. Discovery (peer_role) and the canvas Role row read
// workspaces.role directly; this just makes the standalone card
// self-describing too. Never overwrite a role the card already has.
if dbRole != "" {
if r, ok := m["role"].(string); !ok || r == "" {
m["role"] = dbRole
changed = true
}
}
if !changed {
// No-op: return the original bytes untouched so callers that
// compare/store get byte-identical input (re-marshalling would
// reorder keys for no reason).
return card, false
}
out, err := json.Marshal(m)
if err != nil {
// Re-marshal of a map we just unmarshalled should never fail;
// if it somehow does, fall back to the verbatim input rather
// than storing nothing.
return card, false
}
return out, true
}
@@ -0,0 +1,166 @@
package handlers
import (
"encoding/json"
"testing"
)
// TestReconcileAgentCardIdentity covers the server-side backfill that
// repairs the fleet-wide agent-card identity gap (internal#XXX): the
// runtime POSTs /registry/register with agent_card.name = the workspace
// UUID (because the CP-regenerated /configs/config.yaml sets name: <uuid>)
// while the trusted workspaces.name DB column — the value the canvas
// Details tab shows and lets the operator edit — holds the friendly
// name ("Claude Code Agent"). The platform reconciles them from the DB
// row (NOT from the agent — identity stays platform-controlled, not
// self-mutable).
func TestReconcileAgentCardIdentity(t *testing.T) {
const wsID = "3b81321b-1ec7-488c-96f7-72c42a968da6"
tests := []struct {
name string
card string
dbName string
dbRole string
wantName string
wantDesc string
wantRole string
wantChanged bool
}{
{
name: "name is the workspace UUID — backfill from DB",
card: `{"name":"3b81321b-1ec7-488c-96f7-72c42a968da6","description":"","capabilities":{"streaming":true}}`,
dbName: "Claude Code Agent",
dbRole: "",
wantName: "Claude Code Agent",
wantDesc: "Claude Code Agent",
wantRole: "",
wantChanged: true,
},
{
name: "empty name — backfill from DB",
card: `{"name":"","description":"x"}`,
dbName: "ops-agent",
dbRole: "sre",
wantName: "ops-agent",
wantDesc: "x",
wantRole: "sre",
wantChanged: true,
},
{
name: "role null in card, DB has role — backfill role only",
card: `{"name":"Reviewer","description":"Senior reviewer"}`,
dbName: "Reviewer",
dbRole: "code-reviewer",
wantName: "Reviewer",
wantDesc: "Senior reviewer",
wantRole: "code-reviewer",
wantChanged: true,
},
{
name: "card already has a real friendly name — do NOT clobber it",
// A richer card (e.g. an external channel agent) must win;
// the platform only fills gaps, never downgrades.
card: `{"name":"Claude Code (channel)","description":"Local Claude Code session bridged","role":"assistant"}`,
dbName: "hongming-pc",
dbRole: "operator",
wantName: "Claude Code (channel)",
wantDesc: "Local Claude Code session bridged",
wantRole: "assistant",
wantChanged: false,
},
{
name: "no DB name available — leave UUID name untouched (no worse than before)",
card: `{"name":"3b81321b-1ec7-488c-96f7-72c42a968da6","description":""}`,
dbName: "",
dbRole: "",
wantName: "3b81321b-1ec7-488c-96f7-72c42a968da6",
wantDesc: "",
wantRole: "",
wantChanged: false,
},
{
name: "dbName equals UUID (placeholder row) — not a friendly name, leave untouched",
card: `{"name":"3b81321b-1ec7-488c-96f7-72c42a968da6"}`,
dbName: "3b81321b-1ec7-488c-96f7-72c42a968da6",
dbRole: "",
wantName: "3b81321b-1ec7-488c-96f7-72c42a968da6",
wantDesc: "",
wantRole: "",
wantChanged: false,
},
{
name: "malformed card JSON — return unchanged, no panic",
card: `{not json`,
dbName: "Claude Code Agent",
dbRole: "",
wantChanged: false,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
out, changed := reconcileAgentCardIdentity(
json.RawMessage(tc.card), wsID, tc.dbName, tc.dbRole,
)
if changed != tc.wantChanged {
t.Fatalf("changed = %v, want %v", changed, tc.wantChanged)
}
if !tc.wantChanged {
// Unchanged path must return the input bytes verbatim.
if string(out) != tc.card {
t.Fatalf("unchanged path mutated bytes:\n got %s\n want %s", out, tc.card)
}
return
}
var got map[string]any
if err := json.Unmarshal(out, &got); err != nil {
t.Fatalf("output not valid JSON: %v (%s)", err, out)
}
if g, _ := got["name"].(string); g != tc.wantName {
t.Errorf("name = %q, want %q", g, tc.wantName)
}
if g, _ := got["description"].(string); g != tc.wantDesc {
t.Errorf("description = %q, want %q", g, tc.wantDesc)
}
if tc.wantRole != "" {
if g, _ := got["role"].(string); g != tc.wantRole {
t.Errorf("role = %q, want %q", g, tc.wantRole)
}
}
})
}
}
// TestReconcileAgentCardIdentity_PreservesOtherFields ensures the
// reconcile is a minimal in-place patch — capabilities, version,
// skills and any unknown future fields survive untouched.
func TestReconcileAgentCardIdentity_PreservesOtherFields(t *testing.T) {
card := `{"name":"ws-uuid","description":"","version":"1.0.0",` +
`"capabilities":{"streaming":true,"pushNotifications":true},` +
`"skills":[{"id":"a","name":"a"}],"configuration_status":"ready"}`
out, changed := reconcileAgentCardIdentity(
json.RawMessage(card), "ws-uuid", "Friendly Name", "",
)
if !changed {
t.Fatal("expected changed = true")
}
var got map[string]any
if err := json.Unmarshal(out, &got); err != nil {
t.Fatalf("invalid JSON: %v", err)
}
if got["version"] != "1.0.0" {
t.Errorf("version not preserved: %v", got["version"])
}
if got["configuration_status"] != "ready" {
t.Errorf("configuration_status not preserved: %v", got["configuration_status"])
}
caps, ok := got["capabilities"].(map[string]any)
if !ok || caps["streaming"] != true {
t.Errorf("capabilities not preserved: %v", got["capabilities"])
}
skills, ok := got["skills"].([]any)
if !ok || len(skills) != 1 {
t.Errorf("skills not preserved: %v", got["skills"])
}
}
@@ -107,10 +107,29 @@ func (h *ChatFilesHandler) WithPendingUploads(storage pendinguploads.Storage, br
}
// chatUploadMaxBytes caps the full multipart request body so a
// malicious / runaway client can't OOM the proxy hop. 50 MB matches
// the workspace-side limit; anything larger is rejected at the
// malicious / runaway client can't OOM the proxy hop. 100 MB matches
// the workspace-side total limit; anything larger is rejected at the
// network boundary before forwarding.
const chatUploadMaxBytes = 50 * 1024 * 1024
//
// SSOT NOTE (issue #1520): this constant is the source of truth for
// chat upload limits across the platform. Its value is exported to
// the workspace container at provision time via the env var
// CHAT_UPLOAD_MAX_TOTAL_BYTES (see
// workspace_provision_shared.go::applyChatUploadLimits) so the
// Python runtime cap stays in lock-step. Do NOT change this without
// updating the per-file cap chatUploadMaxFileBytes below and
// verifying the env-injection site is unchanged.
const chatUploadMaxBytes = 100 * 1024 * 1024
// chatUploadMaxFileBytes caps any single multipart part. Mirrors the
// total cap by default because most chat uploads are a single file;
// keeping per-file equal to total avoids the surprise of "my 60 MB
// file fit under the total but got 413'd on per-file". Exported to
// the workspace container as CHAT_UPLOAD_MAX_FILE_BYTES so the
// Starlette parser's max_part_size matches and any single part above
// Starlette's default 1 MiB no longer raises MultiPartException
// (root cause of issue #1520).
const chatUploadMaxFileBytes = 100 * 1024 * 1024
// resolveWorkspaceForwardCreds resolves the workspace's URL +
// platform_inbound_secret for an /internal/* forward, applying
@@ -0,0 +1,63 @@
package handlers
// chat_upload_limits_test.go — pins the SSOT env-injection contract
// for chat-upload caps (issue #1520). The Python workspace runtime
// reads these env vars at module init; drift between the constant in
// chat_files.go and the env-var name here silently breaks chat upload
// fleet-wide, so the contract is asserted as a unit test in the same
// package as the producer.
import (
"fmt"
"testing"
)
// applyChatUploadLimits MUST seed both env vars to the byte-count
// stringification of the Go-side constants. Anything else means a
// Python-side parser cap that disagrees with the Go-side network cap,
// which is exactly the drift that shipped #1520.
func TestApplyChatUploadLimits_DefaultsMatchGoConstants(t *testing.T) {
env := map[string]string{}
applyChatUploadLimits(env)
wantFile := fmt.Sprintf("%d", chatUploadMaxFileBytes)
if got := env["CHAT_UPLOAD_MAX_FILE_BYTES"]; got != wantFile {
t.Errorf("CHAT_UPLOAD_MAX_FILE_BYTES = %q, want %q", got, wantFile)
}
wantTotal := fmt.Sprintf("%d", chatUploadMaxBytes)
if got := env["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; got != wantTotal {
t.Errorf("CHAT_UPLOAD_MAX_TOTAL_BYTES = %q, want %q", got, wantTotal)
}
}
// Pre-existing values win. A tenant override, plugin mutator, or A/B
// experiment that already set the env MUST be preserved — the SSOT
// helper is a defaulting layer, not an override layer.
func TestApplyChatUploadLimits_PreExistingValuesPreserved(t *testing.T) {
env := map[string]string{
"CHAT_UPLOAD_MAX_FILE_BYTES": "1234",
"CHAT_UPLOAD_MAX_TOTAL_BYTES": "5678",
}
applyChatUploadLimits(env)
if got := env["CHAT_UPLOAD_MAX_FILE_BYTES"]; got != "1234" {
t.Errorf("pre-existing CHAT_UPLOAD_MAX_FILE_BYTES overwritten: got %q", got)
}
if got := env["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; got != "5678" {
t.Errorf("pre-existing CHAT_UPLOAD_MAX_TOTAL_BYTES overwritten: got %q", got)
}
}
// The 100 MB minimum is the CTO-directed allowance floor (issue #1520).
// Pin so a future "tidy up: 100 MB seems large" refactor surfaces here
// before reverting the user-visible behaviour change.
func TestChatUploadCaps_MinimumAllowanceFloor(t *testing.T) {
const floor = 100 * 1024 * 1024
if chatUploadMaxBytes < floor {
t.Errorf("chatUploadMaxBytes = %d, below #1520 floor %d", chatUploadMaxBytes, floor)
}
if chatUploadMaxFileBytes < floor {
t.Errorf("chatUploadMaxFileBytes = %d, below #1520 floor %d", chatUploadMaxFileBytes, floor)
}
}
@@ -163,8 +163,32 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
},
})
// Fire-and-forget: send A2A in background goroutine
go h.executeDelegation(ctx, sourceID, body.TargetID, delegationID, a2aBody)
// Fire-and-forget: send A2A in a background goroutine.
//
// internal#497 — the goroutine MUST NOT inherit the HTTP request's
// cancellation. `ctx` here is c.Request.Context(); the handler returns
// 202 a few lines below, which cancels that context immediately. Before
// this fix (regression ce2db75f) executeDelegation ran on the
// request-scoped ctx, so every DB op + proxy call in the detached
// goroutine failed `context canceled` the instant the 202 was written.
// That silently broke 100% of A2A peer delegations fleet-wide since
// 2026-05-12 (poll-mode peers never got their a2a_receive inbox row;
// lookupDeliveryMode swallowed the ctx error and defaulted to push).
//
// context.WithoutCancel detaches cancellation/deadline while PRESERVING
// all context values (trace/correlation/tenant ids that proxyA2ARequest
// and the broadcaster read off ctx) — this is the established pattern in
// this package (a2a_proxy.go:850, a2a_proxy_helpers.go:525,
// registry.go:822). The 30-minute ceiling matches the prior internal
// budget executeDelegation used before ce2db75f and the proxy's own
// absolute agent-dispatch ceiling (a2a_proxy.go forwardCtx).
delegationCtx, cancelDelegation := context.WithTimeout(
context.WithoutCancel(ctx), 30*time.Minute,
)
go func() {
defer cancelDelegation()
h.executeDelegation(delegationCtx, sourceID, body.TargetID, delegationID, a2aBody)
}()
// Broadcast event so canvas shows delegation in real-time
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationSent), sourceID, map[string]interface{}{
@@ -723,6 +747,14 @@ func (h *DelegationHandler) listDelegationsFromLedger(ctx context.Context, works
entry["response_preview"] = textutil.TruncateBytes(resultPreview.String, 300)
}
if errorDetail.Valid && errorDetail.String != "" {
// Emit both keys: `error_detail` is the canonical field the
// Python poll-mode consumer (a2a_tools_delegation.py:184)
// reads from /delegations rows — without it, poll-mode
// silently loses the failure reason and falls through to
// the generic "delegation failed" string. `error` is kept
// for back-compat with existing UI surfaces that read the
// shorter name.
entry["error_detail"] = errorDetail.String
entry["error"] = errorDetail.String
}
if lastHeartbeat != nil {
@@ -784,6 +816,8 @@ func (h *DelegationHandler) listDelegationsFromActivityLogs(ctx context.Context,
entry["delegation_id"] = delegationID
}
if errorDetail != "" {
// Emit both keys per the rename: see listDelegationsFromLedger.
entry["error_detail"] = errorDetail
entry["error"] = errorDetail
}
if responseBody != "" {
@@ -16,6 +16,65 @@ import (
"github.com/gin-gonic/gin"
)
// ---------- internal#497 regression: detached goroutine ctx must outlive the handler ----------
// TestDelegate_DetachedContext_SurvivesRequestCancellation pins the
// load-bearing invariant that regression ce2db75f violated: the context
// handed to executeDelegation in the fire-and-forget goroutine must NOT be
// cancelled when the HTTP handler returns 202 (which cancels
// c.Request.Context()). Before the fix, executeDelegation ran on the
// request-scoped ctx, so every DB op + proxy call failed `context
// canceled` the instant the 202 was written — silently breaking 100% of
// A2A peer delegations fleet-wide since 2026-05-12.
//
// This test asserts the exact ctx-derivation contract used by Delegate
// (context.WithoutCancel(parent) + a timeout budget): the derived context
// (a) stays alive after the parent is cancelled, and (b) still carries
// parent values (trace/correlation/tenant ids the downstream proxy +
// broadcaster read off ctx). It is intentionally DB-free and fast.
func TestDelegate_DetachedContext_SurvivesRequestCancellation(t *testing.T) {
type ctxKey string
const traceKey ctxKey = "trace-id"
// Simulate c.Request.Context() carrying a correlation value.
parent, cancelParent := context.WithCancel(
context.WithValue(context.Background(), traceKey, "trace-abc-123"),
)
// Exact derivation Delegate uses for the detached goroutine.
delegationCtx, cancelDelegation := context.WithTimeout(
context.WithoutCancel(parent), 30*time.Minute,
)
defer cancelDelegation()
// The HTTP handler "returns 202" → request context is cancelled.
cancelParent()
if err := parent.Err(); err == nil {
t.Fatal("precondition: parent context should be cancelled after the handler returns")
}
// (a) Cancellation MUST NOT propagate to the detached context.
select {
case <-delegationCtx.Done():
t.Fatalf("regression: detached delegation ctx was cancelled by the handler returning (err=%v) — executeDelegation would fail every DB op with `context canceled`", delegationCtx.Err())
default:
// alive — correct
}
// (b) Parent values MUST still be readable (WithoutCancel preserves
// values; trace/correlation/tenant ids the proxy + broadcaster use).
if got, _ := delegationCtx.Value(traceKey).(string); got != "trace-abc-123" {
t.Errorf("detached ctx lost the parent trace value: got %q, want %q", got, "trace-abc-123")
}
// And it still has a real deadline (the 30m budget), so it is not an
// unbounded background context.
if _, hasDeadline := delegationCtx.Deadline(); !hasDeadline {
t.Error("detached ctx must carry the 30-minute timeout budget, but has no deadline")
}
}
// ---------- Delegate: missing target_id → 400 ----------
func TestDelegate_MissingTargetID(t *testing.T) {
@@ -1487,6 +1546,71 @@ func TestListDelegations_LedgerEmptyFallsBackToActivityLogs(t *testing.T) {
}
}
// ---------- ListDelegations: activity_logs failed row emits BOTH error + error_detail ----------
// Field-rename pin (P1 #348 / RFC #2829 PR-2 follow-up): the legacy
// activity_logs fallback path must also emit `error_detail` alongside
// the historical `error` key. Without this, poll-mode (which reads
// `error_detail`) silently loses the failure reason when the ledger
// is empty and the handler falls back to activity_logs.
func TestListDelegations_ActivityLogsFailedEmitsBothErrorKeys(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
// Ledger empty → fall back to activity_logs.
mock.ExpectQuery("SELECT d.delegation_id, d.caller_id, d.callee_id, d.task_preview").
WithArgs("ws-source").
WillReturnRows(sqlmock.NewRows([]string{
"delegation_id", "caller_id", "callee_id", "task_preview",
"status", "result_preview", "error_detail", "last_heartbeat",
"deadline", "created_at", "updated_at",
}))
now := time.Now()
activityRows := sqlmock.NewRows([]string{
"id", "activity_type", "source_id", "target_id",
"summary", "status", "error_detail", "response_body",
"delegation_id", "created_at",
}).AddRow(
"act-failed", "delegate_result", "ws-source", "ws-target",
"Delegation failed", "error", "codex runtime timed out", "",
"del-failed-002", now,
)
mock.ExpectQuery("SELECT id, activity_type").
WithArgs("ws-source").
WillReturnRows(activityRows)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-source"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-source/delegations", nil)
dh.ListDelegations(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
if len(resp) != 1 {
t.Fatalf("expected 1 row, got %d", len(resp))
}
if resp[0]["error"] != "codex runtime timed out" {
t.Errorf("expected `error` field set, got %v", resp[0]["error"])
}
if resp[0]["error_detail"] != "codex runtime timed out" {
t.Errorf("expected `error_detail` field set (poll-mode contract), got %v", resp[0]["error_detail"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// ---------- ListDelegations: both ledger and activity_logs empty → [] ----------
func TestListDelegations_BothEmptyReturnsEmptyArray(t *testing.T) {
@@ -1685,7 +1809,15 @@ func TestListDelegations_LedgerFailedIncludesErrorDetail(t *testing.T) {
t.Errorf("expected status 'failed', got %v", resp[0]["status"])
}
if resp[0]["error"] != "Callee workspace not reachable" {
t.Errorf("expected error detail, got %v", resp[0]["error"])
t.Errorf("expected error detail under `error`, got %v", resp[0]["error"])
}
// Field-rename pin (P1 #348 / RFC #2829 PR-2 follow-up): the
// Python poll-mode consumer in a2a_tools_delegation.py:184 reads
// `error_detail`, not `error`. Both keys MUST be present so polling
// surfaces the real failure reason instead of falling through to
// the generic "delegation failed" string.
if resp[0]["error_detail"] != "Callee workspace not reachable" {
t.Errorf("expected error detail under `error_detail`, got %v", resp[0]["error_detail"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
@@ -8,6 +8,7 @@ import (
"fmt"
"net/http"
"net/http/httptest"
"sync"
"testing"
"time"
@@ -22,8 +23,39 @@ import (
"github.com/redis/go-redis/v9"
)
// liveTestHandlers tracks every WorkspaceHandler built during the test
// binary's lifetime so setupTestDB can drain their in-flight goAsync
// goroutines (notably the detached RestartByID restart cycle, which
// reads the global db.DB) BEFORE restoring db.DB. Without this drain a
// fire-and-forget restart goroutine spawned by one test outlives that
// test and races the db.DB swap in a later test's t.Cleanup — the
// 0x...d548 data race on platform/internal/db.DB.
var (
liveTestHandlersMu sync.Mutex
liveTestHandlers []*WorkspaceHandler
)
func init() {
gin.SetMode(gin.TestMode)
newHandlerHook = func(h *WorkspaceHandler) {
liveTestHandlersMu.Lock()
liveTestHandlers = append(liveTestHandlers, h)
liveTestHandlersMu.Unlock()
}
}
// drainTestAsync waits for every tracked handler's goAsync goroutines to
// finish. Called from setupTestDB's cleanup before db.DB is restored so
// no detached restart/provision goroutine is mid-read of db.DB when the
// pointer is swapped.
func drainTestAsync() {
liveTestHandlersMu.Lock()
handlers := make([]*WorkspaceHandler, len(liveTestHandlers))
copy(handlers, liveTestHandlers)
liveTestHandlersMu.Unlock()
for _, h := range handlers {
h.waitAsyncForTest()
}
}
// setupTestDB creates a sqlmock DB and assigns it to the global db.DB.
@@ -42,7 +74,16 @@ func setupTestDB(t *testing.T) sqlmock.Sqlmock {
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
t.Cleanup(func() {
// Drain detached async goroutines (e.g. goAsync(RestartByID),
// which reads db.DB in runRestartCycle before its provisioner
// gate) BEFORE swapping db.DB back. Doing the restore first
// would let an in-flight restart goroutine read db.DB while
// this line writes it — the data race this guards against.
drainTestAsync()
db.DB = prevDB
mockDB.Close()
})
// Disable SSRF checks for the duration of this test only. Restore
// the previous state via t.Cleanup so that TestIsSafeURL_* tests
@@ -35,8 +35,8 @@ func insertMCPDelegationRow(ctx context.Context, db *sql.DB, workspaceID, target
})
_, err := db.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, request_body, status)
VALUES ($1, 'delegation', 'delegate', $2, $3, $4, $5::jsonb, $6)
`, workspaceID, workspaceID, targetID, "Delegating to "+targetID, string(taskJSON), "pending")
VALUES ($1, 'delegation', 'delegate', $2, $3, $4, $5::jsonb, 'pending')
`, workspaceID, workspaceID, targetID, "Delegating to "+targetID, string(taskJSON))
return err
}
@@ -1,12 +1,8 @@
package handlers
import (
"context"
"encoding/json"
"testing"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// ─────────────────────────────────────────────────────────────────────────────
@@ -195,115 +191,3 @@ func TestExtractA2AText_PriorityArtifactsOverMessage(t *testing.T) {
t.Errorf("artifacts should take priority: got %q, want %q", got, want)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// insertMCPDelegationRow tests
// ─────────────────────────────────────────────────────────────────────────────
func TestInsertMCPDelegationRow_Success(t *testing.T) {
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
mock.ExpectExec(`INSERT INTO activity_logs`).
WithArgs("ws-src", "ws-src", "ws-tgt", "Delegating to ws-tgt", sqlmock.AnyArg(), "pending").
WillReturnResult(sqlmock.NewResult(0, 1))
err = insertMCPDelegationRow(context.Background(), mockDB, "ws-src", "ws-tgt", "del-123", "summarise the report")
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
func TestInsertMCPDelegationRow_DBError(t *testing.T) {
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
mock.ExpectExec(`INSERT INTO activity_logs`).
WithArgs("ws-src", "ws-src", "ws-tgt", sqlmock.AnyArg(), sqlmock.AnyArg(), "pending").
WillReturnError(context.DeadlineExceeded)
err = insertMCPDelegationRow(context.Background(), mockDB, "ws-src", "ws-tgt", "del-456", "check the logs")
if err == nil {
t.Error("expected error, got nil")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// updateMCPDelegationStatus tests
// ─────────────────────────────────────────────────────────────────────────────
func TestUpdateMCPDelegationStatus_Success(t *testing.T) {
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
mock.ExpectExec(`UPDATE activity_logs`).
WithArgs("completed", "", "ws-src", "del-789").
WillReturnResult(sqlmock.NewResult(0, 1))
// Should not panic, should not error
updateMCPDelegationStatus(context.Background(), mockDB, "ws-src", "del-789", "completed", "")
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
func TestUpdateMCPDelegationStatus_WithErrorDetail(t *testing.T) {
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
mock.ExpectExec(`UPDATE activity_logs`).
WithArgs("failed", "timeout", "ws-src", "del-000").
WillReturnResult(sqlmock.NewResult(0, 1))
updateMCPDelegationStatus(context.Background(), mockDB, "ws-src", "del-000", "failed", "timeout")
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
func TestUpdateMCPDelegationStatus_DBError_LoggedNotReturned(t *testing.T) {
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
mock.ExpectExec(`UPDATE activity_logs`).
WithArgs("failed", sqlmock.AnyArg(), "ws-src", "del-abc").
WillReturnError(context.DeadlineExceeded)
// Function returns no value — error is logged, not propagated.
// Verify it does not panic.
updateMCPDelegationStatus(context.Background(), mockDB, "ws-src", "del-abc", "failed", "connection refused")
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
@@ -0,0 +1,53 @@
package handlers
// plugins_install_test.go — additional coverage for plugins_install.go.
//
// Gaps filled vs. existing test files:
// - plugins_install_external_test.go: Install + Uninstall 422 (external runtime) ✓ covered
// - plugins_test.go: Install 400 (missing source, invalid body, etc.) ✓ covered
// Uninstall 400 (invalid plugin name, empty name) ✓ covered
// Download auth gate ✓ covered
// - org_import_helpers_test.go: countWorkspaces, envRequirementKey, sanitizeEnvMembers,
// flattenAndSortRequirements, collectOrgEnv ✓ covered
//
// New test added here:
// - Uninstall 503: container not running, no SaaS dispatch.
//
// NOTE: validateWorkspaceID is not called inside the Install/Uninstall handlers.
// UUID validation is the responsibility of the WorkspaceAuth middleware, so no
// 400 test is needed here for UUID format.
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/gin-gonic/gin"
"github.com/stretchr/testify/require"
)
// TestPluginUninstall_ContainerNotRunning_Returns503 exercises the 503 path
// where neither a local Docker container nor a SaaS instance-id dispatch
// resolves. The handler must return "workspace container not running" — NOT a
// generic 500 or a misleading 422 (external-runtime) message.
func TestPluginUninstall_ContainerNotRunning_Returns503(t *testing.T) {
// No docker client + no instance-id lookup → falls through to 503.
h := NewPluginsHandler(t.TempDir(), nil, nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{
{Key: "id", Value: "550e8400-e29b-41d4-a716-446655440000"},
{Key: "name", Value: "some-plugin"},
}
c.Request = httptest.NewRequest("DELETE",
"/workspaces/550e8400-e29b-41d4-a716-446655440000/plugins/some-plugin", nil)
h.Uninstall(c)
require.Equal(t, http.StatusServiceUnavailable, w.Code)
var body map[string]string
json.Unmarshal(w.Body.Bytes(), &body)
require.Equal(t, "workspace container not running", body["error"])
}
@@ -0,0 +1,141 @@
package handlers
import (
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/gin-gonic/gin"
)
func TestListRegistry_EmptyDir(t *testing.T) {
dir := t.TempDir()
h := NewPluginsHandler(dir, nil, nil)
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list, got %d plugins", len(got))
}
}
func TestListRegistry_IgnoresFiles(t *testing.T) {
dir := t.TempDir()
if err := os.WriteFile(filepath.Join(dir, "not-a-plugin.txt"), []byte("x"), 0600); err != nil {
t.Fatal(err)
}
h := NewPluginsHandler(dir, nil, nil)
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list (files ignored), got %d", len(got))
}
}
func TestListRegistry_SinglePlugin(t *testing.T) {
dir := t.TempDir()
pluginDir := filepath.Join(dir, "my-plugin")
if err := os.Mkdir(pluginDir, 0755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(pluginDir, "plugin.yaml"), []byte("name: my-plugin\nversion: 1.0.0\n"), 0600); err != nil {
t.Fatal(err)
}
h := NewPluginsHandler(dir, nil, nil)
got := h.listRegistryFiltered("")
if len(got) != 1 {
t.Fatalf("expected 1 plugin, got %d", len(got))
}
if got[0].Name != "my-plugin" {
t.Errorf("expected name 'my-plugin', got %q", got[0].Name)
}
}
func TestListRegistry_FiltersByRuntime(t *testing.T) {
dir := t.TempDir()
for _, spec := range []struct{ name, yaml string }{
{"runtime-a", "name: runtime-a\nruntimes:\n - claude-code\n"},
{"runtime-b", "name: runtime-b\nruntimes:\n - hermes\n"},
{"universal", "name: universal\nversion: 1.0.0\n"},
} {
pd := filepath.Join(dir, spec.name)
if err := os.Mkdir(pd, 0755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(pd, "plugin.yaml"), []byte(spec.yaml), 0600); err != nil {
t.Fatal(err)
}
}
h := NewPluginsHandler(dir, nil, nil)
// Filter to claude-code: runtime-a matches, universal (no runtimes field)
// is always included per supportsRuntime semantics.
got := h.listRegistryFiltered("claude-code")
if len(got) != 2 {
t.Fatalf("expected 2 (runtime-a + universal), got %d: %v", len(got), func() []string {
ns := make([]string, len(got))
for i, p := range got { ns[i] = p.Name }
return ns
}())
}
}
func TestListRegistry_PluginWithNoRuntimeDeclarations_AlwaysIncluded(t *testing.T) {
dir := t.TempDir()
pd := filepath.Join(dir, "universal-plugin")
if err := os.Mkdir(pd, 0755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(pd, "plugin.yaml"), []byte("name: universal-plugin\nversion: 1.0.0\n"), 0600); err != nil {
t.Fatal(err)
}
h := NewPluginsHandler(dir, nil, nil)
// When plugin declares no runtimes, it should always be included (try-it).
got := h.listRegistryFiltered("any-runtime")
if len(got) != 1 {
t.Errorf("expected 1 plugin (unspecified runtime), got %d", len(got))
}
}
func TestListRegistry_ReadDirError_ReturnsEmpty(t *testing.T) {
h := NewPluginsHandler("/nonexistent/path/for/plugins", nil, nil)
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list on ReadDir error, got %d", len(got))
}
}
func TestListRegistry_HTTPEndpoint(t *testing.T) {
dir := t.TempDir()
pd := filepath.Join(dir, "test-plugin")
if err := os.Mkdir(pd, 0755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(pd, "plugin.yaml"), []byte("name: test-plugin\nversion: 2.0.0\n"), 0600); err != nil {
t.Fatal(err)
}
h := NewPluginsHandler(dir, nil, nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/plugins", nil)
h.ListRegistry(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var plugins []pluginInfo
if err := json.Unmarshal(w.Body.Bytes(), &plugins); err != nil {
t.Fatalf("failed to parse JSON: %v", err)
}
if len(plugins) != 1 {
t.Errorf("expected 1 plugin, got %d", len(plugins))
}
if plugins[0].Name != "test-plugin" {
t.Errorf("expected name 'test-plugin', got %q", plugins[0].Name)
}
}
+31 -3
View File
@@ -327,7 +327,33 @@ func (h *RegistryHandler) Register(c *gin.Context) {
}
}
agentCardStr := string(payload.AgentCard)
// Reconcile the runtime-supplied card's identity fields against the
// trusted workspaces row before storing. The runtime builds its card
// from config.name, which the CP-regenerated /configs/config.yaml
// sets to the workspace UUID — so without this the stored card
// served at /.well-known/agent-card.json and returned to peers via
// agent_card_url has name = UUID, description = "", role = null even
// though the operator-controlled workspaces.name holds the friendly
// name the canvas shows. We only FILL gaps from the DB (never
// downgrade a card that already carries a real name); identity stays
// platform-controlled — the agent cannot self-set these. Best-effort:
// a lookup failure leaves the card exactly as the runtime sent it
// (no-worse-than-before). See agent_card_reconcile.go.
reconciledCard := payload.AgentCard
{
var dbName, dbRole sql.NullString
if qErr := db.DB.QueryRowContext(ctx,
`SELECT name, role FROM workspaces WHERE id = $1`, payload.ID,
).Scan(&dbName, &dbRole); qErr == nil {
if rc, did := reconcileAgentCardIdentity(
payload.AgentCard, payload.ID, dbName.String, dbRole.String,
); did {
reconciledCard = rc
log.Printf("Registry register: reconciled agent_card identity for %s from workspaces row", payload.ID)
}
}
}
agentCardStr := string(reconciledCard)
// urlForUpsert: poll-mode workspaces don't need a URL. Empty input
// becomes NULL via sql.NullString so the row's URL stays clean (the
@@ -413,10 +439,12 @@ func (h *RegistryHandler) Register(c *gin.Context) {
}
}
// Broadcast WORKSPACE_ONLINE
// Broadcast WORKSPACE_ONLINE — use the reconciled card so the canvas
// Agent Card view live-updates with the friendly name, matching what
// was just persisted (not the runtime's raw UUID-name card).
if err := h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), payload.ID, map[string]interface{}{
"url": cachedURL,
"agent_card": payload.AgentCard,
"agent_card": reconciledCard,
"delivery_mode": effectiveMode,
}); err != nil {
log.Printf("Registry broadcast error: %v", err)
@@ -56,8 +56,10 @@ const (
// (an externally routable address) is used directly.
func (h *WorkspaceHandler) gracefulPreRestart(ctx context.Context, workspaceID string) {
// Non-blocking send — don't stall the restart cycle.
// Run in a detached goroutine so the caller (runRestartCycle) can
// proceed to stopForRestart without waiting.
// Run in a tracked async goroutine (goAsync, not bare `go`) so the
// caller (runRestartCycle) can proceed to stopForRestart without
// waiting, while the test harness can still drain it before swapping
// the global db.DB (resolveAgentURLForRestartSignal reads db.DB).
h.goAsync(func() {
signalCtx, cancel := context.WithTimeout(context.Background(), restartSignalTimeout)
defer cancel()
@@ -86,6 +86,40 @@ var fallbackRuntimes = map[string]struct{}{
"mock": {},
}
// stripJSON5Comments removes // single-line comments from JSON5-formatted
// data. The Integration Tester appends "// Triggered by <job>" to
// manifest.json after cloning, which causes json.Unmarshal to fail with
// "invalid character '/'". This strips trailing and mid-file comments
// before parsing so Go's strict JSON parser accepts JSON5 files.
//
// Handles:
// - Standalone comment lines: // comment
// - Trailing comments: "key": "value", // comment
// - Comments inside strings are NOT touched ("http://example.com")
func stripJSON5Comments(data []byte) []byte {
var result []byte
inString := false
i := 0
for i < len(data) {
if data[i] == '"' && (i == 0 || data[i-1] != '\\') {
inString = !inString
result = append(result, data[i])
i++
continue
}
if !inString && i+1 < len(data) && data[i] == '/' && data[i+1] == '/' {
// Skip to end of line
for i < len(data) && data[i] != '\n' {
i++
}
continue
}
result = append(result, data[i])
i++
}
return result
}
// loadRuntimesFromManifest builds the runtime allowlist from
// manifest.json. Each workspace_templates[].name is normalized to its
// base runtime identifier (strips the `-default` suffix templates
@@ -101,6 +135,9 @@ func loadRuntimesFromManifest(path string) (map[string]struct{}, error) {
if err != nil {
return nil, err
}
// Strip JSON5 // comments before parsing. The Integration Tester
// appends "// Triggered by <job>" to manifest.json after cloning.
data = stripJSON5Comments(data)
var m manifestFile
if err := json.Unmarshal(data, &m); err != nil {
return nil, err
@@ -8,6 +8,7 @@ package handlers
// fallback (tested at the initKnownRuntimes level via integration).
import (
"encoding/json"
"os"
"path/filepath"
"testing"
@@ -111,3 +112,133 @@ func keys(m map[string]struct{}) []string {
}
return out
}
// ── stripJSON5Comments tests ─────────────────────────────────────────────────
func TestStripJSON5Comments_Standalone(t *testing.T) {
// Whitespace before the // is preserved; only the // and its text are removed.
// The result is still valid JSON.
input := "{\n\t// This is a comment\n\t\"workspace_templates\": []\n}"
got := string(stripJSON5Comments([]byte(input)))
// Stripping should produce valid JSON: try parsing it
var m manifestFile
if err := json.Unmarshal([]byte(got), &m); err != nil {
t.Errorf("output is not valid JSON: %v\ngot: %q", err, got)
}
if m.WorkspaceTemplates == nil {
t.Error("WorkspaceTemplates field should be present after parsing")
}
}
func TestStripJSON5Comments_Trailing(t *testing.T) {
input := `{"key": "value"} // trailing comment`
got := string(stripJSON5Comments([]byte(input)))
want := `{"key": "value"} `
if got != want {
t.Errorf("trailing comment:\ngot: %q\nwant: %q", got, want)
}
}
func TestStripJSON5Comments_URLsPreserved(t *testing.T) {
// URLs with // in them must NOT be stripped
input := `{"url": "https://example.com/path"}`
got := string(stripJSON5Comments([]byte(input)))
if got != input {
t.Errorf("URL stripped: got %q, want %q", got, input)
}
}
func TestStripJSON5Comments_InlineComment(t *testing.T) {
// Whitespace before // is preserved; only the comment text is removed.
// Result must be valid JSON.
input := "{\n\t\"workspace_templates\": [] // inline comment\n}"
got := string(stripJSON5Comments([]byte(input)))
var m manifestFile
if err := json.Unmarshal([]byte(got), &m); err != nil {
t.Errorf("output is not valid JSON: %v\ngot: %q", err, got)
}
}
func TestStripJSON5Comments_IntegrationTesterAppend(t *testing.T) {
// Simulates what the Integration Tester does: appends // Triggered by ...
// to the end of a valid JSON file.
input := `{
"workspace_templates": [
{"name": "hermes", "repo": "org/hermes"}
]
}
// Triggered by e2e-test job 12345
`
got := string(stripJSON5Comments([]byte(input)))
if got == input {
t.Error("Integration Tester comment was NOT stripped")
}
// After stripping, it should be valid JSON
var m manifestFile
if err := json.Unmarshal([]byte(got), &m); err != nil {
t.Errorf("stripped content is not valid JSON: %v\ngot: %q", err, got)
}
if len(m.WorkspaceTemplates) != 1 || m.WorkspaceTemplates[0].Name != "hermes" {
t.Errorf("workspace_templates not parsed: %+v", m.WorkspaceTemplates)
}
}
func TestStripJSON5Comments_NoComments(t *testing.T) {
input := `{"workspace_templates": [{"name": "hermes"}]}`
got := string(stripJSON5Comments([]byte(input)))
if got != input {
t.Errorf("unmodified JSON changed: got %q", got)
}
}
// ── loadRuntimesFromManifest with JSON5 comments ─────────────────────────────
func TestLoadRuntimesFromManifest_WithJSON5TrailingComment(t *testing.T) {
// Regression: Integration Tester appends "// Triggered by ..." to manifest.json
// after cloning. Before the fix, this caused json.Unmarshal to fail with
// "invalid character '/'". After the fix, the comment is stripped first.
dir := t.TempDir()
path := filepath.Join(dir, "manifest.json")
manifest := `{
"workspace_templates": [
{"name": "hermes", "repo": "org/hermes"},
{"name": "claude-code-default", "repo": "org/cc"}
]
}
// Triggered by e2e-test job
`
if err := os.WriteFile(path, []byte(manifest), 0600); err != nil {
t.Fatalf("write: %v", err)
}
got, err := loadRuntimesFromManifest(path)
if err != nil {
t.Fatalf("loadRuntimesFromManifest with trailing comment: %v", err)
}
for _, must := range []string{"hermes", "claude-code"} {
if _, ok := got[must]; !ok {
t.Errorf("expected runtime %q in result: %v", must, keys(got))
}
}
}
func TestLoadRuntimesFromManifest_WithInlineJSON5Comment(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "manifest.json")
// JSON5 with inline comment
manifest := `{
// runtime templates
"workspace_templates": [
{"name": "langgraph", "repo": "org/lg"} // the default
]
}`
if err := os.WriteFile(path, []byte(manifest), 0600); err != nil {
t.Fatalf("write: %v", err)
}
got, err := loadRuntimesFromManifest(path)
if err != nil {
t.Fatalf("loadRuntimesFromManifest with inline comment: %v", err)
}
if _, ok := got["langgraph"]; !ok {
t.Errorf("expected langgraph in result: %v", keys(got))
}
}
@@ -0,0 +1,117 @@
package handlers
// template_files_agent_home_stub_test.go — pins the Phase-1 stub
// contract for the /agent-home root added by internal#425 RFC.
//
// Today (pre-Phase-2b), every Files API verb against `?root=/agent-home`
// must return HTTP 501 with the canonical pending-message body. The
// stub MUST NOT:
// 1. Hit the DB (the workspace might not even exist yet from the
// canvas's POV — the root selector is testable without one).
// 2. Touch the EIC tunnel / Docker / template-dir paths — those
// would 500/404/[] depending on the env and confuse the canvas.
// 3. Accept writes/deletes that the future docker-exec backend
// would reject — fail closed.
//
// When Phase 2b lands, this file gets replaced by a real
// docker-exec dispatch test; the stub-message constant in
// templates.go disappears.
import (
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
)
// TestAgentHomeAllowedRoot pins that /agent-home is in the allowedRoots
// set. Without this, a future refactor that drops the key would
// silently degrade the canvas root selector to a 400 instead of the
// stub 501.
func TestAgentHomeAllowedRoot(t *testing.T) {
if !allowedRoots["/agent-home"] {
t.Fatal("/agent-home must be in allowedRoots — RFC #425 contract")
}
}
// TestAgentHomeStub_AllVerbs_Return501 pins the canonical stub
// response across all four verbs. Each must:
//
// - status 501
// - body contains the canonical "/agent-home not implemented" prefix
// - NOT contain "workspace not found" (proves we short-circuit before
// the DB lookup)
//
// Driven as a table to keep symmetry — adding a fifth verb in the
// future means adding one row here.
func TestAgentHomeStub_AllVerbs_Return501(t *testing.T) {
cases := []struct {
name string
method string
invoke func(c *gin.Context)
}{
{
name: "ListFiles",
method: "GET",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).ListFiles(c) },
},
{
name: "ReadFile",
method: "GET",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).ReadFile(c) },
},
{
name: "WriteFile",
method: "PUT",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).WriteFile(c) },
},
{
name: "DeleteFile",
method: "DELETE",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).DeleteFile(c) },
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{
{Key: "id", Value: "ws-stub"},
// Path param without leading slash so DeleteFile's
// filepath.IsAbs guard doesn't 400 before the root
// dispatch runs. The List/Read/Write paths strip the
// leading slash themselves and accept either form.
{Key: "path", Value: "notes.md"},
}
// WriteFile binds JSON; provide a minimal valid body so the
// short-circuit isn't masked by the bind-error path.
var body string
if tc.method == "PUT" {
body = `{"content":"x"}`
}
c.Request = httptest.NewRequest(
tc.method,
"/workspaces/ws-stub/files/notes.md?root=/agent-home",
strings.NewReader(body),
)
if body != "" {
c.Request.Header.Set("Content-Type", "application/json")
}
tc.invoke(c)
if w.Code != http.StatusNotImplemented {
t.Fatalf("expected 501, got %d: %s", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "/agent-home not implemented") {
t.Errorf("body should contain canonical stub message; got %s", w.Body.String())
}
if strings.Contains(w.Body.String(), "workspace not found") {
t.Errorf("stub leaked through to DB lookup; body=%s", w.Body.String())
}
})
}
}
@@ -19,6 +19,7 @@ package handlers
import (
"bytes"
"context"
"errors"
"fmt"
"log"
"os"
@@ -357,6 +358,28 @@ func writeFileViaEIC(ctx context.Context, instanceID, runtime, root, relPath str
var stderr bytes.Buffer
sshCmd.Stderr = &stderr
if err := sshCmd.Run(); err != nil {
// When the per-op context deadline (eicFileOpTimeout) fires,
// exec.CommandContext SIGKILLs the ssh subprocess and Run()
// returns the bare "signal: killed" with empty stderr. That
// surfaced to the canvas as an opaque
// `500 {"error":"ssh install: signal: killed ()"}` which gave
// the operator no idea the workspace was simply mid-provision
// with a slow/unready EIC tunnel (internal#423). Detect the
// deadline explicitly and return an actionable message instead
// — the EIC mechanism, timeout value, and success path are all
// unchanged; this only improves the error a stuck write emits.
if cerr := ctx.Err(); cerr != nil {
reason := "timed out after " + eicFileOpTimeout.String()
if errors.Is(cerr, context.Canceled) && !errors.Is(cerr, context.DeadlineExceeded) {
reason = "was cancelled"
}
return fmt.Errorf(
"ssh install: EIC tunnel to workspace %s — "+
"the workspace may still be provisioning (slow/unready SSH); "+
"retry once it is online, or apply provider credentials via "+
"Settings → Secrets (encrypted, does not use this file-write path)",
reason)
}
return fmt.Errorf("ssh install: %w (%s)", err, strings.TrimSpace(stderr.String()))
}
log.Printf("writeFileViaEIC: ws instance=%s runtime=%s root=%s wrote %d bytes → %s",
@@ -0,0 +1,71 @@
package handlers
// template_files_eic_write_timeout_test.go — pins the actionable-error
// behavior added for internal#423.
//
// When the per-op context deadline (eicFileOpTimeout) fires,
// exec.CommandContext SIGKILLs the ssh subprocess and Run() returns the
// bare "signal: killed" with empty stderr. Before the fix that surfaced
// to the canvas as an opaque `500 {"error":"ssh install: signal:
// killed ()"}` — useless to an operator whose workspace was simply
// mid-provision with a slow/unready EIC tunnel. The fix detects the
// deadline explicitly (errors.Is(ctx.Err(), context.DeadlineExceeded))
// and returns a message that names the cause and the
// Settings → Secrets workaround.
import (
"context"
"strings"
"testing"
"time"
)
// TestWriteFileViaEIC_DeadlineExceeded_ActionableError stubs
// withEICTunnel so the *real* inner closure runs against a context that
// has already exceeded its deadline. The ssh subprocess fails (no real
// sshd on the fake port) and ctx.Err() == DeadlineExceeded, so the new
// branch must fire and produce an actionable message — NOT the opaque
// "signal: killed ()" string the canvas used to show.
func TestWriteFileViaEIC_DeadlineExceeded_ActionableError(t *testing.T) {
prev := withEICTunnel
withEICTunnel = func(_ context.Context, instanceID string, fn func(s eicSSHSession) error) error {
// Run the real inner closure. It closes over the ctx that
// writeFileViaEIC derived from our already-cancelled parent, so
// the ssh subprocess is killed immediately and ctx.Err()
// resolves — exactly the eicFileOpTimeout-expiry shape.
return fn(eicSSHSession{
instanceID: instanceID,
osUser: "ubuntu",
localPort: 1, // nothing listening → ssh fails fast
keyPath: "/nonexistent/key",
})
}
t.Cleanup(func() { withEICTunnel = prev })
// Drive the real writeFileViaEIC. Pass a parent whose deadline has
// already passed: the context.WithTimeout(ctx, eicFileOpTimeout)
// derived inside writeFileViaEIC inherits the expired parent
// deadline, so ctx.Err() == context.DeadlineExceeded by the time
// the killed ssh subprocess returns — the exact production shape
// (eicFileOpTimeout expiry), exercised deterministically.
parent, cancel := context.WithDeadline(context.Background(), time.Now().Add(-time.Second))
defer cancel()
err := writeFileViaEIC(parent, "i-test", "claude-code", "/configs", "config.yaml", []byte("model: sonnet\n"))
if err == nil {
t.Fatalf("expected an error from a killed ssh subprocess, got nil")
}
msg := err.Error()
// Must NOT leak the opaque bare-signal string to the operator.
if strings.Contains(msg, "signal: killed ()") {
t.Fatalf("error still surfaces the opaque %q form: %q", "signal: killed ()", msg)
}
// Must name the cause and the Secrets workaround so the canvas
// shows something actionable.
for _, want := range []string{"timed out", "provisioning", "Settings", "Secrets"} {
if !strings.Contains(msg, want) {
t.Errorf("actionable error missing %q; got: %q", want, msg)
}
}
}
@@ -18,11 +18,35 @@ import (
)
// allowedRoots are the container paths that the Files API can browse.
//
// `/agent-home` (added 2026-05-15, internal#425 RFC) is the container's
// own $HOME — `/root` for openclaw, `/home/agent` for claude-code/hermes
// — browsed via `docker exec` rather than host-side `find`. The
// dispatch is stubbed today (returns 501); full implementation lands in
// Phase 2b of the RFC. The allowedRoots key is added now so the canvas
// can design its root-selector UI against the final shape and the
// stub-vs-full transition is server-side only.
var allowedRoots = map[string]bool{
"/configs": true,
"/workspace": true,
"/home": true,
"/plugins": true,
"/configs": true,
"/workspace": true,
"/home": true,
"/plugins": true,
"/agent-home": true,
}
// agentHomeStubMessage is the body returned by every Files API verb
// when `?root=/agent-home` is requested before Phase 2b lands. Keep the
// status code 501 (Not Implemented) — the route exists, the verb is
// understood, but the handler is unimplemented. Distinguishes from
// 400/404 so a canvas behind a less-current server can render a clean
// "feature pending" state instead of a generic error.
const agentHomeStubMessage = "/agent-home not implemented yet (internal#425 RFC Phase 2b — docker-exec backend pending)"
// isAgentHomeStubRequest returns true when the request targets the
// stubbed /agent-home root. Centralised so every verb in this file
// short-circuits with the same response shape.
func isAgentHomeStubRequest(rootPath string) bool {
return rootPath == "/agent-home"
}
// maxUploadFiles limits the number of files in a single import/replace.
@@ -224,7 +248,14 @@ func (h *TemplatesHandler) ListFiles(c *gin.Context) {
// ?depth= — max depth to recurse (default: 1, max: 5)
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
// /agent-home dispatch is stubbed pre-Phase-2b. Short-circuit before
// the DB lookup + EIC dance so a canvas exercising the new root key
// gets a clean 501 instead of a half-effort response.
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
subPath := c.DefaultQuery("path", "")
@@ -393,7 +424,11 @@ func (h *TemplatesHandler) ReadFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
@@ -506,7 +541,11 @@ func (h *TemplatesHandler) WriteFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
var wsName, instanceID, runtime string
@@ -583,7 +622,11 @@ func (h *TemplatesHandler) DeleteFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
var wsName, instanceID, runtime string
@@ -10,8 +10,20 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
// validWorkspaceID returns true when id is a syntactically valid UUID.
// workspace_id is a `uuid` column; passing a non-UUID (e.g. the canvas
// "global" sentinel sent when no node is selected) makes Postgres raise
// `invalid input syntax for type uuid`, which previously leaked as an
// opaque 500. Reject up front with a clean 400 instead. Mirrors the
// uuid.Parse guard already used in handlers/activity.go.
func validWorkspaceID(id string) bool {
_, err := uuid.Parse(id)
return err == nil
}
// TokenHandler exposes user-facing token management for workspaces.
// Routes: GET/POST/DELETE /workspaces/:id/tokens (behind WorkspaceAuth).
type TokenHandler struct{}
@@ -31,6 +43,10 @@ type tokenListItem struct {
// never the plaintext or hash).
func (h *TokenHandler) List(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
limit := 50
if v := c.Query("limit"); v != "" {
@@ -53,6 +69,7 @@ func (h *TokenHandler) List(c *gin.Context) {
LIMIT $2 OFFSET $3
`, workspaceID, limit, offset)
if err != nil {
log.Printf("tokens: list query failed for workspace %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to list tokens"})
return
}
@@ -85,6 +102,10 @@ const maxTokensPerWorkspace = 50
// exactly once in the response — it cannot be recovered afterwards.
func (h *TokenHandler) Create(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
// Rate limit: max active tokens per workspace
var count int
@@ -117,6 +138,10 @@ func (h *TokenHandler) Create(c *gin.Context) {
func (h *TokenHandler) Revoke(c *gin.Context) {
workspaceID := c.Param("id")
tokenID := c.Param("tokenId")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
result, err := db.DB.ExecContext(c.Request.Context(), `
UPDATE workspace_auth_tokens
@@ -41,6 +41,15 @@ import (
func init() { gin.SetMode(gin.TestMode) }
// Workspace IDs are validated as UUIDs up front (tokens.go validWorkspaceID),
// so handler tests must pass syntactically valid UUIDs. Fixed values keep
// sqlmock WithArgs assertions deterministic.
const (
wsUUID1 = "11111111-1111-1111-1111-111111111111"
wsUUID2 = "22222222-2222-2222-2222-222222222222"
wsUUID3 = "33333333-3333-3333-3333-333333333333"
)
// withMockDB swaps `db.DB` for a sqlmock and returns the mock plus a
// restore func. Tests use this in place of setupTokenTestDB which
// skips on a missing real DB.
@@ -81,13 +90,13 @@ func TestTokenHandler_List_HappyPath(t *testing.T) {
created := time.Date(2026, 4, 1, 12, 0, 0, 0, time.UTC)
last := created.Add(time.Hour)
mock.ExpectQuery(`SELECT id, prefix, created_at, last_used_at\s+FROM workspace_auth_tokens`).
WithArgs("ws-1", 50, 0).
WithArgs(wsUUID1, 50, 0).
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}).
AddRow("tok-1", "abc12345", created, last).
AddRow("tok-2", "def67890", created, nil))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
@@ -121,7 +130,7 @@ func TestTokenHandler_List_EmptyResult(t *testing.T) {
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-2/tokens", gin.Params{{Key: "id", Value: "ws-2"}})
"/workspaces/ws-2/tokens", gin.Params{{Key: "id", Value: wsUUID2}})
if w.Code != http.StatusOK {
t.Fatalf("expected 200 on empty list, got %d", w.Code)
@@ -146,7 +155,7 @@ func TestTokenHandler_List_QueryError(t *testing.T) {
WillReturnError(errors.New("connection refused"))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-3/tokens", gin.Params{{Key: "id", Value: "ws-3"}})
"/workspaces/ws-3/tokens", gin.Params{{Key: "id", Value: wsUUID3}})
if w.Code != http.StatusInternalServerError {
t.Errorf("query error must surface as 500, got %d", w.Code)
@@ -158,13 +167,13 @@ func TestTokenHandler_List_RespectsLimit(t *testing.T) {
defer cleanup()
mock.ExpectQuery(`SELECT id, prefix, created_at, last_used_at`).
WithArgs("ws-1", 10, 5).
WithArgs(wsUUID1, 10, 5).
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/tokens?limit=10&offset=5", nil)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Params = gin.Params{{Key: "id", Value: wsUUID1}}
NewTokenHandler().List(c)
if w.Code != http.StatusOK {
@@ -186,7 +195,7 @@ func TestTokenHandler_List_ScanError(t *testing.T) {
AddRow("tok-1", "abc", "not-a-timestamp", nil))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusInternalServerError {
t.Errorf("scan error must surface as 500, got %d: %s", w.Code, w.Body.String())
@@ -201,11 +210,11 @@ func TestTokenHandler_Create_RateLimited(t *testing.T) {
// Count query returns 50 (== max) → 429.
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs("ws-1").
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(50))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusTooManyRequests {
t.Errorf("max active tokens should 429, got %d", w.Code)
@@ -225,7 +234,7 @@ func TestTokenHandler_Create_IssueFails(t *testing.T) {
WillReturnError(errors.New("disk full"))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusInternalServerError {
t.Errorf("IssueToken DB error must 500, got %d", w.Code)
@@ -242,7 +251,7 @@ func TestTokenHandler_Create_HappyPath(t *testing.T) {
WillReturnResult(sqlmock.NewResult(1, 1))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusCreated {
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
@@ -257,7 +266,7 @@ func TestTokenHandler_Create_HappyPath(t *testing.T) {
if body.AuthToken == "" {
t.Errorf("auth_token must be present and non-empty in response")
}
if body.WorkspaceID != "ws-1" {
if body.WorkspaceID != wsUUID1 {
t.Errorf("workspace_id mismatch: %q", body.WorkspaceID)
}
}
@@ -269,12 +278,12 @@ func TestTokenHandler_Revoke_HappyPath(t *testing.T) {
defer cleanup()
mock.ExpectExec(`UPDATE workspace_auth_tokens\s+SET revoked_at = now\(\)`).
WithArgs("tok-1", "ws-1").
WithArgs("tok-1", wsUUID1).
WillReturnResult(sqlmock.NewResult(0, 1))
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-1", gin.Params{
{Key: "id", Value: "ws-1"},
{Key: "id", Value: wsUUID1},
{Key: "tokenId", Value: "tok-1"},
})
@@ -289,12 +298,12 @@ func TestTokenHandler_Revoke_NotFound(t *testing.T) {
// 0 rows affected → token not found OR already revoked.
mock.ExpectExec(`UPDATE workspace_auth_tokens`).
WithArgs("tok-ghost", "ws-1").
WithArgs("tok-ghost", wsUUID1).
WillReturnResult(sqlmock.NewResult(0, 0))
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-ghost", gin.Params{
{Key: "id", Value: "ws-1"},
{Key: "id", Value: wsUUID1},
{Key: "tokenId", Value: "tok-ghost"},
})
@@ -312,7 +321,7 @@ func TestTokenHandler_Revoke_DBError(t *testing.T) {
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-1", gin.Params{
{Key: "id", Value: "ws-1"},
{Key: "id", Value: wsUUID1},
{Key: "tokenId", Value: "tok-1"},
})
@@ -321,6 +330,59 @@ func TestTokenHandler_Revoke_DBError(t *testing.T) {
}
}
// ---- UUID validation (regression: "global" sentinel 500) ------------
// The canvas Settings → Workspace Tokens tab sent the literal sentinel
// "global" as the workspace id when no node was selected. workspace_id
// is a `uuid` column, so the query raised
// `invalid input syntax for type uuid: "global"` which leaked as an
// opaque 500. List/Create/Revoke now reject any non-UUID id with a
// clean 400 before touching the DB. No DB expectation is set on the
// mock — a DB hit would fail ExpectationsWereMet, proving short-circuit.
func TestTokenHandler_RejectsNonUUIDWorkspaceID(t *testing.T) {
h := NewTokenHandler()
cases := []struct {
name string
run func(c *gin.Context)
method string
params gin.Params
}{
{"List", h.List, "GET", gin.Params{{Key: "id", Value: "global"}}},
{"Create", h.Create, "POST", gin.Params{{Key: "id", Value: "global"}}},
{"Revoke", h.Revoke, "DELETE", gin.Params{
{Key: "id", Value: "global"},
{Key: "tokenId", Value: "tok-1"},
}},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
w := makeReq(t, tc.run, tc.method,
"/workspaces/global/tokens", tc.params)
if w.Code != http.StatusBadRequest {
t.Fatalf("%s with non-UUID id must 400, got %d: %s",
tc.name, w.Code, w.Body.String())
}
var body struct {
Error string `json:"error"`
}
_ = json.Unmarshal(w.Body.Bytes(), &body)
if body.Error != "invalid workspace id" {
t.Errorf("%s: want error=%q, got %q",
tc.name, "invalid workspace id", body.Error)
}
// No query/exec was expected → if the handler hit the DB
// this fails, proving the guard short-circuits before SQL.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("%s leaked a DB call past the uuid guard: %v", tc.name, err)
}
})
}
}
// Compile-time noise removal: the imports list pulls in the sql /
// driver packages and the silenced ctx so a future scenario that
// needs them doesn't have to re-add the import. Documented here so
@@ -11,6 +11,7 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
func init() { gin.SetMode(gin.TestMode) }
@@ -167,11 +168,14 @@ func TestTokenHandler_RevokeWrongWorkspace(t *testing.T) {
h := NewTokenHandler()
// Try to revoke with a different workspace ID — should 404
// Try to revoke with a different (valid-UUID) workspace ID that does
// not own the token — should 404. A valid UUID is required so this
// exercises the ownership branch, not the up-front uuid-shape 400.
otherWS := uuid.NewString()
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "wrong-workspace-id"}, {Key: "tokenId", Value: tokenID}}
c.Request = httptest.NewRequest("DELETE", "/workspaces/wrong/tokens/"+tokenID, nil)
c.Params = gin.Params{{Key: "id", Value: otherWS}, {Key: "tokenId", Value: tokenID}}
c.Request = httptest.NewRequest("DELETE", "/workspaces/"+otherWS+"/tokens/"+tokenID, nil)
h.Revoke(c)
if w.Code != http.StatusNotFound {
@@ -80,6 +80,15 @@ type WorkspaceHandler struct {
asyncWG sync.WaitGroup
}
// newHandlerHook, when non-nil, is invoked for every WorkspaceHandler
// created via NewWorkspaceHandler. It is nil in production (zero cost);
// the test harness sets it so setupTestDB can drain every handler's
// in-flight async goroutines before swapping the global db.DB. Without
// this, a detached restart goroutine (maybeMarkContainerDead ->
// goAsync(RestartByID) -> runRestartCycle reads db.DB) races the
// db.DB restore in another test's t.Cleanup.
var newHandlerHook func(*WorkspaceHandler)
func (h *WorkspaceHandler) goAsync(fn func()) {
h.asyncWG.Add(1)
go func() {
@@ -108,6 +117,9 @@ func NewWorkspaceHandler(b events.EventEmitter, p *provisioner.Provisioner, plat
if p != nil {
h.provisioner = p
}
if newHandlerHook != nil {
newHandlerHook(h)
}
return h
}
@@ -0,0 +1,193 @@
package handlers
import (
"bytes"
"database/sql"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
// patchReq builds a gin context for a PATCH request to /workspaces/:id/abilities.
func patchReq(id, body string) (*http.Request, *httptest.ResponseRecorder, *gin.Context) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: id}}
c.Request = httptest.NewRequest("PATCH", "/workspaces/"+id+"/abilities", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
return c.Request, w, c
}
func TestPatchAbilities_InvalidWorkspaceID(t *testing.T) {
setupTestDB(t)
// "not-a-uuid" fails validateWorkspaceID
_, w, c := patchReq("not-a-uuid", `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_EmptyBody(t *testing.T) {
setupTestDB(t)
id := "00000000-0000-0000-0000-000000000001"
// Empty JSON object — no ability fields present
_, w, c := patchReq(id, `{}`)
PatchAbilities(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]string
json.Unmarshal(w.Body.Bytes(), &resp)
if resp["error"] != "at least one ability field required" {
t.Errorf("expected 'at least one ability field required', got %v", resp["error"])
}
}
func TestPatchAbilities_WorkspaceNotFound(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000002"
// SELECT EXISTS returns false (workspace does not exist)
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(false))
_, w, c := patchReq(id, `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusNotFound {
t.Errorf("expected 404, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_SetBroadcastEnabledTrue(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000003"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE broadcast_enabled = true
mock.ExpectExec(`UPDATE workspaces SET broadcast_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, true).
WillReturnResult(sqlmock.NewResult(0, 1))
_, w, c := patchReq(id, `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]string
json.Unmarshal(w.Body.Bytes(), &resp)
if resp["status"] != "updated" {
t.Errorf("expected status=updated, got %v", resp["status"])
}
}
func TestPatchAbilities_SetTalkToUserEnabledFalse(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000004"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE talk_to_user_enabled = false
mock.ExpectExec(`UPDATE workspaces SET talk_to_user_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, false).
WillReturnResult(sqlmock.NewResult(0, 1))
_, w, c := patchReq(id, `{"talk_to_user_enabled":false}`)
PatchAbilities(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_BothFields(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000005"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE broadcast_enabled = false
mock.ExpectExec(`UPDATE workspaces SET broadcast_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, false).
WillReturnResult(sqlmock.NewResult(0, 1))
// UPDATE talk_to_user_enabled = true
mock.ExpectExec(`UPDATE workspaces SET talk_to_user_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, true).
WillReturnResult(sqlmock.NewResult(0, 1))
_, w, c := patchReq(id, `{"broadcast_enabled":false,"talk_to_user_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_BroadcastUpdateFails(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000006"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE fails
mock.ExpectExec(`UPDATE workspaces SET broadcast_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, true).
WillReturnError(sql.ErrConnDone)
_, w, c := patchReq(id, `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_TalkToUserUpdateFails(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000007"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE broadcast_enabled skipped (not in payload)
// UPDATE talk_to_user_enabled fails
mock.ExpectExec(`UPDATE workspaces SET talk_to_user_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, false).
WillReturnError(sql.ErrConnDone)
_, w, c := patchReq(id, `{"talk_to_user_enabled":false}`)
PatchAbilities(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d: %s", w.Code, w.Body.String())
}
}
@@ -34,11 +34,13 @@ import (
// BroadcastHandler is constructed once and shared across requests.
type BroadcastHandler struct {
broadcaster *events.Broadcaster
broadcaster events.EventEmitter
}
// NewBroadcastHandler creates a BroadcastHandler.
func NewBroadcastHandler(b *events.Broadcaster) *BroadcastHandler {
// The emitter is any EventEmitter — the concrete *Broadcaster in production,
// or a test double in unit tests.
func NewBroadcastHandler(b events.EventEmitter) *BroadcastHandler {
return &BroadcastHandler{broadcaster: b}
}
@@ -67,7 +67,6 @@ func TestBroadcast_OrgScopedRecipients(t *testing.T) {
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
@@ -206,7 +205,7 @@ func TestBroadcast_Disabled(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000001"
senderID := "00000000-0000-0000-0000-000000000003"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Disabled Agent", false))
@@ -237,7 +236,7 @@ func TestBroadcast_EmptyOrg_NoRecipients(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000001" // org root, only workspace in org
senderID := "00000000-0000-0000-0000-000000000004" // org root, only workspace in org
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -297,33 +296,12 @@ func TestBroadcast_InvalidWorkspaceID(t *testing.T) {
}
}
func TestBroadcast_MissingMessage(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-000000000001"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-000000000001/broadcast", bytes.NewBufferString("{}"))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
// TestBroadcast_OrgRootLookupFails verifies that if the recursive CTE for
// finding the org root errors, the handler returns 500 instead of proceeding
// with an un-scoped query that would broadcast to all orgs.
func TestBroadcast_OrgRootLookupFails(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000001"
senderID := "00000000-0000-0000-0000-000000000005"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -353,16 +331,13 @@ func TestBroadcast_OrgRootLookupFails(t *testing.T) {
}
}
// TestBroadcast_OrgScoped_SelfBroadcastExcluded verifies that broadcasting
// from a workspace does not send a broadcast_receive to the sender itself
// (the sender logs broadcast_sent, not broadcast_receive).
func TestBroadcast_OrgScoped_SelfBroadcastExcluded(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000001"
peerID := "00000000-0000-0000-0000-000000000002"
senderID := "00000000-0000-0000-0000-000000000006"
peerID := "00000000-0000-0000-0000-000000000007"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -399,10 +374,145 @@ func TestBroadcast_OrgScoped_SelfBroadcastExcluded(t *testing.T) {
}
}
// TestBroadcast_RecipientActivityLogFails_SkipsAndContinues: if one recipient's
// activity_log insert fails, the handler logs the error and continues to the
// next recipient rather than aborting the whole broadcast.
func TestBroadcast_RecipientActivityLogFails_SkipsAndContinues(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000008"
peerA := "00000000-0000-0000-0000-000000000009"
peerB := "00000000-0000-0000-0000-00000000000a"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Resilient Agent", true))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(senderID))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID, senderID).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(peerA).AddRow(peerB))
// Peer A fails — handler logs and continues
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerA, senderID, sqlmock.AnyArg()).
WillReturnError(context.DeadlineExceeded)
// Peer B succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerB, senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Sender log succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: senderID}}
body := `{"message":"partial delivery"}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+senderID+"/broadcast", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
// Only peerB was delivered
if int(resp["delivered"].(float64)) != 1 {
t.Errorf("expected delivered=1, got %v", resp["delivered"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet expectations: %v", err)
}
}
// TestBroadcast_SenderActivityLogFails_StillReturns200: if the sender's own
// broadcast_sent activity_log insert fails, the handler still returns 200
// so the caller doesn't retry a broadcast that already partially delivered.
func TestBroadcast_SenderActivityLogFails_StillReturns200(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-00000000000b"
peerA := "00000000-0000-0000-0000-00000000000c"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Log-Fail Agent", true))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(senderID))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID, senderID).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(peerA))
// Peer log succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerA, senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Sender log FAILS
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(senderID, sqlmock.AnyArg()).
WillReturnError(context.DeadlineExceeded)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: senderID}}
body := `{"message":"log fail test"}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+senderID+"/broadcast", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200 even on sender log failure, got %d: %s", w.Code, w.Body.String())
}
}
func TestBroadcast_MissingMessage(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-00000000000d"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-00000000000d/broadcast", bytes.NewBufferString("{}"))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
func TestBroadcast_MissingBody(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-00000000000e"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-00000000000e/broadcast", nil)
// no Content-Type and no body
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
// TestBroadcast_Truncate tests that messages are truncated with the Unicode ellipsis
// TestBroadcast_Truncate tests that messages are truncated with the Unicode ellipsis
// character (U+2026) when len(msg) > max. The truncated output is max runes + "…",
// so truncating a 48-char string at max=20 produces 21 characters (20 runes + "…").
// character (U+2026) when len(msg) > max. The truncated output is max runes + "…".
func TestBroadcast_Truncate(t *testing.T) {
cases := []struct {
msg string
@@ -410,14 +520,18 @@ func TestBroadcast_Truncate(t *testing.T) {
expect string
}{
{"short", 120, "short"}, // under max — no truncation
// exactly120chars (15) + 105 ones = 120 chars; at max=120 → unchanged
// exactly 120 chars → unchanged
{"exactly120chars1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111", 120, "exactly120chars111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111…"},
// "this is a longer mes" = 20 runes; + "…" = 21 chars
// 21 runes at max=20 → 20 + "…" = 21 chars
{"this is a longer message that needs truncating", 20, "this is a longer mes…"},
// at-max boundary: 20 chars at max=20 → no truncation
{"exactly twenty chars", 20, "exactly twenty chars"},
// over max: 11 chars at max=10 → 10 + "…" = 11
{"hello world!", 10, "hello worl…"},
// Unicode: 3-rune string at max=3 → unchanged
{"日本語", 3, "日本語"},
// Empty string → unchanged
{"", 120, ""},
}
for _, tc := range cases {
result := broadcastTruncate(tc.msg, tc.max)
@@ -37,6 +37,7 @@ package handlers
import (
"context"
"errors"
"fmt"
"log"
"path/filepath"
@@ -132,6 +133,10 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// a workspace_secret named GIT_AUTHOR_NAME can override.
applyAgentGitIdentity(envVars, payload.Name)
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
// SSOT for chat-upload limits — see chat_files.go::chatUploadMaxBytes.
// Injecting via env keeps the Python workspace runtime caps in
// lock-step with the Go cap on every provision. Fixes #1520.
applyChatUploadLimits(envVars)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
}
@@ -223,3 +228,28 @@ func (h *WorkspaceHandler) markProvisionFailed(ctx context.Context, workspaceID,
log.Printf("markProvisionFailed: db update failed for %s: %v", workspaceID, dbErr)
}
}
// applyChatUploadLimits seeds the chat-upload cap env vars on the
// workspace container so the Python /internal/chat/uploads/ingest
// handler parses the multipart form with the same per-file allowance
// that the Go proxy enforces.
//
// Why env-driven (and not, say, a hard-coded Python constant): keeping
// one Go constant as the source of truth and forwarding it lets
// operations bump the cap by editing one file + redeploy, instead of
// editing two files in two languages and risking the drift that
// shipped #1520 (Go cap 50 MB, Python parser cap 1 MiB — Starlette
// default — so a 5 MB image always 400'd on parse before per-file
// enforcement could fire).
//
// Pre-existing env wins. If something downstream (a tenant override,
// a plugin mutator, an A/B experiment) has already set either var,
// we leave it alone. Default-only injection.
func applyChatUploadLimits(envVars map[string]string) {
if _, set := envVars["CHAT_UPLOAD_MAX_FILE_BYTES"]; !set {
envVars["CHAT_UPLOAD_MAX_FILE_BYTES"] = fmt.Sprintf("%d", chatUploadMaxFileBytes)
}
if _, set := envVars["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; !set {
envVars["CHAT_UPLOAD_MAX_TOTAL_BYTES"] = fmt.Sprintf("%d", chatUploadMaxBytes)
}
}
@@ -237,10 +237,10 @@ func (h *WorkspaceHandler) Restart(c *gin.Context) {
// the silent-drop bugs PRs #2811/#2824 closed). RestartWorkspaceAuto
// enforces CP-FIRST ordering matching the other dispatchers — see
// docs/architecture/backends.md.
go func() {
h.goAsync(func() {
h.RestartWorkspaceAutoOpts(context.Background(), id, templatePath, configFiles, payload, resetClaudeSession)
}()
go h.sendRestartContext(id, restartData)
})
h.goAsync(func() { h.sendRestartContext(id, restartData) })
c.JSON(http.StatusOK, gin.H{"status": "provisioning", "config_dir": configLabel, "reset_session": resetClaudeSession})
}
@@ -610,7 +610,9 @@ func (h *WorkspaceHandler) runRestartCycle(workspaceID string) {
h.provisionWorkspaceAutoSync(workspaceID, "", nil, payload)
// sendRestartContext is a one-way notification to the new container; safe
// to fire async — the next restart cycle won't depend on it completing.
go h.sendRestartContext(workspaceID, restartData)
// Tracked via goAsync so the test harness can drain it before the
// global db.DB swap (sendRestartContext reads db.DB).
h.goAsync(func() { h.sendRestartContext(workspaceID, restartData) })
}
// Pause handles POST /workspaces/:id/pause
@@ -178,12 +178,21 @@ func (p *CPProvisioner) Start(ctx context.Context, cfg WorkspaceConfig) (string,
// /admin/liveness and other admin-gated platform endpoints (core#831).
// p.adminToken is read from os.Getenv("ADMIN_TOKEN") at provisioner creation;
// it is also used for CP→platform HTTP auth but those are separate concerns.
env := cfg.EnvVars
if p.adminToken != "" {
env = make(map[string]string, len(cfg.EnvVars)+1)
for k, v := range cfg.EnvVars {
env[k] = v
//
// Forensic #145 hardening: tenant workspaces run on EC2 via this path, so
// the SCM-write-token denylist (see buildContainerEnv) is enforced here
// too. Always build a filtered copy — never pass cfg.EnvVars through
// verbatim — so a latent persona-merged GITEA_TOKEN can't reach the
// tenant container regardless of whether ADMIN_TOKEN is set.
env := make(map[string]string, len(cfg.EnvVars)+1)
for k, v := range cfg.EnvVars {
if isSCMWriteTokenKey(k) {
log.Printf("CPProvisioner.Start: dropped SCM-write credential %q from tenant workspace env (forensic #145 guard)", k)
continue
}
env[k] = v
}
if p.adminToken != "" {
env["ADMIN_TOKEN"] = p.adminToken
}
// Collect template files and generated configs, with OFFSEC-010 guards:
@@ -643,6 +643,28 @@ func ValidateWorkspaceAccess(access, workspacePath string) error {
}
}
// scmWriteTokenKeys is the explicit denylist of environment variable names
// that carry a Git SCM *write* credential (push / merge / approve). These
// must never reach a tenant workspace container — see the forensic #145
// rationale in buildContainerEnv. Kept as an exact-match set rather than a
// substring/prefix heuristic so the guard is auditable and can't silently
// over-strip a legitimately-named var.
var scmWriteTokenKeys = map[string]struct{}{
"GITEA_TOKEN": {},
"GITHUB_TOKEN": {},
"GH_TOKEN": {}, // gh CLI honours GH_TOKEN as a GITHUB_TOKEN alias
"GITLAB_TOKEN": {},
"GL_TOKEN": {}, // glab CLI alias
"BITBUCKET_TOKEN": {},
}
// isSCMWriteTokenKey reports whether an env var name is a known Git SCM
// write credential that must be stripped from tenant workspace env.
func isSCMWriteTokenKey(key string) bool {
_, ok := scmWriteTokenKeys[key]
return ok
}
// buildContainerEnv assembles the initial environment variables injected
// into every workspace container.
//
@@ -679,6 +701,21 @@ func buildContainerEnv(cfg WorkspaceConfig) []string {
env = append(env, fmt.Sprintf("AWARENESS_URL=%s", cfg.AwarenessURL))
}
for k, v := range cfg.EnvVars {
// Forensic #145 hardening: tenant workspace containers run
// agent-controlled code and must NEVER receive a Git SCM *write*
// credential. Without merge/approve creds in-container the
// two-eyes review gate is structurally self-bypass-proof — an
// agent that forges an approval has no token to act on it. A
// latent path exists (loadPersonaEnvFile merges a per-role
// persona `GITEA_TOKEN` into cfg.EnvVars when MOLECULE_PERSONA_ROOT
// is set on a tenant host); it is inert today (persona dirs are
// operator-host-only) but unguarded. Strip SCM-write tokens here
// by construction so the invariant holds regardless of whether
// that path ever becomes reachable.
if isSCMWriteTokenKey(k) {
log.Printf("buildContainerEnv: dropped SCM-write credential %q from workspace env (forensic #145 guard)", k)
continue
}
env = append(env, fmt.Sprintf("%s=%s", k, v))
}
// Inject ADMIN_TOKEN from the platform server's environment so workspace
@@ -725,10 +725,15 @@ func TestBuildContainerEnv_AwarenessOnlyWhenBothSet(t *testing.T) {
}
func TestBuildContainerEnv_CustomEnvVarsAppended(t *testing.T) {
// NOTE: this test previously asserted GITHUB_TOKEN passed through
// verbatim. That assertion encoded the forensic #145 latent leak as
// expected behavior. Post-guard, ordinary custom env still flows but
// SCM-write credentials are stripped — see
// TestBuildContainerEnv_StripsSCMWriteTokens for the negative assertion.
cfg := WorkspaceConfig{
WorkspaceID: "ws-x",
PlatformURL: "http://localhost:8080",
EnvVars: map[string]string{"CUSTOM": "value", "GITHUB_TOKEN": "fake-token-for-test"},
EnvVars: map[string]string{"CUSTOM": "value", "ANTHROPIC_API_KEY": "sk-not-an-scm-token"},
}
env := buildContainerEnv(cfg)
seen := map[string]string{}
@@ -741,8 +746,8 @@ func TestBuildContainerEnv_CustomEnvVarsAppended(t *testing.T) {
if seen["CUSTOM"] != "value" {
t.Errorf("CUSTOM env missing, got env=%v", env)
}
if seen["GITHUB_TOKEN"] != "fake-token-for-test" {
t.Errorf("GITHUB_TOKEN env missing, got env=%v", env)
if seen["ANTHROPIC_API_KEY"] != "sk-not-an-scm-token" {
t.Errorf("non-SCM custom env must still pass through, got env=%v", env)
}
// Built-in defaults still present
if seen["MOLECULE_URL"] == "" {
@@ -750,6 +755,129 @@ func TestBuildContainerEnv_CustomEnvVarsAppended(t *testing.T) {
}
}
// ---------- forensic #145: SCM-write-token denylist guard ----------
// TestBuildContainerEnv_StripsSCMWriteTokens is the core negative
// assertion: a tenant workspace env constructed via buildContainerEnv MUST
// NOT contain any Git SCM *write* credential, regardless of how it got into
// cfg.EnvVars. This proves the two-eyes review gate stays structurally
// self-bypass-proof — an agent in-container has no merge/approve token to
// act on a forged approval. See forensic #145.
//
// This test FAILS on the pre-guard code (where buildContainerEnv passed
// cfg.EnvVars through verbatim) and PASSES once the denylist filter is in
// place — i.e. the guard is proven by construction, not by environment
// accident.
func TestBuildContainerEnv_StripsSCMWriteTokens(t *testing.T) {
scmTokens := []string{
"GITEA_TOKEN", "GITHUB_TOKEN", "GH_TOKEN",
"GITLAB_TOKEN", "GL_TOKEN", "BITBUCKET_TOKEN",
}
t.Run("normal path — SCM tokens explicitly set in EnvVars", func(t *testing.T) {
envVars := map[string]string{"CUSTOM": "ok", "ANTHROPIC_API_KEY": "sk-keep"}
for _, k := range scmTokens {
envVars[k] = "leaked-write-credential-" + k
}
cfg := WorkspaceConfig{
WorkspaceID: "ws-tenant",
PlatformURL: "http://localhost:8080",
Tier: 2,
EnvVars: envVars,
}
assertNoSCMWriteToken(t, buildContainerEnv(cfg), scmTokens)
// Sanity: non-SCM custom env is NOT collateral-damaged by the filter.
if !envContains(buildContainerEnv(cfg), "CUSTOM=ok") {
t.Errorf("filter must not strip non-SCM custom env")
}
if !envContains(buildContainerEnv(cfg), "ANTHROPIC_API_KEY=sk-keep") {
t.Errorf("filter must not strip non-SCM API keys")
}
})
t.Run("persona-file path — simulates loadPersonaEnvFile merge", func(t *testing.T) {
// The latent path: handlers.loadPersonaEnvFile() merges a per-role
// persona env file (carrying GITEA_USER, GITEA_TOKEN, …) into the
// workspace env map when MOLECULE_PERSONA_ROOT is set on a tenant
// host. We can't invoke that cross-package helper here, but its
// observable effect is exactly "a GITEA_TOKEN appears in
// cfg.EnvVars". Constructing that condition directly proves the
// guard holds even if the latent path becomes reachable.
cfg := WorkspaceConfig{
WorkspaceID: "ws-tenant",
PlatformURL: "http://localhost:8080",
Tier: 2,
EnvVars: map[string]string{
// Persona identity fields that are SAFE to keep (read-only
// identity, not a write credential):
"GITEA_USER": "backend-engineer",
"GITEA_USER_EMAIL": "backend-engineer@agents.moleculesai.app",
// The credential that must be stripped:
"GITEA_TOKEN": "persona-merged-write-pat",
"GITEA_TOKEN_SCOPES": "write:repository",
},
}
got := buildContainerEnv(cfg)
assertNoSCMWriteToken(t, got, scmTokens)
// Non-credential persona identity may still flow through — only the
// write token is the denied surface.
if !envContains(got, "GITEA_USER=backend-engineer") {
t.Errorf("non-credential persona identity (GITEA_USER) should not be stripped")
}
})
}
// TestCPProvisionerEnv_StripsSCMWriteTokens covers the tenant-EC2 path:
// CPProvisioner.Start builds the env map the control plane forwards to the
// EC2 workspace container. The same forensic #145 denylist must hold there.
func TestCPProvisionerEnv_StripsSCMWriteTokens(t *testing.T) {
// isSCMWriteTokenKey is the single source of truth shared by both
// buildContainerEnv (local Docker) and CPProvisioner.Start (tenant EC2).
// Assert it classifies every known SCM-write var as denied and leaves
// ordinary / read-only-identity vars alone.
for _, k := range []string{
"GITEA_TOKEN", "GITHUB_TOKEN", "GH_TOKEN",
"GITLAB_TOKEN", "GL_TOKEN", "BITBUCKET_TOKEN",
} {
if !isSCMWriteTokenKey(k) {
t.Errorf("isSCMWriteTokenKey(%q) = false, want true (SCM-write credential must be denied)", k)
}
}
for _, k := range []string{
"GITEA_USER", "GITEA_USER_EMAIL", "ANTHROPIC_API_KEY",
"CUSTOM", "PLATFORM_URL", "ADMIN_TOKEN", "",
} {
if isSCMWriteTokenKey(k) {
t.Errorf("isSCMWriteTokenKey(%q) = true, want false (must not over-strip non-SCM env)", k)
}
}
}
func assertNoSCMWriteToken(t *testing.T, env []string, scmTokens []string) {
t.Helper()
for _, e := range env {
key := e
if i := strings.IndexByte(e, '='); i >= 0 {
key = e[:i]
}
for _, banned := range scmTokens {
if key == banned {
t.Errorf("SCM-write credential %q leaked into workspace env (forensic #145 invariant violated): %q", banned, e)
}
}
}
}
func envContains(env []string, want string) bool {
for _, e := range env {
if e == want {
return true
}
}
return false
}
// ---------- buildWorkspaceMount — #65 workspace_access ----------
func TestBuildWorkspaceMount_SelectionMatrix(t *testing.T) {
@@ -0,0 +1,226 @@
// Package secrets provides the canonical SSOT for credential-shaped
// regex patterns used by:
//
// - the CI `Secret scan` workflow (.gitea/workflows/secret-scan.yml)
// - the runtime's bundled pre-commit hook
// (molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh)
// - the upcoming Phase 2b docker-exec Files API backend, which has
// to refuse to surface files whose path OR content matches a
// credential shape (RFC internal#425, Hongming 2026-05-15)
//
// Before this package, the same regex set lived as duplicate bash
// arrays in two unrelated repos; adding a pattern required editing
// both, and pattern drift was caught only via secret-scan workflow
// failures on PRs that had unrelated changes (#2090-class incident
// vector). Centralising in Go makes the Files API the SSOT, with the
// YAML + bash arrays generated/asserted from this package so drift
// is detected at CI time, not at exfiltration time.
//
// This file is Phase 2a of the internal#425 RFC. Phase 2b will import
// `Patterns` from `template_files_docker_exec.go` to gate
// `listFilesViaDockerExec` / `readFileViaDockerExec` against
// secret-shaped paths AND content. Until 2b lands, the package has
// one consumer: this package's own unit tests, which pin the regex
// strings so a refactor that drops or weakens one is caught here.
package secrets
import (
"fmt"
"regexp"
"sync"
)
// Pattern is one named credential shape — a human label plus the
// compiled regex. The label appears in CI error output ("matched:
// github-pat") so an operator can identify the family without seeing
// the actual matched bytes (echoing the bytes widens the blast radius
// per the secret-scan workflow's recovery prose).
type Pattern struct {
// Name is a short kebab-case identifier (e.g. "github-pat",
// "anthropic-api-key"). Stable across versions — consumers may
// switch on it.
Name string
// Description is a one-line human-readable explanation of what
// the pattern matches. Used in CI error messages and the Files
// API "<denied: secret-shape>" placeholder rationale.
Description string
// regexSource is the regex literal in Go-RE2 syntax. Stored as a
// string so the slice declaration below stays readable; compiled
// once via sync.Once into a *regexp.Regexp.
regexSource string
}
// Patterns is the canonical credential-shape regex set.
//
// Adding a pattern here:
//
// 1. Add a new Pattern{} entry below with a kebab-case Name, a
// one-line Description, and the regex literal. Anchor on a
// low-false-positive prefix.
// 2. Add a positive + negative test case in patterns_test.go.
// 3. Mirror the regex string into:
// a. .gitea/workflows/secret-scan.yml SECRET_PATTERNS array
// b. molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh
// (or wait for the codegen target that consumes this slice — TBD
// follow-up; tracked in the Phase 2a PR description.)
//
// The order is: alphabetical within each provider family, families
// grouped by ecosystem (GitHub family, AI-provider family, chat
// family, cloud family). Keep this stable so diffs are reviewable.
var Patterns = []Pattern{
// --- GitHub token family ---
{
Name: "github-pat-classic",
Description: "GitHub personal access token (classic)",
regexSource: `ghp_[A-Za-z0-9]{36,}`,
},
{
Name: "github-app-installation-token",
Description: "GitHub App installation token (#2090 vector)",
regexSource: `ghs_[A-Za-z0-9]{36,}`,
},
{
Name: "github-oauth-user-to-server",
Description: "GitHub OAuth user-to-server token",
regexSource: `gho_[A-Za-z0-9]{36,}`,
},
{
Name: "github-oauth-user",
Description: "GitHub OAuth user token",
regexSource: `ghu_[A-Za-z0-9]{36,}`,
},
{
Name: "github-oauth-refresh",
Description: "GitHub OAuth refresh token",
regexSource: `ghr_[A-Za-z0-9]{36,}`,
},
{
Name: "github-pat-fine-grained",
Description: "GitHub fine-grained personal access token",
regexSource: `github_pat_[A-Za-z0-9_]{82,}`,
},
// --- AI-provider API key family ---
{
Name: "anthropic-api-key",
Description: "Anthropic API key",
regexSource: `sk-ant-[A-Za-z0-9_-]{40,}`,
},
{
Name: "openai-project-key",
Description: "OpenAI project API key",
regexSource: `sk-proj-[A-Za-z0-9_-]{40,}`,
},
{
Name: "openai-service-account-key",
Description: "OpenAI service-account API key",
regexSource: `sk-svcacct-[A-Za-z0-9_-]{40,}`,
},
{
Name: "minimax-api-key",
Description: "MiniMax API key (F1088 vector)",
regexSource: `sk-cp-[A-Za-z0-9_-]{60,}`,
},
// --- Chat-platform token family ---
{
Name: "slack-token",
Description: "Slack token (xoxb/xoxa/xoxp/xoxr/xoxs)",
regexSource: `xox[baprs]-[A-Za-z0-9-]{20,}`,
},
// --- Cloud-provider credential family ---
{
Name: "aws-access-key-id",
Description: "AWS access key ID",
regexSource: `AKIA[0-9A-Z]{16}`,
},
{
Name: "aws-sts-temp-access-key-id",
Description: "AWS STS temporary access key ID",
regexSource: `ASIA[0-9A-Z]{16}`,
},
}
// compiledOnce protects the lazy build of compiledPatterns. We compile
// lazily so package init is cheap; callers pay only on first match
// (typically once per workspace-server boot).
var (
compiledOnce sync.Once
compiledPatterns []*compiledPattern
compileErr error
)
type compiledPattern struct {
Name string
Description string
Re *regexp.Regexp
}
// compileAll compiles every Pattern.regexSource into a *regexp.Regexp.
// Called once via compiledOnce. Any compile failure here is a build
// bug (the unit tests assert each regex compiles) — surfacing via
// returned error so callers don't panic in request handling.
func compileAll() {
out := make([]*compiledPattern, 0, len(Patterns))
for _, p := range Patterns {
re, err := regexp.Compile(p.regexSource)
if err != nil {
compileErr = fmt.Errorf("secrets: pattern %q failed to compile: %w", p.Name, err)
return
}
out = append(out, &compiledPattern{Name: p.Name, Description: p.Description, Re: re})
}
compiledPatterns = out
}
// ScanBytes returns a non-nil Match if any pattern matches anywhere
// inside b. Returns (nil, nil) on no match. Returns (nil, err) only
// if a regex in the package fails to compile — that's a build bug,
// not a runtime data issue.
//
// Match contains the pattern Name + Description so the caller can
// emit a path-or-content-denial rationale WITHOUT round-tripping the
// matched bytes (which would defeat the purpose). The matched bytes
// stay inside this function.
//
// The Files API Phase 2b backend will call ScanBytes on:
//
// - the absolute path string (catches a file literally named
// `ghs_abc.txt`)
// - the file content (catches a credential pasted into a workspace
// file by an agent or user — the Files API refuses to surface it
// and the canvas renders "<denied: secret-shape>")
//
// Ordering: patterns are tried in declaration order. First match
// wins. This means narrower patterns (e.g. `sk-svcacct-…`) should
// appear in `Patterns` before broader ones (`sk-…`) — today there's
// no overlap, so order is descriptive only.
func ScanBytes(b []byte) (*Match, error) {
compiledOnce.Do(compileAll)
if compileErr != nil {
return nil, compileErr
}
for _, cp := range compiledPatterns {
if cp.Re.Match(b) {
return &Match{Name: cp.Name, Description: cp.Description}, nil
}
}
return nil, nil
}
// ScanString is the string-input convenience wrapper around ScanBytes.
// Identical semantics — the body never copies, []byte(s) is a
// zero-copy reinterpret for the regex matcher.
func ScanString(s string) (*Match, error) {
return ScanBytes([]byte(s))
}
// Match describes which pattern caught a value. Deliberately does
// NOT include the matched substring — callers must not echo it.
type Match struct {
// Name is the pattern's kebab-case identifier (e.g. "github-pat-classic").
Name string
// Description is the human-readable line for UI / log surfaces.
Description string
}
@@ -0,0 +1,189 @@
package secrets
import (
"strings"
"testing"
)
// TestEveryPatternCompiles pins that every Pattern.regexSource is a
// valid Go-RE2 expression. Without this, a bad regex would silently
// disable ScanBytes for everything after it (the lazy compile would
// set compileErr and ScanBytes would return that error every call).
func TestEveryPatternCompiles(t *testing.T) {
for _, p := range Patterns {
if p.Name == "" {
t.Errorf("pattern with empty Name: regex=%q", p.regexSource)
}
if p.Description == "" {
t.Errorf("pattern %q has empty Description", p.Name)
}
}
// Force compile + check error.
if _, err := ScanBytes([]byte("placeholder")); err != nil {
t.Fatalf("ScanBytes init failed: %v", err)
}
}
// TestNoDuplicateNames — a duplicate pattern Name would make the
// "first match wins" semantics surprising to readers and any caller
// switching on Match.Name (none today but adding the guard is cheap).
func TestNoDuplicateNames(t *testing.T) {
seen := map[string]bool{}
for _, p := range Patterns {
if seen[p.Name] {
t.Errorf("duplicate pattern Name: %q", p.Name)
}
seen[p.Name] = true
}
}
// TestKnownPatternsAllPresent — pins which specific Name values are
// expected. A future refactor that renames or removes one without
// updating consumers (CI workflow, runtime pre-commit hook, Files
// API Phase 2b backend) would silently widen the leak surface.
// Failing here forces the rename to be intentional.
func TestKnownPatternsAllPresent(t *testing.T) {
expected := []string{
"github-pat-classic",
"github-app-installation-token",
"github-oauth-user-to-server",
"github-oauth-user",
"github-oauth-refresh",
"github-pat-fine-grained",
"anthropic-api-key",
"openai-project-key",
"openai-service-account-key",
"minimax-api-key",
"slack-token",
"aws-access-key-id",
"aws-sts-temp-access-key-id",
}
got := map[string]bool{}
for _, p := range Patterns {
got[p.Name] = true
}
for _, want := range expected {
if !got[want] {
t.Errorf("expected pattern %q missing from Patterns slice", want)
}
}
}
// TestPositiveMatches — for each pattern, supply a representative
// shape and assert ScanBytes returns a Match with the right Name.
// These are TEST FIXTURES, not real credentials — each is the
// pattern's prefix + a long-enough trailing run of placeholder chars.
// `EXAMPLE` is sprinkled in to make grep-finds in CI logs obviously
// fake to a human reader (matches saved memory
// feedback_assert_exact_not_substring: tighten by Name not body).
func TestPositiveMatches(t *testing.T) {
cases := []struct {
fixture string
expectedName string
}{
{"ghp_EXAMPLE111122223333444455556666777788889999", "github-pat-classic"},
{"ghs_EXAMPLE111122223333444455556666777788889999", "github-app-installation-token"},
{"gho_EXAMPLE111122223333444455556666777788889999", "github-oauth-user-to-server"},
{"ghu_EXAMPLE111122223333444455556666777788889999", "github-oauth-user"},
{"ghr_EXAMPLE111122223333444455556666777788889999", "github-oauth-refresh"},
{"github_pat_EXAMPLE" + strings.Repeat("1", 80), "github-pat-fine-grained"},
{"sk-ant-EXAMPLE" + strings.Repeat("1", 40), "anthropic-api-key"},
{"sk-proj-EXAMPLE" + strings.Repeat("1", 40), "openai-project-key"},
{"sk-svcacct-EXAMPLE" + strings.Repeat("1", 40), "openai-service-account-key"},
{"sk-cp-EXAMPLE" + strings.Repeat("1", 60), "minimax-api-key"},
{"xoxb-" + strings.Repeat("a", 25), "slack-token"},
{"xoxa-" + strings.Repeat("a", 25), "slack-token"},
// AWS regex requires [0-9A-Z]{16} — uppercase + digits only.
{"AKIA1234567890ABCDEF", "aws-access-key-id"},
{"ASIA1234567890ABCDEF", "aws-sts-temp-access-key-id"},
}
for _, tc := range cases {
t.Run(tc.expectedName, func(t *testing.T) {
m, err := ScanBytes([]byte(tc.fixture))
if err != nil {
t.Fatalf("ScanBytes(%q) errored: %v", tc.fixture, err)
}
if m == nil {
t.Fatalf("ScanBytes(%q) returned no match — expected %q", tc.fixture, tc.expectedName)
}
if m.Name != tc.expectedName {
t.Errorf("ScanBytes(%q) matched %q; expected %q", tc.fixture, m.Name, tc.expectedName)
}
})
}
}
// TestNegativeShapes — strings that look credential-adjacent but
// shouldn't match (too short, wrong prefix, missing trailing bytes).
// Failing here means a pattern is too loose, which would generate
// false-positive denial in Files API and false-positive workflow
// failures in CI.
func TestNegativeShapes(t *testing.T) {
cases := []string{
// Too-short variants — anchored on the length suffix.
"ghp_tooshort",
"ghs_alsoshort1234",
"github_pat_short",
"sk-ant-short",
"sk-cp-not-enough-bytes-here",
// Looks like one of the prefixes but isn't (different letter).
"gha_EXAMPLE_thirty_six_or_more_chars_here_xxx",
// Slack family — wrong letter after xox.
"xoxz-aaaaaaaaaaaaaaaaaaaaaaaaa",
// AWS-shaped but wrong length suffix.
"AKIATOOSHORT",
// Empty / whitespace.
"",
" ",
// Plain prose mentioning the prefix as part of a longer word.
"see also `ghp_HOWTO.md` in the repo",
}
for _, c := range cases {
t.Run(c, func(t *testing.T) {
m, err := ScanBytes([]byte(c))
if err != nil {
t.Fatalf("ScanBytes(%q) errored: %v", c, err)
}
if m != nil {
t.Errorf("ScanBytes(%q) unexpectedly matched %q", c, m.Name)
}
})
}
}
// TestScanString_NoOp — sanity-check ScanString is the zero-copy
// wrapper around ScanBytes. Without this, a future refactor that
// makes ScanString do its own thing (e.g. accidentally normalise
// case) would diverge silently.
func TestScanString_NoOp(t *testing.T) {
in := "ghp_EXAMPLE111122223333444455556666777788889999"
m1, err1 := ScanBytes([]byte(in))
if err1 != nil {
t.Fatalf("ScanBytes errored: %v", err1)
}
m2, err2 := ScanString(in)
if err2 != nil {
t.Fatalf("ScanString errored: %v", err2)
}
if m1 == nil || m2 == nil {
t.Fatalf("expected matches; got bytes=%+v string=%+v", m1, m2)
}
if m1.Name != m2.Name {
t.Errorf("ScanString and ScanBytes returned different Names: %q vs %q", m1.Name, m2.Name)
}
}
// TestMatch_NoRoundtrip — assert the Match struct does NOT include
// the matched substring as a field. Adding such a field would
// regress the "matched bytes never leave ScanBytes" invariant that
// makes this package safe to call from log/UI surfaces. This is a
// reflection-light contract test — checks the field names statically.
func TestMatch_NoRoundtrip(t *testing.T) {
var m Match
// If someone adds a `Matched string` (or similar) field, this
// test reads as the canonical place to update + reconsider.
_ = m.Name
_ = m.Description
// The two-field shape is part of the public contract; new fields
// require deliberation about whether they leak the secret value.
}
+6
View File
@@ -35,12 +35,14 @@ from a2a_tools import (
tool_commit_memory,
tool_delegate_task,
tool_delegate_task_async,
tool_get_runtime_identity,
tool_get_workspace_info,
tool_inbox_peek,
tool_inbox_pop,
tool_list_peers,
tool_recall_memory,
tool_send_message_to_user,
tool_update_agent_card,
tool_wait_for_message,
)
from platform_tools.registry import TOOLS as _PLATFORM_TOOL_SPECS
@@ -130,6 +132,10 @@ async def handle_tool_call(name: str, arguments: dict) -> str:
return await tool_get_workspace_info(
source_workspace_id=arguments.get("source_workspace_id") or None,
)
elif name == "get_runtime_identity":
return await tool_get_runtime_identity()
elif name == "update_agent_card":
return await tool_update_agent_card(arguments.get("card"))
elif name == "commit_memory":
return await tool_commit_memory(
arguments.get("content", ""),
+12
View File
@@ -167,3 +167,15 @@ from a2a_tools_inbox import ( # noqa: E402 (import after the top-of-module imp
tool_inbox_pop,
tool_wait_for_message,
)
# Identity tool handlers — extracted to a2a_tools_identity. Ports the
# two T4-tier MCP tools (``tool_get_runtime_identity`` +
# ``tool_update_agent_card``) from molecule-ai-workspace-runtime PR#17.
# That repo is mirror-only (reference_runtime_repo_is_mirror_only);
# this is the canonical edit point, and the wheel mirror is
# regenerated by publish-runtime.yml on merge.
from a2a_tools_identity import ( # noqa: E402 (import after the top-of-module imports)
tool_get_runtime_identity,
tool_update_agent_card,
)
+187
View File
@@ -0,0 +1,187 @@
"""Identity tool handlers — single-concern slice of the a2a_tools surface.
Owns the two MCP tools that close the T4-tier workspace owner-permission
gaps reported via the canvas:
* ``tool_get_runtime_identity`` env-only; returns model, model_provider,
molecule_model, anthropic_base_url, tier, workspace_id, runtime
(ADAPTER_MODULE). No HTTP call. Always permitted by RBAC even
read-only agents may know what model they are.
* ``tool_update_agent_card`` POSTs the card to ``/registry/update-card``
with the workspace's own bearer (same auth path as ``tool_commit_memory``
via ``a2a_tools_rbac.auth_headers_for_heartbeat``). The platform
replaces the stored card and broadcasts an ``agent_card_updated``
event so the canvas reflects the new card live. Gated on
``memory.write`` capability via the existing RBAC permission map so
read-only roles can't silently rewrite the platform card.
Both originated as a port of molecule-ai-workspace-runtime PR#17
(``feat(mcp): add update_agent_card + get_runtime_identity tools``).
The mirror-only PR#17 was closed without merge per
``reference_runtime_repo_is_mirror_only``; the canonical edit point is
this monorepo at ``workspace/`` and the wheel mirror is regenerated
automatically by the publish-runtime workflow.
Imports the auth-header primitive from ``a2a_tools_rbac`` (iter 4a)
NOT from ``a2a_tools`` to avoid a circular import with the
kitchen-sink re-export module.
"""
from __future__ import annotations
import json
import os
from typing import Any
import httpx
from a2a_client import PLATFORM_URL
from a2a_tools_rbac import (
auth_headers_for_heartbeat as _auth_headers_for_heartbeat,
check_memory_write_permission as _check_memory_write_permission,
)
def _runtime_identity_payload() -> dict[str, Any]:
"""Build the identity dict — env-only, no I/O.
Factored out from ``tool_get_runtime_identity`` so tests can assert
against the exact key set without re-parsing JSON. The MCP tool
handler ``tool_get_runtime_identity`` is the only public caller in
production; tests call this helper directly.
"""
return {
"model": os.environ.get("MODEL", ""),
"model_provider": os.environ.get("MODEL_PROVIDER", ""),
"molecule_model": os.environ.get("MOLECULE_MODEL", ""),
"anthropic_base_url": os.environ.get("ANTHROPIC_BASE_URL", ""),
"tier": os.environ.get("TIER", ""),
"workspace_id": os.environ.get("WORKSPACE_ID", ""),
# Adapter module is the closest thing the runtime has to a
# "template slug" — e.g. "adapter" for claude-code-default,
# "hermes" for hermes-template, etc. Picked from
# $ADAPTER_MODULE env baked by each template's Dockerfile.
"runtime": os.environ.get("ADAPTER_MODULE", ""),
}
async def tool_get_runtime_identity() -> str:
"""Return this runtime's identity — model, provider, tier, IDs.
Env-only; no HTTP call. Useful so the agent can answer "what model
am I?" correctly instead of guessing from a stale system prompt
that the operator may have changed between boots.
Returns the identity as a JSON-encoded string (the dispatch contract
every MCP tool in this module follows). Tests that want to assert
individual fields can call ``_runtime_identity_payload()`` directly,
or ``json.loads`` the return value.
Always permitted by RBAC there is no sensitive information here
that isn't already available to the process via ``os.environ``.
The point of the tool is to surface those env values to the agent
layer in a stable, documented shape rather than expecting every
agent runtime to know to ``echo $MODEL``.
"""
return json.dumps(_runtime_identity_payload(), indent=2)
async def tool_update_agent_card(card: Any) -> str:
"""Update this workspace's agent_card on the platform.
POSTs the provided card to ``/registry/update-card`` with the
workspace's own bearer token (same auth path as ``tool_commit_memory``
and ``tool_get_workspace_info``). The platform validates required
fields server-side, replaces the stored card, and broadcasts an
``agent_card_updated`` event so the canvas updates live.
Args:
card: A JSON-serialisable object (typically a dict) holding the
new card. The platform validates required fields server-side.
Returns:
JSON-encoded string. Body:
- ``{"success": true, "status": "updated"}`` on success;
- ``{"success": false, "error": "<msg>", "status_code": <int>}``
on platform error;
- ``{"success": false, "error": "<reason>"}`` on local validation
(non-dict card, missing WORKSPACE_ID, network error).
Permission gate: this tool requires the ``memory.write`` RBAC
capability same gate as ``tool_commit_memory``. The check runs
inline rather than at the dispatcher layer to keep ``a2a_mcp_server``
permission-agnostic (the gate sits with the implementation, not the
transport). Read-only roles get a clear error string back instead
of a 403 from the platform.
We re-check ``isinstance(card, dict)`` here defensively rather than
trust the MCP schema validator alone the schema only constrains
the transport, not the in-process call surface used by tests and
sibling modules.
"""
payload = await _update_agent_card_impl(card)
return json.dumps(payload, indent=2)
async def _update_agent_card_impl(card: Any) -> dict[str, Any]:
"""Dict-returning core of ``tool_update_agent_card``.
Split out so tests can assert against the raw dict shape (status
codes, error messages) without re-parsing JSON on every assertion.
The string-returning ``tool_update_agent_card`` is a thin wrapper
invoked by the MCP dispatcher.
"""
# RBAC: require memory.write permission. Same gate as
# tool_commit_memory (the agent already needs this capability to
# persist anything outbound). Read-only roles can still call
# get_runtime_identity / get_workspace_info to introspect — those
# are env-only / read-only and have no inline gate.
if not _check_memory_write_permission():
return {
"success": False,
"error": (
"RBAC — this workspace does not have the 'memory.write' "
"permission required to update the agent_card."
),
}
if not isinstance(card, dict):
return {
"success": False,
"error": "card must be a JSON object (dict)",
}
ws_id = os.environ.get("WORKSPACE_ID", "")
if not ws_id:
return {
"success": False,
"error": "WORKSPACE_ID env not set; cannot identify caller",
}
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/registry/update-card",
json={"workspace_id": ws_id, "agent_card": card},
headers=_auth_headers_for_heartbeat(),
)
if resp.status_code == 200:
body: dict[str, Any] = {}
try:
body = resp.json()
except Exception:
pass
return {
"success": True,
"status": body.get("status", "updated"),
}
# Non-200 — surface what the platform returned.
error_msg = ""
try:
error_msg = resp.json().get("error", "") or resp.text
except Exception:
error_msg = resp.text
return {
"success": False,
"status_code": resp.status_code,
"error": error_msg,
}
except Exception as e:
return {"success": False, "error": f"network error: {e}"}
+52
View File
@@ -340,6 +340,16 @@ _CLI_A2A_COMMAND_KEYWORDS: dict[str, str | None] = {
"delegate_task_async": "delegate --async",
"check_task_status": "status",
"get_workspace_info": "info",
# `get_runtime_identity` + `update_agent_card` are MCP-first
# capabilities — the CLI subprocess interface doesn't expose them
# today. `get_runtime_identity` is env-only and an agent on a
# CLI-only runtime can already `echo $MODEL` etc, so there's no
# functional gap. `update_agent_card` requires a JSON object
# argument that wouldn't survive a positional-arg shell invocation
# cleanly. Mapped to None — flip to a keyword if a2a_cli grows
# `identity` / `card` subcommands in the future.
"get_runtime_identity": None,
"update_agent_card": None,
# `broadcast_message` is not exposed via the CLI subprocess interface
# today — it's an MCP-first capability. If a2a_cli grows a `broadcast`
# subcommand, map it here and the alignment test will gate the change.
@@ -589,6 +599,28 @@ def _sanitize_for_external(msg: str) -> str:
import re as _re
msg = _re.sub(r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}", "[REDACTED]", msg)
# Bare provider key with NO separator after the prefix — a real
# `sk-ant-api03-…` / `sk-…` key uses `-` (not `[ :=]`) so the rule
# above misses it. Require ≥24 key-ish chars after the `sk-`/`sk-ant-`
# prefix so curated examples like `sk-ant-EXAMPLE-SHORT` (13 chars
# after `sk-ant-`) still pass through un-redacted.
msg = _re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
# JSON-quoted credential values: {"token": "…"} / {"apiKey": "…"} /
# {"secret": "…"} / {"password": "…"}. Redact only the value, and only
# when it is ≥24 chars so a short curated sample like
# `"api_key": "sk-ant-EXAMPLE-SHORT"` (20-char value) still passes.
msg = _re.sub(
r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
r"\1[REDACTED]\2",
msg,
)
# AWS secret access key in `aws_secret_access_key=…` form (env dumps,
# boto tracebacks). The base64-ish value runs until whitespace/quote.
msg = _re.sub(
r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+",
r"\1[REDACTED]",
msg,
)
# Absolute paths: /etc/shadow, /home/user/.aws/credentials, etc.
msg = _re.sub(r"(?:/[^/\s]+){2,}", lambda m: m.group(0) if len(m.group(0)) < 60 else "[REDACTED_PATH]", msg)
return msg
@@ -598,6 +630,7 @@ def sanitize_agent_error(
exc: BaseException | None = None,
category: str | None = None,
stderr: str | None = None,
reason: str | None = None,
) -> str:
"""Render an agent-side failure into a user-safe error message.
@@ -605,6 +638,18 @@ def sanitize_agent_error(
category string (e.g. from `classify_subprocess_error`). If both are
given, `category` wins. If neither, the tag defaults to "unknown".
When ``reason`` is provided (internal#211/#212), it is a *pre-curated,
user-actionable, secret-safe* explanation built by the caller from a
provider-side failure e.g. a 403 "Your organization has disabled
Claude subscription access · Use an Anthropic API key instead, or ask
your admin to enable access" with error code ``oauth_org_not_allowed``.
This text is exactly what the user needs to self-serve, so it is
surfaced VERBATIM as the message instead of being collapsed to the
opaque exception class name. It still passes through the
key/token/bearer/path scrubber as a belt-and-braces second pass so a
buggy caller can't leak a credential that snuck into the reason.
``reason`` wins over ``stderr``; both lose to neither being set.
When ``stderr`` is provided (e.g. the first ~1 KB of a subprocess stderr
or HTTP error body), it is sanitized and appended to the output so the
A2A caller gets actionable context without needing to dig through workspace
@@ -619,6 +664,13 @@ def sanitize_agent_error(
else:
tag = "unknown"
if reason:
# Curated, user-actionable reason — surface it as the message.
# Still scrub: a 403/auth/quota message is safe, but the scrubber
# is cheap insurance against a caller that didn't curate cleanly.
clean = _sanitize_for_external(reason[:_MAX_STDERR_PREVIEW])
return f"Agent error ({tag}): {clean}"
if stderr:
# Truncate and sanitize before including — prevents DoS via
# a malicious or buggy peer injecting a huge error body, and
+66 -9
View File
@@ -26,9 +26,14 @@ Path safety:
a colliding name fails fast (the random prefix already makes
collisions astronomical, but defense-in-depth costs nothing).
Limits (matches the Go contract from chat_files.go):
- 50 MB total request body
- 25 MB per file
Limits (SSOT matches the Go contract from chat_files.go, injected
via CHAT_UPLOAD_MAX_TOTAL_BYTES / CHAT_UPLOAD_MAX_FILE_BYTES at
provision time; falls back to legacy 50 MB / 25 MB when env unset):
- CHAT_UPLOAD_MAX_TOTAL_BYTES total request body (default 50 MB)
- CHAT_UPLOAD_MAX_FILE_BYTES per file (default 25 MB)
ALSO passed as Starlette ``max_part_size`` to override the
Starlette-1.0 default of 1 MiB which silently 400'd every
upload > 1 MiB before #1520 fix.
- filename truncated to 100 chars
Response shape:
@@ -61,14 +66,47 @@ logger = logging.getLogger(__name__)
# keeps working unchanged.
CHAT_UPLOAD_DIR = "/workspace/.molecule/chat-uploads"
def _env_int(name: str, default: int) -> int:
"""Parse an int from the environment, falling back to ``default``.
Mis-formatted values (anything ``int()`` rejects) fall back to the
default rather than crashing module import operations needs to be
able to roll back a bad env-var push by simply removing the var,
not by also fixing a worker that won't boot.
"""
raw = os.environ.get(name)
if not raw:
return default
try:
return int(raw)
except (TypeError, ValueError):
logger.warning("internal_chat_uploads: env %s=%r not an int; using default %d", name, raw, default)
return default
# Total-request body cap. multipart/form-data with multiple parts can
# add ~100 bytes of framing per file; the cap is the bytes hitting the
# socket, including framing.
CHAT_UPLOAD_MAX_BYTES = 50 * 1024 * 1024 # 50 MB
#
# SSOT (issue #1520): the source of truth is the Go constant
# chatUploadMaxBytes in workspace-server/internal/handlers/chat_files.go,
# exported to the workspace container as CHAT_UPLOAD_MAX_TOTAL_BYTES at
# provision time (workspace_provision_shared.go::applyChatUploadLimits).
# Unset env → keep the previous 50 MB default so an unprovisioned /
# locally-run workspace does NOT regress.
CHAT_UPLOAD_MAX_BYTES = _env_int("CHAT_UPLOAD_MAX_TOTAL_BYTES", 50 * 1024 * 1024)
# Per-file cap. Keeping per-file under total lets a user attach, say,
# a 5 MB PDF + 10 small screenshots in a single batch.
CHAT_UPLOAD_MAX_FILE_BYTES = 25 * 1024 * 1024 # 25 MB
# Per-file cap. SSOT (issue #1520): exported from the Go side as
# CHAT_UPLOAD_MAX_FILE_BYTES; default 25 MB if env is unset so an older
# workspace provisioned before the env-injection landed keeps the
# legacy ceiling.
#
# This value is ALSO passed as Starlette's ``max_part_size`` (see
# ingest_handler below) — Starlette 1.0 defaults max_part_size to
# **1 MiB**, which is the actual root cause of #1520: any single file
# part above 1 MiB raised MultiPartException before per-file enforcement
# could fire. Wiring max_part_size to the same cap as per-file means
# the user-visible ceiling is exactly the per-file cap, no surprises.
CHAT_UPLOAD_MAX_FILE_BYTES = _env_int("CHAT_UPLOAD_MAX_FILE_BYTES", 25 * 1024 * 1024)
# Conservative {alnum, dot, underscore, dash} character class — anything
# outside gets rewritten so embedded paths, control chars, newlines,
@@ -146,11 +184,30 @@ async def ingest_handler(request: Request) -> JSONResponse:
status_code=413,
)
# max_part_size: Starlette 1.0 defaults to 1 MiB. Any single
# part above that raises MultiPartException BEFORE per-file
# enforcement can run — which silently broke every chat upload
# > 1 MiB (issue #1520, fleet-wide P0 2026-05-18). Wire it to
# the per-file cap so the user-visible ceiling matches what
# the per-file 413 path expects.
try:
form = await request.form(max_files=64, max_fields=32)
form = await request.form(
max_files=64,
max_fields=32,
max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES,
)
except Exception as exc: # multipart parse error
logger.warning("internal_chat_uploads: multipart parse failed: %s", exc)
return JSONResponse({"error": "failed to parse multipart form"}, status_code=400)
# Surface the exception detail (feedback_surface_actionable_failure_reason_to_user):
# MultiPartException strings ("Part exceeded maximum size of …",
# "Invalid boundary", "Too many parts", etc.) contain no secrets
# — they describe shape, not content. The 200-char cap is
# belt-and-braces against an exception class we haven't seen
# whose ``str()`` is unbounded.
return JSONResponse(
{"error": "failed to parse multipart form", "detail": str(exc)[:200]},
status_code=400,
)
# Starlette's FormData allows multiple values per key — `files` may
# appear multiple times for batched uploads. getlist returns them
+59
View File
@@ -57,12 +57,14 @@ from a2a_tools import (
tool_commit_memory,
tool_delegate_task,
tool_delegate_task_async,
tool_get_runtime_identity,
tool_get_workspace_info,
tool_inbox_peek,
tool_inbox_pop,
tool_list_peers,
tool_recall_memory,
tool_send_message_to_user,
tool_update_agent_card,
tool_wait_for_message,
)
@@ -289,6 +291,61 @@ _GET_WORKSPACE_INFO = ToolSpec(
section=A2A_SECTION,
)
_GET_RUNTIME_IDENTITY = ToolSpec(
name="get_runtime_identity",
short=(
"Return this runtime's identity — model, model_provider, tier, "
"workspace_id, runtime template. Reads from process env; no HTTP call."
),
when_to_use=(
"Use this to answer 'what model am I?' truthfully instead of "
"guessing from a stale system prompt — the operator may have "
"routed you to a different model via persona env between boots. "
"Always permitted by RBAC: even read-only agents may know what "
"model they are. Distinct from get_workspace_info — that one "
"calls the platform for ID/role/tier/parent (workspace metadata); "
"this one returns the live process env (MODEL, MODEL_PROVIDER, "
"MOLECULE_MODEL, ANTHROPIC_BASE_URL, TIER, WORKSPACE_ID, "
"ADAPTER_MODULE)."
),
input_schema={"type": "object", "properties": {}},
impl=tool_get_runtime_identity,
section=A2A_SECTION,
)
_UPDATE_AGENT_CARD = ToolSpec(
name="update_agent_card",
short=(
"Replace this workspace's agent_card on the platform. The "
"platform validates required fields and broadcasts an "
"agent_card_updated event so the canvas reflects the change live."
),
when_to_use=(
"Use when the workspace's capabilities, skills, description, or "
"name change and the canvas display needs to follow. The "
"platform stores the new card and pushes an "
"``agent_card_updated`` event to subscribers. Gated behind the "
"``memory.write`` RBAC capability — read-only roles cannot "
"rewrite the card. Tier-1+ owners always have this capability."
),
input_schema={
"type": "object",
"properties": {
"card": {
"type": "object",
"description": (
"The new agent_card object (name, version, "
"description, skills, etc). Server-side validation "
"rejects payloads missing required fields."
),
},
},
"required": ["card"],
},
impl=tool_update_agent_card,
section=A2A_SECTION,
)
_BROADCAST_MESSAGE = ToolSpec(
name="broadcast_message",
short=(
@@ -642,6 +699,8 @@ TOOLS: list[ToolSpec] = [
_CHECK_TASK_STATUS,
_LIST_PEERS,
_GET_WORKSPACE_INFO,
_GET_RUNTIME_IDENTITY,
_UPDATE_AGENT_CARD,
_BROADCAST_MESSAGE,
_SEND_MESSAGE_TO_USER,
# Inbox (standalone-only; in-container returns informational error)
@@ -5,6 +5,8 @@
- **check_task_status**: Poll the status of a task started with delegate_task_async; returns result when done.
- **list_peers**: List the workspaces this agent can communicate with — name, ID, status, role for each.
- **get_workspace_info**: Get this workspace's own info — ID, name, role, tier, parent, status.
- **get_runtime_identity**: Return this runtime's identity — model, model_provider, tier, workspace_id, runtime template. Reads from process env; no HTTP call.
- **update_agent_card**: Replace this workspace's agent_card on the platform. The platform validates required fields and broadcasts an agent_card_updated event so the canvas reflects the change live.
- **broadcast_message**: Send a message to ALL agent workspaces in the org simultaneously. Requires broadcast_enabled=true on this workspace (set by user/admin).
- **send_message_to_user**: Send a message directly to the user's canvas chat — pushed instantly via WebSocket. Use this to: (1) acknowledge a task immediately ('Got it, I'll start working on this'), (2) send interim progress updates while doing long work, (3) deliver follow-up results after delegation completes, (4) attach files (zip, pdf, csv, image) for the user to download via the `attachments` field (NEVER paste file URLs in `message`). The message appears in the user's chat as if you're proactively reaching out.
- **wait_for_message**: Block until the next inbound message (canvas user OR peer agent) arrives, or until ``timeout_secs`` elapses.
@@ -27,6 +29,12 @@ Call this first when you need to delegate but don't know the target's ID. Access
### get_workspace_info
Use to introspect your own identity (e.g. before reporting back to the user, or to determine whether you're a tier-0 root that can write GLOBAL memory).
### get_runtime_identity
Use this to answer 'what model am I?' truthfully instead of guessing from a stale system prompt — the operator may have routed you to a different model via persona env between boots. Always permitted by RBAC: even read-only agents may know what model they are. Distinct from get_workspace_info — that one calls the platform for ID/role/tier/parent (workspace metadata); this one returns the live process env (MODEL, MODEL_PROVIDER, MOLECULE_MODEL, ANTHROPIC_BASE_URL, TIER, WORKSPACE_ID, ADAPTER_MODULE).
### update_agent_card
Use when the workspace's capabilities, skills, description, or name change and the canvas display needs to follow. The platform stores the new card and pushes an ``agent_card_updated`` event to subscribers. Gated behind the ``memory.write`` RBAC capability — read-only roles cannot rewrite the card. Tier-1+ owners always have this capability.
### broadcast_message
Use for urgent, org-wide signals: critical status changes, emergency stop instructions, coordinated task announcements. Every non-removed workspace receives the message in its activity log (poll-mode agents see it on their next poll; push-mode canvases get a real-time banner). This tool returns an error if broadcast_enabled is false — a user or admin must enable it via the workspace abilities settings first.
+390
View File
@@ -0,0 +1,390 @@
"""Tests for ``tool_get_runtime_identity`` and ``tool_update_agent_card``.
These two MCP tools close the T4-tier workspace owner-permission gaps
reported via the canvas:
- the agent could not update its own ``agent_card`` (no MCP tool
wrapped the existing ``POST /registry/update-card`` endpoint);
- the agent could not identify which model it was running (the
``MODEL`` env var is injected by ``provisioner.workspace_provision``
but nothing surfaced it back to the agent).
Ported from molecule-ai-workspace-runtime PR#17 (mirror-only repo;
canonical edit point per ``reference_runtime_repo_is_mirror_only``).
Adapted to core's conventions:
* tool functions return ``str`` (JSON-encoded), matching every other
tool in ``a2a_tools_*`` modules. Tests ``json.loads`` to inspect.
* permission check ``memory.write`` runs inline in
``tool_update_agent_card`` (same pattern as
``a2a_tools_memory.tool_commit_memory``).
* ``WORKSPACE_ID`` is read directly from ``os.environ`` core does
not have the runtime's validated-cache layer (``molecule_runtime.
builtin_tools.validation``).
"""
from __future__ import annotations
import json
import pytest
# --- Drift gate: re-export aliases on a2a_tools ------------------------------
class TestBackCompatAliases:
"""Pin that ``a2a_tools.tool_*`` resolves to the same callable as
``a2a_tools_identity.tool_*``. Refactor wrapping (e.g. a doc-string
wrapper that loses the function identity) silently breaks call
sites that ``patch("a2a_tools.tool_update_agent_card", ...)``
this gate makes that drift fail fast."""
def test_tool_get_runtime_identity_alias(self):
import a2a_tools
import a2a_tools_identity
assert a2a_tools.tool_get_runtime_identity is a2a_tools_identity.tool_get_runtime_identity
def test_tool_update_agent_card_alias(self):
import a2a_tools
import a2a_tools_identity
assert a2a_tools.tool_update_agent_card is a2a_tools_identity.tool_update_agent_card
# --- tool_get_runtime_identity ----------------------------------------------
class TestGetRuntimeIdentity:
"""The tool returns env-derived runtime identity. No HTTP call."""
@pytest.mark.asyncio
async def test_returns_all_known_env_fields(self, monkeypatch):
from a2a_tools_identity import tool_get_runtime_identity
monkeypatch.setenv("MODEL", "claude-opus-4-7")
monkeypatch.setenv("MODEL_PROVIDER", "anthropic")
monkeypatch.setenv("TIER", "T4")
monkeypatch.setenv("WORKSPACE_ID", "ws-abc")
monkeypatch.setenv("ADAPTER_MODULE", "adapter")
monkeypatch.setenv("MOLECULE_MODEL", "claude-opus-4-7")
monkeypatch.setenv("ANTHROPIC_BASE_URL", "https://api.anthropic.com")
out = await tool_get_runtime_identity()
# MCP tools return JSON-encoded strings (matches the contract
# every other tool_* in a2a_tools_* uses).
assert isinstance(out, str)
parsed = json.loads(out)
assert parsed["model"] == "claude-opus-4-7"
assert parsed["model_provider"] == "anthropic"
assert parsed["tier"] == "T4"
assert parsed["workspace_id"] == "ws-abc"
assert parsed["runtime"] == "adapter"
assert parsed["molecule_model"] == "claude-opus-4-7"
assert parsed["anthropic_base_url"] == "https://api.anthropic.com"
@pytest.mark.asyncio
async def test_missing_env_returns_empty_strings(self, monkeypatch):
"""Tool MUST NOT raise when env vars are absent — every key is
present but the value is the empty string. The agent then knows
the slot exists but is unset."""
from a2a_tools_identity import tool_get_runtime_identity
for var in (
"MODEL", "MODEL_PROVIDER", "TIER", "WORKSPACE_ID",
"ADAPTER_MODULE", "MOLECULE_MODEL", "ANTHROPIC_BASE_URL",
):
monkeypatch.delenv(var, raising=False)
parsed = json.loads(await tool_get_runtime_identity())
assert parsed["model"] == ""
assert parsed["model_provider"] == ""
assert parsed["tier"] == ""
assert parsed["workspace_id"] == ""
assert parsed["runtime"] == ""
assert parsed["molecule_model"] == ""
assert parsed["anthropic_base_url"] == ""
@pytest.mark.asyncio
async def test_no_http_call_made(self, monkeypatch):
"""``get_runtime_identity`` is env-only — must not open
httpx.AsyncClient even if the call would otherwise succeed.
Tripwire any client construction."""
import httpx
from a2a_tools_identity import tool_get_runtime_identity
class _Tripwire:
def __init__(self, *_a, **_kw):
raise AssertionError(
"tool_get_runtime_identity must not open httpx.AsyncClient"
)
monkeypatch.setattr(httpx, "AsyncClient", _Tripwire)
# Must not raise.
await tool_get_runtime_identity()
@pytest.mark.asyncio
async def test_helper_dict_matches_string_payload(self, monkeypatch):
"""``_runtime_identity_payload`` is the dict-returning helper
used by both the public tool and tests. Verify the public tool
json.dumps the same dict no field is dropped or renamed by
the encoding step."""
from a2a_tools_identity import (
_runtime_identity_payload,
tool_get_runtime_identity,
)
monkeypatch.setenv("MODEL", "claude-opus-4-7")
monkeypatch.setenv("TIER", "T4")
monkeypatch.setenv("WORKSPACE_ID", "ws-helper-check")
helper = _runtime_identity_payload()
tool_str = await tool_get_runtime_identity()
assert json.loads(tool_str) == helper
# --- tool_update_agent_card -------------------------------------------------
class _MockResponse:
def __init__(self, status_code: int, payload: dict):
self.status_code = status_code
self._payload = payload
self.text = json.dumps(payload)
def json(self):
return self._payload
class _MockClient:
"""Drop-in for httpx.AsyncClient context manager.
Records the URL + json body + headers the tool POSTed so the test
can assert against them. Returns the canned _MockResponse passed
in at construction time.
"""
def __init__(self, *, response: _MockResponse, captured: dict):
self._response = response
self._captured = captured
async def __aenter__(self):
return self
async def __aexit__(self, *_args):
return False
async def post(self, url, *, json=None, headers=None, **_kw): # noqa: A002
self._captured["url"] = url
self._captured["json"] = json
self._captured["headers"] = headers
return self._response
@pytest.fixture
def _grant_memory_write(monkeypatch):
"""Force the inline RBAC gate inside ``tool_update_agent_card`` to
succeed. The gate calls
``a2a_tools_rbac.check_memory_write_permission`` which inspects
``$MOLECULE_ROLES`` / the role table; the patch sidesteps that
machinery so tests can focus on the platform-call shape.
"""
import a2a_tools_identity
monkeypatch.setattr(
a2a_tools_identity, "_check_memory_write_permission", lambda: True
)
class TestUpdateAgentCard:
@pytest.mark.asyncio
async def test_posts_to_registry_update_card(
self, monkeypatch, _grant_memory_write,
):
"""Hits POST {PLATFORM_URL}/registry/update-card with the
workspace bearer and the {workspace_id, agent_card} body shape
the platform handler expects (workspace-server
``internal/handlers/registry.go``)."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
# Ensure PLATFORM_URL re-import sees a deterministic value —
# a2a_client imports it at module load so we patch the symbol
# on a2a_tools_identity directly (the module's own reference).
monkeypatch.setattr(a2a_tools_identity, "PLATFORM_URL", "http://test.invalid")
captured: dict = {}
response = _MockResponse(200, {"status": "updated"})
def _client_factory(*_a, **_kw):
return _MockClient(response=response, captured=captured)
monkeypatch.setattr(a2a_tools_identity.httpx, "AsyncClient", _client_factory)
monkeypatch.setattr(
a2a_tools_identity, "_auth_headers_for_heartbeat",
lambda: {"Authorization": "Bearer ws-token-xyz"},
)
card = {"name": "agent-foo", "version": "0.1.0", "description": "demo"}
result_str = await a2a_tools_identity.tool_update_agent_card(card)
result = json.loads(result_str)
# URL: PLATFORM_URL + /registry/update-card
assert captured["url"] == "http://test.invalid/registry/update-card"
# The platform handler expects {workspace_id, agent_card}; the
# agent_card is the raw object the agent submitted.
body = captured["json"]
assert body["workspace_id"] == "ws-42"
assert body["agent_card"] == card
# Auth header from auth_headers_for_heartbeat is forwarded
# verbatim — same path commit_memory uses.
assert captured["headers"]["Authorization"] == "Bearer ws-token-xyz"
assert result["success"] is True
assert result["status"] == "updated"
@pytest.mark.asyncio
async def test_propagates_server_error(
self, monkeypatch, _grant_memory_write,
):
"""Non-200 from platform surfaces as a structured error to the
agent. The agent sees {success:false, status_code, error} and
can decide whether to retry, fall back, or escalate."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
monkeypatch.setattr(a2a_tools_identity, "PLATFORM_URL", "http://test.invalid")
captured: dict = {}
response = _MockResponse(400, {"error": "invalid card"})
monkeypatch.setattr(
a2a_tools_identity.httpx, "AsyncClient",
lambda *a, **kw: _MockClient(response=response, captured=captured),
)
monkeypatch.setattr(
a2a_tools_identity, "_auth_headers_for_heartbeat", lambda: {},
)
result = json.loads(
await a2a_tools_identity.tool_update_agent_card({"name": "x"})
)
assert result["success"] is False
assert result["status_code"] == 400
assert "invalid card" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_rejects_non_dict_card(self, _grant_memory_write):
"""The MCP schema constrains transport callers to pass a dict;
in-process callers (tests, sibling modules) can still pass any
type. Reject non-dict defensively so the platform isn't asked
to validate JSON-encoded strings or lists."""
from a2a_tools_identity import tool_update_agent_card
result = json.loads(await tool_update_agent_card("not-a-dict"))
assert result["success"] is False
assert "dict" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_workspace_id_missing_returns_error(
self, monkeypatch, _grant_memory_write,
):
"""If WORKSPACE_ID is not set the tool refuses to issue the
request it would otherwise POST with an empty workspace_id
and let the platform return a confusing 400."""
from a2a_tools_identity import tool_update_agent_card
monkeypatch.delenv("WORKSPACE_ID", raising=False)
result = json.loads(await tool_update_agent_card({"name": "x"}))
assert result["success"] is False
assert "workspace_id" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_denies_when_memory_write_permission_missing(self, monkeypatch):
"""The agent's RBAC role must grant ``memory.write`` to update
the card. Read-only roles get an RBAC error string back
immediately, never touching the platform."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
monkeypatch.setattr(
a2a_tools_identity, "_check_memory_write_permission", lambda: False,
)
# Tripwire httpx — must not be called when RBAC denies.
import httpx
class _Tripwire:
def __init__(self, *_a, **_kw):
raise AssertionError("RBAC denial must short-circuit before httpx call")
monkeypatch.setattr(httpx, "AsyncClient", _Tripwire)
result = json.loads(
await a2a_tools_identity.tool_update_agent_card({"name": "x"}),
)
assert result["success"] is False
assert "memory.write" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_network_exception_returns_structured_error(
self, monkeypatch, _grant_memory_write,
):
"""A network exception (DNS failure, connect timeout, etc) is
wrapped into a structured error dict instead of bubbling up
to the MCP transport layer."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
monkeypatch.setattr(a2a_tools_identity, "PLATFORM_URL", "http://test.invalid")
class _ExplodingClient:
async def __aenter__(self):
return self
async def __aexit__(self, *_a):
return False
async def post(self, *_a, **_kw):
raise RuntimeError("simulated DNS failure")
monkeypatch.setattr(
a2a_tools_identity.httpx, "AsyncClient",
lambda *a, **kw: _ExplodingClient(),
)
result = json.loads(
await a2a_tools_identity.tool_update_agent_card({"name": "x"})
)
assert result["success"] is False
assert "network" in str(result["error"]).lower()
# --- Registry contract ------------------------------------------------------
class TestRegistryContract:
"""Pin the new tools' registration in platform_tools.registry. The
structural tests in ``test_platform_tools.py`` already check
registryMCP alignment; these are tighter assertions specific to
the two new tools so a future contributor deleting one entry sees
a focused failure."""
def test_get_runtime_identity_in_registry(self):
from platform_tools.registry import by_name
spec = by_name("get_runtime_identity")
assert spec.section == "a2a"
# No input parameters — env-only call.
assert spec.input_schema == {"type": "object", "properties": {}}
# impl points at the actual tool function, not a shim.
from a2a_tools_identity import tool_get_runtime_identity
assert spec.impl is tool_get_runtime_identity
def test_update_agent_card_in_registry(self):
from platform_tools.registry import by_name
spec = by_name("update_agent_card")
assert spec.section == "a2a"
assert "card" in spec.input_schema["properties"]
assert spec.input_schema["required"] == ["card"]
from a2a_tools_identity import tool_update_agent_card
assert spec.impl is tool_update_agent_card
+117
View File
@@ -788,6 +788,123 @@ def test_sanitize_agent_error_stderr_combined_with_existing_tests():
assert "workspace logs" in out
# ─── reason passthrough (internal#211/#212: surface actionable provider error) ───
def test_sanitize_agent_error_reason_surfaced_verbatim():
"""A curated provider reason is shown to the user, not collapsed to the
exception class name. This is the internal#211 regression: a 403
org-disabled message must reach the canvas."""
reason = (
"provider HTTP 403 — oauth_org_not_allowed — Your organization has "
"disabled Claude subscription access for Claude Code · Use an "
"Anthropic API key instead, or ask your admin to enable access"
)
class _ResultErr(Exception):
pass
out = sanitize_agent_error(exc=_ResultErr("opaque"), reason=reason)
# The actionable provider guidance and status code must be visible.
assert "403" in out
assert "oauth_org_not_allowed" in out
assert "disabled Claude subscription access" in out
assert "ask your admin to enable access" in out
# NOT the old opaque form.
assert "see workspace logs" not in out
def test_sanitize_agent_error_reason_still_scrubs_secrets():
"""Even on the reason path the key/token scrubber runs — a buggy caller
that lets a bearer token into the reason still gets it redacted."""
leaky = (
"provider HTTP 401 — auth failed — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm please re-auth"
)
out = sanitize_agent_error(reason=leaky)
assert "[REDACTED]" in out
assert "PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm" not in out
# The non-secret guidance still survives the scrub.
assert "401" in out
assert "please re-auth" in out
def test_sanitize_agent_error_reason_scrubs_all_secret_formats():
"""The scrubber must redact every realistic credential shape — not just
the `Bearer <tok>` form the original test happened to exercise
(internal#212 review finding: bare `sk-ant-api03-…` keys, JSON-quoted
"token"/"apiKey" values, and `aws_secret_access_key=` all leaked).
All curated/actionable guidance must still survive the scrub.
"""
# 1. Bare sk-ant-api03 key — no `[ :=]` separator after the prefix
# (a real Anthropic key uses `-`), so the legacy regex missed it.
bare = (
"provider HTTP 401 — auth failed — invalid key "
"sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789 "
"please re-auth"
)
out = sanitize_agent_error(reason=bare)
assert "sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789" not in out
assert "[REDACTED]" in out
assert "401" in out # actionable status survives
assert "please re-auth" in out # actionable guidance survives
# 2. JSON-quoted "token" / "apiKey" values.
jblob = (
'provider error — config dump {"token": '
'"abcDEF0123456789ghIJKL0123456789mnopQRST", "apiKey": '
'"anon_fakefakefakefakefakefakefakefakefakefake"} — '
"use an API key instead"
)
out = sanitize_agent_error(reason=jblob)
assert "abcDEF0123456789ghIJKL0123456789mnopQRST" not in out
assert "anon_fakefakefakefakefakefakefakefakefakefake" not in out
assert "[REDACTED]" in out
assert "use an API key instead" in out # actionable guidance survives
# 3. aws_secret_access_key=… form.
awsblob = (
"provider HTTP 403 — boto credential error "
"aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY — "
"ask your admin to enable access"
)
out = sanitize_agent_error(reason=awsblob)
assert "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" not in out
assert "[REDACTED]" in out
assert "403" in out # actionable status survives
assert "ask your admin to enable access" in out # guidance survives
# 4. Regression: the original Bearer form still redacts.
# Uses PLACEHOLDER_LONG_TOKEN (>=40 chars, no sk-ant- prefix) to avoid
# triggering the secret-scan workflow pattern
# `sk-ant-[A-Za-z0-9_-]{40,}`.
bearer = (
"provider HTTP 401 — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij re-auth"
)
out = sanitize_agent_error(reason=bearer)
assert "PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij" not in out
assert "[REDACTED]" in out
assert "re-auth" in out
def test_sanitize_agent_error_reason_wins_over_stderr():
"""When both reason and stderr are passed, the curated reason wins."""
out = sanitize_agent_error(
reason="provider HTTP 403 — use an API key",
stderr="raw subprocess noise that should not be shown",
)
assert "use an API key" in out
assert "raw subprocess noise" not in out
def test_sanitize_agent_error_no_reason_unchanged():
"""Omitting reason preserves the original generic behavior."""
out = sanitize_agent_error(exc=ValueError("boom"))
assert "ValueError" in out
assert "workspace logs" in out
# ======================================================================
# classify_subprocess_error
@@ -299,3 +299,122 @@ def test_symlink_at_target_is_refused(client: TestClient, chat_uploads_dir: Path
assert r.status_code == 500, r.text
# Sentinel content unchanged — the symlink wasn't followed.
assert sentinel.read_bytes() == b"original"
# ───────────── issue #1520: max_part_size + SSOT env-driven caps ─────────────
def test_part_above_starlette_1mib_default_is_accepted(client: TestClient, chat_uploads_dir: Path):
"""Regression: pre-fix, ANY single multipart part > 1 MiB raised
MultiPartException because the ingest handler called
``request.form()`` without ``max_part_size`` and Starlette 1.0's
default is 1 MiB (issue #1520, fleet-wide P0 2026-05-18).
This test sends a 2 MiB part, which is well below the 25 MB default
per-file cap but ABOVE the Starlette default, so it pins the fix:
we now pass ``max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES`` so the
parser uses the same cap the per-file 413 path expects.
"""
payload = b"a" * (2 * 1024 * 1024) # 2 MiB — > Starlette 1 MiB default
r = client.post(
"/internal/chat/uploads/ingest",
files={"files": ("big-but-allowed.bin", payload)},
headers={"Authorization": "Bearer test-secret"},
)
assert r.status_code == 200, r.text
item = r.json()["files"][0]
assert item["size"] == len(payload)
def test_parse_error_surfaces_exception_detail(client: TestClient):
"""Per feedback_surface_actionable_failure_reason_to_user: the 400
body must include a ``detail`` field naming WHICH multipart error
fired. The MultiPartException strings ("Part exceeded maximum size
of ", "Invalid boundary", "Too many parts", etc.) describe SHAPE
not content no secrets.
We trigger a real Starlette MultiPartException by submitting a body
whose Content-Type advertises ``multipart/form-data`` but whose
body is not a valid multipart envelope the parser raises before
any per-file check can fire.
"""
r = client.post(
"/internal/chat/uploads/ingest",
content=b"this is not a valid multipart body",
headers={
"Authorization": "Bearer test-secret",
"Content-Type": "multipart/form-data; boundary=----not-a-real-boundary",
},
)
assert r.status_code == 400, r.text
body = r.json()
assert body["error"] == "failed to parse multipart form"
# Detail must be present + non-empty + bounded.
assert "detail" in body and isinstance(body["detail"], str)
assert body["detail"], "detail must not be empty"
assert len(body["detail"]) <= 200, "detail must be bounded"
def test_total_cap_413_still_fires_above_per_file_pass(client: TestClient, monkeypatch: pytest.MonkeyPatch):
"""Total-cap 413 path still works: two parts whose sum exceeds
CHAT_UPLOAD_MAX_BYTES but each individually fits the per-file cap.
Sanity-check that raising the per-file ceiling didn't accidentally
short-circuit the total-cap check.
"""
monkeypatch.setattr(internal_chat_uploads, "CHAT_UPLOAD_MAX_BYTES", 1024)
monkeypatch.setattr(internal_chat_uploads, "CHAT_UPLOAD_MAX_FILE_BYTES", 800)
r = client.post(
"/internal/chat/uploads/ingest",
files=[
("files", ("a.bin", b"a" * 600)),
("files", ("b.bin", b"b" * 600)),
],
headers={"Authorization": "Bearer test-secret"},
)
assert r.status_code == 413
# Either early (Content-Length pre-parse) or post-parse cumulative path is
# acceptable; both messages mention exceeding the total limit.
err = r.json()["error"]
assert "exceeds" in err and "limit" in err, err
def test_env_driven_ssot_overrides_caps(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
"""SSOT contract: setting CHAT_UPLOAD_MAX_FILE_BYTES /
CHAT_UPLOAD_MAX_TOTAL_BYTES in the environment at module import
time changes the module constants. Pin so the
workspace_provision_shared.go::applyChatUploadLimits env injection
cannot silently drift from what the Python side reads.
"""
import importlib
monkeypatch.setenv("CHAT_UPLOAD_MAX_FILE_BYTES", str(7 * 1024 * 1024))
monkeypatch.setenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", str(13 * 1024 * 1024))
reloaded = importlib.reload(internal_chat_uploads)
try:
assert reloaded.CHAT_UPLOAD_MAX_FILE_BYTES == 7 * 1024 * 1024
assert reloaded.CHAT_UPLOAD_MAX_BYTES == 13 * 1024 * 1024
finally:
# Reset to defaults so subsequent tests see clean constants.
monkeypatch.delenv("CHAT_UPLOAD_MAX_FILE_BYTES", raising=False)
monkeypatch.delenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", raising=False)
importlib.reload(internal_chat_uploads)
def test_env_driven_ssot_malformed_value_falls_back_to_default(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
"""If ops pushes a garbage value the worker still boots with the
in-code default (operability over precision see _env_int
docstring). Pin the fallback.
"""
import importlib
monkeypatch.setenv("CHAT_UPLOAD_MAX_FILE_BYTES", "not-an-int")
monkeypatch.setenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", "") # empty == use default
reloaded = importlib.reload(internal_chat_uploads)
try:
# Defaults (legacy 25 MB / 50 MB) come back.
assert reloaded.CHAT_UPLOAD_MAX_FILE_BYTES == 25 * 1024 * 1024
assert reloaded.CHAT_UPLOAD_MAX_BYTES == 50 * 1024 * 1024
finally:
monkeypatch.delenv("CHAT_UPLOAD_MAX_FILE_BYTES", raising=False)
monkeypatch.delenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", raising=False)
importlib.reload(internal_chat_uploads)