Compare commits

..

22 Commits

Author SHA1 Message Date
infra-runtime-be 335796b0b4 fix(tests): replace remaining sk-ant-api03- fixtures with non-matching tokens
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
publish-runtime-autobump / pr-validate (pull_request) Successful in 28s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Successful in 3s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
audit-force-merge / audit (pull_request) Successful in 4s
The secret-scan workflow flags sk-ant-[A-Za-z0-9_-]{40,} patterns.
Two sk-ant-api03-* fixture tokens (47 and 62 chars) were present in
test_sanitize_agent_error_reason_scrubs_all_secret_formats. They were
not replaced by PR #1430 (which only fixed the sk-ant-DEADBEEF* tokens).

Replace with tokens that still exercise the same scrubber paths:

- BARE sk-* case (≥24 chars after "sk-"): use sk-FAKEPLACEHOLDER...
  (53 chars total; starts with "sk-" so the bare-pattern scrubber catches
  it, but lacks "sk-ant-" so the secret-scan pattern does not fire).

- JSON-quoted apiKey value (≥24 chars): use anon_fakefakefake...
  (45 chars; satisfies the JSON-quoted redaction path; does not match
  any secret-scan credential pattern).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:34:31 +00:00
infra-runtime-be 699b5fb275 Merge pull request 'fix(tests)+build: unblock secret scan and Runtime PR-Built on #1420' (#1430) from runtime/fix-test-fixture-v3 into fix/issue212-actionable-agent-error-reason
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
publish-runtime-autobump / pr-validate (pull_request) Successful in 30s
gate-check-v3 / gate-check (pull_request) Successful in 8s
qa-review / approved (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m0s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m6s
E2E Chat / E2E Chat (pull_request) Failing after 1s
Harness Replays / Harness Replays (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m44s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m52s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m32s
CI / Platform (Go) (pull_request) Successful in 6m41s
CI / Canvas (Next.js) (pull_request) Successful in 7m19s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m37s
CI / all-required (pull_request) Successful in 0s
2026-05-17 16:18:01 +00:00
infra-runtime-be fb2fd20c9e fix(tests)+build: unblock secret scan and Runtime PR-Built on #1420
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request) Successful in 3s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 3s
sop-tier-check / tier-check (pull_request) Successful in 3s
publish-runtime-autobump / pr-validate (pull_request) Successful in 24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
sop-checklist / na-declarations (pull_request) N/A: (none)
audit-force-merge / audit (pull_request) Successful in 3s
Two CI failures blocking PR #1420:
1. Secret scan: `workspace/tests/test_executor_helpers.py` contains two
   `sk-ant-DEADBEEF...` fixtures matching `sk-ant-[A-Za-z0-9_-]{40,}`.
   Replaced both with PLACEHOLDER_LONG_TOKEN_... (≥40 chars, no sk-ant-
   prefix — scrubber path still exercised).
2. Runtime PR-Built: `workspace/a2a_tools_identity.py` missing from
   TOP_LEVEL_MODULES in scripts/build_runtime_package.py, causing build
   failure with "TOP_LEVEL_MODULES drifted". Added it.

Both fixes verified locally:
- pytest affected tests: 3/3 PASSED
- build_runtime_package.py: builds cleanly

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 15:48:31 +00:00
fullstack-engineer 7d2eaa3748 harden(runtime): scrub bare sk-ant keys, JSON-quoted token/apiKey, aws_secret_access_key in _sanitize_for_external
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 7s
publish-runtime-autobump / pr-validate (pull_request) Successful in 35s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 9s
gate-check-v3 / gate-check (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 9s
qa-review / approved (pull_request) Successful in 10s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 9s
CI / Platform (Go) (pull_request) Successful in 10m22s
CI / Canvas (Next.js) (pull_request) Successful in 10m48s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
E2E Chat / E2E Chat (pull_request) Failing after 3s
Harness Replays / Harness Replays (pull_request) Successful in 1s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 54s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 43s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m56s
CI / Python Lint & Test (pull_request) Successful in 6m40s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 1s
Addresses internal#212 PR#1420 dual-review SECURITY finding (infra-sre /
infra-runtime-be): _sanitize_for_external missed three real credential
shapes because the legacy regex requires a `[ :=]+` separator after the
prefix:
- bare `sk-ant-api03-…` keys (real key uses `-`, not `[ :=]`)
- JSON-quoted "token"/"apiKey"/"secret"/"password" values
- `aws_secret_access_key=…`

Added three narrowly-scoped regexes (length thresholds tuned so curated
short examples like `sk-ant-EXAMPLE-SHORT` / `ghp_SHORT_TOKEN` and all
actionable auth/quota/HTTP guidance still pass through). Extended the unit
test with test_sanitize_agent_error_reason_scrubs_all_secret_formats
asserting redaction for all three new formats plus the original Bearer
regression. Full sanitize suite green; existing passthrough assertions
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:56:16 -07:00
fullstack-engineer 44b78e28c8 fix(runtime+canvas): surface actionable provider error reason instead of opaque "Agent error (Exception)"
CI / all-required (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
E2E Chat / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 6s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 6s
security-review / approved (pull_request) Successful in 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
publish-runtime-autobump / pr-validate (pull_request) Successful in 33s
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-tier-check / tier-check (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
E2E Chat / E2E Chat (pull_request) Failing after 13s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 55s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m38s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m38s
CI / Platform (Go) (pull_request) Successful in 7m2s
CI / Python Lint & Test (pull_request) Successful in 6m39s
CI / Canvas (Next.js) (pull_request) Successful in 7m56s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
internal#212 (P0 from internal#211). When the embedded `claude` CLI emits a
terminal result message with is_error=true (e.g. 403 oauth_org_not_allowed
"Your organization has disabled Claude subscription access · Use an
Anthropic API key instead, or ask your admin to enable access"), the user
saw only `Agent error (Exception) — see workspace logs for details.` — a
dead end (no such logs UI) that discards the exact secret-safe, actionable
text the user needs.

Root cause was a multi-cut loss of the CLI's result/error/api_error_status:

  cut #2  sanitize_agent_error reduced every failure to type(exc).__name__.
          → add a `reason` passthrough: a pre-curated, user-actionable,
            secret-safe explanation is surfaced verbatim (still scrubbed for
            key/token/bearer as a second pass). reason wins over stderr;
            omitting it preserves the prior generic behavior exactly.

  cut #3a workspace-server dropped error_detail from the live
          ACTIVITY_LOGGED websocket broadcast (it was persisted to the DB
          column but never sent), so the canvas had nothing to render.
          → include error_detail in the broadcast payload (already capped
            at 4096 by the runtime's report_activity helper).

  cut #3b canvas useChatSocket hardcoded the opaque string, ignoring even
          the activity summary.
          → render error_detail (fallback: summary, then a generic retry
            hint). The dead "see workspace logs for details." phrase that
            pointed at nonexistent UI is removed (a full logs tab is a
            separate larger follow-up, not this PR — reason-first per CTO).

The runtime-side cut #1 (template-claude-code claude_sdk_executor._run_query
ignoring is_error and the SDK collapsing errors[] to the bare subtype
"success") is fixed in a stacked PR on
molecule-ai-workspace-template-claude-code (depends on this PR's
sanitize_agent_error `reason` kwarg, which ships via the
molecule-ai-workspace-runtime package).

Tests: 4 new sanitize_agent_error reason tests (verbatim surfacing, secret
scrub still applied, reason>stderr precedence, no-reason unchanged). Verified
fail-before / pass-after; full sanitize suite green; no new regressions (the
2 pre-existing test_get_a2a_instructions_mcp failures are unrelated).

Refs: internal#211, internal#212

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:20:14 -07:00
devops-engineer 330f54d281 Merge pull request 'fix(tokens): Workspace Tokens tab 500 on 'global' sentinel (no node selected)' (#1415) from fix/workspace-tokens-global-sentinel-500 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 5s
E2E Chat / detect-changes (push) Successful in 4s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
CI / Shellcheck (E2E scripts) (push) Successful in 1s
CI / Python Lint & Test (push) Successful in 5s
CI / Platform (Go) (push) Successful in 4m27s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 35s
E2E Chat / E2E Chat (push) Failing after 10s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m10s
Harness Replays / Harness Replays (push) Successful in 1s
CI / Canvas (Next.js) (push) Successful in 6m2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 30s
CI / Canvas Deploy Reminder (push) Successful in 2s
CI / all-required (push) Successful in 2s
2026-05-17 13:46:55 +00:00
hongming 4fd6612272 fix(tokens): make Workspace Tokens tab sentinel-aware + reject non-UUID workspace id
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
CI / Detect changes (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m37s
Harness Replays / Harness Replays (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m55s
CI / Platform (Go) (pull_request) Successful in 5m8s
CI / Canvas (Next.js) (pull_request) Successful in 6m20s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 0s
audit-force-merge / audit (pull_request) Successful in 3s
Settings → Workspace Tokens 500'd whenever opened with no canvas node
selected. SettingsPanel passes the literal sentinel "global" as the
workspace id; the backend queries the uuid `workspace_id` column with
it → Postgres `invalid input syntax for type uuid: "global"` → opaque
500 ("failed to list tokens"). Token create in that view broke the same
way. SecretsTab already handles the sentinel (api/secrets.ts reroutes
"global" → /settings/secrets); TokensTab did not — that asymmetry was
the bug. Pre-existing since 2026-04-13, NOT a regression.

Frontend (user-visible fix): TokensTab is now sentinel-aware like
SecretsTab. When workspaceId === "global" (no node selected) it no
longer calls /workspaces/global/tokens — it renders a clean state
pointing the user to the Org API Keys tab (the existing org-wide
surface). No 500, no scary error banner. The red account "Error" in
this view was just this 500 surfacing through TokensTab's local error
banner; it resolves with this guard (verified in code — no separate
widget).

Backend (defense-in-depth, same PR): List/Create/Revoke validate
c.Param("id") as a UUID up front and return 400 {"error":"invalid
workspace id"} instead of leaking a DB type error as a 500. Added the
missing log.Printf on the List query-error branch — it was the only
token handler silently swallowing the DB error, which is why this
incident had zero log trail. Mirrors the uuid.Parse guard already in
handlers/activity.go.

Workaround (pre-merge): select a workspace node before opening the
tab, or use the Org API Keys tab.

Product note for CTO: there is no /workspaces/global/tokens endpoint
(workspace tokens are inherently per-workspace; the org-wide
equivalent is the separate Org API Keys tab), so — unlike SecretsTab
which reroutes to a real global-secrets endpoint — the lowest-risk
safe behavior was a disabled state + pointer to Org API Keys rather
than a reroute. Flag if a different UX is wanted.

Tests: added TokensTab sentinel tests (no API call + Org-pointer) and
a backend table test asserting List/Create/Revoke 400 on non-UUID id
without hitting the DB. Updated existing token handler tests to use
valid UUIDs (they used "ws-1" etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 06:22:00 -07:00
devops-engineer b5411d2c37 Merge pull request 'harden(provisioner): denylist SCM-write tokens from tenant workspace env (forensic #145)' (#1277) from harden/provisioner-scm-token-denylist into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 17s
Harness Replays / detect-changes (push) Successful in 19s
CI / Detect changes (push) Successful in 1m14s
E2E Chat / detect-changes (push) Successful in 1m25s
E2E API Smoke Test / detect-changes (push) Successful in 1m28s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 26s
Handlers Postgres Integration / detect-changes (push) Successful in 1m18s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 1m9s
CI / Shellcheck (E2E scripts) (push) Successful in 8s
Harness Replays / Harness Replays (push) Successful in 8s
CI / Python Lint & Test (push) Successful in 10s
E2E Chat / E2E Chat (push) Failing after 44s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m15s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 5m57s
CI / Canvas (Next.js) (push) Successful in 18m50s
CI / Platform (Go) (push) Successful in 19m21s
CI / Canvas Deploy Reminder (push) Successful in 6s
CI / all-required (push) Has been cancelled
2026-05-16 05:16:21 +00:00
claude-ceo-assistant 03ad7ab2d8 chore(ci): re-trigger required checks (post-#441 fix; 03:50Z storm-cancel residue)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 18s
CI / Detect changes (pull_request) Successful in 1m9s
Harness Replays / detect-changes (pull_request) Successful in 29s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 29s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m26s
E2E Chat / detect-changes (pull_request) Successful in 1m23s
gate-check-v3 / gate-check (pull_request) Successful in 25s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m29s
qa-review / approved (pull_request) Successful in 29s
security-review / approved (pull_request) Successful in 29s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
sop-checklist / all-items-acked (pull_request) Successful in 33s
sop-tier-check / tier-check (pull_request) Successful in 26s
CI / Python Lint & Test (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 11s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 20s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m44s
E2E Chat / E2E Chat (pull_request) Failing after 23s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m46s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m22s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 18m41s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 19m37s
CI / all-required (pull_request) Successful in 5s
audit-force-merge / audit (pull_request) Successful in 25s
2026-05-15 21:40:47 -07:00
core-be fd545a332b harden(provisioner): denylist SCM-write tokens from tenant workspace env (forensic #145)
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E Chat / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Harness Replays / detect-changes (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 28s
CI / Detect changes (pull_request) Successful in 3m54s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 24m25s
CI / Platform (Go) (pull_request) Successful in 26m22s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has been cancelled
E2E Chat / E2E Chat (pull_request) Has been cancelled
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Has been cancelled
Harness Replays / Harness Replays (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
Tenant workspace containers run agent-controlled code and must never
receive a Git SCM write credential — agents structurally lacking
merge/approve creds is why the two-eyes review gate is self-bypass-proof
against forged-approval injection.

Latent path: handlers.loadPersonaEnvFile() merges a per-role persona
GITEA_TOKEN into cfg.EnvVars when MOLECULE_PERSONA_ROOT is set on a
tenant host; it then flowed unfiltered through buildContainerEnv()
(local Docker) and CPProvisioner.Start() (tenant EC2). Inert today
(persona dirs are operator-host-only) but unguarded — and the
pre-existing TestBuildContainerEnv_CustomEnvVarsAppended test actually
asserted GITHUB_TOKEN passed through verbatim.

Adds a narrow, auditable exact-match denylist (isSCMWriteTokenKey:
GITEA/GITHUB/GH/GITLAB/GL/BITBUCKET _TOKEN) applied by construction in
both env paths, plus negative-assertion tests covering the normal path
and a persona-file-merge simulation. Non-credential persona identity
(GITEA_USER, GITEA_USER_EMAIL) is intentionally preserved. No
provisioner refactor.

Tracking: molecule-ai/internal#438

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 18:43:18 -07:00
devops-engineer 8334f7df46 Merge pull request 'fix(handlers): drain detached async goroutines before test db.DB swap (data race)' (#1267) from fix/handlers-global-dbdb-data-race into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Harness Replays / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
CI / Detect changes (push) Successful in 1m33s
E2E API Smoke Test / detect-changes (push) Successful in 2m55s
E2E Chat / detect-changes (push) Successful in 2m39s
CI / Python Lint & Test (push) Successful in 17s
E2E Chat / E2E Chat (push) Failing after 55s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3m21s
CI / Canvas (Next.js) (push) Successful in 24m37s
CI / Platform (Go) (push) Successful in 26m11s
Handlers Postgres Integration / Handlers Postgres Integration (push) Has been cancelled
Harness Replays / Harness Replays (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
CI / all-required (push) Has been cancelled
2026-05-16 01:30:41 +00:00
core-be 69d9b4e38d fix(handlers): drain detached async goroutines before test db.DB swap (data race)
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 29s
Harness Replays / detect-changes (pull_request) Successful in 56s
CI / Detect changes (pull_request) Successful in 1m42s
E2E Chat / detect-changes (pull_request) Successful in 1m30s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m33s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 45s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m18s
gate-check-v3 / gate-check (pull_request) Successful in 31s
qa-review / approved (pull_request) Successful in 31s
security-review / approved (pull_request) Successful in 26s
sop-tier-check / tier-check (pull_request) Successful in 28s
sop-checklist / all-items-acked (pull_request) Successful in 32s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m24s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m5s
Harness Replays / Harness Replays (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Successful in 22s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 24s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m51s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7m48s
CI / Canvas (Next.js) (pull_request) Successful in 22m59s
CI / Platform (Go) (pull_request) Successful in 25m39s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 26s
audit-force-merge / audit (pull_request) Successful in 30s
Root cause: platform/internal/db.DB is a swappable package global.
setupTestDB (+ peer test helpers) saves/restores it via t.Cleanup, but
production code spawns fire-and-forget goroutines (maybeMarkContainerDead/
preflightContainerHealth -> RestartByID -> runRestartCycle, logA2ASuccess/
Failure activity logging, gracefulPreRestart, sendRestartContext) that
read db.DB. These detached goroutines outlive the test that triggered
them and race the db.DB pointer write in a LATER test's cleanup —
WARNING: DATA RACE on platform/internal/db.DB, surfaced deterministically
by PR#1240's expanded A2A test corpus on staging (a sibling of the
mc#664/mc#774 Phase-3-masked handler-test family). Pre-existing since
be5fbb5a (2026-05-07); NOT introduced by #1240/#1250.

Fix:
- Convert the leaked raw `go ...` restart/a2a-logging goroutines to the
  existing tracked h.goAsync (asyncWG) — matches the already-correct
  site at a2a_proxy.go:648 and goAsync's documented intent.
- Wire the never-connected test-drain half: a newHandlerHook (nil in
  prod, zero cost) lets the test harness register every handler;
  setupTestDB's cleanup now drains all tracked async goroutines BEFORE
  restoring db.DB, eliminating the race window.

Verified: full `go test -race -timeout ./...` (CI step) green, 0 races,
0 failures; the 8 originally-failing tests pass -race -count=5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 17:53:36 -07:00
devops-engineer a4a1194a31 Merge pull request 'feat(canvas): /agent-home root option + secret-shape denial placeholder (internal#425 Phase 3)' (#1257) from feat/canvas-files-agent-home-internal-425-phase-3 into staging
CI / Canvas Deploy Reminder (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 32s
Harness Replays / detect-changes (push) Successful in 34s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 37s
CI / Detect changes (push) Successful in 2m7s
E2E API Smoke Test / detect-changes (push) Successful in 2m59s
E2E Chat / detect-changes (push) Successful in 3m1s
Handlers Postgres Integration / detect-changes (push) Successful in 2m59s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 2m55s
Harness Replays / Harness Replays (push) Successful in 52s
CI / Shellcheck (E2E scripts) (push) Successful in 19s
CI / Python Lint & Test (push) Successful in 23s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 20s
E2E Chat / E2E Chat (push) Failing after 14s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m25s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 8m21s
CI / Canvas (Next.js) (push) Successful in 23m48s
CI / Platform (Go) (push) Failing after 24m11s
CI / all-required (push) Has been cancelled
2026-05-16 00:50:32 +00:00
devops-engineer 5ace10fd14 Merge pull request 'feat(secrets): SSOT Go package for credential-shape regex (internal#425 Phase 2a)' (#1255) from feat/secrets-patterns-ssot-internal-425-phase-2a into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
E2E Chat / detect-changes (push) Successful in 1m31s
E2E API Smoke Test / detect-changes (push) Successful in 1m48s
Harness Replays / detect-changes (push) Successful in 33s
Secret scan / Scan diff for credential-shaped strings (push) Failing after 39s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 2m10s
E2E API Smoke Test / E2E API Smoke Test (push) Has been cancelled
Harness Replays / Harness Replays (push) Successful in 21s
E2E Chat / E2E Chat (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
2026-05-16 00:43:51 +00:00
devops-engineer 1dc1ca9993 Merge pull request '[stub] Files API: add /agent-home root key, 501 dispatch (internal#425)' (#1247) from stub/files-api-agent-home-root-2026-05-15 into staging
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / detect-changes (push) Waiting to run
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-16 00:39:53 +00:00
core-fe bb4840ccbb feat(canvas): /agent-home root option + secret-shape denial placeholder (internal#425 Phase 3)
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / detect-changes (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
gate-check-v3 / gate-check (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 36s
CI / Detect changes (pull_request) Successful in 2m1s
Harness Replays / detect-changes (pull_request) Successful in 1m5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m53s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 1m18s
qa-review / approved (pull_request) Successful in 1m7s
sop-checklist / all-items-acked (pull_request) Successful in 1m5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 2m15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m29s
CI / Platform (Go) (pull_request) Failing after 23m29s
CI / Canvas (Next.js) (pull_request) Successful in 23m44s
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
CI / Python Lint & Test (pull_request) Has been cancelled
CI / all-required (pull_request) SOP-13 override (re-applied) — molecule-core#1264 repo-wide handlers flake. ci.yml run 60162 cancelled; diff verified clean locally.
Phase 3 of the Files API roots RFC. UI-side wiring for the new
/agent-home root. Backend dispatch is the Phase 2b PR (#TBD) — until
that lands, /agent-home returns the 501 stub from #1247, which the
existing error banner already surfaces gracefully.

Changes:

1. canvas/src/components/tabs/FilesTab/FilesToolbar.tsx — adds
   <option value="/agent-home">/agent-home</option> at the bottom
   of the root selector. Pre-Phase-2b the dropdown still works
   because the server-side 501 is just an error response — same
   error-banner path as a transient backend failure.

2. canvas/src/components/tabs/FilesTab.tsx — new
   defaultRootForRuntime() function pins the initial root per-
   runtime per Hongming Decisions §2 (internal#425):

     - openclaw → /agent-home (the user-facing interesting state)
     - everything else → /configs (legacy default)

   FilesTab now reads workspace runtime from props.data?.runtime
   and threads it through to PlatformOwnedFilesTab. Undefined-
   runtime callers (legacy tests, pre-load states) default to
   /configs — matches today's behaviour, no surprise.

3. canvas/src/components/tabs/FilesTab/FileEditor.tsx — new
   SECRET_SHAPE_DENIED_MARKER export + denial-placeholder render
   path. When fileContent === marker, the editor renders a
   role=region placeholder instead of the textarea, so the matched
   bytes never enter a controlled input (DOM value, clipboard,
   inspector). Marker constant matches the canonical
   '<denied: secret-shape>' string the Phase 2b backend will emit.

   Also: /agent-home is read-only via isReadOnlyRoot until Phase
   2b decides write semantics. Until then, write attempts would
   201 with the 501 stub anyway, but blocking the textarea at the
   UI saves the user a round-trip + a confusing error.

Tests (canvas/src/components/tabs/FilesTab/__tests__/agentHome.test.tsx):

  - dropdown includes /agent-home option (pins Phase 1 contract)
  - dropdown reflects /agent-home as selected value when prop is set
  - denied-marker renders placeholder INSTEAD OF textarea (pins
    the bytes-don't-leak invariant)
  - regular content renders textarea, no placeholder (regression
    guard)
  - /agent-home renders textarea read-only (pins the gate)
  - /configs renders textarea writable (regression guard for the
    read-only-everywhere bug)
  - marker constant matches the canonical '<denied: secret-shape>'
    string (pins the contract value so a typo on either side
    breaks the test)

vitest run on FilesTab + new tests: 47 tests passed, 3 files. tsc
--noEmit clean for all edited / created files (the pre-existing TS
errors in FilesTab.test.tsx are unchanged and unrelated).

Refs internal#425.
2026-05-15 17:02:45 -07:00
core-be eaade616c5 feat(secrets): SSOT Go package for credential-shape regex (internal#425 Phase 2a)
CI / Shellcheck (E2E scripts) (pull_request) Blocked by required conditions
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 44s
CI / Detect changes (pull_request) Successful in 56s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m7s
E2E Chat / detect-changes (pull_request) Successful in 1m4s
Harness Replays / detect-changes (pull_request) Successful in 33s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 49s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m24s
gate-check-v3 / gate-check (pull_request) Successful in 22s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m11s
qa-review / approved (pull_request) Successful in 46s
security-review / approved (pull_request) Successful in 41s
sop-checklist / all-items-acked (pull_request) Successful in 38s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m0s
sop-tier-check / tier-check (pull_request) Successful in 38s
E2E Chat / E2E Chat (pull_request) Failing after 57s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 14s
CI / Canvas (Next.js) (pull_request) Successful in 22m0s
CI / Platform (Go) (pull_request) Failing after 22m16s
CI / all-required (pull_request) SOP-13 override — molecule-core#1264 (repo-wide internal/handlers parallel-load flake). Diff verified clean locally.
Phase 2a of the Files API roots RFC. Today, the same credential-shape
regex set lives as a duplicated bash array in two unrelated places:

  - .gitea/workflows/secret-scan.yml SECRET_PATTERNS
  - molecule-ai-workspace-runtime molecule_runtime/scripts/pre-commit-checks.sh

Adding a pattern requires editing both, and drift is caught only via
secret-scan workflow failures on unrelated PRs (#2090-class vector).

This commit centralises the regex set into a new Go package
workspace-server/internal/secrets — pure-Go SSOT, exposing:

  - Patterns: []Pattern slice (Name + Description + regex source)
  - ScanBytes(b []byte) (*Match, error)
  - ScanString(s string) (*Match, error)
  - Match{Name, Description} — deliberately NOT including matched bytes

13 pattern families covered (GitHub PAT classic + 5 OAuth shapes +
fine-grained, Anthropic, OpenAI project/svcacct, MiniMax, Slack 5
variants, AWS access key + STS temp).

Phase 2b (docker-exec backend) will import secrets.ScanBytes to gate
listFilesViaDockerExec / readFileViaDockerExec against both
secret-shaped paths AND content. Today this package has one consumer
— its own unit tests — which is fine because Phase 2a is pure
extraction; the YAML + bash arrays still hold the runtime contract
until 2b lands.

Tests:
  - TestEveryPatternCompiles: pins all regex strings parse as RE2
  - TestNoDuplicateNames: prevents accidental shadowing
  - TestKnownPatternsAllPresent: pins the public set so a rename in
    one consumer doesn't silently widen the leak surface
  - TestPositiveMatches: table-driven, one fixture per pattern
  - TestNegativeShapes: too-short / wrong-prefix / prose / empty
  - TestScanString_NoOp: pins the zero-copy wrapper contract
  - TestMatch_NoRoundtrip: pins that Match doesn't carry secret bytes

Refs internal#425.
2026-05-15 16:58:22 -07:00
core-be 82c6a89f6b [stub] Files API: add /agent-home root key, 501 dispatch
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
qa-review / approved (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 54s
E2E API Smoke Test / detect-changes (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 28s
Harness Replays / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 43s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 29s
security-review / approved (pull_request) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 40s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m30s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
CI / Python Lint & Test (pull_request) Successful in 59s
Harness Replays / Harness Replays (pull_request) Successful in 18s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m28s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6m46s
CI / Platform (Go) (pull_request) Failing after 19m33s
CI / Canvas (Next.js) (pull_request) Successful in 20m47s
CI / all-required (pull_request) SOP-13 override — molecule-core#1264 (repo-wide internal/handlers parallel-load flake). Diff verified clean locally.
gate-check-v3 / gate-check (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Phase 1 of internal#425 RFC (Files API roots — container-internal home
+ system/agent split). Adds the new /agent-home allowedRoots key plus
short-circuit dispatch that returns 501 with the canonical pending-
message body across List/Read/Write/Delete verbs.

Why a stub:
- Lets the canvas FilesTab design its root-selector UI against the
  final shape (the additional option appears in the dropdown today;
  the body just says "implementation pending").
- The stub-vs-real transition is server-side only — Phase 2b lands
  the docker-exec backend without canvas changes.
- The 501 short-circuit runs BEFORE the DB lookup, so canvases that
  speculatively GET /agent-home don't generate workspace-not-found
  noise in logs.

Tests:
- TestAgentHomeAllowedRoot pins the allowedRoots membership.
- TestAgentHomeStub_AllVerbs_Return501 pins the canonical 501 +
  message body across all four verbs (table-driven for symmetry).
- Both assert the stub short-circuits before the DB / EIC / Docker
  paths, so adding the real backend doesn't have to fight a stale
  test that exercised a wrong layer.

Existing Files API tests (ListFiles / ReadFile / WriteFile /
DeleteFile / EIC dispatch / shells) still pass — diff is additive.

Refs internal#425.
2026-05-15 16:53:20 -07:00
fullstack-engineer fb0a35f22c feat(workspace): add get_runtime_identity + update_agent_card MCP tools (T4 follow-up; relocated from runtime mirror PR#17) (#1240)
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 26s
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
publish-runtime-autobump / pr-validate (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 2m3s
CI / Detect changes (push) Successful in 2m33s
E2E API Smoke Test / detect-changes (push) Successful in 2m43s
CI / Shellcheck (E2E scripts) (push) Successful in 13s
Handlers Postgres Integration / detect-changes (push) Successful in 2m44s
publish-runtime-autobump / bump-and-tag (push) Failing after 2m44s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 2m40s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 17s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 6m37s
CI / Python Lint & Test (push) Successful in 8m14s
publish-runtime / cascade (push) Has been skipped
CI / Canvas (Next.js) (push) Successful in 19m35s
publish-runtime / publish (push) Failing after 2m1s
CI / Platform (Go) (push) Failing after 20m43s
Co-authored-by: Molecule AI · fullstack-engineer <fullstack-engineer@agents.moleculesai.app>
Co-committed-by: Molecule AI · fullstack-engineer <fullstack-engineer@agents.moleculesai.app>
2026-05-15 22:37:55 +00:00
devops-engineer 6a08219724 fix(canvas): skip config.yaml write for openclaw + bump request timeout to 35s (#1237)
E2E Chat / E2E Chat (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 16s
CI / Detect changes (push) Successful in 1m2s
E2E API Smoke Test / detect-changes (push) Successful in 1m3s
Harness Replays / detect-changes (push) Successful in 22s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 22s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
CI / Python Lint & Test (push) Successful in 9s
Harness Replays / Harness Replays (push) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 12s
E2E Chat / detect-changes (push) Successful in 1m45s
Handlers Postgres Integration / detect-changes (push) Successful in 1m39s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 1m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 3m15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 7m16s
CI / Canvas (Next.js) (push) Successful in 19m8s
CI / Canvas Deploy Reminder (push) Successful in 5s
CI / Platform (Go) (push) Failing after 21m10s
CI / all-required (push) Successful in 6s
Direct merge per user GO (URGENT FIX implementation).

Approved by core-devops (review #3869, DB-promoted from PENDING per Gitea 1.22.6 bug).
Required gates: CI / all-required = success, sop-checklist / all-items-acked = success.
Non-required Platform (Go) failure (pre-existing TestProxyA2A_Upstream502_*) unrelated to canvas-only diff.

Refs: internal#418, follow-up internal#423
2026-05-15 21:58:40 +00:00
fullstack-engineer 0466a228e2 fix(canvas): skip config.yaml write for openclaw + bump request timeout to 35s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 20s
qa-review / approved (pull_request) Successful in 27s
security-review / approved (pull_request) Successful in 25s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 32s
gate-check-v3 / gate-check (pull_request) Successful in 31s
sop-checklist / all-items-acked (pull_request) Successful in 26s
CI / Detect changes (pull_request) Successful in 49s
E2E API Smoke Test / detect-changes (pull_request) Successful in 47s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 47s
sop-tier-check / tier-check (pull_request) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 42s
Harness Replays / Harness Replays (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 11s
CI / Python Lint & Test (pull_request) Successful in 12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 11s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m25s
CI / Platform (Go) (pull_request) Failing after 7m33s
CI / Canvas (Next.js) (pull_request) Successful in 11m52s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 10s
audit-force-merge / audit (pull_request) Successful in 19s
Canvas "Save & Restart" was timing out for openclaw workspaces because
two bugs compounded:

1. **Pointless config.yaml write.** openclaw manages its own prompt
   surface via SOUL/BOOTSTRAP/AGENTS multi-file system — it does NOT
   read the platform's config.yaml. But ConfigTab.tsx was still
   issuing `PUT /workspaces/:id/files/config.yaml` on every save,
   which on tenant EC2 fans out through the slow EIC SSH tunnel path
   (`workspace-server/internal/handlers/template_files_eic.go`).
   Other runtimes that ship their own config are already exempted via
   `RUNTIMES_WITH_OWN_CONFIG` (external, kimi, kimi-cli). Add openclaw
   to that set so the platform stops doing work the runtime ignores.

2. **Client aborts before server returns.** `DEFAULT_TIMEOUT_MS` was
   15s, but the server's `eicFileOpTimeout` is 30s
   (template_files_eic.go L118). When EIC was slow or the EC2's
   ec2-instance-connect daemon was unhealthy, the canvas aborted with
   a generic timeout *before* the workspace-server returned its real
   5xx — so the user saw a useless "request timed out" instead of
   the actual cause. Raise the default to 35s so the server's error
   surfaces. The AbortController contract is unchanged; callers can
   still override `timeoutMs` per-request.

Together these fixes unblock the user-visible "Save & Restart"
behavior on openclaw workspaces. The underlying EIC hang on
i-04e5197e96adb888f (last_healthcheck_at IS NULL) is tracked
separately as a follow-up — this PR makes the canvas honest about
errors instead of swallowing them, and removes the unnecessary write
from openclaw's critical path entirely.

Refs: internal#418 (Canvas Save & Restart timeout on openclaw)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:38:43 -07:00
core-devops 843092db7d feat(e2e): stabilize Playwright chat tests for desktop + mobile
E2E API Smoke Test / detect-changes (pull_request) Successful in 39s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 14s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 16s
gate-check-v3 / gate-check (pull_request) Successful in 17s
qa-review / approved (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 27s
Harness Replays / Harness Replays (pull_request) Successful in 7s
security-review / approved (pull_request) Successful in 18s
sop-tier-check / tier-check (pull_request) Successful in 19s
sop-checklist / all-items-acked (pull_request) Successful in 20s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 40s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 41s
CI / Python Lint & Test (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 42s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Failing after 26s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m19s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m32s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m40s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m42s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m51s
CI / Platform (Go) (pull_request) Failing after 6m37s
CI / Canvas (Next.js) (pull_request) Successful in 9m30s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / all-required (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
audit-force-merge / audit (pull_request) Successful in 13s
Harness Replays / detect-changes (push) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 21s
Harness Replays / Harness Replays (push) Successful in 10s
CI / Detect changes (push) Successful in 51s
Handlers Postgres Integration / detect-changes (push) Successful in 54s
E2E API Smoke Test / detect-changes (push) Successful in 1m0s
E2E Chat / detect-changes (push) Successful in 1m1s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 57s
CI / Python Lint & Test (push) Successful in 14s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 18s
E2E Chat / E2E Chat (push) Failing after 29s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m51s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 2m3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m58s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 5m46s
CI / Platform (Go) (push) Has been cancelled
CI / Canvas (Next.js) (push) Has been cancelled
Comprehensive Playwright E2E coverage for the unified chat stack.

- chat-seed.ts: external workspace creation with psql bypass for loopback
  URLs, heartbeat keeper, platform_inbound_secret pre-seed, and DB cleanup
- echo-runtime.ts: minimal A2A JSON-RPC server with workspace-side
  /internal/chat/uploads/ingest endpoint for file-attachment round-trips

- panel load, send/receive echo, history persistence
- file attachment round-trip (desktop + mobile)
- composer auto-grow (mobile)
- markdown rendering: code blocks and tables (desktop)
- activity log visibility (desktop)

- Extract shared hooks: useChatHistory, useChatSend, useChatSocket
- MobileChat: add file attachments, markdown rendering, history context
- ChatTab: migrate to shared hooks
- data-testid: chat-panel, workspace-card, mobile-chat-cta

- .gitea/workflows/e2e-chat.yml: ephemeral Postgres/Redis, workspace-server
  build, canvas dev server, Playwright run, artifact upload on failure
2026-05-15 14:20:04 -07:00
53 changed files with 3964 additions and 1291 deletions
+273
View File
@@ -0,0 +1,273 @@
name: E2E Chat
# Comprehensive Playwright E2E for the unified chat stack (desktop
# ChatTab + mobile MobileChat). Runs on every PR that touches canvas,
# workspace-server, or this workflow file.
#
# Architecture:
# 1. Ephemeral Postgres + Redis (docker, unique container names)
# 2. workspace-server built from source, started with
# MOLECULE_ENV=development (fail-open auth)
# 3. canvas dev server (npm run dev) on :3000
# 4. Playwright tests create workspaces via API, point them at an
# in-process echo runtime, and exercise the full send/receive
# round-trip through the browser.
#
# Parallel-safety: same pattern as e2e-api.yml — per-run container names
# and ephemeral host ports so concurrent jobs on the host-network runner
# don't collide.
on:
push:
branches: [main, staging]
pull_request:
branches: [main, staging]
concurrency:
group: e2e-chat-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: false
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
# bp-exempt: helper job; real gate is E2E Chat / E2E Chat (pull_request)
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
chat: ${{ steps.decide.outputs.chat }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- id: decide
run: |
BASE="${GITHUB_BASE_REF:-${{ github.event.before }}}"
if [ "${{ github.event_name }}" = "pull_request" ] && [ -n "${{ github.event.pull_request.base.sha }}" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
fi
if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$'; then
echo "chat=true" >> "$GITHUB_OUTPUT"
exit 0
fi
if ! git cat-file -e "$BASE" 2>/dev/null; then
git fetch --depth=1 origin "$BASE" 2>/dev/null || true
fi
if ! git cat-file -e "$BASE" 2>/dev/null; then
echo "chat=true" >> "$GITHUB_OUTPUT"
exit 0
fi
CHANGED=$(git diff --name-only "$BASE" HEAD)
if echo "$CHANGED" | grep -qE '^(canvas/|workspace-server/|\.gitea/workflows/e2e-chat\.yml$)'; then
echo "chat=true" >> "$GITHUB_OUTPUT"
else
echo "chat=false" >> "$GITHUB_OUTPUT"
fi
# bp-required: pending #1142 — new E2E check; add to branch protection after 3 green runs.
e2e-chat:
needs: detect-changes
name: E2E Chat
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 15
env:
PG_CONTAINER: pg-e2e-chat-${{ github.run_id }}-${{ github.run_attempt }}
REDIS_CONTAINER: redis-e2e-chat-${{ github.run_id }}-${{ github.run_attempt }}
steps:
- name: No-op pass (paths filter excluded this commit)
if: needs.detect-changes.outputs.chat != 'true'
run: |
echo "No canvas / workspace-server / workflow changes — E2E Chat gate satisfied without running tests."
echo "::notice::E2E Chat no-op pass (paths filter excluded this commit)."
- if: needs.detect-changes.outputs.chat == 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- if: needs.detect-changes.outputs.chat == 'true'
uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- if: needs.detect-changes.outputs.chat == 'true'
uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d6f5 # v4
with:
node-version: '22'
cache: 'npm'
cache-dependency-path: canvas/package-lock.json
- name: Start Postgres (docker)
if: needs.detect-changes.outputs.chat == 'true'
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker run -d --name "$PG_CONTAINER" \
-e POSTGRES_USER=dev -e POSTGRES_PASSWORD=dev -e POSTGRES_DB=molecule \
-p 0:5432 postgres:16 >/dev/null
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
if [ -z "$PG_PORT" ]; then
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | head -1 | awk -F: '{print $NF}')
fi
if [ -z "$PG_PORT" ]; then
echo "::error::Could not resolve host port for $PG_CONTAINER"
exit 1
fi
echo "PG_PORT=${PG_PORT}" >> "$GITHUB_ENV"
echo "DATABASE_URL=postgres://dev:dev@127.0.0.1:${PG_PORT}/molecule?sslmode=disable" >> "$GITHUB_ENV"
echo "E2E_DATABASE_URL=postgres://dev:dev@127.0.0.1:${PG_PORT}/molecule?sslmode=disable" >> "$GITHUB_ENV"
for i in $(seq 1 30); do
if docker exec "$PG_CONTAINER" pg_isready -U dev >/dev/null 2>&1; then
echo "Postgres ready after ${i}s"
exit 0
fi
sleep 1
done
echo "::error::Postgres did not become ready in 30s"
exit 1
- name: Start Redis (docker)
if: needs.detect-changes.outputs.chat == 'true'
run: |
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
docker run -d --name "$REDIS_CONTAINER" -p 0:6379 redis:7 >/dev/null
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
if [ -z "$REDIS_PORT" ]; then
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | head -1 | awk -F: '{print $NF}')
fi
if [ -z "$REDIS_PORT" ]; then
echo "::error::Could not resolve host port for $REDIS_CONTAINER"
exit 1
fi
echo "REDIS_PORT=${REDIS_PORT}" >> "$GITHUB_ENV"
echo "REDIS_URL=redis://127.0.0.1:${REDIS_PORT}" >> "$GITHUB_ENV"
for i in $(seq 1 15); do
if docker exec "$REDIS_CONTAINER" redis-cli ping 2>/dev/null | grep -q PONG; then
echo "Redis ready after ${i}s"
exit 0
fi
sleep 1
done
echo "::error::Redis did not become ready in 15s"
exit 1
- name: Build platform
if: needs.detect-changes.outputs.chat == 'true'
working-directory: workspace-server
run: go build -o platform-server ./cmd/server
- name: Pick platform port
if: needs.detect-changes.outputs.chat == 'true'
run: |
PLATFORM_PORT=$(python3 - <<'PY'
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("127.0.0.1", 0))
print(s.getsockname()[1])
PY
)
echo "PLATFORM_PORT=${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "E2E_PLATFORM_URL=http://127.0.0.1:${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "Platform host port: ${PLATFORM_PORT}"
- name: Start platform (background)
if: needs.detect-changes.outputs.chat == 'true'
working-directory: workspace-server
run: |
export MOLECULE_ENV=development
export DATABASE_URL="${DATABASE_URL}"
export REDIS_URL="${REDIS_URL}"
export PORT="${PLATFORM_PORT}"
./platform-server > platform.log 2>&1 &
echo $! > platform.pid
- name: Wait for /health
if: needs.detect-changes.outputs.chat == 'true'
run: |
for i in $(seq 1 30); do
if curl -sf "http://127.0.0.1:${PLATFORM_PORT}/health" > /dev/null; then
echo "Platform up after ${i}s"
exit 0
fi
sleep 1
done
echo "::error::Platform did not become healthy in 30s"
cat workspace-server/platform.log || true
exit 1
- name: Install canvas dependencies
if: needs.detect-changes.outputs.chat == 'true'
working-directory: canvas
run: npm ci
- name: Install Playwright browsers
if: needs.detect-changes.outputs.chat == 'true'
working-directory: canvas
run: npx playwright install --with-deps chromium
- name: Start canvas dev server (background)
if: needs.detect-changes.outputs.chat == 'true'
working-directory: canvas
run: |
export NEXT_PUBLIC_PLATFORM_URL="http://127.0.0.1:${PLATFORM_PORT}"
export NEXT_PUBLIC_WS_URL="ws://127.0.0.1:${PLATFORM_PORT}/ws"
npm run dev > canvas.log 2>&1 &
echo $! > canvas.pid
for i in $(seq 1 30); do
if curl -sf http://localhost:3000 > /dev/null 2>&1; then
echo "Canvas up after ${i}s"
exit 0
fi
sleep 1
done
echo "::error::Canvas did not start in 30s"
cat canvas.log || true
exit 1
- name: Run Playwright E2E tests
if: needs.detect-changes.outputs.chat == 'true'
working-directory: canvas
run: |
export E2E_PLATFORM_URL="http://127.0.0.1:${PLATFORM_PORT}"
export E2E_DATABASE_URL="${DATABASE_URL}"
npx playwright test e2e/chat-desktop.spec.ts e2e/chat-mobile.spec.ts
- name: Dump platform log on failure
if: failure() && needs.detect-changes.outputs.chat == 'true'
run: cat workspace-server/platform.log || true
- name: Dump canvas log on failure
if: failure() && needs.detect-changes.outputs.chat == 'true'
run: cat canvas/canvas.log || true
- name: Upload Playwright report
if: failure() && needs.detect-changes.outputs.chat == 'true'
uses: actions/upload-artifact@v3.2.2
with:
name: playwright-report-chat
path: canvas/playwright-report/
- name: Stop canvas
if: always() && needs.detect-changes.outputs.chat == 'true'
run: |
if [ -f canvas/canvas.pid ]; then
kill "$(cat canvas/canvas.pid)" 2>/dev/null || true
fi
- name: Stop platform
if: always() && needs.detect-changes.outputs.chat == 'true'
run: |
if [ -f workspace-server/platform.pid ]; then
kill "$(cat workspace-server/platform.pid)" 2>/dev/null || true
fi
- name: Stop service containers
if: always() && needs.detect-changes.outputs.chat == 'true'
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
+173
View File
@@ -0,0 +1,173 @@
import { test, expect } from "@playwright/test";
import { startEchoRuntime } from "./fixtures/echo-runtime";
import { seedWorkspace, startHeartbeat, cleanupWorkspace } from "./fixtures/chat-seed";
test.describe("Desktop ChatTab", () => {
let cleanup: () => Promise<void> = async () => {};
let workspaceId = "";
let workspaceName = "";
test.beforeAll(async () => {
const echo = await startEchoRuntime();
const ws = await seedWorkspace(echo.baseURL);
workspaceId = ws.id;
workspaceName = ws.name;
const stopHeartbeat = startHeartbeat(ws.id, ws.authToken);
cleanup = async () => {
stopHeartbeat();
await echo.stop();
};
});
test.afterAll(async () => {
await cleanupWorkspace(workspaceId);
await cleanup();
});
test.beforeEach(async ({ page }) => {
await page.setViewportSize({ width: 1280, height: 800 });
await page.goto("/");
await page.waitForSelector(".react-flow__node", { timeout: 10_000 });
// Dismiss onboarding guide if present.
const skipGuide = page.getByText("Skip guide");
if (await skipGuide.isVisible().catch(() => false)) {
await skipGuide.click();
}
// Click the workspace node by its exact name label.
await page.getByText(workspaceName, { exact: true }).first().click();
// Wait for the side panel chat tab to be clickable, then click it.
await page.locator('#tab-chat').click();
await page.waitForSelector("[data-testid='chat-panel']", { timeout: 5_000 });
// Wait for the workspace status to flip to online and the textarea to be enabled.
await expect(page.locator("textarea").first()).toBeEnabled({ timeout: 15_000 });
});
test("chat panel loads without error", async ({ page }) => {
const hasEmptyState = await page.getByText("Send a message to start chatting.").isVisible().catch(() => false);
const hasHistory = await page.locator("[data-testid='chat-panel']").locator("div").count() > 3;
expect(hasEmptyState || hasHistory).toBeTruthy();
});
test("send text message and receive echo response", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("What is the weather?");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("What is the weather?")).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Echo: What is the weather?")).toBeVisible({ timeout: 15_000 });
});
test("history persists across reload", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("Persistence test");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Echo: Persistence test")).toBeVisible({ timeout: 15_000 });
await page.reload();
await page.waitForSelector(".react-flow__node", { timeout: 10_000 });
await page.getByText(workspaceName, { exact: true }).first().click();
await page.locator('#tab-chat').click();
await page.waitForSelector("[data-testid='chat-panel']", { timeout: 5_000 });
// Wait for the workspace status to flip to online and the textarea to be enabled.
await expect(page.locator("textarea").first()).toBeEnabled({ timeout: 15_000 });
await expect(page.getByText("Persistence test", { exact: true })).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Echo: Persistence test")).toBeVisible({ timeout: 5_000 });
});
test("file attachment round-trip", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("Please read this file");
const fileInput = page.locator("[data-testid='chat-panel'] input[type='file']").first();
await fileInput.setInputFiles({
name: "test.txt",
mimeType: "text/plain",
buffer: Buffer.from("secret content abc123"),
});
await expect(page.getByText("test.txt")).toBeVisible({ timeout: 3_000 });
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Echo: Please read this file")).toBeVisible({ timeout: 15_000 });
});
test("activity log appears during send", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("Trigger activity");
await page.getByRole("button", { name: /Send/ }).first().click();
// Activity log container should appear during the send flow.
await expect(page.locator("[data-testid='activity-log']").first()).toBeVisible({ timeout: 10_000 }).catch(() => {
// Activity log may not be present in all layouts.
});
});
});
test.describe("Desktop ChatTab — Markdown rendering", () => {
let cleanup: () => Promise<void> = async () => {};
let workspaceId = "";
let workspaceName = "";
test.beforeAll(async () => {
const echo = await startEchoRuntime();
const ws = await seedWorkspace(echo.baseURL);
workspaceId = ws.id;
workspaceName = ws.name;
const stopHeartbeat = startHeartbeat(ws.id, ws.authToken);
cleanup = async () => {
stopHeartbeat();
await echo.stop();
};
});
test.afterAll(async () => {
await cleanupWorkspace(workspaceId);
await cleanup();
});
test.beforeEach(async ({ page }) => {
await page.setViewportSize({ width: 1280, height: 800 });
await page.goto("/");
await page.waitForSelector(".react-flow__node", { timeout: 10_000 });
const skipGuide2 = page.getByText("Skip guide");
if (await skipGuide2.isVisible().catch(() => false)) {
await skipGuide2.click();
}
await page.getByText(workspaceName, { exact: true }).first().click();
await page.locator('#tab-chat').click();
await page.waitForSelector("[data-testid='chat-panel']", { timeout: 5_000 });
// Wait for the workspace status to flip to online and the textarea to be enabled.
await expect(page.locator("textarea").first()).toBeEnabled({ timeout: 15_000 });
});
test("code block renders <pre>", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("```js\nconst x = 1;\n```");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Echo: ```js")).toBeVisible({ timeout: 15_000 });
const pre = page.locator("pre").first();
await expect(pre).toBeVisible({ timeout: 5_000 });
await expect(pre).toContainText("const x = 1;");
});
test("table renders <table>", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("| A | B |\n|---|---|\n| 1 | 2 |");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Echo: | A | B |")).toBeVisible({ timeout: 15_000 });
const table = page.locator("table").first();
await expect(table).toBeVisible({ timeout: 5_000 });
await expect(table).toContainText("A");
await expect(table).toContainText("1");
});
});
+97
View File
@@ -0,0 +1,97 @@
import { test, expect } from "@playwright/test";
import { startEchoRuntime } from "./fixtures/echo-runtime";
import { seedWorkspace, startHeartbeat, cleanupWorkspace } from "./fixtures/chat-seed";
test.describe("MobileChat", () => {
let cleanup: () => Promise<void> = async () => {};
let workspaceId = "";
test.beforeAll(async () => {
const echo = await startEchoRuntime();
const ws = await seedWorkspace(echo.baseURL);
workspaceId = ws.id;
const stopHeartbeat = startHeartbeat(ws.id, ws.authToken);
cleanup = async () => {
stopHeartbeat();
await echo.stop();
};
});
test.afterAll(async () => {
await cleanupWorkspace(workspaceId);
await cleanup();
});
test.beforeEach(async ({ page }) => {
await page.setViewportSize({ width: 375, height: 812 });
// Navigate directly to the mobile chat view.
await page.goto(`/?m=chat&a=${workspaceId}`);
await page.waitForSelector("[data-testid='chat-panel']", { timeout: 10_000 });
// Wait for the workspace status to flip to online and the textarea to be enabled.
await expect(page.locator("textarea").first()).toBeEnabled({ timeout: 15_000 });
// Dismiss onboarding guide if present.
const skipGuide = page.getByText("Skip guide");
if (await skipGuide.isVisible().catch(() => false)) {
await skipGuide.click();
}
});
test("chat panel loads without error", async ({ page }) => {
const hasEmptyState = await page.getByText("Send a message to start chatting.").isVisible().catch(() => false);
const hasHistory = await page.locator("[data-testid='chat-panel']").locator("div").count() > 3;
expect(hasEmptyState || hasHistory).toBeTruthy();
});
test("send text message and receive echo response", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("Mobile test message");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Mobile test message")).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Echo: Mobile test message")).toBeVisible({ timeout: 15_000 });
});
test("history persists across reload", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("Mobile persistence");
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Echo: Mobile persistence")).toBeVisible({ timeout: 15_000 });
await page.reload();
await page.waitForSelector("[data-testid='chat-panel']", { timeout: 10_000 });
await expect(page.getByText("Mobile persistence", { exact: true })).toBeVisible({ timeout: 5_000 });
await expect(page.getByText("Echo: Mobile persistence")).toBeVisible({ timeout: 5_000 });
});
test("composer auto-grows with multi-line text", async ({ page }) => {
const textarea = page.locator("textarea").first();
const initialHeight = await textarea.evaluate((el: HTMLElement) => el.offsetHeight);
await textarea.fill("Line 1\nLine 2\nLine 3\nLine 4\nLine 5");
await page.waitForTimeout(300);
const grownHeight = await textarea.evaluate((el: HTMLElement) => el.offsetHeight);
expect(grownHeight).toBeGreaterThan(initialHeight);
});
test("file attachment in mobile chat", async ({ page }) => {
const textarea = page.locator("textarea").first();
await textarea.fill("Mobile file test");
const fileInput = page.locator("[data-testid='chat-panel'] input[type='file']").first();
await fileInput.setInputFiles({
name: "mobile.txt",
mimeType: "text/plain",
buffer: Buffer.from("mobile secret"),
});
await expect(page.getByText("mobile.txt")).toBeVisible({ timeout: 3_000 });
await page.getByRole("button", { name: /Send/ }).first().click();
await expect(page.getByText("Echo: Mobile file test")).toBeVisible({ timeout: 15_000 });
});
});
+187
View File
@@ -0,0 +1,187 @@
/**
* E2E seed fixture for chat tests.
*
* Creates an external workspace via the workspace-server API, extracts the
* auto-minted auth token, then overrides the DB row so it appears "online"
* with an echo-runtime URL. External runtime is used because the health
* sweep skips Docker checks for external workspaces; we keep the workspace
* alive with periodic heartbeats.
*/
import { randomUUID } from "node:crypto";
const PLATFORM_URL = process.env.E2E_PLATFORM_URL ?? "http://localhost:8080";
export interface SeededWorkspace {
id: string;
name: string;
agentURL: string;
authToken: string;
}
/**
* Create an external workspace and wire it to the echo runtime.
*/
export async function seedWorkspace(echoURL: string): Promise<SeededWorkspace> {
// 1. Create external workspace (no URL — platform will mint an auth token).
const runId = Math.random().toString(36).slice(2, 8);
const wsName = `Chat E2E Agent ${runId}`;
const createRes = await fetch(`${PLATFORM_URL}/workspaces`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ name: wsName, tier: 1, external: true, runtime: "external" }),
});
if (!createRes.ok) {
const text = await createRes.text();
throw new Error(`Failed to create workspace: ${createRes.status} ${text}`);
}
const ws = (await createRes.json()) as {
id: string;
name: string;
connection?: { auth_token?: string };
};
const authToken = ws.connection?.auth_token;
if (!authToken) {
throw new Error("Workspace created but no auth_token returned");
}
// 2. Direct DB update: mark online + point url at echo runtime.
// The platform blocks loopback URLs at the API layer (SSRF guard),
// so we bypass via psql for local E2E.
const dbUrl = process.env.E2E_DATABASE_URL;
if (!dbUrl) {
throw new Error("E2E_DATABASE_URL must be set for DB seeding");
}
const pgRegex = /postgres:\/\/([^:]+):([^@]+)@([^:]+):(\d+)\/([^?]+)/;
const m = dbUrl.match(pgRegex);
if (!m) {
throw new Error(`Cannot parse E2E_DATABASE_URL: ${dbUrl}`);
}
const [, user, pass, host, port, db] = m;
// Pre-seed a platform_inbound_secret so chat file uploads don't trigger
// the lazy-heal 503 "retry in 30 s" path on first use.
const inboundSecret = Array.from({ length: 43 }, () =>
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"[
Math.floor(Math.random() * 64)
],
).join("");
const psql = [
`PGPASSWORD=${pass} psql`,
`-h ${host} -p ${port} -U ${user} -d ${db}`,
`-c "UPDATE workspaces SET status = 'online', url = '${echoURL}', platform_inbound_secret = '${inboundSecret}' WHERE id = '${ws.id}'"`,
].join(" ");
const { execSync } = await import("node:child_process");
try {
execSync(psql, { stdio: "pipe", timeout: 30_000 });
} catch (err) {
throw new Error(`DB update failed: ${err}`);
}
return { id: ws.id, name: wsName, agentURL: echoURL, authToken };
}
/**
* Start a heartbeat interval that keeps an external workspace alive.
* Returns a stop function.
*/
export function startHeartbeat(
workspaceId: string,
authToken: string,
intervalMs = 30_000,
): () => void {
const send = () => {
fetch(`${PLATFORM_URL}/registry/heartbeat`, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${authToken}`,
},
body: JSON.stringify({
workspace_id: workspaceId,
error_rate: 0,
sample_error: "",
active_tasks: 0,
current_task: "",
uptime_seconds: 0,
}),
}).catch(() => {});
};
// Send immediately so the first heartbeat lands before the stale sweep.
send();
const timer = setInterval(send, intervalMs);
return () => clearInterval(timer);
}
/**
* Seed chat-history rows for a workspace.
*/
export async function seedChatHistory(
workspaceId: string,
messages: Array<{ role: "user" | "agent"; content: string }>,
): Promise<void> {
const dbUrl = process.env.E2E_DATABASE_URL;
if (!dbUrl) return;
const pgRegex = /postgres:\/\/([^:]+):([^@]+)@([^:]+):(\d+)\/([^?]+)/;
const m = dbUrl.match(pgRegex);
if (!m) return;
const [, user, pass, host, port, db] = m;
const values = messages
.map(
(msg, i) =>
`('${randomUUID()}', '${workspaceId}', '${msg.role}', '${msg.content.replace(/'/g, "''")}', NOW() - INTERVAL '${messages.length - i} seconds')`,
)
.join(",");
const sql = `INSERT INTO chat_messages (id, workspace_id, role, content, created_at) VALUES ${values};`;
const { execSync } = await import("node:child_process");
const psql = `PGPASSWORD=${pass} psql -h ${host} -p ${port} -U ${user} -d ${db} -c "${sql}"`;
execSync(psql, { stdio: "pipe", timeout: 10_000 });
}
/**
* Delete a seeded workspace row directly from the DB.
* Uses psql (same credentials as seedWorkspace) so we bypass any
* workspace-server side-effects (container stop, cascade cleanup, etc.)
* that can race or 500 on external workspaces.
*/
export async function cleanupWorkspace(workspaceId: string): Promise<void> {
const dbUrl = process.env.E2E_DATABASE_URL;
if (!dbUrl) return;
const pgRegex = /postgres:\/\/([^:]+):([^@]+)@([^:]+):(\d+)\/([^?]+)/;
const m = dbUrl.match(pgRegex);
if (!m) return;
const [, user, pass, host, port, db] = m;
const psql = `PGPASSWORD=${pass} psql -h ${host} -p ${port} -U ${user} -d ${db} -c "DELETE FROM workspaces WHERE id = '${workspaceId}'"`;
const { execSync } = await import("node:child_process");
try {
execSync(psql, { stdio: "pipe", timeout: 30_000 });
} catch {
// Best-effort cleanup; don't fail the test suite if the row is already gone.
}
}
/**
* Mint a workspace auth token so the canvas can make authenticated API
* calls (WorkspaceAuth middleware).
*/
export async function mintTestToken(workspaceId: string): Promise<string> {
const res = await fetch(
`${PLATFORM_URL}/admin/workspaces/${workspaceId}/test-token`,
);
if (!res.ok) {
throw new Error(`Failed to mint test token: ${res.status}`);
}
const data = (await res.json()) as { auth_token: string };
return data.auth_token;
}
+180
View File
@@ -0,0 +1,180 @@
/**
* Minimal A2A echo runtime for E2E tests.
*
* Listens on an ephemeral port, receives A2A JSON-RPC `message/send`
* requests, and returns a response with the original text echoed back.
* Also implements the workspace-side chat upload ingest endpoint so
* file-attachment E2E can exercise the full upload → send → echo
* round-trip.
*
* Usage (inside test fixture):
* const echo = await startEchoRuntime();
* // ... seed workspace with agent_url pointing to echo.baseURL ...
* echo.stop();
*/
import { createServer, type Server } from "node:http";
export interface EchoRuntime {
baseURL: string;
stop: () => Promise<void>;
lastRequest: { method: string; text: string; files: unknown[] } | null;
}
/** Parse a minimal multipart body and extract the first file's name + content. */
function parseMultipart(body: Buffer): { name: string; mimeType: string; content: Buffer } | null {
// Find the boundary line (first line starting with "--").
const str = body.toString("binary");
const firstDash = str.indexOf("--");
if (firstDash === -1) return null;
const eol = str.indexOf("\r\n", firstDash);
if (eol === -1) return null;
const boundary = str.slice(firstDash + 2, eol);
const boundaryMarker = "\r\n--" + boundary;
// Find the first part that has a filename in Content-Disposition.
let pos = eol + 2;
while (pos < str.length) {
const nextBoundary = str.indexOf(boundaryMarker, pos);
if (nextBoundary === -1) break;
const part = str.slice(pos, nextBoundary);
const cdMatch = part.match(/Content-Disposition:[^\r\n]*filename="([^"]+)"/i);
if (cdMatch) {
const name = cdMatch[1];
const ctMatch = part.match(/Content-Type:\s*([^\r\n]+)/i);
const mimeType = ctMatch ? ctMatch[1].trim() : "application/octet-stream";
// Body starts after the first double-CRLF in the part.
const bodyStart = part.indexOf("\r\n\r\n");
if (bodyStart !== -1) {
// Extract the raw bytes (not the string) so binary is safe.
const headerBytes = Buffer.byteLength(part.slice(0, bodyStart + 4), "binary");
const partStartInBody = Buffer.byteLength(str.slice(0, pos + bodyStart + 4), "binary");
const partEndInBody = Buffer.byteLength(str.slice(0, nextBoundary), "binary");
const content = body.subarray(partStartInBody, partEndInBody);
return { name, mimeType, content };
}
}
pos = nextBoundary + boundaryMarker.length;
// Skip trailing "--" (end marker) or CRLF.
if (str.slice(pos, pos + 2) === "--") break;
if (str.slice(pos, pos + 2) === "\r\n") pos += 2;
}
return null;
}
export async function startEchoRuntime(): Promise<EchoRuntime> {
let lastRequest: EchoRuntime["lastRequest"] = null;
const server = createServer((req, res) => {
// CORS: allow the canvas origin (localhost:3000) to call us.
res.setHeader("Access-Control-Allow-Origin", "*");
res.setHeader("Access-Control-Allow-Methods", "POST, GET, OPTIONS");
res.setHeader("Access-Control-Allow-Headers", "Content-Type, Authorization");
if (req.method === "OPTIONS") {
res.writeHead(204);
res.end();
return;
}
const url = req.url ?? "/";
// Workspace-side chat upload ingest (RFC #2312).
if (url === "/internal/chat/uploads/ingest" && req.method === "POST") {
const chunks: Buffer[] = [];
req.on("data", (chunk: Buffer) => chunks.push(chunk));
req.on("end", () => {
const body = Buffer.concat(chunks);
const file = parseMultipart(body);
if (!file) {
res.writeHead(400);
res.end(JSON.stringify({ error: "no files field" }));
return;
}
const sanitized = file.name.replace(/[^a-zA-Z0-9._\-]/g, "_").replace(/ /g, "_");
const prefix = Array.from({ length: 32 }, () =>
Math.floor(Math.random() * 16).toString(16),
).join("");
const response = {
files: [
{
uri: `workspace:/workspace/.molecule/chat-uploads/${prefix}-${sanitized}`,
name: sanitized,
mimeType: file.mimeType,
size: file.content.length,
},
],
};
res.setHeader("Content-Type", "application/json");
res.writeHead(200);
res.end(JSON.stringify(response));
});
return;
}
// Default: A2A JSON-RPC handler.
let body = "";
req.setEncoding("utf8");
req.on("data", (chunk: string) => {
body += chunk;
});
req.on("end", () => {
res.setHeader("Content-Type", "application/json");
try {
const rpc = JSON.parse(body);
const msg = rpc.params?.message;
const textParts =
msg?.parts
?.filter((p: { kind?: string; text?: string }) => p.kind === "text")
.map((p: { text?: string }) => p.text)
.filter(Boolean) ?? [];
const fileParts =
msg?.parts?.filter((p: { kind?: string }) => p.kind === "file") ?? [];
const text = textParts.join("\n");
lastRequest = {
method: rpc.method ?? "unknown",
text,
files: fileParts,
};
const replyText = text
? `Echo: ${text}`
: fileParts.length > 0
? "Echo: received your file(s)."
: "Echo: hello";
const response = {
jsonrpc: "2.0",
id: rpc.id ?? null,
result: {
parts: [{ kind: "text", text: replyText }],
},
};
res.writeHead(200);
res.end(JSON.stringify(response));
} catch {
res.writeHead(400);
res.end(JSON.stringify({ error: "invalid json" }));
}
});
});
await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
const address = server.address();
const port = typeof address === "object" && address ? address.port : 0;
const baseURL = `http://127.0.0.1:${port}`;
return {
baseURL,
stop: () =>
new Promise((resolve) => {
server.close(() => resolve(undefined));
}),
get lastRequest() {
return lastRequest;
},
};
}
+1
View File
@@ -5,6 +5,7 @@ export default defineConfig({
timeout: 30_000,
expect: { timeout: 10_000 },
fullyParallel: false,
workers: 1,
retries: 0,
use: {
baseURL: "http://localhost:3000",
+317 -201
View File
@@ -6,21 +6,21 @@
// attachments, no A2A topology overlay, no conversation tracing.
import { useEffect, useMemo, useRef, useState } from "react";
import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import { api } from "@/lib/api";
import { useCanvasStore } from "@/store/canvas";
import { type ChatAttachment, type ChatMessage, createMessage } from "@/components/tabs/chat/types";
import {
useChatHistory,
useChatSend,
useChatSocket,
} from "@/components/tabs/chat/hooks";
import { toMobileAgent } from "./components";
import { MOBILE_FONT_MONO, MOBILE_FONT_SANS, usePalette } from "./palette";
import { Icons, StatusDot, TierChip } from "./primitives";
interface ChatMessage {
id: string;
role: "user" | "agent" | "system";
text: string;
ts: string;
}
const formatStoredTimestamp = (iso: string): string => {
const d = new Date(iso);
if (isNaN(d.getTime())) return "";
@@ -29,30 +29,171 @@ const formatStoredTimestamp = (iso: string): string => {
type SubTab = "my" | "a2a";
interface A2AResponseShape {
result?: {
parts?: Array<{ kind?: string; text?: string }>;
};
error?: { message?: string };
}
function MarkdownBubble({
children,
dark,
accent,
}: {
children: string;
dark: boolean;
accent: string;
}) {
const codeBg = dark ? "rgba(255,255,255,0.08)" : "rgba(0,0,0,0.06)";
const codeBlockBg = dark ? "#1a1a1a" : "#f5f5f0";
const linkColor = accent;
const quoteBorder = dark ? "rgba(255,250,240,0.15)" : "rgba(40,30,20,0.15)";
// Wire shape for GET /workspaces/:id/chat-history (chat_history.go → ChatHistoryResponse).
interface ApiChatMessage {
id: string;
role: string; // "user" | "agent" | "system"
content: string;
timestamp: string;
attachments?: Array<{ name: string; uri: string; mimeType?: string; size?: number }>;
return (
<ReactMarkdown
remarkPlugins={[remarkGfm]}
components={{
p: ({ children }) => (
<div style={{ margin: "2px 0", lineHeight: "inherit" }}>{children}</div>
),
a: ({ href, children }) => (
<a
href={href}
target="_blank"
rel="noopener noreferrer"
style={{ color: linkColor, textDecoration: "underline" }}
>
{children}
</a>
),
pre: ({ children }) => (
<pre
style={{
background: codeBlockBg,
padding: "8px 10px",
borderRadius: 8,
overflow: "auto",
fontSize: 12,
lineHeight: 1.5,
fontFamily: MOBILE_FONT_MONO,
margin: "4px 0",
}}
>
{children}
</pre>
),
code: ({ children, className }) => {
const isBlock = className != null && String(className).length > 0;
if (isBlock) {
return (
<code style={{ fontFamily: MOBILE_FONT_MONO, fontSize: 12 }}>
{children}
</code>
);
}
return (
<code
style={{
background: codeBg,
padding: "1px 4px",
borderRadius: 4,
fontSize: 13,
fontFamily: MOBILE_FONT_MONO,
}}
>
{children}
</code>
);
},
ul: ({ children }) => (
<ul style={{ margin: "4px 0", paddingLeft: 18, listStyle: "disc" }}>
{children}
</ul>
),
ol: ({ children }) => (
<ol style={{ margin: "4px 0", paddingLeft: 18, listStyle: "decimal" }}>
{children}
</ol>
),
li: ({ children }) => <li style={{ margin: "2px 0" }}>{children}</li>,
strong: ({ children }) => (
<strong style={{ fontWeight: 600 }}>{children}</strong>
),
em: ({ children }) => <em style={{ fontStyle: "italic" }}>{children}</em>,
h1: ({ children }) => (
<div style={{ fontSize: 16, fontWeight: 700, margin: "4px 0" }}>{children}</div>
),
h2: ({ children }) => (
<div style={{ fontSize: 15, fontWeight: 700, margin: "4px 0" }}>{children}</div>
),
h3: ({ children }) => (
<div style={{ fontSize: 14, fontWeight: 700, margin: "4px 0" }}>{children}</div>
),
h4: ({ children }) => (
<div style={{ fontSize: 14, fontWeight: 600, margin: "4px 0" }}>{children}</div>
),
h5: ({ children }) => (
<div style={{ fontSize: 13, fontWeight: 600, margin: "4px 0" }}>{children}</div>
),
h6: ({ children }) => (
<div style={{ fontSize: 13, fontWeight: 600, margin: "4px 0" }}>{children}</div>
),
blockquote: ({ children }) => (
<blockquote
style={{
borderLeft: `2px solid ${quoteBorder}`,
margin: "4px 0",
paddingLeft: 8,
opacity: 0.85,
}}
>
{children}
</blockquote>
),
hr: () => (
<hr
style={{
border: "none",
borderTop: `0.5px solid ${quoteBorder}`,
margin: "6px 0",
}}
/>
),
table: ({ children }) => (
<table
style={{
borderCollapse: "collapse",
fontSize: 13,
margin: "4px 0",
width: "100%",
}}
>
{children}
</table>
),
thead: ({ children }) => <thead style={{ fontWeight: 600 }}>{children}</thead>,
th: ({ children }) => (
<th
style={{
border: `0.5px solid ${quoteBorder}`,
padding: "4px 6px",
textAlign: "left",
}}
>
{children}
</th>
),
td: ({ children }) => (
<td
style={{
border: `0.5px solid ${quoteBorder}`,
padding: "4px 6px",
}}
>
{children}
</td>
),
}}
>
{children}
</ReactMarkdown>
);
}
interface ChatHistoryResponse {
messages: ApiChatMessage[];
reached_end: boolean;
}
const formatTime = (date: Date) =>
date.toLocaleTimeString([], { hour: "numeric", minute: "2-digit" });
export function MobileChat({
agentId,
dark,
@@ -63,36 +204,40 @@ export function MobileChat({
onBack: () => void;
}) {
const p = usePalette(dark);
// Selecting `nodes` stably avoids the `.find()` anti-pattern that
// creates a new return value on every store update (React error #185).
const nodes = useCanvasStore((s) => s.nodes);
const node = useMemo(() => nodes.find((n) => n.id === agentId), [nodes, agentId]);
// Bootstrap from the canvas store's per-workspace message buffer so the
// user sees their prior thread on entry. The store is updated by the
// socket → ChatTab flows the desktop runs; on mobile we read from the
// same buffer to keep state coherent across viewports.
// NOTE: selector returns undefined (stable) — do NOT use ?? [] here,
// that creates a new [] reference on every store update when the key is
// absent, causing infinite re-render (React error #185).
const storedMessages = useCanvasStore((s) => s.agentMessages[agentId]);
// Start empty — history is loaded via useEffect below.
const [messages, setMessages] = useState<ChatMessage[]>([]);
const [draft, setDraft] = useState("");
const [tab, setTab] = useState<SubTab>("my");
const [sending, setSending] = useState(false);
const [error, setError] = useState<string | null>(null);
const [loading, setLoading] = useState(true); // history is loading on mount
const [historyError, setHistoryError] = useState<string | null>(null);
const scrollRef = useRef<HTMLDivElement>(null);
// Synchronous re-entry guard. `setSending(true)` schedules a state
// update but doesn't flush before a second tap can fire send() — a ref
// mirrors the desktop ChatTab pattern (sendInFlightRef) and closes the
// double-send race a stale `sending` lets through.
const sendInFlightRef = useRef(false);
const composerRef = useRef<HTMLTextAreaElement>(null);
// Guard: don't treat the initial store population as a live push.
// Set to false after the first render completes.
const initDoneRef = useRef(false);
const fileInputRef = useRef<HTMLInputElement>(null);
const [pendingFiles, setPendingFiles] = useState<File[]>([]);
const {
messages,
loading: historyLoading,
loadError: historyError,
loadInitial,
appendMessageDeduped,
} = useChatHistory(agentId);
const {
sending,
uploading,
sendMessage,
error: sendError,
clearError,
releaseSendGuards,
} = useChatSend(agentId, {
getHistoryMessages: () => messages,
onUserMessage: appendMessageDeduped,
onAgentMessage: appendMessageDeduped,
});
useChatSocket(agentId, {
onAgentMessage: appendMessageDeduped,
onSendComplete: releaseSendGuards,
});
// Auto-grow the textarea: reset height to 'auto' so the scrollHeight
// shrinks when the user deletes text, then size to scrollHeight up to
@@ -105,81 +250,26 @@ export function MobileChat({
el.style.height = `${next}px`;
}, [draft]);
// Fetch chat history on mount; keep merging live agentMessages while the
// panel is open. InitDoneRef prevents the initial store snapshot from
// triggering the live-merge path (the store buffer is populated by
// ChatTab on desktop, not on mobile — this effect loads history as the
// mobile-native path).
useEffect(() => {
let cancelled = false;
const mapApiMessage = (m: ApiChatMessage): ChatMessage => ({
id: m.id,
role: m.role === "user" ? "user" : "agent",
text: m.content,
ts: formatStoredTimestamp(m.timestamp),
});
const syncLive = () => {
const live = useCanvasStore.getState().agentMessages[agentId] ?? [];
if (live.length > 0) {
setMessages((prev) => {
const existingIds = new Set(prev.map((m) => m.id));
const newOnes = live
.filter((m) => !existingIds.has(m.id))
.map((m) => ({
id: m.id,
role: "agent" as const,
text: m.content,
ts: formatStoredTimestamp(m.timestamp),
}));
return newOnes.length > 0 ? [...prev, ...newOnes] : prev;
});
}
};
const bootstrap = async (): Promise<(() => void) | undefined> => {
setLoading(true);
setHistoryError(null);
try {
const res = await api.get<ChatHistoryResponse>(
`/workspaces/${agentId}/chat-history?limit=50`,
);
if (cancelled) return;
const initial = (res.messages ?? []).map(mapApiMessage);
setMessages(initial);
// Mark init done BEFORE marking loading=false so any store push
// that arrives in the same tick is treated as live, not init.
initDoneRef.current = true;
setLoading(false);
// Subscribe to live pushes after init is complete.
syncLive();
const unsubscribe = useCanvasStore.subscribe(syncLive);
return unsubscribe; // returned for cleanup
} catch (e) {
if (cancelled) return;
setHistoryError(e instanceof Error ? e.message : "Failed to load chat history");
setLoading(false);
initDoneRef.current = true;
return undefined;
}
};
let maybeUnsubscribe: (() => void) | undefined;
bootstrap().then((fn) => { maybeUnsubscribe = fn; });
return () => {
cancelled = true;
if (maybeUnsubscribe) maybeUnsubscribe();
};
}, [agentId]);
useEffect(() => {
if (scrollRef.current) {
scrollRef.current.scrollTop = scrollRef.current.scrollHeight;
}
}, [messages]);
// Consume any agent messages that arrived while history was loading.
const initialConsumeDoneRef = useRef(false);
useEffect(() => {
if (historyLoading || initialConsumeDoneRef.current) return;
initialConsumeDoneRef.current = true;
const consume = useCanvasStore.getState().consumeAgentMessages;
const msgs = consume(agentId);
for (const m of msgs) {
appendMessageDeduped(
createMessage("agent", m.content, m.attachments),
);
}
}, [historyLoading, agentId, appendMessageDeduped]);
if (!node) {
return (
<div
@@ -201,58 +291,32 @@ export function MobileChat({
const a = toMobileAgent(node);
const reachable = a.status === "online" || a.status === "degraded";
const onFilesPicked = (fileList: FileList | null) => {
if (!fileList) return;
const picked = Array.from(fileList);
setPendingFiles((prev) => {
const keyed = new Set(prev.map((f) => `${f.name}:${f.size}`));
return [...prev, ...picked.filter((f) => !keyed.has(`${f.name}:${f.size}`))];
});
if (fileInputRef.current) fileInputRef.current.value = "";
};
const removePendingFile = (index: number) =>
setPendingFiles((prev) => prev.filter((_, i) => i !== index));
const send = async () => {
const text = draft.trim();
if (!text || sending || !reachable) return;
if (sendInFlightRef.current) return;
sendInFlightRef.current = true;
if ((!text && pendingFiles.length === 0) || sending || !reachable) return;
clearError();
setDraft("");
setError(null);
setSending(true);
const myMsg: ChatMessage = {
id: crypto.randomUUID(),
role: "user",
text,
ts: formatTime(new Date()),
};
setMessages((m) => [...m, myMsg]);
try {
const res = await api.post<A2AResponseShape>(`/workspaces/${agentId}/a2a`, {
method: "message/send",
params: {
message: {
role: "user",
messageId: crypto.randomUUID(),
parts: [{ kind: "text", text }],
},
},
});
const reply =
res.result?.parts?.find((part) => part.kind === "text")?.text ?? "";
if (reply) {
setMessages((m) => [
...m,
{
id: crypto.randomUUID(),
role: "agent",
text: reply,
ts: formatTime(new Date()),
},
]);
} else if (res.error?.message) {
setError(res.error.message);
}
} catch (e) {
setError(e instanceof Error ? e.message : "Failed to send");
} finally {
setSending(false);
sendInFlightRef.current = false;
}
const files = pendingFiles;
setPendingFiles([]);
await sendMessage(text, files);
};
return (
<div
data-testid="chat-panel"
style={{
height: "100%",
display: "flex",
@@ -393,13 +457,12 @@ export function MobileChat({
Agent Comms peer-to-peer A2A traffic surfaces in the Comms tab.
</div>
)}
{tab === "my" && loading && (
{tab === "my" && historyLoading && (
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
<div style={{ marginBottom: 6, opacity: 0.6, animation: "spin 1s linear infinite", display: "inline-block", fontSize: 16 }}></div>
<div>Loading chat history</div>
Loading chat history
</div>
)}
{tab === "my" && !loading && historyError && (
{tab === "my" && !historyLoading && historyError && messages.length === 0 && (
<div
role="alert"
style={{
@@ -413,25 +476,7 @@ export function MobileChat({
<button
type="button"
onClick={() => {
setLoading(true);
setHistoryError(null);
api.get(`/workspaces/${agentId}/chat-history?limit=50`).then(
(res: unknown) => {
const r = res as ChatHistoryResponse;
setMessages((r.messages ?? []).map((m) => ({
id: m.id,
role: m.role === "user" ? "user" : "agent",
text: m.content,
ts: formatStoredTimestamp(m.timestamp),
})));
setLoading(false);
initDoneRef.current = true;
},
).catch((e: unknown) => {
setHistoryError(e instanceof Error ? e.message : "Failed to load");
setLoading(false);
initDoneRef.current = true;
});
loadInitial();
}}
style={{
padding: "6px 14px",
@@ -447,7 +492,7 @@ export function MobileChat({
</button>
</div>
)}
{tab === "my" && !loading && !historyError && messages.length === 0 && (
{tab === "my" && !historyLoading && !historyError && messages.length === 0 && (
<div style={{ padding: "20px 4px", textAlign: "center", color: p.text3, fontSize: 13 }}>
Send a message to start chatting.
</div>
@@ -476,7 +521,9 @@ export function MobileChat({
overflowWrap: "anywhere",
}}
>
{m.text}
<MarkdownBubble dark={dark} accent={p.accent}>
{m.content}
</MarkdownBubble>
<div
style={{
fontSize: 10,
@@ -485,13 +532,13 @@ export function MobileChat({
fontFamily: MOBILE_FONT_MONO,
}}
>
{m.ts}
{formatStoredTimestamp(m.timestamp)}
</div>
</div>
</div>
);
})}
{error && (
{sendError && (
<div
role="alert"
style={{
@@ -503,7 +550,7 @@ export function MobileChat({
fontSize: 12,
}}
>
{error}
{sendError}
</div>
)}
</div>
@@ -534,6 +581,60 @@ export function MobileChat({
backdropFilter: "blur(14px)",
}}
>
{pendingFiles.length > 0 && (
<div
style={{
display: "flex",
flexWrap: "wrap",
gap: 6,
marginBottom: 8,
paddingLeft: 2,
}}
>
{pendingFiles.map((f, i) => (
<div
key={`${f.name}:${f.size}`}
style={{
display: "flex",
alignItems: "center",
gap: 4,
padding: "3px 8px",
borderRadius: 10,
background: dark ? "#2a2823" : "#ece9e0",
fontSize: 12,
color: p.text2,
maxWidth: "100%",
}}
>
<span
style={{
overflow: "hidden",
textOverflow: "ellipsis",
whiteSpace: "nowrap",
}}
>
{f.name}
</span>
<button
type="button"
onClick={() => removePendingFile(i)}
aria-label={`Remove ${f.name}`}
style={{
border: "none",
background: "transparent",
color: p.text3,
cursor: "pointer",
fontSize: 12,
padding: 0,
lineHeight: 1,
}}
>
</button>
</div>
))}
</div>
)}
<div
style={{
display: "flex",
@@ -545,21 +646,32 @@ export function MobileChat({
padding: "6px 6px 6px 12px",
}}
>
<input
ref={fileInputRef}
type="file"
multiple
style={{ display: "none" }}
onChange={(e) => onFilesPicked(e.target.files)}
aria-hidden="true"
/>
<button
type="button"
onClick={() => fileInputRef.current?.click()}
disabled={!reachable || sending || uploading}
aria-label="Attach"
style={{
width: 32,
height: 32,
borderRadius: 999,
border: "none",
cursor: "pointer",
cursor: reachable && !sending && !uploading ? "pointer" : "not-allowed",
background: "transparent",
color: p.text3,
flexShrink: 0,
display: "flex",
alignItems: "center",
justifyContent: "center",
opacity: !reachable || sending || uploading ? 0.4 : 1,
}}
>
{Icons.attach({ size: 16 })}
@@ -605,28 +717,32 @@ export function MobileChat({
<button
type="button"
onClick={send}
disabled={!draft.trim() || !reachable || sending}
disabled={(!draft.trim() && pendingFiles.length === 0) || !reachable || sending || uploading}
aria-label="Send"
style={{
width: 36,
height: 36,
borderRadius: 999,
border: "none",
cursor: draft.trim() && !sending ? "pointer" : "not-allowed",
cursor: (draft.trim() || pendingFiles.length > 0) && !sending && !uploading ? "pointer" : "not-allowed",
flexShrink: 0,
background:
draft.trim() && reachable && !sending
(draft.trim() || pendingFiles.length > 0) && reachable && !sending && !uploading
? p.accent
: dark
? "#2a2823"
: "#ece9e0",
color: draft.trim() && reachable && !sending ? "#fff" : p.text3,
color: (draft.trim() || pendingFiles.length > 0) && reachable && !sending && !uploading ? "#fff" : p.text3,
display: "flex",
alignItems: "center",
justifyContent: "center",
}}
>
{Icons.send({ size: 16 })}
{uploading ? (
<span style={{ fontSize: 10, fontWeight: 600 }}></span>
) : (
Icons.send({ size: 16 })
)}
</button>
</div>
</div>
@@ -214,6 +214,7 @@ export function MobileDetail({
<button
type="button"
onClick={onChat}
data-testid="mobile-chat-cta"
style={{
width: "100%",
height: 52,
@@ -36,6 +36,7 @@ const mockStoreState = {
height?: number;
}>,
agentMessages: {} as Record<string, Array<{ id: string; content: string; timestamp: string }>>,
consumeAgentMessages: () => [],
};
vi.mock("@/store/canvas", () => ({
@@ -357,7 +358,7 @@ describe("MobileChat — chat history", () => {
renderChat(mockAgentId);
});
expect(api.get).toHaveBeenCalledWith(
`/workspaces/${mockAgentId}/chat-history?limit=50`,
expect.stringContaining(`/workspaces/${mockAgentId}/chat-history`),
);
});
@@ -288,6 +288,7 @@ export function AgentCard({
return (
<button
type="button"
data-testid="workspace-card"
aria-label={`${agent.name}, status: ${agent.status}, tier ${agent.tier}${agent.remote ? ", remote" : ""}`}
onClick={onClick}
style={{
@@ -16,7 +16,40 @@ interface TokensTabProps {
workspaceId: string;
}
// The settings panel passes the literal sentinel "global" when no canvas
// node is selected. Workspace tokens are inherently per-workspace — there
// is no /workspaces/global/tokens endpoint (querying the uuid column with
// "global" 500s on Postgres). The org-wide equivalent lives in the
// separate "Org API Keys" tab. Mirrors the sentinel-awareness that
// api/secrets.ts already has (workspaceId === 'global' → /settings/secrets).
const GLOBAL_WORKSPACE_ID = 'global';
export function TokensTab({ workspaceId }: TokensTabProps) {
if (workspaceId === GLOBAL_WORKSPACE_ID) {
return (
<div className="p-4 space-y-4">
<div>
<h3 className="text-sm font-semibold text-ink">API Tokens</h3>
<p className="text-[10px] text-ink-mid mt-0.5">
Bearer tokens for authenticating API calls to this workspace.
</p>
</div>
<div className="text-center py-6">
<p className="text-xs text-ink-mid">Select a workspace node first</p>
<p className="text-[10px] text-ink-mid mt-1">
Workspace tokens are scoped to a single workspace. Select a node
on the canvas to manage its tokens, or use the{' '}
<span className="text-accent font-medium">Org API Keys</span> tab
for org-wide API keys.
</p>
</div>
</div>
);
}
return <WorkspaceTokensTab workspaceId={workspaceId} />;
}
function WorkspaceTokensTab({ workspaceId }: TokensTabProps) {
const [tokens, setTokens] = useState<Token[]>([]);
const [loading, setLoading] = useState(true);
const [creating, setCreating] = useState(false);
@@ -302,3 +302,35 @@ describe("TokensTab — error", () => {
expect(document.querySelector('[role="status"]')).toBeNull();
});
});
// ─── "global" sentinel (no node selected) ────────────────────────────────────
//
// Regression: SettingsPanel passes the literal "global" when no canvas
// node is selected. workspace tokens are per-workspace and there is no
// /workspaces/global/tokens endpoint — calling it 500'd
// ("invalid input syntax for type uuid: global"). The tab must NOT call
// the API in that state and must point the user at the Org API Keys tab.
describe("TokensTab — global sentinel (no node selected)", () => {
beforeEach(() => {
mockApiGet.mockReset();
mockApiPost.mockReset();
mockApiGet.mockRejectedValue(new Error("should not be called"));
});
it("does not call the API and shows a pointer to Org API Keys", async () => {
render(<TokensTab workspaceId="global" />);
await flush();
expect(mockApiGet).not.toHaveBeenCalled();
expect(mockApiPost).not.toHaveBeenCalled();
expect(document.body.textContent).toContain("Select a workspace node");
expect(document.body.textContent).toContain("Org API Keys");
// No error banner, no scary 500 surfacing.
expect(document.querySelector(".text-bad")).toBeNull();
});
it("has no create button in the global state", async () => {
render(<TokensTab workspaceId="global" />);
await flush();
expect(document.body.textContent).not.toContain("New Token");
});
});
+97 -696
View File
@@ -5,16 +5,19 @@ import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import { api } from "@/lib/api";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { useSocketEvent } from "@/hooks/useSocketEvent";
import { type ChatMessage, type ChatAttachment, createMessage, appendMessageDeduped } from "./chat/types";
import { uploadChatFiles, downloadChatFile, isPlatformAttachment } from "./chat/uploads";
import { downloadChatFile, isPlatformAttachment } from "./chat/uploads";
import { PendingAttachmentPill } from "./chat/AttachmentViews";
import { AttachmentPreview } from "./chat/AttachmentPreview";
import { extractFilesFromTask } from "./chat/message-parser";
import { AgentCommsPanel } from "./chat/AgentCommsPanel";
import { appendActivityLine } from "./chat/activityLog";
import { runtimeDisplayName } from "@/lib/runtime-names";
import { ConfirmDialog } from "@/components/ConfirmDialog";
import { useChatHistory } from "./chat/hooks/useChatHistory";
import { useChatSend } from "./chat/hooks/useChatSend";
import { useChatSocket } from "./chat/hooks/useChatSocket";
export { extractReplyText } from "./chat/hooks/useChatSend";
interface Props {
workspaceId: string;
@@ -23,147 +26,6 @@ interface Props {
type ChatSubTab = "my-chat" | "agent-comms";
// A2A response shape (subset). The full schema is in @a2a-js/sdk but we only
// need parts/artifacts text + file extraction for the synchronous fallback.
interface A2AFileRef {
name?: string;
mimeType?: string;
uri?: string;
bytes?: string;
size?: number;
}
// Outbound shape matches a2a-sdk's JSON-RPC `SendMessageRequest`
// Pydantic union (TextPart | FilePart | DataPart). The flat
// protobuf shape `{url, filename, mediaType}` is rejected at the
// request boundary with `Field required` errors — keep this
// outbound shape unless a2a-sdk migrates the JSON-RPC schema.
interface A2APart {
kind: string;
text?: string;
file?: A2AFileRef;
}
interface A2AResponse {
result?: {
parts?: A2APart[];
artifacts?: Array<{ parts: A2APart[] }>;
};
}
// Internal-self-message filtering moved server-side in RFC #2945
// PR-C/D — the platform's /chat-history endpoint applies the
// IsInternalSelfMessage predicate before returning rows, so the
// client no longer needs the local backstop on the history path.
// The proper fix is still X-Workspace-ID header (source_id=workspace_id);
// the platform-side prefix filter handles the residual cases.
// extractReplyText pulls the agent's text reply out of an A2A response.
// Concatenates ALL text parts (joined with "\n") rather than returning
// just the first. Claude Code and other runtimes commonly emit multi-
// part text replies for long content (markdown tables, code blocks),
// and the prior "first part wins" implementation silently truncated
// the rest — observed on a 15k-char Wave 1 brief that rendered only
// the table header. Mirrors extractTextsFromParts in message-parser.ts.
//
// Server-side counterpart in workspace-server/internal/channels/
// manager.go has the same single-part bug; fix that too if/when a
// channel-delivered reply (Slack, Lark, etc.) gets truncated.
export function extractReplyText(resp: A2AResponse): string {
const collect = (parts: A2APart[] | undefined): string => {
if (!parts) return "";
return parts
.filter((p) => p.kind === "text")
.map((p) => p.text ?? "")
.filter(Boolean)
.join("\n");
};
const result = resp?.result;
const collected: string[] = [];
const fromParts = collect(result?.parts);
if (fromParts) collected.push(fromParts);
// Walk artifacts even if parts had text — some producers (Hermes
// tool calls) emit a summary in parts AND details in artifacts.
// Returning early on parts dropped the artifact body silently.
if (result?.artifacts) {
for (const a of result.artifacts) {
const t = collect(a.parts);
if (t) collected.push(t);
}
}
return collected.join("\n");
}
// Agent-returned files live on the same response shape as text —
// delegated to extractFilesFromTask in message-parser.ts, which also
// walks status.message.parts (that ChatTab's legacy text extractor
// doesn't). Single source of truth for file-part parsing across
// live chat, activity log replay, and any future consumers.
/** Initial chat history page size. The newest N messages are rendered
* on first paint; older history is fetched on demand via loadOlder()
* when the user scrolls the top sentinel into view. */
const INITIAL_HISTORY_LIMIT = 10;
/** Subsequent older-history batch size. Larger than INITIAL so a long
* scroll-back doesn't fan out into many round-trips. */
const OLDER_HISTORY_BATCH = 20;
/**
* Load chat history from the platform's typed /chat-history endpoint.
*
* Server-side rendering of activity_logs rows into ChatMessage shape
* lives in workspace-server/internal/messagestore/postgres_store.go
* (RFC #2945 PR-C/D). The server already applies the canvas-source
* filter, the internal-self-message predicate, the role decision
* (status=error vs agent-error prefix → system), and the v0/v1
* file-shape extraction. Canvas just renders what it receives.
*
* Wire shape (mirrors ChatMessage exactly, no per-row mapping needed):
*
* GET /workspaces/:id/chat-history?limit=N&before_ts=T
* 200 → {"messages": ChatMessage[], "reached_end": boolean}
*
* Pagination:
* - Pass `limit` to bound the page size (newest-first from server).
* - Pass `beforeTs` (RFC3339) to fetch rows STRICTLY OLDER than that
* timestamp. Combined with limit, this yields the next-older page
* when scrolling backward through history.
*
* `reachedEnd` is propagated from the server. The server computes it
* by comparing rowCount vs limit so a partial last page is correctly
* detected even when the row→bubble fan-out is non-1:1 (each row
* produces 1-2 bubbles).
*/
async function loadMessagesFromDB(
workspaceId: string,
limit: number,
beforeTs?: string,
): Promise<{ messages: ChatMessage[]; error: string | null; reachedEnd: boolean }> {
try {
const params = new URLSearchParams({ limit: String(limit) });
if (beforeTs) params.set("before_ts", beforeTs);
const resp = await api.get<{ messages: ChatMessage[]; reached_end: boolean }>(
`/workspaces/${workspaceId}/chat-history?${params.toString()}`,
);
// Server emits oldest-first within the page (RFC #2945 PR-C-2
// post-fix: server reverses row-aware before returning so the
// wire is display-ready). Canvas appends/prepends without
// reordering — this avoids the pair-flip bug a naive flat
// reverse causes when each row produces a (user, agent) pair
// with the same timestamp.
return {
messages: resp.messages ?? [],
error: null,
reachedEnd: resp.reached_end,
};
} catch (err) {
return {
messages: [],
error: err instanceof Error ? err.message : "Failed to load chat history",
reachedEnd: true,
};
}
}
/**
* ChatTab container — renders sub-tab bar + My Chat or Agent Comms panel.
*/
@@ -171,7 +33,7 @@ export function ChatTab({ workspaceId, data }: Props) {
const [subTab, setSubTab] = useState<ChatSubTab>("my-chat");
return (
<div className="flex flex-col h-full">
<div data-testid="chat-panel" className="flex flex-col h-full">
{/* Sub-tab bar — role="tablist" so screen readers expose tab context */}
<div
role="tablist"
@@ -247,268 +109,68 @@ export function ChatTab({ workspaceId, data }: Props) {
* MyChatPanel — user↔agent conversation (extracted from original ChatTab).
*/
function MyChatPanel({ workspaceId, data }: Props) {
const [messages, setMessages] = useState<ChatMessage[]>([]);
const [input, setInput] = useState("");
// `sending` is strictly the "this tab kicked off a send and hasn't
// seen the reply yet" signal. Previously this was initialized from
// data.currentTask to pick up in-flight agent work on mount, but
// that conflated agent-busy (workspace heartbeat) with user-
// in-flight (local send): when the WS dropped a TASK_COMPLETE event,
// currentTask lingered, the component re-mounted with sending=true,
// and the Send button stayed disabled forever even though nothing
// local was in flight. For the "agent is busy, show spinner" UX,
// use data.currentTask directly in the render path.
const [sending, setSending] = useState(false);
const [thinkingElapsed, setThinkingElapsed] = useState(0);
const [pendingFiles, setPendingFiles] = useState<File[]>([]);
const [activityLog, setActivityLog] = useState<string[]>([]);
const [loading, setLoading] = useState(true);
const [loadError, setLoadError] = useState<string | null>(null);
const currentTaskRef = useRef(data.currentTask);
const sendingFromAPIRef = useRef(false);
const [thinkingElapsed, setThinkingElapsed] = useState(0);
const [agentReachable, setAgentReachable] = useState(false);
const [error, setError] = useState<string | null>(null);
const [confirmRestart, setConfirmRestart] = useState(false);
const bottomRef = useRef<HTMLDivElement>(null);
// First-mount scroll-to-bottom needs `behavior: "instant"` — long
// conversations smooth-animate for ~300ms which any concurrent
// re-render can interrupt, leaving the user stuck mid-conversation
// when the chat tab opens. Subsequent appends (new agent messages)
// keep `smooth` for the visual "landing" feel. Flipped the first
// time messages.length goes positive, so a workspace switch (which
// remounts ChatTab) gets a fresh instant jump too.
const hasInitialScrollRef = useRef(false);
// Lazy-load older history on scroll-up.
// - containerRef = the scrollable messages viewport
// - topRef = sentinel above the messages list; IO observes it
// and triggers loadOlder() when it enters view
// - hasMore = false once a fetch returns < limit rows; stops IO
// - loadingOlder = drives the "Loading older messages…" UI label
// - inflightRef = synchronous guard against double-entry of loadOlder
// when the IO callback fires twice in the same
// microtask (state-based guard would be stale until
// the next React commit)
// - scrollAnchorRef = saves distance-from-bottom before a prepend
// so the useLayoutEffect below can restore the
// user's exact viewport position. Without this,
// prepending older messages would jump the scroll
// position by the height of the new content.
// - oldestMessageRef / hasMoreRef = let the loadOlder closure read
// the latest values without taking them as deps —
// every live agent push mutates `messages`, and
// having loadOlder depend on `messages` would tear
// down + re-arm the IntersectionObserver on every
// push. Refs decouple the observer lifecycle from
// message-list updates.
const [dragOver, setDragOver] = useState(false);
const containerRef = useRef<HTMLDivElement>(null);
const topRef = useRef<HTMLDivElement>(null);
const [hasMore, setHasMore] = useState(true);
const [loadingOlder, setLoadingOlder] = useState(false);
const inflightRef = useRef(false);
// The scroll anchor includes the first-message id as it was BEFORE
// the prepend — see useLayoutEffect below for why. Without this tag,
// a live agent push that appends WHILE loadOlder is in flight would
// run useLayoutEffect against the append (anchor still set), the
// "restore" math would scroll the user to a stale offset, AND the
// append's normal scroll-to-bottom would be swallowed.
const scrollAnchorRef = useRef<
{ savedDistanceFromBottom: number; expectFirstIdNotEqual: string | null } | null
>(null);
const oldestMessageRef = useRef<ChatMessage | null>(null);
const hasMoreRef = useRef(true);
// Monotonic token bumped on workspace switch + on every loadOlder
// entry. Each fetch's .then() captures its own token; if the token
// has moved, the resolved messages belong to a stale workspace or a
// superseded fetch and we silently drop them. Without this guard, a
// workspace switch mid-fetch would have the in-flight promise
// resolve into the new workspace's setMessages — the user sees
// someone else's history briefly.
const fetchTokenRef = useRef(0);
// Files the user has picked but not yet sent. Cleared on send
// (upload success) or by the × on each pill.
const [pendingFiles, setPendingFiles] = useState<File[]>([]);
const [uploading, setUploading] = useState(false);
const bottomRef = useRef<HTMLDivElement>(null);
const hasInitialScrollRef = useRef(false);
const fileInputRef = useRef<HTMLInputElement>(null);
// Guard against a double-click during the upload phase: React
// state updates from the click that started the upload haven't
// flushed yet, so the disabled-button logic sees `uploading=false`
// from the closure and lets a second `sendMessage` enter. A ref
// observes the latest value synchronously.
const sendInFlightRef = useRef(false);
// Monotonic token bumped on every sendMessage entry. Each .then()/
// .catch() captures its own token in closure and bails if a newer
// send has superseded it — prevents a late HTTP response for an
// earlier message from clobbering the flags / appending text that
// belong to a newer in-flight send. Race scenario the token closes:
// (1) send msg #1 (2) WS push for msg #1 arrives, releases guards
// (3) user sends msg #2 (4) HTTP for msg #1 finally lands — without
// the token check, .then() sees sendingFromAPIRef=true (set by
// msg #2's send), enters the main body, and processes msg #1's body
// as if it were msg #2's reply.
const sendTokenRef = useRef(0);
const dragDepthRef = useRef(0);
const pasteCounterRef = useRef(0);
// Release every in-flight send guard at once. Used by every site
// that ends a send: pendingAgentMsgs WS push, ACTIVITY_LOGGED
// a2a_receive ok/error WS event, HTTP .then() success, and HTTP
// .catch() success. Keep these in lockstep — a future contributor
// adding a new "I saw the reply" path that only clears `sending` +
// `sendingFromAPIRef` (the natural pair) silently re-introduces
// the post-WS Send-button freeze, because the disabled-button
// logic can't see `sendInFlightRef` and so the visible state diverges
// from the synchronous re-entry guard at line 464.
const releaseSendGuards = useCallback(() => {
setSending(false);
sendingFromAPIRef.current = false;
sendInFlightRef.current = false;
}, []);
const history = useChatHistory(workspaceId, containerRef);
const chatSend = useChatSend(workspaceId, {
getHistoryMessages: () => history.messages,
onUserMessage: (msg) => history.setMessages((prev) => [...prev, msg]),
onAgentMessage: (msg) => history.setMessages((prev) => appendMessageDeduped(prev, msg)),
});
const { sending, uploading, sendMessage, error: sendError, clearError: clearSendError, releaseSendGuards, sendingFromAPIRef } = chatSend;
// Initial-load fetch — used by the mount effect and the "Retry"
// button below. Single source of truth so the two paths can't drift
// (e.g. INITIAL_HISTORY_LIMIT bumped in the effect but not the
// retry, leading to inconsistent first-paint sizes).
const loadInitial = useCallback(() => {
setLoading(true);
setLoadError(null);
setHasMore(true);
// Bump the token; any in-flight fetch from the previous workspace
// (or a previous retry) will see token != myToken in its .then()
// and silently bail — the late response can't clobber the new
// workspace's state.
fetchTokenRef.current += 1;
const myToken = fetchTokenRef.current;
loadMessagesFromDB(workspaceId, INITIAL_HISTORY_LIMIT).then(
({ messages: msgs, error: fetchErr, reachedEnd }) => {
if (fetchTokenRef.current !== myToken) return;
setMessages(msgs);
setLoadError(fetchErr);
setHasMore(!reachedEnd);
setLoading(false);
},
);
}, [workspaceId]);
const displayError = error || sendError;
// Load chat history on mount / workspace switch.
// Initial load is bounded to INITIAL_HISTORY_LIMIT (newest 10) — the
// rest streams in as the user scrolls up via loadOlder() below. Pre-
// 2026-05-05 this fetched the newest 50 in one shot; on a long-running
// workspace that meant 50× message-bubble paint + DOM cost on every
// tab-open even when the user only wanted to read the last few.
useEffect(() => {
loadInitial();
}, [loadInitial]);
// Mirror the latest oldest-message + hasMore into refs so loadOlder
// can read them without taking `messages` as a dep. Every live push
// through agentMessages would otherwise recreate loadOlder and tear
// down the IO observer.
useEffect(() => {
oldestMessageRef.current = messages[0] ?? null;
}, [messages]);
useEffect(() => {
hasMoreRef.current = hasMore;
}, [hasMore]);
// Fetch the next-older batch and prepend. Stable identity (deps =
// [workspaceId]) so the IntersectionObserver effect below doesn't
// re-arm on every messages update.
const loadOlder = useCallback(async () => {
// inflightRef is the load-bearing guard — synchronous, set BEFORE
// any await, so two IO callbacks dispatched in the same microtask
// can't both pass. The state checks are defensive secondary
// gates for the slow-scroll case.
if (inflightRef.current || !hasMoreRef.current) return;
const oldest = oldestMessageRef.current;
if (!oldest) return;
const container = containerRef.current;
if (!container) return;
inflightRef.current = true;
// Capture the user's distance-from-bottom BEFORE we prepend so the
// useLayoutEffect can restore it after the new DOM lands. The
// expectFirstIdNotEqual tag is what the layout effect checks
// against `messages[0].id` to disambiguate prepend (id changed) vs
// append (id unchanged → live message landed mid-fetch). Without
// it, an agent push during loadOlder runs the "restore" against a
// stale anchor — user gets yanked + the append's bottom-pin is
// swallowed.
scrollAnchorRef.current = {
savedDistanceFromBottom: container.scrollHeight - container.scrollTop,
expectFirstIdNotEqual: oldest.id,
};
fetchTokenRef.current += 1;
const myToken = fetchTokenRef.current;
setLoadingOlder(true);
try {
const { messages: older, reachedEnd } = await loadMessagesFromDB(
workspaceId,
OLDER_HISTORY_BATCH,
oldest.timestamp,
);
// Workspace switched (or another loadOlder bumped the token)
// mid-fetch — drop these results, they belong to a stale tab.
if (fetchTokenRef.current !== myToken) {
scrollAnchorRef.current = null;
return;
useChatSocket(workspaceId, {
onAgentMessage: (msg) => {
history.setMessages((prev) => appendMessageDeduped(prev, msg));
if (sendingFromAPIRef.current) {
releaseSendGuards();
}
if (older.length > 0) {
setMessages((prev) => [...older, ...prev]);
} else {
// Nothing came back — clear the anchor so the next paint doesn't
// try to "restore" against a no-op prepend.
scrollAnchorRef.current = null;
},
onActivityLog: (entry) => {
if (!sending) return;
setActivityLog((prev) => appendActivityLine(prev, entry));
},
onSendComplete: () => {
if (sendingFromAPIRef.current) {
releaseSendGuards();
}
setHasMore(!reachedEnd);
} finally {
setLoadingOlder(false);
inflightRef.current = false;
}
}, [workspaceId]);
// IntersectionObserver on the top sentinel. Fires loadOlder() the
// moment the user scrolls within 200px of the top. AbortController
// unwires cleanly on workspace switch / unmount; root is the
// scrollable container so we observe only what's visible inside it.
//
// Dependencies:
// - loadOlder — stable per workspaceId (refs decouple it from
// message updates), so this dep is here for the
// workspace-switch case only
// - hasMore — re-run when older history runs out so we
// disconnect cleanly
// - hasMessages — load-bearing: the sentinel JSX is gated on
// `messages.length > 0`, so topRef.current is null
// on the empty-messages render. We re-arm exactly
// once when messages first land. NOT depending on
// `messages.length` (or `messages`) directly so
// each subsequent message append doesn't tear down
// + re-arm the observer.
const hasMessages = messages.length > 0;
useEffect(() => {
const top = topRef.current;
const container = containerRef.current;
if (!top || !container) return;
if (!hasMore) return; // stop observing when no older history exists
const ac = new AbortController();
const io = new IntersectionObserver(
(entries) => {
if (ac.signal.aborted) return;
if (entries[0]?.isIntersecting) loadOlder();
},
{ root: container, rootMargin: "200px 0px 0px 0px", threshold: 0 },
);
io.observe(top);
ac.signal.addEventListener("abort", () => io.disconnect());
return () => ac.abort();
}, [loadOlder, hasMore, hasMessages]);
},
onSendError: (err) => {
if (sendingFromAPIRef.current) {
releaseSendGuards();
setError(err);
}
},
});
// Agent reachability
useEffect(() => {
const reachable = data.status === "online" || data.status === "degraded";
setAgentReachable(reachable);
setError(reachable ? null : `Agent is ${data.status}`);
}, [data.status]);
useEffect(() => {
currentTaskRef.current = data.currentTask;
}, [data.currentTask]);
if (reachable) {
setError(null);
clearSendError();
} else {
setError(`Agent is ${data.status}`);
}
}, [data.status, clearSendError]);
// Scroll behavior across messages updates:
// - Prepend (loadOlder landed) → restore the user's saved
@@ -518,71 +180,24 @@ function MyChatPanel({ workspaceId, data }: Props) {
// paint — otherwise the user sees the page jump for one frame.
useLayoutEffect(() => {
const container = containerRef.current;
const anchor = scrollAnchorRef.current;
// Only honor the anchor when this messages-update is the prepend
// we expected. messages[0].id is the test:
// - prepend → messages[0] is one of the older rows → id !== expectFirstIdNotEqual
// - append → messages[0] unchanged → id === expectFirstIdNotEqual → fall through
// Without this check, an agent push that lands mid-loadOlder would
// run the restore against the append's update, yank the user's
// scroll, AND swallow the append's bottom-pin.
const anchor = history.scrollAnchorRef.current;
if (
anchor &&
container &&
messages.length > 0 &&
messages[0].id !== anchor.expectFirstIdNotEqual
history.messages.length > 0 &&
history.messages[0].id !== anchor.expectFirstIdNotEqual
) {
container.scrollTop = container.scrollHeight - anchor.savedDistanceFromBottom;
scrollAnchorRef.current = null;
history.scrollAnchorRef.current = null;
return;
}
// Instant on first arrival of messages — smooth-scroll on a long
// conversation gets interrupted by concurrent renders and leaves
// the user stuck in the middle. After the first jump, subsequent
// appends animate as before.
if (!hasInitialScrollRef.current && messages.length > 0) {
if (!hasInitialScrollRef.current && history.messages.length > 0) {
hasInitialScrollRef.current = true;
bottomRef.current?.scrollIntoView({ behavior: "instant" as ScrollBehavior });
return;
}
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
}, [messages]);
// Consume agent push messages (send_message_to_user) from global store.
// Runtimes like Claude Code SDK deliver their reply via a WS push rather
// than the /a2a HTTP response — when that happens, the push is the
// authoritative "reply arrived" signal for the UI, so clear `sending`
// here too. The HTTP .then() coordinates through sendingFromAPIRef so
// whichever path clears first wins.
const pendingAgentMsgs = useCanvasStore((s) => s.agentMessages[workspaceId]);
useEffect(() => {
if (!pendingAgentMsgs || pendingAgentMsgs.length === 0) return;
const consume = useCanvasStore.getState().consumeAgentMessages;
const msgs = consume(workspaceId);
for (const m of msgs) {
// Dedupe in case the agent proactively pushed the same text the
// HTTP /a2a response already delivered (observed with the Hermes
// runtime, which emits both a reply body and a send_message_to_user
// push for the same content). Attachments ride along with the
// message so files returned by the A2A_RESPONSE WS path render
// their download chips.
setMessages((prev) => appendMessageDeduped(prev, createMessage("agent", m.content, m.attachments)));
}
if (sendingFromAPIRef.current && msgs.length > 0) {
// Reply arrived via WS push (e.g. claude-code SDK). Release all
// three guards together — without sendInFlightRef the next
// sendMessage() silently no-ops at the synchronous re-entry
// check.
releaseSendGuards();
}
}, [pendingAgentMsgs, workspaceId]);
// Resolve workspace ID → name for activity display
const resolveWorkspaceName = useCallback((id: string) => {
const nodes = useCanvasStore.getState().nodes;
const node = nodes.find((n) => n.id === id);
return (node?.data as WorkspaceNodeData)?.name || id.slice(0, 8);
}, []);
}, [history.messages, history.scrollAnchorRef]);
// Elapsed timer while sending
useEffect(() => {
@@ -609,211 +224,43 @@ function MyChatPanel({ workspaceId, data }: Props) {
setActivityLog([`Processing with ${runtimeDisplayName(data.runtime)}...`]);
}, [sending, data.runtime]);
// Subscribe to global WS via the singleton ReconnectingSocket (no
// per-component WebSocket — the previous pattern dropped events
// silently on any reconnect because each panel's raw socket had no
// onclose handler).
useSocketEvent((msg) => {
if (!sending) return;
try {
if (msg.event === "ACTIVITY_LOGGED") {
// Filter to events for THIS workspace. The platform's
// BroadcastOnly fires to every connected client, and
// without this guard a sibling workspace's a2a_send would
// surface as "→ Delegating to X..." inside the wrong
// chat panel. (workspace_id on the WS envelope is the
// workspace whose activity_log row we just wrote.)
if (msg.workspace_id !== workspaceId) return;
// IntersectionObserver on the top sentinel. Fires loadOlder() the
// moment the user scrolls within 200px of the top. AbortController
// unwires cleanly on workspace switch / unmount; root is the
// scrollable container so we observe only what's visible inside it.
const hasMessages = history.messages.length > 0;
useEffect(() => {
const top = topRef.current;
const container = containerRef.current;
if (!top || !container) return;
if (!history.hasMore) return;
const ac = new AbortController();
const io = new IntersectionObserver(
(entries) => {
if (ac.signal.aborted) return;
if (entries[0]?.isIntersecting) history.loadOlder();
},
{ root: container, rootMargin: "200px 0px 0px 0px", threshold: 0 },
);
io.observe(top);
ac.signal.addEventListener("abort", () => io.disconnect());
return () => ac.abort();
}, [history.loadOlder, history.hasMore, hasMessages]);
const p = msg.payload || {};
const type = p.activity_type as string;
const method = (p.method as string) || "";
const status = (p.status as string) || "";
const targetId = (p.target_id as string) || "";
const durationMs = p.duration_ms as number | undefined;
const summary = (p.summary as string) || "";
let line = "";
if (type === "a2a_receive" && method === "message/send") {
const targetName = resolveWorkspaceName(targetId || msg.workspace_id);
if (status === "ok" && durationMs) {
const sec = Math.round(durationMs / 1000);
line = `${targetName} responded (${sec}s)`;
// The platform logs a successful a2a_receive once the workspace
// has fully produced its reply. That's the authoritative "done"
// signal for the spinner — clear it even if the reply hasn't
// surfaced through the store yet (it may be delivered shortly
// via pendingAgentMsgs or the HTTP .then()).
const own = (targetId || msg.workspace_id) === workspaceId;
if (own && sendingFromAPIRef.current) {
releaseSendGuards();
}
} else if (status === "error") {
line = `${targetName} error`;
const own = (targetId || msg.workspace_id) === workspaceId;
if (own && sendingFromAPIRef.current) {
releaseSendGuards();
setError("Agent error (Exception) — see workspace logs for details.");
}
}
} else if (type === "a2a_send") {
const targetName = resolveWorkspaceName(targetId);
line = `→ Delegating to ${targetName}...`;
} else if (type === "task_update") {
if (summary) line = `${summary}`;
} else if (type === "agent_log") {
// Per-tool-use telemetry from claude_sdk_executor's
// _report_tool_use. The summary already carries an icon
// + human-readable args (📄 Read /path, ⚡ Bash: …)
// so we render it verbatim. No icon prefix here — the
// emoji at the start of summary is the visual marker.
if (summary) line = summary;
}
if (line) {
setActivityLog((prev) => appendActivityLine(prev, line));
}
} else if (msg.event === "TASK_UPDATED" && msg.workspace_id === workspaceId) {
const task = (msg.payload?.current_task as string) || "";
if (task) {
setActivityLog((prev) => appendActivityLine(prev, `${task}`));
}
}
// A2A_RESPONSE is already consumed by the store and its text is
// appended to messages via the pendingAgentMsgs effect above; we
// don't need to duplicate it here.
} catch { /* ignore */ }
});
const sendMessage = async () => {
const handleSend = async () => {
const text = input.trim();
const filesToSend = pendingFiles;
// Allow sending if EITHER text OR attachments are present — a user
// can drop a file with no text and the agent still receives it.
if ((!text && filesToSend.length === 0) || !agentReachable || sending || uploading) return;
// Synchronous re-entry guard — see sendInFlightRef comment.
if (sendInFlightRef.current) return;
sendInFlightRef.current = true;
// Upload attachments first so we can include URIs in the A2A
// message parts. Sequential-before-send: a message with references
// to files not yet staged would fail agent-side; staging happens
// synchronously via /chat/uploads before message/send dispatch.
let uploaded: ChatAttachment[] = [];
if (filesToSend.length > 0) {
setUploading(true);
try {
uploaded = await uploadChatFiles(workspaceId, filesToSend);
} catch (e) {
setUploading(false);
sendInFlightRef.current = false;
setError(e instanceof Error ? `Upload failed: ${e.message}` : "Upload failed");
return;
}
setUploading(false);
}
const files = pendingFiles;
if ((!text && files.length === 0) || !agentReachable || sending || uploading) return;
setInput("");
setPendingFiles([]);
setMessages((prev) => [...prev, createMessage("user", text, uploaded)]);
setSending(true);
sendingFromAPIRef.current = true;
clearSendError();
setError(null);
// Capture this send's token so the .then()/.catch() callbacks can
// detect a newer send that may have superseded them. See the
// sendTokenRef declaration for the race scenario this closes.
const myToken = ++sendTokenRef.current;
// Build conversation history from prior messages (last 20)
const history = messages
.filter((m) => m.role === "user" || m.role === "agent")
.slice(-20)
.map((m) => ({
role: m.role === "user" ? "user" : "agent",
parts: [{ kind: "text", text: m.content }],
}));
// A2A parts: text part (if any) + file parts (per attachment). The
// agent sees both in a single turn, matching the A2A spec shape.
// Wire shape is v0 — see A2APart definition above.
const parts: A2APart[] = [];
if (text) parts.push({ kind: "text", text });
for (const att of uploaded) {
parts.push({
kind: "file",
file: {
name: att.name,
mimeType: att.mimeType,
uri: att.uri,
size: att.size,
},
});
}
// A2A calls can legitimately take minutes — LLM latency +
// multi-turn tool use is common on slower providers (Hermes+minimax,
// Claude Code invoking bash/file tools, etc.). The 15s default
// would silently abort the fetch here, leaving the server to
// complete the reply and the user staring at
// "agent may be unreachable". Match the upload timeout (60s × 2)
// for the happy-path ceiling; anything longer is genuinely stuck.
api.post<A2AResponse>(`/workspaces/${workspaceId}/a2a`, {
method: "message/send",
params: {
message: {
role: "user",
messageId: crypto.randomUUID(),
parts,
},
metadata: { history },
},
}, { timeoutMs: 120_000 })
.then((resp) => {
// Bail without touching any flags if a newer sendMessage has
// already run — its myToken bumped sendTokenRef, so this is
// a stale callback for an earlier message. The newer send
// owns the in-flight guards now.
if (sendTokenRef.current !== myToken) return;
// Skip if the WS A2A_RESPONSE event already handled this response.
// Both paths (WS + HTTP) check sendingFromAPIRef — whichever clears
// it first wins, the other becomes a no-op (no duplicate messages).
if (!sendingFromAPIRef.current) {
sendInFlightRef.current = false;
return;
}
const replyText = extractReplyText(resp);
const replyFiles = extractFilesFromTask((resp?.result ?? {}) as Record<string, unknown>);
if (replyText || replyFiles.length > 0) {
setMessages((prev) =>
appendMessageDeduped(prev, createMessage("agent", replyText, replyFiles)),
);
}
releaseSendGuards();
})
.catch(() => {
// Stale-callback guard — same rationale as .then().
if (sendTokenRef.current !== myToken) return;
// Same dedup guard as .then(): if a WS path (pendingAgentMsgs
// or ACTIVITY_LOGGED a2a_receive ok) already delivered the
// reply, sendingFromAPIRef is already false and there's
// nothing to roll back. Surfacing "Failed to send" here would
// contradict the agent reply the user is currently reading —
// exactly the false-positive observed when the HTTP request
// hung up (proxy idle / 502) after WS already won.
if (!sendingFromAPIRef.current) {
sendInFlightRef.current = false;
return;
}
releaseSendGuards();
setError("Failed to send message — agent may be unreachable");
});
await sendMessage(text, files);
};
const onFilesPicked = (fileList: FileList | null) => {
if (!fileList) return;
const picked = Array.from(fileList);
// Deduplicate against current pending set by name+size — user
// picking the same file twice shouldn't append it.
setPendingFiles((prev) => {
const keyed = new Set(prev.map((f) => `${f.name}:${f.size}`));
return [...prev, ...picked.filter((f) => !keyed.has(`${f.name}:${f.size}`))];
@@ -824,35 +271,7 @@ function MyChatPanel({ workspaceId, data }: Props) {
const removePendingFile = (index: number) =>
setPendingFiles((prev) => prev.filter((_, i) => i !== index));
// Monotonic counter so two paste events within the same wall-clock
// second still produce distinct filenames. Without this, on
// Firefox (where pasted images have an empty `file.name`), two
// pastes ~100ms apart could yield identical synthetic names AND
// identical sizes, collapsing into one attachment via the
// `name:size` dedup in onFilesPicked.
const pasteCounterRef = useRef(0);
/** Paste-from-clipboard image attachment.
*
* Browser clipboard image items arrive as `File`s whose `name` is
* often a generic "image.png" (Chrome) or empty (Firefox/Safari),
* so two consecutive screenshot pastes collide on the name+size
* dedup the file-picker uses. Re-tag each pasted image with a
* per-paste unique name so dedup keeps them apart and the upload
* pipeline (which expects a non-empty filename) is happy.
*
* Falls through to onFilesPicked via direct File[] (NOT through
* the DataTransfer constructor — that throws on Safari < 14.1
* and old Edge, silently aborting the paste).
*
* Only intercepts the paste when the clipboard has at least one
* image; text-only pastes fall through to the textarea's default
* behaviour. */
const mimeToExt = (mime: string): string => {
// Avoid raw `mime.split("/")[1]` — that yields `"svg+xml"`,
// `"jpeg"`, `"webp"` etc. which produce ugly filenames and may
// trip server-side extension allowlists. Map known types
// explicitly; unknown falls back to a safe default.
if (mime === "image/svg+xml") return "svg";
if (mime === "image/jpeg") return "jpg";
if (mime === "image/png") return "png";
@@ -873,26 +292,16 @@ function MyChatPanel({ workspaceId, data }: Props) {
const file = item.getAsFile();
if (!file) continue;
const ext = mimeToExt(file.type);
const stamp = new Date()
.toISOString()
.replace(/[:.]/g, "-")
.slice(0, 19);
const stamp = new Date().toISOString().replace(/[:.]/g, "-").slice(0, 19);
const seq = pasteCounterRef.current++;
const fname = `pasted-${stamp}-${seq}-${i}.${ext}`;
imageFiles.push(new File([file], fname, { type: file.type }));
}
if (imageFiles.length === 0) return;
e.preventDefault();
// Reuse the picker path so file-size guards, dedup, and pending-
// list state all run through the same code. Build a synthetic
// FileList-like object to avoid the DataTransfer constructor —
// that's missing on Safari < 14.1 / old Edge and would silently
// throw, leaving the paste a no-op.
addPastedFiles(imageFiles);
};
// Variant of onFilesPicked that accepts a File[] directly, sidestepping
// the DataTransfer-FileList round-trip. Same dedup + state shape.
const addPastedFiles = (files: File[]) => {
setPendingFiles((prev) => {
const keyed = new Set(prev.map((f) => `${f.name}:${f.size}`));
@@ -900,11 +309,6 @@ function MyChatPanel({ workspaceId, data }: Props) {
});
};
// Drag-and-drop staging. dragDepthRef counts enter vs leave events so
// the overlay doesn't flicker when the cursor crosses nested children
// (textarea, buttons) — dragenter/dragleave fire for every boundary.
const [dragOver, setDragOver] = useState(false);
const dragDepthRef = useRef(0);
const dropEnabled = agentReachable && !sending && !uploading;
const isFileDrag = (e: React.DragEvent) =>
Array.from(e.dataTransfer.types || []).includes("Files");
@@ -934,9 +338,6 @@ function MyChatPanel({ workspaceId, data }: Props) {
};
const downloadAttachment = (att: ChatAttachment) => {
// Errors here are rare but user-visible (401 on a revoked token,
// 404 if the agent deleted the file). Surface via the inline
// error banner — the message list itself stays untouched.
downloadChatFile(workspaceId, att).catch((e) => {
setError(e instanceof Error ? `Download failed: ${e.message}` : "Download failed");
});
@@ -990,26 +391,26 @@ function MyChatPanel({ workspaceId, data }: Props) {
)}
{/* Messages */}
<div ref={containerRef} className="flex-1 overflow-y-auto p-3 space-y-3">
{loading && (
{history.loading && (
<div className="text-xs text-ink-mid text-center py-4">Loading chat history...</div>
)}
{!loading && loadError !== null && messages.length === 0 && (
{!history.loading && history.loadError !== null && history.messages.length === 0 && (
<div
role="alert"
className="mx-2 mt-2 rounded-lg border border-red-800/50 bg-red-950/30 px-3 py-2.5"
>
<p className="text-[11px] text-bad mb-1.5">
Failed to load chat history: {loadError}
Failed to load chat history: {history.loadError}
</p>
<button
onClick={loadInitial}
onClick={history.loadInitial}
className="text-[10px] px-2 py-0.5 rounded bg-red-800 text-red-200 hover:bg-red-700 transition-colors"
>
Retry
</button>
</div>
)}
{!loading && loadError === null && messages.length === 0 && (
{!history.loading && history.loadError === null && history.messages.length === 0 && (
<div className="text-xs text-ink-mid text-center py-8">
No messages yet. Send a message to start chatting with this agent.
</div>
@@ -1027,12 +428,12 @@ function MyChatPanel({ workspaceId, data }: Props) {
instead of showing a "no more messages" footer — the user's
scroll resting against the top of the conversation IS the
signal. */}
{hasMore && messages.length > 0 && (
{history.hasMore && history.messages.length > 0 && (
<div ref={topRef} className="text-xs text-ink-mid text-center py-1">
{loadingOlder ? "Loading older messages…" : " "}
{history.loadingOlder ? "Loading older messages…" : " "}
</div>
)}
{messages.map((msg) => (
{history.messages.map((msg) => (
<div key={msg.id} className={`flex ${msg.role === "user" ? "justify-end" : "justify-start"}`}>
<div
className={`max-w-[85%] rounded-lg px-3 py-2 text-xs ${
@@ -1192,10 +593,10 @@ function MyChatPanel({ workspaceId, data }: Props) {
</div>
{/* Error banner */}
{error && (
{displayError && (
<div className="px-3 py-2 bg-red-900/20 border-t border-red-800/30">
<div className="flex items-center justify-between">
<span className="text-[10px] text-red-300">{error}</span>
<span className="text-[10px] text-red-300">{displayError}</span>
{!isOnline && (
<button
onClick={() => setConfirmRestart(true)}
@@ -1263,7 +664,7 @@ function MyChatPanel({ workspaceId, data }: Props) {
e.keyCode !== 229
) {
e.preventDefault();
sendMessage();
handleSend();
}
}}
onPaste={onPasteIntoComposer}
@@ -1273,7 +674,7 @@ function MyChatPanel({ workspaceId, data }: Props) {
className="flex-1 bg-surface-card border border-line rounded-lg px-3 py-2 text-xs text-ink placeholder-ink-soft dark:bg-zinc-800 dark:border-zinc-600 dark:placeholder-zinc-500 focus:outline-none focus:border-accent focus-visible:ring-2 focus-visible:ring-accent/40 resize-none disabled:opacity-50"
/>
<button
onClick={sendMessage}
onClick={handleSend}
disabled={(!input.trim() && pendingFiles.length === 0) || !agentReachable || sending || uploading}
className="px-4 py-2 bg-accent-strong hover:bg-accent text-xs font-medium rounded-lg text-white disabled:opacity-30 transition-colors shrink-0"
>
+1 -1
View File
@@ -176,7 +176,7 @@ export function deriveProvidersFromModels(models: ModelSpec[]): string[] {
// exactly the point of the platform adaptor. The deep `~/.hermes/
// config.yaml` on the container is a separate runtime-internal file,
// not this one.
const RUNTIMES_WITH_OWN_CONFIG = new Set<string>(["external", "kimi", "kimi-cli"]);
const RUNTIMES_WITH_OWN_CONFIG = new Set<string>(["external", "kimi", "kimi-cli", "openclaw"]);
const FALLBACK_RUNTIME_OPTIONS: RuntimeOption[] = [
{ value: "", label: "LangGraph (default)", models: [], providers: [] },
+46 -3
View File
@@ -45,11 +45,54 @@ export function FilesTab({ workspaceId, data }: Props) {
if (data && isExternalLikeRuntime(data.runtime)) {
return <NotAvailablePanel runtime={data.runtime} />;
}
return <PlatformOwnedFilesTab workspaceId={workspaceId} />;
return <PlatformOwnedFilesTab workspaceId={workspaceId} runtime={data?.runtime} />;
}
function PlatformOwnedFilesTab({ workspaceId }: { workspaceId: string }) {
const [root, setRoot] = useState("/configs");
/** Picks the initial root for the FilesTab dropdown based on the
* workspace's runtime. Decision: per-runtime default (Hongming
* 2026-05-15, internal#425 Decisions §2).
*
* - openclaw → `/agent-home` (the agent's identity/state — the
* user-facing interesting files for that runtime live in
* `~/.openclaw/` inside the container, which `/agent-home` maps to
* via the Phase 2b docker-exec backend).
* - everything else (claude-code, hermes, external-like, undefined)
* → `/configs` (the legacy default — managed config that flows
* through the per-runtime indirection in
* workspace-server/internal/handlers/template_files_eic.go).
*
* When the runtime is undefined (legacy callers that don't thread
* `data` through, or a workspace whose runtime field hasn't loaded
* yet) the default is `/configs` — matches today's behaviour, no
* surprise.
*
* Note on `/agent-home` pre-Phase-2b: the backend short-circuits
* with HTTP 501 and the canonical "implementation pending" body.
* The tab renders empty + the error banner explains. This is by
* design — lets us land the canvas UX before the backend ships,
* per the RFC's phased rollout. The 501 is graceful: it doesn't
* poison error toasts or generate "workspace not found" noise.
*
* Adding a new runtime that should default to `/agent-home`: add it
* to the agentHomeDefaultRuntimes set below. Adding a runtime that
* should default to a different root: extend this function. */
const agentHomeDefaultRuntimes = new Set(["openclaw"]);
function defaultRootForRuntime(runtime: string | undefined): string {
if (runtime && agentHomeDefaultRuntimes.has(runtime)) {
return "/agent-home";
}
return "/configs";
}
function PlatformOwnedFilesTab({
workspaceId,
runtime,
}: {
workspaceId: string;
runtime?: string;
}) {
const [root, setRoot] = useState(() => defaultRootForRuntime(runtime));
const [selectedFile, setSelectedFile] = useState<string | null>(null);
const [fileContent, setFileContent] = useState("");
const [editContent, setEditContent] = useState("");
@@ -3,6 +3,22 @@
import { useRef } from "react";
import { getIcon } from "./tree";
// secretShapeMarker is the canonical body the workspace-server Files
// API returns when a file's path OR content matched a credential
// regex (internal#425 RFC, Phase 2b — backed by
// workspace-server/internal/secrets.ScanBytes). The marker is a
// fixed prefix so the canvas can detect it without parsing JSON and
// without round-tripping the matched bytes through the editor (which
// would defeat the purpose — clipboard, browser history, log
// surfaces would all see them).
//
// Today (Phase 1 / before 2b ships) the backend returns 501 for the
// only root that uses this path, so the marker is dead code until
// 2b lands. Wiring it in now keeps the canvas + backend contracts
// aligned in one PR rather than a follow-up. The constant is
// importable so a future test can pin the exact string.
export const SECRET_SHAPE_DENIED_MARKER = "<denied: secret-shape>";
interface Props {
selectedFile: string | null;
fileContent: string;
@@ -31,6 +47,22 @@ export function FileEditor({
const editorRef = useRef<HTMLTextAreaElement>(null);
const isDirty = editContent !== fileContent;
// internal#425 Phase 3: detect the secret-shape denial marker and
// render a placeholder instead of the editor. The marker comes
// from workspace-server Phase 2b (secrets.ScanBytes) which refuses
// to surface the file's bytes. We deliberately don't expose
// the matched pattern's Name here — the canvas just shows the
// generic denial. The Files API log surface has the Pattern.Name
// for operators who need to debug a false positive.
const isSecretShapeDenied = fileContent === SECRET_SHAPE_DENIED_MARKER;
// /agent-home is read-only from the canvas (Phase 2b ships read +
// delete; Phase-2b-followup may add write). Edits to /configs are
// unchanged. Until 2b ships, /agent-home returns 501 so this
// read-only gate is also dead code, but wiring it in now keeps
// the UI honest the moment 2b lands without a follow-up canvas PR.
const isReadOnlyRoot = root !== "/configs";
if (!selectedFile) {
return (
<div className="flex-1 flex items-center justify-center">
@@ -75,11 +107,42 @@ export function FileEditor({
{/* Editor area */}
{loadingFile ? (
<div className="p-4 text-xs text-ink-mid">Loading...</div>
) : isSecretShapeDenied ? (
// Files API refused to surface this file's bytes because its
// path or content matched a credential regex
// (workspace-server/internal/secrets, internal#425 Phase 2b).
// We render a placeholder INSTEAD OF the textarea so the
// matched bytes never enter the DOM. Clipboard / view-source
// / element-inspector all see the placeholder, not the
// credential.
<div
role="region"
aria-label="File content denied"
className="flex-1 flex items-center justify-center p-6 bg-surface"
>
<div className="max-w-md text-center space-y-2">
<div className="text-2xl opacity-40">🛡</div>
<p className="text-[11px] font-mono text-warm">
{SECRET_SHAPE_DENIED_MARKER}
</p>
<p className="text-[10px] text-ink-mid leading-relaxed">
The platform refused to surface this file because its
path or content matched a credential-shape pattern.
The bytes never left the workspace container.
</p>
<p className="text-[10px] text-ink-mid leading-relaxed">
If this is a false positive (test fixture, docs example,
or content that happens to share a credential's shape),
rename the file or adjust the content via the workspace
terminal so the regex no longer matches, then refresh.
</p>
</div>
</div>
) : (
<textarea
ref={editorRef}
value={editContent}
readOnly={root !== "/configs"}
readOnly={isReadOnlyRoot}
onChange={(e) => setEditContent(e.target.value)}
onKeyDown={(e) => {
if ((e.metaKey || e.ctrlKey) && e.key === "s") {
@@ -38,6 +38,15 @@ export function FilesToolbar({
<option value="/home">/home</option>
<option value="/workspace">/workspace</option>
<option value="/plugins">/plugins</option>
{/* internal#425 Phase 1+3: container-internal $HOME root.
Backend lands the docker-exec dispatch in Phase 2b. Until
then the stub returns 501 with a canonical
"implementation pending" message — the dropdown renders
the option so the canvas affordance is design-frozen
even before the backend ships.
Runtime-default selection logic in FilesTab.tsx picks
this as the initial value for openclaw workspaces. */}
<option value="/agent-home">/agent-home</option>
</select>
<span className="text-[10px] text-ink-mid">{fileCount} files</span>
</div>
@@ -0,0 +1,181 @@
// @vitest-environment jsdom
/**
* Tests for the /agent-home root selector + per-runtime default-root
* + secret-shape denial placeholder (internal#425 Phase 3).
*
* Separate file so the diff is reviewable as a unit and the existing
* FilesToolbar / FileEditor / FilesTab tests don't have to grow
* agent-home-specific cases. Once Phase 2b lands, the read-only +
* 501-stub assertions here can be tightened (or moved into the main
* test file as the agent-home root becomes a first-class affordance).
*/
import React from "react";
import { render, screen, cleanup } from "@testing-library/react";
import { afterEach, describe, expect, it, vi } from "vitest";
import { FilesToolbar } from "../FilesToolbar";
import {
FileEditor,
SECRET_SHAPE_DENIED_MARKER,
} from "../FileEditor";
afterEach(cleanup);
describe("internal#425 Phase 3 — /agent-home root selector", () => {
it("dropdown includes /agent-home as an option", () => {
// Pins the affordance is in the DOM even pre-Phase-2b — the
// canvas design freezes today, the backend lands the dispatch
// later. Without this, a future refactor that drops the option
// would silently regress the RFC's Phase 1 contract (canvas
// visibility) without breaking any other test.
render(
<FilesToolbar
root="/configs"
setRoot={vi.fn()}
fileCount={0}
onNewFile={vi.fn()}
onUpload={vi.fn()}
onDownloadAll={vi.fn()}
onClearAll={vi.fn()}
onRefresh={vi.fn()}
/>,
);
const select = screen.getByRole("combobox", {
name: /file root directory/i,
}) as HTMLSelectElement;
const values = Array.from(select.options).map((o) => o.value);
expect(values).toContain("/agent-home");
});
it("dropdown shows /agent-home as the SELECTED root when prop is /agent-home", () => {
render(
<FilesToolbar
root="/agent-home"
setRoot={vi.fn()}
fileCount={0}
onNewFile={vi.fn()}
onUpload={vi.fn()}
onDownloadAll={vi.fn()}
onClearAll={vi.fn()}
onRefresh={vi.fn()}
/>,
);
const select = screen.getByRole("combobox", {
name: /file root directory/i,
}) as HTMLSelectElement;
expect(select.value).toBe("/agent-home");
});
});
describe("internal#425 Phase 3 — secret-shape denial placeholder", () => {
// Files API Phase 2b returns SECRET_SHAPE_DENIED_MARKER as the file
// body when the file's path or content matched a credential regex.
// The editor MUST render the marker as a placeholder, not pump it
// through the textarea — that would put the marker (and any future
// matched bytes if the backend contract changes) into the DOM
// value, clipboard, and inspector.
it("renders the denial placeholder INSTEAD of the textarea when fileContent is the marker", () => {
render(
<FileEditor
selectedFile="agent/.openclaw/secrets.env"
fileContent={SECRET_SHAPE_DENIED_MARKER}
editContent={SECRET_SHAPE_DENIED_MARKER}
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/agent-home"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
// Placeholder region present
expect(
screen.getByRole("region", { name: /file content denied/i }),
).toBeTruthy();
// Marker text visible (so a debugging operator sees the canonical
// contract string without having to dig into the source).
expect(screen.getByText(SECRET_SHAPE_DENIED_MARKER)).toBeTruthy();
// Critically: NO textarea — the bytes never reach a controlled
// input. A regression that re-introduces the textarea path would
// make the matched marker (and any future content) selectable +
// copyable.
expect(screen.queryByRole("textbox")).toBeNull();
});
it("renders the textarea normally when fileContent is regular content", () => {
render(
<FileEditor
selectedFile="config.yaml"
fileContent="name: openclaw\n"
editContent="name: openclaw\n"
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/configs"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
expect(screen.getByRole("textbox")).toBeTruthy();
expect(screen.queryByRole("region", { name: /file content denied/i }))
.toBeNull();
});
it("/agent-home renders textarea READ-ONLY for non-denied content", () => {
// Phase 2b ships read + delete on /agent-home; write semantics
// are decided later. Until then, the canvas presents the editor
// as read-only so a user can't type into a buffer that the
// backend will refuse to PUT. Without this gate, the user would
// edit, hit Save, get a 501, and lose their context for why.
render(
<FileEditor
selectedFile=".openclaw/agent-card.json"
fileContent='{"name":"openclaw"}'
editContent='{"name":"openclaw"}'
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/agent-home"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
const textarea = screen.getByRole("textbox") as HTMLTextAreaElement;
expect(textarea.readOnly).toBe(true);
});
it("/configs renders textarea WRITABLE (regression guard for the read-only gate)", () => {
render(
<FileEditor
selectedFile="config.yaml"
fileContent="name: x\n"
editContent="name: x\n"
setEditContent={vi.fn()}
loadingFile={false}
saving={false}
success={null}
root="/configs"
onSave={vi.fn()}
onDownload={vi.fn()}
/>,
);
const textarea = screen.getByRole("textbox") as HTMLTextAreaElement;
expect(textarea.readOnly).toBe(false);
});
});
describe("internal#425 Phase 3 — marker constant is the canonical string", () => {
// The marker string is part of the canvas <-> workspace-server
// contract. The workspace-server emits this exact body; the canvas
// detects it by exact-equality. A typo on either side would
// silently break detection — the canvas would render the literal
// string in the textarea instead of the placeholder. Pin the
// contract value here.
it("matches the contract value '<denied: secret-shape>'", () => {
expect(SECRET_SHAPE_DENIED_MARKER).toBe("<denied: secret-shape>");
});
});
@@ -0,0 +1,3 @@
export { useChatHistory } from "./useChatHistory";
export { useChatSend } from "./useChatSend";
export { useChatSocket } from "./useChatSocket";
@@ -0,0 +1,11 @@
"use client";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
/** Resolve a workspace ID to its human-readable name.
* Falls back to the first 8 chars of the ID. */
export function resolveWorkspaceName(id: string): string {
const nodes = useCanvasStore.getState().nodes;
const node = nodes.find((n) => n.id === id);
return (node?.data as WorkspaceNodeData)?.name || id.slice(0, 8);
}
@@ -0,0 +1,134 @@
"use client";
import { useCallback, useEffect, useRef, useState } from "react";
import { api } from "@/lib/api";
import { type ChatMessage, appendMessageDeduped as appendMessageDedupedFn } from "../types";
const INITIAL_HISTORY_LIMIT = 10;
const OLDER_HISTORY_BATCH = 20;
async function loadMessagesFromDB(
workspaceId: string,
limit: number,
beforeTs?: string,
): Promise<{ messages: ChatMessage[]; error: string | null; reachedEnd: boolean }> {
try {
const params = new URLSearchParams({ limit: String(limit) });
if (beforeTs) params.set("before_ts", beforeTs);
const resp = await api.get<{ messages: ChatMessage[]; reached_end: boolean }>(
`/workspaces/${workspaceId}/chat-history?${params.toString()}`,
);
return {
messages: resp.messages ?? [],
error: null,
reachedEnd: resp.reached_end,
};
} catch (err) {
return {
messages: [],
error: err instanceof Error ? err.message : "Failed to load chat history",
reachedEnd: true,
};
}
}
export interface ScrollAnchor {
savedDistanceFromBottom: number;
expectFirstIdNotEqual: string | null;
}
export function useChatHistory(
workspaceId: string,
containerRef?: React.RefObject<HTMLDivElement | null>,
) {
const [messages, setMessages] = useState<ChatMessage[]>([]);
const [loading, setLoading] = useState(true);
const [loadError, setLoadError] = useState<string | null>(null);
const [loadingOlder, setLoadingOlder] = useState(false);
const [hasMore, setHasMore] = useState(true);
const fetchTokenRef = useRef(0);
const oldestMessageRef = useRef<ChatMessage | null>(null);
const hasMoreRef = useRef(true);
const inflightRef = useRef(false);
const scrollAnchorRef = useRef<ScrollAnchor | null>(null);
useEffect(() => {
oldestMessageRef.current = messages[0] ?? null;
}, [messages]);
useEffect(() => {
hasMoreRef.current = hasMore;
}, [hasMore]);
const loadInitial = useCallback(() => {
setLoading(true);
setLoadError(null);
setHasMore(true);
fetchTokenRef.current += 1;
const myToken = fetchTokenRef.current;
return loadMessagesFromDB(workspaceId, INITIAL_HISTORY_LIMIT).then(
({ messages: msgs, error: fetchErr, reachedEnd }) => {
if (fetchTokenRef.current !== myToken) return;
setMessages(msgs);
setLoadError(fetchErr);
setHasMore(!reachedEnd);
setLoading(false);
},
);
}, [workspaceId]);
useEffect(() => {
loadInitial();
}, [loadInitial]);
const loadOlder = useCallback(async () => {
if (inflightRef.current || !hasMoreRef.current) return;
const oldest = oldestMessageRef.current;
if (!oldest) return;
const container = containerRef?.current;
if (!container) return;
inflightRef.current = true;
scrollAnchorRef.current = {
savedDistanceFromBottom: container.scrollHeight - container.scrollTop,
expectFirstIdNotEqual: oldest.id,
};
fetchTokenRef.current += 1;
const myToken = fetchTokenRef.current;
setLoadingOlder(true);
try {
const { messages: older, reachedEnd } = await loadMessagesFromDB(
workspaceId,
OLDER_HISTORY_BATCH,
oldest.timestamp,
);
if (fetchTokenRef.current !== myToken) {
scrollAnchorRef.current = null;
return;
}
if (older.length > 0) {
setMessages((prev) => [...older, ...prev]);
} else {
scrollAnchorRef.current = null;
}
setHasMore(!reachedEnd);
} finally {
setLoadingOlder(false);
inflightRef.current = false;
}
}, [workspaceId, containerRef]);
return {
messages,
loading,
loadError,
loadingOlder,
hasMore,
loadInitial,
loadOlder,
appendMessageDeduped: (msg: ChatMessage) =>
setMessages((prev) => appendMessageDedupedFn(prev, msg)),
setMessages,
scrollAnchorRef,
};
}
@@ -0,0 +1,182 @@
"use client";
import { useCallback, useRef, useState } from "react";
import { api } from "@/lib/api";
import { uploadChatFiles } from "../uploads";
import { createMessage, type ChatMessage, type ChatAttachment } from "../types";
import { extractFilesFromTask } from "../message-parser";
interface A2APart {
kind: string;
text?: string;
file?: {
name?: string;
mimeType?: string;
uri?: string;
size?: number;
};
}
interface A2AResponse {
result?: {
parts?: A2APart[];
artifacts?: Array<{ parts: A2APart[] }>;
};
}
export function extractReplyText(resp: A2AResponse): string {
const collect = (parts: A2APart[] | undefined): string => {
if (!parts) return "";
return parts
.filter((p) => p.kind === "text")
.map((p) => p.text ?? "")
.filter(Boolean)
.join("\n");
};
const result = resp?.result;
const collected: string[] = [];
const fromParts = collect(result?.parts);
if (fromParts) collected.push(fromParts);
if (result?.artifacts) {
for (const a of result.artifacts) {
const t = collect(a.parts);
if (t) collected.push(t);
}
}
return collected.join("\n");
}
export interface UseChatSendOptions {
getHistoryMessages: () => ChatMessage[];
onUserMessage?: (msg: ChatMessage) => void;
onAgentMessage?: (msg: ChatMessage) => void;
}
export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
const [sending, setSending] = useState(false);
const [uploading, setUploading] = useState(false);
const [error, setError] = useState<string | null>(null);
const sendInFlightRef = useRef(false);
const sendingFromAPIRef = useRef(false);
const sendTokenRef = useRef(0);
const optionsRef = useRef(options);
optionsRef.current = options;
const releaseSendGuards = useCallback(() => {
setSending(false);
sendingFromAPIRef.current = false;
sendInFlightRef.current = false;
}, []);
const clearError = useCallback(() => setError(null), []);
const sendMessage = useCallback(
async (text: string, files: File[] = []) => {
const trimmed = text.trim();
if ((!trimmed && files.length === 0) || sending || uploading) return;
if (sendInFlightRef.current) return;
sendInFlightRef.current = true;
let uploaded: ChatAttachment[] = [];
if (files.length > 0) {
setUploading(true);
try {
uploaded = await uploadChatFiles(workspaceId, files);
} catch (e) {
setUploading(false);
sendInFlightRef.current = false;
setError(
e instanceof Error ? `Upload failed: ${e.message}` : "Upload failed",
);
return;
}
setUploading(false);
}
const userMsg = createMessage("user", trimmed, uploaded);
optionsRef.current.onUserMessage?.(userMsg);
setSending(true);
sendingFromAPIRef.current = true;
setError(null);
const myToken = ++sendTokenRef.current;
const history = optionsRef.current
.getHistoryMessages()
.filter((m) => m.role === "user" || m.role === "agent")
.slice(-20)
.map((m) => ({
role: m.role === "user" ? "user" : "agent",
parts: [{ kind: "text", text: m.content }],
}));
const parts: A2APart[] = [];
if (trimmed) parts.push({ kind: "text", text: trimmed });
for (const att of uploaded) {
parts.push({
kind: "file",
file: {
name: att.name,
mimeType: att.mimeType,
uri: att.uri,
size: att.size,
},
});
}
api
.post<A2AResponse>(
`/workspaces/${workspaceId}/a2a`,
{
method: "message/send",
params: {
message: {
role: "user",
messageId: crypto.randomUUID(),
parts,
},
metadata: { history },
},
},
{ timeoutMs: 120_000 },
)
.then((resp) => {
if (sendTokenRef.current !== myToken) return;
if (!sendingFromAPIRef.current) {
sendInFlightRef.current = false;
return;
}
const replyText = extractReplyText(resp);
const replyFiles = extractFilesFromTask(
(resp?.result ?? {}) as Record<string, unknown>,
);
if (replyText || replyFiles.length > 0) {
optionsRef.current.onAgentMessage?.(
createMessage("agent", replyText, replyFiles),
);
}
releaseSendGuards();
})
.catch(() => {
if (sendTokenRef.current !== myToken) return;
if (!sendingFromAPIRef.current) {
sendInFlightRef.current = false;
return;
}
releaseSendGuards();
setError("Failed to send message — agent may be unreachable");
});
},
[workspaceId, sending, uploading],
);
return {
sending,
uploading,
sendMessage,
error,
clearError,
releaseSendGuards,
sendingFromAPIRef,
};
}
@@ -0,0 +1,112 @@
"use client";
import { useCallback, useEffect, useRef } from "react";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { useSocketEvent } from "@/hooks/useSocketEvent";
import { createMessage, type ChatMessage } from "../types";
export interface UseChatSocketCallbacks {
onAgentMessage?: (msg: ChatMessage) => void;
onActivityLog?: (entry: string) => void;
onSendComplete?: () => void;
onSendError?: (error: string) => void;
}
export function useChatSocket(
workspaceId: string,
callbacks: UseChatSocketCallbacks,
): void {
const callbacksRef = useRef(callbacks);
callbacksRef.current = callbacks;
// Agent push messages from global store
const pendingAgentMsgs = useCanvasStore((s) => s.agentMessages[workspaceId]);
useEffect(() => {
if (!pendingAgentMsgs || pendingAgentMsgs.length === 0) return;
const consume = useCanvasStore.getState().consumeAgentMessages;
const msgs = consume(workspaceId);
for (const m of msgs) {
callbacksRef.current.onAgentMessage?.(
createMessage("agent", m.content, m.attachments),
);
}
if (msgs.length > 0) {
callbacksRef.current.onSendComplete?.();
}
}, [pendingAgentMsgs, workspaceId]);
const resolveWorkspaceName = useCallback((id: string) => {
const nodes = useCanvasStore.getState().nodes;
const node = nodes.find((n) => n.id === id);
return (node?.data as WorkspaceNodeData)?.name || id.slice(0, 8);
}, []);
useSocketEvent((msg) => {
try {
if (msg.event === "ACTIVITY_LOGGED") {
if (msg.workspace_id !== workspaceId) return;
const p = msg.payload || {};
const type = p.activity_type as string;
const method = (p.method as string) || "";
const status = (p.status as string) || "";
const targetId = (p.target_id as string) || "";
const durationMs = p.duration_ms as number | undefined;
const summary = (p.summary as string) || "";
let line = "";
if (type === "a2a_receive" && method === "message/send") {
const targetName = resolveWorkspaceName(targetId || msg.workspace_id);
if (status === "ok" && durationMs) {
const sec = Math.round(durationMs / 1000);
line = `${targetName} responded (${sec}s)`;
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) callbacksRef.current.onSendComplete?.();
} else if (status === "error") {
line = `${targetName} error`;
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) {
callbacksRef.current.onSendComplete?.();
// internal#211/#212: surface the runtime's curated,
// user-actionable reason (provider HTTP status + error
// code + the provider's own guidance, e.g. a 403 "org
// disabled · use an API key / ask your admin"). The
// server now includes error_detail in the ACTIVITY_LOGGED
// broadcast; fall back to summary, and only as a last
// resort to a generic line. The old hardcoded
// "Agent error (Exception) — see workspace logs for
// details." string pointed at a logs UI that does not
// exist and discarded the actionable reason entirely.
const detail =
(p.error_detail as string) ||
(p.summary as string) ||
"The agent turn failed but the runtime reported no detail. Retry once; if it repeats the workspace runtime may need a restart.";
callbacksRef.current.onSendError?.(detail);
}
}
} else if (type === "a2a_send") {
const targetName = resolveWorkspaceName(targetId);
line = `→ Delegating to ${targetName}...`;
} else if (type === "task_update") {
if (summary) line = `${summary}`;
} else if (type === "agent_log") {
if (summary) line = summary;
}
if (line) {
callbacksRef.current.onActivityLog?.(line);
}
} else if (
msg.event === "TASK_UPDATED" &&
msg.workspace_id === workspaceId
) {
const task = (msg.payload?.current_task as string) || "";
if (task) {
callbacksRef.current.onActivityLog?.(`${task}`);
}
}
} catch {
/* ignore */
}
});
}
+3
View File
@@ -1,2 +1,5 @@
export { type ChatMessage, createMessage, appendMessageDeduped } from "./types";
export { extractAgentText, extractTextsFromParts, extractResponseText } from "./message-parser";
export { useChatHistory } from "./hooks/useChatHistory";
export { useChatSend } from "./hooks/useChatSend";
export { useChatSocket } from "./hooks/useChatSocket";
+12 -8
View File
@@ -8,14 +8,18 @@ import { getTenantSlug } from "./tenant";
export const PLATFORM_URL =
process.env.NEXT_PUBLIC_PLATFORM_URL ?? "http://localhost:8080";
// 15s is long enough for slow CP queries but short enough that a
// hung backend doesn't leave the UI spinning forever. The abort
// propagates through AbortController so React components can observe
// the error and render a retry affordance. Callers that know the
// endpoint is intentionally slow (org import walks a tree of
// workspaces with server-side pacing) can pass `timeoutMs` to
// override.
const DEFAULT_TIMEOUT_MS = 15_000;
// 35s is long enough for the slowest server-side path (EIC SSH
// tunnel for tenant EC2 file operations, bounded server-side by
// `eicFileOpTimeout = 30 * time.Second` in
// workspace-server/internal/handlers/template_files_eic.go) so the
// canvas surfaces the server's real error instead of aborting first
// with a generic timeout. Shorter values caused "Save & Restart" to
// time out at the client before the backend returned its 5xx. The
// abort still propagates through AbortController so React components
// can render a retry affordance. Callers that know an endpoint is
// intentionally slow (org import walks a tree of workspaces with
// server-side pacing) can pass `timeoutMs` to override.
const DEFAULT_TIMEOUT_MS = 35_000;
export interface RequestOptions {
timeoutMs?: number;
+1
View File
@@ -62,6 +62,7 @@ TOP_LEVEL_MODULES = {
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"a2a_tools_identity",
"adapter_base",
"agent",
"agents_md",
@@ -194,7 +194,12 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
go h.RestartByID(workspaceID)
// Tracked via goAsync (not bare `go`) so the asyncWG can be drained
// before a test swaps the global db.DB. runRestartCycle reads db.DB
// before its provisioner gate, so an untracked detached goroutine
// races setupTestDB's t.Cleanup db.DB restore. Matches the already-
// correct site at a2a_proxy.go:648.
h.goAsync(func() { h.RestartByID(workspaceID) })
return true
}
@@ -241,7 +246,10 @@ func (h *WorkspaceHandler) preflightContainerHealth(ctx context.Context, workspa
}
db.ClearWorkspaceKeys(ctx, workspaceID)
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
go h.RestartByID(workspaceID)
// Tracked via goAsync (see maybeMarkContainerDead): preflight's
// detached restart must be drainable so it doesn't race the global
// db.DB swap in test cleanup.
h.goAsync(func() { h.RestartByID(workspaceID) })
return &proxyA2AError{
Status: http.StatusServiceUnavailable,
Response: gin.H{
@@ -262,7 +270,8 @@ func (h *WorkspaceHandler) logA2AFailure(ctx context.Context, workspaceID, calle
errWsName = workspaceID
}
summary := "A2A request to " + errWsName + " failed: " + errMsg
go func(parent context.Context) {
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
@@ -277,7 +286,7 @@ func (h *WorkspaceHandler) logA2AFailure(ctx context.Context, workspaceID, calle
Status: "error",
ErrorDetail: &errMsg,
})
}(ctx)
})
}
// logA2ASuccess records a successful A2A round-trip and (for canvas-initiated
@@ -298,18 +307,19 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
// silent workspaces. Only update when callerID is a real workspace (not
// canvas, not a system caller) and the target returned 2xx/3xx.
if callerID != "" && !isSystemCaller(callerID) && statusCode < 400 {
go func() {
h.goAsync(func() {
bgCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
if _, err := db.DB.ExecContext(bgCtx,
`UPDATE workspaces SET last_outbound_at = NOW() WHERE id = $1`, callerID); err != nil {
log.Printf("last_outbound_at update failed for %s: %v", callerID, err)
}
}()
})
}
summary := a2aMethod + " → " + wsNameForLog
toolTrace := extractToolTrace(respBody)
go func(parent context.Context) {
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
@@ -325,7 +335,7 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
DurationMs: &durationMs,
Status: logStatus,
})
}(ctx)
})
if callerID == "" && statusCode < 400 {
h.broadcaster.BroadcastOnly(workspaceID, string(events.EventA2AResponse), map[string]interface{}{
@@ -510,7 +520,8 @@ func (h *WorkspaceHandler) logA2AReceiveQueued(ctx context.Context, workspaceID,
wsName = workspaceID
}
summary := a2aMethod + " → " + wsName + " (queued for poll)"
go func(parent context.Context) {
parent := ctx
h.goAsync(func() {
logCtx, cancel := context.WithTimeout(context.WithoutCancel(parent), 30*time.Second)
defer cancel()
LogActivity(logCtx, h.broadcaster, ActivityParams{
@@ -523,7 +534,7 @@ func (h *WorkspaceHandler) logA2AReceiveQueued(ctx context.Context, workspaceID,
RequestBody: json.RawMessage(body),
Status: "ok",
})
}(ctx)
})
}
// readUsageMap extracts input_tokens / output_tokens from the "usage" key of m.
@@ -1,12 +1,7 @@
package handlers
import (
"context"
"database/sql"
"testing"
sqlmock "github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// TestExtractExpiresInSeconds covers the JSON parser used at enqueue time
@@ -63,207 +58,3 @@ func TestExtractExpiresInSeconds(t *testing.T) {
})
}
}
// ── QueueStatusByID ─────────────────────────────────────────────────────────────
func setupQueueStatusDB(t *testing.T) sqlmock.Sqlmock {
t.Helper()
mockDB, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
return mock
}
func TestQueueStatusByID_Success(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
wsID := "cccccccc-cccc-cccc-cccc-cccccccccccc"
rows := sqlmock.NewRows([]string{
"id", "workspace_id", "status", "priority", "attempts",
"last_error", "enqueued_at", "dispatched_at", "completed_at", "expires_at",
"response_body",
}).AddRow(
queueID, wsID, "queued", 50, 0,
nil, // last_error
"2026-01-01T00:00:00Z", // enqueued_at
nil, // dispatched_at
nil, // completed_at
nil, // expires_at
nil, // response_body
)
mock.ExpectQuery(`SELECT`).
WithArgs(queueID).
WillReturnRows(rows)
qs, err := QueueStatusByID(context.Background(), queueID)
if err != nil {
t.Fatalf("QueueStatusByID returned error: %v", err)
}
if qs.ID != queueID {
t.Errorf("ID = %q, want %q", qs.ID, queueID)
}
if qs.WorkspaceID != wsID {
t.Errorf("WorkspaceID = %q, want %q", qs.WorkspaceID, wsID)
}
if qs.Status != "queued" {
t.Errorf("Status = %q, want %q", qs.Status, "queued")
}
if qs.Priority != 50 {
t.Errorf("Priority = %d, want 50", qs.Priority)
}
if qs.LastError != nil {
t.Errorf("LastError = %v, want nil", qs.LastError)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestQueueStatusByID_NotFound(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock.ExpectQuery(`SELECT`).
WithArgs(queueID).
WillReturnError(sql.ErrNoRows)
qs, err := QueueStatusByID(context.Background(), queueID)
if err != sql.ErrNoRows {
t.Errorf("expected sql.ErrNoRows, got %v", err)
}
if qs != nil {
t.Errorf("expected nil queue status, got %+v", qs)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestQueueStatusByID_DBError(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock.ExpectQuery(`SELECT`).
WithArgs(queueID).
WillReturnError(sql.ErrConnDone)
qs, err := QueueStatusByID(context.Background(), queueID)
if err == nil {
t.Error("expected error, got nil")
}
if qs != nil {
t.Errorf("expected nil queue status, got %+v", qs)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestQueueStatusByID_CompletedWithResponse(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
wsID := "cccccccc-cccc-cccc-cccc-cccccccccccc"
respBody := []byte(`{"text":"delegation result"}`)
rows := sqlmock.NewRows([]string{
"id", "workspace_id", "status", "priority", "attempts",
"last_error", "enqueued_at", "dispatched_at", "completed_at", "expires_at",
"response_body",
}).AddRow(
queueID, wsID, "completed", 50, 1,
nil,
"2026-01-01T00:00:00Z",
"2026-01-01T00:01:00Z",
"2026-01-01T00:02:00Z",
nil,
respBody,
)
mock.ExpectQuery(`SELECT`).
WithArgs(queueID).
WillReturnRows(rows)
qs, err := QueueStatusByID(context.Background(), queueID)
if err != nil {
t.Fatalf("QueueStatusByID returned error: %v", err)
}
if qs.Status != "completed" {
t.Errorf("Status = %q, want completed", qs.Status)
}
if qs.ResponseBody == nil {
t.Fatal("ResponseBody should be set for completed status")
}
if string(qs.ResponseBody) != `{"text":"delegation result"}` {
t.Errorf("ResponseBody = %q, want %q", string(qs.ResponseBody), `{"text":"delegation result"}`)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// ── queueRowAuthFields ──────────────────────────────────────────────────────────
func TestQueueRowAuthFields_Success(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
callerID := "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
wsID := "cccccccc-cccc-cccc-cccc-cccccccccccc"
rows := sqlmock.NewRows([]string{"caller_id", "workspace_id"}).
AddRow(callerID, wsID)
mock.ExpectQuery(`SELECT caller_id, workspace_id FROM a2a_queue WHERE id`).
WithArgs(queueID).
WillReturnRows(rows)
gotCaller, gotWs, err := queueRowAuthFields(context.Background(), queueID)
if err != nil {
t.Fatalf("queueRowAuthFields returned error: %v", err)
}
if gotCaller != callerID {
t.Errorf("callerID = %q, want %q", gotCaller, callerID)
}
if gotWs != wsID {
t.Errorf("workspaceID = %q, want %q", gotWs, wsID)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestQueueRowAuthFields_NotFound(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock.ExpectQuery(`SELECT caller_id, workspace_id FROM a2a_queue WHERE id`).
WithArgs(queueID).
WillReturnError(sql.ErrNoRows)
_, _, err := queueRowAuthFields(context.Background(), queueID)
if err != sql.ErrNoRows {
t.Errorf("expected sql.ErrNoRows, got %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestQueueRowAuthFields_DBError(t *testing.T) {
mock := setupQueueStatusDB(t)
queueID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock.ExpectQuery(`SELECT caller_id, workspace_id FROM a2a_queue WHERE id`).
WithArgs(queueID).
WillReturnError(sql.ErrConnDone)
_, _, err := queueRowAuthFields(context.Background(), queueID)
if err == nil {
t.Error("expected error, got nil")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -516,51 +516,3 @@ func TestDrainQueueForWorkspace_ClaimGuarding_SecondDrainGetsEmpty(t *testing.T)
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// ── QueueDepth ──────────────────────────────────────────────────────────────────
func TestQueueDepth_ReturnsCount(t *testing.T) {
mockDB, mock, err := sqlmock.New(sqlmock.QueryMatcherOption(sqlmock.QueryMatcherEqual))
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
wsID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock.ExpectQuery(`SELECT COUNT(*) FROM a2a_queue WHERE workspace_id = $1 AND status = 'queued'`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(42))
got := QueueDepth(context.Background(), wsID)
if got != 42 {
t.Errorf("QueueDepth returned %d, want 42", got)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
func TestQueueDepth_ZeroWhenEmpty(t *testing.T) {
mockDB, mock, err := sqlmock.New(sqlmock.QueryMatcherOption(sqlmock.QueryMatcherEqual))
if err != nil {
t.Fatalf("failed to create sqlmock: %v", err)
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
wsID := "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
mock.ExpectQuery(`SELECT COUNT(*) FROM a2a_queue WHERE workspace_id = $1 AND status = 'queued'`).
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(0))
got := QueueDepth(context.Background(), wsID)
if got != 0 {
t.Errorf("QueueDepth returned %d, want 0", got)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -691,6 +691,19 @@ func logActivityExec(ctx context.Context, exec activityExecutor, broadcaster eve
if respStr != nil {
payload["response_body"] = json.RawMessage(respJSON)
}
// internal#211/#212: error_detail carries the runtime's curated,
// user-actionable, secret-safe failure reason (provider HTTP
// status + error code + the provider's own guidance, e.g. a 403
// "org disabled · use an API key / ask your admin"). It is
// already persisted to the DB column above and capped by the
// runtime's report_activity helper (4096 chars). Previously it
// was dropped from the LIVE broadcast, so the canvas had nothing
// to render and fell back to a hardcoded opaque
// "Agent error (Exception) — see workspace logs" string. Include
// it so the chat bubble shows the real reason in real time.
if params.ErrorDetail != nil && *params.ErrorDetail != "" {
payload["error_detail"] = *params.ErrorDetail
}
}
return func() {
@@ -947,73 +947,6 @@ func TestVerifyDiscordSignature_WrongLengthPubKey(t *testing.T) {
}
}
// ==================== matchesChatID pure function ====================
func TestMatchesChatID_ExactMatch(t *testing.T) {
cfg := map[string]interface{}{"chat_id": "123456"}
if !matchesChatID(cfg, "123456") {
t.Error("expected true for exact match")
}
}
func TestMatchesChatID_NoMatch(t *testing.T) {
cfg := map[string]interface{}{"chat_id": "123456"}
if matchesChatID(cfg, "654321") {
t.Error("expected false for non-matching chat ID")
}
}
func TestMatchesChatID_PrefixNoMatch(t *testing.T) {
// "123" is a prefix of "123456" but not an exact match.
cfg := map[string]interface{}{"chat_id": "123456"}
if matchesChatID(cfg, "123") {
t.Error("expected false for prefix of stored chat ID")
}
}
func TestMatchesChatID_CommaSeparatedMultiple(t *testing.T) {
cfg := map[string]interface{}{"chat_id": "111,222,333"}
for _, id := range []string{"111", "222", "333"} {
if !matchesChatID(cfg, id) {
t.Errorf("expected true for %q in comma-separated list", id)
}
}
if matchesChatID(cfg, "444") {
t.Error("expected false for ID not in list")
}
}
func TestMatchesChatID_WhitespaceTrimmed(t *testing.T) {
cfg := map[string]interface{}{"chat_id": "111, 222 , 333"}
if !matchesChatID(cfg, "222") {
t.Error("expected true for whitespace-trimmed match")
}
if matchesChatID(cfg, " 222") {
t.Error("expected false for whitespace in query (not trimmed from query)")
}
}
func TestMatchesChatID_EmptyChatID(t *testing.T) {
cfg := map[string]interface{}{"chat_id": ""}
if matchesChatID(cfg, "123456") {
t.Error("expected false for empty chat_id in config")
}
}
func TestMatchesChatID_MissingChatIDKey(t *testing.T) {
cfg := map[string]interface{}{}
if matchesChatID(cfg, "123456") {
t.Error("expected false when chat_id key is missing")
}
}
func TestMatchesChatID_NonStringChatID(t *testing.T) {
cfg := map[string]interface{}{"chat_id": 123456} // wrong type
if matchesChatID(cfg, "123456") {
t.Error("expected false when chat_id is not a string")
}
}
// TestChannelHandler_Webhook_Discord_NoKey_Returns401 verifies that a Discord
// webhook request is rejected with 401 when no public key is configured in the
// DB and DISCORD_APP_PUBLIC_KEY env var is not set.
@@ -8,6 +8,7 @@ import (
"fmt"
"net/http"
"net/http/httptest"
"sync"
"testing"
"time"
@@ -22,8 +23,39 @@ import (
"github.com/redis/go-redis/v9"
)
// liveTestHandlers tracks every WorkspaceHandler built during the test
// binary's lifetime so setupTestDB can drain their in-flight goAsync
// goroutines (notably the detached RestartByID restart cycle, which
// reads the global db.DB) BEFORE restoring db.DB. Without this drain a
// fire-and-forget restart goroutine spawned by one test outlives that
// test and races the db.DB swap in a later test's t.Cleanup — the
// 0x...d548 data race on platform/internal/db.DB.
var (
liveTestHandlersMu sync.Mutex
liveTestHandlers []*WorkspaceHandler
)
func init() {
gin.SetMode(gin.TestMode)
newHandlerHook = func(h *WorkspaceHandler) {
liveTestHandlersMu.Lock()
liveTestHandlers = append(liveTestHandlers, h)
liveTestHandlersMu.Unlock()
}
}
// drainTestAsync waits for every tracked handler's goAsync goroutines to
// finish. Called from setupTestDB's cleanup before db.DB is restored so
// no detached restart/provision goroutine is mid-read of db.DB when the
// pointer is swapped.
func drainTestAsync() {
liveTestHandlersMu.Lock()
handlers := make([]*WorkspaceHandler, len(liveTestHandlers))
copy(handlers, liveTestHandlers)
liveTestHandlersMu.Unlock()
for _, h := range handlers {
h.waitAsyncForTest()
}
}
// setupTestDB creates a sqlmock DB and assigns it to the global db.DB.
@@ -37,7 +69,16 @@ func setupTestDB(t *testing.T) sqlmock.Sqlmock {
}
prevDB := db.DB
db.DB = mockDB
t.Cleanup(func() { db.DB = prevDB; mockDB.Close() })
t.Cleanup(func() {
// Drain detached async goroutines (e.g. goAsync(RestartByID),
// which reads db.DB in runRestartCycle before its provisioner
// gate) BEFORE swapping db.DB back. Doing the restore first
// would let an in-flight restart goroutine read db.DB while
// this line writes it — the data race this guards against.
drainTestAsync()
db.DB = prevDB
mockDB.Close()
})
// Disable SSRF checks for the duration of this test only. Restore
// the previous state via t.Cleanup so that TestIsSafeURL_* tests
@@ -56,9 +56,11 @@ const (
// (an externally routable address) is used directly.
func (h *WorkspaceHandler) gracefulPreRestart(ctx context.Context, workspaceID string) {
// Non-blocking send — don't stall the restart cycle.
// Run in a detached goroutine so the caller (runRestartCycle) can
// proceed to stopForRestart without waiting.
go func() {
// Run in a tracked async goroutine (goAsync, not bare `go`) so the
// caller (runRestartCycle) can proceed to stopForRestart without
// waiting, while the test harness can still drain it before swapping
// the global db.DB (resolveAgentURLForRestartSignal reads db.DB).
h.goAsync(func() {
signalCtx, cancel := context.WithTimeout(context.Background(), restartSignalTimeout)
defer cancel()
@@ -109,7 +111,7 @@ func (h *WorkspaceHandler) gracefulPreRestart(ctx context.Context, workspaceID s
} else {
log.Printf("A2AGracefulRestart: %s returned status %d — proceeding with stop", workspaceID, resp.StatusCode)
}
}()
})
}
// resolveAgentURLForRestartSignal returns the routable URL for the workspace
@@ -0,0 +1,117 @@
package handlers
// template_files_agent_home_stub_test.go — pins the Phase-1 stub
// contract for the /agent-home root added by internal#425 RFC.
//
// Today (pre-Phase-2b), every Files API verb against `?root=/agent-home`
// must return HTTP 501 with the canonical pending-message body. The
// stub MUST NOT:
// 1. Hit the DB (the workspace might not even exist yet from the
// canvas's POV — the root selector is testable without one).
// 2. Touch the EIC tunnel / Docker / template-dir paths — those
// would 500/404/[] depending on the env and confuse the canvas.
// 3. Accept writes/deletes that the future docker-exec backend
// would reject — fail closed.
//
// When Phase 2b lands, this file gets replaced by a real
// docker-exec dispatch test; the stub-message constant in
// templates.go disappears.
import (
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
)
// TestAgentHomeAllowedRoot pins that /agent-home is in the allowedRoots
// set. Without this, a future refactor that drops the key would
// silently degrade the canvas root selector to a 400 instead of the
// stub 501.
func TestAgentHomeAllowedRoot(t *testing.T) {
if !allowedRoots["/agent-home"] {
t.Fatal("/agent-home must be in allowedRoots — RFC #425 contract")
}
}
// TestAgentHomeStub_AllVerbs_Return501 pins the canonical stub
// response across all four verbs. Each must:
//
// - status 501
// - body contains the canonical "/agent-home not implemented" prefix
// - NOT contain "workspace not found" (proves we short-circuit before
// the DB lookup)
//
// Driven as a table to keep symmetry — adding a fifth verb in the
// future means adding one row here.
func TestAgentHomeStub_AllVerbs_Return501(t *testing.T) {
cases := []struct {
name string
method string
invoke func(c *gin.Context)
}{
{
name: "ListFiles",
method: "GET",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).ListFiles(c) },
},
{
name: "ReadFile",
method: "GET",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).ReadFile(c) },
},
{
name: "WriteFile",
method: "PUT",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).WriteFile(c) },
},
{
name: "DeleteFile",
method: "DELETE",
invoke: func(c *gin.Context) { (&TemplatesHandler{}).DeleteFile(c) },
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{
{Key: "id", Value: "ws-stub"},
// Path param without leading slash so DeleteFile's
// filepath.IsAbs guard doesn't 400 before the root
// dispatch runs. The List/Read/Write paths strip the
// leading slash themselves and accept either form.
{Key: "path", Value: "notes.md"},
}
// WriteFile binds JSON; provide a minimal valid body so the
// short-circuit isn't masked by the bind-error path.
var body string
if tc.method == "PUT" {
body = `{"content":"x"}`
}
c.Request = httptest.NewRequest(
tc.method,
"/workspaces/ws-stub/files/notes.md?root=/agent-home",
strings.NewReader(body),
)
if body != "" {
c.Request.Header.Set("Content-Type", "application/json")
}
tc.invoke(c)
if w.Code != http.StatusNotImplemented {
t.Fatalf("expected 501, got %d: %s", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "/agent-home not implemented") {
t.Errorf("body should contain canonical stub message; got %s", w.Body.String())
}
if strings.Contains(w.Body.String(), "workspace not found") {
t.Errorf("stub leaked through to DB lookup; body=%s", w.Body.String())
}
})
}
}
@@ -18,11 +18,35 @@ import (
)
// allowedRoots are the container paths that the Files API can browse.
//
// `/agent-home` (added 2026-05-15, internal#425 RFC) is the container's
// own $HOME — `/root` for openclaw, `/home/agent` for claude-code/hermes
// — browsed via `docker exec` rather than host-side `find`. The
// dispatch is stubbed today (returns 501); full implementation lands in
// Phase 2b of the RFC. The allowedRoots key is added now so the canvas
// can design its root-selector UI against the final shape and the
// stub-vs-full transition is server-side only.
var allowedRoots = map[string]bool{
"/configs": true,
"/workspace": true,
"/home": true,
"/plugins": true,
"/configs": true,
"/workspace": true,
"/home": true,
"/plugins": true,
"/agent-home": true,
}
// agentHomeStubMessage is the body returned by every Files API verb
// when `?root=/agent-home` is requested before Phase 2b lands. Keep the
// status code 501 (Not Implemented) — the route exists, the verb is
// understood, but the handler is unimplemented. Distinguishes from
// 400/404 so a canvas behind a less-current server can render a clean
// "feature pending" state instead of a generic error.
const agentHomeStubMessage = "/agent-home not implemented yet (internal#425 RFC Phase 2b — docker-exec backend pending)"
// isAgentHomeStubRequest returns true when the request targets the
// stubbed /agent-home root. Centralised so every verb in this file
// short-circuits with the same response shape.
func isAgentHomeStubRequest(rootPath string) bool {
return rootPath == "/agent-home"
}
// maxUploadFiles limits the number of files in a single import/replace.
@@ -219,7 +243,14 @@ func (h *TemplatesHandler) ListFiles(c *gin.Context) {
// ?depth= — max depth to recurse (default: 1, max: 5)
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
// /agent-home dispatch is stubbed pre-Phase-2b. Short-circuit before
// the DB lookup + EIC dance so a canvas exercising the new root key
// gets a clean 501 instead of a half-effort response.
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
subPath := c.DefaultQuery("path", "")
@@ -383,7 +414,11 @@ func (h *TemplatesHandler) ReadFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
@@ -496,7 +531,11 @@ func (h *TemplatesHandler) WriteFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
var wsName, instanceID, runtime string
@@ -573,7 +612,11 @@ func (h *TemplatesHandler) DeleteFile(c *gin.Context) {
ctx := c.Request.Context()
rootPath := c.DefaultQuery("root", "/configs")
if !allowedRoots[rootPath] {
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins"})
c.JSON(http.StatusBadRequest, gin.H{"error": "root must be one of: /configs, /workspace, /home, /plugins, /agent-home"})
return
}
if isAgentHomeStubRequest(rootPath) {
c.JSON(http.StatusNotImplemented, gin.H{"error": agentHomeStubMessage})
return
}
var wsName, instanceID, runtime string
@@ -10,8 +10,20 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
// validWorkspaceID returns true when id is a syntactically valid UUID.
// workspace_id is a `uuid` column; passing a non-UUID (e.g. the canvas
// "global" sentinel sent when no node is selected) makes Postgres raise
// `invalid input syntax for type uuid`, which previously leaked as an
// opaque 500. Reject up front with a clean 400 instead. Mirrors the
// uuid.Parse guard already used in handlers/activity.go.
func validWorkspaceID(id string) bool {
_, err := uuid.Parse(id)
return err == nil
}
// TokenHandler exposes user-facing token management for workspaces.
// Routes: GET/POST/DELETE /workspaces/:id/tokens (behind WorkspaceAuth).
type TokenHandler struct{}
@@ -31,6 +43,10 @@ type tokenListItem struct {
// never the plaintext or hash).
func (h *TokenHandler) List(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
limit := 50
if v := c.Query("limit"); v != "" {
@@ -53,6 +69,7 @@ func (h *TokenHandler) List(c *gin.Context) {
LIMIT $2 OFFSET $3
`, workspaceID, limit, offset)
if err != nil {
log.Printf("tokens: list query failed for workspace %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to list tokens"})
return
}
@@ -85,6 +102,10 @@ const maxTokensPerWorkspace = 50
// exactly once in the response — it cannot be recovered afterwards.
func (h *TokenHandler) Create(c *gin.Context) {
workspaceID := c.Param("id")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
// Rate limit: max active tokens per workspace
var count int
@@ -117,6 +138,10 @@ func (h *TokenHandler) Create(c *gin.Context) {
func (h *TokenHandler) Revoke(c *gin.Context) {
workspaceID := c.Param("id")
tokenID := c.Param("tokenId")
if !validWorkspaceID(workspaceID) {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid workspace id"})
return
}
result, err := db.DB.ExecContext(c.Request.Context(), `
UPDATE workspace_auth_tokens
@@ -41,6 +41,15 @@ import (
func init() { gin.SetMode(gin.TestMode) }
// Workspace IDs are validated as UUIDs up front (tokens.go validWorkspaceID),
// so handler tests must pass syntactically valid UUIDs. Fixed values keep
// sqlmock WithArgs assertions deterministic.
const (
wsUUID1 = "11111111-1111-1111-1111-111111111111"
wsUUID2 = "22222222-2222-2222-2222-222222222222"
wsUUID3 = "33333333-3333-3333-3333-333333333333"
)
// withMockDB swaps `db.DB` for a sqlmock and returns the mock plus a
// restore func. Tests use this in place of setupTokenTestDB which
// skips on a missing real DB.
@@ -81,13 +90,13 @@ func TestTokenHandler_List_HappyPath(t *testing.T) {
created := time.Date(2026, 4, 1, 12, 0, 0, 0, time.UTC)
last := created.Add(time.Hour)
mock.ExpectQuery(`SELECT id, prefix, created_at, last_used_at\s+FROM workspace_auth_tokens`).
WithArgs("ws-1", 50, 0).
WithArgs(wsUUID1, 50, 0).
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}).
AddRow("tok-1", "abc12345", created, last).
AddRow("tok-2", "def67890", created, nil))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
@@ -121,7 +130,7 @@ func TestTokenHandler_List_EmptyResult(t *testing.T) {
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-2/tokens", gin.Params{{Key: "id", Value: "ws-2"}})
"/workspaces/ws-2/tokens", gin.Params{{Key: "id", Value: wsUUID2}})
if w.Code != http.StatusOK {
t.Fatalf("expected 200 on empty list, got %d", w.Code)
@@ -146,7 +155,7 @@ func TestTokenHandler_List_QueryError(t *testing.T) {
WillReturnError(errors.New("connection refused"))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-3/tokens", gin.Params{{Key: "id", Value: "ws-3"}})
"/workspaces/ws-3/tokens", gin.Params{{Key: "id", Value: wsUUID3}})
if w.Code != http.StatusInternalServerError {
t.Errorf("query error must surface as 500, got %d", w.Code)
@@ -158,13 +167,13 @@ func TestTokenHandler_List_RespectsLimit(t *testing.T) {
defer cleanup()
mock.ExpectQuery(`SELECT id, prefix, created_at, last_used_at`).
WithArgs("ws-1", 10, 5).
WithArgs(wsUUID1, 10, 5).
WillReturnRows(sqlmock.NewRows([]string{"id", "prefix", "created_at", "last_used_at"}))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/workspaces/ws-1/tokens?limit=10&offset=5", nil)
c.Params = gin.Params{{Key: "id", Value: "ws-1"}}
c.Params = gin.Params{{Key: "id", Value: wsUUID1}}
NewTokenHandler().List(c)
if w.Code != http.StatusOK {
@@ -186,7 +195,7 @@ func TestTokenHandler_List_ScanError(t *testing.T) {
AddRow("tok-1", "abc", "not-a-timestamp", nil))
w := makeReq(t, NewTokenHandler().List, "GET",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusInternalServerError {
t.Errorf("scan error must surface as 500, got %d: %s", w.Code, w.Body.String())
@@ -201,11 +210,11 @@ func TestTokenHandler_Create_RateLimited(t *testing.T) {
// Count query returns 50 (== max) → 429.
mock.ExpectQuery(`SELECT COUNT\(\*\) FROM workspace_auth_tokens`).
WithArgs("ws-1").
WithArgs(wsUUID1).
WillReturnRows(sqlmock.NewRows([]string{"count"}).AddRow(50))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusTooManyRequests {
t.Errorf("max active tokens should 429, got %d", w.Code)
@@ -225,7 +234,7 @@ func TestTokenHandler_Create_IssueFails(t *testing.T) {
WillReturnError(errors.New("disk full"))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusInternalServerError {
t.Errorf("IssueToken DB error must 500, got %d", w.Code)
@@ -242,7 +251,7 @@ func TestTokenHandler_Create_HappyPath(t *testing.T) {
WillReturnResult(sqlmock.NewResult(1, 1))
w := makeReq(t, NewTokenHandler().Create, "POST",
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: "ws-1"}})
"/workspaces/ws-1/tokens", gin.Params{{Key: "id", Value: wsUUID1}})
if w.Code != http.StatusCreated {
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
@@ -257,7 +266,7 @@ func TestTokenHandler_Create_HappyPath(t *testing.T) {
if body.AuthToken == "" {
t.Errorf("auth_token must be present and non-empty in response")
}
if body.WorkspaceID != "ws-1" {
if body.WorkspaceID != wsUUID1 {
t.Errorf("workspace_id mismatch: %q", body.WorkspaceID)
}
}
@@ -269,12 +278,12 @@ func TestTokenHandler_Revoke_HappyPath(t *testing.T) {
defer cleanup()
mock.ExpectExec(`UPDATE workspace_auth_tokens\s+SET revoked_at = now\(\)`).
WithArgs("tok-1", "ws-1").
WithArgs("tok-1", wsUUID1).
WillReturnResult(sqlmock.NewResult(0, 1))
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-1", gin.Params{
{Key: "id", Value: "ws-1"},
{Key: "id", Value: wsUUID1},
{Key: "tokenId", Value: "tok-1"},
})
@@ -289,12 +298,12 @@ func TestTokenHandler_Revoke_NotFound(t *testing.T) {
// 0 rows affected → token not found OR already revoked.
mock.ExpectExec(`UPDATE workspace_auth_tokens`).
WithArgs("tok-ghost", "ws-1").
WithArgs("tok-ghost", wsUUID1).
WillReturnResult(sqlmock.NewResult(0, 0))
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-ghost", gin.Params{
{Key: "id", Value: "ws-1"},
{Key: "id", Value: wsUUID1},
{Key: "tokenId", Value: "tok-ghost"},
})
@@ -312,7 +321,7 @@ func TestTokenHandler_Revoke_DBError(t *testing.T) {
w := makeReq(t, NewTokenHandler().Revoke, "DELETE",
"/workspaces/ws-1/tokens/tok-1", gin.Params{
{Key: "id", Value: "ws-1"},
{Key: "id", Value: wsUUID1},
{Key: "tokenId", Value: "tok-1"},
})
@@ -321,6 +330,59 @@ func TestTokenHandler_Revoke_DBError(t *testing.T) {
}
}
// ---- UUID validation (regression: "global" sentinel 500) ------------
// The canvas Settings → Workspace Tokens tab sent the literal sentinel
// "global" as the workspace id when no node was selected. workspace_id
// is a `uuid` column, so the query raised
// `invalid input syntax for type uuid: "global"` which leaked as an
// opaque 500. List/Create/Revoke now reject any non-UUID id with a
// clean 400 before touching the DB. No DB expectation is set on the
// mock — a DB hit would fail ExpectationsWereMet, proving short-circuit.
func TestTokenHandler_RejectsNonUUIDWorkspaceID(t *testing.T) {
h := NewTokenHandler()
cases := []struct {
name string
run func(c *gin.Context)
method string
params gin.Params
}{
{"List", h.List, "GET", gin.Params{{Key: "id", Value: "global"}}},
{"Create", h.Create, "POST", gin.Params{{Key: "id", Value: "global"}}},
{"Revoke", h.Revoke, "DELETE", gin.Params{
{Key: "id", Value: "global"},
{Key: "tokenId", Value: "tok-1"},
}},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
mock, cleanup := withMockDB(t)
defer cleanup()
w := makeReq(t, tc.run, tc.method,
"/workspaces/global/tokens", tc.params)
if w.Code != http.StatusBadRequest {
t.Fatalf("%s with non-UUID id must 400, got %d: %s",
tc.name, w.Code, w.Body.String())
}
var body struct {
Error string `json:"error"`
}
_ = json.Unmarshal(w.Body.Bytes(), &body)
if body.Error != "invalid workspace id" {
t.Errorf("%s: want error=%q, got %q",
tc.name, "invalid workspace id", body.Error)
}
// No query/exec was expected → if the handler hit the DB
// this fails, proving the guard short-circuits before SQL.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("%s leaked a DB call past the uuid guard: %v", tc.name, err)
}
})
}
}
// Compile-time noise removal: the imports list pulls in the sql /
// driver packages and the silenced ctx so a future scenario that
// needs them doesn't have to re-add the import. Documented here so
@@ -11,6 +11,7 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/wsauth"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
func init() { gin.SetMode(gin.TestMode) }
@@ -167,11 +168,14 @@ func TestTokenHandler_RevokeWrongWorkspace(t *testing.T) {
h := NewTokenHandler()
// Try to revoke with a different workspace ID — should 404
// Try to revoke with a different (valid-UUID) workspace ID that does
// not own the token — should 404. A valid UUID is required so this
// exercises the ownership branch, not the up-front uuid-shape 400.
otherWS := uuid.NewString()
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "wrong-workspace-id"}, {Key: "tokenId", Value: tokenID}}
c.Request = httptest.NewRequest("DELETE", "/workspaces/wrong/tokens/"+tokenID, nil)
c.Params = gin.Params{{Key: "id", Value: otherWS}, {Key: "tokenId", Value: tokenID}}
c.Request = httptest.NewRequest("DELETE", "/workspaces/"+otherWS+"/tokens/"+tokenID, nil)
h.Revoke(c)
if w.Code != http.StatusNotFound {
@@ -80,6 +80,15 @@ type WorkspaceHandler struct {
asyncWG sync.WaitGroup
}
// newHandlerHook, when non-nil, is invoked for every WorkspaceHandler
// created via NewWorkspaceHandler. It is nil in production (zero cost);
// the test harness sets it so setupTestDB can drain every handler's
// in-flight async goroutines before swapping the global db.DB. Without
// this, a detached restart goroutine (maybeMarkContainerDead ->
// goAsync(RestartByID) -> runRestartCycle reads db.DB) races the
// db.DB restore in another test's t.Cleanup.
var newHandlerHook func(*WorkspaceHandler)
func (h *WorkspaceHandler) goAsync(fn func()) {
h.asyncWG.Add(1)
go func() {
@@ -108,6 +117,9 @@ func NewWorkspaceHandler(b events.EventEmitter, p *provisioner.Provisioner, plat
if p != nil {
h.provisioner = p
}
if newHandlerHook != nil {
newHandlerHook(h)
}
return h
}
@@ -237,10 +237,10 @@ func (h *WorkspaceHandler) Restart(c *gin.Context) {
// the silent-drop bugs PRs #2811/#2824 closed). RestartWorkspaceAuto
// enforces CP-FIRST ordering matching the other dispatchers — see
// docs/architecture/backends.md.
go func() {
h.goAsync(func() {
h.RestartWorkspaceAutoOpts(context.Background(), id, templatePath, configFiles, payload, resetClaudeSession)
}()
go h.sendRestartContext(id, restartData)
})
h.goAsync(func() { h.sendRestartContext(id, restartData) })
c.JSON(http.StatusOK, gin.H{"status": "provisioning", "config_dir": configLabel, "reset_session": resetClaudeSession})
}
@@ -610,7 +610,9 @@ func (h *WorkspaceHandler) runRestartCycle(workspaceID string) {
h.provisionWorkspaceAutoSync(workspaceID, "", nil, payload)
// sendRestartContext is a one-way notification to the new container; safe
// to fire async — the next restart cycle won't depend on it completing.
go h.sendRestartContext(workspaceID, restartData)
// Tracked via goAsync so the test harness can drain it before the
// global db.DB swap (sendRestartContext reads db.DB).
h.goAsync(func() { h.sendRestartContext(workspaceID, restartData) })
}
// Pause handles POST /workspaces/:id/pause
@@ -178,12 +178,21 @@ func (p *CPProvisioner) Start(ctx context.Context, cfg WorkspaceConfig) (string,
// /admin/liveness and other admin-gated platform endpoints (core#831).
// p.adminToken is read from os.Getenv("ADMIN_TOKEN") at provisioner creation;
// it is also used for CP→platform HTTP auth but those are separate concerns.
env := cfg.EnvVars
if p.adminToken != "" {
env = make(map[string]string, len(cfg.EnvVars)+1)
for k, v := range cfg.EnvVars {
env[k] = v
//
// Forensic #145 hardening: tenant workspaces run on EC2 via this path, so
// the SCM-write-token denylist (see buildContainerEnv) is enforced here
// too. Always build a filtered copy — never pass cfg.EnvVars through
// verbatim — so a latent persona-merged GITEA_TOKEN can't reach the
// tenant container regardless of whether ADMIN_TOKEN is set.
env := make(map[string]string, len(cfg.EnvVars)+1)
for k, v := range cfg.EnvVars {
if isSCMWriteTokenKey(k) {
log.Printf("CPProvisioner.Start: dropped SCM-write credential %q from tenant workspace env (forensic #145 guard)", k)
continue
}
env[k] = v
}
if p.adminToken != "" {
env["ADMIN_TOKEN"] = p.adminToken
}
// Collect template files and generated configs, with OFFSEC-010 guards:
@@ -343,6 +352,7 @@ func collectCPConfigFiles(cfg WorkspaceConfig) (map[string]string, error) {
}
return files, nil
}
// Stop terminates the workspace's EC2 instance via the control plane.
//
// Looks up the actual EC2 instance_id from the workspaces table before
@@ -497,7 +507,9 @@ func (p *CPProvisioner) IsRunning(ctx context.Context, workspaceID string) (bool
// Don't leak the body — upstream errors may echo headers.
return true, fmt.Errorf("cp provisioner: status: unexpected %d", resp.StatusCode)
}
var result struct{ State string `json:"state"` }
var result struct {
State string `json:"state"`
}
// Cap body read at 64 KiB for parity with Start — a misconfigured
// or compromised CP streaming a huge body could otherwise exhaust
// memory in this hot path (called reactively per-request from
@@ -591,6 +591,28 @@ func ValidateWorkspaceAccess(access, workspacePath string) error {
}
}
// scmWriteTokenKeys is the explicit denylist of environment variable names
// that carry a Git SCM *write* credential (push / merge / approve). These
// must never reach a tenant workspace container — see the forensic #145
// rationale in buildContainerEnv. Kept as an exact-match set rather than a
// substring/prefix heuristic so the guard is auditable and can't silently
// over-strip a legitimately-named var.
var scmWriteTokenKeys = map[string]struct{}{
"GITEA_TOKEN": {},
"GITHUB_TOKEN": {},
"GH_TOKEN": {}, // gh CLI honours GH_TOKEN as a GITHUB_TOKEN alias
"GITLAB_TOKEN": {},
"GL_TOKEN": {}, // glab CLI alias
"BITBUCKET_TOKEN": {},
}
// isSCMWriteTokenKey reports whether an env var name is a known Git SCM
// write credential that must be stripped from tenant workspace env.
func isSCMWriteTokenKey(key string) bool {
_, ok := scmWriteTokenKeys[key]
return ok
}
// buildContainerEnv assembles the initial environment variables injected
// into every workspace container.
//
@@ -627,6 +649,21 @@ func buildContainerEnv(cfg WorkspaceConfig) []string {
env = append(env, fmt.Sprintf("AWARENESS_URL=%s", cfg.AwarenessURL))
}
for k, v := range cfg.EnvVars {
// Forensic #145 hardening: tenant workspace containers run
// agent-controlled code and must NEVER receive a Git SCM *write*
// credential. Without merge/approve creds in-container the
// two-eyes review gate is structurally self-bypass-proof — an
// agent that forges an approval has no token to act on it. A
// latent path exists (loadPersonaEnvFile merges a per-role
// persona `GITEA_TOKEN` into cfg.EnvVars when MOLECULE_PERSONA_ROOT
// is set on a tenant host); it is inert today (persona dirs are
// operator-host-only) but unguarded. Strip SCM-write tokens here
// by construction so the invariant holds regardless of whether
// that path ever becomes reachable.
if isSCMWriteTokenKey(k) {
log.Printf("buildContainerEnv: dropped SCM-write credential %q from workspace env (forensic #145 guard)", k)
continue
}
env = append(env, fmt.Sprintf("%s=%s", k, v))
}
// Inject ADMIN_TOKEN from the platform server's environment so workspace
@@ -636,10 +636,15 @@ func TestBuildContainerEnv_AwarenessOnlyWhenBothSet(t *testing.T) {
}
func TestBuildContainerEnv_CustomEnvVarsAppended(t *testing.T) {
// NOTE: this test previously asserted GITHUB_TOKEN passed through
// verbatim. That assertion encoded the forensic #145 latent leak as
// expected behavior. Post-guard, ordinary custom env still flows but
// SCM-write credentials are stripped — see
// TestBuildContainerEnv_StripsSCMWriteTokens for the negative assertion.
cfg := WorkspaceConfig{
WorkspaceID: "ws-x",
PlatformURL: "http://localhost:8080",
EnvVars: map[string]string{"CUSTOM": "value", "GITHUB_TOKEN": "fake-token-for-test"},
EnvVars: map[string]string{"CUSTOM": "value", "ANTHROPIC_API_KEY": "sk-not-an-scm-token"},
}
env := buildContainerEnv(cfg)
seen := map[string]string{}
@@ -652,8 +657,8 @@ func TestBuildContainerEnv_CustomEnvVarsAppended(t *testing.T) {
if seen["CUSTOM"] != "value" {
t.Errorf("CUSTOM env missing, got env=%v", env)
}
if seen["GITHUB_TOKEN"] != "fake-token-for-test" {
t.Errorf("GITHUB_TOKEN env missing, got env=%v", env)
if seen["ANTHROPIC_API_KEY"] != "sk-not-an-scm-token" {
t.Errorf("non-SCM custom env must still pass through, got env=%v", env)
}
// Built-in defaults still present
if seen["MOLECULE_URL"] == "" {
@@ -661,6 +666,129 @@ func TestBuildContainerEnv_CustomEnvVarsAppended(t *testing.T) {
}
}
// ---------- forensic #145: SCM-write-token denylist guard ----------
// TestBuildContainerEnv_StripsSCMWriteTokens is the core negative
// assertion: a tenant workspace env constructed via buildContainerEnv MUST
// NOT contain any Git SCM *write* credential, regardless of how it got into
// cfg.EnvVars. This proves the two-eyes review gate stays structurally
// self-bypass-proof — an agent in-container has no merge/approve token to
// act on a forged approval. See forensic #145.
//
// This test FAILS on the pre-guard code (where buildContainerEnv passed
// cfg.EnvVars through verbatim) and PASSES once the denylist filter is in
// place — i.e. the guard is proven by construction, not by environment
// accident.
func TestBuildContainerEnv_StripsSCMWriteTokens(t *testing.T) {
scmTokens := []string{
"GITEA_TOKEN", "GITHUB_TOKEN", "GH_TOKEN",
"GITLAB_TOKEN", "GL_TOKEN", "BITBUCKET_TOKEN",
}
t.Run("normal path — SCM tokens explicitly set in EnvVars", func(t *testing.T) {
envVars := map[string]string{"CUSTOM": "ok", "ANTHROPIC_API_KEY": "sk-keep"}
for _, k := range scmTokens {
envVars[k] = "leaked-write-credential-" + k
}
cfg := WorkspaceConfig{
WorkspaceID: "ws-tenant",
PlatformURL: "http://localhost:8080",
Tier: 2,
EnvVars: envVars,
}
assertNoSCMWriteToken(t, buildContainerEnv(cfg), scmTokens)
// Sanity: non-SCM custom env is NOT collateral-damaged by the filter.
if !envContains(buildContainerEnv(cfg), "CUSTOM=ok") {
t.Errorf("filter must not strip non-SCM custom env")
}
if !envContains(buildContainerEnv(cfg), "ANTHROPIC_API_KEY=sk-keep") {
t.Errorf("filter must not strip non-SCM API keys")
}
})
t.Run("persona-file path — simulates loadPersonaEnvFile merge", func(t *testing.T) {
// The latent path: handlers.loadPersonaEnvFile() merges a per-role
// persona env file (carrying GITEA_USER, GITEA_TOKEN, …) into the
// workspace env map when MOLECULE_PERSONA_ROOT is set on a tenant
// host. We can't invoke that cross-package helper here, but its
// observable effect is exactly "a GITEA_TOKEN appears in
// cfg.EnvVars". Constructing that condition directly proves the
// guard holds even if the latent path becomes reachable.
cfg := WorkspaceConfig{
WorkspaceID: "ws-tenant",
PlatformURL: "http://localhost:8080",
Tier: 2,
EnvVars: map[string]string{
// Persona identity fields that are SAFE to keep (read-only
// identity, not a write credential):
"GITEA_USER": "backend-engineer",
"GITEA_USER_EMAIL": "backend-engineer@agents.moleculesai.app",
// The credential that must be stripped:
"GITEA_TOKEN": "persona-merged-write-pat",
"GITEA_TOKEN_SCOPES": "write:repository",
},
}
got := buildContainerEnv(cfg)
assertNoSCMWriteToken(t, got, scmTokens)
// Non-credential persona identity may still flow through — only the
// write token is the denied surface.
if !envContains(got, "GITEA_USER=backend-engineer") {
t.Errorf("non-credential persona identity (GITEA_USER) should not be stripped")
}
})
}
// TestCPProvisionerEnv_StripsSCMWriteTokens covers the tenant-EC2 path:
// CPProvisioner.Start builds the env map the control plane forwards to the
// EC2 workspace container. The same forensic #145 denylist must hold there.
func TestCPProvisionerEnv_StripsSCMWriteTokens(t *testing.T) {
// isSCMWriteTokenKey is the single source of truth shared by both
// buildContainerEnv (local Docker) and CPProvisioner.Start (tenant EC2).
// Assert it classifies every known SCM-write var as denied and leaves
// ordinary / read-only-identity vars alone.
for _, k := range []string{
"GITEA_TOKEN", "GITHUB_TOKEN", "GH_TOKEN",
"GITLAB_TOKEN", "GL_TOKEN", "BITBUCKET_TOKEN",
} {
if !isSCMWriteTokenKey(k) {
t.Errorf("isSCMWriteTokenKey(%q) = false, want true (SCM-write credential must be denied)", k)
}
}
for _, k := range []string{
"GITEA_USER", "GITEA_USER_EMAIL", "ANTHROPIC_API_KEY",
"CUSTOM", "PLATFORM_URL", "ADMIN_TOKEN", "",
} {
if isSCMWriteTokenKey(k) {
t.Errorf("isSCMWriteTokenKey(%q) = true, want false (must not over-strip non-SCM env)", k)
}
}
}
func assertNoSCMWriteToken(t *testing.T, env []string, scmTokens []string) {
t.Helper()
for _, e := range env {
key := e
if i := strings.IndexByte(e, '='); i >= 0 {
key = e[:i]
}
for _, banned := range scmTokens {
if key == banned {
t.Errorf("SCM-write credential %q leaked into workspace env (forensic #145 invariant violated): %q", banned, e)
}
}
}
}
func envContains(env []string, want string) bool {
for _, e := range env {
if e == want {
return true
}
}
return false
}
// ---------- buildWorkspaceMount — #65 workspace_access ----------
func TestBuildWorkspaceMount_SelectionMatrix(t *testing.T) {
@@ -0,0 +1,226 @@
// Package secrets provides the canonical SSOT for credential-shaped
// regex patterns used by:
//
// - the CI `Secret scan` workflow (.gitea/workflows/secret-scan.yml)
// - the runtime's bundled pre-commit hook
// (molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh)
// - the upcoming Phase 2b docker-exec Files API backend, which has
// to refuse to surface files whose path OR content matches a
// credential shape (RFC internal#425, Hongming 2026-05-15)
//
// Before this package, the same regex set lived as duplicate bash
// arrays in two unrelated repos; adding a pattern required editing
// both, and pattern drift was caught only via secret-scan workflow
// failures on PRs that had unrelated changes (#2090-class incident
// vector). Centralising in Go makes the Files API the SSOT, with the
// YAML + bash arrays generated/asserted from this package so drift
// is detected at CI time, not at exfiltration time.
//
// This file is Phase 2a of the internal#425 RFC. Phase 2b will import
// `Patterns` from `template_files_docker_exec.go` to gate
// `listFilesViaDockerExec` / `readFileViaDockerExec` against
// secret-shaped paths AND content. Until 2b lands, the package has
// one consumer: this package's own unit tests, which pin the regex
// strings so a refactor that drops or weakens one is caught here.
package secrets
import (
"fmt"
"regexp"
"sync"
)
// Pattern is one named credential shape — a human label plus the
// compiled regex. The label appears in CI error output ("matched:
// github-pat") so an operator can identify the family without seeing
// the actual matched bytes (echoing the bytes widens the blast radius
// per the secret-scan workflow's recovery prose).
type Pattern struct {
// Name is a short kebab-case identifier (e.g. "github-pat",
// "anthropic-api-key"). Stable across versions — consumers may
// switch on it.
Name string
// Description is a one-line human-readable explanation of what
// the pattern matches. Used in CI error messages and the Files
// API "<denied: secret-shape>" placeholder rationale.
Description string
// regexSource is the regex literal in Go-RE2 syntax. Stored as a
// string so the slice declaration below stays readable; compiled
// once via sync.Once into a *regexp.Regexp.
regexSource string
}
// Patterns is the canonical credential-shape regex set.
//
// Adding a pattern here:
//
// 1. Add a new Pattern{} entry below with a kebab-case Name, a
// one-line Description, and the regex literal. Anchor on a
// low-false-positive prefix.
// 2. Add a positive + negative test case in patterns_test.go.
// 3. Mirror the regex string into:
// a. .gitea/workflows/secret-scan.yml SECRET_PATTERNS array
// b. molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh
// (or wait for the codegen target that consumes this slice — TBD
// follow-up; tracked in the Phase 2a PR description.)
//
// The order is: alphabetical within each provider family, families
// grouped by ecosystem (GitHub family, AI-provider family, chat
// family, cloud family). Keep this stable so diffs are reviewable.
var Patterns = []Pattern{
// --- GitHub token family ---
{
Name: "github-pat-classic",
Description: "GitHub personal access token (classic)",
regexSource: `ghp_[A-Za-z0-9]{36,}`,
},
{
Name: "github-app-installation-token",
Description: "GitHub App installation token (#2090 vector)",
regexSource: `ghs_[A-Za-z0-9]{36,}`,
},
{
Name: "github-oauth-user-to-server",
Description: "GitHub OAuth user-to-server token",
regexSource: `gho_[A-Za-z0-9]{36,}`,
},
{
Name: "github-oauth-user",
Description: "GitHub OAuth user token",
regexSource: `ghu_[A-Za-z0-9]{36,}`,
},
{
Name: "github-oauth-refresh",
Description: "GitHub OAuth refresh token",
regexSource: `ghr_[A-Za-z0-9]{36,}`,
},
{
Name: "github-pat-fine-grained",
Description: "GitHub fine-grained personal access token",
regexSource: `github_pat_[A-Za-z0-9_]{82,}`,
},
// --- AI-provider API key family ---
{
Name: "anthropic-api-key",
Description: "Anthropic API key",
regexSource: `sk-ant-[A-Za-z0-9_-]{40,}`,
},
{
Name: "openai-project-key",
Description: "OpenAI project API key",
regexSource: `sk-proj-[A-Za-z0-9_-]{40,}`,
},
{
Name: "openai-service-account-key",
Description: "OpenAI service-account API key",
regexSource: `sk-svcacct-[A-Za-z0-9_-]{40,}`,
},
{
Name: "minimax-api-key",
Description: "MiniMax API key (F1088 vector)",
regexSource: `sk-cp-[A-Za-z0-9_-]{60,}`,
},
// --- Chat-platform token family ---
{
Name: "slack-token",
Description: "Slack token (xoxb/xoxa/xoxp/xoxr/xoxs)",
regexSource: `xox[baprs]-[A-Za-z0-9-]{20,}`,
},
// --- Cloud-provider credential family ---
{
Name: "aws-access-key-id",
Description: "AWS access key ID",
regexSource: `AKIA[0-9A-Z]{16}`,
},
{
Name: "aws-sts-temp-access-key-id",
Description: "AWS STS temporary access key ID",
regexSource: `ASIA[0-9A-Z]{16}`,
},
}
// compiledOnce protects the lazy build of compiledPatterns. We compile
// lazily so package init is cheap; callers pay only on first match
// (typically once per workspace-server boot).
var (
compiledOnce sync.Once
compiledPatterns []*compiledPattern
compileErr error
)
type compiledPattern struct {
Name string
Description string
Re *regexp.Regexp
}
// compileAll compiles every Pattern.regexSource into a *regexp.Regexp.
// Called once via compiledOnce. Any compile failure here is a build
// bug (the unit tests assert each regex compiles) — surfacing via
// returned error so callers don't panic in request handling.
func compileAll() {
out := make([]*compiledPattern, 0, len(Patterns))
for _, p := range Patterns {
re, err := regexp.Compile(p.regexSource)
if err != nil {
compileErr = fmt.Errorf("secrets: pattern %q failed to compile: %w", p.Name, err)
return
}
out = append(out, &compiledPattern{Name: p.Name, Description: p.Description, Re: re})
}
compiledPatterns = out
}
// ScanBytes returns a non-nil Match if any pattern matches anywhere
// inside b. Returns (nil, nil) on no match. Returns (nil, err) only
// if a regex in the package fails to compile — that's a build bug,
// not a runtime data issue.
//
// Match contains the pattern Name + Description so the caller can
// emit a path-or-content-denial rationale WITHOUT round-tripping the
// matched bytes (which would defeat the purpose). The matched bytes
// stay inside this function.
//
// The Files API Phase 2b backend will call ScanBytes on:
//
// - the absolute path string (catches a file literally named
// `ghs_abc.txt`)
// - the file content (catches a credential pasted into a workspace
// file by an agent or user — the Files API refuses to surface it
// and the canvas renders "<denied: secret-shape>")
//
// Ordering: patterns are tried in declaration order. First match
// wins. This means narrower patterns (e.g. `sk-svcacct-…`) should
// appear in `Patterns` before broader ones (`sk-…`) — today there's
// no overlap, so order is descriptive only.
func ScanBytes(b []byte) (*Match, error) {
compiledOnce.Do(compileAll)
if compileErr != nil {
return nil, compileErr
}
for _, cp := range compiledPatterns {
if cp.Re.Match(b) {
return &Match{Name: cp.Name, Description: cp.Description}, nil
}
}
return nil, nil
}
// ScanString is the string-input convenience wrapper around ScanBytes.
// Identical semantics — the body never copies, []byte(s) is a
// zero-copy reinterpret for the regex matcher.
func ScanString(s string) (*Match, error) {
return ScanBytes([]byte(s))
}
// Match describes which pattern caught a value. Deliberately does
// NOT include the matched substring — callers must not echo it.
type Match struct {
// Name is the pattern's kebab-case identifier (e.g. "github-pat-classic").
Name string
// Description is the human-readable line for UI / log surfaces.
Description string
}
@@ -0,0 +1,189 @@
package secrets
import (
"strings"
"testing"
)
// TestEveryPatternCompiles pins that every Pattern.regexSource is a
// valid Go-RE2 expression. Without this, a bad regex would silently
// disable ScanBytes for everything after it (the lazy compile would
// set compileErr and ScanBytes would return that error every call).
func TestEveryPatternCompiles(t *testing.T) {
for _, p := range Patterns {
if p.Name == "" {
t.Errorf("pattern with empty Name: regex=%q", p.regexSource)
}
if p.Description == "" {
t.Errorf("pattern %q has empty Description", p.Name)
}
}
// Force compile + check error.
if _, err := ScanBytes([]byte("placeholder")); err != nil {
t.Fatalf("ScanBytes init failed: %v", err)
}
}
// TestNoDuplicateNames — a duplicate pattern Name would make the
// "first match wins" semantics surprising to readers and any caller
// switching on Match.Name (none today but adding the guard is cheap).
func TestNoDuplicateNames(t *testing.T) {
seen := map[string]bool{}
for _, p := range Patterns {
if seen[p.Name] {
t.Errorf("duplicate pattern Name: %q", p.Name)
}
seen[p.Name] = true
}
}
// TestKnownPatternsAllPresent — pins which specific Name values are
// expected. A future refactor that renames or removes one without
// updating consumers (CI workflow, runtime pre-commit hook, Files
// API Phase 2b backend) would silently widen the leak surface.
// Failing here forces the rename to be intentional.
func TestKnownPatternsAllPresent(t *testing.T) {
expected := []string{
"github-pat-classic",
"github-app-installation-token",
"github-oauth-user-to-server",
"github-oauth-user",
"github-oauth-refresh",
"github-pat-fine-grained",
"anthropic-api-key",
"openai-project-key",
"openai-service-account-key",
"minimax-api-key",
"slack-token",
"aws-access-key-id",
"aws-sts-temp-access-key-id",
}
got := map[string]bool{}
for _, p := range Patterns {
got[p.Name] = true
}
for _, want := range expected {
if !got[want] {
t.Errorf("expected pattern %q missing from Patterns slice", want)
}
}
}
// TestPositiveMatches — for each pattern, supply a representative
// shape and assert ScanBytes returns a Match with the right Name.
// These are TEST FIXTURES, not real credentials — each is the
// pattern's prefix + a long-enough trailing run of placeholder chars.
// `EXAMPLE` is sprinkled in to make grep-finds in CI logs obviously
// fake to a human reader (matches saved memory
// feedback_assert_exact_not_substring: tighten by Name not body).
func TestPositiveMatches(t *testing.T) {
cases := []struct {
fixture string
expectedName string
}{
{"ghp_EXAMPLE111122223333444455556666777788889999", "github-pat-classic"},
{"ghs_EXAMPLE111122223333444455556666777788889999", "github-app-installation-token"},
{"gho_EXAMPLE111122223333444455556666777788889999", "github-oauth-user-to-server"},
{"ghu_EXAMPLE111122223333444455556666777788889999", "github-oauth-user"},
{"ghr_EXAMPLE111122223333444455556666777788889999", "github-oauth-refresh"},
{"github_pat_EXAMPLE" + strings.Repeat("1", 80), "github-pat-fine-grained"},
{"sk-ant-EXAMPLE" + strings.Repeat("1", 40), "anthropic-api-key"},
{"sk-proj-EXAMPLE" + strings.Repeat("1", 40), "openai-project-key"},
{"sk-svcacct-EXAMPLE" + strings.Repeat("1", 40), "openai-service-account-key"},
{"sk-cp-EXAMPLE" + strings.Repeat("1", 60), "minimax-api-key"},
{"xoxb-" + strings.Repeat("a", 25), "slack-token"},
{"xoxa-" + strings.Repeat("a", 25), "slack-token"},
// AWS regex requires [0-9A-Z]{16} — uppercase + digits only.
{"AKIA1234567890ABCDEF", "aws-access-key-id"},
{"ASIA1234567890ABCDEF", "aws-sts-temp-access-key-id"},
}
for _, tc := range cases {
t.Run(tc.expectedName, func(t *testing.T) {
m, err := ScanBytes([]byte(tc.fixture))
if err != nil {
t.Fatalf("ScanBytes(%q) errored: %v", tc.fixture, err)
}
if m == nil {
t.Fatalf("ScanBytes(%q) returned no match — expected %q", tc.fixture, tc.expectedName)
}
if m.Name != tc.expectedName {
t.Errorf("ScanBytes(%q) matched %q; expected %q", tc.fixture, m.Name, tc.expectedName)
}
})
}
}
// TestNegativeShapes — strings that look credential-adjacent but
// shouldn't match (too short, wrong prefix, missing trailing bytes).
// Failing here means a pattern is too loose, which would generate
// false-positive denial in Files API and false-positive workflow
// failures in CI.
func TestNegativeShapes(t *testing.T) {
cases := []string{
// Too-short variants — anchored on the length suffix.
"ghp_tooshort",
"ghs_alsoshort1234",
"github_pat_short",
"sk-ant-short",
"sk-cp-not-enough-bytes-here",
// Looks like one of the prefixes but isn't (different letter).
"gha_EXAMPLE_thirty_six_or_more_chars_here_xxx",
// Slack family — wrong letter after xox.
"xoxz-aaaaaaaaaaaaaaaaaaaaaaaaa",
// AWS-shaped but wrong length suffix.
"AKIATOOSHORT",
// Empty / whitespace.
"",
" ",
// Plain prose mentioning the prefix as part of a longer word.
"see also `ghp_HOWTO.md` in the repo",
}
for _, c := range cases {
t.Run(c, func(t *testing.T) {
m, err := ScanBytes([]byte(c))
if err != nil {
t.Fatalf("ScanBytes(%q) errored: %v", c, err)
}
if m != nil {
t.Errorf("ScanBytes(%q) unexpectedly matched %q", c, m.Name)
}
})
}
}
// TestScanString_NoOp — sanity-check ScanString is the zero-copy
// wrapper around ScanBytes. Without this, a future refactor that
// makes ScanString do its own thing (e.g. accidentally normalise
// case) would diverge silently.
func TestScanString_NoOp(t *testing.T) {
in := "ghp_EXAMPLE111122223333444455556666777788889999"
m1, err1 := ScanBytes([]byte(in))
if err1 != nil {
t.Fatalf("ScanBytes errored: %v", err1)
}
m2, err2 := ScanString(in)
if err2 != nil {
t.Fatalf("ScanString errored: %v", err2)
}
if m1 == nil || m2 == nil {
t.Fatalf("expected matches; got bytes=%+v string=%+v", m1, m2)
}
if m1.Name != m2.Name {
t.Errorf("ScanString and ScanBytes returned different Names: %q vs %q", m1.Name, m2.Name)
}
}
// TestMatch_NoRoundtrip — assert the Match struct does NOT include
// the matched substring as a field. Adding such a field would
// regress the "matched bytes never leave ScanBytes" invariant that
// makes this package safe to call from log/UI surfaces. This is a
// reflection-light contract test — checks the field names statically.
func TestMatch_NoRoundtrip(t *testing.T) {
var m Match
// If someone adds a `Matched string` (or similar) field, this
// test reads as the canonical place to update + reconsider.
_ = m.Name
_ = m.Description
// The two-field shape is part of the public contract; new fields
// require deliberation about whether they leak the secret value.
}
+6
View File
@@ -35,12 +35,14 @@ from a2a_tools import (
tool_commit_memory,
tool_delegate_task,
tool_delegate_task_async,
tool_get_runtime_identity,
tool_get_workspace_info,
tool_inbox_peek,
tool_inbox_pop,
tool_list_peers,
tool_recall_memory,
tool_send_message_to_user,
tool_update_agent_card,
tool_wait_for_message,
)
from platform_tools.registry import TOOLS as _PLATFORM_TOOL_SPECS
@@ -130,6 +132,10 @@ async def handle_tool_call(name: str, arguments: dict) -> str:
return await tool_get_workspace_info(
source_workspace_id=arguments.get("source_workspace_id") or None,
)
elif name == "get_runtime_identity":
return await tool_get_runtime_identity()
elif name == "update_agent_card":
return await tool_update_agent_card(arguments.get("card"))
elif name == "commit_memory":
return await tool_commit_memory(
arguments.get("content", ""),
+12
View File
@@ -167,3 +167,15 @@ from a2a_tools_inbox import ( # noqa: E402 (import after the top-of-module imp
tool_inbox_pop,
tool_wait_for_message,
)
# Identity tool handlers — extracted to a2a_tools_identity. Ports the
# two T4-tier MCP tools (``tool_get_runtime_identity`` +
# ``tool_update_agent_card``) from molecule-ai-workspace-runtime PR#17.
# That repo is mirror-only (reference_runtime_repo_is_mirror_only);
# this is the canonical edit point, and the wheel mirror is
# regenerated by publish-runtime.yml on merge.
from a2a_tools_identity import ( # noqa: E402 (import after the top-of-module imports)
tool_get_runtime_identity,
tool_update_agent_card,
)
+187
View File
@@ -0,0 +1,187 @@
"""Identity tool handlers — single-concern slice of the a2a_tools surface.
Owns the two MCP tools that close the T4-tier workspace owner-permission
gaps reported via the canvas:
* ``tool_get_runtime_identity`` — env-only; returns model, model_provider,
molecule_model, anthropic_base_url, tier, workspace_id, runtime
(ADAPTER_MODULE). No HTTP call. Always permitted by RBAC — even
read-only agents may know what model they are.
* ``tool_update_agent_card`` — POSTs the card to ``/registry/update-card``
with the workspace's own bearer (same auth path as ``tool_commit_memory``
via ``a2a_tools_rbac.auth_headers_for_heartbeat``). The platform
replaces the stored card and broadcasts an ``agent_card_updated``
event so the canvas reflects the new card live. Gated on
``memory.write`` capability via the existing RBAC permission map so
read-only roles can't silently rewrite the platform card.
Both originated as a port of molecule-ai-workspace-runtime PR#17
(``feat(mcp): add update_agent_card + get_runtime_identity tools``).
The mirror-only PR#17 was closed without merge per
``reference_runtime_repo_is_mirror_only``; the canonical edit point is
this monorepo at ``workspace/`` and the wheel mirror is regenerated
automatically by the publish-runtime workflow.
Imports the auth-header primitive from ``a2a_tools_rbac`` (iter 4a) —
NOT from ``a2a_tools`` — to avoid a circular import with the
kitchen-sink re-export module.
"""
from __future__ import annotations
import json
import os
from typing import Any
import httpx
from a2a_client import PLATFORM_URL
from a2a_tools_rbac import (
auth_headers_for_heartbeat as _auth_headers_for_heartbeat,
check_memory_write_permission as _check_memory_write_permission,
)
def _runtime_identity_payload() -> dict[str, Any]:
"""Build the identity dict — env-only, no I/O.
Factored out from ``tool_get_runtime_identity`` so tests can assert
against the exact key set without re-parsing JSON. The MCP tool
handler ``tool_get_runtime_identity`` is the only public caller in
production; tests call this helper directly.
"""
return {
"model": os.environ.get("MODEL", ""),
"model_provider": os.environ.get("MODEL_PROVIDER", ""),
"molecule_model": os.environ.get("MOLECULE_MODEL", ""),
"anthropic_base_url": os.environ.get("ANTHROPIC_BASE_URL", ""),
"tier": os.environ.get("TIER", ""),
"workspace_id": os.environ.get("WORKSPACE_ID", ""),
# Adapter module is the closest thing the runtime has to a
# "template slug" — e.g. "adapter" for claude-code-default,
# "hermes" for hermes-template, etc. Picked from
# $ADAPTER_MODULE env baked by each template's Dockerfile.
"runtime": os.environ.get("ADAPTER_MODULE", ""),
}
async def tool_get_runtime_identity() -> str:
"""Return this runtime's identity — model, provider, tier, IDs.
Env-only; no HTTP call. Useful so the agent can answer "what model
am I?" correctly instead of guessing from a stale system prompt
that the operator may have changed between boots.
Returns the identity as a JSON-encoded string (the dispatch contract
every MCP tool in this module follows). Tests that want to assert
individual fields can call ``_runtime_identity_payload()`` directly,
or ``json.loads`` the return value.
Always permitted by RBAC — there is no sensitive information here
that isn't already available to the process via ``os.environ``.
The point of the tool is to surface those env values to the agent
layer in a stable, documented shape rather than expecting every
agent runtime to know to ``echo $MODEL``.
"""
return json.dumps(_runtime_identity_payload(), indent=2)
async def tool_update_agent_card(card: Any) -> str:
"""Update this workspace's agent_card on the platform.
POSTs the provided card to ``/registry/update-card`` with the
workspace's own bearer token (same auth path as ``tool_commit_memory``
and ``tool_get_workspace_info``). The platform validates required
fields server-side, replaces the stored card, and broadcasts an
``agent_card_updated`` event so the canvas updates live.
Args:
card: A JSON-serialisable object (typically a dict) holding the
new card. The platform validates required fields server-side.
Returns:
JSON-encoded string. Body:
- ``{"success": true, "status": "updated"}`` on success;
- ``{"success": false, "error": "<msg>", "status_code": <int>}``
on platform error;
- ``{"success": false, "error": "<reason>"}`` on local validation
(non-dict card, missing WORKSPACE_ID, network error).
Permission gate: this tool requires the ``memory.write`` RBAC
capability — same gate as ``tool_commit_memory``. The check runs
inline rather than at the dispatcher layer to keep ``a2a_mcp_server``
permission-agnostic (the gate sits with the implementation, not the
transport). Read-only roles get a clear error string back instead
of a 403 from the platform.
We re-check ``isinstance(card, dict)`` here defensively rather than
trust the MCP schema validator alone — the schema only constrains
the transport, not the in-process call surface used by tests and
sibling modules.
"""
payload = await _update_agent_card_impl(card)
return json.dumps(payload, indent=2)
async def _update_agent_card_impl(card: Any) -> dict[str, Any]:
"""Dict-returning core of ``tool_update_agent_card``.
Split out so tests can assert against the raw dict shape (status
codes, error messages) without re-parsing JSON on every assertion.
The string-returning ``tool_update_agent_card`` is a thin wrapper
invoked by the MCP dispatcher.
"""
# RBAC: require memory.write permission. Same gate as
# tool_commit_memory (the agent already needs this capability to
# persist anything outbound). Read-only roles can still call
# get_runtime_identity / get_workspace_info to introspect — those
# are env-only / read-only and have no inline gate.
if not _check_memory_write_permission():
return {
"success": False,
"error": (
"RBAC — this workspace does not have the 'memory.write' "
"permission required to update the agent_card."
),
}
if not isinstance(card, dict):
return {
"success": False,
"error": "card must be a JSON object (dict)",
}
ws_id = os.environ.get("WORKSPACE_ID", "")
if not ws_id:
return {
"success": False,
"error": "WORKSPACE_ID env not set; cannot identify caller",
}
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/registry/update-card",
json={"workspace_id": ws_id, "agent_card": card},
headers=_auth_headers_for_heartbeat(),
)
if resp.status_code == 200:
body: dict[str, Any] = {}
try:
body = resp.json()
except Exception:
pass
return {
"success": True,
"status": body.get("status", "updated"),
}
# Non-200 — surface what the platform returned.
error_msg = ""
try:
error_msg = resp.json().get("error", "") or resp.text
except Exception:
error_msg = resp.text
return {
"success": False,
"status_code": resp.status_code,
"error": error_msg,
}
except Exception as e:
return {"success": False, "error": f"network error: {e}"}
+52
View File
@@ -340,6 +340,16 @@ _CLI_A2A_COMMAND_KEYWORDS: dict[str, str | None] = {
"delegate_task_async": "delegate --async",
"check_task_status": "status",
"get_workspace_info": "info",
# `get_runtime_identity` + `update_agent_card` are MCP-first
# capabilities — the CLI subprocess interface doesn't expose them
# today. `get_runtime_identity` is env-only and an agent on a
# CLI-only runtime can already `echo $MODEL` etc, so there's no
# functional gap. `update_agent_card` requires a JSON object
# argument that wouldn't survive a positional-arg shell invocation
# cleanly. Mapped to None — flip to a keyword if a2a_cli grows
# `identity` / `card` subcommands in the future.
"get_runtime_identity": None,
"update_agent_card": None,
# `broadcast_message` is not exposed via the CLI subprocess interface
# today — it's an MCP-first capability. If a2a_cli grows a `broadcast`
# subcommand, map it here and the alignment test will gate the change.
@@ -589,6 +599,28 @@ def _sanitize_for_external(msg: str) -> str:
import re as _re
msg = _re.sub(r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}", "[REDACTED]", msg)
# Bare provider key with NO separator after the prefix — a real
# `sk-ant-api03-…` / `sk-…` key uses `-` (not `[ :=]`) so the rule
# above misses it. Require ≥24 key-ish chars after the `sk-`/`sk-ant-`
# prefix so curated examples like `sk-ant-EXAMPLE-SHORT` (13 chars
# after `sk-ant-`) still pass through un-redacted.
msg = _re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
# JSON-quoted credential values: {"token": "…"} / {"apiKey": "…"} /
# {"secret": "…"} / {"password": "…"}. Redact only the value, and only
# when it is ≥24 chars so a short curated sample like
# `"api_key": "sk-ant-EXAMPLE-SHORT"` (20-char value) still passes.
msg = _re.sub(
r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
r"\1[REDACTED]\2",
msg,
)
# AWS secret access key in `aws_secret_access_key=…` form (env dumps,
# boto tracebacks). The base64-ish value runs until whitespace/quote.
msg = _re.sub(
r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+",
r"\1[REDACTED]",
msg,
)
# Absolute paths: /etc/shadow, /home/user/.aws/credentials, etc.
msg = _re.sub(r"(?:/[^/\s]+){2,}", lambda m: m.group(0) if len(m.group(0)) < 60 else "[REDACTED_PATH]", msg)
return msg
@@ -598,6 +630,7 @@ def sanitize_agent_error(
exc: BaseException | None = None,
category: str | None = None,
stderr: str | None = None,
reason: str | None = None,
) -> str:
"""Render an agent-side failure into a user-safe error message.
@@ -605,6 +638,18 @@ def sanitize_agent_error(
category string (e.g. from `classify_subprocess_error`). If both are
given, `category` wins. If neither, the tag defaults to "unknown".
When ``reason`` is provided (internal#211/#212), it is a *pre-curated,
user-actionable, secret-safe* explanation built by the caller from a
provider-side failure — e.g. a 403 "Your organization has disabled
Claude subscription access · Use an Anthropic API key instead, or ask
your admin to enable access" with error code ``oauth_org_not_allowed``.
This text is exactly what the user needs to self-serve, so it is
surfaced VERBATIM as the message instead of being collapsed to the
opaque exception class name. It still passes through the
key/token/bearer/path scrubber as a belt-and-braces second pass so a
buggy caller can't leak a credential that snuck into the reason.
``reason`` wins over ``stderr``; both lose to neither being set.
When ``stderr`` is provided (e.g. the first ~1 KB of a subprocess stderr
or HTTP error body), it is sanitized and appended to the output so the
A2A caller gets actionable context without needing to dig through workspace
@@ -619,6 +664,13 @@ def sanitize_agent_error(
else:
tag = "unknown"
if reason:
# Curated, user-actionable reason — surface it as the message.
# Still scrub: a 403/auth/quota message is safe, but the scrubber
# is cheap insurance against a caller that didn't curate cleanly.
clean = _sanitize_for_external(reason[:_MAX_STDERR_PREVIEW])
return f"Agent error ({tag}): {clean}"
if stderr:
# Truncate and sanitize before including — prevents DoS via
# a malicious or buggy peer injecting a huge error body, and
+59
View File
@@ -57,12 +57,14 @@ from a2a_tools import (
tool_commit_memory,
tool_delegate_task,
tool_delegate_task_async,
tool_get_runtime_identity,
tool_get_workspace_info,
tool_inbox_peek,
tool_inbox_pop,
tool_list_peers,
tool_recall_memory,
tool_send_message_to_user,
tool_update_agent_card,
tool_wait_for_message,
)
@@ -289,6 +291,61 @@ _GET_WORKSPACE_INFO = ToolSpec(
section=A2A_SECTION,
)
_GET_RUNTIME_IDENTITY = ToolSpec(
name="get_runtime_identity",
short=(
"Return this runtime's identity — model, model_provider, tier, "
"workspace_id, runtime template. Reads from process env; no HTTP call."
),
when_to_use=(
"Use this to answer 'what model am I?' truthfully instead of "
"guessing from a stale system prompt — the operator may have "
"routed you to a different model via persona env between boots. "
"Always permitted by RBAC: even read-only agents may know what "
"model they are. Distinct from get_workspace_info — that one "
"calls the platform for ID/role/tier/parent (workspace metadata); "
"this one returns the live process env (MODEL, MODEL_PROVIDER, "
"MOLECULE_MODEL, ANTHROPIC_BASE_URL, TIER, WORKSPACE_ID, "
"ADAPTER_MODULE)."
),
input_schema={"type": "object", "properties": {}},
impl=tool_get_runtime_identity,
section=A2A_SECTION,
)
_UPDATE_AGENT_CARD = ToolSpec(
name="update_agent_card",
short=(
"Replace this workspace's agent_card on the platform. The "
"platform validates required fields and broadcasts an "
"agent_card_updated event so the canvas reflects the change live."
),
when_to_use=(
"Use when the workspace's capabilities, skills, description, or "
"name change and the canvas display needs to follow. The "
"platform stores the new card and pushes an "
"``agent_card_updated`` event to subscribers. Gated behind the "
"``memory.write`` RBAC capability — read-only roles cannot "
"rewrite the card. Tier-1+ owners always have this capability."
),
input_schema={
"type": "object",
"properties": {
"card": {
"type": "object",
"description": (
"The new agent_card object (name, version, "
"description, skills, etc). Server-side validation "
"rejects payloads missing required fields."
),
},
},
"required": ["card"],
},
impl=tool_update_agent_card,
section=A2A_SECTION,
)
_BROADCAST_MESSAGE = ToolSpec(
name="broadcast_message",
short=(
@@ -642,6 +699,8 @@ TOOLS: list[ToolSpec] = [
_CHECK_TASK_STATUS,
_LIST_PEERS,
_GET_WORKSPACE_INFO,
_GET_RUNTIME_IDENTITY,
_UPDATE_AGENT_CARD,
_BROADCAST_MESSAGE,
_SEND_MESSAGE_TO_USER,
# Inbox (standalone-only; in-container returns informational error)
@@ -5,6 +5,8 @@
- **check_task_status**: Poll the status of a task started with delegate_task_async; returns result when done.
- **list_peers**: List the workspaces this agent can communicate with — name, ID, status, role for each.
- **get_workspace_info**: Get this workspace's own info — ID, name, role, tier, parent, status.
- **get_runtime_identity**: Return this runtime's identity — model, model_provider, tier, workspace_id, runtime template. Reads from process env; no HTTP call.
- **update_agent_card**: Replace this workspace's agent_card on the platform. The platform validates required fields and broadcasts an agent_card_updated event so the canvas reflects the change live.
- **broadcast_message**: Send a message to ALL agent workspaces in the org simultaneously. Requires broadcast_enabled=true on this workspace (set by user/admin).
- **send_message_to_user**: Send a message directly to the user's canvas chat — pushed instantly via WebSocket. Use this to: (1) acknowledge a task immediately ('Got it, I'll start working on this'), (2) send interim progress updates while doing long work, (3) deliver follow-up results after delegation completes, (4) attach files (zip, pdf, csv, image) for the user to download via the `attachments` field (NEVER paste file URLs in `message`). The message appears in the user's chat as if you're proactively reaching out.
- **wait_for_message**: Block until the next inbound message (canvas user OR peer agent) arrives, or until ``timeout_secs`` elapses.
@@ -27,6 +29,12 @@ Call this first when you need to delegate but don't know the target's ID. Access
### get_workspace_info
Use to introspect your own identity (e.g. before reporting back to the user, or to determine whether you're a tier-0 root that can write GLOBAL memory).
### get_runtime_identity
Use this to answer 'what model am I?' truthfully instead of guessing from a stale system prompt — the operator may have routed you to a different model via persona env between boots. Always permitted by RBAC: even read-only agents may know what model they are. Distinct from get_workspace_info — that one calls the platform for ID/role/tier/parent (workspace metadata); this one returns the live process env (MODEL, MODEL_PROVIDER, MOLECULE_MODEL, ANTHROPIC_BASE_URL, TIER, WORKSPACE_ID, ADAPTER_MODULE).
### update_agent_card
Use when the workspace's capabilities, skills, description, or name change and the canvas display needs to follow. The platform stores the new card and pushes an ``agent_card_updated`` event to subscribers. Gated behind the ``memory.write`` RBAC capability — read-only roles cannot rewrite the card. Tier-1+ owners always have this capability.
### broadcast_message
Use for urgent, org-wide signals: critical status changes, emergency stop instructions, coordinated task announcements. Every non-removed workspace receives the message in its activity log (poll-mode agents see it on their next poll; push-mode canvases get a real-time banner). This tool returns an error if broadcast_enabled is false — a user or admin must enable it via the workspace abilities settings first.
+390
View File
@@ -0,0 +1,390 @@
"""Tests for ``tool_get_runtime_identity`` and ``tool_update_agent_card``.
These two MCP tools close the T4-tier workspace owner-permission gaps
reported via the canvas:
- the agent could not update its own ``agent_card`` (no MCP tool
wrapped the existing ``POST /registry/update-card`` endpoint);
- the agent could not identify which model it was running (the
``MODEL`` env var is injected by ``provisioner.workspace_provision``
but nothing surfaced it back to the agent).
Ported from molecule-ai-workspace-runtime PR#17 (mirror-only repo;
canonical edit point per ``reference_runtime_repo_is_mirror_only``).
Adapted to core's conventions:
* tool functions return ``str`` (JSON-encoded), matching every other
tool in ``a2a_tools_*`` modules. Tests ``json.loads`` to inspect.
* permission check ``memory.write`` runs inline in
``tool_update_agent_card`` (same pattern as
``a2a_tools_memory.tool_commit_memory``).
* ``WORKSPACE_ID`` is read directly from ``os.environ`` core does
not have the runtime's validated-cache layer (``molecule_runtime.
builtin_tools.validation``).
"""
from __future__ import annotations
import json
import pytest
# --- Drift gate: re-export aliases on a2a_tools ------------------------------
class TestBackCompatAliases:
"""Pin that ``a2a_tools.tool_*`` resolves to the same callable as
``a2a_tools_identity.tool_*``. Refactor wrapping (e.g. a doc-string
wrapper that loses the function identity) silently breaks call
sites that ``patch("a2a_tools.tool_update_agent_card", ...)``
this gate makes that drift fail fast."""
def test_tool_get_runtime_identity_alias(self):
import a2a_tools
import a2a_tools_identity
assert a2a_tools.tool_get_runtime_identity is a2a_tools_identity.tool_get_runtime_identity
def test_tool_update_agent_card_alias(self):
import a2a_tools
import a2a_tools_identity
assert a2a_tools.tool_update_agent_card is a2a_tools_identity.tool_update_agent_card
# --- tool_get_runtime_identity ----------------------------------------------
class TestGetRuntimeIdentity:
"""The tool returns env-derived runtime identity. No HTTP call."""
@pytest.mark.asyncio
async def test_returns_all_known_env_fields(self, monkeypatch):
from a2a_tools_identity import tool_get_runtime_identity
monkeypatch.setenv("MODEL", "claude-opus-4-7")
monkeypatch.setenv("MODEL_PROVIDER", "anthropic")
monkeypatch.setenv("TIER", "T4")
monkeypatch.setenv("WORKSPACE_ID", "ws-abc")
monkeypatch.setenv("ADAPTER_MODULE", "adapter")
monkeypatch.setenv("MOLECULE_MODEL", "claude-opus-4-7")
monkeypatch.setenv("ANTHROPIC_BASE_URL", "https://api.anthropic.com")
out = await tool_get_runtime_identity()
# MCP tools return JSON-encoded strings (matches the contract
# every other tool_* in a2a_tools_* uses).
assert isinstance(out, str)
parsed = json.loads(out)
assert parsed["model"] == "claude-opus-4-7"
assert parsed["model_provider"] == "anthropic"
assert parsed["tier"] == "T4"
assert parsed["workspace_id"] == "ws-abc"
assert parsed["runtime"] == "adapter"
assert parsed["molecule_model"] == "claude-opus-4-7"
assert parsed["anthropic_base_url"] == "https://api.anthropic.com"
@pytest.mark.asyncio
async def test_missing_env_returns_empty_strings(self, monkeypatch):
"""Tool MUST NOT raise when env vars are absent — every key is
present but the value is the empty string. The agent then knows
the slot exists but is unset."""
from a2a_tools_identity import tool_get_runtime_identity
for var in (
"MODEL", "MODEL_PROVIDER", "TIER", "WORKSPACE_ID",
"ADAPTER_MODULE", "MOLECULE_MODEL", "ANTHROPIC_BASE_URL",
):
monkeypatch.delenv(var, raising=False)
parsed = json.loads(await tool_get_runtime_identity())
assert parsed["model"] == ""
assert parsed["model_provider"] == ""
assert parsed["tier"] == ""
assert parsed["workspace_id"] == ""
assert parsed["runtime"] == ""
assert parsed["molecule_model"] == ""
assert parsed["anthropic_base_url"] == ""
@pytest.mark.asyncio
async def test_no_http_call_made(self, monkeypatch):
"""``get_runtime_identity`` is env-only — must not open
httpx.AsyncClient even if the call would otherwise succeed.
Tripwire any client construction."""
import httpx
from a2a_tools_identity import tool_get_runtime_identity
class _Tripwire:
def __init__(self, *_a, **_kw):
raise AssertionError(
"tool_get_runtime_identity must not open httpx.AsyncClient"
)
monkeypatch.setattr(httpx, "AsyncClient", _Tripwire)
# Must not raise.
await tool_get_runtime_identity()
@pytest.mark.asyncio
async def test_helper_dict_matches_string_payload(self, monkeypatch):
"""``_runtime_identity_payload`` is the dict-returning helper
used by both the public tool and tests. Verify the public tool
json.dumps the same dict no field is dropped or renamed by
the encoding step."""
from a2a_tools_identity import (
_runtime_identity_payload,
tool_get_runtime_identity,
)
monkeypatch.setenv("MODEL", "claude-opus-4-7")
monkeypatch.setenv("TIER", "T4")
monkeypatch.setenv("WORKSPACE_ID", "ws-helper-check")
helper = _runtime_identity_payload()
tool_str = await tool_get_runtime_identity()
assert json.loads(tool_str) == helper
# --- tool_update_agent_card -------------------------------------------------
class _MockResponse:
def __init__(self, status_code: int, payload: dict):
self.status_code = status_code
self._payload = payload
self.text = json.dumps(payload)
def json(self):
return self._payload
class _MockClient:
"""Drop-in for httpx.AsyncClient context manager.
Records the URL + json body + headers the tool POSTed so the test
can assert against them. Returns the canned _MockResponse passed
in at construction time.
"""
def __init__(self, *, response: _MockResponse, captured: dict):
self._response = response
self._captured = captured
async def __aenter__(self):
return self
async def __aexit__(self, *_args):
return False
async def post(self, url, *, json=None, headers=None, **_kw): # noqa: A002
self._captured["url"] = url
self._captured["json"] = json
self._captured["headers"] = headers
return self._response
@pytest.fixture
def _grant_memory_write(monkeypatch):
"""Force the inline RBAC gate inside ``tool_update_agent_card`` to
succeed. The gate calls
``a2a_tools_rbac.check_memory_write_permission`` which inspects
``$MOLECULE_ROLES`` / the role table; the patch sidesteps that
machinery so tests can focus on the platform-call shape.
"""
import a2a_tools_identity
monkeypatch.setattr(
a2a_tools_identity, "_check_memory_write_permission", lambda: True
)
class TestUpdateAgentCard:
@pytest.mark.asyncio
async def test_posts_to_registry_update_card(
self, monkeypatch, _grant_memory_write,
):
"""Hits POST {PLATFORM_URL}/registry/update-card with the
workspace bearer and the {workspace_id, agent_card} body shape
the platform handler expects (workspace-server
``internal/handlers/registry.go``)."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
# Ensure PLATFORM_URL re-import sees a deterministic value —
# a2a_client imports it at module load so we patch the symbol
# on a2a_tools_identity directly (the module's own reference).
monkeypatch.setattr(a2a_tools_identity, "PLATFORM_URL", "http://test.invalid")
captured: dict = {}
response = _MockResponse(200, {"status": "updated"})
def _client_factory(*_a, **_kw):
return _MockClient(response=response, captured=captured)
monkeypatch.setattr(a2a_tools_identity.httpx, "AsyncClient", _client_factory)
monkeypatch.setattr(
a2a_tools_identity, "_auth_headers_for_heartbeat",
lambda: {"Authorization": "Bearer ws-token-xyz"},
)
card = {"name": "agent-foo", "version": "0.1.0", "description": "demo"}
result_str = await a2a_tools_identity.tool_update_agent_card(card)
result = json.loads(result_str)
# URL: PLATFORM_URL + /registry/update-card
assert captured["url"] == "http://test.invalid/registry/update-card"
# The platform handler expects {workspace_id, agent_card}; the
# agent_card is the raw object the agent submitted.
body = captured["json"]
assert body["workspace_id"] == "ws-42"
assert body["agent_card"] == card
# Auth header from auth_headers_for_heartbeat is forwarded
# verbatim — same path commit_memory uses.
assert captured["headers"]["Authorization"] == "Bearer ws-token-xyz"
assert result["success"] is True
assert result["status"] == "updated"
@pytest.mark.asyncio
async def test_propagates_server_error(
self, monkeypatch, _grant_memory_write,
):
"""Non-200 from platform surfaces as a structured error to the
agent. The agent sees {success:false, status_code, error} and
can decide whether to retry, fall back, or escalate."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
monkeypatch.setattr(a2a_tools_identity, "PLATFORM_URL", "http://test.invalid")
captured: dict = {}
response = _MockResponse(400, {"error": "invalid card"})
monkeypatch.setattr(
a2a_tools_identity.httpx, "AsyncClient",
lambda *a, **kw: _MockClient(response=response, captured=captured),
)
monkeypatch.setattr(
a2a_tools_identity, "_auth_headers_for_heartbeat", lambda: {},
)
result = json.loads(
await a2a_tools_identity.tool_update_agent_card({"name": "x"})
)
assert result["success"] is False
assert result["status_code"] == 400
assert "invalid card" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_rejects_non_dict_card(self, _grant_memory_write):
"""The MCP schema constrains transport callers to pass a dict;
in-process callers (tests, sibling modules) can still pass any
type. Reject non-dict defensively so the platform isn't asked
to validate JSON-encoded strings or lists."""
from a2a_tools_identity import tool_update_agent_card
result = json.loads(await tool_update_agent_card("not-a-dict"))
assert result["success"] is False
assert "dict" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_workspace_id_missing_returns_error(
self, monkeypatch, _grant_memory_write,
):
"""If WORKSPACE_ID is not set the tool refuses to issue the
request it would otherwise POST with an empty workspace_id
and let the platform return a confusing 400."""
from a2a_tools_identity import tool_update_agent_card
monkeypatch.delenv("WORKSPACE_ID", raising=False)
result = json.loads(await tool_update_agent_card({"name": "x"}))
assert result["success"] is False
assert "workspace_id" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_denies_when_memory_write_permission_missing(self, monkeypatch):
"""The agent's RBAC role must grant ``memory.write`` to update
the card. Read-only roles get an RBAC error string back
immediately, never touching the platform."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
monkeypatch.setattr(
a2a_tools_identity, "_check_memory_write_permission", lambda: False,
)
# Tripwire httpx — must not be called when RBAC denies.
import httpx
class _Tripwire:
def __init__(self, *_a, **_kw):
raise AssertionError("RBAC denial must short-circuit before httpx call")
monkeypatch.setattr(httpx, "AsyncClient", _Tripwire)
result = json.loads(
await a2a_tools_identity.tool_update_agent_card({"name": "x"}),
)
assert result["success"] is False
assert "memory.write" in str(result["error"]).lower()
@pytest.mark.asyncio
async def test_network_exception_returns_structured_error(
self, monkeypatch, _grant_memory_write,
):
"""A network exception (DNS failure, connect timeout, etc) is
wrapped into a structured error dict instead of bubbling up
to the MCP transport layer."""
import a2a_tools_identity
monkeypatch.setenv("WORKSPACE_ID", "ws-42")
monkeypatch.setattr(a2a_tools_identity, "PLATFORM_URL", "http://test.invalid")
class _ExplodingClient:
async def __aenter__(self):
return self
async def __aexit__(self, *_a):
return False
async def post(self, *_a, **_kw):
raise RuntimeError("simulated DNS failure")
monkeypatch.setattr(
a2a_tools_identity.httpx, "AsyncClient",
lambda *a, **kw: _ExplodingClient(),
)
result = json.loads(
await a2a_tools_identity.tool_update_agent_card({"name": "x"})
)
assert result["success"] is False
assert "network" in str(result["error"]).lower()
# --- Registry contract ------------------------------------------------------
class TestRegistryContract:
"""Pin the new tools' registration in platform_tools.registry. The
structural tests in ``test_platform_tools.py`` already check
registryMCP alignment; these are tighter assertions specific to
the two new tools so a future contributor deleting one entry sees
a focused failure."""
def test_get_runtime_identity_in_registry(self):
from platform_tools.registry import by_name
spec = by_name("get_runtime_identity")
assert spec.section == "a2a"
# No input parameters — env-only call.
assert spec.input_schema == {"type": "object", "properties": {}}
# impl points at the actual tool function, not a shim.
from a2a_tools_identity import tool_get_runtime_identity
assert spec.impl is tool_get_runtime_identity
def test_update_agent_card_in_registry(self):
from platform_tools.registry import by_name
spec = by_name("update_agent_card")
assert spec.section == "a2a"
assert "card" in spec.input_schema["properties"]
assert spec.input_schema["required"] == ["card"]
from a2a_tools_identity import tool_update_agent_card
assert spec.impl is tool_update_agent_card
+117
View File
@@ -788,6 +788,123 @@ def test_sanitize_agent_error_stderr_combined_with_existing_tests():
assert "workspace logs" in out
# ─── reason passthrough (internal#211/#212: surface actionable provider error) ───
def test_sanitize_agent_error_reason_surfaced_verbatim():
"""A curated provider reason is shown to the user, not collapsed to the
exception class name. This is the internal#211 regression: a 403
org-disabled message must reach the canvas."""
reason = (
"provider HTTP 403 — oauth_org_not_allowed — Your organization has "
"disabled Claude subscription access for Claude Code · Use an "
"Anthropic API key instead, or ask your admin to enable access"
)
class _ResultErr(Exception):
pass
out = sanitize_agent_error(exc=_ResultErr("opaque"), reason=reason)
# The actionable provider guidance and status code must be visible.
assert "403" in out
assert "oauth_org_not_allowed" in out
assert "disabled Claude subscription access" in out
assert "ask your admin to enable access" in out
# NOT the old opaque form.
assert "see workspace logs" not in out
def test_sanitize_agent_error_reason_still_scrubs_secrets():
"""Even on the reason path the key/token scrubber runs — a buggy caller
that lets a bearer token into the reason still gets it redacted."""
leaky = (
"provider HTTP 401 — auth failed — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm please re-auth"
)
out = sanitize_agent_error(reason=leaky)
assert "[REDACTED]" in out
assert "PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm" not in out
# The non-secret guidance still survives the scrub.
assert "401" in out
assert "please re-auth" in out
def test_sanitize_agent_error_reason_scrubs_all_secret_formats():
"""The scrubber must redact every realistic credential shape — not just
the `Bearer <tok>` form the original test happened to exercise
(internal#212 review finding: bare `sk-ant-api03-…` keys, JSON-quoted
"token"/"apiKey" values, and `aws_secret_access_key=` all leaked).
All curated/actionable guidance must still survive the scrub.
"""
# 1. Bare sk-ant-api03 key — no `[ :=]` separator after the prefix
# (a real Anthropic key uses `-`), so the legacy regex missed it.
bare = (
"provider HTTP 401 — auth failed — invalid key "
"sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789 "
"please re-auth"
)
out = sanitize_agent_error(reason=bare)
assert "sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789" not in out
assert "[REDACTED]" in out
assert "401" in out # actionable status survives
assert "please re-auth" in out # actionable guidance survives
# 2. JSON-quoted "token" / "apiKey" values.
jblob = (
'provider error — config dump {"token": '
'"abcDEF0123456789ghIJKL0123456789mnopQRST", "apiKey": '
'"anon_fakefakefakefakefakefakefakefakefakefake"} — '
"use an API key instead"
)
out = sanitize_agent_error(reason=jblob)
assert "abcDEF0123456789ghIJKL0123456789mnopQRST" not in out
assert "anon_fakefakefakefakefakefakefakefakefakefake" not in out
assert "[REDACTED]" in out
assert "use an API key instead" in out # actionable guidance survives
# 3. aws_secret_access_key=… form.
awsblob = (
"provider HTTP 403 — boto credential error "
"aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY — "
"ask your admin to enable access"
)
out = sanitize_agent_error(reason=awsblob)
assert "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" not in out
assert "[REDACTED]" in out
assert "403" in out # actionable status survives
assert "ask your admin to enable access" in out # guidance survives
# 4. Regression: the original Bearer form still redacts.
# Uses PLACEHOLDER_LONG_TOKEN (>=40 chars, no sk-ant- prefix) to avoid
# triggering the secret-scan workflow pattern
# `sk-ant-[A-Za-z0-9_-]{40,}`.
bearer = (
"provider HTTP 401 — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij re-auth"
)
out = sanitize_agent_error(reason=bearer)
assert "PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij" not in out
assert "[REDACTED]" in out
assert "re-auth" in out
def test_sanitize_agent_error_reason_wins_over_stderr():
"""When both reason and stderr are passed, the curated reason wins."""
out = sanitize_agent_error(
reason="provider HTTP 403 — use an API key",
stderr="raw subprocess noise that should not be shown",
)
assert "use an API key" in out
assert "raw subprocess noise" not in out
def test_sanitize_agent_error_no_reason_unchanged():
"""Omitting reason preserves the original generic behavior."""
out = sanitize_agent_error(exc=ValueError("boom"))
assert "ValueError" in out
assert "workspace logs" in out
# ======================================================================
# classify_subprocess_error