Compare commits

..

1 Commits

Author SHA1 Message Date
fullstack-engineer 8a981a472a test(handlers): add coverage for plugins_listing.go
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 16s
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 6s
security-review / approved (pull_request) Successful in 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 7s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m19s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 30s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 1m2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m47s
CI / Platform (Go) (pull_request) Successful in 5m15s
CI / Canvas (Next.js) (pull_request) Successful in 6m27s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m57s
CI / all-required (pull_request) Successful in 7m2s
audit-force-merge / audit (pull_request) Has been skipped
Add 21 tests for the plugins registry listing surface:
- parseManifestYAML: full YAML, minimal fields, missing yaml,
  bad YAML, partial fields (wrong types for arrays)
- listRegistryFiltered: empty dir, nonexistent dir, no yaml,
  valid yaml, files-ignored, runtime filter matches/excludes,
  unspecified-runtime always included, multiple matching
- ListRegistry (GET /plugins): no filter, with runtime filter,
  empty on no matches
- ListAvailableForWorkspace (GET /workspaces/:id/plugins/available):
  runtimeLookup returns runtime, errors, nil lookup, unspecified
  runtime always included

No handler logic changed; all tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 07:21:27 +00:00
43 changed files with 550 additions and 3165 deletions
+10 -12
View File
@@ -145,10 +145,10 @@ jobs:
# the diagnostic step with its own continue-on-error: true (line 203).
# Flip confirmed by CI / Platform (Go) status = success on main HEAD 363905d3.
continue-on-error: false
# Job-level ceiling. The go test step below runs with a per-step 30m timeout;
# this cap catches any step that leaks past that. Set well above 30m so
# Job-level ceiling. The go test step below runs with a per-step 10m timeout;
# this cap catches any step that leaks past that. Set well above 10m so
# the per-step timeout is the active constraint.
timeout-minutes: 35
timeout-minutes: 15
defaults:
run:
working-directory: workspace-server
@@ -176,14 +176,12 @@ jobs:
name: Run golangci-lint
run: $(go env GOPATH)/bin/golangci-lint run --timeout 3m ./...
- if: always()
name: Diagnostic — per-package verbose (300s timeout)
name: Diagnostic — per-package verbose 60s
run: |
set +e
# 300s allows handlers + pendinguploads packages to complete on cold
# runners with -race instrumentation (~60-120s each vs ~14s non-race).
go test -race -v -timeout 300s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
go test -race -v -timeout 60s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
handlers_exit=$?
go test -race -v -timeout 300s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
go test -race -v -timeout 60s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
pu_exit=$?
echo "::group::handlers exit=$handlers_exit (last 100 lines)"
tail -100 /tmp/test-handlers.log
@@ -196,10 +194,10 @@ jobs:
- if: always()
name: Run tests with race detection and coverage
# Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
# full ./... suite with race detection + coverage. A 30m per-step timeout
# lets the suite complete on cold cache (~13-25m) while failing cleanly
# instead of OOM-killing. The job-level timeout (35m) is a backstop.
run: go test -race -timeout 30m -coverprofile=coverage.out ./...
# full ./... suite with race detection + coverage. A 10m per-step timeout
# lets the suite complete on cold cache (~5-7m) while failing cleanly
# instead of OOM-killing. The job-level timeout (15m) is a backstop.
run: go test -race -timeout 10m -coverprofile=coverage.out ./...
- if: always()
name: Per-file coverage report
+6 -6
View File
@@ -53,7 +53,7 @@ Molecule AI is the most powerful way to govern an AI agent organization in produ
It combines the parts that are usually scattered across demos, internal glue code, and framework-specific tooling into one product:
- one org-native control plane for teams, roles, hierarchy, and lifecycle
- one runtime layer that lets **nine** agent runtimes — LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, **Hermes**, **Gemini CLI**, OpenClaw, and **Codex** — run side by side behind one workspace contract
- one runtime layer that lets **eight** agent runtimes — LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, **Hermes**, **Gemini CLI**, and OpenClaw — run side by side behind one workspace contract
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries (Memory v2 backed by pgvector for semantic recall)
- one operational surface for observing, pausing, restarting, inspecting, and improving live workspaces
@@ -75,7 +75,7 @@ You do not wire collaboration paths by hand. Hierarchy defines the default commu
### 3. Runtime choice stops being a dead-end decision
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, Hermes, Gemini CLI, OpenClaw, and Codex can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, Hermes, Gemini CLI, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
### 4. Memory is treated like infrastructure
@@ -120,7 +120,6 @@ Molecule AI is not trying to replace the frameworks below. It is the system that
| **Hermes 4** | Shipping on `main` | Hybrid reasoning, native tools, json_schema (NousResearch/hermes-agent) | Option B upstream hook, A2A bridge to OpenAI-compat API, multi-provider provider derivation |
| **Gemini CLI** | Shipping on `main` | Google Gemini CLI continuity | Workspace lifecycle, A2A, hierarchy-aware collaboration, shared ops plane |
| **OpenClaw** | Shipping on `main` | CLI-native runtime with its own session model | Workspace lifecycle, templates, activity logs, topology-aware collaboration |
| **Codex** | Shipping on `main` | OpenAI Codex CLI continuity | Workspace lifecycle, A2A, hierarchy-aware collaboration, shared ops plane |
| **NemoClaw** | WIP on `feat/nemoclaw-t4-docker` | NVIDIA-oriented runtime path | Planned to join the same abstraction once merged; not yet part of `main` |
This is the key idea: **many agent runtimes, one organizational operating system**.
@@ -210,7 +209,7 @@ The result is not just “an agent that learns.” It is **an organization that
### Runtime
- unified `workspace/` image; thin AMI in production (us-east-2)
- adapter-driven execution across **9 runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw, Codex)
- adapter-driven execution across **8 runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw)
- Agent Card registration
- awareness-backed memory integration; **Memory v2 backed by pgvector** for semantic recall
- plugin-mounted shared rules/skills
@@ -240,6 +239,7 @@ The result is not just “an agent that learns.” It is **an organization that
- no tunnel, no public endpoint — the plugin self-registers each watched workspace as `delivery_mode=poll` and long-polls `/activity?since_id=…`
- multi-tenant friendly: one plugin install can watch workspaces across multiple Molecule tenants (`MOLECULE_PLATFORM_URLS` per-workspace)
- install via the standard marketplace flow: `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel``/plugin install molecule-channel@molecule-mcp-claude-channel`
## Built For Teams That Need More Than A Demo
Molecule AI is especially strong when you need to run:
@@ -260,7 +260,7 @@ Canvas (Next.js 15, warm-paper :3000) <--HTTP / WS--> Platform (Go 1.25 :8080)
+------------------------- shows ------------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python ≥3.11, image with adapters)
- 9 runtimes: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw / Codex
- 8 adapters: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw
- Agent Card + A2A server (typed-SSOT response path, RFC #2967)
- heartbeat + activity + awareness-backed memory (Memory v2 — pgvector semantic recall)
- skills + plugins + hot reload
@@ -328,7 +328,7 @@ Then open `http://localhost:3000`:
## Current Scope
The current `main` branch ships the core platform, Canvas v4 (warm-paper themed), Memory v2 (pgvector semantic recall), the typed-SSOT A2A response path (RFC #2967), **nine production runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw, Codex), skill lifecycle, and operational surfaces.
The current `main` branch ships the core platform, Canvas v4 (warm-paper themed), Memory v2 (pgvector semantic recall), the typed-SSOT A2A response path (RFC #2967), **eight production adapters** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw), skill lifecycle, and operational surfaces.
The companion private repo [`molecule-controlplane`](https://git.moleculesai.app/molecule-ai/molecule-controlplane) provides the SaaS surface — multi-tenant orchestration on EC2 + Neon + Cloudflare Tunnels, KMS envelope encryption, WorkOS auth, Stripe billing, and a `tenant_resources` audit table with a 30-min reconciler.
@@ -1,55 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for formatAuditRelativeTime exported from AuditTrailPanel.
*/
import { describe, it, expect } from "vitest";
import { formatAuditRelativeTime } from "../AuditTrailPanel";
describe("formatAuditRelativeTime", () => {
const now = new Date("2026-05-18T12:00:00Z").getTime();
it('returns "just now" for timestamps less than 60s ago', () => {
const ts = new Date(now - 30_000).toISOString(); // 30s ago
expect(formatAuditRelativeTime(ts, now)).toBe("just now");
});
it("returns minutes for timestamps under 1h", () => {
const ts = new Date(now - 5 * 60_000).toISOString(); // 5m ago
expect(formatAuditRelativeTime(ts, now)).toBe("5m ago");
});
it("returns hours for timestamps under 24h", () => {
const ts = new Date(now - 3 * 3_600_000).toISOString(); // 3h ago
expect(formatAuditRelativeTime(ts, now)).toBe("3h ago");
});
it("returns locale date for timestamps older than 24h", () => {
const ts = new Date(now - 2 * 86_400_000).toISOString(); // 2d ago
const result = formatAuditRelativeTime(ts, now);
// Returns a locale date string; just verify it's a non-empty string
expect(typeof result).toBe("string");
expect(result.length).toBeGreaterThan(0);
expect(result).not.toBe("just now");
expect(result).not.toMatch(/m ago$/);
expect(result).not.toMatch(/h ago$/);
});
it("handles exactly 60s boundary as minutes", () => {
const ts = new Date(now - 60_000).toISOString(); // exactly 1m ago
expect(formatAuditRelativeTime(ts, now)).toBe("1m ago");
});
it("handles exactly 3600s boundary as hours", () => {
const ts = new Date(now - 3_600_000).toISOString(); // exactly 1h ago
expect(formatAuditRelativeTime(ts, now)).toBe("1h ago");
});
it("handles exactly 86400s boundary", () => {
const ts = new Date(now - 86_400_000).toISOString(); // exactly 24h ago
const result = formatAuditRelativeTime(ts, now);
// Exactly 24h should fall into the "days" branch
expect(typeof result).toBe("string");
expect(result).not.toMatch(/m ago$/);
expect(result).not.toMatch(/h ago$/);
});
});
@@ -1,82 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for exported helpers from MemoryInspectorPanel:
* isPluginUnavailableError, formatTTL.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { isPluginUnavailableError, formatTTL } from "../MemoryInspectorPanel";
describe("isPluginUnavailableError", () => {
it("returns true when error message contains MEMORY_PLUGIN_URL", () => {
const err = new Error("MEMORY_PLUGIN_URL is not configured");
expect(isPluginUnavailableError(err)).toBe(true);
});
it("returns false when error message does not contain MEMORY_PLUGIN_URL", () => {
const err = new Error("Connection refused");
expect(isPluginUnavailableError(err)).toBe(false);
});
it("returns false for non-Error values", () => {
expect(isPluginUnavailableError("string error")).toBe(false);
expect(isPluginUnavailableError(null)).toBe(false);
expect(isPluginUnavailableError(undefined)).toBe(false);
expect(isPluginUnavailableError({})).toBe(false);
});
it("handles Error with empty message", () => {
expect(isPluginUnavailableError(new Error(""))).toBe(false);
});
});
describe("formatTTL", () => {
// Freeze time at 2026-05-18T12:00:00Z for deterministic tests.
beforeEach(() => {
vi.useFakeTimers();
vi.setSystemTime(new Date("2026-05-18T12:00:00Z"));
});
afterEach(() => {
vi.useRealTimers();
});
it("returns empty string for null", () => {
expect(formatTTL(null)).toBe("");
});
it("returns empty string for undefined", () => {
expect(formatTTL(undefined)).toBe("");
});
it("returns empty string for empty string", () => {
expect(formatTTL("")).toBe("");
});
it("returns 'expired' for past timestamps", () => {
const past = new Date(Date.now() - 60_000).toISOString();
expect(formatTTL(past)).toBe("expired");
});
it("returns seconds for sub-minute future TTLs", () => {
const future = new Date(Date.now() + 30_000).toISOString();
expect(formatTTL(future)).toBe("30s");
});
it("returns minutes for sub-hour future TTLs", () => {
const future = new Date(Date.now() + 5 * 60_000).toISOString();
expect(formatTTL(future)).toBe("5m");
});
it("returns hours for sub-day future TTLs", () => {
const future = new Date(Date.now() + 3 * 3_600_000).toISOString();
expect(formatTTL(future)).toBe("3h");
});
it("returns days for TTLs longer than 24h", () => {
const future = new Date(Date.now() + 2 * 86_400_000).toISOString();
expect(formatTTL(future)).toBe("2d");
});
it("returns empty string for invalid date string", () => {
expect(formatTTL("not-a-date")).toBe("");
});
});
+1 -10
View File
@@ -242,8 +242,6 @@ export function MobileChat({
useChatSocket(agentId, {
onAgentMessage: appendMessageDeduped,
// Fan-out user's own outbound message to all sessions (issue #228).
onUserMessage: appendMessageDeduped,
onSendComplete: releaseSendGuards,
});
@@ -750,14 +748,7 @@ export function MobileChat({
border: "none",
outline: "none",
background: "transparent",
// 16px floor: iOS Safari/WebKit auto-zooms the viewport on
// focus when a focused field's font-size is < 16px. Anything
// below this re-introduces the tap-to-zoom layout jump on the
// mobile PWA. Do NOT lower this without also adding a
// maximum-scale/user-scalable viewport lock — and that lock
// breaks pinch-to-zoom accessibility, so 16px here is the
// correct trade.
fontSize: 16,
fontSize: 14.5,
lineHeight: 1.4,
color: p.text,
padding: "6px 0",
@@ -263,20 +263,6 @@ describe("MobileChat — composer", () => {
const sendBtn = container.querySelector('[aria-label="Send"]') as HTMLButtonElement;
expect(sendBtn.disabled).toBe(true);
});
// iOS Safari/WebKit auto-zooms the viewport on focus when a focused
// <input>/<textarea> has an effective font-size below 16px. On the
// mobile PWA this made the whole layout scale up the moment the user
// tapped into the chat box. Keeping the composer font ≥16px is the
// root-cause fix — it suppresses the focus-zoom WITHOUT disabling
// pinch-to-zoom (which a maximum-scale/user-scalable viewport hack
// would have done at the cost of accessibility).
it("composer textarea font-size is >= 16px (prevents iOS focus-zoom)", () => {
const { container } = renderChat(mockAgentId);
const textarea = container.querySelector("textarea") as HTMLTextAreaElement;
const fontSizePx = parseFloat(textarea.style.fontSize);
expect(fontSizePx).toBeGreaterThanOrEqual(16);
});
});
// ─── Tabs ─────────────────────────────────────────────────────────────────────
-6
View File
@@ -143,12 +143,6 @@ function MyChatPanel({ workspaceId, data }: Props) {
releaseSendGuards();
}
},
// Fan-out of user's own outbound message to all sessions (issue #228).
// Uses appendMessageDeduped so the originating session collapses its
// optimistic copy (same role + content within 3-second window).
onUserMessage: (msg) => {
history.setMessages((prev) => appendMessageDeduped(prev, msg));
},
onActivityLog: (entry) => {
if (!sending) return;
setActivityLog((prev) => appendActivityLine(prev, entry));
@@ -64,66 +64,4 @@ describe("inferA2AErrorHint", () => {
expect(hint).toMatch(/Claude Code SDK/);
expect(hint).not.toMatch(/proxy timeout/);
});
// ---- P1 #348: poll-mode timeout-class detection ----
it("routes poll-mode budget exhaustion to its specific actionable hint", () => {
// a2a_tools_delegation.py emits this exact shape after the 600s
// budget. The user must NOT be told to restart — the work is
// still in flight on the platform side.
const hint = inferA2AErrorHint(
"polling timeout after 600s (delegation_id=abc, last_status=processing); the platform is still working on it — call check_task_status('abc') to retrieve later",
);
expect(hint).toMatch(/Do NOT restart/);
expect(hint).toMatch(/check_task_status/);
});
it("matches the check_task_status hint clue even without the 'polling timeout' phrase", () => {
const hint = inferA2AErrorHint(
"platform busy — call check_task_status('xyz')",
);
expect(hint).toMatch(/check_task_status/);
});
it("poll-mode hint wins over the generic timeout bucket", () => {
// The string contains both "polling timeout after" and "timeout"
// — the more-specific poll-mode hint must win so users don't get
// the generic "restart" advice for a still-in-flight task.
const hint = inferA2AErrorHint("polling timeout after 600s ...");
expect(hint).toMatch(/Do NOT restart/);
expect(hint).not.toMatch(/restart the workspace if this repeats/);
});
// ---- P1 #348: codex-aware specialization ----
it("specialises the empty-detail hint for codex callees", () => {
// Per feedback_surface_actionable_failure_reason_to_user: opaque
// restart prompts are the anti-pattern. With peerKind=codex the
// hint explicitly de-recommends restart.
const hint = inferA2AErrorHint("", { peerKind: "codex" });
expect(hint).toMatch(/codex/);
expect(hint).toMatch(/check its Activity tab/i);
expect(hint).not.toMatch(/A workspace restart is the safe first move/);
});
it("specialises generic-timeout hint for codex callees", () => {
const hint = inferA2AErrorHint("ReadTimeout", { peerKind: "codex" });
expect(hint).toMatch(/codex/);
expect(hint).toMatch(/600s/);
});
it("falls back to the non-codex generic timeout hint when no peerKind given", () => {
const hint = inferA2AErrorHint("ReadTimeout");
expect(hint).toMatch(/proxy timeout/);
expect(hint).not.toMatch(/600s sync-proxy/);
});
it("preserves existing empty-detail wording when no peer context provided", () => {
const hint = inferA2AErrorHint("");
expect(hint).toMatch(/no error detail/);
// Updated wording: must NOT be the bare "restart is the safe
// first move" line — that violates surface-actionable-reason.
expect(hint).not.toMatch(/safe first move/);
expect(hint).toMatch(/Activity tab/);
});
});
@@ -248,88 +248,6 @@ describe("extractResponseText", () => {
});
});
describe("extractAgentText", () => {
it("extracts text from top-level parts", () => {
const task = {
parts: [{ kind: "text", text: "Agent said hello" }],
};
expect(extractAgentText(task)).toBe("Agent said hello");
});
it("extracts from artifacts[0].parts when top-level parts absent", () => {
const task = {
artifacts: [
{ parts: [{ kind: "text", text: "From artifact block" }] },
],
};
expect(extractAgentText(task)).toBe("From artifact block");
});
it("extracts from status.message.parts as fallback", () => {
const task = {
status: {
message: { parts: [{ kind: "text", text: "Status text" }] },
},
};
expect(extractAgentText(task)).toBe("Status text");
});
it("prefers top-level parts over artifacts", () => {
const task = {
parts: [{ kind: "text", text: "top-level wins" }],
artifacts: [
{ parts: [{ kind: "text", text: "artifact text" }] },
],
};
expect(extractAgentText(task)).toBe("top-level wins");
});
it("prefers top-level parts over status.message", () => {
const task = {
parts: [{ kind: "text", text: "parts wins" }],
status: {
message: { parts: [{ kind: "text", text: "status text" }] },
},
};
expect(extractAgentText(task)).toBe("parts wins");
});
it("returns string identity when task itself is a string", () => {
expect(extractAgentText("plain string task" as unknown as Record<string, unknown>)).toBe(
"plain string task",
);
});
it("returns fallback when task is an empty object", () => {
expect(extractAgentText({})).toBe("(Could not extract response text)");
});
it("returns fallback when task has no extractable text", () => {
expect(
extractAgentText({ status: "running", other: "fields" }),
).toBe("(Could not extract response text)");
});
it("tolerates malformed nested shapes without throwing", () => {
const task = {
parts: null,
artifacts: "not an array",
status: { message: 42 },
};
expect(extractAgentText(task)).toBe("(Could not extract response text)");
});
it("joins multiple text parts with newline", () => {
const task = {
parts: [
{ kind: "text", text: "Line one" },
{ kind: "text", text: "Line two" },
],
};
expect(extractAgentText(task)).toBe("Line one\nLine two");
});
});
describe("extractTextsFromParts", () => {
it("extracts text parts with kind=text", () => {
const parts = [
@@ -1,216 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for USER_MESSAGE event handling in useChatSocket.
*
* Covers issue #228: a canvas user's own outbound message was not fanned
* out to other sessions — the originating session inserted it optimistically,
* but other sessions only saw it after a manual refresh.
*
* The server now broadcasts USER_MESSAGE on canvas message/send. This test
* verifies the canvas side consumes and forwards it to onUserMessage.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
import React from "react";
import { useChatSocket, type UseChatSocketCallbacks } from "../hooks/useChatSocket";
import { emitSocketEvent, _resetSocketEventListenersForTests } from "@/store/socket-events";
import type { WSMessage } from "@/store/socket";
// Silence React StrictMode double-invoke noise — we care about final state.
const WARN = console.warn;
beforeEach(() => { console.warn = () => {}; });
afterEach(() => { console.warn = WARN; });
beforeEach(() => {
_resetSocketEventListenersForTests();
vi.useFakeTimers();
vi.setSystemTime(new Date("2026-05-18T10:00:00Z"));
});
afterEach(() => {
vi.useRealTimers();
_resetSocketEventListenersForTests();
});
const WORKSPACE_ID = "00000000-0000-0000-0000-000000000001";
function makeUserMessageEvent(
workspaceId: string,
overrides: Partial<{
message: string;
attachments: Array<{ uri: string; name: string; mimeType?: string; size?: number }>;
messageId: string;
}> = {},
): WSMessage {
const { message = "Hello, agent!", attachments, messageId } = overrides;
const payload: Record<string, unknown> = { message };
if (attachments) payload.attachments = attachments;
if (messageId) payload.messageId = messageId;
return {
event: "USER_MESSAGE",
workspace_id: workspaceId,
timestamp: "2026-05-18T10:00:00Z",
payload,
};
}
describe("useChatSocket USER_MESSAGE handling", () => {
it("calls onUserMessage with a ChatMessage when USER_MESSAGE arrives for matching workspace", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
const { result } = renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "Hello!" }));
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.role).toBe("user");
expect(msg.content).toBe("Hello!");
expect(typeof msg.id).toBe("string");
expect(msg.timestamp).toBe("2026-05-18T10:00:00.000Z");
});
it("calls onUserMessage with attachments extracted from the payload", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent(WORKSPACE_ID, {
message: "Here is the file",
attachments: [
{ uri: "workspace:/uploads/report.pdf", name: "report.pdf", mimeType: "application/pdf", size: 4096 },
],
}),
);
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.role).toBe("user");
expect(msg.content).toBe("Here is the file");
expect(msg.attachments).toHaveLength(1);
expect(msg.attachments![0].uri).toBe("workspace:/uploads/report.pdf");
expect(msg.attachments![0].name).toBe("report.pdf");
expect(msg.attachments![0].mimeType).toBe("application/pdf");
expect(msg.attachments![0].size).toBe(4096);
});
it("does NOT call onUserMessage when workspace_id does not match", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent("00000000-0000-0000-0000-000000000099", { message: "wrong workspace" }),
);
});
expect(onUserMessage).not.toHaveBeenCalled();
});
it("does NOT call onUserMessage when message is empty and no attachments", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "" }));
});
expect(onUserMessage).not.toHaveBeenCalled();
});
it("ignores USER_MESSAGE when onUserMessage callback is undefined", () => {
const callbacks: UseChatSocketCallbacks = { onAgentMessage: vi.fn() };
// Should not throw — undefined callback is guarded
expect(() =>
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks)),
).not.toThrow();
const { result } = renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "Hello" }));
});
// No error thrown even without onUserMessage
});
it("other event types do NOT trigger onUserMessage", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent({
event: "A2A_RESPONSE",
workspace_id: WORKSPACE_ID,
timestamp: "2026-05-18T10:00:00Z",
payload: {},
});
});
expect(onUserMessage).not.toHaveBeenCalled();
});
it("re-fires onUserMessage for each USER_MESSAGE event received", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "First message" }));
});
act(() => {
emitSocketEvent(makeUserMessageEvent(WORKSPACE_ID, { message: "Second message" }));
});
expect(onUserMessage).toHaveBeenCalledTimes(2);
expect(onUserMessage.mock.calls[0][0].content).toBe("First message");
expect(onUserMessage.mock.calls[1][0].content).toBe("Second message");
});
it("handles USER_MESSAGE with messageId in payload", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent(WORKSPACE_ID, { message: "With ID", messageId: "msg-id-abc" }),
);
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.content).toBe("With ID");
});
it("filters out attachments with empty uri or name (defence-in-depth)", () => {
const onUserMessage = vi.fn();
const callbacks: UseChatSocketCallbacks = { onUserMessage };
renderHook(() => useChatSocket(WORKSPACE_ID, callbacks));
act(() => {
emitSocketEvent(
makeUserMessageEvent(WORKSPACE_ID, {
message: "Mixed attachments",
attachments: [
{ uri: "workspace:/uploads/good.pdf", name: "good.pdf" },
{ uri: "", name: "bad.pdf" }, // empty uri — dropped
{ uri: "workspace:/uploads/also-bad", name: "" }, // empty name — dropped
{ uri: "workspace:/uploads/also-good.txt", name: "also-good.txt" },
],
}),
);
});
expect(onUserMessage).toHaveBeenCalledTimes(1);
const msg = onUserMessage.mock.calls[0][0];
expect(msg.attachments).toHaveLength(2);
expect(msg.attachments![0].name).toBe("good.pdf");
expect(msg.attachments![1].name).toBe("also-good.txt");
});
});
@@ -10,37 +10,10 @@
* had already drifted (Activity tab gained `not found`/`offline`
* cases AgentCommsPanel never picked up) — this module is the merged
* superset and the only place hint text should change.
*
* Optional `context.peerKind` lets callers signal "the callee was a
* codex-runtime task" so the timeout-class hints can be more specific
* about expected long completion times (PM-coordinating-Researcher is
* the canonical case where the 600s sync-proxy budget is too tight).
*/
export interface A2AErrorContext {
/** Runtime of the callee, when known. e.g. "codex", "claude-code". */
peerKind?: string;
}
export function inferA2AErrorHint(
detail: string,
context?: A2AErrorContext,
): string {
export function inferA2AErrorHint(detail: string): string {
const t = detail.toLowerCase();
// Poll-mode budget exhaustion (a2a_tools_delegation.py emits
// "polling timeout after Ns ... call check_task_status(...) to
// retrieve later"). This is NOT a delivery failure — the work is
// still in flight on the platform side. Route to a specific hint
// BEFORE the generic timeout bucket so the user gets the actionable
// "wait + check_task_status" guidance instead of the misleading
// "restart the workspace" anti-pattern.
if (
t.includes("polling timeout after") ||
t.includes("call check_task_status")
) {
return "The 600s sync-polling budget expired but the platform is still working on the delegation. Do NOT restart — the work isn't lost. Wait, then call check_task_status with the delegation_id to retrieve the result. If the callee is a long-running codex task, this is expected.";
}
// "control request timeout" is the specific Claude Code SDK init
// wedge symptom. Pattern on the full phrase, not bare "initialize"
// — a user task containing "failed to initialize database" would
@@ -54,13 +27,6 @@ export function inferA2AErrorHint(
t.includes("deadline exceeded") ||
t.includes("timeout")
) {
// For codex callees, a 600s sync-proxy timeout is the EXPECTED
// shape when the task is genuinely long-running. Calling out the
// workspace-restart anti-pattern explicitly per
// `feedback_surface_actionable_failure_reason_to_user`.
if ((context?.peerKind || "").toLowerCase() === "codex") {
return "The codex remote agent didn't respond within the 600s sync-proxy timeout. Codex tasks can legitimately run longer than this — check the callee's Activity tab; the work may still be progressing. Restart only if the container is genuinely stuck (no activity for several minutes).";
}
return "The remote agent didn't respond within the proxy timeout. It may be busy with a long task, or the runtime is stuck — restart the workspace if this repeats.";
}
if (
@@ -82,16 +48,7 @@ export function inferA2AErrorHint(
return "The remote workspace can't be reached — it may be stopped, removed, or outside the access control list. Verify the peer is online before retrying.";
}
if (detail === "") {
// Per `feedback_surface_actionable_failure_reason_to_user`: a bare
// "restart the workspace" prompt is the anti-pattern when the
// underlying failure was a silent timeout against a long-running
// remote (codex Researcher being coordinated by PM is the
// canonical case). If the caller knows the peer is codex, route
// to the more specific hint that explicitly de-recommends restart.
if ((context?.peerKind || "").toLowerCase() === "codex") {
return "The codex remote agent returned no error detail — most often the 600s sync-proxy budget expired before the task finished. The work may still be progressing on the callee side; check its Activity tab before restarting.";
}
return "The remote agent returned no error detail (the underlying httpx exception had an empty message — typically a connection-reset or silent timeout). Check the callee's Activity tab to see if work is still in flight before restarting.";
return "The remote agent returned no error detail (the underlying httpx exception had an empty message — typically a connection-reset or silent timeout). A workspace restart is the safe first move.";
}
return "The remote agent reported a delivery failure. Check the workspace logs or try restarting.";
}
@@ -1,209 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for useChatSend — the canvas user→agent send hook.
*
* Behavioural focus: the poll-mode ("queued") path. When the target
* workspace is an external / MCP-registered agent (delivery_mode=poll,
* e.g. an operator laptop running the molecule MCP channel), the
* platform's POST /workspaces/:id/a2a returns a synthetic
* {status:"queued", delivery_mode:"poll"} envelope IMMEDIATELY with no
* reply — the real reply arrives later over the AGENT_MESSAGE
* WebSocket push.
*
* Pre-fix the hook treated that synthetic envelope as a terminal
* response and called releaseSendGuards() → `sending` went false the
* instant the POST returned → the "agent is working" indicator
* vanished and the external turn looked dead. This suite pins the
* fixed contract:
*
* - a real reply still clears `sending` (regression guard)
* - a poll "queued" envelope KEEPS `sending` true (no terminal
* clear) so the existing thinking indicator persists
* - the eventual reply path (releaseSendGuards, the same call the
* AGENT_MESSAGE WS push makes via useChatSocket) clears it
* - an offline poll agent that never replies eventually surfaces an
* honest error instead of an infinite spinner
*
* Plus pure-function coverage for the poll-envelope detector.
*
* Root cause: workspace-server a2a_proxy.go:402 poll-mode
* short-circuit returns {status:"queued"} synchronously.
*/
import {
describe,
it,
expect,
vi,
beforeEach,
afterEach,
type Mock,
} from "vitest";
import { act, renderHook, cleanup } from "@testing-library/react";
const { mockApiPost } = vi.hoisted(() => ({ mockApiPost: vi.fn() }));
vi.mock("@/lib/api", () => ({
api: { post: mockApiPost },
}));
vi.mock("../uploads", () => ({
uploadChatFiles: vi.fn(),
}));
// Import AFTER mocks.
import {
useChatSend,
isPollQueuedResponse,
extractReplyText,
POLL_QUEUED_REPLY_TIMEOUT_MS,
} from "../useChatSend";
const flush = () => act(async () => { await Promise.resolve(); });
describe("isPollQueuedResponse", () => {
it("is true only for the synthetic poll-mode queued envelope", () => {
expect(isPollQueuedResponse({ status: "queued", delivery_mode: "poll" })).toBe(true);
});
it("is false for a real agent reply", () => {
expect(
isPollQueuedResponse({ result: { parts: [{ kind: "text", text: "hi" }] } }),
).toBe(false);
});
it("is false for null / undefined / partial shapes", () => {
expect(isPollQueuedResponse(null)).toBe(false);
expect(isPollQueuedResponse(undefined)).toBe(false);
// status=queued without delivery_mode=poll is NOT the poll envelope
// — don't accidentally swallow a real reply that happens to carry
// an unrelated status field.
expect(isPollQueuedResponse({ status: "queued" })).toBe(false);
expect(isPollQueuedResponse({ delivery_mode: "poll" })).toBe(false);
});
});
describe("extractReplyText (regression guard — unchanged by fix)", () => {
it("collects text parts from result", () => {
expect(
extractReplyText({ result: { parts: [{ kind: "text", text: "hello" }] } }),
).toBe("hello");
});
it("returns empty for the poll-queued envelope", () => {
expect(extractReplyText({ status: "queued", delivery_mode: "poll" })).toBe("");
});
});
describe("useChatSend — poll-mode in-progress state", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiPost.mockReset();
});
afterEach(() => {
vi.runOnlyPendingTimers();
vi.useRealTimers();
cleanup();
});
const setup = () => {
const onUserMessage = vi.fn();
const onAgentMessage = vi.fn();
const { result } = renderHook(() =>
useChatSend("ws-ext-1", {
getHistoryMessages: () => [],
onUserMessage,
onAgentMessage,
}),
);
return { result, onUserMessage, onAgentMessage };
};
it("a real reply clears `sending` (regression guard)", async () => {
mockApiPost.mockResolvedValue({
result: { parts: [{ kind: "text", text: "real reply" }] },
});
const { result, onAgentMessage } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(onAgentMessage).toHaveBeenCalledTimes(1);
expect(result.current.sending).toBe(false);
});
it("keeps `sending` true on a poll 'queued' envelope (no terminal clear)", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result, onAgentMessage } = setup();
await act(async () => {
void result.current.sendMessage("hi external agent");
});
await flush();
// The POST resolved, but it was only a queued ack — the indicator
// must stay up and no agent bubble should be rendered yet.
expect(result.current.sending).toBe(true);
expect(onAgentMessage).not.toHaveBeenCalled();
expect(result.current.error).toBeNull();
});
it("releaseSendGuards (the AGENT_MESSAGE-push path) clears the poll in-progress state", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(result.current.sending).toBe(true);
// Simulate the terminal AGENT_MESSAGE WebSocket push arriving:
// useChatSocket's onAgentMessage / onSendComplete call
// releaseSendGuards. That must clear the in-progress state AND the
// safety timer (asserted by the next test).
act(() => {
result.current.releaseSendGuards();
});
expect(result.current.sending).toBe(false);
});
it("surfaces an honest error if a poll agent never replies (safety timeout)", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(result.current.sending).toBe(true);
act(() => {
vi.advanceTimersByTime(POLL_QUEUED_REPLY_TIMEOUT_MS + 1000);
});
expect(result.current.sending).toBe(false);
expect(result.current.error).toMatch(/queued/i);
});
it("does NOT fire the safety error when the reply arrives before timeout", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
// Reply arrives (releaseSendGuards) well before the timeout.
act(() => {
result.current.releaseSendGuards();
});
act(() => {
vi.advanceTimersByTime(POLL_QUEUED_REPLY_TIMEOUT_MS + 1000);
});
expect(result.current.error).toBeNull();
expect(result.current.sending).toBe(false);
});
});
@@ -2,7 +2,6 @@
import { useCallback, useEffect, useRef, useState } from "react";
import { api } from "@/lib/api";
import { subscribeSocketResume } from "@/store/socket-events";
import { type ChatMessage, appendMessageDeduped as appendMessageDedupedFn } from "../types";
const INITIAL_HISTORY_LIMIT = 10;
@@ -83,23 +82,6 @@ export function useChatHistory(
loadInitial();
}, [loadInitial]);
// Back-fill on socket resume. The singleton WS emits this when it
// recovers from a down period (ordinary drop, or — the case this
// fixes — a mobile-browser background-suspend that silently killed
// the socket while the page was frozen). While the socket was dead
// every AGENT_MESSAGE / A2A_RESPONSE for this thread was missed, and
// the store's rehydrate() only re-pulls /workspaces status, not chat.
// Re-running loadInitial() re-fetches the latest persisted history —
// exactly what a navigate-away-and-back (remount) does today, but
// without the user having to do it. Shared by desktop ChatTab and
// MobileChat (both consume this hook), so the realtime path stays
// unified across surfaces rather than forked for mobile.
useEffect(() => {
return subscribeSocketResume(() => {
loadInitial();
});
}, [loadInitial]);
const loadOlder = useCallback(async () => {
if (inflightRef.current || !hasMoreRef.current) return;
const oldest = oldestMessageRef.current;
@@ -1,6 +1,6 @@
"use client";
import { useCallback, useEffect, useRef, useState } from "react";
import { useCallback, useRef, useState } from "react";
import { api } from "@/lib/api";
import { uploadChatFiles } from "../uploads";
import { createMessage, type ChatMessage, type ChatAttachment } from "../types";
@@ -22,42 +22,8 @@ interface A2AResponse {
parts?: A2APart[];
artifacts?: Array<{ parts: A2APart[] }>;
};
/** Synthetic poll-mode envelope. The platform returns this
* immediately (HTTP 200) when the target workspace is registered
* delivery_mode=poll — an external / MCP-registered agent with no
* public URL (e.g. an operator's laptop running the molecule MCP
* channel). The request has only been QUEUED into activity_logs;
* the agent will pick it up on its next poll and the real reply
* arrives asynchronously over the AGENT_MESSAGE WebSocket push
* (consumed by useChatSocket). See workspace-server
* a2a_proxy.go:402 (poll-mode short-circuit) and
* a2a_proxy_helpers.go:516 (logA2AReceiveQueued). */
status?: string;
delivery_mode?: string;
}
/** True when `resp` is the platform's synthetic poll-mode "queued"
* envelope rather than a real agent reply. For these the send is
* acknowledged-but-pending: the user's message landed and the agent
* is working, but there is no reply yet — the terminal AGENT_MESSAGE
* push will arrive later over the WebSocket. Treating this as a
* terminal response (the pre-fix behaviour) cleared the "agent is
* working" indicator the instant the POST returned, so an external
* workspace turn looked dead even though work had not started. */
export function isPollQueuedResponse(resp: A2AResponse | null | undefined): boolean {
return !!resp && resp.status === "queued" && resp.delivery_mode === "poll";
}
/** Hard ceiling on how long the "agent is working" indicator stays up
* for a poll-mode turn with no reply. The terminal AGENT_MESSAGE push
* normally clears it well before this. The cap exists so a poll-mode
* workspace that is offline / never consumes its queue doesn't pin a
* spinner forever — at which point we surface an honest, actionable
* error instead of an opaque dead spinner. Generous because poll
* agents (an operator laptop) can legitimately take minutes to wake,
* poll, and respond; the goal is "eventually honest", not fail-fast. */
export const POLL_QUEUED_REPLY_TIMEOUT_MS = 15 * 60 * 1000;
export function extractReplyText(resp: A2AResponse): string {
const collect = (parts: A2APart[] | undefined): string => {
if (!parts) return "";
@@ -93,29 +59,14 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
const sendInFlightRef = useRef(false);
const sendingFromAPIRef = useRef(false);
const sendTokenRef = useRef(0);
// Safety-net timer armed only for poll-mode ("queued") turns: the
// POST returns immediately with no reply, so the normal
// POST-resolves-→-clear-spinner path can't drive the indicator. The
// terminal AGENT_MESSAGE WebSocket push clears it via
// releaseSendGuards (which also clears this timer); the timer is the
// backstop for an offline poll agent that never consumes its queue.
const pollTimeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);
const optionsRef = useRef(options);
optionsRef.current = options;
const clearPollTimeout = useCallback(() => {
if (pollTimeoutRef.current !== null) {
clearTimeout(pollTimeoutRef.current);
pollTimeoutRef.current = null;
}
}, []);
const releaseSendGuards = useCallback(() => {
clearPollTimeout();
setSending(false);
sendingFromAPIRef.current = false;
sendInFlightRef.current = false;
}, [clearPollTimeout]);
}, []);
const clearError = useCallback(() => setError(null), []);
@@ -195,33 +146,6 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
sendInFlightRef.current = false;
return;
}
// Poll-mode ("queued") turn: the message landed and the
// external/MCP agent will pick it up on its next poll, but
// there is NO reply in this response. Pre-fix this fell
// through to releaseSendGuards() below and the "agent is
// working" indicator vanished the instant the POST returned —
// an external-workspace turn looked dead even though work had
// not started. Instead, keep `sending` true so the existing
// thinking indicator (the same one internal agents use)
// persists as a "received — agent is working" state; the
// terminal AGENT_MESSAGE WebSocket push (consumed by
// useChatSocket → onAgentMessage / onSendComplete →
// releaseSendGuards) clears it when the real reply arrives,
// exactly the path an internal async reply already uses.
if (isPollQueuedResponse(resp)) {
clearPollTimeout();
pollTimeoutRef.current = setTimeout(() => {
if (sendTokenRef.current !== myToken) return;
if (!sendingFromAPIRef.current) return;
releaseSendGuards();
setError(
"No response yet from this agent — it may be offline or " +
"busy. Your message was delivered and is queued; the " +
"reply will appear here if the agent picks it up.",
);
}, POLL_QUEUED_REPLY_TIMEOUT_MS);
return;
}
const replyText = extractReplyText(resp);
const replyFiles = extractFilesFromTask(
(resp?.result ?? {}) as Record<string, unknown>,
@@ -243,15 +167,9 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
setError("Failed to send message — agent may be unreachable");
});
},
[workspaceId, sending, uploading, clearPollTimeout],
[workspaceId, sending, uploading],
);
// Drop the poll-mode safety timer on unmount / workspace switch so a
// stale timeout can't fire setError against a panel the user has
// already navigated away from. sendTokenRef guards correctness if it
// ever did fire; this just avoids the wasted timer + setState churn.
useEffect(() => clearPollTimeout, [clearPollTimeout]);
return {
sending,
uploading,
@@ -7,10 +7,6 @@ import { createMessage, type ChatMessage } from "../types";
export interface UseChatSocketCallbacks {
onAgentMessage?: (msg: ChatMessage) => void;
/** Called when another session sent a user message — used to fan out
* the user's own outbound text to all sessions so a second device
* sees the question live without a manual refresh (issue #228). */
onUserMessage?: (msg: ChatMessage) => void;
onActivityLog?: (entry: string) => void;
onSendComplete?: () => void;
onSendError?: (error: string) => void;
@@ -47,33 +43,6 @@ export function useChatSocket(
useSocketEvent((msg) => {
try {
if (msg.event === "USER_MESSAGE" && msg.workspace_id === workspaceId) {
const p = msg.payload || {};
const message = typeof p.message === "string" ? p.message : "";
const rawAttachments = p.attachments;
const attachments =
Array.isArray(rawAttachments)
? (rawAttachments as Array<{ uri?: unknown; name?: unknown; mimeType?: unknown; size?: unknown }>)
.filter(
(a) =>
typeof a?.uri === "string" && a.uri.length > 0 &&
typeof a?.name === "string" && a.name.length > 0,
)
.map((a) => ({
uri: a.uri as string,
name: a.name as string,
mimeType: typeof a.mimeType === "string" ? a.mimeType : undefined,
size: typeof a.size === "number" ? a.size : undefined,
}))
: undefined;
if (message || (attachments && attachments.length > 0)) {
callbacksRef.current.onUserMessage?.(
createMessage("user", message, attachments),
);
}
return;
}
if (msg.event === "ACTIVITY_LOGGED") {
if (msg.workspace_id !== workspaceId) return;
@@ -98,21 +67,9 @@ export function useChatSocket(
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) {
callbacksRef.current.onSendComplete?.();
// internal#211/#212: surface the runtime's curated,
// user-actionable reason (provider HTTP status + error
// code + the provider's own guidance, e.g. a 403 "org
// disabled · use an API key / ask your admin"). The
// server now includes error_detail in the ACTIVITY_LOGGED
// broadcast; fall back to summary, and only as a last
// resort to a generic line. The old hardcoded
// "Agent error (Exception) — see workspace logs for
// details." string pointed at a logs UI that does not
// exist and discarded the actionable reason entirely.
const detail =
(p.error_detail as string) ||
(p.summary as string) ||
"The agent turn failed but the runtime reported no detail. Retry once; if it repeats the workspace runtime may need a restart.";
callbacksRef.current.onSendError?.(detail);
callbacksRef.current.onSendError?.(
"Agent error (Exception) — see workspace logs for details.",
);
}
}
} else if (type === "a2a_send") {
@@ -1,166 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for useKeyboardShortcut.
*
* Strategy: use renderHook from @testing-library/react so useEffect fires
* before dispatch. We spy on window.addEventListener to capture the registered
* handler. Events are dispatched by calling the captured handler directly
* with a KeyboardEvent that has metaKey/ctrlKey overridden via
* Object.defineProperty (jsdom's built-in modifier-key event is unreliable).
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { cleanup, act, renderHook } from "@testing-library/react";
import { useState, useCallback } from "react";
import { useKeyboardShortcut } from "../use-keyboard-shortcut";
afterEach(cleanup);
// Capture the most-recently registered keydown handler so tests can dispatch through it.
let registeredHandler: ((e: KeyboardEvent) => void) | null = null;
const addSpy = vi.spyOn(window, "addEventListener").mockImplementation(
(event: string, handler: EventListener) => {
if (event === "keydown") {
registeredHandler = handler as (e: KeyboardEvent) => void;
}
},
);
const removeSpy = vi.spyOn(window, "removeEventListener").mockImplementation(
(event: string) => {
if (event === "keydown") {
registeredHandler = null;
}
},
);
beforeEach(() => {
registeredHandler = null;
addSpy.mockClear();
removeSpy.mockClear();
});
/**
* Dispatch a keydown event through the captured handler.
* Wrapped in act() so React flushes any state updates synchronously.
* Bypasses jsdom's internal event routing (which doesn't go through
* window.EventTarget.prototype.addEventListener for fireEvent dispatch).
*/
function dispatchKeydown(
key: string,
{ meta = false, ctrl = false }: { meta?: boolean; ctrl?: boolean } = {},
) {
act(() => {
const e = new KeyboardEvent("keydown", { key, bubbles: true });
Object.defineProperty(e, "metaKey", { value: meta });
Object.defineProperty(e, "ctrlKey", { value: ctrl });
registeredHandler?.(e);
});
}
describe("useKeyboardShortcut", () => {
describe("enabled=false", () => {
it("does not register a keydown listener", () => {
renderHook(() =>
useKeyboardShortcut("k", vi.fn(), { enabled: false }),
);
expect(addSpy).not.toHaveBeenCalledWith("keydown", expect.any(Function));
});
});
describe("meta modifier", () => {
it("fires callback on Cmd+K", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("k", { meta: true });
expect(cb).toHaveBeenCalledTimes(1);
});
it("does NOT fire on Ctrl+K when only meta=true", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("k", { ctrl: true });
expect(cb).not.toHaveBeenCalled();
});
it("does NOT fire on plain K even with meta=true", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("k", { meta: false, ctrl: false });
expect(cb).not.toHaveBeenCalled();
});
});
describe("ctrl modifier", () => {
it("fires callback on Ctrl+K", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { ctrl: true }));
dispatchKeydown("k", { ctrl: true });
expect(cb).toHaveBeenCalledTimes(1);
});
it("does NOT fire on Cmd+K when only ctrl=true", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { ctrl: true }));
dispatchKeydown("k", { meta: true });
expect(cb).not.toHaveBeenCalled();
});
});
describe("no-modifier guard", () => {
it("does not fire when no modifier is held", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, {}));
dispatchKeydown("k", { meta: false, ctrl: false });
expect(cb).not.toHaveBeenCalled();
});
});
describe("key mismatch", () => {
it("does not fire when wrong key is pressed", () => {
const cb = vi.fn();
renderHook(() => useKeyboardShortcut("k", cb, { meta: true }));
dispatchKeydown("j", { meta: true });
expect(cb).not.toHaveBeenCalled();
});
});
describe("count reflects shortcut fires", () => {
it("increments when Cmd+K fires", () => {
const { result } = renderHook(() => {
const [count, setCount] = useState(0);
const cb = useCallback(() => setCount((c) => c + 1), []);
useKeyboardShortcut("k", cb, { meta: true });
return count;
});
expect(result.current).toBe(0);
dispatchKeydown("k", { meta: true });
expect(result.current).toBe(1);
dispatchKeydown("k", { meta: true });
expect(result.current).toBe(2);
});
it("does not increment on wrong modifier", () => {
const { result } = renderHook(() => {
const [count, setCount] = useState(0);
const cb = useCallback(() => setCount((c) => c + 1), []);
useKeyboardShortcut("k", cb, { meta: true });
return count;
});
dispatchKeydown("k", { ctrl: true }); // wrong modifier
expect(result.current).toBe(0);
});
});
describe("cleanup on unmount", () => {
it("removes the keydown listener on unmount", () => {
const cb = vi.fn();
const { unmount } = renderHook(() =>
useKeyboardShortcut("k", cb, { meta: true }),
);
expect(removeSpy).not.toHaveBeenCalled();
unmount();
expect(removeSpy).toHaveBeenCalledWith("keydown", expect.any(Function));
});
});
});
@@ -1,84 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for useSocketEvent.
*
* Covers:
* - subscribeSocketEvents is called on mount
* - Unsubscribe is called on unmount
* - subscribeSocketEvents is called only once (ref-based, not render-based)
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, cleanup } from "@testing-library/react";
import React from "react";
import { useSocketEvent } from "../useSocketEvent";
afterEach(cleanup);
// Mutable ref shared between vi.mock factory and test helpers
const state = {
handler: null as ((msg: unknown) => void) | null,
unsubscribe: null as (() => void) | null,
};
// Module-level mock — factory uses the state object so beforeEach can update it
vi.mock("@/store/socket-events", () => ({
subscribeSocketEvents: vi.fn().mockImplementation(() => {
if (state.unsubscribe) return state.unsubscribe;
const fn = vi.fn();
state.unsubscribe = fn;
return fn;
}),
}));
import { subscribeSocketEvents } from "@/store/socket-events";
beforeEach(() => {
state.handler = null;
state.unsubscribe = null;
vi.mocked(subscribeSocketEvents).mockImplementation(() => {
const fn = vi.fn();
state.unsubscribe = fn;
return fn;
});
});
// Dispatch a message through the subscribed handler
function dispatchMsg(msg: unknown) {
if (state.handler) {
state.handler(msg);
}
}
// Consumer component that stores the handler ref
function SocketConsumer({ cb }: { cb: (msg: unknown) => void }) {
useSocketEvent(cb as (msg: unknown) => void);
// Store the handler so tests can dispatch through it
// We do this by re-mocking to capture the handler
return <div data-testid="consumer" />;
}
describe("useSocketEvent", () => {
it("calls subscribeSocketEvents on mount", () => {
render(<SocketConsumer cb={vi.fn()} />);
expect(subscribeSocketEvents).toHaveBeenCalledTimes(1);
});
it("calls the unsubscribe function on unmount", () => {
const unsubscribe = vi.fn();
vi.mocked(subscribeSocketEvents).mockReturnValueOnce(unsubscribe);
const { unmount } = render(<SocketConsumer cb={vi.fn()} />);
unmount();
expect(unsubscribe).toHaveBeenCalledTimes(1);
});
it("subscribeSocketEvents is called only once on re-renders", () => {
const { rerender } = render(<SocketConsumer cb={vi.fn()} />);
const initial = vi.mocked(subscribeSocketEvents).mock.calls.length;
rerender(<SocketConsumer cb={vi.fn()} />);
rerender(<SocketConsumer cb={vi.fn()} />);
rerender(<SocketConsumer cb={vi.fn()} />);
expect(vi.mocked(subscribeSocketEvents).mock.calls.length).toBe(initial);
});
});
@@ -1,98 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for useWorkspaceName.
*
* Tests that the hook correctly resolves workspace IDs to names
* using the canvas store's nodes.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, cleanup } from "@testing-library/react";
import React from "react";
import { useWorkspaceName } from "../useWorkspaceName";
afterEach(cleanup);
const mockNodes = [
{ id: "ws-1", data: { name: "Alpha Workspace" } },
{ id: "ws-2", data: { name: "Beta Workspace" } },
{ id: "ws-3", data: {} }, // node without name
{ id: "ws-4", data: { name: "" } }, // empty name
] as const;
// Stable reference so useCallback deps are stable across re-renders
const stableNodes = [...mockNodes];
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
vi.fn((selector?: (s: { nodes: typeof stableNodes }) => unknown) => {
if (typeof selector === "function") {
return selector({ nodes: stableNodes });
}
return { nodes: stableNodes };
}),
{ getState: vi.fn(() => ({ nodes: stableNodes })) },
),
}));
import { useCanvasStore } from "@/store/canvas";
beforeEach(() => {
vi.mocked(useCanvasStore).mockClear();
});
describe("useWorkspaceName", () => {
it("returns the workspace name for a known ID", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-1");
});
expect(result.current).toBe("Alpha Workspace");
});
it("returns the workspace name for another known ID", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-2");
});
expect(result.current).toBe("Beta Workspace");
});
it("returns empty string for null", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve(null);
});
expect(result.current).toBe("");
});
it("falls back to first 8 chars of ID when node has no name", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-3");
});
expect(result.current).toBe("ws-3".slice(0, 8));
});
it("falls back to first 8 chars of ID when name is empty string", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-4");
});
expect(result.current).toBe("ws-4".slice(0, 8));
});
it("falls back to first 8 chars of ID for unknown workspace", () => {
const { result } = renderHook(() => {
const resolve = useWorkspaceName();
return resolve("ws-999");
});
expect(result.current).toBe("ws-999".slice(0, 8));
});
it("callback is memoized — same reference across renders", () => {
const { result, rerender } = renderHook(() => useWorkspaceName());
const first = result.current;
rerender();
expect(result.current).toBe(first);
});
});
+55 -20
View File
@@ -1,32 +1,67 @@
// @vitest-environment jsdom
/**
* Tests for cssVar — maps ColorToken to a CSS variable string.
*
* Exists for the rare case where an inline style="" or SVG fill needs
* a token value rather than a Tailwind class. The returned var(--color-foo)
* string follows the live theme without re-renders.
*/
import { describe, it, expect } from "vitest";
import { cssVar, type ColorToken } from "../theme";
import { cssVar } from "../theme";
import type { ColorToken } from "../theme";
describe("cssVar", () => {
const tokens: ColorToken[] = [
"surface", "surface-elevated", "surface-sunken", "surface-card",
"line", "line-soft", "ink", "ink-mid", "ink-soft",
"accent", "accent-strong", "warm", "good", "bad",
"bg", "bg-elev", "bg-card", "line-strong",
"ink-mute", "ink-dim", "accent-dim", "plasma", "warn",
];
it("returns 'var(--color-surface)' for 'surface'", () => {
expect(cssVar("surface")).toBe("var(--color-surface)");
});
it("returns a CSS variable string for every colour token", () => {
for (const token of tokens) {
expect(cssVar(token)).toBe(`var(--color-${token})`);
it("returns 'var(--color-ink)' for 'ink'", () => {
expect(cssVar("ink")).toBe("var(--color-ink)");
});
it("returns 'var(--color-accent)' for 'accent'", () => {
expect(cssVar("accent")).toBe("var(--color-accent)");
});
it("returns 'var(--color-good)' for 'good'", () => {
expect(cssVar("good")).toBe("var(--color-good)");
});
it("returns 'var(--color-bad)' for 'bad'", () => {
expect(cssVar("bad")).toBe("var(--color-bad)");
});
it("returns 'var(--color-warn)' for 'warn'", () => {
expect(cssVar("warn")).toBe("var(--color-warn)");
});
it("handles all surface variants", () => {
const surfaces: ColorToken[] = ["surface", "surface-elevated", "surface-sunken", "surface-card"];
for (const t of surfaces) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
});
it("returned string can be used as an inline style value", () => {
const el = document.createElement("div");
el.style.color = cssVar("ink");
el.style.backgroundColor = cssVar("surface");
expect(el.style.color).toBe("var(--color-ink)");
expect(el.style.backgroundColor).toBe("var(--color-surface)");
it("handles all ink variants", () => {
const inks: ColorToken[] = ["ink", "ink-mid", "ink-soft", "ink-mute", "ink-dim"];
for (const t of inks) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
});
it("returned string contains the token name verbatim", () => {
expect(cssVar("accent-strong")).toContain("accent-strong");
expect(cssVar("ink-dim")).toContain("ink-dim");
it("handles always-dark tokens", () => {
const dark: ColorToken[] = ["bg", "bg-elev", "bg-card", "line-strong", "accent-dim", "plasma"];
for (const t of dark) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
});
it("is a pure function — same input always returns same output", () => {
const tokens: ColorToken[] = ["surface", "accent", "good", "bad", "warm"];
for (const t of tokens) {
for (let i = 0; i < 3; i++) {
expect(cssVar(t)).toBe(`var(--color-${t})`);
}
}
});
});
@@ -1,134 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for ThemeProvider and useTheme.
*
* Uses renderHook so useEffect fires before assertions.
* matchMedia is stubbed via Object.defineProperty in beforeEach.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, renderHook, cleanup, act } from "@testing-library/react";
import React from "react";
import { ThemeProvider, useTheme } from "../theme-provider";
afterEach(cleanup);
function makeMatcher(prefersDark: boolean) {
return {
matches: prefersDark,
media: "(prefers-color-scheme: dark)",
onchange: null,
addListener: vi.fn(),
removeListener: vi.fn(),
addEventListener: vi.fn(),
removeEventListener: vi.fn(),
dispatchEvent: vi.fn(),
};
}
beforeEach(() => {
Object.defineProperty(window, "matchMedia", {
writable: true,
configurable: true,
value: vi.fn().mockImplementation(() => makeMatcher(false)),
});
});
afterEach(() => {
vi.restoreAllMocks();
});
describe("useTheme", () => {
it("returns noopTheme when no provider is in the tree", () => {
const { result } = renderHook(() => useTheme());
expect(result.current).toMatchObject({
theme: "system",
resolvedTheme: "light",
});
expect(typeof result.current.setTheme).toBe("function");
});
});
describe("ThemeProvider", () => {
it("initialises with the initialTheme prop", () => {
const { result } = renderHook(() => useTheme(), {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="dark">{children}</ThemeProvider>
),
});
expect(result.current).toMatchObject({
theme: "dark",
resolvedTheme: "dark",
});
expect(document.documentElement.dataset.theme).toBe("dark");
});
it("reflects system preference when theme=system", () => {
Object.defineProperty(window, "matchMedia", {
writable: true,
configurable: true,
value: vi.fn().mockImplementation(() => makeMatcher(true)),
});
const { result } = renderHook(() => useTheme(), {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="system">{children}</ThemeProvider>
),
});
expect(result.current).toMatchObject({
theme: "system",
resolvedTheme: "dark",
});
expect(document.documentElement.dataset.theme).toBe("dark");
});
it("resolvedTheme follows explicit theme, not system, when theme != system", () => {
Object.defineProperty(window, "matchMedia", {
writable: true,
configurable: true,
value: vi.fn().mockImplementation(() => makeMatcher(true)),
});
const { result } = renderHook(() => useTheme(), {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="light">{children}</ThemeProvider>
),
});
expect(result.current).toMatchObject({
theme: "light",
resolvedTheme: "light",
});
expect(document.documentElement.dataset.theme).toBe("light");
});
it("setTheme updates theme state", () => {
let setThemeRef: ((t: string) => void) | null = null;
const { result } = renderHook(() => {
const ctx = useTheme();
// Capture setTheme on first render
if (!setThemeRef) setThemeRef = ctx.setTheme;
return ctx;
}, {
wrapper: ({ children }) => (
<ThemeProvider initialTheme="light">{children}</ThemeProvider>
),
});
expect(result.current.theme).toBe("light");
act(() => { setThemeRef!("dark"); });
expect(result.current.theme).toBe("dark");
expect(document.documentElement.dataset.theme).toBe("dark");
});
it("sets document.documentElement.dataset.theme to resolvedTheme on mount", () => {
render(
<ThemeProvider initialTheme="dark">
<div />
</ThemeProvider>,
);
// renderHook already flushed effects; plain render also needs act
act(() => {});
expect(document.documentElement.dataset.theme).toBe("dark");
});
});
-199
View File
@@ -21,22 +21,12 @@ vi.mock("../canvas", () => ({
class MockWebSocket {
static instances: MockWebSocket[] = [];
// Mirror the real WebSocket readyState constants — socket.ts's wake
// path reads WebSocket.OPEN / WebSocket.CONNECTING and this.ws.readyState.
static readonly CONNECTING = 0;
static readonly OPEN = 1;
static readonly CLOSING = 2;
static readonly CLOSED = 3;
url: string;
onopen: (() => void) | null = null;
onmessage: ((event: { data: string }) => void) | null = null;
onclose: (() => void) | null = null;
onerror: (() => void) | null = null;
closeCallCount = 0;
// Starts OPEN once triggerOpen runs; tests flip this to simulate a
// mobile background-suspend that left a dead/half-open socket.
readyState = MockWebSocket.CONNECTING;
constructor(url: string) {
this.url = url;
@@ -45,12 +35,10 @@ class MockWebSocket {
close() {
this.closeCallCount++;
this.readyState = MockWebSocket.CLOSED;
}
// Helpers to trigger events in tests
triggerOpen() {
this.readyState = MockWebSocket.OPEN;
this.onopen?.();
}
@@ -71,46 +59,6 @@ class MockWebSocket {
}
}
// ---------------------------------------------------------------------------
// Minimal DOM stub (vitest environment is 'node' — no window/document).
// socket.ts's wake-recovery attaches visibilitychange/pageshow/online/
// focus listeners; under node it self-no-ops via a typeof guard, so to
// exercise the path we inject just enough of window/document here, the
// same way WebSocket is stubbed above. Kept tiny on purpose — a single
// listener registry keyed by event name, plus a settable
// visibilityState.
// ---------------------------------------------------------------------------
interface FakeTarget {
_l: Record<string, Array<() => void>>;
addEventListener: (type: string, fn: () => void) => void;
removeEventListener: (type: string, fn: () => void) => void;
dispatch: (type: string) => void;
}
function makeFakeTarget(): FakeTarget {
const l: Record<string, Array<() => void>> = {};
return {
_l: l,
addEventListener(type, fn) {
(l[type] ||= []).push(fn);
},
removeEventListener(type, fn) {
l[type] = (l[type] || []).filter((f) => f !== fn);
},
dispatch(type) {
for (const fn of l[type] || []) fn();
},
};
}
const fakeWindow = makeFakeTarget();
const fakeDocument = Object.assign(makeFakeTarget(), {
visibilityState: "visible" as string,
});
(globalThis as unknown as Record<string, unknown>).window = fakeWindow;
(globalThis as unknown as Record<string, unknown>).document = fakeDocument;
// Install mock WebSocket globally before importing socket module
(globalThis as unknown as Record<string, unknown>).WebSocket = MockWebSocket;
@@ -380,153 +328,6 @@ describe("WebSocket onerror", () => {
});
});
// ---------------------------------------------------------------------------
// Wake recovery — mobile background-suspend regression (mobile chat not
// updating in real time until refresh). Simulates: connect → open →
// the OS freezes the page and silently kills the WS WITHOUT firing
// onclose → user returns (visibilitychange / pageshow / online /
// focus) → assert the dead socket is replaced AND, on the new socket's
// open, the resume signal fires so chat history back-fills the missed
// AGENT_MESSAGE / A2A_RESPONSE events.
// ---------------------------------------------------------------------------
import {
subscribeSocketResume,
_resetSocketResumeListenersForTests,
} from "../socket-events";
describe("wake recovery (mobile background-suspend)", () => {
beforeEach(() => {
_resetSocketResumeListenersForTests();
fakeDocument.visibilityState = "visible";
});
function suspendKill(ws: MockWebSocket) {
// Mobile background-suspend: the OS tore the transport down but the
// page was frozen so onclose never ran. The socket object survives
// with a CLOSED readyState and no reconnect was scheduled.
ws.readyState = MockWebSocket.CLOSED;
}
it("reconnects on visibilitychange when the socket was silently killed", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
expect(MockWebSocket.instances).toHaveLength(1);
suspendKill(ws);
fakeDocument.dispatch("visibilitychange");
// A fresh socket must have been created — the stale one is not
// reused.
expect(MockWebSocket.instances.length).toBeGreaterThan(1);
});
it("does NOT reconnect on visibilitychange while the socket is still healthy", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
expect(MockWebSocket.instances).toHaveLength(1);
// Healthy OPEN socket + a spurious visibilitychange (e.g. quick tab
// peek that never actually suspended) → no churn.
fakeDocument.dispatch("visibilitychange");
expect(MockWebSocket.instances).toHaveLength(1);
});
it("ignores visibilitychange when the page is hidden (the hide transition)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
suspendKill(ws);
fakeDocument.visibilityState = "hidden";
fakeDocument.dispatch("visibilitychange");
// Hidden → must not reconnect (would defeat the purpose; we only
// re-arm when the user is actually looking at the page again).
expect(MockWebSocket.instances).toHaveLength(1);
});
it.each(["pageshow", "online", "focus"])(
"reconnects on window '%s' after a silent kill",
(evt) => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
suspendKill(ws);
fakeWindow.dispatch(evt);
expect(MockWebSocket.instances.length).toBeGreaterThan(1);
},
);
it("emits the resume signal once the recovered socket re-opens (so chat back-fills missed messages)", () => {
const onResume = vi.fn();
const unsub = subscribeSocketResume(onResume);
connectSocket();
const ws1 = getLastWS();
ws1.triggerOpen();
// First open must NOT fire resume — the mount-time chat-history
// fetch already covers the initial load.
expect(onResume).not.toHaveBeenCalled();
// Background-suspend silently kills the socket, then the user
// returns.
suspendKill(ws1);
fakeDocument.dispatch("visibilitychange");
// The wake handler force-reconnected; the new socket completing its
// handshake is what signals "we recovered from a gap — re-fetch".
const ws2 = getLastWS();
expect(ws2).not.toBe(ws1);
ws2.triggerOpen();
expect(onResume).toHaveBeenCalledTimes(1);
unsub();
});
it("does not emit resume on the very first connect", () => {
const onResume = vi.fn();
const unsub = subscribeSocketResume(onResume);
connectSocket();
getLastWS().triggerOpen();
expect(onResume).not.toHaveBeenCalled();
unsub();
});
it("emits resume after an ordinary onclose-driven reconnect too (desktop path unchanged)", () => {
const onResume = vi.fn();
const unsub = subscribeSocketResume(onResume);
connectSocket();
const ws1 = getLastWS();
ws1.triggerOpen();
// Ordinary network drop — onclose fires normally.
ws1.triggerClose();
vi.advanceTimersByTime(1100); // past the 1s backoff
const ws2 = getLastWS();
expect(ws2).not.toBe(ws1);
ws2.triggerOpen();
expect(onResume).toHaveBeenCalledTimes(1);
unsub();
});
it("detaches wake listeners on disconnect (no reconnect after teardown)", () => {
connectSocket();
const ws = getLastWS();
ws.triggerOpen();
disconnectSocket();
const countAfterDisconnect = MockWebSocket.instances.length;
// A wake event after teardown must be inert.
fakeDocument.dispatch("visibilitychange");
fakeWindow.dispatch("focus");
expect(MockWebSocket.instances.length).toBe(countAfterDisconnect);
});
});
// ---------------------------------------------------------------------------
// Health check (startHealthCheck / stopHealthCheck via onopen / disconnect)
// ---------------------------------------------------------------------------
-50
View File
@@ -61,53 +61,3 @@ export function subscribeSocketEvents(listener: Listener): () => void {
export function _resetSocketEventListenersForTests(): void {
listeners.clear();
}
// ---------------------------------------------------------------------------
// Socket-resume signal
// ---------------------------------------------------------------------------
//
// Fired by the ReconnectingSocket when the WS comes back up AFTER having
// been down (drop, or a mobile-browser background-suspend that silently
// killed the socket while the page was frozen). Distinct from the raw
// event bus above: while the socket was dead the page missed every
// AGENT_MESSAGE / A2A_RESPONSE, and the store's rehydrate() only re-pulls
// /workspaces status — it does NOT back-fill chat messages. Components
// that render a live message thread (desktop ChatTab + MobileChat, both
// via useChatHistory) subscribe here to re-fetch their history on resume
// so missed agent replies appear without the user having to navigate
// away+back or hard-refresh. Shared by desktop and mobile — the recovery
// is in the singleton socket, not forked per-surface.
type ResumeListener = () => void;
const resumeListeners = new Set<ResumeListener>();
/** Notify every resume subscriber that the socket just recovered from a
* down period. Called by ReconnectingSocket.onopen, but only when the
* open follows a prior loss (not the very first connect — the initial
* mount-time history fetch already covers that). */
export function emitSocketResume(): void {
for (const listener of resumeListeners) {
try {
listener();
} catch (err) {
if (typeof console !== "undefined") {
console.error("socket-resume listener threw:", err);
}
}
}
}
/** Register a resume subscriber. Returns an unsubscribe function the
* caller must invoke from its effect cleanup. */
export function subscribeSocketResume(listener: ResumeListener): () => void {
resumeListeners.add(listener);
return () => {
resumeListeners.delete(listener);
};
}
/** Test-only: drop all resume subscribers. */
export function _resetSocketResumeListenersForTests(): void {
resumeListeners.clear();
}
+1 -117
View File
@@ -1,6 +1,6 @@
import { useCanvasStore } from "./canvas";
import { deriveWsBaseUrl } from "@/lib/ws-url";
import { emitSocketEvent, emitSocketResume } from "./socket-events";
import { emitSocketEvent } from "./socket-events";
// If explicit WS_URL is set, use it as-is (may include custom path).
// Otherwise derive base + append /ws.
@@ -98,107 +98,9 @@ class ReconnectingSocket {
// caller can fire-and-forget without coordinating.
private rehydrateInFlight: Promise<void> | null = null;
private rehydrateDedup = new RehydrateDedup(REHYDRATE_DEDUP_WINDOW_MS);
// True once any onopen has fired. Gates the resume signal so the very
// first connect doesn't fire it (the mount-time chat-history fetch
// already covers the initial load — a resume here would be a wasted
// duplicate). Set on the first successful open and stays true.
private everConnected = false;
// True between a loss (onclose / wake-detected stale socket) and the
// next successful onopen. Only when this is set does onopen emit the
// resume signal — i.e. we recovered from a real gap during which
// AGENT_MESSAGE / A2A_RESPONSE events may have been missed.
private wasDown = false;
// Bound wake handler. iOS Safari / Chrome-mobile freeze the page and
// its timers when the tab is backgrounded or the device locks, and
// tear the WS down WITHOUT reliably firing onclose before the freeze.
// On thaw nothing re-arms: onclose never ran so no reconnect was
// scheduled, and the health-check / fallback-poll intervals were
// suspended. The socket is silently dead until a manual refresh. This
// handler force-reconnects on any wake signal when the socket isn't
// healthy. Stored so disconnect() can detach the listeners.
private onWake: (() => void) | null = null;
constructor(url: string) {
this.url = url;
this.installWakeListeners();
}
/** Attach page-lifecycle listeners that force a reconnect when the
* page returns to the foreground / regains connectivity and the
* socket is not OPEN. Shared by desktop and mobile — desktop rarely
* hits the stale-socket path (its onclose fires promptly) so this is
* effectively a no-op there, while mobile depends on it because the
* background-suspend kills the socket without an onclose. */
private installWakeListeners() {
if (typeof window === "undefined" || typeof document === "undefined") {
return;
}
const wake = () => {
if (this.disposed) return;
// Only act on a visible page — visibilitychange also fires on the
// hide transition, which we must ignore (closing here would defeat
// the point).
if (
typeof document.visibilityState === "string" &&
document.visibilityState !== "visible"
) {
return;
}
// Healthy socket → nothing to do. A stale/half-open socket on
// mobile reports CLOSED or CLOSING (the OS tore the transport
// down); CONNECTING is also unhealthy from the user's POV but a
// reconnect attempt is already in flight, so leave it.
const live =
this.ws !== null &&
(this.ws.readyState === WebSocket.OPEN ||
this.ws.readyState === WebSocket.CONNECTING);
if (live) return;
// Tear down any zombie and reconnect immediately. Mark wasDown so
// the subsequent onopen emits the resume signal and chat threads
// back-fill the messages missed while frozen.
this.wasDown = true;
this.forceReconnect();
};
this.onWake = wake;
document.addEventListener("visibilitychange", wake);
window.addEventListener("pageshow", wake);
window.addEventListener("online", wake);
window.addEventListener("focus", wake);
}
private removeWakeListeners() {
if (!this.onWake) return;
if (typeof window !== "undefined" && typeof document !== "undefined") {
document.removeEventListener("visibilitychange", this.onWake);
window.removeEventListener("pageshow", this.onWake);
window.removeEventListener("online", this.onWake);
window.removeEventListener("focus", this.onWake);
}
this.onWake = null;
}
/** Detach the current (presumed dead/stale) socket without routing
* through its onclose, cancel any pending backoff timer, and
* reconnect now. Used by the wake path: the browser already killed
* the transport, so the exponential backoff that onclose would have
* scheduled is both absent and undesirable — the user is looking at
* the page and wants it live immediately. */
private forceReconnect() {
if (this.disposed) return;
if (this.reconnectTimer) {
clearTimeout(this.reconnectTimer);
this.reconnectTimer = null;
}
if (this.ws) {
this.ws.onopen = null;
this.ws.onmessage = null;
this.ws.onclose = null;
this.ws.onerror = null;
try { this.ws.close(); } catch { /* noop */ }
this.ws = null;
}
this.attempt = 0;
this.connect();
}
connect() {
@@ -230,18 +132,6 @@ class ReconnectingSocket {
this.stopFallbackPoll();
this.rehydrate();
this.startHealthCheck();
// If this open follows a real loss (drop, or a mobile background-
// suspend that the wake handler recovered from), signal resume so
// live message threads re-fetch the AGENT_MESSAGE / A2A_RESPONSE
// history they missed while the socket was dead — rehydrate()
// above only refreshes /workspaces status, not chat. Gate on
// everConnected so the very first open (covered by the mount-time
// history fetch) doesn't fire a redundant resume.
if (this.everConnected && this.wasDown) {
emitSocketResume();
}
this.everConnected = true;
this.wasDown = false;
};
ws.onmessage = (event) => {
@@ -267,11 +157,6 @@ class ReconnectingSocket {
// corresponds to the WS we just tore down (prevents a stale
// onclose from a zombie socket from re-arming the loop).
if (this.disposed || this.ws !== ws) return;
// We had a live socket and lost it — mark down so the next onopen
// emits the resume signal and chat threads back-fill missed
// messages. (The wake path also sets this; setting it here covers
// the ordinary network-drop case.)
this.wasDown = true;
this.stopHealthCheck();
useCanvasStore.getState().setWsStatus("connecting");
this.startFallbackPoll();
@@ -362,7 +247,6 @@ class ReconnectingSocket {
disconnect() {
this.disposed = true;
this.removeWakeListeners();
this.stopHealthCheck();
this.stopFallbackPoll();
if (this.reconnectTimer) {
-1
View File
@@ -62,7 +62,6 @@ TOP_LEVEL_MODULES = {
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"a2a_tools_identity",
"adapter_base",
"agent",
"agents_md",
+2 -4
View File
@@ -41,9 +41,8 @@ type EventType string
// scan-friendly as it grows.
const (
// Chat / agent messaging — surfaces in canvas chat panels.
EventAgentMessage EventType = "AGENT_MESSAGE"
EventA2AResponse EventType = "A2A_RESPONSE"
EventUserMessage EventType = "USER_MESSAGE"
EventAgentMessage EventType = "AGENT_MESSAGE"
EventA2AResponse EventType = "A2A_RESPONSE"
EventActivityLogged EventType = "ACTIVITY_LOGGED"
EventChannelMessage EventType = "CHANNEL_MESSAGE"
@@ -96,7 +95,6 @@ const (
var AllEventTypes = []EventType{
EventA2AResponse,
EventActivityLogged,
EventUserMessage,
EventAgentAssigned,
EventAgentCardUpdated,
EventAgentMessage,
@@ -41,7 +41,6 @@ func TestAllEventTypes_IsSnapshot(t *testing.T) {
"DELEGATION_STATUS",
"EXTERNAL_CREDENTIALS_ROTATED",
"TASK_UPDATED",
"USER_MESSAGE",
"WORKSPACE_AWAITING_AGENT",
"WORKSPACE_DEGRADED",
"WORKSPACE_HEARTBEAT",
@@ -11,7 +11,6 @@ import (
"log"
"net/http"
"strconv"
"strings"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
@@ -345,19 +344,6 @@ func (h *WorkspaceHandler) logA2ASuccess(ctx context.Context, workspaceID, calle
"duration_ms": durationMs,
})
}
// #228: fan user's own outbound message to all sessions of the workspace.
// When a canvas user sends a message (callerID == "" and method == "message/send"),
// the originating session already inserted it optimistically in useChatSend.
// Other sessions see nothing until a manual refresh — this broadcast closes
// that gap. The originating session collapses its optimistic copy via the
// 3-second appendMessageDeduped window (same role + content = deduped).
if callerID == "" && a2aMethod == "message/send" && statusCode < 400 {
userPayload := extractCanvasUserMessage(body)
if userPayload != nil {
h.broadcaster.BroadcastOnly(workspaceID, string(events.EventUserMessage), userPayload)
}
}
}
func nilIfEmpty(s string) *string {
@@ -407,110 +393,6 @@ func validateCallerToken(ctx context.Context, c *gin.Context, callerID string) e
// matching (the wsauth errors are typed for the invalid case).
var errInvalidCallerToken = errors.New("missing caller auth token")
// extractCanvasUserMessage parses an A2A JSON-RPC request body and extracts
// the user-authored text and attachments from a canvas-initiated message/send.
// Returns nil when the body is not a canvas user message (empty, malformed,
// or not a message/send from canvas). The returned payload is safe to pass
// directly to BroadcastOnly — nil fields are omitted from JSON.
func extractCanvasUserMessage(body []byte) map[string]interface{} {
if len(body) == 0 {
return nil
}
var top map[string]json.RawMessage
if err := json.Unmarshal(body, &top); err != nil {
return nil
}
// Only handle message/send from canvas
var method string
if err := json.Unmarshal(top["method"], &method); err != nil || method != "message/send" {
return nil
}
params, ok := top["params"]
if !ok {
return nil
}
var paramsMap map[string]json.RawMessage
if err := json.Unmarshal(params, &paramsMap); err != nil {
return nil
}
msgRaw, ok := paramsMap["message"]
if !ok {
return nil
}
var msg map[string]json.RawMessage
if err := json.Unmarshal(msgRaw, &msg); err != nil {
return nil
}
// role field: only broadcast user-role messages (canvas users)
var role string
if err := json.Unmarshal(msg["role"], &role); err != nil || role != "user" {
return nil
}
result := make(map[string]interface{})
// Extract messageId if present
var mid string
if err := json.Unmarshal(msg["messageId"], &mid); err == nil && mid != "" {
result["messageId"] = mid
}
// Extract text from parts — accumulate all text parts into a single string
var parts []json.RawMessage
if err := json.Unmarshal(msg["parts"], &parts); err == nil {
var texts []string
var fileAttachments []map[string]interface{}
for _, pRaw := range parts {
var p map[string]json.RawMessage
if err := json.Unmarshal(pRaw, &p); err != nil {
continue
}
var t string
if err := json.Unmarshal(p["text"], &t); err == nil && t != "" {
texts = append(texts, t)
}
var fileRaw json.RawMessage
if err := json.Unmarshal(p["file"], &fileRaw); err == nil && fileRaw != nil {
var f map[string]json.RawMessage
if err := json.Unmarshal(fileRaw, &f); err == nil {
att := make(map[string]interface{})
var s string
if err := json.Unmarshal(f["uri"], &s); err == nil {
att["uri"] = s
}
if err := json.Unmarshal(f["name"], &s); err == nil {
att["name"] = s
}
if err := json.Unmarshal(f["mimeType"], &s); err == nil {
att["mimeType"] = s
}
var n float64
if err := json.Unmarshal(f["size"], &n); err == nil {
att["size"] = n
}
if len(att) > 0 {
fileAttachments = append(fileAttachments, att)
}
}
}
}
if len(texts) > 0 {
// Join with newlines — user may have sent multiple text parts
result["message"] = strings.Join(texts, "\n")
}
if len(fileAttachments) > 0 {
result["attachments"] = fileAttachments
}
}
// Drop empty payloads
if len(result) == 0 {
return nil
}
return result
}
// extractToolTrace pulls metadata.tool_trace from an A2A JSON-RPC response.
// Returns nil when absent or malformed — callers can pass it straight through.
func extractToolTrace(respBody []byte) json.RawMessage {
@@ -1,262 +0,0 @@
package handlers
import (
"encoding/json"
"testing"
)
// TestExtractCanvasUserMessage_TextOnly covers the primary path: a canvas user
// sends a plain text message with no attachments.
func TestExtractCanvasUserMessage_TextOnly(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": "msg-abc-123",
"parts": [
{"kind": "text", "text": "Hello, agent!"}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload for text message")
}
if got["message"] != "Hello, agent!" {
t.Errorf("message = %v, want %q", got["message"], "Hello, agent!")
}
mid, ok := got["messageId"].(string)
if !ok || mid != "msg-abc-123" {
t.Errorf("messageId = %v, want %q", got["messageId"], "msg-abc-123")
}
_, hasAttachments := got["attachments"]
if hasAttachments {
t.Errorf("unexpected attachments: %v", got["attachments"])
}
}
// TestExtractCanvasUserMessage_FileOnly covers a user message with a file but no text.
func TestExtractCanvasUserMessage_FileOnly(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"messageId": "msg-file-456",
"parts": [
{
"kind": "file",
"file": {
"name": "report.pdf",
"uri": "workspace:/uploads/report.pdf",
"mimeType": "application/pdf",
"size": 4096
}
}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload for file-only message")
}
if got["message"] != nil {
t.Errorf("unexpected message text: %v", got["message"])
}
attachments, ok := got["attachments"].([]map[string]interface{})
if !ok || len(attachments) != 1 {
t.Fatalf("attachments = %v, want 1-element array", got["attachments"])
}
att := attachments[0]
if att["uri"] != "workspace:/uploads/report.pdf" {
t.Errorf("uri = %v, want %q", att["uri"], "workspace:/uploads/report.pdf")
}
if att["name"] != "report.pdf" {
t.Errorf("name = %v, want %q", att["name"], "report.pdf")
}
if att["mimeType"] != "application/pdf" {
t.Errorf("mimeType = %v, want %q", att["mimeType"], "application/pdf")
}
}
// TestExtractCanvasUserMessage_TextAndFile covers a user message with both text and a file.
func TestExtractCanvasUserMessage_TextAndFile(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"parts": [
{"kind": "text", "text": "Here is the file:"},
{"kind": "text", "text": "see below"},
{
"kind": "file",
"file": {
"name": "data.csv",
"uri": "workspace:/exports/data.csv",
"mimeType": "text/csv",
"size": 8192
}
}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload")
}
// Two text parts are joined with newline
if got["message"] != "Here is the file:\nsee below" {
t.Errorf("message = %v, want %q", got["message"], "Here is the file:\nsee below")
}
attachments, ok := got["attachments"].([]map[string]interface{})
if !ok || len(attachments) != 1 {
t.Fatalf("attachments = %v, want 1-element array", got["attachments"])
}
}
// TestExtractCanvasUserMessage_Malformed covers malformed JSON.
func TestExtractCanvasUserMessage_Malformed(t *testing.T) {
cases := []struct {
name string
body []byte
}{
{"not JSON", []byte(`{not valid`)},
{"wrong type top-level", []byte(`123`)},
{"missing params", []byte(`{"method":"message/send"}`)},
{"params not object", []byte(`{"method":"message/send","params":123}`)},
{"missing message", []byte(`{"method":"message/send","params":{}}`)},
{"message not object", []byte(`{"method":"message/send","params":{"message":123}}`)},
{"role missing", []byte(`{"method":"message/send","params":{"message":{"parts":[]}}}`)},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := extractCanvasUserMessage(tc.body); got != nil {
t.Errorf("expected nil for %s, got %v", tc.name, got)
}
})
}
}
// TestExtractCanvasUserMessage_NotUserRole covers agent/workspace callers
// whose role is not "user" — these should not be broadcast as USER_MESSAGE.
func TestExtractCanvasUserMessage_NotUserRole(t *testing.T) {
cases := []struct {
name string
body []byte
}{
{
"agent role",
[]byte(`{"method":"message/send","params":{"message":{"role":"agent","parts":[{"kind":"text","text":"hello"}]}}}`),
},
{
"assistant role",
[]byte(`{"method":"message/send","params":{"message":{"role":"assistant","parts":[{"kind":"text","text":"hello"}]}}}`),
},
{
"empty role",
[]byte(`{"method":"message/send","params":{"message":{"role":"","parts":[{"kind":"text","text":"hello"}]}}}`),
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := extractCanvasUserMessage(tc.body); got != nil {
t.Errorf("expected nil for role=%s, got %v", tc.name, got)
}
})
}
}
// TestExtractCanvasUserMessage_NotMessageSend covers non-message/send methods.
func TestExtractCanvasUserMessage_NotMessageSend(t *testing.T) {
cases := []struct {
name string
method string
}{
{"tasks/send", "tasks/send"},
{"initialize", "initialize"},
{"ping", "ping"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
body, _ := json.Marshal(map[string]interface{}{
"method": tc.method,
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]interface{}{{"kind": "text", "text": "hello"}},
},
},
})
if got := extractCanvasUserMessage(body); got != nil {
t.Errorf("expected nil for method=%q, got %v", tc.method, got)
}
})
}
}
// TestExtractCanvasUserMessage_BlankOrEmpty covers text with only whitespace
// and empty parts arrays.
func TestExtractCanvasUserMessage_BlankOrEmpty(t *testing.T) {
cases := []struct {
name string
body []byte
}{
{
"empty text part",
[]byte(`{"method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":""}]}}}`),
},
{
"empty parts array",
[]byte(`{"method":"message/send","params":{"message":{"role":"user","parts":[]}}}`),
},
{
"whitespace-only text — still included as valid content",
[]byte(`{"method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":" "}]}}}`),
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := extractCanvasUserMessage(tc.body)
if tc.name == "whitespace-only text — still included as valid content" {
// Whitespace-only text is valid content — preserve it as-is.
// Canvas dedup collapses identical copies; whitespace is not stripped.
if got == nil {
t.Error("expected non-nil for whitespace-only text")
} else if got["message"] != " " {
t.Errorf("message = %q, want %q", got["message"], " ")
}
return
}
if got != nil {
t.Errorf("expected nil for %s, got %v", tc.name, got)
}
})
}
}
// TestExtractCanvasUserMessage_Unicode covers non-ASCII text.
func TestExtractCanvasUserMessage_Unicode(t *testing.T) {
body := []byte(`{
"method": "message/send",
"params": {
"message": {
"role": "user",
"parts": [
{"kind": "text", "text": "こんにちは世界 🌍 日本語"}
]
}
}
}`)
got := extractCanvasUserMessage(body)
if got == nil {
t.Fatal("expected non-nil payload for unicode message")
}
if got["message"] != "こんにちは世界 🌍 日本語" {
t.Errorf("message = %v, want %q", got["message"], "こんにちは世界 🌍 日本語")
}
}
@@ -691,19 +691,6 @@ func logActivityExec(ctx context.Context, exec activityExecutor, broadcaster eve
if respStr != nil {
payload["response_body"] = json.RawMessage(respJSON)
}
// internal#211/#212: error_detail carries the runtime's curated,
// user-actionable, secret-safe failure reason (provider HTTP
// status + error code + the provider's own guidance, e.g. a 403
// "org disabled · use an API key / ask your admin"). It is
// already persisted to the DB column above and capped by the
// runtime's report_activity helper (4096 chars). Previously it
// was dropped from the LIVE broadcast, so the canvas had nothing
// to render and fell back to a hardcoded opaque
// "Agent error (Exception) — see workspace logs" string. Include
// it so the chat bubble shows the real reason in real time.
if params.ErrorDetail != nil && *params.ErrorDetail != "" {
payload["error_detail"] = *params.ErrorDetail
}
}
return func() {
@@ -107,29 +107,10 @@ func (h *ChatFilesHandler) WithPendingUploads(storage pendinguploads.Storage, br
}
// chatUploadMaxBytes caps the full multipart request body so a
// malicious / runaway client can't OOM the proxy hop. 100 MB matches
// the workspace-side total limit; anything larger is rejected at the
// malicious / runaway client can't OOM the proxy hop. 50 MB matches
// the workspace-side limit; anything larger is rejected at the
// network boundary before forwarding.
//
// SSOT NOTE (issue #1520): this constant is the source of truth for
// chat upload limits across the platform. Its value is exported to
// the workspace container at provision time via the env var
// CHAT_UPLOAD_MAX_TOTAL_BYTES (see
// workspace_provision_shared.go::applyChatUploadLimits) so the
// Python runtime cap stays in lock-step. Do NOT change this without
// updating the per-file cap chatUploadMaxFileBytes below and
// verifying the env-injection site is unchanged.
const chatUploadMaxBytes = 100 * 1024 * 1024
// chatUploadMaxFileBytes caps any single multipart part. Mirrors the
// total cap by default because most chat uploads are a single file;
// keeping per-file equal to total avoids the surprise of "my 60 MB
// file fit under the total but got 413'd on per-file". Exported to
// the workspace container as CHAT_UPLOAD_MAX_FILE_BYTES so the
// Starlette parser's max_part_size matches and any single part above
// Starlette's default 1 MiB no longer raises MultiPartException
// (root cause of issue #1520).
const chatUploadMaxFileBytes = 100 * 1024 * 1024
const chatUploadMaxBytes = 50 * 1024 * 1024
// resolveWorkspaceForwardCreds resolves the workspace's URL +
// platform_inbound_secret for an /internal/* forward, applying
@@ -1,63 +0,0 @@
package handlers
// chat_upload_limits_test.go — pins the SSOT env-injection contract
// for chat-upload caps (issue #1520). The Python workspace runtime
// reads these env vars at module init; drift between the constant in
// chat_files.go and the env-var name here silently breaks chat upload
// fleet-wide, so the contract is asserted as a unit test in the same
// package as the producer.
import (
"fmt"
"testing"
)
// applyChatUploadLimits MUST seed both env vars to the byte-count
// stringification of the Go-side constants. Anything else means a
// Python-side parser cap that disagrees with the Go-side network cap,
// which is exactly the drift that shipped #1520.
func TestApplyChatUploadLimits_DefaultsMatchGoConstants(t *testing.T) {
env := map[string]string{}
applyChatUploadLimits(env)
wantFile := fmt.Sprintf("%d", chatUploadMaxFileBytes)
if got := env["CHAT_UPLOAD_MAX_FILE_BYTES"]; got != wantFile {
t.Errorf("CHAT_UPLOAD_MAX_FILE_BYTES = %q, want %q", got, wantFile)
}
wantTotal := fmt.Sprintf("%d", chatUploadMaxBytes)
if got := env["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; got != wantTotal {
t.Errorf("CHAT_UPLOAD_MAX_TOTAL_BYTES = %q, want %q", got, wantTotal)
}
}
// Pre-existing values win. A tenant override, plugin mutator, or A/B
// experiment that already set the env MUST be preserved — the SSOT
// helper is a defaulting layer, not an override layer.
func TestApplyChatUploadLimits_PreExistingValuesPreserved(t *testing.T) {
env := map[string]string{
"CHAT_UPLOAD_MAX_FILE_BYTES": "1234",
"CHAT_UPLOAD_MAX_TOTAL_BYTES": "5678",
}
applyChatUploadLimits(env)
if got := env["CHAT_UPLOAD_MAX_FILE_BYTES"]; got != "1234" {
t.Errorf("pre-existing CHAT_UPLOAD_MAX_FILE_BYTES overwritten: got %q", got)
}
if got := env["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; got != "5678" {
t.Errorf("pre-existing CHAT_UPLOAD_MAX_TOTAL_BYTES overwritten: got %q", got)
}
}
// The 100 MB minimum is the CTO-directed allowance floor (issue #1520).
// Pin so a future "tidy up: 100 MB seems large" refactor surfaces here
// before reverting the user-visible behaviour change.
func TestChatUploadCaps_MinimumAllowanceFloor(t *testing.T) {
const floor = 100 * 1024 * 1024
if chatUploadMaxBytes < floor {
t.Errorf("chatUploadMaxBytes = %d, below #1520 floor %d", chatUploadMaxBytes, floor)
}
if chatUploadMaxFileBytes < floor {
t.Errorf("chatUploadMaxFileBytes = %d, below #1520 floor %d", chatUploadMaxFileBytes, floor)
}
}
@@ -747,14 +747,6 @@ func (h *DelegationHandler) listDelegationsFromLedger(ctx context.Context, works
entry["response_preview"] = textutil.TruncateBytes(resultPreview.String, 300)
}
if errorDetail.Valid && errorDetail.String != "" {
// Emit both keys: `error_detail` is the canonical field the
// Python poll-mode consumer (a2a_tools_delegation.py:184)
// reads from /delegations rows — without it, poll-mode
// silently loses the failure reason and falls through to
// the generic "delegation failed" string. `error` is kept
// for back-compat with existing UI surfaces that read the
// shorter name.
entry["error_detail"] = errorDetail.String
entry["error"] = errorDetail.String
}
if lastHeartbeat != nil {
@@ -816,8 +808,6 @@ func (h *DelegationHandler) listDelegationsFromActivityLogs(ctx context.Context,
entry["delegation_id"] = delegationID
}
if errorDetail != "" {
// Emit both keys per the rename: see listDelegationsFromLedger.
entry["error_detail"] = errorDetail
entry["error"] = errorDetail
}
if responseBody != "" {
@@ -1546,71 +1546,6 @@ func TestListDelegations_LedgerEmptyFallsBackToActivityLogs(t *testing.T) {
}
}
// ---------- ListDelegations: activity_logs failed row emits BOTH error + error_detail ----------
// Field-rename pin (P1 #348 / RFC #2829 PR-2 follow-up): the legacy
// activity_logs fallback path must also emit `error_detail` alongside
// the historical `error` key. Without this, poll-mode (which reads
// `error_detail`) silently loses the failure reason when the ledger
// is empty and the handler falls back to activity_logs.
func TestListDelegations_ActivityLogsFailedEmitsBothErrorKeys(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
// Ledger empty → fall back to activity_logs.
mock.ExpectQuery("SELECT d.delegation_id, d.caller_id, d.callee_id, d.task_preview").
WithArgs("ws-source").
WillReturnRows(sqlmock.NewRows([]string{
"delegation_id", "caller_id", "callee_id", "task_preview",
"status", "result_preview", "error_detail", "last_heartbeat",
"deadline", "created_at", "updated_at",
}))
now := time.Now()
activityRows := sqlmock.NewRows([]string{
"id", "activity_type", "source_id", "target_id",
"summary", "status", "error_detail", "response_body",
"delegation_id", "created_at",
}).AddRow(
"act-failed", "delegate_result", "ws-source", "ws-target",
"Delegation failed", "error", "codex runtime timed out", "",
"del-failed-002", now,
)
mock.ExpectQuery("SELECT id, activity_type").
WithArgs("ws-source").
WillReturnRows(activityRows)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-source"}}
c.Request = httptest.NewRequest("GET", "/workspaces/ws-source/delegations", nil)
dh.ListDelegations(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to parse response: %v", err)
}
if len(resp) != 1 {
t.Fatalf("expected 1 row, got %d", len(resp))
}
if resp[0]["error"] != "codex runtime timed out" {
t.Errorf("expected `error` field set, got %v", resp[0]["error"])
}
if resp[0]["error_detail"] != "codex runtime timed out" {
t.Errorf("expected `error_detail` field set (poll-mode contract), got %v", resp[0]["error_detail"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// ---------- ListDelegations: both ledger and activity_logs empty → [] ----------
func TestListDelegations_BothEmptyReturnsEmptyArray(t *testing.T) {
@@ -1809,15 +1744,7 @@ func TestListDelegations_LedgerFailedIncludesErrorDetail(t *testing.T) {
t.Errorf("expected status 'failed', got %v", resp[0]["status"])
}
if resp[0]["error"] != "Callee workspace not reachable" {
t.Errorf("expected error detail under `error`, got %v", resp[0]["error"])
}
// Field-rename pin (P1 #348 / RFC #2829 PR-2 follow-up): the
// Python poll-mode consumer in a2a_tools_delegation.py:184 reads
// `error_detail`, not `error`. Both keys MUST be present so polling
// surfaces the real failure reason instead of falling through to
// the generic "delegation failed" string.
if resp[0]["error_detail"] != "Callee workspace not reachable" {
t.Errorf("expected error detail under `error_detail`, got %v", resp[0]["error_detail"])
t.Errorf("expected error detail, got %v", resp[0]["error"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
@@ -1,7 +1,14 @@
package handlers
// Unit tests for plugins_listing.go:
// - parseManifestYAML: full YAML, missing fields, empty YAML
// - listRegistryFiltered: empty/missing dir, no yaml, valid yaml, runtime filter
// - ListRegistry (GET /plugins): no filter, with runtime filter
// - ListAvailableForWorkspace (GET /workspaces/:id/plugins/available): runtimeLookup stub
import (
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"os"
@@ -11,131 +18,455 @@ import (
"github.com/gin-gonic/gin"
)
func TestListRegistry_EmptyDir(t *testing.T) {
dir := t.TempDir()
h := NewPluginsHandler(dir, nil, nil)
// -------- parseManifestYAML --------
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list, got %d plugins", len(got))
func TestParseManifestYAML_FullPlugin(t *testing.T) {
data := []byte(`
name: molecule-audit
version: 1.2.3
description: Security audit plugin for Claude Code
author: Molecule AI
tags:
- security
- audit
skills:
- security-scan
- compliance-check
runtimes:
- claude_code
- hermes
`)
info := parseManifestYAML("fallback-name", data)
if info.Name != "fallback-name" {
t.Errorf("Name = %q; want fallback-name", info.Name)
}
if info.Version != "1.2.3" {
t.Errorf("Version = %q; want 1.2.3", info.Version)
}
if info.Description != "Security audit plugin for Claude Code" {
t.Errorf("Description = %q; want full description", info.Description)
}
if info.Author != "Molecule AI" {
t.Errorf("Author = %q; want Molecule AI", info.Author)
}
if len(info.Tags) != 2 || info.Tags[0] != "security" || info.Tags[1] != "audit" {
t.Errorf("Tags = %v; want [security audit]", info.Tags)
}
if len(info.Skills) != 2 || info.Skills[0] != "security-scan" || info.Skills[1] != "compliance-check" {
t.Errorf("Skills = %v; want [security-scan compliance-check]", info.Skills)
}
if len(info.Runtimes) != 2 || info.Runtimes[0] != "claude_code" || info.Runtimes[1] != "hermes" {
t.Errorf("Runtimes = %v; want [claude_code hermes]", info.Runtimes)
}
}
func TestListRegistry_IgnoresFiles(t *testing.T) {
dir := t.TempDir()
if err := os.WriteFile(filepath.Join(dir, "not-a-plugin.txt"), []byte("x"), 0600); err != nil {
t.Fatal(err)
func TestParseManifestYAML_MinimalFields(t *testing.T) {
// Only name field; all others should be zero-value.
data := []byte(`name: minimal-plugin`)
info := parseManifestYAML("fallback", data)
if info.Name != "fallback" {
t.Errorf("Name = %q; want fallback", info.Name)
}
h := NewPluginsHandler(dir, nil, nil)
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list (files ignored), got %d", len(got))
if info.Version != "" {
t.Errorf("Version = %q; want empty", info.Version)
}
if info.Description != "" {
t.Errorf("Description = %q; want empty", info.Description)
}
if len(info.Tags) != 0 {
t.Errorf("Tags = %v; want []", info.Tags)
}
if len(info.Skills) != 0 {
t.Errorf("Skills = %v; want []", info.Skills)
}
if len(info.Runtimes) != 0 {
t.Errorf("Runtimes = %v; want []", info.Runtimes)
}
}
func TestListRegistry_SinglePlugin(t *testing.T) {
dir := t.TempDir()
pluginDir := filepath.Join(dir, "my-plugin")
if err := os.Mkdir(pluginDir, 0755); err != nil {
t.Fatal(err)
func TestParseManifestYAML_MissingPluginYAML(t *testing.T) {
// No plugin.yaml present → returns fallback name only.
info := parseManifestYAML("no-file", nil)
if info.Name != "no-file" {
t.Errorf("Name = %q; want no-file", info.Name)
}
if err := os.WriteFile(filepath.Join(pluginDir, "plugin.yaml"), []byte("name: my-plugin\nversion: 1.0.0\n"), 0600); err != nil {
t.Fatal(err)
}
h := NewPluginsHandler(dir, nil, nil)
}
func TestParseManifestYAML_BadYAML(t *testing.T) {
// Malformed YAML → returns fallback name only (no panic).
info := parseManifestYAML("bad-yaml", []byte("not: [yaml: at all"))
if info.Name != "bad-yaml" {
t.Errorf("Name = %q; want bad-yaml", info.Name)
}
if info.Version != "" {
t.Errorf("Version = %q; want empty after bad YAML", info.Version)
}
}
func TestParseManifestYAML_PartialFields(t *testing.T) {
// Present tags/skills/runtimes that are not []interface{} (e.g. wrong type)
// should not panic and should leave the field empty.
data := []byte(`
name: partial
tags: "not-an-array"
skills: 123
runtimes: true
`)
info := parseManifestYAML("partial", data)
if info.Name != "partial" {
t.Errorf("Name = %q; want partial", info.Name)
}
if len(info.Tags) != 0 {
t.Errorf("Tags = %v; want [] (wrong type)", info.Tags)
}
if len(info.Skills) != 0 {
t.Errorf("Skills = %v; want [] (wrong type)", info.Skills)
}
if len(info.Runtimes) != 0 {
t.Errorf("Runtimes = %v; want [] (wrong type)", info.Runtimes)
}
}
// -------- listRegistryFiltered --------
func makeTestHandler(t *testing.T, pluginsDir string) *PluginsHandler {
// Construct a minimal PluginsHandler with a nil docker client
// (filesystem paths are tested directly; container-dependent paths are
// tested separately or skipped in this file).
h := &PluginsHandler{pluginsDir: pluginsDir}
return h
}
func writePluginYAML(t *testing.T, dir, name, content string) {
path := filepath.Join(dir, name, "plugin.yaml")
if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
t.Fatalf("mkdir: %v", err)
}
if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
t.Fatalf("writeFile: %v", err)
}
}
func TestListRegistryFiltered_EmptyDir(t *testing.T) {
dir := t.TempDir()
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list for empty dir; got %d plugins", len(got))
}
}
func TestListRegistryFiltered_NonExistentDir(t *testing.T) {
h := makeTestHandler(t, "/does/not/exist")
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list for nonexistent dir; got %d plugins", len(got))
}
}
func TestListRegistryFiltered_NoPluginYAML(t *testing.T) {
// Plugin directory exists but has no plugin.yaml → fallback name only.
dir := t.TempDir()
writePluginYAML(t, dir, "no-manifest-plugin", "")
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("")
if len(got) != 1 {
t.Fatalf("expected 1 plugin, got %d", len(got))
t.Fatalf("expected 1 plugin; got %d", len(got))
}
if got[0].Name != "my-plugin" {
t.Errorf("expected name 'my-plugin', got %q", got[0].Name)
if got[0].Name != "no-manifest-plugin" {
t.Errorf("Name = %q; want no-manifest-plugin", got[0].Name)
}
}
func TestListRegistry_FiltersByRuntime(t *testing.T) {
func TestListRegistryFiltered_ValidPlugin(t *testing.T) {
dir := t.TempDir()
for _, spec := range []struct{ name, yaml string }{
{"runtime-a", "name: runtime-a\nruntimes:\n - claude-code\n"},
{"runtime-b", "name: runtime-b\nruntimes:\n - hermes\n"},
{"universal", "name: universal\nversion: 1.0.0\n"},
} {
pd := filepath.Join(dir, spec.name)
if err := os.Mkdir(pd, 0755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(pd, "plugin.yaml"), []byte(spec.yaml), 0600); err != nil {
t.Fatal(err)
}
writePluginYAML(t, dir, "molecule-audit", `
name: molecule-audit
version: 1.0.0
description: Security audit plugin
author: Molecule AI
tags:
- security
skills:
- audit
runtimes:
- hermes
`)
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("")
if len(got) != 1 {
t.Fatalf("expected 1 plugin; got %d", len(got))
}
h := NewPluginsHandler(dir, nil, nil)
// Filter to claude-code: runtime-a matches, universal (no runtimes field)
// is always included per supportsRuntime semantics.
got := h.listRegistryFiltered("claude-code")
if len(got) != 2 {
t.Fatalf("expected 2 (runtime-a + universal), got %d: %v", len(got), func() []string {
ns := make([]string, len(got))
for i, p := range got { ns[i] = p.Name }
return ns
}())
if got[0].Name != "molecule-audit" {
t.Errorf("Name = %q; want molecule-audit", got[0].Name)
}
if got[0].Version != "1.0.0" {
t.Errorf("Version = %q; want 1.0.0", got[0].Version)
}
if len(got[0].Tags) != 1 || got[0].Tags[0] != "security" {
t.Errorf("Tags = %v; want [security]", got[0].Tags)
}
}
func TestListRegistry_PluginWithNoRuntimeDeclarations_AlwaysIncluded(t *testing.T) {
func TestListRegistryFiltered_FilesIgnored(t *testing.T) {
// Regular files in pluginsDir are skipped (only directories are scanned).
dir := t.TempDir()
pd := filepath.Join(dir, "universal-plugin")
if err := os.Mkdir(pd, 0755); err != nil {
writePluginYAML(t, dir, "real-plugin", `
name: real-plugin
version: 1.0.0
`)
f, err := os.Create(filepath.Join(dir, "not-a-plugin.txt"))
if err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(pd, "plugin.yaml"), []byte("name: universal-plugin\nversion: 1.0.0\n"), 0600); err != nil {
t.Fatal(err)
f.Close()
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("")
if len(got) != 1 || got[0].Name != "real-plugin" {
t.Errorf("expected only real-plugin; got %v", got)
}
h := NewPluginsHandler(dir, nil, nil)
}
// When plugin declares no runtimes, it should always be included (try-it).
func TestListRegistryFiltered_RuntimeFilterMatches(t *testing.T) {
dir := t.TempDir()
writePluginYAML(t, dir, "cc-plugin", `
name: cc-plugin
runtimes: [claude_code]
`)
writePluginYAML(t, dir, "hermes-plugin", `
name: hermes-plugin
runtimes: [hermes]
`)
h := makeTestHandler(t, dir)
// With hermes filter → only hermes-plugin returned.
got := h.listRegistryFiltered("hermes")
if len(got) != 1 || got[0].Name != "hermes-plugin" {
t.Errorf("expected [hermes-plugin]; got %v", got)
}
// With claude-code filter → hyphen normalises to underscore → cc-plugin returned.
got2 := h.listRegistryFiltered("claude-code")
if len(got2) != 1 || got2[0].Name != "cc-plugin" {
t.Errorf("expected [cc-plugin] with claude-code filter; got %v", got2)
}
}
func TestListRegistryFiltered_RuntimeFilterExcludes(t *testing.T) {
// Plugin declares hermes; query asks for claude-code → plugin excluded.
dir := t.TempDir()
writePluginYAML(t, dir, "hermes-only", `
name: hermes-only
runtimes: [hermes]
`)
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("claude_code")
if len(got) != 0 {
t.Errorf("expected empty list for mismatched runtime; got %v", got)
}
}
func TestListRegistryFiltered_UnspecifiedRuntimeIncluded(t *testing.T) {
// Plugin with no runtimes field is included in any filtered query
// ("unspecified = try it" contract).
dir := t.TempDir()
writePluginYAML(t, dir, "universal-plugin", `
name: universal-plugin
runtimes: []
`)
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("any-runtime")
if len(got) != 1 {
t.Errorf("expected 1 plugin (unspecified runtime), got %d", len(got))
if len(got) != 1 || got[0].Name != "universal-plugin" {
t.Errorf("expected [universal-plugin] with any runtime filter; got %v", got)
}
}
func TestListRegistry_ReadDirError_ReturnsEmpty(t *testing.T) {
h := NewPluginsHandler("/nonexistent/path/for/plugins", nil, nil)
got := h.listRegistryFiltered("")
if len(got) != 0 {
t.Errorf("expected empty list on ReadDir error, got %d", len(got))
}
}
func TestListRegistry_HTTPEndpoint(t *testing.T) {
func TestListRegistryFiltered_MultipleMatching(t *testing.T) {
dir := t.TempDir()
pd := filepath.Join(dir, "test-plugin")
if err := os.Mkdir(pd, 0755); err != nil {
t.Fatal(err)
for _, name := range []string{"plugin-a", "plugin-b", "plugin-c"} {
writePluginYAML(t, dir, name, `name: `+name+`
runtimes: [hermes, claude_code]
`)
}
if err := os.WriteFile(filepath.Join(pd, "plugin.yaml"), []byte("name: test-plugin\nversion: 2.0.0\n"), 0600); err != nil {
t.Fatal(err)
h := makeTestHandler(t, dir)
got := h.listRegistryFiltered("hermes")
if len(got) != 3 {
t.Errorf("expected 3 plugins; got %d: %v", len(got), got)
}
h := NewPluginsHandler(dir, nil, nil)
}
// -------- ListRegistry (GET /plugins) --------
func listRegistryReq(runtime string) (*http.Request, *httptest.ResponseRecorder, *gin.Context) {
url := "/plugins"
if runtime != "" {
url += "?runtime=" + runtime
}
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = httptest.NewRequest("GET", "/plugins", nil)
h.ListRegistry(c)
c.Request = httptest.NewRequest("GET", url, nil)
return c.Request, w, c
}
func TestListRegistry_NoFilter(t *testing.T) {
dir := t.TempDir()
writePluginYAML(t, dir, "test-plugin", `
name: test-plugin
version: 0.1.0
`)
h := makeTestHandler(t, dir)
_, w, c := listRegistryReq("")
h.ListRegistry(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
t.Errorf("expected 200; got %d: %s", w.Code, w.Body.String())
}
var plugins []pluginInfo
if err := json.Unmarshal(w.Body.Bytes(), &plugins); err != nil {
t.Fatalf("failed to parse JSON: %v", err)
var resp []map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if len(plugins) != 1 {
t.Errorf("expected 1 plugin, got %d", len(plugins))
}
if plugins[0].Name != "test-plugin" {
t.Errorf("expected name 'test-plugin', got %q", plugins[0].Name)
if len(resp) != 1 || resp[0]["name"] != "test-plugin" {
t.Errorf("unexpected response: %v", resp)
}
}
func TestListRegistry_WithRuntimeFilter(t *testing.T) {
dir := t.TempDir()
writePluginYAML(t, dir, "hermes-plugin", `
name: hermes-plugin
runtimes: [hermes]
`)
writePluginYAML(t, dir, "cc-plugin", `
name: cc-plugin
runtimes: [claude_code]
`)
h := makeTestHandler(t, dir)
_, w, c := listRegistryReq("hermes")
h.ListRegistry(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200; got %d", w.Code)
}
var resp []map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
if len(resp) != 1 || resp[0]["name"] != "hermes-plugin" {
t.Errorf("expected [hermes-plugin]; got %v", resp)
}
}
func TestListRegistry_EmptyOnNoMatches(t *testing.T) {
dir := t.TempDir()
writePluginYAML(t, dir, "cc-plugin", `name: cc-plugin
runtimes: [claude_code]
`)
h := makeTestHandler(t, dir)
_, w, c := listRegistryReq("nonexistent")
h.ListRegistry(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200; got %d", w.Code)
}
var resp []map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
if len(resp) != 0 {
t.Errorf("expected empty list; got %v", resp)
}
}
// -------- ListAvailableForWorkspace (GET /workspaces/:id/plugins/available) --------
func listAvailableReq(workspaceID string) (*http.Request, *httptest.ResponseRecorder, *gin.Context) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: workspaceID}}
c.Request = httptest.NewRequest("GET", "/workspaces/"+workspaceID+"/plugins/available", nil)
return c.Request, w, c
}
func TestListAvailableForWorkspace_RuntimeLookupReturnsRuntime(t *testing.T) {
dir := t.TempDir()
writePluginYAML(t, dir, "hermes-plugin", `
name: hermes-plugin
runtimes: [hermes]
`)
writePluginYAML(t, dir, "cc-plugin", `
name: cc-plugin
runtimes: [claude_code]
`)
h := makeTestHandler(t, dir)
h.runtimeLookup = func(workspaceID string) (string, error) {
return "hermes", nil
}
_, w, c := listAvailableReq("00000000-0000-0000-0000-000000000001")
h.ListAvailableForWorkspace(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200; got %d", w.Code)
}
var resp []map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
if len(resp) != 1 || resp[0]["name"] != "hermes-plugin" {
t.Errorf("expected [hermes-plugin]; got %v", resp)
}
}
func TestListAvailableForWorkspace_RuntimeLookupErrors(t *testing.T) {
// runtimeLookup error → runtime="" → full registry returned.
dir := t.TempDir()
writePluginYAML(t, dir, "plugin-a", `name: plugin-a
runtimes: [hermes]
`)
writePluginYAML(t, dir, "plugin-b", `name: plugin-b
runtimes: [claude_code]
`)
h := makeTestHandler(t, dir)
h.runtimeLookup = func(workspaceID string) (string, error) {
return "", errors.New("runtime lookup failed")
}
_, w, c := listAvailableReq("00000000-0000-0000-0000-000000000002")
h.ListAvailableForWorkspace(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200; got %d", w.Code)
}
var resp []map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
if len(resp) != 2 {
t.Errorf("expected 2 plugins (full registry fallback); got %d: %v", len(resp), resp)
}
}
func TestListAvailableForWorkspace_NoRuntimeLookup(t *testing.T) {
// runtimeLookup nil → full registry (no filter).
dir := t.TempDir()
writePluginYAML(t, dir, "plugin-x", `name: plugin-x`)
h := makeTestHandler(t, dir)
// runtimeLookup is nil by default from makeTestHandler.
_, w, c := listAvailableReq("00000000-0000-0000-0000-000000000003")
h.ListAvailableForWorkspace(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200; got %d", w.Code)
}
var resp []map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
if len(resp) != 1 || resp[0]["name"] != "plugin-x" {
t.Errorf("expected [plugin-x]; got %v", resp)
}
}
func TestListAvailableForWorkspace_UnspecifiedRuntimePluginsAlwaysIncluded(t *testing.T) {
// Plugins with empty runtimes list should always be included
// regardless of workspace runtime.
dir := t.TempDir()
writePluginYAML(t, dir, "universal", `name: universal
runtimes: []
`)
writePluginYAML(t, dir, "cc-only", `name: cc-only
runtimes: [claude_code]
`)
h := makeTestHandler(t, dir)
h.runtimeLookup = func(id string) (string, error) { return "hermes", nil }
_, w, c := listAvailableReq("ws-001")
h.ListAvailableForWorkspace(c)
var resp []map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
// "universal" has no runtimes (try-it); "cc-only" doesn't support hermes.
if len(resp) != 1 || resp[0]["name"] != "universal" {
t.Errorf("expected [universal]; got %v", resp)
}
}
@@ -86,40 +86,6 @@ var fallbackRuntimes = map[string]struct{}{
"mock": {},
}
// stripJSON5Comments removes // single-line comments from JSON5-formatted
// data. The Integration Tester appends "// Triggered by <job>" to
// manifest.json after cloning, which causes json.Unmarshal to fail with
// "invalid character '/'". This strips trailing and mid-file comments
// before parsing so Go's strict JSON parser accepts JSON5 files.
//
// Handles:
// - Standalone comment lines: // comment
// - Trailing comments: "key": "value", // comment
// - Comments inside strings are NOT touched ("http://example.com")
func stripJSON5Comments(data []byte) []byte {
var result []byte
inString := false
i := 0
for i < len(data) {
if data[i] == '"' && (i == 0 || data[i-1] != '\\') {
inString = !inString
result = append(result, data[i])
i++
continue
}
if !inString && i+1 < len(data) && data[i] == '/' && data[i+1] == '/' {
// Skip to end of line
for i < len(data) && data[i] != '\n' {
i++
}
continue
}
result = append(result, data[i])
i++
}
return result
}
// loadRuntimesFromManifest builds the runtime allowlist from
// manifest.json. Each workspace_templates[].name is normalized to its
// base runtime identifier (strips the `-default` suffix templates
@@ -135,9 +101,6 @@ func loadRuntimesFromManifest(path string) (map[string]struct{}, error) {
if err != nil {
return nil, err
}
// Strip JSON5 // comments before parsing. The Integration Tester
// appends "// Triggered by <job>" to manifest.json after cloning.
data = stripJSON5Comments(data)
var m manifestFile
if err := json.Unmarshal(data, &m); err != nil {
return nil, err
@@ -8,7 +8,6 @@ package handlers
// fallback (tested at the initKnownRuntimes level via integration).
import (
"encoding/json"
"os"
"path/filepath"
"testing"
@@ -112,133 +111,3 @@ func keys(m map[string]struct{}) []string {
}
return out
}
// ── stripJSON5Comments tests ─────────────────────────────────────────────────
func TestStripJSON5Comments_Standalone(t *testing.T) {
// Whitespace before the // is preserved; only the // and its text are removed.
// The result is still valid JSON.
input := "{\n\t// This is a comment\n\t\"workspace_templates\": []\n}"
got := string(stripJSON5Comments([]byte(input)))
// Stripping should produce valid JSON: try parsing it
var m manifestFile
if err := json.Unmarshal([]byte(got), &m); err != nil {
t.Errorf("output is not valid JSON: %v\ngot: %q", err, got)
}
if m.WorkspaceTemplates == nil {
t.Error("WorkspaceTemplates field should be present after parsing")
}
}
func TestStripJSON5Comments_Trailing(t *testing.T) {
input := `{"key": "value"} // trailing comment`
got := string(stripJSON5Comments([]byte(input)))
want := `{"key": "value"} `
if got != want {
t.Errorf("trailing comment:\ngot: %q\nwant: %q", got, want)
}
}
func TestStripJSON5Comments_URLsPreserved(t *testing.T) {
// URLs with // in them must NOT be stripped
input := `{"url": "https://example.com/path"}`
got := string(stripJSON5Comments([]byte(input)))
if got != input {
t.Errorf("URL stripped: got %q, want %q", got, input)
}
}
func TestStripJSON5Comments_InlineComment(t *testing.T) {
// Whitespace before // is preserved; only the comment text is removed.
// Result must be valid JSON.
input := "{\n\t\"workspace_templates\": [] // inline comment\n}"
got := string(stripJSON5Comments([]byte(input)))
var m manifestFile
if err := json.Unmarshal([]byte(got), &m); err != nil {
t.Errorf("output is not valid JSON: %v\ngot: %q", err, got)
}
}
func TestStripJSON5Comments_IntegrationTesterAppend(t *testing.T) {
// Simulates what the Integration Tester does: appends // Triggered by ...
// to the end of a valid JSON file.
input := `{
"workspace_templates": [
{"name": "hermes", "repo": "org/hermes"}
]
}
// Triggered by e2e-test job 12345
`
got := string(stripJSON5Comments([]byte(input)))
if got == input {
t.Error("Integration Tester comment was NOT stripped")
}
// After stripping, it should be valid JSON
var m manifestFile
if err := json.Unmarshal([]byte(got), &m); err != nil {
t.Errorf("stripped content is not valid JSON: %v\ngot: %q", err, got)
}
if len(m.WorkspaceTemplates) != 1 || m.WorkspaceTemplates[0].Name != "hermes" {
t.Errorf("workspace_templates not parsed: %+v", m.WorkspaceTemplates)
}
}
func TestStripJSON5Comments_NoComments(t *testing.T) {
input := `{"workspace_templates": [{"name": "hermes"}]}`
got := string(stripJSON5Comments([]byte(input)))
if got != input {
t.Errorf("unmodified JSON changed: got %q", got)
}
}
// ── loadRuntimesFromManifest with JSON5 comments ─────────────────────────────
func TestLoadRuntimesFromManifest_WithJSON5TrailingComment(t *testing.T) {
// Regression: Integration Tester appends "// Triggered by ..." to manifest.json
// after cloning. Before the fix, this caused json.Unmarshal to fail with
// "invalid character '/'". After the fix, the comment is stripped first.
dir := t.TempDir()
path := filepath.Join(dir, "manifest.json")
manifest := `{
"workspace_templates": [
{"name": "hermes", "repo": "org/hermes"},
{"name": "claude-code-default", "repo": "org/cc"}
]
}
// Triggered by e2e-test job
`
if err := os.WriteFile(path, []byte(manifest), 0600); err != nil {
t.Fatalf("write: %v", err)
}
got, err := loadRuntimesFromManifest(path)
if err != nil {
t.Fatalf("loadRuntimesFromManifest with trailing comment: %v", err)
}
for _, must := range []string{"hermes", "claude-code"} {
if _, ok := got[must]; !ok {
t.Errorf("expected runtime %q in result: %v", must, keys(got))
}
}
}
func TestLoadRuntimesFromManifest_WithInlineJSON5Comment(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "manifest.json")
// JSON5 with inline comment
manifest := `{
// runtime templates
"workspace_templates": [
{"name": "langgraph", "repo": "org/lg"} // the default
]
}`
if err := os.WriteFile(path, []byte(manifest), 0600); err != nil {
t.Fatalf("write: %v", err)
}
got, err := loadRuntimesFromManifest(path)
if err != nil {
t.Fatalf("loadRuntimesFromManifest with inline comment: %v", err)
}
if _, ok := got["langgraph"]; !ok {
t.Errorf("expected langgraph in result: %v", keys(got))
}
}
@@ -34,13 +34,11 @@ import (
// BroadcastHandler is constructed once and shared across requests.
type BroadcastHandler struct {
broadcaster events.EventEmitter
broadcaster *events.Broadcaster
}
// NewBroadcastHandler creates a BroadcastHandler.
// The emitter is any EventEmitter — the concrete *Broadcaster in production,
// or a test double in unit tests.
func NewBroadcastHandler(b events.EventEmitter) *BroadcastHandler {
func NewBroadcastHandler(b *events.Broadcaster) *BroadcastHandler {
return &BroadcastHandler{broadcaster: b}
}
@@ -67,6 +67,7 @@ func TestBroadcast_OrgScopedRecipients(t *testing.T) {
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
@@ -205,7 +206,7 @@ func TestBroadcast_Disabled(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000003"
senderID := "00000000-0000-0000-0000-000000000001"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Disabled Agent", false))
@@ -236,7 +237,7 @@ func TestBroadcast_EmptyOrg_NoRecipients(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000004" // org root, only workspace in org
senderID := "00000000-0000-0000-0000-000000000001" // org root, only workspace in org
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -296,12 +297,33 @@ func TestBroadcast_InvalidWorkspaceID(t *testing.T) {
}
}
func TestBroadcast_MissingMessage(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-000000000001"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-000000000001/broadcast", bytes.NewBufferString("{}"))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
// TestBroadcast_OrgRootLookupFails verifies that if the recursive CTE for
// finding the org root errors, the handler returns 500 instead of proceeding
// with an un-scoped query that would broadcast to all orgs.
func TestBroadcast_OrgRootLookupFails(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000005"
senderID := "00000000-0000-0000-0000-000000000001"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -331,13 +353,16 @@ func TestBroadcast_OrgRootLookupFails(t *testing.T) {
}
}
// TestBroadcast_OrgScoped_SelfBroadcastExcluded verifies that broadcasting
// from a workspace does not send a broadcast_receive to the sender itself
// (the sender logs broadcast_sent, not broadcast_receive).
func TestBroadcast_OrgScoped_SelfBroadcastExcluded(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000006"
peerID := "00000000-0000-0000-0000-000000000007"
senderID := "00000000-0000-0000-0000-000000000001"
peerID := "00000000-0000-0000-0000-000000000002"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -374,145 +399,10 @@ func TestBroadcast_OrgScoped_SelfBroadcastExcluded(t *testing.T) {
}
}
// TestBroadcast_RecipientActivityLogFails_SkipsAndContinues: if one recipient's
// activity_log insert fails, the handler logs the error and continues to the
// next recipient rather than aborting the whole broadcast.
func TestBroadcast_RecipientActivityLogFails_SkipsAndContinues(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000008"
peerA := "00000000-0000-0000-0000-000000000009"
peerB := "00000000-0000-0000-0000-00000000000a"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Resilient Agent", true))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(senderID))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID, senderID).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(peerA).AddRow(peerB))
// Peer A fails — handler logs and continues
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerA, senderID, sqlmock.AnyArg()).
WillReturnError(context.DeadlineExceeded)
// Peer B succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerB, senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Sender log succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: senderID}}
body := `{"message":"partial delivery"}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+senderID+"/broadcast", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
// Only peerB was delivered
if int(resp["delivered"].(float64)) != 1 {
t.Errorf("expected delivered=1, got %v", resp["delivered"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet expectations: %v", err)
}
}
// TestBroadcast_SenderActivityLogFails_StillReturns200: if the sender's own
// broadcast_sent activity_log insert fails, the handler still returns 200
// so the caller doesn't retry a broadcast that already partially delivered.
func TestBroadcast_SenderActivityLogFails_StillReturns200(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-00000000000b"
peerA := "00000000-0000-0000-0000-00000000000c"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Log-Fail Agent", true))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(senderID))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID, senderID).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(peerA))
// Peer log succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerA, senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Sender log FAILS
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(senderID, sqlmock.AnyArg()).
WillReturnError(context.DeadlineExceeded)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: senderID}}
body := `{"message":"log fail test"}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+senderID+"/broadcast", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200 even on sender log failure, got %d: %s", w.Code, w.Body.String())
}
}
func TestBroadcast_MissingMessage(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-00000000000d"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-00000000000d/broadcast", bytes.NewBufferString("{}"))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
func TestBroadcast_MissingBody(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-00000000000e"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-00000000000e/broadcast", nil)
// no Content-Type and no body
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
// TestBroadcast_Truncate tests that messages are truncated with the Unicode ellipsis
// character (U+2026) when len(msg) > max. The truncated output is max runes + "…".
// TestBroadcast_Truncate tests that messages are truncated with the Unicode ellipsis
// character (U+2026) when len(msg) > max. The truncated output is max runes + "…",
// so truncating a 48-char string at max=20 produces 21 characters (20 runes + "…").
func TestBroadcast_Truncate(t *testing.T) {
cases := []struct {
msg string
@@ -520,18 +410,14 @@ func TestBroadcast_Truncate(t *testing.T) {
expect string
}{
{"short", 120, "short"}, // under max — no truncation
// exactly 120 chars → unchanged
// exactly120chars (15) + 105 ones = 120 chars; at max=120 → unchanged
{"exactly120chars1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111", 120, "exactly120chars111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111…"},
// 21 runes at max=20 → 20 + "…" = 21 chars
// "this is a longer mes" = 20 runes; + "…" = 21 chars
{"this is a longer message that needs truncating", 20, "this is a longer mes…"},
// at-max boundary: 20 chars at max=20 → no truncation
{"exactly twenty chars", 20, "exactly twenty chars"},
// over max: 11 chars at max=10 → 10 + "…" = 11
{"hello world!", 10, "hello worl…"},
// Unicode: 3-rune string at max=3 → unchanged
{"日本語", 3, "日本語"},
// Empty string → unchanged
{"", 120, ""},
}
for _, tc := range cases {
result := broadcastTruncate(tc.msg, tc.max)
@@ -37,7 +37,6 @@ package handlers
import (
"context"
"errors"
"fmt"
"log"
"path/filepath"
@@ -133,10 +132,6 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// a workspace_secret named GIT_AUTHOR_NAME can override.
applyAgentGitIdentity(envVars, payload.Name)
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
// SSOT for chat-upload limits — see chat_files.go::chatUploadMaxBytes.
// Injecting via env keeps the Python workspace runtime caps in
// lock-step with the Go cap on every provision. Fixes #1520.
applyChatUploadLimits(envVars)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
}
@@ -228,28 +223,3 @@ func (h *WorkspaceHandler) markProvisionFailed(ctx context.Context, workspaceID,
log.Printf("markProvisionFailed: db update failed for %s: %v", workspaceID, dbErr)
}
}
// applyChatUploadLimits seeds the chat-upload cap env vars on the
// workspace container so the Python /internal/chat/uploads/ingest
// handler parses the multipart form with the same per-file allowance
// that the Go proxy enforces.
//
// Why env-driven (and not, say, a hard-coded Python constant): keeping
// one Go constant as the source of truth and forwarding it lets
// operations bump the cap by editing one file + redeploy, instead of
// editing two files in two languages and risking the drift that
// shipped #1520 (Go cap 50 MB, Python parser cap 1 MiB — Starlette
// default — so a 5 MB image always 400'd on parse before per-file
// enforcement could fire).
//
// Pre-existing env wins. If something downstream (a tenant override,
// a plugin mutator, an A/B experiment) has already set either var,
// we leave it alone. Default-only injection.
func applyChatUploadLimits(envVars map[string]string) {
if _, set := envVars["CHAT_UPLOAD_MAX_FILE_BYTES"]; !set {
envVars["CHAT_UPLOAD_MAX_FILE_BYTES"] = fmt.Sprintf("%d", chatUploadMaxFileBytes)
}
if _, set := envVars["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; !set {
envVars["CHAT_UPLOAD_MAX_TOTAL_BYTES"] = fmt.Sprintf("%d", chatUploadMaxBytes)
}
}
-42
View File
@@ -599,28 +599,6 @@ def _sanitize_for_external(msg: str) -> str:
import re as _re
msg = _re.sub(r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}", "[REDACTED]", msg)
# Bare provider key with NO separator after the prefix — a real
# `sk-ant-api03-…` / `sk-…` key uses `-` (not `[ :=]`) so the rule
# above misses it. Require ≥24 key-ish chars after the `sk-`/`sk-ant-`
# prefix so curated examples like `sk-ant-EXAMPLE-SHORT` (13 chars
# after `sk-ant-`) still pass through un-redacted.
msg = _re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
# JSON-quoted credential values: {"token": "…"} / {"apiKey": "…"} /
# {"secret": "…"} / {"password": "…"}. Redact only the value, and only
# when it is ≥24 chars so a short curated sample like
# `"api_key": "sk-ant-EXAMPLE-SHORT"` (20-char value) still passes.
msg = _re.sub(
r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
r"\1[REDACTED]\2",
msg,
)
# AWS secret access key in `aws_secret_access_key=…` form (env dumps,
# boto tracebacks). The base64-ish value runs until whitespace/quote.
msg = _re.sub(
r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+",
r"\1[REDACTED]",
msg,
)
# Absolute paths: /etc/shadow, /home/user/.aws/credentials, etc.
msg = _re.sub(r"(?:/[^/\s]+){2,}", lambda m: m.group(0) if len(m.group(0)) < 60 else "[REDACTED_PATH]", msg)
return msg
@@ -630,7 +608,6 @@ def sanitize_agent_error(
exc: BaseException | None = None,
category: str | None = None,
stderr: str | None = None,
reason: str | None = None,
) -> str:
"""Render an agent-side failure into a user-safe error message.
@@ -638,18 +615,6 @@ def sanitize_agent_error(
category string (e.g. from `classify_subprocess_error`). If both are
given, `category` wins. If neither, the tag defaults to "unknown".
When ``reason`` is provided (internal#211/#212), it is a *pre-curated,
user-actionable, secret-safe* explanation built by the caller from a
provider-side failure — e.g. a 403 "Your organization has disabled
Claude subscription access · Use an Anthropic API key instead, or ask
your admin to enable access" with error code ``oauth_org_not_allowed``.
This text is exactly what the user needs to self-serve, so it is
surfaced VERBATIM as the message instead of being collapsed to the
opaque exception class name. It still passes through the
key/token/bearer/path scrubber as a belt-and-braces second pass so a
buggy caller can't leak a credential that snuck into the reason.
``reason`` wins over ``stderr``; both lose to neither being set.
When ``stderr`` is provided (e.g. the first ~1 KB of a subprocess stderr
or HTTP error body), it is sanitized and appended to the output so the
A2A caller gets actionable context without needing to dig through workspace
@@ -664,13 +629,6 @@ def sanitize_agent_error(
else:
tag = "unknown"
if reason:
# Curated, user-actionable reason — surface it as the message.
# Still scrub: a 403/auth/quota message is safe, but the scrubber
# is cheap insurance against a caller that didn't curate cleanly.
clean = _sanitize_for_external(reason[:_MAX_STDERR_PREVIEW])
return f"Agent error ({tag}): {clean}"
if stderr:
# Truncate and sanitize before including — prevents DoS via
# a malicious or buggy peer injecting a huge error body, and
+9 -66
View File
@@ -26,14 +26,9 @@ Path safety:
a colliding name fails fast (the random prefix already makes
collisions astronomical, but defense-in-depth costs nothing).
Limits (SSOT — matches the Go contract from chat_files.go, injected
via CHAT_UPLOAD_MAX_TOTAL_BYTES / CHAT_UPLOAD_MAX_FILE_BYTES at
provision time; falls back to legacy 50 MB / 25 MB when env unset):
- CHAT_UPLOAD_MAX_TOTAL_BYTES total request body (default 50 MB)
- CHAT_UPLOAD_MAX_FILE_BYTES per file (default 25 MB)
ALSO passed as Starlette ``max_part_size`` to override the
Starlette-1.0 default of 1 MiB which silently 400'd every
upload > 1 MiB before #1520 fix.
Limits (matches the Go contract from chat_files.go):
- 50 MB total request body
- 25 MB per file
- filename truncated to 100 chars
Response shape:
@@ -66,47 +61,14 @@ logger = logging.getLogger(__name__)
# keeps working unchanged.
CHAT_UPLOAD_DIR = "/workspace/.molecule/chat-uploads"
def _env_int(name: str, default: int) -> int:
"""Parse an int from the environment, falling back to ``default``.
Mis-formatted values (anything ``int()`` rejects) fall back to the
default rather than crashing module import — operations needs to be
able to roll back a bad env-var push by simply removing the var,
not by also fixing a worker that won't boot.
"""
raw = os.environ.get(name)
if not raw:
return default
try:
return int(raw)
except (TypeError, ValueError):
logger.warning("internal_chat_uploads: env %s=%r not an int; using default %d", name, raw, default)
return default
# Total-request body cap. multipart/form-data with multiple parts can
# add ~100 bytes of framing per file; the cap is the bytes hitting the
# socket, including framing.
#
# SSOT (issue #1520): the source of truth is the Go constant
# chatUploadMaxBytes in workspace-server/internal/handlers/chat_files.go,
# exported to the workspace container as CHAT_UPLOAD_MAX_TOTAL_BYTES at
# provision time (workspace_provision_shared.go::applyChatUploadLimits).
# Unset env → keep the previous 50 MB default so an unprovisioned /
# locally-run workspace does NOT regress.
CHAT_UPLOAD_MAX_BYTES = _env_int("CHAT_UPLOAD_MAX_TOTAL_BYTES", 50 * 1024 * 1024)
CHAT_UPLOAD_MAX_BYTES = 50 * 1024 * 1024 # 50 MB
# Per-file cap. SSOT (issue #1520): exported from the Go side as
# CHAT_UPLOAD_MAX_FILE_BYTES; default 25 MB if env is unset so an older
# workspace provisioned before the env-injection landed keeps the
# legacy ceiling.
#
# This value is ALSO passed as Starlette's ``max_part_size`` (see
# ingest_handler below) — Starlette 1.0 defaults max_part_size to
# **1 MiB**, which is the actual root cause of #1520: any single file
# part above 1 MiB raised MultiPartException before per-file enforcement
# could fire. Wiring max_part_size to the same cap as per-file means
# the user-visible ceiling is exactly the per-file cap, no surprises.
CHAT_UPLOAD_MAX_FILE_BYTES = _env_int("CHAT_UPLOAD_MAX_FILE_BYTES", 25 * 1024 * 1024)
# Per-file cap. Keeping per-file under total lets a user attach, say,
# a 5 MB PDF + 10 small screenshots in a single batch.
CHAT_UPLOAD_MAX_FILE_BYTES = 25 * 1024 * 1024 # 25 MB
# Conservative {alnum, dot, underscore, dash} character class — anything
# outside gets rewritten so embedded paths, control chars, newlines,
@@ -184,30 +146,11 @@ async def ingest_handler(request: Request) -> JSONResponse:
status_code=413,
)
# max_part_size: Starlette 1.0 defaults to 1 MiB. Any single
# part above that raises MultiPartException BEFORE per-file
# enforcement can run — which silently broke every chat upload
# > 1 MiB (issue #1520, fleet-wide P0 2026-05-18). Wire it to
# the per-file cap so the user-visible ceiling matches what
# the per-file 413 path expects.
try:
form = await request.form(
max_files=64,
max_fields=32,
max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES,
)
form = await request.form(max_files=64, max_fields=32)
except Exception as exc: # multipart parse error
logger.warning("internal_chat_uploads: multipart parse failed: %s", exc)
# Surface the exception detail (feedback_surface_actionable_failure_reason_to_user):
# MultiPartException strings ("Part exceeded maximum size of …",
# "Invalid boundary", "Too many parts", etc.) contain no secrets
# — they describe shape, not content. The 200-char cap is
# belt-and-braces against an exception class we haven't seen
# whose ``str()`` is unbounded.
return JSONResponse(
{"error": "failed to parse multipart form", "detail": str(exc)[:200]},
status_code=400,
)
return JSONResponse({"error": "failed to parse multipart form"}, status_code=400)
# Starlette's FormData allows multiple values per key — `files` may
# appear multiple times for batched uploads. getlist returns them
-117
View File
@@ -788,123 +788,6 @@ def test_sanitize_agent_error_stderr_combined_with_existing_tests():
assert "workspace logs" in out
# ─── reason passthrough (internal#211/#212: surface actionable provider error) ───
def test_sanitize_agent_error_reason_surfaced_verbatim():
"""A curated provider reason is shown to the user, not collapsed to the
exception class name. This is the internal#211 regression: a 403
org-disabled message must reach the canvas."""
reason = (
"provider HTTP 403 — oauth_org_not_allowed — Your organization has "
"disabled Claude subscription access for Claude Code · Use an "
"Anthropic API key instead, or ask your admin to enable access"
)
class _ResultErr(Exception):
pass
out = sanitize_agent_error(exc=_ResultErr("opaque"), reason=reason)
# The actionable provider guidance and status code must be visible.
assert "403" in out
assert "oauth_org_not_allowed" in out
assert "disabled Claude subscription access" in out
assert "ask your admin to enable access" in out
# NOT the old opaque form.
assert "see workspace logs" not in out
def test_sanitize_agent_error_reason_still_scrubs_secrets():
"""Even on the reason path the key/token scrubber runs — a buggy caller
that lets a bearer token into the reason still gets it redacted."""
leaky = (
"provider HTTP 401 — auth failed — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm please re-auth"
)
out = sanitize_agent_error(reason=leaky)
assert "[REDACTED]" in out
assert "PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm" not in out
# The non-secret guidance still survives the scrub.
assert "401" in out
assert "please re-auth" in out
def test_sanitize_agent_error_reason_scrubs_all_secret_formats():
"""The scrubber must redact every realistic credential shape — not just
the `Bearer <tok>` form the original test happened to exercise
(internal#212 review finding: bare `sk-ant-api03-…` keys, JSON-quoted
"token"/"apiKey" values, and `aws_secret_access_key=` all leaked).
All curated/actionable guidance must still survive the scrub.
"""
# 1. Bare sk-ant-api03 key — no `[ :=]` separator after the prefix
# (a real Anthropic key uses `-`), so the legacy regex missed it.
bare = (
"provider HTTP 401 — auth failed — invalid key "
"sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789 "
"please re-auth"
)
out = sanitize_agent_error(reason=bare)
assert "sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789" not in out
assert "[REDACTED]" in out
assert "401" in out # actionable status survives
assert "please re-auth" in out # actionable guidance survives
# 2. JSON-quoted "token" / "apiKey" values.
jblob = (
'provider error — config dump {"token": '
'"abcDEF0123456789ghIJKL0123456789mnopQRST", "apiKey": '
'"anon_fakefakefakefakefakefakefakefakefakefake"} — '
"use an API key instead"
)
out = sanitize_agent_error(reason=jblob)
assert "abcDEF0123456789ghIJKL0123456789mnopQRST" not in out
assert "anon_fakefakefakefakefakefakefakefakefakefake" not in out
assert "[REDACTED]" in out
assert "use an API key instead" in out # actionable guidance survives
# 3. aws_secret_access_key=… form.
awsblob = (
"provider HTTP 403 — boto credential error "
"aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY — "
"ask your admin to enable access"
)
out = sanitize_agent_error(reason=awsblob)
assert "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" not in out
assert "[REDACTED]" in out
assert "403" in out # actionable status survives
assert "ask your admin to enable access" in out # guidance survives
# 4. Regression: the original Bearer form still redacts.
# Uses PLACEHOLDER_LONG_TOKEN (>=40 chars, no sk-ant- prefix) to avoid
# triggering the secret-scan workflow pattern
# `sk-ant-[A-Za-z0-9_-]{40,}`.
bearer = (
"provider HTTP 401 — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij re-auth"
)
out = sanitize_agent_error(reason=bearer)
assert "PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij" not in out
assert "[REDACTED]" in out
assert "re-auth" in out
def test_sanitize_agent_error_reason_wins_over_stderr():
"""When both reason and stderr are passed, the curated reason wins."""
out = sanitize_agent_error(
reason="provider HTTP 403 — use an API key",
stderr="raw subprocess noise that should not be shown",
)
assert "use an API key" in out
assert "raw subprocess noise" not in out
def test_sanitize_agent_error_no_reason_unchanged():
"""Omitting reason preserves the original generic behavior."""
out = sanitize_agent_error(exc=ValueError("boom"))
assert "ValueError" in out
assert "workspace logs" in out
# ======================================================================
# classify_subprocess_error
@@ -299,122 +299,3 @@ def test_symlink_at_target_is_refused(client: TestClient, chat_uploads_dir: Path
assert r.status_code == 500, r.text
# Sentinel content unchanged — the symlink wasn't followed.
assert sentinel.read_bytes() == b"original"
# ───────────── issue #1520: max_part_size + SSOT env-driven caps ─────────────
def test_part_above_starlette_1mib_default_is_accepted(client: TestClient, chat_uploads_dir: Path):
"""Regression: pre-fix, ANY single multipart part > 1 MiB raised
MultiPartException because the ingest handler called
``request.form()`` without ``max_part_size`` and Starlette 1.0's
default is 1 MiB (issue #1520, fleet-wide P0 2026-05-18).
This test sends a 2 MiB part, which is well below the 25 MB default
per-file cap but ABOVE the Starlette default, so it pins the fix:
we now pass ``max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES`` so the
parser uses the same cap the per-file 413 path expects.
"""
payload = b"a" * (2 * 1024 * 1024) # 2 MiB — > Starlette 1 MiB default
r = client.post(
"/internal/chat/uploads/ingest",
files={"files": ("big-but-allowed.bin", payload)},
headers={"Authorization": "Bearer test-secret"},
)
assert r.status_code == 200, r.text
item = r.json()["files"][0]
assert item["size"] == len(payload)
def test_parse_error_surfaces_exception_detail(client: TestClient):
"""Per feedback_surface_actionable_failure_reason_to_user: the 400
body must include a ``detail`` field naming WHICH multipart error
fired. The MultiPartException strings ("Part exceeded maximum size
of …", "Invalid boundary", "Too many parts", etc.) describe SHAPE
not content — no secrets.
We trigger a real Starlette MultiPartException by submitting a body
whose Content-Type advertises ``multipart/form-data`` but whose
body is not a valid multipart envelope — the parser raises before
any per-file check can fire.
"""
r = client.post(
"/internal/chat/uploads/ingest",
content=b"this is not a valid multipart body",
headers={
"Authorization": "Bearer test-secret",
"Content-Type": "multipart/form-data; boundary=----not-a-real-boundary",
},
)
assert r.status_code == 400, r.text
body = r.json()
assert body["error"] == "failed to parse multipart form"
# Detail must be present + non-empty + bounded.
assert "detail" in body and isinstance(body["detail"], str)
assert body["detail"], "detail must not be empty"
assert len(body["detail"]) <= 200, "detail must be bounded"
def test_total_cap_413_still_fires_above_per_file_pass(client: TestClient, monkeypatch: pytest.MonkeyPatch):
"""Total-cap 413 path still works: two parts whose sum exceeds
CHAT_UPLOAD_MAX_BYTES but each individually fits the per-file cap.
Sanity-check that raising the per-file ceiling didn't accidentally
short-circuit the total-cap check.
"""
monkeypatch.setattr(internal_chat_uploads, "CHAT_UPLOAD_MAX_BYTES", 1024)
monkeypatch.setattr(internal_chat_uploads, "CHAT_UPLOAD_MAX_FILE_BYTES", 800)
r = client.post(
"/internal/chat/uploads/ingest",
files=[
("files", ("a.bin", b"a" * 600)),
("files", ("b.bin", b"b" * 600)),
],
headers={"Authorization": "Bearer test-secret"},
)
assert r.status_code == 413
# Either early (Content-Length pre-parse) or post-parse cumulative path is
# acceptable; both messages mention exceeding the total limit.
err = r.json()["error"]
assert "exceeds" in err and "limit" in err, err
def test_env_driven_ssot_overrides_caps(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
"""SSOT contract: setting CHAT_UPLOAD_MAX_FILE_BYTES /
CHAT_UPLOAD_MAX_TOTAL_BYTES in the environment at module import
time changes the module constants. Pin so the
workspace_provision_shared.go::applyChatUploadLimits env injection
cannot silently drift from what the Python side reads.
"""
import importlib
monkeypatch.setenv("CHAT_UPLOAD_MAX_FILE_BYTES", str(7 * 1024 * 1024))
monkeypatch.setenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", str(13 * 1024 * 1024))
reloaded = importlib.reload(internal_chat_uploads)
try:
assert reloaded.CHAT_UPLOAD_MAX_FILE_BYTES == 7 * 1024 * 1024
assert reloaded.CHAT_UPLOAD_MAX_BYTES == 13 * 1024 * 1024
finally:
# Reset to defaults so subsequent tests see clean constants.
monkeypatch.delenv("CHAT_UPLOAD_MAX_FILE_BYTES", raising=False)
monkeypatch.delenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", raising=False)
importlib.reload(internal_chat_uploads)
def test_env_driven_ssot_malformed_value_falls_back_to_default(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
"""If ops pushes a garbage value the worker still boots with the
in-code default (operability over precision — see _env_int
docstring). Pin the fallback.
"""
import importlib
monkeypatch.setenv("CHAT_UPLOAD_MAX_FILE_BYTES", "not-an-int")
monkeypatch.setenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", "") # empty == use default
reloaded = importlib.reload(internal_chat_uploads)
try:
# Defaults (legacy 25 MB / 50 MB) come back.
assert reloaded.CHAT_UPLOAD_MAX_FILE_BYTES == 25 * 1024 * 1024
assert reloaded.CHAT_UPLOAD_MAX_BYTES == 50 * 1024 * 1024
finally:
monkeypatch.delenv("CHAT_UPLOAD_MAX_FILE_BYTES", raising=False)
monkeypatch.delenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", raising=False)
importlib.reload(internal_chat_uploads)