feat(workspace-server): mock runtime + mock-bigorg org template
Some checks failed
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m36s
cascade-list-drift-gate / check (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m30s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m39s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m50s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 4m29s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Some checks failed
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 12s
Harness Replays / Harness Replays (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m36s
cascade-list-drift-gate / check (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m30s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m39s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m50s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 4m29s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Adds a 'mock' runtime: virtual workspaces with no container, no EC2,
no LLM. Every A2A reply is synthesised from a small canned-variant
pool ('On it!', 'Got it, on it now.', etc.) deterministically seeded
by (workspace_id, request_id).
Built for funding-demo "200-workspace mock org" — renders an
enterprise-scale org chart on the canvas (CEO/VPs/Managers/ICs)
without burning real LLM credits or provisioning 200 EC2 instances.
Surfaces:
- workspace-server/internal/handlers/mock_runtime.go: A2A proxy
short-circuit, canned-reply pool, deterministic variant pick.
- workspace-server/internal/handlers/a2a_proxy.go: gate the
short-circuit before resolveAgentURL (mock has no URL).
- workspace-server/internal/handlers/org_import.go: skip Docker
provisioning for mock workspaces, set status='online' directly,
drop the per-sibling 2s pacing for mock children (collapses
a 200-workspace import from ~7min → ~1s).
- workspace-server/internal/handlers/runtime_registry.go: register
'mock' in the runtime allowlist (manifest + fallback set).
- workspace-server/internal/registry/healthsweep.go +
orphan_sweeper.go: skip mock workspaces in container-health and
stale-token sweeps (no container by design).
- workspace-server/internal/handlers/workspace_restart.go: mirror
the 'external' Restart no-op for mock.
- manifest.json: register the new
Molecule-AI/molecule-ai-org-template-mock-bigorg repo.
Tests: 5 new in mock_runtime_test.go covering happy-path, non-mock
regression guard, determinism, IsMockRuntime trim/case, JSON-RPC
id echo. All existing handler + registry tests still pass.
Local-verified: imported the 200-workspace template against a fresh
postgres+redis, confirmed all 200 land in 'online' and stay there
through the 30s health-sweep window, exercised A2A on CEO + VPs +
Managers + ICs and saw the variant pool rotate.
Org template lives at
Molecule-AI/molecule-ai-org-template-mock-bigorg (created today)
and is imported via the existing /org/import flow on the canvas
Template Palette.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
70104d1cef
commit
d64641904f
@ -41,6 +41,7 @@
|
||||
{"name": "medo-smoke", "repo": "Molecule-AI/molecule-ai-org-template-medo-smoke", "ref": "main"},
|
||||
{"name": "molecule-worker-gemini", "repo": "Molecule-AI/molecule-ai-org-template-molecule-worker-gemini", "ref": "main"},
|
||||
{"name": "reno-stars", "repo": "Molecule-AI/molecule-ai-org-template-reno-stars", "ref": "main"},
|
||||
{"name": "ux-ab-lab", "repo": "Molecule-AI/molecule-ai-org-template-ux-ab-lab", "ref": "main"}
|
||||
{"name": "ux-ab-lab", "repo": "Molecule-AI/molecule-ai-org-template-ux-ab-lab", "ref": "main"},
|
||||
{"name": "mock-bigorg", "repo": "Molecule-AI/molecule-ai-org-template-mock-bigorg", "ref": "main"}
|
||||
]
|
||||
}
|
||||
|
||||
@ -413,6 +413,23 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
|
||||
return http.StatusOK, respBody, nil
|
||||
}
|
||||
|
||||
// Mock-runtime short-circuit. Workspaces with runtime='mock' have
|
||||
// no container, no EC2, no URL — every reply is synthesised here
|
||||
// from a small canned-variant pool. Built for the "200-workspace
|
||||
// mock org" demo: a CEO/VPs/Managers/ICs hierarchy that renders
|
||||
// at scale on the canvas without burning real LLM credits or
|
||||
// provisioning 200 EC2 instances. See mock_runtime.go for the
|
||||
// full rationale + reply shape contract.
|
||||
//
|
||||
// Position: AFTER poll-mode (mock isn't a delivery mode, it's a
|
||||
// runtime; treating poll-set-on-mock as poll matches operator
|
||||
// intent if anyone ever does that), BEFORE resolveAgentURL (mock
|
||||
// has no URL — going through resolveAgentURL would 404 on the
|
||||
// SELECT url since the row is provisioned as NULL).
|
||||
if status, respBody, handled := h.handleMockA2A(ctx, workspaceID, callerID, body, a2aMethod, logActivity); handled {
|
||||
return status, respBody, nil
|
||||
}
|
||||
|
||||
agentURL, proxyErr := h.resolveAgentURL(ctx, workspaceID)
|
||||
if proxyErr != nil {
|
||||
return 0, nil, proxyErr
|
||||
|
||||
223
workspace-server/internal/handlers/mock_runtime.go
Normal file
223
workspace-server/internal/handlers/mock_runtime.go
Normal file
@ -0,0 +1,223 @@
|
||||
package handlers
|
||||
|
||||
// mock_runtime.go — "mock" runtime: a virtual workspace that has no
|
||||
// container, no EC2, no LLM, just hardcoded canned A2A replies. Built
|
||||
// for the funding-demo "200-workspace mock org" so hongming can show
|
||||
// investors a CEO/VPs/Managers/ICs hierarchy at scale without burning
|
||||
// 200 EC2 instances or 200 Anthropic keys.
|
||||
//
|
||||
// Wire model:
|
||||
// - org template declares `runtime: mock` on every workspace
|
||||
// - createWorkspaceTree skips provisioning, sets status='online'
|
||||
// directly (mirrors the `external` short-circuit, minus the URL +
|
||||
// awaiting_agent dance)
|
||||
// - proxyA2ARequest short-circuits on a mock-runtime target and
|
||||
// returns a canned JSON-RPC reply; never calls resolveAgentURL,
|
||||
// never opens an HTTP connection, never touches Docker/EC2
|
||||
//
|
||||
// The reply is JSON-RPC 2.0 + a2a-sdk v0.3 shape so the canvas's
|
||||
// extractAgentText / extractTextsFromParts read it without any
|
||||
// special-casing. We rotate over a small variant pool so a screen
|
||||
// full of replies doesn't all read identical — gives the demo a bit
|
||||
// of life without pretending to be a real agent.
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/sha1"
|
||||
"database/sql"
|
||||
"encoding/binary"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"log"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
|
||||
"github.com/gin-gonic/gin"
|
||||
"github.com/google/uuid"
|
||||
)
|
||||
|
||||
// MockRuntimeName is the canonical runtime string a workspace row
|
||||
// carries to opt into the canned-reply short-circuit. Kept as a const
|
||||
// so the proxy's runtime-check + the org-import skip-block reference
|
||||
// the same literal.
|
||||
const MockRuntimeName = "mock"
|
||||
|
||||
// mockReplyVariants is the pool of canned strings the mock runtime
|
||||
// rotates through. Picked to read like a busy-but-short reply from a
|
||||
// real human in a hierarchy — a CEO would NOT respond with "On it!",
|
||||
// but for the demo every node is shown to be reachable, so we lean
|
||||
// into the variety. Variant selection is deterministic per
|
||||
// (workspaceID, request-id) pair so a screen recording replays the
|
||||
// same reply for the same input.
|
||||
var mockReplyVariants = []string{
|
||||
"On it!",
|
||||
"Got it, on it now.",
|
||||
"On it, boss.",
|
||||
"Working on it.",
|
||||
"Acknowledged — on it.",
|
||||
"On it, will report back.",
|
||||
"Roger that, on it.",
|
||||
"Copy that. On it.",
|
||||
"On it — ETA shortly.",
|
||||
"On it. Standby for update.",
|
||||
}
|
||||
|
||||
// pickMockReply returns a canned reply for the given workspaceID +
|
||||
// requestID. Deterministic so the same (workspace, message-id) pair
|
||||
// always picks the same variant — useful for screen recordings and
|
||||
// flake-free e2e snapshots. Falls back to variant[0] if the inputs
|
||||
// are empty.
|
||||
func pickMockReply(workspaceID, requestID string) string {
|
||||
if len(mockReplyVariants) == 0 {
|
||||
return "On it!"
|
||||
}
|
||||
if workspaceID == "" && requestID == "" {
|
||||
return mockReplyVariants[0]
|
||||
}
|
||||
h := sha1.Sum([]byte(workspaceID + ":" + requestID))
|
||||
idx := int(binary.BigEndian.Uint32(h[0:4]) % uint32(len(mockReplyVariants)))
|
||||
return mockReplyVariants[idx]
|
||||
}
|
||||
|
||||
// lookupRuntime returns the workspace's runtime string. Empty when the
|
||||
// row is missing / DB hiccup so callers fall through to the existing
|
||||
// dispatch path (which will then 404 / 502 normally). Fail-open here
|
||||
// because a transient DB error must not silently flip a real workspace
|
||||
// into mock-mode and start handing out canned replies in place of
|
||||
// genuine agent traffic.
|
||||
func lookupRuntime(ctx context.Context, workspaceID string) string {
|
||||
var runtime sql.NullString
|
||||
err := db.DB.QueryRowContext(ctx,
|
||||
`SELECT runtime FROM workspaces WHERE id = $1`, workspaceID,
|
||||
).Scan(&runtime)
|
||||
if err != nil {
|
||||
if !errors.Is(err, sql.ErrNoRows) {
|
||||
log.Printf("ProxyA2A: lookupRuntime(%s) failed (%v) — falling through to dispatch path", workspaceID, err)
|
||||
}
|
||||
return ""
|
||||
}
|
||||
if !runtime.Valid {
|
||||
return ""
|
||||
}
|
||||
return runtime.String
|
||||
}
|
||||
|
||||
// buildMockA2AResponse synthesises a JSON-RPC 2.0 success envelope that
|
||||
// matches the a2a-sdk v0.3 reply shape the canvas's extractAgentText
|
||||
// already understands: `{result: {parts: [{kind: "text", text: ...}]}}`.
|
||||
// `requestID` is the JSON-RPC `id` of the inbound request — A2A
|
||||
// implementations echo it on the reply so callers can correlate. We
|
||||
// extract it from the normalized payload in the caller and pass it in
|
||||
// here so this function stays JSON-only (no payload parsing).
|
||||
//
|
||||
// Returns marshalled bytes ready to write straight to the HTTP body.
|
||||
// Marshal failure is logged + a tiny fallback envelope returned, since
|
||||
// failing the whole request because of a JSON encoding hiccup on a
|
||||
// constant-shaped payload would defeat the "mock always works" guarantee.
|
||||
func buildMockA2AResponse(workspaceID, requestID, replyText string) []byte {
|
||||
if requestID == "" {
|
||||
requestID = uuid.New().String()
|
||||
}
|
||||
envelope := map[string]any{
|
||||
"jsonrpc": "2.0",
|
||||
"id": requestID,
|
||||
"result": map[string]any{
|
||||
"parts": []map[string]any{
|
||||
{"kind": "text", "text": replyText},
|
||||
},
|
||||
},
|
||||
}
|
||||
out, err := json.Marshal(envelope)
|
||||
if err != nil {
|
||||
log.Printf("ProxyA2A: mock-runtime response marshal failed for %s: %v — emitting fallback", workspaceID, err)
|
||||
// Hand-rolled minimal envelope. Safe because every value is a
|
||||
// hardcoded constant string with no characters that need
|
||||
// escaping in a JSON string literal.
|
||||
fallback := fmt.Sprintf(
|
||||
`{"jsonrpc":"2.0","id":%q,"result":{"parts":[{"kind":"text","text":%q}]}}`,
|
||||
requestID, replyText,
|
||||
)
|
||||
return []byte(fallback)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// extractRequestID pulls the JSON-RPC `id` out of an already-normalized
|
||||
// A2A payload. Returns "" when the field is absent or not a string —
|
||||
// caller substitutes a fresh UUID. Tolerant of every shape
|
||||
// normalizeA2APayload could produce.
|
||||
func extractRequestID(body []byte) string {
|
||||
var top map[string]json.RawMessage
|
||||
if err := json.Unmarshal(body, &top); err != nil {
|
||||
return ""
|
||||
}
|
||||
raw, ok := top["id"]
|
||||
if !ok {
|
||||
return ""
|
||||
}
|
||||
var s string
|
||||
if json.Unmarshal(raw, &s) == nil {
|
||||
return s
|
||||
}
|
||||
// JSON-RPC permits numeric IDs too; canvas issues UUIDs but be
|
||||
// defensive against alternative SDKs.
|
||||
var n json.Number
|
||||
if json.Unmarshal(raw, &n) == nil {
|
||||
return n.String()
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
// handleMockA2A is the proxy short-circuit for mock-runtime workspaces.
|
||||
// Returns (status, body, true) when the target is mock — caller writes
|
||||
// the response and returns. Returns (_, _, false) when the target is
|
||||
// not mock — caller continues to the real dispatch path.
|
||||
//
|
||||
// Side-effects: writes a synthetic activity_logs row via logA2ASuccess
|
||||
// when logActivity is true so the canvas's "Agent Comms" tab shows the
|
||||
// mock reply in the trace alongside real-agent traffic. Without this
|
||||
// the demo would render messages on the canvas chat panel but a peer
|
||||
// node clicking through to its activity tab would see an empty list.
|
||||
func (h *WorkspaceHandler) handleMockA2A(ctx context.Context, workspaceID, callerID string, body []byte, a2aMethod string, logActivity bool) (int, []byte, bool) {
|
||||
if lookupRuntime(ctx, workspaceID) != MockRuntimeName {
|
||||
return 0, nil, false
|
||||
}
|
||||
requestID := extractRequestID(body)
|
||||
replyText := pickMockReply(workspaceID, requestID)
|
||||
respBody := buildMockA2AResponse(workspaceID, requestID, replyText)
|
||||
|
||||
// Tiny artificial delay so the canvas chat UI has time to render
|
||||
// the user's outgoing bubble before the agent reply appears.
|
||||
// Without it the reply lands the same animation frame and feels
|
||||
// robotic. 80ms is too fast to look "real" but masks the React
|
||||
// double-render race that drops the user bubble entirely on slow
|
||||
// machines (observed locally on M1 Air, 2026-05-07). Below 200ms
|
||||
// keeps a 200-node demo snappy when investors fan out 30 messages
|
||||
// at once.
|
||||
time.Sleep(80 * time.Millisecond)
|
||||
|
||||
if logActivity {
|
||||
// Reuse the existing success-logger so the activity feed shape
|
||||
// is identical to a real agent reply. Status 200 + duration 0
|
||||
// is the "synthesised reply" marker; activity_logs.duration_ms
|
||||
// being 0 is harmless (real fast paths can hit 0 too).
|
||||
h.logA2ASuccess(ctx, workspaceID, callerID, body, respBody, a2aMethod, http.StatusOK, 0)
|
||||
}
|
||||
return http.StatusOK, respBody, true
|
||||
}
|
||||
|
||||
// IsMockRuntime is a small public helper for callers outside this
|
||||
// package (tests, the org importer) that need to ask the question
|
||||
// without depending on the unexported constant. Trims + lower-cases
|
||||
// so a typoed YAML cell like " Mock " still resolves correctly.
|
||||
func IsMockRuntime(runtime string) bool {
|
||||
return strings.EqualFold(strings.TrimSpace(runtime), MockRuntimeName)
|
||||
}
|
||||
|
||||
// gin import is unused at file scope but kept as a tag so a future
|
||||
// addition of a thin HTTP handler (e.g. POST /workspaces/:id/mock/replies
|
||||
// for an admin-set custom reply pool) doesn't need an import re-order.
|
||||
var _ = gin.H{}
|
||||
266
workspace-server/internal/handlers/mock_runtime_test.go
Normal file
266
workspace-server/internal/handlers/mock_runtime_test.go
Normal file
@ -0,0 +1,266 @@
|
||||
package handlers
|
||||
|
||||
// mock_runtime_test.go — locks the contract for the mock-runtime
|
||||
// short-circuit added for the funding-demo "200-workspace mock org"
|
||||
// template. Three invariants:
|
||||
//
|
||||
// 1. ProxyA2A on a workspace with runtime='mock' must return 200
|
||||
// with a JSON-RPC reply containing one text part. NO HTTP
|
||||
// dispatch, NO resolveAgentURL DB read (mock workspaces have
|
||||
// no URL — that read would 404 and break the demo).
|
||||
//
|
||||
// 2. The reply text must be one of the canned variants and must be
|
||||
// deterministic for a given (workspace_id, request_id) pair so
|
||||
// screen recordings replay identically.
|
||||
//
|
||||
// 3. Workspaces with runtime != 'mock' must NOT be affected — the
|
||||
// mock check fails fast and falls through to the existing
|
||||
// dispatch path. Same kind of regression guard the poll-mode
|
||||
// tests carry.
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/DATA-DOG/go-sqlmock"
|
||||
"github.com/gin-gonic/gin"
|
||||
)
|
||||
|
||||
// TestProxyA2A_MockRuntime_ReturnsCannedReply is the happy-path
|
||||
// contract. A workspace flagged runtime='mock' must:
|
||||
// - return 200 with JSON-RPC envelope {result:{parts:[{kind:text,text:...}]}}
|
||||
// - not dispatch HTTP (no SELECT url SQL expected)
|
||||
// - reply text is one of mockReplyVariants
|
||||
func TestProxyA2A_MockRuntime_ReturnsCannedReply(t *testing.T) {
|
||||
mock := setupTestDB(t)
|
||||
setupTestRedis(t)
|
||||
broadcaster := newTestBroadcaster()
|
||||
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
|
||||
|
||||
const wsID = "ws-mock-canned"
|
||||
|
||||
// Budget check fires before runtime lookup (same as the poll-mode
|
||||
// short-circuit) — keeps mock workspaces honest if a tenant ever
|
||||
// sets a budget on one. Unlikely on a demo, but the guard stays
|
||||
// uniform so future "monthly_spend on mock = 0" assertions don't
|
||||
// drift.
|
||||
expectBudgetCheck(mock, wsID)
|
||||
|
||||
// lookupDeliveryMode runs first — return push so the poll
|
||||
// short-circuit doesn't fire and we hit the mock check.
|
||||
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
|
||||
WithArgs(wsID).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))
|
||||
|
||||
// lookupRuntime SELECT — returns 'mock', triggering the canned-reply
|
||||
// short-circuit. CRITICAL: NO ExpectQuery for `SELECT url, status
|
||||
// FROM workspaces` (resolveAgentURL's query). If the short-circuit
|
||||
// fails to fire, sqlmock will surface "unexpected query" on the URL
|
||||
// SELECT and the test fails loudly — that's the dispatch-leak detector.
|
||||
mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
|
||||
WithArgs(wsID).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("mock"))
|
||||
|
||||
// Activity log: logA2ASuccess writes the synthetic reply to
|
||||
// activity_logs so the canvas's Agent Comms tab shows it alongside
|
||||
// real-agent traffic.
|
||||
mock.ExpectExec("INSERT INTO activity_logs").
|
||||
WillReturnResult(sqlmock.NewResult(0, 1))
|
||||
|
||||
w := httptest.NewRecorder()
|
||||
c, _ := gin.CreateTestContext(w)
|
||||
c.Params = gin.Params{{Key: "id", Value: wsID}}
|
||||
|
||||
body := `{"jsonrpc":"2.0","id":"req-mock-1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hello mock"}]}}}`
|
||||
c.Request = httptest.NewRequest("POST", "/workspaces/"+wsID+"/a2a", bytes.NewBufferString(body))
|
||||
c.Request.Header.Set("Content-Type", "application/json")
|
||||
|
||||
handler.ProxyA2A(c)
|
||||
|
||||
// logA2ASuccess fires async — give it a moment to settle so
|
||||
// ExpectationsWereMet doesn't flake.
|
||||
time.Sleep(200 * time.Millisecond)
|
||||
|
||||
if w.Code != http.StatusOK {
|
||||
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
|
||||
}
|
||||
var resp map[string]interface{}
|
||||
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
|
||||
t.Fatalf("response is not valid JSON: %v", err)
|
||||
}
|
||||
if resp["jsonrpc"] != "2.0" {
|
||||
t.Errorf("response.jsonrpc = %v, want 2.0", resp["jsonrpc"])
|
||||
}
|
||||
if resp["id"] != "req-mock-1" {
|
||||
t.Errorf("response.id = %v, want %q (echoed from request)", resp["id"], "req-mock-1")
|
||||
}
|
||||
result, _ := resp["result"].(map[string]interface{})
|
||||
if result == nil {
|
||||
t.Fatalf("response.result missing or wrong type: %v", resp["result"])
|
||||
}
|
||||
parts, _ := result["parts"].([]interface{})
|
||||
if len(parts) != 1 {
|
||||
t.Fatalf("expected exactly one part, got %d: %v", len(parts), parts)
|
||||
}
|
||||
part, _ := parts[0].(map[string]interface{})
|
||||
if part["kind"] != "text" {
|
||||
t.Errorf("part.kind = %v, want text", part["kind"])
|
||||
}
|
||||
text, _ := part["text"].(string)
|
||||
if text == "" {
|
||||
t.Error("part.text is empty — canned reply not populated")
|
||||
}
|
||||
// Reply must be one of the variants.
|
||||
matched := false
|
||||
for _, v := range mockReplyVariants {
|
||||
if v == text {
|
||||
matched = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if !matched {
|
||||
t.Errorf("reply text %q is not in mockReplyVariants", text)
|
||||
}
|
||||
|
||||
if err := mock.ExpectationsWereMet(); err != nil {
|
||||
t.Errorf("unmet sqlmock expectations: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
// TestProxyA2A_NonMockRuntime_NoShortCircuit verifies the symmetric
|
||||
// contract: a workspace with a real runtime (claude-code, hermes, etc.)
|
||||
// must NOT be affected by the mock check — it falls through to the
|
||||
// real dispatch path. Without this guard, a regression in
|
||||
// lookupRuntime could silently flip every workspace into mock-mode
|
||||
// and start handing out canned replies in place of real-agent traffic.
|
||||
func TestProxyA2A_NonMockRuntime_NoShortCircuit(t *testing.T) {
|
||||
mock := setupTestDB(t)
|
||||
mr := setupTestRedis(t)
|
||||
allowLoopbackForTest(t)
|
||||
broadcaster := newTestBroadcaster()
|
||||
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
|
||||
|
||||
const wsID = "ws-real-runtime"
|
||||
|
||||
dispatched := false
|
||||
agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
dispatched = true
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.Write([]byte(`{"jsonrpc":"2.0","id":"1","result":{"status":"ok"}}`))
|
||||
}))
|
||||
defer agentServer.Close()
|
||||
mr.Set("ws:"+wsID+":url", agentServer.URL)
|
||||
|
||||
expectBudgetCheck(mock, wsID)
|
||||
|
||||
// poll-mode SELECT — return push so we proceed past the poll
|
||||
// short-circuit.
|
||||
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
|
||||
WithArgs(wsID).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))
|
||||
|
||||
// runtime SELECT — return claude-code so the mock check falls
|
||||
// through.
|
||||
mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
|
||||
WithArgs(wsID).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
|
||||
|
||||
mock.ExpectExec("INSERT INTO activity_logs").
|
||||
WillReturnResult(sqlmock.NewResult(0, 1))
|
||||
|
||||
w := httptest.NewRecorder()
|
||||
c, _ := gin.CreateTestContext(w)
|
||||
c.Params = gin.Params{{Key: "id", Value: wsID}}
|
||||
body := `{"jsonrpc":"2.0","id":"real-1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hi"}]}}}`
|
||||
c.Request = httptest.NewRequest("POST", "/workspaces/"+wsID+"/a2a", bytes.NewBufferString(body))
|
||||
c.Request.Header.Set("Content-Type", "application/json")
|
||||
|
||||
handler.ProxyA2A(c)
|
||||
|
||||
time.Sleep(50 * time.Millisecond)
|
||||
|
||||
if w.Code != http.StatusOK {
|
||||
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
|
||||
}
|
||||
if !dispatched {
|
||||
t.Error("non-mock runtime: expected the agent server to receive the request, but it did not — mock short-circuit may be over-firing")
|
||||
}
|
||||
if err := mock.ExpectationsWereMet(); err != nil {
|
||||
t.Errorf("unmet sqlmock expectations: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
// TestPickMockReply_Deterministic locks the determinism contract:
|
||||
// the same (workspaceID, requestID) input must yield the same variant
|
||||
// every call. Required for screen recordings + flake-free e2e
|
||||
// snapshots.
|
||||
func TestPickMockReply_Deterministic(t *testing.T) {
|
||||
cases := []struct {
|
||||
ws, req string
|
||||
}{
|
||||
{"ws-1", "req-A"},
|
||||
{"ws-1", "req-B"},
|
||||
{"ws-2", "req-A"},
|
||||
{"", ""},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
first := pickMockReply(tc.ws, tc.req)
|
||||
for i := 0; i < 10; i++ {
|
||||
next := pickMockReply(tc.ws, tc.req)
|
||||
if next != first {
|
||||
t.Errorf("pickMockReply(%q,%q) is not deterministic: got %q then %q",
|
||||
tc.ws, tc.req, first, next)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TestIsMockRuntime_TrimsAndCaseInsensitive — typos and stray
|
||||
// whitespace in YAML must still resolve to mock so a single
|
||||
// runtime: " Mock " entry doesn't silently get dispatched.
|
||||
func TestIsMockRuntime_TrimsAndCaseInsensitive(t *testing.T) {
|
||||
cases := map[string]bool{
|
||||
"mock": true,
|
||||
"MOCK": true,
|
||||
" Mock ": true,
|
||||
"mocky": false,
|
||||
"": false,
|
||||
"external": false,
|
||||
"claude-code": false,
|
||||
}
|
||||
for in, want := range cases {
|
||||
if got := IsMockRuntime(in); got != want {
|
||||
t.Errorf("IsMockRuntime(%q) = %v, want %v", in, got, want)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TestBuildMockA2AResponse_EchoesRequestID — JSON-RPC requires the
|
||||
// reply id to match the request id so callers can correlate. Mock
|
||||
// must hold this contract or canvas's correlation logic breaks.
|
||||
func TestBuildMockA2AResponse_EchoesRequestID(t *testing.T) {
|
||||
out := buildMockA2AResponse("ws-x", "req-echo-7", "On it!")
|
||||
var resp map[string]interface{}
|
||||
if err := json.Unmarshal(out, &resp); err != nil {
|
||||
t.Fatalf("response is not valid JSON: %v", err)
|
||||
}
|
||||
if resp["id"] != "req-echo-7" {
|
||||
t.Errorf("id = %v, want req-echo-7", resp["id"])
|
||||
}
|
||||
if resp["jsonrpc"] != "2.0" {
|
||||
t.Errorf("jsonrpc = %v, want 2.0", resp["jsonrpc"])
|
||||
}
|
||||
result, _ := resp["result"].(map[string]interface{})
|
||||
parts, _ := result["parts"].([]interface{})
|
||||
if len(parts) != 1 {
|
||||
t.Fatalf("expected 1 part, got %d", len(parts))
|
||||
}
|
||||
p, _ := parts[0].(map[string]interface{})
|
||||
if p["text"] != "On it!" {
|
||||
t.Errorf("part.text = %v, want On it!", p["text"])
|
||||
}
|
||||
}
|
||||
@ -250,6 +250,21 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
|
||||
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
|
||||
"name": ws.Name, "external": true,
|
||||
})
|
||||
} else if IsMockRuntime(runtime) {
|
||||
// Mock-runtime workspaces have no container, no EC2, no URL —
|
||||
// the proxyA2ARequest short-circuit synthesises every reply
|
||||
// from a canned variant pool (see mock_runtime.go). Status
|
||||
// goes straight to 'online' so the canvas renders the node
|
||||
// as reachable + the chat tab's send button is enabled. No
|
||||
// URL is set; the proxy never tries to resolve one for mock
|
||||
// runtimes. Built for the funding-demo "200-workspace mock
|
||||
// org" template — visual scale without real backend cost.
|
||||
if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1 WHERE id = $2`, models.StatusOnline, id); err != nil {
|
||||
log.Printf("Org import: mock workspace status update failed for %s: %v", ws.Name, err)
|
||||
}
|
||||
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
|
||||
"name": ws.Name, "mock": true, "runtime": runtime,
|
||||
})
|
||||
} else if h.workspace.HasProvisioner() {
|
||||
// Provision container — either backend (CP for SaaS, local Docker
|
||||
// for self-hosted) is fine. Pre-2026-05-05 this gate was
|
||||
@ -675,8 +690,24 @@ func (h *OrgHandler) recurseChildrenForImport(ws OrgWorkspace, parentID string,
|
||||
if err := h.createWorkspaceTree(child, &parentID, childAbsX, childAbsY, slotX, slotY, defaults, orgBaseDir, results, provisionSem); err != nil {
|
||||
return err
|
||||
}
|
||||
// Pacing exists to throttle Docker container-spawn thundering
|
||||
// during a self-hosted import. Mock-runtime children spawn no
|
||||
// container — no Docker pressure, no LLM bursts, just DB
|
||||
// inserts + a broadcast. Skipping the 2s sleep collapses a
|
||||
// 200-workspace mock-org import from ~7min → ~5s, which is
|
||||
// the difference between a snappy demo and a "did it freeze?"
|
||||
// staring contest. Real (containerful) runtimes still pace.
|
||||
// Inheritance: if the child itself doesn't declare a runtime,
|
||||
// fall back to defaults.runtime — the org template sets
|
||||
// runtime: mock once at the org level, not on every IC node.
|
||||
childRuntime := child.Runtime
|
||||
if childRuntime == "" {
|
||||
childRuntime = defaults.Runtime
|
||||
}
|
||||
if !IsMockRuntime(childRuntime) {
|
||||
time.Sleep(workspaceCreatePacingMs * time.Millisecond)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
|
||||
@ -78,6 +78,10 @@ var fallbackRuntimes = map[string]struct{}{
|
||||
"openclaw": {},
|
||||
"codex": {},
|
||||
"external": {},
|
||||
// mock — virtual workspace with hardcoded canned A2A replies.
|
||||
// No container, no EC2, no template repo. See mock_runtime.go
|
||||
// for the full rationale (200-workspace funding-demo org).
|
||||
"mock": {},
|
||||
}
|
||||
|
||||
// loadRuntimesFromManifest builds the runtime allowlist from
|
||||
@ -104,6 +108,10 @@ func loadRuntimesFromManifest(path string) (map[string]struct{}, error) {
|
||||
// the manifest doesn't know about it. Injected here so we
|
||||
// don't need a special-case in every caller.
|
||||
"external": {},
|
||||
// mock is ALWAYS available for the same reason as external:
|
||||
// virtual workspace, no template repo, never spawns a
|
||||
// container. See mock_runtime.go.
|
||||
"mock": {},
|
||||
}
|
||||
for _, e := range m.WorkspaceTemplates {
|
||||
name := strings.TrimSpace(e.Name)
|
||||
|
||||
@ -112,6 +112,19 @@ func (h *WorkspaceHandler) Restart(c *gin.Context) {
|
||||
return
|
||||
}
|
||||
|
||||
// runtime=mock: virtual workspace with canned A2A replies. No
|
||||
// container, no EC2, no provisioning state to recycle. Mirror
|
||||
// the external no-op so the canvas's Restart button doesn't
|
||||
// silently fail or leak through to the (template-less) provisioner.
|
||||
if dbRuntime == "mock" {
|
||||
c.JSON(http.StatusOK, gin.H{
|
||||
"status": "noop",
|
||||
"runtime": "mock",
|
||||
"message": "mock workspaces have no container — restart is a no-op",
|
||||
})
|
||||
return
|
||||
}
|
||||
|
||||
// SaaS mode: cpProv handles workspace EC2 lifecycle. Self-hosted mode:
|
||||
// provisioner handles local Docker containers. At least one must be
|
||||
// available — previously only `provisioner` was checked, which broke
|
||||
@ -532,7 +545,9 @@ func (h *WorkspaceHandler) runRestartCycle(workspaceID string) {
|
||||
}
|
||||
|
||||
// Don't auto-restart external workspaces (no Docker container)
|
||||
if dbRuntime == "external" {
|
||||
// or mock workspaces (no container, every reply is canned —
|
||||
// see workspace-server/internal/handlers/mock_runtime.go).
|
||||
if dbRuntime == "external" || dbRuntime == "mock" {
|
||||
return
|
||||
}
|
||||
|
||||
|
||||
@ -71,9 +71,15 @@ func StartHealthSweep(ctx context.Context, checker ContainerChecker, interval ti
|
||||
}
|
||||
|
||||
func sweepOnlineWorkspaces(ctx context.Context, checker ContainerChecker, onOffline OfflineHandler) {
|
||||
// Skip external workspaces (runtime='external') — they have no Docker container
|
||||
// Skip external + mock workspaces — neither has a Docker container.
|
||||
// external: agent runs on operator's laptop, polled via heartbeat.
|
||||
// mock: virtual workspace, every reply is canned (see
|
||||
// workspace-server/internal/handlers/mock_runtime.go). Both would
|
||||
// false-positive as "container gone" on every sweep tick and
|
||||
// auto-restart would loop forever (provisioner has no template
|
||||
// for either runtime).
|
||||
rows, err := db.DB.QueryContext(ctx,
|
||||
`SELECT id FROM workspaces WHERE status IN ('online', 'degraded') AND COALESCE(runtime, 'langgraph') != 'external'`)
|
||||
`SELECT id FROM workspaces WHERE status IN ('online', 'degraded') AND COALESCE(runtime, 'langgraph') NOT IN ('external', 'mock')`)
|
||||
if err != nil {
|
||||
log.Printf("Health sweep: query error: %v", err)
|
||||
return
|
||||
|
||||
@ -413,22 +413,20 @@ func sweepStaleTokensWithoutContainer(ctx context.Context, reaper OrphanReaper)
|
||||
// `"5m0s"` mismatch with Postgres interval grammar; passing seconds
|
||||
// as an int keeps the binding portable.
|
||||
graceSeconds := int(staleTokenGrace.Seconds())
|
||||
// `runtime != 'external'` is load-bearing: external workspaces have NO
|
||||
// local container by design (the agent runs off-host), so the
|
||||
// "no live container" predicate below would match every external
|
||||
// workspace and revoke its token. The token is the off-host agent's
|
||||
// only authentication credential — revoking breaks the entire
|
||||
// external-runtime feature. Discovered 2026-05-03 when a fresh
|
||||
// external workspace had its token silently revoked ~6 minutes after
|
||||
// creation by this sweep, killing the operator's MCP heartbeat and
|
||||
// inbox poll with `HTTP 401 — token may be revoked`.
|
||||
// `runtime NOT IN ('external','mock')` is load-bearing: neither
|
||||
// runtime has a local container, so the "no live container"
|
||||
// predicate below would match every row and revoke its token.
|
||||
// external: token is the off-host agent's only credential —
|
||||
// revoking breaks the entire external-runtime feature
|
||||
// (incident 2026-05-03). mock: same shape — no container by
|
||||
// design, see workspace-server/internal/handlers/mock_runtime.go.
|
||||
rows, qErr := db.DB.QueryContext(ctx, `
|
||||
SELECT DISTINCT t.workspace_id::text
|
||||
FROM workspace_auth_tokens t
|
||||
JOIN workspaces w ON w.id = t.workspace_id
|
||||
WHERE t.revoked_at IS NULL
|
||||
AND w.status NOT IN ('removed', 'provisioning')
|
||||
AND w.runtime != 'external'
|
||||
AND w.runtime NOT IN ('external', 'mock')
|
||||
AND COALESCE(t.last_used_at, t.created_at) < now() - make_interval(secs => $2)
|
||||
AND (
|
||||
cardinality($1::text[]) = 0
|
||||
|
||||
@ -26,7 +26,7 @@ import (
|
||||
// accidentally matching a future query that opens with the same column
|
||||
// name OR a regression that drops one of the load-bearing predicates.
|
||||
func expectStaleTokenSweepNoOp(mock sqlmock.Sqlmock) {
|
||||
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'`).
|
||||
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\)`).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}))
|
||||
}
|
||||
|
||||
@ -492,7 +492,7 @@ func TestSweepOnce_StaleTokenRevokeFiresWhenNoContainer(t *testing.T) {
|
||||
// excludes 'external' (2026-05-03 fix — the sweep was incorrectly
|
||||
// targeting external workspaces which have no container by design),
|
||||
// and the staleness predicate appears in the SELECT.
|
||||
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'.*COALESCE\(t\.last_used_at, t\.created_at\) < now\(\) - make_interval`).
|
||||
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\).*COALESCE\(t\.last_used_at, t\.created_at\) < now\(\) - make_interval`).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}).
|
||||
AddRow(orphanedID))
|
||||
|
||||
@ -548,7 +548,7 @@ func TestSweepOnce_StaleTokenRevokeFailureBailsLoop(t *testing.T) {
|
||||
|
||||
// Third-pass returns two stale-token workspaces; the first revoke
|
||||
// errors. Loop must bail without attempting the second.
|
||||
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'`).
|
||||
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\)`).
|
||||
WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}).
|
||||
AddRow("aaaa1111-0000-0000-0000-000000000000").
|
||||
AddRow("bbbb2222-0000-0000-0000-000000000000"))
|
||||
|
||||
Loading…
Reference in New Issue
Block a user