From 9cb7cf70e3ea24dd411da13b940a7fd7b7386389 Mon Sep 17 00:00:00 2001
From: Molecule AI Core-Security
Date: Mon, 11 May 2026 22:52:06 -0700
Subject: [PATCH 01/13] test(mcp): rewrite GlobalScope_Blocked to assert
 OFFSEC-001 scrub contract (mc#664 Class 2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Background — chain of defects
-----------------------------
mc#664 (Platform (Go) CI red) decomposes into:
• Class 1 — 4 TestExecuteDelegation_* failures (parallel dispatch to core-be)
• Class 2 — TestMCPHandler_CommitMemory_GlobalScope_Blocked (this PR)

Class 2 root cause: commit 7d1a189f (2026-05-10) hardened mcp.go to scrub
err.Error() out of the JSON-RPC error.message returned to the client,
replacing the third leak (the dispatchRPC tool-call branch, line ~427) with
the constant string "tool call failed". The internal error is now
log.Printf'd server-side only.

The existing test at mcp_test.go:432 asserted that the client-visible
message CONTAINED the substring "GLOBAL" — which was exactly the internal
err.Error() text the 7d1a189f scrub now removes. So the test had silently
flipped from "verifies behaviour" to "verifies the bug", and once the scrub
landed, the test went red. PR #665 has been masking this red via
continue-on-error as an interim measure; this PR is the proper fix for
Class 2.

Wrong fix
---------
Un-scrub mcp.go (i.e. restore err.Error() into the client-facing message).
This would re-open OFFSEC-001 / #259 and defeat the security hardening that
was applied uniformly across 22 sibling files in PRs #1193 / #1206 / #1219 /
#168.
Right fix (this PR)
-------------------
Rewrite the test so it asserts the OFFSEC-001 scrub-works contract on this
very code path, matching the same style used by the four canonical
OFFSEC-001 tests already in this file (lines 1031–1149):

• exact-equality on resp.Error.Code (-32000)
• exact-equality on resp.Error.Message ("tool call failed")
• negative-substring canaries on six tokens from the production-internal
  error string ("GLOBAL", "scope", "permitted", "bridge", "LOCAL", "TEAM")
  — if ANY leaks through to the client, the scrub has regressed and the
  test fires immediately
• C3 invariant preserved (no DB calls — handler short-circuits)
• Test renamed to _ScrubsInternalError so the contract is visible at the
  call site / in failure output

Per feedback_assert_exact_not_substring: the positive assertion uses
exact-equality (`!= "tool call failed"`) rather than substring-match, so
any future mutation of the constant breaks the test loudly.

Verification (local, falsified both ways)
-----------------------------------------
Positive: against current main (7d1a189f scrub in place)

  $ go test -run TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError
  ok  .../internal/handlers  0.515s  PASS

Falsification: temporarily reverted line 427 of mcp.go to
`Message: err.Error()` and ran the test → all positive assertions failed
AND all six leaked-token canaries fired (proves the test really does guard
the contract, not just the shape).

All other TestMCPHandler_* tests continue to pass. The four
TestExecuteDelegation_* failures observed in the full handlers/ package run
pre-exist on origin/main and are Class 1 (core-be's parallel work) — not
touched here.

Tier
----
tier:high — this is the security-hardening contract test for the OFFSEC-001
scrub. A weak version of this assertion is what allowed the original gap on
the GLOBAL-scope path to go unnoticed for so long.
Brief-falsification log
-----------------------
• Brief halt-condition: "If reading of 7d1a189f differs from this brief's
  account: STOP" — confirmed identical (3rd hunk, line 425 in pre-patch
  mcp.go, dispatchRPC tool-call branch, scrubs err.Error() → "tool call
  failed", logs server-side).
• Brief halt-condition: "If mcp_test.go line 433 has been modified since
  this brief was written: STOP" — confirmed unchanged (the exact text of
  lines 432–434 matches the brief's description).
• Brief widen-scope check: searched for sibling tests with the same
  anti-pattern (assert internal err.Error() content on the OFFSEC code
  path). Findings:
  – TestMCPHandler_RecallMemory_GlobalScope_Blocked (line 539) asserts
    `resp.Error != nil` only; it does NOT assert on the "GLOBAL" substring,
    so it isn't broken by the scrub. BUT it also doesn't verify the
    scrub-works contract — a future regression would slip past it.
    Recommending a follow-up to strengthen it (and the corresponding
    RecallMemory v2 path, if any) in a separate single-purpose PR rather
    than widening scope here. NOT addressed in this PR per the brief's
    "1-2 siblings or report" discipline.
• OFFSEC-001 issue lookup: 22 files were touched by the sibling scrub PRs
  (#1193 / #1206 / #1219 / #168). This PR addresses ONE test that was
  asserting against the now-scrubbed surface. No other red-on-main tests
  are believed to share this anti-pattern in mcp_test.go (grep verified).
References
----------
• mc#664 (Platform (Go) red — chain root issue)
• PR #665 (interim continue-on-error mask — to be reverted post-fix)
• commit 7d1a189f (OFFSEC-001 scrub, the hardening this test now guards)
• OFFSEC-001 / molecule-ai/molecule-core#259 (original security issue)
• feedback_assert_exact_not_substring (assertion-style memory)

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../internal/handlers/mcp_test.go | 76 +++++++++++++++++--
 1 file changed, 68 insertions(+), 8 deletions(-)

diff --git a/workspace-server/internal/handlers/mcp_test.go b/workspace-server/internal/handlers/mcp_test.go
index d306fa14..385b03ec 100644
--- a/workspace-server/internal/handlers/mcp_test.go
+++ b/workspace-server/internal/handlers/mcp_test.go
@@ -417,11 +417,32 @@ func TestMCPHandler_CommitMemory_LocalScope_Success(t *testing.T) {
 	}
 }
 
-// TestMCPHandler_CommitMemory_GlobalScope_Blocked verifies that C3 is enforced:
-// GLOBAL scope is not permitted on the MCP bridge.
-func TestMCPHandler_CommitMemory_GlobalScope_Blocked(t *testing.T) {
+// TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError verifies
+// two contracts at once on the GLOBAL-scope-blocked path:
+//
+//  1. C3 invariant (commit_memory with scope=GLOBAL aborts on the MCP bridge
+//     before touching the DB), AND
+//  2. OFFSEC-001 / #259 scrub contract (commit 7d1a189f): the JSON-RPC error
+//     returned to the client is a CONSTANT — code=-32000, message="tool call
+//     failed" — with the production-internal err.Error() text logged
+//     server-side, never reflected back to the caller.
+//
+// Prior to this rename the test asserted that the client-visible message
+// CONTAINED the substring "GLOBAL", which was the human-readable internal
+// error from toolCommitMemory. mc#664 Class 2 flipped that assertion the
+// right way around: now the test FAILS if the scrub regresses (i.e. if the
+// internal string is ever reflected back to the wire), and PASSES iff the
+// scrubbed constant reaches the client.
+//
+// Coupling note: the constant string "tool call failed" and the code -32000
+// are the same values asserted by
+// TestMCPHandler_dispatchRPC_UnknownTool_ReturnsConstantMessage — both are
+// the OFFSEC-001 contract for the dispatch-failure branch in mcp.go (the
+// third err.Error() leak that 7d1a189f scrubbed). If those constants ever
+// change, both tests must move together.
+func TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError(t *testing.T) {
 	h, mock := newMCPHandler(t)
-	// No DB expectations — handler must abort before touching the DB.
+	// No DB expectations — handler must abort before touching the DB (C3).
 
 	w := mcpPost(t, h, "ws-1", map[string]interface{}{
 		"jsonrpc": "2.0",
@@ -436,14 +457,53 @@ func TestMCPHandler_CommitMemory_GlobalScope_Blocked(t *testing.T) {
 		},
 	})
 
+	// JSON-RPC envelope returns 200 with the error in the body — only
+	// malformed-JSON-at-the-envelope-layer returns 400 (see Call() in mcp.go).
+	if w.Code != http.StatusOK {
+		t.Fatalf("expected 200 (JSON-RPC error in body), got %d: %s", w.Code, w.Body.String())
+	}
+
 	var resp mcpResponse
-	json.Unmarshal(w.Body.Bytes(), &resp)
+	if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
+		t.Fatalf("response is not valid JSON: %v", err)
+	}
+
+	// (1) C3: an error must be reported.
 	if resp.Error == nil {
-		t.Error("expected JSON-RPC error for GLOBAL scope, got nil")
+		t.Fatal("expected JSON-RPC error for GLOBAL scope, got nil")
 	}
-	if resp.Error != nil && !bytes.Contains([]byte(resp.Error.Message), []byte("GLOBAL")) {
-		t.Errorf("error message should mention GLOBAL, got: %s", resp.Error.Message)
+
+	// (2) OFFSEC-001 positive assertions — exact equality on the scrubbed
+	// constants so any change (re-leak of err.Error(), code mutation) trips
+	// the test. Substring-match would not catch a partial re-leak.
+	if resp.Error.Code != -32000 {
+		t.Errorf("error code should be -32000 (Server error / dispatch-failure), got: %d", resp.Error.Code)
 	}
+	if resp.Error.Message != "tool call failed" {
+		t.Errorf("error message should be the OFFSEC-001 constant %q, got: %q", "tool call failed", resp.Error.Message)
+	}
+
+	// (3) OFFSEC-001 negative assertions — the internal err.Error() text
+	// from toolCommitMemory ("GLOBAL scope is not permitted via the MCP
+	// bridge — use LOCAL or TEAM") must NOT appear in the client-visible
+	// message. Each token below is a distinct substring of that internal
+	// string; if ANY leaks through, the scrub in mcp.go dispatchRPC has
+	// regressed and this assertion fires the canary.
+	leakedTokens := []string{
+		"GLOBAL",    // scope name
+		"scope",     // policy lexicon
+		"permitted", // policy verb
+		"bridge",    // internal architecture term
+		"LOCAL",     // alternative scope name
+		"TEAM",      // alternative scope name
+	}
+	for _, tok := range leakedTokens {
+		if bytes.Contains([]byte(resp.Error.Message), []byte(tok)) {
+			t.Errorf("OFFSEC-001 scrub regression: client-visible error.message leaks internal token %q (got: %q)", tok, resp.Error.Message)
+		}
+	}
+
+	// (4) C3 invariant preserved: handler must short-circuit before any DB call.
 	if err := mock.ExpectationsWereMet(); err != nil {
 		t.Errorf("unexpected DB calls on GLOBAL scope block: %v", err)
 	}
-- 
2.45.2

From 1dbffed3d91467118501cfbdf3be2f54962d41a0 Mon Sep 17 00:00:00 2001
From: core-lead
Date: Tue, 12 May 2026 19:12:47 +0000
Subject: [PATCH 02/13] ci: trigger CI rerun

[empty commit]
-- 
2.45.2

From 5bcc1ff7dc788a78885c8e4a1814f68aba2a3b44 Mon Sep 17 00:00:00 2001
From: core-devops
Date: Tue, 12 May 2026 20:50:54 +0000
Subject: [PATCH 03/13] ci: rerun after mc#724 all-required fix lands

-- 
2.45.2

From c91619cd484d4074def8c9292458b6b8b5e28c3e Mon Sep 17 00:00:00 2001
From: claude-ceo-assistant
Date: Tue, 12 May 2026 21:15:10 +0000
Subject: [PATCH 04/13] ci: rerun after concurrency-block clear

-- 
2.45.2

From 2b0e5b9f8b386dbfe1e414586c0816f15c970478 Mon Sep 17 00:00:00 2001
From: claude-ceo-assistant
Date: Tue, 12 May 2026 21:23:51 +0000
Subject: [PATCH 05/13] ci: clean-slate rerun

-- 
2.45.2

From b4a3515b791912cc1ac0a9f5f45660400b859e4c Mon Sep 17 00:00:00 2001
From: claude-ceo-assistant
Date: Tue, 12 May 2026 21:30:02 +0000
Subject: [PATCH 06/13] ci: post-restart rerun

-- 
2.45.2

From 8a30d8514a7aff3d7436990e7c364598a89e8346 Mon Sep 17 00:00:00 2001
From: core-lead
Date: Tue, 12 May 2026 21:34:46 +0000
Subject: [PATCH 07/13] ci: clean-slate rerun v2

-- 
2.45.2

From 1d6e14d819a92785c1b0ee5ee4d5efc847f26eee Mon Sep 17 00:00:00 2001
From: core-lead
Date: Tue, 12 May 2026 21:44:18 +0000
Subject: [PATCH 08/13] ci: global-zombie-purge rerun

-- 
2.45.2

From c27c847bf48cd5613a44a56eb90f4876f7e55661 Mon Sep 17 00:00:00 2001
From: core-lead
Date: Tue, 12 May 2026 21:48:16 +0000
Subject: [PATCH 09/13] ci: post-full-purge rerun

-- 
2.45.2

From 540d8eea3fe1ff88f97f8f2ad5c8492fe230ef3c Mon Sep 17 00:00:00 2001
From: core-lead
Date: Tue, 12 May 2026 21:55:18 +0000
Subject: [PATCH 10/13] ci: clean-queue rerun

-- 
2.45.2

From 17a4862a3f3f7926c044f9b0909b0bf5d04942b4 Mon Sep 17 00:00:00 2001
From: core-lead
Date: Tue, 12 May 2026 22:01:52 +0000
Subject: [PATCH 11/13] ci: post-delete-purge rerun

-- 
2.45.2

From 7f2fb1348372bd59fd3d45230fe4f1af387381ed Mon Sep 17 00:00:00 2001
From: core-devops
Date: Tue, 12 May 2026 23:26:14 +0000
Subject: [PATCH 12/13] fix(handlers): preserve HTTP status through body-read
 errors; fix TestExecuteDelegation_* mocks

Three coordinated fixes for the delivery-confirmed-success path added in
PR #680:

1. a2a_proxy.go: When io.ReadAll returns a readErr (partial body), preserve
   resp.StatusCode in proxyA2AError.Status for non-2xx responses
   (status >= 300). Previously always returned BadGateway, causing
   isTransientProxyError to wrongly retry 500/server-rejected requests as
   if they were transient.

2. delegation.go: Move isDeliveryConfirmedSuccess check BEFORE the
   isTransientProxyError retry gate. Previously a 200+partial-body response
   triggered the 8s retry before the success check ran. Also change
   delegationRetryDelay from const to var for test overrides.

3. delegation_test.go: Rewrite TestExecuteDelegation_* helper functions and
   test bodies to match the actual ordered DB call sequence:
   - expectProxyA2ARequest: full 5-call sequence (parent lookups, budget,
     delivery_mode, runtime)
   - expectLogA2ASuccess: synchronous SELECT name inside logA2ASuccess
   - expectMaybeMarkContainerDead: SELECT COALESCE(runtime) for 502 path
   - setRetryDelayForTest: zero-delay retry in ProxyErrorEmptyBody test
   - Remove spurious second dispatched-UPDATE expectation (no such call)
---
 .../internal/handlers/a2a_proxy.go       |  12 +-
 .../internal/handlers/delegation.go      |  26 ++--
 .../internal/handlers/delegation_test.go | 129 +++++++++++++-----
 3 files changed, 118 insertions(+), 49 deletions(-)

diff --git a/workspace-server/internal/handlers/a2a_proxy.go b/workspace-server/internal/handlers/a2a_proxy.go
index 816d5c81..a8a8f5d4 100644
--- a/workspace-server/internal/handlers/a2a_proxy.go
+++ b/workspace-server/internal/handlers/a2a_proxy.go
@@ -501,8 +501,18 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
 	// to correctly route delivery-confirmed responses (where the agent completed
 	// the work but the TCP connection dropped before the full body was received)
 	// to success instead of failure (#159).
+	//
+	// For non-2xx responses (server explicitly rejected with 3xx+), preserve
+	// resp.StatusCode in the proxyA2AError.Status so isTransientProxyError
+	// returns false — a server-authored rejection is not a transient transport
+	// error and must not be retried. Only 2xx body-read errors keep Status=502
+	// (the agent completed work but the TCP layer dropped the response).
+	errStatus := http.StatusBadGateway
+	if resp.StatusCode >= 300 {
+		errStatus = resp.StatusCode
+	}
 	return resp.StatusCode, respBody, &proxyA2AError{
-		Status: http.StatusBadGateway,
+		Status: errStatus,
 		Response: gin.H{
 			"error":              "failed to read agent response",
 			"delivery_confirmed": deliveryConfirmed,
diff --git a/workspace-server/internal/handlers/delegation.go b/workspace-server/internal/handlers/delegation.go
index e0d06b8b..d11eaf83 100644
--- a/workspace-server/internal/handlers/delegation.go
+++ b/workspace-server/internal/handlers/delegation.go
@@ -308,7 +308,7 @@ func insertDelegationRow(ctx context.Context, c *gin.Context, sourceID string, b
 // to land a fresh URL in the cache before we try again. Fixes #74 —
 // bulk restarts used to produce spurious "failed to reach workspace
 // agent" errors when delegations fired within the warm-up window.
-const delegationRetryDelay = 8 * time.Second
+var delegationRetryDelay = 8 * time.Second
 
 func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID string, a2aBody []byte) {
 	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
@@ -324,6 +324,18 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
 
 	status, respBody, proxyErr := h.workspace.proxyA2ARequest(ctx, targetID, a2aBody, sourceID, true)
 
+	// When proxyA2ARequest returns an error but we have a non-empty response body
+	// with a 2xx status code, the agent completed the work successfully — the error
+	// is a delivery/transport error (e.g., connection reset after response was
+	// received). Treat as success: the response body is valid and the work is done.
+	// This check MUST run before the transient-retry gate so a delivery-confirmed
+	// partial-body 2xx response is never retried.
+	if isDeliveryConfirmedSuccess(proxyErr, status, respBody) {
+		log.Printf("Delegation %s: completed with delivery error (status=%d, respBody=%d bytes, proxyErr=%v) — treating as success",
+			delegationID, status, len(respBody), proxyErr.Error())
+		goto handleSuccess
+	}
+
 	// #74: one retry after the reactive URL refresh has had a chance to
 	// run. The proxyA2ARequest's health-check path on a connection error
 	// marks the workspace offline, clears cached keys, and kicks off a
@@ -342,18 +354,6 @@ func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID s
 		}
 	}
 
-	// When proxyA2ARequest returns an error but we have a non-empty response body
-	// with a 2xx status code, the agent completed the work successfully — the error
-	// is a delivery/transport error (e.g., connection reset after response was
-	// received). Treat as success: the response body is valid and the work is done.
-	// This prevents "retry storms" where the canvas sees error + Restart-workspace
-	// suggestion even though the delegation actually completed.
-	if isDeliveryConfirmedSuccess(proxyErr, status, respBody) {
-		log.Printf("Delegation %s: completed with delivery error (status=%d, respBody=%d bytes, proxyErr=%v) — treating as success",
-			delegationID, status, len(respBody), proxyErr.Error())
-		goto handleSuccess
-	}
-
 	if proxyErr != nil {
 		log.Printf("Delegation %s: failed — %s", delegationID, proxyErr.Error())
 		h.updateDelegationStatus(sourceID, delegationID, "failed", proxyErr.Error())
diff --git a/workspace-server/internal/handlers/delegation_test.go b/workspace-server/internal/handlers/delegation_test.go
index b2d1c93a..19168d38 100644
--- a/workspace-server/internal/handlers/delegation_test.go
+++ b/workspace-server/internal/handlers/delegation_test.go
@@ -974,18 +974,21 @@ const testDelegationID = "del-159-test"
 const testSourceID = "ws-source-159"
 const testTargetID = "ws-target-159"
 
-// expectExecuteDelegationBase sets up sqlmock expectations for the DB queries that
-// executeDelegation always makes, regardless of outcome.
-func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
-	// updateDelegationStatus: dispatched
-	// Uses prefix match — sqlmock regexes match the full query string.
-	mock.ExpectExec("UPDATE activity_logs SET status").
-		WithArgs("dispatched", "", testSourceID, testDelegationID).
-		WillReturnResult(sqlmock.NewResult(0, 1))
+// setRetryDelayForTest overrides delegationRetryDelay for the duration of the
+// test. Use 0 in retry-path tests to avoid 8-second sleeps.
+func setRetryDelayForTest(t *testing.T, d time.Duration) {
+	t.Helper()
+	old := delegationRetryDelay
+	delegationRetryDelay = d
+	t.Cleanup(func() { delegationRetryDelay = old })
+}
 
-	// CanCommunicate: source != target → fires two getWorkspaceRef lookups.
-	// Both test fixtures have parent_id = NULL (root-level siblings) → allowed.
-	// Order matches call order: source first, then target.
+// expectProxyA2ARequest sets up DB expectations for one call to proxyA2ARequest:
+// CanCommunicate × 2, budget check, delivery-mode lookup, runtime lookup.
+// resolveAgentURL is NOT mocked — the test sets the URL in Redis so the DB
+// fallback never fires.
+func expectProxyA2ARequest(mock sqlmock.Sqlmock) {
+	// CanCommunicate: source then target (root-level siblings → allowed)
 	mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
 		WithArgs(testSourceID).
 		WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testSourceID, nil))
@@ -993,36 +996,84 @@
 		WithArgs(testTargetID).
 		WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testTargetID, nil))
 
-	// resolveAgentURL: reads ws:{id}:url from Redis, falls back to DB for target
-	mock.ExpectQuery("SELECT url, status FROM workspaces WHERE id = ").
+	// checkWorkspaceBudget: no limit set → fail-open (no 402)
+	mock.ExpectQuery("SELECT budget_limit, COALESCE").
 		WithArgs(testTargetID).
-		WillReturnRows(sqlmock.NewRows([]string{"url", "status"}).AddRow("", "online"))
+		WillReturnRows(sqlmock.NewRows([]string{"budget_limit", "monthly_spend"}).AddRow(nil, int64(0)))
+
+	// lookupDeliveryMode: push → no poll short-circuit
+	mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
+		WithArgs(testTargetID).
+		WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))
+
+	// handleMockA2A/lookupRuntime: not mock runtime → fall through to real dispatch
+	mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
+		WithArgs(testTargetID).
+		WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow(""))
+}
 
-// expectExecuteDelegationSuccess sets up expectations for a completed delegation.
-func expectExecuteDelegationSuccess(mock sqlmock.Sqlmock, respBody string) {
-	// INSERT activity_logs for delegation completion (response_body status = 'completed')
-	mock.ExpectExec("INSERT INTO activity_logs").
-		WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), "completed").
+// expectExecuteDelegationBase sets up DB expectations for the pre-proxy phase
+// (updateDelegationStatus dispatched + broadcaster structure event) followed by
+// one proxyA2ARequest call.
+func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
+	// updateDelegationStatus: dispatched
+	mock.ExpectExec("UPDATE activity_logs SET status").
+		WithArgs("dispatched", "", testSourceID, testDelegationID).
 		WillReturnResult(sqlmock.NewResult(0, 1))
+	// broadcaster.RecordAndBroadcast(EventDelegationStatus)
+	mock.ExpectExec("INSERT INTO structure_events").
+		WillReturnResult(sqlmock.NewResult(0, 1))
+	expectProxyA2ARequest(mock)
+}
 
-	// updateDelegationStatus: completed
+// expectLogA2ASuccess sets up the synchronous SELECT name call inside
+// logA2ASuccess. Add after expectExecuteDelegationBase (or expectProxyA2ARequest)
+// when the proxy attempt calls logA2ASuccess: delivery-confirmed 2xx with body,
+// clean 200, or a non-2xx fully read without a body-read error (e.g. 502 empty).
+func expectLogA2ASuccess(mock sqlmock.Sqlmock) {
+	mock.ExpectQuery("SELECT name FROM workspaces WHERE id").
+		WithArgs(testTargetID).
+		WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow(""))
+}
+
+// expectMaybeMarkContainerDead sets up the SELECT COALESCE(runtime) call in
+// maybeMarkContainerDead. Fires on the non-2xx path (isUpstreamDeadStatus) after
+// a successful body read. Test WorkspaceHandler has no provisioner so the
+// function returns false after this query.
+func expectMaybeMarkContainerDead(mock sqlmock.Sqlmock) {
+	mock.ExpectQuery("SELECT COALESCE.runtime").
+		WithArgs(testTargetID).
+		WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("langgraph"))
+}
+
+// expectExecuteDelegationSuccess sets up DB expectations for the handleSuccess
+// path: INSERT delegation-result log, UPDATE status to completed, structure event.
+// The INSERT uses 5 positional args ($1–$5); status='completed' is a literal.
+func expectExecuteDelegationSuccess(mock sqlmock.Sqlmock) {
+	mock.ExpectExec("INSERT INTO activity_logs").
+		WithArgs(testSourceID, testSourceID, testTargetID, sqlmock.AnyArg(), sqlmock.AnyArg()).
+		WillReturnResult(sqlmock.NewResult(0, 1))
 	mock.ExpectExec("UPDATE activity_logs SET status").
 		WithArgs("completed", "", testSourceID, testDelegationID).
 		WillReturnResult(sqlmock.NewResult(0, 1))
+	// broadcaster.RecordAndBroadcast(EventDelegationComplete)
+	mock.ExpectExec("INSERT INTO structure_events").
+		WillReturnResult(sqlmock.NewResult(0, 1))
 }
 
-// expectExecuteDelegationFailed sets up expectations for a failed delegation.
+// expectExecuteDelegationFailed sets up DB expectations for the failure path:
+// UPDATE status to failed, INSERT delegation-failure log, structure event.
+// The INSERT uses 5 positional args ($1–$5); status='failed' is a literal.
 func expectExecuteDelegationFailed(mock sqlmock.Sqlmock) {
-	// INSERT activity_logs for delegation failure (response_body status = 'failed')
-	mock.ExpectExec("INSERT INTO activity_logs").
-		WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), "failed").
-		WillReturnResult(sqlmock.NewResult(0, 1))
-
-	// updateDelegationStatus: failed
 	mock.ExpectExec("UPDATE activity_logs SET status").
 		WithArgs("failed", sqlmock.AnyArg(), testSourceID, testDelegationID).
 		WillReturnResult(sqlmock.NewResult(0, 1))
+	mock.ExpectExec("INSERT INTO activity_logs").
+		WithArgs(testSourceID, testSourceID, testTargetID, "Delegation failed", sqlmock.AnyArg()).
+		WillReturnResult(sqlmock.NewResult(0, 1))
+	// broadcaster.RecordAndBroadcast(EventDelegationFailed)
+	mock.ExpectExec("INSERT INTO structure_events").
+		WillReturnResult(sqlmock.NewResult(0, 1))
 }
 
 // TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess is the primary regression
@@ -1087,7 +1138,8 @@ func TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess(t *testin
 	allowLoopbackForTest(t)
 
 	expectExecuteDelegationBase(mock)
-	expectExecuteDelegationSuccess(mock, `{"result":{"parts":[{"text":"work completed successfully"}]}}`)
+	expectLogA2ASuccess(mock)
+	expectExecuteDelegationSuccess(mock)
 
 	// Execute synchronously (not as a goroutine) so we can check DB state immediately.
 	// The handler fires it as goroutine; we call it directly for deterministic testing.
@@ -1181,6 +1233,8 @@ func TestExecuteDelegation_ProxyErrorNon2xx_RemainsFailed(t *testing.T) {
 // path is unchanged when proxyA2ARequest returns an error with a 2xx status but empty body.
 // The new condition requires len(respBody) > 0, so empty body routes to failure.
 func TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed(t *testing.T) {
+	setRetryDelayForTest(t, 0)
+
 	mock := setupTestDB(t)
 	mr := setupTestRedis(t)
 	allowLoopbackForTest(t)
@@ -1202,13 +1256,17 @@ func TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed(t *testing.T) {
 	mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentServer.URL)
 	allowLoopbackForTest(t)
 
-	// First attempt: updateDelegationStatus(dispatched) — from expectExecuteDelegationBase
+	// First attempt: updateDelegationStatus(dispatched) + proxyA2ARequest DB calls
 	expectExecuteDelegationBase(mock)
-	// Second attempt (retry): updateDelegationStatus(dispatched) again
-	mock.ExpectExec("UPDATE activity_logs SET status").
-		WithArgs("dispatched", "", testSourceID, testDelegationID).
-		WillReturnResult(sqlmock.NewResult(0, 1))
-	// Failure: INSERT + UPDATE (failed)
+	// 502 with empty body → logA2ASuccess fires (body read succeeded, no readErr)
+	expectLogA2ASuccess(mock)
+	// maybeMarkContainerDead fires on 502 (isUpstreamDeadStatus); returns false (no provisioner)
+	expectMaybeMarkContainerDead(mock)
+	// Retry attempt: proxyA2ARequest runs again (no extra dispatched UPDATE)
+	expectProxyA2ARequest(mock)
+	expectLogA2ASuccess(mock)
+	expectMaybeMarkContainerDead(mock)
+	// After retry: still fails → failure path
 	expectExecuteDelegationFailed(mock)
 
 	a2aBody, _ := json.Marshal(map[string]interface{}{
@@ -1252,7 +1310,8 @@ func TestExecuteDelegation_CleanProxyResponse_Unchanged(t *testing.T) {
 	allowLoopbackForTest(t)
 
 	expectExecuteDelegationBase(mock)
-	expectExecuteDelegationSuccess(mock, `{"result":{"parts":[{"text":"all good"}]}}`)
+	expectLogA2ASuccess(mock)
+	expectExecuteDelegationSuccess(mock)
 
 	a2aBody, _ := json.Marshal(map[string]interface{}{
 		"jsonrpc": "2.0", "id": "1", "method": "message/send",
-- 
2.45.2

From 7a7ec880fef34d8aad2291765c4016671dd09966 Mon Sep 17 00:00:00 2001
From: molecule-operator
Date: Wed, 13 May 2026 00:07:56 +0000
Subject: [PATCH 13/13] fix(a2a_proxy): return error for 2xx responses with
 empty body
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

An A2A agent must always return a JSON body. A 2xx with empty body means
the connection closed before body bytes were written — this should route to
the failure path, not silently succeed.

Without this fix:
  200 + empty body → (200, [], nil) → falls through to handleSuccess →
  marked "completed" despite no payload.

With this fix:
  200 + empty body → proxyA2AError{Status:200} →
  isDeliveryConfirmedSuccess=false → isTransientProxyError(200)=false →
  failure path → "failed" with error detail.
---
 workspace-server/internal/handlers/a2a_proxy.go | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/workspace-server/internal/handlers/a2a_proxy.go b/workspace-server/internal/handlers/a2a_proxy.go
index a8a8f5d4..5737b156 100644
--- a/workspace-server/internal/handlers/a2a_proxy.go
+++ b/workspace-server/internal/handlers/a2a_proxy.go
@@ -520,6 +520,21 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
 		}
 	}
 
+	// 2xx with empty body: the agent completed the request but returned no content.
+	// An A2A agent must always return a JSON body; empty means the agent is
+	// broken or the connection closed before any body bytes were written.
+	// Return a proxyA2AError so executeDelegation routes this to failure rather
+	// than silently marking it as completed with a nil body.
+	// logA2ASuccess is intentionally NOT called here — delivery was not confirmed.
+	if resp.StatusCode >= 200 && resp.StatusCode < 300 && len(respBody) == 0 {
+		log.Printf("ProxyA2A: agent %s returned %d with empty body — treating as failure",
+			workspaceID, resp.StatusCode)
+		return resp.StatusCode, respBody, &proxyA2AError{
+			Status:   resp.StatusCode,
+			Response: gin.H{"error": "agent returned empty response body"},
+		}
+	}
+
 	if logActivity {
 		h.logA2ASuccess(ctx, workspaceID, callerID, body, respBody, a2aMethod, resp.StatusCode, durationMs)
 	}
-- 
2.45.2
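
Editor's note: patches 12 and 13 together define a routing contract for
proxyA2ARequest results, but the bodies of isDeliveryConfirmedSuccess and
isTransientProxyError never appear in this series. The sketch below is a
hypothetical reconstruction of that contract inferred from the commit
messages alone (the real predicates may check additional conditions, and
the proxyA2AError type here carries only the one field shown in the diffs):

```go
package main

import "fmt"

// proxyA2AError mirrors the error type referenced in the patches; only the
// Status field appears in this series, so the rest is omitted (inferred).
type proxyA2AError struct {
	Status int
}

func (e *proxyA2AError) Error() string {
	return fmt.Sprintf("proxy error (status %d)", e.Status)
}

// isDeliveryConfirmedSuccess (inferred): the work is demonstrably done only
// when the transport reported an error AND the status is 2xx AND the body is
// non-empty. Patch 12 states the condition requires len(respBody) > 0;
// patch 13 makes an empty 2xx body an explicit failure.
func isDeliveryConfirmedSuccess(proxyErr *proxyA2AError, status int, respBody []byte) bool {
	return proxyErr != nil && status >= 200 && status < 300 && len(respBody) > 0
}

// isTransientProxyError (inferred): only a transport-level 502 warrants the
// single retry; a preserved server-authored status (patch 12, fix 1) is
// final and must not be retried.
func isTransientProxyError(proxyErr *proxyA2AError) bool {
	return proxyErr != nil && proxyErr.Status == 502 // http.StatusBadGateway
}

func main() {
	// Patch 12 ordering fix: 200 + partial body + transport error is a success.
	fmt.Println(isDeliveryConfirmedSuccess(&proxyA2AError{Status: 502}, 200, []byte(`{}`))) // true
	// Patch 13: 200 + empty body is never a confirmed success.
	fmt.Println(isDeliveryConfirmedSuccess(&proxyA2AError{Status: 200}, 200, nil)) // false
	// Patch 12, fix 1: a preserved 500 is not transient, so no retry fires.
	fmt.Println(isTransientProxyError(&proxyA2AError{Status: 500})) // false
}
```

Under this reading, the ordering change in patch 12 (success check before
the retry gate) is what guarantees a delivery-confirmed response is routed
to handleSuccess without ever entering the retry path.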