From 4fe9e809e92f9b906f5fa7e561b5e3467dbb90c9 Mon Sep 17 00:00:00 2001
From: core-devops <core-devops@agents.moleculesai.app>
Date: Wed, 3 Jun 2026 21:17:16 -0700
Subject: [PATCH] test(e2e): name the A2A empty-completion failure class in
 staging SaaS canary
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Step 8 of the full-lifecycle SaaS canary sends an A2A round-trip to the
parent and asserts a PONG. When the configured completion backend returns
a 2xx with no text part (empty content / tool_calls-or-reasoning-only),
the agent runtime surfaces the literal reply "Error: message contained no
text content." Today that fell through the generic "error|exception"
catch-all and was reported as a vague "A2A returned an error-shaped
response", which misdirects triage to workspace-server.

Add a specific error-class check (mirroring the existing hermes-401 /
quota-exhausted patterns) that names this as a model/provider BACKEND
regression with the operator action, before the generic catch-all. No
behaviour change for healthy runs; the failure still hard-fails — it is
just diagnosed correctly.

Observed 2026-06-03/04: 100% of staging canaries on MODEL_SLUG=MiniMax-M2
(canary default since #2710) hit this on the parent's first cold turn,
identical on main's scheduled synthetic E2E and on open PRs — i.e. an
environmental backend regression, not PR-introduced. This is purely a
diagnostic-precision improvement to the unmodified main-line step-8 block.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 tests/e2e/test_staging_full_saas.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tests/e2e/test_staging_full_saas.sh b/tests/e2e/test_staging_full_saas.sh
index b7d8ea1b7..89afca1f9 100755
--- a/tests/e2e/test_staging_full_saas.sh
+++ b/tests/e2e/test_staging_full_saas.sh
@@ -858,6 +858,24 @@ fi
 if echo "$AGENT_TEXT" | grep -qiE "exceeded your current quota|insufficient_quota"; then
   fail "A2A — PROVIDER QUOTA EXHAUSTED (NOT a platform regression). Operator action: top up MOLECULE_STAGING_OPENAI_API_KEY billing or rotate to a higher-quota org at Settings → Secrets and Variables → Actions. Tracked in #2578. Raw: $AGENT_TEXT"
 fi
+# Empty-completion class — the agent runtime reached the LLM and got a
+# 2xx back, but the assistant turn carried NO text part (empty content,
+# or tool_calls/reasoning-only with no surfaced text), so the runtime
+# returns the literal "Error: message contained no text content." as its
+# reply text. Steps 0-7 passing means the platform is healthy (CP up,
+# tenant provisioned, workspace online + routable, A2A delivery e2e); the
+# break is the configured completion BACKEND returning an empty turn — a
+# model/provider-side regression, NOT a workspace-server or harness bug,
+# and NOT NOT_CONFIGURED (that fails earlier, at boot). Name it explicitly
+# so the canary alert points at the model, not the platform: a generic
+# "error-shaped response" misdirects triage to workspace-server. Observed
+# 2026-06-03/04 across every staging canary on MODEL_SLUG=MiniMax-M2 (the
+# canary default since #2710) — 100% on the parent's first cold turn,
+# identical on main's scheduled synthetic E2E and on PRs (so it is an
+# environmental backend regression, never PR-introduced).
+if echo "$AGENT_TEXT" | grep -qiF "message contained no text content"; then
+  fail "A2A — EMPTY COMPLETION (backend regression, NOT a platform/workspace-server bug). The configured model (MODEL_SLUG=${MODEL_SLUG:-?}) returned a 2xx completion with no text part; the runtime surfaced 'message contained no text content.'. Operator action: check the staging LLM backend / proxy for the canary model (MiniMax-M2 since #2710) — empty assistant turns, not an auth/quota/boot fault. Raw: $AGENT_TEXT"
+fi
 # Generic catch-all — falls through if none of the known regressions hit.
 if echo "$AGENT_TEXT" | grep -qiE "error|exception"; then
   fail "A2A returned an error-shaped response: $AGENT_TEXT"
-- 
2.52.0