From 4fe9e809e92f9b906f5fa7e561b5e3467dbb90c9 Mon Sep 17 00:00:00 2001 From: core-devops Date: Wed, 3 Jun 2026 21:17:16 -0700 Subject: [PATCH] test(e2e): name the A2A empty-completion failure class in staging SaaS canary MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Step 8 of the full-lifecycle SaaS canary sends an A2A round-trip to the parent and asserts a PONG. When the configured completion backend returns a 2xx with no text part (empty content / tool_calls-or-reasoning-only), the agent runtime surfaces the literal reply "Error: message contained no text content." Today that fell through the generic "error|exception" catch-all and was reported as a vague "A2A returned an error-shaped response", which misdirects triage to workspace-server. Add a specific error-class check (mirroring the existing hermes-401 / quota-exhausted patterns) that names this as a model/provider BACKEND regression with the operator action, before the generic catch-all. No behaviour change for healthy runs; the failure still hard-fails — it is just diagnosed correctly. Observed 2026-06-03/04: 100% of staging canaries on MODEL_SLUG=MiniMax-M2 (canary default since #2710) hit this on the parent's first cold turn, identical on main's scheduled synthetic E2E and on open PRs — i.e. an environmental backend regression, not PR-introduced. This is purely a diagnostic-precision improvement to the unmodified main-line step-8 block. Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/e2e/test_staging_full_saas.sh | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/tests/e2e/test_staging_full_saas.sh b/tests/e2e/test_staging_full_saas.sh index b7d8ea1b7..89afca1f9 100755 --- a/tests/e2e/test_staging_full_saas.sh +++ b/tests/e2e/test_staging_full_saas.sh @@ -858,6 +858,24 @@ fi if echo "$AGENT_TEXT" | grep -qiE "exceeded your current quota|insufficient_quota"; then fail "A2A — PROVIDER QUOTA EXHAUSTED (NOT a platform regression). Operator action: top up MOLECULE_STAGING_OPENAI_API_KEY billing or rotate to a higher-quota org at Settings → Secrets and Variables → Actions. Tracked in #2578. Raw: $AGENT_TEXT" fi +# Empty-completion class — the agent runtime reached the LLM and got a +# 2xx back, but the assistant turn carried NO text part (empty content, +# or tool_calls/reasoning-only with no surfaced text), so the runtime +# returns the literal "Error: message contained no text content." as its +# reply text. Steps 0-7 passing means the platform is healthy (CP up, +# tenant provisioned, workspace online + routable, A2A delivery e2e); the +# break is the configured completion BACKEND returning an empty turn — a +# model/provider-side regression, NOT a workspace-server or harness bug, +# and NOT NOT_CONFIGURED (that fails earlier, at boot). Name it explicitly +# so the canary alert points at the model, not the platform: a generic +# "error-shaped response" misdirects triage to workspace-server. Observed +# 2026-06-03/04 across every staging canary on MODEL_SLUG=MiniMax-M2 (the +# canary default since #2710) — 100% on the parent's first cold turn, +# identical on main's scheduled synthetic E2E and on PRs (so it is an +# environmental backend regression, never PR-introduced). +if echo "$AGENT_TEXT" | grep -qiF "message contained no text content"; then + fail "A2A — EMPTY COMPLETION (backend regression, NOT a platform/workspace-server bug). The configured model (MODEL_SLUG=${MODEL_SLUG:-?}) returned a 2xx completion with no text part; the runtime surfaced 'message contained no text content.'. Operator action: check the staging LLM backend / proxy for the canary model (MiniMax-M2 since #2710) — empty assistant turns, not an auth/quota/boot fault. Raw: $AGENT_TEXT" +fi # Generic catch-all — falls through if none of the known regressions hit. if echo "$AGENT_TEXT" | grep -qiE "error|exception"; then fail "A2A returned an error-shaped response: $AGENT_TEXT" -- 2.52.0