test(e2e): name the A2A empty-completion failure class in staging SaaS canary #2203
Reference in New Issue
Block a user
Delete Branch "devops/saas-a2a-empty-completion-diagnostic"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
What
Step 8 of
tests/e2e/test_staging_full_saas.sh(the full-lifecycle SaaS canary) sends an A2A round-trip to the parent and asserts a PONG. When the configured completion backend returns a 2xx with no text part (empty content, or tool_calls/reasoning-only), the agent runtime returns the literal replyError: message contained no text content.Until now that fell through the genericerror|exceptioncatch-all and was reported as a vague "A2A returned an error-shaped response", which misdirects triage to workspace-server.This adds a specific error-class check (mirroring the existing hermes-401 / quota-exhausted patterns) that names it as a model/provider backend regression with the operator action, immediately before the generic catch-all. No behaviour change for healthy runs; a genuine empty-completion still hard-fails — it is just diagnosed correctly.
Why now
Observed 2026-06-03/04: 100% of staging canaries on
MODEL_SLUG=MiniMax-M2(the canary default since #2710) hit this on the parent first cold turn. It is identical on main scheduled synthetic E2E and on open PRs (incl. #2197) — i.e. an environmental backend regression, NOT PR-introduced and NOT a workspace-server/boot fault. This change is purely diagnostic precision on the unmodified main-line step-8 block; it does not mask the regression.Verification
bash -nclean (on main base)shellcheck -S errorcleantests/e2e/test_model_slug.sh: 16/16 pass (untouched)Error: message contained no text content.🤖 Generated with Claude Code
Owner force-merged (honest bypass). Diagnostic precision for the staging A2A empty-completion class (names it a backend/reasoning-model issue with operator action, before the generic catch-all) — does NOT mask the red (tolerating empty completions would hide a real signal). Required CI green; bash-lint clean; model-slug tests 16/16. Token revoked.