From b7f0b279eb47205cd5ecf430f2ed56401db3bac1 Mon Sep 17 00:00:00 2001
From: Hongming Wang
Date: Mon, 4 May 2026 01:49:42 -0700
Subject: [PATCH] =?UTF-8?q?e2e:=20bump=20A2A=20timeout=20from=2030s=20?=
 =?UTF-8?q?=E2=86=92=2090s=20for=20cold=20MiniMax=20workspace?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

After #2710, #2714, and the MOLECULE_STAGING_MINIMAX_API_KEY repo
secret landed (2026-05-04 08:37Z), the next dispatched canary (run
25309323698) cleared every previous failure point but timed out at
step 8/11 with `curl: (28) Operation timed out after 30002 ms`.

The canary creates a fresh org per run, so every A2A POST hits a cold
workspace plus a cold MiniMax endpoint:

  workspace boot
  → claude-code adapter starts event loop
  → first prompt ships
  → TLS handshake to api.minimax.io
  → cold model warmup
  → first-token generation

Cold-call P95 lands around 25-30s on MiniMax-M2.7-highspeed, so the
30-second `CURL_COMMON --max-time` sits right on the edge; the run
that timed out spent 30.002s receiving zero bytes.

Fix: override `--max-time` for the canary's A2A POST only. 90s gives
~3x headroom over the observed cold-call P95. Subsequent A2A turns to
the same workspace are sub-second, so this only widens step 8 of the
canary's first turn. The shared CURL_COMMON timeout stays at 30s for
everything else (provision, register, terminal, peers, teardown),
where 30s remains appropriate.

The run also verifies that the rest of the canary script (provision,
DNS, terminal-EIC, A2A round-trip) is platform-correct; the only
remaining operational gap is this latency knob.
---
 tests/e2e/test_staging_full_saas.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tests/e2e/test_staging_full_saas.sh b/tests/e2e/test_staging_full_saas.sh
index ec6bbf5e..41c58fd5 100755
--- a/tests/e2e/test_staging_full_saas.sh
+++ b/tests/e2e/test_staging_full_saas.sh
@@ -534,7 +534,17 @@ print(json.dumps({
     }
 }))
 ")
+# Override CURL_COMMON's --max-time 30 for THIS call only. Each canary
+# creates a fresh org → workspace, so the A2A POST hits a cold model:
+# the claude-code adapter starts its event loop, opens TLS to the LLM
+# endpoint, ships the first prompt, and waits for the first token.
+# With MiniMax (the canary default since #2710), cold-call latency
+# routinely exceeds 30s on the first request after workspace boot.
+# 90s gives ~3x headroom over observed cold-call P95 (~25-30s).
+# Subsequent A2A turns hit the same workspace and are sub-second, so
+# this only widens the window for step 8/11 of the canary's first turn.
 A2A_RESP=$(tenant_call POST "/workspaces/$PARENT_ID/a2a" \
+    --max-time 90 \
     -H "Content-Type: application/json" \
     -d "$A2A_PAYLOAD")
 AGENT_TEXT=$(echo "$A2A_RESP" | python3 -c "
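
For reviewers: a minimal standalone sketch of why appending `--max-time 90`
after the shared defaults is enough. `CURL_COMMON` and `tenant_call` below are
simplified stand-ins, not the e2e script's actual definitions; the point is
that curl uses the last occurrence of a single-value option such as
`--max-time`, so a per-call flag placed after the common defaults wins.

```shell
#!/bin/sh
# Sketch only: simplified stand-ins for the e2e script's real helpers.
# curl takes the LAST occurrence of a single-value option such as
# --max-time, so per-call extras appended after $CURL_COMMON override
# the shared 30s default without touching any other call site.
CURL_COMMON="--silent --show-error --fail --max-time 30"

tenant_call() {
  method=$1; path=$2; shift 2
  # Print the assembled command instead of running it, so the
  # override ordering is visible ($@ carries per-call extras).
  echo curl $CURL_COMMON -X "$method" "https://staging.example$path" "$@"
}

# First A2A turn to a fresh workspace: widen the timeout for this call only.
tenant_call POST "/workspaces/ws-123/a2a" --max-time 90
```

Every other caller keeps passing no extra flags and stays on the 30s default.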