From b7f0b279eb47205cd5ecf430f2ed56401db3bac1 Mon Sep 17 00:00:00 2001
From: Hongming Wang
Date: Mon, 4 May 2026 01:49:42 -0700
Subject: [PATCH] =?UTF-8?q?e2e:=20bump=20A2A=20timeout=20from=2030s=20?=
 =?UTF-8?q?=E2=86=92=2090s=20for=20cold=20MiniMax=20workspace?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

After #2710, #2714, and the MOLECULE_STAGING_MINIMAX_API_KEY repo
secret landed (2026-05-04 08:37Z), the next dispatched canary (run
25309323698) cleared every previous failure point but timed out at
step 8/11 with `curl: (28) Operation timed out after 30002 ms`.

The canary creates a fresh org per run, so every A2A POST hits a cold
workspace plus a cold MiniMax endpoint:

  workspace boot
  → claude-code adapter starts event loop
  → first prompt ships
  → TLS handshake to api.minimax.io
  → cold model warmup
  → first-token generation

Cold-call P95 lands around 25-30s on MiniMax-M2.7-highspeed, so the
30-second `CURL_COMMON --max-time` sits right on the edge; the run
that timed out spent 30.002s receiving zero bytes.

Fix: override `--max-time` for the canary's A2A POST only. 90s gives
~3x headroom over the observed cold-call P95. Subsequent A2A turns to
the same workspace are sub-second, so this only widens step 8 of the
canary's first turn. The shared CURL_COMMON timeout stays at 30s for
everything else (provision, register, terminal, peers, teardown),
where 30s remains appropriate.

The run also verifies that the rest of the canary script (provision,
DNS, terminal-EIC, A2A round-trip) is platform-correct; the only
remaining operational gap is this latency knob.
---
 tests/e2e/test_staging_full_saas.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tests/e2e/test_staging_full_saas.sh b/tests/e2e/test_staging_full_saas.sh
index ec6bbf5e..41c58fd5 100755
--- a/tests/e2e/test_staging_full_saas.sh
+++ b/tests/e2e/test_staging_full_saas.sh
@@ -534,7 +534,17 @@ print(json.dumps({
     }
 }))
 ")
+# Override CURL_COMMON's --max-time 30 for THIS call only. Each canary
+# creates a fresh org → workspace, so the A2A POST hits a cold model:
+# the claude-code adapter starts its event loop, opens TLS to the LLM
+# endpoint, ships the first prompt, and waits for the first token.
+# With MiniMax (the canary default since #2710), cold-call latency
+# routinely exceeds 30s on the first request after workspace boot.
+# 90s gives ~3x headroom over observed cold-call P95 (~25-30s).
+# Subsequent A2A turns hit the same workspace and are sub-second, so
+# this only widens the window for step 8/11 of the canary's first turn.
 A2A_RESP=$(tenant_call POST "/workspaces/$PARENT_ID/a2a" \
+    --max-time 90 \
     -H "Content-Type: application/json" \
     -d "$A2A_PAYLOAD")
 AGENT_TEXT=$(echo "$A2A_RESP" | python3 -c "
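
For reviewers: a minimal standalone sketch of why appending `--max-time 90`
after the shared defaults is enough. `CURL_COMMON` and `tenant_call` below are
simplified stand-ins, not the e2e script's actual definitions; the point is
that curl uses the last occurrence of a single-value option such as
`--max-time`, so a per-call flag placed after the common defaults wins.

```shell
#!/bin/sh
# Sketch only: simplified stand-ins for the e2e script's real helpers.
# curl takes the LAST occurrence of a single-value option such as
# --max-time, so per-call extras appended after $CURL_COMMON override
# the shared 30s default without touching any other call site.
CURL_COMMON="--silent --show-error --fail --max-time 30"

tenant_call() {
  method=$1; path=$2; shift 2
  # Print the assembled command instead of running it, so the
  # override ordering is visible ($@ carries per-call extras).
  echo curl $CURL_COMMON -X "$method" "https://staging.example$path" "$@"
}

# First A2A turn to a fresh workspace: widen the timeout for this call only.
tenant_call POST "/workspaces/ws-123/a2a" --max-time 90
```

Every other caller keeps passing no extra flags and stays on the 30s default.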