fix(canvas/e2e): raise staging-setup deadline 15 min → 20 min

Matches the 20-min budget of tests/e2e/test_staging_full_saas.sh (#1930). Canvas E2E was still stuck at 900s (15 min), which regularly flakes on tenant cold boots in the 12-15 min range, especially on staging, where workspace-server image pulls + AMI bootstrapping add 3-5 min vs prod.

Concrete blocker: the 2026-04-24 staging→main sync (#1981) kept failing on "tenant provision: timed out after 900s" in canvas/e2e/staging-setup.ts despite the actual sync E2E going green. The canvas-side timeout was strictly tighter than the sync-side timeout.

Also raises WORKSPACE_ONLINE_TIMEOUT_MS to 20 min to cover the case where the workspace EC2 is provisioned but the hermes cold-install (apt + uv + hermes-agent clone + gateway boot) takes longer than the original 10-min budget. This matches the 20-min workspace deadline in SaaS E2E.

No behavior change when things are fast; this just covers the tail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
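To illustrate why the deadline only matters in the tail, here is a minimal sketch of a deadline-bounded polling loop. It is not taken from canvas/e2e/staging-setup.ts: `waitFor`, `timeoutMessage`, `Clock`, and the 10-second poll interval are hypothetical names and values chosen for illustration. Only the 20-min constant and the error-message shape come from this commit.

```typescript
// Hypothetical sketch; the real helper in staging-setup.ts may differ.
const PROVISION_TIMEOUT_MS = 20 * 60 * 1000; // value introduced by this commit

// Injectable clock so the loop can be exercised without real waiting.
type Clock = { now: () => number; sleep: (ms: number) => Promise<void> };

const realClock: Clock = {
  now: () => Date.now(),
  sleep: (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
};

// Builds a message in the shape quoted in the commit message,
// e.g. "tenant provision: timed out after 900s".
function timeoutMessage(label: string, timeoutMs: number): string {
  return `${label}: timed out after ${Math.round(timeoutMs / 1000)}s`;
}

// Polls `check` until it returns true or `timeoutMs` has elapsed.
// Resolves with the elapsed time in ms, so a fast boot returns early
// and is unaffected by how generous the deadline is.
async function waitFor(
  label: string,
  check: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs: number = 10_000, // assumed poll interval
  clock: Clock = realClock,
): Promise<number> {
  const start = clock.now();
  for (;;) {
    if (await check()) return clock.now() - start;
    const elapsed = clock.now() - start;
    if (elapsed >= timeoutMs) throw new Error(timeoutMessage(label, timeoutMs));
    // Never sleep past the deadline.
    await clock.sleep(Math.min(intervalMs, timeoutMs - elapsed));
  }
}
```

Under these assumptions, a call such as `await waitFor("tenant provision", isTenantReady, PROVISION_TIMEOUT_MS)` (with `isTenantReady` a hypothetical readiness probe) returns as soon as the probe succeeds. Raising the deadline changes the outcome only for boots that finish between the old and new limits.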
2026-04-24 01:26:13 -07:00 · 46fbffb95b
commit 46fbffb95b
parent 3770d4d68c
1 changed file with 7 additions and 2 deletions
--- a/canvas/e2e/staging-setup.ts
+++ b/canvas/e2e/staging-setup.ts
@@ -26,8 +26,13 @@ const CP_URL = process.env.MOLECULE_CP_URL || "https://staging-api.moleculesai.a
 const ADMIN_TOKEN = process.env.MOLECULE_ADMIN_TOKEN;
 const STAGING = process.env.CANVAS_E2E_STAGING === "1";

-const PROVISION_TIMEOUT_MS = 15 * 60 * 1000;
-const WORKSPACE_ONLINE_TIMEOUT_MS = 10 * 60 * 1000;
+// Tenant cold boot on staging regularly takes 12-15 min when the
+// workspace-server Docker image isn't already cached on the AMI. Raised
+// to 20 min to match tests/e2e/test_staging_full_saas.sh (PR #1930)
+// after repeated "tenant provision: timed out after 900s" flakes
+// were blocking staging→main syncs on 2026-04-24.
+const PROVISION_TIMEOUT_MS = 20 * 60 * 1000;
+const WORKSPACE_ONLINE_TIMEOUT_MS = 20 * 60 * 1000;
 const TLS_TIMEOUT_MS = 3 * 60 * 1000;

 async function jsonFetch(