fix(e2e): increase hermes workspace wait from 20 to 30 min

Root cause of PR #1981 E2E failures (step 7 timeout):
- hermes-agent install from NousResearch (Node 22 tarball + Python
  deps from source) + gateway health wait takes 15-25 min on staging
This commit is contained in:
Molecule AI · cp-be 2026-04-24 17:11:37 +00:00
parent 6b62391e5d
commit ca7fa3b65e
2 changed files with 4 additions and 4 deletions

View File

@ -5,7 +5,7 @@ name: E2E Staging SaaS (full lifecycle)
# HMA memory → activity → peers), then tears down and asserts leak-free.
#
# Why a separate workflow (not folded into ci.yml):
# - The run takes ~20 min (EC2 boot + cloudflared DNS + provision sweeps +
# - The run takes ~25-35 min (EC2 boot + cloudflared DNS + provision sweeps +
# agent bootstrap), way too slow for every PR.
# - Needs its own concurrency group so two pushes don't fight over the
# same staging org slug prefix.
@ -68,7 +68,7 @@ jobs:
e2e-staging-saas:
name: E2E Staging SaaS
runs-on: ubuntu-latest
timeout-minutes: 30
timeout-minutes: 45
permissions:
contents: read

View File

@ -308,8 +308,8 @@ fi
# polling, only hard-fail at the deadline. Pre-bootstrap-watcher-fix
# (controlplane#245) this was a flake generator: workspace went
# failed→online inside our window but we bailed at the failed read.
log "7/11 Waiting for workspace(s) to reach status=online (up to 20 min — hermes cold boot)..."
WS_DEADLINE=$(( $(date +%s) + 1200 ))
log "7/11 Waiting for workspace(s) to reach status=online (up to 30 min — hermes cold boot)..."
WS_DEADLINE=$(( $(date +%s) + 1800 ))
WS_TO_CHECK="$PARENT_ID"
[ -n "$CHILD_ID" ] && WS_TO_CHECK="$WS_TO_CHECK $CHILD_ID"
for wid in $WS_TO_CHECK; do