fix(e2e): reconciler e2e — platform create path + capture 400 body (core#2261) #2276

Merged
hongming merged 2 commits from fix/core2261-reconciler-e2e-create into main 2026-06-05 02:55:37 +00:00
Owner

Unblocks the live reconciler e2e. It was failing at workspace-create (HTTP 400) because it used the BYOK path (MiniMax-M2 + secret); switched to the platform-managed path (moonshot/kimi-k2.6, no key) that test_staging_full_saas.sh creates cleanly with. Also fixes the set -e/--fail-with-body bug that hid the 400 body. Dispatched a live staging run to prove the reconciler heals a terminated EC2. Refs core#2261.

Unblocks the live reconciler e2e. It was failing at workspace-create (HTTP 400) because it used the BYOK path (MiniMax-M2 + secret); switched to the platform-managed path (moonshot/kimi-k2.6, no key) that test_staging_full_saas.sh creates cleanly with. Also fixes the set -e/--fail-with-body bug that hid the 400 body. Dispatched a live staging run to prove the reconciler heals a terminated EC2. Refs core#2261.
hongming added 1 commit 2026-06-05 02:22:07 +00:00
fix(e2e): reconciler e2e — use platform-managed create path + capture 400 body (core#2261)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 1s
CI / Detect changes (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Successful in 4s
qa-review / approved (pull_request_target) Failing after 4s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 4s
sop-checklist / all-items-acked (pull_request_target) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 1s
CI / Canvas (Next.js) (pull_request) Successful in 1s
sop-tier-check / tier-check (pull_request_target) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 22s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
CI / all-required (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m4s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 55s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m16s
CI / Canvas Deploy Status (pull_request) Has been skipped
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m26s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m37s
E2E Staging Reconciler (heals terminated EC2) / pr-validate (pull_request) Has been cancelled
E2E Staging Reconciler (heals terminated EC2) / E2E Staging Reconciler (pull_request) Has been cancelled
6447edd2fd
Two fixes so the live reconciler e2e can actually reach its assertion:
1. The create 400'd because the script used the BYOK path (MiniMax-M2 +
   MINIMAX_API_KEY secret) — a combo that fails workspace-create. Add the
   E2E_LLM_PATH=platform branch (DEFAULT) mirroring test_staging_full_saas.sh:
   moonshot/kimi-k2.6, no tenant key — the create combo proven to succeed.
   This test only needs the workspace status=online (then it kills the EC2),
   so it doesn't need a real LLM completion.
2. set -e + curl --fail-with-body aborted the create command-substitution
   before the fail line could echo $WS_RESP, hiding the real HTTP-400 reason.
   Capture the body via `|| { fail "...$WS_RESP" }` so any future create
   failure is diagnosable.

core#2261

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hongming added 1 commit 2026-06-05 02:51:22 +00:00
fix(e2e): reconciler e2e — fail fast on online-wait (900s) to avoid EC2 leak (core#2261)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 10s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Failing after 4s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 9s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request_target) Failing after 5s
security-review / approved (pull_request_target) Failing after 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 14s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
gate-check-v3 / gate-check (pull_request_target) Successful in 17s
CI / Canvas (Next.js) (pull_request) Successful in 12s
CI / Canvas Deploy Status (pull_request) Has been skipped
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
CI / all-required (pull_request) Successful in 2s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m12s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 51s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m15s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m25s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-tier-check / tier-check (pull_request_target) Has been cancelled
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 9s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Successful in 5s
E2E Staging Reconciler (heals terminated EC2) / pr-validate (pull_request) Has been cancelled
E2E Staging Reconciler (heals terminated EC2) / E2E Staging Reconciler (pull_request) Has been cancelled
audit-force-merge / audit (pull_request_target) Successful in 6s
d8ff0b2503
Run 216031 hung ~32min in the boot-to-online poll (3600s default) and leaked a
running staging e2e-rec EC2 — the workspace never reached online (a staging
boot/serving issue, same root as the full-saas A2A failures, upstream of the
reconciler this test exercises). Reduce the online timeout default to 900s so a
non-booting workspace fails fast and the teardown trap terminates the EC2
instead of hanging ~1h. Does not change what the test proves once staging can
boot a workspace online.

core#2261

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hongming added the tier:low label 2026-06-05 02:55:06 +00:00
core-security approved these changes 2026-06-05 02:55:10 +00:00
core-security left a comment
Member

Security (core#2261). Test-only + e2e. Fail-fast + guaranteed teardown trap prevent leaked EC2s; no prod surface. Approve.

Security (core#2261). Test-only + e2e. Fail-fast + guaranteed teardown trap prevent leaked EC2s; no prod surface. Approve.
core-qa approved these changes 2026-06-05 02:55:34 +00:00
core-qa left a comment
Member

QA approve (core#2261 reconciler e2e fixes: create-path + fail-fast + diagnosability). Required CI green.

QA approve (core#2261 reconciler e2e fixes: create-path + fail-fast + diagnosability). Required CI green.
hongming merged commit 91c9893ad4 into main 2026-06-05 02:55:37 +00:00
Sign in to join this conversation.
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#2276