molecule-core/tests/e2e
Hongming Wang 79a0203798 feat(synth-e2e): switch canary to claude-code + MiniMax-M2.7-highspeed
Cuts the per-run LLM cost by roughly 10x (MiniMax M2.7 vs
gpt-4.1-mini) and removes the recurring OpenAI quota-exhaustion
failure mode that took the canary down on 2026-05-03 (#265: staging
quota was exhausted for ~16h).

Path:
  E2E_RUNTIME=claude-code (default)
  → workspace-configs-templates/claude-code-default/config.yaml's
    `minimax` provider (lines 64-69)
  → ANTHROPIC_BASE_URL auto-set to api.minimax.io/anthropic
  → reads MINIMAX_API_KEY (per-vendor env, no collision with
    GLM/Z.ai etc.)

Workflow changes (continuous-synth-e2e.yml):
- Default runtime: langgraph → claude-code
- New env: E2E_MODEL_SLUG (defaults to MiniMax-M2.7-highspeed,
  overridable via workflow_dispatch)
- New secret wire: E2E_MINIMAX_API_KEY ←
  secrets.MOLECULE_STAGING_MINIMAX_API_KEY
- Per-runtime missing-secret guard: claude-code requires the MiniMax
  key, langgraph/hermes require the OpenAI key. Cron-triggered runs
  hard-fail when the active runtime's key is missing; workflow_dispatch
  runs soft-skip so operators can run ad-hoc tests without setting up
  the secret first
- Operators can still pick langgraph/hermes via workflow_dispatch;
  the OpenAI fallback path stays wired
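
The guard described above could look roughly like the sketch below. It is
not the workflow's actual step: the function name `check_runtime_secret`
and its "ok"/"skip"/"fail" outputs are hypothetical. Only the env names
E2E_MINIMAX_API_KEY and E2E_OPENAI_API_KEY and the runtime names come from
this commit; `schedule` and `workflow_dispatch` are the standard GitHub
Actions event names.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a per-runtime missing-secret guard.
# check_runtime_secret RUNTIME EVENT
#   prints "ok"   and returns 0 when the runtime's key is set
#   prints "skip" and returns 0 when the key is missing on manual dispatch
#   prints "fail" and returns 1 when the key is missing on any other event
check_runtime_secret() {
  local runtime="$1" event="$2" key=""
  case "$runtime" in
    claude-code)      key="${E2E_MINIMAX_API_KEY:-}" ;;
    langgraph|hermes) key="${E2E_OPENAI_API_KEY:-}" ;;
    *)                echo "fail"; return 1 ;;  # unknown runtime (assumed to be an error)
  esac

  if [ -n "$key" ]; then
    echo "ok"
    return 0
  fi

  # Key is missing: soft-skip manual runs, hard-fail scheduled ones.
  if [ "$event" = "workflow_dispatch" ]; then
    echo "skip"
    return 0
  fi
  echo "fail"
  return 1
}
```

A caller would pass the runtime and the triggering event, for example
`check_runtime_secret "$E2E_RUNTIME" "$GITHUB_EVENT_NAME"`, and stop the
job early on "skip".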

Script changes (tests/e2e/test_staging_full_saas.sh):
- SECRETS_JSON branches on which key is set:
    E2E_MINIMAX_API_KEY → {MINIMAX_API_KEY: <key>}  (claude-code path)
    E2E_OPENAI_API_KEY  → {OPENAI_API_KEY, HERMES_*, MODEL_PROVIDER}  (legacy)
  MiniMax wins when both are present, so the claude-code default
  canary cannot accidentally consume the OpenAI key
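
A minimal sketch of that precedence, assuming the keys contain no
characters that need JSON escaping. The function name
`build_secrets_json` and the empty-object fallback are hypothetical, and
the legacy branch shows only OPENAI_API_KEY because the HERMES_* and
MODEL_PROVIDER entries are not spelled out here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the SECRETS_JSON branching.
# The MiniMax branch is checked first so it wins when both keys are set.
build_secrets_json() {
  if [ -n "${E2E_MINIMAX_API_KEY:-}" ]; then
    # claude-code path: a single per-vendor key
    printf '{"MINIMAX_API_KEY":"%s"}\n' "$E2E_MINIMAX_API_KEY"
  elif [ -n "${E2E_OPENAI_API_KEY:-}" ]; then
    # legacy path: the real payload also carries HERMES_* and
    # MODEL_PROVIDER entries, omitted in this sketch
    printf '{"OPENAI_API_KEY":"%s"}\n' "$E2E_OPENAI_API_KEY"
  else
    # assumption: no key set yields an empty object
    printf '{}\n'
  fi
}

SECRETS_JSON="$(build_secrets_json)"
```

Swapping the `if` and `elif` conditions is the precedence inversion that
the new test file is meant to catch.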

Tests (new tests/e2e/test_secrets_dispatch.sh):
- 10 cases pinning the precedence + payload shape per branch
- Discipline check verified: with the if/elif swapped (precedence
  inversion) 5 of 10 cases FAIL; with the correct order all 10 PASS
- Anchors on the section-comment header so a structural refactor
  fails loudly rather than silently sourcing nothing
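
One way to anchor on a section header and fail loudly is sketched below.
This is an illustration, not the contents of test_secrets_dispatch.sh:
the function name `extract_section` and the marker strings in the usage
note are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: pull the lines between two marker comments out of a
# script, and fail if the start marker is missing or the section is empty.
# extract_section FILE START_MARKER END_MARKER
extract_section() {
  local file="$1" start="$2" end="$3" body
  body="$(awk -v s="$start" -v e="$end" '
    index($0, s)       { on = 1; next }  # start marker: begin capturing
    on && index($0, e) { exit }          # end marker: stop
    on                 { print }         # lines in between
  ' "$file")"

  if [ -z "$body" ]; then
    # Without this check a renamed header would produce an empty snippet,
    # and sourcing it would silently test nothing.
    echo "anchor not found or section empty: $start" >&2
    return 1
  fi
  printf '%s\n' "$body"
}
```

A test could then evaluate the extracted snippet, for example
`eval "$(extract_section script.sh '# --- begin ---' '# --- end ---')"`,
and abort if the function returns non-zero.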

The model_slug dispatcher (lib/model_slug.sh) needs no change: the
E2E_MODEL_SLUG override path is already wired (line 41), and the
claude-code template's `minimax-` prefix matcher catches
"MiniMax-M2.7-highspeed" because the slug is lowercased on lookup.
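
The lowercase-then-prefix-match idea can be illustrated with the sketch
below. It does not reproduce the template's matcher; the function name
`matches_minimax_prefix` is hypothetical, and only the `minimax-` prefix
and the slug come from this commit.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: case-insensitive prefix match by lowercasing first.
# Returns 0 when the slug starts with "minimax-" after lowercasing.
matches_minimax_prefix() {
  local slug_lower
  slug_lower="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$slug_lower" in
    minimax-*) return 0 ;;
    *)         return 1 ;;
  esac
}
```

With this, `matches_minimax_prefix "MiniMax-M2.7-highspeed"` succeeds
even though the slug is mixed-case, while a slug such as "gpt-4.1-mini"
does not match.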

Operator action required to land green:
- Set MOLECULE_STAGING_MINIMAX_API_KEY in repo secrets
  (Settings → Secrets and Variables → Actions). Use
  `gh secret set MOLECULE_STAGING_MINIMAX_API_KEY -R Molecule-AI/molecule-core`
  to avoid leaking the value into shell history.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 15:35:14 -07:00
lib test(e2e): pin pick_model_slug behavior with bash unit tests 2026-05-03 12:04:12 -07:00
_extract_token.py chore: apply round-7 review nits 2026-04-13 17:08:45 -07:00
_lib.sh feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6) 2026-04-14 09:35:26 -07:00
STAGING_SAAS_E2E.md feat(e2e): pivot to admin-bearer-only auth + add sanity self-check workflow 2026-04-21 04:34:11 -07:00
test_2307_peer_visibility_staging.sh test(e2e): add staging peer-visibility harness for #2307 2026-04-29 13:26:24 -07:00
test_a2a_e2e.sh initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
test_activity_e2e.sh chore: apply code-review round-6 suggestions 2026-04-13 17:08:45 -07:00
test_api.sh fix(e2e): stop asserting current_task on public workspace GET (#966) 2026-04-19 02:19:15 -07:00
test_chat_attachments_e2e.sh feat(canvas+platform): chat attachments, model selection, deploy/delete UX 2026-04-24 13:27:51 -07:00
test_chat_attachments_multiruntime_e2e.sh feat(canvas+platform): chat attachments, model selection, deploy/delete UX 2026-04-24 13:27:51 -07:00
test_chat_upload_e2e.sh feat(chat_files): rewrite Upload as HTTP-forward to workspace (RFC #2312, PR-C) 2026-04-29 14:26:37 -07:00
test_claude_code_e2e.sh chore: final open-source cleanup — binary, stale paths, private refs 2026-04-18 00:38:55 -07:00
test_comprehensive_e2e.sh fix(e2e): make provisioning-status assertions robust to CI environment 2026-04-13 17:31:07 -07:00
test_dev_mode.sh fix(quickstart): hotfixes discovered during live testing session 2026-04-23 14:57:18 -07:00
test_harness_rc_normalization.sh fix(e2e-sanity): normalize unexpected curl exit codes in cleanup trap (#2159) 2026-04-27 02:55:44 -07:00
test_model_slug.sh test(e2e): pin pick_model_slug behavior with bash unit tests 2026-05-03 12:04:12 -07:00
test_notify_attachments_e2e.sh test(notify): pre-sweep prior workspaces so interrupted runs don't pile up 2026-04-26 20:55:13 -07:00
test_poll_mode_e2e.sh fix(e2e): use real UUIDs for poll-mode test workspace ids 2026-04-29 23:10:36 -07:00
test_priority_runtimes_e2e.sh feat(e2e): extend priority-runtimes test to cover all 8 templates 2026-04-27 05:57:59 -07:00
test_saas_tenant.sh chore: final open-source cleanup — binary, stale paths, private refs 2026-04-18 00:38:55 -07:00
test_secrets_dispatch.sh feat(synth-e2e): switch canary to claude-code + MiniMax-M2.7-highspeed 2026-05-03 15:35:14 -07:00
test_staging_external_runtime.sh test(e2e): read delivery_mode from register response, not GET 2026-04-30 10:35:21 -07:00
test_staging_full_saas.sh feat(synth-e2e): switch canary to claude-code + MiniMax-M2.7-highspeed 2026-05-03 15:35:14 -07:00