molecule-core

Author	SHA1	Message	Date
Hongming Wang	79a0203798	feat(synth-e2e): switch canary to claude-code + MiniMax-M2.7-highspeed Cuts the per-run LLM cost ~10x (MiniMax M2.7 vs gpt-4.1-mini) and removes the recurring OpenAI-quota-exhaustion failure mode that took the canary down on 2026-05-03 (#265 — staging quota burnt for ~16h). Path: E2E_RUNTIME=claude-code (default) → workspace-configs-templates/claude-code-default/config.yaml's `minimax` provider (lines 64-69) → ANTHROPIC_BASE_URL auto-set to api.minimax.io/anthropic → reads MINIMAX_API_KEY (per-vendor env, no collision with GLM/Z.ai etc.) Workflow changes (continuous-synth-e2e.yml): - Default runtime: langgraph → claude-code - New env: E2E_MODEL_SLUG (defaults to MiniMax-M2.7-highspeed, overridable via workflow_dispatch) - New secret wire: E2E_MINIMAX_API_KEY ← secrets.MOLECULE_STAGING_MINIMAX_API_KEY - Per-runtime missing-secret guard: claude-code requires MINIMAX, langgraph/hermes require OPENAI. Cron firing hard-fails on missing key for the active runtime; dispatch soft-skips so operators can ad-hoc test without setting up the secret first - Operators can still pick langgraph/hermes via workflow_dispatch; the OpenAI fallback path stays wired Script changes (tests/e2e/test_staging_full_saas.sh): - SECRETS_JSON branches on which key is set: E2E_MINIMAX_API_KEY → {MINIMAX_API_KEY: <key>} (claude-code path) E2E_OPENAI_API_KEY → {OPENAI_API_KEY, HERMES_*, MODEL_PROVIDER} (legacy) MiniMax wins when both are present — claude-code default canary must not accidentally consume the OpenAI key Tests (new tests/e2e/test_secrets_dispatch.sh): - 10 cases pinning the precedence + payload shape per branch - Discipline check verified: 5 of 10 FAIL on a swapped if/elif (precedence inversion), all 10 PASS on the fix - Anchors on the section-comment header so a structural refactor fails loudly rather than silently sourcing nothing The model_slug dispatcher (lib/model_slug.sh) needs no change: E2E_MODEL_SLUG override path is already wired (line 41), and 
claude-code template's `minimax-` prefix matcher catches "MiniMax-M2.7-highspeed" via lowercase-on-lookup. Operator action required to land green: - Set MOLECULE_STAGING_MINIMAX_API_KEY in repo secrets (Settings → Secrets and Variables → Actions). Use `gh secret set MOLECULE_STAGING_MINIMAX_API_KEY -R Molecule-AI/molecule-core` to avoid leaking the value into shell history. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 15:35:14 -07:00
Hongming Wang	5e46ea70d6	ci(synth-e2e): wire MOLECULE_STAGING_OPENAI_KEY into provisioned tenant The synth-E2E (#2342) provisions a langgraph tenant whose default model `openai:gpt-4.1-mini` requires OPENAI_API_KEY for the first LLM call. Sibling workflows already wire this: - e2e-staging-saas.yml:89 - canary-staging.yml:63 continuous-synth-e2e.yml just forgot. Result: tenant boots, accepts a2a messages, then returns: Agent error: "Could not resolve authentication method. Expected either api_key or auth_token to be set." This was masked since 2026-04-29 (workflow creation) by a2a-sdk v0→v1 contract violations — PR #2558 (Task-enqueue) and #2563 (TaskUpdater.complete/failed terminal events) cleared those, exposing the underlying auth gap on the synth-E2E firing at 11:39 UTC today. The script tests/e2e/test_staging_full_saas.sh:325 already reads E2E_OPENAI_API_KEY and persists it as a workspace_secret on tenant create — only the workflow wiring was missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:43:07 -07:00
dependabot[bot]	3598eb41d1	chore(deps)(deps): bump actions/checkout from 4 to 6 Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6. - [Release notes](https://github.com/actions/checkout/releases) - [Commits](https://github.com/actions/checkout/compare/v4...v6) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-05-02 19:23:01 +00:00
Hongming Wang	db5d11ffca	ci: continuous synthetic E2E against staging (#2342) Hard gate Tier 2 item 2 of 4. Cron-driven full-lifecycle E2E that catches regressions visible only at runtime — schema drift, deployment-pipeline gaps, vendor outages, env-var rotations, DNS / CF / Railway side-effects. Empirical motivation from today: - #2345 (A2A v0.2 silent drop) — passed unit tests, broke at JSON-RPC parse layer between sender + receiver. Visible only when a sender exercises the full path. Now-fixed by PR #2349, but a continuous E2E would have surfaced it within 20 min of the regression. - RFC #2312 chat upload — landed staging-branch but never reached staging tenants because publish-workspace-server-image was main-only. Caught by manual dogfooding hours after deploy. Same pattern. Both classes are invisible to PR-time CI. The continuous gate fires every 20 min against a real staging tenant and surfaces regressions within minutes. Cadence: cron `0,20,40 * * * *` (3x/hour). Offsets the existing sweep-cf-orphans (:15) and sweep-cf-tunnels (:45) so the three ops don't burst CF/AWS APIs at the same minute. Concurrency group prevents overlapping runs if one hangs. Cost: ~$0.50-1/day GHA + pennies of staging tenant lifecycle. Reuses existing tests/e2e/test_staging_full_saas.sh — no new harness to maintain. Bounded at 10 min wall-clock (vs 15 min default) so stuck runs fail fast rather than holding up the next firing. Defaults to E2E_RUNTIME=langgraph (fastest cold start; the regression classes this gate catches don't need hermes-specific paths). Operators can dispatch with runtime=hermes when they want SDK-native coverage. Schedule-vs-dispatch hardening: hard-fail on missing CP_STAGING_ADMIN_API_TOKEN for cron firing (silent-skip would mask real outages); soft-skip for operator dispatch.
Refs: - #2342 hard-gates Tier 2 item 2 - #2345 (A2A v0.2 fix that this gate would have caught earlier) - #2335 / #2337 (deployment-pipeline gaps that this gate also catches)	2026-04-29 22:04:57 -07:00
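The schedule-vs-dispatch hardening described in db5d11ffca (and reused for the per-runtime key guard in 79a0203798) could look roughly like the sketch below. This is not the workflow's actual step: the function name `check_required_secret` and its "run"/"skip" output are hypothetical, chosen only for illustration. The event names `schedule` and `workflow_dispatch` are the standard GitHub Actions values of `github.event_name`.

```shell
#!/usr/bin/env bash
# Hypothetical guard: decide what to do when a required secret may be missing.
#   $1 = triggering event ("schedule" or "workflow_dispatch")
#   $2 = secret name (used only in the error message)
#   $3 = secret value (may be empty)
check_required_secret() {
  local event="$1" name="$2" value="$3"

  # Secret present: proceed with the E2E run.
  if [ -n "$value" ]; then
    echo "run"
    return 0
  fi

  # Cron firing with a missing secret: hard-fail, since a silent skip
  # would hide a real outage or a rotated/removed secret.
  if [ "$event" = "schedule" ]; then
    echo "fail: $name is not set" >&2
    return 1
  fi

  # Operator dispatch with a missing secret: soft-skip so ad-hoc runs
  # do not require the secret to be configured first.
  echo "skip"
  return 0
}
```

In a workflow step, the caller might pass `"$GITHUB_EVENT_NAME"` as the first argument and exit early with status 0 when the function prints `skip`.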

4 Commits
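
The key precedence that 79a0203798 describes for SECRETS_JSON (MiniMax wins when both keys are present) can be sketched as follows. This is a simplified illustration, not the code in tests/e2e/test_staging_full_saas.sh: the function name `build_secrets_json` is hypothetical, the legacy branch emits only OPENAI_API_KEY (the HERMES_* and MODEL_PROVIDER fields named in the commit are omitted because their exact names and values are not given), and it assumes the keys contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the if/elif precedence.
# Reads E2E_MINIMAX_API_KEY and E2E_OPENAI_API_KEY from the environment
# and prints the secrets payload for the tenant.
build_secrets_json() {
  if [ -n "${E2E_MINIMAX_API_KEY:-}" ]; then
    # claude-code path: MiniMax is checked first, so the OpenAI key is
    # never consumed when both are set.
    printf '{"MINIMAX_API_KEY":"%s"}\n' "$E2E_MINIMAX_API_KEY"
  elif [ -n "${E2E_OPENAI_API_KEY:-}" ]; then
    # Legacy langgraph/hermes path (extra fields omitted in this sketch).
    printf '{"OPENAI_API_KEY":"%s"}\n' "$E2E_OPENAI_API_KEY"
  else
    # Neither key set: empty payload.
    printf '{}\n'
  fi
}
```

Swapping the two branches would invert the precedence, which is the kind of regression the commit says tests/e2e/test_secrets_dispatch.sh is meant to catch.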