molecule-core

Hongming Wang f4700858ac	2026-04-21 04:15:10 -07:00

feat(e2e): canary + canvas Playwright workflows; delegation mechanics

Three additions on top of `187a9bf`:

1. Canary (.github/workflows/canary-staging.yml)

   30-min cron that runs the full-SaaS harness in E2E_MODE=canary: one hermes workspace + one A2A PONG + teardown. ~8-min wall clock vs ~20-min for the full run.

   Alerting is self-contained: opens a single 'Canary failing' issue on first failure, comments on subsequent failures (no issue spam), auto-closes the issue on the next green run. Labels: canary-staging, bug.

   Safety-net teardown step sweeps e2e-YYYYMMDD-canary-* orgs tagged today so a runner cancel can't leak EC2.

2. Canvas Playwright (canvas/e2e/staging-*.ts + playwright.staging.config.ts + .github/workflows/e2e-staging-canvas.yml)

   staging-setup.ts provisions a fresh org + hermes workspace (same lifecycle as the bash harness, just in TypeScript).

   staging-tabs.spec.ts clicks through all 13 workspace-panel tabs (chat, activity, details, skills, terminal, config, schedule, channels, files, memory, traces, events, audit) and asserts each renders without crashing and without 'Failed to load' error toasts. Known SaaS gaps (Files empty, Terminal disconnects, Peers 401) are documented in #1369 and whitelisted so they don't fail the test; the gate is 'no hard crash', not 'no issues'.

   staging-teardown.ts deletes the org via DELETE /cp/admin/tenants/:slug.

   playwright.staging.config.ts separates staging from local tests so pnpm test in dev doesn't try to provision against staging. Retries=2 and timeouts are longer; workers=1 because the setup provisions one shared workspace.

   Workflow uploads HTML report + screenshots on failure for 14 days.

3. Delegation mechanics (tests/e2e/test_staging_full_saas.sh section 10)

   Parent → child proxy test: POST /workspaces/CHILD/a2a with X-Source-Workspace-Id=PARENT and verify the child responds + child activity log captures PARENT as source.

   Intentionally LLM-free: the mechanics regression is what matters; prompt-driven delegation correctness belongs in canvas-driven tests.

   Also reorders teardown step to 11/11 since delegation is 10/11.

Mode gating: E2E_MODE=canary -> skips child workspace, HMA memory, peers, activity, delegation (steps 6, 9, 10 no-op). Full-lifecycle still runs every piece. Validated both paths via 'bash -n' syntax check after each edit.

Secrets requirement unchanged (same two secrets as `187a9bf`): MOLECULE_STAGING_SESSION_COOKIE, MOLECULE_STAGING_ADMIN_TOKEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
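
The mode gating and canary alerting rules in the commit message above can be read as two small decision functions. The following is a minimal bash sketch under stated assumptions, not the code in test_staging_full_saas.sh or canary-staging.yml: the function names `should_run_step` and `canary_alert_action`, their argument conventions, and the default mode name `full` are hypothetical. Only the skipped step numbers (6, 9, 10) and the open / comment / close behavior come from the commit message.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Function names, arguments, and the "full" default
# are assumptions made for this example, not taken from the repository.

# Mode gate: in canary mode, steps 6, 9 and 10 are no-ops; any other mode
# runs every step. Returns 0 if the step should run, 1 if it is skipped.
should_run_step() {
  local step="$1"
  if [[ "${E2E_MODE:-full}" == "canary" ]]; then
    case "$step" in
      6|9|10) return 1 ;;
    esac
  fi
  return 0
}

# Alert decision: prints the action to take on the tracking issue.
#   $1 = "success" or "failure" (result of the current canary run)
#   $2 = "true" if a 'Canary failing' issue is already open, else "false"
canary_alert_action() {
  local result="$1" issue_open="$2"
  if [[ "$result" == "failure" ]]; then
    if [[ "$issue_open" == "true" ]]; then
      echo "comment"   # subsequent failure: comment, do not open a new issue
    else
      echo "open"      # first failure: open the single tracking issue
    fi
  elif [[ "$issue_open" == "true" ]]; then
    echo "close"       # next green run: auto-close the issue
  else
    echo "none"        # green run with no open issue: nothing to do
  fi
}
```

With `E2E_MODE=canary`, a harness written this way could guard each section with `if should_run_step 10; then ...; fi`, so step 11 (teardown) still runs while step 10 (delegation) is skipped. Keeping the alert decision as a pure function of the run result and the issue state is one way to make the "no issue spam" rule easy to check in isolation.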
_extract_token.py	chore: apply round-7 review nits	2026-04-13 17:08:45 -07:00
_lib.sh	feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6)	2026-04-14 09:35:26 -07:00
STAGING_SAAS_E2E.md	feat(e2e): staging full-SaaS workflow — per-run org provision + leak-free teardown	2026-04-21 03:54:09 -07:00
test_a2a_e2e.sh	initial commit — Molecule AI platform	2026-04-13 11:55:37 -07:00
test_activity_e2e.sh	chore: apply code-review round-6 suggestions	2026-04-13 17:08:45 -07:00
test_api.sh	fix(e2e): stop asserting current_task on public workspace GET (#966)	2026-04-19 02:19:15 -07:00
test_claude_code_e2e.sh	chore: final open-source cleanup — binary, stale paths, private refs	2026-04-18 00:38:55 -07:00
test_comprehensive_e2e.sh	fix(e2e): make provisioning-status assertions robust to CI environment	2026-04-13 17:31:07 -07:00
test_saas_tenant.sh	chore: final open-source cleanup — binary, stale paths, private refs	2026-04-18 00:38:55 -07:00
test_staging_full_saas.sh	feat(e2e): canary + canvas Playwright workflows; delegation mechanics	2026-04-21 04:15:10 -07:00