molecule-core

dev-lead 5c0c15eb4f chore(canary): workflow_dispatch input keep_on_failure for log capture

Investigating molecule-core#129 failure mode #1 (claude-code "Agent error (Exception)") needs the workspace's docker logs to find the actual exception. The canary tears down the tenant on every failure, so the workspace container is destroyed before anyone can SSM in.

Add a workflow_dispatch input `keep_on_failure: bool` (default false). When true, sets `E2E_KEEP_ORG=1` for the canary script; its existing debug path skips teardown, leaving the tenant + EC2 + CF tunnel + DNS alive. Operator can then SSM into the workspace EC2 (via the same flow as recover-tunnels.py) and capture `docker logs` from the claude-code container.

Cron-triggered runs never set the input (it only exists on dispatch), so unattended scheduled canaries always tear down: no risk of unattended cost leak.

Operator workflow:
  1. Dispatch canary-staging.yml with keep_on_failure=true
  2. Watch CI; on failure (likely, given the 38h chronic red), note the SLUG / TENANT_URL printed at step 1/11
  3. SSM exec into the workspace EC2 (us-east-2) and run `docker logs <claude-code-container>` to find the actual exception traceback
  4. Manually delete via DELETE /cp/admin/tenants/<slug> when done (the script logs this reminder on E2E_KEEP_ORG=1 path)

Refs: molecule-core#129 (canary investigation)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-08 10:58:19 -07:00
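
The input wiring described in the commit message could look roughly like the fragment below. This is a sketch under stated assumptions, not the actual contents of canary-staging.yml: the cron schedule, job name, step name, runner label, and script path are hypothetical. Only `keep_on_failure`, its default of false, and `E2E_KEEP_ORG=1` come from the message. It also assumes the canary script treats any value other than `1` as "tear down".

```yaml
on:
  schedule:
    - cron: "0 * * * *"          # hypothetical schedule
  workflow_dispatch:
    inputs:
      keep_on_failure:
        description: "Skip teardown on failure so workspace logs can be captured"
        type: boolean
        default: false

jobs:
  canary:                        # hypothetical job name
    runs-on: ubuntu-latest       # hypothetical runner label
    steps:
      # checkout and setup steps omitted
      - name: Run canary         # hypothetical step name
        env:
          # On schedule-triggered runs the input does not exist, so the
          # expression evaluates to '' and the script tears down as usual.
          E2E_KEEP_ORG: ${{ inputs.keep_on_failure && '1' || '' }}
        run: ./scripts/canary.sh # hypothetical script path
```

Deriving the variable from the `inputs` context, rather than hard-coding it, is what would keep scheduled runs on the teardown path: only a manual dispatch can supply `keep_on_failure=true`.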
scripts	fix(scripts): migrate ghcr.io→ECR + raw.githubusercontent.com→Gitea (#46)	2026-05-07 00:56:23 -07:00
workflows	chore(canary): workflow_dispatch input keep_on_failure for log capture	2026-05-08 10:58:19 -07:00
CODEOWNERS	chore: add CODEOWNERS to auto-route agent PRs to personal review account	2026-04-26 13:40:13 -07:00
dependabot.yml	chore(security): pin Actions to SHAs + enable Dependabot auto-bumps	2026-04-28 15:37:06 -07:00