molecule-core/workspace-server/internal
Hongming Wang 731a9aef6e feat(platform): bootstrap-failed + console endpoints for CP watcher
Workspaces stuck in provisioning used to sit in "starting" for 10min
until the sweeper flipped them. The real signal — a runtime crash at
EC2 boot — lands on the serial console within seconds but nothing
listened. These endpoints close the loop.

1. POST /admin/workspaces/:id/bootstrap-failed
   The control plane's bootstrap watcher posts here when it spots
   "RUNTIME CRASHED" in ec2:GetConsoleOutput. Handler:
   - UPDATEs workspaces SET status='failed' only when status was
     'provisioning' (idempotent — a raced online/failed stays put)
   - Stores the error + log_tail in last_sample_error so the canvas
     can render the real stack trace, not a generic "timeout" string
   - Broadcasts WORKSPACE_PROVISION_FAILED with source='bootstrap_watcher'

2. GET /workspaces/:id/console
   Proxies to CP's new /cp/admin/workspaces/:id/console endpoint so
   the tenant platform can surface EC2 serial console output without
   holding AWS credentials. CPProvisioner.GetConsoleOutput is the
   client; returns 501 in non-CP deployments (docker-compose dev).

Both gated by AdminAuth — CP holds the tenant ADMIN_TOKEN that the
middleware accepts on its tier 2b branch.

Tests cover: happy-path fail, already-transitioned no-op, empty id,
log_tail truncation, and the 501 fallback when no CP is wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:11:34 -07:00
..
artifacts chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
bundle fix(bundle/exporter): add rows.Err() after child workspace enumeration 2026-04-19 21:46:36 +00:00
channels fix(security): cap webhook + config PATCH bodies (H3/H4) 2026-04-19 01:23:03 -07:00
crypto chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
db test: schema_migrations tracking — 4 cases (first boot, re-boot, mixed, down.sql filter) 2026-04-18 11:52:27 -07:00
envx chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
events chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
handlers feat(platform): bootstrap-failed + console endpoints for CP watcher 2026-04-20 17:11:34 -07:00
metrics chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
middleware fix(org-tokens): rate-limit mint, bound list, correct audit provenance 2026-04-20 14:22:38 -07:00
models feat: seed initial memories from org template and create payload (#1050) 2026-04-20 00:35:49 -07:00
orgtoken fix(org-tokens): rate-limit mint, bound list, correct audit provenance 2026-04-20 14:22:38 -07:00
plugins chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
provisioner feat(platform): bootstrap-failed + console endpoints for CP watcher 2026-04-20 17:11:34 -07:00
registry fix: harden stuck-provisioning UX — details crash, preflight, sweeper 2026-04-20 14:51:39 -07:00
router feat(platform): bootstrap-failed + console endpoints for CP watcher 2026-04-20 17:11:34 -07:00
scheduler Merge pull request #1007 from Molecule-AI/fix/scheduler-defer-busy-969 2026-04-19 20:21:16 -07:00
supervised chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
ws chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
wsauth chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00