molecule-core

Hongming Wang 46daae1ffb fix(provision): entry log + panic recovery on workspace provision goroutines	2026-05-01 19:14:20 -07:00

Issue #2486: 7 claude-code workspaces stuck in provisioning produced NONE of the four documented exit-path log lines in provisionWorkspaceCP: neither prepare-failed, nor start-failed, nor persist-instance-id-failed, nor success. Operators couldn't tell whether the goroutine ran at all.

Add an entry log at the top of provisionWorkspaceOpts + provisionWorkspaceCP so a missing entry distinguishes "goroutine never started" from "started but exited via an unlogged path."

Add logProvisionPanic at the same defer site so a panic inside either provisioner doesn't (a) crash the whole workspace-server process, taking every other tenant workspace with it, and (b) silently leave the row in `provisioning` until the 10-min sweeper fires. The recover persists status='failed' with a sanitized panic-class message via a fresh 10s context (the goroutine's own ctx may have been the one panicking).

Tests pin three contracts:
- no-op when no panic (otherwise every successful provision emits a spurious log line)
- recovers + persists failed status on panic, with stack trace
- defense-in-depth: if the persist itself fails, log it instead of leaving the operator with a recovered-panic log but no row

Regression-injected by neutering the recover() body: all three tests fail until the recover + UPDATE path is restored.

This is observability + resilience only, not a root-cause fix for #2486. The actual silent-drop class still needs reproduction once the tenant is on a build that includes this entry log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
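
The pattern this commit describes (entry log, deferred recover, failed status persisted through a fresh 10s context) might look roughly like the Go sketch below. It is an illustration under stated assumptions, not the repository's code: `statusStore`, `memStore`, `recoverProvisionPanic`, and `provision` are hypothetical names, since the real signatures of logProvisionPanic and provisionWorkspaceCP are not shown in this listing, and the in-memory store stands in for the database UPDATE.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"runtime/debug"
	"time"
)

// statusStore is a hypothetical stand-in for the layer that updates the
// workspace row; the real code presumably issues a SQL UPDATE.
type statusStore interface {
	MarkFailed(ctx context.Context, workspaceID, reason string) error
}

// memStore is an in-memory statusStore used only for this illustration.
type memStore struct{ failed map[string]string }

func (m *memStore) MarkFailed(_ context.Context, workspaceID, reason string) error {
	m.failed[workspaceID] = reason
	return nil
}

// recoverProvisionPanic must be invoked directly by defer so that recover()
// sees the panic. It does nothing when there was no panic.
func recoverProvisionPanic(store statusStore, workspaceID string) {
	r := recover()
	if r == nil {
		return // successful provisions emit no extra log line
	}
	log.Printf("provision: panic workspace=%s type=%T\n%s", workspaceID, r, debug.Stack())

	// Fresh context: the goroutine's own context may be cancelled or be the
	// thing that panicked, so it is not reused here.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Sanitized message: record only the panic's type, not its value.
	reason := fmt.Sprintf("provision panic (%T)", r)
	if err := store.MarkFailed(ctx, workspaceID, reason); err != nil {
		log.Printf("provision: could not persist failed status workspace=%s: %v", workspaceID, err)
	}
}

// provision logs on entry, then runs the work under the deferred recover.
func provision(store statusStore, workspaceID string, work func()) {
	log.Printf("provision: start workspace=%s", workspaceID)
	defer recoverProvisionPanic(store, workspaceID)
	work()
}

func main() {
	store := &memStore{failed: map[string]string{}}
	done := make(chan struct{})
	go func() {
		defer close(done)
		provision(store, "ws-1", func() { panic("boom") })
	}()
	<-done // the process survives; the panic was contained in the goroutine
	fmt.Println(store.failed["ws-1"])
}
```

Because the entry log is the first statement, a missing "start" line for a workspace suggests the goroutine never ran, while a "start" line with no later exit line points to an unlogged exit path.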
cmd/server	fix(boot): always start health-sweep goroutine — SaaS tenants need it for external-runtime liveness	2026-04-30 12:05:40 -07:00
internal	fix(provision): entry log + panic recovery on workspace provision goroutines	2026-05-01 19:14:20 -07:00
migrations	fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum	2026-04-30 08:52:05 -07:00
pkg/provisionhook	feat(#1957): wire gh-identity plugin into workspace-server	2026-04-24 15:01:41 +00:00
.ci-force	chore: force Platform(Go) CI run on main — validate go vet clean	2026-04-21 15:43:19 +00:00
.gitignore	feat(ws-server): pull env from CP on startup	2026-04-19 02:41:15 -07:00
.golangci.yaml	chore(workspace-server): add golangci.yaml disabling errcheck	2026-04-24 07:16:54 +00:00
Dockerfile	feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy	2026-04-30 10:55:08 -07:00
Dockerfile.tenant	feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy	2026-04-30 10:55:08 -07:00
entrypoint-tenant.sh	fix(security): add USER directive before ENTRYPOINT in all tenant images (#1155)	2026-04-20 23:51:33 +00:00
go.mod	chore(deps): batch dep bumps — 11 safe upgrades from 2026-04-28 dependabot wave	2026-04-28 16:25:46 -07:00
go.sum	chore(deps): batch dep bumps — 11 safe upgrades from 2026-04-28 dependabot wave	2026-04-28 16:25:46 -07:00