molecule-core/workspace-server
Hongming Wang c6cb82e1c0 fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum
Migration 043 (2026-04-25) introduced the workspace_status enum but
omitted two values application code had been writing for days, so every
UPDATE that tried to write either value failed silently in production:

  'awaiting_agent' (since 2026-04-24, commit 1e8b5e01):
    - handlers/workspace.go:333 — external workspace pre-register
    - handlers/registry.go (via PR #2382) — liveness offline transition
    - registry/healthsweep.go (via PR #2382) — heartbeat-staleness sweep

  'hibernating' (since hibernation feature shipped):
    - handlers/workspace_restart.go:271 — DB-level claim before stop

All four/five sites swallowed the enum-cast error. User-visible impact:
external workspaces never transition to a stale state when their agent
disconnects (canvas shows them stuck on 'online'/'degraded' indefinitely),
new external workspaces never advance past 'provisioning', and idle
workspaces never auto-hibernate (resources held forever).

PR #2382 didn't *cause* this — it inherited the gap and added two more
silent-fail paths on top. The pre-existing two had been broken for five
days and went unnoticed because:

  1. sqlmock matches SQL by regex, not against the live enum constraint.
     Every test passed despite the prod-only failure.
  2. The handlers either drop the Exec error entirely (workspace.go:333)
     or log+continue without an alert (the other three).

Fix in three pieces:

  1. migrations/046_*.up.sql — ALTER TYPE workspace_status ADD VALUE
     'awaiting_agent', 'hibernating'. IF NOT EXISTS makes it idempotent
     across re-runs (RunMigrations re-applies until schema_migrations
     records the file). ALTER TYPE ADD VALUE doesn't take a heavy lock
     and commits immediately, safe under live traffic.

  2. migrations/046_*.down.sql — full rename → recreate → cast → drop
     recipe. Postgres has no DROP VALUE so this is the only honest
     rollback. Pre-flights existing rows to compatible values
     (awaiting_agent → offline, hibernating → hibernated) before the
     type swap.

  3. internal/db/workspace_status_enum_drift_test.go — static gate that
     parses every UPDATE/INSERT against `workspaces` in workspace-server/
     internal/, extracts every status literal, and asserts each is in
     the enum union (CREATE TYPE + every ALTER TYPE ADD VALUE). The gate
     runs in unit tests, no DB required, and would have caught both
     omissions on the day they shipped. Pattern matches
     feedback_behavior_based_ast_gates and feedback_mock_at_drifting_layer.

Verification:
  - go test ./internal/db/ -count=1 -race  ✓
  - go vet ./...                           ✓
  - Drift gate flips red if I delete either ADD VALUE from the migration
    (validated via local mutation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:52:05 -07:00
..
cmd/server feat(runtime): native_scheduler skip — primitive #3 of 6 2026-04-26 22:47:00 -07:00
internal fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum 2026-04-30 08:52:05 -07:00
migrations fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum 2026-04-30 08:52:05 -07:00
pkg/provisionhook feat(#1957): wire gh-identity plugin into workspace-server 2026-04-24 15:01:41 +00:00
.ci-force chore: force Platform(Go) CI run on main — validate go vet clean 2026-04-21 15:43:19 +00:00
.gitignore feat(ws-server): pull env from CP on startup 2026-04-19 02:41:15 -07:00
.golangci.yaml chore(workspace-server): add golangci.yaml disabling errcheck 2026-04-24 07:16:54 +00:00
Dockerfile chore: extract ContextMenu Zustand fix + a2a_proxy local-docker SSRF bypass + workspace-server Dockerfile GID entrypoint 2026-04-22 20:00:16 -07:00
Dockerfile.tenant feat(terminal): remote path via aws ec2-instance-connect + pty 2026-04-21 18:13:29 -07:00
entrypoint-tenant.sh fix(security): add USER directive before ENTRYPOINT in all tenant images (#1155) 2026-04-20 23:51:33 +00:00
go.mod chore(deps): batch dep bumps — 11 safe upgrades from 2026-04-28 dependabot wave 2026-04-28 16:25:46 -07:00
go.sum chore(deps): batch dep bumps — 11 safe upgrades from 2026-04-28 dependabot wave 2026-04-28 16:25:46 -07:00