molecule-core/workspace-server/internal/db
Hongming Wang c6cb82e1c0 fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum
Migration 043 (2026-04-25) introduced the workspace_status enum but
omitted two values application code had been writing for days, so every
UPDATE that tried to write either value failed silently in production:

  'awaiting_agent' (since 2026-04-24, commit 1e8b5e01):
    - handlers/workspace.go:333 — external workspace pre-register
    - handlers/registry.go (via PR #2382) — liveness offline transition
    - registry/healthsweep.go (via PR #2382) — heartbeat-staleness sweep

  'hibernating' (since hibernation feature shipped):
    - handlers/workspace_restart.go:271 — DB-level claim before stop

All four/five sites swallowed the enum-cast error. User-visible impact:
external workspaces never transition to a stale state when their agent
disconnects (canvas shows them stuck on 'online'/'degraded' indefinitely),
new external workspaces never advance past 'provisioning', and idle
workspaces never auto-hibernate (resources held forever).

PR #2382 didn't *cause* this — it inherited the gap and added two more
silent-fail paths on top. The pre-existing two had been broken for five
days and went unnoticed because:

  1. sqlmock matches SQL by regex, not against the live enum constraint.
     Every test passed despite the prod-only failure.
  2. The handlers either drop the Exec error entirely (workspace.go:333)
     or log+continue without an alert (the other three).

Fix in three pieces:

  1. migrations/046_*.up.sql — ALTER TYPE workspace_status ADD VALUE
     'awaiting_agent', 'hibernating'. IF NOT EXISTS makes it idempotent
     across re-runs (RunMigrations re-applies until schema_migrations
     records the file). ALTER TYPE ADD VALUE doesn't take a heavy lock
     and commits immediately, safe under live traffic.

  2. migrations/046_*.down.sql — full rename → recreate → cast → drop
     recipe. Postgres has no DROP VALUE so this is the only honest
     rollback. Pre-flights existing rows to compatible values
     (awaiting_agent → offline, hibernating → hibernated) before the
     type swap.

  3. internal/db/workspace_status_enum_drift_test.go — static gate that
     parses every UPDATE/INSERT against `workspaces` in workspace-server/
     internal/, extracts every status literal, and asserts each is in
     the enum union (CREATE TYPE + every ALTER TYPE ADD VALUE). The gate
     runs in unit tests, no DB required, and would have caught both
     omissions on the day they shipped. Pattern matches
     feedback_behavior_based_ast_gates and feedback_mock_at_drifting_layer.

Verification:
  - go test ./internal/db/ -count=1 -race  ✓
  - go vet ./...                           ✓
  - Drift gate flips red if I delete either ADD VALUE from the migration
    (validated via local mutation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 08:52:05 -07:00
..
architecture_test.go test(arch): codify 4 module boundaries as architecture tests (#2344) 2026-04-29 22:12:58 -07:00
postgres_migrate_test.go chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
postgres_schema_migrations_test.go test: schema_migrations tracking — 4 cases (first boot, re-boot, mixed, down.sql filter) 2026-04-18 11:52:27 -07:00
postgres.go feat(db): schema_migrations tracking — migrations only run once 2026-04-18 07:39:20 -07:00
redis.go chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
workspace_status_enum_drift_test.go fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum 2026-04-30 08:52:05 -07:00