molecule-core/workspace-server/migrations
Hongming Wang 9991057ad1 feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening
Resolves four of six findings from the retrospective code review of Phases
1–4 (poll-mode chat upload). Bundled because every change is in the
platform's pending_uploads layer or the multi-file handler that reads it.

Findings resolved:

1. Important — Sweep query lacked an index for the acked-retention OR-arm.
   The Phase 1 partial indexes are both `WHERE acked_at IS NULL`, so the
   `(acked_at IS NOT NULL AND acked_at < retention)` half of the WHERE
   clause seq-scanned the table on every cycle. Add a complementary
   partial index on `acked_at WHERE acked_at IS NOT NULL` so both arms
   of the disjunction are index-covered. Disjoint from the existing two
   indexes (no row matches both predicates), so write amplification is
   bounded to ~one index entry per terminal-state row.

2. Important — uploadPollMode partial-failure left orphans. The previous
   per-file Put loop committed rows 1..K-1 and then errored on row K with
   no compensation, so a client retry would double-insert the survivors.
   Refactor the handler into three explicit phases (pre-validate +
   read-into-memory, single atomic PutBatch, per-file activity row) and
   add Storage.PutBatch with all-or-nothing transaction semantics.

3. FYI — pendinguploads.StartSweeperWithInterval was exported only for
   tests. Move it to lower-case startSweeperWithInterval and expose the
   test seam through pendinguploads/export_test.go (Go convention; the
   shim file is stripped from the production binary at build time).

4. Nit — multipart Content-Type was passed verbatim into pending_uploads
   rows and re-served on /content. Add safeMimetype which strips
   parameters, rejects CR/LF/control bytes, and coerces malformed shapes
   to application/octet-stream. The eventual GET /content response can no
   longer be header-split via a crafted Content-Type on the multipart.

Comprehensive tests:

- 10 PutBatch unit tests (sqlmock): happy path, empty input, all four
  pre-validation rejection paths, BeginTx error, per-row error +
  Rollback (no Commit), first-row error, Commit error.
- 4 new PutBatch integration tests (real Postgres): all-rows-commit
  happy path with COUNT(*) verification, atomic-rollback no-leak via
  a NUL-byte filename that lib/pq rejects mid-batch, oversize
  short-circuit no-Tx, idx_pending_uploads_acked existence + partial
  predicate via pg_indexes (planner-shape-independent).
- 3 new chat_files_poll tests: atomic rollback on second-file oversize,
  atomic rollback on PutBatch error, mimetype CRLF/NUL/parameter
  sanitization (8 sub-cases).

The two remaining review findings (inbox_uploads.fetch_and_stage blocks
the poll loop synchronously; two httpx Clients per row) are Python-side
and ship in Phase 5b once this lands on staging.

Test-only export pattern via export_test.go, atomic pre-validation
discipline (validate before Tx), and behavior-based (not name-based)
test assertions follow the standing project conventions.
2026-05-05 11:10:13 -07:00
..
001_workspaces.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
002_agents.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
003_events.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
004_secrets.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
005_canvas_layouts.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
006_workspace_config_memory.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
007_approvals.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
008_agent_memories.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
009_activity_logs.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
010_workspace_awareness.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
011_workspace_runtime.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
012_global_secrets.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
013_workspace_dir.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
014_indexes.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
015_workspace_schedules.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
016_workspace_channels.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
017_memories_fts_namespace.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
017_memories_fts_namespace.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
018_secrets_encryption_version.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
018_secrets_encryption_version.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
019_workspace_access.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
019_workspace_access.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
020_workspace_auth_tokens.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
020_workspace_auth_tokens.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
021_delegation_idempotency.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
021_delegation_idempotency.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
022_workspace_schedules_source.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
022_workspace_schedules_source.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
023_workspace_memory_version.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
023_workspace_memory_version.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
024_channel_budget.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
024_channel_budget.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
025_workspace_token_usage.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
025_workspace_token_usage.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
026_org_plugin_allowlist.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
026_org_plugin_allowlist.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
027_workspace_budget.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
027_workspace_budget.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
028_workspace_artifacts.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
028_workspace_artifacts.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
029_workspace_hibernation.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
029_workspace_hibernation.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
030_audit_events.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
030_audit_events.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
031_memories_pgvector.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
031_memories_pgvector.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
032_schedule_consecutive_empty.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
032_schedule_consecutive_empty.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
033_strip_crlf_cron_prompts.up.sql fix(scheduler): strip CRLF from cron prompts on insert/update (closes #958) 2026-04-18 07:45:14 -07:00
034_workspaces_last_outbound_at.up.sql feat(platform): track last_outbound_at for silent-workspace detection (closes #817) 2026-04-18 13:04:54 -07:00
035_org_api_tokens.down.sql feat(auth): organization-scoped API keys for admin access 2026-04-20 14:01:41 -07:00
035_org_api_tokens.up.sql feat(auth): organization-scoped API keys for admin access 2026-04-20 14:01:41 -07:00
036_org_api_tokens_org_id.down.sql fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1234) 2026-04-21 02:47:12 +00:00
036_org_api_tokens_org_id.up.sql fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1234) 2026-04-21 02:47:12 +00:00
037_max_concurrent_tasks.down.sql fix: CWE-78 rm scope, go vet failures, delegation idempotency 2026-04-21 18:22:30 +00:00
037_max_concurrent_tasks.up.sql fix: CWE-78 rm scope, go vet failures, delegation idempotency 2026-04-21 18:22:30 +00:00
038_workspace_instance_id.down.sql feat(workspace): persist CP-returned EC2 instance_id on provision 2026-04-21 17:56:15 -07:00
038_workspace_instance_id.up.sql feat(workspace): persist CP-returned EC2 instance_id on provision 2026-04-21 17:56:15 -07:00
039_activity_tool_trace.down.sql feat: add tool_trace to activity_logs for platform-level agent observability 2026-04-22 15:17:14 -07:00
039_activity_tool_trace.up.sql feat: add tool_trace to activity_logs for platform-level agent observability 2026-04-22 15:17:14 -07:00
040_platform_instructions.down.sql feat: platform instructions system with global/team/workspace scope 2026-04-22 15:17:14 -07:00
040_platform_instructions.up.sql fix(review): address code review blockers on tool-trace + instructions 2026-04-22 16:18:06 -07:00
042_a2a_queue.down.sql feat(a2a): queue-on-busy — Phase 1 of priority queue (#1870) 2026-04-23 14:09:29 -07:00
042_a2a_queue.up.sql feat(a2a): queue-on-busy — Phase 1 of priority queue (#1870) 2026-04-23 14:09:29 -07:00
043_workspace_status_enum.down.sql chore: second-pass review polish — symmetry + clearer test fixtures 2026-04-25 08:48:30 -07:00
043_workspace_status_enum.up.sql fix: review-driven hardening of wedge detector + idle timeout + progress feed 2026-04-25 08:43:10 -07:00
044_platform_inbound_secret.down.sql feat(wsauth): platform→workspace inbound secret (RFC #2312, PR-A) 2026-04-29 14:09:33 -07:00
044_platform_inbound_secret.up.sql feat(wsauth): platform→workspace inbound secret (RFC #2312, PR-A) 2026-04-29 14:09:33 -07:00
045_workspaces_delivery_mode.down.sql feat(workspaces): delivery_mode column + poll-mode register flow (#2339 PR 1) 2026-04-29 21:47:14 -07:00
045_workspaces_delivery_mode.up.sql feat(workspaces): delivery_mode column + poll-mode register flow (#2339 PR 1) 2026-04-29 21:47:14 -07:00
046_workspace_status_awaiting_agent.down.sql fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum 2026-04-30 08:52:05 -07:00
046_workspace_status_awaiting_agent.up.sql fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum 2026-04-30 08:52:05 -07:00
047_runtime_image_pins.down.sql feat(provisioner): digest-pin workspace images via runtime_image_pins (#2272 layer 1) 2026-05-03 02:30:00 -07:00
047_runtime_image_pins.up.sql feat(provisioner): digest-pin workspace images via runtime_image_pins (#2272 layer 1) 2026-05-03 02:30:00 -07:00
048_activity_logs_peer_indexes.down.sql feat(db): add per-peer btree indexes on activity_logs for chat_history scale (#2478) 2026-05-03 11:34:35 -07:00
048_activity_logs_peer_indexes.up.sql feat(db): add per-peer btree indexes on activity_logs for chat_history scale (#2478) 2026-05-03 11:34:35 -07:00
049_delegations.down.sql feat(delegations): durable per-task ledger + audit-write helper (RFC #2829 PR-1) 2026-05-04 20:43:06 -07:00
049_delegations.up.sql feat(delegations): durable per-task ledger + audit-write helper (RFC #2829 PR-1) 2026-05-04 20:43:06 -07:00
20260417000000_workflow_checkpoints.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
20260417000000_workflow_checkpoints.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
20260505100000_pending_uploads.down.sql feat(rfc): poll-mode chat upload — phase 1 platform staging layer 2026-05-05 04:22:24 -07:00
20260505100000_pending_uploads.up.sql feat(rfc): poll-mode chat upload — phase 1 platform staging layer 2026-05-05 04:22:24 -07:00
20260505200000_pending_uploads_acked_index.down.sql feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening 2026-05-05 11:10:13 -07:00
20260505200000_pending_uploads_acked_index.up.sql feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening 2026-05-05 11:10:13 -07:00