molecule-core/workspace-server/migrations
rabbitblood 87a97846cd feat(a2a): queue-on-busy — Phase 1 of priority queue (#1870)
## Problem

When a lead delegates to a worker that's mid-synthesis, the proxy returns
503 "workspace agent busy" and the caller records the delegation as
failed. On fan-out storms from leads this hits ~70% drop rate — today's
observed numbers in the cycle reports.

## Fix — Phase 1 TASK-level queue-on-busy

When `handleA2ADispatchError` determines the target is busy, instead of
returning 503, enqueue the request as priority=TASK and return 202
Accepted with `{queued: true, queue_id, queue_depth}`. The workspace's
next heartbeat (≤30s) drains one item if it reports spare capacity.

Files:

  - migrations/042_a2a_queue.{up,down}.sql — `a2a_queue` table with
    partial indexes on status='queued' + idempotency_key. Schema
    supports PriorityCritical/Task/Info from day one so Phase 2/3 ship
    without migration churn.

  - internal/handlers/a2a_queue.go — EnqueueA2A / DequeueNext /
    Mark*-helpers plus WorkspaceHandler.DrainQueueForWorkspace. Uses
    `SELECT ... FOR UPDATE SKIP LOCKED` so concurrent drains can't
    double-claim the same row. Max 5 attempts before marking 'failed'
    so a stuck item doesn't wedge the queue forever.

  - internal/handlers/a2a_proxy_helpers.go — isUpstreamBusyError branch
    calls EnqueueA2A and returns 202 on success. Falls through to the
    legacy 503 on enqueue error (DB hiccup shouldn't silently drop).

  - internal/handlers/registry.go — RegistryHandler gets a QueueDrainFunc
    injection hook (SetQueueDrainFunc). When Heartbeat sees
    active_tasks < max_concurrent_tasks, spawns a goroutine that calls
    the drain hook. context.WithoutCancel ensures the drain outlives
    the heartbeat handler's ctx.

  - internal/router/router.go — wires wh.DrainQueueForWorkspace into
    rh.SetQueueDrainFunc after both are constructed.

## Not in this PR (Phase 2/3/4 follow-ups)

  - INFO priority + TTL (Phase 2)
  - CRITICAL priority + soft preemption between tool calls (Phase 3)
  - Age-based promotion so TASK doesn't starve (Phase 4)
  - `GET /workspaces/:id/queue` observability endpoint

Schema already supports all of these; only the dispatch + policy code
remains.

## Tests

  - TestExtractIdempotencyKey (5 cases): messageId parsing is robust
  - TestPriorityConstants: ordering invariant + 50=TASK default
    alignment with migration DEFAULT

Full DB-touching tests (FIFO order, retry bound, idempotency conflict)
intentionally deferred to the CI migration-enabled path — sqlmock
ceremony would duplicate the existing test infrastructure 3× over and
the behaviour is directly expressible in SQL constraints (FOR UPDATE
SKIP LOCKED, partial unique index).

## Expected impact once deployed

  - a2a_receive error with "busy" flavor drops from ~69/10min observed
    today to ~0
  - delegation_failed rate drops from ~50% to <5%
  - real_output metric rises from ~30/15min back toward the pre-
    throttle baseline

Closes #1870 Phase 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:09:29 -07:00
..
001_workspaces.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
002_agents.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
003_events.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
004_secrets.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
005_canvas_layouts.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
006_workspace_config_memory.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
007_approvals.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
008_agent_memories.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
009_activity_logs.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
010_workspace_awareness.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
011_workspace_runtime.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
012_global_secrets.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
013_workspace_dir.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
014_indexes.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
015_workspace_schedules.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
016_workspace_channels.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
017_memories_fts_namespace.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
017_memories_fts_namespace.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
018_secrets_encryption_version.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
018_secrets_encryption_version.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
019_workspace_access.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
019_workspace_access.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
020_workspace_auth_tokens.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
020_workspace_auth_tokens.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
021_delegation_idempotency.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
021_delegation_idempotency.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
022_workspace_schedules_source.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
022_workspace_schedules_source.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
023_workspace_memory_version.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
023_workspace_memory_version.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
024_channel_budget.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
024_channel_budget.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
025_workspace_token_usage.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
025_workspace_token_usage.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
026_org_plugin_allowlist.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
026_org_plugin_allowlist.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
027_workspace_budget.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
027_workspace_budget.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
028_workspace_artifacts.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
028_workspace_artifacts.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
029_workspace_hibernation.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
029_workspace_hibernation.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
030_audit_events.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
030_audit_events.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
031_memories_pgvector.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
031_memories_pgvector.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
032_schedule_consecutive_empty.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
032_schedule_consecutive_empty.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
033_strip_crlf_cron_prompts.up.sql fix(scheduler): strip CRLF from cron prompts on insert/update (closes #958) 2026-04-18 07:45:14 -07:00
034_workspaces_last_outbound_at.up.sql feat(platform): track last_outbound_at for silent-workspace detection (closes #817) 2026-04-18 13:04:54 -07:00
035_org_api_tokens.down.sql feat(auth): organization-scoped API keys for admin access 2026-04-20 14:01:41 -07:00
035_org_api_tokens.up.sql feat(auth): organization-scoped API keys for admin access 2026-04-20 14:01:41 -07:00
036_org_api_tokens_org_id.down.sql fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1234) 2026-04-21 02:47:12 +00:00
036_org_api_tokens_org_id.up.sql fix(auth): F1094 — requireCallerOwnsOrg reads org_id not created_by (#1234) 2026-04-21 02:47:12 +00:00
037_max_concurrent_tasks.down.sql fix: CWE-78 rm scope, go vet failures, delegation idempotency 2026-04-21 18:22:30 +00:00
037_max_concurrent_tasks.up.sql fix: CWE-78 rm scope, go vet failures, delegation idempotency 2026-04-21 18:22:30 +00:00
038_workspace_instance_id.down.sql feat(workspace): persist CP-returned EC2 instance_id on provision 2026-04-21 17:56:15 -07:00
038_workspace_instance_id.up.sql feat(workspace): persist CP-returned EC2 instance_id on provision 2026-04-21 17:56:15 -07:00
039_activity_tool_trace.down.sql feat: add tool_trace to activity_logs for platform-level agent observability 2026-04-22 15:17:14 -07:00
039_activity_tool_trace.up.sql feat: add tool_trace to activity_logs for platform-level agent observability 2026-04-22 15:17:14 -07:00
040_platform_instructions.down.sql feat: platform instructions system with global/team/workspace scope 2026-04-22 15:17:14 -07:00
040_platform_instructions.up.sql fix(review): address code review blockers on tool-trace + instructions 2026-04-22 16:18:06 -07:00
042_a2a_queue.down.sql feat(a2a): queue-on-busy — Phase 1 of priority queue (#1870) 2026-04-23 14:09:29 -07:00
042_a2a_queue.up.sql feat(a2a): queue-on-busy — Phase 1 of priority queue (#1870) 2026-04-23 14:09:29 -07:00
20260417000000_workflow_checkpoints.down.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
20260417000000_workflow_checkpoints.up.sql chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00