molecule-core/workspace-server/internal
Hongming Wang 2ed4f4fb41 feat(delegations): operator dashboard endpoint over the durable ledger (RFC #2829 PR-4)
Two read endpoints over the `delegations` table (PR-1 schema):

  GET /admin/delegations[?status=in_flight|stuck|failed|completed&limit=N]
  GET /admin/delegations/stats

## What this gives operators

Without this, post-incident investigation requires direct DB access —
only the on-call SRE can answer "is workspace X delegating to a wedged
callee?". This moves that visibility into the same surface as
/admin/queue, /admin/schedules-health, /admin/memories.

## List endpoint

Status filter via tight allowlist:

  - in_flight (default) → status IN (queued, dispatched, in_progress)
  - stuck               → status='stuck' (rows the PR-3 sweeper marked)
  - failed              → status='failed'
  - completed           → status='completed'

Unknown status → 400 with the allowlist in the error body. Limit
1..1000, default 100.

The status allowlist drives a parameterized IN clause (no string-
concatenation of user-controlled values into SQL).

Result rows expose all the audit-grade fields the dashboard needs:
delegation_id, caller_id, callee_id, task_preview, status,
last_heartbeat, deadline, result_preview, error_detail, retry_count,
created_at, updated_at. Nullable fields use pointer types so JSON
omits them when NULL (no false-zero "" for missing values).

## Stats endpoint

Zero-fills every known status key (queued, dispatched, in_progress,
completed, failed, stuck) so the dashboard summary card doesn't have
to handle "missing key vs zero" branching.

## Out of scope (deferred)

  - "retry this stuck task" mutation: needs the agent-side cutover
    (RFC #2829 PR-5 plan) before re-fire is safe
  - p95 / p99 duration aggregates: separate metric exposure, not a
    row-level read endpoint
  - Canvas UI: this is the JSON contract; the canvas operator panel
    consumes it in a follow-up canvas PR

## Wiring

NOT wired into the router in this PR — ships separately to keep
PR-by-PR review surface tight. Wiring will land in the
`enable-rfc2829-server-side` follow-up PR alongside the sweeper Start
call and the result-push flag flip.

## Coverage

11 unit tests:

List (8):
  - default status=in_flight, IN(queued,dispatched,in_progress)
  - status=stuck → IN(stuck)
  - status=failed → IN(failed)
  - unknown status → 400 with allowlist
  - negative limit → 400
  - over-cap limit → 400
  - custom limit accepted + echoed in response
  - nullable fields populated correctly (pointer-omitempty)

Stats (2):
  - zero-fills missing status keys
  - empty table → all counts zero

Contract pin (1):
  - statusFilters table shape — every documented key + value pair
    pinned. Drift catches accidental edits (forward defense).

Refs RFC #2829.
2026-05-04 20:58:17 -07:00
..
artifacts chore: sync staging to main — 1188 commits, 5 conflicts resolved (#1743) 2026-04-23 18:30:18 +00:00
buildinfo feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy 2026-04-30 10:55:08 -07:00
bundle refactor(workspace-status): typed constants + AST-based drift gate 2026-04-30 10:41:41 -07:00
channels feat(channels): first-class Lark/Feishu support via schema-driven config 2026-04-24 11:51:15 -07:00
crypto chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
db refactor(workspace-status): catch missed literal in workspace_bootstrap.go + add literal-drift gate 2026-04-30 10:51:01 -07:00
envx chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
events test(handlers): introduce events.EventEmitter interface (#1814 partial) 2026-04-26 09:05:52 -07:00
handlers feat(delegations): operator dashboard endpoint over the durable ledger (RFC #2829 PR-4) 2026-05-04 20:58:17 -07:00
imagewatch feat(workspace-server): GHCR digest watcher closes runtime CD chain (#2114) 2026-04-26 13:36:26 -07:00
memory fix(memory v2): warn at boot when cutover env half-configured 2026-05-04 17:24:11 -07:00
metrics chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
middleware fix(tenant-guard): allowlist /buildinfo so redeploy verifier can reach it 2026-04-30 12:54:51 -07:00
models refactor(workspace-status): typed constants + AST-based drift gate 2026-04-30 10:41:41 -07:00
orgtoken fix: F1085 rm scope concat + GH#756 ValidateToken terminal guard + CI test fixes 2026-04-24 07:16:54 +00:00
plugins chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
provisioner feat(provisioner): digest-pin workspace images via runtime_image_pins (#2272 layer 1) 2026-05-03 02:30:00 -07:00
registry fix(orphan-sweeper): exclude runtime='external' from stale-token revoke 2026-05-03 00:49:37 -07:00
router fix(provision): StopWorkspaceAuto mirror — close SaaS EC2-leak class 2026-05-04 20:00:23 -07:00
scheduler feat(runtime): native_scheduler skip — primitive #3 of 6 2026-04-26 22:47:00 -07:00
supervised chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
ws chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
wsauth perf(wsauth): in-process cache for platform_inbound_secret reads 2026-05-03 00:04:38 -07:00