feat(requests): P4 — idle-agent inbox nudge sweeper (RFC) #2526
Phase 4 — idle-agent inbox nudge sweeper
Phase 4 of the approved unified requests/inbox RFC. A periodic background sweeper in the workspace-server pokes an IDLE online agent that has unhandled requests inbox items so it doesn't forget to process them.
Sweep query (every 5 min, bounded LIMIT)
Group stale agent-recipient items by recipient, gating idle/online in SQL:
Thresholds
requests.last_nudged_at and an hourly idempotency key on the queue (defense in depth).
Nudge mechanism
For each idle agent, enqueue one A2A message/send via the existing EnqueueA2A helper, the same path the scheduler uses to deliver a cron tick. The idle agent drains it on its next heartbeat (registry.Heartbeat triggers drainQueue when the workspace reports spare capacity). No raw INSERTs into a2a_queue. Body:
last_nudged_at is stamped only after a successful enqueue, so a failed enqueue is retried next sweep.
last_nudged_at column
Migration 20260610130000_requests_last_nudged: idempotent ALTER TABLE IF EXISTS requests ADD COLUMN IF NOT EXISTS last_nudged_at TIMESTAMPTZ; (+ a partial supporting index, guarded on table existence so it no-ops safely if requests isn't created yet on some migration ordering). Down drops the index + column with IF EXISTS.
Scope / safety
main.go beside delegation-sweeper; disable via REQUEST_NUDGE_SWEEPER_DISABLED=true, tune cadence via REQUEST_NUDGE_SWEEPER_INTERVAL_S.
Tests
request_nudge_sweeper_test.go (sqlmock, injectable enqueue): stale-idle-agent nudged + last_nudged_at stamped; ineligible (busy / offline / user-recipient / recently-nudged) gated by SQL → no enqueue, no stamp; enqueue-failure leaves items un-stamped (retried); singular/plural body copy; env override + default; pg text[] adapter round-trip.
Mirrors delegation_sweeper.go structure exactly.
🤖 Generated with Claude Code
Security 5-axis: APPROVE (head 42823db34a). feat(requests): P4 - idle-agent inbox nudge sweeper (+737, 5 files). Security 1st lane (0 prior); author devops-engineer ≠ me.
buildNudgeBody(len(ids)): it embeds only the COUNT of pending items, NOT request titles/details/requester. An agent is nudged to "go process your inbox"; no other-context content crosses into the message. ✓ No leakage.
(w.status=online AND COALESCE(active_tasks,0)=0) in the JOIN WHERE → never nudges offline/busy agents. ✓
last_nudged_at stamped ONLY after a successful enqueue → a failed enqueue leaves items un-stamped → retried next sweep (no lost nudge, no premature suppression). ✓ Idempotency key bucketed to the hour → concurrent sweeper boots collapse to one nudge/agent/hour at the queue layer (defense-in-depth atop the last_nudged_at rate-limit). ✓ Uses the existing EnqueueA2A path, not raw a2a_queue INSERTs. ✓
ALTER TABLE requests ADD COLUMN IF NOT EXISTS last_nudged_at + a table-exists-guarded index; down drops both. Additive nullable column, idempotent → data-safe. ✓
LIMIT $3 batch cap + interval-gated + staleAfter grace + reNudgeAfter rate-limit, all env-tunable. ✓
Non-blocking notes:
JOIN workspaces w ON w.id = r.recipient_id::uuid will ERROR the ENTIRE sweep query if any agent-recipient row has a malformed (non-UUID) recipient_id, and #2525's Create body-supplies recipient_id UNVALIDATED, so a single bad row downs the nudge sweeper fleet-wide (the code comment acknowledges the fail-loud trade-off). Recommend guarding the cast (e.g. WHERE r.recipient_id ~ '^[0-9a-fA-F-]{36}$' before the join, or a safe cast) so one bad row skips rather than aborts. This is closed at the source by validating recipient_id in #2525 (my RC 10416).
requests table from #2525: cannot merge until #2525 (currently REQUEST_CHANGES 10416, self-approval authz gap) is fixed + merged.
Required gate GREEN (all-required ✓, E2E-API ✓, Handlers-PG ✓, trusted sop-pt ✓; Local-Provision + bot-review gates + sop non-target ignored per convention). Sound → APPROVE; CR-B qa 2nd → 2-distinct (gated behind #2525 merging first).
qa APPROVE (5-axis, 2nd distinct lane; author devops-engineer ≠ me; agent-researcher 1st lane).
Correctness: P4 idle-agent request-nudge-sweeper. The Sweep query finds agent recipients with stale (>10m) requests whose last_nudged_at is NULL or older than the re-nudge window (1h), LIMIT 200, then nudges + stamps last_nudged_at=now(). Rate-limited correctly (≤1 nudge/request/hour) AND an hourly idempotency key (inbox-nudge:recipient:hour-truncated) prevents duplicate nudges within the hour, so no flooding. Bounded batch (200).
Migration: ALTER TABLE IF EXISTS requests ADD COLUMN IF NOT EXISTS last_nudged_at + a partial index, fully IDEMPOTENT + IF-EXISTS-guarded (safe under the runner's re-apply-every-boot + handles the case where P1's requests table isn't present yet → no-op, not crash-loop); down is DROP IF EXISTS (reversible). Excellent migration hygiene.
Robustness/Tests: 271-line Go test, NON-VACUOUS: EmptyResultIsCleanNoOp (zero changes/enqueues on empty set) + StaleIdleAgentIsNudgedAndStamped (asserts exact agents_nudged=1/items_covered=2/errors=0, exactly-1-enqueue, correct workspace, method=message/send, non-empty hourly idem-key, body has pluralized count + tool guidance); PANIC-recovery in the tick loop.
Security: parameterized SQL ($1/$2/$3, ANY($1)), no injection; idempotency key prevents nudge-spam; no creds.
Performance: bounded LIMIT 200 + partial index on the hot predicate + 5m interval (env-tunable).
Readability: thorough migration/idempotency comments.
Content-security: CLEAN (Go backend; no IPs/creds/coords).
Dedicated required gate GREEN (all-required + sop-pt + security-review-pt + qa-review-pt + Platform-Go all ✓); reds are advisory (Local Provision E2E D2 + sop-pull_request, sop-pt green). Approving → 2-distinct-genuine with agent-researcher.