molecule-core/platform/internal/handlers/delegation.go
Hongming Wang d751420679 test: 100% coverage of extracted helpers + ConfirmDialog singleButton
Follow-up to the quality-fixes-pass2 code review.

## Go: direct unit tests for PR #5 extracted helpers (~47 new tests)

a2a_proxy_test.go:
- resolveAgentURL: cache hit, cache-miss DB hit, not-found, null-URL,
  docker-rewrite guard
- dispatchA2A: build error, canvas timeout, agent timeout, success
- handleA2ADispatchError: context deadline, generic error, build error
- maybeMarkContainerDead: nil-provisioner, runtime=external short-circuits
- logA2AFailure, logA2ASuccess: activity_logs row content + status

delegation_test.go:
- bindDelegateRequest: valid / malformed / bad-UUID
- lookupIdempotentDelegation: no-key / no-match / failed-row-deleted / existing-pending
- insertDelegationRow: insertOK / insertHandledByIdempotent /
  insertTrackingUnavailable
- insertDelegationOutcome: zero-value is insertOutcomeUnknown sentinel

discovery_test.go:
- discoverWorkspacePeer: online / not-found / access-denied + 2 edges
- writeExternalWorkspaceURL: 3 cases
- discoverHostPeer: smoke test documents the unreachable-by-design path

activity_test.go:
- parseSessionSearchParams: defaults + custom limit/offset/q
- buildSessionSearchQuery: no-filters + with-query shapes
- scanSessionSearchRows: empty / single / multiple rows

Package coverage: 56.1% → 57.6%. Every helper extracted in PR #5 is
now at or near 100% line coverage (see PR notes for the 4 remaining
gaps, all blocked on provisioner interface mockability).

## Defensive enum zero-value fix

insertDelegationOutcome now starts with insertOutcomeUnknown=0 as a
sentinel so an un-initialized variable can't silently read as
"success". insertOK, insertHandledByIdempotent, insertTrackingUnavailable
shift to 1/2/3. No caller changes needed.

## Canvas: ConfirmDialog.singleButton test (5 cases)

canvas/src/components/__tests__/ConfirmDialog.test.tsx covers:
- default render (both buttons)
- singleButton hides Cancel
- singleButton: Escape still fires onCancel
- singleButton: backdrop-click still fires onCancel
- singleButton: onConfirm fires on click

vitest total: 352 → 357, all passing.

## Docstring clarity

ConfirmDialog.tsx: expanded singleButton prop comment to explicitly
instruct callers to pass the same handler for onConfirm/onCancel when
using it as an info toast (matches TemplatePalette usage).

## ErrorBoundary clipboard observability

.catch(() => {}) silently swallowed rejections. Now:
.catch((e) => console.warn("clipboard write failed:", e))
so permission-denied / insecure-context failures surface in the console.

## Verification

- go build ./... clean
- go vet ./... clean
- go test -race ./internal/... — all pass
- canvas npm run build — clean
- canvas npm test -- --run — 357/357 pass
- tests/e2e/test_api.sh — 46/62 pass; all 16 failures are pre-existing
  (token-auth enforcement + stale test workspaces + missing Docker
  network). None involve handlers touched in PR #5.
- Manual: platform + canvas running locally, title=Molecule AI,
  /workspaces returns [], /health returns ok. Identified + killed a
  stale Next.js server from the old Starfire-AgentTeam repo that was
  serving the old brand on IPv4 port 3000.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:08:33 -07:00

549 lines
21 KiB
Go

package handlers
import (
"context"
"encoding/json"
"log"
"net/http"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
// Delegation status lifecycle:
// pending → dispatched → received → in_progress → completed | failed
//
// pending: stored in DB, goroutine not yet started
// dispatched: A2A request sent to target workspace
// received: target workspace acknowledged (200 from A2A server)
// in_progress: target agent is actively working (set via heartbeat)
// completed: response received and stored
// failed: error during any stage
// DelegationHandler manages async delegation between workspaces.
// Delegations are fire-and-forget: the caller gets a task_id immediately,
// and the A2A request runs in the background.
type DelegationHandler struct {
workspace *WorkspaceHandler
broadcaster *events.Broadcaster
}
func NewDelegationHandler(wh *WorkspaceHandler, b *events.Broadcaster) *DelegationHandler {
return &DelegationHandler{workspace: wh, broadcaster: b}
}
// delegateRequest is the bound POST /workspaces/:id/delegate body.
type delegateRequest struct {
TargetID string `json:"target_id" binding:"required"`
Task string `json:"task" binding:"required"`
IdempotencyKey string `json:"idempotency_key"`
}
// Delegate handles POST /workspaces/:id/delegate
// Sends an A2A message to the target workspace in the background.
// Returns immediately with a delegation_id.
func (h *DelegationHandler) Delegate(c *gin.Context) {
sourceID := c.Param("id")
ctx := c.Request.Context()
var body delegateRequest
if err := bindDelegateRequest(c, &body); err != nil {
return // response already written
}
// #124 — idempotency. If the caller supplies an idempotency_key, return
// the existing delegation when (workspace_id, idempotency_key) already
// exists and is not in a failed terminal state.
if hit := lookupIdempotentDelegation(ctx, c, sourceID, body.IdempotencyKey); hit {
return
}
delegationID := uuid.New().String()
outcome := insertDelegationRow(ctx, c, sourceID, body, delegationID)
if outcome == insertHandledByIdempotent {
return // idempotency-conflict response already written
}
// insertTrackingUnavailable means insert failed for a non-idempotency
// reason (logged); we still dispatch the A2A request and surface the
// warning in the response.
// Build A2A payload
a2aBody, _ := json.Marshal(map[string]interface{}{
"method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]interface{}{{"type": "text", "text": body.Task}},
},
},
})
// Fire-and-forget: send A2A in background goroutine
go h.executeDelegation(sourceID, body.TargetID, delegationID, a2aBody)
// Broadcast event so canvas shows delegation in real-time
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_SENT", sourceID, map[string]interface{}{
"delegation_id": delegationID,
"target_id": body.TargetID,
"task_preview": truncate(body.Task, 100),
})
resp := gin.H{
"delegation_id": delegationID,
"status": "delegated",
"target_id": body.TargetID,
}
if outcome == insertTrackingUnavailable {
resp["warning"] = "delegation dispatched but status tracking unavailable"
}
c.JSON(http.StatusAccepted, resp)
}
// bindDelegateRequest binds and validates the JSON body. On error it writes
// the 400 response and returns the error so the caller can return.
func bindDelegateRequest(c *gin.Context, body *delegateRequest) error {
if err := c.ShouldBindJSON(body); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return err
}
if _, err := uuid.Parse(body.TargetID); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "target_id must be a valid UUID"})
return err
}
return nil
}
// lookupIdempotentDelegation returns true (and writes the response) when an
// existing non-failed delegation matches the (sourceID, idempotencyKey) pair.
// Failed rows are deleted to release the unique slot so the retry can take it.
// Returns false when there's no key, no existing row, or the existing row was
// failed and just deleted.
func lookupIdempotentDelegation(ctx context.Context, c *gin.Context, sourceID, idempotencyKey string) bool {
if idempotencyKey == "" {
return false
}
var existingID, existingStatus, existingTarget string
err := db.DB.QueryRowContext(ctx, `
SELECT request_body->>'delegation_id', status, target_id
FROM activity_logs
WHERE workspace_id = $1 AND idempotency_key = $2
LIMIT 1
`, sourceID, idempotencyKey).Scan(&existingID, &existingStatus, &existingTarget)
if err != nil || existingID == "" {
return false
}
if existingStatus == "failed" {
_, _ = db.DB.ExecContext(ctx, `
DELETE FROM activity_logs
WHERE workspace_id = $1 AND idempotency_key = $2 AND status = 'failed'
`, sourceID, idempotencyKey)
return false
}
c.JSON(http.StatusOK, gin.H{
"delegation_id": existingID,
"status": existingStatus,
"target_id": existingTarget,
"idempotent_hit": true,
})
return true
}
// insertDelegationOutcome captures the three distinct results of storing
// the pending delegation row, so callers never have to decode a positional
// (bool, bool) tuple.
type insertDelegationOutcome int
const (
// insertOutcomeUnknown — zero-value sentinel; should never be returned
// by insertDelegationRow. Exists so that an uninitialized
// insertDelegationOutcome value doesn't silently alias a real outcome.
insertOutcomeUnknown insertDelegationOutcome = iota
// insertOK — row stored; caller continues with dispatch and does NOT
// surface a tracking warning.
insertOK
// insertHandledByIdempotent — a concurrent idempotent request took the
// slot; the winner's JSON response is already written and the caller
// MUST return without further writes.
insertHandledByIdempotent
// insertTrackingUnavailable — insert failed for a non-idempotency
// reason (logged by this function); caller continues with dispatch
// and surfaces a tracking-unavailable warning in the response.
insertTrackingUnavailable
)
// insertDelegationRow stores the pending delegation row. See
// insertDelegationOutcome for the three possible return values.
func insertDelegationRow(ctx context.Context, c *gin.Context, sourceID string, body delegateRequest, delegationID string) insertDelegationOutcome {
taskJSON, _ := json.Marshal(map[string]interface{}{
"task": body.Task,
"delegation_id": delegationID,
})
var idemArg interface{}
if body.IdempotencyKey != "" {
idemArg = body.IdempotencyKey
}
_, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, request_body, status, idempotency_key)
VALUES ($1, 'delegation', 'delegate', $2, $3, $4, $5::jsonb, 'pending', $6)
`, sourceID, sourceID, body.TargetID, "Delegating to "+body.TargetID, string(taskJSON), idemArg)
if err == nil {
return insertOK
}
// A unique-constraint hit means a concurrent request just took the
// slot — rare, but worth surfacing as the same idempotent response
// rather than a generic 500. Re-query to fetch the winner's id.
if body.IdempotencyKey != "" {
var winnerID, winnerStatus string
if qerr := db.DB.QueryRowContext(ctx, `
SELECT request_body->>'delegation_id', status
FROM activity_logs
WHERE workspace_id = $1 AND idempotency_key = $2
LIMIT 1
`, sourceID, body.IdempotencyKey).Scan(&winnerID, &winnerStatus); qerr == nil && winnerID != "" {
c.JSON(http.StatusOK, gin.H{
"delegation_id": winnerID,
"status": winnerStatus,
"target_id": body.TargetID,
"idempotent_hit": true,
})
return insertHandledByIdempotent
}
}
log.Printf("Delegation: failed to store: %v", err)
return insertTrackingUnavailable
}
// executeDelegation runs in a goroutine — sends A2A and stores the result.
// Updates delegation status through: pending → dispatched → received → completed/failed
// delegationRetryDelay is the pause between the first failed proxy attempt
// and the retry. The first failure triggers `proxyA2ARequest`'s reactive
// health check (marks workspace offline, clears cached URL, triggers
// container restart). This delay gives the restart + re-register a chance
// to land a fresh URL in the cache before we try again. Fixes #74 —
// bulk restarts used to produce spurious "failed to reach workspace
// agent" errors when delegations fired within the warm-up window.
const delegationRetryDelay = 8 * time.Second
func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID string, a2aBody []byte) {
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
defer cancel()
log.Printf("Delegation %s: %s → %s (dispatched)", delegationID, sourceID, targetID)
// Update status: pending → dispatched
h.updateDelegationStatus(sourceID, delegationID, "dispatched", "")
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_STATUS", sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "status": "dispatched",
})
status, respBody, proxyErr := h.workspace.proxyA2ARequest(ctx, targetID, a2aBody, sourceID, true)
// #74: one retry after the reactive URL refresh has had a chance to
// run. The proxyA2ARequest's health-check path on a connection error
// marks the workspace offline, clears cached keys, and kicks off a
// restart — all on the *next* request's benefit, not this one. A short
// pause + second attempt catches the common restart-race case where
// the first attempt sees a stale 127.0.0.1:<ephemeral> URL from a
// container that was just recreated.
if proxyErr != nil && isTransientProxyError(proxyErr) {
log.Printf("Delegation %s: first attempt failed (%s) — retrying in %s after reactive URL refresh",
delegationID, proxyErr.Error(), delegationRetryDelay)
select {
case <-ctx.Done():
// outer timeout hit before retry window elapsed
case <-time.After(delegationRetryDelay):
status, respBody, proxyErr = h.workspace.proxyA2ARequest(ctx, targetID, a2aBody, sourceID, true)
}
}
if proxyErr != nil {
log.Printf("Delegation %s: failed — %s", delegationID, proxyErr.Error())
h.updateDelegationStatus(sourceID, delegationID, "failed", proxyErr.Error())
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, status, error_detail)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, 'failed', $5)
`, sourceID, sourceID, targetID, "Delegation failed", proxyErr.Error()); err != nil {
log.Printf("Delegation %s: failed to insert error log: %v", delegationID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_FAILED", sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "error": proxyErr.Error(),
})
return
}
// A2A returned 200 — target received and processed the task
// Status: dispatched → received → completed (we don't have a separate "received" signal from the target yet)
responseText := extractResponseText(respBody)
log.Printf("Delegation %s: completed (status=%d, %d chars)", delegationID, status, len(responseText))
// Store success (response_body must be JSONB, include delegation_id)
respJSON, _ := json.Marshal(map[string]interface{}{
"text": responseText,
"delegation_id": delegationID,
})
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, $5::jsonb, 'completed')
`, sourceID, sourceID, targetID, "Delegation completed ("+truncate(responseText, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation %s: failed to insert success log: %v", delegationID, err)
}
h.updateDelegationStatus(sourceID, delegationID, "completed", "")
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_COMPLETE", sourceID, map[string]interface{}{
"delegation_id": delegationID,
"target_id": targetID,
"response_preview": truncate(responseText, 200),
})
}
// updateDelegationStatus updates the status of a delegation record in activity_logs.
func (h *DelegationHandler) updateDelegationStatus(workspaceID, delegationID, status, errorDetail string) {
if _, err := db.DB.ExecContext(context.Background(), `
UPDATE activity_logs
SET status = $1, error_detail = CASE WHEN $2 = '' THEN error_detail ELSE $2 END
WHERE workspace_id = $3
AND method = 'delegate'
AND request_body->>'delegation_id' = $4
`, status, errorDetail, workspaceID, delegationID); err != nil {
log.Printf("Delegation %s: status update failed: %v", delegationID, err)
}
}
// Record handles POST /workspaces/:id/delegations/record — the agent-initiated
// "I just fired a delegation directly via A2A, please record it" endpoint (#64).
//
// The canvas-driven POST /delegate endpoint records to activity_logs AND fires
// the A2A request. Agents calling delegate_to_workspace fire A2A themselves
// (preserves OTEL trace-context propagation + retry logic) — this endpoint
// lets them register the row without double-firing the request.
//
// Body: {"target_id": "...", "task": "...", "delegation_id": "..."}
// - delegation_id is the agent-generated task_id (matches what
// check_delegation_status returns, so a single ID correlates the two
// views).
func (h *DelegationHandler) Record(c *gin.Context) {
sourceID := c.Param("id")
ctx := c.Request.Context()
var body struct {
TargetID string `json:"target_id" binding:"required"`
Task string `json:"task" binding:"required"`
DelegationID string `json:"delegation_id" binding:"required"`
}
if err := c.ShouldBindJSON(&body); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return
}
if _, err := uuid.Parse(body.TargetID); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "target_id must be a valid UUID"})
return
}
taskJSON, _ := json.Marshal(map[string]interface{}{
"task": body.Task,
"delegation_id": body.DelegationID,
})
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, request_body, status)
VALUES ($1, 'delegation', 'delegate', $2, $3, $4, $5::jsonb, 'dispatched')
`, sourceID, sourceID, body.TargetID, "Delegating to "+body.TargetID, string(taskJSON)); err != nil {
log.Printf("Delegation Record: insert failed for %s: %v", body.DelegationID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to record delegation"})
return
}
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_SENT", sourceID, map[string]interface{}{
"delegation_id": body.DelegationID,
"target_id": body.TargetID,
"task_preview": truncate(body.Task, 100),
})
c.JSON(http.StatusAccepted, gin.H{
"delegation_id": body.DelegationID,
"status": "recorded",
})
}
// UpdateStatus handles POST /workspaces/:id/delegations/:delegation_id/update — agent
// reports completion/failure for a delegation it recorded via Record (#64).
//
// Body: {"status": "completed"|"failed", "error": "...", "response_preview": "..."}
func (h *DelegationHandler) UpdateStatus(c *gin.Context) {
sourceID := c.Param("id")
delegationID := c.Param("delegation_id")
ctx := c.Request.Context()
var body struct {
Status string `json:"status" binding:"required"`
Error string `json:"error,omitempty"`
ResponsePreview string `json:"response_preview,omitempty"`
}
if err := c.ShouldBindJSON(&body); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
return
}
if body.Status != "completed" && body.Status != "failed" {
c.JSON(http.StatusBadRequest, gin.H{"error": "status must be 'completed' or 'failed'"})
return
}
h.updateDelegationStatus(sourceID, delegationID, body.Status, body.Error)
if body.Status == "completed" {
respJSON, _ := json.Marshal(map[string]interface{}{
"text": body.ResponsePreview,
"delegation_id": delegationID,
})
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, summary, response_body, status)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4::jsonb, 'completed')
`, sourceID, sourceID, "Delegation completed ("+truncate(body.ResponsePreview, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation UpdateStatus: result insert failed for %s: %v", delegationID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_COMPLETE", sourceID, map[string]interface{}{
"delegation_id": delegationID,
"response_preview": truncate(body.ResponsePreview, 200),
})
} else {
h.broadcaster.RecordAndBroadcast(ctx, "DELEGATION_FAILED", sourceID, map[string]interface{}{
"delegation_id": delegationID,
"error": body.Error,
})
}
c.JSON(http.StatusOK, gin.H{"status": body.Status, "delegation_id": delegationID})
}
// ListDelegations handles GET /workspaces/:id/delegations
// Returns recent delegations for a workspace with their status.
func (h *DelegationHandler) ListDelegations(c *gin.Context) {
workspaceID := c.Param("id")
ctx := c.Request.Context()
rows, err := db.DB.QueryContext(ctx, `
SELECT id, activity_type, COALESCE(source_id::text, ''), COALESCE(target_id::text, ''),
COALESCE(summary, ''), COALESCE(status, ''), COALESCE(error_detail, ''),
COALESCE(response_body->>'text', response_body::text, ''),
COALESCE(request_body->>'delegation_id', response_body->>'delegation_id', ''),
created_at
FROM activity_logs
WHERE workspace_id = $1 AND method IN ('delegate', 'delegate_result')
ORDER BY created_at DESC
LIMIT 50
`, workspaceID)
if err != nil {
c.JSON(http.StatusInternalServerError, gin.H{"error": "query failed"})
return
}
defer rows.Close()
var delegations []map[string]interface{}
for rows.Next() {
var id, actType, sourceID, targetID, summary, status, errorDetail, responseBody, delegationID string
var createdAt time.Time
if err := rows.Scan(&id, &actType, &sourceID, &targetID, &summary, &status, &errorDetail, &responseBody, &delegationID, &createdAt); err != nil {
continue
}
entry := map[string]interface{}{
"id": id,
"type": actType,
"source_id": sourceID,
"target_id": targetID,
"summary": summary,
"status": status,
"created_at": createdAt,
}
if delegationID != "" {
entry["delegation_id"] = delegationID
}
if errorDetail != "" {
entry["error"] = errorDetail
}
if responseBody != "" {
entry["response_preview"] = truncate(responseBody, 300)
}
delegations = append(delegations, entry)
}
if delegations == nil {
delegations = []map[string]interface{}{}
}
c.JSON(http.StatusOK, delegations)
}
// --- helpers ---
// isTransientProxyError returns true when the proxy error looks like a
// restart-race condition worth retrying (connection refused, EOF, stale
// URL pointing at a dead ephemeral port, container-restart-triggered
// 503). Static 4xx errors (bad request, access denied, not found) are
// NOT retried — retrying them wastes the 8-second delay for no benefit.
func isTransientProxyError(err *proxyA2AError) bool {
if err == nil {
return false
}
// 503 is the explicit "container unreachable / restart triggered"
// response from a2a_proxy.go after its reactive health check.
// 502 is "failed to reach workspace agent" — the pre-reactive-check
// error for plain connection failures.
if err.Status == http.StatusServiceUnavailable || err.Status == http.StatusBadGateway {
return true
}
return false
}
func extractResponseText(body []byte) string {
var resp map[string]interface{}
if json.Unmarshal(body, &resp) != nil {
return string(body)
}
result, ok := resp["result"].(map[string]interface{})
if !ok {
return string(body)
}
// Check top-level parts
if parts, ok := result["parts"].([]interface{}); ok {
for _, p := range parts {
if part, ok := p.(map[string]interface{}); ok {
if kind, _ := part["kind"].(string); kind == "text" {
if text, ok := part["text"].(string); ok {
return text
}
}
}
}
}
// Check artifacts
if artifacts, ok := result["artifacts"].([]interface{}); ok {
for _, a := range artifacts {
if art, ok := a.(map[string]interface{}); ok {
if parts, ok := art["parts"].([]interface{}); ok {
for _, p := range parts {
if part, ok := p.(map[string]interface{}); ok {
if kind, _ := part["kind"].(string); kind == "text" {
if text, ok := part["text"].(string); ok {
return text
}
}
}
}
}
}
}
}
return string(body)
}
func truncate(s string, max int) string {
if len(s) <= max {
return s
}
return s[:max] + "..."
}