fix(registry): clear last_register_failure_at on healthy heartbeat agent_card backfill #2668
Bug: After an authenticated non-200 /registry/register stamps last_register_failure_at, a later healthy heartbeat that backfills agent_card did not clear that marker. evaluateStatus therefore kept the workspace stuck in degraded forever.

Fix: In the heartbeat backfill path, clear last_register_failure_at in the same UPDATE that writes the agent_card.

Verification: Added TestHeartbeatHandler_BackfillAgentCard_ClearsRegisterFailure covering the degraded→online recovery path.

SOP checklist
- last_register_failure_at reset on healthy heartbeat backfill

Relates-to: #2659 #2665
Route: CR2 (CI-green)
2f16fce90a to dd3dad7952

APPROVED: reviewed head dd3dad7952 with the 5-axis lens. CI / all-required is green; the red staging/SOP contexts are outside this required-code path. The change is scoped to the heartbeat agent_card backfill recovery: when a healthy heartbeat writes a missing agent_card, it also clears last_register_failure_at before evaluateStatus reads it, allowing degraded->online recovery once error_rate/runtime_state are healthy. It preserves the agent_card IS NULL guard, does not broaden auth or registration behavior, and the new sqlmock test covers the degraded recovery path and status transition. No blockers found.