fix(core#2611): re-open bootstrap when stale bearer is rejected and live tokens are now zero #2757
What
Adds a single re-check in `RegistryHandler.requireWorkspaceToken` (workspace-server): when `ValidateToken` rejects the presented bearer, re-query `HasLiveInstanceToken`. If the workspace now has zero live tokens (the previously valid token was revoked in the gap between the first `HasLive` check and the `ValidateToken` call), the request is allowed through as a fresh bootstrap. The agent's next register iteration mints a new token.
Why (core#2611, enter-os 2026-06-11 ~22:19Z)
The watchdog double-provision race produced a "loser" box that presented a stale bearer to `/registry/register`. Sequence:
1. `HasLiveInstanceToken` first check: count=1 (a live token exists).
2. A revocation (… `IssueToken` rewriting the row) fires.
3. `ValidateToken` rejects the bearer: token gone.
The re-check closes the gap. C18 hardening: the re-open ONLY fires when the post-validation live-token count is zero. A stolen / rotated / misconfigured bearer with live tokens still present still 401s and is never silently re-bootstrapped.
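The decision order described above can be sketched roughly as follows. This is an illustration, not the code in this PR: the `tokenStore` interface, the method signatures, the `decision` type, and `scriptedStore` are all hypothetical names invented for the sketch, and failing closed on a first-check error is an assumption. Only the ordering (first live check, validate, re-check, re-open only when the re-check succeeds and reports zero live tokens) comes from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// decision is a hypothetical outcome type used only in this sketch.
type decision int

const (
	allowBootstrap     decision = iota // no live token: register may mint one
	allowAuthenticated                 // presented bearer validated
	rejectUnauthorized                 // respond 401
)

// tokenStore is an assumed interface; the real signatures may differ.
type tokenStore interface {
	HasLiveInstanceToken(workspaceID string) (bool, error)
	ValidateToken(workspaceID, bearer string) error
}

func requireWorkspaceToken(s tokenStore, workspaceID, bearer string) decision {
	live, err := s.HasLiveInstanceToken(workspaceID)
	if err != nil {
		return rejectUnauthorized // assumption: fail closed on lookup errors
	}
	if !live {
		return allowBootstrap // existing bootstrap path
	}
	if err := s.ValidateToken(workspaceID, bearer); err == nil {
		return allowAuthenticated
	}
	// Re-check: the token may have been revoked between the two calls above.
	nowLive, nowLiveErr := s.HasLiveInstanceToken(workspaceID)
	if nowLiveErr == nil && !nowLive {
		return allowBootstrap // zero live tokens: re-open bootstrap
	}
	return rejectUnauthorized // live tokens remain, or re-check failed
}

var errInvalid = errors.New("invalid bearer")

// scriptedStore replays a fixed sequence of live-token answers so the
// race window can be simulated deterministically.
type scriptedStore struct {
	liveAnswers []bool
	calls       int
	validBearer string
}

func (s *scriptedStore) HasLiveInstanceToken(string) (bool, error) {
	i := s.calls
	if i >= len(s.liveAnswers) {
		i = len(s.liveAnswers) - 1 // repeat the last scripted answer
	}
	s.calls++
	return s.liveAnswers[i], nil
}

func (s *scriptedStore) ValidateToken(_, bearer string) error {
	if bearer == s.validBearer {
		return nil
	}
	return errInvalid
}

func main() {
	// Race: live on first check, revoked before the re-check.
	race := &scriptedStore{liveAnswers: []bool{true, false}, validBearer: "fresh"}
	fmt.Println(requireWorkspaceToken(race, "ws-1", "stale") == allowBootstrap)

	// C18: a live token still exists, so a bad bearer stays rejected.
	stolen := &scriptedStore{liveAnswers: []bool{true, true}, validBearer: "fresh"}
	fmt.Println(requireWorkspaceToken(stolen, "ws-1", "stale") == rejectUnauthorized)
}
```

The two calls in `main` mirror the two cases named in the test plan: the race case re-opens bootstrap, and the live-tokens-remain case still rejects.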
Test plan
- `TestRegister_BootstrapRecovery_StaleBearerZeroLiveTokens`: stale bearer + re-check shows zero live tokens → 200, fresh token minted
- `TestRegister_BootstrapRecovery_StaleBearerLiveTokensRemains`: stale bearer + re-check shows live tokens still present → 401 (C18 hardening)
- `./internal/handlers/` → ok 24.9s
- `go vet` + `go build` clean
Scope
This PR is one of the four sub-fixes in core#2611. The other three are explicitly out of scope and documented in the commit body for follow-up:
- delegation `status=completed` means DELIVERED, not processed (rename or add processed/acked)
- `last_error` on workspace record when watchdog recreate follows failed register
Closing core#2611 fully requires the other three. This PR addresses the workspace-server-recoverable portion of the wedge (the loser box's 401 path) without scope-creeping into CP changes that aren't mine to land here.
Refs core#2611.
APPROVED on head `2355becf`. 5-axis review:
Tests cover both critical paths: stale bearer + zero live tokens returns 200 and mints a fresh token; stale bearer + live tokens present remains 401. Scope is correctly limited to this workspace-server 401 sub-fix; the other #2611 sub-fixes are documented follow-ups and are not claimed here.
CI/all-required is green on current head. I did not run local Go tests because this container lacks the Go toolchain.
/sop-ack
APPROVED on head `2355becfba2e38c9e155bc3dd866102ff84a1e39`. 5-axis/security review:
`requireWorkspaceToken` still first requires a live instance token to exist, then validates the presented bearer; only after validation fails does it re-check `HasLiveInstanceToken`, and only `nowLiveErr == nil && !nowLive` re-opens bootstrap. Register then follows the existing bootstrap path and mints a fresh instance token.
… `auth_token`; stale bearer + live token remains => 401. The IssueToken gate is still checked before minting.
`CI / all-required` is green on this head, with Platform Go, API smoke, handler integration, and local-provision advisory also green. I could not run local Go tests in this container because `go` is not installed, so I relied on the required CI results.
No changes requested.
sop-ack: I performed an independent 5-axis review on #2757 head `2355becfba2e38c9e155bc3dd866102ff84a1e39` after `CI / all-required` was green. Approval posted as review #11448. Security focus checked C18 bootstrap protection: stale bearer + any live instance token remains 401; stale bearer + zero live instance tokens re-opens bootstrap only after a successful zero-live re-check. No secret material is logged or exposed.