Platform side (Option B):
- provisioner.go: add WriteAuthTokenToVolume(), which writes .auth_token
to the Docker named volume BEFORE ContainerStart using a throwaway alpine
container. This closes the race window where a restarted container could
read a stale token before WriteFilesToContainer writes the new one.
- workspace_provision.go: call WriteAuthTokenToVolume() in issueAndInjectToken
as a best-effort pre-write before the container starts.
Runtime side (Option A):
- heartbeat.py: on HTTPStatusError 401 from /registry/heartbeat, call
refresh_cache() to force re-read of /configs/.auth_token from disk,
then retry the heartbeat once. Fall through to normal failure tracking
if the retry also fails.
- platform_auth.py: add refresh_cache(), which discards the in-process
_cached_token and calls get_token() to re-read it from disk.
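For reviewers, a minimal sketch of the runtime path above; it is not the
actual diff. It assumes get_token() caches the file contents in a
module-level _cached_token, and it uses a hypothetical send(token)
callable that returns an HTTP status code in place of the real
/registry/heartbeat request and its HTTPStatusError handling.

```python
from pathlib import Path
from typing import Callable, Optional

# Token location named in this commit message.
TOKEN_PATH = Path("/configs/.auth_token")

# In-process cache; assumed to mirror the one in platform_auth.py.
_cached_token: Optional[str] = None


def get_token(path: Path = TOKEN_PATH) -> Optional[str]:
    """Return the cached token, reading it from disk on first use."""
    global _cached_token
    if _cached_token is None and path.exists():
        _cached_token = path.read_text().strip() or None
    return _cached_token


def refresh_cache(path: Path = TOKEN_PATH) -> Optional[str]:
    """Discard the cached token and re-read it from disk."""
    global _cached_token
    _cached_token = None
    return get_token(path)


def heartbeat_with_retry(
    send: Callable[[Optional[str]], int],  # hypothetical stand-in for the HTTP call
    path: Path = TOKEN_PATH,
) -> int:
    """Send one heartbeat; on 401, refresh the token and retry exactly once."""
    status = send(get_token(path))
    if status == 401:
        # The cached token may be stale after a restart: re-read and retry.
        status = send(refresh_cache(path))
    # Any remaining failure is left to the caller's normal failure tracking.
    return status
```

The retry is capped at one attempt so that a genuinely revoked token
still reaches the existing failure tracking instead of looping.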
Together these close the window, described in issue #1877, in which a
restarted container's heartbeat could receive more than one consecutive
401. Pre-write (B) is the primary fix; runtime retry (A) is the
self-healing fallback for any residual race.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>