fix(canvas): keep desktop take-control connected (auto-reconnect + lease renewal) #2216
Reference in New Issue
Block a user
Delete Branch "fix/desktop-takecontrol-reconnect-renewal"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Problem
Desktop "take control" (noVNC) disconnects every ~5 minutes and never recovers.
Root cause (client half — see molecule-controlplane#522 for the websockify keepalive)
DisplayTabhad two gaps:sessionUrlchanged — so a single idle/network blip left the user permanently disconnected.acquireControlpoststtl_seconds: 300only on click; the display-control lock is a 300s lease with no client heartbeat, so the lock (and theexpires_at-bound session token) silently expired mid-session.Fix
ON CONFLICT ... controlled_by = EXCLUDEDpath). The fresh session URL is kept in a ref for reconnects without swapping the live stream'ssessionUrl(which would needlessly reconnect the desktop).Tests
13 passed— added "auto-reconnects after an unclean disconnect but not a clean one"; existing acquire/release/clipboard/observe tests unchanged.Companion: molecule-controlplane#522 adds
websockify --heartbeat 30(prevents the idle drop in the first place). Together: the stream stays alive through idle, and recovers from any drop.Independent adversarial review done; one BLOCKER found + fixed.
The reviewer flagged a real correctness bug in the first cut: the reconnect reused a cached signed session URL, whose token is only ~300s — so a reconnect after the token's window (especially after any swallowed renewal failure) would 401 and never recover, reproducing the exact symptom this PR fixes.
Fix (pushed): reconnect now re-acquires a fresh lease+token first (
reacquireSession()— shared by the renewal timer and the reconnect path) rather than dialing with a possibly-stale ref. The renewal generation-guard is now consistent withacquireControl(ridesrequestGeneration). Added a test asserting the reconnect dials with a freshly-minted token (signed2), not the stale original (signed) — the previous test would have passed even with the bug.13 passed, eslint clean (0 errors), tsc clean for DisplayTab. Verified the noVNCconnect/disconnectevent +cleanflag assumptions against@novnc/novnc@1.7.0source.Security approve. Client-only canvas change (display take-control reconnect + lease renewal). No new secrets, no auth-boundary change — reuses the existing signed display-session token + the existing acquire endpoint (same-holder lease extension). Independent adversarial review done; the stale-token reconnect BLOCKER it found is fixed (reconnect re-acquires a fresh token). 13 tests pass.
QA approve. 13 canvas tests pass incl reconnect-uses-fresh-token + clean-disconnect-no-reconnect; lint clean; tsc clean for DisplayTab. noVNC event assumptions verified vs @novnc/novnc 1.7.0. Companion CP#522 keepalive already deployed to prod.
/qa-recheck
/security-recheck
/qa-recheck
New commits pushed, approval review dismissed automatically according to repository settings
New commits pushed, approval review dismissed automatically according to repository settings
Security re-approve on new head (added resizeSession=false to quiet the x11vnc resize-prohibited console spam; no security surface change).
QA re-approve on new head. 13 tests pass; resizeSession=false quiets resize spam.
/qa-recheck
/security-recheck