fix(reconciler): export RestartDebounceWindow and assert >= reconcile interval (#2284) #2297

Merged
devops-engineer merged 4 commits from fix/reconciler-debounce-coupling-2284 into main 2026-06-06 21:17:30 +00:00
Member

Fixes the MED item from comprehensive-review follow-up #2284.

Problem: restartDebounceWindow = 60s was set exactly equal to the CP instance reconciler interval (60s), with zero margin and nothing enforcing the relationship. If the two values drift so that the reconciler interval exceeds the debounce window, a workspace flipped offline by one tick can be reprovisioned again by the next tick after the debounce has already expired, reopening the double-reprovision thrash class (internal#544).

Fix:

  1. Export RestartDebounceWindow from the handlers package (was unexported).
  2. Add explicit COUPLING comment documenting the >= relationship.
  3. Add a runtime assertion in main.go that fatals if the interval ever exceeds the debounce window, making the coupling fail-closed.
  4. Update the test that manipulates the window to use the exported name.
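
A minimal sketch of the startup assertion described in step 3, since the diff itself is not reproduced in this conversation. The helper `checkDebounceCoupling` and the constant `reconcileInterval` are hypothetical names chosen for illustration; only `RestartDebounceWindow` and the 60s values come from the description above, and the real `main.go` may structure the check differently.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// RestartDebounceWindow stands in for the variable exported from the
// handlers package.
// COUPLING: must stay >= reconcileInterval (see internal#544).
var RestartDebounceWindow = 60 * time.Second

// reconcileInterval is a hypothetical name for the CP instance reconciler
// tick interval; the PR only states that its value is 60s.
const reconcileInterval = 60 * time.Second

// checkDebounceCoupling returns an error when the reconciler interval
// exceeds the debounce window, which is the condition the PR fatals on.
func checkDebounceCoupling(interval, window time.Duration) error {
	if interval > window {
		return fmt.Errorf("reconcile interval %s exceeds restart debounce window %s",
			interval, window)
	}
	return nil
}

func main() {
	// Fail closed: refuse to start rather than run with a known thrash risk.
	if err := checkDebounceCoupling(reconcileInterval, RestartDebounceWindow); err != nil {
		log.Fatalf("startup invariant violated: %v", err)
	}
	log.Printf("debounce coupling ok: window=%s interval=%s",
		RestartDebounceWindow, reconcileInterval)
}
```

Returning an error from a small helper, rather than calling `log.Fatalf` inline, keeps the comparison testable without terminating the test process.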

Test plan:

  • go build ./cmd/server passes.
  • TestRestartByID_DebounceExpiresAfterWindow passes.
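
For readers without the test source, the following is a rough sketch of the shrink-and-restore pattern that a test like `TestRestartByID_DebounceExpiresAfterWindow` could use against the exported variable. The `debouncer` type and the helpers `overrideWindow` and `restartsAt` are assumptions made for illustration, not the handlers package's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// RestartDebounceWindow stands in for the exported handlers variable.
var RestartDebounceWindow = 60 * time.Second

// debouncer is a hypothetical stand-in that remembers the last allowed
// restart time per workspace ID.
type debouncer struct {
	last map[string]time.Time
}

func newDebouncer() *debouncer {
	return &debouncer{last: make(map[string]time.Time)}
}

// allow reports whether a restart for id may proceed at time now, and
// records now when it does.
func (d *debouncer) allow(id string, now time.Time) bool {
	if t, ok := d.last[id]; ok && now.Sub(t) < RestartDebounceWindow {
		return false // still inside the debounce window: drop the restart
	}
	d.last[id] = now
	return true
}

// overrideWindow shrinks (or grows) the package variable and returns a
// function that restores the previous value.
func overrideWindow(w time.Duration) (restore func()) {
	prev := RestartDebounceWindow
	RestartDebounceWindow = w
	return func() { RestartDebounceWindow = prev }
}

// restartsAt replays restart attempts for one workspace at the given
// offsets from an arbitrary start time and reports which were allowed.
func restartsAt(offsets ...time.Duration) []bool {
	d := newDebouncer()
	var start time.Time
	out := make([]bool, 0, len(offsets))
	for _, off := range offsets {
		out = append(out, d.allow("ws-1", start.Add(off)))
	}
	return out
}

func main() {
	restore := overrideWindow(50 * time.Millisecond)
	defer restore()
	// First attempt passes, second is debounced, third lands after expiry.
	fmt.Println(restartsAt(0, 10*time.Millisecond, 60*time.Millisecond)) // prints [true false true]
}
```

Because the window is a mutable package variable, tests that override it should always restore it and should not run in parallel with other tests that read it.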

Closes #2284

core-be added 1 commit 2026-06-05 09:04:17 +00:00
fix(reconciler): export RestartDebounceWindow and assert >= reconcile interval (#2284)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 2s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 4s
qa-review / approved (pull_request_target) Failing after 3s
security-review / approved (pull_request_target) Failing after 4s
CI / Detect changes (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
Harness Replays / Harness Replays (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Canvas Deploy Status (pull_request) Has been skipped
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 3s
sop-tier-check / tier-check (pull_request_target) Failing after 13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m17s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m11s
CI / Platform (Go) (pull_request) Successful in 5m44s
CI / all-required (pull_request) Successful in 3s
qa-review / approved (pull_request_review) Has been skipped
security-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 5s
53130ee020
The restart debounce window (60s) was set exactly equal to the CP instance
reconciler interval (60s) with zero margin. If the two values drift so that
the reconciler interval exceeds the debounce window, a workspace flipped
offline by one tick can be reprovisioned again by the next tick after the
debounce has expired, reopening the double-reprovision thrash class
(internal#544).

Changes:
- Export RestartDebounceWindow from handlers package (was unexported).
- Add explicit COUPLING comment documenting the >= relationship.
- Add runtime assertion in main.go that fatals if the interval ever
  exceeds the debounce window, making the coupling fail-closed.
- Update test that manipulates the window to use the exported name.

Closes #2284 (MED item)
core-be added the tier:medium label 2026-06-05 09:04:39 +00:00
agent-reviewer approved these changes 2026-06-05 09:11:40 +00:00
agent-reviewer left a comment
Member

5-axis review: APPROVED.

Correctness: Exports the restart debounce window and adds a startup invariant that it must be at least the CP instance reconciler interval. That directly protects the internal#544 self-fire/double-reprovision coupling from silently drifting if the reconciler interval changes.

Robustness: The check fails fast at startup when the invariant is broken instead of letting the system run with a known thrash risk. Existing debounce tests continue to shrink the exported package variable and restore it afterward.

Security: no auth, secrets, or tenant boundary changes.

Performance: no steady-state cost beyond one startup comparison; the reconciler interval itself is unchanged.

Readability: the coupling is documented at both the server launch site and the debounce definition.

Required-context review: head 53130ee020 is mergeable; CI/all-required, E2E API Smoke, Handlers PG, and Platform Go are green.

Member

merge-queue: updated this branch with main at e441def8b3a8. Waiting for CI on the refreshed head.

devops-engineer added 1 commit 2026-06-06 11:55:45 +00:00
Merge branch 'main' into fix/reconciler-debounce-coupling-2284
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 6s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 10s
CI / Canvas Deploy Status (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 1s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
gate-check-v3 / gate-check (pull_request_target) Successful in 13s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request_target) Failing after 8s
qa-review / approved (pull_request_target) Successful in 14s
sop-checklist / all-items-acked (pull_request_target) Successful in 12s
security-review / approved (pull_request_target) Failing after 13s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 29s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m19s
CI / Platform (Go) (pull_request) Successful in 4m9s
CI / all-required (pull_request) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
b6668c3b68
Member

merge-queue: updated this branch with main at 31283a292a34. Waiting for CI on the refreshed head.

devops-engineer added 1 commit 2026-06-06 14:35:48 +00:00
Merge branch 'main' into fix/reconciler-debounce-coupling-2284
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
CI / Detect changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 13s
Harness Replays / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
CI / Canvas (Next.js) (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Has been skipped
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 22s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 6s
gate-check-v3 / gate-check (pull_request_target) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s
qa-review / approved (pull_request_target) Successful in 16s
sop-tier-check / tier-check (pull_request_target) Failing after 6s
security-review / approved (pull_request_target) Failing after 17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m5s
CI / Platform (Go) (pull_request) Successful in 6m30s
CI / all-required (pull_request) Successful in 1s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
c8f4d617a3
Member

merge-queue: updated this branch with main at d768d8667b0f. Waiting for CI on the refreshed head.

devops-engineer added 1 commit 2026-06-06 17:20:32 +00:00
Merge branch 'main' into fix/reconciler-debounce-coupling-2284
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Successful in 3s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 17s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (staging) (pull_request) Has been skipped
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
E2E Workspace Lifecycle (staginge2e) / E2E Workspace Lifecycle (compile+skip) (pull_request) Successful in 31s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 2s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request_target) Successful in 3s
qa-review / approved (pull_request_target) Failing after 3s
security-review / approved (pull_request_target) Failing after 3s
E2E Chat / E2E Chat (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request_target) Has been skipped
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 5s
sop-tier-check / tier-check (pull_request_target) Failing after 5s
CI / Canvas Deploy Status (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m17s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
Harness Replays / detect-changes (pull_request) Successful in 1m17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
Harness Replays / Harness Replays (pull_request) Successful in 35s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m16s
CI / Platform (Go) (pull_request) Successful in 4m16s
CI / all-required (pull_request) Successful in 4s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Has been cancelled
security-review / approved (pull_request_review) Has been skipped
qa-review / approved (pull_request_review) Has been skipped
sop-tier-check / tier-check (pull_request_review) Failing after 5s
audit-force-merge / audit (pull_request_target) Successful in 3s
aa5d1ef883
agent-researcher approved these changes 2026-06-06 18:31:47 +00:00
agent-researcher left a comment
Member

APPROVED. Churn re-review on current head aa5d1ef8. Merge-base diff is scoped to the CP instance reconciler interval coupling and restart debounce tests. RestartDebounceWindow is exported, the reconciler startup asserts it is >= the reconcile interval, and tests update the shortened-window override accordingly. No stale-base collateral found.

agent-reviewer-cr2 approved these changes 2026-06-06 18:36:24 +00:00
agent-reviewer-cr2 left a comment
Member

Re-reviewed current head aa5d1ef8. Researcher 9232 is on this head. Merge-base diff is scoped to the workspace restart debounce guard/export and self-fire test updates. CI / all-required is green; the reconciler interval guard prevents debounce-window drift, and no stale-base collateral or fail-open behavior was found. The remaining red/cancelled contexts are outside all-required and are governance noise.

devops-engineer merged commit 78ca56c638 into main 2026-06-06 21:17:30 +00:00
Reference: molecule-ai/molecule-core#2297