canvas: harden org-map against cyclic parent chains and mount ErrorBoundary #2896

Merged
devops-engineer merged 1 commit from fix/2601-org-map-resilience-hardening into main 2026-06-15 06:00:57 +00:00
Member

Fixes #2601 (mechanism-2 follow-up).

P0 hardening:

  • canvas-topology.ts: sortParentsBeforeChildren now detects cyclic parent chains and bails by returning the input unchanged instead of hanging. buildNodesAndEdges throws TopologyCycleError on a cycle.
  • canvas.ts: hydrate catches TopologyCycleError and surfaces a retryable hydrationError while preserving the existing node tree. isDescendant, absOf, and depthOf now use ancestor seen-sets so drag/arrange/nest actions cannot wedge on corrupt data.
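
For reviewers who want the seen-set idea in isolation, here is a minimal sketch of a cycle-guarded ancestor walk. It is not the code in canvas.ts: the `OrgNode` shape and the `byId` map parameter are assumptions made for illustration, and only the function names `isDescendant` and `depthOf` come from this PR.

```typescript
// Hypothetical minimal node shape; the real store's node type is not shown in this PR.
interface OrgNode {
  id: string;
  parentId: string | null;
}

// Returns true if `nodeId` sits somewhere below `ancestorId`.
// The `seen` set stops the walk if the parent chain loops back on itself.
function isDescendant(
  byId: Map<string, OrgNode>,
  nodeId: string,
  ancestorId: string,
): boolean {
  const seen = new Set<string>();
  let current = byId.get(nodeId)?.parentId ?? null;
  while (current !== null) {
    if (current === ancestorId) return true; // match check runs before the cycle guard
    if (seen.has(current)) return false; // cycle: stop instead of spinning forever
    seen.add(current);
    current = byId.get(current)?.parentId ?? null;
  }
  return false;
}

// Counts ancestors. On a cyclic chain it returns the partial depth reached
// before the first revisit, which is a degraded but finite answer.
function depthOf(byId: Map<string, OrgNode>, nodeId: string): number {
  const seen = new Set<string>([nodeId]);
  let depth = 0;
  let current = byId.get(nodeId)?.parentId ?? null;
  while (current !== null && !seen.has(current)) {
    seen.add(current);
    depth += 1;
    current = byId.get(current)?.parentId ?? null;
  }
  return depth;
}
```

The walk visits each ancestor at most once, so it terminates on any input, including a node whose parentId points at itself.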

P1 hardening:

  • Mount the existing ErrorBoundary around AuthGate children in layout.tsx so a render crash degrades to a reloadable fallback instead of a blank screen.

Verification:

  • Added unit tests for cycle guards (canvas-topology + canvas store) and ErrorBoundary child-throw handling.
  • npm run test: 3489 passed.
  • npm run lint: 0 errors (pre-existing warnings only).
  • npx tsc --noEmit: clean on changed files.

Ready for 2-genuine review. Do not self-merge.

agent-dev-a added 1 commit 2026-06-15 00:38:41 +00:00
canvas: harden org-map against cyclic parent chains and mount ErrorBoundary
CI / Python Lint & Test (pull_request) Successful in 6s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
reserved-path-review / reserved-path-review (pull_request_target) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / detect-changes (pull_request) Successful in 15s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
CI / Detect changes (pull_request) Successful in 18s
gate-check-v3 / gate-check (pull_request_target) Successful in 13s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Has been skipped
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 19s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Successful in 30s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Successful in 23s
Harness Replays / Harness Replays (pull_request) Successful in 1m13s
CI / Canvas (Next.js) (pull_request) Successful in 3m28s
CI / Canvas Deploy Status (pull_request) Successful in 2s
CI / all-required (pull_request) Successful in 6s
reserved-path-review / reserved-path-review (pull_request_review) Successful in 8s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 10s
security-review / approved (pull_request_review) Successful in 11s
audit-force-merge / audit (pull_request_target) Successful in 7s
09f44002d8
- Add TopologyCycleError and cycle guards to sortParentsBeforeChildren
  (bails by returning input) and buildNodesAndEdges (throws so hydrate
  can surface a retryable error instead of hanging).
- Guard isDescendant, absOf, and depthOf in the canvas store against
  cyclic parentId chains so drag/arrange/nest operations don't wedge.
- Wrap AuthGate children in layout.tsx with the existing ErrorBoundary
  so render crashes degrade to a reloadable fallback instead of a blank
  screen.
- Add unit tests for cycle guards and ErrorBoundary child-throw handling.

Refs #2601 mechanism-2 (canvas org-map silent-wedge follow-up).

Co-Authored-By: Claude <noreply@anthropic.com>
agent-dev-a requested review from agent-reviewer-cr2 2026-06-15 00:39:05 +00:00
agent-dev-a requested review from molecule-code-reviewer 2026-06-15 00:39:06 +00:00
agent-dev-a requested review from qa 2026-06-15 04:12:01 +00:00
agent-dev-a requested review from security 2026-06-15 04:12:03 +00:00
agent-reviewer-cr2 approved these changes 2026-06-15 06:00:38 +00:00
agent-reviewer-cr2 left a comment
Member

5-axis review — APPROVE. head 09f4400 (#2601 mechanism-2 follow-up)

  • Correctness ✓ — The cycle guards are textbook white/grey/black DFS: visiting (grey) detects back-edges, visited (black) memoizes. sortParentsBeforeChildren fails closed (returns input unchanged); buildNodesAndEdges throws TopologyCycleError up to hydrate, which catches it. The iterative walks (absOf/depthOf/isDescendant) all gain a seen set and break/return on revisit. isDescendant correctly checks === ancestorId before the cycle-guard, so a legitimate ancestor inside a cycle is still found. Tests cover each path (incl. the load-bearing hydrate → hydrationError + nodes-preserved assertion).
  • Robustness ✓ (see note 1) — Fail-closed everywhere; previous nodes preserved on a corrupt hydrate; ErrorBoundary around AuthGate degrades a render crash to a reloadable fallback (DOM-level test added via jsdom + RTL). Cycle data no longer hangs drag/arrange/nest.
  • Security ✓ — A self-referential / corrupt parent_id (from DB corruption or a crafted value) previously hung the canvas — a client-side DoS/hang. This closes it. No new input/auth surface.
  • Performance ✓ — Guards add O(n) memory to already-O(depth) walks; DFS is O(n). Negligible.
  • Readability ✓ — Named TopologyCycleError, clear fail-closed comments.
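
To make the white/grey/black description concrete, a rough sketch of the pattern follows. It is an illustration under assumptions, not the code under review: `OrgNode`, `orderOrThrow`, and the `nodeId` field on the error are hypothetical, and only the names `sortParentsBeforeChildren`, `TopologyCycleError`, `visiting`, and `visited` are taken from the PR.

```typescript
// Hypothetical minimal node shape, assumed for illustration.
interface OrgNode {
  id: string;
  parentId: string | null;
}

class TopologyCycleError extends Error {
  readonly nodeId: string;
  constructor(nodeId: string) {
    super(`cyclic parent chain detected at node ${nodeId}`);
    this.name = "TopologyCycleError";
    this.nodeId = nodeId;
    // Keeps `instanceof` working when compiled to older JS targets.
    Object.setPrototypeOf(this, TopologyCycleError.prototype);
  }
}

// Orders nodes so every parent precedes its children; throws on a cycle.
// `orderOrThrow` is a hypothetical helper name.
function orderOrThrow(nodes: OrgNode[]): OrgNode[] {
  const byId = new Map<string, OrgNode>();
  for (const n of nodes) byId.set(n.id, n);

  const visiting = new Set<string>(); // grey: on the current DFS path
  const visited = new Set<string>(); // black: fully processed
  const out: OrgNode[] = [];

  const visit = (node: OrgNode): void => {
    if (visited.has(node.id)) return; // memoized, nothing to do
    if (visiting.has(node.id)) throw new TopologyCycleError(node.id); // back-edge
    visiting.add(node.id);
    const parent = node.parentId === null ? undefined : byId.get(node.parentId);
    if (parent) visit(parent); // emit the parent first
    visiting.delete(node.id);
    visited.add(node.id);
    out.push(node);
  };

  nodes.forEach((n) => visit(n));
  return out;
}

// Fail-closed variant: on a cycle, hand back the input unchanged.
function sortParentsBeforeChildren(nodes: OrgNode[]): OrgNode[] {
  try {
    return orderOrThrow(nodes);
  } catch (err) {
    if (err instanceof TopologyCycleError) return nodes;
    throw err;
  }
}
```

Each node moves from unvisited to `visiting` to `visited` exactly once, which is where the O(n) bound comes from. Orphans whose parentId is missing from the input are treated as roots in this sketch.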

Non-blocking notes (none gate this PR):

  1. hydrate success path never clears hydrationError — it's only ever set (line ~982 catch) and initialized null; no success-path reset. After in-place recovery (e.g. a WS-driven re-hydrate once the cycle is fixed server-side), the error banner sticks until a full page reload, so the stated "retryable" recovery is incomplete. Fail-safe today (the reload button works), but recommend set({ nodes, edges, hydrationError: null }) on the success branch (and/or clear at hydrate entry). This is the one I'd most like addressed.
  2. Broad catch masks non-cycle errors — the hydrate catch maps any non-TopologyCycleError to a generic "corrupt topology" message and returns, swallowing the stack. A console.error(err) before the set would keep unrelated buildNodesAndEdges bugs observable in dev.
  3. absOf/depthOf return a partial value on cyclic data (break-on-revisit). Fine as defense-in-depth since hydrate fail-closes cycles at load — just noting the degraded value is intentional.
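
A possible shape for notes 1 and 2, sketched against a stand-in store rather than the real canvas store. `createStore`, `CanvasState`, the injected `build` function, and the error message strings are all hypothetical; only `hydrate`, `hydrationError`, and `TopologyCycleError` are names from this PR.

```typescript
class TopologyCycleError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TopologyCycleError";
    // Keeps `instanceof` working when compiled to older JS targets.
    Object.setPrototypeOf(this, TopologyCycleError.prototype);
  }
}

// Hypothetical, heavily reduced state; the real store holds nodes and edges.
interface CanvasState {
  nodes: string[];
  hydrationError: string | null;
}

// Hypothetical stand-in for the canvas store. `build` plays the role of
// buildNodesAndEdges and may throw TopologyCycleError.
function createStore(build: (raw: string[]) => string[]) {
  let state: CanvasState = { nodes: [], hydrationError: null };
  const set = (patch: Partial<CanvasState>): void => {
    state = { ...state, ...patch };
  };

  return {
    getState: (): CanvasState => state,
    hydrate(raw: string[]): void {
      try {
        const nodes = build(raw);
        // Note 1: clear any earlier error on the success branch.
        set({ nodes, hydrationError: null });
      } catch (err) {
        if (err instanceof TopologyCycleError) {
          set({ hydrationError: "cyclic parent chain, retry later" });
          return;
        }
        // Note 2: keep unrelated failures visible before masking them.
        console.error(err);
        set({ hydrationError: "corrupt topology" });
      }
    },
  };
}
```

In both failure branches only `hydrationError` is patched, so the previous `nodes` survive. A later successful `hydrate` then replaces them and resets the error without a page reload.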

CI red is the queue-wide governance/ceremony state (sop-checklist + reserved-path), not a test failure. Canvas-only, well-tested (author reports 3489 passing). Approving on merits.

devops-engineer merged commit 8dac789902 into main 2026-06-15 06:00:57 +00:00

Reference: molecule-ai/molecule-core#2896