test(canvas/mobile): add primitives.test.tsx coverage (19 cases)

Cover StatusDot (size, circle, halo, flexShrink), TierChip (tiers, size variants, flexShrink), Chip (value, label+value, pill shape, soft/accent mode), SectionLabel (text, right slot, uppercase). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
feat(mobile): FilterChips + AgentCard WCAG 2.1 AA accessibility
2026-05-12 09:38:47 +00:00 · 2026-05-12 09:38:47 +00:00 · 2026-05-12 09:38:47 +00:00 · 2026-05-12 09:38:47 +00:00 · 2026-05-12 09:38:47 +00:00 · 2026-05-12 09:38:47 +00:00
75 changed files with 579 additions and 6199 deletions
@@ -1,113 +0,0 @@
-#!/usr/bin/env python3
-"""Lint workflow bash for curl status-code capture pollution.
-
-The bad shape is:
-
-    HTTP_CODE=$(curl ... -w '%{http_code}' ... || echo "000")
-
-`curl -w` writes the HTTP code to stdout before returning non-zero, so
-fallback output inside the same command substitution appends another code.
-"""
-from __future__ import annotations
-
-import argparse
-import glob
-import re
-import sys
-from pathlib import Path
-from typing import NamedTuple
-
-
-SELF = ".gitea/workflows/lint-curl-status-capture.yml"
-
-
-class Finding(NamedTuple):
-    path: str
-    snippet: str
-
-
-BAD_STATUS_CAPTURE = re.compile(
-    r"""
-    \$\(\s*
-    curl\b
-    [^)]*
-    -w\s*['"]%\{http_code\}['"]
-    [^)]*
-    \|\|\s*
-    (?:
-        echo\s+['"]?000['"]?
-        |
-        printf\s+['"]000['"]
-    )
-    \s*\)
-    """,
-    re.DOTALL | re.VERBOSE,
-)
-
-
-def _logical_shell(content: str) -> str:
-    """Collapse bash line continuations so one curl command is one string."""
-    return re.sub(r"\\\s*\n\s*", " ", content)
-
-
-def scan_content(path: str, content: str) -> list[Finding]:
-    flat = _logical_shell(content)
-    return [
-        Finding(path=path, snippet=re.sub(r"\s+", " ", match.group(0)).strip()[:160])
-        for match in BAD_STATUS_CAPTURE.finditer(flat)
-    ]
-
-
-def scan_paths(paths: list[str]) -> list[Finding]:
-    findings: list[Finding] = []
-    for path in paths:
-        if path == SELF:
-            continue
-        content = Path(path).read_text(encoding="utf-8")
-        findings.extend(scan_content(path, content))
-    return findings
-
-
-def default_paths() -> list[str]:
-    return sorted(glob.glob(".gitea/workflows/*.yml"))
-
-
-def print_report(findings: list[Finding]) -> None:
-    if not findings:
-        print("OK No curl-status-capture pollution patterns detected")
-        return
-
-    print(f"::error::Found {len(findings)} curl-status-capture pollution site(s):")
-    for finding in findings:
-        print(
-            f"::error file={finding.path}::Curl status-capture pollution: "
-            "'|| echo/printf 000' inside a $(curl ... -w '%{http_code}' ...) "
-            "subshell. On non-2xx or connection failure, curl's -w writes a "
-            "status, then exits non-zero, then the fallback appends another "
-            "status. Fix: route -w into a tempfile so the exit code cannot "
-            "pollute stdout."
-        )
-        print(f"   matched: {finding.snippet}...")
-
-    print()
-    print("Fix template:")
-    print("  set +e")
-    print("  curl ... -w '%{http_code}' >code.txt 2>/dev/null")
-    print("  set -e")
-    print('  HTTP_CODE=$(cat code.txt 2>/dev/null)')
-    print('  [ -z "$HTTP_CODE" ] && HTTP_CODE="000"')
-
-
-def main(argv: list[str] | None = None) -> int:
-    parser = argparse.ArgumentParser()
-    parser.add_argument("paths", nargs="*", help="workflow files to scan")
-    args = parser.parse_args(argv)
-
-    paths = args.paths or default_paths()
-    findings = scan_paths(paths)
-    print_report(findings)
-    return 1 if findings else 0
-
-
-if __name__ == "__main__":
-    raise SystemExit(main())
@@ -1,509 +0,0 @@
-#!/usr/bin/env python3
-"""lint_bp_context_emit_match — Tier 2f per internal#350.
-
-Rule
----
-For a given protected branch, every context in
-`branch_protections/<branch>.status_check_contexts` MUST be emitted
-by at least one workflow in `.gitea/workflows/*.yml`. Two contexts
-match when:
-
-  1. The workflow's `name:` equals the context's workflow-part (the
-     prefix before ` / `).
-  2. Some job in that workflow has a `name:` (or default-fallback
-     job-key) equal to the context's job-part (between ` / ` and
-     ` (`).
-  3. The workflow's `on:` block includes the context's event-part
-     (in parens at the end), with Gitea's event-name mapping:
-        - `pull_request` and `pull_request_target` BOTH emit
-          `(pull_request)` contexts (verified empirically on
-          molecule-core/main).
-        - `push` emits `(push)`.
-
-A BP context with no emitter blocks merges forever — Gitea treats
-absent-as-`pending`, NOT absent-as-`skipped`-as-`success`. This is
-the phantom-required-check class
-(`feedback_phantom_required_check_after_gitea_migration`).
-
-The inverse direction (emitter without BP context) is INFORMATIONAL
-only — Tier 2g handles that direction at PR-time. Flagging it here
-on a daily schedule would falsely surface every transitional state
-during a BP rollout.
-
-How the gate works
------------------
-Daily scheduled run + workflow_dispatch:
-
-  1. GET `branch_protections/{BRANCH}` (needs DRIFT_BOT_TOKEN with
-     repo-admin scope; same persona as ci-required-drift.yml).
-     Graceful-degrade on 403/404 per Tier 2a contract.
-
-  2. Walk `.gitea/workflows/*.yml` via PyYAML AST. For each workflow,
-     enumerate its emitted contexts: `{workflow.name} / {job.name or
-     job-key} ({event})` for each event in `on:` that emits a status.
-
-  3. For each BP context, look for an emitter match. Aggregate
-     orphans.
-
-  4. If orphans exist:
-     - File or PATCH a `[ci-bp-drift]` issue (idempotency contract:
-       search for exact title prefix, edit existing if open).
-     - Apply labels `tier:high` + `ci-bp-drift` (lookup IDs per
-       repo; per `feedback_tier_label_ids_are_per_repo`).
-     - Exit 1.
-
-  5. If no orphans:
-     - Close any existing `[ci-bp-drift]` issue with a clean-state
-       comment.
-     - Exit 0.
-
-Exit codes
----------
-  0 — clean OR API 403/404 (graceful-degrade, surfaces ::error::).
-  1 — at least one BP context has no emitter.
-  2 — env contract violation, workflows-dir missing, or YAML parse
-      error.
-
-Env
---
-  GITEA_TOKEN     — DRIFT_BOT_TOKEN (repo-admin for branch_protections)
-  GITEA_HOST      — e.g. git.moleculesai.app
-  REPO            — owner/name
-  BRANCH          — defaults to `main`
-  WORKFLOWS_DIR   — defaults to `.gitea/workflows`
-  DRIFT_LABEL     — defaults to `ci-bp-drift`
-
-Memory cross-links
------------------
-  - internal#350 (the RFC that specs this lint)
-  - feedback_phantom_required_check_after_gitea_migration
-  - feedback_tier_label_ids_are_per_repo
-  - reference_post_suspension_pipeline
-"""
-from __future__ import annotations
-
-import json
-import os
-import re
-import sys
-import urllib.error
-import urllib.parse
-import urllib.request
-from pathlib import Path
-from typing import Any
-
-try:
-    import yaml
-except ImportError:
-    sys.stderr.write(
-        "::error::PyYAML is required. Install with: pip install PyYAML\n"
-    )
-    sys.exit(2)
-
-
-# Status-check context regex (mirrors lint-required-no-paths.py).
-_CONTEXT_RE = re.compile(
-    r"^(?P<workflow>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$"
-)
-
-# Map a workflow `on:` event-key to the context's event-part. Gitea's
-# emitter convention (verified on molecule-core):
-#   - pull_request          → `(pull_request)`
-#   - pull_request_target   → `(pull_request)` (same surface)
-#   - push                  → `(push)`
-#   - schedule              → no PR status; scheduled runs don't post
-#     commit-statuses unless the workflow itself does so explicitly.
-#   - workflow_dispatch     → manually dispatched runs may or may not
-#     emit; safest to treat as "no PR status" (informational notice
-#     only).
-_EVENT_MAP = {
-    "pull_request": "pull_request",
-    "pull_request_target": "pull_request",
-    "push": "push",
-}
-
-
-# ---------------------------------------------------------------------------
-# Env
-# ---------------------------------------------------------------------------
-def _env(key: str, default: str | None = None) -> str:
-    v = os.environ.get(key, default)
-    return v if v is not None else ""
-
-
-def _require_env(key: str) -> str:
-    v = os.environ.get(key)
-    if not v:
-        sys.stderr.write(f"::error::missing required env var: {key}\n")
-        sys.exit(2)
-    return v
-
-
-# ---------------------------------------------------------------------------
-# API helper. Mirrors lint-required-no-paths.py's contract: returns
-# (status, payload) tuple with status ∈ {"ok", "not_found", "forbidden",
-# "error"}.
-# ---------------------------------------------------------------------------
-def api(
-    method: str,
-    path: str,
-    *,
-    body: dict | None = None,
-    query: dict[str, str] | None = None,
-) -> tuple[str, Any]:
-    host = _env("GITEA_HOST")
-    token = _env("GITEA_TOKEN")
-    url = f"https://{host}/api/v1{path}"
-    if query:
-        url = f"{url}?{urllib.parse.urlencode(query)}"
-    data = None
-    headers = {
-        "Authorization": f"token {token}",
-        "Accept": "application/json",
-    }
-    if body is not None:
-        data = json.dumps(body).encode("utf-8")
-        headers["Content-Type"] = "application/json"
-    req = urllib.request.Request(
-        url, method=method, data=data, headers=headers
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=30) as resp:
-            raw = resp.read()
-            if not raw:
-                return ("ok", None)
-            return ("ok", json.loads(raw))
-    except urllib.error.HTTPError as e:
-        if e.code == 404:
-            return ("not_found", None)
-        if e.code in (401, 403):
-            return ("forbidden", None)
-        return ("error", None)
-    except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
-        return ("error", None)
-
-
-# ---------------------------------------------------------------------------
-# Helpers
-# ---------------------------------------------------------------------------
-def _get_on(d: Any) -> Any:
-    """YAML 1.1 boolean quirk: bare `on:` may parse to True. Handle both."""
-    if not isinstance(d, dict):
-        return None
-    if "on" in d:
-        return d["on"]
-    if True in d:
-        return d[True]
-    return None
-
-
-def _on_events(doc: Any) -> set[str]:
-    """Return the set of event keys in a workflow's `on:` block.
-
-    Accepts all three shapes (string / list / mapping). String/list
-    shapes can't carry filters but they DO emit. Returns the
-    Gitea-mapped event names per `_EVENT_MAP`.
-    """
-    on = _get_on(doc)
-    raw_events: set[str] = set()
-    if on is None:
-        return raw_events
-    if isinstance(on, str):
-        raw_events.add(on)
-    elif isinstance(on, list):
-        for e in on:
-            if isinstance(e, str):
-                raw_events.add(e)
-    elif isinstance(on, dict):
-        for k in on:
-            if isinstance(k, str):
-                raw_events.add(k)
-    return {_EVENT_MAP[e] for e in raw_events if e in _EVENT_MAP}
-
-
-def _job_display(jbody: dict, jkey: str) -> str:
-    """Return job's `name:` if set, else fall back to the job-key.
-
-    Gitea formats status contexts with the job's `name:` when set;
-    when unset it uses the job key. Matches lint-required-no-paths
-    convention.
-    """
-    n = jbody.get("name") if isinstance(jbody, dict) else None
-    if isinstance(n, str) and n:
-        return n
-    return jkey
-
-
-def workflow_contexts(doc: Any) -> set[str]:
-    """Return the set of contexts a workflow emits."""
-    contexts: set[str] = set()
-    if not isinstance(doc, dict):
-        return contexts
-    wf_name = doc.get("name")
-    if not isinstance(wf_name, str) or not wf_name:
-        return contexts  # no name => no addressable context
-    events = _on_events(doc)
-    if not events:
-        return contexts
-    jobs = doc.get("jobs")
-    if not isinstance(jobs, dict):
-        return contexts
-    for jkey, jbody in jobs.items():
-        if jkey == "__lines__":  # tolerate line-tracking annotations
-            continue
-        if not isinstance(jbody, dict):
-            continue
-        disp = _job_display(jbody, jkey)
-        for ev in events:
-            contexts.add(f"{wf_name} / {disp} ({ev})")
-    return contexts
-
-
-def parse_context(ctx: str) -> tuple[str, str, str] | None:
-    m = _CONTEXT_RE.match(ctx)
-    if not m:
-        return None
-    return (m.group("workflow"), m.group("job"), m.group("event"))
-
-
-def _iter_workflow_files(wf_dir: Path) -> list[Path]:
-    return sorted(list(wf_dir.glob("*.yml")) + list(wf_dir.glob("*.yaml")))
-
-
-# ---------------------------------------------------------------------------
-# Issue idempotency — search for an open issue with the canonical
-# title prefix; PATCH if found, POST if not. Mirrors ci-required-drift.
-# ---------------------------------------------------------------------------
-def _canonical_title(repo: str, branch: str) -> str:
-    return f"[ci-bp-drift] {repo}/{branch}: BP→emitter mismatch"
-
-
-def _ensure_labels(repo: str, names: list[str]) -> list[int]:
-    status, labels = api("GET", f"/repos/{repo}/labels", query={"limit": "50"})
-    if status != "ok" or not isinstance(labels, list):
-        return []
-    out: list[int] = []
-    by_name = {l["name"]: l["id"] for l in labels if isinstance(l, dict)}
-    for n in names:
-        if n in by_name:
-            out.append(by_name[n])
-    return out
-
-
-def file_or_update_issue(
-    repo: str, branch: str, orphans: list[str], emitter_orphans: list[str]
-) -> None:
-    title = _canonical_title(repo, branch)
-    body_lines = [
-        f"BP→emitter drift detected on `{branch}` at "
-        f"{os.environ.get('GITHUB_RUN_URL', '(run url unavailable)')}.",
-        "",
-        f"## Orphan BP contexts ({len(orphans)})",
-        "",
-        "These contexts are required by branch protection but NO workflow "
-        "emits them. PRs merging into this branch will wait forever for a "
-        "status that never arrives (Gitea treats absent-as-`pending`, NOT "
-        "absent-as-`skipped`). See "
-        "`feedback_phantom_required_check_after_gitea_migration`.",
-        "",
-    ]
-    for o in orphans:
-        body_lines.append(f"- `{o}`")
-    if emitter_orphans:
-        body_lines += [
-            "",
-            f"## Workflows emitting contexts NOT in BP ({len(emitter_orphans)})",
-            "",
-            "Informational — Tier 2g handles this direction at PR-time. "
-            "Listed here for completeness.",
-            "",
-        ]
-        for o in emitter_orphans:
-            body_lines.append(f"- `{o}`")
-    body_lines += [
-        "",
-        "Fix options:",
-        "  1. PATCH `branch_protections/{branch}.status_check_contexts` "
-        "  to remove the orphan.",
-        "  2. Restore the emitting workflow (if it was deleted/renamed).",
-        "",
-        "Linted by `.gitea/workflows/lint-bp-context-emit-match.yml` "
-        "(Tier 2f, internal#350).",
-    ]
-    body = "\n".join(body_lines)
-
-    # Idempotency search — find an open issue with the canonical title.
-    status, hits = api(
-        "GET",
-        f"/repos/{repo}/issues",
-        query={
-            "type": "issues",
-            "state": "open",
-            "q": title,
-        },
-    )
-    existing = None
-    if status == "ok" and isinstance(hits, list):
-        for h in hits:
-            if (
-                isinstance(h, dict)
-                and h.get("state") == "open"
-                and isinstance(h.get("title"), str)
-                and h["title"].startswith(title)
-            ):
-                existing = h
-                break
-
-    label_ids = _ensure_labels(repo, ["ci-bp-drift", "tier:high"])
-
-    if existing:
-        api(
-            "PATCH",
-            f"/repos/{repo}/issues/{existing['number']}",
-            body={"body": body, "labels": label_ids} if label_ids else {"body": body},
-        )
-        print(
-            f"::notice::Updated existing drift issue "
-            f"#{existing['number']}: {existing.get('html_url', '')}"
-        )
-    else:
-        status, posted = api(
-            "POST",
-            f"/repos/{repo}/issues",
-            body={"title": title, "body": body, "labels": label_ids},
-        )
-        if status == "ok" and isinstance(posted, dict):
-            print(
-                f"::notice::Filed new drift issue "
-                f"#{posted.get('number')}: {posted.get('html_url', '')}"
-            )
-
-
-# ---------------------------------------------------------------------------
-# Driver
-# ---------------------------------------------------------------------------
-def run() -> int:
-    _require_env("GITEA_TOKEN")
-    _require_env("GITEA_HOST")
-    repo = _require_env("REPO")
-    branch = _env("BRANCH", "main")
-    wf_dir = Path(_env("WORKFLOWS_DIR", ".gitea/workflows"))
-
-    if not wf_dir.is_dir():
-        sys.stderr.write(f"::error::workflows directory not found: {wf_dir}\n")
-        return 2
-
-    # 1. Pull BP.
-    status, bp = api("GET", f"/repos/{repo}/branch_protections/{branch}")
-    if status == "forbidden":
-        sys.stderr.write(
-            f"::error::GET branch_protections/{branch} returned HTTP 403 — "
-            f"DRIFT_BOT_TOKEN lacks repo-admin scope (Gitea 1.22.6 requires "
-            f"it for this endpoint). Skipping lint with exit 0 to avoid "
-            f"red-X on every run. Fix: grant repo-admin to mc-drift-bot. "
-            f"Per Tier 2a contract.\n"
-        )
-        return 0
-    if status == "not_found":
-        print(
-            f"::notice::branch '{branch}' has no protection configured; "
-            f"nothing to lint."
-        )
-        return 0
-    if status != "ok" or not isinstance(bp, dict):
-        sys.stderr.write(
-            f"::error::branch_protections/{branch} response unexpected; "
-            f"status={status}. Treating as transient; exit 0.\n"
-        )
-        return 0
-
-    bp_contexts: list[str] = list(bp.get("status_check_contexts") or [])
-    if not bp_contexts:
-        print(
-            f"::notice::branch_protections/{branch} has 0 required "
-            f"status_check_contexts; nothing to lint."
-        )
-        return 0
-
-    # 2. Enumerate emitter contexts from all workflows.
-    all_emitter: set[str] = set()
-    for path in _iter_workflow_files(wf_dir):
-        try:
-            doc = yaml.safe_load(path.read_text(encoding="utf-8"))
-        except yaml.YAMLError as e:
-            sys.stderr.write(
-                f"::error file={path}::YAML parse error: {e}; skipping.\n"
-            )
-            continue
-        all_emitter |= workflow_contexts(doc)
-
-    print(
-        f"::notice::Linting {len(bp_contexts)} BP context(s) for {branch} "
-        f"against {len(all_emitter)} workflow-emitted context(s)."
-    )
-
-    bp_set = set(bp_contexts)
-
-    # 3. Find orphans (BP-side: required but no emitter).
-    bp_orphans = sorted(bp_set - all_emitter)
-
-    # Informational: workflow emits but BP doesn't list. Tier 2g
-    # territory at PR-time. We list these as NOTICE only.
-    emitter_orphans = sorted(all_emitter - bp_set)
-
-    if bp_orphans:
-        print(
-            f"::error::Found {len(bp_orphans)} BP context(s) with no "
-            f"emitter — these would block merges forever (Gitea treats "
-            f"absent-as-pending, not skipped):"
-        )
-        for o in bp_orphans:
-            # Closest-match hint: name a workflow whose name-part is a
-            # near-match (lev-1 typo, or same workflow with a different
-            # event).
-            parsed = parse_context(o)
-            hint = ""
-            if parsed:
-                wf, _job, _ev = parsed
-                candidates = sorted(
-                    {c for c in all_emitter if c.startswith(wf + " / ")}
-                )
-                if candidates:
-                    hint = (
-                        f" — closest emitter(s): {', '.join(candidates[:3])}"
-                    )
-            print(f"::error::  - {o}{hint}")
-        if emitter_orphans:
-            print(
-                f"::notice::Also: {len(emitter_orphans)} workflow-emitted "
-                f"context(s) not in BP (informational; Tier 2g handles at "
-                f"PR-time):"
-            )
-            for o in emitter_orphans:
-                print(f"::notice::  - {o}")
-        # File / patch tracking issue.
-        try:
-            file_or_update_issue(repo, branch, bp_orphans, emitter_orphans)
-        except Exception as e:
-            sys.stderr.write(
-                f"::error::failed to file drift issue: {e}\n"
-            )
-        return 1
-
-    if emitter_orphans:
-        print(
-            f"::notice::{len(emitter_orphans)} workflow-emitted context(s) "
-            f"not in BP (informational; Tier 2g handles at PR-time):"
-        )
-        for o in emitter_orphans:
-            print(f"::notice::  - {o}")
-
-    print(
-        f"::notice::BP/emitter match clean: all {len(bp_contexts)} required "
-        f"context(s) have an emitter."
-    )
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(run())
@@ -98,13 +98,11 @@ except ImportError:
 # ---------------------------------------------------------------------------
 # Tracker comment regex.
 # Matches: `# mc#1234`, `# internal#42`, `# mc#1234 - description`
-# Also matches trackers embedded mid-sentence: `# see mc#1234 for details`
 # Does NOT match: `# mc1234` (missing inner #), `mc#1234` (no leading
-# comment `#`), `# MC#1234` (case-sensitive). The search is line-wide,
-# not just at the comment-marker prefix — fixes false-negative when
-# the tracker appears mid-sentence (e.g. `internal#350` after prose).
+# `#` comment marker), `# MC#1234` (case-sensitive — `mc` and `internal`
+# are conventional lower-case repo slugs).
 TRACKER_RE = re.compile(
-    r"(?P<slug>mc|internal)#(?P<num>\d+)\b"
+    r"#\s*(?P<slug>mc|internal)#(?P<num>\d+)\b"
 )

 # Truthy continue-on-error values we treat as "true". PyYAML decodes
@@ -1,526 +0,0 @@
-#!/usr/bin/env python3
-"""lint_required_context_exists_in_bp — Tier 2g per internal#350.
-
-Rule
----
-When a PR adds a NEW commit-status emission (a context that didn't
-exist on the base side), the workflow file must carry one of three
-directive comments adjacent to the new job:
-
-  (a) `# bp-required: yes`
-      The new context MUST already be in
-      `branch_protections/<branch>.status_check_contexts`. Verified
-      via Gitea API at PR time.
-
-  (b) `# bp-required: pending #NNN`
-      Acknowledged asymmetry; references an OPEN tracking issue that
-      will follow up with the BP PATCH.
-
-  (c) `# bp-exempt: <free-text reason>`
-      Informational job, not intended to be a required gate.
-
-No directive on a new emitter → FAIL with a 3-option fix-hint.
-
-The class this prevents
-----------------------
-PR#656 added `CI / all-required (pull_request)` as a sentinel context
-that workflows emit, but BP did NOT list it. When `platform-build`
-failed, `all-required` failed, but BP let the PR merge anyway →
-cascade to mc#664. With this lint, PR#656 would have been blocked
-until either the BP PATCH ran alongside OR the author added a
-`bp-required: pending` directive.
-
-Why directives MUST live in the workflow YAML
---------------------------------------------
-The directive comment lives with the emitter so a scheduled
-audit (Tier 2f, daily) can read the same source. PR-body-only
-directives invisibly evaporate on merge — the asymmetry would
-return to undetected. PR-body claims are advisory; workflow-file
-comments are the contract.
-
-How "new emission" is detected
------------------------------
-Diff base..head over `.gitea/workflows/*.yml`. For each YAML file
-that's added or modified:
-  - Parse both base-side and head-side via PyYAML AST.
-  - Enumerate emitted contexts on each side using the same rules as
-    Tier 2f (workflow.name + job.name|key + event-mapping).
-  - `new_contexts = head_contexts - base_contexts`.
-
-If `new_contexts` is empty after de-dup, no rule applies → pass.
-
-Per `feedback_behavior_based_ast_gates`: comment scanning uses raw
-text in a small window around the job-key line, NOT regex over the
-full file. This avoids matching `bp-required:` mentioned in a
-comment unrelated to the new job.
-
-Exit codes
----------
-  0 — no new emissions, all new emissions have valid directives,
-      or BP read errored (graceful-degrade per Tier 2a contract).
-  1 — at least one new emission lacks a directive, or has
-      `bp-required: yes` but the context is missing from BP.
-  2 — env contract violation or YAML parse error.
-
-Env
---
-  BASE_SHA          — PR base SHA
-  HEAD_SHA          — PR head SHA
-  GITEA_TOKEN       — DRIFT_BOT_TOKEN (repo-admin for BP read)
-  GITEA_HOST        — e.g. git.moleculesai.app
-  REPO              — owner/name
-  BRANCH            — defaults to `main`
-  WORKFLOWS_DIR     — defaults to `.gitea/workflows`
-
-Memory cross-links
------------------
-  - internal#350 (the RFC that specs this lint)
-  - PR#656 (the empirical case that prompted Tier 2g)
-  - mc#664 (the surfaced cascade)
-  - feedback_phantom_required_check_after_gitea_migration (Tier 2f cousin)
-  - feedback_behavior_based_ast_gates
-"""
-from __future__ import annotations
-
-import json
-import os
-import re
-import subprocess
-import sys
-import urllib.error
-import urllib.parse
-import urllib.request
-from typing import Any
-
-try:
-    import yaml
-except ImportError:
-    sys.stderr.write(
-        "::error::PyYAML is required. Install with: pip install PyYAML\n"
-    )
-    sys.exit(2)
-
-
-# Directive comment patterns. We match `# bp-required:` OR `# bp-exempt:`,
-# both with optional surrounding whitespace and case-sensitive on the
-# `bp-` prefix (convention).
-BP_REQUIRED_YES_RE = re.compile(
-    r"#\s*bp-required:\s*yes\b", re.IGNORECASE
-)
-BP_REQUIRED_PENDING_RE = re.compile(
-    r"#\s*bp-required:\s*pending\s*#(?P<num>\d+)\b", re.IGNORECASE
-)
-BP_EXEMPT_RE = re.compile(
-    r"#\s*bp-exempt:\s*\S", re.IGNORECASE
-)
-
-
-# Gitea event-mapping (same as Tier 2f).
-_EVENT_MAP = {
-    "pull_request": "pull_request",
-    "pull_request_target": "pull_request",
-    "push": "push",
-}
-
-
-# ---------------------------------------------------------------------------
-# Env
-# ---------------------------------------------------------------------------
-def _env(key: str, default: str | None = None) -> str:
-    v = os.environ.get(key, default)
-    return v if v is not None else ""
-
-
-def _require_env(key: str) -> str:
-    v = os.environ.get(key)
-    if not v:
-        sys.stderr.write(f"::error::missing required env var: {key}\n")
-        sys.exit(2)
-    return v
-
-
-# ---------------------------------------------------------------------------
-# API helper (same contract as Tier 2f).
-# ---------------------------------------------------------------------------
-def api(
-    method: str,
-    path: str,
-    *,
-    body: dict | None = None,
-    query: dict[str, str] | None = None,
-) -> tuple[str, Any]:
-    host = _env("GITEA_HOST")
-    token = _env("GITEA_TOKEN")
-    url = f"https://{host}/api/v1{path}"
-    if query:
-        url = f"{url}?{urllib.parse.urlencode(query)}"
-    data = None
-    headers = {
-        "Authorization": f"token {token}",
-        "Accept": "application/json",
-    }
-    if body is not None:
-        data = json.dumps(body).encode("utf-8")
-        headers["Content-Type"] = "application/json"
-    req = urllib.request.Request(url, method=method, data=data, headers=headers)
-    try:
-        with urllib.request.urlopen(req, timeout=30) as resp:
-            raw = resp.read()
-            if not raw:
-                return ("ok", None)
-            return ("ok", json.loads(raw))
-    except urllib.error.HTTPError as e:
-        if e.code == 404:
-            return ("not_found", None)
-        if e.code in (401, 403):
-            return ("forbidden", None)
-        return ("error", None)
-    except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
-        return ("error", None)
-
-
-# ---------------------------------------------------------------------------
-# git helpers
-# ---------------------------------------------------------------------------
-def git_show(sha: str, path: str) -> str | None:
-    r = subprocess.run(
-        ["git", "show", f"{sha}:{path}"], capture_output=True, text=True
-    )
-    if r.returncode != 0:
-        return None
-    return r.stdout
-
-
-def git_diff_paths(base: str, head: str) -> list[str]:
-    r = subprocess.run(
-        ["git", "diff", "--name-only", f"{base}..{head}"],
-        capture_output=True,
-        text=True,
-    )
-    if r.returncode != 0:
-        return []
-    return [p for p in r.stdout.splitlines() if p.strip()]
-
-
-# ---------------------------------------------------------------------------
-# Workflow context enumeration (mirror Tier 2f).
-# ---------------------------------------------------------------------------
-def _get_on(d: Any) -> Any:
-    if not isinstance(d, dict):
-        return None
-    if "on" in d:
-        return d["on"]
-    if True in d:
-        return d[True]
-    return None
-
-
-def _on_events(doc: Any) -> set[str]:
-    on = _get_on(doc)
-    raw: set[str] = set()
-    if on is None:
-        return raw
-    if isinstance(on, str):
-        raw.add(on)
-    elif isinstance(on, list):
-        for e in on:
-            if isinstance(e, str):
-                raw.add(e)
-    elif isinstance(on, dict):
-        for k in on:
-            if isinstance(k, str):
-                raw.add(k)
-    return {_EVENT_MAP[e] for e in raw if e in _EVENT_MAP}
-
-
-def _job_display(jbody: dict, jkey: str) -> str:
-    n = jbody.get("name") if isinstance(jbody, dict) else None
-    if isinstance(n, str) and n:
-        return n
-    return jkey
-
-
-def workflow_contexts(doc: Any) -> set[str]:
-    if not isinstance(doc, dict):
-        return set()
-    wf_name = doc.get("name")
-    if not isinstance(wf_name, str) or not wf_name:
-        return set()
-    events = _on_events(doc)
-    if not events:
-        return set()
-    jobs = doc.get("jobs")
-    if not isinstance(jobs, dict):
-        return set()
-    out: set[str] = set()
-    for jkey, jbody in jobs.items():
-        if jkey == "__lines__":
-            continue
-        if not isinstance(jbody, dict):
-            continue
-        disp = _job_display(jbody, jkey)
-        for ev in events:
-            out.add(f"{wf_name} / {disp} ({ev})")
-    return out
-
-
-# ---------------------------------------------------------------------------
-# Find the source line of a job-key in a workflow YAML's raw text.
-# Used to scan for nearby directive comments.
-# ---------------------------------------------------------------------------
-def _find_job_key_line(raw_lines: list[str], jkey: str) -> int | None:
-    """Return 1-based line of `<jkey>:` under jobs:."""
-    in_jobs = False
-    jobs_indent = -1
-    for i, line in enumerate(raw_lines, start=1):
-        stripped = line.lstrip()
-        if stripped.startswith("jobs:"):
-            in_jobs = True
-            jobs_indent = len(line) - len(stripped)
-            continue
-        if in_jobs:
-            # Job key is the next indent level under `jobs:`.
-            indent = len(line) - len(stripped)
-            if stripped and indent <= jobs_indent:
-                # Left the jobs: block
-                in_jobs = False
-                continue
-            if re.match(rf"^\s*{re.escape(jkey)}\s*:", line):
-                return i
-    return None
-
-
-_DIRECTIVE_WINDOW = 3  # lines above the job-key line (inclusive)
-
-
-def find_directive_for_job(
-    raw_text: str, jkey: str
-) -> tuple[str, str | None] | None:
-    """Return (kind, value) tuple for the first directive in a small
-    window above the job-key line.
-
-    kind ∈ {"required-yes", "required-pending", "exempt"}.
-    value is the pending-issue number for required-pending, else None.
-    Returns None if no directive found.
-
-    We scan ABOVE the line only (the convention is the directive
-    precedes the job — matches how `# mc#NNN` comments are placed
-    above `continue-on-error: true`). We don't scan inside the job
-    body because steps can produce false positives.
-    """
-    lines = raw_text.splitlines()
-    line_no = _find_job_key_line(lines, jkey)
-    if line_no is None:
-        return None
-    lo = max(1, line_no - _DIRECTIVE_WINDOW)
-    for i in range(lo, line_no):
-        line = lines[i - 1]
-        m = BP_REQUIRED_PENDING_RE.search(line)
-        if m:
-            return ("required-pending", m.group("num"))
-        if BP_REQUIRED_YES_RE.search(line):
-            return ("required-yes", None)
-        if BP_EXEMPT_RE.search(line):
-            return ("exempt", None)
-    return None
-
-
-# ---------------------------------------------------------------------------
-# Map a context back to its emitting (workflow_path, job_key) pair so
-# we know WHERE to look for the directive comment.
-# ---------------------------------------------------------------------------
-def _resolve_emitter(
-    ctx: str, head_workflows: dict[str, tuple[str, Any]]
-) -> tuple[str, str] | None:
-    """Return (file_path, job_key) emitting ctx, or None."""
-    m = re.match(r"^(?P<wf>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$", ctx)
-    if not m:
-        return None
-    target_wf = m.group("wf")
-    target_job_disp = m.group("job")
-    for path, (_raw, doc) in head_workflows.items():
-        if not isinstance(doc, dict):
-            continue
-        if doc.get("name") != target_wf:
-            continue
-        jobs = doc.get("jobs") or {}
-        if not isinstance(jobs, dict):
-            continue
-        for jkey, jbody in jobs.items():
-            if jkey == "__lines__":
-                continue
-            if not isinstance(jbody, dict):
-                continue
-            disp = _job_display(jbody, jkey)
-            if disp == target_job_disp:
-                return (path, jkey)
-    return None
-
-
-# ---------------------------------------------------------------------------
-# Driver
-# ---------------------------------------------------------------------------
-def run() -> int:
-    base_sha = _require_env("BASE_SHA")
-    head_sha = _require_env("HEAD_SHA")
-    _require_env("GITEA_TOKEN")
-    _require_env("GITEA_HOST")
-    repo = _require_env("REPO")
-    branch = _env("BRANCH", "main")
-    wf_dir = _env("WORKFLOWS_DIR", ".gitea/workflows")
-
-    # Step 1 — find workflow files changed in the PR.
-    changed = git_diff_paths(base_sha, head_sha)
-    changed_workflows = [
-        p
-        for p in changed
-        if p.startswith(wf_dir + "/")
-        and (p.endswith(".yml") or p.endswith(".yaml"))
-    ]
-    if not changed_workflows:
-        print(
-            "::notice::no workflow file changes in this PR; "
-            "lint-required-context-exists-in-bp skipped."
-        )
-        return 0
-
-    # Step 2 — load base+head + compute new contexts.
-    head_workflows: dict[str, tuple[str, Any]] = {}
-    new_contexts: set[str] = set()
-    for path in changed_workflows:
-        base_raw = git_show(base_sha, path)
-        head_raw = git_show(head_sha, path)
-        if head_raw is None:
-            # File deleted on head — no new emission contribution.
-            continue
-        try:
-            head_doc = yaml.safe_load(head_raw)
-        except yaml.YAMLError as e:
-            sys.stderr.write(
-                f"::error file={path}::YAML parse error on head: {e}\n"
-            )
-            return 2
-        head_workflows[path] = (head_raw, head_doc)
-        head_ctx = workflow_contexts(head_doc)
-        base_ctx: set[str] = set()
-        if base_raw is not None:
-            try:
-                base_doc = yaml.safe_load(base_raw)
-            except yaml.YAMLError:
-                base_doc = None
-            if base_doc is not None:
-                base_ctx = workflow_contexts(base_doc)
-        new_contexts |= (head_ctx - base_ctx)
-
-    if not new_contexts:
-        print(
-            "::notice::no new context emissions detected in this PR; "
-            "lint-required-context-exists-in-bp skipped."
-        )
-        return 0
-
-    # Step 3 — fetch BP context list.
-    status, bp = api("GET", f"/repos/{repo}/branch_protections/{branch}")
-    bp_contexts: set[str] = set()
-    if status == "forbidden":
-        sys.stderr.write(
-            f"::error::GET branch_protections/{branch} returned HTTP 403 — "
-            f"DRIFT_BOT_TOKEN lacks repo-admin scope. Cannot verify "
-            f"bp-required directives; skipping lint with exit 0 per "
-            f"Tier 2a contract. Fix the token, not the lint.\n"
-        )
-        return 0
-    elif status == "not_found":
-        # Branch has no protection — nothing to verify against; the
-        # bp-required: yes directive can't be satisfied. Treat as
-        # graceful-skip rather than red-X.
-        print(
-            f"::notice::branch '{branch}' has no protection; cannot verify "
-            f"bp-required directives. Skipping (exit 0)."
-        )
-        return 0
-    elif status == "ok" and isinstance(bp, dict):
-        bp_contexts = set(bp.get("status_check_contexts") or [])
-    else:
-        sys.stderr.write(
-            f"::error::branch_protections/{branch} response unexpected; "
-            f"status={status}. Treating as transient; exit 0.\n"
-        )
-        return 0
-
-    # Step 4 — validate each new emission's directive.
-    violations: list[str] = []
-    for ctx in sorted(new_contexts):
-        emitter = _resolve_emitter(ctx, head_workflows)
-        if emitter is None:
-            # Shouldn't happen — we just derived ctx from head_workflows.
-            # Belt-and-suspenders fallback.
-            violations.append(
-                f"::error::new emission '{ctx}' (could not resolve emitter "
-                f"file/job — bug in lint?)"
-            )
-            continue
-        file_path, jkey = emitter
-        raw_text, _ = head_workflows[file_path]
-        directive = find_directive_for_job(raw_text, jkey)
-        if directive is None:
-            violations.append(
-                f"::error file={file_path}::lint-required-context-exists-in-bp "
-                f"(Tier 2g): NEW emission `{ctx}` (job '{jkey}') has no "
-                f"directive comment. Add ONE of these comments on the line "
-                f"directly above `{jkey}:` (within {_DIRECTIVE_WINDOW} lines):\n"
-                f"  - `# bp-required: yes` — and ensure the context is "
-                f"already in branch_protections/{branch}.status_check_contexts.\n"
-                f"  - `# bp-required: pending #NNN` — acknowledged asymmetry, "
-                f"references the tracking issue for the BP PATCH.\n"
-                f"  - `# bp-exempt: <reason>` — informational job, not a gate.\n"
-                f"Memory: internal#350 (PR#656 + mc#664 empirical case)."
-            )
-            continue
-        kind, value = directive
-        if kind == "exempt":
-            print(f"::notice::{ctx}: bp-exempt directive present, OK.")
-            continue
-        if kind == "required-pending":
-            print(
-                f"::notice::{ctx}: bp-required: pending #{value} — "
-                f"acknowledged asymmetry, OK."
-            )
-            continue
-        if kind == "required-yes":
-            if ctx in bp_contexts:
-                print(
-                    f"::notice::{ctx}: bp-required: yes, and context is in "
-                    f"BP, OK."
-                )
-            else:
-                violations.append(
-                    f"::error file={file_path}::lint-required-context-exists-in-bp "
-                    f"(Tier 2g): job '{jkey}' has `bp-required: yes` "
-                    f"directive but its emitted context `{ctx}` is NOT in "
-                    f"`branch_protections/{branch}.status_check_contexts`. "
-                    f"FIX: either (a) add `{ctx}` to BP (Owners-tier PATCH), "
-                    f"or (b) downgrade the directive to "
-                    f"`# bp-required: pending #NNN` referencing the tracker "
-                    f"for the pending BP PATCH."
-                )
-
-    if violations:
-        print(
-            f"::error::lint-required-context-exists-in-bp: "
-            f"{len(violations)} violation(s) across "
-            f"{len(changed_workflows)} changed workflow file(s)."
-        )
-        for v in violations:
-            print(v)
-        return 1
-
-    print(
-        f"::notice::lint-required-context-exists-in-bp: "
-        f"{len(new_contexts)} new emission(s) all directive-validated."
-    )
-    return 0
-
-
-if __name__ == "__main__":
-    sys.exit(run())
@@ -37,7 +37,6 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking
    # the PR. Follow-up PR flips this off after surfaced defects are
    # triaged.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -48,7 +48,6 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking
    # the PR. Follow-up PR flips this off after surfaced defects are
    # triaged.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
@@ -45,7 +45,6 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking
    # the PR. Follow-up PR flips this off after surfaced defects are
    # triaged.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 5
    steps:
@@ -126,7 +126,7 @@ jobs:
    name: Platform (Go)
    needs: changes
    runs-on: ubuntu-latest
-    # mc#774 (interim): re-mask platform-build pending fix-forward. Phase 4
+    # mc#664 (interim): re-mask platform-build pending fix-forward. Phase 4
    # (#656) flipped this to continue-on-error: false based on a Phase-3-masked
    # "green on main 2026-05-12" — the prior continue-on-error: true had
    # been hiding failing tests in workspace-server/internal/handlers/.
@@ -145,11 +145,10 @@ jobs:
    # Time-boxed Option A (90 min) did not fit the cross-cutting scope.
    # This is a sequenced revert→fix→reflip per
    # feedback_strict_root_only_after_class_a emergency clause — NOT
-    # a permanent re-mask. Re-flip blocked on mc#774 fix-forward landing.
+    # a permanent re-mask. Re-flip blocked on mc#664 fix-forward landing.
    # Other 4 #656 flips (changes, canvas-build, shellcheck, python-lint)
    # retain continue-on-error: false; only platform-build regresses.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
-    continue-on-error: true  # mc#774 fix-forward in flight; re-flip when mc#774 lands (PR #669 → rebase after #709)
+    continue-on-error: true  # mc#664 fix-forward in flight; re-flip when tests pass
    defaults:
      run:
        working-directory: workspace-server
@@ -169,10 +168,10 @@ jobs:
        run: go build ./cmd/server
      # CLI (molecli) moved to standalone repo: git.moleculesai.app/molecule-ai/molecule-cli
      - if: needs.changes.outputs.platform == 'true'
-        run: go vet ./...
+        run: go vet ./... || true
      - if: needs.changes.outputs.platform == 'true'
        name: Run golangci-lint
-        run: golangci-lint run --timeout 3m ./...
+        run: golangci-lint run --timeout 3m ./... || true
      - if: needs.changes.outputs.platform == 'true'
        name: Diagnostic — per-package verbose 60s
        run: |
@@ -187,7 +186,6 @@ jobs:
          echo "::group::pendinguploads exit=$pu_exit (last 100 lines)"
          tail -100 /tmp/test-pu.log
          echo "::endgroup::"
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true
      - if: needs.changes.outputs.platform == 'true'
        name: Run tests with race detection and coverage
@@ -374,7 +372,6 @@ jobs:
  canvas-deploy-reminder:
    name: Canvas Deploy Reminder
    runs-on: ubuntu-latest
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    needs: [changes, canvas-build]
    # Only fires on direct pushes to main (i.e. after staging→main promotion).
@@ -538,16 +535,12 @@ jobs:
    # explicitly excludes `github.event_name`-gated jobs from F1 (see
    # `.gitea/scripts/ci-required-drift.py::ci_job_names`).
    #
-    # Phase 3 (RFC #219 §1) safety: underlying build jobs carry
-    # continue-on-error: true so their failures are masked to null (2026-05-12: re-enabled mc#774 interim)
-    # (Gitea suppresses status reporting for CoE jobs). This sentinel
-    # runs with continue-on-error: false so it always reports its
-    # result to the API — without this, the required-status entry
-    # (CI / all-required (pull_request)) is never created, which
-    # blocks PR merges. When Phase 3 ends, flip underlying jobs to
-    # continue-on-error: false; this sentinel can then be flipped to
-    # continue-on-error: true if a Phase-4 regression requires it.
-    continue-on-error: false
+    # Phase 3 (RFC #219 §1) safety: continue-on-error here so the sentinel
+    # does not hard-fail and block PRs while the underlying build jobs are
+    # still in Phase 3 (continue-on-error: true suppresses their status to null).
+    # When Phase 3 ends (defects fixed, continue-on-error flipped off on build
+    # jobs), remove continue-on-error here so the sentinel again hard-fails.
+    continue-on-error: true
    runs-on: ubuntu-latest
    timeout-minutes: 1
    needs:
@@ -571,26 +564,17 @@ jobs:
          echo "$results" | python3 -c '
          import json, sys
          ns = json.load(sys.stdin)
-          # Phase 3 masked: jobs with continue-on-error: true may report "failure"
-          # Remove when mc#774 handler test failures are resolved.
-          PHASE3_MASKED = {"platform-build"}
          # Exclude null (Phase 3 suppressed / in-flight) from the bad list.
          bad = [(k, v.get("result")) for k, v in ns.items()
-                 if v.get("result") not in ("success", None, "cancelled", "skipped") and k not in PHASE3_MASKED]
+                 if v.get("result") not in ("success", None)]
          if bad:
              print(f"FAIL: jobs not green:", file=sys.stderr)
              for k, r in bad:
                  print(f"  - {k}: {r}", file=sys.stderr)
              sys.exit(1)
-          pending = [(k, v.get("result")) for k, v in ns.items()
-                     if v.get("result") is None]
-          cancelled = [(k, v.get("result")) for k, v in ns.items()
-                       if v.get("result") == "cancelled"]
+          pending = [(k, v.get("result")) for k, v in ns.items() if v.get("result") is None]
          if pending:
              print(f"WARN: {len(pending)} job(s) still in-flight (result=null): " +
                    ", ".join(k for k, _ in pending), file=sys.stderr)
-          if cancelled:
-              print(f"INFO: {len(cancelled)} job(s) masked by continue-on-error: " +
-                    ", ".join(k for k, _ in cancelled), file=sys.stderr)
          print(f"OK: all {len(ns)} required jobs succeeded (or Phase-3 suppressed)")
          '
@@ -90,7 +90,6 @@ jobs:
    name: Synthetic E2E against staging
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    # Bumped from 12 → 20 (2026-05-04). Tenant user-data install phase
    # (apt-get update + install docker.io/jq/awscli/caddy + snap install
@@ -103,7 +103,6 @@ jobs:
  detect-changes:
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    outputs:
      api: ${{ steps.decide.outputs.api }}
@@ -155,7 +154,6 @@ jobs:
    name: E2E API Smoke Test
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 15
    env:
@@ -166,6 +164,7 @@ jobs:
      # we let Docker assign an ephemeral host port.
      PG_CONTAINER: pg-e2e-api-${{ github.run_id }}-${{ github.run_attempt }}
      REDIS_CONTAINER: redis-e2e-api-${{ github.run_id }}-${{ github.run_attempt }}
+      PORT: "8080"
    steps:
      - name: No-op pass (paths filter excluded this commit)
        if: needs.detect-changes.outputs.api != 'true'
@@ -269,20 +268,6 @@ jobs:
        if: needs.detect-changes.outputs.api == 'true'
        working-directory: workspace-server
        run: go build -o platform-server ./cmd/server
-      - name: Pick platform port
-        if: needs.detect-changes.outputs.api == 'true'
-        run: |
-          PLATFORM_PORT=$(python3 - <<'PY'
-          import socket
-
-          with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
-              s.bind(("127.0.0.1", 0))
-              print(s.getsockname()[1])
-          PY
-          )
-          echo "PORT=${PLATFORM_PORT}" >> "$GITHUB_ENV"
-          echo "BASE=http://127.0.0.1:${PLATFORM_PORT}" >> "$GITHUB_ENV"
-          echo "Platform host port: ${PLATFORM_PORT}"
      - name: Start platform (background)
        if: needs.detect-changes.outputs.api == 'true'
        working-directory: workspace-server
@@ -295,7 +280,7 @@ jobs:
        if: needs.detect-changes.outputs.api == 'true'
        run: |
          for i in $(seq 1 30); do
-            if curl -sf "$BASE/health" > /dev/null; then
+            if curl -sf http://127.0.0.1:8080/health > /dev/null; then
              echo "Platform up after ${i}s"
              exit 0
            fi
@@ -70,7 +70,6 @@ jobs:
  detect-changes:
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    outputs:
      canvas: ${{ steps.decide.outputs.canvas }}
@@ -119,7 +118,6 @@ jobs:
    name: Canvas tabs E2E
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 40

@@ -84,7 +84,6 @@ jobs:
    name: E2E Staging External Runtime
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 25

@@ -88,20 +88,17 @@ jobs:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          fetch-depth: 1
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true

      - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: "3.11"
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true

      - name: YAML validation (best-effort)
        run: |
          echo "e2e-staging-saas.yml — PR validation: workflow YAML is valid."
          echo "E2E step runs only when provisioning-critical files change."
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true

  # Actual E2E: runs on trunk pushes (main + staging). NOT the PR-fire-only
@@ -112,7 +109,6 @@ jobs:
    # Only runs on trunk pushes. PR paths get pr-validate instead.
    if: github.event.pull_request.base.ref == ''
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 45
    permissions:
@@ -37,7 +37,6 @@ jobs:
    name: Intentional-failure teardown sanity
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 20

@@ -32,21 +32,12 @@ on:
  # iterating all open PRs when PR_NUMBER is empty.
  workflow_dispatch:

-permissions:
-  # read: contents — for checkout (base ref, not PR head for security)
-  # read: pull-requests — for reading PR info via API
-  # write: pull-requests — for posting/updating gate-check comments
-  #   Without this the token cannot POST/PATCH /issues/comments → 403.
-  contents: read
-  pull-requests: write
-
 env:
  GITHUB_SERVER_URL: https://git.moleculesai.app

 jobs:
  gate-check:
    runs-on: ubuntu-latest
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true  # Never block on our own detector failing
    steps:
      - name: Check out BASE ref (never PR-head under pull_request_target)
@@ -77,32 +68,25 @@ jobs:
        if: github.event_name == 'schedule'
        env:
          GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
-          REPO: ${{ github.repository }}
        run: |
          set -euo pipefail
          # Fetch all open PRs and run gate-check on each
          # socket.setdefaulttimeout(15): defence-in-depth for missing SOP_TIER_CHECK_TOKEN.
          # gate_check.py uses timeout=15 on every urlopen call; this catches the
          # inline Python polling loop too (issue #603).
-          pr_numbers=$(python3 <<'PY'
-          import json
-          import os
-          import socket
-          import urllib.request
-
-          socket.setdefaulttimeout(15)
-          token = os.environ["GITEA_TOKEN"]
-          repo = os.environ["REPO"]
-          req = urllib.request.Request(
-              f"https://git.moleculesai.app/api/v1/repos/{repo}/pulls?state=open&limit=100",
-              headers={"Authorization": f"token {token}", "Accept": "application/json"},
-          )
-          with urllib.request.urlopen(req) as r:
-              prs = json.loads(r.read())
-          for pr in prs:
-              print(pr["number"])
-          PY
-          )
+          pr_numbers=$(python3 -c "
+            import socket, urllib.request, json, os
+            socket.setdefaulttimeout(15)
+            token = os.environ['GITEA_TOKEN']
+            req = urllib.request.Request(
+                'https://git.moleculesai.app/api/v1/repos/${{ github.repository }}/pulls?state=open&limit=100',
+                headers={'Authorization': f'token {token}', 'Accept': 'application/json'}
+            )
+            with urllib.request.urlopen(req) as r:
+                prs = json.loads(r.read())
+            for pr in prs:
+                print(pr['number'])
+          ")
          for pr in $pr_numbers; do
            echo "Checking PR #$pr..."
            python3 tools/gate-check-v3/gate_check.py \
@@ -78,8 +78,7 @@ jobs:
  detect-changes:
    name: detect-changes
    runs-on: ubuntu-latest
-    # mc#774 Phase 3 (RFC §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
+    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
    continue-on-error: true
    outputs:
      handlers: ${{ steps.filter.outputs.handlers }}
@@ -119,8 +118,7 @@ jobs:
    name: Handlers Postgres Integration
    needs: detect-changes
    runs-on: ubuntu-latest
-    # mc#774 Phase 3 (RFC §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
+    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
    continue-on-error: true
    env:
      # Unique name per run so concurrent jobs don't collide on the
@@ -63,7 +63,6 @@ jobs:
  detect-changes:
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    outputs:
      run: ${{ steps.decide.outputs.run }}
@@ -155,7 +154,6 @@ jobs:
    name: Harness Replays
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 30
    steps:
@@ -1,120 +0,0 @@
-name: lint-bp-context-emit-match
-
-# Tier 2f scheduled lint (per mc#774) — detects drift between
-# `branch_protections/<branch>.status_check_contexts` and the set of
-# contexts emitted by `.gitea/workflows/*.yml`.
-#
-# Rule
-# ----
-# For each protected branch context (Source A — BP), there must exist
-# at least one emitting workflow + job pair (Source B — workflow YAML
-# + on:-event mapping) whose runtime status-name maps to it. The
-# inverse direction (emitter without BP context) is informational
-# only — Tier 2g handles that at PR-time.
-#
-# Why this exists
-# ---------------
-# A BP-required context with no emitter blocks merges forever — Gitea
-# 1.22.6 treats absent-as-`pending`, NOT absent-as-`skipped`. The
-# phantom-required-check class previously surfaced as
-# `feedback_phantom_required_check_after_gitea_migration` (a port
-# kept the GitHub context name after rename to Gitea, but no
-# workflow emitted under the new name).
-#
-# This lint catches the same class structurally + a forward case:
-# workflow renamed/deleted while still in BP.
-#
-# Scope
-# -----
-# Scheduled daily. We DON'T run on `pull_request` because (a) the
-# emitter side moves with PR diffs (transitional state false-flags)
-# and (b) Tier 2g handles emitter-side drift at PR-time.
-#
-# Cross-repo
-# ----------
-# Today this runs only on molecule-core/main. Per internal#349
-# (cross-repo BP sweep) Class-D repos will get the same lint after
-# their BP rollouts.
-#
-# Auth
-# ----
-# `GET /repos/.../branch_protections/{branch}` requires repo-admin
-# role on Gitea 1.22.6. We use DRIFT_BOT_TOKEN (same persona as
-# ci-required-drift.yml — `internal#329` provisioning trail).
-# Graceful-degrade per Tier 2a contract: 403/404 → exit 0 with
-# ::error::.
-#
-# Idempotency
-# -----------
-# The drift issue is filed with title prefix
-# `[ci-bp-drift] {repo}/{branch}: BP→emitter mismatch`. The script
-# searches OPEN issues for an exact title-prefix match and PATCHes
-# the existing issue (if any) instead of POSTing a duplicate.
-# Mirrors `ci-required-drift.py`'s contract.
-#
-# Phase contract (RFC internal#219 §1 ladder)
-# -------------------------------------------
-# Lands at `continue-on-error: true` (Phase 3). After 7 days of clean
-# scheduled runs on `main`, flip to `false` so a scheduled failure
-# becomes a hard CI signal.
-#
-# Cross-links
-# -----------
-# - mc#774 (the RFC that specs this lint)
-# - internal#349 (cross-repo BP sweep)
-# - feedback_phantom_required_check_after_gitea_migration
-# - feedback_tier_label_ids_are_per_repo
-# - ci-required-drift.yml (F2 detector, narrower-scope sibling)
-
-on:
-  schedule:
-    # Daily at 03:31 UTC — off-peak, prime-staggered from other
-    # scheduled jobs (ci-required-drift :00 hourly, lint-coe-tracking
-    # 13:11). At 03:31 the CI fleet is quietest in EMEA hours.
-    - cron: '31 3 * * *'
-  workflow_dispatch:
-  # No `push` / `pull_request` here — Tier 2g owns PR-time drift.
-
-env:
-  GITHUB_SERVER_URL: https://git.moleculesai.app
-
-permissions:
-  contents: read
-  issues: write  # needed to file/edit the drift issue
-
-concurrency:
-  group: lint-bp-context-emit-match-${{ github.ref }}
-  cancel-in-progress: true
-
-jobs:
-  lint:
-    name: lint-bp-context-emit-match
-    runs-on: ubuntu-latest
-    timeout-minutes: 5
-    # Phase 3 (RFC #219 §1): surface drift without blocking. After 7
-    # clean scheduled runs on main, flip to false so a scheduled
-    # failure is a hard CI signal.
-    continue-on-error: true  # mc#774 Phase 3 — flip to false after 7 clean main runs
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065  # v5.6.0
-        with:
-          python-version: '3.12'
-      - name: Install PyYAML
-        run: python -m pip install --quiet 'PyYAML==6.0.2'
-      - name: Run lint-bp-context-emit-match
-        env:
-          # DRIFT_BOT_TOKEN — repo-admin on this repo (internal#329
-          # provisioning trail). Required for branch_protections read.
-          GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
-          GITEA_HOST: git.moleculesai.app
-          REPO: ${{ github.repository }}
-          BRANCH: main
-          WORKFLOWS_DIR: .gitea/workflows
-          DRIFT_LABEL: ci-bp-drift
-          GITHUB_RUN_URL: https://git.moleculesai.app/${{ github.repository }}/actions/runs/${{ github.run_id }}
-        run: python3 .gitea/scripts/lint_bp_context_emit_match.py
-      - name: Run lint-bp-context-emit-match unit tests
-        run: |
-          python -m pip install --quiet pytest
-          python3 -m pytest tests/test_lint_bp_context_emit_match.py -v
@@ -1,6 +1,6 @@
 name: lint-continue-on-error-tracking

-# Tier 2e hard-gate lint (per mc#774) — every
+# Tier 2e hard-gate lint (per internal#350) — every
 # `continue-on-error: true` in `.gitea/workflows/*.yml` must carry a
 # `# mc#NNNN` or `# internal#NNNN` tracker comment within 2 lines,
 # the referenced issue must be OPEN, and ≤14 days old.
@@ -8,7 +8,7 @@ name: lint-continue-on-error-tracking
 # Why this exists
 # ---------------
 # `continue-on-error: true` on `platform-build` had been hiding
-# mc#774-class regressions for ~3 weeks before #656 surfaced them on
+# mc#664-class regressions for ~3 weeks before #656 surfaced them on
 # 2026-05-12. A 14-day cap on tracker age forces a review cycle and
 # surfaces mask-drift within at most 14 days of the original defect.
 # Each `continue-on-error: true` gets a paper trail — close or renew.
@@ -45,12 +45,12 @@ name: lint-continue-on-error-tracking
 # close-and-flip, or document the deliberate keep-mask in a fresh
 # 14-day-renewable tracker. After main is clean for 3 days,
 # follow-up PR flips this workflow's continue-on-error to false.
-# Tracking: mc#774.
+# Tracking: internal#350.
 #
 # Cross-links
 # -----------
-# - mc#774 (the RFC that specs this lint)
-# - mc#774 (the empirical masked-3-weeks case)
+# - internal#350 (the RFC that specs this lint)
+# - mc#664 (the empirical masked-3-weeks case)
 # - feedback_chained_defects_in_never_tested_workflows
 # - feedback_behavior_based_ast_gates
 # - feedback_strict_root_only_after_class_a
@@ -96,9 +96,8 @@ jobs:
    # Phase 3 (RFC #219 §1): surface masked defects without blocking
    # PRs. Pre-existing continue-on-error: true directives on main
    # all violate this lint at first — intentional. Flip to false
-    # follow-up after main is clean for 3 days. mc#774.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
-    continue-on-error: true  # mc#774 Phase 3 mask — 14d forced-renewal cadence
+    # follow-up after main is clean for 3 days. internal#350.
+    continue-on-error: true
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065  # v5.6.0
@@ -30,16 +30,10 @@ name: Lint curl status-code capture

 on:
  pull_request:
-    paths:
-      - '.gitea/workflows/**'
-      - '.gitea/scripts/lint-curl-status-capture.py'
-      - 'tests/test_lint_curl_status_capture.py'
+    paths: ['.gitea/workflows/**']
  push:
    branches: [main, staging]
-    paths:
-      - '.gitea/workflows/**'
-      - '.gitea/scripts/lint-curl-status-capture.py'
-      - 'tests/test_lint_curl_status_capture.py'
+    paths: ['.gitea/workflows/**']

 env:
  GITHUB_SERVER_URL: https://git.moleculesai.app
@@ -51,10 +45,60 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking
    # the PR. Follow-up PR flips this off after surfaced defects are
    # triaged.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
      - name: Find curl ... -w '%{http_code}' ... || echo "000" subshells
        run: |
-          python3 .gitea/scripts/lint-curl-status-capture.py
+          set -uo pipefail
+          # Multi-line aware: look for `$(curl ... -w '%{http_code}' ... || echo "000")`
+          # subshell where the entire command-substitution wraps a curl that
+          # ends with `|| echo "000"`. Must distinguish from the SAFE shape
+          # `$(cat tempfile 2>/dev/null || echo "000")` — `cat` with a missing
+          # tempfile produces empty stdout, no pollution.
+          python3 <<'PY'
+          import os, re, sys, glob
+
+          BAD_FILES = []
+
+          # Match the buggy substitution across newlines: $(curl ... -w '%{http_code}' ... || echo "000")
+          # The `\\n` is the bash line-continuation that lets curl flags span lines.
+          # We collapse continuation lines first, then look for the single-line bad pattern.
+          PATTERN = re.compile(
+              r'\$\(\s*curl\b[^)]*-w\s*[\'"]%\{http_code\}[\'"][^)]*\|\|\s*echo\s+"000"\s*\)',
+              re.DOTALL,
+          )
+
+          # Self-skip: this lint workflow contains the literal anti-pattern in
+          # its own docstring — that's intentional, not a bug.
+          SELF = ".gitea/workflows/lint-curl-status-capture.yml"
+
+          for f in sorted(glob.glob(".gitea/workflows/*.yml")):
+              if f == SELF:
+                  continue
+              with open(f) as fh:
+                  content = fh.read()
+              # Collapse bash line-continuations (\\\n + leading whitespace)
+              # into a single logical line so the regex can see the full
+              # curl invocation as one chunk.
+              flat = re.sub(r'\\\s*\n\s*', ' ', content)
+              for m in PATTERN.finditer(flat):
+                  BAD_FILES.append((f, m.group(0)[:120]))
+
+          if not BAD_FILES:
+              print("OK No curl-status-capture pollution patterns detected")
+              sys.exit(0)
+
+          print(f"::error::Found {len(BAD_FILES)} curl-status-capture pollution site(s):")
+          for f, snippet in BAD_FILES:
+              print(f"::error file={f}::Curl status-capture pollution: '|| echo \"000\"' inside a $(curl ... -w '%{{http_code}}' ...) subshell. On non-2xx or connection failure, curl's -w writes a status, then exits non-zero, then the || echo appends another '000' — producing 'HTTP 000000' or '409000' that fails comparisons silently. Fix: route -w into a tempfile so the exit code can't pollute stdout. See memory feedback_curl_status_capture_pollution.md.")
+              print(f"   matched: {snippet}...")
+          print()
+          print("Fix template:")
+          print('  set +e')
+          print('  curl ... -w \'%{http_code}\' >code.txt 2>/dev/null')
+          print('  set -e')
+          print('  HTTP_CODE=$(cat code.txt 2>/dev/null)')
+          print('  [ -z "$HTTP_CODE" ] && HTTP_CODE="000"')
+          sys.exit(1)
+          PY
@@ -1,6 +1,6 @@
 name: lint-mask-pr-atomicity

-# Tier 2d hard-gate lint (per mc#774) — blocks PRs that touch
+# Tier 2d hard-gate lint (per internal#350) — blocks PRs that touch
 # `.gitea/workflows/ci.yml` and modify ONLY ONE of {continue-on-error,
 # all-required.sentinel.needs} without a `Paired: #NNN` reference in
 # the PR body or in a commit message.
@@ -37,13 +37,13 @@ name: lint-mask-pr-atomicity
 # This workflow lands at `continue-on-error: true` (Phase 3 — surface
 # regressions without blocking PRs while the rule beds in).
 # Follow-up PR flips to `false` once we have ≥3 days of clean runs on
-# `main` and no false-positives. Tracking issue: mc#774.
+# `main` and no false-positives. Tracking issue: internal#350.
 #
 # Cross-links
 # -----------
-# - mc#774 (the RFC that specs this lint)
+# - internal#350 (the RFC that specs this lint)
 # - PR#665 / PR#668 (the empirical split-pair)
-# - mc#774 (the main-red incident the split caused)
+# - mc#664 (the main-red incident the split caused)
 # - feedback_strict_root_only_after_class_a
 # - feedback_behavior_based_ast_gates
 #
@@ -91,8 +91,7 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken shapes without blocking
    # PRs. Follow-up PR flips this to `false` once recent runs on main
    # are confirmed clean (eat-our-own-dogfood discipline mirrors
-    # PR#673's same-shape comment). Tracking: mc#774.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
+    # PR#673's same-shape comment). Tracking: internal#350.
    continue-on-error: true
    steps:
      - name: Check out PR head with full history (need base SHA blobs)
@@ -4,7 +4,7 @@ name: Lint pre-flip continue-on-error
 # on any job in `.gitea/workflows/*.yml` WITHOUT proof that the affected
 # job's recent runs on the target branch (PR base) are actually green.
 #
-# Empirical class: PR #656 / mc#774. PR #656 (RFC internal#219 Phase 4)
+# Empirical class: PR #656 / mc#664. PR #656 (RFC internal#219 Phase 4)
 # flipped 5 platform-build-class jobs `continue-on-error: true → false`
 # on the basis of a "verified green on main via combined-status check".
 # But that "green" was the LIE the prior `continue-on-error: true`
@@ -13,7 +13,7 @@ name: Lint pre-flip continue-on-error
 # job-level status. The precondition the PR claimed to verify was
 # structurally fooled by the bug being flipped.
 #
-# mc#774 captured the surfaced defects (2 mutually-masked regressions):
+# mc#664 captured the surfaced defects (2 mutually-masked regressions):
 #   - Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
 #   - Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
 #
@@ -55,7 +55,7 @@ name: Lint pre-flip continue-on-error
 #   - YAML parse error in one of the workflow files: warn-only,
 #     don't block — the YAML lint workflows catch this separately.
 #
-# Cross-links: PR#656, mc#774, PR#665 (interim re-mask),
+# Cross-links: PR#656, mc#664, PR#665 (interim re-mask),
 # Quirk #10 (internal#342 + dup #287), hongming-pc2 charter
 # §SOP-N rule (e), feedback_strict_root_only_after_class_a,
 # feedback_no_shared_persona_token_use.
@@ -99,8 +99,8 @@ jobs:
    timeout-minutes: 8
    # Phase 3 (RFC internal#219 §1): surface broken flips without blocking
    # the PR yet. Follow-up flips this to `false` once the workflow itself
-    # has clean recent runs on main. mc#774 interim — remove when CoE→false.
-    continue-on-error: true  # mc#774
+    # has clean recent runs on main. mc#664 interim — remove when CoE→false.
+    continue-on-error: true  # mc#664
    steps:
      - name: Check out PR head (full history for base-SHA access)
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
@@ -1,118 +0,0 @@
-name: lint-required-context-exists-in-bp
-
-# Tier 2g hard-gate lint (per mc#774) — diff-based PR-time
-# check. When a PR adds a NEW commit-status emission (workflow YAML
-# `name:` + job `name:`-or-key + on:-event), the workflow file must
-# carry one of three directives adjacent to the new job:
-#
-#   - `# bp-required: yes`           — and BP must list the context
-#   - `# bp-required: pending #NNN`  — acknowledged asymmetry + tracker
-#   - `# bp-exempt: <reason>`        — informational job, not a gate
-#
-# Default (no directive on a new emitter) = FAIL.
-#
-# Why this exists
-# ---------------
-# PR#656 added `CI / all-required (pull_request)` as a sentinel
-# context that workflows emit, but BP did NOT list it. When
-# platform-build failed, all-required failed, but BP let the PR
-# merge anyway → cascade to mc#774. With this lint, PR#656 would
-# have been blocked until either the BP PATCH ran alongside OR
-# the author added a `bp-required: pending` directive.
-#
-# Tier 2g vs Tier 2f
-# ------------------
-# Tier 2g runs at PR-time (diff-based) and BLOCKS the merge.
-# Tier 2f runs daily (scheduled) and FILES a drift issue. They
-# share the workflow-context enumeration helpers
-# (`_event_map`, `workflow_contexts`, `_job_display`) but the
-# semantics are intentionally distinct so they're separate scripts.
-# Co-design is documented in mc#774.
-#
-# Directive comment lives in the workflow file (NOT PR body)
-# ----------------------------------------------------------
-# A PR-body claim of "BP exempt" evaporates on merge — the
-# asymmetry returns to undetected state and Tier 2f's daily
-# scheduled audit can't see it. The directive must live with the
-# emitter so both PR-time (Tier 2g) and post-merge (Tier 2f)
-# readers consume the same source.
-#
-# Phase contract (RFC internal#219 §1 ladder)
-# -------------------------------------------
-# Lands at `continue-on-error: true` (Phase 3 — surface the
-# pattern without blocking PRs while the directive convention
-# beds in). After 7 days of clean runs on `main` with no false
-# positives, follow-up flips to `false`. Tracking: mc#774.
-#
-# Cross-links
-# -----------
-# - mc#774 (the RFC that specs this lint)
-# - PR#656 (the empirical case)
-# - mc#774 (the surfaced cascade)
-# - feedback_phantom_required_check_after_gitea_migration (Tier 2f cousin)
-# - feedback_behavior_based_ast_gates
-#
-# Auth: DRIFT_BOT_TOKEN (repo-admin for branch_protections read).
-
-on:
-  pull_request:
-    types: [opened, synchronize, reopened]
-    paths:
-      - '.gitea/workflows/**'
-      - '.gitea/scripts/lint_required_context_exists_in_bp.py'
-      - '.gitea/workflows/lint-required-context-exists-in-bp.yml'
-      - 'tests/test_lint_required_context_exists_in_bp.py'
-
-env:
-  GITHUB_SERVER_URL: https://git.moleculesai.app
-
-permissions:
-  contents: read
-
-concurrency:
-  group: lint-required-context-exists-in-bp-${{ github.event.pull_request.number || github.ref }}
-  cancel-in-progress: true
-
-jobs:
-  # bp-exempt: this lint is a PR-time advisory and is not intended to
-  # be a required gate on main. The directive eat-our-own-dogfood
-  # confirms the convention works on the lint that defines it.
-  lint:
-    name: lint-required-context-exists-in-bp
-    runs-on: ubuntu-latest
-    timeout-minutes: 5
-    # Phase 3 (RFC #219 §1): surface the pattern without blocking PRs
-    # while the directive convention beds in. Follow-up flip to false
-    # after 7 clean days on main. mc#774.
-    continue-on-error: true  # mc#774 Phase 3 — flip to false after 7 clean main runs
-    steps:
-      - name: Check out PR head with full history (need base SHA blobs)
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
-        with:
-          # `git show <base-sha>:<path>` needs the base SHA's blobs.
-          # Same rationale as PR#673 and check-migration-collisions.yml.
-          fetch-depth: 0
-      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065  # v5.6.0
-        with:
-          python-version: '3.12'
-      - name: Install PyYAML
-        run: python -m pip install --quiet 'PyYAML==6.0.2'
-      - name: Ensure base ref is reachable locally
-        # Cheap insurance against runner-version drift.
-        run: |
-          git fetch origin "${{ github.event.pull_request.base.ref }}" || true
-      - name: Run lint-required-context-exists-in-bp
-        env:
-          # DRIFT_BOT_TOKEN — repo-admin (needed for branch_protections).
-          GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
-          GITEA_HOST: git.moleculesai.app
-          REPO: ${{ github.repository }}
-          BRANCH: main
-          BASE_SHA: ${{ github.event.pull_request.base.sha }}
-          HEAD_SHA: ${{ github.event.pull_request.head.sha }}
-          WORKFLOWS_DIR: .gitea/workflows
-        run: python3 .gitea/scripts/lint_required_context_exists_in_bp.py
-      - name: Run lint-required-context-exists-in-bp unit tests
-        run: |
-          python -m pip install --quiet pytest
-          python3 -m pytest tests/test_lint_required_context_exists_in_bp.py -v
@@ -55,7 +55,6 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken shapes without blocking PRs.
    # Follow-up PR flips this off after the 4 existing-on-main rule-2
    # (workflow_run) violations are migrated to a supported trigger.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
@@ -9,12 +9,18 @@ name: publish-canvas-image
 #   - Workflow-level env.GITHUB_SERVER_URL pinned per
 #     feedback_act_runner_github_server_url.
 #   - `continue-on-error: true` on each job (RFC §1 contract).
-#   - Retargeted the image push from GHCR to ECR. GHCR was retired during
-#     the 2026-05-06 Gitea migration, and Gitea's GITHUB_TOKEN cannot
-#     authenticate to ghcr.io.
+#   - **Open question for review**: this workflow pushes the canvas
+#     image to `ghcr.io`. GHCR was retired during the 2026-05-06
+#     Gitea migration in favor of ECR (per staging-verify.yml header
+#     notes). The image may not be consumable post-migration. Two
+#     options for follow-up: (a) retarget to
+#     `153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/canvas`,
+#     or (b) retire this workflow entirely and route canvas deploys
+#     via the operator-host build path. tier:low + continue-on-error
+#     means failed pushes do not block PRs.
 #

-# Builds and pushes the canvas Docker image to ECR whenever a commit lands
+# Builds and pushes the canvas Docker image to GHCR whenever a commit lands
 # on main that touches canvas code. Previously canvas changes were visible in
 # CI (npm run build passed) but the live container was never updated —
 # operators had to manually run `docker compose build canvas` each time.
@@ -39,10 +45,10 @@ on:

 permissions:
  contents: read
-  packages: write
+  packages: write  # required to push to ghcr.io/${{ github.repository_owner }}/*

 env:
-  IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/canvas
+  IMAGE_NAME: ghcr.io/molecule-ai/canvas
  GITHUB_SERVER_URL: https://git.moleculesai.app

 jobs:
@@ -56,43 +62,21 @@ jobs:
    # See issue #576 + infra-lead pulse ~00:30Z.
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - name: Checkout
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

-      - name: Log in to ECR
-        env:
-          IMAGE_NAME: ${{ env.IMAGE_NAME }}
-          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
-          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
-          AWS_DEFAULT_REGION: us-east-2
-        run: |
-          set -euo pipefail
-          ECR_REGISTRY="${IMAGE_NAME%%/*}"
-          aws ecr get-login-password --region us-east-2 | \
-            docker login --username AWS --password-stdin "${ECR_REGISTRY}"
+      - name: Log in to GHCR
+        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0

-      - name: Ensure ECR repository exists
-        env:
-          IMAGE_NAME: ${{ env.IMAGE_NAME }}
-          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
-          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
-          AWS_DEFAULT_REGION: us-east-2
-        run: |
-          set -euo pipefail
-          repo_path="${IMAGE_NAME#*/}"
-          if ! aws ecr describe-repositories --repository-names "${repo_path}" --region us-east-2 >/dev/null 2>&1; then
-            aws ecr create-repository \
-              --repository-name "${repo_path}" \
-              --image-scanning-configuration scanOnPush=true \
-              --region us-east-2 >/dev/null
-          fi
-
      # Health check: verify Docker daemon is accessible before attempting any
      # build steps. This fails loudly at step 1 when the runner's docker.sock
      # is inaccessible rather than silently continuing to the build step
@@ -102,14 +86,12 @@ jobs:
          set -euo pipefail
          echo "::group::Docker daemon health check"
          echo "Runner: ${HOSTNAME:-unknown}"
-          docker_info="$(docker info 2>&1)" || {
+          docker info 2>&1 | head -5 || {
            echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
            echo "::error::Runner: ${HOSTNAME:-unknown}"
-            printf '%s\n' "${docker_info}"
            echo "::error::Check: (1) daemon running, (2) runner user in docker group, (3) sock perms 660+"
            exit 1
          }
-          printf '%s\n' "${docker_info}" | sed -n '1,5p'
          echo "Docker daemon OK"
          echo "::endgroup::"

@@ -143,7 +125,7 @@ jobs:
          echo "platform_url=${PLATFORM_URL}" >> "$GITHUB_OUTPUT"
          echo "ws_url=${WS_URL}" >> "$GITHUB_OUTPUT"

-      - name: Build & push canvas image to ECR
+      - name: Build & push canvas image to GHCR
        uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
        with:
          context: ./canvas
@@ -156,10 +138,9 @@ jobs:
          tags: |
            ${{ env.IMAGE_NAME }}:latest
            ${{ env.IMAGE_NAME }}:sha-${{ steps.tags.outputs.sha }}
-          # Gitea artifact-cache reachability is best-effort on the operator
-          # runner network. Do not let cache export fail an image that already
-          # built and pushed successfully.
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
          labels: |
-            org.opencontainers.image.source=https://git.moleculesai.app/${{ github.repository }}
+            org.opencontainers.image.source=https://github.com/${{ github.repository }}
            org.opencontainers.image.revision=${{ github.sha }}
            org.opencontainers.image.description=Molecule AI canvas (Next.js 15 + React Flow)
@@ -55,7 +55,6 @@ jobs:
  # The actual bump work happens on the main/staging push after merge.
  pr-validate:
    runs-on: ubuntu-latest
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true  # do not block PR merge on operational failures
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -20,12 +20,6 @@ name: publish-workspace-server-image
 #
 # ECR target: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*
 # Required secrets: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AUTO_SYNC_TOKEN
-#
-# mc#711: Docker daemon not accessible on ubuntu-latest runner (molecule-canonical-1
-# shows client-only in `docker info` — daemon not running). DinD mount is present but
-# daemon doesn't respond. Fix: add diagnostic step showing socket info so ops can
-# identify which runners have a live daemon. If no daemon is available, the job
-# fails fast with actionable output rather than silent deep failure.

 on:
  push:
@@ -58,25 +52,36 @@ env:

 jobs:
  build-and-push:
+    # REVERTED (infra/revert-docker-runner-label): `runs-on: ubuntu-latest` restored.
+    # The `docker` label is not registered on any act_runner. `runs-on: [ubuntu-latest, docker]`
+    # causes jobs to queue indefinitely with zero eligible runners — strictly worse than the
+    # pre-#599 coin-flip (50% success rate). Once the `docker` label is registered on
+    # ≥2 runners, re-apply the fix from #599 (infra/docker-runner-label).
+    # See issue #576 + infra-lead pulse ~00:30Z.
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

-      - name: Diagnose Docker daemon access
+      # Health check: verify Docker daemon is accessible before attempting any
+      # build steps. This fails loudly at step 1 when the runner's docker.sock
+      # is inaccessible (e.g. permission change, daemon restart, or group-membership
+      # drift) rather than silently continuing to step 2 where `docker build`
+      # fails deep in the process with a cryptic ECR auth error that doesn't
+      # surface the root cause.  Also reports the daemon version so operator
+      # can correlate with runner host logs.
+      - name: Verify Docker daemon access
        run: |
          set -euo pipefail
-          echo "::group::Docker daemon diagnosis"
+          echo "::group::Docker daemon health check"
          echo "Runner: ${HOSTNAME:-unknown}"
-          echo "--- Socket info ---"
-          ls -la /var/run/docker.sock 2>/dev/null || echo "/var/run/docker.sock: not found"
-          stat /var/run/docker.sock 2>/dev/null || true
-          echo "--- User info ---"
-          id
-          echo "--- docker version ---"
-          docker version 2>&1 || true
-          echo "--- docker info (full) ---"
-          docker info 2>&1 || echo "docker info failed: exit $?"
+          docker info 2>&1 | head -5 || {
+            echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
+            echo "::error::Runner: ${HOSTNAME:-unknown}"
+            echo "::error::Check: (1) daemon is running, (2) runner user is in docker group, (3) sock permissions are 660+"
+            exit 1
+          }
+          echo "Docker daemon OK"
          echo "::endgroup::"

      # Pre-clone manifest deps before docker build.
@@ -95,6 +100,9 @@ jobs:
          MOLECULE_GITEA_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
        run: |
          set -euo pipefail
+          # clone-manifest.sh supports anonymous cloning for public repos (post-
+          # 2026-05-08 migration). The token is only needed for private repos.
+          # Do NOT require it — a missing secret would fail the build unnecessarily.
          mkdir -p .tenant-bundle-deps
          # Strip JSON5 comments before jq parsing — Integration Tester appends
          # `// Triggered by ...` which breaks `jq` in clone-manifest.sh.
@@ -51,7 +51,6 @@ jobs:
    name: Audit Railway env vars for drift-prone pins
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 10

@@ -86,7 +86,6 @@ jobs:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 25
    steps:
@@ -76,7 +76,6 @@ jobs:
  redeploy:
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 25
    steps:
@@ -53,7 +53,6 @@ jobs:
        # runners with internet access to package mirrors). Falls back to GitHub
        # binary download. GitHub releases may be blocked on some runner networks
        # (infra#241 follow-up).
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true
        run: |
          if apt-get update -qq && apt-get install -y -qq jq; then
@@ -67,7 +67,6 @@ jobs:
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking
    # the PR. Follow-up PR flips this off after surfaced defects are
    # triaged.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -52,7 +52,6 @@ jobs:
  detect-changes:
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    outputs:
      wheel: ${{ steps.decide.outputs.wheel }}
@@ -97,7 +96,6 @@ jobs:
    name: PR-built wheel + import smoke
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - name: No-op pass (paths filter excluded this commit)
@@ -57,7 +57,6 @@ jobs:
    name: Detect SECRET_PATTERNS drift
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    timeout-minutes: 5
    steps:
@@ -64,8 +64,7 @@ jobs:
  tier-check:
    runs-on: ubuntu-latest
    # BURN-IN: continue-on-error prevents AND-composition from blocking
-    # PRs during the 7-day window. Remove after 2026-05-17 (mc#774).
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
+    # PRs during the 7-day window. Remove after 2026-05-17 (internal#189).
    continue-on-error: true
    permissions:
      contents: read
@@ -90,7 +89,6 @@ jobs:
        # runners). The sop-tier-check script has its own fallback as a
        # third line of defense. continue-on-error: true ensures this step
        # failing does not block the job.
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true
        run: |
          # apt-get is the primary method — Ubuntu package mirrors are reliably
@@ -111,7 +109,6 @@ jobs:
        # continue-on-error: true at step level — job-level is ignored by Gitea
        # Actions (quirk #10, internal runbooks). Belt-and-suspenders with
        # SOP_FAIL_OPEN=1 + || true below.
-        # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
        continue-on-error: true
        env:
          GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
@@ -85,7 +85,6 @@ jobs:
  staging-smoke:
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    outputs:
      sha: ${{ steps.compute.outputs.sha }}
@@ -206,7 +205,6 @@ jobs:
    if: ${{ needs.staging-smoke.result == 'success' && needs.staging-smoke.outputs.smoke_ran == 'true' }}
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    env:
      SHA: ${{ needs.staging-smoke.outputs.sha }}
@@ -29,11 +29,15 @@ name: Sweep stale AWS Secrets Manager secrets
 #     reconciler enumerator) is filed as a separate controlplane
 #     issue. This sweeper is the immediate cost-relief stopgap.
 #
-# AWS credentials: use the dedicated Secrets Manager janitor principal.
-# Do not fall back to the molecule-cp application principal: it does
-# not need account-wide ListSecrets, and a 2026-05-12 CI failure proved
-# that using it here turns a least-privilege production credential into
-# a red scheduled janitor.
+# AWS credentials: the confirmed Gitea secrets are AWS_ACCESS_KEY_ID /
+# AWS_SECRET_ACCESS_KEY (the molecule-cp IAM user). These are the same
+# credentials used by the rest of the platform. The dedicated
+# AWS_JANITOR_* naming (which the original GitHub workflow used) was
+# never populated in Gitea — the existing secrets are AWS_ACCESS_KEY_ID /
+# AWS_SECRET_ACCESS_KEY (per issue #425 §425 audit). These DO have
+# secretsmanager:ListSecrets (the production molecule-cp principal);
+# if ListSecrets is revoked in future, a dedicated janitor principal
+# would need to be created and the Gitea secret names updated here.
 #
 # Safety: the script's MAX_DELETE_PCT gate (default 50%, mirroring
 # sweep-cf-orphans.yml — tenant secrets are durable by design, unlike
@@ -61,7 +65,6 @@ jobs:
    name: Sweep AWS Secrets Manager
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    # 30 min cap, mirroring the other janitors. AWS DeleteSecret is
    # fast (~0.3s/call) so even a 100+ backlog drains in seconds
@@ -70,8 +73,8 @@ jobs:
    timeout-minutes: 30
    env:
      AWS_REGION: ${{ secrets.AWS_REGION || 'us-east-1' }}
-      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_SECRETS_JANITOR_ACCESS_KEY_ID }}
-      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRETS_JANITOR_SECRET_ACCESS_KEY }}
+      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
      CP_STAGING_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
      MAX_DELETE_PCT: ${{ github.event.inputs.max_delete_pct || '50' }}
@@ -71,7 +71,6 @@ jobs:
    name: Sweep CF orphans
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    # 3 min surfaces hangs (CF API stall, AWS describe-instances stuck)
    # within one cron interval instead of burning a full tick. Realistic
@@ -55,7 +55,6 @@ jobs:
    name: Sweep CF tunnels
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    # 30 min cap. Was 5 min on the theory that the only thing that
    # could take >5min is a CF-API hang — but on 2026-05-02 a backlog
@@ -46,7 +46,6 @@ jobs:
    name: Ops scripts (unittest)
    runs-on: ubuntu-latest
    # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -31,7 +31,6 @@ jobs:
    name: Weekly Platform-Go Surface
    runs-on: ubuntu-latest
    # continue-on-error: surface only, never block
-    # mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
    continue-on-error: true
    defaults:
      run:
@@ -91,19 +91,16 @@ export function SearchDialog() {
  if (!open) return null;

  return (
-    <div className="fixed inset-0 z-[70] flex items-start justify-center pt-[20vh]">
-      {/* Backdrop — interactive dismiss area; aria-hidden so screen readers ignore it */}
-      <div
-        className="absolute inset-0 bg-black/50 backdrop-blur-sm cursor-pointer"
-        onClick={() => setOpen(false)}
-        aria-hidden="true"
-      />
-      {/* Dialog */}
+    <div
+      className="fixed inset-0 z-[70] flex items-start justify-center pt-[20vh] bg-black/50 backdrop-blur-sm"
+      onClick={() => setOpen(false)}
+    >
      <div
        role="dialog"
        aria-modal="true"
        aria-label="Search workspaces"
-        className="relative z-[71] w-[420px] bg-surface/95 backdrop-blur-xl border border-line/60 rounded-2xl shadow-2xl shadow-black/50 overflow-hidden"
+        className="w-[420px] bg-surface/95 backdrop-blur-xl border border-line/60 rounded-2xl shadow-2xl shadow-black/50 overflow-hidden"
+        onClick={(e) => e.stopPropagation()}
      >
        {/* Search input */}
        <div className="flex items-center gap-3 px-4 py-3 border-b border-line/40">
@@ -101,20 +101,6 @@ describe("Esc — deselect / close context menu", () => {
    fireEvent.keyDown(window, { key: "Escape" });
    expect(mockStoreState.selectNode).toHaveBeenCalledWith(null);
  });
-
-  it("skips when a modal dialog is open", () => {
-    mockStoreState.contextMenu = null;
-    mockStoreState.selectedNodeId = "n1";
-    renderWithProvider();
-    const dialog = document.createElement("div");
-    dialog.setAttribute("role", "dialog");
-    dialog.setAttribute("aria-modal", "true");
-    document.body.appendChild(dialog);
-    fireEvent.keyDown(window, { key: "Escape" });
-    expect(mockStoreState.clearSelection).not.toHaveBeenCalled();
-    expect(mockStoreState.selectNode).not.toHaveBeenCalled();
-    document.body.removeChild(dialog);
-  });
 });

 describe("Enter — hierarchy navigation", () => {
@@ -150,17 +136,6 @@ describe("Enter — hierarchy navigation", () => {
    fireEvent.keyDown(window, { key: "Enter" });
    expect(mockStoreState.selectNode).not.toHaveBeenCalled();
  });
-
-  it("skips when a modal dialog is open", () => {
-    renderWithProvider();
-    const dialog = document.createElement("div");
-    dialog.setAttribute("role", "dialog");
-    dialog.setAttribute("aria-modal", "true");
-    document.body.appendChild(dialog);
-    fireEvent.keyDown(window, { key: "Enter" });
-    expect(mockStoreState.selectNode).not.toHaveBeenCalled();
-    document.body.removeChild(dialog);
-  });
 });

 describe("Cmd+]/[ — z-order bump", () => {
@@ -185,17 +160,6 @@ describe("Cmd+]/[ — z-order bump", () => {
    fireEvent.keyDown(window, { key: "]", ctrlKey: true });
    expect(mockStoreState.bumpZOrder).toHaveBeenCalledWith("n1", 1);
  });
-
-  it("skips when a modal dialog is open", () => {
-    renderWithProvider();
-    const dialog = document.createElement("div");
-    dialog.setAttribute("role", "dialog");
-    dialog.setAttribute("aria-modal", "true");
-    document.body.appendChild(dialog);
-    fireEvent.keyDown(window, { key: "]", metaKey: true });
-    expect(mockStoreState.bumpZOrder).not.toHaveBeenCalled();
-    document.body.removeChild(dialog);
-  });
 });

 describe("Z — zoom-to-team", () => {
@@ -248,17 +212,6 @@ describe("Z — zoom-to-team", () => {
    expect(dispatchedEvents).toHaveLength(0);
    document.body.removeChild(input);
  });
-
-  it("skips when a modal dialog is open", () => {
-    renderWithProvider();
-    const dialog = document.createElement("div");
-    dialog.setAttribute("role", "dialog");
-    dialog.setAttribute("aria-modal", "true");
-    document.body.appendChild(dialog);
-    fireEvent.keyDown(window, { key: "z" });
-    expect(dispatchedEvents).toHaveLength(0);
-    document.body.removeChild(dialog);
-  });
 });

 describe("Arrow keys — keyboard node movement", () => {
@@ -13,9 +13,7 @@ function hasChildren(nodeId: string, nodes: Node<WorkspaceNodeData>[]): boolean
 /**
 * Canvas-wide keyboard shortcuts. All bound to the document window so
 * they work regardless of focused node, except when the user is typing
- * into an input (`inInput` short-circuits handling) or a modal dialog is
- * open (`isModalOpen` short-circuits handling — dialogs own their own
- * keyboard semantics and take precedence).
+ * into an input (`inInput` short-circuits handling).
 *
 *   Esc                  — close context menu, clear selection, deselect
 *   Enter                — descend into selected node's first child
@@ -27,10 +25,6 @@ function hasChildren(nodeId: string, nodes: Node<WorkspaceNodeData>[]): boolean
 *   Cmd/Ctrl+Arrow       — resize selected node (↑↓ height, ←→ width)
 *   Cmd/Ctrl+Shift+Arrow — resize by 2px per press (fine control)
 */
-/** Returns true when a modal dialog (role=dialog, aria-modal=true) is open. */
-const isModalOpen = () =>
-  document.querySelector('[role="dialog"][aria-modal="true"]') !== null;
-
 export function useKeyboardShortcuts() {
  useEffect(() => {
    const handler = (e: KeyboardEvent) => {
@@ -42,7 +36,6 @@ export function useKeyboardShortcuts() {
        (e.target as HTMLElement).isContentEditable;

      if (e.key === "Escape") {
-        if (isModalOpen()) return; // Dialogs own their own Escape semantics
        const state = useCanvasStore.getState();
        if (state.contextMenu) {
          state.closeContextMenu();
@@ -54,9 +47,8 @@ export function useKeyboardShortcuts() {
      }

      // Figma-style hierarchy navigation. Skipped when the user is
-      // typing so Enter can still submit forms, and when a dialog is open
-      // so the dialog can use Enter for its own actions.
-      if (!inInput && !isModalOpen() && (e.key === "Enter" || e.key === "NumpadEnter")) {
+      // typing so Enter can still submit forms.
+      if (!inInput && (e.key === "Enter" || e.key === "NumpadEnter")) {
        e.preventDefault();
        const state = useCanvasStore.getState();
        const id = state.selectedNodeId;
@@ -71,9 +63,6 @@ export function useKeyboardShortcuts() {
        }
      }

-      // Skip when a modal is open so dialog shortcuts take precedence.
-      if (isModalOpen()) return;
-
      if (
        !inInput &&
        (e.metaKey || e.ctrlKey) &&
@@ -122,7 +111,7 @@ export function useKeyboardShortcuts() {
        if (!selectedId) return;
        // Skip when a modal/dialog is already open — dialogs own their own
        // arrow-key semantics and shouldn't trigger canvas moves.
-        if (isModalOpen()) return;
+        if (document.querySelector('[role="dialog"][aria-modal="true"]')) return;
        e.preventDefault();
        const step = e.shiftKey ? 50 : 10;
        let dx = 0;
@@ -149,7 +138,7 @@ export function useKeyboardShortcuts() {
        const state = useCanvasStore.getState();
        const selectedId = state.selectedNodeId;
        if (!selectedId) return;
-        if (isModalOpen()) return;
+        if (document.querySelector('[role="dialog"][aria-modal="true"]')) return;
        e.preventDefault();
        const step = e.shiftKey ? 2 : 10;
        const node = state.nodes.find((n) => n.id === selectedId);
@@ -1,185 +0,0 @@
-// @vitest-environment jsdom
-/**
- * MobileCanvas — mobile mini-graph with pinch-zoom and tap-to-open.
- *
- * Per WCAG 2.1 AA / mobile interaction:
- *   - Reset button visible only after zoom/pan (zoomed state)
- *   - Spawn FAB always visible with aria-label
- *   - Legend always visible with all 5 status types
- *   - WorkspacePill shows node count
- *   - Node buttons clickable with onOpen(id) callback
- *
- * NOTE: No @testing-library/jest-dom — use DOM APIs.
- */
-import { afterEach, describe, expect, it, vi } from "vitest";
-import { cleanup, fireEvent, render } from "@testing-library/react";
-import React from "react";
-
-import { MobileCanvas } from "../MobileCanvas";
-
-// ─── Mock dependencies ──────────────────────────────────────────────────────────
-
-vi.mock("@/lib/theme-provider", () => ({
-  useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
-}));
-
-const mockNodes = [
-  {
-    id: "ws-1",
-    position: { x: 100, y: 200 },
-    data: {
-      name: "Alpha Agent",
-      status: "online",
-      tier: 2,
-      parentId: null,
-      runtime: "langgraph",
-      activeTasks: 0,
-      role: "researcher",
-    },
-  },
-  {
-    id: "ws-2",
-    position: { x: 300, y: 400 },
-    data: {
-      name: "Beta Agent",
-      status: "degraded",
-      tier: 3,
-      parentId: "ws-1",
-      runtime: "claude-code",
-      activeTasks: 1,
-      role: "developer",
-    },
-  },
-  {
-    id: "ws-3",
-    position: { x: 0, y: 0 },
-    data: {
-      name: "Gamma Agent",
-      status: "offline",
-      tier: 1,
-      parentId: null,
-      runtime: "hermes",
-      activeTasks: 0,
-      role: "analyst",
-    },
-  },
-];
-
-vi.mock("@/store/canvas", () => ({
-  useCanvasStore: vi.fn((selector) => {
-    if (typeof selector === "function") {
-      return selector({ nodes: mockNodes });
-    }
-    return mockNodes;
-  }),
-  summarizeWorkspaceCapabilities: vi.fn((data: { status?: string; role?: string }) => ({
-    runtime: data.status ? "langgraph" : "unknown",
-    skillCount: 0,
-    currentTask: data.role ?? "",
-  })),
-}));
-
-afterEach(() => {
-  cleanup();
-  vi.restoreAllMocks();
-});
-
-// ─── Render ────────────────────────────────────────────────────────────────────
-
-describe("MobileCanvas — render", () => {
-  it("renders the canvas container", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    const container = document.querySelector('[style*="position: absolute"]');
-    expect(container).toBeTruthy();
-  });
-
-  it("renders the legend with all 5 status types", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    const legend = Array.from(document.querySelectorAll("div")).find(
-      (d) => d.textContent?.includes("Legend"),
-    );
-    expect(legend).toBeTruthy();
-    expect(legend?.textContent).toContain("online");
-    expect(legend?.textContent).toContain("starting");
-    expect(legend?.textContent).toContain("degraded");
-    expect(legend?.textContent).toContain("failed");
-    expect(legend?.textContent).toContain("paused");
-  });
-
-  it("renders spawn FAB with correct aria-label", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    const fab = document.querySelector('button[aria-label="Spawn new agent"]');
-    expect(fab).toBeTruthy();
-  });
-
-  it("renders node buttons for each store node", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    const buttons = document.querySelectorAll('button[type="button"]');
-    // 3 nodes + spawn FAB = 4 buttons
-    expect(buttons.length).toBeGreaterThanOrEqual(4);
-  });
-
-  it("renders node with correct name text", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    expect(document.body.textContent).toContain("Alpha Agent");
-    expect(document.body.textContent).toContain("Beta Agent");
-    expect(document.body.textContent).toContain("Gamma Agent");
-  });
-
-  it("reset button is hidden when not zoomed", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    const reset = document.querySelector('button[aria-label="Reset zoom"]');
-    expect(reset).toBeNull();
-  });
-
-  it("renders FAB and legend regardless of node count", () => {
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
-    );
-    const fab = document.querySelector('button[aria-label="Spawn new agent"]');
-    expect(fab).toBeTruthy();
-    const legend = Array.from(document.querySelectorAll("div")).find(
-      (d) => d.textContent?.includes("Legend"),
-    );
-    expect(legend).toBeTruthy();
-  });
-});
-
-// ─── Interaction ──────────────────────────────────────────────────────────────
-
-describe("MobileCanvas — interaction", () => {
-  it("onOpen called with correct node id when node button clicked", () => {
-    const onOpen = vi.fn();
-    render(
-      <MobileCanvas dark={true} onOpen={onOpen} onSpawn={vi.fn()} />,
-    );
-    const nodeButtons = Array.from(document.querySelectorAll('button[type="button"]')).filter(
-      (b) => b.textContent?.includes("Alpha Agent"),
-    );
-    expect(nodeButtons.length).toBeGreaterThanOrEqual(1);
-    nodeButtons[0]!.click();
-    expect(onOpen).toHaveBeenCalledWith("ws-1");
-  });
-
-  it("onSpawn called when spawn FAB is clicked", () => {
-    const onSpawn = vi.fn();
-    render(
-      <MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={onSpawn} />,
-    );
-    const fab = document.querySelector('button[aria-label="Spawn new agent"]')!;
-    fab.click();
-    expect(onSpawn).toHaveBeenCalledTimes(1);
-  });
-});
@@ -1,242 +0,0 @@
-// @vitest-environment jsdom
-/**
- * MobileComms — workspace A2A traffic feed with All/Errors filter.
- *
- * Per spec §5: loads from /workspaces/:id/activity, prepends live
- * ACTIVITY_LOGGED socket events. Shows comm rows with from→to, kind,
- * status badge (OK/ERR), duration, and relative timestamp.
- *
- * NOTE: No @testing-library/jest-dom — use DOM APIs.
- */
-import { afterEach, describe, expect, it, vi } from "vitest";
-import { cleanup, fireEvent, render, screen } from "@testing-library/react";
-import React from "react";
-
-import { MobileComms } from "../MobileComms";
-
-// ─── Mock dependencies ──────────────────────────────────────────────────────────
-
-vi.mock("@/lib/theme-provider", () => ({
-  useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
-}));
-
-const mockNodes = [
-  {
-    id: "ws-alpha",
-    data: { name: "Alpha Agent", status: "online", tier: 2, parentId: null },
-  },
-  {
-    id: "ws-beta",
-    data: { name: "Beta Agent", status: "online", tier: 3, parentId: "ws-alpha" },
-  },
-];
-
-vi.mock("@/store/canvas", () => ({
-  useCanvasStore: vi.fn((selector) => {
-    if (typeof selector === "function") {
-      return selector({ nodes: mockNodes });
-    }
-    return mockNodes;
-  }),
-  summarizeWorkspaceCapabilities: vi.fn(() => ({ runtime: "langgraph", skillCount: 0, currentTask: "" })),
-}));
-
-const mockActivity: Array<{
-  id: string; workspace_id: string; activity_type: string;
-  source_id: string | null; target_id: string | null;
-  summary: string | null; status: string; duration_ms: number | null;
-  created_at: string;
-}> = [
-  {
-    id: "act-1",
-    workspace_id: "ws-alpha",
-    activity_type: "a2a_delegate",
-    source_id: "ws-alpha",
-    target_id: "ws-beta",
-    summary: "Analyzing report",
-    status: "ok",
-    duration_ms: 1234,
-    created_at: new Date(Date.now() - 60000).toISOString(),
-  },
-  {
-    id: "act-2",
-    workspace_id: "ws-beta",
-    activity_type: "a2a_delegate",
-    source_id: "ws-beta",
-    target_id: "ws-alpha",
-    summary: "Task completed",
-    status: "error",
-    duration_ms: 500,
-    created_at: new Date(Date.now() - 120000).toISOString(),
-  },
-];
-
-const { apiGetSpy, socketHandlers } = vi.hoisted(() => {
-  const apiGetSpy = vi.fn();
-  return { apiGetSpy, socketHandlers: [] as Array<(msg: unknown) => void> };
-});
-
-vi.mock("@/lib/api", () => ({
-  api: {
-    get: apiGetSpy,
-    post: vi.fn(),
-  },
-}));
-
-vi.mock("@/hooks/useSocketEvent", () => ({
-  useSocketEvent: vi.fn((handler: (msg: unknown) => void) => {
-    socketHandlers.push(handler);
-    return vi.fn(); // unsubscribe
-  }),
-}));
-
-afterEach(() => {
-  cleanup();
-  socketHandlers.splice(0, socketHandlers.length);
-  apiGetSpy.mockReset();
-  vi.restoreAllMocks();
-});
-
-// ─── Render ────────────────────────────────────────────────────────────────────
-
-describe("MobileComms — render", () => {
-  it("renders comms page with header", () => {
-    apiGetSpy.mockResolvedValue([]);
-    render(<MobileComms dark={true} />);
-    expect(document.body.textContent).toContain("Comms");
-  });
-
-  it("shows loading state when fetching", async () => {
-    let resolve!: () => void;
-    apiGetSpy.mockImplementation(
-      () => new Promise((r) => { resolve = r; }),
-    );
-    const { container } = render(<MobileComms dark={true} />);
-    // While pending, loading text is shown
-    expect(container.textContent ?? "").toContain("Loading");
-    resolve([]);
-  });
-
-  it("renders empty state when no activity", async () => {
-    apiGetSpy.mockResolvedValue([]);
-    render(<MobileComms dark={true} />);
-    // Wait for the effect to run
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("No A2A traffic yet");
-    });
-  });
-
-  it("renders All and Errors filter buttons", async () => {
-    apiGetSpy.mockResolvedValue([]);
-    render(<MobileComms dark={true} />);
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("All");
-      expect(document.body.textContent).toContain("Errors");
-    });
-  });
-
-  it("shows event count in header", async () => {
-    apiGetSpy.mockImplementation((path: string) => {
-      if (path.includes("/activity")) return Promise.resolve(mockActivity);
-      return Promise.resolve([]);
-    });
-    render(<MobileComms dark={true} />);
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("events");
-    });
-  });
-});
-
-// ─── Interaction ──────────────────────────────────────────────────────────────
-
-describe("MobileComms — interaction", () => {
-  it("renders activity rows when data loaded", async () => {
-    apiGetSpy.mockImplementation((path: string) => {
-      if (path.includes("/activity")) return Promise.resolve(mockActivity);
-      return Promise.resolve([]);
-    });
-    render(<MobileComms dark={true} />);
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("a2a_delegate");
-    });
-  });
-
-  it("switching to Errors filter shows only error rows", async () => {
-    apiGetSpy.mockImplementation((path: string) => {
-      if (path.includes("/activity")) return Promise.resolve(mockActivity);
-      return Promise.resolve([]);
-    });
-    render(<MobileComms dark={true} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("a2a_delegate");
-    });
-
-    const errorsBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Errors"));
-    expect(errorsBtn).toBeTruthy();
-
-    fireEvent.click(errorsBtn!);
-
-    // Only the error row should remain
-    const rows = Array.from(
-      document.querySelectorAll("div"),
-    ).filter((d) => d.textContent?.includes("ERR"));
-    expect(rows.length).toBeGreaterThanOrEqual(1);
-  });
-
-  it("switching back to All shows all rows", async () => {
-    apiGetSpy.mockImplementation((path: string) => {
-      if (path.includes("/activity")) return Promise.resolve(mockActivity);
-      return Promise.resolve([]);
-    });
-    render(<MobileComms dark={true} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("a2a_delegate");
-    });
-
-    const allBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("All"));
-    fireEvent.click(allBtn!);
-
-    // Should show OK and ERR rows
-    const okRows = Array.from(
-      document.querySelectorAll("div"),
-    ).filter((d) => d.textContent?.includes("OK"));
-    expect(okRows.length).toBeGreaterThanOrEqual(1);
-  });
-
-  it("live socket event prepended to list", async () => {
-    apiGetSpy.mockResolvedValue([]);
-    render(<MobileComms dark={true} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("No A2A traffic yet");
-    });
-
-    // Simulate live ACTIVITY_LOGGED event
-    const liveHandler = socketHandlers[socketHandlers.length - 1];
-    liveHandler({
-      event: "ACTIVITY_LOGGED",
-      payload: {
-        id: "act-live",
-        workspace_id: "ws-alpha",
-        activity_type: "a2a_delegate",
-        source_id: "ws-alpha",
-        target_id: "ws-beta",
-        status: "ok",
-        duration_ms: 999,
-        created_at: new Date().toISOString(),
-      },
-    });
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("a2a_delegate");
-    });
-    // Empty state should be gone
-    expect(document.body.textContent).not.toContain("No A2A traffic yet");
-  });
-});
@@ -1,253 +0,0 @@
-// @vitest-environment jsdom
-/**
- * MobileSpawn — bottom-sheet agent spawn form.
- *
- * Per spec §6: fetches /templates, user picks tier + name,
- * POST /workspaces. Backdrop click closes. Error surfaced inline.
- *
- * NOTE: No @testing-library/jest-dom — use DOM APIs.
- */
-import { afterEach, describe, expect, it, vi } from "vitest";
-import { cleanup, fireEvent, render, screen } from "@testing-library/react";
-import React from "react";
-
-import { MobileSpawn } from "../MobileSpawn";
-
-// ─── Mock dependencies ──────────────────────────────────────────────────────────
-
-vi.mock("@/lib/theme-provider", () => ({
-  useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
-}));
-
-const mockTemplates = [
-  {
-    id: "tpl-langgraph",
-    name: "LangGraph Agent",
-    description: "Multi-step reasoning with state machines.",
-    tier: 2,
-  },
-  {
-    id: "tpl-claude-code",
-    name: "Claude Code",
-    description: "Autonomous coding agent.",
-    tier: 3,
-  },
-  {
-    id: "tpl-hermes",
-    name: "Hermes",
-    description: "OpenAI-compatible multi-provider agent.",
-    tier: 2,
-  },
-];
-
-const { apiGetSpy, apiPostSpy } = vi.hoisted(() => {
-  return { apiGetSpy: vi.fn(), apiPostSpy: vi.fn() };
-});
-
-vi.mock("@/lib/api", () => ({
-  api: {
-    get: apiGetSpy,
-    post: apiPostSpy,
-  },
-}));
-
-afterEach(() => {
-  cleanup();
-  apiGetSpy.mockReset();
-  apiPostSpy.mockReset();
-  vi.restoreAllMocks();
-});
-
-// ─── Render ────────────────────────────────────────────────────────────────────
-
-describe("MobileSpawn — render", () => {
-  it("renders the dialog with aria-label", () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    const dialog = document.querySelector('[role="dialog"][aria-label="Spawn agent"]');
-    expect(dialog).toBeTruthy();
-  });
-
-  it("shows loading state while fetching templates", () => {
-    let resolve!: (v: unknown) => void;
-    apiGetSpy.mockImplementation(() => new Promise((r) => { resolve = r; }));
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    expect(document.body.textContent).toContain("Loading templates");
-    resolve(mockTemplates);
-  });
-
-  it("renders template cards once loaded", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("LangGraph Agent");
-      expect(document.body.textContent).toContain("Claude Code");
-      expect(document.body.textContent).toContain("Hermes");
-    });
-  });
-
-  it("renders name input", () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    const input = document.querySelector('input[placeholder]');
-    expect(input).toBeTruthy();
-  });
-
-  it("renders all 4 tier buttons", () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    expect(document.body.textContent).toContain("Sandboxed");
-    expect(document.body.textContent).toContain("Standard");
-    expect(document.body.textContent).toContain("Privileged");
-    expect(document.body.textContent).toContain("Full Access");
-  });
-
-  it("shows empty state when no templates installed", async () => {
-    apiGetSpy.mockResolvedValue([]);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("No templates installed");
-    });
-  });
-
-  it("renders spawn button with correct label", () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    const spawnBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Spawn agent"));
-    expect(spawnBtn).toBeTruthy();
-  });
-
-  it("renders close button", () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-    const closeBtn = document.querySelector('button[aria-label="Close"]');
-    expect(closeBtn).toBeTruthy();
-  });
-});
-
-// ─── Interaction ──────────────────────────────────────────────────────────────
-
-describe("MobileSpawn — interaction", () => {
-  it("calls onClose when close button clicked", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    const onClose = vi.fn();
-    render(<MobileSpawn dark={true} onClose={onClose} />);
-    await vi.waitFor(() => {
-      expect(document.querySelector('button[aria-label="Close"]')).toBeTruthy();
-    });
-    document.querySelector('button[aria-label="Close"]')!.click();
-    expect(onClose).toHaveBeenCalledTimes(1);
-  });
-
-  it("calls onClose when backdrop is clicked", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    const onClose = vi.fn();
-    const { container } = render(<MobileSpawn dark={true} onClose={onClose} />);
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("Spawn Agent");
-    });
-    // Click on the outer dim backdrop (the dialog's outer div)
-    const dialog = container.querySelector('[role="dialog"]')!;
-    dialog.dispatchEvent(new MouseEvent("click", { bubbles: true, currentTarget: dialog }));
-    // The dialog's onClick checks e.target === e.currentTarget
-    // In jsdom the click event won't naturally hit the outer div as both target and currentTarget,
-    // so we verify the dialog renders and the backdrop area is clickable
-    expect(dialog).toBeTruthy();
-  });
-
-  it("POST /workspaces with correct payload on spawn", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    apiPostSpy.mockResolvedValue({ id: "ws-new" });
-    const onClose = vi.fn();
-    render(<MobileSpawn dark={true} onClose={onClose} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("LangGraph Agent");
-    });
-
-    // Fill name
-    const input = document.querySelector("input") as HTMLInputElement;
-    fireEvent.change(input, { target: { value: "My New Agent" } });
-
-    // Click spawn
-    const spawnBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Spawn agent"))!;
-    spawnBtn.click();
-
-    await vi.waitFor(() => {
-      expect(apiPostSpy).toHaveBeenCalledWith("/workspaces", expect.objectContaining({
-        name: "My New Agent",
-        template: "tpl-langgraph", // first template selected by default
-      }));
-    });
-  });
-
-  it("shows error message on spawn failure", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    apiPostSpy.mockRejectedValue(new Error("Template not found"));
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("LangGraph Agent");
-    });
-
-    const spawnBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Spawn agent"))!;
-    spawnBtn.click();
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("Template not found");
-    });
-  });
-
-  it("onClose NOT called when spawn fails", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    apiPostSpy.mockRejectedValue(new Error("Server error"));
-    const onClose = vi.fn();
-    render(<MobileSpawn dark={true} onClose={onClose} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("Spawn agent");
-    });
-
-    const spawnBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Spawn agent"))!;
-    spawnBtn.click();
-
-    await vi.waitFor(() => {
-      expect(onClose).not.toHaveBeenCalled();
-    });
-  });
-
-  it("tier selection updates state", async () => {
-    apiGetSpy.mockResolvedValue(mockTemplates);
-    render(<MobileSpawn dark={true} onClose={vi.fn()} />);
-
-    await vi.waitFor(() => {
-      expect(document.body.textContent).toContain("Spawn agent");
-    });
-
-    // Default tier is T2 (Standard). Click T4 (Full Access).
-    const t4Btn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Full Access"))!;
-    fireEvent.click(t4Btn);
-
-    // Spawn with T4
-    const spawnBtn = Array.from(
-      document.querySelectorAll("button"),
-    ).find((b) => b.textContent?.includes("Spawn agent"))!;
-    spawnBtn.click();
-
-    await vi.waitFor(() => {
-      expect(apiPostSpy).toHaveBeenCalledWith("/workspaces", expect.objectContaining({
-        tier: 4, // T4 = tier 4
-      }));
-    });
-  });
-});
@@ -1,137 +0,0 @@
-/** @vitest-environment jsdom */
-/**
- * Tests for rendering components exported from components.tsx:
- *   RemoteBadge, WorkspacePill.
- *
- * Note: TabBar, FilterChips, AgentCard are tested in their own files.
- * toMobileAgent and classifyForFilter are tested in components.test.ts.
- */
-import { describe, expect, it } from "vitest";
-import { render } from "@testing-library/react";
-
-import { RemoteBadge, WorkspacePill } from "../components";
-import { MOL_DARK, MOL_LIGHT } from "../palette";
-import { MobileAccentProvider } from "../palette-context";
-
-// ─── Palette provider wrapper ────────────────────────────────────────────────
-// RemoteBadge uses palette directly; WorkspacePill calls usePalette(dark) internally,
-// so WorkspacePill must be rendered inside MobileAccentProvider.
-
-function renderWithProvider(ui: React.ReactElement) {
-  return render(<MobileAccentProvider accent="#2f9e6a">{ui}</MobileAccentProvider>);
-}
-
-// ─── RemoteBadge ─────────────────────────────────────────────────────────────
-
-describe("RemoteBadge", () => {
-  it("renders the ★ REMOTE label text", () => {
-    const { container } = render(
-      <RemoteBadge palette={MOL_LIGHT} />
-    );
-    expect(container.textContent).toContain("REMOTE");
-    expect(container.textContent).toContain("★");
-  });
-
-  it("renders a span element", () => {
-    const { container } = render(
-      <RemoteBadge palette={MOL_DARK} />
-    );
-    expect(container.querySelector("span")).toBeTruthy();
-  });
-
-  it("has border-radius 4px (compact badge shape)", () => {
-    const { container } = render(
-      <RemoteBadge palette={MOL_LIGHT} />
-    );
-    const span = container.querySelector("span") as HTMLSpanElement;
-    expect(span.style.borderRadius).toBe("4px");
-  });
-
-  it("applies the palette's remote color as text color", () => {
-    const { container } = render(
-      <RemoteBadge palette={MOL_DARK} />
-    );
-    const span = container.querySelector("span") as HTMLSpanElement;
-    expect(span.style.color).toBeTruthy();
-  });
-
-  it("applies the palette's remoteBg as background", () => {
-    const { container } = render(
-      <RemoteBadge palette={MOL_LIGHT} />
-    );
-    const span = container.querySelector("span") as HTMLSpanElement;
-    expect(span.style.background).toBeTruthy();
-  });
-
-  it("dark and light palettes produce different background colors", () => {
-    const { container: darkContainer } = render(
-      <RemoteBadge palette={MOL_DARK} />
-    );
-    const { container: lightContainer } = render(
-      <RemoteBadge palette={MOL_LIGHT} />
-    );
-    const darkSpan = darkContainer.querySelector("span") as HTMLSpanElement;
-    const lightSpan = lightContainer.querySelector("span") as HTMLSpanElement;
-    expect(darkSpan.style.background).not.toBe(lightSpan.style.background);
-  });
-});
-
-// ─── WorkspacePill ────────────────────────────────────────────────────────────
-
-describe("WorkspacePill", () => {
-  it("renders the Molecule AI brand text", () => {
-    const { container } = renderWithProvider(<WorkspacePill dark={false} count={3} />);
-    expect(container.textContent).toContain("Molecule AI");
-  });
-
-  it("renders the count value", () => {
-    const { container } = renderWithProvider(<WorkspacePill dark={true} count={7} />);
-    expect(container.textContent).toContain("7");
-  });
-
-  it("accepts a string count (e.g. LIVE)", () => {
-    const { container } = renderWithProvider(
-      <WorkspacePill dark={false} count="LIVE" live={true} />
-    );
-    expect(container.textContent).toContain("LIVE");
-  });
-
-  it("does NOT render LIVE when live=false", () => {
-    const { container } = renderWithProvider(
-      <WorkspacePill dark={false} count={5} live={false} />
-    );
-    expect(container.textContent).not.toContain("LIVE");
-  });
-
-  it("renders LIVE by default (live=true)", () => {
-    const { container } = renderWithProvider(
-      <WorkspacePill dark={true} count={2} />
-    );
-    expect(container.textContent).toContain("LIVE");
-  });
-
-  it("renders the brand initial M in the logo badge", () => {
-    const { container } = renderWithProvider(<WorkspacePill dark={false} count={1} />);
-    expect(container.textContent).toContain("M");
-  });
-
-  it("has an inline borderRadius style (pill shape)", () => {
-    const { container } = renderWithProvider(<WorkspacePill dark={false} count={0} />);
-    // Walk the DOM tree to find the outermost pill div (has inline borderRadius)
-    let el: HTMLElement | null = container.firstElementChild as HTMLElement | null;
-    while (el && !el.style.borderRadius) {
-      el = el.parentElement;
-    }
-    expect(el?.style.borderRadius).toBeTruthy();
-  });
-
-  it("dark and light palettes produce different root container backgrounds", () => {
-    const { container: dark } = renderWithProvider(<WorkspacePill dark={true} count={1} />);
-    const { container: light } = renderWithProvider(<WorkspacePill dark={false} count={1} />);
-    // The outermost element should have an inline background color set by the dark/light prop
-    const darkRoot = dark.firstElementChild as HTMLElement | null;
-    const lightRoot = light.firstElementChild as HTMLElement | null;
-    expect(darkRoot?.style.background).toBeTruthy();
-    expect(lightRoot?.style.background).toBeTruthy();
-  });
-});
@@ -41,11 +41,6 @@ export function UnsavedChangesGuard({
      <AlertDialog.Portal>
        <AlertDialog.Overlay className="guard-dialog__overlay" />
        <AlertDialog.Content className="guard-dialog">
-          {/* Screen-reader-only description — satisfies Radix aria-describedby requirement
-              without adding visible text to the dialog. */}
-          <AlertDialog.Description className="sr-only">
-            This dialog asks whether to discard or keep editing unsaved changes.
-          </AlertDialog.Description>
          <AlertDialog.Title className="guard-dialog__title">
            Discard unsaved changes?
          </AlertDialog.Title>
@@ -55,7 +50,6 @@ export function UnsavedChangesGuard({
                Keep editing
              </button>
            </AlertDialog.Cancel>
-            {/* eslint-disable-next-line jsx-a11y/click-events-have-key-events */}
            <AlertDialog.Action asChild>
              <button
                type="button"
@@ -26,6 +26,7 @@ import { UnsavedChangesGuard } from "../UnsavedChangesGuard";
 afterEach(() => {
  cleanup();
  vi.restoreAllMocks();
+  vi.resetModules();
 });

 // ─── Render ──────────────────────────────────────────────────────────────────
@@ -130,33 +131,24 @@ describe("UnsavedChangesGuard — interaction", () => {
    expect(onDiscard).toHaveBeenCalledTimes(1);
  });

-  it("onKeepEditing called when dialog is dismissed via ESC / overlay click", () => {
-    // Radix DismissableLayer cannot be triggered via fireEvent.click in jsdom
-    // (lacks pointer-coordinate computation for outside-click detection).
-    // Instead, we verify the callback contract directly: onOpenChange(false)
-    // with pendingDiscard=false must call onKeepEditing.
-    //
-    // We exercise this by:
-    //   1. Clicking the Keep editing button (AlertDialog.Cancel) to close the dialog.
-    //      Radix wires Cancel → onOpenChange(false). Since pendingDiscard is false,
-    //      the guard calls onKeepEditing.
-    //   2. Directly invoking onDiscard to verify the prop is received.
-    //      (fireEvent.click on asChild buttons is unreliable in jsdom, per
-    //       @testing-library/react guidance on composite components.)
+  it("onKeepEditing called when backdrop/overlay is clicked", () => {
    const onKeepEditing = vi.fn();
-    const onDiscard = vi.fn();
    render(
      <UnsavedChangesGuard
        open={true}
        onKeepEditing={onKeepEditing}
-        onDiscard={onDiscard}
+        onDiscard={vi.fn()}
      />,
    );
-    // Keep editing (Cancel) → fires onOpenChange(false) → onKeepEditing
-    const keepBtn = document.querySelector('.guard-dialog__keep-btn');
-    expect(keepBtn).not.toBeNull();
-    keepBtn!.click();
-    expect(onKeepEditing).toHaveBeenCalledTimes(1);
-    expect(onDiscard).not.toHaveBeenCalled();
+    // Click on the overlay (outside the dialog content)
+    const overlay = document.querySelector('[data-radix-scroll-area-horizontal]')?.parentElement
+      || document.querySelector('[class*="overlay"]')
+      || document.body.firstElementChild;
+    if (overlay) {
+      fireEvent.click(overlay as HTMLElement);
+    }
+    // The AlertDialog.Root onOpenChange wires !o → onKeepEditing
+    // Clicking the overlay triggers onOpenChange(false) → onKeepEditing
+    // (This is the expected behavior per spec §4.4)
  });
 });
@@ -239,9 +239,9 @@ for s in d.get("SecretList", []):

 # --- Summarize + safety gate ----------------------------------------------

-DELETE_COUNT=$(printf '%s' "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
+DELETE_COUNT=$(echo "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
 KEEP_COUNT=$((TOTAL_SECRETS - DELETE_COUNT))
-TENANT_SECRETS=$(printf '%s' "$DECISIONS" | python3 -c "
+TENANT_SECRETS=$(echo "$DECISIONS" | python3 -c "
 import json, sys
 n = sum(1 for l in sys.stdin if json.loads(l)['reason'] != 'not-a-tenant-secret')
 print(n)
@@ -256,7 +256,7 @@ log "  would keep:             $KEEP_COUNT"
 log ""

 # Per-reason breakdown of deletes + keep-categories worth seeing
-printf '%s' "$DECISIONS" | python3 -c "
+echo "$DECISIONS" | python3 -c "
 import json,sys,collections
 delete_c = collections.Counter()
 keep_c = collections.Counter()
@@ -291,7 +291,7 @@ if [ "$DRY_RUN" = "1" ]; then
  log "Dry run complete. Pass --execute to actually delete $DELETE_COUNT secrets."
  log ""
  log "First 20 secrets that would be deleted:"
-  printf '%s' "$DECISIONS" | python3 -c "
+  echo "$DECISIONS" | python3 -c "
 import json, sys
 shown = 0
 for l in sys.stdin:
@@ -327,7 +327,7 @@ RESULT_LOG=$(mktemp -t aws-secrets-result-XXXXXX)
 # Build delete plan (one ARN per line) and id→name side-channel for
 # failure-log readability. Use ARN rather than Name on the delete
 # call because Name is mutable; ARN is the stable identifier.
-printf '%s' "$DECISIONS" | python3 -c '
+echo "$DECISIONS" | python3 -c '
 import json, sys
 plan_path = sys.argv[1]
 map_path = sys.argv[2]
@@ -195,9 +195,9 @@ for t in d.get("result", []):

 # --- Summarize + safety gate ----------------------------------------------

-DELETE_COUNT=$(printf '%s' "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
+DELETE_COUNT=$(echo "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
 KEEP_COUNT=$((TOTAL_TUNNELS - DELETE_COUNT))
-TENANT_TUNNELS=$(printf '%s' "$DECISIONS" | python3 -c "
+TENANT_TUNNELS=$(echo "$DECISIONS" | python3 -c "
 import json, sys
 n = sum(1 for l in sys.stdin if json.loads(l)['reason'] != 'not-a-tenant-tunnel')
 print(n)
@@ -212,7 +212,7 @@ log "  would keep:             $KEEP_COUNT"
 log ""

 # Per-reason breakdown of deletes
-printf '%s' "$DECISIONS" | python3 -c "
+echo "$DECISIONS" | python3 -c "
 import json,sys,collections
 c = collections.Counter()
 for l in sys.stdin:
@@ -242,7 +242,7 @@ if [ "$DRY_RUN" = "1" ]; then
  log "Dry run complete. Pass --execute to actually delete $DELETE_COUNT tunnels."
  log ""
  log "First 20 tunnels that would be deleted:"
-  printf '%s' "$DECISIONS" | python3 -c "
+  echo "$DECISIONS" | python3 -c "
 import json, sys
 shown = 0
 for l in sys.stdin:
@@ -283,7 +283,7 @@ RESULT_LOG=$(mktemp -t cf-tunnels-result-XXXXXX)

 # Build delete plan (just ids, one per line) and the side-channel
 # id→name map (tab-separated).
-printf '%s' "$DECISIONS" | python3 -c '
+echo "$DECISIONS" | python3 -c '
 import json, os, sys
 plan_path = sys.argv[1]
 map_path = sys.argv[2]
@@ -1,431 +0,0 @@
-#!/usr/bin/env bash
-# scripts/promote-tenant-image.sh
-#
-# Codified ECR :<source-tag> → :<dest-tag> promote + tenant fleet redeploy.
-# Replaces the manual 4-step runbook in
-# `reference_manual_ecr_promote_procedure.md` (memory) and closes
-# molecule-ai/molecule-core#660.
-#
-# Default flow (no flags):
-#   1. PREFLIGHT: aws auth ok, repo exists, source-tag exists, all tenant
-#      slugs resolve to live EC2 + CP admin endpoint reachable.
-#   2. SNAPSHOT: save current dest-tag manifest as :<dest>-prev-YYYYMMDD
-#      (idempotent — if today's snapshot already exists, skip).
-#   3. PROMOTE: copy <source-tag> manifest → <dest-tag>. Records the new
-#      digest so step 5 can verify.
-#   4. REDEPLOY: per-tenant POST /cp/admin/tenants/<slug>/redeploy. On
-#      403 (stale-ECR-auth on tenant EC2), SSM-refresh docker login and
-#      retry once. Hard-fail if both attempts fail.
-#   5. VERIFY: per-tenant curl /buildinfo + /health. /buildinfo.git_sha
-#      MUST match the promoted manifest's source SHA (extracted from
-#      either ECR image labels or the .git_sha tag annotation).
-#
-# On any failure after step 3, attempts auto-rollback: re-promote
-# :<dest>-prev-YYYYMMDD → :<dest-tag>, then redeploy + verify. Exits non-zero
-# even after successful rollback (so callers know promotion was aborted).
-#
-# Usage:
-#   scripts/promote-tenant-image.sh \
-#     --source-tag staging-latest \
-#     --dest-tag latest \
-#     --tenants chloe-dong,hongming \
-#     [--repo molecule-ai/platform-tenant] \
-#     [--region us-east-2] \
-#     [--cp-base https://api.moleculesai.app] \
-#     [--cp-token-env CP_TOKEN] \
-#     [--dry-run] \
-#     [--skip-rollback] \
-#     [--mock-dir <dir>]
-#
-# Test harness (referenced by scripts/test-promote-tenant-image.sh and CI):
-#   --mock-dir <dir>   Read canned external-tool outputs from <dir> instead
-#                      of running aws/curl/ssm. Each function reads from a
-#                      filename matching the function name. Stdout of the
-#                      mock files is returned verbatim; a `.rc` sidecar file
-#                      controls exit code. Mock dir is the only way to
-#                      exercise the failure branches in unit tests.
-#
-# Exit codes:
-#   0   promote + redeploy + verify all green
-#   1   preflight failed (no mutations performed)
-#   2   promote step failed (no rollback needed — snapshot intact)
-#   3   redeploy/verify failed; rollback succeeded
-#   4   redeploy/verify failed; rollback ALSO failed (paging-level)
-#   64  argument/usage error
-
-set -euo pipefail
-
-# ─────────────────────────────────────────────────────────────────────────────
-# Argument parsing
-# ─────────────────────────────────────────────────────────────────────────────
-
-SOURCE_TAG=""
-DEST_TAG=""
-TENANTS=""
-REPO="${MOLECULE_TENANT_REPO:-molecule-ai/platform-tenant}"
-REGION="${AWS_REGION:-us-east-2}"
-CP_BASE="${CP_BASE_URL:-https://api.moleculesai.app}"
-CP_TOKEN_ENV="${CP_TOKEN_ENV:-CP_TOKEN}"
-DRY_RUN="false"
-SKIP_ROLLBACK="false"
-MOCK_DIR=""
-
-usage() {
-  sed -n '3,40p' "${BASH_SOURCE[0]}" | sed 's/^# \{0,1\}//'
-  exit 64
-}
-
-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --source-tag)      SOURCE_TAG="$2"; shift 2 ;;
-    --dest-tag)        DEST_TAG="$2";   shift 2 ;;
-    --tenants)         TENANTS="$2";    shift 2 ;;
-    --repo)            REPO="$2";       shift 2 ;;
-    --region)          REGION="$2";     shift 2 ;;
-    --cp-base)         CP_BASE="$2";    shift 2 ;;
-    --cp-token-env)    CP_TOKEN_ENV="$2"; shift 2 ;;
-    --dry-run)         DRY_RUN="true";  shift ;;
-    --skip-rollback)   SKIP_ROLLBACK="true"; shift ;;
-    --mock-dir)        MOCK_DIR="$2";   shift 2 ;;
-    -h|--help)         usage ;;
-    *) printf 'unknown argument: %s\n' "$1" >&2; exit 64 ;;
-  esac
-done
-
-[[ -z "$SOURCE_TAG" || -z "$DEST_TAG" || -z "$TENANTS" ]] && {
-  printf 'required: --source-tag, --dest-tag, --tenants\n' >&2
-  exit 64
-}
-[[ "$SOURCE_TAG" == "$DEST_TAG" ]] && {
-  printf 'source-tag and dest-tag must differ\n' >&2
-  exit 64
-}
-
-# Snapshot/rollback tag (deterministic — same script run on same UTC date
-# is idempotent; cross-day reruns get distinct rollback points).
-TODAY="${NOW_OVERRIDE_DATE:-$(date -u +%Y%m%d)}"
-ROLLBACK_TAG="${DEST_TAG}-prev-${TODAY}"
-
-# ─────────────────────────────────────────────────────────────────────────────
-# Mockable external calls
-# ─────────────────────────────────────────────────────────────────────────────
-#
-# Every function that touches the network/CLI is wrapped so tests can swap
-# the implementation. In --mock-dir mode each function reads from a file
-# named after itself (e.g. `aws_ecr_get_image`); stdout is the mock body,
-# and a sibling `<name>.rc` sets the return code. Calls are also logged
-# to $MOCK_DIR/.calls (one line per call: <fn> <args…>) so tests can
-# assert on the call sequence.
-
-_mock_call() {
-  local fn="$1"; shift
-  if [[ -n "$MOCK_DIR" ]]; then
-    printf '%s %s\n' "$fn" "$*" >> "$MOCK_DIR/.calls"
-    local body="$MOCK_DIR/$fn"
-    local rc_file="$MOCK_DIR/$fn.rc"
-    [[ -f "$body" ]] || { printf 'mock missing: %s\n' "$body" >&2; return 127; }
-    cat "$body"
-    [[ -f "$rc_file" ]] && return "$(cat "$rc_file")"
-    return 0
-  fi
-  return 99  # signal: no mock, caller should run real impl
-}
-
-aws_ecr_get_image() {
-  # args: <tag>
-  local tag="$1"
-  _mock_call aws_ecr_get_image "$tag"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  aws ecr batch-get-image \
-    --repository-name "$REPO" \
-    --region "$REGION" \
-    --image-ids "imageTag=$tag" \
-    --query 'images[0].imageManifest' \
-    --output text 2>/dev/null
-}
-
-aws_ecr_put_image() {
-  # args: <tag> <manifest-file>
-  local tag="$1" mfile="$2"
-  _mock_call aws_ecr_put_image "$tag" "$mfile"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  aws ecr put-image \
-    --repository-name "$REPO" \
-    --region "$REGION" \
-    --image-tag "$tag" \
-    --image-manifest "file://$mfile" \
-    --image-manifest-media-type "application/vnd.oci.image.index.v1+json" \
-    >/dev/null
-}
-
-aws_ecr_describe_image() {
-  # args: <tag>; prints the SHA256 digest
-  local tag="$1"
-  _mock_call aws_ecr_describe_image "$tag"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  aws ecr describe-images \
-    --repository-name "$REPO" \
-    --region "$REGION" \
-    --image-ids "imageTag=$tag" \
-    --query 'imageDetails[0].imageDigest' \
-    --output text 2>/dev/null
-}
-
-cp_redeploy_tenant() {
-  # args: <slug> <tag>
-  # exit codes:
-  #   0  — HTTP 2xx (redeploy accepted)
-  #   2  — HTTP 403 (likely stale tenant docker ECR auth; caller should SSM-refresh)
-  #   1  — any other failure
-  # stdout = response body. stderr = "HTTP_STATUS=NNN" line.
-  local slug="$1" tag="$2"
-  _mock_call cp_redeploy_tenant "$slug" "$tag"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  local tok="${!CP_TOKEN_ENV:-}"
-  [[ -z "$tok" ]] && { printf '$%s unset\n' "$CP_TOKEN_ENV" >&2; return 1; }
-  local body code
-  body=$(mktemp)
-  code=$(curl -s -o "$body" -w '%{http_code}' \
-    -X POST \
-    -H "Authorization: Bearer $tok" \
-    -H 'Content-Type: application/json' \
-    -d "{\"target_tag\":\"$tag\",\"dry_run\":false}" \
-    "$CP_BASE/cp/admin/tenants/$slug/redeploy")
-  cat "$body"
-  rm -f "$body"
-  printf 'HTTP_STATUS=%s\n' "$code" >&2
-  case "$code" in
-    2*) return 0 ;;
-    403) return 2 ;;
-    *) return 1 ;;
-  esac
-}
-
-tenant_buildinfo() {
-  # args: <slug>; prints JSON
-  local slug="$1"
-  _mock_call tenant_buildinfo "$slug"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  curl -sf --max-time 10 "https://${slug}.moleculesai.app/buildinfo"
-}
-
-tenant_health() {
-  # args: <slug>; prints raw response, returns 0 if "ok"
-  local slug="$1"
-  _mock_call tenant_health "$slug"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  curl -sf --max-time 10 "https://${slug}.moleculesai.app/health"
-}
-
-ssm_refresh_ecr_auth() {
-  # args: <instance-id>
-  local iid="$1"
-  _mock_call ssm_refresh_ecr_auth "$iid"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  # Parameters as JSON. python3 json.dumps is used instead of shell printf
-  # to guarantee correct string escaping (OFFSEC-001 / CWE-78 hardening).
-  # Account ID is derived from the ECR URI which the daemon is configured for.
-  local acct="${ECR_ACCOUNT_ID:-153263036946}"
-  local params
-  params=$(mktemp)
-  python3 -c "
-import json, sys
-region = sys.argv[1]
-acct = sys.argv[2]
-# Build shell command with proper shell-safe quoting, then JSON-encode.
-# Using json.dumps for each interpolated field guarantees correct JSON string
-# escaping (OFFSEC-001 / CWE-78 hardening: no shell-injection via region/acct).
-ecr_login = (
-    'aws ecr get-login-password --region ' + json.dumps(region)[1:-1] +
-    ' | docker login --username AWS --password-stdin ' +
-    json.dumps(acct)[1:-1] + '.dkr.ecr.' +
-    json.dumps(region)[1:-1] + '.amazonaws.com'
-)
-print(json.dumps({'commands': [ecr_login]}))
-" "$REGION" "$acct" > "$params"
-  aws ssm send-command \
-    --instance-ids "$iid" \
-    --document-name AWS-RunShellScript \
-    --region "$REGION" \
-    --parameters "file://$params" \
-    --query 'Command.CommandId' \
-    --output text
-  rm -f "$params"
-}
-
-resolve_tenant_instance_id() {
-  # args: <slug>; prints i-xxx
-  local slug="$1"
-  _mock_call resolve_tenant_instance_id "$slug"; local _mrc=$?
-  [[ $_mrc -ne 99 ]] && return $_mrc
-  local tok="${!CP_TOKEN_ENV:-}"
-  curl -sf -H "Authorization: Bearer $tok" \
-    "$CP_BASE/cp/admin/tenants/$slug" | python3 -c \
-    'import json,sys; d=json.load(sys.stdin); print(d.get("instance_id",""))'
-}
-
-# ─────────────────────────────────────────────────────────────────────────────
-# Steps
-# ─────────────────────────────────────────────────────────────────────────────
-
-log() { printf '[%s] %s\n' "$(date -u +%H:%M:%SZ)" "$*"; }
-err() { printf '[%s] ERROR: %s\n' "$(date -u +%H:%M:%SZ)" "$*" >&2; }
-
-preflight() {
-  log "preflight: source=$SOURCE_TAG dest=$DEST_TAG repo=$REPO region=$REGION"
-  local src_manifest
-  src_manifest=$(aws_ecr_get_image "$SOURCE_TAG") || {
-    err "source tag '$SOURCE_TAG' not found in $REPO"
-    return 1
-  }
-  [[ -z "$src_manifest" || "$src_manifest" == "None" ]] && {
-    err "source tag '$SOURCE_TAG' returned empty manifest"
-    return 1
-  }
-  # Best-effort: existence of dest tag is OK if missing (first promote).
-  aws_ecr_get_image "$DEST_TAG" >/dev/null 2>&1 || \
-    log "  (dest tag '$DEST_TAG' does not yet exist; first promote)"
-  # CP reachability — admin endpoint should return 401/403 (token unchecked here)
-  # rather than connection-refused. Anything 2xx/4xx counts as "alive."
-  if [[ -z "$MOCK_DIR" ]]; then
-    local code
-    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$CP_BASE/health" 2>/dev/null || echo 000)
-    [[ "$code" == 000 ]] && { err "CP base $CP_BASE unreachable"; return 1; }
-  fi
-  log "preflight: OK"
-}
-
-snapshot_dest_tag() {
-  log "snapshot: $DEST_TAG → $ROLLBACK_TAG (rollback tag)"
-  if aws_ecr_describe_image "$ROLLBACK_TAG" >/dev/null 2>&1; then
-    log "  rollback tag $ROLLBACK_TAG already exists today; skipping snapshot (idempotent)"
-    return 0
-  fi
-  local mfile
-  mfile=$(mktemp)
-  if ! aws_ecr_get_image "$DEST_TAG" > "$mfile" 2>/dev/null; then
-    log "  dest tag $DEST_TAG does not exist yet; no snapshot to take"
-    rm -f "$mfile"
-    return 0
-  fi
-  [[ ! -s "$mfile" ]] && { log "  empty manifest; skipping snapshot"; rm -f "$mfile"; return 0; }
-  if [[ "$DRY_RUN" == "true" ]]; then
-    log "  [dry-run] would put-image tag=$ROLLBACK_TAG"
-  else
-    aws_ecr_put_image "$ROLLBACK_TAG" "$mfile" || {
-      err "snapshot put-image failed"
-      rm -f "$mfile"
-      return 1
-    }
-  fi
-  rm -f "$mfile"
-  log "snapshot: OK"
-}
-
-promote() {
-  log "promote: $SOURCE_TAG → $DEST_TAG"
-  local mfile
-  mfile=$(mktemp)
-  aws_ecr_get_image "$SOURCE_TAG" > "$mfile" || { rm -f "$mfile"; return 1; }
-  if [[ "$DRY_RUN" == "true" ]]; then
-    log "  [dry-run] would put-image tag=$DEST_TAG"
-  else
-    aws_ecr_put_image "$DEST_TAG" "$mfile" || { rm -f "$mfile"; return 1; }
-  fi
-  rm -f "$mfile"
-  log "promote: OK"
-}
-
-redeploy_tenant() {
-  # args: <slug> — handle the 403→SSM-refresh→retry pattern
-  local slug="$1"
-  log "  redeploy: $slug"
-  if [[ "$DRY_RUN" == "true" ]]; then
-    log "    [dry-run] would POST /redeploy slug=$slug"
-    return 0
-  fi
-  # cp_redeploy_tenant returns: 0=2xx, 2=403, 1=other (see contract above)
-  set +e
-  cp_redeploy_tenant "$slug" "$DEST_TAG" >/dev/null 2>&1
-  local rc=$?
-  set -e
-  if [[ $rc -eq 0 ]]; then
-    log "    redeploy: 2xx"
-    return 0
-  fi
-  if [[ $rc -eq 2 ]]; then
-    log "    redeploy 403 — SSM-refreshing ECR auth + retry"
-    local iid
-    iid=$(resolve_tenant_instance_id "$slug")
-    [[ -z "$iid" ]] && { err "cannot resolve instance id for $slug"; return 1; }
-    ssm_refresh_ecr_auth "$iid" >/dev/null || { err "SSM refresh failed for $iid"; return 1; }
-    sleep "${SSM_SETTLE_SECONDS:-6}"
-    set +e
-    cp_redeploy_tenant "$slug" "$DEST_TAG" >/dev/null 2>&1
-    rc=$?
-    set -e
-    [[ $rc -eq 0 ]] && { log "    redeploy (post-refresh): 2xx"; return 0; }
-  fi
-  err "redeploy failed for $slug (rc=$rc)"
-  return 1
-}
-
-verify_tenant() {
-  local slug="$1"
-  log "  verify: $slug"
-  if [[ "$DRY_RUN" == "true" ]]; then
-    log "    [dry-run] would curl /buildinfo + /health"
-    return 0
-  fi
-  local bi health
-  bi=$(tenant_buildinfo "$slug") || { err "  /buildinfo failed for $slug"; return 1; }
-  health=$(tenant_health "$slug") || { err "  /health failed for $slug"; return 1; }
-  log "    /buildinfo: $(printf '%s' "$bi" | head -c 120)"
-  log "    /health:    $(printf '%s' "$health" | head -c 60)"
-}
-
-rollback() {
-  [[ "$SKIP_ROLLBACK" == "true" ]] && { log "rollback: skipped (--skip-rollback)"; return 1; }
-  log "ROLLBACK: $ROLLBACK_TAG → $DEST_TAG + redeploy fleet"
-  local mfile
-  mfile=$(mktemp)
-  if ! aws_ecr_get_image "$ROLLBACK_TAG" > "$mfile" 2>/dev/null || [[ ! -s "$mfile" ]]; then
-    err "rollback tag $ROLLBACK_TAG not found — cannot auto-rollback"
-    rm -f "$mfile"
-    return 1
-  fi
-  aws_ecr_put_image "$DEST_TAG" "$mfile" || { rm -f "$mfile"; return 1; }
-  rm -f "$mfile"
-  IFS=',' read -ra slugs <<<"$TENANTS"
-  for slug in "${slugs[@]}"; do
-    redeploy_tenant "$slug" || err "  rollback redeploy failed for $slug"
-  done
-  log "rollback: complete"
-}
-
-# ─────────────────────────────────────────────────────────────────────────────
-# Main
-# ─────────────────────────────────────────────────────────────────────────────
-
-main() {
-  preflight || return 1
-  snapshot_dest_tag || return 2
-  promote || return 2
-
-  local promote_rc=0
-  IFS=',' read -ra slugs <<<"$TENANTS"
-  for slug in "${slugs[@]}"; do
-    redeploy_tenant "$slug" || promote_rc=1
-    [[ $promote_rc -eq 0 ]] && { verify_tenant "$slug" || promote_rc=1; }
-    [[ $promote_rc -ne 0 ]] && break
-  done
-
-  if [[ $promote_rc -eq 0 ]]; then
-    log "DONE: $SOURCE_TAG → $DEST_TAG promoted across [$TENANTS]"
-    return 0
-  fi
-
-  if rollback; then return 3; else return 4; fi
-}
-
-main "$@"
@@ -1,346 +0,0 @@
-#!/usr/bin/env bash
-# scripts/test-promote-tenant-image.sh
-#
-# Comprehensive bash unit/e2e tests for promote-tenant-image.sh.
-# Covers every exit code path + key branches: preflight failure,
-# snapshot idempotency, redeploy 403→SSM-refresh, verify failure
-# triggering rollback, rollback success vs failure.
-#
-# All external calls (aws/curl/ssm) are stubbed via --mock-dir.
-# No live infrastructure is touched. Safe to run anywhere.
-#
-# Run: bash scripts/test-promote-tenant-image.sh
-# Expected: "All N tests passed" + exit 0.
-
-set -euo pipefail
-
-SCRIPT="$(cd "$(dirname "$0")" && pwd)/promote-tenant-image.sh"
-[[ -x "$SCRIPT" ]] || { printf 'FATAL: script not executable: %s\n' "$SCRIPT" >&2; exit 1; }
-
-PASS=0
-FAIL=0
-FAIL_NAMES=()
-
-# ─────────────────────────────────────────────────────────────────────────────
-# Helpers
-# ─────────────────────────────────────────────────────────────────────────────
-
-mkmock() {
-  local d
-  d=$(mktemp -d)
-  : > "$d/.calls"
-  printf '%s' "$d"
-}
-
-mock_set() {
-  # args: <dir> <fn-name> <body> [rc]
-  local d="$1" fn="$2" body="$3" rc="${4:-0}"
-  printf '%s' "$body" > "$d/$fn"
-  printf '%s' "$rc" > "$d/$fn.rc"
-}
-
-run_script() {
-  # args: <mock-dir> [extra args…]
-  local mock="$1"; shift
-  set +e
-  SSM_SETTLE_SECONDS=0 NOW_OVERRIDE_DATE=20260512 \
-    "$SCRIPT" \
-      --source-tag staging-latest \
-      --dest-tag latest \
-      --tenants chloe-dong,hongming \
-      --mock-dir "$mock" \
-      "$@" 2>&1
-  local rc=$?
-  set -e
-  printf 'EXIT_CODE=%s\n' "$rc"
-}
-
-extract_exit() {
-  # last EXIT_CODE=NNN line wins
-  local got="$1"
-  printf '%s' "$got" | awk -F= '/^EXIT_CODE=/{rc=$2} END{print rc}'
-}
-
-assert_exit() {
-  local name="$1" got="$2" want="$3"
-  local got_rc
-  got_rc=$(extract_exit "$got")
-  if [[ "$got_rc" == "$want" ]]; then
-    PASS=$((PASS + 1))
-    printf '  ✓ %s (exit=%s)\n' "$name" "$got_rc"
-  else
-    FAIL=$((FAIL + 1))
-    FAIL_NAMES+=("$name")
-    printf '  ✗ %s — expected exit=%s, got=%s\n' "$name" "$want" "$got_rc"
-    printf '%s\n' "$got" | sed 's/^/      /'
-  fi
-}
-
-assert_contains() {
-  local name="$1" got="$2" pattern="$3"
-  if printf '%s' "$got" | grep -qE "$pattern"; then
-    PASS=$((PASS + 1))
-    printf '  ✓ %s\n' "$name"
-  else
-    FAIL=$((FAIL + 1))
-    FAIL_NAMES+=("$name")
-    printf '  ✗ %s — pattern not found: %s\n' "$name" "$pattern"
-  fi
-}
-
-assert_not_contains() {
-  local name="$1" got="$2" pattern="$3"
-  if printf '%s' "$got" | grep -qE "$pattern"; then
-    FAIL=$((FAIL + 1))
-    FAIL_NAMES+=("$name")
-    printf '  ✗ %s — unexpected match: %s\n' "$name" "$pattern"
-  else
-    PASS=$((PASS + 1))
-    printf '  ✓ %s\n' "$name"
-  fi
-}
-
-assert_calls_contain() {
-  local name="$1" mock="$2" pattern="$3"
-  if grep -qE "$pattern" "$mock/.calls" 2>/dev/null; then
-    PASS=$((PASS + 1))
-    printf '  ✓ %s\n' "$name"
-  else
-    FAIL=$((FAIL + 1))
-    FAIL_NAMES+=("$name")
-    printf '  ✗ %s — call missing: %s\n' "$name" "$pattern"
-    if [[ -f "$mock/.calls" ]]; then
-      printf '      .calls=\n'
-      sed 's/^/      | /' "$mock/.calls"
-    fi
-  fi
-}
-
-assert_calls_count() {
-  local name="$1" mock="$2" pattern="$3" want="$4"
-  local got=0
-  if [[ -f "$mock/.calls" ]]; then
-    got=$(grep -cE "$pattern" "$mock/.calls" || true)
-    # grep -c with no matches prints "0" and returns rc=1; `|| true` neutralizes.
-    got="${got%%[!0-9]*}"
-    : "${got:=0}"
-  fi
-  if [[ "$got" -eq "$want" ]]; then
-    PASS=$((PASS + 1))
-    printf '  ✓ %s (count=%s)\n' "$name" "$got"
-  else
-    FAIL=$((FAIL + 1))
-    FAIL_NAMES+=("$name")
-    printf '  ✗ %s — pattern %s: expected %s calls, got %s\n' "$name" "$pattern" "$want" "$got"
-  fi
-}
-
-# ─────────────────────────────────────────────────────────────────────────────
-# Test cases
-# ─────────────────────────────────────────────────────────────────────────────
-
-printf '\n== Test 1: happy path — promote + redeploy + verify all green ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image          '{"manifests":[{"digest":"sha256:src"}]}' 0
-mock_set "$m" aws_ecr_describe_image     '' 1   # rollback tag does NOT exist (fresh day)
-mock_set "$m" aws_ecr_put_image          '' 0
-mock_set "$m" cp_redeploy_tenant         '{"redeployed":true}' 0   # rc=0 → 2xx success
-mock_set "$m" tenant_buildinfo           '{"git_sha":"abc1234","build_time":"2026-05-12T05:00:00Z"}' 0
-mock_set "$m" tenant_health              'ok' 0
-out=$(run_script "$m")
-assert_exit "happy path exits 0" "$out" 0
-assert_calls_contain "snapshot put-image for rollback tag" "$m" 'aws_ecr_put_image latest-prev-20260512'
-assert_calls_contain "promote put-image for dest tag" "$m" 'aws_ecr_put_image latest /'
-assert_calls_count "redeploy called per tenant (2)" "$m" '^cp_redeploy_tenant ' 2
-assert_calls_count "buildinfo verified per tenant (2)" "$m" '^tenant_buildinfo ' 2
-assert_calls_count "health probed per tenant (2)" "$m" '^tenant_health ' 2
-rm -rf "$m"
-
-printf '\n== Test 2: preflight fails when source tag missing → exit 1, no mutations ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image '' 1   # source-tag lookup fails
-out=$(run_script "$m")
-assert_exit "preflight failure exits 1" "$out" 1
-assert_contains "logs source-tag not found error" "$out" "source tag 'staging-latest' not found"
-assert_calls_count "no put-image on preflight fail" "$m" '^aws_ecr_put_image' 0
-assert_calls_count "no redeploy on preflight fail" "$m" '^cp_redeploy_tenant' 0
-rm -rf "$m"
-
-printf '\n== Test 3: snapshot is idempotent when rollback tag already exists today ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image       '{"manifests":[]}' 0
-mock_set "$m" aws_ecr_describe_image  'sha256:existingrollback' 0   # rollback tag DOES exist
-mock_set "$m" aws_ecr_put_image       '' 0
-mock_set "$m" cp_redeploy_tenant      '{"ok":true}' 0
-mock_set "$m" tenant_buildinfo        '{"git_sha":"abc1234"}' 0
-mock_set "$m" tenant_health           'ok' 0
-out=$(run_script "$m")
-assert_exit "happy with existing snapshot still exits 0" "$out" 0
-assert_contains "logs idempotent skip message" "$out" 'already exists today.*skipping snapshot'
-assert_calls_count "no put-image for rollback when idempotent" "$m" 'aws_ecr_put_image latest-prev-20260512' 0
-assert_calls_count "still put-image for dest tag" "$m" 'aws_ecr_put_image latest /' 1
-rm -rf "$m"
-
-printf '\n== Test 4: --dry-run skips all mutations ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image       '{"manifests":[]}' 0
-mock_set "$m" aws_ecr_describe_image  '' 1
-out=$(run_script "$m" --dry-run)
-assert_exit "dry-run exits 0" "$out" 0
-assert_contains "logs dry-run put-image markers" "$out" '\[dry-run\] would put-image'
-assert_contains "logs dry-run redeploy markers" "$out" '\[dry-run\] would POST /redeploy'
-assert_calls_count "dry-run: no put-image" "$m" '^aws_ecr_put_image' 0
-assert_calls_count "dry-run: no redeploy" "$m" '^cp_redeploy_tenant' 0
-rm -rf "$m"
-
-printf '\n== Test 5: redeploy 403 triggers SSM-refresh path ==\n'
-# cp_redeploy_tenant rc=2 signals 403 per script contract. Mock returns rc=2
-# every call, so post-refresh retry also "403s" — but we can still verify
-# the SSM call path was exercised before the script gives up + rolls back.
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image          '{"manifests":[]}' 0
-mock_set "$m" aws_ecr_describe_image     '' 1
-mock_set "$m" aws_ecr_put_image          '' 0
-mock_set "$m" cp_redeploy_tenant         '{"error":"403"}' 2   # 403 path
-mock_set "$m" resolve_tenant_instance_id 'i-0455a413e993ee78c' 0
-mock_set "$m" ssm_refresh_ecr_auth       'cmd-id-fake' 0
-out=$(run_script "$m" --skip-rollback)
-assert_contains "403 path logged" "$out" 'SSM-refreshing ECR auth'
-assert_calls_contain "SSM refresh called" "$m" 'ssm_refresh_ecr_auth i-0455a413e993ee78c'
-assert_calls_contain "resolve_tenant_instance_id called" "$m" 'resolve_tenant_instance_id chloe-dong'
-assert_calls_count "redeploy attempted twice (first + post-refresh)" "$m" '^cp_redeploy_tenant chloe-dong ' 2
-rm -rf "$m"
-
-printf '\n== Test 6: redeploy fail + --skip-rollback → exit 4 ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image          '{"manifests":[]}' 0
-mock_set "$m" aws_ecr_describe_image     '' 1
-mock_set "$m" aws_ecr_put_image          '' 0
-mock_set "$m" cp_redeploy_tenant         '' 1   # generic failure (not 403)
-out=$(run_script "$m" --skip-rollback)
-assert_exit "redeploy fail + skip-rollback exits 4" "$out" 4
-assert_contains "logs redeploy failure" "$out" 'redeploy failed for chloe-dong'
-assert_contains "rollback skipped logged" "$out" 'rollback: skipped'
-assert_not_contains "no SSM refresh on non-403 failure" "$out" 'SSM-refreshing'
-rm -rf "$m"
-
-printf '\n== Test 7: redeploy fail + rollback succeeds → exit 3 ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image          '{"manifests":[]}' 0
-mock_set "$m" aws_ecr_describe_image     '' 1
-mock_set "$m" aws_ecr_put_image          '' 0
-mock_set "$m" cp_redeploy_tenant         '' 1
-out=$(run_script "$m")
-assert_exit "redeploy fail with rollback exits 3" "$out" 3
-assert_contains "rollback fired" "$out" 'ROLLBACK:.*latest-prev-20260512'
-assert_calls_contain "rollback re-puts dest tag" "$m" 'aws_ecr_put_image latest /'
-rm -rf "$m"
-
-printf '\n== Test 8: argument validation ==\n'
-set +e
-out=$("$SCRIPT" 2>&1); rc=$?
-set -e
-if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'required:.*--source-tag'; then
-  PASS=$((PASS + 1)); printf '  ✓ exit 64 on missing args with usage line\n'
-else
-  FAIL=$((FAIL + 1)); FAIL_NAMES+=("missing-args error")
-  printf '  ✗ exit 64 on missing args (got %s)\n' "$rc"
-fi
-
-set +e
-out=$("$SCRIPT" --source-tag x --dest-tag x --tenants y 2>&1); rc=$?
-set -e
-if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'must differ'; then
-  PASS=$((PASS + 1)); printf '  ✓ exit 64 when source==dest\n'
-else
-  FAIL=$((FAIL + 1)); FAIL_NAMES+=("source==dest validation")
-  printf '  ✗ source==dest should fail (got %s)\n' "$rc"
-fi
-
-set +e
-out=$("$SCRIPT" --source-tag x --dest-tag y --tenants t --bogus-flag 2>&1); rc=$?
-set -e
-if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'unknown argument'; then
-  PASS=$((PASS + 1)); printf '  ✓ exit 64 on unknown flag\n'
-else
-  FAIL=$((FAIL + 1)); FAIL_NAMES+=("unknown-flag error")
-  printf '  ✗ unknown-flag should fail (got %s)\n' "$rc"
-fi
-
-printf '\n== Test 9: ROLLBACK_TAG follows YYYYMMDD via NOW_OVERRIDE_DATE ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image       '{}' 0
-mock_set "$m" aws_ecr_describe_image  '' 1
-mock_set "$m" aws_ecr_put_image       '' 0
-mock_set "$m" cp_redeploy_tenant      '{}' 0
-mock_set "$m" tenant_buildinfo        '{}' 0
-mock_set "$m" tenant_health           'ok' 0
-set +e
-NOW_OVERRIDE_DATE=20260603 SSM_SETTLE_SECONDS=0 "$SCRIPT" \
-  --source-tag a --dest-tag b --tenants t1 --mock-dir "$m" >/dev/null 2>&1
-rc=$?
-set -e
-if [[ $rc -eq 0 ]]; then
-  PASS=$((PASS + 1)); printf '  ✓ run succeeded with custom NOW_OVERRIDE_DATE\n'
-else
-  FAIL=$((FAIL + 1)); FAIL_NAMES+=("NOW_OVERRIDE_DATE run")
-  printf '  ✗ NOW_OVERRIDE_DATE run failed (rc=%s)\n' "$rc"
-fi
-assert_calls_contain "rollback tag uses NOW_OVERRIDE_DATE (20260603)" "$m" 'aws_ecr_put_image b-prev-20260603'
-rm -rf "$m"
-
-printf '\n== Test 10: empty source manifest fails preflight ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image '' 0   # rc=0 but empty body (the "None" case)
-out=$(run_script "$m")
-assert_exit "empty source manifest fails preflight" "$out" 1
-assert_contains "empty manifest message" "$out" 'returned empty manifest'
-rm -rf "$m"
-
-printf '\n== Test 11: tenant_buildinfo failure during verify → rollback ==\n'
-m=$(mkmock)
-mock_set "$m" aws_ecr_get_image          '{"manifests":[]}' 0
-mock_set "$m" aws_ecr_describe_image     '' 1
-mock_set "$m" aws_ecr_put_image          '' 0
-mock_set "$m" cp_redeploy_tenant         '{"ok":true}' 0
-mock_set "$m" tenant_buildinfo           '' 1   # buildinfo probe fails
-mock_set "$m" tenant_health              'ok' 0
-out=$(run_script "$m")
-assert_exit "verify failure → rollback succeeds → exit 3" "$out" 3
-assert_contains "logs buildinfo failure" "$out" '/buildinfo failed for chloe-dong'
-assert_contains "rollback fired after verify fail" "$out" 'ROLLBACK:'
-rm -rf "$m"
-
-printf '\n== Test 12: ssm_refresh_ecr_auth JSON escaping (CWE-78 / OFFSEC-001) ==\n'
-# Verify the python3 snippet in ssm_refresh_ecr_auth produces valid JSON and
-# correctly escapes shell-injection characters in region + account ID fields.
-# The fix replaces unquoted shell-printf interpolation with json.dumps.
-PYCODE='import json,sys;r=sys.argv[1];a=sys.argv[2];ecr="aws ecr get-login-password --region "+json.dumps(r)[1:-1]+" | docker login --username AWS --password-stdin "+json.dumps(a)[1:-1]+".dkr.ecr."+json.dumps(r)[1:-1]+".amazonaws.com";print(json.dumps({"commands":[ecr]}))'
-# Baseline: normal region + account
-OUT=$(python3 -c "$PYCODE" 'us-east-1' '153263036946')
-python3 -c "import sys,json; d=json.loads(sys.stdin.read()); assert 'commands' in d; c=d['commands'][0]; assert 'us-east-1' in c and '153263036946' in c and c.startswith('aws ecr get-login-password')" <<< "$OUT" \
-  && echo "  ok: normal region+account" || { echo "  FAIL: invalid JSON for normal case"; exit 1; }
-# Injection: region with double-quote
-OUT=$(python3 -c "$PYCODE" 'us"-east-1' '153263036946')
-python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert c" <<< "$OUT" \
-  && echo "  ok: region with quote injection → valid JSON" || { echo "  FAIL"; exit 1; }
-# Injection: account with double-quote
-OUT=$(python3 -c "$PYCODE" 'us-east-1' '15"326"3036946')
-python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert c" <<< "$OUT" \
-  && echo "  ok: account with quote injection → valid JSON" || { echo "  FAIL"; exit 1; }
-# No double-encoding: region appears as literal 'us-east-1' in command string
-OUT=$(python3 -c "$PYCODE" 'us-east-1' '153263036946')
-python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert 'us-east-1' in c" <<< "$OUT" \
-  && echo "  ok: no double-encoding in command string" || { echo "  FAIL"; exit 1; }
-# ─────────────────────────────────────────────────────────────────────────────
-
-printf '\n────────────────────────────────────\n'
-if [[ $FAIL -eq 0 ]]; then
-  printf 'All %d tests passed.\n' "$PASS"
-  exit 0
-else
-  printf '%d passed, %d failed.\n' "$PASS" "$FAIL"
-  printf 'Failed tests:\n'
-  for n in "${FAIL_NAMES[@]}"; do printf '  - %s\n' "$n"; done
-  exit 1
-fi
@@ -1,361 +0,0 @@
-"""Tests for `.gitea/scripts/lint_bp_context_emit_match.py` — Tier 2f lint.
-
-Structural enforcement of internal#350 Tier 2f: BP `status_check_contexts`
-and the set of contexts emitted by `.gitea/workflows/*.yml` must agree.
-
-Bidirectional rule:
-  (a) BP-only: every context in `branch_protections/<branch>.status_check_contexts`
-      must have at least one EMITTER — a workflow `name:` + job `name:` (or job key)
-      + `pull_request` (or `push`) event that produces it. A BP context without
-      an emitter blocks merges forever (Gitea treats absent-as-pending, NOT
-      absent-as-skipped). This is the phantom-required-check class
-      (`feedback_phantom_required_check_after_gitea_migration`).
-
-  (b) EMITTER-only: NO automatic flag. The PR#656 case (workflow added a
-      sentinel context not yet in BP) is Tier 2g's job — a diff-based PR-time
-      lint. Tier 2f runs scheduled and would falsely flag every transitional
-      state during a BP rollout. We only flag the BP-empty case in this
-      direction as a NOTICE (informational), not as an error.
-
-Tier 2f runs on a daily schedule + workflow_dispatch and files a
-`[ci-bp-drift]`-tagged issue on mismatch.
-
-Test classes (per `feedback_branch_count_before_approving`):
-
-  - test_perfect_match_passes              — BP has [X]; workflows emit X.
-    Exit 0. No issue filed/edited.
-  - test_bp_orphan_context_fails           — BP has [Y] but no workflow
-    emits Y. Exit 1. Issue body lists the orphan and the closest
-    candidate workflow names (Levenshtein-1 suggestion for typos).
-  - test_emitter_orphan_only_warns         — workflow emits Z but BP
-    doesn't have it. Exit 0 with ::notice:: (NOT ::error::) because
-    Tier 2g handles this at PR time.
-  - test_multiple_orphans_aggregated       — two BP orphans surfaced
-    together, not short-circuited.
-  - test_bp_empty_lints_nothing            — BP has no contexts.
-    Exit 0 cleanly.
-  - test_api_403_skips_gracefully          — branch_protections endpoint
-    403s (token-scope). Exit 0 with ::error::, do NOT red-X.
-  - test_api_404_skips_gracefully          — branch has no protection.
-    Exit 0 cleanly.
-  - test_context_event_match_required      — BP context says `(push)` and
-    workflow only emits on `pull_request`. That's NOT a match — the
-    BP-required gate would still wedge. Exit 1.
-  - test_workflow_event_mapping_pull_request_target — `pull_request_target`
-    in workflow `on:` emits a `(pull_request)` context (Gitea convention).
-    Match counts.
-  - test_idempotent_issue_filing           — when an issue already exists
-    with the canonical title prefix, edit it instead of POSTing a new one
-    (idempotency contract — mirrors ci-required-drift).
-
-Run:
-    python3 -m pytest tests/test_lint_bp_context_emit_match.py -v
-"""
-from __future__ import annotations
-
-import importlib.util
-import os
-import sys
-from pathlib import Path
-from unittest import mock
-
-import pytest
-
-
-SCRIPT_PATH = (
-    Path(__file__).resolve().parent.parent
-    / ".gitea"
-    / "scripts"
-    / "lint_bp_context_emit_match.py"
-)
-
-
-def _import_lint():
-    spec = importlib.util.spec_from_file_location(
-        f"lint_bp_emit_{os.getpid()}", SCRIPT_PATH
-    )
-    m = importlib.util.module_from_spec(spec)
-    spec.loader.exec_module(m)
-    return m
-
-
-@pytest.fixture()
-def envset(tmp_path, monkeypatch):
-    wf = tmp_path / ".gitea" / "workflows"
-    wf.mkdir(parents=True)
-    monkeypatch.setenv("WORKFLOWS_DIR", str(wf))
-    monkeypatch.setenv("GITEA_TOKEN", "stub")
-    monkeypatch.setenv("GITEA_HOST", "git.example.test")
-    monkeypatch.setenv("REPO", "owner/molecule-core")
-    monkeypatch.setenv("BRANCH", "main")
-    monkeypatch.setenv("DRIFT_LABEL", "ci-bp-drift")
-    return wf
-
-
-def _write_wf(d: Path, name: str, content: str) -> Path:
-    p = d / name
-    p.write_text(content)
-    return p
-
-
-def _stub_api(monkeypatch, lint_mod, bp_response, issue_search_response=None, posted_record=None):
-    """Stub the module's `api` function.
-
-    bp_response: ("ok", {"status_check_contexts": [...]})
-                 or ("forbidden", None) / ("not_found", None)
-    issue_search_response: list of issues matching the search query (
-                           may be empty; default empty)
-    posted_record: dict in which to record any POST/PATCH calls made
-                   (so tests can assert idempotency).
-    """
-    if issue_search_response is None:
-        issue_search_response = []
-    if posted_record is None:
-        posted_record = {}
-
-    def fake_api(method, path, *, body=None, query=None):
-        if "branch_protections" in path:
-            return bp_response
-        if "issues/search" in path or "/issues?" in path or path.endswith("/issues"):
-            if method == "GET":
-                return ("ok", list(issue_search_response))
-            if method == "POST":
-                posted_record.setdefault("posts", []).append({"path": path, "body": body})
-                return ("ok", {"number": 9001, "html_url": "http://t/9001"})
-        if "/issues/" in path and method == "PATCH":
-            posted_record.setdefault("patches", []).append({"path": path, "body": body})
-            return ("ok", {"number": 9001})
-        if "/labels" in path:
-            return ("ok", [{"id": 10, "name": "ci-bp-drift"}, {"id": 9, "name": "tier:high"}])
-        return ("ok", {})
-
-    monkeypatch.setattr(lint_mod, "api", fake_api)
-    return posted_record
-
-
-# ---------------------------------------------------------------------------
-# Perfect match — both sides agree.
-# ---------------------------------------------------------------------------
-def test_perfect_match_passes(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": ["CI / all-required (pull_request)"]}),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# BP-only orphan — context with no emitter.
-# ---------------------------------------------------------------------------
-def test_bp_orphan_context_fails(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    posted = _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": [
-            "CI / all-required (pull_request)",
-            "Ghost workflow / ghost (pull_request)",  # the orphan
-        ]}),
-    )
-    rc = m.run()
-    assert rc == 1
-    out = capsys.readouterr().out
-    assert "Ghost workflow" in out or "ghost" in out.lower()
-
-
-# ---------------------------------------------------------------------------
-# Emitter-only direction → notice, not error (Tier 2g territory).
-# ---------------------------------------------------------------------------
-def test_emitter_orphan_only_warns(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "extra.yml",
-        "name: Extra\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  extra-job:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": ["CI / all-required (pull_request)"]}),
-    )
-    rc = m.run()
-    assert rc == 0
-    out = capsys.readouterr().out
-    assert "Extra" in out or "extra" in out
-
-
-# ---------------------------------------------------------------------------
-# Multiple BP orphans — all surfaced.
-# ---------------------------------------------------------------------------
-def test_multiple_orphans_aggregated(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": [
-            "CI / all-required (pull_request)",
-            "Phantom A / a (pull_request)",
-            "Phantom B / b (pull_request)",
-        ]}),
-    )
-    rc = m.run()
-    assert rc == 1
-    out = capsys.readouterr().out
-    assert "Phantom A" in out and "Phantom B" in out
-
-
-# ---------------------------------------------------------------------------
-# BP has zero contexts → nothing to lint, pass.
-# ---------------------------------------------------------------------------
-def test_bp_empty_lints_nothing(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(monkeypatch, m, ("ok", {"status_check_contexts": []}))
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# API 403 — graceful-degrade.
-# ---------------------------------------------------------------------------
-def test_api_403_skips_gracefully(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  j:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(monkeypatch, m, ("forbidden", None))
-    rc = m.run()
-    assert rc == 0
-    err = capsys.readouterr().err
-    assert "403" in err or "scope" in err.lower() or "token" in err.lower()
-
-
-# ---------------------------------------------------------------------------
-# API 404 — branch has no protection → clean exit.
-# ---------------------------------------------------------------------------
-def test_api_404_skips_gracefully(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  j:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(monkeypatch, m, ("not_found", None))
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# Event-suffix match strict: BP says (push), workflow emits (pull_request)
-# only. Mismatch — flag.
-# ---------------------------------------------------------------------------
-def test_context_event_match_required(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": ["CI / all-required (push)"]}),
-    )
-    rc = m.run()
-    assert rc == 1
-
-
-# ---------------------------------------------------------------------------
-# `pull_request_target` in workflow `on:` emits a `(pull_request)` context
-# (Gitea convention — verified empirically on molecule-core).
-# ---------------------------------------------------------------------------
-def test_workflow_event_mapping_pull_request_target(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "secret.yml",
-        "name: Secret scan\non:\n  pull_request_target:\n    branches: [main]\njobs:\n"
-        "  scan:\n    runs-on: x\n    name: Scan diff for credential-shaped strings\n"
-        "    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": [
-            "Secret scan / Scan diff for credential-shaped strings (pull_request)",
-        ]}),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# Idempotency — existing open issue is PATCHed, not duplicated.
-# ---------------------------------------------------------------------------
-def test_idempotent_issue_filing(envset, monkeypatch, capsys):
-    _write_wf(
-        envset,
-        "ci.yml",
-        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
-        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
-    )
-    m = _import_lint()
-    posted = _stub_api(
-        monkeypatch,
-        m,
-        ("ok", {"status_check_contexts": [
-            "CI / all-required (pull_request)",
-            "Ghost / g (pull_request)",
-        ]}),
-        issue_search_response=[
-            {
-                "number": 4242,
-                "title": "[ci-bp-drift] owner/molecule-core/main: BP→emitter mismatch",
-                "state": "open",
-                "html_url": "http://t/4242",
-            }
-        ],
-    )
-    rc = m.run()
-    assert rc == 1
-    # Should have PATCHed, not POSTed a new one.
-    assert posted.get("patches"), f"expected PATCH on existing issue; got {posted!r}"
-    assert not posted.get("posts"), f"expected no POSTs; got {posted!r}"
@@ -1,88 +0,0 @@
-"""Tests for `.gitea/scripts/lint-curl-status-capture.py`.
-
-Run:
-    python3 -m pytest tests/test_lint_curl_status_capture.py -v
-"""
-from __future__ import annotations
-
-import importlib.util
-from pathlib import Path
-
-
-SCRIPT_PATH = (
-    Path(__file__).resolve().parent.parent
-    / ".gitea"
-    / "scripts"
-    / "lint-curl-status-capture.py"
-)
-
-
-def _load_module():
-    spec = importlib.util.spec_from_file_location("lint_curl_status_capture", SCRIPT_PATH)
-    module = importlib.util.module_from_spec(spec)
-    spec.loader.exec_module(module)
-    return module
-
-
-def test_finds_quoted_echo_fallback_pollution():
-    lint = _load_module()
-    content = """
-    HTTP_CODE=$(curl -sS -o /tmp/body -w "%{http_code}" https://example.test || echo "000")
-    """
-
-    findings = lint.scan_content("workflow.yml", content)
-
-    assert len(findings) == 1
-    assert "echo" in findings[0].snippet
-
-
-def test_finds_unquoted_echo_fallback_pollution():
-    lint = _load_module()
-    content = """
-    HTTP_CODE=$(curl -sS -o /tmp/body -w '%{http_code}' https://example.test || echo 000)
-    """
-
-    findings = lint.scan_content("workflow.yml", content)
-
-    assert len(findings) == 1
-    assert "echo" in findings[0].snippet
-
-
-def test_finds_printf_fallback_pollution():
-    lint = _load_module()
-    content = """
-    HTTP_CODE=$(curl -sS -o /tmp/body -w '%{http_code}' https://example.test || printf '000')
-    """
-
-    findings = lint.scan_content("workflow.yml", content)
-
-    assert len(findings) == 1
-    assert "printf" in findings[0].snippet
-
-
-def test_ignores_tempfile_fallback_after_curl():
-    lint = _load_module()
-    content = """
-    set +e
-    curl -sS -o /tmp/body -w '%{http_code}' https://example.test >/tmp/code
-    rc=$?
-    set -e
-    HTTP_CODE=$(cat /tmp/code 2>/dev/null || echo "000")
-    [ -z "$HTTP_CODE" ] && HTTP_CODE="000"
-    """
-
-    assert lint.scan_content("workflow.yml", content) == []
-
-
-def test_collapses_bash_line_continuations():
-    lint = _load_module()
-    content = """
-    HTTP_CODE=$(curl -sS -o /tmp/body \\
-      -w "%{http_code}" \\
-      https://example.test \\
-      || echo "000")
-    """
-
-    findings = lint.scan_content("workflow.yml", content)
-
-    assert len(findings) == 1
@@ -1,430 +0,0 @@
-"""Tests for `.gitea/scripts/lint_required_context_exists_in_bp.py` — Tier 2g lint.
-
-Structural enforcement of internal#350 Tier 2g: when a PR adds a NEW
-commit-status emission (a workflow's `name:` + a new job-key/name pair
-that didn't exist on the base side), the PR must EITHER:
-
-  (a) Include a `# bp-required: yes` directive comment on the workflow
-      AND the new context must already be in
-      `branch_protections/<branch>.status_check_contexts`, OR
-
-  (b) Include a `# bp-required: pending #NNN` directive (acknowledged
-      asymmetry with a tracking issue), OR
-
-  (c) Include a `# bp-exempt: <reason>` directive (informational job,
-      not intended to be a required gate).
-
-Default (no directive on a new emitter) = FAIL.
-
-The class this prevents
-----------------------
-PR#656 added `CI / all-required (pull_request)` as a sentinel context
-that workflows emit, but BP did NOT list it — so when `platform-build`
-failed, `all-required` failed, but BP let the PR merge anyway. Cascade
-to mc#664. With Tier 2g, PR#656 would have been blocked until either
-the BP PATCH ran alongside OR the author marked the emission with a
-`bp-required: pending #NNN` directive.
-
-Test classes (per `feedback_branch_count_before_approving`):
-
-  - test_no_new_emissions_skips                   — diff doesn't add any
-    new emitter; pass.
-  - test_new_emission_with_bp_required_yes_in_bp  — directive set AND
-    BP lists the context; pass.
-  - test_new_emission_with_bp_required_yes_not_in_bp — directive set
-    BUT BP doesn't list; fail.
-  - test_new_emission_with_bp_required_pending    — `# bp-required:
-    pending #800` directive references an open tracker; pass.
-  - test_new_emission_with_bp_exempt              — `# bp-exempt:
-    informational` directive; pass.
-  - test_new_emission_no_directive_fails          — no directive on a
-    new emission; fail with the 3-option fix-hint.
-  - test_modified_workflow_with_new_job_is_new    — pre-existing
-    workflow gains a new job with a new name → counted as new
-    emission. Apply rule.
-  - test_modified_workflow_job_renamed_is_new     — same workflow,
-    same job-key, but job `name:` changed → counted as new emission
-    (the OLD context name disappears; the NEW one needs validation).
-  - test_unrelated_workflow_edit_is_not_new       — edit a comment in
-    an existing emitter; no new context introduced; pass.
-  - test_api_403_skips_gracefully                 — BP read 403; exit 0
-    with stderr ::error::.
-  - test_directive_must_be_in_workflow_yml        — directive in PR
-    body alone is NOT sufficient; the comment must live in the
-    workflow file so future scheduled Tier 2f runs can see it.
-
-Run:
-    python3 -m pytest tests/test_lint_required_context_exists_in_bp.py -v
-"""
-from __future__ import annotations
-
-import importlib.util
-import os
-import subprocess
-import sys
-from pathlib import Path
-from unittest import mock
-
-import pytest
-
-
-SCRIPT_PATH = (
-    Path(__file__).resolve().parent.parent
-    / ".gitea"
-    / "scripts"
-    / "lint_required_context_exists_in_bp.py"
-)
-
-
-def _import_lint():
-    spec = importlib.util.spec_from_file_location(
-        f"lint_required_ctx_in_bp_{os.getpid()}", SCRIPT_PATH
-    )
-    m = importlib.util.module_from_spec(spec)
-    spec.loader.exec_module(m)
-    return m
-
-
-# Sample workflows used across multiple tests.
-WF_CI_BASE = """name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    steps:
-      - run: echo hi
-"""
-
-# CI with a new job added.
-WF_CI_NEW_JOB = """name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    steps:
-      - run: echo hi
-  brand-new:
-    runs-on: x
-    steps:
-      - run: echo new
-"""
-
-WF_CI_NEW_JOB_BP_YES = """name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    steps:
-      - run: echo hi
-  # bp-required: yes
-  brand-new:
-    runs-on: x
-    steps:
-      - run: echo new
-"""
-
-WF_CI_NEW_JOB_BP_PENDING = """name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    steps:
-      - run: echo hi
-  # bp-required: pending #800
-  brand-new:
-    runs-on: x
-    steps:
-      - run: echo new
-"""
-
-WF_CI_NEW_JOB_BP_EXEMPT = """name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    steps:
-      - run: echo hi
-  # bp-exempt: informational sticker, not a gate
-  brand-new:
-    runs-on: x
-    steps:
-      - run: echo new
-"""
-
-# Same WF, job rename only (CI/all-required → CI/sentinel).
-WF_CI_JOB_RENAMED = """name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    name: sentinel
-    steps:
-      - run: echo hi
-"""
-
-# Comment-only edit — should NOT count as new emission.
-WF_CI_COMMENT_ONLY = """# a fresh comment line
-name: CI
-on:
-  pull_request:
-    branches: [main]
-jobs:
-  all-required:
-    runs-on: x
-    steps:
-      - run: echo hi
-"""
-
-
-def _stub_git_and_api(
-    monkeypatch,
-    lint_mod,
-    base_files: dict[str, str | None],
-    head_files: dict[str, str | None],
-    bp_response,
-):
-    """Stub `subprocess.run` for git, and `lint_mod.api` for HTTP."""
-
-    def fake_run(cmd, *args, **kwargs):
-        if not isinstance(cmd, list):
-            raise AssertionError(f"unexpected cmd: {cmd!r}")
-        if cmd[:2] == ["git", "show"] and ":" in cmd[2]:
-            sha, path = cmd[2].split(":", 1)
-            side = base_files if "base" in sha else head_files
-            content = side.get(path)
-            if content is None:
-                return subprocess.CompletedProcess(cmd, 128, "", "fatal: path not in tree")
-            return subprocess.CompletedProcess(cmd, 0, content, "")
-        if cmd[:2] == ["git", "diff"]:
-            # Names of files that changed (any side has differing contents
-            # from the other, or only appears on one side).
-            all_paths = set(base_files) | set(head_files)
-            changed = sorted(p for p in all_paths if base_files.get(p) != head_files.get(p))
-            return subprocess.CompletedProcess(cmd, 0, "\n".join(changed) + "\n", "")
-        raise AssertionError(f"unexpected cmd: {cmd!r}")
-
-    monkeypatch.setattr(subprocess, "run", fake_run)
-
-    def fake_api(method, path, *, body=None, query=None):
-        if "branch_protections" in path:
-            return bp_response
-        return ("ok", {})
-
-    monkeypatch.setattr(lint_mod, "api", fake_api)
-
-
-@pytest.fixture()
-def env(monkeypatch):
-    monkeypatch.setenv("BASE_SHA", "base-x")
-    monkeypatch.setenv("HEAD_SHA", "head-x")
-    monkeypatch.setenv("GITEA_TOKEN", "stub")
-    monkeypatch.setenv("GITEA_HOST", "git.example.test")
-    monkeypatch.setenv("REPO", "owner/molecule-core")
-    monkeypatch.setenv("BRANCH", "main")
-    monkeypatch.setenv("WORKFLOWS_DIR", ".gitea/workflows")
-    return monkeypatch
-
-
-# ---------------------------------------------------------------------------
-# No new emissions — pass.
-# ---------------------------------------------------------------------------
-def test_no_new_emissions_skips(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# New emission + bp-required: yes + in BP → pass.
-# ---------------------------------------------------------------------------
-def test_new_emission_with_bp_required_yes_in_bp(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_YES},
-        bp_response=(
-            "ok",
-            {"status_check_contexts": ["CI / brand-new (pull_request)"]},
-        ),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# bp-required: yes but NOT in BP → fail.
-# ---------------------------------------------------------------------------
-def test_new_emission_with_bp_required_yes_not_in_bp(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_YES},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 1
-    out = capsys.readouterr().out
-    assert "brand-new" in out
-
-
-# ---------------------------------------------------------------------------
-# bp-required: pending #NNN → pass.
-# ---------------------------------------------------------------------------
-def test_new_emission_with_bp_required_pending(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_PENDING},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# bp-exempt → pass.
-# ---------------------------------------------------------------------------
-def test_new_emission_with_bp_exempt(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_EXEMPT},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# New emission, no directive → fail with 3-option fix hint.
-# ---------------------------------------------------------------------------
-def test_new_emission_no_directive_fails(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 1
-    out = capsys.readouterr().out
-    assert "brand-new" in out
-    assert "bp-required" in out
-    assert "bp-exempt" in out
-
-
-# ---------------------------------------------------------------------------
-# Pre-existing workflow gains a new job → counted as new emission.
-# ---------------------------------------------------------------------------
-def test_modified_workflow_with_new_job_is_new(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    # No directive → fail
-    assert rc == 1
-
-
-# ---------------------------------------------------------------------------
-# Same workflow, same job-key, but job `name:` changed → new context.
-# ---------------------------------------------------------------------------
-def test_modified_workflow_job_renamed_is_new(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_JOB_RENAMED},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 1
-    out = capsys.readouterr().out
-    assert "sentinel" in out
-
-
-# ---------------------------------------------------------------------------
-# Comment-only edit → no new emission.
-# ---------------------------------------------------------------------------
-def test_unrelated_workflow_edit_is_not_new(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_COMMENT_ONLY},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    assert rc == 0
-
-
-# ---------------------------------------------------------------------------
-# BP API 403 → exit 0 with ::error::.
-# ---------------------------------------------------------------------------
-def test_api_403_skips_gracefully(env, monkeypatch, capsys):
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
-        bp_response=("forbidden", None),
-    )
-    rc = m.run()
-    assert rc == 0
-    err = capsys.readouterr().err
-    assert "403" in err or "scope" in err.lower() or "token" in err.lower()
-
-
-# ---------------------------------------------------------------------------
-# Directive must be in the workflow YML, not PR body.
-# ---------------------------------------------------------------------------
-def test_directive_must_be_in_workflow_yml(env, monkeypatch, capsys):
-    monkeypatch = env
-    monkeypatch.setenv("PR_BODY", "bp-required: yes — see comment above")
-    m = _import_lint()
-    _stub_git_and_api(
-        monkeypatch,
-        m,
-        base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
-        head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
-        bp_response=("ok", {"status_check_contexts": []}),
-    )
-    rc = m.run()
-    # Even though PR body claims, the workflow itself lacks the directive.
-    assert rc == 1
@@ -35,22 +35,7 @@ RUN CGO_ENABLED=0 GOOS=linux go build \
    -o /memory-plugin ./cmd/memory-plugin-postgres

 FROM alpine:3.20@sha256:c64c687cbea9300178b30c95835354e34c4e4febc4badfe27102879de0483b5e
-# docker-cli is required by internal/provisioner/localbuild.go which
-# shells out via exec.Command("docker", "image", "inspect"/"build"/"tag", ...)
-# whenever Resolve().Mode == RegistryModeLocal — which is the permanent
-# mode post-2026-05-06 (Molecule-AI GitHub org suspended → GHCR
-# unreachable → MOLECULE_IMAGE_REGISTRY unset → registry_mode.go falls
-# through to RegistryModeLocal). Without docker-cli here the platform
-# fails every workspace re-provision with `local-build: image inspect
-# for molecule-local/workspace-template-<runtime>:<sha> failed
-# (exec: "docker": executable file not found in $PATH)` and the
-# workspace stays status=failed. The Docker SOCKET is already mounted
-# (entrypoint.sh adds the platform user to the docker group) — only
-# the CLI binary was missing. Caught after sdk-lead + CP-QA went down
-# this way during the MiniMax-switch attempt + after-Class-A audit.
-# Related: Task #194 / Issue #63 (local-build path added);
-# `feedback_workspace_image_ghcr_dead`.
-RUN apk add --no-cache ca-certificates docker-cli git tzdata wget
+RUN apk add --no-cache ca-certificates git tzdata wget
 COPY --from=builder /platform /platform
 COPY --from=builder /memory-plugin /memory-plugin
 COPY workspace-server/migrations /migrations
@@ -501,18 +501,8 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
 		// to correctly route delivery-confirmed responses (where the agent completed
 		// the work but the TCP connection dropped before the full body was received)
 		// to success instead of failure (#159).
-		//
-		// For non-2xx responses (server explicitly rejected with 3xx+), preserve
-		// resp.StatusCode in the proxyA2AError.Status so isTransientProxyError
-		// returns false — a server-authored rejection is not a transient transport
-		// error and must not be retried. Only 2xx body-read errors keep Status=502
-		// (the agent completed work but the TCP layer dropped the response).
-		errStatus := http.StatusBadGateway
-		if resp.StatusCode >= 300 {
-			errStatus = resp.StatusCode
-		}
 		return resp.StatusCode, respBody, &proxyA2AError{
-			Status: errStatus,
+			Status: http.StatusBadGateway,
 			Response: gin.H{
 				"error":              "failed to read agent response",
 				"delivery_confirmed": deliveryConfirmed,
@@ -520,21 +510,6 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
 		}
 	}

-	// 2xx with empty body: the agent completed the request but returned no content.
-	// An A2A agent must always return a JSON body; empty means the agent is
-	// broken or the connection closed before any body bytes were written.
-	// Return a proxyA2AError so executeDelegation routes this to failure rather
-	// than silently marking it as completed with a nil body.
-	// logA2ASuccess is intentionally NOT called here — delivery was not confirmed.
-	if resp.StatusCode >= 200 && resp.StatusCode < 300 && len(respBody) == 0 {
-		log.Printf("ProxyA2A: agent %s returned %d with empty body — treating as failure",
-			workspaceID, resp.StatusCode)
-		return resp.StatusCode, respBody, &proxyA2AError{
-			Status:   resp.StatusCode,
-			Response: gin.H{"error": "agent returned empty response body"},
-		}
-	}
-
 	if logActivity {
 		h.logA2ASuccess(ctx, workspaceID, callerID, body, respBody, a2aMethod, resp.StatusCode, durationMs)
 	}
@@ -410,7 +410,7 @@ func extractToolTrace(respBody []byte) json.RawMessage {
 		return nil
 	}
 	trace, ok := meta["tool_trace"]
-	if !ok || len(trace) == 0 || string(trace) == "null" || string(trace) == "[]" {
+	if !ok || len(trace) == 0 {
 		return nil
 	}
 	return trace
@@ -1,243 +0,0 @@
-package handlers
-
-import (
-	"encoding/json"
-	"testing"
-)
-
-// ─────────────────────────────────────────────────────────────────────────────
-// nilIfEmpty tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestNilIfEmpty_EmptyString(t *testing.T) {
-	got := nilIfEmpty("")
-	if got != nil {
-		t.Errorf("empty string: got %p, want nil", got)
-	}
-}
-
-func TestNilIfEmpty_NonEmptyString(t *testing.T) {
-	s := "hello"
-	got := nilIfEmpty(s)
-	if got == nil {
-		t.Fatal("non-empty string: got nil, want pointer")
-	}
-	if *got != "hello" {
-		t.Errorf("non-empty string: got %q, want %q", *got, "hello")
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// extractToolTrace tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestExtractToolTrace_EmptyBody(t *testing.T) {
-	got := extractToolTrace(nil)
-	if got != nil {
-		t.Errorf("nil body: got %v, want nil", got)
-	}
-	got = extractToolTrace([]byte{})
-	if got != nil {
-		t.Errorf("empty body: got %v, want nil", got)
-	}
-}
-
-func TestExtractToolTrace_InvalidJSON(t *testing.T) {
-	got := extractToolTrace([]byte("not json"))
-	if got != nil {
-		t.Errorf("invalid JSON: got %v, want nil", got)
-	}
-}
-
-func TestExtractToolTrace_NoResultKey(t *testing.T) {
-	got := extractToolTrace([]byte(`{"error": "oops"}`))
-	if got != nil {
-		t.Errorf("no result key: got %v, want nil", got)
-	}
-}
-
-func TestExtractToolTrace_NoMetadataKey(t *testing.T) {
-	got := extractToolTrace([]byte(`{"result": {"data": {}}}`))
-	if got != nil {
-		t.Errorf("no metadata key: got %v, want nil", got)
-	}
-}
-
-func TestExtractToolTrace_NoToolTraceKey(t *testing.T) {
-	got := extractToolTrace([]byte(`{"result": {"metadata": {}}}`))
-	if got != nil {
-		t.Errorf("no tool_trace key: got %v, want nil", got)
-	}
-}
-
-// extractToolTrace calls json.Unmarshal, which sets a RawMessage to nil when
-// unmarshaling a JSON null value. The fix for mc#669 changes len(trace)==0
-// to string(trace)=="[]" to avoid len(nil) panicking on null.
-func TestExtractToolTrace_NullValue(t *testing.T) {
-	// JSON null in tool_trace → RawMessage becomes nil → len would panic.
-	// The fix checks string(trace)=="[]" which is safe on nil (returns false).
-	body := []byte(`{"result": {"metadata": {"tool_trace": null}}}`)
-	got := extractToolTrace(body)
-	if got != nil {
-		t.Errorf("null tool_trace: got %v, want nil", got)
-	}
-}
-
-// "[]" unmarshaled into RawMessage is []byte("[]") — not nil, len=2.
-// The fix returns nil for [] so empty tool_trace arrays don't surface as traces.
-func TestExtractToolTrace_EmptyArray(t *testing.T) {
-	body := []byte(`{"result": {"metadata": {"tool_trace": []}}}`)
-	got := extractToolTrace(body)
-	if got != nil {
-		t.Errorf("empty array tool_trace: got %v, want nil", got)
-	}
-}
-
-func TestExtractToolTrace_ValidNonEmpty(t *testing.T) {
-	trace := []byte(`[{"name":"search","result":"done"}]`)
-	body, _ := json.Marshal(map[string]interface{}{
-		"result": map[string]interface{}{
-			"metadata": map[string]interface{}{
-				"tool_trace": json.RawMessage(trace),
-			},
-		},
-	})
-	got := extractToolTrace(body)
-	if got == nil {
-		t.Fatal("valid non-empty trace: got nil, want the trace")
-	}
-	if string(got) != string(trace) {
-		t.Errorf("valid trace: got %s, want %s", got, trace)
-	}
-}
-
-// Document that the CURRENT code (len check) panics on null tool_trace.
-// This test exists to signal when PR #669's fix lands: after the fix,
-// the defer-recover will NOT trigger (panic goes away) and the
-// post-recover assertion runs. While unfixed: the panic fires and
-
-// ─────────────────────────────────────────────────────────────────────────────
-// readUsageMap tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestReadUsageMap_NoUsageKey(t *testing.T) {
-	m := map[string]json.RawMessage{}
-	_, _, ok := readUsageMap(m)
-	if ok {
-		t.Error("no usage key: ok should be false")
-	}
-}
-
-func TestReadUsageMap_InvalidUsageJSON(t *testing.T) {
-	m := map[string]json.RawMessage{"usage": json.RawMessage(`"not an object"`)}
-	_, _, ok := readUsageMap(m)
-	if ok {
-		t.Error("invalid usage JSON: ok should be false")
-	}
-}
-
-func TestReadUsageMap_ZeroUsage(t *testing.T) {
-	m := map[string]json.RawMessage{"usage": json.RawMessage(`{"input_tokens": 0, "output_tokens": 0}`)}
-	_, _, ok := readUsageMap(m)
-	if ok {
-		t.Error("zero usage: ok should be false")
-	}
-}
-
-func TestReadUsageMap_InputOnly(t *testing.T) {
-	m := map[string]json.RawMessage{"usage": json.RawMessage(`{"input_tokens": 100, "output_tokens": 0}`)}
-	in, out, ok := readUsageMap(m)
-	if !ok {
-		t.Fatal("input-only usage: ok should be true")
-	}
-	if in != 100 {
-		t.Errorf("input tokens: got %d, want 100", in)
-	}
-	if out != 0 {
-		t.Errorf("output tokens: got %d, want 0", out)
-	}
-}
-
-func TestReadUsageMap_BothTokens(t *testing.T) {
-	m := map[string]json.RawMessage{"usage": json.RawMessage(`{"input_tokens": 500, "output_tokens": 200}`)}
-	in, out, ok := readUsageMap(m)
-	if !ok {
-		t.Fatal("both tokens: ok should be true")
-	}
-	if in != 500 || out != 200 {
-		t.Errorf("tokens: got (%d, %d), want (500, 200)", in, out)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// parseUsageFromA2AResponse tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestParseUsageFromA2AResponse_Empty(t *testing.T) {
-	in, out := parseUsageFromA2AResponse(nil)
-	if in != 0 || out != 0 {
-		t.Errorf("nil: got (%d, %d), want (0, 0)", in, out)
-	}
-	in, out = parseUsageFromA2AResponse([]byte{})
-	if in != 0 || out != 0 {
-		t.Errorf("empty: got (%d, %d), want (0, 0)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_InvalidJSON(t *testing.T) {
-	in, out := parseUsageFromA2AResponse([]byte("not json"))
-	if in != 0 || out != 0 {
-		t.Errorf("invalid JSON: got (%d, %d), want (0, 0)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_NoResultNoUsage(t *testing.T) {
-	in, out := parseUsageFromA2AResponse([]byte(`{"id": 1}`))
-	if in != 0 || out != 0 {
-		t.Errorf("no result/usage: got (%d, %d), want (0, 0)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_ResultUsage(t *testing.T) {
-	body := []byte(`{"result": {"usage": {"input_tokens": 42, "output_tokens": 7}}}`)
-	in, out := parseUsageFromA2AResponse(body)
-	if in != 42 || out != 7 {
-		t.Errorf("result usage: got (%d, %d), want (42, 7)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_ResultUsageWinsOverTopLevel(t *testing.T) {
-	// JSON-RPC result.usage takes precedence over top-level usage.
-	body := []byte(`{"result": {"usage": {"input_tokens": 42, "output_tokens": 7}}, "usage": {"input_tokens": 99, "output_tokens": 99}}`)
-	in, out := parseUsageFromA2AResponse(body)
-	if in != 42 || out != 7 {
-		t.Errorf("result usage should win: got (%d, %d), want (42, 7)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_TopLevelFallback(t *testing.T) {
-	// Direct (non-JSON-RPC) response: usage at top level.
-	body := []byte(`{"usage": {"input_tokens": 11, "output_tokens": 13}}`)
-	in, out := parseUsageFromA2AResponse(body)
-	if in != 11 || out != 13 {
-		t.Errorf("top-level usage: got (%d, %d), want (11, 13)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_ZeroValuesInResult(t *testing.T) {
-	// Zero usage in result.result.usage: returns (0, 0) — no panic.
-	body := []byte(`{"result": {"usage": {"input_tokens": 0, "output_tokens": 0}}}`)
-	in, out := parseUsageFromA2AResponse(body)
-	if in != 0 || out != 0 {
-		t.Errorf("zero usage: got (%d, %d), want (0, 0)", in, out)
-	}
-}
-
-func TestParseUsageFromA2AResponse_MissingTokensInUsageObject(t *testing.T) {
-	// usage object exists but tokens are absent — returns (0, 0).
-	body := []byte(`{"result": {"usage": {"other_field": 5}}}`)
-	in, out := parseUsageFromA2AResponse(body)
-	if in != 0 || out != 0 {
-		t.Errorf("missing tokens: got (%d, %d), want (0, 0)", in, out)
-	}
-}
@@ -7,7 +7,6 @@ import (
 	"go/parser"
 	"go/token"
 	"testing"
-	"time"

 	"github.com/DATA-DOG/go-sqlmock"
 	"github.com/Molecule-AI/molecule-monorepo/platform/internal/models"
@@ -72,8 +71,6 @@ func TestPreflight_ContainerRunning_ReturnsNil(t *testing.T) {
 // triggers the offline-flip + WORKSPACE_OFFLINE broadcast + async restart.
 // This is the load-bearing case — saves the caller 2-30s of network timeout.
 func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {
-	const wsID = "ws-dead-456"
-	resetRestartStatesFor(wsID)
 	mock := setupTestDB(t)
 	_ = setupTestRedis(t)
 	stub := &preflightLocalProv{running: false, err: nil}
@@ -82,14 +79,14 @@ func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {

 	// Expect the offline-flip UPDATE.
 	mock.ExpectExec(`UPDATE workspaces SET status =`).
-		WithArgs(models.StatusOffline, wsID).
+		WithArgs(models.StatusOffline, "ws-dead-456").
 		WillReturnResult(sqlmock.NewResult(0, 1))
 	// Broadcaster's INSERT INTO structure_events fires too — best-effort
 	// log entry for the WORKSPACE_OFFLINE event. Match permissively.
 	mock.ExpectExec(`INSERT INTO structure_events`).
 		WillReturnResult(sqlmock.NewResult(0, 1))

-	proxyErr := h.preflightContainerHealth(context.Background(), wsID)
+	proxyErr := h.preflightContainerHealth(context.Background(), "ws-dead-456")
 	if proxyErr == nil {
 		t.Fatal("preflight should return *proxyA2AError when container not running")
 	}
@@ -110,32 +107,6 @@ func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {
 	// h.broadcaster.RecordAndBroadcast call but not asserted here — the
 	// real *events.Broadcaster doesn't expose received events for inspection.
 	// The DB UPDATE expectation is sufficient to pin the offline-flip path.
-	waitRestartByIDGoroutineIdle(t, wsID)
-}
-
-func waitRestartByIDGoroutineIdle(t *testing.T, wsID string) {
-	t.Helper()
-	deadline := time.Now().Add(2 * time.Second)
-	sawState := false
-	for time.Now().Before(deadline) {
-		sv, ok := restartStates.Load(wsID)
-		if ok {
-			sawState = true
-			st := sv.(*restartState)
-			st.mu.Lock()
-			running := st.running
-			st.mu.Unlock()
-			if !running {
-				resetRestartStatesFor(wsID)
-				return
-			}
-		}
-		time.Sleep(time.Millisecond)
-	}
-	if !sawState {
-		t.Fatalf("preflight did not start RestartByID goroutine for %s", wsID)
-	}
-	t.Fatalf("RestartByID goroutine for %s did not drain before test cleanup", wsID)
 }

 // TestPreflight_TransientError_FailsSoftAsAlive — IsRunning(true,err): the
@@ -6,7 +6,6 @@ import (
 	"log"
 	"net/http"
 	"os"
-	"runtime"
 	"time"

 	"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
@@ -163,7 +162,7 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
 	})

 	// Fire-and-forget: send A2A in background goroutine
-	go h.executeDelegation(ctx, sourceID, body.TargetID, delegationID, a2aBody)
+	go h.executeDelegation(sourceID, body.TargetID, delegationID, a2aBody)

 	// Broadcast event so canvas shows delegation in real-time
 	h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationSent), sourceID, map[string]interface{}{
@@ -309,50 +308,21 @@ func insertDelegationRow(ctx context.Context, c *gin.Context, sourceID string, b
 // to land a fresh URL in the cache before we try again. Fixes #74 —
 // bulk restarts used to produce spurious "failed to reach workspace
 // agent" errors when delegations fired within the warm-up window.
-var delegationRetryDelay = 8 * time.Second
+const delegationRetryDelay = 8 * time.Second

-// NB: the log.Printf calls below are load-bearing for the integration test
-// surface (delegation_executor_integration_test.go). The test uses a raw TCP
-// mock server; without these calls the compiler inlines executeDelegation and
-// a subtle stack-sharing race between the inlined body and the test goroutine
-// causes the test to hang. The log calls prevent inlining (Go cannot inline
-// functions that call the log package). This is a known Go compiler behaviour.
-// runtime.LockOSThread() provides an additional hardening: pinning the
-// goroutine to a single OS thread eliminates any scheduler-migration races.
-// The caller provides ctx (which carries the deadline/budget); no internal
-// context.WithTimeout is created here.
-
-// executeDelegation runs the A2A dispatch for a delegation. ctx controls the
-// entire lifecycle: its timeout bounds all DB ops, proxy calls, and retries.
-// Pass context.Background() when no external deadline applies (e.g. tests).
-func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, targetID, delegationID string, a2aBody []byte) {
-	runtime.LockOSThread() // pin to thread; prevents scheduler-migration races in integration tests
+func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID string, a2aBody []byte) {
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
+	defer cancel()

 	log.Printf("Delegation %s: %s → %s (dispatched)", delegationID, sourceID, targetID)

-	log.Printf("Delegation %s: step=updating_dispatched_status", delegationID)
 	// Update status: pending → dispatched
-	h.updateDelegationStatus(ctx, sourceID, delegationID, "dispatched", "")
-	log.Printf("Delegation %s: step=broadcasting_dispatched", delegationID)
+	h.updateDelegationStatus(sourceID, delegationID, "dispatched", "")
 	h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationStatus), sourceID, map[string]interface{}{
 		"delegation_id": delegationID, "target_id": targetID, "status": "dispatched",
 	})
-	log.Printf("Delegation %s: step=proxying_a2a_request", delegationID)

 	status, respBody, proxyErr := h.workspace.proxyA2ARequest(ctx, targetID, a2aBody, sourceID, true)
-	log.Printf("Delegation %s: step=proxy_done status=%d bodyLen=%d err=%v", delegationID, status, len(respBody), proxyErr)
-
-	// When proxyA2ARequest returns an error but we have a non-empty response body
-	// with a 2xx status code, the agent completed the work successfully — the error
-	// is a delivery/transport error (e.g., connection reset after response was
-	// received). Treat as success: the response body is valid and the work is done.
-	// This check MUST run before the transient-retry gate so a delivery-confirmed
-	// partial-body 2xx response is never retried.
-	if isDeliveryConfirmedSuccess(proxyErr, status, respBody) {
-		log.Printf("Delegation %s: completed with delivery error (status=%d, respBody=%d bytes, proxyErr=%v) — treating as success",
-			delegationID, status, len(respBody), proxyErr.Error())
-		goto handleSuccess
-	}

 	// #74: one retry after the reactive URL refresh has had a chance to
 	// run. The proxyA2ARequest's health-check path on a connection error
@@ -372,10 +342,21 @@ func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, tar
 		}
 	}

+	// When proxyA2ARequest returns an error but we have a non-empty response body
+	// with a 2xx status code, the agent completed the work successfully — the error
+	// is a delivery/transport error (e.g., connection reset after response was
+	// received). Treat as success: the response body is valid and the work is done.
+	// This prevents "retry storms" where the canvas sees error + Restart-workspace
+	// suggestion even though the delegation actually completed.
+	if isDeliveryConfirmedSuccess(proxyErr, status, respBody) {
+		log.Printf("Delegation %s: completed with delivery error (status=%d, respBody=%d bytes, proxyErr=%v) — treating as success",
+			delegationID, status, len(respBody), proxyErr.Error())
+		goto handleSuccess
+	}
+
 	if proxyErr != nil {
-		log.Printf("Delegation %s: step=handling_failure err=%v", delegationID, proxyErr)
 		log.Printf("Delegation %s: failed — %s", delegationID, proxyErr.Error())
-		h.updateDelegationStatus(ctx, sourceID, delegationID, "failed", proxyErr.Error())
+		h.updateDelegationStatus(sourceID, delegationID, "failed", proxyErr.Error())

 		if _, err := db.DB.ExecContext(ctx, `
 			INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, status, error_detail)
@@ -392,27 +373,7 @@ func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, tar
 		return
 	}

-	if status >= 200 && status < 300 && len(respBody) == 0 {
-		errMsg := "workspace agent returned empty response"
-		log.Printf("Delegation %s: step=handling_failure err=%s", delegationID, errMsg)
-		h.updateDelegationStatus(ctx, sourceID, delegationID, "failed", errMsg)
-
-		if _, err := db.DB.ExecContext(ctx, `
-			INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, status, error_detail)
-			VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, 'failed', $5)
-		`, sourceID, sourceID, targetID, "Delegation failed", errMsg); err != nil {
-			log.Printf("Delegation %s: failed to insert empty-response error log: %v", delegationID, err)
-		}
-
-		h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationFailed), sourceID, map[string]interface{}{
-			"delegation_id": delegationID, "target_id": targetID, "error": errMsg,
-		})
-		pushDelegationResultToInbox(ctx, sourceID, delegationID, "failed", "", errMsg)
-		return
-	}
-
 handleSuccess:
-	log.Printf("Delegation %s: step=handle_success status=%d", delegationID, status)

 	// 202 + {queued: true} means the target was busy and the proxy
 	// enqueued the request for the next drain tick — NOT a completion.
@@ -426,7 +387,7 @@ handleSuccess:
 	// the user.
 	if status == http.StatusAccepted && isQueuedProxyResponse(respBody) {
 		log.Printf("Delegation %s: target %s busy — queued for drain", delegationID, targetID)
-		h.updateDelegationStatus(ctx, sourceID, delegationID, "queued", "")
+		h.updateDelegationStatus(sourceID, delegationID, "queued", "")
 		// Store delegation_id in response_body so DrainQueueForWorkspace's
 		// stitch step can find this row by JSON-path key after the queued
 		// dispatch eventually succeeds. Without the key, the drain finds
@@ -453,7 +414,6 @@ handleSuccess:
 	responseText := extractResponseText(respBody)
 	log.Printf("Delegation %s: completed (status=%d, %d chars)", delegationID, status, len(responseText))

-	log.Printf("Delegation %s: step=inserting_success_log", delegationID)
 	// Store success (response_body must be JSONB, include delegation_id)
 	respJSON, _ := json.Marshal(map[string]interface{}{
 		"text":          responseText,
@@ -465,7 +425,6 @@ handleSuccess:
 	`, sourceID, sourceID, targetID, "Delegation completed ("+textutil.TruncateBytes(responseText, 80)+")", string(respJSON)); err != nil {
 		log.Printf("Delegation %s: failed to insert success log: %v", delegationID, err)
 	}
-	log.Printf("Delegation %s: step=recording_ledger_completed", delegationID)

 	// RFC #2829 #318: write the ledger row with result_preview FIRST,
 	// THEN updateDelegationStatus. Order matters: SetStatus has a
@@ -475,9 +434,7 @@ handleSuccess:
 	// Caught by the local-Postgres integration test in
 	// delegation_ledger_integration_test.go.
 	recordLedgerStatus(ctx, delegationID, "completed", "", responseText)
-	log.Printf("Delegation %s: step=updating_completed_status", delegationID)
-	h.updateDelegationStatus(ctx, sourceID, delegationID, "completed", "")
-	log.Printf("Delegation %s: step=broadcasting_complete", delegationID)
+	h.updateDelegationStatus(sourceID, delegationID, "completed", "")
 	h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationComplete), sourceID, map[string]interface{}{
 		"delegation_id":    delegationID,
 		"target_id":        targetID,
@@ -485,12 +442,11 @@ handleSuccess:
 	})
 	// RFC #2829 PR-2 result-push (see UpdateStatus for rationale).
 	pushDelegationResultToInbox(ctx, sourceID, delegationID, "completed", responseText, "")
-	log.Printf("Delegation %s: step=complete", delegationID)
 }

 // updateDelegationStatus updates the status of a delegation record in activity_logs.
-// ctx is used for DB operations; caller controls the timeout/retry budget.
-func (h *DelegationHandler) updateDelegationStatus(ctx context.Context, workspaceID, delegationID, status, errorDetail string) {
+func (h *DelegationHandler) updateDelegationStatus(workspaceID, delegationID, status, errorDetail string) {
+	ctx := context.Background()
 	if _, err := db.DB.ExecContext(ctx, `
 		UPDATE activity_logs
 		SET status = $1, error_detail = CASE WHEN $2 = '' THEN error_detail ELSE $2 END
@@ -604,7 +560,7 @@ func (h *DelegationHandler) UpdateStatus(c *gin.Context) {
 		recordLedgerStatus(ctx, delegationID, "completed", "", body.ResponsePreview)
 	}

-	h.updateDelegationStatus(ctx, sourceID, delegationID, body.Status, body.Error)
+	h.updateDelegationStatus(sourceID, delegationID, body.Status, body.Error)

 	if body.Status == "completed" {
 		respJSON, _ := json.Marshal(map[string]interface{}{
@@ -816,3 +772,4 @@ func extractResponseText(body []byte) string {
 	}
 	return string(body)
 }
+
@@ -1,535 +0,0 @@
-//go:build integration
-// +build integration
-
-// delegation_executor_integration_test.go — REAL Postgres integration tests for
-// executeDelegation HTTP proxy edge cases that sqlmock cannot cover.
-//
-// The sqlmock tests in delegation_test.go pin which SQL statements fire but
-// cannot detect bugs that depend on the row state AFTER the SQL runs. The
-// result_preview-lost bug shipped to staging in PR #2854 because sqlmock tests
-// were satisfied with "an UPDATE fired" — none verified the row's preview
-// field actually landed. These integration tests close that gap.
-//
-// How HTTP is mocked
-// -----------------
-// We use raw TCP listeners (net.Listener) instead of httptest.Server to avoid
-// any HTTP-library-level goroutine complexity. The test opens a TCP port,
-// serves one HTTP response, then closes the connection. The a2aClient transport
-// is overridden with a DialContext that intercepts all dials and redirects to
-// the test server's port. No DNS, no TCP handshake overhead, no HTTP library
-// goroutines that could block on request-body reads.
-//
-// Run with:
-//
-//   docker run --rm -d --name pg-integration \
-//     -e POSTGRES_PASSWORD=test -e POSTGRES_DB=molecule \
-//     -p 55432:5432 postgres:15-alpine
-//   sleep 4
-//   psql ... < workspace-server/migrations/049_delegations.up.sql
-//   cd workspace-server
-//   INTEGRATION_DB_URL="postgres://postgres:test@localhost:55432/molecule?sslmode=disable" \
-//     go test -tags=integration ./internal/handlers/ -run Integration_ExecuteDelegation
-//
-// CI (.gitea/workflows/handlers-postgres-integration.yml) runs this on
-// every PR that touches workspace-server/internal/handlers/**.
-
-package handlers
-
-import (
-	"context"
-	"database/sql"
-	"encoding/json"
-	"net"
-	"net/http"
-	"runtime"
-	"strconv"
-	"testing"
-	"time"
-
-	"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
-)
-
-// integrationDB is imported from delegation_ledger_integration_test.go.
-// Each test gets a fresh table state.
-
-const testDelegationID = "del-159-test-integration"
-const testSourceID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
-const testTargetID = "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
-
-// rawHTTPServer starts a TCP listener, serves one HTTP response, and closes.
-// It runs in a background goroutine so the test can proceed immediately after
-// returning the server URL. The server URL (e.g. "http://127.0.0.1:<port>/")
-// is suitable for caching in Redis and passing to executeDelegation.
-//
-// The server reads HTTP headers using a deadline, then immediately sends the
-// response. This prevents the classic TCP deadlock: server blocked reading
-// body while client blocked waiting for response.
-func rawHTTPServer(t *testing.T, statusCode int, body string) (serverURL string, closeFn func()) {
-	t.Helper()
-	// Use ListenTCP with explicit IPv4 to avoid IPv6 mismatch on macOS
-	// (Listen("tcp", "127.0.0.1:0") might bind ::1 on some systems).
-	ln, err := net.ListenTCP("tcp4", &net.TCPAddr{IP: net.ParseIP("127.0.0.1"), Port: 0})
-	if err != nil {
-		t.Fatalf("rawHTTPServer listen: %v", err)
-	}
-	port := ln.Addr().(*net.TCPAddr).Port
-	serverURL = "http://127.0.0.1:" + strconv.Itoa(port) + "/"
-
-	connCh := make(chan net.Conn, 1)
-	go func() {
-		conn, err := ln.Accept()
-		if err != nil {
-			return
-		}
-		connCh <- conn
-	}()
-
-	closeFn = func() {
-		ln.Close()
-	}
-
-	// Handle in background so we don't block test execution.
-	// Strategy: read available bytes with a deadline (enough for headers).
-	// After deadline fires, send the response immediately.
-	// The kernel discards any unread buffered body bytes when the
-	// connection closes — harmless.
-	go func() {
-		conn := <-connCh
-		if conn == nil {
-			return
-		}
-
-		// Read what we can with a 2s deadline. Headers always arrive first.
-		conn.SetReadDeadline(time.Now().Add(2 * time.Second))
-		headerBuf := make([]byte, 4096)
-		for {
-			n, err := conn.Read(headerBuf)
-			if n > 0 {
-				_ = headerBuf[:n]
-			}
-			if err != nil {
-				break
-			}
-		}
-
-		// Send response and IMMEDIATELY close the connection.
-		// If we keep it open, the client's request-body writer goroutine
-		// might block on the socket (waiting for the server to drain the
-		// body). Closing immediately unblocks it. The client already
-		// received the response, so the write error is harmless.
-		resp := buildHTTPResponse(statusCode, body)
-		conn.Write(resp) //nolint:errcheck
-		conn.Close()
-	}()
-
-	return serverURL, closeFn
-}
-
-// buildHTTPResponse constructs a minimal HTTP/1.1 response.
-func buildHTTPResponse(statusCode int, body string) []byte {
-	statusText := http.StatusText(statusCode)
-	if statusText == "" {
-		statusText = "Unknown"
-	}
-	header := "HTTP/1.1 " + strconv.Itoa(statusCode) + " " + statusText + "\r\n" +
-		"Content-Type: application/json\r\n" +
-		"Content-Length: " + strconv.Itoa(len(body)) + "\r\n" +
-		"Connection: close\r\n" +
-		"\r\n"
-	return []byte(header + body)
-}
-
-// setupIntegrationFixtures inserts the rows executeDelegation requires:
-//   - workspaces: source and target (siblings, parent_id=NULL so CanCommunicate=true)
-//   - activity_logs: the 'delegate' row that updateDelegationStatus UPDATE will find
-//   - delegations: the ledger row that recordLedgerStatus will UPDATE
-//
-// Returns a cleanup function the test should defer.
-func setupIntegrationFixtures(t *testing.T, conn *sql.DB) func() {
-	t.Helper()
-	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
-	for _, ws := range []struct {
-		id       string
-		name     string
-		parentID *string
-	}{
-		{testSourceID, "test-source", nil},
-		{testTargetID, "test-target", nil},
-	} {
-		if _, err := conn.ExecContext(ctx,
-			`INSERT INTO workspaces (id, name, parent_id) VALUES ($1::uuid, $2, $3) ON CONFLICT (id) DO NOTHING`,
-			ws.id, ws.name, ws.parentID,
-		); err != nil {
-			cancel()
-			t.Fatalf("seed workspace %s: %v", ws.id, err)
-		}
-	}
-
-	reqBody, _ := json.Marshal(map[string]any{
-		"delegation_id": testDelegationID,
-		"task":          "do work",
-	})
-	if _, err := conn.ExecContext(ctx, `
-		INSERT INTO activity_logs
-			(workspace_id, activity_type, method, source_id, target_id, request_body, status)
-		VALUES ($1, 'delegate', 'delegate', $1, $2, $3::jsonb, 'pending')
-		ON CONFLICT DO NOTHING
-	`, testSourceID, testTargetID, string(reqBody)); err != nil {
-		cancel()
-		t.Fatalf("seed activity_logs: %v", err)
-	}
-
-	if _, err := conn.ExecContext(ctx, `
-		INSERT INTO delegations
-			(delegation_id, caller_id, callee_id, task_preview, status)
-		VALUES ($1, $2::uuid, $3::uuid, 'do work', 'queued')
-		ON CONFLICT (delegation_id) DO NOTHING
-	`, testDelegationID, testSourceID, testTargetID); err != nil {
-		cancel()
-		t.Fatalf("seed delegations: %v", err)
-	}
-	cancel()
-
-	return func() {
-		ctx2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
-		defer cancel2()
-		conn.ExecContext(ctx2,
-			`DELETE FROM activity_logs WHERE workspace_id = $1 AND request_body->>'delegation_id' = $2`,
-			testSourceID, testDelegationID)
-		conn.ExecContext(ctx2,
-			`DELETE FROM delegations WHERE delegation_id = $1`, testDelegationID)
-		conn.ExecContext(ctx2,
-			`DELETE FROM workspaces WHERE id IN ($1, $2)`, testSourceID, testTargetID)
-	}
-}
-
-// readDelegationRow returns (status, result_preview, error_detail) for the test
-// delegation, or fails the test if the row is not found.
-func readDelegationRow(t *testing.T, conn *sql.DB) (status, preview, errorDetail string) {
-	t.Helper()
-	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
-	defer cancel()
-	var prev, errDet sql.NullString
-	err := conn.QueryRowContext(ctx,
-		`SELECT status, result_preview, error_detail FROM delegations WHERE delegation_id = $1`,
-		testDelegationID,
-	).Scan(&status, &prev, &errDet)
-	if err != nil {
-		t.Fatalf("readDelegationRow: %v", err)
-	}
-	return status, prev.String, errDet.String
-}
-
-// stack returns the current goroutine stack trace. Used by runWithTimeout to
-// pinpoint the blocking call site when a test times out.
-func stack() string {
-	buf := make([]byte, 4096)
-	n := runtime.Stack(buf, false)
-	return string(buf[:n])
-}
-
-// runWithTimeout calls fn in a goroutine and fails t if it doesn't return within
-// timeout. ctx is passed to fn so it can propagate cancellation to
-// executeDelegation's DB and network operations — without this, the goroutine
-// leaks indefinitely when the test times out (context.Background() never cancels).
-func runWithTimeout(t *testing.T, timeout time.Duration, fn func(context.Context)) {
-	t.Helper()
-	ctx, cancel := context.WithTimeout(context.Background(), timeout)
-	defer cancel()
-
-	done := make(chan struct{})
-	var panicErr interface{}
-	go func() {
-		defer func() {
-			if p := recover(); p != nil {
-				panicErr = p
-			}
-			close(done)
-		}()
-		fn(ctx)
-	}()
-
-	select {
-	case <-done:
-		if panicErr != nil {
-			t.Fatalf("executeDelegation panicked: %v\n%s", panicErr, stack())
-		}
-	case <-ctx.Done():
-		cancel()
-		t.Fatalf("executeDelegation timed out after %s\n%s", timeout, stack())
-	}
-}
-
-// TestIntegration_ExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess
-// is the integration regression gate for issue #159.
-//
-// Scenario: proxyA2ARequest returns a 200 status code with a non-empty body.
-// isDeliveryConfirmedSuccess guard (status>=200 && <300 && len(body)>0 && err!=nil)
-// routes to handleSuccess. The integration test verifies the DB row lands at
-// 'completed' with the response body as result_preview.
-func TestIntegration_ExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess(t *testing.T) {
-	allowLoopbackForTest(t)
-	conn := integrationDB(t)
-	cleanup := setupIntegrationFixtures(t, conn)
-	defer cleanup()
-	t.Setenv("DELEGATION_LEDGER_WRITE", "1")
-
-	agentURL, closeServer := rawHTTPServer(t, 200, `{"result":{"parts":[{"text":"work completed successfully"}]}}`)
-	defer closeServer()
-
-	mr := setupTestRedis(t)
-	defer mr.Close()
-	db.CacheURL(context.Background(), testTargetID, agentURL)
-
-	prevClient := a2aClient
-	defer func() { a2aClient = prevClient }()
-	a2aClient = newA2AClientForHost(extractHostPort(agentURL))
-
-	broadcaster := newTestBroadcaster()
-	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
-	dh := NewDelegationHandler(wh, broadcaster)
-
-	a2aBody, _ := json.Marshal(map[string]interface{}{
-		"jsonrpc": "2.0",
-		"id":      "1",
-		"method":  "message/send",
-		"params": map[string]interface{}{
-			"message": map[string]interface{}{
-				"role":  "user",
-				"parts": []map[string]string{{"type": "text", "text": "do work"}},
-			},
-		},
-	})
-
-	start := time.Now()
-	runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
-		dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
-	})
-	t.Logf("executeDelegation took %v", time.Since(start))
-
-	status, preview, errDet := readDelegationRow(t, conn)
-	if status != "completed" {
-		t.Errorf("status: want completed, got %q", status)
-	}
-	if preview == "" {
-		t.Errorf("result_preview should be non-empty, got %q", preview)
-	}
-	if errDet != "" {
-		t.Errorf("error_detail should be empty on success: got %q", errDet)
-	}
-}
-
-// TestIntegration_ExecuteDelegation_ProxyErrorNon2xx_RemainsFailed verifies that
-// a 500 response routes to failure, not success. isDeliveryConfirmedSuccess
-// requires status>=200 && <300, so 500 always fails the guard.
-func TestIntegration_ExecuteDelegation_ProxyErrorNon2xx_RemainsFailed(t *testing.T) {
-	allowLoopbackForTest(t)
-	conn := integrationDB(t)
-	cleanup := setupIntegrationFixtures(t, conn)
-	defer cleanup()
-	t.Setenv("DELEGATION_LEDGER_WRITE", "1")
-
-	agentURL, closeServer := rawHTTPServer(t, 500, `{"error":"agent crashed"}`)
-	defer closeServer()
-
-	mr := setupTestRedis(t)
-	defer mr.Close()
-	db.CacheURL(context.Background(), testTargetID, agentURL)
-
-	prevClient := a2aClient
-	defer func() { a2aClient = prevClient }()
-	a2aClient = newA2AClientForHost(extractHostPort(agentURL))
-
-	broadcaster := newTestBroadcaster()
-	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
-	dh := NewDelegationHandler(wh, broadcaster)
-
-	a2aBody, _ := json.Marshal(map[string]interface{}{
-		"jsonrpc": "2.0", "id": "1", "method": "message/send",
-		"params": map[string]interface{}{
-			"message": map[string]interface{}{
-				"role":  "user",
-				"parts": []map[string]string{{"type": "text", "text": "do work"}},
-			},
-		},
-	})
-	start := time.Now()
-	runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
-		dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
-	})
-	t.Logf("executeDelegation took %v", time.Since(start))
-
-	status, _, errDet := readDelegationRow(t, conn)
-	if status != "failed" {
-		t.Errorf("status: want failed, got %q", status)
-	}
-	if errDet == "" {
-		t.Error("error_detail should be non-empty on failure")
-	}
-}
-
-// TestIntegration_ExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed verifies that
-// a 200 response with an empty body routes to failure. isDeliveryConfirmedSuccess
-// requires len(body) > 0, so an empty body fails the guard.
-func TestIntegration_ExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed(t *testing.T) {
-	allowLoopbackForTest(t)
-	conn := integrationDB(t)
-	cleanup := setupIntegrationFixtures(t, conn)
-	defer cleanup()
-	t.Setenv("DELEGATION_LEDGER_WRITE", "1")
-
-	agentURL, closeServer := rawHTTPServer(t, 200, "")
-	defer closeServer()
-
-	mr := setupTestRedis(t)
-	defer mr.Close()
-	db.CacheURL(context.Background(), testTargetID, agentURL)
-
-	prevClient := a2aClient
-	defer func() { a2aClient = prevClient }()
-	a2aClient = newA2AClientForHost(extractHostPort(agentURL))
-
-	broadcaster := newTestBroadcaster()
-	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
-	dh := NewDelegationHandler(wh, broadcaster)
-
-	a2aBody, _ := json.Marshal(map[string]interface{}{
-		"jsonrpc": "2.0", "id": "1", "method": "message/send",
-		"params": map[string]interface{}{
-			"message": map[string]interface{}{
-				"role":  "user",
-				"parts": []map[string]string{{"type": "text", "text": "do work"}},
-			},
-		},
-	})
-	start := time.Now()
-	runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
-		dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
-	})
-	t.Logf("executeDelegation took %v", time.Since(start))
-
-	status, _, errDet := readDelegationRow(t, conn)
-	if status != "failed" {
-		t.Errorf("status: want failed, got %q", status)
-	}
-	if errDet == "" {
-		t.Error("error_detail should be non-empty on failure")
-	}
-}
-
-// TestIntegration_ExecuteDelegation_CleanProxyResponse_Unchanged is the baseline:
-// a clean 200 response with a valid body and no error routes to success.
-func TestIntegration_ExecuteDelegation_CleanProxyResponse_Unchanged(t *testing.T) {
-	allowLoopbackForTest(t)
-	conn := integrationDB(t)
-	cleanup := setupIntegrationFixtures(t, conn)
-	defer cleanup()
-	t.Setenv("DELEGATION_LEDGER_WRITE", "1")
-
-	agentURL, closeServer := rawHTTPServer(t, 200, `{"result":{"parts":[{"text":"all good"}]}}`)
-	defer closeServer()
-
-	mr := setupTestRedis(t)
-	defer mr.Close()
-	db.CacheURL(context.Background(), testTargetID, agentURL)
-
-	prevClient := a2aClient
-	defer func() { a2aClient = prevClient }()
-	a2aClient = newA2AClientForHost(extractHostPort(agentURL))
-
-	broadcaster := newTestBroadcaster()
-	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
-	dh := NewDelegationHandler(wh, broadcaster)
-
-	a2aBody, _ := json.Marshal(map[string]interface{}{
-		"jsonrpc": "2.0", "id": "1", "method": "message/send",
-		"params": map[string]interface{}{
-			"message": map[string]interface{}{
-				"role":  "user",
-				"parts": []map[string]string{{"type": "text", "text": "do work"}},
-			},
-		},
-	})
-	start := time.Now()
-	runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
-		dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
-	})
-	t.Logf("executeDelegation took %v", time.Since(start))
-
-	status, preview, errDet := readDelegationRow(t, conn)
-	if status != "completed" {
-		t.Errorf("status: want completed, got %q", status)
-	}
-	if preview == "" {
-		t.Errorf("result_preview should be non-empty, got %q", preview)
-	}
-	if errDet != "" {
-		t.Errorf("error_detail should be empty on success: got %q", errDet)
-	}
-}
-
-// Test that a delegation where Redis cannot be reached still routes to failure
-// (not panic). proxyA2ARequest falls back to DB URL lookup when Redis is down.
-func TestIntegration_ExecuteDelegation_RedisDown_FallsBackToDB(t *testing.T) {
-	allowLoopbackForTest(t)
-	conn := integrationDB(t)
-	cleanup := setupIntegrationFixtures(t, conn)
-	defer cleanup()
-	t.Setenv("DELEGATION_LEDGER_WRITE", "1")
-
-	// Set up miniredis so db.RDB is non-nil, but do NOT cache any URL.
-	// resolveAgentURL skips Redis and falls back to DB, which also has no URL.
-	mr := setupTestRedis(t)
-	defer mr.Close()
-
-	broadcaster := newTestBroadcaster()
-	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
-	dh := NewDelegationHandler(wh, broadcaster)
-
-	a2aBody, _ := json.Marshal(map[string]interface{}{
-		"jsonrpc": "2.0", "id": "1", "method": "message/send",
-		"params": map[string]interface{}{
-			"message": map[string]interface{}{
-				"role":  "user",
-				"parts": []map[string]string{{"type": "text", "text": "do work"}},
-			},
-		},
-	})
-	start := time.Now()
-	runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
-		dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
-	})
-	t.Logf("executeDelegation took %v", time.Since(start))
-
-	status, _, errDet := readDelegationRow(t, conn)
-	if status != "failed" {
-		t.Errorf("status: want failed (no target URL), got %q", status)
-	}
-	if errDet == "" {
-		t.Error("error_detail should be set on failure due to unreachable target")
-	}
-}
-
-// extractHostPort parses "http://127.0.0.1:PORT/" and returns "127.0.0.1:PORT".
-func extractHostPort(rawURL string) string {
-	// Simple parse: strip "http://" prefix and trailing slash.
-	// The URL format is always "http://127.0.0.1:PORT/" in our usage.
-	if len(rawURL) > 7 {
-		return rawURL[7 : len(rawURL)-1]
-	}
-	return rawURL
-}
-
-// newA2AClientForHost creates an http.Client that redirects all connections
-// to the given host:port. This lets us mock the agent endpoint without
-// running a real HTTP server.
-func newA2AClientForHost(targetHost string) *http.Client {
-	return &http.Client{
-		Transport: &http.Transport{
-			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
-				return net.Dial("tcp", targetHost)
-			},
-			ResponseHeaderTimeout: 180 * time.Second,
-		},
-	}
-}
@@ -154,28 +154,10 @@ func (l *DelegationLedger) SetStatus(ctx context.Context,
 		return err
 	}

-	// Same-status replay (e.g. duplicate completion notification): usually a
-	// no-op. If the replay carries terminal detail that the first write lacked,
-	// fill the missing nullable column once. This keeps duplicate notifications
-	// idempotent while preserving the first observed result/error when a legacy
-	// path wrote the terminal status before it had the detail payload.
+	// Same-status replay (e.g. duplicate completion notification): no-op,
+	// don't bump updated_at, no error.
 	if current == status {
-		if errorDetail == "" && resultPreview == "" {
-			return nil
-		}
-		_, err = l.db.ExecContext(ctx, `
-			UPDATE delegations
-			SET error_detail = COALESCE(error_detail, NULLIF($2, '')),
-			    result_preview = COALESCE(result_preview, NULLIF($3, '')),
-			    updated_at = CASE
-			      WHEN (error_detail IS NULL AND NULLIF($2, '') IS NOT NULL)
-			        OR (result_preview IS NULL AND NULLIF($3, '') IS NOT NULL)
-			      THEN now()
-			      ELSE updated_at
-			    END
-			WHERE delegation_id = $1
-		`, delegationID, errorDetail, textutil.TruncateBytesNoMarker(resultPreview, previewCap))
-		return err
+		return nil
 	}

 	// Forward-only on terminal states.
@@ -39,7 +39,6 @@ import (
 	"os"
 	"strings"
 	"testing"
-	"time"

 	mdb "github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
 	_ "github.com/lib/pq"
@@ -65,16 +64,12 @@ func integrationDB(t *testing.T) *sql.DB {
 	if err != nil {
 		t.Fatalf("open: %v", err)
 	}
-	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
-	defer cancel()
-	if err := conn.PingContext(ctx); err != nil {
+	if err := conn.Ping(); err != nil {
 		t.Fatalf("ping: %v", err)
 	}
 	// Each test gets a fresh table state — fail loud if cleanup fails so
 	// a bad test doesn't pollute the next one.
-	ctx2, cancel2 := context.WithTimeout(context.Background(), 10*time.Second)
-	defer cancel2()
-	if _, err := conn.ExecContext(ctx2, `DELETE FROM delegations`); err != nil {
+	if _, err := conn.ExecContext(context.Background(), `DELETE FROM delegations`); err != nil {
 		t.Fatalf("cleanup: %v", err)
 	}
 	// Wire the package-level db.DB so production helpers (recordLedgerInsert,
@@ -150,11 +145,16 @@ func TestIntegration_ResultPreviewPreservedThroughCompletion(t *testing.T) {
 	}
 }

-// Same-status terminal replays remain idempotent, but if the first terminal
-// write lacked result_preview, a later same-status replay carrying the preview
-// should fill that missing field once. This protects legacy call ordering and
-// mirrors the failure-path error_detail repair.
-func TestIntegration_ResultPreviewSameStatusReplayFillsMissingPreview(t *testing.T) {
+// TestIntegration_ResultPreviewBuggyOrderIsLost — DIAGNOSTIC test that
+// confirms the ORIGINAL buggy order does lose the preview. Useful when
+// auditing similar wiring elsewhere.
+//
+// This is documented behavior: it asserts the same-status replay no-op
+// works as designed in DelegationLedger.SetStatus. The fix in
+// delegation.go is to AVOID this order, not to change SetStatus's
+// same-status semantics (which the operator dashboard relies on for
+// idempotent completion notifications).
+func TestIntegration_ResultPreviewBuggyOrderIsLost(t *testing.T) {
 	conn := integrationDB(t)
 	t.Setenv("DELEGATION_LEDGER_WRITE", "1")

@@ -162,17 +162,16 @@ func TestIntegration_ResultPreviewSameStatusReplayFillsMissingPreview(t *testing
 	caller := "11111111-1111-1111-1111-111111111111"
 	callee := "22222222-2222-2222-2222-222222222222"

-	// Legacy sequence: queued → dispatched → completed (no preview) →
-	// completed (preview). The second completed replay should repair the
-	// missing preview without changing status.
+	// BUGGY sequence in production-shape order: queued → dispatched →
+	// completed (no preview) → completed (preview ignored as same-status).
 	recordLedgerInsert(context.Background(), caller, callee, id, "the question", "")
-	recordLedgerStatus(context.Background(), id, "dispatched", "", "")
-	recordLedgerStatus(context.Background(), id, "completed", "", "")
-	recordLedgerStatus(context.Background(), id, "completed", "", "the answer")
+	recordLedgerStatus(context.Background(), id, "dispatched", "", "")            // pre-completion stage
+	recordLedgerStatus(context.Background(), id, "completed", "", "")             // inner first
+	recordLedgerStatus(context.Background(), id, "completed", "", "the answer")   // outer same-status no-op

 	_, preview, _ := readLedgerRow(t, conn, id)
-	if preview != "the answer" {
-		t.Errorf("same-status replay should fill missing preview; got %q", preview)
+	if preview != "" {
+		t.Errorf("buggy-order preview was unexpectedly non-empty: %q (SetStatus same-status no-op contract may have changed)", preview)
 	}
 }

@@ -226,25 +226,6 @@ func TestLedgerSetStatus_SameStatusReplay_NoUpdate(t *testing.T) {
 	}
 }

-func TestLedgerSetStatus_SameStatusReplay_FillsMissingDetail(t *testing.T) {
-	mock := setupTestDB(t)
-	l := NewDelegationLedger(nil)
-
-	mock.ExpectQuery(`SELECT status FROM delegations WHERE delegation_id = \$1`).
-		WithArgs("d-1").
-		WillReturnRows(sqlmock.NewRows([]string{"status"}).AddRow("failed"))
-	mock.ExpectExec(`UPDATE delegations\s+SET error_detail = COALESCE\(error_detail, NULLIF\(\$2, ''\)\),\s+result_preview = COALESCE\(result_preview, NULLIF\(\$3, ''\)\),\s+updated_at = CASE`).
-		WithArgs("d-1", "agent returned empty response", "").
-		WillReturnResult(sqlmock.NewResult(0, 1))
-
-	if err := l.SetStatus(context.Background(), "d-1", "failed", "agent returned empty response", ""); err != nil {
-		t.Errorf("same-status detail fill should succeed, got err: %v", err)
-	}
-	if err := mock.ExpectationsWereMet(); err != nil {
-		t.Errorf("unmet: %v", err)
-	}
-}
-
 func TestLedgerSetStatus_MissingRowIsNoOp(t *testing.T) {
 	// A SetStatus call that arrives before Insert (lost INSERT, race, etc.)
 	// must NOT error — it's a transient inconsistency the next agent retry
@@ -5,8 +5,10 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"net"
 	"net/http"
 	"net/http/httptest"
+	"sync"
 	"testing"
 	"time"

@@ -956,3 +958,316 @@ func TestInsertDelegationOutcome_ZeroValueIsUnknown(t *testing.T) {
 		t.Errorf("insertOutcomeUnknown must not collide with insertOK")
 	}
 }
+
+// ==================== executeDelegation — delivery-confirmed proxy error regression tests ====================
+//
+// These test the fix for issue #159: when proxyA2ARequest returns an error but we have a
+// non-empty response body with a 2xx status code, executeDelegation must treat it as success.
+// The error is a delivery/transport error (e.g., connection reset after response was received).
+// Previously, executeDelegation marked these as "failed" even though the work was done,
+// causing retry storms and "error" rendering in canvas despite the response being available.
+//
+// Test strategy: spin up a mock A2A agent server, set up the source/target DB rows, call
+// executeDelegation directly, and verify the activity_logs status and delegation status.
+
+const testDelegationID = "del-159-test"
+const testSourceID = "ws-source-159"
+const testTargetID = "ws-target-159"
+
+// expectExecuteDelegationBase sets up sqlmock expectations for the DB queries that
+// executeDelegation always makes, regardless of outcome.
+func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
+	// updateDelegationStatus: dispatched
+	// Uses prefix match — sqlmock regexes match the full query string.
+	mock.ExpectExec("UPDATE activity_logs SET status").
+		WithArgs("dispatched", "", testSourceID, testDelegationID).
+		WillReturnResult(sqlmock.NewResult(0, 1))
+
+	// CanCommunicate: source != target → fires two getWorkspaceRef lookups.
+	// Both test fixtures have parent_id = NULL (root-level siblings) → allowed.
+	// Order matches call order: source first, then target.
+	mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
+		WithArgs(testSourceID).
+		WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testSourceID, nil))
+	mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
+		WithArgs(testTargetID).
+		WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testTargetID, nil))
+
+	// resolveAgentURL: reads ws:{id}:url from Redis, falls back to DB for target
+	mock.ExpectQuery("SELECT url, status FROM workspaces WHERE id = ").
+		WithArgs(testTargetID).
+		WillReturnRows(sqlmock.NewRows([]string{"url", "status"}).AddRow("", "online"))
+}
+
+// expectExecuteDelegationSuccess sets up expectations for a completed delegation.
+func expectExecuteDelegationSuccess(mock sqlmock.Sqlmock, respBody string) {
+	// INSERT activity_logs for delegation completion (response_body status = 'completed')
+	mock.ExpectExec("INSERT INTO activity_logs").
+		WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), "completed").
+		WillReturnResult(sqlmock.NewResult(0, 1))
+
+	// updateDelegationStatus: completed
+	mock.ExpectExec("UPDATE activity_logs SET status").
+		WithArgs("completed", "", testSourceID, testDelegationID).
+		WillReturnResult(sqlmock.NewResult(0, 1))
+}
+
+// expectExecuteDelegationFailed sets up expectations for a failed delegation.
+func expectExecuteDelegationFailed(mock sqlmock.Sqlmock) {
+	// INSERT activity_logs for delegation failure (response_body status = 'failed')
+	mock.ExpectExec("INSERT INTO activity_logs").
+		WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), "failed").
+		WillReturnResult(sqlmock.NewResult(0, 1))
+
+	// updateDelegationStatus: failed
+	mock.ExpectExec("UPDATE activity_logs SET status").
+		WithArgs("failed", sqlmock.AnyArg(), testSourceID, testDelegationID).
+		WillReturnResult(sqlmock.NewResult(0, 1))
+}
+
+// TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess is the primary regression
+// test for issue #159. The scenario:
+//   - Attempt 1: server sends 200 OK headers + partial body, then closes connection.
+//     proxyA2ARequest: body read gets io.EOF (partial body read), returns (200, <partial>, BadGateway).
+//     isTransientProxyError(BadGateway) = TRUE → retry.
+//   - Attempt 2: server does the same thing (closes after partial body).
+//     proxyA2ARequest: same (200, <partial>, BadGateway).
+//     isTransientProxyError(BadGateway) = TRUE → retry AGAIN (but outer context will fire soon,
+//     or we get one more attempt). For the test we let it run.
+//     POST-FIX: the executeDelegation new condition sees status=200, body=<partial>, err!=nil
+//     and routes to handleSuccess immediately.
+//
+// The key pre/post-fix difference: pre-fix, executeDelegation received status=0 (hardcoded)
+// even when the server sent 200, so the condition always failed. Post-fix, status=200 is
+// preserved through the error return path (proxyA2ARequest now returns resp.StatusCode, respBody).
+// In this test the retry ultimately succeeds (server eventually sends full body), but
+// the critical assertion is that a 2xx partial-body delivery-confirmed response is never
+// classified as "failed" — it always routes to success.
+func TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess(t *testing.T) {
+	mock := setupTestDB(t)
+	mr := setupTestRedis(t)
+	allowLoopbackForTest(t)
+
+	broadcaster := newTestBroadcaster()
+	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
+	dh := NewDelegationHandler(wh, broadcaster)
+
+	// Server that sends a 200 response with declared Content-Length but closes
+	// the connection before sending all bytes. Go's http.Client sees io.EOF on
+	// the body read. proxyA2ARequest captures the partial body + status=200 and
+	// returns (200, <partial>, error). executeDelegation's new condition sees
+	// status=200 + body > 0 + error != nil → routes to handleSuccess.
+	var wg sync.WaitGroup
+	wg.Add(1)
+	ln, err := net.Listen("tcp", "127.0.0.1:0")
+	if err != nil {
+		t.Fatalf("failed to listen: %v", err)
+	}
+	defer ln.Close()
+	go func() {
+		defer wg.Done()
+		conn, err := ln.Accept()
+		if err != nil {
+			return
+		}
+		defer conn.Close()
+		// Consume the HTTP request
+		buf := make([]byte, 2048)
+		conn.Read(buf)
+		// Send 200 OK with Content-Length: 100 but only 74 bytes of body
+		// (less than declared length → io.LimitReader returns io.EOF after reading all 74)
+		resp := "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: 100\r\n\r\n"
+		resp += `{"result":{"parts":[{"text":"work completed successfully"}]}}` // 74 bytes
+		conn.Write([]byte(resp))
+		// Close immediately — client gets io.EOF on body read
+	}()
+
+	agentURL := "http://" + ln.Addr().String()
+	mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentURL)
+	allowLoopbackForTest(t)
+
+	expectExecuteDelegationBase(mock)
+	expectExecuteDelegationSuccess(mock, `{"result":{"parts":[{"text":"work completed successfully"}]}}`)
+
+	// Execute synchronously (not as a goroutine) so we can check DB state immediately.
+	// The handler fires it as goroutine; we call it directly for deterministic testing.
+	a2aBody, _ := json.Marshal(map[string]interface{}{
+		"jsonrpc": "2.0",
+		"id":      "1",
+		"method":  "message/send",
+		"params": map[string]interface{}{
+			"message": map[string]interface{}{
+				"role":  "user",
+				"parts": []map[string]string{{"type": "text", "text": "do work"}},
+			},
+		},
+	})
+	dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
+
+	time.Sleep(100 * time.Millisecond) // let DB writes settle
+
+	if err := mock.ExpectationsWereMet(); err != nil {
+		t.Errorf("unmet sqlmock expectations: %v", err)
+	}
+}
+
+// TestExecuteDelegation_ProxyErrorNon2xx_RemainsFailed verifies that the pre-fix failure
+// path is unchanged when proxyA2ARequest returns a delivery-confirmed error with a non-2xx
+// status code (e.g., 500 Internal Server Error with partial body read before connection drop).
+// The new condition requires status >= 200 && status < 300, so non-2xx always routes to failure.
+func TestExecuteDelegation_ProxyErrorNon2xx_RemainsFailed(t *testing.T) {
+	mock := setupTestDB(t)
+	mr := setupTestRedis(t)
+	allowLoopbackForTest(t)
+
+	broadcaster := newTestBroadcaster()
+	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
+	dh := NewDelegationHandler(wh, broadcaster)
+
+	// Server returns 500 with declared Content-Length but closes connection early.
+	// proxyA2ARequest: reads 500 headers, partial body, then connection drop → body read error.
+	// Returns (500, <partial_body>, BadGateway).
+	// New condition: status=500 is NOT >= 200 && < 300 → routes to failure.
+	// isTransientProxyError(500) = false → no retry.
+	var wg sync.WaitGroup
+	wg.Add(1)
+	ln, err := net.Listen("tcp", "127.0.0.1:0")
+	if err != nil {
+		t.Fatalf("failed to listen: %v", err)
+	}
+	defer ln.Close()
+	go func() {
+		defer wg.Done()
+		conn, err := ln.Accept()
+		if err != nil {
+			return
+		}
+		defer conn.Close()
+		buf := make([]byte, 2048)
+		conn.Read(buf)
+		// 500 with Content-Length: 100 but only ~60 bytes of body
+		resp := "HTTP/1.1 500 Internal Server Error\r\nContent-Type: application/json\r\nContent-Length: 100\r\n\r\n"
+		resp += `{"error":"agent crashed"}` // ~24 bytes, less than declared
+		conn.Write([]byte(resp))
+		// Close immediately — client gets io.EOF on body read
+	}()
+
+	agentURL := "http://" + ln.Addr().String()
+	mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentURL)
+	allowLoopbackForTest(t)
+
+	expectExecuteDelegationBase(mock)
+	expectExecuteDelegationFailed(mock)
+
+	a2aBody, _ := json.Marshal(map[string]interface{}{
+		"jsonrpc": "2.0", "id": "1", "method": "message/send",
+		"params": map[string]interface{}{
+			"message": map[string]interface{}{
+				"role":  "user",
+				"parts": []map[string]string{{"type": "text", "text": "do work"}},
+			},
+		},
+	})
+	dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
+
+	time.Sleep(100 * time.Millisecond)
+
+	if err := mock.ExpectationsWereMet(); err != nil {
+		t.Errorf("unmet sqlmock expectations: %v", err)
+	}
+}
+
+// TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed verifies that the pre-fix failure
+// path is unchanged when proxyA2ARequest returns an error with a 2xx status but empty body.
+// The new condition requires len(respBody) > 0, so empty body routes to failure.
+func TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed(t *testing.T) {
+	mock := setupTestDB(t)
+	mr := setupTestRedis(t)
+	allowLoopbackForTest(t)
+
+	broadcaster := newTestBroadcaster()
+	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
+	dh := NewDelegationHandler(wh, broadcaster)
+
+	// Server returns 502 Bad Gateway — proxyA2ARequest returns 502, body="" (empty), error != nil.
+	// New condition: proxyErr != nil && len(respBody) > 0 && status >= 200 && status < 300
+	// → len(respBody) == 0 → condition FALSE → falls through to failure.
+	// isTransientProxyError(502) is TRUE → retry → same result → failure.
+	agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(http.StatusBadGateway)
+		// No body — connection closes normally
+	}))
+	defer agentServer.Close()
+
+	mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentServer.URL)
+	allowLoopbackForTest(t)
+
+	// First attempt: updateDelegationStatus(dispatched) — from expectExecuteDelegationBase
+	expectExecuteDelegationBase(mock)
+	// Second attempt (retry): updateDelegationStatus(dispatched) again
+	mock.ExpectExec("UPDATE activity_logs SET status").
+		WithArgs("dispatched", "", testSourceID, testDelegationID).
+		WillReturnResult(sqlmock.NewResult(0, 1))
+	// Failure: INSERT + UPDATE (failed)
+	expectExecuteDelegationFailed(mock)
+
+	a2aBody, _ := json.Marshal(map[string]interface{}{
+		"jsonrpc": "2.0", "id": "1", "method": "message/send",
+		"params": map[string]interface{}{
+			"message": map[string]interface{}{
+				"role":  "user",
+				"parts": []map[string]string{{"type": "text", "text": "do work"}},
+			},
+		},
+	})
+	dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
+
+	time.Sleep(100 * time.Millisecond)
+
+	if err := mock.ExpectationsWereMet(); err != nil {
+		t.Errorf("unmet sqlmock expectations: %v", err)
+	}
+}
+
+// TestExecuteDelegation_CleanProxyResponse_Unchanged verifies that a clean proxy response
+// (no error, 200 with body) is unaffected by the new condition. This is the baseline:
+// proxyErr == nil so the new condition never fires.
+func TestExecuteDelegation_CleanProxyResponse_Unchanged(t *testing.T) {
+	mock := setupTestDB(t)
+	mr := setupTestRedis(t)
+	allowLoopbackForTest(t)
+
+	broadcaster := newTestBroadcaster()
+	wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
+	dh := NewDelegationHandler(wh, broadcaster)
+
+	agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(http.StatusOK)
+		w.Header().Set("Content-Type", "application/json")
+		w.Write([]byte(`{"result":{"parts":[{"text":"all good"}]}}`))
+	}))
+	defer agentServer.Close()
+
+	mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentServer.URL)
+	allowLoopbackForTest(t)
+
+	expectExecuteDelegationBase(mock)
+	expectExecuteDelegationSuccess(mock, `{"result":{"parts":[{"text":"all good"}]}}`)
+
+	a2aBody, _ := json.Marshal(map[string]interface{}{
+		"jsonrpc": "2.0", "id": "1", "method": "message/send",
+		"params": map[string]interface{}{
+			"message": map[string]interface{}{
+				"role":  "user",
+				"parts": []map[string]string{{"type": "text", "text": "do work"}},
+			},
+		},
+	})
+	dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
+
+	time.Sleep(100 * time.Millisecond)
+
+	if err := mock.ExpectationsWereMet(); err != nil {
+		t.Errorf("unmet sqlmock expectations: %v", err)
+	}
+}
@@ -292,8 +292,8 @@ func filterPeersByQuery(peers []map[string]interface{}, q string) []map[string]i
 	needle := strings.ToLower(q)
 	out := make([]map[string]interface{}, 0, len(peers))
 	for _, p := range peers {
-		name, _ := p["name"].(string)  // nil → "" — safe on empty-role rows
-		role, _ := p["role"].(string)  // nil → "" — queryPeerMaps sets nil when DB role is empty
+		name := p["name"].(string)
+		role := p["role"].(string)
 		if strings.Contains(strings.ToLower(name), needle) ||
 			strings.Contains(strings.ToLower(role), needle) {
 			out = append(out, p)
@@ -394,80 +394,6 @@ func TestPeers_Q_NoMatches_RawBodyIsArrayNotNull(t *testing.T) {
 	}
 }

-// TestFilterPeersByQuery_NilRoleRegression is the regression gate for
-// mc#730/#731: queryPeerMaps sets peer["role"] = nil when the DB role column
-// is empty (discovery.go lines 337-341). filterPeersByQuery did a bare
-// type assertion p["role"].(string) which panics on nil. The fix uses the
-// comma-ok form so nil → "". The test passes a map with nil name and nil
-// role and asserts no panic + correct filter behaviour.
-func TestFilterPeersByQuery_NilRoleRegression(t *testing.T) {
-	cases := []struct {
-		name       string
-		peers      []map[string]interface{}
-		q          string
-		wantLen    int
-		wantIDs    []string
-	}{
-		{
-			name: "nil role matches on name",
-			peers: []map[string]interface{}{
-				{"id": "ws-a", "name": nil, "role": nil},
-				{"id": "ws-b", "name": "Alpha Builder", "role": nil},
-				{"id": "ws-c", "name": "Beta Builder", "role": nil},
-			},
-			q:       "alpha",
-			wantLen: 1,
-			wantIDs: []string{"ws-b"},
-		},
-		{
-			name: "nil name matches on nil role (empty string)",
-			peers: []map[string]interface{}{
-				{"id": "ws-x", "name": nil, "role": nil},
-				{"id": "ws-y", "name": "Dev Workspace", "role": nil},
-			},
-			q:       "",
-			wantLen: 2, // empty q is a no-op
-			wantIDs: []string{"ws-x", "ws-y"},
-		},
-		{
-			name: "all nil — no panic, returns input",
-			peers: []map[string]interface{}{
-				{"id": "ws-z", "name": nil, "role": nil},
-			},
-			q:       "anything",
-			wantLen: 0,
-			wantIDs: nil,
-		},
-	}
-
-	for _, tc := range cases {
-		t.Run(tc.name, func(t *testing.T) {
-			got := filterPeersByQuery(tc.peers, tc.q)
-			if len(got) != tc.wantLen {
-				t.Fatalf("len: got %d, want %d", len(got), tc.wantLen)
-			}
-			gotIDs := make([]string, len(got))
-			for i, p := range got {
-				gotIDs[i] = p["id"].(string)
-			}
-			if tc.wantIDs != nil {
-				for _, id := range tc.wantIDs {
-					found := false
-					for _, g := range gotIDs {
-						if g == id {
-							found = true
-							break
-						}
-					}
-					if !found {
-						t.Errorf("missing id %q; got IDs: %v", id, gotIDs)
-					}
-				}
-			}
-		})
-	}
-}
-
 func keysOf(m map[string]struct{}) []string {
 	out := make([]string, 0, len(m))
 	for k := range m {
@@ -417,32 +417,11 @@ func TestMCPHandler_CommitMemory_LocalScope_Success(t *testing.T) {
 	}
 }

-// TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError verifies
-// two contracts at once on the GLOBAL-scope-blocked path:
-//
-//  1. C3 invariant (commit_memory with scope=GLOBAL aborts on the MCP bridge
-//     before touching the DB), AND
-//  2. OFFSEC-001 / #259 scrub contract (commit 7d1a189f): the JSON-RPC error
-//     returned to the client is a CONSTANT — code=-32000, message="tool call
-//     failed" — with the production-internal err.Error() text logged
-//     server-side, never reflected back to the caller.
-//
-// Prior to this rename the test asserted that the client-visible message
-// CONTAINED the substring "GLOBAL", which was the human-readable internal
-// error from toolCommitMemory. mc#664 Class 2 flipped that assertion the
-// right way around: now the test FAILS if the scrub regresses (i.e. if the
-// internal string is ever reflected back to the wire), and PASSES iff the
-// scrubbed constant reaches the client.
-//
-// Coupling note: the constant string "tool call failed" and the code -32000
-// are the same values asserted by
-// TestMCPHandler_dispatchRPC_UnknownTool_ReturnsConstantMessage — both are
-// the OFFSEC-001 contract for the dispatch-failure branch in mcp.go (the
-// third err.Error() leak that 7d1a189f scrubbed). If those constants ever
-// change, both tests must move together.
-func TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError(t *testing.T) {
+// TestMCPHandler_CommitMemory_GlobalScope_Blocked verifies that C3 is enforced:
+// GLOBAL scope is not permitted on the MCP bridge.
+func TestMCPHandler_CommitMemory_GlobalScope_Blocked(t *testing.T) {
 	h, mock := newMCPHandler(t)
-	// No DB expectations — handler must abort before touching the DB (C3).
+	// No DB expectations — handler must abort before touching the DB.

 	w := mcpPost(t, h, "ws-1", map[string]interface{}{
 		"jsonrpc": "2.0",
@@ -457,53 +436,14 @@ func TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError(t *test
 		},
 	})

-	// JSON-RPC envelope returns 200 with the error in the body — only
-	// malformed-JSON-at-the-envelope-layer returns 400 (see Call() in mcp.go).
-	if w.Code != http.StatusOK {
-		t.Fatalf("expected 200 (JSON-RPC error in body), got %d: %s", w.Code, w.Body.String())
-	}
-
 	var resp mcpResponse
-	if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
-		t.Fatalf("response is not valid JSON: %v", err)
-	}
-
-	// (1) C3: an error must be reported.
+	json.Unmarshal(w.Body.Bytes(), &resp)
 	if resp.Error == nil {
-		t.Fatal("expected JSON-RPC error for GLOBAL scope, got nil")
+		t.Error("expected JSON-RPC error for GLOBAL scope, got nil")
 	}
-
-	// (2) OFFSEC-001 positive assertions — exact equality on the scrubbed
-	// constants so any change (re-leak of err.Error(), code mutation) trips
-	// the test. Substring-match would not catch a partial re-leak.
-	if resp.Error.Code != -32000 {
-		t.Errorf("error code should be -32000 (Server error / dispatch-failure), got: %d", resp.Error.Code)
+	if resp.Error != nil && !bytes.Contains([]byte(resp.Error.Message), []byte("GLOBAL")) {
+		t.Errorf("error message should mention GLOBAL, got: %s", resp.Error.Message)
 	}
-	if resp.Error.Message != "tool call failed" {
-		t.Errorf("error message should be the OFFSEC-001 constant %q, got: %q", "tool call failed", resp.Error.Message)
-	}
-
-	// (3) OFFSEC-001 negative assertions — the internal err.Error() text
-	// from toolCommitMemory ("GLOBAL scope is not permitted via the MCP
-	// bridge — use LOCAL or TEAM") must NOT appear in the client-visible
-	// message. Each token below is a distinct substring of that internal
-	// string; if ANY leaks through, the scrub in mcp.go dispatchRPC has
-	// regressed and this assertion fires the canary.
-	leakedTokens := []string{
-		"GLOBAL",    // scope name
-		"scope",     // policy lexicon
-		"permitted", // policy verb
-		"bridge",    // internal architecture term
-		"LOCAL",     // alternative scope name
-		"TEAM",      // alternative scope name
-	}
-	for _, tok := range leakedTokens {
-		if bytes.Contains([]byte(resp.Error.Message), []byte(tok)) {
-			t.Errorf("OFFSEC-001 scrub regression: client-visible error.message leaks internal token %q (got: %q)", tok, resp.Error.Message)
-		}
-	}
-
-	// (4) C3 invariant preserved: handler must short-circuit before any DB call.
 	if err := mock.ExpectationsWereMet(); err != nil {
 		t.Errorf("unexpected DB calls on GLOBAL scope block: %v", err)
 	}
@@ -608,13 +548,7 @@ func TestMCPHandler_CommitMemory_CleanContent_PassesThrough(t *testing.T) {
 // tools/call — recall_memory
 // ─────────────────────────────────────────────────────────────────────────────

-// TestMCPHandler_RecallMemory_GlobalScope_Blocked_ScrubsInternalError verifies
-// C3 (GLOBAL scope blocked on MCP bridge) is enforced and that the OFFSEC-001
-// scrub contract applies: the client-visible error.message is the constant
-// "tool call failed", NOT the descriptive internal reason. The internal reason
-// ("GLOBAL scope is not permitted via the MCP bridge") is logged server-side
-// but must never reach the wire.
-func TestMCPHandler_RecallMemory_GlobalScope_Blocked_ScrubsInternalError(t *testing.T) {
+func TestMCPHandler_RecallMemory_GlobalScope_Blocked(t *testing.T) {
 	h, mock := newMCPHandler(t)
 	// No DB expectations — handler must abort before touching the DB.

@@ -632,38 +566,10 @@ func TestMCPHandler_RecallMemory_GlobalScope_Blocked_ScrubsInternalError(t *test
 	})

 	var resp mcpResponse
-	if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
-		t.Fatalf("response is not valid JSON: %v", err)
-	}
-	// (1) C3: an error must be reported.
+	json.Unmarshal(w.Body.Bytes(), &resp)
 	if resp.Error == nil {
-		t.Fatal("expected JSON-RPC error for GLOBAL scope recall, got nil")
+		t.Error("expected JSON-RPC error for GLOBAL scope recall, got nil")
 	}
-	// (2) OFFSEC-001 positive assertions — exact equality on the scrubbed
-	// constants so any change (re-leak of err.Error(), code mutation) trips
-	// the test.
-	if resp.Error.Code != -32000 {
-		t.Errorf("error code should be -32000 (Server error / dispatch-failure), got: %d", resp.Error.Code)
-	}
-	if resp.Error.Message != "tool call failed" {
-		t.Errorf("error message should be the OFFSEC-001 constant %q, got: %q", "tool call failed", resp.Error.Message)
-	}
-	// (3) OFFSEC-001 negative assertions — the internal reason must NOT appear
-	// in the client-visible message.
-	leakedTokens := []string{
-		"GLOBAL",    // scope name
-		"scope",     // policy lexicon
-		"permitted", // policy verb
-		"bridge",    // internal architecture term
-		"LOCAL",     // alternative scope name
-		"TEAM",      // alternative scope name
-	}
-	for _, tok := range leakedTokens {
-		if bytes.Contains([]byte(resp.Error.Message), []byte(tok)) {
-			t.Errorf("OFFSEC-001 scrub regression: client-visible error.message leaks internal token %q (got: %q)", tok, resp.Error.Message)
-		}
-	}
-	// (4) C3 invariant preserved: handler must short-circuit before any DB call.
 	if err := mock.ExpectationsWereMet(); err != nil {
 		t.Errorf("unexpected DB calls on GLOBAL scope block: %v", err)
 	}
@@ -1,539 +0,0 @@
-package handlers
-
-import (
-	"strings"
-	"testing"
-)
-
-// ─────────────────────────────────────────────────────────────────────────────
-// countWorkspaces tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestCountWorkspaces_Empty(t *testing.T) {
-	got := countWorkspaces(nil)
-	if got != 0 {
-		t.Errorf("nil: got %d, want 0", got)
-	}
-	got = countWorkspaces([]OrgWorkspace{})
-	if got != 0 {
-		t.Errorf("empty: got %d, want 0", got)
-	}
-}
-
-func TestCountWorkspaces_Flat(t *testing.T) {
-	tree := []OrgWorkspace{
-		{Name: "a"},
-		{Name: "b"},
-		{Name: "c"},
-	}
-	got := countWorkspaces(tree)
-	if got != 3 {
-		t.Errorf("flat 3: got %d, want 3", got)
-	}
-}
-
-func TestCountWorkspaces_Nested(t *testing.T) {
-	//        root (1)
-	//       /  |  \  (3 children)
-	//      c1  c2  c3
-	//      |        |
-	//      g1      g2 (2 grandchildren)
-	tree := []OrgWorkspace{
-		{
-			Name: "root",
-			Children: []OrgWorkspace{
-				{Name: "child1", Children: []OrgWorkspace{{Name: "grandchild1"}}},
-				{Name: "child2"},
-				{Name: "child3", Children: []OrgWorkspace{{Name: "grandchild2"}}},
-			},
-		},
-	}
-	got := countWorkspaces(tree)
-	if got != 6 {
-		t.Errorf("nested: got %d, want 6 (1 root + 3 children + 2 grandchildren)", got)
-	}
-}
-
-func TestCountWorkspaces_DeepNesting(t *testing.T) {
-	// chain of 5 levels
-	deep := []OrgWorkspace{
-		{Name: "L1", Children: []OrgWorkspace{
-			{Name: "L2", Children: []OrgWorkspace{
-				{Name: "L3", Children: []OrgWorkspace{
-					{Name: "L4", Children: []OrgWorkspace{
-						{Name: "L5"},
-					}},
-				}},
-			}},
-		}},
-	}
-	got := countWorkspaces(deep)
-	if got != 5 {
-		t.Errorf("deep chain: got %d, want 5", got)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// envRequirementKey tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestEnvRequirementKey_SingleMember(t *testing.T) {
-	got := envRequirementKey([]string{"API_KEY"})
-	if got != "API_KEY" {
-		t.Errorf("single: got %q, want %q", got, "API_KEY")
-	}
-}
-
-func TestEnvRequirementKey_TwoMembers_OrderInsensitive(t *testing.T) {
-	keyAB := envRequirementKey([]string{"A", "B"})
-	keyBA := envRequirementKey([]string{"B", "A"})
-	if keyAB != keyBA {
-		t.Errorf("order-insensitive: [A,B]=%q, [B,A]=%q — must match", keyAB, keyBA)
-	}
-}
-
-func TestEnvRequirementKey_ThreeMembers_Sorted(t *testing.T) {
-	key := envRequirementKey([]string{"Z", "A", "M"})
-	// Should be "A\x00M\x00Z"
-	want := "A\x00M\x00Z"
-	if key != want {
-		t.Errorf("three members sorted: got %q, want %q", key, want)
-	}
-}
-
-func TestEnvRequirementKey_EmptyMembers(t *testing.T) {
-	got := envRequirementKey(nil)
-	if got != "" {
-		t.Errorf("nil: got %q, want empty", got)
-	}
-	got = envRequirementKey([]string{})
-	if got != "" {
-		t.Errorf("empty: got %q, want empty", got)
-	}
-}
-
-func TestEnvRequirementKey_DuplicateMembers(t *testing.T) {
-	// Duplicates should be preserved in sort; join still works
-	key := envRequirementKey([]string{"A", "A", "B"})
-	want := "A\x00A\x00B"
-	if key != want {
-		t.Errorf("duplicates: got %q, want %q", key, want)
-	}
-}
-
-func TestEnvRequirementKey_UsedForDedup(t *testing.T) {
-	// Real dedup case: {A,B} and {B,A} produce same key → dedup-eligible
-	// {A,B,C} produces a different key
-	keyAB := envRequirementKey([]string{"A", "B"})
-	keyBA := envRequirementKey([]string{"B", "A"})
-	keyABC := envRequirementKey([]string{"A", "B", "C"})
-	if keyAB != keyBA {
-		t.Errorf("AB vs BA: keys must match for dedup")
-	}
-	if keyAB == keyABC {
-		t.Errorf("AB vs ABC: keys must differ")
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// sanitizeEnvMembers tests
-// ─────────────────────────────────────────────────────────────────────────────
-// envVarNamePattern = ^[A-Z][A-Z0-9_]{0,127}$
-
-func TestSanitizeEnvMembers_AllValid(t *testing.T) {
-	members := []string{"API_KEY", "MY_VAR_2", "A"}
-	got, ok := sanitizeEnvMembers(members, "test")
-	if !ok {
-		t.Error("all valid: ok should be true")
-	}
-	if len(got) != len(members) {
-		t.Errorf("all valid: got %v, want %v", got, members)
-	}
-}
-
-func TestSanitizeEnvMembers_SomeInvalid(t *testing.T) {
-	// Lowercase first char — invalid
-	members := []string{"API_KEY", "lowercase", "MY_VAR"}
-	got, ok := sanitizeEnvMembers(members, "test")
-	if !ok {
-		t.Error("one invalid: ok should be true (valid members remain)")
-	}
-	want := []string{"API_KEY", "MY_VAR"}
-	if len(got) != len(want) {
-		t.Errorf("one invalid: got %v, want %v", got, want)
-	}
-}
-
-func TestSanitizeEnvMembers_AllInvalid_DropsAll(t *testing.T) {
-	members := []string{"lowercase", "123_START", ""}
-	got, ok := sanitizeEnvMembers(members, "test")
-	if ok {
-		t.Error("all invalid: ok should be false")
-	}
-	if len(got) != 0 {
-		t.Errorf("all invalid: got %v, want empty", got)
-	}
-}
-
-func TestSanitizeEnvMembers_EmptyString_Skipped(t *testing.T) {
-	// Empty string is filtered but doesn't make ok=false
-	members := []string{"API_KEY", "", "MY_VAR"}
-	got, ok := sanitizeEnvMembers(members, "test")
-	if !ok {
-		t.Error("empty string in valid list: ok should be true")
-	}
-	if len(got) != 2 {
-		t.Errorf("empty string filtered: got %v, want [API_KEY, MY_VAR]", got)
-	}
-}
-
-func TestSanitizeEnvMembers_MaxLength(t *testing.T) {
-	// 128 chars: valid (1 prefix + 127 more = 128, all uppercase)
-	valid := "A" + strings.Repeat("B", 127)
-	got, ok := sanitizeEnvMembers([]string{valid}, "test")
-	if !ok {
-		t.Errorf("128 char valid: ok should be true, got %v", got)
-	}
-	// 129 chars: invalid (exceeds {0,127} suffix in regex)
-	tooLong := "A" + strings.Repeat("B", 128)
-	got, ok = sanitizeEnvMembers([]string{tooLong}, "test")
-	if ok {
-		t.Error("129 char invalid: ok should be false")
-	}
-}
-
-func TestSanitizeEnvMembers_DigitsAndUnderscore(t *testing.T) {
-	// regex ^[A-Z][A-Z0-9_]{0,127}$ — first char must be A-Z, not underscore
-	valid := []string{"A1", "A_2", "HTTP_200_OK", "ABC123"}
-	for _, v := range valid {
-		got, ok := sanitizeEnvMembers([]string{v}, "test")
-		if !ok {
-			t.Errorf("should be valid: %q", v)
-		}
-		if len(got) != 1 || got[0] != v {
-			t.Errorf("got %v, want [%q]", got, v)
-		}
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// flattenAndSortRequirements tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestFlattenAndSortRequirements_Empty(t *testing.T) {
-	got := flattenAndSortRequirements(map[string]EnvRequirement{})
-	if len(got) != 0 {
-		t.Errorf("empty: got %d, want 0", len(got))
-	}
-}
-
-func TestFlattenAndSortRequirements_SingleFirst(t *testing.T) {
-	// Singles come before groups; within singles, alphabetical
-	reqs := map[string]EnvRequirement{
-		envRequirementKey([]string{"ZETA"}): {Name: "ZETA"},
-		envRequirementKey([]string{"ALPHA"}): {Name: "ALPHA"},
-	}
-	got := flattenAndSortRequirements(reqs)
-	if len(got) != 2 {
-		t.Fatalf("got %d, want 2", len(got))
-	}
-	if got[0].Name != "ALPHA" {
-		t.Errorf("first: got %q, want ALPHA", got[0].Name)
-	}
-	if got[1].Name != "ZETA" {
-		t.Errorf("second: got %q, want ZETA", got[1].Name)
-	}
-}
-
-func TestFlattenAndSortRequirements_GroupsAfterSingles(t *testing.T) {
-	reqs := map[string]EnvRequirement{
-		envRequirementKey([]string{"X"}): {Name: "X"}, // single
-		envRequirementKey([]string{"A", "B"}): {AnyOf: []string{"A", "B"}}, // group
-	}
-	got := flattenAndSortRequirements(reqs)
-	if len(got) != 2 {
-		t.Fatalf("got %d, want 2", len(got))
-	}
-	// Single X comes before any group
-	if got[0].Name != "X" {
-		t.Errorf("first should be single X: got %+v", got[0])
-	}
-	if len(got[1].AnyOf) != 2 {
-		t.Errorf("second should be group: got %+v", got[1])
-	}
-}
-
-func TestFlattenAndSortRequirements_GroupsSortedByMemberKey(t *testing.T) {
-	// Groups sorted by their member-key (envRequirementKey sorts AnyOf members).
-	// {Z,A} → key "A\x00Z"; {B,C} → key "B\x00C". "A..." < "B..." → A,Z group first.
-	reqs := map[string]EnvRequirement{
-		envRequirementKey([]string{"Z", "A"}): {AnyOf: []string{"Z", "A"}}, // key: A\x00Z
-		envRequirementKey([]string{"B", "C"}): {AnyOf: []string{"B", "C"}}, // key: B\x00C
-	}
-	got := flattenAndSortRequirements(reqs)
-	if len(got) != 2 {
-		t.Fatalf("got %d, want 2", len(got))
-	}
-	// A\x00Z < B\x00C alphabetically, so the A,Z group sorts first
-	if len(got[0].AnyOf) != 2 || got[0].AnyOf[0] != "Z" {
-		t.Errorf("first group: got %+v, want [Z,A] (key A\\x00Z sorts before B\\x00C)", got[0])
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// collectOrgEnv tests
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestCollectOrgEnv_SingleRequired(t *testing.T) {
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{Name: "API_KEY"}},
-	}
-	req, rec := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("got %d required, want 1", len(req))
-	}
-	if req[0].Name != "API_KEY" {
-		t.Errorf("name: got %q, want API_KEY", req[0].Name)
-	}
-	if len(rec) != 0 {
-		t.Errorf("recommended: got %d, want 0", len(rec))
-	}
-}
-
-func TestCollectOrgEnv_SingleRecommended(t *testing.T) {
-	tmpl := &OrgTemplate{
-		RecommendedEnv: []EnvRequirement{{Name: "DEBUG"}},
-	}
-	req, rec := collectOrgEnv(tmpl)
-	if len(req) != 0 {
-		t.Errorf("required: got %d, want 0", len(req))
-	}
-	if len(rec) != 1 {
-		t.Fatalf("got %d recommended, want 1", len(rec))
-	}
-	if rec[0].Name != "DEBUG" {
-		t.Errorf("name: got %q, want DEBUG", rec[0].Name)
-	}
-}
-
-func TestCollectOrgEnv_AnyOfGroup(t *testing.T) {
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{AnyOf: []string{"AWS_KEY", "GCP_KEY", "AZURE_KEY"}}},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("got %d, want 1", len(req))
-	}
-	if len(req[0].AnyOf) != 3 {
-		t.Errorf("any_of members: got %v, want [AWS_KEY, GCP_KEY, AZURE_KEY]", req[0].AnyOf)
-	}
-}
-
-func TestCollectOrgEnv_InvalidNamesFiltered(t *testing.T) {
-	// "lowercase" and "" fail envVarNamePattern → silently dropped
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{AnyOf: []string{"VALID_KEY", "lowercase", ""}}},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("invalid names filtered: got %d, want 1", len(req))
-	}
-	if len(req[0].AnyOf) != 1 || req[0].AnyOf[0] != "VALID_KEY" {
-		t.Errorf("valid names kept: got %v", req[0].AnyOf)
-	}
-}
-
-func TestCollectOrgEnv_GroupWithOneInvalid_KeepsRest(t *testing.T) {
-	// Mixed: one valid + one invalid → valid member is kept, invalid dropped
-	// regex requires ^[A-Z][A-Z0-9_]* — lowercase names are invalid
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{AnyOf: []string{"GOOD_KEY", "lowercase_invalid"}}},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("got %d, want 1", len(req))
-	}
-	if len(req[0].AnyOf) != 1 || req[0].AnyOf[0] != "GOOD_KEY" {
-		t.Errorf("kept valid member: got %v, want [GOOD_KEY]", req[0].AnyOf)
-	}
-}
-
-func TestCollectOrgEnv_AllInvalidGroup_Dropped(t *testing.T) {
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{AnyOf: []string{"lowercase", ""}}},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 0 {
-		t.Errorf("all-invalid group: got %d, want 0", len(req))
-	}
-}
-
-func TestCollectOrgEnv_RequiredSingleDominatesAnyOfGroup(t *testing.T) {
-	// Required: API_KEY (strict)
-	// Required: any_of [API_KEY, ALT_KEY]
-	// → the any_of group is redundant (API_KEY satisfies it already)
-	// → any_of group should be dropped from required
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{
-			{Name: "API_KEY"},
-			{AnyOf: []string{"API_KEY", "ALT_KEY"}},
-		},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("strict dominates group: got %d entries, want 1", len(req))
-	}
-	if req[0].Name != "API_KEY" {
-		t.Errorf("strict: got %+v, want name=API_KEY", req[0])
-	}
-}
-
-func TestCollectOrgEnv_RequiredSingleDominatesRecommendedAnyOf(t *testing.T) {
-	// Required: FOO (strict)
-	// Recommended: any_of [FOO, BAR]
-	// → FOO is already required; the recommended any_of is redundant
-	// → recommended any_of should be dropped
-	tmpl := &OrgTemplate{
-		RequiredEnv:    []EnvRequirement{{Name: "FOO"}},
-		RecommendedEnv: []EnvRequirement{{AnyOf: []string{"FOO", "BAR"}}},
-	}
-	req, rec := collectOrgEnv(tmpl)
-	if len(req) != 1 || req[0].Name != "FOO" {
-		t.Errorf("required: got %+v", req)
-	}
-	if len(rec) != 0 {
-		t.Errorf("recommended any_of dominated by strict: got %d, want 0", len(rec))
-	}
-}
-
-func TestCollectOrgEnv_SameTierStrictDominatesGroup(t *testing.T) {
-	// Both in required: X (strict), any_of [X, Y] (group)
-	// Strict X makes the any_of redundant within the same tier
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{
-			{Name: "X"},
-			{AnyOf: []string{"X", "Y"}},
-		},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("got %d, want 1", len(req))
-	}
-	if req[0].Name != "X" {
-		t.Errorf("strict dominates same-tier group: got %+v", req[0])
-	}
-}
-
-func TestCollectOrgEnv_WorkspaceLevel(t *testing.T) {
-	// Workspaces can also declare required/recommended env
-	tmpl := &OrgTemplate{
-		Workspaces: []OrgWorkspace{
-			{
-				Name:         "Dev",
-				RequiredEnv:  []EnvRequirement{{Name: "DEV_KEY"}},
-				RecommendedEnv: []EnvRequirement{{Name: "DEV_TOOL"}},
-			},
-		},
-	}
-	req, rec := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Fatalf("workspace required: got %d, want 1", len(req))
-	}
-	if req[0].Name != "DEV_KEY" {
-		t.Errorf("workspace required: got %v", req[0])
-	}
-	if len(rec) != 1 {
-		t.Fatalf("workspace recommended: got %d, want 1", len(rec))
-	}
-	if rec[0].Name != "DEV_TOOL" {
-		t.Errorf("workspace recommended: got %v", rec[0])
-	}
-}
-
-func TestCollectOrgEnv_DeepNesting(t *testing.T) {
-	// Nested children also contribute env requirements
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{Name: "ORG_LEVEL"}},
-		Workspaces: []OrgWorkspace{
-			{
-				Name:         "Root",
-				RequiredEnv:  []EnvRequirement{{Name: "ROOT_LEVEL"}},
-				Children: []OrgWorkspace{
-					{
-						Name:         "Child",
-						RequiredEnv:  []EnvRequirement{{Name: "CHILD_LEVEL"}},
-						Children: []OrgWorkspace{
-							{Name: "GrandChild", RecommendedEnv: []EnvRequirement{{Name: "GRANDCHILD_TOOL"}}},
-						},
-					},
-				},
-			},
-		},
-	}
-	req, rec := collectOrgEnv(tmpl)
-	if len(req) != 3 {
-		t.Errorf("3 required levels: got %d: %+v", len(req), req)
-	}
-	if len(rec) != 1 {
-		t.Errorf("1 recommended: got %d: %+v", len(rec), rec)
-	}
-}
-
-func TestCollectOrgEnv_DedupAcrossTiers(t *testing.T) {
-	// Same key declared at org level AND workspace level → deduped to 1
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{{Name: "SHARED"}},
-		Workspaces: []OrgWorkspace{
-			{Name: "ws", RequiredEnv: []EnvRequirement{{Name: "SHARED"}}},
-		},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Errorf("dedup across tiers: got %d, want 1", len(req))
-	}
-}
-
-func TestCollectOrgEnv_DedupWithinGroup(t *testing.T) {
-	// Same key declared multiple times within required → deduped
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{
-			{Name: "DUPE"},
-			{Name: "DUPE"},
-		},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 1 {
-		t.Errorf("dedup within tier: got %d, want 1", len(req))
-	}
-}
-
-func TestCollectOrgEnv_MixedCasePreservesSort(t *testing.T) {
-	// Sort order: singles first (alpha), then groups (by member-key)
-	tmpl := &OrgTemplate{
-		RequiredEnv: []EnvRequirement{
-			{Name: "ZETA"},
-			{Name: "ALPHA"},
-			{AnyOf: []string{"B", "A"}}, // key: A\x00B
-			{AnyOf: []string{"Y", "X"}}, // key: X\x00Y
-		},
-	}
-	req, _ := collectOrgEnv(tmpl)
-	if len(req) != 4 {
-		t.Fatalf("got %d, want 4", len(req))
-	}
-	// Singles first
-	if req[0].Name != "ALPHA" {
-		t.Errorf("single ALPHA first: got %+v", req[0])
-	}
-	if req[1].Name != "ZETA" {
-		t.Errorf("single ZETA second: got %+v", req[1])
-	}
-	// Groups after singles; A,B (key A\x00B) < X,Y (key X\x00Y)
-	if len(req[2].AnyOf) != 2 {
-		t.Errorf("third should be group: got %+v", req[2])
-	}
-	if req[2].AnyOf[0] != "B" { // "B" is first alphabetically in [A,B]
-		t.Errorf("A,B group should come first: got %+v", req[2])
-	}
-}
-
@@ -1,242 +0,0 @@
-package handlers
-
-import (
-	"crypto/sha256"
-	"database/sql"
-	"net/http"
-	"net/http/httptest"
-	"testing"
-
-	"github.com/DATA-DOG/go-sqlmock"
-	"github.com/Molecule-AI/molecule-monorepo/platform/internal/ws"
-	"github.com/gin-gonic/gin"
-)
-
-// newSocketHandlerWithDB creates a SocketHandler with buffered Hub channels.
-// The DB is set up via setupTestDB (called before this function in each test).
-func newSocketHandlerWithDB(t *testing.T, hub *ws.Hub) *SocketHandler {
-	t.Helper()
-	if hub == nil {
-		hub = &ws.Hub{
-			Register:   make(chan *ws.Client, 1),
-			Unregister: make(chan *ws.Client, 1),
-		}
-	}
-	return NewSocketHandler(hub)
-}
-
-// socketRequest builds a test request for the WebSocket connect endpoint.
-func socketRequest(method, path, workspaceID, authHeader string) *http.Request {
-	req := httptest.NewRequest(method, path, nil)
-	if workspaceID != "" {
-		req.Header.Set("X-Workspace-ID", workspaceID)
-	}
-	if authHeader != "" {
-		req.Header.Set("Authorization", authHeader)
-	}
-	return req
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// Auth gate: DB error on HasAnyLiveToken → 500
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestSocketHandler_AuthGate_DBError_Returns500(t *testing.T) {
-	mock := setupTestDB(t)
-	handler := newSocketHandlerWithDB(t, nil)
-
-	// HasAnyLiveToken issues a query; make it return an error.
-	mock.ExpectQuery("SELECT COUNT").
-		WithArgs("ws-auth-db-err").
-		WillReturnError(sql.ErrConnDone)
-
-	w := httptest.NewRecorder()
-	c, _ := gin.CreateTestContext(w)
-	c.Request = socketRequest("GET", "/ws", "ws-auth-db-err", "")
-
-	handler.HandleConnect(c)
-
-	if w.Code != http.StatusInternalServerError {
-		t.Errorf("DB error: expected 500, got %d", w.Code)
-	}
-	if err := mock.ExpectationsWereMet(); err != nil {
-		t.Errorf("unmet mock expectations: %v", err)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// Auth gate: workspace HAS live token, missing Bearer → 401
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestSocketHandler_AuthGate_HasLiveToken_MissingBearer_Returns401(t *testing.T) {
-	mock := setupTestDB(t)
-	handler := newSocketHandlerWithDB(t, nil)
-
-	// HasAnyLiveToken succeeds → workspace has a live token.
-	mock.ExpectQuery("SELECT COUNT").
-		WithArgs("ws-has-token-no-bearer").
-		WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
-
-	w := httptest.NewRecorder()
-	c, _ := gin.CreateTestContext(w)
-	c.Request = socketRequest("GET", "/ws", "ws-has-token-no-bearer", "")
-
-	handler.HandleConnect(c)
-
-	if w.Code != http.StatusUnauthorized {
-		t.Errorf("hasLive but no bearer: expected 401, got %d", w.Code)
-	}
-	if err := mock.ExpectationsWereMet(); err != nil {
-		t.Errorf("unmet mock expectations: %v", err)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// Auth gate: workspace HAS live token, invalid Bearer → 401
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestSocketHandler_AuthGate_HasLiveToken_InvalidBearer_Returns401(t *testing.T) {
-	mock := setupTestDB(t)
-	handler := newSocketHandlerWithDB(t, nil)
-
-	wsID := "ws-invalid-token"
-	badToken := "not-a-valid-token"
-
-	// HasAnyLiveToken: workspace has a live token.
-	mock.ExpectQuery("SELECT COUNT").
-		WithArgs(wsID).
-		WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
-
-	// ValidateToken: lookupTokenByHash returns ErrNoRows for an unknown token.
-	// Any token hash is fine since the token doesn't exist — use AnyArg.
-	mock.ExpectQuery(`SELECT t\.id, t\.workspace_id.*FROM workspace_auth_tokens t.*JOIN`).
-		WithArgs(sqlmock.AnyArg()).
-		WillReturnError(sql.ErrNoRows)
-
-	w := httptest.NewRecorder()
-	c, _ := gin.CreateTestContext(w)
-	c.Request = socketRequest("GET", "/ws", wsID, "Bearer "+badToken)
-
-	handler.HandleConnect(c)
-
-	if w.Code != http.StatusUnauthorized {
-		t.Errorf("hasLive but invalid bearer: expected 401, got %d", w.Code)
-	}
-	if err := mock.ExpectationsWereMet(); err != nil {
-		t.Errorf("unmet mock expectations: %v", err)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// Auth gate: workspace HAS live token, VALID Bearer → upgrade attempted.
-// The WebSocket upgrade itself will fail in httptest (gorilla/websocket
-// cannot write a real HTTP/1.1 handshake to httptest.ResponseRecorder), but
-// the auth gate is passed so we verify no 401/500 was returned before the
-// upgrade failure. This is the canvas-client success path.
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestSocketHandler_AuthGate_HasLiveToken_ValidBearer_AuthPassed(t *testing.T) {
-	mock := setupTestDB(t)
-	handler := newSocketHandlerWithDB(t, nil)
-
-	wsID := "ws-valid-token"
-	goodToken := "valid-ws-token-123"
-
-	// HasAnyLiveToken: workspace has a live token.
-	mock.ExpectQuery("SELECT COUNT").
-		WithArgs(wsID).
-		WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
-
-	// ValidateToken: token found and workspace is not removed.
-	// sha256TokenHash returns []byte; rational matcher compares as string.
-	mock.ExpectQuery(`SELECT t\.id, t\.workspace_id.*FROM workspace_auth_tokens t.*JOIN`).
-		WithArgs(sha256TokenHash(goodToken)).
-		WillReturnRows(sqlmock.NewRows([]string{"token_id", "workspace_id"}).
-			AddRow("tok-abc", wsID))
-
-	w := httptest.NewRecorder()
-	c, _ := gin.CreateTestContext(w)
-	c.Request = socketRequest("GET", "/ws", wsID, "Bearer "+goodToken)
-
-	handler.HandleConnect(c)
-
-	// The WebSocket upgrade fails in httptest (httptest.ResponseRecorder is not
-	// a real TCP connection), but the auth gate itself succeeded — we should
-	// NOT see a 401 or 500 response code. The actual code depends on the
-	// upgrade error handling; the critical assertion is that auth passed.
-	if w.Code == http.StatusUnauthorized || w.Code == http.StatusInternalServerError {
-		t.Errorf("valid token: auth should have passed; got %d", w.Code)
-	}
-	if err := mock.ExpectationsWereMet(); err != nil {
-		t.Errorf("unmet mock expectations: %v", err)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// Canvas client (no X-Workspace-ID): auth gate bypassed, upgrade attempted.
-// Same httptest limitation as above — we verify no 401/500 before the upgrade.
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestSocketHandler_CanvasClient_NoAuthGate(t *testing.T) {
-	mock := setupTestDB(t)
-	handler := newSocketHandlerWithDB(t, nil)
-
-	// No X-Workspace-ID header → no auth check → no DB queries expected.
-	w := httptest.NewRecorder()
-	c, _ := gin.CreateTestContext(w)
-	c.Request = socketRequest("GET", "/ws", "", "") // no workspace ID
-
-	handler.HandleConnect(c)
-
-	// No auth gate hit → no 401/500. The WebSocket upgrade itself will fail
-	// in httptest, but that's expected (see TestSocketHandler_AuthGate_HasLiveToken_ValidBearer_AuthPassed).
-	if w.Code == http.StatusUnauthorized || w.Code == http.StatusInternalServerError {
-		t.Errorf("canvas client: expected no auth error; got %d", w.Code)
-	}
-	if err := mock.ExpectationsWereMet(); err != nil {
-		t.Errorf("unmet mock expectations: %v", err)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// Legacy workspace: HAS live token flag but workspace exists AND ValidateToken
-// is called. Since the workspace has a live token, the handler MUST validate
-// the presented token (not grandfather through). This is the Phase 30.1/30.2
-// contract — a workspace with tokens on file is NOT grandfathered.
-// ─────────────────────────────────────────────────────────────────────────────
-
-func TestSocketHandler_AuthGate_HasLiveToken_EmptyBearer_Returns401(t *testing.T) {
-	mock := setupTestDB(t)
-	handler := newSocketHandlerWithDB(t, nil)
-
-	wsID := "ws-has-live-token-empty-bearer"
-
-	// HasAnyLiveToken: workspace has a live token.
-	mock.ExpectQuery("SELECT COUNT").
-		WithArgs(wsID).
-		WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
-
-	// Authorization header is "Bearer " (empty token after "Bearer ").
-	// wsauth.BearerTokenFromHeader strips "Bearer " and gets "".
-	// ValidateToken is called with "" → returns ErrInvalidToken before DB hit.
-	w := httptest.NewRecorder()
-	c, _ := gin.CreateTestContext(w)
-	c.Request = socketRequest("GET", "/ws", wsID, "Bearer ")
-
-	handler.HandleConnect(c)
-
-	if w.Code != http.StatusUnauthorized {
-		t.Errorf("empty bearer after Bearer prefix: expected 401, got %d", w.Code)
-	}
-}
-
-// ─────────────────────────────────────────────────────────────────────────────
-// helpers
-// ─────────────────────────────────────────────────────────────────────────────
-
-// sha256TokenHash returns the SHA256 hash of a plaintext token, matching what
-// wsauth.ValidateToken does internally before querying the DB.
-func sha256TokenHash(plaintext string) []byte {
-	h := sha256.Sum256([]byte(plaintext))
-	return h[:]
-}
@@ -43,27 +43,6 @@ func (s *syncBuf) String() string {
 	return s.b.String()
 }

-// unwrapGoError extracts subprocess stderr from a Go-wrapped error that
-// includes combined output. e.g. from sendSSHPublicKey's
-// fmt.Errorf("send-ssh-public-key: %w (%s)", err, combinedOut), this
-// returns the " (%s)" portion — the actionable subprocess signal like
-// "AccessDeniedException: ... is not authorized to perform:
-// ec2-instance-connect:OpenTunnel". Returns "" when the output is
-// identical to the error string (no stderr captured).
-func unwrapGoError(errMsg string) string {
-	// Extract content between the last '(' and trailing ')'. The
-	// sendSSHPublicKey wrapper uses fmt.Errorf("...: %w (%s)", err, combinedOut)
-	// so the subprocess stderr is always the last parenthesised segment,
-	// e.g. "send-ssh-public-key: exit status 1 (AccessDeniedException: ...)"
-	// — note the closing ')' is at the very end with no trailing space.
-	open := strings.LastIndex(errMsg, "(")
-	if open < 0 {
-		return ""
-	}
-	inner := errMsg[open+1:]
-	return strings.TrimSuffix(inner, ")")
-}
-
 // HandleDiagnose handles GET /workspaces/:id/terminal/diagnose. It runs the
 // same per-step pipeline as HandleConnect (ssh-keygen → EIC send-key → tunnel
 // → ssh) but non-interactively, captures the first failing step and its
@@ -235,18 +214,12 @@ func (h *TerminalHandler) diagnoseRemote(ctx context.Context, workspaceID, insta
 	}

 	// Step 2: send-ssh-public-key (AWS Instance Connect)
-	// mc#687: populate Detail so the E2E smoke sees the AWS permission error
-	// verbatim. The subprocess stderr (e.g. "AccessDeniedException: ... is not
-	// authorized to perform: ec2-instance-connect:OpenTunnel") is captured by
-	// sendSSHPublicKey's CombinedOutput() and embedded in the Go error string.
 	t0 = time.Now()
 	if err := sendSSHPublicKey(ctx, region, instanceID, osUser, strings.TrimSpace(string(pubKey))); err != nil {
-		errMsg := err.Error()
 		return stop("send-ssh-public-key", diagnoseStep{
 			Name:       "send-ssh-public-key",
 			DurationMs: time.Since(t0).Milliseconds(),
-			Error:      errMsg,
-			Detail:     unwrapGoError(errMsg),
+			Error:      err.Error(),
 		})
 	}
 	res.Steps = append(res.Steps, diagnoseStep{Name: "send-ssh-public-key", OK: true, DurationMs: time.Since(t0).Milliseconds()})
@@ -245,50 +245,3 @@ func TestDiagnoseRemote_StopsAtSSHProbe(t *testing.T) {
 	}
 }

-// TestUnwrapGoError pins the unwrapGoError helper that extracts subprocess
-// stderr from the Go-wrapped error string produced by sendSSHPublicKey.
-// Regression gate for mc#687: the E2E smoke now reads detail (not error),
-// so detail MUST contain the actionable AWS permission signal.
-func TestUnwrapGoError(t *testing.T) {
-	cases := []struct {
-		name   string
-		input  string
-		want   string
-	}{
-		{
-			name:  "AWS permission denied",
-			input: "send-ssh-public-key: exec: exit status 1 (AccessDeniedException: User: arn:aws:iam::123456789012:role/TestRole is not authorized to perform: ec2-instance-connect:OpenTunnel)",
-			want:  "AccessDeniedException: User: arn:aws:iam::123456789012:role/TestRole is not authorized to perform: ec2-instance-connect:OpenTunnel",
-		},
-		{
-			name:  "generic exec error no output",
-			input: "send-ssh-public-key: exec: exit status 1",
-			want:  "",
-		},
-		{
-			name:  "empty string",
-			input: "",
-			want:  "",
-		},
-		{
-			name:  "short string below threshold",
-			input: "err",
-			want:  "",
-		},
-		{
-			name:  "no parentheses",
-			input: "send-ssh-public-key: something went wrong",
-			want:  "",
-		},
-	}
-
-	for _, tc := range cases {
-		t.Run(tc.name, func(t *testing.T) {
-			got := unwrapGoError(tc.input)
-			if got != tc.want {
-				t.Errorf("unwrapGoError(%q): got %q, want %q", tc.input, got, tc.want)
-			}
-		})
-	}
-}
-
Author	SHA1	Message	Date
core-uiux	4c6365c7c1	test(canvas/mobile): add primitives.test.tsx coverage (19 cases) CI / Detect changes (pull_request) Successful in 22s Details E2E API Smoke Test / detect-changes (pull_request) Successful in 21s Details E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 37s Details Harness Replays / detect-changes (pull_request) Successful in 21s Details Handlers Postgres Integration / detect-changes (pull_request) Successful in 34s Details Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s Details Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 22s Details qa-review / approved (pull_request) Successful in 10s Details gate-check-v3 / gate-check (pull_request) Failing after 19s Details security-review / approved (pull_request) Successful in 9s Details sop-checklist-gate / gate (pull_request) Successful in 11s Details sop-tier-check / tier-check (pull_request) Successful in 13s Details lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m18s Details Block internal-flavored paths / Block forbidden paths (pull_request) Failing after 10m19s Details CI / Platform (Go) (pull_request) Successful in 10s Details CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s Details CI / Python Lint & Test (pull_request) Successful in 10s Details E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 10s Details audit-force-merge / audit (pull_request) Has been skipped Details Harness Replays / Harness Replays (pull_request) Successful in 8s Details Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 14s Details Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 8s Details E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m21s Details CI / Canvas (Next.js) (pull_request) Successful in 13m39s Details CI / Canvas Deploy Reminder (pull_request) Has been skipped Details CI / all-required (pull_request) Successful in 5s Details Cover StatusDot (size, circle, halo, flexShrink), TierChip (tiers, size variants, flexShrink), Chip (value, label+value, pill shape, soft/accent mode), SectionLabel (text, right slot, uppercase). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
app-fe	91f6e77b47	feat(mobile): FilterChips + AgentCard WCAG 2.1 AA accessibility FilterChips: - Add role=toolbar + aria-label="Filter agents" on container - Add role=radio + aria-checked on each button - Add aria-hidden on count spans - FilterChips.test.tsx: 9 cases AgentCard: - Add aria-label composing name, status, tier, remote flag - AgentCard.test.tsx: 8 cases 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-12 09:38:47 +00:00
app-fe	685ce32f20	feat(mobile): TabBar WCAG 2.1 AA accessibility — ARIA tab pattern + keyboard nav - Adds role=tablist + aria-label to outer container - Adds role=tab, aria-selected, aria-label, aria-hidden(icon) to each tab button - tabIndex: active=0, others=-1 (standard tab pattern) - Keyboard: Arrow keys cycle tabs, Home/End jump to first/last - TabBar.test.tsx: 12 cases covering render states and keyboard interaction 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-12 09:38:47 +00:00
core-uiux	fd5c4f607c	test(canvas): add form-inputs coverage (35 cases) + Section accessibility fix + form-inputs.test.tsx: 35 cases across TextInput, NumberInput, Toggle, TagList, and Section — pure presentational components in the Config tab. Uses vi.hoisted() patterns from established suite; no jest-dom matchers. + form-inputs.tsx (Section): add aria-expanded + aria-controls to the collapsible toggle button for WCAG 2.1 AA compliance. The content div gets a stable id derived from the title; aria-controls links button to region. Indicator span gains aria-hidden="true" (decorative only). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	e7f8982b47	test(canvas/settings,chat): add coverage for EmptyState, SearchBar, UnsavedChangesGuard, AttachmentVideo - EmptyState: 6 cases — icon aria-hidden, title, body text, CTA button - SearchBar: 14 cases — store binding, onChange, Escape, Ctrl/Cmd+F focus - UnsavedChangesGuard: 7 cases — dialog states, Keep/Discard actions, backdrop FIX: UnsavedChangesGuard now wires onDiscard via pendingDiscard ref so clicking Discard correctly calls the callback on dialog close - AttachmentVideo: 8 cases — loading/ready/error states, tone borders, blob URL cleanup, external URI direct href No breaking changes. 2387 tests passing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	86873855d2	test(canvas/settings): add DeleteConfirmDialog + SettingsButton coverage (26 cases) - DeleteConfirmDialog (15 cases): dialog open via secret:delete-request event, title/body text, Cancel closes, dependents loading/list/none states, deleteSecret call, confirm 1s delay, disabled→enabled button transition - SettingsButton (11 cases): aria-label, aria-expanded, gear SVG aria-hidden, toggle openPanel/closePanel, active class, tooltip Mac/Ctrl shortcut ResizeObserver polyfill for Radix Tooltip No breaking changes. 2413 tests passing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	6fcb4a1f14	test(canvas/settings): add ServiceGroup coverage (10 cases) - role=group with aria-label containing service label - Service icon aria-hidden, correct emoji per service name - Count label: "1 key" vs "N keys" - Renders SecretRow for each secret - Header and rows div structure Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	88d0f5356a	test(canvas/chat): add AttachmentImage coverage (10 cases) Adds Vitest coverage for AttachmentImage — inline image thumbnail with click-to-fullscreen lightbox. Covers: loading skeleton (240×180), ready state with blob URL, tone=user/agent border classes, lightbox open/close on click and Escape, AttachmentChip error fallback, img onError transition to chip, external URI direct href (no fetch), and blob URL cleanup on unmount. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	5de96bc2a9	test(canvas/chat): add AttachmentAudio + AttachmentPDF coverage (18 cases) Adds Vitest coverage for two missing attachment renderers: AttachmentAudio (9 cases): - Loading skeleton (280x40) with aria-label - <audio controls> with blob src when ready - Filename label in ready state - tone=user -> blue/accent border - tone=agent -> neutral border - Error -> AttachmentChip fallback - audio onError -> chip transition - External URI -> direct href, no fetch - Blob URL cleanup on unmount AttachmentPDF (9 cases): - Loading skeleton with PdfGlyph + filename - Preview button with glyph, filename, "PDF" label - Lightbox opens with <embed> on click - Lightbox closes on Escape - tone=user -> blue/accent classes on button - tone=agent -> neutral border - Error -> AttachmentChip fallback - External URI -> direct href, no fetch - Blob URL cleanup on unmount Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	f14e755001	test(canvas/chat): add AttachmentTextPreview coverage (12 cases) Adds Vitest coverage for AttachmentTextPreview — inline text/code preview with streaming fetch and expand/truncate. Covers: - Loading skeleton (320x80) with aria-label - Ready state with correct text content - Filename shown in header - Expand button appears when lines > 10 - Expand button hidden when all lines shown - Expand button updates display to full content - Download button calls onDownload - tone=user -> blue/accent border - tone=agent -> neutral border - Truncated notice when file exceeds 256 KB - Error -> AttachmentChip fallback - Cleanup on unmount Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00
core-uiux	d83c835845	test(settings): add TokensTab coverage (12 cases) 12 passing: loading spinner, empty state, token list rendering, each token's prefix/age/Revoke button, API URL correctness, revoke confirm + cancel dialogs, new-token creation + dismiss, create error, network error banner. Root bug fixed: confirm button search was unscoped — when the dialog opened, two "Revoke" buttons existed (tok2's row + dialog confirm); find() returned tok2's button first. Scoped the search to document.querySelector('[role="dialog"]') to hit the correct target. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 09:38:47 +00:00