Four fixes for the cascade-delete confirmation modal:
1. Cancel button hover was a no-op: bg-surface-card on top of the
same base — clicking did something but the button looked dead.
Lifted to surface-elevated, matching the ConfirmDialog Cancel
pattern.
2. Delete button hovered LIGHTER (bg-red-500 over bg-red-600). On
white text that drops contrast below AA — same trap fixed in
ConfirmDialog and ApprovalBanner. Flipped to bg-red-700 so hover
stays readable in both themes.
3. Checkbox ring-offset color was zinc-900 — but the dialog actually
sits on bg-surface-sunken, so the offset showed the wrong color
through the ring gap. Corrected to ring-offset-surface-sunken.
Also moved focus → focus-visible so the ring only shows on
keyboard nav, not mouse clicks.
4. Cancel + Delete had no focus-visible rings. Added accent ring
on Cancel, danger ring on Delete, both with the correct
ring-offset-surface-sunken.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User report: handing the modal's Claude Code channel snippet to an
agent fails immediately with two errors that the snippet doesn't tell
the operator how to resolve:
plugin:molecule@Molecule-AI/molecule-mcp-claude-channel · plugin not installed
plugin:molecule@Molecule-AI/molecule-mcp-claude-channel · not on the approved channels allowlist
Root cause: the snippet's `claude --channels plugin:...` line assumes
the plugin is pre-installed AND that the channel is on Anthropic's
default allowlist. Both assumptions are wrong for a custom Molecule
plugin in a public repo.
Two changes:
1. Rewrite externalChannelTemplate (Go) with full setup chain:
- Bun prereq check (channel plugins are Bun scripts)
- `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel`
+ `/plugin install molecule@molecule-mcp-claude-channel` BEFORE the
launch — otherwise "plugin not installed"
- `--dangerously-load-development-channels` flag on launch — required
for non-Anthropic-allowlisted channels, otherwise "not on approved
channels allowlist"
- Common-errors block at the bottom mapping each error string to
which numbered step recovers it
- Team/Enterprise managed-settings caveat (the dev-channels flag is
blocked there; admin must use channelsEnabled + allowedChannelPlugins)
Plugin install info verified by reading `Molecule-AI/molecule-mcp-claude-channel`
plugin.json (`name: "molecule"`) and the Claude Code channels +
plugin-discovery docs at code.claude.com/docs/en/{channels,discover-plugins}.
2. Add per-tab HelpBlock to the modal (canvas):
- Collapsible <details> below each snippet, closed by default so the
snippet stays the visual focus
- "Where to install" link (PyPI for runtime, claude.com for Claude
Code, github.com/openai/codex for Codex, NousResearch/hermes-agent
for Hermes)
- "Documentation" link (docs.molecule.ai/docs/guides/*; hostname
confirmed by existing blog post canonical metadata; paths map
1:1 to docs/guides/*.md files in this repo)
- "Common errors" list with concrete recovery steps for each tab
(e.g. Codex tab calls out the codex≥0.57 requirement and TOML
duplicate-table parse error; OpenClaw calls out the :18789 port
conflict check)
URL discipline: every URL is either (a) verified against a file path
in this repo's docs/, (b) the canonical repo of an existing snippet
reference, or (c) a well-known third-party canonical URL. No guessed
URLs — broken links would defeat the purpose of "more comprehensive
instructions."
Verification:
- `go build ./...` clean in workspace-server
- `go test ./internal/handlers/...` passes (4.3s)
- Bash syntax check on test_staging_full_saas.sh (no edits there) clean
- TS brace/paren/bracket counts balanced; no full tsc run because the
worktree's node_modules isn't installed — counterpart Canvas tabs E2E
on the PR will exercise the full type-check + render path
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub Actions scheduler de-prioritises :00 cron firings under load.
Empirical 2026-05-03: the canary's cron was '0,20,40 * * * *' but
actual firings landed at :08, :03, :01, :03 — :20 and :40 silently
dropped. Detection latency degraded from claimed 20 min to actual
~60 min worst case.
Move to '10,30,50 * * * *':
- :10/:30/:50 sit 10 min off the top-of-hour load peak
- Still 5 min from :15 sweep-cf-orphans and :45 sweep-cf-tunnels
(the original constraint that kept us off :15/:45)
- Same 20-min cadence; only the phase changes
No code change beyond the cron expression + comment refresh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small UIUX fixes for the bundle drag-import surface.
1. Drag overlay was hardcoded blue-950/blue-400 — those tones don't
exist in the warm-paper light theme, so the overlay washed out
inconsistently. Switched to bg-accent/15 + border-accent/40 so
the overlay flips with theme and matches the inner card's
border-accent/50.
2. Importing spinner was visually obvious but invisible to screen
readers — only the result toast had aria-live. Operators relying
on AT had no way to know the import was in flight. Added
role="status" + aria-live="polite" + aria-hidden on the spinner
itself so the SR hears "Importing bundle..." once.
3. animate-spin → motion-safe:animate-spin so the spinner respects
prefers-reduced-motion (Tailwind's built-in variant gates the
animation on the user's OS setting). Layout doesn't change in
either case — text alone communicates state.
Also dropped border-sky-400 → border-accent on the spinner so it
matches the rest of the canvas semantics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four UIUX fixes for the EC2 console modal:
1. Copy and Close buttons had hover:bg-surface-card on TOP of the
same base bg-surface-card — silent no-op hover. Lifted to
surface-elevated + line-soft border, matching ConfirmDialog's
Cancel pattern. The button visibly responds now.
2. Copy button silently succeeded — no toast, no animation, no UI
feedback. Operators clicking it had no idea whether anything
landed in the clipboard. Now fires showToast on resolve/reject
so the action is observable.
3. × close button was ~10x16px (well under WCAG 2.5.5's 24x24).
Bumped to w-6 h-6 with focus-visible ring + hover bg.
4. Added focus-visible:ring-accent/60 + ring-offset-surface to
all three buttons so keyboard users see focus. Matches the
semantic ring pattern used across the canvas.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small fixes for the batch-action toolbar:
1. The deselect button's title says "Clear selection (Escape)" — but
pressing Escape did NOTHING. The title has been lying since the bar
shipped. Now wired: window keydown handler calls clearSelection
when Esc fires. Skipped while the confirm dialog is open
(`pending !== null`) so the dialog's own Esc-cancels takes
precedence, and skipped during a busy in-flight action so the
user can't strand a partial-failure mid-flight.
2. focus-visible:ring-zinc-500/70 → focus-visible:ring-accent/50
on the deselect button. The hardcoded zinc broke the semantic-
token pattern used by the other action buttons.
Tests: two new vitest cases — Esc clears with selection, Esc no-op
when empty (the bar isn't mounted at count===0 so the listener never
registers). Full suite: 1222/1222.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to #2648 — same `>/dev/null || true` swallow-on-error
pattern existed in:
e2e-staging-canvas.yml (single-slug)
e2e-staging-saas.yml (loop)
e2e-staging-sanity.yml (loop)
e2e-staging-external.yml (loop, was `>/dev/null 2>&1` variant)
All four now capture the HTTP code, log a "[teardown] deleted $slug
(HTTP $code)" line on success, and emit a workflow warning naming
the slug + body excerpt on non-2xx. Loop bodies also tally + summarise
total leaks at the end.
Exit semantics unchanged: a single cleanup miss still doesn't fail-flag
the test (sweep-stale-e2e-orgs is the safety net within ~45 min). The
behavior change is purely surfacing — failures that were silent are
now visible on the workflow run page.
Pairs with #2648's tightened sweeper. Together: per-run cleanup
failures are visible AND the safety net catches them quickly.
Closes the per-workflow port noted as out-of-scope in #2648.
See molecule-controlplane#420.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two changes that close one of the leak classes from the
molecule-controlplane#420 vCPU audit:
1. sweep-stale-e2e-orgs.yml: cron */15 (was hourly), MAX_AGE_MINUTES
30 (was 120). E2E runs are 8-25 min wall clock; 30 min is safely
above the longest run while shrinking the worst-case leak window
from ~2h to ~45 min (15-min sweep cadence + 30-min threshold).
2. canary-staging.yml teardown: the per-slug DELETE used `>/dev/null
|| true`, which swallowed every failure. A 5xx or timeout from CP
looked identical to "successfully deleted" and the canary tenant
kept eating ~2 vCPU until the sweeper caught it. Now we capture
the response code and surface non-2xx as a workflow warning that
names the leaked slug.
The exit semantics stay unchanged — a single-canary cleanup miss
shouldn't fail-flag the canary itself when the actual smoke check
passed. The sweeper is the safety net for whatever slips past.
Caught during the molecule-controlplane#420 audit on 2026-05-03 —
3 e2e canary tenant orphans were running for 24-95 min, all under
the previous 120-min sweep threshold so they went unnoticed until
manual cleanup. Same `|| true` pattern exists in
e2e-staging-{canvas,external,saas,sanity}.yml; out of scope for
this PR (mechanical port; tracking separately) but the sweeper
tightening covers all of them by reducing the safety-net latency.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small a11y fixes for the floating legend.
1. Both buttons (open pill + close ×) had no focus-visible ring —
keyboard users couldn't tell where focus landed. Added the
accent-ring pattern used across the rest of the canvas.
2. Close button was a ~10x16px hit area — well below WCAG 2.5.5's
24x24 minimum. Bumped to w-6 h-6 with negative margin so the
visible × stays in the same spot but the hit area + focus ring
are larger. Hover bg added to make the hit area visible on hover.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three fixes for the cookie banner:
1. role="dialog" aria-modal="true" → <section role="region">. The
banner has no focus trap, doesn't block the page, and the user
can keep using the canvas while it's up — none of which are modal
semantics. Claiming aria-modal="true" without a trap actively
harms screen-reader users: they're told the rest of the page is
inert, jump into the banner, and then can't escape. Region
semantics let AT navigate around it normally. (Forcing a modal
cookie banner would also be a dark pattern under GDPR.)
2. Privacy-policy link: hover:text-accent → hover:text-accent-strong.
The original was a no-op (same color). Also added focus-visible
ring + underline-offset so the link is readable AND keyboard-
distinguishable in both themes.
3. Both buttons: focus-visible:ring-2 + ring-offset-surface so
keyboard users see where focus lands. Mouse clicks unchanged
thanks to focus-visible.
Tests: swapped getByRole("dialog") → getByRole("region") in 8
existing tests, then tightened the role-assertion test into a
regression guard that explicitly asserts NO aria-modal and NO
dialog role exist. Full suite: 1220/1220.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cuts the per-run LLM cost ~10x (MiniMax M2.7 vs gpt-4.1-mini) and
removes the recurring OpenAI-quota-exhaustion failure mode that took
the canary down on 2026-05-03 (#265 — staging quota burnt for ~16h).
Path:
E2E_RUNTIME=claude-code (default)
→ workspace-configs-templates/claude-code-default/config.yaml's
`minimax` provider (lines 64-69)
→ ANTHROPIC_BASE_URL auto-set to api.minimax.io/anthropic
→ reads MINIMAX_API_KEY (per-vendor env, no collision with
GLM/Z.ai etc.)
Workflow changes (continuous-synth-e2e.yml):
- Default runtime: langgraph → claude-code
- New env: E2E_MODEL_SLUG (defaults to MiniMax-M2.7-highspeed,
overridable via workflow_dispatch)
- New secret wire: E2E_MINIMAX_API_KEY ←
secrets.MOLECULE_STAGING_MINIMAX_API_KEY
- Per-runtime missing-secret guard: claude-code requires MINIMAX,
langgraph/hermes require OPENAI. Cron firing hard-fails on missing
key for the active runtime; dispatch soft-skips so operators can
ad-hoc test without setting up the secret first
- Operators can still pick langgraph/hermes via workflow_dispatch;
the OpenAI fallback path stays wired
Script changes (tests/e2e/test_staging_full_saas.sh):
- SECRETS_JSON branches on which key is set:
E2E_MINIMAX_API_KEY → {MINIMAX_API_KEY: <key>} (claude-code path)
E2E_OPENAI_API_KEY → {OPENAI_API_KEY, HERMES_*, MODEL_PROVIDER} (legacy)
MiniMax wins when both are present — claude-code default canary
must not accidentally consume the OpenAI key
Tests (new tests/e2e/test_secrets_dispatch.sh):
- 10 cases pinning the precedence + payload shape per branch
- Discipline check verified: 5 of 10 FAIL on a swapped if/elif
(precedence inversion), all 10 PASS on the fix
- Anchors on the section-comment header so a structural refactor
fails loudly rather than silently sourcing nothing
The model_slug dispatcher (lib/model_slug.sh) needs no change:
E2E_MODEL_SLUG override path is already wired (line 41), and
claude-code template's `minimax-` prefix matcher catches
"MiniMax-M2.7-highspeed" via lowercase-on-lookup.
Operator action required to land green:
- Set MOLECULE_STAGING_MINIMAX_API_KEY in repo secrets
(Settings → Secrets and Variables → Actions). Use
`gh secret set MOLECULE_STAGING_MINIMAX_API_KEY -R Molecule-AI/molecule-core`
to avoid leaking the value into shell history.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The staging canary's A2A step has a ladder of specific regression
classifiers (hermes-agent down, model_not_found, Invalid API key,
etc.) followed by a generic "error|exception" catch-all. Provider-
side OpenAI 429 quota errors fell through to the catch-all, so the
canary issue body and CI log just said "A2A returned an error-shaped
response" — which is technically true but obscures the actual
operator action.
This adds a 7th classifier above the catch-all for "exceeded your
current quota" / "insufficient_quota" — both terms appear in
OpenAI's quota-exhaustion 429 response. When matched, the failure
message names the operator action directly (top up MOLECULE_STAGING_OPENAI_KEY
or rotate the secret) and links to #2578.
Why this is correct, not "lowering the bar":
- Steps 0–7 of the canary cover full platform health (CP up, tenant
provisioned, DNS+TLS reachable, workspace booted, A2A delivered).
- Reaching step 8 with a provider-side 429 means the platform IS
healthy — the failure is downstream of all platform invariants.
- The canary still exits 1 (CI stays red, threshold-3 alarm still
fires); only the failure message changes.
- All 6 existing specific classifiers run BEFORE this one, so any
real platform regression is still caught with its specific message.
Verification:
- Regex tested against the actual 429 string from canary run 25291517608:
"API call failed after 3 retries: HTTP 429: You exceeded your current quota..."
→ matches ✅
- Negative tests: "PONG", "hermes-agent unreachable" → no match ✅
- bash -n syntax check passes
- shellcheck -S error clean
Tracking: #2593 (canary), #2578 (root cause)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small UIUX fixes for Cmd+K search.
1. Auto-highlight the first match while the user types. Before, Enter
on a non-empty query was a no-op — focusedIndex stayed at -1 until
the user pressed ↓. Standard search-palette behavior is to highlight
the top result so Enter just works. Empty query keeps -1 (opening
the dialog shows ALL workspaces; arbitrarily pinning one looks
wrong).
2. placeholder-zinc-400 → placeholder-ink-soft. The hardcoded zinc
broke the semantic-token pattern other inputs use; placeholder now
flips with theme correctly. (Also reordered focus:outline-none
ahead of the focus-visible variants — cosmetic, more idiomatic.)
Tests: replaced the "resets to -1" test with two new ones — auto-
highlight on a matching query (Enter selects without ArrowDown), and
no-results query stays a no-op. Full suite 1220/1220.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small fixes for the workspace right-click menu:
1. Off-screen clamp. Right-clicking near the right or bottom edge of
the canvas put part of the menu past the viewport — items hidden
under the scrollbar / off the screen. The menu now measures itself
on the same rAF that auto-focuses the first item, and shifts back
inside with an 8px margin (matching the floating-tooltip top-edge
clamp in Tooltip.tsx). Falls back to the raw cursor coords for the
first paint frame so there's no flash.
2. focus:ring-zinc-600 → focus-visible:ring-accent/50. The hardcoded
zinc tone broke the semantic-token pattern every other surface
uses; flipping to focus-visible also stops the ring from showing
when items are clicked with the mouse (only keyboard nav now
triggers the ring, matching Toolbar/SidePanel behavior).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two diagnostic upgrades to the Playwright staging-setup harness, both
zero-behavior-change:
1. provision-failed throw now includes the full admin-orgs row (boot
stage, last error, terraform/SSM state, etc) instead of just the
slug. Every "provision failed: <slug>" in CI history was followed
by a manual repro to find out WHY — that round-trip is gone.
2. workspace-failed throw dumps the full /workspaces/{id} body when
last_sample_error is empty. Boot crashes, image-pull errors,
missing PYTHONPATH, and OpenAI-quota-at-startup all surface as a
bare "Workspace failed:" today (see #2632). Now they carry the
boot_stage / image / last_error fields the API row exposes.
No fix for the underlying flakes — those are tracked in #2632 (CP race)
and #2578 (OpenAI quota). This just stops them looking identical in the
CI log.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small a11y fixes for the global toast surface:
1. Esc dismisses the newest toast. Errors never auto-expire, so without
a keyboard shortcut a keyboard-only user has to tab through the entire
app to reach the × button on a stuck error.
2. Dismiss button gets focus-visible ring + theme-aware tint. The previous
`opacity-70 hover:opacity-100` gave no visible focus indicator (WCAG
2.4.7). Info toasts use the semantic surface that flips with theme,
so the dismiss tint splits per type — accent ring on info, white ring
on the always-dark success/error toasts.
3. Touch target bumps from p-1 (~24x24) to w-7 h-7 (28x28) toward WCAG
2.5.5 AAA's 44x44 ideal.
Tests: 5 new vitest cases covering Esc on info/error, no-op on empty
queue, accessible label, and per-toast click dismissal.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>