Self-review of the modal-tab additions caught footguns in the new
hermes/codex/openclaw snippets. Ship the fixes before merge.
Critical 1 — Hermes `cat >> ~/.hermes/config.yaml` corrupts existing
configs. Most existing hermes installs have a top-level gateway:
block; appending creates a duplicate, which YAML rejects. Replaced
the auto-append with explicit instructions: 'under your existing
gateway: block, add a plugin_platforms entry'.
Critical 2 — Codex `cat >> ~/.codex/config.toml` corrupts on
re-run. TOML rejects duplicate [mcp_servers.molecule] tables; a
second run breaks codex parse. Replaced auto-append with commented
config block + explicit 'open ~/.codex/config.toml in your editor
and paste'. Canvas-side token stamping still hits the literal in
the comment so the operator's clipboard has the real token already
substituted.
Required 3 — OpenClaw `onboard --non-interactive` missing
provider/model defaults. Added explicit --provider + --model
placeholders in a commented form so operators see what's needed
without a stub default applying silently.
Required 4 — OpenClaw gateway started with bare '&' dies on
terminal close. Switched to nohup + log file + disown, with a note
that systemd is the right answer for production.
Optional 5 + 6 (env_vars cleanup, tests) deferred — env_vars stripped
to keep the in-tree-vs-external surface narrow; tests for the new
response fields can land separately when external_connection.go is
next touched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The External Connect modal had tabs for Python SDK / curl / Claude Code
channel / Universal MCP. Operators using hermes / codex / openclaw as
their external runtime had no copy-paste; they pieced together
WORKSPACE_ID + PLATFORM_URL + auth_token into config files by reading
docs.
Adds three runtime-specific snippets stamped server-side:
- **Hermes** — installs molecule-ai-workspace-runtime + the
hermes-channel-molecule plugin, exports the 4 env vars, and writes
the gateway.plugin_platforms.molecule block into ~/.hermes/config.yaml.
Same long-poll-based push semantics the Claude Code channel tab
delivers (push parity with the in-tree template-hermes adapter).
- **Codex** — wires the molecule_runtime A2A MCP server into
~/.codex/config.toml ([mcp_servers.molecule] block with env_vars
passthrough + literal env values). Outbound tools only — codex's
MCP client doesn't route arbitrary notifications/* (verified by
reading codex-rs/codex-mcp/src/connection_manager.rs); push parity
on external codex would need a separate bridge daemon, tracked
as future work. Snippet calls this out so operators know to pair
with Python SDK if they need inbound delivery.
- **OpenClaw** — installs openclaw + onboards, wires the molecule
MCP server via openclaw mcp set, starts the gateway on loopback.
Same outbound-tools-only caveat as codex; the in-tree template-
openclaw adapter implements the full sessions.steer push path,
but an external setup would need the same bridge daemon to translate
platform inbox events into sessions.steer calls. Future work.
Default open tab changed from "Claude Code" to "Universal MCP".
Universal MCP is runtime-agnostic and works as a starting point for
any operator regardless of their downstream agent runtime; runtime-
specific tabs are still one click away. Pre-2026-05-03 the modal
defaulted to Claude Code, so operators using non-Claude runtimes
opened to a tab they had to skip past.
Tab order also reorganized:
Universal MCP → Python SDK → Claude Code → Hermes → Codex → OpenClaw → curl → Fields
Each runtime-specific tab is gated on the platform supplying the
snippet (older platform builds without the field don't show empty
tabs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regression from PR #2618 (chat dark-contrast).
PR #2618 switched the agent bubble bg to `dark:bg-zinc-700` so it
visibly elevates against the dark panel — but the inner ReactMarkdown
prose div only got `prose-invert` for USER messages. Result: in dark
mode the agent's markdown text rendered with the Tailwind Typography
plugin's default dark body color on top of the new dark bg = invisible
text. User reported empty-looking gray rectangles where agent replies
should be.
Fix: apply `dark:prose-invert` to agent bubbles so prose body text
flips light alongside the bg. Light mode unchanged (default prose
colors against the warm `bg-surface-card`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three issues on a high-stakes surface (revoke token, delete workspace,
cascade delete):
1. **Cancel hover was a no-op.** `bg-surface-card hover:bg-surface-card`
gave zero visual feedback on hover. Now hovers to surface-elevated
with a softened border so the button visibly lifts.
2. **Confirm hovers went LIGHTER, dropping white-text contrast.**
`bg-red-600 hover:bg-red-500` made the destructive button less
readable on hover. Same for warning (amber) and primary (accent).
Reversed to hover-darker so contrast holds in both themes.
3. **No focus-visible rings on either button.** Keyboard users had no
indication of focus position (WCAG 2.4.7 fail). Added
`focus-visible:ring-2 focus-visible:ring-accent/40` on Cancel and
`focus-visible:ring-2 focus-visible:ring-offset-2 ...accent/60` on
Confirm so the focused destructive action is unambiguous.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2571 fixed synth-E2E by branching MODEL_SLUG per runtime, but only
the langgraph branch was verified at runtime — hermes / claude-code /
override / fallback had zero automated coverage. A future regression
(e.g. dropping the langgraph case) would silently revert and only
surface as "Could not resolve authentication method" mid-E2E.
This PR:
- Extracts the dispatch into tests/e2e/lib/model_slug.sh as a sourceable
pick_model_slug() function. No behavior change.
- Adds tests/e2e/test_model_slug.sh — 9 assertions across all 5 dispatch
branches plus the override path. Verified to FAIL when any branch is
flipped (manually regressed langgraph slash-form to confirm the test
catches it; restored before commit).
- Wires the unit test into ci.yml's existing shellcheck job (only runs
when tests/e2e/ or scripts/ change). Pure-bash, no live infra.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User screenshot showed pale lavender user bubbles with hard-to-read white
text and a nearly-invisible agent bubble blending into the dark panel.
Root causes:
1. Tailwind v4 defaults `dark:` to `prefers-color-scheme: dark`. Our
ThemeProvider writes `data-theme="dark"` on <html> so user toggle wins
over OS — but `dark:` classes elsewhere in the codebase weren't
tracking it. Added `@custom-variant dark` to re-bind the variant.
2. `bg-accent` themes lighter in dark mode (--color-accent: #6883e8),
dropping white-text contrast to ~3:1 (fails WCAG AA). Switched user
bubble to solid blue-600/500 so it stays ~5:1 in both modes.
3. `bg-surface-card` (#1a1d23) was only ~7% lighter than the panel bg
(#0e1014), making agent bubbles disappear. Bumped to zinc-700 in
dark; light mode keeps the warm surface-card tint.
4. System (error) bubble's /10 overlay was nearly invisible; raised to
/25 in dark with stronger border + ink for readability.
Sub-tab + textarea polish included: low-contrast `text-ink-soft` →
`text-ink-mid`, focus-visible rings on tabs, dark variants on textarea.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chat_history query
WHERE workspace_id = $1
AND activity_type = 'a2a_receive'
AND (source_id = $2 OR target_id = $2)
ORDER BY created_at DESC
forces a workspace-scoped seq-scan-and-filter at every call —
idx_activity_ws_type_time covers workspace_id+type prefix but the
(source OR target) clause then walks every workspace row. Demo
workspaces (≤50 rows) don't notice; production workspaces accumulate
thousands over months and chat_history latency grows linearly.
Adds two partial btree indexes (workspace_id, source_id) WHERE NOT NULL
and (workspace_id, target_id) WHERE NOT NULL. Postgres BitmapOrs them
into a workspace-scoped BitmapAnd against the existing index, dropping
chat_history from O(workspace_rows) to O(peer_a2a_rows).
Partial WHERE NOT NULL because most activity rows (heartbeats,
agent_log, memory_write, etc.) carry NULL source_id/target_id and
shouldn't bloat the index.
Anti-pattern caveat (per the issue): a single compound (a, b) index
can't serve 'a OR b' — Postgres only uses compound for prefix match.
Two separate indexes + BitmapOr is the right shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-of-canvas Toolbar had multiple low-contrast surfaces in light theme:
Action buttons (Stop All, Restart Pending):
- bg-red-950/50 + bg-amber-950/40 → bg-bad/10 + bg-warm/10 with bg-bad/40
+ bg-warm/40 borders. Dark-tinted backgrounds with /40-/50 alpha render
as nearly invisible smudges on warm-paper; semantic tokens at /10 give
a clear pale-bad / pale-warm tint that scales correctly in dark mode.
- Both gain focus-visible:ring-2 focus-visible:ring-{bad,warm}/40.
Toggle button (A2A edges):
- Active state: bg-blue-950/50 → bg-accent/15 (themes correctly).
- Inactive state: bg-surface-card/50 + text-ink-soft → solid bg-surface-card
+ text-ink-mid; hover bumps to text-ink. Drops the redundant
"hover:bg-surface-card/50" identity hover.
Icon buttons (Audit, Search, Help):
- Same pattern as toggle inactive: solid bg-surface-card + text-ink-mid +
text-ink hover, with focus-visible:ring-2 focus-visible:ring-accent/40.
Workspace count + bullet separator:
- text-ink-soft (3.5:1 on warm-paper) → text-ink-mid (7:1).
WS connection status:
- "Live": text-ink-soft → text-ink-mid (paired with the green dot).
- "Reconnecting": text-ink-soft → text-warm (semantic match for amber dot).
- "Offline": text-ink-soft → text-bad (semantic match for red dot).
Status text now reinforces the dot colour instead of disappearing on
light surfaces.
Help popover:
- Close button: text-ink-soft → text-ink-mid + focus-visible:underline.
- HelpRow body text: text-ink-soft → text-ink-mid (was 3.5:1 on the
bg-surface-sunken/45 popover row — failed AA for body text).
Defense-in-depth follow-up to #2481 (peer_id trust-boundary gate).
Same XML-attribute injection vector applies to the four other meta
fields rendered as agent-context attrs in the <channel> tag:
<channel kind="..." method="..." activity_id="..." ts="..." source="molecule">
Each field is now passed through a closed-set / shape-validate gate:
- kind → frozenset {canvas_user, peer_agent} via _safe_meta_field
- method → frozenset {message/send, tasks/send, tasks/get, notify, ""}
- activity_id → UUID-shape regex via _safe_activity_id
- ts → ISO-8601 RFC3339 regex via _safe_ts
Any value outside the allowed shape is replaced with empty string.
Today the values come from a platform-DB column so they're trusted,
but "trust the source" was the same assumption that got peer_id into
trouble (#2481). Closed-enum allowlists make this row-content-blind.
5 new tests mirroring test_envelope_enrichment_strips_path_traversal_peer_id:
- test_envelope_strips_unknown_kind — kind injection stripped
- test_envelope_strips_unknown_method — method injection stripped
- test_envelope_strips_malformed_activity_id — non-UUID stripped
- test_envelope_strips_malformed_ts — non-ISO8601 stripped
- test_envelope_keeps_valid_meta_fields_unchanged — happy-path negative case
Mutation-tested: temporarily making _safe_meta_field permissive kills
both kind/method strip tests with the injection payload reflecting
into the meta dict, confirming the gate is what blocks them.
Two existing tests updated to use UUID-shaped activity_ids ("act-7",
"act-bridge-test" → real UUIDs) since the gate strips synthetic ids.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups from the multi-axis review of #2474:
1. **Docstring inversion** in tool_chat_history. The doc said
'(source_id=peer)' meant 'this workspace is the sender' — actually
it means the *peer* is the sender (source_id is where the activity
came FROM). Reframed to 'where the peer is either the sender or
the recipient' to match the underlying SQL semantics.
2. **Empty-history test**. TestChatHistory had 10 tests but no
200+[] happy-path pin. Added test_empty_history_returns_empty_json_list
asserting result == '[]' on exact-equality (per assert-exact
memory — substring '[]' would match envelope shapes too).
Both changes are pure docs+tests — no behaviour change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chat bubble fixes (canvas/src/components/tabs/ChatTab.tsx):
- User bubble: bg-accent-strong/30 + text-blue-100 → bg-accent + text-white
(translucent dark-blue overlay on warm-paper surface read as pale lavender
with near-invisible light-blue text — a real WCAG AA failure on the
highest-traffic surface in canvas).
- System/error bubble: bg-red-900/30 + text-red-200 → bg-bad/10 + text-bad,
using semantic tokens so dark-mode adapts automatically.
- Agent bubble: drop /80 + /30 opacity modifiers; solid bg-surface-card +
text-ink + border-line gives consistent contrast in both themes.
- prose-invert was unconditional, so markdown text on agent/system bubbles
rendered as light text on a light surface in light mode. Make it apply
only on the user bubble (the only dark surface in this component).
- Timestamp: text-ink-soft is too pale on warm-paper; use text-ink-mid for
agent/system, white/70 for user (visible on the now-solid blue bg).
Sub-tab bar fixes (canvas/src/components/SidePanel.tsx):
- Right-edge fade was hardcoded `from-zinc-950` — that paints a dark vertical
strip on the right edge of the tab bar in light mode. Switch to
`from-surface` so the gradient blends into whichever theme is active.
- Inactive tab text: text-ink-soft (~3.5:1 on warm-paper) → text-ink-mid
(~7:1). Active tab background: drop the /40 opacity so the selection is
unambiguous on either surface.
No semantic-token additions; all changes use existing warm-paper tokens
that already work in both themes.
The two missing branch tests called out by the multi-axis review of #2471.
a2a_client.enrich_peer_metadata handles two failure shapes (lines 105-112)
that the existing 12 envelope-enrichment tests don't exercise:
1. HTTP 200, response.json() raises (non-JSON body)
2. HTTP 200, valid JSON, but body is list/string/number not dict
Both paths land at the negative-cache write, but no test verified the
discriminator. Pin both with the same call_count == 1 assertion shape
the 5xx + network-exception tests already use.
Verified: temporarily removing the negative-cache write in either
branch makes the corresponding test fail with call_count == 2 — the
assertion correctly discriminates the contract from a fall-through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auto-promote ↔ auto-sync chain has been generating empty PRs
indefinitely since the staging merge_queue ruleset uses MERGE
strategy:
1. Auto-promote merges PR via queue → main = merge commit M2 not in staging
2. Auto-sync opens sync-back PR. Workflow's local `git merge --ff-only`
succeeds (PR title even says "ff to ..."), but the queue lands the
PR via MERGE → staging = merge commit S2 not in main
3. Auto-promote sees staging ahead by 1 → opens new promote PR. Tree
diff vs main = 0 (S2's tree == main's tree). But the gate logic
only checks "all required workflows green", not "actual code to
ship" → opens an empty promote PR
4. ... repeat indefinitely
Each round costs ~30-40 min wallclock, ~2 manual approvals (the queue
requires 1 review and the bot can't self-approve without admin
bypass), and one full CodeQL Go run (~15 min).
Observed today (2026-05-03) across PRs #2592 → #2594 → #2595 → #2596
→ #2597 — 5 PRs, ~3 hours, all empty content.
Fix: before opening the promote PR, check that staging's tree
actually differs from main's tree. If they're identical (the
empty-merge-commit cycle), skip cleanly and let the cycle terminate.
Implementation:
- New step `Skip if staging tree == main tree` runs before the
existing gate check.
- `git diff --quiet origin/main $HEAD_SHA` exits 0 iff trees match.
- On match: emits a step summary explaining the skip + sets
`skip=true`; subsequent gate-check + promote steps are gated on
`skip != 'true'` so they short-circuit.
- Fail-open: if `git fetch` errors, fall through to gate check
(preserve existing behavior). Only skip when diff is DEFINITIVELY
empty.
Long-term, the cleaner fix is to switch the merge_queue ruleset's
merge_method away from MERGE so FF-able PRs land cleanly without a
new commit — but that's a broader change affecting every staging
PR's commit shape. This guard is the surgical one-step break.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The retarget-main-to-staging workflow tries to PATCH base=staging on
every bot-authored PR opened against main. Auto-promote staging→main
PRs have head=staging, base=main — retargeting them sets head AND
base to staging, which GitHub rejects with HTTP 422 "no new commits
between base 'staging' and head 'staging'".
This started surfacing on PR #2588 (2026-05-03 14:30) once #2586
switched the auto-promote workflow to an App token. Before #2586
the auto-promote PR was authored by github-actions[bot], which the
retarget filter happened to skip; now it's molecule-ai[bot], which
passes the bot filter and triggers the broken retarget attempt.
Add a head-ref != 'staging' guard so auto-promote PRs short-circuit
before the PATCH. The existing 422 "duplicate base" detector is
left alone — it covers a different operational case.