User feedback: chat-bubble agent text still washed out after #2618 +
#2623. Looked at the actual rendered colors and the issue was Tailwind
Typography's `prose-invert` defaults — body text ships at zinc-300,
which lands at ~5.3:1 against bg-zinc-700. Passes AA but visibly
duller than the user bubble's crisp white-on-blue (~10:1).
Override the prose CSS variables on the agent bubble in dark mode:
- body → zinc-100 (was zinc-300)
- headings / bold → white
- inline code → zinc-100
That brings agent body text to ~13:1 against bg-zinc-700, matching the
user bubble's brightness so both sides of the conversation read at
the same crispness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered during code review of the #2623 hotfix audit. Same
regression class as #2618: prose-invert applied where the bubble bg
themes between light/dark, leaving markdown unreadable in one theme.
`MarkdownBody` was unconditionally `prose-invert` — fine for the
outgoing-message bubble (bg-cyan-900, dark in both themes) and the
failure bubble (bg-red-950, dark in both themes), but WRONG for the
incoming-message bubble (bg-surface-card, which themes LIGHT in light
mode). Result: light prose body text on light cream bg = invisible
markdown for incoming peer-to-peer messages in light mode.
Added an `invert: "always" | "dark-only"` prop to MarkdownBody. The
NormalMessage call sites switch on `msg.flow` so each bubble gets the
direction matching its bg's theming behavior. Failure bubble keeps
the default ("always") since red-950 stays dark.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regression from PR #2618 (chat dark-contrast).
PR #2618 switched the agent bubble bg to `dark:bg-zinc-700` so it
visibly elevates against the dark panel — but the inner ReactMarkdown
prose div only got `prose-invert` for USER messages. Result: in dark
mode the agent's markdown text rendered with the Tailwind Typography
plugin's default dark body color on top of the new dark bg = invisible
text. User reported empty-looking gray rectangles where agent replies
should be.
Fix: apply `dark:prose-invert` to agent bubbles so prose body text
flips light alongside the bg. Light mode unchanged (default prose
colors against the warm `bg-surface-card`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three issues on a high-stakes surface (revoke token, delete workspace,
cascade delete):
1. **Cancel hover was a no-op.** `bg-surface-card hover:bg-surface-card`
gave zero visual feedback on hover. Now hovers to surface-elevated
with a softened border so the button visibly lifts.
2. **Confirm hovers went LIGHTER, dropping white-text contrast.**
`bg-red-600 hover:bg-red-500` made the destructive button less
readable on hover. Same for warning (amber) and primary (accent).
Reversed to hover-darker so contrast holds in both themes.
3. **No focus-visible rings on either button.** Keyboard users had no
indication of focus position (WCAG 2.4.7 fail). Added
`focus-visible:ring-2 focus-visible:ring-accent/40` on Cancel and
`focus-visible:ring-2 focus-visible:ring-offset-2 ...accent/60` on
Confirm so the focused destructive action is unambiguous.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2571 fixed synth-E2E by branching MODEL_SLUG per runtime, but only
the langgraph branch was verified at runtime — hermes / claude-code /
override / fallback had zero automated coverage. A future regression
(e.g. dropping the langgraph case) would silently revert and only
surface as "Could not resolve authentication method" mid-E2E.
This PR:
- Extracts the dispatch into tests/e2e/lib/model_slug.sh as a sourceable
pick_model_slug() function. No behavior change.
- Adds tests/e2e/test_model_slug.sh — 9 assertions across all 5 dispatch
branches plus the override path. Verified to FAIL when any branch is
flipped (manually regressed langgraph slash-form to confirm the test
catches it; restored before commit).
- Wires the unit test into ci.yml's existing shellcheck job (only runs
when tests/e2e/ or scripts/ change). Pure-bash, no live infra.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User screenshot showed pale lavender user bubbles with hard-to-read white
text and a nearly-invisible agent bubble blending into the dark panel.
Root causes:
1. Tailwind v4 defaults `dark:` to `prefers-color-scheme: dark`. Our
ThemeProvider writes `data-theme="dark"` on <html> so user toggle wins
over OS — but `dark:` classes elsewhere in the codebase weren't
tracking it. Added `@custom-variant dark` to re-bind the variant.
2. `bg-accent` themes lighter in dark mode (--color-accent: #6883e8),
dropping white-text contrast to ~3:1 (fails WCAG AA). Switched user
bubble to solid blue-600/500 so it stays ~5:1 in both modes.
3. `bg-surface-card` (#1a1d23) was only ~7% lighter than the panel bg
(#0e1014), making agent bubbles disappear. Bumped to zinc-700 in
dark; light mode keeps the warm surface-card tint.
4. System (error) bubble's /10 overlay was nearly invisible; raised to
/25 in dark with stronger border + ink for readability.
Sub-tab + textarea polish included: low-contrast `text-ink-soft` →
`text-ink-mid`, focus-visible rings on tabs, dark variants on textarea.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chat_history query
WHERE workspace_id = $1
AND activity_type = 'a2a_receive'
AND (source_id = $2 OR target_id = $2)
ORDER BY created_at DESC
forces a workspace-scoped seq-scan-and-filter at every call —
idx_activity_ws_type_time covers workspace_id+type prefix but the
(source OR target) clause then walks every workspace row. Demo
workspaces (≤50 rows) don't notice; production workspaces accumulate
thousands over months and chat_history latency grows linearly.
Adds two partial btree indexes (workspace_id, source_id) WHERE NOT NULL
and (workspace_id, target_id) WHERE NOT NULL. Postgres BitmapOrs them
into a workspace-scoped BitmapAnd against the existing index, dropping
chat_history from O(workspace_rows) to O(peer_a2a_rows).
Partial WHERE NOT NULL because most activity rows (heartbeats,
agent_log, memory_write, etc.) carry NULL source_id/target_id and
shouldn't bloat the index.
Anti-pattern caveat (per the issue): a single compound (a, b) index
can't serve 'a OR b' — Postgres only uses compound for prefix match.
Two separate indexes + BitmapOr is the right shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-of-canvas Toolbar had multiple low-contrast surfaces in light theme:
Action buttons (Stop All, Restart Pending):
- bg-red-950/50 + bg-amber-950/40 → bg-bad/10 + bg-warm/10 with bg-bad/40
+ bg-warm/40 borders. Dark-tinted backgrounds with /40-/50 alpha render
as nearly invisible smudges on warm-paper; semantic tokens at /10 give
a clear pale-bad / pale-warm tint that scales correctly in dark mode.
- Both gain focus-visible:ring-2 focus-visible:ring-{bad,warm}/40.
Toggle button (A2A edges):
- Active state: bg-blue-950/50 → bg-accent/15 (themes correctly).
- Inactive state: bg-surface-card/50 + text-ink-soft → solid bg-surface-card
+ text-ink-mid; hover bumps to text-ink. Drops the redundant
"hover:bg-surface-card/50" identity hover.
Icon buttons (Audit, Search, Help):
- Same pattern as toggle inactive: solid bg-surface-card + text-ink-mid +
text-ink hover, with focus-visible:ring-2 focus-visible:ring-accent/40.
Workspace count + bullet separator:
- text-ink-soft (3.5:1 on warm-paper) → text-ink-mid (7:1).
WS connection status:
- "Live": text-ink-soft → text-ink-mid (paired with the green dot).
- "Reconnecting": text-ink-soft → text-warm (semantic match for amber dot).
- "Offline": text-ink-soft → text-bad (semantic match for red dot).
Status text now reinforces the dot colour instead of disappearing on
light surfaces.
Help popover:
- Close button: text-ink-soft → text-ink-mid + focus-visible:underline.
- HelpRow body text: text-ink-soft → text-ink-mid (was 3.5:1 on the
bg-surface-sunken/45 popover row — failed AA for body text).
Defense-in-depth follow-up to #2481 (peer_id trust-boundary gate).
Same XML-attribute injection vector applies to the four other meta
fields rendered as agent-context attrs in the <channel> tag:
<channel kind="..." method="..." activity_id="..." ts="..." source="molecule">
Each field is now passed through a closed-set / shape-validate gate:
- kind → frozenset {canvas_user, peer_agent} via _safe_meta_field
- method → frozenset {message/send, tasks/send, tasks/get, notify, ""}
- activity_id → UUID-shape regex via _safe_activity_id
- ts → ISO-8601 RFC3339 regex via _safe_ts
Any value outside the allowed shape is replaced with empty string.
Today the values come from a platform-DB column so they're trusted,
but "trust the source" was the same assumption that got peer_id into
trouble (#2481). Closed-enum allowlists make this row-content-blind.
5 new tests mirroring test_envelope_enrichment_strips_path_traversal_peer_id:
- test_envelope_strips_unknown_kind — kind injection stripped
- test_envelope_strips_unknown_method — method injection stripped
- test_envelope_strips_malformed_activity_id — non-UUID stripped
- test_envelope_strips_malformed_ts — non-ISO8601 stripped
- test_envelope_keeps_valid_meta_fields_unchanged — happy-path negative case
Mutation-tested: temporarily making _safe_meta_field permissive kills
both kind/method strip tests with the injection payload reflecting
into the meta dict, confirming the gate is what blocks them.
Two existing tests updated to use UUID-shaped activity_ids ("act-7",
"act-bridge-test" → real UUIDs) since the gate strips synthetic ids.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups from the multi-axis review of #2474:
1. **Docstring inversion** in tool_chat_history. The doc said
'(source_id=peer)' meant 'this workspace is the sender' — actually
it means the *peer* is the sender (source_id is where the activity
came FROM). Reframed to 'where the peer is either the sender or
the recipient' to match the underlying SQL semantics.
2. **Empty-history test**. TestChatHistory had 10 tests but no
200+[] happy-path pin. Added test_empty_history_returns_empty_json_list
asserting result == '[]' on exact-equality (per assert-exact
memory — substring '[]' would match envelope shapes too).
Both changes are pure docs+tests — no behaviour change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chat bubble fixes (canvas/src/components/tabs/ChatTab.tsx):
- User bubble: bg-accent-strong/30 + text-blue-100 → bg-accent + text-white
(translucent dark-blue overlay on warm-paper surface read as pale lavender
with near-invisible light-blue text — a real WCAG AA failure on the
highest-traffic surface in canvas).
- System/error bubble: bg-red-900/30 + text-red-200 → bg-bad/10 + text-bad,
using semantic tokens so dark-mode adapts automatically.
- Agent bubble: drop /80 + /30 opacity modifiers; solid bg-surface-card +
text-ink + border-line gives consistent contrast in both themes.
- prose-invert was unconditional, so markdown text on agent/system bubbles
rendered as light text on a light surface in light mode. Make it apply
only on the user bubble (the only dark surface in this component).
- Timestamp: text-ink-soft is too pale on warm-paper; use text-ink-mid for
agent/system, white/70 for user (visible on the now-solid blue bg).
Sub-tab bar fixes (canvas/src/components/SidePanel.tsx):
- Right-edge fade was hardcoded `from-zinc-950` — that paints a dark vertical
strip on the right edge of the tab bar in light mode. Switch to
`from-surface` so the gradient blends into whichever theme is active.
- Inactive tab text: text-ink-soft (~3.5:1 on warm-paper) → text-ink-mid
(~7:1). Active tab background: drop the /40 opacity so the selection is
unambiguous on either surface.
No semantic-token additions; all changes use existing warm-paper tokens
that already work in both themes.
The two missing branch tests called out by the multi-axis review of #2471.
a2a_client.enrich_peer_metadata handles two failure shapes (lines 105-112)
that the existing 12 envelope-enrichment tests don't exercise:
1. HTTP 200, response.json() raises (non-JSON body)
2. HTTP 200, valid JSON, but body is list/string/number not dict
Both paths land at the negative-cache write, but no test verified the
discriminator. Pin both with the same call_count == 1 assertion shape
the 5xx + network-exception tests already use.
Verified: temporarily removing the negative-cache write in either
branch makes the corresponding test fail with call_count == 2 — the
assertion correctly discriminates the contract from a fall-through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auto-promote ↔ auto-sync chain has been generating empty PRs
indefinitely since the staging merge_queue ruleset uses MERGE
strategy:
1. Auto-promote merges PR via queue → main = merge commit M2 not in staging
2. Auto-sync opens sync-back PR. Workflow's local `git merge --ff-only`
succeeds (PR title even says "ff to ..."), but the queue lands the
PR via MERGE → staging = merge commit S2 not in main
3. Auto-promote sees staging ahead by 1 → opens new promote PR. Tree
diff vs main = 0 (S2's tree == main's tree). But the gate logic
only checks "all required workflows green", not "actual code to
ship" → opens an empty promote PR
4. ... repeat indefinitely
Each round costs ~30-40 min wallclock, ~2 manual approvals (the queue
requires 1 review and the bot can't self-approve without admin
bypass), and one full CodeQL Go run (~15 min).
Observed today (2026-05-03) across PRs #2592 → #2594 → #2595 → #2596
→ #2597 — 5 PRs, ~3 hours, all empty content.
Fix: before opening the promote PR, check that staging's tree
actually differs from main's tree. If they're identical (the
empty-merge-commit cycle), skip cleanly and let the cycle terminate.
Implementation:
- New step `Skip if staging tree == main tree` runs before the
existing gate check.
- `git diff --quiet origin/main $HEAD_SHA` exits 0 iff trees match.
- On match: emits a step summary explaining the skip + sets
`skip=true`; subsequent gate-check + promote steps are gated on
`skip != 'true'` so they short-circuit.
- Fail-open: if `git fetch` errors, fall through to gate check
(preserve existing behavior). Only skip when diff is DEFINITIVELY
empty.
Long-term, the cleaner fix is to switch the merge_queue ruleset's
merge_method away from MERGE so FF-able PRs land cleanly without a
new commit — but that's a broader change affecting every staging
PR's commit shape. This guard is the surgical one-step break.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>