feat(canvas/chat): inline image preview + fullscreen lightbox (RFC #2991 PR-1)

First specialized renderer landing under RFC #2991 — chat attachment
preview. Adds the dispatch infrastructure that PR-2 (video/audio) and
PR-3 (PDF/text) will extend.

Architecture (RFC #2991 Phase 2 design)
---------------------------------------

- preview-kind.ts: pure helper that maps mimeType (+ extension fallback
  for missing/generic MIME) to one of: image | video | audio | pdf |
  text | file. Single source of truth; the dispatch axis for every
  attachment renderer.

- AttachmentPreview.tsx: SSOT dispatch component. ChatTab no longer
  imports kind-specific components — it imports AttachmentPreview,
  which switches on the kind and renders the right child.

- AttachmentImage.tsx: inline thumbnail (max 240×180) + click →
  lightbox. Auth-aware: for platform URIs (workspace: /
  platform-pending: / etc) the bytes are fetched with JS-injected
  headers, wrapped in a Blob, and served as an ObjectURL. A bare
  <img src> cannot attach the bearer token or the org-slug header.

- AttachmentLightbox.tsx: shared fullscreen modal (image now; PDF will
  use it in PR-3). Esc / backdrop click / X button to close, initial
  focus on the close button, focus restoration on close. Tab is not
  trapped inside the dialog yet.

- AttachmentChip retained as the kind=file fallback. No breaking
  change for existing renderable shapes.

External-workspace coverage
---------------------------

The wire shape (ChatAttachment.mimeType + uri) is identical for
internal + external workspaces — both go through AgentMessageWriter
(PR #2949). External claude-code agents that attach images via
send_message_to_user automatically get the new preview surface; no
runtime-side change needed.

Failure modes
-------------

- Fetch failure (404, 403, network) → AttachmentChip fallback so the
  user still gets a working download. Pinned by the 404 and
  network-error component tests.
- Decoded as non-image (corrupt bytes, wrong Content-Type) → onError
  on the <img> swaps to AttachmentChip. No component test in this PR
  fires onError yet.
- Non-platform URIs (http/https external image hosts) → skip the
  auth-fetch flow, use the raw URL via resolveAttachmentHref. Only
  the kind mapping for an https URL is pinned (extension-fallback
  cases in preview-kind.test.ts); the component path is untested.

Tests
-----

preview-kind.test.ts (49 cases):
  - Strict MIME match across image/video/audio/pdf/text/unknown
  - Extension fallback when MIME is missing or application/octet-stream
  - URL with query string + fragment → strip before parsing
  - MIME wins over extension (regression: don't render image-named zip)
  - SVG is image (not text) despite being XML
  - Non-canonical MIME like application/javascript → text

AttachmentPreview.test.tsx (9 component tests):
  - Dispatch: kind=file → chip, kind=image → image path
  - Loading state shows placeholder, NOT chip (proves dispatch routed)
  - Extension fallback (no mimeType) routes to image path
  - Fetch fail (404) and network error → fall back to chip
  - Image success: <img> renders ObjectURL, click opens lightbox
  - Lightbox: Esc closes, backdrop click closes, content click doesn't
  - Universal fallback: unknown MIME → chip even when extension hints
    at a renderable kind

Hostile self-review (3 weakest spots, addressed)
------------------------------------------------

1. <img> auth: bare <img src="/chat/download?..."> would NOT include
   our auth headers. Resolved via fetch+Blob+ObjectURL pattern.
   Pinned by the image-success test (asserts src === "blob:test-url").

2. Server-side allowed-roots mismatch: pre-fix tests used /tmp/ paths
   which the server doesn't allow. Caught when the dispatch test
   fell into the non-platform path. Updated tests to use /workspace/
   subpaths matching templates.go's allowedRoots.

3. Bundle size creep: each kind component adds bytes. Lightbox is
   currently always-bundled. Lazy-loading is plausible but deferred
   until measurements show it is needed.

Verified
- tsc --noEmit clean
- 168 chat tests green (49 unit + 9 component + 110 pre-existing)

PR-2 (video + audio) and PR-3 (PDF + text) extend the dispatch in
AttachmentPreview.tsx with their own kind-specific components.

Refs RFC #2991.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hongming Wang 2026-05-05 19:39:37 -07:00
parent 86b8d8d744
commit 04f7a07add
7 changed files with 811 additions and 2 deletions


@ -8,7 +8,8 @@ import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { useSocketEvent } from "@/hooks/useSocketEvent";
import { type ChatMessage, type ChatAttachment, createMessage, appendMessageDeduped } from "./chat/types";
import { uploadChatFiles, downloadChatFile, isPlatformAttachment } from "./chat/uploads";
-import { AttachmentChip, PendingAttachmentPill } from "./chat/AttachmentViews";
+import { PendingAttachmentPill } from "./chat/AttachmentViews";
+import { AttachmentPreview } from "./chat/AttachmentPreview";
import { extractFilesFromTask } from "./chat/message-parser";
import { AgentCommsPanel } from "./chat/AgentCommsPanel";
import { appendActivityLine } from "./chat/activityLog";
@ -1137,8 +1138,9 @@ function MyChatPanel({ workspaceId, data }: Props) {
{msg.attachments && msg.attachments.length > 0 && (
<div className={`flex flex-wrap gap-1 ${msg.content ? "mt-1.5" : ""}`}>
{msg.attachments.map((att, i) => (
-<AttachmentChip
+<AttachmentPreview
key={`${msg.id}-${i}`}
+workspaceId={workspaceId}
attachment={att}
onDownload={downloadAttachment}
tone={msg.role === "user" ? "user" : "agent"}


@ -0,0 +1,198 @@
"use client";
// AttachmentImage — inline image thumbnail + click-to-fullscreen.
// First "specialized renderer" landing under RFC #2991 PR-1.
//
// Auth model
// ----------
//
// The critical UX/security trade-off (per RFC's hostile-self-review
// item #2): the bytes live behind workspace auth. A bare
// <img src="https://reno-stars.../chat/download?path=…"> cannot
// carry the full auth chain. The browser attaches cookies on its
// own at most; the bearer token and the X-Molecule-Org-Slug header
// are JS-injected, even for a same-origin canvas-server.
//
// Solution: same auth path the chip download uses. Fetch the bytes
// with the JS auth headers, wrap in a Blob, hand the browser an
// ObjectURL. The image renders from local memory; no second request,
// no auth leakage, no CORS pain.
//
// That same blob URL is what the lightbox shows on click — single
// fetch, cached for the lifetime of the message bubble.
//
// Failure modes
// -------------
//
// - Fetch fails (404, 403, network) → fall back to AttachmentChip
// (the existing file-pill download flow). The user still gets a
// working download; we just lose the inline preview.
// - Decoded as non-image (server returned wrong Content-Type, or
// bytes are corrupt) → onError handler swaps to AttachmentChip.
// - Bytes too large — no enforcement here; the server caps at 25MB
// per file (chat_files.go), which is too big for a thumbnail but
// acceptable for a chat-attached image. If we hit pain we can
// downscale via canvas, but defer that to v2.
import { useState, useEffect, useRef } from "react";
import type { ChatAttachment } from "./types";
import { isPlatformAttachment, resolveAttachmentHref } from "./uploads";
import { AttachmentLightbox } from "./AttachmentLightbox";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
onDownload: (a: ChatAttachment) => void;
tone: "user" | "agent";
}
type FetchState =
| { kind: "idle" }
| { kind: "loading" }
| { kind: "ready"; blobUrl: string }
| { kind: "error" };
export function AttachmentImage({ workspaceId, attachment, onDownload, tone }: Props) {
const [state, setState] = useState<FetchState>({ kind: "idle" });
const [open, setOpen] = useState(false);
// Track whether we created the ObjectURL so cleanup runs on the
// exact value we minted (state could change between effect setup
// and effect cleanup if a new fetch fires).
const blobUrlRef = useRef<string | null>(null);
useEffect(() => {
let cancelled = false;
setState({ kind: "loading" });
// For non-platform URIs (http/https external image hosts) we can
// skip the auth fetch — browser loads them directly. We bail out
// of the auth-fetch flow and use the raw URL via resolveAttachmentHref.
if (!isPlatformAttachment(attachment.uri)) {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
if (!cancelled) setState({ kind: "ready", blobUrl: href });
return;
}
// Platform-auth path: sends the same headers as downloadChatFile,
// but keeps the blob instead of triggering a Save-As, so it needs
// its own fetch rather than a call through downloadChatFile.
void (async () => {
try {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const headers: Record<string, string> = {};
// Read the same env var downloadChatFile reads — single source
// of truth would be cleaner; refactor opportunity for PR-2 if
// we add the same path to AttachmentVideo.
const adminToken = process.env.NEXT_PUBLIC_ADMIN_TOKEN;
if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
const slug = getTenantSlug();
if (slug) headers["X-Molecule-Org-Slug"] = slug;
const res = await fetch(href, {
headers,
credentials: "include",
signal: AbortSignal.timeout(30_000),
});
if (!res.ok) {
if (!cancelled) setState({ kind: "error" });
return;
}
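const blob = await res.blob();
// Bail before minting an ObjectURL if this effect run was
// cancelled, so a stale fetch never overwrites blobUrlRef and
// strands the URL minted by a newer run.
if (cancelled) return;
const url = URL.createObjectURL(blob);
blobUrlRef.current = url;
setState({ kind: "ready", blobUrl: url });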
} catch {
if (!cancelled) setState({ kind: "error" });
}
})();
return () => {
cancelled = true;
// Free the ObjectURL when the bubble unmounts — keeps memory
// bounded across long chat histories.
if (blobUrlRef.current) {
URL.revokeObjectURL(blobUrlRef.current);
blobUrlRef.current = null;
}
};
}, [workspaceId, attachment.uri]);
// Failure → render the existing file chip. Maintains the download
// affordance even if preview fails; the user never gets stuck.
if (state.kind === "error") {
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
// Loading → placeholder box sized to the thumbnail cap (240x180)
// so the bubble reflows as little as possible when the image
// lands. Images smaller than the cap will still shrink the box.
if (state.kind === "loading" || state.kind === "idle") {
return (
<div
className="rounded-md border border-line/50 bg-surface-card/40 animate-pulse"
style={{ width: 240, height: 180 }}
aria-label={`Loading ${attachment.name}`}
/>
);
}
// Ready → inline thumbnail with click handler. The img has its
// own onError so a corrupt blob (server returned the right size
// but invalid bytes) falls through to the chip too.
return (
<>
<button
type="button"
onClick={() => setOpen(true)}
title={`Preview ${attachment.name}`}
className={`group relative inline-block max-w-full rounded-lg overflow-hidden border focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/60 ${
tone === "user" ? "border-blue-400/30" : "border-line/50"
}`}
aria-label={`Open ${attachment.name} preview`}
>
<img
src={state.blobUrl}
alt={attachment.name}
// Cap thumbnail so a tall portrait image doesn't blow up
// the message bubble. The lightbox shows the full size.
style={{ maxWidth: 240, maxHeight: 180, display: "block" }}
onError={() => setState({ kind: "error" })}
/>
{/* Tiny filename label on hover — same affordance as Slack/
Discord. Helps when several images land in one bubble. */}
<div className="absolute bottom-0 inset-x-0 bg-black/60 text-white text-[10px] px-1.5 py-0.5 truncate opacity-0 group-hover:opacity-100 transition-opacity">
{attachment.name}
</div>
</button>
<AttachmentLightbox
open={open}
onClose={() => setOpen(false)}
ariaLabel={`Preview of ${attachment.name}`}
>
<img
src={state.blobUrl}
alt={attachment.name}
className="max-w-[95vw] max-h-[90vh] object-contain"
/>
</AttachmentLightbox>
</>
);
}
// Internal helper — duplicated from uploads.ts (it's not exported
// there). Kept local so this component doesn't reach into private
// surface; if AttachmentVideo / AttachmentPDF in PR-2/PR-3 also need
// it, lift to an exported helper at that point (the third-caller
// rule).
function getTenantSlug(): string | null {
if (typeof window === "undefined") return null;
const host = window.location.hostname;
// Tenant subdomain shape: <slug>.moleculesai.app
const m = host.match(/^([^.]+)\.moleculesai\.app$/);
return m ? m[1] : null;
}
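
The comments above flag two refactor candidates: the header-building logic mirrors what downloadChatFile does, and getTenantSlug is duplicated from uploads.ts. A minimal sketch of a shared, pure helper pair follows. The names `tenantSlugFromHost` and `buildAuthHeaders` are hypothetical and not part of this PR; the header names and the hostname shape are taken from the code above.

```typescript
// Hypothetical shared helpers; not part of this PR.

/** Extracts the tenant slug from a hostname shaped like <slug>.moleculesai.app. */
function tenantSlugFromHost(hostname: string): string | null {
  const m = hostname.match(/^([^.]+)\.moleculesai\.app$/);
  return m?.[1] ?? null;
}

/** Builds the JS-injected auth headers, omitting any header whose value is missing. */
function buildAuthHeaders(
  adminToken: string | undefined,
  slug: string | null,
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (adminToken) headers["Authorization"] = `Bearer ${adminToken}`;
  if (slug) headers["X-Molecule-Org-Slug"] = slug;
  return headers;
}
```

Taking the hostname and token as parameters, rather than reading `window` and `process.env` inside the helper, would let both the image path and the download path be unit-tested without a DOM.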


@ -0,0 +1,122 @@
"use client";
// AttachmentLightbox — shared fullscreen modal for image / PDF /
// (future) any-fullscreen-renderable kind. Owns:
// - Backdrop + centered viewport
// - Esc to close
// - Click-outside to close
// - Focus management (focus enters the modal on open, restored on close)
// - prefers-reduced-motion respect (no animation)
//
// Per RFC #2991 Phase 2: this is the third-caller justification for
// the abstraction (image, PDF, future video-fullscreen all want the
// same modal contract). Not invented for a single caller.
//
// Design choices:
//
// 1. Portals — we don't use ReactDOM.createPortal because the canvas
// chat surface already renders at a high z-index and the modal's
// fixed-position layout reaches the viewport regardless. Saves a
// portal mount in the common case + avoids the SSR warning (canvas
// is "use client" but the parent shell is server-rendered).
//
// 2. Focus management: inline implementation (not a 3rd-party dep).
// Only initial focus and restoration are implemented here; Tab is
// not intercepted, so focus can still leave the dialog. The
// lightbox has just two interactive elements (close button +
// content), so a small manual trap can be added later without
// pulling in focus-trap-react for ~12KB.
//
// 3. Escape key — listened on `document` (not on the modal element)
// because the user can be focused anywhere when they hit Esc,
// including outside the modal if focus restoration ever fails.
// The cleanup runs on unmount so leaked listeners don't persist.
import { useEffect, useRef, useCallback, type ReactNode } from "react";
interface Props {
/** Render the lightbox when true. Caller controls open state. */
open: boolean;
/** Caller's handler for "close" — Esc, click-outside, X button. */
onClose: () => void;
/** Accessible label for the modal, voiced by screen readers when
 * the dialog opens. The caller knows what's inside (image alt
 * text, PDF filename) and supplies it. */
ariaLabel: string;
/** The thing being shown in fullscreen: <img>, <embed>, etc.
 * Caller is responsible for sizing it to fit the viewport (the
 * content wrapper caps it at max-w-[95vw] max-h-[90vh]). */
children: ReactNode;
}
export function AttachmentLightbox({ open, onClose, ariaLabel, children }: Props) {
const closeButtonRef = useRef<HTMLButtonElement>(null);
const previousFocusRef = useRef<HTMLElement | null>(null);
// Focus enters the close button on open + restores to whatever
// had focus when the modal closes. Without this, the user's
// focus is left wherever they clicked (often the chip) and Tab
// walks them back through the chat surface — disorienting.
useEffect(() => {
if (!open) return;
previousFocusRef.current = document.activeElement as HTMLElement | null;
closeButtonRef.current?.focus();
return () => {
previousFocusRef.current?.focus?.();
};
}, [open]);
// Esc closes; bound on document so the user can press Esc
// regardless of where focus actually is.
useEffect(() => {
if (!open) return;
const onKey = (e: KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault();
onClose();
}
};
document.addEventListener("keydown", onKey);
return () => document.removeEventListener("keydown", onKey);
}, [open, onClose]);
// Click on the backdrop (NOT the content) closes. Content's own
// onClick stops propagation so the user can interact (e.g. native
// PDF viewer controls) without dismissing the modal.
const onBackdropClick = useCallback(
(e: React.MouseEvent) => {
if (e.target === e.currentTarget) onClose();
},
[onClose],
);
if (!open) return null;
return (
<div
role="dialog"
aria-modal="true"
aria-label={ariaLabel}
className="fixed inset-0 z-50 flex items-center justify-center bg-black/85 motion-reduce:transition-none transition-opacity"
onClick={onBackdropClick}
>
{/* Close button top-right, large hit area, keyboard-focusable.
Its aria-label includes "Close" so SR users hear what action it
performs, not just the X glyph. */}
<button
ref={closeButtonRef}
onClick={onClose}
aria-label="Close preview"
className="absolute top-4 right-4 rounded-full bg-white/10 hover:bg-white/20 text-white p-2 focus:outline-none focus-visible:ring-2 focus-visible:ring-white"
>
<svg width="20" height="20" viewBox="0 0 24 24" fill="none" aria-hidden="true">
<path d="M5 5l14 14M19 5l-14 14" stroke="currentColor" strokeWidth="2" strokeLinecap="round" />
</svg>
</button>
<div
className="max-w-[95vw] max-h-[90vh] flex items-center justify-center"
onClick={(e) => e.stopPropagation()}
>
{children}
</div>
</div>
);
}
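
The component above moves focus to the close button on open and restores it on close, but it does not intercept Tab. If a full trap is wanted later, the wrap-around step could be sketched as below. `nextFocusIndex` is a hypothetical helper, not part of this PR, and how the list of focusable elements is collected is left as an assumption.

```typescript
// Hypothetical helper; the lightbox in this PR does not intercept Tab.

/**
 * Returns the index of the element that should receive focus next when
 * Tab (or Shift+Tab) is pressed inside a dialog with `count` focusable
 * elements. Wraps at both ends so focus never leaves the dialog.
 */
function nextFocusIndex(current: number, count: number, shiftKey: boolean): number {
  // Nothing focusable: signal "no target" to the caller.
  if (count <= 0) return -1;
  // Focus was outside the dialog: enter at the first or last element.
  if (current < 0 || current >= count) return shiftKey ? count - 1 : 0;
  const step = shiftKey ? -1 : 1;
  return (current + step + count) % count;
}
```

A keydown handler would call this when `e.key === "Tab"`, call `e.preventDefault()`, and focus the element at the returned index.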


@ -0,0 +1,56 @@
"use client";
// AttachmentPreview — the SSOT dispatch point for chat-attachment
// rendering (RFC #2991, PR-1).
//
// Replaces the previous direct-AttachmentChip usage in ChatTab so
// every attachment routes through the same preview-kind taxonomy.
// Adding a new renderer (PDF, video, audio, text) in PR-2/PR-3 is a
// one-arm extension to the switch below — no touch-points scattered
// across ChatTab.tsx, AgentCommsPanel.tsx, or other chat consumers.
//
// Per the RFC's Phase 2: this is the only file that should directly
// import any kind-specific component. ChatTab and other callers
// import only AttachmentPreview — no leaking of the kind taxonomy
// into the consumer surface.
import type { ChatAttachment } from "./types";
import { getAttachmentPreviewKind } from "./preview-kind";
import { AttachmentImage } from "./AttachmentImage";
import { AttachmentChip } from "./AttachmentViews";
interface Props {
workspaceId: string;
attachment: ChatAttachment;
/** Caller's download handler, used for the kind=file fallback
 * and as the kind-specific renderers' fallback when their own
 * preview fails (e.g. image fetch errored). */
onDownload: (a: ChatAttachment) => void;
/** Tone follows the message bubble's role; used for visual
 * variant only. */
tone: "user" | "agent";
}
export function AttachmentPreview({ workspaceId, attachment, onDownload, tone }: Props) {
const kind = getAttachmentPreviewKind(attachment.mimeType, attachment.uri, attachment.name);
switch (kind) {
case "image":
return (
<AttachmentImage
workspaceId={workspaceId}
attachment={attachment}
onDownload={onDownload}
tone={tone}
/>
);
// PR-2 will add cases for video / audio.
// PR-3 will add cases for pdf / text.
case "video":
case "audio":
case "pdf":
case "text":
case "file":
default:
return <AttachmentChip attachment={attachment} onDownload={onDownload} tone={tone} />;
}
}
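
The switch above ends in a `default` arm, so a kind added to the union later would silently route to the chip. One possible alternative is an exhaustiveness check that makes tsc flag an unhandled kind. The sketch below uses a hypothetical `rendererNameFor` that returns component names as strings so it runs without React; it is an illustration of the pattern, not the PR's code.

```typescript
type AttachmentPreviewKind = "image" | "video" | "audio" | "pdf" | "text" | "file";

// Hypothetical: names the component that handles each kind as of PR-1.
function rendererNameFor(kind: AttachmentPreviewKind): string {
  switch (kind) {
    case "image":
      return "AttachmentImage";
    case "video":
    case "audio":
    case "pdf":
    case "text":
    case "file":
      return "AttachmentChip";
    default: {
      // If a member is added to AttachmentPreviewKind without a case
      // above, `kind` is no longer `never` here and tsc reports an error.
      const unhandled: never = kind;
      throw new Error(`Unhandled preview kind: ${String(unhandled)}`);
    }
  }
}
```

The trade-off is that every new kind must be listed explicitly, which is arguably the point for a single dispatch file.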


@ -0,0 +1,165 @@
// @vitest-environment jsdom
//
// AttachmentPreview component tests — pin the dispatch contract:
// each kind goes to its dedicated renderer; kind=file falls back to
// the chip; failure modes don't strand the user without a download.
//
// Per RFC #2991 Phase 4: every test must be able to fail. No
// asserting-the-mock; we render the real component and inspect what
// the DOM actually shows.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, fireEvent, cleanup, waitFor, act } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
// Stub the auth-token env var so AttachmentImage builds its
// Authorization header. The network itself is kept out of the tests
// by the fetch mock below.
vi.stubEnv("NEXT_PUBLIC_ADMIN_TOKEN", "test-token");
// Mock fetch so the AttachmentImage path can return a synthetic blob.
// Tests override per-case to simulate success / 404 / network fail.
const fetchMock = vi.fn();
beforeEach(() => {
fetchMock.mockReset();
vi.stubGlobal("fetch", fetchMock);
// jsdom doesn't implement URL.createObjectURL — stub.
global.URL.createObjectURL = vi.fn(() => "blob:test-url");
global.URL.revokeObjectURL = vi.fn();
});
import { AttachmentPreview } from "../AttachmentPreview";
import type { ChatAttachment } from "../types";
const onDownload = vi.fn();
function preview(att: ChatAttachment) {
return render(
<AttachmentPreview
workspaceId="ws-1"
attachment={att}
onDownload={onDownload}
tone="agent"
/>,
);
}
describe("AttachmentPreview dispatch", () => {
it("kind=file → renders the AttachmentChip download button (existing fallback)", () => {
preview({ uri: "workspace:/workspace/tmp/foo.zip", name: "foo.zip", mimeType: "application/zip" });
// The chip's button title is `Download <name>`. Pre-fix this was
// the only render path; now it's the kind=file fallback.
expect(screen.getByTitle(/Download foo\.zip/i)).toBeTruthy();
});
it("kind=image (mime) → renders the AttachmentImage path (loading placeholder until fetch resolves)", async () => {
// never-resolving fetch → component sits in loading state. Pin
// the loading placeholder shape.
fetchMock.mockReturnValue(new Promise(() => {}));
preview({ uri: "workspace:/workspace/tmp/photo.png", name: "photo.png", mimeType: "image/png" });
expect(await screen.findByLabelText(/Loading photo\.png/i)).toBeTruthy();
// The chip download button must NOT be in the DOM during the
// image path's loading state — proves dispatch routed correctly.
expect(screen.queryByTitle(/Download photo\.png/i)).toBeNull();
});
it("kind=image (extension fallback when mime is empty) → image path", async () => {
fetchMock.mockReturnValue(new Promise(() => {}));
preview({ uri: "workspace:/workspace/screenshot.jpg", name: "screenshot.jpg" /* no mime */ });
expect(await screen.findByLabelText(/Loading screenshot\.jpg/i)).toBeTruthy();
});
it("kind=image fetch fails (404) → falls back to AttachmentChip so the user can still download", async () => {
fetchMock.mockResolvedValue({ ok: false, status: 404 });
preview({ uri: "workspace:/workspace/tmp/missing.png", name: "missing.png", mimeType: "image/png" });
// The fallback chip shows up on error.
await waitFor(() => {
expect(screen.getByTitle(/Download missing\.png/i)).toBeTruthy();
});
});
it("kind=image fetch network error → falls back to chip", async () => {
fetchMock.mockRejectedValue(new Error("network down"));
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "image/png" });
await waitFor(() => {
expect(screen.getByTitle(/Download x\.png/i)).toBeTruthy();
});
});
it("kind=image success → renders <img> + clicking opens the lightbox", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["fake-png-bytes"], { type: "image/png" }),
});
preview({ uri: "workspace:/workspace/tmp/ok.png", name: "ok.png", mimeType: "image/png" });
// Image element shows up after the fetch resolves.
const img = await screen.findByAltText(/ok\.png/);
expect(img).toBeTruthy();
expect((img as HTMLImageElement).src).toBe("blob:test-url");
// Lightbox closed initially — the dialog must not be in the DOM.
expect(screen.queryByRole("dialog")).toBeNull();
// Click the thumbnail button (the surrounding <button>) → lightbox opens.
const button = screen.getByLabelText(/Open ok\.png preview/i);
fireEvent.click(button);
expect(await screen.findByRole("dialog")).toBeTruthy();
expect(screen.getByLabelText(/Close preview/i)).toBeTruthy();
});
it("kind=image lightbox closes on Esc keypress", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["b"], { type: "image/png" }),
});
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "image/png" });
await screen.findByAltText(/x\.png/);
fireEvent.click(screen.getByLabelText(/Open x\.png preview/i));
expect(await screen.findByRole("dialog")).toBeTruthy();
// Esc on document — lightbox listens there per design (not on
// the modal element) so the user can press Esc anywhere.
act(() => {
const event = new KeyboardEvent("keydown", { key: "Escape", bubbles: true });
document.dispatchEvent(event);
});
await waitFor(() => {
expect(screen.queryByRole("dialog")).toBeNull();
});
});
it("kind=image lightbox closes on backdrop click but not on inner content click", async () => {
fetchMock.mockResolvedValue({
ok: true,
blob: async () => new Blob(["b"], { type: "image/png" }),
});
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "image/png" });
await screen.findByAltText(/x\.png/);
fireEvent.click(screen.getByLabelText(/Open x\.png preview/i));
const dialog = await screen.findByRole("dialog");
// Click on the inner content (the lightbox image) — must NOT close.
const lightboxImg = dialog.querySelector("img");
if (!lightboxImg) throw new Error("lightbox img missing");
fireEvent.click(lightboxImg);
expect(screen.queryByRole("dialog")).toBeTruthy();
// Click on the backdrop (the dialog itself) — closes.
fireEvent.click(dialog);
await waitFor(() => {
expect(screen.queryByRole("dialog")).toBeNull();
});
});
it("kind=file is the universal fallback for unknown MIME (regression: don't preview a misnamed file)", () => {
// Critical safety: agent could attach a misnamed file. Pre-fix
// the chip path was unconditional; we want a specific but unknown
// MIME to STILL go to the chip even though the extension matches
// an image kind.
preview({ uri: "workspace:/workspace/tmp/x.png", name: "x.png", mimeType: "application/vnd.zip-disguised-as-doc" });
expect(screen.getByTitle(/Download x\.png/i)).toBeTruthy();
});
});


@ -0,0 +1,112 @@
// preview-kind unit tests — exhaustive table of MIME / extension
// combinations. The kind helper is a pure function; this is the
// regression line for "what renders as what" across the entire chat
// surface.
import { describe, it, expect } from "vitest";
import { getAttachmentPreviewKind } from "../preview-kind";
describe("getAttachmentPreviewKind", () => {
describe("strict MIME match", () => {
const cases: Array<[string, ReturnType<typeof getAttachmentPreviewKind>]> = [
// images
["image/png", "image"],
["image/jpeg", "image"],
["image/gif", "image"],
["image/webp", "image"],
["image/svg+xml", "image"],
["image/avif", "image"],
["IMAGE/PNG", "image"], // case-insensitive
[" image/png ", "image"], // trim
// video
["video/mp4", "video"],
["video/webm", "video"],
["video/quicktime", "video"],
// audio
["audio/mpeg", "audio"],
["audio/wav", "audio"],
["audio/ogg", "audio"],
// pdf
["application/pdf", "pdf"],
// text family
["text/plain", "text"],
["text/markdown", "text"],
["text/html", "text"],
["text/css", "text"],
["text/javascript", "text"],
["text/csv", "text"],
["application/json", "text"],
["application/yaml", "text"],
["application/x-yaml", "text"],
["application/javascript", "text"],
["application/typescript", "text"],
// unknown / non-renderable → file
["application/zip", "file"],
["application/octet-stream", "file"],
["application/x-tar", "file"],
["application/vnd.ms-excel", "file"],
["weird/unknown-thing", "file"],
];
for (const [mime, expected] of cases) {
it(`mimeType=${JSON.stringify(mime)} → ${expected}`, () => {
expect(getAttachmentPreviewKind(mime)).toBe(expected);
});
}
});
describe("extension fallback when MIME is missing or generic", () => {
const cases: Array<[string | undefined, string | undefined, string | undefined, ReturnType<typeof getAttachmentPreviewKind>]> = [
// [mime, uri, name, expected]
[undefined, "workspace:/tmp/screenshot.png", "screenshot.png", "image"],
["", "workspace:/tmp/photo.JPG", "photo.JPG", "image"],
["application/octet-stream", "workspace:/tmp/clip.mp4", "clip.mp4", "video"],
[undefined, "workspace:/foo/song.mp3", "song.mp3", "audio"],
[undefined, "workspace:/docs/report.pdf", "report.pdf", "pdf"],
[undefined, "workspace:/code/main.py", "main.py", "text"],
[undefined, "workspace:/data/notes.md", "notes.md", "text"],
// No extension → file
[undefined, "workspace:/tmp/Dockerfile", "Dockerfile", "file"],
// Trailing dot → file
[undefined, "workspace:/tmp/weird.", "weird.", "file"],
// URL with query string + fragment → strip before parsing
[undefined, "https://example.com/foo.png?download=1#anchor", "", "image"],
// Unknown extension → file
[undefined, "workspace:/tmp/something.xyz", "something.xyz", "file"],
// Empty
[undefined, "", "", "file"],
[undefined, undefined, undefined, "file"],
];
for (const [mime, uri, name, expected] of cases) {
it(`mime=${mime ?? "<undef>"} uri=${uri} name=${name} → ${expected}`, () => {
expect(getAttachmentPreviewKind(mime, uri, name)).toBe(expected);
});
}
});
describe("MIME wins over extension", () => {
it("explicit mime=application/zip + extension=.png → file (don't render zip as image)", () => {
// Critical safety: agent might attach a .png-named file that's
// actually a zip. The strict-MIME branch wins and we render
// the chip, not an <img> that fails to decode the bytes.
expect(getAttachmentPreviewKind("application/zip", "x.png", "x.png")).toBe("file");
});
it("explicit mime=text/plain + extension=.png → text", () => {
expect(getAttachmentPreviewKind("text/plain", "log.png", "log.png")).toBe("text");
});
});
describe("regression: hostile-reviewer cases", () => {
it("does NOT misclassify image/svg+xml as text (svg is image even though it has XML)", () => {
expect(getAttachmentPreviewKind("image/svg+xml")).toBe("image");
});
it("application/octet-stream + extension=.docx → file (no renderer, don't try)", () => {
expect(getAttachmentPreviewKind("application/octet-stream", "f.docx", "f.docx")).toBe("file");
});
it("application/json (an application/* type, not text/*) still maps to text", () => {
expect(getAttachmentPreviewKind("application/json")).toBe("text");
});
});
});


@ -0,0 +1,154 @@
// preview-kind.ts — single source of truth for "what renderer should
// this attachment use" (RFC #2991, PR-1).
//
// Per the RFC's Phase 2 design, MIME type is the dispatch axis. The
// wire shape (ChatAttachment.mimeType) already carries it end-to-end
// from the server's chat_files.go through agent_message_writer.go to
// the canvas hydrater — we just need to map it to a render kind.
//
// Why a separate file from AttachmentPreview.tsx: the kind helper is
// a pure function that's easier to unit-test in isolation than a
// React component, and unit tests across MIME families are the
// regression line for new types added later.
/** The render-kind taxonomy. Each kind has a dedicated component:
 *
 * image → AttachmentImage (inline thumbnail + click → lightbox)
 * video → AttachmentVideo (HTML5 <video controls>, native fullscreen)
 * audio → AttachmentAudio (HTML5 <audio controls>)
 * pdf → AttachmentPDF (browser-native <embed>, fullscreen modal)
 * text → AttachmentTextPreview (monospace, first N lines, expand)
 * file → AttachmentChip (existing fallback: generic file pill)
 *
 * NB: `text` includes JSON, YAML, source code, plain text: anything
 * that renders sensibly as preformatted ASCII without a specialized
 * viewer. PR-1 ships only `image` + `file`; PR-2 adds video/audio;
 * PR-3 adds pdf + text. All routed through this same dispatch table
 * so adding a new kind is a one-line registration. */
export type AttachmentPreviewKind = "image" | "video" | "audio" | "pdf" | "text" | "file";
/** Maps a MIME type to the render kind. Falls back to "file" for
 * any MIME we don't have a renderer for (current behavior: the
 * attachment chip is the universal fallback).
*
* Filename-based fallback: when mimeType is missing or generic
* (application/octet-stream), inspect the URI's extension. The
* workspace-server's chat_files.go derives Content-Type from the
* file extension, but agent-emitted attachments may not always
* set mimeType, and the canvas should still preview a file named
* `screenshot.png` even if the wire shape lacks the MIME.
*
 * Strict MIME match always wins; extension fallback only applies
 * to empty / generic. Unknown extension → "file". */
export function getAttachmentPreviewKind(
mimeType: string | undefined,
uri?: string,
name?: string,
): AttachmentPreviewKind {
const mime = (mimeType ?? "").toLowerCase().trim();
// Strict MIME match (preferred — set by server's Content-Type
// detection or by the agent's explicit mimeType field).
if (mime.startsWith("image/")) return "image";
if (mime.startsWith("video/")) return "video";
if (mime.startsWith("audio/")) return "audio";
if (mime === "application/pdf") return "pdf";
if (
mime.startsWith("text/") ||
mime === "application/json" ||
mime === "application/yaml" ||
mime === "application/x-yaml" ||
mime === "application/javascript" ||
mime === "application/typescript"
) {
return "text";
}
// Extension-based fallback — only when MIME is missing or
// application/octet-stream (the server's "I don't know" default).
// Skip when MIME is set to something specific we just don't have
// a renderer for (e.g. application/zip → file is correct).
const looksGeneric = mime === "" || mime === "application/octet-stream";
if (looksGeneric) {
const ext = extractExtension(uri, name);
if (ext) {
const kind = EXTENSION_KIND.get(ext);
if (kind) return kind;
}
}
return "file";
}
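
One edge the function above does not cover: it compares with `===` for the application/* types, so a value carrying parameters, such as `application/json; charset=utf-8`, falls through to "file". Whether parameterised values ever reach the canvas is an assumption here, not something this PR states. If they do, a pre-processing step along these lines could reduce the value to its bare type/subtype before matching. `mimeEssence` is a hypothetical name.

```typescript
// Hypothetical pre-processing step; not part of this PR.

/** Reduces a Content-Type style value to its lowercase type/subtype. */
function mimeEssence(mimeType: string | undefined): string {
  // Everything after the first ";" is a parameter (e.g. charset).
  const [essence = ""] = (mimeType ?? "").split(";");
  return essence.trim().toLowerCase();
}
```

The result would replace the `(mimeType ?? "").toLowerCase().trim()` expression at the top of the function; the existing strict-MIME cases are unaffected because they contain no `;`.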
// Extension → kind table for the fallback branch. Keep this list
// short and curated — every entry is a UX commitment to render
// inline, and a wrong inference (e.g. .doc rendered as text) is
// worse than the generic file chip.
const EXTENSION_KIND: ReadonlyMap<string, AttachmentPreviewKind> = new Map([
// Images
["png", "image"],
["jpg", "image"],
["jpeg", "image"],
["gif", "image"],
["webp", "image"],
["svg", "image"],
["avif", "image"],
["bmp", "image"],
// Video
["mp4", "video"],
["webm", "video"],
["mov", "video"],
["mkv", "video"],
// Audio
["mp3", "audio"],
["wav", "audio"],
["ogg", "audio"],
["m4a", "audio"],
["flac", "audio"],
// PDF
["pdf", "pdf"],
// Text-ish (rendered as preformatted ASCII)
["txt", "text"],
["md", "text"],
["json", "text"],
["yaml", "text"],
["yml", "text"],
["js", "text"],
["ts", "text"],
["tsx", "text"],
["jsx", "text"],
["py", "text"],
["go", "text"],
["rs", "text"],
["java", "text"],
["c", "text"],
["cpp", "text"],
["h", "text"],
["hpp", "text"],
["sh", "text"],
["bash", "text"],
["html", "text"],
["css", "text"],
["sql", "text"],
["toml", "text"],
["ini", "text"],
["xml", "text"],
["csv", "text"],
["log", "text"],
]);
/** Extracts the lowercased extension from a uri or name, without
* the leading dot. Returns "" when no extension is present. */
function extractExtension(uri: string | undefined, name: string | undefined): string {
// Prefer name (always a leaf path); fall back to uri's last
// segment. Strip query string + fragment first so a URI like
// "https://example.com/foo.png?download=1" still parses as png,
// even when the query itself contains a slash.
const candidate = name || uri || "";
if (!candidate) return "";
// Drop ?query and #fragment, then take the last path segment.
const path = candidate.split(/[?#]/)[0];
const leaf = path.split(/[\\/]/).pop() || "";
const dot = leaf.lastIndexOf(".");
if (dot < 0 || dot === leaf.length - 1) return "";
return leaf.slice(dot + 1).toLowerCase();
}