Hongming Wang d8026347e5 chore: open-source restructure — rename dirs, remove internal files, scrub secrets

Renames:
- platform/ → workspace-server/ (Go module path stays as "platform" for
  external dep compat — will update after plugin module republish)
- workspace-template/ → workspace/

Removed (moved to separate repos or deleted):
- PLAN.md — internal roadmap (move to private project board)
- HANDOFF.md, AGENTS.md — one-time internal session docs
- .claude/ — gitignored entirely (local agent config)
- infra/cloudflare-worker/ → Molecule-AI/molecule-tenant-proxy
- org-templates/molecule-dev/ → standalone template repo
- .mcp-eval/ → molecule-mcp-server repo
- test-results/ — ephemeral, gitignored

Security scrubbing:
- Cloudflare account/zone/KV IDs → placeholders
- Real EC2 IPs → <EC2_IP> in all docs
- CF token prefix, Neon project ID, Fly app names → redacted
- Langfuse dev credentials → parameterized
- Personal runner username/machine name → generic

Community files:
- CONTRIBUTING.md — build, test, branch conventions
- CODE_OF_CONDUCT.md — Contributor Covenant 2.1

All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml,
README, CLAUDE.md updated for new directory names.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-18 00:24:44 -07:00


Edit History — 2026-04-06

Summary

Merged PR from HongmingWang-Rabbit/molecule-monorepo#1 (Claude Code workspace runtime + A2A delegation + canvas improvements — 46 commits, 2,548 additions). Then performed comprehensive code review across all 3 layers (Python, Go, TypeScript) and fixed 18 issues (5 critical, 10 warnings, 3 suggestions).

Merged PR: Claude Code Workspace Runtime

CLI-based workspace runtimes — unified executor for Claude Code, Codex, Ollama, or custom CLI agents
A2A delegation via MCP + CLI — delegate_task, delegate_task_async, check_task_status, list_peers
Canvas improvements — legend panel, communication overlay, chat persistence with session sidebar, confirmation dialogs, enhanced thinking indicator
Platform fixes — offline→online heartbeat recovery, file API writes to correct config dir, restart uses workspace's own config, configurable rate limiter, Docker-in-Docker mount resolution
Security — unique temp files, shlex.quote for tokens, subprocess kill on timeout, path traversal prevention

Code Review Fixes (18 issues)

Critical (5 fixed)

ChatTab.tsx — Elapsed time calculation was Date.now() - Date.now() + thinkingStartTime, which reduces to roughly thinkingStartTime (a timestamp, not a duration). Fixed to Date.now() - thinkingStartTime.
Canvas.tsx — saveTimerRef debounce timer never cleared on component unmount. Added useEffect cleanup.
workspace.go Update handler — All 5 ExecContext calls in Update() silently discarded errors. Added log.Printf on each.
workspace.go Delete handler — All 4 cascade delete ExecContext calls ignored errors. Added log.Printf on each.
cli_executor.py — Temp files leaked if exception occurred between mkstemp and _temp_files.append(). Moved append() immediately after creation.
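
The cli_executor.py fix above comes down to registering a temp file for cleanup before any later call can raise. A minimal sketch, assuming a hypothetical TempFileTracker class; only mkstemp and the _temp_files list come from this log, and the real executor may be structured differently:

```python
import os
import tempfile


class TempFileTracker:
    """Hypothetical helper: tracks temp files so cleanup never misses one."""

    def __init__(self):
        self._temp_files = []

    def create(self, content: str) -> str:
        fd, path = tempfile.mkstemp(suffix=".txt")
        # Register the path immediately, before any call that can raise,
        # so cleanup() still sees the file if the write below fails.
        self._temp_files.append(path)
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        return path

    def cleanup(self) -> None:
        while self._temp_files:
            path = self._temp_files.pop()
            try:
                os.unlink(path)
            except FileNotFoundError:
                pass  # already removed; nothing to do
```

If the write fails, the file is already in the list, so a later cleanup() still removes it.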

Warnings (10 fixed)

a2a_cli.py — resp.json() could crash on malformed JSON response. Wrapped in try/except.
a2a_mcp_server.py — chunk.decode() could crash on invalid UTF-8. Added errors="replace".
a2a_cli.py — Async mode timeout returned a misleading "submitted_timeout" status. Now reports "uncertain" on stderr.
templates.go — Config files written with 0644 (world-readable). Changed all 4 occurrences to 0600.
CommunicationOverlay.tsx — fetchComms callback recreated on every nodes change, causing interval reset. Stabilized with useRef.
ContextMenu.tsx — Delete confirmation dialog orphaned when context menu closed externally. Added useEffect cleanup.
ContextMenu.tsx — No loading guard on export/duplicate async actions. Added actionLoading state to prevent double clicks.
cli_executor.py — config.args appended after prompt, breaking CLI flag parsing. Moved before prompt.
main.py — Any non-langgraph runtime silently treated as CLI. Added validation warning for unknown values.
provisioner.go — Created container not cleaned up if ContainerStart failed. Added ContainerRemove on failure.
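
The two parsing fixes in a2a_cli.py and a2a_mcp_server.py follow a common pattern: tolerate bad bytes and bad JSON instead of crashing. A minimal sketch, with decode_chunk and parse_json_body as hypothetical names (the real code calls resp.json() and chunk.decode() inline):

```python
import json


def decode_chunk(chunk: bytes) -> str:
    # errors="replace" swaps invalid bytes for U+FFFD instead of raising
    # UnicodeDecodeError.
    return chunk.decode("utf-8", errors="replace")


def parse_json_body(body: str):
    """Return (data, error); error is None when parsing succeeded."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        return None, f"malformed JSON response: {exc.msg}"
```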

Suggestions (3 fixed)

router.go — CORS origins hardcoded to localhost. Now configurable via CORS_ORIGINS env var (comma-separated).
config.py — int() conversion on tier crashed on non-numeric YAML. Added .isdigit() guard with default 1.
ChatTab.tsx — loadSessions() called twice during mount. Consolidated to single call shared between state initializers.
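
The config.py guard can be sketched as below. parse_tier is a hypothetical name, and the bool and int branches are assumptions about what YAML parsing might hand back; the log only states that an .isdigit() check with a default of 1 was added.

```python
def parse_tier(raw, default: int = 1) -> int:
    """Hypothetical helper: coerce a YAML `tier` value to int, else default."""
    if isinstance(raw, bool):
        return default  # YAML true/false is not a tier
    if isinstance(raw, int):
        return raw
    text = str(raw).strip()
    # isdigit() is False for "", "-1", "1.5", and "high". The isascii()
    # check excludes characters such as "²", which pass isdigit() but
    # make int() raise ValueError.
    if text.isascii() and text.isdigit():
        return int(text)
    return default
```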

Provisioner Auto-Setup (URL Resolution)

Fixed the core issue preventing workspace chat from working after creation without manual intervention:

provisioner.go — Now inspects container after start to resolve the actual host-mapped ephemeral port (127.0.0.1:<port>), instead of returning the Docker-internal URL. The host URL is stored in DB and Redis, preserved by the registry's ON CONFLICT clause when the agent self-registers.
workspace.go — provisionWorkspace now also caches the Docker-internal URL (ws-<id>:8000) for inter-container discovery.
discovery.go — When a workspace discovers another workspace (via X-Workspace-ID header), constructs the Docker-internal URL from the container name convention (ws-<first12chars>:8000) when the Redis cache is empty. This enables inter-agent A2A delegation.

Before: create workspace → agent registers with Docker hostname → proxy gets 502 → manual re-registration needed.
After: create workspace → provisioner stores host URL → proxy works immediately.

Grid Layout for Embedded Team Members

WorkspaceNode.tsx — Departments render in a 3-column grid at depth 0 (was single column). Sub-teams use 2-column grid at depth 1+. Root nodes wider (720-960px) to accommodate side-by-side layout. Company org chart now fits in one screen without scrolling.

Chat UX Improvements

ChatTab.tsx — 502/503/timeout errors show user-friendly messages ("CEO is not responding. The agent container may not be running. Try restarting the workspace.") instead of raw API error dumps. Input disables after failure. Agent unreachable state shown in empty chat and placeholder.
ChatTab.tsx — Agent and system messages now render markdown (bold, lists, code blocks, headers, tables) via react-markdown + remark-gfm + @tailwindcss/typography. User messages stay as plain text.

Workspace Config Cleanup

.gitignore — Added workspace-configs-templates/ws-* to exclude auto-generated provisioner instance configs (not templates, shouldn't be committed).
Removed 15 stale ws-* instance directories from the templates folder.

Test Infrastructure

test_api.sh — Fixed degraded status test to re-register before high error rate heartbeat (avoids Redis TTL expiry race).
test_activity_e2e.sh — Fixed assertion to match actual Go binding error field name (ActivityType not activity_type).
Full clean-slate E2E verified: nuke → setup → create 11 workspaces → all online with HOST URLs → 21/21 tests pass (peer discovery, access control, chat, delegation, activity logs, current task, URL auto-resolution).

Code Review Round 2 (7 fixes)

Critical (2 fixed)

workspace.go — workspaceID[:12] panics on IDs shorter than 12 chars. Added length guard matching containerName() pattern.
discovery.go — Fallback URL synthesis returned a Docker-internal URL even for non-existent or offline workspaces. Now checks workspace status (online/degraded) before constructing URL.

Warnings (3 fixed)

discovery.go — CacheInternalURL error silently discarded (inconsistent with workspace.go). Added log.Printf.
ChatTab.tsx — ReactMarkdown rendered for both agent and system messages. System error messages (containing *, #, etc.) could produce unexpected formatting. Now only renders markdown for role === "agent".
ChatTab.tsx — thinkingStartTime state used in setInterval closure was stale (captured before setThinkingStartTime applied). Replaced with ref + local variable captured at effect creation time.

Suggestions (2 fixed)

tailwind.config.ts — require("@tailwindcss/typography") replaced with ESM import typography for consistency with TypeScript config.
ci.yml — CI Node.js bumped from 20 to 22 (LTS). Lock file (lockfileVersion 3, npm 11) had @emnapi resolution differences with Node 20's npm 10, causing npm ci to fail.

Code Review Round 3 (DRY + hardening)

Refactor: Exported `provisioner.ContainerName()` / `provisioner.InternalURL()`

The ws-<first12chars>:8000 URL construction was duplicated in discovery.go, workspace.go, and terminal.go. Exported the provisioner's existing helpers and replaced all inline duplications. Prevents drift if naming convention changes.
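
For illustration, the naming convention behind the two helpers can be written in Python as below. This is a rendering of the convention described here, not the Go source; the http:// scheme is an assumption, and the explicit length check mirrors the guard added in Round 2 (a Go slice expression panics on a short ID, whereas Python slicing would not).

```python
def container_name(workspace_id: str) -> str:
    # Use the first 12 characters; shorter IDs are used whole.
    short = workspace_id[:12] if len(workspace_id) > 12 else workspace_id
    return f"ws-{short}"


def internal_url(workspace_id: str) -> str:
    # Assumed scheme: plain HTTP on port 8000 inside the Docker network.
    return f"http://{container_name(workspace_id)}:8000"
```

Keeping both functions in one place means a change to the prefix, length, or port only has to be made once.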

Fix: Discovery fall-through returned host URLs to container callers

When a workspace-to-workspace discovery request hit a workspace that was offline/provisioning/failed, the code fell through to the external URL path and returned http://127.0.0.1:<port> — unreachable from inside Docker. Now returns 503 workspace not available (with status) or 404 workspace not found.

Fix: Dead `thinkingStartRef` removed (ChatTab.tsx)

Round 2 replaced thinkingStartTime state with a ref + local variable. The ref was written but never read — only the local startTime in the closure was used. Removed the dead ref entirely.

Fix: Terminal.go container name lookup

Replaced inline "ws-"+workspaceID[:12] with provisioner.ContainerName(). Cached the result in a local name variable to avoid calling the function twice.

Hardening: `.gitignore` comprehensiveness

Added 12 missing patterns: .awareness/, **/.next/, mcp-server/dist/, dist/, .pytest_cache/, coverage/, .nyc_output/, *.db / *.sqlite*, postgres_data/ / redis_data/, .env.production, *.bundle.json.

CLI Executor Fixes

Fix: Claude Code exit code 1 with valid output

Claude Code sometimes exits with code 1 but still produces valid output on stdout (e.g. MCP tool failures that don't prevent a response). The executor now accepts stdout output regardless of exit code (if proc.returncode == 0 or stdout_text). Also added detailed stderr/stdout logging on non-zero exit.
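
A minimal sketch of that acceptance rule, assuming a hypothetical run_cli wrapper around asyncio subprocesses; only the condition proc.returncode == 0 or stdout_text comes from this log:

```python
import asyncio
import sys


async def run_cli(argv: list[str]) -> str:
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    stdout_text = stdout.decode("utf-8", errors="replace").strip()
    stderr_text = stderr.decode("utf-8", errors="replace").strip()
    if proc.returncode != 0:
        # Log the details, but do not give up yet: stdout may be usable.
        print(f"exit={proc.returncode} stderr={stderr_text!r}", file=sys.stderr)
    if proc.returncode == 0 or stdout_text:
        return stdout_text
    raise RuntimeError(f"CLI failed with exit code {proc.returncode}")
```

A process that prints a response and then exits with code 1 returns its output; a process that exits non-zero with empty stdout still raises.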

Fix: Empty description crashes AgentCard (main.py)

Pydantic's AgentCard requires a non-null string for description. Auto-generated configs had description: "". Fixed with config.description or config.name.

Fix: Removed timeouts on A2A proxy, CLI executor, and MCP delegation client

Removed all artificial timeouts from the A2A proxy (http.Client{}), CLI executor (timeout: 0 → await proc.communicate() without wait_for), and MCP delegation client (httpx timeout=None). Delegation chains (PM → Lead → Agent) can take arbitrarily long — agent liveness is monitored via heartbeat, not proxy deadlines. Proxy uses context.WithoutCancel(ctx) to survive client disconnect while still canceling on server shutdown.

Restart Handler Fixes

Fix: Template resolution by config.yaml name field

findTemplateByName("PM") normalized to "pm" but the template dir is org-pm. Added a second pass that reads config.yaml files in template dirs and matches by the name: field.

Fix: Stale ws-* config dirs take precedence on restart

A previous restart's ensureDefaultConfig created a ws-<id>/ dir with only config.yaml (wrong runtime, empty description). On next restart, the ownDir check found it and used it. Fixed: only use ownDir if it contains more than just config.yaml (meaning files were uploaded via the Files API).
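
The ownDir rule can be sketched as follows. The restart handler itself is Go; this Python version only illustrates the check, and has_uploaded_files is a hypothetical name.

```python
import os


def has_uploaded_files(own_dir: str) -> bool:
    """True if the directory holds anything besides config.yaml."""
    try:
        entries = os.listdir(own_dir)
    except FileNotFoundError:
        return False  # no instance dir at all
    return any(name != "config.yaml" for name in entries)
```

A directory containing only an auto-generated config.yaml is treated as stale, so the restart would fall back to template resolution.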

Live Activity Feed (ChatTab)

Replaced the fake rotating status messages ("Analyzing your request...", "Almost there...") with a real-time activity feed powered by WebSocket events:

Opens a dedicated WebSocket while sending=true
Listens for ACTIVITY_LOGGED events across all workspaces
Shows color-coded delegation progress: → Delegating to Marketing Lead... (blue), ← Marketing Lead responded (42s) (green), ⚠ error (red)
MCP server now reports a2a_send activity before each delegation call

WebSocket Health Check (socket.ts)

Added periodic rehydration to the canvas WebSocket — if no events arrive for 30s, automatically re-fetches workspace state from the API. Prevents the canvas from showing stale offline status when agents recover between heartbeat cycles without a WebSocket event.
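
The idea is a silence watchdog: record when the last event arrived and refetch once the gap exceeds the threshold. socket.ts is TypeScript; the sketch below shows the same logic in Python, with StalenessWatchdog, tick, and the injectable clock all hypothetical.

```python
import time


class StalenessWatchdog:
    def __init__(self, rehydrate, max_silence=30.0, clock=time.monotonic):
        self._rehydrate = rehydrate      # callback that refetches state
        self._max_silence = max_silence  # seconds without events
        self._clock = clock
        self._last_event = clock()

    def on_event(self) -> None:
        self._last_event = self._clock()

    def tick(self) -> bool:
        """Call periodically; returns True if a rehydrate was triggered."""
        if self._clock() - self._last_event >= self._max_silence:
            self._rehydrate()
            # Reset so the refetch is not repeated on every tick.
            self._last_event = self._clock()
            return True
        return False
```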

Shared Workspace Mount (WORKSPACE_DIR)

Added WORKSPACE_DIR env var for the platform. When set, all provisioned workspace containers bind-mount the host directory as /workspace instead of using isolated Docker named volumes. This gives all agents read/write access to the same codebase.

Default Org Setup (setup-org.sh)

Created setup-org.sh — reproducible script that creates the full 15-agent org hierarchy:

PM → Marketing Lead (Content Writer, SEO Specialist, Social Media Manager)
PM → Research Lead (Market Analyst, Technical Researcher, Competitive Intelligence)
PM → Dev Lead (Frontend Engineer, Backend Engineer, DevOps Engineer, Security Auditor, QA Engineer)

All agents use the Claude Code runtime with a shared OAuth token. The script also extracts the token from the macOS keychain and distributes it to all org-* templates.

Canvas Agent Task Visibility

Live current_task on workspace cards

CLI executor now reports current_task via immediate heartbeat push when starting/finishing a request. The MCP server also pushes current_task when delegating. Each workspace card on the canvas shows an amber task banner with what the agent is currently working on — visible across the entire org chart in real time.

heartbeat.py — added current_task field to heartbeat payload
cli_executor.py — calls _set_current_task(summary) on execute start, clears on finish via try/finally
a2a_mcp_server.py — pushes current_task heartbeat alongside report_activity on delegation

Session continuity (Claude Code --resume)

CLI executor now maintains conversation state across messages using Claude Code's --resume flag:

First message: runs with --output-format json to capture session_id
Subsequent messages: runs with --resume <session_id> to continue the conversation
System prompt only injected on first message (resumed sessions already have it)
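
The three rules above could be arranged as in this sketch. ClaudeSession is a hypothetical class. Only --output-format json, --resume, and the session_id field come from this log; the --print and --append-system-prompt flags, the result field, and the use of JSON output on every turn are assumptions about the CLI that should be checked against its documentation. Extra arguments go before the prompt, matching the flag-order fix from the first review round.

```python
import json


class ClaudeSession:
    def __init__(self, system_prompt: str, extra_args=()):
        self.system_prompt = system_prompt
        self.extra_args = list(extra_args)
        self.session_id = None  # unknown until the first reply is parsed

    def build_argv(self, prompt: str) -> list[str]:
        argv = ["claude", "--print", "--output-format", "json"]
        if self.session_id is None:
            # First message: inject the system prompt (assumed flag name).
            argv += ["--append-system-prompt", self.system_prompt]
        else:
            # Later messages: the resumed session already has the prompt.
            argv += ["--resume", self.session_id]
        argv += self.extra_args
        argv.append(prompt)  # prompt goes last, after all flags
        return argv

    def record_output(self, stdout_text: str) -> str:
        data = json.loads(stdout_text)
        self.session_id = data.get("session_id", self.session_id)
        return data.get("result", "")  # assumed field name
```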

Chat input textarea

Replaced single-line <input> with auto-growing <textarea> (Shift+Enter for new line, Enter to send, max 200px height).

New ConversationTraceModal component — full-screen modal showing the delegation chain across all workspaces chronologically:

Fetches activity from ALL workspaces (including hidden children) via parallel API calls
Timeline view with color-coded dots: cyan = SEND, blue = RECEIVE, red = ERROR
Shows workspace names (not UUIDs): PM → Research Lead
Displays message content: Task box (what was sent) and Response box (what came back)
Accessible via "Full Trace" button in the Activity tab

Activity tab improvements

Workspace names replace raw UUIDs in flow indicators (PM → Research Lead instead of d70d7ed8 → f3ea3f90)
Summaries resolve IDs to names
Expanded details show Source: PM (d70d7ed8) format
New MessagePreview component extracts human-readable text from A2A request/response JSON
MCP server now includes task text in a2a_send activity reports (request_body: {task: "..."})

Shared types and hooks

Extracted duplicated code:

canvas/src/types/activity.ts — shared ActivityEntry interface
canvas/src/hooks/useWorkspaceName.ts — shared workspace ID → name resolver hook
Both ActivityTab and ConversationTraceModal import from shared locations

New "Stop All (N)" button in the top toolbar, visible when any workspace has active tasks. Restarts all active workspace containers to kill running Claude processes. Button disappears when no tasks are active.

Workspace Name Resolution Everywhere

Discovery endpoint returns `name`

GET /registry/discover/:id now returns name alongside id and url for workspace-to-workspace calls. Query only runs for agent-to-agent callers (X-Workspace-ID header present), not canvas/external.

MCP server caches peer names

_peer_names cache populated by list_peers calls and discovery responses. delegate_task uses cached names so task banners show "Delegating to Research Lead" instead of raw UUIDs.

Activity report accepts request_body and response_body

POST /workspaces/:id/activity now reads request_body, response_body, and source_id from the JSON payload (previously only read metadata). MCP server logs full task text and agent response in delegation activities, enabling complete conversation traces.

Replaced native browser title attributes with Tooltip.tsx — styled hover popup (dark bg, scrollable, 350px max width, 400ms delay). Used on all task banners: WorkspaceNode, TeamMemberChip, SidePanel. Includes unmount cleanup to prevent stale setState.

Chat Persistence on Refresh

sending state now initializes from data.currentTask — if the agent has an active task on page load, the processing indicator shows immediately. Cleared when current_task empties via WebSocket heartbeat. sendingFromAPIRef distinguishes user-initiated sends from resumed state.

Comprehensive Trace Log Generator

logs/conversation-trace.log — generated by Python script, shows full timeline across all 15 workspaces with workspace names, message bodies (request + response), error details, and delegation chains. logs/ added to .gitignore.

Poll-Based Chat (ChatTab)

Replaced synchronous fetch-and-wait with fire-and-poll architecture:

Send A2A request via fire-and-forget fetch (no await, uses AbortController)
Poll GET /workspaces/:id/activity?type=a2a_receive every 3s for the response
Match by timestamp (created_at > sentAt) with response_body present
extractResponseText() handles all activity body formats ({result: "..."}, {task: "..."}, A2A JSON-RPC)
On page refresh: if agent has active task (data.currentTask), auto-resume polling using last chat message timestamp
Response stored server-side in activity_logs table — can never be lost to browser disconnects
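
The matching step (created_at later than the send time, with a response_body present) can be sketched as below. ChatTab is TypeScript; this Python version only illustrates the rule. find_response and parse_ts are hypothetical names, and the sketch assumes timestamps that datetime.fromisoformat can parse.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_response(activities, sent_at: datetime):
    """Return the earliest activity after sent_at that has a response_body."""
    candidates = [
        entry
        for entry in activities
        if entry.get("response_body") and parse_ts(entry["created_at"]) > sent_at
    ]
    return min(candidates, key=lambda e: parse_ts(e["created_at"]), default=None)
```

Returning None means "keep polling"; entries from before the send, or without a response body yet, are ignored.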

molecule-monorepo-status CLI

New molecule_ai_status.py — CLI tool + importable module for any process inside a workspace container to update the canvas task display:

molecule-monorepo-status "Running weekly audit..."  # show on canvas
molecule-monorepo-status ""                         # clear

Pushes immediate heartbeat (current_task) + logs activity (task_update). Linked as /usr/local/bin/molecule-monorepo-status in Dockerfile. Importable from Python:

from molecule_ai_status import set_status
set_status("Analyzing data...")

Prometheus Metrics Endpoint

New workspace-server/internal/metrics/metrics.go — zero-dependency Prometheus metrics:

GET /metrics — scrape-safe, no auth required
molecule_http_requests_total{method,path,status} — counter
molecule_http_request_duration_seconds_total{method,path} — counter (sum)
molecule_websocket_connections_active — gauge
Go runtime metrics: goroutines, heap alloc, sys bytes, GC pause

Middleware registered in router.go, WebSocket connect/disconnect tracked in socket.go. Map snapshot taken under lock before HTTP write to avoid holding the read lock during slow responses.
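
The snapshot-under-lock pattern can be illustrated as follows. metrics.go is Go; this is a Python sketch of the same idea with a hypothetical RequestCounter class. It omits label-value escaping, which a real exporter needs.

```python
import threading
from collections import defaultdict


class RequestCounter:
    def __init__(self):
        self._lock = threading.Lock()
        self._counts = defaultdict(int)

    def observe(self, method: str, path: str, status: int) -> None:
        with self._lock:
            self._counts[(method, path, status)] += 1

    def render(self) -> str:
        with self._lock:
            snapshot = dict(self._counts)  # copy while holding the lock
        # Formatting (and, in a real handler, the network write) happens
        # after the lock is released, so a slow scraper cannot block
        # request handling.
        lines = ["# TYPE molecule_http_requests_total counter"]
        for (method, path, status), value in sorted(snapshot.items()):
            labels = f'method="{method}",path="{path}",status="{status}"'
            lines.append(f"molecule_http_requests_total{{{labels}}} {value}")
        return "\n".join(lines) + "\n"
```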

E2B Cloud Sandbox Backend

workspace/tools/sandbox.py now supports three backends:

subprocess (default) — local execution with timeout
docker — throwaway Docker-in-Docker container
e2b — cloud microVM via E2B (https://e2b.dev), supports Python and JavaScript

Selected via SANDBOX_BACKEND env var (from config.yaml → sandbox.backend). E2B requires E2B_API_KEY workspace secret and e2b-code-interpreter package. Uses asyncio.get_running_loop().run_in_executor() for non-blocking calls.
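
The run_in_executor pattern mentioned above moves a blocking call onto a worker thread so the event loop stays responsive. A minimal sketch, where blocking_run is a stand-in and not the E2B API:

```python
import asyncio
import time


def blocking_run(code: str) -> str:
    """Stand-in for a blocking SDK call (hypothetical; not the E2B API)."""
    time.sleep(0.05)
    return f"ran {len(code)} chars"


async def run_in_sandbox(code: str) -> str:
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_run, code)
```

While blocking_run sleeps in the worker thread, other coroutines on the same loop continue to run.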

LiteLLM & Ollama Docker Compose Profiles

Optional services added to docker-compose.yml:

litellm — unified OpenAI-compatible proxy for all LLM providers (Anthropic, OpenAI, OpenRouter, Ollama). Start with docker compose --profile multi-provider up
ollama — local LLM models. Start with docker compose --profile local-models up

Both use compose profiles so they only start when explicitly requested.

Deployment Configs

railway.toml — Railway.app deployment config
render.yaml — Render.com deployment config

Resizable Side Panel

SidePanel now has a draggable resize handle on the left edge. Drag to resize between 320px and 80% of screen width. Default 480px.
