Both were lost during the PR #844 rebase — the converter was in the
source but the binary couldn't compile because FetchWorkspaceChannelContext
was missing from manager.go (interface mismatch). Previous deploys
silently used the cached old binary without the converter.
Also removed unused 'log' import that blocked compilation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agents output standard Markdown (Claude Code default) but Slack uses
its own mrkdwn format. Without conversion:
**bold** shows as literal **bold**
### heading shows as literal ###
[text](url) shows as raw markdown link
Converter handles:
**bold** → *bold* (Slack bold is single asterisk)
### heading → *heading* (bold text, no headings in Slack)
[text](url) → <url|text> (Slack link format)
--- → ——— (visual separator)
`code` and ```blocks``` pass through unchanged
6 new tests: bold, heading, link, hr, code block, mixed.
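The rules above can be sketched as regex rewrites. This is an illustrative sketch, not the actual converter: the names and exact patterns are assumptions, and the real implementation must also skip code spans rather than rewriting inside them.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative regexes; the real converter's patterns may differ.
var (
	boldRe    = regexp.MustCompile(`\*\*(.+?)\*\*`)
	headingRe = regexp.MustCompile(`(?m)^#{1,6}\s+(.+)$`)
	linkRe    = regexp.MustCompile(`\[([^\]]+)\]\(([^)]+)\)`)
	hrRe      = regexp.MustCompile(`(?m)^---+$`)
)

func toMrkdwn(md string) string {
	out := boldRe.ReplaceAllString(md, `*$1*`)    // **bold** → *bold*
	out = headingRe.ReplaceAllString(out, `*$1*`) // ### heading → *heading*
	out = linkRe.ReplaceAllString(out, `<$2|$1>`) // [text](url) → <url|text>
	out = hrRe.ReplaceAllString(out, "———")       // --- → ———
	return out
}

func main() {
	fmt.Println(toMrkdwn("### Status\n**done**, see [PR](https://example.com)\n---"))
}
```

Backtick-delimited code survives because none of the patterns match inside plain inline code; a production converter still needs an explicit skip so that markdown syntax quoted inside code blocks is left alone.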
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a cron fires, the scheduler now fetches the last 10 messages from
the workspace's Slack channel via conversations.history and prepends them
to the cron prompt as '[Slack channel context — recent team messages]'.
This gives each agent ambient awareness of what peers are doing:
- Backend sees Frontend posted 'PR #840 ready for review' → can check
- Security Auditor sees Backend posted 'new endpoint added' → plans review
- PM sees all engineering activity → better synthesis in rollup
Implementation:
- slack.go: FetchChannelHistory() calls conversations.history, filters
bot's own messages, returns last N as SlackHistoryMessage structs
- manager.go: FetchWorkspaceChannelContext() looks up the workspace's
Slack config, fetches history, formats as readable context block
- scheduler.go: ChannelBroadcaster interface extended with
FetchWorkspaceChannelContext; fireSchedule injects context before
the cron prompt (prepended, not appended, so the agent sees team
context BEFORE its task instructions)
Best-effort: if Slack API fails or workspace has no channels, the
prompt is unchanged. Truncated to 200 chars per message, 10 messages
max to keep prompt overhead bounded.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code review findings addressed:
Critical:
1. Bot echo loop: add bot_id + subtype='bot_message' check in ParseWebhook
to prevent outbound auto-posts from triggering inbound → infinite loop
2. Connection leak: close resp.Body immediately after reading instead of
defer inside loop (was holding N connections open for N chunks)
3. Cancelled context: auto-post goroutine now uses context.Background()
with 30s timeout instead of inheriting fireCtx (which gets cancelled
by deferred cancel() when fireSchedule returns)
4. Slug validation: regex ^[a-zA-Z0-9 _-]+$ rejects path traversal and
special chars in [slug] routing
Improvements:
5. Shared HTTP client (slackHTTPClient) for connection pooling instead of
per-request &http.Client{}
6. Rune-safe truncation in BroadcastToWorkspaceChannels for CJK/emoji
7. Log async HandleInbound errors instead of silently discarding
8. url_verification challenge properly returned (c.JSON with challenge)
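Item 6 is the easiest of these to get wrong silently. A minimal sketch of rune-safe truncation (function name and ellipsis suffix are illustrative, not copied from BroadcastToWorkspaceChannels):

```go
package main

import "fmt"

// truncateRunes cuts on rune boundaries. Byte slicing (s[:n]) can split a
// 3- or 4-byte UTF-8 sequence for CJK or emoji mid-character, producing
// invalid UTF-8; slicing a []rune conversion cannot.
func truncateRunes(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max]) + "…"
}

func main() {
	fmt.Println(truncateRunes("日本語のテキスト", 3)) // cuts after 3 characters, not 3 bytes
}
```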
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Humans type "[backend] what's #800?" in a shared #mol-engineering channel
and the message routes specifically to Backend Engineer's workspace.

Matching logic (case-insensitive):
[pm] → PM
[backend] → Backend Engineer
[dev-lead] → Dev Lead
[security] → Security Auditor (prefix match on 'security-auditor')
Unknown slugs return the available agent list for that channel so the
user knows what slugs are valid.
Messages without a [slug] prefix route to the first matching workspace
(backward compat with Level 2).
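The matching order above (exact first, then prefix) can be sketched like this; the slug list and function name are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// matchSlug: case-insensitive exact match wins, then prefix match, so
// [security] resolves to 'security-auditor' without shadowing an exact slug.
func matchSlug(slug string, workspaces []string) (string, bool) {
	s := strings.ToLower(slug)
	for _, w := range workspaces {
		if strings.ToLower(w) == s {
			return w, true
		}
	}
	for _, w := range workspaces {
		if strings.HasPrefix(strings.ToLower(w), s) {
			return w, true
		}
	}
	return "", false // caller replies with the available agent list
}

func main() {
	ws := []string{"pm", "backend", "dev-lead", "security-auditor"}
	fmt.Println(matchSlug("Security", ws)) // → security-auditor true
}
```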
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Level 1 — Auto-post cron output to Slack:
- scheduler.go: captures A2A response body, extracts agent text via
extractResponseSummary(), broadcasts to workspace's configured Slack
channels on successful non-empty cron completions
- manager.go: adds BroadcastToWorkspaceChannels() — fans out to all
enabled channels for a workspace (engineering+firehose for eng agents,
research+firehose for research agents, etc.)
- main.go: wires scheduler → channel manager via SetChannels()
- Truncates output to 500 chars for Slack readability
Level 2 — Inbound Slack messages route to workspaces:
Already implemented by the existing webhook handler (POST /webhooks/slack)
+ the ParseWebhook method in slack.go which handles both Events API JSON
payloads and slash command form-encoded payloads. Needs Slack App Events
API URL configured to: https://<platform-host>/webhooks/slack
Also in this commit:
- slack.go: dual-mode adapter (bot_token + webhook fallback)
- 031 migration: pgvector guard wraps entire DO block
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Slack adapter: adds chat.postMessage mode alongside legacy webhooks.
When bot_token is configured, uses chat:write.customize for per-agent
display name + emoji on every message. Each of the 15 active agents
posts with a distinct identity (PM 💼, Backend ⚙️, etc.).
5 channels configured:
#mol-engineering — PM, Dev Lead, Frontend, Backend, QA, Security, UIUX, Docs
#mol-research — Research Lead, Market Analyst, Tech Researcher, Competitive Intel
#mol-ops — DevOps, Triage, Offensive Security
#mol-ceo-feed — PM synthesized rollup (CEO-facing)
#mol-firehose — all agents (raw feed)
Tested live: 5 test messages across 4 channels, all ok=true.
pgvector migration: moved ALTER TABLE + CREATE INDEX inside the DO
block so the entire migration is skipped when pgvector extension is
unavailable (was crashing platform on restart — the guard caught
CREATE EXTENSION but execution continued to ALTER TABLE which used
the non-existent vector type).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The migration SQL is read as raw SQL (not through Go fmt.Sprintf),
so %% reaches Postgres verbatim rather than collapsing to a single
escaped percent. Postgres RAISE uses single % for parameter substitution.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ALTER TABLE and CREATE INDEX referenced vector(1536) outside the
exception-handling DO block, so when pgvector wasn't installed they
crashed the migration runner — blocking ALL E2E runs on main.
Fix: move all DDL inside the single DO block so the EXCEPTION handler
catches any pgvector-related failure and skips the entire migration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents three upgrade strategies for keeping tenant EC2 instances
current with platform-tenant:latest:
- Option A: Rolling restart via CP admin endpoint (coordinated)
- Option B: Sidecar auto-updater cron (implemented, 5 min interval)
- Option C: Blue-green via Worker (zero downtime, future)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Badge was always text-zinc-500; apply blue-500 (>=0.8), zinc-400 (0.5–0.8),
zinc-600 (<0.5) per spec. Add 3 vitest tests, one per color tier (725 total).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds step-level checkpoint storage so workflows can resume from the
last completed step after a crash or restart without replaying prior work.
- Migration: `workflow_checkpoints` table — workspace_id (FK + CASCADE),
workflow_id, step_name, step_index, completed_at, payload JSONB.
UNIQUE(workspace_id, workflow_id, step_name) + covering index on
(workspace_id, workflow_id, completed_at DESC).
- Handlers (platform/internal/handlers/checkpoints.go):
POST /workspaces/:id/checkpoints — upsert via ON CONFLICT DO UPDATE
GET /workspaces/:id/checkpoints/:wfid — list steps ordered step_index DESC
DELETE /workspaces/:id/checkpoints/:wfid — clear on clean shutdown (404 if none)
- Router: all three routes on the wsAuth group (WorkspaceAuth middleware);
workspace A's token cannot reach workspace B's checkpoints.
- Tests (11 cases, sqlmock + race-safe): upsert-insert, upsert-update,
payload forwarding, list-ordered, list-not-found, rows.Err() → 500,
delete-success, delete-not-found, callerMismatch 403 on all 3 endpoints.
Closes #788. Parent: #583-1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Post-mortem fix: UIUX Designer ran 22 cron fires over 23 hours with
every single response being empty or '(no response generated)'. The
scheduler reported status=ok because the HTTP call succeeded — nobody
caught it until the CEO asked.
Changes:
- Migration 032: adds consecutive_empty_runs INT to workspace_schedules
- scheduler.go: captures response body from ProxyA2ARequest (was _),
checks for empty/sentinel markers via isEmptyResponse(), increments
consecutive_empty_runs on empty ok responses, resets on non-empty.
When consecutive_empty_runs >= 3, sets last_status='stale' with a
descriptive error message.
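The empty-detection check can be sketched as below; the exact sentinel list in scheduler.go is an assumption:

```go
package main

import (
	"fmt"
	"strings"
)

// isEmptyResponse sketches the marker check described above: whitespace-only
// bodies and the '(no response generated)' sentinel both count as empty.
func isEmptyResponse(body string) bool {
	s := strings.TrimSpace(body)
	return s == "" || s == "(no response generated)"
}

func main() {
	fmt.Println(isEmptyResponse("  "),
		isEmptyResponse("(no response generated)"),
		isEmptyResponse("shipped #800")) // → true true false
}
```

An HTTP 200 with one of these bodies is exactly the case the old scheduler reported as status=ok, which is why the counter has to look at content rather than transport success.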
The 'stale' status is surfaced via:
- GET /admin/schedules/health (merged in #671)
- PM's silence detector (companion fix in org-template PR)
- Maintenance loop response-body sampling (operator-side fix)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rebase of feat/issue-576-pgvector-semantic-memory onto current main,
preserving the #767 security layer (globalMemoryDelimiter + GLOBAL audit
log) that predates this branch.
Changes layered on top of main:
- Migration 031: embedding vector(1536) column + ivfflat cosine-ops index
(renumbered from 029 — 029/030 were taken by workspace-hibernation and
audit-events)
- Commit: embed-on-write after INSERT, non-fatal on embedding failure
- Search: semantic cosine-distance path when EmbeddingFunc is wired up;
falls back to FTS/ILIKE; GLOBAL delimiter wrapping applies on both paths
- EmbeddingFunc injection pattern; WithEmbedding chainable builder
All security invariants preserved:
- globalMemoryDelimiter wrapping on GLOBAL scope in both semantic + FTS
- GLOBAL write audit log (SHA-256 forensic trail) in Commit
- TestRecallMemory_GlobalScope_HasDelimiter passes
- TestMemoriesCommit_Global_AsRoot passes
- 3 new pgvector tests pass
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Addresses all 4 review points from PR #786:
1. Worker resilience: 3-tier cache (in-memory → KV → CP API) with stale
fallback so CP outages are invisible to tenants
2. WebSocket proxying: documented upgradeHeader handling, fallback to
keep Caddy for WS-only if Workers WS is unreliable
3. SG automation: note to auto-update Cloudflare IP ranges, don't hardcode
4. Trusted proxy: X-Forwarded-For / CF-Connecting-IP trust chain documented
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add org-templates/molecule-dev/system-prompt.md as a canonical org-level
shared-context template for all molecule-dev org agents. The Communication
section explains that /workspace/AGENTS.md is auto-generated at startup from
config.yaml (via agents_md.py / PR #763), describes the AAIF format it
follows, explains the GET /workspace/AGENTS.md peer-discovery contract, and
tells agents to keep their config.yaml name/role/description accurate as the
sole source of truth.
Also restructure the /org-templates/ gitignore rule from a hard directory-ignore
to a content-glob pattern so this specific reference template can be tracked
while all other cloned standalone-repo content remains ignored.
Co-authored-by: Molecule AI Documentation Specialist <documentation-specialist@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds path-filter table so developers and agents know which files
trigger which CI jobs, and that docs-only PRs skip everything.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI now detects which paths changed and skips irrelevant jobs:
- Platform (Go): only runs when platform/** changes
- Canvas (Next.js): only runs when canvas/** changes
- Python Lint: only runs when workspace-template/** changes
- Shellcheck: only runs when tests/e2e/** or scripts/** change
- E2E API: only runs when platform/** or tests/e2e/** change
Docs-only PRs (*.md, docs/**) skip all 5 jobs, saving ~15 min of
runner time per PR. Uses dorny/paths-filter for the CI workflow and
native paths: filter for the E2E workflow.
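A representative shape of the dorny/paths-filter wiring; the job names, filter names, and globs here are illustrative, not copied from the workflow:

```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      platform: ${{ steps.filter.outputs.platform }}
      canvas: ${{ steps.filter.outputs.canvas }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            platform:
              - 'platform/**'
            canvas:
              - 'canvas/**'
  platform:
    needs: changes
    if: needs.changes.outputs.platform == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "real Go build steps here"
```

A docs-only PR leaves every filter output 'false', so all downstream jobs are skipped by their if: conditions.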
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore(eco-watch): add BeeAI ACP + Claw Code — 2026-04-17
BeeAI ACP (i-am-bee/acp, IBM) — REST/OpenAPI agent comm protocol, direct
A2A alternative; Copilot CLI ACP support already in preview. GH #777 filed
for TR comparison vs A2A.
Claw Code (ultraworkers/claw-code) — 100k+★ Rust+Python clean-room rewrite
of Claude Code architecture; architectural reference + competitive signal for
molecule-ai-workspace-template-claude-code.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(eco-watch): mark BeeAI ACP as archived — A2A won consolidation
IBM archived i-am-bee/acp on Aug 27, 2025; contributed to AAIF/A2A
working group. No bridge or shim needed — Molecule's A2A bet vindicated.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Molecule AI Research Lead <research-lead@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Phase 33 plan and architecture doc for replacing per-tenant DNS
records with a wildcard DNS + Cloudflare Worker proxy pattern.
Eliminates: DNS propagation delays, NXDOMAIN caching, per-instance
Let's Encrypt, Caddy on EC2. Same pattern used by Vercel, Railway,
Fly.io, WordPress, n8n.
4-phase migration: deploy Worker → stop creating DNS records →
remove Caddy from EC2 → cleanup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds two missing env vars to .env.example + docker-compose.yml platform block:
1. HIBERNATION_IDLE_MINUTES (default 60)
Source: issue #724 / workspace hibernation feature.
Note: currently configured per-workspace via the hibernation_idle_minutes
DB column. This placeholder documents the planned global-default env var;
the platform does not yet read it. Per-workspace DB column is active now.
2. PLUGIN_ALLOW_UNPINNED (empty = false)
Source: issue #768 / PR #775 (supply chain hardening, not yet merged).
Pre-emptive documentation — takes effect when PR #775 lands.
ADMIN_TOKEN (item 3): already present with clear generation instructions
(openssl rand -base64 32) and NEVER-commit reminder. No changes needed.
docker-compose.yml cross-check — vars present in .env.example but absent from
the platform service env block (flagged, not fixed in this PR — all have safe
compiled-in defaults and are optional):
SECRETS_ENCRYPTION_KEY, AWARENESS_URL, MOLECULE_ENV, MOLECULE_IN_DOCKER,
MOLECULE_ENABLE_TEST_TOKENS, MOLECULE_ORG_ID, CP_PROVISION_URL,
ACTIVITY_RETENTION_DAYS, ACTIVITY_CLEANUP_INTERVAL_HOURS,
REMOTE_LIVENESS_STALE_AFTER, PLUGIN_INSTALL_{BODY_MAX_BYTES,FETCH_TIMEOUT,
MAX_DIR_BYTES}, TIER{2,3,4}_{MEMORY_MB,CPU_SHARES}, WORKSPACE_DIR.
These are not forwarded by docker-compose because they either auto-detect or
have safe defaults — operators override them via .env on the host. Adding
all of them to docker-compose would be noisy; a separate cleanup issue tracks
this.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds platform/internal/plugins/supply_chain_test.go with 8 tests (7 from
the spec + 1 end-to-end combo) specifying both security controls.
Control 1 — SHA256 content integrity (tests 1-3 + end-to-end):
Tests call VerifyManifestIntegrity(stagedDir string) error, which does
NOT exist yet → 5 compile errors / build failure until supply_chain.go
is written. Once stubbed to nil, SHA256Mismatch test fails at runtime.
VerifyManifestIntegrity contract:
- manifest.json absent → nil (backward compat)
- manifest.json present, no sha256 field → nil (backward compat)
- sha256 matches computed stagedDirDigest → nil
- sha256 mismatch → error mentioning "sha256"
stagedDirDigest algorithm (canonical, test + impl must agree):
Walk all files except manifest.json, sorted by rel path,
format each as "<rel>\x00<content>", concatenate, SHA256, hex.
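The canonical algorithm above, sketched over an in-memory map (rel path to content) instead of a directory walk so it stays self-contained; the real implementation walks stagedDir:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// stagedDirDigest: walk all files except manifest.json, sorted by rel path,
// format each as "<rel>\x00<content>", concatenate, SHA256, hex.
func stagedDirDigest(files map[string]string) string {
	rels := make([]string, 0, len(files))
	for rel := range files {
		if rel == "manifest.json" { // the manifest is excluded from its own digest
			continue
		}
		rels = append(rels, rel)
	}
	sort.Strings(rels)
	h := sha256.New()
	for _, rel := range rels {
		h.Write([]byte(rel + "\x00" + files[rel]))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	fmt.Println(stagedDirDigest(map[string]string{
		"plugin.yaml":   "name: demo\n",
		"manifest.json": `{"sha256":"..."}`,
	}))
}
```

The NUL separator between path and content prevents ambiguity: without it, a file named "ab" containing "c" and a file named "a" containing "bc" would hash identically.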
Control 2 — Pinned-ref enforcement (tests 4-7):
Tests call GithubResolver.Fetch with/without "#ref" fragment.
Currently returns nil for bare refs → TestPluginInstall_UnpinnedRef_Rejected
fails (GitRunner IS called; no "pinned ref" in error message).
PLUGIN_ALLOW_UNPINNED=true escape hatch tested by test 7.
RED state summary (current):
go test ./internal/plugins/... -v -run TestPluginInstall
→ build failed: 5× undefined: VerifyManifestIntegrity
→ (with no-op stub) 2 runtime failures:
FAIL TestPluginInstall_SHA256Mismatch_AbortsInstall
FAIL TestPluginInstall_UnpinnedRef_Rejected
Backend Engineer implementation checklist:
[ ] Add supply_chain.go in package plugins with VerifyManifestIntegrity
[ ] Add pinned-ref gate to GithubResolver.Fetch in github.go
[ ] PLUGIN_ALLOW_UNPINNED=true check skips the gate
[ ] All 8 tests GREEN before merge
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolves 4 merge conflicts: Toolbar.tsx (2), Canvas.a11y.test.tsx (1),
Canvas.pan-to-node.test.tsx (1). All conflicts were additive — PR adds
selectedNodeId/setPanelTab selectors and the Audit toolbar button; main
didn't have them. Took PR additions throughout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two defenses against malicious plugins from uncontrolled sources:
1. **Pinned-ref enforcement** (resolveAndStage): github:// install/download
specs without a #<tag/sha> suffix are now rejected with HTTP 422. A
mutable default-branch tip could change between audit and install,
silently swapping in untrusted code. Override via PLUGIN_ALLOW_UNPINNED=true.
2. **SHA-256 content integrity** (installRequest.sha256): callers may
supply the expected hex SHA-256 of the fetched plugin.yaml. When present,
resolveAndStage verifies the digest after staging; a mismatch aborts the
install with HTTP 422 and cleans up the staging dir.
Updated TestPluginDownload_GithubSchemeStreamsTarball to use a pinned ref
(#v1.0.0) so it reflects the new security requirement.
Tests: 4 new (TestPluginInstall_SHA256Mismatch_AbortsInstall,
TestPluginInstall_SHA256Match_Succeeds, TestPluginInstall_UnpinnedRef_Rejected,
TestPluginInstall_PinnedRef_Accepted). All 15 packages green.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #724 (workspace hibernation) claimed migration number 029.
Renaming to 030 to resolve the sequence collision before merging #651.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>