Backend Engineer's PR #729 introduces ADMIN_TOKEN — when set, only that value
is accepted on /admin/* and /approvals/* routes, replacing the vulnerable
workspace-bearer fallback. Without the env var wired into deployments the fix
is code-only and the vulnerability stays open in every running instance.
Changes:
- `docker-compose.yml`: adds ADMIN_TOKEN env var to the platform service
(blank default = backward-compat fallback, i.e. still vulnerable until set).
NOTE: docker-compose.infra.yml has no platform service — the platform lives
only in the full-stack docker-compose.yml, so that is the correct file.
- `.env.example`: documents ADMIN_TOKEN with generation instructions and a
clear warning that it must be set to close#684.
- `infra/scripts/setup.sh`: prints a visible warning when ADMIN_TOKEN is unset
so operators know the vulnerability is still open in that deployment.
- `CLAUDE.md`: adds ADMIN_TOKEN to the env vars reference section.
No Go code changed — go build ./... passes clean.
Part of fix for #684 / PR #729
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Measured test counts (not guessed):
- Platform Go: 12 packages (was claiming 818 individual tests — now
reports package-level which is the go test output format)
- Canvas: 490 Vitest tests (33 files)
- workspace-template: 955 pytest tests (down from 1179 — 224 adapter-
specific tests moved to standalone template repos)
- molecule-app: 76 unit + 22 e2e (separate repo)
Architecture updates:
- CI section: documents manifest-driven Docker builds + reusable CI
workflows from molecule-ci repo for all 33 plugin/template repos
- Workspace Images section: already updated by prior PR (adapter repos)
- Test commands: accurate counts, standalone repo URLs with test counts
Code review fixes:
- 🟡#1: Replace python3 with jq in Dockerfile template stages (~50MB → ~2MB)
- 🟡#2: Add clone count verification to scripts/clone-manifest.sh
(set -e + expected vs actual count check — fails build if any clone fails)
- 🟡#3: Drop 'unsafe-eval' from CSP (not needed for Next.js production
standalone builds, only dev mode). Updated test assertion.
- 🟡#4: Remove broken pyproject.toml from workspace-template/ (it claimed
to package as molecule-ai-workspace-runtime but the directory structure
didn't match — the real package ships from the standalone repo)
- 🔵#1: Add version-pinning TODO comment to manifest.json
- 🔵#3: Add full repo URLs + test counts for SDK/MCP/CLI/runtime in CLAUDE.md
Security (GitGuardian alert):
- Removed Telegram bot token (8633739353:AA...) from template-molecule-dev
pm/.env — replaced with ${TELEGRAM_BOT_TOKEN} placeholder
- Removed Claude OAuth token (sk-ant-oat01-...) from template-molecule-dev
root .env — replaced with ${CLAUDE_CODE_OAUTH_TOKEN} placeholder
- Both tokens need immediate rotation by the operator
Tests: Platform middleware tests updated + all pass.
Remove plugins/, workspace-configs-templates/, org-templates/ dirs (now
in standalone repos). Add manifest.json listing all 33 repos and
scripts/clone-manifest.sh to clone them. Both Dockerfiles now use the
manifest script instead of 33 hardcoded git-clone lines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tick 32 (manual) merged a large batch of PRs — the test counts in
CLAUDE.md were drifting behind reality by enough to matter:
- platform: 816 → 818 (YAML injection fix + sanitizeRuntime allowlist)
- canvas: 453 → 482 (12 CookieConsent + 17 PricingTable/billing)
- workspace-template: 1180 → 1179 (Hermes Phase 2a/2b dispatch tests
landed but the test_hermes_providers env-var-leak fix removed a
fragile flake-path count; net -1)
This is measured not guessed: running the full suites on fresh main.
Not in this sync but worth mentioning for the next retrospective:
- controlplane repo received the full GDPR/admin/usage/consent/email
stack (#29-#34) — that work sits in molecule-controlplane, not
monorepo CLAUDE.md
- monorepo picked up /pricing route, cookie consent banner, molecule-
hitl plugin (#262), Hermes Phase 2a native Anthropic + 2b Gemini
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the 4-step hard-delete cascade implemented in
molecule-controlplane PR #29 (Stripe → Redis → Infra → DB rows),
how to read the org_purges audit table when a purge fails, the 30-day
GDPR deadline, and what the cascade deliberately does NOT cover
(WorkOS users, LLM provider history, Langfuse traces).
Cross-referenced from the "SaaS ops" block in CLAUDE.md so future
agents find it when handling erasure requests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Captures ~27 PRs merged across both repos this session: security
hardening cluster (#94/#99/#106/#110/#119/#162/#155/#167/#185/#200/#203/
#209/#233), data-integrity fixes (#212/#224/#236), CI runner migration
(#186), platform/scheduler reliability (#95/#149/#207/#206), workspace
runtime features (#205/#208/#198/#216/#225/#235/#231), code-review
follow-ups (#228/#232).
Updated counts: 816 Go (+70), 1180 Python (+40), 453 vitest (unchanged
— UI/a11y patches), 97 jest (unchanged).
CLAUDE.md additions:
- Idle Loop section (#205) under Architectural Patterns
- Admin auth middleware variants section linking docs/runbooks/admin-auth.md
- Migration runner section explaining the .down.sql filter (#212)
- Per-route auth notes in the API table (PATCH field-whitelist, CanvasOrBearer
on PUT /canvas/viewport, AdminAuth on bundles/events/templates-import/
approvals-pending/admin-liveness)
- Database section updated with workspace_auth_tokens auto-revoke (#110),
scheduler.error_detail surfacing (#206), workspace_schedules.last_status
'skipped' state (#207)
PLAN.md additions:
- New Recently launched (overnight sweep) section with full PR/issue index
- Phase status updated (B–G now complete, H partial)
- Live infrastructure deltas (migration fix, token rotation, legal pages)
- Outstanding items consolidated
Edit-history file expanded from the tick-9 stub to a full session record
covering malware cleanup, CI runner migration, security cluster, data
integrity, infra/feature/code-review batches, and outstanding user
actions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses PR #82 code review: 🟡×3 + 🔵×5.
- Fly registry login username: 'x' → 'molecule-ai' + explanatory comment.
- Build & push split into two steps (GHCR / Fly registry) so a single-
registry outage can't fail the other. Second step uses 'if: always()'
to ensure Fly mirror runs even if GHCR push flakes.
- docs/runbooks/saas-secrets.md: full secret map + rotation procedures
for every SaaS credential, with danger-case callouts. Documents the
coupled FLY_API_TOKEN (lives in GHA secret AND fly secrets — must be
rotated in both).
- CLAUDE.md: new 'SaaS ops' section linking to the runbook.
Phase B.2 companion to the private molecule-controlplane provisioner PR.
On every push to main that touches platform/**, builds platform/Dockerfile
and pushes to GHCR with two tags:
- :latest (floating, always main's tip)
- :sha-<short-commit> (immutable, pin-friendly)
Cache via GitHub Actions cache (cache-from: type=gha). Workflow_dispatch
trigger so we can re-publish after a docs-only merge if needed.
The private molecule-controlplane sets TENANT_IMAGE=ghcr.io/molecule-ai/platform:<tag>
and the provisioner creates each tenant Fly Machine from this image. Staying
on the same base image across tenants keeps upgrades atomic.
CLAUDE.md updated to document the new workflow in the CI pipeline section.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 32 foundation. The SaaS control plane (private molecule-controlplane
repo) provisions one platform instance per customer org on Fly Machines
and sets MOLECULE_ORG_ID=<uuid> on the machine. Its subdomain router
forwards requests with X-Molecule-Org-Id=<uuid>.
TenantGuard:
- When MOLECULE_ORG_ID is set → every non-allowlisted request must carry a
matching X-Molecule-Org-Id header. Mismatched/missing header → 404 (not
403 — don't leak tenant existence by letting probers distinguish "wrong
org" from "route doesn't exist").
- When unset → passthrough. Self-hosted / dev / CI behavior unchanged.
- Allowlist is exact-match, not prefix — /health and /metrics only.
No orgs table, no signup, no billing, no Fly provisioning in this repo —
all that lives in the private control plane. The public repo's SaaS
surface is exactly this one middleware.
6 tests covering: unset-is-passthrough, matching header, mismatched
header 404 (with empty body), missing header 404, allowlist bypass, and
allowlist-is-exact-match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- edit-history/2026-04-14.md: append tick-4 section covering the 12
modular guardrail plugins (#63), global-secrets auto-restart fan-out
(#64, fixes issue #15), and synthetic restart-context A2A message
(#65, fixes issue #19 Layer 1; Layer 2 deferred to issue #66).
- CLAUDE.md: bump Go test count 699 -> 726 (measured); note global
secrets auto-restart on SetGlobal/DeleteGlobal in the route table;
add Workspace Lifecycle paragraph for the restart-context message
and its system:restart-context caller prefix.
- PLAN.md: bump Go test count in the coverage table; record issues
#15 and #19 Layer 1 as launched; add new Backlog entry for the
Layer 2 follow-up (issue #66).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the proposed monolithic molecule-guardrails plugin with 12
single-purpose plugins users can install à la carte. Powered by a
small extension to the AgentskillsAdaptor base class so any plugin can
ship hooks/, commands/, and a settings-fragment.json without writing a
custom adapter.
## Base adapter changes
workspace-template/plugins_registry/builtins.py + sdk/python/molecule_plugin/builtins.py
(both copies — drift-tested):
- New _install_claude_layer() helper called at the end of install()
- Conditionally copies hooks/ → /configs/.claude/hooks/ (preserving exec bit)
- Conditionally copies commands/*.md → /configs/.claude/commands/
- Conditionally merges settings-fragment.json into /configs/.claude/settings.json
with ${CLAUDE_DIR} placeholder rewritten to the workspace's absolute install
path. Existing user hooks are preserved (deep-merge by event name).
- All steps no-op when the plugin doesn't ship the corresponding files,
so existing skill+rule plugins (molecule-dev, superpowers, ecc,
browser-automation) are unchanged.
Drift test (tests/test_plugins_builtins_drift.py) still passes.
## 12 new plugins
Hook plugins (ambient enforcement):
- molecule-careful-bash — refuses destructive bash; ships careful-mode skill
- molecule-freeze-scope — locks edits via .claude/freeze
- molecule-audit-trail — appends every Edit/Write to audit.jsonl
- molecule-session-context — auto-loads cron-learnings at session start
- molecule-prompt-watchdog — injects warnings on destructive prompt keywords
Skill plugins (on-demand):
- molecule-skill-code-review — 16-criteria multi-axis review
- molecule-skill-cross-vendor-review — adversarial second-model review
- molecule-skill-llm-judge — deliverable-vs-request scoring
- molecule-skill-update-docs — post-merge doc sync
- molecule-skill-cron-learnings — operational-memory JSONL format
Workflow plugins (slash commands):
- molecule-workflow-triage — /triage full PR-triage cycle
- molecule-workflow-retro — /retro + cron-retro skill, weekly retrospective
Each ships only what it needs — most have just plugin.yaml + skills/ or
hooks/ + adapter (one-line stub: `from plugins_registry.builtins import
AgentskillsAdaptor as Adaptor`). Total ~120 files but each plugin is
small and self-contained.
## Verification
- python3 -m molecule_plugin validate plugins/molecule-* → all 13 valid
(12 new + pre-existing molecule-dev)
- End-to-end install smoke test on representative samples: hook plugin
(molecule-careful-bash), skill-only plugin (molecule-skill-code-review),
workflow plugin (molecule-workflow-triage). All produce expected
/configs/ tree, settings.json paths rewritten, exec bits preserved,
zero warnings.
- workspace-template pytest tests/test_plugins_builtins_drift.py → passes
(SDK + runtime stay in sync).
## CLAUDE.md repo-doc updated
Lists all 12 new plugins under the existing Plugins section, organized
by category (hook / skill / workflow). Each entry one line, recommend-
together hints where dependencies make sense.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skills are opt-in (I have to remember to invoke them). Hooks are
ambient — they fire on every matching event automatically. This PR
moves the careful-mode and learnings discipline from "doc I should
read" to "harness-enforced behavior I cannot bypass".
## 6 new hooks (.claude/hooks/)
- pre-bash-careful — REFUSES git push --force to main, rm -rf at root,
DROP TABLE against prod schema. WARNs on force-with-lease, gh pr/
issue close. Tested: blocks the destructive case, allows safe ones.
- pre-edit-freeze — implements /freeze. When .claude/freeze contains
a path glob, edits outside it are denied. Tested: edits to PLAN.md
blocked when scope locked to platform/internal/handlers/.
- session-start-context — auto-loads last 20 cron-learnings, freeze
status, open-PR/issue counts as additionalContext at session start.
Tested: emits valid SessionStart JSON.
- post-edit-audit — appends every Edit/Write to .claude/audit.jsonl
(gitignored). One-line records {ts, tool, file, ok}. Tested writes.
- user-prompt-tag — injects context warnings when prompt mentions
force-push, drop-table, "delete all", "push to main", etc. Tested:
emits warning for "force push the fix to main".
- subagent-stop-judge — off by default; touch .claude/judge-subagents
to enable. When on, prompts orchestrator to verify subagent's last
message addresses the original task. Cost-free MVP (no LLM call yet).
All hooks are Python (jq isn't on the hook PATH on macOS — Python is).
Shared helpers in _lib.py: read_input, deny_pretooluse, add_context,
warn_to_stderr.
## settings.json — wires all 6 hooks
Adds SessionStart, UserPromptSubmit, SubagentStop event handlers.
Existing PreToolUse:Bash + PostToolUse:Edit chains gain the new hooks
alongside the existing ones (check-inbox.sh, echo reminder).
Adds @modelcontextprotocol/server-sequential-thinking MCP server for
structured chain-of-thought scratchpad — useful when triaging multiple
PRs in parallel without losing context.
## .claude/commands/triage.md — slash command shortcut
Manual /triage runs the same flow as the c5074cd5 hourly cron, on
demand. Saves ~4KB of prompt every invocation by pulling the cron
prompt out of working memory.
## CLAUDE.md additions
New "Agent operating rules (auto-loaded — read first)" section right
after Ecosystem Context. Documents:
- Cron / triage discipline (read learnings, treat docs PRs touching
CLAUDE.md/PLAN.md as noteworthy, write per-tick reflections)
- Table of all 6 hooks active in this repo
- List of skills and how to invoke them
- Standing rules (inviolable) consolidated for the agent
This block auto-loads into every conversation context — free behavior
change without me remembering to opt in.
## .gitignore
audit.jsonl, freeze, judge-subagents, per-tick-reflections.md are all
local operational state, never committed.
## Verification
- echo '{"tool_input":{"command":"git push --force origin main"}}' |
bash pre-bash-careful.sh → emits deny JSON ✓
- Same for git status (safe command) → empty output, exit 0 ✓
- pre-edit-freeze with .claude/freeze=platform/handlers/ blocks
edits to PLAN.md, allows edits inside the locked path ✓
- post-edit-audit appends valid JSONL ✓
- session-start-context emits additionalContext with PR/issue counts ✓
- user-prompt-tag emits warning for "force push to main" prompt ✓
- python3 -c "json.load(open('.claude/settings.json'))" → valid ✓
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves#14. ApplyTierConfig now reads TIER{2,3,4}_MEMORY_MB and
TIER{2,3,4}_CPU_SHARES env vars, falling back to the compiled defaults
agreed in the issue:
- T2: 512 MiB / 1024 shares (1 CPU) — unchanged baseline
- T3: 2048 MiB / 2048 shares (2 CPU) — new cap (previously uncapped)
- T4: 4096 MiB / 4096 shares (4 CPU) — new cap (previously uncapped)
CPU_SHARES follows Docker's 1024 = 1 CPU convention; internally the
value is translated to NanoCPUs for a hard allocation so behaviour
remains deterministic across hosts. Malformed or non-positive env
values silently fall back to the default.
Behaviour change note: T3 and T4 previously had no explicit cap.
Operators who relied on unlimited can set very large TIERn_MEMORY_MB /
TIERn_CPU_SHARES values; a follow-up can add unset-means-unlimited
semantics if required.
Tests:
- TestGetTierMemoryMB_DefaultsMatchLegacy
- TestGetTierMemoryMB_EnvOverride (covers malformed + zero fallback)
- TestGetTierCPUShares_EnvOverride
- TestApplyTierConfig_T3_UsesEnvOverride (wiring)
- TestApplyTierConfig_T3_DefaultCap (documents the new cap)
Docs: .env.example section + CLAUDE.md platform env-vars list updated.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- docs/edit-history/2026-04-14.md: append tick-3 section covering the
admin test-token route (#53), the prior-tick doc-sync PR (#54), and
the hermes required_env alignment (#55). Record measured test counts
(Go +4 for the TestAdminTestToken_* quartet).
- CLAUDE.md: bump Go test count 695 → 699 with a note pointing at the
new quartet. Route-table row and env-var mentions for the admin
route already landed with #53; verified on main.
- .env.example: add MOLECULE_ENABLE_TEST_TOKENS with a comment about
the prod-hidden default. Closes the code-review doc-sync flag from
#53 (var was in CLAUDE.md but missing from .env.example).
No PLAN.md / README.md / README.zh-CN.md update needed — none of the
three merges expose a user-visible surface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a gated admin endpoint that mints a fresh workspace bearer token on
demand, eliminating the register-race currently used by
test_comprehensive_e2e.sh (PR #5 follow-up).
- New handler admin_test_token.go: returns 404 unless MOLECULE_ENV != production
or MOLECULE_ENABLE_TEST_TOKENS=1. Hides route existence in prod (404 not 403).
- Mints via wsauth.IssueToken; logs at INFO without the token itself.
- Verifies workspace exists before minting (missing -> 404, never 500).
- Tests cover prod-hidden, enable-flag-overrides-prod, missing workspace,
and happy-path + token-validates round trip.
- tests/e2e/_lib.sh gains e2e_mint_test_token helper for downstream adoption.
- CLAUDE.md updated with route + env vars.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds HERMES_API_KEY to .env.example with a cross-reference to the
OPENROUTER_API_KEY fallback, and adds the hermes runtime row to the
CLAUDE.md runtime table so the new adapter is discoverable alongside
its siblings (langgraph, claude-code, openclaw, crewai, autogen,
deepagents).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Second-pass cleanup after the monolith split. Addresses every issue
from the code-review pass.
Core additions in src/api.ts:
- toMcpResult(data) + toMcpText(text): single source of truth for the
MCP text-content envelope (was ~87 duplicated literals)
- ApiError type + isApiError(v) guard: typed discriminated-union for
the error-by-value pattern; replaces open-coded shape checks
- apiCall<T = unknown>: generic so callers can document expected
response shape without unchecked "as" casts
Bulk cleanups across all 12 tools/*.ts:
- Every handler now returns toMcpResult(data) or toMcpText(text)
- Open-coded "typeof obj === 'object' && 'error' in obj" in
remote_agents.ts replaced with isApiError(v)
- Extracted initialCanvasPosition() helper out of
handleCreateWorkspace; explains why random seeding exists
- Added runtime/workspace_dir/workspace_access to create_workspace
zod schema (previously accepted by handler but hidden from clients)
src/index.ts:
- Replaced "export * from" with explicit named re-exports so the
public surface is auditable and future name collisions fail loudly
Tests:
- createServer() smoke test that records every srv.tool(...) call and
asserts 87 registered tools unique by name. Catches future PRs that
forget to wire a registerXxxTools(srv).
Docs:
- Fix broken relative links in sdk/python/molecule_agent/README.md
(was ../../examples/ from inside sdk/python/, should be ../examples/)
- Update stale "61 tools" -> "87 tools" in CLAUDE.md + main() log
Verification:
- npm run build clean
- npx jest -> 97/97 passed (was 96; +1 smoke test)
- grep "content: [{ type: \"text\" as const" src/tools/ -> 0 matches
- No file over 216 lines
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>