molecule-core/.env.example
core-devops 3747fe2f49 harden(security): remove dev-mode fail-open auth — fail-closed everywhere + dev-token + regression gate
CTO directive: "nothing should be fail-open." Remove the dev-mode fail-open
auth hatch so AdminAuth/WorkspaceAuth (and the discovery caller) ALWAYS
require a real credential: auth is fail-CLOSED in every environment, dev
included. Fix local dev so it stays AUTHENTICATED (not open), and add a
regression gate so fail-open cannot return.

Removed fail-open call-sites (workspace-server):
- internal/middleware/wsauth_middleware.go WorkspaceAuth — deleted the
  isDevModeFailOpen() short-circuit that let a bearer-less /workspaces/:id/*
  request through when MOLECULE_ENV=dev + ADMIN_TOKEN unset.
- internal/middleware/wsauth_middleware.go AdminAuth — deleted BOTH fail-open
  branches: the Tier-1 lazy-bootstrap (no live tokens + no ADMIN_TOKEN ⇒ pass,
  the C4 /org/import pre-empt hole) and the Tier-1b isDevModeFailOpen() dev
  hatch. HasAnyLiveTokenGlobal is still probed for the 503-on-outage semantics
  but opens no path.
- internal/handlers/discovery.go validateDiscoveryCaller — deleted the
  IsDevModeFailOpen() allow branch; discovery now requires a verified CP
  session or valid bearer in every env.
- Removed the isDevModeFailOpen()/IsDevModeFailOpen() helper entirely. The two
  legitimately non-auth uses (rate-limit relaxation in ratelimit.go, loopback
  bind default in cmd/server) now key on a new NON-security isLocalDevEnv()
  predicate (MOLECULE_ENV only, decoupled from ADMIN_TOKEN). CanvasOrBearer's
  cosmetic-only behaviour (PUT /canvas/viewport) is unchanged.
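
The fail-closed rule described in the list above can be sketched as a small net/http middleware. This is only an illustration under stated assumptions, not the code in wsauth_middleware.go: the names adminAuth and statusFor, the /admin/secrets path and the token values are hypothetical, and the real AdminAuth also handles the 503-on-outage probe and the deprecated workspace-bearer fallback, which are left out here.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// adminAuth is a hypothetical fail-closed middleware. It has no environment
// check at all: a request passes only when its bearer equals adminToken.
func adminAuth(adminToken string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		bearer, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		// An unset token means nothing can match, so the request is
		// rejected rather than waved through.
		if !ok || adminToken == "" ||
			subtle.ConstantTimeCompare([]byte(bearer), []byte(adminToken)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request through adminAuth and returns the status code.
// An empty bearer means no Authorization header is sent.
func statusFor(adminToken, bearer string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodGet, "/admin/secrets", nil)
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	}
	rec := httptest.NewRecorder()
	adminAuth(adminToken, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("", ""))             // 401: token unset, no bearer
	fmt.Println(statusFor("s3cret", "bad"))    // 401: wrong bearer
	fmt.Println(statusFor("s3cret", "s3cret")) // 200: matching bearer
}
```

The point of the sketch is that MOLECULE_ENV never appears in the auth decision, so no environment label can open a path.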

Dev path stays authenticated, not open:
- scripts/dev-start.sh provisions a deterministic ADMIN_TOKEN into .env and
  exports the matching NEXT_PUBLIC_ADMIN_TOKEN so the dev Canvas sends a real
  bearer (canvas/src/lib/api.ts already attaches it; next.config.ts pair-guard).
- Docs updated: .env.example, docs/quickstart.md, docs/architecture/overview.md.

Regression gate:
- internal/middleware/no_fail_open_test.go — asserts AdminAuth + WorkspaceAuth
  fail CLOSED (401) under the EXACT old-hatch conditions (ADMIN_TOKEN unset +
  MOLECULE_ENV=dev/development × hasLive 0/1). Proven RED against a temporarily
  restored hatch, GREEN after. Plus a source-guard test forbidding an
  isDevModeFailOpen()-style helper from re-appearing.
- Converted the stale fail-open assertions in wsauth_middleware_test.go,
  discovery_test.go, security_regression_685_686_687_688_test.go and the
  devmode/bind tests to pin the fail-closed contract.

Audit (other fail-open patterns on the auth surface): CanvasOrBearer and
validateDiscoveryCaller retain a fail-open-on-DB-error (and CanvasOrBearer a
no-token lazy-bootstrap) — both are documented availability tradeoffs on
cosmetic / low-sensitivity routes, left as-is and flagged for follow-up.

Verify: go build ./... ok; go vet middleware/cmd/handlers clean; full module
go test ./... = 46 ok / 0 fail.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 01:02:48 -07:00


# Postgres
# These defaults match docker-compose.infra.yml, which is the stack
# launched by `./infra/scripts/setup.sh`. Override for production.
POSTGRES_USER=dev
POSTGRES_PASSWORD=dev
POSTGRES_DB=molecule
# DATABASE_URL points at the host-published Postgres port so that
# `go run ./cmd/server` on the host (the README quickstart path) can
# connect. When running the platform *inside* docker-compose.yml, the
# compose file builds a DATABASE_URL with host `postgres` automatically
# from POSTGRES_USER/PASSWORD/DB above — that path ignores this value.
DATABASE_URL=postgres://dev:dev@localhost:5432/molecule?sslmode=disable
# Redis — same host-vs-container story as DATABASE_URL above.
REDIS_URL=redis://localhost:6379
# Platform
# PORT only applies to the Go platform (workspace-server). The Canvas pins
# itself to 3000 in canvas/package.json, so sourcing this file before
# `npm run dev` won't accidentally make Next.js try to bind 8080.
PORT=8080
# ---- Admin credential — REQUIRED in EVERY environment (auth is fail-closed) ----
# Auth is fail-CLOSED everywhere now (harden/no-fail-open-auth): there is NO
# dev-mode escape hatch. AdminAuth / WorkspaceAuth / discovery all require a
# real credential. The canvas authenticates by sending this value as a bearer
# (it reads NEXT_PUBLIC_ADMIN_TOKEN — set it to the SAME value).
# When ADMIN_TOKEN is set, only this value is accepted on /admin/* and /approvals/* routes.
# (When unset, a fresh install returns 401 on admin routes; once workspace tokens exist,
# any valid workspace bearer is accepted as the sole, deprecated fallback. Set ADMIN_TOKEN to close #684.)
# Generate: openssl rand -base64 32 (scripts/dev-start.sh provisions a fixed dev value)
# Store in fly secrets / deployment env — NEVER commit the actual value here.
ADMIN_TOKEN=
# NEXT_PUBLIC_ADMIN_TOKEN= # Canvas-side mirror of ADMIN_TOKEN. The canvas
# bakes this into its bundle and sends it as the
# bearer. MUST equal ADMIN_TOKEN (next.config.ts
# warns if the pair is half-set). dev-start.sh
# exports it for you.
SECRETS_ENCRYPTION_KEY= # 32-byte key (raw or base64). Leave empty for plaintext (dev only).
CONFIGS_DIR= # Path to workspace-configs-templates/ (auto-discovered if empty)
PLUGINS_DIR= # Path to plugins/ directory (default: /plugins in container)
# PLATFORM_URL=http://host.docker.internal:8080 # URL agent containers use to reach the platform; injected into workspace env. Default derives from PORT.
# MOLECULE_URL=http://localhost:8080 # Canonical MCP-client URL (mirrors PLATFORM_URL inside containers). Read by the MCP server (mcp-server/) and Molecule MCP tooling.
# MOLECULE_MCP_ALLOW_SEND_MESSAGE= # Set to "true" to include send_message_to_user in the MCP bridge tool list (issue #810). Excluded by default to prevent unintended WebSocket pushes from CLI sessions.
# MOLECULE_MCP_URL=http://localhost:8080 # Platform URL for opencode MCP config (opencode.json). Same as PLATFORM_URL; separate var so opencode configs can reference it without ambiguity.
# WORKSPACE_DIR= # Optional global host path bind-mounted to /workspace in every container. Per-workspace workspace_dir column overrides this; if neither is set each workspace gets an isolated Docker named volume.
MOLECULE_ENV=development # Environment label (development/staging/production). Used for log tagging and for NON-security local-dev conveniences (loopback HTTP bind, relaxed rate-limit bucket). It is NOT an auth lever — auth is fail-closed in every environment. SaaS deployments MUST set MOLECULE_ENV=production.
# MOLECULE_ENABLE_TEST_TOKENS= # Set to 1 to expose GET /admin/workspaces/:id/test-token (mints a fresh bearer token for E2E scripts). The route is auto-enabled when MOLECULE_ENV != production; this flag is the explicit override. Leave unset/0 in prod — the route 404s unless enabled.
# MOLECULE_ORG_ID= # SaaS only: org UUID set by control plane on tenant machines. When set, workspace provisioning auto-routes through the control plane API instead of Docker.
# CP_PROVISION_URL= # Override control plane URL for workspace provisioning (default: https://api.moleculesai.app). Only needed for testing against a non-production control plane.
# CORS / rate limiting
# CORS_ORIGINS=http://localhost:3000,http://localhost:3001 # Comma-separated allowed origins for the HTTP API.
# RATE_LIMIT=600 # Requests/minute per client (default 600).
# Activity retention
# ACTIVITY_RETENTION_DAYS=7 # Days to keep rows in activity_logs before pruning.
# ACTIVITY_CLEANUP_INTERVAL_HOURS=6 # How often the background pruner runs.
# Container/runtime detection
# MOLECULE_IN_DOCKER= # Set when running the platform inside Docker (accepts 1/0, true/false). Triggers A2A proxy to rewrite 127.0.0.1:<port> agent URLs to Docker bridge hostnames. Auto-detected via /.dockerenv; only set if detection fails or to force off.
# GitHub
# GITHUB_REPO=owner/repo # Target repo for agent initial_prompt clone (e.g. Molecule-AI/molecule-core). Read inside workspace containers.
# GITHUB_TOKEN= # Personal access token / installation token used by agents that clone private repos. Register as a global secret via POST /admin/secrets for propagation to workspace env. Token is used in-URL during clone and then scrubbed from .git/config via `git remote set-url`.
# Webhooks
# GITHUB_WEBHOOK_SECRET= # HMAC secret used to verify incoming GitHub webhook payloads at /webhooks/github.
# CLI clients
# MOLECLI_URL=http://localhost:8080 # URL the molecli TUI uses to reach the platform.
# Plugin install safeguards (POST /workspaces/:id/plugins)
# All three bound the cost of a single install so a slow/malicious
# source can't tie up a handler. Defaults are sane for typical use.
PLUGIN_INSTALL_BODY_MAX_BYTES=65536 # max request body size (default: 64 KiB)
PLUGIN_INSTALL_FETCH_TIMEOUT=5m # duration string; whole fetch+copy deadline
PLUGIN_INSTALL_MAX_DIR_BYTES=104857600 # max staged-tree size (default: 100 MiB)
# ---- Plugin supply chain hardening (issue #768, PR #775) ----
# Set to 'true' to allow unpinned plugin refs (no #tag/#sha). Local dev only.
# When unset or 'false' (default), installing a plugin from a source without
# an explicit ref is rejected — prevents supply chain attacks via floating HEAD.
# NEVER set in production. Pending: PR #775 must merge before this takes effect.
PLUGIN_ALLOW_UNPINNED=
# Phase 30.7 — remote-agent liveness threshold. Workspaces with
# runtime='external' are marked offline if their last_heartbeat_at is
# older than this many seconds. Slightly larger than the 60s Redis TTL
# so transient WAN hiccups don't flap online/offline. Set to 0 to use
# the built-in default (90s).
REMOTE_LIVENESS_STALE_AFTER=90
# ---- Workspace hibernation (issue #724, PR #724) ----
# Workspaces with no active tasks hibernate after this many minutes.
# Leave empty to disable. Per-workspace override via the hibernation_idle_minutes
# column (set via PATCH /workspaces/:id or org.yaml). This env var sets the
# platform-wide default applied to workspaces that have no per-workspace setting.
# Note: the global-default behaviour (reading this env var) is pending — currently
# only the per-workspace DB column is active. Setting this has no effect until that
# code lands.
HIBERNATION_IDLE_MINUTES=60
# Canvas
NEXT_PUBLIC_PLATFORM_URL=http://localhost:8080
NEXT_PUBLIC_WS_URL=ws://localhost:8080/ws
# Workspace Runtime
ANTHROPIC_API_KEY= # Anthropic API key (console.anthropic.com). Required for MODEL_PROVIDER=anthropic (default). Also used by workspaces that call the Anthropic SDK directly (e.g. molecule-hitl, hermes runtime). Register as a global secret via POST /settings/secrets so it is auto-propagated to all workspace containers — do NOT set only as a workspace-level secret or SDK-direct workspaces will silently fail with 401. See docs/runbooks/saas-secrets.md#anthropic_api_key.
OPENROUTER_API_KEY= # OpenRouter API key (openrouter.ai). Use with model: openrouter:anthropic/claude-3.5-haiku. Also acts as the fallback key for the hermes runtime when HERMES_API_KEY is unset.
HERMES_API_KEY= # Nous Research Portal API key (inference-prod.nousresearch.com). Used by the hermes runtime; falls back to OPENROUTER_API_KEY if unset.
GROQ_API_KEY= # Groq API key (console.groq.com). Use with model: groq:llama-3.3-70b-versatile
CEREBRAS_API_KEY= # Cerebras API key (cloud.cerebras.ai). Use with model: cerebras:llama3.1-8b
GOOGLE_API_KEY= # Google AI API key (aistudio.google.com). Use with model: google_genai:gemini-2.5-flash
MAX_TOKENS=2048 # Max output tokens for OpenRouter requests (default: 2048)
LANGGRAPH_RECURSION_LIMIT=500 # LangGraph/DeepAgents max ReAct steps per turn (lib default: 25; raised to 500 — PM fan-out to 6+ reports + synthesis routinely exceeds 100)
MODEL_PROVIDER=anthropic:claude-opus-4-7 # Format: provider:model. Providers: anthropic, openai, openrouter, groq, cerebras, google_genai, ollama
# ---- Workspace tier resource limits (issue #14) ----
# Per-tier memory/CPU caps applied to each workspace Docker container.
# CPU_SHARES follows the Docker convention: 1024 shares == 1 CPU.
# Any value <=0 or malformed falls back to the compiled default shown.
# Tier 1 is sandboxed (tmpfs, read-only) and is not resource-capped here.
TIER2_MEMORY_MB=512 # Standard tier memory cap (default 512 MiB)
TIER2_CPU_SHARES=1024 # Standard tier CPU (default 1024 = 1 CPU)
TIER3_MEMORY_MB=2048 # Privileged tier memory cap (default 2048 MiB; previously uncapped)
TIER3_CPU_SHARES=2048 # Privileged tier CPU (default 2048 = 2 CPU; previously uncapped)
TIER4_MEMORY_MB=4096 # Full-host tier memory cap (default 4096 MiB; previously uncapped)
TIER4_CPU_SHARES=4096 # Full-host tier CPU (default 4096 = 4 CPU; previously uncapped)
# Social Channels (optional — configure per-workspace via API or Canvas)
TELEGRAM_BOT_TOKEN= # Telegram Bot API token (talk to @BotFather). Used as default for new Telegram channels.
DISCORD_WEBHOOK_URL= # Discord Incoming Webhook URL (Server → Channel → Integrations → Webhooks). Used by Community Manager workspace.
# CI/CD Slack notifications (issue #624)
# Add SLACK_CI_WEBHOOK_URL as a GitHub Actions secret (repo Settings → Secrets → Actions).
# When set, CI failures in platform-build, canvas-build, python-lint, shellcheck,
# and e2e-api workflows post an alert to the configured #ci-alerts Slack channel.
# Obtain: Slack App → Incoming Webhooks → Add to channel → copy URL.
# Leave unset to disable (jobs skip silently — no build failure).
SLACK_CI_WEBHOOK_URL= # https://hooks.slack.com/services/...
# Langfuse (optional observability)
LANGFUSE_HOST=http://langfuse-web:3000
LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
# ---- EU AI Act Annex III compliance — molecule-audit-ledger (#594) ----
# Secret salt for PBKDF2 key derivation (HMAC-SHA256 chain verification).
# When set, GET /workspaces/:id/audit derives the HMAC key and verifies the
# chain inline, returning "chain_valid": true/false in the response.
# When unset, "chain_valid": null — use the CLI to verify:
# python -m molecule_audit.verify --agent-id <id>
# Must match AUDIT_LEDGER_SALT set in each workspace container.
# AUDIT_LEDGER_SALT= # 32+ random bytes (base64 or arbitrary string)
# ---- Operator identity (for org-templates/reno-stars/, see OPERATOR_NOTES.md) ----
# These are NOT consumed by the platform itself — they're documented here so
# operators of the reno-stars template (and any future operator-personalised
# template) know what to set as global_secrets. The platform injects every
# global_secret into every workspace container as an env var; the agent
# system-prompts reference them via ${VAR_NAME}.
OPERATOR_EMAIL= # e.g. you@example.com
OPERATOR_PHONE= # e.g. 555-123-4567 (display only, not used for SMS)
OPERATOR_TELEGRAM_ID= # numeric Telegram user ID (for bot DMs)
GADS_MCC_ID= # Google Ads MCC (manager) account ID, format 123-456-7890
GADS_CUSTOMER_ID= # Google Ads child customer ID, format 987-654-3210
GCP_PROJECT_ID= # Google Cloud project ID (e.g. my-website-123456)
GSC_SERVICE_ACCOUNT= # Search Console reporter service account email
# ---- opencode / remote MCP client auth (see docs/integrations/opencode.md) ----
# MOLECULE_MCP_URL is the base URL of the Molecule platform's /mcp endpoint.
# MOLECULE_MCP_TOKEN is a workspace-scoped bearer token issued via
# POST /workspaces/:id/tokens (scopes: mcp:read, mcp:delegate).
# Token goes in Authorization: Bearer header — never embed in the URL.
MOLECULE_MCP_URL= # e.g. https://api.molecule.ai or http://localhost:8080
MOLECULE_MCP_TOKEN= # workspace-scoped bearer token — NEVER COMMIT
# ---- workspace-template image refresh ----
# IMAGE_AUTO_REFRESH=true makes the platform poll GHCR every 5 min for digest
# changes on each workspace-template-*:latest. When a digest moves, the
# platform pulls the new image and force-recreates matching ws-* containers
# (same code path as POST /admin/workspace-images/refresh), so a published
# runtime reaches running workspaces with zero operator steps.
# Default in docker-compose.yml is "true" for local dev so the runtime → ws
# loop is tight; explicit override here lets you turn it off when running a
# long test that shouldn't be disturbed by a publish.
IMAGE_AUTO_REFRESH= # true|false; unset = inherit compose default (true for local dev)
# GHCR_USER + GHCR_TOKEN are required only for private template images
# (current workspace-template-* set is public; both can stay unset).
GHCR_USER=
GHCR_TOKEN=