Closes the canary loop with an escape hatch (the rollback script) and
a single place to read about the whole flow.
scripts/rollback-latest.sh <sha>
uses crane to retag :latest ← :staging-<sha> for BOTH the platform
and tenant images. Pre-checks that the target tag exists and verifies
the :latest digest after the move, so a bad ops typo doesn't silently
promote the wrong thing. Prod tenants auto-update to the rolled-back
digest within their 5-min cycle. Exit codes: 0 = both retagged,
1 = registry/tag error, 2 = usage error.
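The retag flow, roughly, as a minimal sketch (the registry path and
image names below are placeholders, not the real repos):

    #!/usr/bin/env bash
    # Sketch only: retag :latest to the staging image and verify the
    # digest actually moved. Registry and image names are hypothetical.
    set -euo pipefail
    if [ "$#" -ne 1 ]; then
      echo "usage: $0 <sha>" >&2
      exit 2                                   # 2 = usage error
    fi
    SHA="$1"
    REGISTRY="ghcr.io/example"                 # placeholder registry/org
    for IMAGE in platform tenant; do           # both images get retagged
      SRC="$REGISTRY/$IMAGE:staging-$SHA"
      WANT="$(crane digest "$SRC")" || exit 1  # pre-check: tag must exist
      crane tag "$SRC" latest                  # point :latest at it
      GOT="$(crane digest "$REGISTRY/$IMAGE:latest")"
      if [ "$GOT" != "$WANT" ]; then
        echo "digest mismatch for $IMAGE" >&2
        exit 1                                 # 1 = registry/tag error
      fi
    done
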
docs/architecture/canary-release.md
The one-page map of the pipeline: how PR → main → staging-<sha> →
canary smoke → :latest promotion works end-to-end, how to add a
canary tenant, how to roll back, and what this gate explicitly does
NOT catch (prod-only data, config drift, cross-tenant bugs).
No code changes in the CP or workspace-server — this PR is shell
+ docs only, so it's safe to land independently of the other Phase
{1,1.5,2,3} PRs still in review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the 10-PR staging→main cutover: what shipped, the three new
Railway prod env vars (PROVISION_SHARED_SECRET / EC2_VPC_ID /
CP_BASE_URL), and the sharp edge for existing tenants — their
containers pre-date PR #53, so they still need MOLECULE_CP_SHARED_SECRET
added manually (or a re-provision) before the new CPProvisioner's
outbound bearer works.
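A quick way to find tenants that still need it (the container name is
a placeholder for whatever the workspace container is called):

    # Succeeds silently if the secret is set; warns if it is missing.
    docker exec workspace-server \
      printenv MOLECULE_CP_SHARED_SECRET >/dev/null \
      || echo "missing: recreate the container with it, or re-provision"
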
Also includes a post-deploy verification checklist and rollback plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE.md was a 44KB catch-all mixing architecture docs (useful for
everyone) with agent operating instructions (internal). Split:
- docs/architecture/overview.md — system architecture, component
descriptions, 13 key patterns (import cycles, health detection,
communication rules, WebSocket flow, lifecycle, etc.)
- docs/api-reference.md — full REST API route table + database schema
- CLAUDE.md → gitignored (stays local for agent tooling)
All internal PR/issue references stripped from the new docs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove compiled workspace-server/server binary from git
- Fix .gitignore, .gitattributes, .githooks/pre-commit for renamed dirs
- Fix CI workflow path filters (workspace-template → workspace)
- Replace real EC2 IP and personal slug in test_saas_tenant.sh
- Scrub molecule-controlplane references in docs
- Fix stale workspace-template/ paths in provisioner, handlers, tests
- Clean tracked Python cache files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Security:
- Replace hardcoded Cloudflare account/zone/KV IDs in wrangler.toml
with placeholders; add wrangler.toml to .gitignore, ship .example
- Replace real EC2 IPs in docs with <EC2_IP> placeholders
- Redact partial CF API token prefix in retrospective
- Parameterize Langfuse dev credentials in docker-compose.infra.yml
- Replace Neon project ID in runbook with <neon-project-id>
Community:
- Add CONTRIBUTING.md (build, test, branch conventions, CI info)
- Add CODE_OF_CONDUCT.md (Contributor Covenant 2.1)
Cleanup:
- Replace personal runner username/machine name in CI + PLAN.md
- Replace personal tenant URL in MCP setup guide
- Replace personal author field in bundle-system doc
- Replace personal login in webhook test fixture
- Rewrite cryptominer incident reference as generic security remediation
- Remove private repo commit hashes from PLAN.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full staging environment that mirrors production. Every infra change
ships to staging before promotion. Gates Phase 33 (Tunnel) and
Phase 35 (security hardening).
Components: Railway staging env, Neon branch, staging DNS, tagged
Docker images, promotion workflow, automated smoke tests.
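The smoke tests can stay as simple as probing the staging deployment
after each promotion; a minimal sketch with a placeholder hostname and
endpoints:

    #!/usr/bin/env bash
    # Minimal staging smoke test: any HTTP error response fails the run.
    set -euo pipefail
    BASE="https://staging.example.com"       # placeholder staging host
    for path in /healthz /api/health; do     # placeholder endpoints
      curl -fsS --max-time 10 "$BASE$path" >/dev/null
      echo "ok: $path"
    done
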
Also marks Phase 33 as migrating from Worker to Cloudflare Tunnel
(issue #933), with staging as its prerequisite.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents three upgrade strategies for keeping tenant EC2 instances
current with platform-tenant:latest:
- Option A: Rolling restart via CP admin endpoint (coordinated)
- Option B: Sidecar auto-updater cron (implemented, 5-min interval;
  sketched below)
- Option C: Blue-green via Worker (zero downtime, future)
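The Option B updater reduces to something like this on the instance
(the install dir, service name, and compose usage are placeholders for
however the tenant container is actually run):

    #!/usr/bin/env bash
    # Sketch of the Option B updater: pull :latest and let compose
    # recreate the container only when the image digest changed.
    set -euo pipefail
    cd /opt/workspace                 # placeholder install dir
    docker compose pull workspace     # fetch platform-tenant:latest
    docker compose up -d workspace    # no-op unless the image changed

    # Placeholder crontab entry for the 5-minute cycle:
    # */5 * * * * /opt/workspace/auto-update.sh >>/var/log/auto-update.log 2>&1
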
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses all 4 review points from PR #786:
1. Worker resilience: 3-tier cache (in-memory → KV → CP API) with stale
fallback so CP outages are invisible to tenants
2. WebSocket proxying: documented upgradeHeader handling, fallback to
keep Caddy for WS-only if Workers WS is unreliable
3. SG automation: note to auto-update Cloudflare IP ranges, don't hardcode
4. Trusted proxy: X-Forwarded-For / CF-Connecting-IP trust chain documented
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Phase 33 plan and architecture doc for replacing per-tenant DNS
records with a wildcard DNS + Cloudflare Worker proxy pattern.
Eliminates: DNS propagation delays, NXDOMAIN caching, per-instance
Let's Encrypt, Caddy on EC2. Same pattern used by Vercel, Railway,
Fly.io, WordPress, n8n.
4-phase migration: deploy Worker → stop creating DNS records →
remove Caddy from EC2 → cleanup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>