CLAUDE.md was a 44KB catch-all mixing architecture docs (useful for
everyone) with agent operating instructions (internal). Split:
- docs/architecture/overview.md — system architecture, component
descriptions, 13 key patterns (import cycles, health detection,
communication rules, WebSocket flow, lifecycle, etc.)
- docs/api-reference.md — full REST API route table + database schema
- CLAUDE.md → gitignored (stays local for agent tooling)
All internal PR/issue references stripped from the new docs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove compiled workspace-server/server binary from git
- Fix .gitignore, .gitattributes, .githooks/pre-commit for renamed dirs
- Fix CI workflow path filters (workspace-template → workspace)
- Replace real EC2 IP and personal slug in test_saas_tenant.sh
- Scrub molecule-controlplane references in docs
- Fix stale workspace-template/ paths in provisioner, handlers, tests
- Clean tracked Python cache files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Security:
- Replace hardcoded Cloudflare account/zone/KV IDs in wrangler.toml
with placeholders; add wrangler.toml to .gitignore, ship .example
- Replace real EC2 IPs in docs with <EC2_IP> placeholders
- Redact partial CF API token prefix in retrospective
- Parameterize Langfuse dev credentials in docker-compose.infra.yml
- Replace Neon project ID in runbook with <neon-project-id>
Community:
- Add CONTRIBUTING.md (build, test, branch conventions, CI info)
- Add CODE_OF_CONDUCT.md (Contributor Covenant 2.1)
Cleanup:
- Replace personal runner username/machine name in CI + PLAN.md
- Replace personal tenant URL in MCP setup guide
- Replace personal author field in bundle-system doc
- Replace personal login in webhook test fixture
- Rewrite cryptominer incident reference as generic security remediation
- Remove private repo commit hashes from PLAN.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Full staging environment that mirrors production. Every infra change
ships to staging first before promotion. Gates Phase 33 (Tunnel) and
Phase 35 (security hardening).
Components: Railway staging env, Neon branch, staging DNS, tagged
Docker images, promotion workflow, automated smoke tests.
Also marks Phase 33 as migrating from Worker to Cloudflare Tunnel
(issue #933), prerequisite: staging.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents three upgrade strategies for keeping tenant EC2 instances
current with platform-tenant:latest:
- Option A: Rolling restart via CP admin endpoint (coordinated)
- Option B: Sidecar auto-updater cron (implemented, 5 min interval)
- Option C: Blue-green via Worker (zero downtime, future)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses all 4 review points from PR #786:
1. Worker resilience: 3-tier cache (in-memory → KV → CP API) with stale
fallback so CP outages are invisible to tenants
2. WebSocket proxying: documented upgradeHeader handling, fallback to
keep Caddy for WS-only if Workers WS is unreliable
3. SG automation: note to auto-update Cloudflare IP ranges, don't hardcode
4. Trusted proxy: X-Forwarded-For / CF-Connecting-IP trust chain documented
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds Phase 33 plan and architecture doc for replacing per-tenant DNS
records with a wildcard DNS + Cloudflare Worker proxy pattern.
Eliminates: DNS propagation delays, NXDOMAIN caching, per-instance
Let's Encrypt, Caddy on EC2. Same pattern used by Vercel, Railway,
Fly.io, WordPress, n8n.
4-phase migration: deploy Worker → stop creating DNS records →
remove Caddy from EC2 → cleanup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>