arch/RFC: workspace-placement (org-per-EC2 architecture) — formalize the implicit design #1793

Closed
opened 2026-05-24 09:13:21 +00:00 by hongming · 0 comments
Owner

Summary

Formalize the workspace-placement architecture as a written RFC. The pattern is already implicitly in place: each org/tenant gets one EC2 instance, fully isolated, with its own DB and memory plugin. Memory and other state live on the tenant (not in the platform), with SSOT and tenant isolation guaranteed by the org-per-EC2 boundary.

This was the design implicit in the 2026-05-24 memory-system migration: the v2 plugin runs as a sidecar on each tenant's EC2 instance, sharing the tenant's Postgres, and is isolated by the EC2 boundary rather than by application-level multi-tenancy.
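
To make the isolation model concrete, here is a minimal sketch of what "isolated by the EC2 boundary" implies for the data model. It is not taken from molecule-core: the table name, the helper functions, and the use of in-memory SQLite as a stand-in for each tenant's Postgres are all assumptions for illustration. The point it shows is that a per-tenant store needs no `org_id` column or tenant filter, because the store itself belongs to exactly one org.

```python
import sqlite3
from typing import Optional


def open_tenant_store() -> sqlite3.Connection:
    """Open a store owned by exactly one tenant.

    Hypothetical stand-in: in-memory SQLite plays the role of the
    tenant's own Postgres on its own EC2 instance.
    """
    conn = sqlite3.connect(":memory:")
    # Note: no org_id / tenant_id column. Isolation comes from the
    # store being separate, not from filtering rows per tenant.
    conn.execute(
        "CREATE TABLE memories (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, key: str, value: str) -> None:
    """Write a memory into this tenant's store."""
    conn.execute(
        "INSERT OR REPLACE INTO memories (key, value) VALUES (?, ?)",
        (key, value),
    )


def recall(conn: sqlite3.Connection, key: str) -> Optional[str]:
    """Read a memory from this tenant's store, or None if absent."""
    row = conn.execute(
        "SELECT value FROM memories WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


# Two orgs means two completely separate stores.
org_a = open_tenant_store()
org_b = open_tenant_store()

remember(org_a, "greeting", "hello from A")
print(recall(org_a, "greeting"))  # hello from A
print(recall(org_b, "greeting"))  # None
```

Under application-level multi-tenancy, every query would instead need a tenant filter, and a single missed filter could leak data across orgs. With one store per org, that class of bug cannot occur at the query layer.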

Why write it down

  • Future architectural decisions (multi-region, dedicated-tier customers, BYO-compute, etc.) need to reference this pattern.
  • Onboarding new engineers: the mental model "platform is billing/infra-only; tenants are fully functional and self-hosted-shaped" needs an authoritative doc.
  • Implicit-design drift is a real failure mode (e.g., someone could try to centralize memory at the platform layer, breaking SSOT and isolation).

Scope of the RFC

  1. Architecture diagram — platform vs tenant boundary, what crosses (provisioning + billing + tunnel auth) vs what stays (DB, memory, workspace files, agent state).
  2. SSOT rationale — why each tenant owns its own state rather than platform aggregating. Cite this session's "no v1 fallback, v2 plugin is SSOT on the tenant" decision.
  3. OSS-deployment shape — molecule-core is OSS; anyone can deploy a tenant. Their tenant connects to the platform for billing and provisioning but otherwise runs identically to our hosted tenants. Document the workspace's "inject org credentials + URL" pattern that lets it reach its tenant server.
  4. Scaling envelope — what org count does this design support? When does a customer need a dedicated tier (region-pinned, larger EC2, etc.) instead of the default shared shape?
  5. Migration path — for any legacy code that still treats platform as the data-owner, document the deprecation plan.
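
As a starting point for the architecture diagram in item 1, the boundary can be sketched as data. The concern names below come straight from item 1; the `Side` enum, the `BOUNDARY` mapping, and the `violates_placement` helper are hypothetical names introduced only for this illustration and do not exist in molecule-core.

```python
from enum import Enum
from typing import Iterable, List


class Side(Enum):
    CROSSES = "crosses the platform/tenant boundary"
    STAYS = "stays on the tenant"


# Concerns as listed in scope item 1 of this issue.
BOUNDARY = {
    "provisioning": Side.CROSSES,
    "billing": Side.CROSSES,
    "tunnel auth": Side.CROSSES,
    "DB": Side.STAYS,
    "memory": Side.STAYS,
    "workspace files": Side.STAYS,
    "agent state": Side.STAYS,
}


def violates_placement(platform_side_stores: Iterable[str]) -> List[str]:
    """Return the proposed platform-side stores that hold tenant-owned state.

    Names not present in BOUNDARY are not flagged; they would need to be
    classified by the RFC before this check says anything about them.
    """
    return sorted(
        name
        for name in platform_side_stores
        if BOUNDARY.get(name) is Side.STAYS
    )


print(violates_placement(["billing", "memory"]))  # ['memory']
```

A check of this shape could back the trigger "anyone proposing a platform-side state store that competes with tenant-side state": a proposal that places `memory` on the platform is flagged, while `billing` is not.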

Location

docs/architecture/workspace-placement.md or internal/runbooks/workspace-placement-rfc.md — wherever architecture decisions live.

Triggers

  • Multi-region or BYO-compute customer ask → RFC must be done first
  • Onboarding a new platform engineer
  • Anyone proposing a platform-side state store that competes with tenant-side state

Acceptance

  • RFC document committed under docs/architecture/ or internal/runbooks/
  • Cross-linked from docs/architecture/molecule-technical-doc.md and from this session's memory.md updates
  • Reviewed by Cui (CEO) — this is a product decision as much as an engineering one
  • Saved as a memory pointer (reference_workspace_placement_rfc) for future agents

Discovered during

Multi-PR memory-system migration, 2026-05-24. The architecture was the implicit basis for every design decision in the migration and needs to be made explicit.

Reference: molecule-ai/molecule-core#1793