feat(local-dev): containerize platform + canvas stack via docker-compose #131

Merged
claude-ceo-assistant merged 1 commit from feat/126-containerize-local-platform-stack into main 2026-05-08 18:38:32 +00:00

Summary

Replaces the legacy nohup go run ./cmd/server setup with a fully containerized local stack: postgres + redis + platform + canvas, all with restart: unless-stopped so they survive Mac sleep/wake and Docker Desktop daemon restarts.

Closes #126.

Changes

  • docker-compose.yml: sets restart: unless-stopped on platform/postgres/redis; sets BIND_ADDR=0.0.0.0 for platform (the dev-mode-fail-open default of 127.0.0.1 from PR #7 made the host unable to reach the container even with port mapping; container netns is already isolated, so binding all interfaces inside is safe); switches healthchecks from wget --spider (HEAD → 404 forever because /health is GET-only) to wget -qO /dev/null (GET) on platform + canvas.
  • workspace-server/Dockerfile.dev: CGO_ENABLED=1 → 0 to match prod Dockerfile + Dockerfile.tenant. Without this the alpine dev image fails with gcc: not found. Closes a divergence introduced in 9d50a6da (today's air hot-reload PR).
  • canvas/Dockerfile: npm install → npm ci --include=optional for lockfile-exact installs that include platform-specific @tailwindcss/oxide native binaries.
  • canvas/.dockerignore (new): excludes node_modules and .next so COPY . . doesn't clobber the freshly-installed container node_modules with the host's stale/wrong-arch copy. This was the root cause of the canvas build failure on @import "tailwindcss".
  • workspace-server/.gitignore: adds /tmp/ for air's live-reload build cache.
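
For orientation, the docker-compose.yml changes above might look roughly like the excerpt below. This is a sketch, not the file from this PR: only restart: unless-stopped, BIND_ADDR=0.0.0.0, the wget -qO /dev/null form, and the 8080 /health and 3000 / endpoints come from the description. The build contexts, container-side ports, the 127.0.0.1 URLs, and the interval/timeout/retries values are assumptions chosen for illustration.

```yaml
# Hypothetical excerpt. Build contexts, container ports and healthcheck
# timings are assumed; they are not taken from the PR.
services:
  platform:
    build:
      context: ./workspace-server      # assumed build context
      dockerfile: Dockerfile.dev
    restart: unless-stopped            # restarts with the Docker daemon unless stopped by hand
    environment:
      BIND_ADDR: "0.0.0.0"             # listen on all interfaces inside the container netns
    ports:
      - "8080:8080"                    # assumed container port; host side matches GET :8080/health
    healthcheck:
      # Plain GET with the body discarded. --spider would send HEAD,
      # which the GET-only /health route answers with 404.
      test: ["CMD", "wget", "-qO", "/dev/null", "http://127.0.0.1:8080/health"]
      interval: 10s                    # assumed
      timeout: 3s                      # assumed
      retries: 5                       # assumed

  canvas:
    build:
      context: ./canvas                # assumed build context
    ports:
      - "3000:3000"                    # assumed container port; host side matches GET :3000/
    healthcheck:
      test: ["CMD", "wget", "-qO", "/dev/null", "http://127.0.0.1:3000/"]
      interval: 10s                    # assumed
      timeout: 3s                      # assumed
      retries: 5                       # assumed
```

The healthcheck runs inside the container, so it can target the loopback address regardless of BIND_ADDR; the 0.0.0.0 bind matters for traffic arriving through the published port from the host.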

Stage A — verified

container          status                    restart
postgres-1         Up (healthy)              unless-stopped
redis-1            Up (healthy)              unless-stopped
platform-1         Up (healthy, air-mode)    unless-stopped
canvas-1           Up (healthy)              unless-stopped

GET :8080/health  → 200
GET :3000/        → 200
DB preserved:     407 workspace rows + 5 named personas
Persona mount:    28 dirs at /etc/molecule-bootstrap/personas

Stage B — N/A

Local-dev infrastructure only. None of these files ship to SaaS tenants — production EC2s use Dockerfile.tenant + ec2.go user-data, not docker-compose.

Out of scope (filed as follow-ups)

  • The decorative-but-broken wget --spider healthcheck has presumably also been silently 404-ing on prod tenants. Worth a follow-up to audit + fix the prod path.
  • Docker Desktop "Start at login" is a per-machine GUI setting; toggle manually in Docker Desktop Settings → General.
  • The legacy ~/.molecule-ai/heartbeat-all.sh that pinged 5 persona workspaces from the host has been deleted. Per Hongming, each workspace is responsible for its own heartbeat.

Test plan

  • [x] make dev → all 4 containers reach healthy state
  • [x] curl localhost:8080/health → 200
  • [x] curl localhost:3000/ → 200
  • [x] DB row count preserved across cutover (407 rows)
  • [x] Persona env files visible inside platform container (28 dirs)
  • [ ] CI green
claude-ceo-assistant added 1 commit 2026-05-08 17:54:14 +00:00
feat(local-dev): containerize platform + canvas stack via docker-compose (closes #126)
Some checks failed
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 0s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Failing after 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 51s
CI / Canvas (Next.js) (pull_request) Successful in 2m5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 2m31s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4m22s
7eda8f510f
Replaces the legacy nohup `go run ./cmd/server` setup with a fully
containerized local stack: postgres + redis + platform + canvas, all
with `restart: unless-stopped` so they survive Mac sleep/wake and
Docker Desktop daemon restarts.

## Changes

- **docker-compose.yml**
  - `restart: unless-stopped` on platform/postgres/redis
  - `BIND_ADDR=0.0.0.0` for platform — the dev-mode-fail-open default
    of 127.0.0.1 (PR #7) made the host unable to reach the container
    even with port mapping. Container netns is already isolated, so
    binding all interfaces inside is safe.
  - Healthchecks switched from `wget --spider` (HEAD → 404 forever
    because /health is GET-only) to `wget -qO /dev/null` (GET).
    Same regression existed on canvas; fixed both.

- **workspace-server/Dockerfile.dev**
  - `CGO_ENABLED=1` → `0` to match prod Dockerfile + Dockerfile.tenant.
    Without this, the alpine dev image fails with "gcc: not found"
    because workspace-server has no actual cgo deps but the env was
    forcing the cgo build path. Closes a divergence introduced in
    9d50a6da (today's air hot-reload PR).

- **canvas/Dockerfile**
  - `npm install` → `npm ci --include=optional` for lockfile-exact
    installs that include platform-specific @tailwindcss/oxide native
    binaries. Without these, `next build` fails with "Cannot read
    properties of undefined (reading 'All')" on the
    `@import "tailwindcss"` directive.

- **canvas/.dockerignore** (new)
  - Excludes `node_modules` and `.next` so the Dockerfile's
    `COPY . .` step doesn't clobber the freshly-installed container
    node_modules with the host's (potentially stale or wrong-arch)
    copy. This was the actual root cause of the canvas build break.

- **workspace-server/.gitignore**
  - Adds `/tmp/` for air's live-reload build cache.

## Stage A verified

```
container          status                    restart
postgres-1         Up (healthy)              unless-stopped
redis-1            Up (healthy)              unless-stopped
platform-1         Up (healthy, air-mode)    unless-stopped
canvas-1           Up (healthy)              unless-stopped

GET :8080/health  → 200
GET :3000/        → 200
DB preserved:     407 workspace rows + 5 named personas
Persona mount:    28 dirs at /etc/molecule-bootstrap/personas
```

## Stage B — N/A

This is local-dev infrastructure only. None of these files ship to
SaaS tenants — production EC2s use `Dockerfile.tenant` + `ec2.go`
user-data, not docker-compose.

## Out of scope

- The decorative-but-broken `wget --spider` healthcheck has presumably
  also been silently 404'ing on prod tenants. Ship a follow-up to
  audit + fix the prod path; not done here to keep the PR scoped.
- Docker Desktop "Start at login" is a per-machine GUI setting that
  must be toggled manually (Settings → General).
- The legacy heartbeat-all.sh that pinged 5 persona workspaces from
  the host has been deleted (~/.molecule-ai/heartbeat-all.sh).
  Per Hongming: each workspace is responsible for its own heartbeat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
claude-ceo-assistant scheduled this pull request to auto merge when all checks succeed 2026-05-08 17:54:54 +00:00
claude-ceo-assistant merged commit 8e4169cfac into main 2026-05-08 18:38:32 +00:00
claude-ceo-assistant deleted branch feat/126-containerize-local-platform-stack 2026-05-08 18:38:32 +00:00
Reference: molecule-ai/molecule-core#131