forked from molecule-ai/molecule-core
Closes the gap between the merged Memory v2 code (PR #2757 wired the client into main.go) and operator activation. Without this PR an operator wanting to flip MEMORY_V2_CUTOVER=true had to provision a separate memory-plugin service and point MEMORY_PLUGIN_URL at it, which is extra ops surface for what the design intends to be a built-in.

What ships:

* Both Dockerfile + Dockerfile.tenant build the cmd/memory-plugin-postgres binary into /memory-plugin.
* Entrypoints spawn the plugin in the background on :9100 BEFORE starting the main server; wait up to 30s for /v1/health to return 200; abort boot loud if it doesn't (better to crash-loop than to silently route cutover traffic against a dead plugin).
* Default env: MEMORY_PLUGIN_DATABASE_URL=$DATABASE_URL (share the existing tenant Postgres; the plugin's `memory_namespaces` / `memory_records` tables coexist with platform schema, no conflicts), MEMORY_PLUGIN_LISTEN_ADDR=:9100.
* MEMORY_PLUGIN_DISABLE=1 escape hatch for operators running the plugin externally on a separate host.
* Platform image: plugin runs as the `platform` user (not root) via su-exec, which matches the privilege boundary the main server already drops to. Tenant image already starts as `canvas` so the plugin inherits non-root automatically.

What stays operator-controlled:

* MEMORY_V2_CUTOVER is NOT auto-set. Behavior change for existing deployments: zero. The wiring at workspace-server/internal/memory/wiring/wiring.go skips building the plugin client until the operator opts in, so the running sidecar is a no-op for traffic until then.
* MEMORY_PLUGIN_URL is NOT auto-set either, for the same reason: setting it implies cutover-active intent. Operators set both on staging first, verify a live commit/recall round-trip (closes pending task #292), then promote to production.

Operator activation steps after this PR ships:

1. Verify pgvector extension is available on the target Postgres (the plugin's first migration runs CREATE EXTENSION IF NOT EXISTS vector). Railway's managed Postgres ships pgvector available; some self-hosted operators may need to enable it.
2. Redeploy the workspace-server with this image.
3. Set MEMORY_PLUGIN_URL=http://localhost:9100 + MEMORY_V2_CUTOVER=true in the environment (staging first).
4. Watch boot logs for "memory-plugin: ✅ sidecar healthy" and the wiring.go cutover messages; do a live commit_memory + recall_memory round-trip via the canvas Memory tab to verify.
5. Promote to production once staging holds for a sweep window.

Refs RFC #2728. Closes the dormant-plugin gap noted in task #294.
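The sidecar launch described under "What ships" could look roughly like the sketch below. It is not the contents of entrypoint-tenant.sh, which this page does not show. The function names `wait_for_health`, `probe_plugin`, and `start_memory_plugin` are hypothetical, and the sketch assumes a `wget` (BusyBox or GNU) is present in the image and that the plugin listens on the default :9100.

```shell
#!/bin/sh
# Sketch only: illustrates the spawn-then-wait pattern from the commit
# message. All function names here are hypothetical.

# wait_for_health PROBE TIMEOUT_SECONDS
# Runs PROBE about once per second until it succeeds or TIMEOUT_SECONDS
# attempts have failed. Returns 0 on success, 1 on timeout.
wait_for_health() {
  probe=$1
  limit=$2
  elapsed=0
  while [ "$elapsed" -lt "$limit" ]; do
    if $probe >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# probe_plugin succeeds only when /v1/health answers with a success status.
# Assumption: the plugin is reachable on localhost:9100.
probe_plugin() {
  wget -q -O /dev/null "http://localhost:9100/v1/health"
}

start_memory_plugin() {
  # Escape hatch for operators who run the plugin on a separate host.
  if [ "${MEMORY_PLUGIN_DISABLE:-0}" = "1" ]; then
    echo "memory-plugin: disabled via MEMORY_PLUGIN_DISABLE=1"
    return 0
  fi

  # Defaults from the commit message: share the tenant Postgres, listen on :9100.
  export MEMORY_PLUGIN_DATABASE_URL="${MEMORY_PLUGIN_DATABASE_URL:-${DATABASE_URL:-}}"
  export MEMORY_PLUGIN_LISTEN_ADDR="${MEMORY_PLUGIN_LISTEN_ADDR:-:9100}"

  # Start the sidecar in the background before the main server.
  /memory-plugin &

  # Crash loudly rather than boot with a dead plugin.
  if ! wait_for_health probe_plugin 30; then
    echo "memory-plugin: no healthy response within 30s, aborting boot" >&2
    exit 1
  fi
  echo "memory-plugin: ✅ sidecar healthy"
}
```

An entrypoint built this way would call `start_memory_plugin` and then `exec /platform`. In the tenant image no privilege drop is needed at that point because the container already starts as `canvas`; the platform image would wrap the launch in su-exec instead.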
109 lines
4.9 KiB
Docker
# Dockerfile.tenant - combined platform (Go) + canvas (Next.js) image.
#
# Serves both the API (Go on :8080) and the UI (Node.js on :3000) in a
# single container. Go reverse-proxies unknown routes to canvas.
#
# Templates are cloned from standalone GitHub repos at build time so the
# monorepo doesn't need to carry them. The repos are public; no auth.
#
# Build context: repo root.
#
#   docker buildx build --platform linux/amd64 \
#     -f workspace-server/Dockerfile.tenant \
#     -t registry.fly.io/molecule-tenant:latest \
#     --push .

# ── Stage 1: Go platform binary ──────────────────────────────────────
FROM golang:1.25-alpine AS go-builder
WORKDIR /app
COPY molecule-ai-plugin-github-app-auth/ /plugin/
COPY workspace-server/go.mod workspace-server/go.sum ./
RUN echo 'replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin' >> go.mod
RUN go mod download
COPY workspace-server/ .

# GIT_SHA is baked into the binary via -ldflags so /buildinfo can return
# it at runtime. CI passes ${{ github.sha }}; local builds default to
# "dev" so an unset value never reads as a real SHA.
#
# Why this matters: the redeploy verification step compares each tenant's
# /buildinfo against the SHA the workflow expects. If GIT_SHA isn't
# threaded through here, every tenant returns "dev" and the verification
# fails closed - which is the correct fail-direction (#2395 root fix).
ARG GIT_SHA=dev
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
    -o /platform ./cmd/server
# Memory v2 sidecar binary (Memory v2 #2728). Bundled so an operator
# can activate cutover by flipping MEMORY_V2_CUTOVER=true without
# provisioning a separate service. See entrypoint-tenant.sh for the
# launch logic.
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
    -o /memory-plugin ./cmd/memory-plugin-postgres

# ── Stage 2: Canvas Next.js standalone ────────────────────────────────
FROM node:20-alpine AS canvas-builder
WORKDIR /canvas
COPY canvas/package.json canvas/package-lock.json* ./
RUN npm install
COPY canvas/ .
ARG NEXT_PUBLIC_PLATFORM_URL=""
ARG NEXT_PUBLIC_WS_URL=""
ENV NEXT_PUBLIC_PLATFORM_URL=$NEXT_PUBLIC_PLATFORM_URL
ENV NEXT_PUBLIC_WS_URL=$NEXT_PUBLIC_WS_URL
RUN npm run build

# ── Stage 3: Clone templates + plugins from manifest.json ─────────────
FROM alpine:3.20 AS templates
RUN apk add --no-cache git jq
COPY manifest.json /manifest.json
COPY scripts/clone-manifest.sh /scripts/clone-manifest.sh
RUN chmod +x /scripts/clone-manifest.sh && /scripts/clone-manifest.sh /manifest.json /workspace-configs-templates /org-templates /plugins

# ── Stage 4: Runtime ──────────────────────────────────────────────────
FROM node:20-alpine
RUN apk add --no-cache ca-certificates git tzdata openssh-client aws-cli

# Non-root runtime for the Node.js canvas process.
# The Go binary (started by entrypoint.sh) is also non-root - the
# entrypoint runs as root only long enough to set volume ownership,
# then exec's as the 'canvas' user via su-exec / setpriv.
# The Go platform itself drops privileges after init.
#
# node:20-alpine ships with uid/gid 1000 already taken by `node`. Delete
# it first so we can recreate `canvas` at the same uid/gid without
# conflict. Previously plain addgroup/adduser at 1000 failed with
# "group 'node' in use" - blocked the tenant image build for hours
# 2026-04-21. Picking a different uid would break mounted volumes
# that expect 1000, so we keep the slot and rename the user.
RUN deluser --remove-home node 2>/dev/null || true; \
    delgroup node 2>/dev/null || true; \
    addgroup -g 1000 canvas && adduser -u 1000 -G canvas -s /bin/sh -D canvas

# Go platform binary + Memory v2 sidecar
COPY --from=go-builder /platform /platform
COPY --from=go-builder /memory-plugin /memory-plugin
COPY workspace-server/migrations /migrations

# Templates + plugins (cloned from GitHub in stage 3)
COPY --from=templates /workspace-configs-templates /workspace-configs-templates
COPY --from=templates /org-templates /org-templates
COPY --from=templates /plugins /plugins

# Canvas standalone
WORKDIR /canvas
COPY --from=canvas-builder /canvas/.next/standalone ./
COPY --from=canvas-builder /canvas/.next/static ./.next/static
COPY --from=canvas-builder /canvas/public ./public

COPY workspace-server/entrypoint-tenant.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh && \
    chown -R canvas:canvas /canvas /platform /memory-plugin /migrations

EXPOSE 8080
# entrypoint.sh starts as root to fix volume perms, then drops to
# canvas user. The Go binary (PID 1 replacement) runs as non-root.
USER canvas
CMD ["/entrypoint.sh"]
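The GIT_SHA build argument only helps if the build actually passes it, so a local or CI wrapper might thread it through as sketched below. The helper names `resolve_git_sha` and `build_tenant_image` are hypothetical. The build command itself is the one from the header comment of this Dockerfile, with `--build-arg` added. The fallback to "dev" mirrors the `ARG GIT_SHA=dev` default.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# resolve_git_sha prints the SHA to bake into the binaries:
#   1. an explicit GIT_SHA from the environment (e.g. set by CI),
#   2. otherwise the current commit if we are inside a git checkout,
#   3. otherwise "dev", matching the Dockerfile's ARG default.
resolve_git_sha() {
  if [ -n "${GIT_SHA:-}" ]; then
    printf '%s\n' "$GIT_SHA"
  elif sha=$(git rev-parse HEAD 2>/dev/null); then
    printf '%s\n' "$sha"
  else
    printf 'dev\n'
  fi
}

# build_tenant_image runs the build from the repo root (the build context)
# and passes the resolved SHA to both go build steps via the GIT_SHA arg.
build_tenant_image() {
  docker buildx build --platform linux/amd64 \
    -f workspace-server/Dockerfile.tenant \
    --build-arg GIT_SHA="$(resolve_git_sha)" \
    -t registry.fly.io/molecule-tenant:latest \
    --push .
}
```

Because both `go build` steps use the same `-ldflags`, the platform binary and the memory-plugin sidecar end up stamped with the same value.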