molecule-core/workspace-server/Dockerfile.platform-agent
devops-engineer e4efc35db1
fix(image): COPY --chmod instead of RUN chmod in Dockerfile.platform-agent (build failed on non-root tenant base)
After #2982 repointed the concierge image to FROM platform-tenant (the live
base), the build failed at `RUN chmod +x` with "Operation not permitted":
platform-tenant runs as a non-root user, so a build-time RUN chmod can't set the
+x bit (the dead molecule-ai/platform base was root, which masked this).

Replace both `COPY + RUN chmod +x` pairs (identity-fallback.sh and the
/entrypoint-platform-agent.sh heredoc) with buildx-native `COPY --chmod=0755`,
which sets the executable bit at copy time regardless of the base USER.
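
A minimal sketch of the failure and the fix, reproducible outside this repo.
The names here are assumptions for illustration only: `alpine:3.20` with
`USER nobody` stands in for the non-root platform-tenant base, and
`/opt/demo/hello.sh` is a hypothetical script, not a file from this PR.

```dockerfile
# syntax=docker/dockerfile:1
# Illustrative only: alpine + "nobody" stand in for the non-root base image.
FROM alpine:3.20
USER nobody

# Old pattern (fails on a non-root base): COPY creates the file as root:root
# by default, so the build user is not the owner and chmod is refused.
#   COPY hello.sh /opt/demo/hello.sh
#   RUN chmod +x /opt/demo/hello.sh    # "Operation not permitted"

# New pattern: BuildKit applies the mode while writing the layer, so no
# process has to run as the image's USER to set the executable bit.
COPY --chmod=0755 <<'EOF' /opt/demo/hello.sh
#!/bin/sh
echo "hello from a non-root base"
EOF

CMD ["/opt/demo/hello.sh"]
```

Built with `docker buildx build -t chmod-demo .` and started with
`docker run --rm chmod-demo`, the container should run the script as
`nobody`; uncommenting the old pattern should reproduce the build failure.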

Empirically verified: with this change the platform-agent image builds cleanly
FROM platform-tenant:staging-latest (manual buildx run completed all layers;
only the push was denied due to cross-account ECR perms on the build host — the
CI publish runner has the correct prod-account principal).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 20:07:50 -07:00


# Platform-agent image variant (RFC #2843 §10a IMAGE-BAKED).
#
# The platform-agent image is the concierge's dedicated image. The base
# platform image (Dockerfile) is the ordinary /platform image; the
# platform-agent variant EXTENDS the base with the concierge's IDENTITY
# baked in, sourced FROM the platform-agent TEMPLATE REPO
# (molecule-ai/molecule-ai-workspace-template-platform-agent) — the
# SAME SSOT that the asset-channel delivers post-#29-activation.
#
# Why a dedicated image: the concierge is a platform-managed agent
# (NOT a user template) with a different threat model and a different
# identity-delivery requirement. The asset channel (PR-B, #2900+#2903)
# works in SaaS+token, in SaaS+public-fetch, and (post-#29-activation)
# in any tenant. The IMAGE-BAKED variant covers the remaining gap:
# self-hosted deployments (no MOLECULE_TEMPLATE_REPO_TOKEN) and the
# pre-#29-activation bootstrap window, where neither the asset channel
# nor the local template path is guaranteed to be available.
#
# SSOT contract (driver hard-requirement on the IMAGE-BAKED impl):
# The image-baked content (config.yaml + prompts/concierge.md +
# mcp_servers.yaml + identity-fallback.sh) is SOURCED FROM the
# platform-agent TEMPLATE REPO, NOT vendored/duplicated in core. A CI
# DRIFT-GATE
# (workspace-server/internal/provisioner/platform_agent_image_drift_test.go)
# compares the image-baked files under
# /opt/molecule-platform-agent-template/ (config.yaml, mcp_servers.yaml,
# prompts/concierge.md, identity-fallback.sh) with the pre-cloned source in
# .tenant-bundle-deps/workspace-configs-templates/platform-agent/
# and asserts they are byte-equal at build time. Any drift fails the
# drift-gate test (go test -run TestPlatformAgentImageDriftGate) BEFORE
# the image is published, so the image snapshot and the template can
# NEVER diverge in production without a CI-red signal.
#
# Build context: same as Dockerfile. The platform-agent template
# content is pre-cloned by scripts/clone-manifest.sh into
# .tenant-bundle-deps/workspace-configs-templates/platform-agent/
# (the platform-agent template is a manifest.json workspace_templates
# entry per RFC #2843 §10a), and this Dockerfile reads it via the
# PLATFORM_AGENT_TEMPLATE_DIR build-arg (default = canonical CI path).
#
# Usage (operator / CI):
#   docker buildx build \
#     --build-arg PLATFORM_AGENT_TEMPLATE_DIR=.tenant-bundle-deps/workspace-configs-templates/platform-agent \
#     --tag ${REGISTRY}/molecule-ai/platform-platform-agent:staging-${GIT_SHA} \
#     -f workspace-server/Dockerfile.platform-agent \
#     .
#
# Runtime contract: a container started from this image has the
# concierge identity at /opt/molecule-platform-agent-template/. The
# pre-#29/self-host fallback path in the workspace-server's
# applyConciergeProvisionConfig hook
# (workspace-server/internal/handlers/platform_agent.go) reads from
# this path when asset-channel delivery is unavailable. Post-#29
# activation, the asset channel remains the SSOT-delivery path; the
# image-baked copy is a last-resort fallback (intentionally NOT a
# parallel SSOT; the drift-gate enforces single-SSOT).
#
# The identity-fallback.sh script is the WORKING fallback (the
# IMAGE_BAKED_IDENTITY_PRESENT echo-only marker that the #2919 PR
# shipped was a log line that did nothing; this PR's companion
# template-platform-agent PR adds the real script). The
# platform-agent entrypoint runs the script at boot, BEFORE handing
# off to the base image's /entrypoint.sh. Fill-absent-only: a
# delivered /configs/* (asset-channel SSOT) is NEVER overwritten; the
# image-baked copy is the safety net for self-host and
# pre-#29-bootstrap windows where neither the asset channel nor the
# local template path is guaranteed to be available. See
# template-platform-agent #2 (identity-fallback.sh) for the script
# semantics: SRC=/opt/molecule-platform-agent-template, DST=/configs,
# fail-soft on missing SRC.
ARG BASE_IMAGE=molecule-local/platform:latest
FROM ${BASE_IMAGE}
# PLATFORM-AGENT TEMPLATE CONTENT — SSOT-sourced from the pre-cloned
# template repo. The default path mirrors where scripts/clone-manifest.sh
# places workspace_templates entries
# (.tenant-bundle-deps/workspace-configs-templates/<name>/). The
# platform-agent template is in manifest.json's workspace_templates
# (per RFC #2843 §10a), so the existing pre-clone step in
# publish-workspace-server-image.yml populates this path with no
# additional CI work.
#
# The build-arg exists for operators / staging mirrors that pre-clone
# to a different dir (e.g. a shallow --depth=1 mirror for fast CI).
# Default value is the canonical CI path; override only when the
# pre-clone layout differs.
#
# Why build-arg, not ENV: the path is a BUILD-TIME input, not a
# runtime config; build-args are the right tool and the value never
# enters the running container.
ARG PLATFORM_AGENT_TEMPLATE_DIR=.tenant-bundle-deps/workspace-configs-templates/platform-agent
COPY ${PLATFORM_AGENT_TEMPLATE_DIR}/config.yaml /opt/molecule-platform-agent-template/config.yaml
COPY ${PLATFORM_AGENT_TEMPLATE_DIR}/mcp_servers.yaml /opt/molecule-platform-agent-template/mcp_servers.yaml
COPY ${PLATFORM_AGENT_TEMPLATE_DIR}/prompts/ /opt/molecule-platform-agent-template/prompts/
# The boot-time identity-fallback script. Run at container start
# (see /entrypoint-platform-agent.sh below) to fill ABSENT files at
# /configs/ from the image-baked /opt path. The script's SSOT is
# the platform-agent TEMPLATE REPO; the drift-gate
# (platform_agent_image_drift_test.go) catches content drift between
# this COPY source and the image-baked destination.
#
# RCA 12124 (DRIVER-ESCALATED live prod identity): the script MUST
# write /configs/system-prompt.md (the file the
# conciergeIdentityPresent probe at platform_agent.go:399 reads) —
# NOT just /configs/prompts/concierge.md. Prior shape had a
# conditional write (`if [ ! -s "$DST/system-prompt.md" ]`) which
# could fail to fire after a partial-template run; the fixed script
# in the template-platform-agent repo (PR-side, merged to template
# main) is unconditional: always writes /configs/system-prompt.md
# from prompts/concierge.md + {{CONCIERGE_NAME}} substitution.
# COPY --chmod sets +x at copy time (buildx-native). A `RUN chmod` fails with
# "Operation not permitted" when the base image runs as a non-root user — the
# live platform-tenant base does, whereas the dead molecule-ai/platform base was
# root, which masked this. --chmod works regardless of base USER.
COPY --chmod=0755 ${PLATFORM_AGENT_TEMPLATE_DIR}/identity-fallback.sh /opt/molecule-platform-agent-template/identity-fallback.sh
# PLATFORM-AGENT ENTRYPOINT — runs identity-fallback.sh FIRST (fills
# absent /configs/ files from the image-baked /opt path; the
# asset-channel deliver is the SSOT post-#29-activation, this is the
# self-host + pre-#29-bootstrap safety net), then hands off to the
# base image's /entrypoint.sh (which does docker-socket group setup,
# memory-plugin sidecar spawn-gate, then exec su-exec platform
# /platform).
#
# Why a separate entrypoint (not extending /entrypoint.sh in core):
# the IMAGE-BAKED identity-fallback is a platform-agent-specific
# concern; the base /entrypoint.sh stays the single runtime entry
# for the ordinary /platform image, and the platform-agent variant
# overrides only the boot hook. The script is invoked as a child
# process (not sourced, not exec'd), behind an `-x` check and an
# `||` guard, so a missing or failing script is logged and boot
# continues: /entrypoint.sh still runs /platform via su-exec, and
# the runtime's MISSING_MODEL fail-closed surfaces the
# operator-visible error in that case.
COPY --chmod=0755 <<'ENTRY' /entrypoint-platform-agent.sh
#!/bin/sh
# /opt/molecule-platform-agent-template/identity-fallback.sh copies,
# per file, any ABSENT file from the image-baked SSOT path to
# /configs/. The asset-channel deliver (post-#29-activation) is the
# authoritative path when it lands; this is the safety net for
# self-host and pre-#29-bootstrap windows where neither the asset
# channel nor the local template path is guaranteed to be available.
if [ -x /opt/molecule-platform-agent-template/identity-fallback.sh ]; then
    /opt/molecule-platform-agent-template/identity-fallback.sh || {
        echo "platform-agent: ⚠️ identity-fallback.sh failed (see prior log lines); continuing boot — runtime will MISSING_MODEL fail-closed if /configs is empty" >&2
    }
else
    echo "platform-agent: identity-fallback.sh not present (image built without it); skipping fallback (runtime will MISSING_MODEL fail-closed)" >&2
fi
# Hand off to the base image's entrypoint (docker-socket group
# setup, memory-plugin sidecar spawn-gate, then su-exec platform
# /platform). Pass through any CMD args (the platform-agent image
# is invoked the same way as the base — operator/CI sets CMD as
# needed; this entrypoint is transparent to the args).
exec /entrypoint.sh "$@"
ENTRY
ENTRYPOINT ["/entrypoint-platform-agent.sh"]