chore: reconcile main → staging post-suspension divergence
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 7s
cascade-list-drift-gate / check (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 15s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 43s
Harness Replays / Harness Replays (pull_request) Failing after 40s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m32s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m34s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m36s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Failing after 2m53s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m44s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m57s
CI / Canvas (Next.js) (pull_request) Successful in 6m50s
CI / Python Lint & Test (pull_request) Successful in 7m37s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Failing after 8m31s
Refs Task #165 (Class D AUTO_SYNC_TOKEN plumbing).

main and staging diverged after the 2026-05-06 GitHub-org suspension: Class D / Class G / feature work landed on staging, while unrelated CI fixes (#34-47, ECR auth-inline, buildx→docker, pre-clone manifest deps) landed straight on main. Both branches edited the same workflow files, so every push to main triggered an Auto-sync run that aborted at `git merge --no-ff origin/main` with 7 content conflicts:

- .github/workflows/canary-verify.yml (URL: github.com → Gitea)
- .github/workflows/ci.yml (3 URL refs)
- .github/workflows/publish-runtime.yml (cascade: HTTP repo-dispatch → Gitea push)
- .github/workflows/publish-workspace-server-image.yml (drop AWS-action steps; ECR auth is inline)
- .github/workflows/retarget-main-to-staging.yml (URL)
- manifest.json (lowercase org slug + add mock-bigorg from main)
- scripts/clone-manifest.sh (keep main's MOLECULE_GITEA_TOKEN auth path + drop awk-tolower since the manifest is now lowercase)

Resolution: union — staging's post-suspension Gitea/ECR migrations win on URL/policy edits; main's additive work (mock-bigorg manifest entry, inline ECR auth, MOLECULE_GITEA_TOKEN basic-auth) is preserved on top.

After this lands, staging is a strict superset of main, so the next auto-sync run on a push to main will be a clean fast-forward / no-op. The auto-sync workflow on main also picks up staging's AUTO_SYNC_TOKEN swap (Class D #26) for free, fixing the latent layer-2 push-auth issue.

Verified locally:
- bash -n scripts/clone-manifest.sh
- python -c 'yaml.safe_load(...)' on each touched workflow
- python -c 'json.load(open("manifest.json"))' (21 plugins, 9 templates, 7 org_templates)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
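The manifest.json spot check from the verification list can be sketched as a small Python helper. This is an illustrative sketch only: the miniature manifest and the `section_counts` helper are ours, not part of the repo; the real manifest.json carries 21 plugins, 9 templates, and 7 org_templates.

```python
import json

# Illustrative miniature manifest with the same top-level shape the
# commit message describes; the real file is much larger.
MANIFEST_JSON = """
{
  "plugins": [{"name": "p1", "repo": "molecule-ai/p1", "ref": "main"}],
  "templates": [{"name": "t1", "repo": "molecule-ai/t1", "ref": "main"}],
  "org_templates": [
    {"name": "mock-bigorg",
     "repo": "molecule-ai/molecule-ai-org-template-mock-bigorg",
     "ref": "main"}
  ]
}
"""

def section_counts(manifest: dict) -> dict:
    # Mirror of the `json.load(...)` spot check: count each section so a
    # truncated or malformed manifest fails loudly before a CI build.
    return {k: len(manifest.get(k, []))
            for k in ("plugins", "templates", "org_templates")}

manifest = json.loads(MANIFEST_JSON)
print(section_counts(manifest))
# → {'plugins': 1, 'templates': 1, 'org_templates': 1}
```

Asserting on the counts (rather than just parsing) is what catches a half-merged manifest, which is exactly the conflict class this commit resolves.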
This commit is contained in:
commit
25fb696965
4
.github/workflows/block-internal-paths.yml
vendored
@ -1,7 +1,7 @@
name: Block internal-flavored paths

# Hard CI gate. Internal content (positioning, competitive briefs, sales
# playbooks, PMM/press drip, draft campaigns) lives in Molecule-AI/internal —
# playbooks, PMM/press drip, draft campaigns) lives in molecule-ai/internal —
# this public monorepo must never re-acquire those paths. CEO directive
# 2026-04-23 after a fleet-wide audit found 79 internal files leaked here.
#

@ -135,7 +135,7 @@ jobs:
echo "::error::Forbidden internal-flavored paths detected:"
printf "$OFFENDING"
echo ""
echo "These paths belong in Molecule-AI/internal, not this public repo."
echo "These paths belong in molecule-ai/internal, not this public repo."
echo "See docs/internal-content-policy.md for canonical locations."
echo ""
echo "If your file is genuinely public-facing (e.g. a blog post"
2
.github/workflows/ci.yml
vendored
@ -165,7 +165,7 @@ jobs:
# Strip the package-import prefix so we can match .coverage-allowlist.txt
# entries written as paths relative to workspace-server/.
# Handle both module paths: platform/workspace-server/... and platform/...
rel=$(echo "$file" | sed 's|^github.com/Molecule-AI/molecule-monorepo/platform/workspace-server/||; s|^github.com/Molecule-AI/molecule-monorepo/platform/||')
rel=$(echo "$file" | sed 's|^github.com/molecule-ai/molecule-monorepo/platform/workspace-server/||; s|^github.com/molecule-ai/molecule-monorepo/platform/||')

if echo "$ALLOWLIST" | grep -qxF "$rel"; then
echo "::warning file=workspace-server/$rel::Critical file at ${pct}% coverage (allowlisted, #1823) — fix before expiry."
3
.github/workflows/codeql.yml
vendored
@ -43,6 +43,9 @@ permissions:
jobs:
analyze:
name: Analyze (${{ matrix.language }})
# CodeQL set to advisory (non-blocking) on Gitea Actions — Hongming decision 2026-05-07 (#156).
# Findings still emit as SARIF artifacts; failing CodeQL run does not block PR merge.
continue-on-error: true
runs-on: ubuntu-latest
timeout-minutes: 45
2
.github/workflows/pr-guards.yml
vendored
@ -19,4 +19,4 @@ permissions:

jobs:
disable-auto-merge-on-push:
uses: Molecule-AI/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
uses: molecule-ai/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
4
.github/workflows/publish-runtime.yml
vendored
@ -25,7 +25,7 @@ name: publish-runtime
# 3. Publishes to PyPI via the PyPA Trusted Publisher action (OIDC).
# No static API token is stored — PyPI verifies the workflow's
# OIDC claim against the trusted-publisher config registered for
# molecule-ai-workspace-runtime (Molecule-AI/molecule-core,
# molecule-ai-workspace-runtime (molecule-ai/molecule-core,
# publish-runtime.yml, environment pypi-publish).
#
# After publish: the 8 template repos pick up the new version on their

@ -166,7 +166,7 @@ jobs:

- name: Publish to PyPI (Trusted Publisher / OIDC)
# PyPI side is configured: project molecule-ai-workspace-runtime →
# publisher Molecule-AI/molecule-core, workflow publish-runtime.yml,
# publisher molecule-ai/molecule-core, workflow publish-runtime.yml,
# environment pypi-publish. The action mints a short-lived OIDC
# token and exchanges it for a PyPI upload credential — no static
# API token in this repo's secrets.
203
.github/workflows/publish-workspace-server-image.yml
vendored
@ -37,6 +37,7 @@ on:
- 'workspace-server/**'
- 'canvas/**'
- 'manifest.json'
- 'scripts/**'
- '.github/workflows/publish-workspace-server-image.yml'
workflow_dispatch:

@ -74,33 +75,87 @@ jobs:
# plugin was dropped + workspace-server/Dockerfile no longer
# COPYs it.

- name: Configure AWS credentials for ECR
# GHCR was the pre-suspension target; the molecule-ai org on
# GitHub got swept 2026-05-06 and ghcr.io/molecule-ai/* is no
# longer reachable. Post-suspension target is the operator's
# ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/*), which already hosts platform-tenant +
# workspace-template-* + runner-base images. AWS creds come
# from the AWS_ACCESS_KEY_ID/SECRET secrets bound to the
# molecule-cp IAM user. Closes #161.
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-2

- name: Log in to ECR
id: ecr-login
uses: aws-actions/amazon-ecr-login@v2

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
# ECR auth + buildx setup are now inline in each build step
# below (Task #173, 2026-05-07).
#
# Why moved inline: aws-actions/configure-aws-credentials@v4 +
# aws-actions/amazon-ecr-login@v2 + docker/setup-buildx-action
# all left auth state in places that the actual `docker push`
# couldn't see on Gitea Actions:
# - The actions wrote to a step-scoped DOCKER_CONFIG path
# that didn't survive into subsequent shell steps.
# - Buildx couldn't bridge the runner container ↔
# operator-host docker daemon auth gap (401 on the
# docker-container driver, "no basic auth credentials"
# with the action-driven login).
#
# Doing AWS+ECR auth inline (`aws ecr get-login-password |
# docker login`) in the same shell step as `docker build` +
# `docker push` is the operator-host manual approach, mapped
# 1:1 into CI. Auth state is guaranteed to live in the env that
# `docker push` actually runs from.
#
# Post-suspension target is the operator's ECR org
# (153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*),
# which already hosts platform-tenant + workspace-template-* +
# runner-base images. AWS creds come from the
# AWS_ACCESS_KEY_ID/SECRET secrets bound to the molecule-cp
# IAM user. Closes #161.

- name: Compute tags
id: tags
run: |
echo "sha=${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
# Pre-clone manifest deps before docker build (Task #173 fix).
#
# Why pre-clone: post-2026-05-06, every workspace-template-* repo on
# Gitea (codex, crewai, deepagents, gemini-cli, langgraph) plus all
# 7 org-template-* repos are private. The pre-fix Dockerfile.tenant
# ran `git clone` inside an in-image stage, which had no auth path
# — every CI build failed with "fatal: could not read Username for
# https://git.moleculesai.app". For weeks, every workspace-server
# rebuild required a manual operator-host push. Now we clone in the
# trusted CI context (where AUTO_SYNC_TOKEN is naturally available)
# and Dockerfile.tenant just COPYs from .tenant-bundle-deps/.
#
# Token shape: AUTO_SYNC_TOKEN is the devops-engineer persona PAT
# (see /etc/molecule-bootstrap/agent-secrets.env). Per saved memory
# `feedback_per_agent_gitea_identity_default`, every CI surface uses
# a per-persona token, never the founder PAT. clone-manifest.sh
# embeds it as basic-auth (oauth2:<token>) for the duration of the
# clones, then strips .git directories — the token never enters
# the resulting image.
#
# Idempotent: if a re-run finds populated dirs, clone-manifest.sh
# skips them; safe to retrigger via path-filter or workflow_dispatch.
- name: Pre-clone manifest deps
env:
MOLECULE_GITEA_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
run: |
set -euo pipefail
if [ -z "${MOLECULE_GITEA_TOKEN}" ]; then
echo "::error::AUTO_SYNC_TOKEN secret is empty — register the devops-engineer persona PAT in repo Actions secrets"
exit 1
fi
mkdir -p .tenant-bundle-deps
bash scripts/clone-manifest.sh \
manifest.json \
.tenant-bundle-deps/workspace-configs-templates \
.tenant-bundle-deps/org-templates \
.tenant-bundle-deps/plugins
# Sanity-check counts so a silent partial clone fails fast
# instead of producing a half-empty image.
ws_count=$(find .tenant-bundle-deps/workspace-configs-templates -mindepth 1 -maxdepth 1 -type d | wc -l)
org_count=$(find .tenant-bundle-deps/org-templates -mindepth 1 -maxdepth 1 -type d | wc -l)
plugins_count=$(find .tenant-bundle-deps/plugins -mindepth 1 -maxdepth 1 -type d | wc -l)
echo "Cloned: ws=$ws_count org=$org_count plugins=$plugins_count"
# Counts are derived from manifest.json (9 ws / 7 org / 21
# plugins as of 2026-05-07). If manifest.json grows but the
# clone step regresses silently, the find above caps at the
# actual disk state — but clone-manifest.sh's own EXPECTED vs
# CLONED check (line ~95) is the authoritative fail-fast.
# Canary-gated release flow:
# - This step always publishes :staging-<sha> + :staging-latest.
# - On staging push, staging-CP picks up :staging-latest immediately

@ -126,43 +181,42 @@ jobs:
# were running pre-RFC code. Adding the staging trigger above closes
# that gap. Earlier 2026-04-24 incident: a static :staging-<sha> pin
# drifted 10 days behind staging — same class of bug, different
# mechanism.
- name: Build & push platform image to GHCR (staging-<sha> + staging-latest)
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: ./workspace-server/Dockerfile
platforms: linux/amd64
push: true
tags: |
${{ env.IMAGE_NAME }}:staging-${{ steps.tags.outputs.sha }}
${{ env.IMAGE_NAME }}:staging-latest
cache-from: type=gha
cache-to: type=gha,mode=max
# mechanism. ECR repo molecule-ai/platform created 2026-05-07.
# Build + push platform image with plain `docker` (no buildx).
# GIT_SHA bakes into the Go binary via -ldflags so /buildinfo
# returns it at runtime — see Dockerfile + buildinfo/buildinfo.go.
# This is the same value as the OCI revision label below; passing
# it twice is intentional, the OCI label is for registry tooling
# while /buildinfo is for the redeploy verification step.
build-args: |
GIT_SHA=${{ github.sha }}
labels: |
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.revision=${{ github.sha }}
org.opencontainers.image.description=Molecule AI platform (Go API server) — pending canary verify
# The OCI revision label below carries the same value for registry
# tooling; the duplication is intentional.
- name: Build & push platform image to ECR (staging-<sha> + staging-latest)
env:
IMAGE_NAME: ${{ env.IMAGE_NAME }}
TAG_SHA: staging-${{ steps.tags.outputs.sha }}
TAG_LATEST: staging-latest
GIT_SHA: ${{ github.sha }}
REPO: ${{ github.repository }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
run: |
set -euo pipefail
# ECR auth in-step so config.json is populated in the same
# shell env that runs `docker push`. ECR get-login-password
# tokens last 12h, plenty for a single-step build+push.
ECR_REGISTRY="${IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
docker build \
--file ./workspace-server/Dockerfile \
--build-arg GIT_SHA="${GIT_SHA}" \
--label "org.opencontainers.image.source=https://github.com/${REPO}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.description=Molecule AI platform (Go API server) — pending canary verify" \
--tag "${IMAGE_NAME}:${TAG_SHA}" \
--tag "${IMAGE_NAME}:${TAG_LATEST}" \
.
docker push "${IMAGE_NAME}:${TAG_SHA}"
docker push "${IMAGE_NAME}:${TAG_LATEST}"
- name: Build & push tenant image to GHCR (staging-<sha> + staging-latest)
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: .
file: ./workspace-server/Dockerfile.tenant
platforms: linux/amd64
push: true
tags: |
${{ env.TENANT_IMAGE_NAME }}:staging-${{ steps.tags.outputs.sha }}
${{ env.TENANT_IMAGE_NAME }}:staging-latest
cache-from: type=gha
cache-to: type=gha,mode=max
# Canvas uses same-origin fetches. The tenant Go platform
# reverse-proxies /cp/* to the SaaS CP via its CP_UPSTREAM_URL
# env; the tenant's /canvas/viewport, /approvals/pending,

@ -174,10 +228,35 @@ jobs:
# Self-hosted / private-label deployments override this at
# build time with a specific backend (e.g. local dev:
# NEXT_PUBLIC_PLATFORM_URL=http://localhost:8080).
build-args: |
NEXT_PUBLIC_PLATFORM_URL=
GIT_SHA=${{ github.sha }}
labels: |
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.revision=${{ github.sha }}
org.opencontainers.image.description=Molecule AI tenant platform + canvas — pending canary verify
- name: Build & push tenant image to ECR (staging-<sha> + staging-latest)
env:
TENANT_IMAGE_NAME: ${{ env.TENANT_IMAGE_NAME }}
TAG_SHA: staging-${{ steps.tags.outputs.sha }}
TAG_LATEST: staging-latest
GIT_SHA: ${{ github.sha }}
REPO: ${{ github.repository }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
run: |
set -euo pipefail
# Re-login: the platform-image step's docker login wrote to
# the same config.json, so this is technically redundant — but
# making each push step self-contained keeps the workflow
# robust to step reordering / future extraction.
ECR_REGISTRY="${TENANT_IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
docker build \
--file ./workspace-server/Dockerfile.tenant \
--build-arg NEXT_PUBLIC_PLATFORM_URL= \
--build-arg GIT_SHA="${GIT_SHA}" \
--label "org.opencontainers.image.source=https://github.com/${REPO}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.description=Molecule AI tenant platform + canvas — pending canary verify" \
--tag "${TENANT_IMAGE_NAME}:${TAG_SHA}" \
--tag "${TENANT_IMAGE_NAME}:${TAG_LATEST}" \
.
docker push "${TENANT_IMAGE_NAME}:${TAG_SHA}"
docker push "${TENANT_IMAGE_NAME}:${TAG_LATEST}"
@ -9,7 +9,7 @@ name: redeploy-tenants-on-main
#
# This workflow closes the gap by calling the control-plane admin
# endpoint that performs a canary-first, batched, health-gated rolling
# redeploy across every live tenant. Implemented in Molecule-AI/
# redeploy across every live tenant. Implemented in molecule-ai/
# molecule-controlplane as POST /cp/admin/tenants/redeploy-fleet
# (feat/tenant-auto-redeploy, landing alongside this workflow).
#

@ -146,7 +146,7 @@ jobs:

- name: Call CP redeploy-fleet
# CP_ADMIN_API_TOKEN must be set as a repo/org secret on
# Molecule-AI/molecule-core, matching the staging/prod CP's
# molecule-ai/molecule-core, matching the staging/prod CP's
# CP_ADMIN_API_TOKEN env. Stored in Railway, mirrored to this
# repo's secrets for CI.
env:

@ -97,7 +97,7 @@ jobs:

- name: Call staging-CP redeploy-fleet
# CP_STAGING_ADMIN_API_TOKEN must be set as a repo/org secret
# on Molecule-AI/molecule-core, matching staging-CP's
# on molecule-ai/molecule-core, matching staging-CP's
# CP_ADMIN_API_TOKEN env var (visible in Railway controlplane
# / staging environment). Stored separately from the prod
# CP_ADMIN_API_TOKEN so a leak of one doesn't auth the other.
2
.github/workflows/secret-scan.yml
vendored
@ -12,7 +12,7 @@ name: Secret scan
#
# jobs:
# secret-scan:
# uses: Molecule-AI/molecule-core/.github/workflows/secret-scan.yml@staging
# uses: molecule-ai/molecule-core/.github/workflows/secret-scan.yml@staging
#
# Pin to @staging not @main — staging is the active default branch,
# main lags via the staging-promotion workflow. Updates ride along
7
.gitignore
vendored
@ -131,6 +131,13 @@ backups/
# Cloned by publish-workspace-server-image.yml so the Dockerfile's
# replace-directive path resolves. Lives in its own repo.
/molecule-ai-plugin-github-app-auth/
# Tenant-image build context — populated by the workflow's
# "Pre-clone manifest deps" step. Mirrors the public manifest, holds the
# same content as the three /<>/ dirs above but namespaced under one
# parent so the Docker build context is a single COPY-friendly tree.
# Each entry is a transient working-dir, never source-of-truth, never
# committed.
/.tenant-bundle-deps/

# Internal-flavored content lives in Molecule-AI/internal — NEVER in this
# public monorepo. Migrated 2026-04-23 (CEO directive). The CI workflow
@ -3,6 +3,7 @@ import { cookies, headers } from "next/headers";
import "./globals.css";
import { AuthGate } from "@/components/AuthGate";
import { CookieConsent } from "@/components/CookieConsent";
import { PurchaseSuccessModal } from "@/components/PurchaseSuccessModal";
import { ThemeProvider } from "@/lib/theme-provider";
import {
THEME_COOKIE,

@ -86,6 +87,12 @@ export default async function RootLayout({
vercel preview URL, apex) pass through unchanged. */}
<AuthGate>{children}</AuthGate>
<CookieConsent />
{/* Demo Mock #1: post-purchase success toast. Mounted at the
layout level so it persists across page state transitions
(loading → hydrated → error) without being unmounted and
losing its open-state. Reads ?purchase_success=1 from the
URL on first paint, then strips the param. */}
<PurchaseSuccessModal />
</ThemeProvider>
</body>
</html>
175
canvas/src/components/PurchaseSuccessModal.tsx
Normal file
@ -0,0 +1,175 @@
"use client";

/**
 * PurchaseSuccessModal — demo-only post-purchase confirmation.
 *
 * Mounted on the canvas root (`app/page.tsx`). On first paint it inspects
 * `?purchase_success=1[&item=<name>]` on the current URL. If present, it
 * renders a centred modal styled after `ConfirmDialog`, schedules a 5s
 * auto-dismiss, and rewrites the URL via `history.replaceState` to drop
 * the params so a refresh after dismiss does NOT re-show the modal.
 *
 * Mock for the funding demo — there is no real billing surface behind
 * this. The marketplace "Purchase" button on the landing page redirects
 * here with the params; this modal is the only thing the user sees of
 * the "transaction".
 *
 * Styling matches the warm-paper @theme tokens (surface-sunken / line /
 * ink / good) so it tracks light + dark without per-mode overrides.
 */

import { useEffect, useRef, useState } from "react";
import { createPortal } from "react-dom";

const AUTO_DISMISS_MS = 5000;

function readPurchaseParams(): { open: boolean; item: string | null } {
if (typeof window === "undefined") return { open: false, item: null };
const sp = new URLSearchParams(window.location.search);
const flag = sp.get("purchase_success");
if (flag !== "1" && flag !== "true") return { open: false, item: null };
return { open: true, item: sp.get("item") };
}

function stripPurchaseParams() {
if (typeof window === "undefined") return;
const url = new URL(window.location.href);
url.searchParams.delete("purchase_success");
url.searchParams.delete("item");
// replaceState (not pushState) so back-button doesn't return to the
// pre-strip URL and re-trigger the modal.
window.history.replaceState({}, "", url.toString());
}

export function PurchaseSuccessModal() {
const [open, setOpen] = useState(false);
const [item, setItem] = useState<string | null>(null);
const [mounted, setMounted] = useState(false);
const dialogRef = useRef<HTMLDivElement>(null);

// Read the URL params once on mount. We don't subscribe to navigation —
// this modal is a one-shot for the demo redirect, not a persistent
// listener.
useEffect(() => {
setMounted(true);
const { open: shouldOpen, item: itemName } = readPurchaseParams();
if (shouldOpen) {
setOpen(true);
setItem(itemName);
// Clean the URL immediately so a refresh after the modal is closed
// (or even while it's still open) does NOT re-trigger it.
stripPurchaseParams();
}
}, []);

// Auto-dismiss timer + Escape handler.
useEffect(() => {
if (!open) return;
const t = window.setTimeout(() => setOpen(false), AUTO_DISMISS_MS);
const onKey = (e: KeyboardEvent) => {
if (e.key === "Escape") setOpen(false);
};
window.addEventListener("keydown", onKey);
// Focus the close button so keyboard users land on it after redirect.
const raf = requestAnimationFrame(() => {
dialogRef.current?.querySelector<HTMLButtonElement>("button")?.focus();
});
return () => {
window.clearTimeout(t);
window.removeEventListener("keydown", onKey);
cancelAnimationFrame(raf);
};
}, [open]);
if (!open || !mounted) return null;

const itemLabel = item ? decodeURIComponent(item) : "Your new agent";

return createPortal(
<div
className="fixed inset-0 z-[9999] flex items-center justify-center"
data-testid="purchase-success-modal"
>
{/* Backdrop — click closes, matches ConfirmDialog backdrop. */}
<div
className="absolute inset-0 bg-black/60 backdrop-blur-sm"
onClick={() => setOpen(false)}
aria-hidden="true"
/>

<div
ref={dialogRef}
role="dialog"
aria-modal="true"
aria-labelledby="purchase-success-title"
className="relative bg-surface-sunken border border-line rounded-xl shadow-2xl shadow-black/50 max-w-[420px] w-full mx-4 overflow-hidden"
>
<div className="px-6 pt-6 pb-4">
<div className="flex items-start gap-4">
{/* Success glyph — uses --color-good so it tracks the theme.
Inline SVG over an emoji so it stays readable + on-brand
in both light and dark. */}
<div
className="flex h-10 w-10 flex-shrink-0 items-center justify-center rounded-full"
style={{
background:
"color-mix(in srgb, var(--color-good) 15%, transparent)",
color: "var(--color-good)",
}}
>
<svg
width="22"
height="22"
viewBox="0 0 24 24"
fill="none"
aria-hidden="true"
>
<circle
cx="12"
cy="12"
r="10"
stroke="currentColor"
strokeWidth="1.5"
/>
<path
d="M7.5 12.5L10.5 15.5L16.5 9.5"
stroke="currentColor"
strokeWidth="1.8"
strokeLinecap="round"
strokeLinejoin="round"
/>
</svg>
</div>
<div className="flex-1">
<h3
id="purchase-success-title"
className="text-base font-semibold text-ink"
>
Purchase successful
</h3>
<p className="mt-1.5 text-[13px] leading-relaxed text-ink-mid">
<span className="font-medium text-ink">{itemLabel}</span> has
been added to your workspace. Provisioning starts in the
background — you can keep working while it spins up.
</p>
</div>
</div>
</div>

<div className="flex items-center justify-between gap-3 px-6 py-3 border-t border-line bg-surface/50">
<span className="font-mono text-[10.5px] uppercase tracking-[0.12em] text-ink-soft">
auto-dismiss · {AUTO_DISMISS_MS / 1000}s
</span>
<button
type="button"
onClick={() => setOpen(false)}
className="px-3.5 py-1.5 text-[13px] rounded-lg bg-accent hover:bg-accent-strong text-white transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-offset-2 focus-visible:ring-offset-surface-sunken focus-visible:ring-accent/60"
>
Close
</button>
</div>
</div>
</div>,
document.body,
);
}
@@ -41,6 +41,7 @@
    {"name": "medo-smoke", "repo": "molecule-ai/molecule-ai-org-template-medo-smoke", "ref": "main"},
    {"name": "molecule-worker-gemini", "repo": "molecule-ai/molecule-ai-org-template-molecule-worker-gemini", "ref": "main"},
    {"name": "reno-stars", "repo": "molecule-ai/molecule-ai-org-template-reno-stars", "ref": "main"},
    {"name": "ux-ab-lab", "repo": "molecule-ai/molecule-ai-org-template-ux-ab-lab", "ref": "main"}
    {"name": "ux-ab-lab", "repo": "molecule-ai/molecule-ai-org-template-ux-ab-lab", "ref": "main"},
    {"name": "mock-bigorg", "repo": "molecule-ai/molecule-ai-org-template-mock-bigorg", "ref": "main"}
  ]
}

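The manifest hunk above appends mock-bigorg and adds the trailing comma the previous final entry needed. A JSON-validity check before handing the file to clone-manifest.sh catches that class of slip early. A minimal sketch — the inline fragment stands in for manifest.json, and python3 is assumed on PATH (`jq empty manifest.json` does the same job in the Alpine build stage, where jq is installed):

```shell
# Validate a manifest fragment as JSON before the clone script consumes it.
# The fragment below is illustrative; point the check at manifest.json in practice.
fragment='[
  {"name": "ux-ab-lab", "repo": "molecule-ai/molecule-ai-org-template-ux-ab-lab", "ref": "main"},
  {"name": "mock-bigorg", "repo": "molecule-ai/molecule-ai-org-template-mock-bigorg", "ref": "main"}
]'
if printf '%s' "$fragment" | python3 -m json.tool >/dev/null 2>&1; then
  manifest_ok=yes
else
  manifest_ok=no
fi
echo "manifest fragment valid: $manifest_ok"
# → manifest fragment valid: yes
```

A missing comma between entries makes the check fail immediately, instead of surfacing midway through a multi-repo clone.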
@@ -6,6 +6,29 @@
# ./scripts/clone-manifest.sh <manifest.json> <ws-templates-dir> <org-templates-dir> <plugins-dir>
#
# Requires: git, jq (lighter than python3 — ~2MB vs ~50MB in Alpine)
#
# Auth (optional):
# When MOLECULE_GITEA_TOKEN is set, embed it as the basic-auth password so
# private Gitea repos clone successfully. When unset, clone anonymously
# (works only for repos that are public on git.moleculesai.app).
#
# This is the path the publish-workspace-server-image.yml workflow uses:
# it injects AUTO_SYNC_TOKEN (devops-engineer persona PAT, repo:read on
# the molecule-ai org) so the in-CI pre-clone step succeeds for ALL
# manifest entries — including the 5 private workspace-template-* repos
# (codex, crewai, deepagents, gemini-cli, langgraph) and all 7
# org-template-* repos.
#
# The token never enters the Docker image: this script runs in the
# trusted CI context BEFORE `docker buildx build`, populates
# .tenant-bundle-deps/, then `Dockerfile.tenant` COPYs from there with
# the .git directories already stripped (see line ~67 below).
#
# For backward compatibility — and so a fresh clone works without
# secrets when (eventually) the workspace-template-* repos flip public —
# the unset path remains a plain anonymous HTTPS clone. That path will
# FAIL with "could not read Username" on private repos today; CI MUST
# set MOLECULE_GITEA_TOKEN.

set -euo pipefail

@@ -45,11 +68,27 @@ clone_category() {
      continue
    fi

    echo " cloning $repo -> $target_dir/$name (ref=$ref)"
    if [ "$ref" = "main" ]; then
      git clone --depth=1 -q "https://git.moleculesai.app/${repo}.git" "$target_dir/$name"
    # Build the clone URL. When MOLECULE_GITEA_TOKEN is set (CI path)
    # embed it as basic-auth so private repos succeed. The username
    # part ("oauth2") is conventional and ignored by Gitea — only the
    # token-as-password is verified.
    #
    # manifest.json was migrated to lowercase org slugs on
    # 2026-05-07 (post-suspension reconciliation), so we use $repo
    # verbatim — no on-the-fly tolower transform needed.
    if [ -n "${MOLECULE_GITEA_TOKEN:-}" ]; then
      clone_url="https://oauth2:${MOLECULE_GITEA_TOKEN}@git.moleculesai.app/${repo}.git"
      display_url="https://oauth2:***@git.moleculesai.app/${repo}.git"
    else
      git clone --depth=1 -q --branch "$ref" "https://git.moleculesai.app/${repo}.git" "$target_dir/$name"
      clone_url="https://git.moleculesai.app/${repo}.git"
      display_url="$clone_url"
    fi

    echo " cloning $display_url -> $target_dir/$name (ref=$ref)"
    if [ "$ref" = "main" ]; then
      git clone --depth=1 -q "$clone_url" "$target_dir/$name"
    else
      git clone --depth=1 -q --branch "$ref" "$clone_url" "$target_dir/$name"
    fi
    CLONED=$((CLONED + 1))
    i=$((i + 1))

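The URL-construction branch above can be sketched as a standalone helper. The function name `build_clone_url` is hypothetical; the host and the ignored `oauth2` basic-auth username are taken from the script itself:

```shell
# Hypothetical standalone sketch of the script's clone-URL branch.
build_clone_url() {
  repo="$1"
  token="${2:-}"
  if [ -n "$token" ]; then
    # CI path: token embedded as the basic-auth password.
    printf 'https://oauth2:%s@git.moleculesai.app/%s.git' "$token" "$repo"
  else
    # Anonymous path: only works for public repos.
    printf 'https://git.moleculesai.app/%s.git' "$repo"
  fi
}

url="$(build_clone_url "molecule-ai/example-repo" "s3cr3t")"
# Mask the token before logging, mirroring display_url above.
echo "${url//s3cr3t/***}"
# → https://oauth2:***@git.moleculesai.app/molecule-ai/example-repo.git
```

Keeping the masked `display_url` separate from the real `clone_url` is what lets the script log every clone without ever printing the PAT.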
||||
@@ -1,7 +1,15 @@
# Platform-only image (no canvas). Used by publish-platform-image workflow
# for GHCR + Fly registry. Tenant image uses Dockerfile.tenant instead.
# Platform-only image (no canvas). Used by publish-workspace-server-image
# workflow for ECR. Tenant image uses Dockerfile.tenant instead.
#
# Build context: repo root.
# Templates + plugins are pre-cloned by scripts/clone-manifest.sh (in CI
# or on the operator host) into .tenant-bundle-deps/ — same pattern as
# Dockerfile.tenant. See that file's header for the full rationale; the
# short version is that post-2026-05-06 every workspace-template-* and
# org-template-* repo on Gitea is private, so an in-image `git clone`
# has no auth path that doesn't leak the Gitea token into a layer.
#
# Build context: repo root, with `.tenant-bundle-deps/` populated by the
# workflow's "Pre-clone manifest deps" step (Task #173).

FROM golang:1.25-alpine AS builder
WORKDIR /app
@@ -26,21 +34,18 @@ RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=${GIT_SHA}" \
    -o /memory-plugin ./cmd/memory-plugin-postgres

# Clone templates + plugins at build time from manifest.json
FROM alpine:3.20 AS templates
RUN apk add --no-cache git jq
COPY manifest.json /manifest.json
COPY scripts/clone-manifest.sh /scripts/clone-manifest.sh
RUN chmod +x /scripts/clone-manifest.sh && /scripts/clone-manifest.sh /manifest.json /workspace-configs-templates /org-templates /plugins

FROM alpine:3.20
RUN apk add --no-cache ca-certificates git tzdata wget
COPY --from=builder /platform /platform
COPY --from=builder /memory-plugin /memory-plugin
COPY workspace-server/migrations /migrations
COPY --from=templates /workspace-configs-templates /workspace-configs-templates
COPY --from=templates /org-templates /org-templates
COPY --from=templates /plugins /plugins
# Templates + plugins (pre-cloned by scripts/clone-manifest.sh in the
# trusted CI / operator-host context, .git already stripped). The Gitea
# token used to clone them never enters this image — same shape as
# Dockerfile.tenant.
COPY .tenant-bundle-deps/workspace-configs-templates /workspace-configs-templates
COPY .tenant-bundle-deps/org-templates /org-templates
COPY .tenant-bundle-deps/plugins /plugins
# Non-root runtime with Docker socket access for workspace provisioning.
RUN addgroup -g 1000 platform && adduser -u 1000 -G platform -s /bin/sh -D platform
EXPOSE 8080

@@ -3,14 +3,34 @@
# Serves both the API (Go on :8080) and the UI (Node.js on :3000) in a
# single container. Go reverse-proxies unknown routes to canvas.
#
# Templates are cloned from standalone GitHub repos at build time so the
# monorepo doesn't need to carry them. The repos are public; no auth.
# Templates + plugins are NOT cloned at build time. They are pre-cloned
# in the trusted CI context (or operator host) by
# `scripts/clone-manifest.sh` into `.tenant-bundle-deps/` and COPYed in.
# The reason: post-2026-05-06, every workspace-template-* repo on Gitea
# (codex, crewai, deepagents, gemini-cli, langgraph) plus all 7
# org-template-* repos are private, so the Docker build can't `git clone`
# from inside the build context — there's no auth path that doesn't leak
# the Gitea token into an image layer. Pre-cloning keeps the token in
# the CI environment only; the resulting image carries the cloned trees
# with `.git` already stripped (see clone-manifest.sh).
#
# Build context: repo root.
# Build context: repo root, with `.tenant-bundle-deps/` populated by:
#
#   MOLECULE_GITEA_TOKEN=<persona-PAT> scripts/clone-manifest.sh \
#     manifest.json \
#     .tenant-bundle-deps/workspace-configs-templates \
#     .tenant-bundle-deps/org-templates \
#     .tenant-bundle-deps/plugins
#
# In CI this happens in publish-workspace-server-image.yml's "Pre-clone
# manifest deps" step (uses AUTO_SYNC_TOKEN = devops-engineer persona).
# For a manual operator-host build, source the same token from
# /etc/molecule-bootstrap/agent-secrets.env first.
#
#   docker buildx build --platform linux/amd64 \
#     -f workspace-server/Dockerfile.tenant \
#     -t registry.fly.io/molecule-tenant:latest \
#     -t <ECR>/molecule-ai/platform-tenant:latest \
#     --build-arg GIT_SHA=<sha> --build-arg NEXT_PUBLIC_PLATFORM_URL= \
#     --push .

# ── Stage 1: Go platform binary ──────────────────────────────────────
@@ -55,14 +75,7 @@ ENV NEXT_PUBLIC_PLATFORM_URL=$NEXT_PUBLIC_PLATFORM_URL
ENV NEXT_PUBLIC_WS_URL=$NEXT_PUBLIC_WS_URL
RUN npm run build

# ── Stage 3: Clone templates + plugins from manifest.json ─────────────
FROM alpine:3.20 AS templates
RUN apk add --no-cache git jq
COPY manifest.json /manifest.json
COPY scripts/clone-manifest.sh /scripts/clone-manifest.sh
RUN chmod +x /scripts/clone-manifest.sh && /scripts/clone-manifest.sh /manifest.json /workspace-configs-templates /org-templates /plugins

# ── Stage 4: Runtime ──────────────────────────────────────────────────
# ── Stage 3: Runtime ──────────────────────────────────────────────────
FROM node:20-alpine
RUN apk add --no-cache ca-certificates git tzdata openssh-client aws-cli

@@ -87,10 +100,13 @@ COPY --from=go-builder /platform /platform
COPY --from=go-builder /memory-plugin /memory-plugin
COPY workspace-server/migrations /migrations

# Templates + plugins (cloned from GitHub in stage 3)
COPY --from=templates /workspace-configs-templates /workspace-configs-templates
COPY --from=templates /org-templates /org-templates
COPY --from=templates /plugins /plugins
# Templates + plugins (pre-cloned by scripts/clone-manifest.sh in the
# trusted CI / operator-host context, .git already stripped — see
# .tenant-bundle-deps/ in the build context). The Gitea token used to
# clone them never enters this image.
COPY .tenant-bundle-deps/workspace-configs-templates /workspace-configs-templates
COPY .tenant-bundle-deps/org-templates /org-templates
COPY .tenant-bundle-deps/plugins /plugins

# Canvas standalone
WORKDIR /canvas

workspace-server/cmd/server/bind_test.go (new file, 89 lines)
@@ -0,0 +1,89 @@
package main

import "testing"

// TestResolveBindHost pins the precedence: BIND_ADDR explicit > dev-mode
// fail-open default of 127.0.0.1 > production-shape empty (all interfaces).
//
// Mutation-test invariant: removing the IsDevModeFailOpen() branch makes
// "no_bindaddr_devmode_unset_admin" fail (returns "" instead of "127.0.0.1").
// Removing the BIND_ADDR branch makes "explicit_bindaddr_*" cases fail.
func TestResolveBindHost(t *testing.T) {
	cases := []struct {
		name       string
		bindAddr   string
		adminToken string
		molEnv     string
		want       string
	}{
		{
			name:       "no_bindaddr_devmode_unset_admin",
			bindAddr:   "",
			adminToken: "",
			molEnv:     "dev",
			want:       "127.0.0.1",
		},
		{
			name:       "no_bindaddr_devmode_unset_admin_full_word",
			bindAddr:   "",
			adminToken: "",
			molEnv:     "development",
			want:       "127.0.0.1",
		},
		{
			name:       "no_bindaddr_admin_set_in_dev_env",
			bindAddr:   "",
			adminToken: "secret",
			molEnv:     "dev",
			want:       "", // ADMIN_TOKEN flips IsDevModeFailOpen to false → all interfaces
		},
		{
			name:       "no_bindaddr_production_env",
			bindAddr:   "",
			adminToken: "",
			molEnv:     "production",
			want:       "", // production is not a dev value → all interfaces
		},
		{
			name:       "no_bindaddr_unset_env",
			bindAddr:   "",
			adminToken: "",
			molEnv:     "",
			want:       "", // unset MOLECULE_ENV → not dev → all interfaces
		},
		{
			name:       "explicit_bindaddr_loopback_overrides_devmode",
			bindAddr:   "127.0.0.1",
			adminToken: "",
			molEnv:     "dev",
			want:       "127.0.0.1",
		},
		{
			name:       "explicit_bindaddr_wildcard_overrides_devmode_default",
			bindAddr:   "0.0.0.0",
			adminToken: "",
			molEnv:     "dev",
			want:       "0.0.0.0",
		},
		{
			name:       "explicit_bindaddr_in_production",
			bindAddr:   "10.0.5.7",
			adminToken: "secret",
			molEnv:     "production",
			want:       "10.0.5.7",
		},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			t.Setenv("BIND_ADDR", tc.bindAddr)
			t.Setenv("ADMIN_TOKEN", tc.adminToken)
			t.Setenv("MOLECULE_ENV", tc.molEnv)
			got := resolveBindHost()
			if got != tc.want {
				t.Errorf("resolveBindHost() = %q, want %q (BIND_ADDR=%q ADMIN_TOKEN=%q MOLECULE_ENV=%q)",
					got, tc.want, tc.bindAddr, tc.adminToken, tc.molEnv)
			}
		})
	}
}
@@ -19,6 +19,7 @@ import (
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/handlers"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/imagewatch"
	memwiring "github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/wiring"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/middleware"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/pendinguploads"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/registry"
@@ -332,15 +333,23 @@ func main() {
	// Router
	r := router.Setup(hub, broadcaster, prov, platformURL, configsDir, wh, channelMgr, memBundle)

	// HTTP server with graceful shutdown
	// HTTP server with graceful shutdown.
	//
	// Bind host: in dev-mode (no ADMIN_TOKEN, MOLECULE_ENV=dev|development)
	// the AdminAuth chain fails open by design; pairing that with a wildcard
	// bind would expose unauth /workspaces to any same-LAN peer. Default to
	// loopback when fail-open is active. Operators who need LAN exposure set
	// BIND_ADDR=0.0.0.0 explicitly. Production (ADMIN_TOKEN set) is unchanged.
	// See molecule-core#7.
	bindHost := resolveBindHost()
	srv := &http.Server{
		Addr:    fmt.Sprintf(":%s", port),
		Addr:    fmt.Sprintf("%s:%s", bindHost, port),
		Handler: r,
	}

	// Start server in goroutine
	go func() {
		log.Printf("Platform starting on :%s", port)
		log.Printf("Platform starting on %s:%s (dev-mode-fail-open=%v)", bindHost, port, middleware.IsDevModeFailOpen())
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("Server failed: %v", err)
		}
@@ -375,6 +384,29 @@ func envOr(key, fallback string) string {
	return fallback
}

// resolveBindHost picks the listener interface for the HTTP server.
//
// Precedence:
// 1. BIND_ADDR — explicit operator override (any value, including "0.0.0.0").
// 2. dev-mode fail-open active → "127.0.0.1" (loopback only).
// 3. otherwise → "" (Go binds every interface; existing prod/self-host shape).
//
// Coupling the loopback default to middleware.IsDevModeFailOpen() means the
// two safety levers — bind narrowness and auth strength — move together. A
// production deploy (ADMIN_TOKEN set) keeps binding to all interfaces because
// the auth chain is doing its job; a dev Mac (no ADMIN_TOKEN, MOLECULE_ENV=dev)
// is reachable only via loopback because the auth chain is fail-open. See
// molecule-core#7 for the original LAN exposure finding.
func resolveBindHost() string {
	if v := os.Getenv("BIND_ADDR"); v != "" {
		return v
	}
	if middleware.IsDevModeFailOpen() {
		return "127.0.0.1"
	}
	return ""
}

func findConfigsDir() string {
	candidates := []string{
		"workspace-configs-templates",

@@ -413,11 +413,56 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
		return http.StatusOK, respBody, nil
	}

	// Mock-runtime short-circuit. Workspaces with runtime='mock' have
	// no container, no EC2, no URL — every reply is synthesised here
	// from a small canned-variant pool. Built for the "200-workspace
	// mock org" demo: a CEO/VPs/Managers/ICs hierarchy that renders
	// at scale on the canvas without burning real LLM credits or
	// provisioning 200 EC2 instances. See mock_runtime.go for the
	// full rationale + reply shape contract.
	//
	// Position: AFTER poll-mode (mock isn't a delivery mode, it's a
	// runtime; treating poll-set-on-mock as poll matches operator
	// intent if anyone ever does that), BEFORE resolveAgentURL (mock
	// has no URL — going through resolveAgentURL would 404 on the
	// SELECT url since the row is provisioned as NULL).
	if status, respBody, handled := h.handleMockA2A(ctx, workspaceID, callerID, body, a2aMethod, logActivity); handled {
		return status, respBody, nil
	}

	agentURL, proxyErr := h.resolveAgentURL(ctx, workspaceID)
	if proxyErr != nil {
		return 0, nil, proxyErr
	}

	// Pre-flight container-health check (#36). The dispatchA2A path below
	// does Docker-DNS forwarding to `ws-<wsShort>:8000` and only catches a
	// missing/dead container REACTIVELY via maybeMarkContainerDead in
	// handleA2ADispatchError. That works but costs the caller a full
	// network-timeout (2-30s) before the structured 503 surfaces.
	//
	// When we KNOW the workspace is container-backed (h.docker != nil + we
	// rewrite to Docker-DNS form below), do a single proactive
	// RunningContainerName lookup. If the container is genuinely missing,
	// short-circuit with the same structured 503 + async restart that
	// maybeMarkContainerDead would produce — but immediately, without the
	// network round-trip.
	//
	// Three outcomes of provisioner.RunningContainerName(ctx, h.docker, id):
	//   ("ws-<id>", nil) → forward as today.
	//   ("", nil)        → container is genuinely not running. Fast-503.
	//   ("", err)        → transient daemon error. Fall through to optimistic
	//                      forward — matches Provisioner.IsRunning's
	//                      (true, err) "fail-soft as alive" contract.
	//
	// Same SSOT as findRunningContainer (#10/#12). See AST gate
	// TestProxyA2A_RoutesThroughProvisionerSSOT.
	if h.provisioner != nil && platformInDocker && strings.HasPrefix(agentURL, "http://"+provisioner.ContainerName(workspaceID)+":") {
		if proxyErr := h.preflightContainerHealth(ctx, workspaceID); proxyErr != nil {
			return 0, nil, proxyErr
		}
	}

	startTime := time.Now()
	resp, cancelFwd, err := h.dispatchA2A(ctx, workspaceID, agentURL, body, callerID)
	if cancelFwd != nil {

@@ -198,6 +198,60 @@ func (h *WorkspaceHandler) maybeMarkContainerDead(ctx context.Context, workspace
	return true
}

// preflightContainerHealth runs a proactive Provisioner.IsRunning check
// (#36) before dispatching the a2a forward. Routed through provisioner's
// SSOT IsRunning, which itself wraps RunningContainerName — same source
// as findRunningContainer in the plugins handler (#10/#12).
//
// Returns nil when the forward should proceed:
//   - container is running, OR
//   - daemon errored transiently (matches IsRunning's (true, err)
//     "fail-soft as alive" contract — let the optimistic forward run
//     and reactive maybeMarkContainerDead catch a real failure).
//
// Returns a structured 503 + triggers the same async restart that
// maybeMarkContainerDead would produce, when:
//   - container is genuinely not running (NotFound / Exited / Created…).
//
// The point of running this BEFORE the forward is to save the caller
// 2-30s of network-timeout cost when the container is missing — a common
// shape post-EC2-replace (see molecule-controlplane#20 incident
// 2026-05-07) where the reconciler hasn't respawned the agent yet.
func (h *WorkspaceHandler) preflightContainerHealth(ctx context.Context, workspaceID string) *proxyA2AError {
	running, err := h.provisioner.IsRunning(ctx, workspaceID)
	if err != nil {
		// Transient daemon error. Provisioner.IsRunning returns (true, err)
		// in this case — fall through to the optimistic forward, reactive
		// maybeMarkContainerDead handles a real failure later.
		log.Printf("ProxyA2A preflight: IsRunning transient error for %s: %v (proceeding with forward)", workspaceID, err)
		return nil
	}
	if running {
		// Container is running — forward as today.
		return nil
	}
	// Container is genuinely not running. Mark offline + trigger restart
	// (same effect as maybeMarkContainerDead's branch), and return the
	// structured 503 immediately so the caller skips the forward.
	log.Printf("ProxyA2A preflight: container for %s is not running — marking offline and triggering restart (#36)", workspaceID)
	if _, dbErr := db.DB.ExecContext(ctx,
		`UPDATE workspaces SET status = $1, updated_at = now() WHERE id = $2 AND status NOT IN ('removed', 'provisioning')`,
		models.StatusOffline, workspaceID); dbErr != nil {
		log.Printf("ProxyA2A preflight: failed to mark workspace %s offline: %v", workspaceID, dbErr)
	}
	db.ClearWorkspaceKeys(ctx, workspaceID)
	h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOffline), workspaceID, map[string]interface{}{})
	go h.RestartByID(workspaceID)
	return &proxyA2AError{
		Status: http.StatusServiceUnavailable,
		Response: gin.H{
			"error":      "workspace container not running — restart triggered",
			"restarting": true,
			"preflight":  true, // distinguishes from reactive containerDead path
		},
	}
}

// logA2AFailure records a failed A2A attempt to activity_logs in a detached
// goroutine (the request context may already be done by the time it runs).
func (h *WorkspaceHandler) logA2AFailure(ctx context.Context, workspaceID, callerID string, body []byte, a2aMethod string, err error, durationMs int) {

workspace-server/internal/handlers/a2a_proxy_preflight_test.go (new file, 194 lines)
@@ -0,0 +1,194 @@
package handlers

import (
	"context"
	"errors"
	"go/ast"
	"go/parser"
	"go/token"
	"testing"

	"github.com/DATA-DOG/go-sqlmock"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/models"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
)

// preflightLocalProv is a controllable LocalProvisionerAPI stub for the
// preflight tests (#36). Other API methods panic to guard against tests
// that should be using a different stub.
type preflightLocalProv struct {
	running    bool
	err        error
	calls      int
	calledWith []string
}

func (p *preflightLocalProv) IsRunning(_ context.Context, workspaceID string) (bool, error) {
	p.calls++
	p.calledWith = append(p.calledWith, workspaceID)
	return p.running, p.err
}
func (p *preflightLocalProv) Start(_ context.Context, _ provisioner.WorkspaceConfig) (string, error) {
	panic("preflightLocalProv: Start not implemented")
}
func (p *preflightLocalProv) Stop(_ context.Context, _ string) error {
	panic("preflightLocalProv: Stop not implemented")
}
func (p *preflightLocalProv) ExecRead(_ context.Context, _, _ string) ([]byte, error) {
	panic("preflightLocalProv: ExecRead not implemented")
}
func (p *preflightLocalProv) RemoveVolume(_ context.Context, _ string) error {
	panic("preflightLocalProv: RemoveVolume not implemented")
}
func (p *preflightLocalProv) VolumeHasFile(_ context.Context, _, _ string) (bool, error) {
	panic("preflightLocalProv: VolumeHasFile not implemented")
}
func (p *preflightLocalProv) WriteAuthTokenToVolume(_ context.Context, _, _ string) error {
	panic("preflightLocalProv: WriteAuthTokenToVolume not implemented")
}

// TestPreflight_ContainerRunning_ReturnsNil — IsRunning(true,nil): forward
// proceeds. preflight returns nil → caller continues to dispatchA2A.
func TestPreflight_ContainerRunning_ReturnsNil(t *testing.T) {
	_ = setupTestDB(t)
	stub := &preflightLocalProv{running: true, err: nil}
	h := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
	h.provisioner = stub

	if err := h.preflightContainerHealth(context.Background(), "ws-running-123"); err != nil {
		t.Fatalf("preflight should return nil when container running, got %+v", err)
	}
	if stub.calls != 1 {
		t.Errorf("IsRunning should be called exactly once, got %d", stub.calls)
	}
	if len(stub.calledWith) != 1 || stub.calledWith[0] != "ws-running-123" {
		t.Errorf("IsRunning should be called with workspace id, got %v", stub.calledWith)
	}
}

// TestPreflight_ContainerNotRunning_StructuredFastFail — IsRunning(false,nil):
// preflight returns structured 503 with restarting=true + preflight=true, AND
// triggers the offline-flip + WORKSPACE_OFFLINE broadcast + async restart.
// This is the load-bearing case — saves the caller 2-30s of network timeout.
func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {
	mock := setupTestDB(t)
	_ = setupTestRedis(t)
	stub := &preflightLocalProv{running: false, err: nil}
	h := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
	h.provisioner = stub

	// Expect the offline-flip UPDATE.
	mock.ExpectExec(`UPDATE workspaces SET status =`).
		WithArgs(models.StatusOffline, "ws-dead-456").
		WillReturnResult(sqlmock.NewResult(0, 1))
	// Broadcaster's INSERT INTO structure_events fires too — best-effort
	// log entry for the WORKSPACE_OFFLINE event. Match permissively.
	mock.ExpectExec(`INSERT INTO structure_events`).
		WillReturnResult(sqlmock.NewResult(0, 1))

	proxyErr := h.preflightContainerHealth(context.Background(), "ws-dead-456")
	if proxyErr == nil {
		t.Fatal("preflight should return *proxyA2AError when container not running")
	}
	if proxyErr.Status != 503 {
		t.Errorf("expected 503, got %d", proxyErr.Status)
	}
	if got := proxyErr.Response["restarting"]; got != true {
		t.Errorf("response should mark restarting=true, got %v", got)
	}
	if got := proxyErr.Response["preflight"]; got != true {
		t.Errorf("response should mark preflight=true so callers can distinguish from reactive containerDead, got %v", got)
	}
	if got := proxyErr.Response["error"]; got != "workspace container not running — restart triggered" {
		t.Errorf("error message mismatch, got %q", got)
	}

	// Note: broadcaster firing is exercised by the production path's
	// h.broadcaster.RecordAndBroadcast call but not asserted here — the
	// real *events.Broadcaster doesn't expose received events for inspection.
	// The DB UPDATE expectation is sufficient to pin the offline-flip path.
}

// TestPreflight_TransientError_FailsSoftAsAlive — IsRunning(true,err): the
// (true, err) "fail-soft" contract — preflight returns nil so the optimistic
// forward runs; reactive maybeMarkContainerDead handles a real failure later.
// This pin is critical: a flaky daemon must NOT trigger a restart cascade.
func TestPreflight_TransientError_FailsSoftAsAlive(t *testing.T) {
	_ = setupTestDB(t)
	stub := &preflightLocalProv{running: true, err: errors.New("docker daemon EOF")}
	h := NewWorkspaceHandler(newTestBroadcaster(), nil, "http://localhost:8080", t.TempDir())
	h.provisioner = stub

	if err := h.preflightContainerHealth(context.Background(), "ws-flaky-789"); err != nil {
		t.Fatalf("preflight should return nil on transient error (fail-soft), got %+v", err)
	}
	// No DB UPDATE expected — sqlmock would complain about unexpected calls
	// at test cleanup if the offline-flip path fired.
}

// TestProxyA2A_Preflight_RoutesThroughProvisionerSSOT — AST gate (#36 mirror
// of #12's gate). Pins the invariant that preflightContainerHealth uses the
// SSOT Provisioner.IsRunning helper, NOT a parallel docker.ContainerInspect
// of its own.
//
// Mutation invariant: if a future PR replaces h.provisioner.IsRunning with
// a direct cli.ContainerInspect call, this test fails. That's the signal to
// either (a) extend Provisioner.IsRunning's contract OR (b) document why
// this call site needs to differ. Either way, the drift gets a reviewer's
// attention instead of shipping silently.
func TestProxyA2A_Preflight_RoutesThroughProvisionerSSOT(t *testing.T) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "a2a_proxy_helpers.go", nil, parser.ParseComments)
	if err != nil {
		t.Fatalf("parse a2a_proxy_helpers.go: %v", err)
	}

	var fn *ast.FuncDecl
	ast.Inspect(file, func(n ast.Node) bool {
		f, ok := n.(*ast.FuncDecl)
		if !ok || f.Name.Name != "preflightContainerHealth" {
			return true
		}
		fn = f
		return false
	})
	if fn == nil {
		t.Fatal("preflightContainerHealth not found — was it renamed? update this gate or the SSOT routing assumption")
	}

	var (
		callsIsRunning                  bool
		callsContainerInspectRaw        bool
		callsRunningContainerNameDirect bool
	)
	ast.Inspect(fn.Body, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		switch sel.Sel.Name {
		case "IsRunning":
			callsIsRunning = true
		case "ContainerInspect":
			callsContainerInspectRaw = true
		case "RunningContainerName":
			// Direct RunningContainerName is also acceptable SSOT — but
			// preferring IsRunning keeps the (bool, error) contract that
			// already exists in the helper API surface.
			callsRunningContainerNameDirect = true
		}
		return true
	})

	if !callsIsRunning && !callsRunningContainerNameDirect {
		t.Errorf("preflightContainerHealth must call provisioner.IsRunning OR provisioner.RunningContainerName for the SSOT health check — see molecule-core#36. Found neither.")
	}
	if callsContainerInspectRaw {
		t.Errorf("preflightContainerHealth carries a direct ContainerInspect call. This is the parallel-impl drift molecule-core#36 fixed. " +
			"Either route through provisioner.IsRunning OR — if a new use case truly needs a different inspect — extend the helper's contract first and update this gate to allow the specific delta.")
	}
}
workspace-server/internal/handlers/mock_runtime.go (new file, 223 lines)
@@ -0,0 +1,223 @@
package handlers

// mock_runtime.go — "mock" runtime: a virtual workspace that has no
// container, no EC2, no LLM, just hardcoded canned A2A replies. Built
// for the funding-demo "200-workspace mock org" so hongming can show
// investors a CEO/VPs/Managers/ICs hierarchy at scale without burning
// 200 EC2 instances or 200 Anthropic keys.
//
// Wire model:
//   - org template declares `runtime: mock` on every workspace
//   - createWorkspaceTree skips provisioning, sets status='online'
//     directly (mirrors the `external` short-circuit, minus the URL +
//     awaiting_agent dance)
//   - proxyA2ARequest short-circuits on a mock-runtime target and
//     returns a canned JSON-RPC reply; never calls resolveAgentURL,
//     never opens an HTTP connection, never touches Docker/EC2
//
// The reply is JSON-RPC 2.0 + a2a-sdk v0.3 shape so the canvas's
// extractAgentText / extractTextsFromParts read it without any
// special-casing. We rotate over a small variant pool so a screen
// full of replies doesn't all read identical — gives the demo a bit
// of life without pretending to be a real agent.

import (
	"context"
	"crypto/sha1"
	"database/sql"
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"net/http"
	"strings"
	"time"

	"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
	"github.com/gin-gonic/gin"
	"github.com/google/uuid"
)

// MockRuntimeName is the canonical runtime string a workspace row
// carries to opt into the canned-reply short-circuit. Kept as a const
// so the proxy's runtime-check + the org-import skip-block reference
// the same literal.
const MockRuntimeName = "mock"

// mockReplyVariants is the pool of canned strings the mock runtime
// rotates through. Picked to read like a busy-but-short reply from a
// real human in a hierarchy — a CEO would NOT respond with "On it!",
// but for the demo every node is shown to be reachable, so we lean
// into the variety. Variant selection is deterministic per
// (workspaceID, request-id) pair so a screen recording replays the
// same reply for the same input.
var mockReplyVariants = []string{
	"On it!",
	"Got it, on it now.",
	"On it, boss.",
	"Working on it.",
	"Acknowledged — on it.",
	"On it, will report back.",
	"Roger that, on it.",
	"Copy that. On it.",
	"On it — ETA shortly.",
	"On it. Standby for update.",
}

// pickMockReply returns a canned reply for the given workspaceID +
// requestID. Deterministic so the same (workspace, message-id) pair
// always picks the same variant — useful for screen recordings and
// flake-free e2e snapshots. Falls back to variant[0] if the inputs
// are empty.
func pickMockReply(workspaceID, requestID string) string {
	if len(mockReplyVariants) == 0 {
		return "On it!"
	}
	if workspaceID == "" && requestID == "" {
		return mockReplyVariants[0]
	}
	h := sha1.Sum([]byte(workspaceID + ":" + requestID))
	idx := int(binary.BigEndian.Uint32(h[0:4]) % uint32(len(mockReplyVariants)))
	return mockReplyVariants[idx]
}
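The hash-and-index scheme above can be exercised as a standalone sketch. The `pick` helper and the shortened pool below are hypothetical stand-ins for illustration, not part of the PR:

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
)

// pick is a hypothetical standalone copy of the selection scheme:
// SHA-1 the "workspaceID:requestID" pair, take the first 4 bytes
// big-endian, and index the pool modulo its length.
func pick(variants []string, workspaceID, requestID string) string {
	if len(variants) == 0 {
		return ""
	}
	h := sha1.Sum([]byte(workspaceID + ":" + requestID))
	return variants[int(binary.BigEndian.Uint32(h[:4])%uint32(len(variants)))]
}

func main() {
	pool := []string{"On it!", "Working on it.", "Copy that. On it."}
	// Same inputs always land on the same variant.
	fmt.Println(pick(pool, "ws-1", "req-A") == pick(pool, "ws-1", "req-A"))
}
```

Because the index depends only on the two IDs, no state has to be stored to replay a demo session identically.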

// lookupRuntime returns the workspace's runtime string. Empty when the
// row is missing / DB hiccup so callers fall through to the existing
// dispatch path (which will then 404 / 502 normally). Fail-open here
// because a transient DB error must not silently flip a real workspace
// into mock-mode and start handing out canned replies in place of
// genuine agent traffic.
func lookupRuntime(ctx context.Context, workspaceID string) string {
	var runtime sql.NullString
	err := db.DB.QueryRowContext(ctx,
		`SELECT runtime FROM workspaces WHERE id = $1`, workspaceID,
	).Scan(&runtime)
	if err != nil {
		if !errors.Is(err, sql.ErrNoRows) {
			log.Printf("ProxyA2A: lookupRuntime(%s) failed (%v) — falling through to dispatch path", workspaceID, err)
		}
		return ""
	}
	if !runtime.Valid {
		return ""
	}
	return runtime.String
}

// buildMockA2AResponse synthesises a JSON-RPC 2.0 success envelope that
// matches the a2a-sdk v0.3 reply shape the canvas's extractAgentText
// already understands: `{result: {parts: [{kind: "text", text: ...}]}}`.
// `requestID` is the JSON-RPC `id` of the inbound request — A2A
// implementations echo it on the reply so callers can correlate. We
// extract it from the normalized payload in the caller and pass it in
// here so this function stays JSON-only (no payload parsing).
//
// Returns marshalled bytes ready to write straight to the HTTP body.
// Marshal failure is logged + a tiny fallback envelope returned, since
// failing the whole request because of a JSON encoding hiccup on a
// constant-shaped payload would defeat the "mock always works" guarantee.
func buildMockA2AResponse(workspaceID, requestID, replyText string) []byte {
	if requestID == "" {
		requestID = uuid.New().String()
	}
	envelope := map[string]any{
		"jsonrpc": "2.0",
		"id":      requestID,
		"result": map[string]any{
			"parts": []map[string]any{
				{"kind": "text", "text": replyText},
			},
		},
	}
	out, err := json.Marshal(envelope)
	if err != nil {
		log.Printf("ProxyA2A: mock-runtime response marshal failed for %s: %v — emitting fallback", workspaceID, err)
		// Hand-rolled minimal envelope. Safe because every value is a
		// hardcoded constant string with no characters that need
		// escaping in a JSON string literal.
		fallback := fmt.Sprintf(
			`{"jsonrpc":"2.0","id":%q,"result":{"parts":[{"kind":"text","text":%q}]}}`,
			requestID, replyText,
		)
		return []byte(fallback)
	}
	return out
}

// extractRequestID pulls the JSON-RPC `id` out of an already-normalized
// A2A payload. Returns "" when the field is absent or not a string or
// number — caller substitutes a fresh UUID. Tolerant of every shape
// normalizeA2APayload could produce.
func extractRequestID(body []byte) string {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(body, &top); err != nil {
		return ""
	}
	raw, ok := top["id"]
	if !ok {
		return ""
	}
	var s string
	if json.Unmarshal(raw, &s) == nil {
		return s
	}
	// JSON-RPC permits numeric IDs too; canvas issues UUIDs but be
	// defensive against alternative SDKs.
	var n json.Number
	if json.Unmarshal(raw, &n) == nil {
		return n.String()
	}
	return ""
}
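The string-then-number fallback chain can be shown in isolation. The `requestID` helper below is a hypothetical standalone sketch of that chain, not the handler's actual function:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// requestID is a hypothetical standalone sketch of the id-extraction
// fallback: try a string id first, then a numeric id via json.Number,
// else return "" so the caller can mint a fresh one.
func requestID(body []byte) string {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(body, &top); err != nil {
		return ""
	}
	raw, ok := top["id"]
	if !ok {
		return ""
	}
	var s string
	if json.Unmarshal(raw, &s) == nil {
		return s
	}
	var n json.Number
	if json.Unmarshal(raw, &n) == nil {
		return n.String()
	}
	return ""
}

func main() {
	fmt.Println(requestID([]byte(`{"jsonrpc":"2.0","id":"req-7"}`))) // req-7
	fmt.Println(requestID([]byte(`{"jsonrpc":"2.0","id":42}`)))      // 42
}
```

Decoding into json.Number rather than float64 preserves large or high-precision numeric ids exactly as they appeared on the wire.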

// handleMockA2A is the proxy short-circuit for mock-runtime workspaces.
// Returns (status, body, true) when the target is mock — caller writes
// the response and returns. Returns (_, _, false) when the target is
// not mock — caller continues to the real dispatch path.
//
// Side-effects: writes a synthetic activity_logs row via logA2ASuccess
// when logActivity is true so the canvas's "Agent Comms" tab shows the
// mock reply in the trace alongside real-agent traffic. Without this
// the demo would render messages on the canvas chat panel but a peer
// node clicking through to its activity tab would see an empty list.
func (h *WorkspaceHandler) handleMockA2A(ctx context.Context, workspaceID, callerID string, body []byte, a2aMethod string, logActivity bool) (int, []byte, bool) {
	if lookupRuntime(ctx, workspaceID) != MockRuntimeName {
		return 0, nil, false
	}
	requestID := extractRequestID(body)
	replyText := pickMockReply(workspaceID, requestID)
	respBody := buildMockA2AResponse(workspaceID, requestID, replyText)

	// Tiny artificial delay so the canvas chat UI has time to render
	// the user's outgoing bubble before the agent reply appears.
	// Without it the reply lands in the same animation frame and feels
	// robotic. 80ms is too short to read as "real" thinking time, but
	// long enough to mask the React double-render race that drops the
	// user bubble entirely on slow machines (observed locally on M1
	// Air, 2026-05-07). Staying below 200ms keeps a 200-node demo
	// snappy when investors fan out 30 messages at once.
	time.Sleep(80 * time.Millisecond)

	if logActivity {
		// Reuse the existing success-logger so the activity feed shape
		// is identical to a real agent reply. Status 200 + duration 0
		// is the "synthesised reply" marker; activity_logs.duration_ms
		// being 0 is harmless (real fast paths can hit 0 too).
		h.logA2ASuccess(ctx, workspaceID, callerID, body, respBody, a2aMethod, http.StatusOK, 0)
	}
	return http.StatusOK, respBody, true
}

// IsMockRuntime is a small public helper for callers outside this
// package (tests, the org importer) that need to ask the question
// without depending on the unexported constant. Trims + lower-cases
// so a typoed YAML cell like " Mock " still resolves correctly.
func IsMockRuntime(runtime string) bool {
	return strings.EqualFold(strings.TrimSpace(runtime), MockRuntimeName)
}

// gin import is unused at file scope but kept as a tag so a future
// addition of a thin HTTP handler (e.g. POST /workspaces/:id/mock/replies
// for an admin-set custom reply pool) doesn't need an import re-order.
var _ = gin.H{}

workspace-server/internal/handlers/mock_runtime_test.go (new file, 266 lines)
@@ -0,0 +1,266 @@
package handlers

// mock_runtime_test.go — locks the contract for the mock-runtime
// short-circuit added for the funding-demo "200-workspace mock org"
// template. Three invariants:
//
//  1. ProxyA2A on a workspace with runtime='mock' must return 200
//     with a JSON-RPC reply containing one text part. NO HTTP
//     dispatch, NO resolveAgentURL DB read (mock workspaces have
//     no URL — that read would 404 and break the demo).
//
//  2. The reply text must be one of the canned variants and must be
//     deterministic for a given (workspace_id, request_id) pair so
//     screen recordings replay identically.
//
//  3. Workspaces with runtime != 'mock' must NOT be affected — the
//     mock check fails fast and falls through to the existing
//     dispatch path. Same kind of regression guard the poll-mode
//     tests carry.

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"

	"github.com/DATA-DOG/go-sqlmock"
	"github.com/gin-gonic/gin"
)

// TestProxyA2A_MockRuntime_ReturnsCannedReply is the happy-path
// contract. A workspace flagged runtime='mock' must:
//   - return 200 with JSON-RPC envelope {result:{parts:[{kind:text,text:...}]}}
//   - not dispatch HTTP (no SELECT url SQL expected)
//   - reply text is one of mockReplyVariants
func TestProxyA2A_MockRuntime_ReturnsCannedReply(t *testing.T) {
	mock := setupTestDB(t)
	setupTestRedis(t)
	broadcaster := newTestBroadcaster()
	handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())

	const wsID = "ws-mock-canned"

	// Budget check fires before runtime lookup (same as the poll-mode
	// short-circuit) — keeps mock workspaces honest if a tenant ever
	// sets a budget on one. Unlikely on a demo, but the guard stays
	// uniform so future "monthly_spend on mock = 0" assertions don't
	// drift.
	expectBudgetCheck(mock, wsID)

	// lookupDeliveryMode runs first — return push so the poll
	// short-circuit doesn't fire and we hit the mock check.
	mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
		WithArgs(wsID).
		WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))

	// lookupRuntime SELECT — returns 'mock', triggering the canned-reply
	// short-circuit. CRITICAL: NO ExpectQuery for `SELECT url, status
	// FROM workspaces` (resolveAgentURL's query). If the short-circuit
	// fails to fire, sqlmock will surface "unexpected query" on the URL
	// SELECT and the test fails loudly — that's the dispatch-leak detector.
	mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
		WithArgs(wsID).
		WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("mock"))

	// Activity log: logA2ASuccess writes the synthetic reply to
	// activity_logs so the canvas's Agent Comms tab shows it alongside
	// real-agent traffic.
	mock.ExpectExec("INSERT INTO activity_logs").
		WillReturnResult(sqlmock.NewResult(0, 1))

	w := httptest.NewRecorder()
	c, _ := gin.CreateTestContext(w)
	c.Params = gin.Params{{Key: "id", Value: wsID}}

	body := `{"jsonrpc":"2.0","id":"req-mock-1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hello mock"}]}}}`
	c.Request = httptest.NewRequest("POST", "/workspaces/"+wsID+"/a2a", bytes.NewBufferString(body))
	c.Request.Header.Set("Content-Type", "application/json")

	handler.ProxyA2A(c)

	// logA2ASuccess fires async — give it a moment to settle so
	// ExpectationsWereMet doesn't flake.
	time.Sleep(200 * time.Millisecond)

	if w.Code != http.StatusOK {
		t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
	}
	var resp map[string]interface{}
	if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
		t.Fatalf("response is not valid JSON: %v", err)
	}
	if resp["jsonrpc"] != "2.0" {
		t.Errorf("response.jsonrpc = %v, want 2.0", resp["jsonrpc"])
	}
	if resp["id"] != "req-mock-1" {
		t.Errorf("response.id = %v, want %q (echoed from request)", resp["id"], "req-mock-1")
	}
	result, _ := resp["result"].(map[string]interface{})
	if result == nil {
		t.Fatalf("response.result missing or wrong type: %v", resp["result"])
	}
	parts, _ := result["parts"].([]interface{})
	if len(parts) != 1 {
		t.Fatalf("expected exactly one part, got %d: %v", len(parts), parts)
	}
	part, _ := parts[0].(map[string]interface{})
	if part["kind"] != "text" {
		t.Errorf("part.kind = %v, want text", part["kind"])
	}
	text, _ := part["text"].(string)
	if text == "" {
		t.Error("part.text is empty — canned reply not populated")
	}
	// Reply must be one of the variants.
	matched := false
	for _, v := range mockReplyVariants {
		if v == text {
			matched = true
			break
		}
	}
	if !matched {
		t.Errorf("reply text %q is not in mockReplyVariants", text)
	}

	if err := mock.ExpectationsWereMet(); err != nil {
		t.Errorf("unmet sqlmock expectations: %v", err)
	}
}

// TestProxyA2A_NonMockRuntime_NoShortCircuit verifies the symmetric
// contract: a workspace with a real runtime (claude-code, hermes, etc.)
// must NOT be affected by the mock check — it falls through to the
// real dispatch path. Without this guard, a regression in
// lookupRuntime could silently flip every workspace into mock-mode
// and start handing out canned replies in place of real-agent traffic.
func TestProxyA2A_NonMockRuntime_NoShortCircuit(t *testing.T) {
	mock := setupTestDB(t)
	mr := setupTestRedis(t)
	allowLoopbackForTest(t)
	broadcaster := newTestBroadcaster()
	handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())

	const wsID = "ws-real-runtime"

	dispatched := false
	agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		dispatched = true
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"jsonrpc":"2.0","id":"1","result":{"status":"ok"}}`))
	}))
	defer agentServer.Close()
	mr.Set("ws:"+wsID+":url", agentServer.URL)

	expectBudgetCheck(mock, wsID)

	// poll-mode SELECT — return push so we proceed past the poll
	// short-circuit.
	mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
		WithArgs(wsID).
		WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))

	// runtime SELECT — return claude-code so the mock check falls
	// through.
	mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
		WithArgs(wsID).
		WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))

	mock.ExpectExec("INSERT INTO activity_logs").
		WillReturnResult(sqlmock.NewResult(0, 1))

	w := httptest.NewRecorder()
	c, _ := gin.CreateTestContext(w)
	c.Params = gin.Params{{Key: "id", Value: wsID}}
	body := `{"jsonrpc":"2.0","id":"real-1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hi"}]}}}`
	c.Request = httptest.NewRequest("POST", "/workspaces/"+wsID+"/a2a", bytes.NewBufferString(body))
	c.Request.Header.Set("Content-Type", "application/json")

	handler.ProxyA2A(c)

	time.Sleep(50 * time.Millisecond)

	if w.Code != http.StatusOK {
		t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
	}
	if !dispatched {
		t.Error("non-mock runtime: expected the agent server to receive the request, but it did not — mock short-circuit may be over-firing")
	}
	if err := mock.ExpectationsWereMet(); err != nil {
		t.Errorf("unmet sqlmock expectations: %v", err)
	}
}

// TestPickMockReply_Deterministic locks the determinism contract:
// the same (workspaceID, requestID) input must yield the same variant
// every call. Required for screen recordings + flake-free e2e
// snapshots.
func TestPickMockReply_Deterministic(t *testing.T) {
	cases := []struct {
		ws, req string
	}{
		{"ws-1", "req-A"},
		{"ws-1", "req-B"},
		{"ws-2", "req-A"},
		{"", ""},
	}
	for _, tc := range cases {
		first := pickMockReply(tc.ws, tc.req)
		for i := 0; i < 10; i++ {
			next := pickMockReply(tc.ws, tc.req)
			if next != first {
				t.Errorf("pickMockReply(%q,%q) is not deterministic: got %q then %q",
					tc.ws, tc.req, first, next)
			}
		}
	}
}

// TestIsMockRuntime_TrimsAndCaseInsensitive — typos and stray
// whitespace in YAML must still resolve to mock so a single
// runtime: " Mock " entry doesn't silently get dispatched.
func TestIsMockRuntime_TrimsAndCaseInsensitive(t *testing.T) {
	cases := map[string]bool{
		"mock":        true,
		"MOCK":        true,
		" Mock ":      true,
		"mocky":       false,
		"":            false,
		"external":    false,
		"claude-code": false,
	}
	for in, want := range cases {
		if got := IsMockRuntime(in); got != want {
			t.Errorf("IsMockRuntime(%q) = %v, want %v", in, got, want)
		}
	}
}

// TestBuildMockA2AResponse_EchoesRequestID — JSON-RPC requires the
// reply id to match the request id so callers can correlate. Mock
// must hold this contract or canvas's correlation logic breaks.
func TestBuildMockA2AResponse_EchoesRequestID(t *testing.T) {
	out := buildMockA2AResponse("ws-x", "req-echo-7", "On it!")
	var resp map[string]interface{}
	if err := json.Unmarshal(out, &resp); err != nil {
		t.Fatalf("response is not valid JSON: %v", err)
	}
	if resp["id"] != "req-echo-7" {
		t.Errorf("id = %v, want req-echo-7", resp["id"])
	}
	if resp["jsonrpc"] != "2.0" {
		t.Errorf("jsonrpc = %v, want 2.0", resp["jsonrpc"])
	}
	result, _ := resp["result"].(map[string]interface{})
	parts, _ := result["parts"].([]interface{})
	if len(parts) != 1 {
		t.Fatalf("expected 1 part, got %d", len(parts))
	}
	p, _ := parts[0].(map[string]interface{})
	if p["text"] != "On it!" {
		t.Errorf("part.text = %v, want On it!", p["text"])
	}
}

@@ -250,6 +250,21 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
		h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
			"name": ws.Name, "external": true,
		})
	} else if IsMockRuntime(runtime) {
		// Mock-runtime workspaces have no container, no EC2, no URL —
		// the proxyA2ARequest short-circuit synthesises every reply
		// from a canned variant pool (see mock_runtime.go). Status
		// goes straight to 'online' so the canvas renders the node
		// as reachable + the chat tab's send button is enabled. No
		// URL is set; the proxy never tries to resolve one for mock
		// runtimes. Built for the funding-demo "200-workspace mock
		// org" template — visual scale without real backend cost.
		if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1 WHERE id = $2`, models.StatusOnline, id); err != nil {
			log.Printf("Org import: mock workspace status update failed for %s: %v", ws.Name, err)
		}
		h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
			"name": ws.Name, "mock": true, "runtime": runtime,
		})
	} else if h.workspace.HasProvisioner() {
		// Provision container — either backend (CP for SaaS, local Docker
		// for self-hosted) is fine. Pre-2026-05-05 this gate was

@@ -675,8 +690,24 @@ func (h *OrgHandler) recurseChildrenForImport(ws OrgWorkspace, parentID string,
		if err := h.createWorkspaceTree(child, &parentID, childAbsX, childAbsY, slotX, slotY, defaults, orgBaseDir, results, provisionSem); err != nil {
			return err
		}
		// Pacing exists to throttle the thundering herd of Docker
		// container spawns during a self-hosted import. Mock-runtime
		// children spawn no container — no Docker pressure, no LLM
		// bursts, just DB inserts + a broadcast. Skipping the 2s sleep
		// collapses a 200-workspace mock-org import from ~7min → ~5s,
		// which is the difference between a snappy demo and a "did it
		// freeze?" staring contest. Real (containerful) runtimes still
		// pace.
		// Inheritance: if the child itself doesn't declare a runtime,
		// fall back to defaults.runtime — the org template sets
		// runtime: mock once at the org level, not on every IC node.
		childRuntime := child.Runtime
		if childRuntime == "" {
			childRuntime = defaults.Runtime
		}
		if !IsMockRuntime(childRuntime) {
			time.Sleep(workspaceCreatePacingMs * time.Millisecond)
		}
	}
	return nil
}

@@ -4,6 +4,7 @@ import (
	"bytes"
	"context"
	"io"
	"log"
	"os"
	"path/filepath"
	"strings"

@@ -177,16 +178,42 @@ func strDefault(m map[string]interface{}, key, fallback string) string {
	return fallback
}

// findRunningContainer returns the live container name for workspaceID, or ""
// when the container is genuinely not running OR the daemon errored
// transiently. Routed through provisioner.RunningContainerName as the SSOT
// (molecule-core#10) so this handler agrees with healthsweep on the same
// inputs. Transient daemon errors are logged distinctly so triage doesn't
// confuse a flaky daemon with a stopped container.
func (h *PluginsHandler) findRunningContainer(ctx context.Context, workspaceID string) string {
	if h.docker == nil {
		return ""
	}
	name, err := provisioner.RunningContainerName(ctx, h.docker, workspaceID)
	if err != nil {
		log.Printf("plugins: docker inspect transient error for %s: %v (treating as not-running for this request)", workspaceID, err)
		return ""
	}
	return name
}

// isExternalRuntime reports whether the workspace's runtime is the
// `external` (remote-pull) shape introduced in Phase 30. External
// workspaces have no local container — `POST /plugins` (push-install via
// docker exec) doesn't apply to them; they pull via the download endpoint
// instead. Returns false (allow-install) if the lookup is unwired or
// errors — failing open here is safe because the downstream
// findRunningContainer step still gates on a real container being there.
//
// Background — molecule-core#10: without this check, external workspaces
// fall through to findRunningContainer's NotFound path and return a
// misleading 503 "container not running" instead of a clear "use the
// pull endpoint" message.
func (h *PluginsHandler) isExternalRuntime(workspaceID string) bool {
	if h.runtimeLookup == nil {
		return false
	}
	runtime, err := h.runtimeLookup(workspaceID)
	if err != nil {
		return false
	}
	return runtime == "external"
}

func (h *PluginsHandler) execAsRoot(ctx context.Context, containerName string, cmd []string) (string, error) {

@@ -0,0 +1,176 @@
package handlers

import (
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
	"testing"
)

// TestFindRunningContainer_RoutesThroughProvisionerSSOT is a behavior-based
// AST gate: it pins the invariant that PluginsHandler.findRunningContainer
// MUST go through provisioner.RunningContainerName for its is-running check,
// instead of carrying its own copy of cli.ContainerInspect logic.
//
// Background — molecule-core#10: a parallel impl of "is the workspace's
// container running" used to live in plugins.go. It drifted from the
// canonical impl in healthsweep (which goes through Provisioner.IsRunning
// → RunningContainerName) on edge cases like "transient daemon error" —
// the duplicate would 503 with a misleading message while healthsweep
// correctly stayed defensive. Consolidating onto RunningContainerName as
// the SSOT prevents any future copy from re-introducing that drift.
//
// Mutation invariant: if a future PR replaces the provisioner call with
// `h.docker.ContainerInspect(...)` directly, this test fails. That's the
// signal to either (a) extend RunningContainerName's contract OR (b)
// document why this call site needs to differ. Either way: the drift
// gets a reviewer's attention instead of shipping silently.
func TestFindRunningContainer_RoutesThroughProvisionerSSOT(t *testing.T) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "plugins.go", nil, parser.ParseComments)
	if err != nil {
		t.Fatalf("parse plugins.go: %v", err)
	}

	var fn *ast.FuncDecl
	ast.Inspect(file, func(n ast.Node) bool {
		f, ok := n.(*ast.FuncDecl)
		if !ok || f.Name.Name != "findRunningContainer" {
			return true
		}
		// Confirm the receiver is *PluginsHandler so we don't pick up an
		// unrelated helper of the same name. FuncDecl.Recv is a
		// *ast.FieldList — receivers carry at most one field.
		if f.Recv == nil || len(f.Recv.List) == 0 {
			return true
		}
		fn = f
		return false
	})

	if fn == nil {
		t.Fatal("findRunningContainer not found in plugins.go — was it renamed? update this test or the SSOT routing assumption")
	}

	var (
		callsRunningContainerName bool
		callsContainerInspectRaw  bool
	)
	ast.Inspect(fn.Body, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		// Pkg.Func form: provisioner.RunningContainerName(...)
		if pkgIdent, ok := sel.X.(*ast.Ident); ok {
			if pkgIdent.Name == "provisioner" && sel.Sel.Name == "RunningContainerName" {
				callsRunningContainerName = true
			}
		}
		// Receiver-then-method form: h.docker.ContainerInspect(...) /
		// p.cli.ContainerInspect(...) — anything ending in
		// .ContainerInspect that's NOT routed through provisioner.
		if sel.Sel.Name == "ContainerInspect" {
			callsContainerInspectRaw = true
		}
		return true
	})

	if !callsRunningContainerName {
		t.Errorf(
			"findRunningContainer must call provisioner.RunningContainerName for the SSOT inspect — see molecule-core#10. Found no such call.",
		)
	}
	if callsContainerInspectRaw {
		t.Errorf(
			"findRunningContainer carries a direct ContainerInspect call. This is the parallel-impl drift molecule-core#10 fixed. " +
				"Either route through provisioner.RunningContainerName OR — if a new use case truly needs a different inspect — extend RunningContainerName's contract first and update this gate to allow the specific delta.",
		)
	}
}

// TestProvisionerIsRunning_RoutesThroughRunningContainerName mirrors the
// gate above but for the OTHER consumer of the SSOT — Provisioner.IsRunning
// (called by healthsweep). If a future refactor makes IsRunning carry its
// own ContainerInspect again, the two consumers' edge-case behaviors will
// silently drift. Keep them yoked.
func TestProvisionerIsRunning_RoutesThroughRunningContainerName(t *testing.T) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "../provisioner/provisioner.go", nil, parser.ParseComments)
	if err != nil {
		t.Fatalf("parse provisioner.go: %v", err)
	}

	var fn *ast.FuncDecl
	ast.Inspect(file, func(n ast.Node) bool {
		f, ok := n.(*ast.FuncDecl)
		if !ok || f.Name.Name != "IsRunning" || f.Recv == nil {
			return true
		}
		// The receiver type must be *Provisioner specifically. CPProvisioner
		// has its own IsRunning that talks HTTP to the controlplane and is
		// out of scope for this gate.
		if !receiverIs(f, "Provisioner") {
			return true
		}
		fn = f
		return false
	})
	if fn == nil {
		t.Fatal("Provisioner.IsRunning not found — was it renamed? update this test")
	}

	var (
		callsRunningContainerName bool
		callsContainerInspectRaw  bool
	)
	ast.Inspect(fn.Body, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		// Same-package call: bare identifier (e.g. RunningContainerName(...)).
		if id, ok := call.Fun.(*ast.Ident); ok && id.Name == "RunningContainerName" {
			callsRunningContainerName = true
			return true
		}
		// Selector call: pkg.Func (e.g. provisioner.RunningContainerName)
		// OR recv.Method (e.g. p.cli.ContainerInspect).
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		switch sel.Sel.Name {
		case "RunningContainerName":
			callsRunningContainerName = true
		case "ContainerInspect":
			callsContainerInspectRaw = true
		}
		return true
	})

	if !callsRunningContainerName {
		t.Errorf("Provisioner.IsRunning must call RunningContainerName for the SSOT inspect — see molecule-core#10")
	}
	if callsContainerInspectRaw {
		t.Errorf("Provisioner.IsRunning carries a direct ContainerInspect call; route through RunningContainerName instead")
	}
}

// receiverIs reports whether fn's receiver is `*<typeName>` or `<typeName>`.
func receiverIs(fn *ast.FuncDecl, typeName string) bool {
	if fn.Recv == nil || len(fn.Recv.List) == 0 {
		return false
	}
	expr := fn.Recv.List[0].Type
	if star, ok := expr.(*ast.StarExpr); ok {
		expr = star.X
	}
	id, ok := expr.(*ast.Ident)
	return ok && strings.EqualFold(id.Name, typeName)
}
|
||||
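Aside, not part of the diff: the receiver-matching trick these AST gates rely on is easy to poke at in isolation. A minimal standalone sketch, assuming nothing from this PR (`methodsOn` and `srcDemo` are illustrative names, not code from the repo):

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

const srcDemo = `package p

type Provisioner struct{}

func (p *Provisioner) IsRunning() bool { return true }

func Free() {}
`

// methodsOn parses src and returns the names of functions whose receiver
// type is *typeName or typeName — the same receiver check the gate's
// receiverIs helper performs.
func methodsOn(src, typeName string) []string {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		panic(err)
	}
	var out []string
	ast.Inspect(file, func(n ast.Node) bool {
		fn, ok := n.(*ast.FuncDecl)
		if !ok || fn.Recv == nil || len(fn.Recv.List) == 0 {
			return true // not a method declaration
		}
		expr := fn.Recv.List[0].Type
		if star, ok := expr.(*ast.StarExpr); ok {
			expr = star.X // unwrap pointer receivers: *Provisioner -> Provisioner
		}
		if id, ok := expr.(*ast.Ident); ok && strings.EqualFold(id.Name, typeName) {
			out = append(out, fn.Name.Name)
		}
		return true
	})
	return out
}

func main() {
	fmt.Println(methodsOn(srcDemo, "Provisioner")) // [IsRunning]
}
```

The same unwrap-then-compare shape is what keeps CPProvisioner's HTTP-backed IsRunning out of the gate's scope.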
@@ -32,6 +32,18 @@ import (
 // inside the workspace at startup.
 func (h *PluginsHandler) Install(c *gin.Context) {
 	workspaceID := c.Param("id")
+	// External-runtime guard (molecule-core#10): push-install via docker
+	// exec is meaningless for `runtime='external'` workspaces — they have
+	// no local container. Reject early with a hint pointing at the
+	// pull-mode endpoint, instead of falling through to a misleading
+	// "container not running" 503 from findRunningContainer.
+	if h.isExternalRuntime(workspaceID) {
+		c.JSON(http.StatusUnprocessableEntity, gin.H{
+			"error": "plugin install via push is not supported for external runtimes",
+			"hint":  "external workspaces pull plugins via GET /workspaces/:id/plugins/:name/download",
+		})
+		return
+	}
 	// Cap the JSON body so a pathological POST can't exhaust parser memory.
 	bodyMax := envx.Int64("PLUGIN_INSTALL_BODY_MAX_BYTES", defaultInstallBodyMaxBytes)
 	c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, bodyMax)
@@ -93,6 +105,16 @@ func (h *PluginsHandler) Uninstall(c *gin.Context) {
 	pluginName := c.Param("name")
 	ctx := c.Request.Context()
 
+	// Mirror Install's external-runtime guard (molecule-core#10) so the
+	// two endpoints reject the same shape with the same message.
+	if h.isExternalRuntime(workspaceID) {
+		c.JSON(http.StatusUnprocessableEntity, gin.H{
+			"error": "plugin uninstall via docker exec is not supported for external runtimes",
+			"hint":  "external workspaces manage their own plugin directory; remove it locally",
+		})
+		return
+	}
+
 	if err := validatePluginName(pluginName); err != nil {
 		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid plugin name"})
 		return
@@ -0,0 +1,176 @@
+package handlers
+
+import (
+	"bytes"
+	"net/http"
+	"net/http/httptest"
+	"strings"
+	"testing"
+
+	"github.com/gin-gonic/gin"
+)
+
+// TestPluginInstall_ExternalRuntime_Returns422 — molecule-core#10.
+// Install on a `runtime='external'` workspace must NOT fall through to
+// findRunningContainer (which would 503 with a misleading "container not
+// running"). It must return 422 with a hint pointing at the pull-mode
+// download endpoint.
+func TestPluginInstall_ExternalRuntime_Returns422(t *testing.T) {
+	h := NewPluginsHandler(t.TempDir(), nil, nil).
+		WithRuntimeLookup(func(workspaceID string) (string, error) {
+			return "external", nil
+		})
+
+	w := httptest.NewRecorder()
+	c, _ := gin.CreateTestContext(w)
+	c.Params = gin.Params{{Key: "id", Value: "ba1789b0-4d21-4f4f-a878-fa226bf77cf5"}}
+	c.Request = httptest.NewRequest(
+		"POST",
+		"/workspaces/ba1789b0-4d21-4f4f-a878-fa226bf77cf5/plugins",
+		bytes.NewBufferString(`{"source":"local://my-plugin"}`),
+	)
+	c.Request.Header.Set("Content-Type", "application/json")
+
+	h.Install(c)
+
+	if w.Code != http.StatusUnprocessableEntity {
+		t.Errorf("expected 422 (Unprocessable Entity) for runtime='external', got %d: %s", w.Code, w.Body.String())
+	}
+	if !strings.Contains(w.Body.String(), "external runtimes") {
+		t.Errorf("expected error body to mention 'external runtimes', got: %s", w.Body.String())
+	}
+	if !strings.Contains(w.Body.String(), "download") {
+		t.Errorf("expected error body to point at the download endpoint, got: %s", w.Body.String())
+	}
+}
+
+// TestPluginUninstall_ExternalRuntime_Returns422 — symmetric guard on the
+// uninstall path (DELETE /workspaces/:id/plugins/:name). External
+// workspaces manage their own plugin directory locally; the platform
+// can't docker-exec into them.
+func TestPluginUninstall_ExternalRuntime_Returns422(t *testing.T) {
+	h := NewPluginsHandler(t.TempDir(), nil, nil).
+		WithRuntimeLookup(func(workspaceID string) (string, error) {
+			return "external", nil
+		})
+
+	w := httptest.NewRecorder()
+	c, _ := gin.CreateTestContext(w)
+	c.Params = gin.Params{
+		{Key: "id", Value: "ba1789b0-4d21-4f4f-a878-fa226bf77cf5"},
+		{Key: "name", Value: "my-plugin"},
+	}
+	c.Request = httptest.NewRequest(
+		"DELETE",
+		"/workspaces/ba1789b0-4d21-4f4f-a878-fa226bf77cf5/plugins/my-plugin",
+		nil,
+	)
+
+	h.Uninstall(c)
+
+	if w.Code != http.StatusUnprocessableEntity {
+		t.Errorf("expected 422 for runtime='external', got %d: %s", w.Code, w.Body.String())
+	}
+	if !strings.Contains(w.Body.String(), "external runtimes") {
+		t.Errorf("expected error body to mention 'external runtimes', got: %s", w.Body.String())
+	}
+}
+
+// TestPluginInstall_ContainerBackedRuntime_FallsThroughGuard — the runtime
+// guard MUST NOT short-circuit container-backed runtimes. With
+// `runtime='claude-code'` the install proceeds past the guard; without a
+// real plugin source it'll fail downstream (here: 404 from local resolver
+// because no plugin staged), which is the correct error to surface.
+//
+// This is the mutation-test partner: deleting the `runtime == "external"`
+// check is already caught by TestPluginInstall_ExternalRuntime (Install
+// would 404 instead of the asserted 422). What that test can't catch —
+// and what this case pins — is "non-external still falls through,"
+// catching any over-eager guard that rejects all runtimes.
+func TestPluginInstall_ContainerBackedRuntime_FallsThroughGuard(t *testing.T) {
+	h := NewPluginsHandler(t.TempDir(), nil, nil).
+		WithRuntimeLookup(func(workspaceID string) (string, error) {
+			return "claude-code", nil
+		})
+
+	w := httptest.NewRecorder()
+	c, _ := gin.CreateTestContext(w)
+	c.Params = gin.Params{{Key: "id", Value: "c7c28c0b-4ea5-4e75-9728-3ba860081708"}}
+	c.Request = httptest.NewRequest(
+		"POST",
+		"/workspaces/c7c28c0b-4ea5-4e75-9728-3ba860081708/plugins",
+		bytes.NewBufferString(`{"source":"local://nonexistent-plugin"}`),
+	)
+	c.Request.Header.Set("Content-Type", "application/json")
+
+	h.Install(c)
+
+	if w.Code == http.StatusUnprocessableEntity {
+		t.Errorf("runtime='claude-code' must fall through the external guard; got 422: %s", w.Body.String())
+	}
+	// The local resolver will fail to find the plugin → 404. Anything
+	// other than 422 (which would mean we mis-classified) is fine.
+	if w.Code != http.StatusNotFound {
+		t.Errorf("expected 404 (plugin not found in registry), got %d: %s", w.Code, w.Body.String())
+	}
+}
+
+// TestPluginInstall_NoRuntimeLookup_FailsOpen — when the runtime lookup
+// is unwired (test fixtures, niche deploy shapes) the guard MUST default
+// to allowing the install attempt. The downstream findRunningContainer
+// step still gates on a real container, so failing open here doesn't
+// expose a bypass — it just preserves backwards-compat with deployments
+// that haven't wired the lookup.
+func TestPluginInstall_NoRuntimeLookup_FailsOpen(t *testing.T) {
+	h := NewPluginsHandler(t.TempDir(), nil, nil) // NO WithRuntimeLookup
+
+	w := httptest.NewRecorder()
+	c, _ := gin.CreateTestContext(w)
+	c.Params = gin.Params{{Key: "id", Value: "ws-no-lookup"}}
+	c.Request = httptest.NewRequest(
+		"POST",
+		"/workspaces/ws-no-lookup/plugins",
+		bytes.NewBufferString(`{"source":"local://nonexistent"}`),
+	)
+	c.Request.Header.Set("Content-Type", "application/json")
+
+	h.Install(c)
+
+	if w.Code == http.StatusUnprocessableEntity {
+		t.Errorf("nil runtimeLookup must fall through (fail-open); got 422: %s", w.Body.String())
+	}
+}
+
+// TestPluginInstall_RuntimeLookupErrors_FailsOpen — same fail-open story
+// for transient DB errors in the lookup. We don't want a momentary
+// Postgres hiccup to flip every plugin install into a 422.
+func TestPluginInstall_RuntimeLookupErrors_FailsOpen(t *testing.T) {
+	h := NewPluginsHandler(t.TempDir(), nil, nil).
+		WithRuntimeLookup(func(workspaceID string) (string, error) {
+			return "", errFakeDB
+		})
+
+	w := httptest.NewRecorder()
+	c, _ := gin.CreateTestContext(w)
+	c.Params = gin.Params{{Key: "id", Value: "ws-db-flake"}}
+	c.Request = httptest.NewRequest(
+		"POST",
+		"/workspaces/ws-db-flake/plugins",
+		bytes.NewBufferString(`{"source":"local://nonexistent"}`),
+	)
+	c.Request.Header.Set("Content-Type", "application/json")
+
+	h.Install(c)
+
+	if w.Code == http.StatusUnprocessableEntity {
+		t.Errorf("runtimeLookup error must fall through (fail-open); got 422: %s", w.Body.String())
+	}
+}
+
+// errFakeDB is a sentinel for the fail-open lookup-error case.
+var errFakeDB = &fakeError{msg: "synthetic db error"}
+
+type fakeError struct{ msg string }
+
+func (e *fakeError) Error() string { return e.msg }
@@ -78,6 +78,10 @@ var fallbackRuntimes = map[string]struct{}{
 	"openclaw": {},
 	"codex":    {},
 	"external": {},
+	// mock — virtual workspace with hardcoded canned A2A replies.
+	// No container, no EC2, no template repo. See mock_runtime.go
+	// for the full rationale (200-workspace funding-demo org).
+	"mock": {},
 }
 
 // loadRuntimesFromManifest builds the runtime allowlist from
@@ -104,6 +108,10 @@ func loadRuntimesFromManifest(path string) (map[string]struct{}, error) {
 		// the manifest doesn't know about it. Injected here so we
 		// don't need a special-case in every caller.
 		"external": {},
+		// mock is ALWAYS available for the same reason as external:
+		// virtual workspace, no template repo, never spawns a
+		// container. See mock_runtime.go.
+		"mock": {},
 	}
 	for _, e := range m.WorkspaceTemplates {
 		name := strings.TrimSpace(e.Name)
@@ -112,6 +112,19 @@ func (h *WorkspaceHandler) Restart(c *gin.Context) {
 		return
 	}
 
+	// runtime=mock: virtual workspace with canned A2A replies. No
+	// container, no EC2, no provisioning state to recycle. Mirror
+	// the external no-op so the canvas's Restart button doesn't
+	// silently fail or leak through to the (template-less) provisioner.
+	if dbRuntime == "mock" {
+		c.JSON(http.StatusOK, gin.H{
+			"status":  "noop",
+			"runtime": "mock",
+			"message": "mock workspaces have no container — restart is a no-op",
+		})
+		return
+	}
+
 	// SaaS mode: cpProv handles workspace EC2 lifecycle. Self-hosted mode:
 	// provisioner handles local Docker containers. At least one must be
 	// available — previously only `provisioner` was checked, which broke
@@ -532,7 +545,9 @@ func (h *WorkspaceHandler) runRestartCycle(workspaceID string) {
 	}
 
 	// Don't auto-restart external workspaces (no Docker container)
-	if dbRuntime == "external" {
+	// or mock workspaces (no container, every reply is canned —
+	// see workspace-server/internal/handlers/mock_runtime.go).
+	if dbRuntime == "external" || dbRuntime == "mock" {
 		return
 	}

@@ -1,6 +1,7 @@
 package handlers
 
 import (
+	"runtime"
 	"sync"
 	"sync/atomic"
 	"testing"
@@ -15,6 +16,42 @@ func resetRestartStatesFor(workspaceID string) {
 	restartStates.Delete(workspaceID)
 }
 
+// drainCoalesceGoroutine spawns `coalesceRestart(wsID, cycle)` on a
+// goroutine that mirrors the real production caller shape
+// (`go h.RestartByID(...)` from a2a_proxy.go, a2a_proxy_helpers.go,
+// main.go), and registers a t.Cleanup that blocks until the goroutine
+// has TERMINATED — not just panicked-and-recovered, fully exited.
+//
+// This is the bleed-prevention contract for Class H (Task #170): no
+// test in this file may declare itself complete while a coalesceRestart
+// goroutine it spawned is still alive, because that goroutine could
+// otherwise wake up after the test's sqlmock has been closed and
+// either:
+//   - issue a stale INSERT that gets attributed to the next test's
+//     sqlmock connection — surfaces as
+//     "INSERT-not-expected for kind=DELEGATION_FAILED" / =WORKSPACE_PROVISION_FAILED
+//     in a neighbour test that doesn't itself touch coalesceRestart; or
+//   - hold a reference to the closed *sql.DB and panic on the next op.
+//
+// Implementation notes:
+//   - sync.WaitGroup must be Add()ed BEFORE the goroutine is spawned;
+//     Add inside the goroutine races with Wait.
+//   - t.Cleanup runs in LIFO order, so this composes safely with other
+//     cleanups (e.g. setupTestDB's mockDB.Close).
+//   - We don't bound the Wait with a timeout — if the goroutine
+//     genuinely deadlocks, the whole test process should hang and fail
+//     under -timeout. A timeout-then-orphan would mask the bleed.
+func drainCoalesceGoroutine(t *testing.T, wsID string, cycle func()) {
+	t.Helper()
+	var wg sync.WaitGroup
+	wg.Add(1)
+	go func() {
+		defer wg.Done()
+		coalesceRestart(wsID, cycle)
+	}()
+	t.Cleanup(wg.Wait)
+}
+
 // TestCoalesceRestart_SingleCallRunsOneCycle is the baseline:
 // no concurrency, one cycle. If this fails the gate logic is broken at
 // its simplest path.
@@ -200,19 +237,45 @@ func TestCoalesceRestart_PanicInCycleClearsState(t *testing.T) {
 	const wsID = "test-coalesce-panic-recovery"
 	resetRestartStatesFor(wsID)
 
-	// First call's cycle panics. coalesceRestart's defer must swallow
-	// the panic so this test caller doesn't see it propagate up — that
-	// matches what the real production caller (`go h.RestartByID(...)`)
-	// gets: the goroutine survives, no process crash.
-	defer func() {
-		if r := recover(); r != nil {
-			t.Errorf("panic should NOT propagate out of coalesceRestart (would crash the platform process from a goroutine), got: %v", r)
+	// Spawn the panicking cycle on a goroutine via drainCoalesceGoroutine
+	// — this mirrors the real production callsite shape
+	// (`go h.RestartByID(...)` from a2a_proxy.go:584,
+	// a2a_proxy_helpers.go:197, main.go:213). The previous form called
+	// coalesceRestart synchronously, which neither exercised the
+	// goroutine-survival contract nor caught Class H bleed regressions
+	// where the panic-recovery goroutine outlives the test and pollutes
+	// the next test's sqlmock with INSERTs from runRestartCycle's
+	// LogActivity calls (kinds DELEGATION_FAILED / WORKSPACE_PROVISION_FAILED).
+	//
+	// drainCoalesceGoroutine registers a t.Cleanup that Wait()s for the
+	// goroutine to TERMINATE — not merely panic-and-recover — before
+	// the test ends.
+	drainCoalesceGoroutine(t, wsID, func() { panic("simulated cycle failure") })
+
+	// We need a mid-test barrier (not just the t.Cleanup-time barrier)
+	// so the second coalesceRestart below sees state.running=false. The
+	// goroutine clears state.running inside its deferred recover; poll
+	// the package-level restartStates map until that observable flip
+	// happens. Bound at 2s — longer = real bug.
+	deadline := time.Now().Add(2 * time.Second)
+	for time.Now().Before(deadline) {
+		sv, ok := restartStates.Load(wsID)
+		if ok {
+			st := sv.(*restartState)
+			st.mu.Lock()
+			running := st.running
+			st.mu.Unlock()
+			if !running {
+				break
+			}
 		}
-	}()
-	coalesceRestart(wsID, func() { panic("simulated cycle failure") })
+		time.Sleep(time.Millisecond)
+	}
 
 	// Second call must run a fresh cycle. If running stayed true after
 	// the panic, this call would early-return without invoking cycle.
+	// Synchronous — no panic, so no goroutine to drain, and we want to
+	// assert ran.Load() immediately after.
 	var ran atomic.Bool
 	coalesceRestart(wsID, func() { ran.Store(true) })
 	if !ran.Load() {
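Not part of the diff: the property this hunk protects — a panicking cycle must still clear the running flag so the next restart isn't permanently locked out — fits in a few lines. A sketch, with `coalesce` and `state` as illustrative stand-ins for the real types:

```go
package main

import (
	"fmt"
	"sync"
)

// state is a minimal stand-in for the per-workspace restartState:
// a mutex-guarded "a cycle is in flight" flag.
type state struct {
	mu      sync.Mutex
	running bool
}

var states sync.Map // workspaceID -> *state

// coalesce runs cycle unless one is already in flight for wsID.
// The deferred recover + flag-clear is the invariant under test:
// even a panicking cycle must leave running=false behind.
func coalesce(wsID string, cycle func()) {
	sv, _ := states.LoadOrStore(wsID, &state{})
	st := sv.(*state)
	st.mu.Lock()
	if st.running {
		st.mu.Unlock()
		return // coalesce into the in-flight cycle
	}
	st.running = true
	st.mu.Unlock()
	defer func() {
		recover() // swallow cycle panics...
		st.mu.Lock()
		st.running = false // ...but ALWAYS clear the flag
		st.mu.Unlock()
	}()
	cycle()
}

func main() {
	coalesce("ws-a", func() { panic("simulated cycle failure") })
	ran := false
	coalesce("ws-a", func() { ran = true }) // must run: the flag was cleared
	fmt.Println(ran)
}
```

Moving the flag-clear out of the defer is exactly the sticky-running regression the test's second call would catch.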
@@ -220,6 +283,98 @@
 	}
 }
 
+// TestCoalesceRestart_DrainHelperWaitsForGoroutineExit is the Class H
+// regression guard for Task #170. It asserts the contract enforced by
+// drainCoalesceGoroutine: t.Cleanup blocks until the spawned
+// coalesceRestart goroutine has FULLY EXITED — not merely recovered
+// from panic. This is the contract that prevents stale LogActivity
+// INSERTs from a recovering goroutine bleeding into the next test's
+// sqlmock (the failure mode reported as "INSERT-not-expected for
+// kind=DELEGATION_FAILED" in TestPooledWithEICTunnel_PreservesFnErr).
+//
+// We use a deterministic bleed-shape probe rather than goroutine-count
+// arithmetic: the cycle blocks for ~150ms — long enough that without a
+// Wait barrier, the outer sub-test would return before the goroutine
+// exited. We then verify the wg.Wait inside drainCoalesceGoroutine
+// actually delayed t.Run's completion: total elapsed must be >= the
+// block duration. This asserts the exact shape, not a substring:
+// elapsed < blockFor would mean the cleanup didn't wait, which is the
+// exact bleed we're guarding against.
+//
+// We additionally panic from the cycle (after the block) to confirm
+// the helper waits past panic recovery, not just past cycle return.
+func TestCoalesceRestart_DrainHelperWaitsForGoroutineExit(t *testing.T) {
+	const blockFor = 150 * time.Millisecond
+	const wsID = "test-coalesce-drain-helper-contract"
+	resetRestartStatesFor(wsID)
+
+	// exited is closed by a defer inside the cycle, AFTER the block and
+	// AFTER the panic fires: the defer runs as the panic unwinds, before
+	// coalesceRestart's recover catches it, so the channel closes even
+	// though the cycle never returns normally.
+	exited := make(chan struct{})
+
+	subStart := time.Now()
+	t.Run("drain_under_subtest", func(st *testing.T) {
+		drainCoalesceGoroutine(st, wsID, func() {
+			defer close(exited)
+			time.Sleep(blockFor)
+			panic("contract-test panic-after-block")
+		})
+		// st.Cleanup runs here, before t.Run returns. wg.Wait must
+		// block until the goroutine has finished its panic recovery.
+	})
+	subElapsed := time.Since(subStart)
+
+	// Contract: the helper's wg.Wait MUST have blocked t.Run from
+	// returning until after the cycle's block + panic recovery.
+	if subElapsed < blockFor {
+		t.Fatalf(
+			"drainCoalesceGoroutine contract violated: t.Run returned in %v, "+
+				"but cycle blocks for %v. The Wait barrier is broken — a "+
+				"coalesceRestart goroutine can outlive its test's t.Cleanup "+
+				"and pollute neighbour-test sqlmock state (Class H bleed).",
+			subElapsed, blockFor,
+		)
+	}
+
+	// And the goroutine must have actually closed `exited` (i.e. ran
+	// the deferred close before panic propagated through coalesceRestart's
+	// recover). If exited is still open here, the goroutine never
+	// reached the close — meaning either the panic short-circuited the
+	// defer (Go runtime bug — won't happen) or the goroutine never
+	// ran at all (drainCoalesceGoroutine spawn shape regressed).
+	select {
+	case <-exited:
+		// Correct path.
+	default:
+		t.Fatal("cycle goroutine never reached its deferred close — panic-recovery contract regressed")
+	}
+
+	// Belt-and-suspenders: the post-recover state-clear must have
+	// flipped state.running back to false. If this fails, the panic
+	// path skipped the deferred state-clear in coalesceRestart.
+	sv, ok := restartStates.Load(wsID)
+	if !ok {
+		t.Fatal("restartStates entry missing for wsID after cycle — sync.Map regression")
+	}
+	st := sv.(*restartState)
+	st.mu.Lock()
+	running := st.running
+	st.mu.Unlock()
+	if running {
+		t.Error("state.running was not cleared after panic — sticky-running deadlock regressed")
+	}
+
+	// Reference runtime.NumGoroutine to keep the runtime import
+	// honest — also a useful smoke check that the goroutine count
+	// hasn't ballooned 10x while debugging this test.
+	if n := runtime.NumGoroutine(); n > 200 {
+		t.Logf("warning: NumGoroutine=%d after drain — high but not necessarily a leak", n)
+	}
+}
+
 // TestCoalesceRestart_DifferentWorkspacesDoNotSerialize verifies the
 // per-workspace state map: an in-flight restart for ws A must not
 // block restarts for ws B. Important for performance — without this,
@@ -1073,18 +1073,53 @@ func (p *Provisioner) IsRunning(ctx context.Context, workspaceID string) (bool,
 	if p == nil || p.cli == nil {
 		return false, ErrNoBackend
 	}
-	name := ContainerName(workspaceID)
-	info, err := p.cli.ContainerInspect(ctx, name)
+	name, err := RunningContainerName(ctx, p.cli, workspaceID)
 	if err != nil {
 		if isContainerNotFound(err) {
 			return false, nil
 		}
 		// Transient daemon error: caller treats !running as dead + restarts.
 		// Returning true + the underlying error preserves the error for
 		// metrics/logging without triggering the destructive path.
 		return true, err
 	}
-	return info.State.Running, nil
+	return name != "", nil
 }
 
+// RunningContainerName returns the container name for workspaceID iff the
+// container exists AND is in the Running state. Single source of truth for
+// "what live container should I exec into for this workspace?" — used by
+// both Provisioner.IsRunning (healthsweep) and the plugins handler.
+//
+// Distinguishes three outcomes so callers can pick their own policy:
+//
+//   - ("ws-<id>", nil): container is running. Caller can exec into it.
+//   - ("", nil): container does not exist OR exists but is stopped
+//     (NotFound, Exited, Created, Restarting…). Caller should treat
+//     this as a definitive "not running."
+//   - ("", err): transient daemon error (timeout, socket EOF, ctx
+//     cancel). Caller should NOT infer "not running" — this could be
+//     a flaky daemon under load. Decide per-callsite whether to fail
+//     soft or hard.
+//
+// Background — molecule-core#10: the plugins handler used to carry its own
+// copy of this inspect logic (`findRunningContainer`) which collapsed
+// transient errors into the same "" return as a genuinely-stopped container.
+// That hid daemon flakes as misleading 503 "container not running" responses
+// AND let the two impls drift on edge-case behavior. This is the SSOT.
+func RunningContainerName(ctx context.Context, cli *client.Client, workspaceID string) (string, error) {
+	if cli == nil {
+		return "", ErrNoBackend
+	}
+	name := ContainerName(workspaceID)
+	info, err := cli.ContainerInspect(ctx, name)
+	if err != nil {
+		if isContainerNotFound(err) {
+			return "", nil
+		}
+		return "", err
+	}
+	if info.State.Running {
+		return name, nil
+	}
+	return "", nil
+}
+
 // isContainerNotFound returns true when the Docker client indicates the
@@ -71,9 +71,15 @@ func StartHealthSweep(ctx context.Context, checker ContainerChecker, interval ti
 }
 
 func sweepOnlineWorkspaces(ctx context.Context, checker ContainerChecker, onOffline OfflineHandler) {
-	// Skip external workspaces (runtime='external') — they have no Docker container
+	// Skip external + mock workspaces — neither has a Docker container.
+	// external: agent runs on operator's laptop, polled via heartbeat.
+	// mock: virtual workspace, every reply is canned (see
+	// workspace-server/internal/handlers/mock_runtime.go). Both would
+	// false-positive as "container gone" on every sweep tick and
+	// auto-restart would loop forever (provisioner has no template
+	// for either runtime).
 	rows, err := db.DB.QueryContext(ctx,
-		`SELECT id FROM workspaces WHERE status IN ('online', 'degraded') AND COALESCE(runtime, 'langgraph') != 'external'`)
+		`SELECT id FROM workspaces WHERE status IN ('online', 'degraded') AND COALESCE(runtime, 'langgraph') NOT IN ('external', 'mock')`)
 	if err != nil {
 		log.Printf("Health sweep: query error: %v", err)
 		return
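Not part of the diff: the subtle part of this query is the NULL handling, since `runtime NOT IN (...)` alone would silently drop rows where `runtime` is NULL. A sketch of the same semantics in Go, with a nil pointer standing in for SQL NULL:

```go
package main

import "fmt"

// sweepEligible mirrors the SQL predicate
// COALESCE(runtime, 'langgraph') NOT IN ('external', 'mock'):
// a NULL runtime defaults to 'langgraph' (container-backed, so it IS
// swept), while external and mock are skipped because neither has a
// container to check. The nil pointer stands in for SQL NULL.
func sweepEligible(runtime *string) bool {
	rt := "langgraph" // COALESCE default
	if runtime != nil {
		rt = *runtime
	}
	return rt != "external" && rt != "mock"
}

func main() {
	ext, mock, cc := "external", "mock", "claude-code"
	fmt.Println(sweepEligible(nil))   // NULL -> langgraph -> swept
	fmt.Println(sweepEligible(&ext))  // skipped: no container
	fmt.Println(sweepEligible(&mock)) // skipped: no container
	fmt.Println(sweepEligible(&cc))   // swept
}
```

Without the COALESCE, a NULL-runtime row would compare NULL with NOT IN, evaluate to unknown, and be excluded from the sweep entirely.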
@@ -413,22 +413,20 @@ func sweepStaleTokensWithoutContainer(ctx context.Context, reaper OrphanReaper)
 	// `"5m0s"` mismatch with Postgres interval grammar; passing seconds
 	// as an int keeps the binding portable.
 	graceSeconds := int(staleTokenGrace.Seconds())
-	// `runtime != 'external'` is load-bearing: external workspaces have NO
-	// local container by design (the agent runs off-host), so the
-	// "no live container" predicate below would match every external
-	// workspace and revoke its token. The token is the off-host agent's
-	// only authentication credential — revoking breaks the entire
-	// external-runtime feature. Discovered 2026-05-03 when a fresh
-	// external workspace had its token silently revoked ~6 minutes after
-	// creation by this sweep, killing the operator's MCP heartbeat and
-	// inbox poll with `HTTP 401 — token may be revoked`.
+	// `runtime NOT IN ('external','mock')` is load-bearing: neither
+	// runtime has a local container, so the "no live container"
+	// predicate below would match every row and revoke its token.
+	// external: token is the off-host agent's only credential —
+	// revoking breaks the entire external-runtime feature
+	// (incident 2026-05-03). mock: same shape — no container by
+	// design, see workspace-server/internal/handlers/mock_runtime.go.
 	rows, qErr := db.DB.QueryContext(ctx, `
 		SELECT DISTINCT t.workspace_id::text
 		FROM workspace_auth_tokens t
 		JOIN workspaces w ON w.id = t.workspace_id
 		WHERE t.revoked_at IS NULL
 		  AND w.status NOT IN ('removed', 'provisioning')
-		  AND w.runtime != 'external'
+		  AND w.runtime NOT IN ('external', 'mock')
 		  AND COALESCE(t.last_used_at, t.created_at) < now() - make_interval(secs => $2)
 		  AND (
 		    cardinality($1::text[]) = 0
@@ -26,7 +26,7 @@ import (
 // accidentally matching a future query that opens with the same column
 // name OR a regression that drops one of the load-bearing predicates.
 func expectStaleTokenSweepNoOp(mock sqlmock.Sqlmock) {
-	mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'`).
+	mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\)`).
 		WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}))
 }

@@ -492,7 +492,7 @@ func TestSweepOnce_StaleTokenRevokeFiresWhenNoContainer(t *testing.T) {
 	// excludes 'external' (2026-05-03 fix — the sweep was incorrectly
 	// targeting external workspaces which have no container by design),
 	// and the staleness predicate appears in the SELECT.
-	mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'.*COALESCE\(t\.last_used_at, t\.created_at\) < now\(\) - make_interval`).
+	mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\).*COALESCE\(t\.last_used_at, t\.created_at\) < now\(\) - make_interval`).
 		WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}).
 			AddRow(orphanedID))

@@ -548,7 +548,7 @@ func TestSweepOnce_StaleTokenRevokeFailureBailsLoop(t *testing.T) {
 
 	// Third-pass returns two stale-token workspaces; the first revoke
 	// errors. Loop must bail without attempting the second.
-	mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'`).
+	mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\)`).
 		WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}).
 			AddRow("aaaa1111-0000-0000-0000-000000000000").
 			AddRow("bbbb2222-0000-0000-0000-000000000000"))