Compare commits

...

32 Commits

Author SHA1 Message Date
devops-engineer 64a0bc1f7e fix(ci): use AUTO_SYNC_TOKEN for auto-sync main->staging (Class D)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 32s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 31s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m23s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m24s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m32s
Same shape as molecule-controlplane#29: per-job GITHUB_TOKEN
doesn't have the Gitea API permissions to open PRs / push branches
the auto-sync flow needs. AUTO_SYNC_TOKEN is the devops-engineer
persona PAT (per saved memory feedback_per_agent_gitea_identity_default).

Companion prod ops (already done):
- devops-engineer added as collaborator on molecule-core (write)
- devops-engineer added to staging branch protection push_whitelist
- AUTO_SYNC_TOKEN registered as Actions secret on molecule-core
2026-05-07 07:01:46 -07:00
claude-ceo-assistant f29cbb3691 Merge pull request 'chore(ci): retrigger staging CI on new runner image' (#25) from chore/retrigger-staging-on-fixed-runner-image into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 9s
CI / Platform (Go) (push) Successful in 4s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 51s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 1m29s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m29s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m29s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 5m46s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 5m59s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 10m5s
2026-05-07 13:50:16 +00:00
devops-engineer c8110b5766 chore(ci): retrigger staging CI on new runner image
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 27s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 34s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m31s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m35s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m37s
All current core/staging reds ran 12:14-12:33 BEFORE the runner
image swap (cloudflared bake + GOPROXY pipe-separator at 12:55).
This empty commit forces a fresh CI run under the post-fix
runner image so we can categorize:
  - REAL fails (need targeted fix)
  - STALE-cleared (was a runner-image issue, now fixed)
  - Genuinely unrelated (Auto-sync, CodeQL — Hongming-parked)

Per feedback_orchestrator_must_verify_before_declaring_fixed,
don't mass-mark stale — wait for fresh run, verify each context.
2026-05-07 06:48:13 -07:00
claude-ceo-assistant 08e6f108ab Merge pull request 'chore: drop github-app-auth + swap GHCR→ECR (closes #157, #161)' (#23) from chore/drop-github-app-auth-and-ecr-swap into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 6s
CI / Detect changes (push) Successful in 9s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
Harness Replays / detect-changes (push) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 13s
CI / Canvas (Next.js) (push) Successful in 19s
CI / Canvas Deploy Reminder (push) Has been skipped
Harness Replays / Harness Replays (push) Failing after 27s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 59s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m26s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 1m28s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m31s
CI / Platform (Go) (push) Failing after 2m6s
publish-workspace-server-image / build-and-push (push) Failing after 2m47s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 6m3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 18m31s
2026-05-07 12:14:36 +00:00
devops-engineer 1d8c101c94 chore: drop github-app-auth + swap GHCR→ECR (closes #157, #161)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Failing after 27s
CI / Python Lint & Test (pull_request) Successful in 31s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m19s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m21s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m25s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 15m34s
CI / Platform (Go) (pull_request) Failing after 15m35s
Two coupled cleanups for the post-2026-05-06 stack:

#157 — drop molecule-ai-plugin-github-app-auth
============================================
The plugin injected GITHUB_TOKEN/GH_TOKEN via the App's
installation-access flow (~hourly rotation). Per-agent Gitea
identities replaced this approach after the 2026-05-06 suspension —
workspaces now provision with a per-persona Gitea PAT from .env
instead of an App-rotated token. The plugin code itself lived on
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth which is
also unreachable post-suspension; checking it out at CI build time
was already failing.

Removed:
- workspace-server/cmd/server/main.go: githubappauth import + the
  `if os.Getenv("GITHUB_APP_ID") != ""` block that called
  BuildRegistry. gh-identity remains as the active mutator.
- workspace-server/Dockerfile + Dockerfile.tenant: COPY of the
  sibling repo + the `replace github.com/Molecule-AI/molecule-ai-
  plugin-github-app-auth => /plugin` directive injection.
- workspace-server/go.mod + go.sum: github-app-auth dep entry
  (cleaned up by `go mod tidy`).
- 3 workflows: actions/checkout steps for the sibling plugin repo:
    - .github/workflows/codeql.yml (Go matrix path)
    - .github/workflows/harness-replays.yml
    - .github/workflows/publish-workspace-server-image.yml

Verified `go build ./cmd/server` + `go vet ./...` pass post-removal.

#161 — swap GHCR→ECR for publish-workspace-server-image
=======================================================
Same workflow used to push to ghcr.io/molecule-ai/platform +
platform-tenant. ghcr.io/molecule-ai is gone post-suspension. The
operator's ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/
molecule-ai/) already hosts platform-tenant + workspace-template-*
+ runner-base images and is the post-suspension SSOT for container
images. This PR aligns publish-workspace-server-image with that
stack.

- env.IMAGE_NAME + env.TENANT_IMAGE_NAME repointed to ECR URL.
- docker/login-action swapped for aws-actions/configure-aws-
  credentials@v4 + aws-actions/amazon-ecr-login@v2 chain (the
  standard ECR auth pattern; uses AWS_ACCESS_KEY_ID/SECRET secrets
  bound to the molecule-cp IAM user).

The :staging-<sha> + :staging-latest tag policy is unchanged —
staging-CP's TENANT_IMAGE pin still points at :staging-latest, just
with the new registry prefix.

Refs molecule-core#157, #161; parallel to org-wide CI-green sweep.
2026-05-07 05:12:06 -07:00
claude-ceo-assistant fd1fbd2c5f Merge pull request 'docs(README): comprehensive refresh — landing-page icon (light/dark SVG) + 8 runtimes + Canvas v4 + Memory v2 + SaaS + channel plugin' (#5) from docs/readme-comprehensive-refresh-2026-05-06 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
CI / Platform (Go) (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 8s
CI / Canvas (Next.js) (push) Successful in 18s
CI / Canvas Deploy Reminder (push) Has been skipped
CI / Python Lint & Test (push) Successful in 30s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 48s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 51s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m25s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m33s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 5m57s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 10m7s
2026-05-07 11:46:29 +00:00
claude-ceo-assistant ea7f35b724 docs(README.zh-CN): mirror EN refresh — 8 runtimes + Canvas v4 + Memory v2 + SaaS + channel plugin
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Platform (Go) (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 17s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 23s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m4s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m20s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m29s
Brings the Chinese README to parity with the comprehensive English
refresh in the same PR:

- Icon: PNG → SVG (light/dark adaptive)
- Runtimes: 6 → 8 (added Hermes 4 + Gemini CLI to pitch line, "Runtime
  choice" section, comparison table)
- Canvas v4 — warm-paper 主题系统 callout
- Memory v2 — pgvector 语义召回 callout
- RFC #2967 typed-SSOT A2A 响应路径 — platform ship list + arch diagram
- SaaS section — 多租户 EC2 + Neon + Cloudflare Tunnels, WorkOS, Stripe,
  KMS, tenant_resources 审计 + 30 分钟 reconciler
- molecule-mcp-claude-channel section — 在 Claude Code 里直接接入,
  marketplace 安装流程, 多租户配置
- Architecture diagram redrawn (Canvas v4 → Platform 1.25 → Provisioner
  Docker|EC2+SSM, plus SaaS Control Plane block)
- "Current Scope" updated — Canvas v4, Memory v2, 8 adapters, RFC
  #2967, SaaS surface

Translation kept idiomatic — used Chinese tech terms where natural
(语义召回, 多租户, 信封加密) and kept English for established
proper nouns (Hermes, Gemini CLI, RFC #2967, pgvector, WorkOS, KMS).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-07 04:42:40 -07:00
claude-ceo-assistant 132f97d261 docs(README): comprehensive refresh — landing-page icon (SVG, light/dark) + 8 runtimes + Canvas v4 + Memory v2 + SaaS + channel plugin
The README hadn't been refreshed since the v0 wave. Several major
shipped surfaces weren't called out (Canvas v4 warm-paper theme,
Memory v2 with pgvector, RFC #2967 typed-SSOT A2A response path,
the SaaS control plane, the molecule-mcp-claude-channel plugin we
just shipped via v0.4.0/0.4.1/0.4.2). The runtime list still said
"6" when 8 are in production. The icon was a 1.3 MB PNG with no
light-mode variant.

- New `docs/assets/branding/molecule-icon.svg` matches the landing
  page's `public/favicon.svg` shape (5-spoke molecular graph) but
  carries `prefers-color-scheme` styles so it adapts to GitHub's
  light/dark modes. The PNG stays for back-compat with anything
  that hotlinks it.
- `docs/assets/branding/molecule-logo.svg` adds a wordmark variant
  for places that want the brand name alongside the icon.
- README hero replaces the PNG `<img>` with the SVG so contributors
  reading on GitHub light see a tinted version that doesn't blow
  out the page background.

- **8 production runtimes** named explicitly throughout: Claude
  Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen,
  OpenClaw. Comparison table grew Hermes 4 + Gemini CLI rows with
  the integration mechanism (Option B upstream hook, A2A bridge,
  multi-provider derivation).
- **Canvas v4** — warm-paper theme system (light / dark / follow-
  system) called out alongside the existing Next.js 15 / React Flow /
  Zustand stack.
- **Memory v2 backed by pgvector** — semantic recall callout in
  both the "memory model" pitch line and the runtime stack section.
- **RFC #2967 typed-SSOT A2A response path** named in the platform
  ship list + architecture diagram.
- **SaaS surface section** added — multi-tenant EC2 + Neon +
  Cloudflare Tunnels, WorkOS + Stripe, KMS envelope, tenant_resources
  audit + 30-min reconciler. Cross-links to molecule-controlplane.
- **molecule-mcp-claude-channel plugin** added — entry point for
  Claude Code users to bridge A2A traffic into a local session via
  MCP. Documents the standard marketplace install flow + multi-
  tenant config.
- **Architecture diagram** redrawn with Canvas → Platform → Postgres
  + Provisioner (Docker | EC2+SSM) layout, plus a SaaS control plane
  block.
- **Quick Start** repo URL fixed (`molecule-monorepo` → `molecule-core`),
  Go version bumped to 1.25, Python ≥3.11 noted.

- Deploy buttons + Quick Start URL all bump from the old
  `molecule-monorepo` name to the current `molecule-core`. Pre-fix
  these clicked through to a 404.

The provisioner refactor (`registry.go` deletion + RegistryPrefix
env-driven changes) that lived alongside an earlier draft of this
README on the `docs/readme-refresh-2026-05-06` branch is OUT of
this PR — that work shipped separately via #6. This branch is
docs-only so the review surface is small and the merge is reversible.

- `git diff staging --stat`:
  ```
  README.md                              | 75 +++++++++++++++++++++++-----------
  docs/assets/branding/molecule-icon.svg | 28 +++++++++++++
  docs/assets/branding/molecule-logo.svg | 17 ++++++++
  3 files changed, 97 insertions(+), 23 deletions(-)
  ```
- SVGs validated in a browser at light + dark `prefers-color-scheme`.
- All linked docs (./docs/index.md, ./docs/quickstart.md, ./docs/
  architecture/architecture.md, ./docs/api-protocol/platform-api.md,
  ./docs/agent-runtime/workspace-runtime.md, ./LICENSE, etc.) verified
  to exist on staging.

- README.zh-CN.md mirror — non-trivial translation work; file as
  separate issue if mirror is wanted.
- molecule-ai/.github org-profile README — Gitea has no equivalent
  to GitHub's org-profile surface, and the GitHub org is suspended.
  Skipped.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-07 04:42:40 -07:00
claude-ceo-assistant 6a7dcd287c Merge pull request 'feat(canvas/chat-server): canvas consumes /chat-history + server-side row-aware reverse (RFC #2945 PR-C-2)' (#4) from feat/rfc-2945-pr-c-2-canvas-chat-history into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 7s
CI / Detect changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 12s
Harness Replays / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 26s
Harness Replays / Harness Replays (push) Failing after 41s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 1m5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 1m5s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m35s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m43s
CI / Canvas (Next.js) (push) Successful in 2m32s
CI / Canvas Deploy Reminder (push) Has been skipped
publish-workspace-server-image / build-and-push (push) Failing after 2m42s
CI / Platform (Go) (push) Failing after 2m58s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 6m8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
2026-05-07 11:38:54 +00:00
claude-ceo-assistant b49bdde997 Merge pull request 'fix(workspace-server): CP orphan sweeper closes deprovision split-write race (#2989)' (#2) from fix/cp-orphan-sweeper-2989 into staging
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Harness Replays / Harness Replays (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (push) Has been cancelled
E2E API Smoke Test / detect-changes (push) Has been cancelled
Handlers Postgres Integration / detect-changes (push) Has been cancelled
Harness Replays / detect-changes (push) Has been cancelled
publish-workspace-server-image / build-and-push (push) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (push) Has been cancelled
Secret scan / Scan diff for credential-shaped strings (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Has been cancelled
2026-05-07 11:38:48 +00:00
claude-ceo-assistant f140b19e79 Merge pull request 'perf(workspace-server,canvas): EIC tunnel pool + canvas Promise.all (closes core#11)' (#13) from feat/eic-tunnel-pool-core-11 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Handlers Postgres Integration / detect-changes (push) Successful in 12s
CI / Python Lint & Test (push) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 23s
Harness Replays / Harness Replays (push) Failing after 39s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 1m6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 1m1s
CI / Canvas (Next.js) (push) Successful in 2m35s
CI / Canvas Deploy Reminder (push) Has been skipped
publish-workspace-server-image / build-and-push (push) Failing after 2m50s
CI / Platform (Go) (push) Failing after 2m59s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 17m54s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 17m55s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 17m46s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 17m43s
2026-05-07 11:10:25 +00:00
claude-ceo-assistant 06d4bab29d Merge pull request 'fix(ci): port publish-runtime cascade to Gitea repo-dispatch API (closes #14)' (#20) from fix/14-cascade-gitea-dispatch into staging
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 10s
Block internal-flavored paths / Block forbidden paths (push) Successful in 10s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 6s
CI / Platform (Go) (push) Successful in 29s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 28s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m57s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 9s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 54s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 10m34s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 19m45s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 20m19s
2026-05-07 10:36:32 +00:00
Hongming Wang 4279fecde5 fix(ci): keep codex in TEMPLATES + skip-if-no-publish-image.yml
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 6s
cascade-list-drift-gate / check (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 9s
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 1s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 17s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m39s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 25s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 51s
CI / Canvas (Next.js) (pull_request) Failing after 5m16s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m22s
CI / Python Lint & Test (pull_request) Successful in 15m42s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 19m46s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 20m54s
The v2 dropped codex from TEMPLATES on the basis of "no
publish-image.yml = not part of cascade today." That was correct
about the immediate behavior but tripped cascade-list-drift-gate.yml
because manifest.json still declares codex (it IS a live runtime —
referenced from workspace/config.py and cloned into dev envs by
clone-manifest.sh; only the image-publish path is missing).

Restore codex to TEMPLATES (matching manifest) and add a runtime
soft-skip: probe each repo for .github/workflows/publish-image.yml
via the Gitea contents API and skip cleanly if 404. Final job log
distinguishes "complete across all" vs "complete with soft-skips".

This preserves the drift gate's invariant (TEMPLATES == manifest)
while honoring the empirical fact that codex has no publish-image
workflow yet. If codex later gains the workflow, no change here is
needed — the probe will see 200 and the cascade will fan out to it
naturally.

Refs molecule-core#14, molecule-core#20.
2026-05-07 03:32:53 -07:00
Hongming Wang 607444e71b feat(ci): replace curl-dispatch with push-mode cascade (v2)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
cascade-list-drift-gate / check (pull_request) Failing after 9s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 2s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m21s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 46s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m28s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 26s
CI / Platform (Go) (pull_request) Successful in 3m32s
CI / Canvas (Next.js) (pull_request) Failing after 3m34s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 16m16s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 20m25s
Empirical blocker on v1: Gitea 1.22.6 has no repository_dispatch /
workflow_dispatch trigger API (verified across 6 candidate paths in
issuecomment-913). v1's curl-POST loop would always exit-1.

v2 pivots to push-mode: each template repo got a small companion PR
(merged 2026-05-07) adding a `.runtime-version` file at root + a
`resolve-version` job in publish-image.yml that reads the file and
forwards the value to the reusable build workflow. publish-runtime
now updates that file via git-clone + commit + push, which trips
each template's existing `on: push: branches: [main]` trigger.

Behaviour changes vs v1:
- Templates list dropped from 9 → 8 (codex has no publish-image.yml
  so was never part of the cascade in practice).
- 3-retry pull-rebase loop per template (handles concurrent-push
  races without force-push). Failures collected, job exits 1 with
  the failed-template list at the end.
- Idempotency: when re-run with the same version, templates already
  pinned to that version contribute zero commits — operator can
  safely re-run to retry partial failures.
- Author line: "publish-runtime cascade <publish-runtime@moleculesai
  .app>" trailer makes it clear the commit is workflow-driven, not
  human (per memory feedback_github_botring_fingerprint).

DISPATCH_TOKEN secret name unchanged (still consumed at
secrets.DISPATCH_TOKEN per 569df259).

Refs molecule-core#14, builds on molecule-core#20 issuecomment-923
(Phase 2 design).
2026-05-07 03:17:38 -07:00
Hongming Wang 1ff7342e91 chore: retrigger CI after runner config fix
cascade-list-drift-gate / check (pull_request) Successful in 10s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 1s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 44s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 22s
CI / Canvas (Next.js) (pull_request) Failing after 3m28s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Failing after 3m39s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 29s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 30s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 15m39s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 15m41s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 16m1s
CI / Python Lint & Test (pull_request) Successful in 15m52s
2026-05-07 03:01:23 -07:00
Hongming Wang 569df259ba fix(ci): align secret name to plumbed DISPATCH_TOKEN (closes #14)
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 3s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
cascade-list-drift-gate / check (pull_request) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 9s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 19s
CI / Python Lint & Test (pull_request) Failing after 20s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 34s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m31s
CI / Platform (Go) (pull_request) Successful in 3m6s
CI / Canvas (Next.js) (pull_request) Failing after 3m8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 14m54s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 15m3s
The cascade workflow was reading from `secrets.TEMPLATE_DISPATCH_TOKEN`
but the plumbed secret name is `DISPATCH_TOKEN` (verified just now via
GET /repos/molecule-ai/molecule-core/actions/secrets — only DISPATCH_TOKEN
is set). Without this rename the cascade would always evaluate "secret
missing" and exit 1 on the next push to staging, defeating the entire
point of grant-role-access.sh --apply that just landed.

Three references updated:
  - env mapping (`secrets.X` → `secrets.DISPATCH_TOKEN`)
  - workflow_dispatch warning text
  - push-trigger error text

The bash-side variable name is unchanged (still `DISPATCH_TOKEN`) so
the curl invocation at line 372 is unaffected. YAML round-trip parses
clean.
2026-05-07 02:38:20 -07:00
claude-ceo-assistant 422360b912 Merge pull request 'docs(workspace-runtime): migrate github.com refs at source (#41)' (#15) from docs/workspace-runtime-readme-source-edit into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 7s
CI / Platform (Go) (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
Ops Scripts Tests / Ops scripts (unittest) (push) Failing after 13s
CI / Python Lint & Test (push) Failing after 14s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 10s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 21s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 49s
CI / Canvas (Next.js) (push) Successful in 44s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 47s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m22s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 14m40s
2026-05-07 09:25:28 +00:00
claude-ceo-assistant 1d9d8c7809 Merge pull request 'fix(scripts): migrate ghcr.io→ECR + raw.githubusercontent.com→Gitea (#46)' (#16) from fix/script-ghcr-and-lint-paths into staging
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Has been cancelled
Handlers Postgres Integration / detect-changes (push) Has been cancelled
CI / Detect changes (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Has been cancelled
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Failing after 12s
2026-05-07 09:25:24 +00:00
claude-ceo-assistant d30c813ff9 Merge pull request 'docs: bulk-sed molecule-core .md docs → Gitea (#37 final molecule-core sweep)' (#19) from docs/molecule-core-bulk-sed into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
Handlers Postgres Integration / detect-changes (push) Successful in 7s
CI / Detect changes (push) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 6s
Ops Scripts Tests / Ops scripts (unittest) (push) Failing after 13s
E2E API Smoke Test / detect-changes (push) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 17s
CI / Platform (Go) (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Has been cancelled
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Has been cancelled
2026-05-07 09:24:46 +00:00
claude-ceo-assistant ce3f1f48a4 fix(ci): port publish-runtime cascade to Gitea repo-dispatch API (closes molecule-core#14)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
cascade-list-drift-gate / check (pull_request) Successful in 4s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Failing after 14s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 49s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m20s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m24s
CI / Canvas (Next.js) (pull_request) Failing after 1m55s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 2m5s
## Symptom

`publish-runtime.yml::cascade` fired a `repository_dispatch` to 10 workspace-template
repos via direct curl to `https://api.github.com/repos/...`. Post-2026-05-06 the
org's GitHub presence is suspended; every invocation 404s. The job's
`::warning::` posture meant the failure didn't propagate, leaving the runtime
PyPI publish → template image rebuild pipeline silently broken.

## Why Option A (rewrite) and not Option B (delete)

Verified 2026-05-07 by devops-engineer (molecule-core#14 thread):

- The cron-poll mechanism (/etc/cron.d/molecule-deploy-poll) tracks ONLY the
  Vercel/Railway-deployed repos (landingpage/docs/molecule-app/molecules-market
  /molecule-controlplane). It does NOT track workspace-template-* repos.
- Each of the 9 template `publish-image.yml` workflows has
  `repository_dispatch: types: [runtime-published]` as a load-bearing trigger.
  Without the cascade, when the runtime ships a new PyPI version, templates
  don't auto-rebuild.

So Option B (delete) would silently break the runtime → template fan-out.
Option A (rewrite to Gitea's API shape) is the right call. Security-auditor
agreed after seeing the cron-poll TRACKED list.

## API surface change

| Concern | Pre-fix (GitHub) | Post-fix (Gitea) |
|---|---|---|
| URL | `https://api.github.com/repos/$REPO/dispatches` | `${GITEA_URL}/api/v1/repos/$REPO/dispatches` |
| Owner case | `Molecule-AI/...` | `molecule-ai/...` (lowercase, Gitea is case-sensitive) |
| Auth header | `Authorization: Bearer $DISPATCH_TOKEN` | `Authorization: token $DISPATCH_TOKEN` |
| Body shape | `{event_type, client_payload}` | UNCHANGED — Gitea is GitHub-compatible here |
| Success code | `204 No Content` | `204 No Content` (unchanged) |

`GITEA_URL` defaults to `https://git.moleculesai.app`; overridable via job env.

## Out-of-band: DISPATCH_TOKEN secret rotation

The DISPATCH_TOKEN secret was a GitHub PAT. It must be re-minted as a Gitea
PAT for the new API to authenticate. Per saved memory
`feedback_per_agent_gitea_identity_default`, this should be a dedicated
`publish-runtime-bot` persona token with `write:repository` scope on the
9 target repos — NOT the founder PAT.

This PR ships the workflow change. Token rotation is the operator-host
follow-up (security-auditor's lane) — coordinate the merge so the token
is in place before the next runtime release fires.

## Backwards compatibility

The workflow ran silently-broken since 2026-05-06 (every invocation 404
+ ::warning:: but no failure). So there is no functional regression from
"silently broken" to "actually working". Any in-progress operator-managed
manual dispatch path is unaffected; the Gitea API parallel path doesn't
require operator intervention.

## Test plan

- [x] YAML parse OK on the modified workflow file
- [ ] Smoke test: trigger a runtime publish (or simulate via dispatching to one
      template) post-merge; verify HTTP 204 + the template's publish-image
      workflow fires + the template's image gets re-pushed against the new
      runtime version. Phase 4 verification belongs to internal#46 follow-up.

## Hostile self-review (3 weakest spots)

1. The fan-out remains all-or-nothing: a single template failure surfaces as
   a `::warning::` but PyPI publish proceeds. With 9 templates this is a
   ~10% per-template chance of stale-image-on-runtime-bump if any one fails.
   Defense: the warning shows up in the workflow summary; operators retry.
   Future hardening: requeue-on-fail with bounded retry, or a separate
   reconcile cron that detects template/runtime version drift and re-dispatches.

2. `DISPATCH_TOKEN` validity is enforced by the Gitea API (401 on stale)
   but the workflow doesn't differentiate 401 from 404. Either way the
   warning fires. Future hardening: explicit token-shape check at the start
   of the cascade job (curl `/api/v1/user` once, fail-fast if 401).

3. Owner-case lowercase is right today but couples the workflow to the
   current Gitea org slug. If the org is ever renamed, this workflow
   breaks silently. Less fragile alternative: derive REPO from a
   canonical config (e.g. `gh repo list molecule-ai`) instead of
   string-concatenating. Acceptable today; filed as the same future
   hardening pass as item 1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:31:37 -07:00
documentation-specialist 26afbbfdf4 docs(internal): bulk-sed molecule-core .md docs → Gitea (#37 final molecule-core sweep)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 12s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 51s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m20s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m20s
Mass-sed across 17 files / 38 active refs in molecule-core .md docs
(README + CONTRIBUTING + docs/architecture/ + docs/blog/ + docs/guides/
+ docs/integrations/ + docs/quickstart.md + scripts/README.md).

Driver: /tmp/sweep_core.py — same pattern set as the
internal-marketing bulk-sed (PR #50). 4 url-substitution patterns +
SKIP_PATTERN preserves /pull/<n> /issues/<n> /commit/<sha>
/releases/... historical refs.

Files NOT touched in this PR:
- docs/workspace-runtime-package.md — owned by molecule-core#15
  (workspace-runtime source-edit per #41). Reverted my bulk-sed of
  that file to avoid merge conflict.
- 2 Go-import-path refs in docs/memory-plugins/testing-your-plugin.md
  (github.com/Molecule-AI/molecule-monorepo/platform/internal/...) —
  Q5 cross-repo Go-module migration territory.
- 1 GitHub Gist link in docs/guides/external-workspace-quickstart.md
  (gist.github.com/molecule-ai/...) — no Gitea equivalent;
  consistent with the same handling in docs#1.

Manual fixes (2):
- docs/blog/2026-04-20-chrome-devtools-mcp-seo/index.md:306 —
  GitHub Discussions (no Gitea equivalent) → issue tracker link
- docs/guides/external-workspace-quickstart.md:218 — tracking-issue
  ?q= query-string url (regex didn't catch) → reformulated text +
  Gitea search-by-query approach

Pattern matches my docs#1 (public docs site) PR + internal#50
(internal/marketing bulk-sed). Standard substitutions:
- https://github.com/Molecule-AI/<repo> → https://git.moleculesai.app/molecule-ai/<repo>
- /blob/<branch>/ + /tree/<branch>/ → /src/branch/<branch>/

Refs: molecule-ai/internal#37, molecule-ai/internal#38
2026-05-07 01:27:50 -07:00
claude-ceo-assistant bed9644c10 Merge pull request 'chore(ci): pin artifact actions to @v3 for Gitea act_runner compatibility' (#18) from chore/pin-artifact-actions-v3 into staging
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 7s
CI / Detect changes (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 4s
CI / Python Lint & Test (push) Failing after 15s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 10s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 1m11s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m26s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m40s
CI / Canvas (Next.js) (push) Successful in 1m58s
CI / Canvas Deploy Reminder (push) Has been skipped
CI / Platform (Go) (push) Successful in 2m11s
2026-05-07 08:23:20 +00:00
claude-ceo-assistant aa22183e52 chore(ci): pin artifact actions to @v3 for Gitea act_runner compatibility (internal#46)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 7s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 1m9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m31s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m33s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 13s
CI / Python Lint & Test (pull_request) Failing after 19s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Failing after 27s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 4m47s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 5m32s
Mechanical pin: 4 `actions/upload-artifact@v4.6.2/v7.0.1` uses → `@v3`. v4+/v7+
rely on a runtime API shape that Gitea's act_runner v0.6.x doesn't fully
support. v3 uses the legacy server protocol act_runner ships end-to-end.

Files (4 uses):
  - .github/workflows/ci.yml:238 (v4.6.2 → v3)
  - .github/workflows/codeql.yml:124 (v7.0.1 → v3)
  - .github/workflows/e2e-staging-canvas.yml:142 (v7.0.1 → v3)
  - .github/workflows/e2e-staging-canvas.yml:150 (v7.0.1 → v3)

YAML parse green on all 3 files.

Sister PRs land for `molecule-controlplane` and `codex-channel-molecule`.
Per internal#46 Phase 2 audit; tracked under that umbrella.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:00:53 -07:00
documentation-specialist 5d4184f4a3 fix(scripts): migrate ghcr.io→ECR + raw.githubusercontent.com→Gitea (#46)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 13s
CI / Canvas (Next.js) (pull_request) Successful in 42s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 54s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m18s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m20s
Per documentation-specialist's grep agent (2026-05-07T07:30, see
internal#46): runtime-breaking ghcr.io references in shell scripts +
docker-compose + the slip-past-workflow lint_secret_pattern_drift.py
all need migration. These were missed by security-auditor's
workflow-only audit.

Files (6):

- .github/scripts/lint_secret_pattern_drift.py:40 — workspace-runtime
  pre-commit-checks.sh consumer URL: raw.githubusercontent.com →
  Gitea raw URL (https://git.moleculesai.app/molecule-ai/.../raw/
  branch/main/...). The lint job runs in CI and would 404 today.

- scripts/refresh-workspace-images.sh:54 — workspace-template image
  pull URL: ghcr.io → ECR (153263036946.dkr.ecr.us-east-2.amazonaws.com).

- scripts/rollback-latest.sh — full rewrite of header + auth flow:
  * ghcr.io/molecule-ai/{platform,platform-tenant} → ECR
  * GITHUB_TOKEN with write:packages → AWS ECR auth
    (aws ecr get-login-password). Per saved memory
    reference_post_suspension_pipeline, prod cutover is to ECR.
  * Updated header docs to match new auth flow + prereqs.

- scripts/demo-freeze.sh:13,17 — comment-only ghcr → ECR
  (the script doesn't currently exec these URLs, but the comments
  describe the cascade and need to match reality).

- docker-compose.yml:215-216 — canvas image: ghcr.io → ECR + updated
  the auth comment to describe `aws ecr get-login-password` flow.

- tools/check-template-parity.sh:21 — inline curl install instructions:
  raw.githubusercontent.com → Gitea raw URL.

Hostile self-review:

1. rollback-latest.sh's GITHUB_TOKEN→aws-cli auth swap is a behavior
   change. Operators using this script now need aws CLI
   authenticated for region us-east-2 with ECR pull/push perms.
   Documented in updated header. Operators who don't have aws CLI
   will get 'aws: command not installed' which is a clear failure
   mode (not silent).
2. The Gitea raw URL shape (/raw/branch/main/) differs from GitHub's
   raw.githubusercontent.com structure. Verified pattern by
   inspecting other Gitea raw URLs in the codebase. If Gitea's URL
   changes (1.23+), update via the same one-line edit.
3. Doesn't touch packer/scripts/install-base.sh which has a similar
   ghcr.io ref per the grep agent's findings — that's bigger-scope
   (packer build pipeline) and lives in molecule-controlplane-ish
   territory; filing as parked follow-up under #46 if not already.

Refs: molecule-ai/internal#46, molecule-ai/internal#37,
molecule-ai/internal#38, saved memory reference_post_suspension_pipeline
2026-05-07 00:56:23 -07:00
documentation-specialist bd145dcec6 docs(workspace-runtime): migrate github.com refs at source so mirror inherits Gitea links (internal#41)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 12s
CI / Python Lint & Test (pull_request) Failing after 12s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 11s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 41s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m18s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m21s
The molecule-ai-workspace-runtime mirror is regenerated on every
runtime-v* tag from this monorepo's workspace/. Per saved memory
reference_runtime_repo_is_mirror_only, mirror-guard rejects direct
PRs to the mirror; edit at source.

Source-side files that propagate to the mirror's published README +
read by users of the in-monorepo workspace-runtime docs:

- scripts/build_runtime_package.py (the README generator):
  * line 281 README_TEMPLATE: 'Shared workspace runtime for Molecule
    AI' link → Gitea
  * line 399 doc-link to workspace-runtime-package.md → Gitea path
    (with /src/branch/main/ shape)
  LEFT AS-IS (per Q3 audit-trail decision):
  * lines 379, 392 historical issue cross-refs (#2936, #2937)

- workspace/build-all.sh:5 — comment block linking to template-*
  repos. Migrated to Gitea path-shape.

- docs/workspace-runtime-package.md:
  * lines 101-108 adapter→repo table (8 templates, all PUBLIC on
    Gitea) — Gitea URLs
  * line 247 starter-repo link — substituted host + added inline
    note that starter doesn't survive the suspension migration
    (recreation pending; cross-link to this issue)
  * line 259 generic git clone command for new templates → Gitea
  * line 289 second starter mention — same handling as 247

Files NOT touched in this PR:
- workspace/ Python source code (.py files) — those use github
  paths in docstrings + a few log strings; fix bundled with the
  cross-repo Go-module-style migration (per #37 Q5 + parked
  follow-ups).
- 'Writing a new adapter' section's `gh repo create` command (line
  254-256) — gh CLI doesn't talk to Gitea (per #45 parked follow-up).
- 'Writing a new adapter' section's ghcr.io image ref (line 276) —
  per #46 ghcr→ECR migration (separate concern).

After this PR merges to staging + a runtime-v* tag is pushed, the
mirror's published README will inherit the Gitea link. Until then
the mirror's README continues to reference github.com/Molecule-AI
(stale but historical-marker-correct since the mirror existed
pre-suspension).

Refs: molecule-ai/internal#41, molecule-ai/internal#37,
molecule-ai/internal#38, molecule-ai/internal#42,
molecule-ai/internal#45, molecule-ai/internal#46
2026-05-07 00:48:04 -07:00
claude-ceo-assistant 624ef4d06d perf(workspace-server,canvas): EIC tunnel pool + canvas Promise.all (closes core#11)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Failing after 9s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 52s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 43s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m20s
Harness Replays / Harness Replays (pull_request) Failing after 31s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m20s
CI / Platform (Go) (pull_request) Failing after 2m41s
CI / Canvas (Next.js) (pull_request) Failing after 2m42s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 5m56s
## Symptom
Canvas detail-panel "config + filesystem load" took ~20s. Reported on
production hongming tenant, workspace c7c28c0b-... (Claude Code Agent T2).

## Two stacked latency sources

### 1. Server-side: per-call EIC tunnel setup (~80% of the win)

`workspace-server/internal/handlers/template_files_eic.go::realWithEICTunnel`
performed ssh-keygen + SendSSHPublicKey + open-tunnel + waitForPort PER call.
4 callers (read/write/list/delete) each paid the full ~3-5s setup cost even
when fired back-to-back on the same workspace EC2.

Fix: refcounted pool keyed on instanceID with TTL ≤ 50s (under the 60s
SendSSHPublicKey grant). One tunnel serves N file ops; concurrent acquires
for the same instance share the slot via a pendingSetups gate; LRU eviction
caps simultaneous tracked instances at 32. Poisons entries on tunnel-fatal
errors (connection refused, broken pipe, auth failed) so the next acquire
builds fresh. Cleanup on panic via defer-release pattern (added after
self-review caught a refcount-leak hazard).

Public API unchanged — `var withEICTunnel` rebinds to `pooledWithEICTunnel`
at package init, so all 4 callers inherit pooling for free.

10 unit tests pin: 4-ops-amortise (1 setup), different-instances-do-not-share,
TTL eviction, poison invalidates, concurrent-acquire-single-setup,
TTL=0 escape hatch, LRU eviction at cap, error classification heuristic,
refcount blocks expired eviction, panic poisons entry. All green.

### 2. Canvas-side: serial fan-out + duplicate fetch (~20% of the win)

`canvas/src/components/tabs/ConfigTab.tsx::loadConfig` awaited 3 independent
metadata GETs (`/workspaces/{id}`, `/model`, `/provider`) serially.
`AgentCardSection` fired a SECOND `/workspaces/{id}` from its own useEffect.

Fix: Promise.all over the 3 metadata GETs (each leg keeps its existing
.catch fallback semantics). AgentCardSection now reads `agentCard` from
the canvas store (`useCanvasStore`) instead of refetching — the canvas
already hydrates `node.data.agentCard` from the platform event stream.
Defensive selector handles test mocks without a `nodes` array.

## Verification

- `go test ./internal/handlers/` 5.07s green (full handlers package, including
  10 new pool tests)
- `go vet ./internal/handlers/` clean
- `npx vitest run` — 1380/1380 canvas unit tests pass (2 test FILES fail on
  a pre-existing xyflow CSS-load issue in vitest config, unrelated to this
  change)
- `npx tsc --noEmit` clean

Live wall-time verification deferred to Phase 4 / E2E (canvas browser session
required; external probe blocked by 403 since the canvas auth chain is
session-cookie + Origin header, not a bearer token I can fabricate).

## Backwards compatibility

API surface unchanged. All 4 EIC handler callers use the rebound var; no
caller migration. Pool defaults to enabled (TTL=50s); tests can disable by
setting poolTTL=0 or by overwriting withEICTunnel directly (existing stub
pattern in template_files_eic_dispatch_test.go preserved).

## Hostile self-review (3 weakest spots)

1. `fnErrIndicatesTunnelFault` is a substring grep on err.Error() — the
   marker list is hand-curated and ssh client error formats vary across
   OpenSSH versions. A future ssh that reports a tunnel failure via a
   phrasing not in the list would NOT poison the entry → next callers reuse
   a dead tunnel until TTL evicts. Acceptable: TTL bounds the impact (≤50s
   of bad reuse), and the heuristic covers every tunnel-error shape that
   appears in the existing test fixtures and known incidents.

2. `acquire`'s for-loop has unbounded retry potential under pathological
   churn (signal closed → new acquirer → setup fails → repeat). No bounded
   retry counter. Today there is no test exercise for "flaky setup that
   succeeds-then-fails-then-succeeds"; if observability ever shows this
   shape, add a max-retry guard. Filed as a known limitation, not blocking.

3. The substring assertion `strings.Contains` style I used for tunnel-fault
   classification could false-positive on app-level error messages that
   happen to contain "permission denied" or "broken pipe" verbatim. The
   classification test covers the discriminator but only against the
   error shapes we know today. Acceptable: poisoning errs on the side of
   building fresh, which is correct-but-slightly-slow rather than incorrect.

## Phase 4 / E2E plan

- Live timing of the canvas detail-panel open against a real workspace
  (browser session, not external probe).
- Target: perceived latency under 2s on warm pool. Cold open still pays
  one tunnel setup (~3-5s) — the pool buys you the SECOND through Nth
  panel-open within the TTL window.
- Memory `feedback_chase_verification_to_staging` applies — will not
  declare done at PR-merge; will follow through to user-visible behavior
  on staging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 23:17:58 -07:00
claude-ceo-assistant f92ba492de Merge pull request 'test(org_import): tighten sqlmock regex on lookupExistingChild (#2872 PR-B)' (#3) from fix/2872-sqlmock-regex-tightening into staging
Harness Replays / detect-changes (push) Successful in 6s
Harness Replays / Harness Replays (push) Failing after 43s
publish-workspace-server-image / build-and-push (push) Failing after 2m17s
Auto-sync main → staging / sync-staging (push) Successful in 6s
CI / Detect changes (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Failing after 4s
Block internal-flavored paths / Block forbidden paths (push) Successful in 26s
CI / Python Lint & Test (push) Failing after 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Failing after 10s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 48s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 27s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 40s
Secret scan / Scan diff for credential-shaped strings (push) Failing after 1m11s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 1m23s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m24s
CI / Canvas (Next.js) (push) Failing after 1m57s
CI / Canvas Deploy Reminder (push) Has been skipped
CI / Platform (Go) (push) Failing after 2m27s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 4m45s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Failing after 14s
Canary — staging SaaS smoke (every 30 min) / Canary smoke (push) Failing after 16s
2026-05-07 00:19:40 +00:00
claude-ceo-assistant 75a72bf5a2 feat(canvas/chat-server): canvas consumes /chat-history + server-side row-aware reverse (RFC #2945 PR-C-2)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 30s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Failing after 9s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 54s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m19s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m20s
Harness Replays / Harness Replays (pull_request) Failing after 46s
CI / Canvas (Next.js) (pull_request) Failing after 2m21s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Failing after 2m44s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4m49s
Closes the SSOT story shipped in PR-C/D: canvas now consumes the typed
/chat-history endpoint instead of /activity?type=a2a_receive, and the
server emits messages in display-ready chronological order so the
client doesn't have to re-order them.

## Canvas (consumer migration)

- loadMessagesFromDB swaps from /activity to /chat-history.
- Drops type=a2a_receive + source=canvas params (server applies the
  filter centrally now).
- Drops [...activities].reverse() — wire is already display-ready.
- Drops the local INTERNAL_SELF_MESSAGE_PREFIXES constant +
  isInternalSelfMessage helper. Server-side IsInternalSelfMessage
  applies the same predicate before emitting rows.
- Drops the activityRowToMessages + ActivityRowForHydration imports
  from historyHydration.ts. The TS parser stays in tree because
  message-parser.ts is still load-bearing for live A2A WebSocket
  messages (ChatTab.tsx:805, AgentCommsPanel.tsx, canvas-events.ts).

## Server (row-aware wire-order fix)

The pre-PR-C-2 client did `[...activities].reverse()` over ROWS, then
flattened each row into [user, agent] messages. The reversal was
ROW-aware. After PR-C/D, the server returned a flat ChatMessage slice
in `ORDER BY created_at DESC` order, with [user, agent] within each
row. A naive client-side flat reverse would FLIP each pair (agent
before user at same timestamp).

Two ways to fix it:

  A) Server emits oldest-first within page; canvas does NOT reverse.
  B) Canvas does row-aware reversal (group by timestamp, reverse).

Option A is cleaner — server owns the wire-order responsibility, every
client trusts `for m of messages` to render chronologically. Server
adds reverseRowChunks() that:

  1. Groups consecutive same-Timestamp messages into row chunks
     (1-2 messages per row).
  2. Reverses the chunk order (newest-row-first → oldest-row-first).
  3. Flattens. Within-chunk [user, agent] order is preserved.

Single-message rows (agent reply not yet recorded, attachments-only
user upload) collapse to 1-element chunks and reverse correctly too.

## Tests

Server: 3 new unit tests on reverseRowChunks (paired across rows,
single-message rows, empty input) + 1 sqlmock integration test on
List() that drives the full SQL → reverse → wire path. Mutation-tested:
removed `messages = reverseRowChunks(messages)` from List(), confirmed
the integration test fires red with all 4 misordered indices flagged.
Restored, all 25 messagestore tests + 9 chat-history handler tests
green.

Canvas: 8 lazyHistory pagination tests refactored to mock
/chat-history (not /activity) and assert against the new wire shape
({messages, reached_end} not raw activity rows). All 1389/1389 vitest
tests green; tsc --noEmit clean.

## Three weakest spots (hostile-reviewer self-pass)

1. reverseRowChunks groups by Timestamp string equality. If two
   distinct rows had the SAME timestamp (legitimately possible at sub-
   millisecond granularity), the algorithm would treat them as one
   chunk and not reverse them relative to each other. Mitigated:
   activity_logs.created_at uses microsecond resolution; concurrent
   inserts at exact-same microsecond are vanishingly rare. If a
   collision happens, the within-chunk order is whatever the SQL
   returned — both rows render at the same timestamp, no user-visible
   misordering.

2. The pre-existing TS parser files (historyHydration.ts +
   message-parser.ts) stay in tree. historyHydration.ts is now dead
   code (no consumers post-migration); deletion is parked as a follow-
   up after a one-week observation window confirms no live-message
   consumer reaches it.

3. canvas's loadMessagesFromDB returns `resp.messages ?? []`. If the
   server were ever to return `null` instead of `[]` (it currently
   doesn't — handler defensively coerces nil to []), the nullish coalesce
   keeps the canvas from crashing. A stricter wire schema would assert
   the never-null invariant; for today's pragmatic safety, the ?? is
   enough.

## Security review

- Untrusted input? Same as PR-C — agent JSON parsed defensively in
  the messagestore parser. No new exposure.
- Trust boundary? Same. Canvas → /chat-history → wsAuth → messagestore.
- Output sanitization? Plain text + opaque attachment URIs as before.

No security-relevant changes beyond what /chat-history already
exposes via PR-C. Considered, not skipped.

## Versioning / backwards compat

- /activity endpoint unchanged.
- /chat-history endpoint shape unchanged (still {messages, reached_end});
  only the wire ORDER within a page changed (newest-first row → oldest-
  first row). Canvas is the only consumer in tree; no API consumers
  depend on the previous order.
- canvas's loadMessagesFromDB call signature unchanged — internal
  refactor.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-06 16:55:00 -07:00
Hongming Wang 00cfe51df7 test(org_import): tighten sqlmock regex on lookupExistingChild (#2872 PR-B)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 41s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m23s
CI / Python Lint & Test (pull_request) Successful in 31s
CI / Canvas (Next.js) (pull_request) Successful in 52s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 40s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 40s
Harness Replays / Harness Replays (pull_request) Failing after 43s
CI / Platform (Go) (pull_request) Failing after 2m23s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4m47s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 14m23s
The five `mock.ExpectQuery(\`SELECT id FROM workspaces\`)` sites used a
loose substring regex that silent-passed three regression shapes #2872
called out:

  1. `WHERE parent_id = $2` (drops `IS NOT DISTINCT FROM` — breaks
     NULL-parent root matching)
  2. `WHERE name = $1` only (drops parent_id check entirely — hijacks
     siblings of the same name across different parents)
  3. Drops `AND status != 'removed'` (blocks re-import after Collapse)

Extracts a `lookupChildSQLRE` const that anchors all four load-bearing
tokens (the SELECT/FROM, the name predicate, the IS NOT DISTINCT FROM
predicate, and the status filter). All five ExpectQuery sites now use
the same const so a future schema/predicate change fails one place.

Mutation-tested per memory feedback_assert_exact_not_substring.md:
- Replacing `IS NOT DISTINCT FROM` with `=` fails
  TestLookupExistingChild_NilParent_MatchesRoot.
- Dropping `AND status != 'removed'` fails
  TestLookupExistingChild_Found_ReturnsIDAndTrue.

Note: #2872 PR-A (AST gate strengthening) is already addressed inline —
findWorkspacesInsertSQL + TestCreateWorkspaceTree_InsertUsesOnConflictDoNothing
pin the ON CONFLICT DO NOTHING shape, which is a strictly stronger
gate than the original lookup-before-insert ordering check.
2026-05-06 16:43:42 -07:00
Hongming Wang 3cdb67f27e fix(workspace-server): CP orphan sweeper closes deprovision split-write race (#2989)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
Harness Replays / detect-changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 18s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 43s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m19s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m22s
Harness Replays / Harness Replays (pull_request) Failing after 37s
CI / Platform (Go) (pull_request) Failing after 2m33s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4m48s
The deprovision path marks `workspaces.status='removed'` BEFORE calling
the controlplane DELETE. If that CP call fails (transient 5xx, network
hiccup, AWS provider error), the DB row stays at 'removed' with
`instance_id` populated and there's no retry — the EC2 lives forever.
9 prod orphans accumulated over 3 days under this bug.

Adds a SaaS-mode counterpart to the existing Docker `orphan_sweeper`:
- 60s tick (matches the Docker sweeper cadence)
- LIMIT 100 per cycle so a sustained CP outage drains over multiple
  cycles without blowing the request timeout
- Re-issues `cpProv.Stop` for any workspace at status='removed' with a
  non-NULL `instance_id`. Stop is idempotent (AWS terminate on
  already-terminated is a no-op; CP's Deprovision tolerates already-
  deleted DNS) so retries are safe.
- On Stop success, NULLs `instance_id` so the next cycle skips the row.
- On Stop failure, leaves `instance_id` populated for next cycle.

The existing Docker sweeper is gated on `prov != nil`; the new sweeper
is gated on `cpProv != nil`. SaaS tenants get exactly one of the two,
self-hosted tenants get the Docker one — no overlap.

Why this shape over option A (CP-first ordering) or B (durable outbox):
the existing inline path already returns a loud 500 to the user when
CP fails — the only missing piece is automatic retry, which a 60s
sweeper provides without protocol changes, new tables, or new workers.
~30 LOC of production code vs. ~400 for an outbox. RFC discussion in
#2989 comment chain.

Tests:
- 9 unit tests covering happy path, Stop failure, UPDATE failure,
  multiple orphans (one-fails-others-still-process), DB query error,
  nil-DB defense, nil-reaper short-circuit, and the boot-immediate-then-
  tick cadence contract.
- Mutation-tested: status='running' substitution and removed-UPDATE-
  block both fail at least one test.

Out of scope:
- Backfilling the 9 named orphans — they'll heal automatically on the
  first sweep cycle after this lands; no manual cleanup needed.
- Long-term durable-outbox architecture — separate RFC.
2026-05-06 16:43:33 -07:00
claude-ceo-assistant 55ef3176ed feat(provisioner): env-driven RegistryPrefix() for workspace template images (#6)
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 6s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 5s
E2E API Smoke Test / detect-changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 5s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
CI / Python Lint & Test (push) Successful in 30s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Failing after 48s
CI / Canvas (Next.js) (push) Successful in 48s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 49s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
CI / Canvas Deploy Reminder (push) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Failing after 1m19s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 39s
Harness Replays / Harness Replays (push) Failing after 37s
CI / Platform (Go) (push) Failing after 2m8s
publish-workspace-server-image / build-and-push (push) Failing after 2m39s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 4m46s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Failing after 13m21s
Allows MOLECULE_IMAGE_REGISTRY env override on the tenant workspace-server. Used to flip from ghcr.io/molecule-ai → private ECR mirror after the GitHub org suspension on 2026-05-06. Default unchanged for OSS users.

Closes #6.
2026-05-06 22:51:53 +00:00
claude-ceo-assistant 4b074f631b feat(provisioner): env-driven RegistryPrefix() for workspace template images (#6)
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 41s
Harness Replays / Harness Replays (pull_request) Failing after 30s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Failing after 3m8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 5m7s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 14m4s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 14m36s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 14m30s
Block internal-flavored paths / Block forbidden paths (pull_request) Has been cancelled
CI / Python Lint & Test (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Has been cancelled
CI / Canvas (Next.js) (pull_request) Has been cancelled
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Has been cancelled
CI / Detect changes (pull_request) Has been cancelled
Secret scan / Scan diff for credential-shaped strings (pull_request) Has been cancelled
E2E API Smoke Test / detect-changes (pull_request) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (pull_request) Has been cancelled
Harness Replays / detect-changes (pull_request) Has been cancelled
Handlers Postgres Integration / detect-changes (pull_request) Has been cancelled
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Has been cancelled
CI / Shellcheck (E2E scripts) (pull_request) Has been cancelled
Add MOLECULE_IMAGE_REGISTRY env var to override the registry prefix used
by all workspace-template image references. Defaults to ghcr.io/molecule-ai
(unchanged for OSS users); set to an ECR URI in production tenants when
mirroring to AWS.

Why this matters: GitHub suspended the Molecule-AI org on 2026-05-06 with
no warning. Production tenants kept running because they had images cached
locally, but any tenant restart (AWS health event, redeploy, OS reboot)
would have failed at `docker pull ghcr.io/molecule-ai/...` because GHCR
returned 401. This change introduces the seam needed to point new pulls at
a registry we control (AWS ECR) by flipping a single env var on Railway.

Design (RFC: molecule-ai/internal#6):

- New `RegistryPrefix()` function in `provisioner/registry.go` reads
  MOLECULE_IMAGE_REGISTRY, falls back to "ghcr.io/molecule-ai".
- New `RuntimeImage(runtime)` returns the canonical ref using the prefix.
- `RuntimeImages` map computed at init via `computeRuntimeImages()` so
  existing callers that range over it still work.
- `DefaultImage` likewise computed via `RuntimeImage(defaultRuntime)`.
- `handlers.TemplateImageRef()` switched from hardcoded format string to
  `provisioner.RegistryPrefix()`.
- `runtime_image_pin.go::resolveRuntimeImage()` automatically inherits
  the prefix change because it reads from `provisioner.RuntimeImages[]`
  and only re-formats the tag suffix to a digest pin.

Alternatives rejected (see RFC):

- Multi-registry fallback chain (try ECR, fall back to GHCR): GHCR is
  locked from outbound for our org, so the fallback never works for us.
  Adds code complexity for no benefit.
- Hardcoded ECR-only switch: couples production code to a specific
  deployment environment. OSS users self-hosting Molecule would need
  the upstream GHCR.
- Self-hosted Harbor / registry-on-Hetzner: adds a component to operate.
  Not justified at 3-tenant scale; AWS ECR is mature and IAM-integrated.

Auth — deliberately NOT changed in this commit:

- For GHCR, the existing `ghcrAuthHeader()` reads GHCR_USER/GHCR_TOKEN.
- For ECR, EC2 user-data installs `amazon-ecr-credential-helper` and adds
  a `credHelpers` entry in `~/.docker/config.json` so the daemon resolves
  ECR credentials via the EC2 instance role on every pull. The Go code
  needs no auth change. This keeps the diff minimal.

Backwards compatibility:

- Additive: env unset → identical behavior to today (GHCR).
- Existing tests reference literal `ghcr.io/molecule-ai/...` strings;
  they continue to pass under the default prefix.
- `RuntimeImages` map preserved for callers that iterate it.
- No interface, schema, API, or migration version bump needed.

Security review:

- No untrusted input: MOLECULE_IMAGE_REGISTRY is set at deploy time
  (Railway env, EC2 user-data), not by users.
- No expanded data collection or logging changes.
- No new permissions: ECR pull permission is a future user-data + IAM
  role change, separate from this code change.
- Worst-case: an attacker who already compromises Railway can swap the
  registry prefix to a malicious URI — same blast radius as compromising
  Railway today, no expansion.

Tests:

- 9 new unit tests in `registry_test.go` covering: default fallback,
  env override, empty env, all 9 known runtimes, unknown runtime,
  override-applies-to-all, computeRuntimeImages map population, env
  reflection, alphabetical ordering pin.
- All existing provisioner + handlers tests continue to pass.
- Mutation-tested mentally: deleting `if v := os.Getenv(...)` makes
  TestRegistryPrefix_RespectsEnv fail. Deleting `for _, r := range
  knownRuntimes` makes TestRuntimeImage_AllKnownRuntimes fail. The test
  suite would catch a regression of the original failure mode.

Rollout plan: this PR is safe to merge with no env change. Production
cutover happens by setting MOLECULE_IMAGE_REGISTRY on Railway after
the AWS ECR mirror is populated (separate ops change, tracked in
issue #6 phases 3b–3f).

Tracking:
- RFC: molecule-ai/internal#6
- Tasks: #97 (ECR setup), #98 (CP fallback)
- Tech debt: runbooks/hetzner-rollout-tech-debt-2026-05-06.md item 7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 14:23:01 -07:00
55 changed files with 2572 additions and 521 deletions
+1 -1
View File
@@ -37,7 +37,7 @@ CANONICAL_FILE = Path(".github/workflows/secret-scan.yml")
CONSUMERS: list[tuple[str, str]] = [
(
"molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh",
"https://raw.githubusercontent.com/Molecule-AI/molecule-ai-workspace-runtime/main/molecule_runtime/scripts/pre-commit-checks.sh",
"https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/raw/branch/main/molecule_runtime/scripts/pre-commit-checks.sh",
),
]
@@ -103,7 +103,7 @@ jobs:
with:
fetch-depth: 0
ref: staging
token: ${{ secrets.GITHUB_TOKEN }}
token: ${{ secrets.AUTO_SYNC_TOKEN }}
- name: Configure git author
run: |
@@ -174,7 +174,7 @@ jobs:
- name: Open auto-sync PR + enable auto-merge
if: steps.check.outputs.needs_sync == 'true'
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
BRANCH: ${{ steps.check.outputs.branch }}
MAIN_SHORT: ${{ steps.check.outputs.main_short }}
DID_FF: ${{ steps.prep.outputs.did_ff }}
+1 -1
View File
@@ -235,7 +235,7 @@ jobs:
run: npx vitest run --coverage
- name: Upload coverage summary as artifact
if: needs.changes.outputs.canvas == 'true' && always()
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
uses: actions/upload-artifact@v3 # pinned to v3 for Gitea act_runner v0.6 compatibility (internal#46)
with:
name: canvas-coverage-${{ github.run_id }}
path: canvas/coverage/
+3 -12
View File
@@ -55,17 +55,8 @@ jobs:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Checkout sibling plugin repo
# Same reasoning as publish-workspace-server-image.yml — the Go
# module's replace directive needs the plugin source so
# CodeQL's "go build" phase can resolve.
if: matrix.language == 'go'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: Molecule-AI/molecule-ai-plugin-github-app-auth
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
# github-app-auth sibling-checkout removed 2026-05-07 (#157):
# plugin was dropped + the Dockerfile no longer needs it.
# jq is pre-installed on ubuntu-latest — no setup step needed.
- name: Initialize CodeQL
@@ -121,7 +112,7 @@ jobs:
# 14-day retention — longer than default 3, short enough not
# to bloat quota.
if: always()
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
uses: actions/upload-artifact@v3 # pinned to v3 for Gitea act_runner v0.6 compatibility (internal#46)
with:
name: codeql-sarif-${{ matrix.language }}
path: sarif-results/${{ matrix.language }}/
+2 -2
View File
@@ -139,7 +139,7 @@ jobs:
- name: Upload Playwright report on failure
if: failure() && needs.detect-changes.outputs.canvas == 'true'
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
uses: actions/upload-artifact@v3 # pinned to v3 for Gitea act_runner v0.6 compatibility (internal#46)
with:
name: playwright-report-staging
path: canvas/playwright-report-staging/
@@ -147,7 +147,7 @@ jobs:
- name: Upload screenshots on failure
if: failure() && needs.detect-changes.outputs.canvas == 'true'
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
uses: actions/upload-artifact@v3 # pinned to v3 for Gitea act_runner v0.6 compatibility (internal#46)
with:
name: playwright-screenshots
path: canvas/test-results/
+2 -10
View File
@@ -95,16 +95,8 @@ jobs:
- if: needs.detect-changes.outputs.run == 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Checkout sibling plugin repo
# Dockerfile.tenant copies molecule-ai-plugin-github-app-auth/
# at the build-context root (see workspace-server/Dockerfile.tenant
# line 19). PLUGIN_REPO_PAT pattern matches publish-workspace-server-image.yml.
if: needs.detect-changes.outputs.run == 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: Molecule-AI/molecule-ai-plugin-github-app-auth
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
# github-app-auth sibling-checkout removed 2026-05-07 (#157):
# the plugin was dropped + Dockerfile.tenant no longer COPYs it.
- name: Install Python deps for replays
# peer-discovery-404 (and future replays) eval Python against the
+126 -53
View File
@@ -282,42 +282,33 @@ jobs:
echo "::error::Refusing to fan out cascade against stale or corrupt PyPI surfaces."
exit 1
- name: Fan out repository_dispatch
- name: Fan out via push to .runtime-version
env:
# Fine-grained PAT with `actions:write` on the 8 template repos.
# GITHUB_TOKEN can't fire dispatches across repos — needs an explicit
# token. Stored as a repo secret; rotate per the standard schedule.
DISPATCH_TOKEN: ${{ secrets.TEMPLATE_DISPATCH_TOKEN }}
# Single source of truth: the publish job's output, which handles
# tag/manual-input/auto-bump uniformly. The previous fallback
# (`steps.version.outputs.version` from inside the cascade job)
# was a dead reference — different job, no shared step scope.
# Gitea PAT with write:repository scope on the 8 cascade-active
# template repos. Used here for `git push` (NOT for an API
# dispatch — Gitea 1.22.6 has no repository_dispatch endpoint;
# empirically verified across 6 candidate paths in molecule-
# core#20 issuecomment-913). The push trips each template's
# existing `on: push: branches: [main]` trigger on
# publish-image.yml, which then reads the updated
# .runtime-version via its resolve-version job.
DISPATCH_TOKEN: ${{ secrets.DISPATCH_TOKEN }}
RUNTIME_VERSION: ${{ needs.publish.outputs.version }}
run: |
set +e # don't abort on a single repo failure — collect them all
# Schedule-vs-dispatch behaviour split (hardened 2026-04-28
# after the sweep-cf-orphans soft-skip incident — same class
# of bug):
#
# The earlier "skipping cascade. templates will pick up the
# new version on their own next rebuild" message was wrong —
# templates only build on this dispatch trigger; without it
# they stay pinned to whatever runtime version they last saw.
# A silent skip here means "PyPI is current, templates are
# not" and the gap is invisible until someone notices a
# template still on the old version weeks later.
#
# - push → exit 1 (red CI surfaces the gap)
# - workflow_dispatch → exit 0 with a warning (operator
# ran this ad-hoc; let them rerun
# after fixing the secret)
# Soft-skip on workflow_dispatch when the token is missing
# (operator ad-hoc test); hard-fail on push so unattended
# publishes can't silently skip the cascade. Same shape as
# the original v1, intentional split per the schedule-vs-
# dispatch hardening 2026-04-28.
if [ -z "$DISPATCH_TOKEN" ]; then
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "::warning::TEMPLATE_DISPATCH_TOKEN secret not set — skipping cascade."
echo "::warning::DISPATCH_TOKEN secret not set — skipping cascade."
echo "::warning::set it at Settings → Secrets and Variables → Actions, then rerun. Templates will stay on the prior runtime version until either this token is set or each template is rebuilt manually."
exit 0
fi
echo "::error::TEMPLATE_DISPATCH_TOKEN secret missing — cascade cannot fan out."
echo "::error::DISPATCH_TOKEN secret missing — cascade cannot fan out."
echo "::error::PyPI was published, but the 8 template repos will NOT pick up the new version until this token is restored and a republish dispatches the cascade."
echo "::error::set it at Settings → Secrets and Variables → Actions; then re-trigger publish-runtime via workflow_dispatch."
exit 1
@@ -327,37 +318,119 @@ jobs:
echo "::error::publish job did not expose a version output — cascade cannot fan out"
exit 1
fi
# All 9 active workspace template repos. The PR #2536 pruning
# ("deprecated, no shipping images") was empirically wrong:
# continuous-synth-e2e.yml defaults to langgraph as its primary
# canary (line 44), and every excluded template had successful
# publish-image runs as of 2026-05-03 — none were dormant.
# Symptom of the prune: today's a2a-sdk strict-mode fix
# (#2566 / commit e1628c4) cascaded to 4 templates but never
# reached langgraph, so the synth-E2E correctly canary'd a fix
# that had landed but not deployed. Re-added the 5 templates.
# Long-term: derive this list from manifest.json so cascade
# scope can't drift from E2E scope — tracked in RFC #388 as a
# Phase-1 invariant.
# All 9 workspace templates declared in manifest.json. The list
# MUST stay aligned with manifest.json's workspace_templates —
# cascade-list-drift-gate.yml enforces this in CI per the
# codex-stuck-on-stale-runtime invariant from PR #2556.
# Long-term goal: derive this list from manifest.json so it
# can't drift even on a manifest edit (RFC #388 Phase-1).
#
# Per-template publish-image.yml presence is checked at
# cascade-time below: codex doesn't ship one today, so the
# cascade soft-skips it with an informational message rather
# than dropping it from this list (which would re-introduce
# the drift the gate exists to catch).
GITEA_URL="${GITEA_URL:-https://git.moleculesai.app}"
TEMPLATES="claude-code hermes openclaw codex langgraph crewai autogen deepagents gemini-cli"
FAILED=""
SKIPPED=""
# Configure git identity once. The persona owning DISPATCH_TOKEN
# is the same identity that authored this commit on each
# template; using a generic "publish-runtime cascade" co-author
# trailer in the message keeps the audit trail honest about the
# workflow-driven origin.
git config --global user.name "publish-runtime cascade"
git config --global user.email "publish-runtime@moleculesai.app"
WORKDIR="$(mktemp -d)"
for tpl in $TEMPLATES; do
REPO="Molecule-AI/molecule-ai-workspace-template-$tpl"
STATUS=$(curl -sS -o /tmp/dispatch.out -w "%{http_code}" \
-X POST "https://api.github.com/repos/$REPO/dispatches" \
-H "Authorization: Bearer $DISPATCH_TOKEN" \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
-d "{\"event_type\":\"runtime-published\",\"client_payload\":{\"runtime_version\":\"$VERSION\"}}")
if [ "$STATUS" = "204" ]; then
echo "✓ dispatched $tpl ($VERSION)"
else
echo "::warning::✗ failed to dispatch $tpl: HTTP $STATUS — $(cat /tmp/dispatch.out)"
REPO="molecule-ai/molecule-ai-workspace-template-$tpl"
CLONE="$WORKDIR/$tpl"
# Pre-check: skip templates without a publish-image.yml.
# The cascade's job is to trip the template's on-push
# rebuild — if there's no rebuild workflow, pushing a
# .runtime-version commit is just noise on the target
# repo. Use the Gitea contents API (no clone required for
# the probe). 200 = present; 404 = absent.
HTTP=$(curl -sS -o /dev/null -w "%{http_code}" \
-H "Authorization: token $DISPATCH_TOKEN" \
"$GITEA_URL/api/v1/repos/$REPO/contents/.github/workflows/publish-image.yml")
if [ "$HTTP" = "404" ]; then
echo "↷ $tpl has no publish-image.yml — soft-skip (informational; manifest still tracks it)"
SKIPPED="$SKIPPED $tpl"
continue
fi
if [ "$HTTP" != "200" ]; then
echo "::warning::$tpl publish-image.yml probe returned HTTP $HTTP — proceeding anyway, push will surface the real failure if any"
fi
# Use a per-template attempt loop so a transient race (e.g.
# human pushing to the same template at the same instant)
# doesn't lose the cascade. Bounded retries (3) — beyond
# that we surface the failure and let the operator retry.
attempt=0
success=false
while [ $attempt -lt 3 ]; do
attempt=$((attempt + 1))
rm -rf "$CLONE"
if ! git clone --depth=1 \
"https://x-access-token:${DISPATCH_TOKEN}@${GITEA_URL#https://}/$REPO.git" \
"$CLONE" >/tmp/clone.log 2>&1; then
echo "::warning::clone $tpl attempt $attempt failed: $(tail -n3 /tmp/clone.log)"
sleep 2
continue
fi
cd "$CLONE"
echo "$VERSION" > .runtime-version
# Idempotency guard: if the file already matches, this
# publish is a re-run for a version already cascaded.
# Don't push a no-op commit (would spuriously re-trip the
# template's on-push and rebuild for nothing).
if git diff --quiet -- .runtime-version; then
echo "✓ $tpl already at $VERSION — no commit needed (idempotent)"
success=true
cd - >/dev/null
break
fi
git add .runtime-version
git commit -m "chore: pin runtime to $VERSION (publish-runtime cascade)" \
-m "Co-Authored-By: publish-runtime cascade <publish-runtime@moleculesai.app>" \
>/dev/null
if git push origin HEAD:main >/tmp/push.log 2>&1; then
echo "✓ $tpl pushed $VERSION on attempt $attempt"
success=true
cd - >/dev/null
break
fi
# Likely a non-fast-forward — pull-rebase and retry.
# Don't force-push: that would silently overwrite a racing
# human/cascade commit.
echo "::warning::push $tpl attempt $attempt failed, pull-rebasing: $(tail -n3 /tmp/push.log)"
git pull --rebase origin main >/tmp/rebase.log 2>&1 || true
cd - >/dev/null
done
if [ "$success" != "true" ]; then
FAILED="$FAILED $tpl"
fi
done
rm -rf "$WORKDIR"
if [ -n "$FAILED" ]; then
echo "::warning::Cascade incomplete. Failed templates:$FAILED"
# Don't fail the whole job — PyPI publish already succeeded;
# operators can retry the failed templates manually.
echo "::error::Cascade incomplete after 3 retries each. Failed templates:$FAILED"
echo "::error::PyPI publish succeeded; failed templates lag the new version. Re-run this workflow_dispatch with the same version to retry only the laggers (idempotent — already-cascaded templates skip)."
exit 1
fi
if [ -n "$SKIPPED" ]; then
echo "Cascade complete: pinned $VERSION on cascade-active templates. Soft-skipped (no publish-image.yml):$SKIPPED"
else
echo "Cascade complete: $VERSION pinned across all manifest workspace_templates."
fi
@@ -60,8 +60,8 @@ permissions:
packages: write
env:
IMAGE_NAME: ghcr.io/molecule-ai/platform
TENANT_IMAGE_NAME: ghcr.io/molecule-ai/platform-tenant
IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform
TENANT_IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant
jobs:
build-and-push:
@@ -70,31 +70,28 @@ jobs:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Checkout sibling plugin repo
# workspace-server/Dockerfile expects
# ./molecule-ai-plugin-github-app-auth at build-context root because
# the Go module has a `replace` directive pointing at /plugin inside
# the image. Pre-repo-split the plugin lived in the monorepo; the
# 2026-04-18 restructure moved it out but didn't add this clone step
# — which is why publish was failing after that restructure.
#
# Uses a fine-grained PAT (PLUGIN_REPO_PAT) because the plugin repo
# is private and the default GITHUB_TOKEN is scoped to THIS repo.
# The PAT needs Contents:Read on Molecule-AI/molecule-ai-plugin-
# github-app-auth. Falls back to the default token for the (rare)
# case where an operator made the plugin repo public.
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: Molecule-AI/molecule-ai-plugin-github-app-auth
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
# github-app-auth sibling-checkout removed 2026-05-07 (#157):
# plugin was dropped + workspace-server/Dockerfile no longer
# COPYs it.
- name: Log in to GHCR
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
- name: Configure AWS credentials for ECR
# GHCR was the pre-suspension target; the molecule-ai org on
# GitHub got swept 2026-05-06 and ghcr.io/molecule-ai/* is no
# longer reachable. Post-suspension target is the operator's
# ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/*), which already hosts platform-tenant +
# workspace-template-* + runner-base images. AWS creds come
# from the AWS_ACCESS_KEY_ID/SECRET secrets bound to the
# molecule-cp IAM user. Closes #161.
uses: aws-actions/configure-aws-credentials@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-2
- name: Log in to ECR
id: ecr-login
uses: aws-actions/amazon-ecr-login@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
+6 -6
View File
@@ -22,7 +22,7 @@ development workflow, conventions, and how to get your changes merged.
```bash
# Clone the repo
git clone https://github.com/Molecule-AI/molecule-core.git
git clone https://git.moleculesai.app/molecule-ai/molecule-core.git
cd molecule-core
# Install git hooks
@@ -57,7 +57,7 @@ See `CLAUDE.md` for a full list of environment variables and their purposes.
This repo is scoped to **code** (canvas, workspace, workspace-server, related
infra). Public content (blog posts, marketing copy, OG images, SEO briefs,
DevRel demos) lives in [`Molecule-AI/docs`](https://github.com/Molecule-AI/docs).
DevRel demos) lives in [`Molecule-AI/docs`](https://git.moleculesai.app/molecule-ai/docs).
The `Block forbidden paths` CI gate fails any PR that writes to `marketing/`
or other removed paths — open against `Molecule-AI/docs` instead.
@@ -110,7 +110,7 @@ causing a render loop when any node position changed.
1. **Repo-wide:** "Automatically delete head branches" is on. Once a PR merges, the branch is deleted server-side. Any subsequent `git push` to that branch fails with `remote rejected — no such branch`.
2. **CI:** the `pr-guards` workflow (calling [molecule-ci `disable-auto-merge-on-push`](https://github.com/Molecule-AI/molecule-ci/blob/main/.github/workflows/disable-auto-merge-on-push.yml)) fires on every push to an open PR. If auto-merge was already enabled, it's disabled and a comment is posted. You must explicitly re-enable after verifying the new commit.
2. **CI:** the `pr-guards` workflow (calling [molecule-ci `disable-auto-merge-on-push`](https://git.moleculesai.app/molecule-ai/molecule-ci/src/branch/main/.github/workflows/disable-auto-merge-on-push.yml)) fires on every push to an open PR. If auto-merge was already enabled, it's disabled and a comment is posted. You must explicitly re-enable after verifying the new commit.
**Workflow rules that follow from the guards:**
- Push **all** commits before running `gh pr merge --auto`.
@@ -180,9 +180,9 @@ and run CI manually.
Code in this repo lands in molecule-core. Some related runtime artifacts
live in their own repos:
- [`Molecule-AI/molecule-ai-workspace-runtime`](https://github.com/Molecule-AI/molecule-ai-workspace-runtime) — Python adapter SDK (`molecule_runtime`) that runs inside containerized Molecule workspaces. Bridges Claude Code SDK / hermes / langgraph / etc. → A2A queue.
- [`Molecule-AI/molecule-sdk-python`](https://github.com/Molecule-AI/molecule-sdk-python) — `A2AServer` + `RemoteAgentClient` for external agents that register over the public `/registry/register` flow.
- [`Molecule-AI/molecule-mcp-claude-channel`](https://github.com/Molecule-AI/molecule-mcp-claude-channel) — Claude Code channel plugin. Bridges A2A traffic into a running Claude Code session via MCP `notifications/claude/channel`. Polling-based (no tunnel required); install with `claude --channels plugin:molecule@Molecule-AI/molecule-mcp-claude-channel`.
- [`Molecule-AI/molecule-ai-workspace-runtime`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime) — Python adapter SDK (`molecule_runtime`) that runs inside containerized Molecule workspaces. Bridges Claude Code SDK / hermes / langgraph / etc. → A2A queue.
- [`Molecule-AI/molecule-sdk-python`](https://git.moleculesai.app/molecule-ai/molecule-sdk-python) — `A2AServer` + `RemoteAgentClient` for external agents that register over the public `/registry/register` flow.
- [`Molecule-AI/molecule-mcp-claude-channel`](https://git.moleculesai.app/molecule-ai/molecule-mcp-claude-channel) — Claude Code channel plugin. Bridges A2A traffic into a running Claude Code session via MCP `notifications/claude/channel`. Polling-based (no tunnel required); install with `claude --channels plugin:molecule@Molecule-AI/molecule-mcp-claude-channel`.
When extending the **A2A surface** in molecule-core (`workspace-server/internal/handlers/a2a_proxy.go` etc.), consider whether the change has a downstream impact on the runtime SDK or the channel plugin — they're versioned independently but share the wire shape.
+52 -23
View File
@@ -1,7 +1,7 @@
<div align="center">
<p>
<img src="./docs/assets/branding/molecule-icon.png" alt="Molecule AI Icon Logo" width="160" />
<img src="./docs/assets/branding/molecule-icon.svg" alt="Molecule AI" width="160" />
</p>
<p>
@@ -39,8 +39,8 @@
<a href="./docs/agent-runtime/workspace-runtime.md"><strong>Workspace Runtime</strong></a>
</p>
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template?template=https://github.com/Molecule-AI/molecule-monorepo)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/Molecule-AI/molecule-monorepo)
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template?template=https://git.moleculesai.app/molecule-ai/molecule-core)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://git.moleculesai.app/molecule-ai/molecule-core)
</div>
@@ -53,8 +53,8 @@ Molecule AI is the most powerful way to govern an AI agent organization in produ
It combines the parts that are usually scattered across demos, internal glue code, and framework-specific tooling into one product:
- one org-native control plane for teams, roles, hierarchy, and lifecycle
- one runtime layer that lets LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, and OpenClaw run side by side
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries
- one runtime layer that lets **eight** agent runtimes — LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, **Hermes**, **Gemini CLI**, and OpenClaw run side by side behind one workspace contract
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries (Memory v2 backed by pgvector for semantic recall)
- one operational surface for observing, pausing, restarting, inspecting, and improving live workspaces
Most teams can build a workflow, a strong single agent, a coding agent, or a custom multi-agent graph.
@@ -75,7 +75,7 @@ You do not wire collaboration paths by hand. Hierarchy defines the default commu
### 3. Runtime choice stops being a dead-end decision
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, Hermes, Gemini CLI, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
### 4. Memory is treated like infrastructure
@@ -117,6 +117,8 @@ Molecule AI is not trying to replace the frameworks below. It is the system that
| **Claude Code** | Shipping on `main` | Real coding workflows, CLI-native continuity | Secure workspace abstraction, A2A delegation, org boundaries, shared control plane |
| **CrewAI** | Shipping on `main` | Role-based crews | Persistent workspace identity, policy consistency, shared canvas and registry |
| **AutoGen** | Shipping on `main` | Assistant/tool orchestration | Standardized deployment, hierarchy-aware collaboration, shared ops plane |
| **Hermes 4** | Shipping on `main` | Hybrid reasoning, native tools, json_schema (NousResearch/hermes-agent) | Option B upstream hook, A2A bridge to OpenAI-compat API, multi-provider provider derivation |
| **Gemini CLI** | Shipping on `main` | Google Gemini CLI continuity | Workspace lifecycle, A2A, hierarchy-aware collaboration, shared ops plane |
| **OpenClaw** | Shipping on `main` | CLI-native runtime with its own session model | Workspace lifecycle, templates, activity logs, topology-aware collaboration |
| **NemoClaw** | WIP on `feat/nemoclaw-t4-docker` | NVIDIA-oriented runtime path | Planned to join the same abstraction once merged; not yet part of `main` |
@@ -182,9 +184,10 @@ The result is not just “an agent that learns.” It is **an organization that
## What Ships In `main`
### Canvas
### Canvas (v4)
- Next.js 15 + React Flow + Zustand
- **warm-paper theme system** — light / dark / follow-system, SSR cookie + nonce'd boot script + ThemeProvider; terminal + code surfaces stay dark unconditionally
- drag-to-nest team building
- empty-state deployment + onboarding wizard
- template palette
@@ -193,8 +196,9 @@ The result is not just “an agent that learns.” It is **an organization that
### Platform
- Go/Gin control plane
- workspace CRUD and provisioning
- Go 1.25 / Gin control plane (80+ HTTP endpoints + Gorilla WebSocket fanout)
- workspace CRUD and provisioning (pluggable Provisioner — Docker locally, EC2 + SSM in production)
- **A2A response path is a typed discriminated union (RFC #2967)** — frozen dataclasses + total parser; 100% unit + adversarial fuzz coverage
- registry and heartbeats
- browser-safe A2A proxy
- team expansion/collapse
@@ -204,10 +208,10 @@ The result is not just “an agent that learns.” It is **an organization that
### Runtime
- unified `workspace/` image
- adapter-driven execution
- unified `workspace/` image; thin AMI in production (us-east-2)
- adapter-driven execution across **8 runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw)
- Agent Card registration
- awareness-backed memory integration
- awareness-backed memory integration; **Memory v2 backed by pgvector** for semantic recall
- plugin-mounted shared rules/skills
- hot-reloadable local skills
- coordinator-only delegation path
@@ -221,6 +225,21 @@ The result is not just “an agent that learns.” It is **an organization that
- runtime tiers
- direct workspace inspection through terminal and files
### SaaS (via [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane))
- multi-tenant on AWS EC2 + Neon (per-tenant Postgres branch) + Cloudflare Tunnels (per-tenant, no public ports)
- WorkOS AuthKit + Stripe Checkout + Customer Portal
- AWS KMS envelope encryption (DB / Redis connection strings); AWS Secrets Manager for tenant bootstrap
- `tenant_resources` audit table + 30-min boot-event-aware reconciler — every CF / AWS lifecycle event recorded, claim vs live state diffed
### Bring your own Claude Code session (via [`molecule-mcp-claude-channel`](https://github.com/Molecule-AI/molecule-mcp-claude-channel))
- Claude Code plugin that bridges Molecule A2A traffic into a local Claude Code session via MCP
- subscribe to one or more workspaces; peer messages surface as conversation turns; replies route back through Molecule's A2A
- no tunnel, no public endpoint — the plugin self-registers each watched workspace as `delivery_mode=poll` and long-polls `/activity?since_id=…`
- multi-tenant friendly: one plugin install can watch workspaces across multiple Molecule tenants (`MOLECULE_PLATFORM_URLS` per-workspace)
- install via the standard marketplace flow: `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel``/plugin install molecule-channel@molecule-mcp-claude-channel`
## Built For Teams That Need More Than A Demo
Molecule AI is especially strong when you need to run:
@@ -233,24 +252,30 @@ Molecule AI is especially strong when you need to run:
## Architecture
```text
Canvas (Next.js :3000) <--HTTP / WS--> Platform (Go :8080) <---> Postgres + Redis
| |
| +--> Docker provisioner / bundles / templates / secrets
Canvas (Next.js 15, warm-paper :3000) <--HTTP / WS--> Platform (Go 1.25 :8080) <---> Postgres + Redis
| |
| +--> Provisioner: Docker (local) / EC2 + SSM (prod)
| +--> bundles · templates · secrets · KMS
|
+-------------------- shows --------------------> workspaces, teams, tasks, traces, events
+------------------------- shows ------------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python image with adapters)
- LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / OpenClaw
- Agent Card + A2A server
- heartbeat + activity + awareness-backed memory
Workspace Runtime (Python ≥3.11, image with adapters)
- 8 adapters: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw
- Agent Card + A2A server (typed-SSOT response path, RFC #2967)
- heartbeat + activity + awareness-backed memory (Memory v2 — pgvector semantic recall)
- skills + plugins + hot reload
SaaS Control Plane (molecule-controlplane, private)
- per-tenant EC2 + Neon (Postgres branch) + Cloudflare Tunnel
- WorkOS · Stripe · KMS · AWS Secrets Manager
- tenant_resources audit + 30-min reconciler
```
## Quick Start
```bash
git clone https://github.com/Molecule-AI/molecule-monorepo.git
cd molecule-monorepo
git clone https://git.moleculesai.app/molecule-ai/molecule-core.git
cd molecule-core
cp .env.example .env
# Defaults boot the stack locally out of the box. See .env.example for
@@ -303,7 +328,11 @@ Then open `http://localhost:3000`:
## Current Scope
The current `main` branch already includes the core platform, canvas, memory model, six production adapters, skill lifecycle, and operational surfaces. Adjacent runtime work such as **NemoClaw** remains branch-level until merged, and this README keeps that distinction explicit on purpose.
The current `main` branch ships the core platform, Canvas v4 (warm-paper themed), Memory v2 (pgvector semantic recall), the typed-SSOT A2A response path (RFC #2967), **eight production adapters** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw), skill lifecycle, and operational surfaces.
The companion private repo [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane) provides the SaaS surface — multi-tenant orchestration on EC2 + Neon + Cloudflare Tunnels, KMS envelope encryption, WorkOS auth, Stripe billing, and a `tenant_resources` audit table with a 30-min reconciler.
Adjacent runtime work such as **NemoClaw** remains branch-level until merged, and this README keeps that distinction explicit on purpose.
## License
+51 -22
View File
@@ -1,7 +1,7 @@
<div align="center">
<p>
<img src="./docs/assets/branding/molecule-icon.png" alt="Molecule AI 图案 Logo" width="160" />
<img src="./docs/assets/branding/molecule-icon.svg" alt="Molecule AI" width="160" />
</p>
<p>
@@ -38,8 +38,8 @@
<a href="./docs/agent-runtime/workspace-runtime.md"><strong>Workspace Runtime</strong></a>
</p>
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template?template=https://github.com/Molecule-AI/molecule-core)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/Molecule-AI/molecule-core)
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template?template=https://git.moleculesai.app/molecule-ai/molecule-core)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://git.moleculesai.app/molecule-ai/molecule-core)
</div>
@@ -52,8 +52,8 @@ Molecule AI 是目前最强的 AI Agent 组织治理方案之一,用来把 age
它把过去分散在 demo、内部胶水代码和各类 framework 私有工具里的关键能力,收敛成一个产品:
- 一套组织原生 control plane,管理团队、角色、层级和生命周期
- 一套 runtime abstraction,让 LangGraph、DeepAgents、Claude Code、CrewAI、AutoGen、OpenClaw 并存运行
- 一套与组织边界对齐的 memory 模型,把 recall、sharing 和 skill evolution 放进同一体系
- 一套 runtime abstraction,让 **8 个** agent runtime —— LangGraph、DeepAgents、Claude Code、CrewAI、AutoGen、**Hermes**、**Gemini CLI**、OpenClaw —— 共用一套 workspace 契约
- 一套与组织边界对齐的 memory 模型,把 recall、sharing 和 skill evolution 放进同一体系Memory v2 由 pgvector 支撑语义召回)
- 一套面向线上 workspace 的运维面,统一完成观测、暂停、重启、检查和持续改进
今天很多团队能做好 workflow、单 agent、coding agent,或者自定义 multi-agent graph 中的一种。
@@ -74,7 +74,7 @@ Molecule AI 填的就是这个空白。
### 3. Runtime 选择不再是死路
LangGraph、DeepAgents、Claude Code、CrewAI、AutoGen、OpenClaw 都可以挂到同一个 workspace abstraction 下。团队可以统一治理方式,而不必统一到底层 runtime。
LangGraph、DeepAgents、Claude Code、CrewAI、AutoGen、Hermes、Gemini CLI、OpenClaw 都可以挂到同一个 workspace abstraction 下。团队可以统一治理方式,而不必统一到底层 runtime。
### 4. Memory 被当成基础设施来做
@@ -116,6 +116,8 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
| **Claude Code** | `main` 已支持 | 真实编码工作流、CLI-native continuity | 安全 workspace 抽象、A2A delegation、组织边界、共享 control plane |
| **CrewAI** | `main` 已支持 | 角色型 crew 模式清晰 | 持久 workspace 身份、统一策略、共享 Canvas 和 registry |
| **AutoGen** | `main` 已支持 | assistant/tool orchestration | 统一部署、层级协作、共享运维平面 |
| **Hermes 4** | `main` 已支持 | 混合推理、原生工具调用、json_schema 输出(NousResearch/hermes-agent | Option B 上游 hook、A2A 桥接 OpenAI 兼容 API、多 provider 自动派生 |
| **Gemini CLI** | `main` 已支持 | Google Gemini CLI 持续会话 | workspace 生命周期、A2A、层级感知协作、共享运维平面 |
| **OpenClaw** | `main` 已支持 | CLI-native runtime,自有 session 模型 | workspace 生命周期、templates、activity logs、拓扑感知协作 |
| **NemoClaw** | `feat/nemoclaw-t4-docker` 分支 WIP | NVIDIA 方向 runtime 路线 | 计划并入同一抽象层,但当前还不是 `main` 已合并能力 |
@@ -181,9 +183,10 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
## `main` 分支已经具备什么
### Canvas
### Canvasv4
- Next.js 15 + React Flow + Zustand
- **warm-paper 主题系统** —— light / dark / 跟随系统;SSR cookie + nonce'd boot 脚本 + ThemeProvider;终端与代码面板始终保持深色
- drag-to-nest 团队构建
- empty state + onboarding wizard
- template palette
@@ -192,8 +195,9 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
### Platform
- Go/Gin control plane
- workspace CRUD 和 provisioning
- Go 1.25 / Gin control plane80+ HTTP 端点 + Gorilla WebSocket fanout
- workspace CRUD 和 provisioning(可插拔 Provisioner —— 本地 Docker、生产 EC2 + SSM
- **A2A 响应路径已收敛为类型化的判别联合(RFC #2967** —— 冻结 dataclass + 全量 parser100% 单元测试 + 对抗性 fuzz 覆盖
- registry 与 heartbeat
- 浏览器安全的 A2A proxy
- team expansion/collapse
@@ -203,10 +207,10 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
### Runtime
- 统一 `workspace/` 镜像
- adapter 驱动执行
- 统一 `workspace/` 镜像;生产环境采用 thin AMIus-east-2
- adapter 驱动执行,覆盖 **8 个 runtime**Claude Code、Hermes、Gemini CLI、LangGraph、DeepAgents、CrewAI、AutoGen、OpenClaw
- Agent Card 注册
- awareness-backed memory
- awareness-backed memory**Memory v2 由 pgvector 支撑**语义召回
- plugin 挂载共享 rules/skills
- 本地 skills 热加载
- coordinator-only delegation 路径
@@ -220,6 +224,21 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
- runtime tiers
- 终端与文件层面的 workspace 直接排障
### SaaS(由 [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane) 提供)
- 多租户运行在 AWS EC2 + Neon(每租户一个 Postgres branch+ Cloudflare Tunnels(每租户一条隧道,对外不开任何端口)
- WorkOS AuthKit + Stripe Checkout + Customer Portal
- AWS KMS 信封加密(DB / Redis 连接串);AWS Secrets Manager 负责租户 bootstrap
- `tenant_resources` 审计表 + 30 分钟 boot-event-aware reconciler —— 每个 CF / AWS lifecycle 事件都有记录,每 30 分钟比对 claim 与实际状态
### 在 Claude Code 里直接接入(由 [`molecule-mcp-claude-channel`](https://github.com/Molecule-AI/molecule-mcp-claude-channel) 提供)
- 把 Molecule A2A 流量桥接到本地 Claude Code 会话的 MCP 插件
- 订阅一个或多个 workspacepeer 的消息会以 user-turn 出现,回复会经 Molecule A2A 路由出去
- 无需公网隧道、无需公开端点 —— 插件启动时自动把每个 watched workspace 注册成 `delivery_mode=poll`,长轮询 `/activity?since_id=…`
- 多租户友好:单次安装即可同时 watch 跨多个 Molecule 租户的 workspace`MOLECULE_PLATFORM_URLS` 按 workspace 配置)
- 通过标准 marketplace 流程安装:`/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel``/plugin install molecule-channel@molecule-mcp-claude-channel`
## 适合什么团队
Molecule AI 特别适合下面这些场景:
@@ -232,23 +251,29 @@ Molecule AI 特别适合下面这些场景:
## 架构总览
```text
Canvas (Next.js :3000) <--HTTP / WS--> Platform (Go :8080) <---> Postgres + Redis
| |
| +--> Docker provisioner / bundles / templates / secrets
Canvas (Next.js 15, warm-paper :3000) <--HTTP / WS--> Platform (Go 1.25 :8080) <---> Postgres + Redis
| |
| +--> Provisioner: Docker (本地) / EC2 + SSM (生产)
| +--> bundles · templates · secrets · KMS
|
+-------------------- 展示 --------------------> workspaces, teams, tasks, traces, events
+------------------------- 展示 ------------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python image with adapters)
- LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / OpenClaw
- Agent Card + A2A server
- heartbeat + activity + awareness-backed memory
Workspace Runtime (Python ≥3.11,含 adapter 集合的镜像)
- 8 个 adapter: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw
- Agent Card + A2A servertyped-SSOT 响应路径,RFC #2967
- heartbeat + activity + awareness-backed memoryMemory v2 —— pgvector 语义召回)
- skills + plugins + hot reload
SaaS Control Plane (molecule-controlplane,私有)
- 每租户 EC2 + Neon (Postgres branch) + Cloudflare Tunnel
- WorkOS · Stripe · KMS · AWS Secrets Manager
- tenant_resources 审计 + 30 分钟 reconciler
```
## 快速开始
```bash
git clone https://github.com/Molecule-AI/molecule-core.git
git clone https://git.moleculesai.app/molecule-ai/molecule-core.git
cd molecule-core
cp .env.example .env
@@ -296,7 +321,11 @@ npm run dev
## 当前范围说明
当前 `main` 已经包含核心平台、Canvas、memory model、6 个正式 adapter、skill lifecycle主要运维面。**NemoClaw** 这样的相邻 runtime 路线仍然属于分支级工作,只有合并后才会进入正式支持列表,这里会明确区分。
当前 `main` 已经包含核心平台、Canvas v4warm-paper 主题)、Memory v2pgvector 语义召回)、typed-SSOT A2A 响应路径(RFC #2967)、**8 个正式 adapter**Claude Code、Hermes、Gemini CLI、LangGraph、DeepAgents、CrewAI、AutoGen、OpenClaw、skill lifecycle,以及主要运维面。
配套的私有仓库 [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane) 提供 SaaS 层 —— 多租户编排(EC2 + Neon + Cloudflare Tunnels)、KMS 信封加密、WorkOS 鉴权、Stripe 计费,以及 `tenant_resources` 审计表加 30 分钟 reconciler。
**NemoClaw** 这样的相邻 runtime 路线仍然属于分支级工作,只有合并后才会进入正式支持列表,这里会明确区分。
## License
+37 -55
View File
@@ -13,7 +13,6 @@ import { AttachmentPreview } from "./chat/AttachmentPreview";
import { extractFilesFromTask } from "./chat/message-parser";
import { AgentCommsPanel } from "./chat/AgentCommsPanel";
import { appendActivityLine } from "./chat/activityLog";
import { activityRowToMessages, type ActivityRowForHydration } from "./chat/historyHydration";
import { runtimeDisplayName } from "@/lib/runtime-names";
import { ConfirmDialog } from "@/components/ConfirmDialog";
@@ -50,38 +49,12 @@ interface A2AResponse {
};
}
/** Detect activity-log rows that the workspace's own runtime fired
* against itself but were misclassified as canvas-source. The proper
* fix is the X-Workspace-ID header from `self_source_headers()` in
* workspace/platform_auth.py, which makes the platform record
* source_id = workspace_id. But three failure modes still leak a
* self-message into "My Chat":
*
* 1. Historical rows already in the DB with source_id=NULL.
* 2. Workspace containers running pre-fix heartbeat.py / main.py
* (the fix only takes effect after an image rebuild + redeploy).
* 3. Future internal triggers added without the helper.
*
* This client-side filter recognises the heartbeat trigger by its
* exact prefix — the heartbeat assembles
*
* "Delegation results are ready. Review them and take appropriate
* action:\n" + summary_lines + report_instruction
*
* in workspace/heartbeat.py. The prefix is template-fixed so a
* string match is reliable. If the heartbeat copy ever changes,
* update this constant in the same commit.
*
* This is a backstop, not the primary defence — the X-Workspace-ID
* header is. Filtering content is fragile to copy edits, so keep
* the list narrow. */
const INTERNAL_SELF_MESSAGE_PREFIXES = [
"Delegation results are ready. Review them and take appropriate action",
];
function isInternalSelfMessage(text: string): boolean {
return INTERNAL_SELF_MESSAGE_PREFIXES.some((p) => text.startsWith(p));
}
// Internal-self-message filtering moved server-side in RFC #2945
// PR-C/D — the platform's /chat-history endpoint applies the
// IsInternalSelfMessage predicate before returning rows, so the
// client no longer needs the local backstop on the history path.
// The proper fix is still X-Workspace-ID header (source_id=workspace_id);
// the platform-side prefix filter handles the residual cases.
// extractReplyText pulls the agent's text reply out of an A2A response.
// Concatenates ALL text parts (joined with "\n") rather than returning
@@ -134,8 +107,19 @@ const INITIAL_HISTORY_LIMIT = 10;
const OLDER_HISTORY_BATCH = 20;
/**
* Load chat history from the activity_logs database via the platform API.
* Uses source=canvas to only get user-initiated messages (not agent-to-agent).
* Load chat history from the platform's typed /chat-history endpoint.
*
* Server-side rendering of activity_logs rows into ChatMessage shape
* lives in workspace-server/internal/messagestore/postgres_store.go
* (RFC #2945 PR-C/D). The server already applies the canvas-source
* filter, the internal-self-message predicate, the role decision
* (status=error vs agent-error prefix → system), and the v0/v1
* file-shape extraction. Canvas just renders what it receives.
*
* Wire shape (mirrors ChatMessage exactly, no per-row mapping needed):
*
* GET /workspaces/:id/chat-history?limit=N&before_ts=T
* 200 → {"messages": ChatMessage[], "reached_end": boolean}
*
* Pagination:
* - Pass `limit` to bound the page size (newest-first from server).
@@ -143,10 +127,10 @@ const OLDER_HISTORY_BATCH = 20;
* timestamp. Combined with limit, this yields the next-older page
* when scrolling backward through history.
*
* `reachedEnd` is true when the server returned fewer rows than asked
* for — caller uses this to disable further older-batch fetches.
* (Counts row-level returns, not chat-bubble count: each row may
* produce 1-2 bubbles.)
* `reachedEnd` is propagated from the server. The server computes it
* by comparing rowCount vs limit so a partial last page is correctly
* detected even when the row→bubble fan-out is non-1:1 (each row
* produces 1-2 bubbles).
*/
async function loadMessagesFromDB(
workspaceId: string,
@@ -154,25 +138,23 @@ async function loadMessagesFromDB(
beforeTs?: string,
): Promise<{ messages: ChatMessage[]; error: string | null; reachedEnd: boolean }> {
try {
const params = new URLSearchParams({
type: "a2a_receive",
source: "canvas",
limit: String(limit),
});
const params = new URLSearchParams({ limit: String(limit) });
if (beforeTs) params.set("before_ts", beforeTs);
const activities = await api.get<ActivityRowForHydration[]>(
`/workspaces/${workspaceId}/activity?${params.toString()}`,
const resp = await api.get<{ messages: ChatMessage[]; reached_end: boolean }>(
`/workspaces/${workspaceId}/chat-history?${params.toString()}`,
);
const messages: ChatMessage[] = [];
// Activities are newest-first, reverse for chronological order.
// Per-row mapping lives in chat/historyHydration.ts so it can be
// unit-tested without spinning up the full ChatTab component
// (regression cover for the timestamp-collapse bug).
for (const a of [...activities].reverse()) {
messages.push(...activityRowToMessages(a, isInternalSelfMessage));
}
return { messages, error: null, reachedEnd: activities.length < limit };
// Server emits oldest-first within the page (RFC #2945 PR-C-2
// post-fix: server reverses row-aware before returning so the
// wire is display-ready). Canvas appends/prepends without
// reordering — this avoids the pair-flip bug a naive flat
// reverse causes when each row produces a (user, agent) pair
// with the same timestamp.
return {
messages: resp.messages ?? [],
error: null,
reachedEnd: resp.reached_end,
};
} catch (err) {
return {
messages: [],
+66 -45
View File
@@ -21,20 +21,39 @@ interface Props {
// --- Agent Card Section ---
function AgentCardSection({ workspaceId }: { workspaceId: string }) {
const [card, setCard] = useState<Record<string, unknown> | null>(null);
const [loading, setLoading] = useState(true);
// Initial card value comes from the canvas store — node.data.agentCard
// is hydrated by the platform stream when the workspace appears in the
// graph, so reading it here avoids a duplicate `GET /workspaces/${id}`
// (the parent ConfigTab.loadConfig already fetches workspace metadata,
// and refetching here adds a serialised RTT to the panel-open path —
// contributed to the ~20s detail-panel load reported in core#11).
// Local state still tracks the edited/saved value so the editor flow
// is unchanged.
const storeCard = useCanvasStore((s) => {
// Defensive against test mocks that omit `nodes` (some test files
// stub the store with a minimal shape). In production `nodes` is
// always an array — empty or not — so the optional chaining only
// matters for the test path.
const node = s.nodes?.find?.((n) => n.id === workspaceId);
return (node?.data.agentCard as
| Record<string, unknown>
| null
| undefined) ?? null;
});
const [card, setCard] = useState<Record<string, unknown> | null>(storeCard);
const [editing, setEditing] = useState(false);
const [draft, setDraft] = useState("");
const [saving, setSaving] = useState(false);
const [error, setError] = useState<string | null>(null);
const [success, setSuccess] = useState(false);
// If the store updates while this section is mounted (another tab
// pushed an update via the platform event stream), reflect that —
// unless the user is mid-edit, in which case we don't clobber their
// unsaved draft.
useEffect(() => {
api.get<Record<string, unknown>>(`/workspaces/${workspaceId}`)
.then((ws) => setCard((ws.agent_card as Record<string, unknown>) || null))
.catch(() => {})
.finally(() => setLoading(false));
}, [workspaceId]);
if (!editing) setCard(storeCard);
}, [storeCard, editing]);
const handleSave = async () => {
setError(null);
@@ -53,9 +72,7 @@ function AgentCardSection({ workspaceId }: { workspaceId: string }) {
return (
<Section title="Agent Card" defaultOpen={false}>
{loading ? (
<div className="text-[10px] text-ink-soft">Loading...</div>
) : editing ? (
{editing ? (
<div className="space-y-2">
<textarea
aria-label="Agent card JSON editor"
@@ -221,47 +238,51 @@ export function ConfigTab({ workspaceId }: Props) {
setLoading(true);
setError(null);
// ALWAYS load workspace metadata first (runtime + model). These are the
// source of truth regardless of whether the runtime uses our config.yaml
// template. Without this the form falls back to empty/default values on
// a hermes workspace (which doesn't use our template), creating the
// appearance that the saved runtime is unset — and worse, clicking Save
// would silently flip `runtime` from `hermes` back to the dropdown
// default `LangGraph`. See GH #1894.
let wsMetadataRuntime = "";
let wsMetadataModel = "";
let wsMetadataTier: number | null = null;
try {
const ws = await api.get<{ runtime?: string; tier?: number }>(`/workspaces/${workspaceId}`);
wsMetadataRuntime = (ws.runtime || "").trim();
if (typeof ws.tier === "number") wsMetadataTier = ws.tier;
} catch { /* fall back to config.yaml */ }
try {
const m = await api.get<{ model?: string }>(`/workspaces/${workspaceId}/model`);
wsMetadataModel = (m.model || "").trim();
} catch { /* non-fatal */ }
// Load workspace metadata (runtime + model + provider) in parallel.
// These are independent GETs against three workspace-server endpoints
// and used to be awaited serially — for SaaS workspaces each call
// round-trips through an EIC SSH tunnel, so the previous serial
// pattern stacked 3-5s of tunnel-setup latency per call (core#11).
// Promise.all overlaps them; the per-call cost stays the same but
// wall time drops to max() instead of sum().
//
// Each leg has its own .catch handler that yields a sentinel value,
// matching the previous semantics:
// - /workspaces/${id}: required source-of-truth for runtime+tier;
// fall back to YAML if the GET fails (rare, network-class only).
// - /workspaces/${id}/model: non-fatal; empty model lets the form
// fall through to YAML runtime_config.model.
// - /workspaces/${id}/provider: non-fatal; old workspace-servers
// return 404, in which case provider="" and Save skips the PUT.
//
// See GH #1894 for the workspace-row-as-source-of-truth rationale
// that motivated splitting from a single config.yaml read.
const [wsRes, modelRes, providerRes] = await Promise.all([
api.get<{ runtime?: string; tier?: number }>(`/workspaces/${workspaceId}`)
.catch(() => ({} as { runtime?: string; tier?: number })),
api.get<{ model?: string }>(`/workspaces/${workspaceId}/model`)
.catch(() => ({} as { model?: string })),
api.get<{ provider?: string }>(`/workspaces/${workspaceId}/provider`)
.catch(() => null),
]);
const wsMetadataRuntime = (wsRes.runtime || "").trim();
const wsMetadataModel = (modelRes.model || "").trim();
const wsMetadataTier: number | null =
typeof wsRes.tier === "number" ? wsRes.tier : null;
if (providerRes !== null) {
const loadedProvider = (providerRes.provider || "").trim();
setProvider(loadedProvider);
setOriginalProvider(loadedProvider);
} else {
setProvider("");
setOriginalProvider("");
}
// originalModel is set further down once the YAML has been parsed —
// we want it to reflect what the form ACTUALLY rendered, which may
// be the YAML's runtime_config.model fallback when MODEL_PROVIDER
// is empty. Setting it here from wsMetadataModel alone would be
// wrong for hermes/pre-#240 workspaces.
// Load explicit provider override (Option B PR-5). Endpoint returns
// {provider: "", source: "default"} when no override is set, so the
// empty string is the legitimate "auto-derive" signal — don't treat
// it as a load error. Non-fatal: an older workspace-server that
// predates PR-2 returns 404 here; the form falls back to "" and
// Save just won't PUT the provider field.
try {
const p = await api.get<{ provider?: string }>(`/workspaces/${workspaceId}/provider`);
const loadedProvider = (p.provider || "").trim();
setProvider(loadedProvider);
setOriginalProvider(loadedProvider);
} catch {
setProvider("");
setOriginalProvider("");
}
// Skip the config.yaml fetch entirely for runtimes that manage
// their own config (external, hermes, etc.) — they don't have a
// platform-side template, so the GET would 404. The catch block
@@ -1,13 +1,11 @@
// @vitest-environment jsdom
//
// Pins the lazy-loading chat-history pagination added 2026-05-05.
// Pins the lazy-loading chat-history pagination.
//
// Pre-fix: ChatTab fetched the newest 50 messages on every mount and
// scrolled to bottom, paying full DOM cost up-front even when the user
// only wanted to read the last few bubbles. Post-fix: initial load is
// bounded to 10 newest, and an IntersectionObserver on a top sentinel
// triggers loadOlder() (batch of 20 with `before_ts` cursor) when the
// user scrolls up.
// PR-C-2 (RFC #2945): canvas was migrated from /activity?type=a2a_receive
// to /chat-history. Server now returns typed ChatMessage[] in
// display-ready oldest-first order. These tests guard the canvas-side
// pagination invariants against the new endpoint surface.
//
// Pinned branches:
// 1. Initial fetch carries `limit=10` and NO before_ts (newest-first
@@ -20,11 +18,10 @@
// asserting the rendered bubble count matches the full page).
// 4. The retry button after a failed initial load uses the same
// INITIAL_HISTORY_LIMIT (10), not the legacy 50.
//
// IntersectionObserver / scroll-anchor restoration is exercised by the
// E2E synth-canary suite — pinning it in jsdom would require mocking
// the observer and faking layout, which is brittler than trusting a
// live-DOM canary against the staging tenant.
// 5. before_ts cursor is the OLDEST timestamp from the current page,
// passed verbatim to walk backward.
// 6. Inflight guard rejects duplicate IO triggers while a loadOlder
// fetch is in flight.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
@@ -33,24 +30,31 @@ import React from "react";
afterEach(cleanup);
// Both ChatTab sub-panels (MyChat + AgentComms) mount simultaneously so
// keyboard tab order and aria-controls land on a real DOM. Both fire
// /activity GETs on mount: MyChat's hits `type=a2a_receive&source=canvas`,
// AgentComms's hits a different filter. Route the mock by URL so each
// gets a sensible default and only MyChat's call is what the assertions
// scrutinise.
const myChatActivityCalls: string[] = [];
let myChatNextResponse: { ok: true; rows: unknown[] } | { ok: false; err: Error } = {
ok: true,
rows: [],
};
// keyboard tab order and aria-controls land on a real DOM. MyChat's
// loadMessagesFromDB hits /chat-history; AgentComms's polling hits a
// different URL. Route the mock by URL so each gets a sensible default
// and only MyChat's calls land in the assertion array.
const myChatHistoryCalls: string[] = [];
let myChatNextResponse:
| { ok: true; messages: unknown[]; reachedEnd?: boolean }
| { ok: false; err: Error } = { ok: true, messages: [] };
const apiGet = vi.fn((path: string): Promise<unknown> => {
if (path.includes("type=a2a_receive") && path.includes("source=canvas")) {
myChatActivityCalls.push(path);
if (myChatNextResponse.ok) return Promise.resolve(myChatNextResponse.rows);
if (path.includes("/chat-history")) {
myChatHistoryCalls.push(path);
if (myChatNextResponse.ok) {
const reached_end =
myChatNextResponse.reachedEnd !== undefined
? myChatNextResponse.reachedEnd
: myChatNextResponse.messages.length < 10;
return Promise.resolve({
messages: myChatNextResponse.messages,
reached_end,
});
}
return Promise.reject(myChatNextResponse.err);
}
// AgentComms / heartbeat / anything else — empty array is a safe
// default that won't blow up the corresponding component's .then().
// AgentComms / heartbeat / anything else — empty array safe default.
return Promise.resolve([]);
});
const apiPost = vi.fn();
@@ -84,8 +88,8 @@ const ioInstances: IOInstance[] = [];
beforeEach(() => {
apiGet.mockClear();
apiPost.mockReset();
myChatActivityCalls.length = 0;
myChatNextResponse = { ok: true, rows: [] };
myChatHistoryCalls.length = 0;
myChatNextResponse = { ok: true, messages: [] };
ioInstances.length = 0;
class FakeIO {
private inst: IOInstance;
@@ -101,20 +105,12 @@ beforeEach(() => {
this.inst.disconnected = true;
}
}
// Install on every reachable global — different bundlers / module
// graphs can resolve `IntersectionObserver` via `window`, `globalThis`,
// or the bare global. Without all three, jsdom's own (pre-existing)
// stub silently wins and ioInstances stays empty.
(window as unknown as { IntersectionObserver: unknown }).IntersectionObserver = FakeIO;
(globalThis as unknown as { IntersectionObserver: unknown }).IntersectionObserver = FakeIO;
// jsdom doesn't implement scrollIntoView; ChatTab calls it after every
// messages update.
Element.prototype.scrollIntoView = vi.fn();
});
function triggerIntersection(instanceIdx = -1) {
// -1 → the latest observer (the live one). Tests targeting an old
// (disconnected) instance pass a positive index.
const inst = ioInstances.at(instanceIdx);
if (!inst) throw new Error(`no IO instance at ${instanceIdx}`);
inst.callback(
@@ -125,25 +121,30 @@ function triggerIntersection(instanceIdx = -1) {
import { ChatTab } from "../ChatTab";
function makeActivityRow(seq: number): Record<string, unknown> {
// Zero-pad seq into the minute slot so "seq=10" doesn't produce
// the invalid timestamp "00:010:00Z" (caught by the loadOlder URL
// assertion below — first version of the helper used `0${seq}` and
// the test failed on `before_ts` having an extra digit).
// makeMessagePair returns a (user, agent) pair sharing a timestamp,
// matching the wire shape /chat-history emits per activity_logs row.
// Server-side reverseRowChunks ensures the wire is oldest-first across
// rows but [user, agent] within each row.
function makeMessagePair(seq: number): unknown[] {
// Zero-pad seq into the minute slot so seq=10 produces a valid
// timestamp (00:10:00Z, not 00:010:00Z).
const mm = String(seq).padStart(2, "0");
return {
activity_type: "a2a_receive",
status: "ok",
created_at: `2026-05-05T00:${mm}:00Z`,
request_body: { params: { message: { parts: [{ kind: "text", text: `user msg ${seq}` }] } } },
response_body: { result: `agent reply ${seq}` },
};
const ts = `2026-05-05T00:${mm}:00Z`;
return [
{ id: `u-${seq}`, role: "user", content: `user msg ${seq}`, timestamp: ts },
{ id: `a-${seq}`, role: "agent", content: `agent reply ${seq}`, timestamp: ts },
];
}
// Server returns newest-first; the helper builds a server-shape page
// so the order in the rendered messages array matches production.
function newestFirstPage(start: number, count: number): unknown[] {
return Array.from({ length: count }, (_, i) => makeActivityRow(start + count - 1 - i));
// pageOldestFirst builds a wire-shape page (oldest-first within page)
// of `count` row-pairs starting at seq=`start`. Mirrors the server's
// post-reverseRowChunks emission order.
function pageOldestFirst(start: number, count: number): unknown[] {
const out: unknown[] = [];
for (let i = 0; i < count; i++) {
out.push(...makeMessagePair(start + i));
}
return out;
}
const minimalData = {
@@ -153,28 +154,30 @@ const minimalData = {
} as unknown as Parameters<typeof ChatTab>[0]["data"];
describe("ChatTab lazy history pagination", () => {
it("initial fetch carries limit=10 (not the legacy 50)", async () => {
myChatNextResponse = { ok: true, rows: [makeActivityRow(1)] };
it("initial fetch carries limit=10 (not the legacy 50) and hits /chat-history", async () => {
myChatNextResponse = { ok: true, messages: makeMessagePair(1) };
render(<ChatTab workspaceId="ws-1" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
const url = myChatActivityCalls[0];
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
const url = myChatHistoryCalls[0];
expect(url).toContain("/chat-history");
expect(url).toContain("limit=10");
expect(url).not.toContain("limit=50");
// before_ts should NOT be set on the initial fetch — that's the
// newest-first slice the user lands on.
expect(url).not.toContain("before_ts");
// /chat-history filters source-canvas server-side; client should
// NOT pass type/source params (they belonged to /activity).
expect(url).not.toContain("type=a2a_receive");
expect(url).not.toContain("source=canvas");
});
it("hides the top sentinel when initial fetch returns fewer than the limit", async () => {
// 3 < 10 → server says "no more older history exists"; sentinel
// should NOT mount and the "Loading older messages…" line should
// never appear (it can't, since the sentinel is what triggers it).
myChatNextResponse = {
ok: true,
rows: [makeActivityRow(1), makeActivityRow(2), makeActivityRow(3)],
};
// never appear.
myChatNextResponse = { ok: true, messages: pageOldestFirst(1, 3) };
render(<ChatTab workspaceId="ws-2" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
await waitFor(() => {
expect(screen.queryByText(/Loading chat history/i)).toBeNull();
});
@@ -182,15 +185,15 @@ describe("ChatTab lazy history pagination", () => {
});
it("renders all messages when initial fetch returns exactly the limit", async () => {
// 10 == limit → server might have more older rows; sentinel SHOULD
// mount so the IO observer can fire loadOlder() on scroll-up. We
// verify by checking the rendered bubble count — if hasMore stayed
// true the sentinel render path doesn't crash and all 10 rows
// produced their pair of bubbles.
const fullPage = Array.from({ length: 10 }, (_, i) => makeActivityRow(i + 1));
myChatNextResponse = { ok: true, rows: fullPage };
// limit=10 row-pairs → 20 ChatMessages. reachedEnd should be FALSE
// so the sentinel mounts. Verified by bubble counts.
myChatNextResponse = {
ok: true,
messages: pageOldestFirst(1, 10),
reachedEnd: false,
};
render(<ChatTab workspaceId="ws-3" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
await waitFor(() => {
expect(screen.queryByText(/Loading chat history/i)).toBeNull();
});
@@ -202,54 +205,67 @@ describe("ChatTab lazy history pagination", () => {
myChatNextResponse = { ok: false, err: new Error("network down") };
render(<ChatTab workspaceId="ws-4" data={minimalData} />);
const retry = await screen.findByText(/Retry/);
myChatNextResponse = { ok: true, rows: [makeActivityRow(1)] };
myChatNextResponse = { ok: true, messages: makeMessagePair(1) };
fireEvent.click(retry);
await waitFor(() => expect(myChatActivityCalls.length).toBe(2));
const retryUrl = myChatActivityCalls[1];
await waitFor(() => expect(myChatHistoryCalls.length).toBe(2));
const retryUrl = myChatHistoryCalls[1];
expect(retryUrl).toContain("/chat-history");
expect(retryUrl).toContain("limit=10");
expect(retryUrl).not.toContain("limit=50");
});
it("loadOlder fetches limit=20 with before_ts=oldest.timestamp", async () => {
// Initial page = 10 rows in newest-first order (seq 10..1). After
// the component reverses to oldest-first for display, messages[0]
// is built from seq=1 — the oldest — and its timestamp is what
// before_ts should carry.
myChatNextResponse = { ok: true, rows: newestFirstPage(1, 10) };
// Initial page = 10 row-pairs in oldest-first order (seq 1..10).
// The oldest (and so the cursor for loadOlder) is seq=1's
// timestamp 2026-05-05T00:01:00Z.
myChatNextResponse = {
ok: true,
messages: pageOldestFirst(1, 10),
reachedEnd: false,
};
render(<ChatTab workspaceId="ws-load-older" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
await waitFor(() => expect(ioInstances.length).toBeGreaterThan(0));
// Stage the older-batch response, then fire the IO callback.
myChatNextResponse = { ok: true, rows: newestFirstPage(0, 1) };
// Stage older-batch response, then fire IO callback.
myChatNextResponse = {
ok: true,
messages: pageOldestFirst(0, 1),
reachedEnd: true,
};
triggerIntersection();
await waitFor(() => expect(myChatActivityCalls.length).toBe(2));
const olderUrl = myChatActivityCalls[1];
await waitFor(() => expect(myChatHistoryCalls.length).toBe(2));
const olderUrl = myChatHistoryCalls[1];
expect(olderUrl).toContain("/chat-history");
expect(olderUrl).toContain("limit=20");
expect(olderUrl).toContain("before_ts=");
expect(decodeURIComponent(olderUrl)).toContain("before_ts=2026-05-05T00:01:00Z");
});
it("inflight guard rejects a second IO trigger while first loadOlder is in flight", async () => {
myChatNextResponse = { ok: true, rows: newestFirstPage(1, 10) };
myChatNextResponse = {
ok: true,
messages: pageOldestFirst(1, 10),
reachedEnd: false,
};
render(<ChatTab workspaceId="ws-inflight" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
await waitFor(() => expect(ioInstances.length).toBeGreaterThan(0));
// Hold the next loadOlder fetch open with a manual deferred so we
// can fire the second trigger while the first is in-flight.
let release!: (rows: unknown[]) => void;
const deferred = new Promise<unknown[]>((res) => {
let release!: (resp: unknown) => void;
const deferred = new Promise<unknown>((res) => {
release = res;
});
apiGet.mockImplementationOnce((path: string): Promise<unknown> => {
myChatActivityCalls.push(path);
myChatHistoryCalls.push(path);
return deferred;
});
triggerIntersection(); // start loadOlder #1
await waitFor(() => expect(myChatActivityCalls.length).toBe(2));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(2));
// Second IO trigger lands while #1 is still pending.
triggerIntersection();
@@ -258,79 +274,62 @@ describe("ChatTab lazy history pagination", () => {
// Without the inflight guard, each of these would have started a
// new fetch. With the guard, none of them do — call count stays 2.
await new Promise((r) => setTimeout(r, 10));
expect(myChatActivityCalls.length).toBe(2);
expect(myChatHistoryCalls.length).toBe(2);
// Release the first fetch. Inflight clears in the finally block;
// a subsequent IO trigger is permitted again (verified by checking
// we can fire a follow-up after release without hanging the test).
release([]);
await waitFor(() => expect(myChatActivityCalls.length).toBe(2));
// Release the first fetch with a valid wire response shape.
release({ messages: [], reached_end: true });
await waitFor(() => expect(myChatHistoryCalls.length).toBe(2));
});
it("empty older response clears the scroll anchor and unmounts the sentinel", async () => {
// The bug we're pinning: if loadOlder returns 0 rows, the
// scrollAnchorRef must be cleared so the next paint doesn't try to
// restore against a no-op prepend (which would fight the natural
// bottom-pin for any subsequent live message). hasMore flipping to
// false is the same flag-flip path; sentinel disappearing is the
// observable proxy.
myChatNextResponse = { ok: true, rows: newestFirstPage(1, 10) };
myChatNextResponse = {
ok: true,
messages: pageOldestFirst(1, 10),
reachedEnd: false,
};
render(<ChatTab workspaceId="ws-anchor" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
await waitFor(() => expect(ioInstances.length).toBeGreaterThan(0));
myChatNextResponse = { ok: true, rows: [] }; // empty → reachedEnd
myChatNextResponse = {
ok: true,
messages: [],
reachedEnd: true,
};
triggerIntersection();
await waitFor(() => expect(myChatActivityCalls.length).toBe(2));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(2));
// After reachedEnd the sentinel unmounts (hasMore=false). We can't
// peek scrollAnchorRef directly, but we can assert the consequence:
// scrollIntoView (the bottom-pin for live appends) is not blocked
// by a stale anchor. Trigger a re-render via an unrelated state
// change… in practice the safest assertion here is that the
// sentinel disappeared (proving the empty response propagated to
// hasMore correctly, which is the same flag-flip path as anchor
// clearing).
await waitFor(() => {
expect(screen.queryByText(/Loading older messages/i)).toBeNull();
});
});
it("IntersectionObserver does not churn when older messages prepend", async () => {
// Whole-PR perf invariant: prepending older history (the load-bearing
// user gesture) must NOT tear down + re-arm the IO observer.
// Triggering loadOlder is the cleanest way to drive a messages
// mutation from inside the test, since live agent push goes through
// a Zustand store that's harder to drive reliably from jsdom.
//
// Pre-fix, loadOlder depended on `messages`, so every prepend
// recreated loadOlder → re-ran the IO effect → new observer. Each
// call to triggerIntersection() produced a fresh disconnected
// observer + a new live one. Post-fix, the observer survives.
myChatNextResponse = { ok: true, rows: newestFirstPage(1, 10) };
myChatNextResponse = {
ok: true,
messages: pageOldestFirst(1, 10),
reachedEnd: false,
};
render(<ChatTab workspaceId="ws-stable-io" data={minimalData} />);
await waitFor(() => expect(myChatActivityCalls.length).toBe(1));
await waitFor(() => expect(myChatHistoryCalls.length).toBe(1));
await waitFor(() => expect(ioInstances.length).toBeGreaterThan(0));
// Snapshot the observer instance after first paint stabilises.
const observerBefore = ioInstances.at(-1);
expect(observerBefore).toBeDefined();
expect(observerBefore!.disconnected).toBe(false);
// Trigger three older-batch prepends. Each batch returns the full
// OLDER_HISTORY_BATCH (20 rows) so reachedEnd stays false and the
// sentinel keeps mounting. Pre-fix, each prepend mutated `messages`
// → recreated loadOlder → re-ran the IO effect → new observer.
// OLDER_HISTORY_BATCH (20 row-pairs = 40 messages) so reachedEnd
// stays false and the sentinel keeps mounting.
for (let batch = 0; batch < 3; batch++) {
myChatNextResponse = {
ok: true,
rows: newestFirstPage(-(batch + 1) * 20, 20),
messages: pageOldestFirst(-(batch + 1) * 20, 20),
reachedEnd: false,
};
const callsBefore = myChatActivityCalls.length;
const callsBefore = myChatHistoryCalls.length;
triggerIntersection();
await waitFor(() =>
expect(myChatActivityCalls.length).toBe(callsBefore + 1),
);
await waitFor(() => expect(myChatHistoryCalls.length).toBe(callsBefore + 1));
}
// The original observer is still the live one — no churn.
+2 -2
View File
@@ -212,8 +212,8 @@ services:
# docker compose pull canvas && docker compose up -d canvas
# First-time local setup or testing unreleased changes — build from source:
# docker compose build canvas && docker compose up -d canvas
# Note: GHCR images are private — `docker login ghcr.io` required before pull.
image: ghcr.io/molecule-ai/canvas:latest
# Note: ECR images require AWS auth — `aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 153263036946.dkr.ecr.us-east-2.amazonaws.com` before pull.
image: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/canvas:latest
build:
context: ./canvas
dockerfile: Dockerfile
+1 -1
View File
@@ -4,7 +4,7 @@ How a workspace-server code change reaches the prod tenant fleet — and how to
> **⚠️ State note (2026-04-22):** this doc describes the **intended design**. As of this write, the canary fleet described below is **not actually running** — no canary tenants are provisioned, `CANARY_TENANT_URLS` / `CANARY_ADMIN_TOKENS` / `CANARY_CP_SHARED_SECRET` are empty in repo secrets, and `canary-verify.yml` fails every run.
>
> Current merges gate on manual `promote-latest.yml` dispatches, not canary. See [molecule-controlplane/docs/canary-tenants.md](https://github.com/Molecule-AI/molecule-controlplane/blob/main/docs/canary-tenants.md) for the Phase 1 code work that's already shipped + the Phase 2 plan for actually standing up the fleet + a "should we even do this now?" decision framework.
> Current merges gate on manual `promote-latest.yml` dispatches, not canary. See [molecule-controlplane/docs/canary-tenants.md](https://git.moleculesai.app/molecule-ai/molecule-controlplane/src/branch/main/docs/canary-tenants.md) for the Phase 1 code work that's already shipped + the Phase 2 plan for actually standing up the fleet + a "should we even do this now?" decision framework.
>
> **Account-specific identifiers (AWS account ID, IAM role name) referenced below in the original design have been redacted from this public doc.** The actual values — if they exist — are in `Molecule-AI/internal/runbooks/canary-fleet.md`. If you're implementing Phase 2, start there.
>
+6 -6
View File
@@ -1,7 +1,7 @@
# Molecule AI — Comprehensive Technical Documentation
> Definitive technical reference for the Molecule AI Agent Team platform.
> Based on a full non-invasive scan of the [molecule-monorepo](https://github.com/Molecule-AI/molecule-monorepo) repository.
> Based on a full non-invasive scan of the [molecule-monorepo](https://git.moleculesai.app/molecule-ai/molecule-monorepo) repository.
---
@@ -1149,11 +1149,11 @@ Molecule AI's workspace abstraction is **runtime-agnostic by design**. A workspa
## Links
- **GitHub**: https://github.com/Molecule-AI/molecule-monorepo
- **Architecture Docs**: https://github.com/Molecule-AI/molecule-monorepo/tree/main/docs/architecture
- **API Protocol**: https://github.com/Molecule-AI/molecule-monorepo/tree/main/docs/api-protocol
- **Agent Runtime**: https://github.com/Molecule-AI/molecule-monorepo/tree/main/docs/agent-runtime
- **Product Docs**: https://github.com/Molecule-AI/molecule-monorepo/tree/main/docs/product
- **GitHub**: https://git.moleculesai.app/molecule-ai/molecule-monorepo
- **Architecture Docs**: https://git.moleculesai.app/molecule-ai/molecule-monorepo/src/branch/main/docs/architecture
- **API Protocol**: https://git.moleculesai.app/molecule-ai/molecule-monorepo/src/branch/main/docs/api-protocol
- **Agent Runtime**: https://git.moleculesai.app/molecule-ai/molecule-monorepo/src/branch/main/docs/agent-runtime
- **Product Docs**: https://git.moleculesai.app/molecule-ai/molecule-monorepo/src/branch/main/docs/product
---
+2 -2
View File
@@ -79,7 +79,7 @@ For SOC2 / ISO 27001 / customer security questionnaires:
## Pointers
- KMS envelope code: [`molecule-controlplane/internal/crypto/kms.go`](https://github.com/Molecule-AI/molecule-controlplane/blob/main/internal/crypto/kms.go)
- Static-key fallback: [`molecule-controlplane/internal/crypto/aes.go`](https://github.com/Molecule-AI/molecule-controlplane/blob/main/internal/crypto/aes.go)
- KMS envelope code: [`molecule-controlplane/internal/crypto/kms.go`](https://git.moleculesai.app/molecule-ai/molecule-controlplane/src/branch/main/internal/crypto/kms.go)
- Static-key fallback: [`molecule-controlplane/internal/crypto/aes.go`](https://git.moleculesai.app/molecule-ai/molecule-controlplane/src/branch/main/internal/crypto/aes.go)
- Tenant secrets handler: [`workspace-server/internal/crypto/aes.go`](../../workspace-server/internal/crypto/aes.go)
- Tenant secrets schema: [database-schema.md](./database-schema.md#workspace_secrets)
+28
View File
@@ -0,0 +1,28 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64">
<style>
.bg { fill: #0a1120; }
.accent { fill: #7fe8d6; }
.accent-stroke { stroke: #7fe8d6; }
@media (prefers-color-scheme: light) {
.bg { fill: #f5f7fa; }
.accent { fill: #1a8a72; }
.accent-stroke { stroke: #1a8a72; }
}
</style>
<rect class="bg" width="64" height="64" rx="14"/>
<g class="accent-stroke" stroke-width="2.4" stroke-linecap="round" fill="none">
<line x1="32" y1="32" x2="12" y2="14"/>
<line x1="32" y1="32" x2="52" y2="18"/>
<line x1="32" y1="32" x2="10" y2="40"/>
<line x1="32" y1="32" x2="54" y2="44"/>
<line x1="32" y1="32" x2="32" y2="56"/>
</g>
<g class="accent">
<circle cx="32" cy="32" r="6.5"/>
<circle cx="12" cy="14" r="3.5"/>
<circle cx="52" cy="18" r="3.5"/>
<circle cx="10" cy="40" r="3.5"/>
<circle cx="54" cy="44" r="3.5"/>
<circle cx="32" cy="56" r="3.5"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 957 B

+17
View File
@@ -0,0 +1,17 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64" role="img" aria-label="Molecule AI">
<g stroke="#7fe8d6" stroke-width="2.6" stroke-linecap="round" fill="none">
<line x1="32" y1="32" x2="12" y2="14"/>
<line x1="32" y1="32" x2="52" y2="18"/>
<line x1="32" y1="32" x2="10" y2="40"/>
<line x1="32" y1="32" x2="54" y2="44"/>
<line x1="32" y1="32" x2="32" y2="56"/>
</g>
<g fill="#7fe8d6">
<circle cx="32" cy="32" r="7"/>
<circle cx="12" cy="14" r="3.6"/>
<circle cx="52" cy="18" r="3.6"/>
<circle cx="10" cy="40" r="3.6"/>
<circle cx="54" cy="44" r="3.6"/>
<circle cx="32" cy="56" r="3.6"/>
</g>
</svg>

After

Width:  |  Height:  |  Size: 662 B

@@ -299,8 +299,8 @@ Or use the Canvas UI: Workspace → Config → MCP Servers → Add browser MCP s
**Try it free** — Molecule AI is open source and self-hostable. Get a workspace running in under 5 minutes.
→ [Get started on GitHub →](https://github.com/Molecule-AI/molecule-core)
→ [Get started on GitHub →](https://git.moleculesai.app/molecule-ai/molecule-core)
---
*Have a browser automation use case you want to see covered? Open a discussion on [GitHub Discussions](https://github.com/Molecule-AI/molecule-core/discussions) — or file an issue with the `enhancement` label.*
*Have a browser automation use case you want to see covered? File an issue with the `enhancement` label on the [molecule-core issue tracker](https://git.moleculesai.app/molecule-ai/molecule-core/issues).*
@@ -148,7 +148,7 @@ Then follow the [quick-start guide](/docs/guides/remote-workspaces.md).
Or run the annotated example directly:
```bash
git clone https://github.com/Molecule-AI/molecule-sdk-python
git clone https://git.moleculesai.app/molecule-ai/molecule-sdk-python
cd molecule-sdk-python/examples/remote-agent
# Create workspace with runtime:external, grab the ID, then:
WORKSPACE_ID=<your-id> PLATFORM_URL=https://acme.moleculesai.app python3 run.py
@@ -160,6 +160,6 @@ The agent appears on the canvas within seconds.
→ [Remote Workspaces Guide →](/docs/guides/remote-workspaces.md)
→ [External Agent Registration Reference →](/docs/guides/external-agent-registration.md)
→ [molecule-sdk-python →](https://github.com/Molecule-AI/molecule-sdk-python)
→ [molecule-sdk-python →](https://git.moleculesai.app/molecule-ai/molecule-sdk-python)
*Phase 30 shipped in PRs #1075#1083 and #1085#1100 on `molecule-core`.*
@@ -133,4 +133,4 @@ With protocol-native A2A, you get:
Molecule AI's external agent registration is production-ready. Documentation is live at [External Agent Registration Guide](https://docs.molecule.ai/docs/guides/external-agent-registration). The npm package for the MCP server is available at [`@molecule-ai/mcp-server`](https://www.npmjs.com/package/@molecule-ai/mcp-server).
Read the full [A2A v1.0 protocol spec](https://github.com/Molecule-AI/molecule-core/blob/main/docs/api-protocol/a2a-protocol.md) on GitHub.
Read the full [A2A v1.0 protocol spec](https://git.moleculesai.app/molecule-ai/molecule-core/src/branch/main/docs/api-protocol/a2a-protocol.md) on GitHub.
@@ -45,7 +45,7 @@ canonicalUrl: "https://docs.molecule.ai/blog/remote-workspaces"
" proficiencyLevel": "Expert",
"genre": ["technical documentation", "product announcement"],
"sameAs": [
"https://github.com/Molecule-AI/molecule-core",
"https://git.moleculesai.app/molecule-ai/molecule-core",
"https://molecule.ai"
]
}
@@ -270,7 +270,7 @@ Configure it in your project's `.mcp.json` and any AI agent (Claude Code, Cursor
→ [External Agent Registration Guide](/docs/guides/external-agent-registration) — full step-by-step with Python and Node.js reference implementations
→ [GitHub: molecule-core](https://github.com/Molecule-AI/molecule-core) — source and issues
→ [GitHub: molecule-core](https://git.moleculesai.app/molecule-ai/molecule-core) — source and issues
→ [Phase 30 Launch Thread on X](https://x.com) — follow for updates
@@ -170,4 +170,4 @@ The `staging` branch is now on `a2a-sdk` 1.0.0. The `main` branch still carries
If you're running `a2a-sdk` 0.3.x and planning the 1.0.0 migration, this post is the reference. The four breaking changes are well-contained, the migration is a single PR, and the eight smoke scenarios above will tell you whether the upgrade is clean before you merge.
Questions? The [A2A protocol spec](https://github.com/google-a2a/a2a-specification) is the authoritative source. For Molecule AI's production A2A implementation, see [External Agent Registration](https://docs.molecule.ai/docs/guides/external-agent-registration) or open an issue in the [molecule-core](https://github.com/Molecule-AI/molecule-core) repo.
Questions? The [A2A protocol spec](https://github.com/google-a2a/a2a-specification) is the authoritative source. For Molecule AI's production A2A implementation, see [External Agent Registration](https://docs.molecule.ai/docs/guides/external-agent-registration) or open an issue in the [molecule-core](https://git.moleculesai.app/molecule-ai/molecule-core) repo.
+1 -1
View File
@@ -215,7 +215,7 @@ Push mode (this guide) works today but requires an inbound-reachable URL — whi
Your agent makes only outbound HTTPS calls to the platform, pulling messages from an inbox queue and posting replies back. Works behind any NAT/firewall, tolerates offline laptops, no tunnel needed.
See the [design doc](https://github.com/Molecule-AI/internal/blob/main/product/external-workspaces-polling.md) (internal) and [implementation tracking issue](https://github.com/Molecule-AI/molecule-core/issues?q=polling+mode) once opened.
See the [design doc](https://git.moleculesai.app/molecule-ai/internal/src/branch/main/product/external-workspaces-polling.md) (internal) and the implementation tracking issue (search `polling+mode` on the [molecule-core issue tracker](https://git.moleculesai.app/molecule-ai/molecule-core/issues)).
---
+2 -2
View File
@@ -143,5 +143,5 @@ The agent appears on the canvas with a **purple REMOTE badge** within seconds. F
## Next Steps
- **[External Agent Registration Guide →](/docs/guides/external-agent-registration)** — full endpoint reference, Python + Node.js examples, troubleshooting
- **[molecule-sdk-python →](https://github.com/Molecule-AI/molecule-sdk-python)** — SDK source, `RemoteAgentClient` API docs
- **[SDK Examples →](https://github.com/Molecule-AI/molecule-sdk-python/tree/main/examples/remote-agent)** — `run.py` demo script, annotated walkthrough
- **[molecule-sdk-python →](https://git.moleculesai.app/molecule-ai/molecule-sdk-python)** — SDK source, `RemoteAgentClient` API docs
- **[SDK Examples →](https://git.moleculesai.app/molecule-ai/molecule-sdk-python/src/branch/main/examples/remote-agent)** — `run.py` demo script, annotated walkthrough
+2 -2
View File
@@ -61,7 +61,7 @@ molecule skills install arxiv-research --from community
Community skills are reviewed by the Molecule AI team before being
listed. Submit a skill for review by opening a PR against
[`molecule-ai/skills`](https://github.com/Molecule-AI/skills).
[`molecule-ai/skills`](https://git.moleculesai.app/molecule-ai/skills).
## Installing via config.yaml
@@ -151,7 +151,7 @@ molecule skills bundle my-custom-skill --output ./org-templates/my-role/
```
**Publishing to the community:** Open a PR against
[`molecule-ai/skills`](https://github.com/Molecule-AI/skills) with a
[`molecule-ai/skills`](https://git.moleculesai.app/molecule-ai/skills) with a
complete skill package. Community skills are reviewed for security and
correctness before listing.
@@ -96,7 +96,7 @@ fork needed in production.
`resolve_platform_id` for plugin-platform-safe deserialization, and
`self.adapters[adapter.platform]` keying fix (caught by real-subprocess
test before merge — see below).
- **Plugin package**: [Molecule-AI/hermes-platform-molecule-a2a](https://github.com/Molecule-AI/hermes-platform-molecule-a2a)
- **Plugin package**: [Molecule-AI/hermes-platform-molecule-a2a](https://git.moleculesai.app/molecule-ai/hermes-platform-molecule-a2a)
v0.1.0 — public, MIT-licensed. 11 unit tests + 8 in-process E2E
+ 4 real-subprocess E2E checkpoints all green.
- **Workspace template patch**: [Molecule-AI/molecule-ai-workspace-template-hermes#32](https://github.com/Molecule-AI/molecule-ai-workspace-template-hermes/pull/32)
@@ -154,7 +154,7 @@ intermediate shim earns its complexity.
## Codex (OpenAI Codex CLI)
**Status:** Template SHIPPED. Repo live at
[`Molecule-AI/molecule-ai-workspace-template-codex`](https://github.com/Molecule-AI/molecule-ai-workspace-template-codex)
[`Molecule-AI/molecule-ai-workspace-template-codex`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-codex)
(14 files, 1411 LOC, 12/12 tests). molecule-core registration in
[PR #2512](https://github.com/Molecule-AI/molecule-core/pull/2512).
E2E with real A2A traffic remains.
+2 -2
View File
@@ -17,7 +17,7 @@ This path is aligned to the current repository and current UI. It gets you from
## The one-command path
```bash
git clone https://github.com/Molecule-AI/molecule-monorepo.git
git clone https://git.moleculesai.app/molecule-ai/molecule-monorepo.git
cd molecule-monorepo
./scripts/dev-start.sh
```
@@ -42,7 +42,7 @@ If you'd rather run each component yourself — useful when you're iterating on
### Step 1: Clone the repository
```bash
git clone https://github.com/Molecule-AI/molecule-monorepo.git
git clone https://git.moleculesai.app/molecule-ai/molecule-monorepo.git
cd molecule-monorepo
```
+11 -11
View File
@@ -98,14 +98,14 @@ Each of the 8 adapter template repos contains:
| Adapter | Repo |
|---------|------|
| claude-code | https://github.com/Molecule-AI/molecule-ai-workspace-template-claude-code |
| langgraph | https://github.com/Molecule-AI/molecule-ai-workspace-template-langgraph |
| crewai | https://github.com/Molecule-AI/molecule-ai-workspace-template-crewai |
| autogen | https://github.com/Molecule-AI/molecule-ai-workspace-template-autogen |
| deepagents | https://github.com/Molecule-AI/molecule-ai-workspace-template-deepagents |
| hermes | https://github.com/Molecule-AI/molecule-ai-workspace-template-hermes |
| gemini-cli | https://github.com/Molecule-AI/molecule-ai-workspace-template-gemini-cli |
| openclaw | https://github.com/Molecule-AI/molecule-ai-workspace-template-openclaw |
| claude-code | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-claude-code |
| langgraph | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-langgraph |
| crewai | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-crewai |
| autogen | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-autogen |
| deepagents | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-deepagents |
| hermes | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-hermes |
| gemini-cli | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-gemini-cli |
| openclaw | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-openclaw |
## Adapter discovery (ADAPTER_MODULE)
@@ -244,7 +244,7 @@ correctness before pushing a `runtime-v*` tag.
## Writing a new adapter
Use the GitHub template repo
[`Molecule-AI/molecule-ai-workspace-template-starter`](https://github.com/Molecule-AI/molecule-ai-workspace-template-starter)
[`molecule-ai/molecule-ai-workspace-template-starter`](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-starter) (note: the starter repo did not survive the 2026-05-06 GitHub-org-suspension migration; recreation tracked at internal#41)
— it ships with the canonical Dockerfile + adapter.py skeleton + config.yaml
schema + the `repository_dispatch: [runtime-published]` cascade receiver
already wired up. No follow-up setup PR required.
@@ -256,7 +256,7 @@ gh repo create Molecule-AI/molecule-ai-workspace-template-<runtime> \
--public \
--description "Molecule AI workspace template: <runtime>"
git clone https://github.com/Molecule-AI/molecule-ai-workspace-template-<runtime>
git clone https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime>.git
cd molecule-ai-workspace-template-<runtime>
```
@@ -286,7 +286,7 @@ After `git push`:
If the canonical shape changes (e.g. `config.yaml` schema gets a new field,
the `BaseAdapter` interface adds a method, the reusable CI workflow
signature changes), update the
[starter](https://github.com/Molecule-AI/molecule-ai-workspace-template-starter)
[starter](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-starter) (recreation pending — see note above)
**first**. Existing templates can either migrate at their own pace or be
touched in a coordinated cleanup PR. Either way, future templates pick up
the new shape from day one.
+1 -1
View File
@@ -11,7 +11,7 @@ There are three related scripts; pick the right one:
|---|---|---|
| `measure-coordinator-task-bounds.sh` | **Canonical** v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via `claude-code-default` + `langgraph` templates, sends a synthesis-heavy A2A kickoff, observes elapsed time + activity trace. | OSS-shape platform — localhost or any `/workspaces`-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
| `measure-coordinator-task-bounds-runner.sh` | Generalised runner for the same measurement contract but with **arbitrary template + secret + model combinations** (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via `MODE=saas`). |
| `measure-coordinator-task-bounds.sh` (in [molecule-controlplane](https://github.com/Molecule-AI/molecule-controlplane)) | **Production-shape** variant that bootstraps a real staging tenant via `POST /cp/admin/orgs`, then runs the same measurement against `<slug>.staging.moleculesai.app`. | Staging controlplane only — refuses to run against production. |
| `measure-coordinator-task-bounds.sh` (in [molecule-controlplane](https://git.moleculesai.app/molecule-ai/molecule-controlplane)) | **Production-shape** variant that bootstraps a real staging tenant via `POST /cp/admin/orgs`, then runs the same measurement against `<slug>.staging.moleculesai.app`. | Staging controlplane only — refuses to run against production. |
See `reference_harness_pair_pattern` (auto-memory) for when to use which
and the cross-repo design rationale.
+2 -2
View File
@@ -278,7 +278,7 @@ include = ["molecule_runtime*"]
README_TEMPLATE = """\
# molecule-ai-workspace-runtime
Shared workspace runtime for [Molecule AI](https://github.com/Molecule-AI/molecule-core)
Shared workspace runtime for [Molecule AI](https://git.moleculesai.app/molecule-ai/molecule-core)
agent adapters. Installed by every workspace template image
(`workspace-template-claude-code`, `-langgraph`, `-hermes`, etc.) to provide
A2A delegation, heartbeat, memory, plugin loading, and skill management.
@@ -396,7 +396,7 @@ If you don't need real-time push, the default poll path works
universally with no extra setup; both modes converge on the same
`inbox_pop` ack so messages never duplicate.
See [`docs/workspace-runtime-package.md`](https://github.com/Molecule-AI/molecule-core/blob/main/docs/workspace-runtime-package.md)
See [`docs/workspace-runtime-package.md`](https://git.moleculesai.app/molecule-ai/molecule-core/src/branch/main/docs/workspace-runtime-package.md)
for the publish flow and architecture.
"""
+2 -2
View File
@@ -10,11 +10,11 @@
# → PyPI auto-bumps molecule-ai-workspace-runtime patch version
# → repository_dispatch fans out to 8 workspace-template-* repos
# → each template repo rebuilds and re-tags
# ghcr.io/molecule-ai/workspace-template-<runtime>:latest
# 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/workspace-template-<runtime>:latest
#
# PATH 2: any merge to a workspace-template-* repo's main branch
# → that repo's publish-image.yml fires
# → ghcr.io/molecule-ai/workspace-template-<runtime>:latest
# → 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/workspace-template-<runtime>:latest
# gets re-tagged
#
# provisioner.go:296 RuntimeImages[runtime] reads `:latest` at every
+1 -1
View File
@@ -51,7 +51,7 @@ log "pulling latest images for: ${RUNTIMES[*]}"
PULLED=()
FAILED=()
for rt in "${RUNTIMES[@]}"; do
IMG="ghcr.io/molecule-ai/workspace-template-$rt:latest"
IMG="153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/workspace-template-$rt:latest"
if docker pull "$IMG" >/dev/null 2>&1; then
log "$rt"
PULLED+=("$rt")
+21 -16
View File
@@ -1,9 +1,10 @@
#!/bin/bash
# rollback-latest.sh — moves the :latest tag on ghcr.io/molecule-ai/platform
# (and the matching tenant image) back to a prior :staging-<sha> digest
# without rebuilding anything. Prod tenants auto-pull :latest every 5
# min, so this is the fast path when a canary-verified image turns out
# to have a runtime regression that canary didn't catch.
# rollback-latest.sh — moves the :latest tag on the platform image
# (and the matching tenant image) on AWS ECR back to a prior
# :staging-<sha> digest without rebuilding anything. Prod tenants
# auto-pull :latest every 5 min, so this is the fast path when a
# canary-verified image turns out to have a runtime regression that
# canary didn't catch.
#
# Usage:
# scripts/rollback-latest.sh <sha>
@@ -12,12 +13,14 @@
# Prereqs:
# - crane on $PATH (brew install crane OR download from
# https://github.com/google/go-containerregistry/releases)
# - GHCR token exported as GITHUB_TOKEN with write:packages scope
# - aws CLI authenticated for region us-east-2 with ECR pull/push
# access to the molecule-ai/platform + platform-tenant repositories.
# `aws sts get-caller-identity` should succeed.
#
# What it does (per image — platform + tenant):
# crane digest ghcr.io/…:<sha> # verify the target sha exists
# crane tag ghcr.io/…:<sha> latest # retag remotely, single API call
# crane digest ghcr.io/…:latest # confirm the move
# crane digest <ecr>:<sha> # verify the target sha exists
# crane tag <ecr>:<sha> latest # retag remotely, single API call
# crane digest <ecr>:latest # confirm the move
#
# Exit codes: 0 = both retagged, 1 = tag missing / crane error, 2 = bad args.
@@ -30,21 +33,23 @@ if [ "${1:-}" = "" ]; then
fi
TARGET_SHA="$1"
PLATFORM=ghcr.io/molecule-ai/platform
TENANT=ghcr.io/molecule-ai/platform-tenant
ECR_HOST=153263036946.dkr.ecr.us-east-2.amazonaws.com
PLATFORM=$ECR_HOST/molecule-ai/platform
TENANT=$ECR_HOST/molecule-ai/platform-tenant
if ! command -v crane >/dev/null; then
echo "ERROR: crane not installed. brew install crane" >&2
exit 1
fi
if [ -z "${GITHUB_TOKEN:-}" ]; then
echo "ERROR: GITHUB_TOKEN unset. export it with write:packages scope." >&2
if ! command -v aws >/dev/null; then
echo "ERROR: aws CLI not installed. brew install awscli" >&2
exit 1
fi
# Log in once. crane stores creds in a config file keyed by registry;
# re-running is cheap.
printf '%s\n' "$GITHUB_TOKEN" | crane auth login ghcr.io -u "${GITHUB_ACTOR:-$(whoami)}" --password-stdin >/dev/null
# Log in once. ECR auth is via short-lived password from `aws ecr
# get-login-password`. crane stores creds in a config file keyed by
# registry; re-running is cheap.
aws ecr get-login-password --region us-east-2 | crane auth login "$ECR_HOST" -u AWS --password-stdin >/dev/null
roll() {
local image="$1"
+1 -1
View File
@@ -18,7 +18,7 @@
#
# Or inline via curl:
#
# bash <(curl -fsSL https://raw.githubusercontent.com/Molecule-AI/molecule-core/main/tools/check-template-parity.sh) \
# bash <(curl -fsSL https://git.moleculesai.app/molecule-ai/molecule-core/raw/branch/main/tools/check-template-parity.sh) \
# install.sh start.sh
#
# Exit codes:
+4 -8
View File
@@ -5,15 +5,11 @@
FROM golang:1.25-alpine AS builder
WORKDIR /app
# Plugin source for replace directive in go.mod
COPY molecule-ai-plugin-github-app-auth/ /plugin/
COPY workspace-server/go.mod workspace-server/go.sum ./
# Add replace directives for Docker builds:
# 1. Platform → plugin (plugin source at /plugin/)
# 2. Plugin → platform (plugin's go.mod has a relative replace that doesn't
# work in Docker; fix it to point at /app where the platform source lives)
RUN echo 'replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin' >> go.mod
RUN sed -i 's|replace github.com/Molecule-AI/molecule-monorepo/platform => .*|replace github.com/Molecule-AI/molecule-monorepo/platform => /app|' /plugin/go.mod
# github-app-auth plugin removed 2026-05-07 (#157): per-agent Gitea
# identities replaced the GitHub-App-installation token flow after the
# 2026-05-06 suspension. Pre-removal this stage COPY'd the sibling
# plugin repo + injected a `replace` directive; both are gone.
RUN go mod download
COPY workspace-server/ .
# GIT_SHA mirror of Dockerfile.tenant — see that file for the rationale.
+3 -2
View File
@@ -16,9 +16,10 @@
# ── Stage 1: Go platform binary ──────────────────────────────────────
FROM golang:1.25-alpine AS go-builder
WORKDIR /app
COPY molecule-ai-plugin-github-app-auth/ /plugin/
COPY workspace-server/go.mod workspace-server/go.sum ./
RUN echo 'replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin' >> go.mod
# github-app-auth plugin removed 2026-05-07 (#157): per-agent Gitea
# identities replaced GitHub-App tokens post-suspension. The sibling
# COPY + replace directive are gone.
RUN go mod download
COPY workspace-server/ .
+23 -28
View File
@@ -29,8 +29,7 @@ import (
// External plugins — each registers EnvMutator(s) that run at workspace
// provision time. Loaded via soft-dep gates in main() so self-hosters
// without the App or without per-agent identity configured keep working.
githubappauth "github.com/Molecule-AI/molecule-ai-plugin-github-app-auth/pluginloader"
// without per-agent identity configured keep working.
ghidentity "github.com/Molecule-AI/molecule-ai-plugin-gh-identity/pluginloader"
"github.com/Molecule-AI/molecule-monorepo/platform/pkg/provisionhook"
@@ -179,12 +178,15 @@ func main() {
}
// External-plugin env mutators — each plugin contributes 0+ mutators
// onto a shared registry. Order matters: gh-identity populates
// MOLECULE_AGENT_ROLE-derived attribution env vars that downstream
// mutators and the workspace's install.sh can then read. Keep
// github-app-auth last because it fails loudly on misconfig and its
// failure mode is "no GITHUB_TOKEN" — worth surfacing after the
// cheaper mutators already ran.
// onto a shared registry. gh-identity populates MOLECULE_AGENT_ROLE-
// derived attribution env vars that the workspace's install.sh can
// then read.
//
// github-app-auth was dropped 2026-05-07 (closes #157): per-agent
// Gitea identities (this gh-identity plugin's role-derived path)
// replaced GitHub-App-installation tokens after the 2026-05-06
// suspension. Workspaces now provision with a per-persona Gitea PAT
// from .env instead of an App-rotated GITHUB_TOKEN.
envReg := provisionhook.NewRegistry()
// gh-identity plugin — per-agent attribution via env injection + gh
@@ -198,26 +200,6 @@ func main() {
log.Printf("gh-identity: registered (config file=%q)", os.Getenv("MOLECULE_GH_IDENTITY_CONFIG_FILE"))
}
// github-app-auth plugin — injects GITHUB_TOKEN + GH_TOKEN into every
// workspace env using the App's installation access token (rotates ~hourly).
// Soft-skip when GITHUB_APP_* env vars are absent so dev/self-hosters
// without an App configured keep working; fail-loud only on MISCONFIG
// (e.g. APP_ID set but key file missing), not on unset.
if os.Getenv("GITHUB_APP_ID") != "" {
if reg, err := githubappauth.BuildRegistry(); err != nil {
log.Fatalf("github-app-auth plugin: %v", err)
} else {
// Copy the plugin's mutators onto the shared registry so the
// TokenProvider probe (FirstTokenProvider) still finds them.
for _, m := range reg.Mutators() {
envReg.Register(m)
}
log.Printf("github-app-auth: registered, %d mutator(s) added to chain", reg.Len())
}
} else {
log.Println("github-app-auth: GITHUB_APP_ID unset — skipping plugin registration (agents will use any PAT from .env)")
}
wh.SetEnvMutators(envReg)
log.Printf("env-mutator chain: %v", envReg.Names())
@@ -266,6 +248,19 @@ func main() {
})
}
// CP-mode orphan sweeper — SaaS counterpart to the Docker sweeper
// above. Re-issues cpProv.Stop for any workspace at status='removed'
// with a non-NULL instance_id, healing the deprovision split-write
// race documented in #2989: tenant marks status='removed' BEFORE
// calling CP DELETE, so a transient CP failure leaves the EC2
// running with no retry path. cpProv.Stop is idempotent against
// already-terminated instances; on success we clear instance_id.
if cpProv != nil {
go supervised.RunWithRecover(ctx, "cp-orphan-sweeper", func(c context.Context) {
registry.StartCPOrphanSweeper(c, cpProv)
})
}
// Pending-uploads GC sweep — deletes acked rows past their retention
// window plus unacked rows past expires_at. Without this the
// pending_uploads table grows unbounded; even with the 24h hard TTL,
-1
View File
@@ -5,7 +5,6 @@ go 1.25.0
require (
github.com/DATA-DOG/go-sqlmock v1.5.2
github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.0.0-20260424033845-4fd5ac7be30f
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth v0.0.0-20260421064811-7d98ae51e31d
github.com/alicebob/miniredis/v2 v2.37.0
github.com/creack/pty v1.1.24
github.com/docker/docker v28.5.2+incompatible
-2
View File
@@ -6,8 +6,6 @@ github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERo
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.0.0-20260424033845-4fd5ac7be30f h1:YkLRhUg+9qr9OV9N8dG1Hj0Ml7TThHlRwh5F//oUJVs=
github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.0.0-20260424033845-4fd5ac7be30f/go.mod h1:NqdtlWZDJvpXNJRHnMkPhTKHdA1LZTNH+63TB66JSOU=
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth v0.0.0-20260421064811-7d98ae51e31d h1:GpYhP6FxaJZc1Ljy5/YJ9ZIVGvfOqZBmDolNr2S5x2g=
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth v0.0.0-20260421064811-7d98ae51e31d/go.mod h1:3a6LR/zd7FjR9ZwLTbytwYlWuCBsbCOVFlEg0WnoYiM=
github.com/alicebob/miniredis/v2 v2.37.0 h1:RheObYW32G1aiJIj81XVt78ZHJpHonHLHW7OLIshq68=
github.com/alicebob/miniredis/v2 v2.37.0/go.mod h1:TcL7YfarKPGDAthEtl5NBeHZfeUQj6OXMm/+iu5cLMM=
github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs=
@@ -56,10 +56,17 @@ type RefreshResult struct {
Recreated []string `json:"recreated"`
}
// TemplateImageRef returns the canonical GHCR ref for a runtime's template
// image. Single source of truth shared with imagewatch.
// TemplateImageRef returns the canonical image ref for a runtime's template,
// using the configured registry (provisioner.RegistryPrefix()) and the
// moving `:latest` tag. Single source of truth shared with imagewatch.
//
// Defaults to ghcr.io/molecule-ai/workspace-template-<runtime>:latest
// (upstream OSS). When MOLECULE_IMAGE_REGISTRY is set in the environment
// (typically the AWS ECR mirror in production), this returns the prefixed
// equivalent so admin operations and image-watch checks hit the same
// registry the provisioner pulls from.
func TemplateImageRef(runtime string) string {
return fmt.Sprintf("ghcr.io/molecule-ai/workspace-template-%s:latest", runtime)
return fmt.Sprintf("%s/workspace-template-%s:latest", provisioner.RegistryPrefix(), runtime)
}
// ghcrAuthHeader returns the base64-encoded JSON auth payload Docker's
@@ -0,0 +1,437 @@
package handlers
// eic_tunnel_pool.go — refcounted pool for EIC SSH tunnels keyed on
// instanceID. Reuses one tunnel across N file ops, amortising the
// ssh-keygen + SendSSHPublicKey + open-tunnel + waitForPort cost
// (~3-5s) over multiple cats/finds (~50-200ms each).
//
// Origin: core#11 — canvas detail-panel config + filesystem load
// took ~20s. ConfigTab fans out 4 GETs serially; the slowest is
// /files/config.yaml which dispatches to readFileViaEIC. Without a
// pool, every readFileViaEIC + listFilesViaEIC + writeFileViaEIC +
// deleteFileViaEIC pays the full setup cost even when fired
// back-to-back on the same workspace EC2.
//
// The pool keeps one eicSSHSession alive per instanceID for up to
// poolTTL. SendSSHPublicKey grants a 60s key validity, so poolTTL
// must stay strictly below that to avoid serving requests on a
// just-expired key. We default to 50s with a 10s safety margin.
//
// Concurrency model:
//
// - Single mutex guards the entries map.
// - Slow path (tunnel setup) runs OUTSIDE the lock, gated by an
// "intent" placeholder so concurrent acquires for the same
// instanceID don't both build a tunnel — the loser drops its
// setup and uses the winner's.
// - Refcount on each entry; eviction blocked while refcount > 0.
// - Janitor goroutine sweeps every poolJanitorInterval, drops
// entries where refcount == 0 && expiresAt < now.
//
// Test injection:
//
// - poolSetupTunnel is a package-level var so tests can swap the
// slow path for a counting stub. Production wires it to
// realWithEICTunnel-style setup.
// - withEICTunnel (the public, single-shot API) is also a var
// (already, see template_files_eic.go). It's rebound here to
// pooledWithEICTunnel which routes through globalEICTunnelPool.
// - Tests that need single-shot behaviour can set poolTTL = 0,
// which makes pooledWithEICTunnel fall through to the underlying
// setup directly (no pool entry kept).
import (
"context"
"fmt"
"sync"
"time"
)
// poolTTL is the maximum age of a pooled tunnel. Must be strictly
// less than the SendSSHPublicKey grant window (60s) so we never
// serve a request through a key that's about to expire mid-op.
//
// Configurable via init-time wiring (see initEICTunnelPool); not a
// const so tests can pin TTL=0 (disable pooling) or TTL=50ms (drive
// eviction tests).
var poolTTL = 50 * time.Second
// poolJanitorInterval is how often the janitor goroutine sweeps for
// expired idle entries. Tighter than poolTTL so eviction is timely;
// loose enough that the goroutine doesn't burn CPU.
var poolJanitorInterval = 10 * time.Second
// poolMaxEntries caps simultaneous instanceIDs the pool tracks.
// Beyond this, new acquires evict the LRU entry. Defends against a
// pathological caller (e.g. a sweep over hundreds of workspace
// EC2s) from leaking unbounded tunnel processes. 32 is a generous
// ceiling for the canvas use case (one human navigates ≤ ~5
// workspaces at a time).
var poolMaxEntries = 32
// poolSetupTunnel is the slow-path tunnel constructor. Wrapped in a
// var so tests can inject a counter stub. Returns a session and a
// cleanup function (closes the open-tunnel subprocess + scrubs the
// ephemeral keydir). nil session + non-nil err means setup failed
// and there is nothing to clean up.
//
// Production wiring lives in eic_tunnel_pool_setup.go (a thin shim
// over the existing realWithEICTunnel logic).
var poolSetupTunnel = func(ctx context.Context, instanceID string) (
sess eicSSHSession, cleanup func(), err error) {
return setupRealEICTunnel(ctx, instanceID)
}
// pooledTunnel is one entry in the pool. session is shared by N
// concurrent fn calls; cleanup runs once when refcount returns to
// zero AND the entry is past expiresAt or evicted.
//
// lastUsed tracks the most recent acquire time for LRU bookkeeping
// (overflow eviction). expiresAt is set at construction and not
// extended on use — a tunnel cannot live past poolTTL even if it's
// hot, because the underlying SendSSHPublicKey grant expires.
type pooledTunnel struct {
session eicSSHSession
cleanup func()
expiresAt time.Time
lastUsed time.Time
refcount int
poisoned bool // true if a fn returned a tunnel-fatal error; do not reuse
}
// eicTunnelPool is the package-level pool. Single instance lives
// in globalEICTunnelPool; constructor runs lazily on first acquire.
type eicTunnelPool struct {
mu sync.Mutex
entries map[string]*pooledTunnel
// pendingSetups guards concurrent setup for the same instanceID.
// First acquirer takes the slot; later ones wait on the channel.
pendingSetups map[string]chan struct{}
stopJanitor chan struct{}
}
var (
globalEICTunnelPool *eicTunnelPool
globalEICTunnelPoolOnce sync.Once
)
// getEICTunnelPool returns the singleton pool, lazy-initialising on
// first call. Idempotent.
func getEICTunnelPool() *eicTunnelPool {
globalEICTunnelPoolOnce.Do(func() {
globalEICTunnelPool = newEICTunnelPool()
go globalEICTunnelPool.janitor()
})
return globalEICTunnelPool
}
// newEICTunnelPool constructs an empty pool. Exported so tests can
// build isolated pools without sharing the singleton.
func newEICTunnelPool() *eicTunnelPool {
return &eicTunnelPool{
entries: map[string]*pooledTunnel{},
pendingSetups: map[string]chan struct{}{},
stopJanitor: make(chan struct{}),
}
}
// acquire returns a usable session for instanceID. If a healthy entry
// exists, refcount++ and return it. If a setup is in flight for the
// same instanceID, wait for it. Otherwise build one (slow path).
//
// done() must be called by the caller when the op finishes. It
// decrements refcount and triggers cleanup if the entry is past
// TTL or poisoned and refcount==0.
//
// Errors from the slow path propagate; pool state is not modified
// for failed setups (no poisoned entry created — that's only for
// fn-returned errors on a previously-good session).
func (p *eicTunnelPool) acquire(ctx context.Context, instanceID string) (
sess eicSSHSession, done func(poisoned bool), err error) {
if poolTTL <= 0 {
// Pool disabled (TTL=0 mode for tests / opt-out). Fall
// through to a direct setup with caller-driven cleanup.
s, cleanup, err := poolSetupTunnel(ctx, instanceID)
if err != nil {
return eicSSHSession{}, nil, err
}
return s, func(_ bool) { cleanup() }, nil
}
for {
p.mu.Lock()
if pt, ok := p.entries[instanceID]; ok && !pt.poisoned && pt.expiresAt.After(time.Now()) {
pt.refcount++
pt.lastUsed = time.Now()
p.mu.Unlock()
return pt.session, p.releaser(instanceID, pt), nil
}
// Either no entry, expired entry, or poisoned entry. If a
// setup is already in flight, wait and retry.
if pending, ok := p.pendingSetups[instanceID]; ok {
p.mu.Unlock()
select {
case <-pending:
continue // re-check the entries map
case <-ctx.Done():
return eicSSHSession{}, nil, ctx.Err()
}
}
// Drop expired/poisoned entry now (we'll cleanup outside
// the lock — the entry is unreferenced or we'd not be here).
var oldCleanup func()
if pt, ok := p.entries[instanceID]; ok {
if pt.refcount == 0 {
oldCleanup = pt.cleanup
delete(p.entries, instanceID)
}
}
// Reserve the setup slot.
signal := make(chan struct{})
p.pendingSetups[instanceID] = signal
p.mu.Unlock()
if oldCleanup != nil {
go oldCleanup()
}
// Slow path: build a new tunnel. Anything that goes wrong
// here cleans up the pendingSetups slot and propagates to
// the caller without leaving the pool in a state where the
// next acquire blocks waiting on a signal that never fires.
newSess, cleanup, setupErr := poolSetupTunnel(ctx, instanceID)
p.mu.Lock()
delete(p.pendingSetups, instanceID)
close(signal)
if setupErr != nil {
p.mu.Unlock()
return eicSSHSession{}, nil, fmt.Errorf("eic tunnel setup: %w", setupErr)
}
// Enforce LRU bound BEFORE inserting so we don't briefly
// exceed the cap even by one entry.
p.evictLRUIfFullLocked(instanceID)
pt := &pooledTunnel{
session: newSess,
cleanup: cleanup,
expiresAt: time.Now().Add(poolTTL),
lastUsed: time.Now(),
refcount: 1,
}
p.entries[instanceID] = pt
p.mu.Unlock()
return pt.session, p.releaser(instanceID, pt), nil
}
}
// releaser returns a closure that decrements refcount and triggers
// cleanup if (a) the entry is past TTL or (b) the caller signalled
// poison. Idempotent against double-release (decrements once via the
// captured pt; pool entry may have been replaced by then).
func (p *eicTunnelPool) releaser(instanceID string, pt *pooledTunnel) func(poisoned bool) {
released := false
return func(poisoned bool) {
p.mu.Lock()
defer p.mu.Unlock()
if released {
return
}
released = true
pt.refcount--
if poisoned {
pt.poisoned = true
}
// Evict immediately if poisoned-and-idle OR expired-and-idle.
// Hot entries (refcount > 0) defer eviction to the last release.
if pt.refcount == 0 && (pt.poisoned || pt.expiresAt.Before(time.Now())) {
// If the entry in the map is still us, remove it.
if cur, ok := p.entries[instanceID]; ok && cur == pt {
delete(p.entries, instanceID)
}
go pt.cleanup()
}
}
}
// evictLRUIfFullLocked drops the least-recently-used IDLE entry
// when the pool is at capacity. Caller must hold p.mu. The new
// instanceID about to be inserted is excluded so we don't evict
// ourselves. If no idle entries exist, no eviction happens — the
// new entry will push us above the soft cap until something releases.
func (p *eicTunnelPool) evictLRUIfFullLocked(skipInstance string) {
if len(p.entries) < poolMaxEntries {
return
}
var oldestKey string
var oldest *pooledTunnel
for k, pt := range p.entries {
if k == skipInstance {
continue
}
if pt.refcount > 0 {
continue
}
if oldest == nil || pt.lastUsed.Before(oldest.lastUsed) {
oldestKey = k
oldest = pt
}
}
if oldest == nil {
return // every entry is in use; no eviction possible
}
delete(p.entries, oldestKey)
go oldest.cleanup()
}
// janitor periodically scans for entries that are idle AND expired,
// closing their tunnels. Runs forever (per pool lifetime); cancelled
// by close(p.stopJanitor) for tests that build short-lived pools.
func (p *eicTunnelPool) janitor() {
t := time.NewTicker(poolJanitorInterval)
defer t.Stop()
for {
select {
case <-t.C:
p.sweep()
case <-p.stopJanitor:
return
}
}
}
// sweep is one janitor pass. Drops idle expired entries.
func (p *eicTunnelPool) sweep() {
p.mu.Lock()
now := time.Now()
var toClose []func()
for k, pt := range p.entries {
if pt.refcount == 0 && pt.expiresAt.Before(now) {
toClose = append(toClose, pt.cleanup)
delete(p.entries, k)
}
}
p.mu.Unlock()
for _, c := range toClose {
go c()
}
}
// stop terminates the janitor and closes all idle entries. Hot
// (refcount > 0) entries are NOT force-closed — callers running
// against them would see a use-after-free. In practice stop is only
// called by tests that have already drained their callers.
func (p *eicTunnelPool) stop() {
close(p.stopJanitor)
p.mu.Lock()
defer p.mu.Unlock()
for k, pt := range p.entries {
if pt.refcount == 0 {
go pt.cleanup()
delete(p.entries, k)
}
}
}
// pooledWithEICTunnel is the pool-backed replacement for
// realWithEICTunnel. The signature matches `var withEICTunnel`
// exactly so the rebind (in initEICTunnelPool) is a drop-in.
//
// Errors from `fn` itself are forwarded to the caller AND mark the
// pool entry as poisoned, so the next acquire builds a fresh
// tunnel. This catches the case where the workspace EC2 was
// restarted out-of-band (tunnel still appears alive locally but
// every cat/find errors out).
func pooledWithEICTunnel(ctx context.Context, instanceID string,
fn func(s eicSSHSession) error) error {
pool := getEICTunnelPool()
sess, done, err := pool.acquire(ctx, instanceID)
if err != nil {
return err
}
// poisoned defaults to true so a panic from fn poisons the
// entry on the way through the deferred release. Without the
// defer, a panicking fn would leak refcount=1 forever and
// permanently block eviction of this entry. The fn-error path
// resets poisoned to its real classification before return.
poisoned := true
defer func() { done(poisoned) }()
fnErr := fn(sess)
poisoned = fnErrIndicatesTunnelFault(fnErr)
return fnErr
}
// fnErrIndicatesTunnelFault returns true for fn errors whose nature
// suggests the underlying tunnel is no longer reusable (auth gone,
// network gone, ssh process dead). Returning true poisons the pool
// entry so the next acquire builds fresh.
//
// Conservative: only marks tunnel-faulty for clearly tunnel-level
// failures (connection refused, broken pipe, ssh exit-status from
// fatal-channel signals). A `cat` returning os.ErrNotExist on a
// missing file is NOT a tunnel fault — that's the file path being
// wrong, the tunnel is fine.
func fnErrIndicatesTunnelFault(err error) bool {
if err == nil {
return false
}
msg := err.Error()
// stderr substrings produced by ssh when the tunnel is broken.
for _, marker := range []string{
"connection refused",
"connection closed",
"broken pipe",
"Connection reset by peer",
"kex_exchange_identification",
"port forwarding failed",
"Permission denied",
"Authentication failed",
} {
if containsCaseInsensitive(msg, marker) {
return true
}
}
return false
}
// containsCaseInsensitive avoids importing strings just for this
// (the file already needs ssh stderr matching elsewhere — this
// keeps the helper local to avoid a cross-file dependency).
func containsCaseInsensitive(s, substr string) bool {
if len(substr) > len(s) {
return false
}
// Manual lowercase compare loop; ssh error markers are ASCII so
// no need for unicode-aware folding.
low := func(b byte) byte {
if b >= 'A' && b <= 'Z' {
return b + 32
}
return b
}
for i := 0; i+len(substr) <= len(s); i++ {
match := true
for j := 0; j < len(substr); j++ {
if low(s[i+j]) != low(substr[j]) {
match = false
break
}
}
if match {
return true
}
}
return false
}
// initEICTunnelPool rebinds the package-level withEICTunnel var to
// the pooled implementation. Called once at package init via the
// init() in eic_tunnel_pool_setup.go (split file so the rebind
// itself is testable without dragging in the production setup
// shim's exec/aws dependencies).
func initEICTunnelPool() {
withEICTunnel = pooledWithEICTunnel
}
@@ -0,0 +1,136 @@
package handlers
// eic_tunnel_pool_setup.go — production setup shim.
//
// setupRealEICTunnel decomposes the existing realWithEICTunnel into
// its slow half (build the tunnel) and its caller half (run fn). The
// pool calls the slow half once and shares the resulting session
// across N callers, holding cleanup until the last release.
//
// Why decompose instead of refactoring realWithEICTunnel: the
// existing function and its test stub-vars (withEICTunnel,
// sendSSHPublicKey, openTunnelCmd) are load-bearing for the
// dispatch tests. Extracting a sibling setup function preserves the
// existing single-shot path verbatim — the pool wraps it by calling
// realWithEICTunnel through a thin adapter, leaving the tested
// surface unchanged.
//
// The pool's acquire() invokes poolSetupTunnel, which is a `var`
// pointing to setupRealEICTunnel for production and a counting stub
// for tests.
import (
"context"
"fmt"
"os"
"os/exec"
"strings"
"time"
)
// setupRealEICTunnel is the slow path that the pool consumes when
// no warm entry exists. Mirrors realWithEICTunnel's setup half but
// returns the session + cleanup instead of running fn inline.
//
// The cleanup func owns the tunnel subprocess, ephemeral key dir,
// and a one-time wait. Idempotent — calling it twice is safe; the
// pool guarantees one call per session, but defence-in-depth helps
// when tests run pools in parallel and racy sweeps re-trigger.
func setupRealEICTunnel(ctx context.Context, instanceID string) (
eicSSHSession, func(), error) {
if instanceID == "" {
return eicSSHSession{}, nil,
fmt.Errorf("workspace has no instance_id — not a SaaS EC2 workspace")
}
osUser := os.Getenv("WORKSPACE_EC2_OS_USER")
if osUser == "" {
osUser = "ubuntu"
}
region := os.Getenv("AWS_REGION")
if region == "" {
region = "us-east-2"
}
keyDir, err := os.MkdirTemp("", "molecule-eic-pool-*")
if err != nil {
return eicSSHSession{}, nil, fmt.Errorf("keydir mkdir: %w", err)
}
keyPath := keyDir + "/id"
if out, kerr := exec.CommandContext(ctx, "ssh-keygen",
"-t", "ed25519", "-f", keyPath, "-N", "", "-q",
"-C", "molecule-eic-pool",
).CombinedOutput(); kerr != nil {
_ = os.RemoveAll(keyDir)
return eicSSHSession{}, nil,
fmt.Errorf("ssh-keygen: %w (%s)", kerr, strings.TrimSpace(string(out)))
}
pubKey, err := os.ReadFile(keyPath + ".pub")
if err != nil {
_ = os.RemoveAll(keyDir)
return eicSSHSession{}, nil, fmt.Errorf("read pubkey: %w", err)
}
if err := sendSSHPublicKey(ctx, region, instanceID, osUser,
strings.TrimSpace(string(pubKey))); err != nil {
_ = os.RemoveAll(keyDir)
return eicSSHSession{}, nil, fmt.Errorf("send-ssh-public-key: %w", err)
}
localPort, err := pickFreePort()
if err != nil {
_ = os.RemoveAll(keyDir)
return eicSSHSession{}, nil, fmt.Errorf("pick free port: %w", err)
}
tunnel := openTunnelCmd(eicSSHOptions{
InstanceID: instanceID,
OSUser: osUser,
Region: region,
LocalPort: localPort,
PrivateKeyPath: keyPath,
})
tunnel.Env = os.Environ()
if err := tunnel.Start(); err != nil {
_ = os.RemoveAll(keyDir)
return eicSSHSession{}, nil, fmt.Errorf("open-tunnel start: %w", err)
}
if err := waitForPort(ctx, "127.0.0.1", localPort, 10*time.Second); err != nil {
if tunnel.Process != nil {
_ = tunnel.Process.Kill()
}
_ = tunnel.Wait()
_ = os.RemoveAll(keyDir)
return eicSSHSession{}, nil, fmt.Errorf("tunnel never listened: %w", err)
}
cleanedUp := false
cleanup := func() {
if cleanedUp {
return
}
cleanedUp = true
if tunnel.Process != nil {
_ = tunnel.Process.Kill()
}
_ = tunnel.Wait()
_ = os.RemoveAll(keyDir)
}
return eicSSHSession{
keyPath: keyPath,
localPort: localPort,
osUser: osUser,
instanceID: instanceID,
}, cleanup, nil
}
// init wires the pool into the package-level withEICTunnel var so
// every read/write/list/delete EIC op uses pooled tunnels by default.
// Test files that need single-shot behaviour can swap withEICTunnel
// back via the existing stubWithEICTunnel pattern, OR set poolTTL=0
// to disable pooling without rebinding the var.
func init() {
initEICTunnelPool()
}
@@ -0,0 +1,467 @@
package handlers
// eic_tunnel_pool_test.go — tests for the refcounted EIC tunnel pool
// added in core#11. Stubs poolSetupTunnel with a counter so the
// tests don't fork ssh-keygen / aws subprocesses.
//
// Per memory feedback_assert_exact_not_substring: each test pins
// exact expected counts (not "at least N") so a regression that
// silently double-sets-up surfaces here.
import (
"context"
"errors"
"sync"
"sync/atomic"
"testing"
"time"
)
// withPoolSetupStub swaps poolSetupTunnel for a counting fake that
// returns a sentinel session and a cleanup func that records its
// invocation. Restores on test cleanup.
//
// setupSignal blocks each setup until released — for concurrent-
// acquire tests where we want to gate setup completion.
func withPoolSetupStub(t *testing.T) (
setupCount *int64, cleanupCount *int64, restore func(), unblock func()) {
t.Helper()
prev := poolSetupTunnel
prevTTL := poolTTL
prevJanitor := poolJanitorInterval
var sc, cc int64
setupCount, cleanupCount = &sc, &cc
gate := make(chan struct{}, 1)
gate <- struct{}{} // allow the first setup through immediately
unblock = func() { gate <- struct{}{} }
poolSetupTunnel = func(ctx context.Context, instanceID string) (
eicSSHSession, func(), error) {
select {
case <-gate:
case <-ctx.Done():
return eicSSHSession{}, nil, ctx.Err()
}
atomic.AddInt64(&sc, 1)
sess := eicSSHSession{
instanceID: instanceID,
osUser: "ubuntu",
localPort: 10000 + int(atomic.LoadInt64(&sc)),
keyPath: "/tmp/molecule-eic-test-" + instanceID,
}
cleanup := func() { atomic.AddInt64(&cc, 1) }
return sess, cleanup, nil
}
restore = func() {
poolSetupTunnel = prev
poolTTL = prevTTL
poolJanitorInterval = prevJanitor
}
t.Cleanup(restore)
return
}
// freshPool returns an isolated pool (NOT the global) so tests run
// independently. Stops the janitor on cleanup.
func freshPool(t *testing.T) *eicTunnelPool {
t.Helper()
p := newEICTunnelPool()
t.Cleanup(p.stop)
return p
}
// TestEICTunnelPool_FourOpsAmortise pins the core invariant: four
// sequential acquire/release cycles on the same instanceID share
// ONE underlying tunnel setup. Mutation: delete the cache hit branch
// in acquire() → setupCount goes 1 → 4 → test fails.
func TestEICTunnelPool_FourOpsAmortise(t *testing.T) {
setupCount, cleanupCount, _, _ := withPoolSetupStub(t)
// Refill gate after each setup so concurrent stubs aren't blocked
// (we want every test to be able to set up if it needs to).
t.Cleanup(func() { /* no-op; defer is enough */ })
poolTTL = 50 * time.Second
pool := freshPool(t)
ctx := context.Background()
for i := 0; i < 4; i++ {
sess, done, err := pool.acquire(ctx, "i-test-1")
if err != nil {
t.Fatalf("op %d: acquire: %v", i, err)
}
if sess.instanceID != "i-test-1" {
t.Fatalf("op %d: session has wrong instanceID: %q", i, sess.instanceID)
}
done(false)
}
if got := atomic.LoadInt64(setupCount); got != 1 {
t.Errorf("expected exactly 1 tunnel setup across 4 ops, got %d", got)
}
if got := atomic.LoadInt64(cleanupCount); got != 0 {
t.Errorf("expected 0 cleanups while entry is hot (TTL=50s), got %d", got)
}
}
// TestEICTunnelPool_DifferentInstancesDoNotShare pins that two
// different instanceIDs each get their own tunnel — the pool is
// keyed on instanceID, not a single global slot.
func TestEICTunnelPool_DifferentInstancesDoNotShare(t *testing.T) {
setupCount, _, _, unblock := withPoolSetupStub(t)
poolTTL = 50 * time.Second
pool := freshPool(t)
ctx := context.Background()
// First instance setup uses the initial gate slot.
_, doneA, err := pool.acquire(ctx, "i-a")
if err != nil {
t.Fatalf("acquire A: %v", err)
}
doneA(false)
// Second instance needs a new slot through the gate.
unblock()
_, doneB, err := pool.acquire(ctx, "i-b")
if err != nil {
t.Fatalf("acquire B: %v", err)
}
doneB(false)
if got := atomic.LoadInt64(setupCount); got != 2 {
t.Errorf("expected 2 setups (one per instance), got %d", got)
}
}
// TestEICTunnelPool_TTLEviction: a short TTL forces the second op
// to build a fresh tunnel after the first expires.
func TestEICTunnelPool_TTLEviction(t *testing.T) {
setupCount, cleanupCount, _, unblock := withPoolSetupStub(t)
poolTTL = 50 * time.Millisecond
poolJanitorInterval = 1 * time.Second // keep janitor away
pool := freshPool(t)
ctx := context.Background()
_, done, err := pool.acquire(ctx, "i-ttl")
if err != nil {
t.Fatalf("acquire 1: %v", err)
}
done(false)
time.Sleep(80 * time.Millisecond) // past TTL
unblock() // allow next setup
_, done, err = pool.acquire(ctx, "i-ttl")
if err != nil {
t.Fatalf("acquire 2: %v", err)
}
done(false)
if got := atomic.LoadInt64(setupCount); got != 2 {
t.Errorf("expected 2 setups (TTL eviction between), got %d", got)
}
// First entry should have been cleaned up when the second
// acquire evicted it on the slow path. Cleanup runs in a
// goroutine; poll briefly for it to land.
deadline := time.Now().Add(500 * time.Millisecond)
for atomic.LoadInt64(cleanupCount) < 1 && time.Now().Before(deadline) {
time.Sleep(5 * time.Millisecond)
}
if got := atomic.LoadInt64(cleanupCount); got < 1 {
t.Errorf("expected ≥1 cleanup (first entry evicted), got %d", got)
}
}
// TestEICTunnelPool_FailureInvalidates pins the poison-on-fault
// behavior — fn returning a tunnel-fatal error marks the entry
// unusable so the next acquire builds fresh.
func TestEICTunnelPool_FailureInvalidates(t *testing.T) {
setupCount, _, _, unblock := withPoolSetupStub(t)
poolTTL = 50 * time.Second
pool := freshPool(t)
ctx := context.Background()
_, done, err := pool.acquire(ctx, "i-fault")
if err != nil {
t.Fatalf("acquire 1: %v", err)
}
done(true) // signal poison
unblock() // let the next setup through
_, done, err = pool.acquire(ctx, "i-fault")
if err != nil {
t.Fatalf("acquire 2: %v", err)
}
done(false)
if got := atomic.LoadInt64(setupCount); got != 2 {
t.Errorf("expected 2 setups (poison forced rebuild), got %d", got)
}
}
// TestEICTunnelPool_ConcurrentAcquireSingleSetup pins that N
// concurrent acquires for the same instanceID before any release
// only trigger ONE tunnel setup — the rest wait via pendingSetups.
//
// Without this guard each concurrent acquire would spawn its own
// tunnel and the loser-cleanup would still leak refcount. Mutation:
// delete the pendingSetups gate → setupCount goes 1 → N → fails.
func TestEICTunnelPool_ConcurrentAcquireSingleSetup(t *testing.T) {
setupCount, _, _, _ := withPoolSetupStub(t)
// Pause setup so all goroutines pile into the pending slot.
prev := poolSetupTunnel
gate := make(chan struct{})
poolSetupTunnel = func(ctx context.Context, instanceID string) (
eicSSHSession, func(), error) {
<-gate
atomic.AddInt64(setupCount, 1)
return eicSSHSession{instanceID: instanceID}, func() {}, nil
}
t.Cleanup(func() { poolSetupTunnel = prev })
poolTTL = 50 * time.Second
pool := freshPool(t)
ctx := context.Background()
const N = 8
type result struct {
done func(bool)
err error
}
results := make(chan result, N)
var startWg sync.WaitGroup
startWg.Add(N)
for i := 0; i < N; i++ {
go func() {
startWg.Done()
_, done, err := pool.acquire(ctx, "i-concurrent")
results <- result{done, err}
}()
}
startWg.Wait()
// give all N goroutines time to enter pool.acquire
time.Sleep(20 * time.Millisecond)
close(gate)
for i := 0; i < N; i++ {
r := <-results
if r.err != nil {
t.Fatalf("acquire %d: %v", i, r.err)
}
r.done(false)
}
if got := atomic.LoadInt64(setupCount); got != 1 {
t.Errorf("expected 1 setup across %d concurrent acquires, got %d", N, got)
}
}
// TestEICTunnelPool_TTLZeroDisablesPooling pins the escape hatch:
// poolTTL=0 means every acquire goes straight through to setup +
// cleanup, no entry kept. Useful for tests / opt-out.
func TestEICTunnelPool_TTLZeroDisablesPooling(t *testing.T) {
setupCount, cleanupCount, _, unblock := withPoolSetupStub(t)
poolTTL = 0
pool := freshPool(t)
ctx := context.Background()
_, done, err := pool.acquire(ctx, "i-ttlzero")
if err != nil {
t.Fatalf("acquire 1: %v", err)
}
done(false)
unblock()
_, done, err = pool.acquire(ctx, "i-ttlzero")
if err != nil {
t.Fatalf("acquire 2: %v", err)
}
done(false)
if got := atomic.LoadInt64(setupCount); got != 2 {
t.Errorf("expected 2 setups with TTL=0 (pool disabled), got %d", got)
}
if got := atomic.LoadInt64(cleanupCount); got != 2 {
t.Errorf("expected 2 cleanups with TTL=0 (each release closes), got %d", got)
}
}
// TestEICTunnelPool_LRUEvictionAtCap pins the LRU defence: when the
// pool reaches poolMaxEntries, a new acquire for an unseen
// instanceID evicts the LRU idle entry instead of growing unbounded.
func TestEICTunnelPool_LRUEvictionAtCap(t *testing.T) {
setupCount, cleanupCount, _, _ := withPoolSetupStub(t)
prev := poolMaxEntries
poolMaxEntries = 2
t.Cleanup(func() { poolMaxEntries = prev })
poolTTL = 50 * time.Second
// Replace stub with one that doesn't gate so we can fill quickly.
poolSetupTunnel = func(ctx context.Context, instanceID string) (
eicSSHSession, func(), error) {
atomic.AddInt64(setupCount, 1)
return eicSSHSession{instanceID: instanceID}, func() {
atomic.AddInt64(cleanupCount, 1)
}, nil
}
pool := freshPool(t)
ctx := context.Background()
for _, id := range []string{"i-1", "i-2"} {
_, done, err := pool.acquire(ctx, id)
if err != nil {
t.Fatalf("acquire %s: %v", id, err)
}
done(false)
}
// Both entries idle, pool at cap.
_, done, err := pool.acquire(ctx, "i-3")
if err != nil {
t.Fatalf("acquire i-3: %v", err)
}
done(false)
// Wait for the goroutine'd cleanup of the evicted entry.
deadline := time.Now().Add(500 * time.Millisecond)
for atomic.LoadInt64(cleanupCount) < 1 && time.Now().Before(deadline) {
time.Sleep(10 * time.Millisecond)
}
if got := atomic.LoadInt64(setupCount); got != 3 {
t.Errorf("expected 3 setups (one per unique instance), got %d", got)
}
if got := atomic.LoadInt64(cleanupCount); got < 1 {
t.Errorf("expected ≥1 cleanup (LRU eviction), got %d", got)
}
}
// TestEICTunnelPool_PoisonedClassification pins the heuristic that
// distinguishes tunnel-fatal errors (poison the entry) from
// app-level errors (file not found, validation) that should NOT
// invalidate the tunnel.
func TestEICTunnelPool_PoisonedClassification(t *testing.T) {
cases := []struct {
name string
err error
want bool
}{
{"nil", nil, false},
{"file not found", errors.New("os: file does not exist"), false},
{"validation", errors.New("invalid path: must be relative"), false},
{"connection refused", errors.New("ssh: connect to host: connection refused"), true},
{"connection refused upper", errors.New("Connection Refused"), true},
{"broken pipe", errors.New("write tunnel: broken pipe"), true},
{"permission denied", errors.New("Permission denied (publickey)"), true},
{"auth failed", errors.New("Authentication failed"), true},
{"connection reset", errors.New("Connection reset by peer"), true},
{"port forward", errors.New("port forwarding failed"), true},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := fnErrIndicatesTunnelFault(tc.err)
if got != tc.want {
t.Errorf("fnErrIndicatesTunnelFault(%v) = %v, want %v",
tc.err, got, tc.want)
}
})
}
}
// TestEICTunnelPool_RefcountBlocksEviction pins that an entry past
// TTL is NOT evicted while a caller still holds it — preventing
// use-after-free in the holder.
func TestEICTunnelPool_RefcountBlocksEviction(t *testing.T) {
setupCount, cleanupCount, _, _ := withPoolSetupStub(t)
poolTTL = 30 * time.Millisecond
poolJanitorInterval = 5 * time.Millisecond
pool := freshPool(t)
ctx := context.Background()
_, done, err := pool.acquire(ctx, "i-hold")
if err != nil {
t.Fatalf("acquire: %v", err)
}
// Sleep past TTL while holding the session. Janitor sweeps
// every 5ms but must skip our entry because refcount=1.
time.Sleep(80 * time.Millisecond)
if got := atomic.LoadInt64(cleanupCount); got != 0 {
t.Errorf("expected 0 cleanups while holder is active, got %d", got)
}
done(false)
// Now refcount=0 and entry is past TTL; releaser triggers cleanup.
deadline := time.Now().Add(200 * time.Millisecond)
for atomic.LoadInt64(cleanupCount) < 1 && time.Now().Before(deadline) {
time.Sleep(5 * time.Millisecond)
}
if got := atomic.LoadInt64(cleanupCount); got != 1 {
t.Errorf("expected 1 cleanup after release of expired entry, got %d", got)
}
if got := atomic.LoadInt64(setupCount); got != 1 {
t.Errorf("setupCount tracking: got %d, want 1", got)
}
}
// TestPooledWithEICTunnel_PanicPoisonsEntry pins that a panic
// from fn poisons the pool entry on the way out — refcount goes
// back to zero (no leak) and the entry is marked unusable so the
// next acquire builds fresh. Without the defer-release pattern, a
// panic would leave refcount=1 forever and the entry would never
// evict.
func TestPooledWithEICTunnel_PanicPoisonsEntry(t *testing.T) {
setupCount, _, _, _ := withPoolSetupStub(t)
poolTTL = 50 * time.Second
globalEICTunnelPool = newEICTunnelPool()
t.Cleanup(globalEICTunnelPool.stop)
func() {
defer func() {
if r := recover(); r == nil {
t.Errorf("expected panic to bubble up, got nil")
}
}()
_ = pooledWithEICTunnel(context.Background(), "i-panic",
func(s eicSSHSession) error { panic("boom") })
}()
// Replenish the gate so the next setup can run.
prev := poolSetupTunnel
poolSetupTunnel = func(ctx context.Context, instanceID string) (
eicSSHSession, func(), error) {
atomic.AddInt64(setupCount, 1)
return eicSSHSession{instanceID: instanceID}, func() {}, nil
}
t.Cleanup(func() { poolSetupTunnel = prev })
// Next acquire must build fresh — entry was poisoned by panic.
if err := pooledWithEICTunnel(context.Background(), "i-panic",
func(s eicSSHSession) error { return nil }); err != nil {
t.Fatalf("post-panic acquire: %v", err)
}
if got := atomic.LoadInt64(setupCount); got != 2 {
t.Errorf("expected 2 setups (panic poisoned, rebuild), got %d", got)
}
}
// TestPooledWithEICTunnel_PreservesFnErr pins that errors from the
// inner fn pass through to the caller verbatim — pool wrapping
// should not swallow or transform error semantics for app code.
func TestPooledWithEICTunnel_PreservesFnErr(t *testing.T) {
withPoolSetupStub(t)
poolTTL = 50 * time.Second
// Reset the global pool so this test is isolated from any prior
// test that may have populated it.
globalEICTunnelPool = newEICTunnelPool()
want := errors.New("file does not exist")
got := pooledWithEICTunnel(context.Background(), "i-fn-err",
func(s eicSSHSession) error { return want })
if !errors.Is(got, want) {
t.Errorf("pooledWithEICTunnel returned %v, want %v", got, want)
}
}
@@ -31,11 +31,25 @@ import (
// tests pin the helper's three observable behaviors plus an AST gate
// that catches future re-introductions of the un-checked INSERT.
// lookupChildSQLRE anchors the sqlmock ExpectQuery on every load-bearing
// token of lookupExistingChild's SELECT (org_import.go:639-645). A loose
// substring match (the prior shape, just `SELECT id FROM workspaces`)
// would silent-pass a regression that drops `IS NOT DISTINCT FROM`
// (breaks NULL-parent matching), drops `parent_id` entirely (hijacks
// siblings of the same name across different parents), or drops the
// `status != 'removed'` filter (blocks re-import after Collapse).
// RFC #2872 Important-2.
//
// The four anchored tokens are exactly the predicates the bug shapes
// would tamper with. Whitespace is `\s+` so a future formatter pass
// doesn't churn this string.
const lookupChildSQLRE = `(?s)SELECT id FROM workspaces\s+WHERE name = \$1\s+AND parent_id IS NOT DISTINCT FROM \$2\s+AND status != 'removed'`
func TestLookupExistingChild_NotFound_ReturnsFalseNoError(t *testing.T) {
mock := setupTestDB(t)
// 0-row result → driver returns sql.ErrNoRows on Scan.
parent := "parent-1"
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnRows(sqlmock.NewRows([]string{"id"}))
@@ -56,7 +70,7 @@ func TestLookupExistingChild_NotFound_ReturnsFalseNoError(t *testing.T) {
func TestLookupExistingChild_Found_ReturnsIDAndTrue(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow("ws-existing-uuid"))
@@ -79,7 +93,7 @@ func TestLookupExistingChild_NilParent_MatchesRoot(t *testing.T) {
// a plain `=` would never match a NULL row. Pin that roots
// (parent_id=NULL) are still found by the lookup.
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("RootAgent", (*string)(nil)).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow("ws-root-uuid"))
@@ -102,7 +116,7 @@ func TestLookupExistingChild_DBError_Propagates(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
connFail := errors.New("simulated postgres unavailable")
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnError(connFail)
@@ -137,7 +151,7 @@ func TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
wrapped := fmt.Errorf("driver-wrapped: %w", sql.ErrNoRows)
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnError(wrapped)
@@ -110,10 +110,55 @@ func (s *PostgresMessageStore) List(ctx context.Context, workspaceID string, opt
return nil, false, err
}
// Wire order: oldest-first within the page so canvas (and any
// future client) can render chronologically without per-pair
// reordering. The SQL is `ORDER BY created_at DESC LIMIT N` for
// pagination correctness, and activityRowToChatMessages emits
// [user, agent] within a row — so a naive client-side flat-reverse
// would swap the pair (agent before user at the same timestamp).
// Reversing ROW-AWARE here keeps the wire shape display-ready.
//
// Algorithm: group consecutive same-timestamp messages into row
// chunks (1-2 messages each), reverse the chunk order, flatten.
// Within-row [user, agent] order is preserved. Single-message
// rows (no agent reply yet, or attachments-only) collapse to
// 1-element chunks and still reverse correctly.
messages = reverseRowChunks(messages)
reachedEnd := rowCount < opts.Limit
return messages, reachedEnd, nil
}
// reverseRowChunks groups msgs by adjacent same-Timestamp runs and
// reverses the run order, preserving within-run order. Pairs of
// (user, agent) emitted by activityRowToChatMessages share a
// timestamp, so this keeps each pair internally ordered while
// reversing the row sequence.
func reverseRowChunks(msgs []ChatMessage) []ChatMessage {
if len(msgs) == 0 {
return msgs
}
var chunks [][]ChatMessage
cur := []ChatMessage{msgs[0]}
for i := 1; i < len(msgs); i++ {
if msgs[i].Timestamp == cur[len(cur)-1].Timestamp {
cur = append(cur, msgs[i])
} else {
chunks = append(chunks, cur)
cur = []ChatMessage{msgs[i]}
}
}
chunks = append(chunks, cur)
for i, j := 0, len(chunks)-1; i < j; i, j = i+1, j-1 {
chunks[i], chunks[j] = chunks[j], chunks[i]
}
out := make([]ChatMessage, 0, len(msgs))
for _, chunk := range chunks {
out = append(out, chunk...)
}
return out
}
// queryActivityRows is split from List so unit tests can exercise the
// parser without spinning a real DB. Internal — alternative impls
// shouldn't depend on the SQL shape.
@@ -14,10 +14,13 @@ package messagestore
// legacy source the server replaces; divergence == regression.
import (
"context"
"encoding/json"
"strings"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
)
const fixedTimestamp = "2026-04-25T18:00:00Z"
@@ -282,6 +285,145 @@ func TestChatHistory_NoAgentMessageWhenResponseHasNoTextNoFiles(t *testing.T) {
}
}
// =====================================================================
// List() integration — sqlmock-backed end-to-end via the real handler
// =====================================================================
// TestList_WireOrderIsOldestFirstAcrossPagedRows pins the integration
// invariant: List() returns wire-display-ready messages even though
// the underlying SQL is `ORDER BY created_at DESC`. This is the
// load-bearing test for PR-C-2 — without the row-aware reversal,
// canvas would render every paired bubble in the wrong order on every
// chat reload (agent before user within each timestamp).
//
// Mutation-test cover: removing the `messages = reverseRowChunks(...)`
// call in List() must turn this test red. (The lower-level
// TestReverseRowChunks_PreservesPairOrderAcrossRows pins the helper
// itself; this test pins that List ACTUALLY CALLS the helper.)
func TestList_WireOrderIsOldestFirstAcrossPagedRows(t *testing.T) {
db, mock, err := sqlmock.New()
if err != nil {
t.Fatalf("sqlmock.New: %v", err)
}
defer db.Close()
// Server's SQL is ORDER BY created_at DESC. Build mock rows in
// THAT order so the row-aware reversal has work to do.
rows := sqlmock.NewRows([]string{"created_at", "status", "request_body", "response_body"}).
AddRow(mustParseTime(t, "2026-05-05T00:03:00Z"), "ok",
`{"params":{"message":{"parts":[{"kind":"text","text":"u3"}]}}}`,
`{"result":"a3"}`).
AddRow(mustParseTime(t, "2026-05-05T00:02:00Z"), "ok",
`{"params":{"message":{"parts":[{"kind":"text","text":"u2"}]}}}`,
`{"result":"a2"}`).
AddRow(mustParseTime(t, "2026-05-05T00:01:00Z"), "ok",
`{"params":{"message":{"parts":[{"kind":"text","text":"u1"}]}}}`,
`{"result":"a1"}`)
mock.ExpectQuery(`SELECT created_at, status, request_body::text, response_body::text`).
WillReturnRows(rows)
store := NewPostgresMessageStore(db)
msgs, reachedEnd, err := store.List(context.Background(), "ws-1", ListOptions{Limit: 10})
if err != nil {
t.Fatalf("List: %v", err)
}
wantContents := []string{"u1", "a1", "u2", "a2", "u3", "a3"}
if len(msgs) != len(wantContents) {
t.Fatalf("len(msgs)=%d want %d; got=%v", len(msgs), len(wantContents), msgs)
}
for i, w := range wantContents {
if msgs[i].Content != w {
t.Errorf("idx %d: got %q want %q (full slice ordering broken; reverseRowChunks regressed?)", i, msgs[i].Content, w)
}
}
if !reachedEnd {
t.Errorf("3 rows < limit 10 should reach end, got reachedEnd=false")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("sqlmock expectations: %v", err)
}
}
// =====================================================================
// reverseRowChunks — wire-order helper added in PR-C-2
// =====================================================================
// TestReverseRowChunks_PreservesPairOrderAcrossRows pins the
// row-aware reversal that List() applies before returning. Server's
// SQL is `ORDER BY created_at DESC`, so messages come out
// newest-row-first; activityRowToChatMessages emits [user, agent]
// per row with same timestamp. A naive flat reversal of the messages
// slice would flip each pair (agent before user). reverseRowChunks
// reverses ROWS, preserving pair-internal order. Without this, canvas
// would render every paired bubble in the wrong order on every chat
// reload — the canvas-side reverse used to do the right thing because
// it reversed ROWS BEFORE flattening, but PR-C/D moved the flattening
// into the server, so the row-awareness has to live there too.
func TestReverseRowChunks_PreservesPairOrderAcrossRows(t *testing.T) {
// Build messages newest-row-first as List() collects them. Each
// row is a pair sharing a timestamp, with [user, agent] order.
in := []ChatMessage{
{Role: "user", Content: "user_3", Timestamp: "2026-05-05T00:03:00Z"},
{Role: "agent", Content: "agent_3", Timestamp: "2026-05-05T00:03:00Z"},
{Role: "user", Content: "user_2", Timestamp: "2026-05-05T00:02:00Z"},
{Role: "agent", Content: "agent_2", Timestamp: "2026-05-05T00:02:00Z"},
{Role: "user", Content: "user_1", Timestamp: "2026-05-05T00:01:00Z"},
{Role: "agent", Content: "agent_1", Timestamp: "2026-05-05T00:01:00Z"},
}
got := reverseRowChunks(in)
want := []struct {
role, content string
}{
{"user", "user_1"}, {"agent", "agent_1"},
{"user", "user_2"}, {"agent", "agent_2"},
{"user", "user_3"}, {"agent", "agent_3"},
}
if len(got) != len(want) {
t.Fatalf("len(got)=%d len(want)=%d", len(got), len(want))
}
for i, w := range want {
if got[i].Role != w.role || got[i].Content != w.content {
t.Errorf("idx %d: got role=%q content=%q want role=%q content=%q",
i, got[i].Role, got[i].Content, w.role, w.content)
}
}
}
// TestReverseRowChunks_HandlesSingleMessageRows pins the case where
// a row has only a user OR only an agent message (e.g., agent reply
// not yet recorded, attachments-only user upload). Naive reversal
// still works for single-message chunks; the test guards against a
// future change that special-cases the 2-message-row path.
func TestReverseRowChunks_HandlesSingleMessageRows(t *testing.T) {
in := []ChatMessage{
{Role: "user", Content: "u3", Timestamp: "2026-05-05T00:03:00Z"},
{Role: "user", Content: "u2", Timestamp: "2026-05-05T00:02:00Z"}, // single, no agent
{Role: "agent", Content: "a2", Timestamp: "2026-05-05T00:02:00Z"},
{Role: "user", Content: "u1", Timestamp: "2026-05-05T00:01:00Z"},
}
got := reverseRowChunks(in)
wantContents := []string{"u1", "u2", "a2", "u3"}
if len(got) != len(wantContents) {
t.Fatalf("len got=%d want=%d", len(got), len(wantContents))
}
for i, w := range wantContents {
if got[i].Content != w {
t.Errorf("idx %d: got %q want %q", i, got[i].Content, w)
}
}
}
// TestReverseRowChunks_EmptyInput returns nil/empty without panic.
func TestReverseRowChunks_EmptyInput(t *testing.T) {
got := reverseRowChunks(nil)
if len(got) != 0 {
t.Errorf("nil input should return empty, got %v", got)
}
}
// =====================================================================
// end-to-end shape — paired user + agent with same timestamp
// =====================================================================
@@ -35,36 +35,37 @@ import (
// drift-risk #6.
var ErrNoBackend = errors.New("provisioner: no backend configured (zero-valued receiver)")
// RuntimeImages maps runtime names to their Docker image refs on GHCR.
// RuntimeImages maps runtime names to their Docker image refs.
// Each standalone template repo publishes its image via the reusable
// publish-template-image workflow in molecule-ci on every main merge.
// The provisioner pulls these on demand (see ensureImageLocal) — no
// pre-build step on the tenant host.
//
// The registry prefix is determined by RegistryPrefix() in registry.go;
// defaults to ghcr.io/molecule-ai (upstream OSS) and is overridden via the
// MOLECULE_IMAGE_REGISTRY env var in production tenants that mirror to
// AWS ECR or another registry. The map is computed at package init and
// captures whatever prefix was active then.
//
// Legacy local-build path (`docker build -t workspace-template:<runtime>`
// via scripts/build-images.sh) is still supported for development:
// when a bare `workspace-template:<runtime>` image is present locally,
// Docker's image resolver matches it before any pull is attempted. Set
// the env var WORKSPACE_IMAGE_LOCAL_OVERRIDE=1 (enforced by callers) to
// short-circuit pulls entirely if needed.
var RuntimeImages = map[string]string{
"langgraph": "ghcr.io/molecule-ai/workspace-template-langgraph:latest",
"claude-code": "ghcr.io/molecule-ai/workspace-template-claude-code:latest",
"openclaw": "ghcr.io/molecule-ai/workspace-template-openclaw:latest",
"deepagents": "ghcr.io/molecule-ai/workspace-template-deepagents:latest",
"crewai": "ghcr.io/molecule-ai/workspace-template-crewai:latest",
"autogen": "ghcr.io/molecule-ai/workspace-template-autogen:latest",
"hermes": "ghcr.io/molecule-ai/workspace-template-hermes:latest", // Hermes (Nous Research) — real hermes-agent behind A2A bridge
"gemini-cli": "ghcr.io/molecule-ai/workspace-template-gemini-cli:latest", // Google Gemini CLI
}
var RuntimeImages = computeRuntimeImages()
// DefaultImage is the fallback workspace Docker image (langgraph is the
// most common runtime). Computed via RegistryPrefix() so the prefix
// override applies to the fallback path too.
//
// NOTE: Every runtime MUST have an entry in knownRuntimes (registry.go).
// If a runtime is missing, it falls back to DefaultImage which may have
// wrong deps. Add new runtimes to knownRuntimes AND create the standalone
// template repo.
var DefaultImage = RuntimeImage(defaultRuntime)
const (
// DefaultImage is the fallback workspace Docker image (langgraph is the most common runtime).
DefaultImage = "ghcr.io/molecule-ai/workspace-template-langgraph:latest"
// NOTE: Every runtime MUST have an entry in RuntimeImages above. If a runtime is missing,
// it falls back to DefaultImage which may have wrong deps. Add new runtimes to both
// RuntimeImages AND create the standalone template repo.
// DefaultNetwork is the Docker network workspaces join.
DefaultNetwork = "molecule-monorepo-net"
@@ -0,0 +1,95 @@
package provisioner
import (
"fmt"
"os"
)
// defaultRegistryPrefix is the upstream OSS face for all workspace template
// images. Self-hosted Molecule deployments without the MOLECULE_IMAGE_REGISTRY
// override pull from here.
const defaultRegistryPrefix = "ghcr.io/molecule-ai"
// knownRuntimes is the canonical list of workspace template runtimes shipped
// in main. Any runtime added here MUST also have a standalone template repo
// (Molecule-AI/molecule-ai-workspace-template-<name>) and an entry in the
// publish-template-image workflow that builds it.
//
// Order matters for deterministic test snapshots; keep alphabetical.
var knownRuntimes = []string{
"autogen",
"claude-code",
"codex",
"crewai",
"deepagents",
"gemini-cli",
"hermes",
"langgraph",
"openclaw",
}
// defaultRuntime is the fallback when a workspace's config doesn't specify a
// runtime. Picked because LangGraph is the most common in our org templates
// and has the smallest "first impression" cold-start surface.
const defaultRuntime = "langgraph"
// RegistryPrefix returns the registry prefix all workspace-template image
// references should use. Defaults to ghcr.io/molecule-ai (the upstream OSS
// face) and is overridden by the MOLECULE_IMAGE_REGISTRY env var in
// production tenants where we mirror images to a private registry.
//
// The override is set at deploy time (Railway env, EC2 user-data) — never
// from user-supplied input — so the value is trusted by the time it reaches
// this code. Validation is deliberately minimal: an operator-supplied
// prefix that points at a registry the EC2 can't authenticate to will fail
// loudly at docker-pull time, which is the right blast radius.
//
// Example values:
//
// (unset) → ghcr.io/molecule-ai (OSS default)
// "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai" → AWS ECR mirror
// "git.moleculesai.app/molecule-ai" → self-hosted Gitea Container Registry (future)
//
// Auth is registry-specific and configured outside this function:
// - GHCR: GHCR_USER/GHCR_TOKEN env vars consumed by ghcrAuthHeader()
// - ECR: docker credential helper (amazon-ecr-credential-helper) configured
// in EC2 user-data; ~/.docker/config.json has credHelpers entry; the
// daemon resolves auth automatically on every pull.
func RegistryPrefix() string {
if v := os.Getenv("MOLECULE_IMAGE_REGISTRY"); v != "" {
return v
}
return defaultRegistryPrefix
}
// RuntimeImage returns the canonical image reference for the given runtime,
// using the current RegistryPrefix() and the moving `:latest` tag.
//
// For SHA-pinned references (production thin-AMI launches), the
// runtime_image_pins lookup in handlers/runtime_image_pin.go strips the
// `:latest` suffix and appends an immutable `@sha256:<digest>` from the DB.
// That code path naturally inherits any RegistryPrefix() change because it
// reads from RuntimeImages[runtime] and only re-formats the tag suffix.
//
// Returns the empty string for unknown runtimes; callers should fall through
// to DefaultImage in that case (matching legacy behavior).
func RuntimeImage(runtime string) string {
for _, r := range knownRuntimes {
if r == runtime {
return fmt.Sprintf("%s/workspace-template-%s:latest", RegistryPrefix(), runtime)
}
}
return ""
}
// computeRuntimeImages returns the {runtime: image-ref} map evaluated against
// the current RegistryPrefix(). Called at package init to populate the
// exported RuntimeImages var. Tests that flip MOLECULE_IMAGE_REGISTRY between
// expected values use this helper to rebuild the map mid-run.
func computeRuntimeImages() map[string]string {
out := make(map[string]string, len(knownRuntimes))
for _, r := range knownRuntimes {
out[r] = RuntimeImage(r)
}
return out
}
@@ -0,0 +1,140 @@
package provisioner
import (
"strings"
"testing"
)
// TestRegistryPrefix_DefaultsToGHCR pins the OSS-default behavior. If a future
// refactor accidentally drops the default, OSS users self-hosting Molecule
// would silently lose image pulls — this test should fail loudly instead.
func TestRegistryPrefix_DefaultsToGHCR(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
got := RegistryPrefix()
want := "ghcr.io/molecule-ai"
if got != want {
t.Fatalf("RegistryPrefix() = %q, want %q (default must remain GHCR for OSS users)", got, want)
}
}
// TestRegistryPrefix_RespectsEnv verifies the override path used in
// production tenants where MOLECULE_IMAGE_REGISTRY points at a private
// mirror (AWS ECR, self-hosted Harbor, etc.).
func TestRegistryPrefix_RespectsEnv(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai")
got := RegistryPrefix()
want := "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai"
if got != want {
t.Fatalf("RegistryPrefix() = %q, want %q (env override path is the production cutover mechanism)", got, want)
}
}
// TestRegistryPrefix_EmptyEnvFallsBackToDefault — guard against an operator
// setting MOLECULE_IMAGE_REGISTRY="" by mistake (e.g. unset deploy variable
// becomes empty string, not literally absent). We treat "" as "use default"
// so a misconfigured env doesn't mean an empty registry prefix.
func TestRegistryPrefix_EmptyEnvFallsBackToDefault(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
if RegistryPrefix() != defaultRegistryPrefix {
t.Fatalf("empty MOLECULE_IMAGE_REGISTRY should fall back to %q, got %q", defaultRegistryPrefix, RegistryPrefix())
}
}
// TestRuntimeImage_AllKnownRuntimes — every runtime in the canonical list
// must produce a properly-formatted image ref. If a new runtime is added to
// knownRuntimes but the format changes, this catches it.
func TestRuntimeImage_AllKnownRuntimes(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
for _, r := range knownRuntimes {
got := RuntimeImage(r)
want := "ghcr.io/molecule-ai/workspace-template-" + r + ":latest"
if got != want {
t.Errorf("RuntimeImage(%q) = %q, want %q", r, got, want)
}
}
// Pin the count so adding a runtime requires explicit test acknowledgement.
if len(knownRuntimes) != 9 {
t.Errorf("knownRuntimes length = %d, want 9 (autogen, claude-code, codex, crewai, deepagents, gemini-cli, hermes, langgraph, openclaw)", len(knownRuntimes))
}
}
// TestRuntimeImage_UnknownRuntime — defensive: callers must fall back to
// DefaultImage when a runtime is unknown, never silently use the wrong
// prefix. Returning "" enforces an explicit fallback at every call site.
func TestRuntimeImage_UnknownRuntime(t *testing.T) {
for _, name := range []string{"", "nonexistent", "WORKSPACE-TEMPLATE-FAKE", "../../../etc/passwd"} {
if got := RuntimeImage(name); got != "" {
t.Errorf("RuntimeImage(%q) = %q, want empty string for unknown runtime", name, got)
}
}
}
// TestRuntimeImage_RegistryOverrideAppliesToAllRuntimes — the override
// flips ALL runtimes consistently. If a refactor accidentally hardcoded
// the prefix in some runtimes but not others (the failure mode that
// triggered this whole rollout), this test catches it.
func TestRuntimeImage_RegistryOverrideAppliesToAllRuntimes(t *testing.T) {
const ecr = "999999999999.dkr.ecr.us-east-2.amazonaws.com/molecule-ai"
t.Setenv("MOLECULE_IMAGE_REGISTRY", ecr)
for _, r := range knownRuntimes {
got := RuntimeImage(r)
if !strings.HasPrefix(got, ecr+"/workspace-template-") {
t.Errorf("RuntimeImage(%q) = %q, must start with override prefix %q", r, got, ecr)
}
if !strings.HasSuffix(got, ":latest") {
t.Errorf("RuntimeImage(%q) = %q, must keep :latest tag suffix", r, got)
}
}
}
// TestComputeRuntimeImages_AllRuntimesPresent — the map must contain every
// known runtime. Drift between knownRuntimes and computeRuntimeImages would
// silently break the runtime → image lookup that provisioner.Start uses.
func TestComputeRuntimeImages_AllRuntimesPresent(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
m := computeRuntimeImages()
if len(m) != len(knownRuntimes) {
t.Fatalf("computeRuntimeImages() has %d entries, want %d (one per knownRuntime)", len(m), len(knownRuntimes))
}
for _, r := range knownRuntimes {
img, ok := m[r]
if !ok {
t.Errorf("computeRuntimeImages() missing runtime %q", r)
continue
}
if img == "" {
t.Errorf("computeRuntimeImages()[%q] is empty", r)
}
}
}
// TestComputeRuntimeImages_ReflectsCurrentEnv — calling computeRuntimeImages
// after env change rebuilds the map with new prefix. Tests + ops procedures
// that flip the env in-process rely on this.
func TestComputeRuntimeImages_ReflectsCurrentEnv(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
defaultMap := computeRuntimeImages()
if !strings.HasPrefix(defaultMap["claude-code"], "ghcr.io/molecule-ai/") {
t.Fatalf("default map should be GHCR-prefixed, got %q", defaultMap["claude-code"])
}
const mirror = "registry.example.com/molecule-ai"
t.Setenv("MOLECULE_IMAGE_REGISTRY", mirror)
mirrorMap := computeRuntimeImages()
if !strings.HasPrefix(mirrorMap["claude-code"], mirror+"/") {
t.Fatalf("mirror-prefixed map should start with %q, got %q", mirror, mirrorMap["claude-code"])
}
}
// TestKnownRuntimes_AlphabeticalOrder — pin the order so test snapshots
// (and human readers diffing the file) see deterministic output. Adding a
// new runtime out of alphabetical order will fail this test, which is the
// nudge to keep the file readable.
func TestKnownRuntimes_AlphabeticalOrder(t *testing.T) {
for i := 1; i < len(knownRuntimes); i++ {
if knownRuntimes[i-1] >= knownRuntimes[i] {
t.Errorf("knownRuntimes not alphabetical: %q comes before %q", knownRuntimes[i-1], knownRuntimes[i])
}
}
}
@@ -0,0 +1,149 @@
package registry
// cp_orphan_sweeper.go — SaaS-mode counterpart to orphan_sweeper.go.
//
// The Docker sweeper (StartOrphanSweeper) runs only when prov != nil
// (single-tenant Docker mode); SaaS tenants run cpProv != nil and prov
// == nil, so they get no sweep coverage from that path. This file fills
// the gap for the deprovision split-write race documented in #2989:
//
// 1. handlers/workspace_crud.go:365 marks workspaces.status = 'removed'.
// 2. workspace_crud.go:439 calls StopWorkspaceAuto → cpProv.Stop, which
// issues DELETE /cp/workspaces/:id?instance_id=… to controlplane.
// 3. If step 2 fails (CP transient 5xx, network blip, AWS hiccup), the
// inline path returns a 500 to the canvas — but the DB row is already
// at status='removed' with instance_id still populated. There's no
// retry, and the EC2 lives forever.
//
// This sweeper closes that gap by re-issuing cpProv.Stop on every cycle
// for any workspace at status='removed' with a non-NULL instance_id.
// Stop is idempotent: AWS TerminateInstance on an already-terminated
// instance is a no-op (per AWS docs), and CP's Deprovision handler
// (controlplane/internal/handlers/workspace_provision.go:289) handles
// the already-terminated and already-deleted-DNS cases via best-effort
// guards. On Stop success, the sweeper clears instance_id so the next
// cycle skips the row.
//
// Cadence + safety filters mirror the Docker sweeper:
// - 60s tick (OrphanSweepInterval)
// - 30s per-cycle deadline (orphanSweepDeadline)
// - LIMIT 100 per cycle so a sustained CP outage that backs up many
// orphans doesn't blow the request timeout; subsequent cycles drain.
//
// SSOT note: Stop's idempotency (no-op on empty instance_id, AWS
// terminate on already-terminated) is the load-bearing invariant. Any
// future change that adds non-idempotent side effects to cpProv.Stop
// must also gate this sweeper, or it will re-execute those side effects
// every 60s for every cleared-but-not-yet-NULL row.
import (
"context"
"log"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// CPOrphanReaper is the dependency the SaaS-mode sweeper takes from
// the CP provisioner. *provisioner.CPProvisioner satisfies this
// naturally; tests inject fakes.
type CPOrphanReaper interface {
Stop(ctx context.Context, workspaceID string) error
}
// cpSweepLimit caps the per-cycle row count so a sustained CP outage
// can't make a single sweep cycle blow orphanSweepDeadline. With a
// 60s cadence and 100-row limit, drain rate is up to 100 orphans/min,
// which has never been approached even during the worst leak windows.
const cpSweepLimit = 100
// StartCPOrphanSweeper runs the SaaS-mode reconcile loop until ctx is
// cancelled. nil reaper makes the loop a no-op (matches the Docker
// sweeper's nil-tolerant pattern).
//
// Caller is expected to gate on `cpProv != nil` (matching how
// StartOrphanSweeper is gated on `prov != nil` at the call site in
// cmd/server/main.go) — passing a nil *CPProvisioner here would also
// short-circuit but the gate at the wiring site keeps the call shape
// symmetric across the two sweepers.
func StartCPOrphanSweeper(ctx context.Context, reaper CPOrphanReaper) {
if reaper == nil {
log.Println("CP orphan sweeper: reaper is nil — sweeper disabled")
return
}
log.Printf("CP orphan sweeper started — reconciling every %s", OrphanSweepInterval)
ticker := time.NewTicker(OrphanSweepInterval)
defer ticker.Stop()
cpSweepOnce(ctx, reaper)
for {
select {
case <-ctx.Done():
log.Println("CP orphan sweeper: shutdown")
return
case <-ticker.C:
cpSweepOnce(ctx, reaper)
}
}
}
// cpSweepOnce executes one reconcile pass. Defensive against db.DB
// being nil so a misconfigured boot doesn't panic.
func cpSweepOnce(parent context.Context, reaper CPOrphanReaper) {
if db.DB == nil {
return
}
ctx, cancel := context.WithTimeout(parent, orphanSweepDeadline)
defer cancel()
rows, err := db.DB.QueryContext(ctx, `
SELECT id::text
FROM workspaces
WHERE status = 'removed'
AND instance_id IS NOT NULL
AND instance_id != ''
ORDER BY updated_at DESC
LIMIT $1
`, cpSweepLimit)
if err != nil {
log.Printf("CP orphan sweeper: DB query failed: %v", err)
return
}
defer rows.Close()
var orphanIDs []string
for rows.Next() {
var id string
if scanErr := rows.Scan(&id); scanErr != nil {
log.Printf("CP orphan sweeper: row scan failed: %v", scanErr)
continue
}
orphanIDs = append(orphanIDs, id)
}
if iterErr := rows.Err(); iterErr != nil {
log.Printf("CP orphan sweeper: rows iteration failed: %v", iterErr)
return
}
for _, id := range orphanIDs {
log.Printf("CP orphan sweeper: terminating leaked EC2 for removed workspace %s", id)
if stopErr := reaper.Stop(ctx, id); stopErr != nil {
// CP-side error — transient 5xx, network, AWS hiccup. Leave
// instance_id populated so the next cycle retries. Loud-fail
// only at the log layer; the user-visible 500 was already
// returned by the inline path that triggered this orphan.
log.Printf("CP orphan sweeper: Stop failed for %s: %v — retry next cycle", id, stopErr)
continue
}
// Stop succeeded — clear instance_id so the next cycle skips this
// row. We can't use a tombstone column (no schema change in this
// PR); NULL'ing instance_id is the SSOT signal for "no live
// EC2 attached." The matching SELECT predicate above stays in
// sync with this UPDATE.
if _, updErr := db.DB.ExecContext(ctx,
`UPDATE workspaces SET instance_id = NULL, updated_at = now() WHERE id = $1`,
id,
); updErr != nil {
log.Printf("CP orphan sweeper: clear instance_id failed for %s: %v — next cycle will re-Stop (idempotent)", id, updErr)
}
}
}
@@ -0,0 +1,266 @@
package registry
import (
"context"
"errors"
"sync"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// fakeCPReaper is a hand-rolled CPOrphanReaper for the SaaS-mode
// sweeper tests. Records every Stop call so tests can assert which
// workspace IDs were re-issued.
type fakeCPReaper struct {
mu sync.Mutex
stopErr map[string]error
stopCalls []string
}
func (f *fakeCPReaper) Stop(_ context.Context, wsID string) error {
f.mu.Lock()
defer f.mu.Unlock()
f.stopCalls = append(f.stopCalls, wsID)
return f.stopErr[wsID]
}
// TestCPSweepOnce_StopSucceeds_ClearsInstanceID — happy path. Single
// removed-row with non-NULL instance_id; Stop succeeds; instance_id
// gets NULL'd so the next cycle won't re-sweep it.
func TestCPSweepOnce_StopSucceeds_ClearsInstanceID(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{}
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces\s+WHERE status = 'removed'\s+AND instance_id IS NOT NULL\s+AND instance_id != ''\s+ORDER BY updated_at DESC\s+LIMIT \$1`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow("ws-uuid-1"))
mock.ExpectExec(`UPDATE workspaces SET instance_id = NULL, updated_at = now\(\) WHERE id = \$1`).
WithArgs("ws-uuid-1").
WillReturnResult(sqlmock.NewResult(0, 1))
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 1 || reaper.stopCalls[0] != "ws-uuid-1" {
t.Fatalf("expected Stop(ws-uuid-1), got %v", reaper.stopCalls)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestCPSweepOnce_StopFails_KeepsInstanceID — CP transient failure.
// Stop returns an error; instance_id MUST stay populated so the next
// cycle retries. UPDATE must NOT fire.
func TestCPSweepOnce_StopFails_KeepsInstanceID(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{
stopErr: map[string]error{"ws-uuid-1": errors.New("CP returned 503")},
}
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow("ws-uuid-1"))
// No ExpectExec for the UPDATE — sqlmock fails the test if the
// UPDATE fires.
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 1 || reaper.stopCalls[0] != "ws-uuid-1" {
t.Fatalf("expected Stop(ws-uuid-1), got %v", reaper.stopCalls)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations (UPDATE should NOT have fired): %v", err)
}
}
// TestCPSweepOnce_NoOrphans — empty result set is the steady state in
// healthy operation. No Stop, no UPDATE.
func TestCPSweepOnce_NoOrphans(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{}
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}))
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 0 {
t.Fatalf("expected zero Stop calls, got %v", reaper.stopCalls)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestCPSweepOnce_MultipleOrphans — all rows in the batch get Stop'd
// independently; one failure doesn't block others.
func TestCPSweepOnce_MultipleOrphans(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{
stopErr: map[string]error{"ws-uuid-2": errors.New("CP 503 on ws-uuid-2")},
}
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}).
AddRow("ws-uuid-1").
AddRow("ws-uuid-2").
AddRow("ws-uuid-3"))
// ws-uuid-1 succeeds → UPDATE fires.
mock.ExpectExec(`UPDATE workspaces SET instance_id = NULL`).
WithArgs("ws-uuid-1").
WillReturnResult(sqlmock.NewResult(0, 1))
// ws-uuid-2 fails → no UPDATE.
// ws-uuid-3 succeeds → UPDATE fires.
mock.ExpectExec(`UPDATE workspaces SET instance_id = NULL`).
WithArgs("ws-uuid-3").
WillReturnResult(sqlmock.NewResult(0, 1))
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 3 {
t.Fatalf("expected Stop on all 3 ids, got %v", reaper.stopCalls)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestCPSweepOnce_QueryError — DB transient failure. Sweep returns
// without panicking. No Stop calls.
func TestCPSweepOnce_QueryError(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{}
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnError(errors.New("connection refused"))
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 0 {
t.Fatalf("expected zero Stop calls on query error, got %v", reaper.stopCalls)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestCPSweepOnce_UpdateError_LogsButContinues — Stop succeeded but
// the UPDATE to clear instance_id failed. Subsequent rows in the batch
// must still process; comment in cpSweepOnce promises idempotent re-Stop
// next cycle.
func TestCPSweepOnce_UpdateError_LogsButContinues(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{}
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}).
AddRow("ws-uuid-1").
AddRow("ws-uuid-2"))
mock.ExpectExec(`UPDATE workspaces SET instance_id = NULL`).
WithArgs("ws-uuid-1").
WillReturnError(errors.New("UPDATE timeout"))
mock.ExpectExec(`UPDATE workspaces SET instance_id = NULL`).
WithArgs("ws-uuid-2").
WillReturnResult(sqlmock.NewResult(0, 1))
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 2 {
t.Fatalf("expected Stop on both ids despite UPDATE error on first, got %v", reaper.stopCalls)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Fatalf("unmet expectations: %v", err)
}
}
// TestCPSweepOnce_NilDB — defensive against db.DB being nil. Must not
// panic; must not call Stop.
func TestCPSweepOnce_NilDB(t *testing.T) {
saved := db.DB
db.DB = nil
t.Cleanup(func() { db.DB = saved })
reaper := &fakeCPReaper{}
cpSweepOnce(context.Background(), reaper)
if len(reaper.stopCalls) != 0 {
t.Fatalf("expected zero Stop calls when db.DB is nil, got %v", reaper.stopCalls)
}
}
// TestStartCPOrphanSweeper_NilReaperDisabled — boot-safety: a SaaS CP
// without cpProv configured must not start the loop (immediate return,
// no goroutine leak).
func TestStartCPOrphanSweeper_NilReaperDisabled(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
done := make(chan struct{})
go func() {
StartCPOrphanSweeper(ctx, nil)
close(done)
}()
select {
case <-done:
// expected — nil reaper short-circuits.
case <-time.After(500 * time.Millisecond):
t.Fatal("StartCPOrphanSweeper(nil) did not return immediately")
}
}
// TestStartCPOrphanSweeper_RunsOnceImmediatelyAndOnTick — cadence
// contract: kick off one sweep at boot (so a platform restart starts
// healing immediately), then once per OrphanSweepInterval. Verifies
// the loop terminates on ctx cancel.
func TestStartCPOrphanSweeper_RunsOnceImmediatelyAndOnTick(t *testing.T) {
mock := setupTestDB(t)
reaper := &fakeCPReaper{}
// Two sweeps within the test window: one immediate, one on the
// first tick. We can't shrink OrphanSweepInterval (it's a const),
// so assert "at least one immediate sweep" and let cancel close
// the loop.
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}))
// The ticker may or may not fire in the test window depending on
// scheduler; tolerate both shapes by registering a second optional
// expectation. sqlmock fails on UNREGISTERED queries, so register
// one more then accept either 1 or 2 fires.
mock.ExpectQuery(`(?s)^\s*SELECT id::text\s+FROM workspaces`).
WithArgs(cpSweepLimit).
WillReturnRows(sqlmock.NewRows([]string{"id"}))
ctx, cancel := context.WithCancel(context.Background())
done := make(chan struct{})
go func() {
StartCPOrphanSweeper(ctx, reaper)
close(done)
}()
// 100ms is well past the boot-sweep but well shy of the 60s
// interval, so the second query expectation is intentionally
// unmet — that's fine, sqlmock distinguishes "expected but not
// received" (we don't enforce here) from "unexpected query"
// (which would fail).
time.Sleep(100 * time.Millisecond)
cancel()
select {
case <-done:
// expected
case <-time.After(2 * time.Second):
t.Fatal("StartCPOrphanSweeper did not exit on ctx cancel")
}
// Boot sweep must have happened — without it, an operator restart
// after a CP outage would leave a 60s gap before the first heal.
// We don't assert mock.ExpectationsWereMet() here because the
// second query is intentionally optional.
}
+1 -1
View File
@@ -2,7 +2,7 @@
# build-all.sh — Rebuild base image and optionally adapter images.
#
# NOTE: Adapters have been extracted to standalone template repos:
# https://github.com/Molecule-AI/molecule-ai-workspace-template-<runtime>
# https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime>
#
# This script now only builds the base image from workspace/Dockerfile.
# Each adapter repo has its own Dockerfile that installs molecule-ai-workspace-runtime