molecule-core

Author	SHA1	Message	Date
claude-ceo-assistant	3de51faa19	fix(org-import): reconcile mode + audit-event emission Closes the additive-import zombie bug: re-running /org/import with a tree shape that reparents same-named roles left the prior workspace online because
lookupExistingChild's dedupe is parent-scoped (different parent_id → "different" workspace). Caught 2026-05-08 after a dev-tree re-import left 8 orphans co-existing with the new tree on canvas until manual cascade-delete. Three layers in this PR: - mode="reconcile" on /org/import — after the import loop, online workspaces whose name matches an imported name but whose id isn't in the result set are cascade-deleted. Default mode "" / "merge" preserves existing additive behavior. Empty-set guards prevent accidental "delete everything" if either array comes up empty. - WorkspaceHandler.CascadeDelete extracted as a callable helper from the existing Delete HTTP handler so OrgImport's reconcile path shares the same teardown sequence (#73 race guard, container stop, volume removal, token revocation, schedule disable, event broadcast). The HTTP Delete handler still inlines the same logic; deduplication tracked as tech-debt follow-up. - emitOrgEvent(structure_events) records org.import.started + org.import.completed with mode, created/skipped/reconcile_removed counts, duration_ms, error. Replaces the lost-on-restart stdout-only log shape for an audit-trail surface that's queryable by SQL. Closes the "what happened at 20:13?" debugging gap that motivated this fix. Verified live against the local platform: cascade-delete on an old tree's removed root cleared 8 surviving orphans; mode="reconcile" with a freshly-INSERTed fake orphan removed exactly the fake; idempotent re-run of reconcile is a no-op (0 removed, no errors); structure_events captures every started+completed pair with full payload. 7 new unit tests (walkOrgWorkspaceNames flat/nested/spawning:false/ empty-name; emitOrgEvent success + DB-error-swallow; errString). Full handler suite green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 15:04:47 -07:00
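The reconcile selection described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `Workspace` struct and the `reconcileTargets` name are hypothetical, and reading "either array" as the imported-name list and the result-ID list is an assumption.

```go
package main

import "fmt"

// Workspace is a hypothetical, trimmed-down row shape used only for this sketch.
type Workspace struct {
	ID     string
	Name   string
	Online bool
}

// reconcileTargets returns the IDs of online workspaces whose name matches an
// imported name but whose ID is not part of the import result set.
// If either input list is empty it returns nil, mirroring the empty-set guard
// that prevents an accidental "delete everything".
func reconcileTargets(existing []Workspace, importedNames, resultIDs []string) []string {
	if len(importedNames) == 0 || len(resultIDs) == 0 {
		return nil
	}
	names := make(map[string]bool, len(importedNames))
	for _, n := range importedNames {
		names[n] = true
	}
	kept := make(map[string]bool, len(resultIDs))
	for _, id := range resultIDs {
		kept[id] = true
	}
	var targets []string
	for _, ws := range existing {
		if ws.Online && names[ws.Name] && !kept[ws.ID] {
			targets = append(targets, ws.ID)
		}
	}
	return targets
}

func main() {
	existing := []Workspace{
		{ID: "old-1", Name: "Dev Lead", Online: true}, // orphan left by a reparenting re-import
		{ID: "new-1", Name: "Dev Lead", Online: true}, // workspace returned by the current import
		{ID: "other", Name: "Unrelated", Online: true},
	}
	fmt.Println(reconcileTargets(existing, []string{"Dev Lead"}, []string{"new-1"}))
}
```

Matching on name rather than on (name, parent_id) is what lets this pass see orphans that the parent-scoped dedupe treats as different workspaces.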
claude-ceo-assistant	6f861926bd	Merge pull request 'fix(workspace_provision): preserve MODEL secret over MODEL_PROVIDER slug on restart' (#136) from fix/preserve-model-secret-on-restart into main	2026-05-08 21:31:50 +00:00
claude-ceo-assistant	15c5f32491	fix(workspace_provision): preserve MODEL secret over MODEL_PROVIDER slug on restart Phase 4 follow-up to template-claude-code PR #9 (2026-05-08 dev-tree wedge). Pre-fix: applyRuntimeModelEnv unconditionally overwrote envVars["MODEL"] with the MODEL_PROVIDER slug whenever payload.Model was empty (the restart path). This silently wiped the operator's explicit per-persona MODEL secret on every restart. Symptom: dev-tree workspaces booted correctly on first /org/import (the envVars map was populated direct from the persona env file with both MODEL=MiniMax-M2.7-highspeed and MODEL_PROVIDER=minimax), then on the next Restart the MODEL secret got clobbered to literal "minimax" (a provider slug, not a valid model id), and the workspace template's adapter failed to match any registry prefix, fell through to providers[0] (anthropic-oauth), and wedged at SDK initialize. Fix: resolution order in applyRuntimeModelEnv is now: 1. payload.Model (caller passed the canvas-picked model id verbatim) 2. envVars["MODEL"] (workspace_secret persisted from persona env) 3. envVars["MODEL_PROVIDER"] (legacy canvas Save+Restart shape) Tests ----- TestApplyRuntimeModelEnv_PersonaEnvMODELSecretPreserved locks in the new resolution order with four cases: - MODEL secret wins over MODEL_PROVIDER slug (persona-env shape) - MODEL secret wins even when same as MODEL_PROVIDER - MODEL absent → fall back to MODEL_PROVIDER (legacy shape) - Both absent → no MODEL set (no-op) Existing TestApplyRuntimeModelEnv_SetsUniversalMODELForAllRuntimes continues to pass; the fix is strictly additive on the precedence chain.	2026-05-08 14:31:14 -07:00
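The three-step resolution order in the commit above can be illustrated with a short sketch. It assumes only what the message states (the precedence and the no-op case). The function name `applyModelPrecedence` is hypothetical and is not the real applyRuntimeModelEnv signature.

```go
package main

import "fmt"

// applyModelPrecedence is a hypothetical stand-in that applies the stated order:
//  1. payloadModel (model id passed by the caller)
//  2. envVars["MODEL"] (secret persisted from the persona env)
//  3. envVars["MODEL_PROVIDER"] (legacy fallback)
// When all three are empty, envVars is left untouched.
func applyModelPrecedence(payloadModel string, envVars map[string]string) {
	switch {
	case payloadModel != "":
		envVars["MODEL"] = payloadModel
	case envVars["MODEL"] != "":
		// Existing MODEL secret wins over the provider slug; nothing to do.
	case envVars["MODEL_PROVIDER"] != "":
		envVars["MODEL"] = envVars["MODEL_PROVIDER"]
	}
}

func main() {
	// Restart path: payload model is empty, persona env carries both keys.
	env := map[string]string{
		"MODEL":          "MiniMax-M2.7-highspeed",
		"MODEL_PROVIDER": "minimax",
	}
	applyModelPrecedence("", env)
	fmt.Println(env["MODEL"])
}
```

Reading a missing key from a Go map returns the empty string without inserting it, so the "both absent" case really does leave the map without a MODEL entry.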
claude-ceo-assistant	9b5e89bb42	Merge pull request 'feat(org-import): add spawning:false field to skip workspace + descendants' (#135) from feat/org-import-spawning-false into main	2026-05-08 21:20:56 +00:00
claude-ceo-assistant	b91da1ab77	feat(org-import): add spawning:false field to skip workspace + descendants Lets a workspace declare it (and its entire subtree) should be skipped during /org/import. Pointer-typed `*bool` so we distinguish "explicitly false" from "unset" (default = spawn). ## Use case The dev-tree org template ships the full role taxonomy (Dev Lead with Core Platform / Controlplane / App & Docs / Infra / SDK Leads, each with their own engineering / QA / security / UI-UX children; 27 personas total in a single import). Some setups need a smaller set: - Local dev on a memory-constrained machine - Demo / smoke runs that don't need the full org breathing - Customer trials starting with leadership-only before fan-out Pre-fix the only options were: - Edit the canonical template (mutates shared state) - Author a parallel slimmer template (duplicates structure) - Manual workspace deprovision after full import (wasteful: already paid the docker pull / build cost) `spawning: false` is the per-workspace knob that solves this without touching the canonical template structure. ## Semantics - Unset: workspace spawns (current behaviour, no migration) - `spawning: true`: explicitly spawns (same as unset) - `spawning: false`: workspace is skipped AND every descendant is skipped. The guard sits BEFORE any side effect in createWorkspaceTree: no DB row, no docker provision, no children recursion. A false-spawning subtree is genuinely a no-op except for the log line. countWorkspaces still counts the subtree (so /org/templates numbers reflect the full structure).
## Stage A — verified Local dev-only template that wraps teams/dev.yaml (Dev Lead) with children:[] cleared on the 5 sub-team yaml files, plus 3 floater personas (Release Manager / Integration Tester / Fullstack Engineer). /org/import returned 9 workspaces. Drop-in: same result via `spawning: false` on each sub-tree root in the future. ## Stage B — N/A Pure additive feature on the org-template handler. No SaaS deploy chain implications. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 14:20:14 -07:00
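The `*bool` semantics in the commit above (unset and true both spawn, false skips the whole subtree, counting still sees everything) can be sketched as below. The `OrgWorkspace` struct and the helper names `shouldSpawn`, `spawnedNames` and `countAll` are hypothetical and chosen for illustration; they are not the handler's actual types.

```go
package main

import "fmt"

// OrgWorkspace is a hypothetical template node for this sketch.
type OrgWorkspace struct {
	Name     string
	Spawning *bool // nil = unset (spawn), true = spawn, false = skip subtree
	Children []OrgWorkspace
}

// shouldSpawn treats an unset pointer the same as an explicit true.
func shouldSpawn(ws OrgWorkspace) bool {
	return ws.Spawning == nil || *ws.Spawning
}

// spawnedNames walks the tree and stops at the first non-spawning node,
// so a skipped workspace also skips every descendant.
func spawnedNames(ws OrgWorkspace) []string {
	if !shouldSpawn(ws) {
		return nil
	}
	names := []string{ws.Name}
	for _, c := range ws.Children {
		names = append(names, spawnedNames(c)...)
	}
	return names
}

// countAll counts every node regardless of the spawning flag.
func countAll(ws OrgWorkspace) int {
	n := 1
	for _, c := range ws.Children {
		n += countAll(c)
	}
	return n
}

func main() {
	no := false
	tree := OrgWorkspace{Name: "Dev Lead", Children: []OrgWorkspace{
		{Name: "Core Lead", Spawning: &no, Children: []OrgWorkspace{{Name: "Core QA"}}},
		{Name: "Release Manager"},
	}}
	fmt.Println(spawnedNames(tree), countAll(tree))
}
```

A plain `bool` could not express this, because its zero value would make every workspace without the field look like `spawning: false`.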
claude-ceo-assistant	c3596d6271	fix(org-import): use ws.FilesDir as persona-dir lookup, add docker-cli-buildx to dev image ## org_import.go: persona env injection root-cause fix The Phase-3 fix from earlier today (`feedback/per-agent-gitea-identity-default`) introduced loadPersonaEnvFile to inject persona-specific creds into workspace_secrets on /org/import. It passed `ws.Role` as the persona-dir lookup key, but in our dev-tree org.yaml shape `role:` carries the multi-line descriptive text the agent reads from its prompt ("Engineering planning and team coordination — leads Core Platform, Controlplane, ..."), while `files_dir:` holds the short slug (`core-lead`, `dev-lead`, etc.) matching `~/.molecule-ai/personas/<files_dir>/env`. isSafeRoleName silently rejected the multi-word role text → no persona env loaded → every imported workspace booted with zero workspace_secrets rows → no ANTHROPIC / CLAUDE_CODE / MINIMAX auth in the container env → claude_agent_sdk wedged on `query.initialize()` with a 60s control-request timeout. After the fix, /org/import on the dev tree (27 personas) populates 8 workspace_secrets per workspace (Gitea identity + MODEL/MODEL_PROVIDER + provider-specific token), 5 of 6 leads boot online, and the remaining wedges trace to a separate runtime-template-repo bug (workspace-template-claude-code's claude_sdk_executor.py doesn't dispatch on MODEL_PROVIDER=minimax; filed separately). ## Dockerfile.dev: docker-cli + docker-cli-buildx Without these, every claude-code/tier-2 workspace POST fails-fast: - no docker-cli produces `exec: "docker": executable file not found` - docker-cli alone (no buildx) fails on `docker build` with `ERROR: BuildKit is enabled but the buildx component is missing or broken` Both packages are now installed in the dev image; verified with `docker exec molecule-core-platform-1 docker buildx version`.
## Stage A verified Local /org/import dev-only path: 27 workspaces created, all 27 receive persona env injection (8 secrets each — Gitea identity + provider creds). Lead workspaces (claude-code-OAuth tier) boot online. ## Stage B — N/A Local-dev-only path (docker-compose.dev.yml + dev image). Tenant EC2 provisioning uses Dockerfile.tenant (untouched). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:50:46 -07:00
claude-ceo-assistant	7eda8f510f	feat(local-dev): containerize platform + canvas stack via docker-compose (closes #126) Replaces the legacy nohup `go run ./cmd/server` setup with a fully containerized local stack: postgres + redis + platform + canvas,
all with `restart: unless-stopped` so they survive Mac sleep/wake and Docker Desktop daemon restarts. ## Changes - docker-compose.yml - `restart: unless-stopped` on platform/postgres/redis - `BIND_ADDR=0.0.0.0` for platform — the dev-mode-fail-open default of 127.0.0.1 (PR #7) made the host unable to reach the container even with port mapping. Container netns is already isolated, so binding all interfaces inside is safe. - Healthchecks switched from `wget --spider` (HEAD → 404 forever because /health is GET-only) to `wget -qO /dev/null` (GET). Same regression existed on canvas; fixed both. - workspace-server/Dockerfile.dev - `CGO_ENABLED=1` → `0` to match prod Dockerfile + Dockerfile.tenant. Without this, the alpine dev image fails with "gcc: not found" because workspace-server has no actual cgo deps but the env was forcing the cgo build path. Closes a divergence introduced in `9d50a6da` (today's air hot-reload PR). - canvas/Dockerfile - `npm install` → `npm ci --include=optional` for lockfile-exact installs that include platform-specific @tailwindcss/oxide native binaries. Without these, `next build` fails with "Cannot read properties of undefined (reading 'All')" on the `@import "tailwindcss"` directive. - canvas/.dockerignore (new) - Excludes `node_modules` and `.next` so the Dockerfile's `COPY . .` step doesn't clobber the freshly-installed container node_modules with the host's (potentially stale or wrong-arch) copy. This was the actual root cause of the canvas build break. - workspace-server/.gitignore - Adds `/tmp/` for air's live-reload build cache. 
## Stage A verified ``` container status restart postgres-1 Up (healthy) unless-stopped redis-1 Up (healthy) unless-stopped platform-1 Up (healthy, air-mode) unless-stopped canvas-1 Up (healthy) unless-stopped GET :8080/health → 200 GET :3000/ → 200 DB preserved: 407 workspace rows + 5 named personas Persona mount: 28 dirs at /etc/molecule-bootstrap/personas ``` ## Stage B — N/A This is local-dev infrastructure only. None of these files ship to SaaS tenants — production EC2s use `Dockerfile.tenant` + `ec2.go` user-data, not docker-compose. ## Out of scope - The decorative-but-broken `wget --spider` healthcheck has presumably also been silently 404'ing on prod tenants. Ship a follow-up to audit + fix the prod path; not done here to keep the PR scoped. - Docker Desktop "Start at login" is a per-machine GUI setting that must be toggled manually (Settings → General). - The legacy heartbeat-all.sh that pinged 5 persona workspaces from the host has been deleted (~/.molecule-ai/heartbeat-all.sh). Per Hongming: each workspace is responsible for its own heartbeat. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 10:53:39 -07:00
claude-ceo-assistant	120b3a25aa	feat(workspaces): update_tier column for canary vs production fan-out Closes core#115 partial.
Schema-only change; the apply-endpoint filter logic that reads this column lands with core#123 (drift detector + queue + apply endpoint, the deferred follow-up of core#113). Default 'production' so existing customers (Reno-Stars + any future tenant) are default-safe. Synthetic dogfooding workspaces opt INTO 'canary' explicitly. CHECK constraint pins the closed value set ('canary' | 'production'): the apply endpoint's filter relies on the database to reject anything else, so a future operator typo in PATCH /workspaces/:id ({update_tier: 'canery'}) returns a constraint violation, not silent fan-out to nobody. Partial index on canary rows since the apply-endpoint query path ('apply this update only to canary tier first') hits canary much more often than production, and the production set is the much larger default. WHAT THIS DOES NOT DO (lands with core#123) - PATCH endpoint to flip a workspace to canary - The apply endpoint that consults the column - Tests that exercise canary-vs-production fan-out Schema-only foundation; same pattern as core#113 (workspace_plugins). PHASE 4 SELF-REVIEW Correctness: No finding. IF NOT EXISTS guards, DEFAULT clause means existing rows get 'production' on migration apply. Readability: No finding. Comment block documents the tier semantics + the deferral to core#123. Architecture: No finding. Additive ALTER, partial index for the expected access pattern. Security: No finding. No code path; column constraint reduces blast radius of bad PATCH input. Performance: No finding. Partial index minimizes write amplification on the production-default rows. REFS core#115: this issue; core#123: apply endpoint follow-up (will exercise this column); core#113: version subscription DB foundation (sibling pattern) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 08:55:19 -07:00
claude-ceo-assistant	72b0d4b1ab	feat(plugins): workspace_plugins tracking table (version-subscription foundation) Closes core#113
partial. Adds the DB foundation for the version-subscription model. Drift detection + queue + admin apply endpoint are follow-up scope (separate PR; filed as a new issue). WHY THIS PR ONLY GETS US PART-WAY Plugin install state today is filesystem-only: '/configs/plugins/<name>/' inside the container. There's no DB record of 'plugin X installed at workspace W from source S, tracking ref T'. That makes drift detection impossible: nothing to compare upstream tags against. This PR adds the table + the install-endpoint hook that writes to it. With baseline tags now on every plugin (post internal#92), the table starts collecting tracked-ref values immediately on the next install. The actual drift-check job + queue + apply endpoint layer on top. WHAT THIS ADDS workspace_plugins table: workspace_id: FK → workspaces(id) ON DELETE CASCADE; plugin_name: canonical name from plugin.yaml; source_raw: full source URL the install used; tracked_ref: 'none' | 'tag:vX.Y.Z' | 'tag:latest' | 'sha:<full>'; installed_at, updated_at. installRequest gains optional 'track' field (defaults to 'none'). Install handler upserts the workspace_plugins row after delivery succeeds. DB write failure is logged but doesn't fail the install (the plugin IS in the container; surfacing 500 misleads the caller). validateTrackedRef enforces the closed set of accepted shapes: 'none' | 'tag:<non-empty>' | 'sha:<non-empty>'. Bare values like 'latest' / 'main' / version-strings without prefix are rejected: the drift detector keys on prefix to know what kind of resolution to do.
WHAT THIS DOES NOT ADD (filed separately) - Drift detector job (cron / on-demand) that scans 'WHERE tracked_ref != none' rows and queues updates on upstream drift - plugin_update_queue table (separate migration once detector lands) - GET /admin/plugin-updates-pending and POST .../apply endpoints - Tier-aware apply (core#115 — composes here) PHASE 4 SELF-REVIEW (FIVE-AXIS) Correctness: No finding — install endpoint behavior unchanged for callers that don't pass 'track'. DB write is best-effort + logged on failure. validateTrackedRef rejects ambiguous bare strings. Readability: No finding — separate file plugins_tracking.go isolates the new concern; install handler delta is a single 4-line block. Architecture: No finding — additive table; existing schema untouched. Migration 20260508160000_* uses the timestamp-prefixed convention. Security: No finding — INSERT params via placeholders (no string interpolation). validateTrackedRef rejects unexpected shapes before the column constraint would. Performance: No finding — one extra ExecContext per install. Install is already seconds-scale (network fetch + tar + docker exec); rounds to noise. TESTS (1 new, all green) TestValidateTrackedRef — pin closed set + structural validators REFS core#113 — this issue (foundation only; drift+queue+apply = follow-up) internal#92, internal#93 — plugin/template baseline tags (now exists for tracking) core#114 — atomic install (this PR composes — no atomicity regression) core#115 — canary tier filter (will key off the same DB foundation) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 08:52:35 -07:00
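The closed set of tracked_ref shapes described above can be sketched as a small validator. This is a minimal illustration, not the code from plugins_tracking.go: the signature and error wording are assumptions, and only the accepted shapes ('none', 'tag:<non-empty>', 'sha:<non-empty>') come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// validateTrackedRef mirrors the closed set described in the commit message.
// The signature and messages are assumptions; the real function is not shown.
func validateTrackedRef(ref string) error {
	if ref == "none" {
		return nil
	}
	for _, prefix := range []string{"tag:", "sha:"} {
		if rest, ok := strings.CutPrefix(ref, prefix); ok {
			if rest == "" {
				return fmt.Errorf("tracked ref %q has an empty value after %q", ref, prefix)
			}
			return nil
		}
	}
	// Bare values such as "latest" or "main" carry no prefix, so a drift
	// detector could not tell which kind of resolution to perform.
	return fmt.Errorf("tracked ref %q must be 'none', 'tag:<value>' or 'sha:<value>'", ref)
}

func main() {
	for _, ref := range []string{"none", "tag:v1.2.3", "sha:3de51faa19", "latest", "tag:"} {
		fmt.Printf("%-16q accepted=%v\n", ref, validateTrackedRef(ref) == nil)
	}
}
```

Rejecting unprefixed values at the request boundary keeps the stored column unambiguous, which is the property the message says the drift detector relies on.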
claude-ceo-assistant	249e760fbd	feat(plugins): hot-reload classifier — skip restart on SKILL-content-only updates

Closes molecule-core#112. Composes with #114 (atomic install).

Before issuing restartFunc, classify the diff between staged and live:
- skill-content-only: only **/SKILL.md content changed → skip restart (Claude Code re-reads SKILL.md on each Skill invocation; no in-memory cache)
- cold: anything else → restartFunc as before (hooks/settings load at session start; plugin.yaml is structural; added/removed files require a fresh load)

DETECTION
- Hash every regular file in staged tree (host filesystem, sha256)
- Hash every regular file in live tree (in-container via docker exec sh -c 'cd <livePath> && find . -type f -print0 | xargs -0 sha256sum')
- .complete marker dropped from comparison (mtime varies install-to-install; including it would force-cold every reinstall)
- File added/removed → cold
- File content differs but isn't SKILL.md → cold
- All differences are SKILL.md basenames → skill-content-only

DEFAULTS COLD
- First install (no live tree) → cold
- Live tree read failure → cold (conservative; never hot-reload speculatively)
- Symlinks skipped during hash (same posture as tar walker)

PHASE 4 SELF-REVIEW
Correctness: No finding — all error paths default to cold; never falsely classify as skill-content-only. The .complete drop is a deliberate exception (the marker is bookkeeping, not content).
Readability: No finding — single-purpose helpers (hashLocalTree, hashContainerTree, isSkillMarkdown, shQuote) each do one thing. The classifier itself reads as 'compare set, then walk diff with isSkillMarkdown gate.'
Architecture: No finding — composes existing execAsRoot primitive; new helpers in plugins_classifier.go don't touch any other handler. Old behavior unchanged when live read fails.
Security: No finding — shQuote single-quotes any non-trivial path, pluginName comes from validatePluginName-validated source, and the docker exec command takes the path as a single arg (xargs -0 handles binary-safe path delimiting). Symlinks skipped.
Performance: No finding — adds two tree walks (host + container) per install. Container walk is one docker exec call returning sha256 lines; for typical plugins (~10-50 files) round-trip is ~100ms. Versus the saved ~5-10s of restart on a hot-reloadable update, this is a clear win.

TESTS (4 new, all green; full handler suite green)
TestIsSkillMarkdown — basename match, case-sensitive
TestHashLocalTree_StableHash — re-hash same dir = same map
TestHashLocalTree_SymlinkSkipped — hostile link doesn't poison classifier
TestShQuote — quoting boundary for shell injection safety

REFS
molecule-core#112 — this issue
molecule-core#114 — atomic install (.complete marker added there)
Reno-Stars iteration safety (Hongming 2026-05-08)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 08:26:05 -07:00
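The classification rules above can be sketched over two path-to-sha256 maps. This is an illustrative reading, not the contents of plugins_classifier.go: classify, withoutMarker, and the map-based inputs are hypothetical, and the sketch assumes a nil live map stands for "no live tree" or "live read failed". Only isSkillMarkdown is a name taken from the commit message.

```go
package main

import (
	"fmt"
	"path"
)

const (
	skillContentOnly = "skill-content-only"
	cold             = "cold"
)

// isSkillMarkdown is a case-sensitive basename match, as the test name suggests.
func isSkillMarkdown(p string) bool { return path.Base(p) == "SKILL.md" }

// withoutMarker drops the .complete bookkeeping file from the comparison.
// Simplification: it drops a file with that name at any depth.
func withoutMarker(tree map[string]string) map[string]string {
	out := make(map[string]string, len(tree))
	for p, sum := range tree {
		if path.Base(p) != ".complete" {
			out[p] = sum
		}
	}
	return out
}

// classify compares relative-path -> sha256 maps for the staged and live trees.
func classify(staged, live map[string]string) string {
	if live == nil {
		return cold // first install or unreadable live tree: never guess
	}
	s, l := withoutMarker(staged), withoutMarker(live)
	if len(s) != len(l) {
		return cold // a file was added or removed
	}
	for p, sum := range s {
		liveSum, ok := l[p]
		if !ok {
			return cold // same count but different path set
		}
		if sum != liveSum && !isSkillMarkdown(p) {
			return cold // a non-SKILL.md file changed
		}
	}
	// Every difference (possibly none) is confined to SKILL.md files.
	return skillContentOnly
}

func main() {
	live := map[string]string{"plugin.yaml": "a1", "skills/x/SKILL.md": "b1", ".complete": "m0"}
	staged := map[string]string{"plugin.yaml": "a1", "skills/x/SKILL.md": "b2", ".complete": "m1"}
	fmt.Println(classify(staged, live))
	fmt.Println(classify(staged, nil))
}
```

Note that in this sketch an identical reinstall also lands in the skill-content-only branch, since no difference falls outside SKILL.md; whether the real classifier treats that case the same way is not stated in the message.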
claude-ceo-assistant	3e96184d6f	Merge pull request 'feat(plugins): atomic install — stage→snapshot→swap→marker (docker path)' (#120) from feat/plugin-atomic-install into main	2026-05-08 15:23:31 +00:00
claude-ceo-assistant	7fbb8cb6e9	feat(plugins): atomic install — stage→snapshot→swap→marker (docker path)

Closes molecule-core#114 for the docker (local-OSS) path. EIC (SaaS) path tracked as a follow-up — same shape, different exec primitives (ssh vs docker exec); shipping both in one PR doubles the test surface.

THE FOUR-STEP DANCE
1. STAGE — docker.CopyToContainer extracts tar into /configs/plugins/.staging/<name>.<ts>/
2. SNAPSHOT — if /configs/plugins/<name>/ exists, mv to /configs/plugins/.previous/<name>.<ts>/
3. SWAP — atomic mv staging → live (single rename(2))
4. MARKER — touch /configs/plugins/<name>/.complete

Workspace-side plugin loaders should refuse to load any plugin dir without .complete (separate small change, not in this PR — the marker write is the necessary precursor; consumer side is a follow-up so existing-content plugins don't break before they're re-installed).

ROLLBACK
- Stage failure: rm -rf staging dir; live untouched
- Snapshot failure: rm -rf staging dir; live untouched (no rename happened)
- Swap failure with snapshot present: mv previous back to live
- Swap failure (no snapshot): rm -rf staging; live (which never existed) stays absent
- Marker failure: content already in place, log loudly with manual recovery hint (touch <plugin>/.complete) — don't roll back since the new content is what we wanted, just unmarked

GC
Best-effort delete of previous-version snapshot after successful marker write. Failures non-fatal — next install or a separate sweeper reclaims. Sweeper for stale .previous/* across reboots is follow-up scope.

CONCURRENCY
Each install gets a unique stamp (UTC second precision), so two concurrent reinstalls land in distinct staging dirs and the second swap simply overwrites the first's live result. The atomicity is per-install, not cross-install — by design (the platform serializes POST /workspaces/:id/plugins via Go-side semaphore upstream of this code, so cross-install collisions don't reach here).

CHANGES
+ plugins_atomic.go — installVersion + atomicCopyToContainer
+ plugins_atomic_tar.go — tarWalk/tarHostDirWithPrefix helpers
+ plugins_atomic_test.go — 5 unit tests (paths, stamp shape, tar happy path, symlink-skip, prefix normalization). All green.
~ plugins_install_pipeline.go::deliverToContainer — swap copyPluginToContainer call to atomicCopyToContainer

Old copyPluginToContainer is retained (still called by Download()) so this PR is purely additive on the install path; no public API change.

PHASE 4 SELF-REVIEW (FIVE-AXIS)
Correctness: Required (addressed) — swap-failure rollback writes mv of previous back to live before returning the error; if rollback itself fails, we wrap both errors and surface the combined fault. Marker-write failure is treated as content-landed-but-unmarked (LOG, don't roll back the new content).
Readability: No finding — installVersion path methods make the /staging/.previous/live/marker layout obvious from one struct. tarWalk extracted from the inline filepath.Walk in plugins_install_pipeline.go for testability.
Architecture: No finding — atomicCopyToContainer composes existing execAsRoot / docker.CopyToContainer primitives; no new dependencies. Old copyPluginToContainer kept for Download() — single responsibility per function.
Security: No finding — symlinks still skipped during tar walk (defense vs hostile plugin escaping its own dir). Marker writes use composable path.Join, no user input touches the path.
Performance: No finding — adds up to 4 docker exec calls per install (mkdir, mv-snapshot, mv-swap, touch) on top of the one CopyToContainer. Each exec ~50-100ms in practice; install end-to-end was already seconds-scale, this rounds to noise.

REFS
molecule-core#114 — this issue
Companion: molecule-core#112 (hot-reload classifier — depends on .complete marker)
Companion: molecule-core#113 (version subscription — uses install machinery)
EIC follow-up: separate issue to be filed for SaaS path parity

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 08:22:52 -07:00
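The stage, snapshot, swap, marker sequence and its rollback rules can be sketched on a local directory. This is a hedged illustration only: the real code runs inside the container via docker exec and docker.CopyToContainer, whereas this sketch swaps in plain os calls. The installVersion fields, stage, swapIn, and demo are hypothetical names; only the directory layout and the rollback rules come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// installVersion is a stand-in for the struct named in the commit; the fields are assumed.
type installVersion struct{ root, name, stamp string }

func (v installVersion) staging() string {
	return filepath.Join(v.root, ".staging", v.name+"."+v.stamp)
}
func (v installVersion) previous() string {
	return filepath.Join(v.root, ".previous", v.name+"."+v.stamp)
}
func (v installVersion) live() string   { return filepath.Join(v.root, v.name) }
func (v installVersion) marker() string { return filepath.Join(v.live(), ".complete") }

// stage writes one file into the staging dir (stands in for the tar extraction).
func stage(v installVersion, file, body string) error {
	if err := os.MkdirAll(v.staging(), 0o755); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(v.staging(), file), []byte(body), 0o644); err != nil {
		os.RemoveAll(v.staging()) // stage failure: live untouched
		return err
	}
	return nil
}

// swapIn performs SNAPSHOT, SWAP, MARKER and GC for already-staged content.
func swapIn(v installVersion) error {
	hadLive := false
	if _, err := os.Stat(v.live()); err == nil {
		err := os.MkdirAll(filepath.Dir(v.previous()), 0o755)
		if err == nil {
			err = os.Rename(v.live(), v.previous())
		}
		if err != nil {
			os.RemoveAll(v.staging()) // snapshot failure: live untouched
			return fmt.Errorf("snapshot: %w", err)
		}
		hadLive = true
	}
	if err := os.Rename(v.staging(), v.live()); err != nil {
		if !hadLive {
			os.RemoveAll(v.staging())
			return fmt.Errorf("swap: %w", err)
		}
		if rbErr := os.Rename(v.previous(), v.live()); rbErr != nil {
			return errors.Join(fmt.Errorf("swap: %w", err), fmt.Errorf("rollback: %w", rbErr))
		}
		return fmt.Errorf("swap (rolled back): %w", err)
	}
	if err := os.WriteFile(v.marker(), nil, 0o644); err != nil {
		// Content landed but is unmarked: log and keep it (an assumed behavior).
		log.Printf("marker write failed, recover with: touch %s (%v)", v.marker(), err)
		return nil
	}
	if hadLive {
		_ = os.RemoveAll(v.previous()) // best-effort GC of the snapshot
	}
	return nil
}

// demo installs twice into a temp dir and checks the second version is live and marked.
func demo() error {
	root, err := os.MkdirTemp("", "plugins")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	for i, body := range []string{"v1", "v2"} {
		v := installVersion{root: root, name: "example", stamp: fmt.Sprint(i)}
		if err := stage(v, "plugin.yaml", body); err != nil {
			return err
		}
		if err := swapIn(v); err != nil {
			return err
		}
	}
	got, err := os.ReadFile(filepath.Join(root, "example", "plugin.yaml"))
	if err != nil {
		return err
	}
	if string(got) != "v2" {
		return fmt.Errorf("live content = %q, want v2", got)
	}
	_, err = os.Stat(filepath.Join(root, "example", ".complete"))
	return err
}

func main() {
	fmt.Println("demo error:", demo())
}
```

The swap relies on a single rename within one filesystem, which is why staging and live sit under the same root in both the commit's layout and this sketch.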
claude-ceo-assistant	c3686a4bb3	Merge branch 'main' into fix/pendinguploads-test-isolation	2026-05-08 15:20:36 +00:00
claude-ceo-assistant	e37a289eb6	Merge pull request 'feat(org-import): inject per-role persona env from operator-host bootstrap dir' (#110) from feat/persona-env-injection into main	2026-05-08 15:17:17 +00:00
claude-ceo-assistant	9d50a6dae4	feat(local-dev): air-based hot-reload for workspace-server

Closes core#116.

Brings local-dev iteration parity with the canvas's Turbopack HMR — edit a Go file, see the platform restart in <5s instead of running 'docker compose up --build' (~30s) per change.

USAGE
  make dev   # docker compose with air-driven live reload
  make up    # production-shape stack (no air, normal Dockerfile)

WHAT THIS ADDS
  workspace-server/.air.toml — air watch config
  workspace-server/Dockerfile.dev — air-on-golang:1.25-alpine, dev-only
  docker-compose.dev.yml — overlay swapping platform service to Dockerfile.dev + bind-mounting workspace-server/ source
  Makefile — make {dev,up,down,logs,build,test}

WHAT THIS DOES NOT TOUCH
  workspace-server/Dockerfile (production multi-stage build)
  docker-compose.yml (prod-shape stack)
  CI workflows (build prod image directly)
  Tenant deployment / SaaS (image swap stays the model)

Pure additive. Existing 'docker compose up' path unchanged; production stays on the static binary. Air install pinned via go install at image build time so the dev image is reproducible-enough for local use (we don't pin air to a SHA — the dev image is rebuilt locally and updates opportunistically).

PHASE 4 SELF-REVIEW (FIVE-AXIS)
Correctness: No finding — additive change, no existing path modified. .air.toml watches .go + .yaml under workspace-server/, excludes _test.go and tests dir so test edits don't trigger rebuild. Dockerfile.dev mirrors prod's 'go mod download' so first rebuild is fast.
Readability: No finding — three small files plus a Makefile, each with header comments explaining the WHY, not just the WHAT. The Makefile uses the standard ## help-target pattern.
Architecture: No finding — overlay pattern (docker-compose.dev.yml on top of docker-compose.yml) is the standard compose convention for env-specific overrides. Doesn't fork the prod path.
Security: No finding because no production code path; dev-only image isn't built in CI and isn't published to ECR.
Performance: No finding — air debounce=500ms, exclude_unchanged=true so a save that doesn't change content is a no-op rebuild.

REFS
core#116 — this issue
Companion: core#117 (workspace-side config-watcher for hot-reload of config.yaml) — different scope; this issue is platform-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 08:10:50 -07:00
dev-lead	9e18ab4620	fix(pendinguploads): wait for error metric before test exit

TestStartSweeper_TransientErrorDoesNotCrashLoop leaks an in-flight metric write across the test boundary: cycleDone fires inside the fake's Sweep defer (before Sweep returns), waitForCycle returns immediately after, cancel() lands, but the goroutine still has metrics.PendingUploadsSweepError() to execute. Whether that write happens before or after the next test's metricDelta() baseline read is a coin-flip on slow CI hosts.

Outcome: TestStartSweeper_RecordsMetricsOnSuccess fails with "error counter delta = 1, want 0" — looks like a real bug, isn't.

Instrumented analysis (per the file's existing waitForMetricDelta docstring covering the same shape) confirms the metric IS getting recorded, just AFTER the next test reads its baseline.

The Records* tests already use waitForMetricDelta to close this race on their own assertions. This change extends the same shape to TransientErrorDoesNotCrashLoop so it doesn't poison subsequent tests' baselines.

Verified by running `go test -race -count=20 ./internal/pendinguploads/...` locally — passes deterministically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 07:37:45 -07:00
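The race described above, and the polling shape that closes it, can be sketched in isolation. This is a minimal model under stated assumptions: sweepErrors, runCycle, waitForDelta, and demo are hypothetical stand-ins, and the repository's actual waitForMetricDelta helper and metrics package are not reproduced here.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// sweepErrors stands in for the process-wide error counter.
var sweepErrors atomic.Int64

// runCycle signals completion first and records the metric afterwards,
// which is the ordering the commit message describes.
func runCycle(done chan<- struct{}) {
	close(done)                      // "cycleDone" fires before the metric write
	time.Sleep(5 * time.Millisecond) // the window in which a test could exit
	sweepErrors.Add(1)               // late write that can leak into the next test
}

// waitForDelta polls until the counter has moved by at least want, or times out.
func waitForDelta(read func() int64, baseline, want int64, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if read()-baseline >= want {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(time.Millisecond)
	}
}

// demo waits for the cycle signal, then for the metric itself, before returning.
func demo() bool {
	baseline := sweepErrors.Load()
	done := make(chan struct{})
	go runCycle(done)
	<-done // not sufficient alone: the counter may not have been written yet
	return waitForDelta(sweepErrors.Load, baseline, 1, time.Second)
}

func main() {
	fmt.Println("metric observed before exit:", demo())
}
```

Waiting on the counter rather than on the cycle signal means the write has landed before the next test takes its baseline, which is the property the fix is after.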
claude-ceo-assistant	43b33bcaa5	feat(org-import): inject per-role persona env from operator-host bootstrap dir

Wires the 28 dev-tree persona credentials minted 2026-05-08 into the workspace-secrets path used by org_import. When a workspace.yaml carries `role: <name>`, the importer now reads $MOLECULE_PERSONA_ROOT/<role>/env (default /etc/molecule-bootstrap/personas/<role>/env, populated by the bootstrap kit on the tenant host) and merges the role's GITEA_USER / GITEA_TOKEN / GITEA_TOKEN_SCOPES / GITEA_USER_EMAIL / GITEA_SSH_KEY_PATH into the same envVars map that already feeds workspace_secrets via parseEnvFile + crypto.Encrypt + INSERT.

PRECEDENCE
Persona env is the LOWEST layer:
  0. Persona env (per-role)
  1. Org root .env (shared)
  2. Workspace .env (per-workspace)
Each later layer overrides the previous, so a workspace .env can pin a different GITEA_TOKEN if it ever needs to (testing, override).

WHY THIS LAYERING
Workspaces should boot with the role's identity by default. .env files stay the explicit-override mechanism for the (rare) case where a workspace needs to deviate. No new behavior for workspaces with no role: persona load is silent no-op when ws.Role is empty or unsafe.

SECURITY
isSafeRoleName accepts only [A-Za-z0-9_-]+ (no '..', '/', or separators) — admin-only construct, but defense-in-depth keeps the persona dir shape invariant. Test TestLoadPersonaEnvFile_RejectsTraversal pins the rejection set against a planted target file.

OPERATOR-HOST CONTRACT
The 28 persona env files live at /etc/molecule-bootstrap/personas/<role>/env (mode 600, owner root:root) with the per-role token-scope tailoring Hongming approved 2026-05-08 (D5). Synced via task #241. Override via MOLECULE_PERSONA_ROOT for tests + non-prod hosts.

TESTS (7 new, all green)
TestLoadPersonaEnvFile_HappyPath — typical persona-env shape
TestLoadPersonaEnvFile_MissingDir — silent no-op when file absent
TestLoadPersonaEnvFile_EmptyRole — silent no-op when role empty
TestLoadPersonaEnvFile_RejectsTraversal — planted file unreachable via '../../etc/passwd' etc.
TestLoadPersonaEnvFile_DefaultRoot — falls back to /etc/...
TestLoadPersonaEnvFile_OverwritesEmptyMap
TestIsSafeRoleName_Acceptance — positive + negative role names

PHASE 4 SELF-REVIEW (FIVE-AXIS)
Correctness: No finding — additive change, silent no-op on the ws.Role=='' path covers every existing workspace; tests cover happy path + each rejection mode + missing-dir.
Readability: No finding — helper sits next to parseEnvFile in org_helpers.go with a comment block explaining WHY persona is lowest precedence.
Architecture: No finding — fits the existing 'merge .env into envVars then INSERT INTO workspace_secrets' pattern that's been in place since the .env-driven workspace secrets feature; no new dependencies, no new tables.
Security: Required (addressed) — path traversal blocked by isSafeRoleName. No finding beyond that since persona files are admin-managed and the helper does not log token values.
Performance: No finding — one extra os.ReadFile per workspace at import time; amortized over workspace lifetime, cost is negligible.

REFS
internal#85 — RFC for SOP Phase 4 + structured Five-Axis (parent context)
Saved memories: feedback_per_agent_gitea_identity_default, feedback_unified_credentials_file
Task #241 — operator-host sync (already DONE; populated 28 dirs)
Task #242 — this PR

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 07:09:40 -07:00
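The role-name gate and the three-layer precedence can be sketched together. This is an assumption-laden illustration: isSafeRoleName is named in the commit and the character class is quoted from it, but the regexp-based body, mergeLayers, and the sample values are hypothetical and not taken from org_helpers.go.

```go
package main

import (
	"fmt"
	"regexp"
)

// safeRole encodes the character class quoted in the commit message.
var safeRole = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// isSafeRoleName rejects empty names and anything containing '.', '/' or
// other separators, so a role can never escape the persona directory.
func isSafeRoleName(role string) bool { return safeRole.MatchString(role) }

// mergeLayers applies maps lowest-precedence first; later layers override.
func mergeLayers(layers ...map[string]string) map[string]string {
	out := map[string]string{}
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	persona := map[string]string{"GITEA_USER": "dev-lead", "GITEA_TOKEN": "persona-token"}
	orgRoot := map[string]string{"SHARED_SETTING": "org-wide"}
	workspace := map[string]string{"GITEA_TOKEN": "workspace-override"}

	// Order matters: persona (0), org root .env (1), workspace .env (2).
	env := mergeLayers(persona, orgRoot, workspace)
	fmt.Println(env["GITEA_USER"], env["GITEA_TOKEN"], env["SHARED_SETTING"])

	for _, role := range []string{"dev-lead", "../../etc/passwd", ""} {
		fmt.Printf("%q safe=%v\n", role, isSafeRoleName(role))
	}
}
```

Passing the persona map first is what makes it the lowest layer: any key a later .env sets simply overwrites it.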
claude-ceo-assistant	c72d0a5383	harden(org-external): token via http.extraHeader, .complete cache marker, ref '..' deny, naming cleanup

Self-review of molecule-core PR #105 + #106 (the !external resolver chain) surfaced 3 real correctness/security gaps and 2 readability nits. Fixes all four in one PR since they're the same file's hardening.

(1) TOKEN LEAKAGE: fixed
Before: gitFetcher built clone URLs with auth in userinfo (https://oauth2:TOKEN@host/repo.git). Two leak paths:
  a. Token persisted in cloned repo's .git/config
  b. Token could appear in clone error output captured via cmd.CombinedOutput()
After: clone URL has no userinfo (https://host/repo.git). Auth is layered on via -c http.extraHeader=Authorization: token ... which sends the header per-request without persisting. Plus a redactToken() pass over any error string before it surfaces in fmt.Errorf, as belt-and-braces.
Tradeoff: token now visible in 'ps aux' for the duration of the git child process (same as before via env var), but no longer in any persistent state.

(2) CACHE-VALIDITY FOOTGUN: fixed
Before: cache-hit was 'cacheDir/.git exists'. A clone interrupted after .git was created but before content finished writing would leave a partially-written cache that subsequent imports treated as hit, returning stale/incomplete content forever (no self-heal).
After: cache-hit also requires a .complete marker file written only AFTER successful clone+rename. Partially-written cache is treated as cache-miss and re-fetched cleanly (after RemoveAll on the partial dir to avoid blocking the new clone's mkdir).

(3) REF '..' DENY: fixed
Before: safeRefPattern '^[a-zA-Z0-9_./-]+$' allowed '..' as a substring. Git itself rejects most refs containing '..', but defense-in-depth says don't depend on the downstream tool's validation when sanitizing input at the boundary.
After: explicit strings.Contains(ref.Ref, '..') check.

(4) NAMING CLEANUP: fixed
Before: rewriteFilesDirAndIncludes(), whose name claims to rewrite !include scalars but doesn't (we removed that during PR-A development; double-prefix bug). Misleading for readers.
After: rewriteFilesDir(). Docstring updated to explicitly explain why !include paths are NOT rewritten (relative to subDir, naturally inside cache).

Also: removed unused buildAuthedURL() (replaced by buildExternalCloneURL + authConfigArgs split), removed unused shortHash() helper (replaced by os.MkdirTemp), removed unused crypto/sha1 + encoding/hex + fmt imports, removed stray '_ = fmt.Sprint' line in integration test.

NEW TESTS
- TestGitFetcher_RejectsRefWithDoubleDot (defense-in-depth on ref input)
- TestGitFetcher_CacheValidatedByCompleteMarker (partial cache → re-fetch)

VERIFIED LOCALLY 2026-05-08
Full ./internal/handlers/ suite: ok (7.8s, 14 external-resolver tests + all existing tests). Two new tests cover the two new behaviors.

Refs:
internal#77: extraction RFC
molecule-core#105 (resolver), #106 (tests): original implementation
Hongming code-review-and-quality skill invocation 2026-05-08 + 'fix all'	2026-05-08 05:54:54 -07:00
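The ref check and the redaction pass described in c72d0a5383 could look roughly like the Go sketch below. It is illustrative only: the character class is the one quoted in the commit message, but `validateRef`, its error text, and the `[REDACTED]` placeholder are assumptions, and the project's real redactToken() may have a different signature.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// safeRefPattern uses the character class quoted in the commit message.
var safeRefPattern = regexp.MustCompile(`^[a-zA-Z0-9_./-]+$`)

// validateRef is a hypothetical helper: it applies the character-class
// check first, then the explicit ".." deny that the pattern alone misses.
func validateRef(ref string) error {
	if !safeRefPattern.MatchString(ref) {
		return fmt.Errorf("ref %q contains disallowed characters", ref)
	}
	if strings.Contains(ref, "..") {
		return fmt.Errorf("ref %q contains '..'", ref)
	}
	return nil
}

// redactToken (assumed signature) replaces every occurrence of token in msg
// so that captured git output can be wrapped in an error without leaking it.
func redactToken(msg, token string) string {
	if token == "" {
		return msg
	}
	return strings.ReplaceAll(msg, token, "[REDACTED]")
}
```

Checking for ".." separately keeps the regular expression simple and makes the deny rule visible at the validation boundary instead of relying on git to reject such refs.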
claude-ceo-assistant	89c5567d79	test(org-external): integration test against local bare-git + e2e against live Gitea (PR-B + PR-C)

PR-B (local bare-git integration, task #233):
workspace-server/internal/handlers/org_external_integration_test.go

Three tests using git's GIT_CONFIG_COUNT/KEY/VALUE env-var-injected insteadOf URL rewrite (process-scoped, no ~/.gitconfig pollution):
- TestGitFetcher_RealClone_LocalRedirect: full resolver chain end-to-end with REAL git clone against a local bare-repo, asserts cache population + content materialization + path rewrite + cache-hit on second invocation.
- TestGitFetcher_RealClone_BadRefFails: nonexistent ref surfaces git's error cleanly through the ls-remote step.
- TestGitFetcher_DirectFetch_CacheHit: gitFetcher.Fetch direct invocation (no resolver wrapping); verifies cache-hit returns same dir + same SHA, no clobber.

Production code untouched: the insteadOf rewrite makes the production gitFetcher think it's cloning from Gitea, but git rewrites at clone time to file://<barePath>. Tests the real shell-out + parsing.

PR-C (live Gitea e2e, task #234):
workspace-server/internal/handlers/local_e2e_dev_dept_test.go

TestLocalE2E_ExternalDevDepartment: minimal parent template that uses !external against the LIVE molecule-ai/molecule-dev-department repo. No symlink, no /tmp/local-e2e-deploy fixture. Composition resolves over network at import time. Asserts:
- 28+ dev-tree workspaces resolve through the fetched cache (matches the count from TestLocalE2E_DevDepartmentExtraction)
- Q1 placement: 'Documentation Specialist' present (under app-lead)
- Q2 placement: 'Triage Operator' present (under dev-lead)
- Every workspace's files_dir is cache-prefixed (proves rewrite ran)
- Every workspace's resolveInsideRoot+Stat succeeds (would fail provisioning if not)

Skipped if Gitea unreachable (TCP probe to git.moleculesai.app:443) or git binary absent; won't false-fail offline runners.

VERIFIED LOCALLY 2026-05-08:
  --- PASS: TestGitFetcher_RealClone_LocalRedirect (0.26s)
  --- PASS: TestGitFetcher_RealClone_BadRefFails (0.15s)
  --- PASS: TestGitFetcher_DirectFetch_CacheHit (0.23s)
  --- PASS: TestLocalE2E_ExternalDevDepartment (0.55s)
      workspaces resolved through !external: 28
Full ./internal/handlers/ test suite: ok (no regressions)

Together with PR-A's unit tests (#105), the !external resolver is now covered at three layers:
- unit (fakeFetcher injection): allowlist, validation, path rewrite
- integration (real git, local bare-repo): clone, cache, ls-remote
- e2e (real git, live Gitea, live dev-department): full chain

Refs:
internal#77: extraction RFC (Phase 3a phasing in comment 1995)
task #233 (PR-B), task #234 (PR-C)
Hongming GO 2026-05-08 ('do PR-B/C/D')	2026-05-08 05:30:04 -07:00
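The env-var injection used by the PR-B tests can be sketched as follows. This is not the code in org_external_integration_test.go: `insteadOfEnv` and the example paths are assumptions made for illustration. It relies on git's GIT_CONFIG_COUNT / GIT_CONFIG_KEY_<n> / GIT_CONFIG_VALUE_<n> environment variables (present in newer git releases) and on the `url.<base>.insteadOf` setting.

```go
package main

import (
	"fmt"
)

// insteadOfEnv (hypothetical helper) returns environment entries that make a
// git child process rewrite any URL starting with fromPrefix so that it
// starts with toPrefix instead. Nothing is written to ~/.gitconfig; the
// setting lives only in the environment of the process that receives it.
func insteadOfEnv(fromPrefix, toPrefix string) []string {
	return []string{
		"GIT_CONFIG_COUNT=1",
		// Equivalent to: git config url.<toPrefix>.insteadOf <fromPrefix>
		fmt.Sprintf("GIT_CONFIG_KEY_0=url.%s.insteadOf", toPrefix),
		fmt.Sprintf("GIT_CONFIG_VALUE_0=%s", fromPrefix),
	}
}
```

A test would typically attach these entries to the child process, for example with `cmd.Env = append(os.Environ(), insteadOfEnv(...)...)`, or set each pair with `t.Setenv` so the values are restored when the test ends.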
claude-ceo-assistant	257d6c1b5a	feat(org-import): `!external` cross-repo subtree resolver (Phase 3a, internal#77 / task #222)

Adds gitops-style cross-repo subtree composition to the platform's org-template importer. Replaces (eventually) the operator-side filesystem symlink approach shipped in PR #5.

DESIGN
See internal#77 comment 1995 for the full design doc + decision points agreed with Hongming 2026-05-08.

Schema: a `!external`-tagged mapping anywhere a workspace entry is allowed (workspaces:, roots:, children:):

    - !external
      repo: molecule-ai/molecule-dev-department
      ref: main
      path: dev-lead/workspace.yaml
      url: git.moleculesai.app  # optional; default = MOLECULE_EXTERNAL_GITEA_URL or git.moleculesai.app

At resolve time the platform fetches the repo at ref into a content-addressable cache under <orgBaseDir>/.external-cache/<repo>/<sha>/, loads <cacheDir>/<path>, recursively resolves nested !include / !external in the loaded subtree, then rewrites every files_dir scalar in the fully-resolved subtree to be cache-prefixed. Downstream pipeline (resolveInsideRoot, plugin merge, CopyTemplateToContainer) sees ordinary in-tree paths.

IMPLEMENTATION
- org_external.go: ExternalRef type, fetcher interface (gitFetcher production + injectable for tests), resolveExternalMapping resolver, rewriteFilesDirAndIncludes path-rewrite walker, allowlistedHostPath + safeRefPattern + safeRepoCacheDir validation helpers.
- org_include.go: 4-line hook in expandNode dispatching MappingNode with Tag=="!external" to resolveExternalMapping.
- org_external_test.go: 8 unit tests with fakeFetcher injection (no network):
  * happy path (top + nested workspace files_dir cache-prefixed)
  * allowlist rejection (github.com/foo/bar)
  * path-traversal rejection (../../etc/passwd)
  * malformed ref rejection ("main; rm -rf /")
  * missing required fields (repo / ref / path)
  * rewriteFilesDirAndIncludes basic + idempotent
  * allowlistedHostPath env-override + glob

Path rewrite ONLY rewrites files_dir scalars. !include scalars are NOT rewritten; they resolve relative to their containing file's directory, which post-fetch is naturally inside the cache, so relative !includes Just Work without modification.

ALLOWLIST + AUTH
- Default allowlist: git.moleculesai.app/molecule-ai/.
- Override: MOLECULE_EXTERNAL_REPO_ALLOWLIST (comma-separated prefixes; trailing /* or / supported).
- Auth: MOLECULE_GITEA_TOKEN env var injected into clone URL. Optional; falls back to unauthenticated for public repos.
- Reject: malformed refs, path-traversal, non-allowlisted hosts.

CACHE
- Location: <orgBaseDir>/.external-cache/<safe-repo>/<sha>/. Operators add to .gitignore.
- Content-addressable: same (repo, sha) reuses cache, no overwrite.
- Atomic clone via tmp-then-rename.
- Concurrency: race-tolerant; last-writer-wins on same SHA.
GC out of scope for v1 (filed as parked follow-up).

SECURITY (per SOP Phase 2)
Untrusted yaml input, all validated:
  repo: allowlist (default molecule-ai/* on Gitea host)
  ref: ^[a-zA-Z0-9_./-]+$ regex (rejects shell injection)
  path: relative-and-down-only (rejects ../escape)
Auth: read-only token scoped to allowed orgs.
Recursion: maxExternalDepth=4 (vs maxIncludeDepth=16) to limit network fan-out cost.
Cache poisoning: per-(repo, sha) content-addressable; can't poison across SHAs.
Trust boundary: cloned content treated identically to a sibling-cloned subtree (same model as current symlink approach).

VERSIONING / BACKWARDS COMPAT
Pure additive. Existing !include and inline workspaces unchanged. Existing dev-lead symlink (parent template PR #5) keeps working. Migration of parent template to !external is a separate PR-D. No DB schema change. No public API change.

VERIFIED LOCALLY
go test ./internal/handlers/ → ok (5.2s, all 8 new tests + existing)
Stub fetcher injection lets unit tests cover the resolver + path-rewrite logic without network. PR-B (follow-up) adds an integration test against a local bare-git repo. PR-C adds the real-Gitea e2e test against the live dev-department repo.

Refs:
internal#77: extraction RFC (comment 1995 = Phase 1+2 design)
task #222: this PR is Phase 3a (PR-A in the design's phasing)
Hongming GO 2026-05-08 ('go' on 4 decision points + design)	2026-05-08 05:17:55 -07:00
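One possible reading of the allowlist rule in 257d6c1b5a (comma-separated prefixes, with a trailing /* or / both accepted) is sketched below. This is an interpretation, not the project's allowlistedHostPath: `parseAllowlist`, `isAllowlisted`, and the normalization choices are assumptions.

```go
package main

import (
	"strings"
)

// parseAllowlist (hypothetical) splits a comma-separated allowlist value and
// normalizes every entry to a prefix that ends in "/". Both "host/org/*" and
// "host/org/" therefore become "host/org/". Empty entries are dropped, and a
// bare "*" normalizes to nothing, so it cannot allow every host.
func parseAllowlist(raw string) []string {
	var prefixes []string
	for _, entry := range strings.Split(raw, ",") {
		entry = strings.TrimSpace(entry)
		entry = strings.TrimSuffix(entry, "*")
		if entry == "" {
			continue
		}
		if !strings.HasSuffix(entry, "/") {
			entry += "/"
		}
		prefixes = append(prefixes, entry)
	}
	return prefixes
}

// isAllowlisted (hypothetical) reports whether hostPath, in the form
// "host/org/repo", starts with one of the normalized prefixes.
func isAllowlisted(hostPath string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(hostPath, p) {
			return true
		}
	}
	return false
}
```

Forcing a trailing "/" on each prefix is the design point worth noting: without it, an entry for molecule-ai would also match a look-alike organization such as molecule-ai-evil.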
claude-ceo-assistant	3dcc7230f9	fix(provisioner)+test: EvalSymlinks templatePath; stage-2 e2e for files_dir consumption

Two changes that fall out of one root cause discovered while preparing the local platform spin-up for the dev-department extraction (internal#77):

PROBLEM
CopyTemplateToContainer's filepath.Walk is called with templatePath set to the workspace's resolved files_dir. With the cross-repo symlink composition shipped in PR #5 (parent template's dev-lead → ../molecule-dev-department/dev-lead/), the Dev Lead workspace's files_dir is literally 'dev-lead', i.e. the symlink itself, not a path THROUGH the symlink. filepath.Walk does not descend into a symlink leaf: it Lstats the root, sees a symlink (mode bit set, not a directory), emits exactly one entry, and returns. Result: the workspace's /configs/ tar would ship empty.

The other 38 workspaces are fine because their files_dir paths just TRAVERSE the symlink (path resolution handles intermediate symlinks via Lstat traversal); only the leaf-is-symlink case breaks.

FIX
workspace-server/internal/provisioner/provisioner.go:
Call filepath.EvalSymlinks on templatePath before filepath.Walk. Resolves the leaf-symlink case for ALL templates, not just dev-dept. Security: templatePath has already passed resolveInsideRoot's path-string check at the call site; the trust boundary is the operator-side /org-templates/ filesystem layout, not this resolution step.

TEST
workspace-server/internal/handlers/local_e2e_dev_dept_test.go:
New TestLocalE2E_FilesDirConsumption: stage-2 of the local e2e. For every workspace in the resolved OrgTemplate, asserts:
  1. resolveInsideRoot(orgBaseDir, ws.FilesDir) succeeds.
  2. os.Stat on the result returns a directory.
  3. filepath.Walk after EvalSymlinks (mirroring the platform fix) emits at least one file.
  4. At least one workspace marker exists (workspace.yaml, system-prompt.md, or initial-prompt.md).
Exercises the SECOND half of POST /org/import that TestLocalE2E_DevDepartmentExtraction (PR #103) didn't cover.

VERIFIED LOCALLY (2026-05-08, against post-extraction Gitea state):
  --- PASS: TestLocalE2E_FilesDirConsumption (0.05s)
      checked 39 workspaces with files_dir
All 39 walk paths emit non-empty file sets with valid workspace markers.

REGRESSION GUARD
Without the EvalSymlinks fix, this test fails on Dev Lead with:
  files_dir 'dev-lead' at '/.../molecule-dev/dev-lead' is empty — CopyTemplateToContainer would produce empty /configs/

Refs:
internal#77: extraction RFC
molecule-core#102 (resolver symlink contract test)
molecule-core#103 (stage-1 e2e: include resolution)
Hongming GO 2026-05-08 ('go' on the 3 pre-spin-up optimizations)	2026-05-08 04:46:33 -07:00
claude-ceo-assistant	3adbbacf2e	test(local-e2e): verify dev-department extraction end-to-end via real resolveYAMLIncludes

Phase 4 (local-only) of internal#77 (dev-department extraction). Adds TestLocalE2E_DevDepartmentExtraction that exercises the FULL platform import path against the real molecule-ai-org-template-molecule-dev (post-slim) and molecule-ai/molecule-dev-department (post-atomize) repos cloned as siblings under /tmp/local-e2e-deploy/.

What it proves end-to-end:
- The dev-lead symlink at parent's template root is followed by resolveYAMLIncludes (filepath.Abs/Rel-style security check passes, os.ReadFile follows the link).
- Recursive !include chain through the symlinked subtree resolves: parent's org.yaml → !include dev-lead/workspace.yaml (symlinked) → !include ./core-lead/workspace.yaml → !include ./core-be/workspace.yaml (atomized children: paths, no '..').
- 39 workspaces enumerate after resolution: 5 PM-tree + 6 Marketing-tree + 28 dev-tree (Dev Lead + 5 sub-team leads + 18 leaf workspaces + 3 floaters + 1 triage-operator).
- Q1+Q2 placements verified by sentinel name check: 'Documentation Specialist' is reachable (under app-lead via app-docs sub-team), 'Triage Operator' is reachable (direct child of Dev Lead).

Test skips with t.Skipf if the local-e2e fixture isn't present on the host; it won't block CI on hosts that haven't set it up.

To set up locally:

    TESTROOT=/tmp/local-e2e-deploy
    mkdir -p $TESTROOT && cd $TESTROOT
    git clone https://git.moleculesai.app/molecule-ai/molecule-ai-org-template-molecule-dev.git molecule-dev
    git clone https://git.moleculesai.app/molecule-ai/molecule-dev-department.git
    cd /Users/<you>/molecule-core/workspace-server
    go test -v -run TestLocalE2E_DevDepartmentExtraction ./internal/handlers/

Verified locally 2026-05-08:
  --- PASS: TestLocalE2E_DevDepartmentExtraction (0.01s)
      total workspaces (recursive): 39

Refs:
internal#77: extraction RFC
molecule-core PR #102: symlink-resolution contract test
molecule-ai/molecule-dev-department PRs #1, #2, #3 (scaffold + extract + atomize)
molecule-ai/molecule-ai-org-template-molecule-dev PR #5 (parent slim + symlink wire)
Hongming GO 2026-05-08 ('lets not go for staging right now, we do local test first')
SOP Phase 4 (local): task #226	2026-05-08 04:24:47 -07:00
claude-ceo-assistant	78c4b9b74f	test(org-include): pin symlink-based subtree composition contract

Two new tests in workspace-server/internal/handlers/org_include_test.go:
- TestResolveYAMLIncludes_FollowsDirectorySymlink: parent template's org.yaml `!include`s into a sibling-repo subtree via a relative directory symlink. The resolver's filepath.Abs/Rel security check operates on path strings (passes), and os.ReadFile follows the symlink at OS layer (file content delivered). Recursive nested `!include`s within the symlinked subtree resolve correctly because filepath.Dir(absTarget) keeps the literal symlink path as currentDir.
- TestResolveYAMLIncludes_RejectsSymlinkEscapingRoot: companion test pinning current behavior where a symlink target outside the parent root is followed (resolveInsideRoot doesn't EvalSymlinks). Asserted as 'should resolve' so future hardening (if filepath.EvalSymlinks is added) flips the test red and forces a coordinated update to the dev-department subtree-composition pattern.

Why now: internal#77 RFC (dev-department extraction) selects symlink-based composition over a future platform-level external: ref. These tests pin the contract before the operator-side symlink convention gets shipped, so a refactor or hardening of the resolver can't silently break the production org-import path.

No production code changes. Pure additive test coverage.

Refs: internal#77 (Phase 3b verification, task #223)	2026-05-07 20:42:38 -07:00
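The path-string check that 78c4b9b74f pins down can be sketched as below. `lexicallyInsideRoot` is a hypothetical name and this is not the project's resolveInsideRoot; the sketch only demonstrates a check built from filepath.Abs and filepath.Rel, which compares path strings and never consults the filesystem.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// lexicallyInsideRoot (hypothetical) joins rel onto root and verifies, using
// path strings only, that the result does not climb above root. Because no
// symlink is resolved, a link inside root that points elsewhere still passes,
// which is the behavior the contract tests assert.
func lexicallyInsideRoot(root, rel string) (string, error) {
	absRoot, err := filepath.Abs(root)
	if err != nil {
		return "", err
	}
	absTarget, err := filepath.Abs(filepath.Join(absRoot, rel))
	if err != nil {
		return "", err
	}
	relToRoot, err := filepath.Rel(absRoot, absTarget)
	if err != nil {
		return "", err
	}
	if relToRoot == ".." || strings.HasPrefix(relToRoot, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("%q escapes root %q", rel, root)
	}
	return absTarget, nil
}
```

Adding filepath.EvalSymlinks before the comparison would turn this into a filesystem-aware check, and that is the hardening the companion test is designed to flag.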
claude-ceo-assistant	7f61206a18	Merge branch 'staging' into fix/saas-plugin-install-eic	2026-05-08 00:21:10 +00:00
devops-engineer	6d7554d282	chore: sync main → staging (auto, `d84d88ad`)	2026-05-07 23:38:08 +00:00
claude-ceo-assistant	6bb272360d	Merge branch 'main' into feat/issue-63-local-build-from-gitea-v2	2026-05-07 23:33:03 +00:00
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	b664691051	fix(eic-tunnel-pool): capture poolJanitorInterval at pool construction

Closes the chronic -race flake on TestPooledWithEICTunnel_PanicPoisonsEntry and the handlers package as a whole (CI / Platform (Go) was intermittent on staging, ~50% red on workspace-server-touching commits since 2026-04).

The race: tests swap the package-level poolJanitorInterval via t.Cleanup (eic_tunnel_pool_test.go:61) AFTER an earlier test caused the global pool's janitor goroutine to start. The janitor loops on time.NewTicker(poolJanitorInterval) on every tick, so the cleanup write races the goroutine read for the rest of the process. Caught locally + on PR #84's CI run on Gitea.

Fix: capture the interval as a field on eicTunnelPool at newEICTunnelPool(). The janitor now reads p.janitorInterval, which never changes after construction. Tests that override poolJanitorInterval before freshPool() still get the new value (they set the package var before construction). The global pool's janitor, created lazily once via sync.Once on first getEICTunnelPool(), is now immune to t.Cleanup-driven swaps from later tests.

Surfaced while verifying #84 (SaaS plugin install via EIC SSH); folded into this PR per the "fix root not symptom" rule rather than merging around a chronic-red CI signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 16:01:11 -07:00
devops-engineer	7e2cca7fad	chore: sync main → staging (auto, `e7660618`)	2026-05-07 23:00:21 +00:00
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	16868c4ec1	fix(plugins): SaaS (EC2-per-workspace) install/uninstall via EIC SSH
Closes the 🔴 docker-only row in docs/architecture/backends.md. Plugin install on every SaaS tenant currently 503s with "workspace container not running" because the handler is hardcoded to Docker exec but SaaS workspaces live on per-workspace EC2s. Caught on hongming.moleculesai.app when canvas POST /workspaces/<id>/plugins surfaced the error.
Mirrors the Files API PR #1702 pattern: dispatch on workspaces.instance_id in deliverToContainer (and Uninstall). When set, push the staged plugin tarball to the EC2 over the existing withEICTunnel primitive (template_files_eic.go) and unpack into the runtime's bind-mounted config dir (/configs for claude-code, /home/ubuntu/.hermes for hermes; see workspaceFilePathPrefix). chown 1000:1000 to match the docker path's agent-uid contract; restart via the existing dispatcher.
Direct host write rather than docker-cp via SSH because the runtime's config dir is already bind-mounted into the workspace container, so the runtime sees the files on next start with no additional plumbing.
Adds InstanceIDLookup (parallel to RuntimeLookup) so unit tests don't need a DB; production wires it in router.go like templates.go does.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 15:42:51 -07:00
claude-ceo-assistant	d9e380c5bc	feat(workspace-server): local-dev provisioner builds from Gitea source when MOLECULE_IMAGE_REGISTRY is unset (#63, Task #194)
OSS contributors who clone molecule-core and `go run ./workspace-server/cmd/server` now get a working end-to-end provision without authenticating to GHCR or AWS ECR.
Pre-fix: with MOLECULE_IMAGE_REGISTRY unset, the provisioner attempted to pull ghcr.io/molecule-ai/workspace-template-<runtime>:latest, which has been returning 403 since the 2026-05-06 GitHub-org suspension.
Post-fix: when MOLECULE_IMAGE_REGISTRY is unset, the provisioner switches to local-build mode: it looks up the workspace-template-<runtime> repo's HEAD sha on Gitea via a single API call, shallow-clones into ~/.cache/molecule/, and runs `docker build --platform=linux/amd64`. SHA-pinned cache key skips the clone+build entirely on subsequent provisions.
Production tenants are unaffected: every prod tenant sets the var to its private ECR mirror, so the SaaS pull path is byte-for-byte identical.
SSOT for mode detection lives in Resolve() (registry_mode.go) returning a discriminated RegistrySource{Mode, Prefix} so call sites that branch on mode get a compile-time push instead of a string-equality footgun.
Coverage:
* registry_mode.go: new SSOT (Resolve, RegistryMode, IsKnownRuntime)
* registry_mode_test.go: 8 tests pinning mode-decision contract
* localbuild.go: clone+build pipeline (570 LOC, fully unit-tested)
* localbuild_test.go: 22 tests covering happy/sad paths, fail-closed
* provisioner.go: Start() inserts ensureLocalImageHook in local mode
* docs/adr/ADR-002: design rationale + alternatives + security review
* docs/development/local-development.md: local-build flow + env overrides
Security:
* Allowlist-only runtime names (knownRuntimes) gate the clone path.
* Repo prefix hardcoded to git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-; forks via opt-in MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX.
* MOLECULE_GITEA_TOKEN masked in every log line via maskTokenInURL/maskTokenInString.
* Fail-closed: Gitea unreachable / runtime not mirrored → clear error, never silently fall back to GHCR/ECR.
* docker build invocation passes no --build-arg from external input.
* HTTP body cap 64KB on Gitea API responses (defence vs malicious upstream).
Closes #63 / Task #194.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 15:16:51 -07:00
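The mode-detection idea in d9e380c5bc (a single Resolve() returning a discriminated RegistrySource{Mode, Prefix}) might look something like the sketch below. The constant names ModeLocalBuild and ModeRegistry, the getenv parameter and the trailing-slash trimming are assumptions made for illustration; the commit message only names Resolve, RegistryMode, RegistrySource and the MOLECULE_IMAGE_REGISTRY variable.

```go
package main

import "strings"

// RegistryMode discriminates how the provisioner obtains workspace images.
type RegistryMode int

const (
	ModeLocalBuild RegistryMode = iota // hypothetical name: build from Gitea source
	ModeRegistry                       // hypothetical name: pull from the configured registry
)

// RegistrySource is the single value call sites branch on.
type RegistrySource struct {
	Mode   RegistryMode
	Prefix string // empty in local-build mode
}

// Resolve decides the mode from MOLECULE_IMAGE_REGISTRY. The getenv
// parameter is an assumption that keeps the sketch testable without
// touching the process environment; os.Getenv can be passed in production.
func Resolve(getenv func(string) string) RegistrySource {
	prefix := strings.TrimSpace(getenv("MOLECULE_IMAGE_REGISTRY"))
	if prefix == "" {
		return RegistrySource{Mode: ModeLocalBuild}
	}
	return RegistrySource{Mode: ModeRegistry, Prefix: strings.TrimRight(prefix, "/")}
}
```

Call sites then switch on src.Mode rather than comparing registry strings, which is the "compile-time push" the message refers to.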
security-auditor	5b7b669b4c	docs(ratelimit): tighten dev-mode comment after keyFor refactor
The previous comment said "all share one IP bucket", which was accurate before the keyFor refactor and is slightly stale after it. The dev-mode rationale (bucket fills fast, blanks the page on a single-user dev box) is unchanged; only the bucket-key flavour text needed updating.
Doc-only follow-up from #60's hostile self-review #3. No behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 14:57:21 -07:00
security-auditor	9dda84d671	fix(ratelimit): tenant-aware bucket keying, close canvas 429 storm
Closes #59.
Symptom: /workspaces/:id/activity returns 429 with rate-limit-exceeded on hongming.moleculesai.app whenever multiple workspaces are visible in the canvas. Single-tab, single-user, well within the documented 600 req/min budget, but every request collapsed into one bucket.
Root cause: workspace-server's RateLimiter keyed buckets on c.ClientIP(). After issue #179 turned off proxy-header trust (SetTrustedProxies(nil), correctly closing the XFF spoofing hole), c.ClientIP() returns the TCP RemoteAddr, which in production is the upstream proxy (Caddy on per-tenant EC2; CP/Vercel on the SaaS plane). Every browser tab + every canvas consumer + every poll loop for every tenant collapsed into one bucket.
Fix: bucket key derivation moves into a single keyFor helper that mirrors the SSOT pattern of:
- molecule-controlplane/internal/middleware/ratelimit.go (org > user > IP)
- this package's own MCPRateLimiter (token-hash via tokenKey)
Priority: X-Molecule-Org-Id header → SHA-256(Authorization Bearer) → ClientIP. Token values are kept hashed in the bucket map so the in-memory state can't become a token dump.
Tests:
- TestKeyFor_OrgIdHeaderTrumpsBearerAndIP: priority order
- TestKeyFor_BearerTokenWhenNoOrgId: middle tier + raw-token leak pin
- TestKeyFor_IPFallbackWhenNoOrgIdNoBearer: anon probe path
- TestRateLimit_TwoOrgsSameIP_IndependentBuckets: load-bearing regression (issue #59); two tenants behind same upstream proxy must not share a bucket
- TestRateLimit_TwoTokensSameIP_IndependentBuckets: same shape for the per-tenant Caddy box
- TestRateLimit_SameOrgDifferentTokens_SharedBucket: counter-pin; rotating tokens within one org must NOT bypass the org's quota
- TestRateLimit_Middleware_RoutesThroughKeyFor: AST gate, mirrors the SSOT gates established in #36/#10/#12
Mutation-tested:
- strip org-id branch in keyFor → 3 tests fail
- strip bearer-token branch → 2 tests fail
- reintroduce direct c.ClientIP() in Middleware → 3 tests fail (including the AST gate)
Existing tests pass unchanged: dev-mode fail-open, X-RateLimit-* headers (#105), Retry-After on 429 (#105), XFF anti-spoofing (#179).
No schema/API change. 429 response body and X-RateLimit-* headers unchanged. RATE_LIMIT env var semantics unchanged.
Hostile self-review (three weakest spots) is in the issue body:
1. one-shot Docker-inspect cost is now bucket-key derivation cost (string compare + SHA-256 of bearer); single-digit microseconds.
2. X-Molecule-Org-Id is unvalidated at the rate-limiter layer; spoofing is closed by tenant SG + CP front; documented in keyFor's docstring with the conditions under which to revisit.
3. cpProv-style SaaS surface is out of scope; CP's own limiter handles that hop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 14:51:08 -07:00
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	25fb696965	chore: reconcile main → staging post-suspension divergence
Refs Task #165 (Class D AUTO_SYNC_TOKEN plumbing).
main and staging diverged after the 2026-05-06 GitHub-org suspension because Class D / Class G / feature work landed on staging while unrelated CI fixes (#34-47, ECR auth-inline, buildx→docker, pre-clone manifest deps) landed straight on main. Both branches edited the same workflow files, so every push to main triggered an Auto-sync run that aborted at `git merge --no-ff origin/main` with 7 content conflicts:
- .github/workflows/canary-verify.yml (URL: github.com → Gitea)
- .github/workflows/ci.yml (3 URL refs)
- .github/workflows/publish-runtime.yml (cascade: HTTP repo-dispatch → Gitea push)
- .github/workflows/publish-workspace-server-image.yml (drop AWS-action steps; ECR auth is inline)
- .github/workflows/retarget-main-to-staging.yml (URL)
- manifest.json (lowercase org slug + add mock-bigorg from main)
- scripts/clone-manifest.sh (keep main's MOLECULE_GITEA_TOKEN auth path + drop awk-tolower since manifest is now lowercase)
Resolution: union. The post-suspension Gitea/ECR migrations from staging win on URL/policy edits; main's additive work (mock-bigorg manifest entry, inline ECR auth, MOLECULE_GITEA_TOKEN basic-auth) is preserved on top.
After this lands, staging is a strict superset of main, so the next auto-sync run on a push to main will be a clean fast-forward / no-op. The auto-sync workflow on main also picks up staging's AUTO_SYNC_TOKEN swap (Class D #26) for free, fixing the latent layer-2 push-auth issue.
Verified locally:
- bash -n scripts/clone-manifest.sh
- python -c 'yaml.safe_load(...)' on each touched workflow
- python -c 'json.load(open(manifest.json))' (21 plugins, 9 templates, 7 org_templates)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 14:24:37 -07:00
devops-engineer	e16d7eaa08	fix(ci): apply pre-clone fix to platform Dockerfile too (followup #173)
The first PR (#38) only patched Dockerfile.tenant, but the workflow also builds the platform image from workspace-server/Dockerfile, which had the SAME in-image `git clone` stage. Build run #794 caught this: "process clone-manifest.sh ... exit code 128" on the platform image.
Apply the same pre-clone shape to the platform Dockerfile: drop the `templates` stage, COPY from .tenant-bundle-deps/ instead. The workflow's existing "Pre-clone manifest deps" step (added in #38) already populates .tenant-bundle-deps/ before either build runs, so no workflow change needed.
Self-review note: the missed-platform-Dockerfile is a Phase 1 quality miss: I read both files but only registered the tenant one as in-scope. Saved memory `feedback_orchestrator_must_verify_before_declaring_fixed` applies: should have grepped the whole workspace-server/ for "templates" stages before claiming Task #173 done. CI run #794 caught it within ~6 minutes; net cost: one followup commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 13:13:13 -07:00
Hongming Wang	17f1f30b3f	fix(test): drain coalesceRestart goroutines before t.Cleanup (Class H, #170)
TestPooledWithEICTunnel_PreservesFnErr (and any sqlmock-using neighbour test) was at risk of inheriting stale INSERT calls from a previous test's coalesceRestart goroutine that survived its t.Cleanup boundary. The production callsite shape is `go h.RestartByID(...)` from a2a_proxy.go, a2a_proxy_helpers.go and main.go. When that goroutine's runRestartCycle panics, coalesceRestart's deferred recover swallows it to keep the platform process alive, but in tests, nothing waits for the goroutine to fully exit. If it's still draining LogActivity-shaped work after the test returns, those INSERTs land in the next test's sqlmock connection as kind=DELEGATION_FAILED / kind=WORKSPACE_PROVISION_FAILED, surfacing as "INSERT-not-expected".
Fix: introduce drainCoalesceGoroutine(t, wsID, cycle) test helper that spawns coalesceRestart on a goroutine (matching production) and registers a t.Cleanup with sync.WaitGroup.Wait so the test can't declare itself done while a goroutine is still alive. Convert TestCoalesceRestart_PanicInCycleClearsState to use the helper (previously it called coalesceRestart synchronously, which never exercised the production goroutine-survival contract).
Add TestCoalesceRestart_DrainHelperWaitsForGoroutineExit as the regression guard: cycle blocks 150ms then panics; the test asserts t.Run elapsed >= 150ms (proving the Wait barrier engaged) AND the deferred close ran (proving the panic-recovery defer chain executed) AND state.running was cleared. Verified the assertion is real by mutation-testing: removing t.Cleanup(wg.Wait) makes this test FAIL deterministically with elapsed <300µs.
Per saved memory feedback_assert_exact_not_substring: the regression test asserts an exact-shape contract (elapsed >= blockFor) rather than a substring-in-output, so it discriminates between "drain works" and "drain skipped".
Per Phase 3: 10/10 race-detector runs pass for all TestCoalesceRestart_* tests. Full ./internal/handlers/... suite green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 13:13:13 -07:00
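The drain pattern from 17f1f30b3f (spawn the work on a goroutine as production does, then register a cleanup that blocks on sync.WaitGroup.Wait) can be sketched roughly like this. It is not the repository's drainCoalesceGoroutine: the names drainGoroutine and runRecovered are hypothetical, the wsID argument is dropped, and the helper takes a registerCleanup function (in a real test, t.Cleanup) so the sketch does not depend on the testing package.

```go
package main

import "sync"

// runRecovered runs cycle and swallows a panic, mirroring the deferred
// recover that keeps the platform process alive. It reports whether a
// panic occurred.
func runRecovered(cycle func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	cycle()
	return false
}

// drainGoroutine starts cycle on its own goroutine and registers a cleanup
// that blocks until that goroutine has fully exited. In a test, pass
// t.Cleanup as registerCleanup so the test cannot finish while the
// goroutine is still doing work that might leak into the next test.
func drainGoroutine(registerCleanup func(func()), cycle func()) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		runRecovered(cycle)
	}()
	registerCleanup(wg.Wait)
}
```

In a test this would be called as drainGoroutine(t.Cleanup, cycle); dropping the registerCleanup(wg.Wait) line reproduces the original hazard, where the test returns while the goroutine is still alive.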
devops-engineer	55689e0b10	fix(post-suspension): migrate github.com/Molecule-AI refs to git.moleculesai.app (Class G #168)
The GitHub org Molecule-AI was suspended on 2026-05-06; canonical SCM is now Gitea at https://git.moleculesai.app/molecule-ai/. Stale github.com/Molecule-AI/... URLs return 404 and break tooling that clones / pip-installs / curls them. This bundles all non-Go-module URL fixes for this repo into a single PR.
Go module path references (in *.go, go.mod, go.sum) are out of scope here -- tracked separately under Task #140.
Token-auth clone URLs also flip ${GITHUB_TOKEN} -> ${GITEA_TOKEN} since the GitHub token does not auth against Gitea.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 13:08:15 -07:00
devops-engineer	a6d67b4c68	fix(ci): pre-clone manifest deps in workflow, drop in-image clone (closes #173)

publish-workspace-server-image.yml could not run on Gitea Actions because Dockerfile.tenant's stage 3 ran `git clone` against private Gitea repos from inside the Docker build context, where no auth path exists. Every workspace-server rebuild required a manual operator-host push.

Move cloning to the trusted CI context (where AUTO_SYNC_TOKEN, the devops-engineer persona PAT, is naturally available). Dockerfile.tenant now COPYs from .tenant-bundle-deps/, populated by the workflow's new "Pre-clone manifest deps" step. The Gitea token never enters the image.

- scripts/clone-manifest.sh: optional MOLECULE_GITEA_TOKEN env embeds basic-auth in the clone URL; redacted in log output. Anonymous fallback preserved for future public-repo path.
- .github/workflows/publish-workspace-server-image.yml: new pre-clone step before docker build; injects AUTO_SYNC_TOKEN. Fail-fast if the secret is empty.
- workspace-server/Dockerfile.tenant: drop stage 3 (templates), COPY from .tenant-bundle-deps/ instead. Header documents the prereq.
- .gitignore: ignore /.tenant-bundle-deps/ so a local build can't accidentally commit cloned repos.

Verified locally: clone-manifest.sh with the devops-engineer persona token cloned all 37 repos (9 ws + 7 org + 21 plugins, 4.9MB after .git strip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 12:59:46 -07:00
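The token handling described for scripts/clone-manifest.sh (embed basic-auth in the clone URL, redact it in logs, fall back to anonymous when no token is set) can be illustrated with a small Go sketch. This is not the shell script itself: the function name `cloneURL`, the `oauth2` username, and the example host are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// cloneURL builds an HTTPS clone URL for repo under base. A non-empty token
// is embedded as basic-auth userinfo; an empty token yields an anonymous URL.
// The "oauth2" username is a placeholder assumption, not taken from the script.
func cloneURL(base, repo, token string) (*url.URL, error) {
	u, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/" + repo + ".git"
	if token != "" {
		u.User = url.UserPassword("oauth2", token)
	}
	return u, nil
}

func main() {
	// Hypothetical host and repo name, used only for the demonstration.
	u, err := cloneURL("https://git.example.invalid/molecule-ai", "demo-repo", "s3cret")
	if err != nil {
		panic(err)
	}
	// Log only the redacted form; u.String() is what would be handed to git.
	fmt.Println("cloning", u.Redacted())
}
```

`url.URL.Redacted` masks only the password component, so the sketch relies on the token being carried as the password rather than the username.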
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	be5fbb5ad3	fix(workspace-server): a2a-proxy preflight container check (closes #36)

Same SSOT-divergence shape as #10 / fixed in #12, but on the a2a-proxy code path. The plugin handler was routed through `provisioner.RunningContainerName`; a2a-proxy was forwarding optimistically and only catching missing containers REACTIVELY via `maybeMarkContainerDead` after the network call timed out. Result on tenants whose agent containers had been recycled (e.g. post-EC2 replace from molecule-controlplane#20): canvas waits 2-30s for the network forward to fail before getting a 503, and the workspace-server logs only "ProxyA2A forward error" without the "container is dead" signal.

This PR adds a proactive `Provisioner.IsRunning` check in `proxyA2ARequest` between `resolveAgentURL` and `dispatchA2A`, gated on the conditions where we know we're talking to a sibling Docker container we own (`h.provisioner != nil` AND `platformInDocker` AND the URL was rewritten to Docker-DNS form). Three outcomes via the SSOT helper:

(true, nil) → forward as today
(false, nil) → fast-503 with `error="workspace container not running — restart triggered"`, `restarting=true`, `preflight=true`, plus the same offline-flip + WORKSPACE_OFFLINE broadcast + async restart that `maybeMarkContainerDead` produces
(true, err) → fall through to optimistic forward (matches IsRunning's "fail-soft as alive" contract: a flaky daemon must not trigger a restart cascade)

The `preflight=true` flag in the response distinguishes the proactive short-circuit from the reactive `maybeMarkContainerDead` path so canvas or downstream callers can render distinct messages later.

* `internal/handlers/a2a_proxy.go`: preflight call site between resolveAgentURL and dispatchA2A; gated on `h.provisioner != nil && platformInDocker && url == http://<ContainerName(id)>:port`.
* `internal/handlers/a2a_proxy_helpers.go`: `preflightContainerHealth` helper. Routes through `h.provisioner.IsRunning` (which itself wraps `RunningContainerName`). Identical offline-flip side-effects as `maybeMarkContainerDead` for the dead-container case.
* `internal/handlers/a2a_proxy_preflight_test.go`: 4 tests: running → nil; not-running → structured 503 + sqlmock expectations on the offline-flip + structure_events insert; transient error → nil (fail-soft); AST gate pinning the SSOT routing (mirror of #12's gate).

Mutation-tested: removing the `if running { return nil }` guard makes the production code fail to compile (unused var). A subtler mutation (replacing the !running branch with `return nil`) would make TestPreflight_ContainerNotRunning_StructuredFastFail fail at runtime with sqlmock's "expected DB call did not occur."

Refs: molecule-core#36. Companion to #12 (issue #10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 11:15:08 -07:00
Hongming Wang	d64641904f	feat(workspace-server): mock runtime + mock-bigorg org template

Adds a 'mock' runtime: virtual workspaces with no container, no EC2, no LLM. Every A2A reply is synthesised from a small canned-variant pool ('On it!', 'Got it, on it now.', etc.) deterministically seeded by (workspace_id, request_id).

Built for the funding-demo "200-workspace mock org": it renders an enterprise-scale org chart on the canvas (CEO/VPs/Managers/ICs) without burning real LLM credits or provisioning 200 EC2 instances.

Surfaces:
- workspace-server/internal/handlers/mock_runtime.go: A2A proxy short-circuit, canned-reply pool, deterministic variant pick.
- workspace-server/internal/handlers/a2a_proxy.go: gate the short-circuit before resolveAgentURL (mock has no URL).
- workspace-server/internal/handlers/org_import.go: skip Docker provisioning for mock workspaces, set status='online' directly, drop the per-sibling 2s pacing for mock children (collapses a 200-workspace import from ~7min → ~1s).
- workspace-server/internal/handlers/runtime_registry.go: register 'mock' in the runtime allowlist (manifest + fallback set).
- workspace-server/internal/registry/healthsweep.go + orphan_sweeper.go: skip mock workspaces in container-health and stale-token sweeps (no container by design).
- workspace-server/internal/handlers/workspace_restart.go: mirror the 'external' Restart no-op for mock.
- manifest.json: register the new Molecule-AI/molecule-ai-org-template-mock-bigorg repo.

Tests: 5 new in mock_runtime_test.go covering happy-path, non-mock regression guard, determinism, IsMockRuntime trim/case, JSON-RPC id echo. All existing handler + registry tests still pass.

Local-verified: imported the 200-workspace template against a fresh postgres+redis, confirmed all 200 land in 'online' and stay there through the 30s health-sweep window, exercised A2A on CEO + VPs + Managers + ICs and saw the variant pool rotate.

Org template lives at Molecule-AI/molecule-ai-org-template-mock-bigorg (created today) and is imported via the existing /org/import flow on the canvas Template Palette.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 08:40:37 -07:00
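One way to get a reply that is "deterministically seeded by (workspace_id, request_id)" is to hash the pair and index into the pool. The Go sketch below assumes FNV-1a as the hash and uses `pickVariant` as a hypothetical name; the commit does not say how mock_runtime.go actually derives the index, and the pool here contains only the two variants quoted in the message.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// cannedReplies holds only the two variants quoted in the commit message;
// the real pool is described as larger.
var cannedReplies = []string{"On it!", "Got it, on it now."}

// pickVariant chooses a reply from pool based solely on the two IDs, so the
// same (workspaceID, requestID) pair always yields the same reply.
// pool must be non-empty. FNV-1a is an assumption made for this sketch.
func pickVariant(pool []string, workspaceID, requestID string) string {
	h := fnv.New32a()
	h.Write([]byte(workspaceID))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	h.Write([]byte(requestID))
	return pool[h.Sum32()%uint32(len(pool))]
}

func main() {
	// Hypothetical IDs, used only for the demonstration.
	first := pickVariant(cannedReplies, "ws-ceo", "req-1")
	again := pickVariant(cannedReplies, "ws-ceo", "req-1")
	fmt.Println(first == again)
}
```

Because no random source or clock is involved, replaying a request reproduces the same canned reply, which is the property the determinism test in the commit pins.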
devops-engineer	10e510f50c	chore: drop github-app-auth + swap GHCR→ECR (closes #157, #161)

Two coupled cleanups for the post-2026-05-06 stack:

============================================

The plugin injected GITHUB_TOKEN/GH_TOKEN via the App's installation-access flow (~hourly rotation). Per-agent Gitea identities replaced this approach after the 2026-05-06 suspension: workspaces now provision with a per-persona Gitea PAT from .env instead of an App-rotated token. The plugin code itself lived on github.com/Molecule-AI/molecule-ai-plugin-github-app-auth which is also unreachable post-suspension; checking it out at CI build time was already failing.

Removed:
- workspace-server/cmd/server/main.go: githubappauth import + the `if os.Getenv("GITHUB_APP_ID") != ""` block that called BuildRegistry. gh-identity remains as the active mutator.
- workspace-server/Dockerfile + Dockerfile.tenant: COPY of the sibling repo + the `replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin` directive injection.
- workspace-server/go.mod + go.sum: github-app-auth dep entry (cleaned up by `go mod tidy`).
- 3 workflows: actions/checkout steps for the sibling plugin repo:
  - .github/workflows/codeql.yml (Go matrix path)
  - .github/workflows/harness-replays.yml
  - .github/workflows/publish-workspace-server-image.yml

Verified `go build ./cmd/server` + `go vet ./...` pass post-removal.

=======================================================

Same workflow used to push to ghcr.io/molecule-ai/platform + platform-tenant. ghcr.io/molecule-ai is gone post-suspension. The operator's ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/) already hosts platform-tenant + workspace-template-* + runner-base images and is the post-suspension SSOT for container images. This PR aligns publish-workspace-server-image with that stack.

- env.IMAGE_NAME + env.TENANT_IMAGE_NAME repointed to ECR URL.
- docker/login-action swapped for aws-actions/configure-aws-credentials@v4 + aws-actions/amazon-ecr-login@v2 chain (the standard ECR auth pattern; uses AWS_ACCESS_KEY_ID/SECRET secrets bound to the molecule-cp IAM user).

The :staging-<sha> + :staging-latest tag policy is unchanged: staging-CP's TENANT_IMAGE pin still points at :staging-latest, just with the new registry prefix.

Refs molecule-core#157, #161; parallel to org-wide CI-green sweep.	2026-05-07 07:48:51 -07:00
devops-engineer	1d8c101c94	chore: drop github-app-auth + swap GHCR→ECR (closes #157, #161)

Two coupled cleanups for the post-2026-05-06 stack:

#157: drop molecule-ai-plugin-github-app-auth
============================================

The plugin injected GITHUB_TOKEN/GH_TOKEN via the App's installation-access flow (~hourly rotation). Per-agent Gitea identities replaced this approach after the 2026-05-06 suspension: workspaces now provision with a per-persona Gitea PAT from .env instead of an App-rotated token. The plugin code itself lived on github.com/Molecule-AI/molecule-ai-plugin-github-app-auth which is also unreachable post-suspension; checking it out at CI build time was already failing.

Removed:
- workspace-server/cmd/server/main.go: githubappauth import + the `if os.Getenv("GITHUB_APP_ID") != ""` block that called BuildRegistry. gh-identity remains as the active mutator.
- workspace-server/Dockerfile + Dockerfile.tenant: COPY of the sibling repo + the `replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin` directive injection.
- workspace-server/go.mod + go.sum: github-app-auth dep entry (cleaned up by `go mod tidy`).
- 3 workflows: actions/checkout steps for the sibling plugin repo:
  - .github/workflows/codeql.yml (Go matrix path)
  - .github/workflows/harness-replays.yml
  - .github/workflows/publish-workspace-server-image.yml

Verified `go build ./cmd/server` + `go vet ./...` pass post-removal.

#161: swap GHCR→ECR for publish-workspace-server-image
=======================================================

Same workflow used to push to ghcr.io/molecule-ai/platform + platform-tenant. ghcr.io/molecule-ai is gone post-suspension. The operator's ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/) already hosts platform-tenant + workspace-template-* + runner-base images and is the post-suspension SSOT for container images. This PR aligns publish-workspace-server-image with that stack.

- env.IMAGE_NAME + env.TENANT_IMAGE_NAME repointed to ECR URL.
- docker/login-action swapped for aws-actions/configure-aws-credentials@v4 + aws-actions/amazon-ecr-login@v2 chain (the standard ECR auth pattern; uses AWS_ACCESS_KEY_ID/SECRET secrets bound to the molecule-cp IAM user).

The :staging-<sha> + :staging-latest tag policy is unchanged: staging-CP's TENANT_IMAGE pin still points at :staging-latest, just with the new registry prefix.

Refs molecule-core#157, #161; parallel to org-wide CI-green sweep.	2026-05-07 05:12:06 -07:00
claude-ceo-assistant	6a7dcd287c	Merge pull request 'feat(canvas/chat-server): canvas consumes /chat-history + server-side row-aware reverse (RFC #2945 PR-C-2)' (#4) from feat/rfc-2945-pr-c-2-canvas-chat-history into staging	2026-05-07 11:38:54 +00:00
claude-ceo-assistant	b49bdde997	Merge pull request 'fix(workspace-server): CP orphan sweeper closes deprovision split-write race (#2989)' (#2) from fix/cp-orphan-sweeper-2989 into staging	2026-05-07 11:38:48 +00:00
claude-ceo-assistant	f51722411b	Merge branch 'main' into fix/issue10-runtime-aware-plugin-install	2026-05-07 11:26:14 +00:00
claude-ceo-assistant	624ef4d06d	perf(workspace-server,canvas): EIC tunnel pool + canvas Promise.all (closes core#11)

## Symptom

Canvas detail-panel "config + filesystem load" took ~20s. Reported on production hongming tenant, workspace c7c28c0b-... (Claude Code Agent T2).

## Two stacked latency sources

### 1. Server-side: per-call EIC tunnel setup (~80% of the win)

`workspace-server/internal/handlers/template_files_eic.go::realWithEICTunnel` performed ssh-keygen + SendSSHPublicKey + open-tunnel + waitForPort PER call. 4 callers (read/write/list/delete) each paid the full ~3-5s setup cost even when fired back-to-back on the same workspace EC2.

Fix: refcounted pool keyed on instanceID with TTL ≤ 50s (under the 60s SendSSHPublicKey grant). One tunnel serves N file ops; concurrent acquires for the same instance share the slot via a pendingSetups gate; LRU eviction caps simultaneous tracked instances at 32. Poisons entries on tunnel-fatal errors (connection refused, broken pipe, auth failed) so the next acquire builds fresh. Cleanup on panic via defer-release pattern (added after self-review caught a refcount-leak hazard).

Public API unchanged: `var withEICTunnel` rebinds to `pooledWithEICTunnel` at package init, so all 4 callers inherit pooling for free.

10 unit tests pin: 4-ops-amortise (1 setup), different-instances-do-not-share, TTL eviction, poison invalidates, concurrent-acquire-single-setup, TTL=0 escape hatch, LRU eviction at cap, error classification heuristic, refcount blocks expired eviction, panic poisons entry. All green.

### 2. Canvas-side: serial fan-out + duplicate fetch (~20% of the win)

`canvas/src/components/tabs/ConfigTab.tsx::loadConfig` awaited 3 independent metadata GETs (`/workspaces/{id}`, `/model`, `/provider`) serially. `AgentCardSection` fired a SECOND `/workspaces/{id}` from its own useEffect.

Fix: Promise.all over the 3 metadata GETs (each leg keeps its existing .catch fallback semantics). AgentCardSection now reads `agentCard` from the canvas store (`useCanvasStore`) instead of refetching; the canvas already hydrates `node.data.agentCard` from the platform event stream. Defensive selector handles test mocks without a `nodes` array.

## Verification

- `go test ./internal/handlers/` 5.07s green (full handlers package, including 10 new pool tests)
- `go vet ./internal/handlers/` clean
- `npx vitest run`: 1380/1380 canvas unit tests pass (2 test FILES fail on a pre-existing xyflow CSS-load issue in vitest config, unrelated to this change)
- `npx tsc --noEmit` clean

Live wall-time verification deferred to Phase 4 / E2E (canvas browser session required; external probe blocked by 403 since the canvas auth chain is session-cookie + Origin header, not a bearer token I can fabricate).

## Backwards compatibility

API surface unchanged. All 4 EIC handler callers use the rebound var; no caller migration. Pool defaults to enabled (TTL=50s); tests can disable by setting poolTTL=0 or by overwriting withEICTunnel directly (existing stub pattern in template_files_eic_dispatch_test.go preserved).

## Hostile self-review (3 weakest spots)

1. `fnErrIndicatesTunnelFault` is a substring grep on err.Error(). The marker list is hand-curated and ssh client error formats vary across OpenSSH versions. A future ssh that reports a tunnel failure via a phrasing not in the list would NOT poison the entry → next callers reuse a dead tunnel until TTL evicts. Acceptable: TTL bounds the impact (≤50s of bad reuse), and the heuristic covers every tunnel-error shape that appears in the existing test fixtures and known incidents.
2. `acquire`'s for-loop has unbounded retry potential under pathological churn (signal closed → new acquirer → setup fails → repeat). No bounded retry counter. Today no test exercises "flaky setup that succeeds-then-fails-then-succeeds"; if observability ever shows this shape, add a max-retry guard. Filed as a known limitation, not blocking.
3. The `strings.Contains`-style substring assertion I used for tunnel-fault classification could false-positive on app-level error messages that happen to contain "permission denied" or "broken pipe" verbatim. The classification test covers the discriminator but only against the error shapes we know today. Acceptable: poisoning errs on the side of building fresh, which is correct-but-slightly-slow rather than incorrect.

## Phase 4 / E2E plan

- Live timing of the canvas detail-panel open against a real workspace (browser session, not external probe).
- Target: perceived latency under 2s on warm pool. Cold open still pays one tunnel setup (~3-5s); the pool buys you the SECOND through Nth panel-open within the TTL window.
- Memory `feedback_chase_verification_to_staging` applies: will not declare done at PR-merge; will follow through to user-visible behavior on staging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 23:17:58 -07:00
security-auditor	c1de2287fd	fix(workspace-server): SSOT-route container check + 422 on external runtimes Two coupled fixes for molecule-core#10 (plugin 
install 503 vs status=online split-state): 1. SSOT for "is this workspace's container running" — `findRunningContainer` in plugins.go used to carry its own copy of `cli.ContainerInspect`, which collapsed transient daemon errors into the same `""` return as a genuinely-stopped container. Healthsweep's `Provisioner.IsRunning` handled the same input correctly (defensive). Promote the inspect logic to `provisioner.RunningContainerName`, route both consumers through it. Transient errors get a distinct log line on the plugins side so triage doesn't confuse a flaky daemon with a stopped container. 2. Runtime-aware Install/Uninstall — `runtime='external'` workspaces have no local container; push-install via docker exec is meaningless. They pull plugins via the download endpoint instead (Phase 30.3). Without a guard they fell through to `findRunningContainer` and 503'd with a misleading "container not running." Add an early 422 with a hint pointing at the download endpoint. The two fixes are independent: (1) preserves correctness when the SSOT helper is later modified; (2) eliminates the persistent split-state on the 5 external persona-agent workspaces in this DB (and on tenant deployments hitting the same shape). * `internal/provisioner/provisioner.go` — new `RunningContainerName(ctx, cli, id) (string, error)` with three documented outcomes (running / stopped / transient). `Provisioner.IsRunning` now wraps it; behavior preserved. * `internal/handlers/plugins.go` — `findRunningContainer` shimmed onto `RunningContainerName`; new `isExternalRuntime(id)` predicate. * `internal/handlers/plugins_install.go` — Install + Uninstall reject external runtimes with 422 + hint, before the source-fetch step. * `internal/handlers/plugins_install_external_test.go` — 5 cases: external→422, uninstall-external→422, container-backed-falls-through, no-runtime-lookup-fails-open, lookup-error-fails-open. 
* `internal/handlers/plugins_findrunning_ssot_test.go` — two AST gates pin the SSOT routing so future PRs can't silently re-introduce the parallel impl. Mutation-tested: reverting either consumer to a direct `ContainerInspect` makes the gate fail. Refs: molecule-core#10 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:58:20 -07:00
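The three documented outcomes of `RunningContainerName` (running / stopped / transient) can be sketched as follows. This is an illustration under assumptions, not the provisioner code: the `inspector` interface, the `errNoSuchContainer` sentinel, and the choice to treat a missing container as "stopped" are all hypothetical, and the real function also takes a `ctx` and a Docker client.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSuchContainer is a hypothetical sentinel standing in for the daemon's
// "container does not exist" answer.
var errNoSuchContainer = errors.New("no such container")

// errDaemonTimeout is a sample transient failure for experimentation.
var errDaemonTimeout = errors.New("daemon timeout")

// inspector is a hypothetical stand-in for the Docker client's inspect call.
type inspector interface {
	InspectRunning(name string) (running bool, err error)
}

// runningContainerName distinguishes three outcomes:
//
//	(name, nil) the container is running
//	("", nil)   the container is stopped (or, by assumption, missing)
//	("", err)   the daemon could not answer; the caller must not read this as "stopped"
func runningContainerName(cli inspector, name string) (string, error) {
	running, err := cli.InspectRunning(name)
	switch {
	case errors.Is(err, errNoSuchContainer):
		return "", nil
	case err != nil:
		return "", fmt.Errorf("inspect %s: %w", name, err)
	case !running:
		return "", nil
	default:
		return name, nil
	}
}

// fakeInspector returns canned answers, for tests and experiments.
type fakeInspector struct {
	running bool
	err     error
}

func (f fakeInspector) InspectRunning(string) (bool, error) { return f.running, f.err }
```

Keeping the transient case as a non-nil error is what lets a caller log it distinctly instead of collapsing it into the same `""` return as a stopped container.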
security-auditor	f3187ea0c1	fix(workspace-server): default-bind to 127.0.0.1 in dev-mode fail-open In dev mode (`MOLECULE_ENV=dev\|development`, `ADMIN_TOKEN` 
unset) the AdminAuth chain fails open by design so canvas at :3000 can call workspace-server at :8080 without a bearer token. Combined with the existing wildcard bind on `:8080`, that exposed unauthenticated `POST /workspaces` to any same-LAN peer (S-8 in the audit RFC v1). Couple the bind narrowness to the same signal that drives the auth fail-open: when `middleware.IsDevModeFailOpen()` returns true, default the listener to `127.0.0.1`. Production (`ADMIN_TOKEN` set) keeps binding to all interfaces — its auth chain is doing the work. Operators who need LAN exposure set `BIND_ADDR=<host>` explicitly. * `cmd/server/main.go` — `resolveBindHost()` precedence: BIND_ADDR explicit > IsDevModeFailOpen() loopback > "" (all interfaces). Startup log line now includes the resolved bind + dev-mode-fail-open state for post-deploy auditing. * `cmd/server/bind_test.go` — 8 t.Setenv table cases covering precedence, explicit overrides, dev/prod env words. Mutation-tested: removing the `IsDevModeFailOpen()` branch makes the dev-mode cases fail with "" vs "127.0.0.1". Refs: molecule-core#7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 22:29:24 -07:00
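The bind precedence described above (explicit `BIND_ADDR` > dev-mode fail-open loopback > all interfaces) might look roughly like this. The environment variable names come from the commit message. Splitting the logic into pure functions plus an env-reading wrapper is an assumption of this sketch, and the real `middleware.IsDevModeFailOpen()` may check more than shown here.

```go
package main

import "os"

// isDevModeFailOpen mirrors the condition stated in the commit message:
// MOLECULE_ENV is dev or development and ADMIN_TOKEN is unset.
func isDevModeFailOpen(env, adminToken string) bool {
	return adminToken == "" && (env == "dev" || env == "development")
}

// resolveBindHost applies the precedence:
// explicit BIND_ADDR > dev-mode loopback > "" (all interfaces).
func resolveBindHost(bindAddr string, devFailOpen bool) string {
	if bindAddr != "" {
		return bindAddr
	}
	if devFailOpen {
		return "127.0.0.1"
	}
	return ""
}

// resolveBindHostFromEnv wires the pure functions to the process environment.
func resolveBindHostFromEnv() string {
	return resolveBindHost(
		os.Getenv("BIND_ADDR"),
		isDevModeFailOpen(os.Getenv("MOLECULE_ENV"), os.Getenv("ADMIN_TOKEN")),
	)
}
```

Keeping the decision pure makes the table-driven cases trivial to write without `t.Setenv`, although the commit itself tests through the environment.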
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	75a72bf5a2	feat(canvas/chat-server): canvas consumes /chat-history + server-side row-aware reverse (RFC #2945 PR-C-2) Closes the SSOT story shipped in PR-C/D: canvas now consumes the 
typed /chat-history endpoint instead of /activity?type=a2a_receive, and the server emits messages in display-ready chronological order so the client doesn't have to re-order them. ## Canvas (consumer migration) - loadMessagesFromDB swaps from /activity to /chat-history. - Drops type=a2a_receive + source=canvas params (server applies the filter centrally now). - Drops [...activities].reverse() — wire is already display-ready. - Drops the local INTERNAL_SELF_MESSAGE_PREFIXES constant + isInternalSelfMessage helper. Server-side IsInternalSelfMessage applies the same predicate before emitting rows. - Drops the activityRowToMessages + ActivityRowForHydration imports from historyHydration.ts. The TS parser stays in tree because message-parser.ts is still load-bearing for live A2A WebSocket messages (ChatTab.tsx:805, AgentCommsPanel.tsx, canvas-events.ts). ## Server (row-aware wire-order fix) The pre-PR-C-2 client did `[...activities].reverse()` over ROWS, then flattened each row into [user, agent] messages. The reversal was ROW-aware. After PR-C/D, the server returned a flat ChatMessage slice in `ORDER BY created_at DESC` order, with [user, agent] within each row. A naive client-side flat reverse would FLIP each pair (agent before user at same timestamp). Two ways to fix it: A) Server emits oldest-first within page; canvas does NOT reverse. B) Canvas does row-aware reversal (group by timestamp, reverse). Option A is cleaner — server owns the wire-order responsibility, every client trusts `for m of messages` to render chronologically. Server adds reverseRowChunks() that: 1. Groups consecutive same-Timestamp messages into row chunks (1-2 messages per row). 2. Reverses the chunk order (newest-row-first → oldest-row-first). 3. Flattens. Within-chunk [user, agent] order is preserved. Single-message rows (agent reply not yet recorded, attachments-only user upload) collapse to 1-element chunks and reverse correctly too. 
## Tests Server: 3 new unit tests on reverseRowChunks (paired across rows, single-message rows, empty input) + 1 sqlmock integration test on List() that drives the full SQL → reverse → wire path. Mutation-tested: removed `messages = reverseRowChunks(messages)` from List(), confirmed the integration test fires red with all 4 misordered indices flagged. Restored, all 25 messagestore tests + 9 chat-history handler tests green. Canvas: 8 lazyHistory pagination tests refactored to mock /chat-history (not /activity) and assert against the new wire shape ({messages, reached_end} not raw activity rows). All 1389/1389 vitest tests green; tsc --noEmit clean. ## Three weakest spots (hostile-reviewer self-pass) 1. reverseRowChunks groups by Timestamp string equality. If two distinct rows had the SAME timestamp (legitimately possible at sub- millisecond granularity), the algorithm would treat them as one chunk and not reverse them relative to each other. Mitigated: activity_logs.created_at uses microsecond resolution; concurrent inserts at exact-same microsecond are vanishingly rare. If a collision happens, the within-chunk order is whatever the SQL returned — both rows render at the same timestamp, no user-visible misordering. 2. The pre-existing TS parser files (historyHydration.ts + message-parser.ts) stay in tree. historyHydration.ts is now dead code (no consumers post-migration); deletion is parked as a follow- up after a one-week observation window confirms no live-message consumer reaches it. 3. canvas's loadMessagesFromDB returns `resp.messages ?? []`. If the server were ever to return `null` instead of `[]` (it currently doesn't — handler defensively coerces nil to []), the nullish coalesce keeps the canvas from crashing. A stricter wire schema would assert the never-null invariant; for today's pragmatic safety, the ?? is enough. ## Security review - Untrusted input? Same as PR-C — agent JSON parsed defensively in the messagestore parser. No new exposure. 
- Trust boundary? Same. Canvas → /chat-history → wsAuth → messagestore. - Output sanitization? Plain text + opaque attachment URIs as before. No security-relevant changes beyond what /chat-history already exposes via PR-C. Considered, not skipped. ## Versioning / backwards compat - /activity endpoint unchanged. - /chat-history endpoint shape unchanged (still {messages, reached_end}); only the wire ORDER within a page changed (newest-first row → oldest-first row). Canvas is the only consumer in tree; no API consumers depend on the previous order. - canvas's loadMessagesFromDB call signature unchanged — internal refactor. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-06 16:55:00 -07:00
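The three steps of `reverseRowChunks` (group consecutive same-timestamp messages, reverse the chunk order, flatten) can be sketched as below. The `chatMessage` struct is a trimmed, hypothetical stand-in for the real `ChatMessage` type, keeping only the fields the algorithm needs.

```go
package main

// chatMessage is a hypothetical, reduced version of the wire type.
type chatMessage struct {
	Role      string
	Content   string
	Timestamp string
}

// reverseRowChunks turns a newest-row-first slice into oldest-row-first
// while preserving the [user, agent] order inside each row.
func reverseRowChunks(msgs []chatMessage) []chatMessage {
	// Step 1: group consecutive messages sharing a timestamp into chunks.
	var chunks [][]chatMessage
	for i := 0; i < len(msgs); {
		j := i + 1
		for j < len(msgs) && msgs[j].Timestamp == msgs[i].Timestamp {
			j++
		}
		chunks = append(chunks, msgs[i:j])
		i = j
	}
	// Steps 2 and 3: walk the chunks backwards and flatten.
	out := make([]chatMessage, 0, len(msgs))
	for k := len(chunks) - 1; k >= 0; k-- {
		out = append(out, chunks[k]...)
	}
	return out
}
```

A naive element-wise reverse of the same input would put each agent reply before its user message, which is the pair-flip the commit message warns about.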
Hongming Wang	00cfe51df7	test(org_import): tighten sqlmock regex on lookupExistingChild (#2872 PR-B) The five `mock.ExpectQuery(\`SELECT id FROM workspaces\`)` sites used a loose substring regex that silent-passed three regression shapes #2872 
called out: 1. `WHERE parent_id = $2` (drops `IS NOT DISTINCT FROM` — breaks NULL-parent root matching) 2. `WHERE name = $1` only (drops parent_id check entirely — hijacks siblings of the same name across different parents) 3. Drops `AND status != 'removed'` (blocks re-import after Collapse) Extracts a `lookupChildSQLRE` const that anchors all four load-bearing tokens (the SELECT/FROM, the name predicate, the IS NOT DISTINCT FROM predicate, and the status filter). All five ExpectQuery sites now use the same const so a future schema/predicate change fails one place. Mutation-tested per memory feedback_assert_exact_not_substring.md: - Replacing `IS NOT DISTINCT FROM` with `=` fails TestLookupExistingChild_NilParent_MatchesRoot. - Dropping `AND status != 'removed'` fails TestLookupExistingChild_Found_ReturnsIDAndTrue. Note: #2872 PR-A (AST gate strengthening) is already addressed inline — findWorkspacesInsertSQL + TestCreateWorkspaceTree_InsertUsesOnConflictDoNothing pin the ON CONFLICT DO NOTHING shape, which is a strictly stronger gate than the original lookup-before-insert ordering check.	2026-05-06 16:43:42 -07:00
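To illustrate why anchoring all four load-bearing tokens catches the three regression shapes, here is a sketch of a tightened pattern. The exact SQL text of `lookupExistingChild` is not shown in the commit message, so the query layout, the placeholder numbering, and the helper `matchesLookupChildSQL` are assumptions. Only the token list (SELECT/FROM, name predicate, `IS NOT DISTINCT FROM`, status filter) comes from the source.

```go
package main

import "regexp"

// lookupChildSQLRE is a hypothetical reconstruction of the anchored pattern.
// It requires all four tokens, in order, separated only by whitespace.
const lookupChildSQLRE = `SELECT id FROM workspaces\s+` +
	`WHERE name = \$1\s+` +
	`AND parent_id IS NOT DISTINCT FROM \$2\s+` +
	`AND status != 'removed'`

var lookupChildRE = regexp.MustCompile(lookupChildSQLRE)

// matchesLookupChildSQL reports whether query satisfies the anchored pattern.
func matchesLookupChildSQL(query string) bool {
	return lookupChildRE.MatchString(query)
}
```

The loose pattern `SELECT id FROM workspaces` would accept every query in the regression list, because each one still begins with that substring.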
Hongming Wang	3cdb67f27e	fix(workspace-server): CP orphan sweeper closes deprovision split-write race (#2989) The deprovision path marks `workspaces.status='removed'` BEFORE calling the controlplane DELETE. 
If that CP call fails (transient 5xx, network hiccup, AWS provider error), the DB row stays at 'removed' with `instance_id` populated and there's no retry — the EC2 lives forever. 9 prod orphans accumulated over 3 days under this bug. Adds a SaaS-mode counterpart to the existing Docker `orphan_sweeper`: - 60s tick (matches the Docker sweeper cadence) - LIMIT 100 per cycle so a sustained CP outage drains over multiple cycles without blowing the request timeout - Re-issues `cpProv.Stop` for any workspace at status='removed' with a non-NULL `instance_id`. Stop is idempotent (AWS terminate on already-terminated is a no-op; CP's Deprovision tolerates already- deleted DNS) so retries are safe. - On Stop success, NULLs `instance_id` so the next cycle skips the row. - On Stop failure, leaves `instance_id` populated for next cycle. The existing Docker sweeper is gated on `prov != nil`; the new sweeper is gated on `cpProv != nil`. SaaS tenants get exactly one of the two, self-hosted tenants get the Docker one — no overlap. Why this shape over option A (CP-first ordering) or B (durable outbox): the existing inline path already returns a loud 500 to the user when CP fails — the only missing piece is automatic retry, which a 60s sweeper provides without protocol changes, new tables, or new workers. ~30 LOC of production code vs. ~400 for an outbox. RFC discussion in #2989 comment chain. Tests: - 9 unit tests covering happy path, Stop failure, UPDATE failure, multiple orphans (one-fails-others-still-process), DB query error, nil-DB defense, nil-reaper short-circuit, and the boot-immediate-then- tick cadence contract. - Mutation-tested: status='running' substitution and removed-UPDATE- block both fail at least one test. Out of scope: - Backfilling the 9 named orphans — they'll heal automatically on the first sweep cycle after this lands; no manual cleanup needed. - Long-term durable-outbox architecture — separate RFC.	2026-05-06 16:43:33 -07:00
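One sweep cycle, as described above, could be sketched like this: list up to 100 rows at status='removed' with a non-NULL `instance_id`, re-issue Stop, and clear `instance_id` only on success. The `orphanStore` and `stopper` interfaces, the in-memory `memStore`, and `failFor` are hypothetical stand-ins for the database and `cpProv`; the 60s ticker is omitted.

```go
package main

import "errors"

// orphan is a removed workspace that still has a cloud instance attached.
type orphan struct {
	WorkspaceID string
	InstanceID  string
}

// orphanStore and stopper are hypothetical seams for the DB and control plane.
type orphanStore interface {
	ListRemovedWithInstance(limit int) ([]orphan, error)
	ClearInstanceID(workspaceID string) error
}

type stopper interface {
	Stop(workspaceID string) error
}

const sweepBatchLimit = 100 // LIMIT 100 per cycle, per the commit message

// sweepOnce runs a single cycle and returns how many orphans were healed.
// A failed Stop or a failed clear leaves the row for the next cycle.
func sweepOnce(store orphanStore, cp stopper) (int, error) {
	if store == nil || cp == nil {
		return 0, nil // nothing to do without a DB or a control-plane client
	}
	orphans, err := store.ListRemovedWithInstance(sweepBatchLimit)
	if err != nil {
		return 0, err
	}
	healed := 0
	for _, o := range orphans {
		if err := cp.Stop(o.WorkspaceID); err != nil {
			continue // instance_id stays populated; retried next cycle
		}
		if err := store.ClearInstanceID(o.WorkspaceID); err != nil {
			continue // Stop is idempotent, so a repeat next cycle is safe
		}
		healed++
	}
	return healed, nil
}

// memStore is an in-memory store mapping workspace ID to instance ID.
type memStore struct{ rows map[string]string }

func (m *memStore) ListRemovedWithInstance(limit int) ([]orphan, error) {
	var out []orphan
	for id, inst := range m.rows {
		if inst != "" && len(out) < limit {
			out = append(out, orphan{WorkspaceID: id, InstanceID: inst})
		}
	}
	return out, nil
}

func (m *memStore) ClearInstanceID(id string) error { m.rows[id] = ""; return nil }

// failFor is a stopper that fails for exactly one workspace ID.
type failFor struct{ bad string }

func (f failFor) Stop(id string) error {
	if id == f.bad {
		return errors.New("cp: transient 5xx")
	}
	return nil
}
```

The design relies on Stop being idempotent: a row that was stopped but not cleared simply gets stopped again on the next tick.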
claude-ceo-assistant (Claude Opus 4.7 on Hongming's MacBook)	4b074f631b	feat(provisioner): env-driven RegistryPrefix() for workspace template images (#6) 
Add MOLECULE_IMAGE_REGISTRY env var to override the registry prefix used by all workspace-template image references. Defaults to ghcr.io/molecule-ai (unchanged for OSS users); set to an ECR URI in production tenants when mirroring to AWS. Why this matters: GitHub suspended the Molecule-AI org on 2026-05-06 with no warning. Production tenants kept running because they had images cached locally, but any tenant restart (AWS health event, redeploy, OS reboot) would have failed at `docker pull ghcr.io/molecule-ai/...` because GHCR returned 401. This change introduces the seam needed to point new pulls at a registry we control (AWS ECR) by flipping a single env var on Railway. Design (RFC: molecule-ai/internal#6): - New `RegistryPrefix()` function in `provisioner/registry.go` reads MOLECULE_IMAGE_REGISTRY, falls back to "ghcr.io/molecule-ai". - New `RuntimeImage(runtime)` returns the canonical ref using the prefix. - `RuntimeImages` map computed at init via `computeRuntimeImages()` so existing callers that range over it still work. - `DefaultImage` likewise computed via `RuntimeImage(defaultRuntime)`. - `handlers.TemplateImageRef()` switched from hardcoded format string to `provisioner.RegistryPrefix()`. - `runtime_image_pin.go::resolveRuntimeImage()` automatically inherits the prefix change because it reads from `provisioner.RuntimeImages[]` and only re-formats the tag suffix to a digest pin. Alternatives rejected (see RFC): - Multi-registry fallback chain (try ECR, fall back to GHCR): GHCR is locked from outbound for our org, so the fallback never works for us. Adds code complexity for no benefit. - Hardcoded ECR-only switch: couples production code to a specific deployment environment. OSS users self-hosting Molecule would need the upstream GHCR. - Self-hosted Harbor / registry-on-Hetzner: adds a component to operate. Not justified at 3-tenant scale; AWS ECR is mature and IAM-integrated. 
Auth — deliberately NOT changed in this commit: - For GHCR, the existing `ghcrAuthHeader()` reads GHCR_USER/GHCR_TOKEN. - For ECR, EC2 user-data installs `amazon-ecr-credential-helper` and adds a `credHelpers` entry in `~/.docker/config.json` so the daemon resolves ECR credentials via the EC2 instance role on every pull. The Go code needs no auth change. This keeps the diff minimal. Backwards compatibility: - Additive: env unset → identical behavior to today (GHCR). - Existing tests reference literal `ghcr.io/molecule-ai/...` strings; they continue to pass under the default prefix. - `RuntimeImages` map preserved for callers that iterate it. - No interface, schema, API, or migration version bump needed. Security review: - No untrusted input: MOLECULE_IMAGE_REGISTRY is set at deploy time (Railway env, EC2 user-data), not by users. - No expanded data collection or logging changes. - No new permissions: ECR pull permission is a future user-data + IAM role change, separate from this code change. - Worst-case: an attacker who already compromises Railway can swap the registry prefix to a malicious URI — same blast radius as compromising Railway today, no expansion. Tests: - 9 new unit tests in `registry_test.go` covering: default fallback, env override, empty env, all 9 known runtimes, unknown runtime, override-applies-to-all, computeRuntimeImages map population, env reflection, alphabetical ordering pin. - All existing provisioner + handlers tests continue to pass. - Mutation-tested mentally: deleting `if v := os.Getenv(...)` makes TestRegistryPrefix_RespectsEnv fail. Deleting `for _, r := range knownRuntimes` makes TestRuntimeImage_AllKnownRuntimes fail. The test suite would catch a regression of the original failure mode. Rollout plan: this PR is safe to merge with no env change. Production cutover happens by setting MOLECULE_IMAGE_REGISTRY on Railway after the AWS ECR mirror is populated (separate ops change, tracked in issue #6 phases 3b–3f). 
Tracking: - RFC: molecule-ai/internal#6 - Tasks: #97 (ECR setup), #98 (CP fallback) - Tech debt: runbooks/hetzner-rollout-tech-debt-2026-05-06.md item 7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 14:23:01 -07:00
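A rough sketch of the env-driven prefix described in this commit. `MOLECULE_IMAGE_REGISTRY` and the `ghcr.io/molecule-ai` default come from the commit message. The repository naming scheme inside `runtimeImage` (`workspace-template-<runtime>:latest`) and the pure helper `registryPrefixFrom` are assumptions for illustration; the real `RuntimeImage(runtime)` format is not shown in the source.

```go
package main

import "os"

// defaultRegistry is the fallback named in the commit message.
const defaultRegistry = "ghcr.io/molecule-ai"

// registryPrefixFrom returns the override when set, else the default.
// An empty value is treated the same as unset.
func registryPrefixFrom(override string) string {
	if override != "" {
		return override
	}
	return defaultRegistry
}

// RegistryPrefix reads MOLECULE_IMAGE_REGISTRY from the environment.
func RegistryPrefix() string {
	return registryPrefixFrom(os.Getenv("MOLECULE_IMAGE_REGISTRY"))
}

// runtimeImage builds an image ref under the given prefix.
// The "workspace-template-<runtime>:latest" layout is hypothetical.
func runtimeImage(prefix, runtime string) string {
	return prefix + "/workspace-template-" + runtime + ":latest"
}
```

With the variable unset, every ref resolves under GHCR exactly as before; setting it to an ECR URI redirects all pulls without touching call sites.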
Hongming Wang	a33c879017	feat(messagestore): MessageStore interface + Postgres impl (RFC #2945 PR-D) Closes #3026. Final piece of RFC #2945. ## What's new New package internal/messagestore/ holds: - MessageStore interface — single read-side contract operators implement to plug in alternative chat-history backends. - ChatMessage / ChatAttachment / ListOptions types — canonical data shapes returned by any impl, mirrors canvas's TS ChatMessage. - PostgresMessageStore — platform-default impl wrapping the activity_logs query + A2A-envelope parser ported in PR-C. Behavior is byte-identical to the pre-PR-D handler. ## What moves The activity_logs query, the parser (activityRowToChatMessages, extractRequestText, extractChatResponseText, extractFilesFromTask, etc.), and the internal-self-message predicate all migrate from internal/handlers/chat_history.go into the new package. handlers/ chat_history.go becomes a thin HTTP-shape adapter: parse query params → store.List(ctx, workspaceID, opts) → emit JSON Compile-time interface assertion in postgres_store.go catches future drift if the interface evolves and the impl falls behind. ## Why this PR OSS operators wanting to: - Tier hot/warm/cold storage (recent in Postgres, archival in S3) - Use a vector store with hybrid search (Pinecone, Weaviate) - Run an in-memory store for ephemeral test environments - Federate history across regions …had no extension point — they'd have to fork the handler. This PR makes that a constructor swap at router.go. ## Tests Parser-level (22 tests, MOVED to internal/messagestore/postgres_ store_test.go): every TS test case in canvas/src/components/tabs/chat/__tests__/historyHydration.test.ts has a Go counterpart. Timestamp preservation, user/agent extraction, internal-self filter, role decision (status=error vs agent-error prefix), v0/v1 file shapes, malformed JSON resilience. Handler-level (9 NEW tests in internal/handlers/chat_history_test.go): thin adapter coverage using a fake MessageStore. 
UUID validation, before_ts RFC3339 validation, default limit, max-limit clamp, invalid-limit fallback, before_ts passthrough, empty-array (not null) JSON shape, attachment shape preservation, store-error → 502 mapping. Compile-time interface conformance: PostgresMessageStore satisfies MessageStore, fakeStore (test fake) satisfies MessageStore. Mutation-tested. Removed UUID validation in the handler; confirmed TestChatHistoryHandler_RejectsNonUUIDWorkspaceID fires red (status 200 instead of 400, non-UUID reaches the store). Restored, all green. Full handlers + messagestore + router test runs green; full repo go test ./... green. ## SSOT decision ChatMessage / ChatAttachment / parser / DB query all live in internal/messagestore/ ONLY. handlers/chat_history.go imports the package and uses the types via messagestore.ChatMessage etc. — no re-declaration anywhere. ## Three weakest spots (hostile-reviewer self-pass) 1. The internal-self prefix list (Delegation results are ready...) is a package var in messagestore/postgres_store.go. A future impl that wants to override the predicate must reach into the package to use IsInternalSelfMessage or define its own. Acceptable: the predicate is part of the contract; if an impl wants different semantics it owns that decision explicitly. 2. ListOptions has Limit + BeforeTS + HasBefore; future paging needs (after_ts, peer_id filter, role filter) require additive struct field additions, which is a soft API break for any impl that handles ListOptions positionally. Mitigated by Go's struct-literal convention (named fields by default); also flagged in the interface comment for impl authors. 3. The handler does NOT log when a store returns an error — it just maps to 502. An impl that wants to surface its error class up the stack can't, today. If/when an impl needs that, the interface can add a typed-error contract in a follow-up. Today's coverage is sufficient: most ops issues land in the store impl's own logs. 
## Security review - Untrusted input? Same as PR-C — agent-emitted JSON parsed defensively. New fakeStore in tests can't reach production. - Trust boundary? Same. Interface lives BEHIND wsAuth; impls only see workspace IDs already authenticated. - Auth/authz? Inherited from handler; the interface doesn't authenticate. - PII / secrets in logs? Documented in the interface contract: impls MUST NOT log full message bodies / attachment URIs. The Postgres impl logs nothing on the happy path. - Output sanitization? Same plain-text + opaque-URI surface as PR-C. Canvas validates attachment-URI schemes. No security-relevant changes beyond what /chat-history already exposes via PR-C. Considered, not skipped. ## Versioning / backwards compat - New internal package. Zero public API change. - Single caller site in router.go updated (one-line constructor change). NewChatHistoryHandler() → NewChatHistoryHandler(store). - No schema change, no migration. - Existing /chat-history endpoint unchanged on the wire — clients don't notice the refactor. ## Phasing This is the final RFC #2945 piece. Follow-ups parked: - PR-C-2 (canvas migration): swap canvas loadMessagesFromDB to call /chat-history instead of /activity. Independent of this PR; blocked only by canvas team's calendar. - Sample alternative impls (S3, in-memory) for OSS docs: separate PR when the first OSS consumer materializes; demonstration code untested against a real workload is anti-pattern. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 23:38:14 -07:00
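To make the extension point concrete, here is a sketch of what a read-side interface with a compile-time conformance assertion and an in-memory implementation could look like. `ListOptions` with `Limit`, `BeforeTS`, and `HasBefore`, and the `List(ctx, workspaceID, opts)` call shape, come from the commit message. The return signature `([]ChatMessage, bool, error)`, the reduced `ChatMessage` fields, `memoryStore`, and `listFirst` are assumptions, and this sketch ignores `BeforeTS`.

```go
package main

import (
	"context"
	"time"
)

// ChatMessage is a reduced, hypothetical version of the canonical shape.
type ChatMessage struct {
	ID, Role, Content, Timestamp string
}

// ListOptions carries the paging fields named in the commit message.
type ListOptions struct {
	Limit     int
	BeforeTS  time.Time
	HasBefore bool
}

// MessageStore is the read-side contract. The bool result is assumed to mean
// "reached end of history", matching the reached_end wire field.
type MessageStore interface {
	List(ctx context.Context, workspaceID string, opts ListOptions) ([]ChatMessage, bool, error)
}

// memoryStore is a hypothetical in-memory implementation.
type memoryStore struct {
	byWorkspace map[string][]ChatMessage
}

// Compile-time assertion: the build breaks if memoryStore drifts from the interface.
var _ MessageStore = (*memoryStore)(nil)

func (s *memoryStore) List(_ context.Context, workspaceID string, opts ListOptions) ([]ChatMessage, bool, error) {
	all := s.byWorkspace[workspaceID]
	if opts.Limit <= 0 || opts.Limit >= len(all) {
		// Copy into a non-nil slice so JSON encoding yields [] rather than null.
		return append([]ChatMessage{}, all...), true, nil
	}
	return append([]ChatMessage{}, all[:opts.Limit]...), false, nil
}

// listFirst is a convenience wrapper for callers without a context at hand.
func listFirst(s MessageStore, workspaceID string, limit int) ([]ChatMessage, bool, error) {
	return s.List(context.Background(), workspaceID, ListOptions{Limit: limit})
}
```

Because `ListOptions` is built with named fields, adding a field later does not break this implementation, which is the mitigation weak spot 2 relies on.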
Hongming Wang	089be695a9	Merge staging into rfc-2945-pr-c-chat-history	2026-05-05 23:18:52 -07:00
Hongming Wang	dcc870a6b7	feat(workspace-server): server-side chat-history endpoint (RFC #2945 PR-C) Closes the SSOT gap for chat-history hydration: today every consumer (canvas TS) re-implements an A2A-envelope walk to map activity_logs rows into rendered ChatMessage objects. This PR moves that walk into the server. ## What's added GET /workspaces/:id/chat-history?limit=N&before_ts=T Returns: { "messages": [ {"id": "<uuid>", "role": "user"\|"agent"\|"system", "content": "...", "attachments": [...], "timestamp": "<RFC3339>"} ], "reached_end": false } Auth chain: same wsAuth as /workspaces/:id/activity (tenant ADMIN_TOKEN + X-Molecule-Org-Id). No new trust boundary. Filter: a2a_receive rows with source_id IS NULL — same canvas-source filter the canvas applies via /activity?type=a2a_receive&source=canvas, centralized so future API consumers don't need to know it. ## What's mirrored from canvas TS Direct port of canvas/src/components/tabs/chat/historyHydration.ts + message-parser.ts: - extractRequestText / extractFilesFromUserMessage — user-side parts walk through request_body.params.message.parts[] - extractChatResponseText — agent-side response_body collector across the four shapes (string, A2A JSON-RPC parts, older nested parts.root.text, task artifacts) joined with "\n" (matches canvas multi-source collector — claude-code emits multiple text parts; hermes emits summary+artifacts) - extractFilesFromResponse / extractFilesFromTask — file walk across parts[] + artifacts[].parts[] + status.message.parts[] + message.parts[] - v0 hot path ({kind:"file", file:{...}}) AND v1 protobuf flat shape ({url, filename, mediaType}) both supported - Role decision: status='error' OR text starts with "agent error" (case-insensitive) → "system", else "agent" - isInternalSelfMessage prefix filter (Delegation results are ready...) 
- Timestamp pinned to row.created_at (regression cover for 2026-04-25 bubble-collapse bug) ## Tests 22 unit tests in chat_history_test.go, every TS test case in historyHydration.test.ts has a Go counterpart: Timestamp preservation (3): user/agent pin to created_at, two-rows produce two distinct timestamps. User-message extraction (5): text-only, internal-self skip, null body, attachments hydrated, attachments-only-when-text-empty, internal-self suppresses even with attachments. Agent-message extraction (4): result-string, status=error→system, agent-error-prefix→system, response_body.parts attachments, null body, no-text-no-files-no-bubble. End-to-end (1): paired user+agent same timestamp. Go-specific (5): malformed JSON returns empty (no panic), v1 protobuf flat shape extraction, task-artifacts extraction, older nested root.text shape, basename helper edge cases. isInternalSelfMessage predicate (1): prefix match, non-prefix non- match, empty-text non-match. Mutation-tested. Removed the role-promotion branch (status=error + agent-error prefix → system); confirmed both TestChatHistory_RoleSystemWhenStatusError and TestChatHistory_RoleSystemWhenAgentErrorPrefix fire red. Restored. Both green. Full handlers test suite (4.3s) green; full repo `go test ./...` green. ## SSOT decision Parsing logic lives in workspace-server/internal/handlers/chat_history.go ONLY. Canvas keeps historyHydration.ts + message-parser.ts during the transition because: - PR-C-2 (follow-up): canvas loadMessagesFromDB swaps to new endpoint. Today's canvas still calls /activity for backward compatibility. - The TS parsers are still load-bearing for LIVE message handling (WebSocket A2A_RESPONSE events) until RFC #2945 PR-B-2 mirrors the typed event payloads to canvas consumers. Canvas's TS path will be deleted in a separate PR after a one-week observation window confirms no live-message consumers depend on it. ## Security review - Untrusted input? 
YES — request_body and response_body come from agents (potentially OSS / third-party). Defensive: any malformed JSON returns empty content + no attachments, no panic. Tested via TestChatHistory_MalformedJSONInRequestBodyReturnsEmpty. - Trust boundary? Same as today: agent → workspace-server. No new boundary; reuses existing wsAuth middleware. - Auth/authz? Inherits wsAuth chain. Cross-workspace access blocked by existing TenantGuard middleware. - PII / secrets in logs? None. The handler logs nothing on the happy path; errors log 502 without body content. - Output sanitization? ChatMessage.content is plain text returned as-is; canvas already sanitizes via ReactMarkdown. Attachment URIs are agent-provided (workspace: / platform-pending: / https:); canvas's existing scheme allow-list still applies. ## Versioning / backwards compatibility - New endpoint /chat-history. /activity unchanged. - Canvas historyHydration.ts + message-parser.ts intact during transition (will be removed in PR-C-2 follow-up). - No public API consumer of /activity is broken — added route is additive. - No semver bump (server is internal versioning). ## Three weakest spots (hostile-reviewer self-pass) 1. extractRequestText returns ONLY parts[0].text. If a user message contains multiple text parts (uncommon — canvas only ever emits one), we lose later parts. Matches canvas exactly today, but a future change that emits multi-text user messages needs both parsers updated. Documented in code; covered by test if/when added. 2. activityRowToChatMessages rebuilds ChatMessage IDs every call (no caching). Each chat reload mints fresh UUIDs. This is fine because canvas dedupes by (role, content, timestamp window) not id, but a future API consumer that DID rely on id stability would break. Documented in the ChatMessage struct comment. 3. The handler scopes to source_id IS NULL only (canvas-source rows). A future "show all messages, including agent-to-agent" mode would need a new endpoint or a parameter. 
Out of scope for PR-C; canvas's /activity?source=canvas already enforces the same filter. Closes #3017. Unblocks RFC #2945 PR-D (MessageStore interface) which returns []ChatMessage typed values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 23:17:26 -07:00
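The role decision described in the commit above (status='error' OR text starting with "agent error", case-insensitive, promotes the bubble to "system") can be sketched as follows. The function name `roleForAgentReply` is hypothetical, and whether the real parser trims leading whitespace before the prefix check is not stated, so this sketch does not trim.

```go
package main

import (
	"fmt"
	"strings"
)

// roleForAgentReply returns "system" for error rows and for replies whose
// text begins with "agent error" in any letter case; otherwise "agent".
func roleForAgentReply(status, text string) string {
	if status == "error" || strings.HasPrefix(strings.ToLower(text), "agent error") {
		return "system"
	}
	return "agent"
}

func main() {
	fmt.Println(roleForAgentReply("ok", "Here is the report"))
	fmt.Println(roleForAgentReply("error", "timeout"))
	fmt.Println(roleForAgentReply("ok", "Agent Error: tool crashed"))
}
```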
Hongming Wang	656a02fae4	fix(textutil): SSOT for rune-safe string truncation, fix 3 audit-gap bugs Closes #2962. ## Why Six per-package `truncate` helpers had drifted into independent re-implementations of the same idea. Three of them (delegation.go, memory/client/client.go, memory-backfill/verify.go) used `s[:max] + "…"` byte-slice form, which on a multi-byte codepoint at byte `max` produces invalid UTF-8 → Postgres `text`/`jsonb` rejects the INSERT silently → `delegation` / `activity_logs` row never lands → audit gap. Three other helpers (delegation_ledger.go #2962, agent_message_writer.go #2959, scheduler.go #2026) had each been fixed in isolation with three slightly different rune-safe shapes — confirming this is a class of bug, not a single instance. ## What New package `internal/textutil` with three rune-safe functions: - `TruncateBytes(s, maxBytes)` — byte-cap, "…" marker. Used by 5 callers writing into byte-bounded columns / log lines. - `TruncateBytesNoMarker(s, maxBytes)` — byte-cap, no marker. Used by delegation_ledger.go where the storage already conveys "preview" and an extra ellipsis would push the result over the column cap. - `TruncateRunes(s, maxRunes)` — rune-cap, "…" marker. Used by agent_message_writer.go where the cap is in display chars (UI summary), not bytes. All three guarantee `utf8.ValidString(out)` for any `utf8.ValidString(in)`. Inputs already invalid go through `sanitizeUTF8` at the call site boundary (scheduler.go preserved this defense-in-depth). ## Migration map \| Old \| New \| Behavior change \| \|---\|---\|---\| \| `delegation_ledger.truncatePreview` \| `textutil.TruncateBytesNoMarker(s, 4096)` \| none \| \| `agent_message_writer.truncatePreviewRunes` \| `textutil.TruncateRunes(s, n)` \| none \| \| `scheduler.truncate` \| `textutil.TruncateBytes(s, n)` \| "..." 
→ "…" (3 bytes either way; single-glyph display) \| \| `delegation.truncate` \| `textutil.TruncateBytes(s, n)` \| bug fix + ellipsis swap \| \| `memory/client.truncate` \| `textutil.TruncateBytes(s, n)` \| bug fix \| \| `memory-backfill.truncate` \| `textutil.TruncateBytes(s, n)` \| bug fix \| Five separate `truncate*` helpers + their per-package tests removed. Net: 12 files / +427 / -255. ## Tests - `internal/textutil/truncate_test.go` — 27 table-test cases + 145 fuzz-invariant cases asserting `utf8.ValidString` and byte-cap invariants on every output. - `delegation_ledger_test.go TestLedgerInsert_TruncatesOversizedPreview` strengthened with `capValidUTF8Matcher` so the SQL-write argument is asserted to be valid UTF-8 + within cap (not just `AnyArg()`). Mutation-tested: replacing the SSOT call with byte-slice form makes this test fail loud. ## Compatibility - All callers internal; no external API surface change. - Ellipsis swap "..." → "…": same byte budget (3 bytes), single-glyph display. No alerting/grep on either marker in this codebase (verified). Canvas renders both correctly. - DB column widths unchanged (4096 / 80 / 200 / 256 / 300 — all preserved in the migrations). ## Security Fixes a silent INSERT-failure mode that hid `activity_logs` / `delegations` rows containing peer-controlled text. The class of input that triggered it (CJK, emoji, accented Latin) is normal user content, not malicious — but the symptom (audit gap) makes incident reconstruction harder. Helper is pure-function over `string`; no secrets / PII / auth handling involved. Untrusted input is handled identically to before, just rune-aligned now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 23:01:21 -07:00
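A sketch of the rune-capped variant described above, under the assumption that the "…" marker is appended after the capped content (so the result can hold up to maxRunes+1 runes). This is not the actual `internal/textutil` source; `truncateRunes` and its handling of non-positive caps are illustrative choices.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes keeps at most maxRunes runes of s and appends "…" when
// anything was cut. Ranging over a string yields rune start offsets, so
// every slice point is a valid UTF-8 boundary.
func truncateRunes(s string, maxRunes int) string {
	if maxRunes <= 0 {
		return "" // assumption: a non-positive cap yields the empty string
	}
	count := 0
	for i := range s {
		if count == maxRunes {
			return s[:i] + "…"
		}
		count++
	}
	return s // s had maxRunes runes or fewer
}

func main() {
	out := truncateRunes("日本語テキスト", 3)
	fmt.Println(out, utf8.ValidString(out))
	fmt.Println(truncateRunes("hi", 2))
}
```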
Hongming Wang	c53155ec5f	Merge pull request #3014 from Molecule-AI/test/cross-table-atomicity-integ-149-followup test(chat-uploads): integration test for cross-table atomicity (#149 follow-up)	2026-05-06 05:05:49 +00:00
Hongming Wang	7a39a08837	test(chat-uploads): integration test for cross-table atomicity (#149 follow-up) Adds two real-Postgres tests under //go:build integration: - TestIntegration_PollUpload_AtomicRollback_AcrossBothTables exercises the helpers in the same Tx shape uploadPollMode does (PutBatchTx + LogActivityTx + Rollback) and asserts COUNT(*)=0 on BOTH pending_uploads AND activity_logs after the rollback. Failure injection: NUL byte in `summary` triggers lib/pq protocol rejection on the second activity insert — same trick the existing PutBatch AtomicRollback test uses. - TestIntegration_PollUpload_HappyPath_AcrossBothTables is the positive counterpart — Commit lands N rows in both tables. Coverage rationale (post-PR-3010 review): - sqlmock unit test (TestPollUpload_AtomicRollbackOnActivityInsertFailure) proved the handler calls Begin/Exec/Exec-fail/Rollback in order. - Existing PutBatch integration test proved Postgres honors rollback for pending_uploads alone. - New tests close the cross-table gap: prove LogActivityTx + PutBatchTx + real Postgres MVCC compose correctly under rollback. A regression that made LogActivityTx silently route through db.DB instead of the passed tx would still pass the sqlmock test (the Begin/Commit/Rollback shape would look right) but would fail this integration test (the activity_logs row would survive the rollback). Verified locally: postgres:15-alpine + all migrations applied, both tests pass in 0.1s. Skips cleanly without INTEGRATION_DB_URL — CI already runs this file via the Handlers Postgres Integration job. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:57:56 -07:00
Hongming Wang	ff21bbb876	Merge staging into rfc-2872-workspaces-uniq-toctou to clear BEHIND	2026-05-05 21:46:33 -07:00
Hongming Wang	da3cb4c098	fix(workspace-server): close TOCTOU race on workspaces(parent_id, name) (#2872 Critical 1) ## Bug `/org/import` had no per-tenant mutex, advisory lock, or DB-level uniqueness on (parent_id, name). The pattern was lookup-then-insert: existingID, existing, err := h.lookupExistingChild(...) /* SELECT */ if existing { return /* skip */ } db.DB.ExecContext(ctx, `INSERT INTO workspaces ...`) /* INSERT */ Two concurrent admin POSTs (rapid double-click in canvas, retry-after-timeout, two operators on the same template) both saw "not found" in the SELECT and both INSERT'd the same (parent_id, name). Captured impact: tenant-hongming accumulated 72 stale child workspaces in 4 days from repeated org-template spawns of the same template (see #2857 phase 4 sweeper for the cleanup; #2872 for the prevention RFC). ## Fix Two-layer fix (DB-level backstop AND application-level happy path): 1. Migration `20260506000000_workspaces_unique_parent_name.up.sql` ```sql CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS workspaces_parent_name_uniq ON workspaces ( COALESCE(parent_id, '00000000-0000-0000-0000-000000000000'::uuid), name ) WHERE status != 'removed'; ``` * COALESCE(parent_id, sentinel) collapses NULLs so root workspaces also collide pairwise. * `WHERE status != 'removed'` lets a tombstoned row be replaced by a same-named re-import (preserves existing org-import semantics). * CONCURRENTLY avoids ACCESS EXCLUSIVE on production tenants under live traffic; IF NOT EXISTS makes the migration resumable. * Down migration drops CONCURRENTLY symmetrically. 2. `org_import.go` swap Replace lookup-then-insert with `INSERT ... ON CONFLICT DO NOTHING RETURNING id`. On the skip path (RETURNING returns 0 rows → sql.ErrNoRows), re-select the existing id to recurse children: INSERT INTO workspaces (...) VALUES (...)
ON CONFLICT (COALESCE(parent_id, ...), name) WHERE status != 'removed' DO NOTHING RETURNING id; The ON CONFLICT target predicate matches the partial-index predicate exactly — required for Postgres to consider the index applicable. Existing `lookupExistingChild` helper kept (still used on the skip path); semantics unchanged. ## Test coverage * AST gate refreshed to assert the workspaces INSERT contains the ON CONFLICT pattern (`onConflictDoNothingRE`) instead of the now-obsolete "lookup-before-insert" ordering. Per behavior-based gating (memory: feedback_behavior_based_ast_gates.md), the new gate pins the actual TOCTOU-resolution behavior. * Companion `TestGate_FailsWhenInsertOmitsOnConflict` proves the gate catches the bug shape on synthetic source. * All existing `lookupExistingChild` unit tests (no-rows, found, nil-parent, DB error, wrapped no-rows) still pass — helper is unchanged and still load-bearing on the skip path. * Live Postgres E2E coverage runs via the existing "Handlers Postgres Integration" CI job, which applies migrations to a real PG and exercises the INSERT path. ## Why ship the migration + swap together (not stacked) The migration alone provides a DB-level backstop, but without the handler swap a UNIQUE-violation surfaces as a 500 to the user. The handler swap alone has no enforceable target until the migration applies. Shipped together they give graceful skip + atomic backstop. Migration is CONCURRENTLY + IF NOT EXISTS, safe to apply even on tenants where the sweeper (#2860) hasn't run yet — the index just declines to build until conflicting rows are reconciled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:43:49 -07:00
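The control flow of the fix above (insert first, treat `sql.ErrNoRows` as "row already exists", then re-select) can be modelled without a database. In this sketch `fakeDB`, `ensureChild`, and `raceDemo` are hypothetical names, and a mutex-guarded map stands in for the partial unique index; the `status != 'removed'` predicate is left out for brevity. It illustrates the pattern and is not the code in `org_import.go`.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
	"sync"
)

// rootSentinel plays the role of COALESCE(parent_id, sentinel): root
// workspaces (nil parent) share one key space and collide on name.
const rootSentinel = "00000000-0000-0000-0000-000000000000"

// fakeDB is an in-memory stand-in for Postgres plus the unique index.
type fakeDB struct {
	mu   sync.Mutex
	rows map[string]string // (parent or sentinel)/name -> workspace id
	next int
}

func key(parentID *string, name string) string {
	p := rootSentinel
	if parentID != nil {
		p = *parentID
	}
	return p + "/" + name
}

// insert models INSERT ... ON CONFLICT DO NOTHING RETURNING id: a
// conflict returns zero rows, which database/sql reports as ErrNoRows.
func (d *fakeDB) insert(_ context.Context, parentID *string, name string) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	k := key(parentID, name)
	if _, exists := d.rows[k]; exists {
		return "", sql.ErrNoRows
	}
	d.next++
	id := fmt.Sprintf("ws-%d", d.next)
	d.rows[k] = id
	return id, nil
}

// lookup models the re-select used on the skip path.
func (d *fakeDB) lookup(_ context.Context, parentID *string, name string) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	id, ok := d.rows[key(parentID, name)]
	if !ok {
		return "", sql.ErrNoRows
	}
	return id, nil
}

// ensureChild returns the workspace id and whether this call created it.
func ensureChild(ctx context.Context, d *fakeDB, parentID *string, name string) (string, bool, error) {
	id, err := d.insert(ctx, parentID, name)
	if err == nil {
		return id, true, nil
	}
	if !errors.Is(err, sql.ErrNoRows) {
		return "", false, err // a real failure, not a conflict
	}
	id, err = d.lookup(ctx, parentID, name)
	return id, false, err
}

// raceDemo runs n concurrent imports of the same root workspace name and
// reports how many calls created a row and how many distinct ids resulted.
func raceDemo(n int) (created, distinctIDs int) {
	d := &fakeDB{rows: map[string]string{}}
	var wg sync.WaitGroup
	var mu sync.Mutex
	ids := map[string]bool{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			id, isNew, err := ensureChild(context.Background(), d, nil, "research-team")
			if err != nil {
				return
			}
			mu.Lock()
			defer mu.Unlock()
			ids[id] = true
			if isNew {
				created++
			}
		}()
	}
	wg.Wait()
	return created, len(ids)
}

func main() {
	created, distinct := raceDemo(16)
	fmt.Println("created:", created, "distinct ids:", distinct)
}
```

With lookup-then-insert, two callers can both pass the lookup before either inserts. Here the uniqueness check and the write happen in one step, so exactly one caller creates the row.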
Hongming Wang	b759548822	fix(chat-uploads): activity rows commit atomically with PutBatch Closes #149. uploadPollMode for poll-mode chat uploads previously committed N pending_uploads rows in one Tx (PutBatch), then wrote N activity_logs rows individually outside any Tx. A per-row failure on activity row K left rows 1..K-1 committed and pending_uploads orphaned until the 24h TTL — not data-loss because the platform's fetcher handled the half-state cleanly, but the user never saw file K in the canvas and the inconsistency surfaced as an "uploaded but invisible" complaint class. Thread one Tx through PutBatchTx + N × LogActivityTx + Commit so all or none commit. Broadcasts are deferred until after Commit — emitting an ACTIVITY_LOGGED event for a row that ends up rolled back would paint a ghost message into the canvas's optimistic UI. A new LogActivityTx returns a commitHook the caller invokes post-Commit; the existing fire-and-forget LogActivity is unchanged for the 4 other production callers (a2a_proxy_helpers + activity.go report path). Storage interface gains PutBatchTx; PostgresStorage.PutBatch is refactored to share the validation + insert path. inMemStorage and fakeSweepStorage delegate or no-op for PutBatchTx (the in-mem fake can't model Tx state — DB-level atomicity is verified by the existing real-Postgres integration test for PutBatch + the new unit test asserting the Go handler calls Rollback on activity-insert failure). Tests: - TestPollUpload_AtomicRollbackOnActivityInsertFailure pins the new contract via sqlmock — second activity insert errors → Rollback expected, Commit must NOT be called. - TestLogActivityTx_DefersBroadcastUntilCommitHook + _InsertError_NoHook_NoBroadcast + _NilTx_Errors cover the new API. - TestPutBatchTx_HappyPath / _EmptyItems / _ValidationFails / _PerRowErrorPropagates cover Tx-aware storage layer. 
- 7 existing TestPollUpload_* tests updated to mock Begin + Commit (or Begin + Rollback for failure paths) since the handler now opens a Tx around PutBatch + activity inserts. All workspace-server tests pass; integration tag also clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 21:34:28 -07:00
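The deferred-broadcast contract described above can be sketched with plain closures: each insert returns a hook, and the caller runs the hooks only after the commit succeeds. The `tx` type, `logActivityTx`, and `uploadBatch` below are simplified hypothetical stand-ins, not the real `LogActivityTx` signature.

```go
package main

import (
	"errors"
	"fmt"
)

// tx is a toy transaction: rows are staged until commit. failOn names a
// summary whose insert should fail, for failure injection.
type tx struct {
	staged []string
	failOn string
}

// commitHook is invoked by the caller only after a successful commit.
type commitHook func()

// logActivityTx stages one activity row and returns a hook that emits the
// broadcast. On insert failure it returns no hook, so nothing can be
// broadcast for a row that never landed.
func logActivityTx(t *tx, summary string, broadcast func(string)) (commitHook, error) {
	if t == nil {
		return nil, errors.New("nil tx")
	}
	if summary == t.failOn {
		return nil, fmt.Errorf("insert %q failed", summary)
	}
	t.staged = append(t.staged, summary)
	return func() { broadcast(summary) }, nil
}

// uploadBatch inserts all rows in one transaction and returns the
// broadcasts that were actually emitted.
func uploadBatch(summaries []string, failOn string) ([]string, error) {
	t := &tx{failOn: failOn}
	var broadcasts []string
	emit := func(s string) { broadcasts = append(broadcasts, s) }

	var hooks []commitHook
	for _, s := range summaries {
		hook, err := logActivityTx(t, s, emit)
		if err != nil {
			t.staged = nil // rollback: discard staged rows, drop the hooks
			return nil, err
		}
		hooks = append(hooks, hook)
	}
	// The commit would happen here; only afterwards do the hooks fire.
	for _, h := range hooks {
		h()
	}
	return broadcasts, nil
}

func main() {
	ok, err := uploadBatch([]string{"a.png", "b.png"}, "")
	fmt.Println(ok, err)
	bad, err := uploadBatch([]string{"a.png", "b.png"}, "b.png")
	fmt.Println(bad, err)
}
```

In the failure case no broadcast is emitted for "a.png" even though its insert succeeded, which is the ghost-message scenario the commit guards against.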
Hongming Wang	19df43e3da	Merge pull request #2993 from Molecule-AI/rfc-2945-pr-b-1-migrate-bare-event-strings refactor(events): migrate 18 producers to typed EventType constants (RFC #2945 PR-B-1)	2026-05-06 03:45:47 +00:00
Hongming Wang	f39b595a9c	fix(workspace files API): EIC parity for ListFiles + DeleteFile (closes #2999 PR-A) ## User-visible bug Canvas Files tab returns "0 files / No config files yet" for every SaaS workspace, every root (/configs, /home, /workspace, /plugins). Reported by user (canvas screenshot, hongming.moleculesai.app, Hongming Personal Brand Agent — claude-code, T4, online). ## Root cause `ListFiles` (templates.go) was missing the SSH-via-EIC branch that ReadFile (PR #2785) and WriteFile (PR #1702) already have. On SaaS, dockerCli is nil → findContainer returns "" → falls through to host-side resolveTemplateDir which only matches baked-in template names. For a user-named workspace it matches nothing, so the handler silently returns []fileEntry{}. DeleteFile had the same gap — right-click delete (introduced in PR-C of this issue) would silently no-op once #1 was fixed. ## Fix 1. Extracted shared EIC plumbing into `withEICTunnel` (closure-based, single SSOT for keypair → key push → tunnel → port-wait → cleanup). Refactored writeFileViaEIC + readFileViaEIC to use it. Added listFilesViaEIC + deleteFileViaEIC on the same scaffold. The `LogLevel=ERROR` shim from PR #2822 now lives in one `eicSSHSession.sshArgs()` helper instead of being duplicated per helper — the next time we need to tweak ssh options, one place. 2. Factored remote shell strings into pure functions (buildInstallShell / buildCatShell / buildRmShell / buildFindShell + parseFindOutput) so the wire shape can be pinned without booting a real EIC tunnel. 3. Refactored `resolveWorkspaceFilePath(runtime, root, relPath)` to honor `?root=`. New rule: `/configs` (or empty / unrecognized) → runtime managed-config dir via workspaceFilePathPrefix (preserves the v1 ReadFile/WriteFile behaviour where canvas's Config tab GETs/PUTs config.yaml without specifying a root and lands in the right per-runtime dir); `/home`, `/workspace`, `/plugins` → literal absolute path on the EC2 host. 
List/Read/Write/Delete now agree on what file a tree row points to — pre-fix List would say "/home contents" but Read/Write would route to /configs. 4. ListFiles + DeleteFile dispatch on instance_id != "" → EIC helper. Errors from the EIC path produce 500 (not silent fall-through to local-Docker, which would mask the failure as "0 files" — the exact user-visible symptom). 5. Added ?root= validation gate to WriteFile + DeleteFile so an out-of-allowlist root is rejected before the resolver runs. ## Test coverage - TestResolveWorkspaceFilePath_RuntimeIndirection — pins the /configs → runtime prefix translation per-runtime (hermes, claude-code, langgraph, external, unknown). Catches the regression where a future edit accidentally drops the runtime indirection. - TestResolveWorkspaceFilePath_LiteralRoots — pins /home, /workspace, /plugins as literal pass-through regardless of runtime. Catches the symmetric regression where the literal roots start getting rewritten to the runtime prefix (which would mean the FilesTab "/home" selector silently routes to /configs on hermes). - TestResolveWorkspaceRootPath — directory-only translation used by listFilesViaEIC, same indirection rules. - TestSSHArgs_HardenedFlags — pins the centralised ssh option set (LogLevel=ERROR + hardening). Catches drift in the one-place-where-ssh-flags-live. - TestEicSSHSessionSingleSourceForSSHFlags — behaviour-based AST gate (per memory). Counts s.sshArgs() callers (must be ≥4 — list/read/write/delete) and asserts LogLevel=ERROR appears exactly once in the source. Fires if anyone copy-pastes a raw ssh args slice instead of going through the helper. - TestBuildInstallShell / TestBuildCatShell / TestBuildRmShell / TestBuildFindShell — pure-function tests pinning the remote command shape. Catches regression like "rm -f silently becomes rm -rf" or "find loses node_modules pruning" without needing a real EC2. 
- TestBuildFindShell_DepthForwarding — catches a regression where the helper hard-codes a depth instead of using the caller's value. - TestParseFindOutput / TestParseFindOutput_EmptyInput — pin the TYPE\|SIZE\|REL parser. Empty-input case explicitly returns [] not nil so the JSON wire shape stays a list. - TestListFiles_EICDispatch_Success / Error — sqlmock-driven handler test. Verifies instance_id != "" routes to listFilesViaEIC and surfaces errors as 500 (does NOT silently fall through to local-Docker, which is the exact regression-mode of the original bug). - TestListFiles_EICBranch_NotTakenForSelfHosted — back-compat guard: instance_id == "" must NOT enter the EIC branch (would break self-hosted operators). - TestDeleteFile_EICDispatch_Success / Error — same shape for DeleteFile. - TestListFiles_RootValidation / TestDeleteFile_RootValidation — ?root=/etc must 400 before any DB query or EIC call. ## Verification - `go build ./...` clean - `go test ./...` clean (full workspace-server suite) - Will be live-verified against staging on hongming.moleculesai.app after merge: open Files tab → expect populated /home + /configs + /workspace listings (not "0 files"); right-click delete on /configs/old.yaml → expect file removed on the EC2 host. ## Three weakest spots (hostile self-review) 1. The LogLevel=ERROR drift gate counts source occurrences. A future refactor that intentionally moves the literal somewhere else (e.g. into a constant) would trigger a false positive. The gate's failure message points to the load-bearing constraint (must appear in sshArgs); operator can adjust. 2. `eicFileWriteTimeout` constant kept as an alias for back-compat with prior tests. Documented as intentional + safe to remove on the next pass. 3. The resolver tests pin the runtime → prefix map values (`/home/ubuntu/.hermes`, `/configs`, etc.). A future runtime addition that ships a new prefix needs the test updated. 
This is intentional — silent prefix changes orphan saved files, so a test failure on map edit IS the right signal. ## Follow-up (RFC #2312 subtask 2) Long-term the right fix is to drop EIC entirely and HTTP-forward to the workspace's own URL (RFC #2312). That's a substantially larger refactor across 5 surfaces (chat upload, files, templates, plugins, terminal) and out of scope for this bug-fix PR. Tracked separately under that RFC. Refs #2999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:18:05 -07:00
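A sketch of a TYPE|SIZE|REL parser with the empty-input guarantee the commit above describes (an empty list, never nil). The `fileEntry` field names and the "d" / "f" type letters are assumptions; the commit does not show the real struct or the exact `find` format string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// fileEntry field names are hypothetical.
type fileEntry struct {
	Path string `json:"path"`
	Size int64  `json:"size"`
	Dir  bool   `json:"dir"`
}

// parseFindOutput parses one TYPE|SIZE|REL record per line. It always
// returns a non-nil slice so the JSON wire shape stays a list.
func parseFindOutput(out string) []fileEntry {
	entries := []fileEntry{}
	for _, line := range strings.Split(out, "\n") {
		if line == "" {
			continue
		}
		// SplitN with 3 keeps any '|' inside the relative path intact.
		parts := strings.SplitN(line, "|", 3)
		if len(parts) != 3 {
			continue // skip malformed lines rather than fail the listing
		}
		size, err := strconv.ParseInt(parts[1], 10, 64)
		if err != nil {
			continue
		}
		entries = append(entries, fileEntry{
			Path: parts[2],
			Size: size,
			Dir:  parts[0] == "d",
		})
	}
	return entries
}

func main() {
	empty, _ := json.Marshal(parseFindOutput(""))
	fmt.Println(string(empty)) // prints []
	listing, _ := json.Marshal(parseFindOutput("f|12|config.yaml\nd|0|skills\n"))
	fmt.Println(string(listing))
}
```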
Hongming Wang	64e58fb390	test(memory-v2-e2e): update expectChainQueryRoot for new name column PR #2990 root cause: the resolver SQL added `name` to the SELECT for DisplayName plumbing, but the e2e test's sqlmock fixture (expectChainQueryRoot at swap_test.go:216) still scripts the 3-column shape. Three e2e tests fail with: sql: expected 3 destination arguments in Scan, not 4 Fix: bump the fixture to 4 columns (id, name, parent_id, depth) and pass an empty name. The e2e tests don't assert on label rendering — they pin the namespace string flow ("workspace:root-1" etc), which is unchanged. Empty name is fine: ReadableNamespaces still emits the correct namespace strings; only DisplayName is empty. Caught by CI's Platform (Go) check on PR #2990 — would have been a silent missed-coverage case in the resolver_test.go run because that package doesn't import the e2e package. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 19:10:18 -07:00
Hongming Wang	9ceda9d81f	refactor(events): migrate 18 files to typed EventType constants (RFC #2945 PR-B-1) Mechanical migration of bare event-name strings in BroadcastOnly / RecordAndBroadcast call sites to the typed constants from internal/events/types.go (RFC #2945 PR-B). Wire format unchanged (both shapes serialize to identical WSMessage.Event literals); pinned by TestAllEventTypes_IsSnapshot in #2965. Migrated (18 files, scope: handlers/, scheduler/, registry/, bundle/, channels/): - handlers/{approvals,a2a_proxy_helpers,a2a_queue,activity,agent, delegation,external_rotate,org_import,registry,workspace, workspace_bootstrap,workspace_crud,workspace_provision_shared, workspace_restart}.go - channels/manager.go (caught by hostile-reviewer pass — initial scope missed channels/, found via grep on the post-migration tree) - scheduler/scheduler.go - registry/provisiontimeout.go - bundle/importer.go Hostile self-review (3 weakest spots, addressed) ------------------------------------------------ 1. Missed call sites — initial scope omitted channels/. Post-migration `grep -rEn 'BroadcastOnly\([^,]+,[^,]"[A-Z_]+"\|RecordAndBroadcast\([^,]+,[^,]"[A-Z_]+"' internal/` found 2 stragglers in channels/manager.go. Migrated. Final grep on the same pattern returns only the docstring example in types.go (intentional). 2. gofmt drift — auto-import injection produced non-canonical import ordering. `gofmt -w` applied ONLY to the 18 modified files (NOT the whole tree, to avoid sweeping unrelated pre-existing drift into this PR's diff). Three pre-existing un-gofmt'd files in handlers/ (a2a_proxy.go, a2a_proxy_test.go, a2a_queue_test.go) left as-is — they're unchanged by this PR and their drift predates it. 3. Wire format — paranoia check: do the constants serialize to the exact strings consumers (canvas TS, hermes plugin, anything parsing WSMessage.Event) expect? Yes. Pinned by the snapshot test. The migration is name-only; not a single character of wire output changes. 
Verified - go build ./... clean - go vet ./internal/... clean - gofmt -l on the 5 migrated package dirs: only pre-existing files - Full tests: handlers/, channels/, scheduler/, registry/, events/, bundle/ all green (5 ok, 0 fail) PR-B-2 (canvas TS mirror + cross-language parity gate) remains as the final piece of RFC #2945 PR-B. Tracked separately so this PR stays mechanical + reviewable. Refs RFC #2945, PR #2965 (PR-B types). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 19:05:03 -07:00
Hongming Wang	b6310d7ebf	fix(memory-v2): namespace dropdown labels use display names not UUID prefixes (#2988 ) User feedback on the v2 Memory tab redesign: on a root workspace, the namespace dropdown showed three indistinguishable entries: Workspace (30ba7f0b) Team (30ba7f0b) (team) Org (30ba7f0b-b303-4a20-aefe-3a4a675b8aa4) (org) For a root workspace, the resolver collapses workspace==team==org IDs (resolver.go:113-122 derive() degenerate case). The previous shortID(8)-truncated UUID label scheme made all three look identical even though the three concepts (private / team-shared / org-wide) remain semantically distinct. ## Backend — Resolver returns DisplayName - SQL chain query now SELECTs workspaces.name (COALESCE → "" on NULL) - chainNode carries .name through walk - deriveNames() computes the display name for each namespace, mirroring derive(): workspace: self.name team: parent.name (or self.name if root — degenerate) org: chain[end].name (root of tree) - Namespace struct gets a new DisplayName field, omitempty wire-shape ## Backend — Handler renders label from DisplayName when present - memories_v2.go:namespaceLabelWithName(name, kind, displayName) is the new SSOT label generator. Falls back to the UUID-prefix shape when displayName is empty so callers without name plumbing keep working unchanged. - namespacesToViews now plumbs Namespace.DisplayName into the label. - Old namespaceLabel(name, kind) is preserved as a thin wrapper around namespaceLabelWithName(_, _, "") for back-compat. - Custom namespaces ignore displayName by design — operator-defined suffixes ARE the chosen label; a name override would surprise. ## Frontend — drop redundant `(kind)` suffix Pre-fix: "Team (mac laptop) (team)" — kind shown twice. Post-fix: "Team (mac laptop)" — the prefix already conveys the kind. 
## Test coverage Resolver (3 new tests): - DisplayName_Root: workspace name propagates to all 3 namespaces - DisplayName_Child: workspace=self.name, team=parent.name, org=root.name - DisplayName_EmptyOnNULL: COALESCE → "" → empty fallback Handler (3 new tests): - NamespaceLabelWithName_PrefersDisplayName: workspace/team/org/custom paths - NamespaceLabelWithName_FallsBackToUUIDPrefix: empty displayName → legacy shape - NamespacesToViews_PassesDisplayNameThrough: full integration on root case Canvas: existing 30 tests still pass; suffix drop is rendering-only. memories_v2.go function coverage: 14/14 = 100% - namespaceLabelWithName: 100% - namespacesToViews: 100% - (all 11 pre-existing functions stay at 100%) ## SSOT The "what is this namespace called" question now has one source of truth: namespace.Resolver.ReadableNamespaces sets DisplayName from the canonical workspace.name column. The handler is a renderer; the canvas is a consumer. No name-lookup logic duplicated across the three layers. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-05 18:46:50 -07:00
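The label rule described above (prefer the display name, fall back to the UUID-prefix shape, leave custom namespaces alone) might look like the following sketch. It assumes namespace strings of the form `kind:id`, as in the "workspace:root-1" example elsewhere in this log, and it shortens every fallback id to 8 characters, which is a simplification of the legacy labels quoted in the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// shortID returns the first n bytes of id; ids here are ASCII UUIDs.
func shortID(id string, n int) string {
	if len(id) <= n {
		return id
	}
	return id[:n]
}

// namespaceLabelWithName renders a dropdown label. Custom kinds keep the
// operator-chosen namespace name and ignore displayName.
func namespaceLabelWithName(name, kind, displayName string) string {
	prefixes := map[string]string{"workspace": "Workspace", "team": "Team", "org": "Org"}
	prefix, known := prefixes[kind]
	if !known {
		return name
	}
	if displayName != "" {
		return prefix + " (" + displayName + ")"
	}
	// Fallback: strip the "kind:" part and show a short id.
	id := name
	if i := strings.IndexByte(name, ':'); i >= 0 {
		id = name[i+1:]
	}
	return prefix + " (" + shortID(id, 8) + ")"
}

func main() {
	ns := "team:30ba7f0b-b303-4a20-aefe-3a4a675b8aa4"
	fmt.Println(namespaceLabelWithName(ns, "team", "mac laptop"))
	fmt.Println(namespaceLabelWithName(ns, "team", ""))
	fmt.Println(namespaceLabelWithName("custom:research", "custom", "ignored"))
}
```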
Hongming Wang	f1dc721eeb	Merge pull request #2964 from Molecule-AI/fix/delegation-ledger-utf8-truncate-2962 fix(delegation_ledger): rune-safe preview truncation (#2962)	2026-05-05 23:34:57 +00:00
Hongming Wang	a5903af459	fix(delegation_ledger): rune-safe preview truncation (#2962) The previous byte-slice form `s[:previewCap]` could split a multi-byte codepoint at byte 4096, producing invalid UTF-8. Postgres JSONB rejects the row → ledger insert silently fails → audit gap on dashboards while activity_logs continues to record the event. The fix walks the string by rune start index and stops at the last boundary that fits inside the cap. ASCII-only strings still hit the cap exactly; CJK/emoji strings stop slightly under, never over. Mirrors the truncatePreviewRunes fix shipped for agent_message_writer in #2959. Follow-up: deduplicate into a shared helper once both have landed. Tests: 2 regression tests using utf8.ValidString: one with an all-3-byte rune string just over the cap, one with a single multi-byte rune sitting exactly on the boundary. Verified against the previous byte-slice impl: both new tests fail there (invalid UTF-8 + truncation past cap by 1 byte).	2026-05-05 16:19:51 -07:00
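To make the failure mode concrete, this sketch contrasts the byte-slice form with a rune-boundary walk of the kind the commit describes. `truncateBytesNoMarker` is an illustrative name and implementation, not the code in `delegation_ledger.go`.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateBytesNoMarker returns the longest prefix of s that is at most
// maxBytes long and ends on a rune boundary. No marker is appended.
func truncateBytesNoMarker(s string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(s) <= maxBytes {
		return s
	}
	cut := 0
	for i := range s { // i is always the byte offset where a rune starts
		if i > maxBytes {
			break
		}
		cut = i
	}
	return s[:cut]
}

func main() {
	s := "日本" // two 3-byte runes, 6 bytes total

	naive := s[:4] // slices through the middle of the second rune
	fmt.Println(utf8.ValidString(naive))

	safe := truncateBytesNoMarker(s, 4)
	fmt.Println(safe, len(safe), utf8.ValidString(safe))
}
```

ASCII input still reaches the cap exactly, while multi-byte input stops at the last whole rune below it.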
Hongming Wang	5b78bea10d	feat(events): typed EventType registry — single source of truth for WS event names (RFC #2945 PR-B) Pre-RFC-#2945, every BroadcastOnly / RecordAndBroadcast call site passed a bare string literal: h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload) 29 producers (Go, ~30 call sites in handlers/, scheduler/, registry/, bundle/) and ~30 canvas consumers (TS store + listeners) duplicated the same string with no shared definition. A producer renaming an event silently broke every consumer — same drift class that produced the reno-stars data-loss regression on the persistence side. PR-A fixed the persistence-side SSOT (AgentMessageWriter); PR-B fixes the event-name SSOT. What this PR ships internal/events/types.go - EventType typed string + 29 named constants covering the full taxonomy (chat / lifecycle / agent assignment / delegation / task / approval / auth). - Grouped semantically; new constants must be added here AND mirrored in canvas/src/lib/ws-events.ts (parity gate landing in PR-B-2 follow-up). - AllEventTypes slice — authoritative list for the snapshot test + the cross-language parity gate. internal/events/types_test.go (3 tests) - TestAllEventTypes_IsSnapshot: pins the canonical list. Adding a new constant without updating AllEventTypes (or vice versa) fails with a one-line diff. - TestEventType_NoEmptyConstants: catches accidentally-empty values (typo in types.go: const X EventType = ...). - TestEventType_AllUppercaseSnakeCase: pins the wire format that canvas TS switch statements assume (no kebab-case, no mixed case, no leading/trailing/double underscores). agent_message_writer.go (single migration) - Demonstrates the constant-usage shape: events.EventAgentMessage → "AGENT_MESSAGE" - Other ~30 call sites stay on bare strings for now (this PR narrow); the migration happens in PR-B-1 follow-up. Both shapes (constant + bare string) co-exist on the wire — the typed version is just the recommended path for new code. 
Why ship this in stages 1. PR-B (this): types + tests + first migration → MERGEABLE NOW, low risk. 2. PR-B-1 (follow-up): migrate the remaining ~30 call sites to constants. Mechanical, low-risk. 3. PR-B-2 (follow-up): canvas/src/lib/ws-events.ts mirror + cross-language parity gate. Touches both repos. Per memory feedback_oss_design_philosophy.md (every refactor toward OSS plugin shape), this surface is now plugin-safe: external implementations can import the events package and get the same named taxonomy without copying strings. Verified - go vet ./internal/events/ clean - go build ./... clean - TestAllEventTypes_IsSnapshot + TestEventType_* all pass - TestAgentMessageWriter_* (the only call site touched) still green Refs RFC #2945, PR #2949 (PR-A SSOT), PR #2944 (reno-stars). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 16:25:38 -07:00
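The typed registry in 5b78bea10d could look roughly like the sketch below. Only `EventType`, `EventAgentMessage`, `AllEventTypes` and the wire strings come from the log; the other constant names, the three-entry list, and `validateEventTypes` are assumptions standing in for the real 29-constant file and its three tests.

```go
package main

import (
	"fmt"
	"regexp"
)

// EventType is a typed string so call sites cannot pass arbitrary literals
// without an explicit conversion.
type EventType string

const (
	EventAgentMessage       EventType = "AGENT_MESSAGE"
	EventTaskUpdated        EventType = "TASK_UPDATED"        // constant name assumed
	EventWorkspaceHeartbeat EventType = "WORKSPACE_HEARTBEAT" // constant name assumed
)

// AllEventTypes is the authoritative list a snapshot test would pin.
var AllEventTypes = []EventType{
	EventAgentMessage,
	EventTaskUpdated,
	EventWorkspaceHeartbeat,
}

// upperSnake rejects kebab-case, mixed case, and leading, trailing, or
// doubled underscores. Digits are allowed (e.g. A2A_RESPONSE).
var upperSnake = regexp.MustCompile(`^[A-Z0-9]+(_[A-Z0-9]+)*$`)

// validateEventTypes (hypothetical helper) folds the empty-value, wire-format,
// and duplicate checks into one pass.
func validateEventTypes(all []EventType) error {
	seen := make(map[EventType]bool, len(all))
	for _, et := range all {
		switch {
		case et == "":
			return fmt.Errorf("empty event type")
		case !upperSnake.MatchString(string(et)):
			return fmt.Errorf("event type %q is not UPPER_SNAKE_CASE", et)
		case seen[et]:
			return fmt.Errorf("duplicate event type %q", et)
		}
		seen[et] = true
	}
	return nil
}

func main() {
	fmt.Println(validateEventTypes(AllEventTypes)) // <nil>
	fmt.Println(validateEventTypes([]EventType{"agent-message"}))
}
```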
Hongming Wang	07d09f3696	Merge pull request #2959 from Molecule-AI/rfc-2945-pr-a-followup-utf8-and-db-errors fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (PR-A followup)	2026-05-05 16:19:29 -07:00
Hongming Wang	feef80423b	Merge pull request #2958 from Molecule-AI/fix/external-connect-templates-mcp-command fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957)	2026-05-05 16:18:23 -07:00
Hongming Wang	1e01083e55	fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (RFC #2945 PR-A followup) Self-review of PR #2949 surfaced two pre-existing defects that the SSOT consolidation inherited from the original /notify handler. Both are addressable in a small follow-up; shipping them as a separate PR keeps the consolidation and the bug-fix individually reviewable. Critical: byte-slice preview truncation produces invalid UTF-8 ------------------------------------------------------------- Pre-fix: if len(preview) > 80 { preview = preview[:80] + "…" } `len()` returns BYTES; `preview[:80]` slices on a byte boundary. For agent-authored chat in CJK / emoji / accented characters, byte 80 lands mid-codepoint → invalid UTF-8 → Postgres JSONB rejects → INSERT fails → activity_log row never written → message vanishes from chat history on the next reload. The persistence-failure log fires but operators have to grep to find it, and the user-visible regression mode is identical to reno-stars. Fix: extract `truncatePreviewRunes(s, maxRunes)` that walks the rune boundary using `for i := range s` (Go's range over string yields rune start indices). Cap at 80 RUNES not bytes — UI-friendly count, not storage count. Important: workspace-lookup error path swallows real DB errors -------------------------------------------------------------- Pre-fix: if err := w.db.QueryRowContext(...).Scan(&wsName); err != nil { return ErrWorkspaceNotFound } Conflates `sql.ErrNoRows` (legit not-found → caller 404) with real DB errors (connection drop, query timeout, pool exhaustion → caller should 503). During a Postgres outage every notify call surfaced as "workspace not found" — masking the actual incident in alerting and making the symptom indistinguishable from "you typed a bad workspace ID". Fix: distinguish via `errors.Is(err, sql.ErrNoRows)` and wrap non-not-found errors with `fmt.Errorf("agent_message: workspace lookup: %w", err)`. 
Callers' existing fallback path (return 500 / return error wrapped) handles the new shape correctly without any changes — verified by running existing TestNotify_* and TestMCPHandler_SendMessage_* tests. Tests added (3 new, 11 total writer tests) ------------------------------------------ - TestTruncatePreviewRunes_RuneBoundary: 8-case table — ASCII, CJK, exactly-at-max, emoji prefix. Asserts both correct visible output AND `utf8.ValidString` on every result so the bug shape (invalid UTF-8) can't recur. - TestAgentMessageWriter_Send_NonASCIIMessagePersists: end-to-end with a 200-rune CJK message (exceeds the 80-rune cap, would have hit the byte-slice bug). Pins the INSERT summary contains valid UTF-8 with exactly 80-rune body + ellipsis. - TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped: pins the DB-outage path returns a wrapped non-ErrWorkspaceNotFound error so alerting can distinguish 404 from 503. Verified via mock ExpectQuery returning a transient error. Verified -------- - `go vet ./internal/handlers/` clean - `go build ./...` clean - All 14 writer + caller tests pass (8 original + 3 new + AST gate + TestNotify_* + TestMCPHandler_SendMessage_* sibling tests) Per memory feedback_assert_exact_not_substring.md: every new test asserts boundary behavior directly (UTF-8 validity, exact rune count, errors.Is comparison) rather than substring-match in stringified output. Refs RFC #2945, PR #2949, PR #2944. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 16:10:58 -07:00
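A sketch of the rune-counted truncation that 1e01083e55 describes. The function name and the `for i := range s` walk come from the commit message; the body is a reconstruction under that description, not the shipped code, and the 3-rune cap in `main` stands in for the real cap of 80.

```go
package main

import "fmt"

// truncatePreviewRunes keeps at most maxRunes runes of s and appends an
// ellipsis only when something was actually cut. The cap counts runes, not
// bytes, so the result is always valid UTF-8 if the input was.
func truncatePreviewRunes(s string, maxRunes int) string {
	if maxRunes < 0 {
		maxRunes = 0
	}
	count := 0
	for i := range s { // i is the byte offset of each rune start
		if count == maxRunes {
			return s[:i] + "…"
		}
		count++
	}
	return s // s has maxRunes runes or fewer
}

func main() {
	fmt.Println(truncatePreviewRunes("日本語テキスト", 3)) // 日本語…
	fmt.Println(truncatePreviewRunes("abc", 3))     // abc
}
```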
Hongming Wang	eab36e217e	fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957 ) The External Connect modal's Codex and OpenClaw tabs were rendering this MCP server config: command = "python3" args = ["-m", "molecule_runtime.a2a_mcp_server"] That spawns the bare MCP dispatcher with no presence wiring. The ``molecule-mcp`` console-script wrapper (mcp_cli.main) is what calls ``POST /registry/register`` at startup and runs the 20s heartbeat thread alongside the MCP stdio loop. Without the wrapper, the canvas flips the workspace back to ``awaiting_agent`` (OFFLINE) within 60-90s — even while tools work — because nothing is heartbeating. Operator-side this looks like: the workspace is registered and tools work fine when invoked, but the canvas shows "offline" / "Restart" CTA, peer agents see the workspace as awaiting_agent in list_peers output, and inbound A2A delivery silently fails the readiness check. A new external-Codex operator (#2957) hit this and spent debugging time on what should have been a copy-paste install. Fix: switch both Codex and OpenClaw templates to ``command = "molecule-mcp"`` / ``args = []``, matching the universal MCP template that already handles this correctly. Inline comment in each template explains the wrapper-vs-bare-module tradeoff so a future template author doesn't regress to the shorter form. Hermes-channel intentionally still spawns the bare module — the hermes plugin owns the platform plugin path and runs its own register_platform/heartbeat code in-process; double-heartbeating would race. Universal/Codex/OpenClaw all need the wrapper. Regression gate: TestExternalMcpTemplates_UseMoleculeMcpWrapper asserts the three templates that must use the wrapper actually do, and explicitly fails on the old ``-m molecule_runtime.a2a_mcp_server`` shape. Verified the test FAILS on pre-fix source by stashing only external_connection.go and re-running. 
Source: molecule-core#2957 issue 1 (item 4 of the report; the ``(codex returned empty output)`` / opaque-canvas-error / stale-session items live in codex-channel-molecule and are tracked separately). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 16:06:02 -07:00
Hongming Wang	decec9b9a1	Merge pull request #2956 from Molecule-AI/feat/memory-tab-v2-redesign feat(memory): redesign Memory tab for v2 plugin	2026-05-05 22:56:55 +00:00
Hongming Wang	f0f4d0e761	feat(memory): redesign Memory tab for v2 plugin Replaces the v1 LOCAL/TEAM/GLOBAL tab trio (mapped to the deprecated shared_context model) with a v2 plugin-driven UI. Without this, canvas Memory tab was reading the frozen agent_memories table while all post-cutover agent writes went to the plugin's memory_records — the tab silently displayed stale data. ## Backend (workspace-server) New routes under wsAuth, all behind the existing per-tenant token: GET /workspaces/:id/v2/namespaces → readable + writable lists GET /workspaces/:id/v2/memories → plugin search proxy DELETE /workspaces/:id/v2/memories/:mid → plugin forget proxy memories_v2.go — slim handler: - Server-side ACL: every search request is intersected with the resolver's readable-namespaces set (canvas-supplied namespace that the workspace can't read returns [] not 403, matches v1 existence-non-inferring shape). - Returns 503 with "set MEMORY_PLUGIN_URL" hint when plugin isn't wired (canvas surfaces a banner). - Maps plugin not_found → 404, other plugin errors → 502. - View shaping: NamespaceView.label rendered server-side ("Workspace (abc-1234)", "Team (t-99)", "Org (acme)", custom) so canvas doesn't parse namespace names. MemoryView surfaces pin/expires_at/score/source_workspace_id from Propagation. memories_v2_test.go — 100% line + 100% function coverage: - 503 path on every endpoint when unwired - Namespaces success + readable/writable error paths - Search: empty intersection, full-path query/kind/limit propagation, namespace=/no-namespace branches, propagation map missing/wrong-type, intersect error, plugin error - Forget: success, plugin not_found→404, other plugin errors→502, missing memoryId→400 - Helpers: namespaceLabel for all 4 kinds + truncation, parseLimit edge cases (default/0/negative/over-cap/non-num), memoryToView field round-trip, indexOfColon, shortID ## Frontend (canvas) MemoryInspectorPanel rewritten for v2: - Drop LOCAL/TEAM/GLOBAL trio. 
Namespace dropdown driven by GET /v2/namespaces.readable, "All namespaces" default. - New per-row badges: kind (F/S/C), source (agent/runtime/user), pin (📌), TTL countdown (⌛12h / "expired"), score% on semantic search, source-workspace ⇡ws-pee for propagated. - Drop Edit button — v2 plugin contract has no PATCH; the model is forget + recommit. Forget stays. - Plugin-unavailable banner with operator hint when /v2/* returns 503. - Bug fix surfaced by test: rollback-on-failed-delete order of operations (loadEntries() called setError(null) AFTER we set the failure message, wiping it). Reload first, then set the error. MemoryEditorDialog deleted — Add was POST /memories which v2 doesn't support from canvas (writes go via MCP). The legacy Edit-flow tests go with it. ## Test results Backend: `go test ./internal/handlers/` — all pass Backend coverage on memories_v2.go: 100% lines, 100% functions Canvas: `vitest run` — 91 files, 1273 tests pass (26 new) Canvas coverage on MemoryInspectorPanel.tsx: 100% lines, 100% functions, 96.7% statements, 84.7% branches (uncovered branches are defensive `?? fallback` for contract-impossible kind/source values) ## Migration note The legacy v1 GET/POST/PATCH/DELETE on /workspaces/:id/memories remains in place for the back-compat MCP shim (mcp_tools_memory_v2's legacy routing) and admin export/import. PR-9 (#283) drops agent_memories along with the v1 endpoints once the cutover verification window closes.	2026-05-05 15:53:28 -07:00
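The server-side ACL in f0f4d0e761 intersects the canvas-supplied namespace with the readable set and returns an empty result rather than 403. A minimal sketch of that intersection follows; `intersectNamespaces`, the `kind:id` namespace strings, and the "empty request means all readable namespaces" rule are assumptions made for illustration.

```go
package main

import "fmt"

// intersectNamespaces (hypothetical helper) returns the requested namespaces
// the workspace may read. An unreadable namespace is silently dropped, so the
// caller gets an empty result instead of learning whether it exists.
func intersectNamespaces(requested, readable []string) []string {
	out := []string{} // non-nil, so it serializes as [] rather than null
	if len(requested) == 0 {
		// Assumed behavior for the "All namespaces" default.
		return append(out, readable...)
	}
	allowed := make(map[string]struct{}, len(readable))
	for _, ns := range readable {
		allowed[ns] = struct{}{}
	}
	for _, ns := range requested {
		if _, ok := allowed[ns]; ok {
			out = append(out, ns)
		}
	}
	return out
}

func main() {
	readable := []string{"workspace:abc-1234", "team:t-99"}
	fmt.Println(intersectNamespaces([]string{"team:t-99"}, readable)) // [team:t-99]
	fmt.Println(intersectNamespaces([]string{"org:acme"}, readable))  // []
}
```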
Hongming Wang	d99b3f2aec	refactor(handlers): consolidate Notify + MCP send_message_to_user through AgentMessageWriter (RFC #2945 PR-A) Pre-RFC-#2945 the broadcast + activity_log INSERT for "agent → user chat" was duplicated across two handlers — activity.go's Notify (HTTP /notify) and mcp_tools.go's toolSendMessageToUser (MCP tools/call). The duplication is exactly what produced the reno-stars production data-loss regression (PR #2944): the persistence-half fix landed for one handler and silently lagged for the other for months, dropping every long-form external-agent message on reload. PR #2944 added the missing INSERT to mcp_tools.go and a forward- looking AST gate. This PR removes the duplication at the source. What changes ------------ NEW: workspace-server/internal/handlers/agent_message_writer.go - AgentMessageWriter struct + NewAgentMessageWriter ctor. - Send(ctx, workspaceID, message, attachments) error: workspace lookup → broadcast WS AGENT_MESSAGE → INSERT activity_logs. - ErrWorkspaceNotFound for the lookup-miss path so callers can return 404 / JSON-RPC error cleanly. - Best-effort persistence: INSERT failure logs only, returns nil so the broadcast success isn't undone (matches previous behavior in both call sites — pinned by test). - Takes events.EventEmitter (interface) so tests can substitute a capturing fake without nil-panicking inside hub.Broadcast. UPDATED: activity.go:Notify - Replaced ~75 lines of inline broadcast+INSERT with a 12-line call to AgentMessageWriter.Send. - Attachment shape conversion (NotifyAttachment → AgentMessageAttachment) is local to the HTTP handler; the writer's API doesn't import the HTTP-binding-tagged type. UPDATED: mcp_tools.go:toolSendMessageToUser - Replaced ~40 lines (the post-#2944 broadcast+INSERT pair) with a 6-line call to the writer. - Attachments is nil today because the MCP tool args don't expose attachments yet. When the schema adds it, build the slice and pass through; the writer half is ready. 
Tests ----- agent_message_writer_test.go (8 tests, comprehensive): - TestAgentMessageWriter_Send_Success_NoAttachments — happy path, pins JSON `{"result":"hi"}`. - TestAgentMessageWriter_Send_Success_WithAttachments — pins file parts shape (kind=file, file.{uri,name,mimeType,size}). Uses a jsonMatcher that decodes + asserts via predicate (tolerant of map key ordering, exact on shape). - TestAgentMessageWriter_Send_WorkspaceNotFound — pins ErrWorkspaceNotFound + asserts NO broadcast NO INSERT. - TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil — pins best-effort persistence contract. - TestAgentMessageWriter_Send_PreviewTruncation — pins ≤80-char preview + ellipsis (Ryan's onboarding-friction report would have bloated activity_logs.summary by 2KB without this). - TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent — pins WS event name + payload shape via capturingEmitter. - TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty — pins the "no key when nil" wire contract. The existing AST gate from #2944 (TestAgentMessageBroadcastsArePersisted) still holds: any future function emitting AGENT_MESSAGE without an INSERT fails the test. With the writer in place that's now redundant — both producers go through it — but the gate is cheap to keep as defense-in-depth. Verified: go vet clean; all writer + caller tests pass; existing TestNotify_* + TestMCPHandler_SendMessage_* + the AST gate all green. Refs RFC #2945, PR #2944. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:29:42 -07:00
Hongming Wang	899c53550d	test(mcp): comprehensive coverage for send_message_to_user persistence + AST gate (reno-stars followup) Per user request: audit all similar tools + write comprehensive tests including E2E for the persistence-of-AGENT_MESSAGE-broadcasts contract. Audit (all BroadcastOnly call sites in workspace-server/internal/): \| Site \| Event \| Persisted? \| Notes \| \|---\|---\|:---:\|---\| \| a2a_proxy_helpers.go:275 \| A2A_RESPONSE \| ✓ \| LogActivity above \| \| activity.go:486 (Notify) \| AGENT_MESSAGE \| ✓ \| INSERT line 535 \| \| activity.go:701 (LogActivity) \| ACTIVITY_LOGGED \| ✓ \| self-emits inside DB write \| \| mcp_tools.go:341 (toolSendMessageToUser) \| AGENT_MESSAGE \| ✓ NEW (this PR) \| \| registry.go:575 \| TASK_UPDATED \| N/A \| transient progress, not chat \| \| registry.go:596 \| WORKSPACE_HEARTBEAT \| N/A \| infra ping, not chat \| Only one chat-bearing broadcast was missing persistence (the just- fixed mcp bridge path). No other regressions found. Tests added (4 new, total 5 send_message_to_user tests): 1. TestAgentMessageBroadcastsArePersisted — AST gate that walks every non-test .go in the package, finds funcs that BroadcastOnly with "AGENT_MESSAGE", asserts each ALSO contains an "INSERT INTO activity_logs". Forward-looking regression block: any future chat tool that broadcasts without persisting fails the test with a clear file:func diagnostic. Mutation-tested locally: removing the INSERT block from toolSendMessageToUser reliably produces the expected failure. 2. TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s — pins the "best-effort persistence" contract. DB INSERT failures must NOT abort the tool response (the WS broadcast already succeeded; retrying would double-render in the live chat). Matches /notify. 3. TestMCPHandler_SendMessageToUser_ResponseBodyShape — pins the exact `{"result": "<message>"}` JSON shape stored in response_body. 
The canvas hydrater (extractResponseText in historyHydration.ts) reads body.result; any drift here silently breaks chat history without failing the INSERT. Per memory feedback_assert_exact_not_substring.md, asserts the literal JSON shape, not a substring. 4. TestMCPHandler_SendMessageToUser_PersistsToActivityLog (existing, from previous commit) — pins INSERT shape with regex on 'a2a_receive' + 'notify' literals. 5. TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet (existing) — env-gate aborts before DB. Test fixture cleanup: newMCPHandler now uses newTestBroadcaster (real ws.Hub) instead of events.NewBroadcaster(nil) — the latter nil-panics inside hub.Broadcast on the AGENT_MESSAGE path. Same broadcaster shape every other handler test uses. E2E note: the AST gate is the strongest forward-looking guarantee. A real-DB integration test would add value for CI but is largely duplicative of the sqlmock contract tests above (sqlmock pins SQL shape with much faster feedback). Left as a future enhancement when the handlers Postgres-integration suite extends MCP coverage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 14:52:32 -07:00
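The AST gate in 899c53550d can be approximated with `go/parser` and `go/ast`. This sketch is simplified: it flags any function containing the literal "AGENT_MESSAGE" without an "INSERT INTO activity_logs" literal, whereas the real gate walks every non-test file in the package and looks specifically at BroadcastOnly calls. `unpersistedBroadcasters` and the sample source are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// unpersistedBroadcasters (hypothetical name) parses src and returns the
// functions that mention "AGENT_MESSAGE" but contain no activity_logs INSERT.
func unpersistedBroadcasters(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		var broadcasts, persists bool
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			lit, ok := n.(*ast.BasicLit)
			if !ok || lit.Kind != token.STRING {
				return true
			}
			val, err := strconv.Unquote(lit.Value) // handles "..." and `...`
			if err != nil {
				return true
			}
			if val == "AGENT_MESSAGE" {
				broadcasts = true
			}
			if strings.Contains(val, "INSERT INTO activity_logs") {
				persists = true
			}
			return true
		})
		if broadcasts && !persists {
			bad = append(bad, fn.Name.Name)
		}
	}
	return bad, nil
}

// sample only needs to parse, not type-check, so B and DB stay undefined.
const sample = `package p

func good(b B, db DB) {
	b.BroadcastOnly("ws", "AGENT_MESSAGE", nil)
	db.Exec("INSERT INTO activity_logs (workspace_id) VALUES ($1)", "ws")
}

func bad(b B) {
	b.BroadcastOnly("ws", "AGENT_MESSAGE", nil)
}
`

func main() {
	names, err := unpersistedBroadcasters(sample)
	fmt.Println(names, err) // [bad] <nil>
}
```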
Hongming Wang	cdfc9f743f	fix(mcp): persist send_message_to_user pushes to activity_log (reno-stars data loss) Reported on production tenant reno-stars: an external claude-code agent (CEO Ryan PC workspace) sent a long-form message via send_message_to_user; the user saw it live in the chat panel but it vanished after a refresh. Confirmed via direct production query — the message is NOT in activity_logs at all (only short test pings around it are persisted). Root cause: there are TWO server-side handlers for send_message_to_user: 1. HTTP `/workspaces/:id/notify` (activity.go:Notify) — broadcasts WS AND inserts a row into activity_logs. This is the path the in-container runtime's tool_send_message_to_user calls. 2. MCP-bridge `tools/call name=send_message_to_user` (mcp_tools.go:toolSendMessageToUser) — broadcasts WS only, never persisted. This is the path EXTERNAL agents using molecule-mcp's send_message_to_user tool route through. The persistence fix landed for path 1 months ago but was never mirrored on path 2. External agents — exactly the case in reno-stars/CEO Ryan PC — have been silently losing every long-form notification on reload. Fix: mirror the activity.go INSERT shape inside toolSendMessageToUser: INSERT INTO activity_logs (workspace_id, activity_type, method, summary, response_body, status) VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok') Same wire shape as /notify so the canvas's chat-history hydration (`type=a2a_receive&source=canvas`) treats both writers identically. Errors are log-only — broadcast already succeeded, persistence failure shouldn't block the tool response (matches /notify behavior; downside is the same data-loss-on-DB-error risk, surfaced via log.Printf). Tests ----- - `TestMCPHandler_SendMessageToUser_PersistsToActivityLog` — pins both the workspace-name lookup AND the INSERT shape. 
Regex-matches `'a2a_receive'` + `'notify'` literals so a future refactor that changes activity_type or method breaks the test loud, not silently re-introducing the data-loss bug. - Updated newMCPHandler to use newTestBroadcaster() (real ws.Hub) — events.NewBroadcaster(nil) crashes inside hub.Broadcast in the send_message_to_user path. Same shape every other handler test uses. Verified `go test ./internal/handlers/ -run TestMCPHandler_SendMessage` green; full vet clean. Refs reno-stars production incident 2026-05-05. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 14:47:48 -07:00
Hongming Wang	f3782662bd	refactor(external-connect): embed help in agent paste, fix wrong docs hostname Two related fixes to the Connect-External-Agent flow that the user flagged: the "Need help?" disclosure block in the modal is for the operator's eyes only — but the agent reading the pasted snippet has no access to that context. And the docs URL was pointing at a hostname that doesn't resolve. User-visible problems: 1. The agent doesn't see the install link, docs link, or the common- error/check pairs that the human pasted. When the agent fails to register or hits ConnectionRefused, it can't self-diagnose because the troubleshooting context lives in a separate UI block. 2. https://docs.molecule.ai → DNS NXDOMAIN. Every "Documentation" link in the modal was a dead link. ## Fixes ### Move help INTO the snippet (not a separate human-only UI block) Each of the 7 server-rendered templates in `workspace-server/internal/handlers/external_connection.go` now appends a `# Need help?` section with: install link, correct docs link, and the top common errors as `# • symptom — check` pairs. Templates updated: curl / channel (Claude Code) / mcp (Universal MCP) / python / hermes / codex / openclaw. Agents reading the paste now have the same diagnostic context the human did. ### Drop the duplicated UI block in the canvas modal `canvas/src/components/ExternalConnectModal.tsx`: - Removed the `TAB_HELP` per-tab metadata constant (152 lines). - Removed the `HelpBlock` component (62 lines). - Removed the `<HelpBlock help={TAB_HELP[tab]} />` render call. The snippet is now the single source of truth for tab-level help. 
### Fix the wrong docs hostname The actual docs site is `doc.moleculesai.app` (singular `doc`, `.app` not `.ai`), confirmed by: - `package.json` description in `Molecule-AI/docs` repo → "Molecule AI documentation site — doc.moleculesai.app" - HTTP HEAD on the new URL → 200 for both `/docs/guides/mcp-server-setup` and `/docs/guides/external-agent-registration` - HTTP HEAD on old `docs.molecule.ai` → 000 (NXDOMAIN) All template docs URLs now point at `doc.moleculesai.app`. ## Verification - `go build ./...` clean - `go test ./internal/handlers/... -count=1` green - `pnpm test` → 1291/1291 pass (unchanged) - `tsc --noEmit` clean - 219 LOC removed (canvas duplicate UI), 69 LOC added (snippet help) - Net `-150 LOC` while gaining the agent-readable help ## Out of scope (deferred, captured in followups) - One blog post still has `canonical: "https://docs.molecule.ai/blog/..."` in `src/app/blog/2026-04-20-chrome-devtools-mcp/page.mdx` — separate blog-content fix. - Comment in `theme-provider.tsx` references `docs.moleculesai.app` (with `s`) — comment-only, not a runtime URL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:51:35 -07:00
Hongming Wang	cb70d3d437	docs: callout Python>=3.11 requirement on Universal MCP install snippet User-reported friction: pip install molecule-ai-workspace-runtime on a 3.10 interpreter fails with "Could not find a version that satisfies the requirement (from versions: none)" — pip's requires_python filter silently drops the only available artifact before attempting install, so the error doesn't mention Python at all. Operators see "package missing", file a bug, and chase a phantom CDN/visibility issue. Two changes mirror the requirement at the two operator-touch surfaces: 1. workspace-server/internal/handlers/external_connection.go: the externalUniversalMcpTemplate snippet (rendered into the canvas Connect-External-Agent modal) now leads with a brief "Requires Python >= 3.11" block + diagnostic + upgrade paths. 2. docs/workspace-runtime-package.md: same callout at the top of the doc, before the Overview, so anyone landing here from search gets the answer immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:44:25 -07:00
Hongming Wang	423d58d42c	fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc Three small hardening passes from #2872's optional/important findings, batched into one polish PR: 1. errors.Is(err, sql.ErrNoRows) instead of err == sql.ErrNoRows. The bare equality breaks if any future caller wraps the error via fmt.Errorf("…: %w", err) — the no-rows happy path would fall through to the "real DB error" branch and abort the import. errors.Is unwraps. New test TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound pins the fix; verified the test fails on the old `==` shape (build break on unused-import + assertion failure once import dropped). 2. Bounded 5s timeout on lookupExistingChild instead of context.Background(). The createWorkspaceTree call site runs in goroutines spawned from the /org/import handler, so plumbing the request context here would cascade-cancel into provisionWorkspaceAuto and abort in-flight EC2 provisioning if the client disconnected mid-import — that's the wrong tradeoff. A short bounded timeout protects the per-row SELECT against a wedged DB without taking the drop-everything-on-disconnect behaviour. The lookup is a single ~10ms query; 5s leaves 500x headroom for transient slow paths. 3. Godoc clarifications on the skip-path block. - /org/import is ADDITIVE-ONLY, never destructive. Children present in the existing tree but absent from the new template are preserved (no DELETE on diff). - Skip-path does NOT propagate updates to existing nodes — a re-import that adds an initial_memory or schedule to an existing workspace is silently dropped. Document the limitation so future operators know to delete-and-re-import or reach for a future /org/sync route. Verification: - go build ./... → clean - go test ./internal/handlers/... 
→ all passing (TestLookup* + TestCreateWorkspaceTree* + TestClass1* + TestGate*) - 4 lookup tests + 1 new wrap-safety test → 5/5 PASS - Full handlers suite → green Refs molecule-core#2872 (Optional findings — wrap-safety + ctx, godoc clarifications for additive-only + skip-path-update-limitation) Out of scope (deferred): - PR-D partial unique index migration + ON CONFLICT — sequenced after Phase 4 cleanup verified clean per #2872 plan - PR-E full createWorkspaceTree integration test for partial-match — needs heavier sqlmock scaffolding for downstream workspaces_audit/canvas_layouts/secrets/channels INSERTs; follow-up Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:20:54 -07:00
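Two of the hardening passes in 423d58d42c are easy to demonstrate in isolation: `errors.Is` still matches `sql.ErrNoRows` after wrapping, where `==` does not, and a lookup context can be bounded without inheriting the request context. The helper names and the wrapped-error message here are illustrative assumptions; only the 5s bound and the `errors.Is` switch come from the commit.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
	"time"
)

// lookupTimeout mirrors the 5s bound described in the commit message.
const lookupTimeout = 5 * time.Second

// errWrappedNoRows simulates a caller that wrapped the sentinel with %w.
var errWrappedNoRows = fmt.Errorf("lookup existing child: %w", sql.ErrNoRows)

// isNoRows reports whether err is, or wraps, sql.ErrNoRows.
func isNoRows(err error) bool {
	return errors.Is(err, sql.ErrNoRows)
}

// newLookupContext returns a context that is deliberately detached from any
// request context (so a client disconnect cannot cancel it) but still bounded.
func newLookupContext() (context.Context, context.CancelFunc) {
	return context.WithTimeout(context.Background(), lookupTimeout)
}

func main() {
	fmt.Println(errWrappedNoRows == sql.ErrNoRows) // false: bare equality misses the wrap
	fmt.Println(isNoRows(errWrappedNoRows))        // true: errors.Is unwraps

	ctx, cancel := newLookupContext()
	defer cancel()
	_, hasDeadline := ctx.Deadline()
	fmt.Println(hasDeadline) // true
}
```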
Hongming Wang	60afcd43c9	test(handlers): generic Class 1 leak AST gate (#2867 PR-A) Adds class1_ast_gate_test.go — a per-package AST walk that fails the build if any handler function INSERTs INTO workspaces inside a range loop body without one of three escape hatches: 1. A call to a registered preflight helper (lookupExistingChild today; extend preflightCallNames as new helpers are introduced). 2. An ON CONFLICT clause in the same SQL literal (idempotent UPSERT, like registry.go). 3. An explicit `// class1-gate: idempotent-by-design` comment in the function body (deliberately awkward — forces a code-review beat). Why this is broader than the existing TestCreateWorkspaceTree_CallsLookupBeforeInsert gate in org_import_idempotency_test.go: that one is hard-coded to one function in one file. This one walks every non-test .go file in the handlers package and applies a structural rule independent of file/function names. A future handler written from scratch in a new file would not have been covered before — now it is. Detection mechanism (per AST): - Collect spans (Lbrace..Rbrace) of every RangeStmt body in each function. Position-based instead of stack-based — ast.Inspect's nil-callback ordering doesn't give per-node pop semantics, so a naive push/pop stack silently miscounts. Position spans are deterministic. - Walk every BasicLit, regex-match `^\s*INSERT INTO workspaces\(` (tightened from bytes.Index "INSERT INTO workspaces" so workspaces_audit literals don't false-positive — same regex used by the existing createWorkspaceTree gate). - For each match: record insertLine, hasONCONFLICT, and the innermost enclosing RangeStmt line (or 0 if not inside any range). - Fail the function if INSERT is inside a range AND no preflight AND no ON CONFLICT AND no allowlist annotation. 
Self-tests (per `feedback_assert_exact_not_substring.md` — verify gate fails on the bug shape before merging): - TestClass1_GateFiresOnSyntheticBuggySource: synthetic source where INSERT is inside `for _, child := range children` body must trigger the gate's three guards (enclosingRangeLine!=0, hasONCONFLICT=false, no preflight call). - TestClass1_GateAllowsONCONFLICT: synthetic INSERT...ON CONFLICT must NOT trigger the gate (idempotent UPSERT case). - TestClass1_GateAllowsAllowlistAnnotation: function with `// class1-gate: idempotent-by-design` must be skipped. - TestClass1_NoUnpreflightedInsertInsideRange: production sweep over every handler .go file. Currently passes because org_import.go preflights, registry.go ON-CONFLICTs, and workspace.go's Create has no INSERT inside a range body. Verification: - go test ./internal/handlers/... -run TestClass1_ -count=1 → 4/4 PASS - go test ./internal/handlers/... -count=1 → suite green (no pre-existing test broken by the new file) Refs molecule-core#2867 (PR-A Class 1 generic AST gate) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 13:01:34 -07:00
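The position-span detection from 60afcd43c9 can be sketched as follows. The regex is the one quoted in the commit message; everything else (`insertsInsideRange`, the `span` type, the sample source) is hypothetical, and the preflight-helper and allowlist-comment escape hatches are left out to keep the sketch short.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"regexp"
	"strconv"
	"strings"
)

var insertRe = regexp.MustCompile(`^\s*INSERT INTO workspaces\(`)

// span is the Lbrace..Rbrace extent of one range-loop body.
type span struct{ start, end token.Pos }

// insertsInsideRange (hypothetical name) returns functions that INSERT INTO
// workspaces inside a range body without an ON CONFLICT clause.
func insertsInsideRange(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		// Pass 1: collect range-body spans by position, not by a push/pop stack.
		var spans []span
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if r, ok := n.(*ast.RangeStmt); ok {
				spans = append(spans, span{r.Body.Lbrace, r.Body.Rbrace})
			}
			return true
		})
		// Pass 2: check each matching SQL literal against the spans.
		flagged := false
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			lit, ok := n.(*ast.BasicLit)
			if !ok || lit.Kind != token.STRING {
				return true
			}
			sql, err := strconv.Unquote(lit.Value)
			if err != nil || !insertRe.MatchString(sql) || strings.Contains(sql, "ON CONFLICT") {
				return true
			}
			for _, s := range spans {
				if lit.Pos() > s.start && lit.Pos() < s.end {
					flagged = true
				}
			}
			return true
		})
		if flagged {
			bad = append(bad, fn.Name.Name)
		}
	}
	return bad, nil
}

const sample = `package p

func leaky(db DB, children []string) {
	for _, c := range children {
		db.Exec("INSERT INTO workspaces(name) VALUES ($1)", c)
	}
}

func upsert(db DB, children []string) {
	for _, c := range children {
		db.Exec("INSERT INTO workspaces(name) VALUES ($1) ON CONFLICT DO NOTHING", c)
	}
}

func single(db DB) {
	db.Exec("INSERT INTO workspaces(name) VALUES ($1)", "root")
}
`

func main() {
	names, err := insertsInsideRange(sample)
	fmt.Println(names, err) // [leaky] <nil>
}
```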
Hongming Wang	ff75aeb43e	Merge pull request #2922 from Molecule-AI/fix/memory-plugin-gate-sidecar-on-cutover fix(memory-plugin): gate sidecar spawn on cutover-active	2026-05-05 19:44:01 +00:00
Hongming Wang	412dec0d87	fix(memory-plugin): gate sidecar spawn on cutover-active PR #2906 spawned the sidecar unconditionally on every tenant boot. The plugin's first migration runs \`CREATE EXTENSION vector\` which fails on tenant Postgres without pgvector preinstalled — every staging tenant redeploy aborted at the 30s health gate. CP fail-fast kept running tenants on the prior image (no outage), but the new image was DOA. Caught on staging redeploy 2026-05-05 19:23 with \`pq: extension "vector" is not available\`. Fix: only spawn the sidecar when the operator has flipped the cutover flag — \`MEMORY_V2_CUTOVER=true\` OR \`MEMORY_PLUGIN_URL\` is set. * Aligns the entrypoint to the same opt-in posture wiring.go already uses (it skips building the client when MEMORY_PLUGIN_URL is empty). * Until cutover, the sidecar isn't even running — no migration, no health gate, no boot-time pgvector dependency. * Operators activating cutover already redeploy with the new env vars set; that's when the sidecar starts. By definition they've verified pgvector is available before flipping. * MEMORY_PLUGIN_DISABLE=1 escape hatch preserved; harness fix #2915 becomes belt-and-suspenders (still respected). Both Dockerfile and entrypoint-tenant.sh updated. Behavior change for existing deployments: zero (cutover env vars still unset → sidecar still inert, but now also not running). Refs RFC #2728. Hotfix for #2906; supersedes the migration-path fragility class (the sidecar isn't doing migrations on tenants that won't use it).	2026-05-05 12:39:03 -07:00
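The opt-in gate from 412dec0d87 reduces to a small predicate. The real check lives in the Dockerfile and entrypoint-tenant.sh, so this Go version is only an illustration of the logic; `shouldSpawnSidecar` is a hypothetical name, and giving `MEMORY_PLUGIN_DISABLE=1` precedence over the cutover flags is an assumption about how the escape hatch interacts with them.

```go
package main

import (
	"fmt"
	"os"
)

// shouldSpawnSidecar (hypothetical) reports whether the memory-plugin sidecar
// should start. getenv is injected so the rule can be exercised without
// touching the process environment.
func shouldSpawnSidecar(getenv func(string) string) bool {
	// Escape hatch: assumed to win over the cutover flags.
	if getenv("MEMORY_PLUGIN_DISABLE") == "1" {
		return false
	}
	// Opt-in: spawn only once the operator has flipped cutover.
	return getenv("MEMORY_V2_CUTOVER") == "true" || getenv("MEMORY_PLUGIN_URL") != ""
}

func main() {
	fmt.Println("spawn sidecar:", shouldSpawnSidecar(os.Getenv))
}
```

With both cutover variables unset the predicate is false, so no migration runs and there is no boot-time pgvector dependency on tenants that have not opted in.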
Hongming Wang	83454e5efd	feat(workspace-server): structured logging at provisioning boundaries Adds internal/provlog with a single Event(name, fields) helper that emits JSON-tagged single-line records to the standard logger. Five boundary sites instrumented for #2867: provision.start — workspace_dispatchers.go (sync + async) provision.skip_existing — org_import.go idempotency hit provision.ec2_started — cp_provisioner.go after RunInstances provision.ec2_stopped — cp_provisioner.go after TerminateInstances ack restart.pre_stop — workspace_restart.go before Stop dispatch These pair with the existing human-prose log.Printf lines (kept). The new records are grep+jq friendly so a future log-aggregation pipeline can reconstruct per-workspace provision timelines without parsing the operator messages — this is the "and debug loggers so it dont happen again" half of the leak-prevention work. Tests: - provlog: emits evt-prefixed JSON, nil-tolerant, marshal-error fallback preserves event boundary, single-line output pinned. - handlers: provlog_emit_test.go pins three call-site contracts: provisionWorkspaceAutoSync emits provision.start with sync=true, stopForRestart emits restart.pre_stop with backend=cp on SaaS, and backend=none when both backends are nil. Field taxonomy is convenience for ops, not contract — payload can grow additively without breaking callers. Behavior gate is the event name + boundary location, per feedback_behavior_based_ast_gates.md. Refs #2867 (PR-D structured logging at provisioning boundaries) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 12:30:11 -07:00
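A rough sketch of the kind of single-line record 83454e5efd describes for internal/provlog. The `evt: ` prefix, the `record` layout, and `formatEvent` are assumptions (the log only says the output is evt-prefixed, JSON, single-line, nil-tolerant, and survives marshal errors); the real helper writes to the standard logger rather than returning a string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// record is an assumed wire shape; the actual field taxonomy is described in
// the commit as a convenience for ops rather than a contract.
type record struct {
	Event  string         `json:"event"`
	Fields map[string]any `json:"fields,omitempty"`
	Error  string         `json:"error,omitempty"`
}

// formatEvent (hypothetical) renders one grep- and jq-friendly line. If the
// fields cannot be marshalled, it still emits the event name so the boundary
// is not lost.
func formatEvent(name string, fields map[string]any) string {
	line, err := json.Marshal(record{Event: name, Fields: fields})
	if err != nil {
		// Marshalling a struct of plain strings cannot fail.
		line, _ = json.Marshal(record{Event: name, Error: "marshal failed"})
	}
	return "evt: " + string(line)
}

func main() {
	fmt.Println(formatEvent("provision.start", map[string]any{"sync": true}))
	fmt.Println(formatEvent("restart.pre_stop", nil))
	fmt.Println(formatEvent("bad", map[string]any{"ch": make(chan int)}))
}
```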
Hongming Wang	8254bedf30	Merge pull request #2917 from Molecule-AI/chore/delete-team-collapse-2864 chore: delete TeamHandler.Collapse + docs cleanup (closes #2864)	2026-05-05 19:04:30 +00:00
Hongming Wang	eec4ea2e7d	chore: delete TeamHandler.Collapse + docs cleanup (closes #2864 ) Multi-model retrospective review of #2856 (Phase 1 Expand removal) flagged that TeamHandler.Collapse is unreachable from the canvas UI: the "Collapse Team" button calls PATCH /workspaces/:id { collapsed } (visual flag toggle on canvas_layouts), NOT POST /workspaces/:id/collapse. The destructive POST route — which stops EC2s, marks children removed, and deletes layouts — has zero UI callers (verified via grep across canvas/, scripts/, and the MCP tool registry; only docs referenced it). Two semantically different operations had been sharing the word "Collapse": - Visual collapse (canvas) → PATCH { collapsed: true }. Hides children visually. Reversible. UI-only. - Destructive collapse (POST /collapse) → Stops + marks removed. Irreversible. No caller. Deleting the destructive one + its supporting machinery: - workspace-server/internal/handlers/team.go (entirely) - workspace-server/internal/handlers/team_test.go (entirely) - POST /collapse route + teamh init in router.go - findTemplateDirByName helper (zero non-test callers after Expand was deleted in #2856; package-private so no out-of-package consumers) - NewTeamHandler constructor (no callers after route removed) Plus stale doc references (the most dangerous was the MCP wrapper mapping in mcp-server-setup.md — anyone generating MCP tool wrappers from that table was wiring a 404): - docs/agent-runtime/team-expansion.md (deleted entirely — whole guide taught the deleted flow) - docs/api-reference.md (dropped two team.go rows) - docs/api-protocol/platform-api.md (dropped /expand + /collapse rows) - docs/architecture/molecule-technical-doc.md (dropped /expand + /collapse rows) - docs/guides/mcp-server-setup.md (dropped expand_team + collapse_team MCP wrapper mappings) - docs/glossary.md (dropped "(org template expand_team)" parenthetical) - docs/frontend/canvas.md (dropped broken link to deleted team-expansion.md) Kept: 
docs/architecture/backends.md mention of "TeamHandler.Expand (#2367) bypassed routing on Start" — correct historical context for the AST gate's existence, no live route reference. Visual-collapse path unaffected: canvas/src/components/ContextMenu.tsx:227 → api.patch — unchanged canvas/src/components/WorkspaceNode.tsx:128 → api.patch — unchanged go vet ./... clean. go test ./internal/handlers/ -count 1 — all green (4.3s, no regression). Net: -388/+10 = ~378 lines removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:59:43 -07:00
Hongming Wang	6201d12533	fix(memory-plugin): embed migrations into binary via go:embed PR #2906 shipped the binary at /memory-plugin without the migrations directory. The plugin's runMigrations() resolved a relative path `cmd/memory-plugin-postgres/migrations` that exists in the build context but NOT in the runtime image. Every staging tenant boot failed with: memory-plugin-postgres: migrate: read migrations dir "cmd/memory-plugin-postgres/migrations": open cmd/memory-plugin-postgres/migrations: no such file or directory memory-plugin: ❌ /v1/health never returned 200 after 30s — aborting boot Caught on the staging redeploy fleet job after #2906 merged. Tenants stayed on the old image (CP redeploy correctly fail-fasted) but the new image was broken. Fix: `//go:embed migrations/*.up.sql` bundles the migrations into the binary at build time. No filesystem path dependency at runtime. * `embed.FS` embeds the .up.sql files alongside the binary. * runMigrations() reads from migrationsFS by default; MEMORY_PLUGIN_MIGRATIONS_DIR override path preserved for operators shipping custom migrations. * Names sorted alphabetically — pinned by a test so a future `002_*.up.sql` is guaranteed to run after `001_*.up.sql`. Tests: * TestMigrationsEmbedded_ContainsCreateTable — pins that the embed pattern matched files AND those files contain CREATE TABLE (catches both empty-pattern and wrong-files-embedded). * TestRunMigrationsFromEmbed_OrderingIsAlphabetic — pins sorted application order. Verified locally: `go build` succeeds, binary 9.3MB, `strings` shows the embedded SQL. Refs RFC #2728. Hotfix for #2906.	2026-05-05 11:57:37 -07:00
Hongming Wang	fc1c45789e	Merge pull request #2912 from Molecule-AI/feat/saas-default-hardening-2910 feat(saas): close 4th default-tier site + lift org_import asymmetry + tests (#2910)	2026-05-05 18:42:19 +00:00
Hongming Wang	9f551319d2	feat(saas): close 4th default-tier site + lift org_import asymmetry + tests (#2910 ) Multi-model retrospective review of #2901 found three Critical gaps: 1. (#2910 PR-B) template_import.go:79 wrote `tier: 3` hardcoded into generated config.yaml. On SaaS this defeated the T4 default at the create-handler layer — a config-less template import landed at T3 regardless of POST /workspaces' computed default. The 4th default-tier site #2901 missed. 2. (#2910 PR-A) #2901 claimed `go test ... all green` but added zero new tests. Existing structural-pin tests caught dispatch-layer drift but said nothing about tier-default drift. A future refactor that flips DefaultTier() to always return 3 would ship green. 3. (#2910 PR-E) org_import.go fallback returned T2 on self-hosted while workspace.go returned T3. Internally consistent ("bulk vs interactive defaults") but undocumented same-name-different-value drift. Fix: - TemplatesHandler.NewTemplatesHandler now takes `wh *WorkspaceHandler` (nil-tolerant for read-only callers). Import + ReplaceFiles compute tier via h.wh.DefaultTier() and pass it to generateDefaultConfig. generateDefaultConfig gets a `tier int` parameter (bounds-checked, invalid input falls back to T3). - org_import.go fallback lifts to h.workspace.DefaultTier() — single source of truth shared with Create + Templates so a future tier-default change sweeps every entry point at once. - New saas_default_tier_test.go pinning: TestIsSaaS_TrueWhenCPProvWired TestIsSaaS_FalseWhenOnlyDocker TestDefaultTier_SaaS_IsT4 TestDefaultTier_SelfHosted_IsT3 TestGenerateDefaultConfig_RespectsTierParam TestGenerateDefaultConfig_SelfHostedTierT3 TestGenerateDefaultConfig_OutOfRangeFallsBackToT3 - Existing template_import_test.go tests + chat_files_test.go + security_regression_test.go updated to thread the new tier param / wh constructor arg through their NewTemplatesHandler calls. 
Their pre-#2910 assertion of `tier: 3` is preserved (now passes because the test caller passes `3` explicitly), so no regression. go vet ./... clean. go test ./internal/handlers/ -count 1 — all green (4.2s). Deferred to separate follow-ups (per #2910 plan): - PR-C: MOLECULE_DEPLOYMENT_MODE explicit deployment-mode signal (closes the IsSaaS()=cpProv!=nil structural fragility) - PR-D: Host iptables IMDS block + IMDSv2 hop-limit (paired with molecule-controlplane EC2-IAM-scope audit) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:38:22 -07:00
Hongming Wang	1052f8bdb0	fix(memory-plugin): bind to 127.0.0.1 by default Self-review of PR #2906 flagged: defaultListenAddr was ":9100" — binds on every container interface. Inside today's deployment that's moot (no host port mapping, platform talks over loopback) but it's not least-privilege. A future Dockerfile edit that publishes the port, a misconfigured Fly machine, or a future cross-host plugin topology would expose an unauth'd memory store. Loopback is the right baseline. Operators with a multi-host topology already override via MEMORY_PLUGIN_LISTEN_ADDR — that path is unchanged. Tests: * TestLoadConfig_DefaultListenAddrIsLoopback pins the new default. * TestLoadConfig_ListenAddrEnvOverride pins the override path so operators relying on it don't break. * TestLoadConfig_MissingDatabaseURL covers the existing fail-fast. No prior unit tests existed for loadConfig — boot_e2e_test.go always sets MEMORY_PLUGIN_LISTEN_ADDR explicitly, so the default was never exercised by tests. This PR adds that coverage. Refs RFC #2728. Hardening follow-up to PR #2906.	2026-05-05 11:35:24 -07:00
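Illustration for 1052f8bdb0: a minimal sketch of a loopback default with an environment override. The `config` struct, the injected `getenv` parameter, and the exact default string `127.0.0.1:9100` are assumptions inferred from the commit text (loopback default, port 9100, `MEMORY_PLUGIN_LISTEN_ADDR` override, fail-fast on a missing database URL); the real `loadConfig` may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Assumed default: loopback only, same port as the earlier ":9100".
const defaultListenAddr = "127.0.0.1:9100"

type config struct {
	ListenAddr  string
	DatabaseURL string
}

// loadConfig takes getenv as a parameter so tests can inject values
// (hypothetical signature, chosen to keep the sketch testable).
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		ListenAddr:  defaultListenAddr,
		DatabaseURL: getenv("MEMORY_PLUGIN_DATABASE_URL"),
	}
	if cfg.DatabaseURL == "" {
		return config{}, errors.New("MEMORY_PLUGIN_DATABASE_URL is required")
	}
	if v := getenv("MEMORY_PLUGIN_LISTEN_ADDR"); v != "" {
		cfg.ListenAddr = v // operator override for multi-host topologies
	}
	return cfg, nil
}

func main() {
	env := map[string]string{"MEMORY_PLUGIN_DATABASE_URL": "postgres://localhost/db"}
	cfg, err := loadConfig(func(k string) string { return env[k] })
	fmt.Println(cfg.ListenAddr, err)
}
```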
Hongming Wang	5334d60de4	Merge pull request #2898 from Molecule-AI/2867-workspaces-insert-allowlist test(handlers): allowlist INSERT INTO workspaces sites (#2867 class 1)	2026-05-05 18:18:19 +00:00
Hongming Wang	d6c0227e3f	Merge pull request #2906 from Molecule-AI/feat/memory-plugin-sidecar-bundle feat(memory-v2): bundle memory-plugin-postgres as in-image sidecar	2026-05-05 18:16:57 +00:00
Hongming Wang	27db090d3d	Merge pull request #2907 from Molecule-AI/feat/poll-mode-chat-upload-phase5a feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening	2026-05-05 11:16:56 -07:00
Hongming Wang	0f25f6de97	test(handlers): allowlist INSERT INTO workspaces sites — close bulk-create regression class (#2867 class 1) Adds TestINSERTworkspacesAllowlist: walks every non-test .go in this package, finds funcs containing an `INSERT INTO workspaces (` SQL literal, and pins the result against an explicit allowlist with the safety mechanism named per entry. New entries fail the build until a reviewer adds them — forcing the question "what makes this INSERT idempotent?" at PR-review time, not after the next bulk-create leak (the shape that produced 72 stale child workspaces in tenant-hongming over 4 days). Pairs with TestCreateWorkspaceTree_CallsLookupBeforeInsert (the behavior pin for the one bulk path today). Together: - this test catches "did a new function start inserting?" - that test catches "did the existing bulk path drop its idempotency check?" Both fire immediately when drift happens. Current allowlist (3 entries): - org_import.go:createWorkspaceTree → lookup-then-insert via lookupExistingChild (#2868 phase 3, also pinned by the sibling AST gate from #2895) - registry.go:Register → ON CONFLICT (id) DO UPDATE (idempotent by primary key — external workspace upsert) - workspace.go:Create → single-workspace POST /workspaces, server- generated UUID, no iteration Verified via mutation: dropping a synthetic tempBulkLeakTest with an unsafe loop+INSERT into the package fails the gate with a clear diagnostic pointing at the file + function. Restoring the tree returns the gate to green. Memory: feedback_assert_exact_not_substring.md (verify tightened test FAILS on bug shape) — mutation proof done locally. RFC #2867 class 1. Class 2 (Prometheus gauge for ec2_instance duplicates) + class 3 (structured logging on workspace create) are follow-up PRs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:15:16 -07:00
Hongming Wang	9991057ad1	feat(poll-upload): phase 5a — atomic batch insert + acked-index + mime hardening Resolves four of six findings from the retrospective code review of Phases 1–4 (poll-mode chat upload). Bundled because every change is in the platform's pending_uploads layer or the multi-file handler that reads it. Findings resolved: 1. Important — Sweep query lacked an index for the acked-retention OR-arm. The Phase 1 partial indexes are both `WHERE acked_at IS NULL`, so the `(acked_at IS NOT NULL AND acked_at < retention)` half of the WHERE clause seq-scanned the table on every cycle. Add a complementary partial index on `acked_at WHERE acked_at IS NOT NULL` so both arms of the disjunction are index-covered. Disjoint from the existing two indexes (no row matches both predicates), so write amplification is bounded to ~one index entry per terminal-state row. 2. Important — uploadPollMode partial-failure left orphans. The previous per-file Put loop committed rows 1..K-1 and then errored on row K with no compensation, so a client retry would double-insert the survivors. Refactor the handler into three explicit phases (pre-validate + read-into-memory, single atomic PutBatch, per-file activity row) and add Storage.PutBatch with all-or-nothing transaction semantics. 3. FYI — pendinguploads.StartSweeperWithInterval was exported only for tests. Move it to lower-case startSweeperWithInterval and expose the test seam through pendinguploads/export_test.go (Go convention; the shim file is stripped from the production binary at build time). 4. Nit — multipart Content-Type was passed verbatim into pending_uploads rows and re-served on /content. Add safeMimetype which strips parameters, rejects CR/LF/control bytes, and coerces malformed shapes to application/octet-stream. The eventual GET /content response can no longer be header-split via a crafted Content-Type on the multipart. 
Comprehensive tests: - 10 PutBatch unit tests (sqlmock): happy path, empty input, all four pre-validation rejection paths, BeginTx error, per-row error + Rollback (no Commit), first-row error, Commit error. - 4 new PutBatch integration tests (real Postgres): all-rows-commit happy path with COUNT(*) verification, atomic-rollback no-leak via a NUL-byte filename that lib/pq rejects mid-batch, oversize short-circuit no-Tx, idx_pending_uploads_acked existence + partial predicate via pg_indexes (planner-shape-independent). - 3 new chat_files_poll tests: atomic rollback on second-file oversize, atomic rollback on PutBatch error, mimetype CRLF/NUL/parameter sanitization (8 sub-cases). The two remaining review findings (inbox_uploads.fetch_and_stage blocks the poll loop synchronously; two httpx Clients per row) are Python-side and ship in Phase 5b once this lands on staging. Test-only export pattern via export_test.go, atomic pre-validation discipline (validate before Tx), and behavior-based (not name-based) test assertions follow the standing project conventions.	2026-05-05 11:10:13 -07:00
Hongming Wang	b89a49ec93	feat(memory-v2): bundle memory-plugin-postgres as in-image sidecar Closes the gap between the merged Memory v2 code (PR #2757 wired the client into main.go) and operator activation. Without this PR an operator wanting to flip MEMORY_V2_CUTOVER=true had to provision a separate memory-plugin service and point MEMORY_PLUGIN_URL at it — extra ops surface for what the design intends to be a built-in. What ships: * Both Dockerfile + Dockerfile.tenant build the cmd/memory-plugin-postgres binary into /memory-plugin. * Entrypoints spawn the plugin in the background on :9100 BEFORE starting the main server; wait up to 30s for /v1/health to return 200; abort boot loud if it doesn't (better to crash-loop than to silently route cutover traffic against a dead plugin). * Default env: MEMORY_PLUGIN_DATABASE_URL=$DATABASE_URL (share the existing tenant Postgres — plugin's `memory_namespaces` / `memory_records` tables coexist with platform schema, no conflicts), MEMORY_PLUGIN_LISTEN_ADDR=:9100. * MEMORY_PLUGIN_DISABLE=1 escape hatch for operators running the plugin externally on a separate host. * Platform image: plugin runs as the `platform` user (not root) via su-exec — matches the privilege boundary the main server already drops to. Tenant image already starts as `canvas` so the plugin inherits non-root automatically. What stays operator-controlled: * MEMORY_V2_CUTOVER is NOT auto-set. Behavior change for existing deployments: zero. The wiring at workspace-server/internal/memory/ wiring/wiring.go skips building the plugin client until the operator opts in, so the running sidecar is a no-op for traffic until then. * MEMORY_PLUGIN_URL is NOT auto-set either, for the same reason — setting it implies cutover-active intent. Operators set both on staging first, verify a live commit/recall round-trip (closes pending task #292), then promote to production. Operator activation steps after this PR ships: 1. 
Verify pgvector extension is available on the target Postgres (the plugin's first migration runs CREATE EXTENSION IF NOT EXISTS vector). Railway's managed Postgres ships pgvector available; some self-hosted operators may need to enable it. 2. Redeploy the workspace-server with this image. 3. Set MEMORY_PLUGIN_URL=http://localhost:9100 + MEMORY_V2_CUTOVER=true in the environment (staging first). 4. Watch boot logs for "memory-plugin: ✅ sidecar healthy" and the wiring.go cutover messages; do a live commit_memory + recall_memory round-trip via the canvas Memory tab to verify. 5. Promote to production once staging holds for a sweep window. Refs RFC #2728. Closes the dormant-plugin gap noted in task #294.	2026-05-05 11:10:11 -07:00
Hongming Wang	c79ba05ed5	test(pendinguploads): close cycleDone-vs-metric-record race in sweeper tests TestStartSweeper_RecordsMetricsOnError flaked on every CI rerun under race detection: `error counter delta = 0, want 1`. Root cause is a race between two goroutines, not a bug in the production sweeper. The fake `fakeSweepStorage.Sweep` signals `cycleDone` from inside its deferred return — that happens BEFORE Sweep's return value is received by `sweepOnce`, which is what triggers the metric increment. On slow CI hosts the test goroutine wins the read after `waitForCycle` unblocks and BEFORE StartSweeper's goroutine has called `metrics.PendingUploadsSweepError`, so the asserted delta is 0 even though the metric WILL be 1 a few ms later. Adds a polling assert helper, `waitForMetricDelta`, that closes the race deterministically without timing-based sleeps: - TestStartSweeper_RecordsMetricsOnError uses waitForMetricDelta to wait for the error counter to settle at 1. - TestStartSweeper_RecordsMetricsOnSuccess uses it on the success counters (acked, expired) so the error-stayed-zero assertion reads after StartSweeper has fully processed the cycle. - waitForCycle keeps its current shape but documents the caveat in its comment so future tests don't repeat the assumption. Verified: `go test ./internal/pendinguploads/ -race -count 5` passes all 9 tests across 5 iterations cleanly. Per memory feedback_question_test_when_unexpected.md: the "delta=0, want=1" failure looked like a real production bug at first glance, but instrumented inspection showed the metric DOES increment, just AFTER the test's read. The fix is the test's wait shape, not the sweeper. Unblocks every PR currently broken by this flake (#2898 hit it on two consecutive CI runs; staging-merged PRs from earlier today (#2877/#2881/#2885/#2886) introduced the test). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 10:46:17 -07:00
Hongming Wang	7644e82f2f	feat(saas): default new workspaces to T4 on SaaS, T3 self-hosted User reported every SaaS workspace defaults to T2 (Standard). Three sites quietly disagreed on the default: - canvas CreateWorkspaceDialog (line 126): isSaaS ? 4 : 3 ← only correct one - canvas EmptyState "Create blank": tier: 2 ← hardcoded - workspace.go POST /workspaces: tier = 3 ← not SaaS-aware - org_import.go createWorkspaceTree: tier = 2 (fallback)← not SaaS-aware So a user clicking "+ New Workspace" via the dialog got T4 on SaaS, but a user clicking "Create blank" on the empty canvas got T2, and an agent POSTing /workspaces directly got T3. Same tenant, three different tiers depending on entry point. Fix: 1. WorkspaceHandler.IsSaaS() and DefaultTier() helpers (workspace_dispatchers.go). IsSaaS() := h.cpProv != nil — single source of truth for "are we SaaS" across the file. DefaultTier() returns 4 on SaaS, 3 on self-hosted. SaaS rationale: each workspace runs on its own sibling EC2 so the per-workspace tier boundary is a Docker resource limit on the only container present — no neighbour to protect from. T4 matches the boundary. 2. workspace.go now defaults tier via h.DefaultTier() instead of hardcoded T3. 3. org_import.go fallback (when neither ws.tier nor defaults.tier set) becomes SaaS-aware: T4 on SaaS, T2 on self-hosted (preserve the existing safe-shared-Docker-daemon default for self-hosted org imports). 4. canvas EmptyState "Create blank" stops sending tier:2 in the body and lets the backend pick — single source of truth in the backend. Eliminates the third disagreement. Test plan: - go vet ./... clean - go test ./internal/handlers/ -count 1 — all green (4.3s) - npx tsc --noEmit on canvas — clean - Staging E2E (after deploy): create a fresh workspace via canvas empty-state on hongming.moleculesai.app, confirm tier=4 on the workspace details panel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 10:30:22 -07:00
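Illustration for 7644e82f2f: a minimal sketch of the two helpers as the commit describes them (`IsSaaS() := h.cpProv != nil`, `DefaultTier()` returning 4 on SaaS and 3 on self-hosted). The struct layout and the `any` type of `cpProv` are assumptions made to keep the example self-contained.

```go
package main

import "fmt"

// WorkspaceHandler is reduced to the one field the helpers depend on.
// cpProv stands in for the control-plane provisioner (type assumed).
type WorkspaceHandler struct {
	cpProv any
}

// IsSaaS reports whether a control-plane provisioner is wired.
func (h *WorkspaceHandler) IsSaaS() bool { return h.cpProv != nil }

// DefaultTier is the single place the default tier is decided.
func (h *WorkspaceHandler) DefaultTier() int {
	if h.IsSaaS() {
		return 4
	}
	return 3
}

func main() {
	saas := &WorkspaceHandler{cpProv: struct{}{}}
	selfHosted := &WorkspaceHandler{}
	fmt.Println(saas.DefaultTier(), selfHosted.DefaultTier())
}
```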
Hongming Wang	48d1945269	test(org-import): tighten AST gate to discriminate workspaces vs lookalikes (#2872 Imp-1) The previous TestCreateWorkspaceTree_CallsLookupBeforeInsert used bytes.Index("INSERT INTO workspaces"), which prefix-matches INSERT INTO workspaces_audit, INSERT INTO workspace_secrets, and INSERT INTO workspace_channels. RFC #2872 cited this as a silent false-pass mode: a future refactor that adds an audit-table INSERT literal earlier in source than the real workspaces INSERT would make the gate point at the wrong target. Replaces the byte-search with a go/ast walk + a regex that requires `\s\(` after `workspaces` — distinguishes the real target from prefix lookalikes. Adds three discriminating tests: - TestWorkspacesInsertRE_RejectsLookalikes — pins the regex against 9 sql shapes (real, raw-string-literal, audit-shadow, workspace_ prefixes, canvas_layouts, UPDATE/SELECT, comments). - TestGate_FailsWhenLookupAfterInsert — synthesizes Go source where the lookup is positioned AFTER the workspaces INSERT, asserts the helper returns lookupPos > insertPos (which the production gate flags via t.Errorf). Proves the gate isn't vestigial. - TestGate_IgnoresAuditTableShadow — synthesizes source with an audit-table INSERT BEFORE the lookup + real INSERT, asserts the tightened regex correctly walks past the shadow and finds the real INSERT. Also extracts findLookupAndWorkspacesInsertPos as a helper so the gate logic can be exercised against synthetic source, not only against the real org_import.go. Memory: feedback_assert_exact_not_substring.md (verify tightened test FAILS on old code) — TestGate_FailsWhenLookupAfterInsert is the failing-on-bug-shape proof. Closes the silent-false-pass mode of #2872 Important-1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 08:32:56 -07:00
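Illustration for 48d1945269: a minimal sketch of a regex that matches an INSERT into `workspaces` but not into prefix lookalikes. The exact pattern is an assumption (the commit only says the regex requires whitespace and an opening parenthesis after `workspaces`); `isWorkspacesInsert` is a hypothetical wrapper.

```go
package main

import (
	"fmt"
	"regexp"
)

// Requiring optional whitespace then "(" right after "workspaces" rules out
// workspaces_audit, and "workspace_..." never matches the literal "workspaces".
var workspacesInsertRE = regexp.MustCompile(`(?i)\bINSERT\s+INTO\s+workspaces\s*\(`)

func isWorkspacesInsert(sql string) bool {
	return workspacesInsertRE.MatchString(sql)
}

func main() {
	for _, sql := range []string{
		"INSERT INTO workspaces (id, name) VALUES ($1, $2)",
		"INSERT INTO workspaces_audit (id) VALUES ($1)",
		"INSERT INTO workspace_secrets (id) VALUES ($1)",
		"UPDATE workspaces SET name = $1",
	} {
		fmt.Printf("%-55s %v\n", sql, isWorkspacesInsert(sql))
	}
}
```

A plain substring search for `INSERT INTO workspaces` would accept the second statement, which is the false-pass mode the commit closes.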
Hongming Wang	e50799bc29	test(rfc): poll-mode chat upload — phase 4 real-Postgres integration Phase 4 closes out the rollout — strict-sqlmock unit tests pin which SQL fires, but they cannot detect bugs that depend on the actual row state after the SQL runs. Real-Postgres integration tests catch: - the Sweep CTE depends on Postgres' make_interval function and the table's CHECK constraints; sqlmock would happily accept a hand-written SQL literal that Postgres rejects at runtime. - the partial idx_pending_uploads_unacked index only catches a wrong WHERE predicate at real-query-plan time. - subtle predicate drift (e.g. a WHERE clause that filters by acked_at IS NOT NULL but uses BETWEEN incorrectly). Test cases: - PutGetAckRoundTrip: the full happy path — Put, Get, MarkFetched, Ack, idempotent re-Ack, Get-after-Ack returns ErrNotFound. - Sweep_DeletesAckedAfterRetention: row not eligible at retention=1h immediately after Ack; deleted at retention=0. - Sweep_DeletesExpiredUnacked: backdated expires_at exercises the unacked-and-expired branch of the WHERE clause. - Sweep_DeletesBothCategoriesInOneCycle: three rows (acked, expired, fresh); a single Sweep deletes the first two and leaves the third. - PutEnforcesSizeCap: ErrTooLarge above MaxFileBytes. - GetIgnoresExpiredAndAcked: Get filters predicate matches expected row state in the table. Run path: - locally via the file-header docker incantation. - CI runs on every PR/push that touches handlers/ OR migrations/ (.github/workflows/handlers-postgres-integration.yml).	2026-05-05 05:04:41 -07:00

760 Commits