Commit Graph

469 Commits

Author SHA1 Message Date
Hongming Wang
2eb11d6f2c Merge pull request #465 from Molecule-AI/fix/memory-recall-flood-limit
[Backend Engineer] fix(memories): hard cap of 50 on recall results (#377)
2026-04-16 05:16:49 -07:00
Hongming Wang
61d97e9a34 Merge pull request #468 from Molecule-AI/fix/issue-458-e2e-cancel-protection
ci: extract e2e-api into dedicated workflow with run-level cancel protection (#458)
2026-04-16 05:16:45 -07:00
Hongming Wang
dd0840fe1d Merge pull request #475 from Molecule-AI/docs/sync-2026-04-16
docs: sync CLAUDE.md with current architecture (2026-04-16)
2026-04-16 05:09:40 -07:00
Hongming Wang
b7003d89ff docs: sync CLAUDE.md with current architecture (2026-04-16)
Measured test counts (not guessed):
- Platform Go: 12 packages (previously claimed 818 individual tests; now
  reports package-level counts, matching `go test` output)
- Canvas: 490 Vitest tests (33 files)
- workspace-template: 955 pytest tests (down from 1179 — 224 adapter-
  specific tests moved to standalone template repos)
- molecule-app: 76 unit + 22 e2e (separate repo)

Architecture updates:
- CI section: documents manifest-driven Docker builds + reusable CI
  workflows from molecule-ci repo for all 33 plugin/template repos
- Workspace Images section: already updated by prior PR (adapter repos)
- Test commands: accurate counts, standalone repo URLs with test counts
2026-04-16 05:09:19 -07:00
Hongming Wang
4b6f08833e Merge pull request #474 from Molecule-AI/fix/code-review-issues
fix: code review findings + remove exposed secrets
2026-04-16 05:06:11 -07:00
Hongming Wang
510c40089f fix: address all code review findings + remove exposed secrets
Code review fixes:
- 🟡 #1: Replace python3 with jq in Dockerfile template stages (~50MB → ~2MB)
- 🟡 #2: Add clone count verification to scripts/clone-manifest.sh
  (set -e + expected vs actual count check — fails build if any clone fails)
- 🟡 #3: Drop 'unsafe-eval' from CSP (not needed for Next.js production
  standalone builds, only dev mode). Updated test assertion.
- 🟡 #4: Remove broken pyproject.toml from workspace-template/ (it claimed
  to package as molecule-ai-workspace-runtime but the directory structure
  didn't match — the real package ships from the standalone repo)
- 🔵 #1: Add version-pinning TODO comment to manifest.json
- 🔵 #3: Add full repo URLs + test counts for SDK/MCP/CLI/runtime in CLAUDE.md

Security (GitGuardian alert):
- Removed Telegram bot token (8633739353:AA...) from template-molecule-dev
  pm/.env — replaced with ${TELEGRAM_BOT_TOKEN} placeholder
- Removed Claude OAuth token (sk-ant-oat01-...) from template-molecule-dev
  root .env — replaced with ${CLAUDE_CODE_OAUTH_TOKEN} placeholder
- Both tokens need immediate rotation by the operator

Tests: Platform middleware tests updated + all pass.
2026-04-16 05:05:49 -07:00
Hongming Wang
73865ee164 Merge pull request #473 from Molecule-AI/fix/remove-adapters-dir
fix: remove adapter subdirectories from workspace-template
2026-04-16 04:59:34 -07:00
Hongming Wang
2347d6a80b fix: properly remove adapter subdirectories + move shared code to root
PR #471 removed Dockerfiles/requirements from adapters/ but left the
Python source files. This commit finishes the extraction:

1. Moved shared_runtime.py → workspace-template/shared_runtime.py
   (used by prompt.py, a2a_executor.py, coordinator.py — not adapter-specific)
2. Moved base.py → workspace-template/adapter_base.py
   (BaseAdapter + AdapterConfig — the interface adapters implement)
3. Updated imports in prompt.py, a2a_executor.py, coordinator.py
4. Rewrote adapters/__init__.py as a thin shim that:
   - Reads ADAPTER_MODULE env var (production: standalone repos set this)
   - Re-exports BaseAdapter/AdapterConfig for backward compat
5. adapters/base.py + adapters/shared_runtime.py remain as re-export shims
6. Deleted all 8 adapter subdirectories (autogen, claude_code, crewai,
   deepagents, gemini_cli, hermes, langgraph, openclaw)
7. Removed 11 test files that imported adapter-specific code
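
The env-driven shim in item 4 might reduce to something like the sketch below. This is illustrative only: `load_adapter`, the `"module:attr"` format, and the fallback behaviour are assumptions; only the `ADAPTER_MODULE` env var and the re-exported base class come from the commit.

```python
import importlib
import os

class BaseAdapter:
    """Stand-in for the re-exported base class (adapter_base.BaseAdapter)."""

def load_adapter(default=BaseAdapter):
    """Return the adapter named by ADAPTER_MODULE, else the built-in default.

    The "module:attr" path format is an assumption for illustration; the
    real shim may resolve the module differently.
    """
    path = os.environ.get("ADAPTER_MODULE")
    if not path:
        return default  # local dev: fall back to the monorepo base class
    mod_name, _, attr = path.partition(":")
    module = importlib.import_module(mod_name)
    return getattr(module, attr) if attr else module
```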

Tests: 955 passed, 0 failed (down from 1216 — the difference is
adapter-specific tests that moved to standalone repos).
2026-04-16 04:59:13 -07:00
Hongming Wang
12db4e9342 Merge pull request #472 from Molecule-AI/fix/remove-orphaned-plugin-tests
fix: remove orphaned plugin/adapter tests
2026-04-16 04:39:44 -07:00
Hongming Wang
c0af9cbde2 fix: remove tests that referenced removed plugins/ directory
test_first_party_plugins.py, test_plugins_builtins_drift.py, and
test_hermes_adapter.py all referenced files under plugins/ and
adapters/ which were extracted to standalone repos. These tests
belong in those repos now, not in the core workspace-template.

1216 passed, 0 failed after removal.
2026-04-16 04:39:31 -07:00
Hongming Wang
bf2208a49d Merge pull request #471 from Molecule-AI/chore/extract-workspace-runtime-to-pypi
chore: extract workspace runtime to PyPI package + standalone adapter repos
2026-04-16 04:34:30 -07:00
Hongming Wang
ab1562f3fe chore: remove adapter Dockerfiles and requirements.txt from monorepo
These files have moved to the standalone template repos:
  https://github.com/Molecule-AI/molecule-ai-workspace-template-<runtime>

Each adapter repo now has its own Dockerfile (FROM python:3.11-slim + pip install
molecule-ai-workspace-runtime) and requirements.txt. The adapter Python source
files (.py) stay in the monorepo for local development and testing.

Adapters removed from workspace-template/adapters/*/: Dockerfile, requirements.txt
Adapters retained: adapter.py, __init__.py (+ hermes extras: escalation.py, executor.py, providers.py)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 04:33:22 -07:00
Hongming Wang
03f6fc81dd chore: extract workspace runtime to PyPI + move adapter Dockerfiles to template repos
Published `molecule-ai-workspace-runtime==0.1.0` to PyPI:
  https://pypi.org/project/molecule-ai-workspace-runtime/0.1.0/

Source repo: https://github.com/Molecule-AI/molecule-ai-workspace-runtime

Each adapter's Dockerfile and requirements.txt have moved to the corresponding
standalone template repo (molecule-ai-workspace-template-<runtime>). The adapter
Python code (.py files) stays in the monorepo for local dev and testing.

Changes:
- workspace-template/pyproject.toml — new, packages the shared runtime as a PyPI package
- workspace-template/adapters/*/Dockerfile — removed (now in template repos)
- workspace-template/adapters/*/requirements.txt — removed (now in template repos)
- workspace-template/Dockerfile — drop COPY adapters/ (still copies .py files via *.py glob)
- workspace-template/build-all.sh — simplified to base-image-only build
- workspace-template/entrypoint.sh — remove adapter requirements.txt install step
- workspace-template/tests/test_hermes_adapter.py — skip Dockerfile/requirements.txt checks
- CLAUDE.md — update architecture description + workspace image table
- docs/workspace-runtime-package.md — new, explains the package + adapter repo layout

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 04:33:10 -07:00
Hongming Wang
0e60b94a1c Merge pull request #459 from Molecule-AI/chore/remove-extracted-dirs
chore: remove extracted dirs (templates, SDK, MCP, CLI)
2026-04-16 04:18:05 -07:00
DevOps Engineer
8ba6e18c0a ci: extract e2e-api into dedicated workflow with run-level cancel protection (#458)
Job-level `concurrency.cancel-in-progress: false` only prevents sibling jobs
from killing each other — it does not protect the parent workflow run from
being cancelled when a new push arrives. Every PR push was cancelling the
in-progress E2E run, forcing manual `gh run rerun` across 7+ active PRs.

Fix: move e2e-api into `.github/workflows/e2e-api.yml` with a workflow-level
concurrency group (`e2e-api-${{ github.ref }}`, cancel-in-progress: false).
New pushes now queue behind the running E2E job instead of cancelling it.
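
A minimal sketch of the workflow-level block described above (the group name comes from the commit; the rest of the workflow file is omitted):

```yaml
# .github/workflows/e2e-api.yml (fragment)
concurrency:
  group: e2e-api-${{ github.ref }}
  cancel-in-progress: false   # new pushes queue instead of cancelling the run
```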

Fast jobs (platform-build, canvas-build, shellcheck, python-lint) stay in
ci.yml and retain normal run-level cancellation for quick iteration feedback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 11:15:13 +00:00
Hongming Wang
d424bd947f chore: remove extracted directories, add manifest-driven Docker builds
Remove plugins/, workspace-configs-templates/, org-templates/ dirs (now
in standalone repos). Add manifest.json listing all 33 repos and
scripts/clone-manifest.sh to clone them. Both Dockerfiles now use the
manifest script instead of 33 hardcoded git-clone lines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 04:13:29 -07:00
Molecule AI Backend Engineer
9b2539a042 fix(memories): add hard cap of 50 on recall results (#377)
Introduce `memoryRecallMaxLimit = 50` constant and honour the `?limit=N`
query parameter in Search. Values above 50 are silently clamped to 50;
absent or invalid values default to 50. The LIMIT clause is now a
parameterised argument (nextArg pattern) instead of a hardcoded literal.
Three sqlmock tests verify the cap, the explicit limit, and the default.
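
The clamping semantics can be shown in miniature; this is a Python stand-in for the Go handler logic, with the function name assumed and only the constant and the clamp rules taken from the commit.

```python
MEMORY_RECALL_MAX_LIMIT = 50  # mirrors the Go constant memoryRecallMaxLimit

def effective_limit(raw):
    """Resolve ?limit=N: absent or invalid -> 50, above 50 -> clamped to 50."""
    try:
        n = int(raw)
    except (TypeError, ValueError):
        return MEMORY_RECALL_MAX_LIMIT  # absent or non-numeric: default
    if n < 1 or n > MEMORY_RECALL_MAX_LIMIT:
        return MEMORY_RECALL_MAX_LIMIT  # silently clamp out-of-range values
    return n
```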

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 11:12:35 +00:00
Hongming Wang
055efc535a Merge pull request #449 from Molecule-AI/fix/issue-425-sidepanel-width-persist
fix(canvas): persist SidePanel width to localStorage (closes #425)
2026-04-16 03:49:05 -07:00
Hongming Wang
a8c0bc059e Merge pull request #440 from Molecule-AI/fix/docker-compose-platform-build-context
fix(compose): platform build context must be repo root
2026-04-16 03:48:30 -07:00
Canvas Agent
026921ae62 fix(canvas): persist SidePanel width to localStorage (issue #425)
Width was initialized to 480px on every render, so clicking a different
workspace node (which re-mounts SidePanel) discarded any resize the user
had done.

Fix:
- localStorage-backed useState initializer (SSR-safe typeof window guard)
- Validates the stored value: must be a finite integer ≥ 320px
- Persists the width in the mouseUp handler via a widthRef that stays in
  sync with the live drag value — avoids spamming localStorage on every
  pixel during the drag
- Extra guard: onMouseUp bails early if not actually dragging (prevents
  spurious saves on unrelated window mouseup events)
- Named constants replace magic numbers 480 / 320

Tests: 5 new cases in SidePanel.tabs.test.tsx — default fallback, valid
saved value, too-small saved value, NaN saved value, drag-persist roundtrip.

Closes #425

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 10:40:08 +00:00
rabbitblood
de82bc2096 fix(compose): platform build context must be repo root, not ./platform
The platform Dockerfile COPYs paths relative to the repo root —
`COPY platform/go.mod`, `COPY platform/migrations`,
`COPY workspace-configs-templates`. The compose file was setting
`context: ./platform`, which silently made those COPY layers miss
their sources and stop invalidating the cache.

Symptom (caught 2026-04-16 10:22 UTC): after PR #417 (memory schema
migration 023) merged and I ran `docker compose up -d --build platform`,
the rebuild was a no-op. The image SHA didn't change, the container
booted with the old migration set, and logged `Applied 22 migrations`
instead of the expected 23. The migration 023 file was on disk locally
but never reached the image.

The workaround was `docker build -t molecule-monorepo-platform:fresh -f
platform/Dockerfile .` from the repo root → the SHA changed and
migration 023 applied. This commit makes `docker compose up -d --build
platform` work correctly without the manual workaround.

The CI workflow already builds with `context: .` + `file: ./platform/Dockerfile`
(per the comment at the top of platform/Dockerfile). This change just
aligns the local compose file with what CI does.
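
The aligned compose fragment presumably looks like this (a sketch; the service name and surrounding layout are assumed from the commit, not quoted from the file):

```yaml
services:
  platform:
    build:
      context: .                         # repo root, matching CI
      dockerfile: platform/Dockerfile    # COPY paths now resolve correctly
```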

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 03:25:58 -07:00
Hongming Wang
7fc2da2146 Merge pull request #419 from Molecule-AI/feat/gh-agent-attribution
feat(workspace): gh-wrapper — auto-tag agent PRs + issues with role
2026-04-16 03:19:46 -07:00
Hongming Wang
3a2ea197a2 Merge pull request #433 from Molecule-AI/feat/externalize-prompts-phase4
feat(org-templates): Phase 4 — atomize each role to <role>/workspace.yaml
2026-04-16 03:19:43 -07:00
Hongming Wang
f5843eff4d Merge pull request #417 from Molecule-AI/feat/memory-checkpoint-reconciliation
feat(memory): optimistic-locking via if_match_version on workspace_memory writes
2026-04-16 03:18:09 -07:00
rabbitblood
67675880cb feat(workspace): gh-wrapper — auto-tag agent PRs + issues with role
Every agent in the template currently uses the same GitHub PAT, so
`gh pr list` shows every PR as authored by the CEO's account with
no signal as to which agent opened each one. Commits already carry
per-agent authors (GIT_AUTHOR_NAME from #402). This wrapper extends
the identity split to the PR/issue metadata surface that commit
attribution can't reach.

## How it works

A tiny bash script installed at `/usr/local/bin/gh`, which sits
earlier in PATH than the real binary at `/usr/bin/gh`. For `gh pr
create` and `gh issue create`:

- The title gets prefixed with `[Role Name]` — e.g. `[Frontend Engineer]
  fix: canvas grid index`
- The body gets `\n\n---\n_Opened by: Molecule AI <Role>_` appended

Role is read from `GIT_AUTHOR_NAME`, which the platform provisioner
sets to `Molecule AI <Role>` (shipped with #402). Accepts both
`--title X` and `--title=X` forms; same for `--body`.

Anything that isn't `gh pr create` or `gh issue create` (e.g.
`gh pr list`, `gh issue view`, `gh run watch`) passes through
untouched. No behaviour change for read-side operations.

## Idempotent

- If the title already starts with `[...]`, the wrapper does not
  re-prefix, so `gh pr edit` flows that resubmit the title won't layer
  multiple tags.
- If the body already contains `Opened by: Molecule AI`, the footer
  is not re-appended.

## Fail-open

When `GIT_AUTHOR_NAME` is absent or doesn't start with `Molecule
AI `, the wrapper execs the real gh with unchanged args. No call
is ever blocked by this script.
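
The title rules (prefix, idempotency, fail-open) reduce to something like the following. This is a Python model of the bash logic for illustration, not the actual script.

```python
def tag_title(title, git_author_name):
    """Apply the [Role] prefix per the wrapper's rules (model, not the script)."""
    author = git_author_name or ""
    if not author.startswith("Molecule AI "):
        return title                       # fail-open: pass through untouched
    if title.startswith("["):
        return title                       # idempotent: never re-prefix
    role = author[len("Molecule AI "):]
    return f"[{role}] {title}"
```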

## Test coverage

`tests/test_gh_wrapper.sh` — 12 cases, no network, no Docker:
- Passthrough for non-create subcommands (pr list)
- pr create title prefix + body footer
- issue create with `--title=X` / `--body=X` equals-form
- Idempotent title re-prefix
- Idempotent body footer (count = 1 after two applies)
- Missing GIT_AUTHOR_NAME → passthrough, title preserved
- Malformed GIT_AUTHOR_NAME (not "Molecule AI ...") → passthrough

All 12 pass. Test script is standalone bash + a temp fake gh binary
that echoes argv; safe to run in CI's Python Lint & Test job via
subprocess shell-out.

## Deployment note

This lands in the workspace image. Existing containers keep their
old /usr/bin/gh until the image is rebuilt and they're re-provisioned
(POST /workspaces/:id/restart {}). No migration required; the wrapper
just starts tagging PRs once the new image is rolled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 03:10:46 -07:00
rabbitblood
665f7f6313 feat(org-templates): Phase 4 — atomize each role to <role>/workspace.yaml
Part 4 of 4 — terminal step of the org.yaml scalability refactor. Each
role in the molecule-dev template now owns its own workspace.yaml file,
colocated with the existing system-prompt.md / initial-prompt.md /
idle-prompt.md / schedules/*.md. Team files shrink to a leader's own
definition plus a list of !include refs.

## Platform change

`resolveYAMLIncludes` now uses a TWO-ROOT model:
- Path resolution is relative to the INCLUDING file's directory
  (natural sibling + cousin refs, C-include / Sass @import convention).
- Security bound is the ORIGINAL org root (`rootDir`), preserved across
  all recursion depths. Sibling-dir refs like `../my-role/workspace.yaml`
  from a team file are now allowed (they stay inside the org template);
  refs that escape the root still error.
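
The two-root model can be sketched as follows. This is a Python analogue of the Go logic, with the function name and signature assumed; only the resolve-relative-to-includer and bound-to-original-root rules come from the commit.

```python
from pathlib import Path

def resolve_include(including_file, ref, root_dir):
    """Resolve an !include ref relative to the including file's directory,
    but refuse any target that escapes the original org root."""
    target = (Path(including_file).parent / ref).resolve()
    root = Path(root_dir).resolve()
    if root not in target.parents:
        raise ValueError(f"!include escapes org root: {ref}")
    return target
```

Sibling-dir refs like `../frontend-engineer/workspace.yaml` from a team file normalize to a path still under the root, so they pass; `../../etc/passwd` does not.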

Regression coverage: new `TestResolveYAMLIncludes_SiblingDirAccess`
reproduces the Phase 4 pattern (team file at `teams/x.yaml` referencing
`../<role>/workspace.yaml`) — fails without the fix, passes with.

## Template change

Atomized 15 child workspaces across 3 team files:
- `teams/research.yaml`: 58 → 30 lines; 3 children now !include refs
- `teams/dev.yaml`: 222 → 38 lines; 6 children now !include refs
- `teams/marketing.yaml`: 143 → 28 lines; 6 children now !include refs

Each role now has `<role>/workspace.yaml` colocated with its prompts.
Example `frontend-engineer/` directory:
  frontend-engineer/
  ├── workspace.yaml        (24 lines — name/role/tier/canvas/plugins/...)
  ├── system-prompt.md      (from earlier phases)
  ├── initial-prompt.md
  ├── idle-prompt.md
  └── (no schedules for this role — but if added, schedules/<slug>.md)

## File-size progression across all 4 phases

| State | org.yaml | total `.yaml` in tree |
|---|---:|---:|
| Before (main) | 1801 lines / 108 KB | 1801 / 108 KB (one file) |
| After Phase 1 (#389) | 1687 | 1687 / 101 KB |
| After Phase 2 (#390) | 676 | 676 / 35 KB |
| After Phase 3 (#393) | 114 | 683 (1 + 6 teams) / 33 KB |
| **After this PR** | **114** | **~698** (1 + 6 + 15 workspace) / 35 KB |

Aggregate size is flat — the decrease came from prompt externalization
in Phases 1/2; Phases 3/4 reorganize structure without adding content.
The win is readability and ownership:
- Every individual file fits on 1-2 screens.
- Adding a new role is now: create `<role>/` dir, add `workspace.yaml`
  + `system-prompt.md` + prompts, add ONE `!include` line to the team
  file. No touching of aggregated mega-YAML.
- Team files can be reviewed + merged independently.

## Tests

All 10 `TestResolveYAMLIncludes_*` tests pass, including the real-template
integration test (`TestResolveYAMLIncludes_RealMoleculeDev`) which now
walks org.yaml → teams/pm.yaml → teams/research.yaml → ../market-analyst/
workspace.yaml and validates the full 21-role tree unmarshals cleanly.

Plus all existing `TestResolvePromptRef` + `TestOrgYAML` + `TestInitialPrompt`
suites stay green.

## Ops followup

After merging all 4 phases and deploying, the `POST /org/import`
endpoint should produce a workspace tree byte-identical to the
pre-refactor state. Verify with:
  diff <(curl POST /org/import before) <(curl POST /org/import after)
or by spot-checking:
  - `/configs/config.yaml` bodies across all 21 workspaces
  - `workspace_schedules.prompt` row values

The externalization is lossless — YAML literal to file and back
recovers the same string modulo trailing-whitespace normalization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 03:09:56 -07:00
Hongming Wang
586cd87ab6 Merge pull request #415 from Molecule-AI/fix/issue-399-canvas-image-publish
feat(ci): auto-publish canvas Docker image to GHCR on canvas/** merges
2026-04-16 03:08:27 -07:00
Hongming Wang
f5b81710df Merge pull request #409 from Molecule-AI/fix/use-client-ui-components
fix(next): add missing 'use client' to TestConnectionButton and KeyValueField
2026-04-16 03:08:24 -07:00
Hongming Wang
7d49ebfcfa Merge pull request #408 from Molecule-AI/fix/canvas-events-sequence-counter-v2
fix(canvas): monotonic sequence counter + 7px→9px chip labels
2026-04-16 03:08:20 -07:00
Hongming Wang
3fd9594c11 Merge pull request #405 from Molecule-AI/fix/wcag-zinc600-smalltext-sweep
fix(wcag): sweep text-zinc-600→zinc-500 on small-text labels across 9 components
2026-04-16 03:08:17 -07:00
Hongming Wang
431087d151 Merge pull request #404 from Molecule-AI/feat/externalize-prompts-phase3
feat(org-templates): Phase 3 — !include directive + split org.yaml into team files
2026-04-16 03:08:01 -07:00
Hongming Wang
eae5736d96 Merge pull request #416 from Molecule-AI/feat/hermes-escalation-ladder
feat(hermes): escalation ladder — promote to stronger models on transient failure
2026-04-16 03:07:57 -07:00
Hongming Wang
441a2a5938 Merge pull request #413 from Molecule-AI/fix/isrunning-distinguish-notfound
fix(provisioner): IsRunning conservative on daemon errors to stop restart cascade
2026-04-16 03:07:54 -07:00
Hongming Wang
523f9ecb69 Merge pull request #402 from Molecule-AI/feat/per-agent-git-identity
feat(provisioner): per-agent git identity via GIT_AUTHOR_* env vars
2026-04-16 03:07:50 -07:00
Hongming Wang
99529debc6 Merge pull request #428 from Molecule-AI/fix/securityheaders-test-stale-csp
fix(tests): CSP test fragment-match instead of exact-match
2026-04-16 03:07:05 -07:00
rabbitblood
7b1930bb87 fix(tests): CSP test now fragment-matches instead of exact-matches
SecurityHeaders middleware widened its CSP to allow Next.js inline scripts
+ data:/blob: images (platform/internal/middleware/securityheaders.go:44,
canvas is reverse-proxied through the gin stack so it needs the permissive
policy). The two CSP asserts in securityheaders_test.go still hard-compared
against the old tight `default-src 'self'`, so they fail on main as of
this afternoon.

Fix: assert each expected CSP fragment is PRESENT in the header (substring
match) instead of byte-for-byte equality. Test intent is "CSP is set, starts
with tight default-src, contains the expected directives" — not "CSP matches
this exact string". Future subsource tuning (add a new CDN, bump blob:/data:
scope) won't re-break this test.

Caught because every PR touching anything in the monorepo currently fails
the Platform (Go) CI job on these two asserts. Fixing on a dedicated branch
so it can land ahead of every blocked PR in the queue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:59:06 -07:00
Hongming Wang
10ea5062a1 Merge pull request #407 from Molecule-AI/fix/bake-templates-into-platform-image
fix(ops): bake templates into platform Docker image
2026-04-16 02:47:04 -07:00
Hongming Wang
a363b56f25 feat(tenant): combined platform + canvas Docker image with reverse proxy
Single-container tenant architecture: Go platform (:8080) + Canvas
Node.js (:3000) in one Fly machine, with Go's NoRoute handler reverse-
proxying non-API routes to the canvas. Browser only talks to :8080.

Changes:

platform/Dockerfile.tenant — multi-stage build (Go + Node + runtime).
  Bakes workspace-configs-templates/ + org-templates/ into the image.
  Build context: repo root.

platform/entrypoint-tenant.sh — starts both processes, kills both if
  either exits. Fly health check on :8080 covers the Go binary; canvas
  health is implicit (proxy returns 502 if canvas is down).

platform/internal/router/canvas_proxy.go — httputil.ReverseProxy that
  forwards unmatched routes to CANVAS_PROXY_URL (http://localhost:3000).
  Activated by NoRoute when CANVAS_PROXY_URL env is set.

platform/internal/router/router.go — wire NoRoute → canvasProxy when
  CANVAS_PROXY_URL is present; no-op otherwise (local dev unchanged).

platform/internal/middleware/securityheaders.go — relaxed CSP to allow
  Next.js inline scripts/styles/eval + WebSocket + data: URIs. The
  strict `default-src 'self'` was blocking all canvas rendering.

canvas/src/lib/api.ts — changed `||` to `??` for NEXT_PUBLIC_PLATFORM_URL
  so empty string means "same-origin" (combined image) instead of falling
  back to localhost:8080.

canvas/src/components/tabs/TerminalTab.tsx — same `??` fix for WS URL.

Verified: tenant machine boots, canvas renders, 8 runtime templates +
4 org templates visible, API routes work through the same port.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:46:47 -07:00
rabbitblood
f76ddba0f5 fix(tests): test_hermes_phase2_dispatch exec-load needs escalation + __name__
Phase 3 escalation ladder added `from .escalation import ...` to
executor.py. The phase-2 dispatch tests load executor.py via
`exec(compile(src, ...))` with the relative import rewritten — this
broke because (a) the rewrite didn't know about escalation and (b) the
exec namespace lacked `__name__`, which executor.py needs at import
time for `logging.getLogger(__name__)`.

Fix both in all 8 exec sites:
- Rewrite both `from .providers import` AND `from .escalation import`
- Pre-register escalation + providers in sys.modules under the fake
  package name
- Seed the exec namespace with `__name__ = "hermes_executor_under_test"`
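
Seeding `__name__` into an exec namespace, as in the last bullet, looks roughly like this (the source string is a stand-in for executor.py, not the real file):

```python
src = "import logging\nlog = logging.getLogger(__name__)\n"  # stand-in source

# Without the seeded __name__, the exec'd code raises NameError at import time.
ns = {"__name__": "hermes_executor_under_test"}
exec(compile(src, "executor.py", "exec"), ns)
```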

54/54 hermes tests pass (28 escalation truth-table + 6 ladder-integration
+ 20 existing phase-2 dispatch).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:43:02 -07:00
rabbitblood
ba1e45b27f feat(memory): optimistic-locking via if_match_version on workspace_memory writes
Closes the silent-overwrite hole where two agents racing a read-modify-
write on the same memory key left only one agent's update. Relevant for
orchestrators (PM, Dev Lead, Marketing Lead) keeping structured running
state (delegation-result ledgers, task queues) in memory, and for the
``research-backlog:*`` keys that multiple idle loops write in parallel.

## Semantics

### Back-compat path (no if_match_version)
Unchanged: ``INSERT ... ON CONFLICT UPDATE`` last-write-wins. Every
existing agent tool, every existing ``commit_memory`` call, every
existing cron that writes memory — all continue to work with no edit.

### Optimistic-lock path (if_match_version set)
1. Client calls ``GET /memory/:key`` → ``{value, version: V}``
2. Client modifies value locally
3. Client ``POST /memory {key, value, if_match_version: V}``
4. Server: ``UPDATE ... WHERE version = V`` + RETURNING new version
5. On match → 200 + ``{version: V+1}``
6. On mismatch → 409 + ``{expected_version: V, current_version: <actual>}``
7. Client reads the actual version and retries.

### Create-only marker
``if_match_version: 0`` means "create iff the key doesn't exist yet".
Two agents simultaneously seeding a shared key will see exactly one
success + one 409 — no silent collision, no duplicate-init work.

### Schema

Migration 023 adds ``version BIGINT NOT NULL DEFAULT 1``. Existing rows
baseline at 1. New rows start at 1. Every successful write (both paths)
increments: ``version = version + 1`` on update, ``1`` on insert.
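
Both write paths plus the create-only marker collapse to a compare-and-swap. An in-memory model (a stand-in for the SQL `UPDATE ... WHERE version = V` path; dict layout is illustrative):

```python
def memory_set(store, key, value, if_match_version=None):
    """Model of both write paths; returns (status, body)."""
    current = store[key]["version"] if key in store else 0
    if if_match_version is not None and if_match_version != current:
        # Covers stale-version writes AND create-only (0) on an existing key
        return 409, {"expected_version": if_match_version,
                     "current_version": current}
    store[key] = {"value": value, "version": current + 1}
    return 200, {"version": current + 1}
```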

## Why version, not updated_at

``updated_at`` has second-granularity and can collide between concurrent
writers on a fast clock. A monotonic counter is collision-free and more
readable in the 409 response body ("expected 5, current is 7 — you
missed 2 writes" tells an agent exactly what to re-read).

## Why ``if_match_version`` and not an ETag header

JSON field keeps it in the request body, visible alongside the value
payload. Agents assembling requests programmatically don't have to
remember to thread a header through their HTTP client wrapper; the
existing ``commit_memory`` tool can grow one optional kwarg and match
the existing signature shape.

## Tests

11 memory-handler cases covering every path:
- GET list / get (with version in response shape)
- Set with no version (back-compat upsert, returns new version)
- Set with if_match_version match (happy path, increment)
- Set with if_match_version mismatch (409 + expected/current fields)
- Set with if_match_version=0 on absent key (create-only success)
- Set with if_match_version=N on absent key (409 — caller's mental
  model is wrong)
- Bad inputs (missing key, malformed JSON)
- Delete happy + error path

Full ``go test ./internal/handlers/`` green.

## Follow-up (not in this PR)

- Workspace-template tool update: ``commit_memory(content, *,
  if_match_version=None)`` surfaces the new option + on 409 surfaces
  the current_version so agents can retry without manual re-read.
- Named checkpoints table (``workspace_checkpoints``) for durable
  orchestrator state snapshots. Different concern than per-key locking;
  separate PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:32:46 -07:00
rabbitblood
7d732cec3c feat(hermes): escalation ladder — promote to stronger models on transient failure
Ships scoped Phase 3 of the Hermes multi-provider work. Every workspace
can now declare an ordered list of (provider, model) rungs; when the
pinned model hits rate-limit / 5xx / context-length / overload, the
executor advances to the next rung before raising.

## Why

3× Claude Max saturation is a routine occurrence now — the "first 429 on
a batch delegation" is the common path, not the exception. A workspace
pinned to Haiku that hits a context-length limit has no recovery today;
same for Sonnet hitting rate-limit mid-synthesis. Escalation promotes
to the next tier for that single call, preserves coordination, avoids
restart cascades.

## New module: adapters/hermes/escalation.py

- ``LadderRung(provider, model)`` — one config entry.
- ``parse_ladder(raw)`` — tolerant config parser; skips malformed rungs
  with a warning rather than raising so boot stays resilient.
- ``should_escalate(exc) -> bool`` — truth table over 15+ error shapes:
  - Typed classes (RateLimitError, OverloadedError, APITimeoutError,
    APIConnectionError, InternalServerError)
  - Context-length markers (each provider uses different phrasing)
  - Gateway markers (502/503/504, overloaded, temporarily unavailable)
  - Status-code substrings (429, 529, 5xx)
  - Hard-rejects auth failures (401/403/invalid_api_key) even if the
    outer exception class is RateLimitError — wrapping case matters.
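
A stripped-down model of the truth table (marker lists abbreviated and assumed; the real module also checks typed exception classes, not just message substrings):

```python
AUTH_MARKERS = ("401", "403", "invalid_api_key")
TRANSIENT_MARKERS = ("429", "529", "502", "503", "504", "rate limit",
                     "overloaded", "context length", "temporarily unavailable")

def should_escalate(exc):
    msg = str(exc).lower()                       # case-insensitive matching
    if any(m in msg for m in AUTH_MARKERS):
        return False                             # auth hard-reject wins, even wrapped
    return any(m in msg for m in TRANSIENT_MARKERS)
```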

## Executor wiring

``HermesA2AExecutor`` now accepts ``escalation_ladder`` in its
constructor + ``create_executor()`` factory. ``_do_inference()`` walks
the ladder:

  1. First attempt = pinned provider:model (matches pre-ladder behaviour)
  2. On escalatable error, try each rung in order
  3. On non-escalatable error, raise immediately (auth, malformed payload)
  4. On exhaustion, raise the last error

Rung switches temporarily rebind ``self.provider_cfg`` / ``self.model``
/ ``self.api_key`` / ``self.base_url`` in a try/finally, so any raised
error leaves the executor in its original state for the next call. Key
resolution for non-pinned rungs goes through ``resolve_provider`` which
reads the rung-provider's env vars fresh.
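
The four-step walk can be modelled as below (a sketch: the real executor also rebinds provider config per rung in a try/finally, which is omitted here):

```python
def walk_ladder(call, pinned, ladder, should_escalate):
    """Steps 1-4: pinned first, escalate through rungs, raise last on exhaustion."""
    last_exc = None
    for rung in (pinned, *ladder):
        try:
            return call(rung)                # step 1/2: attempt current rung
        except Exception as exc:
            if not should_escalate(exc):
                raise                        # step 3: auth/malformed fails fast
            last_exc = exc
    raise last_exc                           # step 4: ladder exhausted
```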

## Config shape

``config.yaml`` (rendered from ``org.yaml`` → workspace secrets):

    runtime_config:
      escalation_ladder:
        - provider: gemini
          model: gemini-2.5-flash
        - provider: anthropic
          model: claude-sonnet-4-5-20250929
        - provider: anthropic
          model: claude-opus-4-1-20250805

Empty / absent = single-shot behaviour, full backwards-compat with
every existing workspace.

## Tests

34 passing, all isolated (no network):

- ``test_hermes_escalation.py`` (28): parser + truth-table across
  rate-limit, overload, context-length, gateway, auth-reject, unrelated
  exceptions, and case-insensitivity.
- ``test_hermes_ladder_integration.py`` (6): no-ladder single call,
  ladder-not-triggered on success, escalate-on-rate-limit-then-succeed,
  stop-on-non-escalatable, raise-last-error-when-exhausted, skip-
  unknown-provider-in-rung.

## Not in this PR

- Uncertainty-driven escalation (judge pass after successful reply).
- Per-workspace budget tracking (#305 covers this separately).
- Live streaming reuse across rungs (ladder retries the whole call).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:27:27 -07:00
Canvas Agent
c928b4cbe8 feat(ci): auto-publish canvas Docker image to GHCR on canvas/** merges
Closes #399.

## Root cause
`publish-platform-image.yml` existed for the Go platform image but there
was no equivalent for the canvas. After every canvas PR merged, CI ran
`npm run build` and passed — but the live container at :3000 was never
updated. The `canvas-deploy-reminder` job only posted a comment asking
operators to manually rebuild, which was consistently missed.

## What this adds
- `.github/workflows/publish-canvas-image.yml`: triggers on `canvas/**`
  changes to main (and `workflow_dispatch`). Mirrors the platform workflow:
  macOS Keychain isolation, QEMU for linux/amd64, Buildx, GHCR push with
  `:latest` + `:sha-<7>` tags.
  - `NEXT_PUBLIC_PLATFORM_URL` / `NEXT_PUBLIC_WS_URL` resolve from
    `workflow_dispatch` inputs → `CANVAS_PLATFORM_URL` / `CANVAS_WS_URL`
    repo secrets → `localhost:8080` defaults (safe for self-hosted dev).
  - Inputs are passed via env vars (not direct `${{ }}` interpolation) to
    prevent shell injection from string inputs.

- `docker-compose.yml`: adds `image: ghcr.io/molecule-ai/canvas:latest`
  to the canvas service so `docker compose pull canvas && docker compose
  up -d canvas` applies the new image. `build:` is retained for local
  development. Adds a comment clarifying that `NEXT_PUBLIC_*` runtime env
  vars are ignored by the standalone bundle (build-time only).

- `ci.yml`: updates `canvas-deploy-reminder` commit comment to reference
  `docker compose pull` as the fast path, with `docker compose build` as
  the local-source fallback.
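
The env-var indirection mentioned above follows the standard GitHub Actions hardening pattern — a sketch with an assumed input name (`platform_url`); the secret name comes from this PR:

```yaml
# Untrusted workflow_dispatch input is bound to an env var first, then
# referenced as "$PLATFORM_URL" inside the run: body — never expanded
# with ${{ }} directly inside the script, which would permit injection.
- name: Resolve build-time platform URL
  env:
    PLATFORM_URL: ${{ inputs.platform_url || secrets.CANVAS_PLATFORM_URL || 'http://localhost:8080' }}
  run: echo "NEXT_PUBLIC_PLATFORM_URL=$PLATFORM_URL" >> "$GITHUB_ENV"
```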

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:23:26 +00:00
rabbitblood
1883f001bf fix(provisioner): IsRunning conservative on daemon errors to stop restart cascade
Root cause of the 2026-04-16 09:10 UTC six-container restart cascade.

## Timeline

09:10:26 — PM sent a batch delegation to 15+ agents (Dev Lead coordinating).
09:10:26-27 — 4 leaders/auditors (Security, RL, BE, DevOps) simultaneously
              hit "workspace agent unreachable — container restart triggered"
              even though their containers were running fine. Another 2
              (DL, UIUX) tripped in the next few seconds.
09:10:27 — Provisioner stopped + recreated 6 containers in parallel. A2A
           callers got EOFs, PM's batch coordination stalled.

## Root cause

`provisioner.IsRunning` collapsed every ContainerInspect error into
`(false, nil)`, including transient Docker daemon hiccups:

  func IsRunning(...) (bool, error) {
      info, err := p.cli.ContainerInspect(ctx, name)
      if err != nil {
          return false, nil // Container doesn't exist ← MISREAD
      }
      return info.State.Running, nil
  }

The comment said "Container doesn't exist" but the error was actually
any of: daemon timeout, socket EOF, context deadline, connection
refused. Under load (batch delegation fan-out → 15 concurrent HTTP
inbound → 15 concurrent Claude Code subprocesses → Docker daemon CPU
pressure), ContainerInspect calls started failing transiently. All 6
calls returned `(false, nil)`. Caller `maybeMarkContainerDead` treated
`running=false` as "container is dead, restart it" → six parallel
restarts. This was exactly the destructive-on-error pattern we keep
trying to kill (see #160 SDK-stderr-probe, #318 fail-open classes).

## Fix

`IsRunning` now distinguishes a genuine NotFound from transient errors:

- Legitimately missing container (caller deleted, Docker pruned) →
  `(false, nil)` — safe to act on; caller marks dead + restarts.
- Any other error (daemon timeout, socket issue, context deadline) →
  `(true, err)` — caller stays on the alive path. The transient error
  is preserved so metrics + logging still see it, but it does NOT
  trigger the destructive restart branch.

`isContainerNotFound` matches on error-message substring — same
approach docker/cli uses internally — to avoid pulling in errdefs as a
direct dep. Truth table tests in `isrunning_test.go` cover 8 cases:
NotFound variants (real + generic), nil, empty, and the 4 transient-
error shapes we've actually observed (deadline, EOF, connection-refused,
i/o timeout).

## Caller update

`maybeMarkContainerDead` in a2a_proxy.go now logs the transient inspect
error (was silently discarded via `_`). Visibility without
destructiveness. If this error becomes persistent, we'll see it in
platform logs rather than diagnosing after another restart cascade.

## Expected impact

- Zero restart cascades from the current class of transient inspect
  errors (EOF, timeout, connection refused).
- Dead containers are still detected within the A2A layer: a genuinely
  stopped container inspects cleanly with `State.Running == false` (and
  a removed one returns NotFound), and the TTL monitor (180s post #386)
  catches anything that slips through.
- New visibility in platform logs when inspect has trouble — previously
  silent.

Combined with the TTL fix in #386, the defense-in-depth on spurious
restart is now:
  1. IsRunning only returns false for real NotFound
  2. Liveness TTL is 180s, surviving 5+ missed heartbeats
  3. A2A proxy 503-Busy path retries with backoff before touching
     restart logic at all

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 02:21:25 -07:00
Canvas Agent
04df479e4f fix(next): add missing 'use client' to TestConnectionButton and KeyValueField
Both components use useState/useEffect/useCallback/useRef but were
missing the 'use client' directive. Without it Next.js App Router
renders them as server HTML — React never hydrates them and event
handlers are silently dropped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:10:22 +00:00
Canvas Agent
0a0eab6657 fix(a11y): raise TeamMemberChip label text 7px→9px in WorkspaceNode
Chip labels (status badge, active-task count, current-task text) were
rendered at text-[7px] — well below the 9px minimum our a11y baseline
requires for readable text (WCAG sets no hard px floor, but 7px fails
any practical legibility bar). Raised all three to text-[9px] so the
labels are legible without magnification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 09:06:56 +00:00
Hongming Wang
111c59da68 fix(ops): bake workspace-configs-templates into platform Docker image
Tenant machines were booting with no templates because the Dockerfile
only shipped the Go binary + migrations. The canvas showed "0 templates"
with an empty picker.

Changes:
- platform/Dockerfile: build context changed from ./platform to repo
  root so COPY can reach workspace-configs-templates/ alongside the
  Go source. COPY paths updated for platform/{go.mod,go.sum,*.go} and
  platform/migrations/.
- .github/workflows/publish-platform-image.yml: context: . (was
  ./platform), paths trigger now includes workspace-configs-templates/
  so template changes rebuild the image.

Phase A of the template-registry plan. Phase B adds a DB registry +
on-demand fetch for community templates (user pastes GitHub URL at
workspace creation time). The baked defaults always ship in the image
for zero-config tenant boot.

Verified: `docker build -f platform/Dockerfile -t test .` succeeds,
`docker run --rm test ls /workspace-configs-templates/` shows all 8
templates (autogen, claude-code-default, crewai, deepagents, gemini-cli,
hermes, langgraph, openclaw).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 01:54:47 -07:00
Hongming Wang
84de543378 fix(a2a): add missing Authorization header to delegation and message calls (#401)
* fix(a2a): add missing Authorization header to delegation and message calls

Three A2A client functions were missing the Bearer token on their HTTP calls
after the Phase 30.1 workspace-auth enforcement rollout:

1. send_a2a_message (a2a_client.py): POST to the target workspace's
   /message/send hits the WorkspaceAuth middleware, which fails closed
   on a missing auth header.
   Fix: headers=auth_headers() — auth_headers() is already imported.

2. tool_delegate_task_async (a2a_tools.py): POST to platform /delegate endpoint
   requires the caller's workspace bearer token since Phase 30.1.
   Fix: headers=_auth_headers_for_heartbeat()

3. tool_check_task_status (a2a_tools.py): GET /delegations endpoint, same issue.
   Fix: headers=_auth_headers_for_heartbeat()

tool_list_peers already uses _auth_headers_for_heartbeat() correctly —
that's why list_peers works while delegation returns 401/[A2A_ERROR].

Root cause of the multi-session A2A outage. PR #386 (TTL fix) addressed
the workspace-restart cascade; this fixes the underlying 401 on each call.

Closes #391
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(a2a): add missing auth headers to /activity and /notify endpoints

Two more Phase 30.1 regressions in a2a_tools.py found during send_message_to_user
debugging (it was returning 401):

- tool_report_activity: POST /workspaces/:id/activity missing headers
- tool_send_message_to_user: POST /workspaces/:id/notify missing headers

Both now use headers=_auth_headers_for_heartbeat() matching the pattern used
by commit_memory, recall_memory, and the heartbeat POST in the same file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: PM (Molecule AI) <pm@molecule-ai.internal>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 00:53:18 -07:00
UIUX Designer
e810986f44 fix(wcag): sweep text-zinc-600→zinc-500 across 9 components with small text
zinc-600 on zinc-900/950 background ≈ 2.6:1 contrast (WCAG AA requires
4.5:1 for text under 18pt). Found 15 instances across 9 components where
small-text data labels used this low-contrast pairing.

Files and what they label:
  EmptyState.tsx:132     — skill count + model on template cards (new-user visible)
  SidePanel.tsx:230      — workspace ID in panel footer (copyable, functional)
  ActivityTab.tsx:210    — entry timestamp (8px)
  ActivityTab.tsx:214    — expand chevron affordance (9px)
  ActivityTab.tsx:236    — "→" direction arrow between agents (9px)
  ActivityTab.tsx:278    — entry ID (8px, font-mono)
  ScheduleTab.tsx:284    — empty-state description text (9px)
  ScheduleTab.tsx:320    — schedule prompt preview (9px, truncate)
  ScheduleTab.tsx:323    — last/next/run-count metadata row (8px)
  SkillsTab.tsx:380      — "Examples" section header (9px uppercase)
  TracesTab.tsx:132      — trace ID (8px, font-mono)
  AgentCommsPanel.tsx:166 — message timestamp (9px)
  secrets-section.tsx:59  — secret key name (9px, font-mono)
  secrets-section.tsx:308 — encryption notice (9px)
  MissingKeysModal.tsx:175 — missing key identifier (9px, font-mono)

Fix: zinc-600 → zinc-500 across all 15 instances. Purely cosmetic —
no logic, no layout, no interactive behaviour changed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:53:00 +00:00
rabbitblood
24a882ccc9 feat(org-templates): Phase 3 — !include directive + split org.yaml into team files
Part 3 of 4 in the scalability refactor. Adds YAML `!include` support
to the org importer and splits molecule-dev/org.yaml (676 lines post-
Phase 2) into 6 team / role files; top-level org.yaml drops to 114 lines
of pure scaffolding.

## Platform changes

New `platform/internal/handlers/org_include.go`:

- `resolveYAMLIncludes(data, baseDir)` — pre-processes a YAML document,
  expanding any scalar tagged `!include <path>` with the parsed content
  of the referenced file.
- Path resolution via `resolveInsideRoot` so a crafted `!include
  ../../etc/passwd` can't escape the org template directory (same
  defense the existing `files_dir` copy uses).
- Nested includes supported: each included file carries its own search
  root (its directory), so `teams/pm.yaml` with `!include research.yaml`
  resolves to `teams/research.yaml` — matching the convention of
  C-include / Sass @import / most package systems.
- Cycle detection via visited-set keyed on absolute path; belt-and-
  braces `maxIncludeDepth = 16` cap in case symlinks or path
  normalization defeats the set.
- Inline-template mode (POST /org/import with raw JSON body, no `dir`)
  errors cleanly when a file ref is used — can't resolve without a
  base.

Wired into both `ListTemplates` (so /org/templates shows an accurate
workspace count after the split) and `Import` (expansion happens before
unmarshal into OrgTemplate).
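
The containment and cycle rules above can be sketched with stdlib path handling alone — an illustrative sketch (the real resolver also parses the included YAML; function names here are assumptions):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const maxIncludeDepth = 16

// resolveInsideRoot joins ref onto baseDir and rejects anything that
// escapes root — so `!include ../../etc/passwd` fails cleanly.
func resolveInsideRoot(root, baseDir, ref string) (string, error) {
	abs := filepath.Clean(filepath.Join(baseDir, ref))
	rel, err := filepath.Rel(root, abs)
	if err != nil || rel == ".." ||
		strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("include %q escapes template root", ref)
	}
	return abs, nil
}

// checkInclude enforces the visited-set cycle guard plus the
// belt-and-braces depth cap before a file would be expanded.
func checkInclude(path string, visited map[string]bool, depth int) error {
	if depth > maxIncludeDepth {
		return fmt.Errorf("include depth exceeds %d", maxIncludeDepth)
	}
	if visited[path] {
		return fmt.Errorf("include cycle at %q", path)
	}
	visited[path] = true
	return nil
}

func main() {
	root := "/tmpl/molecule-dev"
	p, err := resolveInsideRoot(root, root, "teams/pm.yaml")
	fmt.Println(p, err)
	_, err = resolveInsideRoot(root, root, "../../etc/passwd")
	fmt.Println(err != nil) // true — traversal rejected
}
```

Nested includes fall out of the same helper: each included file passes its own directory as `baseDir`, while `root` stays pinned to the template dir.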

## Template changes

molecule-dev/org.yaml now contains only:
- name + description
- defaults (runtime, plugins, category_routing, initial_prompt text)
- `workspaces: [!include teams/pm.yaml, !include teams/marketing.yaml]`

New files:
- `teams/pm.yaml` — PM top-level, children are !include refs
- `teams/research.yaml` — Research Lead + Market Analyst + Technical
  Researcher + Competitive Intelligence (inline children)
- `teams/dev.yaml` — Dev Lead + FE/BE/DevOps/Security/QA/UIUX (inline)
- `teams/marketing.yaml` — Marketing Lead + DevRel/PMM/Content/
  Community/SEO/Social (inline)
- `teams/documentation-specialist.yaml` — leaf
- `teams/triage-operator.yaml` — leaf

## File-size impact

| State | org.yaml lines | total config size |
|---|---:|---:|
| Before (main) | 1801 | 108 KB |
| After Phase 1 (#389) | 1687 | 101 KB |
| After Phase 2 (#390) | 676 | 35 KB |
| After this PR | **114** | **4 KB** (org.yaml only) |

With the 6 team files (total ~570 lines of structural yaml), every file
is now under 230 lines and individually readable without scrolling past
a single team's boundaries.

## Tests

`platform/internal/handlers/org_include_test.go` — 9 cases:
- Flat include (single file, single workspace)
- Nested include (file → file → file)
- Traversal rejection (`../secret.yaml`, `../../secret.yaml`)
- Cycle detection (a↔b)
- Empty path error
- Missing file error
- Inline-template error (baseDir empty)
- No-op when YAML has no includes (safety: we always run the preprocessor)
- **Integration**: load the real `org-templates/molecule-dev/org.yaml`,
  resolve includes, unmarshal into OrgTemplate, verify PM + Marketing
  Lead are top-level and PM has ≥4 children after expansion.

All 9 pass + existing `TestResolvePromptRef` + `TestOrgYAML` suites stay
green.

## Ownership implication

Each team file can now be owned + reviewed independently. When the
marketing team adds a 7th role, the diff is in `teams/marketing.yaml`
alone — no merge conflicts against PM or research changes in the same
review window. Same for the eventual engineer team, security team, etc.

## What's next

- **Phase 4 (queued):** per-workspace atomization. Each role gets
  `<role>/workspace.yaml`; team files shrink to a list of !include
  refs. Terminal step in the scalability arc — at that point adding a
  new role is one new file under `org-templates/molecule-dev/<role>/`
  plus one line in the team's manifest.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:49:56 +00:00
Hongming Wang
f075c49af1 feat(org-templates): Phase 2 — bulk migrate 20 roles to file-ref prompts (#395)
Part 2 of 4 in the org.yaml scalability refactor. Follows PR #389 which
added platform support; this PR completes the migration for every role
in the `molecule-dev` template.

## Scope

All 20 remaining roles moved from inline YAML literals to sibling .md
files under their existing `files_dir`:

- PM, Research Lead, Dev Lead, Marketing Lead (4 leaders)
- Market Analyst, Technical Researcher, Competitive Intelligence (research)
- Frontend/Backend/DevOps Engineer, Security Auditor, QA Engineer, UIUX
  Designer, Triage Operator (dev team)
- DevRel, PMM, Content Marketer, Community Manager, SEO Growth Analyst,
  Social Media Brand (marketing team)

Per workspace, externalized (where present):
- `initial_prompt: |...` → `initial-prompt.md` + `initial_prompt_file:`
- `idle_prompt: |...`    → `idle-prompt.md`    + `idle_prompt_file:`
- `schedules[*].prompt: |...` → `schedules/<slug>.md` + `prompt_file:`

Totals: 17 initial-prompt files, 12 idle-prompt files, 18 schedule files
(47 new files).

## File-size impact

| Before (main) | After Phase 1 | After Phase 2 | Reduction |
|---|---|---|---|
| 1801 lines | 1687 lines | 676 lines | **-62.5%** |
| 108 KB | 101 KB | 35 KB | **-67%** |

org.yaml is now pure structural scaffolding (name / role / tier / model /
canvas / plugins / channels / children / category_routing / schedules
metadata). Readable end-to-end on one screen per team.

## How the migration was driven

A Python round-trip script (using `ruamel.yaml` to preserve comments +
formatting) walked the workspace tree recursively, wrote prompts to
files keyed by `files_dir`, and replaced inline keys with `*_file:` refs.
Zero manual YAML hand-editing beyond the Phase 1 Documentation Specialist
proof. Script is one-shot; not committed.

Slug convention for schedule files: lowercase the schedule name, replace
non-alphanumeric with `-`, collapse, cap 60 chars. Examples:
- "Orchestrator pulse" → `orchestrator-pulse.md`
- "Hourly template fitness audit" → `hourly-template-fitness-audit.md`
- "Code quality audit (every 12h)" → `code-quality-audit-every-12h.md`
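
That slug convention corresponds to roughly this transformation — a sketch (the migration script itself was Python, one-shot, and not committed):

```go
package main

import (
	"fmt"
	"strings"
)

// slugify lowercases, maps every run of non-alphanumerics to a single
// '-', trims stray leading/trailing dashes, and caps at 60 characters.
func slugify(name string) string {
	var b strings.Builder
	prevDash := false
	for _, r := range strings.ToLower(name) {
		switch {
		case r >= 'a' && r <= 'z' || r >= '0' && r <= '9':
			b.WriteRune(r)
			prevDash = false
		default:
			if !prevDash { // collapse separator runs
				b.WriteByte('-')
				prevDash = true
			}
		}
	}
	s := strings.Trim(b.String(), "-")
	if len(s) > 60 {
		s = s[:60]
	}
	return s
}

func main() {
	fmt.Println(slugify("Code quality audit (every 12h)")) // code-quality-audit-every-12h
}
```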

## Backwards compatibility

Fully compatible — Phase 1's resolver prefers inline when both are set,
so a future one-off experiment can still drop an inline prompt back into
org.yaml and have it take precedence. The migration doesn't remove
inline support, just stops using it.

## Verification

- [x] `python -c "yaml.safe_load(...)"` on edited org.yaml — parses clean
- [x] Walk-and-inspect script: every workspace has exactly the expected
      `*_file:` refs, zero `INLINE_*` markers remain
- [x] All 47 extracted .md files non-empty + trimmed
- [x] `go test -run 'TestResolvePromptRef|TestOrgYAML|TestInitialPrompt'`
      passes (from Phase 1 platform work)
- [ ] Post-merge: live `POST /org/import` against a fresh workspace,
      diff the resulting `/configs/config.yaml` + `workspace_schedules`
      rows against the pre-migration values (should be identical bodies)

## What's next

- **Phase 3 (queued):** YAML `!include` directive for org.yaml; split the
  remaining 676 lines into `teams/{research,dev,marketing,ops}.yaml`.
- **Phase 4 (queued):** per-workspace atomization; each role owns its
  own `workspace.yaml` manifest.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 00:47:32 -07:00