Root cause of position collision after node deletion:
handleCanvasEvent(WORKSPACE_PROVISIONING) used nodes.length as the
grid placement index. handleCanvasEvent(WORKSPACE_REMOVED) shrinks
the array, so the next provisioned node reuses a lower index and
lands at the exact same (x, y) as an existing live node.
Example (4-col grid, COL_SPACING=320):
Provision A → idx 0 → (100, 100)
Provision B → idx 1 → (420, 100)
Provision C → idx 2 → (740, 100)
Remove A → nodes.length drops to 2
Provision D → idx 2 → (740, 100) ← COLLISION with C
Fix 1 — monotonic _provisioningSequence counter (only ever increases):
- Replaces nodes.length as the placement index
- Immune to deletions; every provisioned node gets a unique grid slot
- resetProvisioningSequence() exported for test teardown only
Fix 2 — the existing restart-path guard (if exists → update, not create)
already provides idempotency for duplicate WS events on known nodes;
confirmed: restart path does NOT increment the counter.
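A minimal sketch of the counter-based placement, written in Go for illustration (the real store is TypeScript in the canvas code); the 4-column grid and COL_SPACING=320 come from the example above, the (100, 100) origin is implied by it, and the row spacing of 200 is an assumption:

```go
package main

import "fmt"

const (
	cols       = 4   // grid columns, from the example above
	colSpacing = 320 // COL_SPACING, from the example above
	rowSpacing = 200 // assumed row spacing
	originX    = 100 // origin implied by idx 0 -> (100, 100)
	originY    = 100
)

// provisioningSequence only ever increases; WORKSPACE_REMOVED never touches it,
// so a deletion can no longer cause a later node to reuse an occupied slot.
var provisioningSequence int

func nextPosition() (x, y int) {
	idx := provisioningSequence
	provisioningSequence++
	return originX + (idx%cols)*colSpacing, originY + (idx/cols)*rowSpacing
}

func main() {
	for _, name := range []string{"A", "B", "C"} {
		x, y := nextPosition()
		fmt.Printf("Provision %s -> (%d, %d)\n", name, x, y)
	}
	// Remove A: the node array shrinks but the counter does not, so D takes idx 3.
	x, y := nextPosition()
	fmt.Printf("Provision D -> (%d, %d)\n", x, y) // (1060, 100), no collision with C
}
```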
Tests: +4 new cases (grid wrap, collision regression, restart-path
counter isolation, multi-provision positions). 485/485 pass.
Build: next build ✓ clean.
Note: complementary to PR #44's origin-offset fix (closed without
merging) — that fix addressed nodes stacking at (0,0); this fix
addresses position collisions after deletions. Both should land.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem observed 2026-04-16: Research Lead, Dev Lead, Security Auditor,
and UIUX Designer were being auto-restarted by the liveness monitor every
~30 minutes, even though their containers were healthy and processing
real work. A2A callers (PM, child agents) saw regular EOFs:
A2A request to <leader-id> failed: Post http://ws-*:8000: EOF
Followed in platform logs by:
Liveness: workspace <id> TTL expired
Auto-restart: restarting <name> (was: offline)
Provisioner: stopped and removed container ws-*
Root cause: the liveness key `ws:{id}` in Redis has a 60s TTL
(platform/internal/db/redis.go). The workspace heartbeat loop
(workspace-template/heartbeat.py) refreshes it every 30s. That leaves
room for exactly ONE missed heartbeat before expiry.
A busy Claude Code Opus synthesis can starve the container's asyncio
scheduler for 60-120s (the SDK spawns the claude CLI subprocess and
blocks until the message-reader yields; the heartbeat coroutine doesn't
run during that window). Leaders running 5-minute orchestrator pulses
or processing deep delegations routinely hit this. The platform then
mistakes a busy-but-healthy container for a dead one, marks it offline,
tears it down, and re-provisions — interrupting whatever work was mid-
synthesis and generating a cascade of EOF errors on pending A2A calls.
Fix: hoist the TTL into a named `LivenessTTL` constant and raise it to
180s. With a 30s heartbeat interval this now tolerates up to ~5 missed
beats before expiry — comfortably longer than any realistic Opus stall,
while still detecting genuinely-dead containers within 3 minutes.
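A sketch of the hoisted constant and the key refresh it governs; the real declaration lives in platform/internal/db/redis.go, and the helper name and go-redis wiring shown here are assumptions for illustration:

```go
package db

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// LivenessTTL is the expiry on the ws:{id} liveness key. At 180s with the
// workspace heartbeat refreshing every 30s, up to ~5 consecutive missed
// beats are tolerated before the key expires and the workspace is treated
// as offline; a genuinely dead container is still flagged within 3 minutes.
const LivenessTTL = 180 * time.Second

// SetLiveness refreshes the liveness key and re-arms the TTL.
func SetLiveness(ctx context.Context, rdb *redis.Client, workspaceID string) error {
	return rdb.Set(ctx, "ws:"+workspaceID, "1", LivenessTTL).Err()
}
```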
Safety: real crashes are still caught immediately by a2a_proxy's reactive
IsRunning() check (maybeMarkContainerDead in a2a_proxy.go:439). That path
doesn't depend on TTL; it fires on the first failed forward. So this PR
only relaxes the "slow but alive" false-positive — dead-container
detection is unchanged.
Observed impact before fix (2026-04-16 ~06:40–06:49 UTC, 10-minute
window, 4 containers affected):
| Container | EOF errors | Forced restart |
|-------------------|-----------:|:--------------:|
| Dev Lead | 5 | yes (06:48) |
| Research Lead | 5 | yes (06:47) |
| Security Auditor | 5 | yes (06:49) |
| UIUX Designer | 4 | no (not yet) |
Expected impact after merge + redeploy: drop to ~0 forced restarts on
healthy-busy leaders. If genuinely-stuck agents stop responding, the
IsRunning check still catches them on the next A2A forward.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(adapters): add gemini-cli runtime adapter (closes #332)
Adds a `gemini-cli` workspace runtime backed by Google's Gemini CLI
(@google/gemini-cli, ~101k ★, Apache 2.0). Mirrors the claude-code
adapter pattern: Docker image installs the CLI, CLIAgentExecutor
drives the subprocess, A2A MCP tools wire via ~/.gemini/settings.json.
Changes:
- workspace-template/adapters/gemini_cli/ — new adapter (Dockerfile,
adapter.py, __init__.py, requirements.txt); setup() seeds GEMINI.md
from system-prompt.md and injects A2A MCP server into settings.json
- workspace-template/cli_executor.py — adds gemini-cli to
RUNTIME_PRESETS (--yolo flag, -p prompt, --model, GEMINI_API_KEY env
auth); adds an mcp_via_settings preset flag to skip --mcp-config
injection for runtimes that own their own settings file (entry shape
sketched below)
- workspace-configs-templates/gemini-cli/ — default config.yaml +
system-prompt.md template
- tests/test_adapters.py — adds gemini-cli to expected adapter set
- CLAUDE.md — documents new runtime row in the image table
Requires: GEMINI_API_KEY global secret. Build:
bash workspace-template/build-all.sh gemini-cli
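For reference, the shape of the new preset entry described in the cli_executor.py bullet above, transliterated into Go to match the other sketches in this log; the real table is a Python dict and every field name here is an assumption:

```go
package presets

// RuntimePreset captures how CLIAgentExecutor is expected to invoke a
// runtime's CLI (field names are illustrative, not the real Python keys).
type RuntimePreset struct {
	Command        string   // binary to spawn inside the container
	PromptFlag     string   // flag that carries the prompt text
	ModelFlag      string   // flag that selects the model
	ExtraFlags     []string // always-on flags
	AuthEnvVar     string   // env var holding the API key
	MCPViaSettings bool     // MCP servers wired via the runtime's own settings file, so skip --mcp-config
}

// GeminiCLI mirrors the gemini-cli entry added to RUNTIME_PRESETS.
var GeminiCLI = RuntimePreset{
	Command:        "gemini",
	PromptFlag:     "-p",
	ModelFlag:      "--model",
	ExtraFlags:     []string{"--yolo"},
	AuthEnvVar:     "GEMINI_API_KEY",
	MCPViaSettings: true, // ~/.gemini/settings.json owns MCP wiring
}
```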
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(provisioner): add gemini-cli to RuntimeImages map
Without this entry, POST /workspaces with runtime:gemini-cli falls back
to workspace-template:langgraph (wrong image, missing gemini dep) instead
of workspace-template:gemini-cli. Every runtime MUST have an entry here.
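A minimal sketch of the added mapping; the map name and image tags come from the commits above, while the surrounding entries and file layout are assumed:

```go
package provisioner

// RuntimeImages maps a workspace runtime name to the Docker image the
// provisioner starts for it. A runtime with no entry here silently falls
// back to the langgraph image, which is exactly the misrouting this fixes.
var RuntimeImages = map[string]string{
	"langgraph":   "workspace-template:langgraph",
	"claude-code": "workspace-template:claude-code",
	"gemini-cli":  "workspace-template:gemini-cli", // the new entry
}
```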
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* config(org): add Telegram to Dev Lead and Research Lead (closes #383)
Completes leadership-tier Telegram coverage:
PM ✓ DevOps ✓ Security ✓ → Dev Lead ✓ Research Lead ✓
Both roles produce high-value async output (architecture decisions,
eco-watch summaries) that was invisible until the user polled the
canvas. Same bot_token/chat_id secrets as the other three roles —
no new credentials required.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: DevOps Engineer <devops@molecule.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Two helper paragraphs in ChannelsTab.tsx used text-[9px] text-zinc-600:
- Chat IDs discover hint (line 254)
- Allowed Users hint (line 281)
At 9px the text is far below the large-text threshold, so WCAG 1.4.3
requires 4.5:1; zinc-600 on a zinc-800/900 background is only ~2.6:1
(fails AA). Changed to text-[11px] text-zinc-500 (~3.8:1 at 11px, an
acceptable trade-off for non-body helper text).
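For reference, the contrast figures above follow the standard WCAG relative-luminance math; a small sketch, with the Tailwind zinc hex values taken as assumptions:

```go
package main

import (
	"fmt"
	"math"
)

// linearize converts an 8-bit sRGB channel to linear light per the WCAG definition.
func linearize(c uint8) float64 {
	v := float64(c) / 255
	if v <= 0.03928 {
		return v / 12.92
	}
	return math.Pow((v+0.055)/1.055, 2.4)
}

// luminance returns the WCAG relative luminance of an sRGB color.
func luminance(r, g, b uint8) float64 {
	return 0.2126*linearize(r) + 0.7152*linearize(g) + 0.0722*linearize(b)
}

// contrast returns the WCAG contrast ratio between two luminances (always >= 1).
func contrast(a, b float64) float64 {
	hi, lo := math.Max(a, b), math.Min(a, b)
	return (hi + 0.05) / (lo + 0.05)
}

func main() {
	// zinc-500 (#71717a) helper text on a zinc-900 (#18181b) panel (hex values assumed).
	fg := luminance(0x71, 0x71, 0x7a)
	bg := luminance(0x18, 0x18, 0x1b)
	fmt.Printf("%.1f:1\n", contrast(fg, bg))
}
```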
Found in UX audit Run 13 (2026-04-16).
Co-authored-by: UIUX Designer <uiux@molecule-ai.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Wraps up a ~100-tick autonomous triage session by converting the prior
operator's institutional knowledge into standing, checked-in artifacts
so the next team picking up the hourly PR + issue cycle can drop in
without re-discovering everything from scratch.
## New role: Triage Operator
Peer to Dev Lead, Research Lead, Documentation Specialist under PM.
Owns the 7-gate PR verification + issue-pickup cycle across both
molecule-monorepo and molecule-controlplane. NOT an engineer — never
writes logic, never makes design calls. Mechanical fixes on other
people's branches + verified-merge only.
Runs on cron `17 * * * *`. On first boot reads four handoff files +
the last 20 lines of cron-learnings.jsonl, waits for the scheduled
tick (no first-boot triage — known stale-state footgun).
## Files
org-templates/molecule-dev/triage-operator/
- system-prompt.md (48 lines) — role prompt loaded at boot. Standing
rules, verification discipline, escalation paths.
- philosophy.md (135 lines) — 10 principles each tied to a real
incident. Rule 2 ("tool succeeded ≠ work done") references the
WorkOS refresh-token + missing-migration saga. Rule 3 (authority
verification) references PR #370 CEO directive hold.
- playbook.md (234 lines) — step-by-step tick flow (Step 0 guards →
1 list → 2 seven-gate → 3 docs sync → 4 issue pickup → 5 report).
Expected 5–30 min wall-clock, plus a when-not-to-triage section.
- handoff-notes.md (146 lines) — point-in-time state for the NEXT
operator arriving fresh. 15 PRs merged this session, in-flight
items, design-call backlog with recommendations per issue.
- SKILL.md (152 lines) — installable skill spec. Invocation, inputs,
outputs, required composed skills, edge cases, output format.
.claude/AGENT_HANDOFF.md (206 lines) — top-level handoff for any
Claude Code agent working this repo (not just the triage operator).
The 10 principles (one-liners), communication style the user
expects, currently-live state, open items, what NOT to do, break-
glass escalation conditions. Points at triage-operator/philosophy.md
for full incident context.
## Wiring
org.yaml gains a Triage Operator workspace block under PM with:
- tier: 3, model: opus
- 8 plugins (careful-bash, session-context, cron-learnings,
code-review, cross-vendor-review, llm-judge, update-docs, hitl)
- Hourly cron at `:17` with the full Step 0–5 flow inline as prompt
- canvas position (1150, 250) — peer to Documentation Specialist
## Why this ships now
The 30-min manual triage cron was cancelled per CEO direction. The
role moves to another team. Without this handoff package they'd be
rediscovering the same incident-classes I shipped fixes for
(#318 fail-open, #327 cross-tenant decrypt, #351 tokenless grace,
WorkOS refresh-token saga, missing migration runner). The philosophy
file gives them the scar tissue in ~10 min of reading; the playbook
gives them the steps; the SKILL gives them an invocable entry point.
No code changes outside org.yaml. Existing TestPlugins_UnionWithDefaults
still passes (verified in platform test run).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The platform's GET /workspaces/:id/transcript proxy was constructing the
outbound request without an Authorization header. The workspace's /transcript
endpoint (hardened in #287/#328) fails closed when the header is absent,
so every transcript call in production returned 401 from the workspace.
Fix: after WorkspaceAuth validates the incoming bearer token, the handler
now forwards it verbatim via req.Header.Set("Authorization", ...).
Forwarding is safe — the token has already been validated by the middleware.
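A minimal sketch of the forwarding behaviour (handler shape, field names, and upstream URL layout are assumptions, not the actual platform code):

```go
package transcript

import (
	"io"
	"net/http"
)

// Proxy forwards transcript requests to a workspace container.
type Proxy struct {
	Client *http.Client
}

// ServeTranscript relays GET /workspaces/:id/transcript to the workspace.
// WorkspaceAuth has already validated the caller's bearer token upstream of
// this handler, so forwarding it verbatim is safe; when no header was sent
// we add nothing and let the workspace's fail-closed 401 pass through.
func (p *Proxy) ServeTranscript(w http.ResponseWriter, r *http.Request, workspaceURL string) {
	req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, workspaceURL+"/transcript", nil)
	if err != nil {
		http.Error(w, "bad upstream request", http.StatusInternalServerError)
		return
	}
	if auth := r.Header.Get("Authorization"); auth != "" {
		req.Header.Set("Authorization", auth) // the fix: forward the validated token
	}
	resp, err := p.Client.Do(req)
	if err != nil {
		http.Error(w, "workspace unreachable", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	w.WriteHeader(resp.StatusCode) // relay 200s and 401s faithfully
	io.Copy(w, resp.Body)
}
```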
Tests:
- TestTranscript_ForwardsAuthHeader: was t.Skip'd as a bug marker; now
active. Verifies the Authorization header reaches the workspace stub.
- TestTranscript_NoAuthHeader_PassesThrough: new. Verifies that a missing
header produces no synthetic Authorization on the upstream call, and the
workspace 401 is faithfully relayed.
Identified by QA audit 2026-04-16.
Co-authored-by: QA Engineer <qa-engineer@molecule-ai.internal>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Marketing Lead + 6 reports as a peer sub-tree of PM under the CEO:
DevRel Engineer, Product Marketing Manager, Content Marketer, Community
Manager, SEO Growth Analyst, Social Media / Brand.
- Marketing Lead: tier-3 Opus CMO-equivalent with a 5-min orchestrator
pulse (minutes 4/9/14/... offset from Dev Lead's 2/7/12/...) that
dispatches cross-role work, reviews drafts, and routes cross-team
asks back to PM.
- DevRel + PMM: tier-3 Opus (technical writing + positioning judgment).
Each has an idle_prompt for proactive issue-claim plus an hourly
evolution cron (DevRel = sample-coverage audit, PMM = competitor
diff against docs/ecosystem-watch.md).
- Content / Community / SEO / Social: tier-2 Sonnet with idle_prompts
for backlog-pull (matches the #205 idle-loop pattern proven on
Technical Researcher + Market Analyst + Competitive Intelligence).
Each has an hourly cron tuned to its surface.
- category_routing gets 6 new keys (content, positioning, community,
growth, social, devrel) so audit_summary messages fan out correctly.
- Canvas positions lay out the marketing cluster to the right of
PM/Dev Lead (x=1000-1300, y=50/250/400) so the graph stays readable.
Each role also gets a system-prompt.md under its files_dir with
responsibilities, team interfaces, conventions, and self-review gates
(molecule-skill-llm-judge or molecule-hitl depending on risk).
Per CEO directive 2026-04-16 ("comprehensive marketing team"). This is
PR 1 of 2 — follow-up will add cross-tree A2A conventions and wire
DevRel ↔ Backend Engineer / PMM ↔ Competitive Intelligence delegations.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CEO directive verbatim: *"devs should pick up issues and declare that its
assigned to them, PM and leaders regularly check in. dont just rely on
outside reviewer"*.
Adds `idle_prompt` + `idle_interval_seconds: 600` to Frontend Engineer,
Backend Engineer, and DevOps Engineer. Each engineer now polls open GH
issues matching its specialty, claims unassigned ones via `gh issue edit
--add-assignee @me`, leaves a public comment declaring the pickup, and
commits memory to prevent double-pickup on the next tick.
Previously engineers were reactive-only per the #159 orchestrator/worker
split. The CEO is correcting that: devs should be a true self-organizing
unit, not a work-queue that only advances when an outside reviewer
dispatches.
## Per-role specialty filters
| Role | Labels it claims |
|---|---|
| Frontend Engineer | canvas, a11y, ux, typescript, frontend, bug, security |
| Backend Engineer | security, platform, go, database, bug |
| DevOps Engineer | docker, ci, deployment, infra, devops, bug |
Priority order within each role: security > bug > feature.
## Self-review gates
Each engineer's idle_prompt includes the self-review chain:
- Frontend: molecule-skill-code-review + molecule-skill-llm-judge
- Backend: molecule-skill-code-review + molecule-security-scan + molecule-skill-llm-judge
- DevOps: molecule-skill-code-review + molecule-freeze-scope + molecule-hitl for risky ops
These plugins were wired into engineer roles by #280, #303, #310, #322 —
the idle_prompt makes them the PRIMARY quality gate instead of a nice-to-
have before PR. Matches the "team self-regulates, don't rely on outside
reviewer" spirit.
## Hard rules (same shape as researcher idle_prompts from #216/#321)
- Max 1 claim per tick (1 `gh issue edit --add-assignee` call)
- Never take someone else's assigned issue
- Under 90 seconds wall-clock for the claim + plan step
- Don't double-pick: check `task-assigned:<role>` memory first
- No busy-work fabrication: write "<role>-idle HH:MM — no work" if nothing matches
## What this does NOT change
- Leaders' orchestrator pulses still dispatch (#159) — this is the TAIL
pickup, not the primary dispatch path. Dev Lead still prioritizes via
its own pulse.
- PR merging still goes through reviewer per `feedback_never_merge_prs.md`.
This directive is about the QUALITY GATE (team self-review, peer review
via Dev Lead's pulse) not about bypassing merge approval.
- Destructive/irreversible ops still need explicit human ack via
molecule-hitl's @requires_approval decorator.
## Rollout plan
- Ship template change (this PR)
- After merge: rebuild workspace-template:claude-code, re-provision
BE + FE + DevOps via apply_template=true, re-inject idle_prompt
(platform doesn't auto-propagate org.yaml to live configs — tracked
separately)
- Measure: 24h of activity_logs. Should see `a2a_receive` events every
10 min per engineer, response bodies mentioning claim decisions or
idle-clean states, and assignees added via `gh issue edit` appearing on the issues.
## Related
- `feedback_devs_pick_up_issues_leaders_check_in.md` — memory saved last cycle
- #159 orchestrator/worker split (leaders dispatch)
- #216 / #321 researcher idle_prompts (same pattern applied to researchers)
- `project_north_star_24_7.md` — team self-regulation is the north-star
Add two new entries to docs/ecosystem-watch.md:
- **AMD GAIA** (amd/gaia, ~1.2k ⭐, MIT, v0.17.2 April 10 2026):
AMD-backed local-first agent framework with MCP client support,
RAG, vision, and voice. Hardware-locked to Ryzen AI but signals
local/privacy-first positioning. @tool decorator pattern worth
borrowing for workspace adapters.
- **ClawRun** (clawrun-sh/clawrun, ~84 ⭐, Apache 2.0, 45 releases):
Closest architectural match we've tracked — hosting/lifecycle layer
with sandbox, heartbeat, snapshot/resume, channels, and cost
tracking. Per-channel budget enforcement is a concrete gap in our
workspace_channels. Filed #368.
HEAD at survey time: 8db86df
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two new entries to docs/ecosystem-watch.md:
- **AMD GAIA** (amd/gaia, ~1.2k ⭐, MIT, v0.17.2 April 10 2026):
AMD-backed local-first agent framework with MCP client support,
RAG, vision, and voice. Hardware-locked to Ryzen AI but signals
local/privacy-first positioning. @tool decorator pattern worth
borrowing for workspace adapters.
- **ClawRun** (clawrun-sh/clawrun, ~84 ⭐, Apache 2.0, 45 releases):
Closest architectural match we've tracked — hosting/lifecycle layer
with sandbox, heartbeat, snapshot/resume, channels, and cost
tracking. Per-channel budget enforcement is a concrete gap in our
workspace_channels. Filed #368.
HEAD at survey time: a4a89a3
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two new entries to docs/ecosystem-watch.md:
- **GenericAgent** (lsdefine/GenericAgent, ~2.1k ⭐, MIT, v1.0 January
2026): self-evolving skill tree with a tiered memory hierarchy
(rules/indices/facts/skills/archives). Skill crystallisation at
runtime is the automation of our install-time plugins model. Filed
#361 to add named memory tiers to agent_memories.
- **OpenSRE** (Tracer-Cloud/opensre, ~900 ⭐, Apache 2.0): AI SRE
agent toolkit with 40+ production DevOps integrations and MCP
support. Filed #362 to evaluate its adapters as a Molecule AI
DevOps workspace skill pack.
HEAD at survey time: 2e1fc8d
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two new entries to docs/ecosystem-watch.md:
- **GenericAgent** (lsdefine/GenericAgent, ~2.1k ⭐, MIT, v1.0 January
2026): self-evolving skill tree with a tiered memory hierarchy
(rules/indices/facts/skills/archives). Skill crystallisation at
runtime is the automation of our install-time plugins model. Filed
#361 to add named memory tiers to agent_memories.
- **OpenSRE** (Tracer-Cloud/opensre, ~900 ⭐, Apache 2.0): AI SRE
agent toolkit with 40+ production DevOps integrations and MCP
support. Filed #362 to evaluate its adapters as a Molecule AI
DevOps workspace skill pack.
HEAD at survey time: 93fd546
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #357 deleted the grace-period tests that used hasLiveTokenQuery and
workspaceExistsQuery, but the constants themselves (and the stale comment
describing the old HasAnyLiveToken-based dispatch) were not removed.
Remove both dead const declarations and update the header comment to
reflect the strict-enforcement contract introduced by #357.
Closes #358.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Severity HIGH. #318 closed the fake-UUID fail-open for WorkspaceAuth
but left the grace period intact for *real* workspaces with no live
tokens. Zombie test-artifact workspaces from prior DAST runs still
exist in the DB with empty configs and no tokens, so they pass
WorkspaceExists=true but HasAnyLiveToken=false — and fell through the
grace period, leaking every global-secret key name to any
unauthenticated caller on the Docker network.
Phase 30.1 shipped months ago; every production workspace has gone
through multiple boot cycles and acquired a token since. The
"legacy workspaces grandfathered" window no longer serves legitimate
traffic. Removing it entirely is the cleanest fix — and does NOT
affect registration (which is on /registry/register, outside this
middleware's scope).
New contract (strict):
every /workspaces/:id/* request MUST carry
Authorization: Bearer <token-for-this-workspace>
Any missing/mismatched/revoked/wrong-workspace bearer → 401. No
existence check, no fallback. The wsauth.WorkspaceExists helper is
kept in the package for any future caller but no longer used here.
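A minimal sketch of the strict contract as middleware (type and function names are illustrative, not the actual wsauth code):

```go
package wsauth

import (
	"net/http"
	"strings"
)

// TokenValidator reports whether a bearer token is live and bound to the
// given workspace. The interface is illustrative; the real middleware talks
// to the platform DB.
type TokenValidator interface {
	Validate(token, workspaceID string) bool
}

// Strict enforces the new contract: every /workspaces/:id/* request must
// carry a live bearer token for that exact workspace. No existence check,
// no grace-period fallback; a missing bearer is rejected before any DB call.
func Strict(v TokenValidator, workspaceID func(*http.Request) string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !ok || token == "" {
			http.Error(w, "unauthorized", http.StatusUnauthorized) // zero DB calls on missing bearer
			return
		}
		if !v.Validate(token, workspaceID(r)) {
			http.Error(w, "unauthorized", http.StatusUnauthorized) // mismatched, revoked, or wrong workspace
			return
		}
		next.ServeHTTP(w, r)
	})
}
```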
Tests:
- TestWorkspaceAuth_351_NoBearer_Returns401_NoDBCalls — new, covers
fake UUID / zombie / pre-token in one sub-table. Asserts zero DB
calls on missing bearer.
- Existing C4/C8 + #170 tests updated to drop the stale
HasAnyLiveToken sqlmock expectations.
- Renamed TestWorkspaceAuth_Issue170_SecretDelete_FailOpen_NoTokens
to _NoTokensStillRejected and flipped the assertion from 200 to 401.
- Dropped TestWorkspaceAuth_318_ExistsQueryError_Returns500 — the
code path it covered no longer exists.
Full platform test sweep green.
Closes #351
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add two new entries to docs/ecosystem-watch.md:
- **AgentScope** (modelscope/agentscope, ~23.8k ⭐, Apache 2.0,
v1.0.18 March 26 2026): Alibaba/ModelScope multi-agent framework
with MCP support, MsgHub typed routing, and OpenTelemetry
observability. No canvas or workspace lifecycle — framework-layer
complement, not a platform competitor.
- **Plannotator** (backnotprop/plannotator, ~4.3k ⭐, Apache 2.0+MIT,
v0.17.10 April 13 2026): Browser-based agent plan annotation tool
with structured feedback types (delete/insert/replace/comment).
Directly informs our hitl.py feedback schema. Filed #349 to add
structured feedback types to resume_task.
HEAD at survey time: 0897f9e
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two new entries to docs/ecosystem-watch.md:
- **AgentScope** (modelscope/agentscope, ~23.8k ⭐, Apache 2.0,
v1.0.18 March 26 2026): Alibaba/ModelScope multi-agent framework
with MCP support, MsgHub typed routing, and OpenTelemetry
observability. No canvas or workspace lifecycle — framework-layer
complement, not a platform competitor.
- **Plannotator** (backnotprop/plannotator, ~4.3k ⭐, Apache 2.0+MIT,
v0.17.10 April 13 2026): Browser-based agent plan annotation tool
with structured feedback types (delete/insert/replace/comment).
Directly informs our hitl.py feedback schema. Filed #349 to add
structured feedback types to resume_task.
HEAD at survey time: 4196876
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>