#125 added a SELECT EXISTS guard before WorkspaceHandler.Update applies
any UPDATE, so nonexistent workspace IDs return 404 instead of silent
zero-row successes. The four existing WorkspaceUpdate_* sqlmock tests
didn't mock the probe, so they broke on main. This was not caught
because CI is blocked by the Actions billing cap.
Adds ExpectQuery for the EXISTS probe to:
- TestWorkspaceUpdate_ParentID
- TestWorkspaceUpdate_NameOnly
- TestWorkspaceUpdate_MultipleFields
- TestWorkspaceUpdate_RuntimeField
TestWorkspaceUpdate_BadJSON doesn't need the fix — it aborts on
c.ShouldBindJSON before reaching the guard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Supersedes #158 (10-min uniform bump). That PR was too blunt — it treated
research/audit/orchestration crons the same when they have fundamentally
different cost/value/cadence profiles.
## The split
Three layers, three cadences, grounded in the survey of Hermes/Letta/
Trigger.dev/Inngest/AG2/Rivet/n8n/Composio/SWE-agent done this session.
None of the surveyed systems runs a while(true) loop per agent — they all
combine event-driven reactivity with short orchestration pulses on a coordinator.
This PR implements that split for our 12-workspace template.
| Layer | Roles | Cadence | Purpose |
|---|---|---|---|
| Orchestration | PM, Dev Lead, Research Lead | every 5 min | Check backlog, dispatch work, review completed tasks |
| Audit | Security Auditor | every 10 min | Focused security audit |
| Audit | UI/UX Designer | every 15 min | Vision-heavy; dialed back from 10 min |
| Deep-work | Research Lead (eco-watch) | every 30 min (:08, :38) | Was hourly |
| Deep-work | Dev Lead (template fitness) | every 30 min (:15, :45) | Was hourly |
| Deep-work | Technical Researcher (plugins) | hourly (unchanged) | Research-heavy, slow |
| Deep-work | DevOps (channels) | hourly (unchanged) | Research-heavy, slow |
| Reactive | BE, FE, DevOps, Docs | no cron | Execute A2A delegations |
## Orchestration pulse prompts
The three new schedules each carry a detailed orchestration_prompt:
- **PM** (5-min): scan all 12 workspaces, scan GH PRs/issues backlog
(external), scan memory backlog (internal), dispatch up to 3 tasks per
pulse, review completed work, write pulse summary to memory. Hard
rules: under 90s wall-clock, never dispatch to busy agents, write
"orchestrator-clean" and stop if genuinely nothing to do.
- **Dev Lead** (5-min, offset +1 from PM): same shape, scoped to
engineering team. Reviews open PRs from direct reports, matches idle
engineers to labeled GH issues (security/bug/feature), dispatches with
"fix/issue-N-slug" branch convention. Skips pulse if own template
fitness audit is in flight (:15, :45).
- **Research Lead** (5-min, offset +3 from PM): same shape, scoped to
research team. Matches Market Analyst / Technical Researcher /
Competitive Intelligence to research-labeled issues or memory-stashed
questions. Max 2 A2A dispatches per pulse (research is slow). Skips a
pulse if its own eco-watch is in flight (:08, :38).
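As a concrete shape, one of the three entries might look like this (field names and the exact prompt text are illustrative, not the template's actual schema; the cron expression encodes the :01/:06/…/:56 cadence):

```yaml
# Hypothetical sketch of the PM orchestrator schedule entry.
# Field names are assumptions; only the cadence is from this PR.
- name: pm-orchestration-pulse
  role: PM
  cron: "1-56/5 * * * *"   # every 5 min, offset +1 past the hour
  orchestration_prompt: |
    Scan all 12 workspaces, the GH PR/issue backlog (external), and the
    memory backlog (internal). Dispatch up to 3 tasks per pulse, review
    completed work, write a pulse summary to memory. Hard rules: stay
    under 90s wall-clock, never dispatch to busy agents, write
    "orchestrator-clean" and stop if there is genuinely nothing to do.
```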
## Cadence offset table
Crons are offset so pulses rarely collide in the same minute:
:01,:11,:21,:31,:41,:51 — Security audit (Security Auditor)
:02,:07,:12,:17,:22,:27,:32,:37,:42,:47,:52,:57 — Dev Lead orchestrator
:04,:09,:14,:19,:24,:29,:34,:39,:44,:49,:54,:59 — Research Lead orchestrator
:01,:06,:11,:16,:21,:26,:31,:36,:41,:46,:51,:56 — PM orchestrator
:05,:20,:35,:50 — UI/UX audit (UIUX Designer)
:08,:38 — Ecosystem watch deep-work (Research Lead)
:15,:45 — Template fitness deep-work (Dev Lead)
:22 — Plugin curation (Technical Researcher)
:47 — Channel expansion (DevOps Engineer)
Note PM and Security Auditor coincide at :01/:11/:21/:31/:41/:51, and the
Dev Lead pulse overlaps the hourly deep-work crons at :22 and :47 — this is
fine: they target different workspaces, and the scheduler fires them concurrently.
## Cost estimate
- PM pulse: 12/hour × 24 × ~3k tokens = 864k tokens/day/org ~ $5/day
- Dev Lead pulse: same ~ $5/day
- Research Lead pulse: same ~ $5/day
- Audits (security 10min, UIUX 15min): ~$8/day/org combined
- Deep-work crons (unchanged from original): ~$4/day/org
**Total ~$27/day/org**. Comparable to #158's ~$25, but with much higher
utility: orchestration produces dispatches that keep workers busy, whereas
#158 just fired more audits at the same team.
Closes #158 (superseded — that PR will be closed with a pointer to this one).
## Related research
See the `### Hermes Agent` section of docs/ecosystem-watch.md and today's
research-agent output: event-driven reactivity + reflection-on-completion +
short orchestration pulses on leaders is the shape that delivers 24/7
activity without runaway cost. This PR is the concrete implementation.
Two new entries added from the second daily pass (first run merged as PR #150
at 03:20 UTC). Both surfaced in the afternoon trending windows and were not
covered by the morning run.
- microsoft/agent-framework (~9.5k ⭐): official Microsoft successor to
AutoGen; ships migration guide and April 2026 .NET release. Directly affects
our autogen adapter in workspace-template/adapters/. Filed issue #156 to
evaluate adapter update.
- vercel-labs/open-agents (~2.2k ⭐, +1,020 today): cloud coding agent template
from Vercel Labs (same team as Skills CLI). Notable for agent-outside-sandbox
architecture and snapshot-based VM resumption — a more efficient approach
than our current Docker restart + git-clone pattern.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #151. The middleware was already implemented and tested (3 passing
tests in securityheaders_test.go covering the base set, multi-route
behaviour, and the don't-override-existing contract) but was never
registered in router.go.
One-line wire-up: it runs after TenantGuard, so rejected requests still
get the same headers as accepted ones, and before the routes, so handlers
can still opt out by setting their own header before c.Next() returns.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The #95 scheduler heartbeat scheme relied on:
1. Top of tick() (once per poll interval)
2. Per-fire goroutine entry + exit
That leaves a gap: tick() ends with wg.Wait(), so if a single fire takes
longer than pollInterval (UIUX audits routinely take 60-120s; max fireTimeout
is 5min), the next tick doesn't run and no top-of-tick heartbeat fires.
Per-fire heartbeats only bracket the fire — between entry and the HTTP
response returning, nothing heartbeats either.
Observed today: /admin/liveness reports seconds_ago=251 while docker logs
show the scheduler actively firing 'Hourly ecosystem watch'. Scheduler is
fine; liveness is lying.
Adds an independent 10s heartbeat pulse goroutine inside Start(), decoupled
from tick completion. The existing heartbeats at tick top + per-fire are
kept as redundant signals but this pulse is the one that guarantees liveness
freshness regardless of what tick is doing.
Ships the exact fix proposed in the body of #140.
Closes #140.
Closes #133. Both roles previously inherited defaults only (ecc,
molecule-dev, superpowers, careful-bash, prompt-watchdog, audit-trail,
session-context, cron-learnings, update-docs) — no review skill.
Dev Lead enforces PR quality gates per triage SKILL.md; QA Engineer
reviews test coverage against acceptance criteria. Both need the
16-criteria code-review rubric and llm-judge to operate deterministically.
Mirrors Security Auditor's existing `[molecule-skill-code-review,
molecule-skill-cross-vendor-review, molecule-skill-llm-judge]` override.
Dropped cross-vendor from these two since it's a noteworthy-PR tool —
the workflow-triage entry in defaults already gates that for the ticks
that need it.
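Under stated assumptions about the template's YAML shape (role keys and the skills field name are illustrative), the new overrides would read:

```yaml
# Hypothetical shape of the per-role skills override; role keys and
# field names are assumptions. cross-vendor-review is intentionally
# omitted — the workflow-triage entry in defaults gates it instead.
roles:
  dev-lead:
    skills: [molecule-skill-code-review, molecule-skill-llm-judge]
  qa-engineer:
    skills: [molecule-skill-code-review, molecule-skill-llm-judge]
```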
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Security fix merging despite the CI outage (issue #136: runner failing since 07:22, all jobs failing in 1-2 s with no log output; infrastructure issue confirmed across 28 consecutive runs).
Issue #120 confirmed live by Security Auditor (cycle 3):
curl -X PATCH .../workspaces/00000000-... -d '{"name":"probe"}' → 200 (no token)
Code reviewed and approved by Security Auditor. Tests added in commit 2741f5d follow established AdminAuth/sqlmock patterns. CI outage is unrelated to these changes.
Two gaps identified by the Security Auditor in the PR #125 review cycle:
1. handlers_extended_test.go:
- Fix TestExtended_WorkspaceUpdate: add SELECT EXISTS mock expectation
so the test correctly reflects the #120 existence guard now running first.
- Add TestExtended_WorkspaceUpdate_NotFound: verifies PATCH returns 404
(not 200) for a nonexistent workspace ID — the core #120 behaviour fix.
2. wsauth_middleware_test.go:
- Add TestAdminAuth_Issue120_PatchWorkspace_NoBearer_Returns401: documents
the confirmed attack vector (PATCH without token must return 401) and
asserts AdminAuth is applied to PATCH /workspaces/:id per the router.go change.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses the three release-blocking WCAG violations from the UX audit
(3rd consecutive cycle) and the new ChatTab ARIA gap from Audit #2.
Changes:
- Toaster: split into polite (success/info) + assertive (error) live
regions, both always in DOM so screen readers register them before
any toast fires. Adds an × dismiss button on every toast. Errors no
longer auto-expire after 4 s; they persist until explicitly dismissed.
- ConfirmDialog: on open, requestAnimationFrame focuses the first
button inside the dialog. Tab/Shift-Tab is now trapped inside the
dialog while open. Added role="dialog" aria-modal="true" and
aria-labelledby pointing to the title h3.
- WorkspaceNode: outer div gains role="button", tabIndex={0},
aria-label, aria-pressed, and onKeyDown (Enter/Space => selectNode,
ContextMenu key => openContextMenu). Keyboard-only users can now
reach and activate workspace nodes.
- ChatTab sub-tab bar: role="tablist" on wrapper, role="tab" +
aria-selected + aria-controls on each button, matching
role="tabpanel" + id on each panel div. Textarea gets
aria-label="Message to agent".
453/453 Vitest tests pass. Production build clean (Next.js 15).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Issue #120 (HIGH — immediately exploitable):
PATCH /workspaces/:id was registered on the root router with no auth
middleware. An attacker with any workspace UUID could:
- Escalate tier (tier 4 = 4 GB RAM allocation)
- Rewrite parent_id to subvert CanCommunicate A2A access control
- Swap runtime image on next restart
- Redirect workspace_dir host bind-mount to arbitrary path
Fix: move PATCH into the wsAdmin AdminAuth group alongside POST and DELETE.
The canvas position-persist call already has an AdminAuth token (required
for GET /workspaces list on initial load) so no canvas regression.
Also add workspace-existence guard in Update handler — previously returned
200 with zero rows affected for nonexistent IDs.
Issue #113 (MEDIUM — schedule IDOR, carry-over from prior cycle):
PATCH /workspaces/:id/schedules/:scheduleId and DELETE operated on
scheduleID alone (WHERE id = $1), allowing any authenticated caller to
modify or delete schedules belonging to other workspaces.
Fix: bind workspace_id = c.Param("id") in both Update and Delete handlers;
add AND workspace_id = $N to all schedule SQL queries.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New nodes were placed at (0,0) or close to it, causing them to spawn
behind the toolbar/palette chrome and require manual panning to find.
Add a GRID_ORIGIN_X/Y = 100 offset so the first node lands in clear canvas
space, and update the position assertion in the unit test accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>