- New A2ATopologyOverlay component polls /activity fan-out every 60s and
writes directed edges to a2aEdges store slice (separate from topology edges)
- buildA2AEdges aggregates delegate rows per source→target pair; violet-500
animated edge when last call <5 min ago, blue-500 static otherwise
- Toolbar toggle persists to localStorage (molecule:show-a2a-edges)
- Canvas.tsx merges a2aEdges into allEdges via useMemo; pointerEvents:none
on all edge elements keeps nodes draggable
- 24 new unit tests across pure function, helper, and component suites
- Fix Canvas.a11y and Canvas.pan-to-node store mocks (missing A2A fields)
Closes#744
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds `spike/issue-742-managed-agents-executor/` with:
- `demo.py`: standalone Python script that authenticates to the Managed Agents
beta API, provisions an environment + agent, starts a session, runs two
conversational turns (with cross-turn state recall verification), and prints
cold-start and per-turn latency measurements.
- `README.md`: full integration assessment covering provisioner changes needed,
A2A routing conflict (primary blocker — sessions have no addressable URL),
cost model, API gaps table, and a no-ship recommendation with a 3-week effort
estimate if we proceeded anyway.
Recommendation: no-ship for primary executor. Revisit as a batch/cron worker
in Phase H once Molecule's MCP server is feature-complete.
Closes#745. References #742.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the anthropic:claude-sonnet-4-6 default across config, handlers,
env example, and litellm proxy config. All tests updated to match the new
default; sonnet-4-6 alias kept in litellm_config.yml for pinned workspaces.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The BE's tests (AdminTokenSet_*, FailOpen_*) validated the core AdminAuth
contract on /admin/secrets. These table-driven additions pin the same contract
on the three routes explicitly named in the #684 security report, each with
three scenarios: workspace token rejected, correct ADMIN_TOKEN accepted, no
bearer rejected.
Routes covered:
GET /admin/liveness
GET /admin/github-installation-token
GET /approvals/pending
When ADMIN_TOKEN is set (tier 2), ValidateAnyToken is never called — the
env-var comparison short-circuits before any DB lookup. The mock sets only
HasAnyLiveTokenGlobal and nothing else; an extra DB expectation would itself
be a test bug (calling it proves the middleware regressed to tier 3).
All 18 TestAdminAuth_684* tests pass. Full go test ./... is green across all
15 platform packages.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend Engineer's PR #729 introduces ADMIN_TOKEN — when set, only that value
is accepted on /admin/* and /approvals/* routes, replacing the vulnerable
workspace-bearer fallback. Without the env var wired into deployments the fix
is code-only and the vulnerability stays open in every running instance.
Changes:
- `docker-compose.yml`: adds ADMIN_TOKEN env var to the platform service
(blank default = backward-compat fallback, i.e. still vulnerable until set).
NOTE: docker-compose.infra.yml has no platform service — the platform lives
only in the full-stack docker-compose.yml, so that is the correct file.
- `.env.example`: documents ADMIN_TOKEN with generation instructions and a
clear warning that it must be set to close#684.
- `infra/scripts/setup.sh`: prints a visible warning when ADMIN_TOKEN is unset
so operators know the vulnerability is still open in that deployment.
- `CLAUDE.md`: adds ADMIN_TOKEN to the env vars reference section.
No Go code changed — go build ./... passes clean.
Part of fix for #684 / PR #729
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
goose donated to Linux Foundation AAIF (alongside MCP + AGENTS.md) — AGENTS.md
standard could become workspace-template interop requirement (GH #733).
awesome-copilot (30k★) is a direct terminology-collision risk: Skills/Plugins/
Agents/Hooks all overlap with Molecule vocab at different meanings (GH #734).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Blast-radius isolation gap: AdminAuth called ValidateAnyToken which
accepted any live workspace bearer token. A compromised workspace agent
could present its own token to GET /admin/github-installation-token and
steal the platform's GitHub App credential, or hit /approvals/pending to
enumerate cross-workspace approvals.
Fix: introduce a dedicated admin credential tier via ADMIN_TOKEN env var.
When set, AdminAuth verifies the bearer against that secret exclusively
(crypto/subtle constant-time comparison). Workspace tokens are rejected
outright — no DB lookup occurs. When ADMIN_TOKEN is not set the previous
behaviour is preserved as a deprecated backward-compat fallback (tier 3)
so existing deployments without the env var don't break immediately.
Credential tiers (evaluated in order):
1. Fail-open — no live tokens globally (fresh install / pre-Phase-30)
2. ADMIN_TOKEN match — env var set, bearer must equal it exactly
3. Fallback (deprecated) — any valid workspace token (ADMIN_TOKEN unset)
Operators should set ADMIN_TOKEN=<openssl rand -base64 32> to fully close
the blast-radius gap. Tier 3 will be removed in a future release.
Fixes#684.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three bugs caused enabled schedules to silently disappear from the fire query
(which requires next_run_at IS NOT NULL AND next_run_at <= now()):
Bug 1 - fireSchedule() and recordSkipped(): when ComputeNextRun returned an
error, nextRunPtr stayed nil and UPDATE SET next_run_at = $2 wrote NULL.
Fix: change to COALESCE($2, next_run_at) so the existing DB value is preserved
when $2 is NULL, and log the error explicitly.
Bug 2 - org importer (handlers/org.go): nextRun, _ := ComputeNextRun(...)
silently discarded the error. A bad cron expression would pass time.Time{}
(zero value) to the INSERT. Fix: surface the error, log it, and skip the
schedule INSERT via continue.
Bug 3 - no startup repair: schedules already NULL'd by the pre-fix binary
would never recover. Fix: Start() now calls repairNullNextRunAt() once on
boot, recomputing next_run_at for every enabled schedule with a NULL value.
Tests: TestFireSchedule_ComputeNextRunError, TestRecordSkipped_ComputeNextRunError,
TestRepairNullNextRunAt_RepairsRows, TestRepairNullNextRunAt_DBError_NoPanic,
TestOrgImport_ScheduleComputeError (all pass).
Fixes#722
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Defense-in-depth: workspace-scoped ValidateToken now rejects tokens
belonging to workspaces with status='removed' at the DB layer, even
when revoked_at IS NULL. Mirrors the same guard added to ValidateAnyToken
in #696. Updated all test mock patterns (workspace_test, a2a_proxy_test,
secrets_test, admin_test_token_test, middleware) to match the new JOIN query.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ValidateAnyToken: add JOIN on workspaces with AND w.status != 'removed'
so tokens belonging to deleted workspaces cannot be replayed against
admin endpoints even before the token row is explicitly revoked.
- tokens_test.go: update ValidateAnyToken regexp patterns to match new
JOIN query; add TestValidateAnyToken_RemovedWorkspaceRejected.
- wsauth_middleware_test.go: update validateAnyTokenSelectQuery constant
to match JOIN query; add TestAdminAuth_RemovedWorkspaceToken_Returns401
to pin the AdminAuth removed-workspace rejection at the middleware layer.
- ADR-001: restore full blast-radius endpoint table (15 affected admin
routes), explicit risk statement ("full platform takeover"), current
mitigations, and Phase-H remediation plan (schema, middleware, bootstrap
flow, migration path). Tracking issue: #710.
#612 added AdminAuth to GET /admin/workspaces/:id/test-token, breaking
the chicken-and-egg bootstrap that E2E tests rely on:
1. POST /workspaces creates first workspace (fail-open, no tokens)
2. Provision generates a workspace auth token → inserts into DB
3. AdminAuth now sees a live token → requires auth on ALL routes
4. E2E calls test-token to get its first admin bearer → 401
5. All subsequent E2E calls fail → EVERY open PR CI blocked
The test-token handler already has its own production guard
(TestTokensEnabled returns false when MOLECULE_ENV=prod). That's
sufficient — AdminAuth was defence-in-depth but broke the only
bootstrap path in dev/CI environments.
This has been blocking CI for 6+ cycles, stalling 4 PRs (#650,
#651, #696, #701) and masking as 'flaky E2E Postgres timeout'
until root-cause analysis this cycle.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two targeted fixes for the A2A false-negative (delivery succeeded but caller
receives A2A_ERROR):
Body-read failure: when Do() succeeds (target sent 2xx headers — delivery
confirmed) but io.ReadAll(resp.Body) fails, proxy now returns
{"delivery_confirmed": true} in the 502 body and logs the activity as
successful. Audit trail records true delivery, not a false failed entry.
isTransientProxyError fix: delegation retry loop now only retries 503s with
{restarting: true} (container died, message NOT delivered). 503 {busy: true}
signals the agent IS processing the delivered message — retrying causes
double-delivery. Fix prevents the double-delivery race.
All 16 packages pass: go test ./...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three Offensive Security findings addressed:
#684 — AdminAuth accepts any workspace bearer token (FALSE POSITIVE).
ValidateAnyToken intentionally accepts any valid workspace token — the
platform's trust model uses workspace credentials as admin credentials.
No code change; documented as by-design in the PR body.
#682 — Deleted-workspace bearer tokens still authenticate (defense-in-depth).
The Delete handler already revokes all tokens (revoked_at = now()), so this
was a false positive. As defense-in-depth we add a JOIN against workspaces in
ValidateAnyToken so that even if revoked_at is not set (transient DB error
between status update and token revocation), the token still fails validation
once workspace.status = 'removed'.
Files: platform/internal/wsauth/tokens.go, tokens_test.go,
platform/internal/middleware/wsauth_middleware_test.go
#683 — /metrics unauthenticated (REAL).
GET /metrics was on the open router with no auth. The Prometheus endpoint
exposes the full HTTP route-pattern map, request counts by route+status, and
Go runtime memory stats — ops intel that should not reach unauthenticated
callers. Scraper must now present a valid workspace bearer token.
File: platform/internal/router/router.go
All 16 packages pass: go test ./...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ISSUE #680 — IDOR on PATCH /workspaces/🆔
- Route was on the open router with no auth middleware. Any unauthenticated
caller could rename, change role, or update any workspace field of any
workspace ID without credentials (zero auth + no ownership check).
- Fix: register under wsAuth (WorkspaceAuth middleware) which (a) requires a
valid bearer token and (b) validates the token belongs to the target
workspace, providing auth + ownership in a single check.
- Remove the now-redundant in-handler field-level auth block — the middleware
is a strictly stronger gate. Dead code gone.
- Remove unused `middleware` import from workspace.go.
- Update tests: two tests that asserted the old in-handler 401 are replaced
by TestWorkspaceUpdate_SensitiveField_AuthEnforcedByMiddleware (documents
that auth is now at the router layer); cosmetic-field test renamed.
ISSUE #681 — test-token endpoint auth:
- Confirmed: GET /admin/workspaces/:id/test-token already has
middleware.AdminAuth(db.DB). No change needed — finding was from older state.
Build: `go build ./...` clean. All 15 test packages pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Unbounded io.ReadAll on the Discord webhook error response body was a LOW
OOM risk: a malicious gateway or misconfigured proxy could return a multi-MB
body and exhaust agent memory. Cap with io.LimitReader(resp.Body, 4096) —
error messages are always short; any extra content is irrelevant noise.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
FIX 1: Cloudflare Artifacts routes (wsAuth POST/GET /artifacts, /fork, /token)
were accidentally dropped when #618 modified router.go. Restored along with the
handler and client packages that were already on main (#595/#641) but missing
from this branch.
FIX 2: Stray `audh := handlers.NewAuditHandler()` / `wsAuth.GET("/audit", ...)` block
was added out-of-scope during #618 work. Removed — #594 (audit-ledger) is a
separate merged PR and its routes live on main independently.
Build: `go build ./...` clean. All 17 test packages pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>