molecule-core/workspace-server
Hongming Wang dae7f50095 fix(wsauth): extend dev-mode escape hatch to WorkspaceAuth
The previous commit on this branch added a dev-mode fail-open branch to
AdminAuth so the Canvas dashboard could enumerate workspaces after the
first token lands in the DB. Verification via Chrome (clicking a
workspace to open its side panel) surfaced the same class of bug on a
different middleware — `WorkspaceAuth` — triggering:

  API GET /workspaces/<id>/activity?type=a2a_receive&source=canvas&limit=50:
    401 {"error":"missing workspace auth token"}

Root cause is identical to AdminAuth's: in local dev the Canvas (at
localhost:3000) calls the platform (at localhost:8080) cross-port, so
`isSameOriginCanvas`'s Host==Referer check fails. Without a bearer
token, every per-workspace read (/activity, /delegations, /memories,
/events/stream, /schedules, etc.) 401s and the side panel is unusable.

### Fix

Symmetric extension in `WorkspaceAuth` (workspace-server/internal/middleware/wsauth_middleware.go):
after the existing `isSameOriginCanvas` fallback, add a narrow escape
hatch that stays fail-open only when BOTH

  - `ADMIN_TOKEN` is unset (operator has not opted in to the #684
    closure), AND
  - `MOLECULE_ENV` is explicitly a dev mode (`development` / `dev`).

SaaS tenants never hit this branch because hosted provisioning sets
both `ADMIN_TOKEN` and `MOLECULE_ENV=production`. The comment in the
code also links back to AdminAuth's Tier-1b for consistency.

### Tests

Three new table-driven tests in wsauth_middleware_test.go mirror the
AdminAuth tier-1b suite, exercising the positive path and both
negative cases:

  - `TestWorkspaceAuth_DevModeEscapeHatch_NoBearer_FailsOpen` — the
    happy path (dev mode, no admin token → 200)
  - `TestWorkspaceAuth_DevModeEscapeHatch_IgnoredInProduction` — the
    SaaS-safety guarantee (production + no admin token → 401)
  - `TestWorkspaceAuth_DevModeEscapeHatch_IgnoredWhenAdminTokenSet` —
    explicit `ADMIN_TOKEN` wins; dev mode does not silently override
    the opt-in

### Comprehensive audit of adjacent middlewares

Re-scanned every file under workspace-server/internal/middleware/ and
every handler that invokes `AbortWithStatusJSON(Unauthorized)` directly,
to check for other surfaces where local dev might silently 401.
Findings, already OK:

  - `CanvasOrBearer` — cosmetic routes already accept localhost:3000
    via `canvasOriginAllowed` (Origin header check); no change needed.
  - `tenant_guard.go` — no-op when `MOLECULE_ORG_ID` is unset (self-
    hosted / dev); no change needed.
  - `session_auth.go` — verifies against `CP_UPSTREAM_URL`; returns
    (false, false) in local dev so callers fall through to bearer; no
    change needed.
  - `socket.go` `HandleConnect` — Canvas browser clients don't send
    `X-Workspace-ID` so skip the bearer check; agent clients do and
    validate as today. No change needed.
  - Handlers in handlers/{discovery,registry,secrets,plugins_install,
    a2a_proxy_helpers,schedules}.go — all workspace-scoped routes
    called by the workspace runtime, not the Canvas browser. Unaffected.
  - `handlers/admin_test_token.go` — already `MOLECULE_ENV`-aware (the
    convention this hatch mirrors).

### End-to-end verification

1. Fresh-nuked DB, platform + canvas restarted with `MOLECULE_ENV=development`
2. `POST /workspaces` → token lands in DB (Tier-1 would close here)
3. Probed every Canvas-hit endpoint with no bearer, with Canvas-like
   `Origin: http://localhost:3000`:

     200  /workspaces
     200  /workspaces/<id>/activity
     200  /workspaces/<id>/delegations
     200  /workspaces/<id>/memories
     200  /approvals/pending
     200  /events

4. Chrome browser test: opened http://localhost:3000, clicked a
   workspace tile — the side panel rendered with the full 13-tab
   structure (Chat, Activity, Details, Skills, Terminal, Config,
   Schedule, Channels, Files, Memory, Traces, Events, Audit) and no
   `Failed to load chat history` error. "No messages yet" placeholder
   shows instead of the 401 retry screen.

5. `go test -race ./internal/middleware/` — clean
6. `bash tests/e2e/test_api.sh` — 61/61 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:55:34 -07:00
..
cmd/server fix(go): replace $1 literal with resp.Body.Close() in 7 files (#1247) 2026-04-21 03:18:21 +00:00
internal fix(wsauth): extend dev-mode escape hatch to WorkspaceAuth 2026-04-23 14:55:34 -07:00
migrations feat(a2a): queue-on-busy — Phase 1 of priority queue (#1870) 2026-04-23 14:09:29 -07:00
pkg/provisionhook fix(docker): fix plugin go.mod replace for TokenProvider interface (#960) 2026-04-20 13:42:53 -07:00
.ci-force chore: force Platform(Go) CI run on main — validate go vet clean 2026-04-21 15:43:19 +00:00
.gitignore feat(ws-server): pull env from CP on startup 2026-04-19 02:41:15 -07:00
Dockerfile chore: extract ContextMenu Zustand fix + a2a_proxy local-docker SSRF bypass + workspace-server Dockerfile GID entrypoint 2026-04-22 20:00:16 -07:00
Dockerfile.tenant feat(terminal): remote path via aws ec2-instance-connect + pty 2026-04-21 18:13:29 -07:00
entrypoint-tenant.sh fix(security): add USER directive before ENTRYPOINT in all tenant images (#1155) 2026-04-20 23:51:33 +00:00
go.mod Merge main into staging - resolving 1,388 commit divergence for PR #1573 2026-04-22 13:54:53 +00:00
go.sum feat(terminal): remote path via aws ec2-instance-connect + pty 2026-04-21 18:13:29 -07:00