Commit Graph

3996 Commits

Author SHA1 Message Date
Hongming Wang
7318ead8a4 fix(security): scrub workspace-server token + upstream error logs
Two findings from the pre-launch log-scrub audit:

1. handlers/workspace_provision.go:548 logged `token[:8]` — the exact
   H1 pattern that panicked on short keys. Even with a length guard,
   leaking 8 chars of an auth token into centralized logs shortens the
   search space for anyone who gets log-read access. Now logs only
   `len(token)` as a liveness signal.

2. provisioner/cp_provisioner.go:101 fell back to logging the raw
   control-plane response body when the structured {"error":"..."}
   field was absent. If the CP ever echoed request headers (Authorization)
   or a portion of user-data back in an error path, the bearer token
   would end up in our tenant-instance logs. Now logs the byte count
   only; the structured error remains in place for the happy path.
   Also caps the read at 64 KiB via io.LimitReader to prevent
   log-flood DoS from a compromised upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:33:47 -07:00
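The two changes in 7318ead8a4 can be illustrated with a minimal sketch. The real code is Go (handlers/workspace_provision.go and provisioner/cp_provisioner.go); the Python below only models the behavior the message describes. The function names and log strings are hypothetical; the 64 KiB cap and the {"error":"..."} field come from the commit message.

```python
import io
import json

MAX_ERROR_BODY = 64 * 1024  # 64 KiB read cap, as stated in the commit message


def token_log_field(token: str) -> str:
    """Log-safe liveness signal: the token's length, never its characters.

    Hypothetical helper name; the commit only says the log line now
    carries len(token) instead of token[:8].
    """
    return f"token_len={len(token)}"


def describe_upstream_error(body) -> str:
    """Summarize a control-plane error response without echoing its raw body.

    Reads at most MAX_ERROR_BODY bytes. If the body is JSON with a string
    "error" field, that structured message is logged; otherwise only the
    number of bytes read is reported.
    """
    data = body.read(MAX_ERROR_BODY)
    try:
        parsed = json.loads(data)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        parsed = None
    if isinstance(parsed, dict) and isinstance(parsed.get("error"), str):
        return f"upstream error: {parsed['error']}"
    return f"upstream error: unstructured body ({len(data)} bytes)"
```

Logging a byte count keeps the "something came back" signal for debugging while ensuring that an echoed Authorization header or user-data fragment never reaches the logs.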
Hongming Wang
a5d6e5319f Merge pull request #979 from Molecule-AI/fix/security-adminauth-c4
fix(security): C4 — close AdminAuth fail-open race on hosted-SaaS fresh install
2026-04-19 01:29:54 -07:00
Hongming Wang
cb16e55447 Merge pull request #979 from Molecule-AI/fix/security-adminauth-c4
fix(security): C4 — close AdminAuth fail-open race on hosted-SaaS fresh install
2026-04-19 01:29:54 -07:00
Hongming Wang
bf08a7edd9 Merge pull request #978 from Molecule-AI/fix/security-discord-config-limitreader
fix(security): cap Discord webhook + config PATCH bodies (H3/H4)
2026-04-19 01:28:46 -07:00
Hongming Wang
13992478ec Merge pull request #978 from Molecule-AI/fix/security-discord-config-limitreader
fix(security): cap Discord webhook + config PATCH bodies (H3/H4)
2026-04-19 01:28:46 -07:00
Hongming Wang
481b5cfb1a fix(security): C4 — close AdminAuth fail-open race on hosted-SaaS fresh install
Pre-launch review blocker. AdminAuth's Tier-1 fail-open fired whenever
the workspace_auth_tokens table was empty — including the window between
a hosted tenant EC2 booting and the first workspace being created. In
that window, every admin-gated route (POST /org/import, POST /workspaces,
POST /bundles/import, etc.) was reachable without a bearer, letting an
attacker pre-empt the first real user by importing a hostile workspace
into a freshly provisioned instance.

Fix: fail-open is now ONLY applied when ADMIN_TOKEN is unset (self-
hosted dev with zero auth configured). Hosted SaaS always sets
ADMIN_TOKEN at provision time, so the branch never fires in prod and
requests with no bearer get 401 even before the first token is minted.

Tier-2 / Tier-3 paths unchanged.

The old TestAdminAuth_684_FailOpen_AdminTokenSet_NoGlobalTokens test
was codifying exactly this bug (asserting 200 on fresh install with
ADMIN_TOKEN set). Renamed and flipped to
TestAdminAuth_C4_AdminTokenSet_FreshInstall_FailsClosed asserting 401.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:28:13 -07:00
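The Tier-1 change in 481b5cfb1a reduces to a small truth table. The sketch below models it in Python (the real middleware is Go); the function names are hypothetical, and it assumes that fail-open still also requires an empty workspace_auth_tokens table, which the message implies ("zero auth configured") but does not state outright.

```python
def old_tier1_fail_open(token_table_empty: bool) -> bool:
    """Pre-fix behavior: fail open whenever no workspace tokens exist yet."""
    return token_table_empty


def new_tier1_fail_open(admin_token_set: bool, token_table_empty: bool) -> bool:
    """Post-fix behavior: fail open only when ADMIN_TOKEN is unset.

    Assumption: the empty-table condition is kept alongside the new
    ADMIN_TOKEN check. Tier-2 / Tier-3 are not modelled here.
    """
    return (not admin_token_set) and token_table_empty


def fresh_install_status(admin_token_set: bool) -> int:
    """Status for a bearer-less admin request on a fresh install (empty table)."""
    return 200 if new_tier1_fail_open(admin_token_set, True) else 401
```

On a hosted tenant, ADMIN_TOKEN is set at provision time, so `fresh_install_status(True)` yields 401, which is the case the renamed test asserts.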
Hongming Wang
0e917ef6b8 fix(security): C4 — close AdminAuth fail-open race on hosted-SaaS fresh install
Pre-launch review blocker. AdminAuth's Tier-1 fail-open fired whenever
the workspace_auth_tokens table was empty — including the window between
a hosted tenant EC2 booting and the first workspace being created. In
that window, every admin-gated route (POST /org/import, POST /workspaces,
POST /bundles/import, etc.) was reachable without a bearer, letting an
attacker pre-empt the first real user by importing a hostile workspace
into a freshly provisioned instance.

Fix: fail-open is now ONLY applied when ADMIN_TOKEN is unset (self-
hosted dev with zero auth configured). Hosted SaaS always sets
ADMIN_TOKEN at provision time, so the branch never fires in prod and
requests with no bearer get 401 even before the first token is minted.

Tier-2 / Tier-3 paths unchanged.

The old TestAdminAuth_684_FailOpen_AdminTokenSet_NoGlobalTokens test
was codifying exactly this bug (asserting 200 on fresh install with
ADMIN_TOKEN set). Renamed and flipped to
TestAdminAuth_C4_AdminTokenSet_FreshInstall_FailsClosed asserting 401.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:28:13 -07:00
Hongming Wang
af9aae2c38 fix(security): cap webhook + config PATCH bodies (H3/H4)
Two HIGH-severity DoS surfaces: both handlers read the entire HTTP
body with io.ReadAll(r.Body) and no upper bound, so a caller streaming
a multi-gigabyte request could exhaust memory on the tenant instance
before we even validated the JSON.

H3 (Discord webhook): wrap Body in io.LimitReader with a 1 MiB cap.
Discord Interactions payloads are well under 10 KiB in practice.

H4 (workspace config PATCH): wrap Body in http.MaxBytesReader with a
256 KiB cap. Real configs are <10 KiB; jsonb handles the cap
comfortably. Returns 413 Request Entity Too Large on overflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:23:03 -07:00
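The two caps in af9aae2c38 behave differently on overflow: io.LimitReader stops at the limit and reports end of input, while http.MaxBytesReader returns an error that the handler turns into a 413. A rough Python model of both behaviors follows; the handlers themselves are Go, the helper names and the exception are hypothetical, and the limits are the ones quoted in the message.

```python
import io

DISCORD_WEBHOOK_LIMIT = 1 << 20    # 1 MiB (H3)
CONFIG_PATCH_LIMIT = 256 * 1024    # 256 KiB (H4)


class BodyTooLarge(Exception):
    """Hypothetical error a handler would map to 413 Request Entity Too Large."""


def read_truncated(stream, limit: int) -> bytes:
    """LimitReader-style: return at most `limit` bytes, silently dropping the rest."""
    return stream.read(limit)


def read_or_reject(stream, limit: int) -> bytes:
    """MaxBytesReader-style: fail if the body is longer than `limit`.

    Reading limit + 1 bytes is enough to detect overflow without
    buffering the whole request.
    """
    data = stream.read(limit + 1)
    if len(data) > limit:
        raise BodyTooLarge(f"body exceeds {limit} bytes")
    return data
```

Either way, memory use per request is bounded by the cap rather than by whatever the caller chooses to stream.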
Hongming Wang
60c4801a13 fix(security): cap webhook + config PATCH bodies (H3/H4)
Two HIGH-severity DoS surfaces: both handlers read the entire HTTP
body with io.ReadAll(r.Body) and no upper bound, so a caller streaming
a multi-gigabyte request could exhaust memory on the tenant instance
before we even validated the JSON.

H3 (Discord webhook): wrap Body in io.LimitReader with a 1 MiB cap.
Discord Interactions payloads are well under 10 KiB in practice.

H4 (workspace config PATCH): wrap Body in http.MaxBytesReader with a
256 KiB cap. Real configs are <10 KiB; jsonb handles the cap
comfortably. Returns 413 Request Entity Too Large on overflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:23:03 -07:00
Hongming Wang
61b5236aa1 Merge pull request #977 from Molecule-AI/feat/workspace-snapshot-scrubber-823
feat(workspace): snapshot secret scrubber (closes #823)
2026-04-19 00:33:14 -07:00
Hongming Wang
b367f18e95 Merge pull request #977 from Molecule-AI/feat/workspace-snapshot-scrubber-823
feat(workspace): snapshot secret scrubber (closes #823)
2026-04-19 00:33:14 -07:00
Hongming Wang
3976361483 feat(workspace): snapshot secret scrubber (closes #823)
Sub-issue of #799, security condition C4. Standalone module in
workspace/lib/snapshot_scrub.py with three public functions:

- scrub_content(str) → str: regex-based redaction of secret patterns
- is_sandbox_content(str) → bool: detect run_code tool output markers
- scrub_snapshot(dict) → dict: walk memories, scrub each, drop sandbox entries

Patterns covered: sk-ant-/sk-proj-, ghp_/ghs_/github_pat_, AKIA,
cfut_, mol_pk_, ctx7_, Bearer, env-var assignments, base64 blobs ≥33 chars.

21 unit tests, 100% coverage on new code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 00:32:42 -07:00
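As an illustration of the regex-based redaction described in 3976361483, here is a minimal sketch of a scrub_content-like function. It is not the contents of workspace/lib/snapshot_scrub.py: the regular expressions, the placeholder string, and the subset of prefixes covered are all assumptions chosen for the example.

```python
import re

REDACTED = "[REDACTED]"  # hypothetical placeholder; the real one is not given

# Assumed patterns for a few of the prefixes listed in the commit message.
_PATTERNS = [
    re.compile(r"sk-(?:ant|proj)-[A-Za-z0-9_-]+"),
    re.compile(r"(?:ghp_|ghs_|github_pat_)[A-Za-z0-9_]+"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),
]


def scrub_content(text: str) -> str:
    """Replace every match of each secret pattern with the placeholder."""
    for pattern in _PATTERNS:
        text = pattern.sub(REDACTED, text)
    return text
```

For example, `scrub_content("Authorization: Bearer abc.def")` returns `"Authorization: [REDACTED]"`, while text containing none of the patterns is returned unchanged.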
Hongming Wang
e7b9b7df71 feat(workspace): snapshot secret scrubber (closes #823)
Sub-issue of #799, security condition C4. Standalone module in
workspace/lib/snapshot_scrub.py with three public functions:

- scrub_content(str) → str: regex-based redaction of secret patterns
- is_sandbox_content(str) → bool: detect run_code tool output markers
- scrub_snapshot(dict) → dict: walk memories, scrub each, drop sandbox entries

Patterns covered: sk-ant-/sk-proj-, ghp_/ghs_/github_pat_, AKIA,
cfut_, mol_pk_, ctx7_, Bearer, env-var assignments, base64 blobs ≥33 chars.

21 unit tests, 100% coverage on new code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 00:32:42 -07:00
Hongming Wang
285b9d1fa3 Merge pull request #972 from Molecule-AI/chore/ci-action-versions
ci: update GitHub Actions to current stable versions (closes #780)
2026-04-19 00:31:17 -07:00
Hongming Wang
aec64a6a63 Merge pull request #972 from Molecule-AI/chore/ci-action-versions
ci: update GitHub Actions to current stable versions (closes #780)
2026-04-19 00:31:17 -07:00
Hongming Wang
151e458c38 Merge pull request #975 from Molecule-AI/fix/hibernate-409-guard-active-tasks
feat(platform): 409 guard on /hibernate when active_tasks > 0 (closes #822)
2026-04-19 00:30:24 -07:00
Hongming Wang
04e10fb19d Merge pull request #975 from Molecule-AI/fix/hibernate-409-guard-active-tasks
feat(platform): 409 guard on /hibernate when active_tasks > 0 (closes #822)
2026-04-19 00:30:24 -07:00
Hongming Wang
4e7c4ceeb3 Merge pull request #976 from Molecule-AI/feat/last-outbound-at-817
feat(platform): track last_outbound_at for silent detection (closes #817)
2026-04-19 00:30:01 -07:00
Hongming Wang
e2c270600c Merge pull request #976 from Molecule-AI/feat/last-outbound-at-817
feat(platform): track last_outbound_at for silent detection (closes #817)
2026-04-19 00:30:01 -07:00
Hongming Wang
68f55c3ebc Merge pull request #974 from Molecule-AI/fix/canvas-a11y-degraded-badge
fix(canvas): degraded badge WCAG AA contrast (closes #885 p1)
2026-04-19 00:28:39 -07:00
Hongming Wang
eef8949b65 Merge pull request #974 from Molecule-AI/fix/canvas-a11y-degraded-badge
fix(canvas): degraded badge WCAG AA contrast (closes #885 p1)
2026-04-19 00:28:39 -07:00
Hongming Wang
c0233317b8 Merge pull request #968 from Molecule-AI/fix/security-memory-delimiter-npm-pin
fix(security): GLOBAL memory delimiter spoofing + pin MCP version (closes #807, #805)
2026-04-19 00:28:08 -07:00
Hongming Wang
4c9d0d683f Merge pull request #968 from Molecule-AI/fix/security-memory-delimiter-npm-pin
fix(security): GLOBAL memory delimiter spoofing + pin MCP version (closes #807, #805)
2026-04-19 00:28:08 -07:00
Hongming Wang
d183f89b94 Merge pull request #964 from Molecule-AI/feat/schema-migrations-tracking
feat(db): schema_migrations tracking — run each migration only once
2026-04-19 00:27:27 -07:00
Hongming Wang
acb67c75b8 Merge pull request #964 from Molecule-AI/feat/schema-migrations-tracking
feat(db): schema_migrations tracking — run each migration only once
2026-04-19 00:27:27 -07:00
Hongming Wang
50364817ac Merge pull request #967 from Molecule-AI/chore/shadcn-init
chore(canvas): initialize shadcn/ui CLI
2026-04-19 00:27:07 -07:00
Hongming Wang
9b49024ce4 Merge pull request #967 from Molecule-AI/chore/shadcn-init
chore(canvas): initialize shadcn/ui CLI
2026-04-19 00:27:07 -07:00
Hongming Wang
6fb8472c26 Merge pull request #966 from Molecule-AI/fix/strip-current-task-public-get
fix(security): strip current_task from public GET response (closes #955)
2026-04-19 00:26:27 -07:00
Hongming Wang
ff4962e20f Merge pull request #966 from Molecule-AI/fix/strip-current-task-public-get
fix(security): strip current_task from public GET response (closes #955)
2026-04-19 00:26:27 -07:00
Hongming Wang
ad77b84854 Merge pull request #973 from Molecule-AI/docs/rfc2119-opencode-must-not
docs(opencode): 'should not' → 'must not' for SAFE-T1201 (closes #861)
2026-04-19 00:26:05 -07:00
Hongming Wang
0519327179 Merge pull request #973 from Molecule-AI/docs/rfc2119-opencode-must-not
docs(opencode): 'should not' → 'must not' for SAFE-T1201 (closes #861)
2026-04-19 00:26:05 -07:00
Hongming Wang
6efc355f2f Merge pull request #965 from Molecule-AI/fix/crlf-cron-prompts
fix(scheduler): strip CRLF from cron prompts (closes #958)
2026-04-19 00:25:14 -07:00
Hongming Wang
0111a882ab Merge pull request #965 from Molecule-AI/fix/crlf-cron-prompts
fix(scheduler): strip CRLF from cron prompts (closes #958)
2026-04-19 00:25:14 -07:00
Hongming Wang
f96be032a1 Merge pull request #963 from Molecule-AI/chore/turbopack-dev
chore(canvas): enable Turbopack for dev server
2026-04-19 00:24:37 -07:00
Hongming Wang
60ab365d81 Merge pull request #963 from Molecule-AI/chore/turbopack-dev
chore(canvas): enable Turbopack for dev server
2026-04-19 00:24:37 -07:00
Hongming Wang
9e17c86df1 Merge pull request #971 from Molecule-AI/chore/phase35-sg-lockdown-script
feat(security): Phase 35.1 — SG lockdown script for tenant EC2
2026-04-19 00:24:11 -07:00
Hongming Wang
beccd02519 Merge pull request #971 from Molecule-AI/chore/phase35-sg-lockdown-script
feat(security): Phase 35.1 — SG lockdown script for tenant EC2
2026-04-19 00:24:11 -07:00
Hongming Wang
c3eddd7950 Merge pull request #962 from Molecule-AI/chore/secret-scanner-mol-pk
chore: add mol_pk_ and cfut_ to pre-commit secret scanner
2026-04-19 00:22:44 -07:00
Hongming Wang
a00d0dc602 Merge pull request #962 from Molecule-AI/chore/secret-scanner-mol-pk
chore: add mol_pk_ and cfut_ to pre-commit secret scanner
2026-04-19 00:22:44 -07:00
Hongming Wang
4e1a513160 feat(platform): track last_outbound_at for silent-workspace detection (closes #817)
Sub-issue of #795 (phantom-busy post-mortem). Adds a last_outbound_at
TIMESTAMPTZ column to workspaces. It is updated asynchronously on every
successful outbound A2A call from a real workspace (canvas and system
callers are skipped) and exposed in the GET /workspaces/:id response as
"last_outbound_at".

PM/Dev Lead orchestrators can now detect workspaces that have gone silent
despite being online (silent for > 2h with an active cron = phantom-busy warning).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 13:04:54 -07:00
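The phantom-busy rule from 4e1a513160 (online, active cron, no outbound call for more than 2h) could be checked roughly as follows. The function name and parameters are hypothetical, and treating a missing last_outbound_at as silent is an assumption; the commit does not say how a NULL value should be handled.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SILENCE_THRESHOLD = timedelta(hours=2)  # "> 2h" from the commit message


def is_phantom_busy(
    last_outbound_at: Optional[datetime],
    has_active_cron: bool,
    online: bool,
    now: datetime,
) -> bool:
    """Return True when an online workspace with an active cron has gone silent."""
    if not (online and has_active_cron):
        return False
    if last_outbound_at is None:
        # Assumption: a workspace that has never made an outbound call is silent.
        return True
    return now - last_outbound_at > SILENCE_THRESHOLD
```

An orchestrator would call this with the "last_outbound_at" value from GET /workspaces/:id and the current UTC time.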
Hongming Wang
2f36bb9a7f feat(platform): track last_outbound_at for silent-workspace detection (closes #817)
Sub-issue of #795 (phantom-busy post-mortem). Adds a last_outbound_at
TIMESTAMPTZ column to workspaces. It is updated asynchronously on every
successful outbound A2A call from a real workspace (canvas and system
callers are skipped) and exposed in the GET /workspaces/:id response as
"last_outbound_at".

PM/Dev Lead orchestrators can now detect workspaces that have gone silent
despite being online (silent for > 2h with an active cron = phantom-busy warning).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 13:04:54 -07:00
Hongming Wang
37030c307d feat(platform): 409 guard on /hibernate when active_tasks > 0 (closes #822)
Phase 35.1 / #799 security condition C3 — prevents operator from
accidentally killing a mid-task agent.

Behavior:
- active_tasks == 0 → proceed as before
- active_tasks > 0 && ?force=true → log [WARN] + proceed
- active_tasks > 0 && no force → 409 with {error, active_tasks}

2 new tests: TestHibernateHandler_ActiveTasks_Returns409,
TestHibernateHandler_ActiveTasks_ForceTrue_Returns200.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:09:52 -07:00
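The three-way behavior listed in 37030c307d can be expressed as one small decision function. This is a Python model of the Go handler, not the handler itself: the function name, the error string, and the 200 status for the no-active-tasks case are assumptions (the message only says "proceed as before"); the 409 body fields and the 200 on force=true follow the message and test names.

```python
import logging
from typing import Any, Dict, Tuple

log = logging.getLogger("hibernate")


def hibernate_decision(active_tasks: int, force: bool) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, response_body) for a /hibernate request."""
    if active_tasks > 0 and not force:
        # Refuse: hibernating now would kill a mid-task agent.
        return 409, {
            "error": "workspace has active tasks",  # hypothetical wording
            "active_tasks": active_tasks,
        }
    if active_tasks > 0:
        # ?force=true: warn, then proceed anyway.
        log.warning("forcing hibernate with %d active task(s)", active_tasks)
    return 200, {}
```

Returning active_tasks in the 409 body lets the operator see what would be interrupted before retrying with ?force=true.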
Hongming Wang
a8897c5f17 feat(platform): 409 guard on /hibernate when active_tasks > 0 (closes #822)
Phase 35.1 / #799 security condition C3 — prevents operator from
accidentally killing a mid-task agent.

Behavior:
- active_tasks == 0 → proceed as before
- active_tasks > 0 && ?force=true → log [WARN] + proceed
- active_tasks > 0 && no force → 409 with {error, active_tasks}

2 new tests: TestHibernateHandler_ActiveTasks_Returns409,
TestHibernateHandler_ActiveTasks_ForceTrue_Returns200.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:09:52 -07:00
Hongming Wang
89d96e8581 fix(canvas): degraded badge WCAG AA contrast — amber-400 → amber-300 (closes #885)
amber-400 on zinc-900 is 5.4:1 (AA pass). amber-300 is 6.9:1 (AA+AAA pass)
and matches the rest of the amber usage in WorkspaceNode (currentTask,
error detail, badge chip).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:05:38 -07:00
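Contrast ratios like the ones quoted in 89d96e8581 are computed from the WCAG 2.x relative-luminance formula. A small sketch is given below. The commit does not state the hex values or any opacity applied to the badge, so the example does not try to reproduce the 5.4:1 and 6.9:1 figures; it only shows the calculation for arbitrary sRGB colors.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    """Weighted sum of the linearized red, green, and blue channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    """(L_lighter + 0.05) / (L_darker + 0.05), ranging from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)
```

White on black gives the maximum ratio of 21, and any color against itself gives 1.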
Hongming Wang
e74d41bbaa fix(canvas): degraded badge WCAG AA contrast — amber-400 → amber-300 (closes #885)
amber-400 on zinc-900 is 5.4:1 (AA pass). amber-300 is 6.9:1 (AA+AAA pass)
and matches the rest of the amber usage in WorkspaceNode (currentTask,
error detail, badge chip).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:05:38 -07:00
Hongming Wang
df632aeab5 docs(opencode): RFC 2119 — 'should not' → 'must not' for SAFE-T1201 warning (closes #861)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:04:49 -07:00
Hongming Wang
90236c4d23 docs(opencode): RFC 2119 — 'should not' → 'must not' for SAFE-T1201 warning (closes #861)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:04:49 -07:00
Hongming Wang
64796838e0 ci: update GitHub Actions to current stable versions (closes #780)
- golangci/golangci-lint-action@v4 → v9
- docker/setup-qemu-action@v3 → v4
- docker/setup-buildx-action@v3 → v4
- docker/build-push-action@v5 → v6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:04:10 -07:00
Hongming Wang
755c6952c9 ci: update GitHub Actions to current stable versions (closes #780)
- golangci/golangci-lint-action@v4 → v9
- docker/setup-qemu-action@v3 → v4
- docker/setup-buildx-action@v3 → v4
- docker/build-push-action@v5 → v6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:04:10 -07:00
Hongming Wang
510083ccc5 feat(security): Phase 35.1 — SG lockdown script for tenant EC2 instances
Restricts tenant EC2 port 8080 ingress to Cloudflare IP ranges only,
blocking direct-IP access. Supports two modes:

1. Lock to CF IPs (Worker deployment): 14 IPv4 CIDR rules
2. Close ingress entirely (Tunnel deployment): removes 0.0.0.0/0 only

Usage:
  bash scripts/lockdown-tenant-sg.sh --sg-id sg-xxxxx
  bash scripts/lockdown-tenant-sg.sh --sg-id sg-xxxxx --close-ingress
  bash scripts/lockdown-tenant-sg.sh --sg-id sg-xxxxx --dry-run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:01:41 -07:00