Commit Graph

1020 Commits

Author SHA1 Message Date
Molecule AI Backend Engineer
818d5cde91 fix(mcp): scrub secrets in commit_memory MCP tool path (#838 sibling)
PR #881 closed SAFE-T1201 (#838) on the HTTP path by wiring redactSecrets()
into MemoriesHandler.Commit — but the sibling code path on the MCP bridge
(MCPHandler.toolCommitMemory) was left with only the TODO comment. Agents
calling commit_memory via the MCP tool bridge are the PRIMARY attack vector
for #838 (confused / prompt-injected agent pipes raw tool-response text
containing plain-text credentials into agent_memories, leaking into shared
TEAM scope). The HTTP path is only exercised by canvas UI posts, so the MCP
gap was the hotter one.

Change:

  workspace-server/internal/handlers/mcp.go:725
    - TODO(#838): run _redactSecrets(content) before insert — plain-text
    - API keys from tool responses must not land in the memories table.
    + SAFE-T1201 (#838): scrub known credential patterns before persistence…
    + content, _ = redactSecrets(workspaceID, content)

Reuses redactSecrets (same package) so there's no duplicated pattern list —
a future-added pattern in memories.go automatically covers the MCP path too.

Tests added in mcp_test.go:

  - TestMCPHandler_CommitMemory_SecretInContent_IsRedactedBeforeInsert
      Exercises three patterns (env-var assignment, Bearer token, sk-…)
      and uses sqlmock's WithArgs to bind the exact REDACTED form — so a
      regression (removing the redactSecrets call) fails with arg-mismatch
      rather than silently persisting the secret.

  - TestMCPHandler_CommitMemory_CleanContent_PassesThrough
      Regression guard — benign content must NOT be altered by the redactor.

NOTE: unable to run `go test -race ./...` locally (this container has no Go
toolchain). The change is mechanical reuse of an already-shipped function in
the same package; CI must validate. The sqlmock patterns mirror the existing
TestMCPHandler_CommitMemory_LocalScope_Success test exactly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-19 17:52:52 +00:00
Hongming Wang
393ecc74e3 Merge pull request #991 from Molecule-AI/perf/scheduler-returning-clause
perf(scheduler): collapse empty-run bump to single RETURNING query
2026-04-19 03:48:42 -07:00
Hongming Wang
69d5af6636 Merge pull request #990 from Molecule-AI/fix/cp-provisioner-tests
test(ws-server): CPProvisioner coverage — auth, env fallback, error paths
2026-04-19 03:48:40 -07:00
Hongming Wang
b749c558f3 perf(scheduler): collapse empty-run bump to single RETURNING query
The phantom-producer detector (#795) was doing UPDATE + SELECT in two
roundtrips — first incrementing consecutive_empty_runs, then re-
reading to check the stale threshold. Switch to UPDATE ... RETURNING
so the post-increment value comes back in one query.

Called once per schedule per cron tick. At 100 tenants × dozens of
schedules per tenant, the halved DB traffic on the empty-response
path is measurable, not just cosmetic.

Also now properly logs if the bump itself fails (previously it silent-
swallowed the ExecContext error and still ran the SELECT, which would
confuse debugging).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 03:44:48 -07:00
Hongming Wang
5a33784ea1 Merge pull request #989 from Molecule-AI/feat/canary-rollback-script
feat(canary): rollback script + release-pipeline doc (Phase 4)
2026-04-19 03:41:53 -07:00
Hongming Wang
296c52cb25 test(ws-server): cover CPProvisioner — auth, env fallback, error paths
Post-merge audit flagged cp_provisioner.go as the only new file from
the canary/C1 work without test coverage. Fills the gap:

- NewCPProvisioner_RequiresOrgID — self-hosted without MOLECULE_ORG_ID
  refuses to construct (avoids silent phone-home to prod CP).
- NewCPProvisioner_FallsBackToProvisionSharedSecret — the operator
  ergonomics of using one env-var name on both sides of the wire.
- AuthHeader noop + happy path — bearer only set when secret is set.
- Start_HappyPath — end-to-end POST to stubbed CP, bearer forwarded,
  instance_id parsed out of response.
- Start_Non201ReturnsStructuredError — when CP returns structured
  {"error":"…"}, that message surfaces to the caller.
- Start_NoStructuredErrorFallsBackToSize — regression gate for the
  anti-log-leak change from PR #980: raw upstream body must NOT
  appear in the error, only the byte count.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 03:41:16 -07:00
Hongming Wang
ee2cfe1148 Merge pull request #988 from Molecule-AI/feat/canary-gate-latest-tag
feat(canary): gate :latest tag promotion on canary verify green (Phase 3)
2026-04-19 03:38:22 -07:00
Hongming Wang
97dbe1c987 feat(canary): rollback-latest script + release-pipeline doc (Phase 4)
Closes the canary loop with the escape hatch and a single place to
read about the whole flow.

scripts/rollback-latest.sh <sha>
  uses crane to retag :latest ← :staging-<sha> for BOTH the platform
  and tenant images. Pre-checks the target tag exists and verifies
  the :latest digest after the move so a bad ops typo doesn't
  silently promote the wrong thing. Prod tenants auto-update to the
  rolled-back digest within their 5-min cycle. Exit codes: 0 = both
  retagged, 1 = registry/tag error, 2 = usage error.

docs/architecture/canary-release.md
  The one-page map of the pipeline: how PR → main → staging-<sha> →
  canary smoke → :latest promotion works end-to-end, how to add a
  canary tenant, how to roll back, and what this gate explicitly does
  NOT catch (prod-only data, config drift, cross-tenant bugs).

No code changes in the CP or workspace-server — this PR is shell
+ docs only, so it's safe to land independently of the other Phase
{1,1.5,2,3} PRs still in review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 03:37:42 -07:00
Hongming Wang
7e3a6fd756 feat(canary): gate :latest tag promotion on canary verify green (Phase 3)
Completes the canary release train. Before this, publish-workspace-
server-image.yml pushed both :staging-<sha> and :latest on every
main merge — meaning the prod tenant fleet auto-pulled every image
immediately, before any post-deploy smoke test. A broken image
(think: this morning's E2E current_task drift, but shipped at 3am
instead of caught in CI) would have fanned out to every running
tenant within 5 min.

Now:
- publish workflow pushes :staging-<sha> ONLY
- canary tenants are configured to track :staging-<sha>; they pick
  up the new image on their next auto-update cycle
- canary-verify.yml runs the smoke suite (Phase 2) after the sleep
- on green: a new promote-to-latest job uses crane to remotely
  retag :staging-<sha> → :latest for both platform and tenant images
- prod tenants auto-update to the newly-retagged :latest within
  their usual 5-min window
- on red: :latest stays frozen on prior good digest; prod is untouched

crane is pulled onto the runner (~4 MB, GitHub release) rather than
docker-daemon retag so the workflow doesn't need a privileged runner.

Rollback: if canary passed but something surfaces post-promotion,
operator runs "crane tag ghcr.io/molecule-ai/platform:<prior-good-sha>
latest" manually. A follow-up can wrap that in a Phase 4 admin
endpoint / script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 03:33:04 -07:00
Hongming Wang
fdcc715e97 Merge pull request #987 from Molecule-AI/feat/canary-smoke-harness
feat(canary): smoke harness + GHA verify workflow (Phase 2)
2026-04-19 03:31:22 -07:00
Hongming Wang
c1ac32365d feat(canary): smoke harness + GHA verification workflow (Phase 2)
Post-deploy verification for staging tenant images. Runs against the
canary fleet after each publish-workspace-server-image build — catches
auto-update breakage (a la today's E2E current_task drift) before it
propagates to the prod tenant fleet that auto-pulls :latest every 5 min.

scripts/canary-smoke.sh iterates a space-sep list of canary base URLs
(paired with their ADMIN_TOKENs) and checks:
- /admin/liveness reachable with admin bearer (tenant boot OK)
- /workspaces list responds (wsAuth + DB path OK)
- /memories/commit + /memories/search round-trip (encryption + scrubber)
- /events admin read (AdminAuth C4 path)
- /admin/liveness without bearer returns 401 (C4 fail-closed regression)

.github/workflows/canary-verify.yml runs after publish succeeds:
- 6-min sleep (tenant auto-updater pulls every 5 min)
- bash scripts/canary-smoke.sh with secrets pulled from repo settings
- on failure: writes a Step Summary flagging that :latest should be
  rolled back to prior known-good digest

Phase 3 follow-up will split the publish workflow so only
:staging-<sha> ships initially, and canary-verify's green gate is
what promotes :staging-<sha> → :latest. This commit lays the test
gate alone so we have something running against tenants immediately.

Secrets to set in GitHub repo settings before this workflow can run:
- CANARY_TENANT_URLS (space-sep list)
- CANARY_ADMIN_TOKENS (same order as URLs)
- CANARY_CP_SHARED_SECRET (matches staging CP PROVISION_SHARED_SECRET)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 03:30:19 -07:00
Hongming Wang
c40e7ba68c Merge pull request #986 from Molecule-AI/feat/tenant-cp-env-refresh
feat(ws-server): pull env from CP on startup
2026-04-19 03:27:14 -07:00
Hongming Wang
f09ff8dc4f Merge pull request #985 from Molecule-AI/docs/saas-migration-notes-prod
docs: 2026-04-19 SaaS prod migration notes
2026-04-19 03:27:12 -07:00
Hongming Wang
bc83f0ac23 Merge pull request #982 from Molecule-AI/fix/canvas-api-fetch-timeout
fix(canvas): add 15s fetch timeout on API calls
2026-04-19 03:27:09 -07:00
Hongming Wang
80caba97ee feat(ws-server): pull env from CP on startup
Paired with molecule-controlplane PR #55 (GET /cp/tenants/config). Lets
existing tenants heal themselves when we rotate or add a CP-side env
var (e.g. MOLECULE_CP_SHARED_SECRET landing earlier today) without any
ssh or re-provision.

Flow: main() calls refreshEnvFromCP() before any other os.Getenv read.
The helper reads MOLECULE_ORG_ID + ADMIN_TOKEN from the baked-in
user-data env, GETs {MOLECULE_CP_URL}/cp/tenants/config with those
credentials, and applies the returned string map via os.Setenv so
downstream code (CPProvisioner, etc.) sees the fresh values.

Best-effort semantics:
- self-hosted / no MOLECULE_ORG_ID → no-op (return nil)
- CP unreachable / non-200 → log + return error (main keeps booting)
- oversized values (>4 KiB each) rejected to avoid env pollution
- body read capped at 64 KiB

Once this image hits GHCR, the 5-minute tenant auto-updater picks it
up, the container restarts, refresh runs, and every tenant has
MOLECULE_CP_SHARED_SECRET within ~5 minutes — no operator toil.

Also fixes workspace-server/.gitignore so `server` no longer matches
the cmd/server package dir — it only ignored the compiled binary but
pattern was too broad. Anchored to `/server`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 02:41:15 -07:00
Hongming Wang
ca5a5f1a7f docs: 2026-04-19 SaaS prod migration notes
Captures the 10-PR staging→main cutover: what shipped, the three new
Railway prod env vars (PROVISION_SHARED_SECRET / EC2_VPC_ID /
CP_BASE_URL), and the sharp edge for existing tenants — their
containers pre-date PR #53 so they still need MOLECULE_CP_SHARED_SECRET
added manually (or a re-provision) before the new CPProvisioner's
outbound bearer works.

Also includes a post-deploy verification checklist and rollback plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 02:29:31 -07:00
Hongming Wang
5e1032289e Merge pull request #983 from Molecule-AI/staging
promote: staging → main (security hardening + Phase 35.1)
2026-04-19 02:28:05 -07:00
Hongming Wang
99417b7b20 Merge pull request #984 from Molecule-AI/fix/e2e-current-task-public-get
fix(e2e): stop asserting current_task on public workspace GET
2026-04-19 02:21:08 -07:00
Hongming Wang
f32196d351 fix(e2e): stop asserting current_task on public workspace GET (#966)
PR #966 intentionally stripped current_task, last_sample_error, and
workspace_dir from the public GET /workspaces/:id response to avoid
leaking task bodies to anyone with a workspace bearer. The E2E smoke
test hadn't caught up — it was still asserting "current_task":"..."
on the single-workspace GET, which made every post-#966 CI run fail
with '60 passed, 2 failed'.

Swap the per-workspace asserts to check active_tasks (still exposed,
canonical busy signal) and keep the list-endpoint check that proves
admin-auth'd callers still see current_task end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 02:19:15 -07:00
Hongming Wang
956c1fb9b9 fix(canvas): add 15s fetch timeout on API calls
Pre-launch audit flagged api.ts as missing a timeout on every fetch.
A slow or hung CP response would leave the UI spinning indefinitely
with no way for the user to abort — effectively a client-side DoS.

15s is long enough for real CP queries (slowest observed is Stripe
portal redirect at ~3s) and short enough that a stalled backend
surfaces as a clear error with a retry affordance.

Uses AbortSignal.timeout (widely supported since 2023) so the
abort propagates through React Query / SWR consumers cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 02:12:47 -07:00
Hongming Wang
896a34429a Merge pull request #981 from Molecule-AI/fix/security-tenant-cpprovisioner-bearer
fix(security): tenant CPProvisioner sends CP bearer on provision / stop / status
2026-04-19 01:55:20 -07:00
Hongming Wang
a79366a04a fix(security): tenant CPProvisioner attaches CP bearer on all calls
Completes the C1 integration (PR #50 on molecule-controlplane). The CP
now requires Authorization: Bearer <PROVISION_SHARED_SECRET> on all
three /cp/workspaces/* endpoints; without this change the tenant-side
Start/Stop/IsRunning calls would all 401 (or 404 when the CP's routes
refused to mount) and every workspace provision from a SaaS tenant
would silently fail.

Reads MOLECULE_CP_SHARED_SECRET, falling back to PROVISION_SHARED_SECRET
so operators can use one env-var name on both sides of the wire. Empty
value is a no-op: self-hosted deployments with no CP or a CP that
doesn't gate /cp/workspaces/* keep working as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:53:12 -07:00
Hongming Wang
c8c92ffe21 Merge pull request #980 from Molecule-AI/fix/security-log-scrubbing
fix(security): scrub workspace-server token + upstream error logs
2026-04-19 01:39:39 -07:00
Hongming Wang
365f13199e fix(security): scrub workspace-server token + upstream error logs
Two findings from the pre-launch log-scrub audit:

1. handlers/workspace_provision.go:548 logged `token[:8]` — the exact
   H1 pattern that panicked on short keys. Even with a length guard,
   leaking 8 chars of an auth token into centralized logs shortens the
   search space for anyone who gets log-read access. Now logs only
   `len(token)` as a liveness signal.

2. provisioner/cp_provisioner.go:101 fell back to logging the raw
   control-plane response body when the structured {"error":"..."}
   field was absent. If the CP ever echoed request headers (Authorization)
   or a portion of user-data back in an error path, the bearer token
   would end up in our tenant-instance logs. Now logs the byte count
   only; the structured error remains in place for the happy path.
   Also caps the read at 64 KiB via io.LimitReader to prevent
   log-flood DoS from a compromised upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:33:47 -07:00
Hongming Wang
a5d6e5319f Merge pull request #979 from Molecule-AI/fix/security-adminauth-c4
fix(security): C4 — close AdminAuth fail-open race on hosted-SaaS fresh install
2026-04-19 01:29:54 -07:00
Hongming Wang
bf08a7edd9 Merge pull request #978 from Molecule-AI/fix/security-discord-config-limitreader
fix(security): cap Discord webhook + config PATCH bodies (H3/H4)
2026-04-19 01:28:46 -07:00
Hongming Wang
481b5cfb1a fix(security): C4 — close AdminAuth fail-open race on hosted-SaaS fresh install
Pre-launch review blocker. AdminAuth's Tier-1 fail-open fired whenever
the workspace_auth_tokens table was empty — including the window between
a hosted tenant EC2 booting and the first workspace being created. In
that window, every admin-gated route (POST /org/import, POST /workspaces,
POST /bundles/import, etc.) was reachable without a bearer, letting an
attacker pre-empt the first real user by importing a hostile workspace
into a freshly provisioned instance.

Fix: fail-open is now ONLY applied when ADMIN_TOKEN is unset (self-
hosted dev with zero auth configured). Hosted SaaS always sets
ADMIN_TOKEN at provision time, so the branch never fires in prod and
requests with no bearer get 401 even before the first token is minted.

Tier-2 / Tier-3 paths unchanged.

The old TestAdminAuth_684_FailOpen_AdminTokenSet_NoGlobalTokens test
was codifying exactly this bug (asserting 200 on fresh install with
ADMIN_TOKEN set). Renamed and flipped to
TestAdminAuth_C4_AdminTokenSet_FreshInstall_FailsClosed asserting 401.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:28:13 -07:00
Hongming Wang
af9aae2c38 fix(security): cap webhook + config PATCH bodies (H3/H4)
Two HIGH-severity DoS surfaces: both handlers read the entire HTTP
body with io.ReadAll(r.Body) and no upper bound, so a caller streaming
a multi-gigabyte request could exhaust memory on the tenant instance
before we even validated the JSON.

H3 (Discord webhook): wrap Body in io.LimitReader with a 1 MiB cap.
Discord Interactions payloads are well under 10 KiB in practice.

H4 (workspace config PATCH): wrap Body in http.MaxBytesReader with a
256 KiB cap. Real configs are <10 KiB; jsonb handles the cap
comfortably. Returns 413 Request Entity Too Large on overflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 01:23:03 -07:00
Hongming Wang
61b5236aa1 Merge pull request #977 from Molecule-AI/feat/workspace-snapshot-scrubber-823
feat(workspace): snapshot secret scrubber (closes #823)
2026-04-19 00:33:14 -07:00
Hongming Wang
3976361483 feat(workspace): snapshot secret scrubber (closes #823)
Sub-issue of #799, security condition C4. Standalone module in
workspace/lib/snapshot_scrub.py with three public functions:

- scrub_content(str) → str: regex-based redaction of secret patterns
- is_sandbox_content(str) → bool: detect run_code tool output markers
- scrub_snapshot(dict) → dict: walk memories, scrub each, drop sandbox entries

Patterns covered: sk-ant-/sk-proj-, ghp_/ghs_/github_pat_, AKIA,
cfut_, mol_pk_, ctx7_, Bearer, env-var assignments, base64 blobs ≥33 chars.

21 unit tests, 100% coverage on new code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 00:32:42 -07:00
Hongming Wang
285b9d1fa3 Merge pull request #972 from Molecule-AI/chore/ci-action-versions
ci: update GitHub Actions to current stable versions (closes #780)
2026-04-19 00:31:17 -07:00
Hongming Wang
151e458c38 Merge pull request #975 from Molecule-AI/fix/hibernate-409-guard-active-tasks
feat(platform): 409 guard on /hibernate when active_tasks > 0 (closes #822)
2026-04-19 00:30:24 -07:00
Hongming Wang
4e7c4ceeb3 Merge pull request #976 from Molecule-AI/feat/last-outbound-at-817
feat(platform): track last_outbound_at for silent detection (closes #817)
2026-04-19 00:30:01 -07:00
Hongming Wang
68f55c3ebc Merge pull request #974 from Molecule-AI/fix/canvas-a11y-degraded-badge
fix(canvas): degraded badge WCAG AA contrast (closes #885 p1)
2026-04-19 00:28:39 -07:00
Hongming Wang
c0233317b8 Merge pull request #968 from Molecule-AI/fix/security-memory-delimiter-npm-pin
fix(security): GLOBAL memory delimiter spoofing + pin MCP version (closes #807, #805)
2026-04-19 00:28:08 -07:00
Hongming Wang
d183f89b94 Merge pull request #964 from Molecule-AI/feat/schema-migrations-tracking
feat(db): schema_migrations tracking — run each migration only once
2026-04-19 00:27:27 -07:00
Hongming Wang
50364817ac Merge pull request #967 from Molecule-AI/chore/shadcn-init
chore(canvas): initialize shadcn/ui CLI
2026-04-19 00:27:07 -07:00
Hongming Wang
6fb8472c26 Merge pull request #966 from Molecule-AI/fix/strip-current-task-public-get
fix(security): strip current_task from public GET response (closes #955)
2026-04-19 00:26:27 -07:00
Hongming Wang
ad77b84854 Merge pull request #973 from Molecule-AI/docs/rfc2119-opencode-must-not
docs(opencode): 'should not' → 'must not' for SAFE-T1201 (closes #861)
2026-04-19 00:26:05 -07:00
Hongming Wang
6efc355f2f Merge pull request #965 from Molecule-AI/fix/crlf-cron-prompts
fix(scheduler): strip CRLF from cron prompts (closes #958)
2026-04-19 00:25:14 -07:00
Hongming Wang
f96be032a1 Merge pull request #963 from Molecule-AI/chore/turbopack-dev
chore(canvas): enable Turbopack for dev server
2026-04-19 00:24:37 -07:00
Hongming Wang
9e17c86df1 Merge pull request #971 from Molecule-AI/chore/phase35-sg-lockdown-script
feat(security): Phase 35.1 — SG lockdown script for tenant EC2
2026-04-19 00:24:11 -07:00
Hongming Wang
c3eddd7950 Merge pull request #962 from Molecule-AI/chore/secret-scanner-mol-pk
chore: add mol_pk_ and cfut_ to pre-commit secret scanner
2026-04-19 00:22:44 -07:00
Hongming Wang
4e1a513160 feat(platform): track last_outbound_at for silent-workspace detection (closes #817)
Sub of #795 (phantom-busy post-mortem). Adds last_outbound_at TIMESTAMPTZ
column to workspaces. Bumped async on every successful outbound A2A call
from a real workspace (skip canvas + system callers). Exposed in
GET /workspaces/:id response as "last_outbound_at".

PM/Dev Lead orchestrators can now detect workspaces that have gone silent
despite being online (> 2h + active cron = phantom-busy warning).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 13:04:54 -07:00
Hongming Wang
37030c307d feat(platform): 409 guard on /hibernate when active_tasks > 0 (closes #822)
Phase 35.1 / #799 security condition C3 — prevents operator from
accidentally killing a mid-task agent.

Behavior:
- active_tasks == 0 → proceed as before
- active_tasks > 0 && ?force=true → log [WARN] + proceed
- active_tasks > 0 && no force → 409 with {error, active_tasks}

2 new tests: TestHibernateHandler_ActiveTasks_Returns409,
TestHibernateHandler_ActiveTasks_ForceTrue_Returns200.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:09:52 -07:00
Hongming Wang
89d96e8581 fix(canvas): degraded badge WCAG AA contrast — amber-400 → amber-300 (closes #885)
amber-400 on zinc-900 is 5.4:1 (AA pass). amber-300 is 6.9:1 (AA+AAA pass)
and matches the rest of the amber usage in WorkspaceNode (currentTask,
error detail, badge chip).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:05:38 -07:00
Hongming Wang
df632aeab5 docs(opencode): RFC 2119 — 'should not' → 'must not' for SAFE-T1201 warning (closes #861)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:04:49 -07:00
Hongming Wang
64796838e0 ci: update GitHub Actions to current stable versions (closes #780)
- golangci/golangci-lint-action@v4 → v9
- docker/setup-qemu-action@v3 → v4
- docker/setup-buildx-action@v3 → v4
- docker/build-push-action@v5 → v6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:04:10 -07:00
Hongming Wang
510083ccc5 feat(security): Phase 35.1 — SG lockdown script for tenant EC2 instances
Restricts tenant EC2 port 8080 ingress to Cloudflare IP ranges only,
blocking direct-IP access. Supports two modes:

1. Lock to CF IPs (Worker deployment): 14 IPv4 CIDR rules
2. Close ingress entirely (Tunnel deployment): removes 0.0.0.0/0 only

Usage:
  bash scripts/lockdown-tenant-sg.sh --sg-id sg-xxxxx
  bash scripts/lockdown-tenant-sg.sh --sg-id sg-xxxxx --close-ingress
  bash scripts/lockdown-tenant-sg.sh --sg-id sg-xxxxx --dry-run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 12:01:41 -07:00
Hongming Wang
4dfd7b969e test: GLOBAL memory delimiter spoofing escape + LOCAL scope untouched
- TestCommitMemory_GlobalScope_DelimiterSpoofingEscaped: verifies [MEMORY prefix
  is escaped to [_MEMORY before DB insert (SAFE-T1201, #807)
- TestCommitMemory_LocalScope_NoDelimiterEscape: LOCAL scope stored verbatim

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 11:54:52 -07:00