fix(rfc523): scope forbidden-env check to global_secrets (allow user-set workspace_secrets) #1622

Merged
core-devops merged 1 commits from fix/523-allow-user-set-workspace-secrets into main 2026-05-20 23:40:54 +00:00
Member

Summary

RFC#523 Layer 1 was over-firing on the legitimate user flow of pasting a personal scoped GitHub PAT into the canvas Secrets tab under GITHUB_TOKEN. The guardrail ran on the merged env-set from loadWorkspaceSecrets and could not distinguish:

  1. Operator-leak (token sourced from global_secrets — the RFC#523 §"Threat model" attack vector) — must BLOCK
  2. User-set (token sourced from workspace_secrets via authenticated canvas Secrets tab — the user's own scoped PAT for their own repos) — must ALLOW

Empirical: CTO 2026-05-20 hit {"rfc":"523","error":"provision aborted: env var \"GITHUB_TOKEN\" is operator-scope and must not reach tenant workspaces — remove it from workspace_secrets / global_secrets and retry"} after legitimately adding their own PAT.

RFC#523 intent (cited)

Issue molecule-ai/internal#523 §"Threat model":

Tenant workspaces (per-customer EC2 instances running user code…) MUST NOT receive operator-scope tokens. Specifically: GITEA_TOKEN, GITHUB_TOKEN, CP_ADMIN_API_TOKEN, INFISICAL_OPERATOR_TOKEN, RAILWAY_TOKEN, and similar operator-fleet scope creds.

User-scoped per-tenant PATs in workspace_secrets are out of scope. The original commit that landed Layer 1 (4f45d37 fix(secrets-seed): remove GITHUB_TOKEN from tenant seedAllowList) already eliminated the one upstream operator-bleed source — tenant_secrets_seed.go's seedAllowList propagating CP-env GITHUB_TOKEN into every fresh tenant's global_secrets.

Fix approach (Option C — schema-natural provenance split)

  • loadWorkspaceSecrets (workspace_provision.go:894) now returns a second value globalSecretKeys map[string]struct{} recording which keys originated from global_secrets. Workspace_secrets writes override and clear the flag (user re-set wins).
  • New helper findForbiddenTenantEnvKeysFromGlobals (workspace_provision_forbidden_env.go) restricts the deny-set scan to that provenance subset. Old name-only findForbiddenTenantEnvKeys is kept for the Layer 3 CI-lint surface + test-pin uses.
  • prepareProvisionContext (workspace_provision_shared.go:149) calls the provenance-aware helper. Error surface adds source: "global_secrets" to the structured Extra payload.

Why Option C over A/B

  • Option A (filter by secret.source field): no such field exists on the schema; would require a migration.
  • Option B (value-match against known operator-host token values): introduces a runtime dependency on the operator-host secret-store + a startup precache step. Brittle, network coupling.
  • Option C: the two DB tables ARE the provenance discriminant. Zero schema change. Single-function refactor.

Defense-in-depth NOT removed

  • provisioner.buildContainerEnv silent-strip (forensic #145) — still strips SCM-write tokens at container-env-build time regardless of provenance.
  • workspace/entrypoint.sh Layer 2 in-container env-grep — unchanged (see Open Question).
  • .gitea/workflows/lint-forbidden-env-keys.yml Layer 3 — unchanged.

Backwards-compat / migration

Behavior does change silently re-allow for any tenant who previously had GITHUB_TOKEN (or similar) in workspace_secrets and was being blocked. That is the desired outcome — they were victims of the over-fire. No migration step needed; the next provision attempt simply succeeds.

Tenants whose global_secrets table contains operator-scope tokens continue to be blocked. The follow-up audit step from issue internal#552 ("Audit existing tenants' global_secrets for stale GITHUB_TOKEN rows + DELETE") is unchanged in scope.

Tests added

In workspace_provision_forbidden_env_test.go:

  • TestFindForbiddenTenantEnvKeysFromGlobals_UserSetAllowed — user-set workspace_secrets w/ GITHUB_TOKEN → allowed
  • TestFindForbiddenTenantEnvKeysFromGlobals_OperatorLeakBlocked — global_secrets w/ GITHUB_TOKEN → blocked, names key
  • TestFindForbiddenTenantEnvKeysFromGlobals_UserOverrideOfGlobalAllowed — workspace row supersedes global → allowed
  • TestFindForbiddenTenantEnvKeysFromGlobals_MultipleOperatorLeaks — sorted slice contract
  • TestFindForbiddenTenantEnvKeysFromGlobals_EmptyInputs — nil/empty total

All 10 existing forbidden-env tests still pass. Full internal/handlers suite green (17s).

Open question for CTO

Should Layer 2 (workspace/entrypoint.sh FORBIDDEN_KEYS grep) also become provenance-aware? Right now it does a blanket env-grep at container init and exits 1 if GITHUB_TOKEN is set, regardless of how the env got there. Two possible directions:

  1. Leave as-is — Layer 2 stays maximally paranoid; users who want to push to github.com from inside the workspace use a different env-var name (GH_PAT_USER etc) or use the gh-token-refresh daemon path. Cleanest separation of concerns; but the lines 164-172 of entrypoint.sh contradictorily say "If GITHUB_TOKEN or GH_TOKEN is set, authenticate gh CLI" — code that can never execute under current Layer 2.
  2. Pass provenance into the container — workspace-server writes a /configs/.molecule-secret-provenance marker file; entrypoint.sh L2 skips keys flagged as user-origin. More moving parts.

This PR fixes the producer-side block, which is enough for workspace_secrets-sourced GITHUB_TOKEN to reach the container. Whether it then survives to molecule-runtime is a separate L2 question. Tracked as a follow-up only after CTO direction.

Should the deny-set expand? The current list catches Gitea/GitHub/CP-admin/Infisical/Railway/Hetzner. ANTHROPIC_API_KEY is intentionally NOT on the list (per-agent scope, used by every workspace). If we ever wire operator-fleet ANTHROPIC_API_KEY (e.g. for fleet metering), it should NOT bleed into tenants and would need its own deny-set member with a different var name to disambiguate from per-agent. Out of scope here.

Refs

  • molecule-ai/internal#523 — original RFC (Threat model cited above)
  • 4f45d37 — RFC#523 follow-up: removed GITHUB_TOKEN from tenant seedAllowList (eliminated upstream operator-bleed source)
  • task #146 (orchestrator tracker — CI-lint Layer 3 side)
  • feedback_upstream_docs_first_before_hypothesizing — pulled RFC body before fixing

Test plan

  • go build ./... clean
  • go vet ./... clean
  • go test ./internal/handlers/ -run 'TestFindForbiddenTenantEnvKeysFromGlobals' — 5/5 PASS
  • go test ./internal/handlers/ -run 'TestIsForbiddenTenantEnvKey|TestFindForbiddenTenantEnvKeys|TestFormatForbiddenTenantEnvError' — 10/10 PASS (5 old + 5 new)
  • go test ./internal/handlers/ -count=1 — full handlers suite 17s, all green
  • Staging E2E: provision with user-set workspace_secrets.GITHUB_TOKEN succeeds (Layer 1 passes)
  • Staging E2E: provision with global_secrets-injected GITHUB_TOKEN still aborts with RFC#523 message
  • Smoke: existing tenants with no GITHUB_TOKEN in either table continue to provision
## Summary RFC#523 Layer 1 was over-firing on the legitimate user flow of pasting a personal scoped GitHub PAT into the canvas Secrets tab under `GITHUB_TOKEN`. The guardrail ran on the merged env-set from `loadWorkspaceSecrets` and could not distinguish: 1. **Operator-leak** (token sourced from `global_secrets` — the RFC#523 §"Threat model" attack vector) — must BLOCK 2. **User-set** (token sourced from `workspace_secrets` via authenticated canvas Secrets tab — the user's own scoped PAT for their own repos) — must ALLOW Empirical: CTO 2026-05-20 hit `{"rfc":"523","error":"provision aborted: env var \"GITHUB_TOKEN\" is operator-scope and must not reach tenant workspaces — remove it from workspace_secrets / global_secrets and retry"}` after legitimately adding their own PAT. ## RFC#523 intent (cited) Issue [molecule-ai/internal#523](https://git.moleculesai.app/molecule-ai/internal/issues/523) §"Threat model": > Tenant workspaces (per-customer EC2 instances running user code…) MUST NOT receive operator-scope tokens. Specifically: `GITEA_TOKEN`, `GITHUB_TOKEN`, `CP_ADMIN_API_TOKEN`, `INFISICAL_OPERATOR_TOKEN`, `RAILWAY_TOKEN`, and similar **operator-fleet scope** creds. User-scoped per-tenant PATs in `workspace_secrets` are out of scope. The original commit that landed Layer 1 (4f45d37 `fix(secrets-seed): remove GITHUB_TOKEN from tenant seedAllowList`) already eliminated the one upstream operator-bleed source — `tenant_secrets_seed.go`'s `seedAllowList` propagating CP-env `GITHUB_TOKEN` into every fresh tenant's `global_secrets`. ## Fix approach (Option C — schema-natural provenance split) - `loadWorkspaceSecrets` (`workspace_provision.go:894`) now returns a second value `globalSecretKeys map[string]struct{}` recording which keys originated from `global_secrets`. Workspace_secrets writes override and clear the flag (user re-set wins). - New helper `findForbiddenTenantEnvKeysFromGlobals` (`workspace_provision_forbidden_env.go`) restricts the deny-set scan to that provenance subset. Old name-only `findForbiddenTenantEnvKeys` is kept for the Layer 3 CI-lint surface + test-pin uses. - `prepareProvisionContext` (`workspace_provision_shared.go:149`) calls the provenance-aware helper. Error surface adds `source: "global_secrets"` to the structured Extra payload. ## Why Option C over A/B - **Option A (filter by `secret.source` field)**: no such field exists on the schema; would require a migration. - **Option B (value-match against known operator-host token values)**: introduces a runtime dependency on the operator-host secret-store + a startup precache step. Brittle, network coupling. - **Option C**: the two DB tables ARE the provenance discriminant. Zero schema change. Single-function refactor. ## Defense-in-depth NOT removed - `provisioner.buildContainerEnv` silent-strip (forensic #145) — still strips SCM-write tokens at container-env-build time regardless of provenance. - `workspace/entrypoint.sh` Layer 2 in-container env-grep — unchanged (see Open Question). - `.gitea/workflows/lint-forbidden-env-keys.yml` Layer 3 — unchanged. ## Backwards-compat / migration Behavior **does change** silently re-allow for any tenant who previously had `GITHUB_TOKEN` (or similar) in `workspace_secrets` and was being blocked. That is the desired outcome — they were victims of the over-fire. No migration step needed; the next provision attempt simply succeeds. Tenants whose `global_secrets` table contains operator-scope tokens continue to be blocked. The follow-up audit step from issue internal#552 ("Audit existing tenants' global_secrets for stale GITHUB_TOKEN rows + DELETE") is unchanged in scope. ## Tests added In `workspace_provision_forbidden_env_test.go`: - `TestFindForbiddenTenantEnvKeysFromGlobals_UserSetAllowed` — user-set workspace_secrets w/ `GITHUB_TOKEN` → allowed - `TestFindForbiddenTenantEnvKeysFromGlobals_OperatorLeakBlocked` — global_secrets w/ `GITHUB_TOKEN` → blocked, names key - `TestFindForbiddenTenantEnvKeysFromGlobals_UserOverrideOfGlobalAllowed` — workspace row supersedes global → allowed - `TestFindForbiddenTenantEnvKeysFromGlobals_MultipleOperatorLeaks` — sorted slice contract - `TestFindForbiddenTenantEnvKeysFromGlobals_EmptyInputs` — nil/empty total All 10 existing forbidden-env tests still pass. Full `internal/handlers` suite green (17s). ## Open question for CTO **Should Layer 2 (`workspace/entrypoint.sh` `FORBIDDEN_KEYS` grep) also become provenance-aware?** Right now it does a blanket env-grep at container init and exits 1 if `GITHUB_TOKEN` is set, regardless of how the env got there. Two possible directions: 1. **Leave as-is** — Layer 2 stays maximally paranoid; users who want to push to github.com from inside the workspace use a different env-var name (`GH_PAT_USER` etc) or use the gh-token-refresh daemon path. Cleanest separation of concerns; but the lines 164-172 of entrypoint.sh contradictorily say "If GITHUB_TOKEN or GH_TOKEN is set, authenticate gh CLI" — code that can never execute under current Layer 2. 2. **Pass provenance into the container** — workspace-server writes a `/configs/.molecule-secret-provenance` marker file; entrypoint.sh L2 skips keys flagged as user-origin. More moving parts. This PR fixes the producer-side block, which is enough for `workspace_secrets`-sourced `GITHUB_TOKEN` to **reach the container**. Whether it then **survives** to molecule-runtime is a separate L2 question. Tracked as a follow-up only after CTO direction. **Should the deny-set expand?** The current list catches Gitea/GitHub/CP-admin/Infisical/Railway/Hetzner. `ANTHROPIC_API_KEY` is intentionally NOT on the list (per-agent scope, used by every workspace). If we ever wire operator-fleet `ANTHROPIC_API_KEY` (e.g. for fleet metering), it should NOT bleed into tenants and would need its own deny-set member with a different var name to disambiguate from per-agent. Out of scope here. ## Refs - molecule-ai/internal#523 — original RFC (Threat model cited above) - 4f45d37 — RFC#523 follow-up: removed GITHUB_TOKEN from tenant `seedAllowList` (eliminated upstream operator-bleed source) - task #146 (orchestrator tracker — CI-lint Layer 3 side) - `feedback_upstream_docs_first_before_hypothesizing` — pulled RFC body before fixing ## Test plan - [x] `go build ./...` clean - [x] `go vet ./...` clean - [x] `go test ./internal/handlers/ -run 'TestFindForbiddenTenantEnvKeysFromGlobals'` — 5/5 PASS - [x] `go test ./internal/handlers/ -run 'TestIsForbiddenTenantEnvKey|TestFindForbiddenTenantEnvKeys|TestFormatForbiddenTenantEnvError'` — 10/10 PASS (5 old + 5 new) - [x] `go test ./internal/handlers/ -count=1` — full handlers suite 17s, all green - [ ] Staging E2E: provision with user-set `workspace_secrets.GITHUB_TOKEN` succeeds (Layer 1 passes) - [ ] Staging E2E: provision with global_secrets-injected `GITHUB_TOKEN` still aborts with RFC#523 message - [ ] Smoke: existing tenants with no GITHUB_TOKEN in either table continue to provision
core-be added 1 commit 2026-05-20 22:23:10 +00:00
fix(rfc523): scope forbidden-env check to global_secrets (allow user-set workspace_secrets)
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 20s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
CI / Platform (Go) (pull_request) Successful in 5m45s
E2E Chat / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 3s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 32s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Failing after 3s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 6m48s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
CI / Python Lint & Test (pull_request) Successful in 7m3s
CI / all-required (pull_request) Successful in 5m10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m26s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Failing after 5m52s
063e1e2683
RFC#523 Layer 1 was over-firing: a user pasting their own scoped
GitHub PAT into the canvas Secrets tab under GITHUB_TOKEN landed in
the per-workspace `workspace_secrets` table, but the L1 guardrail
ran on the merged env-set from loadWorkspaceSecrets and refused the
provision with the operator-leak abort message — even though
RFC#523's literal threat model (issue molecule-ai/internal#523
§"Threat model") only names operator-scope tokens injected via
operator-side stores. CTO empirical 2026-05-20.

Fix (Option C — schema-natural provenance split):

- loadWorkspaceSecrets now returns a second value: the set of keys
  whose value came from `global_secrets` (the operator-controlled
  table). A subsequent workspace_secrets row of the same key drops
  the global-origin flag (user override wins).
- New helper findForbiddenTenantEnvKeysFromGlobals restricts the
  scan to that set; the old findForbiddenTenantEnvKeys (name-only)
  is kept for the Layer 3 CI-lint use case + test pins.
- prepareProvisionContext calls the provenance-aware helper.
  Error surface adds source="global_secrets" to the structured
  Extra payload so canvas Events can disambiguate.

Defense-in-depth NOT removed:
- provisioner.buildContainerEnv silent-strip (forensic #145) still
  applies at the container-env-build step.
- workspace/entrypoint.sh L2 in-container env-grep stays as-is
  (a separate concern — see the "Open question" in the PR body
  about whether L2 also needs provenance-awareness).

Regression tests (workspace_provision_forbidden_env_test.go):
- user-set workspace_secrets w/ GITHUB_TOKEN → allowed
- operator-leak global_secrets w/ GITHUB_TOKEN → blocked
- user-override of pre-existing global GITHUB_TOKEN → allowed
- multiple operator-leaks → sorted slice returned
- nil/empty inputs total

Refs:
- molecule-ai/internal#523 (original RFC, §"Threat model")
- 4f45d37 (RFC#523 follow-up: removed GITHUB_TOKEN from tenant
  seedAllowList — the one upstream operator-bleed source has
  already been eliminated)
- feedback_upstream_docs_first_before_hypothesizing
- task #146 (this orchestrator's tracker — the CI-lint side)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author
Member

Cross-links:

  • molecule-ai/internal#523 (RFC, original Threat model cited in PR body)
  • task #146 in orchestrator tracker (CI-lint Layer 3 side — this PR is the runtime-check side)
  • 4f45d37 — RFC#523 producer-side seedAllowList fix (already-landed; this PR completes the producer-side guardrail by removing the over-fire residual)
Cross-links: - molecule-ai/internal#523 (RFC, original Threat model cited in PR body) - task #146 in orchestrator tracker (CI-lint Layer 3 side — this PR is the runtime-check side) - 4f45d37 — RFC#523 producer-side seedAllowList fix (already-landed; this PR completes the producer-side guardrail by removing the over-fire residual)
core-qa approved these changes 2026-05-20 22:27:28 +00:00
core-qa left a comment
Member

core-qa lens — five-axis.

Correctness

Provenance check is binary by design: findForbiddenTenantEnvKeysFromGlobals iterates globalSecretKeys only, so workspace_secrets-only keys never enter the candidate set. The shared.go loadWorkspaceSecrets diff explicitly delete(globalKeys, k) on workspace_secrets read AFTER the global loop, so override-of-global silently demotes provenance. Loop order (globals first, then workspace) is the load-bearing invariant — order swap would break override. Pinned in code comment but not asserted by a test; minor.

Test coverage

All 5 new tests exercise the production helper directly. UserSetAllowed (empty globals), OperatorLeakBlocked (globals contains key), UserOverrideOfGlobalAllowed (globals lacks key per caller contract), MultipleOperatorLeaks (sorted), EmptyInputs (4 nil/empty permutations incl. workspace-only). Defensive present branch on line 1202 is not directly covered — contract says it never fires, fine to leave.

Edge: both-sources

User-override semantics match CTO intent — security-correct because the user re-authenticated via canvas Secrets tab. Comment at provision_shared.go:147-150 documents this. No finding.

Backwards-compat

Re-allows previously-blocked tenants whose canvas-pasted GITHUB_TOKEN tripped the blanket check (CTO empirical 2026-05-20). loadWorkspaceSecrets signature changed (2→3 returns) but only one caller — prepareProvisionContext — and it is updated in the same PR. No drift surface.

RFC#523 alignment

Threat model literal-text says operator-scope tokens via operator-side stores; the implementation now exactly matches that. Defense-in-depth preserved: forensic #145 silent-strip in buildContainerEnv unchanged, Layer 2 entrypoint.sh + Layer 3 CI lint unchanged. The name-only findForbiddenTenantEnvKeys is retained for the Layer 3 lint with a PROVENANCE NOTE doc-comment — no API churn.

APPROVED.

core-qa lens — five-axis. ## Correctness Provenance check is binary by design: `findForbiddenTenantEnvKeysFromGlobals` iterates `globalSecretKeys` only, so workspace_secrets-only keys never enter the candidate set. The shared.go loadWorkspaceSecrets diff explicitly `delete(globalKeys, k)` on workspace_secrets read AFTER the global loop, so override-of-global silently demotes provenance. Loop order (globals first, then workspace) is the load-bearing invariant — order swap would break override. Pinned in code comment but not asserted by a test; minor. ## Test coverage All 5 new tests exercise the production helper directly. UserSetAllowed (empty globals), OperatorLeakBlocked (globals contains key), UserOverrideOfGlobalAllowed (globals lacks key per caller contract), MultipleOperatorLeaks (sorted), EmptyInputs (4 nil/empty permutations incl. workspace-only). Defensive `present` branch on line 1202 is not directly covered — contract says it never fires, fine to leave. ## Edge: both-sources User-override semantics match CTO intent — security-correct because the user re-authenticated via canvas Secrets tab. Comment at provision_shared.go:147-150 documents this. No finding. ## Backwards-compat Re-allows previously-blocked tenants whose canvas-pasted GITHUB_TOKEN tripped the blanket check (CTO empirical 2026-05-20). loadWorkspaceSecrets signature changed (2→3 returns) but only one caller — prepareProvisionContext — and it is updated in the same PR. No drift surface. ## RFC#523 alignment Threat model literal-text says operator-scope tokens via operator-side stores; the implementation now exactly matches that. Defense-in-depth preserved: forensic #145 silent-strip in buildContainerEnv unchanged, Layer 2 entrypoint.sh + Layer 3 CI lint unchanged. The name-only `findForbiddenTenantEnvKeys` is retained for the Layer 3 lint with a PROVENANCE NOTE doc-comment — no API churn. APPROVED.
core-devops approved these changes 2026-05-20 22:27:50 +00:00
core-devops left a comment
Member

core-devops lens — five-axis.

Layer-2 entrypoint.sh interaction (CRITICAL but NOT blocking THIS PR)

Verified at PR HEAD: workspace/entrypoint.sh:34 has FORBIDDEN_KEYS including GITHUB_TOKEN and at line 57-61 exit 1 if any present — NAME-based, NO provenance awareness. So a user-set GITHUB_TOKEN that passes Layer 1 (after this PR) will still kill the container at boot via Layer 2, making the gh auth login codepath at line 164-172 genuinely unreachable. This PR is necessary-but-not-sufficient: it unblocks the legitimate provision flow at L1, but L2 needs a parallel fix (e.g. drop GITHUB_TOKEN from L2 FORBIDDEN_KEYS, or pass a provenance hint via env). Recommend FOLLOW-UP PR, NOT a hard block here — Layer 1 is the fail-closed-loud layer; Layer 2 is defense-in-depth that has always been more conservative. Tracked as task #381.

Deny-set expansion

Staying narrow is the right call. The set covers operator-fleet creds (SCM-write/CP-admin/Infisical/Railway/Hetzner/MOLECULE_OPERATOR_*). ANTHROPIC_API_KEY etc are correctly EXCLUDED because they are per-user/per-workspace by design. Comment at forbidden_env.go:1079-1082 documents the value-vs-name boundary. No finding.

DB query hot-path

findForbiddenTenantEnvKeysFromGlobals iterates globalSecretKeys (map keys only) — O(globals) not O(envVars). globalSecretKeys is built from one already-issued SELECT key,encrypted_value FROM global_secrets in loadWorkspaceSecrets; no new query. Even at 100-1000 global rows the map iteration is sub-millisecond. No finding.

Migration safety

Zero schema change. Pure handler-code edit. loadWorkspaceSecrets signature is package-private and has exactly one caller (verified via raw grep across workspace-server/internal/handlers/*.go). No backwards-compat surface.

Rollback

Clean single-PR revert: 3 source files + 1 test file all touched in this PR; no migrations, no DB writes, no infra changes. Reverting restores the original blanket check (over-fire returns but no new defect). Layer 2 entrypoint.sh + Layer 3 CI lint unchanged so the security floor is unaffected by either direction.

APPROVED.

core-devops lens — five-axis. ## Layer-2 entrypoint.sh interaction (CRITICAL but NOT blocking THIS PR) Verified at PR HEAD: workspace/entrypoint.sh:34 has `FORBIDDEN_KEYS` including `GITHUB_TOKEN` and at line 57-61 `exit 1` if any present — NAME-based, NO provenance awareness. So a user-set GITHUB_TOKEN that passes Layer 1 (after this PR) will still kill the container at boot via Layer 2, making the `gh auth login` codepath at line 164-172 genuinely unreachable. This PR is necessary-but-not-sufficient: it unblocks the legitimate provision flow at L1, but L2 needs a parallel fix (e.g. drop GITHUB_TOKEN from L2 FORBIDDEN_KEYS, or pass a provenance hint via env). Recommend FOLLOW-UP PR, NOT a hard block here — Layer 1 is the fail-closed-loud layer; Layer 2 is defense-in-depth that has always been more conservative. Tracked as task #381. ## Deny-set expansion Staying narrow is the right call. The set covers operator-fleet creds (SCM-write/CP-admin/Infisical/Railway/Hetzner/MOLECULE_OPERATOR_*). ANTHROPIC_API_KEY etc are correctly EXCLUDED because they are per-user/per-workspace by design. Comment at forbidden_env.go:1079-1082 documents the value-vs-name boundary. No finding. ## DB query hot-path `findForbiddenTenantEnvKeysFromGlobals` iterates `globalSecretKeys` (map keys only) — O(globals) not O(envVars). globalSecretKeys is built from one already-issued `SELECT key,encrypted_value FROM global_secrets` in loadWorkspaceSecrets; no new query. Even at 100-1000 global rows the map iteration is sub-millisecond. No finding. ## Migration safety Zero schema change. Pure handler-code edit. loadWorkspaceSecrets signature is package-private and has exactly one caller (verified via raw grep across workspace-server/internal/handlers/*.go). No backwards-compat surface. ## Rollback Clean single-PR revert: 3 source files + 1 test file all touched in this PR; no migrations, no DB writes, no infra changes. Reverting restores the original blanket check (over-fire returns but no new defect). Layer 2 entrypoint.sh + Layer 3 CI lint unchanged so the security floor is unaffected by either direction. APPROVED.
core-be force-pushed fix/523-allow-user-set-workspace-secrets from 063e1e2683 to 514341dd51 2026-05-20 23:02:57 +00:00 Compare
core-devops merged commit ee9dc5b9c5 into main 2026-05-20 23:40:54 +00:00
Sign in to join this conversation.
No Reviewers
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#1622