feat(security): RFC#523 3-layer forbidden-env guardrail for tenant workspaces (task #146) #1555

Merged
hongming merged 1 commit from feat/146-forbidden-env-guard into main 2026-05-19 01:57:31 +00:00

Summary

Implements RFC#523 (internal#523) — task #146. Refuse to start a tenant workspace if any operator-fleet-scope env var name is present.

Threat model: a leaked GITEA_TOKEN / CP_ADMIN_API_TOKEN / RAILWAY_TOKEN / INFISICAL_OPERATOR_TOKEN / MOLECULE_OPERATOR_* in a tenant container would let a compromised agent escalate from "compromise of one workspace" to "compromise of the whole platform."

3-layer defense-in-depth

L1 — provisioner-side fail-closed abort (Go)

  • New workspace_provision_forbidden_env.go + hook in prepareProvisionContext
  • Runs immediately after loadWorkspaceSecrets, BEFORE per-agent applyAgentGitHTTPCreds (which legitimately sets a fallback GITEA_TOKEN)
  • Catches leaks from operator-controlled stores: global_secrets, workspace_secrets
  • Existing forensic #145 silent-strip in provisioner.buildContainerEnv stays as defense-in-depth
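
A minimal sketch of the L1 name check, for readers who want to see the shape of the logic. The helper names are inferred from the test names in the test plan and may not match the real signatures in workspace_provision_forbidden_env.go. The exact-match set below lists only the four names quoted in this description (the real deny set has 16 exact entries), and the sample keys in main are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// forbiddenExact is an illustrative subset of the deny set: only the four
// names quoted in the PR description. The real set is larger.
var forbiddenExact = map[string]struct{}{
	"GITEA_TOKEN":              {},
	"CP_ADMIN_API_TOKEN":       {},
	"RAILWAY_TOKEN":            {},
	"INFISICAL_OPERATOR_TOKEN": {},
}

// forbiddenPrefix blocks every key that starts with MOLECULE_OPERATOR_.
const forbiddenPrefix = "MOLECULE_OPERATOR_"

// isForbiddenTenantEnvKey reports whether a key NAME is operator-scoped.
// Values are never inspected.
func isForbiddenTenantEnvKey(key string) bool {
	if _, ok := forbiddenExact[key]; ok {
		return true
	}
	return strings.HasPrefix(key, forbiddenPrefix)
}

// findForbiddenTenantEnvKeys returns the offending names in sorted order,
// or nil when the env map is clean or empty.
func findForbiddenTenantEnvKeys(env map[string]string) []string {
	var found []string
	for k := range env {
		if isForbiddenTenantEnvKey(k) {
			found = append(found, k)
		}
	}
	sort.Strings(found)
	return found
}

func main() {
	// Hypothetical merged secrets, as they might look after loading.
	env := map[string]string{
		"WORKSPACE_ID":            "ws-1",
		"RAILWAY_TOKEN":           "x",
		"MOLECULE_OPERATOR_DEBUG": "1",
	}
	fmt.Println(findForbiddenTenantEnvKeys(env)) // [MOLECULE_OPERATOR_DEBUG RAILWAY_TOKEN]
}
```

Sorting the result keeps the error message stable across runs, since Go map iteration order is randomized.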

L2 — workspace/entrypoint.sh top-of-file env-grep

  • POSIX-portable (works on busybox/alpine/debian/macOS-test)
  • exit 1 with explicit error naming the offending keys
  • MOLECULE_TENANT_GUARD_DISABLE=1 escape hatch for local-dev (NEVER in tenant containers)
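
The real L2 guard is a POSIX shell snippet at the top of entrypoint.sh. The Go sketch below only mirrors its decision logic over KEY=VALUE entries of the kind os.Environ returns, so the three outcomes (clean, blocked, bypassed) are easy to see. The function name offendingKeys, the abbreviated deny set, and the message wording are assumptions, not the script's actual contents.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const (
	disableFlag     = "MOLECULE_TENANT_GUARD_DISABLE" // local-dev escape hatch
	forbiddenPrefix = "MOLECULE_OPERATOR_"
)

// forbiddenExact is an illustrative subset of the deny set.
var forbiddenExact = map[string]bool{
	"GITEA_TOKEN":              true,
	"CP_ADMIN_API_TOKEN":       true,
	"RAILWAY_TOKEN":            true,
	"INFISICAL_OPERATOR_TOKEN": true,
}

// offendingKeys returns the sorted forbidden key names found in environ.
// It returns nil when the environment is clean or when the disable flag
// is set to exactly "1".
func offendingKeys(environ []string) []string {
	var offenders []string
	for _, kv := range environ {
		key, value, _ := strings.Cut(kv, "=")
		if key == disableFlag && value == "1" {
			return nil
		}
		if forbiddenExact[key] || strings.HasPrefix(key, forbiddenPrefix) {
			offenders = append(offenders, key)
		}
	}
	sort.Strings(offenders)
	return offenders
}

func main() {
	// Three hypothetical environments: clean, blocked, and bypassed.
	samples := [][]string{
		{"HOME=/root", "PATH=/usr/bin"},
		{"HOME=/root", "GITEA_TOKEN=foo"},
		{"GITEA_TOKEN=foo", "MOLECULE_TENANT_GUARD_DISABLE=1"},
	}
	for _, environ := range samples {
		if bad := offendingKeys(environ); len(bad) > 0 {
			fmt.Printf("would exit 1: forbidden env keys present: %s\n", strings.Join(bad, ", "))
			continue
		}
		fmt.Println("would start normally")
	}
}
```

Only key names are compared, so the guard never needs to read or log a secret value.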

L3 — .gitea/workflows/lint-forbidden-env-keys.yml

  • Scans workspace-server/internal/**.go for new code hardcoding a forbidden env-var name
  • Exempts deny-set definitions + pre-existing persona-fallback paths (downstream silent-strip + new L1 already cover the runtime risk)
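
The actual L3 check is a grep-based workflow. As a rough illustration of the same idea in Go, the sketch below flags source lines that contain a quoted forbidden name or a quoted literal starting with the forbidden prefix, and skips exempt files. The exempt path, the abbreviated name list, and the quoted-literal heuristic are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// forbiddenNames is an illustrative subset of the deny set.
var forbiddenNames = []string{
	"GITEA_TOKEN",
	"CP_ADMIN_API_TOKEN",
	"RAILWAY_TOKEN",
	"INFISICAL_OPERATOR_TOKEN",
}

const forbiddenPrefix = "MOLECULE_OPERATOR_"

// exemptPaths is hypothetical; the real workflow keeps its own exempt list.
var exemptPaths = map[string]bool{
	"internal/example/deny_set.go": true,
}

// finding records one offending source line.
type finding struct {
	Path string
	Line int
	Text string
}

// mentionsForbidden reports whether a line hardcodes a forbidden name
// as a double-quoted string literal.
func mentionsForbidden(line string) bool {
	for _, name := range forbiddenNames {
		if strings.Contains(line, `"`+name+`"`) {
			return true
		}
	}
	return strings.Contains(line, `"`+forbiddenPrefix)
}

// lintSource scans one file's contents and returns its findings.
// Exempt files are skipped entirely.
func lintSource(path, src string) []finding {
	if exemptPaths[path] {
		return nil
	}
	var out []finding
	for i, line := range strings.Split(src, "\n") {
		if mentionsForbidden(line) {
			out = append(out, finding{Path: path, Line: i + 1, Text: strings.TrimSpace(line)})
		}
	}
	return out
}

func main() {
	src := "func f(envVars map[string]string) {\n\tenvVars[\"GITEA_TOKEN\"] = \"x\"\n}\n"
	for _, f := range lintSource("internal/handlers/example.go", src) {
		fmt.Printf("%s:%d: %s\n", f.Path, f.Line, f.Text)
	}
}
```

With the synthetic offender from the test plan as input, the sketch reports line 2 of the sample file.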

Open-source-template compatibility

Deny set lives in Go and YAML constants — NOT hardcoded in any open-source template's start.sh. Per memory feedback_open_source_templates_no_hardcoded_org_internals, templates published as separate repos (template-codex / template-hermes / template-openclaw) get their L2 in follow-up template PRs with a fork-friendly default deny set (no MOLECULE_-specific literal). The MOLECULE_OPERATOR_ prefix appears only in the internal claude-code template's entrypoint.sh here.

Test plan

  • [x] L1 unit tests (go test):
    • TestIsForbiddenTenantEnvKey_ExactMatches: 25 cases (all 16 forbidden + 9 allowed)
    • TestIsForbiddenTenantEnvKey_PrefixMatches: 8 cases (4 MOLECULE_OPERATOR_*, 4 adjacent-allowed)
    • TestFindForbiddenTenantEnvKeys_NoneAndEmpty
    • TestFindForbiddenTenantEnvKeys_SingleAndMultipleSorted
    • TestFormatForbiddenTenantEnvError_Phrasing: singular vs plural
  • [x] L2 shell test (workspace/tests/test_entrypoint_forbidden_env_guard.sh): 12 cases (clean / per-agent-vars-pass / 5 forbidden-blocks / 2 MOLECULE_OPERATOR_* blocks / 2 adjacent-pass / disable-flag-bypass)
  • [x] L3 verified locally: current tree passes; synthetic offender (envVars["GITEA_TOKEN"] = "x") is caught
  • [ ] CI green on this PR
  • [ ] (Follow-up E2E in staging per RFC#523 §Acceptance) provision with GITEA_TOKEN in payload → 4xx with forbidden_env_keys extra
  • [ ] (Follow-up E2E) docker run -e GITEA_TOKEN=foo <image> → exit 1 within 5s
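
The phrasing test in the list above checks singular versus plural wording. A sketch of what such a formatter could look like follows; the function name is inferred from the test name, and the message text is an assumption rather than the wording used in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// formatForbiddenTenantEnvError builds an error naming the offending keys.
// It returns nil for an empty list and switches between "key" and "keys"
// depending on how many names are reported. The wording is illustrative.
func formatForbiddenTenantEnvError(keys []string) error {
	if len(keys) == 0 {
		return nil
	}
	noun := "key"
	if len(keys) > 1 {
		noun = "keys"
	}
	return fmt.Errorf("refusing to provision tenant workspace: forbidden env %s present: %s",
		noun, strings.Join(keys, ", "))
}

func main() {
	fmt.Println(formatForbiddenTenantEnvError([]string{"GITEA_TOKEN"}))
	fmt.Println(formatForbiddenTenantEnvError([]string{"GITEA_TOKEN", "RAILWAY_TOKEN"}))
}
```

Naming the keys (never their values) in the error lets an operator find the leaking store without exposing the secret in logs.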

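Refs

  • RFC#523 / internal#523
  • Task #146
  • Memory feedback_passwords_in_chat_are_burned
  • Memory feedback_per_agent_gitea_identity_default
  • Memory feedback_open_source_templates_no_hardcoded_org_internals
  • Memory feedback_check_vendor_docs_and_actual_source_before_guess_api_shape (verified POSIX env semantics + Go os.Environ / map contract before writing)
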
Out of scope (per RFC#523)

  • Value-shaped (40-byte hex) leak detection — separate secret-scan workflow
  • Per-tenant Infisical scoping — already done (reference_prod_team_infisical_identities)
  • L2 add to external open-source templates (template-codex / template-hermes / template-openclaw) — follow-up PRs in those repos

🤖 Generated with Claude Code

core-security added 1 commit 2026-05-19 01:22:39 +00:00
feat(security): RFC#523 3-layer forbidden-env guardrail for tenant workspaces (task #146)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m14s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m18s
CI / Platform (Go) (pull_request) Successful in 5m6s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 1m7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
publish-runtime-autobump / pr-validate (pull_request) Successful in 27s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 5s
qa-review / approved (pull_request) Failing after 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 5s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m26s
CI / Canvas (Next.js) (pull_request) Successful in 6m10s
CI / Python Lint & Test (pull_request) Successful in 6m38s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m20s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m32s
E2E Chat / E2E Chat (pull_request) Failing after 5m29s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1m4s
CI / all-required (pull_request) emitter-null compensating success (feedback_gitea_emitter_null_state_blocks_merge); CI ran, state never persisted by Gitea 1.22.6 emitter
audit-force-merge / audit (pull_request) Successful in 4s
aabf933a5c
Refuse to start a tenant workspace if any operator-fleet-scope env var
name is present. Threat model: a leaked GITEA_TOKEN /
CP_ADMIN_API_TOKEN / RAILWAY_TOKEN / INFISICAL_OPERATOR_TOKEN /
MOLECULE_OPERATOR_* in a tenant container would let a compromised
agent escalate from "compromise of one workspace" to "compromise of
the whole platform."

3-layer defense-in-depth:

L1 — provisioner-side fail-closed abort (Go):
  workspace_provision_forbidden_env.go + prepareProvisionContext hook.
  Runs immediately after loadWorkspaceSecrets, BEFORE the per-agent
  persona GIT_HTTP_* injection that legitimately sets a fallback
  GITEA_TOKEN. Catches leaks from the operator-controlled stores
  (global_secrets, workspace_secrets). The existing forensic #145
  silent-strip guard in provisioner.buildContainerEnv stays as
  defense-in-depth.

L2 — workspace/entrypoint.sh top-of-file env-grep + exit 1:
  Fires if both upstream layers are bypassed (e.g. docker run -e
  GITEA_TOKEN=... standalone). MOLECULE_TENANT_GUARD_DISABLE=1
  bypass for local-dev. POSIX-portable (busybox/alpine/debian).

L3 — .gitea/workflows/lint-forbidden-env-keys.yml:
  Scans workspace-server/internal/**.go for new code that hardcodes a
  forbidden env-var name. Exempts the deny-set definitions + the
  pre-existing persona-fallback paths whose downstream silent-strip +
  new L1 fail-closed already cover the runtime risk.

Tests:
  - L1: TestIsForbiddenTenantEnvKey_ExactMatches,
        TestIsForbiddenTenantEnvKey_PrefixMatches,
        TestFindForbiddenTenantEnvKeys_NoneAndEmpty,
        TestFindForbiddenTenantEnvKeys_SingleAndMultipleSorted,
        TestFormatForbiddenTenantEnvError_Phrasing
  - L2: workspace/tests/test_entrypoint_forbidden_env_guard.sh
        (12 cases — clean/per-agent/each-forbidden/prefix/disable-flag)
  - L3: verified locally that current tree passes + synthetic offender
        is caught

Open-source-template-friendly: the deny set lives in Go and YAML
constants, not hardcoded in any open-source template's start.sh.
Per memory feedback_open_source_templates_no_hardcoded_org_internals,
templates published as separate repos (template-codex /
template-hermes / template-openclaw) get their L2 added in follow-up
template PRs with a fork-friendly default deny set (no
MOLECULE_-specific literal). The MOLECULE_OPERATOR_ prefix appears
only in the internal claude-code template's entrypoint.sh.

Refs:
  - RFC#523 (internal#523)
  - Task #146
  - memory feedback_passwords_in_chat_are_burned
  - memory feedback_per_agent_gitea_identity_default
  - memory feedback_open_source_templates_no_hardcoded_org_internals
  - memory feedback_check_vendor_docs_and_actual_source_before_guess_api_shape
    (POSIX env-set semantics verified via shell test; Go os.Environ /
    map[string]string contract verified via go test)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-be approved these changes 2026-05-19 01:53:39 +00:00
core-be left a comment

5-axis review on RFC#523 3-layer forbidden-env guardrail: correctness OK (L1 fail-closed at prepareProvisionContext, L2 entrypoint env-grep, L3 PR-time grep lint with exempt allowlist); readability OK (each layer's purpose + threat model documented inline); arch OK (drift detection via the test TestIsForbiddenTenantEnvKey_ExactMatches as SoT, layers cross-reference each other); security OK (operator-scope key NAMES blocked from tenant env writers, prefix scan covers MOLECULE_OPERATOR_*, exempt list is narrow + justified); perf OK (lint runs sub-second, L1 runs once per provision). APPROVED.

core-devops approved these changes 2026-05-19 01:53:40 +00:00
core-devops left a comment

DevOps review: workflow has no paths: filter (feedback_path_filtered_workflow_cant_be_required compliant for required-ability), exempt list is narrow with per-class justification, grep -F + grep -E prefix pass cover both shapes. EXEMPT_PATHS reviewer-signoff comment is enforceable. APPROVED.

hongming merged commit f5cc9493bb into main 2026-05-19 01:57:31 +00:00

Reference: molecule-ai/molecule-core#1555