molecule-core/workspace-server/internal
Hongming Wang 6d7a7fc86f feat(org-import): make provision concurrency configurable via env
Org-import was hard-capped at 3 concurrent workspace provisions (#1084),
calibrated for Docker-mode workspaces where each provision was a
docker-run. Now that workspaces are EC2 instances, AWS RunInstances
parallelises happily and the artificial cap of 3 makes a 7-workspace
org-import take 3-4× longer than necessary (3 batches × ~70s/provision
≈ 4 min wall time when AWS could absorb all 7 in parallel for ~70s).

This PR makes the cap configurable via MOLECULE_PROVISION_CONCURRENCY:
  unset    → 3 (Docker-mode default, unchanged)
  "0"      → effectively unlimited (SaaS / EC2 backend; AWS rate-limit
             + vCPU quota are the real backpressure)
  N>0      → exactly N
  N<0      → fall back to default 3 + warning log
  garbage  → fall back to default 3 + warning log

The "0 = unlimited" mapping is the user-facing convention requested for
SaaS deployments — operators don't have to pick an arbitrary large
number. Implementation hands off 1<<20 internally so the channel-based
semaphore stays a no-op without infinite-buffer risk.

Test coverage (org_provision_concurrency_test.go, 6 cases / 15 subtests):
- unset → default
- "0" → large unlimited cap
- positive integer exact (1, 5, 10, 50)
- negative → default + warning
- non-numeric → default + warning
- whitespace-trimmed (" 7 " → 7)

Boot-time log line confirms the resolved cap so an operator can verify
their env is being honored without re-deploying.

Does NOT address the separate 600s "never registered" timeout the user
also reported during org-import — that's filed as molecule-core#2793
for proper investigation (parallel-provision contention, network
routing, register-retry budget, or container-start failure are all
candidates and need live SSM capture to bisect).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 16:32:56 -07:00
..
artifacts chore: sync staging to main — 1188 commits, 5 conflicts resolved (#1743) 2026-04-23 18:30:18 +00:00
buildinfo feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy 2026-04-30 10:55:08 -07:00
bundle refactor(workspace-status): typed constants + AST-based drift gate 2026-04-30 10:41:41 -07:00
channels feat(channels): first-class Lark/Feishu support via schema-driven config 2026-04-24 11:51:15 -07:00
crypto
db refactor(workspace-status): catch missed literal in workspace_bootstrap.go + add literal-drift gate 2026-04-30 10:51:01 -07:00
envx
events test(handlers): introduce events.EventEmitter interface (#1814 partial) 2026-04-26 09:05:52 -07:00
handlers feat(org-import): make provision concurrency configurable via env 2026-05-04 16:32:56 -07:00
imagewatch feat(workspace-server): GHCR digest watcher closes runtime CD chain (#2114) 2026-04-26 13:36:26 -07:00
memory Memory v2 wiring: replace decorative tests with real integration 2026-05-04 10:38:59 -07:00
metrics
middleware fix(tenant-guard): allowlist /buildinfo so redeploy verifier can reach it 2026-04-30 12:54:51 -07:00
models refactor(workspace-status): typed constants + AST-based drift gate 2026-04-30 10:41:41 -07:00
orgtoken fix: F1085 rm scope concat + GH#756 ValidateToken terminal guard + CI test fixes 2026-04-24 07:16:54 +00:00
plugins
provisioner feat(provisioner): digest-pin workspace images via runtime_image_pins (#2272 layer 1) 2026-05-03 02:30:00 -07:00
registry fix(orphan-sweeper): exclude runtime='external' from stale-token revoke 2026-05-03 00:49:37 -07:00
router Memory v2 fixup Critical: wire plugin from main.go (was fully dormant) 2026-05-04 10:22:30 -07:00
scheduler feat(runtime): native_scheduler skip — primitive #3 of 6 2026-04-26 22:47:00 -07:00
supervised
ws
wsauth perf(wsauth): in-process cache for platform_inbound_secret reads 2026-05-03 00:04:38 -07:00