From 6002d36df0c2674d0b389f2c734c549cfd15034e Mon Sep 17 00:00:00 2001 From: "Molecule AI Dev Engineer A (Kimi)" Date: Tue, 2 Jun 2026 00:50:59 +0000 Subject: [PATCH] docs(architecture): mark drift risk #6 resolved and contract tests run without Skip (re-created from staging #2031) Updates backends.md last-audit date and drift-risk #6 status to RESOLVED, noting nil guards landed in Provisioner and CPProvisioner. Contract tests now run in CI without t.Skip. Re-created from molecule-core staging PR #2031 targeting main. Co-Authored-By: Claude Opus 4.7 --- docs/architecture/backends.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/architecture/backends.md b/docs/architecture/backends.md index d8042e4f5..fecf667d4 100644 --- a/docs/architecture/backends.md +++ b/docs/architecture/backends.md @@ -2,7 +2,7 @@ **Status:** living document — update when you ship a feature that touches one backend. **Owner:** workspace-server + controlplane teams. -**Last audit:** 2026-05-07 (plugin install/uninstall closed for EC2 backend via EIC SSH push to the bind-mounted `/configs/plugins//`, mirroring the Files API PR #1702 pattern). +**Last audit:** 2026-05-31 (Claude agent — drift risk #6 verified resolved; nil guards present, contract tests run without Skip). ## Why this exists @@ -93,12 +93,12 @@ For "do we have any backend?", use `HasProvisioner()`, never bare `h.provisioner 3. **Restart divergence on runtime changes.** Docker re-reads `/configs/config.yaml` from the container before stop, so a changed `runtime:` survives a restart even if the DB isn't synced. EC2 trusts the DB only. If you change the runtime via the Config tab and the handler races the restart, Docker will land on the new runtime, EC2 will land on the old one. **Fix path:** make the Config-tab save explicitly flush to DB before kicking off a restart, not deferred. 4. **Console-output asymmetry.** Users debugging a stuck workspace on Docker see `docker logs`; on EC2 they see `GetConsoleOutput`. The two outputs look nothing alike. **Fix path:** expose a unified `GET /workspaces/:id/boot-log` that proxies to whichever backend serves the data. Already partly there via `cp_provisioner.Console`. 5. **Template script drift.** `install.sh` and `start.sh` in each template repo do the same high-level work (install hermes-agent, write .env, write config.yaml, start gateway) but must be kept byte-level consistent on the provider-key forwarding block. Easy to forget. Enforced now by `tools/check-template-parity.sh` (see below) — run it in each template repo's CI. -6. **Both backends panic when underlying client is nil.** Discovered by the contract-test scaffold landing in this PR: `Provisioner.{Stop,IsRunning}` nil-dereferences the Docker client, and `CPProvisioner.{Stop,IsRunning}` nil-dereferences `httpClient`. The real code always sets these, so this is theoretical in prod — but it means the contract runner can't execute scenarios against zero-value backends. **Fix path:** guard each method with `if p.docker == nil { return false, errNoBackend }` (and equivalent for CP), then flip the `t.Skip` in the contract tests to `t.Run`. +6. ~~**Both backends panic when underlying client is nil.**~~ **RESOLVED** — nil guards landed in `Provisioner` (`Start`, `Stop`, `IsRunning`, `ExecRead`, `RemoveVolume`, `VolumeHasFile`, `WriteAuthTokenToVolume`) and `CPProvisioner` (`Stop`, `IsRunning`), all returning `ErrNoBackend`. Contract tests (`TestDockerBackend_Contract`, `TestCPProvisionerBackend_Contract`, `TestZeroValuedBackends_NoPanic`) run in CI without `t.Skip`. ## Enforcement - **`tools/check-template-parity.sh`** (this repo) — ensures `install.sh` and `start.sh` in a template repo forward identical sets of provider keys. Wire into each template repo's CI as `bash $MONOREPO/tools/check-template-parity.sh install.sh start.sh`. -- **Contract tests** (stub) — `workspace-server/internal/provisioner/backend_contract_test.go` defines the behaviors every `provisioner.Provisioner` implementation must satisfy. Fails compile when a method drifts between `Docker` and `CPProvisioner`. Scenario-level runs are `t.Skip`'d today pending drift risk #6 (see above) — compile-time assertions still catch method drift. +- **Contract tests** — `workspace-server/internal/provisioner/backend_contract_test.go` defines the behaviors every `provisioner.Provisioner` implementation must satisfy. Fails compile when a method drifts between `Docker` and `CPProvisioner`. Scenario-level runs (`TestDockerBackend_Contract`, `TestCPProvisionerBackend_Contract`, `TestZeroValuedBackends_NoPanic`) execute in CI — drift risk #6 resolved. - **Source-level dispatcher pins** — `workspace_provision_auto_test.go` enforces the SoT pattern documented above: - `TestNoCallSiteCallsDirectProvisionerExceptAuto` — no handler calls `.provisionWorkspace(` or `.provisionWorkspaceCP(` directly outside the dispatcher's allowlist. - `TestNoCallSiteCallsBareStop` — no handler calls `.provisioner.Stop(` or `.cpProv.Stop(` directly outside the dispatcher's allowlist (strips Go comments before substring match so archaeology in code comments doesn't trip the gate). -- 2.52.0