Test-coverage work order: every 2026-06-10 pilot/outage finding gets unit + integration + e2e wired to CI (CTO directive) #2537

Open
opened 2026-06-10 14:47:41 +00:00 by core-devops · 0 comments

CTO directive (2026-06-10): all issues found during the platform-agent pilot + memory-write outage must be FULLY tested and wired to CI, including e2e. Matrix below — each row is a deliverable; fixes without their listed tests do not merge.

| Finding | Fix state | Required tests → CI home |
|---|---|---|
| Memory-write FK outage (namespace upsert) | core#2517 MERGED (unit) | **e2e**: handlers-postgres-integration boots the REAL memory-plugin sidecar against the testcontainer PG, POSTs /workspaces/:id/memories for a FRESH workspace (no namespace row), asserts 201 + row in memory_plugin.memory_records. This is the only layer that exercises the real FK. |
| Chat false-unreachable on >120s turns | core#2515 MERGED (vitest) | **e2e**: e2e-chat (Playwright) case: stub the send fetch to reject with DOMException TimeoutError → assert NO error banner + thinking indicator persists; reject with ECONNREFUSED → banner shown. |
| Concierge MCP declaration (fragment + mode env + auth env) | core#2522 MERGED (unit), template#107 MERGED, template#108 OPEN | **build-gate e2e** is in #108 (management serverInfo smoke). ADD: E2E Staging Concierge Platform Agent job asserts the provisioned concierge /configs contains mcp_servers.yaml AND a forced tool-call returns TOOLS-OK (the pilot check, automated). |
| delivery_mode=poll starvation + sticky stale poll (core#2530) | OPEN | fix = register self-heals stale poll for push-capable runtimes + idle drain; **tests**: registry unit (resolveDeliveryMode self-heal), handlers-pg integration (queued msg → drains on heartbeat), e2e-chat case (message to fresh concierge gets a reply, no queue-forever). |
| register-401 workspace shows online (core#2530) | OPEN | fix = persistent register-401 ⇒ degraded; **tests**: registry unit + integration (heartbeat with no live token after N 401s flips status), canvas unit for degraded hint. |
| Provision races SecretsManager deletion window (cp#691) | OPEN | fix = RestoreSecret-or-recreate; **e2e**: provisioner-localstack: create secret, schedule deletion, provision → asserts restore+put path (localstack supports it). |
| Admin workspace-env endpoint silent no-op (cp#696) | OPEN | fix = container re-create w/ merged env + VERIFY printenv before ok:true; **tests**: script-gen unit (env merge, last-wins) + handler unit (verification failure ⇒ ok:false); staging_e2e assertion if cheap. |
| Pin allowlist / concierge naming / redeploy org-name | cp#689 OPEN (unit tests included) | merge as-is (unit-covered); ADD integration: redeploy script golden-render test already in PR; staging_e2e redeploy asserts MOLECULE_ORG_NAME lands (optional follow-up). |
| Abandoned-tenant fleet discovery | cp#685 MERGED (sqlmock unit) | ADD real-DB integration (internal/integration): seed abandoned+suspended+running orgs, assert resolveFleetSlugs set. |
| Management tools can't pass X-Confirm-Name (mcp-server#58) | OPEN | fix = confirm_name arg; **tests**: unit (header mapping) + over-the-wire integration (#34 harness): gated delete with/without confirm. |
| MOL_PACKAGE_TOKEN dead + publish 401 | publish done manually | ops: rotate secret (publish-runtime-bot needs org package-write); ADD a publish-dry-run CI step (npm publish --dry-run with the secret) so a dead token fails BEFORE a release tag. |
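
For the cp#696 row, the two unit-test targets (env merge with last-wins, and verification failure ⇒ ok:false) could look roughly like the sketch below. Every name here (`mergeEnv`, `parsePrintenv`, `verifyEnv`, `ApplyResult`) is hypothetical and not taken from the cp#696 change; the sketch also assumes `printenv` output is one `KEY=VALUE` pair per line with no multi-line values.

```typescript
// Hypothetical helpers illustrating the cp#696 test targets; not the real handler code.
type Env = Record<string, string>;

// Merge env layers left to right; a later layer overrides an earlier one (last-wins).
function mergeEnv(...layers: Env[]): Env {
  const merged: Env = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      merged[key] = value;
    }
  }
  return merged;
}

// Parse `printenv` output into a map. Assumes one KEY=VALUE pair per line;
// only the first "=" separates key from value, so values may contain "=".
function parsePrintenv(output: string): Env {
  const env: Env = {};
  for (const line of output.split("\n")) {
    const idx = line.indexOf("=");
    if (idx > 0) {
      env[line.slice(0, idx)] = line.slice(idx + 1);
    }
  }
  return env;
}

interface ApplyResult {
  ok: boolean;
  mismatched: string[]; // requested keys that are missing or carry a different value
}

// Report ok:true only when every requested key is present with the requested value.
function verifyEnv(requested: Env, printenvOutput: string): ApplyResult {
  const actual = parsePrintenv(printenvOutput);
  const mismatched = Object.keys(requested)
    .filter((key) => actual[key] !== requested[key])
    .sort();
  return { ok: mismatched.length === 0, mismatched };
}
```

A handler built this way would call `verifyEnv` after the container re-create and return its `ok` value, so a silent no-op surfaces as `ok:false` with the offending keys listed.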

Process: per-repo PRs through the normal gate (2 approvals, merge commits). E2E jobs must be required contexts where the suite already is; new suites start advisory for 48h then flip required.
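
The advisory-then-required rule can be written down as a small decision function, which may help when scripting the branch-protection flip. This is a sketch under assumptions: the `Suite` shape and `gateMode` name are hypothetical, an existing suite simply keeps its current status, and a new suite is treated as required once exactly 48 hours have elapsed.

```typescript
// Hypothetical model of the gating rule; field names are illustrative only.
type Suite =
  | { kind: "existing"; alreadyRequired: boolean }
  | { kind: "new"; addedAt: Date };

type GateMode = "required" | "advisory";

const ADVISORY_WINDOW_MS = 48 * 60 * 60 * 1000; // 48h advisory period for new suites

function gateMode(suite: Suite, now: Date): GateMode {
  if (suite.kind === "existing") {
    // Assumption: jobs added to an existing suite inherit that suite's status.
    return suite.alreadyRequired ? "required" : "advisory";
  }
  // New suites start advisory and flip to required once the window has elapsed.
  const ageMs = now.getTime() - suite.addedAt.getTime();
  return ageMs >= ADVISORY_WINDOW_MS ? "required" : "advisory";
}
```

For example, a suite added at 2026-06-10T00:00:00Z would be advisory on 2026-06-11 and required from 2026-06-12T00:00:00Z onward.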

🤖 Generated with Claude Code

Reference: molecule-ai/molecule-core#2537