test(e2e): template-asset delivery gate — fresh seo-agent must boot WITH skills (RFC #2843) #2971

Merged
core-devops merged 1 commit from feat/template-delivery-e2e-gate into main 2026-06-16 01:07:52 +00:00

RFC #2843 regression gate — fresh seo-agent must boot WITH its skills

Nothing in CI provisions a fresh agent end-to-end and inspects the delivered /configs. That blind spot is exactly why the 2026-06-15 incident family shipped green:

  1. concierge booted with no model (MISSING_MODEL fail-closed) — fixed in #2966
  2. concierge booted with no identity (generic Claude Code) — fix in #2955
  3. seo-agent booted with config+prompts but no agent-skills/, tracked in #32 (the seo-all skill pack, the whole point of the agent, is missing)

I confirmed incident 3 empirically: a fresh seo-agent provisioned on the current image gets config.yaml (8429 B, real) + prompts/seo-agent.md + model, but /configs/agent-skills/ is empty.

What this adds

tests/e2e/test_template_delivery_e2e.sh — provisions a throwaway tenant + a fresh seo-agent template workspace, then asserts delivery, one assertion per incident:

| | assertion | catches |
|---|---|---|
| A | seo-agent reaches `online` | MISSING_MODEL fail-closed |
| B | `GET /model` == declared model | model drop |
| C | `config.yaml` delivered + real (>1 KiB, not the 218 B stub) | config drop |
| D | `prompts/` delivered | identity drop |
| E | `agent-skills/seo-all/SKILL.md` delivered | the #32 skill drop |
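
To make assertions C, D and E concrete, here is a minimal sketch of how the file-level checks could look. It is not the code from `test_template_delivery_e2e.sh`: the function names, the `MIN_CONFIG_BYTES` variable, and the idea of checking a local copy of the delivered `/configs` tree are all assumptions made for illustration. Only the thresholds and paths (>1 KiB, `prompts/`, `agent-skills/seo-all/SKILL.md`) come from the table above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helpers: names and layout are assumptions, not the PR's script.
# Each function takes a directory assumed to hold a local copy of the
# workspace's delivered /configs tree.

MIN_CONFIG_BYTES=1024   # ">1 KiB"; the 218 B stub is far below this

# Assertion C: config.yaml exists and is larger than the stub.
assert_real_config() {
  local dir="$1" file size
  file="$dir/config.yaml"
  [ -f "$file" ] || { echo "FAIL C: config.yaml missing" >&2; return 1; }
  size=$(wc -c < "$file" | tr -d '[:space:]')
  if [ "$size" -le "$MIN_CONFIG_BYTES" ]; then
    echo "FAIL C: config.yaml is ${size} B (stub-sized)" >&2
    return 1
  fi
}

# Assertion D: prompts/ exists and holds at least one regular file.
assert_prompts_delivered() {
  local dir="$1"
  [ -d "$dir/prompts" ] || { echo "FAIL D: prompts/ missing" >&2; return 1; }
  if [ -z "$(find "$dir/prompts" -type f | head -n 1)" ]; then
    echo "FAIL D: prompts/ is empty" >&2
    return 1
  fi
}

# Assertion E: the seo-all skill file exists. Requiring it to be non-empty
# (-s) is an extra assumption beyond "delivered".
assert_skill_delivered() {
  local dir="$1"
  if [ ! -s "$dir/agent-skills/seo-all/SKILL.md" ]; then
    echo "FAIL E: agent-skills/seo-all/SKILL.md missing or empty" >&2
    return 1
  fi
}
```

Keeping one function per incident means a red run names the exact delivery channel that dropped its asset, rather than reporting a single opaque failure.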

Teardown trap deletes the org even on failure. Mirrors the auth/provision/teardown shape of test_staging_concierge_e2e.sh.
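
The teardown guarantee can be sketched with a bash `EXIT` trap. This is an illustration under stated assumptions, not the script's actual code: `delete_org` is a hypothetical stand-in that only prints (the real script would call whatever org-deletion endpoint the platform exposes), and `run_with_teardown` is an invented wrapper name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in: a real script would call the platform's
# org-deletion API here. This sketch only records the call.
delete_org() {
  echo "deleting org $1"
}

# Run a command with guaranteed teardown. The subshell scopes the EXIT
# trap, so delete_org fires whether the command succeeds, fails, or
# exits early, and the command's exit status is passed through.
run_with_teardown() {
  local org_id="$1"
  shift
  (
    trap 'delete_org "$org_id"' EXIT
    "$@"
  )
}
```

For example, `run_with_teardown org-123 false` still prints `deleting org org-123` and returns a non-zero status, which is the behaviour the sentence above describes: a failed assertion does not leave the throwaway tenant behind.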

Staged rollout (intentional — do not flip to required yet)

  • Phase 1 (this PR): advisory (continue-on-error), path-filtered to delivery code. It will report RED while #2955 and #32 are open — that's the point: it proves the gate correctly detects the bug.
  • Phase 2 (after #2955 + #32 land and this is green ×2): drop continue-on-error and add to branch-protection required_status_checks → merge-blocking.

🤖 Generated with Claude Code

core-devops added 1 commit 2026-06-15 22:17:38 +00:00
test(e2e): template-asset delivery gate — fresh seo-agent must boot WITH skills (RFC #2843 regression gate)
E2E Peer Visibility (literal MCP list_peers) / detect-changes (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 16s
Lint publish-runner timeout-minutes / Lint publish-runner timeout-minutes (pull_request) Successful in 15s
CI / Platform (Go) (pull_request) Successful in 2s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
qa-review / approved (pull_request_target) Failing after 7s
lint-no-coe-on-required / lint-no-coe-on-required (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 21s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 20s
CI / Canvas (Next.js) (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 8s
sop-checklist / all-items-acked (pull_request_target) Successful in 8s
reserved-path-review / reserved-path-review (pull_request_target) Failing after 9s
lint-setup-go-cache / lint-setup-go-cache (pull_request) Successful in 17s
CI / Canvas Deploy Status (pull_request) Successful in 2s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 18s
gate-check-v3 / gate-check (pull_request_target) Successful in 14s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
PR Diff Guard / PR diff guard (pull_request) Successful in 19s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 29s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Failing after 27s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Successful in 33s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 45s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 46s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1m54s
CI / all-required (pull_request) Successful in 3s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Successful in 2m3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m30s
template-delivery-e2e / Template-asset delivery (fresh seo-agent boots WITH skills) (pull_request) Failing after 4m37s
audit-force-merge / audit (pull_request_target) Successful in 9s
sop-checklist / all-items-acked (pull_request) Compensated by status-reaper (non-required pull_request/pull_request_review governance shadow overridden by successful pull_request_target status; see .gitea/scripts/status-reaper.py)
6c0a373572
Nothing in CI provisions a fresh agent end-to-end and inspects the DELIVERED
/configs — which is exactly why the 2026-06-15 incident family shipped green:
  1. concierge with no MODEL (MISSING_MODEL fail-closed)  — fixed #2966
  2. concierge with no IDENTITY (generic Claude Code)     — fix #2955
  3. seo-agent with config+prompts but NO agent-skills/   — #32

This adds tests/e2e/test_template_delivery_e2e.sh: provision a throwaway tenant
+ a fresh seo-agent template workspace, then assert the template-asset channel
actually delivered, with one assertion per incident:
  A. seo-agent reaches online                    (catches MISSING_MODEL)
  B. GET /model == declared model
  C. config.yaml delivered + REAL (>1KiB, not the 218B default stub)
  D. prompts/ delivered (identity prompt present)
  E. agent-skills/seo-all/SKILL.md delivered     (catches the #32 skill drop)
Teardown trap deletes the org even on failure.

Workflow template-delivery-e2e.yml is path-filtered (delivery code only) and
STAGED:
  Phase 1 (this PR): advisory (continue-on-error) — proves the gate correctly
          detects the open delivery bugs by going RED.
  Phase 2 (after #2955 + #32 land + green x2): drop continue-on-error and add
          to branch-protection required checks → merge-blocking.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
core-devops merged commit 2e31a8fc9d into main 2026-06-16 01:07:52 +00:00
Reference: molecule-ai/molecule-core#2971