Files
molecule-core/local-e2e/templates/session-continuity-e2e.yml
claude-ceo-assistant 59d699b61c
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 14s
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 9s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
gate-check-v3 / gate-check (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 7s
security-review / approved (pull_request) Failing after 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Platform (Go) (pull_request) Successful in 5m45s
CI / Python Lint & Test (pull_request) Successful in 7m0s
CI / Canvas (Next.js) (pull_request) Successful in 7m34s
CI / all-required (pull_request) Successful in 7m14s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
feat(local-e2e): session-continuity canary harness (task #342, RFC#600 gate)
Adds a self-contained docker-compose harness in local-e2e/ that gates
RFC#600-class template changes BEFORE customer canary. Implements the 4
canonical canaries:

  1. 2-turn name continuity   — SessionStore key derivation
  2. File-only message        — no caption drop-to-empty-prompt regress
  3. File + prompt (multimodal) — multimodal happy path
  4. Cross-session memory     — explicit memory tool, distinct context_ids

Architecture is deliberately lean per CTO "separate CI as possible":

  local-e2e/
    docker-compose.yml       # runtime + cp_sim ONLY (no platform Go, no pg)
    cp_sim/                  # ~250 LoC Python A2A wire-shape emitter
    cp_sim/canary/           # 4 canary scenarios + layer-isolation probes
    scripts/run-canary.sh    # one-shot orchestration (target <3 min)
    scripts/onboard-template.sh  # gitops helper for cascade
    templates/session-continuity-e2e.yml  # canonical workflow shim

Rationale for a Python tenant-CP simulator (not the real workspace-server):
SessionStore behaviour is fully owned by workspace/a2a_executor.py +
executor_helpers.py — the Go platform service doesn't touch session
continuity. Excising it gets the harness to <3 min cold-boot on
docker-host runners and keeps the surface small enough to debug fast.

The simulator emits the byte-identical JSON-RPC message/send envelope
that workspace-server POSTs (cross-checked against
tests/e2e/test_chat_attachments_e2e.sh and workspace/a2a_executor.py
:_core_execute).

Per feedback_no_single_source_of_truth: the harness IS the canonical
session-continuity validator across templates. Per-template unit tests
keep covering their own guard logic.

Per feedback_image_promote_is_not_user_live + feedback_verify_actual_
endstate_not_ack_follow_sop: every canary asserts at the running-
container layer; artifacts dump SessionStore state + runtime logs on
failure for post-mortem.

Rollout (deliberate sequencing, per task #342):
  1. THIS PR — lands harness in molecule-core. NOT yet wired to any
     template repo.
  2. Companion PR in molecule-ai-workspace-template-hermes — adds
     .gitea/workflows/session-continuity-e2e.yml. NOT required yet.
  3. Bake on hermes for ≥5 business days.
  4. Cascade to remaining 6 templates via onboard-template.sh.
  5. Per-template BP flip — add "session-continuity-e2e (pull_request)"
     to status_check_contexts on each repo, hermes first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 02:39:30 -07:00

86 lines
3.2 KiB
YAML

name: session-continuity-e2e
# Per-template wrapper for the molecule-core/local-e2e canary harness.
# DO NOT EDIT THIS FILE IN A TEMPLATE REPO — the canonical copy lives at
# molecule-ai/molecule-core:local-e2e/templates/session-continuity-e2e.yml
# (feedback_no_single_source_of_truth). The onboard-template.sh script
# copies it verbatim into each template; future fixes propagate via that
# helper, not by editing the template-side copy.
#
# What this workflow does:
# 1. Build THIS template's runtime image locally on the docker-host runner.
# 2. Clone molecule-core (canonical harness source).
# 3. Invoke local-e2e/scripts/run-canary.sh with TEMPLATE_IMAGE set to
# the just-built local image.
# 4. Upload artifacts/ on failure for post-mortem.
#
# Required-context flip:
# This workflow posts a status under the literal context name
# "session-continuity-e2e (pull_request)" — Gitea's standard
# <workflow-name> (<event>) format. To make it REQUIRED, add that
# exact string to the template repo's branch_protection
# status_check_contexts list. See README.md for the bake-period rule.
#
# Gitea 1.22.6 / act_runner notes (cross-refs to known footguns):
# - No cross-repo `uses:` (feedback_gitea_cross_repo_uses_blocked) —
# we clone molecule-core via plain git instead.
# - Per-SHA concurrency (feedback_concurrency_group_per_sha).
# - Workflow-level GITHUB_SERVER_URL pinned to the Gitea host
# (feedback_act_runner_github_server_url).
# - Runs on docker-host pool — NOT the heavy CI pool — per CTO
# directive "separate CI as possible" and the <3 min target.
on:
pull_request:
branches: [main]
push:
branches: [main]
concurrency:
group: session-continuity-e2e-${{ github.workflow }}-${{ github.event_name }}-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
session-continuity-e2e:
runs-on: docker-host
timeout-minutes: 8
steps:
- name: Checkout template
uses: actions/checkout@v4
with:
path: template
- name: Build template image
id: build
working-directory: template
run: |
IMAGE_TAG="local-e2e-${GITHUB_SHA::12}"
docker build -t "molecule-ai/template-under-test:${IMAGE_TAG}" .
echo "image=molecule-ai/template-under-test:${IMAGE_TAG}" >> "$GITHUB_OUTPUT"
- name: Clone harness from molecule-core
run: |
# Anonymous clone — molecule-core is internal-readable. NEVER bake
# an auth token into the URL (feedback_credentials_in_git_url).
git clone --depth 1 "${GITHUB_SERVER_URL}/molecule-ai/molecule-core.git" harness
- name: Run canary
env:
TEMPLATE_IMAGE: ${{ steps.build.outputs.image }}
CANARY_RUN_ID: ${{ github.run_id }}-${{ github.run_attempt }}
run: |
cd harness
./local-e2e/scripts/run-canary.sh
- name: Upload artifacts on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: session-continuity-canary-${{ github.run_id }}
path: harness/local-e2e/artifacts/
if-no-files-found: warn
retention-days: 7