Compare commits

..

2 Commits

Author SHA1 Message Date
hongming 013c8cfe58 review(docs): #1735 patch API spec lines describing awareness_namespace
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 10s
Check migration collisions / Migration version collision check (pull_request) Successful in 17s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 55s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 4s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 27s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 56s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m3s
gate-check-v3 / gate-check (pull_request) Failing after 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m37s
Harness Replays / Harness Replays (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m9s
CI / Platform (Go) (pull_request) Successful in 5m10s
CI / all-required (pull_request) Successful in 38m28s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
#1737 deletes the awareness_namespace column from `workspaces` and the
provisioner env injection that backed it. Two API-spec lines still
describe the field as if it were live, which would mislead any
external integrator reading the docs against a post-#1737 backend:

- docs/api-reference.md:106 listed `awareness_namespace` as a column
  on the `workspaces` row.
- docs/api-protocol/platform-api.md:93 claimed workspace creation
  "assigns an awareness_namespace on the workspace row" and the
  namespace is "later injected into the provisioned runtime".

Both are factually wrong post-#1737. Patched.

The broader docs sweep (~30 more references across
`docs/architecture/memory.md`, `docs/agent-runtime/*`, READMEs,
superpowers plans, etc.) is tracked in #1753 as a docs-only
follow-up — those are narrative / handbook copy, not contract.

Refs: #1735 (RFC), #1737 (the backend removal PR this rebases on),
#1753 (docs sweep follow-up), review-finding C3 (API doc contradiction).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:57:14 -07:00
hongming d20147300b chore(workspace-server): #1735 remove unused Awareness namespace surface
Drop the entire `Awareness namespace` memory-routing plumbing. The
feature was wired through migrations, models, the provisioner, and
several handlers but was never enabled in any environment — the
2026-05-23 sweep against the Railway controlplane (project
molecule-platform, prod env 59227671-…, staging env 639539ec-…) found
zero `AWARENESS_*` env vars set, and the operator-host bootstrap files
(`/etc/molecule-bootstrap/all-credentials.env`, `secrets.env`,
`agent-secrets.env`) likewise had no awareness entries. The provisioner
already only injected the env vars when both URL and namespace were
non-empty (so workspace containers also never received them).

Scope:
- Drop `workspaces.awareness_namespace` column via new forward migration
  `20260523130000_drop_workspaces_awareness_namespace.up.sql` (mirrors
  the recent `drop_runtime_image_pins` shape; paired down migration
  restores migration 010 verbatim).
- Drop `Workspace.AwarenessNamespace` from the model.
- Drop `WorkspaceConfig.AwarenessURL` + `AwarenessNamespace` and the
  conditional env injection in `internal/provisioner/provisioner.go`.
- Drop `workspaceAwarenessNamespace` + `loadAwarenessNamespace`, rename
  the one-line helper to `workspaceMemoryNamespace` (still produces the
  canonical `workspace:<id>` string matching the v2 namespace resolver
  at `internal/memory/namespace/resolver.go:186`).
- `seedInitialMemories` drops its `awarenessNamespace` parameter and
  computes the namespace inline — the parameter was always
  `workspace:<workspaceID>` at every call site, so the value is a pure
  function of the workspace id.
- Update three INSERT call sites (`workspace.go`, `org_import.go`) and
  the org-import root-memory seed in `org.go`.
- Trim `awareness_namespace` from the create-handler response payload.
- Remove ~22 awareness-specific test assertions and SQL-mock arg
  placeholders across `handlers_test.go`, `handlers_additional_test.go`,
  `workspace_test.go`, `workspace_provision_test.go`,
  `workspace_compute_test.go`, `workspace_budget_test.go`,
  `workspace_create_name_integration_test.go`, and
  `provisioner_test.go`.

The `agent_memories.namespace` column (added by migration 017) is
unaffected — seedInitialMemories continues to write `workspace:<id>`
into it, just computed inline now.

Canvas-side cleanup (the `<iframe>` block in `MemoryTab.tsx` reading
`NEXT_PUBLIC_AWARENESS_URL`) is deliberately deferred — it overlaps
with the rewrite landing in #1734 and goes in after that to avoid the
merge conflict.

Verification:
- `go vet ./...` clean.
- `go test -short -count=1 ./...` green (~30 packages).
- Migration round-trip verified on a throwaway `postgres:16-alpine`
  container: up drops the column, down restores it to the same
  `text` type as migration 010.

Refs: #1735 (this issue), #1733 (memory SSOT consolidation), #1734
(canvas Memory tab bug).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 16:55:10 -07:00
554 changed files with 9986 additions and 33891 deletions
+4 -1
View File
@@ -50,8 +50,11 @@ MOLECULE_ENV=development # Environment label (development/
# Container/runtime detection
# MOLECULE_IN_DOCKER= # Set when running the platform inside Docker (accepts 1/0, true/false). Triggers A2A proxy to rewrite 127.0.0.1:<port> agent URLs to Docker bridge hostnames. Auto-detected via /.dockerenv; only set if detection fails or to force off.
# Observability (Awareness)
# AWARENESS_URL= # If set, injected into workspace containers along with a deterministic AWARENESS_NAMESPACE derived from workspace ID. Enables the cross-session memory MCP server.
# GitHub
# GITHUB_REPO=owner/repo # Target repo for agent initial_prompt clone (e.g. Molecule-AI/molecule-core). Read inside workspace containers.
# GITHUB_REPO=owner/repo # Target repo for agent initial_prompt clone (e.g. Molecule-AI/molecule-monorepo). Read inside workspace containers.
# GITHUB_TOKEN= # Personal access token / installation token used by agents that clone private repos. Register as a global secret via POST /admin/secrets for propagation to workspace env. Token is used in-URL during clone and then scrubbed from .git/config via `git remote set-url`.
# Webhooks
+12 -28
View File
@@ -18,24 +18,15 @@
# per §SOP-6 security model). No-op when merged=false.
#
# Required env (set by the workflow):
# GITEA_TOKEN, GITEA_HOST, REPO, PR_NUMBER
# plus one of REQUIRED_CHECKS_JSON (preferred) or REQUIRED_CHECKS (legacy)
# GITEA_TOKEN, GITEA_HOST, REPO, PR_NUMBER, REQUIRED_CHECKS
#
# REQUIRED_CHECKS_JSON is a JSON object keyed by branch name. Each value
# is an array of status-check context names that branch protection
# requires for that branch. The script looks up the PR's base branch and
# evaluates only the checks declared for that branch.
#
# {"main": ["CI / all-required (pull_request)", ...],
# "staging": ["CI / all-required (pull_request)", ...]}
#
# REQUIRED_CHECKS (legacy) is a newline-separated list used when the
# JSON variable is not set. Declared in the workflow YAML rather than
# fetched from /branch_protections (which needs admin scope — sop-tier-bot
# has read-only). Trade dynamism for simplicity: when the required-check
# set changes, update both branch protection AND this env. Keeping them
# in sync is less complexity than granting the audit bot admin perms on
# every repo.
# REQUIRED_CHECKS is a newline-separated list of status-check context
# names that branch protection requires. Declared in the workflow YAML
# rather than fetched from /branch_protections (which needs admin
# scope — sop-tier-bot has read-only). Trade dynamism for simplicity:
# when the required-check set changes, update both branch protection
# AND this env. Keeping them in sync is less complexity than granting
# the audit bot admin perms on every repo.
set -euo pipefail
@@ -43,10 +34,7 @@ set -euo pipefail
: "${GITEA_HOST:?required}"
: "${REPO:?required}"
: "${PR_NUMBER:?required}"
if [ -z "${REQUIRED_CHECKS_JSON:-}" ] && [ -z "${REQUIRED_CHECKS:-}" ]; then
echo "::error::Either REQUIRED_CHECKS_JSON or REQUIRED_CHECKS must be set"
exit 1
fi
: "${REQUIRED_CHECKS:?required (newline-separated context names)}"
OWNER="${REPO%%/*}"
NAME="${REPO##*/}"
@@ -77,14 +65,10 @@ if [ -z "$MERGE_SHA" ]; then
exit 0
fi
# 2. Required status checks — branch-aware JSON dict takes precedence.
if [ -n "${REQUIRED_CHECKS_JSON:-}" ]; then
REQUIRED=$(echo "$REQUIRED_CHECKS_JSON" | jq -r --arg branch "$BASE_BRANCH" '.[$branch] // [] | .[]')
else
REQUIRED="$REQUIRED_CHECKS"
fi
# 2. Required status checks declared in the workflow env.
REQUIRED="$REQUIRED_CHECKS"
if [ -z "${REQUIRED//[[:space:]]/}" ]; then
echo "::notice::REQUIRED_CHECKS empty for branch '$BASE_BRANCH' — force-merge not applicable."
echo "::notice::REQUIRED_CHECKS empty — force-merge not applicable."
exit 0
fi
+14 -28
View File
@@ -274,8 +274,7 @@ def required_checks_env(audit_doc: dict) -> set[str]:
found.append(v)
if not found:
sys.stderr.write(
f"::error::REQUIRED_CHECKS env not found in any step of "
f"{AUDIT_WORKFLOW_PATH}\n"
f"::error::REQUIRED_CHECKS env not found in any step of {AUDIT_WORKFLOW_PATH}\n"
)
sys.exit(3)
if len(found) > 1:
@@ -385,15 +384,10 @@ def detect_drift(branch: str) -> tuple[list[str], dict]:
contexts = set(protection.get("status_check_contexts") or [])
# ----- F1: job exists in CI but not under sentinel.needs -----
# Post-#1766 contract: the sentinel may deliberately have no `needs:`
# and instead poll path-relevant statuses dynamically. In that case
# F1 is a false positive — skip it. F1b (typos in existing needs)
# is naturally skipped when needs is empty.
missing_from_needs = sorted(jobs - needs)
if missing_from_needs and needs:
if missing_from_needs:
findings.append(
"F1 — jobs in ci.yml NOT under sentinel `needs:` "
"(sentinel doesn't gate them):\n"
"F1 — jobs in ci.yml NOT under sentinel `needs:` (sentinel doesn't gate them):\n"
+ "\n".join(f" - {n}" for n in missing_from_needs)
)
@@ -403,8 +397,7 @@ def detect_drift(branch: str) -> tuple[list[str], dict]:
stale_needs = sorted(needs - jobs_all)
if stale_needs:
findings.append(
"F1b — sentinel `needs:` lists jobs NOT present in ci.yml "
"(typo or removed job):\n"
"F1b — sentinel `needs:` lists jobs NOT present in ci.yml (typo or removed job):\n"
+ "\n".join(f" - {n}" for n in stale_needs)
)
@@ -412,9 +405,7 @@ def detect_drift(branch: str) -> tuple[list[str], dict]:
# Compute the contexts the CI YAML actually produces. The sentinel
# is in (B) intentionally (`ci / all-required (pull_request)`); we
# whitelist it explicitly.
emitted_contexts = {
expected_context(j) for j in jobs
} | {expected_context(SENTINEL_JOB)}
emitted_contexts = {expected_context(j) for j in jobs} | {expected_context(SENTINEL_JOB)}
# Contexts NOT produced by ci.yml may still come from other
# workflows in the repo (Secret scan etc). We can't enumerate
# every workflow's emissions cheaply; instead, flag only contexts
@@ -427,9 +418,8 @@ def detect_drift(branch: str) -> tuple[list[str], dict]:
)
if stale_protection:
findings.append(
"F2 — protection `status_check_contexts` entries with `ci / ` "
"prefix that NO job in ci.yml emits "
"(stale name → silent advisory gate):\n"
"F2 — protection `status_check_contexts` entries with `ci / ` prefix that NO "
"job in ci.yml emits (stale name → silent advisory gate):\n"
+ "\n".join(f" - {c}" for c in stale_protection)
)
@@ -504,8 +494,7 @@ def render_body(branch: str, findings: list[str], debug: dict) -> str:
f"# Drift detected on `{REPO}/{branch}`",
"",
"Auto-filed by `.gitea/workflows/ci-required-drift.yml` "
"(RFC [internal#219]"
"(https://git.moleculesai.app/molecule-ai/internal/issues/219) §4 + §6).",
"(RFC [internal#219](https://git.moleculesai.app/molecule-ai/internal/issues/219) §4 + §6).",
"",
"## Findings",
"",
@@ -516,11 +505,8 @@ def render_body(branch: str, findings: list[str], debug: dict) -> str:
"",
"## Resolution",
"",
"- **F1 / F1b**: if the sentinel job has a `needs:` block, add "
"the missing job to it in `.gitea/workflows/ci.yml`, or remove "
"the stale entry. If the sentinel deliberately has no `needs:` "
"(path-aware polling sentinel per post-#1766 contract), this "
"finding is expected and F1 is skipped.",
"- **F1 / F1b**: add the missing job to `all-required.needs:` "
"in `.gitea/workflows/ci.yml`, or remove the stale entry.",
"- **F2**: rename the protection context to match an emitter, "
"or remove it from `status_check_contexts` "
"(PATCH `/api/v1/repos/{owner}/{repo}/branch_protections/{branch}`).",
@@ -561,12 +547,12 @@ def file_or_update(
if dry_run:
print(f"::notice::[dry-run] would file/update drift issue for {branch}")
print("::group::[dry-run] title")
print(f"::group::[dry-run] title")
print(title)
print("::endgroup::")
print("::group::[dry-run] body")
print(f"::endgroup::")
print(f"::group::[dry-run] body")
print(body)
print("::endgroup::")
print(f"::endgroup::")
return
existing = find_open_issue(title)
+2 -4
View File
@@ -15,6 +15,7 @@ import subprocess
import sys
from pathlib import Path
PROFILES: dict[str, dict[str, str]] = {
"ci": {
"platform": r"^workspace-server/",
@@ -152,10 +153,7 @@ def parse_args(argv: list[str]) -> argparse.Namespace:
parser.add_argument("--event-name", default=os.environ.get("GITHUB_EVENT_NAME", ""))
parser.add_argument("--pr-base-sha", default="")
parser.add_argument("--base-ref", default="")
parser.add_argument(
"--push-before",
default=os.environ.get("GITHUB_EVENT_BEFORE", ""),
)
parser.add_argument("--push-before", default=os.environ.get("GITHUB_EVENT_BEFORE", ""))
return parser.parse_args(argv)
+1 -3
View File
@@ -183,9 +183,7 @@ def required_contexts_green(
status = latest_statuses.get(context)
state = status_state(status or {})
if state != "success":
if pr_labels and _is_tier_low_pending_ok(
latest_statuses, context, pr_labels
):
if pr_labels and _is_tier_low_pending_ok(latest_statuses, context, pr_labels):
continue # tier:low soft-fail: accept pending sop-checklist
missing_or_bad.append(f"{context}={state or 'missing'}")
return not missing_or_bad, missing_or_bad
@@ -13,9 +13,11 @@ from __future__ import annotations
import argparse
import glob
import re
import sys
from pathlib import Path
from typing import NamedTuple
SELF = ".gitea/workflows/lint-curl-status-capture.yml"
+1 -1
View File
@@ -283,7 +283,7 @@ def _ensure_labels(repo: str, names: list[str]) -> list[int]:
if status != "ok" or not isinstance(labels, list):
return []
out: list[int] = []
by_name = {label["name"]: label["id"] for label in labels if isinstance(label, dict)}
by_name = {l["name"]: l["id"] for l in labels if isinstance(l, dict)}
for n in names:
if n in by_name:
out.append(by_name[n])
@@ -82,7 +82,7 @@ import sys
import urllib.error
import urllib.parse
import urllib.request
from datetime import datetime, timezone
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any
@@ -641,15 +641,6 @@ def main(argv: list[str] | None = None) -> int:
base_workflows = workflows_at_sha(BASE_SHA)
head_workflows = workflows_at_sha(HEAD_SHA)
# Ignore workflow files that are identical on both sides — old branches
# that haven't rebased onto main carry stale copies of workflows that
# were updated later. Comparing those stale copies against the current
# base produces false-positive "flips".
base_workflows = {
p: t for p, t in base_workflows.items()
if p in head_workflows and head_workflows[p] != t
}
head_workflows = {p: t for p, t in head_workflows.items() if p in base_workflows}
flips = detect_flips(base_workflows, head_workflows)
if not flips:
+30 -268
View File
@@ -90,15 +90,6 @@ API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else ""
# match by exact title without parsing.
TITLE_PREFIX = "[main-red]"
# Contexts that are scheduled or non-required — their pending/failure
# state should not block stale-issue closeout (mc#1789).
SCHEDULED_CONTEXT_PATTERNS = (
"Staging SaaS smoke",
"Continuous synthetic E2E",
"main-red-watchdog",
"ci-arm64-advisory",
)
# Settling window (seconds) between initial red detection and the
# pre-file recheck. The recheck filters out the two largest false-
# positive classes seen in mc#1597..1630 (task #394, 2026-05-21):
@@ -274,11 +265,6 @@ def get_combined_status(sha: str) -> dict:
return body
def _entry_state(s: dict) -> str:
"""Per-entry status key in Gitea 1.22.6 is `status`; fall back to `state`."""
return s.get("status") or s.get("state") or ""
def is_red(status: dict) -> tuple[bool, list[dict]]:
"""Return (is_red, failed_statuses).
@@ -326,6 +312,9 @@ def is_red(status: dict) -> tuple[bool, list[dict]]:
# "no per-context entries were in a red state" fallback even when
# the combined-state correctly flagged red. See
# `feedback_smoke_test_vendor_truth_not_shape_match`.
def _entry_state(s: dict) -> str:
return s.get("status") or s.get("state") or ""
def _is_cancel_cascade(s: dict) -> bool:
"""status=3 entry per Gitea 1.22.6 description-string contract.
Match exactly (after strip) — substring match would catch
@@ -364,15 +353,6 @@ def title_for(sha: str) -> str:
return f"{TITLE_PREFIX} {REPO}: {sha[:10]}"
def _is_scheduled_context(context: str) -> bool:
"""Return True if `context` is a known scheduled/non-required job.
These contexts run on a schedule and should not block stale-issue
closeout when main's required CI has recovered (mc#1789).
"""
return any(pattern.lower() in context.lower() for pattern in SCHEDULED_CONTEXT_PATTERNS)
def list_open_red_issues() -> list[dict]:
"""All open issues whose title starts with `[main-red] {repo}: `.
@@ -382,34 +362,23 @@ def list_open_red_issues() -> list[dict]:
file-or-update path to POST a duplicate — exactly the regression
class the helper-raises contract closes.
Pagination is exhausted (mc#1789). The old "by design ≤ 1" invariant
was false — backlog can exceed 50 open issues.
Gitea issue search returns at most 50/page; we only need open
`[main-red]` issues which are by design ≤ 1 at any time per repo,
so a single page is enough.
"""
prefix = f"{TITLE_PREFIX} {REPO}: "
all_issues: list[dict] = []
page = 1
limit = 50
while True:
_, results = api(
"GET",
f"/repos/{OWNER}/{NAME}/issues",
query={"state": "open", "type": "issues", "limit": str(limit), "page": str(page)},
_, results = api(
"GET",
f"/repos/{OWNER}/{NAME}/issues",
query={"state": "open", "type": "issues", "limit": "50"},
)
if not isinstance(results, list):
raise ApiError(
f"issue search returned non-list body (got {type(results).__name__})"
)
if not isinstance(results, list):
raise ApiError(
f"issue search returned non-list body (got {type(results).__name__})"
)
matched = [
i for i in results
if isinstance(i, dict)
prefix = f"{TITLE_PREFIX} {REPO}: "
return [i for i in results if isinstance(i, dict)
and isinstance(i.get("title"), str)
and i["title"].startswith(prefix)
]
all_issues.extend(matched)
if len(results) < limit:
break
page += 1
return all_issues
and i["title"].startswith(prefix)]
def find_open_issue_for_sha(sha: str) -> dict | None:
@@ -605,156 +574,10 @@ def file_or_update_red(
sys.stderr.write(f"::warning::label '{RED_LABEL}' not found on repo\n")
def close_stale_red_issues(
current_sha: str,
current_status: dict,
*,
dry_run: bool = False,
) -> int:
"""Close open [main-red] issues whose specific failing contexts have
all recovered on `current_sha`, even though `main` is still red for
other reasons (mc#1789).
When main stays red across consecutive SHAs for *different* causes,
`close_open_red_issues_for_other_shas` never fires (it only runs when
main is green). This function prevents stale issues from accumulating
indefinitely by comparing per-context recovery across SHAs.
An issue is considered stale when every context that was in a failed
state on the issue's SHA is now either `success` on the current HEAD
or absent (workflow removed / renamed). Issues whose original SHA had
a combined-red-with-no-detail (empty statuses list) are skipped — we
cannot verify recovery without per-context data.
Returns the number of issues closed.
"""
open_red = list_open_red_issues()
if not open_red:
return 0
current_statuses = current_status.get("statuses") or []
closed = 0
for issue in open_red:
title = issue.get("title", "")
prefix = f"{TITLE_PREFIX} {REPO}: "
if not title.startswith(prefix):
continue
short_sha = title[len(prefix):]
if short_sha == current_sha[:10]:
continue
# Query status for the old SHA. Short SHA should resolve; if it
# doesn't (GC'd, force-pushed, ambiguous), skip conservatively.
try:
old_status = get_combined_status(short_sha)
except ApiError:
continue
old_red, old_failed = is_red(old_status)
if not old_red:
# Open issue for a now-green SHA — close it via the normal path.
num = issue.get("number")
if isinstance(num, int):
comment = (
f"Commit `{short_sha}` is no longer red. Closing as the "
f"failure context has recovered or expired."
)
if dry_run:
print(
f"::notice::[dry-run] would close issue #{num} "
f"({title}) — old SHA is now green"
)
closed += 1
continue
api(
"POST",
f"/repos/{OWNER}/{NAME}/issues/{num}/comments",
body={"body": comment},
)
api(
"PATCH",
f"/repos/{OWNER}/{NAME}/issues/{num}",
body={"state": "closed"},
)
print(
f"::notice::Closed stale main-red issue #{num} "
f"(old SHA {short_sha} is now green)"
)
closed += 1
continue
if not old_failed:
# Combined red with no per-context detail — can't verify recovery.
continue
# Verify every failed context from the old SHA has recovered.
all_recovered = True
recovered_ctxs: list[str] = []
still_failing_ctxs: list[str] = []
for s in old_failed:
ctx = s.get("context", "")
if not ctx:
continue
current_match = None
for cs in current_statuses:
if isinstance(cs, dict) and cs.get("context") == ctx:
current_match = cs
break
if current_match is None:
recovered_ctxs.append(ctx)
elif _entry_state(current_match) == "success":
recovered_ctxs.append(ctx)
else:
all_recovered = False
still_failing_ctxs.append(ctx)
if not all_recovered:
continue
num = issue.get("number")
if not isinstance(num, int):
continue
comment = (
f"The failing contexts from this SHA (`{short_sha}`) have "
f"recovered on current HEAD `{current_sha[:10]}`: "
f"{', '.join(recovered_ctxs)}. "
f"Main is still red for other reasons; see the current "
f"`[main-red]` issue for `{current_sha[:10]}`."
)
if dry_run:
print(
f"::notice::[dry-run] would close stale issue #{num} "
f"({title}) — contexts recovered"
)
closed += 1
continue
api(
"POST",
f"/repos/{OWNER}/{NAME}/issues/{num}/comments",
body={"body": comment},
)
api(
"PATCH",
f"/repos/{OWNER}/{NAME}/issues/{num}",
body={"state": "closed"},
)
print(
f"::notice::Closed stale main-red issue #{num} "
f"(contexts recovered at {current_sha[:10]})"
)
closed += 1
return closed
def close_open_red_issues_for_other_shas(
current_sha: str,
*,
dry_run: bool = False,
close_same_sha: bool = False,
) -> int:
"""When main is green at current_sha, close any open `[main-red]`
issues whose title references a different SHA. Returns the number
@@ -763,25 +586,15 @@ def close_open_red_issues_for_other_shas(
Lineage note: we only close issues whose title prefix matches; if
a human renamed the issue or added a suffix this won't touch it.
That's intentional — manual editorial state takes precedence.
Args:
close_same_sha: set True when the caller already knows main is
green at current_sha (e.g. recovery block) and wants to close
the open issue for THIS SHA too. Defaults False so the
green-path callers never accidentally close an issue they just
filed on the same tick.
"""
target_title = title_for(current_sha)
open_red = list_open_red_issues()
closed = 0
for issue in open_red:
if issue.get("title") == target_title:
if not close_same_sha:
# Same SHA — caller should not have invoked this if main is
# green. Skip defensively (guards against green-path callers
# that accidentally pass the SHA they just filed for).
continue
# close_same_sha=True: close even this SHA's issue (recovery path)
# Same SHA — caller should not have invoked this if main is
# green. Skip defensively.
continue
num = issue.get("number")
if not isinstance(num, int):
continue
@@ -886,10 +699,6 @@ def run_once(*, dry_run: bool = False) -> int:
f"{sha[:10]} but HEAD is now {recheck_sha[:10]} on "
f"{WATCH_BRANCH}; next cron tick will re-evaluate."
)
# HEAD drifted — close any stale main-red issue for the prior SHA
# before returning, so we don't leave stale open issues when main
# is no longer pointing at the red commit.
close_open_red_issues_for_other_shas(recheck_sha, dry_run=dry_run)
return 0
recheck_status = get_combined_status(sha)
@@ -902,9 +711,6 @@ def run_once(*, dry_run: bool = False) -> int:
f"{recheck_status.get('state')!r} on recheck; "
f"initial red was a transient cancel-cascade."
)
# CI recovered on the same SHA — close any stale main-red issue
# that was filed on a prior tick for this SHA.
close_open_red_issues_for_other_shas(sha, dry_run=dry_run, close_same_sha=True)
return 0
# Still red after settling — file/update. Use the recheck data
@@ -920,68 +726,24 @@ def run_once(*, dry_run: bool = False) -> int:
print(f"::warning::main is RED at {sha[:10]} on {WATCH_BRANCH}: "
f"{len(failed)} failed context(s)")
file_or_update_red(sha, failed, debug, dry_run=dry_run)
stale_closed = close_stale_red_issues(sha, recheck_status, dry_run=dry_run)
if stale_closed:
emit_loki_event("main_red_stale_closed", sha, [])
print(
f"::notice::Closed {stale_closed} stale main-red issue(s) "
f"whose contexts recovered at {sha[:10]}"
)
else:
# Green or pending-with-no-real-failures. Close stale issues
# from earlier SHAs when required CI has recovered.
#
# mc#1789: main often sits at combined `pending` because
# scheduled/non-required contexts (Staging SaaS smoke,
# Continuous synthetic E2E, main-red-watchdog itself,
# ci-arm64-advisory) are still running. We close stale issues
# as long as no *non-scheduled* context has failed and no
# *non-scheduled* context is still pending — i.e. required CI
# is effectively green.
#
# The success-only gate is preserved for the canonical green
# path; the extended check below only fires when combined is
# `pending` but all required work is done.
combined_state = status.get("state")
if combined_state == "success":
should_close = True
close_reason = "GREEN"
else:
statuses = status.get("statuses") or []
non_scheduled_pending = [
s for s in statuses
if isinstance(s, dict)
and (_entry_state(s) == "pending")
and not _is_scheduled_context(s.get("context", ""))
]
non_scheduled_failed = [
s for s in statuses
if isinstance(s, dict)
and (_entry_state(s) in {"failure", "error"})
and not _is_scheduled_context(s.get("context", ""))
]
# Cancel-cascade already filtered by is_red(); red=False
# here means no real failures. We additionally check that
# no non-scheduled context is still pending.
should_close = not non_scheduled_pending and not non_scheduled_failed
close_reason = "pending-but-required-green"
if should_close:
# Green (or pending — pending is treated as not-red so we don't
# spam during the post-merge CI window). Close any stale issues
# from earlier SHAs only when we're actually green; pending
# means CI hasn't finished and the prior issue might still be
# accurate.
if status.get("state") == "success":
closed = close_open_red_issues_for_other_shas(sha, dry_run=dry_run)
if closed:
emit_loki_event(
"main_returned_to_green", sha,
[],
)
print(
f"::notice::main is {close_reason} at {sha[:10]} on {WATCH_BRANCH} "
f"(closed {closed} stale issue(s))"
)
print(f"::notice::main is GREEN at {sha[:10]} on {WATCH_BRANCH} "
f"(closed {closed} stale issue(s))")
else:
print(
f"::notice::main has pending-or-failed required CI at {sha[:10]} "
f"on {WATCH_BRANCH} (combined state={combined_state!r}; no action)"
)
print(f"::notice::main is PENDING at {sha[:10]} on {WATCH_BRANCH} "
f"(combined state={status.get('state')!r}; no action)")
return 0
+5 -218
View File
@@ -17,14 +17,18 @@ import urllib.error
import urllib.request
from urllib.parse import quote
TRUE_VALUES = {"1", "true", "yes", "on", "disabled", "disable"}
PROD_CP_URL = "https://api.moleculesai.app"
DEFAULT_REQUIRED_CONTEXTS = [
"CI / Platform (Go) (push)",
"CI / Canvas (Next.js) (push)",
"CI / Shellcheck (E2E scripts) (push)",
"CI / Python Lint & Test (push)",
"CI / all-required (push)",
"Secret scan / Scan diff for credential-shaped strings (push)",
]
TERMINAL_FAILURE_STATES = {"failure", "error", "cancelled", "canceled", "skipped"}
REDEPLOY_PATH = "/cp/admin/tenants/redeploy-fleet"
def truthy_flag(value: str | None) -> bool:
@@ -130,217 +134,6 @@ def required_contexts(env: dict[str, str]) -> list[str]:
return [line.strip() for line in raw.replace(",", "\n").splitlines() if line.strip()]
def chunks(items: list[str], size: int) -> list[list[str]]:
return [items[i : i + size] for i in range(0, len(items), size)]
class RolloutFailed(RuntimeError):
def __init__(self, message: str, response: dict):
super().__init__(message)
self.response = response
def slugs_from_redeploy_response(body: dict) -> list[str]:
slugs: list[str] = []
for row in body.get("results") or []:
slug = str(row.get("slug") or "").strip()
if slug:
slugs.append(slug)
return slugs
def scoped_redeploy_body(base: dict, slugs: list[str]) -> dict:
body = dict(base)
body.pop("canary_slug", None)
body["only_slugs"] = slugs
body["soak_seconds"] = 0
body["batch_size"] = max(1, len(slugs))
return body
def cp_api_json(method: str, url: str, token: str, body: dict | None = None) -> tuple[int, dict]:
data = None
headers = {
"Authorization": f"Bearer {token}",
"Accept": "application/json",
}
if body is not None:
data = json.dumps(body).encode("utf-8")
headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, data=data, headers=headers, method=method)
try:
with urllib.request.urlopen(req, timeout=120) as resp:
return resp.status, json.loads(resp.read())
except urllib.error.HTTPError as exc:
raw = exc.read().decode("utf-8", errors="replace")
try:
parsed = json.loads(raw)
except json.JSONDecodeError:
parsed = {"error": raw[:500]}
return exc.code, parsed
def plan_rollout_slugs(cp_url: str, token: str, body: dict, redeploy=None) -> list[str]:
if redeploy is None:
redeploy = redeploy_scoped
dry_run_body = dict(body)
dry_run_body["dry_run"] = True
status, resp = redeploy(cp_url, token, dry_run_body)
if status != 200:
raise RuntimeError(f"dry-run redeploy-fleet returned HTTP {status}: {resp.get('error', '')}")
if resp.get("ok") is not True:
raise RuntimeError(f"dry-run redeploy-fleet reported ok={resp.get('ok')}: {resp.get('error', '')}")
slugs = slugs_from_redeploy_response(resp)
if not slugs:
raise RuntimeError("dry-run redeploy-fleet returned no rollout candidates")
return slugs
def redeploy_scoped(cp_url: str, token: str, body: dict) -> tuple[int, dict]:
return cp_api_json("POST", f"{cp_url}{REDEPLOY_PATH}", token, body)
def _raise_for_redeploy_result(status: int, body: dict, slugs: list[str]) -> None:
if status != 200 or body.get("ok") is not True:
raise RuntimeError(
"redeploy scoped call failed for "
f"{','.join(slugs)}: HTTP {status}, ok={body.get('ok')}"
)
def rollout_stragglers(enumerated: list[str], results: list[dict]) -> list[str]:
"""Return every enumerated tenant NOT proven on the target build.
A straggler is any tenant the rollout was supposed to cover that the
CP could not verify is running the target image tag — whether it
errored, was skipped, or SSM-succeeded onto the wrong image
(internal#724). CP marks each per-tenant result row with
``verified_on_target`` (the REDEPLOY_RUNNING_IMAGE docker-inspect
proof). A tenant enumerated for the rollout but absent from the
result set (no batch ever ran it) is also a straggler — that is the
exact agents-team silent-skip class.
Backward-compat: an OLDER CP that doesn't emit ``verified_on_target``
yet returns rows without the key. Treat a missing key as verified so
this surfacing degrades to the previous (ok-based) behavior against an
un-upgraded CP, rather than failing every deploy spuriously. Once the
CP fix is deployed the key is always present and real stragglers are
caught.
"""
verified: set[str] = set()
for row in results:
if str(row.get("ssm_status") or "") == "DryRun":
continue
slug = str(row.get("slug") or "").strip()
if not slug:
continue
# Missing key (old CP) => assume verified; present key is authoritative.
if "verified_on_target" not in row or row.get("verified_on_target"):
verified.add(slug)
return sorted(s for s in dict.fromkeys(enumerated) if s not in verified)
def assert_full_coverage(enumerated: list[str], aggregate: dict, dry_run: bool) -> None:
"""Fail the rollout if any enumerated tenant is not on the target build.
This is the no-silent-skip gate (internal#724). A dry run proves
nothing landed, so coverage is not asserted for it.
"""
if dry_run:
return
stragglers = rollout_stragglers(enumerated, aggregate.get("results") or [])
if stragglers:
msg = (
f"incomplete rollout: {len(stragglers)} tenant(s) not verified on target "
f"after redeploy-fleet: {', '.join(stragglers)} "
f"(enumerated {len(set(enumerated))})"
)
aggregate["ok"] = False
aggregate["error"] = msg
aggregate["stragglers"] = stragglers
raise RolloutFailed(msg, aggregate)
def execute_scoped_rollout(
plan: dict,
token: str,
list_slugs=plan_rollout_slugs,
redeploy=redeploy_scoped,
sleep=time.sleep,
) -> dict:
cp_url = plan["cp_url"]
base_body = plan["body"]
all_slugs = list_slugs(cp_url, token, base_body)
batch_size = int(base_body.get("batch_size") or 1)
canary_slug = str(base_body.get("canary_slug") or "").strip()
dry_run = bool(base_body.get("dry_run"))
aggregate = {"ok": True, "results": []}
if canary_slug:
if canary_slug not in all_slugs:
raise RuntimeError(f"configured canary slug {canary_slug!r} is not a running tenant")
body = scoped_redeploy_body(base_body, [canary_slug])
print(f"POST {cp_url}{REDEPLOY_PATH} only_slugs={','.join(body['only_slugs'])}")
status, resp = redeploy(cp_url, token, body)
aggregate["results"].extend(resp.get("results") or [])
try:
_raise_for_redeploy_result(status, resp, [canary_slug])
except RuntimeError as exc:
aggregate["ok"] = False
aggregate["error"] = str(exc)
raise RolloutFailed(str(exc), aggregate) from exc
soak_seconds = int(base_body.get("soak_seconds") or 0)
if soak_seconds > 0 and not dry_run:
print(f"Canary passed; soaking locally for {soak_seconds}s")
sleep(soak_seconds)
remaining = [slug for slug in all_slugs if slug != canary_slug]
for group in chunks(remaining, batch_size):
body = scoped_redeploy_body(base_body, group)
print(f"POST {cp_url}{REDEPLOY_PATH} only_slugs={','.join(group)}")
status, resp = redeploy(cp_url, token, body)
aggregate["results"].extend(resp.get("results") or [])
try:
_raise_for_redeploy_result(status, resp, group)
except RuntimeError as exc:
aggregate["ok"] = False
aggregate["error"] = str(exc)
raise RolloutFailed(str(exc), aggregate) from exc
# No-silent-skip coverage gate (internal#724): every enumerated tenant
# must be PROVEN on the target build. A per-tenant HTTP-200/ok response
# is not proof — a tenant that SSM-succeeded but stayed on the old tag,
# or one enumerated but never batched, is a straggler. Surfacing it as
# a RolloutFailed makes the deploy step exit non-zero instead of
# silently reporting success (the exact agents-team failure mode).
assert_full_coverage(all_slugs, aggregate, dry_run)
return aggregate
def rollout_from_plan_file(plan_path: str, response_path: str, env: dict[str, str]) -> None:
token = env.get("CP_ADMIN_API_TOKEN", "").strip()
if not token:
raise ValueError("CP_ADMIN_API_TOKEN is required for production auto-deploy")
with open(plan_path, "r", encoding="utf-8") as fh:
plan = json.load(fh)
if not plan.get("enabled"):
raise RuntimeError("production auto-deploy plan is disabled")
try:
response = execute_scoped_rollout(plan, token)
except RolloutFailed as exc:
response = exc.response
with open(response_path, "w", encoding="utf-8") as fh:
json.dump(response, fh, sort_keys=True)
fh.write("\n")
raise
with open(response_path, "w", encoding="utf-8") as fh:
json.dump(response, fh, sort_keys=True)
fh.write("\n")
def _api_json(url: str, token: str) -> dict:
req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
try:
@@ -442,9 +235,6 @@ def main() -> int:
sub.add_parser("plan", help="print production deploy plan as JSON")
sub.add_parser("assert-enabled", help="fail if production deploy is currently disabled")
sub.add_parser("wait-ci", help="block until required CI context is green")
rollout_parser = sub.add_parser("rollout", help="execute canary-first scoped production rollout")
rollout_parser.add_argument("--plan", required=True, help="path to prod-auto-deploy plan JSON")
rollout_parser.add_argument("--response", required=True, help="path to write aggregate response JSON")
args = parser.parse_args()
try:
@@ -457,9 +247,6 @@ def main() -> int:
if args.command == "wait-ci":
wait_for_ci_context(dict(os.environ))
return 0
if args.command == "rollout":
rollout_from_plan_file(args.plan, args.response, dict(os.environ))
return 0
except Exception as exc: # noqa: BLE001 - CLI should render operator-friendly errors.
print(f"::error::{exc}", file=sys.stderr)
return 1
+54 -71
View File
@@ -1,5 +1,4 @@
#!/usr/bin/env bash
# shellcheck disable=SC2016,SC2329
# review-check — evaluate whether a PR satisfies a single team-review gate.
#
# RFC#324 Step 1 of 5 — qa-review + security-review check workflows.
@@ -12,7 +11,6 @@
# ≥ 1 review on the PR where:
# • state == APPROVED
# • review.dismissed == false
# • review.official != false (excludes draft/mis-filed APPROVED reviews)
# • review.user.login != PR.user.login (non-author)
# • review.user.login ∈ team-members
#
@@ -202,7 +200,6 @@ fi
JQ_FILTER='.[]
| select(.state == "APPROVED")
| select(.dismissed != true)
| select(.official != false)
| select(.user.login != $author)'
if [ "${REVIEW_CHECK_STRICT:-}" = "1" ]; then
JQ_FILTER="${JQ_FILTER}
@@ -211,10 +208,10 @@ fi
JQ_FILTER="${JQ_FILTER}
| .user.login"
REVIEW_CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILTER" "$REVIEWS_JSON" | sort -u)
debug "candidate non-author approvers: $(echo "$REVIEW_CANDIDATES" | tr '\n' ' ')"
CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILTER" "$REVIEWS_JSON" | sort -u)
debug "candidate non-author approvers: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -z "$REVIEW_CANDIDATES" ]; then
if [ -z "$CANDIDATES" ]; then
# --- Guardrail (internal#503): explain the most common false
# "no candidates" red. Gitea's review event enum is EXACTLY
# APPROVED/REQUEST_CHANGES/COMMENT/PENDING. A wrong value ("APPROVE",
@@ -239,52 +236,55 @@ if [ -z "$REVIEW_CANDIDATES" ]; then
done
fi
fi
# --- Fallback (internal#348): check issue comments for agent-approval ---
# core-qa-agent and core-security-agent approve via issue comments, NOT
# the reviews API. The reviews API returns zero entries for comment-only
# approvals. This fallback reads PR issue comments and extracts logins that:
# 1. Posted a comment matching the agent-prefix pattern for this gate:
# qa → "[core-qa-agent] APPROVED"
# security → "[core-security-agent] APPROVED"
# OR posted a generic approval keyword (word-anchored, case-insensitive):
# APPROVED / LGTM / ACCEPTED
# 2. Are not the PR author
# 3. The team-membership probe below is the authoritative filter.
AGENT_PATTERN=""
case "$TEAM" in
qa) AGENT_PATTERN="\\[core-qa-agent\\]" ;;
security) AGENT_PATTERN="\\[core-security-agent\\]" ;;
esac
HTTP_CODE=$(curl -sS -o "$COMMENTS_JSON" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/repos/${OWNER}/${NAME}/issues/${PR_NUMBER}/comments")
debug "GET /issues/${PR_NUMBER}/comments → HTTP ${HTTP_CODE}"
if [ "$HTTP_CODE" = "200" ]; then
# JQ expression: select non-author comments that match either the
# agent-prefix pattern (case-insensitive) OR a generic approval keyword.
JQ_APPROVALS='
.[] |
select(.user.login != $author) |
. as $cmt |
if ($agent_pattern | length) > 0 and ($cmt.body // "" | test($agent_pattern; "i")) then
$cmt.user.login
elif ($cmt.body // "" | test("\\b(APPROVED|LGTM|ACCEPTED)\\b"; "i")) then
$cmt.user.login
else
empty
end
'
CANDIDATES=$(jq -r \
--arg author "$PR_AUTHOR" \
--arg agent_pattern "$AGENT_PATTERN" \
"$JQ_APPROVALS" \
"$COMMENTS_JSON" 2>/dev/null | sort -u)
debug "comment-based approval candidates: $(echo "$CANDIDATES" | tr '\n' ' ')"
# --- Fallback/extension (internal#348): check issue comments for agent-approval ---
# core-qa-agent and core-security-agent can approve via issue comments. Always
# include comment candidates, even if the reviews API returned approvals for a
# different team; team membership below is the authoritative filter.
COMMENT_CANDIDATES=""
AGENT_PATTERN=""
case "$TEAM" in
qa) AGENT_PATTERN="\\[core-qa-agent\\]" ;;
security) AGENT_PATTERN="\\[core-security-agent\\]" ;;
esac
HTTP_CODE=$(curl -sS -o "$COMMENTS_JSON" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/repos/${OWNER}/${NAME}/issues/${PR_NUMBER}/comments")
debug "GET /issues/${PR_NUMBER}/comments → HTTP ${HTTP_CODE}"
if [ "$HTTP_CODE" = "200" ]; then
# JQ expression: select non-author comments that match either the
# agent-prefix pattern (case-insensitive) OR a generic approval keyword.
JQ_APPROVALS='
.[] |
select(.user.login != $author) |
. as $cmt |
if ($agent_pattern | length) > 0 and ($cmt.body // "" | test($agent_pattern; "i")) then
$cmt.user.login
elif ($cmt.body // "" | test("\\b(APPROVED|LGTM|ACCEPTED)\\b"; "i")) then
$cmt.user.login
else
empty
end
'
COMMENT_CANDIDATES=$(jq -r \
--arg author "$PR_AUTHOR" \
--arg agent_pattern "$AGENT_PATTERN" \
"$JQ_APPROVALS" \
"$COMMENTS_JSON" 2>/dev/null | sort -u)
debug "comment-based approval candidates: $(echo "$COMMENT_CANDIDATES" | tr '\n' ' ')"
if [ -n "$COMMENT_CANDIDATES" ]; then
echo "::notice::${TEAM}-review: found $(echo "$COMMENT_CANDIDATES" | wc -w | xargs) comment-based approval candidate(s) — verifying team membership..."
if [ -n "$CANDIDATES" ]; then
echo "::notice::${TEAM}-review: reviews API found no APPROVED reviews; found $(echo "$CANDIDATES" | wc -w | xargs) comment-based approval candidate(s) — verifying team membership..."
fi
else
debug "could not fetch issue comments (HTTP ${HTTP_CODE})"
fi
else
debug "could not fetch issue comments (HTTP ${HTTP_CODE})"
fi
CANDIDATES=$(printf '%s\n%s\n' "$REVIEW_CANDIDATES" "$COMMENT_CANDIDATES" | sed '/^$/d' | sort -u)
if [ -z "${CANDIDATES:-}" ]; then
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates from reviews API or issue comments)"
exit 1
@@ -296,15 +296,7 @@ fi
# 403 → token owner is not in this team (Gitea 1.22.6 'Must be a team
# member' constraint — see follow-up issue for token-provisioning)
# 404 → not a member
# Track whether every candidate returned 403 (token owner not in team).
# When this happens the root cause is a token-provisioning issue, not a
# reviewer-eligibility issue — surface it clearly so ops don't waste time
# verifying team roster (Bug C / RFC#324 follow-up).
_ALL_CANDIDATES_403="yes"
_CANDIDATE_COUNT=0
for U in $CANDIDATES; do
_CANDIDATE_COUNT=$((_CANDIDATE_COUNT + 1))
CODE=$(curl -sS -o "$TEAM_PROBE_TMP" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/teams/${TEAM_ID}/members/${U}")
debug "probe ${U} in team ${TEAM} (id=${TEAM_ID}) → HTTP ${CODE}"
@@ -314,31 +306,22 @@ for U in $CANDIDATES; do
exit 0
;;
403)
# Token owner is not in the team being probed; Gitea 1.22.6 refuses
# to confirm membership in this case. Do NOT hard-fail the gate on a
# 403 — doing so would fail the entire gate if ANY candidate triggers
# a 403, even when other valid team-members exist. Instead skip this
# candidate and continue checking others. If all candidates produce
# 403 (token owner can't query any of them) the final exit fires.
echo "::warning::team-probe for ${U} in ${TEAM} returned 403 (token owner not in ${TEAM} team — skipping; cannot confirm membership)"
# Token owner is not in the team being probed; the API refuses to
# confirm membership. This is the RFC#324 follow-up token-scope gap.
# Fail closed — never grant approval on a 403; surface clearly.
echo "::error::team-probe for ${U} in ${TEAM} returned 403 (token owner not in ${TEAM} team — RFC#324 token-scope follow-up). Cannot confirm membership; failing closed."
cat "$TEAM_PROBE_TMP" >&2
continue
exit 1
;;
404)
_ALL_CANDIDATES_403="no"
debug "${U} not a member of ${TEAM}"
;;
*)
_ALL_CANDIDATES_403="no"
echo "::warning::team-probe for ${U} in ${TEAM} returned unexpected HTTP ${CODE}"
cat "$TEAM_PROBE_TMP" >&2
;;
esac
done
if [ "$_ALL_CANDIDATES_403" = "yes" ] && [ "$_CANDIDATE_COUNT" -gt 0 ]; then
echo "::error::${TEAM}-review FAILED — every candidate returned 403 (token owner is not a member of the ${TEAM} team). This is a TOKEN PROVISIONING issue, not a reviewer-eligibility issue. Add the token owner to the '${TEAM}' Gitea team (id=${TEAM_ID}) or use a token whose owner is already in that team."
else
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (candidates: $(echo "$CANDIDATES" | tr '\n' ',' | sed 's/,$//') — none are in team)"
fi
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (candidates: $(echo "$CANDIDATES" | tr '\n' ',' | sed 's/,$//') — none are in team)"
exit 1
+12 -20
View File
@@ -6,8 +6,8 @@
# RFC#351 Step 2 of 6 (implementation MVP).
#
# Invoked by .gitea/workflows/sop-checklist.yml on:
# - pull_request_target: [opened, edited, synchronize, reopened, labeled, unlabeled]
# - issue_comment: [created] # edited/deleted omitted (Gitea 1.22.6 job-parsing quirk)
# - pull_request_target: [opened, edited, synchronize, reopened]
# - issue_comment: [created, edited, deleted]
#
# Flow:
# 1. Load .gitea/sop-checklist-config.yaml (from BASE ref — trusted).
@@ -338,6 +338,7 @@ def compute_ack_state(
# Filter out self-acks and unknown slugs.
ackers_per_slug: dict[str, list[str]] = {s: [] for s in items_by_slug}
rejected_self: dict[str, list[str]] = {s: [] for s in items_by_slug}
rejected_unknown: dict[str, list[str]] = {s: [] for s in items_by_slug}
pending_team_check: dict[str, list[str]] = {s: [] for s in items_by_slug}
for (user, slug), kind in latest_directive.items():
@@ -636,11 +637,8 @@ def load_config(path: str) -> dict[str, Any]:
dep by keeping the config shape constrained.
"""
try:
# yaml is an optional dep; the canonical loader is used when available,
# but the SOP runs on runners that may not have PyYAML installed. The
# fallback _load_config_minimal covers the same config shape without
import yaml # type: ignore[import-not-found] # optional dep; fall back silently if absent
with open(path, encoding="utf-8") as f:
import yaml # type: ignore[import-not-found]
with open(path) as f:
return yaml.safe_load(f)
except ImportError:
return _load_config_minimal(path)
@@ -654,19 +652,13 @@ def _load_config_minimal(path: str) -> dict[str, Any]:
item map: scalars + lists of scalars. Does NOT support nested lists,
YAML anchors, multi-doc, or flow style.
"""
with open(path, encoding="utf-8") as f:
with open(path) as f:
lines = f.readlines()
return _parse_minimal_yaml(lines)
def _parse_minimal_yaml(lines: list[str]) -> dict[str, Any]:
"""Hand-rolled subset parser. See _load_config_minimal docstring.
C901: function is necessarily long — it implements a finite-state YAML
subset (scalars, maps, lists of maps at fixed depth). No utility refactors
meaningfully reduce length without degrading readability. All branches
are exhaustively tested in test_parse_minimal_yaml.py.
"""
def _parse_minimal_yaml(lines: list[str]) -> dict[str, Any]: # noqa: C901
"""Hand-rolled subset parser. See _load_config_minimal docstring."""
# Strip comments + blank lines but preserve indentation.
cleaned: list[tuple[int, str]] = []
for raw in lines:
@@ -850,7 +842,7 @@ def render_status(
def get_tier_mode(pr: dict[str, Any], cfg: dict[str, Any]) -> str:
"""Read tier label, return 'hard' or 'soft' per cfg.tier_failure_mode."""
labels = pr.get("labels") or []
tier_labels = [label.get("name", "") for label in labels if (label.get("name", "") or "").startswith("tier:")]
tier_labels = [l.get("name", "") for l in labels if (l.get("name", "") or "").startswith("tier:")]
mode_map = cfg.get("tier_failure_mode") or {}
default_mode = cfg.get("default_mode", "hard")
for tl in tier_labels:
@@ -873,7 +865,7 @@ def is_high_risk(pr: dict[str, Any], cfg: dict[str, Any]) -> bool:
Governance fix for internal#442 — closes the inconsistency between
sop-tier-check (tier-aware) and sop-checklist (was tier-blind).
"""
label_set = {(label.get("name") or "") for label in (pr.get("labels") or [])}
label_set = {(l.get("name") or "") for l in (pr.get("labels") or [])}
if "tier:high" in label_set:
return True
high_risk_labels = set(cfg.get("high_risk_labels") or [])
@@ -1024,14 +1016,14 @@ def main(argv: list[str] | None = None) -> int:
tid = client.resolve_team_id(args.owner, tn)
if tid is None:
# Try the list endpoint as a fallback.
code, data = client._req( # noqa: SLF001 # internal helper; called from loop in caller context
code, data = client._req( # noqa: SLF001
"GET", f"/orgs/{args.owner}/teams"
)
if code == 200 and isinstance(data, list):
for t in data:
if t.get("name") == tn:
tid = t.get("id")
client._team_id_cache[(args.owner, tn)] = tid # noqa: SLF001 # write-through cache; intentional side-effect for reuse across calls
client._team_id_cache[(args.owner, tn)] = tid # noqa: SLF001
break
if tid is not None:
team_ids.append(tid)
+1 -1
View File
@@ -33,7 +33,7 @@ def scenario() -> str:
p = os.path.join(STATE_DIR, "scenario")
if not os.path.isfile(p):
return "T1_success"
with open(p, encoding="utf-8") as f:
with open(p) as f:
return f.read().strip()
+4 -10
View File
@@ -20,7 +20,6 @@ Scenarios:
T15_comments_agent_approval — reviews empty; comments have "[core-qa-agent] APPROVED" → exit 0
T16_comments_generic_approval — reviews empty; comments have "APPROVED" by team member → exit 0
T17_comments_no_approval — reviews empty; comments have no approval keywords → exit 1
T18_review_wrong_team_comment_right_team — review candidate 404s, comment candidate passes
Usage:
FIXTURE_STATE_DIR=/tmp/x python3 _review_check_fixture.py 8080
@@ -33,6 +32,7 @@ import re
import sys
import urllib.parse
STATE_DIR = os.environ.get("FIXTURE_STATE_DIR", "/tmp")
@@ -40,7 +40,7 @@ def scenario() -> str:
p = os.path.join(STATE_DIR, "scenario")
if not os.path.isfile(p):
return "T1_pr_open"
with open(p, encoding="utf-8") as f:
with open(p) as f:
return f.read().strip()
@@ -80,7 +80,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
# GET /repos/{owner}/{name}/pulls/{pr_number}
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/pulls/(\d+)$", path)
if m:
pr_num = m.group(3)
owner, name, pr_num = m.group(1), m.group(2), m.group(3)
if sc == "T2_pr_closed":
return self._json(200, {
"number": int(pr_num),
@@ -140,23 +140,17 @@ class Handler(http.server.BaseHTTPRequestHandler):
{"user": {"login": "alice"}, "body": "I authored this PR", "id": 1},
{"user": {"login": "random-user"}, "body": "Looks okay to me", "id": 2},
])
if sc == "T18_review_wrong_team_comment_right_team":
return self._json(200, [
{"user": {"login": "core-qa-agent"}, "body": "[core-qa-agent] APPROVED after focused review", "id": 1},
])
# Default scenarios (T1T9, T14): no comments
return self._json(200, [])
# GET /teams/{team_id}/members/{username}
m = re.match(r"^/api/v1/teams/(\d+)/members/([^/]+)$", path)
if m:
login = m.group(2)
team_id, login = m.group(1), m.group(2)
if sc == "T8_team_not_member":
return self._empty(404)
if sc == "T9_team_403":
return self._empty(403)
if sc == "T18_review_wrong_team_comment_right_team" and login == "core-devops":
return self._empty(404)
# T7_team_member: member
return self._empty(204)
@@ -1,176 +0,0 @@
import importlib.util
import sys
from pathlib import Path
from unittest.mock import patch
SCRIPT = Path(__file__).resolve().parents[1] / "ci-required-drift.py"
spec = importlib.util.spec_from_file_location("ci_required_drift", SCRIPT)
drift = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = drift
spec.loader.exec_module(drift)
# Module-level constants are loaded from env at import time; set them
# explicitly so unit tests can import without the full env contract.
drift.SENTINEL_JOB = "all-required"
drift.CI_WORKFLOW_PATH = ".gitea/workflows/ci.yml"
drift.AUDIT_WORKFLOW_PATH = ".gitea/workflows/audit-force-merge.yml"
# ---------------------------------------------------------------------------
# Helper fixtures
# ---------------------------------------------------------------------------
def _make_ci_doc(jobs: dict) -> dict:
return {"jobs": jobs}
def _make_audit_doc(required_checks: list[str]) -> dict:
return {
"jobs": {
"audit": {
"steps": [
{"env": {"REQUIRED_CHECKS": "\n".join(required_checks)}}
]
}
}
}
# ---------------------------------------------------------------------------
# sentinel_needs
# ---------------------------------------------------------------------------
def test_sentinel_needs_returns_empty_when_absent():
doc = _make_ci_doc({"all-required": {"runs-on": "ubuntu-latest"}})
assert drift.sentinel_needs(doc) == set()
def test_sentinel_needs_parses_list():
doc = _make_ci_doc(
{"all-required": {"needs": ["platform-build", "canvas-build"]}}
)
assert drift.sentinel_needs(doc) == {"platform-build", "canvas-build"}
def test_sentinel_needs_parses_string():
doc = _make_ci_doc({"all-required": {"needs": "platform-build"}})
assert drift.sentinel_needs(doc) == {"platform-build"}
# ---------------------------------------------------------------------------
# ci_job_names / ci_jobs_all
# ---------------------------------------------------------------------------
def test_ci_job_names_excludes_sentinel_and_event_gated():
doc = _make_ci_doc(
{
"platform-build": {},
"canvas-build": {"if": "github.event_name == 'pull_request'"},
"main-push": {"if": "github.ref == 'refs/heads/main'"},
"all-required": {},
}
)
assert drift.ci_job_names(doc) == {"platform-build"}
def test_ci_jobs_all_includes_event_gated():
doc = _make_ci_doc(
{
"platform-build": {},
"canvas-build": {"if": "github.event_name == 'pull_request'"},
"all-required": {},
}
)
assert drift.ci_jobs_all(doc) == {"platform-build", "canvas-build"}
# ---------------------------------------------------------------------------
# detect_drift — F1 / F1b with mocked I/O
# ---------------------------------------------------------------------------
SAMPLE_PROTECTION = {
"status_check_contexts": [
"CI / all-required (pull_request)",
"Secret scan / Scan diff for credential-shaped strings (pull_request)",
]
}
def test_detect_drift_no_needs_sentinel_skips_f1():
"""Post-#1766 contract: all-required has no needs: → F1 is a false positive."""
ci = _make_ci_doc(
{
"platform-build": {},
"canvas-build": {},
"all-required": {},
}
)
audit = _make_audit_doc(
[
"CI / all-required (pull_request)",
"Secret scan / Scan diff for credential-shaped strings (pull_request)",
]
)
with patch.object(drift, "load_yaml", side_effect=[ci, audit]):
with patch.object(drift, "api", return_value=(200, SAMPLE_PROTECTION)):
findings, debug = drift.detect_drift("main")
assert findings == []
assert debug["sentinel_needs"] == []
def test_detect_drift_typo_in_needs_triggers_f1b():
"""F1b still catches typos when needs exists."""
ci = _make_ci_doc(
{
"platform-build": {},
"all-required": {"needs": ["platfom-build"]}, # typo
}
)
audit = _make_audit_doc(["CI / all-required (pull_request)"])
with patch.object(drift, "load_yaml", side_effect=[ci, audit]):
with patch.object(drift, "api", return_value=(200, SAMPLE_PROTECTION)):
findings, _ = drift.detect_drift("main")
assert any("F1b" in f for f in findings)
assert any("platfom-build" in f for f in findings)
def test_detect_drift_missing_job_in_needs_triggers_f1():
"""F1 still fires when needs is non-empty and jobs are missing."""
ci = _make_ci_doc(
{
"platform-build": {},
"canvas-build": {},
"all-required": {"needs": ["platform-build"]},
}
)
audit = _make_audit_doc(["CI / all-required (pull_request)"])
with patch.object(drift, "load_yaml", side_effect=[ci, audit]):
with patch.object(drift, "api", return_value=(200, SAMPLE_PROTECTION)):
findings, _ = drift.detect_drift("main")
assert any("F1 —" in f for f in findings)
assert any("canvas-build" in f for f in findings)
assert not any("F1b" in f for f in findings)
def test_detect_drift_no_f1_when_needs_empty_even_with_jobs():
"""Explicit regression guard: empty needs + existing jobs = no F1."""
ci = _make_ci_doc(
{
"platform-build": {},
"canvas-build": {},
"all-required": {"needs": []},
}
)
audit = _make_audit_doc(["CI / all-required (pull_request)"])
with patch.object(drift, "load_yaml", side_effect=[ci, audit]):
with patch.object(drift, "api", return_value=(200, SAMPLE_PROTECTION)):
findings, _ = drift.detect_drift("main")
assert not any("F1 —" in f for f in findings)
@@ -1,110 +0,0 @@
from pathlib import Path
import yaml
ROOT = Path(__file__).resolve().parents[2]
def load_workflow(name: str) -> dict:
with (ROOT / "workflows" / name).open() as f:
return yaml.safe_load(f)
def _all_required(workflow: dict) -> dict:
return workflow["jobs"]["all-required"]
def test_all_required_uses_dedicated_meta_runner_lane():
workflow = load_workflow("ci.yml")
all_required = _all_required(workflow)
# Stays on the dedicated `ci-meta` lane (the sentinel does no docker
# work, so it must NOT occupy the general docker-host pool).
assert all_required["runs-on"] == "ci-meta"
def test_all_required_is_needs_aggregator_not_a_polling_gate():
"""fix/ci-scheduler-fanout (2026-06-01): the sentinel was converted
from a status-polling loop (which squatted a ci-meta executor slot for
up to 40 min per PR) into a plain `needs:` aggregator that frees the
slot immediately. Pin the new shape so a regression to the poller is
caught.
"""
workflow = load_workflow("ci.yml")
all_required = _all_required(workflow)
rendered = str(all_required)
# The job MUST aggregate via `needs:` (the slot-freeing design).
assert "needs" in all_required, "all-required must be a needs: aggregator"
# It MUST NOT reintroduce the polling loop / per-SHA status fetch that
# was the throughput sink.
assert "detect-changes.py" not in rendered, (
"all-required must not run the detect-changes poller path"
)
assert "commits/" not in rendered and "statuses" not in rendered, (
"all-required must not poll commit statuses (the slot-squat path)"
)
def test_all_required_does_not_use_if_always():
"""Plain `needs:` works on Gitea 1.22.6 / act_runner v0.6.1; `needs:` +
`if: always()` is BROKEN (feedback_gitea_needs_works_only_ifalways_broken)
and would let a non-success need pass the gate. The sentinel must use
plain `needs:` WITHOUT a job-level `if: always()`.
"""
workflow = load_workflow("ci.yml")
all_required = _all_required(workflow)
job_if = all_required.get("if")
assert not (isinstance(job_if, str) and "always()" in job_if), (
"all-required must not combine needs: with if: always()"
)
def test_all_required_needs_matches_ci_required_drift_f1_set():
"""The sentinel `needs:` list MUST equal ci-required-drift.py's
`ci_job_names()` set: every job MINUS the sentinel itself MINUS jobs
whose `if:` gates on github.event_name/github.ref (event-gated jobs
skip on PRs and a `needs:` on a skipped job would never let the
sentinel run). If they diverge, ci-required-drift F1 fires.
"""
workflow = load_workflow("ci.yml")
jobs = workflow["jobs"]
sentinel = "all-required"
expected = set()
for key, body in jobs.items():
if key == sentinel:
continue
gate = body.get("if") if isinstance(body, dict) else None
if isinstance(gate, str) and (
"github.event_name" in gate or "github.ref" in gate
):
# event-gated → legitimately skips on some triggers; excluded
# from both `needs:` and the F1 set.
continue
expected.add(key)
needs = jobs[sentinel].get("needs", [])
if isinstance(needs, str):
needs = [needs]
actual = set(needs)
assert actual == expected, (
f"all-required needs: {sorted(actual)} != ci_job_names() "
f"{sorted(expected)} — ci-required-drift F1 would fire"
)
def test_all_required_needs_reference_real_jobs():
"""F1b guard: every entry in `needs:` must name an existing job."""
workflow = load_workflow("ci.yml")
jobs = workflow["jobs"]
needs = jobs["all-required"].get("needs", [])
if isinstance(needs, str):
needs = [needs]
job_keys = set(jobs)
for dep in needs:
assert dep in job_keys, f"all-required needs unknown job {dep!r}"
@@ -2,6 +2,7 @@ import importlib.util
import sys
from pathlib import Path
SCRIPT = Path(__file__).resolve().parents[1] / "gitea-merge-queue.py"
spec = importlib.util.spec_from_file_location("gitea_merge_queue", SCRIPT)
mq = importlib.util.module_from_spec(spec)
@@ -15,6 +15,7 @@ Mirrors the pattern in scripts/ops/test_check_migration_collisions.py
from __future__ import annotations
import importlib.util
import os
import sys
import unittest
from pathlib import Path
@@ -1,283 +0,0 @@
import importlib.util
import sys
from pathlib import Path
from unittest.mock import patch, MagicMock
SCRIPT = Path(__file__).resolve().parents[1] / "main-red-watchdog.py"
spec = importlib.util.spec_from_file_location("main_red_watchdog", SCRIPT)
wd = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = wd
spec.loader.exec_module(wd)
# Module-level constants are loaded from env at import time; set them
# explicitly so unit tests can import without the full env contract.
wd.GITEA_TOKEN = "fake-token"
wd.GITEA_HOST = "git.example.com"
wd.REPO = "molecule-ai/molecule-core"
wd.OWNER = "molecule-ai"
wd.NAME = "molecule-core"
wd.WATCH_BRANCH = "main"
wd.RED_LABEL = "tier:high"
wd.API = "https://git.example.com/api/v1"
# ---------------------------------------------------------------------------
# _is_scheduled_context
# ---------------------------------------------------------------------------
def test_is_scheduled_context_matches_staging_saas_smoke():
assert wd._is_scheduled_context("Staging SaaS smoke") is True
def test_is_scheduled_context_matches_case_insensitive():
assert wd._is_scheduled_context("continuous synthetic e2e") is True
def test_is_scheduled_context_no_match_for_required_ci():
assert wd._is_scheduled_context("CI / all-required") is False
# ---------------------------------------------------------------------------
# _entry_state
# ---------------------------------------------------------------------------
def test_entry_state_prefers_status_over_state():
"""Gitea 1.22.6 per-entry key is `status`; `state` is fallback."""
assert wd._entry_state({"status": "failure", "state": "success"}) == "failure"
def test_entry_state_falls_back_to_state():
assert wd._entry_state({"state": "pending"}) == "pending"
def test_entry_state_empty_when_neither_key_present():
assert wd._entry_state({"context": "foo"}) == ""
# ---------------------------------------------------------------------------
# is_red
# ---------------------------------------------------------------------------
def test_is_red_combined_failure_no_statuses():
"""Combined failure with empty statuses[] still trips red."""
red, failed = wd.is_red({"state": "failure", "statuses": []})
assert red is True
assert failed == []
def test_is_red_cancel_cascade_filtered():
"""status=3 (cancelled) mapped to failure string must be filtered."""
status = {
"state": "failure",
"statuses": [
{"context": "CI / build", "status": "failure", "description": "Has been cancelled"},
],
}
red, failed = wd.is_red(status)
assert red is False
assert failed == []
def test_is_red_real_failure_not_filtered():
"""Real failures with different descriptions are kept."""
status = {
"state": "failure",
"statuses": [
{"context": "CI / build", "status": "failure", "description": "Failing after 12s"},
],
}
red, failed = wd.is_red(status)
assert red is True
assert len(failed) == 1
assert failed[0]["context"] == "CI / build"
def test_is_red_uses_entry_state_not_top_level_state():
"""Regression: per-entry key is `status`, not `state`."""
status = {
"state": "failure",
"statuses": [
# Only `status` present; pre-rev4 code read `state` and got None
{"context": "CI / test", "status": "failure"},
],
}
red, failed = wd.is_red(status)
assert red is True
assert len(failed) == 1
# ---------------------------------------------------------------------------
# list_open_red_issues — pagination (mc#1789)
# ---------------------------------------------------------------------------
def test_list_open_red_issues_exhausts_pagination():
"""Backlog can exceed 50 issues; all pages must be fetched."""
calls = []
def fake_api(method, path, **kwargs):
calls.append((method, path, kwargs))
query = (kwargs.get("query") or {})
page = int(query.get("page", "1"))
limit = int(query.get("limit", "50"))
# Page 1 returns full limit; page 2 returns partial → break
if page == 1:
return 200, [
{"title": f"[main-red] molecule-ai/molecule-core: sha{i:04d}"}
for i in range(limit)
]
if page == 2:
return 200, [
{"title": "[main-red] molecule-ai/molecule-core: extra1"},
{"title": "[main-red] molecule-ai/molecule-core: extra2"},
{"title": " unrelated issue "}, # filtered out
]
return 200, []
with patch.object(wd, "api", side_effect=fake_api):
issues = wd.list_open_red_issues()
assert len(issues) == 52 # 50 + 2 matched
titles = {i["title"] for i in issues}
assert "[main-red] molecule-ai/molecule-core: extra1" in titles
assert "[main-red] molecule-ai/molecule-core: extra2" in titles
def test_list_open_red_issues_single_page():
"""When results < limit, loop breaks after first page."""
def fake_api(method, path, **kwargs):
return 200, [
{"title": "[main-red] molecule-ai/molecule-core: abc123"},
]
with patch.object(wd, "api", side_effect=fake_api):
issues = wd.list_open_red_issues()
assert len(issues) == 1
# ---------------------------------------------------------------------------
# run_once — close logic (mc#1789)
# ---------------------------------------------------------------------------
def test_run_once_green_closes_stale_issues(monkeypatch):
"""Combined success → close stale issues."""
monkeypatch.setattr(wd, "get_head_sha", lambda b: "abc123")
monkeypatch.setattr(wd, "get_combined_status", lambda s: {"state": "success", "statuses": []})
monkeypatch.setattr(wd, "is_red", lambda s: (False, []))
closed = []
def capture_close(current_sha, *, dry_run=False, close_same_sha=False):
closed.append(current_sha)
return 1
monkeypatch.setattr(wd, "close_open_red_issues_for_other_shas", capture_close)
monkeypatch.setattr(wd, "emit_loki_event", lambda *a, **k: None)
assert wd.run_once(dry_run=True) == 0
assert closed == ["abc123"]
def test_run_once_pending_scheduled_only_closes_stale_issues(monkeypatch):
"""Combined pending, but only scheduled contexts pending → close stale."""
monkeypatch.setattr(wd, "get_head_sha", lambda b: "abc123")
monkeypatch.setattr(
wd, "get_combined_status",
lambda s: {
"state": "pending",
"statuses": [
{"context": "CI / all-required", "status": "success"},
{"context": "Staging SaaS smoke", "status": "pending"},
],
}
)
monkeypatch.setattr(wd, "is_red", lambda s: (False, []))
closed = []
def capture_close(current_sha, *, dry_run=False, close_same_sha=False):
closed.append(current_sha)
return 1
monkeypatch.setattr(wd, "close_open_red_issues_for_other_shas", capture_close)
monkeypatch.setattr(wd, "emit_loki_event", lambda *a, **k: None)
assert wd.run_once(dry_run=True) == 0
assert closed == ["abc123"]
def test_run_once_pending_required_does_not_close(monkeypatch):
"""Combined pending with a real required context still pending → no close."""
monkeypatch.setattr(wd, "get_head_sha", lambda b: "abc123")
monkeypatch.setattr(
wd, "get_combined_status",
lambda s: {
"state": "pending",
"statuses": [
{"context": "CI / all-required", "status": "pending"},
{"context": "Staging SaaS smoke", "status": "success"},
],
}
)
monkeypatch.setattr(wd, "is_red", lambda s: (False, []))
closed = []
def capture_close(current_sha, *, dry_run=False, close_same_sha=False):
closed.append(current_sha)
return 0
monkeypatch.setattr(wd, "close_open_red_issues_for_other_shas", capture_close)
monkeypatch.setattr(wd, "emit_loki_event", lambda *a, **k: None)
assert wd.run_once(dry_run=True) == 0
assert closed == []
def test_run_once_failure_does_not_close(monkeypatch):
"""Real failure in non-scheduled context → no close."""
monkeypatch.setattr(wd, "get_head_sha", lambda b: "abc123")
monkeypatch.setattr(
wd, "get_combined_status",
lambda s: {
"state": "failure",
"statuses": [
{"context": "CI / all-required", "status": "failure"},
],
}
)
# is_red will return True, so we enter the red path, not the green close path
monkeypatch.setattr(wd, "is_red", lambda s: (True, s.get("statuses", [])))
monkeypatch.setattr(wd, "time", MagicMock(sleep=lambda x: None))
monkeypatch.setattr(wd, "emit_loki_event", lambda *a, **k: None)
filed = []
def capture_file(sha, failed, debug, *, dry_run=False):
filed.append(sha)
monkeypatch.setattr(wd, "file_or_update_red", capture_file)
monkeypatch.setattr(wd, "close_open_red_issues_for_other_shas", lambda *a, **k: 0)
monkeypatch.setattr(wd, "close_stale_red_issues", lambda *a, **k: 0)
assert wd.run_once(dry_run=True) == 0
assert filed == ["abc123"]
# ---------------------------------------------------------------------------
# title_for / find_open_issue_for_sha
# ---------------------------------------------------------------------------
def test_title_for_uses_short_sha():
assert wd.title_for("abcdef123456") == "[main-red] molecule-ai/molecule-core: abcdef1234"
def test_find_open_issue_for_sha_matches_exact_title(monkeypatch):
fake_issue = {"title": "[main-red] molecule-ai/molecule-core: abc1234567", "number": 42}
monkeypatch.setattr(wd, "list_open_red_issues", lambda: [fake_issue])
assert wd.find_open_issue_for_sha("abc1234567") == fake_issue
def test_find_open_issue_for_sha_returns_none_when_no_match(monkeypatch):
monkeypatch.setattr(wd, "list_open_red_issues", lambda: [])
assert wd.find_open_issue_for_sha("abc123") is None
@@ -146,343 +146,3 @@ def test_context_is_terminal_failure_rejects_cancelled_and_skipped():
assert prod.context_is_terminal_failure(state) is True
for state in ("pending", "missing", "success"):
assert prod.context_is_terminal_failure(state) is False
def test_default_required_contexts_delegate_path_gating_to_all_required():
assert prod.required_contexts({}) == [
"CI / all-required (push)",
"Secret scan / Scan diff for credential-shaped strings (push)",
]
def test_slugs_from_redeploy_response_uses_controlplane_plan_rows():
body = {
"results": [
{"slug": "hongming", "phase": "canary", "ssm_status": "DryRun"},
{"slug": "tenant-a", "phase": "batch-1", "ssm_status": "DryRun"},
{"slug": "", "phase": "batch-1", "ssm_status": "DryRun"},
{"phase": "batch-1", "ssm_status": "DryRun"},
]
}
assert prod.slugs_from_redeploy_response(body) == ["hongming", "tenant-a"]
def test_plan_rollout_slugs_asks_controlplane_for_dry_run_plan():
calls = []
def fake_redeploy(_cp_url, _token, body):
calls.append(body)
return 200, {
"ok": True,
"results": [
{"slug": "hongming", "phase": "canary", "ssm_status": "DryRun"},
{"slug": "tenant-a", "phase": "batch-1", "ssm_status": "DryRun"},
],
}
slugs = prod.plan_rollout_slugs(
"https://api.moleculesai.app",
"secret",
{
"target_tag": "staging-abcdef1",
"canary_slug": "hongming",
"soak_seconds": 60,
"batch_size": 3,
"dry_run": False,
"confirm": True,
},
redeploy=fake_redeploy,
)
assert slugs == ["hongming", "tenant-a"]
assert calls == [
{
"target_tag": "staging-abcdef1",
"canary_slug": "hongming",
"soak_seconds": 60,
"batch_size": 3,
"dry_run": True,
"confirm": True,
}
]
def test_scoped_redeploy_body_removes_canary_and_local_soak():
base = {
"target_tag": "staging-abcdef1",
"canary_slug": "hongming",
"soak_seconds": 60,
"batch_size": 3,
"dry_run": False,
"confirm": True,
}
scoped = prod.scoped_redeploy_body(base, ["tenant-a", "tenant-b"])
assert scoped == {
"target_tag": "staging-abcdef1",
"soak_seconds": 0,
"batch_size": 2,
"dry_run": False,
"confirm": True,
"only_slugs": ["tenant-a", "tenant-b"],
}
def test_plan_scoped_rollout_preserves_canary_then_batches():
calls, sleeps = [], []
def fake_list(_cp_url, _token, _body):
return ["tenant-a", "hongming", "tenant-b", "tenant-c"]
def fake_redeploy(_cp_url, _token, body):
calls.append(body)
return 200, {
"ok": True,
"results": [{"slug": slug, "healthz_ok": True} for slug in body["only_slugs"]],
}
aggregate = prod.execute_scoped_rollout(
{
"cp_url": "https://api.moleculesai.app",
"body": {
"target_tag": "staging-abcdef1",
"canary_slug": "hongming",
"soak_seconds": 60,
"batch_size": 2,
"dry_run": False,
"confirm": True,
},
},
token="secret",
list_slugs=fake_list,
redeploy=fake_redeploy,
sleep=sleeps.append,
)
assert [call["only_slugs"] for call in calls] == [
["hongming"],
["tenant-a", "tenant-b"],
["tenant-c"],
]
assert sleeps == [60]
assert aggregate["ok"] is True
assert [result["slug"] for result in aggregate["results"]] == [
"hongming",
"tenant-a",
"tenant-b",
"tenant-c",
]
def test_scoped_rollout_halts_after_failed_canary():
calls = []
def fake_redeploy(_cp_url, _token, body):
calls.append(body)
return 200, {"ok": False, "results": [{"slug": body["only_slugs"][0], "error": "bad"}]}
try:
prod.execute_scoped_rollout(
{
"cp_url": "https://api.moleculesai.app",
"body": {
"target_tag": "staging-abcdef1",
"canary_slug": "hongming",
"soak_seconds": 60,
"batch_size": 2,
"dry_run": False,
"confirm": True,
},
},
token="secret",
list_slugs=lambda _cp_url, _token, _body: ["hongming", "tenant-a"],
redeploy=fake_redeploy,
sleep=lambda _seconds: None,
)
except prod.RolloutFailed as exc:
assert "redeploy scoped call failed" in str(exc)
assert exc.response["ok"] is False
assert exc.response["results"] == [{"slug": "hongming", "error": "bad"}]
else:
raise AssertionError("expected failed canary to halt rollout")
assert [call["only_slugs"] for call in calls] == [["hongming"]]
def test_rollout_from_plan_file_writes_partial_response_on_failure(tmp_path):
plan_path = tmp_path / "plan.json"
response_path = tmp_path / "response.json"
plan_path.write_text(
"""
{
"enabled": true,
"cp_url": "https://api.moleculesai.app",
"body": {"target_tag": "staging-abcdef1", "confirm": true}
}
""",
encoding="utf-8",
)
original = prod.execute_scoped_rollout
def fake_execute(_plan, _token):
raise prod.RolloutFailed(
"redeploy scoped call failed for hongming: HTTP 500, ok=false",
{
"ok": False,
"error": "redeploy scoped call failed for hongming: HTTP 500, ok=false",
"results": [{"slug": "hongming", "error": "bad"}],
},
)
prod.execute_scoped_rollout = fake_execute
try:
try:
prod.rollout_from_plan_file(
str(plan_path),
str(response_path),
{"CP_ADMIN_API_TOKEN": "secret"},
)
except prod.RolloutFailed:
pass
else:
raise AssertionError("expected rollout failure")
finally:
prod.execute_scoped_rollout = original
assert response_path.read_text(encoding="utf-8").strip()
assert '"ok": false' in response_path.read_text(encoding="utf-8")
assert '"slug": "hongming"' in response_path.read_text(encoding="utf-8")
# ──────────────────────────────────────────────────────────────────────
# No-silent-skip coverage gate (internal#724)
# ──────────────────────────────────────────────────────────────────────
def test_rollout_stragglers_flags_tenant_not_on_target():
# b SSM-succeeded but its container is on the old tag → straggler.
stragglers = prod.rollout_stragglers(
["a", "b", "c"],
[
{"slug": "a", "verified_on_target": True},
{"slug": "b", "verified_on_target": False, "running_image": "platform-tenant:staging-old"},
{"slug": "c", "verified_on_target": True},
],
)
assert stragglers == ["b"]
def test_rollout_stragglers_flags_enumerated_tenant_with_no_result():
# agents-team class: enumerated but no batch ever produced a row for it.
stragglers = prod.rollout_stragglers(
["a", "agents-team"],
[{"slug": "a", "verified_on_target": True}],
)
assert stragglers == ["agents-team"]
def test_rollout_stragglers_missing_key_is_backward_compatible():
# Older CP without verified_on_target → treat as verified (no spurious fail).
stragglers = prod.rollout_stragglers(
["a", "b"],
[{"slug": "a", "healthz_ok": True}, {"slug": "b", "healthz_ok": True}],
)
assert stragglers == []
def test_rollout_stragglers_ignores_dry_run_rows():
stragglers = prod.rollout_stragglers(
["a"], [{"slug": "a", "ssm_status": "DryRun"}]
)
# dry-run row is skipped, so "a" has no verifying row → straggler.
assert stragglers == ["a"]
def test_scoped_rollout_fails_when_a_tenant_stays_on_old_tag():
# Every per-tenant call returns ok=True, but agents-team is NOT
# verified_on_target. The rollout must still fail loudly — this is
# the exact "reported success, one tenant silently skipped" bug.
def fake_redeploy(_cp_url, _token, body):
rows = []
for slug in body["only_slugs"]:
rows.append({"slug": slug, "verified_on_target": slug != "agents-team"})
return 200, {"ok": True, "results": rows}
try:
prod.execute_scoped_rollout(
{
"cp_url": "https://api.moleculesai.app",
"body": {
"target_tag": "staging-new",
"batch_size": 5,
"dry_run": False,
"confirm": True,
},
},
token="secret",
list_slugs=lambda _u, _t, _b: ["reno-stars", "agents-team", "hongming"],
redeploy=fake_redeploy,
sleep=lambda _s: None,
)
except prod.RolloutFailed as exc:
assert "incomplete rollout" in str(exc)
assert exc.response["stragglers"] == ["agents-team"]
assert exc.response["ok"] is False
else:
raise AssertionError("expected an incomplete rollout to fail loudly")
def test_scoped_rollout_passes_when_all_tenants_verified_on_target():
def fake_redeploy(_cp_url, _token, body):
return 200, {
"ok": True,
"results": [{"slug": s, "verified_on_target": True} for s in body["only_slugs"]],
}
aggregate = prod.execute_scoped_rollout(
{
"cp_url": "https://api.moleculesai.app",
"body": {
"target_tag": "staging-new",
"batch_size": 5,
"dry_run": False,
"confirm": True,
},
},
token="secret",
list_slugs=lambda _u, _t, _b: ["reno-stars", "agents-team", "hongming"],
redeploy=fake_redeploy,
sleep=lambda _s: None,
)
assert aggregate["ok"] is True
assert "stragglers" not in aggregate
def test_scoped_rollout_dry_run_does_not_assert_coverage():
# A dry run proves nothing landed; coverage must NOT be asserted or
# every plan would fail.
def fake_redeploy(_cp_url, _token, body):
return 200, {
"ok": True,
"results": [{"slug": s, "ssm_status": "DryRun"} for s in body["only_slugs"]],
}
aggregate = prod.execute_scoped_rollout(
{
"cp_url": "https://api.moleculesai.app",
"body": {
"target_tag": "staging-new",
"batch_size": 5,
"dry_run": True,
"confirm": True,
},
},
token="secret",
list_slugs=lambda _u, _t, _b: ["a", "b"],
redeploy=fake_redeploy,
sleep=lambda _s: None,
)
assert aggregate["ok"] is True
+3 -16
View File
@@ -1,5 +1,4 @@
#!/usr/bin/env bash
# shellcheck disable=SC2034
# Regression tests for .gitea/scripts/review-check.sh (RFC#324 Step 1).
#
# Covers:
@@ -17,7 +16,6 @@
# T12 — jq filter: non-author APPROVED → in candidate list; dismissed → excluded
# T13 — missing required env GITEA_TOKEN → exits 1 with error
# T14 — non-default-base PR exits 0 without requiring review
# T18 — wrong-team review candidate does not block right-team comment approval
#
# Hostile-self-review (per feedback_assert_exact_not_substring):
# this test MUST FAIL if the script is absent. Verified by running
@@ -140,7 +138,7 @@ fi
echo
echo "== T13 missing GITEA_TOKEN =="
set +e
T13_OUT=$(PATH="/tmp:$PATH" GITEA_TOKEN='' GITEA_HOST=git.example.com REPO=x/y PR_NUMBER=1 TEAM=qa TEAM_ID=1 bash "$SCRIPT" 2>&1 || true)
T13_OUT=$(PATH="/tmp:$PATH" GITEA_TOKEN= GITEA_HOST=git.example.com REPO=x/y PR_NUMBER=1 TEAM=qa TEAM_ID=1 bash "$SCRIPT" 2>&1 || true)
set -e
assert_contains "T13 exits non-zero when GITEA_TOKEN missing" "GITEA_TOKEN required" "$T13_OUT"
@@ -308,12 +306,12 @@ echo
echo "== T10 CURL_AUTH_FILE =="
# Verify the token-file logic directly: create a temp file with the
# same mktemp pattern, write the header with printf, chmod 600, then assert.
T10_TOKEN="secret-fixture-token-abc123"
T10_TOKEN="secret-test-token-abc123"
T10_AUTHFILE=$(mktemp "${TMPDIR:-/tmp}/curl-auth.test.XXXXXX")
chmod 600 "$T10_AUTHFILE"
printf 'header = "Authorization: token %s"\n' "$T10_TOKEN" > "$T10_AUTHFILE"
assert_file_mode "T10a mktemp authfile mode 600 (CURL_AUTH_FILE pattern)" "$T10_AUTHFILE" "600"
assert_file_contains "T10b printf header format (CURL_AUTH_FILE content)" "$T10_AUTHFILE" "Authorization: token secret-fixture-token-abc123"
assert_file_contains "T10b printf header format (CURL_AUTH_FILE content)" "$T10_AUTHFILE" "Authorization: token secret-test-token-abc123"
assert_file_contains "T10c 'header =' curl-config syntax" "$T10_AUTHFILE" 'header = "Authorization: token '
rm -f "$T10_AUTHFILE"
@@ -361,17 +359,6 @@ T17_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T17 exit code 1 (no candidates from comments)" "1" "$T17_RC"
assert_contains "T17 no candidates error" "no candidates from reviews API or issue comments" "$T17_OUT"
# T18 — a wrong-team PR review candidate must not suppress a right-team
# comment approval. This matches PR #1790, where QA had an APPROVED review
# and security approved via the agent comment convention.
echo
echo "== T18 review candidate wrong team, comment candidate right team =="
T18_OUT=$(run_review_check "T18_review_wrong_team_comment_right_team")
T18_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T18 exit code 0 (comment approval still considered)" "0" "$T18_RC"
assert_contains "T18 comment candidate notice" "comment-based approval" "$T18_OUT"
assert_contains "T18 comment approver accepted" "APPROVED by core-qa-agent" "$T18_OUT"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL"
@@ -22,6 +22,7 @@ from __future__ import annotations
import os
import sys
import tempfile
import unittest
# Resolve sibling script regardless of where pytest is invoked from.
@@ -14,7 +14,7 @@ def load_reaper():
assert spec.loader is not None
spec.loader.exec_module(mod)
mod.API = "https://git.example.test/api/v1"
mod.GITEA_TOKEN = "fixture-token"
mod.GITEA_TOKEN = "test-token"
mod.API_TIMEOUT_SEC = 1
mod.API_RETRIES = 3
mod.API_RETRY_SLEEP_SEC = 0
+5 -18
View File
@@ -47,25 +47,12 @@ jobs:
REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }}
# Required-status-check contexts to evaluate at merge time.
# Branch-aware JSON dict: keys are protected branch names,
# values are arrays of context names that branch protection
# requires for that branch. Mirror this against branch
# protection (settings → branches → protected branch →
# required checks) for each branch listed here.
#
# Newline-separated. Mirror this against branch protection
# (settings → branches → protected branch → required checks).
# Declared here rather than fetched from /branch_protections
# because that endpoint requires admin write — sop-tier-bot is
# read-only by design (least-privilege).
REQUIRED_CHECKS_JSON: |
{
"main": [
"CI / all-required (pull_request)",
"E2E API Smoke Test / E2E API Smoke Test (pull_request)",
"Handlers Postgres Integration / Handlers Postgres Integration (pull_request)"
],
"staging": [
"CI / all-required (pull_request)",
"sop-checklist / all-items-acked (pull_request)"
]
}
REQUIRED_CHECKS: |
CI / all-required (pull_request)
sop-checklist / all-items-acked (pull_request)
run: bash .gitea/scripts/audit-force-merge.sh
+1 -1
View File
@@ -37,7 +37,7 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -45,7 +45,7 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 5
steps:
+1 -1
View File
@@ -101,7 +101,7 @@ jobs:
# AND-set: only the Mac arm64 runner advertises macos-self-hosted.
# See "RUNNER TARGETING" header note for why bare self-hosted is unsafe.
runs-on: [self-hosted, macos-self-hosted]
# ADVISORY: never blocks. See safety contract point 3. mc#1982
# ADVISORY: never blocks. See safety contract point 3. mc#774
# internal#418 — tracked: arm64 advisory pilot, non-gating by design.
continue-on-error: true
# event_name gate: functional (only meaningful on push/PR) AND keeps
+106 -109
View File
@@ -106,7 +106,7 @@ jobs:
name: Platform (Go)
needs: changes
runs-on: ubuntu-latest
# mc#1982 (closed 2026-05-14): Phase 4 flip of the platform-build job.
# mc#774 (closed 2026-05-14): Phase 4 flip of the platform-build job.
# Phase 4 (#656) originally flipped this to continue-on-error: false based on
# Phase-3-masked "green on main 2026-05-12". Two failure classes then surfaced:
# (1) 4x delegation_test.go sqlmock gaps (PR #669 / #634 fix-forward, closed).
@@ -161,23 +161,15 @@ jobs:
echo "::group::pendinguploads exit=$pu_exit (last 100 lines)"
tail -100 /tmp/test-pu.log
echo "::endgroup::"
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Run tests with coverage (blocking gate)
# Removed -race from the blocking gate per #1184: cold runners
# take 13-25 min to compile with race instrumentation, exceeding
# the 10m step timeout and causing false failures. Race detection
# now runs as a non-blocking advisory step below.
run: go test -timeout 10m -coverprofile=coverage.out ./...
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Race detection (advisory, non-blocking)
# mc#1184: runs race detector as an advisory check so cold-runner
# compile-time spikes don't block merges. Failures here surface in
# the run log but do not fail the build.
run: go test -race -timeout 10m ./...
continue-on-error: true
name: Run tests with race detection and coverage
# Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
# full ./... suite with race detection + coverage. A 10m per-step timeout
# lets the suite complete on cold cache (~5-7m) while failing cleanly
# instead of OOM-killing. The job-level timeout (15m) is a backstop.
run: go test -race -timeout 10m -coverprofile=coverage.out ./...
- if: ${{ needs.changes.outputs.platform == 'true' }}
name: Per-file coverage report
@@ -247,7 +239,7 @@ jobs:
# Strip the package-import prefix so we can match .coverage-allowlist.txt
# entries written as paths relative to workspace-server/.
# Handle both module paths: platform/workspace-server/... and platform/...
rel=$(echo "$file" | sed 's|^git.moleculesai.app/molecule-ai/molecule-core/workspace-server/workspace-server/||; s|^git.moleculesai.app/molecule-ai/molecule-core/workspace-server/||')
rel=$(echo "$file" | sed 's|^github.com/molecule-ai/molecule-monorepo/platform/workspace-server/||; s|^github.com/molecule-ai/molecule-monorepo/platform/||')
if echo "$ALLOWLIST" | grep -qxF "$rel"; then
echo "::warning file=workspace-server/$rel::Critical file at ${pct}% coverage (allowlisted, #1823) — fix before expiry."
@@ -357,14 +349,6 @@ jobs:
name: Run E2E bash unit tests (no live infra)
run: |
bash tests/e2e/test_model_slug.sh
# molecule-core#1995 (#1994 follow-on): fail-direction proof for
# the A2A real-completion + byok-routing assertion helpers
# (lib/completion_assert.sh). Offline (no LLM, no network): it
# asserts an error-as-text payload FAILS the real-completion gate
# — the exact trap the historical shape-only `"kind":"text"`
# check missed. If a refactor weakens the gate to a shape check,
# this step goes red on every PR.
bash tests/e2e/test_completion_assert_unit.sh
- if: ${{ needs.changes.outputs.scripts == 'true' }}
name: Test ECR promote-tenant-image script (mock-driven, no live infra)
@@ -392,7 +376,7 @@ jobs:
canvas-deploy-reminder:
name: Canvas Deploy Reminder
runs-on: docker-host
# mc#1982 root-fix: added job-level `if:` so ci-required-drift.py's
# mc#774 root-fix: added job-level `if:` so ci-required-drift.py's
# ci_job_names() detects this as github.ref-gated and skips it from F1.
# The step-level exit 0 handles the "not main push" case; the job-level
# `if:` makes the gating explicit so the drift script sees it.
@@ -475,10 +459,10 @@ jobs:
#
# Emits `CI / all-required (<event>)` where <event> is the workflow trigger
# (e.g. `CI / all-required (pull_request)`, `CI / all-required (push)`).
# Branch protection requires the event-suffixed name —
# Branch protection MUST be updated to require the event-suffixed name —
# requiring `CI / all-required` (bare, no suffix) silently blocks all merges
# because Gitea treats absent status contexts as pending (not skipped), and
# no workflow emits the bare name. BP requires
# no workflow emits the bare name. Fixed: BP now requires
# `CI / all-required (pull_request)` per issue #1473.
#
# Closes the failure mode where status_check_contexts on molecule-core/main
@@ -487,91 +471,104 @@ jobs:
# red silently merged through. See internal#286 for the three concrete
# tonight-of-2026-05-11 incidents that prompted the emergency bump.
#
# ── 2026-06-01 CI-scheduler-overload fix (fix/ci-scheduler-fanout) ──
# PREVIOUS shape: a poll-gate that ran detect-changes then LOOPED on
# `GET /commits/{sha}/statuses` every 15s for up to 40 min, occupying a
# `ci-meta` executor slot the entire time it waited for upstream jobs.
# With only 2 ci-meta runners, that poll-loop squatted half the lane on
# every PR — a confirmed throughput sink in the live RCA (two concurrent
# `JOB-all-required` containers observed pinning the lane). The polling
# design existed only to dodge the Gitea `needs:` + `if: always()` bug,
# where an always()-guarded sentinel could be marked skipped before
# upstream jobs settled (leaving BP pending forever).
# This job deliberately has no `needs:`. Gitea 1.22/act_runner can mark a
# job-level `if: always()` + `needs:` sentinel as skipped before upstream
# jobs settle, leaving branch protection with a permanent pending
# `CI / all-required` context. Instead, this independent sentinel polls the
# required commit-status contexts for this SHA and fails if any fail, skip,
# or never emit.
#
# NEW shape: a plain `needs:` aggregator with NO polling loop. This is
# safe here — and was NOT safe at the time the poller was written —
# because every aggregated CI job now gates its real work PER-STEP
# (`if: needs.changes.outputs.* != 'true'`) rather than at the JOB level.
# A per-step-gated job always reaches a terminal SUCCESS (it no-ops its
# expensive steps but the job itself still completes), so it is never
# `skipped`. Plain `needs:` (WITHOUT `if: always()`) works correctly on
# Gitea 1.22.6 / act_runner v0.6.1 — only `needs:` + `if: always()` is
# broken (feedback_gitea_needs_works_only_ifalways_broken). We therefore
# use plain `needs:` + an explicit per-need result check (NOT
# `if: always()`); if any need fails/errors, Gitea never starts this job
# and BP sees `CI / all-required` go red via the failed dependency
# propagation — exactly the gate we want, with zero runner-squat.
# canvas-deploy-reminder is intentionally NOT included in all-required.needs.
# It is an informational main-push reminder, not a PR quality gate. Keeping
# it in this dependency list lets a skipped reminder skip the required
# sentinel before the `always()` guard can emit a branch-protection status.
#
# The `needs:` list MUST stay in lockstep with ci-required-drift.py's
# F1 check (`ci_job_names()` = every job MINUS the sentinel MINUS jobs
# whose `if:` gates on github.event_name/github.ref). canvas-deploy-
# reminder is event-gated (`if: github.ref == refs/heads/{main,staging}`)
# so it is intentionally EXCLUDED — it skips on PRs and a `needs:` on a
# skipped job would never let the sentinel run. If a new always-running
# CI job is added, add it here too or ci-required-drift F1 will flag it.
#
# Stays on the dedicated `ci-meta` lane (no docker work, so the
# docker-host-pin lint does not apply), but now the job is sub-second:
# it only inspects already-settled `needs.*.result` values, so it frees
# the slot immediately instead of holding it for the whole CI duration.
#
needs:
- changes
- platform-build
- canvas-build
- shellcheck
- python-lint
continue-on-error: false
runs-on: ci-meta
timeout-minutes: 5
runs-on: ubuntu-latest
timeout-minutes: 45
steps:
- name: Verify all aggregated CI jobs succeeded
# NO polling, NO API call, NO checkout. Because this job lists the
# aggregated jobs under `needs:` (without `if: always()`), Gitea only
# starts it once every need has reached SUCCESS — a failed/errored
# need short-circuits the job and propagates red to the
# `CI / all-required` context. This explicit check is a
# belt-and-suspenders assertion + a readable run summary; the real
# gating is the `needs:` edge itself.
- name: Wait for required CI contexts
env:
CHANGES_RESULT: ${{ needs.changes.result }}
PLATFORM_RESULT: ${{ needs.platform-build.result }}
CANVAS_RESULT: ${{ needs.canvas-build.result }}
SHELLCHECK_RESULT: ${{ needs.shellcheck.result }}
PYTHON_LINT_RESULT: ${{ needs.python-lint.result }}
GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
API_ROOT: ${{ github.server_url }}/api/v1
REPOSITORY: ${{ github.repository }}
COMMIT_SHA: ${{ github.sha }}
EVENT_NAME: ${{ github.event_name }}
run: |
set -euo pipefail
fail=0
check() {
name="$1"; result="$2"
printf 'CI / %s = %s\n' "$name" "$result"
# `success` is the only green terminal state we accept. A plain
# `needs:` job is only started when all needs succeed, so reaching
# this step already implies success — but assert explicitly so a
# future `if: always()` reintroduction (which WOULD let non-success
# through) fails loudly instead of silently passing the gate.
if [ "$result" != "success" ]; then
echo "::error::aggregated CI job '${name}' did not succeed (result=${result})"
fail=1
fi
}
check "Detect changes" "$CHANGES_RESULT"
check "Platform (Go)" "$PLATFORM_RESULT"
check "Canvas (Next.js)" "$CANVAS_RESULT"
check "Shellcheck (E2E scripts)" "$SHELLCHECK_RESULT"
check "Python Lint & Test" "$PYTHON_LINT_RESULT"
if [ "$fail" -ne 0 ]; then
echo "::error::all-required: one or more aggregated CI jobs did not succeed"
exit 1
fi
echo "OK: all aggregated CI jobs succeeded — CI / all-required green."
python3 - <<'PY'
import json
import os
import sys
import time
import urllib.error
import urllib.request
token = os.environ["GITEA_TOKEN"]
api_root = os.environ["API_ROOT"].rstrip("/")
repo = os.environ["REPOSITORY"]
sha = os.environ["COMMIT_SHA"]
event = os.environ["EVENT_NAME"]
required = [
f"CI / Detect changes ({event})",
f"CI / Platform (Go) ({event})",
f"CI / Canvas (Next.js) ({event})",
f"CI / Shellcheck (E2E scripts) ({event})",
f"CI / Python Lint & Test ({event})",
]
terminal_bad = {"failure", "error"}
deadline = time.time() + 40 * 60
last_summary = None
def fetch_statuses():
statuses = []
for page in range(1, 6):
url = f"{api_root}/repos/{repo}/commits/{sha}/statuses?page={page}&limit=100"
req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
with urllib.request.urlopen(req, timeout=10) as resp:
chunk = json.load(resp)
if not chunk:
break
statuses.extend(chunk)
latest = {}
for item in statuses:
ctx = item.get("context")
if not ctx:
continue
prev = latest.get(ctx)
if prev is None or (item.get("updated_at") or item.get("created_at") or "") >= (prev.get("updated_at") or prev.get("created_at") or ""):
latest[ctx] = item
return latest
while True:
try:
latest = fetch_statuses()
except (TimeoutError, OSError, urllib.error.URLError) as exc:
if time.time() >= deadline:
print(f"FAIL: status polling did not recover before deadline: {exc}", file=sys.stderr)
sys.exit(1)
print(f"WARN: status poll failed, retrying: {exc}", flush=True)
time.sleep(15)
continue
states = {ctx: (latest.get(ctx) or {}).get("status") or (latest.get(ctx) or {}).get("state") or "missing" for ctx in required}
summary = ", ".join(f"{ctx}={state}" for ctx, state in states.items())
if summary != last_summary:
print(summary, flush=True)
last_summary = summary
bad = {ctx: state for ctx, state in states.items() if state in terminal_bad}
if bad:
print("FAIL: required CI context failed:", file=sys.stderr)
for ctx, state in bad.items():
desc = (latest.get(ctx) or {}).get("description") or ""
print(f" - {ctx}: {state} {desc}", file=sys.stderr)
sys.exit(1)
if all(state == "success" for state in states.values()):
print(f"OK: all {len(required)} required CI contexts succeeded")
sys.exit(0)
if time.time() >= deadline:
print("FAIL: timed out waiting for required CI contexts:", file=sys.stderr)
for ctx, state in states.items():
print(f" - {ctx}: {state}", file=sys.stderr)
sys.exit(1)
time.sleep(15)
PY
+1 -9
View File
@@ -102,7 +102,7 @@ jobs:
name: Synthetic E2E against staging
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# Bumped from 12 → 20 (2026-05-04). Tenant user-data install phase
# (apt-get update + install docker.io/jq/awscli/caddy + snap install
@@ -166,10 +166,6 @@ jobs:
# canary path. The script picks the right blob shape based on
# which key is non-empty.
E2E_OPENAI_API_KEY: ${{ secrets.MOLECULE_STAGING_OPENAI_API_KEY }}
# google-adk canary path — AI-Studio key (config model
# google_genai:gemini-2.5-pro). PROD disallows API keys (Vertex+ADC);
# the keyed path is CI-only. Dispatch with E2E_RUNTIME=google-adk.
E2E_GOOGLE_API_KEY: ${{ secrets.MOLECULE_STAGING_GOOGLE_API_KEY }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -221,10 +217,6 @@ jobs:
required_secret_name="MOLECULE_STAGING_OPENAI_API_KEY"
required_secret_value="${E2E_OPENAI_API_KEY:-}"
;;
google-adk)
required_secret_name="MOLECULE_STAGING_GOOGLE_API_KEY"
required_secret_value="${E2E_GOOGLE_API_KEY:-}"
;;
*)
echo "::warning::Unknown E2E_RUNTIME='${E2E_RUNTIME}' — skipping LLM-key check"
required_secret_name=""
+2 -2
View File
@@ -123,7 +123,7 @@ jobs:
# integration). See internal#512 for the class defect.
runs-on: docker-host
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
api: ${{ steps.decide.outputs.api }}
@@ -160,7 +160,7 @@ jobs:
# detect-changes for the full rationale.
runs-on: docker-host
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 15
env:
+2 -2
View File
@@ -48,7 +48,7 @@ jobs:
# defect.
runs-on: docker-host
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
chat: ${{ steps.decide.outputs.chat }}
@@ -112,7 +112,7 @@ jobs:
# Must land on operator-host Linux (docker-host).
runs-on: docker-host
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 15
env:
-242
View File
@@ -1,242 +0,0 @@
name: E2E Legacy Advisory
# Advisory lane for older/manual E2E scripts that are too broad or
# environment-dependent for required PR CI. This intentionally does not run on
# pull_request or push so it cannot block merges/deploys; scheduled/manual reds
# still surface drift in scripts that would otherwise only be shellchecked.
#
# Gitea 1.22.6 rejects workflow_dispatch.inputs, so keep dispatch input-free.
on:
schedule:
# Stagger after the staging smoke/canvas morning lanes.
- cron: '15 9 * * *'
workflow_dispatch:
concurrency:
group: e2e-legacy-advisory
cancel-in-progress: false
permissions:
contents: read
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
legacy-local-platform:
name: Legacy local-platform E2E
runs-on: docker-host
timeout-minutes: 45
env:
PG_CONTAINER: pg-e2e-legacy-${{ github.run_id }}-${{ github.run_attempt }}
REDIS_CONTAINER: redis-e2e-legacy-${{ github.run_id }}-${{ github.run_attempt }}
MOLECULE_ENV: development
BIND_ADDR: 127.0.0.1
MOLECULE_IN_DOCKER: "false"
A2A_TIMEOUT: "30"
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Prepare local platform dependencies
run: |
set -euo pipefail
docker pull postgres:16 >/dev/null
docker pull redis:7 >/dev/null
docker pull alpine:latest >/dev/null
docker network create molecule-core-net >/dev/null 2>&1 || true
- name: Start Postgres
run: |
set -euo pipefail
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker run -d --name "$PG_CONTAINER" \
-e POSTGRES_USER=dev -e POSTGRES_PASSWORD=dev -e POSTGRES_DB=molecule \
-p 0:5432 postgres:16 >/dev/null
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
if [ -z "$PG_PORT" ]; then
PG_PORT=$(docker port "$PG_CONTAINER" 5432/tcp | head -1 | awk -F: '{print $NF}')
fi
if [ -z "$PG_PORT" ]; then
echo "::error::Could not resolve host port for $PG_CONTAINER"
docker port "$PG_CONTAINER" 5432/tcp || true
docker logs "$PG_CONTAINER" || true
exit 1
fi
echo "DATABASE_URL=postgres://dev:dev@127.0.0.1:${PG_PORT}/molecule?sslmode=disable" >> "$GITHUB_ENV"
for i in $(seq 1 30); do
docker exec "$PG_CONTAINER" pg_isready -U dev >/dev/null 2>&1 && exit 0
sleep 1
done
docker logs "$PG_CONTAINER" || true
exit 1
- name: Start Redis
run: |
set -euo pipefail
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
docker run -d --name "$REDIS_CONTAINER" -p 0:6379 redis:7 >/dev/null
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | awk -F: '/^0\.0\.0\.0:/ {print $2; exit}')
if [ -z "$REDIS_PORT" ]; then
REDIS_PORT=$(docker port "$REDIS_CONTAINER" 6379/tcp | head -1 | awk -F: '{print $NF}')
fi
if [ -z "$REDIS_PORT" ]; then
echo "::error::Could not resolve host port for $REDIS_CONTAINER"
docker port "$REDIS_CONTAINER" 6379/tcp || true
docker logs "$REDIS_CONTAINER" || true
exit 1
fi
echo "REDIS_URL=redis://127.0.0.1:${REDIS_PORT}" >> "$GITHUB_ENV"
for i in $(seq 1 15); do
docker exec "$REDIS_CONTAINER" redis-cli ping 2>/dev/null | grep -q PONG && exit 0
sleep 1
done
docker logs "$REDIS_CONTAINER" || true
exit 1
- name: Pick platform port
run: |
set -euo pipefail
PLATFORM_PORT=$(python3 - <<'PY'
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("127.0.0.1", 0))
print(s.getsockname()[1])
PY
)
echo "PORT=${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "BASE=http://127.0.0.1:${PLATFORM_PORT}" >> "$GITHUB_ENV"
- name: Build platform
working-directory: workspace-server
run: go build -o platform-server ./cmd/server
- name: Populate template manifests for dev-mode E2E
run: |
set -euo pipefail
if command -v jq >/dev/null 2>&1; then
bash scripts/clone-manifest.sh manifest.json workspace-configs-templates org-templates plugins
else
echo "::warning::jq unavailable; dev-mode template assertion may fail if templates are absent"
fi
- name: Start platform
run: |
set -euo pipefail
./workspace-server/platform-server > workspace-server/platform.log 2>&1 &
echo $! > workspace-server/platform.pid
for i in $(seq 1 30); do
curl -sf "$BASE/health" >/dev/null && exit 0
sleep 1
done
cat workspace-server/platform.log || true
exit 1
- name: Run comprehensive E2E
run: bash tests/e2e/test_comprehensive_e2e.sh
- name: Run workspace abilities E2E
run: bash tests/e2e/test_workspace_abilities_e2e.sh
- name: Run dev-mode E2E
run: bash tests/e2e/test_dev_mode.sh
- name: Start stub A2A agents
run: |
set -euo pipefail
cat > /tmp/molecule-stub-a2a.py <<'PY'
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
class Handler(BaseHTTPRequestHandler):
def do_POST(self):
length = int(self.headers.get("content-length", "0"))
raw = self.rfile.read(length) if length else b"{}"
try:
req = json.loads(raw)
except Exception:
req = {}
method = req.get("method")
if method not in ("message/send", None):
body = {"jsonrpc": "2.0", "id": req.get("id"), "error": {"code": -32601, "message": "method not found"}}
else:
body = {
"jsonrpc": "2.0",
"id": req.get("id", "stub"),
"result": {
"role": "agent",
"parts": [{"kind": "text", "type": "text", "text": "stub agent response"}],
},
}
data = json.dumps(body, separators=(",", ":")).encode()
self.send_response(200)
self.send_header("content-type", "application/json")
self.send_header("content-length", str(len(data)))
self.end_headers()
self.wfile.write(data)
def log_message(self, *_):
return
HTTPServer(("127.0.0.1", 18080), Handler).serve_forever()
PY
python3 /tmp/molecule-stub-a2a.py > /tmp/molecule-stub-a2a.log 2>&1 &
echo $! > /tmp/molecule-stub-a2a.pid
- name: Seed external agents for legacy A2A/activity scripts
run: |
set -euo pipefail
create_agent() {
local name="$1" role="$2"
curl -sS -X POST "$BASE/workspaces" \
-H "Content-Type: application/json" \
-d "{\"name\":\"${name}\",\"role\":\"${role}\",\"tier\":1,\"runtime\":\"external\",\"external\":true,\"url\":\"http://127.0.0.1:18080\"}" \
| python3 -c "import json,sys; print(json.load(sys.stdin)['id'])"
}
ECHO_ID=$(create_agent "Echo Agent" "Echo")
SEO_ID=$(create_agent "SEO Agent" "SEO")
curl -sS -X POST "$BASE/registry/register" -H "Content-Type: application/json" \
-d "{\"id\":\"$ECHO_ID\",\"url\":\"http://127.0.0.1:18080\",\"agent_card\":{\"name\":\"Echo Agent\",\"skills\":[{\"id\":\"echo\",\"name\":\"Echo\"}]}}" >/dev/null
curl -sS -X POST "$BASE/registry/register" -H "Content-Type: application/json" \
-d "{\"id\":\"$SEO_ID\",\"url\":\"http://127.0.0.1:18080\",\"agent_card\":{\"name\":\"SEO Agent\",\"skills\":[{\"id\":\"seo\",\"name\":\"SEO\"}]}}" >/dev/null
- name: Run activity E2E
run: bash tests/e2e/test_activity_e2e.sh
- name: Run A2A E2E
run: bash tests/e2e/test_a2a_e2e.sh
- name: Runtime-dependent legacy E2E preflight
run: |
set -euo pipefail
if [ -f workspace-configs-templates/claude-code-default/.auth-token ] && docker image inspect workspace:latest >/dev/null 2>&1; then
bash tests/e2e/test_claude_code_e2e.sh
bash tests/e2e/test_chat_upload_e2e.sh
else
echo "::notice::Skipping test_claude_code_e2e.sh and test_chat_upload_e2e.sh: require workspace:latest plus workspace-configs-templates/claude-code-default/.auth-token"
fi
- name: Dump platform log on failure
if: failure()
run: cat workspace-server/platform.log || true
- name: Stop platform and stub agents
if: always()
run: |
if [ -f workspace-server/platform.pid ]; then
kill "$(cat workspace-server/platform.pid)" 2>/dev/null || true
fi
if [ -f /tmp/molecule-stub-a2a.pid ]; then
kill "$(cat /tmp/molecule-stub-a2a.pid)" 2>/dev/null || true
fi
- name: Stop service containers
if: always()
run: |
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
+2 -3
View File
@@ -143,9 +143,8 @@ jobs:
echo "test_peer_visibility_token_mint_staging.sh — bash syntax OK"
bash -n tests/e2e/test_peer_visibility_mcp_local.sh
echo "test_peer_visibility_mcp_local.sh — bash syntax OK"
legacy_token_suffix="test""-token"
if rg -n "$legacy_token_suffix" tests/e2e/test_*staging*.sh; then
echo "::error::staging E2E must use production-safe admin token minting"
if rg -n '/admin/workspaces/.*/test-token|test-token' tests/e2e/test_*staging*.sh; then
echo "::error::staging E2E must not use dev-only /admin/workspaces/:id/test-token; use production-safe admin token minting instead"
exit 1
fi
echo "Staging fresh-provision MCP list_peers E2E runs on push to"
+2 -2
View File
@@ -71,7 +71,7 @@ jobs:
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
canvas: ${{ steps.decide.outputs.canvas }}
@@ -140,7 +140,7 @@ jobs:
name: Canvas tabs E2E
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 40
+1 -1
View File
@@ -84,7 +84,7 @@ jobs:
name: E2E Staging External Runtime
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 25
+12 -23
View File
@@ -49,7 +49,6 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- 'tests/e2e/lib/completion_assert.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
@@ -62,7 +61,6 @@ on:
- 'workspace-server/internal/middleware/**'
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- 'tests/e2e/lib/completion_assert.sh'
- 'tests/e2e/lib/aws_leak_check.sh'
- 'tests/e2e/test_aws_leak_check.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
@@ -94,31 +92,31 @@ jobs:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- name: YAML validation (best-effort)
run: |
echo "e2e-staging-saas.yml — PR validation: workflow YAML is valid."
echo "E2E step runs only when provisioning-critical files change."
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# Actual E2E: runs on trunk pushes and PRs that touch provisioning-critical
# paths. pr-validate remains as the lightweight workflow-shape check for PRs,
# but it is not a substitute for live staging proof when this workflow or the
# staging harness changes.
# Actual E2E: runs on trunk pushes (main + staging). NOT the PR-fire-only
# path pr-validate above posts success for workflow-only PRs.
e2e-staging-saas:
name: E2E Staging SaaS
runs-on: ubuntu-latest
# Only runs on trunk pushes. PR paths get pr-validate instead.
if: github.event.pull_request.base.ref == ''
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 45
permissions:
@@ -154,21 +152,16 @@ jobs:
# block). See #2578 PR comment for the rationale.
E2E_ANTHROPIC_API_KEY: ${{ secrets.MOLECULE_STAGING_ANTHROPIC_API_KEY }}
# OpenAI fallback — kept wired so an operator-dispatched run with
# E2E_RUNTIME=hermes or =codex via workflow_dispatch can still
# E2E_RUNTIME=hermes or =langgraph via workflow_dispatch can still
# exercise the OpenAI path.
E2E_OPENAI_API_KEY: ${{ secrets.MOLECULE_STAGING_OPENAI_API_KEY }}
# google-adk (operator-dispatched only) auths Gemini with an
# AI-Studio key. Org policy disallows API keys in PROD (Vertex+ADC
# there); CI uses the keyed AI-Studio path with config model
# google_genai:gemini-2.5-pro. Vertex remains the supported prod path.
E2E_GOOGLE_API_KEY: ${{ secrets.MOLECULE_STAGING_GOOGLE_API_KEY }}
E2E_RUNTIME: ${{ github.event.inputs.runtime || 'claude-code' }}
# Pin the model when running on the default claude-code path —
# the per-runtime default ("sonnet") routes to direct Anthropic
# and defeats the cost saving. Operators can override via the
# workflow_dispatch flow (no input wired here yet — runtime
# override is enough for ad-hoc).
E2E_MODEL_SLUG: ${{ github.event.inputs.runtime == 'hermes' && 'openai/gpt-4o' || github.event.inputs.runtime == 'codex' && 'openai/gpt-4o' || github.event.inputs.runtime == 'google-adk' && 'google_genai:gemini-2.5-pro' || 'MiniMax-M2' }}
E2E_MODEL_SLUG: ${{ github.event.inputs.runtime == 'hermes' && 'openai/gpt-4o' || github.event.inputs.runtime == 'langgraph' && 'openai:gpt-4o' || 'MiniMax-M2' }}
E2E_RUN_ID: "${{ github.run_id }}-${{ github.run_attempt }}"
E2E_KEEP_ORG: ${{ github.event.inputs.keep_org && '1' || '0' }}
@@ -192,7 +185,7 @@ jobs:
- name: Verify LLM key present
run: |
# Per-runtime key check — claude-code uses MiniMax; hermes /
# codex (operator-dispatched only) use OpenAI. Hard-fail
# langgraph (operator-dispatched only) use OpenAI. Hard-fail
# rather than soft-skip per #2578's lesson — empty key
# silently falls through to the wrong SECRETS_JSON branch and
# produces a confusing auth error 5 min later instead of the
@@ -213,14 +206,10 @@ jobs:
required_secret_value=""
fi
;;
codex|hermes)
langgraph|hermes)
required_secret_name="MOLECULE_STAGING_OPENAI_API_KEY"
required_secret_value="${E2E_OPENAI_API_KEY:-}"
;;
google-adk)
required_secret_name="MOLECULE_STAGING_GOOGLE_API_KEY"
required_secret_value="${E2E_GOOGLE_API_KEY:-}"
;;
*)
echo "::warning::Unknown E2E_RUNTIME='${E2E_RUNTIME}' — skipping LLM-key check"
required_secret_name=""
+1 -1
View File
@@ -37,7 +37,7 @@ jobs:
name: Intentional-failure teardown sanity
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 20
+2 -21
View File
@@ -7,11 +7,10 @@
# PR_NUMBER — set via ${{ github.event.pull_request.number }} from the trigger
# POST_COMMENT — "true" to post/update comment on PR
#
# Gating logic (MVP signals 1,2,3,4,6):
# Gating logic (MVP signals 1,2,3,6):
# 1. Author-aware agent-tag comment scan
# 2. REQUEST_CHANGES reviews state machine
# 3. Staleness detection (SOP-12: review.commit_id != PR.head_sha + >1 working day)
# 4. Branch divergence / scope-creep guard (base-sha vs target HEAD; mc#365)
# 6. CI required-checks awareness
#
# Exit code: 0=CLEAR, 1=BLOCKED, 2=ERROR
@@ -33,24 +32,6 @@ on:
# iterating all open PRs when PR_NUMBER is empty.
workflow_dispatch:
# Serialize per PR (or per repo for schedule/manual ticks) to prevent
# the fan-out OOM class documented in
# `reference_operator_host_python3_oom_storm_2026_05_18`. `edited`
# events fan out on every PR-body edit; combined with the hourly cron
# and synchronize bursts this workflow can stack runs of the same
# workflow_id on the same PR (each ~4GB anon-RSS) and trip the
# `--memory=4g --memory-swap=8g` per-container cap.
#
# NO `cancel-in-progress` (defaults to false). Per
# `feedback_janitor_supersede_must_group_by_workflow_id`, cancelling
# in-flight runs of any required-check-shaped workflow risks the
# dismiss_stale_approvals + empty-commit-rerun dance (Gitea 1.22.6 has
# no REST rerun). The gate-check is `continue-on-error: true` +
# idempotent (POST/PATCH gate-check comment by context) so sequential
# ticks are strictly safe.
concurrency:
group: gate-check-v3-${{ github.event.pull_request.number || github.event.issue.number || github.ref }}
permissions:
# read: contents — for checkout (base ref, not PR head for security)
# read: pull-requests — for reading PR info via API
@@ -66,7 +47,7 @@ jobs:
# bp-exempt: PR advisory bot; merge blocking is enforced by CI status and branch protection.
gate-check:
runs-on: ubuntu-latest
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # Never block on our own detector failing
steps:
- name: Check out BASE ref (never PR-head under pull_request_target)
@@ -87,8 +87,8 @@ jobs:
# both jobs on the same label avoids workspace-volume cross-host
# surprises and keeps the routing rule discoverable in one place.
runs-on: docker-host
# mc#1982 Phase 3 (RFC §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774 Phase 3 (RFC §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
handlers: ${{ steps.filter.outputs.handlers }}
@@ -118,8 +118,8 @@ jobs:
# mc#1529 §1: must run on operator-host (where `molecule-core-net`
# exists). See detect-changes for the full routing rationale.
runs-on: docker-host
# mc#1982 Phase 3 (RFC §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774 Phase 3 (RFC §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
env:
# Unique name per run so concurrent jobs don't collide on the
+2 -2
View File
@@ -70,7 +70,7 @@ jobs:
# of mc#1543; see internal#512 for class defect.
runs-on: docker-host
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
run: ${{ steps.decide.outputs.run }}
@@ -172,7 +172,7 @@ jobs:
# beta containers. Must run on operator-host Linux (docker-host).
runs-on: docker-host
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 30
steps:
@@ -1,6 +1,6 @@
name: lint-bp-context-emit-match
# Tier 2f scheduled lint (per mc#1982) — detects drift between
# Tier 2f scheduled lint (per mc#774) — detects drift between
# `branch_protections/<branch>.status_check_contexts` and the set of
# contexts emitted by `.gitea/workflows/*.yml`.
#
@@ -60,7 +60,7 @@ name: lint-bp-context-emit-match
#
# Cross-links
# -----------
# - mc#1982 (the RFC that specs this lint)
# - mc#774 (the RFC that specs this lint)
# - internal#349 (cross-repo BP sweep)
# - feedback_phantom_required_check_after_gitea_migration
# - feedback_tier_label_ids_are_per_repo
@@ -94,7 +94,7 @@ jobs:
# Phase 3 (RFC #219 §1): surface drift without blocking. After 7
# clean scheduled runs on main, flip to false so a scheduled
# failure is a hard CI signal.
continue-on-error: true # mc#1982 Phase 3 — flip to false after 7 clean main runs
continue-on-error: true # mc#774 Phase 3 — flip to false after 7 clean main runs
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
@@ -1,6 +1,6 @@
name: lint-continue-on-error-tracking
# Tier 2e hard-gate lint (per mc#1982) — every
# Tier 2e hard-gate lint (per mc#774) — every
# `continue-on-error: true` in `.gitea/workflows/*.yml` must carry a
# `# mc#NNNN` or `# internal#NNNN` tracker comment within 2 lines,
# the referenced issue must be OPEN, and ≤14 days old.
@@ -8,7 +8,7 @@ name: lint-continue-on-error-tracking
# Why this exists
# ---------------
# `continue-on-error: true` on `platform-build` had been hiding
# mc#1982-class regressions for ~3 weeks before #656 surfaced them on
# mc#774-class regressions for ~3 weeks before #656 surfaced them on
# 2026-05-12. A 14-day cap on tracker age forces a review cycle and
# surfaces mask-drift within at most 14 days of the original defect.
# Each `continue-on-error: true` gets a paper trail — close or renew.
@@ -45,12 +45,12 @@ name: lint-continue-on-error-tracking
# close-and-flip, or document the deliberate keep-mask in a fresh
# 14-day-renewable tracker. After main is clean for 3 days,
# follow-up PR flips this workflow's continue-on-error to false.
# Tracking: mc#1982.
# Tracking: mc#774.
#
# Cross-links
# -----------
# - mc#1982 (the RFC that specs this lint)
# - mc#1982 (the empirical masked-3-weeks case)
# - mc#774 (the RFC that specs this lint)
# - mc#774 (the empirical masked-3-weeks case)
# - feedback_chained_defects_in_never_tested_workflows
# - feedback_behavior_based_ast_gates
# - feedback_strict_root_only_after_class_a
@@ -97,9 +97,9 @@ jobs:
# Phase 3 (RFC #219 §1): surface masked defects without blocking
# PRs. Pre-existing continue-on-error: true directives on main
# all violate this lint at first — intentional. Flip to false
# follow-up after main is clean for 3 days. mc#1982.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # mc#1982 Phase 3 mask — 14d forced-renewal cadence
# follow-up after main is clean for 3 days. mc#774.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # mc#774 Phase 3 mask — 14d forced-renewal cadence
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
@@ -51,7 +51,7 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -25,21 +25,6 @@ name: Lint forbidden tenant-env keys
# feedback_path_filtered_workflow_cant_be_required). The scan itself
# targets workspace_secrets-writer paths via grep -r; it's fast
# (sub-second) so unconditional run is fine.
#
# ── 2026-06-01 CI-scheduler-fanout consolidation (fix/ci-scheduler-fanout) ──
# The RFC#523 sibling lint formerly in its own file
# `lint-no-tenant-gitea-token.yml` (the broader "no repo-host token into
# any tenant-writer surface" scan) is now a SECOND job in THIS workflow
# (`scan-tenant-token-write`). Both are sub-second Go-source greps that
# fired as two separate workflow runs on every PR — pure scheduler
# fan-out. Folding the sibling in here drops one workflow run + one
# checkout per PR while keeping BOTH scans firing unconditionally on
# every PR (the no-paths discipline above is preserved — neither job is
# paths-filtered). The moved job keeps its exact `name:` so its emitted
# status context is unchanged in substance; its `# bp-exempt:` directive
# moves with it (Tier 2g). The old `Lint no tenant GITEA or GITHUB token
# write / …` context is retired (a disappearing context needs no
# directive; only NEW emitters do).
on:
pull_request:
@@ -181,126 +166,3 @@ jobs:
fi
echo "OK No forbidden operator-scope env key names hardcoded in writer paths."
# bp-exempt: advisory RFC#523 lint; PR review gate is review-driven, not BP-driven.
# (Carried with the workflow-name rename in PR mc#1593 so the renamed
# context emission satisfies lint_required_context_exists_in_bp Tier 2g.)
scan-tenant-token-write:
name: Scan for repo-host token write into tenant workspace surface
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- name: Find Go files referencing a tenant-writer surface AND a repo-host token
run: |
set -euo pipefail
# Repo-host token NAMES — the threat-model subset. Operator-fleet
# tokens (CP_ADMIN_API_TOKEN, RAILWAY_TOKEN, INFISICAL_*) are
# caught by lint-forbidden-env-keys.yml's broader deny set; this
# lint focuses on the git-host class so a single co-occurrence
# match has a low false-positive rate.
FORBIDDEN_KEYS=(
"GITEA_TOKEN"
"GITEA_PAT"
"GITHUB_TOKEN"
"GITHUB_PAT"
"GH_TOKEN"
)
# Tenant-writer surface markers. A file matches the surface set
# if it references ANY of these strings. This is the "is this
# code path writing into a tenant workspace?" heuristic.
# Curated to catch the actual code shapes used in this repo
# (verified by grep against current main 2026-05-19):
# - "workspace_secrets" / "global_secrets" → DB table writes
# - "seedAllowList" → CP-side seed table
# - "/settings/secrets" → tenant HTTP API write
# - "envVars[" → in-memory env map write
# - "containerEnv" → docker-run env-set
# - "userData" → EC2 user-data script
# - "provisionPayload" / "provisionContext" → provision-request shape
SURFACE_PATTERN='workspace_secrets|global_secrets|seedAllowList|/settings/secrets|envVars\[|containerEnv|userData|provisionPayload|provisionContext'
# Files that legitimately reference these names AND a surface
# marker, but do so for guard / strip / test / doc-comment
# reasons. New entries require reviewer signoff and a one-line
# justification in the diff.
EXEMPT_FILES=(
# RFC#523 L1 deny-set source-of-truth + tests
"workspace-server/internal/handlers/workspace_provision_forbidden_env.go"
"workspace-server/internal/handlers/workspace_provision_forbidden_env_test.go"
# Forensic-#145 silent-strip denylist (defense-in-depth, by design lists the names)
"workspace-server/internal/provisioner/provisioner.go"
"workspace-server/internal/provisioner/provisioner_test.go"
# Pre-RFC#523 persona-fallback / org-helper paths. The L1
# fail-closed runs BEFORE these writers; downstream silent-strip
# also covers them. See applyAgentGitHTTPCreds doc-comment.
"workspace-server/internal/handlers/agent_git_identity.go"
"workspace-server/internal/handlers/org_helpers.go"
"workspace-server/internal/handlers/org.go"
# CP→platform admin auth (NOT a tenant env write).
"workspace-server/internal/provisioner/cp_provisioner.go"
)
# Build an extended-regex alternation of forbidden keys.
KEY_ALT="$(IFS='|'; echo "${FORBIDDEN_KEYS[*]}")"
# Find candidate files: Go non-test sources that contain a
# tenant-writer surface marker.
mapfile -t CANDIDATES < <(
grep -rlE --include='*.go' --exclude='*_test.go' \
"${SURFACE_PATTERN}" . 2>/dev/null \
| sed 's|^\./||' \
| sort -u
)
if [ "${#CANDIDATES[@]}" -eq 0 ]; then
echo "OK No tenant-writer-surface files found in tree (unexpected, but not a lint failure)."
exit 0
fi
HITS=""
for f in "${CANDIDATES[@]}"; do
# Skip exempt files.
skip=0
for ex in "${EXEMPT_FILES[@]}"; do
if [ "$f" = "$ex" ]; then skip=1; break; fi
done
[ "$skip" = "1" ] && continue
# File contains a surface marker; now grep for a forbidden
# key NAME. We require a QUOTED-literal match to avoid
# firing on a comment like "// also handle GITEA_TOKEN".
#
# The literal form catches:
# - os.Getenv("GITEA_TOKEN")
# - envVars["GITEA_TOKEN"] = ...
# - {envKey: "GITEA_TOKEN", tenantKey: "GITEA_TOKEN"}
# but not:
# - // see GITEA_TOKEN below (no quotes)
found=$(grep -nE "\"(${KEY_ALT})\"" "$f" 2>/dev/null || true)
if [ -n "$found" ]; then
HITS="${HITS}--- ${f} ---\n${found}\n"
fi
done
if [ -n "$HITS" ]; then
echo "::error::Task #146 lint: repo-host token name(s) quoted in a tenant-writer-surface file:"
printf "$HITS"
echo ""
echo "These files reference a tenant-writer surface (workspace_secrets,"
echo "seedAllowList, /settings/secrets, containerEnv, userData, etc.)"
echo "AND quote a repo-host token name (GITEA_TOKEN/GITHUB_TOKEN/…)."
echo "Per RFC#523 threat model, tenant workspaces MUST NOT receive"
echo "operator-scope repo-host tokens. If your code legitimately needs"
echo "to reference one of these names in a tenant-writer file (e.g."
echo "a deny-set definition or silent-strip list), add the file to"
echo "EXEMPT_FILES with a one-line justification — reviewer signoff"
echo "required."
exit 1
fi
echo "OK No tenant-writer-surface file co-mentions a repo-host token literal."
+6 -6
View File
@@ -1,6 +1,6 @@
name: lint-mask-pr-atomicity
# Tier 2d hard-gate lint (per mc#1982) — blocks PRs that touch
# Tier 2d hard-gate lint (per mc#774) — blocks PRs that touch
# `.gitea/workflows/ci.yml` and modify ONLY ONE of {continue-on-error,
# all-required.sentinel.needs} without a `Paired: #NNN` reference in
# the PR body or in a commit message.
@@ -37,13 +37,13 @@ name: lint-mask-pr-atomicity
# This workflow lands at `continue-on-error: true` (Phase 3 — surface
# regressions without blocking PRs while the rule beds in).
# Follow-up PR flips to `false` once we have ≥3 days of clean runs on
# `main` and no false-positives. Tracking issue: mc#1982.
# `main` and no false-positives. Tracking issue: mc#774.
#
# Cross-links
# -----------
# - mc#1982 (the RFC that specs this lint)
# - mc#774 (the RFC that specs this lint)
# - PR#665 / PR#668 (the empirical split-pair)
# - mc#1982 (the main-red incident the split caused)
# - mc#774 (the main-red incident the split caused)
# - feedback_strict_root_only_after_class_a
# - feedback_behavior_based_ast_gates
#
@@ -92,8 +92,8 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken shapes without blocking
# PRs. Follow-up PR flips this to `false` once recent runs on main
# are confirmed clean (eat-our-own-dogfood discipline mirrors
# PR#673's same-shape comment). Tracking: mc#1982.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# PR#673's same-shape comment). Tracking: mc#774.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- name: Check out PR head with full history (need base SHA blobs)
@@ -0,0 +1,182 @@
name: Lint no tenant GITEA or GITHUB token write
# Task #146 — CI guardrail companion to RFC#523's `lint-forbidden-env-keys.yml`.
#
# `lint-forbidden-env-keys.yml` (Layer 3) catches code that hardcodes a
# forbidden env-var key NAME as a quoted literal in workspace_secrets
# writer paths under workspace-server/internal/.
#
# This workflow catches a BROADER class: any code path that reads a
# repo-host token (GITEA_TOKEN / GITHUB_TOKEN / GH_TOKEN) and then writes
# it into a TENANT WORKSPACE's env, secret store, user-data, or
# provision payload. This is the actual RFC#523 threat-model statement —
# the goal is "no tenant workspace ever receives an operator-scope repo
# token," not just "no _quoted_ literal `GITEA_TOKEN`." A future writer
# could route the value via a variable, a struct field, or a config key
# and slip past the existing literal scan; this lint catches those
# routing patterns at PR review time.
#
# Scope
# Scans the WHOLE repo's Go sources (not just workspace-server/) for
# co-occurrences of:
# - a repo-host token NAME (GITEA_TOKEN / GITHUB_TOKEN / GH_TOKEN /
# GITEA_PAT / GITHUB_PAT) used as os.Getenv argument or string
# literal
# - within a file that ALSO references a tenant-writer surface
# (`tenant`, `workspace_secrets`, `global_secrets`, `seedAllowList`,
# `/settings/secrets`, `userData`, `provisionPayload`,
# `envVars[`, `containerEnv`).
#
# Co-occurrence (not single-line) is the false-positive control: a
# file that just LOGS the variable name (e.g. "missing GITEA_TOKEN")
# without touching any tenant surface won't fire.
#
# Drift contract with lint-forbidden-env-keys.yml
# Both lints share the same FORBIDDEN_KEYS list (a subset — only the
# repo-host tokens, since this lint's threat model is "tenant gets
# write access to operator's git host"). If RFC#523's deny set grows,
# update BOTH this file AND lint-forbidden-env-keys.yml AND the Go
# source-of-truth in
# workspace-server/internal/handlers/workspace_provision_forbidden_env.go.
#
# Open-source-template-friendly
# The patterns scanned are generic (no MOLECULE_-prefix literals).
# A fork can copy this workflow as-is and adjust FORBIDDEN_KEYS.
#
# Path-filter discipline
# No `paths:` filter — required-status workflows must run on every PR
# per `feedback_path_filtered_workflow_cant_be_required`. Scan is
# sub-second.
on:
pull_request:
types: [opened, synchronize, reopened]
push:
branches: [main, staging]
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
# bp-exempt: advisory RFC#523 lint; PR review gate is review-driven, not BP-driven.
# (Carried with the workflow-name rename in PR mc#1593 so the renamed
# context emission satisfies lint_required_context_exists_in_bp Tier 2g.)
scan:
name: Scan for repo-host token write into tenant workspace surface
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- name: Find Go files referencing a tenant-writer surface AND a repo-host token
run: |
set -euo pipefail
# Repo-host token NAMES — the threat-model subset. Operator-fleet
# tokens (CP_ADMIN_API_TOKEN, RAILWAY_TOKEN, INFISICAL_*) are
# caught by lint-forbidden-env-keys.yml's broader deny set; this
# lint focuses on the git-host class so a single co-occurrence
# match has a low false-positive rate.
FORBIDDEN_KEYS=(
"GITEA_TOKEN"
"GITEA_PAT"
"GITHUB_TOKEN"
"GITHUB_PAT"
"GH_TOKEN"
)
# Tenant-writer surface markers. A file matches the surface set
# if it references ANY of these strings. This is the "is this
# code path writing into a tenant workspace?" heuristic.
# Curated to catch the actual code shapes used in this repo
# (verified by grep against current main 2026-05-19):
# - "workspace_secrets" / "global_secrets" → DB table writes
# - "seedAllowList" → CP-side seed table
# - "/settings/secrets" → tenant HTTP API write
# - "envVars[" → in-memory env map write
# - "containerEnv" → docker-run env-set
# - "userData" → EC2 user-data script
# - "provisionPayload" / "provisionContext" → provision-request shape
SURFACE_PATTERN='workspace_secrets|global_secrets|seedAllowList|/settings/secrets|envVars\[|containerEnv|userData|provisionPayload|provisionContext'
# Files that legitimately reference these names AND a surface
# marker, but do so for guard / strip / test / doc-comment
# reasons. New entries require reviewer signoff and a one-line
# justification in the diff.
EXEMPT_FILES=(
# RFC#523 L1 deny-set source-of-truth + tests
"workspace-server/internal/handlers/workspace_provision_forbidden_env.go"
"workspace-server/internal/handlers/workspace_provision_forbidden_env_test.go"
# Forensic-#145 silent-strip denylist (defense-in-depth, by design lists the names)
"workspace-server/internal/provisioner/provisioner.go"
"workspace-server/internal/provisioner/provisioner_test.go"
# Pre-RFC#523 persona-fallback / org-helper paths. The L1
# fail-closed runs BEFORE these writers; downstream silent-strip
# also covers them. See applyAgentGitHTTPCreds doc-comment.
"workspace-server/internal/handlers/agent_git_identity.go"
"workspace-server/internal/handlers/org_helpers.go"
"workspace-server/internal/handlers/org.go"
# CP→platform admin auth (NOT a tenant env write).
"workspace-server/internal/provisioner/cp_provisioner.go"
)
# Build an extended-regex alternation of forbidden keys.
KEY_ALT="$(IFS='|'; echo "${FORBIDDEN_KEYS[*]}")"
# Find candidate files: Go non-test sources that contain a
# tenant-writer surface marker.
mapfile -t CANDIDATES < <(
grep -rlE --include='*.go' --exclude='*_test.go' \
"${SURFACE_PATTERN}" . 2>/dev/null \
| sed 's|^\./||' \
| sort -u
)
if [ "${#CANDIDATES[@]}" -eq 0 ]; then
echo "OK No tenant-writer-surface files found in tree (unexpected, but not a lint failure)."
exit 0
fi
HITS=""
for f in "${CANDIDATES[@]}"; do
# Skip exempt files.
skip=0
for ex in "${EXEMPT_FILES[@]}"; do
if [ "$f" = "$ex" ]; then skip=1; break; fi
done
[ "$skip" = "1" ] && continue
# File contains a surface marker; now grep for a forbidden
# key NAME. We require a QUOTED-literal match to avoid
# firing on a comment like "// also handle GITEA_TOKEN".
#
# The literal form catches:
# - os.Getenv("GITEA_TOKEN")
# - envVars["GITEA_TOKEN"] = ...
# - {envKey: "GITEA_TOKEN", tenantKey: "GITEA_TOKEN"}
# but not:
# - // see GITEA_TOKEN below (no quotes)
found=$(grep -nE "\"(${KEY_ALT})\"" "$f" 2>/dev/null || true)
if [ -n "$found" ]; then
HITS="${HITS}--- ${f} ---\n${found}\n"
fi
done
if [ -n "$HITS" ]; then
echo "::error::Task #146 lint: repo-host token name(s) quoted in a tenant-writer-surface file:"
printf "$HITS"
echo ""
echo "These files reference a tenant-writer surface (workspace_secrets,"
echo "seedAllowList, /settings/secrets, containerEnv, userData, etc.)"
echo "AND quote a repo-host token name (GITEA_TOKEN/GITHUB_TOKEN/…)."
echo "Per RFC#523 threat model, tenant workspaces MUST NOT receive"
echo "operator-scope repo-host tokens. If your code legitimately needs"
echo "to reference one of these names in a tenant-writer file (e.g."
echo "a deny-set definition or silent-strip list), add the file to"
echo "EXEMPT_FILES with a one-line justification — reviewer signoff"
echo "required."
exit 1
fi
echo "OK No tenant-writer-surface file co-mentions a repo-host token literal."
@@ -4,7 +4,7 @@ name: Lint pre-flip continue-on-error
# on any job in `.gitea/workflows/*.yml` WITHOUT proof that the affected
# job's recent runs on the target branch (PR base) are actually green.
#
# Empirical class: PR #656 / mc#1982. PR #656 (RFC internal#219 Phase 4)
# Empirical class: PR #656 / mc#774. PR #656 (RFC internal#219 Phase 4)
# flipped 5 platform-build-class jobs `continue-on-error: true → false`
# on the basis of a "verified green on main via combined-status check".
# But that "green" was the LIE the prior `continue-on-error: true`
@@ -13,7 +13,7 @@ name: Lint pre-flip continue-on-error
# job-level status. The precondition the PR claimed to verify was
# structurally fooled by the bug being flipped.
#
# mc#1982 captured the surfaced defects (2 mutually-masked regressions):
# mc#774 captured the surfaced defects (2 mutually-masked regressions):
# - Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
# - Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
#
@@ -55,7 +55,7 @@ name: Lint pre-flip continue-on-error
# - YAML parse error in one of the workflow files: warn-only,
# don't block — the YAML lint workflows catch this separately.
#
# Cross-links: PR#656, mc#1982, PR#665 (interim re-mask),
# Cross-links: PR#656, mc#774, PR#665 (interim re-mask),
# Quirk #10 (internal#342 + dup #287), hongming-pc2 charter
# §SOP-N rule (e), feedback_strict_root_only_after_class_a,
# feedback_no_shared_persona_token_use.
@@ -99,8 +99,8 @@ jobs:
timeout-minutes: 8
# Phase 3 (RFC internal#219 §1): surface broken flips without blocking
# the PR yet. Follow-up flips this to `false` once the workflow itself
# has clean recent runs on main. mc#1982 interim — remove when CoE→false.
continue-on-error: true # mc#1982
# has clean recent runs on main. mc#774 interim — remove when CoE→false.
continue-on-error: true # mc#774
steps:
- name: Check out PR head (full history for base-SHA access)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -1,6 +1,6 @@
name: lint-required-context-exists-in-bp
# Tier 2g hard-gate lint (per mc#1982) — diff-based PR-time
# Tier 2g hard-gate lint (per mc#774) — diff-based PR-time
# check. When a PR adds a NEW commit-status emission (workflow YAML
# `name:` + job `name:`-or-key + on:-event), the workflow file must
# carry one of three directives adjacent to the new job:
@@ -16,7 +16,7 @@ name: lint-required-context-exists-in-bp
# PR#656 added `CI / all-required (pull_request)` as a sentinel
# context that workflows emit, but BP did NOT list it. When
# platform-build failed, all-required failed, but BP let the PR
# merge anyway → cascade to mc#1982. With this lint, PR#656 would
# merge anyway → cascade to mc#774. With this lint, PR#656 would
# have been blocked until either the BP PATCH ran alongside OR
# the author added a `bp-required: pending` directive.
#
@@ -27,7 +27,7 @@ name: lint-required-context-exists-in-bp
# share the workflow-context enumeration helpers
# (`_event_map`, `workflow_contexts`, `_job_display`) but the
# semantics are intentionally distinct so they're separate scripts.
# Co-design is documented in mc#1982.
# Co-design is documented in mc#774.
#
# Directive comment lives in the workflow file (NOT PR body)
# ----------------------------------------------------------
@@ -42,13 +42,13 @@ name: lint-required-context-exists-in-bp
# Lands at `continue-on-error: true` (Phase 3 — surface the
# pattern without blocking PRs while the directive convention
# beds in). After 7 days of clean runs on `main` with no false
# positives, follow-up flips to `false`. Tracking: mc#1982.
# positives, follow-up flips to `false`. Tracking: mc#774.
#
# Cross-links
# -----------
# - mc#1982 (the RFC that specs this lint)
# - mc#774 (the RFC that specs this lint)
# - PR#656 (the empirical case)
# - mc#1982 (the surfaced cascade)
# - mc#774 (the surfaced cascade)
# - feedback_phantom_required_check_after_gitea_migration (Tier 2f cousin)
# - feedback_behavior_based_ast_gates
#
@@ -83,8 +83,8 @@ jobs:
timeout-minutes: 5
# Phase 3 (RFC #219 §1): surface the pattern without blocking PRs
# while the directive convention beds in. Follow-up flip to false
# after 7 clean days on main. mc#1982.
continue-on-error: true # mc#1982 Phase 3 — flip to false after 7 clean main runs
# after 7 clean days on main. mc#774.
continue-on-error: true # mc#774 Phase 3 — flip to false after 7 clean main runs
steps:
- name: Check out PR head with full history (need base SHA blobs)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -3,26 +3,11 @@ name: Lint shellcheck (arm64 pilot)
# Mac-CI dual-track pilot (#233). ADDITIVE / NOT REQUIRED.
#
# Validates the arm64 self-hosted lane (no docker.sock, no privileged
# ops) before any required gate moves onto it.
# ops) before any required gate moves onto it. Until a Mac arm64 runner
# is registered with the `arm64` label, this workflow sits PENDING —
# that is FINE: `arm64` is NOT in branch_protections required contexts.
#
# Runner label mapping (2026-05-22 fix): the actual Mac mini runner
# registered in this Gitea ships labels
# ["self-hosted","macos-self-hosted-arm64","arm64-darwin"]
# — no plain `arm64`. The earlier `runs-on: [self-hosted, arm64]`
# could not match any registered runner so every fire of this workflow
# was assigned task_id=0 / runner_id=NULL → Gitea cancelled it. The
# rows showed up as Cancelled in the action status feed (not Failed)
# but the lane never actually ran. Workflow now selects on
# `arm64-darwin` which is the canonical Mac-arm64 label per the
# Mac mini's registration (per internal#494 capability-honest labels).
#
# If we later want to add a Linux-arm64 runner to the same lane, add
# both labels to that runner's registration AND broaden the selector
# here — don't rename `arm64-darwin` (it's Mac-specific by design and
# `feedback_pc2_runner_labels_must_stay_narrow` rule applies).
#
# Pairs with internal#543 (RFC: Mac arm64 multi-arch runner-base) and
# internal#494 (multi-arch runner-base capability-honest labels).
# Pairs with internal#543 (RFC: Mac arm64 multi-arch runner-base).
# No paths: filter on purpose (feedback_path_filtered_workflow_cant_be_required).
on:
@@ -97,15 +82,7 @@ jobs:
echo "WARN: shellcheck binary not found — skipping (pilot mode)"
exit 0
fi
# NOTE: macOS ships Bash 3.2 (Apple license), no `mapfile`
# (Bash 4+ builtin). Mac mini runner empirically failed at
# `mapfile: command not found` (run 79275 / task 145654).
# Use the portable `while read` pattern instead — works on
# both Bash 3.2 (macOS) and Bash 4+ (Linux).
TARGETS=()
while IFS= read -r f; do
TARGETS+=("$f")
done < <(find .gitea/scripts -maxdepth 2 -type f -name '*.sh' | sort)
mapfile -t TARGETS < <(find .gitea/scripts -maxdepth 2 -type f -name '*.sh' | sort)
if [ "${#TARGETS[@]}" -eq 0 ]; then
echo "No .sh files found under .gitea/scripts — nothing to check"
exit 0
+1 -1
View File
@@ -55,7 +55,7 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken shapes without blocking PRs.
# Follow-up PR flips this off after the 4 existing-on-main rule-2
# (workflow_run) violations are migrated to a supported trigger.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+1 -1
View File
@@ -67,7 +67,7 @@ jobs:
# in this rollout (internal#462) so the precondition holds.
runs-on: publish
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- name: Checkout
@@ -234,18 +234,17 @@ jobs:
name: Production auto-deploy
needs: build-and-push
if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
# Side-effect deploy only; image publish success is the durable artifact. mc#1982
# Side-effect deploy only; image publish success is the durable artifact. mc#774
continue-on-error: true
# Publish/release lane (internal#462) — production deploy of a merged
# fix; reserved capacity, never queued behind PR-CI.
runs-on: publish
timeout-minutes: 90
timeout-minutes: 75
env:
CP_URL: ${{ vars.PROD_CP_URL || 'https://api.moleculesai.app' }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
GITEA_HOST: git.moleculesai.app
GITEA_TOKEN: ${{ secrets.PROD_AUTO_DEPLOY_CONTROL_TOKEN || secrets.AUTO_SYNC_TOKEN }}
CI_STATUS_TIMEOUT_SECONDS: "3600"
PROD_AUTO_DEPLOY_DISABLED: ${{ vars.PROD_AUTO_DEPLOY_DISABLED || secrets.PROD_AUTO_DEPLOY_DISABLED || '' }}
PROD_AUTO_DEPLOY_CANARY_SLUG: ${{ vars.PROD_AUTO_DEPLOY_CANARY_SLUG || 'hongming' }}
PROD_AUTO_DEPLOY_SOAK_SECONDS: ${{ vars.PROD_AUTO_DEPLOY_SOAK_SECONDS || '60' }}
@@ -304,19 +303,26 @@ jobs:
python3 .gitea/scripts/prod-auto-deploy.py assert-enabled
PLAN="$RUNNER_TEMP/prod-auto-deploy-plan.json"
TARGET_TAG="$(jq -r '.target_tag' "$PLAN")"
BODY="$(jq -c '.body' "$PLAN")"
echo "POST $CP_URL/cp/admin/tenants/redeploy-fleet"
echo " target_tag: $TARGET_TAG"
echo " body: $BODY"
HTTP_RESPONSE="$RUNNER_TEMP/prod-redeploy-response.json"
HTTP_CODE_FILE="$RUNNER_TEMP/prod-redeploy-http-code.txt"
set +e
python3 .gitea/scripts/prod-auto-deploy.py rollout \
--plan "$PLAN" \
--response "$HTTP_RESPONSE"
ROLLOUT_EXIT=$?
curl -sS -o "$HTTP_RESPONSE" -w '%{http_code}' \
-m 1200 \
-H "Authorization: Bearer $CP_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-X POST "$CP_URL/cp/admin/tenants/redeploy-fleet" \
-d "$BODY" > "$HTTP_CODE_FILE"
set -e
if [ ! -s "$HTTP_RESPONSE" ]; then
jq -nc --arg error "rollout command exited $ROLLOUT_EXIT before writing a response" \
'{ok:false, results:[], error:$error}' > "$HTTP_RESPONSE"
fi
HTTP_CODE="$(cat "$HTTP_CODE_FILE" 2>/dev/null || echo "000")"
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
echo "HTTP $HTTP_CODE"
jq '{ok, result_count: (.results // [] | length)}' "$HTTP_RESPONSE" || true
{
@@ -324,36 +330,23 @@ jobs:
echo ""
echo "**Commit:** \`${GITHUB_SHA:0:7}\`"
echo "**Target tag:** \`$TARGET_TAG\`"
echo "**HTTP:** $HTTP_CODE"
echo ""
echo "### Per-tenant result"
echo ""
echo "| Slug | Phase | SSM Status | Exit | Healthz | On target | Error present |"
echo "|------|-------|------------|------|---------|-----------|---------------|"
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \(.verified_on_target) | \((.error // "") != "") |"' "$HTTP_RESPONSE" || true
# internal#724: stragglers are tenants enumerated but not proven
# on the target build. Surface them loudly — a non-empty list
# means the rollout did NOT fully land.
STRAGGLERS="$(jq -r '(.stragglers // []) | join(", ")' "$HTTP_RESPONSE")"
if [ -n "$STRAGGLERS" ]; then
echo ""
echo "### ⚠ Stragglers (NOT on target tag \`$TARGET_TAG\`)"
echo ""
echo "\`$STRAGGLERS\`"
fi
echo "| Slug | Phase | SSM Status | Exit | Healthz | Error present |"
echo "|------|-------|------------|------|---------|---------------|"
jq -r '.results[]? | "| \(.slug) | \(.phase) | \(.ssm_status // "-") | \(.ssm_exit_code) | \(.healthz_ok) | \((.error // "") != "") |"' "$HTTP_RESPONSE" || true
} >> "$GITHUB_STEP_SUMMARY"
OK="$(jq -r '.ok' "$HTTP_RESPONSE")"
if [ "$OK" != "true" ]; then
STRAGGLERS="$(jq -r '(.stragglers // []) | join(", ")' "$HTTP_RESPONSE")"
if [ -n "$STRAGGLERS" ]; then
echo "::error::incomplete rollout — tenants not on target tag $TARGET_TAG: $STRAGGLERS"
fi
echo "::error::redeploy-fleet reported ok=false; production rollout halted."
if [ "$HTTP_CODE" != "200" ]; then
echo "::error::redeploy-fleet returned HTTP $HTTP_CODE"
exit 1
fi
if [ "$ROLLOUT_EXIT" -ne 0 ]; then
echo "::error::redeploy-fleet rollout failed with exit code $ROLLOUT_EXIT."
exit "$ROLLOUT_EXIT"
OK="$(jq -r '.ok' "$HTTP_RESPONSE")"
if [ "$OK" != "true" ]; then
echo "::error::redeploy-fleet reported ok=false; production rollout halted."
exit 1
fi
- name: Verify reachable tenants report this SHA
+1 -1
View File
@@ -51,7 +51,7 @@ jobs:
name: Audit Railway env vars for drift-prone pins
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 10
@@ -73,7 +73,7 @@ jobs:
# it never queues behind PR-CI. `publish` -> molecule-runner-publish-*.
runs-on: publish
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 25
env:
@@ -80,7 +80,7 @@ jobs:
# `publish` -> molecule-runner-publish-* sub-pool.
runs-on: publish
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 25
steps:
+1 -1
View File
@@ -54,7 +54,7 @@ jobs:
# runners with internet access to package mirrors). Falls back to GitHub
# binary download. GitHub releases may be blocked on some runner networks
# (infra#241 follow-up).
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
run: |
if apt-get update -qq && apt-get install -y -qq jq; then
+1 -1
View File
@@ -57,7 +57,7 @@ jobs:
name: Detect SECRET_PATTERNS drift
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 5
steps:
+3 -3
View File
@@ -36,7 +36,7 @@
# window closed. continue-on-error: true has been removed from the
# tier-check job; AND-composition is now fully enforced. If you need
# to temporarily re-introduce a mask, file a tracker and follow the
# mc#1982 protocol (Tier 2e lint requires a current tracker within
# mc#774 protocol (Tier 2e lint requires a current tracker within
# 2 lines of any continue-on-error: true).
name: sop-tier-check
@@ -92,7 +92,7 @@ jobs:
# runners). The sop-tier-check script has its own fallback as a
# third line of defense. continue-on-error: true ensures this step
# failing does not block the job.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
run: |
# apt-get is the primary method — Ubuntu package mirrors are reliably
@@ -113,7 +113,7 @@ jobs:
# continue-on-error: true at step level — job-level is ignored by Gitea
# Actions (quirk #10, internal runbooks). Belt-and-suspenders with
# SOP_FAIL_OPEN=1 + || true below.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
+2 -2
View File
@@ -90,7 +90,7 @@ jobs:
staging-smoke:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
sha: ${{ steps.compute.outputs.sha }}
@@ -212,7 +212,7 @@ jobs:
if: ${{ needs.staging-smoke.result == 'success' && needs.staging-smoke.outputs.smoke_ran == 'true' }}
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
env:
SHA: ${{ needs.staging-smoke.outputs.sha }}
+1 -1
View File
@@ -71,7 +71,7 @@ jobs:
name: Sweep CF orphans
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# 3 min surfaces hangs (CF API stall, AWS describe-instances stuck)
# within one cron interval instead of burning a full tick. Realistic
+1 -1
View File
@@ -55,7 +55,7 @@ jobs:
name: Sweep CF tunnels
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# 30 min cap. Was 5 min on the theory that the only thing that
# could take >5min is a CF-API hang — but on 2026-05-02 a backlog
-99
View File
@@ -1,99 +0,0 @@
name: sync-providers-yaml
# Cross-repo canonical↔synced-copy drift gate (internal#718 P2-A, CTO
# 2026-05-27 "Distribution = SDK via codegen + verify-CI", multi-repo branch:
# "codegen-checked-into-each-repo + verify-CI").
#
# The canonical provider-registry SSOT is molecule-controlplane
# internal/providers/providers.yaml. molecule-core has NO Go module dependency
# on controlplane, so instead of importing it we carry a SYNCED COPY at
# workspace-server/internal/providers/providers.yaml and gate it.
#
# This workflow fetches the canonical providers.yaml from controlplane (via the
# Gitea raw endpoint, read-only) and byte-compares it against core's synced
# copy. RED if they differ — meaning the canonical moved and core's copy must be
# re-synced (copy verbatim + `go generate ./...` + bump
# canonicalProvidersYAMLSHA256 in sync_canonical_test.go).
#
# Pairs with:
# * sync_canonical_test.go — hermetic sha pin (catches a hand-edit of core's
# copy even with no network); runs in the normal `go test ./...`.
# * verify-providers-gen.yml — artifact ↔ synced-copy drift.
#
# ENFORCEMENT GATING: standalone workflow, NOT a job in ci.yml and NOT in
# branch protection (same soak-then-promote posture as verify-providers-gen).
# It is intentionally absent from ci.yml's job set so the ci-required-drift
# sentinel does not fire on it.
#
# AUTH: uses AUTO_SYNC_TOKEN (the existing cross-repo read token used to sync
# template/provider content from sibling repos). If the secret is absent the
# job emits a clear ::warning:: and exits 0 — the hermetic sha pin in
# sync_canonical_test.go is the always-on backstop, so a missing cross-repo
# token degrades to "hand-edit still caught, live canonical drift not caught"
# rather than a hard red that blocks unrelated PRs.
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- 'workspace-server/internal/providers/providers.yaml'
- '.gitea/workflows/sync-providers-yaml.yml'
push:
branches: [main, staging]
paths:
- 'workspace-server/internal/providers/providers.yaml'
- '.gitea/workflows/sync-providers-yaml.yml'
schedule:
# Daily at :23 — catch a canonical change in controlplane that landed
# without a paired core re-sync PR (off-zero to spread cron load).
- cron: '23 4 * * *'
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: sync-providers-yaml-${{ github.ref }}
cancel-in-progress: true
jobs:
compare:
name: Compare synced providers.yaml against controlplane canonical
runs-on: ubuntu-latest
timeout-minutes: 6
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Fetch canonical providers.yaml from controlplane and byte-compare
env:
AUTO_SYNC_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
API_ROOT: ${{ github.server_url }}/api/v1
run: |
set -euo pipefail
if [ -z "${AUTO_SYNC_TOKEN:-}" ]; then
echo "::warning::AUTO_SYNC_TOKEN secret missing — skipping the live cross-repo compare."
echo "The hermetic sha pin (sync_canonical_test.go) still gates hand-edits of core's copy."
echo "Provision AUTO_SYNC_TOKEN (read scope on molecule-controlplane) to enable live canonical-drift detection."
exit 0
fi
CANON_URL="${API_ROOT}/repos/molecule-ai/molecule-controlplane/raw/internal/providers/providers.yaml?ref=main"
# Use the /raw endpoint: it returns the file bytes directly. (The
# /contents endpoint ignores Accept: application/vnd.gitea.raw on
# Gitea 1.22.6 and returns the JSON+base64 envelope, which made this
# diff a permanent false RED.)
curl -fsS \
-H "Authorization: token ${AUTO_SYNC_TOKEN}" \
"${CANON_URL}" -o /tmp/canonical-providers.yaml
LOCAL=workspace-server/internal/providers/providers.yaml
if diff -u /tmp/canonical-providers.yaml "$LOCAL"; then
echo "OK — core's synced providers.yaml is byte-identical to the controlplane canonical."
else
echo "::error::core's synced providers.yaml DRIFTED from the controlplane canonical (SSOT)."
echo "Re-sync: copy controlplane internal/providers/providers.yaml verbatim over"
echo " $LOCAL, run 'go generate ./...' in workspace-server/, and bump"
echo " canonicalProvidersYAMLSHA256 in internal/providers/sync_canonical_test.go."
exit 1
fi
+1 -1
View File
@@ -49,7 +49,7 @@ jobs:
name: Ops scripts (unittest)
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-107
View File
@@ -1,107 +0,0 @@
name: verify-providers-gen
# Provider-registry SSOT enforcement gate — molecule-core side (internal#718
# P2-A, CTO 2026-05-27 "Distribution = SDK via codegen + verify-CI").
#
# The canonical schema SSOT is molecule-controlplane
# internal/providers/providers.yaml. molecule-core carries a SYNCED COPY at
# workspace-server/internal/providers/providers.yaml (kept in sync by the
# companion sync-providers-yaml.yml gate), and cmd/gen-providers emits the
# checked-in Go projection workspace-server/internal/providers/gen/registry_gen.go.
#
# This workflow regenerates the artifact into the working tree and fails RED if
# it differs from what is committed — catching BOTH:
# * a providers.yaml (synced-copy) change that wasn't followed by `go generate ./...`, and
# * a hand-edit of the generated artifact (it carries a DO NOT EDIT header).
#
# It is the molecule-core mirror of molecule-controlplane's verify-providers-gen
# workflow. Together with sync-providers-yaml (canonical↔synced-copy drift) it
# closes the codegen-checked-into-each-repo + verify-CI loop the RFC mandates.
#
# ENFORCEMENT GATING (deliberate, per dev-SOP "implementation gating"):
# this is a STANDALONE workflow, NOT a job inside ci.yml, and is NOT yet in any
# branch-protection status_check_contexts. Rationale (identical to the CP P0
# rollout):
# * It runs + reports RED on every PR/push immediately (visible signal).
# * It is intentionally absent from ci.yml's job set so the ci-required-drift
# sentinel (jobs ↔ branch-protection ↔ audit-env) does NOT fire on it, and
# from branch protection (turning it into a hard merge gate has blast radius
# — operator GO required, same pattern as sop-tier-check / verify-providers-gen
# on controlplane). Promote it into branch protection in a follow-up once
# P2 has soaked.
# Until then it behaves like secret-scan / block-internal-paths: a standalone
# advisory-to-hard gate the author is expected to keep green.
on:
pull_request:
types: [opened, synchronize, reopened]
# CI-scheduler-overload fix (fix/ci-scheduler-fanout, 2026-06-01):
# this gate only verifies that the generated providers artifact is in
# sync with the schema SSOT. Its verdict can ONLY change when one of
# the codegen inputs/outputs changes, so firing the Go toolchain on
# every unrelated PR (docs, canvas, scripts) is pure fan-out cost.
# Scoped to the codegen surface. SAFE because this workflow is NOT a
# branch-protection status_check_context (see header §ENFORCEMENT
# GATING) — lint-required-no-paths only forbids paths filters on
# REQUIRED workflows; this is advisory, so a paths filter is allowed.
# Mirrors the sibling sync-providers-yaml.yml scoping convention.
paths:
- 'workspace-server/internal/providers/**'
- 'workspace-server/cmd/gen-providers/**'
- '.gitea/workflows/verify-providers-gen.yml'
push:
branches: [main, staging]
paths:
- 'workspace-server/internal/providers/**'
- 'workspace-server/cmd/gen-providers/**'
- '.gitea/workflows/verify-providers-gen.yml'
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: verify-providers-gen-${{ github.ref }}
cancel-in-progress: true
jobs:
verify:
name: Regenerate providers artifact and fail on drift
runs-on: ubuntu-latest
timeout-minutes: 8
defaults:
run:
working-directory: workspace-server
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0
with:
go-version: 'stable'
cache: true
cache-dependency-path: workspace-server/go.sum
- name: Verify generated artifact is in sync with providers.yaml
run: |
set -euo pipefail
# -check regenerates in memory and byte-compares against the
# checked-in artifact; exit 1 (RED) on any drift. This is the
# single source of the gate's verdict — the same code path
# `go test ./cmd/gen-providers` exercises.
go run ./cmd/gen-providers -check
- name: Belt-and-braces — regenerate in place and assert clean tree
run: |
set -euo pipefail
# Independent confirmation that does not trust the -check path:
# actually write the artifact and assert git sees no change. If
# this and the step above ever disagree, the gate is suspect.
go generate ./...
if ! git diff --quiet -- internal/providers/gen/registry_gen.go; then
echo "::error::workspace-server/internal/providers/gen/registry_gen.go drifted from providers.yaml."
echo "Run 'go generate ./...' (or 'go run ./cmd/gen-providers') in workspace-server/ and commit the result."
git --no-pager diff -- internal/providers/gen/registry_gen.go | head -80
exit 1
fi
echo "OK — generated providers artifact is in sync with the schema SSOT."
+2 -2
View File
@@ -31,7 +31,7 @@ jobs:
name: Weekly Platform-Go Surface
runs-on: ubuntu-latest
# continue-on-error: surface only, never block
# mc#1982: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
defaults:
run:
@@ -106,7 +106,7 @@ jobs:
[[ "$file" == *_test.go ]] && continue
[[ "$file" == *"$path"* ]] || continue
awk "BEGIN{exit !(\$pct < 10)}" || continue
rel=$(echo "$file" | sed 's|^git.moleculesai.app/molecule-ai/molecule-core/workspace-server/workspace-server/||; s|^git.moleculesai.app/molecule-ai/molecule-core/workspace-server/||')
rel=$(echo "$file" | sed 's|^github.com/molecule-ai/molecule-monorepo/platform/workspace-server/||; s|^github.com/molecule-ai/molecule-monorepo/platform/||')
if echo "$ALLOWLIST" | grep -qxF "$rel"; then
continue
fi
+18 -26
View File
@@ -46,18 +46,6 @@
---
## Quick Start
```bash
git clone https://git.moleculesai.app/molecule-ai/molecule-core.git
cd molecule-core
./scripts/dev-start.sh
```
Then open [http://localhost:3000](http://localhost:3000), add your model API key in **Config → Secrets & API Keys → Global**, and create a workspace from a template.
See the full [Quickstart Guide](./docs/quickstart.md) for prerequisites, manual setup, and troubleshooting.
## The Pitch
Molecule AI is the most powerful way to govern an AI agent organization in production.
@@ -65,7 +53,7 @@ Molecule AI is the most powerful way to govern an AI agent organization in produ
It combines the parts that are usually scattered across demos, internal glue code, and framework-specific tooling into one product:
- one org-native control plane for teams, roles, hierarchy, and lifecycle
- one runtime layer that lets **four** maintained agent runtimes — Claude Code, Codex, **Hermes**, and OpenClaw — run side by side behind one workspace contract
- one runtime layer that lets **eight** agent runtimes — LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, **Hermes**, **Gemini CLI**, and OpenClaw — run side by side behind one workspace contract
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries (Memory v2 backed by pgvector for semantic recall)
- one operational surface for observing, pausing, restarting, inspecting, and improving live workspaces
@@ -87,11 +75,11 @@ You do not wire collaboration paths by hand. Hierarchy defines the default commu
### 3. Runtime choice stops being a dead-end decision
Claude Code, Codex, Hermes, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, Hermes, Gemini CLI, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
### 4. Memory is treated like infrastructure
Molecule AI's HMA approach is designed around organizational boundaries, not just "store more context somewhere." Durable recall, scoped sharing through the v2 memory plugin, and skill promotion are all part of one coherent system.
Molecule AI's HMA approach is designed around organizational boundaries, not just store more context somewhere. Durable recall, scoped sharing, awareness namespaces, and skill promotion are all part of one coherent system.
### 5. It comes with a real control plane
@@ -113,7 +101,7 @@ Registry, heartbeats, restart, pause/resume, activity logs, approvals, terminal
| **Role-native workspace abstraction** | Your org structure survives model swaps, framework changes, and team expansion |
| **Fractal team expansion** | A single specialist can become a managed department without breaking upstream integrations |
| **Heterogeneous runtime compatibility** | Different teams can keep their preferred agent architecture while sharing one control plane |
| **HMA + v2 memory plugin** | Memory sharing follows hierarchy instead of leaking across the whole system; one plugin per tenant, namespace-scoped per workspace |
| **HMA + awareness namespaces** | Memory sharing follows hierarchy instead of leaking across the whole system |
| **Skill evolution loop** | Durable successful workflows can graduate from memory into reusable, hot-reloadable skills |
| **WebSocket-first operational UX** | The canvas reflects task state, structure changes, and A2A responses in near real time |
| **Global secrets with local override** | Centralize provider access, then override only where a workspace needs specialized credentials |
@@ -124,9 +112,13 @@ Molecule AI is not trying to replace the frameworks below. It is the system that
| Runtime / architecture | Status in current repo | Native strength | What Molecule AI adds |
|---|---|---|---|
| **LangGraph** | Shipping on `main` | Graph control, tool use, Python extensibility | Canvas orchestration, hierarchy routing, A2A, memory scopes, operational lifecycle |
| **DeepAgents** | Shipping on `main` | Deeper planning and decomposition | Same workspace contract, team topology, activity stream, restart behavior |
| **Claude Code** | Shipping on `main` | Real coding workflows, CLI-native continuity | Secure workspace abstraction, A2A delegation, org boundaries, shared control plane |
| **Codex** | Shipping on `main` | OpenAI Codex CLI workflows | Secure workspace abstraction, A2A delegation, org boundaries, shared control plane |
| **CrewAI** | Shipping on `main` | Role-based crews | Persistent workspace identity, policy consistency, shared canvas and registry |
| **AutoGen** | Shipping on `main` | Assistant/tool orchestration | Standardized deployment, hierarchy-aware collaboration, shared ops plane |
| **Hermes 4** | Shipping on `main` | Hybrid reasoning, native tools, json_schema (NousResearch/hermes-agent) | Option B upstream hook, A2A bridge to OpenAI-compat API, multi-provider provider derivation |
| **Gemini CLI** | Shipping on `main` | Google Gemini CLI continuity | Workspace lifecycle, A2A, hierarchy-aware collaboration, shared ops plane |
| **OpenClaw** | Shipping on `main` | CLI-native runtime with its own session model | Workspace lifecycle, templates, activity logs, topology-aware collaboration |
| **NemoClaw** | WIP on `feat/nemoclaw-t4-docker` | NVIDIA-oriented runtime path | Planned to join the same abstraction once merged; not yet part of `main` |
@@ -141,7 +133,7 @@ Most projects stop at “we added memory.” Molecule AI pushes further:
| Flat store or weak namespaces | Hierarchy-aligned `LOCAL`, `TEAM`, `GLOBAL` scopes |
| Sharing is easy to overexpose | Sharing is explicit and structure-aware |
| Memory and procedure get mixed together | Memory stores durable facts; skills store repeatable procedure |
| Every agent can become over-privileged | Per-workspace namespaces in the v2 memory plugin reduce blast radius |
| Every agent can become over-privileged | Workspace awareness namespaces reduce blast radius |
| UI memory and runtime memory blur together | Separate surfaces for scoped agent memory, key/value workspace memory, and recall |
### The flywheel
@@ -171,7 +163,7 @@ Most agent systems stop at "a smart runtime." Molecule AI pushes further: it giv
| Core mechanism | Molecule AI module(s) | Why it matters |
|---|---|---|
| **Durable memory that survives sessions** | `molecule-ai-workspace-runtime/molecule_runtime/builtin_tools/`, `workspace-server/internal/handlers/memories.go`, `workspace-server/internal/memory/` (v2 plugin client + namespace resolver) | Memory is not just durable, it is **workspace-scoped** — every write lands in the workspace's own `workspace:<id>` namespace, with `team:<root>` and `org:<root>` available for cross-workspace shares via the platform's namespace ACL when an agent explicitly promotes a memory |
| **Durable memory that survives sessions** | `molecule-ai-workspace-runtime/molecule_runtime/builtin_tools/`, `workspace-server/internal/handlers/memories.go` | Memory is not just durable, it is **workspace-scoped** and can route into awareness namespaces tied to the org structure |
| **Cross-session recall** | `workspace-server/internal/handlers/activity.go` (`/workspaces/:id/session-search`) | Recall spans both activity history and memory rows, so the system can search what happened and what was learned without inventing a separate hidden store |
| **Skills built from experience** | `molecule-ai-workspace-runtime/molecule_runtime/builtin_tools/memory.py` (`_maybe_log_skill_promotion`) | Promotion from memory into a skill candidate is surfaced as an explicit platform activity, not a silent internal side effect |
| **Skill improvement during use** | `molecule-ai-workspace-runtime/molecule_runtime/skill_loader/`, `molecule-ai-workspace-runtime/molecule_runtime/main.py` | Skills hot-reload into the live runtime, so improvements become available on the next A2A task without restarting the workspace |
@@ -180,7 +172,7 @@ Most agent systems stop at "a smart runtime." Molecule AI pushes further: it giv
### Why this matters in Molecule AI
1. **The learning loop is org-aware, not just session-aware.**
Memory can live at `LOCAL`, `TEAM`, or `GLOBAL` scope, and the v2 plugin's namespace ACL gives each workspace a durable identity boundary.
Memory can live at `LOCAL`, `TEAM`, or `GLOBAL` scope, and awareness namespaces give each workspace a durable identity boundary.
2. **The learning loop is visible to operators.**
Promotion events, activity logs, current-task updates, traces, and WebSocket fanout mean self-improvement is part of the control plane, not a hidden black box.
@@ -217,9 +209,9 @@ The result is not just “an agent that learns.” It is **an organization that
### Runtime
- standalone workspace-template images that install `molecule-ai-workspace-runtime` from the Gitea package registry; thin AMI in production (us-east-2)
- adapter-driven execution across **4 maintained runtimes** (Claude Code, Codex, Hermes, OpenClaw)
- adapter-driven execution across **8 runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw)
- Agent Card registration
- **Memory v2 backed by pgvector** — per-tenant plugin sidecar serving HMA namespaces with FTS + semantic recall
- awareness-backed memory integration; **Memory v2 backed by pgvector** for semantic recall
- plugin-mounted shared rules/skills
- hot-reloadable local skills
- coordinator-only delegation path
@@ -253,7 +245,7 @@ The result is not just “an agent that learns.” It is **an organization that
Molecule AI is especially strong when you need to run:
- AI engineering teams with PM / Dev Lead / QA / Research / Ops roles
- mixed runtime organizations where one team prefers Hermes and another prefers Claude Code
- mixed runtime organizations where one team prefers LangGraph and another prefers Claude Code
- long-lived agent organizations that need memory boundaries and reusable procedures
- internal platforms that want to expose agent teams as structured infrastructure, not ad hoc scripts
@@ -268,9 +260,9 @@ Canvas (Next.js 15, warm-paper :3000) <--HTTP / WS--> Platform (Go 1.25 :8080)
+------------------------- shows ------------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python ≥3.11, image with adapters)
- 4 adapters: Claude Code / Codex / Hermes / OpenClaw
- 8 adapters: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw
- Agent Card + A2A server (typed-SSOT response path, RFC #2967)
- heartbeat + activity + Memory v2 (pgvector semantic recall via per-tenant plugin sidecar)
- heartbeat + activity + awareness-backed memory (Memory v2 pgvector semantic recall)
- skills + plugins + hot reload
SaaS Control Plane (molecule-controlplane, private)
@@ -336,7 +328,7 @@ Then open `http://localhost:3000`:
## Current Scope
The current `main` branch ships the core platform, Canvas v4 (warm-paper themed), Memory v2 (pgvector semantic recall), the typed-SSOT A2A response path (RFC #2967), **four maintained production adapters** (Claude Code, Codex, Hermes, OpenClaw), skill lifecycle, and operational surfaces.
The current `main` branch ships the core platform, Canvas v4 (warm-paper themed), Memory v2 (pgvector semantic recall), the typed-SSOT A2A response path (RFC #2967), **eight production adapters** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw), skill lifecycle, and operational surfaces.
The companion private repo [`molecule-controlplane`](https://git.moleculesai.app/molecule-ai/molecule-controlplane) provides the SaaS surface — multi-tenant orchestration on EC2 + Neon + Cloudflare Tunnels, KMS envelope encryption, WorkOS auth, Stripe billing, and a `tenant_resources` audit table with a 30-min reconciler.
+17 -13
View File
@@ -52,7 +52,7 @@ Molecule AI 是目前最强的 AI Agent 组织治理方案之一,用来把 age
它把过去分散在 demo、内部胶水代码和各类 framework 私有工具里的关键能力,收敛成一个产品:
- 一套组织原生 control plane,管理团队、角色、层级和生命周期
- 一套 runtime abstraction,让 **4**维护中的 agent runtime —— Claude Code、Codex、**Hermes**、OpenClaw —— 共用一套 workspace 契约
- 一套 runtime abstraction,让 **8** agent runtime —— LangGraph、DeepAgents、Claude Code、CrewAI、AutoGen、**Hermes**、**Gemini CLI**、OpenClaw —— 共用一套 workspace 契约
- 一套与组织边界对齐的 memory 模型,把 recall、sharing 和 skill evolution 放进同一体系(Memory v2 由 pgvector 支撑语义召回)
- 一套面向线上 workspace 的运维面,统一完成观测、暂停、重启、检查和持续改进
@@ -74,11 +74,11 @@ Molecule AI 填的就是这个空白。
### 3. Runtime 选择不再是死路
Claude Code、Codex、Hermes、OpenClaw 都可以挂到同一个 workspace abstraction 下。团队可以统一治理方式,而不必统一到底层 runtime。
LangGraph、DeepAgents、Claude Code、CrewAI、AutoGen、Hermes、Gemini CLI、OpenClaw 都可以挂到同一个 workspace abstraction 下。团队可以统一治理方式,而不必统一到底层 runtime。
### 4. Memory 被当成基础设施来做
Molecule AI 的 HMA 不是“多存一点上下文”而已。它关注组织边界、durable recall、scope sharing、v2 memory plugin、skill promotion,把这些放在一个完整体系里。
Molecule AI 的 HMA 不是“多存一点上下文”而已。它关注组织边界、durable recall、scope sharing、awareness namespace、skill promotion,把这些放在一个完整体系里。
### 5. 它自带真正的 control plane
@@ -100,7 +100,7 @@ Registry、heartbeat、restart、pause/resume、activity、approval、terminal
| **角色原生 workspace 抽象** | 模型切换、框架切换、团队扩容都不会打碎你的组织结构 |
| **分形式团队扩展** | 一个 specialist 可以平滑升级成一个部门,而不影响上游集成 |
| **异构 runtime 兼容** | 不同团队可以保留偏好的 agent 架构,但共用一套平台规则 |
| **HMA + v2 memory plugin** | Memory 分享沿组织边界走,而不是全局乱穿透;每个 tenant 一个 plugin,按 workspace namespace 隔离 |
| **HMA + awareness namespace** | Memory 分享沿组织边界走,而不是全局乱穿透 |
| **Skill 演化闭环** | 成功工作流可以从 memory 逐步提升成可热加载的 skill |
| **WebSocket-first 运维体验** | Canvas 能即时反映任务状态、结构变更和 A2A 响应 |
| **Global secrets + local override** | 统一管理 provider 凭据,只在需要时做 workspace 级覆写 |
@@ -111,9 +111,13 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
| Runtime / 架构 | 当前仓库状态 | 原生优势 | Molecule AI 额外补上的能力 |
|---|---|---|---|
| **LangGraph** | `main` 已支持 | 图控制强、工具调用成熟、Python 扩展性好 | Canvas orchestration、层级路由、A2A、memory scope、operational lifecycle |
| **DeepAgents** | `main` 已支持 | 规划和任务拆解更强 | 同一套 workspace contract、团队拓扑、activity、restart 行为 |
| **Claude Code** | `main` 已支持 | 真实编码工作流、CLI-native continuity | 安全 workspace 抽象、A2A delegation、组织边界、共享 control plane |
| **Codex** | `main` 已支持 | OpenAI Codex CLI 工作流 | 安全 workspace 抽象、A2A delegation、组织边界、共享 control plane |
| **CrewAI** | `main` 已支持 | 角色型 crew 模式清晰 | 持久 workspace 身份、统一策略、共享 Canvas 和 registry |
| **AutoGen** | `main` 已支持 | assistant/tool orchestration | 统一部署、层级协作、共享运维平面 |
| **Hermes 4** | `main` 已支持 | 混合推理、原生工具调用、json_schema 输出(NousResearch/hermes-agent | Option B 上游 hook、A2A 桥接 OpenAI 兼容 API、多 provider 自动派生 |
| **Gemini CLI** | `main` 已支持 | Google Gemini CLI 持续会话 | workspace 生命周期、A2A、层级感知协作、共享运维平面 |
| **OpenClaw** | `main` 已支持 | CLI-native runtime,自有 session 模型 | workspace 生命周期、templates、activity logs、拓扑感知协作 |
| **NemoClaw** | `feat/nemoclaw-t4-docker` 分支 WIP | NVIDIA 方向 runtime 路线 | 计划并入同一抽象层,但当前还不是 `main` 已合并能力 |
@@ -128,7 +132,7 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
| 扁平 store 或弱命名空间隔离 | 与层级对齐的 `LOCAL``TEAM``GLOBAL` scope |
| 分享很容易越界 | 分享是显式且结构感知的 |
| Memory 和 procedure 混成一团 | Memory 存 durable factsskills 存 repeatable procedure |
| 任意 agent 容易过权 | v2 memory plugin 的 per-workspace namespace 缩小 blast radius |
| 任意 agent 容易过权 | workspace awareness namespace 缩小 blast radius |
| UI memory 和 runtime memory 混在一起 | scoped agent memory、key/value workspace memory、recall surface 分层清晰 |
### 这套飞轮怎么转
@@ -158,7 +162,7 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
| 核心机制 | Molecule AI 对应模块 | 为什么重要 |
|---|---|---|
| **跨 session 的 durable memory** | `workspace/builtin_tools/memory.py``workspace-server/internal/handlers/memories.go``workspace-server/internal/memory/`v2 plugin client + namespace resolver| 不只是持久化,而且是**按 workspace 隔离**的 —— 每次写入都落在 workspace 自己的 `workspace:<id>` namespace 里;当 agent 显式升级到跨 workspace 共享时,可以通过平台 namespace ACL 写到 `team:<root>``org:<root>` |
| **跨 session 的 durable memory** | `workspace/builtin_tools/memory.py``workspace/builtin_tools/awareness_client.py``workspace-server/internal/handlers/memories.go` | 不只是持久化,而且是**按 workspace 隔离**的,可进一步路由到和组织结构绑定的 awareness namespace |
| **Cross-session recall** | `workspace-server/internal/handlers/activity.go` 中的 `/workspaces/:id/session-search` | Recall 同时覆盖 activity history 和 memory rows,不需要再造一个隐蔽的新存储层 |
| **从经验里长出技能** | `workspace/builtin_tools/memory.py` 里的 `_maybe_log_skill_promotion` | 从 memory 到 skill candidate 的提升会被显式记录成平台 activity,而不是默默发生在黑盒里 |
| **技能在使用中持续改进** | `workspace/skill_loader/watcher.py``workspace/skill_loader/loader.py``workspace/main.py` | Skill 改动可以热加载进 live runtime,下一次 A2A 任务就能直接使用,不需要重启 workspace |
@@ -167,7 +171,7 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
### 为什么这在 Molecule AI 里更适合团队级系统
1. **学习闭环是 org-aware 的,而不只是 session-aware。**
Memory 可以按 `LOCAL``TEAM``GLOBAL` scope 运作,v2 plugin 的 namespace ACL 让每个 workspace 都有清晰的持久边界。
Memory 可以按 `LOCAL``TEAM``GLOBAL` scope 运作,awareness namespace 让每个 workspace 都有清晰的持久边界。
2. **学习闭环是对运维可见的。**
Promotion events、activity logs、current-task updates、traces、WebSocket fanout 让自我进化进入 control plane,而不是藏在黑盒内部。
@@ -204,9 +208,9 @@ Molecule AI 并不是要替代下面这些 framework,而是把它们纳入更
### Runtime
- 统一 `workspace/` 镜像;生产环境采用 thin AMIus-east-2
- adapter 驱动执行,覆盖 **4维护中的 runtime**Claude Code、Codex、Hermes、OpenClaw
- adapter 驱动执行,覆盖 **8 个 runtime**Claude Code、Hermes、Gemini CLI、LangGraph、DeepAgents、CrewAI、AutoGen、OpenClaw
- Agent Card 注册
- **Memory v2 由 pgvector 支撑** —— 每个 tenant 一个 plugin sidecar,承载 HMA namespace、FTS 与语义召回
- awareness-backed memory**Memory v2 由 pgvector 支撑**语义召回
- plugin 挂载共享 rules/skills
- 本地 skills 热加载
- coordinator-only delegation 路径
@@ -255,9 +259,9 @@ Canvas (Next.js 15, warm-paper :3000) <--HTTP / WS--> Platform (Go 1.25 :8080)
+------------------------- 展示 ------------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python ≥3.11,含 adapter 集合的镜像)
- 4 个 adapter: Claude Code / Codex / Hermes / OpenClaw
- 8 个 adapter: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw
- Agent Card + A2A servertyped-SSOT 响应路径,RFC #2967
- heartbeat + activity + Memory v2pgvector 语义召回per-tenant plugin sidecar
- heartbeat + activity + awareness-backed memoryMemory v2 —— pgvector 语义召回)
- skills + plugins + hot reload
SaaS Control Plane (molecule-controlplane,私有)
@@ -317,7 +321,7 @@ npm run dev
## 当前范围说明
当前 `main` 已经包含核心平台、Canvas v4warm-paper 主题)、Memory v2pgvector 语义召回)、typed-SSOT A2A 响应路径(RFC #2967)、**4维护中的正式 adapter**Claude Code、Codex、Hermes、OpenClaw)、skill lifecycle,以及主要运维面。
当前 `main` 已经包含核心平台、Canvas v4warm-paper 主题)、Memory v2pgvector 语义召回)、typed-SSOT A2A 响应路径(RFC #2967)、**8 个正式 adapter**Claude Code、Hermes、Gemini CLI、LangGraph、DeepAgents、CrewAI、AutoGen、OpenClaw)、skill lifecycle,以及主要运维面。
配套的私有仓库 [`molecule-controlplane`](https://git.moleculesai.app/molecule-ai/molecule-controlplane) 提供 SaaS 层 —— 多租户编排(EC2 + Neon + Cloudflare Tunnels)、KMS 信封加密、WorkOS 鉴权、Stripe 计费,以及 `tenant_resources` 审计表加 30 分钟 reconciler。
+3 -7
View File
@@ -15,11 +15,9 @@ test("FilesTab renders after split", async ({ page, request }) => {
// Clean slate
const { workspaces } = await request
.get("http://localhost:8080/workspaces")
.then(async (r) => ({ workspaces: (await r.json()) as Array<{ id: string; name: string }> }));
.then(async (r) => ({ workspaces: (await r.json()) as Array<{ id: string }> }));
for (const w of workspaces) {
await request.delete(`http://localhost:8080/workspaces/${w.id}?confirm=true`, {
headers: { "X-Confirm-Name": w.name },
});
await request.delete(`http://localhost:8080/workspaces/${w.id}?confirm=true`);
}
// Create a workspace
@@ -82,7 +80,5 @@ test("FilesTab renders after split", async ({ page, request }) => {
await expect(editorEmpty.first()).toBeVisible({ timeout: 5_000 });
// Cleanup
await request.delete(`http://localhost:8080/workspaces/${wsId}?confirm=true`, {
headers: { "X-Confirm-Name": "FilesTab Smoke" },
});
await request.delete(`http://localhost:8080/workspaces/${wsId}?confirm=true`);
});
+6 -12
View File
@@ -49,7 +49,7 @@ export async function seedWorkspace(echoURL: string): Promise<SeededWorkspace> {
};
let authToken = ws.connection?.auth_token;
if (!authToken) {
authToken = await mintWorkspaceToken(ws.id);
authToken = await mintTestToken(ws.id);
}
if (!authToken) {
throw new Error("Workspace created but no auth_token returned");
@@ -202,18 +202,12 @@ export async function cleanupWorkspace(workspaceId: string): Promise<void> {
* Mint a workspace auth token so the canvas can make authenticated API
* calls (WorkspaceAuth middleware).
*/
export async function mintWorkspaceToken(workspaceId: string): Promise<string> {
const headers: Record<string, string> = {};
const adminToken = process.env.E2E_ADMIN_TOKEN ?? process.env.ADMIN_TOKEN;
if (adminToken) {
headers.Authorization = `Bearer ${adminToken}`;
}
const res = await fetch(`${PLATFORM_URL}/admin/workspaces/${workspaceId}/tokens`, {
method: "POST",
headers,
});
export async function mintTestToken(workspaceId: string): Promise<string> {
const res = await fetch(
`${PLATFORM_URL}/admin/workspaces/${workspaceId}/test-token`,
);
if (!res.ok) {
throw new Error(`Failed to mint workspace token: ${res.status}`);
throw new Error(`Failed to mint test token: ${res.status}`);
}
const data = (await res.json()) as { auth_token: string };
return data.auth_token;
-35
View File
@@ -1,35 +0,0 @@
import { dirname } from "path";
import { fileURLToPath } from "url";
import { FlatCompat } from "@eslint/eslintrc";
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
const compat = new FlatCompat({
baseDirectory: __dirname,
});
const eslintConfig = [
{
ignores: [
".next/**",
"coverage/**",
"out/**",
"build/**",
"next-env.d.ts",
],
},
...compat.extends("next/core-web-vitals", "next/typescript"),
{
rules: {
"@typescript-eslint/no-explicit-any": "warn",
"@typescript-eslint/no-require-imports": "warn",
"prefer-const": "warn",
"react-hooks/rules-of-hooks": "warn",
"react/display-name": "warn",
"react/no-unescaped-entities": "warn",
},
},
];
export default eslintConfig;
+1 -4337
View File
File diff suppressed because it is too large Load Diff
+2 -5
View File
@@ -6,12 +6,11 @@
"dev": "next dev --turbopack -p 3000",
"build": "next build",
"start": "next start",
"lint": "eslint .",
"lint": "next lint",
"test": "vitest run",
"test:coverage": "vitest run --coverage"
},
"dependencies": {
"@novnc/novnc": "^1.7.0",
"@radix-ui/react-alert-dialog": "^1.1.15",
"@radix-ui/react-dialog": "^1.1.15",
"@radix-ui/react-tabs": "^1.1.12",
@@ -31,7 +30,6 @@
},
"devDependencies": {
"@playwright/test": "^1.59.1",
"@tailwindcss/postcss": "^4.0.0",
"@testing-library/jest-dom": "^6.6.0",
"@testing-library/react": "^16.1.0",
"@types/node": "^25.6.0",
@@ -39,8 +37,7 @@
"@types/react-dom": "^19.0.0",
"@vitejs/plugin-react": "^6.0.1",
"@vitest/coverage-v8": "^4.1.5",
"eslint": "^9.39.4",
"eslint-config-next": "^15.5.15",
"@tailwindcss/postcss": "^4.0.0",
"jsdom": "^29.1.1",
"postcss": "^8.5.13",
"tailwindcss": "^4.0.0",
-6
View File
@@ -41,12 +41,6 @@ describe("buildCsp — production", () => {
expect(csp).toContain("object-src 'none'");
});
it("allows blob: in frame-src for authenticated PDF previews", () => {
const frameSrc = csp.match(/frame-src[^;]*/)?.[0] ?? "";
expect(frameSrc).toContain("'self'");
expect(frameSrc).toContain("blob:");
});
it("locks base-uri to 'self' (prevents base-tag injection)", () => {
expect(csp).toContain("base-uri 'self'");
});
+1 -1
View File
@@ -41,7 +41,7 @@ export default function PricingPage() {
<p className="mt-2 text-ink-mid">
We publish the{" "}
<a
href="https://git.moleculesai.app/molecule-ai/molecule-core"
href="https://git.moleculesai.app/molecule-ai/molecule-monorepo"
className="text-accent underline hover:text-accent"
>
full source on GitHub
+1 -4
View File
@@ -232,10 +232,7 @@ function CanvasInner() {
}
state.beginDelete(subtree);
try {
const workspaceName = state.nodes.find((n) => n.id === id)?.data.name ?? "";
await api.del(`/workspaces/${id}?confirm=true`, {
headers: { "X-Confirm-Name": workspaceName },
});
await api.del(`/workspaces/${id}?confirm=true`);
// Mirror the server-side cascade locally — drop the parent AND
// every descendant in one atomic update. The per-descendant
// WORKSPACE_REMOVED WS events still arrive (and are no-ops
+1 -1
View File
@@ -128,7 +128,7 @@ export function ConsoleModal({ workspaceId, workspaceName, open, onClose }: Prop
<div className="flex-1 overflow-auto bg-black/80 p-4">
{loading && (
<div role="status" aria-live="polite" className="text-[12px] text-ink-mid" data-testid="console-loading">
<div className="text-[12px] text-ink-mid" data-testid="console-loading">
Loading console output
</div>
)}
+244 -200
View File
@@ -5,13 +5,6 @@ import * as Dialog from "@radix-ui/react-dialog";
import { api } from "@/lib/api";
import { isSaaSTenant } from "@/lib/tenant";
import { ExternalConnectModal, type ExternalConnectionInfo } from "./ExternalConnectModal";
import {
ProviderModelSelector,
buildProviderCatalog,
findProviderForModel,
type SelectorModel,
type SelectorValue,
} from "./ProviderModelSelector";
interface WorkspaceOption {
id: string;
@@ -29,30 +22,48 @@ interface TemplateSpec {
id: string;
name?: string;
runtime?: string;
model?: string;
models?: SelectorModel[];
providers?: string[];
}
const DEFAULT_RUNTIME = "claude-code";
const RUNTIME_OPTIONS = [
{ value: "claude-code", label: "Claude Code" },
{ value: "codex", label: "OpenAI Codex CLI" },
{ value: "google-adk", label: "Google ADK" },
{ value: "hermes", label: "Hermes" },
{ value: "openclaw", label: "OpenClaw" },
interface HermesProvider {
id: string;
label: string;
envVar: string;
defaultModel: string;
models: string[];
}
const DEFAULT_CREATE_MODEL = "anthropic:claude-opus-4-7";
// All providers supported by Hermes runtime via providers.resolve_provider().
// `defaultModel` is the slug injected into the workspace provision request
// when the user picks this provider — template-hermes's derive-provider.sh
// maps the prefix back to the provider name at install time, so this is
// the canonical handshake. `models` are additional suggestions surfaced in
// the datalist so the user can pick a different size without typing the
// whole slug.
export const HERMES_PROVIDERS: HermesProvider[] = [
{ id: "anthropic", label: "Anthropic (Claude)", envVar: "ANTHROPIC_API_KEY", defaultModel: "anthropic/claude-sonnet-4-5", models: ["anthropic/claude-opus-4-5", "anthropic/claude-sonnet-4-5", "anthropic/claude-haiku-4-5"] },
{ id: "openai", label: "OpenAI", envVar: "OPENAI_API_KEY", defaultModel: "openai/gpt-4o", models: ["openai/gpt-4o", "openai/gpt-4o-mini", "openai/o3-mini"] },
{ id: "openrouter", label: "OpenRouter", envVar: "OPENROUTER_API_KEY", defaultModel: "openrouter/auto", models: ["openrouter/auto", "openrouter/anthropic/claude-sonnet-4", "openrouter/meta-llama/llama-3.3-70b"] },
{ id: "xai", label: "xAI (Grok)", envVar: "XAI_API_KEY", defaultModel: "xai/grok-4", models: ["xai/grok-4", "xai/grok-4-mini"] },
{ id: "gemini", label: "Google Gemini", envVar: "GEMINI_API_KEY", defaultModel: "gemini/gemini-2.5-pro", models: ["gemini/gemini-2.5-pro", "gemini/gemini-2.5-flash"] },
{ id: "qwen", label: "Qwen (Alibaba)", envVar: "QWEN_API_KEY", defaultModel: "alibaba/qwen3-max", models: ["alibaba/qwen3-max", "alibaba/qwen3-coder"] },
{ id: "glm", label: "GLM (Zhipu AI)", envVar: "GLM_API_KEY", defaultModel: "zai/glm-4.6", models: ["zai/glm-4.6", "zai/glm-4.5-air"] },
{ id: "kimi", label: "Kimi (Moonshot)", envVar: "KIMI_API_KEY", defaultModel: "kimi-coding/kimi-k2", models: ["kimi-coding/kimi-k2", "kimi-coding/kimi-k1.5"] },
{ id: "minimax", label: "MiniMax", envVar: "MINIMAX_API_KEY", defaultModel: "minimax/MiniMax-M2.7", models: ["minimax/MiniMax-M2.7", "minimax/MiniMax-M2.7-highspeed", "minimax/MiniMax-M1"] },
{ id: "deepseek", label: "DeepSeek", envVar: "DEEPSEEK_API_KEY", defaultModel: "deepseek/deepseek-chat", models: ["deepseek/deepseek-chat", "deepseek/deepseek-reasoner"] },
{ id: "groq", label: "Groq", envVar: "GROQ_API_KEY", defaultModel: "openrouter/groq/llama-3.3-70b", models: ["openrouter/groq/llama-3.3-70b"] },
{ id: "mistral", label: "Mistral", envVar: "MISTRAL_API_KEY", defaultModel: "openrouter/mistralai/mistral-large", models: ["openrouter/mistralai/mistral-large"] },
{ id: "together", label: "Together AI", envVar: "TOGETHER_API_KEY", defaultModel: "openrouter/meta-llama/llama-3.3-70b", models: ["openrouter/meta-llama/llama-3.3-70b"] },
{ id: "fireworks", label: "Fireworks AI", envVar: "FIREWORKS_API_KEY", defaultModel: "openrouter/meta-llama/llama-3.3-70b", models: ["openrouter/meta-llama/llama-3.3-70b"] },
{ id: "hermes", label: "Hermes / Nous (legacy)", envVar: "HERMES_API_KEY", defaultModel: "nousresearch/Hermes-3-Llama-3.1-405B", models: ["nousresearch/Hermes-3-Llama-3.1-405B", "nousresearch/Hermes-4-14B"] },
];
const BASE_RUNTIME_TEMPLATE_IDS = new Set(["claude-code-default", "codex", "google-adk", "hermes", "openclaw"]);
const DEFAULT_HEADLESS_INSTANCE_TYPE = "t3.medium";
const DEFAULT_HEADLESS_ROOT_GB = 30;
const DEFAULT_DISPLAY_INSTANCE_TYPE = "t3.xlarge";
const DEFAULT_DISPLAY_ROOT_GB = 80;
export function CreateWorkspaceButton() {
const [open, setOpen] = useState(false);
const [name, setName] = useState("");
const [role, setRole] = useState("");
const [runtime, setRuntime] = useState(DEFAULT_RUNTIME);
const [template, setTemplate] = useState("");
const [parentId, setParentId] = useState("");
const [budgetLimit, setBudgetLimit] = useState("");
@@ -60,36 +71,41 @@ export function CreateWorkspaceButton() {
const [error, setError] = useState<string | null>(null);
const [workspaces, setWorkspaces] = useState<WorkspaceOption[]>([]);
const [displayEnabled, setDisplayEnabled] = useState(false);
const [displayInstanceType, setDisplayInstanceType] = useState(DEFAULT_DISPLAY_INSTANCE_TYPE);
const [displayRootGB, setDisplayRootGB] = useState(String(DEFAULT_DISPLAY_ROOT_GB));
const [displayInstanceType, setDisplayInstanceType] = useState("t3.xlarge");
const [displayRootGB, setDisplayRootGB] = useState("80");
const [displayResolution, setDisplayResolution] = useState("1920x1080");
// Templates fetched from /api/templates — drives the dynamic provider
// filter below. Same data source ConfigTab uses (PR #2454). When the
// selected template declares `runtime_config.providers` in its
// config.yaml, the modal surfaces only those providers in the
// <select>. Provider/model options are derived from template models.
// <select>. Empty/missing list falls back to the full HERMES_PROVIDERS
// catalog so older templates without the field keep working.
const [templateSpecs, setTemplateSpecs] = useState<TemplateSpec[]>([]);
// External-runtime path: skip docker provision, mint a workspace_auth_token,
// and surface the connection snippet in a modal after create. When
// isExternal is true the template and model fields are hidden (they're
// meaningless for BYO-compute agents).
// isExternal is true the template / model / hermes-provider fields are
// hidden (they're meaningless for BYO-compute agents).
const [isExternal, setIsExternal] = useState(false);
const [externalRuntime, setExternalRuntime] = useState("external");
const [externalConnection, setExternalConnection] =
useState<ExternalConnectionInfo | null>(null);
const [llmSelection, setLLMSelection] = useState<SelectorValue>({
providerId: "",
model: "",
envVars: [],
});
const [llmSecret, setLLMSecret] = useState("");
// Hermes-specific state
const [hermesProvider, setHermesProvider] = useState("anthropic");
const [hermesApiKey, setHermesApiKey] = useState("");
// Model slug is sent to CP as `model` and plumbed to the workspace EC2
// as HERMES_DEFAULT_MODEL env var. template-hermes's derive-provider.sh
// reads the prefix (`minimax/…`, `anthropic/…`) to set
// HERMES_INFERENCE_PROVIDER at install time. Missing model → provider
// falls back to "auto" and hermes picks its compiled-in default
// (Anthropic), which 401s if the user's key is for a different
// provider. Hence: require model when template=hermes.
const [hermesModel, setHermesModel] = useState("");
// Tier picker: on SaaS every workspace gets its own EC2 VM (Full Access
// by construction), so we hide the T1/T2/T3 Docker-sandbox tiers and
// lock to T4 — the full-host access tier. The EC2 size is controlled by
// the compute profile below. On self-hosted we still offer T1/T2/T3
// because the Docker-
// lock to T4 — the full-host access tier, which maps to t3.large at the
// CP level. On self-hosted we still offer T1/T2/T3 because the Docker-
// sandbox distinction is a real choice there; T4 is available too for
// operators who want the full-host tier.
//
@@ -139,65 +155,69 @@ export function CreateWorkspaceButton() {
[]
);
const handleRuntimeChange = useCallback((nextRuntime: string) => {
setRuntime(nextRuntime);
setTemplate("");
setLLMSelection({ providerId: "", model: "", envVars: [] });
setLLMSecret("");
}, []);
const isHermes = template.trim().toLowerCase() === "hermes";
// Resolve the selected workspace template from /templates. Runtime is
// deliberately separate: "SEO Agent" is a workspace template, not a
// runtime, so it must never appear in the runtime selector.
// Resolve the selected template's spec from the /templates response.
// The `template` input is free-text; templates can be matched by id,
// name, or runtime so any of those work. Lower-cased compare keeps
// "Hermes" / "hermes" / "HERMES" interchangeable.
const selectedTemplateSpec = useMemo<TemplateSpec | null>(() => {
if (!template) return null;
return templateSpecs.find((s) => s.id === template) ?? null;
const t = template.trim().toLowerCase();
if (!t) return null;
return (
templateSpecs.find(
(s) =>
(s.id || "").toLowerCase() === t ||
(s.name || "").toLowerCase() === t ||
(s.runtime || "").toLowerCase() === t,
) ?? null
);
}, [template, templateSpecs]);
const selectedRuntimeTemplateSpec = useMemo<TemplateSpec | null>(() => (
templateSpecs.find((s) => {
if (!BASE_RUNTIME_TEMPLATE_IDS.has(s.id)) return false;
const specRuntime = (s.runtime ?? s.id).trim().toLowerCase();
return s.id === runtime || specRuntime === runtime;
}) ?? null
), [runtime, templateSpecs]);
const visibleTemplateSpecs = useMemo(
() => templateSpecs.filter((spec) => {
if (BASE_RUNTIME_TEMPLATE_IDS.has(spec.id)) return false;
const specRuntime = (spec.runtime ?? DEFAULT_RUNTIME).trim().toLowerCase();
return specRuntime === runtime;
}),
[runtime, templateSpecs],
);
const llmModels = useMemo(
() => {
const sourceSpec = selectedTemplateSpec ?? selectedRuntimeTemplateSpec;
if (!sourceSpec?.models?.length) return [];
return sourceSpec.models;
},
[selectedRuntimeTemplateSpec, selectedTemplateSpec],
);
const llmCatalog = useMemo(() => buildProviderCatalog(llmModels), [llmModels]);
const selectedLLMProvider = useMemo(
() => llmCatalog.find((p) => p.id === llmSelection.providerId) ?? llmCatalog[0],
[llmCatalog, llmSelection.providerId],
);
// Filter HERMES_PROVIDERS by what the template declares it supports.
// Empty/missing declared list → fall back to the full catalog so
// templates that haven't migrated to the explicit `providers:` field
// (and self-hosted setups without /templates) keep working unchanged.
const availableProviders = useMemo<HermesProvider[]>(() => {
const declared = selectedTemplateSpec?.providers;
if (!declared || declared.length === 0) return HERMES_PROVIDERS;
const allowed = new Set(declared.map((p) => p.toLowerCase()));
const filtered = HERMES_PROVIDERS.filter((p) => allowed.has(p.id.toLowerCase()));
// Defensive: if the template's declared list doesn't match anything
// in our static catalog (e.g. brand-new provider id we don't have
// metadata for yet), fall back to the full list rather than render
// an empty <select>. Better to over-show than to lock the user out.
return filtered.length > 0 ? filtered : HERMES_PROVIDERS;
}, [selectedTemplateSpec]);
// If the currently-selected provider is filtered out by a template
// change, snap back to the first available. Without this, the
// hermesProvider state could refer to a provider not in the dropdown
// — confusing UI + the API key field's envVar would be wrong.
useEffect(() => {
if (llmCatalog.length === 0) return;
const sourceDefault = (selectedTemplateSpec ?? selectedRuntimeTemplateSpec)?.model?.trim();
const platformProvider = llmCatalog.find((p) => p.vendor === "platform");
const matched = sourceDefault ? findProviderForModel(llmCatalog, sourceDefault) : null;
const next = platformProvider ?? matched ?? llmCatalog[0];
const defaultModel = next.models.find((model) => model.id === sourceDefault)?.id
?? next.models[0]?.id
?? "";
setLLMSelection({
providerId: next.id,
model: next.wildcard ? "" : defaultModel,
envVars: next.envVars,
});
setLLMSecret("");
}, [llmCatalog, selectedRuntimeTemplateSpec, selectedTemplateSpec]);
if (!isHermes) return;
if (availableProviders.length === 0) return;
if (!availableProviders.some((p) => p.id === hermesProvider)) {
setHermesProvider(availableProviders[0].id);
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [availableProviders, isHermes]);
// Auto-fill hermesModel with the provider's defaultModel whenever the
// provider changes, but only if the user hasn't already typed their own
// slug. Prevents the empty-model → "auto" → Anthropic-default 401 trap.
useEffect(() => {
if (!isHermes) return;
const p = HERMES_PROVIDERS.find((x) => x.id === hermesProvider);
if (!p) return;
// Replace model only if current value matches another provider's
// default (user hasn't customized it) OR is empty.
const isUntouched =
hermesModel === "" ||
HERMES_PROVIDERS.some((x) => x.defaultModel === hermesModel);
if (isUntouched) setHermesModel(p.defaultModel);
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [hermesProvider, isHermes]);
// Reset form and load workspaces whenever dialog opens
useEffect(() => {
@@ -205,18 +225,18 @@ export function CreateWorkspaceButton() {
setName("");
setRole("");
setTier(defaultTier);
setRuntime(DEFAULT_RUNTIME);
setTemplate("");
setParentId("");
setBudgetLimit("");
setError(null);
setDisplayEnabled(false);
setDisplayInstanceType(DEFAULT_DISPLAY_INSTANCE_TYPE);
setDisplayRootGB(String(DEFAULT_DISPLAY_ROOT_GB));
setDisplayInstanceType("t3.xlarge");
setDisplayRootGB("80");
setDisplayResolution("1920x1080");
setHermesProvider("anthropic");
setExternalRuntime("external");
setLLMSelection({ providerId: "", model: "", envVars: [] });
setLLMSecret("");
setHermesApiKey("");
setHermesModel("");
api
.get<WorkspaceOption[]>("/workspaces")
.then((ws) => setWorkspaces(ws))
@@ -224,7 +244,7 @@ export function CreateWorkspaceButton() {
api
.get<TemplateSpec[]>("/templates")
.then((rows) => setTemplateSpecs(Array.isArray(rows) ? rows : []))
.catch(() => { /* keep empty; create stays blocked until the catalog loads */ });
.catch(() => { /* keep empty — HERMES_PROVIDERS fallback below */ });
// defaultTier is stable for the session (derived from window.location),
// safe to omit from deps.
// eslint-disable-next-line react-hooks/exhaustive-deps
@@ -235,18 +255,20 @@ export function CreateWorkspaceButton() {
setError("Name is required");
return;
}
if (!isExternal && !llmSelection.model.trim()) {
setError("Model is required");
if (isHermes && !hermesApiKey.trim()) {
setError("API key is required for Hermes workspaces");
return;
}
if (!isExternal && selectedLLMProvider?.envVars.length && !llmSecret.trim()) {
setError("Provider credential is required");
if (isHermes && !hermesModel.trim()) {
setError("Model is required for Hermes workspaces — provider routing depends on the model slug prefix");
return;
}
setCreating(true);
setError(null);
const nativeProvider = selectedLLMProvider;
const provider = isHermes
? HERMES_PROVIDERS.find((p) => p.id === hermesProvider)
: undefined;
try {
const parsedBudget = budgetLimit.trim()
@@ -270,40 +292,32 @@ export function CreateWorkspaceButton() {
tier,
parent_id: parentId || undefined,
budget_limit: parsedBudget,
...(!isExternal && nativeProvider
...(!isExternal && !isHermes ? { model: DEFAULT_CREATE_MODEL } : {}),
...(displayEnabled
? {
model: llmSelection.model.trim(),
llm_provider: nativeProvider.vendor,
...(nativeProvider.envVars.length > 0
? { secrets: { [nativeProvider.envVars[0]]: llmSecret.trim() } }
: {}),
}
: {}),
...(!isExternal
? {
compute: displayEnabled
? {
instance_type: displayInstanceType,
volume: { root_gb: Number.isFinite(parsedRootGB) ? parsedRootGB : DEFAULT_DISPLAY_ROOT_GB },
display: {
mode: "desktop-control",
protocol: "novnc",
width: Number.isFinite(displayWidth) ? displayWidth : 1920,
height: Number.isFinite(displayHeight) ? displayHeight : 1080,
},
}
: {
instance_type: DEFAULT_HEADLESS_INSTANCE_TYPE,
volume: { root_gb: DEFAULT_HEADLESS_ROOT_GB },
display: { mode: "none" },
},
compute: {
instance_type: displayInstanceType,
volume: { root_gb: Number.isFinite(parsedRootGB) ? parsedRootGB : 80 },
display: {
mode: "desktop-control",
protocol: "novnc",
width: Number.isFinite(displayWidth) ? displayWidth : 1920,
height: Number.isFinite(displayHeight) ? displayHeight : 1080,
},
},
}
: {}),
canvas: { x: Math.random() * 400 + 100, y: Math.random() * 300 + 100 },
// Runtime=external flips the backend into awaiting-agent mode:
// no container provisioning, token minted, connection payload
// returned in the response for the modal below.
...(isExternal ? { runtime: externalRuntime } : { runtime }),
...(isExternal ? { runtime: externalRuntime } : {}),
...(!isExternal && isHermes && provider
? {
secrets: { [provider.envVar]: hermesApiKey.trim() },
model: hermesModel.trim(),
}
: {}),
});
// External path: keep the create dialog open just long enough to
// hand control to the connect modal, then close. The connect
@@ -415,76 +429,13 @@ export function CreateWorkspaceButton() {
)}
{!isExternal && (
<div className="space-y-3">
<div>
<label htmlFor="runtime-select" className="text-[11px] text-ink-mid block mb-1">
Runtime
</label>
<select
id="runtime-select"
value={runtime}
onChange={(e) => handleRuntimeChange(e.target.value)}
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink focus:outline-none focus:border-accent/60 focus:ring-1 focus:ring-accent/20 transition-colors"
>
{RUNTIME_OPTIONS.map((option) => (
<option key={option.value} value={option.value}>
{option.label}
</option>
))}
</select>
</div>
<div>
<label htmlFor="workspace-template-select" className="text-[11px] text-ink-mid block mb-1">
Workspace Template
</label>
<select
id="workspace-template-select"
value={template}
onChange={(e) => setTemplate(e.target.value)}
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink focus:outline-none focus:border-accent/60 focus:ring-1 focus:ring-accent/20 transition-colors"
>
<option value="">Blank workspace</option>
{visibleTemplateSpecs.map((spec) => (
<option key={spec.id} value={spec.id}>
{spec.name || spec.id}
</option>
))}
</select>
</div>
</div>
)}
{!isExternal && selectedLLMProvider && (
<div className="rounded-lg border border-line/50 bg-surface-card/40 p-3 space-y-3">
<div className="text-[11px] font-medium text-ink-mid">
LLM
</div>
<ProviderModelSelector
models={llmModels}
value={llmSelection}
onChange={(next) => {
setLLMSelection(next);
setLLMSecret("");
}}
idPrefix="create-workspace-llm"
variant="stack"
/>
{selectedLLMProvider.envVars.length > 0 && (
<div>
<label htmlFor="llm-secret-input" className="text-[11px] text-ink-mid block mb-1">
{selectedLLMProvider.envVars[0]}
</label>
<input
id="llm-secret-input"
type="password"
value={llmSecret}
onChange={(e) => setLLMSecret(e.target.value)}
autoComplete="off"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink placeholder-ink-soft focus:outline-none focus:border-accent/60 focus:ring-1 focus:ring-accent/20 transition-colors font-mono"
/>
</div>
)}
</div>
<InputField
label="Template"
value={template}
onChange={setTemplate}
placeholder="e.g. seo-agent (from workspace-configs-templates/)"
mono
/>
)}
<div>
@@ -591,11 +542,10 @@ export function CreateWorkspaceButton() {
)}
<div>
<label htmlFor="parent-workspace-select" className="text-[11px] text-ink-mid block mb-1">
<label className="text-[11px] text-ink-mid block mb-1">
Parent Workspace
</label>
<select
id="parent-workspace-select"
value={parentId}
onChange={(e) => setParentId(e.target.value)}
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink focus:outline-none focus:border-accent/60 focus:ring-1 focus:ring-accent/20 transition-colors"
@@ -610,6 +560,100 @@ export function CreateWorkspaceButton() {
</div>
</div>
{/* Hermes provider configuration — shown only when template === "hermes" */}
{isHermes && (
<div
className="mt-4 rounded-xl border border-violet-700/40 bg-violet-950/20 p-4 space-y-3"
data-testid="hermes-provider-section"
>
<p className="text-[11px] font-semibold text-violet-400 uppercase tracking-wide">
Hermes Provider
</p>
<p className="text-[11px] text-ink-mid -mt-1">
Choose the AI provider and paste your API key. The key is
stored as an encrypted workspace secret.
</p>
<div>
<label
htmlFor="hermes-provider-select"
className="text-[11px] text-ink-mid block mb-1"
>
Provider
</label>
<select
id="hermes-provider-select"
value={hermesProvider}
onChange={(e) => setHermesProvider(e.target.value)}
aria-label="Hermes provider"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink focus:outline-none focus:border-violet-500/60 focus:ring-1 focus:ring-violet-500/20 transition-colors"
>
{availableProviders.map((p) => (
<option key={p.id} value={p.id}>
{p.label}
</option>
))}
</select>
</div>
<div>
<label
htmlFor="hermes-api-key-input"
className="text-[11px] text-ink-mid block mb-1"
>
API Key{" "}
<span aria-hidden="true" className="text-bad">
*
</span>
<span className="sr-only"> (required)</span>
</label>
<input
id="hermes-api-key-input"
type="password"
value={hermesApiKey}
onChange={(e) => setHermesApiKey(e.target.value)}
placeholder="sk-…"
aria-label="Hermes API key"
autoComplete="off"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink placeholder-ink-soft focus:outline-none focus:border-violet-500/60 focus:ring-1 focus:ring-violet-500/20 transition-colors font-mono"
/>
</div>
<div>
<label
htmlFor="hermes-model-input"
className="text-[11px] text-ink-mid block mb-1"
>
Model{" "}
<span aria-hidden="true" className="text-bad">
*
</span>
<span className="sr-only"> (required)</span>
</label>
<input
id="hermes-model-input"
type="text"
value={hermesModel}
onChange={(e) => setHermesModel(e.target.value)}
placeholder="e.g. minimax/MiniMax-M2.7"
aria-label="Hermes model slug"
autoComplete="off"
spellCheck={false}
list="hermes-model-suggestions"
className="w-full bg-surface-card/60 border border-line/50 rounded-lg px-3 py-2 text-sm text-ink placeholder-ink-soft focus:outline-none focus:border-violet-500/60 focus:ring-1 focus:ring-violet-500/20 transition-colors font-mono"
/>
<datalist id="hermes-model-suggestions">
{HERMES_PROVIDERS.find((p) => p.id === hermesProvider)?.models.map(
(m) => <option key={m} value={m} />,
)}
</datalist>
<p className="text-[10px] text-ink-mid mt-1">
Slug determines which provider hermes routes to at install time.
</p>
</div>
</div>
)}
{error && (
<div
role="alert"
+2 -2
View File
@@ -4,7 +4,7 @@ import { useState, useEffect, useCallback } from "react";
import { api } from "@/lib/api";
import { useCanvasStore } from "@/store/canvas";
import { OrgTemplatesSection } from "./TemplatePalette";
import { isUserVisibleWorkspaceTemplate, type Template } from "@/lib/deploy-preflight";
import { type Template } from "@/lib/deploy-preflight";
import { useTemplateDeploy } from "@/hooks/useTemplateDeploy";
import { Spinner } from "./Spinner";
import { TIER_CONFIG } from "@/lib/design-tokens";
@@ -18,7 +18,7 @@ export function EmptyState() {
useEffect(() => {
api
.get<Template[]>("/templates")
.then((t) => setTemplates(t.filter(isUserVisibleWorkspaceTemplate)))
.then((t) => setTemplates(t))
.catch(() => setTemplates([]))
.finally(() => setLoading(false));
}, []);
+1 -56
View File
@@ -24,10 +24,9 @@
* "no memories yet".
*/
import { useCallback, useEffect, useMemo, useRef, useState } from 'react';
import { useCallback, useEffect, useMemo, useState } from 'react';
import { api } from '@/lib/api';
import { ConfirmDialog } from '@/components/ConfirmDialog';
import { useSocketEvent } from '@/hooks/useSocketEvent';
// ── Types ─────────────────────────────────────────────────────────────────────
@@ -247,60 +246,6 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
loadEntries();
}, [loadEntries]);
// Live-refresh on ACTIVITY_LOGGED events that look like memory writes
// for this workspace (#1734). Without this, the user sees a stale
// empty state after an agent commits — agent says "wrote memory",
// panel keeps showing nothing until they hit Refresh.
//
// What actually broadcasts ACTIVITY_LOGGED on the server today
// (workspace-server/internal/handlers/activity.go LogActivity /
// LogActivityTx — those are the only emitters):
//
// - `memory_write_global` — `POST /workspaces/:id/memories` for GLOBAL scope
// - `memory_edit_global` — `PATCH /workspaces/:id/memories/:id` for GLOBAL scope
// - `memory_delete_global` — `DELETE /workspaces/:id/memories/:id` for GLOBAL scope
// - `agent_log` — generic catch-all an agent emits via
// `POST /workspaces/:id/activity`
//
// The MCP-tool path (`commit_memory`, `commit_memory_v2`,
// `commit_summary`) does NOT broadcast on the wire today; it inserts
// into agent_memories (pre-A1) or calls the v2 plugin (post-A1) and
// never round-trips through LogActivity. Server-side follow-up is
// tracked in **#1754** — once the MCP handlers emit `memory_write`
// via LogActivity, the `agent_log` arm of the filter below can be
// dropped. `memory_write` is included pre-emptively so this code
// lights up the moment #1754 lands. Until then, `agent_log` catches
// MCP commits over-inclusively; the 300ms debounce bounds the
// refetch rate. Issue #1734 review finding.
//
// The 300ms debounce coalesces bursts so a chatty agent (e.g. an
// agent in a long task emitting agent_log every few hundred ms)
// doesn't hammer /v2/memories on every keystroke-equivalent.
const refetchTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
useEffect(() => () => {
if (refetchTimerRef.current) clearTimeout(refetchTimerRef.current);
}, []);
useSocketEvent((msg) => {
if (msg.event !== 'ACTIVITY_LOGGED') return;
if (msg.workspace_id !== workspaceId) return;
const p = (msg.payload || {}) as Record<string, unknown>;
const activityType = (p.activity_type as string) || '';
switch (activityType) {
case 'memory_write':
case 'memory_write_global':
case 'memory_edit_global':
case 'memory_delete_global':
case 'agent_log':
break;
default:
return;
}
if (refetchTimerRef.current) clearTimeout(refetchTimerRef.current);
refetchTimerRef.current = setTimeout(() => {
loadEntries();
}, 300);
});
// ── Delete handlers ─────────────────────────────────────────────────────────
const confirmDelete = useCallback(async () => {
+18 -240
View File
@@ -23,8 +23,6 @@ interface Props {
/** Grouped provider options derived from the template's models[] /
* required_env. When length 2 the modal shows a radio picker. */
providers?: ProviderChoice[];
/** Optional keys to offer in the deploy modal without blocking Deploy. */
optionalKeys?: string[];
/** Runtime slug used only for the "The <runtime> runtime "
* headline; behavior is driven by providers/missingKeys. */
runtime: string;
@@ -96,13 +94,13 @@ export function MissingKeysModal({
open,
missingKeys,
providers,
optionalKeys,
runtime,
onKeysAdded,
onCancel,
onOpenSettings,
workspaceId,
configuredKeys,
modelSuggestions,
models,
initialModel,
title,
@@ -116,13 +114,13 @@ export function MissingKeysModal({
<ProviderPickerModal
open={open}
providers={pickerProviders}
optionalKeys={optionalKeys ?? []}
runtime={runtime}
onKeysAdded={onKeysAdded}
onCancel={onCancel}
onOpenSettings={onOpenSettings}
workspaceId={workspaceId}
configuredKeys={configuredKeys}
modelSuggestions={modelSuggestions}
models={models}
initialModel={initialModel}
title={title}
@@ -140,15 +138,11 @@ export function MissingKeysModal({
<AllKeysModal
open={open}
missingKeys={keys}
optionalKeys={optionalKeys ?? []}
runtime={runtime}
onKeysAdded={onKeysAdded}
onCancel={onCancel}
onOpenSettings={onOpenSettings}
workspaceId={workspaceId}
configuredKeys={configuredKeys}
title={title}
description={description}
/>
);
}
@@ -176,13 +170,13 @@ export function providerIdForModel(
function ProviderPickerModal({
open,
providers,
optionalKeys,
runtime,
onKeysAdded,
onCancel,
onOpenSettings,
workspaceId,
configuredKeys,
modelSuggestions,
models,
initialModel,
title,
@@ -190,13 +184,13 @@ function ProviderPickerModal({
}: {
open: boolean;
providers: ProviderChoice[];
optionalKeys: string[];
runtime: string;
onKeysAdded: (model?: string) => void;
onCancel: () => void;
onOpenSettings?: () => void;
workspaceId?: string;
configuredKeys?: Set<string>;
modelSuggestions?: string[];
models?: ModelSpec[];
initialModel?: string;
title?: string;
@@ -256,9 +250,16 @@ function ProviderPickerModal({
const [selectorValue, setSelectorValue] = useState<SelectorValue>(initial);
const [entries, setEntries] = useState<KeyEntry[]>([]);
const [optionalEntries, setOptionalEntries] = useState<KeyEntry[]>([]);
const firstInputRef = useRef<HTMLInputElement>(null);
// Legacy compat: map the selector value back into the old `selected`/
// `model` shape for the rest of the modal body (footer copy, etc.).
const selected = useMemo(
() =>
providers.find((p) => p.id === selectorValue.providerId) ??
providers[0],
[providers, selectorValue.providerId],
);
const model = selectorValue.model;
const showModelInput = catalog.length > 0;
@@ -281,18 +282,7 @@ function ProviderPickerModal({
error: null,
})),
);
setOptionalEntries(
optionalKeys
.filter((key) => !selectorValue.envVars.includes(key))
.map((key) => ({
key,
value: "",
saved: configuredKeys?.has(key) ?? false,
saving: false,
error: null,
})),
);
}, [open, selectorValue.envVars, configuredKeys, optionalKeys]);
}, [open, selectorValue.envVars, configuredKeys]);
useEffect(() => {
if (!open) return;
@@ -346,43 +336,6 @@ function ProviderPickerModal({
[entries, updateEntry, workspaceId],
);
const updateOptionalEntry = useCallback(
(index: number, updates: Partial<KeyEntry>) => {
setOptionalEntries((prev) =>
prev.map((e, i) => (i === index ? { ...e, ...updates } : e)),
);
},
[],
);
const handleSaveOptionalKey = useCallback(
async (index: number) => {
const entry = optionalEntries[index];
if (!entry.value.trim()) return;
updateOptionalEntry(index, { saving: true, error: null });
try {
if (workspaceId) {
await api.put(`/workspaces/${workspaceId}/secrets`, {
key: entry.key,
value: entry.value.trim(),
});
} else {
await api.put("/settings/secrets", {
key: entry.key,
value: entry.value.trim(),
});
}
updateOptionalEntry(index, { saved: true, saving: false });
} catch (e) {
updateOptionalEntry(index, {
saving: false,
error: e instanceof Error ? e.message : "Failed to save",
});
}
},
[optionalEntries, updateOptionalEntry, workspaceId],
);
if (!open) return null;
// Portal to document.body for the same reason as
// OrgImportPreflightModal — several callers (TemplatePalette,
@@ -512,62 +465,6 @@ function ProviderPickerModal({
</div>
))}
</div>
{optionalEntries.length > 0 && (
<div className="space-y-2">
<div className="text-[10px] uppercase tracking-wide text-ink-mid font-semibold">
Optional
</div>
{optionalEntries.map((entry, index) => (
<div
key={entry.key}
className="bg-surface-card/30 rounded-lg px-3 py-2.5 border border-line/40"
>
<div className="flex items-center justify-between mb-1.5">
<div>
<div className="text-[11px] text-ink-mid font-medium">
{getKeyLabel(entry.key)}
</div>
<div className="text-[9px] font-mono text-ink-mid">{entry.key}</div>
</div>
{entry.saved && (
<span className="text-[9px] text-good bg-emerald-900/30 px-1.5 py-0.5 rounded flex items-center gap-1">
Saved
</span>
)}
</div>
{!entry.saved && (
<div className="flex gap-2 mt-2">
<input
value={entry.value}
onChange={(e) => updateOptionalEntry(index, { value: e.target.value.trimStart() })}
placeholder={entry.key.includes("API_KEY") ? "sk-..." : "Enter value"}
type="password"
aria-label={`Optional value for ${entry.key}`}
onKeyDown={(e) => {
if (e.key === "Enter" && entry.value.trim()) {
handleSaveOptionalKey(index);
}
}}
className="flex-1 bg-surface-sunken border border-line rounded px-2 py-1.5 text-[11px] text-ink font-mono focus:outline-none focus:border-accent focus:ring-1 focus:ring-accent/20 transition-colors"
/>
<button
type="button"
onClick={() => handleSaveOptionalKey(index)}
disabled={!entry.value.trim() || entry.saving}
className="px-3 py-1.5 bg-surface-card hover:bg-surface-card/80 text-[11px] rounded text-ink border border-line disabled:opacity-30 transition-colors shrink-0 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
{entry.saving ? "..." : "Save"}
</button>
</div>
)}
{entry.error && (
<div role="alert" aria-live="assertive" className="mt-1.5 text-[10px] text-bad">{entry.error}</div>
)}
</div>
))}
</div>
)}
</div>
<div className="px-5 py-3 border-t border-line bg-surface/50 flex items-center justify-between gap-2">
@@ -615,30 +512,21 @@ function ProviderPickerModal({
function AllKeysModal({
open,
missingKeys,
optionalKeys,
runtime,
onKeysAdded,
onCancel,
onOpenSettings,
workspaceId,
configuredKeys,
title,
description,
}: {
open: boolean;
missingKeys: string[];
optionalKeys: string[];
runtime: string;
onKeysAdded: () => void;
onCancel: () => void;
onOpenSettings?: () => void;
workspaceId?: string;
configuredKeys?: Set<string>;
title?: string;
description?: string;
}) {
const [entries, setEntries] = useState<KeyEntry[]>([]);
const [optionalEntries, setOptionalEntries] = useState<KeyEntry[]>([]);
const [globalError, setGlobalError] = useState<string | null>(null);
useEffect(() => {
@@ -647,24 +535,13 @@ function AllKeysModal({
missingKeys.map((key) => ({
key,
value: "",
saved: configuredKeys?.has(key) ?? false,
saved: false,
saving: false,
error: null,
})),
);
setOptionalEntries(
optionalKeys
.filter((key) => !missingKeys.includes(key))
.map((key) => ({
key,
value: "",
saved: configuredKeys?.has(key) ?? false,
saving: false,
error: null,
})),
);
setGlobalError(null);
}, [open, missingKeys, optionalKeys, configuredKeys]);
}, [open, missingKeys]);
useEffect(() => {
if (!open) return;
@@ -714,45 +591,6 @@ function AllKeysModal({
[entries, updateEntry, workspaceId],
);
const updateOptionalEntry = useCallback(
(index: number, updates: Partial<KeyEntry>) => {
setOptionalEntries((prev) =>
prev.map((entry, i) => (i === index ? { ...entry, ...updates } : entry)),
);
},
[],
);
const handleSaveOptionalKey = useCallback(
async (index: number) => {
const entry = optionalEntries[index];
if (!entry.value.trim()) return;
updateOptionalEntry(index, { saving: true, error: null });
try {
if (workspaceId) {
await api.put(`/workspaces/${workspaceId}/secrets`, {
key: entry.key,
value: entry.value.trim(),
});
} else {
await api.put("/settings/secrets", {
key: entry.key,
value: entry.value.trim(),
});
}
updateOptionalEntry(index, { saved: true, saving: false });
} catch (e) {
updateOptionalEntry(index, {
saving: false,
error: e instanceof Error ? e.message : "Failed to save",
});
}
},
[optionalEntries, updateOptionalEntry, workspaceId],
);
const handleAddKeysAndDeploy = useCallback(() => {
const anySaving = entries.some((e) => e.saving);
if (anySaving) {
@@ -818,16 +656,12 @@ function AllKeysModal({
</svg>
</div>
<h3 id="missing-keys-title" className="text-sm font-semibold text-ink">
{title ?? "Missing API Keys"}
Missing API Keys
</h3>
</div>
<p className="text-[12px] text-ink-mid leading-relaxed">
{description ?? (
<>
The <span className="text-warm font-medium">{runtimeLabel}</span>{" "}
runtime requires the following keys to be configured before deploying.
</>
)}
The <span className="text-warm font-medium">{runtimeLabel}</span>{" "}
runtime requires the following keys to be configured before deploying.
</p>
</div>
@@ -885,62 +719,6 @@ function AllKeysModal({
</div>
))}
{optionalEntries.length > 0 && (
<div className="space-y-2">
<div className="text-[10px] uppercase tracking-wide text-ink-mid font-semibold">
Optional
</div>
{optionalEntries.map((entry, index) => (
<div
key={entry.key}
className="bg-surface-card/30 rounded-lg px-3 py-2.5 border border-line/40"
>
<div className="flex items-center justify-between mb-1">
<div>
<div className="text-[11px] text-ink-mid font-medium">
{getKeyLabel(entry.key)}
</div>
<div className="text-[9px] font-mono text-ink-mid">{entry.key}</div>
</div>
{entry.saved && (
<span className="text-[9px] text-good bg-emerald-900/30 px-1.5 py-0.5 rounded">
Saved
</span>
)}
</div>
{!entry.saved && (
<div className="flex gap-2 mt-2">
<input
value={entry.value}
onChange={(e) => updateOptionalEntry(index, { value: e.target.value.trimStart() })}
placeholder={entry.key.includes("API_KEY") ? "sk-..." : "Enter value"}
type="password"
aria-label={`Optional value for ${entry.key}`}
onKeyDown={(e) => {
if (e.key === "Enter" && entry.value.trim()) {
handleSaveOptionalKey(index);
}
}}
className="flex-1 bg-surface-sunken border border-line rounded px-2 py-1.5 text-[11px] text-ink font-mono focus:outline-none focus:border-accent focus:ring-1 focus:ring-accent/20 transition-colors"
/>
<button
type="button"
onClick={() => handleSaveOptionalKey(index)}
disabled={!entry.value.trim() || entry.saving}
className="px-3 py-1.5 bg-surface-card hover:bg-surface-card/80 text-[11px] rounded text-ink border border-line disabled:opacity-30 transition-colors shrink-0 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-accent focus-visible:ring-offset-1"
>
{entry.saving ? "..." : "Save"}
</button>
</div>
)}
{entry.error && <div className="mt-1.5 text-[10px] text-bad">{entry.error}</div>}
</div>
))}
</div>
)}
{globalError && (
<div role="alert" aria-live="assertive" className="px-3 py-2 bg-red-950/40 border border-red-800/50 rounded-lg text-[11px] text-bad">
{globalError}
+1 -108
View File
@@ -28,7 +28,6 @@ import { useId, useMemo } from "react";
export interface SelectorModel {
id: string;
name?: string;
provider?: string;
required_env?: string[];
}
@@ -49,33 +48,6 @@ export interface ProviderEntry {
wildcard: boolean;
/** Optional tooltip text (rendered as native title=). */
tooltip?: string;
/** Billing mode the DERIVED provider implies, when this entry came from the
* registry-backed payload (internal#718 P3): "platform_managed" | "byok".
* Undefined for entries built by the legacy inferVendor heuristic. */
billingMode?: "platform_managed" | "byok";
}
/** RegistryProvider mirrors one entry of GET /templates `registry_providers`
* (workspace-server registryProviderView): the registry's native provider for
* a runtime, with its display label, auth-env NAMES, and billing mode. This is
* the SSOT the dropdown labels come from the canvas drops VENDOR_LABELS for
* registry-backed runtimes (internal#718 P3, retire-list #4). */
export interface RegistryProvider {
name: string;
display_name?: string;
auth_env?: string[];
billing_mode?: "platform_managed" | "byok";
deprecated?: boolean;
}
/** RegistryModel mirrors one entry of GET /templates `registry_models`: a
* native model id annotated with its DERIVED provider (registry name) and the
* billing_mode that provider implies. */
export interface RegistryModel {
id: string;
name?: string;
provider?: string;
billing_mode?: "platform_managed" | "byok";
}
export interface SelectorValue {
@@ -95,13 +67,6 @@ interface Props {
models: SelectorModel[];
value: SelectorValue;
onChange: (next: SelectorValue) => void;
/** Optional pre-built provider catalog. When provided, the selector uses it
* verbatim instead of re-inferring one from `models` via
* buildProviderCatalog the registry-backed path (internal#718 P3), where
* the parent builds the catalog from the registry-served providers/models
* so dropdown labels + billing come from the provider-registry SSOT rather
* than the inferVendor heuristic. Omitted = legacy heuristic over `models`. */
catalog?: ProviderEntry[];
/** Display variant. "grid" = label+control side-by-side (used in ConfigTab
* Runtime section). "stack" = vertical (used in MissingKeysModal). */
variant?: "grid" | "stack";
@@ -123,7 +88,6 @@ interface Props {
/** Vendor keys human label. Add new vendors here when templates pick
* up new model families. */
const VENDOR_LABELS: Record<string, string> = {
"platform": "Platform",
"anthropic-oauth": "Claude Code subscription",
anthropic: "Anthropic API",
minimax: "MiniMax",
@@ -154,8 +118,6 @@ const VENDOR_LABELS: Record<string, string> = {
/** Optional per-vendor tooltip shown on hover. */
const VENDOR_TOOLTIPS: Record<string, string> = {
"platform":
"Use the Molecule platform-managed LLM proxy. No vendor API key is required.",
"anthropic-oauth":
"Use your Claude.ai (Pro/Max/Team) subscription via OAuth. Run `claude login` in the workspace terminal to mint the token, then paste it here. No API spend.",
anthropic:
@@ -203,9 +165,6 @@ const BARE_VENDOR_PATTERNS: Array<{ test: (id: string) => boolean; vendor: strin
/** Infer a vendor key from a model spec. Combines id-prefix and env
* signals. Exported for tests. */
export function inferVendor(model: SelectorModel): string {
const explicitProvider = model.provider?.trim().toLowerCase();
if (explicitProvider) return explicitProvider;
const id = model.id || "";
const envSet = new Set(model.required_env ?? []);
@@ -285,66 +244,6 @@ export function buildProviderCatalog(models: SelectorModel[]): ProviderEntry[] {
return Array.from(buckets.values());
}
/** Build the provider catalog from a REGISTRY-BACKED GET /templates payload
* (registry_providers + registry_models) internal#718 P3, retire-list #4.
*
* Unlike buildProviderCatalog (which RE-INFERS vendor from model-id prefixes
* + env via inferVendor/VENDOR_LABELS/BARE_VENDOR_PATTERNS), this trusts the
* registry: each model carries its DERIVED `provider` (a registry provider
* name) and the dropdown label/billing/auth come from the matching
* `registry_providers` entry. The canvas can render no provider/model the
* registry did not serve ("only registered selectable"), and the billing-mode
* shown reflects the derived provider rather than a hardcoded rule.
*
* A provider with no served model is omitted (no empty buckets). Models whose
* `provider` doesn't match a registry_providers entry still get a bucket
* keyed by the raw provider name (defensive should not happen for a
* well-formed registry payload), so a model is never silently dropped. */
export function buildProviderCatalogFromRegistry(
registryProviders: RegistryProvider[],
registryModels: RegistryModel[],
): ProviderEntry[] {
const byName = new Map<string, RegistryProvider>();
for (const p of registryProviders) byName.set(p.name, p);
// Bucket models by their derived provider name, preserving registry order.
const buckets = new Map<string, ProviderEntry>();
for (const m of registryModels) {
const vendor = (m.provider ?? "").trim();
if (!vendor) continue; // un-annotated registry model — skip from the
// provider cascade (selectable elsewhere via free-text); it has no
// derived provider to bucket under.
const meta = byName.get(vendor);
const wildcard = m.id.includes("*");
let entry = buckets.get(vendor);
if (!entry) {
entry = {
id: `registry|${vendor}`,
vendor,
label: meta?.display_name || vendor,
envVars: meta?.auth_env ?? [],
models: [],
wildcard,
billingMode: meta?.billing_mode ?? m.billing_mode,
tooltip: VENDOR_TOOLTIPS[vendor],
};
buckets.set(vendor, entry);
}
entry.models.push({ id: m.id, name: m.name, provider: vendor });
entry.wildcard = entry.wildcard || wildcard;
}
// Decorate label with model-count when ≥2 concrete models share the bucket,
// matching buildProviderCatalog's UX.
for (const e of buckets.values()) {
if (!e.wildcard && e.models.length > 1) {
e.label = `${e.label} (${e.models.length} models)`;
}
}
return Array.from(buckets.values());
}
/** Find the provider entry that contains a given model id. Used by
* callers to back-derive the provider when only the model is known
* (e.g. ConfigTab loading from saved state). */
@@ -377,7 +276,6 @@ export function ProviderModelSelector({
models,
value,
onChange,
catalog: catalogProp,
variant = "stack",
allowCustomModelEscape = false,
disabled = false,
@@ -388,12 +286,7 @@ export function ProviderModelSelector({
const providerSelectId = `${baseId}-provider`;
const modelSelectId = `${baseId}-model`;
// Registry-backed path (internal#718 P3): use the parent-supplied catalog
// verbatim; otherwise re-infer one from `models` via the legacy heuristic.
const catalog = useMemo(
() => catalogProp ?? buildProviderCatalog(models),
[catalogProp, models],
);
const catalog = useMemo(() => buildProviderCatalog(models), [models]);
const selected = useMemo(
() => catalog.find((p) => p.id === value.providerId) ?? null,
[catalog, value.providerId],
@@ -242,13 +242,10 @@ export function ProvisioningTimeout({
const handleCancelConfirm = useCallback(async () => {
if (!confirmingCancel) return;
const workspaceId = confirmingCancel;
const workspaceName = timedOut.find((e) => e.workspaceId === workspaceId)?.workspaceName ?? "";
setConfirmingCancel(null);
setCancelling((prev) => new Set(prev).add(workspaceId));
try {
await api.del(`/workspaces/${workspaceId}`, {
headers: { "X-Confirm-Name": workspaceName },
});
await api.del(`/workspaces/${workspaceId}`);
setTimedOut((prev) => prev.filter((e) => e.workspaceId !== workspaceId));
trackingRef.current.delete(workspaceId);
showToast("Deployment cancelled", "info");
+1 -3
View File
@@ -305,9 +305,7 @@ export function SidePanel() {
{panelTab === "chat" && <ChatTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "terminal" && <TerminalTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />}
{panelTab === "display" && <DisplayTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "container-config" && selectedNodeId && (
<ContainerConfigTab key={selectedNodeId} workspaceId={selectedNodeId} data={node.data} />
)}
{panelTab === "container-config" && <ContainerConfigTab key={selectedNodeId} data={node.data} />}
{panelTab === "config" && <ConfigTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "schedule" && <ScheduleTab key={selectedNodeId} workspaceId={selectedNodeId} />}
{panelTab === "channels" && <ChannelsTab key={selectedNodeId} workspaceId={selectedNodeId} />}
+2 -2
View File
@@ -5,7 +5,7 @@ import { flushSync } from "react-dom";
import { api } from "@/lib/api";
import { useCanvasStore } from "@/store/canvas";
import type { WorkspaceData } from "@/store/socket";
import { isUserVisibleWorkspaceTemplate, type Template } from "@/lib/deploy-preflight";
import { type Template } from "@/lib/deploy-preflight";
import { useTemplateDeploy } from "@/hooks/useTemplateDeploy";
import {
OrgImportPreflightModal,
@@ -446,7 +446,7 @@ export function TemplatePalette() {
setLoading(true);
try {
const data = await api.get<Template[]>("/templates");
setTemplates(data.filter(isUserVisibleWorkspaceTemplate));
setTemplates(data);
} catch {
setTemplates([]);
} finally {
+2 -4
View File
@@ -224,14 +224,12 @@ export function Toolbar() {
useEffect(() => {
const handler = (e: KeyboardEvent) => {
if (e.key !== "?") return;
const target = e.target as HTMLElement;
if (target.closest?.('[data-display-stream="true"]')) return;
const tag = target.tagName;
const tag = (e.target as HTMLElement).tagName;
const inInput =
tag === "INPUT" ||
tag === "TEXTAREA" ||
tag === "SELECT" ||
target.isContentEditable;
(e.target as HTMLElement).isContentEditable;
if (inInput) return;
// Don't fire when a modal/dialog is already mounted (canvas modals,
// side panel, etc. use z-50 or above).
@@ -1,82 +1,411 @@
// @vitest-environment jsdom
/**
* Focused tests for BudgetSection's PER-PERIOD progress-bar math + aria (#49).
* Tests for BudgetSection (issue #541).
*
* Behavioral coverage (loading, save, 402 banners, USD formatting, legacy
* back-compat) lives in tabs/__tests__/BudgetSection.test.tsx this file
* deliberately covers only the per-period progress percentage + aria-valuenow
* + the over-budget colouring, which that suite doesn't assert in detail. Kept
* separate to avoid duplicating the behavioral suite (one component, no
* parallel/identical suites).
* Covers:
* - Loading state
* - Stats row: used / limit, "Unlimited" when null
* - Progress bar: correct percentage, capped at 100%, absent when no limit
* - Budget remaining text
* - Input pre-fill (existing limit / blank when null)
* - Save: PATCH with number, PATCH with null (blank input)
* - 402 on GET exceeded banner, no fetch-error text
* - 402 on PATCH exceeded banner
* - Non-402 fetch error error text
* - Non-402 save error save error alert
* - Section header and subheading
* - Fetch error does not show stats
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, waitFor, cleanup } from "@testing-library/react";
import {
render,
screen,
fireEvent,
waitFor,
cleanup,
act,
} from "@testing-library/react";
// ── Mock api ──────────────────────────────────────────────────────────────────
vi.mock("@/lib/api", () => ({
api: { get: vi.fn(), patch: vi.fn() },
api: {
get: vi.fn(),
patch: vi.fn(),
},
}));
import { api } from "@/lib/api";
import { BudgetSection } from "../tabs/BudgetSection";
const mockGet = vi.mocked(api.get);
const mockPatch = vi.mocked(api.patch);
type P = { limit: number | null; spend: number; remaining: number | null };
// ── Helpers ───────────────────────────────────────────────────────────────────
// Build a periods response where the named period has the given limit/spend.
function withMonthly(limit: number | null, spend: number) {
const blank: P = { limit: null, spend: 0, remaining: null };
const monthly: P = { limit, spend, remaining: limit == null ? null : limit - spend };
function budgetResponse(overrides: Partial<{
budget_limit: number | null;
budget_used: number;
budget_remaining: number | null;
}> = {}) {
return {
periods: { hourly: blank, daily: blank, weekly: blank, monthly },
budget_limit: limit,
monthly_spend: spend,
budget_remaining: monthly.remaining,
budget_limit: 1000,
budget_used: 250,
budget_remaining: 750,
...overrides,
};
}
beforeEach(() => vi.clearAllMocks());
afterEach(() => cleanup());
function make402Error(): Error {
return new Error("API GET /workspaces/ws-1/budget: 402 Payment Required");
}
async function renderLoaded(data: unknown) {
function make402PatchError(): Error {
return new Error("API PATCH /workspaces/ws-1/budget: 402 Payment Required");
}
function makeGenericError(msg = "network timeout"): Error {
return new Error(`API GET /workspaces/ws-1/budget: 500 ${msg}`);
}
beforeEach(() => {
vi.clearAllMocks();
});
afterEach(() => {
cleanup();
});
// ── Rendering helpers ─────────────────────────────────────────────────────────
async function renderLoaded(budgetData = budgetResponse()) {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValueOnce(data as any);
mockGet.mockResolvedValueOnce(budgetData as any);
render(<BudgetSection workspaceId="ws-1" />);
// Wait for loading to finish
await waitFor(() => expect(screen.queryByTestId("budget-loading")).toBeNull());
}
describe("BudgetSection — per-period progress bar", () => {
it("renders the bar for a limited period and omits it for an unlimited one", async () => {
await renderLoaded(withMonthly(1000, 250));
expect(screen.getByTestId("budget-monthly-fill")).toBeTruthy();
expect(screen.queryByTestId("budget-hourly-fill")).toBeNull(); // hourly unlimited
// ── Loading state ─────────────────────────────────────────────────────────────
describe("BudgetSection — loading state", () => {
it("shows loading indicator while fetch is in flight", () => {
// Never resolve
mockGet.mockReturnValue(new Promise(() => {}));
render(<BudgetSection workspaceId="ws-1" />);
expect(screen.getByTestId("budget-loading")).toBeTruthy();
expect(screen.getByText("Loading…")).toBeTruthy();
});
it("fills to 25%", async () => {
await renderLoaded(withMonthly(1000, 250));
expect((screen.getByTestId("budget-monthly-fill") as HTMLElement).style.width).toBe("25%");
});
it("fills to 50%", async () => {
await renderLoaded(withMonthly(1000, 500));
expect((screen.getByTestId("budget-monthly-fill") as HTMLElement).style.width).toBe("50%");
});
it("caps fill at 100% when spend exceeds limit", async () => {
await renderLoaded(withMonthly(1000, 4000));
expect((screen.getByTestId("budget-monthly-fill") as HTMLElement).style.width).toBe("100%");
});
it("sets aria-valuenow to the computed percentage on the progressbar", async () => {
await renderLoaded(withMonthly(1000, 250));
const bars = screen.getAllByRole("progressbar");
// the monthly bar is the only one rendered (others unlimited)
expect(bars).toHaveLength(1);
expect(bars[0].getAttribute("aria-valuenow")).toBe("25");
});
it("shows a 0% bar when spend is 0 against a set limit", async () => {
await renderLoaded(withMonthly(1000, 0));
expect((screen.getByTestId("budget-monthly-fill") as HTMLElement).style.width).toBe("0%");
it("hides loading indicator after fetch resolves", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValueOnce(budgetResponse() as any);
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() => expect(screen.queryByTestId("budget-loading")).toBeNull());
});
});
// ── Section header ────────────────────────────────────────────────────────────
describe("BudgetSection — header and subheading", () => {
it("renders 'Budget' as the section heading", async () => {
await renderLoaded();
expect(screen.getByText("Budget")).toBeTruthy();
});
it("renders the subheading 'Limit total message credits for this workspace'", async () => {
await renderLoaded();
expect(
screen.getByText("Limit total message credits for this workspace")
).toBeTruthy();
});
it("renders 'Budget limit (credits)' label for the input", async () => {
await renderLoaded();
expect(screen.getByText("Budget limit (credits)")).toBeTruthy();
});
});
// ── Stats row ─────────────────────────────────────────────────────────────────
describe("BudgetSection — stats row", () => {
it("shows budget_used in the stats row", async () => {
await renderLoaded(budgetResponse({ budget_used: 350, budget_limit: 1000 }));
expect(screen.getByTestId("budget-used-value").textContent).toBe("350");
});
it("shows budget_limit in the stats row", async () => {
await renderLoaded(budgetResponse({ budget_used: 100, budget_limit: 500 }));
expect(screen.getByTestId("budget-limit-value").textContent).toBe("500");
});
it("shows 'Unlimited' when budget_limit is null", async () => {
await renderLoaded(budgetResponse({ budget_limit: null, budget_remaining: null }));
expect(screen.getByTestId("budget-limit-value").textContent).toBe("Unlimited");
});
it("shows budget_remaining when present", async () => {
await renderLoaded(budgetResponse({ budget_remaining: 750 }));
expect(screen.getByTestId("budget-remaining").textContent).toContain("750");
expect(screen.getByTestId("budget-remaining").textContent).toContain("credits remaining");
});
it("hides budget_remaining row when null", async () => {
await renderLoaded(budgetResponse({ budget_remaining: null }));
expect(screen.queryByTestId("budget-remaining")).toBeNull();
});
it("does not crash when budget_used is missing from the response", async () => {
// Backend for a provisioning-stuck workspace may return a partial
// shape. Regression: previously this threw
// "Cannot read properties of undefined (reading 'toLocaleString')"
// and crashed the whole Details tab.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
await renderLoaded({ budget_limit: 1000, budget_remaining: null } as any);
expect(screen.getByTestId("budget-used-value").textContent).toBe("0");
});
});
// ── Progress bar ──────────────────────────────────────────────────────────────
describe("BudgetSection — progress bar", () => {
it("renders the progress bar when budget_limit is set", async () => {
await renderLoaded(budgetResponse({ budget_used: 250, budget_limit: 1000 }));
expect(screen.getByRole("progressbar")).toBeTruthy();
});
it("does NOT render progress bar when budget_limit is null", async () => {
await renderLoaded(budgetResponse({ budget_limit: null, budget_remaining: null }));
expect(screen.queryByRole("progressbar")).toBeNull();
});
it("fills to the correct percentage (25%)", async () => {
await renderLoaded(budgetResponse({ budget_used: 250, budget_limit: 1000 }));
const fill = screen.getByTestId("budget-progress-fill") as HTMLDivElement;
expect(fill.style.width).toBe("25%");
});
it("fills to the correct percentage (50%)", async () => {
await renderLoaded(budgetResponse({ budget_used: 500, budget_limit: 1000 }));
const fill = screen.getByTestId("budget-progress-fill") as HTMLDivElement;
expect(fill.style.width).toBe("50%");
});
it("caps fill at 100% when budget_used exceeds budget_limit", async () => {
await renderLoaded(budgetResponse({ budget_used: 1500, budget_limit: 1000 }));
const fill = screen.getByTestId("budget-progress-fill") as HTMLDivElement;
expect(fill.style.width).toBe("100%");
});
it("progress bar has aria-valuenow equal to the calculated percentage", async () => {
await renderLoaded(budgetResponse({ budget_used: 300, budget_limit: 1000 }));
const bar = screen.getByRole("progressbar");
expect(bar.getAttribute("aria-valuenow")).toBe("30");
});
it("shows 0% progress bar when budget_used is absent from the response", async () => {
// Regression: budget_used is optional (provisioning-stuck workspaces return
// partial shapes). Without the `?? 0` guard the progressPct calculation
// throws a TypeScript strict-null error and the build fails.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
await renderLoaded({ budget_limit: 1000, budget_remaining: null } as any);
const bar = screen.getByRole("progressbar");
expect(bar.getAttribute("aria-valuenow")).toBe("0");
const fill = screen.getByTestId("budget-progress-fill") as HTMLDivElement;
expect(fill.style.width).toBe("0%");
});
});
// ── Input pre-fill ────────────────────────────────────────────────────────────
describe("BudgetSection — input pre-fill", () => {
it("pre-fills input with existing budget_limit", async () => {
await renderLoaded(budgetResponse({ budget_limit: 500 }));
const input = screen.getByTestId("budget-limit-input") as HTMLInputElement;
expect(input.value).toBe("500");
});
it("leaves input empty when budget_limit is null", async () => {
await renderLoaded(budgetResponse({ budget_limit: null, budget_remaining: null }));
const input = screen.getByTestId("budget-limit-input") as HTMLInputElement;
expect(input.value).toBe("");
});
});
// ── Save — PATCH calls ────────────────────────────────────────────────────────
describe("BudgetSection — save", () => {
it("calls PATCH /workspaces/:id/budget with budget_limit as integer", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockPatch.mockResolvedValueOnce(budgetResponse({ budget_limit: 800 }) as any);
await renderLoaded(budgetResponse({ budget_limit: 1000 }));
fireEvent.change(screen.getByTestId("budget-limit-input"), {
target: { value: "800" },
});
fireEvent.click(screen.getByTestId("budget-save-btn"));
await waitFor(() => expect(mockPatch).toHaveBeenCalled());
expect(mockPatch.mock.calls[0][0]).toBe("/workspaces/ws-1/budget");
const body = mockPatch.mock.calls[0][1] as Record<string, unknown>;
expect(body.budget_limit).toBe(800);
});
it("sends budget_limit: 0 (not null) when input is '0' — zero-credit budget", async () => {
// Regression for QA bug report: `parseInt("0") || null` would yield null.
// The correct form `raw !== "" ? parseInt(raw, 10) : null` must return 0.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockPatch.mockResolvedValueOnce(budgetResponse({ budget_limit: 0, budget_used: 0, budget_remaining: 0 }) as any);
await renderLoaded(budgetResponse({ budget_limit: 1000 }));
fireEvent.change(screen.getByTestId("budget-limit-input"), {
target: { value: "0" },
});
fireEvent.click(screen.getByTestId("budget-save-btn"));
await waitFor(() => expect(mockPatch).toHaveBeenCalled());
const body = mockPatch.mock.calls[0][1] as Record<string, unknown>;
expect(body.budget_limit).toBe(0);
expect(body.budget_limit).not.toBeNull();
});
it("sends budget_limit: null when input is blank", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockPatch.mockResolvedValueOnce(budgetResponse({ budget_limit: null, budget_remaining: null }) as any);
await renderLoaded(budgetResponse({ budget_limit: 1000 }));
fireEvent.change(screen.getByTestId("budget-limit-input"), {
target: { value: "" },
});
fireEvent.click(screen.getByTestId("budget-save-btn"));
await waitFor(() => expect(mockPatch).toHaveBeenCalled());
const body = mockPatch.mock.calls[0][1] as Record<string, unknown>;
expect(body.budget_limit).toBeNull();
});
it("updates displayed stats after successful save", async () => {
const updated = budgetResponse({ budget_limit: 2000, budget_used: 500, budget_remaining: 1500 });
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockPatch.mockResolvedValueOnce(updated as any);
await renderLoaded(budgetResponse({ budget_limit: 1000, budget_used: 250 }));
fireEvent.change(screen.getByTestId("budget-limit-input"), {
target: { value: "2000" },
});
fireEvent.click(screen.getByTestId("budget-save-btn"));
await waitFor(() =>
expect(screen.getByTestId("budget-limit-value").textContent).toBe("2,000")
);
});
it("shows save error message on non-402 PATCH failure", async () => {
mockPatch.mockRejectedValueOnce(
new Error("API PATCH /workspaces/ws-1/budget: 500 server error")
);
await renderLoaded();
fireEvent.click(screen.getByTestId("budget-save-btn"));
await waitFor(() =>
expect(screen.getByTestId("budget-save-error")).toBeTruthy()
);
expect(screen.getByTestId("budget-save-error").textContent).toContain("500");
});
});
// ── 402 handling ──────────────────────────────────────────────────────────────
describe("BudgetSection — 402 handling", () => {
it("shows exceeded banner when GET returns 402", async () => {
mockGet.mockRejectedValueOnce(make402Error());
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() =>
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy()
);
expect(screen.getByText("Budget exceeded — messages blocked")).toBeTruthy();
});
it("does NOT show fetch error text when GET returns 402 (only banner)", async () => {
mockGet.mockRejectedValueOnce(make402Error());
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() =>
expect(screen.queryByTestId("budget-loading")).toBeNull()
);
expect(screen.queryByTestId("budget-fetch-error")).toBeNull();
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy();
});
it("shows exceeded banner when PATCH returns 402", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValueOnce(budgetResponse() as any);
mockPatch.mockRejectedValueOnce(make402PatchError());
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() => expect(screen.queryByTestId("budget-loading")).toBeNull());
fireEvent.click(screen.getByTestId("budget-save-btn"));
await waitFor(() =>
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy()
);
// Should NOT also show the save-error alert
expect(screen.queryByTestId("budget-save-error")).toBeNull();
});
it("clears exceeded banner after a successful save", async () => {
mockGet.mockRejectedValueOnce(make402Error());
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() =>
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy()
);
// Now a successful PATCH (limit was raised)
const updated = budgetResponse({ budget_limit: 5000, budget_used: 250, budget_remaining: 4750 });
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockPatch.mockResolvedValueOnce(updated as any);
await act(async () => {
fireEvent.change(screen.getByTestId("budget-limit-input"), {
target: { value: "5000" },
});
fireEvent.click(screen.getByTestId("budget-save-btn"));
});
await waitFor(() =>
expect(screen.queryByTestId("budget-exceeded-banner")).toBeNull()
);
});
});
// ── Non-402 fetch error ───────────────────────────────────────────────────────
describe("BudgetSection — non-402 fetch errors", () => {
it("shows fetch error text on non-402 GET failure", async () => {
mockGet.mockRejectedValueOnce(makeGenericError("internal server error"));
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() =>
expect(screen.getByTestId("budget-fetch-error")).toBeTruthy()
);
expect(screen.getByTestId("budget-fetch-error").textContent).toContain("500");
});
it("does NOT show stats row on fetch error", async () => {
mockGet.mockRejectedValueOnce(makeGenericError());
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() => expect(screen.queryByTestId("budget-loading")).toBeNull());
expect(screen.queryByTestId("budget-stats-row")).toBeNull();
});
it("does NOT show exceeded banner on non-402 fetch error", async () => {
mockGet.mockRejectedValueOnce(makeGenericError());
render(<BudgetSection workspaceId="ws-1" />);
await waitFor(() => expect(screen.queryByTestId("budget-loading")).toBeNull());
expect(screen.queryByTestId("budget-exceeded-banner")).toBeNull();
});
});
@@ -201,13 +201,15 @@ describe("CreateWorkspaceDialog — WCAG SC 1.3.1 label/input association", () =
expect(label?.textContent).toContain("Budget limit");
});
it("Workspace Template select has a <label> whose htmlFor matches the select id", async () => {
it("Template input has a <label> whose htmlFor matches the input id", async () => {
await openDialog();
const templateSelect = screen.getByLabelText("Workspace Template") as HTMLSelectElement;
expect(templateSelect.id).toBeTruthy();
const label = document.querySelector(`label[for="${templateSelect.id}"]`);
const templateInput = screen.getByPlaceholderText(
"e.g. seo-agent (from workspace-configs-templates/)"
) as HTMLInputElement;
expect(templateInput.id).toBeTruthy();
const label = document.querySelector(`label[for="${templateInput.id}"]`);
expect(label).toBeTruthy();
expect(label?.textContent).toContain("Workspace Template");
expect(label?.textContent).toContain("Template");
});
it("each InputField generates a distinct id (no id collisions)", async () => {
@@ -216,16 +218,13 @@ describe("CreateWorkspaceDialog — WCAG SC 1.3.1 label/input association", () =
screen.getByPlaceholderText("e.g. SEO Agent"),
screen.getByPlaceholderText("e.g. SEO Specialist"),
screen.getByPlaceholderText("e.g. 100"),
screen.getByPlaceholderText("e.g. seo-agent (from workspace-configs-templates/)"),
] as HTMLInputElement[];
const selects = [
screen.getByLabelText("Runtime"),
screen.getByLabelText("Workspace Template"),
] as HTMLSelectElement[];
const ids = [...inputs, ...selects].map((i) => i.id).filter(Boolean);
const ids = inputs.map((i) => i.id).filter(Boolean);
const unique = new Set(ids);
expect(unique.size).toBe(ids.length); // no duplicates
expect(ids.length).toBe(5);
expect(ids.length).toBe(4);
});
it("Name label text contains the required asterisk indicator", async () => {
@@ -1,7 +1,7 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor, cleanup } from "@testing-library/react";
import { CreateWorkspaceButton } from "../CreateWorkspaceDialog";
import { CreateWorkspaceButton, HERMES_PROVIDERS } from "../CreateWorkspaceDialog";
vi.mock("@/lib/api", () => ({
api: {
@@ -20,63 +20,10 @@ const SAMPLE_WORKSPACES = [
{ id: "ws-2", name: "Research Agent", tier: 2 },
];
const SAMPLE_TEMPLATES = [
{
id: "claude-code-default",
name: "Claude Code Agent",
runtime: "claude-code",
model: "moonshot/kimi-k2.6",
providers: ["platform", "minimax", "kimi-coding", "anthropic", "anthropic-oauth"],
models: [
{ id: "moonshot/kimi-k2.6", name: "Kimi K2.6", provider: "platform", required_env: [] },
{ id: "MiniMax-M2.7", name: "MiniMax M2.7", required_env: ["MINIMAX_API_KEY"] },
{ id: "kimi-k2-turbo-preview", name: "Kimi K2 Turbo Preview", required_env: ["KIMI_API_KEY"] },
{ id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "sonnet", name: "Claude Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "opus", name: "Claude Opus", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "haiku", name: "Claude Haiku", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
],
},
{
id: "seo-agent",
name: "SEO Agent",
runtime: "claude-code",
model: "moonshot/kimi-k2.6",
providers: ["platform", "minimax", "kimi-coding", "anthropic", "anthropic-oauth"],
models: [
{ id: "moonshot/kimi-k2.6", name: "Kimi K2.6", provider: "platform", required_env: [] },
{ id: "MiniMax-M2.7", name: "MiniMax M2.7", required_env: ["MINIMAX_API_KEY"] },
{ id: "kimi-k2-turbo-preview", name: "Kimi K2 Turbo Preview", required_env: ["KIMI_API_KEY"] },
{ id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "sonnet", name: "Claude Sonnet", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "opus", name: "Claude Opus", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
{ id: "haiku", name: "Claude Haiku", required_env: ["CLAUDE_CODE_OAUTH_TOKEN"] },
],
},
{
id: "hermes",
name: "Hermes",
runtime: "hermes",
model: "openai/gpt-4o",
providers: ["openai", "anthropic", "platform"],
models: [
{ id: "openai/gpt-4o", name: "GPT-4o", required_env: ["OPENAI_API_KEY"] },
{ id: "anthropic/claude-sonnet-4-5", name: "Claude Sonnet 4.5", required_env: ["ANTHROPIC_API_KEY"] },
{ id: "moonshot/kimi-k2.6", name: "Kimi K2.6", provider: "platform", required_env: [] },
],
},
];
beforeEach(() => {
vi.clearAllMocks();
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_TEMPLATES as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
});
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(SAMPLE_WORKSPACES as any);
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockPost.mockResolvedValue({} as any);
});
@@ -95,14 +42,7 @@ async function openDialog() {
async function setTemplate(value: string) {
fireEvent.change(
screen.getByLabelText("Workspace Template"),
{ target: { value } }
);
}
async function setRuntime(value: string) {
fireEvent.change(
screen.getByLabelText("Runtime"),
screen.getByPlaceholderText("e.g. seo-agent (from workspace-configs-templates/)"),
{ target: { value } }
);
}
@@ -123,7 +63,7 @@ describe("CreateWorkspaceDialog", () => {
it('first option is "None (root level)" with empty value', async () => {
await openDialog();
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
expect(select).toBeTruthy();
const firstOption = select.options[0];
expect(firstOption.value).toBe("");
@@ -133,12 +73,12 @@ describe("CreateWorkspaceDialog", () => {
it("populates select with workspace names from GET /workspaces", async () => {
await openDialog();
await waitFor(() => {
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
const optionValues = Array.from(select.options).map((o) => o.value);
expect(optionValues).toContain("ws-1");
expect(optionValues).toContain("ws-2");
});
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
const optionTexts = Array.from(select.options).map((o) => o.text.trim());
expect(optionTexts.some((t) => t.includes("Platform Team"))).toBe(true);
expect(optionTexts.some((t) => t.includes("Research Agent"))).toBe(true);
@@ -147,7 +87,7 @@ describe("CreateWorkspaceDialog", () => {
it("sends parent_id in POST body when a workspace is selected", async () => {
await openDialog();
await waitFor(() => {
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
expect(select.options.length).toBeGreaterThan(1);
});
@@ -155,7 +95,7 @@ describe("CreateWorkspaceDialog", () => {
target: { value: "My Agent" },
});
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
fireEvent.change(select, { target: { value: "ws-1" } });
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
@@ -172,7 +112,7 @@ describe("CreateWorkspaceDialog", () => {
target: { value: "Root Agent" },
});
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
fireEvent.change(select, { target: { value: "" } });
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
@@ -183,7 +123,7 @@ describe("CreateWorkspaceDialog", () => {
expect(body.parent_id).toBeUndefined();
});
it("sends the cost-efficient headless compute profile by default", async () => {
it("omits compute config by default", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Plain Agent" },
@@ -192,55 +132,10 @@ describe("CreateWorkspaceDialog", () => {
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.compute).toEqual({
instance_type: "t3.medium",
volume: { root_gb: 30 },
display: { mode: "none" },
});
expect(body.model).toBe("moonshot/kimi-k2.6");
expect(body.llm_provider).toBe("platform");
expect(body.runtime).toBe("claude-code");
expect(body.secrets).toBeUndefined();
});
it("keeps runtime and workspace template as separate selectors", async () => {
await openDialog();
const runtimeSelect = screen.getByLabelText("Runtime") as HTMLSelectElement;
const runtimeTexts = Array.from(runtimeSelect.options).map((o) => o.text.trim());
expect(runtimeTexts).toEqual([
"Claude Code",
"OpenAI Codex CLI",
"Google ADK",
"Hermes",
"OpenClaw",
]);
expect(runtimeTexts).not.toContain("SEO Agent");
await waitFor(() => {
const templateSelect = screen.getByLabelText("Workspace Template") as HTMLSelectElement;
const templateTexts = Array.from(templateSelect.options).map((o) => o.text.trim());
expect(templateTexts).toContain("SEO Agent");
expect(templateTexts).not.toContain("Hermes");
});
});
it("does not send managed compute for external agents", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "External Agent" },
});
fireEvent.click(screen.getByLabelText(/External agent/));
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.compute).toBeUndefined();
expect(body.runtime).toBe("external");
expect(body.model).toBe("anthropic:claude-opus-4-7");
});
it("sends display compute profile when desktop display is enabled", async () => {
@@ -255,8 +150,7 @@ describe("CreateWorkspaceDialog", () => {
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.model).toBe("moonshot/kimi-k2.6");
expect(body.llm_provider).toBe("platform");
expect(body.model).toBe("anthropic:claude-opus-4-7");
expect(body.compute).toEqual({
instance_type: "t3.xlarge",
volume: { root_gb: 80 },
@@ -269,72 +163,13 @@ describe("CreateWorkspaceDialog", () => {
});
});
it("sends BYOK API key secrets when API key auth mode is selected", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "BYOK Agent" },
});
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "minimax|MINIMAX_API_KEY" },
});
fireEvent.change(document.getElementById("llm-secret-input") as HTMLInputElement, {
target: { value: "sk-minimax-test" },
});
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.model).toBe("MiniMax-M2.7");
expect(body.llm_provider).toBe("minimax");
expect(body.secrets).toEqual({ MINIMAX_API_KEY: "sk-minimax-test" });
});
it("sends Claude OAuth token separately from platform-managed mode", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "OAuth Agent" },
});
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "anthropic-oauth|CLAUDE_CODE_OAUTH_TOKEN" },
});
fireEvent.change(document.querySelector("[data-testid='model-select']") as HTMLSelectElement, {
target: { value: "sonnet" },
});
fireEvent.change(document.getElementById("llm-secret-input") as HTMLInputElement, {
target: { value: "oauth-token" },
});
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.model).toBe("sonnet");
expect(body.llm_provider).toBe("anthropic-oauth");
expect(body.secrets).toEqual({ CLAUDE_CODE_OAUTH_TOKEN: "oauth-token" });
});
it("lists all Claude Code subscription aliases for blank workspaces", async () => {
await openDialog();
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "anthropic-oauth|CLAUDE_CODE_OAUTH_TOKEN" },
});
const modelSelect = document.querySelector("[data-testid='model-select']") as HTMLSelectElement;
const optionValues = Array.from(modelSelect.options).map((option) => option.value);
expect(optionValues).toEqual(expect.arrayContaining(["sonnet", "opus", "haiku"]));
});
it("renders gracefully when GET /workspaces fails", async () => {
mockGet.mockRejectedValueOnce(new Error("Network error"));
await openDialog();
// Dialog still renders; select exists with only the root option
await waitFor(() => {
const select = screen.getByLabelText("Parent Workspace") as HTMLSelectElement;
const select = document.querySelector("select") as HTMLSelectElement;
expect(select.options.length).toBe(1);
expect(select.options[0].value).toBe("");
});
@@ -342,103 +177,225 @@ describe("CreateWorkspaceDialog", () => {
});
// ---------------------------------------------------------------------------
// Dynamic runtime provider picker tests
// Hermes provider picker tests
// ---------------------------------------------------------------------------
describe("CreateWorkspaceDialog — dynamic runtime provider picker", () => {
it("does not render the old Hermes-only provider section", async () => {
describe("CreateWorkspaceDialog — Hermes provider picker", () => {
it("does NOT show hermes provider section for non-hermes templates", async () => {
await openDialog();
await setRuntime("hermes");
await setTemplate("seo-agent");
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeNull();
});
it("derives Hermes provider and model options from the /templates runtime row", async () => {
it("shows hermes provider section when template is 'hermes'", async () => {
await openDialog();
await setRuntime("hermes");
const providerSelect = document.querySelector("[data-testid='provider-select']") as HTMLSelectElement;
await waitFor(() => expect(providerSelect.options.length).toBe(4));
const providerValues = Array.from(providerSelect.options).map((option) => option.value);
expect(providerValues).toEqual(expect.arrayContaining([
"platform|",
"openai|OPENAI_API_KEY",
"anthropic|ANTHROPIC_API_KEY",
]));
expect(providerValues).not.toContain("gemini|GEMINI_API_KEY");
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
});
it("uses the template-declared default provider/model for Hermes", async () => {
it("shows hermes provider section for template 'HERMES' (case-insensitive)", async () => {
await openDialog();
await setRuntime("hermes");
await waitFor(() => {
const providerSelect = document.querySelector("[data-testid='provider-select']") as HTMLSelectElement;
expect(providerSelect.value).toBe("platform|");
});
const modelSelect = document.querySelector("[data-testid='model-select']") as HTMLSelectElement;
expect(modelSelect.value).toBe("moonshot/kimi-k2.6");
await setTemplate("HERMES");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
});
it("prompts for the provider credential required by the selected Hermes model", async () => {
it("hermes provider dropdown defaults to 'anthropic'", async () => {
await openDialog();
await setRuntime("hermes");
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
expect(providerSelect).toBeTruthy();
expect(providerSelect.value).toBe("anthropic");
});
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "openai|OPENAI_API_KEY" },
it("hermes provider dropdown lists all 15 providers", async () => {
await openDialog();
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
expect(providerSelect.options.length).toBe(HERMES_PROVIDERS.length);
const ids = Array.from(providerSelect.options).map((o) => o.value);
expect(ids).toContain("anthropic");
expect(ids).toContain("openai");
expect(ids).toContain("gemini");
expect(ids).toContain("deepseek");
expect(ids).toContain("hermes");
});
// Pins the dynamic-providers behavior: when the matched template's
// /templates row declares `providers`, the dropdown filters to that
// subset instead of showing the full HERMES_PROVIDERS catalog. Same
// data source ConfigTab uses (PR #2454) — keeps the modal and the
// settings tab honest about which providers a template supports.
it("hermes provider dropdown filters to template-declared providers when /templates ships them", async () => {
// Per-URL mock: /workspaces returns the existing fixture, /templates
// returns a hermes row that only allows anthropic + minimax + openai.
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
return [
{ id: "hermes", name: "Hermes", runtime: "hermes", providers: ["anthropic", "minimax", "openai"] },
// eslint-disable-next-line @typescript-eslint/no-explicit-any
] as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
});
const keyInput = document.getElementById("llm-secret-input") as HTMLInputElement;
await openDialog();
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
// Filtered list arrives async after /templates fetch resolves —
// keep waiting until the dropdown shrinks below the full catalog.
await waitFor(() => expect(providerSelect.options.length).toBe(3));
const ids = Array.from(providerSelect.options).map((o) => o.value);
expect(ids).toEqual(expect.arrayContaining(["anthropic", "minimax", "openai"]));
expect(ids).not.toContain("gemini");
expect(ids).not.toContain("deepseek");
});
// Back-compat: a template that hasn't migrated to runtime_config.providers
// (older templates, self-hosted setups without /templates server) keeps
// showing the full provider catalog. Operators picking from those
// templates can't be locked out of providers we know hermes supports.
it("hermes provider dropdown falls back to all providers when template declares no providers list", async () => {
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
// No `providers` field — empty/missing → fall back to full catalog.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return [{ id: "hermes", name: "Hermes", runtime: "hermes" }] as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
});
await openDialog();
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
expect(providerSelect.options.length).toBe(HERMES_PROVIDERS.length);
});
// Defensive: a template's declared list with NO matches against our
// static catalog (e.g. a brand-new provider id we don't have label/
// envVar metadata for yet) must not render an empty <select> — the
// operator can't pick a provider, the form locks. Component falls
// back to the full catalog so the user can still proceed.
it("hermes provider dropdown falls back to all providers when template declares only unknown providers", async () => {
mockGet.mockImplementation(async (url: string) => {
if (url === "/templates") {
return [
{ id: "hermes", name: "Hermes", runtime: "hermes", providers: ["totally-new-provider-2030"] },
// eslint-disable-next-line @typescript-eslint/no-explicit-any
] as any;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
return SAMPLE_WORKSPACES as any;
});
await openDialog();
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
// Stays at full catalog length — no flapping to 0 then back.
expect(providerSelect.options.length).toBe(HERMES_PROVIDERS.length);
});
it("hermes API key field is a password input (masked)", async () => {
await openDialog();
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
const keyInput = document.getElementById("hermes-api-key-input") as HTMLInputElement;
expect(keyInput).toBeTruthy();
expect(keyInput.type).toBe("password");
});
it("shows an error if the selected runtime provider requires a credential", async () => {
it("shows an error if hermes template is set but API key is empty on submit", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Hermes Agent" },
});
await setRuntime("hermes");
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "openai|OPENAI_API_KEY" },
});
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Submit without API key
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => {
const alert = screen.getByRole("alert");
expect(alert.textContent).toContain("Provider credential");
expect(alert.textContent).toContain("API key");
});
expect(mockPost).not.toHaveBeenCalled();
});
it("includes runtime-derived provider/model/secrets in POST body", async () => {
it("includes secrets in POST body with correct env var for selected provider", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Hermes OpenAI" },
});
await setRuntime("hermes");
fireEvent.change(document.querySelector("[data-testid='provider-select']") as HTMLSelectElement, {
target: { value: "openai|OPENAI_API_KEY" },
});
fireEvent.change(document.getElementById("llm-secret-input") as HTMLInputElement, {
target: { value: "sk-openai-test" },
target: { value: "Hermes Agent" },
});
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Fill in the API key
const keyInput = document.getElementById("hermes-api-key-input") as HTMLInputElement;
fireEvent.change(keyInput, { target: { value: "sk-test-anthropic-key" } });
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.secrets).toEqual({ ANTHROPIC_API_KEY: "sk-test-anthropic-key" });
expect(body.template).toBe("hermes");
});
it("uses the correct env var when a non-default provider is selected", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Hermes OpenAI" },
});
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Switch to openai
const providerSelect = document.getElementById("hermes-provider-select") as HTMLSelectElement;
fireEvent.change(providerSelect, { target: { value: "openai" } });
const keyInput = document.getElementById("hermes-api-key-input") as HTMLInputElement;
fireEvent.change(keyInput, { target: { value: "sk-openai-test" } });
const createBtn = screen.getAllByRole("button").find((b) => b.textContent === "Create");
fireEvent.click(createBtn!);
await waitFor(() => expect(mockPost).toHaveBeenCalled());
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.runtime).toBe("hermes");
expect(body.template).toBeUndefined();
expect(body.model).toBe("openai/gpt-4o");
expect(body.llm_provider).toBe("openai");
expect(body.secrets).toEqual({ OPENAI_API_KEY: "sk-openai-test" });
});
it("does NOT include secrets field when provider is platform-managed", async () => {
it("does NOT include secrets field when template is not hermes", async () => {
await openDialog();
fireEvent.change(screen.getByPlaceholderText("e.g. SEO Agent"), {
target: { value: "Normal Agent" },
@@ -452,6 +409,20 @@ describe("CreateWorkspaceDialog — dynamic runtime provider picker", () => {
const body = mockPost.mock.calls[0][1] as Record<string, unknown>;
expect(body.secrets).toBeUndefined();
});
it("hides hermes section and resets state when template is cleared", async () => {
await openDialog();
await setTemplate("hermes");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeTruthy()
);
// Clear template
await setTemplate("");
await waitFor(() =>
expect(document.querySelector("[data-testid='hermes-provider-section']")).toBeNull()
);
});
});
// ---------------------------------------------------------------------------
@@ -16,7 +16,7 @@
* - handleDeployed fires after 500ms delay
*
* Uses vi.hoisted + vi.mock to fully isolate the api module, matching
* the pattern established in ApprovalBanner and ScheduleTab tests.
* the pattern established in ApprovalBanner, MemoryTab, and ScheduleTab tests.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
@@ -96,12 +96,12 @@ vi.mock("@/lib/design-tokens", () => ({
// ─── Fixtures ─────────────────────────────────────────────────────────────────
const TEMPLATE = {
id: "seo-agent",
name: "SEO Agent",
description: "SEO workspace template",
id: "tpl-1",
name: "Claude Code Agent",
description: "A general-purpose coding assistant",
tier: 2,
skill_count: 3,
model: "MiniMax-M2.7",
model: "claude-opus-4-5",
};
function template(overrides: Partial<typeof TEMPLATE> = {}): typeof TEMPLATE {
@@ -159,7 +159,7 @@ describe("EmptyState — loading", () => {
it("does not render template buttons while loading", async () => {
renderEmpty();
await flush();
expect(screen.queryByText("SEO Agent")).toBeNull();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
});
});
@@ -183,8 +183,8 @@ describe("EmptyState — templates", () => {
it("renders template buttons with name and description", async () => {
renderEmpty();
await flush();
expect(screen.getByText("SEO Agent")).toBeTruthy();
expect(screen.getByText("SEO workspace template")).toBeTruthy();
expect(screen.getByText("Claude Code Agent")).toBeTruthy();
expect(screen.getByText("A general-purpose coding assistant")).toBeTruthy();
});
it("renders tier badge and skill count", async () => {
@@ -198,42 +198,25 @@ describe("EmptyState — templates", () => {
it("renders model name when present", async () => {
renderEmpty();
await flush();
expect(screen.getByText(/MiniMax-M2.7/i)).toBeTruthy();
expect(screen.getByText(/claude-opus/i)).toBeTruthy();
});
it("calls deploy with the template on click", async () => {
renderEmpty();
await flush();
fireEvent.click(screen.getByText("SEO Agent"));
fireEvent.click(screen.getByText("Claude Code Agent"));
expect(_deploy.deployFn).toHaveBeenCalledWith(template());
});
it("hides runtime-default templates from the product template grid", async () => {
mockApiGet.mockResolvedValue([
template({ id: "claude-code-default", name: "Claude Code Agent" }),
template({ id: "codex", name: "OpenAI Codex CLI" }),
template({ id: "hermes", name: "Hermes Agent" }),
template({ id: "openclaw", name: "OpenClaw Agent" }),
template(),
]);
renderEmpty();
await flush();
expect(screen.getByText("SEO Agent")).toBeTruthy();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
expect(screen.queryByText("OpenAI Codex CLI")).toBeNull();
expect(screen.queryByText("Hermes Agent")).toBeNull();
expect(screen.queryByText("OpenClaw Agent")).toBeNull();
});
it("shows 'Deploying...' on the button of the template being deployed", async () => {
_deploy.deploying = "seo-agent";
_deploy.deploying = "tpl-1";
renderEmpty();
await flush();
expect(screen.getByText("Deploying...")).toBeTruthy();
});
it("disables the template button of the deploying template", async () => {
_deploy.deploying = "seo-agent";
_deploy.deploying = "tpl-1";
renderEmpty();
await flush();
const btn = screen.getByText("Deploying...").closest("button") as HTMLButtonElement;
@@ -241,7 +224,7 @@ describe("EmptyState — templates", () => {
});
it("disables 'create blank' while a template is deploying", async () => {
_deploy.deploying = "seo-agent";
_deploy.deploying = "tpl-1";
renderEmpty();
await flush();
expect(screen.getByRole("button", { name: "+ Create blank workspace" }).disabled).toBe(true);
@@ -262,7 +245,7 @@ describe("EmptyState — fetch failure / empty templates", () => {
it("does not render template grid when GET /templates returns []", async () => {
renderEmpty();
await flush();
expect(screen.queryByText("SEO Agent")).toBeNull();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
});
it("renders 'create blank' button when templates list is empty", async () => {
@@ -275,7 +258,7 @@ describe("EmptyState — fetch failure / empty templates", () => {
mockApiGet.mockReset().mockRejectedValue(new Error("Network failure"));
renderEmpty();
await flush();
expect(screen.queryByText("SEO Agent")).toBeNull();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
});
});
@@ -333,7 +316,7 @@ describe("EmptyState — create blank", () => {
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); });
expect((screen.getByText("SEO Agent").closest("button") as HTMLButtonElement).disabled).toBe(true);
expect((screen.getByText("Claude Code Agent").closest("button") as HTMLButtonElement).disabled).toBe(true);
});
it("shows error banner when POST /workspaces fails", async () => {
@@ -0,0 +1,93 @@
// @vitest-environment jsdom
/**
* Unit tests for pure helpers from MemoryInspectorPanel:
* isPluginUnavailableError, formatRelativeTime, formatTTL
*
* These are the three exported non-component functions. The component
* itself (MemoryInspectorPanel) requires full API + store mocking and
* is exercised by the existing MemoryTab.test.tsx.
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { isPluginUnavailableError, formatTTL } from "../MemoryInspectorPanel";
// formatRelativeTime is not exported — tested via the component in MemoryTab.test.tsx
describe("isPluginUnavailableError", () => {
it("returns true when Error message contains MEMORY_PLUGIN_URL", () => {
const err = new Error("memory: could not resolve MEMORY_PLUGIN_URL — plugin not configured");
expect(isPluginUnavailableError(err)).toBe(true);
});
it("returns true for Error containing MEMORY_PLUGIN_URL", () => {
expect(isPluginUnavailableError(new Error("MEMORY_PLUGIN_URL is not set"))).toBe(true);
});
it("returns false for unrelated error messages", () => {
expect(isPluginUnavailableError(new Error("workspace not found"))).toBe(false);
});
it("returns false for null", () => {
expect(isPluginUnavailableError(null)).toBe(false);
});
it("returns false for undefined", () => {
expect(isPluginUnavailableError(undefined)).toBe(false);
});
it("returns false for plain objects without message", () => {
expect(isPluginUnavailableError({ code: 503 })).toBe(false);
});
it("is case-sensitive (MEMORY_PLUGIN_URL must match exactly)", () => {
const lowerErr = new Error("memory_plugin_url missing");
const upperErr = new Error("MEMORY_PLUGIN_URL missing");
expect(isPluginUnavailableError(lowerErr)).toBe(false);
expect(isPluginUnavailableError(upperErr)).toBe(true);
});
});
describe("formatTTL", () => {
beforeEach(() => { vi.useFakeTimers(); });
afterEach(() => { vi.useRealTimers(); });
it("returns '' for null", () => {
expect(formatTTL(null)).toBe("");
});
it("returns '' for undefined", () => {
expect(formatTTL(undefined)).toBe("");
});
it('returns "expired" when expiresAt is in the past', () => {
const past = new Date(Date.now() - 60_000).toISOString();
expect(formatTTL(past)).toBe("expired");
});
it('returns "Xs" for less than a minute', () => {
const soon = new Date(Date.now() + 30_000).toISOString();
expect(formatTTL(soon)).toBe("30s");
});
it('returns "Xm" for less than an hour', () => {
const soon = new Date(Date.now() + 5 * 60_000).toISOString();
expect(formatTTL(soon)).toBe("5m");
});
it('returns "Xh" for less than a day', () => {
const soon = new Date(Date.now() + 3 * 3_600_000).toISOString();
expect(formatTTL(soon)).toBe("3h");
});
it('returns "Xd" for more than a day', () => {
const soon = new Date(Date.now() + 2 * 86_400_000).toISOString();
expect(formatTTL(soon)).toBe("2d");
});
it("returns '' for invalid date string", () => {
expect(formatTTL("not-a-date")).toBe("");
});
it("returns '' for empty string", () => {
expect(formatTTL("")).toBe("");
});
});
@@ -31,17 +31,6 @@ vi.mock('@/lib/api', () => ({
},
}));
// Capture the socket-event handler the panel registers so individual
// tests can replay an ACTIVITY_LOGGED message without spinning up a
// real WebSocket. One handler at a time is fine — the panel mounts
// exactly one useSocketEvent subscriber.
let __socketHandler: ((msg: unknown) => void) | null = null;
vi.mock('@/hooks/useSocketEvent', () => ({
useSocketEvent: (handler: (msg: unknown) => void) => {
__socketHandler = handler;
},
}));
vi.mock('@/components/ConfirmDialog', () => ({
ConfirmDialog: ({
open,
@@ -527,156 +516,3 @@ describe('MemoryInspectorPanel — refresh', () => {
});
});
});
// Live-refresh subscription wired in #1734 so the panel reacts to
// ACTIVITY_LOGGED events for memory writes on this workspace without
// the user clicking Refresh. The hook is mocked at the top of the
// file to capture the registered handler in __socketHandler.
describe('MemoryInspectorPanel — live refresh on activity', () => {
it('refetches memories when ACTIVITY_LOGGED arrives with activity_type=memory_write for the same workspace', async () => {
vi.useFakeTimers({ shouldAdvanceTime: true });
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
expect(__socketHandler).toBeTruthy();
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
__socketHandler!({
event: 'ACTIVITY_LOGGED',
workspace_id: 'ws-1',
payload: { activity_type: 'memory_write' },
});
// 300ms debounce inside the panel — advance the fake timer so the
// queued refetch fires.
await vi.advanceTimersByTimeAsync(350);
await waitFor(() => {
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before + 1);
});
vi.useRealTimers();
});
it('ignores ACTIVITY_LOGGED events from other workspaces', async () => {
vi.useFakeTimers({ shouldAdvanceTime: true });
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
__socketHandler!({
event: 'ACTIVITY_LOGGED',
workspace_id: 'ws-OTHER',
payload: { activity_type: 'memory_write' },
});
await vi.advanceTimersByTimeAsync(500);
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before);
vi.useRealTimers();
});
it('ignores activity types that are not memory-related', async () => {
vi.useFakeTimers({ shouldAdvanceTime: true });
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
__socketHandler!({
event: 'ACTIVITY_LOGGED',
workspace_id: 'ws-1',
payload: { activity_type: 'a2a_send' },
});
await vi.advanceTimersByTimeAsync(500);
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before);
vi.useRealTimers();
});
// Server-side emitters confirmed via grep of workspace-server/internal/handlers
// are `memory_write_global`, `memory_edit_global`, `memory_delete_global`
// (memories.go `LogActivity` calls for GLOBAL-scope writes). Pin each
// so a future filter narrow-down can't silently drop one and let the
// panel go stale on its actual production trigger.
it.each([
'memory_write', // pre-emptive: not yet emitted by server, see component comment
'memory_write_global', // memories.go:218 (Commit)
'memory_edit_global', // memories.go:617 (Update)
'memory_delete_global', // memories.go (Delete) — paired with the above two
'agent_log', // generic catch-all
])('refetches on activity_type=%s', async (activityType) => {
vi.useFakeTimers({ shouldAdvanceTime: true });
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
__socketHandler!({
event: 'ACTIVITY_LOGGED',
workspace_id: 'ws-1',
payload: { activity_type: activityType },
});
await vi.advanceTimersByTimeAsync(350);
await waitFor(() => {
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before + 1);
});
vi.useRealTimers();
});
it('coalesces a burst of memory_write events into one refetch', async () => {
vi.useFakeTimers({ shouldAdvanceTime: true });
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
for (let i = 0; i < 5; i++) {
__socketHandler!({
event: 'ACTIVITY_LOGGED',
workspace_id: 'ws-1',
payload: { activity_type: 'memory_write' },
});
}
await vi.advanceTimersByTimeAsync(350);
await waitFor(() => {
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before + 1);
});
vi.useRealTimers();
});
});
@@ -402,31 +402,6 @@ describe("MissingKeysModal — add keys and deploy", () => {
expect(onKeysAdded).toHaveBeenCalled();
});
it("shows optional keys without blocking deploy", () => {
const onKeysAdded = vi.fn();
render(
<MissingKeysModal
open={true}
missingKeys={[]}
optionalKeys={["GOOGLE_GSC_SITE"]}
runtime="claude-code"
title="Configure Workspace"
onKeysAdded={onKeysAdded}
onCancel={vi.fn()}
/>
);
expect(screen.getByText("Optional")).toBeTruthy();
expect(screen.getAllByText("GOOGLE_GSC_SITE").length).toBeGreaterThan(0);
const deployBtn = Array.from(document.querySelectorAll("button")).find(
(b) => b.textContent?.trim() === "Deploy",
);
expect(deployBtn).toBeTruthy();
expect(deployBtn!.disabled).toBe(false);
act(() => { fireEvent.click(deployBtn!); });
expect(onKeysAdded).toHaveBeenCalled();
});
it("shows global error when not all keys saved", async () => {
const onKeysAdded = vi.fn();
render(
@@ -554,4 +529,4 @@ describe("MissingKeysModal — cancel and settings", () => {
);
expect(screen.queryByRole("button", { name: /open settings/i })).toBeNull();
});
});
});
@@ -272,9 +272,7 @@ describe("OrgCancelButton — API interactions", () => {
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(mockApiDel).toHaveBeenCalledWith("/workspaces/root-1?confirm=true", {
headers: { "X-Confirm-Name": "Test Org" },
});
expect(mockApiDel).toHaveBeenCalledWith("/workspaces/root-1?confirm=true");
});
it("shows success toast on DELETE success", async () => {
@@ -1,110 +0,0 @@
// @vitest-environment jsdom
//
// internal#718 P3 (retire-list #4) — when GET /templates serves a
// registry-backed selectable list (registry_providers + registry_models with
// display_name / billing_mode / derived provider), the canvas builds the
// provider catalog FROM that registry data instead of re-inferring vendor
// from model-id prefixes (VENDOR_LABELS / BARE_VENDOR_PATTERNS / inferVendor).
// The heuristic path stays only as the fallback for non-registry runtimes /
// older backends.
import { describe, it, expect } from "vitest";
import {
buildProviderCatalogFromRegistry,
type RegistryProvider,
type RegistryModel,
} from "../ProviderModelSelector";
// Mirrors the registry-served claude-code payload from GET /templates
// (registry_providers / registry_models). display_name + billing_mode come
// from the registry, NOT from the canvas VENDOR_LABELS map.
const CLAUDE_CODE_REGISTRY_PROVIDERS: RegistryProvider[] = [
{
name: "anthropic-oauth",
display_name: "Claude Code subscription",
auth_env: ["CLAUDE_CODE_OAUTH_TOKEN"],
billing_mode: "byok",
},
{
name: "anthropic-api",
display_name: "Anthropic API",
auth_env: ["ANTHROPIC_API_KEY"],
billing_mode: "byok",
},
{
name: "platform",
display_name: "Platform",
auth_env: ["ANTHROPIC_API_KEY", "MOLECULE_LLM_USAGE_TOKEN"],
billing_mode: "platform_managed",
},
];
const CLAUDE_CODE_REGISTRY_MODELS: RegistryModel[] = [
{ id: "sonnet", provider: "anthropic-oauth", billing_mode: "byok" },
{ id: "opus", provider: "anthropic-oauth", billing_mode: "byok" },
{ id: "claude-opus-4-7", provider: "anthropic-api", billing_mode: "byok" },
{ id: "anthropic/claude-opus-4-7", provider: "platform", billing_mode: "platform_managed" },
];
describe("buildProviderCatalogFromRegistry", () => {
it("buckets models by their DERIVED registry provider, not by inferred vendor", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
const byVendor = new Map(catalog.map((p) => [p.vendor, p]));
// anthropic-oauth bucket holds the two OAuth-derived models.
const oauth = byVendor.get("anthropic-oauth");
expect(oauth).toBeDefined();
expect(oauth!.models.map((m) => m.id).sort()).toEqual(["opus", "sonnet"]);
// platform bucket holds the platform-namespaced model.
const platform = byVendor.get("platform");
expect(platform).toBeDefined();
expect(platform!.models.map((m) => m.id)).toEqual(["anthropic/claude-opus-4-7"]);
});
it("labels providers from the registry display_name, not VENDOR_LABELS", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
const oauth = catalog.find((p) => p.vendor === "anthropic-oauth");
// Registry display_name "Claude Code subscription" (decorated with the
// model count by the catalog builder is acceptable; assert it carries the
// registry label, not an inferred one).
expect(oauth!.label).toContain("Claude Code subscription");
});
it("carries the registry billing_mode per provider", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
expect(catalog.find((p) => p.vendor === "anthropic-oauth")!.billingMode).toBe("byok");
expect(catalog.find((p) => p.vendor === "platform")!.billingMode).toBe("platform_managed");
});
it("surfaces the registry auth_env on the provider entry", () => {
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
CLAUDE_CODE_REGISTRY_MODELS,
);
expect(catalog.find((p) => p.vendor === "anthropic-oauth")!.envVars).toEqual([
"CLAUDE_CODE_OAUTH_TOKEN",
]);
});
it("only includes providers that actually have at least one served model", () => {
// anthropic-api is a registry provider but has no model in this slice →
// it should not appear as an empty bucket.
const models: RegistryModel[] = [
{ id: "sonnet", provider: "anthropic-oauth", billing_mode: "byok" },
];
const catalog = buildProviderCatalogFromRegistry(
CLAUDE_CODE_REGISTRY_PROVIDERS,
models,
);
expect(catalog.map((p) => p.vendor)).toEqual(["anthropic-oauth"]);
});
});
@@ -44,14 +44,6 @@ const HERMES_MODELS: SelectorModel[] = [
];
describe("inferVendor", () => {
it("uses explicit provider metadata before slug heuristics", () => {
expect(inferVendor({
id: "moonshot/kimi-k2.6",
provider: "platform",
required_env: [],
})).toBe("platform");
});
it("uses slash prefix when present", () => {
expect(inferVendor({ id: "nousresearch/hermes-4-70b", required_env: ["HERMES_API_KEY"] }))
.toBe("nousresearch");
@@ -113,22 +105,6 @@ describe("buildProviderCatalog", () => {
expect(oauth!.models.map((m) => m.id).sort()).toEqual(["haiku", "opus", "sonnet"]);
});
it("labels explicit platform-managed providers", () => {
const catalog = buildProviderCatalog([
{
id: "moonshot/kimi-k2.6",
name: "Kimi K2.6",
provider: "platform",
required_env: [],
},
]);
expect(catalog[0]).toMatchObject({
vendor: "platform",
label: "Platform",
envVars: [],
});
});
it("flags wildcard providers", () => {
const catalog = buildProviderCatalog(HERMES_MODELS);
const hf = catalog.find((p) => p.vendor === "huggingface");
@@ -189,23 +189,6 @@ describe("TemplatePalette — sidebar", () => {
expect(screen.getByText("Researcher")).toBeTruthy();
});
it("hides runtime-default templates from the deployable product template list", async () => {
mockGet.mockResolvedValue([
{ id: "claude-code-default", name: "Claude Code Agent", description: "", tier: 4, skills: [] },
{ id: "codex", name: "OpenAI Codex CLI", description: "", tier: 4, skills: [] },
{ id: "hermes", name: "Hermes Agent", description: "", tier: 4, skills: [] },
{ id: "openclaw", name: "OpenClaw Agent", description: "", tier: 4, skills: [] },
{ id: "seo-agent", name: "SEO Agent", description: "SEO workspace template", tier: 4, skills: ["seo"] },
]);
render(<TemplatePalette />);
await openSidebar();
expect(screen.getByText("SEO Agent")).toBeTruthy();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
expect(screen.queryByText("OpenAI Codex CLI")).toBeNull();
expect(screen.queryByText("Hermes Agent")).toBeNull();
expect(screen.queryByText("OpenClaw Agent")).toBeNull();
});
it("shows template description", async () => {
mockGet.mockResolvedValue(MOCK_TEMPLATES);
render(<TemplatePalette />);
@@ -57,7 +57,6 @@ export function OrgCancelButton({ rootId, rootName, workspaceCount }: Props) {
try {
await api.del<{ status: string }>(
`/workspaces/${rootId}?confirm=true`,
{ headers: { "X-Confirm-Name": rootName } },
);
showToast(`Cancelled deployment of "${rootName}"`, "success");
// Optimistic local removal — workspace-server broadcasts
@@ -199,9 +199,7 @@ describe("OrgCancelButton — Yes / cascade delete", () => {
});
// 1) API call hit the cascade-delete endpoint with confirm=true
expect(mockApiDel).toHaveBeenCalledWith("/workspaces/ws-root?confirm=true", {
headers: { "X-Confirm-Name": "My Org" },
});
expect(mockApiDel).toHaveBeenCalledWith("/workspaces/ws-root?confirm=true");
// 2) beginDelete locked the WHOLE subtree (root + 2 children) — NOT the unrelated node
expect(mockState.beginDelete).toHaveBeenCalledTimes(1);
@@ -68,11 +68,7 @@ afterEach(() => {
function ShortcutTestComponent() {
useKeyboardShortcuts();
return (
<div data-testid="canvas-root">
<div data-testid="display-stream" data-display-stream="true" />
</div>
);
return <div data-testid="canvas-root" />;
}
function renderWithProvider() {
@@ -82,13 +78,6 @@ function renderWithProvider() {
// ─── Tests ───────────────────────────────────────────────────────────────────
describe("Esc — deselect / close context menu", () => {
it("does not handle keys targeted at the display stream", () => {
mockStoreState.contextMenu = { x: 100, y: 100, nodeId: "n1" };
const { getByTestId } = renderWithProvider();
fireEvent.keyDown(getByTestId("display-stream"), { key: "Escape" });
expect(mockStoreState.closeContextMenu).not.toHaveBeenCalled();
});
it("closes the context menu when one is open", () => {
mockStoreState.contextMenu = { x: 100, y: 100, nodeId: "n1" };
renderWithProvider();

Some files were not shown because too many files have changed in this diff Show More