Commit Graph

3856 Commits

Author SHA1 Message Date
Hongming Wang
effbcd737b
Merge pull request #2575 from Molecule-AI/fix/cascade-include-all-active-templates
fix(publish-runtime): re-add 5 templates wrongly removed from cascade — fixes #2566
2026-05-03 12:45:48 +00:00
Hongming Wang
6eb79adfd5 manifest: re-add 5 workspace templates pruned by #2536
The cascade-list-vs-manifest drift gate (PR #2556's behavior-based
test) caught my previous-commit cascade additions as 'extra-in-cascade'.
Manifest is the source of truth — restoring there.

All 5 templates have successful publish-image runs in the past 24h
(verified before the cascade fix), and continuous-synth-e2e defaults
to langgraph as its primary canary. None deprecated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:43:07 -07:00
Hongming Wang
8f48a38550 fix(publish-runtime): re-add 5 templates wrongly removed from cascade (#2566)
The PR #2536 cascade prune ('deprecated, no shipping images') was
empirically wrong. Re-confirmed 2026-05-03:

- continuous-synth-e2e.yml defaults to langgraph as its primary canary
- All 5 'deprecated' templates have successful publish-image runs in
  the past 24h: langgraph, crewai, autogen, deepagents, gemini-cli

Symptom this fixes — issue #2566 (priority-high, failing 36+h):

  Synthetic E2E (staging): langgraph adapter A2A failure
  'Received Message object in task mode' — failing for >36h

Today at 11:06 commit e1628c4 fixed the underlying a2a-sdk strict-mode
issue in workspace/a2a_executor.py. publish-runtime fired at 11:13 and
cascaded — but only to claude-code, hermes, openclaw, codex. langgraph
was excluded by the prune, so its image stayed on the broken runtime
and the synth E2E (which defaults to langgraph) kept failing despite
the fix being live in PyPI.

After this lands + the next runtime publish fires, langgraph image
re-bakes with the fix and synth-E2E goes green.

Test plan:

- [x] yaml-validate the workflow
- [ ] After merge, watch publish-runtime cascade to all 9 templates
- [ ] Confirm langgraph publish-image fires + succeeds
- [ ] Confirm next continuous-synth-e2e run goes green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:41:53 -07:00
Hongming Wang
dc6425fe39
Merge pull request #2571 from Molecule-AI/fix/synth-e2e-model-slug-by-runtime
fix(synth-e2e): branch MODEL_SLUG by runtime so langgraph gets colon-form
2026-05-03 12:22:19 +00:00
Hongming Wang
cbc69f5e7e fix(synth-e2e): branch MODEL_SLUG by runtime so langgraph gets colon-form
The original script hardcoded `MODEL_SLUG="openai/gpt-4o"` (slash) and
claimed "non-hermes runtimes ignore the prefix" — wrong for langgraph,
which delegates model resolution to langchain's `init_chat_model`. That
function requires `<provider>:<model>` (colon) and treats slash-form as
OpenRouter routing, falling through without auth even when
OPENAI_API_KEY is set.

Surfaced 2026-05-03 after the a2a-sdk v1 contract bugs (PR
#2558+#2563+#2567) cleared the masking layers — synth-E2E firing
2026-05-03T12:14 returned a properly-shaped task with state=failed +
"Could not resolve authentication method" inside the agent body.

continuous-synth-e2e.yml defaults E2E_RUNTIME=langgraph for the cron,
so every firing hit this. Hermes still gets the slash-form it
needs; claude-code uses the entry-id pattern.

Adds E2E_MODEL_SLUG override for operator-dispatched runs that want
to pin a specific slug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:17:55 -07:00
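The per-runtime branching described in cbc69f5e7e can be sketched roughly as below. The real change lives in a shell script; this Python version is only an illustration. The function name `model_slug_for_runtime` and its parameters are hypothetical, and the claude-code entry-id pattern is deliberately not modelled because the commit message does not spell it out.

```python
import os


def model_slug_for_runtime(runtime, provider="openai", model="gpt-4o", env=None):
    """Return a model slug in the form the given runtime expects (illustrative only)."""
    env = os.environ if env is None else env

    # Operator-dispatched runs may pin a slug explicitly via E2E_MODEL_SLUG.
    override = env.get("E2E_MODEL_SLUG")
    if override:
        return override

    if runtime == "langgraph":
        # langgraph delegates to init_chat_model, which wants <provider>:<model>.
        return f"{provider}:{model}"
    if runtime == "hermes":
        # hermes keeps the slash form.
        return f"{provider}/{model}"

    # Other runtimes (e.g. claude-code) use a pattern this sketch does not cover.
    raise ValueError(f"no slug rule sketched for runtime {runtime!r}")
```
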
Hongming Wang
c71f641b12
Merge pull request #2569 from Molecule-AI/fix/redeploy-canary-default
ci(redeploy): fix stale canary_slug default 'hongmingwang' → 'hongming'
2026-05-03 12:08:26 +00:00
Hongming Wang
173e22e091
Merge pull request #2568 from Molecule-AI/auto-sync/main-c0838d63
chore: sync main → staging (auto, ff to c0838d63)
2026-05-03 12:07:29 +00:00
Hongming Wang
60a516bc8d ci(redeploy): fix stale canary_slug default 'hongmingwang' → 'hongming'
The workflow_dispatch input default and the workflow_run env fallback
both pointed at 'hongmingwang', which doesn't match any current prod
tenant (slugs are: hongming, chloe-dong, reno-stars). CP silently
skipped the missing canary and put every tenant in batch-1 in parallel,
defeating the canary-first soak gate that exists to catch image-boot
regressions before they hit the whole fleet.

Concrete example from today's c0838d6 redeploy at 11:53Z (run 25278434388):
the dispatched body was `{"target_tag":"staging-c0838d6","canary_slug":"hongmingwang",...}`
and the CP response showed all 3 tenants in `"phase":"batch-1"` — no
soak, no canary. The deploy happened to be safe, but a broken image
would have hit hongming + chloe-dong + reno-stars simultaneously.

Fixed in three places: the runtime ordering comment, the
workflow_dispatch default, and the env fallback used by the
workflow_run trigger. Comment documents the rationale so the next
slug rename doesn't silently regress this again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:06:01 -07:00
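The failure mode in 60a516bc8d (a canary slug that matches no tenant, silently skipped) can be illustrated with a small planner that refuses to proceed instead. This is not what the commit changed (it only corrected the default slug), and it is not the CP's actual logic; `plan_rollout` and the `"canary"` phase name are hypothetical, while `"batch-1"` and the tenant slugs come from the commit message.

```python
def plan_rollout(tenants, canary_slug):
    """Split tenants into a canary phase and a follow-up batch.

    Raises instead of silently skipping when the canary slug is unknown,
    so a stale default cannot collapse the rollout into one parallel batch.
    """
    if canary_slug not in tenants:
        raise ValueError(
            f"canary slug {canary_slug!r} matches no tenant; known: {sorted(tenants)}"
        )
    rest = [slug for slug in tenants if slug != canary_slug]
    return {"canary": [canary_slug], "batch-1": rest}


if __name__ == "__main__":
    print(plan_rollout(["hongming", "chloe-dong", "reno-stars"], "hongming"))
```
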
Hongming Wang
c0838d637e
Merge pull request #2562 from Molecule-AI/staging
staging → main: auto-promote bb63e60
2026-05-03 04:49:36 -07:00
Hongming Wang
493ab2566e
Merge pull request #2567 from Molecule-AI/fix/synth-e2e-openai-key
ci(synth-e2e): wire MOLECULE_STAGING_OPENAI_KEY into provisioned tenant
2026-05-03 11:45:17 +00:00
Hongming Wang
5e46ea70d6 ci(synth-e2e): wire MOLECULE_STAGING_OPENAI_KEY into provisioned tenant
The synth-E2E (#2342) provisions a langgraph tenant whose default
model `openai:gpt-4.1-mini` requires OPENAI_API_KEY for the first LLM
call. Sibling workflows already wire this:
- e2e-staging-saas.yml:89
- canary-staging.yml:63

continuous-synth-e2e.yml just forgot. Result: tenant boots, accepts
a2a messages, then returns:

  Agent error: "Could not resolve authentication method. Expected
  either api_key or auth_token to be set."

This was masked since 2026-04-29 (workflow creation) by a2a-sdk v0→v1
contract violations — PR #2558 (Task-enqueue) and #2563
(TaskUpdater.complete/failed terminal events) cleared those, exposing
the underlying auth gap on the synth-E2E firing at 11:39 UTC today.

The script tests/e2e/test_staging_full_saas.sh:325 already reads
E2E_OPENAI_API_KEY and persists it as a workspace_secret on tenant
create — only the workflow wiring was missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:43:07 -07:00
Hongming Wang
5cf3dc4369
Merge pull request #2565 from Molecule-AI/fix/redeploy-soft-warn-rt-e2e-prefix
ci(deploy): broaden ephemeral-prefix matchers to cover rt-e2e-*
2026-05-03 11:30:52 +00:00
Hongming Wang
596e797dca ci(deploy): broaden ephemeral-prefix matchers to cover rt-e2e-*
The redeploy-tenants-on-staging soft-warn filter and the
sweep-stale-e2e-orgs janitor both hardcoded `^e2e-` to identify
ephemeral test tenants. Runtime-test harness fixtures (RFC #2251)
mint slugs prefixed with `rt-e2e-`, which neither matcher recognized.

Concrete impact observed today:
  - Two `rt-e2e-v{5,6}-*` tenants left orphaned 8h on staging
    (sweep-stale-e2e-orgs ignored them).
  - On the next staging redeploy their phantom EC2s returned
    `InvalidInstanceId: Instances not in a valid state for account`
    from SSM SendCommand → CP returned HTTP 500 + ok=false.
  - The redeploy soft-warn missed them too, so the workflow went
    red, which broke the auto-promote-staging chain feeding the
    canvas warm-paper rollout to prod.

Fix: switch both matchers to recognize the alternation
`^(e2e-|rt-e2e-)`. Long-lived prefixes (demo-prep, dryrun-*, dryrun2-*)
remain non-ephemeral and continue to hard-fail. Comment documents
the source-of-truth list and the cross-file invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:28:29 -07:00
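The matcher change in 596e797dca is easy to check in isolation. The sketch below uses Python's `re` module rather than the workflows' own shell tooling; the pattern strings are taken from the commit message, while `is_ephemeral` and the sample slugs are made up for illustration.

```python
import re

# Old matcher: only recognises the plain e2e- prefix.
OLD_EPHEMERAL = re.compile(r"^e2e-")
# New matcher: the alternation named in the commit message.
EPHEMERAL = re.compile(r"^(e2e-|rt-e2e-)")


def is_ephemeral(slug):
    """True when the slug belongs to a throwaway test tenant."""
    return EPHEMERAL.match(slug) is not None


if __name__ == "__main__":
    for slug in ["e2e-abc123", "rt-e2e-v5-abc", "demo-prep", "dryrun-1"]:
        print(slug, is_ephemeral(slug))
```
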
Hongming Wang
3ce638d6e6
Merge pull request #2564 from Molecule-AI/fix/canvas-react-flow-color-mode
fix(canvas): wire ReactFlow colorMode to resolvedTheme
2026-05-03 11:14:13 +00:00
Hongming Wang
df7edfcd3f fix(canvas): wire ReactFlow colorMode to resolvedTheme
PR #2555 (Tailwind v4 + warm-paper) migrated all canvas chrome (toolbar,
side panel, modal layer) to semantic tokens, but missed the React Flow
viewport's `colorMode="dark"` literal — and two paired hardcoded dark
literals on the Background dot color and MiniMap mask. Net result on
prod: the user picked light mode, the toolbar flipped warm-paper, but
the canvas backplate, edges, dots, controls, and minimap stayed black —
visibly half-themed.

Three coordinated fixes inside the canvas viewport:
- ReactFlow `colorMode={resolvedTheme}` so the library's own dark/light
  styles flip with the user's choice.
- Background dot color picks the line-soft tone in light mode (zinc-800
  was invisible-on-cream).
- MiniMap maskColor warm-tints the off-viewport dim so the unselected
  region doesn't render as a hard black bar over warm-paper.

Verification:
- `npx tsc --noEmit` clean
- `npx vitest run` 188/188 pass
- (will browser-verify post-redeploy on hongming.moleculesai.app)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:11:35 -07:00
Hongming Wang
3ecb25eb4f
Merge pull request #2563 from Molecule-AI/fix/a2a-v1-terminal-event
fix(a2a): route terminal Message via TaskUpdater.complete/failed in task mode
2026-05-03 11:09:09 +00:00
Hongming Wang
e1628c4d56 fix(a2a): route terminal Message via TaskUpdater.complete/failed in task mode
PR #2558 enqueued a Task at the start of new requests so the v1 SDK
would accept TaskUpdater.start_work() — fix #1 of the v0→v1 migration
gap (PR #2170). But after Task is enqueued, the executor enters
"task mode" and the SDK rejects raw Message enqueues at the terminal
step:

  {"code":-32603,"message":"Received Message object in task mode.
  Use TaskStatusUpdateEvent or TaskArtifactUpdateEvent instead."}

Synth-E2E 2026-05-03T11:00:34Z surfaced this on the very first run
after the prior fix cascaded. Validation site is the same
a2a/server/agent_execution/active_task.py — the framework's job is
to enforce the v1 invariant; we're catching up to it.

The fix routes both terminal events through TaskUpdater helpers:
- success: updater.complete(message=msg) wraps in
  TaskStatusUpdateEvent(state=COMPLETED, final=True)
- error: updater.failed(message=...) wraps in
  TaskStatusUpdateEvent(state=FAILED, final=True)

Both helpers exist in a2a-sdk ≥ 1.0; verified via
TaskUpdater.complete signature.

Tests:
- conftest TaskUpdater stub now records complete/failed calls AND
  routes the message back through event_queue.enqueue_event so the
  ~20 legacy tests asserting on enqueue_event keep working
- 2 new regression tests pin the contract:
  * test_terminal_success_routes_via_updater_complete
  * test_terminal_error_routes_via_updater_failed
- Both NEW tests verified to FAIL on staging-baseline (without this
  fix) and PASS with it — they'd catch the regression before staging
  if the wheel-smoke gate covered task-mode terminal events too
  (separate yak-shave for #131 follow-up)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:06:45 -07:00
Hongming Wang
78721f7a42
Merge pull request #2561 from Molecule-AI/fix/cascade-list-drift-gate
feat(ci): structural drift gate for cascade list vs manifest (RFC #388 PR-3)
2026-05-03 10:55:08 +00:00
Hongming Wang
09010212a0 feat(ci): structural drift gate for cascade list vs manifest (RFC #388 PR-3)
Closes the recurrence path of PR #2556. The data fix realigned 8→4
templates in publish-runtime.yml's TEMPLATES variable, but the
underlying drift hazard was unguarded — the next manifest change
could silently leave cascade out of sync again.

This gate fails any PR that changes manifest.json or
publish-runtime.yml in a way that makes the cascade list diverge
from manifest workspace_templates (suffix-stripped). Either
direction is caught:

  missing-from-cascade  templates that won't auto-rebuild on a new
                       wheel publish (the codex-stuck-on-stale-runtime
                       bug class — PR #2512 added codex to manifest,
                       cascade wasn't updated, codex stayed pinned to
                       its last-built runtime version for weeks).

  extra-in-cascade     cascade dispatches to deprecated templates
                       (the wasted-API-calls + dead-CI-noise class —
                       PR #2536 pruned 5 templates from manifest;
                       cascade kept dispatching to all 8 until
                       PR #2556).

Triggers narrowly: only on PRs that touch manifest.json,
publish-runtime.yml, or the script itself. Fast (single grep+sed+comm
pipeline, no Go build).

Surfaced during the RFC #388 prior-art audit; folded in as the
structural follow-up to the data fix #2556 promised.

Self-tested both failure modes locally before commit:
  - Drop codex from cascade → script fails with "MISSING: codex"
  - Add langgraph to cascade → script fails with "EXTRA: langgraph"

Refs: https://github.com/Molecule-AI/molecule-controlplane/issues/388

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 03:52:39 -07:00
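The two failure directions that 09010212a0 guards against amount to a set difference. The actual gate is a grep+sed+comm pipeline; the Python sketch below only mirrors the idea. The `-template` suffix and the function name `cascade_drift` are assumptions, since the commit message says "suffix-stripped" without naming the suffix.

```python
def cascade_drift(manifest_templates, cascade, suffix="-template"):
    """Compare manifest workspace_templates (suffix-stripped) with the cascade list.

    Returns (missing_from_cascade, extra_in_cascade), each sorted.
    The suffix value is an assumption for illustration.
    """
    def strip(name):
        return name[: -len(suffix)] if suffix and name.endswith(suffix) else name

    expected = {strip(name) for name in manifest_templates}
    actual = set(cascade)
    return sorted(expected - actual), sorted(actual - expected)


if __name__ == "__main__":
    manifest = ["claude-code-template", "hermes-template",
                "openclaw-template", "codex-template"]
    missing, extra = cascade_drift(manifest, ["claude-code", "hermes", "openclaw", "langgraph"])
    for name in missing:
        print(f"MISSING: {name}")
    for name in extra:
        print(f"EXTRA: {name}")
```
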
Hongming Wang
bb63e60114
Merge pull request #2560 from Molecule-AI/fix/preflight-smoke-mode-bypass
fix(preflight): skip required_env check in MOLECULE_SMOKE_MODE
2026-05-03 10:46:20 +00:00
Hongming Wang
06240ab67b fix(preflight): skip required_env check in MOLECULE_SMOKE_MODE
Boot smoke (#2275) exercises executor.execute() against stub deps
and never hits the real provider, so missing auth env is not a real
blocker. Without this bypass, every adapter that introduces a new
auth env var must be mirrored into molecule-ci's fake-env list — a
maintenance treadmill that just bit hermes-template:

- 2026-05-03 09:47 UTC: hermes publish-image smoke fails on
  HERMES_API_KEY preflight (workflow injects CLAUDE_CODE_OAUTH_TOKEN,
  ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENAI_API_KEY but not
  HERMES_API_KEY or OPENROUTER_API_KEY). Failed for two cycles
  before being noticed.

The bypass demotes Required-env failures to warnings when
MOLECULE_SMOKE_MODE is truthy, so the unset env stays visible in
the boot log without blocking. Production paths are unchanged
(env unset → fail).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 03:44:05 -07:00
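A minimal sketch of the bypass described in 06240ab67b, assuming a preflight helper shaped like the one below. `check_required_env` and the set of values treated as truthy are hypothetical; only the `MOLECULE_SMOKE_MODE` name and the fail-versus-warn behaviour come from the commit message.

```python
TRUTHY = {"1", "true", "yes", "on"}  # assumed definition of "truthy"


def check_required_env(required, env):
    """Return the list of missing variables as warnings in smoke mode.

    Outside smoke mode a missing variable is a hard failure, as before.
    """
    missing = [name for name in required if not env.get(name)]
    smoke = env.get("MOLECULE_SMOKE_MODE", "").strip().lower() in TRUTHY
    if missing and not smoke:
        raise RuntimeError(f"missing required env: {', '.join(missing)}")
    # In smoke mode the caller can log these so the gap stays visible.
    return missing
```
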
Hongming Wang
88ef70431e
Merge pull request #2505 from Molecule-AI/staging
staging → main: auto-promote d64570a
2026-05-03 03:36:56 -07:00
Hongming Wang
750b32c33f
Merge pull request #2558 from Molecule-AI/fix/a2a-v1-task-enqueue
fix(a2a): enqueue Task before TaskStatusUpdateEvent for v1 SDK contract
2026-05-03 10:18:36 +00:00
Hongming Wang
5c3b79a8ba fix(a2a): enqueue Task before TaskStatusUpdateEvent for v1 SDK contract
a2a-sdk ≥ 1.0 raises InvalidAgentResponseError when an executor publishes a
TaskStatusUpdateEvent (e.g. via TaskUpdater.start_work) before any Task
event for fresh requests. The framework only auto-creates the Task on
continuation messages (existing task_id resolves via task_manager.get_task);
new requests leave _task_created unset and the SDK validation at
a2a/server/agent_execution/active_task.py rejects the first status update.

PR #2170 migrated the executor surface to v1 but missed this contract. The
synthetic E2E gate caught it on every staging run since (~1 week silent
fail) with:

    {"jsonrpc":"2.0","id":"e2e-msg-1","error":{"code":-32603,
     "message":"Agent should enqueue Task before TaskStatusUpdateEvent
     event","data":null}}

The fix enqueues a Task(state=SUBMITTED) before the TaskUpdater is
constructed, gated on `context.current_task is None` so continuation
messages don't double-enqueue (which the SDK logs about but doesn't reject).

Tests:

  - test_first_event_is_task_for_new_request — pins the new-request path:
    first enqueue must be a Task with the expected id/context_id
  - test_no_task_enqueue_on_continuation — pins the continuation path: when
    context.current_task is set, the executor must NOT re-enqueue Task
  - conftest: stub Task / TaskStatus / TaskState in the mocked a2a.types
    module so the import inside the executor resolves under unit tests

google-adk adapter does not have this bug — its execute() only emits
Message events, not TaskStatusUpdateEvent. Its cancel() does emit one,
but cancel is rarely-invoked and out of scope for this fix.

Live verification path: this PR's merge → publish-runtime cascade → next
synth-E2E firing should go green at step "8/11 Sending A2A message to
parent — expecting agent response".
2026-05-03 03:15:54 -07:00
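The two v1 ordering rules behind 5c3b79a8ba and e1628c4d56 can be modelled with a toy queue. Nothing below is the a2a-sdk API: `ToyEventQueue`, `StatusUpdate`, and `run_request` are stand-ins invented for illustration, and only the two error strings are quoted from the commit messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    state: str


@dataclass
class Message:
    text: str


@dataclass
class StatusUpdate:
    state: str
    message: Optional[Message] = None
    final: bool = False


class ToyEventQueue:
    """Enforces: Task before any status update, no raw Message in task mode."""

    def __init__(self):
        self.events = []
        self.task_created = False

    def enqueue(self, event):
        if isinstance(event, Task):
            self.task_created = True
        elif isinstance(event, StatusUpdate) and not self.task_created:
            raise RuntimeError("Agent should enqueue Task before TaskStatusUpdateEvent event")
        elif isinstance(event, Message) and self.task_created:
            raise RuntimeError("Received Message object in task mode")
        self.events.append(event)


def run_request(queue, current_task, reply_text):
    # New request: enqueue the Task first. Continuation: the task already exists.
    if current_task is None:
        queue.enqueue(Task(state="submitted"))
    else:
        queue.task_created = True
    queue.enqueue(StatusUpdate(state="working"))
    # Terminal reply travels inside a final status update, never as a bare Message.
    queue.enqueue(StatusUpdate(state="completed", message=Message(reply_text), final=True))
    return queue.events
```
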
Hongming Wang
e014d22ee9
Merge pull request #2557 from Molecule-AI/feat/sweep-aws-secrets-orphans
feat(ops): sweep orphan AWS Secrets Manager secrets
2026-05-03 09:48:59 +00:00
Hongming Wang
18c2bdbe68
Merge pull request #2529 from Molecule-AI/dependabot/pip/workspace/starlette-gte-1.0.0
chore(deps)(deps): update starlette requirement from >=0.38.0 to >=1.0.0 in /workspace
2026-05-03 09:42:15 +00:00
Hongming Wang
6f8f7932d2 feat(ops): add sweep-aws-secrets janitor — orphan tenant bootstrap secrets
CP's deprovision flow calls Secrets.DeleteSecret() (provisioner/ec2.go:806)
but only when the deprovision runs to completion. Crashed provisions and
incomplete teardowns leak the per-tenant `molecule/tenant/<org_id>/bootstrap`
secret. At ~$0.40/secret/month, ~45 leaked secrets surfaced as ~$19/month
on the AWS cost dashboard.

The tenant_resources audit table (mig 024) tracks four kinds today —
CloudflareTunnel, CloudflareDNS, EC2Instance, SecurityGroup — and the
existing reconciler doesn't catch Secrets Manager orphans. The proper fix
(KindSecretsManagerSecret + recorder hook + reconciler enumerator) is filed
as a follow-up controlplane issue. This sweeper is the immediate stopgap.

Parallel-shape to sweep-cf-tunnels.sh:
  - Hourly schedule offset (:30, between sweep-cf-orphans :15 and
    sweep-cf-tunnels :45) so the three janitors don't burst CP admin
    at the same minute.
  - 24h grace window: never deletes a secret younger than the
    provisioning roundtrip, so an in-flight provision can't lose its
    secret to a race with the sweeper.
  - MAX_DELETE_PCT=50 default (mirrors sweep-cf-orphans for durable
    resources; tenant secrets should track 1:1 with live tenants).
  - Same schedule-vs-dispatch hardening as the other janitors:
    schedule → hard-fail on missing secrets, dispatch → soft-skip.
  - 8-way xargs parallelism, dry-run by default, --execute to delete.

Requires a dedicated AWS_JANITOR_* IAM principal — the prod molecule-cp
principal lacks secretsmanager:ListSecrets (it only has scoped
Get/Create/Update/Delete). The workflow's verify-secrets step will hard-fail
on the first scheduled run until those secrets are configured, surfacing
the missing setup loudly rather than silently no-op'ing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:38:08 -07:00
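The two safety rails in 6f8f7932d2 (the 24h grace window and `MAX_DELETE_PCT`) can be sketched as a pure selection function. The real janitor is a shell script; `select_orphans` and its input shape are hypothetical, and treating the percentage cap as "abort the whole sweep" is an assumption about how the limit behaves.

```python
from datetime import datetime, timedelta, timezone


def select_orphans(secrets, live_org_ids, now,
                   grace=timedelta(hours=24), max_delete_pct=50):
    """Pick secrets that are safe to delete.

    secrets: list of (org_id, created_at) pairs.
    A secret is a candidate only if its org is not live AND it is older
    than the grace window. If candidates exceed max_delete_pct of all
    secrets, refuse to delete anything (assumed behaviour).
    """
    candidates = [
        org_id
        for org_id, created_at in secrets
        if org_id not in live_org_ids and now - created_at >= grace
    ]
    if secrets and len(candidates) * 100 > max_delete_pct * len(secrets):
        raise RuntimeError(
            f"refusing to delete {len(candidates)}/{len(secrets)} secrets "
            f"(limit {max_delete_pct}%)"
        )
    return candidates
```
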
Hongming Wang
15124527da
Merge pull request #2276 from Molecule-AI/feat/layer1-runtime-digest-pinning
feat(provisioner): digest-pin runtime images via runtime_image_pins table (Layer 1 of #2272)
2026-05-03 09:32:54 +00:00
Hongming Wang
e9a1ce3591
Merge pull request #2556 from Molecule-AI/fix/cascade-list-align-to-manifest
fix(publish-runtime): align cascade list to 4 supported runtimes
2026-05-03 09:32:36 +00:00
Hongming Wang
1bff419833 feat(provisioner): digest-pin workspace images via runtime_image_pins (#2272 layer 1)
Layer 1 of the runtime-rollout plan. Decouples publish from promotion by
giving operators a `runtime_image_pins` table the provisioner consults at
container-create time. No row = legacy `:latest` behavior; row present =
provisioner pulls `<base>@sha256:<digest>`. One bad publish no longer
breaks every workspace simultaneously.

Mechanics:

  - Migration 047: `runtime_image_pins` (template_name PK + sha256 digest +
    audit columns) and `workspaces.runtime_image_digest` (nullable, with
    partial index) for "show me workspaces still on the old digest" queries.
  - `resolveRuntimeImage` (handlers/runtime_image_pin.go): looks up the
    pin, returns `<base>@sha256:<digest>` on hit, "" on miss/error so the
    provisioner falls through to the legacy tag map. Availability over
    pinning — any DB error logs and returns "" rather than blocking the
    provision. `WORKSPACE_IMAGE_LOCAL_OVERRIDE=1` short-circuits the
    lookup so devs rebuilding template images locally see their fresh
    build.
  - `WorkspaceConfig.Image` carries the resolved value into the
    provisioner. `selectImage` honors it ahead of the runtime→tag map and
    falls back to DefaultImage on unknown runtime.
  - The existing `imageTagIsMoving` predicate (#215) already returns false
    on `@sha256:` form, so digest pins skip the force-pull path naturally.

Tests:

  - Handler-side (sqlmock): no-pin/db-error/with-pin/empty/unknown/local-
    override paths cover every branch of `resolveRuntimeImage`.
  - Provisioner-side: `selectImage` table covers explicit-image preference,
    runtime-map fallback, unknown-runtime → default, empty-config →
    default. Plus a struct-literal compile-time pin on `Image` so a future
    refactor can't silently drop the field.

Layer 2 (per-ring routing via `workspaces.runtime_image_digest`) and the
admin promote/rollback endpoint ride on top of this and ship separately.
2026-05-03 02:30:00 -07:00
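The lookup-then-fallback flow in 1bff419833 can be sketched as follows. The real code is Go (`resolveRuntimeImage`, `selectImage`); this Python rendering is an approximation under stated assumptions: `lookup_pin` is a hypothetical callable standing in for the DB query, and the pin is assumed to store the bare hex digest without a `sha256:` prefix.

```python
import logging

log = logging.getLogger("runtime_image_pin")


def resolve_runtime_image(template, base, lookup_pin, local_override=False):
    """Return '<base>@sha256:<digest>' on a pin hit, '' otherwise."""
    if local_override:
        # Mirrors WORKSPACE_IMAGE_LOCAL_OVERRIDE=1: skip the lookup entirely.
        return ""
    try:
        digest = lookup_pin(template)
    except Exception as exc:
        # Availability over pinning: log and fall through to the tag map.
        log.warning("pin lookup failed for %s: %s", template, exc)
        return ""
    if not digest:
        return ""
    return f"{base}@sha256:{digest}"


def select_image(explicit_image, runtime, tag_map, default_image):
    """An explicit (resolved) image wins, then the runtime map, then the default."""
    if explicit_image:
        return explicit_image
    return tag_map.get(runtime, default_image)
```
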
Hongming Wang
24276b9458 fix(publish-runtime): align cascade list to 4 supported runtimes
The cascade `TEMPLATES` list in publish-runtime.yml had drifted from
manifest.json:

  Currently dispatches to: claude-code, langgraph, crewai, autogen,
                           deepagents, hermes, gemini-cli, openclaw
  manifest.json supports:  claude-code, hermes, openclaw, codex (after
                           PR #2536 pruned to 4 actively-supported)

Two consequences of the drift:

1. `codex` (added in PR #2512, supported in manifest) was never in the
   cascade — fresh runtime publishes did NOT trigger a codex template
   rebuild. Codex stayed pinned to whatever runtime version it last saw
   at its own image-build time.

2. langgraph/crewai/autogen/deepagents/gemini-cli — deprecated, no
   shipping images, no working A2A — were still receiving cascade
   dispatches. Wasted API calls and (worse) green CI on dead repos
   masks "this template is dead, stop maintaining it."

Now matches manifest.json workspace_templates exactly. Surfaced during
RFC #388 (fast workspace provision) prior-art audit.

Long-term fix is to derive TEMPLATES from manifest.json so this can't
drift again — captured as a Phase-1 invariant in RFC #388. This commit
is the data fix only; structural fix lands with the bake pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:28:15 -07:00
Hongming Wang
aef9555b1d
Merge pull request #2555 from Molecule-AI/feat/canvas-warm-paper-tailwind-v4
feat(canvas): warm-paper theme + Tailwind v4
2026-05-03 09:27:23 +00:00
Hongming Wang
db48d1d261 fix(canvas): restore text-white on saturated buttons + close zinc gaps
Independent code review of #2555 caught two contrast regressions left
by the bulk perl pass:

1. text-white → text-ink mass-substitution silently broke destructive
   and primary buttons. text-ink resolves to #15181c (warm-paper
   near-black) in light mode — dark text on bg-red-600 / bg-amber-600
   / bg-emerald-600 / bg-blue-600 / bg-accent / bg-accent-strong /
   bg-good / bg-bad fails WCAG contrast and looks broken. Per-line
   pass flips text-ink → text-white only when a saturated bg utility
   is present; tinted-state pills (bg-red-950/50 etc.) keep their
   intentionally-retained text-* literals.

2. Original mapping table was missing bg-zinc-600 (most-used
   hover-state literal for cancel buttons — caused them to JUMP from
   warm cream resting state to dark zinc on hover in light mode) and
   text-zinc-700/800/900 (separator dots and decorative dim text
   invisible on warm-paper light bg). Extended mapping fills these
   gaps with bg-surface-card / text-ink-soft.

Also: drop stale tailwind.config.ts reference from components.json
(file deleted by the v3→v4 migration); switch baseColor zinc →
neutral and enable cssVariables since v4 uses CSS-driven tokens.
Future shadcn-cli invocations would have failed or written malformed
components without this.

27 sites in 27 files affected by #1, ~20 sites in 20 files by #2.
1214/1214 unit tests still pass; build still clean.

Findings courtesy of multi-model review per code-review-and-quality
skill — different blind spots catch different bugs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:04:20 -07:00
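The per-line pass from item 1 of db48d1d261 can be approximated as below. The original pass was done with perl; this Python sketch is only an illustration. The list of saturated backgrounds is copied from the commit message, while `fix_line` and the choice to also honour variant prefixes such as `hover:` are assumptions.

```python
import re

SATURATED_BGS = {
    "bg-red-600", "bg-amber-600", "bg-emerald-600", "bg-blue-600",
    "bg-accent", "bg-accent-strong", "bg-good", "bg-bad",
}

# Match text-ink as a whole utility, not text-ink-soft or text-ink/80.
TEXT_INK = re.compile(r"(?<![\w-])text-ink(?![\w/-])")


def fix_line(line):
    """Flip text-ink to text-white only when the line has a saturated bg utility."""
    tokens = re.split(r"[\s\"'`]+", line)
    # Drop variant prefixes such as hover: or dark: before comparing.
    utilities = {tok.split(":")[-1] for tok in tokens}
    if utilities & SATURATED_BGS:
        return TEXT_INK.sub("text-white", line)
    return line
```
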
Hongming Wang
052575d773 fix(canvas): regenerate lockfile with cross-platform optional deps
CI's `npm ci` failed because the previous lock was generated on macOS
arm64, which omits the Linux-specific optional deps that
@tailwindcss/postcss → lightningcss-linux-x64-gnu transitively need
(@emnapi/runtime, @emnapi/core).

Re-ran `npm install --include=optional` so the lock includes every
platform variant of lightningcss + the @emnapi packages they pull in.
Runner (Linux x64) now has what it needs; local macOS install still
fine (npm picks the matching binary at install time).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:52:42 -07:00
Hongming Wang
c0eca8d0e1 feat(canvas): warm-paper theme + Tailwind v4 migration
Brings the canvas onto the warm-paper design system already shipped to
landing, marketplace, and SaaS surfaces, and migrates the build from
Tailwind v3 → v4 to match molecule-app.

Plumbing:
- swap tailwindcss v3 → v4, drop autoprefixer, add @tailwindcss/postcss
- delete tailwind.config.ts (v4 reads tokens from @theme blocks in CSS)
- globals.css: @import "tailwindcss" + @plugin "@tailwindcss/typography"
- two @theme blocks: warm-paper light defaults + always-dark surface
  tokens (bg-bg / ink-mute / line-strong) for terminal/console panels
- [data-theme="dark"] cascade overrides the warm-paper tokens for dark
- React Flow edge stroke + scrollbar + selection colour pull from
  semantic tokens so they flip with the theme

Theme infra (ported from molecule-app, identical contracts):
- lib/theme-cookie.ts: mol_theme cookie + boot script (no "use client"
  so server components can read the constants)
- lib/theme-provider.tsx: ThemeProvider + useTheme + cookie writer with
  Domain=.moleculesai.app so the preference follows the user across
  canvas/app/market/landing subdomains AND tenant subdomains
- lib/theme.ts: ColorToken union + cssVar() helper
- components/ThemeToggle.tsx: 3-way System/Light/Dark picker
- layout.tsx: SSR cookie read + nonce'd inline boot script (CSP needs
  the explicit nonce — strict-dynamic doesn't forgive an un-nonce'd
  inline sibling) + ThemeProvider wrapper + bg-surface/text-ink body

Component migration (62 files):
- Mechanical bg-zinc-* / text-zinc-* / border-zinc-* / text-white →
  semantic surface/ink/line tokens via perl negative-lookahead pass
  (preserves opacity modifiers like /80, /60)
- bg-blue-500/600 → bg-accent / bg-accent-strong
- text-red-* / amber-* / emerald-* → text-bad / warm / good
- Tinted-state banner backgrounds (bg-red-950, bg-amber-950, bg-blue-950
  etc.) intentionally left literal — they remain readable on warm-paper
  in light mode without inventing new state-soft tokens
- TerminalTab.tsx skipped — xterm renders to canvas, not DOM
- 3 unit-test assertions updated to match new token strings (credits
  pillTone, AuthGate overlay class, A2AEdge accent)

Verification:
- pnpm test: 1214/1214 pass
- pnpm tsc --noEmit: clean
- next build: ✓ Compiled successfully (8 routes)
- dev server inspection: html data-theme stamped, body uses
  bg-surface text-ink, boot script carries nonce, compiled CSS
  contains both @theme blocks + [data-theme="dark"] override

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:43:55 -07:00
Hongming Wang
e4893f5a9a
Merge pull request #2552 from Molecule-AI/feat/wire-event-log-into-adapter-base
feat(workspace): wire EventLog into adapter base (#119 PR-3b)
2026-05-03 08:39:34 +00:00
Hongming Wang
872f8e8971
Merge pull request #2554 from Molecule-AI/chore/remove-dead-ast-defensive-block
chore(workspace): remove dead defensive block in load_skills AST gate
2026-05-03 08:33:18 +00:00
Hongming Wang
d58185b8a8 chore(workspace): remove dead defensive block in load_skills AST gate
Self-review of PR #2553 caught an unreachable defensive block at
test_load_skills_call_sites.py:99-103: the inner check guarded
`call.func.__class__.__name__ == "Name"` from a FunctionDef, but
`_find_load_skills_calls` already filters its return type to
`ast.Call` — `FunctionDef` cannot reach that loop body. The block
was a no-op `pass` with a misleading comment.

Removing keeps the gate behaviorally identical; tests still pass.

Same five-axis review pass that turned this up also approved the
substantive logic of #2553, so no behavior change here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:30:05 -07:00
Hongming Wang
3c0f7de4b9
Merge pull request #2553 from Molecule-AI/feat/skill-compat-audit
docs(skills): document SKILL.md `runtime` field + AST coverage gate (#119 PR-4)
2026-05-03 08:24:55 +00:00
Hongming Wang
f8b40d8d73 docs(skills): document SKILL.md runtime field + AST coverage gate (#119 PR-4)
Closes the documentation + audit gap for declarative skill-compat. The
plumbing has been live since PR #117 (RuntimeCapabilities) and
skill_loader's `_normalize_runtime_field` has been emitting filter
decisions for weeks, but:
- No public doc explained the `runtime` frontmatter field, so skill
  authors didn't know how to opt in / opt out.
- No structural gate ensured every load_skills() call site threads
  current_runtime — a future caller forgetting the kwarg silently
  force-loads runtime-incompatible skills (no AttributeError, just a
  delayed crash on first tool invocation).

Two changes:

1. docs/agent-runtime/skills.md
   - Adds `runtime`, `tags`, `examples` to the Frontmatter Fields table.
   - Adds a Runtime Compatibility section with example, accepted shapes
     (universal default, list, string sugar), and the "logged + omitted,
     not crashed" failure mode. Notes that match values come from each
     adapter's name() (the same string in config.yaml's runtime: field).

2. workspace/tests/test_load_skills_call_sites.py
   - Static AST gate: walks every workspace/*.py (excluding tests),
     finds load_skills(...) Call nodes, fails if any lacks
     current_runtime= as a keyword.
   - Defense-in-depth `test_known_call_sites_present` — pins that the
     scan actually sees the two known callers (adapter_base,
     skill_loader.watcher) so a refactor that moves them is loud.
   - Sanity-checked the matcher against a synthetic violating module.

Same-shape pattern as PR #2358 (tenant_resources audit-coverage AST
gate, #150) — pin the contract structurally, not just behaviorally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:22:34 -07:00
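The static gate described in f8b40d8d73 can be sketched with the standard `ast` module. This is a simplified reading of the idea, not the contents of test_load_skills_call_sites.py: it scans a source string rather than walking workspace/*.py, and the helper names are hypothetical.

```python
import ast


def find_load_skills_calls(source):
    """Return every ast.Call node whose callee is named load_skills."""
    calls = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            name = func.id            # load_skills(...)
        elif isinstance(func, ast.Attribute):
            name = func.attr          # module.load_skills(...)
        else:
            name = None
        if name == "load_skills":
            calls.append(node)
    return calls


def calls_missing_runtime(source):
    """Line numbers of load_skills calls that omit current_runtime=."""
    return [
        call.lineno
        for call in find_load_skills_calls(source)
        if not any(kw.arg == "current_runtime" for kw in call.keywords)
    ]
```

A gate built this way would fail the test run whenever `calls_missing_runtime` returns a non-empty list for any scanned module.
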
Hongming Wang
71e7a6ffee feat(workspace): wire EventLog into adapter base (#119 PR-3b)
Adds adapter.event_log property+setter on BaseAdapter so adapters can
emit structured events (tool dispatch, skill load, executor errors)
without coupling to the chosen backend. Default is a shared no-op
DisabledEventLog; main.py overrides at boot from the
observability.event_log config block (PR-2 schema).

The shape is intentionally additive:
- Property is invisible to the BaseAdapter signature snapshot drift
  gate (the helper walks vars(cls) for callables only — properties
  are not callable). Verified with a regression test in the new
  test_adapter_base_event_log.py.
- Existing adapters continue to work unchanged. Template repos that
  never call self.event_log get the no-op for free.
- Setter accepts any EventLogBackend, so swapping memory↔disabled
  at runtime (or to a future Redis backend) requires no adapter
  code change.

Follow-ups:
- PR-3c: emit events from claude-code/hermes adapters at the
  natural points (tool dispatch, skill load).
- PR-4: skill-compat audit + SKILL.md frontmatter docs.
- Platform-side /workspaces/:id/activity endpoint reads the buffer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:18:19 -07:00
Hongming Wang
e2b58f0fbc
Merge pull request #2551 from Molecule-AI/feat/wire-observability-config
feat(workspace): wire observability heartbeat + log_level into consumers (#119 PR-3a)
2026-05-03 08:05:00 +00:00
Hongming Wang
efa68a26b1 feat(workspace): wire observability config into heartbeat + uvicorn (#119 PR-3a)
Replaces the hard-coded HEARTBEAT_INTERVAL=30 in heartbeat.py and
log_level="info" in main.py with values from
ObservabilityConfig (#119 PR-1, schema landed in PR #2538).

Concrete plumbing:

  - heartbeat.HeartbeatLoop accepts an `interval_seconds=` keyword
    arg. Defaults to the legacy module constant so 2-arg callers
    (existing tests, any downstream code that hasn't been updated)
    keep their existing 30s behavior.
  - main.py constructs HeartbeatLoop with
    config.observability.heartbeat_interval_seconds — the value the
    config parser already clamped to [5, 300].
  - main.py's uvicorn.Config takes log_level from
    config.observability.log_level (lowercased — uvicorn's convention
    differs from Python logging's) with LOG_LEVEL env still winning
    as an ops-side debugging override.
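
The plumbing above can be sketched as follows. This is an illustration under assumptions, not the code in heartbeat.py or main.py: the two positional constructor arguments, `clamp_interval`, and `resolve_log_level` are hypothetical names, and lowercasing the env override is assumed rather than confirmed.

```python
import os

HEARTBEAT_INTERVAL = 30  # legacy module constant, in seconds


def clamp_interval(value, low=5, high=300):
    """Parser-side clamp to the [5, 300] band."""
    return max(low, min(high, int(value)))


class HeartbeatLoop:
    # platform_url and workspace_id are placeholder positional arguments.
    def __init__(self, platform_url, workspace_id, *,
                 interval_seconds=HEARTBEAT_INTERVAL):
        self.platform_url = platform_url
        self.workspace_id = workspace_id
        # No re-clamping here: the constructor trusts the parser.
        self.interval_seconds = interval_seconds


def resolve_log_level(config_level, env=None):
    """LOG_LEVEL env wins over config; result is lowercased for uvicorn."""
    env = os.environ if env is None else env
    return (env.get("LOG_LEVEL") or config_level).lower()
```

Two-argument callers keep the 30s default, while a caller that passes `interval_seconds=clamp_interval(raw)` gets the configured value.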

Adapter EventLog wiring is deferred to PR-3b (#208 follow-up): it touches
the adapter_base interface and needs careful design, so it stays separate
to keep this PR small and reviewable.

Tests:
  - test_heartbeat.py: 3 new tests pin default interval, explicit
    override, and the [5, 300] band that the constructor accepts
    without re-clamping (clamping is the parser's job).
  - All 88 tests in test_heartbeat.py + test_config.py pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:01:57 -07:00
Hongming Wang
87e355c296
Merge pull request #2548 from Molecule-AI/feat/event-log-module
feat(workspace): event_log module + EventLogConfig (#119 PR-2)
2026-05-03 07:54:05 +00:00
Hongming Wang
67f3e49e42
Merge pull request #2549 from Molecule-AI/fix/orphan-sweeper-skip-external-runtime
fix(orphan-sweeper): exclude runtime='external' from stale-token revoke
2026-05-03 07:52:43 +00:00
Hongming Wang
be271aef8b fix(orphan-sweeper): exclude runtime='external' from stale-token revoke
The Docker-mode orphan sweeper was incorrectly targeting external runtime
workspaces, revoking their auth tokens ~6 minutes after creation (one
sweep cycle past the 5-min grace).

External workspaces have NO local container by design — their agent runs
off-host. The "no live container" predicate the sweep uses to detect
wiped-volume orphans matches every external workspace unconditionally,
which was killing the only auth credential the off-host agent has.

Reproducer: create a runtime=external workspace, paste the auth token into
molecule-mcp / curl, and wait about 6 minutes (the 5-min grace plus one
sweep cycle). The next request returns
`HTTP 401 — token may be revoked`. Platform log shows
`Orphan sweeper: revoking stale tokens for workspace <id> (no live
container; volume likely wiped)`.

Fix: add `AND w.runtime != 'external'` to the sweep's SELECT. The
existing test regexes (third-pass query expectations + the shared
expectStaleTokenSweepNoOp helper) are tightened to require the new
predicate, so a regression that drops it fails CI immediately.
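
To illustrate the effect of the new predicate, here is a small self-contained sketch using sqlite3. Only the `AND w.runtime != 'external'` clause comes from this commit; the table layout, the `has_live_container` and `age_seconds` columns, and the helper names are hypothetical simplifications of the real sweep.

```python
import sqlite3

# Hypothetical sweep query; only the runtime predicate mirrors the fix.
SWEEP_QUERY = """
SELECT w.id FROM workspaces AS w
WHERE w.has_live_container = 0
  AND w.age_seconds > :grace_seconds
  AND w.runtime != 'external'
ORDER BY w.id
"""


def stale_token_candidates(conn, grace_seconds=300):
    """Return ids of workspaces whose tokens the sweep would revoke."""
    rows = conn.execute(SWEEP_QUERY, {"grace_seconds": grace_seconds})
    return [row[0] for row in rows]


def demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workspaces ("
        "id TEXT PRIMARY KEY, runtime TEXT NOT NULL, "
        "has_live_container INTEGER NOT NULL, age_seconds INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO workspaces VALUES (?, ?, ?, ?)",
        [
            ("ws-docker-orphan", "langgraph", 0, 900),  # genuine orphan
            ("ws-docker-live", "langgraph", 1, 900),    # container running
            ("ws-external", "external", 0, 900),        # off-host agent
            ("ws-docker-new", "hermes", 0, 60),         # inside grace period
        ],
    )
    return conn
```

With the predicate in place only the genuine orphan is selected; dropping it would also select `ws-external`, which never has a local container.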

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:49:37 -07:00
Hongming Wang
9753d58539 fix(build): register event_log in TOP_LEVEL_MODULES
The wheel-build drift gate caught it correctly: any new top-level
module under workspace/ must be listed in TOP_LEVEL_MODULES so its
`from event_log import …` statements get rewritten to
`from molecule_runtime.event_log import …` at package time.

Without this entry, the published wheel ships event_log.py un-rewritten
and crashes at runtime with ModuleNotFoundError on first heartbeat.
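
A rough sketch of the kind of rewrite the build step performs, assuming a regex-based pass over `from X import` lines. The real build script may work differently; `rewrite_imports` and the second module in the tuple are hypothetical.

```python
import re

# Hypothetical subset of the module list; event_log is the new entry.
TOP_LEVEL_MODULES = ("event_log", "heartbeat")
PACKAGE = "molecule_runtime"


def rewrite_imports(source, modules=TOP_LEVEL_MODULES, package=PACKAGE):
    """Prefix `from <module> import ...` with the package name.

    Only modules listed in `modules` are rewritten; anything else is left
    untouched, which is why an unlisted module would ship un-rewritten.
    """
    names = "|".join(re.escape(m) for m in modules)
    pattern = re.compile(
        rf"^([ \t]*)from[ \t]+({names})([ \t]+import\b)", re.MULTILINE
    )
    return pattern.sub(rf"\1from {package}.\2\3", source)
```

For example, `from event_log import EventLog` becomes `from molecule_runtime.event_log import EventLog`, while `from os import path` is unchanged.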

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:19:30 -07:00
Hongming Wang
0fc2531250 feat(workspace): event_log module + EventLogConfig (#119 PR-2)
Adds workspace/event_log.py with an in-memory EventLog backend and a
disabled no-op variant, plus EventLogConfig nested in
ObservabilityConfig (backend / ttl_seconds / max_entries).

The event log is the append-and-query buffer that the canvas Activity
tab and platform `/activity` endpoint will read in PR-3 of the #119
stack. Two backends ship in this PR:

  - InMemoryEventLog: bounded ring buffer with TTL eviction, monotonic
    ids that survive eviction so cursors don't break, thread-safe for
    concurrent appends from heartbeat + main loop + A2A executor.
  - DisabledEventLog: no-op for `backend: disabled` — opts the
    workspace out without crashing callers that propagate event ids.
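
As a sketch of the in-memory backend's described behavior (bounded buffer, TTL eviction, ids that keep increasing across eviction, a lock around appends), something like the following would fit. It is not the code in workspace/event_log.py: the `append`/`since` method names, tuple layout, and defaults are assumptions.

```python
import threading
import time
from collections import deque


class InMemoryEventLog:
    def __init__(self, max_entries=1000, ttl_seconds=3600.0,
                 clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests need no time.sleep
        self._lock = threading.Lock()
        self._entries = deque()  # (id, timestamp, kind, payload)
        self._next_id = 1        # never reset, so cursors stay valid

    def _evict(self, now):
        # Drop oldest entries while over capacity or past the TTL.
        while self._entries and (
            len(self._entries) > self._max_entries
            or now - self._entries[0][1] > self._ttl
        ):
            self._entries.popleft()

    def append(self, kind, payload=None):
        with self._lock:
            now = self._clock()
            event_id = self._next_id
            self._next_id += 1
            self._entries.append((event_id, now, kind, payload))
            self._evict(now)
            return event_id

    def since(self, cursor=0):
        """Return (id, kind, payload) for live entries with id > cursor."""
        with self._lock:
            self._evict(self._clock())
            return [
                (i, kind, payload)
                for i, _, kind, payload in self._entries
                if i > cursor
            ]
```

A reader holding cursor `n` can call `since(n)` after eviction and still get only newer events, because ids are assigned from a counter rather than from buffer positions.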

Schema-only PR — no consumers wired yet. Wiring lands in PR-3.

Test coverage:
  - 34 new test_event_log.py tests (100% line coverage on event_log.py)
  - 9 new test_config.py tests for EventLogConfig parsing
  - Concurrency stress with 8 threads × 200 appends — verifies unique
    monotonic ids under contention
  - TTL + max_entries eviction with injected clock (no time.sleep)
  - Disabled backend contract pinned

Closes #207.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:17:12 -07:00
Hongming Wang
350495f032
Merge pull request #2547 from Molecule-AI/perf/cache-platform-inbound-secret
perf(wsauth): in-process cache for platform_inbound_secret reads
2026-05-03 07:11:38 +00:00
Hongming Wang
384edb4af0
Merge branch 'staging' into perf/cache-platform-inbound-secret
2026-05-03 00:08:43 -07:00