molecule-core

Author	SHA1	Message	Date
Hongming Wang	36fd658cc0	Merge pull request #2393 from Molecule-AI/auto/fix-auto-promote-jq-null-handling fix(ci): handle empty E2E lookup in auto-promote-on-e2e gate	2026-04-30 17:10:25 +00:00
Hongming Wang	8efb2dae8d	fix(ci): handle empty E2E lookup in auto-promote-on-e2e gate When gh run list returns [] (no E2E run on the main SHA — the common case for canvas-only / cmd-only / sweep-only changes whose paths don't trigger E2E), jq's `.[0]` is null and the interpolation `"\(null)/\(null // "none")"` produces "null/none". The case statement has no `null/none)` branch, so it falls into `*)` → exit 1 → auto-promote-on-e2e fails → `:latest` doesn't get retagged to the new SHA → tenants on `redeploy-tenants-on-main` end up pulling the OLD `:latest` digest. Surfaced 2026-04-30 17:00Z as the first observable consequence of PR #2389 (App-token dispatch fix). Every prior auto-promote-on-e2e run was triggered by E2E completion (the "Upstream is E2E itself" short-circuit at line 151 fired before reaching the gate). #2389 made publish-image's completion event correctly fire workflow_run listeners — auto-promote-on-e2e is one of those listeners — and hit the latent jq bug on the first publish-upstream run. Fix: change `.[0]` to `(.[0] // {})` in the jq filter so the empty-array case becomes `none/none` (the documented "E2E paths-filtered out for this SHA — proceed" branch) instead of the unhandled `null/none`. Also default `.status` for the same defensive reason. Verified the three input shapes locally: [] → "none/none" ✓ [{status:completed,conclusion:success}] → "completed/success" ✓ [{status:in_progress,conclusion:null}] → "in_progress/none" ✓ Outer `|| echo "none/none"` fallback retained as defense-in-depth for non-zero gh exits (network / auth failures). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 10:07:52 -07:00
hongmingwang-moleculeai	904cf31e2c	Merge pull request #2390 from Molecule-AI/auto/local-provisioner-api-interface refactor(handlers): widen WorkspaceHandler.provisioner to LocalProvisionerAPI interface (#2369)	2026-04-30 16:21:37 +00:00
Hongming Wang	e081c8335f	refactor(handlers): widen WorkspaceHandler.provisioner to LocalProvisionerAPI interface (#2369) Symmetric with the existing CPProvisionerAPI interface. Closes the asymmetry where the SaaS provisioner field was an interface (mockable in tests) but the Docker provisioner field was a concrete pointer (not). ## Changes - New ``provisioner.LocalProvisionerAPI`` interface — the 7 methods WorkspaceHandler / TeamHandler call on h.provisioner today: Start, Stop, IsRunning, ExecRead, RemoveVolume, VolumeHasFile, WriteAuthTokenToVolume. Compile-time assertion confirms Provisioner satisfies it. Mirror of cp_provisioner.go's CPProvisionerAPI block. - ``WorkspaceHandler.provisioner`` and ``TeamHandler.provisioner`` re-typed from ``provisioner.Provisioner`` to ``provisioner.LocalProvisionerAPI``. Constructor parameter type is unchanged — the assignment widens to the interface, so the 200+ callers of ``NewWorkspaceHandler`` / ``NewTeamHandler`` are unaffected. - Constructors gain an ``if p != nil`` guard before assigning to the interface field. Without this, ``NewWorkspaceHandler(..., nil, ...)`` (the test fixture pattern across 200+ tests) yields a typed-nil interface value where ``h.provisioner != nil`` evaluates true, and the SaaS-vs-Docker fork incorrectly routes nil-fixture tests into the Docker code path. Documented inline with reference to the Go FAQ. - Hardened the 5 Provisioner methods that lacked nil-receiver guards (Start, ExecRead, WriteAuthTokenToVolume, RemoveVolume, VolumeHasFile) — return ErrNoBackend on nil receiver instead of panicking on p.cli dereference. Symmetric with Stop/IsRunning (already hardened in #1813). Defensive cleanup so a future caller that bypasses the constructor's nil-elision still degrades cleanly. - Extended TestZeroValuedBackends_NoPanic with 5 new sub-tests covering the newly-hardened nil-receiver paths. Defense-in-depth: a future refactor that drops one of the nil-checks fails red here before reaching production. 
## Why now - Provisioner orchestration has been touched in #2366 / #2368 — the interface symmetry is the natural follow-up captured in #2369. - Future work (CP fleet redeploy endpoint, multi-backend provisioners) wants this in place. Memory note ``project_provisioner_abstraction.md`` calls out pluggable backends as a north-star. - Memory note ``feedback_long_term_robust_automated.md`` — compile-time gates + ErrNoBackend symmetry > runtime panics. ## Verification - ``go build ./...`` clean. - ``go test ./...`` clean — 1300+ tests pass, including the previously-flaky Create-with-nil-provisioner paths that now exercise the constructor's nil-elision correctly. - ``go test ./internal/provisioner/ -run TestZeroValuedBackends_NoPanic -v`` — all 11 nil-receiver subtests green (was 6, +5 for the newly-hardened methods). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 09:18:16 -07:00
Hongming Wang	2507cc00a0	Merge pull request #2389 from Molecule-AI/auto/auto-promote-app-token-dispatch ci(auto-promote): dispatch publish via molecule-ai App token to unblock workflow_run chain	2026-04-30 16:09:08 +00:00
Hongming Wang	e418d32582	ci(auto-promote): dispatch publish via molecule-ai App token to unblock workflow_run chain Root cause (verified 2026-04-30): GITHUB_TOKEN-initiated workflow_dispatch creates the dispatched run, but the resulting run's completion event does NOT fire downstream `workflow_run` triggers. This is the documented "no recursion" rule: https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow Evidence (publish-workspace-server-image runs on main):
run_id      | head_sha   | triggering_actor    | canary | redeploy
------------+------------+---------------------+--------+---------
25151545007 | `6ef562ee` | HongmingWang-Rabbit | YES    | YES
25171773918 | `21313dc`  | github-actions[bot] | NO     | NO
25173801008 | `59dec57`  | github-actions[bot] | NO     | NO
The 06:52Z run that "worked" was an operator-fired dispatch from the terminal — actor was the operator's PAT. The two runs that "dropped" were dispatched by auto-promote-staging.yml's `gh workflow run` step authenticated via `secrets.GITHUB_TOKEN`, so the actor became `github-actions[bot]` and the workflow_run cascade was suppressed. Same workflow file, same dispatch call, same successful publish run — only the auth token differed. Fix: mint a molecule-ai GitHub App installation token before the dispatch step and use it as `GH_TOKEN`. App-initiated dispatches DO propagate the workflow_run cascade (the App user is a real identity, not the GITHUB_TOKEN bot pseudonym). The molecule-ai App (app_id=3398844, installation 124443072) is already installed on the org with `actions:write` — no new App needed. Only secrets are missing. 
## Required setup before merge The following repo secrets must be added at https://github.com/Molecule-AI/molecule-core/settings/secrets/actions or auto-promote will hard-fail at the new "Mint App token" step: - `MOLECULE_AI_APP_ID` = `3398844` - `MOLECULE_AI_APP_PRIVATE_KEY` = contents of a .pem file generated at https://github.com/organizations/Molecule-AI/settings/installations/124443072 (Click "Generate a private key" if one doesn't exist yet.) ## Long-term cleanup The polling tail step still exists because the auto-merge call itself uses GITHUB_TOKEN, so the FF push to main doesn't fire publish-workspace-server-image's `push` trigger naturally. Switching the auto-merge call to use the SAME App token would eliminate the polling tail entirely. Tracked in #2357. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 08:55:49 -07:00
Hongming Wang	8aba565df0	Merge pull request #2388 from Molecule-AI/auto/fix-workspace-status-enum-awaiting-agent fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum	2026-04-30 15:55:23 +00:00
Hongming Wang	c6cb82e1c0	fix(workspaces): add missing 'awaiting_agent' + 'hibernating' to workspace_status enum Migration 043 (2026-04-25) introduced the workspace_status enum but omitted two values application code had been writing for days, so every UPDATE that tried to write either value failed silently in production: 'awaiting_agent' (since 2026-04-24, commit `1e8b5e01`): - handlers/workspace.go:333 — external workspace pre-register - handlers/registry.go (via PR #2382) — liveness offline transition - registry/healthsweep.go (via PR #2382) — heartbeat-staleness sweep 'hibernating' (since hibernation feature shipped): - handlers/workspace_restart.go:271 — DB-level claim before stop All four sites swallowed the enum-cast error. User-visible impact: external workspaces never transition to a stale state when their agent disconnects (canvas shows them stuck on 'online'/'degraded' indefinitely), new external workspaces never advance past 'provisioning', and idle workspaces never auto-hibernate (resources held forever). PR #2382 didn't cause this — it inherited the gap and added two more silent-fail paths on top. The pre-existing two had been broken for five days and went unnoticed because: 1. sqlmock matches SQL by regex, not against the live enum constraint. Every test passed despite the prod-only failure. 2. The handlers either drop the Exec error entirely (workspace.go:333) or log+continue without an alert (the other three). Fix in three pieces: 1. migrations/046_.up.sql — ALTER TYPE workspace_status ADD VALUE 'awaiting_agent', 'hibernating'. IF NOT EXISTS makes it idempotent across re-runs (RunMigrations re-applies until schema_migrations records the file). ALTER TYPE ADD VALUE doesn't take a heavy lock and commits immediately, safe under live traffic. 2. migrations/046_.down.sql — full rename → recreate → cast → drop recipe. Postgres has no DROP VALUE so this is the only honest rollback. 
Pre-flights existing rows to compatible values (awaiting_agent → offline, hibernating → hibernated) before the type swap. 3. internal/db/workspace_status_enum_drift_test.go — static gate that parses every UPDATE/INSERT against `workspaces` in workspace-server/ internal/, extracts every status literal, and asserts each is in the enum union (CREATE TYPE + every ALTER TYPE ADD VALUE). The gate runs in unit tests, no DB required, and would have caught both omissions on the day they shipped. Pattern matches feedback_behavior_based_ast_gates and feedback_mock_at_drifting_layer. Verification: - go test ./internal/db/ -count=1 -race ✓ - go vet ./... ✓ - Drift gate flips red if I delete either ADD VALUE from the migration (validated via local mutation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 08:52:05 -07:00
hongmingwang-moleculeai	9b061672a0	Merge pull request #2387 from Molecule-AI/auto/platform-auth-signature-snapshot test(platform_auth): module-functions signature snapshot drift gate	2026-04-30 15:44:02 +00:00
Hongming Wang	899a2231d6	test(platform_auth): module-functions signature snapshot drift gate Pin the 5 public functions adapters and the runtime hot-path import through ``from platform_auth import``: - ``auth_headers`` — every outbound httpx call merges this in - ``self_source_headers`` — A2A peer + self-message header builder - ``get_token`` — main.py reads on boot to decide register-vs-resume - ``save_token`` — main.py persists the platform-issued token - ``refresh_cache`` — 401-retry path drops in-process cache (#1877) A grep across workspace/ shows 14+ runtime modules import these: main.py, heartbeat.py, a2a_client.py, a2a_tools.py, consolidation.py, events.py, executor_helpers.py (3 sites), molecule_ai_status.py, builtin_tools/memory.py (3 sites), builtin_tools/temporal_workflow.py (2 sites). Renaming any of the five (e.g. ``auth_headers`` → ``bearer_headers``) makes every one of those imports raise ImportError at workspace boot — the failure surface is deep in heartbeat init, nowhere near the rename site. Same drift class as the BaseAdapter signature snapshot (#2378, #2380), skill_loader gate (#2381), runtime_wedge gate (#2383). Reuses the ``_signature_snapshot.py`` helpers shipped in #2381. Defense-in-depth: ``test_snapshot_has_required_functions`` asserts the five names are still present, so removing one even with a synchronized snapshot edit forces an explicit edit here with a justification. ``clear_cache`` is intentionally NOT in the snapshot — it's a test-only helper. Production code MUST NOT depend on it. Verified red on deliberate rename: ``auth_headers`` → ``bearer_headers`` produces a clean diff of the missing function in the failure message. Restored before commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 08:41:42 -07:00
Hongming Wang	cdef8932ea	Merge pull request #2385 from Molecule-AI/auto/chat-files-stream-response-helper refactor(chat_files): extract streamWorkspaceResponse helper for Upload+Download	2026-04-30 15:30:52 +00:00
Hongming Wang	830e4aa548	refactor(chat_files): extract streamWorkspaceResponse helper for Upload+Download The "do request → check err → defer close → forward headers → set status → io.Copy → log mid-stream errors" tail was duplicated between Upload and Download. Each handler had ~12 lines that differed only in: - the op label in log messages ("upload" vs "download") - the set of response headers to forward verbatim (Upload: Content-Type only; Download: Content-Type + Content-Length + Content-Disposition) Hoist into ChatFilesHandler.streamWorkspaceResponse(c, op, workspaceID, forwardURL, req, forwardHeaders). Each call site reduces to one line. Future changes — request-id forwarding, observability metric, response-size cap, bytes-streamed log — go in ONE place rather than two. Same drift-prevention rationale as resolveWorkspaceForwardCreds (#2372) and readOrLazyHealInboundSecret (#2376), applied to the response-streaming layer of the same handlers. Behavior preserved: existing TestChatUpload_* and TestChatDownload_* integration tests (8 across both handlers) all pass unchanged. The log message format is consistent across both handlers now (single "chat_files {op}: ..." string template) — operators can grep one prefix for both features instead of separate prefixes per handler. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 08:27:45 -07:00
Hongming Wang	7cb8b476ad	Merge pull request #2382 from Molecule-AI/auto/external-defaults-poll-and-awaiting feat(external): default external runtime to poll-mode + awaiting_agent	2026-04-30 14:43:03 +00:00
Hongming Wang	6bd38c2333	Merge pull request #2383 from Molecule-AI/auto/runtime-wedge-signature-snapshot test(runtime_wedge): module-functions signature snapshot drift gate	2026-04-30 14:03:49 +00:00
Hongming Wang	70176e6c8f	test(runtime_wedge): module-functions signature snapshot drift gate BaseAdapter docstring tells adapter authors: > ``runtime_wedge.mark_wedged()`` / ``clear_wedge()`` — flip the > workspace to ``degraded`` + auto-recover when your SDK hits a > non-recoverable error class. Import directly from ``runtime_wedge``; > the heartbeat forwards the state to the platform automatically. That's a contract — adapter templates depend on the four module-level functions (``is_wedged``, ``wedge_reason``, ``mark_wedged``, ``clear_wedge``) being importable by those exact names with those exact signatures. Renaming any silently breaks every adapter that calls them: the import resolves the module fine, the ``AttributeError`` only surfaces when the adapter actually hits its first SDK error — long after the rename merges. Same drift class as #2378 / #2380 / #2381 (BaseAdapter, skill_loader) applied to the module-level function surface. Changes: - tests/_signature_snapshot.py gains build_module_functions_record. Walks a module's public top-level functions, optionally filtered to a specific name list (used here — runtime_wedge has internal helpers like reset_for_test that intentionally aren't part of the contract). Skips re-exports via __module__ check so a `from foo import bar` doesn't pollute the snapshot. - tests/test_runtime_wedge_signature.py snapshots the four contract functions. Plus a defense-in-depth required-functions test that catches removal even when source + snapshot are updated together. Verified: deliberately renaming `mark_wedged` → `mark_wedged_RENAMED` trips the gate with full snapshot diff in the failure message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 07:01:10 -07:00
Hongming Wang	284511f02e	feat(external): default external runtime to poll-mode + awaiting_agent Paired molecule-core change for the molecule-cli `molecule connect` RFC (https://github.com/Molecule-AI/molecule-cli/issues/10). After this PR an `external`-runtime workspace's full lifecycle matches the operator-driven model: it boots in awaiting_agent, the CLI connects in poll mode without operator-side flag tuning, the heartbeat-loss path lands back on awaiting_agent (re-registrable) instead of the terminal-feeling 'offline'. Two changes in workspace-server: 1) `resolveDeliveryMode` (registry.go) now reads `runtime` alongside `delivery_mode`. Resolution order: a. payload.delivery_mode if non-empty (operator override) b. row's existing delivery_mode if non-empty (preserves prior registration) c. NEW: "poll" if row.runtime = "external" — external operators run on laptops without public HTTPS; push-mode would hard-fail at validateAgentURL anyway. (`molecule connect` registers without --mode and expects this default.) d. "push" otherwise (historical default for platform-managed runtimes — langgraph, hermes, claude-code, etc.) 2) Heartbeat-loss for external workspaces lands them in `awaiting_agent` instead of `offline`. Two code paths: - `liveness.go` — Redis TTL expiration. Uses a CASE expression so the conditional is one UPDATE (no extra round-trip for non-external runtimes, no TOCTOU between runtime read and status write). - `healthsweep.go::sweepStaleRemoteWorkspaces` — DB-side last_heartbeat_at age scan. This sweep is already external- only by query filter, so the UPDATE just hard-codes the new status. The Docker-side `sweepOnlineWorkspaces` keeps `offline` — recovery there is "restart the container", not "re-register from the operator's box". Why awaiting_agent over offline for external: - Matches the status the workspace was created in (workspace.go:333). - The CLI re-registers on every invocation; awaiting_agent → online is the natural transition. 
offline is a terminal-feeling status that implies operator intervention is needed. - An operator who closed their laptop overnight should see awaiting_agent in canvas, not 'offline (something is wrong)'. Test plan: - Existing: 9 `resolveDeliveryMode` test sites updated to the new query shape. Sqlmock now reads `delivery_mode, runtime` columns. - New: TestRegister_ExternalRuntime_DefaultsToPoll asserts the external→poll branch. TestRegister_NonExternalRuntime_StillDefaultsToPush guards against the new branch overshooting (langgraph keeps push). - Liveness: regex updated to match the CASE expression. - Healthsweep: `TestSweepStaleRemoteWorkspaces_MarksStaleAwaitingAgent` (renamed for grep-ability), Docker-side sweepOnlineWorkspaces test unchanged (verified to still match `'offline'`). - Full handlers + registry suite green under -race (12.873s + 2.264s). No migration needed — `status` is a free-form text column; both 'offline' and 'awaiting_agent' are existing values used elsewhere (workspace.go uses awaiting_agent on initial external creation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 06:39:57 -07:00
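The four-step resolution order for `resolveDeliveryMode` listed in the commit above reduces to a short precedence chain. The sketch below assumes a pure function over three strings; the real function in registry.go reads `delivery_mode` and `runtime` from the workspace row, so this signature is hypothetical.

```go
package main

import "fmt"

// resolveDeliveryMode applies the precedence from the commit message:
// payload override, then the row's stored mode, then "poll" for the
// external runtime, then the historical "push" default.
// The three-string signature is an assumption made for illustration.
func resolveDeliveryMode(payloadMode, rowMode, runtime string) string {
	switch {
	case payloadMode != "":
		return payloadMode // (a) operator override
	case rowMode != "":
		return rowMode // (b) preserves a prior registration
	case runtime == "external":
		return "poll" // (c) new default for external runtimes
	default:
		return "push" // (d) platform-managed runtimes
	}
}

func main() {
	fmt.Println(resolveDeliveryMode("", "", "external"))     // poll
	fmt.Println(resolveDeliveryMode("", "", "langgraph"))    // push
	fmt.Println(resolveDeliveryMode("", "push", "external")) // push
	fmt.Println(resolveDeliveryMode("poll", "push", "hermes")) // poll
}
```

The second and third calls mirror the two new tests named in the test plan: external defaults to poll, and a non-external runtime keeps push.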
Hongming Wang	4d1156cb8b	Merge pull request #2379 from Molecule-AI/auto/fix-migration-collision-fetch-depth fix(ci): drop --depth=1 from migration collision check fetch	2026-04-30 13:31:53 +00:00
Hongming Wang	ae0db09857	Merge pull request #2381 from Molecule-AI/auto/skill-loader-signature-snapshot test: extract shared signature-snapshot helpers + skill_loader drift gate	2026-04-30 13:29:45 +00:00
Hongming Wang	e336688278	test: extract shared signature-snapshot helpers + skill_loader gate Two changes in one PR (tightly coupled — the second wouldn't make sense without the first): 1. Hoist the inspect-based snapshot helpers out of test_adapter_base_signature.py into tests/_signature_snapshot.py so future surfaces don't copy-paste introspection logic. - build_class_signature_record(cls): walks public methods, unwraps static/class/abstract methods, returns a stable {class, methods: [...]} dict. - build_dataclass_record(cls): walks dataclass fields via dataclasses.fields(), returns {name, frozen, fields: [...]}. - compare_against_snapshot(actual, path): writes-on-first-run + diff-on-drift, with both expected and actual JSON in failure message. test_adapter_base_signature.py is rewritten to use the helpers; the existing snapshot file is byte-identical (no behavior change). 2. New gate: tests/test_skill_loader_signature.py covers the public dataclasses exported from skill_loader/loader.py: - SkillMetadata: every adapter pattern-matches on .runtime for skill-compat filtering. Renaming this field would silently break per-adapter skill loading — the loader still returns objects, but adapters' `if "*" in skill.metadata.runtime` raises AttributeError at workspace boot. - LoadedSkill: returned in SetupResult.loaded_skills. Includes test_snapshot_has_required_skill_metadata_fields defense-in-depth: ensures the runtime / id / name / description fields stay even if both source and snapshot are updated together. Verified: deliberately renaming SkillMetadata.runtime trips the gate with full snapshot diff in the failure message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 06:27:20 -07:00
Hongming Wang	7a6ccde7f2	Merge pull request #2380 from Molecule-AI/auto/adapter-snapshot-include-dataclasses test(adapter_base): extend signature snapshot to public dataclasses	2026-04-30 12:55:27 +00:00
Hongming Wang	12e39c7311	test(adapter_base): extend signature snapshot to public dataclasses (#2364 item 2 followup) Follows up #2378. The BaseAdapter snapshot covers method signatures but `adapter_base.py` also exports three public dataclasses that form the call/return contract between the platform and every adapter: - SetupResult — returned by adapter._common_setup() - AdapterConfig — passed into adapter setup hooks - RuntimeCapabilities — returned by adapter.capabilities(); drives platform-side dispatch routing (#117) Renaming a RuntimeCapabilities flag silently disables every adapter's capability declaration (the platform fallback runs) without an AttributeError to surface the breakage. That's exactly the drift class the snapshot pattern is meant to catch. Changes: - _build_dataclass_snapshot walks SetupResult, AdapterConfig, RuntimeCapabilities via dataclasses.fields(), capturing field name + type annotation + has_default per field, plus the @dataclass(frozen=...) flag. - _build_full_snapshot composes method + dataclass records into one stable JSON snapshot. - test_snapshot_has_required_dataclass_fields — defense-in-depth test parallel to test_snapshot_has_required_methods. Catches field removal even when both source AND snapshot are updated together. Required field set is intentionally short (the flags that drive platform dispatch + the adapter-level config knobs). Verified: deliberately renaming `provides_native_heartbeat` → `provides_native_heartbeat_RENAMED` trips test_base_adapter_signature_matches_snapshot with a full diff in the failure message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 05:53:10 -07:00
Hongming Wang	a9391c5900	fix(ci): drop --depth=1 from migration collision check fetch The check has been blocking the staging→main auto-promote PR (#2361) since 2026-04-30T07:17Z with: fatal: origin/main...<head>: no merge base Root cause: the workflow does `git fetch origin <base> --depth=1` which overwrites checkout@v4's full-history clone with a shallow tip — destroying the ancestry the subsequent `git diff origin/main...HEAD` (three-dot, merge-base form) needs. This deadlocks every staging→main promote PR until manually fixed. The auto-promote runs were succeeding at the gate-check phase but the subsequent PR-merge step waited 30 min for the failing check and timed out, skipping the publish + redeploy dispatch tail. Fleet recovery for any production-only fix went through staging fine but never reached main. Fix: drop --depth=1 so the explicit fetch preserves full history. The leading comment is updated to call out this trap so a future maintainer doesn't re-add the flag thinking it's a perf win. No test added: this is a workflow-config one-liner that the existing PR check itself exercises end-to-end (the real signal is PR #2361 going green after this lands). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 05:28:03 -07:00
Hongming Wang	a7ddfbc3b5	Merge pull request #2378 from Molecule-AI/auto/base-adapter-signature-snapshot test(adapter_base): signature snapshot — drift gate for adapter public surface (#2364 item 2)	2026-04-30 12:20:59 +00:00
Hongming Wang	8488a188c2	test(adapter_base): signature snapshot — drift gate for adapter public surface (#2364 item 2) Every workspace template (langgraph, claude-code, hermes, etc.) subclasses BaseAdapter. Renaming, removing, or re-typing a method on the base class silently breaks templates: the override stops being recognized as an override; the old method-name's caller silently invokes the default no-op; the new method-name is unimplemented in templates that haven't migrated. Recent #87 universal-runtime + #1957 recordResource refactor both renamed/added methods. Without a frozen snapshot, the next rename ships quietly and surfaces only when a template's CI catches the AttributeError days later — long after the merge window for an easy revert. This snapshot pins BaseAdapter's public method surface against a checked-in JSON file. Same-shape pattern as PR #2363's A2A protocol-compat replay gate, applied to a Python public-API surface instead of JSON message shapes. Both close drift classes by snapshotting the structural surface that consumers depend on. Two tests: 1. test_base_adapter_signature_matches_snapshot — full introspection diff against tests/snapshots/adapter_base_signature.json. Drift = test failure with both expected + actual JSON in the message so the reviewer sees what changed. 2. test_snapshot_has_required_methods — defense-in-depth: even if both the source AND snapshot are updated together (intentional API removal), this catches removal of the short list of methods that EVERY template depends on (name, display_name, description, capabilities, memory_filename). Removing one of these requires explicit edit to the `required` set with a justification. Verified the gate fires red on a deliberate rename (memory_filename → memory_filename_RENAMED) — failure message shows the full snapshot diff including parameter shapes and return annotations. Updating the snapshot is the explicit acknowledgment that a template-affecting API change is intentional. 
Reviewer of the introducing PR sees the snapshot diff and decides whether template repos need coordinated updates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 05:18:39 -07:00
Hongming Wang	97058d5392	Merge pull request #2377 from Molecule-AI/auto/lazy-heal-helper-direct-tests test(provision): direct unit tests for readOrLazyHealInboundSecret	2026-04-30 11:44:22 +00:00
Hongming Wang	233a912cbe	test(provision): direct unit tests for readOrLazyHealInboundSecret The helper landed in #2376 and is exercised via chat_files + registry integration tests. Those tests conflate the helper's behavior with the caller's response shape — a future refactor that broke the (secret, healed, err) contract subtly (e.g. returning healed=true on a read-success path, or swallowing a mint error) might still pass them. Adds 4 direct sub-tests pinning each branch of the contract: - secret already present → (s, false, nil) - secret missing, mint succeeds → (minted, true, nil) - secret missing, mint fails → ("", false, err) - read fails (non-NoInboundSecret) → ("", false, err) Each sub-case asserts the return tuple shape AND mock.ExpectationsWereMet (for the success path) so a future helper change that skips a DB op trips the gate immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 04:41:13 -07:00
Hongming Wang	677b4858ab	Merge pull request #2376 from Molecule-AI/auto/lazy-heal-secret-helper refactor: extract readOrLazyHealInboundSecret to dedup chat_files + registry	2026-04-30 11:14:49 +00:00
Hongming Wang	30a569c742	refactor: extract readOrLazyHealInboundSecret to dedup chat_files + registry The lazy-heal-on-miss pattern landed in two places this session: PR #2372 (chat_files.go::resolveWorkspaceForwardCreds — Upload + Download) and PR #2375 (registry.go::Register). Both implementations did the same thing: read → if ErrNoInboundSecret then mint inline → return outcome Different response-shape requirements but the same core mechanic. Three sites' worth of drift potential: any future heal-time condition we add (audit log, alert, secret rotation, observability) had to be applied to each site, with partial application silently re-opening the gap. Fix: extract readOrLazyHealInboundSecret in workspace_provision_shared.go returning (secret, healed, err). Each caller maps the outcome to its response shape: - chat_files: healed=true → 503 with retry hint; err != nil → 503 with RFC-#2312 reprovision hint - registry: healed=true|false + err==nil → include in response; err != nil → omit field (workspace can retry on next register) Net effect: - Single source of truth for the read+heal mechanic - Response-shape decisions stay in callers (they DO differ per feature) - Future heal-time conditions go in one place - Behavior preserved: existing TestRegister_NoInboundSecret_LazyHeals, TestRegister_NoInboundSecret_LazyHealMintFailureOmitsField, TestChatUpload_NoInboundSecret_LazyHeal, TestChatDownload_NoInboundSecret_LazyHeal all pass unchanged Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 04:11:43 -07:00
Hongming Wang	eef1969e30	Merge pull request #2375 from Molecule-AI/auto/registry-lazy-heal-inbound-secret fix(registry): lazy-heal platform_inbound_secret on register for legacy workspaces	2026-04-30 10:47:49 +00:00
Hongming Wang	f3f5c4537b	fix(registry): lazy-heal platform_inbound_secret on register for legacy workspaces Pre-fix: a legacy SaaS workspace with NULL platform_inbound_secret needed two round-trips before chat upload worked: 1. Workspace registers → response missing platform_inbound_secret 2. User attempts chat upload → chat_files lazy-heals platform-side (RFC #2312 backfill) → 503 + retry-after 3. Workspace heartbeats → register response now includes the freshly-minted secret → workspace writes /configs/.platform_inbound_secret 4. User retries chat upload → workspace bearer matches → 200 The platform-side lazy-heal in chat_files.go (#2366) closes the existing-workspace gap, but the user-visible round-trip dance is still ugly. Fix: lazy-heal at register time too. When ReadPlatformInboundSecret returns ErrNoInboundSecret, mint inline and include the freshly- minted secret in the register response. Collapses the dance to a single round-trip: 1. Workspace registers → response includes lazy-healed secret 2. User attempts chat upload → workspace bearer matches → 200 Failure model: best-effort. Mint failure logs and falls through to omitting the field (workspace will retry on next register call). The 200 response status is preserved — register success doesn't hinge on the inbound-secret heal. Tests: - TestRegister_NoInboundSecret_LazyHeals: pins the success branch. Mocks the UPDATE explicitly + asserts ExpectationsWereMet, so a regression that skipped the mint would fail loudly. Replaces the prior TestRegister_NoInboundSecret_OmitsField which "passed" on this branch only because sqlmock-unmatched-UPDATE coincidentally drove the omit-field error path. - TestRegister_NoInboundSecret_LazyHealMintFailureOmitsField: pins the failure branch — explicit UPDATE error → 200 + field absent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:44:50 -07:00
Hongming Wang	343e164f5f	Merge pull request #2374 from Molecule-AI/auto/wsauth-token-lookup-helper refactor(wsauth): extract lookupTokenByHash to dedup auth predicate across 3 callers	2026-04-30 10:14:40 +00:00
Hongming Wang	64822dac49	refactor(wsauth): extract lookupTokenByHash to dedup auth predicate across 3 callers ValidateToken, WorkspaceFromToken, and ValidateAnyToken each duplicated the same JOIN+WHERE auth predicate: FROM workspace_auth_tokens t JOIN workspaces w ON w.id = t.workspace_id WHERE t.token_hash = $1 AND t.revoked_at IS NULL AND w.status != 'removed' Same drift class as the SaaS provision-mint bug fixed in #2366. A future safety addition (e.g. exclude paused workspaces from auth) had to be applied to all three queries; a partial application would silently re-open one auth path while closing the others. Fix: hoist the predicate into lookupTokenByHash, which projects (id, workspace_id) — the union of fields any caller needs. Each public function picks what it uses: - ValidateToken — needs both (compares workspaceID, updates last_used_at by id) - WorkspaceFromToken — needs workspace_id - ValidateAnyToken — needs id The trivial perf cost of selecting one extra column per call is worth the single-source-of-truth guarantee for the auth predicate. Test mock updates: two upstream test files (a2a_proxy_test, middleware wsauth_middleware_test{,_canvasorbearer_test}) had hand-typed regex matchers and row shapes pinned to the per-function SELECT projection. Updated to the unified shape; behavior is unchanged. All wsauth + middleware + handlers + full-module tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 03:11:38 -07:00
Hongming Wang	17760e10d2	Merge pull request #2373 from Molecule-AI/auto/admin-test-token-mock-coverage test(admin_test_token): pin ADMIN_TOKEN IDOR-fix (#112) gate behavior	2026-04-30 10:02:13 +00:00
Hongming Wang	e403d74a3d	test(admin_test_token): pin ADMIN_TOKEN IDOR-fix (#112) gate behavior The admin test-token endpoint has a critical security check at admin_test_token.go:64-72 — the IDOR fix from #112 that requires an explicit ADMIN_TOKEN bearer when the env var is set. Pre-fix, the route accepted ANY bearer that matched a live org token, allowing cross-org test-token minting (and therefore cross-org workspace authentication). The current code uses subtle.ConstantTimeCompare against ADMIN_TOKEN. Test coverage was zero. The existing tests exercised the ADMIN_TOKEN-unset path (local dev / CI) but never set ADMIN_TOKEN. A regression that: - removed the os.Getenv("ADMIN_TOKEN") check - inverted the comparison - replaced ConstantTimeCompare with bytes.Equal (timing leak) - re-introduced the AdminAuth fallback that allows org tokens would not fail any test, and the breakage would re-open the IDOR that #112 closed. Adds four tests covering the gate matrix: - ADMIN_TOKEN set + no Authorization header → 401 - ADMIN_TOKEN set + wrong Authorization → 401 - ADMIN_TOKEN set + correct Authorization → 200 - ADMIN_TOKEN unset + no Authorization → 200 (gate bypassed safely) The 4-row matrix pins the gate's full truth table: any regression in either dimension (gate enabled/disabled, header correct/wrong) trips exactly one test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:59:08 -07:00
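The four-row gate matrix from e403d74a3d can be illustrated with a small sketch. `adminGateStatus` is a hypothetical helper, not the code at admin_test_token.go; the "Bearer " header format is an assumption, and only the use of `subtle.ConstantTimeCompare` and the expected 401/200 outcomes come from the commit message.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"strings"
)

// adminGateStatus returns the HTTP status the gate would produce for a
// given ADMIN_TOKEN value and Authorization header (hypothetical helper).
func adminGateStatus(adminToken, authHeader string) int {
	if adminToken == "" {
		// Gate disabled (local dev / CI): request passes through.
		return http.StatusOK
	}
	const prefix = "Bearer " // assumed header scheme
	if !strings.HasPrefix(authHeader, prefix) {
		return http.StatusUnauthorized
	}
	bearer := strings.TrimPrefix(authHeader, prefix)
	// Constant-time comparison avoids leaking match length via timing.
	if subtle.ConstantTimeCompare([]byte(bearer), []byte(adminToken)) != 1 {
		return http.StatusUnauthorized
	}
	return http.StatusOK
}

func main() {
	rows := []struct{ token, header string }{
		{"s3cret", ""},
		{"s3cret", "Bearer wrong"},
		{"s3cret", "Bearer s3cret"},
		{"", ""},
	}
	for _, r := range rows {
		fmt.Printf("token=%q header=%q -> %d\n",
			r.token, r.header, adminGateStatus(r.token, r.header))
	}
}
```

Each row of the loop corresponds to one of the four tests listed in the commit, so a regression in either dimension changes exactly one printed status.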
Hongming Wang	264e726672	Merge pull request #2372 from Molecule-AI/auto/chat-files-resolve-creds-helper refactor(chat_files): extract resolveWorkspaceForwardCreds shared by Upload+Download	2026-04-30 09:54:56 +00:00
Hongming Wang	501a42d753	refactor(chat_files): extract resolveWorkspaceForwardCreds shared by Upload+Download The 50-line "resolve URL + read inbound secret + lazy-heal on miss" block was duplicated nearly verbatim between Upload and Download handlers. Drift-prone — same class of risk as the original SaaS provision drift fixed in #2366. A future change like: - secret rotation (re-mint when the row's older than X) - per-feature audit logging - additional fail-closed conditions would have to be applied to both handlers, and a partial application that healed Upload but skipped Download would surface only at runtime. Fix: hoist the shared logic into resolveWorkspaceForwardCreds. The function takes an op label ("upload"/"download") used in log messages + the 503 RFC-#2312 detail copy so operators can still distinguish which feature ran. Both handlers reduce to: wsURL, secret, ok := resolveWorkspaceForwardCreds(c, ctx, workspaceID, "upload") if !ok { return } Net -20 lines (helper amortizes the 50-line block across both call sites). Existing test coverage (TestChatUpload_NoInboundSecret_, TestChatDownload_NoInboundSecret_ from PR #2370) covers all four branches of the shared helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:51:53 -07:00
Hongming Wang	29368dd749	Merge pull request #2371 from Molecule-AI/auto/team-expand-mint-fix test(provision): pin PARENT_ID env injection contract in prepareProvisionContext	2026-04-30 09:45:19 +00:00
Hongming Wang	4ba12668f0	test(provision): pin PARENT_ID env injection contract in prepareProvisionContext #2367 moved PARENT_ID env injection from inline TeamHandler.Expand into the shared prepareProvisionContext (sourced from payload.ParentID). The test was missing — a regression that: - dropped the injection - inverted the nil-check - leaked an empty PARENT_ID="" into env would not fail any existing test, but workspace/coordinator.py reads PARENT_ID on startup to track parent-child relationship, so the breakage would surface only at runtime. Adds TestPrepareProvisionContext_ParentIDInjection with three sub-cases: - nil ParentID → no PARENT_ID env - empty-string ParentID → no PARENT_ID env (don't pollute) - set ParentID → PARENT_ID env equals value Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:41:41 -07:00
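The three sub-cases pinned by TestPrepareProvisionContext_ParentIDInjection in 4ba12668f0 amount to a nil-and-empty check before writing the env var. A minimal sketch, assuming the env is a plain map and using a hypothetical `injectParentID` helper rather than the real prepareProvisionContext:

```go
package main

import "fmt"

// injectParentID sets PARENT_ID only when a non-empty parent ID is
// present. The function name and map-based env are illustrative
// assumptions; the three outcomes follow the commit message.
func injectParentID(env map[string]string, parentID *string) {
	if parentID == nil || *parentID == "" {
		return // nil or empty: do not pollute the env
	}
	env["PARENT_ID"] = *parentID
}

func main() {
	empty, set := "", "ws-parent"
	for _, p := range []*string{nil, &empty, &set} {
		env := map[string]string{}
		injectParentID(env, p)
		v, ok := env["PARENT_ID"]
		fmt.Printf("present=%v value=%q\n", ok, v)
	}
}
```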
Hongming Wang	ab6bcc030c	Merge pull request #2370 from Molecule-AI/auto/lazy-heal-test-coverage test(chat_files): pin lazy-heal mint contract for both Upload and Download	2026-04-30 09:41:40 +00:00
Hongming Wang	6c065a02e6	test(chat_files): pin lazy-heal mint contract for both Upload and Download The 2026-04-30 lazy-heal fix in chat_files.go (PR #2366) ATTEMPTS to mint platform_inbound_secret on miss so legacy workspaces self-heal without requiring destructive reprovision. The pre-existing TestChatUpload_NoInboundSecret + TestChatDownload_NoInboundSecret tests asserted the 503 response shape but did NOT pin that the mint UPDATE actually fires — they happened to exercise the mint-failure branch (sqlmock unmatched UPDATE = error = "Failed to mint" code path returns 503 with "RFC #2312" detail, which still passed the original assertions). This means a regression that: - skipped the lazy-heal mint entirely - inverted the success/failure response branches - moved the mint to a different code path would not fail those tests. Fix: - TestChatUpload_NoInboundSecret_LazyHeal: mock the UPDATE successfully; assert sqlmock.ExpectationsWereMet (mint MUST run) + body contains "retry" + "30" (success branch). - TestChatUpload_NoInboundSecret_LazyHealFailure: mock the UPDATE to fail; assert body contains "Reprovision" (failure branch). - Same pair for the Download handler — independent code path means independent test. Pins both branches of both handlers (4 tests) so future drift trips the gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:38:28 -07:00
Hongming Wang	d9801d1e62	Merge pull request #2368 from Molecule-AI/auto/team-expand-mint-fix fix(team): delegate Expand child-provision to shared mint pipeline (#2367)	2026-04-30 09:31:34 +00:00
Hongming Wang	bb52a1a365	fix(team): delegate Expand child-provisioning to shared mint pipeline (#2367) Closes #2367. TeamHandler.Expand provisioned child workspaces by directly calling h.provisioner.Start, skipping mintWorkspaceSecrets and every other preflight (secrets load, env mutators, identity injection, missing-env, empty-config-volume auto-recover). Children shipped with NULL platform_inbound_secret + never-issued auth_token — same drift class as the SaaS bug just fixed in PR #2366, found while exercising a stronger gate against this package. Fix: - TeamHandler now holds WorkspaceHandler. Expand delegates each child provision to wh.provisionWorkspace, picking up the shared prepare/mint/preflight pipeline automatically. Future provision-time steps go in ONE place and team-expand inherits them. - prepareProvisionContext gains PARENT_ID env injection sourced from payload.ParentID (which Expand now populates). This preserves the signal workspace/coordinator.py reads on startup, without threading env through provisioner.WorkspaceConfig manually. - NewTeamHandler signature gains WorkspaceHandler; router passes it. Gate upgrade: - TestProvisionFunctions_AllCallMintWorkspaceSecrets is now behavior-based: it walks every FuncDecl in the package and flags any function that calls h.provisioner.Start or h.cpProv.Start without also calling mintWorkspaceSecrets. Drift-resistant by construction — a future provision function with any name still trips the gate. - Replaces the name-list version from PR #2366. The name list missed Expand precisely because Expand wasn't named provision*; the behavior-based detector caught it spontaneously when prototyped. Tests: full workspace-server module green; gate previously verified to fire red on Expand pre-fix and on deliberate mintWorkspaceSecrets removal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:28:29 -07:00
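The behavior-based gate described in bb52a1a365 (walk every FuncDecl, flag any function that calls a provisioner `Start` without also calling `mintWorkspaceSecrets`) might look roughly like the sketch below. It uses the standard `go/ast` and `go/parser` packages; `findUnmintedProvisioners` and the inline `sample` source are hypothetical, and the real test presumably scans the package's files rather than a string.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is a made-up source file with one compliant and one
// non-compliant function, used only to exercise the detector.
const sample = `package handlers

func (h *WorkspaceHandler) provisionWorkspace() {
	h.mintWorkspaceSecrets()
	h.provisioner.Start()
}

func (h *TeamHandler) Expand() {
	h.provisioner.Start()
}
`

// findUnmintedProvisioners returns the names of functions that call
// <x>.provisioner.Start or <x>.cpProv.Start without calling
// mintWorkspaceSecrets anywhere in the same body.
func findUnmintedProvisioners(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	var offenders []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		var starts, mints bool
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			switch fun := call.Fun.(type) {
			case *ast.Ident:
				mints = mints || fun.Name == "mintWorkspaceSecrets"
			case *ast.SelectorExpr:
				mints = mints || fun.Sel.Name == "mintWorkspaceSecrets"
				if inner, ok := fun.X.(*ast.SelectorExpr); ok && fun.Sel.Name == "Start" {
					name := inner.Sel.Name
					starts = starts || name == "provisioner" || name == "cpProv"
				}
			}
			return true
		})
		if starts && !mints {
			offenders = append(offenders, fn.Name.Name)
		}
	}
	return offenders, nil
}

func main() {
	offenders, err := findUnmintedProvisioners(sample)
	fmt.Println(offenders, err)
}
```

Because the check keys on what a function calls rather than what it is named, the sample's `Expand` is flagged even though it does not match a `provision*` naming pattern, which is the drift the commit says the earlier name-list gate missed.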
Hongming Wang	b9b0a46f2e	Merge pull request #2366 from Molecule-AI/auto/workspace-provision-shared fix(provision): share Docker+SaaS prepare path so both mint workspace secrets (RFC #2312)	2026-04-30 09:21:38 +00:00
Hongming Wang	3f8286ea47	fix(provision): share Docker+SaaS prepare path so both mint workspace secrets (RFC #2312) Root cause of 2026-04-30 silent-503 chat-upload bug: provisionWorkspaceCP (SaaS) skipped issueAndInjectInboundSecret while provisionWorkspaceOpts (Docker) called it. Every prod SaaS workspace provisioned with NULL platform_inbound_secret → upload returned 503 with the v2-enrollment message on every attempt. Structural fix: - Extract prepareProvisionContext (secrets load, env mutators, preflight, cfg build), mintWorkspaceSecrets (auth_token + platform_inbound_secret), markProvisionFailed (broadcast + DB update) into workspace_provision_shared.go - Refactor both provision modes to call the shared helpers - Add provisionAbort struct so the missing-env failure class can carry its structured "missing" payload through the shared abort path - Unify last_sample_error: previously the decrypt-fail path skipped it while others set it; users now see every failure class in the UI Drift prevention: - AST gate TestProvisionFunctions_AllCallMintWorkspaceSecrets asserts every function in the provisionFunctions set calls mintWorkspaceSecrets at least once (same shape as the audit-coverage gate from #335). New provision paths must either call mint or be added to provisionExemptFunctions with a one-line justification - Behavioral test TestMintWorkspaceSecrets_PersistsInboundSecretInSaaSMode pins the contract: SaaS mode MUST persist platform_inbound_secret to the DB column even though it skips file injection Existing-workspace recovery (chat_files.go lazy-heal): - Upload + Download handlers detect NULL platform_inbound_secret and call IssuePlatformInboundSecret inline, returning 503 with retry_after_seconds=30 - Self-heals workspaces that were provisioned before this fix without requiring destructive reprovision Tests: full handlers + workspace-server module green; AST gate verified to fire red on deliberate violation (commented-out mint call surfaces the exact function name + actionable remediation message). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 02:18:08 -07:00
Hongming Wang	49c3433a70	Merge pull request #2365 from Molecule-AI/auto/cf-dead-origin-status-gap fix(a2a): cover CF 521/522/523 in dead-origin status set	2026-04-30 08:51:28 +00:00
Hongming Wang	e06ebaefdf	Merge pull request #2346 from Molecule-AI/auto/issue-2341-migration-collision ci: hard gate against migration version collisions (#2341)	2026-04-30 08:50:19 +00:00
Hongming Wang	b5df2126b9	fix(test): convert migration-collision tests from pytest to unittest (#2341) CI failure: the Ops scripts (unittest) job runs `python -m unittest discover` which doesn't have pytest installed. test_check_migration_collisions.py imported pytest unconditionally, failing module import: ImportError: Failed to import test module: test_check_migration_collisions Traceback (most recent call last): File ".../test_check_migration_collisions.py", line 12, in <module> import pytest ModuleNotFoundError: No module named 'pytest' The tests use no pytest-specific features (just bare assert + plain class). Sibling test_sweep_cf_decide.py in the same dir already uses unittest.TestCase. Convert this one to match: drop the pytest import, make TestMigrationFileRe inherit from unittest.TestCase. unittest.TestLoader.discover() requires TestCase subclasses for auto-discovery, so the fix is two lines (drop import, add base). Bare assert statements work fine inside TestCase methods. Verified: `python3 -m unittest scripts.ops.test_check_migration_collisions -v` runs all 9 tests, all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 01:47:27 -07:00
Hongming Wang	8e508a7a2f	fix(a2a): cover CF 521/522/523 in dead-origin status set Independent review on PR #2362 caught: the dead-agent classifier at a2a_proxy.go included 502/503/504/524 but missed the rest of the CF origin-failure family (521/522/523), which are MORE indicative of a dead EC2 than 524: - 521 "Web server is down" — CF can't open TCP to origin (most direct dead-EC2 signal; fires when the workspace EC2 has been terminated and CF still has the CNAME pointing at it). - 522 "Connection timed out" — TCP didn't complete in ~15s (typical of SG/NACL flap or agent process hung on accept). - 523 "Origin is unreachable" — CF can't route to origin (DNS gone, network path broken). Pre-fix any of these would propagate as-is to the canvas and the user would see a 5xx without the reactive auto-restart firing — exactly the SaaS-blind class of failure PR #2362 was meant to close. Refactor: extracted isUpstreamDeadStatus(int) helper so the matrix is in one place, with TestIsUpstreamDeadStatus locking in 18 status codes (7 dead, 11 not-dead including 520 and 525 which look CF-shaped but indicate different failures). Also tightened TestStopForRestart_NoProvisioner_NoOp per the same review: now uses sqlmock.ExpectationsWereMet to assert the dispatcher doesn't touch the DB on the both-nil path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 01:39:04 -07:00
Hongming Wang	2742c3c837	Merge pull request #2363 from Molecule-AI/auto/a2a-corpus-replay-gate test(a2a): protocol-shape replay corpus gate (#2345 follow-up)	2026-04-30 08:29:20 +00:00
Hongming Wang	747c12e582	test(a2a): protocol-shape replay corpus gate (#2345 follow-up) Backward-compat replay gate for the A2A JSON-RPC protocol surface. Every PR that touches normalizeA2APayload OR bumps the a-2-a-sdk version pin runs every shape in testdata/a2a_corpus/ through the current code and asserts: valid/ — every shape MUST parse without error and produce a canonical v0.3 payload (params.message.parts list). invalid/ — every shape MUST be rejected with the documented status code and error substring. What this prevents The 2026-04-29 v0.2 → v0.3 silent-drop bug (PR #2349) shipped because the SDK bump PR didn't replay v0.2-shaped inputs against the new code; the shape-mismatch surfaced only in production when the receiver's Pydantic validator silently rejected inbound messages. This gate would have caught it pre-merge. Hand-verified: reverting the v0.2 string→parts shim in normalizeA2APayload fails 3 of the v0.2 corpus entries with the exact rejection class the production bug exhibited. Corpus contents (11 entries) valid/ (10): v0_2_string_content — basic v0.2 (the broken case) v0_2_string_content_no_message_id — v0.2 + auto-fill messageId v0_2_list_content — v0.2 with content as Part list v0_3_parts_text_only — canonical v0.3 v0_3_parts_multi_text — multi-Part list v0_3_parts_with_file — multimodal (text + file) v0_3_parts_with_context — contextId for multi-turn v0_3_streaming_method — message/stream variant v0_3_unicode_text — emoji + multi-script v0_3_long_text — 10KB text Part no_jsonrpc_envelope — bare params/method without outer envelope (legacy senders) invalid/ (3): no_content_or_parts — message has neither field content_is_integer — wrong type for v0.2 content content_is_bool — wrong type, separate from int so the failure msg identifies which type-class regressed Plus 4 inline malformed-JSON cases (truncated, not-JSON, empty, whitespace) that can't be expressed as JSON corpus entries. Coverage tests The gate has 4 test functions: 1. TestA2ACorpus_ValidShapesParse — replay valid/ corpus, assert no error + canonical v0.3 output (parts list non-empty, messageId non-empty, content field deleted). 2. TestA2ACorpus_InvalidShapesRejected — replay invalid/ corpus, assert rejection matches recorded status + error substring. 3. TestA2ACorpus_MalformedJSONRejected — inline cases for non-parseable bodies. 4. TestA2ACorpus_HasMinimumCoverage — at least one v0.2 + one v0.3 entry exists (loses neither side of the bridge). 5. TestA2ACorpus_EveryEntryHasMetadata — _comment/_added/_source on every entry per the README policy; _expect_error and _expect_status on invalid entries. Documentation testdata/a2a_corpus/README.md describes the corpus contract: - When to add entries (new SDK shape, new production-observed shape). - When NOT to add (test scaffolding, hypothetical futures). - Removal policy (breaking change, deprecation window required). Verification - All 24 corpus subtests pass on current main. - Hand-test: revert the v0.2 compat shim → 3 v0.2 entries fail the gate with the exact rejection class the production bug exhibited. Confirmed. - Whole-module go test ./... green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 01:26:02 -07:00


3511 Commits