Commit Graph

3488 Commits

Author SHA1 Message Date
Hongming Wang
8488a188c2 test(adapter_base): signature snapshot — drift gate for adapter public surface (#2364 item 2)
Every workspace template (langgraph, claude-code, hermes, etc.)
subclasses BaseAdapter. Renaming, removing, or re-typing a method
on the base class silently breaks templates: the override stops
being recognized as an override; callers of the old method name
silently invoke the default no-op; and the new method name stays
unimplemented in templates that haven't migrated.

The recent #87 universal-runtime and #1957 recordResource refactors
both renamed/added methods. Without a frozen snapshot, the next rename
ships quietly and surfaces only when a template's CI catches the
AttributeError days later — long after the merge window for an
easy revert.

This snapshot pins BaseAdapter's public method surface against a
checked-in JSON file. Same-shape pattern as PR #2363's A2A
protocol-compat replay gate, applied to a Python public-API
surface instead of JSON message shapes. Both close drift classes
by snapshotting the structural surface that consumers depend on.

Two tests:

  1. test_base_adapter_signature_matches_snapshot — full
     introspection diff against tests/snapshots/adapter_base_signature.json.
     Drift = test failure with both expected + actual JSON in
     the message so the reviewer sees what changed.
  2. test_snapshot_has_required_methods — defense-in-depth: even
     if both the source AND snapshot are updated together
     (intentional API removal), this catches removal of the
     short list of methods that EVERY template depends on (name,
     display_name, description, capabilities, memory_filename).
     Removing one of these requires explicit edit to the
     `required` set with a justification.

Verified the gate fires red on a deliberate rename
(memory_filename → memory_filename_RENAMED) — failure message
shows the full snapshot diff including parameter shapes and
return annotations.
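The gate itself is Python introspection against tests/snapshots/adapter_base_signature.json; the same drift-gate pattern is sketched below in Go with reflect, purely for illustration (the type and its three methods are stand-ins, not the real adapter surface):

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// BaseAdapter stands in for the template-facing base class; this
// method set is illustrative, not the real adapter surface.
type BaseAdapter struct{}

func (BaseAdapter) Name() string           { return "" }
func (BaseAdapter) DisplayName() string    { return "" }
func (BaseAdapter) MemoryFilename() string { return "" }

// signatureSnapshot maps each exported method to its type string,
// e.g. "func(main.BaseAdapter) string". Serializing this to JSON and
// checking it in means any rename or re-type changes the snapshot.
func signatureSnapshot(v interface{}) map[string]string {
	t := reflect.TypeOf(v)
	snap := make(map[string]string)
	for i := 0; i < t.NumMethod(); i++ {
		m := t.Method(i)
		snap[m.Name] = m.Type.String()
	}
	return snap
}

func main() {
	b, _ := json.MarshalIndent(signatureSnapshot(BaseAdapter{}), "", "  ")
	fmt.Println(string(b))
}
```

Diffing the serialized map against the checked-in copy in a test reproduces the "drift = failure with expected + actual JSON" behavior described above.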

Updating the snapshot is the explicit acknowledgment that a
template-affecting API change is intentional. Reviewer of the
introducing PR sees the snapshot diff and decides whether
template repos need coordinated updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 05:18:39 -07:00
Hongming Wang
97058d5392
Merge pull request #2377 from Molecule-AI/auto/lazy-heal-helper-direct-tests
test(provision): direct unit tests for readOrLazyHealInboundSecret
2026-04-30 11:44:22 +00:00
Hongming Wang
233a912cbe test(provision): direct unit tests for readOrLazyHealInboundSecret
The helper landed in #2376 and is exercised via chat_files + registry
integration tests. Those tests conflate the helper's behavior with the
caller's response shape — a future refactor that broke the (secret,
healed, err) contract subtly (e.g. returning healed=true on a
read-success path, or swallowing a mint error) might still pass them.

Adds 4 direct sub-tests pinning each branch of the contract:

  - secret already present → (s, false, nil)
  - secret missing, mint succeeds → (minted, true, nil)
  - secret missing, mint fails → ("", false, err)
  - read fails (non-NoInboundSecret) → ("", false, err)

Each sub-case asserts the return tuple shape AND mock.ExpectationsWereMet
(for the success path) so a future helper change that skips a DB op
trips the gate immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 04:41:13 -07:00
Hongming Wang
677b4858ab
Merge pull request #2376 from Molecule-AI/auto/lazy-heal-secret-helper
refactor: extract readOrLazyHealInboundSecret to dedup chat_files + registry
2026-04-30 11:14:49 +00:00
Hongming Wang
30a569c742 refactor: extract readOrLazyHealInboundSecret to dedup chat_files + registry
The lazy-heal-on-miss pattern landed in two places this session:
PR #2372 (chat_files.go::resolveWorkspaceForwardCreds — Upload + Download)
and PR #2375 (registry.go::Register). Both implementations did the same
thing:

  read → if ErrNoInboundSecret then mint inline → return outcome

Different response-shape requirements but the same core mechanic. Three
sites' worth of drift potential: any future heal-time condition we add
(audit log, alert, secret rotation, observability) had to be applied to
each site, with partial application silently re-opening the gap.

Fix: extract readOrLazyHealInboundSecret in workspace_provision_shared.go
returning (secret, healed, err). Each caller maps the outcome to its
response shape:

  - chat_files: healed=true → 503 with retry hint; err != nil → 503 with
    RFC-#2312 reprovision hint
  - registry:   healed=true|false + err==nil → include in response;
    err != nil → omit field (workspace can retry on next register)

Net effect:

  - Single source of truth for the read+heal mechanic
  - Response-shape decisions stay in callers (they DO differ per feature)
  - Future heal-time conditions go in one place
  - Behavior preserved: existing TestRegister_NoInboundSecret_LazyHeals,
    TestRegister_NoInboundSecret_LazyHealMintFailureOmitsField,
    TestChatUpload_NoInboundSecret_LazyHeal*,
    TestChatDownload_NoInboundSecret_LazyHeal* all pass unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 04:11:43 -07:00
Hongming Wang
eef1969e30
Merge pull request #2375 from Molecule-AI/auto/registry-lazy-heal-inbound-secret
fix(registry): lazy-heal platform_inbound_secret on register for legacy workspaces
2026-04-30 10:47:49 +00:00
Hongming Wang
f3f5c4537b fix(registry): lazy-heal platform_inbound_secret on register for legacy workspaces
Pre-fix: a legacy SaaS workspace with NULL platform_inbound_secret
needed two round-trips before chat upload worked:

  1. Workspace registers → response missing platform_inbound_secret
  2. User attempts chat upload → chat_files lazy-heals platform-side
     (RFC #2312 backfill) → 503 + retry-after
  3. Workspace heartbeats → register response now includes the
     freshly-minted secret → workspace writes /configs/.platform_inbound_secret
  4. User retries chat upload → workspace bearer matches → 200

The platform-side lazy-heal in chat_files.go (#2366) closes the
existing-workspace gap, but the user-visible round-trip dance is
still ugly.

Fix: lazy-heal at register time too. When ReadPlatformInboundSecret
returns ErrNoInboundSecret, mint inline and include the freshly-
minted secret in the register response. Collapses the dance to a
single round-trip:

  1. Workspace registers → response includes lazy-healed secret
  2. User attempts chat upload → workspace bearer matches → 200

Failure model: best-effort. Mint failure logs and falls through to
omitting the field (workspace will retry on next register call).
The 200 response status is preserved — register success doesn't
hinge on the inbound-secret heal.

Tests:

  - TestRegister_NoInboundSecret_LazyHeals: pins the success branch.
    Mocks the UPDATE explicitly + asserts ExpectationsWereMet, so a
    regression that skipped the mint would fail loudly. Replaces
    the prior TestRegister_NoInboundSecret_OmitsField which
    "passed" on this branch only because sqlmock-unmatched-UPDATE
    coincidentally drove the omit-field error path.
  - TestRegister_NoInboundSecret_LazyHealMintFailureOmitsField:
    pins the failure branch — explicit UPDATE error → 200 + field
    absent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 03:44:50 -07:00
Hongming Wang
343e164f5f
Merge pull request #2374 from Molecule-AI/auto/wsauth-token-lookup-helper
refactor(wsauth): extract lookupTokenByHash to dedup auth predicate across 3 callers
2026-04-30 10:14:40 +00:00
Hongming Wang
64822dac49 refactor(wsauth): extract lookupTokenByHash to dedup auth predicate across 3 callers
ValidateToken, WorkspaceFromToken, and ValidateAnyToken each duplicated
the same JOIN+WHERE auth predicate:

    FROM workspace_auth_tokens t
    JOIN workspaces w ON w.id = t.workspace_id
    WHERE t.token_hash = $1
      AND t.revoked_at IS NULL
      AND w.status != 'removed'

Same drift class as the SaaS provision-mint bug fixed in #2366. A
future safety addition (e.g. exclude paused workspaces from auth) had
to be applied to all three queries; a partial application would
silently re-open one auth path while closing the others.

Fix: hoist the predicate into lookupTokenByHash, which projects
(id, workspace_id) — the union of fields any caller needs. Each
public function picks what it uses:

  - ValidateToken      — needs both (compares workspaceID, updates last_used_at by id)
  - WorkspaceFromToken — needs workspace_id
  - ValidateAnyToken   — needs id

The trivial perf cost of selecting one extra column per call is worth
the single-source-of-truth guarantee for the auth predicate.
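The hoisted-predicate shape as a sketch — the queryRower indirection stands in for *sql.DB so the example runs without a driver, and the names are illustrative:

```go
package main

import "fmt"

// tokenRow is the union projection any caller needs.
type tokenRow struct {
	ID          string
	WorkspaceID string
}

// lookupTokenQuery is the single source of truth for the auth
// predicate; a future condition (e.g. excluding paused workspaces)
// is added exactly once, here.
const lookupTokenQuery = `
SELECT t.id, t.workspace_id
FROM workspace_auth_tokens t
JOIN workspaces w ON w.id = t.workspace_id
WHERE t.token_hash = $1
  AND t.revoked_at IS NULL
  AND w.status != 'removed'`

// queryRower abstracts the DB handle for this sketch.
type queryRower interface {
	queryRow(query, hash string) (tokenRow, error)
}

func lookupTokenByHash(db queryRower, hash string) (tokenRow, error) {
	return db.queryRow(lookupTokenQuery, hash)
}

// Each public function picks the field(s) it uses from the union.
func workspaceFromToken(db queryRower, hash string) (string, error) {
	row, err := lookupTokenByHash(db, hash)
	return row.WorkspaceID, err
}

func validateAnyToken(db queryRower, hash string) (string, error) {
	row, err := lookupTokenByHash(db, hash)
	return row.ID, err
}

type fakeDB struct{ row tokenRow }

func (f fakeDB) queryRow(_, _ string) (tokenRow, error) { return f.row, nil }

func main() {
	ws, _ := workspaceFromToken(fakeDB{row: tokenRow{ID: "tok-1", WorkspaceID: "ws-1"}}, "hash")
	fmt.Println(ws)
}
```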

Test mock updates: two upstream test files (a2a_proxy_test, middleware
wsauth_middleware_test{,_canvasorbearer_test}) had hand-typed regex
matchers and row shapes pinned to the per-function SELECT projection.
Updated to the unified shape; behavior is unchanged.

All wsauth + middleware + handlers + full-module tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 03:11:38 -07:00
Hongming Wang
17760e10d2
Merge pull request #2373 from Molecule-AI/auto/admin-test-token-mock-coverage
test(admin_test_token): pin ADMIN_TOKEN IDOR-fix (#112) gate behavior
2026-04-30 10:02:13 +00:00
Hongming Wang
e403d74a3d test(admin_test_token): pin ADMIN_TOKEN IDOR-fix (#112) gate behavior
The admin test-token endpoint has a critical security check at
admin_test_token.go:64-72 — the IDOR fix from #112 that requires an
explicit ADMIN_TOKEN bearer when the env var is set. Pre-fix, the
route accepted ANY bearer that matched a live org token, allowing
cross-org test-token minting (and therefore cross-org workspace
authentication). The current code uses subtle.ConstantTimeCompare
against ADMIN_TOKEN.

Test coverage of the gate was zero: the existing tests exercised
the ADMIN_TOKEN-unset path (local dev / CI) but never set ADMIN_TOKEN.
A regression that:

  - removed the os.Getenv("ADMIN_TOKEN") check
  - inverted the comparison
  - replaced ConstantTimeCompare with bytes.Equal (timing leak)
  - re-introduced the AdminAuth fallback that allows org tokens

would not fail any test, and the breakage would re-open the IDOR
that #112 closed.

Adds four tests covering the gate matrix:

  - ADMIN_TOKEN set + no Authorization header → 401
  - ADMIN_TOKEN set + wrong Authorization → 401
  - ADMIN_TOKEN set + correct Authorization → 200
  - ADMIN_TOKEN unset + no Authorization → 200 (gate bypassed safely)

The 4-row matrix pins the gate's full truth table: any regression in
either dimension (gate enabled/disabled, header correct/wrong) trips
exactly one test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:59:08 -07:00
Hongming Wang
264e726672
Merge pull request #2372 from Molecule-AI/auto/chat-files-resolve-creds-helper
refactor(chat_files): extract resolveWorkspaceForwardCreds shared by Upload+Download
2026-04-30 09:54:56 +00:00
Hongming Wang
501a42d753 refactor(chat_files): extract resolveWorkspaceForwardCreds shared by Upload+Download
The 50-line "resolve URL + read inbound secret + lazy-heal on miss"
block was duplicated nearly verbatim between Upload and Download
handlers. Drift-prone — same class of risk as the original SaaS
provision drift fixed in #2366. A future change like:

  - secret rotation (re-mint when the row's older than X)
  - per-feature audit logging
  - additional fail-closed conditions

would have to be applied to both handlers, and a partial application
that healed Upload but skipped Download would surface only at runtime.

Fix: hoist the shared logic into resolveWorkspaceForwardCreds. The
function takes an op label ("upload"/"download") used in log messages
+ the 503 RFC-#2312 detail copy so operators can still distinguish
which feature ran. Both handlers reduce to:

    wsURL, secret, ok := resolveWorkspaceForwardCreds(c, ctx, workspaceID, "upload")
    if !ok { return }

Net -20 lines (helper amortizes the 50-line block across both call
sites). Existing test coverage (TestChatUpload_NoInboundSecret_*,
TestChatDownload_NoInboundSecret_* from PR #2370) covers all four
branches of the shared helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:51:53 -07:00
Hongming Wang
29368dd749
Merge pull request #2371 from Molecule-AI/auto/team-expand-mint-fix
test(provision): pin PARENT_ID env injection contract in prepareProvisionContext
2026-04-30 09:45:19 +00:00
Hongming Wang
4ba12668f0 test(provision): pin PARENT_ID env injection contract in prepareProvisionContext
#2367 moved PARENT_ID env injection from inline TeamHandler.Expand
into the shared prepareProvisionContext (sourced from
payload.ParentID). The test was missing — a regression that:
  - dropped the injection
  - inverted the nil-check
  - leaked an empty PARENT_ID="" into env

would not fail any existing test, but workspace/coordinator.py reads
PARENT_ID on startup to track the parent-child relationship, so the
breakage would surface only at runtime.

Adds TestPrepareProvisionContext_ParentIDInjection with three
sub-cases:
  - nil ParentID → no PARENT_ID env
  - empty-string ParentID → no PARENT_ID env (don't pollute)
  - set ParentID → PARENT_ID env equals value
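The three sub-cases pin a contract like this sketch (the env map and helper name are illustrative):

```go
package main

import "fmt"

// injectParentID mirrors the pinned contract: nil and empty-string
// ParentID add nothing; a set value injects PARENT_ID into the
// provision-time env.
func injectParentID(env map[string]string, parentID *string) {
	if parentID == nil || *parentID == "" {
		return // don't pollute env with PARENT_ID=""
	}
	env["PARENT_ID"] = *parentID
}

func main() {
	env := map[string]string{}
	p := "ws-parent-1"
	injectParentID(env, &p)
	fmt.Println(env["PARENT_ID"])
}
```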

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:41:41 -07:00
Hongming Wang
ab6bcc030c
Merge pull request #2370 from Molecule-AI/auto/lazy-heal-test-coverage
test(chat_files): pin lazy-heal mint contract for both Upload and Download
2026-04-30 09:41:40 +00:00
Hongming Wang
6c065a02e6 test(chat_files): pin lazy-heal mint contract for both Upload and Download
The 2026-04-30 lazy-heal fix in chat_files.go (PR #2366) ATTEMPTS to
mint platform_inbound_secret on miss so legacy workspaces self-heal
without requiring destructive reprovision. The pre-existing
TestChatUpload_NoInboundSecret + TestChatDownload_NoInboundSecret
tests asserted the 503 response shape but did NOT pin that the mint
UPDATE actually fires — they happened to exercise the mint-failure
branch (sqlmock unmatched UPDATE = error = "Failed to mint" code path
returns 503 with "RFC #2312" detail, which still passed the original
assertions).

This means a regression that:
  - skipped the lazy-heal mint entirely
  - inverted the success/failure response branches
  - moved the mint to a different code path

would not fail those tests.

Fix:

  - TestChatUpload_NoInboundSecret_LazyHeal: mock the UPDATE
    successfully; assert sqlmock.ExpectationsWereMet (mint MUST run)
    + body contains "retry" + "30" (success branch).
  - TestChatUpload_NoInboundSecret_LazyHealFailure: mock the UPDATE
    to fail; assert body contains "Reprovision" (failure branch).
  - Same pair for the Download handler — independent code path means
    independent test.

Pins both branches of both handlers (4 tests) so future drift trips
the gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:38:28 -07:00
Hongming Wang
d9801d1e62
Merge pull request #2368 from Molecule-AI/auto/team-expand-mint-fix
fix(team): delegate Expand child-provision to shared mint pipeline (#2367)
2026-04-30 09:31:34 +00:00
Hongming Wang
bb52a1a365 fix(team): delegate Expand child-provisioning to shared mint pipeline (#2367)
Closes #2367.

TeamHandler.Expand provisioned child workspaces by directly calling
h.provisioner.Start, skipping mintWorkspaceSecrets and every other
preflight (secrets load, env mutators, identity injection, missing-env,
empty-config-volume auto-recover). Children shipped with NULL
platform_inbound_secret + never-issued auth_token — same drift class as
the SaaS bug just fixed in PR #2366, found while exercising a stronger
gate against this package.

Fix:

- TeamHandler now holds *WorkspaceHandler. Expand delegates each child
  provision to wh.provisionWorkspace, picking up the shared
  prepare/mint/preflight pipeline automatically. Future provision-time
  steps go in ONE place and team-expand inherits them.
- prepareProvisionContext gains PARENT_ID env injection sourced from
  payload.ParentID (which Expand now populates). This preserves the
  signal workspace/coordinator.py reads on startup, without threading
  env through provisioner.WorkspaceConfig manually.
- NewTeamHandler signature gains *WorkspaceHandler; router passes it.

Gate upgrade:

- TestProvisionFunctions_AllCallMintWorkspaceSecrets is now
  behavior-based: it walks every FuncDecl in the package and flags any
  function that calls h.provisioner.Start or h.cpProv.Start without
  also calling mintWorkspaceSecrets. Drift-resistant by construction —
  a future provision function with any name still trips the gate.
- Replaces the name-list version from PR #2366. The name list missed
  Expand precisely because Expand wasn't named provision*; the
  behavior-based detector caught it spontaneously when prototyped.
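A condensed sketch of the behavior-based detector using go/ast. Simplified for illustration: it matches any `.Start` selector rather than resolving the receiver to h.provisioner/h.cpProv, and the embedded source is a toy:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package handlers

func provisionGood(h *H) {
	h.mintWorkspaceSecrets()
	h.provisioner.Start()
}

func expandBad(h *H) {
	h.provisioner.Start() // no mint call: must be flagged
}`

// startWithoutMint walks every FuncDecl and flags any function that
// calls .Start without also calling mintWorkspaceSecrets —
// drift-resistant by construction, since a future provision function
// with any name still trips the gate.
func startWithoutMint(filesrc string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "handlers.go", filesrc, 0)
	if err != nil {
		return nil, err
	}
	var flagged []string
	for _, decl := range f.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		callsStart, callsMint := false, false
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			if sel, ok := call.Fun.(*ast.SelectorExpr); ok {
				switch sel.Sel.Name {
				case "Start":
					callsStart = true
				case "mintWorkspaceSecrets":
					callsMint = true
				}
			}
			return true
		})
		if callsStart && !callsMint {
			flagged = append(flagged, fn.Name.Name)
		}
	}
	return flagged, nil
}

func main() {
	flagged, err := startWithoutMint(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(flagged) // prints [expandBad]
}
```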

Tests: full workspace-server module green; gate previously verified to
fire red on Expand pre-fix and on deliberate mintWorkspaceSecrets
removal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:28:29 -07:00
Hongming Wang
b9b0a46f2e
Merge pull request #2366 from Molecule-AI/auto/workspace-provision-shared
fix(provision): share Docker+SaaS prepare path so both mint workspace secrets (RFC #2312)
2026-04-30 09:21:38 +00:00
Hongming Wang
3f8286ea47 fix(provision): share Docker+SaaS prepare path so both mint workspace secrets (RFC #2312)
Root cause of the 2026-04-30 silent-503 chat-upload bug: provisionWorkspaceCP
(SaaS) skipped issueAndInjectInboundSecret while provisionWorkspaceOpts
(Docker) called it. Every prod SaaS workspace provisioned with NULL
platform_inbound_secret → upload returned 503 with the v2-enrollment
message on every attempt.

Structural fix:
- Extract prepareProvisionContext (secrets load, env mutators, preflight,
  cfg build), mintWorkspaceSecrets (auth_token + platform_inbound_secret),
  markProvisionFailed (broadcast + DB update) into workspace_provision_shared.go
- Refactor both provision modes to call the shared helpers
- Add provisionAbort struct so the missing-env failure class can carry its
  structured "missing" payload through the shared abort path
- Unify last_sample_error: previously the decrypt-fail path skipped it while
  others set it; users now see every failure class in the UI

Drift prevention:
- AST gate TestProvisionFunctions_AllCallMintWorkspaceSecrets asserts every
  function in the provisionFunctions set calls mintWorkspaceSecrets at least
  once (same shape as the audit-coverage gate from #335). New provision paths
  must either call mint or be added to provisionExemptFunctions with a
  one-line justification
- Behavioral test TestMintWorkspaceSecrets_PersistsInboundSecretInSaaSMode
  pins the contract: SaaS mode MUST persist platform_inbound_secret to the DB
  column even though it skips file injection

Existing-workspace recovery (chat_files.go lazy-heal):
- Upload + Download handlers detect NULL platform_inbound_secret and call
  IssuePlatformInboundSecret inline, returning 503 with retry_after_seconds=30
- Self-heals workspaces that were provisioned before this fix without
  requiring destructive reprovision

Tests: full handlers + workspace-server module green; AST gate verified to
fire red on deliberate violation (commented-out mint call surfaces the
exact function name + actionable remediation message).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:18:08 -07:00
Hongming Wang
49c3433a70
Merge pull request #2365 from Molecule-AI/auto/cf-dead-origin-status-gap
fix(a2a): cover CF 521/522/523 in dead-origin status set
2026-04-30 08:51:28 +00:00
Hongming Wang
e06ebaefdf
Merge pull request #2346 from Molecule-AI/auto/issue-2341-migration-collision
ci: hard gate against migration version collisions (#2341)
2026-04-30 08:50:19 +00:00
Hongming Wang
b5df2126b9 fix(test): convert migration-collision tests from pytest to unittest (#2341)
CI failure: the Ops scripts (unittest) job runs `python -m unittest
discover` which doesn't have pytest installed. test_check_migration_
collisions.py imported pytest unconditionally, failing module import:

  ImportError: Failed to import test module: test_check_migration_collisions
  Traceback (most recent call last):
    File ".../test_check_migration_collisions.py", line 12, in <module>
      import pytest
  ModuleNotFoundError: No module named 'pytest'

The tests use no pytest-specific features (just bare assert + plain
class). Sibling test_sweep_cf_decide.py in the same dir already uses
unittest.TestCase. Convert this one to match: drop the pytest import,
make TestMigrationFileRe inherit from unittest.TestCase.

unittest.TestLoader.discover() requires TestCase subclasses for
auto-discovery, so the fix is two lines (drop import, add base).
Bare assert statements work fine inside TestCase methods.

Verified: `python3 -m unittest scripts.ops.test_check_migration_collisions -v`
runs all 9 tests, all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:47:27 -07:00
Hongming Wang
8e508a7a2f fix(a2a): cover CF 521/522/523 in dead-origin status set
Independent review on PR #2362 caught: the dead-agent classifier at
a2a_proxy.go included 502/503/504/524 but missed the rest of the CF
origin-failure family (521/522/523), which are MORE indicative of a
dead EC2 than 524:

- 521 "Web server is down" — CF can't open TCP to origin (most direct
  dead-EC2 signal; fires when the workspace EC2 has been terminated
  and CF still has the CNAME pointing at it).
- 522 "Connection timed out" — TCP didn't complete in ~15s (typical
  of SG/NACL flap or agent process hung on accept).
- 523 "Origin is unreachable" — CF can't route to origin (DNS gone,
  network path broken).

Pre-fix any of these would propagate as-is to the canvas and the user
would see a 5xx without the reactive auto-restart firing — exactly
the SaaS-blind class of failure PR #2362 was meant to close.

Refactor: extracted isUpstreamDeadStatus(int) helper so the matrix is
in one place, with TestIsUpstreamDeadStatus locking in 18 status
codes (7 dead, 11 not-dead including 520 and 525 which look CF-shaped
but indicate different failures).
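A sketch of the extracted helper — the seven dead statuses match the matrix above, and 520/525 stay deliberately excluded:

```go
package main

import "fmt"

// isUpstreamDeadStatus keeps the dead-origin matrix in one place:
// gateway 5xx (502/503/504) plus the Cloudflare origin-failure family
// 521/522/523/524. 520 ("unknown error") and 525 (SSL handshake
// failed) look CF-shaped but indicate different failures, so they
// must not trigger a restart.
func isUpstreamDeadStatus(status int) bool {
	switch status {
	case 502, 503, 504, 521, 522, 523, 524:
		return true
	}
	return false
}

func main() {
	for _, s := range []int{502, 520, 521, 525} {
		fmt.Println(s, isUpstreamDeadStatus(s))
	}
}
```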

Also tightened TestStopForRestart_NoProvisioner_NoOp per the same
review: now uses sqlmock.ExpectationsWereMet to assert the dispatcher
doesn't touch the DB on the both-nil path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:39:04 -07:00
Hongming Wang
2742c3c837
Merge pull request #2363 from Molecule-AI/auto/a2a-corpus-replay-gate
test(a2a): protocol-shape replay corpus gate (#2345 follow-up)
2026-04-30 08:29:20 +00:00
Hongming Wang
747c12e582 test(a2a): protocol-shape replay corpus gate (#2345 follow-up)
Backward-compat replay gate for the A2A JSON-RPC protocol surface.
Every PR that touches normalizeA2APayload OR bumps the a-2-a-sdk
version pin runs every shape in testdata/a2a_corpus/ through the
current code and asserts:

  valid/   — every shape MUST parse without error and produce a
             canonical v0.3 payload (params.message.parts list).

  invalid/ — every shape MUST be rejected with the documented
             status code and error substring.

What this prevents

The 2026-04-29 v0.2 → v0.3 silent-drop bug (PR #2349) shipped
because the SDK bump PR didn't replay v0.2-shaped inputs against
the new code; the shape-mismatch surfaced only in production when
the receiver's Pydantic validator silently rejected inbound
messages.

This gate would have caught it pre-merge. Hand-verified: reverting
the v0.2 string→parts shim in normalizeA2APayload fails 3 of the
v0.2 corpus entries with the exact rejection class the production
bug exhibited.
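The shim's mechanic, sketched on a bare map. The field handling follows the shapes described above; the messageId auto-fill value and error wording are illustrative:

```go
package main

import "fmt"

// normalizeV02Message sketches the v0.2 string→parts compat shim:
// v0.2 carries "content" (a plain string, or already a Part list);
// canonical v0.3 carries "parts" (a list of typed Parts) and must
// not also carry "content".
func normalizeV02Message(msg map[string]any) (map[string]any, error) {
	if _, ok := msg["parts"]; ok {
		return msg, nil // already canonical v0.3
	}
	content, ok := msg["content"]
	if !ok {
		return nil, fmt.Errorf("message has neither parts nor content")
	}
	switch c := content.(type) {
	case string:
		msg["parts"] = []any{map[string]any{"type": "text", "text": c}}
	case []any:
		msg["parts"] = c // v0.2 with content as a Part list
	default:
		return nil, fmt.Errorf("content has unsupported type %T", content)
	}
	delete(msg, "content") // canonical payload deletes the legacy field
	if _, ok := msg["messageId"]; !ok {
		msg["messageId"] = "auto-generated" // real code mints an ID
	}
	return msg, nil
}

func main() {
	out, err := normalizeV02Message(map[string]any{"content": "hi"})
	fmt.Println(out, err)
}
```

Reverting this shim is exactly the hand-test described below: v0.2 string-content entries stop producing a parts list and the corpus gate fails.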

Corpus contents (14 entries)

valid/ (11):
  v0_2_string_content              — basic v0.2 (the broken case)
  v0_2_string_content_no_message_id — v0.2 + auto-fill messageId
  v0_2_list_content                — v0.2 with content as Part list
  v0_3_parts_text_only             — canonical v0.3
  v0_3_parts_multi_text            — multi-Part list
  v0_3_parts_with_file             — multimodal (text + file)
  v0_3_parts_with_context          — contextId for multi-turn
  v0_3_streaming_method            — message/stream variant
  v0_3_unicode_text                — emoji + multi-script
  v0_3_long_text                   — 10KB text Part
  no_jsonrpc_envelope              — bare params/method without
                                     outer envelope (legacy senders)

invalid/ (3):
  no_content_or_parts              — message has neither field
  content_is_integer               — wrong type for v0.2 content
  content_is_bool                  — wrong type, separate from int
                                     so the failure msg identifies
                                     which type-class regressed

Plus 4 inline malformed-JSON cases (truncated, not-JSON, empty,
whitespace) that can't be expressed as JSON corpus entries.

Coverage tests

The gate has five test functions:

1. TestA2ACorpus_ValidShapesParse        — replay valid/ corpus,
   assert no error + canonical v0.3 output (parts list non-empty,
   messageId non-empty, content field deleted).
2. TestA2ACorpus_InvalidShapesRejected   — replay invalid/ corpus,
   assert rejection matches recorded status + error substring.
3. TestA2ACorpus_MalformedJSONRejected   — inline cases for
   non-parseable bodies.
4. TestA2ACorpus_HasMinimumCoverage      — at least one v0.2 +
   one v0.3 entry exists (loses neither side of the bridge).
5. TestA2ACorpus_EveryEntryHasMetadata   — _comment/_added/_source
   on every entry per the README policy; _expect_error and
   _expect_status on invalid entries.

Documentation

testdata/a2a_corpus/README.md describes the corpus contract:
  - When to add entries (new SDK shape, new production-observed
    shape).
  - When NOT to add (test scaffolding, hypothetical futures).
  - Removal policy (breaking change, deprecation window required).

Verification

- All 24 corpus subtests pass on current main.
- Hand-test: revert the v0.2 compat shim → 3 v0.2 entries fail
  the gate with the exact rejection class the production bug
  exhibited. Confirmed.
- Whole-module go test ./... green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:26:02 -07:00
Hongming Wang
344e3e8914
Merge pull request #2362 from Molecule-AI/auto/a2a-upstream-5xx-mark-dead
fix(a2a): detect dead EC2 agents on upstream 5xx + reactive auto-restart for SaaS
2026-04-30 08:14:14 +00:00
Hongming Wang
a27cf8f39f fix(restart): extract stopForRestart helper + add 524 to dead-agent list
Addresses code-review C1 (test goroutine race) and I2 (CF 524) on PR #2362.

C1: TestRunRestartCycle_SaaSPath_DispatchesViaCPProv invoked runRestartCycle
end-to-end, which spawns `go h.sendRestartContext(...)`. That goroutine
outlived the test, then read db.DB while the next test's setupTestDB wrote
to it — DATA RACE under -race, cascading 30+ failures across the handlers
suite. Refactored: extracted `stopForRestart(ctx, id)` from runRestartCycle
as a pure dispatcher, and rewrote the SaaS-path test to call it directly
(no async goroutine spawned). Added a no-provisioner no-op guard test.

I2: Cloudflare 524 ("origin timed out") now triggers maybeMarkContainerDead
alongside 502/503/504. Same upstream signal — origin agent unresponsive.

Verified `go test -race -count=1 ./internal/handlers/...` green locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:58:22 -07:00
Hongming Wang
28b4e38002 fix(restart): branch provisionWorkspace dispatch on cpProv (PR #2362 amendment)
Independent review of #2362 caught a Critical gap: the previous commit
fixed the Stop dispatch in runRestartCycle but left the provisionWorkspace
dispatch unconditionally Docker-only. So on SaaS the auto-restart cycle
would Stop the EC2 successfully (good), then NPE inside provisionWorkspace's
`h.provisioner.VolumeHasFile` call. coalesceRestart's recover()-without-
re-raise (a deliberate platform-stability safeguard) silently swallowed
the panic, leaving the workspace permanently stuck in status='provisioning'
because the UPDATE on workspace_restart.go:450 had already run.

Net pre-amendment effect on SaaS: dead agent → structured 503 (good) →
workspace flipped to 'offline' (good) → cpProv.Stop succeeded (good) →
provisionWorkspace NPE swallowed (bad) → workspace permanently
'provisioning' until manual canvas restart. The headline claim of #2362
("SaaS auto-restart now works") was false on the path it shipped.

Fix: dispatch the reprovision call the same way every other call site
in the package does (workspace.go:431-433, workspace_restart.go:197+596) —
branch on `h.cpProv != nil` and call provisionWorkspaceCP for SaaS,
provisionWorkspace for Docker.

Tests:

- New TestRunRestartCycle_SaaSPath_DispatchesViaCPProv asserts cpProv.Stop
  is called when the SaaS path runs (would have caught the NPE if
  provisionWorkspace had been called instead).

- fakeCPProv updated: methods record calls and return nil/empty by
  default rather than panicking. The previous "panic on unexpected call"
  pattern was unsafe — the panic fires on the async restart goroutine
  spawned by maybeMarkContainerDead AFTER the test assertions ran, so
  the test passed by accident even though the production path was
  broken (which is exactly how the Critical bug landed).

- Existing tests still pass (full handlers + provisioner suites green).

Branch-count audit refresh:

  runRestartCycle dispatch decisions:
    1. h.provisioner != nil → provisioner.Stop + provisionWorkspace ✓ (existing tests)
    2. h.cpProv != nil       → cpProv.Stop + provisionWorkspaceCP   ✓ (NEW test)
    3. both nil              → coalesceRestart never called (RestartByID gate) ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:35:51 -07:00
Hongming Wang
9f35788aee fix(a2a): detect dead EC2 agents on upstream 5xx + reactive auto-restart for SaaS
Class-of-bugs fix, surfaced when hongmingwang.moleculesai.app's canvas
chat to a dead workspace returned a generic Cloudflare 502 page on
2026-04-30. Three independent gaps in the reactive-health path
together leaked dead-agent failures to canvas with no auto-recovery.

## Bug 1 — maybeMarkContainerDead is a no-op for SaaS tenants

`maybeMarkContainerDead` only consulted `h.provisioner` (local Docker
provisioner). SaaS tenants set `h.cpProv` (CP-backed EC2 provisioner)
and leave `h.provisioner` nil — so the function early-returned false
on every call and dead EC2 agents never triggered the offline-flip /
broadcast / restart cascade.

Fix: extend `CPProvisionerAPI` interface with `IsRunning(ctx, id)
(bool, error)` (already implemented on `*CPProvisioner`; just needs
to surface on the interface). `maybeMarkContainerDead` now branches:
local-Docker path uses `h.provisioner.IsRunning`; SaaS path uses
`h.cpProv.IsRunning` which calls the CP's `/cp/workspaces/:id/status`
endpoint to read the EC2 state.

## Bug 2 — RestartByID short-circuits on `h.provisioner == nil`

Same shape as Bug 1: the auto-restart cascade triggered by
`maybeMarkContainerDead` calls `RestartByID` which short-circuited
when the local Docker provisioner was missing. So even if Bug 1 were
fixed, the workspace-offline state would never recover.

Fix: change the gate to `h.provisioner == nil && h.cpProv == nil`
and update `runRestartCycle` to branch on which provisioner is
wired for the Stop call. (The HTTP `Restart` handler already does
this branching correctly — we're just bringing the auto-restart path
to parity.)

## Bug 3 — upstream 502/503/504 propagated as-is, masked by Cloudflare

When the agent's tunnel returns 5xx (the "tunnel up but no origin"
shape — agent process dead but cloudflared connection still healthy),
`dispatchA2A` succeeds at the transport layer and hands back the 5xx
status. `handleA2ADispatchError`'s reactive-health path doesn't run
because that path is only triggered on transport-level errors. The
pre-fix code propagated the 502 status to canvas; Cloudflare in front
of the platform then masked the 502 with its own opaque "error code:
502" page, hiding any structured response and any Retry-After hint.

Fix: in `proxyA2ARequest`, when the upstream returns 502/503/504, run
`maybeMarkContainerDead` BEFORE propagating. If IsRunning confirms
the agent is dead → return a structured 503 with restarting=true +
Retry-After (CF doesn't mask 503s the same way). If running, propagate
the original status (don't recycle a healthy agent on a transient
hiccup — it might have legitimately returned 502).
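The decision table for the upstream-5xx branch, as a minimal sketch (the function name and Retry-After value here are illustrative; the real code lives in `proxyA2ARequest`):

```go
package main

import (
	"fmt"
	"net/http"
)

// classifyUpstream maps an upstream status + the dead-check result to
// what the caller sees: a structured 503 with Retry-After when the
// agent is confirmed dead (CF doesn't mask 503s the same way),
// otherwise the original status untouched.
func classifyUpstream(status int, agentDead bool) (int, map[string]string) {
	switch status {
	case http.StatusBadGateway, http.StatusServiceUnavailable, http.StatusGatewayTimeout:
		if agentDead {
			// dead agent confirmed: structured restarting=true response
			return http.StatusServiceUnavailable, map[string]string{"Retry-After": "15"}
		}
	}
	// healthy agent or non-5xx: propagate as-is (it might have
	// legitimately returned 502 on a transient hiccup)
	return status, nil
}

func main() {
	code, hdr := classifyUpstream(502, true)
	fmt.Println(code, hdr["Retry-After"])
}
```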

## Drive-by — a2aClient transport timeouts

a2aClient was a bare `&http.Client{}` with no Transport timeouts. When
a workspace's EC2 black-holes TCP connection attempts (instance
terminated mid-flight, SG flipped, NACL bug), the OS default connect
timeout is 75s on Linux / 21s on macOS —
long enough for Cloudflare's ~100s edge timeout to fire first and
surface a generic 502. Added DialContext (10s connect), TLSHandshake
(10s), and ResponseHeaderTimeout (60s). Client.Timeout DELIBERATELY
unset — that would pre-empt slow-cold-start flows (Claude Code OAuth
first-token, multi-minute agent synthesis). Long-tail body streaming
is still governed by per-request context deadline.

## Tests

- `TestMaybeMarkContainerDead_CPOnly_NotRunning` — IsRunning(false) →
  marks workspace offline, returns true.
- `TestMaybeMarkContainerDead_CPOnly_Running` — IsRunning(true) →
  no offline-flip, returns false (don't recycle a healthy agent).
- `TestProxyA2A_Upstream502_TriggersContainerDeadCheck` — agent server
  returns 502 + cpProv reports dead → caller gets 503 with restarting=
  true and Retry-After: 15.
- `TestProxyA2A_Upstream502_AliveAgent_PropagatesAsIs` — same upstream
  502 but cpProv reports running → propagates 502 (existing behavior;
  safety check that prevents over-eager recycling).
- Existing `TestMaybeMarkContainerDead_NilProvisioner` /
  `TestMaybeMarkContainerDead_ExternalRuntime` still pass.
- Full handlers + provisioner test suites pass.

## Impact

Pre-fix: dead EC2 agent on a SaaS tenant → CF-masked 502 to canvas, no
auto-recovery, manual restart from canvas required.

Post-fix: dead EC2 agent on a SaaS tenant → structured 503 with
restarting=true + Retry-After to canvas, workspace flipped to offline,
auto-restart cycle triggered. Canvas can show a user-actionable
"agent is restarting, please wait" message instead of a generic 502.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:28:22 -07:00
Hongming Wang
92a29bb37c
Merge pull request #2360 from Molecule-AI/auto/2358-followup-permissions-concurrency
fix(ci): close gaps in auto-promote dispatch tail (#2358 follow-up)
2026-04-30 07:07:21 +00:00
Hongming Wang
26d5c5ba1f fix(ci): close gaps in auto-promote dispatch tail (#2358 follow-up)
Independent review of #2358 surfaced three gaps that the original
self-review missed. All three would manifest only on the FIRST real
staging→main promotion through the new tail step, so they'd silently
re-introduce the deploy-chain bug #2357 was supposed to fix.

1. **Missing `actions: write` permission.** `gh workflow run` POSTs to
   `/repos/.../actions/workflows/.../dispatches`, which requires the
   actions:write scope on GITHUB_TOKEN. The job had only contents:write
   + pull-requests:write, so the dispatch call would 403 on every run
   and the publish chain would still not fire. Adding the scope.

2. **No workflow-level concurrency block.** When CI + E2E Staging
   Canvas + E2E API Smoke + CodeQL all complete within seconds of each
   other on a green staging push (the typical case), four separate
   workflow_run events fire and four parallel auto-promote runs all
   reach the dispatch tail. They poll the same PR, all observe the
   same mergedAt, and all call `gh workflow run` — producing 2-4×
   redundant publish builds racing for the same `:staging-latest`
   retag and 2-4× canary-verify chains. Added
   `concurrency.group: auto-promote-staging, cancel-in-progress: false`.
   cancel-in-progress=false because killing a polling tail that's
   about to dispatch would re-introduce the original bug.

3. **PR closed-without-merge ties up a runner for 30 min.** If the
   merge queue rejects the PR (gates flip red post-approval), or an
   operator closes it manually, mergedAt stays null forever and the
   loop polls 60 × 30s burning a runner slot. Now also reads `state`
   in the same `gh pr view` call and breaks early when STATE=CLOSED.

Verification on this PR is structural (workflow won't fire on a
staging→main promotion until this lands AND a subsequent staging
push triggers auto-promote). The actions:write fix in particular is
unverifiable until the next real run — the prior #2358 fix has
the same property, so we're stacking two unverifiable workflow
edits. That's intentional rather than risky: stage 1 (#2358) was
load-bearing for the deploy-chain restoration; stage 2 (this PR)
hardens it before it actually matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:03:31 -07:00
Hongming Wang
d850ec7c8c
Merge pull request #2358 from Molecule-AI/auto/issue-2357-promote-dispatch-chain
fix(ci): dispatch publish chain after auto-promote merge (#2357)
2026-04-30 06:36:02 +00:00
Hongming Wang
9a7f61661b fix(ci): dispatch publish chain after auto-promote merge (#2357)
The auto-promote staging → main flow uses `gh pr merge --auto` with
GITHUB_TOKEN, which means GitHub suppresses downstream `push` events on
the resulting main commit. This is documented behavior — events created
by GITHUB_TOKEN do not trigger new workflow runs, with workflow_dispatch
and repository_dispatch as the only exceptions.

Effect: when the merge queue lands the auto-promote PR, the main push
DOES NOT fire publish-workspace-server-image. canary-verify + the
:staging-<sha> → :latest retag never run, so redeploy-tenants-on-main
also never fires. Tenants stay on stale code until someone manually
dispatches the chain (which is what just happened for issue #2339).

Fix here: after enqueuing auto-merge, poll for the PR to land, then
explicitly `gh workflow run publish-workspace-server-image.yml --ref
main`. workflow_dispatch is the documented exception, so the dispatch
event itself DOES create a new run. canary-verify and
redeploy-tenants-on-main chain via workflow_run as before.

Long-term (tracked in #2357): switch the auto-merge call above to a
GitHub App token (actions/create-github-app-token) so the merge event
itself can trigger the downstream chain naturally; the polling tail
becomes deletable.

Why a 30-min poll cap: merge queue typically lands a green promote PR
within 5-10 min. 30 min covers a slow CI run without hanging the
workflow indefinitely. If the merge times out, the step warns and
exits 0 — operator can manually dispatch as a fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:31:13 -07:00
Hongming Wang
c1f993ca36
Merge pull request #2355 from Molecule-AI/auto/issue-2339-pr4-e2e-poll-roundtrip
test(e2e): poll-mode + since_id cursor round-trip (#2339 PR 4)
2026-04-30 06:25:29 +00:00
Hongming Wang
08252b3cd7 fix(e2e): use real UUIDs for poll-mode test workspace ids
CI run on PR #2355 surfaced `pq: invalid input syntax for type uuid:
ws-poll-e2e-1777529293-3363` — workspaces.id is UUID-typed and the
hand-rolled "ws-<tag>" shape fails the cast. Phase 1 returned
generic 'registration failed' which cascaded into Phase 3 'lookup
failed' (resolveAgentURL on a non-existent row) and Phase 4 'missing
workspace auth token' (no token extracted because Phase 1 didn't run
the bootstrap path).

Generate v4 UUIDs via uuidgen (with a python3 fallback), one each
for the poll workspace, the caller workspace, and the Phase 2
invalid-mode probe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:10:36 -07:00
Hongming Wang
a495b86a06 test(e2e): poll-mode + since_id cursor round-trip (#2339 PR 4)
End-to-end coverage for the canvas-chat unblocker. Exercises every
moving part of the #2339 stack against a real platform instance:

Phase 1 — register a workspace as delivery_mode=poll WITHOUT a URL;
verify the response carries delivery_mode=poll.
Phase 2 — invalid delivery_mode rejected with 400 (typo defense).
Phase 3 — POST A2A to the poll-mode workspace; verify proxyA2ARequest
short-circuits and returns 200 {status:queued, delivery_mode:poll,
method:message/send} without ever resolving an agent URL.
Phase 4 — verify the queued message appears in /activity?type=a2a_receive
with the right method + payload (the polling agent reads from here).
Phase 5 — since_id cursor returns ASC-ordered rows STRICTLY AFTER the
cursor; the cursor row itself must NOT be replayed. Sends two
follow-up messages and asserts ordering: rows[0] is the older new
event, rows[-1] is the newer.
Phase 6 — unknown / pruned cursor returns 410 Gone with an explanation.
Phase 7 — cross-workspace cursor isolation: a UUID belonging to one
workspace cannot be used to peek at another workspace's feed (returns
410, same as pruned, no info leak).

Idempotent: per-run unique workspace ids (date+pid). Trap-based cleanup
deletes the test rows on exit; no e2e_cleanup_all_workspaces call (see
feedback_never_run_cluster_cleanup_tests_on_live_platform.md).

Wired into .github/workflows/e2e-api.yml so it runs on every PR that
touches workspace-server/, tests/e2e/, or the workflow file itself —
same gate as the existing test_a2a_e2e + test_notify_attachments suites.

Stacked on #2354 (PR 3: since_id cursor).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:07:10 -07:00
Hongming Wang
b5bde0399a
Merge pull request #2354 from Molecule-AI/auto/issue-2339-pr3-activity-cursor
feat(activity): since_id cursor on GET /activity (#2339 PR 3)
2026-04-30 05:55:05 +00:00
Hongming Wang
a81b0e1e3d feat(activity): since_id cursor on GET /activity (#2339 PR 3)
Telegram getUpdates / Slack RTM shape: poll-mode workspaces pass the id
of the last activity_logs row they consumed; the server returns rows
strictly after it in chronological (ASC) order. Existing callers that
don't pass since_id keep DESC + most-recent-N — backwards-compatible.

Cursor lookup is scoped by workspace_id so a caller cannot enumerate or
peek at another workspace's events by passing a UUID belonging to a
different workspace. Cross-workspace and pruned cursors both return
410 Gone — no information leak (caller cannot distinguish "row never
existed" from "row exists but you can't see it").

since_id + since_secs both apply (AND). When since_id is set, the order
flips to ASC because polling consumers need recorded order; the
recent-feed shape (no since_id) keeps DESC.

Tests:
- TestActivityHandler_SinceID_ReturnsNewerASC — cursor lookup → main
  query with cursorTime + ASC ordering.
- TestActivityHandler_SinceID_CursorNotFound_410 — pruned/unknown cursor.
- TestActivityHandler_SinceID_CrossWorkspaceCursor_410 — UUID belongs to
  another workspace, scoped lookup hides it (same 410 path, no leak).
- TestActivityHandler_SinceID_CombinedWithSinceSecs — placeholder index
  arithmetic with both filters.

Stacked on #2353 (PR 2: poll-mode short-circuit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:51:52 -07:00
Hongming Wang
706a388806
Merge pull request #2353 from Molecule-AI/auto/issue-2339-pr2-poll-shortcircuit-v2
feat(a2a): poll-mode short-circuit in ProxyA2A (#2339 PR 2)
2026-04-30 05:29:03 +00:00
Hongming Wang
91a1d5377d feat(a2a): poll-mode short-circuit in ProxyA2A (#2339 PR 2)
Skip SSRF/dispatch and queue to activity_logs for delivery_mode=poll
workspaces. The polling agent (e.g. molecule-mcp-claude-channel on an
operator's laptop) consumes via GET /activity?since_id= in PR 3 — no
public URL needed.

Order: budget -> normalize -> lookupDeliveryMode short-circuit ->
resolveAgentURL. Normalizing before the short-circuit keeps the
JSON-RPC method name on the activity_logs row so the polling agent
can dispatch correctly.

Fail-closed-to-push: any DB error reading delivery_mode defaults to
push (loud + recoverable) rather than poll (silent drop).

Tests:
- TestProxyA2A_PollMode_ShortCircuits_NoSSRF_NoDispatch — core invariant:
  no resolveAgentURL, no Do(), records to activity_logs, returns 200
  {status:"queued",delivery_mode:"poll",method:"message/send"}.
- TestProxyA2A_PushMode_NoShortCircuit — push path unaffected; the agent
  server actually receives the request.
- TestProxyA2A_PollMode_FailsClosedToPush — DB error on mode lookup
  must NOT silently queue; falls through to the push path.

Stacked on #2348 (PR 1: schema + register flow).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:22:28 -07:00
Hongming Wang
3da2392f95
Merge pull request #2348 from Molecule-AI/auto/issue-2339-pr1-delivery-mode
feat(workspaces): delivery_mode column + poll-mode register flow (#2339 PR 1)
2026-04-30 05:18:03 +00:00
Hongming Wang
ec6e47cbe3
Merge pull request #2351 from Molecule-AI/auto/issue-2344-architecture-lint
test(arch): codify 4 module boundaries as architecture tests (#2344)
2026-04-30 05:16:09 +00:00
Hongming Wang
68f18424f5 test(arch): codify 4 module boundaries as architecture tests (#2344)
Hard gate #4: codified module boundaries as Go tests, so a new
contributor (or AI agent) can't silently land an import that crosses
a layer.

Boundaries enforced (one architecture_test.go per package):

- wsauth has no internal/* deps — auth leaf, must be unit-testable in
  isolation
- models has no internal/* deps — pure-types leaf, reverse dep would
  create cycles since most packages depend on models
- db has no internal/* deps — DB layer below business logic, must be
  testable with sqlmock without spinning up handlers/provisioner
- provisioner does not import handlers or router — unidirectional
  layering: handlers wires provisioner into HTTP routes; the reverse
  is a cycle

Each test parses .go files in its package via go/parser (no x/tools
dep needed) and asserts forbidden import paths don't appear. Failure
messages name the rule, the offending file, and explain WHY the
boundary exists so the diff reviewer learns the rule.

Note: the original issue's first two proposed boundaries
(provisioner-no-DB, handlers-no-docker) don't match the codebase
today — provisioner already imports db (PR #2276 runtime-image
lookup) and handlers hold *docker.Client directly (terminal,
plugins, bundle, templates). I picked the four boundaries that
actually hold; the first two are aspirational and would need a
refactor before they could be codified.

Hand-tested by injecting a deliberate wsauth -> orgtoken violation:
the gate fires red with the rule message before merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:12:58 -07:00
Hongming Wang
664bbd8899
Merge pull request #2350 from Molecule-AI/auto/issue-2342-continuous-synth-e2e
ci: continuous synthetic E2E against staging (#2342)
2026-04-30 05:08:35 +00:00
Hongming Wang
0b83faa33c
Merge pull request #2349 from Molecule-AI/auto/issue-2345-a2a-v02-compat-clean
fix(a2a): v0.2 → v0.3 compat shim at proxy edge (#2345)
2026-04-30 05:05:04 +00:00
Hongming Wang
db5d11ffca ci: continuous synthetic E2E against staging (#2342)
Hard gate Tier 2 item 2 of 4. Cron-driven full-lifecycle E2E that
catches regressions visible only at runtime — schema drift,
deployment-pipeline gaps, vendor outages, env-var rotations,
DNS / CF / Railway side-effects.

Empirical motivation from today:
  - #2345 (A2A v0.2 silent drop) — passed unit tests, broke at JSON-RPC
    parse layer between sender + receiver. Visible only when a sender
    exercises the full path. Now-fixed by PR #2349, but a continuous
    E2E would have surfaced it within 20 min of the regression.
  - RFC #2312 chat upload — landed staging-branch but never reached
    staging tenants because publish-workspace-server-image was main-
    only. Caught by manual dogfooding hours after deploy. Same pattern.

Both classes are invisible to PR-time CI. The continuous gate fires
every 20 min against a real staging tenant and surfaces regressions
within minutes.

Cadence: cron `0,20,40 * * * *` (3x/hour). Offsets the existing
sweep-cf-orphans (:15) and sweep-cf-tunnels (:45) so the three ops
don't burst CF/AWS APIs at the same minute. Concurrency group
prevents overlapping runs if one hangs.

Cost: ~$0.50-1/day GHA + pennies of staging tenant lifecycle.

Reuses existing tests/e2e/test_staging_full_saas.sh — no new harness
to maintain. Bounded at 10 min wall-clock (vs 15 min default) so
stuck runs fail fast rather than holding up the next firing.
Defaults to E2E_RUNTIME=langgraph (fastest cold start; the regression
classes this gate catches don't need hermes-specific paths). Operators
can dispatch with runtime=hermes when they want SDK-native coverage.

Schedule-vs-dispatch hardening: hard-fail on missing
CP_STAGING_ADMIN_API_TOKEN for cron firing (silent-skip would mask
real outages); soft-skip for operator dispatch.

Refs:
  - #2342 hard-gates Tier 2 item 2
  - #2345 (A2A v0.2 fix that this gate would have caught earlier)
  - #2335 / #2337 (deployment-pipeline gaps that this gate also catches)
2026-04-29 22:04:57 -07:00
Hongming Wang
140fc5fb10 fix(a2a): v0.2 → v0.3 compat shim at proxy edge (#2345)
Closes #2345.

## Symptom

Design Director silently dropped A2A briefs whose sender used the
v0.2 message format (`params.message.content` string) instead of v0.3
(`params.message.parts` part-list). The downstream a2a-sdk's v0.3
Pydantic validator rejected with "params.message.parts — Field
required" but the rejection only landed in tenant-side logs; the
sender saw HTTP 200/202 and assumed delivery.

UX Researcher therefore never received the kickoff. Multi-agent
pipeline silently idle.

## Fix

Convert at the proxy edge in normalizeA2APayload. Two legacy cases
are converted, the v0.3 hot path is a no-op, and missing-both is
explicitly rejected:

  v0.2 string content   → wrap as [{kind: text, text: <content>}]
                          (the canonical v0.2 case from the dogfooding
                          report)
  v0.2 list content     → preserve list as parts (some older clients
                          put a list under `content`; treat as "client
                          meant parts, used wrong field name")
  v0.3 parts present    → no-op (hot path for normal traffic)
  Neither present       → return HTTP 400 with structured JSON-RPC
                          error pointing at the missing field

Why at the proxy edge: every workspace gets the compat for free
without each one bumping a2a-sdk separately. The SDK's own compat
adapter is strict about `parts` and rejects v0.2 senders.

Why reject loud on missing-both: pre-fix the SDK's Pydantic
rejection was post-handler-dispatch and invisible to the original
sender. Now misshapen payloads return a structured 400 to the actual
caller — kills the entire silent-drop class for this payload-shape
category.

## Tests

7 new cases on normalizeA2APayload (#2345) + 1 fixture update on the
existing _MissingMethodReturnsEmpty test:

  TestNormalizeA2APayload_ConvertsV02StringContentToParts
  TestNormalizeA2APayload_ConvertsV02ListContentToParts
  TestNormalizeA2APayload_PreservesV03Parts (hot path)
  TestNormalizeA2APayload_RejectsMessageWithNeitherContentNorParts
  TestNormalizeA2APayload_RejectsContentWithUnsupportedType
  TestNormalizeA2APayload_NoMessageNoCheck (e.g. tasks/list bypasses)

All 11 normalizeA2APayload tests pass + full handler suite (no
regressions).

## Refs

Hard-gates discussion: this is exactly the class of failure
(silent-drop on schema mismatch) that #2342 (continuous synthetic
E2E) would catch automatically. Tier 2 RFC item from #2345 (caller
gets structured JSON-RPC error on parse failure) is delivered above
via the loud-reject path.
2026-04-29 22:01:41 -07:00
Hongming Wang
d5b00d6ac1 feat(workspaces): delivery_mode column + poll-mode register flow (#2339 PR 1)
Adds workspaces.delivery_mode ('push', the default, or 'poll') and lets
the register handler accept poll-mode workspaces with no URL. This is the foundation
for the unified poll/push delivery design in #2339 — Telegram-getUpdates
shape for external runtimes that have no public URL.

What this PR does:

  - Migration 045: NOT NULL TEXT column, default 'push', CHECK constraint
    on the two valid values.
  - models.Workspace + RegisterPayload + CreateWorkspacePayload gain a
    DeliveryMode field. RegisterPayload.URL drops the `binding:"required"`
    tag — the handler now enforces it conditionally on the resolved mode.
  - Register handler: validates explicit delivery_mode if set; resolves
    effective mode (payload value, else stored row value, else push) AFTER
    the C18 token check; validates URL only when effective mode is push;
    persists delivery_mode in the upsert; returns it in the response;
    skips URL caching when payload.URL is empty.
  - CreateWorkspace handler: persists delivery_mode (defaults to push) in
    the same INSERT, validates it before any side effects.
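The mode-resolution + conditional-URL rule from the register bullet above, as a standalone sketch (function name and error strings are illustrative):

```go
package main

import (
	"errors"
	"fmt"
)

// validateRegister resolves the effective delivery mode (payload value,
// else stored row value, else the "push" schema default), rejects
// invalid modes, and requires a URL only for push mode.
func validateRegister(payloadMode, storedMode, url string) (string, error) {
	mode := payloadMode
	if mode == "" {
		mode = storedMode // preserve the existing row's value
	}
	if mode == "" {
		mode = "push" // schema default
	}
	if mode != "push" && mode != "poll" {
		return "", errors.New("invalid delivery_mode") // typo defense → 400
	}
	if mode == "push" && url == "" {
		return "", errors.New("url required for push-mode workspaces")
	}
	return mode, nil
}

func main() {
	mode, err := validateRegister("poll", "", "")
	fmt.Println(mode, err) // poll-mode registration accepted without a URL
}
```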

What this PR does NOT do (intentional, follow-up PRs):

  - PR 2: short-circuit ProxyA2A for poll-mode workspaces (skip SSRF +
    dispatch, log a2a_receive activity, return 200).
  - PR 3: since_id cursor on GET /activity for lossless polling.
  - Plugin v0.2 in molecule-mcp-claude-channel: cursor persistence + a
    register helper that creates poll-mode workspaces.

Backwards compatibility: every existing workspace stays push-mode (schema
default) with identical behavior. New tests:
TestRegister_PollMode_AcceptsEmptyURL,
TestRegister_PushMode_RejectsEmptyURL,
TestRegister_InvalidDeliveryMode,
TestRegister_PollMode_PreservesExistingValue. All existing register +
create tests updated to expect the new delivery_mode column in the
INSERT args.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 21:47:14 -07:00