forked from molecule-ai/molecule-core
fix/cherry-3-files-vitest-postgres-e2eapi
26 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
4f4b6c4f90 |
test(runtime): pin PR #2756's card-vs-setup decoupling with build_routes helper
PR #2756's contract — card route always mounted regardless of adapter.setup() outcome — lived inline in main.py's `# pragma: no cover` boot sequence. A future refactor that re-coupled the two would have silently bypassed PR #2756 and shipped the original "stuck booting forever" UX again, with no pytest catching it. This change extracts route assembly into workspace/boot_routes.py's build_routes(card, executor, adapter_error) and pins the contract with 6 integration tests using Starlette's TestClient: - test_card_route_serves_200_when_adapter_ready: happy path - test_card_route_serves_200_when_adapter_failed: misconfigured boot, card still 200, skill stubs survive - test_jsonrpc_returns_503_when_no_executor: full -32603 envelope with the adapter_error in error.data - test_jsonrpc_returns_503_with_generic_when_no_error_string: fallback reason for the rare case main.py reaches this branch without one - test_card_route_does_not_depend_on_executor: direct PR #2756 regression guard — both branches MUST mount the card route - test_executor_present_does_not_mount_not_configured_handler: sanity that a healthy workspace doesn't return -32603 to every request Conftest stubs extended with a2a.server.routes / request_handlers classes so the tests work under the existing a2a-mock infra (pattern matches the AgentCard/AgentSkill stubs added for PR #2765). main.py now calls build_routes; the inline if/else is gone. Same production behaviour, cleaner shape, regression-proof. Heavy a2a-sdk imports inside build_routes() are lazy (deferred to the executor-only branch) so tests that only exercise the not-configured path don't pull DefaultRequestHandler / InMemoryTaskStore. card_helpers + boot_routes registered in TOP_LEVEL_MODULES (build drift gate would have caught the missing entry on the wheel-publish smoke). All 18 related tests pass (test_boot_routes.py: 6, test_card_helpers.py: 6, test_not_configured_handler.py: 6). Closes #2761 Pairs with: PR #2756 (decouple agent-card from setup), PR #2765 (defensive isolation of enrichment + transcript) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
63ac99788b |
fix(runtime): isolate card-skill enrichment + transcript handler from adapter shape mismatch
PR #2756 added a try/except around adapter.setup() so a missing LLM key doesn't crash the workspace boot. Two paths that now run AFTER setup succeeds were not similarly isolated, leaving small but real coupling risks for future adapter authors. 1. **Skill metadata enrichment swap (main.py:248-259).** When adapter.setup() returns, main.py reads adapter.loaded_skills and replaces the static stubs in agent_card.skills with rich metadata (description, tags, examples). The list comprehension assumes each element exposes .metadata.{id,name,description,tags,examples}. A future adapter that returns a non-canonical shape would raise AttributeError, propagate to the outer except, capture as adapter_error, and silently degrade an OK boot to the not-configured state — even though setup() actually succeeded. Extract to card_helpers.enrich_card_skills(card, loaded_skills) → bool. Helper swallows enrichment failures, logs the cause, returns False, leaves the static stubs in place. setup() success path continues unchanged. 6 unit tests cover: None input, empty list, canonical happy path, missing .metadata attr, partial .metadata (missing one canonical field), atomic-failure-no-partial-swap. 2. **/transcript handler (main.py:513).** Calls await adapter.transcript_lines(...) without try/except. BaseAdapter's default returns {"supported": false} so today's 4 adapters never trigger this — but a future adapter override that assumes setup() ran would surface as a 500 from Starlette's default error handler instead of a useful 503 with the exception class + message. Inline try/except returns 503 with the reason, matching the not-configured JSON-RPC handler's pattern. Both changes match the architectural principle the PR #2756 chain established: availability (workspace reachable) is decoupled from configuration / adapter behavior. Operators see useful errors instead of silent degradation; future adapter authors can't accidentally break tenant readiness with a shape mismatch. Adds: - workspace/card_helpers.py (~50 lines, 100% covered) - workspace/tests/test_card_helpers.py (6 tests) - AgentCard/AgentSkill/AgentCapabilities/AgentInterface stubs to workspace/tests/conftest.py so future card-related tests work under the existing a2a-mock infrastructure - card_helpers in TOP_LEVEL_MODULES (drift gate would have caught it) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4b35d25d86 |
fix(runtime): decouple agent-card readiness from adapter.setup()
Today, if `adapter.setup()` raises (most often: an LLM credential is
missing/rotated), main.py crashes before the agent-card route is mounted.
start.sh restart-loops, /.well-known/agent-card.json never returns 200,
and the workspace is invisible to the bench/canvas — operators see
"stuck booting forever" with no clear error to act on.
The agent-card is a static capability advertisement (name, version,
skills, supported protocols). It doesn't need a working LLM. Coupling
its mount to setup() conflates *availability* ("am I up?") with
*configuration* ("can I actually answer?"). They're different concerns.
This change:
- Builds AgentCard from `config.skills` (static names from config.yaml)
BEFORE adapter.setup(), so the route mounts independent of setup state.
- Wraps setup() + create_executor in try/except. On success, mounts
the real DefaultRequestHandler with rich loaded_skills metadata
swapped into the card in-place. On failure, mounts a JSON-RPC
handler that returns -32603 "agent not configured" with the
setup() exception in error.data.
- Heartbeat keeps running on misconfigured boots so the platform
marks the workspace as reachable-but-misconfigured rather than
crash-looping. Operators redeploy with corrected env without
chasing a restart loop.
- initial_prompt and idle_loop are skipped on misconfigured boots —
they self-fire to /, which would land in -32603 anyway, and the
marker would consume on the first useless attempt.
Bench impact (RFC #388 strict <120s): codex/openclaw bench-time-outs
were the agent-card-never-returns-200 symptom. With this fix those
runtimes serve the card immediately on EC2 boot, so the bench
measures infrastructure cold-start (claude-code class: ~50–80s)
instead of credential-coupled boot.
Adds workspace/not_configured_handler.py (factory + module-level so
behavior is unit-testable; main.py is `# pragma: no cover`) and
workspace/tests/test_not_configured_handler.py (6 tests covering
status code, JSON-RPC envelope shape, id-echo, malformed-body
fallback, reason surfacing, batch-body safety).
All 1665 existing workspace tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
71e7a6ffee |
feat(workspace): wire EventLog into adapter base (#119 PR-3b)
Adds adapter.event_log property+setter on BaseAdapter so adapters can emit structured events (tool dispatch, skill load, executor errors) without coupling to the chosen backend. Default is a shared no-op DisabledEventLog; main.py overrides at boot from the observability.event_log config block (PR-2 schema). The shape is intentionally additive: - Property is invisible to the BaseAdapter signature snapshot drift gate (the helper walks vars(cls) for callables only — properties are not callable). Verified with a regression test in the new test_adapter_base_event_log.py. - Existing adapters continue to work unchanged. Template repos that never call self.event_log get the no-op for free. - Setter accepts any EventLogBackend, so swapping memory↔disabled at runtime (or to a future Redis backend) requires no adapter code change. Sequels: - PR-3c: emit events from claude-code/hermes adapters at the natural points (tool dispatch, skill load). - PR-4: skill-compat audit + SKILL.md frontmatter docs. - Platform-side /workspaces/:id/activity endpoint reads the buffer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
efa68a26b1 |
feat(workspace): wire observability config into heartbeat + uvicorn (#119 PR-3a)
Replaces the hard-coded HEARTBEAT_INTERVAL=30 in heartbeat.py and log_level="info" in main.py with values from ObservabilityConfig (#119 PR-1, schema landed in PR #2538). Concrete plumbing: - heartbeat.HeartbeatLoop accepts an `interval_seconds=` keyword arg. Defaults to the legacy module constant so 2-arg callers (existing tests, any downstream code that hasn't been updated) keep their existing 30s behavior. - main.py constructs HeartbeatLoop with config.observability.heartbeat_interval_seconds — the value the config parser already clamped to [5, 300]. - main.py's uvicorn.Config takes log_level from config.observability.log_level (lowercased — uvicorn's convention differs from Python logging's) with LOG_LEVEL env still winning as an ops-side debugging override. Adapter EventLog wiring deferred to PR-3b (#208 follow-up) — touches adapter_base interface + needs careful design, kept separate to keep this PR small + reviewable. Tests: - test_heartbeat.py: 3 new tests pin default interval, explicit override, and the [5, 300] band that the constructor accepts without re-clamping (clamping is the parser's job). - All 88 tests in test_heartbeat.py + test_config.py pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
aacaba024c |
feat(wheel-smoke): exercise executor.execute() to catch lazy imports (#2275)
The existing wheel-publish smoke (`wheel_smoke.py`) only IMPORTS `molecule_runtime.main` at module scope. Lazy imports buried inside `async def execute(...)` bodies (e.g. `from a2a.types import FilePart`) NEVER evaluate at static-import time — they crash at first message delivery in production. The 2026-04-2x v0→v1 a2a-sdk migration shipped 5 such regressions in templates that all looked fine at module-load smoke. This change adds `smoke_mode.py` plus a `MOLECULE_SMOKE_MODE=1` short-circuit in `main.py`: after `adapter.create_executor(...)`, the boot path invokes `executor.execute(stub_ctx, stub_queue)` once with a 5s timeout (`MOLECULE_SMOKE_TIMEOUT_SECS`). Healthy import tree → execution proceeds far enough to hit a network boundary and times out (exit 0). Broken lazy import → `ImportError` / `ModuleNotFoundError` from inside the executor body (exit 1). Other downstream errors (auth, validation) pass — those are caught by adapter-level tests, not this gate. Stub `(RequestContext, EventQueue)` is built from the real a2a-sdk so SendMessageRequest/RequestContext constructor changes also surface as import-tree failures (the regression class also includes "SDK refactored mid-publish"). The stub-build itself is wrapped — if it raises, that's a smoke fail too. Phase 2 (separate PR, molecule-ci) wires this into publish-template-image.yml so the publish gate runs the boot smoke against every template image before pushing the tag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e955597a98 |
feat(chat_files): rewrite Download as HTTP-forward (RFC #2312, PR-D)
Mirrors PR-C's Upload migration: replaces the docker-cp tar-stream extraction with a streaming HTTP GET to the workspace's own /internal/file/read endpoint. Closes the SaaS gap for downloads — without this PR, GET /workspaces/:id/chat/download still returns 503 on Railway-hosted SaaS even after A+B+C+F land. Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → PR-F #2319 → this PR. Why a single broad /internal/file/read instead of /internal/chat/download: Today's chat_files.go::Download already accepts paths under any of the four allowed roots {/configs, /workspace, /home, /plugins} — it's not strictly chat. Future PRs (template export, etc.) will reuse this endpoint via the same forward pattern; reusing avoids three near- identical handlers (one per domain) with duplicated path-safety logic. Path safety is duplicated on platform + workspace sides — defence in depth via two parallel checks, not "trust the workspace." Changes: * workspace/internal_file_read.py — Starlette handler. Validates path (must be absolute, under allowed roots, no traversal, canonicalises cleanly). lstat (not stat) so a symlink at the path doesn't redirect the read. Streams via FileResponse (no buffering). Mirrors Go's contentDispositionAttachment for Content-Disposition header. * workspace/main.py — registers GET /internal/file/read alongside the POST /internal/chat/uploads/ingest from PR-B. * scripts/build_runtime_package.py — adds internal_file_read to TOP_LEVEL_MODULES so the publish-runtime cascade rewrites its imports correctly. Also includes the PR-B additions (internal_chat_uploads, platform_inbound_auth) since this branch was rooted before PR-B's drift-gate fix; merge-clean alphabetic additions. * workspace-server/internal/handlers/chat_files.go — Download rewritten as streaming HTTP GET forward. Resolves workspace URL + platform_inbound_secret (same shape as Upload), builds GET request with path query param, propagates response headers (Content-Type / Content-Length / Content-Disposition) + body. Drops archive/tar + mime imports (no longer needed). Drops Docker-exec branch entirely — Download is now uniform across self-hosted Docker and SaaS EC2. * workspace-server/internal/handlers/chat_files_test.go — replaces TestChatDownload_DockerUnavailable (stale post-rewrite) with 4 new tests: - TestChatDownload_WorkspaceNotInDB → 404 on missing row - TestChatDownload_NoInboundSecret → 503 on NULL column (with RFC #2312 detail in body) - TestChatDownload_ForwardsToWorkspace_HappyPath → forward shape (auth header, GET method, /internal/file/read path) + headers propagated + body byte-for-byte - TestChatDownload_404FromWorkspacePropagated → 404 from workspace propagates (NOT remapped to 500) Existing TestChatDownload_InvalidPath path-safety tests preserved. * workspace/tests/test_internal_file_read.py — 21 tests covering _validate_path matrix (absolute, allowed roots, traversal, double- slash, exact-match-on-root), 401 on missing/wrong/no-secret-file bearer, 400 on missing path/outside-root/traversal, 404 on missing file, happy-path streaming with correct Content-Type + Content-Disposition, special-char escaping in Content-Disposition, symlink-redirect-rejection (lstat-not-stat protection). Test results: * go test ./internal/handlers/ ./internal/wsauth/ — green * pytest workspace/tests/ — 1292 passed (was 1272 before PR-D) Refs #2312 (parent RFC), #2308 (chat upload+download 503 incident). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
055e447355 |
feat(saas): deliver platform_inbound_secret via /registry/register (RFC #2312, PR-F)
Closes the SaaS-side gap that PR-A acknowledged but didn't fix: SaaS workspaces have no persistent /configs volume, so the platform_inbound_secret that PR-A's provisioner wrote at workspace creation never reaches the runtime. Without this, even after the entire RFC #2312 stack lands, SaaS chat upload would 401 (workspace fails-closed when /configs/.platform_inbound_secret is missing). Solution: return the secret in the /registry/register response body on every register call. The runtime extracts it and persists to /configs/.platform_inbound_secret at mode 0600. Idempotent — Docker- mode workspaces also receive it and overwrite the value the provisioner already wrote (same value until rotation). Why on every register, not just first-register: * SaaS containers can be restarted (deploys, drains, EBS detach/ re-attach) — /configs is rebuilt empty on each fresh start. * The auth_token is "issue once" because re-issuing rotates and invalidates the previous one. The inbound secret has no rotation flow yet (#2318) so re-sending the same value is harmless. * Eliminates the bootstrap window where a restarted SaaS workspace has no inbound secret on disk and would 401 every platform call. Changes: * workspace-server/internal/handlers/registry.go — Register handler reads workspaces.platform_inbound_secret via wsauth.ReadPlatformInboundSecret and includes it in the response body. Legacy workspaces (NULL column) get a successful registration with the field omitted. * workspace-server/internal/handlers/registry_test.go — two new tests: - TestRegister_ReturnsPlatformInboundSecret_RFC2312_PRF: secret present in DB → secret in response, alongside auth_token. - TestRegister_NoInboundSecret_OmitsField: NULL column → field omitted, registration still 200. * workspace/platform_inbound_auth.py — adds save_inbound_secret(secret). Atomic write via tmp + os.replace, mode 0600 from os.open(O_CREAT, 0o600) so a concurrent reader never sees 0644-default. Resets the in-process cache after write so the next get_inbound_secret() returns the freshly-written value (rotation-safe when it lands). * workspace/main.py — register-response handler extracts platform_inbound_secret alongside auth_token and persists via save_inbound_secret. Mirrors the existing save_token pattern. * workspace/tests/test_platform_inbound_auth.py — 6 new tests for save_inbound_secret: writes file, mode 0600, overwrite-existing, cache invalidation after save, empty-input no-op, parent-dir creation for fresh installs. Test results: * go test ./internal/handlers/ ./internal/wsauth/ — all green * pytest workspace/tests/ — 1272 passed (was 1266 before this PR) Refs #2312 (parent RFC), #2308 (chat upload 503 incident). Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d1de330152 |
feat(workspace): /internal/chat/uploads/ingest endpoint (RFC #2312, PR-B)
Stacked on PR-A (#2313). The platform-side rewrite that actually calls this endpoint lands in PR-C; this PR adds the workspace-side consumer + hardening so PR-C is a small Go-only diff. What this adds: * platform_inbound_auth.py — auth gate mirroring transcript_auth.py. Reads /configs/.platform_inbound_secret (delivered by the PR-A provisioner). Fail-closed when the file is missing/empty/unreadable. Constant-time compare via hmac.compare_digest. * internal_chat_uploads.py — POST /internal/chat/uploads/ingest. Multipart parse → sanitize each filename → write to /workspace/.molecule/chat-uploads/<random>-<name> with O_CREAT|O_EXCL|O_NOFOLLOW. Same response shape (uri/name/mimeType/ size + workspace: URI scheme) as the legacy Go handler — canvas / agent code that resolves "workspace:..." paths keeps working. * Wired into workspace/main.py via starlette_app.add_route alongside the existing /transcript route. * python-multipart>=0.0.18 added to requirements.txt (Starlette's Request.form() needs it; ≥ 0.0.18 closes CVE-2024-53981). Test coverage (36 tests, all green; full workspace suite 1266 passed): * test_platform_inbound_auth.py — 14 tests: happy path, fail-closed on missing file, empty file, whitespace- only file, missing/case-wrong/empty Bearer prefix, in-process cache, default CONFIGS_DIR fallback, end-to-end file → authorized. * test_internal_chat_uploads.py — 22 tests: sanitize_filename matrix (incl. ../traversal, CJK chars, length truncation), 401 on missing/wrong/no-secret-file bearer, single + batch upload happy paths, unique random prefix on duplicate names, mimetype guess fallback, 400 on missing files field, 413 on per- file + total-body oversize, symlink-at-target refusal (with sentinel-content unchanged assertion). Why this is safe to ship before PR-C: * No platform-side caller yet → no behavior change visible to users. * Auth fails closed; nothing on the network can hit a write path until the platform forwards with the matching bearer. * Workspace's existing routes (/health, /transcript, /handle/*) are unchanged. Refs #2312 (parent RFC), #2308 (chat upload 503 incident). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e9a59cda3b |
feat(platform): single-source-of-truth tool registry — adapters consume, no drift
Establishes workspace/platform_tools/registry.py as THE place tool
naming and docs live. Every consumer reads from it; nothing duplicates
the source. Closes the architectural gap behind the doc/tool drift
discussion 2026-04-28 — adding hundreds of future runtime SDK adapters
should not require touching tool names anywhere except the registry.
What the registry owns
ToolSpec dataclass with: name, short (one-line description), when_to_use
(multi-paragraph agent-facing usage guidance), input_schema (JSON Schema),
impl (the actual coroutine in a2a_tools.py), section ('a2a' | 'memory').
TOOLS list with 8 entries — delegate_task, delegate_task_async,
check_task_status, list_peers, get_workspace_info, send_message_to_user,
commit_memory, recall_memory.
What now reads from the registry
- workspace/a2a_mcp_server.py
The hardcoded TOOLS list (167 lines of hand-maintained dicts) is
gone. Replaced with a 6-line list comprehension over the registry.
MCP description = spec.short. inputSchema = spec.input_schema.
- workspace/executor_helpers.py
get_a2a_instructions(mcp=True) and get_hma_instructions() now
GENERATE the agent-facing system-prompt text from the registry.
Heading + per-tool bullet (spec.short) + per-tool when_to_use +
a section-specific footer. No more hand-maintained instruction
blocks that drift from reality.
- workspace/builtin_tools/delegation.py
Renamed delegate_to_workspace -> delegate_task_async to match
registry. check_delegation_status -> check_task_status. Added
sync delegate_task @tool wrapping a2a_tools.tool_delegate_task
(was missing for LangChain runtimes — CP review Issue 3).
- workspace/builtin_tools/memory.py
Renamed search_memory -> recall_memory to match registry.
- workspace/adapter_base.py, workspace/main.py
Bundle all 7 core tools (was 6) into all_tools / base_tools.
- workspace/coordinator.py, shared_runtime.py, policies/routing.py
Updated system-prompt-text references to use the registry names.
Structural alignment tests
workspace/tests/test_platform_tools.py — 9 tests pin every
registry-to-adapter mapping:
- registry names are unique
- a2a + memory partition is complete (no orphans)
- by_name lookup works
- MCP server registers exactly the registry's tool set
- MCP description equals registry.short for every tool
- MCP inputSchema equals registry.input_schema for every tool
- get_a2a_instructions text contains every a2a tool name
- get_hma_instructions text contains every memory tool name
- pre-rename names (delegate_to_workspace, search_memory,
check_delegation_status) cannot leak back
Adding a future tool means adding one ToolSpec; the test failure
list tells the author exactly which adapter to update.
Adapter pattern for future SDK support
When (e.g.) AutoGen or Pydantic AI gets adapters, the only work
needed for tool surfacing is "wrap registry.TOOLS in your SDK's
tool format." Names, descriptions, schemas, impl come from the
registry — adapter author writes zero strings.
Why this needed to ship now
PR #2237 (already in staging) injected MCP-world docs as the
default system-prompt content. Without the registry, those docs
said "delegate_task" while LangChain runtimes only had
"delegate_to_workspace" — workers see docs for tools that don't
exist (CP review Issue 1+3). PR #2239 was a tactical rename;
this PR is the structural fix that prevents the same class of
drift from recurring as new adapters ship.
PR #2239 was closed in favor of this — same renames, plus the
registry, plus structural tests. Single coherent change.
Tests: 1232 pass, 2 xfailed (pre-existing). 9 new in
test_platform_tools.py; 4 alignment tests in test_prompt.py from
#2237 still pass; original test_executor_helpers tests adapted to
the registry-driven world.
Refs: CP review Issues 1, 2, 3, 5; project memory
project_runtime_native_pluggable.md (platform owns A2A);
project memory feedback_doc_tool_alignment.md (this is the structural
fix for the tactical lesson).
|
||
|
|
3eb599bbb6 |
fix(workspace): use SDK constant for agent-card readiness probe
The initial-prompt readiness probe in workspace/main.py hardcoded the
pre-1.x well-known path. After the a2a-sdk 1.x bump the SDK started
mounting the agent card at the new canonical path (the value of
`a2a.utils.constants.AGENT_CARD_WELL_KNOWN_PATH`), so the probe
returned 404 every attempt and silently fell through to "server not
ready after 30s, skipping". Net effect: every workspace silently
dropped its `initial_prompt` from config.yaml — the agent never sent
the kickoff self-message, and users hit a fresh chat with no context.
Reported by an external user as "/.well-known/agent.json 404 — the
a2a-sdk agent card route was not being mounted at the expected path".
The route IS mounted; the probe was looking at the wrong place.
Fix imports `AGENT_CARD_WELL_KNOWN_PATH` from `a2a.utils.constants`
and uses it directly in the probe URL — the SDK constant is now the
single source of truth, so any future rename travels through
automatically.
Adds two static regression tests pinning the invariant:
1. No hardcoded `/.well-known/agent.json` literal anywhere in
main.py.
2. The probe URL fstring interpolates AGENT_CARD_WELL_KNOWN_PATH
(catches a "fix" that imports the constant for show but reverts
to a literal in the actual GET).
Verified manually inside ghcr.io/molecule-ai/workspace-template-langgraph
that AGENT_CARD_WELL_KNOWN_PATH == '/.well-known/agent-card.json' and
that `create_agent_card_routes(card)` mounts at exactly that path —
constant + mount are aligned in the runtime image, so the probe will
now find the server.
Full workspace test suite: 1209 passed, 2 xfailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
81c4c1321c |
fix(runtime): use lowercase wire role for v0.3 JSON-RPC compat layer
Manual-test failure surfaced what was hidden behind the MCP-path bug:
once delegate_task could actually fire, every cross-workspace call
came back as JSON-RPC -32600 "Invalid Request" with the underlying
pydantic ValidationError:
params.message.role
Input should be 'agent' or 'user' [type=enum,
input_value='ROLE_USER', input_type=str]
PR #2184's a2a-sdk 1.x migration sweep over-corrected: it changed
every `"role": "user"` literal in JSON-RPC payload construction to
`"role": "ROLE_USER"` to match the protobuf enum names of the 1.x
native types (a2a.types.Role.ROLE_USER / ROLE_AGENT). That was
correct for in-process Message construction (which the SDK
serialises before wire transmission) but WRONG for the 8 sites that
hand-build JSON-RPC payloads. The workspace's own a2a-sdk runs
inbound requests through the v0.3 compat adapter
(/usr/local/lib/python3.11/site-packages/a2a/compat/v0_3/) because
main.py sets enable_v0_3_compat=True for backwards compatibility,
and that adapter validates against the v0.3 Pydantic Role enum
(`agent` | `user` lowercase). The protobuf-style names blow it up.
Reverted the 8 wire-payload sites to lowercase:
- workspace/a2a_client.py:74
- workspace/a2a_cli.py:74, 111
- workspace/heartbeat.py:378
- workspace/main.py:464, 563
- workspace/builtin_tools/a2a_tools.py:60
- workspace/builtin_tools/delegation.py:272
Native-type usage at workspace/a2a_executor.py:471 (`Role.ROLE_AGENT`)
stays — that's an in-process Message construction; the SDK handles
wire serialisation correctly.
Updated the misleading comment at main.py:255-257 (which said
"outbound payloads are now 1.x-shaped (ROLE_USER)") to spell out
the actual rule: outbound JSON-RPC wire payloads MUST use v0.3
shape, native types are only for in-process construction.
New regression test test_jsonrpc_wire_role_format.py greps the 6
wire-payload-emitting files for any "ROLE_USER" / "ROLE_AGENT"
string literal and fails loud — cheapest possible drift detector.
Why E2E missed it: the priority-runtimes harness sends a single
message canvas → workspace, but the canvas already used lowercase
"user" (it never went through the migration sweep). The bug only
surfaces on workspace → workspace delegation, which the harness
doesn't exercise. Same gap as #131 (extend smoke to call main()
against a stub).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
dd57a840b6 |
fix: comprehensive a2a-sdk 1.x migration sweep across workspace/
Audited every a2a-sdk surface in workspace/ against the installed
1.0.2 wheel. Found and fixed:
main.py (the live workspace startup path):
• create_jsonrpc_routes(rpc_url='/', enable_v0_3_compat=True) —
rpc_url required in 1.x; v0.3 compat enables inbound legacy
clients (`"role": "user"` lowercase) without forcing them to
upgrade. Pairs with the outbound rename below.
a2a_executor.py:
• TextPart/FilePart/FileWithUri removed in 1.x. Part is now a
flat proto message: Part(text=…) / Part(url=…, filename=…,
media_type=…). Updated the file-attachment branch (only
reachable when an agent emits files; the harness's PONG path
didn't exercise this, but it's a latent crash).
• Message field names: messageId/taskId/contextId →
message_id/task_id/context_id (proto3 snake_case).
• Role enum: Role.agent → Role.ROLE_AGENT (proto enum).
Outbound JSON-RPC payloads (8 files):
• "role": "user" → "role": "ROLE_USER" — proto3 JSON serialization
is strict about enum values. Sites: a2a_client, a2a_cli, main
(initial+idle prompts), heartbeat, builtin_tools/a2a_tools,
builtin_tools/delegation. Wire JSON keys stay camelCase
(proto3 default), only the role enum value changed.
google-adk/adapter.py:
• new_agent_text_message → new_text_message (4 sites). This
adapter's directory has a hyphen, so it can't be imported as a
Python module — effectively dead code, but the wheel ships the
file and a future fix should keep it correct against 1.x.
Why one PR instead of seven: every previous a2a-sdk migration find
landed as its own publish → cascade → harness → next-bug cycle.
Today's audit ran every a2a-sdk symbol/type/method in workspace/
against the installed 1.0.2 wheel in a single sweep + tested the
critical paths (Message construction, Part construction, Role enum
parsing) against the actual SDK. Should be the last migration PR.
Verified locally:
python3 scripts/build_runtime_package.py --version 0.1.99 \
--out /tmp/build-final
pip install /tmp/build-final
python -c "import molecule_runtime.main; \
from molecule_runtime.a2a_executor import LangGraphA2AExecutor"
→ ✓ all imports clean against a2a-sdk 1.0.2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c80b3ff0eb |
fix: pass rpc_url='/' to create_jsonrpc_routes (a2a-sdk 1.x requirement)
7th a2a-sdk migration find from the v0 → v1 transition. create_jsonrpc_routes() now requires rpc_url as a positional arg (was implicit at root in 0.x). Pass '/' to match a2a.utils.constants.DEFAULT_RPC_URL — that's also what workspace-server's a2a_proxy.go forwards to (POSTs to workspace URL without appending a path). Symptom before fix: every workspace startup crashed with TypeError: create_jsonrpc_routes() missing 1 required positional argument: 'rpc_url' Caught by harness 9 phase 4 (claude-code + langgraph both on 0.1.24). The user's "use langgraph for fast iteration" call cut the diagnose cycle from 15min to ~30s — without that, this would have taken another hermes round-trip to surface. Updated reference_a2a_sdk_v0_to_v1_migration.md memory with this entry alongside the previous 6 finds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
6859099a08 |
fix: pass agent_card to DefaultRequestHandler (a2a-sdk 1.x requirement)
a2a-sdk 1.x added agent_card as a required argument to DefaultRequestHandler.__init__. main.py constructed it with only agent_executor + task_store, so every workspace startup that reached the handler init step crashed with: TypeError: DefaultRequestHandlerV2.__init__() missing 1 required positional argument: 'agent_card' This is the 6th a2a-sdk migration find from the v0 → v1 transition (see reference_a2a_sdk_v0_to_v1_migration memory). Pattern is the same: SDK exposes a new required arg, our call site needs to pass the existing object we already construct upstream. Why the import-only smoke gates didn't catch this: it's a call-time constructor error inside `async def main()`, not a module load error. The runtime-pin-compat smoke imports main_sync but doesn't invoke main() against a real config. Worth filing a follow-up to extend the smoke to a "construct + dispose" cycle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
851fd21fb1 |
fix(workspace): rename supported_protocols → supported_interfaces (a2a-sdk 1.0)
CRITICAL: every workspace boot since the a2a-sdk 1.0 migration (#1974) has been crashing at AgentCard construction with: ValueError: Protocol message AgentCard has no "supported_protocols" field The protobuf field is `supported_interfaces` (plural, interfaces — see a2a-sdk types/a2a_pb2.pyi:189). The 0.3→1.0 migration left the kwarg as `supported_protocols`, which doesn't exist in the 1.0 schema, so the constructor raises before any subsequent line of main runs. Why this hid for so long: - publish-runtime.yml's smoke step only IMPORTED molecule_runtime.main; importing the module is fine, only CONSTRUCTING the AgentCard fails - The user-visible symptom is "Workspace failed: " with empty last_sample_error, indistinguishable from generic boot timeouts - The state_transition_history=True bug (fixed in #2179) was a sibling of this — same migration, same class, just caught first Fix is symmetric with #2179: 1. workspace/main.py: rename the kwarg + comment explaining why 2. .github/workflows/publish-runtime.yml: extend the smoke block to instantiate AgentCard with the exact production call shape, so the next field-rename of this class fails at publish time instead of breaking every workspace startup Verification: - Constructed AgentCard against fresh a2a-sdk 1.0.2 in a clean venv with the corrected kwarg → succeeds - Constructed it with the original `supported_protocols` kwarg → fails immediately with the exact error production sees - Smoke test pinned to mirror main.py's exact call shape; main.py + smoke must stay in lockstep going forward Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
12d446bc8e |
docs: explain why state_transition_history is gone (research-backed)
Adds a comment block citing a2a-sdk's own a2a/compat/v0_3/conversions.py, which says verbatim: state_transition_history=None, # No longer supported in v1.0 So a future reader who notices the missing kwarg won't try to add it back. The capability is now universal: every v1.x Task carries a history list and tasks/get supports historyLength via the apply_history_length helper. No flag because nothing's optional. Confirmed by reading the SDK source directly: - a2a/types.py AgentCapabilities exposes only: streaming, push_notifications, extensions, extended_agent_card. - a2a/compat/v0_3/conversions.py explicitly maps None when down-converting v1 → v0.3 (deliberate removal, not rename). - a2a/server/request_handlers/default_request_handler_v2.py uses apply_history_length(task, params) — agent doesn't opt in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f531fe1367 |
fix: drop state_transition_history field — removed in a2a-sdk 1.x
a2a-sdk 1.x's AgentCapabilities only exposes 4 fields: streaming, push_notifications, extensions, extended_agent_card. The state_transition_history field was removed in the v1 protobuf schema. main.py still passed it as a kwarg, so every workspace that reached the AgentCard construction step (line 188) crashed: ValueError: Protocol message AgentCapabilities has no "state_transition_history" field Symptom: every claude-code + hermes workspace stuck in `provisioning` forever — caught when the user provisioned a Design Director crew manually via the canvas while harness 5 was running. Why every prior smoke gate missed it: - runtime-pin-compat.yml smokes `from molecule_runtime.main import main_sync` — only imports the module. AgentCapabilities() runs inside `async def main()`, not at module load. - Template image boot smoke does `import every /app/*.py` — same story. main.py imports fine; the field error only fires at call. The fix is one line — drop the kwarg. Fields we actually need (streaming + push_notifications) are still passed. Follow-up worth filing: smoke step that instantiates Adapter() + calls a no-op setup() against a stub config. That would have caught this before publish. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3df5867b56 |
fix: restore main_sync entry point in workspace/main.py
The wheel's pyproject.toml has declared `molecule-runtime = "molecule_runtime.main:main_sync"` since the publish pipeline was created on 2026-04-26, but the function itself was never present in workspace/main.py — it lived in the pre-monorepo molecule-ai-workspace-runtime repo and was lost during the consolidation that made workspace/ the source of truth. The 0.1.15 wheel still had main_sync from a leftover snapshot, so the regression went unnoticed until 0.1.16 (the first wheel built from the new source-of-truth) shipped. Symptom: every workspace container restart loops with ImportError: cannot import name 'main_sync' from 'molecule_runtime.main' — the molecule-runtime CLI script's first line tries to import the missing symbol. Workspaces stay in `provisioning` until the 10-min sweep marks them failed. Caught by .github/workflows/runtime-pin-compat.yml, which already imports the symbol by name as its smoke test. (That check kept failing red on every recent merge_group run; this PR fixes the underlying symbol-not-found instead of the smoke step.) Also strengthens publish-runtime.yml's wheel smoke from `import molecule_runtime.main` (loads the module — passes even when entry-point target is missing) to `from molecule_runtime.main import main_sync` (the actual contract the CLI script needs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d0057912d2 |
feat(skills): per-skill runtime compatibility (#119, hermes pattern)
SKILL.md frontmatter can now declare `runtime: [claude-code]` or `runtime: [hermes, claude-code]` to opt out of incompatible adapters instead of failing at first invocation. Default `["*"]` means universal — existing skill libraries need zero migration. Borrowed from hermes' declarative skill-compat pattern surfaced in the hermes architecture survey. The remaining two patterns (event-log layer, observability config block) stay open under #119. Wiring: - SkillMetadata.runtime: list[str] = ["*"] - _normalize_runtime_field accepts list, string-sugar, missing -> ["*"]; malformed warns and falls back to universal so a typo never silently drops a skill. - load_skills(..., current_runtime=...) filters out skills whose runtime list lacks "*" or current_runtime, with an INFO log line. - BaseAdapter.start passes type(self).name() so the live adapter drives the filter; SkillsWatcher takes the same kwarg so hot-reload honors it. 8 new tests cover default universal, no-field universal, explicit match/mismatch, string sugar, wildcard short-circuit, current_runtime=None (preserves old behavior), and malformed-warns-not-drops. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
65b531acf6 |
fix(workspace): tag self-originated A2A POSTs with X-Workspace-ID
Workspace runtime fired four classes of A2A request to the platform
without the X-Workspace-ID header that identifies the source
workspace: heartbeat self-messages, initial_prompt, idle-loop fires,
and peer-to-peer A2A from runtime tools. The platform's a2a_receive
logger keys source_id off that header — without it, every such row
was written with source_id=NULL, which the canvas's My Chat tab
filters as ?source=canvas (i.e. "user typed this") and rendered the
internal triggers as if the human user had sent them. The
"Delegation results are ready..." heartbeat trigger was visible to
end users in the chat history; delegate_task A2A calls between agents
were misclassified the same way.
Centralise the header construction in a new platform_auth helper
self_source_headers(workspace_id) that returns auth_headers() PLUS
{X-Workspace-ID: <id>}. Apply it to:
- heartbeat.py self-message (refactored from inline header dict)
- main.py initial_prompt POST
- main.py idle_prompt POST
- a2a_client.py send_a2a_message (peer A2A from runtime)
- builtin_tools/a2a_tools.py delegate_task (was missing ALL headers)
Tests:
- test_heartbeat.py asserts the X-Workspace-ID header is set on
the self-message POST.
- test_a2a_tools_module.py asserts the same on delegate_task POSTs;
FakeClient.post mocks updated to accept the headers kwarg.
Production effect lands the moment workspace containers are rebuilt
with this code; existing rows in activity_logs keep their NULL
source_id (legacy data). The canvas-side filter (#follow-up)
covers the historical-rows case until backfill.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
94d9331c76 |
feat(canvas+platform): chat attachments, model selection, deploy/delete UX
Session's accumulated UX work across frontend and platform. Reviewable in four logical sections — diff is large but internally cohesive (each section fixes a gap the next one depends on). ## Chat attachments — user ↔ agent file round trip - New POST /workspaces/:id/chat/uploads (multipart, 50 MB total / 25 MB per file, UUID-prefixed storage under /workspace/.molecule/chat-uploads/). - New GET /workspaces/:id/chat/download with RFC 6266 filename escaping and binary-safe io.CopyN streaming. - Canvas: drag-and-drop onto chat pane, pending-file pills, per-message attachment chips with fetch+blob download (anchor navigation can't carry auth headers). - A2A flow carries FileParts end-to-end; hermes template executor now consumes attachments via platform helpers. ## Platform attachment helpers (workspace/executor_helpers.py) Every runtime's executor routes through the same helpers so future runtimes inherit attachment awareness for free: - extract_attached_files — resolve workspace:/file:///bare URIs, reject traversal, skip non-existent. - build_user_content_with_files — manifest for non-image files, multi-modal list (text + image_url) for images. Respects MOLECULE_DISABLE_IMAGE_INLINING for providers whose vision adapter hangs on base64 payloads (MiniMax M2.7). - collect_outbound_files — scans agent reply for /workspace/... paths, stages each into chat-uploads/ (download endpoint whitelist), emits as FileParts in the A2A response. - ensure_workspace_writable — called at molecule-runtime startup so non-root agents can write /workspace without each template having to chmod in its Dockerfile. Hermes template executor + langgraph (a2a_executor.py) + claude-code (claude_sdk_executor.py) all adopt the helpers. ## Model selection & related platform fixes - PUT /workspaces/:id/model — was 404'ing, so canvas "Save" silently lost the model choice. Stores into workspace_secrets (MODEL_PROVIDER), auto-restarts via RestartByID. - applyRuntimeModelEnv falls back to envVars["MODEL_PROVIDER"] so Restart propagates the stored model to HERMES_DEFAULT_MODEL without needing the caller to rehydrate payload.Model. - ConfigTab Tier dropdown now reads from workspaces row, not the (stale) config.yaml — fixes "badge shows T3, form shows T2". ## ChatTab & WebSocket UX fixes - Send button no longer locks after a dropped TASK_COMPLETE — `sending` no longer initializes from data.currentTask. - A2A POST timeout 15 s → 120 s. LLM turns routinely exceed 15 s; the previous default aborted fetches while the server was still replying, producing "agent may be unreachable" on success. - socket.ts: disposed flag + reconnectTimer cancellation + handler detachment fix zombie-WebSocket in React StrictMode. - Hermes Config tab: RUNTIMES_WITH_OWN_CONFIG drops 'hermes' — the adaptor's purpose IS the form, banner was contradictory. - workspace_provision.go auto-recovery: try <runtime>-default AND bare <runtime> for template path (hermes lives at the bare name). ## Org deploy/delete animation (theme-ready CSS) - styles/theme-tokens.css — design tokens (durations, easings, colors). Light theme overrides by setting only the deltas. - styles/org-deploy.css — animation classes + keyframes, every value references a token. prefers-reduced-motion respected. - Canvas projects node.draggable=false onto locked workspaces (deploying children AND actively-deleting ids) — RF's authoritative drag lock; useDragHandlers retains a belt-and- braces check. - Organ cancel button (red pulse pill on root during deploy) cascades via existing DELETE /workspaces/:id?confirm=true. - Auto fit-view after each arrival, debounced 500 ms so rapid sibling arrivals coalesce into one fit (previous per-event fit made the viewport lurch continuously). - Auto-fit respects user-pan — onMoveEnd stamps a user-pan timestamp only when event !== null (ignores programmatic fitView) so auto-fits don't self-cancel. - deletingIds store slice + useOrgDeployState merge gives the delete flow the same dim + non-draggable treatment as deploy. - Platform-level classNames.ts shared by canvas-events + useCanvasViewport (DRY'd 3 copies of split/filter/join). ## Server payload change - org_import.go WORKSPACE_PROVISIONING broadcast now includes parent_id + parent-RELATIVE x/y (slotX/slotY) so the canvas renders the child at the right parent-nested slot without doing any absolute-position walk. createWorkspaceTree signature gains relX, relY alongside absX, absY; both call sites updated. ## Tests - workspace/tests/test_executor_helpers.py — 11 new cases covering URI resolution (including traversal rejection), attached-file extraction (both Part shapes), manifest-only vs multi-modal content, large-image skip, outbound staging, dedup, and ensure_workspace_writable (chmod 777 + non-root tolerance). - workspace-server chat_files_test.go — upload validation, Content-Disposition escaping, filename sanitisation. - workspace-server secrets_test.go — SetModel upsert, empty clears, invalid UUID rejection. - tests/e2e/test_chat_attachments_e2e.sh — round-trip against a live hermes workspace. - tests/e2e/test_chat_attachments_multiruntime_e2e.sh — static plumbing check + round-trip across hermes/langgraph/claude-code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
35bcad9204
|
feat(workspace): migrate a2a-sdk from 0.3.x to 1.0.0 (KI-009) (#1974)
* feat(workspace): migrate a2a-sdk from 0.3.x to 1.0.0 (KI-009) Migrates all workspace code from a2a-sdk v0.3.x to v1.0.0, following the official migration guide from a2aproject/a2a-python. Breaking changes applied: - A2AStarletteApplication → Starlette route factory (create_agent_card_routes + create_jsonrpc_routes) - AgentCard.url removed; url+protocol now in supported_protocols[].url - AgentCapabilities fields renamed to snake_case (pushNotifications→push_notifications, stateTransitionHistory→state_transition_history) - AgentCard.defaultInputModes/outputModes → default_input_modes/output_modes - TaskState.canceled → TaskState.TASK_STATE_CANCELED - a2a.utils → a2a.helpers - Part(root=TextPart(text=t)) → Part(text=t) (TextPart removed) Files changed: - requirements.txt: pinned >=1.0.0,<2.0 - main.py: Starlette route factory + AgentCard restructure - a2a_executor.py: Part() + TaskState + helpers import - hermes_executor.py: TaskState + helpers import - google-adk/adapter.py: TaskState + helpers import - cli_executor.py: helpers import - claude_sdk_executor.py: helpers import - tests/conftest.py: a2a.helpers mock stub - tests/test_a2a_executor.py: TaskState enum key - adapters/google-adk/test_adapter.py: Part + helpers stub Refs: KI-009 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): update _TaskState mock to a2a-sdk v1 enum name (TASK_STATE_CANCELED) --------- Co-authored-by: Molecule AI Tech Researcher <tech-researcher@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> |
||
|
|
4675402e58
|
feat(workspace): pre-stop serialization for pause/resume (closes #1386)
Add a pre-stop hook that captures agent state before container exit and writes a scrubbed snapshot to /configs/.agent_snapshot.json. On restart, the snapshot is loaded and the adapter's restore_state() is called before the A2A server starts. - New lib/pre_stop.py: build_snapshot / write_snapshot / read_snapshot / delete_snapshot + _scrub_value deep-scrubber (uses lib.snapshot_scrub to redact API keys, tokens, and sandbox output before persisting) - BaseAdapter.pre_stop_state(): captures _executor._session_id and recent transcript_lines; overridden by adapters with richer in-memory state - BaseAdapter.restore_state(): stores snapshot fields as adapter attrs for create_executor() to pick up - main.py: calls pre_stop serialization in finally block (after server serves) and restore_state() after adapter setup, before server starts - Added 12 unit tests covering scrub, read/write, adapter integration Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
3bef6af241 |
fix: apply #1124 env-var defaults + scrub F1088 credentials from INCIDENT_LOG.md (#1347)
- PLATFORM_URL: replace unreachable http://platform:8080 mesh-only default with Docker-aware detection (host.docker.internal in containers, localhost for local dev) across all workspace Python modules and the git-token-helper shell script. - WORKSPACE_ID: add fail-fast validation in main.py (SystemExit if empty) consistent with coordinator.py / a2a_cli.py patterns already in place. - INCIDENT_LOG.md: replace all 3 F1088 credential types with ***REDACTED*** (sk-cp- 2x, github_pat_ 2x, ADMIN_TOKEN base64 3x). Fixes #1124, #1333. Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app> |
||
|
|
d8026347e5 |
chore: open-source restructure — rename dirs, remove internal files, scrub secrets
Renames: - platform/ → workspace-server/ (Go module path stays as "platform" for external dep compat — will update after plugin module republish) - workspace-template/ → workspace/ Removed (moved to separate repos or deleted): - PLAN.md — internal roadmap (move to private project board) - HANDOFF.md, AGENTS.md — one-time internal session docs - .claude/ — gitignored entirely (local agent config) - infra/cloudflare-worker/ → Molecule-AI/molecule-tenant-proxy - org-templates/molecule-dev/ → standalone template repo - .mcp-eval/ → molecule-mcp-server repo - test-results/ — ephemeral, gitignored Security scrubbing: - Cloudflare account/zone/KV IDs → placeholders - Real EC2 IPs → <EC2_IP> in all docs - CF token prefix, Neon project ID, Fly app names → redacted - Langfuse dev credentials → parameterized - Personal runner username/machine name → generic Community files: - CONTRIBUTING.md — build, test, branch conventions - CODE_OF_CONDUCT.md — Contributor Covenant 2.1 All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml, README, CLAUDE.md updated for new directory names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |