feat(channel-instructions): MANDATORY upload-resolution contract (RFC#640 Layer A) #42

Merged
plugin-dev merged 2 commits from feat/rfc-upload-resolution-mandatory-contract into main 2026-05-22 02:34:32 +00:00
Owner

Summary

RFC#640 4-layer cascade — Layer A (the SSOT spec edit). Adds a MANDATORY upload-resolution contract section to _build_channel_instructions so every TS adapter polling /activity implements the resolution flow uniformly. Pairs with the still-pending Layer B (TS inbox-uploads.ts port) and Layer D (AST-level contract test).

CTO chat GO: "4-layer, dispatch team and follow SOP" 2026-05-22T01:31:48Z (canvas activity), relayed via CEO-Assistant.

Empirical surface

2026-05-21 ~23:12Z agents-team: CTO pasted a PNG into the canvas. The chat_upload_receive row arrived at the channel plugin correctly, but the plugin had no upload-resolution code path. Result: the agent saw platform-pending:<ws>/<file_id> URIs it couldn't open. Zero error surfaced. Silent file loss.

Root cause was a two-tier asymmetry:

Layer Python SDK (workspace-runtime) TS base MCP (@molecule-ai/mcp-server)
Inbox poll inbox.py (production) re-implemented per-adapter
Upload resolution inbox_uploads.py (724 LOC, production) ZERO module — nothing
Spec contract implicit "the inbox poller swaps" implicit

Every TS adapter (channel plugin, telegram-style adapters, codex, hermes) hit the same gap because the spec didn't make resolution MANDATORY. This PR makes it mandatory.

What the contract says

Five enumerated steps, each named with its specific surface:

  1. Fetch via GET /workspaces/<ws>/pending-uploads/<file_id>/content (Bearer token = same as /activity polling, no separate auth).
  2. Persist to a local cache directory — adapter-specific path; the spec cites both ~/.claude/channels/molecule/inbox/<file_id>.<ext> (Claude Code channel plugin) and /workspace/.molecule/chat-uploads/<prefix>-<name> (Python in-container runtime) so consumers in both ecosystems have a reference.
  3. Ack via POST /workspaces/<ws>/pending-uploads/<file_id>/ack so the platform-side pending_uploads row is reclaimed by the Phase 3 sweep. Idempotent.
  4. URI cache — bounded LRU mapping platform-pending:<ws>/<file_id> → local file URI for the lifetime of the adapter process. Default 32 entries matching Python's URI_CACHE_MAX_ENTRIES.
  5. URI rewrite — when ANY subsequent activity row carries a platform-pending: URI (top-level attachments[] OR embedded message.parts[*].file.uri), rewrite to the cached local URI before delivering to the agent.

Skipping any step = silent file loss. The MANDATORY keyword + verb forms (MUST, MUST run) leave no ambiguity for downstream adapter authors.

Also updated in the existing instructions

  • attachments field bullet:
    • kind enumeration now includes video (matches the mc#1657 L1 flat-upload arm's video/* mime-prefix derivation; without this, downstream adapters would reject video rows as unknown-kind).
    • Soft "the inbox poller swaps" language replaced with hard "the adapter MUST resolve per the Upload resolution contract below" pointer — readers of the field doc are routed to the contract.

No other behavior changes. No new imports. No tool surface added.

Test envelope (tests/test_upload_resolution_contract.py, 14 cases)

TestUploadResolutionContractPresent — section header + MANDATORY keyword + trigger named.

TestUploadResolutionContractFiveSteps — each step has its specific surface call-out:

  • Step 1: GET /workspaces/<ws>/pending-uploads/<file_id>/content endpoint named exactly
  • Step 2: BOTH the Claude-Code cache path AND the Python in-container path mentioned
  • Step 3: POST /workspaces/<ws>/pending-uploads/<file_id>/ack endpoint named exactly
  • Step 4: URI cache + platform-pending:<ws>/<file_id> URI shape + URI_CACHE_MAX_ENTRIES constant
  • Step 5: rewrite phrasing + attachments[] surface + message.parts surface

TestUploadResolutionReferenceImplementations — Python reference path verbatim + TS reference path verbatim + Layer D contract test cited.

TestUploadResolutionStepOrdering — finds the section, scans 1. through 5. markers, asserts strictly ascending offsets. Catches a refactor that collapses two steps or reorders them.

TestUploadResolutionAttachmentBulletUpdated — the attachments bullet still names platform-pending:<ws>/<file_id> AND references "Upload resolution contract" by name (so adapters reading the field doc are pointed at the MUST-DO flow). Also pins the kind enumeration includes video.

Why structural tests: the instructions string is read by every spec-compliant MCP client at initialize and surfaced to the agent's system prompt automatically. If the contract section disappears under a copy-edit, every TS adapter still in development would start shipping without resolution — exactly the regression this layer fixes. The tests catch silent removal.

Cascade context

Layer Repo Status
A (this PR) molecule-ai-workspace-runtime now
B molecule-mcp-server (src/inbox-uploads.ts) pending — my next PR
C molecule-mcp-claude-channel CEO-A authors after B's npm publish
D molecule-mcp-server (AST contract test) pending — my parallel-with-B PR

A merges first (the spec is the SSOT — let it land before B/C ship so the contract is documented before the implementation). B + D land in parallel after A. New @molecule-ai/mcp-server version published with B. C lands consuming the new version.

Test plan

  • python3 -m pytest tests/test_upload_resolution_contract.py -v → 14 passed in 0.03s (verified on operator with editable install).
  • Doesn't break existing _build_channel_instructions consumers — there are no tests pinning specific instruction text beyond what this PR adds; the function is called from _build_initialize_result() and the change is purely additive (one new section + one bullet update).
  • LF line endings on commit (autocrlf=true on PC2).
  • CI all-green on this branch.
  • 2 substantive non-author reviewer approves. Per CEO-A's dispatch: core-be + core-qa (Python instruction text — core-be owns the Python surface, core-qa owns the structural-test envelope).

🤖 Generated with Claude Code

## Summary RFC#640 4-layer cascade — **Layer A** (the SSOT spec edit). Adds a MANDATORY upload-resolution contract section to `_build_channel_instructions` so every TS adapter polling `/activity` implements the resolution flow uniformly. Pairs with the still-pending Layer B (TS `inbox-uploads.ts` port) and Layer D (AST-level contract test). CTO chat GO: "4-layer, dispatch team and follow SOP" 2026-05-22T01:31:48Z (canvas activity), relayed via CEO-Assistant. ## Empirical surface 2026-05-21 ~23:12Z agents-team: CTO pasted a PNG into the canvas. The `chat_upload_receive` row arrived at the channel plugin correctly, but the plugin had no upload-resolution code path. Result: the agent saw `platform-pending:<ws>/<file_id>` URIs it couldn't open. **Zero error surfaced.** Silent file loss. Root cause was a two-tier asymmetry: | Layer | Python SDK (workspace-runtime) | TS base MCP (`@molecule-ai/mcp-server`) | |---|---|---| | Inbox poll | `inbox.py` (production) | re-implemented per-adapter | | Upload resolution | `inbox_uploads.py` (724 LOC, production) | **ZERO** module — nothing | | Spec contract | implicit "the inbox poller swaps" | implicit | Every TS adapter (channel plugin, telegram-style adapters, codex, hermes) hit the same gap because the spec didn't make resolution MANDATORY. This PR makes it mandatory. ## What the contract says Five enumerated steps, each named with its specific surface: 1. **Fetch** via `GET /workspaces/<ws>/pending-uploads/<file_id>/content` (Bearer token = same as /activity polling, no separate auth). 2. **Persist** to a local cache directory — adapter-specific path; the spec cites both `~/.claude/channels/molecule/inbox/<file_id>.<ext>` (Claude Code channel plugin) and `/workspace/.molecule/chat-uploads/<prefix>-<name>` (Python in-container runtime) so consumers in both ecosystems have a reference. 3. **Ack** via `POST /workspaces/<ws>/pending-uploads/<file_id>/ack` so the platform-side `pending_uploads` row is reclaimed by the Phase 3 sweep. Idempotent. 4. **URI cache** — bounded LRU mapping `platform-pending:<ws>/<file_id>` → local file URI for the lifetime of the adapter process. Default 32 entries matching Python's `URI_CACHE_MAX_ENTRIES`. 5. **URI rewrite** — when ANY subsequent activity row carries a `platform-pending:` URI (top-level `attachments[]` OR embedded `message.parts[*].file.uri`), rewrite to the cached local URI before delivering to the agent. Skipping any step = silent file loss. The MANDATORY keyword + verb forms (MUST, MUST run) leave no ambiguity for downstream adapter authors. ## Also updated in the existing instructions - `attachments` field bullet: - `kind` enumeration now includes `video` (matches the mc#1657 L1 flat-upload arm's `video/*` mime-prefix derivation; without this, downstream adapters would reject video rows as unknown-kind). - Soft "the inbox poller swaps" language replaced with hard "the adapter MUST resolve per the Upload resolution contract below" pointer — readers of the field doc are routed to the contract. No other behavior changes. No new imports. No tool surface added. ## Test envelope (`tests/test_upload_resolution_contract.py`, 14 cases) **TestUploadResolutionContractPresent** — section header + MANDATORY keyword + trigger named. **TestUploadResolutionContractFiveSteps** — each step has its specific surface call-out: - Step 1: `GET /workspaces/<ws>/pending-uploads/<file_id>/content` endpoint named exactly - Step 2: BOTH the Claude-Code cache path AND the Python in-container path mentioned - Step 3: `POST /workspaces/<ws>/pending-uploads/<file_id>/ack` endpoint named exactly - Step 4: URI cache + `platform-pending:<ws>/<file_id>` URI shape + `URI_CACHE_MAX_ENTRIES` constant - Step 5: rewrite phrasing + `attachments[]` surface + `message.parts` surface **TestUploadResolutionReferenceImplementations** — Python reference path verbatim + TS reference path verbatim + Layer D contract test cited. **TestUploadResolutionStepOrdering** — finds the section, scans ` 1.` through ` 5.` markers, asserts strictly ascending offsets. Catches a refactor that collapses two steps or reorders them. **TestUploadResolutionAttachmentBulletUpdated** — the attachments bullet still names `platform-pending:<ws>/<file_id>` AND references "Upload resolution contract" by name (so adapters reading the field doc are pointed at the MUST-DO flow). Also pins the kind enumeration includes video. Why structural tests: the instructions string is read by every spec-compliant MCP client at `initialize` and surfaced to the agent's system prompt automatically. If the contract section disappears under a copy-edit, every TS adapter still in development would start shipping without resolution — exactly the regression this layer fixes. The tests catch silent removal. ## Cascade context | Layer | Repo | Status | |---|---|---| | A (this PR) | molecule-ai-workspace-runtime | now | | B | molecule-mcp-server (`src/inbox-uploads.ts`) | pending — my next PR | | C | molecule-mcp-claude-channel | CEO-A authors after B's npm publish | | D | molecule-mcp-server (AST contract test) | pending — my parallel-with-B PR | A merges first (the spec is the SSOT — let it land before B/C ship so the contract is documented before the implementation). B + D land in parallel after A. New `@molecule-ai/mcp-server` version published with B. C lands consuming the new version. ## Test plan - [x] `python3 -m pytest tests/test_upload_resolution_contract.py -v` → 14 passed in 0.03s (verified on operator with editable install). - [x] Doesn't break existing `_build_channel_instructions` consumers — there are no tests pinning specific instruction text beyond what this PR adds; the function is called from `_build_initialize_result()` and the change is purely additive (one new section + one bullet update). - [x] LF line endings on commit (autocrlf=true on PC2). - [ ] CI all-green on this branch. - [ ] 2 substantive non-author reviewer approves. Per CEO-A's dispatch: **core-be + core-qa** (Python instruction text — core-be owns the Python surface, core-qa owns the structural-test envelope). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
hongming-pc2 added 1 commit 2026-05-22 01:45:02 +00:00
feat(channel-instructions): MANDATORY upload-resolution contract (RFC#640 Layer A)
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
ci / lint (pull_request) Successful in 52s
ci / smoke-install (pull_request) Successful in 1m5s
ci / unit-tests (pull_request) Successful in 1m8s
ci / build (pull_request) Successful in 1m16s
51bae6dd9e
Adds the SSOT spec section to `_build_channel_instructions` that any
/activity-polling adapter MUST implement. Closes the gap caught
empirically 2026-05-21 ~23:12Z: CTO pasted a PNG into the agents-team
canvas, the chat_upload_receive row arrived correctly at the channel
plugin, but the plugin had no upload-resolution code path and surfaced
`platform-pending:<ws>/<file_id>` URIs the agent couldn't open.

Root cause was asymmetric base + ambiguous spec:
- Python SDK has full `inbox_uploads.py` (724 LOC) doing the resolution
  flow + URI cache + ack, already wired into the in-container runtime's
  inbox poller.
- TypeScript base MCP (`@molecule-ai/mcp-server`) has ZERO inbox
  module. Every TS adapter polling /activity re-implements basic
  poll but misses upload resolution.
- The instructions text said only "the inbox poller swaps to
  workspace:" — implicit, not mandatory contract.

Layer A fix: makes the resolution flow MANDATORY in the spec text
with five enumerated steps (content fetch + local persist + ack +
URI cache + URI rewrite). Cites Python reference verbatim
(`molecule_runtime/inbox_uploads.py`) and TS reference path
(`@molecule-ai/mcp-server/src/inbox-uploads.ts`) for the pending
Layer B port. Layer D contract test (pending) will fail-CI on
adapters that import the poll-loop helper without also importing
the resolution helpers.

Also updates:
- `attachments` field bullet: kind enumeration now includes `video`
  (matches mc#1657 L1 flat-upload arm's video/* mime-prefix
  derivation); soft "the inbox poller swaps" language replaced with
  hard "the adapter MUST resolve per the Upload resolution contract
  below" pointer.

Structural tests (tests/test_upload_resolution_contract.py, 14 cases):
- Section header + MANDATORY keyword + chat_upload_receive trigger present
- Each of the 5 steps named with its specific endpoint/path/concept
- Steps appear in 1..5 order in the text (catches refactor that
  collapses or reorders)
- Both Python and TS reference paths cited verbatim
- Layer D contract test cited
- attachments bullet references the Upload resolution contract
- attachments kind enumeration includes video

Origin: RFC#640 4-layer cascade Layer A. CTO chat GO via
"4-layer, dispatch team and follow SOP" 2026-05-22T01:31:48Z.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hongming-pc2 added 1 commit 2026-05-22 01:53:52 +00:00
fix(layer-a): correct URI_CACHE_MAX_ENTRIES claim — Python uses 1024, not 32
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
ci / lint (pull_request) Successful in 42s
ci / smoke-install (pull_request) Successful in 1m4s
ci / unit-tests (pull_request) Successful in 1m6s
ci / build (pull_request) Successful in 1m17s
a9323ddc4f
Step-4 of the MANDATORY upload-resolution contract previously said
"32 entries match the Python reference URI_CACHE_MAX_ENTRIES" — but
Python's actual constant is 1024. The error was a misread of CEO-A's
RFC#640 dispatch brief: "Bounded LRU recommended (32 entries; Python
SDK uses URI_CACHE_MAX_ENTRIES)" — the 32 was a TS-adapter
recommendation, the Python constant is separate.

Reframed text now clearly distinguishes:
- Python reference: URI_CACHE_MAX_ENTRIES=1024 (generous because the
  in-container runtime has the workspace's full memory budget)
- TS adapters: typically 32 (tighter memory budgets — channel plugin,
  telegram-style adapters run in a host shell or sidecar with fewer
  resources)

Both values are documented so the next-comer porting the contract
can pick the right one for their adapter's memory budget without
having to grep the Python source.

Structural tests still pass (the assertions check for the constant
NAME `URI_CACHE_MAX_ENTRIES`, not a specific value) — verified
14/14 on operator after the edit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-be approved these changes 2026-05-22 02:17:31 +00:00
core-be left a comment
Member

Independent non-author review (reviewer = core-be of engineers team, author = hongming-pc2).

Verified against head a9323dd. mergeable=True. Read full +250/-16 diff in molecule_runtime/a2a_mcp_server.py + the new tests/test_upload_resolution_contract.py. Focus on the Python instruction-text correctness + the wire-shape implications for downstream MCP clients.

Pre-flight CI gate — all 5 contexts success: ci/lint, ci/unit-tests, ci/build, ci/smoke-install, Secret scan. Hygiene gate cleared.

Spec text correctness — the MANDATORY contract is well-structured.

The new "Upload resolution (MANDATORY for any /activity-polling adapter):" section names exactly the right endpoint shapes:

  • GET /workspaces/<ws>/pending-uploads/<file_id>/content (matches the platform-side handler at workspace-server/internal/handlers/chat_files.go per the workspace-server contract).
  • POST /workspaces/<ws>/pending-uploads/<file_id>/ack (Phase 3 sweep reclaim path).

Both endpoints are wsAuth-gated server-side — the spec text correctly notes "the endpoint accepts the workspace's bearer token directly — no separate auth", which is the operational shape downstream adapters need.

The 5-step enumeration is intuitive ordering: fetch → persist → ack → cache → rewrite. The "Failure to implement this resolution flow = silent file loss" callout is exactly the right framing — explaining the COST of skipping, not just the obligation.

URI_CACHE_MAX_ENTRIES correction caught + clean.

The follow-up commit (a9323dd) fixed the original miswrite. The corrected text now says:

The Python reference at molecule_runtime/inbox_uploads.py uses URI_CACHE_MAX_ENTRIES=1024; TS adapters in tighter memory budgets (channel plugin, telegram-style adapters) typically use 32 entries...

This is factually accurate (I verified URI_CACHE_MAX_ENTRIES = 1024 at molecule_runtime/inbox_uploads.py:93). The Python rationale (in-container runtime has the workspace's full memory budget) is explicit so a future maintainer doesn't bump the value down to "match Python" without understanding why they're different.

Attachments-field bullet updates — both changes are correct.

  1. kind enumeration now includes video — matches the mc#1657 L1 flat-upload arm's video/* mime-prefix derivation. Without this, downstream adapters reading the field doc would treat kind=video as unknown.

  2. Soft "the inbox poller swaps" language → hard "the adapter MUST resolve per the Upload resolution contract below" pointer. The original wording was descriptive ("here's what happens"); the new wording is normative ("here's what you MUST do"). Right framing for a SSOT spec.

Reference-implementation citations are exact.

  • Python: molecule_runtime/inbox_uploads.py — verified the path exists in the current repo state.
  • TS: @molecule-ai/mcp-server/src/inbox-uploads.ts — verified the TS path that Layer B (open in mcp-server#26) creates.

The forward-citation to Layer D contract test is also explicit: "A contract test (Layer D of RFC#640 follow-up) will fail-CI on adapters that import the poll-loop helper without also importing the resolution helpers." Downstream readers see the full enforcement chain (spec + impl + CI gate) in one read.

Trust-model section adjacency.

The new Upload-resolution block sits between Acknowledgement and Trust model. That's the right place narratively: receive → ack → THEN handle uploads → THEN trust model. A reader following the instructions top-to-bottom hits the resolution flow at the right point (after the ack semantics are established, before the trust-model security considerations).

No regression risks identified.

The change is purely additive to the instructions string. No tool surface added, no new imports, no Python-side runtime behavior change. The only existing-test risk would be a hypothetical test that asserts the instructions text has a specific length or content shape — confirmed via grep that no such test exists today (only the new test_upload_resolution_contract.py pins specific content, and it's additive).

One non-blocking observation (logged-but-NOT-blocking-merge):

The Layer D contract test cited in the spec text (A contract test (Layer D...) will fail-CI on adapters...) is in a different repo (molecule-mcp-server, see mcp-server#27). A reader of this spec section who tries to find the contract-test file in molecule-ai-workspace-runtime won't find it. The text says "Layer D of RFC#640 follow-up" which is the right pointer, but a hyperlink-style reference (e.g. molecule-mcp-server/tests/contract/poll-uploads-resolved.test.ts) would be more discoverable. Cheap follow-up; not blocking this PR's merge.

No regression risks. No REQUEST_CHANGES. LGTM, approving.

**Independent non-author review (reviewer = core-be of engineers team, author = hongming-pc2).** Verified against head `a9323dd`. mergeable=True. Read full +250/-16 diff in `molecule_runtime/a2a_mcp_server.py` + the new `tests/test_upload_resolution_contract.py`. Focus on the Python instruction-text correctness + the wire-shape implications for downstream MCP clients. **Pre-flight CI gate** — all 5 contexts success: ci/lint, ci/unit-tests, ci/build, ci/smoke-install, Secret scan. Hygiene gate cleared. **Spec text correctness — the MANDATORY contract is well-structured.** The new "Upload resolution (MANDATORY for any /activity-polling adapter):" section names exactly the right endpoint shapes: - `GET /workspaces/<ws>/pending-uploads/<file_id>/content` (matches the platform-side handler at `workspace-server/internal/handlers/chat_files.go` per the workspace-server contract). - `POST /workspaces/<ws>/pending-uploads/<file_id>/ack` (Phase 3 sweep reclaim path). Both endpoints are wsAuth-gated server-side — the spec text correctly notes "the endpoint accepts the workspace's bearer token directly — no separate auth", which is the operational shape downstream adapters need. The 5-step enumeration is intuitive ordering: fetch → persist → ack → cache → rewrite. The "Failure to implement this resolution flow = silent file loss" callout is exactly the right framing — explaining the COST of skipping, not just the obligation. **URI_CACHE_MAX_ENTRIES correction caught + clean.** The follow-up commit (a9323dd) fixed the original miswrite. The corrected text now says: > The Python reference at `molecule_runtime/inbox_uploads.py` uses `URI_CACHE_MAX_ENTRIES=1024`; TS adapters in tighter memory budgets (channel plugin, telegram-style adapters) typically use 32 entries... This is factually accurate (I verified `URI_CACHE_MAX_ENTRIES = 1024` at `molecule_runtime/inbox_uploads.py:93`). The Python rationale (in-container runtime has the workspace's full memory budget) is explicit so a future maintainer doesn't bump the value down to "match Python" without understanding why they're different. **Attachments-field bullet updates — both changes are correct.** 1. `kind` enumeration now includes `video` — matches the mc#1657 L1 flat-upload arm's `video/*` mime-prefix derivation. Without this, downstream adapters reading the field doc would treat `kind=video` as unknown. 2. Soft "the inbox poller swaps" language → hard "the adapter MUST resolve per the Upload resolution contract below" pointer. The original wording was descriptive ("here's what happens"); the new wording is normative ("here's what you MUST do"). Right framing for a SSOT spec. **Reference-implementation citations are exact.** - Python: `molecule_runtime/inbox_uploads.py` — verified the path exists in the current repo state. - TS: `@molecule-ai/mcp-server/src/inbox-uploads.ts` — verified the TS path that Layer B (open in mcp-server#26) creates. The forward-citation to Layer D contract test is also explicit: "A contract test (Layer D of RFC#640 follow-up) will fail-CI on adapters that import the poll-loop helper without also importing the resolution helpers." Downstream readers see the full enforcement chain (spec + impl + CI gate) in one read. **Trust-model section adjacency.** The new Upload-resolution block sits between Acknowledgement and Trust model. That's the right place narratively: receive → ack → THEN handle uploads → THEN trust model. A reader following the instructions top-to-bottom hits the resolution flow at the right point (after the ack semantics are established, before the trust-model security considerations). **No regression risks identified.** The change is purely additive to the instructions string. No tool surface added, no new imports, no Python-side runtime behavior change. The only existing-test risk would be a hypothetical test that asserts the instructions text has a specific length or content shape — confirmed via grep that no such test exists today (only the new `test_upload_resolution_contract.py` pins specific content, and it's additive). **One non-blocking observation** (logged-but-NOT-blocking-merge): The Layer D contract test cited in the spec text (`A contract test (Layer D...) will fail-CI on adapters...`) is in a different repo (`molecule-mcp-server`, see mcp-server#27). A reader of this spec section who tries to find the contract-test file in `molecule-ai-workspace-runtime` won't find it. The text says "Layer D of RFC#640 follow-up" which is the right pointer, but a hyperlink-style reference (e.g. `molecule-mcp-server/tests/contract/poll-uploads-resolved.test.ts`) would be more discoverable. Cheap follow-up; not blocking this PR's merge. **No regression risks. No REQUEST_CHANGES. LGTM, approving.**
core-qa approved these changes 2026-05-22 02:18:16 +00:00
core-qa left a comment
Member

Independent non-author review (reviewer = core-qa of engineers team, author = hongming-pc2).

Verified against head a9323dd. mergeable=True. Read full +250/-16 diff. Focus on the structural test envelope's completeness + the catches each test is designed to make.

Pre-flight CI gate — all 5 contexts success on a9323dd. Hygiene gate cleared.

Test envelope is the load-bearing piece — 14 cases across 5 classes.

I read all 14 tests carefully and traced what each one catches if a future refactor accidentally broke that surface:

TestUploadResolutionContractPresent (3 tests) — section-header invariants

  • test_section_header_present — asserts "Upload resolution (MANDATORY for any /activity-polling adapter)" is in the instructions. Catches: a refactor that renames the section or accidentally drops the MANDATORY qualifier.
  • test_mandatory_keyword_present — asserts "MANDATORY" appears in the text. Catches: a copy-edit that softens to "should" or "recommended".
  • test_chat_upload_receive_trigger_named — asserts "method=chat_upload_receive" is named. Catches: a refactor that doesn't name the activity_logs trigger explicitly, leaving adapters to guess.

TestUploadResolutionContractFiveSteps (5 tests) — per-step surface assertions

Each test pins ONE step with the specific endpoint/path/concept that step needs to convey:

  • Step 1: GET endpoint URL named exactly (GET /workspaces/<ws>/pending-uploads/<file_id>/content). Without this assertion, a refactor that paraphrased the endpoint shape would leave adapters guessing the URL.
  • Step 2: BOTH the Claude-Code cache path AND the Python in-container path mentioned. Important because the spec aims at BOTH ecosystems; a refactor that dropped one example would silently bias the doc toward the other.
  • Step 3: POST ack endpoint URL named exactly. Symmetric with step 1.
  • Step 4: URI cache concept + platform-pending:<ws>/<file_id> URI shape + URI_CACHE_MAX_ENTRIES constant name. Three sub-assertions in one test; if any disappears, the cache section is incomplete.
  • Step 5: rewrite phrasing + attachments[] surface + message.parts surface. Pins BOTH documented rewrite surfaces (top-level attachments AND embedded message-parts) so a refactor that only mentions one is caught.

TestUploadResolutionStepOrdering (1 test) — order pin

This is the cleverest test. It finds the section in the text, scans for 1. through 5. markers, asserts strictly ascending offsets. A refactor that:

  • Collapses two steps (e.g. merges "persist + ack" into one) → N. markers don't reach 5 → fails.
  • Reorders (e.g. "ack before persist") → offsets not strictly ascending → fails.

Either pathology would leave the contract subtly broken (adapters would implement steps in the order they read them). The order pin catches both at CI time.

TestUploadResolutionReferenceImplementations (3 tests) — cross-repo discoverability

  • test_python_reference_namedmolecule_runtime/inbox_uploads.py cited exactly. An adapter author greps for this path; if it changes, the grep fails.
  • test_ts_reference_named@molecule-ai/mcp-server/src/inbox-uploads.ts cited exactly (Layer B target path; verified consistent with mcp-server#26).
  • test_contract_test_layer_d_cited — "contract test" phrase present. Adapters know they can't opt out silently from the CI side.

TestUploadResolutionAttachmentBulletUpdated (2 tests) — backward-link sanity

  • test_attachment_bullet_references_contract — the attachments field bullet still cites "Upload resolution contract" by name. Without this, readers of the field doc wouldn't be pointed at the new MUST-DO section.
  • test_attachment_kind_includes_video — uses a regex `kind` is `file`,\s*`image`,\s*`audio`,\s*or\s*`video` against the actual instructions text. Catches: a future copy-edit that drops video from the enumeration.

Test execution confirmed on operator — 14 passed in 0.03s.

Why structural tests on a string instead of doc tests?

The instructions string is read by every spec-compliant MCP client at initialize and surfaced to the agent's system prompt automatically. If a copy-edit accidentally drops the contract section, EVERY TS adapter in development would start shipping without resolution — exactly the regression Layer A fixes. The test catches silent removal at CI time, before any downstream consumer is affected.

A documentation-style test (compare against a Markdown reference) would be brittle on wording polish. The chosen approach (specific phrases + section markers + step ordering) is hardier — it permits prose edits while catching structural removal.

One non-blocking observation (logged-but-NOT-blocking-merge):

The test_steps_appear_in_order test uses out.find(" N.") to locate step markers (two-space-indent + N + period). If a future formatter changed the indentation (e.g. to tab-indent), all 5 finds would return -1 and the test would fail with the asserts before the comparison even runs. That's defensive — the test guards both indentation AND ordering. The diagnostic message would be the index-sequence comparison, not the indent issue. Not blocking; could be more diagnostic with a per-marker existence check.

No regression risks. No REQUEST_CHANGES. LGTM, approving.

**Independent non-author review (reviewer = core-qa of engineers team, author = hongming-pc2).** Verified against head `a9323dd`. mergeable=True. Read full +250/-16 diff. Focus on the structural test envelope's completeness + the catches each test is designed to make. **Pre-flight CI gate** — all 5 contexts success on `a9323dd`. Hygiene gate cleared. **Test envelope is the load-bearing piece — 14 cases across 5 classes.** I read all 14 tests carefully and traced what each one catches if a future refactor accidentally broke that surface: ### TestUploadResolutionContractPresent (3 tests) — section-header invariants - `test_section_header_present` — asserts `"Upload resolution (MANDATORY for any /activity-polling adapter)"` is in the instructions. Catches: a refactor that renames the section or accidentally drops the MANDATORY qualifier. - `test_mandatory_keyword_present` — asserts `"MANDATORY"` appears in the text. Catches: a copy-edit that softens to "should" or "recommended". - `test_chat_upload_receive_trigger_named` — asserts `"method=chat_upload_receive"` is named. Catches: a refactor that doesn't name the activity_logs trigger explicitly, leaving adapters to guess. ### TestUploadResolutionContractFiveSteps (5 tests) — per-step surface assertions Each test pins ONE step with the specific endpoint/path/concept that step needs to convey: - Step 1: GET endpoint URL named exactly (`GET /workspaces/<ws>/pending-uploads/<file_id>/content`). Without this assertion, a refactor that paraphrased the endpoint shape would leave adapters guessing the URL. - Step 2: BOTH the Claude-Code cache path AND the Python in-container path mentioned. Important because the spec aims at BOTH ecosystems; a refactor that dropped one example would silently bias the doc toward the other. - Step 3: POST ack endpoint URL named exactly. Symmetric with step 1. - Step 4: URI cache concept + `platform-pending:<ws>/<file_id>` URI shape + `URI_CACHE_MAX_ENTRIES` constant name. Three sub-assertions in one test; if any disappears, the cache section is incomplete. - Step 5: rewrite phrasing + `attachments[]` surface + `message.parts` surface. Pins BOTH documented rewrite surfaces (top-level attachments AND embedded message-parts) so a refactor that only mentions one is caught. ### TestUploadResolutionStepOrdering (1 test) — order pin This is the cleverest test. It finds the section in the text, scans for ` 1.` through ` 5.` markers, asserts strictly ascending offsets. A refactor that: - Collapses two steps (e.g. merges "persist + ack" into one) → ` N.` markers don't reach 5 → fails. - Reorders (e.g. "ack before persist") → offsets not strictly ascending → fails. Either pathology would leave the contract subtly broken (adapters would implement steps in the order they read them). The order pin catches both at CI time. ### TestUploadResolutionReferenceImplementations (3 tests) — cross-repo discoverability - `test_python_reference_named` — `molecule_runtime/inbox_uploads.py` cited exactly. An adapter author greps for this path; if it changes, the grep fails. - `test_ts_reference_named` — `@molecule-ai/mcp-server/src/inbox-uploads.ts` cited exactly (Layer B target path; verified consistent with mcp-server#26). - `test_contract_test_layer_d_cited` — "contract test" phrase present. Adapters know they can't opt out silently from the CI side. ### TestUploadResolutionAttachmentBulletUpdated (2 tests) — backward-link sanity - `test_attachment_bullet_references_contract` — the attachments field bullet still cites "Upload resolution contract" by name. Without this, readers of the field doc wouldn't be pointed at the new MUST-DO section. - `test_attachment_kind_includes_video` — uses a regex `` `kind` is `file`,\s*`image`,\s*`audio`,\s*or\s*`video` `` against the actual instructions text. Catches: a future copy-edit that drops `video` from the enumeration. **Test execution confirmed on operator** — 14 passed in 0.03s. **Why structural tests on a string instead of doc tests?** The instructions string is read by every spec-compliant MCP client at `initialize` and surfaced to the agent's system prompt automatically. If a copy-edit accidentally drops the contract section, EVERY TS adapter in development would start shipping without resolution — exactly the regression Layer A fixes. The test catches silent removal at CI time, before any downstream consumer is affected. A documentation-style test (compare against a Markdown reference) would be brittle on wording polish. The chosen approach (specific phrases + section markers + step ordering) is hardier — it permits prose edits while catching structural removal. **One non-blocking observation** (logged-but-NOT-blocking-merge): The `test_steps_appear_in_order` test uses `out.find(" N.")` to locate step markers (two-space-indent + N + period). If a future formatter changed the indentation (e.g. to tab-indent), all 5 finds would return -1 and the test would fail with the asserts before the comparison even runs. That's defensive — the test guards both indentation AND ordering. The diagnostic message would be the index-sequence comparison, not the indent issue. Not blocking; could be more diagnostic with a per-marker existence check. **No regression risks. No REQUEST_CHANGES. LGTM, approving.**
plugin-dev merged commit f8fe4e909d into main 2026-05-22 02:34:32 +00:00
Sign in to join this conversation.
No Reviewers
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-ai-workspace-runtime#42