feat(activity): flat-upload-manifest arm in extractAttachmentsFromRequestBody #1657

Merged
plugin-dev merged 1 commits from fix/activity-flat-upload-attachments into main 2026-05-22 00:01:49 +00:00
Owner

Summary

Adds a second walk arm to extractAttachmentsFromRequestBody for the canvas chat_upload_receive shape (a flat upload manifest at request_body root, no JSON-RPC envelope, camelCase mimeType). Normalizes to snake_case mime_type on emit + derives kind from the mime prefix.

L1 follow-up to mc#1654. Per the three-layer data-responsibility rule, upload-shape parsing belongs at the platform projection, not in downstream adaptors.

Empirical gap

2026-05-21 ~23:12Z on agents-team workspace (091a9180-…): a canvas user pasted a PNG into the chat. The resulting activity row:

activity_id : 4d72b4bb-505a-493b-976c-98f6e24d6cfd
method      : chat_upload_receive
source_id   : NULL (canvas_user)
request_body: {
  "uri":      "platform-pending:091a9180-…/26111d48-…",
  "name":     "pasted-2026-05-21T23-12-25-0-0.png",
  "size":     677133,
  "file_id":  "26111d48-…",
  "mimeType": "image/png"
}

?include=peer_info previously returned attachments: null because the Go helper only walked request_body.params.message.parts[]. The flat upload manifest is a fourth shape (relative to the Python helper's three) not covered by the message-parts walk.

What the arm does

After the existing v1 message-parts walk yields nothing, fall through to:

  1. Gate: top-level uri OR file_id present at the root.
  2. Mime normalization: read mimeType (canvas camelCase) → emit mime_type (snake_case wire convention everywhere else). Snake_case mime_type accepted as defensive fallback.
  3. Kind derivation from mime prefix:
    • image/*image
    • audio/*audio
    • video/*video
    • else → file
  4. Minimum-info rule (parity with the message-parts arm): a manifest with neither uri nor name is non-actionable; skip. file_id-only is skipped.
  5. Emit single-element attachments[] with {kind, uri, mime_type, name}, omitting any absent field (omit-when-absent envelope rule, consistent with L1 mc#1654 + L2 workspace-runtime#37 + L3 channel#13).

The message-parts arm takes precedence over the flat arm — a pathological body with BOTH shapes uses the documented inbound. Pinned by TestExtractAttachmentsFromRequestBody_MessagePartsTakesPrecedenceOverFlat.

Test envelope

Unit tests (extractAttachmentsFromRequestBody + helpers):

  • TestKindFromMimeType — 12 cases including image/, no-slash image, application/pdf, empty string.
  • TestExtractAttachmentsFromRequestBody_FlatUpload_Image — empirical 2026-05-21 shape verbatim, asserts mime_type normalization and that camelCase mimeType + file_id do NOT leak into the projected envelope.
  • _FlatUpload_Audio / _FlatUpload_Video / _FlatUpload_GenericFile — kind derivation per prefix.
  • _FlatUpload_NoMimeFallsToFile — no mimeType → kind=file, mime_type omitted from envelope.
  • _FlatUpload_SnakeCaseMimeTypeAccepted — defensive read for future non-canvas callers.
  • _FlatUpload_FileIDOnlyIsSkipped — minimum-info rule.
  • _FlatUpload_NameOnlyIsKept — symmetric with message-parts arm.
  • _MessagePartsTakesPrecedenceOverFlat — pathological-both-shapes precedence pin.

Wire-level integration:

  • TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow — pinned against the empirical 2026-05-21 canvas row, source_id NULL, asserts peer_name/peer_role/agent_card_url ABSENT (canvas row), attachments[] populated with kind=image / mime_type=image/png / uri preserved verbatim / name preserved.

Existing tests untouched + still pass.

Non-changes

  • No SQL changes. The handler-side if atts := extractAttachmentsFromRequestBody(reqBody); len(atts) > 0 call site is unchanged.
  • No new include-flag. ?include=peer_info continues to be the single trigger; existing back-compat tests still pass.
  • extractAttachmentsFromMessageParts is the renamed-extracted-helper for the original walk; its behavior is byte-identical to the prior monolithic function.

Follow-up (separate, not this PR)

  • molecule-ai-workspace-runtime#37's Python _extract_attachments_from_request_body gets the same flat-upload arm for pre-Layer-1 platform parity. Not blocking for L1-enabled platforms (agents-team is L1-enabled; the row-level attachments[] field that this projection populates is what the runtime defensively reads).
  • platform-pending: URI resolution to a local cached file (so Claude Code can Read the image) is a channel-adapter concern — separate issue against molecule-mcp-claude-channel.

Test plan

  • go vet ./internal/handlers/... clean
  • go test -run 'TestKindFromMimeType|TestExtractAttachmentsFromRequestBody|TestActivityList_IncludePeerInfo' ./internal/handlers/... passes locally (0.103s on operator)
  • gofmt-LF normalization handled by repo core.autocrlf=true
  • CI all-green on this branch
  • 2 substantive non-author reviewer approves (core-be + core-qa)
  • End-to-end verify: after merge + deploy, re-fetch the empirical 2026-05-21 canvas-paste activity row via ?include=peer_info and confirm attachments[0] = {kind:image, mime_type:image/png, uri:platform-pending:…, name:pasted-…}.

🤖 Generated with Claude Code

## Summary Adds a second walk arm to `extractAttachmentsFromRequestBody` for the canvas `chat_upload_receive` shape (a flat upload manifest at `request_body` root, no JSON-RPC envelope, camelCase `mimeType`). Normalizes to snake_case `mime_type` on emit + derives `kind` from the mime prefix. L1 follow-up to mc#1654. Per the three-layer data-responsibility rule, upload-shape parsing belongs at the platform projection, not in downstream adaptors. ## Empirical gap 2026-05-21 ~23:12Z on agents-team workspace (`091a9180-…`): a canvas user pasted a PNG into the chat. The resulting activity row: ``` activity_id : 4d72b4bb-505a-493b-976c-98f6e24d6cfd method : chat_upload_receive source_id : NULL (canvas_user) request_body: { "uri": "platform-pending:091a9180-…/26111d48-…", "name": "pasted-2026-05-21T23-12-25-0-0.png", "size": 677133, "file_id": "26111d48-…", "mimeType": "image/png" } ``` `?include=peer_info` previously returned `attachments: null` because the Go helper only walked `request_body.params.message.parts[]`. The flat upload manifest is a fourth shape (relative to the Python helper's three) not covered by the message-parts walk. ## What the arm does After the existing v1 message-parts walk yields nothing, fall through to: 1. **Gate**: top-level `uri` OR `file_id` present at the root. 2. **Mime normalization**: read `mimeType` (canvas camelCase) → emit `mime_type` (snake_case wire convention everywhere else). Snake_case `mime_type` accepted as defensive fallback. 3. **Kind derivation** from mime prefix: - `image/*` → `image` - `audio/*` → `audio` - `video/*` → `video` - else → `file` 4. **Minimum-info rule** (parity with the message-parts arm): a manifest with neither `uri` nor `name` is non-actionable; skip. `file_id`-only is skipped. 5. **Emit** single-element `attachments[]` with `{kind, uri, mime_type, name}`, omitting any absent field (omit-when-absent envelope rule, consistent with L1 mc#1654 + L2 workspace-runtime#37 + L3 channel#13). The message-parts arm takes precedence over the flat arm — a pathological body with BOTH shapes uses the documented inbound. Pinned by `TestExtractAttachmentsFromRequestBody_MessagePartsTakesPrecedenceOverFlat`. ## Test envelope Unit tests (`extractAttachmentsFromRequestBody` + helpers): - `TestKindFromMimeType` — 12 cases including `image/`, no-slash `image`, `application/pdf`, empty string. - `TestExtractAttachmentsFromRequestBody_FlatUpload_Image` — empirical 2026-05-21 shape verbatim, asserts mime_type normalization and that camelCase `mimeType` + `file_id` do NOT leak into the projected envelope. - `_FlatUpload_Audio` / `_FlatUpload_Video` / `_FlatUpload_GenericFile` — kind derivation per prefix. - `_FlatUpload_NoMimeFallsToFile` — no mimeType → kind=file, mime_type omitted from envelope. - `_FlatUpload_SnakeCaseMimeTypeAccepted` — defensive read for future non-canvas callers. - `_FlatUpload_FileIDOnlyIsSkipped` — minimum-info rule. - `_FlatUpload_NameOnlyIsKept` — symmetric with message-parts arm. - `_MessagePartsTakesPrecedenceOverFlat` — pathological-both-shapes precedence pin. Wire-level integration: - `TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow` — pinned against the empirical 2026-05-21 canvas row, source_id NULL, asserts peer_name/peer_role/agent_card_url ABSENT (canvas row), attachments[] populated with `kind=image / mime_type=image/png / uri preserved verbatim / name preserved`. Existing tests untouched + still pass. ## Non-changes - No SQL changes. The handler-side `if atts := extractAttachmentsFromRequestBody(reqBody); len(atts) > 0` call site is unchanged. - No new include-flag. `?include=peer_info` continues to be the single trigger; existing back-compat tests still pass. - `extractAttachmentsFromMessageParts` is the renamed-extracted-helper for the original walk; its behavior is byte-identical to the prior monolithic function. ## Follow-up (separate, not this PR) - `molecule-ai-workspace-runtime#37`'s Python `_extract_attachments_from_request_body` gets the same flat-upload arm for pre-Layer-1 platform parity. Not blocking for L1-enabled platforms (agents-team is L1-enabled; the row-level `attachments[]` field that this projection populates is what the runtime defensively reads). - `platform-pending:` URI resolution to a local cached file (so Claude Code can `Read` the image) is a channel-adapter concern — separate issue against `molecule-mcp-claude-channel`. ## Test plan - [x] `go vet ./internal/handlers/...` clean - [x] `go test -run 'TestKindFromMimeType|TestExtractAttachmentsFromRequestBody|TestActivityList_IncludePeerInfo' ./internal/handlers/...` passes locally (0.103s on operator) - [x] gofmt-LF normalization handled by repo `core.autocrlf=true` - [ ] CI all-green on this branch - [ ] 2 substantive non-author reviewer approves (core-be + core-qa) - [ ] End-to-end verify: after merge + deploy, re-fetch the empirical 2026-05-21 canvas-paste activity row via `?include=peer_info` and confirm `attachments[0] = {kind:image, mime_type:image/png, uri:platform-pending:…, name:pasted-…}`. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
hongming-pc2 added 1 commit 2026-05-21 23:26:51 +00:00
feat(activity): flat-upload-manifest arm in extractAttachmentsFromRequestBody
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 7s
sop-checklist / review-refire (pull_request) Has been skipped
gate-check-v3 / gate-check (pull_request) Successful in 9s
security-review / approved (pull_request) Failing after 7s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 5s
sop-checklist / all-items-acked (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Harness Replays / Harness Replays (pull_request) Successful in 18s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m45s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m20s
CI / Platform (Go) (pull_request) Successful in 5m7s
CI / all-required (pull_request) Successful in 5m33s
audit-force-merge / audit (pull_request) Successful in 4s
b6f2b90e9d
Adds a second walk arm for the canvas chat_upload_receive shape: a flat
upload manifest at request_body root (no JSON-RPC envelope) with
camelCase mimeType. Normalizes to snake_case mime_type on emit + derives
kind from the mime prefix (image/* -> image, audio/* -> audio, video/*
-> video, else -> file).

Empirical surface: 2026-05-21 ~23:12Z canvas-user pasted a PNG, the
activity row's request_body was {uri, name, size, file_id, mimeType}
with no params/message/parts wrapper, and ?include=peer_info projected
attachments: null instead of the expected one-element array. The new
arm handles this shape uniformly so every downstream adaptor (channel
/ telegram / codex / hermes) sees a populated attachments[] with zero
per-adaptor parsing.

Per the three-layer data-responsibility rule (platform / base / adaptor),
upload-shape parsing belongs at Layer 1 (the platform's projection),
not in adaptors.

Tests:
- TestKindFromMimeType (12 cases) pins the mime->kind derivation.
- TestExtractAttachmentsFromRequestBody_FlatUpload_* (8 sub-tests) cover
  image / audio / video / generic-file / no-mime-fallback / snake-case
  mime_type accepted / file_id-only-skipped / name-only-kept.
- TestExtractAttachmentsFromRequestBody_MessagePartsTakesPrecedenceOverFlat
  pins that a pathological body with BOTH shapes uses the parts[] arm
  (the documented inbound, historically the only one extracted).
- TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow is a wire-
  level integration test against the empirical 2026-05-21 row shape.

Follow-up: workspace-runtime#37's _extract_attachments_from_request_body
gets the same flat-upload arm for pre-Layer-1 platform parity. Not
required for already-L1-enabled platforms (which read the row-level
attachments[] field this projection populates).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-be approved these changes 2026-05-22 00:00:13 +00:00
core-be left a comment
Member

Independent non-author review (reviewer = core-be of engineers team, author = hongming-pc2).

Verified against head b6f2b90e9d. mergeable=True. Read the full diff: +98/-12 in workspace-server/internal/handlers/activity.go + 254 new lines in workspace-server/internal/handlers/activity_peer_info_test.go. Focus: Go idioms / SQL safety (n/a here, pure parsing) / defensive JSON walk / regression risk on the existing message-parts arm.

Pre-flight CI status check on the required-setCI / all-required (pull_request) on b6f2b90e9d = success (BP's only required context per /branch_protections/main; advisory pilots + reviewer-gates are NOT in status_check_contexts). All non-pilot code paths verified green: Platform Go, Handlers Postgres Integration, Harness Replays, E2E Smoke, E2E Chat, E2E Canvas, secret scan, gate-check-v3, sop-tier-check. Hygiene gate (required-set, not combined_state) cleared before submission.

The refactor is clean — original walk preserved as extractAttachmentsFromMessageParts, monolithic function turned into a two-arm dispatcher.

func extractAttachmentsFromRequestBody(raw []byte) []map[string]interface{} {
    if len(raw) == 0 { return nil }
    var body map[string]interface{}
    if err := json.Unmarshal(raw, &body); err != nil { return nil }
    if atts := extractAttachmentsFromMessageParts(body); len(atts) > 0 { return atts }
    if att := extractAttachmentFromFlatUploadManifest(body); att != nil {
        return []map[string]interface{}{att}
    }
    return nil
}

Three correctness pins for the arm-ordering:

  1. Message-parts arm runs FIRST. This is the documented inbound shape, historically the only one extracted, and the only one with multi-element results. The flat arm is a single-element fallback for a body that has NO parts wrapper. If a future bug ever produced a body with both shapes, the test TestExtractAttachmentsFromRequestBody_MessagePartsTakesPrecedenceOverFlat pins the message-parts arm as the winner — that's the right call: the documented inbound is the authoritative one.

  2. len(atts) > 0 gate, NOT atts != nil. Crucial. The message-parts arm returns nil when (a) no params/message/parts wrapper, OR (b) the wrapper exists but every part was skipped (empty / wrong-kind / no-uri-no-name). Both cases need to fall through to the flat arm. Using len(atts) > 0 correctly handles both — a returned empty slice from a future refactor that switched the early-return semantics would still fall through. Belt-and-suspenders correct.

  3. extractAttachmentsFromMessageParts is byte-identical to the original monolithic walk (I diffed line-by-line; the only change is the outer body decoding moved up to the public entry). Existing tests _FileKindV1 / _ImageAndAudio / _LegacyV0TypeDiscriminator / _SkipsEmptyParts / _MalformedShape continue to exercise this code path unchanged. Zero regression risk on the existing inbound shape.

extractAttachmentFromFlatUploadManifest — gate, normalization, kind derivation, minimum-info rule all correct.

uri, _ := body["uri"].(string)
fileID, _ := body["file_id"].(string)
if uri == "" && fileID == "" { return nil }
mimeType, _ := body["mimeType"].(string)
if mimeType == "" { mimeType, _ = body["mime_type"].(string) }
name, _ := body["name"].(string)
if uri == "" && name == "" { return nil }
att := map[string]interface{}{"kind": kindFromMimeType(mimeType)}
// ... omit-when-absent

Five things I scrutinized:

a) Gate (uri == "" && fileID == ""). Returns nil if neither root-level field is present — this prevents the arm from triggering on a stray body that happens to have mimeType but no upload identifier. Correct.

b) camelCase→snake_case normalization with snake_case fallback. The canvas writes mimeType; everywhere else (a2a-sdk v1, runtime helper, channel/telegram/codex/hermes adaptors) reads mime_type. The on-emit snake-case rule is the wire contractatt["mime_type"] = mimeType means downstream sees one shape regardless of inbound. Cross-checked with extractAttachmentsFromMessageParts line 138 (mimeType, _ := fileObj["mime_type"].(string)) — same emit key. Wire-consistent.

c) Defensive snake-case fallback for the input read. A future canvas refactor that adopts snake_case (consistent-with-everywhere-else) would still be parsed correctly. Cheap forward-compat hardening.

d) kindFromMimeType uses strings.HasPrefix — exact substring at index 0, no regex compile cost, no false positives. The test TestKindFromMimeType pins "image" (no slash) → "file", which means the function correctly rejects a stray bareword mime. Good defensive shape.

e) Minimum-info rule (uri == "" && name == "" skip) is parity with the message-parts arm's if uri == "" && name == "" { continue } rule. A flat manifest carrying only file_id + mimeType (no uri, no name) is non-actionable — downstream can't render a discoverable file from a canvas-internal id. Skip per the same SOP. Tested by _FlatUpload_FileIDOnlyIsSkipped.

The file_id field is NOT propagated to the envelope — only kind, uri, mime_type, name. This is correct: file_id is a canvas-internal UUID with no meaning to a downstream adaptor (Telegram/Codex don't speak canvas file ids). The uri is the discoverable handle. Pinned by TestExtractAttachmentsFromRequestBody_FlatUpload_Image asserting _, present := att["file_id"]; present is false. Good security/info-hygiene practice.

No SQL changes. The handler-side if atts := extractAttachmentsFromRequestBody(reqBody); len(atts) > 0 call site is untouched at line 521. The new code is a pure parser refactor with one extension arm — no new query, no new bind parameter, no new column reference. SQL-injection surface unchanged.

No new imports. strings was already in the imports list (used elsewhere in the file for sanitization regex). The new code uses strings.HasPrefix — no new module dependency.

Test envelope is comprehensive:

  • TestKindFromMimeType — 12 cases including the tricky no-slash "image" and empty-string defaults.
  • 8 TestExtractAttachmentsFromRequestBody_FlatUpload_* sub-tests covering image / audio / video / generic file / no-mime fallback / snake-case input accepted / file_id-only skipped / name-only kept.
  • _MessagePartsTakesPrecedenceOverFlat pinning arm precedence.
  • TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow — wire-level integration test against the empirical 2026-05-21 ~23:12Z row (verbatim canvas paste shape: mimeType=image/png, file_id set, size=677133, source_id NULL), asserts envelope projects snake-case mime_type=image/png + kind=image + uri preserved + name preserved AND that peer_name/peer_role/agent_card_url are ABSENT (canvas row).

The empirical-row test is the load-bearing one — pinned against the exact gap CEO-A observed. If a future refactor breaks the flat-upload arm, this test catches the regression at the wire-shape level, not just the helper-function level.

One non-blocking observation (logged-but-NOT-blocking-merge):

The _FlatUpload_NoMimeFallsToFile test asserts kindFromMimeType("") returns "file" and that mime_type is omitted from the envelope. But if a row has {uri: "...", name: "..."} with no mimeType, the envelope projects {kind: "file", uri: ..., name: ...} — no mime_type — which is correct, but a downstream adaptor that switches on mime to choose a render path (image preview vs. PDF inline vs. file-download button) has nothing to switch on. That's a real wire-shape concern but it's NOT a bug in this PR — it's a property of the source row, and the PR correctly omits the absent field rather than guessing. Worth flagging as a follow-up to file an issue against the canvas upload pipe to ALWAYS supply a mimeType (defensive HTTP content-sniff if extension isn't dispositive). File this post-merge if you agree it's worth chasing.

No regression risks. No REQUEST_CHANGES. LGTM, approving.

**Independent non-author review (reviewer = core-be of engineers team, author = hongming-pc2).** Verified against head `b6f2b90e9d`. mergeable=True. Read the full diff: +98/-12 in `workspace-server/internal/handlers/activity.go` + 254 new lines in `workspace-server/internal/handlers/activity_peer_info_test.go`. Focus: Go idioms / SQL safety (n/a here, pure parsing) / defensive JSON walk / regression risk on the existing message-parts arm. **Pre-flight CI status check on the required-set** — `CI / all-required (pull_request)` on `b6f2b90e9d` = **success** (BP's only required context per `/branch_protections/main`; advisory pilots + reviewer-gates are NOT in `status_check_contexts`). All non-pilot code paths verified green: Platform Go, Handlers Postgres Integration, Harness Replays, E2E Smoke, E2E Chat, E2E Canvas, secret scan, gate-check-v3, sop-tier-check. Hygiene gate (required-set, not combined_state) cleared before submission. **The refactor is clean — original walk preserved as `extractAttachmentsFromMessageParts`, monolithic function turned into a two-arm dispatcher.** ```go func extractAttachmentsFromRequestBody(raw []byte) []map[string]interface{} { if len(raw) == 0 { return nil } var body map[string]interface{} if err := json.Unmarshal(raw, &body); err != nil { return nil } if atts := extractAttachmentsFromMessageParts(body); len(atts) > 0 { return atts } if att := extractAttachmentFromFlatUploadManifest(body); att != nil { return []map[string]interface{}{att} } return nil } ``` Three correctness pins for the arm-ordering: 1. **Message-parts arm runs FIRST.** This is the documented inbound shape, historically the only one extracted, and the only one with multi-element results. The flat arm is a single-element fallback for a body that has NO parts wrapper. If a future bug ever produced a body with both shapes, the test `TestExtractAttachmentsFromRequestBody_MessagePartsTakesPrecedenceOverFlat` pins the message-parts arm as the winner — that's the right call: the documented inbound is the authoritative one. 2. **`len(atts) > 0` gate, NOT `atts != nil`.** Crucial. The message-parts arm returns `nil` when (a) no params/message/parts wrapper, OR (b) the wrapper exists but every part was skipped (empty / wrong-kind / no-uri-no-name). Both cases need to fall through to the flat arm. Using `len(atts) > 0` correctly handles both — a returned empty slice from a future refactor that switched the early-return semantics would still fall through. Belt-and-suspenders correct. 3. **`extractAttachmentsFromMessageParts` is byte-identical to the original monolithic walk** (I diffed line-by-line; the only change is the outer `body` decoding moved up to the public entry). Existing tests `_FileKindV1` / `_ImageAndAudio` / `_LegacyV0TypeDiscriminator` / `_SkipsEmptyParts` / `_MalformedShape` continue to exercise this code path unchanged. Zero regression risk on the existing inbound shape. **`extractAttachmentFromFlatUploadManifest` — gate, normalization, kind derivation, minimum-info rule all correct.** ```go uri, _ := body["uri"].(string) fileID, _ := body["file_id"].(string) if uri == "" && fileID == "" { return nil } mimeType, _ := body["mimeType"].(string) if mimeType == "" { mimeType, _ = body["mime_type"].(string) } name, _ := body["name"].(string) if uri == "" && name == "" { return nil } att := map[string]interface{}{"kind": kindFromMimeType(mimeType)} // ... omit-when-absent ``` Five things I scrutinized: a) **Gate (`uri == "" && fileID == ""`).** Returns nil if neither root-level field is present — this prevents the arm from triggering on a stray body that happens to have `mimeType` but no upload identifier. Correct. b) **camelCase→snake_case normalization** with snake_case fallback. The canvas writes `mimeType`; everywhere else (a2a-sdk v1, runtime helper, channel/telegram/codex/hermes adaptors) reads `mime_type`. The on-emit snake-case rule is the *wire contract* — `att["mime_type"] = mimeType` means downstream sees one shape regardless of inbound. Cross-checked with `extractAttachmentsFromMessageParts` line 138 (`mimeType, _ := fileObj["mime_type"].(string)`) — same emit key. Wire-consistent. c) **Defensive snake-case fallback for the input read.** A future canvas refactor that adopts snake_case (consistent-with-everywhere-else) would still be parsed correctly. Cheap forward-compat hardening. d) **kindFromMimeType** uses `strings.HasPrefix` — exact substring at index 0, no regex compile cost, no false positives. The test `TestKindFromMimeType` pins `"image"` (no slash) → `"file"`, which means the function correctly rejects a stray bareword mime. Good defensive shape. e) **Minimum-info rule** (`uri == "" && name == ""` skip) is parity with the message-parts arm's `if uri == "" && name == "" { continue }` rule. A flat manifest carrying only `file_id` + `mimeType` (no uri, no name) is non-actionable — downstream can't render a discoverable file from a canvas-internal id. Skip per the same SOP. Tested by `_FlatUpload_FileIDOnlyIsSkipped`. **The `file_id` field is NOT propagated to the envelope** — only `kind`, `uri`, `mime_type`, `name`. This is correct: `file_id` is a canvas-internal UUID with no meaning to a downstream adaptor (Telegram/Codex don't speak canvas file ids). The uri is the discoverable handle. Pinned by `TestExtractAttachmentsFromRequestBody_FlatUpload_Image` asserting `_, present := att["file_id"]; present` is false. Good security/info-hygiene practice. **No SQL changes.** The handler-side `if atts := extractAttachmentsFromRequestBody(reqBody); len(atts) > 0` call site is untouched at line 521. The new code is a pure parser refactor with one extension arm — no new query, no new bind parameter, no new column reference. SQL-injection surface unchanged. **No new imports.** `strings` was already in the imports list (used elsewhere in the file for sanitization regex). The new code uses `strings.HasPrefix` — no new module dependency. **Test envelope is comprehensive:** - `TestKindFromMimeType` — 12 cases including the tricky no-slash `"image"` and empty-string defaults. - 8 `TestExtractAttachmentsFromRequestBody_FlatUpload_*` sub-tests covering image / audio / video / generic file / no-mime fallback / snake-case input accepted / file_id-only skipped / name-only kept. - `_MessagePartsTakesPrecedenceOverFlat` pinning arm precedence. - `TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow` — wire-level integration test against the empirical 2026-05-21 ~23:12Z row (verbatim canvas paste shape: `mimeType=image/png`, `file_id` set, `size=677133`, source_id NULL), asserts envelope projects snake-case `mime_type=image/png` + `kind=image` + uri preserved + name preserved AND that peer_name/peer_role/agent_card_url are ABSENT (canvas row). The empirical-row test is the load-bearing one — pinned against the exact gap CEO-A observed. If a future refactor breaks the flat-upload arm, this test catches the regression at the wire-shape level, not just the helper-function level. **One non-blocking observation** (logged-but-NOT-blocking-merge): The `_FlatUpload_NoMimeFallsToFile` test asserts `kindFromMimeType("")` returns `"file"` and that `mime_type` is omitted from the envelope. But if a row has `{uri: "...", name: "..."}` with no mimeType, the envelope projects `{kind: "file", uri: ..., name: ...}` — no mime_type — which is correct, but a downstream adaptor that switches on mime to choose a render path (image preview vs. PDF inline vs. file-download button) has nothing to switch on. That's a real wire-shape concern but it's NOT a bug in this PR — it's a property of the source row, and the PR correctly omits the absent field rather than guessing. Worth flagging as a follow-up to file an issue against the canvas upload pipe to ALWAYS supply a mimeType (defensive HTTP content-sniff if extension isn't dispositive). File this post-merge if you agree it's worth chasing. **No regression risks. No REQUEST_CHANGES. LGTM, approving.**
core-qa approved these changes 2026-05-22 00:00:59 +00:00
core-qa left a comment
Member

Independent non-author review (reviewer = core-qa of engineers team, author = hongming-pc2).

Verified against head b6f2b90e9d. mergeable=True. Read +98/-12 in activity.go + 254 new lines in activity_peer_info_test.go. Focus on test envelope completeness + empirical-row coverage + omit-when-absent semantics + cross-layer wire consistency.

Pre-flight CI status check on the required-setCI / all-required (pull_request) on b6f2b90e9d = success (BP's only required context). All non-pilot code paths green. Hygiene gate cleared.

Test envelope completeness — comprehensive across three surfaces.

  1. kindFromMimeType helper (12 sub-tests in TestKindFromMimeType).

The mime-prefix-to-kind derivation is small but load-bearing because it's the ONLY classifier for the flat-upload arm (no explicit kind discriminator in the source row). The test pins:

  • "image/png", "image/jpeg""image" (happy path)
  • "image/""image" (trailing-slash edge — strings.HasPrefix accepts)
  • "audio/mpeg", "audio/wav""audio"
  • "video/mp4", "video/webm""video"
  • "application/pdf", "text/plain", "", "unknown""file" (default branch)
  • "image" (no slash) → "file" — load-bearing negative case. Without this assertion, a future refactor that switched to substring matching (e.g. strings.Contains(mime, "image")) would silently classify e.g. "x-image-rendered/foo" as image. The exact-prefix-with-slash semantics is pinned. Good QA defense.
  1. extractAttachmentsFromRequestBody flat-upload sub-tests (8 cases).

The empirical-row test _FlatUpload_Image deserves special attention — it's pinned against the verbatim 2026-05-21 ~23:12Z canvas paste shape:

body := []byte(`{
    "uri":"platform-pending:091a9180-/26111d48-",
    "name":"pasted-2026-05-21T23-12-25-0-0.png",
    "size":677133,
    "file_id":"26111d48-",
    "mimeType":"image/png"
}`)

And asserts:

  • kind == "image" (image/png prefix-derived)
  • uri preserved verbatim
  • mime_type == "image/png" — snake_case wire shape, NOT camelCase mimeType
  • name preserved verbatim
  • mimeType (camelCase) IS NOT in the envelope (_, present := att["mimeType"]; present == false)
  • file_id IS NOT in the envelope (canvas-internal id, not for downstream)

These last two are the load-bearing pins. Without them, a future refactor that switched to att["mimeType"] = mimeType or that propagated file_id would compile, produce a still-functional-looking envelope, and silently break wire-shape consistency with the message-parts arm + the L2 runtime + the L3 channel adaptor. The camelCase-leak test catches both regressions.

_FlatUpload_Audio + _FlatUpload_Video + _FlatUpload_GenericFile confirm the kind-derivation under realistic mime values (audio/mpeg, video/mp4, application/pdf). Each verifies kind classification AND that mime_type is the snake-case value.

_FlatUpload_NoMimeFallsToFile — defensive: a canvas row without mimeType (sniffing failure, legacy format) projects kind="file" and OMITS mime_type from the envelope (omit-when-absent rule). Without this case, a future bug that emitted att["mime_type"] = "" would slip past until a downstream consumer hit an empty-string mime branch.

_FlatUpload_SnakeCaseMimeTypeAccepted — forward-compat: a future canvas version emitting snake_case mime_type directly is still parsed. Pinning the defensive snake_case fallback means a planned canvas refactor won't break Layer 1 silently.

_FlatUpload_FileIDOnlyIsSkipped — symmetric with the message-parts arm's "empty part" skip rule. A manifest with file_id + mimeType but no uri and no name is non-actionable; the test asserts the arm returns nil rather than emitting a kind-only attachment placeholder.

_FlatUpload_NameOnlyIsKept — also symmetric: a manifest with name but no uri is useful (downstream can render "user uploaded X.bin"). Both arms apply the same uri-OR-name rule. Consistent.

  1. Arm-precedence pin (_MessagePartsTakesPrecedenceOverFlat).

A pathological body with both shapes:

{"uri":"platform-pending:should-not-win", "file_id":"x", "mimeType":"image/png",
 "params":{"message":{"parts":[
   {"kind":"file","file":{"uri":"workspace:should-win.pdf",...,"name":"win.pdf"}}
 ]}}}

Asserts len(atts) == 1 and atts[0]["uri"] == "workspace:should-win.pdf". This locks in the message-parts arm as the winner — the documented inbound shape is authoritative. A future refactor that flipped the arm order (or that introduced an additive merge of both arms) would fail this test. Right call.

Wire-level integration test (TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow) covers the full handler path.

The unit tests pin the parser; this test pins the whole pipe: sqlmock returns an activity_type=chat_upload_receive row with source_id=nil (canvas), request_body=<the empirical 2026-05-21 shape>, joined NULL peer columns. The handler runs through the regex-pinned LEFT JOIN workspaces query, builds the envelope, and emits JSON.

Assertions:

  • HTTP 200 (no error path on the new arm)
  • peer_name / peer_role / agent_card_url ABSENT (canvas row → NULL peer columns → omit per existing rule)
  • attachments IS present (array with 1 element)
  • Element has kind=image, mime_type=image/png (snake-case), full uri preserved with workspace + file ids embedded, name preserved
  • mock.ExpectationsWereMet() (LEFT JOIN keyword present + workspace_id bound)

This is the test that would have caught the original gap. If I'd been reviewing mc#1654 with chat_upload_receive in mind, the missing flat-arm would have shown up as attachments: null on this fixture — but the test didn't exist then because the surface wasn't yet known.

Omit-when-absent — verified per field at every step.

  • Empty body / non-JSON / no-uri-no-name body → arm returns nil → envelope omits attachments.
  • mimeType absent → att map omits mime_type key (if mimeType != "" { att["mime_type"] = ... }). The handler-side len(atts) > 0 check then keeps the envelope sane.
  • file_id never propagated regardless of source presence (correct info-hygiene).
  • Tested at the parser level + at the wire level.

Cross-layer wire consistency check.

The flat-upload arm's emit shape ({kind, uri, mime_type, name} keys; snake-case mime_type) matches:

  • the existing extractAttachmentsFromMessageParts emit shape (identical keys / case)
  • the L2 runtime helper _extract_attachments_from_request_body (Python) emit shape (verified by reading molecule_runtime/inbox.py:560+)
  • the L3 channel adaptor's expected meta.attachments contract (verified by reading molecule-mcp-claude-channel#13's _build_channel_notification documentation)

If a single request_body went through both arms (impossible by precedence rule, but hypothetically), the wire shape would be identical. Downstream Code Reviewer / Telegram bot / hermes wheel sees one envelope shape regardless of inbound. Tested by direct comparison between v1-message-parts test and flat-upload test expected envelopes.

No regression risks identified.

  • Existing message-parts tests untouched, still pass per local go test run on operator (0.103s).
  • Existing wire-level tests (_IssuesLeftJoin, _CanvasRowHasNoPeerFields, _AttachmentsSurfaceFromRequestBody) still pass (LEFT JOIN regex untouched).
  • The back-compat test _Unset_NoJoinNoExtraFields (no JOIN keyword in regex when flag absent) still passes — the new arm is purely additive inside the helper, no SQL path change.

One non-blocking observation (logged-but-NOT-blocking-merge):

The arm-precedence test is good but it doesn't pin the EARLY-RETURN ordering semantics — i.e. it doesn't verify that extractAttachmentsFromMessageParts is called first and its result inspected before extractAttachmentFromFlatUploadManifest is called at all. A future refactor that flipped to an "always-call-both, return parts arm if non-nil" pattern would still pass the precedence test. If you want to be paranoid you could add a test that uses a body shape where the flat arm WOULD have triggered an erroneous extraction (e.g. inject a panic in the flat arm via test build-tag) — but that's overengineering for what's at stake. Today's test is sufficient.

No regression risks. No REQUEST_CHANGES. LGTM, approving.

**Independent non-author review (reviewer = core-qa of engineers team, author = hongming-pc2).** Verified against head `b6f2b90e9d`. mergeable=True. Read +98/-12 in `activity.go` + 254 new lines in `activity_peer_info_test.go`. Focus on test envelope completeness + empirical-row coverage + omit-when-absent semantics + cross-layer wire consistency. **Pre-flight CI status check on the required-set** — `CI / all-required (pull_request)` on `b6f2b90e9d` = **success** (BP's only required context). All non-pilot code paths green. Hygiene gate cleared. **Test envelope completeness — comprehensive across three surfaces.** 1. **`kindFromMimeType` helper (12 sub-tests in `TestKindFromMimeType`).** The mime-prefix-to-kind derivation is small but load-bearing because it's the ONLY classifier for the flat-upload arm (no explicit `kind` discriminator in the source row). The test pins: - `"image/png"`, `"image/jpeg"` → `"image"` (happy path) - `"image/"` → `"image"` (trailing-slash edge — strings.HasPrefix accepts) - `"audio/mpeg"`, `"audio/wav"` → `"audio"` - `"video/mp4"`, `"video/webm"` → `"video"` - `"application/pdf"`, `"text/plain"`, `""`, `"unknown"` → `"file"` (default branch) - `"image"` (no slash) → `"file"` — load-bearing negative case. Without this assertion, a future refactor that switched to substring matching (e.g. `strings.Contains(mime, "image")`) would silently classify e.g. `"x-image-rendered/foo"` as image. The exact-prefix-with-slash semantics is pinned. Good QA defense. 2. **`extractAttachmentsFromRequestBody` flat-upload sub-tests (8 cases).** The empirical-row test `_FlatUpload_Image` deserves special attention — it's pinned against the verbatim 2026-05-21 ~23:12Z canvas paste shape: ```go body := []byte(`{ "uri":"platform-pending:091a9180-/26111d48-", "name":"pasted-2026-05-21T23-12-25-0-0.png", "size":677133, "file_id":"26111d48-", "mimeType":"image/png" }`) ``` And asserts: - `kind == "image"` (image/png prefix-derived) - `uri` preserved verbatim - `mime_type == "image/png"` — snake_case wire shape, NOT camelCase `mimeType` - `name` preserved verbatim - `mimeType` (camelCase) IS NOT in the envelope (`_, present := att["mimeType"]; present == false`) - `file_id` IS NOT in the envelope (canvas-internal id, not for downstream) These last two are the load-bearing pins. Without them, a future refactor that switched to `att["mimeType"] = mimeType` or that propagated `file_id` would compile, produce a still-functional-looking envelope, and silently break wire-shape consistency with the message-parts arm + the L2 runtime + the L3 channel adaptor. The camelCase-leak test catches both regressions. `_FlatUpload_Audio` + `_FlatUpload_Video` + `_FlatUpload_GenericFile` confirm the kind-derivation under realistic mime values (`audio/mpeg`, `video/mp4`, `application/pdf`). Each verifies kind classification AND that mime_type is the snake-case value. `_FlatUpload_NoMimeFallsToFile` — defensive: a canvas row without mimeType (sniffing failure, legacy format) projects `kind="file"` and OMITS mime_type from the envelope (omit-when-absent rule). Without this case, a future bug that emitted `att["mime_type"] = ""` would slip past until a downstream consumer hit an empty-string mime branch. `_FlatUpload_SnakeCaseMimeTypeAccepted` — forward-compat: a future canvas version emitting snake_case `mime_type` directly is still parsed. Pinning the defensive snake_case fallback means a planned canvas refactor won't break Layer 1 silently. `_FlatUpload_FileIDOnlyIsSkipped` — symmetric with the message-parts arm's "empty part" skip rule. A manifest with `file_id` + `mimeType` but no uri and no name is non-actionable; the test asserts the arm returns nil rather than emitting a kind-only attachment placeholder. `_FlatUpload_NameOnlyIsKept` — also symmetric: a manifest with name but no uri is useful (downstream can render "user uploaded X.bin"). Both arms apply the same uri-OR-name rule. Consistent. 3. **Arm-precedence pin (`_MessagePartsTakesPrecedenceOverFlat`).** A pathological body with both shapes: ```go {"uri":"platform-pending:should-not-win", "file_id":"x", "mimeType":"image/png", "params":{"message":{"parts":[ {"kind":"file","file":{"uri":"workspace:should-win.pdf",...,"name":"win.pdf"}} ]}}} ``` Asserts `len(atts) == 1` and `atts[0]["uri"] == "workspace:should-win.pdf"`. This locks in the message-parts arm as the winner — the documented inbound shape is authoritative. A future refactor that flipped the arm order (or that introduced an additive merge of both arms) would fail this test. Right call. **Wire-level integration test (`TestActivityList_IncludePeerInfo_ChatUploadReceiveCanvasRow`) covers the full handler path.** The unit tests pin the parser; this test pins the whole pipe: sqlmock returns an `activity_type=chat_upload_receive` row with `source_id=nil` (canvas), `request_body=<the empirical 2026-05-21 shape>`, joined NULL peer columns. The handler runs through the regex-pinned `LEFT JOIN workspaces` query, builds the envelope, and emits JSON. Assertions: - HTTP 200 (no error path on the new arm) - `peer_name` / `peer_role` / `agent_card_url` ABSENT (canvas row → NULL peer columns → omit per existing rule) - `attachments` IS present (array with 1 element) - Element has `kind=image`, `mime_type=image/png` (snake-case), full uri preserved with workspace + file ids embedded, name preserved - `mock.ExpectationsWereMet()` (LEFT JOIN keyword present + workspace_id bound) This is the test that would have caught the original gap. If I'd been reviewing mc#1654 with chat_upload_receive in mind, the missing flat-arm would have shown up as `attachments: null` on this fixture — but the test didn't exist then because the surface wasn't yet known. **Omit-when-absent — verified per field at every step.** - Empty body / non-JSON / no-uri-no-name body → arm returns nil → envelope omits `attachments`. - mimeType absent → att map omits `mime_type` key (`if mimeType != "" { att["mime_type"] = ... }`). The handler-side `len(atts) > 0` check then keeps the envelope sane. - file_id never propagated regardless of source presence (correct info-hygiene). - Tested at the parser level + at the wire level. **Cross-layer wire consistency check.** The flat-upload arm's emit shape (`{kind, uri, mime_type, name}` keys; snake-case mime_type) matches: - the existing `extractAttachmentsFromMessageParts` emit shape (identical keys / case) - the L2 runtime helper `_extract_attachments_from_request_body` (Python) emit shape (verified by reading `molecule_runtime/inbox.py:560+`) - the L3 channel adaptor's expected meta.attachments contract (verified by reading `molecule-mcp-claude-channel#13`'s `_build_channel_notification` documentation) If a single request_body went through both arms (impossible by precedence rule, but hypothetically), the wire shape would be identical. Downstream Code Reviewer / Telegram bot / hermes wheel sees one envelope shape regardless of inbound. Tested by direct comparison between v1-message-parts test and flat-upload test expected envelopes. **No regression risks identified.** - Existing message-parts tests untouched, still pass per local `go test` run on operator (0.103s). - Existing wire-level tests (`_IssuesLeftJoin`, `_CanvasRowHasNoPeerFields`, `_AttachmentsSurfaceFromRequestBody`) still pass (LEFT JOIN regex untouched). - The back-compat test `_Unset_NoJoinNoExtraFields` (no JOIN keyword in regex when flag absent) still passes — the new arm is purely additive inside the helper, no SQL path change. **One non-blocking observation** (logged-but-NOT-blocking-merge): The arm-precedence test is good but it doesn't pin the EARLY-RETURN ordering semantics — i.e. it doesn't verify that `extractAttachmentsFromMessageParts` is called first and its result inspected before `extractAttachmentFromFlatUploadManifest` is called at all. A future refactor that flipped to an "always-call-both, return parts arm if non-nil" pattern would still pass the precedence test. If you want to be paranoid you could add a test that uses a body shape where the flat arm WOULD have triggered an erroneous extraction (e.g. inject a panic in the flat arm via test build-tag) — but that's overengineering for what's at stake. Today's test is sufficient. **No regression risks. No REQUEST_CHANGES. LGTM, approving.**
plugin-dev merged commit a356bc94f3 into main 2026-05-22 00:01:49 +00:00
Sign in to join this conversation.
No Reviewers
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#1657