fix: resolve inbound attachments at runtime #45
Reference in New Issue
Block a user
Delete Branch "fix/attachment-l4-runtime"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Implements the baseline Layer 4 runtime attachment resolver for Python runtimes:
workspace:attachments still resolve directly from diskplatform-pending:<workspace>/<file_id>attachments fetch from/pending-uploads/:file_id/content, cache under/workspace/.molecule/inbox/<uri-hash>/<safe-name>, then ACK after the cache writeworkspace:paths fetch from/chat/download?path=<absolute-path>and cache locallyshared_runtime.extract_message_textappends the same local-path attachment manifest so shared-runtime adapters no longer treat file-only messages as empty0.2.1for publish cascadeRFC/design tracker: molecule-ai/internal#650.
Phase 1 evidence
Brief claims checked
extract_attached_files; shared-runtime adapters usedextract_message_textonly.chat/downloadcan fetchplatform-pending:URIs. Falsified byworkspace-server/internal/handlers/chat_files.go:Download: it requires an absolute path under allowed roots.molecule_runtime/inbox_uploads.pyandworkspace-server/internal/router/router.go:/pending-uploads/:file_id/contentand/ackarewsAuth-gated.wsAuthandplatform_auth.auth_headers; resolver only emits local paths into prompts.Layer map
molecule_runtime.executor_helpers.extract_attached_filesmolecule_runtime.shared_runtime.extract_message_textDesign / rollback
The fix lives in
molecule-ai-workspace-runtime, the Python runtime SSOT, so claude-code/codex/hermes and shared-runtime adapters inherit the baseline path on the next runtime publish/cascade. Runtime-native image block delivery remains a follow-up, not required for the baseline local-path contract.Rollback is a normal revert of this PR plus not tagging/publishing
runtime-v0.2.1if CI or review finds an issue.Changed-branch coverage ledger
extract_attached_filesworkspace:file existstest_extract_attached_files_accepts_both_shapes,test_extract_attached_files_accepts_v1_protobuf_partextract_attached_filesplatform-pending:miss fetches bytes and ACKstest_extract_attached_files_fetches_platform_pending_attachmentextract_attached_filestest_extract_attached_files_platform_pending_cache_is_idempotentextract_attached_filesworkspace:path downloads via/chat/downloadtest_extract_attached_files_downloads_missing_workspace_uriextract_attached_filestest_extract_attached_files_platform_pending_requires_workspace_tokenshared_runtime.extract_message_texttest_extract_message_text_appends_attachment_manifestshared_runtime.extract_message_texttest_extract_message_text_file_only_message_returns_manifestVerification
PYTHONPATH=. pytest -q-> 381 passedruff check --select=E9,F63,F7,F82 molecule_runtime/-> passpython3 -m build --wheel-> builtmolecule_ai_workspace_runtime-0.2.1-py3-none-any.whlSOP notes
Tier: medium. This changes runtime file handling and auth-bound attachment fetching, but does not add a new platform endpoint or migration. Needs 2 non-author approvals before merge per current branch protection.
SOP review request: PR #45 is ready for non-author review once CI leaves queue. Tier: medium (runtime file handling/auth-bound attachment fetch). Evidence + branch coverage ledger are in the PR body; RFC tracker is internal#650. Suggested reviewer pool: dev-lead / sdk-dev / release-manager.
SOP self-review (not an approval):
platform-pending:resolves via/pending-uploads/:file_id/content+/ack, while missing absoluteworkspace:paths resolve via/chat/download?path=....molecule-ai-workspace-runtime, the Python runtime SSOT, rather than per-template copies.platform_auth.auth_headers; tokens/URLs are not surfaced to the model; cached filenames are sanitized and paths remain local.Verification: local
pytest -q381 passed; ruff selected checks passed; wheel build passed. Gitea Actions:ci.ymlSuccess,secret-scan.ymlSuccess ate918ee7.Findings:
[High] Validate and escape the
platform-pending:route components before constructing authenticated platform URLs.Location:
molecule_runtime/executor_helpers.py:906andmolecule_runtime/executor_helpers.py:925-930_download_attachment_uritakesworkspace_idfrom rawWORKSPACE_IDor directly from the inboundplatform-pending:<ws>/<file_id>URI, then passes it toauth_headers(workspace_id)and interpolates it into/workspaces/{workspace_id}/.... That bypasses the existingplatform_auth.validate_workspace_id()contract documented inplatform_auth.py;auth_headers()does not validate the argument and can fall back to the single-workspace token. The same code also interpolatesfile_idinto a path segment without validation or URL escaping. Since attachment URIs are message data, a malformed URI containing/,..,?,#, or control-ish separators can alter the authenticated request path or make the runtime send its bearer token to an unintended platform route on the trusted host.Impact: a crafted attachment URI can turn this new L4 fetch path into authenticated path confusion against the platform. This is exactly the boundary this PR is meant to harden: workspace bearer auth should only be used for the current workspace and the intended pending-upload object.
Recommendation: derive the workspace ID through the existing validator (
validate_workspace_id/get_workspace_id) and reject mismatches between env/current workspace and the URI workspace. Validatefile_idagainst the platform's accepted ID grammar, or at minimum quote it as a single path segment withurllib.parse.quote(file_id, safe="")before building both content and ack URLs. Add regression tests for a bad workspace ID and a file ID containing/or..that assert no network call is made.Notes from the rest of the review:
e918ee7fa7e49bb16ec4320ef7ae523de9fc2515is green: unit-tests, lint, build, smoke-install, and secret scan all succeeded.resp.contentmaterializes the whole response; this matches existing adjacent code, so I am not blocking on it here, but streaming enforcement would be a reasonable follow-up hardening item.Required:
_download_attachment_uriignores the runtime's per-workspace platform URL registry. Inmolecule_runtime/executor_helpers.py:902-906, the download target is built only from process-wideMOLECULE_API_URL/PLATFORM_URL; it never consultsmolecule_runtime.platform_auth.get_workspace_platform_url(workspace_id), even thoughauth_headers(workspace_id)already supports the per-workspace token/origin path. For the supported externalMOLECULE_WORKSPACES=[{"id", "token", "platform_url"}]mode, this either skips attachment fetches when no global platform URL is set or fetches from the wrong tenant when one process is registered to multiple tenants. Because this PR bumpspyproject.toml:12to0.2.1for a runtime publish cascade, that gap would ship to every consumer of the wheel. Please resolve the URL with the same per-workspace precedence used byplatform_authand add changed-branch coverage for a registered per-workspaceplatform_urlattachment fetch.Five-axis pass:
Verification reviewed:
e918ee7fa7e49bb16ec4320ef7ae523de9fc2515.96010(ci.yml) and96011(secret-scan.yml) arestatus=1, Success; tasks149517-149521are allstatus=1, Success.Findings:
[High]
platform-pending:downloads can still fall back to the process-wide token whenWORKSPACE_IDis unset.Location:
molecule_runtime/executor_helpers.py:956-967The path-shaping half of the previous blocker is resolved: the URI workspace is validated with
validate_workspace_id, and the pending upload ID is now constrained by UUID parsing before it is interpolated into the content/ack URLs.The cross-workspace token-isolation half is still incomplete. The new guard only rejects a missing per-workspace token when
env_workspace_idis non-empty and different from the URI workspace:In the multi-workspace external-runtime shape,
WORKSPACE_IDmay be unset while tokens are expected to come from the per-workspace registry. In that case this guard does not run. The next line callsauth_headers(workspace_id), whose documented behavior is to checkget_workspace_token(workspace_id)and then fall back to the single-workspace cache/env token when no per-workspace token exists (platform_auth.py:420-423). That means aplatform-pending:ws-b/<uuid>attachment can still be fetched with a process-wide/single-workspace bearer token when no token is registered forws-b.Impact: an inbound attachment URI can cause the runtime to send the wrong workspace bearer token to a pending-upload route for another workspace, which is the token-confusion case this hardening was meant to close.
Recommendation: for every
platform-pending:URI, requireget_workspace_token(workspace_id)unless the URI workspace is the validated current single-workspace ID. If there is no current single-workspace ID, fail closed instead of allowingauth_headers(workspace_id)to fall back. Add a regression test withWORKSPACE_IDunset,get_workspace_token("ws-b") -> None, and a process-wide token/env present; assert no network call is made.CI note: statuses for head
8c9180399d46d2375a8f6ddf11862503cdc377c4were still pending at review time.Approved for the prior release-manager blocker.
The
MOLECULE_WORKSPACES/ external-runtime routing issue is resolved at this head:molecule_runtime/executor_helpers.py:937-946now derives the pending-upload workspace from theplatform-pending:URI and consultsget_workspace_platform_url(workspace_id)before falling back to process-wideMOLECULE_API_URL/PLATFORM_URL.molecule_runtime/executor_helpers.py:953-967also prevents reusing the process-wide token for a different workspace when no token is registered for that workspace.Coverage added for the blocker is in
tests/test_executor_helpers.py:1130-1192, which pins that aplatform-pending:ws-b/...attachment uses tenant B's registered platform URL and token even when the process fallback points at tenant A. The cross-workspace no-token case is covered attests/test_executor_helpers.py:1228-1269, and invalid pending-upload IDs are rejected before network I/O attests/test_executor_helpers.py:1195-1225.Five-axis scoped re-review:
platform_authregistry state instead of duplicating a separate routing source.8c9180399d46d2375a8f6ddf11862503cdc377c4were still queued during this review (status=5, Waitingin the Actions DB), so merge should still wait for branch protection / CI to pass.Local scoped verification on the PR head:
PYTHONPATH=. pytest -q tests/test_executor_helpers.py -k 'platform_pending or per_workspace_platform_url or cross_workspace_without_token or invalid_pending_upload_id'->6 passed, 99 deselected.Re-reviewed head
218d95ace89dd3d8c274b2ba8fb41e16d78b37c7against my prior blocker only.The blocker is resolved.
platform-pending:URI parts are validated before URL construction, and pending downloads now require a registered per-workspace token whenever the pending workspace does not exactly match the single-runtimeWORKSPACE_ID. The no-WORKSPACE_IDprocess-token fallback case is covered by regression test and fails closed before any network call.CI note: statuses for this head were still pending at review time; this approval is for the code/security review scope, not a CI override.
Approved for the scoped re-review at
218d95ace89dd3d8c274b2ba8fb41e16d78b37c7.The per-workspace platform URL path remains resolved:
molecule_runtime/executor_helpers.py:937-946derives the pending upload workspace from theplatform-pending:URI and checks the registered per-workspace platform URL before falling back to process-wide platform URLs.The added token-isolation guard resolves the remaining process-token fallback risk:
molecule_runtime/executor_helpers.py:956-967now requires a registered token for anyplatform-pending:workspace that is not the singleWORKSPACE_ID, including the external-runtime case where noWORKSPACE_IDis set. That preventsauth_headers(workspace_id)from falling through toMOLECULE_WORKSPACE_TOKEN/ process token for a workspace that is only named in the URI.Coverage reviewed:
tests/test_executor_helpers.py:1130-1192covers per-workspace platform URL selection for tenant B while the process fallback points at tenant A.tests/test_executor_helpers.py:1228-1267covers cross-workspace pending upload rejection without a registered token.tests/test_executor_helpers.py:1270-1306covers the no-WORKSPACE_IDexternal-runtime case and proves the process token is not reused.Five-axis scoped pass:
platform_authAPIs.status=5, Waiting) during review, so merge should still wait for required checks/branch protection.Local scoped verification on this exact head:
PYTHONPATH=. pytest -q tests/test_executor_helpers.py -k 'platform_pending or per_workspace_platform_url or cross_workspace_without_token or pending_without_workspace_registry or invalid_pending_upload_id'->7 passed, 99 deselected.