feat(uploads): bump cap to 100MB + correct-reason error messages (no more "timeout" for file-size) #1588

Merged
devops-engineer merged 1 commits from infra-runtime-be/upload-100mb-and-correct-reason-errors into main 2026-05-20 03:36:27 +00:00
Member

Summary

Bumps the chat-file upload cap from 50 MB to 100 MB and fixes the wrong-reason error surface flagged by the CTO on forensic a99ab0a1. In that incident, reno-stars uploaded a >50 MB file and saw Upload failed: signal timed out. The actual cause was file size combined with the client's fixed 60 s AbortSignal.timeout, which fired before the slow uplink finished streaming. The server eventually returned 400 body too large, but the client had already aborted itself, so the user-visible reason was the wrong one.

CTO directive (verbatim): "if its file size issue, should have error that instead saying timeout which is wrong".

Bundled into one PR because the two changes are coupled — bumping the cap alone would still leak the fixed-60 s timeout for legitimate slow uploads; fixing the client alone would 413 every >50 MB attempt.

Failure-reason contract (the load-bearing UX change)

| Case | When fires | User-facing message |
|---|---|---|
| File > 100 MB | Pre-flight, before any fetch | File too large (got X.XMB) — limit is 100MB. Please use a smaller file. |
| File ≤ 100 MB, slow upload exceeds dynamic timeout | AbortSignal.timeout fires during fetch | Upload timed out — your connection is too slow for this file. Try again, or reduce file size. (no mention of file-size; pre-flight already excluded it) |
| Server 4xx/5xx | Server response | Server's reason surfaced verbatim (Upload failed: <status> <body>) |
| Other JS error | catch fallback | Upload failed: <e.message> |

No conflation. Each cause maps to ITS OWN message.
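
The contract above can be sketched as one mapping function. This is a minimal illustration reconstructed from the table, not the code merged in this PR. The class body, the assumption that server failures arrive as an Error whose message is already "<status> <body>", and the wording of the last fallback are all assumptions.

```typescript
// Sketch only: the class body and the server-error shape are assumptions.
class FileTooLargeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FileTooLargeError";
  }
}

const TIMEOUT_REASON =
  "Upload timed out — your connection is too slow for this file. Try again, or reduce file size.";

function mapUploadErrorToReason(e: unknown): string {
  // 1. Pre-flight rejection: the error already carries the actionable message.
  if (e instanceof FileTooLargeError) return e.message;
  // 2. AbortSignal.timeout rejects with an error whose name is "TimeoutError".
  if (e instanceof Error && e.name === "TimeoutError") return TIMEOUT_REASON;
  // 3. Assumed: server failures are thrown as Error("<status> <body>").
  if (e instanceof Error) return `Upload failed: ${e.message}`;
  // 4. Non-Error values (hypothetical fallback wording).
  return `Upload failed: ${String(e)}`;
}
```

Because the size check runs first and uses its own class, a timeout can never be reported as a size problem, and a size problem can never reach the timeout branch.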

Server-side (push-mode, EC2 workspace)

  • workspace-server/internal/handlers/chat_files.go
    • chatUploadMaxBytes 50 → 100 MB
    • httpClient.Timeout 120 → 1200 s (matches the new slow-uplink budget at 100 KB/s)
  • workspace/internal_chat_uploads.py
    • CHAT_UPLOAD_MAX_BYTES 50 → 100 MB
    • CHAT_UPLOAD_MAX_FILE_BYTES 25 → 100 MB (aligned with the total so a single legitimate large file — e.g. Ryan's PDF — succeeds end-to-end)

Canvas

  • canvas/src/components/tabs/chat/uploads.ts
    • MAX_UPLOAD_BYTES exported constant (100 MB)
    • New FileTooLargeError class — distinct name so the catch path can route correctly without string-matching
    • Pre-flight size check throws BEFORE any fetch() so the file-size case can never surface as a downstream timeout
    • computeUploadTimeoutMs(bytes): 60 s floor, otherwise a deadline scaled at an assumed 100 KB/s uplink (~1049 s at the 100 MB cap)
  • canvas/src/components/tabs/chat/hooks/useChatSend.ts
    • mapUploadErrorToReason(e) routes each cause to its own message; exported for unit testing
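
A rough sketch of the two uploads.ts pieces listed above, assuming the gate is applied per file and that the deadline is max(60 s, bytes / 100 bytes per ms). The SizedFile type, the helper name assertWithinCap, and the constant names other than MAX_UPLOAD_BYTES are hypothetical. The real implementation may differ.

```typescript
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024; // 100 MB cap from this PR
const MIN_TIMEOUT_MS = 60000;               // 60 s floor
const ASSUMED_BYTES_PER_MS = 100;           // 100 KB/s assumed uplink

class FileTooLargeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FileTooLargeError";
  }
}

// Hypothetical stand-in for the browser File type (only the fields used here).
interface SizedFile {
  name: string;
  size: number;
}

// Pre-flight gate: throws before any network call is attempted.
function assertWithinCap(files: SizedFile[]): void {
  for (const f of files) {
    if (f.size > MAX_UPLOAD_BYTES) {
      const gotMb = (f.size / (1024 * 1024)).toFixed(1);
      throw new FileTooLargeError(
        `File too large (got ${gotMb}MB) — limit is 100MB. Please use a smaller file.`
      );
    }
  }
}

// Deadline grows with payload size but never drops below the floor.
function computeUploadTimeoutMs(totalBytes: number): number {
  return Math.max(MIN_TIMEOUT_MS, Math.ceil(totalBytes / ASSUMED_BYTES_PER_MS));
}

const deadlineAtCap = computeUploadTimeoutMs(MAX_UPLOAD_BYTES); // 1048576 ms, about 1049 s
```

In the browser the result would typically be passed as AbortSignal.timeout(computeUploadTimeoutMs(total)) in the fetch options, so that small uploads still fail fast while large ones get a proportionate budget.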

Tests

  • workspace-server/internal/handlers/chat_files_test.go (2 new):
    • TestChatUpload_BodyUnderCap_Forwards pins chatUploadMaxBytes == 100 MB and confirms a sub-cap upload forwards
    • TestChatUpload_BodyOverCap_NotOK verifies an over-cap body does NOT silently succeed
  • canvas/src/components/tabs/chat/__tests__/uploads.cap.test.ts (10 cases, NEW): cap constant, pre-flight gate, exact-cap edge, scaled-timeout curve, server-413 propagation, AbortSignal shape, explicit negative pinning TimeoutError ≠ FileTooLargeError
  • canvas/src/components/tabs/chat/hooks/__tests__/useChatSend.errorReason.test.ts (5 cases, NEW): per-cause message contract, explicit negatives guarding against wrong-reason conflation

Local run: Go handlers suite green (16.5 s); canvas chat suite green 283/283 (11.4 s).

Test harness mirror

  • tests/harness/cf-proxy/nginx.conf client_max_body_size 50 m → 100 m (the harness mirror; production CF / nginx tier is out-of-repo. If prod still caps at 50 m the mirror will pass while prod 413s — surfaced explicitly in the inline comment so on-call sees it.)

Follow-up (NOT in this PR)

The 100 MB constant now lives in THREE mirror sites (canvas TS + workspace Python + platform Go). Per feedback_no_single_source_of_truth the proper fix is exposing the cap via GET /uploads/limits so the client fetches the live value. Filing as a separate issue in molecule-ai/internal.
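
As a rough idea of what the client side of that follow-up might look like: the endpoint does not exist yet, so the response shape (a JSON object with a maxUploadBytes field) and every name below are hypothetical. The sketch only shows the parsing step, falling back to the compiled-in constant when the payload is missing or malformed.

```typescript
// Hypothetical: GET /uploads/limits is a proposed endpoint, not an existing one.
const FALLBACK_MAX_UPLOAD_BYTES = 100 * 1024 * 1024;

// Accepts the parsed JSON body and returns the cap the client should enforce.
function resolveMaxUploadBytes(payload: unknown): number {
  if (typeof payload === "object" && payload !== null) {
    const value = (payload as { maxUploadBytes?: unknown }).maxUploadBytes;
    if (typeof value === "number" && isFinite(value) && value > 0) {
      return value;
    }
  }
  // Missing or malformed response: keep the compiled-in cap.
  return FALLBACK_MAX_UPLOAD_BYTES;
}
```

Keeping a fallback means a failed limits request degrades to today's behaviour rather than blocking uploads.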

References

  • task #295 (internal tracker — CTO-authorized this work)
  • forensic a99ab0a1 (reno-stars 2026-05-19)
  • feedback_surface_actionable_failure_reason_to_user (CTO 2026-05-17)

Test plan

  • CI green on platform Go + canvas vitest + Python workspace tests
  • Post-merge: confirm new SHA live on api.moleculesai.app (workspace-server deploy) and app.moleculesai.app (canvas Vercel deploy)
  • Manual smoke on staging:
    • 70 MB file uploads successfully (was 50 MB-blocked pre-fix)
    • 101 MB file → user sees the "File too large (got 101.0MB) — limit is 100MB" message immediately, no network round-trip
    • 50 MB file on a throttled connection that exceeds the scaled deadline → user sees "Upload timed out — your connection is too slow for this file" (and explicitly NOT a "file too large" message)
  • Notify Ryan (reno-stars) via canvas in Chinese when verified
infra-runtime-be added 1 commit 2026-05-20 03:23:48 +00:00
feat(uploads): bump cap to 100MB + correct-reason error messages
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 11s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 16s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Harness Replays / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Lint no tenant GITEA/GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m8s
publish-runtime-autobump / pr-validate (pull_request) Successful in 34s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Successful in 6s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Platform (Go) (pull_request) Successful in 5m5s
CI / Canvas (Next.js) (pull_request) Successful in 6m11s
CI / Python Lint & Test (pull_request) Successful in 7m17s
CI / all-required (pull_request) Successful in 6m33s
Harness Replays / Harness Replays (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m27s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m24s
security-review / approved (pull_request) Refired via /security-recheck by unknown
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m56s
E2E Chat / E2E Chat (pull_request) Failing after 6m33s
audit-force-merge / audit (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m21s
5c989fef2f
CTO 2026-05-19 directive on forensic a99ab0a1 (reno-stars >50MB
upload that surfaced "signal timed out" when the real cause was
file-size + a fixed 60s client timeout):

  "if its file size issue, should have error that instead saying
   timeout which is wrong"

Bundles the cap raise + the wrong-reason fix in ONE PR because the
two are coupled — bumping the server alone would still leak the
fixed-60s timeout for legitimate slow uploads; fixing the client
alone would 413 every >50MB attempt.

Server (push-mode, EC2 workspace):
  - workspace-server/internal/handlers/chat_files.go:
      chatUploadMaxBytes 50→100 MB
      httpClient.Timeout 120→1200 s (matches the new slow-uplink budget)
  - workspace/internal_chat_uploads.py:
      CHAT_UPLOAD_MAX_BYTES 50→100 MB
      CHAT_UPLOAD_MAX_FILE_BYTES 25→100 MB (aligned with total so a
      single legitimate large file succeeds end-to-end)

Canvas:
  - canvas/src/components/tabs/chat/uploads.ts:
      MAX_UPLOAD_BYTES 100 MB constant + FileTooLargeError class
      pre-flight gate: file-size violation throws BEFORE any fetch,
        with the actionable "File too large (got X MB) — limit is 100MB"
      computeUploadTimeoutMs: 60s floor + 100 KB/s scaled deadline
        (was a fixed 60s — the root cause of the forensic)
  - canvas/src/components/tabs/chat/hooks/useChatSend.ts:
      mapUploadErrorToReason: routes each cause to ITS OWN message
        (FileTooLargeError | TimeoutError | server-Error | fallback)
      no conflation between file-size and connection-too-slow

Tests:
  - workspace-server chat_files_test.go: pins 100 MB constant,
    asserts sub-cap forwards + over-cap non-2xx
  - canvas uploads.cap.test.ts (10 cases): pre-flight gate, exact-cap
    edge, scaled-timeout curve, server-413 propagation, AbortSignal
    shape — explicit negative on "TimeoutError ≠ FileTooLargeError"
  - canvas useChatSend.errorReason.test.ts (5 cases): per-cause
    message contract, explicit negatives that guard against the
    wrong-reason conflation

Test harness mirror:
  - tests/harness/cf-proxy/nginx.conf: client_max_body_size 50m→100m
    (this is the harness mirror; the production CF / nginx tier is
    out-of-repo. If prod still caps at 50m, this mirror passes while
    prod 413s — surface to ops.)

Follow-up (SSOT, NOT in this PR):
  The 100 MB constant now lives in THREE mirror sites (canvas TS +
  workspace Python + platform Go). Per feedback_no_single_source_of_truth,
  the proper fix is exposing the cap via GET /uploads/limits so the
  client fetches the live value. Filing as a separate issue.

References:
  - task #295 (internal tracker; CTO-authorized this work)
  - forensic a99ab0a1 (reno-stars 2026-05-19)
  - feedback_surface_actionable_failure_reason_to_user (CTO 2026-05-17)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
core-devops approved these changes 2026-05-20 03:27:18 +00:00
core-devops left a comment
Member

core-devops review — Go workspace-server + nginx + Python lens.

Verified:

  • workspace-server/internal/handlers/chat_files.go: chatUploadMaxBytes = 100*1024*1024 (line 127). http.MaxBytesReader installed BEFORE forwarding (line 304) — request body is gated server-side independent of any client pre-flight.
  • HTTP client Timeout = 1200s (line 99). Math: 100 MB / 100 KB/s = ~1049s client AbortSignal deadline; 1200s server > client by 151s, so a legitimate slow stream is killed by the CLIENT first with the precise 'connection too slow' message rather than dying server-side mid-stream with an ambiguous status. Curve checks out.
  • workspace/internal_chat_uploads.py: CHAT_UPLOAD_MAX_BYTES + CHAT_UPLOAD_MAX_FILE_BYTES both 100*1024*1024 (lines 74, 80). 413 surfaced explicitly with the byte limit in the body (lines 154, 230, 236) — operator can read the cap straight off the 413 response.
  • tests/harness/cf-proxy/nginx.conf: client_max_body_size 100m (line 64). Matches.
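
The budget arithmetic in the second bullet, written out. It assumes the client deadline is the byte count divided by 100 bytes per millisecond, as described in the QA review of the timeout curve.

```latex
\begin{align*}
t_{\text{client}} &= \frac{100 \times 1024 \times 1024\ \text{bytes}}{100\ \text{bytes/ms}}
                   = 1\,048\,576\ \text{ms} \approx 1049\ \text{s} \\
t_{\text{server}} - t_{\text{client}} &\approx 1200\ \text{s} - 1049\ \text{s} = 151\ \text{s}
\end{align*}
```

The positive margin is what guarantees that the client deadline fires before the server-side HTTP client timeout.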

All three server-side mirror sites at 100MB. SSOT drift risk acknowledged — author has TODO comment in canvas/uploads.ts pointing at GET /uploads/limits follow-up; I'll file the issue post-merge per the dispatch.

LGTM.

core-qa approved these changes 2026-05-20 03:27:39 +00:00
core-qa left a comment
Member

core-qa review — canvas TS + test coverage lens.

Verified the bug-fix contract (CTO forensic a99ab0a1 — 'file-size should not surface as timeout'):

  1. Pre-flight gate (canvas/uploads.ts:80-94). Iterates files BEFORE any fetch(); throws FileTooLargeError with the offending size + cap in the message. No network round-trip on the violating case. Catch in useChatSend.ts:75 routes by instanceof — uses err.message verbatim. PASS: 101MB rejection is instant + actionable.

  2. Timeout-vs-size disambiguation (useChatSend.ts:80-87). Discriminated by err.name === 'TimeoutError' (DOMException from AbortSignal). Because gate #1 already excluded oversize files, a TimeoutError reaching this branch CANNOT mean file-size — it's necessarily slow uplink. Message says exactly that: 'Upload timed out — your connection is too slow for this file. Try again, or reduce file size.' No conflation. PASS.

  3. Scaled timeout curve (uploads.ts:50, computeUploadTimeoutMs). 60s floor for small files (so a typo'd host surfaces fast); above the floor, totalBytes/100 ms = 100 KB/s assumed uplink. At 100MB cap → 1049s, comfortably above any realistic slow-but-real connection (mobile tether @ 200 KB/s = 524s — fine). PASS.

Test coverage:

  • uploads.cap.test.ts (179 LoC, 11+ cases): constant pinned to 100*1024*1024 (line 31); FileTooLargeError shape + msg pattern; pre-flight throws BEFORE any fetch (verified via mocked fetch never called); timeout curve pinned at boundaries (100KB → 60s, 1MB → 60s, 100MB → 1048576ms, monotonic).
  • useChatSend.errorReason.test.ts (79 LoC): each of the 4 error-mapping branches has its own assertion (FileTooLargeError → verbatim, TimeoutError → 'too slow', generic Error → status+body, unknown → fallback).
  • chat_files_test.go (+76 LoC): Go MaxBytesReader 413 path covered.
  • test_internal_chat_uploads.py: cap-constant assertion bumped to 100MB.
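
The "mocked fetch never called" check can be illustrated with a hand-rolled spy. This is a simplified sketch, not the vitest code in uploads.cap.test.ts: startUpload, Sender, and SizedFile are hypothetical names, and the network call is reduced to a synchronous callback so the ordering is easy to see.

```typescript
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024;

class FileTooLargeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FileTooLargeError";
  }
}

interface SizedFile {
  name: string;
  size: number;
}

// Stand-in for the network call; the real code would call fetch().
type Sender = (files: SizedFile[]) => void;

// The size gate runs first, so `send` is unreachable for an oversize file.
function startUpload(files: SizedFile[], send: Sender): void {
  for (const f of files) {
    if (f.size > MAX_UPLOAD_BYTES) {
      throw new FileTooLargeError(`${f.name} exceeds the upload cap`);
    }
  }
  send(files);
}

// Spy that only counts invocations.
let sendCalls = 0;
const spySend: Sender = (_files) => {
  sendCalls += 1;
};

let rejected = false;
try {
  startUpload([{ name: "big.bin", size: MAX_UPLOAD_BYTES + 1 }], spySend);
} catch (e) {
  rejected = e instanceof FileTooLargeError;
}
// Expected here: rejected is true and sendCalls is still 0.
```

Asserting on the call count, rather than only on the thrown error, is what pins the "no network round-trip" half of the contract.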

17 new tests, all locally green per dispatch state. Failure-reason contract honored end-to-end.

LGTM.

core-security approved these changes 2026-05-20 03:30:50 +00:00
core-security left a comment
Member

core-security review — security-surface lens.

Threat-model walk on the PR diff:

  1. Body-size DoS surface. Server-side cap enforced via http.MaxBytesReader at workspace-server/internal/handlers/chat_files.go:304 BEFORE ParseMultipartForm. This is the correct order — ParseMultipartForm without an upstream MaxBytesReader would allocate up to disk before checking, exposing a slow-loris-style upload-DoS. PASS.

  2. Memory-exhaustion via per-file streaming. workspace/internal_chat_uploads.py reads CHAT_UPLOAD_MAX_FILE_BYTES+1 (line 227) to bound the upload.read() call — a hostile client claiming small Content-Length cannot OOM the python worker. PASS.

  3. Cap raised 50→100MB. Risk delta: 2× per-request memory/disk burst. Nginx harness (tests/harness/cf-proxy/nginx.conf) bumped to 100m to match, so prod-shaped reverse-proxy paths won't 413-clip pre-server. The prod CF/nginx edge itself is not changed by this PR; flagging that for separate verification (dispatch step 4). NOT A BLOCKER for this PR.

  4. Error-message info-disclosure. New error messages embed file size (MB-rounded) and the 100MB cap — both safe (size is client-supplied; cap is the public contract). Server-side 'upload failed: <status> <text>' propagates server body verbatim; existing pattern, no new leakage.

  5. No new auth / no new env / no new secret. Diff is purely size/timeout/error-message logic. credentials: 'include' and platformAuthHeaders() are unchanged.

  6. Pre-flight gate cannot be bypassed by client trickery. Even if a malicious frontend skips the JS pre-flight, the Go server enforces MaxBytesReader → 413. Defense-in-depth holds. PASS.

No security regressions. LGTM.

Member

/security-recheck

devops-engineer merged commit a23c0217ae into main 2026-05-20 03:36:27 +00:00
Reference: molecule-ai/molecule-core#1588