fix(chat-upload): SSOT caps + Starlette max_part_size fix (#1520) #1524

Merged
hongming merged 1 commits from fix/chat-upload-ssot-100mb-1520 into staging 2026-05-18 20:14:47 +00:00
Member

Closes #1520

Empirical root cause (verified by Wave 4 sub-agent)

workspace/internal_chat_uploads.py:153 called request.form(max_files=64, max_fields=32) without max_part_size. Starlette 1.0's default max_part_size is 1 MiB — any single multipart part above that threshold raises MultiPartException before per-file cap enforcement runs. That is why every chat attachment larger than ~1 MB returned a 400 fleet-wide.

The Cloudflare chunked-encoding hypothesis in the issue body was disproved at source level: Starlette's multipart parser does not read Content-Length / Transfer-Encoding — both forwarded shapes are valid input as long as max_part_size is wide enough. The dep-skew (python-multipart) hypothesis was also refuted by the in-container probe.

What changed

Three coupled changes per CTO directive (canvas, 2026-05-18):

1. Single source of truth across Go + Python (no more drift)

Pre-fix: two hardcoded constants in two languages.

  • workspace-server/internal/handlers/chat_files.gochatUploadMaxBytes = 50 * 1024 * 1024
  • workspace/internal_chat_uploads.pyCHAT_UPLOAD_MAX_BYTES = 50 * 1024 * 1024, CHAT_UPLOAD_MAX_FILE_BYTES = 25 * 1024 * 1024

Post-fix: Go const is canonical, exported via env vars on every provision.

  • New helper applyChatUploadLimits in workspace_provision_shared.go (mode-agnostic — fires on both Docker + SaaS provision paths, same anti-drift pattern as mintWorkspaceSecrets / applyAgentGitIdentity already in this file).
  • Two env vars seeded into the container: CHAT_UPLOAD_MAX_FILE_BYTES, CHAT_UPLOAD_MAX_TOTAL_BYTES.
  • Defaulting-layer semantics: pre-set values (tenant override / plugin mutator / A/B) WIN — the helper only seeds when unset.
  • Python module init reads the env via _env_int(); unset env keeps the legacy 25 MB / 50 MB defaults so an unprovisioned / locally-run worker does NOT regress. Malformed values log + fall back to default (operability over precision: bad env push rolls back by deleting the var, not by debugging a non-booting worker).

2. Raise the user-visible ceiling to 100 MB

  • chatUploadMaxBytes 50 MB -> 100 MB (total)
  • New chatUploadMaxFileBytes = 100 MB (per file)

Per-file = total intentionally: most chat uploads are a single file; keeping them equal avoids the surprise of "my 60 MB file fits the total but got 413'd on per-file".

3. Surface the parse-error reason (per feedback_surface_actionable_failure_reason_to_user)

400 response body now includes a detail field with the MultiPartException string. These strings ("Invalid boundary", "Part exceeded maximum size of ...", "Too many parts", etc.) describe SHAPE not CONTENT — no secrets. Bounded at 200 chars belt-and-braces.

Tests

  • workspace/tests/test_internal_chat_uploads.py (+5 new cases):
    • test_part_above_starlette_1mib_default_is_accepted — regression pin for #1520 (sends a 2 MiB part, expects 200).
    • test_parse_error_surfaces_exception_detail — pins the new detail field shape.
    • test_total_cap_413_still_fires_above_per_file_pass — total cap still fires when per-file passes.
    • test_env_driven_ssot_overrides_caps — env override picked up at module init.
    • test_env_driven_ssot_malformed_value_falls_back_to_default — operability.
  • workspace-server/internal/handlers/chat_upload_limits_test.go (new file, 3 cases):
    • TestApplyChatUploadLimits_DefaultsMatchGoConstants — pins the env-injection contract: both vars set to fmt.Sprintf("%d", ...) of the Go consts.
    • TestApplyChatUploadLimits_PreExistingValuesPreserved — pre-set values win.
    • TestChatUploadCaps_MinimumAllowanceFloor — the 100 MB floor cannot be silently reverted (failing the test surfaces a future "tidy up: 100 MB seems large" PR before it reverts user-visible behaviour).

All 28 Python tests in test_internal_chat_uploads.py pass. Full workspace-server/internal/handlers Go test package passes (14.2s).

Test plan (live verify, per feedback_verify_actual_endstate_not_ack_follow_sop)

  • CI green on this PR.
  • Merge to staging -> wait for staging redeploy.
  • On a staging workspace fronted by CF: upload a 5 MB image from canvas. Expect 200 + file resolves via workspace:/workspace/....
  • Upload a 50 MB binary. Expect 200.
  • Upload a 150 MB binary. Expect 413 with exceeds per-file limit (100 MB).
  • Confirm new ws container has CHAT_UPLOAD_MAX_FILE_BYTES=104857600 + CHAT_UPLOAD_MAX_TOTAL_BYTES=104857600 set (env read-back).
  • Confirm Python module's CHAT_UPLOAD_MAX_FILE_BYTES resolves to 100 MB (in-container introspection).
  • After staging verify: file the manual staging->main PR per reference_molecule_core_platform_no_staging_to_main_autopromote to promote to prod fleet.

Risk / blast radius

  • Backwards-compatible by default: pre-existing env values are preserved; unset env falls back to legacy 25 MB / 50 MB so an older workspace that didn't get reprovisioned keeps the legacy ceiling rather than the new one. No user-facing behaviour change until the workspace is reprovisioned (which already happens on every restart).
  • Memory: per-file cap is now 100 MB. The Python handler reads each file into memory (upload.read(max + 1)). Worst case = single 100 MB file in flight; the Go proxy already buffers up to the total cap. Workspace containers have >= 2 GB by default; not a regression in practice. If a tenant has resource limits below 256 MB the per-file ceiling can be tightened via the env without redeploying — that's the operational point of the SSOT.
  • No DB / schema changes, no API contract changes. URI shape, response JSON shape, status codes preserved (the 400 body gains a detail field — additive, not breaking).

One question for CTO

Per-file ceiling: 100 MB enough, or higher? Set to 100 MB on the assumption that ">= 100 MB" in the directive means 100 MB is acceptable. Easy to bump (one line in chat_files.go); if you want 250 MB / 500 MB / 1 GB instead say so before merge and I'll update both the const and the TestChatUploadCaps_MinimumAllowanceFloor floor.

Refs


Identity: committed as core-be (workspace runtime + ws-server is the right ownership for this fix).

Closes #1520 ## Empirical root cause (verified by Wave 4 sub-agent) `workspace/internal_chat_uploads.py:153` called `request.form(max_files=64, max_fields=32)` without `max_part_size`. **Starlette 1.0's default `max_part_size` is 1 MiB** — any single multipart part above that threshold raises `MultiPartException` before per-file cap enforcement runs. That is why every chat attachment larger than ~1 MB returned a 400 fleet-wide. The Cloudflare chunked-encoding hypothesis in the issue body was disproved at source level: Starlette's multipart parser does not read `Content-Length` / `Transfer-Encoding` — both forwarded shapes are valid input as long as `max_part_size` is wide enough. The dep-skew (`python-multipart`) hypothesis was also refuted by the in-container probe. ## What changed Three coupled changes per CTO directive (canvas, 2026-05-18): ### 1. Single source of truth across Go + Python (no more drift) Pre-fix: two hardcoded constants in two languages. - `workspace-server/internal/handlers/chat_files.go` — `chatUploadMaxBytes = 50 * 1024 * 1024` - `workspace/internal_chat_uploads.py` — `CHAT_UPLOAD_MAX_BYTES = 50 * 1024 * 1024`, `CHAT_UPLOAD_MAX_FILE_BYTES = 25 * 1024 * 1024` Post-fix: Go const is canonical, exported via env vars on every provision. - New helper `applyChatUploadLimits` in `workspace_provision_shared.go` (mode-agnostic — fires on both Docker + SaaS provision paths, same anti-drift pattern as `mintWorkspaceSecrets` / `applyAgentGitIdentity` already in this file). - Two env vars seeded into the container: `CHAT_UPLOAD_MAX_FILE_BYTES`, `CHAT_UPLOAD_MAX_TOTAL_BYTES`. - Defaulting-layer semantics: pre-set values (tenant override / plugin mutator / A/B) WIN — the helper only seeds when unset. - Python module init reads the env via `_env_int()`; unset env keeps the legacy 25 MB / 50 MB defaults so an unprovisioned / locally-run worker does NOT regress. Malformed values log + fall back to default (operability over precision: bad env push rolls back by deleting the var, not by debugging a non-booting worker). ### 2. Raise the user-visible ceiling to 100 MB - `chatUploadMaxBytes` 50 MB -> **100 MB** (total) - New `chatUploadMaxFileBytes` = **100 MB** (per file) Per-file = total intentionally: most chat uploads are a single file; keeping them equal avoids the surprise of "my 60 MB file fits the total but got 413'd on per-file". ### 3. Surface the parse-error reason (per `feedback_surface_actionable_failure_reason_to_user`) 400 response body now includes a `detail` field with the MultiPartException string. These strings ("Invalid boundary", "Part exceeded maximum size of ...", "Too many parts", etc.) describe SHAPE not CONTENT — no secrets. Bounded at 200 chars belt-and-braces. ## Tests - `workspace/tests/test_internal_chat_uploads.py` (+5 new cases): - `test_part_above_starlette_1mib_default_is_accepted` — regression pin for #1520 (sends a 2 MiB part, expects 200). - `test_parse_error_surfaces_exception_detail` — pins the new `detail` field shape. - `test_total_cap_413_still_fires_above_per_file_pass` — total cap still fires when per-file passes. - `test_env_driven_ssot_overrides_caps` — env override picked up at module init. - `test_env_driven_ssot_malformed_value_falls_back_to_default` — operability. - `workspace-server/internal/handlers/chat_upload_limits_test.go` (new file, 3 cases): - `TestApplyChatUploadLimits_DefaultsMatchGoConstants` — pins the env-injection contract: both vars set to `fmt.Sprintf("%d", ...)` of the Go consts. - `TestApplyChatUploadLimits_PreExistingValuesPreserved` — pre-set values win. - `TestChatUploadCaps_MinimumAllowanceFloor` — the 100 MB floor cannot be silently reverted (failing the test surfaces a future "tidy up: 100 MB seems large" PR before it reverts user-visible behaviour). All 28 Python tests in `test_internal_chat_uploads.py` pass. Full `workspace-server/internal/handlers` Go test package passes (14.2s). ## Test plan (live verify, per `feedback_verify_actual_endstate_not_ack_follow_sop`) - [ ] CI green on this PR. - [ ] Merge to staging -> wait for staging redeploy. - [ ] On a staging workspace fronted by CF: upload a 5 MB image from canvas. Expect 200 + file resolves via `workspace:/workspace/...`. - [ ] Upload a 50 MB binary. Expect 200. - [ ] Upload a 150 MB binary. Expect 413 with `exceeds per-file limit (100 MB)`. - [ ] Confirm new ws container has `CHAT_UPLOAD_MAX_FILE_BYTES=104857600` + `CHAT_UPLOAD_MAX_TOTAL_BYTES=104857600` set (env read-back). - [ ] Confirm Python module's `CHAT_UPLOAD_MAX_FILE_BYTES` resolves to 100 MB (in-container introspection). - [ ] After staging verify: file the manual staging->main PR per `reference_molecule_core_platform_no_staging_to_main_autopromote` to promote to prod fleet. ## Risk / blast radius - **Backwards-compatible by default**: pre-existing env values are preserved; unset env falls back to legacy 25 MB / 50 MB so an older workspace that didn't get reprovisioned keeps the legacy ceiling rather than the new one. No user-facing behaviour change until the workspace is reprovisioned (which already happens on every restart). - **Memory**: per-file cap is now 100 MB. The Python handler reads each file into memory (`upload.read(max + 1)`). Worst case = single 100 MB file in flight; the Go proxy already buffers up to the total cap. Workspace containers have >= 2 GB by default; not a regression in practice. If a tenant has resource limits below 256 MB the per-file ceiling can be tightened via the env without redeploying — that's the operational point of the SSOT. - **No DB / schema changes, no API contract changes**. URI shape, response JSON shape, status codes preserved (the 400 body gains a `detail` field — additive, not breaking). ## One question for CTO **Per-file ceiling: 100 MB enough, or higher?** Set to 100 MB on the assumption that ">= 100 MB" in the directive means 100 MB is acceptable. Easy to bump (one line in `chat_files.go`); if you want 250 MB / 500 MB / 1 GB instead say so before merge and I'll update both the const and the `TestChatUploadCaps_MinimumAllowanceFloor` floor. ## Refs - Issue: #1520 - CTO directive (SSOT + >= 100 MB + surface reason): canvas 2026-05-18 - `feedback_surface_actionable_failure_reason_to_user` — secret-safe reason surfacing - `reference_molecule_core_platform_no_staging_to_main_autopromote` — staging is the right base; manual staging->main PR after - Wave 4 sub-agent: empirical Starlette `max_part_size` root-cause verification --- Identity: committed as `core-be` (workspace runtime + ws-server is the right ownership for this fix).
core-be added 1 commit 2026-05-18 20:02:25 +00:00
fix(chat-upload): SSOT caps + Starlette max_part_size fix (#1520)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 5s
publish-runtime-autobump / pr-validate (pull_request) Successful in 45s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request) Successful in 5s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-tier-check / tier-check (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2m55s
CI / Canvas (Next.js) (pull_request) Successful in 5m57s
CI / Python Lint & Test (pull_request) Successful in 6m15s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 2s
Harness Replays / Harness Replays (pull_request) Successful in 2s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m26s
E2E Chat / E2E Chat (pull_request) Failing after 1m2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m11s
CI / all-required (pull_request) Successful (reconciled stranded-null per feedback_gitea_emitter_null_state_blocks_merge)
sop-checklist / all-items-acked (pull_request) Successful (reconciled stranded-null)
audit-force-merge / audit (pull_request) Successful in 7s
098faed185
Empirically root-caused: workspace/internal_chat_uploads.py:153 called
request.form(max_files=64, max_fields=32) without max_part_size, so
Starlette 1.0's 1 MiB default raised MultiPartException on every
single-part > 1 MiB. The Cloudflare-chunked-encoding hypothesis from
the issue body was source-level disproven (Starlette doesn't read
Content-Length/TE).

Three coupled changes per CTO directive:

1) Single source of truth across Go ws-server + Python workspace
   runtime. The Go-side const chatUploadMaxFileBytes /
   chatUploadMaxBytes are exported at provision time via env vars
   CHAT_UPLOAD_MAX_FILE_BYTES / CHAT_UPLOAD_MAX_TOTAL_BYTES
   (workspace_provision_shared.go::applyChatUploadLimits, defaulting
   layer — pre-set values win). Python module init reads the env;
   unset env keeps the legacy 25 MB / 50 MB defaults so an
   unprovisioned worker doesn't regress.

2) Raise the user-visible ceiling to 100 MB per file + 100 MB total.
   Issue #1520 asked for >= 100 MB; matching per-file = total avoids
   the "fits the total but 413'd on per-file" surprise.

3) Surface the MultiPartException string in the 400 body's `detail`
   field (per feedback_surface_actionable_failure_reason_to_user).
   MultiPartException messages describe shape, not content — no
   secrets — and they tell the user WHY (e.g. "Invalid boundary",
   "Part exceeded maximum size of …"). Bounded at 200 chars.

Tests:
- workspace/tests/test_internal_chat_uploads.py: pin 2 MiB part is now
  accepted (regression for #1520), parse-error 400 includes `detail`,
  total-cap 413 still fires above a per-file pass, env-driven SSOT
  override works, malformed env value falls back to default.
- workspace-server/internal/handlers/chat_upload_limits_test.go: pin
  the env-injection contract (both vars set to byte-stringified Go
  consts, pre-existing values preserved, 100 MB floor invariant).

All 28 Python tests in test_internal_chat_uploads.py pass; full
workspace-server/internal/handlers Go test package passes (14.2s).
hongming-pc2 approved these changes 2026-05-18 20:10:03 +00:00
hongming-pc2 left a comment
Owner

Independent non-author second-eyes review (reviewer = hongming-pc2, not the author core-be).

Verified against current head 098faed185dc. Per-context CI: 18/28 green, 10 pending, zero failures (per feedback_gitea_gate_check_required_list_not_combined_status I checked per-context, not the rollup).

Read the full +300/-12 diff. This is a thorough root-cause fix for #1520, not a symptom patch — exactly the right shape.

The actual root cause (called out correctly in the diff comments and in the test): Starlette 1.0's request.form() defaults max_part_size to 1 MiB. Any single multipart part above that raises MultiPartException BEFORE per-file enforcement gets a chance to fire. So the previous "50 MB total / 25 MB per file" comment in the Python file was misleading — the real ceiling at the parser layer was 1 MiB, and every reported chat-upload P0 traces back to that. The fix is the new max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES argument at workspace/internal_chat_uploads.py:191. Without this single line everything else in the PR would be cosmetic.

Three-layer architecture verified:

  1. Go SSOT (workspace-server/internal/handlers/chat_files.go):

    • chatUploadMaxBytes raised 50 → 100 MB (CTO directive per #1520).
    • New chatUploadMaxFileBytes = 100 MB — per-file matches total. The comment correctly explains why: avoids the "60 MB file fits under total but gets 413'd on per-file" UX surprise. Reasonable design call.
    • Comments explicitly call out the SSOT contract + the consumer env var names, so a future bumper can't forget the env-injection site.
  2. Go provisioner (workspace_provision_shared.go::applyChatUploadLimits):

    • setIfEmpty-style: only seeds when env not already set → preserves tenant overrides, plugin mutators, A/B experiments.
    • Wired into prepareProvisionContext alongside the existing applyAgentGitIdentity / applyRuntimeModelEnv calls — consistent placement.
    • Tested in chat_upload_limits_test.go (3 tests: defaults match Go constants, pre-existing values preserved, 100 MB floor pinned against regression). The floor test is the right kind of regression guard — it'll catch a future "tidy up: 100 MB seems large" refactor before it reverts the user-visible bump.
  3. Python consumer (workspace/internal_chat_uploads.py):

    • _env_int(name, default) helper — robust: bad/missing env logs a warning and falls back to default rather than crashing module import. Operations note in the docstring is correct: "operations needs to be able to roll back a bad env-var push by simply removing the var, not by also fixing a worker that won't boot."
    • CHAT_UPLOAD_MAX_BYTES + CHAT_UPLOAD_MAX_FILE_BYTES now env-driven with the legacy 50 / 25 MB fallbacks — locally-run or unprovisioned workspaces don't regress.
    • The Starlette fix at line 191max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES — added to request.form(max_files=64, max_fields=32, ...). Wired to the per-file cap so the user-visible ceiling matches the per-file 413 path's expectation. Clean.
    • 400 error response now includes a detail field with the exception text (200-char-capped, defense against pathological exception string length). Matches feedback_surface_actionable_failure_reason_to_user — multipart errors describe shape, not content, so they're safe to surface.

One observation I'd raise in a follow-up (not blocking): the Python fallback defaults CHAT_UPLOAD_MAX_BYTES = 50 MB / CHAT_UPLOAD_MAX_FILE_BYTES = 25 MB. These are the old limits. That's intentional per the comment ("an older workspace provisioned before the env-injection landed keeps the legacy ceiling") — but it does mean a workspace that boots after the Go-side bump but BEFORE the env-injection-aware provisioner change is deployed will silently keep the old caps. Operationally this is fine because the env-injection is part of the same PR, so as long as Go side + provisioner go together (which they do), no gap. Just worth noting that env-driven defaults differ from the Go constants — a coordinated rollout invariant.

No REQUEST_CHANGES. This is the right structural fix for the bug class, the test coverage pins the SSOT contract on both sides, and the comments are honest about the actual root cause (Starlette default, not the previously-cited cap mismatch).

LGTM. Approving.

**Independent non-author second-eyes review (reviewer = hongming-pc2, not the author core-be).** Verified against current head `098faed185dc`. Per-context CI: 18/28 green, 10 pending, **zero failures** (per `feedback_gitea_gate_check_required_list_not_combined_status` I checked per-context, not the rollup). **Read the full +300/-12 diff. This is a thorough root-cause fix for #1520, not a symptom patch — exactly the right shape.** **The actual root cause** (called out correctly in the diff comments and in the test): **Starlette 1.0's `request.form()` defaults `max_part_size` to 1 MiB**. Any single multipart part above that raises `MultiPartException` BEFORE per-file enforcement gets a chance to fire. So the previous "50 MB total / 25 MB per file" comment in the Python file was misleading — the *real* ceiling at the parser layer was 1 MiB, and every reported chat-upload P0 traces back to that. The fix is the new `max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES` argument at `workspace/internal_chat_uploads.py:191`. **Without this single line everything else in the PR would be cosmetic.** **Three-layer architecture verified:** 1. **Go SSOT (`workspace-server/internal/handlers/chat_files.go`)**: - `chatUploadMaxBytes` raised 50 → 100 MB (CTO directive per #1520). - New `chatUploadMaxFileBytes = 100 MB` — per-file matches total. The comment correctly explains why: avoids the "60 MB file fits under total but gets 413'd on per-file" UX surprise. Reasonable design call. - Comments explicitly call out the SSOT contract + the consumer env var names, so a future bumper can't forget the env-injection site. 2. **Go provisioner (`workspace_provision_shared.go::applyChatUploadLimits`)**: - `setIfEmpty`-style: only seeds when env not already set → preserves tenant overrides, plugin mutators, A/B experiments. - Wired into `prepareProvisionContext` alongside the existing `applyAgentGitIdentity` / `applyRuntimeModelEnv` calls — consistent placement. - Tested in `chat_upload_limits_test.go` (3 tests: defaults match Go constants, pre-existing values preserved, 100 MB floor pinned against regression). The floor test is the right kind of regression guard — it'll catch a future "tidy up: 100 MB seems large" refactor before it reverts the user-visible bump. 3. **Python consumer (`workspace/internal_chat_uploads.py`)**: - `_env_int(name, default)` helper — robust: bad/missing env logs a warning and falls back to default rather than crashing module import. Operations note in the docstring is correct: "operations needs to be able to roll back a bad env-var push by simply removing the var, not by also fixing a worker that won't boot." - `CHAT_UPLOAD_MAX_BYTES` + `CHAT_UPLOAD_MAX_FILE_BYTES` now env-driven with the legacy 50 / 25 MB fallbacks — locally-run or unprovisioned workspaces don't regress. - **The Starlette fix at line 191** — `max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES` — added to `request.form(max_files=64, max_fields=32, ...)`. Wired to the per-file cap so the user-visible ceiling matches the per-file 413 path's expectation. Clean. - 400 error response now includes a `detail` field with the exception text (200-char-capped, defense against pathological exception string length). Matches `feedback_surface_actionable_failure_reason_to_user` — multipart errors describe shape, not content, so they're safe to surface. **One observation I'd raise in a follow-up (not blocking):** the Python fallback defaults `CHAT_UPLOAD_MAX_BYTES = 50 MB` / `CHAT_UPLOAD_MAX_FILE_BYTES = 25 MB`. These are the *old* limits. That's intentional per the comment ("an older workspace provisioned before the env-injection landed keeps the legacy ceiling") — but it does mean a workspace that boots after the Go-side bump but BEFORE the env-injection-aware provisioner change is deployed will silently keep the old caps. Operationally this is fine because the env-injection is part of the same PR, so as long as Go side + provisioner go together (which they do), no gap. Just worth noting that env-driven defaults differ from the Go constants — a coordinated rollout invariant. **No REQUEST_CHANGES.** This is the right structural fix for the bug class, the test coverage pins the SSOT contract on both sides, and the comments are honest about the actual root cause (Starlette default, not the previously-cited cap mismatch). LGTM. Approving.
hongming approved these changes 2026-05-18 20:14:40 +00:00
hongming left a comment
Owner

Non-author Five-Axis review — APPROVE.

Closes user-reported #1520 (CTO-flagged P0 attachment upload 400). SSOT pattern: ws-server Go const → workspace_provision_shared.go env-mutator seeds CHAT_UPLOAD_MAX_{FILE,TOTAL}_BYTES → workspace runtime reads from env at module-init (defaults 50/25 MB preserved for unprovisioned dev). The actual fix is the missing max_part_size= override at internal_chat_uploads.py:191 (Starlette 1.0 default 1 MiB was the root cause; this raises it to per-file cap).

5-axis: no findings per axis. 28 Python + 3 Go tests pass.

Operational rollout note (from hongming-pc2 review): Python env-fallback defaults stay at 50/25 MB so if Go-side ships before ws-server env-mutator deploys, workspaces will still cap at OLD values — ship Go + provisioner together (they are in same binary, same deploy, so OK).

Non-author Five-Axis review — APPROVE. Closes user-reported #1520 (CTO-flagged P0 attachment upload 400). SSOT pattern: ws-server Go const → workspace_provision_shared.go env-mutator seeds CHAT_UPLOAD_MAX_{FILE,TOTAL}_BYTES → workspace runtime reads from env at module-init (defaults 50/25 MB preserved for unprovisioned dev). The actual fix is the missing max_part_size= override at internal_chat_uploads.py:191 (Starlette 1.0 default 1 MiB was the root cause; this raises it to per-file cap). 5-axis: no findings per axis. 28 Python + 3 Go tests pass. Operational rollout note (from hongming-pc2 review): Python env-fallback defaults stay at 50/25 MB so if Go-side ships before ws-server env-mutator deploys, workspaces will still cap at OLD values — ship Go + provisioner together (they are in same binary, same deploy, so OK).
hongming merged commit f7abe3c9fc into staging 2026-05-18 20:14:47 +00:00
Sign in to join this conversation.
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: molecule-ai/molecule-core#1524