fix(uploads): bump poll-mode size_bytes CHECK to 100MB to match push-mode (mc#1588)
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 21s
Check migration collisions / Migration version collision check (pull_request) Successful in 23s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 44s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
publish-runtime-autobump / pr-validate (pull_request) Successful in 38s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 5m27s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m15s
CI / Canvas (Next.js) (pull_request) Successful in 6m47s
CI / Python Lint & Test (pull_request) Successful in 7m32s
CI / all-required (pull_request) Successful in 7m5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 20s
Harness Replays / Harness Replays (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m41s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m58s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m19s
audit-force-merge / audit (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Failing after 9m40s
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 21s
Check migration collisions / Migration version collision check (pull_request) Successful in 23s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 44s
E2E API Smoke Test / detect-changes (pull_request) Successful in 18s
E2E Chat / detect-changes (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 16s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 13s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m6s
publish-runtime-autobump / pr-validate (pull_request) Successful in 38s
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Successful in 4s
security-review / approved (pull_request) Successful in 4s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 5m27s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m15s
CI / Canvas (Next.js) (pull_request) Successful in 6m47s
CI / Python Lint & Test (pull_request) Successful in 7m32s
CI / all-required (pull_request) Successful in 7m5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 20s
Harness Replays / Harness Replays (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m41s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m58s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m19s
audit-force-merge / audit (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Failing after 9m40s
CTO directive 2026-05-19 task #295: poll-mode (laptop-runtime workspaces) had a 25 MB CHECK constraint on `pending_uploads.size_bytes` that diverged from the push-mode 100 MB cap mc#1588 is landing for the SaaS EC2 tenants (reno-stars forensic a99ab0a1). This PR closes the poll-mode gap so an external runtime workspace pulling chat-attached files via the queue accepts the same per-file ceiling as a push-mode tenant. Surfaces touched (mirror sites for the per-file cap): 1. workspace-server/migrations/20260519200000_pending_uploads_bump_size_cap.{up,down}.sql — NEW pair. DROP IF EXISTS the auto-named `pending_uploads_size_bytes_check` constraint and re-add at 104857600. DOWN restores 26214400 (operator must drain any 25–100 MB rows first; documented in down-migration header). 2. workspace-server/internal/pendinguploads/storage.go — MaxFileBytes 25→100 MB. Pre-DB guard so an oversize Put short-circuits before round-tripping Postgres. 3. workspace-server/internal/handlers/chat_files.go (comment only) — header comment on uploadPollMode updated from "Per-file cap: 25 MB" to "100 MB" with the cross-reference to mc#1588. 4. workspace/inbox_uploads.py (workspace-side puller for poll-mode) — MAX_FILE_BYTES 25→100 MB. DEFAULT_FETCH_TIMEOUT 60→240 s so a legitimate 100 MB transfer over a ~5 Mbps consumer link (~160 s wire time) doesn't hit the wrong-reason timeout failure the canvas side of mc#1588 fixed for forensic a99ab0a1. 5. workspace-server/internal/handlers/pending_uploads_integration_test.go — TestIntegration_PendingUploads_SizeCap_100MB pins both the pre-DB guard (Put returns ErrTooLarge at MaxFileBytes+1) AND the raw DB CHECK (direct INSERT at 100MB+1 returns SQLSTATE 23514). Skips in -short mode (allocates ~100 MB). TestIntegration_PendingUploads_SizeCap_DBConstraintName pins the constraint name + clause via pg_get_constraintdef so a future schema-rename can't silently regress to 25 MB through a no-op DROP IF EXISTS. 6. workspace-server/internal/handlers/chat_files_poll_test.go — TestPollUpload_PerFileCapPreStorage_413 and TestPollUpload_AtomicRollbackOnSecondFileTooLarge gain a structural-precondition skip: when MaxFileBytes >= chatUploadMaxBytes (which is the post-mc#1588 + this PR state for poll-mode: per-file == body == 100 MB), the body MaxBytesReader 400s before the per-file 413 branch is reachable. Atomicity is still pinned by the integration test against real Postgres. Re-enable when chatUploadMaxBytes is raised above the per-file cap (RFC for GET /uploads/limits follow-up will reshape this layering). SSOT note (per feedback_no_single_source_of_truth): After this PR the 100 MB constant lives in FIVE mirror sites — canvas TS (mc#1588) + workspace-server Go (chat_files.go via mc#1588 + this PR's storage.go) + workspace Python ingest (internal_chat_uploads.py via mc#1588) + workspace Python pull (inbox_uploads.py via this PR) + nginx harness mirror (mc#1588) + this DB CHECK. The proper fix is the GET /uploads/limits endpoint per CTO follow-up (RFC filed by a0d62036, referenced in mc#1588 description). Until that lands, every cap change needs a coupled bump across all five surfaces. Out of scope: - mc#1588's push-mode bumps (chatUploadMaxBytes, the Python ingest CHAT_UPLOAD_MAX_FILE_BYTES, canvas MAX_UPLOAD_BYTES, nginx client_max_body_size). Those land via mc#1588; this PR touches a disjoint file set. - The GET /uploads/limits SSOT endpoint. Separate RFC. References: - mc#1588 (push-mode 50→100 MB; canvas + workspace-server + workspace Python + nginx harness) - task #295 (internal tracker; CTO-authorized) - forensic a99ab0a1 (reno-stars 2026-05-19; the root failure motivating mc#1588 and this follow-up) - feedback_no_single_source_of_truth (the SSOT discipline this bump duplicates against, pending the /uploads/limits fix) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -572,8 +572,25 @@ func TestPollUpload_PerFileCapPreStorage_413(t *testing.T) {
|
||||
// Pin the early-reject branch (fh.Size > MaxFileBytes) BEFORE we
|
||||
// read the part into memory. Without this, an oversize file
|
||||
// would hit the storage layer's belt-and-suspenders check, which
|
||||
// works but burns ~25 MB of memory + DB round-trip first. Send
|
||||
// 25 MB + 1 byte → 413 with the file size in the response.
|
||||
// works but burns ~100 MB of memory + DB round-trip first.
|
||||
//
|
||||
// Structural precondition: pendinguploads.MaxFileBytes must be
|
||||
// strictly LESS than chatUploadMaxBytes (the body cap on
|
||||
// http.MaxBytesReader at chat_files.go:290) — otherwise a single
|
||||
// oversize part never reaches our handler; the body reader 400s
|
||||
// first. As of 2026-05-19 with the poll-mode bump to 100 MB and
|
||||
// the push-mode bump in mc#1588 also to 100 MB, per-file == body
|
||||
// for single-file uploads and this branch is only reachable in
|
||||
// the multi-file scenario (see TestPollUpload_AtomicRollbackOn
|
||||
// SecondFileTooLarge for that path). When chatUploadMaxBytes is
|
||||
// next bumped above MaxFileBytes (e.g., RFC for the SSOT
|
||||
// GET /uploads/limits endpoint reshapes the layering) this test
|
||||
// can run again.
|
||||
if pendinguploads.MaxFileBytes >= chatUploadMaxBytes {
|
||||
t.Skipf("per-file cap %d >= body cap %d; the body MaxBytesReader 400s before the per-file 413 branch is reachable. Re-enable when body cap > per-file cap.",
|
||||
pendinguploads.MaxFileBytes, chatUploadMaxBytes)
|
||||
}
|
||||
|
||||
mock := setupTestDB(t)
|
||||
setupTestRedis(t)
|
||||
|
||||
@@ -584,7 +601,7 @@ func TestPollUpload_PerFileCapPreStorage_413(t *testing.T) {
|
||||
h := NewChatFilesHandler(NewTemplatesHandler(t.TempDir(), nil, nil)).
|
||||
WithPendingUploads(store, nil)
|
||||
|
||||
// 25 MB + 1 byte. Single file, large enough to trip the early
|
||||
// MaxFileBytes + 1. Single file, large enough to trip the early
|
||||
// size check.
|
||||
oversize := make([]byte, pendinguploads.MaxFileBytes+1)
|
||||
body, ct := pollUploadFixture(t, map[string][]byte{"big.bin": oversize})
|
||||
@@ -655,6 +672,25 @@ func TestPollUpload_SanitizesFilenameInResponse(t *testing.T) {
|
||||
// real atomicity guarantee is the integration test in
|
||||
// pending_uploads_integration_test.go.
|
||||
func TestPollUpload_AtomicRollbackOnSecondFileTooLarge(t *testing.T) {
|
||||
// Same structural precondition as
|
||||
// TestPollUpload_PerFileCapPreStorage_413: pendinguploads.MaxFileBytes
|
||||
// < chatUploadMaxBytes. The two-file fixture below sends
|
||||
// MaxFileBytes+1 worth of bytes; when per-file == body cap, that
|
||||
// total trips the body reader first (400) before our handler can
|
||||
// reach the per-file 413 branch.
|
||||
//
|
||||
// Per-call atomicity is still pinned at the storage layer in
|
||||
// TestIntegration_PendingUploads_PutBatch_RollsBackOnAnyError
|
||||
// (workspace-server/internal/handlers/pending_uploads_integration_test.go)
|
||||
// — that test exercises the same all-or-nothing guarantee against
|
||||
// a real Postgres without depending on the body-cap arithmetic
|
||||
// here. Re-enable this handler-level test when body cap exceeds
|
||||
// per-file cap again.
|
||||
if pendinguploads.MaxFileBytes >= chatUploadMaxBytes {
|
||||
t.Skipf("per-file cap %d >= body cap %d; the body MaxBytesReader 400s before the per-file 413 branch is reachable. Storage-level atomicity covered by integration test. Re-enable when body cap > per-file cap.",
|
||||
pendinguploads.MaxFileBytes, chatUploadMaxBytes)
|
||||
}
|
||||
|
||||
mock := setupTestDB(t)
|
||||
setupTestRedis(t)
|
||||
|
||||
|
||||
@@ -669,3 +669,101 @@ func TestIntegration_PendingUploads_GetIgnoresExpiredAndAcked(t *testing.T) {
|
||||
t.Errorf("Get after expiry: got %v, want ErrNotFound", err)
|
||||
}
|
||||
}
|
||||
|
||||
// TestIntegration_PendingUploads_SizeCap_100MB pins the post-mc#1588
|
||||
// poll-mode cap to 100 MB at the DB CHECK level. The MaxFileBytes-1 row
|
||||
// must INSERT cleanly (no CHECK violation) and the +1 row must be
|
||||
// rejected pre-DB by Put with ErrTooLarge. We also verify the raw DB
|
||||
// CHECK by attempting an INSERT that bypasses Put's pre-DB guard — the
|
||||
// constraint itself must enforce 104857600.
|
||||
//
|
||||
// Regression target: migration 20260519200000_pending_uploads_bump_size_cap
|
||||
// — if a future migration regresses the cap (e.g., re-applies the
|
||||
// 25 MB ceiling) this test 413s.
|
||||
func TestIntegration_PendingUploads_SizeCap_100MB(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping 100MB cap integration test in -short mode (allocates ~100MB)")
|
||||
}
|
||||
conn := integrationDB_PendingUploads(t)
|
||||
store := pendinguploads.NewPostgres(conn)
|
||||
ctx := context.Background()
|
||||
|
||||
wsID := uuid.New()
|
||||
|
||||
// 50 MB row — previously would have hit the 25 MB CHECK; must now
|
||||
// succeed end-to-end (Put → INSERT → row visible via Get).
|
||||
fiftyMB := make([]byte, 50*1024*1024)
|
||||
for i := range fiftyMB {
|
||||
fiftyMB[i] = 0x42 // non-zero so bytea round-trip is observable
|
||||
}
|
||||
fid, err := store.Put(ctx, wsID, fiftyMB, "halfcap.bin", "application/octet-stream")
|
||||
if err != nil {
|
||||
t.Fatalf("Put 50MB: want nil err (cap is 100MB after mc#1588 follow-up), got %v", err)
|
||||
}
|
||||
rec, err := store.Get(ctx, fid)
|
||||
if err != nil {
|
||||
t.Fatalf("Get 50MB row: %v", err)
|
||||
}
|
||||
if rec.SizeBytes != int64(len(fiftyMB)) {
|
||||
t.Errorf("size_bytes mismatch: got %d, want %d", rec.SizeBytes, len(fiftyMB))
|
||||
}
|
||||
|
||||
// Exact-cap row — 100 MB exact must succeed (the CHECK is `<=`).
|
||||
atCap := make([]byte, pendinguploads.MaxFileBytes)
|
||||
if _, err := store.Put(ctx, wsID, atCap, "atcap.bin", "application/octet-stream"); err != nil {
|
||||
t.Errorf("Put MaxFileBytes (at-cap): want nil err, got %v", err)
|
||||
}
|
||||
|
||||
// Over-cap row — must be rejected by Put pre-DB.
|
||||
overCap := make([]byte, pendinguploads.MaxFileBytes+1)
|
||||
if _, err := store.Put(ctx, wsID, overCap, "overcap.bin", "application/octet-stream"); err != pendinguploads.ErrTooLarge {
|
||||
t.Errorf("Put MaxFileBytes+1: want ErrTooLarge, got %v", err)
|
||||
}
|
||||
|
||||
// Raw DB CHECK enforcement — bypass Put's pre-DB guard with a direct
|
||||
// INSERT of a 101-MB row. The CHECK must reject with a Postgres
|
||||
// integrity-violation (SQLSTATE 23514). This catches the case where a
|
||||
// future code path inserts straight into the table without going
|
||||
// through Put (e.g., a background importer).
|
||||
overByOne := pendinguploads.MaxFileBytes + 1
|
||||
_, dbErr := conn.ExecContext(ctx, `
|
||||
INSERT INTO pending_uploads (workspace_id, content, size_bytes, filename, mimetype)
|
||||
VALUES ($1, $2, $3, $4, $5)
|
||||
`, wsID, []byte("dummy"), overByOne, "raw-overcap.bin", "application/octet-stream")
|
||||
if dbErr == nil {
|
||||
t.Errorf("raw INSERT of %d bytes: want CHECK violation, got nil", overByOne)
|
||||
} else if !strings.Contains(dbErr.Error(), "pending_uploads_size_bytes_check") &&
|
||||
!strings.Contains(dbErr.Error(), "23514") {
|
||||
t.Errorf("raw INSERT: want size_bytes_check violation (SQLSTATE 23514), got: %v", dbErr)
|
||||
}
|
||||
}
|
||||
|
||||
// TestIntegration_PendingUploads_SizeCap_DBConstraintName pins the
|
||||
// expected CHECK constraint name. Migration 20260519200000 DROP IF
|
||||
// EXISTSes by the auto-generated name `pending_uploads_size_bytes_check`
|
||||
// and re-adds the constraint under the same name. If a future schema
|
||||
// edit renames the constraint, the next bump migration will silently
|
||||
// no-op the DROP and leave the old ceiling in place — this test catches
|
||||
// that drift.
|
||||
func TestIntegration_PendingUploads_SizeCap_DBConstraintName(t *testing.T) {
|
||||
conn := integrationDB_PendingUploads(t)
|
||||
ctx := context.Background()
|
||||
|
||||
var checkClause string
|
||||
err := conn.QueryRowContext(ctx, `
|
||||
SELECT pg_get_constraintdef(c.oid)
|
||||
FROM pg_constraint c
|
||||
JOIN pg_class t ON t.oid = c.conrelid
|
||||
WHERE t.relname = 'pending_uploads'
|
||||
AND c.conname = 'pending_uploads_size_bytes_check'
|
||||
`).Scan(&checkClause)
|
||||
if err != nil {
|
||||
t.Fatalf("lookup CHECK clause: %v", err)
|
||||
}
|
||||
if !strings.Contains(checkClause, "104857600") {
|
||||
t.Errorf("size_bytes CHECK clause = %q; want it to contain 104857600 (100 MB)", checkClause)
|
||||
}
|
||||
if strings.Contains(checkClause, "26214400") {
|
||||
t.Errorf("size_bytes CHECK clause = %q; must NOT contain 26214400 (the pre-bump 25 MB ceiling)", checkClause)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -41,10 +41,17 @@ import (
|
||||
)
|
||||
|
||||
// Per-file size cap. Mirrors workspace-side ingest_handler
|
||||
// (workspace/internal_chat_uploads.py:198). Pinned at the DB level via
|
||||
// the size_bytes CHECK constraint; this Go-side constant exists so the
|
||||
// Put implementation can reject before round-tripping to Postgres.
|
||||
const MaxFileBytes = 25 * 1024 * 1024
|
||||
// (workspace/internal_chat_uploads.py:CHAT_UPLOAD_MAX_FILE_BYTES) and the
|
||||
// push-mode chat upload cap (chat_files.go:chatUploadMaxBytes). Pinned at
|
||||
// the DB level via the size_bytes CHECK constraint (currently
|
||||
// 104857600 per migration 20260519200000_pending_uploads_bump_size_cap);
|
||||
// this Go-side constant exists so the Put implementation can reject
|
||||
// before round-tripping to Postgres.
|
||||
//
|
||||
// Kept consistent with push-mode (mc#1588) per CTO directive 2026-05-19.
|
||||
// SSOT follow-up: GET /uploads/limits will let every surface read the
|
||||
// live cap rather than each pinning its own copy.
|
||||
const MaxFileBytes = 100 * 1024 * 1024
|
||||
|
||||
// ErrNotFound is returned by Get / MarkFetched / Ack when the row is
|
||||
// absent. Callers turn this into HTTP 404. Treat acked + expired rows
|
||||
|
||||
@@ -0,0 +1,24 @@
|
||||
-- 20260519200000_pending_uploads_bump_size_cap.down.sql
|
||||
--
|
||||
-- Restores the pre-bump 25 MB (26214400) CHECK on
|
||||
-- pending_uploads.size_bytes.
|
||||
--
|
||||
-- DESTRUCTIVE on existing data: any row with size_bytes > 26214400
|
||||
-- will cause the ALTER TABLE … ADD CONSTRAINT below to fail (Postgres
|
||||
-- validates the new CHECK against every existing row before applying
|
||||
-- it). That is the correct behaviour — silently dropping such rows
|
||||
-- would lose evidence-of-receipt for files the workspace might still
|
||||
-- pull. Operator running this rollback MUST first decide what to do
|
||||
-- with any 25–100 MB rows (sweep them out via the normal expires_at
|
||||
-- path, or accept failed rollback + investigate).
|
||||
--
|
||||
-- In practice this DOWN is for migration-tool symmetry only; the
|
||||
-- pending_uploads table has a 24h hard TTL via expires_at, so waiting
|
||||
-- one cycle drains any oversized rows naturally.
|
||||
|
||||
ALTER TABLE pending_uploads
|
||||
DROP CONSTRAINT IF EXISTS pending_uploads_size_bytes_check;
|
||||
|
||||
ALTER TABLE pending_uploads
|
||||
ADD CONSTRAINT pending_uploads_size_bytes_check
|
||||
CHECK (size_bytes > 0 AND size_bytes <= 26214400);
|
||||
@@ -0,0 +1,56 @@
|
||||
-- 20260519200000_pending_uploads_bump_size_cap.up.sql
|
||||
--
|
||||
-- Bumps the pending_uploads.size_bytes CHECK from 25 MB (26214400) to
|
||||
-- 100 MB (104857600) so poll-mode (laptop-runtime) workspaces accept the
|
||||
-- same per-file cap as push-mode (SaaS EC2 tenants).
|
||||
--
|
||||
-- Why this is a separate PR from mc#1588:
|
||||
-- mc#1588 bumped push-mode 50→100 MB across canvas TS +
|
||||
-- workspace-server Go (chat_files.go cap) + workspace Python ingest
|
||||
-- (internal_chat_uploads.py) + nginx harness mirror. The poll-mode
|
||||
-- staging path is a different surface — the DB CHECK on
|
||||
-- pending_uploads + the Go staging constant
|
||||
-- (pendinguploads.MaxFileBytes) + the workspace-side puller
|
||||
-- (workspace/inbox_uploads.py MAX_FILE_BYTES). Bundling both in one
|
||||
-- PR would have crossed two RFCs' worth of review surface and
|
||||
-- slowed the urgent push-mode fix that landed for reno-stars.
|
||||
--
|
||||
-- Why 100 MB (not higher / lower):
|
||||
-- Aligns with mc#1588 (chat_files.go chatUploadMaxBytes,
|
||||
-- internal_chat_uploads.py CHAT_UPLOAD_MAX_FILE_BYTES). The CTO
|
||||
-- directive 2026-05-19 was a one-line "make it consistent" — no
|
||||
-- evidence of a load-bearing reason the laptop path needs to stay
|
||||
-- smaller (the original 25 MB comment in inbox_uploads.py cited
|
||||
-- "~5 Mbps × 40s" bandwidth math, but that's descriptive, not a
|
||||
-- hard cap; the fetch timeout is independently tunable).
|
||||
--
|
||||
-- bump to 100MB to match push-mode cap per mc#1588; CANVAS_MIRROR:
|
||||
-- canvas/src/components/tabs/chat/uploads.ts MAX_UPLOAD_BYTES
|
||||
--
|
||||
-- SSOT note: 100 MB is now duplicated FIVE times (canvas TS +
|
||||
-- workspace-server Go + workspace Python ingest + workspace Python
|
||||
-- pull + nginx harness + this DB CHECK). The proper fix is a
|
||||
-- GET /uploads/limits endpoint so each surface reads a single source
|
||||
-- — tracked separately per CTO follow-up (RFC referenced in
|
||||
-- mc#1588 description).
|
||||
--
|
||||
-- Constraint name: the original CREATE TABLE wrote the CHECK inline
|
||||
-- without an explicit name. Postgres auto-names it
|
||||
-- `pending_uploads_size_bytes_check`. We DROP IF EXISTS that name
|
||||
-- then re-add the constraint at the new ceiling. DROP IF EXISTS is
|
||||
-- idempotent — re-running this migration after a partial failure
|
||||
-- doesn't error.
|
||||
--
|
||||
-- Forward-only safety: ALTER TABLE … ADD CONSTRAINT … CHECK takes
|
||||
-- an ACCESS EXCLUSIVE lock briefly while it validates every existing
|
||||
-- row. At current row counts (<1k rows steady-state, 24h TTL) this
|
||||
-- is sub-second. If pending_uploads ever grows into the millions,
|
||||
-- switch to ADD CONSTRAINT … NOT VALID + VALIDATE CONSTRAINT in two
|
||||
-- steps to keep the lock short.
|
||||
|
||||
ALTER TABLE pending_uploads
|
||||
DROP CONSTRAINT IF EXISTS pending_uploads_size_bytes_check;
|
||||
|
||||
ALTER TABLE pending_uploads
|
||||
ADD CONSTRAINT pending_uploads_size_bytes_check
|
||||
CHECK (size_bytes > 0 AND size_bytes <= 104857600);
|
||||
+22
-13
@@ -57,26 +57,35 @@ logger = logging.getLogger(__name__)
|
||||
# doesn't ship the in-container HTTP server).
|
||||
CHAT_UPLOAD_DIR = "/workspace/.molecule/chat-uploads"
|
||||
|
||||
# Per-file safety net. The platform enforces 25 MB on the staging side,
|
||||
# but a buggy or hostile platform response shouldn't be able to fill the
|
||||
# workspace's disk — refuse to write more than this even if the response
|
||||
# claims a larger Content-Length.
|
||||
MAX_FILE_BYTES = 25 * 1024 * 1024
|
||||
# Per-file safety net. The platform enforces 100 MB on the staging side
|
||||
# (workspace-server migration 20260519200000_pending_uploads_bump_size_cap
|
||||
# + pendinguploads.MaxFileBytes — bumped from 25 MB per CTO directive
|
||||
# 2026-05-19 to match push-mode mc#1588), but a buggy or hostile
|
||||
# platform response shouldn't be able to fill the workspace's disk —
|
||||
# refuse to write more than this even if the response claims a larger
|
||||
# Content-Length.
|
||||
MAX_FILE_BYTES = 100 * 1024 * 1024
|
||||
|
||||
# Network deadline for the GET. Tuned for a 25 MB transfer over a
|
||||
# reasonable consumer link (~5 Mbps gives ~40s for the full payload),
|
||||
# plus headroom for TLS + platform auth. Aligned with inbox poller's
|
||||
# 10s default for /activity calls — both are user-perceived latency.
|
||||
DEFAULT_FETCH_TIMEOUT = 60.0
|
||||
# Network deadline for the GET. Tuned for a 100 MB transfer over a
|
||||
# reasonable consumer link (~5 Mbps gives ~160s for the full payload),
|
||||
# plus headroom for TLS + platform auth. Scaled up from the original
|
||||
# 60s (sized for 25 MB) when the per-file cap moved to 100 MB — a fixed
|
||||
# 60s would fire BEFORE a legitimate slow uplink finished streaming, the
|
||||
# same wrong-reason failure mc#1588 fixed on the canvas side (forensic
|
||||
# a99ab0a1 reno-stars). Aligned with platform httpClient.Timeout (1200s
|
||||
# in chat_files.go after mc#1588) — laptop pull side gets a smaller
|
||||
# value because it's downstream of a fully-staged row, not a live
|
||||
# multipart parse.
|
||||
DEFAULT_FETCH_TIMEOUT = 240.0
|
||||
|
||||
# Concurrency cap for ``BatchFetcher``. Four workers is enough headroom
|
||||
# for the realistic "user dragged 3-4 files into chat at once" case
|
||||
# while bounding the platform's per-workspace fan-out. The cap matters
|
||||
# because the platform's /content endpoint reads bytea from Postgres in
|
||||
# a single round-trip per request — N workers = N concurrent DB reads
|
||||
# of up to 25 MB each, so a higher cap could pressure platform memory
|
||||
# without much UX win (network bandwidth is the bottleneck once the
|
||||
# bytes are buffered).
|
||||
# of up to 100 MB each (post-mc#1588 cap), so a higher cap could pressure
|
||||
# platform memory without much UX win (network bandwidth is the
|
||||
# bottleneck once the bytes are buffered).
|
||||
DEFAULT_BATCH_FETCH_WORKERS = 4
|
||||
|
||||
# Upper bound on how long ``BatchFetcher.wait_all`` blocks the inbox
|
||||
|
||||
Reference in New Issue
Block a user