hongming-codex-laptop 47d24be523
feat(uploads): add /uploads/limits SSOT endpoint + Go-side convergence (task #320)
Eliminates the upload-cap drift class that produced mc#1588 (push-mode
bumped to 100MB) and mc#1589 (poll-mode + DB CHECK catch-up one day
later). The same five surfaces had to be hand-synced for every cap
change; this PR collapses the Go-side mirrors into a single source
(internal/uploads) and exposes that source via a public GET
/uploads/limits endpoint so the out-of-process consumers (canvas TS,
workspace Python push + poll) can converge in Phase 2 follow-ups.

Source:
  * internal/uploads/limits.go — UploadLimits struct +
    DefaultUploadLimits() (per_file_bytes=100MB, per_request_bytes=100MB,
    max_attachments_per_message=10). JSON-tagged shape is the stable
    wire contract.
  * Pinned by internal/uploads/limits_test.go — every cap change must
    update this test as part of the same PR (forces a reviewer to see
    the cap move and audit the matching DB migration + nginx config).
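A minimal sketch of the pinning idea, assuming the struct and defaults shown in limits.go. The names pinnedWire and checkPinned are hypothetical and exist only for this illustration; the real limits_test.go may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UploadLimits mirrors the struct described in this PR.
type UploadLimits struct {
	PerFileBytes             int64 `json:"per_file_bytes"`
	PerRequestBytes          int64 `json:"per_request_bytes"`
	MaxAttachmentsPerMessage int   `json:"max_attachments_per_message"`
}

// DefaultUploadLimits mirrors the production defaults from this PR.
func DefaultUploadLimits() UploadLimits {
	return UploadLimits{
		PerFileBytes:             100 * 1024 * 1024,
		PerRequestBytes:          100 * 1024 * 1024,
		MaxAttachmentsPerMessage: 10,
	}
}

// pinnedWire (hypothetical) is the literal a pinning test compares against.
// Changing a default or a JSON tag forces an edit here in the same PR.
const pinnedWire = `{"per_file_bytes":104857600,"per_request_bytes":104857600,"max_attachments_per_message":10}`

// checkPinned returns an error when the defaults drift from the pinned shape.
func checkPinned() error {
	got, err := json.Marshal(DefaultUploadLimits())
	if err != nil {
		return err
	}
	if string(got) != pinnedWire {
		return fmt.Errorf("wire shape drifted: got %s, want %s", got, pinnedWire)
	}
	return nil
}

func main() {
	if err := checkPinned(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok: defaults match the pinned wire shape")
}
```

Comparing against one literal pins the values and the field names together, so a tag rename fails as loudly as a cap change.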

Endpoint:
  * GET /uploads/limits — public, no auth, mirrors /buildinfo
    rationale (platform constraint, not operational state; gating it
    would force pre-auth UX before learning the cap).
  * Cached in the binary via DefaultUploadLimits(); zero per-request
    DB round-trip.
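A sketch of what such an endpoint could look like, assuming a plain net/http handler. The actual router and handler names in workspace-server are not shown in this PR description; limitsHandler and probe are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// UploadLimits mirrors the struct described in this PR.
type UploadLimits struct {
	PerFileBytes             int64 `json:"per_file_bytes"`
	PerRequestBytes          int64 `json:"per_request_bytes"`
	MaxAttachmentsPerMessage int   `json:"max_attachments_per_message"`
}

// DefaultUploadLimits mirrors the production defaults from this PR.
func DefaultUploadLimits() UploadLimits {
	return UploadLimits{
		PerFileBytes:             100 * 1024 * 1024,
		PerRequestBytes:          100 * 1024 * 1024,
		MaxAttachmentsPerMessage: 10,
	}
}

// limitsHandler (hypothetical name) serves the compiled-in defaults.
// It checks no credentials and touches no database.
func limitsHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		w.Header().Set("Allow", http.MethodGet)
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(DefaultUploadLimits())
}

// probe calls the handler in-process, with no Authorization header,
// and decodes the response body.
func probe() (int, UploadLimits, error) {
	req := httptest.NewRequest(http.MethodGet, "/uploads/limits", nil)
	rec := httptest.NewRecorder()
	limitsHandler(rec, req)
	var got UploadLimits
	err := json.NewDecoder(rec.Body).Decode(&got)
	return rec.Code, got, err
}

func main() {
	code, got, err := probe()
	fmt.Println(code, got, err)
}
```

The probe helper has the same shape as the route test listed under Tests: an unauthenticated GET must return 200 and a payload equal to DefaultUploadLimits().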

Go consumer convergence:
  * pendinguploads.MaxFileBytes — now var derived from
    uploads.DefaultUploadLimits().PerFileBytes (int cast preserves the
    int-typed API surface so len() comparisons and make([]byte, N+1)
    sites keep working).
  * handlers.chatUploadMaxBytes — now var derived from
    uploads.DefaultUploadLimits().PerRequestBytes (int64 for
    http.MaxBytesReader).
  * chat_files.go line 631: int64(pendinguploads.MaxFileBytes)
    conversion for fh.Size (multipart.FileHeader.Size is int64).
  * chat_files_poll_test.go: matching int64 cast in the skip-guard
    that compares per-file vs body cap.
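The casts above can be sketched as follows. This collapses both packages into one file for illustration, so MaxFileBytes and chatUploadMaxBytes here are stand-ins for pendinguploads.MaxFileBytes and handlers.chatUploadMaxBytes, and the helper names are hypothetical.

```go
package main

import "fmt"

// UploadLimits mirrors the struct described in this PR.
type UploadLimits struct {
	PerFileBytes             int64 `json:"per_file_bytes"`
	PerRequestBytes          int64 `json:"per_request_bytes"`
	MaxAttachmentsPerMessage int   `json:"max_attachments_per_message"`
}

// DefaultUploadLimits mirrors the production defaults from this PR.
func DefaultUploadLimits() UploadLimits {
	return UploadLimits{
		PerFileBytes:             100 * 1024 * 1024,
		PerRequestBytes:          100 * 1024 * 1024,
		MaxAttachmentsPerMessage: 10,
	}
}

var (
	// MaxFileBytes stays int so len() comparisons and
	// make([]byte, N+1) call sites compile unchanged.
	MaxFileBytes = int(DefaultUploadLimits().PerFileBytes)

	// chatUploadMaxBytes stays int64, the type http.MaxBytesReader takes.
	chatUploadMaxBytes = DefaultUploadLimits().PerRequestBytes
)

// fileTooLarge mirrors the fh.Size check: multipart.FileHeader.Size is
// int64, so the int-typed cap is widened before comparing.
func fileTooLarge(size int64) bool {
	return size > int64(MaxFileBytes)
}

// bufferTooLarge shows the int-typed side: len() returns int, no cast.
func bufferTooLarge(data []byte) bool {
	return len(data) > MaxFileBytes
}

func main() {
	fmt.Println(MaxFileBytes, chatUploadMaxBytes)
	fmt.Println(fileTooLarge(104857600), fileTooLarge(104857601))
	fmt.Println(bufferTooLarge([]byte("small")))
}
```

Both derived values are initialized once at package load, so the consumers cannot drift from the source without an explicit edit.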

Tests:
  * internal/uploads/limits_test.go — pins values + JSON wire shape.
  * internal/router/uploads_limits_route_test.go — pins endpoint is
    public + 200 + payload matches DefaultUploadLimits + in-tree Go
    consumers agree with the SSOT.
  * Full workspace-server test suite green (go test ./... — all
    packages ok).

Boundaries (intentional non-changes):
  * Cap value stays at 100MB everywhere — no behavior change for any
    upload. The 25→100MB bump landed in mc#1588 + mc#1589; this PR is
    purely the SSOT refactor.
  * Migration's pending_uploads.size_bytes CHECK upper bound stays at
    104857600. DB constraints can't read Go vars at runtime, so this
    constant lives in lockstep with DefaultUploadLimits, and a SQL
    -- comment in the migration notes the dependency. Bumping the cap
    is still a two-step coordinated dance (Go default + matching
    migration) but step 1 is now one line.
  * Canvas TS (MAX_UPLOAD_BYTES) + workspace Python
    (CHAT_UPLOAD_MAX_BYTES / MAX_FILE_BYTES) stay as their own
    pinned-100MB constants for this PR; the Phase 2 follow-up
    migrates them to fetch /uploads/limits at startup with a cache.
    The doc comments in chat_files.go point at this PR so reviewers
    of the Phase 2 PR can trace the SSOT lineage.

Why the canvas + Python migrations are NOT in this PR:
  Each consumer needs a different cache+retry shape (canvas at
  app-init in a browser, workspace Python at module-load in the
  container, Python ingest similar but distinct). Bundling all of
  them into a single mega-PR would fail the reviewable-unit test
  and could block CI for hours on a single conflict. Source-first
  sequencing (this PR) followed by per-consumer follow-up PRs ships
  faster through 2-eye review, per the CTO's 2026-05-19 move-fast
  directive.
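For orientation only, here is one possible fetch-once-with-fallback shape, written in Go to match this repo. The real Phase 2 consumers are TypeScript and Python, and their cache+retry design is not decided by this PR. Every name below (limitsCache, fetcher, fromJSON, failing) is hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
)

// UploadLimits mirrors the wire shape of GET /uploads/limits.
type UploadLimits struct {
	PerFileBytes             int64 `json:"per_file_bytes"`
	PerRequestBytes          int64 `json:"per_request_bytes"`
	MaxAttachmentsPerMessage int   `json:"max_attachments_per_message"`
}

// fallback is the consumer's own pinned constant, used when the fetch fails.
var fallback = UploadLimits{
	PerFileBytes:             100 << 20,
	PerRequestBytes:          100 << 20,
	MaxAttachmentsPerMessage: 10,
}

// fetcher abstracts "GET /uploads/limits" so the sketch runs offline.
// A real consumer would wrap an HTTP client with a timeout.
type fetcher func() (io.ReadCloser, error)

// limitsCache fetches at most once and serves the cached value afterwards.
type limitsCache struct {
	once   sync.Once
	limits UploadLimits
	fetch  fetcher
}

// Get returns the fetched limits, or the fallback if the fetch fails,
// the body does not decode, or the byte caps are not positive.
func (c *limitsCache) Get() UploadLimits {
	c.once.Do(func() {
		c.limits = fallback
		body, err := c.fetch()
		if err != nil {
			return
		}
		defer body.Close()
		var got UploadLimits
		if err := json.NewDecoder(body).Decode(&got); err != nil {
			return
		}
		if got.PerFileBytes <= 0 || got.PerRequestBytes <= 0 {
			return
		}
		c.limits = got
	})
	return c.limits
}

// fromJSON builds a fetcher that serves a canned response body.
func fromJSON(s string) fetcher {
	return func() (io.ReadCloser, error) {
		return io.NopCloser(strings.NewReader(s)), nil
	}
}

// failing builds a fetcher that simulates an unreachable platform.
func failing() fetcher {
	return func() (io.ReadCloser, error) {
		return nil, errors.New("platform unreachable")
	}
}

func main() {
	ok := &limitsCache{fetch: fromJSON(`{"per_file_bytes":5,"per_request_bytes":7,"max_attachments_per_message":2}`)}
	down := &limitsCache{fetch: failing()}
	fmt.Println(ok.Get(), down.Get())
}
```

Falling back to the pinned constant keeps uploads working when the platform is unreachable at startup, at the cost of the consumer possibly running with a stale cap until it restarts.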
2026-05-20 03:15:10 -07:00


// Package uploads is the single source of truth for chat-upload sizing
// constraints across every layer of the platform.
//
// Before this package existed the same numbers were duplicated across at
// least five surfaces:
//
//  1. workspace-server Go const: pendinguploads.MaxFileBytes
//  2. workspace-server Go const: handlers.chatUploadMaxBytes
//  3. workspace Python module: workspace/inbox_uploads.MAX_FILE_BYTES
//  4. workspace Python module: workspace/internal_chat_uploads
//     .CHAT_UPLOAD_MAX_BYTES / .CHAT_UPLOAD_MAX_FILE_BYTES
//  5. canvas TypeScript const: canvas/.../chat/uploads.ts MAX_UPLOAD_BYTES
//
// plus a sixth (the DB CHECK on pending_uploads.size_bytes) and a seventh
// (the nginx test-harness client_max_body_size).
//
// Every cap change required a coordinated edit across all of them. mc#1588
// raised push-mode (1, 2, 4, 5, 7) from 50 MB to 100 MB on 2026-05-20;
// the matching poll-mode + DB CHECK bumps (3, 6, parts of pendinguploads)
// were missed and shipped a day later as mc#1589 (drift window: one day,
// production confusion: "why does push work but poll reject the same
// file?"). The same drift class is guaranteed to recur on every future cap
// change unless the constants converge.
//
// This package + the GET /uploads/limits endpoint are the convergence
// point. The Go consumers reference DefaultUploadLimits() directly; the
// out-of-process consumers (workspace Python, canvas TS, python ingest)
// can fetch the limits via the public endpoint at startup and cache them.
// The migration that defines the DB CHECK references the same numerical
// constant via a -- comment so a reviewer can see at a glance whether a
// new migration is in sync with the Go default.
//
// Task tracking: molecule-ai/internal #320 + the legacy SSOT-follow-up
// markers in pendinguploads/storage.go, handlers/chat_files.go, and
// canvas/src/components/tabs/chat/uploads.ts.
package uploads

// UploadLimits is the wire shape returned by GET /uploads/limits and the
// in-process type read by every Go consumer. The JSON tags are part of
// the stable public contract — renaming or removing a field is a
// breaking change for the canvas + Python consumers.
//
// New fields MAY be added without a major bump (consumers ignore unknown
// keys), but every existing field must keep its name and units forever;
// changing either requires rolling out a v2 endpoint.
type UploadLimits struct {
	// PerFileBytes is the hard cap on a single uploaded file. Enforced
	// in three places: the platform-side handler in chat_files.go
	// (push + poll paths), the workspace-side ingest in
	// internal_chat_uploads.py (push) + inbox_uploads.py (poll), and
	// the canvas-side pre-flight gate before any network I/O. The DB
	// CHECK on pending_uploads.size_bytes also enforces this value for
	// the poll-mode staging table.
	PerFileBytes int64 `json:"per_file_bytes"`

	// PerRequestBytes is the hard cap on the full multipart request
	// body. With one attachment + minimal multipart framing this is
	// effectively equal to PerFileBytes; with N attachments it bounds
	// the sum. Today we keep them equal at 100 MB: a multi-file batch
	// must collectively fit under the same ceiling as a single large
	// file. If we ever decouple them (e.g. raise per-request to allow
	// a 200 MB batch of 50 MB files) this field is where that lands.
	PerRequestBytes int64 `json:"per_request_bytes"`

	// MaxAttachmentsPerMessage caps the count of files in a single
	// /chat/uploads request. Defends against a pathological client that
	// streams 10 000 1-byte files (which would each spawn a row in
	// pending_uploads, exhaust file descriptors on the workspace side,
	// and slow chat_files.uploadPollMode's per-file loop to a crawl).
	// Currently advisory only: consumers are free to read it but no
	// platform handler enforces it as of task #320 Phase 1. Will be
	// enforced once the canvas + workspace consumers have rolled.
	MaxAttachmentsPerMessage int `json:"max_attachments_per_message"`
}

// DefaultUploadLimits returns the production defaults. This is THE
// source: every other constant in the codebase that mentions an upload
// cap must derive from this function, NOT from a duplicated literal.
//
// Why a function and not a package-level var: a var would be mutable at
// runtime and create the "test modified it and forgot to reset it" class
// of flake. Callers that need a per-test override should pass a custom
// UploadLimits value through the handler/registration site, not mutate
// a global. (No such override exists today; if one is needed in the
// future, prefer a WithLimits(UploadLimits) wiring option over a
// SetDefault function.)
//
// Values pinned at 100 MB per CTO directive 2026-05-19, in lockstep
// with mc#1588 + mc#1589. Bumping the cap is a coordinated multi-PR
// dance: raise this default, ship a DB migration that loosens the
// pending_uploads.size_bytes CHECK, raise the nginx
// client_max_body_size in tests/harness/cf-proxy/nginx.conf, and
// confirm both push-mode + poll-mode E2E. The whole point of this
// package is that step 1 is now ONE edit instead of 5.
func DefaultUploadLimits() UploadLimits {
	return UploadLimits{
		PerFileBytes:             100 * 1024 * 1024, // 100 MB (104857600 bytes, matches the DB CHECK)
		PerRequestBytes:          100 * 1024 * 1024, // 100 MB (104857600 bytes)
		MaxAttachmentsPerMessage: 10,
	}
}
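
The doc comment on DefaultUploadLimits prefers a WithLimits(UploadLimits) wiring option over a mutable global for per-test overrides, and notes that no such override exists today. A minimal sketch of what that option could look like, assuming a functional-options constructor; Handler, Option, NewHandler, and Accepts are hypothetical names, not part of this package.

```go
package main

import "fmt"

// UploadLimits mirrors the struct defined in this file.
type UploadLimits struct {
	PerFileBytes             int64 `json:"per_file_bytes"`
	PerRequestBytes          int64 `json:"per_request_bytes"`
	MaxAttachmentsPerMessage int   `json:"max_attachments_per_message"`
}

// DefaultUploadLimits mirrors the production defaults defined in this file.
func DefaultUploadLimits() UploadLimits {
	return UploadLimits{
		PerFileBytes:             100 * 1024 * 1024,
		PerRequestBytes:          100 * 1024 * 1024,
		MaxAttachmentsPerMessage: 10,
	}
}

// Handler (hypothetical) holds its own copy of the limits, so an
// override never touches shared state.
type Handler struct {
	limits UploadLimits
}

// Option configures a Handler at construction time.
type Option func(*Handler)

// WithLimits overrides the defaults for this Handler only.
func WithLimits(l UploadLimits) Option {
	return func(h *Handler) { h.limits = l }
}

// NewHandler starts from DefaultUploadLimits and applies any options.
func NewHandler(opts ...Option) *Handler {
	h := &Handler{limits: DefaultUploadLimits()}
	for _, opt := range opts {
		opt(h)
	}
	return h
}

// Accepts reports whether a file of the given size passes the per-file cap.
func (h *Handler) Accepts(size int64) bool {
	return size <= h.limits.PerFileBytes
}

func main() {
	small := DefaultUploadLimits()
	small.PerFileBytes = 1024

	prod := NewHandler()
	test := NewHandler(WithLimits(small))
	fmt.Println(prod.Accepts(2048), test.Accepts(2048))
}
```

Because DefaultUploadLimits returns a fresh value on each call, editing the copy in a test leaves every other caller's defaults untouched.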