forked from molecule-ai/molecule-core
Mirrors PR-C's Upload migration: replaces the docker-cp tar-stream extraction with a streaming HTTP GET to the workspace's own /internal/file/read endpoint. Closes the SaaS gap for downloads — without this PR, GET /workspaces/:id/chat/download still returns 503 on Railway-hosted SaaS even after A+B+C+F land. Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → PR-F #2319 → this PR. Why a single broad /internal/file/read instead of /internal/chat/download: Today's chat_files.go::Download already accepts paths under any of the four allowed roots {/configs, /workspace, /home, /plugins} — it's not strictly chat. Future PRs (template export, etc.) will reuse this endpoint via the same forward pattern; reusing avoids three near- identical handlers (one per domain) with duplicated path-safety logic. Path safety is duplicated on platform + workspace sides — defence in depth via two parallel checks, not "trust the workspace." Changes: * workspace/internal_file_read.py — Starlette handler. Validates path (must be absolute, under allowed roots, no traversal, canonicalises cleanly). lstat (not stat) so a symlink at the path doesn't redirect the read. Streams via FileResponse (no buffering). Mirrors Go's contentDispositionAttachment for Content-Disposition header. * workspace/main.py — registers GET /internal/file/read alongside the POST /internal/chat/uploads/ingest from PR-B. * scripts/build_runtime_package.py — adds internal_file_read to TOP_LEVEL_MODULES so the publish-runtime cascade rewrites its imports correctly. Also includes the PR-B additions (internal_chat_uploads, platform_inbound_auth) since this branch was rooted before PR-B's drift-gate fix; merge-clean alphabetic additions. * workspace-server/internal/handlers/chat_files.go — Download rewritten as streaming HTTP GET forward. Resolves workspace URL + platform_inbound_secret (same shape as Upload), builds GET request with path query param, propagates response headers (Content-Type / Content-Length / Content-Disposition) + body. Drops archive/tar + mime imports (no longer needed). Drops Docker-exec branch entirely — Download is now uniform across self-hosted Docker and SaaS EC2. * workspace-server/internal/handlers/chat_files_test.go — replaces TestChatDownload_DockerUnavailable (stale post-rewrite) with 4 new tests: - TestChatDownload_WorkspaceNotInDB → 404 on missing row - TestChatDownload_NoInboundSecret → 503 on NULL column (with RFC #2312 detail in body) - TestChatDownload_ForwardsToWorkspace_HappyPath → forward shape (auth header, GET method, /internal/file/read path) + headers propagated + body byte-for-byte - TestChatDownload_404FromWorkspacePropagated → 404 from workspace propagates (NOT remapped to 500) Existing TestChatDownload_InvalidPath path-safety tests preserved. * workspace/tests/test_internal_file_read.py — 21 tests covering _validate_path matrix (absolute, allowed roots, traversal, double- slash, exact-match-on-root), 401 on missing/wrong/no-secret-file bearer, 400 on missing path/outside-root/traversal, 404 on missing file, happy-path streaming with correct Content-Type + Content-Disposition, special-char escaping in Content-Disposition, symlink-redirect-rejection (lstat-not-stat protection). Test results: * go test ./internal/handlers/ ./internal/wsauth/ — green * pytest workspace/tests/ — 1292 passed (was 1272 before PR-D) Refs #2312 (parent RFC), #2308 (chat upload+download 503 incident). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| ops | ||
| build_runtime_package.py | ||
| build-images.sh | ||
| bundle-compile.sh | ||
| canary-smoke.sh | ||
| cleanup-rogue-workspaces.sh | ||
| clone-manifest.sh | ||
| dev-start.sh | ||
| import-agent.sh | ||
| lockdown-tenant-sg.sh | ||
| measure-coordinator-task-bounds-runner.sh | ||
| measure-coordinator-task-bounds.sh | ||
| nuke-and-rebuild.sh | ||
| post-rebuild-setup.sh | ||
| README.md | ||
| refresh-workspace-images.sh | ||
| rollback-latest.sh | ||
| test-a2a-cross-runtime.sh | ||
| test-all-adapters.sh | ||
| test-all.sh | ||
| test-cross-agent-chat.sh | ||
| test-nuke-and-rebuild.sh | ||
| test-team-e2e.sh | ||
scripts/
Operational and one-off scripts for molecule-core. Most are self-documenting — see the header comments in each file.
RFC #2251 coordinator task-bound harnesses
There are three related scripts; pick the right one:
| Script | Purpose | Targets |
|---|---|---|
measure-coordinator-task-bounds.sh |
Canonical v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via claude-code-default + langgraph templates, sends a synthesis-heavy A2A kickoff, observes elapsed time + activity trace. |
OSS-shape platform — localhost or any /workspaces-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
measure-coordinator-task-bounds-runner.sh |
Generalised runner for the same measurement contract but with arbitrary template + secret + model combinations (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via MODE=saas). |
measure-coordinator-task-bounds.sh (in molecule-controlplane) |
Production-shape variant that bootstraps a real staging tenant via POST /cp/admin/orgs, then runs the same measurement against <slug>.staging.moleculesai.app. |
Staging controlplane only — refuses to run against production. |
See reference_harness_pair_pattern (auto-memory) for when to use which
and the cross-repo design rationale.
Common safety pattern across all three
- Cleanup trap on EXIT/INT/TERM auto-deletes provisioned resources.
DRY_RUN=1prints plan + auth fingerprint, exits before any state mutation. Run this before pointing at staging or any shared infrastructure.- Non-target guard refuses arbitrary endpoints (the controlplane
variant is locked to
staging-api.moleculesai.app; the OSS variant requires explicit auth + tenant scoping for non-localhost PLATFORM). - Cleanup failures emit
cleanup_*_failedevents with remediation hints; no silenced curl. ADMIN_TOKEN expiring mid-run surfaces as a structured event rather than a silent leak.
Activity trace caveat
If activity_trace.raw == "<endpoint_unavailable>", the per-workspace
/activity endpoint isn't wired on the target build — the bound
measurement is INCONCLUSIVE on the platform-ceiling question. Either
wire the endpoint or replace with the equivalent Datadog query. Note
that /activity accepts a since_secs query parameter; see the
endpoint handler for the supported range.
Other scripts
cleanup-rogue-workspaces.sh— emergency teardown for leaked workspaces. Prompts for confirmation. Pair with the harnesses if a cleanup trap fails (seecleanup_*_failedevents).canary-smoke.sh— quick smoke test for canary releases.dev-start.sh— local-dev platform bring-up.
The rest are self-documenting in their header comments.