Commit Graph

5 Commits

Author SHA1 Message Date
Hongming Wang
e955597a98 feat(chat_files): rewrite Download as HTTP-forward (RFC #2312, PR-D)
Mirrors PR-C's Upload migration: replaces the docker-cp tar-stream
extraction with a streaming HTTP GET to the workspace's own
/internal/file/read endpoint. Closes the SaaS gap for downloads —
without this PR, GET /workspaces/:id/chat/download still returns 503
on Railway-hosted SaaS even after A+B+C+F land.

Stacks: PR-A #2313 → PR-B #2314 → PR-C #2315 → PR-F #2319 → this PR.

Why a single broad /internal/file/read instead of /internal/chat/download:

  Today's chat_files.go::Download already accepts paths under any of the
  four allowed roots {/configs, /workspace, /home, /plugins} — it's not
  strictly chat. Future PRs (template export, etc.) will reuse this
  endpoint via the same forward pattern; reusing avoids three near-
  identical handlers (one per domain) with duplicated path-safety logic.

Path safety is duplicated on platform + workspace sides — defence in
depth via two parallel checks, not "trust the workspace."

Changes:
  * workspace/internal_file_read.py — Starlette handler. Validates path
    (must be absolute, under allowed roots, no traversal, canonicalises
    cleanly). lstat (not stat) so a symlink at the path doesn't redirect
    the read. Streams via FileResponse (no buffering). Mirrors Go's
    contentDispositionAttachment for Content-Disposition header.
  * workspace/main.py — registers GET /internal/file/read alongside the
    POST /internal/chat/uploads/ingest from PR-B.
  * scripts/build_runtime_package.py — adds internal_file_read to
    TOP_LEVEL_MODULES so the publish-runtime cascade rewrites its
    imports correctly. Also includes the PR-B additions
    (internal_chat_uploads, platform_inbound_auth) since this branch
    was rooted before PR-B's drift-gate fix; merge-clean alphabetic
    additions.
  * workspace-server/internal/handlers/chat_files.go — Download
    rewritten as streaming HTTP GET forward. Resolves workspace URL +
    platform_inbound_secret (same shape as Upload), builds GET request
    with path query param, propagates response headers (Content-Type /
    Content-Length / Content-Disposition) + body. Drops archive/tar
    + mime imports (no longer needed). Drops Docker-exec branch entirely
    — Download is now uniform across self-hosted Docker and SaaS EC2.
  * workspace-server/internal/handlers/chat_files_test.go — replaces
    TestChatDownload_DockerUnavailable (stale post-rewrite) with 4
    new tests:
      - TestChatDownload_WorkspaceNotInDB → 404 on missing row
      - TestChatDownload_NoInboundSecret → 503 on NULL column
        (with RFC #2312 detail in body)
      - TestChatDownload_ForwardsToWorkspace_HappyPath → forward shape
        (auth header, GET method, /internal/file/read path) + headers
        propagated + body byte-for-byte
      - TestChatDownload_404FromWorkspacePropagated → 404 from
        workspace propagates (NOT remapped to 500)
    Existing TestChatDownload_InvalidPath path-safety tests preserved.
  * workspace/tests/test_internal_file_read.py — 21 tests covering
    _validate_path matrix (absolute, allowed roots, traversal, double-
    slash, exact-match-on-root), 401 on missing/wrong/no-secret-file
    bearer, 400 on missing path/outside-root/traversal, 404 on missing
    file, happy-path streaming with correct Content-Type +
    Content-Disposition, special-char escaping in Content-Disposition,
    symlink-redirect-rejection (lstat-not-stat protection).

Test results:
  * go test ./internal/handlers/ ./internal/wsauth/ — green
  * pytest workspace/tests/ — 1292 passed (was 1272 before PR-D)

Refs #2312 (parent RFC), #2308 (chat upload+download 503 incident).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:19:02 -07:00
Hongming Wang
f323def18f chore(build): include platform_tools in runtime wheel SUBPACKAGES
The PR-built wheel + import smoke gate refused the platform_tools
package because it's a new subdirectory under workspace/ that wasn't
in scripts/build_runtime_package.py:SUBPACKAGES. The drift gate (which
exists for exactly this reason) caught it cleanly:

  error: SUBPACKAGES drifted from workspace/ subdirectories:
    in workspace/ but NOT in SUBPACKAGES (will ship un-rewritten or
    be excluded): ['platform_tools']

Adding platform_tools to SUBPACKAGES wires the package into the
runtime wheel + applies the canonical
  from platform_tools.<x> -> from molecule_runtime.platform_tools.<x>
import-rewrite step that every other subpackage uses.

Verified locally: scripts/build_runtime_package.py succeeds, the
rewritten a2a_mcp_server.py reads
  from molecule_runtime.platform_tools.registry import TOOLS
which matches the package layout in the wheel.
2026-04-28 17:19:00 -07:00
Hongming Wang
6e732ab714 fix(build): ship lib/ subpackage + extend drift gate to SUBPACKAGES
Two compounding bugs that bit hermes (and any other workspace that
reaches main.py:142):

1. workspace/lib/ was in EXCLUDE_DIRS so the published wheel didn't
   contain the directory at all. main.py imports `from lib.pre_stop
   import read_snapshot` (and `build_snapshot`, `write_snapshot`) so
   every workspace startup that reaches the snapshot path crashed
   with `ModuleNotFoundError: No module named 'lib'`.

2. Even if lib/ had shipped, `lib` wasn't in SUBPACKAGES so the
   import-rewriter would have left the bare `from lib.pre_stop`
   unqualified — it would still fail because the package would only
   be reachable as `molecule_runtime.lib`.

Fix: move `lib` from EXCLUDE_DIRS to SUBPACKAGES (one entry each).

Drift gate extension: the existing gate I added in #2163 only
asserted TOP_LEVEL_MODULES against workspace/*.py. This change adds
the symmetric assertion for SUBPACKAGES against workspace/<dir>/
(filtered by EXCLUDE_DIRS + presence of __init__.py). Catches both:
- Subpackage added to workspace/ but missed in SUBPACKAGES
- Subpackage missing from workspace/ but lingering in SUBPACKAGES
- Subpackage wrongly in EXCLUDE_DIRS while also referenced by
  rewritten imports (the lib case)

Tested locally: build of 0.1.99 now ships lib/ and main.py contains
`from molecule_runtime.lib.pre_stop import ...` correctly rewritten.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:03:46 -07:00
Hongming Wang
c68dc1877f fix(release): drift-gate TOP_LEVEL_MODULES + smoke-import main in publish
Two compounding bugs surfaced when 0.1.16 hit production today:

1. scripts/build_runtime_package.py had a hand-curated TOP_LEVEL_MODULES
   set listing every workspace/*.py that should get its bare imports
   rewritten to `molecule_runtime.X`. The set silently went stale:
   - Missing: transcript_auth (added since #87 phase 1c), runtime_wedge,
     watcher → unrewritten imports shipped, every workspace startup
     died with ModuleNotFoundError.
   - Stale: claude_sdk_executor, cli_executor (both removed in #87),
     hermes_executor (never existed) → harmless but misleading.

2. publish-runtime.yml's wheel-smoke step asserted on stable invariants
   (BaseAdapter, AdapterConfig, a2a_client error sentinel) but never
   imported main. So even though main.py held the broken bare
   `from transcript_auth import ...`, the smoke check passed.

Fixes:

- Build script now derives the on-disk module set from workspace/*.py
  and asserts it matches TOP_LEVEL_MODULES exactly. Drift in either
  direction fails the build with a specific diff message instead of
  shipping a broken wheel. Closed-list typo guard preserved (we still
  edit the set explicitly when a module is added/removed) — the gate
  just makes drift impossible to ignore.

- TOP_LEVEL_MODULES updated to current reality: drop the 3 stale,
  add the 3 missing.

- publish-runtime.yml wheel-smoke now `import molecule_runtime.main`
  before the invariant asserts. main is the entry point and
  transitively imports every module — any bare-import bug surfaces
  as ModuleNotFoundError before PyPI accepts the upload.

Tested locally: `python3 scripts/build_runtime_package.py
--version 0.1.99 --out /tmp/build-test` succeeds, and
/tmp/build-test/molecule_runtime/main.py contains the rewritten
`from molecule_runtime.transcript_auth import ...`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 03:19:17 -07:00
Hongming Wang
0de67cd379 feat(platform/admin): /admin/workspace-images/refresh + Docker SDK + GHCR auth
The production-side end of the runtime CD chain. Operators (or the post-
publish CI workflow) hit this after a runtime release to pull the latest
workspace-template-* images from GHCR and recreate any running ws-* containers
so they adopt the new image. Without this, freshly-published runtime sat in
the registry but containers kept the old image until naturally cycled.

Implementation notes:
- Uses Docker SDK ImagePull rather than shelling out to docker CLI — the
  alpine platform container has no docker CLI installed.
- ghcrAuthHeader() reads GHCR_USER + GHCR_TOKEN env, builds the base64-
  encoded JSON payload Docker engine expects in PullOptions.RegistryAuth.
  Both empty → public/cached images only; both set → private GHCR pulls.
- Container matching uses ContainerInspect (NOT ContainerList) because
  ContainerList returns the resolved digest in .Image, not the human tag.
  Inspect surfaces .Config.Image which is what we need.
- Provisioner.DefaultImagePlatform() exported so admin handler picks the
  same Apple-Silicon-needs-amd64 platform as the provisioner — single
  source of truth for the multi-arch override.

Local-dev companion: scripts/refresh-workspace-images.sh runs on the
host and inherits the host's docker keychain auth — alternate path for
when GHCR_USER/TOKEN aren't set in the platform env.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-04-26 10:17:21 -07:00