Workspace-attached repo is provisioned as a flat directory (no .git/) — git pull impossible #1688

Open
opened 2026-05-22 21:25:09 +00:00 by RenoStarsAI-production-client · 1 comment

Summary

When a workspace is provisioned with an attached repo, the platform places the repo contents at /home/agent/<repo-name>/ as a flat directory without .git/. As a result, git pull, git fetch, and git status all fail with fatal: not a git repository. The agent has read-only files but no way to update them from upstream — every cron tick is stuck with the snapshot the workspace was originally provisioned with.

Filed as a follow-up to #1684 and #1687. Same investigation thread (SSOT-from-git pattern), different layer.

Tenant + repro context

  • Tenant: reno-stars.moleculesai.app
  • Workspace (SEO Agent): 3fe84b89-eb65-42fc-ad1f-5c93582ca3e7
  • Attached repo: Reno-Stars/reno-star-business-intelligent
  • Provisioned: 2026-05-20 at workspace creation
  • Symptom observed: 2026-05-22 ~20:30 UTC during a real cron tick

What the agent sees

$ cd /home/agent/reno-star-business-intelligent && git pull origin main
fatal: not a git repository (or any of the parent directories): .git
$ ls -la /home/agent/reno-star-business-intelligent/.git
ls: cannot access '/home/agent/reno-star-business-intelligent/.git': No such file or directory

$ ls /home/agent/reno-star-business-intelligent/
agent-skills/  agent-policies/  config/  data/  prompts/  scripts/  src/
# … the contents are all there, but no .git

So the directory is a snapshot of the repo at provisioning time, not a clone. Anything committed upstream after provisioning never reaches the agent's runtime, because the directory has no upstream tracking.

Why this matters

Most agents on this platform that have a repo attached will at some point want to:

  1. Pull the latest version of their own prompts/policies (e.g. an SSOT-in-git pattern where every tick re-reads policy from git pull).
  2. Pick up new shared code/library files committed by other agents or by the workspace owner without needing a full re-provision.
  3. Commit work back upstream (which also requires a real working tree, not a snapshot).

#1684 was about cron checkpoint discipline; #1687 was about the secret-name plumbing for GitHub auth. This issue is the third leg of the SSOT setup. Even with both #1684 and #1687 fixed, the agent still can't pull from upstream because its directory isn't a git repo to begin with.

Workaround we deployed

We added a parallel SSOT clone at /home/agent/_ssot/<repo-name>/. Every cron tick now:

mkdir -p /home/agent/_ssot
if [ ! -d /home/agent/_ssot/reno-star-business-intelligent/.git ]; then
  GH_TOKEN=$GH_PAT gh repo clone Reno-Stars/reno-star-business-intelligent \
    /home/agent/_ssot/reno-star-business-intelligent -- --depth 1 --quiet
else
  cd /home/agent/_ssot/reno-star-business-intelligent && GH_TOKEN=$GH_PAT git pull origin main --quiet
fi

Then we read SSOT policy/skill files from /home/agent/_ssot/<repo>/ (read-only) while writing state to /home/agent/<repo>/data/ (working state).

This works but it's ugly:

  • Every cron prompt has to include the clone-or-pull boilerplate (we have 11 schedules — same boilerplate copy-pasted 11 times).
  • Disk is doubled (snapshot + clone both exist).
  • Two directories with the same name confuse the agent: it had to be coached over A2A on which directory is which.
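
The clone-or-pull boilerplate could be factored into one helper that every schedule calls, rather than pasted into each cron prompt. The sketch below is illustrative only: `sync_ssot` is a hypothetical name, it uses plain `git` instead of `gh repo clone`, and it assumes credentials for the remote are already configured in the runtime.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the platform): clone on first use,
# fast-forward on later ticks.
# Usage: sync_ssot <clone-url> <dest-dir> [branch]
sync_ssot() {
  local url="$1" dest="$2" branch="${3:-main}"
  mkdir -p "$(dirname "$dest")" || return 1
  if [ ! -d "$dest/.git" ]; then
    # First tick: shallow clone of the tracked branch only.
    git clone --quiet --depth 1 --branch "$branch" "$url" "$dest"
  else
    # Later ticks: update in place; --ff-only refuses to create merge commits.
    git -C "$dest" pull --quiet --ff-only origin "$branch"
  fi
}
```

Each cron prompt would then need a single line such as `sync_ssot "$REPO_URL" /home/agent/_ssot/reno-star-business-intelligent`, where `REPO_URL` is a placeholder for the clone URL. This removes the copy-paste but not the doubled disk or the two same-named directories.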

Proposed fix

A. Provision repo as a real git clone instead of a tarball/snapshot.

When the platform attaches a repo to a workspace, do:

cd /home/agent
git clone https://<token>@github.com/<org>/<repo>.git

instead of the current mechanism, which appears to extract a tarball without .git/. This gives the agent:

  • A real .git/ directory
  • An origin remote pointing back at upstream
  • The ability to git pull straight out of the box
  • The ability to git commit && git push if write access is granted

B. (Less ideal alternative) Make .git/ part of the snapshot.

If there's a reason the platform avoids real clones (security, storage, IP isolation), at minimum include .git/ in the snapshot tarball so git fetch && git reset --hard origin/main works to update the working tree.

A is clearly better — it's the conventional approach and matches user expectation when they "attach a repo" to a workspace.
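
For option B, the per-tick update could look like the sketch below. It assumes the snapshot ships with a `.git/` that has an `origin` remote and working credentials; `refresh_snapshot` is a hypothetical name. Note that `git reset --hard` discards local edits to tracked files, which matters if the agent writes state under tracked paths in the same directory. Untracked files are left in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper for option B: force the working tree to match upstream.
# Usage: refresh_snapshot <repo-dir> [branch]
refresh_snapshot() {
  local dir="$1" branch="${2:-main}"
  # Update the remote-tracking refs for origin.
  git -C "$dir" fetch --quiet origin || return 1
  # Overwrite tracked files with the upstream branch tip.
  # Local changes to tracked files are lost; untracked files are kept.
  git -C "$dir" reset --quiet --hard "origin/$branch"
}
```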

Strong correlation with #1687

These two issues are basically the same problem viewed from different angles:

  • #1687: "I have the secret but the env-var name is wrong."
  • This issue: "Even if env-var names were right, there's no .git/ to apply the auth to."

Both need to be fixed before the SSOT-in-git pattern works on the platform. Both are mechanical, single-day fixes.

Reproduction

  1. Create a workspace on any tenant. Attach a private GitHub repo to it.
  2. Wait for provisioning to complete.
  3. Exec into the runtime and run cd /home/agent/<repo-name> && git status.
  4. Observe: fatal: not a git repository.

— Hongming Wang (airenostars@gmail.com)
— Tenant: reno-stars.moleculesai.app
— Workspace owner: d76977b1-f17e-4a4c-9f74-bf6315238620

Member

RCA — root cause

The platform has two different “repo as source” behaviors: external org-template imports preserve a real git clone in the server-side cache, but workspace-visible file surfaces and plugin/template copy paths deliberately hide or strip .git. That is correct for plugin/template safety, but wrong for a user-attached repository whose product contract implies an updateable working tree.

Evidence

  • workspace-server/internal/handlers/org_external.go:309 — external refs are fetched by git ls-remote and git clone --depth=1 into a content-addressed cache.
  • workspace-server/internal/handlers/org_external.go:376 — the cache is considered complete only when .git exists, so the server-side cache is a real clone.
  • workspace-server/internal/plugins/github.go:67 — plugin GitHub resolver explicitly says it copies contents minus .git into destination.
  • workspace-server/internal/plugins/github.go:135 — plugin resolver removes .git before copying.
  • workspace-server/internal/handlers/templates.go:341 and :405 — workspace file listing prunes .git, reinforcing that visible workspace files are treated as content snapshots rather than working trees.

Suggested fix

The responsible surface is molecule-core workspace provisioning for attached repos, not the plugin resolver. Add an explicit attached-repo mode that creates a real clone at the workspace path with origin preserved and auth configured via the workspace’s Git token. Keep plugin/template copy behavior stripping .git; only user-attached repos should be mutable/updateable git working trees. Add a provisioning regression test that asserts cd /home/agent/<repo> && git status succeeds for attached repos.
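
The assertion at the heart of such a regression test might look like the following sketch. `check_attached_repo` is a hypothetical name, and how the test harness provisions a workspace and execs into it is not shown here because that depends on molecule-core internals.

```shell
#!/usr/bin/env bash
# Hypothetical post-provisioning check: succeeds only if <dir> is a usable
# git working tree with an origin remote configured.
# Usage: check_attached_repo <dir>
check_attached_repo() {
  local dir="$1"
  # A flat snapshot without .git fails here with "not a git repository".
  if ! git -C "$dir" status --porcelain >/dev/null 2>&1; then
    echo "FAIL: git status does not succeed in $dir" >&2
    return 1
  fi
  # A tree created by git init, or a copy with remotes removed, fails here.
  if ! git -C "$dir" remote get-url origin >/dev/null 2>&1; then
    echo "FAIL: no origin remote configured in $dir" >&2
    return 1
  fi
  echo "OK: $dir is a git working tree with origin set"
}
```

This only checks local repository state. A fuller test would also run `git fetch` against the real remote to confirm that the configured auth works.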

Confidence

Medium — local code proves .git is intentionally stripped/hidden on adjacent copy/listing paths; confidence would become high after reading the exact attached-repo provisioning branch.

Reference: molecule-ai/molecule-core#1688