Self-review of molecule-core PR #105 + #106 (the !external resolver
chain) surfaced 3 real correctness/security gaps and 2 readability
nits. Fixes all four in one PR since they're the same file's hardening.
(1) TOKEN LEAKAGE — fixed
Before: gitFetcher built clone URLs with auth in userinfo
(https://oauth2:TOKEN@host/repo.git). Two leak paths:
a. Token persisted in cloned repo's .git/config
b. Token could appear in clone error output captured via
cmd.CombinedOutput()
After: clone URL has no userinfo (https://host/repo.git). Auth is
layered on via -c http.extraHeader=Authorization: token ...
which sends the header per-request without persisting. Plus a
redactToken() pass over any error string before it surfaces in
fmt.Errorf, as belt-and-braces.
Tradeoff: token now visible in 'ps aux' for the duration of the
git child process (same as before via env var), but no longer
in any persistent state.
(2) CACHE-VALIDITY FOOTGUN — fixed
Before: cache-hit was 'cacheDir/.git exists'. A clone interrupted
after .git was created but before content finished writing would
leave a partially-written cache that subsequent imports treated
as hit, returning stale/incomplete content forever (no self-heal).
After: cache-hit also requires a .complete marker file written
only AFTER successful clone+rename. Partially-written cache is
treated as cache-miss and re-fetched cleanly (after RemoveAll
on the partial dir to avoid blocking the new clone's mkdir).
(3) REF '..' DENY — fixed
Before: safeRefPattern '^[a-zA-Z0-9_./-]+$' allowed '..' as a
substring. Git itself rejects most refs containing '..', but
defense-in-depth says don't depend on the downstream tool's
validation when sanitizing input at the boundary.
After: explicit strings.Contains(ref.Ref, '..') check.
(4) NAMING CLEANUP — fixed
Before: rewriteFilesDirAndIncludes() — name claims to rewrite
!include scalars but doesn't (we removed that during PR-A
development; double-prefix bug). Misleading for readers.
After: rewriteFilesDir(). Docstring updated to explicitly explain
why !include paths are NOT rewritten (relative to subDir, naturally
inside cache).
Also: removed unused buildAuthedURL() (replaced by
buildExternalCloneURL + authConfigArgs split), removed unused
shortHash() helper (replaced by os.MkdirTemp), removed unused
crypto/sha1 + encoding/hex + fmt imports, removed stray
'_ = fmt.Sprint' line in integration test.
NEW TESTS
- TestGitFetcher_RejectsRefWithDoubleDot (defense-in-depth on ref input)
- TestGitFetcher_CacheValidatedByCompleteMarker (partial cache → re-fetch)
VERIFIED LOCALLY 2026-05-08
Full ./internal/handlers/ suite: ok (7.8s, 14 external-resolver tests
+ all existing tests). Two new tests cover the two new behaviors.
Refs:
internal#77 — extraction RFC
molecule-core#105 (resolver), #106 (tests) — original implementation
Hongming code-review-and-quality skill invocation 2026-05-08 + 'fix all'
PR-B (local bare-git integration, task #233):
workspace-server/internal/handlers/org_external_integration_test.go
Three tests using git's GIT_CONFIG_COUNT/KEY/VALUE env-var-injected
insteadOf URL rewrite — process-scoped, no ~/.gitconfig pollution:
- TestGitFetcher_RealClone_LocalRedirect: full resolver chain end-to-
end with REAL git clone against a local bare-repo, asserts cache
population + content materialization + path rewrite + cache-hit on
second invocation.
- TestGitFetcher_RealClone_BadRefFails: nonexistent ref surfaces
git's error cleanly through the ls-remote step.
- TestGitFetcher_DirectFetch_CacheHit: gitFetcher.Fetch direct
invocation (no resolver wrapping); verifies cache-hit returns
same dir + same SHA, no clobber.
Production code untouched — insteadOf rewrite makes the production
gitFetcher think it's cloning from Gitea, but git rewrites at clone
time to file://<barePath>. Tests the real shell-out + parsing.
PR-C (live Gitea e2e, task #234):
workspace-server/internal/handlers/local_e2e_dev_dept_test.go
TestLocalE2E_ExternalDevDepartment — minimal parent template that
uses !external against the LIVE molecule-ai/molecule-dev-department
repo. No symlink, no /tmp/local-e2e-deploy fixture. Composition
resolves over network at import time.
Asserts:
- 28+ dev-tree workspaces resolve through the fetched cache
(matches the count from TestLocalE2E_DevDepartmentExtraction)
- Q1 placement: 'Documentation Specialist' present (under app-lead)
- Q2 placement: 'Triage Operator' present (under dev-lead)
- Every workspace's files_dir is cache-prefixed (proves rewrite ran)
- Every workspace's resolveInsideRoot+Stat succeeds
(would fail provisioning if not)
Skipped if Gitea unreachable (TCP probe to git.moleculesai.app:443)
or git binary absent — won't false-fail offline runners.
VERIFIED LOCALLY 2026-05-08:
--- PASS: TestGitFetcher_RealClone_LocalRedirect (0.26s)
--- PASS: TestGitFetcher_RealClone_BadRefFails (0.15s)
--- PASS: TestGitFetcher_DirectFetch_CacheHit (0.23s)
--- PASS: TestLocalE2E_ExternalDevDepartment (0.55s)
workspaces resolved through !external: 28
Full ./internal/handlers/ test suite: ok (no regressions)
Together with PR-A's unit tests (#105), the !external resolver is now
covered at three layers:
- unit (fakeFetcher injection): allowlist, validation, path rewrite
- integration (real git, local bare-repo): clone, cache, ls-remote
- e2e (real git, live Gitea, live dev-department): full chain
Refs:
internal#77 — extraction RFC (Phase 3a phasing in comment 1995)
task #233 (PR-B), task #234 (PR-C)
Hongming GO 2026-05-08 ('do PR-B/C/D')
Adds gitops-style cross-repo subtree composition to the platform's
org-template importer. Replaces (eventually) the operator-side
filesystem symlink approach shipped in PR #5.
DESIGN
See internal#77 comment 1995 for the full design doc + decision points
agreed with Hongming 2026-05-08.
Schema: a `!external`-tagged mapping anywhere a workspace entry is
allowed (workspaces:, roots:, children:):
- !external
repo: molecule-ai/molecule-dev-department
ref: main
path: dev-lead/workspace.yaml
url: git.moleculesai.app # optional; default = MOLECULE_EXTERNAL_GITEA_URL or git.moleculesai.app
At resolve time the platform fetches the repo at ref into a content-
addressable cache under <orgBaseDir>/.external-cache/<repo>/<sha>/,
loads <cacheDir>/<path>, recursively resolves nested !include /
!external in the loaded subtree, then rewrites every files_dir scalar
in the fully-resolved subtree to be cache-prefixed. Downstream
pipeline (resolveInsideRoot, plugin merge, CopyTemplateToContainer)
sees ordinary in-tree paths.
IMPLEMENTATION
- org_external.go: ExternalRef type, fetcher interface (gitFetcher
production + injectable for tests), resolveExternalMapping resolver,
rewriteFilesDirAndIncludes path-rewrite walker, allowlistedHostPath
+ safeRefPattern + safeRepoCacheDir validation helpers.
- org_include.go: 4-line hook in expandNode dispatching MappingNode
with Tag=="!external" to resolveExternalMapping.
- org_external_test.go: 8 unit tests with fakeFetcher injection
(no network):
* happy path (top + nested workspace files_dir cache-prefixed)
* allowlist rejection (github.com/foo/bar)
* path-traversal rejection (../../etc/passwd)
* malformed ref rejection ("main; rm -rf /")
* missing required fields (repo / ref / path)
* rewriteFilesDirAndIncludes basic + idempotent
* allowlistedHostPath env-override + glob
Path rewrite ONLY rewrites files_dir scalars. !include scalars are
NOT rewritten — they resolve relative to their containing file's
directory, which post-fetch is naturally inside the cache, so
relative !includes Just Work without modification.
ALLOWLIST + AUTH
- Default allowlist: git.moleculesai.app/molecule-ai/.
- Override: MOLECULE_EXTERNAL_REPO_ALLOWLIST (comma-separated
prefixes; trailing /* or / supported).
- Auth: MOLECULE_GITEA_TOKEN env var injected into clone URL.
Optional — falls back to unauthenticated for public repos.
- Reject: malformed refs, path-traversal, non-allowlisted hosts.
CACHE
- Location: <orgBaseDir>/.external-cache/<safe-repo>/<sha>/.
Operators add to .gitignore.
- Content-addressable: same (repo, sha) reuses cache, no overwrite.
- Atomic clone via tmp-then-rename.
- Concurrency: race-tolerant — last-writer-wins on same SHA.
GC out of scope for v1 (filed as parked follow-up).
SECURITY (per SOP Phase 2)
Untrusted yaml input — all validated:
repo: allowlist (default molecule-ai/* on Gitea host)
ref: ^[a-zA-Z0-9_./-]+$ regex (rejects shell injection)
path: relative-and-down-only (rejects ../escape)
Auth: read-only token scoped to allowed orgs.
Recursion: maxExternalDepth=4 (vs maxIncludeDepth=16) to limit
network fan-out cost.
Cache poisoning: per-(repo, sha) content-addressable; can't poison
across SHAs.
Trust boundary: cloned content treated identically to a sibling-
cloned subtree (same model as current symlink approach).
VERSIONING / BACKWARDS COMPAT
Pure additive. Existing !include and inline workspaces unchanged.
Existing dev-lead symlink (parent template PR #5) keeps working.
Migration of parent template to !external is a separate PR-D.
No DB schema change. No public API change.
VERIFIED LOCALLY
go test ./internal/handlers/ → ok (5.2s, all 8 new tests + existing)
Stub fetcher injection lets unit tests cover the resolver +
path-rewrite logic without network. PR-B (follow-up) adds an
integration test against a local bare-git repo. PR-C adds the
real-Gitea e2e test against the live dev-department repo.
Refs:
internal#77 — extraction RFC (comment 1995 = Phase 1+2 design)
task #222 — this PR is Phase 3a (PR-A in the design's phasing)
Hongming GO 2026-05-08 ('go' on 4 decision points + design)
Two changes that fall out of one root cause discovered while preparing
the local platform spin-up for the dev-department extraction (internal#77):
PROBLEM
CopyTemplateToContainer's filepath.Walk is called with templatePath
set to the workspace's resolved files_dir. With the cross-repo
symlink composition shipped in PR #5 (parent template's
dev-lead → ../molecule-dev-department/dev-lead/), the Dev Lead
workspace's files_dir is literally 'dev-lead' — i.e. the symlink
itself, not a path THROUGH the symlink.
filepath.Walk does not descend into a symlink leaf — it Lstats the
root, sees a symlink (mode bit set, not a directory), emits exactly
one entry, and returns. Result: the workspace's /configs/ tar would
ship empty. Other 38 workspaces are fine because their files_dir
paths just TRAVERSE the symlink (path resolution handles intermediate
symlinks via Lstat traversal); only the leaf-is-symlink case breaks.
FIX
workspace-server/internal/provisioner/provisioner.go:
Call filepath.EvalSymlinks on templatePath before filepath.Walk.
Resolves the leaf-symlink case for ALL templates, not just dev-dept.
Security: templatePath has already passed resolveInsideRoot's
path-string check at the call site; the trust boundary is the
operator-side /org-templates/ filesystem layout, not this
resolution step.
TEST
workspace-server/internal/handlers/local_e2e_dev_dept_test.go:
New TestLocalE2E_FilesDirConsumption — stage-2 of the local e2e.
For every workspace in the resolved OrgTemplate, asserts:
1. resolveInsideRoot(orgBaseDir, ws.FilesDir) succeeds.
2. os.Stat on the result returns a directory.
3. filepath.Walk after EvalSymlinks (mirroring the platform fix)
emits at least one file.
4. At least one workspace marker exists (workspace.yaml,
system-prompt.md, or initial-prompt.md).
Exercises the SECOND half of POST /org/import that
TestLocalE2E_DevDepartmentExtraction (PR #103) didn't cover.
VERIFIED LOCALLY (2026-05-08, against post-extraction Gitea state):
--- PASS: TestLocalE2E_FilesDirConsumption (0.05s)
checked 39 workspaces with files_dir
All 39 walk paths emit non-empty file sets with valid workspace markers.
REGRESSION GUARD
Without the EvalSymlinks fix, this test fails on Dev Lead with:
files_dir 'dev-lead' at '/.../molecule-dev/dev-lead' is empty —
CopyTemplateToContainer would produce empty /configs/
Refs:
internal#77 — extraction RFC
molecule-core#102 (resolver symlink contract test)
molecule-core#103 (stage-1 e2e: include resolution)
Hongming GO 2026-05-08 ('go' on the 3 pre-spin-up optimizations)
Phase 4 (local-only) of internal#77 (dev-department extraction).
Adds TestLocalE2E_DevDepartmentExtraction that exercises the FULL platform
import path against the real molecule-ai-org-template-molecule-dev (post-slim)
and molecule-ai/molecule-dev-department (post-atomize) repos cloned as siblings
under /tmp/local-e2e-deploy/.
What it proves end-to-end:
- The dev-lead symlink at parent's template root is followed by
resolveYAMLIncludes (filepath.Abs/Rel-style security check passes,
os.ReadFile follows the link).
- Recursive !include chain through the symlinked subtree resolves:
parent's org.yaml → !include dev-lead/workspace.yaml (symlinked)
→ !include ./core-lead/workspace.yaml → !include ./core-be/workspace.yaml
(atomized children: paths, no '..').
- 39 workspaces enumerate after resolution: 5 PM-tree + 6 Marketing-tree
+ 28 dev-tree (Dev Lead + 5 sub-team leads + 18 leaf workspaces +
3 floaters + 1 triage-operator).
- Q1+Q2 placements verified by sentinel name check: 'Documentation
Specialist' is reachable (under app-lead via app-docs sub-team),
'Triage Operator' is reachable (direct child of Dev Lead).
Test skips with t.Skipf if the local-e2e fixture isn't present on the
host — won't block CI on hosts that haven't set it up. To set up locally:
TESTROOT=/tmp/local-e2e-deploy
mkdir -p $TESTROOT && cd $TESTROOT
git clone https://git.moleculesai.app/molecule-ai/molecule-ai-org-template-molecule-dev.git molecule-dev
git clone https://git.moleculesai.app/molecule-ai/molecule-dev-department.git
cd /Users/<you>/molecule-core/workspace-server
go test -v -run TestLocalE2E_DevDepartmentExtraction ./internal/handlers/
Verified locally 2026-05-08:
--- PASS: TestLocalE2E_DevDepartmentExtraction (0.01s)
total workspaces (recursive): 39
Refs:
internal#77 — extraction RFC
molecule-core PR #102 — symlink-resolution contract test
molecule-ai/molecule-dev-department PRs #1, #2, #3 (scaffold + extract + atomize)
molecule-ai/molecule-ai-org-template-molecule-dev PR #5 (parent slim + symlink wire)
Hongming GO 2026-05-08 ('lets not go for staging right now, we do local test first')
SOP Phase 4 (local) — task #226
Two new tests in workspace-server/internal/handlers/org_include_test.go:
- TestResolveYAMLIncludes_FollowsDirectorySymlink: parent template's
org.yaml `!include`s into a sibling-repo subtree via a relative
directory symlink. The resolver's filepath.Abs/Rel security check
operates on path strings (passes), and os.ReadFile follows the
symlink at OS layer (file content delivered). Recursive nested
`!include`s within the symlinked subtree resolve correctly because
filepath.Dir(absTarget) keeps the literal symlink path as currentDir.
- TestResolveYAMLIncludes_RejectsSymlinkEscapingRoot: companion test
pinning current behavior where a symlink target outside the parent
root is followed (resolveInsideRoot doesn't EvalSymlinks). Asserted
as 'should resolve' so future hardening (if filepath.EvalSymlinks
is added) flips the test red and forces a coordinated update to the
dev-department subtree-composition pattern.
Why now: internal#77 RFC (dev-department extraction) selects symlink-
based composition over a future platform-level external: ref. These
tests pin the contract before the operator-side symlink convention
gets shipped, so a refactor or hardening of the resolver can't
silently break the production org-import path.
No production code changes. Pure additive test coverage.
Refs: internal#77 (Phase 3b verification — task #223)
Class B Hongming-owned CICD red sweep, e2e-api leg. Same substrate
hazard as PR #98 (handlers-postgres-integration) — Gitea act_runner
configures `container.network: host` operator-wide, so:
* Two concurrent e2e-api runs both attempted to bind `-p 15432:5432`
and `-p 16379:6379` on the operator host. Verified in run a7/2727
on 2026-05-07: `docker: Error response from daemon: Conflict. The
container name "/molecule-ci-redis" is already in use by container
af10f438...` — exit 125, job fails before any test runs.
* Hardcoded container names `molecule-ci-postgres` / `-redis` plus
the leading `docker rm -f` step meant a second job's startup also
KILLED the first job's still-running services.
Fix shape (mirrors PR #98 bridge-net pattern, adapted because the
platform-server is a Go binary on the host, not a containerised step):
1. Per-run unique container names: `pg-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`,
`redis-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`. Unique even across reruns
of the same run_id.
2. Ephemeral host port per run via `-p 0:5432` / `-p 0:6379` and
`docker port` lookup, exported as `DATABASE_URL` / `REDIS_URL` to
`$GITHUB_ENV`. No fixed host-port → no collision.
3. `127.0.0.1` (NOT `localhost`) in URLs — IPv6 first-resolve flake
fixed in #92 stays fixed.
4. `if: always()` cleanup so containers don't leak when test steps
fail.
Issue #94 items #2 + #3 also addressed:
* Pre-pull `alpine:latest` (provisioner uses it for ephemeral
token-write containers in `internal/handlers/container_files.go`).
* Idempotent `docker network create molecule-monorepo-net` (the
provisioner attaches workspace containers via that bridge —
`internal/provisioner/provisioner.go::DefaultNetwork`).
Issue #94 item #1 (timeouts) NOT bumped — recent log evidence shows
postgres ready in 3s, redis in 1s, platform in 1s when they DO come
up. Timeouts are not the bottleneck on the current substrate.
NOT addressed here (out of scope, separate change required):
* `Run E2E API tests` step has been failing on `Status back online`
because the platform's langgraph workspace template image
(`ghcr.io/molecule-ai/workspace-template-langgraph:latest`)
returns 403 Forbidden post-2026-05-06 GitHub org suspension. That
is a template-registry resolution issue (ADR-002 / local-build
mode) and belongs in a workspace-server change, not this workflow
file. This PR fixes the parallel-collision class and the workflow
setup hygiene; the langgraph-403 failure will still surface on
runs after this lands until template resolution is fixed
separately.
Verified manually on operator host 2026-05-08: docker now hands out
ephemeral ports on `-p 0:5432`, two parallel runs land on different
ports, both reach pg_isready GREEN.
Closes#94 (items #2 and #3; item #1 documented as not-bottleneck;
langgraph-template-403 referenced for follow-up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switches from services: block to --network molecule-monorepo-net with unique per-run container names. Avoids port-5432 collision when parallel Handlers-Postgres jobs run on host-network act_runner. Approved by security-auditor.
Class B verification — second consecutive green run to demonstrate the
fix isn't flaky.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Class B Hongming-owned CICD red sweep. The Handlers Postgres Integration
workflow has been silently failing on staging push and PRs ever since
#92 fixed the IPv6 flake — the IPv6 fix correctly pinned 127.0.0.1, but
unmasked a deeper issue: with our act_runner global container.network=host
config, multiple concurrent runs of this workflow each tried to bind
0.0.0.0:5432 on the operator host. The first wins; subsequent postgres
service containers exit with `FATAL: could not create any TCP/IP sockets`
+ `Address in use`. Docker auto-removes them (act_runner sets
AutoRemove:true), so by the time `Apply migrations` runs `psql`, the
container is gone — Connection refused, then `failed to remove container:
No such container` at cleanup time.
Per-job container.network override is silently ignored by act_runner
(`--network and --net in the options will be ignored.`), so we sidestep
`services:` entirely. The job container still uses host-net (required
for cache server discovery on the operator's bridge IP). We launch a
sibling postgres on the existing molecule-monorepo-net bridge with a
unique name per run (run_id+run_attempt) and connect via the bridge IP
read from `docker inspect`.
Verified manually on operator host 2026-05-08: 2× postgres on host-net
collides, but on the bridge with unique names + different IPs, both
succeed and each is reachable from a host-net job container.
Adds:
- always()-cleanup step so containers don't leak on test failure
- Diagnostic dump now includes the postgres container's docker logs
- Runbook at docs/runbooks/ documenting the substrate behavior + the
pattern future workflows should adopt for any `services:`-shaped need.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Class A red sweep — 3 first-tests timing out at the 5000ms default on the
self-hosted Gitea Actions Docker runner across 4 unrelated PRs (#82, #81,
#54, #53). The PRs share zero canvas/ surface — same 3 tests, same
cold-start signature, same shape on every run.
Root cause: `npx vitest run --coverage` cold-start cost (v8 coverage
instrumentation init + JSDOM bootstrap + heavy @/components/* and @/lib/*
import + first React render) consumes 5-7 seconds for the first
synchronous test in a heavyweight test file. Empirically:
ActivityTab "renders all 7 filter options" 5230ms (FAIL)
CreateWorkspaceDialog "opens the dialog ..." 6453ms (FAIL)
ConfigTab.provider "PUTs the new provider on Save" 5605ms (FAIL)
vs subsequent tests in the same files at 100-1500ms each. The component
code is correct (e.g. ActivityTab.FILTERS has 7 entries matching the
test). 1407 tests pass locally with --coverage in 9-15s; CI runs at 200s
under the same flag — the gap is import/transform/environment overhead,
not test logic.
Fix: CI-conditional `testTimeout: process.env.CI ? 30000 : 5000` in
canvas/vitest.config.ts. Local-dev sensitivity to genuine waitFor races
preserved; CI gets ~5x headroom over the worst observed first-test
(6453ms). Same shape Vitest documents at
<https://vitest.dev/config/testtimeout> and
<https://vitest.dev/guide/coverage#profiling-test-performance>.
Verification:
- Local: 5x runs of the 3 failing test files, all 74 tests green
(process.env.CI unset → 5000ms applies).
- Local: 7s sleep probe FAILS at 5000ms default and PASSES under
CI=true → ternary takes effect as written.
- Local: full canvas suite under CI=true with --coverage:
"Test Files 98 passed (98) | Tests 1407 passed (1407)".
Closes#96.
Refs: #82, #81, #54, #53.
Hostile self-review (3 weakest spots):
1. 30000ms is a guess, not a measurement. Mitigation: vitest still
emits per-test duration; a real 25s+ test will surface as a
duration regression and we dial down.
2. Doesn't fix the Docker-runner-overhead root-root-cause. True. That
is a multi-week perf project. The right trade today is unblocking 4
PRs from this single class.
3. Local-default of 5000ms means a real 8s race that flies on CI's
30000ms could pass without local sensitivity. Mitigation: dev-time
waitFor races are caught at the per-test level; suite-level cold-
start is the only legitimate >5s case here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflicted files in .github/workflows/ taken from main:
.github/workflows/ci.yml
.github/workflows/e2e-staging-canvas.yml
.github/workflows/retarget-main-to-staging.yml
Conflicts arose from main advancing through PR #66/#79/#89 (CI workflow rewrites)
while staging hadn't picked up the changes yet. Main is the source of truth for
CI workflows; staging is downstream.
Co-authored-by: Claude (orchestrator)
Closes#88. Bundles localhost→127.0.0.1 + 2 other Gitea-act_runner flakes per feedback_gitea_actions_migration_audit_pattern. Approved by security-auditor.
The previous configs:-based fix (87b971a2) didn't actually fix the DinD
issue — Compose v2 falls back to bind mounts for `configs:` when swarm
mode is not active, so the resulting runc invocation still tries to
mount /workspace/.../cf-proxy/nginx.conf from the OUTER host filesystem
that the act_runner-vs-host-docker socket-mount can't see. Same
"not a directory" error returned.
Switch to a thin Dockerfile (cf-proxy/Dockerfile) that COPYs nginx.conf
into nginx:1.27-alpine. The build context is uploaded to the daemon as
a tarball, not bind-mounted from the host filesystem, so the path
translation gap doesn't apply. Verified locally: `docker build` +
`docker run cf-proxy nginx -T` reproduces the baked config end-to-end.
Trade-off: ~2-3s build cost on every harness up. Acceptable for the
Gitea CI gate; local-dev re-builds the image only when nginx.conf
changes (Docker layer cache).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three workflows have been failing on every push to this Gitea repo for
GitHub-shaped reasons that don't translate to act_runner. Surfaced
while landing #84; bundled per `feedback_gitea_actions_migration_audit_pattern`
("bundle per-repo, not per-finding") instead of three separate PRs.
1) handlers-postgres-integration: localhost → 127.0.0.1
- lib/pq tries to dial localhost → ::1 first; the postgres service
container only listens on IPv4 → ECONNREFUSED → all
TestIntegration_* fail. Pin IPv4 to make the job deterministic.
2) pr-guards / disable-auto-merge-on-push: Gitea no-op
- The previous reusable-workflow caller invoked `gh pr merge
--disable-auto`, which calls GitHub's GraphQL API. Gitea returns
HTTP 405 on /api/graphql → step always fails. Inline the step so
it can detect Gitea (GITEA_ACTIONS=true OR repo url under
moleculesai.app) and no-op with a notice. Auto-merge gating is
moot on Gitea anyway: there's no `--auto` primitive being
touched. Job stays ALWAYS-RUN so branch protection's required
check still lands SUCCESS (avoids the SKIPPED-in-set trap from
`feedback_branch_protection_check_name_parity`).
3) Harness Replays: cf-proxy nginx.conf via docker `configs:` (not bind)
- act_runner runs the workflow inside a runner container; runc in
the docker daemon below resolves bind-mount source paths on the
OUTER host, not inside the runner. The path
`/workspace/.../cf-proxy/nginx.conf` is invisible there → "not a
directory" runc error. Switching to compose `configs:` packages
the file as content rather than a host bind, sidestepping the
DinD path-translation gap.
Local validation:
- YAML parsed clean for all 3 files.
- cf-proxy nginx.conf: standalone `docker compose run cf-proxy
nginx -T` reproduced the configs: mount end-to-end and dumped the
config correctly. The full harness compose still renders via
`docker compose config`.
Real-CI verification will land on this branch's first push.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `HongmingWang-Rabbit/hermes-agent` fork is no longer reachable on
github.com (account suspended 2026-05-06). The patched fork now lives
at https://git.moleculesai.app/molecule-ai/hermes-agent. Same SHAs,
same branches — pure URL flip.
See molecule-ai/internal#72 for the github.com fork shell decision.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
actions/upload-artifact@v4+ and download-artifact@v4+ use the GHES 3.10+
artifact protocol that Gitea Actions (act_runner v0.6 / Gitea 1.22.x)
does NOT implement. Failure cite from PR #54 run 1325 jobs/2:
::error::@actions/artifact v2.0.0+, upload-artifact@v4+ and
download-artifact@v4+ are not currently supported on GHES.
Pinned all 3 references to v3.2.2 (latest v3) at SHA-pinned form for
supply-chain hygiene, matching the existing `uses:` style in this repo.
Affected workflows:
- ci.yml (Canvas Next.js coverage upload, blocks `CI / Canvas (Next.js)`
required check on every PR — was the merge-queue blocker for #53,
#54, #69, #71, #76, #81)
- e2e-staging-canvas.yml (Playwright report + screenshots on failure)
No download-artifact callers in the repo, so v3-pin doesn't compose-break
anywhere. Drop these pins post-Gitea-1.23+ when the v4 artifact protocol
ships, or migrate to a Gitea-native action.
Closes#210.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Class F of #75 sweep. /commits/{sha}/statuses replaces unavailable workflow-runs API. 4 mapping buckets verified against synthetic+real Gitea data. Approved by security-auditor.