Compare commits

...

10 Commits

Author SHA1 Message Date
19b61729ac fix(security): CWE-22 path traversal guard in loadWorkspaceEnv
Some checks failed
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
audit-force-merge / audit (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Failing after 4s
OFFSEC-003 / closes #330.

`loadWorkspaceEnv` used `filepath.Join(orgBaseDir, filesDir, ".env")`
without a resolveInsideRoot guard on filesDir — allowing malicious org YAML
to read files outside the org root (e.g. filesDir: "../../../etc").

Two locations patched:

1. org_helpers.go:loadWorkspaceEnv — wrap filesDir with resolveInsideRoot
   before joining into the load path. On traversal rejection the org-root
   .env is still loaded; the traversal path is silently skipped.

2. org_import.go:createWorkspaceTree — same unguarded Join at line 494
   was patched with the identical guard.

resolveInsideRoot is already established in the codebase (used for
template and files_dir elsewhere in org_import.go), so no new primitives
are introduced.

Added org_helpers_test.go covering:
- Normal load of org-root + workspace .env (workspace overrides org)
- Traversal paths (../../../etc etc.) are silently rejected
- Non-existent workspace dir returns org-root vars only
- Empty orgBaseDir returns empty map
2026-05-10 23:12:21 +00:00
6958cd7966 Merge pull request 'fix(workspace): inject plugins_registry into sys.modules before loading adapters (closes #296)' (#326) from fix/issue-296-plugin-registry-sysmodules into staging
All checks were successful
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
2026-05-10 21:14:10 +00:00
d4d3306150 fix(workspace): inject plugins_registry into sys.modules before loading adapters (closes #296)
Some checks failed
sop-tier-check / tier-check (pull_request) Failing after 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 58s
audit-force-merge / audit (pull_request) Successful in 2s
Plugin adapters in molecule-skill-* repos do:
  from plugins_registry.builtins import AgentskillsAdaptor as Adaptor

But _load_module_from_path() used exec_module() with a fresh module
namespace that did NOT have plugins_registry or its submodules in sys.modules,
causing:
  ModuleNotFoundError: No module named 'plugins_registry'

Fix: before exec_module(), import and register plugins_registry + all three
submodules (builtins, protocol, raw_drop) in sys.modules so adapter imports
resolve correctly.  Follows the Option 1 recommendation from issue #296.

Also adds test_resolve_plugin.py verifying the fix for both the
AgentskillsAdaptor import and the full InstallContext/resolve/protocol import.

Closes #296.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 14:17:16 +00:00
a3c9f0b717 Merge pull request 'ci: pin GitHub Actions by SHA instead of mutable tags (staging sync)' (#276) from ci/staging-sha-pinning into staging
Some checks failed
Secret scan / Scan diff for credential-shaped strings (push) Failing after 2s
2026-05-10 14:03:05 +00:00
de9f46ea30 Merge pull request '[release-blocker] fix(ci): retry git clone in clone-manifest.sh (publish-workspace-server-image OOM flake)' (#298) from fix/publish-workspace-server-ci-clone-manifest-retry into staging
Some checks are pending
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
2026-05-10 12:44:35 +00:00
7ff5622a42 [infra-lead-agent] fix(ci): retry git clone in clone-manifest.sh (publish-workspace-server-image flake)
Some checks failed
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 1s
sop-tier-check / tier-check (pull_request) Failing after 1s
audit-force-merge / audit (pull_request) Failing after 2s
The publish-workspace-server-image / build-and-push job clones the full
manifest (~36 repos) serially in the "Pre-clone manifest deps" step on a
memory-constrained Gitea Actions runner. Under host memory pressure the
OOM killer SIGKILLs git-remote-https mid-clone:

  cloning .../molecule-ai-plugin-molecule-skill-code-review.git ...
  error: git-remote-https died of signal 9
  fatal: the remote end hung up unexpectedly
    Failure - Main Pre-clone manifest deps
  exitcode '128': failure

Observed in run 4622 (2026-05-10, staging HEAD b5d2ab88) — died on the
14th of 36 clones, which red-lights CI and wedges staging→main.

Wrap each `git clone` in clone-manifest.sh with bounded retry + backoff
(3 attempts, 3s/6s), wiping any partial checkout between tries. A single
transient SIGKILL / network blip no longer fails the whole tenant image
rebuild. Benefits every caller of the script (publish-workspace-server-image,
harness-replays, Dockerfile builds, local quickstart).

This is a mitigation; the durable fix is more runner RAM/swap on the
operator host — tracked separately with Infra-SRE.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 11:58:09 +00:00
bea89ce4e9 fix(a2a): handle string-form errors in delegate_task
Some checks failed
Secret scan / Scan diff for credential-shaped strings (pull_request) Failing after 14s
sop-tier-check / tier-check (pull_request) Failing after 7s
audit-force-merge / audit (pull_request) Failing after 5s
The A2A proxy can return three error shapes:
  {"error": "plain string"}
  {"error": {"message": "...", "code": ...}}
  {"error": {"message": {"nested": "object"}}}   ← value at .message is a string

builtin_tools/a2a_tools.py:72 called data["error"].get("message")
without guarding against error being a string, which raised:
  AttributeError: 'str' object has no attribute 'get'

This broke every delegation attempt through the legacy a2a_tools path
(the LangChain-wrapped version used by adapter templates). The
SSOT parser a2a_response.py already handled string errors; the
legacy inline sniffer in a2a_tools.py did not.

Fix: branch on isinstance(err, dict/str/other) before calling .get().

Also update both publish-workflow files to remove the dead
`staging` branch trigger — trunk-based migration (PR #109,
2026-05-08) removed the staging branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 11:39:32 +00:00
14f05b5a64 chore: restore manifest.json after trigger test 2026-05-10 11:38:34 +00:00
7caee806df chore: trigger publish workflow [Integration Tester 2026-05-10T08:45Z] 2026-05-10 11:38:34 +00:00
a914f675a4 chore: staging trigger commit from Integration Tester 2026-05-10 11:38:34 +00:00
10 changed files with 268 additions and 13 deletions

View File

@ -32,11 +32,9 @@ on:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
# Serialize per-branch so two rapid staging pushes don't race the same
# :staging-latest tag retag. Allow staging and main to run in parallel
# (different GITHUB_REF → different concurrency group) since they
# produce different :staging-<sha> tags and last-write-wins on
# :staging-latest is acceptable across branches.
# Serialize per-branch so two rapid main pushes don't race the same
# :staging-latest tag retag. Allow parallel runs as they produce
# different :staging-<sha> tags and last-write-wins on :staging-latest.
#
# cancel-in-progress: false → in-flight builds finish; the next push's
# build queues. This avoids a partially-pushed image.

1
.staging-trigger Normal file
View File

@ -0,0 +1 @@
staging trigger

View File

@ -44,3 +44,4 @@
{"name": "mock-bigorg", "repo": "molecule-ai/molecule-ai-org-template-mock-bigorg", "ref": "main"}
]
}
// Triggered by Integration Tester at 2026-05-10T08:52Z

View File

@ -37,6 +37,50 @@ PLUGINS_DIR="${4:?Missing plugins dir}"
EXPECTED=0
CLONED=0
# clone_one_with_retry — clone a single repo, retrying on transient failure.
#
# Why: the publish-workspace-server-image (and harness-replays) CI jobs
# clone the full manifest (~36 repos) serially on a memory-constrained
# Gitea Actions runner. Under host memory pressure the OOM killer
# occasionally SIGKILLs git-remote-https mid-clone:
#
# error: git-remote-https died of signal 9
# fatal: the remote end hung up unexpectedly
#
# (observed in publish-workspace-server-image run 4622 on 2026-05-10 — the
# job died on the 14th of 36 clones, which wedged staging→main). One
# transient SIGKILL / network blip would otherwise fail the whole tenant
# image rebuild. Retrying after a short backoff lets the pressure subside.
# The durable fix is more runner RAM/swap (tracked with Infra-SRE); this
# just stops a single flake from being release-blocking.
#
# Args: <target_dir> <name> <clone_url> <display_url> <ref>
clone_one_with_retry() {
local tdir="$1" name="$2" url="$3" display="$4" ref="$5"
local attempt=1 max_attempts=3 backoff
while : ; do
# A killed attempt can leave a partial directory behind; git clone
# refuses a non-empty target, so wipe it before each try.
rm -rf "$tdir/$name"
if [ "$ref" = "main" ]; then
if git clone --depth=1 -q "$url" "$tdir/$name"; then return 0; fi
else
if git clone --depth=1 -q --branch "$ref" "$url" "$tdir/$name"; then return 0; fi
fi
if [ "$attempt" -ge "$max_attempts" ]; then
echo "::error::clone failed after ${max_attempts} attempts: ${display}" >&2
return 1
fi
backoff=$((attempt * 3)) # 3s, then 6s
echo " ⚠ clone attempt ${attempt}/${max_attempts} failed for ${display} — retrying in ${backoff}s" >&2
sleep "$backoff"
attempt=$((attempt + 1))
done
}
clone_category() {
local category="$1"
local target_dir="$2"
@ -82,11 +126,7 @@ clone_category() {
fi
echo " cloning $display_url -> $target_dir/$name (ref=$ref)"
if [ "$ref" = "main" ]; then
git clone --depth=1 -q "$clone_url" "$target_dir/$name"
else
git clone --depth=1 -q --branch "$ref" "$clone_url" "$target_dir/$name"
fi
clone_one_with_retry "$target_dir" "$name" "$clone_url" "$display_url" "$ref"
CLONED=$((CLONED + 1))
i=$((i + 1))
done

View File

@ -98,7 +98,11 @@ func loadWorkspaceEnv(orgBaseDir, filesDir string) map[string]string {
}
parseEnvFile(filepath.Join(orgBaseDir, ".env"), envVars)
if filesDir != "" {
parseEnvFile(filepath.Join(orgBaseDir, filesDir, ".env"), envVars)
safeFilesDir, err := resolveInsideRoot(orgBaseDir, filesDir)
if err != nil {
return envVars // silently reject path traversal
}
parseEnvFile(filepath.Join(safeFilesDir, ".env"), envVars)
}
return envVars
}

View File

@ -0,0 +1,123 @@
package handlers
import (
"os"
"path/filepath"
"testing"
)
// TestLoadWorkspaceEnv_OrgRootOnly verifies that loadWorkspaceEnv loads the
// org-root .env file even when filesDir is empty.
func TestLoadWorkspaceEnv_OrgRootOnly(t *testing.T) {
tmp := t.TempDir()
orgEnv := filepath.Join(tmp, ".env")
if err := os.WriteFile(orgEnv, []byte("ORG_KEY=org_value\n"), 0o644); err != nil {
t.Fatal(err)
}
got := loadWorkspaceEnv(tmp, "")
if got["ORG_KEY"] != "org_value" {
t.Errorf("got %v, want ORG_KEY=org_value", got)
}
}
// TestLoadWorkspaceEnv_WorkspaceOverrides verifies that a workspace-specific .env
// file (identified by filesDir) overrides org-root values.
func TestLoadWorkspaceEnv_WorkspaceOverrides(t *testing.T) {
tmp := t.TempDir()
// Org root .env
if err := os.WriteFile(filepath.Join(tmp, ".env"), []byte("KEY=org_value\n"), 0o644); err != nil {
t.Fatal(err)
}
// Workspace-specific .env
wsDir := filepath.Join(tmp, "my-workspace")
if err := os.Mkdir(wsDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(wsDir, ".env"), []byte("KEY=ws_value\n"), 0o644); err != nil {
t.Fatal(err)
}
got := loadWorkspaceEnv(tmp, "my-workspace")
if got["KEY"] != "ws_value" {
t.Errorf("got %v, want KEY=ws_value (workspace override)", got)
}
}
// TestLoadWorkspaceEnv_TraversalRejected verifies that CWE-22 path traversal
// via filesDir is silently rejected — no file outside orgBaseDir is read.
// The org-root .env is still loaded; only the traversal path is blocked.
func TestLoadWorkspaceEnv_TraversalRejected(t *testing.T) {
tmp := t.TempDir()
// Org root .env — must still be loaded
if err := os.WriteFile(filepath.Join(tmp, ".env"), []byte("SAFE=safe_val\n"), 0o644); err != nil {
t.Fatal(err)
}
// Simulate a malicious YAML: filesDir = "../../../tmp/traversal"
// The orgBaseDir is tmp; the traversal target is /tmp/traversal/.
// resolveInsideRoot must reject this and loadWorkspaceEnv must silently
// return only the org-root vars.
got := loadWorkspaceEnv(tmp, "../../../tmp/traversal")
if got["SAFE"] != "safe_val" {
t.Errorf("org-root .env should still be loaded, got %v", got)
}
// The traversal path is rejected, so no secrets from there appear.
if v, ok := got["TRAVERSAL_SECRET"]; ok {
t.Errorf("traversal path should not be read, got TRAVERSAL_SECRET=%q", v)
}
// Confirm the map contains only the safe key.
if len(got) != 1 || got["SAFE"] != "safe_val" {
t.Errorf("got %v, want only {SAFE:safe_val}", got)
}
}
// TestLoadWorkspaceEnv_TraversalDotdotdot verifies multiple-dot traversal is
// blocked (e.g. "../../../etc").
func TestLoadWorkspaceEnv_TraversalDotdotdot(t *testing.T) {
tmp := t.TempDir()
if err := os.WriteFile(filepath.Join(tmp, ".env"), []byte("X=x\n"), 0o644); err != nil {
t.Fatal(err)
}
cases := []string{
"../../../etc",
"../../../../../../../../../etc/passwd",
"..\\..\\..\\windows\\system32",
}
for _, tc := range cases {
got := loadWorkspaceEnv(tmp, tc)
// Only org root env should be present
if len(got) != 1 || got["X"] != "x" {
t.Errorf("traversal %q: got %v, want only {X:x}", tc, got)
}
}
}
// TestLoadWorkspaceEnv_NonExistentWorkspace returns org-root vars only when
// the workspace directory does not exist.
func TestLoadWorkspaceEnv_NonExistentWorkspace(t *testing.T) {
tmp := t.TempDir()
if err := os.WriteFile(filepath.Join(tmp, ".env"), []byte("A=a\n"), 0o644); err != nil {
t.Fatal(err)
}
got := loadWorkspaceEnv(tmp, "nonexistent-workspace")
if got["A"] != "a" {
t.Errorf("got %v, want A=a", got)
}
if len(got) != 1 {
t.Errorf("got %v, want only {A:a}", got)
}
}
// TestLoadWorkspaceEnv_EmptyOrgBaseDir returns empty map when orgBaseDir is
// empty (no org root .env to load).
func TestLoadWorkspaceEnv_EmptyOrgBaseDir(t *testing.T) {
got := loadWorkspaceEnv("", "any-files-dir")
if len(got) != 0 {
t.Errorf("got %v, want empty map", got)
}
}

View File

@ -489,9 +489,11 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
if orgBaseDir != "" {
// 1. Org root .env (shared defaults)
parseEnvFile(filepath.Join(orgBaseDir, ".env"), envVars)
// 2. Workspace-specific .env (overrides)
// 2. Workspace-specific .env (overrides) — guard against traversal
if ws.FilesDir != "" {
parseEnvFile(filepath.Join(orgBaseDir, ws.FilesDir, ".env"), envVars)
if safeFilesDir, err := resolveInsideRoot(orgBaseDir, ws.FilesDir); err == nil {
parseEnvFile(filepath.Join(safeFilesDir, ".env"), envVars)
}
}
}
// Store as workspace secrets via DB (encrypted if key is set, raw otherwise)

View File

@ -77,6 +77,16 @@ async def delegate_task(workspace_id: str, task: str) -> str:
return str(result) if isinstance(result, str) else "(no text)"
elif "error" in data:
err = data["error"]
# Handle both string-form errors ("error": "some string")
# and object-form errors ("error": {"message": "...", "code": ...}).
msg = ""
if isinstance(err, dict):
msg = err.get("message", "")
elif isinstance(err, str):
msg = err
else:
msg = str(err)
return f"Error: {msg}"
msg = ""
if isinstance(err, dict):
msg = err.get("message", "")

View File

@ -51,6 +51,22 @@ class AdaptorSource:
def _load_module_from_path(module_name: str, path: Path):
"""Import a Python file by absolute path. Returns the module or None on failure."""
# Ensure the plugins_registry package and its submodules are importable in the
# fresh module namespace created by module_from_spec(). Plugin adapters
# (molecule-skill-*/adapters/*.py) use "from plugins_registry.builtins import ..."
# which requires plugins_registry and its submodules to already be in sys.modules.
# We import and register them before exec_module so the plugin's own
# from ... import statements resolve correctly.
import sys
import plugins_registry
sys.modules.setdefault("plugins_registry", plugins_registry)
for _sub in ("builtins", "protocol", "raw_drop"):
try:
sub = importlib.import_module(f"plugins_registry.{_sub}")
sys.modules.setdefault(f"plugins_registry.{_sub}", sub)
except Exception:
# Submodule may not exist in all versions; skip if absent.
pass
spec = importlib.util.spec_from_file_location(module_name, path)
if spec is None or spec.loader is None:
return None

View File

@ -0,0 +1,60 @@
"""Tests for _load_module_from_path sys.modules injection fix (issue #296).
Verifies that plugin adapters using "from plugins_registry.builtins import ..."
can be loaded via _load_module_from_path() without ModuleNotFoundError.
"""
import sys
import tempfile
import os
from pathlib import Path
# Ensure the plugins_registry package is importable
import plugins_registry
from plugins_registry import _load_module_from_path
def test_load_adapter_with_plugins_registry_import():
"""Plugin adapter using 'from plugins_registry.builtins import ...' loads cleanly."""
# Write a temp adapter file that does the exact import from the bug report.
with tempfile.NamedTemporaryFile(
mode="w", suffix=".py", delete=False, dir=tempfile.gettempdir()
) as f:
f.write("from plugins_registry.builtins import AgentskillsAdaptor as Adaptor\n")
f.write("assert Adaptor is not None\n")
adapter_path = Path(f.name)
try:
module = _load_module_from_path("test_adapter", adapter_path)
assert module is not None, "module should load without error"
assert hasattr(module, "Adaptor"), "module should expose Adaptor"
finally:
os.unlink(adapter_path)
def test_load_adapter_with_full_plugins_registry_import():
"""Plugin adapter using 'from plugins_registry import ...' loads cleanly."""
with tempfile.NamedTemporaryFile(
mode="w", suffix=".py", delete=False, dir=tempfile.gettempdir()
) as f:
f.write("from plugins_registry import InstallContext, resolve\n")
f.write("from plugins_registry.protocol import PluginAdaptor\n")
f.write("assert InstallContext is not None\n")
f.write("assert resolve is not None\n")
f.write("assert PluginAdaptor is not None\n")
adapter_path = Path(f.name)
try:
module = _load_module_from_path("test_adapter_full", adapter_path)
assert module is not None, "module should load without error"
assert hasattr(module, "InstallContext"), "module should expose InstallContext"
assert hasattr(module, "resolve"), "module should expose resolve"
assert hasattr(module, "PluginAdaptor"), "module should expose PluginAdaptor"
finally:
os.unlink(adapter_path)
if __name__ == "__main__":
test_load_adapter_with_plugins_registry_import()
test_load_adapter_with_full_plugins_registry_import()
print("ALL TESTS PASS")