molecule-core/workspace/executor_helpers.py
molecule-ai[bot] 64ccf8e179
fix: CWE-78 rm scope, go vet failures, delegation idempotency
* refactor: split 4 oversized handler files into focused sub-files

- org.go (1099 lines) → org.go + org_import.go + org_helpers.go
- mcp.go (1001 lines) → mcp.go + mcp_tools.go
- workspace.go (934 lines) → workspace.go + workspace_crud.go
- a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go

No functional changes — same package, same exports, same tests.
All files stay under 635 lines.

Note: isSafeURL and isPrivateOrMetadataIP are duplicated between
mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue
from the original mcp.go and a2a_proxy.go, not introduced by this split.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386)

* docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal

* docs: add Remote Agents feature + Phase 30 blog links to docs index

* docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted

* docs(api-ref): add workspace file copy API reference (#1281)

Documents TemplatesHandler.copyFilesToContainer (container_files.go):
- Endpoint overview: PUT /workspaces/:id/files/*path
- Parameter descriptions for all four function parameters
- CWE-22 path traversal protection (PRs #1267/1270/1271)
- Defense-in-depth: validateRelPath at handler + archive boundary
- Full error code table (400/404/500)
- curl example with success and path-traversal rejection cases

Also covers: writeViaEphemeral routing, findContainer fallback,
allowed roots allow-list, and related links to platform-api.md.

Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310)

## Summary
Issue #1273: deleteViaEphemeral interpolated filePath directly into
rm command, enabling both shell injection (CWE-78) and path traversal
(CWE-22) attacks.

## Changes
1. Added validateRelPath(filePath) guard before constructing the rm command.
   validateRelPath blocks absolute paths and ".." traversal sequences.
2. Changed Cmd from "/configs/"+filePath (string interpolation) to
   []string{"rm", "-rf", "/configs", filePath} (exec form). This
   eliminates shell injection entirely — filePath is a plain argument,
   never interpreted as shell code.

## Security properties
- validateRelPath: blocks "../" and absolute paths before they reach Docker
- Exec form: filePath cannot inject shell metacharacters even if validation
  is somehow bypassed
- "/configs" as separate arg: rm has exactly two arguments, no room for
  injected args

Closes #1273.

Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>

* fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302)

* fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go

Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go.
staging already ships the fix (PRs #1147, #1154 → merged); main did not include it.

- mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate
  agentURL before outbound calls in mcpCallTool (line ~529) and
  toolDelegateTaskAsync (line ~607)
- a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP()
  helpers; call isSafeURL() before dispatchA2A in resolveAgentURL()
  (blocks finding #1 at line 462)
- mcp_test.go: 19 new tests covering all blocked URL patterns:
  file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x,
  172.16.x.x, 192.168.x.x, empty hostname, invalid URL,
  isPrivateOrMetadataIP across all private/CGNAT/metadata ranges

1. URL scheme enforcement — http/https only
2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges
3. DNS hostname resolution — blocks internal hostnames resolving to private IPs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go

Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in
mcp.go — both functions already exist on main at lines 829 and 876.
Kept the mcp.go definitions (the originals) and removed the 70-line
duplicate appended at end of file. a2a_proxy.go functions are
unchanged — they serve the same purpose via a separate code path.

* fix: remove orphaned commit-text lines from a2a_proxy.go

Three lines from the PR/commit title were accidentally baked into the
file during the rebase from #1274 to #1302, causing a Go syntax error
(a bare string literal at statement level followed by dangling braces).

Deletion restores:
  }
  return agentURL, nil
}

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app>

* fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313)

* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327)

* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct

Fixes #1324 — TypeScript strict mode flags budget.budget_used as
possibly undefined in the progressPct ternary, even though the
outer condition checks budget_limit > 0.

Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0%
when the backend returns a partial shape (provisioning-stuck
workspaces). Also adds a test covering the undefined-budget_used
case with the progress bar aria-valuenow and fill width both at 0%.

Closes #1324.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329)

* fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled

With cancel-in-progress: false, pending CI runs accumulate in the
ci-staging concurrency group. New pushes create queued runs, but
GitHub dispatches multiple runs for the same SHA instead of replacing
the pending one. All runs get stuck/cancelled before completing.

Reverting to cancel-in-progress: true restores CI operation — runs
that are superseded are cancelled, freeing the concurrency slot for
the new run to proceed.

Runner availability (ubuntu-latest dispatch stall) is a separate
infra issue tracked independently.

* fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043)

Tar header names were built from raw map keys without validation. A malicious
server-side caller could embed "../" in a file name to escape the destPath
volume mount (/configs) and write files outside the intended directory.

Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks
before using it in the tar header, then join with destPath for the archive
header. Also guard parent-directory creation against traversal.

Closes #1043.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix

Two regressions introduced by PR #1243 (fix issue #1207):

1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives
   `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test
   expected only `{id, name}`. Added `hasChildren: false` to the assertion.

2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)`
   without `act()`. With fake timers, `setState` (synchronous) is flushed by
   `advanceTimersByTimeAsync`, but the React state update it triggers is a
   microtask — so the test saw stale render. Wrapping in `act(async () =>
   { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain
   before assertions run.

All 813 vitest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add 100px proximity threshold to drag-to-nest detection

Fixes #1052 — previously, getIntersectingNodes() returned any node whose
bounding box overlapped the dragged node, regardless of actual pixel
distance. On a sparse canvas this triggered the "Nest Workspace" dialog
even when the dragged node was nowhere near any target.

The fix adds an on-node-drag proximity filter: only nodes within 100px
(center-to-center) of the dragged node are eligible as nest targets.
Distance is computed as squared Euclidean to avoid the sqrt overhead in
the hot drag path.

Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring
and confirming the regression is addressed in Canvas.tsx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas): add ?? 0 guard for optional budget_used in progressPct

Fixes #1324 — TypeScript strict mode flags budget.budget_used as
possibly undefined in the progressPct ternary, even though the
outer condition checks budget_limit > 0.

Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0%
when the backend returns a partial shape (provisioning-stuck
workspaces). Also adds a test covering the undefined-budget_used
case with the progress bar aria-valuenow and fill width both at 0%.

Closes #1324.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(platform): unblock SaaS workspace registration end-to-end

Every workspace in the cross-EC2 SaaS provisioning shape was failing
registration, heartbeat, or A2A routing. Four distinct blockers sat
between "EC2 is up" and "agent responds"; three are platform-side and
fixed here (the fourth is in the CP user-data, separate PR).

1. SSRF validator blocked RFC-1918 (registry.go + mcp.go)
   validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12,
   which contains the AWS default VPC range (172.31.x.x) that every
   sibling workspace EC2 registers from. Registration returned 400 and
   the 10-min provision sweep flipped status to failed. RFC-1918 +
   IPv6 ULA are now gated behind saasMode(); link-local (169.254/16),
   loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked
   unconditionally in both modes.

   saasMode() resolution order:
     1. MOLECULE_DEPLOY_MODE=saas|self-hosted (explicit operator flag)
     2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for
        back-compat so existing deployments don't need a config change)

   isPrivateOrMetadataIP now actually checks IPv6 — previously it
   returned false on any non-IPv4 input, which would let a registered
   [::1] or [fe80::...] URL bypass the SSRF check entirely.

2. Orphan auth-token minting (workspace_provision.go)
   issueAndInjectToken mints a token and stuffs it into
   cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that
   file into the /configs volume — the CP provisioner ignores it
   (only cfg.EnvVars crosses the wire). Result: live token in DB, no
   plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every
   /registry/register attempt because the workspace is no longer in
   the "no live token → bootstrap-allowed" state. Now no-ops in SaaS
   mode; the register handler already mints on first successful
   register and returns the plaintext in the response body for the
   runtime to persist locally.

   Also removes the redundant wsauth.IssueToken call at the bottom of
   provisionWorkspaceCP, which created the same orphan-token pattern
   a second time.

3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go,
   scheduler.go, workspace_provision.go)
   Four pre-existing compile errors on main from an earlier session's
   code truncation: missing tuple destructuring on ExecContext /
   redactSecrets / orgTokenActor, missing close-brace in
   Scheduler.fireSchedule's panic recovery. All one-line mechanical
   fixes; without them the binary would not build.

Tests
-----
ssrf_test.go adds:
  * TestSaasMode — covers the env resolution ladder (explicit flag
    wins over legacy signal, case-insensitive, whitespace tolerant)
  * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA
    flip to allowed, metadata/loopback/TEST-NET still blocked
  * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old
    "returns false for all IPv6" behaviour

Follow-up issue for CP-sourced workspace_id attestation will be filed
separately — closes the residual intra-VPC SSRF + token-race windows
the SaaS-mode relaxation introduces.

Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI
provider) — agent returned "PONG" in 1.4s after register → heartbeat →
A2A proxy → runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408)

Runtime (shared_runtime.py):
- set_current_task now increments active_tasks on task start, decrements
  on completion (was binary 0/1)
- Counter never goes below 0 (max(0, n-1))
- Pushes heartbeat immediately on BOTH increment and decrement (#1372)

Scheduler (scheduler.go):
- Reads max_concurrent_tasks from DB (default 1, backward compatible)
- Skips cron only when active_tasks >= max_concurrent_tasks (was > 0)
- Leaders can be configured with max_concurrent_tasks > 1 to accept
  A2A delegations while a cron runs

Platform:
- Added max_concurrent_tasks column to workspaces (migration 037)
- Workspace model + list/get queries include the new field
- API exposes max_concurrent_tasks in workspace JSON

Config.yaml support (future): runtime_config.max_concurrent_tasks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(review): address 3 critical issues from code review

1. BLOCKER: executor_helpers.py now uses increment/decrement too
   (was still binary 0/1, stomping the counter for CLI + SDK executors)

2. BUG: asymmetric getattr defaults fixed — both paths use default 0
   (was 0 on increment, 1 on decrement)

3. UX: current_task preserved when active_tasks > 0 on decrement
   (was clearing task description even when other tasks still running)

4. Scheduler polling loop re-reads max_concurrent_tasks on each poll
   (was using stale value from initial query)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>

* docs: workspace files API reference, skill catalog, and links

* docs: fix secrets endpoint path across docs

The workspace secrets endpoint is `/workspaces/:id/secrets`, not
`/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent)
and workspace-runtime.md (registration flow example and comparison table).
The external-agent-registration guide already had the correct path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: fix broken blog cross-link in skills-vs-bundled-tools post

Link path had an extra `/docs/` segment: `/docs/blog/...` instead of
`/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`,
not under `/docs/blog/`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add skill-catalog.md guide

Linked from the skills-vs-bundled-tools blog post as a reference
for TTS/image-generation/web-search skills. The blog promises
"install directly via the CLI" with a skill catalog — this page
fills that promise by documenting available skill types, install
commands, version management, custom skill authoring, and removal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted

* docs(api-ref): add workspace file copy API reference

Documents TemplatesHandler.copyFilesToContainer (container_files.go):
- Endpoint overview: PUT /workspaces/:id/files/*path
- Parameter descriptions for all four function parameters
- CWE-22 path traversal protection (PRs #1267/1270/1271)
- Defense-in-depth: validateRelPath at handler + archive boundary
- Full error code table (400/404/500)
- curl example with success and path-traversal rejection cases

Also covers: writeViaEphemeral routing, findContainer fallback,
allowed roots allow-list, and related links to platform-api.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>

* fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go

Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP
into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it
unconditionally blocks RFC-1918 addresses, regressing the fix in
commits 1125a02 / cf10733.

The A2A proxy path now has the same SaaS-gated logic as registry.go:
- Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes
- RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in
  self-hosted, allowed in SaaS cross-EC2 mode
- IPv6 addresses now properly checked (previous version returned false for all)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(marketing): Discord adapter Day 2 Reddit + HN community copy

* fix(tests): supply *events.Broadcaster pointer to captureBroadcaster

Cannot use *captureBroadcaster as *events.Broadcaster when the struct
embeds events.Broadcaster as a value — must initialize as a named field.

Fixes go vet error in workspace_provision_test.go:
  cannot use broadcaster (*captureBroadcaster) as *events.Broadcaster value

* Merge pull request #1429 from fix/canvas-tooltip-clear-timer

Without this, a 400ms setTimeout from onFocus/onMouseEnter that fires
after onBlur will re-show a tooltip the user just dismissed. The
setShow(false) in onBlur closes the tooltip immediately but leaves the
timer pending — Tab-blur followed by timer-fire would re-show it.

Fix: add clearTimeout(timerRef.current) at the top of onBlur, mirroring
the pattern already used in onMouseLeave and onFocus.

Refs: PR #1367 (a11y keyboard support — this was a pre-existing gap)

Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): add missing children:[] to setPendingDelete expectation (#1426)

PR #1252 (cascade-delete UX) updated setPendingDelete to pass a
children array for cascade-warning rendering. The keyboard-a11y test
assertion was not updated to match.

Test: clicking 'Delete' hoists state to the store and closes the menu

Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(canvas/test): add children:[] to setPendingDelete + \&apos; entity fix (closes #1380) (#1427)

* ci: retry — trigger fresh runner allocation

* fix(canvas/test): add children:[] to setPendingDelete assertion

setPendingDelete now includes children:[] (PR #1383 extended the
pendingDelete type). The keyboard accessibility test at line 225 used
exact object matching which omitted the new field, causing a failure
after staging merged #1383.

Issue: #1380

* fix(canvas): replace &apos; HTML entity with straight apostrophe

JSX does not entity-decode &apos; — it renders the literal text
"&apos;" instead of "'".  Found at line 157 (payment confirmed) and
line 321 (empty org list).  Replaced with a straight apostrophe,
which JSX handles correctly.

Ref: issue #1375
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: DevOps Engineer <devops@molecule.ai>
Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Merge pull request #1430 from fix/1421-saas-ssrf-helpers

Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP
into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it
unconditionally blocks RFC-1918 addresses, regressing the fix in
commits 1125a02 / cf10733.

The A2A proxy path now has the same SaaS-gated logic as registry.go:
- Cloud metadata (169.254/16, fe80::/10, ::1) always blocked in both modes
- RFC-1918 (10/8, 172.16/12, 192.168/16) + IPv6 ULA (fc00::/7) blocked in
  self-hosted, allowed in SaaS cross-EC2 mode
- IPv6 addresses now properly checked (previous version returned false for all)

Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(P0): CWE-22 path traversal in copyFilesToContainer + ContextMenu test

Issue #1434 — CWE-22 Path Traversal Regression:
PR #1280 (dc218212) correctly used cleaned path in tar header.
PR #1363 (e9615af) regressed to using uncleaned `name`.
Fix: use `clean` in filepath.Join AND add defence-in-depth escape check.

Issue #1422 — ContextMenu Test Regression:
PR #1340 expanded pendingDelete store type to include `children:[]`.
Test assertion missing the field — add `children:[]` to match.

Note: ssrf.go created (shared isSafeURL/isPrivateOrMetadataIP) to
prepare for the handler-split refactor fix — current branch has no
build error, but the shared file will prevent regression when PR #1363
is merged. isSafeURL/isPrivateOrMetadataIP retained in both files
for now to avoid breaking callers while the split is finalized.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve 3 go vet failures + add idempotency_key to delegate_task_async

- workspace_provision_test.go: add missing mock := setupTestDB(t) to
  TestSeedInitialMemories_Truncation — mock was referenced but never
  declared, causing "undefined: mock" vet error
- orgtoken/tokens_test.go: discard unused orgID return value with _ in
  Validate call — "declared and not used" vet error
- a2a_tools.py: delegate_task_async now sends idempotency_key (SHA-256
  of workspace_id + task) to POST /workspaces/:id/delegate, fixing
  duplicate task execution when an agent restarts mid-delegation (#1456)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: airenostars <airenostars@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com>
Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app>
Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app>
Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app>
Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Molecule AI Community Manager <community-manager@agents.moleculesai.app>
Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app>
Co-authored-by: Molecule AI Core-QA <core-qa@agents.moleculesai.app>
Co-authored-by: DevOps Engineer <devops@molecule.ai>
Co-authored-by: Molecule AI Core-UIUX <core-uiux@agents.moleculesai.app>
Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app>
2026-04-21 18:22:30 +00:00

585 lines
22 KiB
Python

"""Shared helpers for AgentExecutor implementations.
Used by both CLIAgentExecutor (codex, ollama) and ClaudeSDKExecutor (claude-code).
Provides:
- Memory recall/commit (HTTP to platform /memories endpoints)
- Delegation results consumption (atomic file rename)
- Current task heartbeat updates
- System prompt loading from /configs
- A2A instructions text for system prompt injection (MCP and CLI variants)
- Brief task summary extraction (markdown-aware)
- Error message sanitization (exception classes and subprocess categories)
- Shared workspace path constants and the MCP server path resolver
"""
from __future__ import annotations
import asyncio
import json
import logging
import os
import re
import subprocess
from pathlib import Path
from typing import TYPE_CHECKING, Any
import httpx
from builtin_tools.security import _redact_secrets
if TYPE_CHECKING:
from heartbeat import HeartbeatLoop
logger = logging.getLogger(__name__)
# ========================================================================
# Constants — workspace container layout
# ========================================================================
WORKSPACE_MOUNT = "/workspace"
CONFIG_MOUNT = "/configs"
DEFAULT_MCP_SERVER_PATH = "/app/a2a_mcp_server.py"
DEFAULT_DELEGATION_RESULTS_FILE = "/tmp/delegation_results.jsonl"
PLATFORM_HTTP_TIMEOUT_S = 5.0
MEMORY_RECALL_LIMIT = 10
MEMORY_CONTENT_MAX_CHARS = 200
BRIEF_SUMMARY_MAX_LEN = 80
def get_mcp_server_path() -> str:
"""Return the path to the stdio MCP server script.
Overridable via A2A_MCP_SERVER_PATH for tests and non-default layouts.
"""
return os.environ.get("A2A_MCP_SERVER_PATH", DEFAULT_MCP_SERVER_PATH)
# ========================================================================
# HTTP client (shared, lazily initialised)
# ========================================================================
_http_client: httpx.AsyncClient | None = None
def get_http_client() -> httpx.AsyncClient:
"""Lazy-init a shared httpx client for platform API calls."""
global _http_client
if _http_client is None or _http_client.is_closed:
_http_client = httpx.AsyncClient(timeout=PLATFORM_HTTP_TIMEOUT_S)
return _http_client
def reset_http_client_for_tests() -> None:
"""Test helper — drop the shared client so the next call rebuilds it.
Not for production use. Exposed so tests can guarantee a clean slate
between cases without touching module internals.
"""
global _http_client
_http_client = None
# ========================================================================
# Memory recall + commit
# ========================================================================
async def recall_memories() -> str:
"""Recall recent memories from the platform API.
Returns a newline-joined bullet list of up to MEMORY_RECALL_LIMIT most recent
memories, or empty string when the platform is unreachable / not configured
/ returns a non-200 / returns an unexpected payload shape.
"""
workspace_id = os.environ.get("WORKSPACE_ID", "")
platform_url = os.environ.get("PLATFORM_URL", "")
if not workspace_id or not platform_url:
return ""
# Fix E (Cycle 5): send auth headers so the WorkspaceAuth middleware
# (Fix A) allows access once the workspace has a live token on file.
try:
from platform_auth import auth_headers as _platform_auth
_auth = _platform_auth()
except Exception:
_auth = {}
try:
resp = await get_http_client().get(
f"{platform_url}/workspaces/{workspace_id}/memories",
headers=_auth,
)
if not 200 <= resp.status_code < 300:
logger.debug(
"recall_memories: non-2xx response %s from platform",
resp.status_code,
)
return ""
data = resp.json()
except Exception as exc:
logger.debug("recall_memories: request failed: %s", exc)
return ""
if not isinstance(data, list) or not data:
return ""
lines = [
f"- [{m.get('scope', '?')}] {m.get('content', '')}"
for m in data[-MEMORY_RECALL_LIMIT:]
]
return "\n".join(lines)
async def commit_memory(content: str) -> None:
"""Save a memory to the platform API. Best-effort, no error propagation."""
workspace_id = os.environ.get("WORKSPACE_ID", "")
platform_url = os.environ.get("PLATFORM_URL", "")
if not workspace_id or not platform_url or not content:
return
content = _redact_secrets(content)
# Fix E (Cycle 5): include auth header so WorkspaceAuth middleware allows access.
try:
from platform_auth import auth_headers as _platform_auth
_auth = _platform_auth()
except Exception:
_auth = {}
try:
await get_http_client().post(
f"{platform_url}/workspaces/{workspace_id}/memories",
json={"content": content, "scope": "LOCAL"},
headers=_auth,
)
except Exception as exc:
logger.debug("commit_memory: request failed: %s", exc)
# ========================================================================
# Delegation results — written by heartbeat loop, consumed atomically
# ========================================================================
def read_delegation_results() -> str:
"""Read and consume delegation results written by the heartbeat loop.
Uses atomic rename to prevent races with the heartbeat writer.
Returns formatted text suitable for prompt injection, or empty string.
"""
results_file = Path(
os.environ.get("DELEGATION_RESULTS_FILE", DEFAULT_DELEGATION_RESULTS_FILE)
)
if not results_file.exists():
return ""
consumed = results_file.with_suffix(".consumed")
try:
results_file.rename(consumed)
except OSError:
return "" # File disappeared between exists() and rename()
try:
raw = consumed.read_text(encoding="utf-8", errors="replace")
except OSError:
return ""
finally:
consumed.unlink(missing_ok=True)
parts: list[str] = []
for line in raw.strip().split("\n"):
if not line.strip():
continue
try:
record = json.loads(line)
except json.JSONDecodeError:
continue
status = record.get("status", "?")
summary = record.get("summary", "")
preview = record.get("response_preview", "")
parts.append(f"- [{status}] {summary}")
if preview:
parts.append(f" Response: {preview[:200]}")
return "\n".join(parts)
# ========================================================================
# Current task heartbeat update
# ========================================================================
async def set_current_task(heartbeat: "HeartbeatLoop | None", task: str) -> None:
"""Update current task on heartbeat and push immediately via platform API.
Uses increment/decrement instead of binary 0/1 so agents can track
multiple concurrent tasks (#1408). Pushes immediately on both
increment and decrement to avoid phantom-busy (#1372).
"""
if heartbeat is not None:
if task:
heartbeat.active_tasks = getattr(heartbeat, "active_tasks", 0) + 1
heartbeat.current_task = task
else:
heartbeat.active_tasks = max(0, getattr(heartbeat, "active_tasks", 0) - 1)
if heartbeat.active_tasks == 0:
heartbeat.current_task = ""
workspace_id = os.environ.get("WORKSPACE_ID", "")
platform_url = os.environ.get("PLATFORM_URL", "")
if not (workspace_id and platform_url):
return
active = getattr(heartbeat, "active_tasks", 0) if heartbeat is not None else (1 if task else 0)
cur_task = getattr(heartbeat, "current_task", task or "") if heartbeat is not None else (task or "")
try:
try:
from platform_auth import auth_headers as _auth
_headers = _auth()
except Exception:
_headers = {}
await get_http_client().post(
f"{platform_url}/registry/heartbeat",
json={
"workspace_id": workspace_id,
"current_task": cur_task,
"active_tasks": active,
"error_rate": 0,
"sample_error": "",
"uptime_seconds": 0,
},
headers=_headers,
)
except Exception as exc:
logger.debug("set_current_task: heartbeat push failed: %s", exc)
# ========================================================================
# System prompt loading
# ========================================================================
def get_system_prompt(config_path: str, fallback: str | None = None) -> str | None:
"""Read system-prompt.md from the config dir each call (supports hot-reload).
Falls back to the provided string if the file doesn't exist.
"""
prompt_file = Path(config_path) / "system-prompt.md"
if prompt_file.exists():
return prompt_file.read_text(encoding="utf-8", errors="replace").strip()
return fallback
_A2A_INSTRUCTIONS_MCP = """## Inter-Agent Communication
You have MCP tools for communicating with other workspaces:
- list_peers: discover available peer workspaces (name, ID, status, role)
- delegate_task: send a task and WAIT for the response (for quick tasks)
- delegate_task_async: send a task and return immediately with a task_id (for long tasks)
- check_task_status: poll an async task's status and get results when done
- get_workspace_info: get your own workspace info
For quick questions, use delegate_task (synchronous).
For long-running work (building pages, running audits), use delegate_task_async + check_task_status.
Always use list_peers first to discover available workspace IDs.
Access control is enforced — you can only reach siblings and parent/children.
PROACTIVE MESSAGING: Use send_message_to_user to push messages to the user's chat at ANY time:
- Acknowledge tasks immediately: "Got it, delegating to the team now..."
- Send progress updates during long work: "Research Lead finished, waiting on Dev Lead..."
- Deliver follow-up results: "All teams reported back. Here's the synthesis: ..."
This lets you respond quickly ("I'll work on this") and come back later with results.
If delegate_task returns a DELEGATION FAILED message, do NOT forward the raw error to the user.
Instead: (1) try delegating to a different peer, (2) handle the task yourself, or
(3) tell the user which peer is unavailable and provide your own best answer."""
_A2A_INSTRUCTIONS_CLI = """## Inter-Agent Communication
You can delegate tasks to other workspaces using the a2a command:
python3 /app/a2a_cli.py peers # List available peers
python3 /app/a2a_cli.py delegate <workspace_id> <task> # Sync: wait for response
python3 /app/a2a_cli.py delegate --async <workspace_id> <task> # Async: return task_id
python3 /app/a2a_cli.py status <workspace_id> <task_id> # Check async task
python3 /app/a2a_cli.py info # Your workspace info
For quick questions, use sync delegate. For long tasks, use --async + status.
Only delegate to peers listed by the peers command (access control enforced)."""
def get_a2a_instructions(mcp: bool = True) -> str:
"""Return inter-agent communication instructions for system-prompt injection.
Pass `mcp=True` (default) for MCP-capable runtimes (Claude Code via SDK,
Codex). Pass `mcp=False` for CLI-only runtimes (Ollama, custom) that have
to call a2a_cli.py as a subprocess.
"""
return _A2A_INSTRUCTIONS_MCP if mcp else _A2A_INSTRUCTIONS_CLI
_HMA_INSTRUCTIONS = """## Hierarchical Memory (HMA)
You have persistent memory tools that survive across sessions and restarts:
- **commit_memory(content, scope)**: Save important information.
- LOCAL: private to you only (default)
- TEAM: shared with your parent workspace and siblings (same team)
- GLOBAL: shared with the entire org (only root workspaces can write)
- **recall_memory(query)**: Search your accessible memories. Returns LOCAL + TEAM + GLOBAL matches.
**When to use memory:**
- After making a decision or learning something non-obvious → commit_memory("decision X because Y", scope="TEAM")
- Before starting work → recall_memory("what did the team decide about X")
- When you discover org-wide knowledge (repo locations, API patterns, conventions) → commit_memory(fact, scope="GLOBAL") if you are a root workspace, or scope="TEAM" to share with your team
- After completing a task → commit_memory("completed task X, PR #N opened", scope="TEAM") so your lead and teammates know
**Memory is automatically recalled** at the start of each new session. Use it proactively during work to share context.
"""
def get_hma_instructions() -> str:
"""Return HMA memory instructions for system-prompt injection."""
return _HMA_INSTRUCTIONS
# ========================================================================
# Misc text helpers
# ========================================================================
_MARKDOWN_FENCE = "```"
_MARKDOWN_HR = "---"
_BRIEF_SUMMARY_MIN_LEN = 4 # 1 char + 3-char ellipsis
def brief_summary(text: str, max_len: int = BRIEF_SUMMARY_MAX_LEN) -> str:
"""Extract a one-line task summary for the canvas card display.
Strips markdown headers (#, ##, ###), bold/italic markers (**, __),
and skips code fences and horizontal rules. Returns the first meaningful
line, truncated with an ellipsis when it exceeds `max_len`.
`max_len` is clamped to at least 4 (one real character plus a 3-char
ellipsis) so degenerate callers can't produce negative slice indices.
"""
max_len = max(max_len, _BRIEF_SUMMARY_MIN_LEN)
for raw_line in text.split("\n"):
line = raw_line.strip()
while line.startswith("#"):
line = line[1:]
line = line.strip()
if not line or line.startswith(_MARKDOWN_FENCE) or line == _MARKDOWN_HR:
continue
line = line.replace("**", "").replace("__", "")
if len(line) > max_len:
return line[: max_len - 3] + "..."
return line
return text[:max_len]
def extract_message_text(message: Any) -> str:
"""Extract text from an A2A message (handles both .text and .root.text patterns)."""
parts = getattr(message, "parts", None) or []
text_parts: list[str] = []
for part in parts:
text = getattr(part, "text", None)
if text:
text_parts.append(text)
continue
root = getattr(part, "root", None)
if root is not None:
root_text = getattr(root, "text", None)
if root_text:
text_parts.append(root_text)
return " ".join(text_parts).strip()
# Word-boundary patterns for subprocess stderr classification. Using word
# boundaries avoids false positives like "author" matching "auth" or
# "generate" matching "rate".
_RATE_LIMIT_RE = re.compile(r"\brate\b|\b429\b|\boverloaded\b", re.IGNORECASE)
_AUTH_RE = re.compile(r"\bauth(?:entication|orization)?\b|\bapi[_-]?key\b", re.IGNORECASE)
_SESSION_RE = re.compile(r"\bsession\b|\bno conversation found\b", re.IGNORECASE)
def classify_subprocess_error(stderr_text: str, exit_code: int | None) -> str:
"""Map a subprocess stderr blob to a short, user-safe category tag.
The full stderr goes to the workspace logs via `logger.error`; only the
category is surfaced to the user to avoid leaking tokens, internal paths,
or stack traces in the chat UI. Used with `sanitize_agent_error` to
produce a user-facing message for subprocess failures.
"""
if _RATE_LIMIT_RE.search(stderr_text):
return "rate_limited"
if _AUTH_RE.search(stderr_text):
return "auth_failed"
if _SESSION_RE.search(stderr_text):
return "session_error"
if exit_code is not None and exit_code != 0:
return f"exit_{exit_code}"
return "subprocess_error"
def sanitize_agent_error(
exc: BaseException | None = None,
category: str | None = None,
) -> str:
"""Render an agent-side failure into a user-safe error message.
Either pass an exception (class name is used as the tag) or an explicit
category string (e.g. from `classify_subprocess_error`). If both are
given, `category` wins. If neither, the tag defaults to "unknown".
The message body is deliberately dropped — exception messages and
subprocess stderr frequently leak stack traces, paths, tokens, and
API keys. Full detail is available in the workspace logs via
`logger.exception()` / `logger.error()`.
"""
if category:
tag = category
elif exc is not None:
tag = type(exc).__name__
else:
tag = "unknown"
return f"Agent error ({tag}) — see workspace logs for details."
# ========================================================================
# Auto-push hook — push unpushed commits and open PR after task completion
# ========================================================================
# Git/gh wrappers at /usr/local/bin have GH_TOKEN baked in.
_GIT = "/usr/local/bin/git"
_GH = "/usr/local/bin/gh"
_PROTECTED_BRANCHES = frozenset({"staging", "main", "master"})
def _run_git(args: list[str], cwd: str, timeout: int = 30) -> subprocess.CompletedProcess:
"""Run a git/gh command with bounded timeout. Never raises on failure."""
return subprocess.run(
args,
cwd=cwd,
capture_output=True,
text=True,
timeout=timeout,
)
def _auto_push_and_pr_sync(cwd: str) -> None:
"""Synchronous implementation of the auto-push hook.
1. Check if we're in a git repo with unpushed commits on a feature branch.
2. Push the branch.
3. Open a PR against staging if one doesn't already exist.
Designed to be called from a background thread — never raises, logs all
errors. Uses the git/gh wrappers at /usr/local/bin/ which have GH_TOKEN
baked in.
"""
try:
# --- Guard: is this a git repo? ---
probe = _run_git([_GIT, "rev-parse", "--is-inside-work-tree"], cwd)
if probe.returncode != 0:
return
# --- Guard: get current branch ---
branch_result = _run_git(
[_GIT, "rev-parse", "--abbrev-ref", "HEAD"], cwd
)
if branch_result.returncode != 0:
return
branch = branch_result.stdout.strip()
if not branch or branch in _PROTECTED_BRANCHES or branch == "HEAD":
return
# --- Guard: any unpushed commits? ---
log_result = _run_git(
[_GIT, "log", "origin/staging..HEAD", "--oneline"], cwd
)
if log_result.returncode != 0 or not log_result.stdout.strip():
# No unpushed commits (or origin/staging doesn't exist).
return
unpushed_lines = log_result.stdout.strip().splitlines()
logger.info(
"auto-push: %d unpushed commit(s) on branch '%s', pushing...",
len(unpushed_lines),
branch,
)
# --- Push ---
push_result = _run_git(
[_GIT, "push", "origin", branch], cwd, timeout=60
)
if push_result.returncode != 0:
logger.warning(
"auto-push: git push failed (exit %d): %s",
push_result.returncode,
(push_result.stderr or push_result.stdout)[:500],
)
return
logger.info("auto-push: pushed branch '%s' successfully", branch)
# --- Check if PR already exists ---
pr_list = _run_git(
[_GH, "pr", "list", "--head", branch, "--json", "number"], cwd
)
if pr_list.returncode != 0:
logger.warning(
"auto-push: gh pr list failed (exit %d): %s",
pr_list.returncode,
(pr_list.stderr or pr_list.stdout)[:500],
)
return
existing_prs = json.loads(pr_list.stdout.strip() or "[]")
if existing_prs:
logger.info(
"auto-push: PR already exists for branch '%s' (#%s), skipping create",
branch,
existing_prs[0].get("number", "?"),
)
return
# --- Get first commit message for PR title ---
first_commit = _run_git(
[_GIT, "log", "origin/staging..HEAD", "--reverse",
"--format=%s", "-1"],
cwd,
)
pr_title = first_commit.stdout.strip() if first_commit.returncode == 0 else branch
# Truncate to 256 chars (GitHub limit)
if len(pr_title) > 256:
pr_title = pr_title[:253] + "..."
# --- Create PR ---
pr_create = _run_git(
[
_GH, "pr", "create",
"--base", "staging",
"--title", pr_title,
"--body", "Auto-created by workspace agent",
],
cwd,
timeout=60,
)
if pr_create.returncode != 0:
logger.warning(
"auto-push: gh pr create failed (exit %d): %s",
pr_create.returncode,
(pr_create.stderr or pr_create.stdout)[:500],
)
else:
pr_url = pr_create.stdout.strip()
logger.info("auto-push: created PR %s", pr_url)
except subprocess.TimeoutExpired:
logger.warning("auto-push: command timed out, skipping")
except Exception:
logger.exception("auto-push: unexpected error (non-fatal)")
async def auto_push_hook(cwd: str | None = None) -> None:
"""Post-execution hook: push unpushed commits and open a PR.
Runs the git/gh subprocess work in a background thread via
asyncio.to_thread so it never blocks the agent's event loop.
Catches all exceptions — the agent must never crash due to this hook.
"""
if cwd is None:
cwd = WORKSPACE_MOUNT
if not os.path.isdir(cwd):
return
try:
await asyncio.to_thread(_auto_push_and_pr_sync, cwd)
except Exception:
logger.exception("auto_push_hook: failed (non-fatal)")