1c38c78f5e
48 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
09bfd9bdce |
fix(tests): hoist _executor_mod alias so async wedge tests pass under --cov
The Copilot Auto-fix in
|
||
|
|
5a8f42b405
|
Potential fix for pull request finding 'Module is imported with 'import' and 'import from''
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com> |
||
|
|
d0f198b24f |
merge: resolve staging conflicts (a2a_proxy + workspace_crud)
Three files conflicted with staging changes that landed while this PR sat open. Resolved each by combining both intents (not picking one side): - a2a_proxy.go: keep the branch's idle-timeout signature (workspaceID parameter + comment) AND apply staging's #1483 SSRF defense-in-depth check at the top of dispatchA2A. Type-assert h.broadcaster (now an EventEmitter interface per staging) back to *Broadcaster for applyIdleTimeout's SubscribeSSE call; falls through to no-op when the assertion fails (test-mock case). - a2a_proxy_test.go: keep both new test suites — branch's TestApplyIdleTimeout_* (3 cases for the idle-timeout helper) AND staging's TestDispatchA2A_RejectsUnsafeURL (#1483 regression). Updated the staging test's dispatchA2A call to pass the workspaceID arg introduced by the branch's signature change. - workspace_crud.go: combine both Delete-cleanup intents: * Branch's cleanupCtx detachment (WithoutCancel + 30s) so canvas hang-up doesn't cancel mid-Docker-call (the container-leak fix) * Branch's stopAndRemove helper that skips RemoveVolume when Stop fails (orphan sweeper handles) * Staging's #1843 stopErrs aggregation so Stop failures bubble up as 500 to the client (the EC2 orphan-instance prevention) Both concerns satisfied: cleanup runs to completion past canvas hangup AND failed Stop calls surface to caller. Build clean, all platform tests pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) |
||
|
|
fc2720c1fe |
fix(git-token-helper): close TOCTOU window + stop swallowing chmod errors (closes #1552)
The token-cache helper had three #1552 findings, all in the mode-600-after-the-fact pattern: 1. _write_cache writes .tmp with default umask (typically 022 → 644 on disk) and then chmod 600's after the mv. A concurrent reader in that microsecond-wide window sees the token at mode 644. 2. Each chmod was swallowed via `|| true` — if it ever fails, the tokens stay world-readable with no operator signal. 3. _refresh_gh's gh_token_file write has the same shape and same two issues. Hardening: - Wrap the .tmp creates in a `umask 077` block so the files are 600 from creation. Restore the previous umask before return so callers aren't perturbed. - Replace `chmod ... 2>/dev/null || true` with `if ! chmod ...; then echo WARN ...; fi`. A chmod failure is a real signal worth grep'ing. - Apply the same pattern to the _refresh_gh gh_token_file path. `local` is illegal in a top-level case branch, so use a uniquely- named global (_gh_prev_umask) and unset it after. Verified `bash -n` clean and `shellcheck` clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5d294936b3 |
fixup: lower coverage floor 92→86 to match post-omit measurement
The 97% number from CI run 24956647701 was measured WITHOUT a .coveragerc omit list. Once this PR's prescribed omit set is in effect (`*/__init__.py`, `*/tests/*`, `plugins_registry/*` — files that don't carry behavior), the actual measurement of behavior-bearing code on the same staging snapshot is 91.11% (run 24957664272). 86% sits at the issue's prescribed `current − 5pp` margin and unblocks CI without lowering the bar in real terms. |
||
|
|
355355a80a |
test(workspace): centralize pytest-cov config + 92% floor (closes #1817)
The Python workspace already runs pytest-cov in CI but with no threshold and inline-flagged config. CI run 24956647701 (2026-04-26 staging) reports 97% coverage on the package — well above the issue's 75% target. The actionable gap is locking in a floor so a regression can't sneak past, and centralizing config so local `pytest` matches CI. Changes: - workspace/pytest.ini — coverage flags moved into addopts (-q, --cov=., --cov-report=term-missing, --cov-fail-under=92). 92% = current 97% measurement minus the 5pp safety margin the issue's Step 3 prescribes. - workspace/.coveragerc (new) — [run] omit list and [report] skip_covered. coverage.py doesn't read pytest.ini sections, so the omit config has to live here. - .github/workflows/ci.yml — removed the inline --cov flags from the Python Lint & Test step; now reads from pytest.ini. Workflow stays the same single-command shape, just simpler. Result: any PR that drops coverage below 92% fails CI loudly. Floor ratchets up by replacing 92 with current measurement on a future test-writing pass — same shape as Go coverage gates landed elsewhere. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
76d0f8d004 |
fix(a2a): document the metadata-attach except-pass in a2a_executor (closes #1787)
GitHub Code Quality bot flagged the empty `except (AttributeError,
TypeError): pass` at workspace/a2a_executor.py:424 as a nit on PR #1783.
The suppression IS intentional — `new_agent_text_message()` returns
a plain string in MagicMock paths in tests where assignment to
`.metadata` raises despite hasattr being true.
This:
- Adds a why-comment citing the test-mock motivation, commit
|
||
|
|
4a4a740804 |
refactor(test_config): parametrize the 3 yaml-default cases (simplify on #2085)
Collapses test_compliance_default_when_yaml_omits_block, _when_yaml_block_is_empty, _explicit_optout_still_works into one parametrized test_compliance_default_via_load_config with three ids (yaml_omits_block, yaml_block_empty, yaml_explicit_optout). The dataclass-default test stays separate (no tmp_path needed). Coverage and assertions identical; net -19 lines, same 4 logical cases. prompt_injection check moves out of per-case to a single tail-assert since no payload overrode it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
577294b8f4 |
test(config): lock ComplianceConfig default to owasp_agentic (#2059)
PR #2056 flipped ComplianceConfig.mode default from "" to "owasp_agentic" so every shipped template gets prompt-injection detection + PII redaction by default. The flip is correct + already shipping, but no test asserts the new default — a silent revert (or a refactor that reintroduces the old "" default) would pass workspace/tests/ and ship a workspace with compliance silently off. Add 4 regression tests: - test_compliance_dataclass_default — ComplianceConfig() with no args returns mode='owasp_agentic' + prompt_injection='detect' - test_compliance_default_when_yaml_omits_block — load_config on a yaml without `compliance:` key still produces owasp_agentic - test_compliance_default_when_yaml_block_is_empty — load_config on `compliance: {}` (a common shape during template editing) still produces owasp_agentic; covers the load_config() `.get("mode", "owasp_agentic")` default-fill path - test_compliance_explicit_optout_still_works — `mode: ""` in yaml must disable compliance (the documented opt-out path) 23/23 tests pass locally (4 new + 19 existing). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
2ee4b67cab |
chore: third-pass review polish — empty-stream gate test + Callable type
Pass 3 review came back Approve with two optional polish items.
Both taken to fully converge the loop:
1. Regression test for the empty-stream wedge-clear gate (added in
|
||
|
|
3c4eef49aa |
chore: second-pass review polish — symmetry + clearer test fixtures
Round-2 review of the wedge/idle/progress bundle came back Approve
with 4 optional polish items. All taken:
1. Migration 043 down file gained `SET LOCAL lock_timeout = '5s'`
matching the up file. A rollback under the same load that
motivated the up-file guard would otherwise stall writers.
2. _clear_sdk_wedge_on_success now gates on actual stream content
(result_text or assistant_chunks). A degenerate "iterator
returned without raising but emitted nothing" case (possible
from a partial stream or stub SDK) no longer falsely advertises
recovery — only a real successful query (≥1 ResultMessage or
AssistantMessage TextBlock) clears the wedge.
3. isUpstreamBusyError dropped the redundant
`strings.Contains(msg, "context deadline exceeded")` fallback.
*url.Error.Unwrap propagates the typed sentinel since Go 1.13;
errors.Is(err, context.DeadlineExceeded) catches the real
net/http shape. The substring was a foot-gun (would also match
user-content with that phrase). Test fixture updated to use
`fmt.Errorf("Post: %w", context.DeadlineExceeded)` which
reflects what net/http actually returns.
4. TestIsUpstreamBusyError added a context.Canceled case (both
typed and wrapped via %w) — pins the new applyIdleTimeout
classification.
No critical/required findings on second pass; reviewer verdict was
Approve. Items above are polish for symmetry and test clarity.
1010 canvas + 64 Python + full Go suites pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
892de784b3 |
fix: review-driven hardening of wedge detector + idle timeout + progress feed
Bundle review of pieces 1/2/3 surfaced two critical issues plus a handful of required + optional fixes. All addressed. Critical: 1. Migration 043 was missing 'paused' and 'hibernated' from the workspace_status enum. Both are real production statuses written by workspace_restart.go (lines 283 and 406), introduced by migration 029_workspace_hibernation. The original `USING status::workspace_status` cast would have errored mid-transaction on any production DB containing those values. Added both. Also added `SET LOCAL lock_timeout = '5s'` so the migration aborts instead of stalling the workspace fleet behind a slow SELECT. 2. The chat activity-feed window kept only 8 lines, and a single multi-tool turn (Read 5 files + Grep + Bash + Edit + delegate) easily flushed older context before the user could read it. Extracted appendActivityLine to chat/activityLog.ts with a 20-line window AND consecutive-duplicate collapse (same tool on the same target twice in a row is noise, not new progress). 5 unit tests pin the behavior. Required: 3. The SDK wedge flag was sticky-only — a single transient Control-request-timeout from a flaky network blip locked the workspace into degraded for the whole process lifetime, even when the next query() would have succeeded. Added _clear_sdk_wedge_on_success(), called from _run_query's success path. The next heartbeat after a working query reports runtime_state empty and the platform recovers the workspace to online without a manual restart. New regression test. 4. _report_tool_use now sets target_id = WORKSPACE_ID for self- actions, matching the convention other self-logged activity rows use. DB consumers joining on target_id see a well-defined value instead of NULL. Optional taken: 5. Tightened _WEDGE_ERROR_PATTERNS from "control request timeout" to "control request timeout: initialize" — suffix-anchored so a future SDK error on an in-flight tool-call control message doesn't get misclassified as the unrecoverable post-init wedge. 6. Dropped the redundant "context canceled" substring fallback in isUpstreamBusyError. errors.Is(err, context.Canceled) is the typed check; the substring would also match healthy client-side aborts, which we don't want classified as upstream-busy. Verified: 1010 canvas tests + 64 Python tests + full Go suite pass; migration applies cleanly on dev DB with all 8 enum values; reverse migration restores TEXT. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
166c7f77af |
feat(chat): stream per-tool progress into MyChat live feed
Two halves of the same UX win — the user wants to see what Claude is
doing while a chat reply is in flight instead of staring at "0s" for
minutes.
Workspace side (claude_sdk_executor.py):
- The executor's _run_query message loop already iterated the SDK
stream for AssistantMessage.TextBlock content. Now also detects
ToolUseBlock / ServerToolUseBlock entries (by class name, since
the conftest stub doesn't define them) and fires-and-forgets a
POST /workspaces/:id/activity row of type agent_log per tool use.
- _summarize_tool_use maps the common tools (Read, Write, Edit,
Bash, Glob, Grep, WebFetch, WebSearch, Task, TodoWrite) to a
one-line summary with the file path / pattern / command, falling
back to "🛠 <tool>(…)" for anything else. Truncated at 200 chars.
- Posts directly to /workspaces/:id/activity rather than going
through a2a_tools.report_activity, which would also push a
/registry/heartbeat current_task and double-log as a TASK_UPDATED
line in the same chat feed.
- All failures swallowed silently — telemetry must not break
the conversation.
Canvas side (ChatTab.tsx):
- The existing ACTIVITY_LOGGED handler streams a2a_send /
a2a_receive / task_update events into a sliding-window
activityLog state. Two issues fixed:
1. No `msg.workspace_id === workspaceId` filter — a sibling
workspace's a2a_send was leaking into the wrong chat
panel as "→ Delegating to X...". Added an early return.
2. No agent_log render branch. Added one that renders the
summary verbatim (the workspace already prefixed its
own emoji icon, so no double-icon).
- Existing 8-line sliding window keeps the UI scoped; older
progress lines naturally roll off as new ones arrive.
Result: when DD is delegating to Visual Designer + reading
config files + running Bash to lint, the spinner area shows:
📄 Read /configs/system-prompt.md
⚡ Bash: pnpm test
→ Delegating to Visual Designer...
← Visual Designer responded (47s)
instead of bare "0s · Processing with Claude Code..." for minutes.
63 Python tests + 58 canvas chat tests pass; tsc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4eb09e2146 |
feat(platform,workspace): SDK-wedge detection + workspace_status ENUM
Heartbeat lies. The asyncio task that POSTs /registry/heartbeat lives in its own process slot, so a workspace whose claude_agent_sdk has wedged on `Control request timeout: initialize` keeps reporting "online" — every chat send hangs the full 5-min platform deadline even though the runtime is dead in the water. This commit teaches the workspace to admit it's wedged and the platform to honor that admission by flipping status → degraded. Five layers, all in one commit because they share a contract: 1. Migration 043 — convert workspaces.status from free-form TEXT to a real `workspace_status` Postgres ENUM with the 6 values production code actually writes (provisioning, online, offline, degraded, failed, removed). Locks the value set; future typo writes error at the DB instead of silently storing rogue strings. Down migration reverts to TEXT and drops the type. 2. workspace-server/internal/models — `HeartbeatPayload` gains a `runtime_state string` field. Empty = healthy. Currently the only non-empty value the handler honors is "wedged"; future symptoms can extend without another migration. 3. workspace-server/internal/handlers/registry.go — `evaluateStatus` gains a wedge branch BEFORE the existing error_rate >= 0.5 path: if `RuntimeState=="wedged"` and currently online, flip to degraded and broadcast WORKSPACE_DEGRADED with the wedge sample error. Recovery (`degraded → online`) now requires BOTH error_rate < 0.1 AND runtime_state cleared, so a workspace still reporting wedged stays degraded even when its error count happens to be 0 (the wedge captures a runtime state, not an error count). 4. workspace/claude_sdk_executor.py — module-level `_sdk_wedged_reason` flag set when execute()'s catch block sees an error matching `_WEDGE_ERROR_PATTERNS` (currently just "control request timeout"). Sticky for the process lifetime; the SDK's internal client-process state is corrupted on this error and only a workspace restart (= new Python process = fresh module state) clears it. Helpers `is_wedged()` / `wedge_reason()` / `_reset_sdk_wedge_for_test()` exposed. 5. workspace/heartbeat.py — heartbeat body now layers on `_runtime_state_payload()` for both the happy path and the 401-retry path. Lazy-imports claude_sdk_executor so non-Claude runtimes (where the module may not even be importable) keep working unchanged. Canvas required no changes — `STATUS_CONFIG.degraded` was already defined in design-tokens.ts (amber dot, "Degraded" label) and WorkspaceNode.tsx already renders `lastSampleError` underneath the status pill when status === "degraded". The existing wiring just never fired because nothing was writing degraded in this code path. Tests: - 3 Go handler tests for the new transitions (online → degraded on wedged, degraded stays put while still wedged, degraded → online after wedge clears) - 5 Python wedge-detector tests (default clean, mark sets flag, sticky-first-wins, execute() flips on Control request timeout, execute() does NOT flip on unrelated errors) - Migration smoke-tested against the local dev DB (3 existing rows, all enum-compatible; migration applied cleanly, post-state has the column as workspace_status type and the index preserved) Verified: 79 Python tests pass; full Go test suite passes; migration applies clean on a real DB; reverse migration restores the column to TEXT. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c159d85eb5 |
fix(a2a): review-driven hardening — prefix-anchored type check, error_detail cap, shared hint module
Three required fixes from the bundle review of
|
||
|
|
391e187281 |
fix(a2a,canvas): make delivery failures comprehensive instead of "[A2A_ERROR] "
Symptom: Activity tab and Agent Comms surfaced bare "[A2A_ERROR] "
(prefix + nothing) for failed delegations. Operator had no signal
to act on — no exception type, no target, no hint about what went
wrong, no next step. Fix is in three layers.
1. workspace/a2a_client.py — every error path now produces an
actionable detail string:
- except branch: some httpx exceptions (RemoteProtocolError,
ConnectionReset variants) stringify to "". Pre-fix the catch
was `f"{_A2A_ERROR_PREFIX}{e}"` → bare prefix. Now falls back
to `<TypeName> (no message — likely connection reset or silent
timeout)` and always appends `[target=<url>]` for traceability
in chained delegations.
- JSON-RPC error branch: previously dropped error.code on the
floor and printed "unknown" when message was missing. Now
surfaces both, including the well-defined "JSON-RPC error
with no message (code=N)" path.
- "neither result nor error" branch: pre-fix returned
str(payload) which the canvas rendered as a successful
response block. Now tagged as A2A_ERROR with a payload
snippet so downstream UI routes through the error path.
2. workspace/a2a_tools.py — tool_delegate_task now passes
error_detail (the stripped error message) through to the
activity-log POST. The platform's activity_logs.error_detail
column is the canvas's red error chip source; populating it
makes the failure visible in the row header without the user
having to expand into raw response_body JSON. The summary line
also gets a 120-char prefix of the cause so the collapsed row
reads "React Engineer failed: ConnectionResetError: ... [target=...]"
instead of "React Engineer failed".
3. canvas/src/components/tabs/ActivityTab.tsx — MessagePreview
now detects [A2A_ERROR]-prefixed bodies and renders a
structured error block (red chip, stripped detail, cause hint)
instead of the previous gray text-block that showed the literal
"[A2A_ERROR]" string. inferA2AErrorHint mirrors the patterns
from AgentCommsPanel.inferCauseHint so the same symptom reads
the same way in both surfaces (Claude SDK init wedge → restart
workspace; timeout → busy/stuck; connection-reset → transient
blip then check logs).
Tests: 9 send_a2a_message tests pass (including a new regression
test for the empty-stringifying-exception case that the user
reported); 995 canvas tests pass; tsc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
65b531acf6 |
fix(workspace): tag self-originated A2A POSTs with X-Workspace-ID
Workspace runtime fired four classes of A2A request to the platform
without the X-Workspace-ID header that identifies the source
workspace: heartbeat self-messages, initial_prompt, idle-loop fires,
and peer-to-peer A2A from runtime tools. The platform's a2a_receive
logger keys source_id off that header — without it, every such row
was written with source_id=NULL, which the canvas's My Chat tab
filters as ?source=canvas (i.e. "user typed this") and rendered the
internal triggers as if the human user had sent them. The
"Delegation results are ready..." heartbeat trigger was visible to
end users in the chat history; delegate_task A2A calls between agents
were misclassified the same way.
Centralise the header construction in a new platform_auth helper
self_source_headers(workspace_id) that returns auth_headers() PLUS
{X-Workspace-ID: <id>}. Apply it to:
- heartbeat.py self-message (refactored from inline header dict)
- main.py initial_prompt POST
- main.py idle_prompt POST
- a2a_client.py send_a2a_message (peer A2A from runtime)
- builtin_tools/a2a_tools.py delegate_task (was missing ALL headers)
Tests:
- test_heartbeat.py asserts the X-Workspace-ID header is set on
the self-message POST.
- test_a2a_tools_module.py asserts the same on delegate_task POSTs;
FakeClient.post mocks updated to accept the headers kwarg.
Production effect lands the moment workspace containers are rebuilt
with this code; existing rows in activity_logs keep their NULL
source_id (legacy data). The canvas-side filter (#follow-up)
covers the historical-rows case until backfill.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a34121d451 |
fix(a2a_executor): remove shadowing local Part import that broke streaming
Python scoping rule: any name assigned anywhere in a function body is local for the entire body. The outbound-files block at ~L442 had `from a2a.types import ... Part ...`, which made `Part` a local name throughout the execute() function. The astream_events loop at L358 — which runs BEFORE that import — then raised: UnboundLocalError: cannot access local variable 'Part' where it is not associated with a value Every streaming A2A reply died with "Agent error: cannot access local variable 'Part' where it is not associated with a value" instead of the actual agent text. 5 tests caught it: - test_streaming_plain_string_content - test_streaming_anthropic_content_blocks - test_non_stream_events_ignored - test_core_execute_on_chat_model_end_captures_last_ai_message - test_core_execute_pii_redaction_when_pii_found Fix: drop `Part` from the function-scope import (it is already imported at module level on line 42) and leave a comment pinning the rationale so a future refactor doesn't re-introduce the shadow. All 43 test_a2a_executor tests pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
425df5e5a9 |
merge(staging): resolve conflicts + fix 7 test regressions on top of #2061
- Merge origin/staging into fix/canvas-multilevel-layout-ux. 18 files
auto-merged (mostly canvas/tabs/chat and workspace-server handlers
the earlier DIRTY marker was stale relative to current staging).
- Fix 7 test failures surfaced by the merge:
1. Canvas.pan-to-node.test.tsx — mockGetIntersectingNodes was
inferred as vi.fn(() => never[]); mockReturnValueOnce of a node
object failed type check. Explicit return-type annotation.
2. Canvas.pan-to-node.test.tsx + Canvas.a11y.test.tsx — Canvas.tsx
reads deletingIds.size (new multilevel-layout state). Both mock
stores lacked deletingIds; added new Set<string>() to each.
3. canvas-batch-partial-failure.test.ts — makeWS() built a wire-
format WorkspaceData (snake_case, with x/y/uptime_seconds). The
store's node.data is now WorkspaceNodeData (camelCase, no wire-
only fields). Rewrote makeWS to produce WorkspaceNodeData and
updated 5 call-site casts. No assertions changed.
4. ConfigTab.hermes.test.tsx — two tests pinned pre-#2061 behavior
that the PR intentionally inverts:
a. "shows hermes-specific info banner" — RUNTIMES_WITH_OWN_CONFIG
now contains only {"external"}, so the banner is no longer
shown for hermes. Inverted assertion: now pins ABSENCE of
the banner, with a comment noting the inversion.
b. "config.yaml runtime wins over DB" — priority reversed:
DB is now authoritative so the tier-on-node badge matches
the form. Inverted scenario: DB=hermes + yaml=crewai →
form shows hermes. Switched test's DB runtime off langgraph
because the dropdown collapses langgraph into an empty-
valued "default" option that would hide the win signal.
- No production code changed — this commit is staging merge + test
realignment only. 953/953 canvas tests pass. tsc --noEmit clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
94d9331c76 |
feat(canvas+platform): chat attachments, model selection, deploy/delete UX
Session's accumulated UX work across frontend and platform. Reviewable in four logical sections — diff is large but internally cohesive (each section fixes a gap the next one depends on). ## Chat attachments — user ↔ agent file round trip - New POST /workspaces/:id/chat/uploads (multipart, 50 MB total / 25 MB per file, UUID-prefixed storage under /workspace/.molecule/chat-uploads/). - New GET /workspaces/:id/chat/download with RFC 6266 filename escaping and binary-safe io.CopyN streaming. - Canvas: drag-and-drop onto chat pane, pending-file pills, per-message attachment chips with fetch+blob download (anchor navigation can't carry auth headers). - A2A flow carries FileParts end-to-end; hermes template executor now consumes attachments via platform helpers. ## Platform attachment helpers (workspace/executor_helpers.py) Every runtime's executor routes through the same helpers so future runtimes inherit attachment awareness for free: - extract_attached_files — resolve workspace:/file:///bare URIs, reject traversal, skip non-existent. - build_user_content_with_files — manifest for non-image files, multi-modal list (text + image_url) for images. Respects MOLECULE_DISABLE_IMAGE_INLINING for providers whose vision adapter hangs on base64 payloads (MiniMax M2.7). - collect_outbound_files — scans agent reply for /workspace/... paths, stages each into chat-uploads/ (download endpoint whitelist), emits as FileParts in the A2A response. - ensure_workspace_writable — called at molecule-runtime startup so non-root agents can write /workspace without each template having to chmod in its Dockerfile. Hermes template executor + langgraph (a2a_executor.py) + claude-code (claude_sdk_executor.py) all adopt the helpers. ## Model selection & related platform fixes - PUT /workspaces/:id/model — was 404'ing, so canvas "Save" silently lost the model choice. Stores into workspace_secrets (MODEL_PROVIDER), auto-restarts via RestartByID. - applyRuntimeModelEnv falls back to envVars["MODEL_PROVIDER"] so Restart propagates the stored model to HERMES_DEFAULT_MODEL without needing the caller to rehydrate payload.Model. - ConfigTab Tier dropdown now reads from workspaces row, not the (stale) config.yaml — fixes "badge shows T3, form shows T2". ## ChatTab & WebSocket UX fixes - Send button no longer locks after a dropped TASK_COMPLETE — `sending` no longer initializes from data.currentTask. - A2A POST timeout 15 s → 120 s. LLM turns routinely exceed 15 s; the previous default aborted fetches while the server was still replying, producing "agent may be unreachable" on success. - socket.ts: disposed flag + reconnectTimer cancellation + handler detachment fix zombie-WebSocket in React StrictMode. - Hermes Config tab: RUNTIMES_WITH_OWN_CONFIG drops 'hermes' — the adaptor's purpose IS the form, banner was contradictory. - workspace_provision.go auto-recovery: try <runtime>-default AND bare <runtime> for template path (hermes lives at the bare name). ## Org deploy/delete animation (theme-ready CSS) - styles/theme-tokens.css — design tokens (durations, easings, colors). Light theme overrides by setting only the deltas. - styles/org-deploy.css — animation classes + keyframes, every value references a token. prefers-reduced-motion respected. - Canvas projects node.draggable=false onto locked workspaces (deploying children AND actively-deleting ids) — RF's authoritative drag lock; useDragHandlers retains a belt-and- braces check. - Organ cancel button (red pulse pill on root during deploy) cascades via existing DELETE /workspaces/:id?confirm=true. - Auto fit-view after each arrival, debounced 500 ms so rapid sibling arrivals coalesce into one fit (previous per-event fit made the viewport lurch continuously). - Auto-fit respects user-pan — onMoveEnd stamps a user-pan timestamp only when event !== null (ignores programmatic fitView) so auto-fits don't self-cancel. - deletingIds store slice + useOrgDeployState merge gives the delete flow the same dim + non-draggable treatment as deploy. - Platform-level classNames.ts shared by canvas-events + useCanvasViewport (DRY'd 3 copies of split/filter/join). ## Server payload change - org_import.go WORKSPACE_PROVISIONING broadcast now includes parent_id + parent-RELATIVE x/y (slotX/slotY) so the canvas renders the child at the right parent-nested slot without doing any absolute-position walk. createWorkspaceTree signature gains relX, relY alongside absX, absY; both call sites updated. ## Tests - workspace/tests/test_executor_helpers.py — 11 new cases covering URI resolution (including traversal rejection), attached-file extraction (both Part shapes), manifest-only vs multi-modal content, large-image skip, outbound staging, dedup, and ensure_workspace_writable (chmod 777 + non-root tolerance). - workspace-server chat_files_test.go — upload validation, Content-Disposition escaping, filename sanitisation. - workspace-server secrets_test.go — SetModel upsert, empty clears, invalid UUID rejection. - tests/e2e/test_chat_attachments_e2e.sh — round-trip against a live hermes workspace. - tests/e2e/test_chat_attachments_multiruntime_e2e.sh — static plumbing check + round-trip across hermes/langgraph/claude-code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9af058b82d |
fix(compliance): flip default mode to owasp_agentic (detect-only)
Prior state: compliance.mode default was "" (fully off) and no template in the repo set it explicitly — so prompt-injection detection, PII redaction, and agency-limit checks were silently disabled on every live workspace, despite the machinery being present in workspace/builtin_tools/compliance.py. This was surfaced during a 2026-04-24 review of the A2A inbound path: a2a_executor.py gates three security checks on _compliance_cfg.mode == "owasp_agentic" and default config never matches, so every A2A message skipped all three. Fix: default is now owasp_agentic + prompt_injection=detect. Detect mode logs injection attempts as audit events without blocking — no UX cost, just visibility. Operators who want stricter enforcement set `prompt_injection: block` per workspace. Operators who genuinely want compliance fully off can set `mode: ""` (not recommended; documented). Changes: - ComplianceConfig.mode default: "" → "owasp_agentic" - Yaml parser fallback default: "" → "owasp_agentic" (must match dataclass) - Docstring updated with rationale + opt-out snippet Tests: 66/66 test_compliance.py + test_a2a_executor.py pass. 19/19 test_config.py pass. The one test asserting compliance_mode == "" is for the "config load failed" fallback path (different from the default config path) — correctly unchanged. Security posture improvement: prompt-injection detection is now always on for every workspace created after this ships, with zero behavior change for legitimate inputs. Block mode remains an opt-in when an operator wants to actively reject injection attempts rather than just log them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
6f24cc0961 |
fix(executors): move set_current_task inside try so active_tasks always decrements (#2026)
If asyncio.CancelledError arrived during the heartbeat HTTP push inside set_current_task() (the increment call), the code raised before entering the try/finally block in _execute_locked. The finally block never ran, so active_tasks stayed at 1 forever. Every subsequent heartbeat reported active_tasks=1, the server saw active_tasks < max_concurrent_tasks as false (1 < 1), and DrainQueueForWorkspace never fired. Queued A2A requests were permanently stuck. Fix: move set_current_task(increment) to be the FIRST statement inside the try block, not before it. set_current_task's synchronous portion (heartbeat.active_tasks mutation) still runs unconditionally; only the optional HTTP push can be cancelled. The finally block now always runs and always decrements active_tasks back to 0. Affected executors: claude_sdk_executor, cli_executor, a2a_executor. hermes_executor is not affected (does not call set_current_task). Root cause of today's "active_tasks: 1 + queue drain never triggers" P1 pattern across three workspaces. All 167 executor tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
7b662d2494 |
ci(gh-wrapper): translate --assignee @me → --label team:<role>
Fixes #1957. All agents share one PAT, so `gh issue create --assignee @me` resolves to the CEO. Today's "6 issues @me for 7 cycles" defect signal turned out to be CEO-load misclassified as team-stagnation. Translation rules: - `--assignee @me` → `--label team:<role-slug>` - `--reviewer @me` → dropped (review-bot scans labels, not requests) - `--assignee user` (real user) → unchanged role-slug derived from GIT_AUTHOR_NAME ("Molecule AI Core-BE" → "core-be"). The wrapper already handled the title-prefix + body-footer transforms; these are just two more cases in the existing arg-walk loop. Backward compat: any agent prompt that doesn't use @me passes through unchanged. Agents don't need prompt updates — the wrapper is transparent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
35bcad9204
|
feat(workspace): migrate a2a-sdk from 0.3.x to 1.0.0 (KI-009) (#1974)
* feat(workspace): migrate a2a-sdk from 0.3.x to 1.0.0 (KI-009) Migrates all workspace code from a2a-sdk v0.3.x to v1.0.0, following the official migration guide from a2aproject/a2a-python. Breaking changes applied: - A2AStarletteApplication → Starlette route factory (create_agent_card_routes + create_jsonrpc_routes) - AgentCard.url removed; url+protocol now in supported_protocols[].url - AgentCapabilities fields renamed to snake_case (pushNotifications→push_notifications, stateTransitionHistory→state_transition_history) - AgentCard.defaultInputModes/outputModes → default_input_modes/output_modes - TaskState.canceled → TaskState.TASK_STATE_CANCELED - a2a.utils → a2a.helpers - Part(root=TextPart(text=t)) → Part(text=t) (TextPart removed) Files changed: - requirements.txt: pinned >=1.0.0,<2.0 - main.py: Starlette route factory + AgentCard restructure - a2a_executor.py: Part() + TaskState + helpers import - hermes_executor.py: TaskState + helpers import - google-adk/adapter.py: TaskState + helpers import - cli_executor.py: helpers import - claude_sdk_executor.py: helpers import - tests/conftest.py: a2a.helpers mock stub - tests/test_a2a_executor.py: TaskState enum key - adapters/google-adk/test_adapter.py: Part + helpers stub Refs: KI-009 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): update _TaskState mock to a2a-sdk v1 enum name (TASK_STATE_CANCELED) --------- Co-authored-by: Molecule AI Tech Researcher <tech-researcher@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> |
||
| 61c5f8ad9a |
feat(plugin): implement MCPServerAdaptor (issue #847)
Rule-of-three threshold met: 4 plugin proposals (molecule-firecrawl #512, molecule-github-mcp #520, molecule-browser-use #553, mcp-connector #573) all independently shipped the same mcpServers-adapter pattern. Adds MCPServerAdaptor to builtins.py — plugins wrapping an MCP server now declare `from plugins_registry.builtins import MCPServerAdaptor as Adaptor` in their per-runtime adapter file. The adaptor: - Merges mcpServers from settings-fragment.json into <configs>/.claude/settings.json (deep-merge so multiple plugins' servers coexist). - Optionally ships skills/rules/setup.sh via AgentskillsAdaptor delegation. - On uninstall: removes skills/rules but intentionally leaves mcpServers entries in settings.json (users may share configs with other tools or have manually curated entries). Also fixes _deep_merge_hooks: non-hook top-level keys that are dicts (e.g. mcpServers) are now deep-merged with existing values instead of being skipped via setdefault. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
|||
| b5e2142c46 |
fix(#1877): close token-rotation race on restart — Option A+Option B combined
Platform side (Option B): - provisioner.go: add WriteAuthTokenToVolume() — writes .auth_token to the Docker named volume BEFORE ContainerStart using a throwaway alpine container, eliminating the race window where a restarted container could read a stale token before WriteFilesToContainer writes the new one. - workspace_provision.go: call WriteAuthTokenToVolume() in issueAndInjectToken as a best-effort pre-write before the container starts. Runtime side (Option A): - heartbeat.py: on HTTPStatusError 401 from /registry/heartbeat, call refresh_cache() to force re-read of /configs/.auth_token from disk, then retry the heartbeat once. Fall through to normal failure tracking if the retry also fails. - platform_auth.py: add refresh_cache() which discards the in-process _cached_token and calls get_token() to re-read from disk. Together these eliminate the >1 consecutive 401 window described in issue #1877. Pre-write (B) is the primary fix; runtime retry (A) is the self-healing fallback for any residual race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
|||
|
|
925a71887d
|
fix(workspace): credential helper security hardening (#1797)
Four findings from security audit (internal/security/credential-token-backlog.md):
1. STDERR LEAK — molecule-git-token-helper.sh:146,153 logged ${response}
on platform errors. The response body MAY contain the token in some
failure modes (alternate JSON key shape on partial success). Now:
- capture curl's stderr to a tmp file (not $response) so we can log
the curl error message without ever interpolating the response body
- on empty-token branch, log only response size (bytes) for debug
2. CHMOD 600 — already in place at lines 116, 124, 223 (verified, no change)
3. RESPAWN SUPERVISION — entrypoint.sh wrapped daemon launch in a
while-true bash loop with 30s back-off. Without this, a daemon crash
silently leaves the workspace stuck on an expired token until the
container restarts. Logs to /home/agent/.gh-token-refresh.log
(agent-writable; /var/log is root-owned).
4. JITTER — molecule-gh-token-refresh.sh: added 0..120s random offset to
each sleep so 39 containers don't synchronize their refresh requests
against the platform endpoint.
Also:
- Daemon now sends helper output to /dev/null instead of merging stderr,
belt-and-suspenders against any future helper change that might write
the token to stdout.
- Daemon log lines include rc=$? on failure for actionable triage.
Inherent risks (org-wide token blast, prompt-injection theft, bearer
in volume, no audit log) tracked in internal/security/credential-token-backlog.md
as separate roadmap items.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
|
||
|
|
e00797ba35 |
fix(security): prevent cross-tenant memory contamination in commit_memory/recall_memory (GH#1610)
Two critical gaps in a2a_tools.py let any tenant workspace poison org-wide (GLOBAL) memory and bypass all RBAC enforcement: 1. tool_commit_memory had no RBAC check — any agent could write any scope. 2. tool_commit_memory had no root-workspace enforcement for GLOBAL scope — Tenant A could POST scope=GLOBAL and pollute the shared memory store that Tenant B's agent reads as trusted context. Fix adds: - _ROLE_PERMISSIONS table (mirrors builtin_tools/audit.py) so a2a_tools has isolated RBAC logic without depending on memory.py. - _check_memory_write_permission() / _check_memory_read_permission() helpers: evaluate RBAC roles from WorkspaceConfig; fail closed (deny) on errors. - _is_root_workspace() / _get_workspace_tier(): read WorkspaceConfig.tier (0 = root/org, 1+ = tenant) from config.yaml; fall back to WORKSPACE_TIER env var. - tool_commit_memory now (a) checks memory.write RBAC, (b) rejects GLOBAL scope for non-root workspaces, (c) embeds workspace_id in the POST body so the platform can namespace-isolate and audit cross-workspace writes. - tool_recall_memory now checks memory.read RBAC before any HTTP call, and always sends workspace_id as a GET param for platform cross-validation. Security regression tests added: - GLOBAL scope denied for non-root (tier>0) workspaces. - RBAC denial blocks all scope levels (including LOCAL) on write. - RBAC denial blocks recall entirely. - workspace_id present in POST body and GET params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
2885583d05 |
feat(workspace): 45-min gh-token refresh daemon + credential helper cache
Extracted from the now-closed PR #1664 (Molecule-AI/molecule-core). - New scripts/molecule-gh-token-refresh.sh background daemon — every 45 min (TOKEN_REFRESH_INTERVAL_SEC) calls the credential helper's _refresh_gh action to keep both gh CLI auth and the on-disk cache fresh through the GitHub App installation token's ~60 min TTL. - scripts/molecule-git-token-helper.sh rewritten with a ~50 min on-disk cache (${CACHE_DIR}/gh_installation_token + _expiry companion file), a cache > API > env-var fallback chain, a new _refresh_gh action (invoked by the daemon above), a _invalidate_cache action, and path references flipped from /workspace/scripts/... to /app/scripts/... to match the runtime image layout. - Dockerfile copies the new refresh daemon and extends mkdir to create /home/agent/.molecule-token-cache at build time. - entrypoint.sh configures the git credential helper for github.com while still root (so the global gitconfig is written before the gosu handoff), creates + chowns the token cache dir, then as agent starts the refresh daemon in the background and does an initial gh auth login from GITHUB_TOKEN/GH_TOKEN so gh works before the first refresh fires. Dropped from PR #1664: cosmetic em-dash -> ASCII hyphen rewrites (charset-normalizer noise) that would conflict with the repo's existing em-dash convention used elsewhere in workspace/. |
||
|
|
dcbcf19da1 |
fix(test): guard msg.metadata assignment for non-Message returns
new_agent_text_message returns a real Message object in production but some test mocks return a plain string. Guard with hasattr + try/except so the tool_trace assignment doesn't crash test_non_stream_events_ignored. |
||
|
|
ed26f2733a |
fix(review): address code review blockers on tool-trace + instructions
BLOCKERS fixed: - instructions.go: Drop team-scope queries (teams/team_members tables don't exist in any migration). Schema column kept for future. Restored Resolve to /workspaces/:id/instructions/resolve under wsAuth — closes auth gap that allowed cross-workspace enumeration of operator policy. - migration 040: Add CHECK constraints on title (<=200) and content (<=8192) to prevent token-budget DoS via oversized instructions. - a2a_executor.py: Pair on_tool_start/on_tool_end via run_id instead of list-position so parallel tool calls don't drop or clobber outputs. Cap tool_trace at 200 entries to prevent runaway loops bloating JSONB. HIGH fixes: - instructions.go: Add length validation in Create + Update handlers. Removed dead rows_ shadow variable. Replaced string concatenation in Resolve with strings.Builder. - prompt.py: Drop httpx timeout 10s -> 3s (boot hot path). Switch print to logger.warning. Add Authorization bearer header from MOLECULE_WORKSPACE_TOKEN env var. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
d7afd15e59 |
feat: platform instructions system with global/team/workspace scope
Adds a configurable instruction injection system that prepends rules to every agent's system prompt. Instructions are stored in the DB and fetched at workspace startup, supporting three scopes: - Global: applies to all agents (e.g., "verify with tools before reporting") - Team: applies to agents in a specific team - Workspace: applies to a single agent (role-specific rules) Components: - Migration 040: platform_instructions table with scope hierarchy - Go API: CRUD endpoints + resolve endpoint that merges scopes - Python runtime: fetches instructions at startup via /instructions/resolve and prepends them to the system prompt as highest-priority context Initial global instructions seeded: 1. Verify Before Acting (check issues/PRs/docs first) 2. Verify Output Before Reporting (second signal before reporting done) 3. Tool Usage Requirements (claims must include tool output) 4. No Hallucinated Emergencies (CRITICAL needs proof) 5. Staging-First Workflow (never push to main directly) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
6c618c9c3f |
feat: add tool_trace to activity_logs for platform-level agent observability
Every A2A response now includes a tool_trace — the list of tools/commands the agent actually invoked during execution. This enables verifying agent claims against what they actually did, catches hallucinated "I checked X" responses, and provides an audit trail for the CEO to control hundreds of agents by checking the top-level PM's trace. Changes: - Python runtime: collect tool name/input/output_preview on every on_tool_start/on_tool_end event, embed in Message.metadata.tool_trace - Go platform: extract tool_trace from A2A response metadata, store in new activity_logs.tool_trace JSONB column with GIN index - Activity API: expose tool_trace in List and broadcast endpoints - Migration 039: adds tool_trace column + GIN index Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
1aea013e20 |
fix(ci): unblock main CI on ubuntu-latest — IPv6-safe addr + MagicMock seed
Two latent bugs the self-hosted Mac mini had been hiding. Both caught by the newer toolchain on ubuntu-latest runners after PR #1626. 1. workspace-server/internal/handlers/terminal.go:442 `fmt.Sprintf("%s:%d", host, port)` flagged by go vet as unsafe for IPv6 (it omits the required [::] brackets). Replaced with `net.JoinHostPort(host, strconv.Itoa(port))` which handles both IPv4 and IPv6 correctly. No runtime behaviour change — the only call site passes "127.0.0.1", so the bug would never trigger in practice, but vet is right to flag it as a latent correctness issue. 2. workspace/tests/test_a2a_executor.py::test_set_current_task_updates_heartbeat `MagicMock()` auto-creates attributes on first access, so `getattr(heartbeat, "active_tasks", 0)` in shared_runtime.py returned a MagicMock rather than the default 0. Adding 1 to a MagicMock returns another MagicMock, so the assertion `heartbeat.active_tasks == 1` never held. Seeding `heartbeat.active_tasks = 0` before the first call makes getattr() return a real int, matching how the real HeartbeatLoop class initialises itself. Both pre-existed on main and were hidden by the older Python / Go toolchains on the Mac mini runner. Verified locally (venv pytest pass, `go vet ./...` + `go build ./...` clean on workspace-server). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
64ccf8e179
|
fix: CWE-78 rm scope, go vet failures, delegation idempotency
* refactor: split 4 oversized handler files into focused sub-files - org.go (1099 lines) → org.go + org_import.go + org_helpers.go - mcp.go (1001 lines) → mcp.go + mcp_tools.go - workspace.go (934 lines) → workspace.go + workspace_crud.go - a2a_proxy.go (825 lines) → a2a_proxy.go + a2a_proxy_helpers.go No functional changes — same package, same exports, same tests. All files stay under 635 lines. Note: isSafeURL and isPrivateOrMetadataIP are duplicated between mcp_tools.go and a2a_proxy_helpers.go — this is a pre-existing issue from the original mcp.go and a2a_proxy.go, not introduced by this split. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks counter (refs #1386) * docs(tutorials): add Self-Hosted AI Agents guide — Docker, Fly Machines, bare metal * docs: add Remote Agents feature + Phase 30 blog links to docs index * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference (#1281) Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/*path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): CWE-78/CWE-22 — block shell injection in deleteViaEphemeral (#1310) ## Summary Issue #1273: deleteViaEphemeral interpolated filePath directly into rm command, enabling both shell injection (CWE-78) and path traversal (CWE-22) attacks. ## Changes 1. Added validateRelPath(filePath) guard before constructing the rm command. validateRelPath blocks absolute paths and ".." traversal sequences. 2. Changed Cmd from "/configs/"+filePath (string interpolation) to []string{"rm", "-rf", "/configs", filePath} (exec form). This eliminates shell injection entirely — filePath is a plain argument, never interpreted as shell code. ## Security properties - validateRelPath: blocks "../" and absolute paths before they reach Docker - Exec form: filePath cannot inject shell metacharacters even if validation is somehow bypassed - "/configs" as separate arg: rm has exactly two arguments, no room for injected args Closes #1273. Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in a2a_proxy.go (#1292) (#1302) * fix(security): backport SSRF defence (CWE-918) to main — isSafeURL in mcp.go and a2a_proxy.go Issue #1042: 3 CodeQL SSRF findings across mcp.go and a2a_proxy.go. staging already ships the fix (PRs #1147, #1154 → merged); main did not include it. - mcp.go: add isSafeURL() + isPrivateOrMetadataIP() helpers; validate agentURL before outbound calls in mcpCallTool (line ~529) and toolDelegateTaskAsync (line ~607) - a2a_proxy.go: add identical isSafeURL() + isPrivateOrMetadataIP() helpers; call isSafeURL() before dispatchA2A in resolveAgentURL() (blocks finding #1 at line 462) - mcp_test.go: 19 new tests covering all blocked URL patterns: file://, ftp://, 127.0.0.1, ::1, 169.254.169.254, 10.x.x.x, 172.16.x.x, 192.168.x.x, empty hostname, invalid URL, isPrivateOrMetadataIP across all private/CGNAT/metadata ranges 1. URL scheme enforcement — http/https only 2. IP literal blocking — loopback, link-local, RFC-1918, CGNAT, doc/test ranges 3. DNS hostname resolution — blocks internal hostnames resolving to private IPs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci-blocker): remove duplicate isSafeURL/isPrivateOrMetadataIP from mcp.go Issue #1292: PR #1274 duplicated isSafeURL + isPrivateOrMetadataIP in mcp.go — both functions already exist on main at lines 829 and 876. Kept the mcp.go definitions (the originals) and removed the 70-line duplicate appended at end of file. a2a_proxy.go functions are unchanged — they serve the same purpose via a separate code path. * fix: remove orphaned commit-text lines from a2a_proxy.go Three lines from the PR/commit title were accidentally baked into the file during the rebase from #1274 to #1302, causing a Go syntax error (a bare string literal at statement level followed by dangling braces). Deletion restores: } return agentURL, nil } Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> * fix(canvas/test): patch test regressions from PR #1243 + proximity hitbox fix (#1313) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (#1324) (#1327) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct (issue #1324) (#1329) * fix(ci): revert cancel-in-progress to true — ubuntu-runner dispatch stalled With cancel-in-progress: false, pending CI runs accumulate in the ci-staging concurrency group. New pushes create queued runs, but GitHub dispatches multiple runs for the same SHA instead of replacing the pending one. All runs get stuck/cancelled before completing. Reverting to cancel-in-progress: true restores CI operation — runs that are superseded are cancelled, freeing the concurrency slot for the new run to proceed. Runner availability (ubuntu-latest dispatch stall) is a separate infra issue tracked independently. * fix(security): validate tar header names in copyFilesToContainer — CWE-22 path traversal (#1043) Tar header names were built from raw map keys without validation. A malicious server-side caller could embed "../" in a file name to escape the destPath volume mount (/configs) and write files outside the intended directory. Fix: validate each name with filepath.Clean + IsAbs + HasPrefix("..") checks before using it in the tar header, then join with destPath for the archive header. Also guard parent-directory creation against traversal. Closes #1043. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas/test): patch regressed tests from PR #1243 orgs-page flakiness fix Two regressions introduced by PR #1243 (fix issue #1207): 1. **ContextMenu.keyboard.test.tsx** — `setPendingDelete` now receives `{id, name, hasChildren}` (cascade-delete UX, PR #1252), but the test expected only `{id, name}`. Added `hasChildren: false` to the assertion. 2. **orgs-page.test.tsx** — 10 tests awaited `vi.advanceTimersByTimeAsync(50)` without `act()`. With fake timers, `setState` (synchronous) is flushed by `advanceTimersByTimeAsync`, but the React state update it triggers is a microtask — so the test saw stale render. Wrapping in `act(async () => { await vi.advanceTimersByTimeAsync(50); })` ensures microtasks drain before assertions run. All 813 vitest tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add 100px proximity threshold to drag-to-nest detection Fixes #1052 — previously, getIntersectingNodes() returned any node whose bounding box overlapped the dragged node, regardless of actual pixel distance. On a sparse canvas this triggered the "Nest Workspace" dialog even when the dragged node was nowhere near any target. The fix adds an on-node-drag proximity filter: only nodes within 100px (center-to-center) of the dragged node are eligible as nest targets. Distance is computed as squared Euclidean to avoid the sqrt overhead in the hot drag path. Added two tests to Canvas.pan-to-node.test.tsx covering the mock wiring and confirming the regression is addressed in Canvas.tsx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(canvas): add ?? 0 guard for optional budget_used in progressPct Fixes #1324 — TypeScript strict mode flags budget.budget_used as possibly undefined in the progressPct ternary, even though the outer condition checks budget_limit > 0. Fix: use nullish coalescing (budget_used ?? 0) so progress shows 0% when the backend returns a partial shape (provisioning-stuck workspaces). Also adds a test covering the undefined-budget_used case with the progress bar aria-valuenow and fill width both at 0%. Closes #1324. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(platform): unblock SaaS workspace registration end-to-end Every workspace in the cross-EC2 SaaS provisioning shape was failing registration, heartbeat, or A2A routing. Four distinct blockers sat between "EC2 is up" and "agent responds"; three are platform-side and fixed here (the fourth is in the CP user-data, separate PR). 1. SSRF validator blocked RFC-1918 (registry.go + mcp.go) validateAgentURL and isPrivateOrMetadataIP rejected 172.16.0.0/12, which contains the AWS default VPC range (172.31.x.x) that every sibling workspace EC2 registers from. Registration returned 400 and the 10-min provision sweep flipped status to failed. RFC-1918 + IPv6 ULA are now gated behind saasMode(); link-local (169.254/16), loopback, IPv6 metadata (fe80::/10, ::1), and TEST-NET stay blocked unconditionally in both modes. saasMode() resolution order: 1. MOLECULE_DEPLOY_MODE=saas|self-hosted (explicit operator flag) 2. MOLECULE_ORG_ID presence (legacy implicit signal, kept for back-compat so existing deployments don't need a config change) isPrivateOrMetadataIP now actually checks IPv6 — previously it returned false on any non-IPv4 input, which would let a registered [::1] or [fe80::...] URL bypass the SSRF check entirely. 2. Orphan auth-token minting (workspace_provision.go) issueAndInjectToken mints a token and stuffs it into cfg.ConfigFiles[".auth_token"]. The Docker provisioner writes that file into the /configs volume — the CP provisioner ignores it (only cfg.EnvVars crosses the wire). Result: live token in DB, no plaintext on disk, RegistryHandler.requireWorkspaceToken 401s every /registry/register attempt because the workspace is no longer in the "no live token → bootstrap-allowed" state. Now no-ops in SaaS mode; the register handler already mints on first successful register and returns the plaintext in the response body for the runtime to persist locally. Also removes the redundant wsauth.IssueToken call at the bottom of provisionWorkspaceCP, which created the same orphan-token pattern a second time. 3. Compaction artefacts (bundle/importer.go, handlers/org_tokens.go, scheduler.go, workspace_provision.go) Four pre-existing compile errors on main from an earlier session's code truncation: missing tuple destructuring on ExecContext / redactSecrets / orgTokenActor, missing close-brace in Scheduler.fireSchedule's panic recovery. All one-line mechanical fixes; without them the binary would not build. Tests ----- ssrf_test.go adds: * TestSaasMode — covers the env resolution ladder (explicit flag wins over legacy signal, case-insensitive, whitespace tolerant) * TestIsPrivateOrMetadataIP_SaaSMode — asserts RFC-1918 + IPv6 ULA flip to allowed, metadata/loopback/TEST-NET still blocked * TestIsPrivateOrMetadataIP_IPv6 — regression guard for the old "returns false for all IPv6" behaviour Follow-up issue for CP-sourced workspace_id attestation will be filed separately — closes the residual intra-VPC SSRF + token-race windows the SaaS-mode relaxation introduces. Verified end-to-end today on workspace 6565a2e0 (hermes runtime, OpenAI provider) — agent returned "PONG" in 1.4s after register → heartbeat → A2A proxy → runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(runtime+scheduler): increment/decrement active_tasks + max_concurrent (#1408) Runtime (shared_runtime.py): - set_current_task now increments active_tasks on task start, decrements on completion (was binary 0/1) - Counter never goes below 0 (max(0, n-1)) - Pushes heartbeat immediately on BOTH increment and decrement (#1372) Scheduler (scheduler.go): - Reads max_concurrent_tasks from DB (default 1, backward compatible) - Skips cron only when active_tasks >= max_concurrent_tasks (was > 0) - Leaders can be configured with max_concurrent_tasks > 1 to accept A2A delegations while a cron runs Platform: - Added max_concurrent_tasks column to workspaces (migration 037) - Workspace model + list/get queries include the new field - API exposes max_concurrent_tasks in workspace JSON Config.yaml support (future): runtime_config.max_concurrent_tasks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(review): address 3 critical issues from code review 1. BLOCKER: executor_helpers.py now uses increment/decrement too (was still binary 0/1, stomping the counter for CLI + SDK executors) 2. BUG: asymmetric getattr defaults fixed — both paths use default 0 (was 0 on increment, 1 on decrement) 3. UX: current_task preserved when active_tasks > 0 on decrement (was clearing task description even when other tasks still running) 4. Scheduler polling loop re-reads max_concurrent_tasks on each poll (was using stale value from initial query) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Hongming Wang <hongmingwangrabbit@gmail.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Molecule AI Core-BE <core-be@agents.moleculesai.app> Co-authored-by: Molecule AI SDK Lead <sdk-lead@agents.moleculesai.app> Co-authored-by: Molecule AI Core-FE <core-fe@agents.moleculesai.app> Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> * docs: workspace files API reference, skill catalog, and links * docs: fix secrets endpoint path across docs The workspace secrets endpoint is `/workspaces/:id/secrets`, not `/secrets/values`. This was wrong in quickstart.md (Path 2: Remote Agent) and workspace-runtime.md (registration flow example and comparison table). The external-agent-registration guide already had the correct path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix broken blog cross-link in skills-vs-bundled-tools post Link path had an extra `/docs/` segment: `/docs/blog/...` instead of `/blog/...`. Nextra resolves blog posts directly under `/blog/<slug>`, not under `/docs/blog/`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add skill-catalog.md guide Linked from the skills-vs-bundled-tools blog post as a reference for TTS/image-generation/web-search skills. The blog promises "install directly via the CLI" with a skill catalog — this page fills that promise by documenting available skill types, install commands, version management, custom skill authoring, and removal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(marketing): update Phase 30 brief — Action 5 complete, docs/index.md update noted * docs(api-ref): add workspace file copy API reference Documents TemplatesHandler.copyFilesToContainer (container_files.go): - Endpoint overview: PUT /workspaces/:id/files/*path - Parameter descriptions for all four function parameters - CWE-22 path traversal protection (PRs #1267/1270/1271) - Defense-in-depth: validateRelPath at handler + archive boundary - Full error code table (400/404/500) - curl example with success and path-traversal rejection cases Also covers: writeViaEphemeral routing, findContainer fallback, allowed roots allow-list, and related links to platform-api.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Molecule AI Technical Writer <technical-writer@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> * fix(handlers): add saasMode() gating to isPrivateOrMetadataIP in a2a_proxy_helpers.go Issue #1421 / #1401: PR #1363 (handler split) moved isPrivateOrMetadataIP into a2a_proxy_helpers.go but kept the OLD pre-SaaS version — it unconditionally blocks RFC-1918 addresses, regressing the fix in commits |
||
|
|
859d676f70
|
fix(CI): correct BASE in detect-changes (PR/push race); catch RuntimeError in conftest (#1473)
- ci.yml: replace if/else BASE assignment with GITHUB_BASE_REF default
+ pull_request base.sha override pattern. Prevents push events from
overwriting the correct PR base SHA when both events fire together.
- conftest.py: catch RuntimeError in addition to ImportError when
importing coordinator.py, which raises RuntimeError at import time
when WORKSPACE_ID is not set (before the ImportError guard).
Co-authored-by: Molecule AI Release Manager <release-manager@agents.moleculesai.app>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
4675402e58
|
feat(workspace): pre-stop serialization for pause/resume (closes #1386)
Add a pre-stop hook that captures agent state before container exit and writes a scrubbed snapshot to /configs/.agent_snapshot.json. On restart, the snapshot is loaded and the adapter's restore_state() is called before the A2A server starts. - New lib/pre_stop.py: build_snapshot / write_snapshot / read_snapshot / delete_snapshot + _scrub_value deep-scrubber (uses lib.snapshot_scrub to redact API keys, tokens, and sandbox output before persisting) - BaseAdapter.pre_stop_state(): captures _executor._session_id and recent transcript_lines; overridden by adapters with richer in-memory state - BaseAdapter.restore_state(): stores snapshot fields as adapter attrs for create_executor() to pick up - main.py: calls pre_stop serialization in finally block (after server serves) and restore_state() after adapter setup, before server starts - Added 12 unit tests covering scrub, read/write, adapter integration Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
3bef6af241 |
fix: apply #1124 env-var defaults + scrub F1088 credentials from INCIDENT_LOG.md (#1347)
- PLATFORM_URL: replace unreachable http://platform:8080 mesh-only default with Docker-aware detection (host.docker.internal in containers, localhost for local dev) across all workspace Python modules and the git-token-helper shell script. - WORKSPACE_ID: add fail-fast validation in main.py (SystemExit if empty) consistent with coordinator.py / a2a_cli.py patterns already in place. - INCIDENT_LOG.md: replace all 3 F1088 credential types with ***REDACTED*** (sk-cp- 2x, github_pat_ 2x, ADMIN_TOKEN base64 3x). Fixes #1124, #1333. Co-authored-by: Molecule AI Dev Lead <dev-lead@agents.moleculesai.app> |
||
|
|
e07e22ad57 |
fix(orchestrator): fail-fast if WORKSPACE_ID env var is unset/empty (#1124) (#1336)
* fix(orchestrator): fail-fast if WORKSPACE_ID env var is unset/empty
Issue #1124: orchestrator GET /workspaces/{WORKSPACE_ID} returned 404
because 5 Python modules defaulted WORKSPACE_ID to "" instead of
validating the injected value. Empty string produced URLs like
/workspaces//heartbeat — route not found.
Fix: raise RuntimeError at module load if WORKSPACE_ID is unset
or empty, rather than silently producing broken API calls downstream.
Files changed (all same pattern):
- workspace/a2a_cli.py
- workspace/a2a_client.py
- workspace/coordinator.py
- workspace/consolidation.py
- workspace/molecule_ai_status.py
The platform (provisioner.go:375) correctly injects WORKSPACE_ID at
container provision time. This fix ensures the orchestrator surfaces
the misconfiguration immediately instead of failing silently at runtime.
Closes #1124.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(incidents): rebuild INCIDENT_LOG — linter reset, all sections restored
Rebuilt after linter reset. Sections restored:
- Security Audit Cycle 6 (abc58b47)
- F1100 workspace_restart.go path traversal (resolved via
|
||
|
|
d3b310e895 |
Merge pull request #1049 from Molecule-AI/feat/platform-native-hma-instructions
feat(runtime): inject HMA memory instructions at platform level (#1047) |
||
|
|
b1bb5f838a |
fix: GitHub token refresh — add WorkspaceAuth path for credential helper (#1068)
PR #729 tightened AdminAuth to require ADMIN_TOKEN, breaking the workspace credential helper which called /admin/github-installation-token with a workspace bearer token. Tokens expired after 60 min with no refresh. Fix: Add /workspaces/:id/github-installation-token under WorkspaceAuth so any authenticated workspace can refresh its GitHub token. Keep the admin path as backward-compatible alias. Update molecule-git-token-helper.sh to use the workspace-scoped path when WORKSPACE_ID is set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
5bc3edfbdd |
Fix test assertions to account for HMA instructions in system prompt
Mock get_hma_instructions in exact-match tests so they don't break when HMA content is appended. Add a dedicated test for HMA inclusion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
6103ca341c |
feat(runtime): inject HMA memory instructions at platform level (#1047)
Every agent now gets hierarchical memory instructions in their system prompt automatically — no template configuration needed. Instructions cover commit_memory (LOCAL/TEAM/GLOBAL scopes), recall_memory, and when to use each proactively. Follows the same pattern as A2A instructions: defined in executor_helpers.py, injected by _build_system_prompt() in the claude_sdk_executor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
46c20731e6 |
feat: event-driven cron triggers + auto-push hook for agent productivity
Three changes to boost agent throughput: 1. Event-driven cron triggers (webhooks.go): GitHub issues/opened events fire all "pick-up-work" schedules immediately. PR review/submitted events fire "PR review" and "security review" schedules. Uses next_run_at=now() so the scheduler picks them up on next tick. 2. Auto-push hook (executor_helpers.py): After every task completion, agents automatically push unpushed commits and open a PR targeting staging. Guards: only on non-protected branches with unpushed work. Uses /usr/local/bin/git and /usr/local/bin/gh wrappers with baked-in GH_TOKEN. Never crashes the agent — all errors logged and continued. 3. Integration (claude_sdk_executor.py): auto_push_hook() called in the _execute_locked finally block after commit_memory. Closes productivity gap where agents wrote code but never pushed, and where work crons only fired on timers instead of reacting to events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
3976361483 |
feat(workspace): snapshot secret scrubber (closes #823)
Sub-issue of #799, security condition C4. Standalone module in workspace/lib/snapshot_scrub.py with three public functions: - scrub_content(str) → str: regex-based redaction of secret patterns - is_sandbox_content(str) → bool: detect run_code tool output markers - scrub_snapshot(dict) → dict: walk memories, scrub each, drop sandbox entries Patterns covered: sk-ant-/sk-proj-, ghp_/ghs_/github_pat_, AKIA, cfut_, mol_pk_, ctx7_, Bearer, env-var assignments, base64 blobs ≥33 chars. 21 unit tests, 100% coverage on new code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
37ed319562 |
fix: update workspace script comments for workspace-template → workspace rename
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
39074cc4ae |
chore: final open-source cleanup — binary, stale paths, private refs
- Remove compiled workspace-server/server binary from git - Fix .gitignore, .gitattributes, .githooks/pre-commit for renamed dirs - Fix CI workflow path filters (workspace-template → workspace) - Replace real EC2 IP and personal slug in test_saas_tenant.sh - Scrub molecule-controlplane references in docs - Fix stale workspace-template/ paths in provisioner, handlers, tests - Clean tracked Python cache files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
d8026347e5 |
chore: open-source restructure — rename dirs, remove internal files, scrub secrets
Renames: - platform/ → workspace-server/ (Go module path stays as "platform" for external dep compat — will update after plugin module republish) - workspace-template/ → workspace/ Removed (moved to separate repos or deleted): - PLAN.md — internal roadmap (move to private project board) - HANDOFF.md, AGENTS.md — one-time internal session docs - .claude/ — gitignored entirely (local agent config) - infra/cloudflare-worker/ → Molecule-AI/molecule-tenant-proxy - org-templates/molecule-dev/ → standalone template repo - .mcp-eval/ → molecule-mcp-server repo - test-results/ — ephemeral, gitignored Security scrubbing: - Cloudflare account/zone/KV IDs → placeholders - Real EC2 IPs → <EC2_IP> in all docs - CF token prefix, Neon project ID, Fly app names → redacted - Langfuse dev credentials → parameterized - Personal runner username/machine name → generic Community files: - CONTRIBUTING.md — build, test, branch conventions - CODE_OF_CONDUCT.md — Contributor Covenant 2.1 All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml, README, CLAUDE.md updated for new directory names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |