molecule-core

Author	SHA1	Message	Date
Hongming Wang	91ddaaa2ff	Merge pull request #276 from Molecule-AI/feat/hermes-phase2d-i-system-prompt feat(hermes): Phase 2d-i — system-prompt.md injection on all 3 dispatch paths	2026-04-15 16:53:31 -07:00
Hongming Wang	616bb4eea6	Merge pull request #288 from Molecule-AI/fix/security-headers-referrer-permissions fix(security): add Referrer-Policy + Permissions-Policy headers (#282)	2026-04-15 16:52:37 -07:00
Hongming Wang	713b3cb5a7	fix(security): add Referrer-Policy + Permissions-Policy headers (#282) Closes #282. CLAUDE.md documented the SecurityHeaders() middleware as setting 6 headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Content-Security-Policy, Permissions-Policy, HSTS) but the implementation only set 4 — Referrer-Policy and Permissions-Policy were silently missing. Adds: - Referrer-Policy: strict-origin-when-cross-origin — prevents browsers from leaking full paths/queries in Referer on cross-origin navigation. Particularly relevant for canvas embeds of Langfuse trace URLs that may contain trace IDs. - Permissions-Policy: camera=(), microphone=(), geolocation=() — denies sensor access by default. Iframes the canvas embeds (Langfuse trace viewer etc.) can no longer request these without an explicit delegation. Regression tests added to securityheaders_test.go — both headers are now in the same table-driven assertion loop as the other 4, so a future edit that drops them again fails CI loudly. LOW severity — this is defense-in-depth, not a direct exploit path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 16:52:19 -07:00
Hongming Wang	5b9e566f95	Merge pull request #277 from Molecule-AI/fix/wire-security-plugins-to-roles feat(template): wire molecule-hitl + molecule-security-scan into roles (#266, #275)	2026-04-15 16:22:19 -07:00
Hongming Wang	2dbb608723	feat(template): wire molecule-hitl + molecule-security-scan into roles (#266, #275) Closes #266 and #275. Per-role install matrix matching the per-tick #266 triage comment. ## Added plugins \| Role \| Plugin \| Rationale \| \|---\|---\|---\| \| Backend Engineer \| molecule-hitl \| Scope includes destructive DB migrations + runtime config changes — @requires_approval stops unattended agents from shipping prod schema mutations. \| \| DevOps Engineer \| molecule-hitl \| Scope covers fly deploys + registry pushes + CI pipeline mutations — @requires_approval before destructive infra ops. \| \| Security Auditor \| molecule-hitl \| Gates public issue filing for critical findings; prevents false-positive spam of the tracker. \| \| Security Auditor \| molecule-security-scan \| Primary consumer of gosec/bandit/CVE scanning via builtin_tools/security_scan.py. Security Auditor system prompt already expects to run these tools; this wires them. \| ## Per-PR #71 semantics Each workspace's `plugins:` UNIONs with `defaults.plugins` — these additions don't drop any existing plugin. Security Auditor's list went from 3 → 5; Backend + DevOps Engineer now have a role-specific list layered on top of defaults. ## NOT adding (yet) Dev Lead / Research Lead / Technical Researcher / QA Engineer / UIUX Designer / PM / Documentation Specialist — none have destructive ops scope in the role description. If you want belt-and-suspenders HITL coverage I can extend this PR; leaving narrow for now. ## Test plan - [x] YAML parses cleanly (python3 -c 'import yaml; yaml.safe_load(...)') - [x] Three edited roles' plugins lists verified by walk-script - [ ] Next org re-import activates the plugins on each workspace container - [ ] Agents invoke request_approval / security_scan from their system prompts after re-import Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 16:21:58 -07:00
rabbitblood	88ce2a18cd	feat(hermes): Phase 2d-i — system-prompt.md injection on all 3 dispatch paths The Hermes adapter never read /configs/system-prompt.md. Any role that switched to runtime: hermes was silently losing its role identity because the system prompt wasn't passed to the model. This PR fixes that by: 1. HermesA2AExecutor.__init__ takes new optional `config_path` kwarg 2. `create_executor(config_path=...)` forwards to the constructor 3. `adapter.py` passes `config.config_path` through from AdapterConfig 4. `execute()` reads system-prompt.md via executor_helpers.get_system_prompt (hot-reload-capable — reads on every turn, not just at startup) 5. `_do_inference(user_message, history, system_prompt)` — new arg threads through the dispatch to each native path 6. Each path uses the provider's NATIVE system field: - OpenAI-compat: prepends `{"role":"system", "content":...}` to messages - Anthropic: top-level `system=` kwarg (NOT in messages — Anthropic requires system at the top level) - Gemini: `config=GenerateContentConfig(system_instruction=...)` ## Phase scoreboard - 2a (in main) — native Anthropic dispatch infra - 2b (in main) — native Gemini dispatch - 2c (in main) — multi-turn history on all paths - 2d-i (this PR) — system prompts on all paths - 2d-ii (future) — tool calling on native paths - 2d-iii (future) — vision content blocks on native paths - 2d-iv (future) — streaming ## Test coverage 46/46 tests pass (20 Phase 2 dispatch + 26 Phase 1 registry): - Existing dispatch tests updated to assert the 3-arg call shape `("hello", None, None)` — history + system_prompt both None - 5 new tests: - `dispatch_passes_system_prompt_to_anthropic` — happy path, third arg flows - `dispatch_passes_system_prompt_to_gemini` — happy path - `dispatch_passes_system_prompt_to_openai` — happy path - `executor_accepts_config_path_kwarg` — constructor stores config_path - `create_executor_forwards_config_path` — both back-compat and registry resolution paths forward
config_path through to the executor ## Back-compat - `config_path=None` (default) → execute() skips system-prompt injection, same behavior as pre-2d-i - Workspaces with `runtime: hermes` but no `/configs/system-prompt.md` file get `system_prompt=None` (get_system_prompt returns fallback), same as before - The 13 OpenAI-compat providers work identically — system_prompt just adds a leading message, which every OpenAI-compat endpoint already supports - Anthropic + Gemini previously got zero system context; now they get the same system prompt the workspace's system-prompt.md carries ## Why this matters Before this PR: if someone flipped a workspace from `runtime: claude-code` to `runtime: hermes`, the agent would act generically (no role identity, no project conventions, no CLAUDE.md context) because the Hermes executor never looked at system-prompt.md. That's a silent correctness regression the test suite wouldn't catch because none of our live workspaces use the hermes runtime today. With this PR: Hermes workspaces get the same system prompt injection as Claude-code workspaces, making the `runtime: hermes` switch a true drop-in alternative. ## Related - #267 Phase 2c (multi-turn history — in main) - #255 Phase 2b (gemini native — in main) - #240 Phase 2a (anthropic native — in main) - #208 Phase 1 (provider registry — in main) - project_hermes_multi_provider.md — Phase 2d-i was the next queued item	2026-04-15 16:21:47 -07:00
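A minimal Python sketch of the per-provider system-prompt placement described in commit 88ce2a18cd above. The three helper names are hypothetical, and the Gemini `config` is shown as a plain dict standing in for `GenerateContentConfig(system_instruction=...)` so that the example runs without any SDK installed; the real executor methods may be shaped differently.

```python
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


def openai_request(messages: List[Message], system_prompt: Optional[str] = None) -> Dict[str, Any]:
    """OpenAI-compat: the system prompt becomes a leading message."""
    msgs = list(messages)
    if system_prompt:
        msgs.insert(0, {"role": "system", "content": system_prompt})
    return {"messages": msgs}


def anthropic_request(messages: List[Message], system_prompt: Optional[str] = None) -> Dict[str, Any]:
    """Anthropic: the system prompt is a top-level kwarg, never a message."""
    kwargs: Dict[str, Any] = {"messages": list(messages)}
    if system_prompt:
        kwargs["system"] = system_prompt
    return kwargs


def gemini_request(contents: List[Message], system_prompt: Optional[str] = None) -> Dict[str, Any]:
    """Gemini: the system prompt travels inside the generation config."""
    kwargs: Dict[str, Any] = {"contents": list(contents)}
    if system_prompt:
        # Stand-in for GenerateContentConfig(system_instruction=...).
        kwargs["config"] = {"system_instruction": system_prompt}
    return kwargs
```

With `system_prompt=None` each helper returns the same request it would have built before, which matches the back-compat behavior the commit message describes.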
Hongming Wang	bf7706c96e	Merge pull request #267 from Molecule-AI/feat/hermes-phase2c-streaming feat(hermes): Phase 2c — multi-turn history passed natively to all dispatch paths	2026-04-15 16:10:21 -07:00
Hongming Wang	db2613260e	Merge pull request #273 from Molecule-AI/fix/ci-self-hosted-runner-failures fix(ci): publish-platform-image keychain + path diagnostics	2026-04-15 16:06:53 -07:00
Hongming Wang	63934ab487	fix(ci): publish-platform-image keychain + path diagnostics Every publish-platform-image run since the `3ff40c4` self-hosted runner migration has been failing with two runner-level issues that the workflow now works around (keychain) or surfaces clearly (path): 1. "error storing credentials - err: exit status 1, out: 'User interaction is not allowed. (-25308)'" docker/login-action tries to persist the GHCR + Fly tokens in the macOS Keychain, but the Mac mini runner runs as a non-interactive launchd service without an unlocked desktop session — keychain access raises -25308. Fix: set DOCKER_CONFIG to a per-run temp dir containing a plain config.json before the login step so credentials land in a file, not the keychain. This is the same trick the GitHub-hosted macos runners use in docker action examples. 2. "Unexpected error attempting to determine if executable file exists '/usr/local/bin/docker': Error: EACCES: permission denied, stat '/usr/local/bin/docker'" Not a workflow bug — the runner literally can't read the Docker binary path. Adds a diagnostic step before QEMU/buildx setup that prints: PATH, `command -v docker`, `docker --version`, and `ls -la` on both /usr/local/bin/docker and /opt/homebrew/bin/docker. Surfacing these in the log means the next failure (if any) shows the actual problem instead of hiding behind a cryptic buildx error. Does NOT fix the root cause of #2 — that needs the user to SSH into the Mac mini runner and reinstall / re-permission Docker Desktop (or switch to Colima/OrbStack). The diagnostic output will tell us exactly which path is broken. The 20+ queued CI runs from `ci.yml` are unrelated to this PR — they are stuck because the self-hosted runner has severely degraded queue throughput (runs wait 2+ hours before being picked up). That's a separate runner-health issue tracked as a user action in the triage report. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 16:06:28 -07:00
rabbitblood	d40a9d940c	feat(hermes): Phase 2c — multi-turn history passed natively to all paths Completes the Phase 2 scope by keeping conversation turns as turns across all three dispatch paths. Pre-2c, history was flattened into a single user message via shared_runtime.build_task_text, which worked as a fallback but lost the model's native multi-turn awareness (role attribution, instruction-following on mid-conversation corrections, system-prompt grounding against prior turns). Phase 2a + 2b shipped the dispatch infrastructure + per-provider native paths. This PR uses them properly. ## What's new - `_history_to_openai_messages(user_message, history)` (static) — maps A2A `(role, text)` tuples to OpenAI Chat Completions `[{"role":"user"\|"assistant","content":str}]`. Roles: `human`→`user`, `ai`→`assistant`. Current turn appended as the final user message. - `_history_to_anthropic_messages` (static) — identical wire shape to OpenAI for text-only turns, so it delegates. Phase 2d tool_use/vision blocks will diverge here. - `_history_to_gemini_contents` (static) — Gemini uses a different shape: `role="user"\|"model"` (NOT "assistant") and text wrapped in `parts=[{"text":...}]`. Delegates to none of the others. - `_do_openai_compat(user_message, history=None)` — accepts history, builds messages via `_history_to_openai_messages`. Back-compat: pass `history=None` to get the old single-turn behavior. - `_do_anthropic_native(user_message, history=None)` — same signature change, calls `_history_to_anthropic_messages`. Still uses `anthropic.AsyncAnthropic().messages.create()`, just with proper multi-turn. - `_do_gemini_native(user_message, history=None)` — same pattern, calls `_history_to_gemini_contents`, passes to Gemini's `generate_content(contents=...)`. - `_do_inference(user_message, history=None)` — new signature, dispatches by auth_scheme as before, passes both args through. - `execute()` — no longer calls `build_task_text`. 
Calls `extract_history(context)` directly and forwards to `_do_inference`. Removes the `build_task_text` import (not needed in this file anymore). ## Tests Existing 7 dispatch tests updated for the new `(user_message, history)` signature — they assert the path is called with `("hello", None)` since they pass no history. 5 NEW tests: - `test_history_to_openai_messages_empty_history` — empty history degrades to single user message (back-compat) - `test_history_to_openai_messages_multi_turn` — round-trip of a 3-turn history + current turn - `test_history_to_anthropic_messages_same_as_openai` — cross-check that anthropic path produces identical wire shape for text-only - `test_history_to_gemini_contents_uses_model_role_and_parts_wrapper` — verifies the Gemini-specific role mapping (`ai`→`model`) + parts wrapper - `test_dispatch_passes_history_through` — end-to-end: _do_inference forwards history to the chosen provider path All 41 tests pass (15 Phase 2 dispatch + 26 Phase 1 registry): pytest tests/test_hermes_phase2_dispatch.py tests/test_hermes_providers.py 41 passed in 0.07s ## Back-compat - No public API changes to `create_executor()`. Callers that hit `execute()` via A2A get the new multi-turn behavior automatically via `extract_history(context)`. - Callers that passed an empty history list (or None) get the same single-turn behavior as pre-2c. - The `build_task_text` helper in shared_runtime is unchanged — other adapters (AutoGen, LangGraph) that use it keep working. Only Hermes bypasses it now. 
## What's NOT in this PR (Phase 2d) - Tool calling / function calling on native paths (anthropic `tools=`, gemini `tools=Tool(function_declarations=[...])`) - Vision content blocks (image_url → anthropic `{type:"image", source: {type:"base64",...}}` / gemini `{inline_data:{mime_type,data}}`) - System instructions pass-through (anthropic `system=`, gemini `system_instruction=`) - Streaming (`astream_messages` / `streamGenerateContent` stream variants) - Extended thinking (anthropic `thinking={"type":"enabled"}`) / Gemini thinking config Phase 2c is the multi-turn upgrade. Tool + vision + streaming are Phase 2d, scoped in project_hermes_multi_provider.md. ## Related - #240 Phase 2a (native Anthropic dispatch — in main) - #255 Phase 2b (native Gemini dispatch — in main) - Phase 1 (#208 — provider registry baseline, in main) - `project_hermes_multi_provider.md` queued memory - CEO 2026-04-15: "focus on supporting hermes agent"	2026-04-15 14:21:10 -07:00
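The role mappings described in commit d40a9d940c above can be sketched in Python as follows. The function names mirror the commit message, but the bodies are an assumption reconstructed from its description: history is taken to be an iterable of `(role, text)` tuples, and an unknown role simply raises `KeyError` here, which the source does not specify.

```python
from typing import Dict, Iterable, List, Optional, Tuple

History = Optional[Iterable[Tuple[str, str]]]

# A2A role -> provider role, as listed in the commit message.
OPENAI_ROLES = {"human": "user", "ai": "assistant"}
GEMINI_ROLES = {"human": "user", "ai": "model"}


def history_to_openai_messages(user_message: str, history: History = None) -> List[Dict]:
    """Map prior turns to Chat Completions messages, then append the current turn."""
    messages = [
        {"role": OPENAI_ROLES[role], "content": text}
        for role, text in (history or [])
    ]
    messages.append({"role": "user", "content": user_message})
    return messages


def history_to_gemini_contents(user_message: str, history: History = None) -> List[Dict]:
    """Gemini uses role 'model' (not 'assistant') and wraps text in parts."""
    contents = [
        {"role": GEMINI_ROLES[role], "parts": [{"text": text}]}
        for role, text in (history or [])
    ]
    contents.append({"role": "user", "parts": [{"text": user_message}]})
    return contents
```

An empty or missing history degrades to a single user message in both shapes, which is the single-turn back-compat case the tests in the commit cover.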
Hongming Wang	12db566b00	Merge pull request #264 from Molecule-AI/feat/plugin-compliance-posture-split feat(plugin): split compliance-posture into 3 plugins (#256)	2026-04-15 14:15:55 -07:00
Hongming Wang	720e92e426	feat(plugin): split compliance-posture into 3 plugins (#256) Closes #256. Per CEO direction, shipping three separate opt-in plugins instead of one bundled "compliance-posture" — keeps installs granular so a workspace that only wants CVE scanning doesn't carry OWASP policy or append-only audit retention. - plugins/molecule-compliance/ — wraps compliance.py (OWASP OA-01 prompt injection + OA-03 excessive agency). Skill: owasp-agentic. - plugins/molecule-audit/ — wraps audit.py (EU AI Act Art. 12/13/17 append-only JSONL log, SIEM-friendly). Skill: ai-act-audit-log. - plugins/molecule-security-scan/ — wraps security_scan.py (Snyk or pip-audit CVE gate on skill requirements.txt). Skill: skill-cve-gate. Each plugin ships a manifest + one SKILL.md with: - When to install / when to skip - Configuration shape (config.yaml blocks) - Anti-patterns to avoid - Cross-references to the other two plugins so an operator can reason about the full compliance surface All three wrap code that already exists in workspace-template/builtin_tools/ — no Python changes. Install per workspace via POST /workspaces/:id/plugins {"source":"builtin://molecule-<name>"}. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:15:25 -07:00
Hongming Wang	9edc576ac9	Merge pull request #263 from Molecule-AI/docs/sync-2026-04-15-tick-32 docs: sync CLAUDE.md test counts after tick-32	2026-04-15 14:11:16 -07:00
Hongming Wang	97c0384fd4	docs: sync CLAUDE.md test counts after 2026-04-15 tick-32 Tick 32 (manual) merged a large batch of PRs — the test counts in CLAUDE.md were drifting behind reality by enough to matter: - platform: 816 → 818 (YAML injection fix + sanitizeRuntime allowlist) - canvas: 453 → 482 (12 CookieConsent + 17 PricingTable/billing) - workspace-template: 1180 → 1179 (Hermes Phase 2a/2b dispatch tests landed but the test_hermes_providers env-var-leak fix removed a fragile flake-path count; net -1) This is measured not guessed: running the full suites on fresh main. Not in this sync but worth mentioning for the next retrospective: - controlplane repo received the full GDPR/admin/usage/consent/email stack (#29-#34) — that work sits in molecule-controlplane, not monorepo CLAUDE.md - monorepo picked up /pricing route, cookie consent banner, molecule-hitl plugin (#262), Hermes Phase 2a native Anthropic + 2b Gemini Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:05:21 -07:00
Hongming Wang	22e008aeef	Merge pull request #262 from Molecule-AI/feat/plugin-molecule-hitl feat(plugin): molecule-hitl — opt-in HITL gates (#257)	2026-04-15 14:03:44 -07:00
Hongming Wang	4d048e20d3	feat(plugin): molecule-hitl — opt-in HITL gates (#257) Closes #257. Thin manifest + skill doc that activates the existing builtin_tools/hitl.py primitives as a per-workspace opt-in plugin. The Python implementation (@requires_approval decorator, pause_task / resume_task tools, multi-channel notification, RBAC bypass roles) is already in every runtime image — this plugin is the policy layer that tells agents when to call them. - plugins/molecule-hitl/plugin.yaml — runtimes: langgraph, claude_code, deepagents; skills: hitl-gates - plugins/molecule-hitl/skills/hitl-gates/SKILL.md — documents the 5 classes of action that need a gate (deployment / irreversible FS / public message / production mutation / cross-workspace destructive), decorator pattern, pause/resume pattern, config shape, 4 anti-patterns, 5-step test plan No Python code — all implementation already exists. Install per workspace via POST /workspaces/:id/plugins. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:03:19 -07:00
Hongming Wang	825b8a227f	Merge pull request #255 from Molecule-AI/feat/hermes-phase2b-gemini-native feat(hermes): Phase 2b — native Google Gemini generateContent dispatch path	2026-04-15 14:01:00 -07:00
Hongming Wang	353dc306e9	Merge pull request #240 from Molecule-AI/feat/hermes-phase2-native-sdks feat(hermes): Phase 2a — native Anthropic Messages API dispatch (auth_scheme='anthropic')	2026-04-15 14:00:51 -07:00
Hongming Wang	a868162465	Merge pull request #261 from Molecule-AI/fix/hermes-test-env-isolation fix(tests): hermes provider env-var leak broke test_hermes_smoke	2026-04-15 14:00:12 -07:00
Hongming Wang	66120e6c37	fix(tests): hermes provider env-var leak broke test_hermes_smoke Pre-existing flaky test: when the full workspace-template suite ran in collection order, test_hermes_smoke.py::test_create_executor_raises_without_keys failed with "DID NOT RAISE ValueError". Failure only surfaced when test_hermes_providers ran first. Root cause: test_hermes_providers had an autouse fixture that used monkeypatch.delenv on entry, but several tests in that file mutate os.environ directly (e.g. `os.environ["HERMES_API_KEY"] = "test"`), bypassing monkeypatch. monkeypatch only tracks its own deltas, so on fixture teardown the direct-mutation values stayed in os.environ. HERMES_API_KEY leaked across file boundaries into test_hermes_smoke, which then saw a key present when it expected absence. Fix: replace monkeypatch-based fixture with pure snapshot/restore: - Snapshot all provider env vars at entry - Clear them - yield (test runs, may mutate freely) - try/finally restore the exact pre-test state This is deterministic regardless of whether a test uses monkeypatch, direct mutation, or neither. Also adds a comment documenting WHY we switched away from monkeypatch so a future reviewer doesn't revert. Full workspace-template suite: 1169 passed, 9 skipped, 2 xfailed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:59:48 -07:00
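The snapshot/restore pattern from commit 66120e6c37 above, sketched as a plain context manager so it runs without pytest. The helper name `isolated_provider_env` and the three-variable list are illustrative assumptions (the real fixture covers all provider env vars); the same body could sit inside an autouse fixture with `yield` in the same position.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Sequence

# Illustrative subset; the actual fixture snapshots every provider variable.
PROVIDER_ENV_VARS = ("HERMES_API_KEY", "OPENROUTER_API_KEY", "GEMINI_API_KEY")


@contextmanager
def isolated_provider_env(names: Sequence[str] = PROVIDER_ENV_VARS) -> Iterator[None]:
    # 1. Snapshot the pre-test state (None marks "was unset").
    snapshot = {name: os.environ.get(name) for name in names}
    # 2. Clear the variables so the test starts from a known-empty state.
    for name in names:
        os.environ.pop(name, None)
    try:
        # 3. The test body runs here and may mutate os.environ directly.
        yield
    finally:
        # 4. Restore the exact pre-test state, whatever the test did.
        for name, value in snapshot.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

Because the restore step rewrites every tracked variable from the snapshot, it does not matter whether the test used `monkeypatch`, assigned to `os.environ` directly, or touched nothing.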
Hongming Wang	51e3556efe	Merge pull request #238 from Molecule-AI/docs/sync-2026-04-15-overnight-sweep docs: sync 2026-04-15 overnight sweep — CLAUDE.md + PLAN.md + edit-history	2026-04-15 13:55:56 -07:00
Hongming Wang	59e6665f94	Merge pull request #251 from Molecule-AI/feat/cookie-consent-banner feat(canvas): cookie consent banner	2026-04-15 13:49:53 -07:00
Hongming Wang	942e50e0a4	Merge pull request #252 from Molecule-AI/fix/channels-discover-adminauth fix(security): gate /channels/discover behind AdminAuth (#250)	2026-04-15 13:49:45 -07:00
Hongming Wang	9958eb8366	Merge pull request #254 from Molecule-AI/fix/security-auditor-yaml-check chore(template): add YAML injection to Security Auditor check list (#248)	2026-04-15 13:49:39 -07:00
Hongming Wang	05113aec6b	Merge pull request #259 from Molecule-AI/docs/saas-secrets-resend docs: add Resend + Stripe to saas-secrets runbook	2026-04-15 13:49:34 -07:00
Hongming Wang	b76f9dbcdb	Merge pull request #242 from Molecule-AI/docs/gdpr-erasure-runbook docs: GDPR Art. 17 erasure runbook	2026-04-15 13:49:28 -07:00
Hongming Wang	5d7deb9363	Merge pull request #260 from Molecule-AI/feat/pricing-page feat(canvas): /pricing route with plan selector + Stripe checkout	2026-04-15 13:48:47 -07:00
Hongming Wang	4b865fa755	feat(canvas): /pricing route with plan selector + Stripe checkout Adds a public /pricing route the apex + tenant canvas can both serve. Three-tier plan cards (Free, Starter, Pro) with per-plan CTA buttons that dispatch correctly regardless of the user's state: Free → redirect to signup Anonymous + paid → redirect to signup (Stripe opens post-auth) Authed + paid → POST /cp/billing/checkout, redirect to Stripe URL No tenant slug → inline error ("pick an org first") Network failures → surfaced in an ARIA alert banner Files: - src/lib/billing.ts — plan metadata + startCheckout + openBillingPortal wrappers over /cp/billing/{checkout,portal} - src/components/PricingTable.tsx — client component, lazy session probe on first CTA click (no probe for anonymous browsers) - src/app/pricing/page.tsx — server-rendered shell with SEO metadata, links to legal pages in the footer - Tests: 10 billing helper tests + 9 PricingTable tests (17 total, additional ones cover the plan-list canonical order) Design notes: - The pricing data (features + prices) is a static const in billing.ts, not fetched from the API. Changing prices requires a deploy — which we'd need to do anyway for tier definition changes. - PLAN_ID 'starter' is flagged highlighted=true so the middle card gets the 'Most popular' visual treatment. One source of truth; test locks it. - Session probe is lazy (first CTA click, not mount) so anonymous visitors don't generate a /cp/auth/me request just to read the page. AuthGate interaction: - On apex (no tenant slug), AuthGate passthrough — /pricing renders freely - On tenant subdomain, AuthGate still bounces anonymous users to login before reaching /pricing — this is the correct UX for the "I'm already logged in and want to upgrade my own org" flow Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:41:44 -07:00
Hongming Wang	7bfc40f2bd	docs: add Resend + Stripe to saas-secrets runbook Extends the secret map with RESEND_API_KEY, RESEND_FROM_EMAIL, STRIPE_API_KEY, STRIPE_WEBHOOK_SECRET — the four SaaS secrets the control plane reads once the current PR stack (#29-#34 on molecule-controlplane) ships. Adds rotation procedures for each: - Resend: low-blast-radius, best-effort sends, domain verification gotcha documented - Stripe API key: independent rotation from webhook secret, live verify via /cp/billing/checkout - Stripe webhook secret: 24h overlap window procedure using stripe trigger for live verify Also adds Resend + Stripe entries to the emergency-contacts list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:35:23 -07:00
rabbitblood	485dcb4cae	feat(hermes): Phase 2b — native Google Gemini generateContent dispatch path Completes Hermes Phase 2 by adding the second native SDK path: Google Gemini via the official `google-genai` Python SDK. Stacked on top of Phase 2a (feat/hermes-phase2-native-sdks) which introduced the dispatch infra + the anthropic native path. ## What's new in this PR 1. `providers.py`: flip `gemini` entry to `auth_scheme="gemini"` and update `base_url` from the OpenAI-compat endpoint (`/v1beta/openai`) to the bare host (`https://generativelanguage.googleapis.com`) which the native SDK uses. 2. `executor.py`: new method `_do_gemini_native(task_text)` that uses `google.genai.Client().aio.models.generate_content(...)`. Dispatch table in `_do_inference` now routes `"gemini"` → `_do_gemini_native`. Same fail-loud semantics as `_do_anthropic_native` — missing SDK raises a clear RuntimeError with install instructions. 3. `requirements.txt`: add `google-genai>=1.0.0`. 4. `test_hermes_phase2_dispatch.py`: +3 tests - `test_gemini_entry_has_gemini_scheme` — registry flip + base URL validated - `test_dispatch_gemini_scheme_calls_gemini_native` — dispatch runs gemini native, not openai-compat or anthropic-native - `test_gemini_native_raises_clear_error_when_sdk_missing` — fail-loud on missing `google-genai` package Plus updated existing dispatch tests to mock `_do_gemini_native` alongside the other paths so "no cross-calls" assertions stay tight. 
All 36 tests pass locally (10 Phase 2 dispatch + 26 Phase 1 registry): pytest tests/test_hermes_phase2_dispatch.py tests/test_hermes_providers.py 36 passed in 0.07s ## Dispatch table after this PR auth_scheme="openai" → _do_openai_compat (13 providers) auth_scheme="anthropic" → _do_anthropic_native (1 provider, Phase 2a) auth_scheme="gemini" → _do_gemini_native (1 provider, Phase 2b) ← NEW <unknown> → _do_openai_compat + warning (forward-compat) ## Back-compat - All 13 openai-scheme providers unchanged - `hermes_api_key` / `HERMES_API_KEY` / `OPENROUTER_API_KEY` paths unchanged - Only `gemini` provider changes behavior: now uses native generateContent instead of the `/v1beta/openai` compat shim - Existing Gemini callers setting `GEMINI_API_KEY` get the native path automatically — no caller changes needed ## What's NOT in this PR (future phases) - Streaming support (`astream_messages` / `streamGenerateContent` stream variants) for either native path - Tool calling / function calling on native paths - Vision content blocks (image_url → anthropic image blocks; image_url → gemini inline_data with base64 + mime_type) - Extended thinking (anthropic) / thinking config (gemini) - System instructions pass-through on the gemini native path Phase 2c/2d will layer these on. This PR is the minimum-viable native dispatch — single-turn text in, text out — same shape as Phase 2a. ## Stacking This PR targets `feat/hermes-phase2-native-sdks` (Phase 2a) as its base branch, NOT main, so the diff shows only the Gemini-specific additions. When Phase 2a merges to main, GitHub auto-rebases this PR onto the new main head. If reviewer prefers a single combined PR, close #240 and land this one instead — the commits on feat/hermes-phase2-native-sdks are already included in this branch's history. 
## Related - #240 Phase 2a (parent branch) - #208 Phase 1 (registry + openai-compat path — already in main) - `project_hermes_multi_provider.md` queued memory — Phase 2 was the next item, this PR completes it - `docs/ecosystem-watch.md` → `### Hermes Agent` — Research Lead's eco-watch entry that catalogued Hermes's native provider list and shaped the original Phase 2 scope	2026-04-15 13:20:39 -07:00
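The dispatch table listed in commit 485dcb4cae above amounts to a lookup with a logged fallback. A minimal Python sketch, assuming a hypothetical helper `pick_dispatch_path` that returns the method name rather than calling it (the real `_do_inference` dispatches on `self.provider_cfg.auth_scheme` and invokes the path directly):

```python
import logging

logger = logging.getLogger("hermes.dispatch")  # logger name is an assumption

# auth_scheme -> executor method name, per the commit's dispatch table.
DISPATCH_TABLE = {
    "openai": "_do_openai_compat",
    "anthropic": "_do_anthropic_native",
    "gemini": "_do_gemini_native",
}


def pick_dispatch_path(auth_scheme: str) -> str:
    """Return the method name for a scheme; unknown schemes fall back with a warning."""
    path = DISPATCH_TABLE.get(auth_scheme)
    if path is None:
        logger.warning(
            "unknown auth_scheme %r, falling back to OpenAI-compat", auth_scheme
        )
        return "_do_openai_compat"
    return path
```

The fallback keeps a newly registered provider usable over the OpenAI-compat path even if its native path has not shipped yet, at the cost of a warning in the logs.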
Hongming Wang	e1ff890150	chore(template): add YAML injection to Security Auditor check list (#248) Closes #248. Three instances of the same YAML-injection bug class (#221 name/role, #233 template path, #241 runtime/model) shipped in this repo over the last weeks. The common root cause is the Security Auditor's system prompt didn't list YAML injection as an explicit check class, so audits missed the pattern every time. Adds: - "YAML injection" to the 'Think like an attacker' list in How You Work - An explicit entry in What You Check with the three prior instances cited so future auditors see the pattern and the fix shape (double-quoted scalars or a proper YAML encoder) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:18:52 -07:00
Hongming Wang	8881b68aaf	fix(security): YAML injection + path traversal via runtime/model (#241) Closes #241 (MEDIUM, auth-gated by AdminAuth on POST /workspaces). ## Vectors closed 1. YAML injection via runtime: a crafted payload `runtime: "langgraph\ninitial_prompt: run id && curl …"` was splatted raw into config.yaml, smuggling an attacker-controlled initial_prompt into the agent's startup config. 2. Path traversal oracle via runtime: the runtime string was joined into filepath.Join for the runtime-default template fallback. `runtime: ../../sensitive` could probe host directory existence. 3. YAML injection via model: same shape as runtime but via the freeform model field. ## Fix - New sanitizeRuntime(raw string) string allowlists 8 known runtimes (langgraph/claude-code/openclaw/crewai/autogen/deepagents/hermes/codex); unknown → collapses to langgraph with a warning log. Called at every place the runtime is used: ensureDefaultConfig, workspace.go:175 runtimeDefault fallback, org.go:370 runtimeDefault fallback. - New yamlQuote(s string) string helper that always emits a double-quoted YAML scalar. name, role, and model now always go through it instead of the ad-hoc "quote if contains special chars" logic that was in place pre-#221. Removing the "sometimes quoted, sometimes not" ambiguity simplifies reasoning about what survives from user input. ## Tests - TestEnsureDefaultConfig_RejectsInjectedRuntime — parses the output as YAML and asserts no top-level initial_prompt key survives - TestEnsureDefaultConfig_QuotesInjectedModel — same YAML-parse test for the model field - TestSanitizeRuntime_Allowlist — 12 cases (8 valid runtimes + empty + whitespace + unknown + path-traversal + newline-injection) - Updated 6 existing TestEnsureDefaultConfig_* assertions to expect the new always-quoted form (name: "Test Agent" vs name: Test Agent) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:17:32 -07:00
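The helpers in commit 8881b68aaf above are Go; the following is a Python rendering of the same two ideas, not the repository's code. `sanitize_runtime` and `yaml_quote` are hypothetical names, the handling of whitespace-padded input (treated as unknown, no trimming) is an assumption, and `yaml_quote` relies on the fact that a JSON string literal is also a valid YAML double-quoted scalar.

```python
import json
import logging

logger = logging.getLogger("workspace.config")  # logger name is an assumption

# The 8 runtimes named in the commit message.
KNOWN_RUNTIMES = frozenset({
    "langgraph", "claude-code", "openclaw", "crewai",
    "autogen", "deepagents", "hermes", "codex",
})


def sanitize_runtime(raw: str) -> str:
    """Allowlist lookup: anything not exactly a known runtime collapses to langgraph."""
    if raw in KNOWN_RUNTIMES:
        return raw
    logger.warning("unknown runtime %r, defaulting to langgraph", raw)
    return "langgraph"


def yaml_quote(value: str) -> str:
    """Always emit a double-quoted scalar; newlines and quotes become escapes."""
    return json.dumps(value)


def render_config(name: str, runtime: str, model: str) -> str:
    """Build a tiny config.yaml body from untrusted fields."""
    return (
        f"name: {yaml_quote(name)}\n"
        f"runtime: {sanitize_runtime(runtime)}\n"
        f"model: {yaml_quote(model)}\n"
    )
```

With the newline-injection payload from the commit, the runtime collapses to `langgraph` and an injected model value stays on one line inside its quotes, so no extra top-level key can appear in the rendered file.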
Hongming Wang	81d5b658ad	fix(security): gate /channels/discover behind AdminAuth (#250) Closes #250 (MEDIUM). POST /channels/discover was on the open router and accepted an arbitrary Telegram bot token, turning it into: 1. A free bot-token validity oracle — attackers can enumerate/probe tokens at zero cost 2. A drive-by deleteWebhook side effect — every call invokes tgbotapi.DeleteWebhookConfig against the target bot, breaking legitimate webhook delivery 3. A rate-limit amplifier — getMe + deleteWebhook + getUpdates per call Fix: one-line addition of middleware.AdminAuth(db.DB) to the route, matching its actual intent (platform-operator admin helper, not a per-workspace route). Pattern mirrors /admin/liveness, /events, and /bundles/export from PR #167. No new test: AdminAuth behavior is covered by wsauth_middleware_test.go; this PR only wires it onto an additional route. The load-bearing code comment references #250 so future reviewers can't revert without an issue citation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:11:22 -07:00
Hongming Wang	0dd4f25952	feat(canvas): cookie consent banner with privacy-preserving default Adds a GDPR/ePrivacy-compliant cookie banner to the canvas root layout. Privacy-preserving default: no optional cookies are considered accepted until the user clicks "Accept all". Clicking "Necessary only" or dismissing records "rejected" and the banner does not re-appear until the cookie-policy version bumps. - New CookieConsent component wired into src/app/layout.tsx so it renders on every canvas route - Persists decision to localStorage as {decision, decidedAt, version} - Versioned schema: bumping CURRENT_VERSION re-prompts every user - Exports hasConsent() helper for feature code that needs to gate analytics / functional cookies on user choice - ARIA: role=dialog + aria-labelledby/aria-describedby so screen readers announce it as a dialog - Same storage key + schema as the control-plane legal-page banner (see molecule-controlplane PR #XX) so a user who accepts on one surface does not re-see the banner on the other Tests: 12 Vitest cases covering first-visit render, accept/reject persistence, version re-prompt, invalid-JSON recovery, privacy link attrs, ARIA markup, and the hasConsent helper under every state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:01:48 -07:00
Hongming Wang	9b82bce7ef	docs: GDPR Art. 17 erasure runbook Documents the 4-step hard-delete cascade implemented in molecule-controlplane PR #29 (Stripe → Redis → Infra → DB rows), how to read the org_purges audit table when a purge fails, the 30-day GDPR deadline, and what the cascade deliberately does NOT cover (WorkOS users, LLM provider history, Langfuse traces). Cross-referenced from the "SaaS ops" block in CLAUDE.md so future agents find it when handling erasure requests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 12:42:16 -07:00
rabbitblood	3985d80220	feat(hermes): Phase 2a — native Anthropic Messages API dispatch path Completes the Hermes adapter's native-SDK plan for the provider that gains the most from leaving OpenAI-compat: Anthropic. OpenAI-compat works fine for plain text turns on every provider (Phase 1 covered that with one code path for all 15 providers), but Anthropic's Messages API has first-class tool use, vision content blocks, and extended thinking that the OpenAI-compat shim strips or mis-translates. Rather than ship all native SDK paths in one PR (Anthropic + Gemini + future), this lands Anthropic only (Phase 2a). Gemini is Phase 2b, shipping after a production measurement window on Phase 2a. ## Design Providers now dispatch by `auth_scheme` field. Phase 1 added the field but every provider used `"openai"`. Phase 2 flips `anthropic` to `"anthropic"` and wires a second inference path keyed on that: - `HermesA2AExecutor._do_openai_compat(task_text)` — existing path, handles 14 of 15 providers (Nous Portal, OpenRouter, OpenAI, xAI, Gemini, Qwen, GLM, Kimi, MiniMax, DeepSeek, Groq, Together, Fireworks, Mistral) - `HermesA2AExecutor._do_anthropic_native(task_text)` — NEW, uses the official `anthropic` Python SDK's `AsyncAnthropic().messages.create(...)` - `HermesA2AExecutor._do_inference(task_text)` — dispatches by `self.provider_cfg.auth_scheme` Unknown schemes fall back to OpenAI-compat with a logged warning, so future provider additions don't crash if a native SDK path ships late. ## Fail-loud on missing SDK `_do_anthropic_native` raises a clear `RuntimeError` with install instructions if the `anthropic` package is missing at runtime: Hermes anthropic native path requires the `anthropic` package. Install in the workspace image with `pip install anthropic>=0.39.0` or set HERMES provider=openrouter to route Claude models through OpenRouter's OpenAI-compat shim instead. 
This is intentional: silent fallback would mask fidelity loss (tool_use blocks become plain text, vision gets stripped). Loud failure is better. `requirements.txt` adds `anthropic>=0.39.0` so the package is baked into the workspace-template image build path. Operators building custom workspace images without anthropic installed get the loud error. ## Back-compat - `create_executor(hermes_api_key="x")` → still routes to Nous Portal (`auth_scheme="openai"`), unchanged - `HERMES_API_KEY` env var → still first in RESOLUTION_ORDER - `OPENROUTER_API_KEY` env var → still second - All 14 OpenAI-compat providers unchanged — they take the same code path as before - ONLY `anthropic` provider changes behavior: it now uses the native Messages API instead of the `/v1/chat/completions` compat shim ## Constructor signature change `HermesA2AExecutor.__init__` now takes `provider_cfg: ProviderConfig` instead of separate `api_key + base_url + model`. The three fields are derived from `provider_cfg` + an optional model override. This is a breaking change for any external caller building an executor directly, but the only documented public entry point is `create_executor()`, which is updated in the same commit to pass the cfg through. ## Test coverage `workspace-template/tests/test_hermes_phase2_dispatch.py` — 7 new tests: 1. `test_anthropic_entry_has_anthropic_scheme` — registry flip 2. `test_all_other_providers_still_openai_scheme` — regression guard 3. `test_dispatch_openai_scheme_calls_openai_compat` — happy path 4. `test_dispatch_anthropic_scheme_calls_anthropic_native` — happy path 5. `test_dispatch_unknown_scheme_falls_back_to_openai_compat` — forward compat 6. `test_anthropic_native_raises_clear_error_when_sdk_missing` — fail-loud 7. `test_create_executor_passes_provider_cfg` — constructor wiring All pass locally (pytest tests/test_hermes_phase2_dispatch.py -v, 0.04s). Phase 1 tests unchanged: `test_hermes_providers.py` 26/26 pass, no regressions. 
## What's NOT in this PR (Phase 2b) - Gemini native `generateContent` path (`auth_scheme="gemini"`) - Streaming support across both native paths (`astream_messages`, `streamGenerateContent`) - Tool calling on the anthropic native path (the `tools` + `tool_use` blocks) - Vision content blocks (image_url → anthropic image blocks) - Extended thinking parameter passthrough All scoped in `project_hermes_multi_provider.md`. Phase 2a is the minimum viable native Anthropic dispatch — single-turn text in, text out, no tools. ## Related - Phase 1 baseline (already in main): #208 — provider registry + OpenAI-compat path - Queued memory: `project_hermes_multi_provider.md` — full phased plan - Triggering directive: CEO 2026-04-15 — "once current works are cleared, focus on supporting hermes agent"	2026-04-15 12:23:56 -07:00
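The dispatch rule described in the Phase 2a message above can be sketched roughly as follows. This is a minimal illustration, not the adapter's code: `DispatchSketch` and the two-field `ProviderConfig` are hypothetical stand-ins, the methods are synchronous for brevity (the real ones are presumably async), and the two inference paths just return tagged strings instead of calling any SDK.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("hermes_dispatch_sketch")


@dataclass
class ProviderConfig:
    # Hypothetical minimal shape; the real ProviderConfig likely has more fields.
    name: str
    auth_scheme: str = "openai"


class DispatchSketch:
    """Illustrative stand-in for the executor's dispatch-by-auth_scheme logic."""

    def __init__(self, provider_cfg: ProviderConfig):
        self.provider_cfg = provider_cfg

    def _do_openai_compat(self, task_text: str) -> str:
        # Placeholder for the OpenAI-compatible path.
        return f"openai-compat:{task_text}"

    def _do_anthropic_native(self, task_text: str) -> str:
        # Placeholder for the native Anthropic Messages API path.
        return f"anthropic-native:{task_text}"

    def _do_inference(self, task_text: str) -> str:
        scheme = self.provider_cfg.auth_scheme
        if scheme == "anthropic":
            return self._do_anthropic_native(task_text)
        if scheme != "openai":
            # Unknown schemes fall back to OpenAI-compat with a logged warning.
            logger.warning(
                "unknown auth_scheme %r; falling back to OpenAI-compat", scheme
            )
        return self._do_openai_compat(task_text)
```

Falling through to the OpenAI-compatible path for unrecognized schemes matches the forward-compatibility behavior the message describes: a provider whose native path has not shipped yet still gets an answer instead of a crash.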
Hongming Wang	fda2b56532	docs: sync CLAUDE.md + PLAN.md + edit-history with 2026-04-15 overnight sweep Captures ~27 PRs merged across both repos this session: security hardening cluster (#94/#99/#106/#110/#119/#162/#155/#167/#185/#200/#203/#209/#233), data-integrity fixes (#212/#224/#236), CI runner migration (#186), platform/scheduler reliability (#95/#149/#207/#206), workspace runtime features (#205/#208/#198/#216/#225/#235/#231), code-review follow-ups (#228/#232). Updated counts: 816 Go (+70), 1180 Python (+40), 453 vitest (unchanged — UI/a11y patches), 97 jest (unchanged). CLAUDE.md additions: - Idle Loop section (#205) under Architectural Patterns - Admin auth middleware variants section linking docs/runbooks/admin-auth.md - Migration runner section explaining the .down.sql filter (#212) - Per-route auth notes in the API table (PATCH field-whitelist, CanvasOrBearer on PUT /canvas/viewport, AdminAuth on bundles/events/templates-import/approvals-pending/admin-liveness) - Database section updated with workspace_auth_tokens auto-revoke (#110), scheduler.error_detail surfacing (#206), workspace_schedules.last_status 'skipped' state (#207) PLAN.md additions: - New Recently launched (overnight sweep) section with full PR/issue index - Phase status updated (B–G now complete, H partial) - Live infrastructure deltas (migration fix, token rotation, legal pages) - Outstanding items consolidated Edit-history file expanded from the tick-9 stub to a full session record covering malware cleanup, CI runner migration, security cluster, data integrity, infra/feature/code-review batches, and outstanding user actions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 12:16:24 -07:00
Hongming Wang	eb6796042b	Merge pull request #236 from Molecule-AI/fix/issue-234-log-injection fix(security): #234 — sanitize source_id spoof log line via %q	2026-04-15 12:04:32 -07:00
Hongming Wang	8efc06aca6	fix(security): #234 — sanitize source_id spoof log line via %q Closes #234 LOW. The security log I added in PR #228 (code-review follow-up) echoed body.SourceID with %s, which preserves any \n / \r that json.Unmarshal decoded from the attacker's JSON. An authenticated workspace could have injected fake log entries by sending source_id="evil\ntimestamp=FORGED level=INFO msg=fake". Fix: use %q on both body_source_id and c.ClientIP(). Go-quoted string escapes all control characters so multi-line payloads stay on a single log line. One-line fix. Regression test: TestActivityHandler_Report_SourceIDLogInjection exercises the code path with a literal \n in source_id. Assertion is limited to "handler returns 403 cleanly with no panic" because capturing log output in Go tests requires a log.SetOutput swap, which adds noise for little signal vs just reading the test log output (visible when running with -v). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 12:04:26 -07:00
Hongming Wang	7d89fd4ea4	Merge pull request #235 from Molecule-AI/fix/issue-220-initial-idle-prompt-auth fix(workspace-template): #220 — auth_headers on initial_prompt + idle loop	2026-04-15 12:02:06 -07:00
Hongming Wang	1c41c30310	fix(workspace-template): #220 — send auth_headers() on initial_prompt + idle loop Closes #220. #215 added auth_headers() to /registry/register but missed two other self-post paths from the same workspace container: 1. initial_prompt (_do_send_sync) — fires once on first boot after the A2A server is ready. Posts to /workspaces/:id/a2a via the platform proxy. Missing headers meant the initial prompt got silently dropped as 401 on any token-enrolled workspace. 2. idle loop (_post_sync) — fires every idle_interval_seconds while the workspace has no active task (#205 pattern). Same proxy path, same missing headers, same silent 401 in multi-tenant mode. Both now build headers as {"Content-Type": "application/json", **auth_headers()} auth_headers() returns {"Authorization": "Bearer <token>"} when /auth-token.txt exists, empty dict otherwise (first boot before register issues the token). The existing lazy-bootstrap fail-open on the platform side covers the empty-dict case. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 12:02:01 -07:00
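The header construction described in the #220 fix above might look like the sketch below. It assumes only what the message states: the token lives in `/auth-token.txt`, and a missing file yields an empty dict. The `token_path` parameter, the `build_headers` helper name, and the treatment of an empty token file are illustrative assumptions.

```python
from pathlib import Path

# Path named in the commit message; overridable here so the sketch is testable.
TOKEN_PATH = Path("/auth-token.txt")


def auth_headers(token_path: Path = TOKEN_PATH) -> dict:
    """Return a bearer header if a token file exists, else an empty dict."""
    try:
        token = token_path.read_text().strip()
    except OSError:
        # First boot: register has not issued a token yet.
        return {}
    # Assumption: an empty token file is treated like a missing one.
    return {"Authorization": f"Bearer {token}"} if token else {}


def build_headers(token_path: Path = TOKEN_PATH) -> dict:
    # Hypothetical helper: merges the auth header into the JSON content type,
    # mirroring the {"Content-Type": ..., **auth_headers()} shape in the message.
    return {"Content-Type": "application/json", **auth_headers(token_path)}
```

Because the dict unpacking is a no-op when `auth_headers()` returns `{}`, the same call site works both before and after the token is issued.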
Hongming Wang	533beb8da3	Merge pull request #233 from Molecule-AI/fix/issue-226-create-template-traversal fix(security): #226 — gate POST /workspaces template against traversal	2026-04-15 12:00:32 -07:00
Hongming Wang	3d561b24ef	fix(security): #226 — gate POST /workspaces template/runtime against traversal Closes #226 MEDIUM. WorkspaceHandler.Create joined payload.Template directly into filepath.Join(configsDir, template) without validating it stayed inside configsDir. An attacker posting Template="../../etc" would have the provisioner walk and mount arbitrary host directories into the workspace container. Same fix as #103 (POST /org/import): use the existing resolveInsideRoot helper to reject absolute paths and any ".." that escapes the root. Applied at both call sites in workspace.go: 1. Synchronous runtime detection before DB insert — 400 on bad input 2. Async provisioning goroutine — early return, logs the rejection (belt-and-suspenders; the create path already blocks) No test added inline because the existing resolveInsideRoot suite (org_path_test.go) already covers absolute / traversal / prefix-sibling / empty-path / deep-subpath cases. A duplicate test for the workspace handler wouldn't add signal. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 12:00:26 -07:00
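The containment check behind the #226 fix above can be illustrated with a Python analogue. The actual `resolveInsideRoot` helper is Go code whose implementation is not shown in this log, so the function below is a hypothetical equivalent written only to show the idea: reject absolute paths and any `..` sequence that resolves outside the root. Rejecting empty input is an assumption. It requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside_root(root: str, user_path: str) -> Path:
    """Join user_path onto root and fail if the result escapes root."""
    if not user_path:
        # Assumption: empty input is rejected rather than mapped to root itself.
        raise ValueError("empty path")
    candidate = Path(user_path)
    if candidate.is_absolute():
        raise ValueError("absolute paths are not allowed")
    root_resolved = Path(root).resolve()
    # resolve() collapses ".." segments, so escapes become visible here.
    resolved = (root_resolved / candidate).resolve()
    # is_relative_to compares whole path components, so a sibling such as
    # "configs-evil" does not pass as being inside "configs".
    if not resolved.is_relative_to(root_resolved):
        raise ValueError("path escapes root")
    return resolved
```

Comparing path components instead of string prefixes is what covers the prefix-sibling case mentioned in the message.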
Hongming Wang	07cd0a2dfa	Merge pull request #224 from Molecule-AI/fix/issue-221-yaml-injection fix(security): sanitize workspace name before YAML interpolation	2026-04-15 11:59:10 -07:00
Hongming Wang	dc5c4b9dfa	Merge pull request #231 from Molecule-AI/fix/160-sdk-error-probe fix(claude-sdk): #160 — probe CLI directly when SDK swallowed the real stderr	2026-04-15 11:58:59 -07:00
Hongming Wang	6ebfefa64f	Merge pull request #227 from Molecule-AI/test/issue-217-plugin-pipeline-tests test(handlers): unit test suite for plugins_install_pipeline.go	2026-04-15 11:58:56 -07:00
Hongming Wang	edf72a80f8	Merge pull request #225 from Molecule-AI/fix/issue-215-register-auth fix(workspace-template): add auth_headers() to /registry/register POST	2026-04-15 11:58:53 -07:00
Hongming Wang	f1899aa67f	Merge pull request #216 from Molecule-AI/feat/tr-idle-prompt chore(template): enable idle-loop pilot on Technical Researcher (#205 follow-up)	2026-04-15 11:58:50 -07:00
Hongming Wang	81d05bd7e3	Merge pull request #223 from Molecule-AI/fix/reno-stars-browser-automation-default fix(reno-stars): default plugins to browser-automation	2026-04-15 11:58:46 -07:00
Hongming Wang	76bd2a2ccf	fix(security): #221 — quote name as YAML scalar instead of stripping newlines The original fix stripped \n/\r but left the rest in place, then relied on a substring-based test which was over-strict (the escaped fragment still contained the banned substring as bytes). Better approach: emit the name as a double-quoted YAML scalar with all escape sequences (\\, \", \n, \r, \t) handled inline. This is the canonical YAML-safe way to embed user input — no injection possible because every control character is either escaped or rejected by the YAML parser inside the scalar context. Test rewritten to parse the output as YAML and verify: 1. parsed[\"name\"] equals the literal attacker input (payload preserved) 2. no banned top-level keys leaked to the parsed map 3. legitimate default keys (description/version/tier/model) still present Updated the two existing tests that asserted the unquoted name format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 11:58:16 -07:00
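The quoting approach described in the #221 fix above can be sketched in Python as follows. This is an illustration, not the handler's code: `yaml_double_quoted` is a hypothetical name, it escapes exactly the five sequences the message lists, and its choice to raise on any other control character is an assumption.

```python
def yaml_double_quoted(value: str) -> str:
    """Render value as a double-quoted YAML scalar on a single line."""
    escapes = {
        "\\": "\\\\",
        '"': '\\"',
        "\n": "\\n",
        "\r": "\\r",
        "\t": "\\t",
    }
    out = []
    for ch in value:
        if ch in escapes:
            out.append(escapes[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Assumption: remaining control characters are refused outright.
            raise ValueError(f"unsupported control character: {ch!r}")
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


def render_name_line(name: str) -> str:
    # Hypothetical usage: the name can no longer start a new top-level key,
    # because an embedded newline is emitted as the two characters "\n".
    return f"name: {yaml_double_quoted(name)}"
```

With this, an input such as a name followed by a newline and `admin: true` stays inside the scalar, so a YAML parser reads it back as one string value for `name`.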

345 Commits