Commit Graph

17 Commits

Author SHA1 Message Date
Hongming Wang
2fa6f7c6cd docs: sync documentation with 2026-04-14 evening-tick merges (#63, #64, #65)
- edit-history/2026-04-14.md: append tick-4 section covering the 12
  modular guardrail plugins (#63), global-secrets auto-restart fan-out
  (#64, fixes issue #15), and synthetic restart-context A2A message
  (#65, fixes issue #19 Layer 1; Layer 2 deferred to issue #66).
- CLAUDE.md: bump Go test count 699 -> 726 (measured); note global
  secrets auto-restart on SetGlobal/DeleteGlobal in the route table;
  add Workspace Lifecycle paragraph for the restart-context message
  and its system:restart-context caller prefix.
- PLAN.md: bump Go test count in the coverage table; record issues
  #15 and #19 Layer 1 as launched; add new Backlog entry for the
  Layer 2 follow-up (issue #66).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:54:04 -07:00
Hongming Wang
8b896b1a56
feat(plugins): split guardrails into 12 modular plugins (#63)
Noteworthy: large-addition (+1601 lines, 12 new plugins) + modifies core AgentskillsAdaptor (SDK + runtime copies, drift-guarded). All 7 gates pass, 0 critical findings. Cross-vendor review skipped (tool unavailable).
2026-04-14 12:47:24 -07:00
Hongming Wang
d0c5626df1
Merge pull request #61 from Molecule-AI/feat/claude-hooks-upgrade
feat(.claude): ambient hooks + sequential-thinking MCP + /triage command
2026-04-14 12:25:54 -07:00
Hongming Wang
90a513d1d0 feat(plugins): split guardrails into 12 modular plugins
Replaces the proposed monolithic molecule-guardrails plugin with 12
single-purpose plugins users can install à la carte. Powered by a
small extension to the AgentskillsAdaptor base class so any plugin can
ship hooks/, commands/, and a settings-fragment.json without writing a
custom adapter.

## Base adapter changes

workspace-template/plugins_registry/builtins.py + sdk/python/molecule_plugin/builtins.py
(both copies — drift-tested):
- New _install_claude_layer() helper called at the end of install()
- Conditionally copies hooks/ → /configs/.claude/hooks/ (preserving exec bit)
- Conditionally copies commands/*.md → /configs/.claude/commands/
- Conditionally merges settings-fragment.json into /configs/.claude/settings.json
  with ${CLAUDE_DIR} placeholder rewritten to the workspace's absolute install
  path. Existing user hooks are preserved (deep-merge by event name).
- All steps no-op when the plugin doesn't ship the corresponding files,
  so existing skill+rule plugins (molecule-dev, superpowers, ecc,
  browser-automation) are unchanged.

Drift test (tests/test_plugins_builtins_drift.py) still passes.

## 12 new plugins

Hook plugins (ambient enforcement):
- molecule-careful-bash       — refuses destructive bash; ships careful-mode skill
- molecule-freeze-scope       — locks edits via .claude/freeze
- molecule-audit-trail        — appends every Edit/Write to audit.jsonl
- molecule-session-context    — auto-loads cron-learnings at session start
- molecule-prompt-watchdog    — injects warnings on destructive prompt keywords

Skill plugins (on-demand):
- molecule-skill-code-review        — 16-criteria multi-axis review
- molecule-skill-cross-vendor-review — adversarial second-model review
- molecule-skill-llm-judge          — deliverable-vs-request scoring
- molecule-skill-update-docs        — post-merge doc sync
- molecule-skill-cron-learnings     — operational-memory JSONL format

Workflow plugins (slash commands):
- molecule-workflow-triage  — /triage full PR-triage cycle
- molecule-workflow-retro   — /retro + cron-retro skill, weekly retrospective

Each ships only what it needs — most have just plugin.yaml + skills/ or
hooks/ + adapter (one-line stub: `from plugins_registry.builtins import
AgentskillsAdaptor as Adaptor`). Total ~120 files but each plugin is
small and self-contained.

## Verification

- python3 -m molecule_plugin validate plugins/molecule-* → all 13 valid
  (12 new + pre-existing molecule-dev)
- End-to-end install smoke test on representative samples: hook plugin
  (molecule-careful-bash), skill-only plugin (molecule-skill-code-review),
  workflow plugin (molecule-workflow-triage). All produce expected
  /configs/ tree, settings.json paths rewritten, exec bits preserved,
  zero warnings.
- workspace-template pytest tests/test_plugins_builtins_drift.py → passes
  (SDK + runtime stay in sync).

## CLAUDE.md repo-doc updated

Lists all 12 new plugins under the existing Plugins section, organized
by category (hook / skill / workflow). Each entry one line, recommend-
together hints where dependencies make sense.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:20:04 -07:00
Hongming Wang
3f8eb7406f feat(.claude): ambient hooks + sequential-thinking MCP + /triage command
Skills are opt-in (I have to remember to invoke them). Hooks are
ambient — they fire on every matching event automatically. This PR
moves the careful-mode and learnings discipline from "doc I should
read" to "harness-enforced behavior I cannot bypass".

## 6 new hooks (.claude/hooks/)

- pre-bash-careful — REFUSES git push --force to main, rm -rf at root,
  DROP TABLE against prod schema. WARNs on force-with-lease, gh pr/
  issue close. Tested: blocks the destructive case, allows safe ones.
- pre-edit-freeze — implements /freeze. When .claude/freeze contains
  a path glob, edits outside it are denied. Tested: edits to PLAN.md
  blocked when scope locked to platform/internal/handlers/.
- session-start-context — auto-loads last 20 cron-learnings, freeze
  status, open-PR/issue counts as additionalContext at session start.
  Tested: emits valid SessionStart JSON.
- post-edit-audit — appends every Edit/Write to .claude/audit.jsonl
  (gitignored). One-line records {ts, tool, file, ok}. Tested writes.
- user-prompt-tag — injects context warnings when prompt mentions
  force-push, drop-table, "delete all", "push to main", etc. Tested:
  emits warning for "force push the fix to main".
- subagent-stop-judge — off by default; touch .claude/judge-subagents
  to enable. When on, prompts orchestrator to verify subagent's last
  message addresses the original task. Cost-free MVP (no LLM call yet).

All hooks are Python (jq isn't on the hook PATH on macOS — Python is).
Shared helpers in _lib.py: read_input, deny_pretooluse, add_context,
warn_to_stderr.

## settings.json — wires all 6 hooks

Adds SessionStart, UserPromptSubmit, SubagentStop event handlers.
Existing PreToolUse:Bash + PostToolUse:Edit chains gain the new hooks
alongside the existing ones (check-inbox.sh, echo reminder).

Adds @modelcontextprotocol/server-sequential-thinking MCP server for
structured chain-of-thought scratchpad — useful when triaging multiple
PRs in parallel without losing context.

## .claude/commands/triage.md — slash command shortcut

Manual /triage runs the same flow as the c5074cd5 hourly cron, on
demand. Saves ~4KB of prompt every invocation by pulling the cron
prompt out of working memory.

## CLAUDE.md additions

New "Agent operating rules (auto-loaded — read first)" section right
after Ecosystem Context. Documents:
- Cron / triage discipline (read learnings, treat docs PRs touching
  CLAUDE.md/PLAN.md as noteworthy, write per-tick reflections)
- Table of all 6 hooks active in this repo
- List of skills and how to invoke them
- Standing rules (inviolable) consolidated for the agent

This block auto-loads into every conversation context — free behavior
change without me remembering to opt in.

## .gitignore

audit.jsonl, freeze, judge-subagents, per-tick-reflections.md are all
local operational state, never committed.

## Verification

- echo '{"tool_input":{"command":"git push --force origin main"}}' |
  bash pre-bash-careful.sh → emits deny JSON ✓
- Same for git status (safe command) → empty output, exit 0 ✓
- pre-edit-freeze with .claude/freeze=platform/handlers/ blocks
  edits to PLAN.md, allows edits inside the locked path ✓
- post-edit-audit appends valid JSONL ✓
- session-start-context emits additionalContext with PR/issue counts ✓
- user-prompt-tag emits warning for "force push to main" prompt ✓
- python3 -c "json.load(open('.claude/settings.json'))" → valid ✓

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:00:35 -07:00
Hongming Wang
479f1776a8 feat(provisioner): configurable per-tier memory/CPU limits (#14)
Resolves #14. ApplyTierConfig now reads TIER{2,3,4}_MEMORY_MB and
TIER{2,3,4}_CPU_SHARES env vars, falling back to the compiled defaults
agreed in the issue:

  - T2: 512 MiB  / 1024 shares (1 CPU)  — unchanged baseline
  - T3: 2048 MiB / 2048 shares (2 CPU)  — new cap (previously uncapped)
  - T4: 4096 MiB / 4096 shares (4 CPU)  — new cap (previously uncapped)

CPU_SHARES follows Docker's 1024 = 1 CPU convention; internally the
value is translated to NanoCPUs for a hard allocation so behaviour
remains deterministic across hosts. Malformed or non-positive env
values silently fall back to the default.

Behaviour change note: T3 and T4 previously had no explicit cap.
Operators who relied on unlimited can set very large TIERn_MEMORY_MB /
TIERn_CPU_SHARES values; a follow-up can add unset-means-unlimited
semantics if required.

Tests:
  - TestGetTierMemoryMB_DefaultsMatchLegacy
  - TestGetTierMemoryMB_EnvOverride (covers malformed + zero fallback)
  - TestGetTierCPUShares_EnvOverride
  - TestApplyTierConfig_T3_UsesEnvOverride (wiring)
  - TestApplyTierConfig_T3_DefaultCap (documents the new cap)

Docs: .env.example section + CLAUDE.md platform env-vars list updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:49:37 -07:00
Hongming Wang
dcf8a07887 docs: sync documentation with 2026-04-14 tick-3 merges (#53, #54, #55)
- docs/edit-history/2026-04-14.md: append tick-3 section covering the
  admin test-token route (#53), the prior-tick doc-sync PR (#54), and
  the hermes required_env alignment (#55). Record measured test counts
  (Go +4 for the TestAdminTestToken_* quartet).
- CLAUDE.md: bump Go test count 695 → 699 with a note pointing at the
  new quartet. Route-table row and env-var mentions for the admin
  route already landed with #53; verified on main.
- .env.example: add MOLECULE_ENABLE_TEST_TOKENS with a comment about
  the prod-hidden default. Closes the code-review doc-sync flag from
  #53 (var was in CLAUDE.md but missing from .env.example).

No PLAN.md / README.md / README.zh-CN.md update needed — none of the
three merges expose a user-visible surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:37:42 -07:00
Hongming Wang
0832f997f0 feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6)
Adds a gated admin endpoint that mints a fresh workspace bearer token on
demand, eliminating the register-race currently used by
test_comprehensive_e2e.sh (PR #5 follow-up).

- New handler admin_test_token.go: returns 404 unless MOLECULE_ENV != production
  or MOLECULE_ENABLE_TEST_TOKENS=1. Hides route existence in prod (404 not 403).
- Mints via wsauth.IssueToken; logs at INFO without the token itself.
- Verifies workspace exists before minting (missing -> 404, never 500).
- Tests cover prod-hidden, enable-flag-overrides-prod, missing workspace,
  and happy-path + token-validates round trip.
- tests/e2e/_lib.sh gains e2e_mint_test_token helper for downstream adoption.
- CLAUDE.md updated with route + env vars.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 09:35:26 -07:00
Hongming Wang
a2ea1b183b
Merge pull request #49 from Molecule-AI/feat/hermes-pr2
feat(hermes): implement create_executor() with HERMES_API_KEY / OPENROUTER_API_KEY fallback + smoke tests
2026-04-14 08:16:15 -07:00
Hongming Wang
56068a7698 docs(hermes): document HERMES_API_KEY env var and runtime-table row
Adds HERMES_API_KEY to .env.example with a cross-reference to the
OPENROUTER_API_KEY fallback, and adds the hermes runtime row to the
CLAUDE.md runtime table so the new adapter is discoverable alongside
its siblings (langgraph, claude-code, openclaw, crewai, autogen,
deepagents).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 08:11:37 -07:00
Hongming Wang
d15a202be2
Merge pull request #16 from Molecule-AI/fix/infra-compose-external-network
fix(infra): attach docker-compose.infra.yml services to molecule-monorepo-net + add Temporal
2026-04-13 22:19:36 -07:00
Hongming Wang
870faabced docs(gate-5): document Temporal dependency in CLAUDE.md/PLAN.md 2026-04-13 21:38:25 -07:00
Hongming Wang
2f0c708d81 fix: gate-5 document browser-automation plugin in CLAUDE.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:37:29 -07:00
Hongming Wang
fd2c3fbfc4 docs: correct stale test counts in PR #9
Subagent used old CLAUDE.md baselines instead of measuring actuals.
Verified counts via pytest --collect-only and go test -v:

- Go platform: 536 → 695 (+159 off)
- Python workspace-template: 1084 → 1140 (+56 off)
- SDK python: 121 → 132 (+11 off)
- Canvas vitest: 357 (already correct)
- MCP jest: 97 (already correct)

Files updated:
- CLAUDE.md (Unit Tests block)
- PLAN.md (Test Coverage table + totals: 2,295 → 2,421)
- docs/development/local-development.md
- docs/edit-history/2026-04-13.md (session test-count table +
  explanatory note about why the Python and SDK counts didn't
  change today)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:51:12 -07:00
Hongming Wang
5429880b67 docs: sync documentation with 2026-04-13 merges (PRs #1-#8)
Covers today's quality + infra pass: brand/structural cleanup, MCP
per-domain refactor (1697 -> 89 lines, 87 tools), canvas ConfirmDialog
unification, 4 platform handler decompositions (+47 Go tests), E2E
hardening for Phase 30.1/30.6 auth, and two new CI jobs (e2e-api +
shellcheck).

- CLAUDE.md: updated test counts (Go 536, canvas 357, SDK 121, MCP 97,
  workspace 1084); documented MCP per-domain split + new api.ts; added
  handler-decomposition section; Phase 30.1/30.6 auth callout; new
  CI jobs; env vars cross-ref.
- PLAN.md: Phase 31 "Quality + Infra Pass" marked shipped; test totals
  refreshed to 2,295.
- README.zh-CN.md: license badge MIT -> BSL 1.1; added BSL license block.
- docs/api-protocol/platform-api.md: registry table gains Auth column
  documenting Phase 30.1 bearer-token and Phase 30.6 X-Workspace-ID
  requirements on heartbeat/update-card/discover/peers.
- docs/development/local-development.md: updated stale test counts;
  added e2e-api + shellcheck CI jobs; pointer to new testing-e2e.md.
- docs/development/testing-e2e.md: new — per-script reference, auth
  prerequisites, local run, CI coverage, adding-a-new-check checklist.
- docs/edit-history/2026-04-13.md: top-of-file summary section added
  spanning PRs #1-#8; preserves existing per-feature entries below.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:46:28 -07:00
Hongming Wang
af931aa8da refactor(mcp-server): DRY envelopes, typed apiCall, explicit re-exports
Second-pass cleanup after the monolith split. Addresses every issue
from the code-review pass.

Core additions in src/api.ts:
- toMcpResult(data) + toMcpText(text): single source of truth for the
  MCP text-content envelope (was ~87 duplicated literals)
- ApiError type + isApiError(v) guard: typed discriminated-union for
  the error-by-value pattern; replaces open-coded shape checks
- apiCall<T = unknown>: generic so callers can document expected
  response shape without unchecked "as" casts

Bulk cleanups across all 12 tools/*.ts:
- Every handler now returns toMcpResult(data) or toMcpText(text)
- Open-coded "typeof obj === 'object' && 'error' in obj" in
  remote_agents.ts replaced with isApiError(v)
- Extracted initialCanvasPosition() helper out of
  handleCreateWorkspace; explains why random seeding exists
- Added runtime/workspace_dir/workspace_access to create_workspace
  zod schema (previously accepted by handler but hidden from clients)

src/index.ts:
- Replaced "export * from" with explicit named re-exports so the
  public surface is auditable and future name collisions fail loudly

Tests:
- createServer() smoke test that records every srv.tool(...) call and
  asserts 87 registered tools unique by name. Catches future PRs that
  forget to wire a registerXxxTools(srv).

Docs:
- Fix broken relative links in sdk/python/molecule_agent/README.md
  (was ../../examples/ from inside sdk/python/, should be ../examples/)
- Update stale "61 tools" -> "87 tools" in CLAUDE.md + main() log

Verification:
- npm run build clean
- npx jest -> 97/97 passed (was 96; +1 smoke test)
- grep "content: [{ type: \"text\" as const" src/tools/ -> 0 matches
- No file over 216 lines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:26:17 -07:00
Hongming Wang
24fec62d7f initial commit — Molecule AI platform
Forked clean from public hackathon repo (Starfire-AgentTeam, BSL 1.1)
with full rebrand to Molecule AI under github.com/Molecule-AI/molecule-monorepo.

Brand: Starfire → Molecule AI.
Slug: starfire / agent-molecule → molecule.
Env vars: STARFIRE_* → MOLECULE_*.
Go module: github.com/agent-molecule/platform → github.com/Molecule-AI/molecule-monorepo/platform.
Python packages: starfire_plugin → molecule_plugin, starfire_agent → molecule_agent.
DB: agentmolecule → molecule.

History truncated; see public repo for prior commits and contributor
attribution. Verified green: go test -race ./... (platform), pytest
(workspace-template 1129 + sdk 132), vitest (canvas 352), build (mcp).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:55:37 -07:00