molecule-core

Author	SHA1	Message	Date
Hongming Wang	e778d477ba	Merge pull request #381 from Molecule-AI/chore/triage-operator-handoff chore(handoff): Triage Operator role + agent handoff package	2026-04-15 23:43:05 -07:00
Hongming Wang	16ae320bed	Merge pull request #381 from Molecule-AI/chore/triage-operator-handoff chore(handoff): Triage Operator role + agent handoff package	2026-04-15 23:43:05 -07:00
Hongming Wang	2e43bb7271	chore(handoff): triage-operator role + agent handoff package Wraps up a ~100-tick autonomous triage session by converting the prior operator's institutional knowledge into standing, checked-in artifacts so the next team picking up the hourly PR + issue cycle can drop in without re-discovering everything from scratch. ## New role: Triage Operator Peer to Dev Lead, Research Lead, Documentation Specialist under PM. Owns the 7-gate PR verification + issue-pickup cycle across both molecule-monorepo and molecule-controlplane. NOT an engineer — never writes logic, never makes design calls. Mechanical fixes on other people's branches + verified-merge only. Runs on cron `17 * * * *`. On first boot reads four handoff files + the last 20 lines of cron-learnings.jsonl, waits for the scheduled tick (no first-boot triage — known stale-state footgun). ## Files org-templates/molecule-dev/triage-operator/ - system-prompt.md (48 lines) — role prompt loaded at boot. Standing rules, verification discipline, escalation paths. - philosophy.md (135 lines) — 10 principles each tied to a real incident. Rule 2 ("tool succeeded ≠ work done") references the WorkOS refresh-token + missing-migration saga. Rule 3 (authority verification) references PR #370 CEO directive hold. - playbook.md (234 lines) — step-by-step tick flow (Step 0 guards → 1 list → 2 seven-gate → 3 docs sync → 4 issue pickup → 5 report). Expected 5–30 min wall-clock. When-not-to-triage. - handoff-notes.md (146 lines) — point-in-time state for the NEXT operator arriving fresh. 15 PRs merged this session, in-flight items, design-call backlog with recommendations per issue. - SKILL.md (152 lines) — installable skill spec. Invocation, inputs, outputs, required composed skills, edge cases, output format. .claude/AGENT_HANDOFF.md (206 lines) — top-level handoff for any Claude Code agent working this repo (not just the triage operator). 
The 10 principles (one-liners), communication style the user expects, currently-live state, open items, what NOT to do, break-glass escalation conditions. Points at triage-operator/philosophy.md for full incident context. ## Wiring org.yaml gains a Triage Operator workspace block under PM with: - tier: 3, model: opus - 8 plugins (careful-bash, session-context, cron-learnings, code-review, cross-vendor-review, llm-judge, update-docs, hitl) - Hourly cron at `:17` with the full Step 0–5 flow inline as prompt - canvas position (1150, 250), peer to Documentation Specialist ## Why this ships now The 30-min manual triage cron was cancelled per CEO direction. The role moves to another team. Without this handoff package they'd be rediscovering the same incident classes I shipped fixes for (#318 fail-open, #327 cross-tenant decrypt, #351 tokenless grace, WorkOS refresh-token saga, missing migration runner). The philosophy file gives them the scar tissue in ~10 min of reading; the playbook gives them the steps; the SKILL gives them an invocable entry point. No code changes outside org.yaml. Existing TestPlugins_UnionWithDefaults still passes (verified in platform test run). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 23:41:01 -07:00
Hongming Wang	df5821a251	chore(handoff): triage-operator role + agent handoff package Wraps up a ~100-tick autonomous triage session by converting the prior operator's institutional knowledge into standing, checked-in artifacts so the next team picking up the hourly PR + issue cycle can drop in without re-discovering everything from scratch. ## New role: Triage Operator Peer to Dev Lead, Research Lead, Documentation Specialist under PM. Owns the 7-gate PR verification + issue-pickup cycle across both molecule-monorepo and molecule-controlplane. NOT an engineer — never writes logic, never makes design calls. Mechanical fixes on other people's branches + verified-merge only. Runs on cron `17 * * * *`. On first boot reads four handoff files + the last 20 lines of cron-learnings.jsonl, waits for the scheduled tick (no first-boot triage — known stale-state footgun). ## Files org-templates/molecule-dev/triage-operator/ - system-prompt.md (48 lines) — role prompt loaded at boot. Standing rules, verification discipline, escalation paths. - philosophy.md (135 lines) — 10 principles each tied to a real incident. Rule 2 ("tool succeeded ≠ work done") references the WorkOS refresh-token + missing-migration saga. Rule 3 (authority verification) references PR #370 CEO directive hold. - playbook.md (234 lines) — step-by-step tick flow (Step 0 guards → 1 list → 2 seven-gate → 3 docs sync → 4 issue pickup → 5 report). Expected 5–30 min wall-clock. When-not-to-triage. - handoff-notes.md (146 lines) — point-in-time state for the NEXT operator arriving fresh. 15 PRs merged this session, in-flight items, design-call backlog with recommendations per issue. - SKILL.md (152 lines) — installable skill spec. Invocation, inputs, outputs, required composed skills, edge cases, output format. .claude/AGENT_HANDOFF.md (206 lines) — top-level handoff for any Claude Code agent working this repo (not just the triage operator). 
The 10 principles (one-liners), communication style the user expects, currently-live state, open items, what NOT to do, break-glass escalation conditions. Points at triage-operator/philosophy.md for full incident context. ## Wiring org.yaml gains a Triage Operator workspace block under PM with: - tier: 3, model: opus - 8 plugins (careful-bash, session-context, cron-learnings, code-review, cross-vendor-review, llm-judge, update-docs, hitl) - Hourly cron at `:17` with the full Step 0–5 flow inline as prompt - canvas position (1150, 250), peer to Documentation Specialist ## Why this ships now The 30-min manual triage cron was cancelled per CEO direction. The role moves to another team. Without this handoff package they'd be rediscovering the same incident classes I shipped fixes for (#318 fail-open, #327 cross-tenant decrypt, #351 tokenless grace, WorkOS refresh-token saga, missing migration runner). The philosophy file gives them the scar tissue in ~10 min of reading; the playbook gives them the steps; the SKILL gives them an invocable entry point. No code changes outside org.yaml. Existing TestPlugins_UnionWithDefaults still passes (verified in platform test run). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 23:41:01 -07:00
Hongming Wang	04c3911c0e	fix(security): forward Authorization header in transcript proxy (#405) (#380) The platform's GET /workspaces/:id/transcript proxy was constructing the outbound request without an Authorization header. The workspace's /transcript endpoint (hardened in #287/#328) fails closed when the header is absent, so every transcript call in production returned 401 from the workspace. Fix: after WorkspaceAuth validates the incoming bearer token, the handler now forwards it verbatim via req.Header.Set("Authorization", ...). Forwarding is safe because the token has already been validated by the middleware. Tests: - TestTranscript_ForwardsAuthHeader: was t.Skip'd as a bug marker; now active. Verifies the Authorization header reaches the workspace stub. - TestTranscript_NoAuthHeader_PassesThrough: new. Verifies that a missing header produces no synthetic Authorization on the upstream call, and the workspace 401 is faithfully relayed. Identified by QA audit 2026-04-16. Co-authored-by: QA Engineer <qa-engineer@molecule-ai.internal> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 23:38:07 -07:00
Hongming Wang	52bdadbd6d	fix(security): forward Authorization header in transcript proxy (#405) (#380) The platform's GET /workspaces/:id/transcript proxy was constructing the outbound request without an Authorization header. The workspace's /transcript endpoint (hardened in #287/#328) fails closed when the header is absent, so every transcript call in production returned 401 from the workspace. Fix: after WorkspaceAuth validates the incoming bearer token, the handler now forwards it verbatim via req.Header.Set("Authorization", ...). Forwarding is safe because the token has already been validated by the middleware. Tests: - TestTranscript_ForwardsAuthHeader: was t.Skip'd as a bug marker; now active. Verifies the Authorization header reaches the workspace stub. - TestTranscript_NoAuthHeader_PassesThrough: new. Verifies that a missing header produces no synthetic Authorization on the upstream call, and the workspace 401 is faithfully relayed. Identified by QA audit 2026-04-16. Co-authored-by: QA Engineer <qa-engineer@molecule-ai.internal> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 23:38:07 -07:00
Hongming Wang	1ff544eba8	feat(adapters): add gemini-cli runtime adapter (closes #332) (#379) Adds a `gemini-cli` workspace runtime backed by Google's Gemini CLI (@google/gemini-cli, ~101k ★, Apache 2.0). Mirrors the claude-code adapter pattern: Docker image installs the CLI, CLIAgentExecutor drives the subprocess, A2A MCP tools wire via ~/.gemini/settings.json. Changes: - workspace-template/adapters/gemini_cli/: new adapter (Dockerfile, adapter.py, __init__.py, requirements.txt); setup() seeds GEMINI.md from system-prompt.md and injects A2A MCP server into settings.json - workspace-template/cli_executor.py: adds gemini-cli to RUNTIME_PRESETS (--yolo flag, -p prompt, --model, GEMINI_API_KEY env auth); adds mcp_via_settings preset flag to skip --mcp-config injection for runtimes that own their own settings file - workspace-configs-templates/gemini-cli/: default config.yaml + system-prompt.md template - tests/test_adapters.py: adds gemini-cli to expected adapter set - CLAUDE.md: documents new runtime row in the image table Requires: GEMINI_API_KEY global secret. Build: bash workspace-template/build-all.sh gemini-cli Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 23:30:00 -07:00
Hongming Wang	0aec76400a	feat(adapters): add gemini-cli runtime adapter (closes #332) (#379) Adds a `gemini-cli` workspace runtime backed by Google's Gemini CLI (@google/gemini-cli, ~101k ★, Apache 2.0). Mirrors the claude-code adapter pattern: Docker image installs the CLI, CLIAgentExecutor drives the subprocess, A2A MCP tools wire via ~/.gemini/settings.json. Changes: - workspace-template/adapters/gemini_cli/: new adapter (Dockerfile, adapter.py, __init__.py, requirements.txt); setup() seeds GEMINI.md from system-prompt.md and injects A2A MCP server into settings.json - workspace-template/cli_executor.py: adds gemini-cli to RUNTIME_PRESETS (--yolo flag, -p prompt, --model, GEMINI_API_KEY env auth); adds mcp_via_settings preset flag to skip --mcp-config injection for runtimes that own their own settings file - workspace-configs-templates/gemini-cli/: default config.yaml + system-prompt.md template - tests/test_adapters.py: adds gemini-cli to expected adapter set - CLAUDE.md: documents new runtime row in the image table Requires: GEMINI_API_KEY global secret. Build: bash workspace-template/build-all.sh gemini-cli Co-authored-by: DevOps Engineer <devops@molecule.ai> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 23:30:00 -07:00
Hongming Wang	592fe6d7f7	feat(org-templates): add 7-role marketing team sub-tree (#373) Add Marketing Lead + 6 reports as a peer sub-tree of PM under the CEO: DevRel Engineer, Product Marketing Manager, Content Marketer, Community Manager, SEO Growth Analyst, Social Media / Brand. - Marketing Lead: tier-3 Opus CMO-equivalent with a 5-min orchestrator pulse (minutes 4/9/14/... offset from Dev Lead's 2/7/12/...) that dispatches cross-role work, reviews drafts, and routes cross-team asks back to PM. - DevRel + PMM: tier-3 Opus (technical writing + positioning judgment). Each has an idle_prompt for proactive issue-claim plus an hourly evolution cron (DevRel = sample-coverage audit, PMM = competitor diff against docs/ecosystem-watch.md). - Content / Community / SEO / Social: tier-2 Sonnet with idle_prompts for backlog-pull (matches the #205 idle-loop pattern proven on Technical Researcher + Market Analyst + Competitive Intelligence). Each has an hourly cron tuned to its surface. - category_routing gets 6 new keys (content, positioning, community, growth, social, devrel) so audit_summary messages fan out correctly. - Canvas positions lay out the marketing cluster to the right of PM/Dev Lead (x=1000-1300, y=50/250/400) so the graph stays readable. Each role also gets a system-prompt.md under its files_dir with responsibilities, team interfaces, conventions, and self-review gates (molecule-skill-llm-judge or molecule-hitl depending on risk). Per CEO directive 2026-04-16 ("comprehensive marketing team"). This is PR 1 of 2; a follow-up will add cross-tree A2A conventions and wire DevRel ↔ Backend Engineer / PMM ↔ Competitive Intelligence delegations. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 23:20:04 -07:00
Hongming Wang	b2e1631640	feat(org-templates): add 7-role marketing team sub-tree (#373) Add Marketing Lead + 6 reports as a peer sub-tree of PM under the CEO: DevRel Engineer, Product Marketing Manager, Content Marketer, Community Manager, SEO Growth Analyst, Social Media / Brand. - Marketing Lead: tier-3 Opus CMO-equivalent with a 5-min orchestrator pulse (minutes 4/9/14/... offset from Dev Lead's 2/7/12/...) that dispatches cross-role work, reviews drafts, and routes cross-team asks back to PM. - DevRel + PMM: tier-3 Opus (technical writing + positioning judgment). Each has an idle_prompt for proactive issue-claim plus an hourly evolution cron (DevRel = sample-coverage audit, PMM = competitor diff against docs/ecosystem-watch.md). - Content / Community / SEO / Social: tier-2 Sonnet with idle_prompts for backlog-pull (matches the #205 idle-loop pattern proven on Technical Researcher + Market Analyst + Competitive Intelligence). Each has an hourly cron tuned to its surface. - category_routing gets 6 new keys (content, positioning, community, growth, social, devrel) so audit_summary messages fan out correctly. - Canvas positions lay out the marketing cluster to the right of PM/Dev Lead (x=1000-1300, y=50/250/400) so the graph stays readable. Each role also gets a system-prompt.md under its files_dir with responsibilities, team interfaces, conventions, and self-review gates (molecule-skill-llm-judge or molecule-hitl depending on risk). Per CEO directive 2026-04-16 ("comprehensive marketing team"). This is PR 1 of 2; a follow-up will add cross-tree A2A conventions and wire DevRel ↔ Backend Engineer / PMM ↔ Competitive Intelligence delegations. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 23:20:04 -07:00
Hongming Wang	06c205da77	Merge pull request #370 from Molecule-AI/feat/engineers-pick-up-issues feat(template): engineers pick up issues proactively (CEO 2026-04-16 directive)	2026-04-15 22:53:44 -07:00
Hongming Wang	e557259aad	Merge pull request #370 from Molecule-AI/feat/engineers-pick-up-issues feat(template): engineers pick up issues proactively (CEO 2026-04-16 directive)	2026-04-15 22:53:44 -07:00
rabbitblood	4f9ef2dd0e	feat(template): engineers pick up issues proactively (CEO 2026-04-16 directive) CEO directive verbatim: "devs should pick up issues and declare that its assigned to them, PM and leaders regularly check in. dont just rely on outside reviewer". Adds `idle_prompt` + `idle_interval_seconds: 600` to Frontend Engineer, Backend Engineer, and DevOps Engineer. Each engineer now polls open GH issues matching its specialty, claims unassigned ones via `gh issue edit --add-assignee @me`, leaves a public comment declaring the pickup, and commits memory to prevent double-pickup on the next tick. Previously engineers were reactive-only per the #159 orchestrator/worker split. The CEO is correcting that: devs should be a true self-organizing unit, not a work-queue that only advances when an outside reviewer dispatches. ## Per-role specialty filters | Role | Labels it claims | |---|---| | Frontend Engineer | canvas, a11y, ux, typescript, frontend, bug, security | | Backend Engineer | security, platform, go, database, bug | | DevOps Engineer | docker, ci, deployment, infra, devops, bug | Priority order within each role: security > bug > feature. ## Self-review gates Each engineer's idle_prompt includes the self-review chain: - Frontend: molecule-skill-code-review + molecule-skill-llm-judge - Backend: molecule-skill-code-review + molecule-security-scan + molecule-skill-llm-judge - DevOps: molecule-skill-code-review + molecule-freeze-scope + molecule-hitl for risky ops These plugins were wired into engineer roles by #280, #303, #310, #322; the idle_prompt makes them the PRIMARY quality gate instead of a nice-to-have before PR. Matches the "team self-regulates, don't rely on outside reviewer" spirit. 
## Hard rules (same shape as researcher idle_prompts from #216/#321) - Max 1 claim per tick (1 `gh issue edit --add-assignee` call) - Never take someone else's assigned issue - Under 90 seconds wall-clock for the claim + plan step - Don't double-pick: check `task-assigned:<role>` memory first - No busy-work fabrication: write "<role>-idle HH:MM — no work" if nothing matches ## What this does NOT change - Leaders' orchestrator pulses still dispatch (#159) — this is the TAIL pickup, not the primary dispatch path. Dev Lead still prioritizes via its own pulse. - PR merging still goes through reviewer per `feedback_never_merge_prs.md`. This directive is about the QUALITY GATE (team self-review, peer review via Dev Lead's pulse) not about bypassing merge approval. - Destructive/irreversible ops still need explicit human ack via molecule-hitl's @requires_approval decorator. ## Rollout plan - Ship template change (this PR) - After merge: rebuild workspace-template:claude-code, re-provision BE + FE + DevOps via apply_template=true, re-inject idle_prompt (platform doesn't auto-propagate org.yaml to live configs — tracked separately) - Measure: 24h of activity_logs. Should see `a2a_receive` events every 10 min per engineer, response bodies mentioning claim decisions or idle-clean states, and `gh issue edit` events showing up as assignees. ## Related - `feedback_devs_pick_up_issues_leaders_check_in.md` — memory saved last cycle - #159 orchestrator/worker split (leaders dispatch) - #216 / #321 researcher idle_prompts (same pattern applied to researchers) - `project_north_star_24_7.md` — team self-regulation is the north-star	2026-04-15 22:49:10 -07:00
rabbitblood	90d68ca039	feat(template): engineers pick up issues proactively (CEO 2026-04-16 directive) CEO directive verbatim: "devs should pick up issues and declare that its assigned to them, PM and leaders regularly check in. dont just rely on outside reviewer". Adds `idle_prompt` + `idle_interval_seconds: 600` to Frontend Engineer, Backend Engineer, and DevOps Engineer. Each engineer now polls open GH issues matching its specialty, claims unassigned ones via `gh issue edit --add-assignee @me`, leaves a public comment declaring the pickup, and commits memory to prevent double-pickup on the next tick. Previously engineers were reactive-only per the #159 orchestrator/worker split. The CEO is correcting that: devs should be a true self-organizing unit, not a work-queue that only advances when an outside reviewer dispatches. ## Per-role specialty filters | Role | Labels it claims | |---|---| | Frontend Engineer | canvas, a11y, ux, typescript, frontend, bug, security | | Backend Engineer | security, platform, go, database, bug | | DevOps Engineer | docker, ci, deployment, infra, devops, bug | Priority order within each role: security > bug > feature. ## Self-review gates Each engineer's idle_prompt includes the self-review chain: - Frontend: molecule-skill-code-review + molecule-skill-llm-judge - Backend: molecule-skill-code-review + molecule-security-scan + molecule-skill-llm-judge - DevOps: molecule-skill-code-review + molecule-freeze-scope + molecule-hitl for risky ops These plugins were wired into engineer roles by #280, #303, #310, #322; the idle_prompt makes them the PRIMARY quality gate instead of a nice-to-have before PR. Matches the "team self-regulates, don't rely on outside reviewer" spirit. 
## Hard rules (same shape as researcher idle_prompts from #216/#321) - Max 1 claim per tick (1 `gh issue edit --add-assignee` call) - Never take someone else's assigned issue - Under 90 seconds wall-clock for the claim + plan step - Don't double-pick: check `task-assigned:<role>` memory first - No busy-work fabrication: write "<role>-idle HH:MM — no work" if nothing matches ## What this does NOT change - Leaders' orchestrator pulses still dispatch (#159) — this is the TAIL pickup, not the primary dispatch path. Dev Lead still prioritizes via its own pulse. - PR merging still goes through reviewer per `feedback_never_merge_prs.md`. This directive is about the QUALITY GATE (team self-review, peer review via Dev Lead's pulse) not about bypassing merge approval. - Destructive/irreversible ops still need explicit human ack via molecule-hitl's @requires_approval decorator. ## Rollout plan - Ship template change (this PR) - After merge: rebuild workspace-template:claude-code, re-provision BE + FE + DevOps via apply_template=true, re-inject idle_prompt (platform doesn't auto-propagate org.yaml to live configs — tracked separately) - Measure: 24h of activity_logs. Should see `a2a_receive` events every 10 min per engineer, response bodies mentioning claim decisions or idle-clean states, and `gh issue edit` events showing up as assignees. ## Related - `feedback_devs_pick_up_issues_leaders_check_in.md` — memory saved last cycle - #159 orchestrator/worker split (leaders dispatch) - #216 / #321 researcher idle_prompts (same pattern applied to researchers) - `project_north_star_24_7.md` — team self-regulation is the north-star	2026-04-15 22:49:10 -07:00
Hongming Wang	829e4bf89b	Merge pull request #369 from Molecule-AI/chore/eco-watch-2026-04-18 All CI green. Docs-only: adds AMD GAIA + ClawRun ecosystem survey entries.	2026-04-15 22:46:53 -07:00
Hongming Wang	4b467c37a8	Merge pull request #369 from Molecule-AI/chore/eco-watch-2026-04-18 All CI green. Docs-only: adds AMD GAIA + ClawRun ecosystem survey entries.	2026-04-15 22:46:53 -07:00
Research Lead	dff50f5927	chore(eco-watch): 2026-04-18 survey — AMD GAIA + ClawRun Add two new entries to docs/ecosystem-watch.md: - AMD GAIA (amd/gaia, ~1.2k ⭐, MIT, v0.17.2 April 10 2026): AMD-backed local-first agent framework with MCP client support, RAG, vision, and voice. Hardware-locked to Ryzen AI but signals local/privacy-first positioning. @tool decorator pattern worth borrowing for workspace adapters. - ClawRun (clawrun-sh/clawrun, ~84 ⭐, Apache 2.0, 45 releases): Closest architectural match we've tracked — hosting/lifecycle layer with sandbox, heartbeat, snapshot/resume, channels, and cost tracking. Per-channel budget enforcement is a concrete gap in our workspace_channels. Filed #368. HEAD at survey time: `8db86df` Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 05:40:44 +00:00
Research Lead	3ed4038149	chore(eco-watch): 2026-04-18 survey — AMD GAIA + ClawRun Add two new entries to docs/ecosystem-watch.md: - AMD GAIA (amd/gaia, ~1.2k ⭐, MIT, v0.17.2 April 10 2026): AMD-backed local-first agent framework with MCP client support, RAG, vision, and voice. Hardware-locked to Ryzen AI but signals local/privacy-first positioning. @tool decorator pattern worth borrowing for workspace adapters. - ClawRun (clawrun-sh/clawrun, ~84 ⭐, Apache 2.0, 45 releases): Closest architectural match we've tracked — hosting/lifecycle layer with sandbox, heartbeat, snapshot/resume, channels, and cost tracking. Per-channel budget enforcement is a concrete gap in our workspace_channels. Filed #368. HEAD at survey time: `a4a89a3` Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 05:40:44 +00:00
Hongming Wang	8db86df330	Merge pull request #363 from Molecule-AI/chore/eco-watch-2026-04-17 All CI green. Docs-only: adds GenericAgent + OpenSRE ecosystem survey entries.	2026-04-15 22:14:23 -07:00
Hongming Wang	a4a89a30c1	Merge pull request #363 from Molecule-AI/chore/eco-watch-2026-04-17 All CI green. Docs-only: adds GenericAgent + OpenSRE ecosystem survey entries.	2026-04-15 22:14:23 -07:00
Research Lead	04ceb95142	chore(eco-watch): 2026-04-17 survey — GenericAgent + OpenSRE Add two new entries to docs/ecosystem-watch.md: - GenericAgent (lsdefine/GenericAgent, ~2.1k ⭐, MIT, v1.0 January 2026): self-evolving skill tree with a four-tier memory hierarchy (rules/indices/facts/skills/archives). Skill crystallisation at runtime is the automation of our install-time plugins model. Filed #361 to add named memory tiers to agent_memories. - OpenSRE (Tracer-Cloud/opensre, ~900 ⭐, Apache 2.0): AI SRE agent toolkit with 40+ production DevOps integrations and MCP support. Filed #362 to evaluate its adapters as a Molecule AI DevOps workspace skill pack. HEAD at survey time: `2e1fc8d` Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 05:11:01 +00:00
Research Lead	fe6e3032a4	chore(eco-watch): 2026-04-17 survey — GenericAgent + OpenSRE Add two new entries to docs/ecosystem-watch.md: - GenericAgent (lsdefine/GenericAgent, ~2.1k ⭐, MIT, v1.0 January 2026): self-evolving skill tree with a four-tier memory hierarchy (rules/indices/facts/skills/archives). Skill crystallisation at runtime is the automation of our install-time plugins model. Filed #361 to add named memory tiers to agent_memories. - OpenSRE (Tracer-Cloud/opensre, ~900 ⭐, Apache 2.0): AI SRE agent toolkit with 40+ production DevOps integrations and MCP support. Filed #362 to evaluate its adapters as a Molecule AI DevOps workspace skill pack. HEAD at survey time: `93fd546` Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 05:11:01 +00:00
Hongming Wang	2e1fc8d832	Merge pull request #360 from Molecule-AI/chore/issue-358-wsauth-dead-constants All CI green. Removes dead constants and stale comment left over from PR #357 grace-period test deletion (closes #358).	2026-04-15 22:05:37 -07:00
Hongming Wang	93fd5467e2	Merge pull request #360 from Molecule-AI/chore/issue-358-wsauth-dead-constants All CI green. Removes dead constants and stale comment left over from PR #357 grace-period test deletion (closes #358).	2026-04-15 22:05:37 -07:00
PM Bot	409a249ca6	chore(test): remove dead constants from wsauth_middleware_test.go (#358) PR #357 deleted the grace-period tests that used hasLiveTokenQuery and workspaceExistsQuery, but the constants themselves (and the stale comment describing the old HasAnyLiveToken-based dispatch) were not removed. Remove both dead const declarations and update the header comment to reflect the strict-enforcement contract introduced by #357. Closes #358. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 05:02:11 +00:00
PM Bot	e257cd80d4	chore(test): remove dead constants from wsauth_middleware_test.go (#358) PR #357 deleted the grace-period tests that used hasLiveTokenQuery and workspaceExistsQuery, but the constants themselves (and the stale comment describing the old HasAnyLiveToken-based dispatch) were not removed. Remove both dead const declarations and update the header comment to reflect the strict-enforcement contract introduced by #357. Closes #358. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 05:02:11 +00:00
Hongming Wang	d09e72c5fd	Merge pull request #357 from Molecule-AI/fix/issue-351-remove-tokenless-grace-period All CI green. Merges strict WorkspaceAuth — removes tokenless grace period that enabled zombie workspace enumeration (#351).	2026-04-15 21:57:17 -07:00
Hongming Wang	4e514aa59a	Merge pull request #357 from Molecule-AI/fix/issue-351-remove-tokenless-grace-period All CI green. Merges strict WorkspaceAuth — removes tokenless grace period that enabled zombie workspace enumeration (#351).	2026-04-15 21:57:17 -07:00
Hongming Wang	b2b0045913	fix(security): remove WorkspaceAuth tokenless grace period (#351) Severity HIGH. #318 closed the fake-UUID fail-open for WorkspaceAuth but left the grace period intact for real workspaces with no live tokens. Zombie test-artifact workspaces from prior DAST runs still exist in the DB with empty configs and no tokens, so they pass WorkspaceExists=true but HasAnyLiveToken=false, and so fell through the grace period, leaking every global-secret key name to any unauthenticated caller on the Docker network. Phase 30.1 shipped months ago; every production workspace has gone through multiple boot cycles and acquired a token since. The "legacy workspaces grandfathered" window no longer serves legitimate traffic. Removing it entirely is the cleanest fix, and it does NOT affect registration (which is on /registry/register, outside this middleware's scope). New contract (strict): every /workspaces/:id/* request MUST carry Authorization: Bearer <token-for-this-workspace> Any missing/mismatched/revoked/wrong-workspace bearer → 401. No existence check, no fallback. The wsauth.WorkspaceExists helper is kept in the package for any future caller but no longer used here. Tests: - TestWorkspaceAuth_351_NoBearer_Returns401_NoDBCalls: new, covers fake UUID / zombie / pre-token in one sub-table. Asserts zero DB calls on missing bearer. - Existing C4/C8 + #170 tests updated to drop the stale HasAnyLiveToken sqlmock expectations. - Renamed TestWorkspaceAuth_Issue170_SecretDelete_FailOpen_NoTokens to _NoTokensStillRejected and flipped the assertion from 200 to 401. - Dropped TestWorkspaceAuth_318_ExistsQueryError_Returns500 because the code path it covered no longer exists. Full platform test sweep green. Closes #351 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:52:44 -07:00
Hongming Wang	fa239217a0	fix(security): remove WorkspaceAuth tokenless grace period (#351) Severity HIGH. #318 closed the fake-UUID fail-open for WorkspaceAuth but left the grace period intact for real workspaces with no live tokens. Zombie test-artifact workspaces from prior DAST runs still exist in the DB with empty configs and no tokens, so they pass WorkspaceExists=true but HasAnyLiveToken=false, and so fell through the grace period, leaking every global-secret key name to any unauthenticated caller on the Docker network. Phase 30.1 shipped months ago; every production workspace has gone through multiple boot cycles and acquired a token since. The "legacy workspaces grandfathered" window no longer serves legitimate traffic. Removing it entirely is the cleanest fix, and it does NOT affect registration (which is on /registry/register, outside this middleware's scope). New contract (strict): every /workspaces/:id/* request MUST carry Authorization: Bearer <token-for-this-workspace> Any missing/mismatched/revoked/wrong-workspace bearer → 401. No existence check, no fallback. The wsauth.WorkspaceExists helper is kept in the package for any future caller but no longer used here. Tests: - TestWorkspaceAuth_351_NoBearer_Returns401_NoDBCalls: new, covers fake UUID / zombie / pre-token in one sub-table. Asserts zero DB calls on missing bearer. - Existing C4/C8 + #170 tests updated to drop the stale HasAnyLiveToken sqlmock expectations. - Renamed TestWorkspaceAuth_Issue170_SecretDelete_FailOpen_NoTokens to _NoTokensStillRejected and flipped the assertion from 200 to 401. - Dropped TestWorkspaceAuth_318_ExistsQueryError_Returns500 because the code path it covered no longer exists. Full platform test sweep green. Closes #351 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:52:44 -07:00
Hongming Wang	742d061787	Merge pull request #350 from Molecule-AI/chore/eco-watch-2026-04-16b chore(eco-watch): 2026-04-16b survey — AgentScope + Plannotator	2026-04-15 21:47:50 -07:00
Hongming Wang	75146f4314	Merge pull request #350 from Molecule-AI/chore/eco-watch-2026-04-16b chore(eco-watch): 2026-04-16b survey — AgentScope + Plannotator	2026-04-15 21:47:50 -07:00
Research Lead	93720565b0	chore(eco-watch): 2026-04-16b survey — AgentScope + Plannotator Add two new entries to docs/ecosystem-watch.md: - AgentScope (modelscope/agentscope, ~23.8k ⭐, Apache 2.0, v1.0.18 March 26 2026): Alibaba/ModelScope multi-agent framework with MCP support, MsgHub typed routing, and OpenTelemetry observability. No canvas or workspace lifecycle — framework-layer complement, not a platform competitor. - Plannotator (backnotprop/plannotator, ~4.3k ⭐, Apache 2.0+MIT, v0.17.10 April 13 2026): Browser-based agent plan annotation tool with structured feedback types (delete/insert/replace/comment). Directly informs our hitl.py feedback schema. Filed #349 to add structured feedback types to resume_task. HEAD at survey time: `0897f9e` Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 04:40:51 +00:00
Research Lead	6be5d09764	chore(eco-watch): 2026-04-16b survey — AgentScope + Plannotator Add two new entries to docs/ecosystem-watch.md: - AgentScope (modelscope/agentscope, ~23.8k ⭐, Apache 2.0, v1.0.18 March 26 2026): Alibaba/ModelScope multi-agent framework with MCP support, MsgHub typed routing, and OpenTelemetry observability. No canvas or workspace lifecycle — framework-layer complement, not a platform competitor. - Plannotator (backnotprop/plannotator, ~4.3k ⭐, Apache 2.0+MIT, v0.17.10 April 13 2026): Browser-based agent plan annotation tool with structured feedback types (delete/insert/replace/comment). Directly informs our hitl.py feedback schema. Filed #349 to add structured feedback types to resume_task. HEAD at survey time: `4196876` Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 04:40:51 +00:00
Hongming Wang	0897f9e59c	Merge pull request #346 from Molecule-AI/chore/issue-342-auditor-prompt-drift chore(auditor): close #319 + #337 prompt drift on Security Auditor (#342)	2026-04-15 21:31:06 -07:00
Hongming Wang	4196876c2b	Merge pull request #346 from Molecule-AI/chore/issue-342-auditor-prompt-drift chore(auditor): close #319 + #337 prompt drift on Security Auditor (#342)	2026-04-15 21:31:06 -07:00
Hongming Wang	d8183e16cc	Merge pull request #343 from Molecule-AI/fix/issue-337-webhook-secret-constant-time fix(security): constant-time webhook_secret comparison (#337)	2026-04-15 21:31:02 -07:00
Hongming Wang	c5d40b861b	Merge pull request #343 from Molecule-AI/fix/issue-337-webhook-secret-constant-time fix(security): constant-time webhook_secret comparison (#337)	2026-04-15 21:31:02 -07:00
Hongming Wang	c6a721fd56	Merge pull request #341 from Molecule-AI/fix/publish-platform-image-keychain-again fix(ci): disable osxkeychain credsStore on self-hosted runner (#199 follow-up)	2026-04-15 21:30:59 -07:00
Hongming Wang	af3d9904e1	Merge pull request #341 from Molecule-AI/fix/publish-platform-image-keychain-again fix(ci): disable osxkeychain credsStore on self-hosted runner (#199 follow-up)	2026-04-15 21:30:59 -07:00
Hongming Wang	c7477047c2	Merge pull request #338 from Molecule-AI/fix/issue-328-transcript-fail-closed fix(security): /transcript fails closed when auth token missing (#328)	2026-04-15 21:30:56 -07:00
Hongming Wang	e7bde9a919	Merge pull request #338 from Molecule-AI/fix/issue-328-transcript-fail-closed fix(security): /transcript fails closed when auth token missing (#328)	2026-04-15 21:30:56 -07:00
Hongming Wang	2da48dda13	chore(auditor): close #319 + #337 prompt drift on Security Auditor (#342 ) Two recent platform-level security changes (#319 channel_config encryption, #337 constant-time webhook_secret compare) were not reflected in the Security Auditor's system prompt or the schedule cron prompt. That meant the auditor wouldn't proactively look for the next instance of either class — a new credential field added to channel_config without being added to sensitiveFields, or a new secret comparison using raw `!=`, would slip through until a human happened to notice. Updated two files: 1. org-templates/molecule-dev/security-auditor/system-prompt.md Added two bullets to "What You Check": - Secret comparisons must use subtle.ConstantTimeCompare / crypto.timingSafeEqual (cites #337 as the repo's recent instance) - Secret storage at rest: any new channel_config credential field must be added to sensitiveFields and exercised in both the Encrypt (write) and Decrypt (read) boundary helpers, and the ec1: prefix must never leak into API responses (cites #319) 2. org-templates/molecule-dev/org.yaml Same two checks added to the Security Auditor's 12-hour cron prompt's "MANUAL REVIEW of every changed file" section. Wording is concrete enough to paste into a grep: "flag any `!=` / `==` / bytes.Equal against a user-supplied value that gates auth". Pure config / prompt — no code changes, no tests to write. YAML parse verified, TestPlugins_UnionWithDefaults still passes. Closes #342 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:24:34 -07:00
Hongming Wang	6b153ca3cb	chore(auditor): close #319 + #337 prompt drift on Security Auditor (#342 ) Two recent platform-level security changes (#319 channel_config encryption, #337 constant-time webhook_secret compare) were not reflected in the Security Auditor's system prompt or the schedule cron prompt. That meant the auditor wouldn't proactively look for the next instance of either class — a new credential field added to channel_config without being added to sensitiveFields, or a new secret comparison using raw `!=`, would slip through until a human happened to notice. Updated two files: 1. org-templates/molecule-dev/security-auditor/system-prompt.md Added two bullets to "What You Check": - Secret comparisons must use subtle.ConstantTimeCompare / crypto.timingSafeEqual (cites #337 as the repo's recent instance) - Secret storage at rest: any new channel_config credential field must be added to sensitiveFields and exercised in both the Encrypt (write) and Decrypt (read) boundary helpers, and the ec1: prefix must never leak into API responses (cites #319) 2. org-templates/molecule-dev/org.yaml Same two checks added to the Security Auditor's 12-hour cron prompt's "MANUAL REVIEW of every changed file" section. Wording is concrete enough to paste into a grep: "flag any `!=` / `==` / bytes.Equal against a user-supplied value that gates auth". Pure config / prompt — no code changes, no tests to write. YAML parse verified, TestPlugins_UnionWithDefaults still passes. Closes #342 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:24:34 -07:00
Hongming Wang	7af8f33bcc	fix(security): constant-time webhook_secret comparison (#337) Severity LOW. The /webhooks/:type handler compared the Telegram X-Telegram-Bot-Api-Secret-Token header against the decrypted webhook_secret using Go's `!=` operator, which short-circuits on the first mismatched byte. Under low-latency Docker-network conditions an attacker could time response latency byte-by-byte and converge on the real secret, then inject Telegram-formatted messages into any channel. Fix: switch to crypto/subtle.ConstantTimeCompare, whose running time depends only on the length of the inputs, not on their contents (it returns 0 immediately when the lengths differ). Same posture as the cdp-proxy token compare in host-bridge (which already used timingSafeEqual). Risk profile over the public internet is low (Telegram webhooks have natural jitter that masks the signal), but the defensive pattern matters for consistency across all secret comparisons. Closes #337 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:23:12 -07:00
Hongming Wang	50819500f0	fix(security): constant-time webhook_secret comparison (#337) Severity LOW. The /webhooks/:type handler compared the Telegram X-Telegram-Bot-Api-Secret-Token header against the decrypted webhook_secret using Go's `!=` operator, which short-circuits on the first mismatched byte. Under low-latency Docker-network conditions an attacker could time response latency byte-by-byte and converge on the real secret, then inject Telegram-formatted messages into any channel. Fix: switch to crypto/subtle.ConstantTimeCompare, whose running time depends only on the length of the inputs, not on their contents (it returns 0 immediately when the lengths differ). Same posture as the cdp-proxy token compare in host-bridge (which already used timingSafeEqual). Risk profile over the public internet is low (Telegram webhooks have natural jitter that masks the signal), but the defensive pattern matters for consistency across all secret comparisons. Closes #337 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 21:23:12 -07:00
Hongming Wang	94a9f92c50	fix(security): scope PausePollersForToken to requesting workspace (closes #329) CI 5/6 pass (E2E cancel = run-supersession pattern). Dev Lead review 04:21: ✅ Approved. Fixes cross-tenant token exposure: PausePollersForToken now scoped to requesting workspace_id via SQL WHERE clause. Closes #329.	2026-04-15 21:22:50 -07:00
Hongming Wang	a205c92428	fix(security): scope PausePollersForToken to requesting workspace (closes #329) CI 5/6 pass (E2E cancel = run-supersession pattern). Dev Lead review 04:21: ✅ Approved. Fixes cross-tenant token exposure: PausePollersForToken now scoped to requesting workspace_id via SQL WHERE clause. Closes #329.	2026-04-15 21:22:50 -07:00
Hongming Wang	9ea6fc23e0	chore(eco-watch): 2026-04-16 daily survey — Gemini CLI + open-multi-agent CI fully green. Dev Lead review: ✅ Approved. Docs-only: adds Gemini CLI and open-multi-agent entries to ecosystem-watch.md; files issues #332 (gemini-cli adapter) and #333 (PM goal-decomp skill).	2026-04-15 21:22:37 -07:00
Hongming Wang	12dc0ebdf2	chore(eco-watch): 2026-04-16 daily survey — Gemini CLI + open-multi-agent CI fully green. Dev Lead review: ✅ Approved. Docs-only: adds Gemini CLI and open-multi-agent entries to ecosystem-watch.md; files issues #332 (gemini-cli adapter) and #333 (PM goal-decomp skill).	2026-04-15 21:22:37 -07:00
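
The #337 commit above swaps a `!=` comparison for crypto/subtle.ConstantTimeCompare. A minimal sketch of that pattern follows. It is not the repository's handler code: the helper name `secretsEqual` and the empty-secret guard are assumptions added for illustration.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// secretsEqual is a hypothetical helper (not taken from the repository).
// It reports whether the header value presented by the caller matches the
// stored webhook secret, without short-circuiting on the first differing byte.
func secretsEqual(presented, stored string) bool {
	// Assumption: an unset secret must never authenticate anyone.
	// Without this guard, two empty strings would compare as equal.
	if stored == "" {
		return false
	}
	// ConstantTimeCompare returns 1 only when both slices have equal
	// length and equal contents; for equal-length inputs its running
	// time does not depend on where the first mismatch occurs.
	return subtle.ConstantTimeCompare([]byte(presented), []byte(stored)) == 1
}

func main() {
	fmt.Println(secretsEqual("s3cret", "s3cret")) // matching secret
	fmt.Println(secretsEqual("s3creT", "s3cret")) // same length, last byte differs
	fmt.Println(secretsEqual("", ""))             // unset secret is rejected
}
```

Because ConstantTimeCompare returns immediately when the lengths differ, the length of the secret can still be observed through timing; only the contents are protected.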


3663 Commits
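
The strict contract described in the #351 commit (every /workspaces/:id/* request carries a bearer token for that workspace, anything else gets 401, and a missing bearer triggers no DB call) can be sketched as below. This is an illustration under assumptions, not the repository's middleware: `TokenValidator`, `bearerToken`, and `authorize` are hypothetical names, and the validator stands in for whatever token-store lookup the platform actually performs.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// TokenValidator is a hypothetical hook standing in for the token-store
// lookup: it reports whether token is a live token for workspaceID.
type TokenValidator func(workspaceID, token string) bool

// bearerToken extracts the token from an "Authorization" header value of
// the form "Bearer <token>". It returns false when the header is missing,
// uses another scheme, or carries an empty token.
func bearerToken(header string) (string, bool) {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return "", false
	}
	token := strings.TrimSpace(strings.TrimPrefix(header, prefix))
	return token, token != ""
}

// authorize returns the HTTP status for a workspace-scoped request.
// There is no existence check and no fallback: a missing bearer is
// rejected before the validator (and so any DB access) is consulted.
func authorize(header, workspaceID string, validate TokenValidator) int {
	token, ok := bearerToken(header)
	if !ok {
		return http.StatusUnauthorized
	}
	if !validate(workspaceID, token) {
		return http.StatusUnauthorized
	}
	return http.StatusOK
}

func main() {
	calls := 0
	// Stub validator for the demo only; a real one would query a token store.
	validate := func(workspaceID, token string) bool {
		calls++
		return workspaceID == "ws-1" && token == "good-token"
	}

	fmt.Println(authorize("", "ws-1", validate), calls)                  // no bearer
	fmt.Println(authorize("Bearer good-token", "ws-1", validate), calls) // valid token
	fmt.Println(authorize("Bearer good-token", "ws-2", validate), calls) // wrong workspace
}
```

Returning a plain status code keeps the sketch testable without an HTTP server; in a real middleware the same decision would be made before calling the next handler.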