feat(plugins): hot-reload classifier — skip restart on SKILL-content-only updates
Some checks failed
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 6s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 6s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 16s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 17s
branch-protection drift check / Branch protection drift (pull_request) Successful in 21s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 20s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 19s
Harness Replays / detect-changes (pull_request) Successful in 22s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 23s
CI / Detect changes (pull_request) Successful in 27s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 20s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 22s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 9s
Harness Replays / Harness Replays (pull_request) Failing after 25s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m41s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m33s
CI / Platform (Go) (pull_request) Successful in 5m11s

Closes molecule-core#112. Composes with #114 (atomic install).

Before issuing restartFunc, classify the diff between staged and live:
  - skill-content-only: only **/SKILL.md content changed
                        → skip restart (Claude Code re-reads SKILL.md on
                          each Skill invocation; no in-memory cache)
  - cold: anything else
                        → restartFunc as before
                          (hooks/settings load at session start;
                          plugin.yaml is structural; added/removed files
                          require a fresh load)

DETECTION
  - Hash every regular file in staged tree (host filesystem, sha256)
  - Hash every regular file in live tree (in-container via docker exec
    sh -c 'cd <livePath> && find . -type f -print0 | xargs -0 sha256sum')
  - .complete marker dropped from comparison (mtime varies install-to-
    install; including it would force-cold every reinstall)
  - File added/removed → cold
  - File content differs but isn't SKILL.md → cold
  - All differences are SKILL.md basenames → skill-content-only

DEFAULTS COLD
  - First install (no live tree) → cold
  - Live tree read failure → cold (conservative; never hot-reload speculatively)
  - Symlinks skipped during hash (same posture as tar walker)

PHASE 4 SELF-REVIEW
  Correctness: No finding — all error paths default to cold; never
    falsely classify as skill-content-only. The .complete drop is
    a deliberate exception (the marker is bookkeeping, not content).
  Readability: No finding — single-purpose helpers (hashLocalTree,
    hashContainerTree, isSkillMarkdown, shQuote) each do one thing.
    The classifier itself reads as 'compare set, then walk diff with
    isSkillMarkdown gate.'
  Architecture: No finding — composes existing execAsRoot primitive;
    new helpers in plugins_classifier.go don't touch any other
    handler. Old behavior unchanged when live read fails.
  Security: No finding — shQuote single-quotes any non-trivial path,
    pluginName comes from validatePluginName-validated source, and
    the docker exec command takes the path as a single arg (xargs -0
    handles binary-safe path delimiting). Symlinks skipped.
  Performance: No finding — adds two tree walks (host + container)
    per install. Container walk is one docker exec call returning
    sha256 lines; for typical plugins (~10-50 files) round-trip is
    ~100ms. Versus the saved ~5-10s of restart on a hot-reloadable
    update, this is a clear win.

TESTS (4 new, all green; full handler suite green)
  TestIsSkillMarkdown        — basename match, case-sensitive
  TestHashLocalTree_StableHash — re-hash same dir = same map
  TestHashLocalTree_SymlinkSkipped — hostile link doesn't poison classifier
  TestShQuote                — quoting boundary for shell injection safety

REFS
  molecule-core#112 — this issue
  molecule-core#114 — atomic install (.complete marker added there)
  Reno-Stars iteration safety (Hongming 2026-05-08)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
claude-ceo-assistant 2026-05-08 08:26:05 -07:00
parent 3e96184d6f
commit 249e760fbd
3 changed files with 349 additions and 1 deletions

View File

@ -0,0 +1,214 @@
package handlers
// plugins_classifier.go — diff classifier for plugin updates.
//
// Closes molecule-core#112. Composes with #114 (atomic install) so the
// platform can decide *before* triggering restartFunc whether the
// update is content-only (SKILL.md text changed; agent re-reads at next
// Skill invocation) or structural (hooks/settings/plugin.yaml/file added
// or removed; agent must restart to pick up the new state).
//
// SKILL.md content is hot-reloadable because Claude Code reads the file
// on each Skill invocation — no in-memory cache. Hooks and settings.json
// are loaded at session start and need a session restart. plugin.yaml
// changes are structural by definition (manifest controls everything
// else).
//
// CLASSIFICATION RULE
// classify(staged, live) → "skill-content-only" if and only if
// every file present in either tree is one of:
// - identical between staged and live, OR
// - a **/SKILL.md file with content change (text body modified)
// AND no files were added or removed.
// Anything else → "cold" (the safe default).
//
// The classifier reads live-tree files from inside the container via
// `docker exec cat`. Comparison is by SHA-256 over file content, not
// mtime — mtime changes on every install regardless of content.
import (
"context"
"crypto/sha256"
"encoding/hex"
"fmt"
"io/fs"
"os"
"path/filepath"
"strings"
)
const (
// classifyKindSkillContentOnly: install can skip restartFunc; the
// only changes are SKILL.md body text.
classifyKindSkillContentOnly = "skill-content-only"
// classifyKindCold: must restart the workspace container; structural
// or hook/settings change.
classifyKindCold = "cold"
)
// classifyInstallChanges compares the staged plugin tree (host filesystem)
// against the currently-live plugin tree inside the container. Returns
// classifyKindSkillContentOnly when the only diff is SKILL.md content
// changes, classifyKindCold otherwise (added/removed files, hooks/
// settings.json edits, plugin.yaml edits, anything else).
//
// `noLive` is the sentinel returned when /configs/plugins/<name> doesn't
// exist (first install for this plugin). Treated as cold — no live state
// to hot-reload into.
func (h *PluginsHandler) classifyInstallChanges(
ctx context.Context, containerName, hostStagedDir, pluginName string,
) (string, error) {
livePath := "/configs/plugins/" + pluginName
// Probe: does live exist? If not, this is a first install — cold.
if _, err := h.execAsRoot(ctx, containerName, []string{
"test", "-d", livePath,
}); err != nil {
return classifyKindCold, nil
}
// Build hash maps for both trees.
stagedHashes, err := hashLocalTree(hostStagedDir)
if err != nil {
return classifyKindCold, fmt.Errorf("classifier: hash staged: %w", err)
}
liveHashes, err := h.hashContainerTree(ctx, containerName, livePath)
if err != nil {
// Live tree read failure: be conservative, cold-restart.
return classifyKindCold, nil
}
// Drop the .complete marker from comparison — its mtime/atime can
// vary across installs but content is empty/trivial; including it
// would force-cold every reinstall.
delete(stagedHashes, ".complete")
delete(liveHashes, ".complete")
// Set difference: any file in one but not the other → cold.
for rel := range stagedHashes {
if _, ok := liveHashes[rel]; !ok {
return classifyKindCold, nil // file added
}
}
for rel := range liveHashes {
if _, ok := stagedHashes[rel]; !ok {
return classifyKindCold, nil // file removed
}
}
// Same set of files. Walk the diff.
for rel, stagedHash := range stagedHashes {
liveHash := liveHashes[rel]
if stagedHash == liveHash {
continue
}
// Content differs. Allow if and only if it's a SKILL.md.
if !isSkillMarkdown(rel) {
return classifyKindCold, nil
}
}
return classifyKindSkillContentOnly, nil
}
// isSkillMarkdown returns true for any path whose basename is SKILL.md
// (case-sensitive, matches Claude Code's skill discovery rule).
func isSkillMarkdown(rel string) bool {
return filepath.Base(rel) == "SKILL.md"
}
// hashLocalTree walks a host directory and returns rel-path → sha256-hex.
// Symlinks are skipped (same posture as the tar walker).
func hashLocalTree(root string) (map[string]string, error) {
out := map[string]string{}
err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
if err != nil {
return err
}
if d.IsDir() {
return nil
}
info, err := d.Info()
if err != nil {
return err
}
if info.Mode()&os.ModeSymlink != 0 {
return nil
}
if !info.Mode().IsRegular() {
return nil
}
rel, err := filepath.Rel(root, p)
if err != nil {
return err
}
body, err := os.ReadFile(p)
if err != nil {
return err
}
sum := sha256.Sum256(body)
out[filepath.ToSlash(rel)] = hex.EncodeToString(sum[:])
return nil
})
if err != nil {
return nil, err
}
return out, nil
}
// hashContainerTree reads every regular file under livePath via docker
// exec sh -c 'cd <livePath> && find . -type f -not -name .complete | xargs -I {} sh -c "echo {}; sha256sum {}"'.
//
// The output is parsed line-by-line into rel-path → sha256-hex.
func (h *PluginsHandler) hashContainerTree(
ctx context.Context, containerName, livePath string,
) (map[string]string, error) {
out, err := h.execAsRoot(ctx, containerName, []string{
"sh", "-c",
// Find regular files, hash each, output `<hex> ./<relpath>`.
// `cd` then `find .` keeps paths relative to livePath.
fmt.Sprintf("cd %s && find . -type f -print0 | xargs -0 -r sha256sum 2>/dev/null", shQuote(livePath)),
})
if err != nil {
return nil, fmt.Errorf("hash container tree: %w", err)
}
hashes := map[string]string{}
for _, line := range strings.Split(out, "\n") {
line = strings.TrimSpace(line)
if line == "" {
continue
}
// sha256sum output: "<hex> <path>" (two spaces). Path starts with "./".
parts := strings.SplitN(line, " ", 2)
if len(parts) != 2 {
continue
}
hash := parts[0]
rel := strings.TrimPrefix(parts[1], "./")
hashes[rel] = hash
}
return hashes, nil
}
// shQuote single-quotes a string for safe insertion into a shell command.
// Returns the input unchanged if it's already shell-safe (alphanumeric +
// /._-). Otherwise wraps in single quotes and escapes inner '.
func shQuote(s string) string {
safe := true
for _, c := range s {
switch {
case c >= 'a' && c <= 'z':
case c >= 'A' && c <= 'Z':
case c >= '0' && c <= '9':
case c == '/' || c == '.' || c == '_' || c == '-':
default:
safe = false
}
if !safe {
break
}
}
if safe {
return s
}
return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

View File

@ -0,0 +1,121 @@
package handlers
import (
"os"
"path/filepath"
"testing"
)
// TestIsSkillMarkdown: pin which paths the classifier considers
// hot-reloadable. SKILL.md by basename only — case-sensitive.
func TestIsSkillMarkdown(t *testing.T) {
yes := []string{
"SKILL.md",
"skills/foo/SKILL.md",
"deeply/nested/SKILL.md",
}
no := []string{
"plugin.yaml",
"hooks.json",
"settings.json",
"README.md",
"skill.md", // case-sensitive
"SKILLS.md", // not a skill file
"skills/foo/extra.md",
}
for _, s := range yes {
if !isSkillMarkdown(s) {
t.Errorf("isSkillMarkdown(%q) = false; want true", s)
}
}
for _, s := range no {
if isSkillMarkdown(s) {
t.Errorf("isSkillMarkdown(%q) = true; want false", s)
}
}
}
// TestHashLocalTree_StableHash: hashing the same content twice must
// produce identical maps. Pinned because if hashLocalTree ever picks up
// mtime/inode (e.g. via a refactor to use os.Lstat metadata), every
// install would classify as cold and we'd lose the hot-reload.
func TestHashLocalTree_StableHash(t *testing.T) {
dir := t.TempDir()
if err := os.MkdirAll(filepath.Join(dir, "skills/foo"), 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(dir, "plugin.yaml"), []byte("name: foo\n"), 0o644); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(dir, "skills/foo/SKILL.md"), []byte("# Foo\n"), 0o644); err != nil {
t.Fatal(err)
}
h1, err := hashLocalTree(dir)
if err != nil {
t.Fatal(err)
}
h2, err := hashLocalTree(dir)
if err != nil {
t.Fatal(err)
}
if len(h1) != len(h2) {
t.Fatalf("hash count differs: %d vs %d", len(h1), len(h2))
}
for k, v := range h1 {
if h2[k] != v {
t.Errorf("hash[%q] differs: %q vs %q", k, v, h2[k])
}
}
}
// TestHashLocalTree_SymlinkSkipped: symlinks should not appear in the
// hash map — same posture as the tar walker. Otherwise a hostile plugin
// could include a symlink whose hash changes when its target changes,
// silently flipping classification.
func TestHashLocalTree_SymlinkSkipped(t *testing.T) {
dir := t.TempDir()
if err := os.WriteFile(filepath.Join(dir, "real.txt"), []byte("ok"), 0o644); err != nil {
t.Fatal(err)
}
target := filepath.Join(t.TempDir(), "target")
if err := os.WriteFile(target, []byte("outside"), 0o644); err != nil {
t.Fatal(err)
}
if err := os.Symlink(target, filepath.Join(dir, "link")); err != nil {
t.Fatal(err)
}
h, err := hashLocalTree(dir)
if err != nil {
t.Fatal(err)
}
if _, exists := h["link"]; exists {
t.Errorf("symlink leaked into hash map: %v", h)
}
if _, exists := h["real.txt"]; !exists {
t.Errorf("real.txt missing from hash map: %v", h)
}
}
// TestShQuote: the classifier injects livePath into a shell command via
// docker exec. Path must be quoted to handle pluginName entries with
// hyphens (which are safe but exercised here) and any future special-
// character edge case. Pin the safe-vs-quoted boundary.
func TestShQuote(t *testing.T) {
cases := []struct {
in, want string
}{
{"foo", "foo"},
{"/configs/plugins/foo-bar", "/configs/plugins/foo-bar"},
{"with space", "'with space'"},
{"with'quote", "'with'\\''quote'"},
{"$envvar", "'$envvar'"},
{"path/with/dots.txt", "path/with/dots.txt"},
}
for _, tc := range cases {
if got := shQuote(tc.in); got != tc.want {
t.Errorf("shQuote(%q) = %q; want %q", tc.in, got, tc.want)
}
}
}

View File

@ -276,6 +276,15 @@ func (h *PluginsHandler) resolveAndStage(ctx context.Context, req installRequest
// using NewPluginsHandler without a DB; production wires it in router.go. // using NewPluginsHandler without a DB; production wires it in router.go.
func (h *PluginsHandler) deliverToContainer(ctx context.Context, workspaceID string, r *stageResult) error { func (h *PluginsHandler) deliverToContainer(ctx context.Context, workspaceID string, r *stageResult) error {
if containerName := h.findRunningContainer(ctx, workspaceID); containerName != "" { if containerName := h.findRunningContainer(ctx, workspaceID); containerName != "" {
// Hot-reload classifier (molecule-core#112) — decide BEFORE the
// install whether this update can skip restartFunc. SKILL.md
// content changes are filesystem-visible to Claude Code on the
// next Skill invocation; hooks / settings.json / plugin.yaml /
// added-or-removed files need a container restart.
// Classifier reads live tree from container; on any read error
// it returns kindCold so we never hot-reload speculatively.
kind, _ := h.classifyInstallChanges(ctx, containerName, r.StagedDir, r.PluginName)
// Atomic stage→snapshot→swap→marker (molecule-core#114). // Atomic stage→snapshot→swap→marker (molecule-core#114).
// Replaces the prior single docker.CopyToContainer write that // Replaces the prior single docker.CopyToContainer write that
// left a partially-extracted tree on mid-install failure with // left a partially-extracted tree on mid-install failure with
@ -290,8 +299,12 @@ func (h *PluginsHandler) deliverToContainer(ctx context.Context, workspaceID str
"chown", "-R", "1000:1000", "/configs/plugins/" + r.PluginName, "chown", "-R", "1000:1000", "/configs/plugins/" + r.PluginName,
}) })
if h.restartFunc != nil { if h.restartFunc != nil {
if kind == classifyKindSkillContentOnly {
log.Printf("Plugin install: %s → workspace %s — SKILL-content-only update, SKIPPING restart", r.PluginName, workspaceID)
} else {
go h.restartFunc(workspaceID) go h.restartFunc(workspaceID)
} }
}
return nil return nil
} }