molecule-core/workspace-server/internal/handlers/plugins.go
security-auditor c1de2287fd
Some checks failed
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 4m46s
CI / Detect changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 4s
Harness Replays / detect-changes (pull_request) Successful in 5s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 53s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 44s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m21s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m28s
Harness Replays / Harness Replays (pull_request) Failing after 43s
CI / Platform (Go) (pull_request) Successful in 3m19s
fix(workspace-server): SSOT-route container check + 422 on external runtimes
Two coupled fixes for molecule-core#10 (plugin install 503 vs
status=online split-state):

1. SSOT for "is this workspace's container running" — `findRunningContainer`
   in plugins.go used to carry its own copy of `cli.ContainerInspect`, which
   collapsed transient daemon errors into the same `""` return as a
   genuinely-stopped container. Healthsweep's `Provisioner.IsRunning`
   handled the same input correctly (defensive). Promote the inspect logic
   to `provisioner.RunningContainerName`, route both consumers through it.
   Transient errors get a distinct log line on the plugins side so triage
   doesn't confuse a flaky daemon with a stopped container.

2. Runtime-aware Install/Uninstall — `runtime='external'` workspaces have
   no local container; push-install via docker exec is meaningless. They
   pull plugins via the download endpoint instead (Phase 30.3). Without a
   guard they fell through to `findRunningContainer` and 503'd with a
   misleading "container not running." Add an early 422 with a hint
   pointing at the download endpoint.
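
   The guard's shape can be sketched as a small middleware. This is a
   hedged illustration, not the real `plugins_install.go` code: the
   `runtimeLookup` stub, the route shape, and the JSON body are
   assumptions; only the decision logic (external → 422 + hint,
   otherwise fall through) mirrors the fix described above.

   ```go
   package main

   import (
   	"encoding/json"
   	"fmt"
   	"net/http"
   	"net/http/httptest"
   	"strings"
   )

   // runtimeLookup mirrors the handler's RuntimeLookup shape (hypothetical stub).
   type runtimeLookup func(workspaceID string) (string, error)

   // installGuard rejects push-install for external-runtime workspaces with a
   // 422 and a hint, before any container lookup runs; everything else falls
   // through to the next handler.
   func installGuard(lookup runtimeLookup, next http.HandlerFunc) http.HandlerFunc {
   	return func(w http.ResponseWriter, r *http.Request) {
   		// Illustrative route shape: /workspaces/{id}/plugins.
   		id := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/workspaces/"), "/", 2)[0]
   		if lookup != nil {
   			if rt, err := lookup(id); err == nil && rt == "external" {
   				w.Header().Set("Content-Type", "application/json")
   				w.WriteHeader(http.StatusUnprocessableEntity) // 422, not 503
   				json.NewEncoder(w).Encode(map[string]string{
   					"error": "workspace runtime is 'external'; push-install does not apply",
   					"hint":  "pull plugins via the download endpoint instead",
   				})
   				return
   			}
   		}
   		next(w, r) // container-backed workspaces keep the old path
   	}
   }

   func main() {
   	lookup := func(id string) (string, error) {
   		if id == "ws-ext" {
   			return "external", nil
   		}
   		return "docker", nil
   	}
   	h := installGuard(lookup, func(w http.ResponseWriter, r *http.Request) {
   		w.WriteHeader(http.StatusOK)
   	})
   	for _, id := range []string{"ws-ext", "ws-local"} {
   		rec := httptest.NewRecorder()
   		h(rec, httptest.NewRequest("POST", "/workspaces/"+id+"/plugins", nil))
   		fmt.Printf("%s -> %d\n", id, rec.Code)
   	}
   }
   ```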

The two fixes are independent: (1) preserves correctness when the SSOT
helper is later modified; (2) eliminates the persistent split-state on
the 5 external persona-agent workspaces in this DB (and on tenant
deployments hitting the same shape).

* `internal/provisioner/provisioner.go` — new `RunningContainerName(ctx,
  cli, id) (string, error)` with three documented outcomes (running /
  stopped / transient). `Provisioner.IsRunning` now wraps it; behavior
  preserved.
* `internal/handlers/plugins.go` — `findRunningContainer` shimmed onto
  `RunningContainerName`; new `isExternalRuntime(id)` predicate.
* `internal/handlers/plugins_install.go` — Install + Uninstall reject
  external runtimes with 422 + hint, before the source-fetch step.
* `internal/handlers/plugins_install_external_test.go` — 5 cases:
  external→422, uninstall-external→422, container-backed-falls-through,
  no-runtime-lookup-fails-open, lookup-error-fails-open.
* `internal/handlers/plugins_findrunning_ssot_test.go` — two AST gates
  pin the SSOT routing so future PRs can't silently re-introduce the
  parallel impl. Mutation-tested: reverting either consumer to a direct
  `ContainerInspect` makes the gate fail.

Refs: molecule-core#10

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 22:58:20 -07:00

package handlers

import (
	"bytes"
	"context"
	"io"
	"log"
	"os"
	"path/filepath"
	"strings"

	"github.com/Molecule-AI/molecule-monorepo/platform/internal/plugins"
	"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
	"github.com/docker/docker/pkg/stdcopy"
	"gopkg.in/yaml.v3"
)

// RuntimeLookup resolves a workspace's runtime identifier by ID. The
// handler uses this to filter the plugin registry to compatible plugins
// without needing a direct DB dependency. A nil lookup disables
// workspace-scoped filtering (handler falls back to unfiltered list).
type RuntimeLookup func(workspaceID string) (string, error)

// pluginSources is the contract PluginsHandler uses to talk to the
// plugin source registry. Extracted as an interface (#1814) so tests can
// substitute a stub without standing up the real *plugins.Registry +
// every concrete resolver. Production wires *plugins.Registry directly,
// which satisfies this interface — see the compile-time assertion below.
//
// Method set is intentionally narrow — only what handler code calls.
// Register is included because WithSourceResolver and NewPluginsHandler
// both invoke it; a stub that doesn't need to record registrations can
// implement it as a no-op.
type pluginSources interface {
	Register(resolver plugins.SourceResolver)
	Resolve(source plugins.Source) (plugins.SourceResolver, error)
	Schemes() []string
}

// Compile-time assertion: *plugins.Registry satisfies pluginSources.
// Catches a future method-signature drift at build time instead of when
// router wiring runs in main().
var _ pluginSources = (*plugins.Registry)(nil)

// PluginsHandler manages the plugin registry and per-workspace plugin installation.
type PluginsHandler struct {
	pluginsDir    string         // host path to plugins/ registry
	docker        *client.Client // Docker client for container operations
	restartFunc   func(string)   // auto-restart workspace after install/uninstall
	runtimeLookup RuntimeLookup  // workspace_id → runtime (optional)

	// sources narrowed from `*plugins.Registry` to the pluginSources
	// interface (#1814) so tests can substitute a stub. Production
	// callers still pass *plugins.Registry, which satisfies the
	// interface — see the compile-time assertion above.
	sources pluginSources
}

// NewPluginsHandler constructs a PluginsHandler with the default source
// registry (local + github resolvers). Deployments can add more schemes
// via WithSourceResolver before routes are wired — e.g. a private
// enterprise registry or ClawHub. Logs the effective install limits
// exactly once per process on first construction.
func NewPluginsHandler(pluginsDir string, docker *client.Client, restartFunc func(string)) *PluginsHandler {
	sources := plugins.NewRegistry()
	sources.Register(plugins.NewLocalResolver(pluginsDir))
	sources.Register(plugins.NewGithubResolver())
	logInstallLimitsOnce(os.Stderr)
	return &PluginsHandler{
		pluginsDir:  pluginsDir,
		docker:      docker,
		restartFunc: restartFunc,
		sources:     sources,
	}
}

// WithSourceResolver registers a custom source resolver (e.g. a ClawHub
// client) alongside the defaults. Call during router wiring, before the
// first request. Chainable.
func (h *PluginsHandler) WithSourceResolver(resolver plugins.SourceResolver) *PluginsHandler {
	h.sources.Register(resolver)
	return h
}

// WithRuntimeLookup installs a workspace-runtime resolver. Used by the
// router during wiring so tests don't need a real DB.
func (h *PluginsHandler) WithRuntimeLookup(lookup RuntimeLookup) *PluginsHandler {
	h.runtimeLookup = lookup
	return h
}

// pluginInfo is the API response for a plugin.
type pluginInfo struct {
	Name        string   `json:"name"`
	Version     string   `json:"version"`
	Description string   `json:"description"`
	Author      string   `json:"author"`
	Tags        []string `json:"tags"`
	Skills      []string `json:"skills"`
	// Runtimes declares which workspace runtimes this plugin ships an adaptor
	// for. Empty means "unspecified" — the canvas still allows install (the
	// raw-drop fallback surfaces a warning at install time). Runtime names
	// use underscore form (e.g. "claude_code").
	Runtimes []string `json:"runtimes"`
	// SupportedOnRuntime is populated by ListInstalled/compatibility only.
	// When a workspace changes runtime, plugins whose manifest doesn't
	// declare the new runtime become inert (files present, tools unwired).
	// The canvas reads this to grey out rows.
	// Pointer so the field is omitted on endpoints that don't compute it.
	SupportedOnRuntime *bool `json:"supported_on_runtime,omitempty"`
}

// supportsRuntime returns true if the plugin declares support for the given
// runtime OR if it declares no runtimes at all (treat as "unspecified, try it").
// Comparison is normalized — "claude-code" and "claude_code" are equal.
func (p pluginInfo) supportsRuntime(runtime string) bool {
	if len(p.Runtimes) == 0 {
		return true
	}
	want := strings.ReplaceAll(runtime, "-", "_")
	for _, r := range p.Runtimes {
		if strings.ReplaceAll(r, "-", "_") == want {
			return true
		}
	}
	return false
}

func (h *PluginsHandler) readPluginManifest(pluginPath, fallbackName string) pluginInfo {
	data, err := os.ReadFile(filepath.Join(pluginPath, "plugin.yaml"))
	if err != nil {
		return pluginInfo{Name: fallbackName}
	}
	return parseManifestYAML(fallbackName, data)
}

// parseManifestYAML parses plugin.yaml bytes into pluginInfo.
func parseManifestYAML(fallbackName string, data []byte) pluginInfo {
	info := pluginInfo{Name: fallbackName}
	var raw map[string]interface{}
	if yaml.Unmarshal(data, &raw) != nil {
		return info
	}
	info.Version = strDefault(raw, "version", "")
	info.Description = strDefault(raw, "description", "")
	info.Author = strDefault(raw, "author", "")
	if tags, ok := raw["tags"].([]interface{}); ok {
		for _, t := range tags {
			if s, ok := t.(string); ok {
				info.Tags = append(info.Tags, s)
			}
		}
	}
	if skills, ok := raw["skills"].([]interface{}); ok {
		for _, s := range skills {
			if str, ok := s.(string); ok {
				info.Skills = append(info.Skills, str)
			}
		}
	}
	if runtimes, ok := raw["runtimes"].([]interface{}); ok {
		for _, r := range runtimes {
			if str, ok := r.(string); ok {
				info.Runtimes = append(info.Runtimes, str)
			}
		}
	}
	return info
}

func strDefault(m map[string]interface{}, key, fallback string) string {
	if v, ok := m[key]; ok {
		if s, ok := v.(string); ok {
			return s
		}
	}
	return fallback
}

// findRunningContainer returns the live container name for workspaceID, or ""
// when the container is genuinely not running OR the daemon errored
// transiently. Routed through provisioner.RunningContainerName as the SSOT
// (molecule-core#10) so this handler agrees with healthsweep on the same
// inputs. Transient daemon errors are logged distinctly so triage doesn't
// confuse a flaky daemon with a stopped container.
func (h *PluginsHandler) findRunningContainer(ctx context.Context, workspaceID string) string {
	name, err := provisioner.RunningContainerName(ctx, h.docker, workspaceID)
	if err != nil {
		log.Printf("plugins: docker inspect transient error for %s: %v (treating as not-running for this request)", workspaceID, err)
		return ""
	}
	return name
}

// isExternalRuntime reports whether the workspace's runtime is the
// `external` (remote-pull) shape introduced in Phase 30. External
// workspaces have no local container — `POST /plugins` (push-install via
// docker exec) doesn't apply to them; they pull via the download endpoint
// instead. Returns false (allow-install) if the lookup is unwired or
// errors — failing open here is safe because the downstream
// findRunningContainer step still gates on a real container being there.
//
// Background — molecule-core#10: without this check, external workspaces
// fall through to findRunningContainer's NotFound path and return a
// misleading 503 "container not running" instead of a clear "use the
// pull endpoint" message.
func (h *PluginsHandler) isExternalRuntime(workspaceID string) bool {
	if h.runtimeLookup == nil {
		return false
	}
	runtime, err := h.runtimeLookup(workspaceID)
	if err != nil {
		return false
	}
	return runtime == "external"
}

func (h *PluginsHandler) execAsRoot(ctx context.Context, containerName string, cmd []string) (string, error) {
	return h.execInContainerAs(ctx, containerName, "root", cmd)
}

func (h *PluginsHandler) execInContainer(ctx context.Context, containerName string, cmd []string) (string, error) {
	return h.execInContainerAs(ctx, containerName, "", cmd)
}

func (h *PluginsHandler) execInContainerAs(ctx context.Context, containerName, user string, cmd []string) (string, error) {
	execCfg := container.ExecOptions{
		Cmd:          cmd,
		AttachStdout: true,
		AttachStderr: true,
		User:         user,
	}
	execID, err := h.docker.ContainerExecCreate(ctx, containerName, execCfg)
	if err != nil {
		return "", err
	}
	resp, err := h.docker.ContainerExecAttach(ctx, execID.ID, container.ExecAttachOptions{})
	if err != nil {
		return "", err
	}
	defer resp.Close()
	var stdout bytes.Buffer
	// Demultiplex the attached stream; surface copy errors instead of
	// silently returning truncated output.
	if _, err := stdcopy.StdCopy(&stdout, io.Discard, resp.Reader); err != nil {
		return "", err
	}
	return strings.TrimSpace(stdout.String()), nil
}