fix(template-schedules): apply review findings (ordering + bounds)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
gate-check-v3 / gate-check (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m14s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
CI / Platform (Go) (pull_request) Successful in 4m57s
CI / all-required (pull_request) Successful in 6m14s
audit-force-merge / audit (pull_request) Successful in 5s

Addresses both review subagents' REQUEST_CHANGES verdicts on
PR #1929:

Code review (correctness)
- #1: Move schedule seeding to AFTER provisionWorkspaceAuto
  succeeds so the scheduler never fires cron rows against a
  workspace whose backend never wired. Failed-backend workspaces
  no longer end up with orphan template_schedules rows.
- #2: seedTemplateSchedules now returns (seeded, skipped int) so
  the caller can observe partial-seed states; workspace.go Create
  logs the (seeded, skipped) pair when skipped > 0, surfacing
  silent partial-loss that the prior (int) return masked.

Security review (hostile-template defenses)
- #3 / #4: parseTemplateSchedules reads config.yaml through an
  io.LimitReader bounded by maxTemplateConfigYAMLBytes (1 MiB)
  and rejects files over the cap before yaml.Unmarshal runs.
  Defends against billion-laughs / anchor-explosion DoS.
- #3: schedules slice length capped at maxTemplateSchedules (100,
  10x the largest current production grid). Hostile template with
  50k schedules now rejected at parse time, not after 50k inserts.
- #3: cron_expr length capped at maxScheduleCronExprLen (128) per
  schedule; resolved prompt body capped at maxSchedulePromptBytes
  (16 KiB) per schedule. Oversized entries are skipped (counted
  as `skipped`) so one bad row doesn't break the rest.
- #3: Seed loop honours ctx.Err() so an aborted Create request
  stops further inserts rather than running to completion on a
  dead goroutine.
- #8: Schedule names quoted via %q in all log lines so CRLF in a
  hostile name can't injection-pollute stdout/Loki.

Tests
- TestParseTemplateSchedules_RejectsOversizeFile — gate against
  the LimitReader cap (1 MiB + 1 byte of '#').
- TestParseTemplateSchedules_RejectsTooManySchedules — gate
  against the schedule-count cap (maxTemplateSchedules + 1
  minimal entries).
- Full handlers test suite still green (17.4s).

Non-fix surface
- Code-review #3 (runtime-default fallback also seeds): runtime-
  default templates do not currently ship a schedules: block so
  this is benign in practice; documented behavior in the comment.
- Code-review #4 (files_dir in workspace-template config.yaml):
  not part of the current template_registry schema; flagged for
  follow-up if templates start declaring files_dir.
- Security-review #7 (cron prompt as agent self-message escalation
  vector): out of scope per security reviewer's own note; tracked
  separately. Will file an issue.

Verified locally:
  go vet ./...                 → clean
  go build ./...               → clean
  gofmt -d <changed files>     → clean
  go test ./internal/handlers/ → PASS (7 unit tests for parser,
                                  full suite 17.4s)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-26 16:31:52 -07:00
parent d8b409f1bc
commit b9a3ef4294
3 changed files with 141 additions and 38 deletions
@@ -15,11 +15,25 @@ package handlers
// The actual INSERT statement is the canonical orgImportScheduleSQL
// defined in org.go — reused here verbatim so the four guarantees
// stay in one place.
//
// Hostile-template defenses (a tenant can upload a config.yaml via
// POST /templates/import or webhook-sync a repo they control):
// - config.yaml is loaded through a 1 MiB LimitReader so a YAML
// anchor-bomb / billion-laughs cannot pre-explode memory before
// unmarshal returns.
// - len(schedules), per-schedule cron length, and resolved prompt
// body length are all bounded; over-sized entries are skipped
// rather than committed.
// - Per-row insert errors and ctx cancellation surface to the
// caller via the returned counts so partial-seed states are
// observable (workspace.go Create logs the (seeded, skipped)
// pair when skipped > 0).
import (
"context"
"errors"
"fmt"
"io"
"log"
"os"
"path/filepath"
@@ -31,6 +45,17 @@ import (
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/scheduler"
)
// Bounds protecting the seeder against hostile or buggy templates.
// All chosen with generous headroom relative to legitimate use
// (reno-stars org template — the largest production schedule grid —
// runs ~10 entries per workspace, each prompt body well under 1 KiB).
const (
maxTemplateConfigYAMLBytes int64 = 1 << 20 // 1 MiB — hard cap on config.yaml size
maxTemplateSchedules = 100 // 10x current largest grid
maxScheduleCronExprLen = 128 // cron-spec syntax is short by construction
maxSchedulePromptBytes = 16 << 10 // 16 KiB after prompt_file resolution
)
// templateConfigSchedules is the minimal shape parsed from a workspace
// template's config.yaml. Only the `schedules:` block is modelled;
// the rest of the file (providers, runtime_config, …) is opaque to
@@ -44,46 +69,74 @@ type templateConfigSchedules struct {
// returns its `schedules:` block (nil + nil error when the file is
// absent or the block is empty).
//
// Returns an error only when a present config.yaml fails to read or
// parse — callers should treat that as a template-author bug rather
// than a runtime fault. The Create handler logs the error and
// continues so a broken schedules block can never block workspace
// provisioning.
// The file is read through a 1 MiB LimitReader so a billion-laughs
// or anchor-explosion YAML cannot pre-explode memory before
// Unmarshal returns. Returns an error only when a present
// config.yaml fails to read or parse — callers should treat that as
// a template-author bug rather than a runtime fault. The Create
// handler logs the error and continues so a broken schedules block
// can never block workspace provisioning.
func parseTemplateSchedules(templatePath string) ([]OrgSchedule, error) {
if templatePath == "" {
return nil, nil
}
data, err := os.ReadFile(filepath.Join(templatePath, "config.yaml"))
f, err := os.Open(filepath.Join(templatePath, "config.yaml"))
if err != nil {
if errors.Is(err, os.ErrNotExist) {
return nil, nil
}
return nil, fmt.Errorf("open template config.yaml: %w", err)
}
defer f.Close()
// Read maxTemplateConfigYAMLBytes+1 — if we filled the buffer the
// underlying file exceeded the cap and we refuse to unmarshal.
data, err := io.ReadAll(io.LimitReader(f, maxTemplateConfigYAMLBytes+1))
if err != nil {
return nil, fmt.Errorf("read template config.yaml: %w", err)
}
if int64(len(data)) > maxTemplateConfigYAMLBytes {
return nil, fmt.Errorf("template config.yaml exceeds %d-byte cap", maxTemplateConfigYAMLBytes)
}
var cfg templateConfigSchedules
if err := yaml.Unmarshal(data, &cfg); err != nil {
return nil, fmt.Errorf("parse template config.yaml schedules: %w", err)
}
if len(cfg.Schedules) > maxTemplateSchedules {
return nil, fmt.Errorf("template declares %d schedules; cap is %d", len(cfg.Schedules), maxTemplateSchedules)
}
return cfg.Schedules, nil
}
// seedTemplateSchedules INSERTs (or refreshes) each schedule into
// workspace_schedules with source='template'. Returns the count of
// rows successfully upserted.
// workspace_schedules with source='template'. Returns (seeded,
// skipped) counts so the caller can observe partial-seed states.
//
// Prompt body resolution mirrors org_import.go: inline `prompt:` wins,
// else `prompt_file:` is resolved relative to templatePath via
// resolvePromptRef. Per-schedule failures (bad cron, missing prompt
// file, DB error) are logged and skipped so one bad row never breaks
// the rest of the grid.
// file, DB error, oversize input) are logged with the schedule name
// quoted via %q (CRLF-safe) and skipped so one bad row never breaks
// the rest of the grid. A cancelled ctx breaks the loop early.
//
// Timezone defaults to "UTC" when unset. Env-var expansion in the
// timezone field is intentionally not performed — that mirrors the
// org/import behavior; template authors should pick a literal IANA
// zone (or rely on UTC + operator overrides per-tenant).
func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string, schedules []OrgSchedule) int {
seeded := 0
func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string, schedules []OrgSchedule) (seeded, skipped int) {
for _, sched := range schedules {
// Honour caller cancellation — protects against long seed
// loops on a request whose client already gave up.
if err := ctx.Err(); err != nil {
log.Printf("Template schedule seed: ctx cancelled after %d/%d on %s: %v", seeded, len(schedules), workspaceID, err)
skipped += len(schedules) - seeded - skipped
return
}
if len(sched.CronExpr) > maxScheduleCronExprLen {
log.Printf("Template schedule seed: cron_expr too long (%d > %d) for %q on %s — skipping", len(sched.CronExpr), maxScheduleCronExprLen, sched.Name, workspaceID)
skipped++
continue
}
tz := sched.Timezone
if tz == "" {
tz = "UTC"
@@ -94,25 +147,34 @@ func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string
}
prompt, promptErr := resolvePromptRef(sched.Prompt, sched.PromptFile, templatePath, "")
if promptErr != nil {
log.Printf("Template schedule seed: failed to resolve prompt for '%s' on %s: %v — skipping", sched.Name, workspaceID, promptErr)
log.Printf("Template schedule seed: failed to resolve prompt for %q on %s: %v — skipping", sched.Name, workspaceID, promptErr)
skipped++
continue
}
if prompt == "" {
log.Printf("Template schedule seed: schedule '%s' on %s has empty prompt — skipping", sched.Name, workspaceID)
log.Printf("Template schedule seed: schedule %q on %s has empty prompt — skipping", sched.Name, workspaceID)
skipped++
continue
}
if len(prompt) > maxSchedulePromptBytes {
log.Printf("Template schedule seed: prompt too long (%d > %d bytes) for %q on %s — skipping", len(prompt), maxSchedulePromptBytes, sched.Name, workspaceID)
skipped++
continue
}
nextRun, nextRunErr := scheduler.ComputeNextRun(sched.CronExpr, tz, time.Now())
if nextRunErr != nil {
log.Printf("Template schedule seed: invalid cron for '%s' on %s: %v — skipping", sched.Name, workspaceID, nextRunErr)
log.Printf("Template schedule seed: invalid cron for %q on %s: %v — skipping", sched.Name, workspaceID, nextRunErr)
skipped++
continue
}
if _, err := db.DB.ExecContext(ctx, orgImportScheduleSQL,
workspaceID, sched.Name, sched.CronExpr, tz, prompt, enabled, nextRun); err != nil {
log.Printf("Template schedule seed: failed to upsert '%s' on %s: %v", sched.Name, workspaceID, err)
log.Printf("Template schedule seed: failed to upsert %q on %s: %v", sched.Name, workspaceID, err)
skipped++
continue
}
seeded++
log.Printf("Template schedule seed: '%s' (%s, %d chars) upserted on %s (source=template)", sched.Name, sched.CronExpr, len(prompt), workspaceID)
log.Printf("Template schedule seed: %q (%s, %d chars) upserted on %s (source=template)", sched.Name, sched.CronExpr, len(prompt), workspaceID)
}
return seeded
return
}
@@ -106,3 +106,36 @@ schedules:
t.Fatal("expected parse error on malformed YAML, got nil")
}
}
// TestParseTemplateSchedules_RejectsOversizeFile gates against the
// billion-laughs / anchor-bomb DoS class: a hostile config.yaml over
// the 1 MiB cap must be refused before yaml.Unmarshal runs.
func TestParseTemplateSchedules_RejectsOversizeFile(t *testing.T) {
dir := t.TempDir()
// One byte over the cap — fastest path to the gate.
pad := make([]byte, maxTemplateConfigYAMLBytes+1)
for i := range pad {
pad[i] = '#'
}
mustWriteFile(t, filepath.Join(dir, "config.yaml"), string(pad))
if _, err := parseTemplateSchedules(dir); err == nil {
t.Fatal("expected oversize-file error, got nil")
}
}
// TestParseTemplateSchedules_RejectsTooManySchedules gates against a
// hostile config.yaml that flips one row into a 10k-row insert storm.
func TestParseTemplateSchedules_RejectsTooManySchedules(t *testing.T) {
dir := t.TempDir()
var b []byte
b = append(b, []byte("schedules:\n")...)
// maxTemplateSchedules+1 minimal entries — they don't have to be
// valid as schedules because the gate trips before resolution.
for i := 0; i <= maxTemplateSchedules; i++ {
b = append(b, []byte(" - name: s\n cron_expr: \"* * * * *\"\n prompt: x\n")...)
}
mustWriteFile(t, filepath.Join(dir, "config.yaml"), string(b))
if _, err := parseTemplateSchedules(dir); err == nil {
t.Fatal("expected schedule-count error, got nil")
}
}
+28 -20
View File
@@ -761,25 +761,6 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
configFiles = h.ensureDefaultConfig(id, payload)
}
// Seed schedules declared in the workspace template's config.yaml.
// Mirrors the org/import path (org_import.go) so a workspace created
// directly from a workspace template lands with the same schedule
// grid the org/import path would have produced. Idempotent across
// re-imports via orgImportScheduleSQL's ON CONFLICT clause; runtime-
// added rows are preserved (Issue #24 contract).
//
// Non-fatal: a broken schedules: block must never block workspace
// provisioning — the workspace row already committed above, and the
// schedule grid is recoverable via /workspaces/{id}/schedules POSTs.
if templatePath != "" {
if templateScheds, parseErr := parseTemplateSchedules(templatePath); parseErr != nil {
log.Printf("Create %s: parsing template schedules: %v (continuing)", id, parseErr)
} else if len(templateScheds) > 0 {
seeded := seedTemplateSchedules(ctx, id, templatePath, templateScheds)
log.Printf("Create %s: seeded %d/%d template schedules", id, seeded, len(templateScheds))
}
}
// Auto-provision — pick backend: control plane (SaaS) or Docker (self-hosted).
// Routing AND the no-backend mark-failed path are both inside
// provisionWorkspaceAuto (single source of truth). The Create-specific
@@ -788,7 +769,8 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
// runtime/model/tier as JSON — the Config tab needs that to render
// even on failed workspaces, so Create owns this Create-only side
// effect rather than coupling Auto to a UI concern.
if !h.provisionWorkspaceAuto(id, templatePath, configFiles, payload) {
provisionOK := h.provisionWorkspaceAuto(id, templatePath, configFiles, payload)
if !provisionOK {
cfgJSON := fmt.Sprintf(`{"name":%q,"runtime":%q,"tier":%d,"template":%q}`,
payload.Name, payload.Runtime, payload.Tier, payload.Template)
if _, err := db.DB.ExecContext(ctx, `
@@ -799,6 +781,32 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
}
}
// Seed schedules declared in the workspace template's config.yaml
// AFTER provisionWorkspaceAuto succeeds so the scheduler never
// fires cron rows against a workspace whose backend never wired
// (review feedback PR #1929#1). Async EC2 provisioning may still
// fail downstream; scheduler.go is expected to handle non-online
// status as a no-op tick. Idempotent across re-creates via
// orgImportScheduleSQL's ON CONFLICT clause; runtime-added rows
// are preserved (Issue #24 contract). Restart does not re-seed
// (so user-deleted template rows stay deleted).
//
// Non-fatal: a broken schedules: block must never block workspace
// provisioning — the workspace row is already live and the grid
// is recoverable via POST /workspaces/{id}/schedules.
if provisionOK && templatePath != "" {
if templateScheds, parseErr := parseTemplateSchedules(templatePath); parseErr != nil {
log.Printf("Create %s: parsing template schedules: %v (continuing)", id, parseErr)
} else if len(templateScheds) > 0 {
seeded, skipped := seedTemplateSchedules(ctx, id, templatePath, templateScheds)
if skipped > 0 {
log.Printf("Create %s: template schedule partial-seed: seeded=%d skipped=%d total=%d", id, seeded, skipped, len(templateScheds))
} else {
log.Printf("Create %s: seeded %d/%d template schedules", id, seeded, len(templateScheds))
}
}
}
c.JSON(http.StatusCreated, gin.H{
"id": id,
"status": "provisioning",