fix(template-schedules): apply review findings (ordering + bounds)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
gate-check-v3 / gate-check (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m14s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
CI / Platform (Go) (pull_request) Successful in 4m57s
CI / all-required (pull_request) Successful in 6m14s
audit-force-merge / audit (pull_request) Successful in 5s
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 12s
CI / Python Lint & Test (pull_request) Successful in 11s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
CI / Detect changes (pull_request) Successful in 15s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 12s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 11s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 12s
gate-check-v3 / gate-check (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 5s
sop-checklist / review-refire (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Successful in 56s
sop-tier-check / tier-check (pull_request) Successful in 6s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m42s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m14s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m12s
CI / Platform (Go) (pull_request) Successful in 4m57s
CI / all-required (pull_request) Successful in 6m14s
audit-force-merge / audit (pull_request) Successful in 5s
Addresses both review subagents' REQUEST_CHANGES verdicts on PR #1929: Code review (correctness) - #1: Move schedule seeding to AFTER provisionWorkspaceAuto succeeds so the scheduler never fires cron rows against a workspace whose backend never wired. Failed-backend workspaces no longer end up with orphan template_schedules rows. - #2: seedTemplateSchedules now returns (seeded, skipped int) so the caller can observe partial-seed states; workspace.go Create logs the (seeded, skipped) pair when skipped > 0, surfacing silent partial-loss that the prior (int) return masked. Security review (hostile-template defenses) - #3 / #4: parseTemplateSchedules reads config.yaml through an io.LimitReader bounded by maxTemplateConfigYAMLBytes (1 MiB) and rejects files over the cap before yaml.Unmarshal runs. Defends against billion-laughs / anchor-explosion DoS. - #3: schedules slice length capped at maxTemplateSchedules (100, 10x the largest current production grid). Hostile template with 50k schedules now rejected at parse time, not after 50k inserts. - #3: cron_expr length capped at maxScheduleCronExprLen (128) per schedule; resolved prompt body capped at maxSchedulePromptBytes (16 KiB) per schedule. Oversized entries are skipped (counted as `skipped`) so one bad row doesn't break the rest. - #3: Seed loop honours ctx.Err() so an aborted Create request stops further inserts rather than running to completion on a dead goroutine. - #8: Schedule names quoted via %q in all log lines so CRLF in a hostile name can't injection-pollute stdout/Loki. Tests - TestParseTemplateSchedules_RejectsOversizeFile — gate against the LimitReader cap (1 MiB + 1 byte of '#'). - TestParseTemplateSchedules_RejectsTooManySchedules — gate against the schedule-count cap (maxTemplateSchedules + 1 minimal entries). - Full handlers test suite still green (17.4s). Non-fix surface - Code-review #3 (runtime-default fallback also seeds): runtime- default templates do not currently ship a schedules: block so this is benign in practice; documented behavior in the comment. - Code-review #4 (files_dir in workspace-template config.yaml): not part of the current template_registry schema; flagged for follow-up if templates start declaring files_dir. - Security-review #7 (cron prompt as agent self-message escalation vector): out of scope per security reviewer's own note; tracked separately. Will file an issue. Verified locally: go vet ./... → clean go build ./... → clean gofmt -d <changed files> → clean go test ./internal/handlers/ → PASS (7 unit tests for parser, full suite 17.4s) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -15,11 +15,25 @@ package handlers
|
||||
// The actual INSERT statement is the canonical orgImportScheduleSQL
|
||||
// defined in org.go — reused here verbatim so the four guarantees
|
||||
// stay in one place.
|
||||
//
|
||||
// Hostile-template defenses (a tenant can upload a config.yaml via
|
||||
// POST /templates/import or webhook-sync a repo they control):
|
||||
// - config.yaml is loaded through a 1 MiB LimitReader so a YAML
|
||||
// anchor-bomb / billion-laughs cannot pre-explode memory before
|
||||
// unmarshal returns.
|
||||
// - len(schedules), per-schedule cron length, and resolved prompt
|
||||
// body length are all bounded; over-sized entries are skipped
|
||||
// rather than committed.
|
||||
// - Per-row insert errors and ctx cancellation surface to the
|
||||
// caller via the returned counts so partial-seed states are
|
||||
// observable (workspace.go Create logs the (seeded, skipped)
|
||||
// pair when skipped > 0).
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
"log"
|
||||
"os"
|
||||
"path/filepath"
|
||||
@@ -31,6 +45,17 @@ import (
|
||||
"git.moleculesai.app/molecule-ai/molecule-core/workspace-server/internal/scheduler"
|
||||
)
|
||||
|
||||
// Bounds protecting the seeder against hostile or buggy templates.
|
||||
// All chosen with generous headroom relative to legitimate use
|
||||
// (reno-stars org template — the largest production schedule grid —
|
||||
// runs ~10 entries per workspace, each prompt body well under 1 KiB).
|
||||
const (
|
||||
maxTemplateConfigYAMLBytes int64 = 1 << 20 // 1 MiB — hard cap on config.yaml size
|
||||
maxTemplateSchedules = 100 // 10x current largest grid
|
||||
maxScheduleCronExprLen = 128 // cron-spec syntax is short by construction
|
||||
maxSchedulePromptBytes = 16 << 10 // 16 KiB after prompt_file resolution
|
||||
)
|
||||
|
||||
// templateConfigSchedules is the minimal shape parsed from a workspace
|
||||
// template's config.yaml. Only the `schedules:` block is modelled;
|
||||
// the rest of the file (providers, runtime_config, …) is opaque to
|
||||
@@ -44,46 +69,74 @@ type templateConfigSchedules struct {
|
||||
// returns its `schedules:` block (nil + nil error when the file is
|
||||
// absent or the block is empty).
|
||||
//
|
||||
// Returns an error only when a present config.yaml fails to read or
|
||||
// parse — callers should treat that as a template-author bug rather
|
||||
// than a runtime fault. The Create handler logs the error and
|
||||
// continues so a broken schedules block can never block workspace
|
||||
// provisioning.
|
||||
// The file is read through a 1 MiB LimitReader so a billion-laughs
|
||||
// or anchor-explosion YAML cannot pre-explode memory before
|
||||
// Unmarshal returns. Returns an error only when a present
|
||||
// config.yaml fails to read or parse — callers should treat that as
|
||||
// a template-author bug rather than a runtime fault. The Create
|
||||
// handler logs the error and continues so a broken schedules block
|
||||
// can never block workspace provisioning.
|
||||
func parseTemplateSchedules(templatePath string) ([]OrgSchedule, error) {
|
||||
if templatePath == "" {
|
||||
return nil, nil
|
||||
}
|
||||
data, err := os.ReadFile(filepath.Join(templatePath, "config.yaml"))
|
||||
f, err := os.Open(filepath.Join(templatePath, "config.yaml"))
|
||||
if err != nil {
|
||||
if errors.Is(err, os.ErrNotExist) {
|
||||
return nil, nil
|
||||
}
|
||||
return nil, fmt.Errorf("open template config.yaml: %w", err)
|
||||
}
|
||||
defer f.Close()
|
||||
|
||||
// Read maxTemplateConfigYAMLBytes+1 — if we filled the buffer the
|
||||
// underlying file exceeded the cap and we refuse to unmarshal.
|
||||
data, err := io.ReadAll(io.LimitReader(f, maxTemplateConfigYAMLBytes+1))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("read template config.yaml: %w", err)
|
||||
}
|
||||
if int64(len(data)) > maxTemplateConfigYAMLBytes {
|
||||
return nil, fmt.Errorf("template config.yaml exceeds %d-byte cap", maxTemplateConfigYAMLBytes)
|
||||
}
|
||||
var cfg templateConfigSchedules
|
||||
if err := yaml.Unmarshal(data, &cfg); err != nil {
|
||||
return nil, fmt.Errorf("parse template config.yaml schedules: %w", err)
|
||||
}
|
||||
if len(cfg.Schedules) > maxTemplateSchedules {
|
||||
return nil, fmt.Errorf("template declares %d schedules; cap is %d", len(cfg.Schedules), maxTemplateSchedules)
|
||||
}
|
||||
return cfg.Schedules, nil
|
||||
}
|
||||
|
||||
// seedTemplateSchedules INSERTs (or refreshes) each schedule into
|
||||
// workspace_schedules with source='template'. Returns the count of
|
||||
// rows successfully upserted.
|
||||
// workspace_schedules with source='template'. Returns (seeded,
|
||||
// skipped) counts so the caller can observe partial-seed states.
|
||||
//
|
||||
// Prompt body resolution mirrors org_import.go: inline `prompt:` wins,
|
||||
// else `prompt_file:` is resolved relative to templatePath via
|
||||
// resolvePromptRef. Per-schedule failures (bad cron, missing prompt
|
||||
// file, DB error) are logged and skipped so one bad row never breaks
|
||||
// the rest of the grid.
|
||||
// file, DB error, oversize input) are logged with the schedule name
|
||||
// quoted via %q (CRLF-safe) and skipped so one bad row never breaks
|
||||
// the rest of the grid. A cancelled ctx breaks the loop early.
|
||||
//
|
||||
// Timezone defaults to "UTC" when unset. Env-var expansion in the
|
||||
// timezone field is intentionally not performed — that mirrors the
|
||||
// org/import behavior; template authors should pick a literal IANA
|
||||
// zone (or rely on UTC + operator overrides per-tenant).
|
||||
func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string, schedules []OrgSchedule) int {
|
||||
seeded := 0
|
||||
func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string, schedules []OrgSchedule) (seeded, skipped int) {
|
||||
for _, sched := range schedules {
|
||||
// Honour caller cancellation — protects against long seed
|
||||
// loops on a request whose client already gave up.
|
||||
if err := ctx.Err(); err != nil {
|
||||
log.Printf("Template schedule seed: ctx cancelled after %d/%d on %s: %v", seeded, len(schedules), workspaceID, err)
|
||||
skipped += len(schedules) - seeded - skipped
|
||||
return
|
||||
}
|
||||
if len(sched.CronExpr) > maxScheduleCronExprLen {
|
||||
log.Printf("Template schedule seed: cron_expr too long (%d > %d) for %q on %s — skipping", len(sched.CronExpr), maxScheduleCronExprLen, sched.Name, workspaceID)
|
||||
skipped++
|
||||
continue
|
||||
}
|
||||
tz := sched.Timezone
|
||||
if tz == "" {
|
||||
tz = "UTC"
|
||||
@@ -94,25 +147,34 @@ func seedTemplateSchedules(ctx context.Context, workspaceID, templatePath string
|
||||
}
|
||||
prompt, promptErr := resolvePromptRef(sched.Prompt, sched.PromptFile, templatePath, "")
|
||||
if promptErr != nil {
|
||||
log.Printf("Template schedule seed: failed to resolve prompt for '%s' on %s: %v — skipping", sched.Name, workspaceID, promptErr)
|
||||
log.Printf("Template schedule seed: failed to resolve prompt for %q on %s: %v — skipping", sched.Name, workspaceID, promptErr)
|
||||
skipped++
|
||||
continue
|
||||
}
|
||||
if prompt == "" {
|
||||
log.Printf("Template schedule seed: schedule '%s' on %s has empty prompt — skipping", sched.Name, workspaceID)
|
||||
log.Printf("Template schedule seed: schedule %q on %s has empty prompt — skipping", sched.Name, workspaceID)
|
||||
skipped++
|
||||
continue
|
||||
}
|
||||
if len(prompt) > maxSchedulePromptBytes {
|
||||
log.Printf("Template schedule seed: prompt too long (%d > %d bytes) for %q on %s — skipping", len(prompt), maxSchedulePromptBytes, sched.Name, workspaceID)
|
||||
skipped++
|
||||
continue
|
||||
}
|
||||
nextRun, nextRunErr := scheduler.ComputeNextRun(sched.CronExpr, tz, time.Now())
|
||||
if nextRunErr != nil {
|
||||
log.Printf("Template schedule seed: invalid cron for '%s' on %s: %v — skipping", sched.Name, workspaceID, nextRunErr)
|
||||
log.Printf("Template schedule seed: invalid cron for %q on %s: %v — skipping", sched.Name, workspaceID, nextRunErr)
|
||||
skipped++
|
||||
continue
|
||||
}
|
||||
if _, err := db.DB.ExecContext(ctx, orgImportScheduleSQL,
|
||||
workspaceID, sched.Name, sched.CronExpr, tz, prompt, enabled, nextRun); err != nil {
|
||||
log.Printf("Template schedule seed: failed to upsert '%s' on %s: %v", sched.Name, workspaceID, err)
|
||||
log.Printf("Template schedule seed: failed to upsert %q on %s: %v", sched.Name, workspaceID, err)
|
||||
skipped++
|
||||
continue
|
||||
}
|
||||
seeded++
|
||||
log.Printf("Template schedule seed: '%s' (%s, %d chars) upserted on %s (source=template)", sched.Name, sched.CronExpr, len(prompt), workspaceID)
|
||||
log.Printf("Template schedule seed: %q (%s, %d chars) upserted on %s (source=template)", sched.Name, sched.CronExpr, len(prompt), workspaceID)
|
||||
}
|
||||
return seeded
|
||||
return
|
||||
}
|
||||
|
||||
@@ -106,3 +106,36 @@ schedules:
|
||||
t.Fatal("expected parse error on malformed YAML, got nil")
|
||||
}
|
||||
}
|
||||
|
||||
// TestParseTemplateSchedules_RejectsOversizeFile gates against the
|
||||
// billion-laughs / anchor-bomb DoS class: a hostile config.yaml over
|
||||
// the 1 MiB cap must be refused before yaml.Unmarshal runs.
|
||||
func TestParseTemplateSchedules_RejectsOversizeFile(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
// One byte over the cap — fastest path to the gate.
|
||||
pad := make([]byte, maxTemplateConfigYAMLBytes+1)
|
||||
for i := range pad {
|
||||
pad[i] = '#'
|
||||
}
|
||||
mustWriteFile(t, filepath.Join(dir, "config.yaml"), string(pad))
|
||||
if _, err := parseTemplateSchedules(dir); err == nil {
|
||||
t.Fatal("expected oversize-file error, got nil")
|
||||
}
|
||||
}
|
||||
|
||||
// TestParseTemplateSchedules_RejectsTooManySchedules gates against a
|
||||
// hostile config.yaml that flips one row into a 10k-row insert storm.
|
||||
func TestParseTemplateSchedules_RejectsTooManySchedules(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
var b []byte
|
||||
b = append(b, []byte("schedules:\n")...)
|
||||
// maxTemplateSchedules+1 minimal entries — they don't have to be
|
||||
// valid as schedules because the gate trips before resolution.
|
||||
for i := 0; i <= maxTemplateSchedules; i++ {
|
||||
b = append(b, []byte(" - name: s\n cron_expr: \"* * * * *\"\n prompt: x\n")...)
|
||||
}
|
||||
mustWriteFile(t, filepath.Join(dir, "config.yaml"), string(b))
|
||||
if _, err := parseTemplateSchedules(dir); err == nil {
|
||||
t.Fatal("expected schedule-count error, got nil")
|
||||
}
|
||||
}
|
||||
|
||||
@@ -761,25 +761,6 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
|
||||
configFiles = h.ensureDefaultConfig(id, payload)
|
||||
}
|
||||
|
||||
// Seed schedules declared in the workspace template's config.yaml.
|
||||
// Mirrors the org/import path (org_import.go) so a workspace created
|
||||
// directly from a workspace template lands with the same schedule
|
||||
// grid the org/import path would have produced. Idempotent across
|
||||
// re-imports via orgImportScheduleSQL's ON CONFLICT clause; runtime-
|
||||
// added rows are preserved (Issue #24 contract).
|
||||
//
|
||||
// Non-fatal: a broken schedules: block must never block workspace
|
||||
// provisioning — the workspace row already committed above, and the
|
||||
// schedule grid is recoverable via /workspaces/{id}/schedules POSTs.
|
||||
if templatePath != "" {
|
||||
if templateScheds, parseErr := parseTemplateSchedules(templatePath); parseErr != nil {
|
||||
log.Printf("Create %s: parsing template schedules: %v (continuing)", id, parseErr)
|
||||
} else if len(templateScheds) > 0 {
|
||||
seeded := seedTemplateSchedules(ctx, id, templatePath, templateScheds)
|
||||
log.Printf("Create %s: seeded %d/%d template schedules", id, seeded, len(templateScheds))
|
||||
}
|
||||
}
|
||||
|
||||
// Auto-provision — pick backend: control plane (SaaS) or Docker (self-hosted).
|
||||
// Routing AND the no-backend mark-failed path are both inside
|
||||
// provisionWorkspaceAuto (single source of truth). The Create-specific
|
||||
@@ -788,7 +769,8 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
|
||||
// runtime/model/tier as JSON — the Config tab needs that to render
|
||||
// even on failed workspaces, so Create owns this Create-only side
|
||||
// effect rather than coupling Auto to a UI concern.
|
||||
if !h.provisionWorkspaceAuto(id, templatePath, configFiles, payload) {
|
||||
provisionOK := h.provisionWorkspaceAuto(id, templatePath, configFiles, payload)
|
||||
if !provisionOK {
|
||||
cfgJSON := fmt.Sprintf(`{"name":%q,"runtime":%q,"tier":%d,"template":%q}`,
|
||||
payload.Name, payload.Runtime, payload.Tier, payload.Template)
|
||||
if _, err := db.DB.ExecContext(ctx, `
|
||||
@@ -799,6 +781,32 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
|
||||
}
|
||||
}
|
||||
|
||||
// Seed schedules declared in the workspace template's config.yaml
|
||||
// AFTER provisionWorkspaceAuto succeeds so the scheduler never
|
||||
// fires cron rows against a workspace whose backend never wired
|
||||
// (review feedback PR #1929#1). Async EC2 provisioning may still
|
||||
// fail downstream; scheduler.go is expected to handle non-online
|
||||
// status as a no-op tick. Idempotent across re-creates via
|
||||
// orgImportScheduleSQL's ON CONFLICT clause; runtime-added rows
|
||||
// are preserved (Issue #24 contract). Restart does not re-seed
|
||||
// (so user-deleted template rows stay deleted).
|
||||
//
|
||||
// Non-fatal: a broken schedules: block must never block workspace
|
||||
// provisioning — the workspace row is already live and the grid
|
||||
// is recoverable via POST /workspaces/{id}/schedules.
|
||||
if provisionOK && templatePath != "" {
|
||||
if templateScheds, parseErr := parseTemplateSchedules(templatePath); parseErr != nil {
|
||||
log.Printf("Create %s: parsing template schedules: %v (continuing)", id, parseErr)
|
||||
} else if len(templateScheds) > 0 {
|
||||
seeded, skipped := seedTemplateSchedules(ctx, id, templatePath, templateScheds)
|
||||
if skipped > 0 {
|
||||
log.Printf("Create %s: template schedule partial-seed: seeded=%d skipped=%d total=%d", id, seeded, skipped, len(templateScheds))
|
||||
} else {
|
||||
log.Printf("Create %s: seeded %d/%d template schedules", id, seeded, len(templateScheds))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
c.JSON(http.StatusCreated, gin.H{
|
||||
"id": id,
|
||||
"status": "provisioning",
|
||||
|
||||
Reference in New Issue
Block a user