feat(workspace): agent config-watcher for hot-reload of config.yaml without container restart #117

Closed
opened 2026-05-08 14:59:04 +00:00 by claude-ceo-assistant · 1 comment

Problem

/configs is already bind-mounted into the workspace container, so a config.yaml edit is visible on the agent's filesystem instantly. But the agent process (claude-code subprocess wrapped by the Python adapter) reads its config at startup and never re-checks. Result: every config tweak requires a full container restart via restartFunc.

This is the most-visited dev surface during agent tuning (changing model, idle_prompt, runtime_config, system_prompt), and the restart cost (~5-10s per change + dropped in-flight A2A messages) compounds during iteration. For Reno-Stars, it's also a visible service blip every time we adjust their team's config.

Proposed approach

Add a config-watcher to the workspace runtime base (Python) using watchfiles (or stdlib inotify/fsevents wrapper) that:

  1. Watches <config_path>/config.yaml, <config_path>/system-prompt.md, and <config_path>/plugins/** for filesystem changes.
  2. Classifies the change:
    • Hot-reloadable: model/idle_prompt/system_prompt/category_routing/files_dir content/SKILL.md content
    • Cold-restart-required: runtime/tier/workspace_dir/plugin add-or-remove/hook changes/settings.json
  3. For hot-reloadable changes: emit a structured event the running agent picks up at the next message boundary — re-read config, re-load system prompt, swap model on next call. No process restart.
  4. For cold-restart-required: signal the platform to call restartFunc (or the existing path remains).

Why this matters relative to issue molecule-core#112

#112 (hot-reload SKILL-content-only, plugin-side) addresses the platform's install path — diff classification before deciding to call restartFunc. This issue addresses the runtime path — the agent process learning that something on disk changed and adapting without restart.

They compose: #112 stops the platform from issuing unnecessary restarts; this issue lets the agent benefit (without needing a process restart) when the platform DOES write new content to disk.

Acceptance criteria

  • New module under workspace/ (or per-runtime adapter) that runs a watcher in a background thread
  • Watcher debounces (≥250ms) so a multi-file save doesn't fan out 5 reloads
  • Hot-reloadable change classifier with explicit allowlist (default = cold)
  • A2A message handler checks 'reload pending' flag at message boundary; reloads before processing next message
  • New unit test: edit config.yaml mid-test → next message uses new config
  • Real-subprocess test: run claude-code adapter, change model on disk, send message, verify the new model id is reported in the response
  • Cold path unchanged: hooks change → still falls through to platform restartFunc
  • Feature gate via env var MOLECULE_CONFIG_WATCH=1 initially (off by default; can flip on after Reno-Stars soak)
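The 'reload pending' flag and env gate from the criteria above could be sketched like this (`ReloadGate` and its method names are illustrative, not existing code; only the `MOLECULE_CONFIG_WATCH` gate comes from the criteria):

```python
# Sketch of the 'reload pending' flag drained at the message boundary.
import os
import threading

class ReloadGate:
    """Watcher thread sets the flag; the A2A handler drains it per message."""
    def __init__(self) -> None:
        self._pending = threading.Event()
        self.enabled = os.environ.get("MOLECULE_CONFIG_WATCH") == "1"

    def mark_dirty(self) -> None:
        # Called from the watcher thread on a hot-classified change.
        if self.enabled:
            self._pending.set()

    def consume(self) -> bool:
        # Called once per message, before processing starts.
        if self._pending.is_set():
            self._pending.clear()
            return True
        return False

def handle_message(gate: ReloadGate, reload_config, process) -> None:
    # Reload happens only here, never mid-generation.
    if gate.consume():
        reload_config()
    process()
```

With the env var unset, `mark_dirty` is a no-op, so the gate is off by default as the criteria require.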

Out of scope

  • Auto-rotating tokens / secrets reload (separate path — workspace_secrets table changes already handled)
  • Multi-config-file resolution order changes (no SSOT semantic shift)
  • Cross-workspace config sharing
  • Pluggable watcher backend (stick with watchfiles)

Risks + mitigations

  • Mid-message reload race: agent halfway through generating output when watcher fires → mitigation: reload only at message-boundary, never mid-generation
  • Watcher leak: process exits without stopping watcher → mitigation: lifecycle-tied to AgentExecutor; cleanup in finally
  • False positives on transient writes (atomic temp-rename mid-write, editor swap files): mitigation: debounce + ignore *.tmp, *.swp, .DS_Store

Refs

  • molecule-core#112 — companion plugin-side hot-reload classifier
  • workspace-configs-templates/claude-code-default/adapter.py — where the watcher would integrate
  • Reno-Stars rollout safety (Hongming 2026-05-08)
  • Existing workspace/config.py for the parse path the watcher would re-invoke
claude-ceo-assistant (Author, Owner) commented:

Phase 1 finding — partial coverage already exists, smaller scope warranted

Investigating before implementing surfaced that the read-at-boundary hot-reload pattern is already in the codebase:

# workspace/executor_helpers.py
from pathlib import Path

def get_system_prompt(config_path: str, fallback: str | None = None) -> str | None:
    """Read system-prompt.md from the config dir each call (supports hot-reload)."""
    prompt_file = Path(config_path) / "system-prompt.md"
    if prompt_file.exists():
        return prompt_file.read_text(encoding="utf-8", errors="replace").strip()
    return fallback

No watchfiles dependency. No background thread. No debounce. Just filesystem-driven re-read on each invocation. This is a strictly cleaner shape than the issue's original watchfiles-based design.

What this means for the issue

  • system-prompt.md is already hot-reloadable end-to-end (get_system_prompt is called per message).
  • config.yaml model field is NOT — the executor caches self._model at construction. Extending the pattern to model + idle_prompt + category_routing requires touching the executor's constructor signature.
  • plugins/ filesystem changes — already addressed via core#112 (hot-reload classifier on the platform side; SKILL.md filesystem reads are SDK-driven, not adapter-driven).

Closing this issue

The original scope assumed we needed watchfiles + a background watcher thread + a custom debounce/classify pipeline. Phase 1 found that the codebase's existing pattern is already simpler and covers the most-visited case (system-prompt). The remaining gap (model field) is a focused follow-up issue (filed as a new issue) — significantly smaller than this one's original scope, no Python deps to add.

Follow-up issue: extend get_system_prompt-style read-at-boundary to config.model so model swaps take effect at next message without container restart.

Closing this as 'partial coverage exists; remaining gap filed as a focused follow-up.'

Reference: molecule-ai/molecule-core#117