main
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
d47c15d526 |
fix(validate): recognize !external + !include as opaque refs (skip, not error)
molecule-ai-org-template-molecule-dev's CI has been red since the
"pin: dev-department v1.0.0" merge. Symptom:
::error::Workspace at <unnamed>: missing 'name'
::error::Workspace at <unnamed>: missing 'name'
Root cause: org.yaml uses `!external` for the dev-department subtree
fetch (introduced internal#77 / molecule-core#105). The PermissiveLoader
formerly handed every unknown tag to a single multi-constructor that
flattens the parsed value to a plain dict. The validator's
validate_workspace() then saw a dict with no `name` key and tripped
the "missing name" error — but the dict was a `!external` directive,
not a malformed workspace.
The fix wraps both supported tags in distinct sentinel types:
- !include → IncludeRef (str subclass)
- !external → ExternalRef (dict subclass)
validate_workspace() and count_ws() now skip these instead of treating
them as workspace shape. Real workspace dicts (with names) still get
the full structural check. Unknown tags fall through to the
multi-constructor exactly as before, preserving back-compat.
Verified on the live failing org.yaml:
✓ org.yaml valid: Molecule AI Dev Team (0 direct workspaces;
external refs not counted)
And on a synthetic case with one real bug (missing-name workspace
nested under children):
::error::Workspace at <unnamed>: missing 'name'
::error::Workspace at <unnamed>/<unnamed>: missing 'name'
exit 1
So the validator still catches real shape bugs; it just doesn't
false-positive on the new !external pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f125d68910
|
fix(validator): address post-merge review findings on #17 + #18 (#19)
Independent code review of #17 (adapter runtime-load) and #18 (schema versioning) surfaced four Required and three Optional findings worth fixing before the patterns harden into the codebase. Required: R1: Delete .molecule-ci/scripts/{validate-workspace-template, migrate-template}.py — dead-vendored mirror. The new validator workflow invokes .molecule-ci-canonical/scripts/ (the canonical clone), not .molecule-ci/scripts/. The mirror was the exact drift class #90 is supposed to eliminate: next contributor would edit one copy and silently diverge. Other workflows (validate-plugin, validate-org-template) still use the legacy path and keep their own scripts there — so removing OUR two files is asymmetric but correct, and the legacy path can phase out organically. R2: validate-workspace-template.yml's `cache-dependency-path` pointed at the validator's own deps file (just `pyyaml>=6.0`). Pip cache key never invalidated when the template added crewai/langgraph/ etc. Repoint to the calling repo's `requirements.txt`, which is the file the heavy install actually uses one step later. R3: `_check_schema_v1` looped `SCHEMA_V1_REQUIRED_KEYS` and re-emitted "missing required key `template_schema_version`" — but the dispatcher already verified the field is present + int before reaching v1, so that branch was dead defensive code. Skip it explicitly with a comment, but keep the field in the constant for contract documentation + the unknown-keys filter. R4: `_template_adapter_under_validation` was a fixed sys.modules key, meaning back-to-back invocations in the same Python process shared the slot. Use a per-call-unique name keyed on the absolute path's hash. No observed bug today; defensive-only. Optional: O1: Class-discovery filter now also requires `__module__ == module_name`. Without this, an `from molecule_runtime.adapters.base import AbstractCLIAdapter` re-export would count as a "real" adapter, masking the genuine "no concrete subclass" case the gate exists to catch. Cheap and forward-proofs against any future abstract intermediate the runtime might expose. Added a sibling test pinning the new behavior. O2: migrate-template.py's docstring claimed "uses ruamel.yaml when available" but the implementation only ever calls `yaml.safe_dump`. Replaced the lie with a clearer caveat block + a forward-pointer to ruamel-when-comments-detected as a future enhancement. O3: Reordered the workflow so the secret-scan step runs BEFORE `pip install -r requirements.txt`. Same threat surface as the Docker build smoke (which already runs first), but cheap defense- in-depth: a malicious template PR adding a malicious dep to requirements.txt now has its post-install hook execute AFTER the secret scanner has already inspected the diff. Test changes: - test_adapter_with_no_baseadapter_subclass_errors updated for the new error message ("no concrete class inheriting from"). - New test_only_imported_baseadapter_subclass_does_not_count pins the O1 __module__-filter behavior. - 43/43 tests pass (was 42/42 before the new test). - Real langgraph template still passes the full validator end-to-end. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
84a104a146
|
feat(validator): schema-version dispatch + migrate-template.py framework (#18)
Closes the schema-versioning workstream of #90. Sets up the machinery for "we will be updating a lot" (the user's framing) without forcing the first real schema bump to discover semantics under deadline pressure. Today every template is at v1; this PR adds the framework, ships zero behavior change for v1 templates, and reserves v2+ for when there's a concrete reason to bump. Validator changes: - `KNOWN_SCHEMA_VERSIONS = {1}` — the set the validator currently accepts. Future bumps add to this set. - `DEPRECATED_SCHEMA_VERSIONS: set[int] = set()` — versions accepted with warning during a deprecation window. - Per-version contract: `_check_schema_v1(config)` enforces the v1 REQUIRED_KEYS / OPTIONAL_KEYS / KNOWN_RUNTIMES contract — exactly what the previous monolithic check_config_yaml did. - Dispatch table: `SCHEMA_CHECKS = {1: _check_schema_v1}`. Versions that aren't in the table hard-error. - check_config_yaml() now: reads template_schema_version → emits deprecation warning if applicable → dispatches to the right SCHEMA_CHECKS entry → unknown versions hard-error with actionable instructions ("add a SCHEMA_V<N> block"). - Schema versions are FROZEN once shipped: never edit a SCHEMA_V<N> constant in place. To bump, ADD v<N+1> alongside, deprecate v<N>, migrate consumers, drop v<N> next cycle. Header comment documents the discipline. New script `migrate-template.py`: - `MIGRATIONS: dict[int, Callable[[dict], dict]]` registry — each entry maps a SOURCE version to the function that produces the next version's dict. Empty today. - `migrate_config(config, from, to)` chains migrations sequentially. Forward-only (errors on backward), errors on missing intermediate steps (never silently skip), asserts every migration stamps its output's template_schema_version. - CLI: `migrate-template.py [--from N] [--to M] [--dry-run] DIR`. Defaults: --from = whatever config.yaml declares, --to = highest reachable from MIGRATIONS (currently 1, so a no-op). Behavior change to the existing test_missing_required_keys_errors test: Previously the validator emitted 3 "missing required key" errors when name/runtime/template_schema_version were all missing. Now it short-circuits on missing version with a single actionable error — listing downstream missing keys is noise on top of the real problem (no version means we can't pick a contract). The test was updated to pin the new behavior; a new sibling test (test_missing_required_keys_under_v1_dispatch_errors) pins that v1 still lists name/runtime/etc. when present-with-v1. Verification: - 42/42 tests pass (20 prior + 9 new schema-dispatch tests in test_validate_workspace_template.py + 17 new migrator tests in test_migrate_template.py). - Real langgraph template runs through the full updated validator end-to-end with 0 warnings / 0 errors. This + #17 means #90 is done end-to-end: - Phase 2: validator green on all 8 templates as a required check (already shipped) - Phase 2.5: adapter.py runtime-load contract (#17) - Phase 3: schema versioning + migration framework (this PR) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8309a55e6c
|
feat(validator): runtime-load check for adapter.py contract (#17)
Adds the third workstream of #90 (eliminate template repo drift): a strong contract check that exercises adapter.py the same way the runtime does at workspace boot. Without this, a template can have a syntactically-valid Dockerfile + an adapter.py that ImportErrors at runtime, build clean through Docker smoke, and crash on first user prompt — exactly the human-error class #90 is meant to eliminate. Existing checks ranked from weakest to strongest: 1. check_adapter() — text-grep for legacy `molecule_ai` imports. Catches one specific footgun. 2. Docker build smoke — `docker build` succeeds. Doesn't RUN the image, so adapter.py is never imported. Misses every adapter-load bug. 3. (NEW) check_adapter_runtime_load — imports adapter.py via the same `importlib.spec_from_file_location` path the runtime uses, and asserts at least one class inherits from molecule_runtime.adapters.base.BaseAdapter. Hard-error conditions: - adapter.py raises any exception during import (SyntaxError, ImportError, NameError, etc.). Same exception would crash the workspace at boot. - No class in the module inherits from BaseAdapter. The runtime's class-discovery silently falls through to the default langgraph executor in this case — exactly the silent-failure shape the contract is meant to catch. Skip conditions: - No adapter.py exists. Templates without one inherit the default executor by design (policy, not drift). - molecule-ai-workspace-runtime not importable in the validator env. Warns loudly so the CI-config bug surfaces, but doesn't hard-fail (we'd be reporting "your adapter is broken" when the actual cause is missing infra). Workflow update: validate-workspace-template.yml now installs the template's requirements.txt before invoking the validator (or falls back to installing molecule-ai-workspace-runtime alone if the template has no requirements.txt). This satisfies the runtime-load check's import dependencies the same way the workspace container does at boot — `pip install -r requirements.txt`. Verified locally: - 20/20 tests in test_validate_workspace_template.py pass (14 existing + 6 new). - Real langgraph template passes the full new validator including runtime-load (0 warnings, 0 errors). - Surveyed all 8 production templates' adapter.py shapes; every one already inherits from BaseAdapter, so this check turns green on first run with zero per-template fixups needed. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
73102cdaa9 |
feat(validate-workspace-template): strict drift gate + canonical-fetch workflow
P6 Phase 1: enforce the workspace-template contract via CI on every
template-repo push, eliminating the slow drift that produced 8
copies of a 28-line Dockerfile in different states of decay.
The previous validator (50 lines, soft warnings only) couldn't
catch the cache-trap pattern (Dockerfile missing ARG RUNTIME_VERSION)
that silently shipped the previous runtime wheel during cascade
publishes — observed five times in a row on 2026-04-27. Hardened
into structural checks that fail CI, not just warn:
- Dockerfile must base on python:3.11-slim
- Dockerfile must declare ARG RUNTIME_VERSION AND reference
${RUNTIME_VERSION} in a RUN block (the arg has to be in the
layer's command line for docker to hash it into the cache key)
- Dockerfile must create the agent uid-1000 user (Claude Code
refuses --dangerously-skip-permissions as root for safety)
- Dockerfile must end at molecule-runtime — directly via
ENTRYPOINT or via a wrapper script that exec's it (claude-code
has entrypoint.sh for gosu drop-priv; hermes has start.sh to
boot the hermes-agent daemon first; both are allowed)
- config.yaml must have name + runtime + integer
template_schema_version. Quoted "1" fails — observed previously
in a copy-pasted template that the YAML loader turned into str
- requirements.txt must declare molecule-ai-workspace-runtime
Also fixed: the original validator's warning telling adapter.py
NOT to import molecule_runtime was backwards — that's the
canonical package name post-#87. Now it warns on the legacy
molecule_ai prefix instead.
Reusable workflow change: instead of running
.molecule-ci/scripts/validate-workspace-template.py (a per-template
vendored copy that drifts as the validator evolves), the workflow
now checks out molecule-ci itself into .molecule-ci-canonical and
runs the canonical script from there. Single source of truth —
every template runs the SAME contract on every CI run. The legacy
.molecule-ci/scripts/ directories in each template repo can be
deleted in a Phase 2 cleanup PR.
14 unit tests pin the contract:
- canonical template passes
- claude-code-style custom entrypoint passes when the wrapper
exec's molecule-runtime
- 5 Dockerfile drift modes each error individually
- 3 config.yaml drift modes each error/warn
- requirements.txt missing-runtime errors
- legacy molecule_ai import warns
- regression cover: modern molecule_runtime import does NOT
trigger the (deleted) backwards warning
All 8 production template repos pass the new contract today —
this PR locks in the current good state, it does not force any
template-repo edits.
Contract documented at docs/template-contract.md so the rules are
discoverable without reading the validator.
|
||
| 487056d7dd | ci: add .molecule-ci/scripts/ directory with all CI scripts |