## Symptom (cycle 6+ of #488)

Workspaces appear `online` (heartbeats fine), but every cron tick fails silently: `No conversation found with session ID: <uuid>` → `ProcessError: exit code 1` → the idle loop logs HTTP 200 and no actual work happens. Backend Engineer received 5 idle pulses without claiming a single one of the 6 open Hermes issues (#496-500) because the bug prevents `gh issue list` from ever firing.

## Root cause (verified live in ws-20cb8ff8-3e4 today)

claude-code stores sessions at `/root/.claude/projects/<cwd-with-/-as-->/<id>.jsonl`. When a workspace container is recreated, `self._session_id` from a prior instance references a file that no longer exists. Passing it as `resume=<id>` to `ClaudeAgentOptions` crashes the CLI on the very first call.

The existing #75 fix only fires AFTER the first ProcessError lands, and per-cycle executor re-instantiation can reload the stale id from elsewhere. Restart-with-reset_claude_session was the only working mitigation, hand-fired every cycle.

## Fix

New `_resolve_resume()` in `ClaudeSDKExecutor`:

- Probes a handful of well-known session-file locations (`/root/.claude/projects/*/<id>.jsonl`, `/root/.claude/sessions/<id>.jsonl`, plus the agent-uid variants) via `glob.glob`.
- If no file matches the in-memory `_session_id`, drops the id (sets it to None) AND returns None so `ClaudeAgentOptions.resume` is unset and the CLI starts a fresh session.
- Logs at INFO with `#488` in the message so operators can correlate.

`_build_options()` now calls `_resolve_resume()` instead of reading `self._session_id` directly.

- Cheap path (no session set): zero glob calls.
- Hot path (session set + file exists): one glob call, short-circuits on first match.
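The gate described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `ResumeGate` is a hypothetical stand-in for the session-handling part of `ClaudeSDKExecutor`, `SESSION_FILE_PATTERNS` is an assumed name, the log wording is invented, and the agent-uid path variants are left out because the message does not spell them out.

```python
import glob
import logging

logger = logging.getLogger(__name__)

# Assumed pattern list: only the two locations named in the commit message.
# The "agent-uid variants" are not spelled out there, so they are omitted.
SESSION_FILE_PATTERNS = (
    "/root/.claude/projects/*/{session_id}.jsonl",
    "/root/.claude/sessions/{session_id}.jsonl",
)


class ResumeGate:
    """Hypothetical stand-in for the relevant part of ClaudeSDKExecutor."""

    def __init__(self, session_id=None, patterns=SESSION_FILE_PATTERNS):
        self._session_id = session_id
        self._patterns = patterns

    def _resolve_resume(self):
        # Cheap path: no session in memory, so no filesystem access at all.
        if self._session_id is None:
            return None

        # Escape the id so it is matched literally, not as a glob pattern.
        safe_id = glob.escape(self._session_id)
        for pattern in self._patterns:
            # Hot path: return on the first pattern that finds the file.
            if glob.glob(pattern.format(session_id=safe_id)):
                return self._session_id

        # No session file anywhere: drop the stale id so the CLI starts fresh.
        logger.info(
            "#488: session %s has no file on disk; starting a fresh session",
            self._session_id,
        )
        self._session_id = None
        return None
```

In this sketch the caller would pass the return value straight through as the `resume` option, so a stale id never reaches the CLI. Taking the patterns as a constructor argument is a choice made here so the gate can be exercised against a temporary directory; the real executor may hard-code them.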
## Drive-by fix: stale `from X import` in 4 modules

Same regression class as #1 (the runtime release that closed it):

- `claude_sdk_executor.py:43`: `from executor_helpers import …`
- `cli_executor.py:39-40`: `from config import …`, `from executor_helpers import …`
- `main.py:28-30`: `from config import …`, `from heartbeat import …`, `from preflight import …`
- `preflight.py:7`: `from config import …`

All rewritten to absolute `from molecule_runtime.<module> import …` so they resolve outside of workspace containers (e.g. test environments where `/app` isn't on sys.path). The grep guard in `tests/test_imports.py` already covered `adapters`. Extending it to all top-level imports would catch this class going forward; that is not in this PR, to keep scope tight.

## Tests

6 new in `tests/test_session_resume_gate.py`:

- baseline (no session) → no glob, returns None
- file exists → keep id, returns id, single glob (early-exit)
- file missing → drop id (clears `_session_id`), returns None
- late-pattern match → walks all patterns until hit
- log includes session id (operator triage)
- log references #488 (debugger discoverability)

All 16 tests (10 existing + 6 new) pass.

## Release plan

- Bump version 0.1.1 → 0.1.2 (in this commit)
- After merge, push v0.1.2 tag → publish.yml auto-publishes to PyPI
- Then rebuild workspace template images locally so workspaces pick up the fix (templates pin `>=0.1.0`, will resolve to 0.1.2 on next build)
- Then mass-restart workspaces with reset_claude_session=true once to clear any DB-side stale state, and the permanent fix kicks in

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[build-system]
requires = ["setuptools>=68.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "molecule-ai-workspace-runtime"
version = "0.1.2"
description = "Molecule AI workspace runtime — shared infrastructure for all agent adapters"
requires-python = ">=3.11"
license = {text = "BSL-1.1"}
readme = "README.md"
# Don't pin heavy deps — each adapter adds its own
dependencies = [
    "a2a-sdk[http-server]>=0.3.25",
    "httpx>=0.27.0",
    "uvicorn>=0.30.0",
    "starlette>=0.38.0",
    "websockets>=12.0",
    "pyyaml>=6.0",
    "langchain-core>=0.3.0",
    "opentelemetry-api>=1.24.0",
    "opentelemetry-sdk>=1.24.0",
    "opentelemetry-exporter-otlp-proto-http>=1.24.0",
    "temporalio>=1.7.0",
]

[project.scripts]
molecule-runtime = "molecule_runtime.main:main_sync"

[tool.setuptools.packages.find]
where = ["."]
include = ["molecule_runtime*"]

[tool.setuptools.package-data]
"molecule_runtime" = ["py.typed"]