docs: add CLAUDE.md, known-issues.md, and runbooks/local-dev-setup.md

This commit is contained in:
Molecule AI · plugin-dev 2026-04-21 10:55:10 +00:00
parent 1833d1c491
commit 65c6a3ae05
3 changed files with 568 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,210 @@
# Molecule AI Workspace Template — gemini-cli Runtime
## Purpose
This template provides a self-contained Docker workspace for the gemini-cli agent runtime used by Molecule AI platforms. It packages a configured gemini-cli agent inside a Docker container, wired to connect back to the Molecule AI platform via `adapter.py`.
It is NOT a plugin. It has no `plugin.yaml` and no `rules/` directory. It is a workspace *environment* — a Dockerfile, a runtime config, and an adapter — that the platform spins up on behalf of an agent.
## Key Files
### `config.yaml`
Runtime configuration for the gemini-cli agent.
```yaml
schema_version: "1"
runtime:
  agent: gemini-cli
  model: gemini-2.5-flash
  api_key_env: GEMINI_API_KEY
skills:
  enabled: true
  list:
    - name: file-search
      path: /workspace/skills/file-search
    - name: web-fetch
      path: /workspace/skills/web-fetch
adapter:
  platform_url: https://platform.molecule.ai
  workspace_id_env: WORKSPACE_ID
  timeout_seconds: 30
```
- `model` selects the Gemini model variant. Common values: `gemini-2.0-flash`, `gemini-2.5-flash`, `gemini-2.5-pro`.
- `api_key_env` names the env var that holds the Gemini API key at container startup. The key itself is injected by the Molecule AI platform or set locally during dev.
- `skills` lists local skill directories to expose to the agent. These are loaded at startup by gemini-cli.
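To make the expected shape concrete, here is a minimal sketch of adapter-side validation of this file after parsing. The field names come from the example above; `validate_config` and its required-key rules are illustrative, not the template's actual code.

```python
# Illustrative validation of the parsed config.yaml structure shown above.
# The required-key rules here are an assumption, not the adapter's real logic.
def validate_config(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if cfg.get("schema_version") != "1":
        problems.append(f"unsupported schema_version: {cfg.get('schema_version')!r}")
    runtime = cfg.get("runtime", {})
    for key in ("agent", "model", "api_key_env"):
        if key not in runtime:
            problems.append(f"runtime.{key} is missing")
    for skill in cfg.get("skills", {}).get("list", []):
        if not {"name", "path"} <= skill.keys():
            problems.append(f"skill entry {skill!r} needs both 'name' and 'path'")
    return problems

cfg = {
    "schema_version": "1",
    "runtime": {"agent": "gemini-cli", "model": "gemini-2.5-flash",
                "api_key_env": "GEMINI_API_KEY"},
    "skills": {"enabled": True,
               "list": [{"name": "file-search",
                         "path": "/workspace/skills/file-search"}]},
}
print(validate_config(cfg))  # → []
```

Running a check like this before container startup turns a misconfigured workspace into an immediate, explicit failure rather than a runtime surprise.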
### `adapter.py`
Thin shim that translates Molecule AI platform events into gemini-cli tool calls and streams responses back. Key entry points:
```python
# adapter.py
import os
import sys

def connect(platform_url: str, workspace_id: str, timeout: int = 30):
    """Called by the platform shim to hand off a session to gemini-cli."""
    ...

def stream_response(session_id: str, prompt: str) -> str:
    """Blocking call: send the prompt to gemini-cli, stream tokens back."""
    ...
```
`connect()` is invoked once per session by the platform harness inside the container. `stream_response()` is called for each agent turn.
### `system-prompt.md`
Injected at container startup into the gemini-cli prompt stack. This is the canonical place to set the agent's persona, guardrails, and tool whitelist. gemini-cli concatenates it after its built-in default system message, so its directives take precedence (see known-issues.md, Issue 2, for the full load order).
```markdown
# System Prompt — Molecule AI gemini-cli Agent
You are a research agent running inside a Molecule AI workspace.
You have access to the following tools: file-search, web-fetch.
Do not call tools outside this list without explicit user approval.
...
```
### `requirements.txt`
Pinned Python dependencies for the adapter and any skill loaders.
```
gemini-cli>=1.0.0
molecule-ai-adapter>=2.1.0
httpx>=0.27.0
pydantic>=2.0.0
```
### `Dockerfile`
Builds the workspace image. Key stages:
```dockerfile
FROM python:3.11-slim
WORKDIR /workspace
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
RUN mkdir -p /workspace/skills
ENV GEMINI_API_KEY=""
ENV WORKSPACE_ID="local-dev"
CMD ["python", "-m", "adapter"]
```
The `CMD` invokes the adapter module, which bootstraps gemini-cli and connects to the platform. To build and run the image locally with the default command:
```bash
docker build -t molecule-gemini-cli:dev .
docker run --rm \
-e GEMINI_API_KEY="$(cat ~/.gemini-api-key)" \
-e WORKSPACE_ID="dev-001" \
molecule-gemini-cli:dev
```
## Runtime Config Conventions
### Gemini Model Selection
Set `runtime.model` in `config.yaml`. gemini-cli resolves this to a Vertex AI / AI Studio model name at startup. If the model string does not match a known alias, gemini-cli exits with:
```
ValueError: Unknown model 'gemini-99-pro'. Did you mean 'gemini-2.5-pro'?
```
### API Key Handling
The Gemini API key is injected via the `GEMINI_API_KEY` env var; the platform sets it before `docker run`. Never bake API keys into the image: `--build-arg` values are recorded in the image metadata and are visible via `docker history`, even under BuildKit, so pass the key at runtime with `-e`. (If a build-time secret is ever unavoidable, BuildKit's `--secret` mount is the mechanism that does not persist in the image.)
```bash
# Runtime secret (recommended for local dev)
docker run --rm -e GEMINI_API_KEY="$GEMINI_API_KEY" molecule-gemini-cli:dev
```
### Skill Loading from config.yaml
gemini-cli loads skills listed under `skills.list` in `config.yaml` at startup. Each entry requires `name` and `path`. If the `path` does not exist, gemini-cli logs a warning and skips the skill:
```
WARN: skill 'file-search' path /workspace/skills/file-search not found, skipping
```
The skill directories must be volume-mounted or present in the image.
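The warn-and-skip behavior described above can be approximated with a short sketch. The function name and log format are illustrative; gemini-cli's real loader is not part of this repo.

```python
import pathlib

def resolve_skills(skill_list: list[dict]) -> list[dict]:
    """Keep only skill entries whose path exists on disk; warn about the rest.

    Mirrors the documented behavior: a missing path produces a warning
    and the skill is skipped, rather than failing startup.
    """
    loaded = []
    for skill in skill_list:
        if pathlib.Path(skill["path"]).is_dir():
            loaded.append(skill)
        else:
            print(f"WARN: skill '{skill['name']}' path {skill['path']} "
                  "not found, skipping")
    return loaded
```

When debugging a missing skill, running a check like this against the mounted filesystem quickly shows whether the path in `config.yaml` matches what is actually in the container.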
## Dev Setup
```bash
# 1. Clone
git clone https://github.com/molecule-ai/molecule-ai-workspace-template-gemini-cli.git
cd molecule-ai-workspace-template-gemini-cli
# 2. Install dependencies
pip install -r requirements.txt
# 3. Build image
docker build -t molecule-gemini-cli:dev .
# 4. Config override for local dev
# Edit config.yaml or set environment variables:
export GEMINI_API_KEY="$(cat ~/.gemini-api-key)" # not in the repo
export WORKSPACE_ID="dev-local"
# 5. Smoke test
docker run --rm \
-e GEMINI_API_KEY="$GEMINI_API_KEY" \
-e WORKSPACE_ID="$WORKSPACE_ID" \
molecule-gemini-cli:dev python -c "
from adapter import connect, stream_response
connect('http://localhost:8080', 'dev-local')
print(stream_response('test-session', 'ping'))
"
# 6. Verify adapter connects to platform
docker run --rm \
-e GEMINI_API_KEY="$GEMINI_API_KEY" \
-e WORKSPACE_ID="$WORKSPACE_ID" \
-e ADAPTER_PLATFORM_URL="https://platform.molecule.ai" \
  molecule-gemini-cli:dev python -c "import os; from adapter import connect; connect(os.environ['ADAPTER_PLATFORM_URL'], os.environ['WORKSPACE_ID'])"
```
## Testing
```bash
# Smoke test — runs adapter.connect() and exits 0 on success
docker run --rm \
  -e GEMINI_API_KEY="$GEMINI_API_KEY" \
  -e WORKSPACE_ID="smoke-test" \
  -e ADAPTER_PLATFORM_URL="https://platform.molecule.ai" \
  molecule-gemini-cli:dev python -c "
import sys, os
from adapter import connect
try:
    connect(os.environ['ADAPTER_PLATFORM_URL'], os.environ['WORKSPACE_ID'])
    print('OK')
    sys.exit(0)
except Exception as e:
    print(f'FAIL: {e}')
    sys.exit(1)
"
```
## Release Process
1. **Schema version bump** — increment `schema_version` in `config.yaml` following the platform's compatibility matrix. Breaking changes require a major version bump.
2. **Tag** — tag the commit with the new version:
```bash
git tag -a v1.2.0 -m "release: schema v1.2, add skill hot-reload"
git push origin main --tags
```
3. The CI pipeline builds and pushes the image to the registry on tags matching `v*`.
4. Update the platform workspace registry entry to point at the new tag.
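Since CI only fires on tags matching `v*`, a pre-push sanity check on the tag name catches typos (a missing `v`, a truncated version) before a release silently fails to build. The strict `vMAJOR.MINOR.PATCH` rule below is an assumption about this repo's tagging policy.

```python
import re

# CI builds on tags matching v*; this stricter check assumes vMAJOR.MINOR.PATCH.
TAG_RE = re.compile(r"^v\d+\.\d+\.\d+$")

def is_release_tag(tag: str) -> bool:
    """Return True when the tag looks like a releasable version tag."""
    return bool(TAG_RE.match(tag))

print(is_release_tag("v1.2.0"))  # True
print(is_release_tag("1.2.0"))   # False: missing the v prefix, CI will not fire
print(is_release_tag("v1.2"))    # False: patch component missing
```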

known-issues.md Normal file

@@ -0,0 +1,161 @@
# Known Issues
This document tracks unresolved issues that are known to cause failures or unexpected behavior in the gemini-cli workspace template. Entries are organized by severity and include workaround instructions where available.
---
## Issue 1: Missing `GEMINI_API_KEY` causes silent startup failure
**Severity:** High
**Description:**
If `GEMINI_API_KEY` is unset when the container starts, gemini-cli initializes without an API key but does not exit immediately. The agent starts, accepts sessions, and then produces no response for every prompt — the platform sees an agent that "never replies."
The underlying cause is that gemini-cli's auth layer attempts to load the key lazily on the first API call, not at startup. No error is raised until the first `stream_response()` call, which then fails with a generic timeout or an auth error that may be swallowed by the platform shim.
**Affected versions:** All template versions prior to the env-validation shim in `adapter.py` (see workaround).
**Workaround:**
Validate the env var before invoking the adapter:
```bash
if [ -z "$GEMINI_API_KEY" ]; then
echo "ERROR: GEMINI_API_KEY is not set" >&2
exit 1
fi
```
Or add an early check in `adapter.py`:
```python
import os

def connect(platform_url, workspace_id, timeout=30):
    if not os.environ.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY environment variable is not set")
    ...
```
**Tracking:** Internal issue `WKS-001`.
---
## Issue 2: `system-prompt.md` injected after gemini-cli defaults, overriding template's SOUL.md conventions
**Severity:** Medium
**Description:**
gemini-cli loads system prompts in the following order:
1. Built-in defaults (`gemini-cli/resources/defaults/system.txt`)
2. `SOUL.md` in the current working directory (gemini-cli's convention for agent personality files)
3. `system-prompt.md` injected by the workspace template
Because `system-prompt.md` is concatenated last, its directives take precedence over anything already set in `SOUL.md` (or gemini-cli's defaults). This means template authors cannot rely on gemini-cli's `SOUL.md` convention to set agent personality, guardrails, or tool restrictions: anything set there is silently clobbered.
This is particularly problematic for deployments that rely on gemini-cli's default tool list (which includes shell execution, file read/write, and internet access) since the template's `system-prompt.md` must explicitly deny those tools to enforce a tighter scope.
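The load order can be made concrete with a small sketch: later layers win simply because they appear later in the concatenated prompt. The layer names come from the list above; the joiner and example strings are illustrative.

```python
def build_prompt_stack(defaults: str, soul: str, template: str) -> str:
    """Concatenate prompt layers in gemini-cli's documented load order.

    Because the template layer (system-prompt.md) comes last, its
    directives effectively override conflicting ones in earlier layers.
    """
    layers = [defaults, soul, template]
    return "\n\n".join(layer for layer in layers if layer)

stack = build_prompt_stack(
    "Tools: shell, file-read, file-write, internet",  # built-in defaults
    "You are a cheerful assistant.",                  # SOUL.md
    "Tools: file-search, web-fetch only.",            # system-prompt.md
)
print(stack.splitlines()[-1])  # the template layer lands last in the stack
```

This is why the workaround below routes everything through `system-prompt.md`: it is the only layer guaranteed to have the final word.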
**Workaround:**
Do not use `SOUL.md` for runtime configuration. Put all system-prompt content exclusively in `system-prompt.md` and leave `SOUL.md` absent or empty. The adapter startup script should delete or truncate `SOUL.md` if present:
```bash
# In Dockerfile, after COPY:
RUN rm -f /workspace/SOUL.md
```
**Tracking:** Internal issue `WKS-002`.
---
## Issue 3: `config.yaml` model override not propagated to the gemini-cli config file inside Docker
**Severity:** Medium
**Description:**
The template's `config.yaml` exposes `runtime.model` as the canonical model selection knob. However, gemini-cli reads its own config file (`~/.config/gemini-cli/config.json`) for model selection, not `config.yaml`. The template's `config.yaml` is read by the adapter shim only; it does not rewrite gemini-cli's config file.
As a result, even if `config.yaml` specifies `model: gemini-2.5-pro`, the container may still run the model configured in gemini-cli's internal config (defaulting to `gemini-2.0-flash`).
**Reproduction:**
```bash
# Set a non-default model in config.yaml
sed -i 's/^ model:.*/ model: gemini-2.5-pro/' config.yaml
docker build -t molecule-gemini-cli:dev .
docker run --rm molecule-gemini-cli:dev \
python -c "from adapter import stream_response; print(stream_response('s', 'what model are you'))"
# Output: "gemini-2.0-flash" (not gemini-2.5-pro)
```
**Workaround:**
The adapter must sync the model value into gemini-cli's config file before starting the session:
```python
import json, os, pathlib

def sync_model_to_gemini_config(model: str):
    config_path = pathlib.Path(os.path.expanduser("~/.config/gemini-cli/config.json"))
    config_path.parent.mkdir(parents=True, exist_ok=True)
    if config_path.exists():
        cfg = json.loads(config_path.read_text())
    else:
        cfg = {}
    cfg["model"] = model
    config_path.write_text(json.dumps(cfg, indent=2))
```
Call `sync_model_to_gemini_config()` inside `connect()` before instantiating the gemini-cli client.
**Tracking:** Internal issue `WKS-003`.
---
## Issue 4: Template schema version 1 but platform v2 introduces breaking config key renames
**Severity:** High (breaking for platform v2 deployments)
**Description:**
The template ships with `schema_version: "1"` in `config.yaml`. Platform version 2 (v2) renamed several top-level keys:
| v1 key | v2 key |
|---------------------------|----------------------------------|
| `runtime.agent` | `agent.runtime` |
| `runtime.model` | `agent.model` |
| `runtime.api_key_env` | `auth.gemini_api_key_env` |
| `adapter.platform_url` | `platform.endpoint` |
| `adapter.workspace_id_env`| `platform.workspace_id_env` |
| `adapter.timeout_seconds` | `platform.request_timeout_secs` |
Templates using v1 syntax on a v2 platform silently ignore renamed keys — the adapter gets default values instead of configured ones, leading to runtime failures that are difficult to diagnose.
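The rename table above translates directly into a mechanical migration. This sketch operates on an already-parsed config dict; the mapping is taken from the table, but the function itself is illustrative, not the shipped fix.

```python
# Key renames from the v1 -> v2 table above, as (section, key) pairs.
V1_TO_V2 = {
    ("runtime", "agent"): ("agent", "runtime"),
    ("runtime", "model"): ("agent", "model"),
    ("runtime", "api_key_env"): ("auth", "gemini_api_key_env"),
    ("adapter", "platform_url"): ("platform", "endpoint"),
    ("adapter", "workspace_id_env"): ("platform", "workspace_id_env"),
    ("adapter", "timeout_seconds"): ("platform", "request_timeout_secs"),
}

def migrate_v1_to_v2(cfg: dict) -> dict:
    """Rewrite a parsed v1 config dict into the v2 key layout."""
    out = {"schema_version": "2"}
    for (old_sec, old_key), (new_sec, new_key) in V1_TO_V2.items():
        if old_key in cfg.get(old_sec, {}):
            out.setdefault(new_sec, {})[new_key] = cfg[old_sec][old_key]
    # Sections the table does not rename (e.g. skills) pass through unchanged.
    for section, value in cfg.items():
        if section not in ("schema_version", "runtime", "adapter"):
            out[section] = value
    return out
```

A one-time migration like this avoids maintaining two config files by hand, at the cost of the adapter needing to know both layouts.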
**Detection:**
```bash
# Check which schema version the platform expects
curl -s https://platform.molecule.ai/api/schema-version | jq .
```
If the platform returns `2` and `config.yaml` has `schema_version: "1"`, the config is incompatible.
**Workaround:**
Maintain separate `config.v1.yaml` and `config.v2.yaml` files and select the correct one at container startup based on the platform's reported schema version:
```bash
# In Dockerfile CMD or entrypoint script:
PLATFORM_SCHEMA=$(curl -s https://platform.molecule.ai/api/schema-version | jq -r '.version')
if [ "$PLATFORM_SCHEMA" = "2" ]; then
    cp /workspace/config.v2.yaml /workspace/config.yaml
else
    cp /workspace/config.v1.yaml /workspace/config.yaml
fi
exec python -m adapter
```
**Tracking:** Internal issue `WKS-004`. Fixed in template v2.0 (pending release).

runbooks/local-dev-setup.md Normal file

@@ -0,0 +1,197 @@
# Local Dev Setup Runbook
This runbook covers setting up the gemini-cli workspace template on a local machine for development and testing. Follow each step in order.
---
## Prerequisites
- Python 3.11+
- Docker 24.0+
- A valid Gemini API key (from Google AI Studio or Google Cloud)
- Git
---
## Step 1: Clone the Repository
```bash
git clone https://github.com/molecule-ai/molecule-ai-workspace-template-gemini-cli.git
cd molecule-ai-workspace-template-gemini-cli
```
---
## Step 2: Install Python Dependencies
Create a virtual environment and install the pinned dependencies:
```bash
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
Expected output:
```
Collecting gemini-cli>=1.0.0
Downloading gemini_cli-1.2.1-py3-none-any.whl (2.1 MB)
Collecting molecule-ai-adapter>=2.1.0
Downloading molecule_ai_adapter-2.3.0-py3-none-any.whl (650 kB)
...
Installing collected packages: gemini-cli, molecule-ai-adapter, httpx, pydantic
Successfully installed gemini-cli-1.2.1 molecule-ai-adapter-2.3.0 httpx-0.27.2 pydantic-2.9.2
```
---
## Step 3: Set Your API Key
Store your Gemini API key in a local file (never commit this file):
```bash
# Replace with your actual key from https://aistudio.google.com/apikey
echo "AIzaSy..." > ~/.gemini-api-key
chmod 600 ~/.gemini-api-key
```
Set the env var for the current session:
```bash
export GEMINI_API_KEY="$(cat ~/.gemini-api-key)"
```
---
## Step 4: Build the Docker Image
```bash
docker build -t molecule-gemini-cli:dev .
```
Do not pass the API key at build time with `--build-arg`: build args are recorded in the image metadata and are visible via `docker history`, even under BuildKit, so the key would ship inside the image. Pass it at runtime with `-e` instead (see Step 6). If a build-time secret is ever genuinely required, use BuildKit's `--secret` mount, which is not persisted in the final image (the Dockerfile must opt in with `RUN --mount=type=secret,id=gemini_api_key ...`):
```bash
DOCKER_BUILDKIT=1 docker build \
  --secret id=gemini_api_key,src="$HOME/.gemini-api-key" \
  -t molecule-gemini-cli:dev \
  .
```
---
## Step 5: Config Override for Local Dev
The template reads `config.yaml` for runtime settings. For local dev, override settings via environment variables or by editing a local copy.
**Option A — environment variables (recommended for dev):**
```bash
export WORKSPACE_ID="dev-local"
export ADAPTER_PLATFORM_URL="https://platform.molecule.ai"
export GEMINI_API_KEY="$(cat ~/.gemini-api-key)"
```
**Option B — local config file override:**
```bash
# Work on a copy, never modify config.yaml directly
cp config.yaml config.yaml.local
$EDITOR config.yaml.local
```
Then run the container with the local config mounted:
```bash
docker run --rm \
-e GEMINI_API_KEY="$GEMINI_API_KEY" \
-e WORKSPACE_ID="dev-local" \
-v "$(pwd)/config.yaml.local:/workspace/config.yaml:ro" \
molecule-gemini-cli:dev
```
---
## Step 6: Docker Run Smoke Test
Verify the container starts and the adapter connects successfully:
```bash
docker run --rm \
  -e GEMINI_API_KEY="$GEMINI_API_KEY" \
  -e WORKSPACE_ID="smoke-test" \
  molecule-gemini-cli:dev python -c "
import sys, os
from adapter import connect
try:
    connect(
        os.environ.get('ADAPTER_PLATFORM_URL', 'https://platform.molecule.ai'),
        os.environ['WORKSPACE_ID']
    )
    print('OK — adapter connected successfully')
    sys.exit(0)
except Exception as e:
    print(f'FAIL: {e}', file=sys.stderr)
    sys.exit(1)
"
```
Expected output:
```
OK — adapter connected successfully
```
If the exit code is non-zero, see [Common Issues](#common-issues) below.
---
## Step 7: Verify Adapter Connects to Platform
Run a full agent round-trip test using the platform endpoint:
```bash
docker run --rm \
-e GEMINI_API_KEY="$GEMINI_API_KEY" \
-e WORKSPACE_ID="dev-local" \
-e ADAPTER_PLATFORM_URL="https://platform.molecule.ai" \
molecule-gemini-cli:dev python -c "
from adapter import connect, stream_response
connect('https://platform.molecule.ai', 'dev-local')
reply = stream_response('test-session', 'Say hello in one sentence.')
print(reply)
"
```
Expected output (or similar):
```
Hello! I'm ready to assist you.
```
If the connection is refused:
```
ConnectionRefusedError: [Errno 111] Connection refused
```
See issue `adapter connection refused` in the table below.
---
## Common Issues
| # | Issue | Symptom | Resolution |
|---|-------|---------|------------|
| 1 | `GEMINI_API_KEY` is not set | Container starts but the agent never replies; `stream_response()` hangs silently and eventually times out, or fails with `AuthenticationError: Invalid API key` | Confirm the env var is set: `echo $GEMINI_API_KEY`. If empty, obtain a key from https://aistudio.google.com/apikey and export it before `docker run` |
| 2 | Model not found | gemini-cli exits with `ValueError: Unknown model 'gemini-99-pro'. Did you mean 'gemini-2.5-pro'?` | Check `config.yaml` for the `runtime.model` value. Valid models: `gemini-2.0-flash`, `gemini-2.5-flash`, `gemini-2.5-pro`. Do not use preview or alias names |
| 3 | Docker networking | `ConnectionRefusedError` or `HTTPConnectError` when adapter tries to reach `platform.molecule.ai` inside the container | Ensure the host network is reachable from inside the container. Try `--network=host` on Linux, or map port explicitly: `-p 8080:8080`. Verify the platform URL is correct and the host machine is not behind a VPN blocking Docker's bridge network |
| 4 | Skill not loading | gemini-cli starts but reports `WARN: skill 'file-search' path /workspace/skills/file-search not found, skipping` for each skill | Verify skill directories exist in the image. Add them with a volume mount: `-v "$(pwd)/skills:/workspace/skills:ro"`. Ensure the skill paths in `config.yaml` match the mounted paths exactly |
| 5 | Adapter connection refused | `ConnectionRefusedError: [Errno 111] Connection refused` on `adapter.connect()` call | The adapter is trying to reach the platform at `ADAPTER_PLATFORM_URL` but nothing is listening there. If running against a local platform mock, start it first: `python -m local_platform_mock`. If running against the real platform, check that `ADAPTER_PLATFORM_URL` is set to the correct public endpoint and that the host machine can reach it |