Merge pull request #48 from Molecule-AI/fix/merge-pr-39-main
merge: PR #39 workspace hibernation docs
commit 016e301dc3
@@ -80,8 +80,33 @@ Core workspace CRUD and lifecycle operations.
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/workspaces/:id/restart` | WorkspaceAuth | Restart the workspace container. Sends a `restart_context` A2A message after successful re-registration. |
| POST | `/workspaces/:id/pause` | WorkspaceAuth | Stop the container and set status to `paused`. Paused workspaces skip health sweep, liveness monitor, and auto-restart. Resume manually via `/resume`. |
| POST | `/workspaces/:id/resume` | WorkspaceAuth | Re-provision a paused workspace. Status transitions to `provisioning`. |
| POST | `/workspaces/:id/hibernate` | WorkspaceAuth | Immediately hibernate a workspace (stop container, set status to `hibernated`). Useful for manual cost control. See hibernation note below. |

<Callout type="info">
**Workspace hibernation**

A workspace with `hibernation_idle_minutes` set in its config is **automatically hibernated** by the platform after that many idle minutes (no active tasks, no recent heartbeat). The monitor checks every 2 minutes.

`hibernated` differs from `paused`:
- **`paused`**: manual; resumes only via `POST /resume`.
- **`hibernated`**: automatic (or via `POST /hibernate`); resumes **automatically** when an A2A message arrives.

When a message is sent to a hibernated workspace, the platform returns:
```
HTTP 503 Service Unavailable
Retry-After: 15
{"waking": true}
```
Callers should retry after ~15 seconds. The workspace typically returns to `online` within that window.

To opt a workspace into auto-hibernation, add to its `config.yaml`:
```yaml
hibernation_idle_minutes: 30  # hibernate after 30 min idle; null (default) = disabled
```

**Atomic hibernation guarantee:** The platform uses a single atomic SQL claim (`UPDATE … WHERE active_tasks = 0`) before stopping the container. If a task arrives between the idle check and the container stop, the claim fails and hibernation is aborted, so no in-flight tasks are silently lost.
</Callout>
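
The atomic claim in the hibernation note can be illustrated with a small, self-contained sketch. It is not the platform's actual code: the `workspaces` table, its `id` and `status` columns, and the `try_claim_hibernation` helper are assumptions made for illustration, and SQLite stands in for whatever database the platform really uses. Only the `UPDATE … WHERE active_tasks = 0` shape comes from the note.

```python
import sqlite3


def try_claim_hibernation(conn: sqlite3.Connection, workspace_id: str) -> bool:
    """Flip the status to 'hibernated' only if the workspace is still idle.

    The idle check and the status change happen in one UPDATE statement,
    so a task that arrives in between makes the WHERE clause match no row.
    """
    cur = conn.execute(
        "UPDATE workspaces SET status = 'hibernated' "
        "WHERE id = ? AND status = 'online' AND active_tasks = 0",
        (workspace_id,),
    )
    conn.commit()
    # rowcount is 1 when the claim succeeded, 0 when it was lost.
    return cur.rowcount == 1


def demo() -> tuple:
    """Build a throwaway in-memory table (hypothetical schema) and try two claims."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workspaces (id TEXT PRIMARY KEY, status TEXT, active_tasks INTEGER)"
    )
    conn.executemany(
        "INSERT INTO workspaces VALUES (?, ?, ?)",
        [("idle-ws", "online", 0), ("busy-ws", "online", 1)],
    )
    return try_claim_hibernation(conn, "idle-ws"), try_claim_hibernation(conn, "busy-ws")
```

In this sketch the caller would stop the container only when the function returns `True`; a `False` result means a task arrived first and hibernation should be skipped.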

### Budget

@@ -24,104 +24,22 @@ Workspaces talk to each other via **A2A** (agent-to-agent) messages, routed
by the platform. Communication rules: same workspace, siblings, and
parent/child are allowed; everything else is denied.
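
The communication rules above can be expressed as a small predicate. This is a sketch under stated assumptions, not the platform's routing code: the `can_communicate` function and the `parent_of` mapping (workspace ID to parent ID, or `None` for a top-level workspace) are hypothetical, and it assumes that two top-level workspaces without a shared parent do not count as siblings.

```python
from typing import Dict, Optional


def can_communicate(
    sender: str, receiver: str, parent_of: Dict[str, Optional[str]]
) -> bool:
    """Return True if an A2A message from sender to receiver would be allowed."""
    # Same workspace.
    if sender == receiver:
        return True

    sender_parent = parent_of.get(sender)
    receiver_parent = parent_of.get(receiver)

    # Parent/child, in either direction.
    if sender_parent == receiver or receiver_parent == sender:
        return True

    # Siblings: both share the same explicit parent (assumption: a missing
    # parent never makes two workspaces siblings).
    return sender_parent is not None and sender_parent == receiver_parent
```

With a hypothetical tree where `lead` is the parent of `dev-a` and `dev-b`, the two developers can message each other and their lead, while an unrelated workspace is denied.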

### AGENTS.md auto-generation

At startup, every workspace automatically generates `/workspace/AGENTS.md`
from its `config.yaml`. The file follows the
[AAIF (Agent Artifact Interchange Format)](https://github.com/google/A2A) standard
and contains:

| Section | Source |
|---------|--------|
| Name | `config.yaml → name` |
| Role | `config.yaml → role` (falls back to `description`) |
| Description | `config.yaml → description` |
| A2A Endpoint | `$AGENT_URL` env var, or `http://localhost:{a2a.port}/a2a` |
| MCP Tools | union of `config.yaml → tools` + `plugins` |

Peers fetch it via `GET /workspace/AGENTS.md` for capability discovery. Keep
`name`, `role`, and `description` accurate in `config.yaml`: they are the
sole source of truth for what this agent announces to the org.

```yaml
# config.yaml: relevant fields for AGENTS.md
name: Backend Engineer
role: "Owns the Go platform: API, migrations, tests, and CI gates."
description: "Senior backend engineer focused on correctness, security, and performance."
```

The generator is non-fatal: a missing or unreadable `config.yaml` prints a
startup warning but does not prevent the workspace from booting.
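
The field mapping and the non-fatal behaviour described above might look roughly like the following. Everything here is an illustrative assumption rather than the real generator: the function names, the `read_config` and `write_file` callables, and the Markdown layout of the output are made up, and the config is assumed to be an already-parsed dictionary with `a2a.port` present.

```python
import os
import sys


def render_agents_md(config: dict, env=None) -> str:
    """Render AGENTS.md text from a parsed config (output layout is hypothetical)."""
    env = os.environ if env is None else env
    name = config.get("name", "")
    description = config.get("description", "")
    # Role falls back to the description when it is missing or empty.
    role = config.get("role") or description
    # $AGENT_URL wins; otherwise build the localhost URL from a2a.port.
    port = config.get("a2a", {}).get("port")
    endpoint = env.get("AGENT_URL") or f"http://localhost:{port}/a2a"
    # MCP tools are the de-duplicated union of tools and plugins.
    tools = sorted(set(config.get("tools", [])) | set(config.get("plugins", [])))

    lines = [
        f"# {name}",
        "",
        f"**Role:** {role}",
        "",
        f"**Description:** {description}",
        "",
        f"**A2A Endpoint:** {endpoint}",
        "",
        "## MCP Tools",
        "",
    ]
    lines += [f"- {tool}" for tool in tools]
    return "\n".join(lines) + "\n"


def generate_at_startup(read_config, write_file, env=None) -> bool:
    """Write AGENTS.md if the config can be read; warn and carry on otherwise."""
    try:
        config = read_config()
    except OSError as exc:  # missing or unreadable config.yaml
        print(f"warning: AGENTS.md not generated: {exc}", file=sys.stderr)
        return False
    write_file("/workspace/AGENTS.md", render_agents_md(config, env))
    return True
```

Returning `False` instead of raising mirrors the documented behaviour: startup continues, just without a generated file.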

## Workspace budgets

A **budget limit** is a per-workspace monthly spend ceiling, expressed in
**USD cents** (e.g. `500` = $5.00/month). It is set by a tenant admin; workspace
agents cannot read or clear their own financial ceiling.

### What happens at the ceiling

When `monthly_spend >= budget_limit`, the platform blocks new A2A proxy calls
to that workspace:

```
HTTP 402 Payment Required
{"error": "workspace budget limit exceeded"}
```

The workspace itself keeps running; only inbound A2A messages and channel
sends are gated. Raising `budget_limit` or resetting spend immediately
restores traffic.

**Fail-open behaviour:** if the platform cannot reach the budget record
(e.g. a transient DB error), traffic is **allowed through** rather than
blocked. The spend ceiling is a soft guardrail, not a hard guarantee.
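
The gate and its fail-open rule can be sketched as a single function. This is only an illustration of the behaviour described above: `check_budget` and the `load_budget` callable (returning `(budget_limit, monthly_spend)` in USD cents) are hypothetical names, not part of the platform's API.

```python
from typing import Callable, Optional, Tuple


def check_budget(load_budget: Callable[[], Tuple[Optional[int], int]]) -> int:
    """Return the HTTP status the gate would produce: 200 (pass) or 402 (blocked)."""
    try:
        budget_limit, monthly_spend = load_budget()
    except Exception:
        # Fail open: if the budget record cannot be read, let traffic through.
        return 200
    # A null limit means no ceiling is configured.
    if budget_limit is not None and monthly_spend >= budget_limit:
        return 402
    return 200
```

The broad `except` is deliberate in this sketch: any failure to load the record is treated as "unknown", and unknown is resolved in favour of allowing the call.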

### Setting a limit (admin only)

```bash
# Set ceiling to $5.00/month
curl -X PATCH https://api.moleculesai.app/workspaces/<id>/budget \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"budget_limit": 500}'

# Remove ceiling entirely
curl -X PATCH https://api.moleculesai.app/workspaces/<id>/budget \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"budget_limit": null}'
```

Values must be a positive integer (USD cents) or `null`. Negative values are
rejected with `400 Bad Request`.

### Monitoring spend

Use the dedicated budget endpoint; financial fields are **not** included in
the standard `GET /workspaces/:id` response:

```bash
curl https://api.moleculesai.app/workspaces/<id>/budget \
  -H "Authorization: Bearer <admin-token>"
```

```json
{
  "budget_limit": 500,
  "monthly_spend": 312,
  "budget_remaining": 188
}
```

| Field | Description |
|-------|-------------|
| `budget_limit` | Ceiling in USD cents, or `null` if no limit is set |
| `monthly_spend` | Accumulated spend this billing period in USD cents |
| `budget_remaining` | `null` if no limit; otherwise `max(0, limit − spend)`. Can read negative if spend exceeded the ceiling before enforcement kicked in. |

See the [API Reference](/docs/api-reference#budget) for the full endpoint specification.

### Workspace status lifecycle

| Status | Meaning | Resumes via |
|--------|---------|-------------|
| `provisioning` | Container being started | automatic |
| `online` | Running and accepting tasks | n/a |
| `degraded` | Heartbeat `error_rate > 0.5` | auto-recovers |
| `offline` | Missed heartbeats (liveness sweep) | auto-restart |
| `paused` | Manually stopped via `/pause` | `POST /resume` |
| `hibernated` | Auto-paused after idle timeout (or via `/hibernate`) | automatic on next A2A message |
| `removed` | Deleted | n/a |

**Hibernation** is an opt-in automatic cost-saving mode. Set `hibernation_idle_minutes` in the workspace's `config.yaml` to enable it. When a hibernated workspace receives an A2A message, the platform wakes it automatically (returning `503 Retry-After: 15` while it comes online). See [API Reference: Lifecycle](/docs/api-reference#lifecycle) for the `/hibernate` endpoint and configuration details.
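
On the caller's side, the wake-up response can be handled with a small retry loop. The sketch below is an assumption-laden illustration, not an official client: `send_with_wake_retry` is a hypothetical helper, `send` is any callable returning a `(status, headers, body)` tuple, the header key is assumed to be spelled exactly `Retry-After`, and the retry budget of four attempts is arbitrary.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def send_with_wake_retry(
    send: Callable[[], Response],
    max_attempts: int = 4,
    default_wait: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), retrying while the target workspace is still waking (503)."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 503:
            break
        try:
            # Honour the server's hint; fall back to the documented ~15 s.
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            # Retry-After may also be an HTTP date; this sketch ignores that form.
            wait = default_wait
        sleep(wait)
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. If every attempt returns 503, the last 503 response is handed back so the caller can decide what to do next.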

## External agents

An **external agent** is a workspace with `runtime: external` — it runs on